Chap12 Nonlinear Programming

College of Management, NCTU
Operation Research II
Spring, 2009
‰ General Form of Nonlinear Programming Problems
Max f(x)
S.T. gi(x) ≤ bi for i = 1,…, m
x≥0
9 No single algorithm is available that will solve every problem fitting this format.
‰ An Example – The Product-Mix Problem with Price Elasticity
9 The amount of a product that can be sold has an inverse relationship to the price charged; that is, demand is a decreasing function of price.
9 The firm’s profit from producing and selling x units is the sales revenue xp(x)
minus the production costs. That is, P(x) = xp(x) – cx.
9 If each of the firm’s products has a similar profit function, say, Pj(xj) for
producing and selling xj units of product j, then the overall objective function is
f(x) = Σ_{j=1}^{n} Pj(xj), a sum of nonlinear functions.
9 Nonlinearities also may arise in the gi(x) constraint function.
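9 As a small illustration (not from the text), the profit model above can be sketched in Python, assuming a hypothetical linear demand curve p(x) = p0 – a·x and unit cost c for each product:
```python
# Sketch of the price-elasticity profit model. The demand curve p(x) = p0 - a*x
# and the numeric parameters below are hypothetical, chosen only for illustration.

def profit(x, p0, a, c):
    """P(x) = x*p(x) - c*x, with assumed demand curve p(x) = p0 - a*x."""
    return x * (p0 - a * x) - c * x

def total_profit(x, params):
    """f(x) = sum_j P_j(x_j): a sum of nonlinear single-product profits."""
    return sum(profit(xj, *prm) for xj, prm in zip(x, params))

# Two hypothetical products, each given as (p0, a, unit cost c)
params = [(10.0, 0.5, 2.0), (8.0, 0.25, 3.0)]
print(total_profit([4.0, 6.0], params))
```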
‰ An Example – The Transportation Problem with Volume Discounts
9 Determine an optimal plan for shipping goods from various sources to various
destinations, given supply and demand constraints.
9 In actuality, the shipping costs may not be fixed. Volume discounts sometimes are available for large shipments, which makes the shipping cost a piecewise linear (and hence nonlinear) function of the amount shipped.
‰ Graphical Illustration of Nonlinear Programming Problems
Max Z = 3x1 + 5x2
S.T. x1 ≤ 4
9x1² + 5x2² ≤ 216
x1, x2 ≥ 0
9 The optimal solution is no longer necessarily a CPF solution (sometimes it is; sometimes it isn't), but here it still lies on the boundary of the feasible region.
¾ We no longer have the tremendous simplification used in LP of limiting the
search for an optimal solution to just the CPF solutions.
9 What if the constraints are linear; but the objective function is not?
Max Z = 126x1 – 9x1² + 182x2 – 13x2²
S.T. x1 ≤ 4
2x2 ≤ 12
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0
9 What if we change the objective function to 54x1 – 9x1² + 78x2 – 13x2²?
9 The optimal solution lies inside the feasible region.
9 That means we cannot focus only on the boundary of the feasible region; we need to examine the entire feasible region.
‰ A local optimum need not be a global optimum, which complicates matters further.
9 Nonlinear programming algorithms generally are unable to distinguish between a local optimum and a global optimum.
9 It is therefore desirable to know the conditions under which any local optimum is guaranteed to be a global optimum.
‰ If a nonlinear programming problem has no constraints, the objective
function being concave (convex) guarantees that a local maximum (minimum)
is a global maximum (minimum).
9 What is a concave (convex) function?
9 A function that is always “curving downward” (or not curving at all) is called a concave function.
9 A function that is always “curving upward” (or not curving at all) is called a convex function.
9 A function that curves downward in some places and upward in others is neither concave nor convex.
‰ Definition of concave and convex functions of a single variable
9 A function of a single variable f(x) is a convex function if, for each pair of values of x, say x' and x'' (x' < x''),
f[λx'' + (1 − λ)x'] ≤ λf(x'') + (1 − λ)f(x')
for all values of λ such that 0 < λ < 1.
9 It is a strictly convex function if ≤ can be replaced by <.
9 It is a concave function if this statement holds when ≤ is replaced by ≥ (by > for the strictly concave case).
9 The geometric interpretation of concave and convex functions.
‰ How to judge whether a single-variable function is convex or concave?
9 Consider any function of a single variable f(x) that possesses a second derivative at all possible values of x. Then f(x) is
convex if and only if d²f(x)/dx² ≥ 0 for all possible values of x;
concave if and only if d²f(x)/dx² ≤ 0 for all possible values of x.
‰ How to judge whether a two-variable function is convex or concave?
9 If the derivatives exist, the following table can be used to determine whether a two-variable function is convex or concave (each condition must hold for all possible values of x1 and x2).
Quantity                                                          Convex    Concave
∂²f(x1, x2)/∂x1²                                                  ≥ 0       ≤ 0
∂²f(x1, x2)/∂x2²                                                  ≥ 0       ≤ 0
[∂²f(x1, x2)/∂x1²][∂²f(x1, x2)/∂x2²] – [∂²f(x1, x2)/∂x1∂x2]²      ≥ 0       ≥ 0
9 Example: f(x1, x2) = x1² − 2x1x2 + x2²
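9 A quick way to check this example (a sketch, assuming SymPy is available) is to compute the second-order quantities from the table above:
```python
# Sketch: apply the two-variable second-derivative test to the example
# f(x1, x2) = x1^2 - 2*x1*x2 + x2^2, using sympy.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 - 2*x1*x2 + x2**2

f11 = sp.diff(f, x1, 2)        # d^2 f / dx1^2   -> 2
f22 = sp.diff(f, x2, 2)        # d^2 f / dx2^2   -> 2
f12 = sp.diff(f, x1, x2)       # d^2 f / dx1 dx2 -> -2

# 2 >= 0, 2 >= 0, and 2*2 - (-2)^2 = 0 >= 0, so f is convex (but not strictly).
print(f11, f22, sp.simplify(f11*f22 - f12**2))
```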
‰ How to judge whether a multivariable function is convex or concave?
9 The sum of convex functions is a convex function, and the sum of concave
functions is a concave function.
9 Example: f(x1, x2, x3) = 4x1 – x1² – (x2 – x3)² = [4x1 – x1²] + [–(x2 – x3)²]
‰ If there are constraints, then one more condition will provide the guarantee,
namely, that the feasible region is a convex set.
‰ Convex set
9 A convex set is a collection of points such that, for each pair of points in the
collection, the entire line segment joining these two points is also in the
collection.
9 In general, the feasible region for a nonlinear programming problem is a convex
set whenever all the gi(x) (for the constraints gi(x) ≤ bi) are convex.
Max Z = 3x1 + 5x2
S.T. x1 ≤ 4
9x1² + 5x2² ≤ 216
x1, x2 ≥ 0
9 What happens when just one of these gi(x) is a concave function instead?
Max Z = 3x1 + 5x2
S.T. x1 ≤ 4
2x2 ≤ 14
8x1 – x1² + 14x2 – x2² ≤ 49
x1, x2 ≥ 0
¾ The feasible region is not a convex set.
¾ Under this circumstance, we cannot guarantee that a local maximum is a
global maximum.
‰ Condition for local maximum = global maximum (with gi(x) ≤ bi constraints).
9 To guarantee that a local maximum is a global maximum for a nonlinear programming problem with constraints gi(x) ≤ bi and x ≥ 0, the objective function f(x) must be a concave function and each gi(x) must be a convex function.
9 Such a problem is called a convex programming problem.
‰ One-Variable Unconstrained Optimization
9 The differentiable function f(x) to be maximized is concave.
9 The necessary and sufficient condition for x = x* to be optimal (a global max) is df/dx = 0 at x = x*.
9 It is usually not very easy to solve the above equation analytically.
9 The One-Dimensional Search Procedure.
¾ Finding a sequence of trial solutions that leads toward an optimal solution.
¾ Using the sign of the derivative to determine where to move: a positive derivative indicates that x* is greater than the current x, and vice versa.
‰ The Bisection Method
9 Initialization: Select ε (error tolerance). Find an initial lower bound x_L and upper bound x_U on x* by inspection. Set the initial trial solution x' = (x_L + x_U)/2.
9 Iteration:
¾ Evaluate df(x)/dx at x = x'.
¾ If df(x)/dx ≥ 0, reset x_L = x'.
¾ If df(x)/dx ≤ 0, reset x_U = x'.
¾ Select a new x' = (x_L + x_U)/2.
9 Stopping Rule: If x_U − x_L ≤ 2ε, so that the new x' must be within ε of x*, stop. Otherwise, perform another iteration.
9 Example: Max f(x) = 12x – 3x⁴ – 2x⁶
Iteration   df(x)/dx   x_L        x_U        New x'      f(x')
0
1
2
3           4.09       0.75       1          0.875       7.8439
4           –2.19      0.75       0.875      0.8125      7.8672
5           1.31       0.8125     0.875      0.84375     7.8829
6           –0.34      0.8125     0.84375    0.828125    7.8815
7           0.51       0.828125   0.84375    0.8359375   7.8839
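9 A minimal Python sketch of the bisection iterations above. The initial bounds x_L = 0, x_U = 2 and ε = 0.01 are assumed here (they are consistent with the rows of the table):
```python
# Sketch of the bisection method for Max f(x) = 12x - 3x^4 - 2x^6,
# with assumed bounds x_L = 0, x_U = 2 and error tolerance eps = 0.01.

def f(x):
    return 12*x - 3*x**4 - 2*x**6

def df(x):
    return 12 - 12*x**3 - 12*x**5       # f'(x)

def bisection(x_lo, x_hi, eps):
    x_new = (x_lo + x_hi) / 2           # initial trial solution
    while x_hi - x_lo > 2 * eps:        # stop when new x' is within eps of x*
        if df(x_new) >= 0:
            x_lo = x_new                # x* lies to the right of x'
        else:
            x_hi = x_new                # x* lies to the left of x'
        x_new = (x_lo + x_hi) / 2       # new trial solution
        print(f"x_L = {x_lo:.6f}, x_U = {x_hi:.6f}, "
              f"x' = {x_new:.7f}, f(x') = {f(x_new):.4f}")
    return x_new

bisection(0.0, 2.0, 0.01)               # last rows should match the table above
```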
‰ Newton’s Method
9 The bisection method converges slowly.
¾ Only take the information of first derivative into account.
9 The basic idea is to approximate f(x) within the neighborhood of the current trial
solution by a quadratic function and then to maximize (or minimize) the
approximate function exactly to obtain the new trial solution.
9 This approximating quadratic function is obtained by truncating the Taylor
series after the second derivative term.
f(xi+1) ≈ f(xi) + f'(xi)(xi+1 − xi) + [f''(xi)/2](xi+1 − xi)²
9 This quadratic function can be optimized in the usual way by setting its first
derivative to zero and solving for xi+1.
Thus, xi+1 = xi − f'(xi)/f''(xi).
9 Stopping Rule: If xi +1 − xi ≤ ε , stop and output xi+1.
9 Example: Max f(x) = 12x – 3x⁴ – 2x⁶ (same as the bisection example)
¾ xi+1 = xi − f'(xi)/f''(xi) =
¾ Select ε = 0.00001, and choose x1 = 1.
Iteration i   xi         f(xi)     f'(xi)    f''(xi)    xi+1
1
2
3             0.84003    7.8838    –0.1325   –55.279    0.83763
4             0.83763    7.8839    –0.0006   –54.790    0.83762
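9 A corresponding Python sketch of Newton's method for this example, with x1 = 1 and ε = 0.00001 as in the table:
```python
# Sketch of Newton's method for Max f(x) = 12x - 3x^4 - 2x^6,
# starting from x1 = 1 with eps = 0.00001.

def df(x):
    return 12 - 12*x**3 - 12*x**5       # f'(x)

def d2f(x):
    return -36*x**2 - 60*x**4           # f''(x)

def newton(x, eps):
    while True:
        x_next = x - df(x) / d2f(x)     # maximizer of the quadratic approximation
        print(f"x_i = {x:.5f}, x_i+1 = {x_next:.5f}")
        if abs(x_next - x) <= eps:
            return x_next
        x = x_next

print(newton(1.0, 1e-5))                # should approach x* ~ 0.83762, as in the table
```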
‰ Multivariable Unconstrained Optimization
9 Usually, there is no analytical method for solving the system of equations given
by setting the respective partial derivatives equal to zero.
9 Thus, a numerical search procedure must be used.
‰ The Gradient Search Procedure (for multivariable unconstrained
maximization problems)
9 The goal is to reach a point where all the partial derivatives are 0.
9 A natural approach is to use the values of the partial derivatives to select the
specific direction in which to move.
9 The gradient at point x = x' is ∇f(x') = (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn) evaluated at x = x'.
9 The direction of the gradient is interpreted as the direction of the directed line segment from the origin to the point (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn), which is the direction of changing x that maximizes the rate of increase of f(x).
9 However, normally it would not be practical to change x continuously in the direction of ∇f(x), because this series of changes would require continuously reevaluating the ∂f/∂xj and changing the direction of the path.
9 A better approach is to keep moving in a fixed direction from the current trial
solution, not stopping until f(x) stops increasing.
9 The stopping point becomes the next trial solution, where the gradient is recalculated to determine the new direction in which to move.
¾ Reset x' = x' + t*∇f(x'), where t* is the positive value of t that maximizes f(x' + t∇f(x')); that is, f(x' + t*∇f(x')) = max_{t ≥ 0} f(x' + t∇f(x')).
9 The iterations continue until ∇f(x) = 0 within a small tolerance ε.
‰ Summary of the Gradient Search Procedures
9 Initialization: Select ε and any initial trail solution x’. Go first to the stopping
rule.
9 Step 1: Express f(x' + t∇f(x')) as a function of t by setting xj = x'j + t(∂f/∂xj)|x=x' for j = 1, 2, …, n, and then substituting these expressions into f(x).
9 Step 2: Use the one-dimensional search procedure to find t = t* that maximizes
f(x’+t ∇ f(x’)) over t ≥ 0.
9 Step 3: Reset x’ = x’ + t* ∇ f(x’). Then go to the stopping rule.
9 Stopping Rule: Evaluate ∇f(x') at x = x'. Check whether |∂f/∂xj| ≤ ε for all j = 1, 2, …, n. If so, stop with the current x' as the desired approximation of an optimal solution x*. Otherwise, perform another iteration.
‰ Example of multivariable unconstrained nonlinear programming
Max f(x) = 2x1x2 + 2x2 – x1² – 2x2²
∂f/∂x1 = 2x2 − 2x1,  ∂f/∂x2 = 2x1 + 2 − 4x2
We verify that f(x) is concave (∂²f/∂x1² = –2 ≤ 0, ∂²f/∂x2² = –4 ≤ 0, and (–2)(–4) – 2² = 4 ≥ 0).
Suppose we pick x = (0, 0) as the initial trial solution.
∇f(0, 0) = (0, 2)
9 Iteration 1: x = (0, 0) + t(0, 2) = (0, 2t)
f(x' + t∇f(x')) = f(0, 2t) = 4t – 8t², which is maximized at t* = 1/4, giving the new x' = (0, 1/2).
9 Iteration 2: x = (0, 1/2) + t(1, 0) = (t, 1/2)
9 Usually, we will use a table for convenience.
Iteration   x'   ∇f(x')   x' + t∇f(x')   f(x' + t∇f(x'))   t*   x' + t*∇f(x')
1
2
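9 A rough Python sketch of the gradient search for this example. The one-dimensional search for t* is done here by bisection on the derivative of f(x' + t∇f(x')) over an assumed interval [0, 1], which is only one possible implementation choice:
```python
# Sketch of the gradient search procedure for
# Max f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2, starting from x = (0, 0).

def f(x1, x2):
    return 2*x1*x2 + 2*x2 - x1**2 - 2*x2**2

def grad(x1, x2):
    return (2*x2 - 2*x1, 2*x1 + 2 - 4*x2)

def line_search(x, g, t_hi=1.0, tol=1e-8):
    """Maximize phi(t) = f(x + t g) over [0, t_hi] by bisection on phi'(t)
    (t_hi = 1 is an assumed bound that suffices for this example)."""
    def dphi(t):
        g1, g2 = grad(x[0] + t*g[0], x[1] + t*g[1])
        return g1*g[0] + g2*g[1]        # chain rule: phi'(t) = grad f . g
    t_lo = 0.0
    while t_hi - t_lo > tol:
        t = (t_lo + t_hi) / 2
        if dphi(t) >= 0:
            t_lo = t
        else:
            t_hi = t
    return (t_lo + t_hi) / 2

def gradient_search(x, eps=1e-4):
    while max(abs(gj) for gj in grad(*x)) > eps:
        g = grad(*x)
        t = line_search(x, g)
        x = (x[0] + t*g[0], x[1] + t*g[1])
        print(f"x' = ({x[0]:.4f}, {x[1]:.4f}), f = {f(*x):.5f}")
    return x

gradient_search((0.0, 0.0))             # converges toward (1, 1), where grad f = 0
```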
‰ For minimization problems
9 We move in the opposite direction; that is, x' = x' – t*∇f(x').
9 The other change is that t = t* now minimizes f(x' – t∇f(x')) over t ≥ 0.
‰ Necessary and Sufficient Conditions for Optimality (Maximization)
Problem                        Necessary Condition               Also Sufficient if:
One-variable unconstrained     df/dx = 0                         f(x) concave
Multivariable unconstrained    ∂f/∂xj = 0 (j = 1, 2, …, n)       f(x) concave
General constrained problem    KKT conditions                    f(x) concave and gi(x) convex
‰ The Karush-Kuhn-Tucker (KKT) Conditions for Constrained Optimization
9 Assume that f(x), g1(x), g2(x), …, gm(x) are differentiable functions. Then x* = (x1*, x2*, …, xn*) can be an optimal solution for the nonlinear programming problem only if there exist m numbers u1, u2, …, um such that all the following KKT conditions are satisfied:
(1) ∂f/∂xj − Σ_{i=1}^{m} ui ∂gi/∂xj ≤ 0, at x = x*, for j = 1, 2, …, n
(2) xj* (∂f/∂xj − Σ_{i=1}^{m} ui ∂gi/∂xj) = 0, at x = x*, for j = 1, 2, …, n
(3) gi(x*) – bi ≤ 0, for i =1, 2, …, m
(4) ui [gi(x*) – bi] = 0, for i =1, 2, …, m
(5) xj* ≥ 0, for j = 1, 2, …, n
(6) ui ≥ 0, for i = 1, 2, …, m
‰ Corollary of KKT Theorem (Sufficient Conditions)
9 Note that satisfying these conditions does not guarantee that the solution is
optimal.
9 Assume that f(x) is a concave function and that g1(x), g2(x), …, gm(x) are
convex functions. Then x* = (x1*, x2*, … , xn*) is an optimal solution if and only
if all the KKT conditions are satisfied.
‰ An Example
Max f(x) = ln(x1 + 1) + x2
S.T. 2x1 + x2 ≤ 3
x1, x2 ≥ 0
n = 2; m = 1; g1(x) = 2x1 + x2 is convex; f(x) is concave.
1. (j = 1) 1/(x1 + 1) − 2u1 ≤ 0
2. (j = 1) x1[1/(x1 + 1) − 2u1] = 0
1. (j = 2) 1 − u1 ≤ 0
2. (j = 2) x 2 (1 − u1 ) = 0
3. 2 x1 + x2 − 3 ≤ 0
4. u1 (2 x1 + x 2 − 3) = 0
5. x1 ≥ 0, x2 ≥ 0
6. u1 ≥ 0
9 Therefore, x1 = 0, x2 = 3, and u1 = 1 satisfy all the KKT conditions, so the optimal solution is (x1, x2) = (0, 3).
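9 A small sketch that numerically checks the six KKT conditions of this example at (x1, x2, u1) = (0, 3, 1):
```python
# Sketch: verify the KKT conditions of Max ln(x1 + 1) + x2
# s.t. 2*x1 + x2 <= 3, x >= 0 at the candidate point (x1, x2, u1) = (0, 3, 1).

def kkt_holds(x1, x2, u1, tol=1e-9):
    c1a = 1/(x1 + 1) - 2*u1 <= tol                 # condition 1, j = 1
    c2a = abs(x1*(1/(x1 + 1) - 2*u1)) <= tol       # condition 2, j = 1
    c1b = 1 - u1 <= tol                            # condition 1, j = 2
    c2b = abs(x2*(1 - u1)) <= tol                  # condition 2, j = 2
    c3  = 2*x1 + x2 - 3 <= tol                     # condition 3
    c4  = abs(u1*(2*x1 + x2 - 3)) <= tol           # condition 4
    c5  = x1 >= -tol and x2 >= -tol                # condition 5
    c6  = u1 >= -tol                               # condition 6
    return all([c1a, c2a, c1b, c2b, c3, c4, c5, c6])

print(kkt_holds(0, 3, 1))   # True: since f is concave and g convex, (0, 3) is optimal
```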
‰ How to solve the KKT conditions
9 Sorry, there is no easy way.
9 In the above example, there are 8 combinations to consider, since each of x1 (≥ 0), x2 (≥ 0), and u1 (≥ 0) can be either zero or positive. Try each one until a combination satisfying all the conditions is found.
9 What if there are lots of variables?
9 Let’s look at some easier (special) cases.
‰ Quadratic Programming
Max f(x) = cx – 1/2 xTQx
S.T. Ax ≤ b
x≥0
9 The objective function is f(x) = cx – (1/2)xᵀQx = Σ_{j=1}^{n} cj xj – (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} qij xi xj.
9 The qij are the elements of Q. If i = j, then xixj = xj², so –(1/2)qjj is the coefficient of xj². If i ≠ j, then –(1/2)(qij xixj + qji xjxi) = –qij xixj, so –qij is the coefficient for the product of xi and xj (since qij = qji).
9 An example
Max f(x1, x2) = 15x1 + 30x2 + 4x1x2 – 2x1² – 4x2²
S.T. x1 + 2x2 ≤ 30
x1, x2 ≥ 0
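9 Matching the coefficient rule above to this example gives, as a sketch to verify, c = (15, 30) and Q with q11 = 4, q22 = 8, q12 = q21 = –4, together with A = [1 2] and b = 30. A quick numerical check (assuming NumPy is available):
```python
# Sketch: the data c, Q, A, b implied by the coefficient rule for this example,
# with a check that c x - (1/2) x^T Q x reproduces the stated objective.
import numpy as np

c = np.array([15.0, 30.0])
Q = np.array([[ 4.0, -4.0],
              [-4.0,  8.0]])     # q11 = 4, q22 = 8, q12 = q21 = -4
A = np.array([[1.0, 2.0]])
b = np.array([30.0])

def f(x1, x2):
    return 15*x1 + 30*x2 + 4*x1*x2 - 2*x1**2 - 4*x2**2

x = np.array([3.0, 5.0])         # arbitrary test point
print(c @ x - 0.5 * x @ Q @ x, f(*x))   # both values should agree
```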
9 The KKT conditions for the above quadratic programming problem.
1. (j = 1) 15 + 4x2 – 4x1 – u1 ≤ 0
2. (j = 1) x1(15 + 4x2 – 4x1 – u1) = 0
1. (j = 2) 30 + 4x1 – 8x2 – 2u1 ≤ 0
2. (j = 2) x2(30 + 4x1 – 8x2 – 2u1) = 0
3. x1 + 2x2 – 30 ≤ 0
4. u1(x1 + 2x2 – 30) = 0
5. x1 ≥ 0, x2 ≥ 0
6. u1 ≥ 0
9 Introduce slack variables (y1, y2, and v1) for conditions 1 (j = 1), 1 (j = 2), and 3.
1. (j = 1) –4x1 + 4x2 – u1 + y1 = –15
1. (j = 2) 4x1 – 8x2 – 2u1 + y2 = –30
3. x1 + 2x2 + v1 = 30
Condition 2 (j = 1) can be reexpressed as
2. (j = 1) x1y1 = 0
Similarly, we have
2. (j = 2) x2y2 = 0
4.
u1v1 = 0
9 For each of these pairs—(x1, y1), (x2, y2), (u1, v1)—the two variables are called
complementary variables, because only one of them can be nonzero.
¾ Combine them into one constraint x1y1 + x2y2 + u1v1 = 0, called the
complementary constraint.
9 Rewrite the whole set of conditions
4x1 – 4x2 + u1 – y1 = 15
–4x1 + 8x2 + 2u1 – y2 = 30
x1 + 2x2 + v1 = 30
x1y1 + x2y2 + u1v1 = 0
x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, y1 ≥ 0, y2 ≥ 0, v1 ≥ 0
9 Except for the complementary constraint, they are all linear constraints.
9 For any quadratic programming problem, its KKT conditions have this form
Qx + Aᵀu – y = cᵀ
Ax + v = b
x ≥ 0, u ≥ 0, y ≥ 0, v ≥ 0
xᵀy + uᵀv = 0
9 Assume the objective function (of a quadratic programming problem) is
concave and constraints are convex (they are all linear).
9 Thus, x is optimal if and only if there exist values of y, u, and v such that all four
vectors together satisfy all these conditions.
9 The original problem is thereby reduced to the equivalent problem of finding a
feasible solution to these constraints.
9 These constraints are really the constraints of a LP except the complementary
constraint. Why don’t we just modify the Simplex Method?
Jin Y. Wang
Chap12-15
College of Management, NCTU
Operation Research II
Spring, 2009
‰ The Modified Simplex Method
9 The complementary constraint implies that it is not permissible for both
complementary variables of any pair to be basic variables.
9 The problem reduces to finding an initial BF solution to a linear programming problem that has these constraints, subject to this additional restriction on the identity of the basic variables.
9 When cT ≤ 0 (unlikely) and b ≥ 0, the initial solution is easy to find.
x = 0, u = 0, y = – cT, v = b
9 Otherwise, introduce an artificial variable into each of the equations where cj > 0 or bi < 0, in order to use these artificial variables as initial basic variables.
¾ This choice of initial basic variables sets x = 0 and u = 0 automatically, which satisfies the complementary constraint.
9 Then, use phase 1 of the two-phase method to find a BF solution for the real
problem.
¾ That is, apply the simplex method to Min Z = Σ_j zj (where the zj are the artificial variables), subject to the linear programming constraints obtained from the KKT conditions, but with these artificial variables included.
¾ Still need to modify the simplex method to satisfy the complementary
constraint.
9 Restricted-Entry Rule:
¾ Exclude from consideration any nonbasic variable to be the entering
variable whose complementary variable already is a basic variable.
¾ Choose among the other nonbasic variables according to the usual criterion.
¾ This rule keeps the complementary constraint satisfied all the time.
9 When an optimal solution x*, u*, y*, v*, z1 = 0, …, zn = 0 is obtained for the phase
1 problem, x* is the desired optimal solution for the original quadratic
programming problem.
‰ A Quadratic Programming Example
Max 15x1 + 30x2 + 4x1x2 – 2x1² – 4x2²
S.T. x1 + 2x2 ≤ 30
x1, x2 ≥ 0
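9 Rather than carrying out the modified simplex method by hand, a small sketch (assuming NumPy is available) can enumerate the complementary cases of the KKT system above directly. This brute-force approach is practical only because the example is tiny; it should recover the same solution the modified simplex method would find:
```python
# Sketch: solve the example's KKT system by enumerating which member of each
# complementary pair (x1,y1), (x2,y2), (u1,v1) is fixed at zero. This replaces,
# for illustration only, the restricted-entry modified simplex described above.
import itertools
import numpy as np

# Equality system in (x1, x2, u1, y1, y2, v1), from the rewritten conditions:
#   4x1 - 4x2 +  u1 - y1           = 15
#  -4x1 + 8x2 + 2u1      - y2      = 30
#    x1 + 2x2                 + v1 = 30
M = np.array([[ 4.0, -4.0, 1.0, -1.0,  0.0, 0.0],
              [-4.0,  8.0, 2.0,  0.0, -1.0, 0.0],
              [ 1.0,  2.0, 0.0,  0.0,  0.0, 1.0]])
rhs = np.array([15.0, 30.0, 30.0])
pairs = [(0, 3), (1, 4), (2, 5)]          # indices of the complementary pairs

for zeros in itertools.product(*pairs):   # fix one variable of each pair at 0
    keep = [j for j in range(6) if j not in zeros]
    sol, _, _, _ = np.linalg.lstsq(M[:, keep], rhs, rcond=None)
    z = np.zeros(6)
    z[keep] = sol
    if np.allclose(M @ z, rhs) and np.all(z >= -1e-9):
        print("x1, x2, u1 =", z[0], z[1], z[2])    # expected: 12, 9, 3
```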
‰ Constrained Optimization with Equality Constraints
9 Consider the problem of finding the minimum or maximum of the function f(x),
subject to the restriction that x must satisfy all the equations
g1(x) = b1
…
gm(x) = bm
9 Example:
Max f(x1, x2) = x1² + 2x2
S.T. g(x1, x2) = x1² + x2² = 1
9 A classical method is the method of Lagrange multipliers.
¾ The Lagrangian function is h(x, λ) = f(x) − Σ_{i=1}^{m} λi[gi(x) − bi], where (λ1, λ2, …, λm) are called Lagrange multipliers.
9 For the feasible values of x, gi(x) – bi = 0 for all i, so h(x, λ ) = f(x).
9 The method reduces to analyzing h(x, λ ) by the procedure for unconstrained
optimization.
¾ Set all partial derivatives to zero:
∂h/∂xj = ∂f/∂xj − Σ_{i=1}^{m} λi ∂gi/∂xj = 0, for j = 1, 2, …, n
∂h/∂λi = −gi(x) + bi = 0, for i = 1, 2, …, m
¾ Notice that the last m equations are equivalent to the constraints in the
original problem, so only feasible solutions are considered.
9 Back to our example
¾ h(x1, x2, λ) = x1² + 2x2 – λ(x1² + x2² – 1).
¾ ∂h/∂x1 = 2x1 – 2λx1 = 2x1(1 – λ)
∂h/∂x2 = 2 – 2λx2
∂h/∂λ = –(x1² + x2² – 1)
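9 A sketch (assuming SymPy is available) that solves these stationarity conditions for the example; the candidate with the largest objective value is the constrained maximum:
```python
# Sketch: solve the Lagrangian stationarity conditions of
# Max x1^2 + 2*x2 s.t. x1^2 + x2^2 = 1, using sympy.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)
h = x1**2 + 2*x2 - lam*(x1**2 + x2**2 - 1)

# Set all partial derivatives of h to zero and solve the resulting system.
stationary = sp.solve([sp.diff(h, v) for v in (x1, x2, lam)], (x1, x2, lam), dict=True)
for s in stationary:
    print(s, "f =", (x1**2 + 2*x2).subs(s))
```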
‰ Other types of Nonlinear Programming Problems
9 Separable Programming
¾ It is a special case of convex programming with one additional assumption:
f(x) and g(x) functions are separable functions.
¾ A separable function is a function where each term involves just a single
variable.
¾ Example: f(x1, x2) = 126x1 – 9x1² + 182x2 – 13x2² = f1(x1) + f2(x2)
f1(x1) = 126x1 – 9x1²
f2(x2) = 182x2 – 13x2²
¾ Such problem can be closely approximated by a linear programming
problem. Please refer to section 12.8 for details.
9 Geometric Programming
¾ The objective and the constraint functions take the form
g(x) = Σ_{i=1}^{N} ci Pi(x), where Pi(x) = x1^{ai1} x2^{ai2} ··· xn^{ain}, for i = 1, 2, …, N
¾ When all the ci are strictly positive and the objective function is to be minimized, this geometric programming problem can be converted to a convex programming problem by setting xj = e^{yj}.
9 Fractional Programming
¾ Suppose that the objective function is in the form of a (linear) fraction.
Maximize f(x) = f1(x) / f2(x) = (cx + c0) / (dx + d0).
¾ Also assume that the constraints are linear: Ax ≤ b, x ≥ 0.
¾ We can transform it to an equivalent problem of a standard type for which
effective solution procedures are available.
¾ We can transform the problem to an equivalent linear programming
problem by letting y = x / (dx + d0) and t = 1 / (dx + d0), so that x = y/t.
¾ The original formulation is transformed to a linear programming problem.
Max Z = cy + c0t
S.T. Ay – bt ≤ 0
dy + d0t = 1
y,t ≥ 0
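¾ As a sketch of this transformation on a small hypothetical instance (not from the text), assuming SciPy is available:
```python
# Sketch of the fractional-to-LP transformation above, on a made-up instance:
#   Max (3x1 + 2x2 + 1) / (x1 + x2 + 2)  s.t.  x1 + x2 <= 4, x >= 0,
# i.e. c = (3, 2), c0 = 1, d = (1, 1), d0 = 2, A = [1 1], b = 4.
# With y = x/(dx + d0) and t = 1/(dx + d0) it becomes the LP below.
from scipy.optimize import linprog

# LP in (y1, y2, t): Max 3y1 + 2y2 + t  s.t.  y1 + y2 - 4t <= 0,  y1 + y2 + 2t = 1
res = linprog(c=[-3.0, -2.0, -1.0],          # linprog minimizes, so negate
              A_ub=[[1.0, 1.0, -4.0]], b_ub=[0.0],
              A_eq=[[1.0, 1.0, 2.0]], b_eq=[1.0],
              bounds=[(0, None)] * 3)

y1, y2, t = res.x
print("x =", (y1 / t, y2 / t), "objective =", -res.fun)   # recover x = y/t
```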
Jin Y. Wang