
Part 4 Nonlinear Programming
4.1 Introduction
Standard Form
$$\min f(\mathbf{y})$$
s.t.
$$h_j(\mathbf{y}) = 0, \qquad j = 1, 2, \ldots, n$$
$$g_j(\mathbf{y}) \le 0, \qquad j = n+1, n+2, \ldots, n+p$$
where
$$\mathbf{y} = [\, y_1 \;\; y_2 \;\; \cdots \;\; y_{n+m} \,]^T$$
An Intuitive Approach to Handle the Equality Constraints

When there are only one or two equality constraints, one method is to solve for one or two variables and eliminate them from the problem formulation by substitution.
EX.
$$f(\mathbf{y}) = y_1^2 + y_2^2$$
$$\text{s.t.} \quad y_1 + y_2 = 1$$

Soln: Substitute $y_2 = 1 - y_1$:
$$f(\mathbf{y}) = y_1^2 + y_2^2 = y_1^2 + (1 - y_1)^2 = 2y_1^2 - 2y_1 + 1 = f(y_1)$$
$$\frac{df}{dy_1} = 4y_1 - 2 = 0 \;\Rightarrow\; y_1 = \frac{1}{2} \;\Rightarrow\; y_2 = \frac{1}{2}$$
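The same answer can be checked numerically. Below is a minimal sketch (our own, not part of the original notes; assumes SciPy is available) that minimizes the reduced one-variable function obtained by substitution:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Reduced objective after eliminating y2 = 1 - y1 by substitution.
def f_reduced(y1):
    return y1**2 + (1.0 - y1)**2

res = minimize_scalar(f_reduced)   # unconstrained 1-D minimization
y1 = res.x
y2 = 1.0 - y1                      # recover the eliminated variable
print(y1, y2)                      # both approach 0.5
```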
Use of Lagrange Multipliers to Handle n Equality Constraints and m+n Variables

$$\min f(y_1, y_2, \ldots, y_{n+m})$$
s.t.
$$h_1(y_1, y_2, \ldots, y_{n+m}) = 0$$
$$h_2(y_1, y_2, \ldots, y_{n+m}) = 0$$
$$\vdots$$
$$h_n(y_1, y_2, \ldots, y_{n+m}) = 0$$
Equivalent Formulation

Relabel the variables as $n$ state variables $\mathbf{x}$ and $m$ decision variables $\mathbf{u}$:
$$\min f(x_1, \ldots, x_n;\, u_1, \ldots, u_m)$$
s.t.
$$h_1(x_1, \ldots, x_n;\, u_1, \ldots, u_m) = 0$$
$$\vdots$$
$$h_n(x_1, \ldots, x_n;\, u_1, \ldots, u_m) = 0$$
or, compactly,
$$\min f(\mathbf{x}; \mathbf{u}) \tag{1}$$
s.t.
$$\mathbf{h}(\mathbf{x}; \mathbf{u}) = \mathbf{0} \tag{2}$$
($\mathbf{x}$: state variables; $\mathbf{u}$: decision variables)
Choice of Decision Variables

For a given optimization problem, the choice of which variables to designate as the decision (control) variables is not unique. It is only a matter of convenience to make a distinction between decision and state variables.
1st Derivation of Necessary Conditions (i)

A stationary point is one where
$$df = 0 = \left(\frac{\partial f}{\partial \mathbf{x}}\right) d\mathbf{x} + \left(\frac{\partial f}{\partial \mathbf{u}}\right) d\mathbf{u} \tag{3}$$
for arbitrary $d\mathbf{u}$ while holding
$$d\mathbf{h} = \mathbf{0} = \left(\frac{\partial \mathbf{h}}{\partial \mathbf{x}}\right)_{n \times n} d\mathbf{x} + \left(\frac{\partial \mathbf{h}}{\partial \mathbf{u}}\right)_{n \times m} d\mathbf{u} \tag{4}$$
and letting $d\mathbf{x}$ change as it will. Here
$$\frac{\partial f}{\partial \mathbf{x}} = \left[\frac{\partial f}{\partial x_1} \;\; \frac{\partial f}{\partial x_2} \;\; \cdots \;\; \frac{\partial f}{\partial x_n}\right], \qquad d\mathbf{x} = \begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_n \end{bmatrix}, \qquad \frac{\partial \mathbf{h}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial h_1}{\partial x_1} & \dfrac{\partial h_1}{\partial x_2} & \cdots & \dfrac{\partial h_1}{\partial x_n} \\ \dfrac{\partial h_2}{\partial x_1} & \dfrac{\partial h_2}{\partial x_2} & \cdots & \dfrac{\partial h_2}{\partial x_n} \\ \vdots & & & \vdots \\ \dfrac{\partial h_n}{\partial x_1} & \dfrac{\partial h_n}{\partial x_2} & \cdots & \dfrac{\partial h_n}{\partial x_n} \end{bmatrix}_{n \times n}$$
1st Derivation of Necessary Conditions (ii)

If $\left(\frac{\partial \mathbf{h}}{\partial \mathbf{x}}\right)_{n \times n}$ is nonsingular (and it should be if $\mathbf{u}$ determines $\mathbf{x}$ from Eq (2)), Eq (4) can be solved for $d\mathbf{x}$, i.e.
$$d\mathbf{x} = -\mathbf{h}_x^{-1} \mathbf{h}_u \, d\mathbf{u} \tag{5}$$
Substituting into Eq (3) yields
$$df = 0 = \left( f_u - f_x \mathbf{h}_x^{-1} \mathbf{h}_u \right) d\mathbf{u} \tag{6}$$
Hence, if $df$ is to be zero for arbitrary $d\mathbf{u}$, it is necessary that
$$f_u - f_x \mathbf{h}_x^{-1} \mathbf{h}_u = 0 \quad (m \text{ equations}) \tag{7a}$$
These $m$ equations together with
$$\mathbf{h}(\mathbf{x}; \mathbf{u}) = \mathbf{0} \quad (n \text{ equations}) \tag{7b}$$
determine $\mathbf{u}$ and $\mathbf{x}$.
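The reduced gradient in Eq (7a) is straightforward to evaluate numerically. The sketch below (our own construction, with an illustrative problem we chose: $f = x^2 + u^2$, $h = x + 2u - 3 = 0$) checks that it vanishes at the constrained minimum $x = 0.6$, $u = 1.2$:

```python
import numpy as np

# Illustrative problem (ours): min x^2 + u^2  s.t.  x + 2u - 3 = 0.
# State x, decision u; f_x = 2x, f_u = 2u, h_x = 1, h_u = 2.
def reduced_gradient(x, u):
    f_x = np.array([[2.0 * x]])    # row vector df/dx
    f_u = np.array([[2.0 * u]])    # row vector df/du
    h_x = np.array([[1.0]])        # n x n Jacobian dh/dx (nonsingular)
    h_u = np.array([[2.0]])        # n x m Jacobian dh/du
    # Eq (7a): f_u - f_x h_x^{-1} h_u
    return f_u - f_x @ np.linalg.solve(h_x, h_u)

print(reduced_gradient(0.6, 1.2))  # -> [[0.]] at the constrained minimum
```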
1st Derivation of Necessary Conditions (iii)

In other words, Eq (7) represents
$$\left.\frac{\partial f}{\partial \mathbf{u}}\right|_{\mathbf{h}} = 0$$
But, notice that, in general,
$$\left.\frac{\partial f}{\partial \mathbf{u}}\right|_{\mathbf{h}} \ne \left.\frac{\partial f}{\partial \mathbf{u}}\right|_{\mathbf{x}}$$
2nd Derivation of Necessary Conditions (i)

Consider first a special case:
$$\min f(x, y, z)$$
s.t.
$$g(x, y, z) = 0$$
$$h(x, y, z) = 0$$
At an extremum $(x^*, y^*, z^*)$,
$$df = f_x\,dx + f_y\,dy + f_z\,dz = 0$$
Since $dx$, $dy$ and $dz$ are not independent, we cannot conclude that $f_x$, $f_y$ and $f_z$ vanish identically.
2nd Derivation of Necessary Conditions (ii)

Since $g$ and $h$ must be maintained constant at zero,
$$dg = 0 = g_x\,dx + g_y\,dy + g_z\,dz$$
$$dh = 0 = h_x\,dx + h_y\,dy + h_z\,dz$$
Let us introduce two extra unknowns $\lambda_1$ and $\lambda_2$:
$$df + \lambda_1\,dg + \lambda_2\,dh = (f_x + \lambda_1 g_x + \lambda_2 h_x)\,dx + (f_y + \lambda_1 g_y + \lambda_2 h_y)\,dy + (f_z + \lambda_1 g_z + \lambda_2 h_z)\,dz = 0$$
2nd Derivation of Necessary Conditions (iii)

Now, it is possible to find nontrivial $\lambda_1$ and $\lambda_2$ such that the coefficients of at least one pair of the differentials vanish, say
$$f_x + \lambda_1 g_x + \lambda_2 h_x = 0 \tag{8}$$
$$f_y + \lambda_1 g_y + \lambda_2 h_y = 0 \tag{9}$$
Otherwise, all pairs must satisfy
$$\begin{vmatrix} g_x & h_x \\ g_y & h_y \end{vmatrix} = \begin{vmatrix} g_y & h_y \\ g_z & h_z \end{vmatrix} = \begin{vmatrix} g_z & h_z \\ g_x & h_x \end{vmatrix} = 0$$
and $g$ and $h$ would be functionally dependent and, thus, they must be either equivalent or inconsistent.
2nd Derivation of Necessary Conditions (iv)

If $\lambda_1$ and $\lambda_2$ are determined by Eqs (8) and (9), the remaining differential can be arbitrarily assigned, so its coefficient must also be forced to zero:
$$f_z + \lambda_1 g_z + \lambda_2 h_z = 0 \tag{10}$$
Eqs (8), (9) and (10) together with
$$g(x, y, z) = 0$$
$$h(x, y, z) = 0$$
can be solved simultaneously.
2nd Derivation of Necessary Conditions - General Formulation

$$\min f(\mathbf{x}; \mathbf{u})$$
s.t.
$$\mathbf{h}(\mathbf{x}; \mathbf{u}) = \mathbf{0}$$
Necessary conditions are:
$$f_x + \boldsymbol{\lambda}^T \mathbf{h}_x = 0 \quad (n \text{ equations}) \tag{11}$$
$$f_u + \boldsymbol{\lambda}^T \mathbf{h}_u = 0 \quad (m \text{ equations}) \tag{12}$$
together with
$$\mathbf{h}(\mathbf{x}; \mathbf{u}) = \mathbf{0} \quad (n \text{ equations})$$
From Eq (11),
$$\boldsymbol{\lambda}^T = -f_x \mathbf{h}_x^{-1}$$
Substituting this into Eq (12) yields Eq (7):
$$f_u - f_x \mathbf{h}_x^{-1} \mathbf{h}_u = 0$$
3rd Derivation with Lagrange Multipliers

Adjoin the constraints to the objective function by a set of $n$ undetermined multipliers $\lambda_1, \lambda_2, \ldots, \lambda_n$, i.e.
$$L(\mathbf{x}, \mathbf{u}, \boldsymbol{\lambda}) = f(\mathbf{x}, \mathbf{u}) + \boldsymbol{\lambda}^T \mathbf{h}(\mathbf{x}, \mathbf{u})$$
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are called Lagrange multipliers. Treat the minimization problem
$$\min_{\mathbf{x}, \mathbf{u}, \boldsymbol{\lambda}} L(\mathbf{x}, \mathbf{u}, \boldsymbol{\lambda})$$
as an unconstrained problem. The necessary conditions are:
$$\frac{\partial L}{\partial \mathbf{x}} = \frac{\partial f}{\partial \mathbf{x}} + \boldsymbol{\lambda}^T \frac{\partial \mathbf{h}}{\partial \mathbf{x}} = 0$$
$$\frac{\partial L}{\partial \mathbf{u}} = \frac{\partial f}{\partial \mathbf{u}} + \boldsymbol{\lambda}^T \frac{\partial \mathbf{h}}{\partial \mathbf{u}} = 0$$
$$\frac{\partial L}{\partial \boldsymbol{\lambda}} = \mathbf{h} = \mathbf{0}$$
which are the same as before.
Example:
$$f = \frac{1}{2}\left(\frac{x^2}{a^2} + \frac{u^2}{b^2}\right)$$
subject to $h(x, u) = x + mu - c = 0$.

Solution:
$$L = f + \lambda h = \frac{1}{2}\left(\frac{x^2}{a^2} + \frac{u^2}{b^2}\right) + \lambda (x + mu - c)$$
$$\frac{\partial L}{\partial x} = \frac{x}{a^2} + \lambda = 0; \qquad \frac{\partial L}{\partial u} = \frac{u}{b^2} + \lambda m = 0; \qquad \frac{\partial L}{\partial \lambda} = x + mu - c = 0$$
$$\Rightarrow\; u = \frac{mcb^2}{a^2 + m^2 b^2}; \qquad x = \frac{ca^2}{a^2 + m^2 b^2}; \qquad \lambda = \frac{-c}{a^2 + m^2 b^2}$$
Example:
$$\min f(y_1, y_2) = y_1 + y_2$$
$$\text{s.t.} \quad h(y_1, y_2) = y_1^2 + y_2^2 - b = 0, \quad b = 1$$

Solution:
$$L = (y_1 + y_2) + \lambda (y_1^2 + y_2^2 - 1)$$
$$\frac{\partial L}{\partial y_1} = 1 + 2\lambda y_1 = 0$$
$$\frac{\partial L}{\partial y_2} = 1 + 2\lambda y_2 = 0$$
$$y_1^2 + y_2^2 - 1 = 0$$
The first two conditions give $y_1 = y_2$, and the constraint (for general $b$) then gives
$$y_1 = y_2 = \pm\sqrt{\frac{b}{2}}$$
At the minimum ($b = 1$): $y_1^* = y_2^* = -0.707$, $\lambda^* = 0.707$, and
$$\nabla f(\mathbf{y}^*) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}; \qquad \nabla h(\mathbf{y}^*) = \begin{bmatrix} -1.414 \\ -1.414 \end{bmatrix}$$
$\Rightarrow$ Two vectors pointing in opposite directions at the minimum!
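A numerical version of this example (our own sketch, assuming SciPy) using an equality-constrained solve; the multiplier is recovered from the stationarity relation $1 + 2\lambda y_1 = 0$:

```python
import numpy as np
from scipy.optimize import minimize

b = 1.0
f = lambda y: y[0] + y[1]
h = lambda y: y[0]**2 + y[1]**2 - b   # equality constraint h(y) = 0

res = minimize(f, x0=[-0.5, -0.4], method='SLSQP',
               constraints=[{'type': 'eq', 'fun': h}])
y1, y2 = res.x                        # both approach -sqrt(b/2) = -0.707
lam = -1.0 / (2.0 * y1)               # from 1 + 2*lam*y1 = 0
print(res.x, lam)                     # lam approaches 0.707
```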
Sensitivity Interpretation

From the previous example, as functions of $b$,
$$y_1^*(b) = y_2^*(b) = -\left(\frac{b}{2}\right)^{1/2}, \qquad \lambda^*(b) = (2b)^{-1/2}$$
$$V(b) = y_1^*(b) + y_2^*(b) = -(2b)^{1/2} \quad \leftarrow \text{minimum objective value}$$
$$\frac{dV}{db} = -(2b)^{-1/2} = -\lambda^*(b)$$
$$V(b) \approx V(1) + \left.\frac{dV}{db}\right|_{b=1} (b - 1) = V(1) - \lambda^*(1)\,(b - 1)$$
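The relation $dV/db = -\lambda^*(b)$ is easy to confirm by finite differences (a sketch we added, using the closed forms above):

```python
import numpy as np

# Closed-form results from the example: V(b) = -sqrt(2b), lambda*(b) = 1/sqrt(2b)
V = lambda b: -np.sqrt(2.0 * b)
lam_star = lambda b: 1.0 / np.sqrt(2.0 * b)

b, db = 1.0, 1e-6
dV_db = (V(b + db) - V(b - db)) / (2.0 * db)   # central difference
print(dV_db, -lam_star(b))                     # both approach -0.7071
```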
Generalized Sensitivity

$$\min f(\mathbf{y})$$
$$\text{s.t.} \quad \hat{h}_i(\mathbf{y}) = b_i, \qquad i = 1, 2, \ldots, m$$
Suppose that at $\mathbf{b} = \bar{\mathbf{b}}$ the optimal objective value is $V(\bar{\mathbf{b}})$ and the corresponding local optimum is $\mathbf{y}^*(\bar{\mathbf{b}})$ with multipliers $\boldsymbol{\lambda}^*(\bar{\mathbf{b}})$. Then
$$\lambda_i^*(\bar{\mathbf{b}}) = -\left.\frac{\partial V}{\partial b_i}\right|_{\bar{\mathbf{b}}}$$
The constraint with the largest absolute $\lambda_i$ value is the one whose right-hand side affects the optimal value function $V$ the most, at least for $\mathbf{b}$ close to $\bar{\mathbf{b}}$.
Problems with Inequality Constraints Only

$$\min f(\mathbf{y}) \quad \text{s.t.} \quad g_j(\mathbf{y}) \le 0, \qquad j = 1, 2, \ldots, p$$

For intuition, consider one variable $y$ and a single constraint $g(y) \le 0$. There are two scenarios at a minimum: (1) $g(y^*) < 0$ and (2) $g(y^*) = 0$.

(1) If $g(y^*) < 0$, the constraint is not active and can be ignored:
$$\left.\frac{df}{dy}\right|_{y^*} = 0$$
(2) If $g(y^*) = 0$, then
$$\operatorname{sgn}\left(\left.\frac{df}{dy}\right|_{y^*}\right) = -\operatorname{sgn}\left(\left.\frac{dg}{dy}\right|_{y^*}\right)$$
i.e., for any feasible perturbation $dy$,
$$\left.\frac{df}{dy}\right|_{y^*} dy \ge 0 \quad \text{and} \quad \left.\frac{dg}{dy}\right|_{y^*} dy \le 0$$
The above two possibilities can be expressed in one equation as
$$\frac{df}{dy} + \mu \frac{dg}{dy} = 0 \quad \text{and} \quad \mu \ge 0 \tag{13}$$
Two Scenarios at Minimum

[Figure: plots of $f(y)$ illustrating the two scenarios. In the first, the minimum is interior, so $\left.\frac{df}{dy}\right|_{y^*} = 0$ and $\mu = 0$. In the second, the minimum sits on the boundary $y^* = a$ where $g(y^*) = a - y^* = 0$, $\left.\frac{df}{dy}\right|_{y^*} \ne 0$, $\mu > 0$, and $\operatorname{sgn}\left(\left.\frac{df}{dy}\right|_{y^*}\right) = -\operatorname{sgn}\left(\left.\frac{dg}{dy}\right|_{y^*}\right)$.]
If the minimum is on the boundary, Eq (13) can be written as
$$\nabla f + \mu \nabla g = 0 \quad \text{and} \quad \mu \ge 0 \tag{14}$$
which should be interpreted as: $\nabla f$ is parallel to $\nabla g$ but pointing in the opposite direction.

[Figure: contour sketches of $f = \text{constant}$ with constraint boundaries $g_1 = 0$ and $g_2 = 0$. An area of improvement exists if Eq (14) is not satisfied; at the minimum, $-\nabla f$ lies between the gradients of the active constraints and no improving feasible direction remains.]
J Inequality Constraints and N Variables

The necessary condition is:
$$\nabla f + \boldsymbol{\mu}^T \nabla \mathbf{g} = 0$$
or
$$\nabla f + \mu_1 \nabla g_1 + \mu_2 \nabla g_2 + \cdots + \mu_J \nabla g_J = 0$$
where, for $i = 1, 2, \ldots, J$,
$$\mu_i \begin{cases} \ge 0 & g_i(y_1, y_2, \ldots, y_N) = 0 \quad \text{(active)} \\ = 0 & g_i(y_1, y_2, \ldots, y_N) < 0 \quad \text{(inactive)} \end{cases}$$
Since, at a minimum, the $\mu_i$'s have to be nonnegative, $\nabla f$ can be expressed as a negative linear combination of the $\nabla g_j$'s. In words, the gradient of $f$ w.r.t. $\mathbf{y}$ at a minimum must point in such a way that any decrease of $f$ can come only by violating the active constraints.
Geometrical Interpretation

At any local constrained optimum, no (small) allowable change in the problem variables can improve the value of the objective function: $\nabla f$ lies within the cone formed by the negative gradients of the active constraints.
General Formulation

$$\min f(y_1, y_2, \ldots, y_N) \tag{1}$$
s.t.
$$g_j(y_1, y_2, \ldots, y_N) \le 0, \qquad j = 1, 2, \ldots, J \tag{2}$$
$$h_k(y_1, y_2, \ldots, y_N) = 0, \qquad k = 1, 2, \ldots, K \tag{3}$$
where $N > K$.
Active Constraints

The inequality constraint $g_j(\mathbf{y}) \le 0$ is said to be an active or binding constraint at the point $\mathbf{y}$ if $g_j(\mathbf{y}) = 0$. It is said to be inactive or nonbinding if $g_j(\mathbf{y}) < 0$.
Kuhn-Tucker Conditions

$$\nabla_y f + \boldsymbol{\mu}^T \nabla_y \mathbf{g}(\mathbf{y}) + \boldsymbol{\lambda}^T \nabla_y \mathbf{h}(\mathbf{y}) = \mathbf{0}^T \tag{4}$$
$$\mathbf{g}(\mathbf{y}) \le \mathbf{0} \tag{5}$$
$$\mathbf{h}(\mathbf{y}) = \mathbf{0} \tag{6}$$
$$\boldsymbol{\mu} \ge \mathbf{0} \tag{7}$$
$$\boldsymbol{\mu}^T \mathbf{g}(\mathbf{y}) = 0, \quad \text{or} \quad \mu_j\, g_j(\mathbf{y}) = 0, \qquad j = 1, 2, \ldots, J \tag{8}$$
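A small helper (our own sketch, not from the notes) that evaluates the residuals of Eqs (4)-(8) at a candidate point, so a proposed $(\mathbf{y}, \boldsymbol{\mu}, \boldsymbol{\lambda})$ can be screened numerically:

```python
import numpy as np

def kkt_residuals(grad_f, grad_g, grad_h, g, h, y, mu, lam):
    """Residuals of the Kuhn-Tucker conditions (4)-(8) at (y, mu, lam).

    grad_f(y) -> (N,); grad_g(y) -> (J, N); grad_h(y) -> (K, N);
    g(y) -> (J,); h(y) -> (K,).  All residuals ~ 0 at a Kuhn-Tucker point.
    """
    stationarity = grad_f(y) + mu @ grad_g(y) + lam @ grad_h(y)  # Eq (4)
    primal_ineq = np.maximum(g(y), 0.0)                          # Eq (5)
    primal_eq = h(y)                                             # Eq (6)
    dual = np.maximum(-mu, 0.0)                                  # Eq (7)
    complementarity = mu * g(y)                                  # Eq (8)
    return stationarity, primal_ineq, primal_eq, dual, complementarity
```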
Kuhn-Tucker Necessity Theorem

Consider the NLP problem given by Eqs (1)-(3). Let $f$, $\mathbf{g}$ and $\mathbf{h}$ be differentiable functions and $\mathbf{y}^*$ be a feasible solution to the NLP. Let $I = \{\, j \mid g_j(\mathbf{y}^*) = 0 \,\}$. Furthermore, suppose $\nabla_y g_j(\mathbf{y}^*)$ for $j \in I$ and $\nabla_y h_k(\mathbf{y}^*)$ for $k = 1, 2, \ldots, K$ are linearly independent. If $\mathbf{y}^*$ is an optimal solution to the NLP, then there exists $(\boldsymbol{\mu}^*, \boldsymbol{\lambda}^*)$ such that $(\mathbf{y}^*, \boldsymbol{\mu}^*, \boldsymbol{\lambda}^*)$ solves the Kuhn-Tucker problem given by Eqs (4)-(8).
Remarks
• The Kuhn-Tucker necessity theorem helps to identify points that are not optimal.
• If the KTC are satisfied, there is no assurance that the solution is truly optimal.
Example 1.
$$\min f(\mathbf{y}) = y_1^2 + y_2$$
$$\text{s.t.} \quad y_1 + y_2 = 6, \quad y_1 - 1 \ge 0, \quad y_1^2 + y_2^2 \le 26$$

Solution: In standard form, $h = y_1 + y_2 - 6 = 0$, $g_1 = 1 - y_1 \le 0$ and $g_2 = y_1^2 + y_2^2 - 26 \le 0$, so
$$\nabla f = [\, 2y_1 \;\; 1 \,], \qquad \nabla g_1 = [\, -1 \;\; 0 \,], \qquad \nabla g_2 = [\, 2y_1 \;\; 2y_2 \,], \qquad \nabla h = [\, 1 \;\; 1 \,]$$
Eq (4) becomes
$$2y_1 - \mu_1 + 2\mu_2 y_1 + \lambda = 0$$
$$1 + 2\mu_2 y_2 + \lambda = 0$$
Eq (8) becomes
$$\mu_1 (1 - y_1) = 0$$
$$\mu_2 (y_1^2 + y_2^2 - 26) = 0$$
Example 2
$$\min f(\mathbf{y}) = y_1 y_2$$
$$\text{s.t.} \quad g(\mathbf{y}) = y_1^2 + y_2^2 - 25 \le 0$$
$$L(\mathbf{y}, \mu) = y_1 y_2 + \mu (y_1^2 + y_2^2 - 25)$$
$$\frac{\partial L}{\partial y_1} = y_2 + 2\mu y_1 = 0$$
$$\frac{\partial L}{\partial y_2} = y_1 + 2\mu y_2 = 0$$
$$\mu (y_1^2 + y_2^2 - 25) = 0 \quad \text{and} \quad \mu \ge 0$$

[Figure: contours of $f = y_1 y_2$ in the $(y_1, y_2)$ plane with the constraint circle $C: y_1^2 + y_2^2 = 25$.]
Sensitivity

$$\min f(y_1, y_2, \ldots, y_N)$$
s.t.
$$g_j(y_1, y_2, \ldots, y_N) = \hat{g}_j(y_1, y_2, \ldots, y_N) - c_j \le 0, \qquad j = 1, 2, \ldots, J$$
$$h_k(y_1, y_2, \ldots, y_N) = \hat{h}_k(y_1, y_2, \ldots, y_N) - b_k = 0, \qquad k = 1, 2, \ldots, K$$
$$L(\mathbf{y}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = f(\mathbf{y}) + \sum_{k=1}^{K} \lambda_k h_k(\mathbf{y}) + \sum_{j=1}^{J} \mu_j g_j(\mathbf{y})$$
$$\frac{\partial V}{\partial b_k} = -\lambda_k^*, \qquad \frac{\partial V}{\partial c_j} = -\mu_j^*$$
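For Example 2, written as $\hat{g} = y_1^2 + y_2^2 \le c$ with $c = 25$, the closed form $V(c) = -c/2$ gives $\partial V / \partial c = -1/2 = -\mu^*$, consistent with the formula above. A finite-difference sketch (ours):

```python
import numpy as np

V = lambda c: -c / 2.0    # optimal value of Example 2 as a function of c
mu_star = 0.5             # multiplier found above

c, dc = 25.0, 1e-6
dV_dc = (V(c + dc) - V(c - dc)) / (2.0 * dc)   # central difference
print(dV_dc, -mu_star)    # both equal -0.5
```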
Constraint Qualification

$\nabla g_j(\mathbf{y}^*)$ for $j \in I = \{\, j \mid g_j(\mathbf{y}^*) = 0 \,\}$ and $\nabla h_k(\mathbf{y}^*)$ for $k = 1, 2, \ldots, K$ must be linearly independent at the optimum. When the constraint qualification is not met at the optimum, there may or may not exist a solution to the Kuhn-Tucker problem.
Second-Order Optimality Conditions

$$\boldsymbol{\delta}^T \, \nabla^2 L(\mathbf{y}^*, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*) \, \boldsymbol{\delta} \ge 0$$
for all nonzero vectors $\boldsymbol{\delta}$ such that
$$\mathbf{J}(\mathbf{y}^*) \, \boldsymbol{\delta} = \mathbf{0}$$
where $\mathbf{J}$ is the matrix whose rows are the gradients of the constraints that are active at $\mathbf{y}^*$. In other words, the above equation defines a set of vectors $\boldsymbol{\delta}$ that are orthogonal to the gradients of the active constraints. These vectors form the tangent plane to the active constraints.
Necessary and Sufficient Conditions for Optimality

If a Kuhn-Tucker point satisfies the second-order sufficient conditions (the inequality above holding strictly), then optimality is guaranteed.
Basic Idea of Penalty Methods

$$\left.\begin{aligned} \min\; & f(\mathbf{x}) \\ \text{s.t.}\; & \mathbf{g}(\mathbf{x}) \le \mathbf{0} \\ & \mathbf{h}(\mathbf{x}) = \mathbf{0} \end{aligned}\right\} \;\Longrightarrow\; \min P(f, \mathbf{g}, \mathbf{h}, r)$$
where $P(f, \mathbf{g}, \mathbf{h}, r)$ is an unconstrained penalty function, and $r$ is a penalty parameter.
Example
$$\min f(\mathbf{x}) = (x_1 - 1)^2 + (x_2 - 2)^2$$
$$\text{s.t.} \quad h(\mathbf{x}) = x_1 + x_2 - 4 = 0$$
$$\Longrightarrow\; P(\mathbf{x}, r) = (x_1 - 1)^2 + (x_2 - 2)^2 + r\,(x_1 + x_2 - 4)^2$$
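A minimal sequential-penalty sketch for this example (ours, assuming SciPy; the true solution is $(1.5, 2.5)$, the projection of $(1, 2)$ onto the line $x_1 + x_2 = 4$), showing the minimizer of $P(\mathbf{x}, r)$ approaching it as $r$ grows:

```python
import numpy as np
from scipy.optimize import minimize

def P(x, r):
    # Quadratic penalty for the example above
    return (x[0] - 1)**2 + (x[1] - 2)**2 + r * (x[0] + x[1] - 4)**2

x = np.array([0.0, 0.0])
for r in [1.0, 10.0, 100.0, 1000.0]:
    x = minimize(lambda z: P(z, r), x).x   # warm-start from previous solution
    print(r, x)                            # x -> (1.5, 2.5) as r increases
```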
Exact L1 Penalty Function

$$P_1(\mathbf{x}, \mathbf{w}^{(1)}, \mathbf{w}^{(2)}) = f(\mathbf{x}) + \sum_{k=1}^{K} w_k^{(1)} \left| h_k(\mathbf{x}) \right| + \sum_{j=1}^{J} w_j^{(2)} \max\left[\, 0,\; g_j(\mathbf{x}) \,\right]$$
where the weights $w_k^{(1)} > 0$ and $w_j^{(2)} > 0$. If $(\mathbf{x}^*, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*)$ satisfy the Kuhn-Tucker conditions and if
$$w_k^{(1)} \ge \left| \lambda_k^* \right|, \qquad k = 1, 2, \ldots, K$$
$$w_j^{(2)} \ge \mu_j^*, \qquad j = 1, 2, \ldots, J$$
then it can be shown that $\mathbf{x}^*$ is a local minimum of $P_1(\mathbf{x}, \mathbf{w}^{(1)}, \mathbf{w}^{(2)})$. However, $P_1$ is nonsmooth at $h_k(\mathbf{x}) = 0$ and $g_j(\mathbf{x}) = 0$.
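Applied to the penalty example above (our sketch): the Kuhn-Tucker multiplier there is $\lambda^* = -1$, so any weight $w^{(1)} \ge 1$ makes the $L_1$ penalty exact, and a single unconstrained minimization recovers $(1.5, 2.5)$ without driving $r \to \infty$. A derivative-free method is used because $P_1$ is nonsmooth:

```python
import numpy as np
from scipy.optimize import minimize

w1 = 2.0   # exceeds |lambda*| = 1, so the L1 penalty is exact

def P1(x):
    # Exact L1 penalty for: min (x1-1)^2 + (x2-2)^2  s.t.  x1 + x2 - 4 = 0
    return (x[0] - 1)**2 + (x[1] - 2)**2 + w1 * abs(x[0] + x[1] - 4)

# Nelder-Mead avoids derivatives, which fail at the constraint surface.
res = minimize(P1, x0=[0.0, 0.0], method='Nelder-Mead')
print(res.x)   # -> approximately (1.5, 2.5), the exact constrained minimum
```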
Equivalent Smooth Constrained Problem

$$\min\; f(\mathbf{x}) + \sum_{k=1}^{K} w_k^{(1)} \left( p_k^{(1)} + n_k^{(1)} \right) + \sum_{j=1}^{J} w_j^{(2)} p_j^{(2)}$$
s.t.
$$h_k(\mathbf{x}) = p_k^{(1)} - n_k^{(1)}, \qquad k = 1, 2, \ldots, K$$
$$g_j(\mathbf{x}) = p_j^{(2)} - n_j^{(2)}, \qquad j = 1, 2, \ldots, J$$
$$p_k^{(1)},\; n_k^{(1)},\; p_j^{(2)},\; n_j^{(2)} \ge 0$$
Here $p_k^{(1)} + n_k^{(1)} = \left| h_k(\mathbf{x}) \right|$ when $p_k^{(1)} n_k^{(1)} = 0$, and $p_j^{(2)} = \max\left[\, 0,\; g_j(\mathbf{x}) \,\right]$ when $p_j^{(2)} n_j^{(2)} = 0$.
Barrier Method

$$\min f(\mathbf{x}) = (x_1 - 1)^2 + (x_2 - 2)^2$$
$$\text{s.t.} \quad g(\mathbf{x}) = -(x_1 + x_2 - 4) \le 0$$
$$\Longrightarrow\; B(\mathbf{x}, r) = (x_1 - 1)^2 + (x_2 - 2)^2 - r \ln(x_1 + x_2 - 4) = (x_1 - 1)^2 + (x_2 - 2)^2 + r \ln\left(\frac{1}{x_1 + x_2 - 4}\right)$$
where $r > 0$ is a positive scalar called the barrier parameter.

[Figure: the objective $(x_1 - 1)^2 + (x_2 - 2)^2$ and the barrier term $-r \ln(x_1 + x_2 - 4)$ over the feasible region $x_1 + x_2 > 4$.]
Generalized Cases

$$\min f(\mathbf{x})$$
$$\text{s.t.} \quad g_j(\mathbf{x}) \le 0, \qquad j = 1, 2, \ldots, J$$
$$\Longrightarrow\; \min B(\mathbf{x}, r) = f(\mathbf{x}) - r \sum_{j=1}^{J} \ln\left[ -g_j(\mathbf{x}) \right]$$
Mixed Penalty-Barrier Method

The barrier method is not directly applicable to problems with equality constraints, but equality constraints can be handled with a penalty term while inequalities use a barrier term, leading to a mixed penalty-barrier method:
$$\min f(\mathbf{x})$$
$$\text{s.t.} \quad h_i(\mathbf{x}) = 0, \qquad i = 1, 2, \ldots, K$$
$$\qquad\;\;\; g_j(\mathbf{x}) \le 0, \qquad j = 1, 2, \ldots, J$$
$$\Longrightarrow\; \min PB(\mathbf{x}, r_1, r_2) = f(\mathbf{x}) + r_1 \sum_{i=1}^{K} h_i^2(\mathbf{x}) - r_2 \sum_{j=1}^{J} \ln\left[ -g_j(\mathbf{x}) \right]$$