Matlab-based Optimization: the Optimization Toolbox

Gene Cliff (AOE/ICAM - [email protected])
3:00pm - 4:45pm, Monday, 11 February 2013

AOE: Department of Aerospace and Ocean Engineering
ICAM: Interdisciplinary Center for Applied Mathematics


Matlab's Optimization Toolbox
- Classifying Optimization Problems ⇐
- A Soup Can Example
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve


Solver Categories

There are four general categories of Optimization Toolbox solvers:

Minimizers. This group of solvers attempts to find a local minimum of the objective function near a starting point x0. They address problems of unconstrained optimization, linear programming, quadratic programming, and general nonlinear programming.

Multiobjective minimizers. This group of solvers attempts to either minimize the maximum value of a set of functions (fminimax), or to find a location where a collection of functions is below some prespecified values (fgoalattain).

Least-squares (curve-fitting) solvers. This group of solvers attempts to minimize a sum of squares. This type of problem frequently arises in fitting a model to data. The solvers address problems of finding nonnegative solutions, bounded or linearly constrained solutions, and fitting parameterized nonlinear models to data.

Equation solvers. This group of solvers attempts to find a solution to a scalar- or vector-valued nonlinear equation f(x) = 0 near a starting point x0. Equation-solving can be considered a form of optimization because it is equivalent to finding the minimum norm of f(x) near x0.


Generic Optimization Problem

    min over x   f(x1, x2, ..., xn)

subject to

    equality constraints:      ceq_i(x1, x2, ..., xn) = 0,    i = 1, 2, ..., l
    inequality constraints:    c_j(x1, x2, ..., xn) <= 0,     j = 1, 2, ..., m
    simple bound constraints:  x_i^L <= x_i <= x_i^U,         i = 1, 2, ..., n


Classifying a Problem

Identify your objective function as one of five types:
- Linear
- Quadratic
- Sum-of-squares (least squares)
- Smooth nonlinear
- Nonsmooth

Identify your constraints as one of five types:
- None (unconstrained)
- Bound
- Linear (including bound)
- General smooth
- Discrete (binary integer)


Problem classification table

[Table of objective types vs. constraint types, from the Toolbox documentation.] We focus on fmincon.


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example ⇐
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve


Soup Can Example (from MathWorks Training Docs)

We are to design a soup can in the shape of a right circular cylinder. We are to choose values for:
 1. the diameter (d)
 2. the height (h)

Requirements are:
- the volume (pi*d^2*h/4) must be 333 cm^3
- the height can be no more than twice the diameter
- the cost is proportional to the surface area (pi*d^2/2 + pi*d*h), and should be minimized

Since the cost function and the volume constraint are nonlinear, we select fmincon.
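Before building separate function files, the whole soup-can problem can be sketched with anonymous functions (a minimal illustration, not from the original slides; since no gradients are supplied here, fmincon falls back on finite differences):

```matlab
% Soup-can problem with x = [d; h], volume fixed at 333 cm^3
cost = @(x) pi*x(1)*(x(2) + x(1)/2);        % area = pi*d^2/2 + pi*d*h

% fmincon expects [c, ceq] = nonlcon(x); here there are no nonlinear
% inequalities and one equality (the volume)
volume  = 333;
nonlcon = @(x) deal([], volume - (pi/4)*x(2)*x(1)^2);

A  = [-2 1];  b = 0;     % h <= 2*d  rewritten as  -2*d + h <= 0
x0 = [6; 10];            % initial guess
x_star = fmincon(cost, x0, A, b, [], [], [], [], nonlcon);
```

The deal idiom lets a single anonymous function return both constraint outputs; the function-file version used in the slides can additionally return gradients.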
Soup can design space

[Figure: the (d, h) design space for the soup can.]


Soup can: cost function

    function [val, val_x] = cost_soup_can(x)
    % Evaluate the cost function for the soup-can example
    %
    %  x(1) - diameter of the can
    %  x(2) - height   of the can
    %
    %  area = 2*(pi*d^2)/4 + pi*d*h
    val = pi*x(1)*(x(2) + x(1)/2);
    % Evaluate the gradient
    if nargout > 1
        val_x = pi*[x(1) + x(2); x(1)];
    end
    end


Soup can: volume constraint

    function [c, ceq, c_x, ceq_x] = con_soup_can(x, volume)
    % Evaluate the constraint for the soup-can example
    %  x(1) - diameter of the can
    %  x(2) - height   of the can
    %
    c   = [];                            % no nonlinear inequalities
    ceq = volume - (pi/4)*x(2)*x(1)^2;   % volume = pi*d^2*h/4
    % compute the Jacobians
    if nargout > 2
        c_x   = [];
        ceq_x = -(pi/4)*x(1)*[2*x(2); x(1)];
    end
    end


Soup can: set-up script

    % Script to set up soup-can example
    %
    % We are to design a right-cylindrical (circular) can of a
    % given volume and with minimum surface area (material cost).
    % The height can be no more than twice the diameter
    %   volume = pi*d^2*h/4
    %   area   = 2*(pi*d^2)/4 + pi*d*h
    %   h <= 2*d  ==>  -2*d + h <= 0
    % In our optimization problem we have
    %   x = [d; h];
    % The specified volume is 333 cm^3
    % We have external function files
    %   cost_soup_can.m
    %   con_soup_can.m

    %% define handle to the constraint function with the specified volume value
    volume = 333;
    h_con  = @(x) con_soup_can(x, volume);

    % Arrays for the linear inequality
    A = [-2 1];
    b = 0;

    % lower/upper bounds
    lb = [4;  5];
    ub = [8; 15];

    % initial guess
    x0 = [6; 10];


optimtool: soup can example

[Screenshot of the optimtool GUI set up for the soup-can problem.]


Command Window: soup can example

    >> soup_can_2

    Iter  F-count    f(x)     Max constraint  Steplength  Directional deriv  First-order optimality
     0       3     245.044      50.26
     1       6     247.138      35.41             1            3.93              1.2
     2       9     265.113       1.713            1           17.3               2.68
     3      12     265.948       0.05285          1            6.92              0.798
     4      15     265.92        0.06939          1           -0.0716            0.0899
     5      18     265.956       0.0001174        1            6.49              0.00326
     6      21     265.957       4.871e-08        1            0.53              6.88e-05

    Local minimum possible. Constraints satisfied.

    fmincon stopped because the predicted change in the objective function
    is less than the selected value of the function tolerance and
    constraints were satisfied to within the selected value of the
    constraint tolerance.

    <stopping criteria details>

    No active inequalities.

    >>
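The reported optimum can be checked by hand: the linear constraint h <= 2d turns out to be inactive, so the problem reduces to the classical fixed-volume cylinder, whose minimum-area shape has h = d (a quick sanity check, not part of the original slides):

```matlab
% Eliminate h via the volume: h = 4*V/(pi*d^2), so
% A(d) = pi*d^2/2 + 4*V/d, and A'(d) = pi*d - 4*V/d^2 = 0 at d = (4*V/pi)^(1/3)
V = 333;
d = (4*V/pi)^(1/3);        % ~7.51 cm, inside the bounds [4, 8]
h = 4*V/(pi*d^2);          % equals d, so h <= 2*d is indeed inactive
A = pi*d^2/2 + pi*d*h      % ~265.96, matching fmincon's final f(x)
```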
Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo ⇐
- A Trajectory Example
- 2nd Trajectory Example: fsolve


fmincon: choice of algorithms

'trust-region-reflective' requires you to provide a gradient, and allows only bounds or linear equality constraints, but not both. Within these limitations, the algorithm handles both large sparse problems and small dense problems efficiently. It is a large-scale algorithm, and can use special techniques to save memory, such as a Hessian multiply function. For details, see Trust-Region-Reflective Algorithm.

'active-set' can take large steps, which adds speed. The algorithm is effective on some problems with nonsmooth constraints. It is not a large-scale algorithm.

'sqp' satisfies bounds at all iterations. The algorithm can recover from NaN or Inf results. It is not a large-scale algorithm.

'interior-point' handles large, sparse problems, as well as small dense problems. The algorithm satisfies bounds at all iterations, and can recover from NaN or Inf results. It is a large-scale algorithm, and can use special techniques for large-scale problems.


Large-Scale vs Medium-Scale

An optimization algorithm is large scale when it uses linear algebra that does not need to store, nor operate on, full matrices. This may be done internally by storing sparse matrices, and by using sparse linear algebra for computations whenever possible. Furthermore, the internal algorithms either preserve sparsity, such as a sparse Cholesky decomposition, or do not generate matrices, such as a conjugate gradient method. Large-scale algorithms are accessed by setting the LargeScale option to 'on', or setting the Algorithm option appropriately (this is solver-dependent). In contrast, medium-scale methods internally create full matrices and use dense linear algebra.
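The storage gap between the two styles is easy to see in MATLAB itself (an illustration, not specific to any solver):

```matlab
% A 10000x10000 tridiagonal matrix: dense storage needs n^2 doubles
% (~800 MB here), sparse storage only the ~3n nonzeros.
n = 1e4;
e = ones(n, 1);
S = spdiags([e -2*e e], -1:1, n, n);   % sparse tridiagonal matrix
whos S                                 % a few hundred kilobytes
% full(S) would materialize all n^2 entries -- avoid for large n
```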
If a problem is sufficiently large, full matrices take up a significant amount of memory, and the dense linear algebra may require a long time to execute. Medium-scale algorithms are accessed by setting the LargeScale option to 'off', or setting the Algorithm option appropriately (this is solver-dependent). Don't let the name "large-scale" mislead you; you can use a large-scale algorithm on a small problem. Furthermore, you do not need to specify any sparse matrices to use a large-scale algorithm. Choose a medium-scale algorithm to access extra functionality, such as additional constraint types, or possibly for better performance.


fmincon: command line inputs

    x = fmincon(fun, x0, A, b, Aeq, beq, lb, ub, nonlcon, options)

- fun: function handle for the cost function
- x0: initial guess for the solution
- A, b: matrix and rhs vector for the linear inequality constraints (A x <= b)
- Aeq, beq: matrix and rhs vector for the linear equality constraints
- lb, ub: lower/upper bounds for the solution vector
- nonlcon: function handle for the nonlinear inequality and equality constraints; [c, ceq] = nonlcon(x)
- options: structure of options for the algorithm


fmincon: additional outputs

    [x, fval, exitflag, output, lambda, grad, hessian] = fmincon(...)

- exitflag 1: first-order optimality measure was less than options.TolFun, and the maximum constraint violation was less than options.TolCon
- exitflag 0: number of iterations exceeded options.MaxIter, or number of function evaluations exceeded options.MaxFunEvals
- output: structure of data about the performance of the algorithm
- lambda: structure of the Lagrange multipliers
- grad: gradient of the Lagrangian
- hessian: Hessian of the Lagrangian


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo
- A Trajectory Example ⇐
- 2nd Trajectory Example: fsolve


Trajectory Example

We are to launch an object at speed v0; we seek an initial elevation angle for maximum range. In the classical case with no drag, the best elevation is pi/4.
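The classical no-drag result is easy to verify numerically: the range is R(gamma) = v0^2*sin(2*gamma)/g, which a one-dimensional search maximizes at gamma = pi/4 (a quick check using fminbnd, with the same speed and gravity values used later in the example):

```matlab
% No-drag range R(gamma) = v0^2*sin(2*gamma)/g, maximized at gamma = pi/4
v0 = 25;  g = 9.8;
R  = @(gam) v0^2*sin(2*gam)/g;
gam_star = fminbnd(@(gam) -R(gam), 0, pi/2);   % minimize the negative range
% gam_star ~ pi/4 (0.7854), with maximum range v0^2/g ~ 63.8 m
```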
Suppose we have a simple drag force, b*v^2. We then:
- formulate an initial-value problem for the projectile motion
- the initial position and speed are given; the initial elevation angle (gamma(0)) is unknown
- the final range (x(tf), to be maximized) occurs when the height returns to its initial value (the final time tf is unknown)

Since the cost function and the final-height constraint are nonlinear functions of the unknowns, we select fmincon.


Trajectory: Setup and solve an IVP

    function [range, altitude] = trajectory(gam_0, t_f, param)
    % Solve an IVP for the ballistic trajectory
    % Evaluate the final altitude and range
    %
    %  gam_0 is the initial flight-path angle (radians)
    %  t_f   is the final time (s)
    %
    %  range/altitude are the final values
    %
    %  param is a data structure
    %   param.b_coef is the drag coefficient
    %   param.grav   is the gravitational acceleration (m/s^2)
    %   param.vel_0  is the initial speed (m/s)

    % anonymous function handle with specified parameters
    h_rhs    = @(t, z) ballistic_rhs(t, z, param.b_coef, param.grav);
    z_0      = [0; 0; param.vel_0; gam_0];   % set the initial state
    [~, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
    range    = Z(end, 1);
    altitude = Z(end, 2);
    end

Note that to evaluate the cost we need the range, and to evaluate the constraint we need the altitude. Do we really have to solve the IVP twice to evaluate both?


Trajectory: the RHS of the ODE system

    function z_dot = ballistic_rhs(~, z, b_coef, grav)
    % Evaluate rhs of eq. of motion for a ballistic object
    %  z = [x, h, v, gamma]
    %   x     - range
    %   h     - altitude
    %   v     - speed
    %   gamma - flight-path angle
    sin_g = sin(z(4));
    cos_g = cos(z(4));
    v     = max(z(3), 0.1);   % guard against zero divisor
    z_dot = [ z(3)*cos_g;
              z(3)*sin_g;
             -b_coef*z(3)*z(3) - grav*sin_g;
             -grav*cos_g/v ];
    end


Matlab: ObjectiveandConstraints

    function [cost, nonlincon] = ObjectiveandConstraints(param)
    % Encapsulates cost and constraint functions for fmincon
    %  cost and nonlincon are function handles
    %  param is a structure that encodes parameters for the
    %  cost/constraint fcns

    % Initialize variables and make them available to the nested functions
    range    = [];
    altitude = [];
    LastZ    = [];   % initialize

    cost      = @objective;
    nonlincon = @constraints;

    % Nested functions
        function [val, val_Z] = objective(z)
            if ~isequal(z, LastZ)   % update for this value
                % Solve the IVP
                [range, altitude] = trajectory(z(1), z(2), param);
                LastZ = z;
            end
            % Evaluate cost
            val   = -range;   % minimize the negative range
            val_Z = [];       % gradient not computed in this version
        end
    %
        function [c, ceq, c_Z, ceq_Z] = constraints(z)
            if ~isequal(z, LastZ)   % update for this value
                % Solve the IVP
                [range, altitude] = trajectory(z(1), z(2), param);
                LastZ = z;
            end
            % Evaluate constraints
            c     = [];   % no inequality constraints
            ceq   = altitude;
            c_Z   = [];   % Jacobians not computed
            ceq_Z = [];
        end
    end


ObjectiveandConstraints: insights

Invoking ObjectiveandConstraints defines the function handles cost and nonlincon. Since the variables param, range, altitude, LastZ are defined at the high level, they are available to the nested functions objective and constraints. If z ~= LastZ we solve the IVP and return range and altitude. If z == LastZ we use the stored values of range and altitude. This approach is useful in cases wherein evaluating the cost/constraint functions requires an expensive calculation, such as the solution of an ODE/IVP or a PDE/BVP. Future documentation of the Optimization Toolbox will include this description.


fmincon: trajectory example

    % Script to set parameters for and then run
    % the max-range trajectory problem
    %
    % param is a structure of data for the problem
    %  param.b_coef is the drag coefficient
    %  param.grav   is the gravitational acceleration (m/s^2)
    %  param.vel_0  is the initial speed (m/s)

    param.b_coef = 0.1;
    param.grav   = 9.8;
    param.vel_0  = 25.0;

    % define handles for functions evaluating the cost/constraints
    [cost, nonlcon] = ObjectiveandConstraints(param);

    % lower/upper bounds
    lb = [0;    0.5*param.vel_0/param.grav];
    ub = [pi/4; 5*lb(2)];

    % initial guess
    x0 = 0.5*(lb + ub);

    %% set parameters and invoke fmincon
    OPT = optimset('fmincon');
    OPT = optimset(OPT, 'Algorithm', 'active-set', 'Display', 'iter', ...
                   'UseParallel', 'always');
    % x_star = fmincon(fun, x0, A, b, Aeq, beq, lb, ub, nonlcon, options)
    x_star = fmincon(cost, x0, [], [], [], [], lb, ub, nonlcon, OPT);


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve ⇐


2nd Trajectory Example: fsolve

With the same dynamics as earlier, we now seek an initial elevation angle (gamma_0) and a final time (tf) so that the trajectory ends at a specified point (xf, hf) in the vertical plane.

- since the IVP solution depends on time, as well as on the initial elevation angle, we write the range and height functions as x(t; gamma_0) and h(t; gamma_0), respectively
- we want to find values of tf and gamma_0 that lead to zero for the vector-valued function:

      f1(gamma_0, tf) = x(tf; gamma_0) - xf
      f2(gamma_0, tf) = h(tf; gamma_0) - hf

- we use the function fsolve from the Optimization Toolbox


Modified trajectory code

This version can return the time/state history [T, Z].

    function [residual, T, Z] = trajectory(gam_0, t_f, param)
    % Solve an IVP for the ballistic trajectory
    % Evaluate the residual in the final altitude and range
    %
    %  gam_0 is the initial flight-path angle (radians)
    %  t_f   is the final time (s)
    %
    %  param is a data structure
    %   param.b_coef is the drag coefficient
    %   param.grav   is the gravitational acceleration (m/s^2)
    %   param.vel_0  is the initial speed (m/s)
    %   param.x_f    is the specified target range (m)
    %   param.h_f    is the specified target altitude (m)

    % anonymous function handle with specified parameters
    h_rhs = @(t, z) ballistic_rhs(t, z, param.b_coef, param.grav);
    z_0   = [0; 0; param.vel_0; gam_0];   % set the initial state
    if nargout == 1
        [~, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
        residual = Z(end, 1:2)' - [param.x_f; param.h_f];
    else
        [T, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
        residual = Z(end, 1:2)' - [param.x_f; param.h_f];
    end
    end
    % 'local' ballistic_rhs function goes here


fsolve: 2nd trajectory example

    % Script to set parameters for and then run
    % a trajectory target problem
    %
    % param is a structure of data for the problem
    %  param.b_coef is the drag coefficient
    %  param.grav   is the gravitational acceleration (m/s^2)
    %  param.vel_0  is the initial speed (m/s)
    %  param.x_f    is the specified target range (m)
    %  param.h_f    is the specified target altitude (m)

    param.b_coef = 0.1;
    param.grav   = 9.8;
    param.vel_0  = 25.0;
    param.x_f    = 8.0;
    param.h_f    = 2.0;

    % define handle for the function evaluating the residual
    f_hndl = @(x) trajectory(x(1), x(2), param);

    % initial guess
    x0 = [pi/4; 0.5*param.vel_0/param.grav];

    %% set parameters and invoke fsolve
    OPT = optimset('fsolve');
    OPT = optimset(OPT, 'Display', 'iter', ...
                   'UseParallel', 'always');

    % [x_star, fval, exitflag] = fsolve(FUN, X0, OPTIONS)
    [x_star, ~, flag] = fsolve(f_hndl, x0, OPT);


fsolve: 2nd trajectory example (continued)

    % [x_star, fval, exitflag] = fsolve(FUN, X0, OPTIONS)
    [x_star, ~, flag] = fsolve(f_hndl, x0, OPT);
    if flag == 1
        [~, T, Z] = trajectory(x_star(1), x_star(2), param);
        figure
        plot(Z(:,1), Z(:,2), '--k', 'LineWidth', 2); hold on; grid on
        plot(param.x_f, param.h_f, 'ro')
        xlabel('range (m)'); ylabel('height (m)')
    else
        fprintf(1, '\n flag = %02i \n', flag);
    end


2nd trajectory example: fsolve

Note that, as in the zero-drag case, the problem has two solutions:
- Low trajectory
- High trajectory

[Figure: the low and high trajectories through the target point.]


THE END

Please complete the evaluation form: http://www.fdi.vt.edu/training/evals/
Thanks


Backup - underlying ideas - problem w/o constraints

Problem P0: Find x* in R^n to minimize a smooth function f : R^n -> R. We assume that f is twice continuously differentiable in the neighborhood of a solution.

If x* is a minimizer for P0, then x* is a stationary point for f, so that grad f(x*) = 0 in R^n; furthermore, the Hessian of f is positive semi-definite: grad^2 f(x*) >= 0.

Applying Newton's method to grad f = 0 we get the update

    p_k = -[grad^2 f(x_k)]^(-1) grad f(x_k),    x_{k+1} = x_k + p_k

Algorithms for P0 generate estimates for grad^2 f based on computed changes in grad f. The update is commonly generalized to x_{k+1} = x_k + alpha*p_k, where alpha > 0 is a step-size. Trust-region methods minimize a quadratic approximation to f near x_k subject to a bound on the step size (the trust-region radius).
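The Newton update above can be sketched in a few lines (an illustrative example on a Rosenbrock-type function with hand-coded gradient and Hessian; a practical code would add the step-size or trust-region safeguards just described):

```matlab
% Pure Newton iteration x_{k+1} = x_k - (grad^2 f)^{-1} grad f  on
% f(x) = (x1 - 1)^2 + 10*(x2 - x1^2)^2
grad = @(x) [2*(x(1)-1) - 40*x(1)*(x(2)-x(1)^2);
             20*(x(2)-x(1)^2)];
hess = @(x) [2 - 40*x(2) + 120*x(1)^2, -40*x(1);
             -40*x(1),                  20];
x = [-1; 1];
for k = 1:20
    p = -hess(x)\grad(x);              % Newton step
    x = x + p;
    if norm(grad(x)) < 1e-10, break, end
end
% converges to the minimizer x = [1; 1]
```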
Backup - underlying ideas - Newton update

Write grad F(x + p) ~ grad F(x) + grad^2 F(x) p.

Newton step: compute p so that grad F(x_k + p) = 0:

    grad^2 F(x_k) p = -grad F(x_k)

Quasi-Newton update for (grad^2 F)_{k+1}: choose the new estimate so that

    (grad^2 F)_{k+1} p~ = grad F(x_k + p~) - grad F(x_k)

Q-N rules are commonly based on rank-two, least-change ideas:

    (grad^2 F)_{k+1} = (grad^2 F)_k + Delta,    where Delta has rank two

In the period 1960 - 1980 a great deal of work was done: Davidon, Fletcher, Powell, Broyden, Goldfarb, Shanno.


Backup - underlying ideas - problem w/ equality constraints

Problem Pc: Find x* in R^n to minimize a smooth function f : R^n -> R, subject to g(x) = 0 in R^m, where g : R^n -> R^m. We assume that f and g are twice continuously differentiable in the neighborhood of a solution.

If x* is a minimizer for Pc and the Jacobian J = grad g has full rank at x*, then there exists a vector lambda-hat in R^m such that x* is a stationary point for the Lagrange function

    L(x) = f(x) + <lambda-hat, g(x)>

Furthermore, x* is a local minimizer for L in the null space of J(x*). The latter condition implies that the projected Hessian of L is positive semi-definite:

    Z' grad^2 L(x*) Z >= 0

where the columns of Z span the null space of J(x*).


Backup - underlying ideas - problem w/ inequality constraints

Problem Pi: the constraints l = k+1, ..., m are inequalities, g_l <= 0. Karush-Kuhn-Tucker theory implies that lambda_l >= 0 (NB: in some formulations lambda_l <= 0). Many algorithms are based on an active-set strategy: some set A, a subset of {k+1, ..., m}, of inequalities is treated as equalities in a version of problem Pc. At each (major) iteration the set A is adjusted:

 1. if g_l > 0 for some l not in A, then add l to the active set
 2. if lambda_l < 0 for some l in A, then remove l from the active set
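One concrete instance of the rank-two, least-change idea is the BFGS update (Broyden, Fletcher, Goldfarb, Shanno). A minimal sketch of a single update, with s the step and y the change in gradient (the numbers are illustrative):

```matlab
% BFGS: B_new = B - (B*s)(s'*B)/(s'*B*s) + (y*y')/(y'*s)
% The correction is rank two and enforces the secant condition B_new*s = y.
B = eye(2);            % current Hessian estimate
s = [0.1; -0.2];       % step  x_{k+1} - x_k
y = [0.3;  0.1];       % gradient change; y'*s > 0 keeps B positive definite
B = B - (B*s)*(s'*B)/(s'*B*s) + (y*y')/(y'*s);
norm(B*s - y)          % ~0: the secant condition holds up to round-off
```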