
Steady-State Optimization
Lecture 3: Unconstrained Optimization Problems,
Numerical Methods and Applications
Dr. Abebe Geletu
Ilmenau University of Technology
Department of Simulation and Optimal Processes (SOP)
3.1. Unconstrained Optimization Problems
• Consider the unconstrained optimization problem

   (UNLP)   min_{x∈Rn} f(x).

Aim: to find a point x∗ from the whole of Rn so that f has a minimum
value at x∗.
• The function f in the problem (UNLP) is known as the
objective function or performance criterion.
• A vector x∗ is a minimum point of the function f (equivalently, x∗ is
a solution of the optimization problem (UNLP)) if
   f(x) ≥ f(x∗) for all x ∈ Rn.
3.1. Unconstrained Optimization Problems...
• Similarly, we can also have an unconstrained maximization problem

   max_{x∈Rn} f(x).

• But a maximization problem can also, equivalently, be written as a
minimization problem:

   max_{x∈Rn} f(x) = − min_{x∈Rn} (−f(x)).
• Therefore, an optimization problem is either a maximization or a
minimization problem.
• The discussions in this lecture are limited to minimization problems.
3.1. Unconstrained Optimization Problems...
Example: The optimization problem

   (UNLP)   min_{x=(x1,x2)} { ½(x1 − 2)² + x2² − 5 }

has x∗ = (2, 0)ᵀ as a minimum point. In particular, the minimum value is
−5 = f(x∗) ≤ f(x) for all x ∈ R².
3.1. Unconstrained Optimization Problems...
• In the example above, in whatever direction d we move away from the minimum point x∗ = (2, 0)ᵀ,
the value f(x∗) = −5 cannot be reduced. That is,

   f(x∗ + d) ≥ f(x∗) for any direction vector d ∈ Rn.

Question: How do we know whether a given point x∗ is a minimum point of a function f or not?
First-order Taylor approximation at the point x∗:

   f(x∗ + d) ≈ f(x∗) + ∇f(x∗)ᵀd   ⇒   f(x∗ + d) − f(x∗) ≈ ∇f(x∗)ᵀd.

In general, if x∗ is a minimum point of (UNLP), then

   f(x∗ + d) ≥ f(x∗)   ⇒   ∇f(x∗)ᵀd ≥ 0 for any vector d ∈ Rn.

In particular, if we take d = −∇f(x∗), then it follows that

   −∇f(x∗)ᵀ∇f(x∗) ≥ 0   ⇒   ‖∇f(x∗)‖² ≤ 0   ⇒   ∇f(x∗) = 0.
3.1. Unconstrained Optimization Problems...
First-order optimality condition for unconstrained optimization
problems
If x∗ is a minimum point of (UNLP), then
   ∇f(x∗) = 0.
Remark: For an unconstrained optimization problem, if ∇f(x) ≠ 0,
then x is not a minimum point of f(x).
Therefore, we look for the minimum points of a function f (x)
(i.e. solution of UNLP) among those points that satisfy the
equation
∇f (x) = 0.
• Points that satisfy the equation ∇f (x) = 0 are commonly known as
stationary points and they are candidates for optimality.
3.1. Unconstrained Optimization Problems...
Example 1: (see the example above) Let f(x) = ½(x1 − 2)² + x2² − 5. Then

   ∇f(x) = (x1 − 2, 2x2)ᵀ = (0, 0)ᵀ   ⇒   x1 = 2, x2 = 0.
function my2DPlot2
x = -10:0.1:10; y = -10:0.1:10;
[X,Y] = meshgrid(x,y);
Z = 0.5*(X-2).^2 + Y.^2 - 5;
meshc(X,Y,Z)
xlabel('x - axis');
ylabel('y - axis');
hold on
%plot3(2,0,0,'sk','markerfacecolor',[0,0,0]);
title('Plot for the function f(x_{1},x_{2})=0.5(x_{1}-2)^{2}+ x_{2}^{2} - 5')
3.1. Unconstrained Optimization Problems...
Example 2: Find the solution(s) of the optimization problem

   min_{(x1,x2)} { f(x1, x2) = x1³ − 3x1 − x2³ + 3x2 }

Solution: First compute ∇f(x) to obtain

   ∇f(x) = (∂f/∂x1, ∂f/∂x2)ᵀ = (3x1² − 3, −3x2² + 3)ᵀ.

Next find the stationary points:

   ∇f(x) = (3x1² − 3, −3x2² + 3)ᵀ = (0, 0)ᵀ   ⇒   3x1² − 3 = 0 and −3x2² + 3 = 0.

We obtain x1 = ±1 and x2 = ±1.
• Hence, the points (1, 1), (1, −1), (−1, 1), (−1, −1) solve the equation ∇f (x) = 0
and they are candidates for optimality.
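As a numerical cross-check, the stationary points can also be located by applying a nonlinear-equation solver to ∇f(x) = 0. Below is a minimal sketch using fsolve from the MATLAB Optimization Toolbox; the starting points are chosen arbitrarily and the function names are illustrative.

function statPointsExample2
% Numerically solve grad f(x) = 0 for f(x) = x1^3 - 3*x1 - x2^3 + 3*x2.
% Different starting points lead to different stationary points.
starts = [2 2; 2 -2; -2 2; -2 -2];
for i = 1:size(starts,1)
    xs = fsolve(@gradf, starts(i,:)');
    fprintf('start (%g,%g) -> stationary point (%g,%g)\n', ...
            starts(i,1), starts(i,2), xs(1), xs(2));
end
end

function g = gradf(x)
% Gradient of f
g = [3*x(1)^2 - 3; -3*x(2)^2 + 3];
end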
3.1. Unconstrained Optimization Problems...
Note that: The point (1, −1) is the only minimum point.
• Even if ∇f (x) = 0 at the rest of the points (1, 1), (−1, 1) and (−1, −1), they are not
minimum points.
• In fact (−1, 1) is a maximum point.
The fact that a point x ∈ Rn satisfies the equation ∇f(x) = 0 is not enough (not sufficient)
to conclude that x is a minimum point.
Sufficient optimality condition for unconstrained
Optimization
• ∇f(x) = 0 is not sufficient for x to be a minimum point. So, we need an additional criterion!
• Suppose that ∇f(x) = 0. Consider the 2nd-order Taylor approximation of f at the point x:

   f(x + d) ≈ f(x) + dᵀ∇f(x) + ½ dᵀH(x)d,   with dᵀ∇f(x) = 0,

for any direction vector d. It then follows that

   f(x + d) ≈ f(x) + ½ dᵀH(x)d.

• If dᵀH(x)d > 0 for every d ≠ 0, then we have f(x + d) > f(x).
⇒ There is no direction vector d which is a descent direction for f at the point x. Hence, x is a minimum point.

3.2. Sufficient Optimality Condition for (UNLP)
Suppose that ∇f(x) = 0. If in addition the Hessian matrix H(x) (of f at x) is positive definite, i.e.

   dᵀH(x)d > 0 for all d ∈ Rn, d ≠ 0,

then x is a minimum point.
Recall: • A square matrix is positive definite if all its eigenvalues are positive.
• For a diagonal matrix, the diagonal elements are its eigenvalues.
3.2. Sufficient optimality condition for unconstrained Optimization...
Example 3: Consider again the optimization problem

   min_{(x1,x2)} f(x1, x2) = x1³ − 3x1 − x2³ + 3x2

We know that each of the points (1, 1), (1, −1), (−1, 1), (−1, −1) satisfies the equation

   ∇f(x) = (3x1² − 3, −3x2² + 3)ᵀ = (0, 0)ᵀ.

The Hessian matrix of f is

   H(x1, x2) = [ 6x1   0 ;   0   −6x2 ].
Hence,

   H(1, 1)     indefinite           ⇒  (1, 1) is neither a maximum nor a minimum point (saddle point)
   H(1, −1)    positive definite    ⇒  (1, −1) is a minimum point
   H(−1, 1)    negative definite    ⇒  (−1, 1) is a maximum point
   H(−1, −1)   indefinite           ⇒  (−1, −1) is neither a maximum nor a minimum point (saddle point)
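The classification can be verified numerically from the eigenvalues of the Hessian; a minimal MATLAB sketch for this example:

% Classify the stationary points of f(x) = x1^3 - 3*x1 - x2^3 + 3*x2
% by inspecting the eigenvalues of the Hessian H(x) = [6*x1 0; 0 -6*x2].
points = [1 1; 1 -1; -1 1; -1 -1];
for i = 1:size(points,1)
    x  = points(i,:);
    H  = [6*x(1) 0; 0 -6*x(2)];
    ev = eig(H);
    if all(ev > 0)
        s = 'minimum';
    elseif all(ev < 0)
        s = 'maximum';
    else
        s = 'saddle point';
    end
    fprintf('(%g,%g): eigenvalues %g, %g -> %s\n', x(1), x(2), ev(1), ev(2), s);
end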
3.3. Unconstrained Optimization
Problems...Example
Application (A nonlinear spring system):
Figure: A nonlinear spring system
• The applied forces are F1 = 0 N, F2 = 2 N and the spring constants are k1 = k2 = 1 N/m.
3.3. Unconstrained Optimization
Problems...Example
Due to the applied forces, the junction of the two springs is shifted in the x and y directions.
What are the values of these shifts that minimize the potential energy of the system?
The potential energy is given by

   P(x1, x2) = ½ k1 (ΔL1)² + ½ k2 (ΔL2)² − F1 x1 − F2 x2,

where ΔL1 and ΔL2 are the changes in the lengths of the springs and x1 and x2 are the
shifts in the x and y directions, respectively. Hence,

   ΔL1 = √((x1 + 10)² + (x2 − 10)²) − 10√2,
   ΔL2 = √((x1 − 10)² + (x2 − 10)²) − 10√2.

Therefore, solve the optimization problem

   (UNLP)   min_x P(x) = ½ k1 (ΔL1)² + ½ k2 (ΔL2)² − F1 x1 − F2 x2.
3.3. Unconstrained Optimization
Problems...Example
First-order optimality condition: ∇P(x) = 0. The partial derivatives are

   ∂P/∂x1 = k1 ΔL1 (∂ΔL1/∂x1) + k2 ΔL2 (∂ΔL2/∂x1) − F1,      (1)
   ∂P/∂x2 = k1 ΔL1 (∂ΔL1/∂x2) + k2 ΔL2 (∂ΔL2/∂x2) − F2,      (2)

where

   ∂ΔL1/∂x1 = (x1 + 10)/√((x1 + 10)² + (x2 − 10)²),   ∂ΔL1/∂x2 = (x2 − 10)/√((x1 + 10)² + (x2 − 10)²),      (3)
   ∂ΔL2/∂x1 = (x1 − 10)/√((x1 − 10)² + (x2 − 10)²),   ∂ΔL2/∂x2 = (x2 − 10)/√((x1 − 10)² + (x2 − 10)²).      (4)

Substituting (3) and (4) into (1) and (2) and simplifying, we obtain

   ∂P/∂x1 = k1 (x1 + 10) + k2 (x1 − 10) − F1 = 0  ⇒  x1 = (F1 + 10(k2 − k1))/(k1 + k2),      (5)
   ∂P/∂x2 = k1 (x2 − 10) + k2 (x2 − 10) − F2 = 0  ⇒  x2 = (F2 + 10(k2 + k1))/(k1 + k2).      (6)
Replacing the given values of F1 and F2 we obtain (x1 , x2 ) = (0, 11.5).
3.3. Unconstrained Optimization
Problems...Example
The Hessian matrix of the function P(x) is

   H(x) = [ k1 + k2   0 ;   0   k1 + k2 ].

Since k1 and k2 are positive numbers, the matrix H(x) is always
positive definite.
⇒ (x1, x2) = (0, 11.5) is a minimum point.
• If the Hessian H(x) is a positive definite matrix for every x, then the
function f(x) is a convex function.
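For comparison, the potential energy can also be minimized numerically without forming the gradient by hand. The following is a minimal sketch using fminsearch, with the spring geometry implied by the expressions for ΔL1 and ΔL2 above; the numerical minimizer can be compared with the analytical result.

function springExample
% Minimize the potential energy of the two-spring system numerically.
k1 = 1; k2 = 1;        % spring constants [N/m]
F1 = 0; F2 = 2;        % applied forces   [N]
P = @(x) 0.5*k1*(sqrt((x(1)+10)^2 + (x(2)-10)^2) - 10*sqrt(2))^2 + ...
         0.5*k2*(sqrt((x(1)-10)^2 + (x(2)-10)^2) - 10*sqrt(2))^2 - ...
         F1*x(1) - F2*x(2);
x0   = [0; 0];                     % arbitrary starting point
xmin = fminsearch(P, x0);
fprintf('numerical minimizer: x1 = %g, x2 = %g\n', xmin(1), xmin(2));
end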
3.4. Convex Sets and Convex Functions
A. Convex Set
A set S ⊂ Rn is said to be a convex set, if for any x1 , x2 ∈ S and any λ ∈ [0, 1] we have
λx1 + (1 − λ)x2 ∈ S.
• For a set S to be convex, the line-segment joining any two points x1 , x2 in S should be
completely contained in S.
Figure: Convex and non-convex sets
• The set S = {x ∈ Rn | Ax ≤ a, Bx = b} is a convex set, where A, B are matrices and
a, b are vectors.
3.4. Convex Sets and Convex Functions
B. Convex Functions
A function f : Rn → R is said to be a convex function, if for any x1 , x2 ∈ Rn and any
λ ∈ [0, 1] we have
   f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2).
• A segment connecting any two points on the graph of f lies above the graph of f.
Examples: The following are convex functions: f1(x) = x², f2(x) = eˣ, f(x1, x2) = x1² + x2².
Question: Is there a simple method to verify whether a function is convex or not?
3.4. Convex Sets and Convex Functions
Verifying convexity of a function
A function f : Rn → R is a convex function if and only if
the Hessian matrix H(x) is positive semi-definite, for every x ∈ Rn .
Example: For a quadratic function f(x) = ½xᵀQx + qᵀx, the Hessian matrix is
H(x) = Q. Hence, f is a convex function if Q is positive semi-definite.
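For a quadratic function this convexity test therefore reduces to an eigenvalue check on Q; a minimal sketch (the matrix is illustrative):

% Check convexity of f(x) = 0.5*x'*Q*x + q'*x via the eigenvalues of Q.
Q = [2 0 0; 0 6 0; 0 0 10];   % e.g. the Hessian of x1^2 + 3*x2^2 + 5*x3^2
if all(eig(Q) >= 0)
    disp('Q is positive semi-definite -> f is convex');
else
    disp('Q is not positive semi-definite -> f is not convex');
end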
If f is a convex function and ∇f(x) = 0, then x is a minimum point of f.
⇒ Therefore, for a convex function, the condition ∇f (x) = 0 is sufficient for x to be a
minimum point of f .
Remark: Convex optimization problems have an extensive application in signal processing
and control engineering.
(See the book of S. Boyd and L. Vandenberghe: Convex Optimization. URL:
http://www.stanford.edu/~boyd/cvxbook/ )
3.5. Numerical methods for unconstrained
optimization problems
• In general, it is not easy to solve the system of equations

   ∇f(x) = ( ∂f/∂x1(x), ∂f/∂x2(x), ..., ∂f/∂xn(x) )ᵀ = 0

analytically. In many cases this system of equations is nonlinear and
the number of equations can also be very large.
▶ Therefore, any of the algorithms Newton, modified Newton,
quasi-Newton or inexact Newton can be used to solve the
nonlinear equation

   F(x) = ∇f(x) = 0.
3.5. Numerical methods ...
A general algorithm for unconstrained optimization
problems :
Start: Choose an initial iterate x0
Set k ← 0
Repeat:
• Determine a step-length : αk > 0
• Solve the system of linear equations: H(xk )dk = −∇f (xk )
• Set xk+1 = xk + αk dk
• k ←k +1
Until: (Termination criterion is satisfied)
• There are different types of algorithms, depending on how the
search direction dk and the step-length αk are determined.
• Such algorithms are commonly known as Gradient-based Algorithms.
3.5. Numerical methods ...
Requirements on the search direction dk :
• Determine dk in such a way that the value f(xk + αk dk) is not greater
than the value f(xk).
⇒ If xk is not already a solution of (UNLP), then dk should be a descent
direction.
• It follows that

   f(xk + αk dk) ≤ f(xk).

• Using the approximation f(xk + αk dk) ≈ f(xk) + αk dkᵀ∇f(xk) we obtain

   (a)  −αk dkᵀ∇f(xk) ≈ f(xk) − f(xk + αk dk) ≥ 0, and hence
   (b)  dkᵀ∇f(xk) ≤ 0.
• Hence, the expression dk> ∇f (xk ) is a measure of decrease for the
function f at the point xk in the direction of the vector dk .
3.5. Numerical methods ...
• A vector d is called a descent direction for the function f at the
point x if dᵀ∇f(x) ≤ 0.
• Note that

   −dkᵀ∇f(xk) = −‖dk‖ ‖∇f(xk)‖ cos θ,

where θ is the angle between the vectors dk and ∇f(xk).
• The reduction f(xk) − f(xk + αk dk) is largest when cos θ = −1,
i.e. θ = 180°.
⇒ The reduction f(xk) − f(xk + αk dk) is largest when the vectors
dk and ∇f(xk) point in opposite directions.
⇒ That is, when dk = −∇f(xk).
Steepest Descent
The direction −∇f (xk ) is known as the steepest descent direction.
3.5. Numerical methods ...
(A1) Method of the steepest descent direction
Start: • Choose an initial iterate x0 and a termination tolerance ε. • Set k ← 0.
Repeat:
• Compute: ∇f(xk)
• Set xk+1 = xk − αk ∇f(xk)
• k ← k + 1
Until: (‖∇f(xk)‖ ≤ ε)
Advantages:
• There is no need to solve a linear system of equations to determine a search direction.
• Easy to implement.
Disadvantages:
• The convergence speed is only linear (‖xk+1 − x∗‖ ≤ C‖xk − x∗‖, C ∈ (0, 1) constant → slow convergence).
• The steepest descent direction −∇f(xk) is not, in general, a good search direction.
3.5. Numerical methods ...
Matlab code
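The code listing shown on the original slide is not reproduced here; the following is a minimal sketch of algorithm (A1) with a constant step length (the test function and all names are illustrative):

function xk = steepestDescentConstStep(x0, alpha, tol, maxIter)
% Steepest descent with a fixed step length alpha.
xk = x0(:);
for k = 1:maxIter
    g = gradFun(xk);               % gradient at the current iterate
    if norm(g) <= tol, break; end  % termination criterion
    xk = xk - alpha*g;             % step in the steepest descent direction
end
end

function g = gradFun(x)
% Gradient of f(x) = 0.5*(x1-2)^2 + x2^2 - 5 (the example from above)
g = [x(1)-2; 2*x(2)];
end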
3.5. Steepest Descent with Exact Line Search
• Once the search direction dk is known, the step-length αk
can be chosen in such a way that we obtain a maximum reduction of f in
the direction of dk.
• That is, determine αk by solving

   min_α f(xk + αdk).

Hence, for dk = −∇f(xk), we have f(xk + αk dk) ≤ f(xk).
• The method of finding a step-length by solving the optimization
problem min_α f(xk + αdk) is known as exact line search.
Matlab code
3.5. Steepest Descent Algorithm with Exact Line Search

Algorithm 1: Steepest Descent with Exact Line Search
1: Choose an initial iterate x0;
2: Set k ← 0;
3: while (‖∇f(xk)‖ > tol) do
4:   Compute ∇f(xk);
5:   Determine αk by solving the one-dimensional optimization problem
        min_α f(xk − α∇f(xk))
6:   Compute the next iterate: xk+1 = xk − αk ∇f(xk)
7:   Set k ← k + 1;
8: end while
3.5. Steepest Descent Algorithm with Exact Line Search...
function xsol=SteepstExactLS(x0,tolf,maxIter)
% A MATLAB implementation of steepest descent with exact line search.
% User should supply: x0      - initial iterate
%                     tolf    - tolerance for norm(grad f(x_k)) < tolf
%                     maxIter - the maximum number of iterations
fprintf('==================================================================== \n')
fprintf('iteration    alphak    norm(grad) \n')
fprintf('==================================================================== \n')
%Initialization
x0=x0(:);
alpha0=1;
k=0;
datasave=[];
xk=x0;
alphak=alpha0;
grad=gradFun(x0);
while (norm(grad)>=tolf) && (k<=maxIter)
    dk=-gradFun(xk);            % steepest descent direction: negative gradient
    dk=dk(:);
    alphak=lineSearch(xk,dk);   % exact line search along dk
    xk=xk+alphak*dk;
    grad=gradFun(xk);
    k=k+1;
    datasave=[datasave; k alphak norm(grad)];
end %end while
3.5. Steepest Descent Algorithm with Exact Line Search...
xsol=xk;
disp(datasave)
end

function F=myfun(x)
F=(x(1)-2)^2 + x(2)^2 - 5;
end

function grad=gradFun(x)
grad(1) = 2*(x(1)-2);
grad(2) = 2*x(2);
end

function alpha=lineSearch(xk,dk)
x=xk;
d=dk;   % parameters of the one-dimensional function
alpha=fminbnd(@(y) fun4LS(y,x,d),0,1);
end

function f=fun4LS(y,x,d)
f = (x(1)+y*d(1)-2)^2 + (x(2)+y*d(2))^2 - 5;
end
3.5. Steepest Descent Algorithm for Quadratic Functions
• Let f(x) = ½xᵀQx + qᵀx.
Assumption: Q is symmetric and positive definite.
Then
• steepest descent direction: dk = −∇f(xk) = −Qxk − q;
• exact line search:

   min_α f(xk + αdk) = min_α { ½(xk + αdk)ᵀQ(xk + αdk) + qᵀ(xk + αdk) }.

Then, using dk = −Qxk − q, we obtain

   αk = ‖Qxk + q‖² / ( (Qxk + q)ᵀ Q (Qxk + q) ).
3.5. Steepest Descent for QP ... Algorithm

Algorithm 2: Steepest Descent with Exact Line Search
1: Choose an initial iterate x0;
2: Set k ← 0;
3: while (‖∇f(xk)‖ > tol) do
4:   Compute dk = −Qxk − q;
5:   Determine αk = ‖Qxk + q‖² / ( (Qxk + q)ᵀ Q (Qxk + q) );
6:   Set xk+1 = xk + αk dk;
7:   Set k ← k + 1;
8: end while
Exercise: Implement this algorithm in MATLAB.
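A minimal sketch of one possible solution to this exercise (function and variable names are illustrative):

function xk = sdQuadratic(Q, q, x0, tol, maxIter)
% Steepest descent with exact line search for f(x) = 0.5*x'*Q*x + q'*x,
% where Q is symmetric positive definite.
xk = x0(:);
g  = Q*xk + q;                     % gradient at x0
k  = 0;
while norm(g) > tol && k < maxIter
    alphak = (g'*g)/(g'*Q*g);      % exact step length along -g
    xk = xk - alphak*g;            % steepest descent step
    g  = Q*xk + q;                 % new gradient
    k  = k + 1;
end
end

For example, xk = sdQuadratic([2 0; 0 10], [0; 0], [1; 1], 1e-8, 1000) minimizes f(x) = x1² + 5x2².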
3.5. Steepest Descent Algorithm ... Properties
• The convergence properties of the steepest descent algorithm depend strongly on the properties of the Hessian matrix
H(x) = ∇²f(x) (for a quadratic function ½xᵀQx + qᵀx we have H(x) = Q).
⇒ If H(x) is positive definite, then we have good convergence properties.
• If a step-length αk is exact (optimal), then necessarily

   d/dα f(xk + αdk) |α=αk = 0   ⇒   ∇f(xk + αk dk)ᵀ · d/dα (xk + αdk) = 0.

This implies

   ∇f(xk + αk dk)ᵀ dk = 0.

Since the next steepest descent direction is dk+1 = −∇f(xk + αk dk), it follows that

   −dk+1ᵀ dk = 0   ⇒   dk+1ᵀ dk = 0.

• When using exact line-search, each new steepest descent direction is orthogonal to the previous one.
⇒ This causes the so-called zig-zag problem.
⇒ So, the steepest descent algorithm with exact line-search may take too long to converge.
3.6. The Conjugate Gradient Algorithm
Question: How can we avoid the zig-zag problem in the steepest (gradient) descent algorithm?
One solution: Use the Hessian matrix H(x) of f(x) in the determination of dk (and αk).
• Choose the search direction dk in such a way that

   (∗∗)   [ d/dα ∇f(xk + αdk−1) ]ᵀ dk = 0.

• By a first-order Taylor approximation,

   ∇f(xk + αdk−1) ≈ ∇f(xk) + α ∇²f(xk) dk−1 = ∇f(xk) + α H(xk) dk−1,

hence d/dα [∇f(xk + αdk−1)] ≈ H(xk) dk−1 = Hk dk−1.
• Consequently, according to (∗∗),

   (Hk dk−1)ᵀ dk = 0   ⇒   dkᵀ Hk dk−1 = 0   (by the symmetry of Hk).

We say that dk is conjugate to dk−1.
3.6. Conjugate Gradient ... Quadratic Problems
• The CG algorithm is commonly used to solve unconstrained quadratic programming problems

   (QP)   min_x { f(x) = ½xᵀQx + qᵀx },

where Q is a symmetric positive definite matrix.
• Starting from an initial search direction d0, the CG method for (QP)
generates search directions d1, d2, ..., dn−1 so that

   dkᵀ Q dk−1 = 0,   k = 1, ..., n − 1.

• Given xk and dk, we use exact line-search to determine αk by solving

   min_α f(xk + αdk).

⇒ We solve the equation d f(xk + αdk)/dα = 0 to obtain

   αk = − dkᵀ gk / (dkᵀ Q dk),

where gk = ∇f(xk) = Qxk + q.
3.6. Conjugate Gradient ... Quadratic Problems
Question: Given dk, how do we determine dk+1?
• Observe that d f(xk + αdk)/dα = 0 also implies

   gk+1ᵀ dk = 0,

where gk+1 = ∇f(xk + αk dk) = Q(xk + αk dk) + q.
• In addition,

   dk+1ᵀ Q dk = 0

(since dk and dk+1 are conjugate vectors).
• Hence, determine dk+1 from dk+1 = −gk+1 + βk dk.
• The expression βk dk is a correction term for the steepest descent direction −gk+1.
• There are three well-known methods for the determination of the parameter βk.
3.6. Conjugate Gradient ... Quadratic Problems

   Fletcher-Reeves:     βk = gk+1ᵀ gk+1 / (gkᵀ gk)

   Polak-Ribiere:       βk = gk+1ᵀ (gk+1 − gk) / (gkᵀ gk)

   Hestenes-Stiefel:    βk = gk+1ᵀ (gk+1 − gk) / (dkᵀ (gk+1 − gk))
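For reference, the three update rules collected in a small MATLAB helper (a sketch; gNew stands for gk+1, gOld for gk and d for dk):

function beta = cgBeta(gNew, gOld, d, rule)
% Compute the CG parameter beta_k by one of the three classical rules.
switch rule
    case 'FR'   % Fletcher-Reeves
        beta = (gNew'*gNew)/(gOld'*gOld);
    case 'PR'   % Polak-Ribiere
        beta = (gNew'*(gNew - gOld))/(gOld'*gOld);
    case 'HS'   % Hestenes-Stiefel
        beta = (gNew'*(gNew - gOld))/(d'*(gNew - gOld));
end
end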
3.6. CG ... Quadratic Problems ... Algorithm

Algorithm 3: The Conjugate Gradient Algorithm with Exact Line Search
1: Choose an initial iterate x0;
2: Set g0 = ∇f(x0) and d0 = −g0;
3: Set k ← 0;
4: while (‖∇f(xk)‖ > tol) do
5:   Determine the step length αk = − dkᵀ gk / (dkᵀ Q dk);
6:   Set xk+1 = xk + αk dk and compute gk+1 = Qxk+1 + q;
7:   Determine βk by one of the methods Fletcher-Reeves, Polak-Ribiere or Hestenes-Stiefel;
8:   Set dk+1 = −gk+1 + βk dk;
9:   Set k ← k + 1;
10: end while
3.6. Conjugate Gradient ... Quadratic Problems ... Matlab
function xsol=myCG_PR(x0,tolf,maxIter)
% A MATLAB implementation of the conjugate gradient method with exact line
% search and the Polak-Ribiere update, for a quadratic objective function.
% User should supply: x0      - initial iterate
%                     tolf    - tolerance for norm(grad f(x_k)) < tolf
%                     maxIter - the maximum number of iterations
fprintf('=============================================================================== \n')
fprintf('iteration    alphak    norm(grad) \n')
fprintf('=============================================================================== \n')
x0=x0(:);
k=0;
datasave=[];
xk=x0;
grad=gradFun(x0);      % gradient g_0 at the initial iterate
grad=grad(:);
dk=-grad;              % initial search direction d_0 = -g_0
while (norm(grad)>=tolf) && (k<=maxIter)
    Q=hess(xk);
    dk=dk(:);
    alphak=-(dk'*grad)/(dk'*Q*dk);   % exact line search: alpha_k = -d_k'*g_k/(d_k'*Q*d_k)
    xk=xk+alphak*dk;
    gradOld=grad;
    grad=gradFun(xk);
    grad=grad(:);
    betak=(grad'*(grad-gradOld))/(gradOld'*gradOld);  % Polak-Ribiere update
    dk=-grad+betak*dk;
    k=k+1;
    datasave=[datasave; k alphak norm(grad)];
end %end while
3.6. Conjugate Gradient ... Quadratic Problems ... Matlab...
xsol=xk;
disp(datasave)
end

function F=myfun(x)
F=(x(1)-2)^2 + x(2)^2 - 5;
end

function grad=gradFun(x)
grad(1) = 2*(x(1)-2);
grad(2) = 2*x(2);
end

function Q=hess(x)
Q = [2 0; 0 2];   % constant Hessian of the quadratic test function
end
3.6. Conjugate Gradient ... Quadratic Problems
Example: (parameter estimation)
For a chemical process, the pressure measured at different temperatures is given in the
following table. Formulate an optimization problem to determine the best values of the
parameters in the exponential model p = α e^{βT} of the data. Choose an arbitrary starting
point and find the optimum values of the parameters using the steepest descent and CG algorithms.
Temperature (T, °C):         20      25      30      35      40      50      60      70
Pressure (mm of mercury):    14.45   19.23   26.54   34.52   48.32   68.11   98.34   120.45
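One possible formulation is the nonlinear least-squares problem min over (α, β) of Σi (α e^{βTi} − pi)²; the gradient-based algorithms above can then be applied to this objective. The following minimal sketch sets up the objective and, as a quick check, minimizes it with fminsearch (the starting point is arbitrary):

function par = fitPressureModel
% Least-squares fit of p = alpha*exp(beta*T) to the measured data.
T = [20 25 30 35 40 50 60 70];
p = [14.45 19.23 26.54 34.52 48.32 68.11 98.34 120.45];
ssq = @(par) sum((par(1)*exp(par(2)*T) - p).^2);   % sum of squared residuals
par = fminsearch(ssq, [10; 0.05]);                 % arbitrary starting point
fprintf('alpha = %g, beta = %g\n', par(1), par(2));
end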
3.6. Conjugate Gradient ... Properties
• For QPs with a symmetric positive definite n × n matrix Q, CG requires at most n steps to converge.
• For a general nonlinear unconstrained optimization problem min_x f(x), convergence depends
  ▶ on the properties of the Hessian matrix
    ⇒ good convergence if the Hessian matrix H(x) is positive definite (i.e. f(x) is a strictly convex function);
  ▶ since in CG the Hessian matrix H(x) changes from iteration to iteration, H(x) may become ill-conditioned
    ⇒ preconditioning techniques are required
    ⇒ as a result we obtain preconditioned Conjugate Gradient (PCG) methods.
3.5. Numerical methods ...
(A2) The Newton Algorithm:
Start: • Choose an initial iterate x0 and a termination tolerance ε. • Set k ← 0.
Repeat:
• Compute: ∇f(xk)
• Determine a search direction: dk = −[H(xk)]⁻¹ ∇f(xk).
• Set xk+1 = xk + αk dk
• k ← k + 1
Until: (‖∇f(xk)‖ ≤ ε)
Advantages:
• The convergence is quadratic for αk = 1: ‖xk+1 − x∗‖ ≤ C‖xk − x∗‖², C > 0 constant → fast convergence.
• Easy to implement if [H(xk)]⁻¹ is easy to obtain.
Disadvantages:
• In every iteration the system of equations H(xk)dk = −∇f(xk) has to be solved to determine the search direction dk.
• If the matrix H(xk) is not positive definite or is ill-conditioned, the solution of H(xk)dk = −∇f(xk) may not be reliable.
• If the initial iterate x0 is not chosen properly, convergence is not guaranteed (only local convergence).
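For reference, a minimal sketch of the Newton iteration (A2) in MATLAB, using the example f(x) = ½(x1 − 2)² + x2² − 5 from above for the gradient and Hessian routines (all names are illustrative):

function xk = newtonUnconstrained(x0, tol, maxIter)
% Basic Newton method for min f(x): solve H(xk)*dk = -grad f(xk) in each step.
xk = x0(:);
for k = 1:maxIter
    g = gradFun(xk);
    if norm(g) <= tol, break; end
    dk = -(hessFun(xk) \ g);   % Newton direction: solve the linear system, do not invert H
    xk = xk + dk;              % alpha_k = 1 (full Newton step)
end
end

function g = gradFun(x)
% Gradient of f(x) = 0.5*(x1-2)^2 + x2^2 - 5
g = [x(1)-2; 2*x(2)];
end

function H = hessFun(~)
% Constant Hessian of the same f
H = [1 0; 0 2];
end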
3.5. Numerical methods ...
• In the Newton algorithm the matrix H(xk) may become ill-conditioned from step to step.
• In general, the following difficulties can arise:
− the direct computation of H(xk) can be very hard;
− the inverse of H(xk) may not be easily available ⇒ use an approximate Hessian
  ⇒ Quasi-Newton methods;
− for unconstrained optimization with several variables, solving the equation
   H(xk)dk = −∇f(xk)
  may consume too much CPU time ⇒ use inexact Newton methods.
Some known methods guarantee that the matrix H(xk) remains well-conditioned at each iteration step:
(i) Levenberg-Marquardt method
Replace H(xk) by the approximation

   Ĥ(xk) = H(xk) + βI,   β ≥ 0.

(ii) Quasi-Newton method (BFGS = Broyden-Fletcher-Goldfarb-Shanno update)
Approximate the inverse [H(xk+1)]⁻¹ by

   Bk+1 = Bk + ( 1 + (γᵀBkγ)/(δᵀγ) ) (δδᵀ)/(δᵀγ) − (δγᵀBk + Bkγδᵀ)/(δᵀγ),

where δ = xk+1 − xk, γ = ∇f(xk+1) − ∇f(xk) and B0 = I. Then dk+1 = −Bk+1 ∇f(xk+1).
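The BFGS update can be written compactly as a small MATLAB helper; a minimal sketch of one update step, with δ and γ as defined above:

function Bnew = bfgsInverseUpdate(B, delta, gamma)
% BFGS update of the inverse Hessian approximation B (start with B = eye(n)).
% delta = x_{k+1} - x_k, gamma = grad f(x_{k+1}) - grad f(x_k).
dg   = delta'*gamma;                                 % scalar delta'*gamma
Bnew = B + (1 + (gamma'*B*gamma)/dg) * (delta*delta')/dg ...
         - (delta*(gamma'*B) + (B*gamma)*delta')/dg;
end

The next search direction is then dk+1 = −Bnew*∇f(xk+1), as stated above.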
The Matlab Optimization Toolbox- functions for
unconstrained optimization
• fminunc - Multidimensional unconstrained nonlinear minimization
• lsqnonlin - Nonlinear least squares with upper and lower bounds.
Using fminunc.m to solve unconstrained optimization problems
[xsol,fopt,exitflag,output,grad,hessian] = fminunc(fun,x0,options)
Input arguments:
   fun       a MATLAB function m-file that contains the function to be minimized
   x0        start vector for the algorithm, if known; else [ ]
   options   options are set using the optimset function; they determine, e.g., which algorithm to use
Output arguments:
   xsol      optimal solution
   fopt      optimal value of the objective function, i.e. f(xsol)
   exitflag  tells whether the algorithm converged or not; exitflag > 0 means convergence
   output    a struct with the number of iterations, the algorithm used and the number of PCG iterations (when LargeScale is on)
   grad      gradient vector at the optimal point xsol
   hessian   Hessian matrix at the optimal point xsol
To display the parameters that you can set for fminunc.m, use the command:
>> optimset('fminunc')
The Matlab Optimization Toolbox- functions for
unconstrained optimization
Use the Matlab function fminunc to solve:

   min_x { f(x) = x1² + 3x2² + 5x3² }
Problem definition
function [f,g]=fun1(x)
%Objective function for example (a)
%Defines an unconstrained optimization problem to be solved with fminunc
f=x(1)^2+3*x(2)^2+5*x(3)^2;
if nargout > 1
g(1)=2*x(1);
g(2)=6*x(2);
g(3)=10*x(3);
end
Main program
function [xopt,fopt,exitflag]=unConstEx1
options=optimset('fminunc');
options.LargeScale='off'; options.HessUpdate='bfgs';
%assuming the objective function is defined in the
%m-file fun1.m, we call fminunc
%with a starting point x0
x0=[1,1,1];
[xopt,fopt,exitflag]=fminunc(@fun1,x0,options);
The Matlab Optimization Toolbox- functions for
unconstrained optimization
If you decide to use the Large-Scale option in fminunc.m to solve the problem

   min_x { f(x) = x1² + 3x2² + 5x3² }
edit the main program as follows:
function [xopt,fopt,exitflag]=unConstEx1
options=optimset('fminunc');
options.LargeScale='on';
options.GradObj='on';
%assuming the objective function is defined as in fun1.m,
%we call fminunc with a starting point x0
x0=[1,1,1];
[xopt,fopt,exitflag]=fminunc(@fun1,x0,options);