WI4087TU sheets week..

Criteria for optimality
If f is a differentiable function, then f’(x) = 0 is a
necessary condition for x being a minimum. It is
not a sufficient condition, however:
A point with f’(x) = 0 is called a stationary point. Examples:
f(x) = x^2:   x = 0 is an absolute minimum
f(x) = -x^2:  x = 0 is an absolute maximum
f(x) = x^3:   x = 0 is a saddle point
f(x) = 0:     x = 0 is both a minimum and a maximum
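As a quick check of these examples, one can solve f’(x) = 0 symbolically and inspect f’’ at the stationary point; a short sketch (SymPy is my choice here, not something used in the notes):

# Sketch (assumes SymPy is available): the example functions all have a
# stationary point at x = 0, but f''(0) distinguishes min, max and "inconclusive".
import sympy as sp

x = sp.symbols('x', real=True)
for f in (x**2, -x**2, x**3):
    stationary = sp.solve(sp.diff(f, x), x)      # points with f'(x) = 0
    f2 = sp.diff(f, x, 2).subs(x, 0)             # second derivative at 0
    print(f, "-> stationary points:", stationary, ", f''(0) =", f2)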
For a twice differentiable function the condition
f’(x) = 0, f’’(x) > 0 is sufficient for x being a minimum. It
is not a necessary condition, however:
f(x) = x^4
f’(0) = f’’(0) = f’’’(0) = 0, f’’’’(0) > 0
This function has an absolute minimum at x = 0, but does not
satisfy the above criterion.
For 2k times differentiable functions, a sufficient
criterion is:
f’(x) = f’’(x) = … = f^(2k-1)(x) = 0, f^(2k)(x) > 0
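For f(x) = x^4 this criterion holds with k = 2: the fourth derivative is the first non-zero one. A small sketch of this check (SymPy again, illustration only):

# Sketch (assumes SymPy): higher-order derivative test for f(x) = x**4 at x = 0.
import sympy as sp

x = sp.symbols('x', real=True)
f = x**4
for order in range(1, 7):
    d = sp.diff(f, x, order).subs(x, 0)
    if d != 0:
        # first non-zero derivative: even order and positive value => minimum
        print("order", order, "derivative at 0 is", d)
        break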
Is this also necessary for an infinitely differentiable
function, i.e., does every non-zero function that has a
minimum satisfy this criterion for some k?
The answer is: no! Consider
f(x) = exp(-1/x^2) (if x ≠ 0), and f(0) = 0
This function is continuous, infinitely many times
differentiable, and f^(j)(0) = 0 for all j, but f has an
absolute minimum at x = 0.
This function is not analytic at x = 0 (the Taylor
series expansion does not converge to the
function).
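That all derivatives vanish at 0 can be verified by taking limits; a sketch (SymPy, an illustration, the library choice is mine):

# Sketch (assumes SymPy): the derivatives of exp(-1/x**2) are polynomials in 1/x
# times exp(-1/x**2), and each tends to 0 as x -> 0, so f^(j)(0) = 0 for all j.
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.exp(-1/x**2)
for j in range(1, 5):
    print("j =", j, "limit of f^(j)(x) as x -> 0+:",
          sp.limit(sp.diff(f, x, j), x, 0, '+'))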
Generalization to higher dimensions:
f: R^n → R has a strict minimum at x if
∇f(x) = 0 and ∇²f(x) > 0
∇²f(x) > 0 means that the Hessian matrix of f at x is
positive definite, i.e., (∇²f(x)y, y) > 0 for all y ≠ 0.
Example 1: consider f(x1, x2, x3) = x1^2 + x1 x2 + x2^2 + x3^2. The first
and second derivatives are:
∇f(x1, x2, x3) = (2x1 + x2, x1 + 2x2, 2x3)^T,

                    [ 2  1  0 ]
∇²f(x1, x2, x3) =   [ 1  2  0 ]
                    [ 0  0  2 ]

The eigenvalues of the Hessian matrix are 1, 2, 3. These
are all positive, so the Hessian is positive definite and the
stationary point (0, 0, 0) is a strict minimum.
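The gradient, Hessian and eigenvalues of this example can be reproduced symbolically; a small sketch (again assuming SymPy, as an illustration only):

# Sketch (assumes SymPy): gradient, Hessian and Hessian eigenvalues of
# f(x1, x2, x3) = x1**2 + x1*x2 + x2**2 + x3**2.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
f = x1**2 + x1*x2 + x2**2 + x3**2
grad = [sp.diff(f, v) for v in (x1, x2, x3)]
H = sp.hessian(f, (x1, x2, x3))
print("gradient:", grad)               # [2*x1 + x2, x1 + 2*x2, 2*x3]
print("eigenvalues:", H.eigenvals())   # 1, 2 and 3, each with multiplicity 1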
Example 2: consider f(x1, x2, x3) = x1^2/x2 + x3^2 for x2 > 0.
The first and second derivatives are:
∇f(x1, x2, x3) = (2x1/x2, -x1^2/x2^2, 2x3)^T,

                    [  2/x2        -2x1/x2^2     0 ]
∇²f(x1, x2, x3) =   [ -2x1/x2^2     2x1^2/x2^3   0 ]
                    [  0            0            2 ]

The eigenvalues of the Hessian matrix are 0, 2, and
2(x1^2 + x2^2)/x2^3.
These are all non-negative, so the matrix is non-negative
definite and the function f is convex. This also
follows from the definition of non-negative definite:

(∇²f(x) y, y) = (2/x2)·(y1 - (x1/x2)·y2)^2 + 2·y3^2 ≥ 0.
All points (0, x2, 0) with x2 > 0 are (non-strict) minima.
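The Hessian of this second example and the sum-of-squares form of its quadratic form can be checked symbolically as well; a sketch (SymPy, illustration only):

# Sketch (assumes SymPy): Hessian of f = x1**2/x2 + x3**2 on x2 > 0, and a check
# that (H y, y) equals (2/x2)*(y1 - x1*y2/x2)**2 + 2*y3**2, hence is >= 0.
import sympy as sp

x1, x3, y1, y2, y3 = sp.symbols('x1 x3 y1 y2 y3', real=True)
x2 = sp.symbols('x2', positive=True)
f = x1**2 / x2 + x3**2
H = sp.hessian(f, (x1, x2, x3))
y = sp.Matrix([y1, y2, y3])
quad_form = sp.expand((y.T * H * y)[0, 0])
target = sp.expand((2/x2) * (y1 - x1*y2/x2)**2 + 2*y3**2)
print(H)
print(sp.simplify(quad_form - target) == 0)   # True: the form is a sum of squares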
Multivariable unconstrained optimization (Ch 12.5)
Steepest descent method
The idea of this method is that, starting from a given point,
a minimum is found along a steep(est) descent direction;
from that point a new point is then found in the same way.
The steepest ascent direction is given by the gradient:
∇f(x) = f’(x)^T,
because from the Taylor approximation
f(x + h) = f(x) + ∇f(x)^T·h + O(|h|^2)
it is clear that ∇f(x) is the direction in which f locally
increases maximally.
The steepest ascent algorithm works as follows (a Python
sketch is given after the steps):
0. Choose a starting point x0 and set k := 0
1. Find the value t* for which f(xk + t·f’(xk)) is maximal
2. xk+1 := xk + t*·f’(xk)
3. If the stopping criterion is satisfied, stop,
   else k := k + 1 and go to 1
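A minimal Python sketch of these steps follows. The names and tolerances are mine, and the line search simply minimizes -f along the ray with scipy.optimize.minimize_scalar; it is a sketch of the idea, not a robust implementation.

# Sketch (assumes NumPy and SciPy): steepest ascent with a one-dimensional
# line search along the gradient direction.
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_ascent(f, grad_f, x0, tol=1e-8, max_iter=100):
    x = np.asarray(x0, dtype=float)                    # step 0: starting point
    for _ in range(max_iter):
        g = np.asarray(grad_f(x), dtype=float)
        if np.linalg.norm(g) < tol:                    # step 3: stopping criterion
            break
        # step 1: t* maximizing f(x + t*g), i.e. minimizing -f along the ray
        t_star = minimize_scalar(lambda t: -f(x + t * g)).x
        x = x + t_star * g                             # step 2: new iterate
    return x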
Example:
f(x1, x2) = 2x1x2 + 2x2 – x1^2 – 2x2^2
f’(x1, x2) = (2x2 – 2x1, 2x1 + 2 – 4x2)
Starting point X0 = (0, 0)
Iteration 1: f’(0, 0) = (0, 2)
Find the maximum of f((0,0) + t(0,2)) = f(0, 2t) = 4t – 8t^2:
t* = 1/4
X1 = (0,0) + 1/4·(0,2) = (0, 1/2)
Iteration 2: f’(0, 1/2) = (1, 0)
Find the maximum of f((0,1/2) + t(1,0)) = f(t, 1/2) = 1/2 + t – t^2:
t* = 1/2
X2 = (0,1/2) + 1/2·(1,0) = (1/2, 1/2)
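Running the steepest_ascent sketch from above on this example (the helper is my own, hypothetical code) carries out exactly these line searches, so its first iterates are the (0, 1/2) and (1/2, 1/2) computed by hand, and the iterates converge towards the maximum at (1, 1), where f(1, 1) = 1:

# Usage sketch for f(x1, x2) = 2*x1*x2 + 2*x2 - x1**2 - 2*x2**2 (maximization).
import numpy as np

f = lambda x: 2*x[0]*x[1] + 2*x[1] - x[0]**2 - 2*x[1]**2
grad_f = lambda x: np.array([2*x[1] - 2*x[0], 2*x[0] + 2 - 4*x[1]])

print(steepest_ascent(f, grad_f, x0=(0.0, 0.0)))   # approximately [1. 1.]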