HW4 Solution

ECE580 Spring 2016
Solution to Problem Set 4
February 17, 2016
ECE580 Solution to Problem Set 4:
Newton’s and Gradient Algorithms
These problems are from the textbook by Chong and Zak, 4th edition, which is the
textbook for the ECE580 Spring 2016 semester. As such, many of the problem statements
are taken verbatim from the text; however, others have been reworded for reasons of
efficiency or instruction. Solutions are mine. Any errors are mine and should be reported
to me, [email protected], rather than to the textbook authors.
7.10 Secant Method
(a) Implement in Matlab the secant method
$$x^{(k+1)} = x^{(k)} - \frac{x^{(k)} - x^{(k-1)}}{f'(x^{(k)}) - f'(x^{(k-1)})}\, f'(x^{(k)})$$
using the stopping criterion
$$|x^{(k+1)} - x^{(k)}| < \varepsilon\,|x^{(k)}|.$$
Solution: straightforward.
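The iteration in part (a) can be sketched in a few lines. The version below is a minimal Python sketch rather than the Matlab implementation the problem asks for. The function name `secant`, the iteration cap `max_iter`, and the zero-denominator guard are assumptions added for illustration and are not part of the problem statement. The routine looks for a zero of whatever function `h` it is given, so passing `h = f'` gives the minimization form of part (a).

```python
def secant(h, x_prev, x_curr, eps=1e-5, max_iter=100):
    """Secant iteration for a zero of h.

    Passing h = f' gives the minimization form of part (a).
    max_iter and the zero-denominator guard are added safeguards,
    not part of the problem statement.
    """
    for _ in range(max_iter):
        h_prev, h_curr = h(x_prev), h(x_curr)
        denom = h_curr - h_prev
        if denom == 0:
            # Secant slope is zero; the update is undefined, so stop here.
            return x_curr
        x_next = x_curr - (x_curr - x_prev) / denom * h_curr
        # Relative stopping criterion |x(k+1) - x(k)| < eps * |x(k)|.
        if abs(x_next - x_curr) < eps * abs(x_curr):
            return x_next
        x_prev, x_curr = x_curr, x_next
    return x_curr


# Illustrative test function (not from the text): f(x) = x**2 - 2*x,
# whose derivative f'(x) = 2*x - 2 vanishes at the minimizer x = 1.
x_min = secant(lambda x: 2 * x - 2, 0.0, 3.0)
print(x_min)
```

Because the routine only needs a callable, the same function can be reused for root finding by passing the function itself instead of its derivative.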
(b) Let $g(x) = (2x - 1)^2 + 4(4 - 1024x)^4$. Find a root of $g(x)$ using the secant
method with initial points $x^{(-1)} = 0$ and $x^{(0)} = 1$ and $\varepsilon = 10^{-5}$ in the
stopping criterion.
Solution:
To find the root of $g(x)$, we use, instead of the algorithm above,
$$x^{(k+1)} = x^{(k)} - \frac{x^{(k)} - x^{(k-1)}}{g(x^{(k)}) - g(x^{(k-1)})}\, g(x^{(k)}).$$
The odd thing about this problem is that regardless of which algorithm one
uses, one gets the same “root”, approximately $x = 0.003866$, with the
corresponding function value approximately $g(0.003866) = 0.9846$. Needless to say,
this is not a root, though it may be a minimizer. It is simply the result of
using an algorithm that does not meet our needs. This example shows why it
is important to check the value of the function at the supposed minimizer or
root, to determine whether one has achieved one’s objective.
So, what went wrong? Let’s try to find the root analytically. $g(x)$ is the sum
of two nonnegative terms, therefore it is zero only if both terms are zero. To make
the first term zero, we need $x = 1/2$. To make the second term zero, we need
$x = 4/1024 = 1/256$. $x$ cannot take both values simultaneously, so the function
$g(x)$ has no roots.
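The argument above can be checked numerically. The following is a minimal Python sketch that evaluates the function from part (b) exactly at the two points where one of its terms vanishes. The names `g`, `g_half`, and `g_256th` are chosen here for illustration.

```python
from fractions import Fraction


def g(x):
    """The function of part (b): (2x - 1)^2 + 4(4 - 1024x)^4."""
    return (2 * x - 1) ** 2 + 4 * (4 - 1024 * x) ** 4


# Exact rational arithmetic at the two points where one term vanishes.
g_half = g(Fraction(1, 2))      # first term is zero, second is 4 * 508**4
g_256th = g(Fraction(1, 256))   # second term is zero, first is (127/128)**2

print(g_half, g_256th)
```

Neither value is zero, which agrees with the conclusion that the function has no real root.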
9.3 Consider minimizing $f(x) = x^{4/3}$, noting that 0 is the global minimizer of $f$.
(a) Calculate the update algorithm for Newton’s method applied to this problem.
Solution: $f$ and its derivatives are
$$f(x) = x^{4/3}, \qquad f'(x) = \frac{4}{3} x^{1/3}, \qquad f''(x) = \frac{4}{3} \cdot \frac{1}{3}\, x^{-2/3} = \frac{4}{9} x^{-2/3}.$$
Thus the update algorithm for Newton’s method is
$$x^{(k+1)} = x^{(k)} - \frac{f'(x^{(k)})}{f''(x^{(k)})} = x^{(k)} - \frac{\frac{4}{3} (x^{(k)})^{1/3}}{\frac{4}{9} (x^{(k)})^{-2/3}} = x^{(k)} - 3x^{(k)} = -2x^{(k)}.$$
(b) Show that the only starting point for which the algorithm converges to zero is
zero itself.
Solution: From $x^{(k+1)} = -2x^{(k)}$, we see that the sign changes with every step
and we can express $x^{(k)}$ as a function of $x^{(0)}$ as
$$x^{(k)} = (-2)^k x^{(0)}.$$
The absolute value of $x^{(k)}$ is therefore
$$|x^{(k)}| = 2^k |x^{(0)}|,$$
so unless $x^{(0)} = 0$ the magnitude of the sequence is strictly increasing. Hence, the sequence
will not converge to zero unless it starts at zero.
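The doubling behaviour is easy to observe numerically. The following is a minimal Python sketch that applies the Newton update using the derivatives computed in part (a). The starting point 0.1 and the helper name `newton_step` are arbitrary choices made for illustration.

```python
import math


def newton_step(x):
    """One Newton step for f(x) = x^(4/3), valid for x != 0."""
    # Real cube root carrying the sign of x (x ** (1/3) is complex for x < 0).
    cbrt = math.copysign(abs(x) ** (1.0 / 3.0), x)
    f1 = 4.0 / 3.0 * cbrt                      # f'(x)
    f2 = 4.0 / 9.0 * abs(x) ** (-2.0 / 3.0)    # f''(x)
    return x - f1 / f2


x = 0.1
iterates = [x]
for _ in range(5):
    x = newton_step(x)
    iterates.append(x)

print(iterates)  # each iterate is -2 times the previous one, up to rounding
```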
9.4 Rosenbrock’s Function $f : \mathbb{R}^2 \to \mathbb{R}$ is given by
$$f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2.$$
(a) Prove that the point b = [1 1]T is the unique global minimizer of f over R2 .
Solution: Since $f$ is a sum of two squares, $f(x) \ge 0$ for all $x$, and
$f(b) = 100(1 - 1^2)^2 + (1 - 1)^2 = 0$, so $b$ is a global minimizer. Suppose that
there is another point $a = [a_1, a_2]^T$ that is also a global minimizer of $f$ over $\mathbb{R}^2$. Then
$$0 = f(b) = f(a) = 100(a_2 - a_1^2)^2 + (1 - a_1)^2.$$
Note that both terms on the right hand side of $f(a)$ are nonnegative, thus
their sum can be zero iff both are. The second term of the sum gives us $a_1 = 1$
and then the first gives us that $a_2 = a_1^2 = 1$. Hence $a = b$, so we have shown that the global
minimizer is unique.
(b) Apply two iterations of Newton’s method starting from the origin.
Solution: For this we need the first two derivatives of Rosenbrock’s function:
$$\nabla f(x) = \begin{bmatrix} 200(x_2 - x_1^2)(-2x_1) - 2(1 - x_1) \\ 200(x_2 - x_1^2) \end{bmatrix},$$
$$F(x) = \begin{bmatrix} -400x_2 + 400(3)x_1^2 + 2 & -400x_1 \\ -400x_1 & 200 \end{bmatrix}.$$
Thus
$$x^{(k+1)} = x^{(k)} - (F(x^{(k)}))^{-1} \nabla f(x^{(k)}),$$
where
$$(F(x))^{-1} = \frac{1}{-80000(x_2 - x_1^2) + 400} \begin{bmatrix} 200 & 400x_1 \\ 400x_1 & -400(x_2 - 3x_1^2) + 2 \end{bmatrix}.$$
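The first step is easy. With $x_1 = x_2 = 0$ we have
$$x^{(1)} = x^{(0)} - (F(x^{(0)}))^{-1} \nabla f(x^{(0)}) = 0 - \begin{bmatrix} 1/2 & 0 \\ 0 & 1/200 \end{bmatrix} \begin{bmatrix} -2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$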
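Using the Matlab script below to keep track of the calculations, with $x^{(1)} = [1, 0]^T$
we have
$$f(x^{(1)}) = 100,$$
$$\nabla f(x^{(1)}) = \begin{bmatrix} 400 \\ -200 \end{bmatrix},$$
$$F(x^{(1)}) = \begin{bmatrix} 1202 & -400 \\ -400 & 200 \end{bmatrix},$$
$$(F(x^{(1)}))^{-1} = \begin{bmatrix} 1/402 & 1/201 \\ 1/201 & 601/40200 \end{bmatrix},$$
$$x^{(2)} = x^{(1)} - (F(x^{(1)}))^{-1} \nabla f(x^{(1)}) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$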
I used the Matlab script below to find the quantities listed above.
clear all
syms x1 x2
% Rosenbrock's function, its gradient and its Hessian
f = (1 - x1)^2 + 100*(- x1^2 + x2)^2
gf = gradient(f, [x1, x2])
Hf = hessian(f, [x1, x2])
% Closed-form inverse of the 2-by-2 Hessian
dHf = det(Hf)
iHf = 1/dHf*[Hf(2,2) -Hf(1,2);-Hf(2,1) Hf(1,1)]
% First Newton step from x(0) = [0; 0]
newx = -subs(subs(iHf*gf,x1,0),x2,0)
% Second Newton step from x(1) = [1; 0]
newerx = newx - subs(subs(iHf*gf,x1,1),x2,0)
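For readers without the Symbolic Math Toolbox, the two Newton steps can also be reproduced numerically. The following is a minimal Python sketch, not a translation of the script above. It hard-codes the gradient and Hessian of Rosenbrock’s function and uses the closed-form inverse of a 2-by-2 matrix. The helper names `grad`, `hess`, and `newton_step` are chosen for illustration.

```python
def grad(x1, x2):
    """Gradient of Rosenbrock's function."""
    return (-400.0 * x1 * (x2 - x1 ** 2) - 2.0 * (1.0 - x1),
            200.0 * (x2 - x1 ** 2))


def hess(x1, x2):
    """Hessian of Rosenbrock's function."""
    return ((-400.0 * x2 + 1200.0 * x1 ** 2 + 2.0, -400.0 * x1),
            (-400.0 * x1, 200.0))


def newton_step(x1, x2):
    g1, g2 = grad(x1, x2)
    (a, b), (c, d) = hess(x1, x2)
    det = a * d - b * c
    # Apply the closed-form 2x2 inverse to the gradient.
    s1 = (d * g1 - b * g2) / det
    s2 = (-c * g1 + a * g2) / det
    return (x1 - s1, x2 - s2)


x_1 = newton_step(0.0, 0.0)   # first iterate, from the origin
x_2 = newton_step(*x_1)       # second iterate
print(x_1, x_2)
```

The second iterate coincides with the minimizer $[1, 1]^T$, even though the first step raises the function value from 1 to 100.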
(c) Apply two steps of the fixed step gradient algorithm to the same problem. Use
α = 0.05.
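Solution: The update for the fixed step gradient algorithm is
$$x^{(k+1)} = x^{(k)} - \alpha \nabla f(x^{(k)}).$$
Starting again at $x^{(0)} = 0$, with $\nabla f(x^{(0)}) = [-2, 0]^T$, we have
$$x^{(1)} = 0 - 0.05 \begin{bmatrix} -2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.1 \\ 0 \end{bmatrix}.$$
With
$$\nabla f(x^{(1)}) = \begin{bmatrix} 200(0 - 0.1^2)(-2(0.1)) - 2(1 - 0.1) \\ 200(0 - 0.1^2) \end{bmatrix} = \begin{bmatrix} -1.4 \\ -2 \end{bmatrix},$$
we obtain
$$x^{(2)} = \begin{bmatrix} 0.1 \\ 0 \end{bmatrix} - 0.05 \begin{bmatrix} -1.4 \\ -2 \end{bmatrix} = \begin{bmatrix} 0.17 \\ 0.1 \end{bmatrix}.$$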
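The two fixed-step iterations can be checked with a few lines of code. The following is a minimal Python sketch under the same setup (start at the origin, step size 0.05). The helper names `grad` and `gradient_step` are chosen for illustration.

```python
def grad(x1, x2):
    """Gradient of Rosenbrock's function."""
    return (-400.0 * x1 * (x2 - x1 ** 2) - 2.0 * (1.0 - x1),
            200.0 * (x2 - x1 ** 2))


def gradient_step(x, alpha=0.05):
    """One fixed-step gradient update x - alpha * grad f(x)."""
    g = grad(*x)
    return (x[0] - alpha * g[0], x[1] - alpha * g[1])


x_1 = gradient_step((0.0, 0.0))
x_2 = gradient_step(x_1)
print(x_1, x_2)   # approximately (0.1, 0.0) and (0.17, 0.1)
```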
8.1 Perform two iterations of the steepest descent method, starting from the origin,
towards finding a minimizer, and also determine the optimal solution analytically.
The function of interest is
$$f(x_1, x_2) = x_1 + x_2/2 + x_1^2/2 + x_2^2 + 3.$$
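Solution: $f$ can be expressed as
$$f(x) = \frac{1}{2} x^T Q x - x^T b + c,$$
where
$$Q = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \qquad b = \begin{bmatrix} -1 \\ -1/2 \end{bmatrix},$$
and
$$c = 3.$$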
Note that the minimizer does not depend on the value of $c$, so we can ignore it
except when finding the minimum value of the function.
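In fact we know that the minimizer is
$$x^* = Q^{-1} b = \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix} \begin{bmatrix} -1 \\ -1/2 \end{bmatrix} = \begin{bmatrix} -1 \\ -1/4 \end{bmatrix}.$$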
We will now test the steepest descent algorithm, at least the first two steps thereof.
The gradient is
$$\nabla f(x) = \begin{bmatrix} 1 + x_1 \\ 1/2 + 2x_2 \end{bmatrix} = Qx - b.$$
We know that for a quadratic,
$$\alpha_k = \frac{\nabla f(x^{(k)})^T \nabla f(x^{(k)})}{\nabla f(x^{(k)})^T Q \nabla f(x^{(k)})}.$$
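Now for the first step of the algorithm, we find
$$\nabla f(x^{(0)}) = \begin{bmatrix} 1 \\ 1/2 \end{bmatrix}$$
so
$$\alpha_0 = \frac{\begin{bmatrix} 1 & 1/2 \end{bmatrix} \begin{bmatrix} 1 \\ 1/2 \end{bmatrix}}{\begin{bmatrix} 1 & 1/2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1/2 \end{bmatrix}} = \frac{5/4}{3/2} = \frac{5}{6}.$$
Then
$$x^{(1)} = 0 - \frac{5}{6} \begin{bmatrix} 1 \\ 1/2 \end{bmatrix} = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix}.$$
Repeating the process we find
$$\nabla f(x^{(1)}) = \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix}$$
so
$$\alpha_1 = \frac{\begin{bmatrix} 1/6 & -1/3 \end{bmatrix} \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix}}{\begin{bmatrix} 1/6 & -1/3 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix}} = \frac{1/36 + 1/9}{1/36 + 2/9} = \frac{5}{9},$$
and
$$x^{(2)} = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix} - \frac{5}{9} \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix} = \begin{bmatrix} -25/27 \\ -5/12 + 5/27 \end{bmatrix} = \begin{bmatrix} -25/27 \\ -25/108 \end{bmatrix}.$$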
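The fractions in the two steepest descent steps can be verified with exact rational arithmetic. The following is a minimal Python sketch using the standard `fractions` module. The helper names `grad` and `steepest_descent_step` are chosen for illustration, and the step sizes are indexed from zero.

```python
from fractions import Fraction

Q = ((Fraction(1), Fraction(0)),
     (Fraction(0), Fraction(2)))
b = (Fraction(-1), Fraction(-1, 2))


def grad(x):
    """Gradient Qx - b of the quadratic."""
    return tuple(sum(Q[i][j] * x[j] for j in range(2)) - b[i]
                 for i in range(2))


def steepest_descent_step(x):
    """Return (alpha, next iterate) using the exact quadratic step size."""
    g = grad(x)
    Qg = tuple(sum(Q[i][j] * g[j] for j in range(2)) for i in range(2))
    alpha = sum(gi * gi for gi in g) / sum(gi * qi for gi, qi in zip(g, Qg))
    return alpha, tuple(xi - alpha * gi for xi, gi in zip(x, g))


alpha_0, x_1 = steepest_descent_step((Fraction(0), Fraction(0)))
alpha_1, x_2 = steepest_descent_step(x_1)
print(alpha_0, x_1)   # 5/6 and (-5/6, -5/12)
print(alpha_1, x_2)   # 5/9 and (-25/27, -25/108)
```

After two steps the iterate is already close to the analytic minimizer $[-1, -1/4]^T$.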
8.8 Consider the function
$$f(x) = 3(x_1^2 + x_2^2) + 4x_1 x_2 + 5x_1 + 6x_2 + 7.$$
What is the largest range of values of α for which the fixed step algorithm is globally
convergent?
Solution:
We must find the eigenvalues of the Q matrix for the quadratic function, which can
be expressed as
$$f(x) = \frac{1}{2} x^T Q x - x^T b + c$$
where
$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix}.$$
The eigenvalues of $Q$ are found by solving $|sI - Q| = 0$ for $s$. Here we have
$$(s - 6)^2 - 16 = s^2 - 12s + 20 = (s - 2)(s - 10) = 0,$$
so the eigenvalues are 2 and 10, and $\lambda_{\max}(Q) = 10$.
According to Theorem 8.3 in the text, the fixed-step algorithm will converge for any
initial condition iff $\alpha \in (0, 2/\lambda_{\max}(Q))$, i.e. $\alpha \in (0, 1/5)$.
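The bound can be illustrated numerically. The following is a minimal Python sketch that computes the eigenvalues of the symmetric 2-by-2 matrix in closed form and then runs the fixed-step iteration with one step size just inside the range and one just outside. The vector `b`, the minimizer `x_star`, the step sizes 0.19 and 0.21, and the iteration count are values worked out or chosen here for illustration and do not appear in the text.

```python
import math


def sym2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the characteristic polynomial."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius


lam_min, lam_max = sym2x2_eigenvalues(6.0, 4.0, 6.0)
alpha_bound = 2.0 / lam_max


def run_fixed_step(alpha, steps=500):
    """Distance to the minimizer after `steps` iterations from the origin."""
    Q = ((6.0, 4.0), (4.0, 6.0))
    b = (-5.0, -6.0)          # from the linear terms 5*x1 + 6*x2
    x_star = (-0.3, -0.8)     # solves Q x = b
    x = (0.0, 0.0)
    for _ in range(steps):
        g = (Q[0][0] * x[0] + Q[0][1] * x[1] - b[0],
             Q[1][0] * x[0] + Q[1][1] * x[1] - b[1])
        x = (x[0] - alpha * g[0], x[1] - alpha * g[1])
    return math.hypot(x[0] - x_star[0], x[1] - x_star[1])


print(lam_min, lam_max, alpha_bound)
print(run_fixed_step(0.19))   # inside the range: the error shrinks
print(run_fixed_step(0.21))   # outside the range: the error grows
```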