Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 0: Mathematical Background

1. Given that
$\begin{bmatrix} a & 0 & 0 \\ b & d & 0 \\ c & e & f \end{bmatrix} \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} = \begin{bmatrix} 4 & 2 & 4 \\ 2 & 2 & 2 \\ 4 & 2 & 3 \end{bmatrix},$
find out the values of a, b, c, d, e and f. (For a square root, select the positive value.) [What you just attempted is called Cholesky decomposition and works as desired for symmetric positive definite matrices.]

2. Consider three vectors $u_1 = [2\ 0\ {-1}\ 1]^T$, $u_2 = [1\ 2\ 0\ 3]^T$ and $u_3 = [3\ 0\ {-1}\ 2]^T$.
(a) Find the unit vector $v_1$ along $u_1$.
(b) From $u_2$, subtract its component along $v_1$ (which will have magnitude $v_1^T u_2$) and hence find the unit vector $v_2$ such that vectors $v_1, v_2$ form an orthonormal basis for the subspace spanned by $u_1, u_2$.
(c) Similarly, find a vector $v_3$ which, together with $v_1$ and $v_2$, forms an orthonormal basis for the subspace spanned by all the three vectors $u_1, u_2, u_3$.
(d) Find a vector $v_4$ to complete this basis for the entire space $R^4$.
(e) Write a generalized algorithm for building up the vectors $v_1, v_2, \cdots, v_l$, $l \le m$, when the given $m$ vectors $u_1, u_2, \cdots, u_m$ are in $R^n$, $m < n$.
[This process is called Gram-Schmidt orthogonalization and is used for building orthonormal bases for prescribed subspaces.]

3. Find an orthonormal basis for the range space of the linear transformation defined by the matrix
$A = \begin{bmatrix} 2 & 1 & 3 & 4 \\ 3 & 0 & -2 & 2 \\ 5 & 1 & 1 & 6 \end{bmatrix}.$

4. A surveyor reaches a remote valley to prepare records of land holdings. The valley is a narrow strip of plain land between a mountain ridge and the sea, and the local people use a local and antiquated system of measures. They have two distant landmarks: the lighthouse and the high peak. To mention the location of any place, they typically instruct: so many bans towards the lighthouse and so many kos towards the high peak.
Upon careful measurement, the surveyor and his assistants found that (a) one bans is roughly 200 m, (b) one kos is around 15 km, (c) the lighthouse is 10 degrees south of east, and (d) the high peak is 5 degrees west of north. The surveyor's team, obviously, uses the standard system, with unit distances of 1 km along east and along north. Now, to convert the local documents into the standard system, and to convey intended locations to the locals in their own system, work out (a) a conversion formula from the valley system to the standard system, and (b) another conversion formula from the standard system to the valley system.

5. Given that
$\begin{bmatrix} 1 & 0 & 0 \\ b & 1 & 0 \\ c & f & 1 \end{bmatrix} \begin{bmatrix} a & d & g \\ 0 & e & h \\ 0 & 0 & i \end{bmatrix} = \begin{bmatrix} 5 & 2 & 1 \\ 10 & 6 & 1 \\ 5 & -4 & 7 \end{bmatrix},$
find out the values of a, b, c, d, e, f, g, h and i. [This is the celebrated Crout-Doolittle LU decomposition without pivoting.]

6. For $n \times n$ matrices Q, R and A, consider the matrix multiplication $QR = A$ column-wise and observe that
$r_{1,k}\, q_1 + r_{2,k}\, q_2 + r_{3,k}\, q_3 + \cdots + r_{n,k}\, q_n = a_k.$
For the matrix
$A = \begin{bmatrix} 6 & 5 & -1 & 0 \\ 6 & 5 & -1 & 6 \\ 6 & 1 & 1 & 0 \\ 6 & 1 & 1 & 2 \end{bmatrix},$
write out the column equations one by one and determine the corresponding columns of an orthogonal Q and an upper triangular R. (Note: There is no trick in this problem. Never stop in between. The process of QR decomposition always works — till the end!)

7. For a bilinear form $p(x, y) = x^T A y$ of two vector variables $x \in R^m$ and $y \in R^n$, find out $\frac{\partial p}{\partial x_i}$, $\frac{\partial p}{\partial y_i}$; and hence the vector gradients $\frac{\partial p}{\partial x}$ and $\frac{\partial p}{\partial y}$. As a corollary, derive the partial derivative $\frac{\partial q}{\partial x_i}$ and vector gradient of a quadratic form $q(x) = x^T A x$.

8. Check whether the matrix
$\begin{bmatrix} 4 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 2 \end{bmatrix}$
is a positive definite matrix.

9. Consider the matrix
$P = \begin{bmatrix} 2 & 3 & 1 \\ a+b & b-a & 3a+b \end{bmatrix}.$
(a) For which values of a and b is $PP^T$ positive definite?
(b) For which values of a and b is $P^T P$ positive definite?

10. Consider the matrix
$A = \begin{bmatrix} 80 & -60 \\ 36 & -27 \\ -48 & 36 \end{bmatrix}.$
(a) Construct $A^T A$ and determine its eigenvalues $\lambda_1, \lambda_2$ (number them in descending order, for convenience) and corresponding eigenvectors $v_1, v_2$, as an orthonormal basis of $R^2$.
(b) Define $\sigma_k = \sqrt{\lambda_k}$, form a diagonal matrix with $\sigma_1$ and $\sigma_2$ as the diagonal elements and extend it (with additional zeros) to a matrix $\Sigma$ of the same size as A.
(c) Assemble the eigenvectors into an orthogonal matrix as $V = [v_1\ v_2]$ and find any orthogonal matrix U satisfying $A = U \Sigma V^T$.
(d) Identify the null space of A in terms of columns of V.
(e) Identify the range space of A in terms of columns of U.
(f) How does a system of equations $Ax = b$ transform if the bases for the domain and the co-domain of A change to V and U, respectively?
[This powerful decomposition of matrices for solution, optimization and diagnostics of linear systems is called singular value decomposition (SVD).]

11. Find the characteristic polynomial of the following matrix and mention its significance.
$\begin{bmatrix} 0 & 0 & \cdots & 0 & -a_n \\ 1 & 0 & \cdots & 0 & -a_{n-1} \\ 0 & 1 & \cdots & 0 & -a_{n-2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_1 \end{bmatrix}$

12. Eigenvalues of matrix A are 1.1, 1, 0.9 and the corresponding eigenvectors are $[1\ 0\ 1]^T$, $[1\ 2\ {-1}]^T$, $[1\ 1\ 1]^T$. Compute A and $A^6$.

13. Let $f(x)$ be a scalar function of a vector variable $x \in R^n$. Let $Q \in R^{n \times n}$ be an orthogonal matrix, such that its columns $q_1, q_2, \cdots, q_n$ form an orthonormal basis of $R^n$.
(a) For small $\alpha$, find out $f(x + \alpha q_j) - f(x)$.
(b) Hence, show that the directional derivative $\frac{\partial f}{\partial q_j} = q_j^T \nabla f(x)$.
(c) Now, compose the vector resultant $\sum_{j=1}^{n} \frac{\partial f}{\partial q_j} q_j$ and show that it equals $\nabla f(x)$.

14. A function $f(x)$ of two variables has been evaluated at the following points.
$f(1.999, 0.999) = 7.352232$, $f(2, 0.999) = 7.359574$, $f(2.001, 0.999) = 7.366922$;
$f(1.999, 1) = 7.381671$, $f(2, 1) = 7.389056$, $f(2.001, 1) = 7.396449$;
$f(1.999, 1.001) = 7.411257$, $f(2, 1.001) = 7.418686$, $f(2.001, 1.001) = 7.426124$.
Find out the gradient and Hessian of the function at the point (2, 1). How many function values did you have to use for each of them?

15. Let $P(x)$ be a polynomial on which successive synthetic division by a chosen quadratic polynomial $x^2 + px + q$ produces the successive quotients $P_1(x), P_2(x)$ and remainders $rx + s$, $ux + v$, such that
$P(x) = (x^2 + px + q) P_1(x) + rx + s,$
$P_1(x) = (x^2 + px + q) P_2(x) + ux + v.$
(a) Observing that the expressions $P_1(x), P_2(x)$ and the numbers r, s, u, v all depend upon p and q in the chosen expression $x^2 + px + q$, differentiate the expression for $P(x)$ above partially with respect to p and q, and simplify to obtain expressions for $\frac{\partial r}{\partial p}, \frac{\partial r}{\partial q}, \frac{\partial s}{\partial p}, \frac{\partial s}{\partial q}$. [Hint: At its roots, a polynomial evaluates to zero.]
(b) Frame the Jacobian J of $[r\ s]^T$ with respect to $[p\ q]^T$ and work out an iterative algorithm based on the first order approximation to iterate over the parameters p and q for obtaining $r = s = 0$. In brief, work out an iterative procedure to isolate a quadratic factor from a polynomial. [This is Bairstow's method, often found to be an effective way to solve polynomial equations.]
(c) Implement the procedure to find all roots of the polynomial $P(x) = x^6 - 19x^5 + 125x^4 - 329x^3 + 66x^2 + 948x - 216$ up to two places of decimal, starting with $p = 0$ and $q = 0$, i.e. $x^2$ as the initial divisor expression.

16. Function $f(t)$ is being approximated in the interval [0, 1] by a cubic interpolation formula in terms of the boundary conditions as
$f(t) = [f(0)\ f(1)\ f'(0)\ f'(1)]\, W\, [1\ t\ t^2\ t^3]^T.$
Determine the matrix W.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 1: Important Identities and Results

1. For an $m \times n$ matrix A, $m < n$, $P = (AA^T)^{-1}$ has already been computed. Then, an additional row $a^T$ is appended in A, such that the matrix A gets updated to $\bar{A}$.
In terms of P, develop an update formula in the form
$\bar{P} = (\bar{A}\bar{A}^T)^{-1} = \begin{bmatrix} Q & b \\ b^T & \epsilon \end{bmatrix}, \quad \text{where } \bar{A} = \begin{bmatrix} A \\ a^T \end{bmatrix}.$
Similarly, develop a working rule to update $\bar{P}$ from available P, if $\bar{A}$ is obtained by dropping the last row of A. [In the active set strategy of nonlinear optimization, such updates are routinely involved while including and excluding inequality constraints in and from the active set of constraints.]

2. Let an $(n-m)$-dimensional subspace be defined in $R^n$ as $M = \{d : Ad = 0\}$, where $A \in R^{m \times n}$ is full-rank. Show that the orthogonal projection of a vector to this subspace is accomplished by the transformation $P = I_n - A^T (AA^T)^{-1} A$. [Try to derive the transformation, apart from simply verifying.]

3. For an invertible matrix $A \in R^{n \times n}$ and non-zero column vectors $u, v \in R^n$,
(a) find out the rank of the matrix $uv^T$, and
(b) prove the following identity (Sherman-Morrison formula) and comment on its utility.
$(A - uv^T)^{-1} = A^{-1} + A^{-1} u\, (1 - v^T A^{-1} u)^{-1} v^T A^{-1}.$

4. In a certain application, the inverse of the matrix $I_n + cA^T A$ is required, where A is a full-rank $m \times n$ matrix ($m < n$) and c is a large number. For two reasons, a direct inversion of this matrix is not very advisable. For indirectly obtaining the inverse, prove the identity
$(I_n + cA^T A)^{-1} = I_n - cA^T (I_m + cAA^T)^{-1} A$
and verify it for $c = 10$, $n = 3$, $m = 2$ and $A = [2e_1\ 3e_2\ 0]$. Can you figure out what the two reasons are?

5. For a real invertible matrix A, show that $A^T A$ is positive definite. Further, show that, for $\nu > 0$, $A^T A + \nu^2 I$ is positive definite for any real matrix A.

6. For solving $Ax = b$ with symmetric positive definite matrix A, we formulate the error vector $e = Ax - b$, start from an arbitrary point $x_0$ and iterate along selected directions $d_0, d_1, d_2$ etc., as $x_{k+1} = x_k + \alpha_k d_k$, for $k = 0, 1, 2, \cdots$.
(a) Denoting $e_k = Ax_k - b$, determine $\alpha_k$ such that $d_k^T e_{k+1} = 0$, i.e. the step along a chosen direction eliminates any error along that direction in the next iterate.
(b) Then, find out $d_0^T e_2$, $d_0^T e_3$ and $d_1^T e_3$, i.e. errors along old directions.
(c) Generalize the observation for the k-th step.
(d) Work out the conditions that the chosen directions must satisfy such that errors along the old directions also vanish.
[These conditions characterize these directions as conjugate directions for the matrix A.]

7. For $n \times n$ symmetric positive definite L and $m \times n$ full-rank A ($m < n$), prove the identity
$A(L + cA^T A)^{-1} A^T (I_m + cAL^{-1}A^T) = AL^{-1}A^T$
and use the result to prove that an eigenvector v of $AL^{-1}A^T$ with eigenvalue $\sigma$ is also an eigenvector of $A(L + cA^T A)^{-1} A^T$ with corresponding eigenvalue $\frac{1}{c + 1/\sigma}$. [This result has a valuable implication in the theory behind a powerful duality-based algorithm (the augmented Lagrangian method) of nonlinear optimization.]

8. Plot contours of the function $f(x) = x_1^2 x_2 - x_1 x_2 + 8$ in the region $0 < x_1 < 3$, $0 < x_2 < 10$. Develop a quadratic approximation of the function around (2, 5) and superimpose its contours with those of $f(x)$. Are the contour curves of this quadratic approximation elliptic, parabolic or hyperbolic?

9. Find a solution of the equation $e^{-x} = x$ up to two places of decimal. Is the solution unique?

10. Let A be an $m \times n$ matrix ($m < n$) of rank m and let L be an $n \times n$ symmetric positive definite matrix. Then, show that the $(n+m) \times (n+m)$ matrix
$H = \begin{bmatrix} L & A^T \\ A & 0 \end{bmatrix}$
is non-singular, but indefinite.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 2: Optimization Problems and Algorithms

1. Mrs. Anna D'Souza (height 170 cm) called our office requesting an economical size and the best positioning for a mirror that she would get fixed on a door of her bedroom almirah.
Considering that the cost of the mirror is proportional to its area and taking appropriate estimates of dimensions, margins etc., formulate an optimization problem and work out a suitable height, width and positioning of the mirror in which she can view her full image.

2. Design a minimum cost cylindrical tank closed at both ends to contain a fixed volume V of fluid. Assume that the cost depends directly on the area A of sheet metal.

3. Find out the maximum volume of a tank of the above kind that has a given surface area A. Show that the relationship between surface area and volume for the optimal design in this case is equivalent to that in the earlier problem.

4. Design a tank of the above kind for a volume of 250 m³, incorporating the assembly constraint $H \le 10 - D/2$ appearing from the plan of locating it at a particular place in a shed.

5. Using a graphical method, solve the problem
minimize $f(x) = x_1^2 + x_2^2 - 4x_1 + 4$
subject to $x_1 - 2x_2 + 6 \ge 0$, $x_1^2 - x_2 + 1 \le 0$, $x_1, x_2 \ge 0$.

6. Show that the algorithm
$x_{k+1} = A(x_k) = \begin{cases} \frac{1}{2}(x_k + 2) & \text{for } x_k > 1, \\ \frac{1}{4} x_k & \text{for } x_k \le 1 \end{cases}$
for the problem: minimize $\phi(x) = |x|$, is not globally convergent. Explain the reason.

7. Find the order of convergence and convergence ratio of the sequence $\{x_k\}_{k=0}^{\infty}$ if
(a) $x_k = \alpha^k$ for $0 < \alpha < 1$;
(b) $x_k = \alpha^{2^k}$.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 3: Univariate Optimization

1. Identify the stationary points of the following function using the exact method: $f(x) = (x-1)^2 - 0.01x^4$. Check if they are minimum or maximum points.

2. Find out the stationary points of the function $f(x) = 5x^6 - 36x^5 + \frac{165}{2}x^4 - 60x^3 + 36$ and analyze their nature.

3. Maximize $f(x) = -x^3 + 3x^2 + 9x + 10$ in the interval $-2 \le x \le 4$.

4. Implement the bounding phase method (refer to the book by Deb for the algorithm) to bracket a minimum point of the function $f(x) = e^x - x^3$.
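The bracketing strategy of this exercise can be sketched in a few lines of Python. This is a minimal version under the usual step-doubling scheme; the function name `bounding_phase` and the default initial step are illustrative choices, not part of the assignment:

```python
import math

def bounding_phase(f, x0, delta=0.5):
    """Bracket a local minimum of f by step doubling (bounding phase idea)."""
    f_minus, f0, f_plus = f(x0 - delta), f(x0), f(x0 + delta)
    if f_minus >= f0 <= f_plus:
        return (x0 - delta, x0 + delta)   # x0 already sits between higher values
    if f_plus > f0:
        delta = -delta                     # downhill lies to the left
    k, x_prev, x_cur = 0, x0, x0 + delta
    while f(x_cur) < f(x_prev):            # double the step while still descending
        k += 1
        x_prev, x_cur = x_cur, x_cur + (2 ** k) * delta
    x_before = x_prev - (2 ** (k - 1)) * delta if k > 0 else x0 - delta
    return tuple(sorted((x_before, x_cur)))

# Bracket a minimum of f(x) = e^x - x^3, starting near x = 1:
print(bounding_phase(lambda x: math.exp(x) - x**3, 1.0, 0.5))   # -> (1.5, 4.5)
```

From $x_0 = 1$ with step 0.5 this returns the bracket (1.5, 4.5), which contains the local minimum of $e^x - x^3$ near $x \approx 3.73$.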
Next, use the program to bracket minima of the functions in the first three exercises above.

5. Over the bracket arrived at for $f(x) = e^x - x^3$, use two iterations of the Fibonacci search and golden section search methods. Implement one of these (whichever you like through this experience) in a function and use it for all the four functions above up to an accuracy of $10^{-3}$.

6. Over the bracket arrived at for $f(x) = e^x - x^3$, use two iterations of the regula falsi and Newton's methods. Implement one of these (whichever you like through this experience) in a function and use it for all the four functions above up to an accuracy of $10^{-3}$.

7. Compare the above two favourite algorithms from the two classes in terms of their performance on the set of these four functions.

8. Identify the regions over which the function $e^{-x^2}$ is convex and where it is concave. Determine its global minima and maxima.

9. Find all the maxima and minima of the function $\phi(x) = 25(x - \frac{1}{2})^4 - 2(x - \frac{1}{2})^2$, identify its other salient features and sketch its graph.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 4: Fundamentals of Multivariate Optimization

1. Consider the function $f : R^2 \to R$ defined by $f(x) = 2x_1^2 - x_1^4 + x_1^6/6 + x_1 x_2 + x_2^2/2$. Find out all the stationary points and classify them as local minimum, local maximum and saddle points.

2. The Hessian $H(x)$ of a function $f(x)$ is positive semi-definite everywhere.
(a) Using Taylor's theorem (in the remainder form) around the point $x_2$ as
$f(x_1) = f(x_2) + [\nabla f(x_2)]^T (x_1 - x_2) + \frac{1}{2}(x_1 - x_2)^T H[x_2 + \alpha(x_1 - x_2)]\,(x_1 - x_2)$
for $\alpha \in [0, 1]$, show that $f(x_1) \ge f(x_2) + [\nabla f(x_2)]^T (x_1 - x_2)$.
(b) Does the argument remain valid if we know only that the Hessian is positive semi-definite (or positive definite) at the point $x_2$, and not everywhere? Why?
(c) Now, consider an arbitrary line through $x_2$ and select two points y and z on it, on opposite sides of $x_2$. Writing the result of part (a) with y and z as $x_1$ in turn, show that $\beta f(y) + (1 - \beta) f(z) \ge f[\beta y + (1 - \beta) z]$ for some $\beta \in [0, 1]$.
(d) Does the above inequality hold for all $\beta \in [0, 1]$? Why?
(e) Using the definition of a convex function, summarize the entire result in a single sentence.

3. Find the domain in which the function $9(x_1^2 - x_2)^2 + (x_1 - 1)^2$ is convex.

4. (a) Develop a quadratic model of the function $9(x_1^2 - x_2)^2 + (x_1 - 1)^2$ around the origin.
(b) Superimpose the contours of the original function and the quadratic model. (Use a software, e.g. MATLAB, to develop the contours.)
(c) With a circular trust region of radius 0.2 unit, mark the point where a step from the origin should reach.
(d) Obtain the coordinates of this point from the plot and repeat the entire process for one more step.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 5: Basic Methods of Multivariate Optimization

1. Use Nelder and Mead's simplex search method, starting from the origin, to find the minimum point of the function $f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 - 6x - 7y - 8z + 9$.

2. For minimizing the function $f(x) = (x_1 - x_2)^2 + (1 - x_1)^2 + (2x_1 - x_2 - x_3)^2$, consider the origin as the starting solution, i.e. $x_0 = [0\ 0\ 0]^T$.
(a) Evaluate the function $f(x_0)$ and gradient $g(x_0)$ at this point. Using the negative gradient as the search direction, define a new function $\phi(\alpha) = f(x_0 - \alpha g(x_0))$.
(b) Find out the minimizer $\alpha_0$ of $\phi(\alpha)$ and update $x_1 = x_0 - \alpha_0 g(x_0)$.
(c) Similarly, carry out two more such iterations, i.e. find out $f(x_k)$, $g(x_k)$, $\alpha_k$ and $x_{k+1}$ for $k = 1, 2$.
(d) Tabulate the results and analyze them in terms of function value as well as distance from the actual minimum point.

3.
Starting from the point $x_0 = [2\ {-1}\ 0\ 1]^T$, minimize the function
$f(x) = (x_1 + 2x_2 - 1)^2 + 5(x_3 - x_4)^2 + (x_2 - 3x_3)^4 + 10(x_1 - x_4)^4$
by (a) the Hooke-Jeeves method with unit initial step size and (b) the steepest descent method.

4. Show that, with $x_0 = [c\ 1]^T$, steepest descent iterations for the function $f(x) = x_1^2 + cx_2^2$, $c > 0$, are given by $x_m = a_m [c\ (-1)^m]^T$, where $a_m = (c-1)^m / (c+1)^m$. Comment on the behaviour of the method for large values of c.

5. What are the starting points from which a single iteration of the steepest descent algorithm would converge to the minimum point of the function $5x_1^2 + 4x_1 x_2 + 3x_2^2$?

6. While minimizing the function $f(x) = 4x_1^2 - 5x_1 x_2 + 3x_2^2 + 6x_1 - 2x_2$, a transformation of the form $x_1 = ay_1 + by_2$, $x_2 = cy_1 + dy_2$ was used. For the reformulated problem in terms of the variables $y_1, y_2$, the steepest descent method was found to converge to the minimum in a single iteration from an arbitrary starting solution. Find out the conditions that the coefficients a, b, c, d must satisfy for this to happen.

7. (a) Identify the stationary points of the function $f(x) = [x_1^2 + (x_2 + 1)^2][x_1^2 + (x_2 - 1)^2]$.
(b) Classify the stationary points and find out the minimum value(s).
(c) Starting from the point (1, 1), execute a Newton's step and evaluate the step as acceptable or otherwise, in terms of reduction in function value.

8. Consider the function $f(x) = (x_1^2 - x_2)^2 + (x_1 - 1)^2$ and the origin as the starting point.
(a) Determine a step of the pure Newton's method.
(b) Is it a descent step?
(c) Is the associated direction a descent direction?

9. For minimizing the function $f(x) = (x_1^2 - x_2)^2 + (1 - x_1)^2$, perform one iteration of Newton's method from the starting point $[2\ 2]^T$ and compare this step with the direction of the steepest descent method, regarding approach towards the optimum.

10. (a) Develop the first order necessary condition for a minimum point of the function $E(x) = \frac{1}{2}\|Ax - b\|^2$.
(b) Is the resulting system of equations necessarily consistent? Why?
(c) At a solution of this system, does the function necessarily have a minimum value? Why?
(d) Discuss the distribution of minima when $A^T A$ is singular.

11. Solve the following systems of equations by formulating them as optimization problems:
(a) $x^2 - 5xy + y^3 = 2$, $x + 3y = 6$; and
(b) $ze^x - x^2 = 10y$, $x^2 z = 0.5$, $x + z = 1$.

12. Starting from $x = [1\ 1\ 1]^T$, solve the system of equations $16x_1^4 + 16x_2^4 + x_3^4 = 16$, $x_1^2 + x_2^2 + x_3^2 = 3$, $x_1^3 - x_2 = 0$ by Newton's method and the Levenberg-Marquardt method.

13. Find constants $a_1, a_2, a_3, a_4$ and $\lambda$ for a least square fit of the following tabulated data in the form $a_1 + a_2 x + a_3 x^2 + a_4 e^{\lambda x}$.
x:  0   1   2   3   4   5   6   7   8
y: 20  52  69  76  74  67  55  38  17
[Hint: You may attempt it as a five-variable least square problem or as a single-variable optimization problem with a linear least square problem involved in the function evaluation.]

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 6: Multivariate Optimization Methods

1. For the function $f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 - 6x - 7y - 8z + 9$, develop expressions for the gradient and Hessian, and work out three conjugate directions through a Gram-Schmidt procedure. Now, starting from the origin, conduct four sets of line searches along (a) $e_1, e_2, e_3$; (b) three successive steepest descent directions; (c) the three conjugate directions developed; and (d) three directions recommended by the conjugate gradient method. In each case, trace the function values and gradient norms through the steps.

2. Starting from the point $x_0 = [2\ {-1}\ 0\ 1]^T$, minimize the function $f(x) = (x_1 + 2x_2 - 1)^2 + 5(x_3 - x_4)^2 + (x_2 - 3x_3)^4 + 10(x_1 - x_4)^4$ by (a) the Polak-Ribiere/Fletcher-Reeves method and (b) Powell's conjugate direction method, and compare their performance in terms of the number of function evaluations.

3.
Following is an excerpt from the record of line searches in a run of Powell's conjugate directions method applied in a two-variable problem.
$\cdots \to (2, 5) \to (2.9, 6.2) \to (4.2, 6.2) \to (4.5, 6.6) \to (4.9, p) \to (5.05, q) \to (5.09, r) \to \cdots$
What are the values of p, q and r?

4. Starting from the origin and taking the identity matrix as the initial estimate of the Hessian inverse, apply a few steps of the DFP method on the Himmelblau function $f(x_1, x_2) = (x_1^2 + x_2 - 11)^2 + (x_1 + x_2^2 - 7)^2$. Show the progress of the iterations superimposed with a contour plot, and record the development of the inverse Hessian estimate.

5. Using the same starting point, apply a conjugate gradient (Polak-Ribiere or Fletcher-Reeves) method and a quasi-Newton (DFP or BFGS) method to minimize the function $f(x) = [x_1^2 + (x_2 + 1)^2][x_1^2 + (x_2 - 1)^2]$ to an accuracy of $10^{-6}$. In each case, apart from the function value and gradient, also evaluate the exact Hessian (which is not needed and not to be used for the iteration process) after every line search and examine the relation of the two current search directions (previous and next) with the local Hessian. Conduct such experiments with at least two starting points.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 7: Theory of Optimization Methods (optional)

1. For minimizing $f(x) = \frac{1}{2} x^T A x + b^T x$, with positive definite Hessian matrix A, the starting point has been taken as $x_0$ and the starting direction as $d_0 = -g_0$, the negative gradient.
(a) Determine $\alpha_0$ such that $x_1 = x_0 + \alpha_0 d_0$ minimizes the function over the line from $x_0$ along the direction $d_0$.
(b) If at $x_1$ the gradient $g_1 \ne 0$, then determine $\beta_0$ that will make the new direction $d_1 = -g_1 + \beta_0 d_0$ conjugate to $d_0$.
(c) Show that the subspace spanned by $g_0$ and $g_1$ is the same as that spanned by $g_0$ and $Ag_0$, which is also the same as that spanned by $d_0$ and $d_1$, i.e.
$\langle g_0, g_1 \rangle = \langle g_0, Ag_0 \rangle = \langle d_0, d_1 \rangle.$
(d) Next, determine $\alpha_1$ such that $x_2 = x_1 + \alpha_1 d_1$ minimizes the function over the line from $x_1$ along the direction $d_1$.
(e) If $g_2 \ne 0$, then determine $\beta_1$ that will make $d_2 = -g_2 + \beta_1 d_1$ conjugate to $d_1$. Show that the resulting direction $d_2$ is conjugate to $d_0$ as well.
(f) Show the following identity concerning the subspace traversed so far.
$\langle g_0, g_1, g_2 \rangle = \langle g_0, Ag_0, A^2 g_0 \rangle = \langle d_0, d_1, d_2 \rangle.$
(g) In the same manner, determine $\alpha_2$, $x_3$, $\beta_2$ and $d_3$. Establish the conjugacy of $d_3$ to the old directions and the identity similar to the above, over the expanded subspace $\langle d_0, d_1, d_2, d_3 \rangle$.
(h) Finally, write the general result in terms of the step index k and prove it. [In your arguments, you may use the expanding subspace theorem, which has been proved earlier independently.]

2. Noting that the rank-one update on the inverse Hessian
$B_{k+1} = B_k + a_k (p_k - B_k q_k)(p_k - B_k q_k)^T = B_k + a_k \left( p_k p_k^T - B_k q_k p_k^T - p_k q_k^T B_k + B_k q_k q_k^T B_k \right),$
with $a_k = \frac{1}{p_k^T q_k - q_k^T B_k q_k}$, fulfils the key requirement of ensuring the equality $B_{k+1} q_k = p_k$ but fails to guarantee continued positive definiteness, we propose to generalize the update formula as
$B_{k+1} = B_k + \alpha\, p_k p_k^T - \beta\, B_k q_k p_k^T - \gamma\, p_k q_k^T B_k + \delta\, B_k q_k q_k^T B_k.$
(a) Determine the conditions on the coefficients $\alpha, \beta, \gamma$ and $\delta$ so as to fulfil the basic requirement, which is to ensure $B_{k+1} q_k = p_k$.
(b) Further, taking $\gamma = \beta$ for symmetry of $B_{k+1}$, solve the above equations for $\alpha$ and $\delta$ in terms of $\beta$.
(c) Show that the resulting update formula for $B_{k+1}$ in terms of the only remaining free parameter $\beta$ is equivalent to the Broyden family of update formulae.
(d) Imposing one more condition on the coefficients, special cases can be derived. Verify that
i. the condition $\alpha = \delta$ leads to the degenerate rank-one update,
ii. the condition $\beta = 0$ gives the DFP update formula, and
iii. the condition $\delta = 0$ results in the BFGS formula.
(e) Assuming $B_k$ to be positive definite, denote $\sqrt{B_k}\, x = u$ and $\sqrt{B_k}\, q_k = v$ for an arbitrary vector x and develop a simplified expression for $x^T B_{k+1} x$.
(f) Show that $p_k^T q_k = \alpha_k g_k^T B_k g_k$ and hence determine a bound on the value of $\beta$ that will ensure $B_{k+1}$ to be positive definite.
(g) Show that, with any $\beta$ satisfying the above bound for the update of $B_{k+1}$, any starting point $x_0$ and any positive definite matrix $B_0$, the quasi-Newton iterations
$d_k = -B_k g_k$; line search for $\alpha_k$; $p_k = \alpha_k d_k$; $x_{k+1} = x_k + p_k$; $q_k = g_{k+1} - g_k$;
applied on a convex quadratic problem, with constant positive definite Hessian H, lead to the following additional properties:
$p_i^T H p_k = 0$, $q_k^T B_k q_i = 0$ and $B_{k+1} q_i = p_i$ for $0 \le i < k$.

3. The BFGS update on the inverse Hessian is not a rank-two update, but the equivalent of a rank-two update on the corresponding Hessian. To derive it, consider a DFP-like update of the Hessian as
$H_{k+1} = H_k + \frac{q_k q_k^T}{q_k^T p_k} - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k}$
and apply the following two steps of the Sherman-Morrison formula to modify its inverse.
(a) First, put $A = H_k$, $A^{-1} = B_k$, $u = \mu q_k$, $v = \mu q_k$, where $\mu^2 = \frac{1}{q_k^T p_k}$, and develop the intermediate update $B' = (A + uv^T)^{-1}$, which is the inverse of $H' = H_k + \frac{q_k q_k^T}{q_k^T p_k}$.
(b) Next, put $A = H'$, $A^{-1} = B'$, $u = -\nu H_k p_k$, $v = \nu H_k p_k$, where $\nu^2 = \frac{1}{p_k^T H_k p_k}$, and work out the final update as $B_{k+1} = (A + uv^T)^{-1}$, which is now the inverse of
$H_{k+1} = H' - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k} = H_k + \frac{q_k q_k^T}{q_k^T p_k} - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k}.$

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 8: Framework of Constrained Optimization

1. Sketch the feasible region described by the constraints $x_1 - x_3 + 1 = 0 = x_1^2 + x_2^2 - 2x_1$. Identify irregular points of the constraints, if any. Can you reduce an optimization problem over this domain to a 2-variable problem? To a single-variable problem? How?

2.
Reduce the problem
minimize $f(x)$ subject to $Ax = b$, $g_i(x) \le 0$ for $1 \le i \le q$,
to a k-variable problem (where $k = n - p$), given that $A \in R^{p \times n}$ has full row rank.

3. We want to determine the minimum sheet metal needed to construct a right cylindrical can (including bottom and cover) of capacity at least 1.5 litres, with diameter between 5 cm and 12 cm, and height between 10 cm and 18 cm. Making sensible assumptions, write down the KKT conditions, identify the salient points of the domain, where constraint boundaries meet, as KKT candidate points and then test the conditions on those points.

4. For the problem
minimize $f(x) = 0.01x_1^2 + x_2^2$
subject to $g_1(x) = 25 - x_1 x_2 \le 0$, $g_2(x) = 2 - x_1 \le 0$;
obtain the solution using the KKT conditions, sketch the domain with the solution and verify the second order sufficient condition for optimality. Estimate the new optimal value of the function if (a) the first constraint is changed to $g_1 = 26 - x_1 x_2 \le 0$, or (b) the second constraint is changed to $g_2 = 3 - x_1 \le 0$.

5. Verify that (1, 0, 3) is a KKT point of the NLP problem
minimize $f(x) = -x_1^3 + x_2^3 - 2x_1 x_3^2$
subject to $2x_1 + x_2^2 + x_3 = 5$, $5x_1^2 - x_2^2 - x_3 \ge 2$, $x_1, x_2, x_3 \ge 0$.
Examine the second order conditions. Is this a convex programming problem?

6. Identify the KKT points (i.e. points satisfying the KKT conditions) of the problem
minimize $f(x) = x_1^2 - x_2^2$ subject to $x_1^2 + 2x_2^2 = 4$,
and examine them through the second order conditions for optimality.

7. Locate the KKT point(s) of the NLP problem
minimize $f(x) = x_1^2 + x_2^2 - 2x_2 - 1$
subject to $g_1(x) = 4(x_1 - 4)^2 + 9(x_2 - 3)^2 - 36 \le 0$, $g_2(x) = 9(x_1 - 4)^2 + 4(x_2 - 3)^2 - 36 \le 0$
over a sketch of the domain.

8. Write down the formal KKT conditions of the NLP problem
minimize $f(x) = x^2 - 8x + 10$ subject to $x \ge 6$.
Develop the Lagrangian $L(x, \mu)$ of this problem, evaluate its derivatives up to the second order and construct contours of the Lagrangian function on the $x$-$\mu$ plane.

9.
For the problem
minimize $f(x) = (x_1 - 3)^2 + (x_2 - 3)^2$ subject to $2x_1 + x_2 \le 2$;
develop the dual function, maximize it and find the corresponding point in x-space. Compare the optimal values of the primal and dual functions.

10. Show that convex functions $g_i(x)$ for all i in $g(x) \le 0$, along with linear equality constraints in $h(x) = 0$, define a convex domain.

11. Suppose that a regular point $x^*$ of a convex programming problem satisfies the KKT conditions.
(a) If it is not a local minimum, then show that the assumption of an arbitrarily close feasible point y, such that $f(y) < f(x^*)$, leads to a contradiction.
(b) Now that we are forced to admit $x^*$ as a local minimum point, let us suppose that it is not a global minimum point. Then, taking a point z somewhere in the domain such that $f(z) < f(x^*)$, show that you can always find another point y satisfying all the premises of part (a), and hence leading to a contradiction.
(c) Finally, suppose that $x^*$ is a global minimum point, but it is not unique. Then, considering another global minimum point w, show that every point in the line segment joining $x^*$ and w is also a global minimum point.
(d) Summarize the complete result in the form of a statement on KKT conditions in the context of a convex programming problem.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 9: Linear and Quadratic Problems

1. Formulate the problem
maximize $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$
subject to $x_1^{\beta_{i1}} x_2^{\beta_{i2}} \cdots x_n^{\beta_{in}} \le b_i$ for $i = 1$ to m, and $x_j \ge 1$ for $j = 1$ to n,
as a linear programming problem and justify your formulation.

2. Using the simplex method, solve the following LP problems.
(a) Minimize $x_1$ subject to $2x_1 + x_2 \le 2$, $x_1 + 5x_2 + 10 \ge 0$, $x_2 \le 1$.
(b) Minimize $3x_1 + x_2$ subject to $4x_1 + x_2 \ge 3$, $4x_1 + 3x_2 \le 6$, $x_1 + 2x_2 \le 3$, $x_1, x_2 \ge 0$.

3.
Implement the simplex method in a general program, use it on the following LP problems, and report your results and experience.
(a) Minimize $2x_1 + 3x_2$ subject to $4x_1 - 5x_2 \le 17$, $-3x_1 + 2x_2 + 10 \ge 7$, $x_1, x_2 \le 0$.
(b) Minimize $3x_1 + 4x_2$ subject to $3x_1 + 2x_2 \le 12$, $x_1 + 2x_2 \le 6$, $2x_1 - 7x_2 \ge 10$, $x_1, x_2 \ge 0$.

4. Maximize $f(x) = 2x_1 + 9x_2 + 3x_3$ subject to $x_1 + x_2 + x_3 \le 1$, $x_1 + 4x_2 + 2x_3 \le 2$, $x_1, x_2, x_3 \ge 0$; and find out the Lagrange multipliers corresponding to the constraints at the optimal point.

5. Consider the two-variable optimization problem
minimize $f(x, y) = c_1 x + c_2 y$
subject to $a_{11} x + a_{12} y = b_1$, $a_{21} x + a_{22} y \le b_2$, $y \ge 0$.
(a) Develop the Lagrangian for the problem and work out the KKT conditions.
(b) Use these conditions to express the optimal function value in terms of the Lagrange multipliers and determine its sensitivity to $b_1$ and $b_2$.
(c) Develop the dual problem.
(d) Work out the KKT conditions for this dual problem.

6. In three-dimensional space, we have a line segment with known end-points A $(a_1, a_2, a_3)$ and B $(b_1, b_2, b_3)$. Similarly, we have a triangle with known vertices P $(p_1, p_2, p_3)$, Q $(q_1, q_2, q_3)$ and R $(r_1, r_2, r_3)$. Formulate the problem of finding the closest distance between the line segment and the triangle as an optimization problem. Develop the KKT conditions for the problem. If a given pair of points (on the line segment and on the triangle) together satisfies the KKT conditions, can we say that this pair gives a local minimum for the distance?

7. Using a quadratic programming approach, solve the problem formulated in the previous exercise, for the triangle PQR with P(10, 0, 0), Q(0, 8, 0) and R(0, 0, 6), for the following cases of line segment AB: (i) A(1, -1, 1), B(6, 9, 6); (ii) A(1, 3, 8), B(3, 9, 12); and (iii) A(8, 5, 0), B(3, 1, 6). Attempt both active set and slack variable strategies with several starting solutions.

8.
Starting from the origin and using square trust regions by imposing artificial bounds on the variables (take the initial size as 0.4 units), use quadratic programming as the iterative step for the unconstrained minimization of the function 9(x1^2 − x2)^2 + (x1 − 1)^2. Use the exact gradient and Hessian for defining the quadratic model function at every iteration.

9. Consider a QP problem with two variables and a single inequality constraint (x, p ∈ R^2):
minimize f(x) = (1/2) x^T Q x − b^T x + c
subject to p^T x ≤ d.
(a) Write down the complete KKT conditions. Identify the number of unknowns to be solved for, and the number and type (linear/nonlinear) of equations and inequalities to be satisfied. (b) Develop formulas (in terms of Q, b, c, p, d, which are the data of the problem) for these unknowns if the constraint is inactive. (c) Develop formulas for these unknowns if the constraint is active. (d) Develop algorithmic steps to take care of both these cases for any given set of data, with positive definite Q.

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 10: A Case Study

Consider the NLP problem
minimize f(x) = 2(x1^2 + x2^2 − 1) − x1
subject to g1(x) = 4(x1 − 4)^2 + 9(x2 − 3)^2 − 36 ≤ 0,
g2(x) = 9(x1 − 4)^2 + 4(x2 − 3)^2 − 36 ≤ 0. (1)

1. Linearizing the objective function and constraint functions around the selected point (3, 2), set up the LP problem according to the Frank-Wolfe formulation.

2. Sketch and describe the domain of this LP problem.

3. Find the solution of this LP problem. Rather than solving the LP problem formally, you can identify the solution from the sketch and establish it convincingly.

4. Is the resulting point a feasible solution of the original NLP problem (1)? If not, then what can be done to obtain a feasible solution to proceed to the next iteration?

5. Conduct one more iteration of the above procedure.

6.
In a fresh attempt, starting from the original point (3, 2), linearize only the constraint functions, leaving the objective function as it is.

7. Solve the resulting quadratic programming problem.

8. Is the resulting point feasible for the original NLP problem? If not, then what would you do to obtain a feasible solution to proceed to the next iteration?

9. Your friend Reeta wants to solve the NLP problem (1) approximately, but refuses to learn any optimization algorithm. However, she is good at geometry and would not mind drawing a few ellipses and circles. Chalk out a clear and economical action plan for her to capture the optimal solution roughly through geometric construction.

10. Develop the diagram(s) Reeta would produce following your advice.

11. How many KKT points do you expect for the NLP problem (1): none, one, multiple or infinitely many? Support your answer with clear arguments.

12. Identify one KKT point of the problem. Use (and spell out) discretion in abandoning branches of fruitless calculation.

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 11: Methods of Constrained Optimization

1. We want to minimize f(x) = x^2 − 8x + 10 subject to x ≥ 6 by using the penalty function (1/2) c {max[0, g(x)]}^2, where g(x) = 6 − x. Minimize a sequence of penalized functions, with the penalty parameter values c = 0, 0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000.

2. Starting from the origin and using square trust regions by imposing artificial bounds on the variables (take the initial size as 0.4 units), use quadratic programming as the iterative step for the unconstrained minimization of the function 9(x1^2 − x2)^2 + (x1 − 1)^2. Use the exact gradient and Hessian for defining the quadratic model function at every iteration.

3. Consider the problem: minimize f(x) = 2(x1^2 + x2^2 − 1) − x1 subject to x1^2 + x2^2 − 1 = 0.
(a) Show that x* = [1 0]^T is the minimizer and find the associated Lagrange multiplier. (b) Suppose that xk = [cos θ sin θ]^T, where θ ≈ 0. Verify feasibility and closeness to optimality. (c) Set up and solve the corresponding quadratic program. (d) With a full Newton step xk+1 = xk + dk, examine feasibility at xk+1 and compare the function values at xk and xk+1. (e) From this exercise, can you draw any significant conclusion about an active set method?

4. Solve the NLP problem: minimize f(x) = −3x − 4y − 5z subject to x^2 + y^2 + (z − 1)^2 ≤ 4, x^2 + y^2 + (z + 1)^2 ≤ 4, (x − 1)^2 + y^2 + z^2 ≤ 4; by the cutting plane method.

5. A chain is suspended from two thin hooks that are 160 cm apart on a horizontal line. The chain consists of 20 links of steel, each 10 cm in length. The equilibrium shape of the chain is found by formulating the problem as
minimize Σ_{i=1}^{n} ci yi
subject to Σ_{i=1}^{n} yi = 0 and L − Σ_{i=1}^{n} √(l^2 − yi^2) = 0,
where ci = n − i + 1/2, n = 20, l = 10, L = 160. Derive the dual function for this problem and work out a complete steepest ascent formulation for maximizing the dual function, and hence solving the original problem. Implement this formulation in a steepest ascent loop and obtain the optimal values of the Lagrange multipliers, the equilibrium configuration and the corresponding (minimum) potential energy, i.e. Σ_{i=1}^{n} ci yi.

6. Starting from the origin (which is the unconstrained minimum point of the objective function), use the augmented Lagrangian method to minimize the function 5x1^2 + 4x1x2 + 3x2^2 over the domain defined by the constraints 2 sin x1 ≤ x2 ≤ 2 cos x1 and x1 + x2^2 = 15. (Try penalty parameter values 2, 10 and 20.)

7. Use an alternative method to solve the above NLP problem. Now, rather than using any formal method of constrained optimization, use variable elimination, study of the domain, function plots and common sense to crack the problem.
In brief, solve the problem the way you would if solving it were utterly necessary for your survival and you had not taken this optimization course. (You do not need to go all the way and plot contours like Reeta!)

8. Starting from the point (1, 1), perform two iterations of the feasible directions (Zoutendijk) method to find the point, farthest from the point C(1.5, 4), in the domain defined by 4.5x1 + x2^2 ≤ 18, 2x1 − x2 ≥ 1, x1, x2 ≥ 0.

9. Starting from the feasible solution (1, −2), and with initial line-search bound αU = 0.25, solve the problem: minimize x1^2 + 2x2^2 subject to x1^2 + x2^2 ≥ 5 by the generalized reduced gradient method, using (a) the slack variable strategy, (b) the active set strategy.

10. Use the active set formulation of the gradient projection method to solve the NLP problem: minimize (x1 − 1)^2 + 4(x2 − 3)^2 subject to x1^2 + x2^2 ≤ 5, (x1 − 1)^2 + x2^2 ≥ 1, with initial line-search bound αU = 0.25 and starting point (a) (0, 0) and (b) (1, 1.5).

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 12: Miscellaneous Topics

1. Minimize f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 − 6x − 7y − 8z + 9 if z can take only integer values. Repeat the exercise, considering y and z both taking values from the set {0, 1.2, 2.8, 3.9, 6.2}.

2. For a particle of mass m, define the (Lagrangian) function L(t, x, y, ẋ, ẏ) = (m/2)(ẋ^2 + ẏ^2) − mgy and develop the (action) integral s = ∫ L dt. The problem is to determine the trajectory x(t), y(t) of the particle from (0, 0) at time t = 0 to (a, b) at time t = T, along which s is minimum, or at least stationary. (a) Verify that x(t) = αt(t − T) + at/T, y(t) = βt(t − T) + bt/T is a feasible trajectory, and develop the function s(α, β). (b) Formally find the values of α and β that minimize s, and hence determine the required trajectory x(t), y(t). Find ẋ, ẏ, ẍ, ÿ. Which law of physics did you just derive, in a way?
(c) Now, bypass the work of the two previous steps and work on a more direct theme. Work out the variation δs resulting from arbitrary variations δx(t), δy(t) that respect the given boundary conditions, along with consistent variations in their rates. Insist on δs = 0 to derive the same result as above. [Hint: To get rid of the δẋ and δẏ terms, integrate the corresponding terms by parts.]

3. We want to solve the Blasius problem in the form
f'''(x) + f(x) f''(x) = 0, f(0) = f'(0) = 0, f'(5) = 1
by the Galerkin method. Let us choose x^2, x^3, ..., x^8 as the basis functions, which already satisfy the first two conditions. Taking 1, x, x^2, ..., x^5 as trial functions and using the boundary condition at x = 5, determine the solution.
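The Galerkin setup of the last problem gives six weighted-residual equations plus the derivative boundary condition, i.e. seven equations in the seven coefficients c2, ..., c8 of f(x) = Σ ck x^k. A minimal numerical sketch of this system is given below; the function names and the use of SciPy's solver and quadrature routines are my own choices, not part of the assignment, and convergence of the root-finder depends on the initial guess.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import fsolve

def make_f(c):
    """Build f(x) = c2 x^2 + ... + c8 x^8 and its first three derivatives.
    The zero constant and linear terms enforce f(0) = f'(0) = 0 automatically."""
    f = np.polynomial.Polynomial(np.concatenate(([0.0, 0.0], c)))
    return f, f.deriv(1), f.deriv(2), f.deriv(3)

def equations(c):
    """Six Galerkin equations with trial functions 1, x, ..., x^5,
    plus the remaining boundary condition f'(5) = 1."""
    f, f1, f2, f3 = make_f(c)
    resid = lambda x: f3(x) + f(x) * f2(x)   # Blasius residual f''' + f f''
    eqs = [quad(lambda x: resid(x) * x**j, 0.0, 5.0)[0] for j in range(6)]
    eqs.append(f1(5.0) - 1.0)
    return eqs

c0 = np.zeros(7)
c0[0] = 0.1                 # rough initial guess for c2 (hypothetical choice)
c = fsolve(equations, c0)   # seven equations, seven unknowns
```

With a solution vector `c` in hand, `make_f(c)[1](5.0)` can be checked against the boundary value 1 as a quick sanity test of the computed approximation.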