Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 0: Mathematical Background

1. Given that
$\begin{bmatrix} a & 0 & 0 \\ b & d & 0 \\ c & e & f \end{bmatrix} \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} = \begin{bmatrix} 4 & 2 & 4 \\ 2 & 2 & 2 \\ 4 & 2 & 3 \end{bmatrix},$
find out the values of a, b, c, d, e and f. (For a square root, select the positive value.) [What you just attempted is called Cholesky decomposition and works as desired for symmetric positive definite matrices.]

2. Consider three vectors $u_1 = [2\ 0\ {-1}\ 1]^T$, $u_2 = [1\ 2\ 0\ 3]^T$ and $u_3 = [3\ 0\ {-1}\ 2]^T$.
(a) Find the unit vector $v_1$ along $u_1$.
(b) From $u_2$, subtract its component along $v_1$ (which will have magnitude $v_1^T u_2$) and hence find the unit vector $v_2$ such that vectors $v_1, v_2$ form an orthonormal basis for the subspace spanned by $u_1, u_2$.
(c) Similarly, find a vector $v_3$ which, together with $v_1$ and $v_2$, forms an orthonormal basis for the subspace spanned by all the three vectors $u_1, u_2, u_3$.
(d) Find a vector $v_4$ to complete this basis for the entire space $R^4$.
(e) Write a generalized algorithm for building up the vectors $v_1, v_2, \cdots, v_l$, $l \le m$, when the given $m$ vectors $u_1, u_2, \cdots, u_m$ are in $R^n$, $m < n$.
[This process is called Gram-Schmidt orthogonalization and is used for building orthonormal bases for prescribed subspaces.]

3. Find an orthonormal basis for the range space of the linear transformation defined by the matrix
$A = \begin{bmatrix} 2 & 1 & 3 & 4 \\ 3 & 0 & -2 & 2 \\ 5 & 1 & 1 & 6 \end{bmatrix}.$

4. A surveyor reaches a remote valley to prepare records of land holdings. The valley is a narrow strip of plain land between a mountain ridge and the sea, and the local people use a local and antiquated system of measures. They have two distant landmarks: the lighthouse and the high peak. To mention the location of any place, they typically instruct: so many bans towards the lighthouse and so many kos towards the high peak.
Upon careful measurement, the surveyor and his assistants found that (a) one bans is roughly 200 m, (b) one kos is around 15 km, (c) the lighthouse is 10 degrees south of east, and (d) the high peak is 5 degrees west of north. The surveyor's team, obviously, uses the standard system, with unit distances of 1 km along east and along north. Now, to convert the local documents into the standard system, and to convey intended locations to the locals in their own system, work out (a) a conversion formula from the valley system to the standard system, and (b) another conversion formula from the standard system to the valley system.

5. Given that
$\begin{bmatrix} 1 & 0 & 0 \\ b & 1 & 0 \\ c & f & 1 \end{bmatrix} \begin{bmatrix} a & d & g \\ 0 & e & h \\ 0 & 0 & i \end{bmatrix} = \begin{bmatrix} 5 & 2 & 1 \\ 10 & 6 & 1 \\ 5 & -4 & 7 \end{bmatrix},$
find out the values of a, b, c, d, e, f, g, h and i. [This is the celebrated Crout-Doolittle LU decomposition without pivoting.]

6. For $n \times n$ matrices Q, R and A, consider the matrix multiplication $QR = A$ column-wise and observe that
$r_{1,k}\, q_1 + r_{2,k}\, q_2 + r_{3,k}\, q_3 + \cdots + r_{n,k}\, q_n = a_k.$
For the matrix
$A = \begin{bmatrix} 6 & 5 & -1 & 0 \\ 6 & 5 & -1 & 6 \\ 6 & 1 & 1 & 0 \\ 6 & 1 & 1 & 2 \end{bmatrix},$
write out the column equations one by one and determine the corresponding columns of an orthogonal Q and an upper triangular R. (Note: There is no trick in this problem. Never stop in between. The process of QR decomposition always works — till the end!)

7. For a bilinear form $p(x, y) = x^T A y$ of two vector variables $x \in R^m$ and $y \in R^n$, find out $\frac{\partial p}{\partial x_i}$, $\frac{\partial p}{\partial y_i}$; and hence the vector gradients $\frac{\partial p}{\partial x}$ and $\frac{\partial p}{\partial y}$. As a corollary, derive the partial derivative $\frac{\partial q}{\partial x_i}$ and vector gradient of a quadratic form $q(x) = x^T A x$.

8. Check whether the matrix
$\begin{bmatrix} 4 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 2 \end{bmatrix}$
is a positive definite matrix.

9. Consider the matrix
$P = \begin{bmatrix} 2 & 3 & 1 \\ a+b & b-a & 3a+b \end{bmatrix}.$
(a) For which values of a and b is $PP^T$ positive definite?
(b) For which values of a and b is $P^T P$ positive definite?

10. Consider the matrix
$A = \begin{bmatrix} 80 & -60 \\ 36 & -27 \\ -48 & 36 \end{bmatrix}.$
(a) Construct $A^T A$ and determine its eigenvalues $\lambda_1, \lambda_2$ (number them in descending order, for convenience) and corresponding eigenvectors $v_1, v_2$, as an orthonormal basis of $R^2$.
(b) Define $\sigma_k = \sqrt{\lambda_k}$, form a diagonal matrix with $\sigma_1$ and $\sigma_2$ as the diagonal elements and extend it (with additional zeros) to a matrix $\Sigma$ of the same size as A.
(c) Assemble the eigenvectors into an orthogonal matrix as $V = [v_1\ v_2]$ and find any orthogonal matrix U satisfying $A = U \Sigma V^T$.
(d) Identify the null space of A in terms of columns of V.
(e) Identify the range space of A in terms of columns of U.
(f) How does a system of equations $Ax = b$ transform if the bases for the domain and the co-domain of A change to V and U, respectively?
[This powerful decomposition of matrices for solution, optimization and diagnostics of linear systems is called singular value decomposition (SVD).]

11. Find the characteristic polynomial of the following matrix and mention its significance.
$\begin{bmatrix} 0 & 0 & \cdots & 0 & -a_n \\ 1 & 0 & \cdots & 0 & -a_{n-1} \\ 0 & 1 & \cdots & 0 & -a_{n-2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_1 \end{bmatrix}$

12. Eigenvalues of matrix A are 1.1, 1, 0.9 and the corresponding eigenvectors are $[1\ 0\ 1]^T$, $[1\ 2\ {-1}]^T$, $[1\ 1\ 1]^T$. Compute A and $A^6$.

13. Let $f(x)$ be a scalar function of a vector variable $x \in R^n$. Let $Q \in R^{n \times n}$ be an orthogonal matrix, such that its columns $q_1, q_2, \cdots, q_n$ form an orthonormal basis of $R^n$.
(a) For small $\alpha$, find out $f(x + \alpha q_j) - f(x)$.
(b) Hence, show that the directional derivative $\frac{\partial f}{\partial q_j} = q_j^T \nabla f(x)$.
(c) Now, compose the vector resultant $\sum_{j=1}^{n} \frac{\partial f}{\partial q_j} q_j$ and show that it equals $\nabla f(x)$.

14. A function $f(x)$ of two variables has been evaluated at the following points.
$f(1.999, 0.999) = 7.352232$, $f(2, 0.999) = 7.359574$, $f(2.001, 0.999) = 7.366922$;
$f(1.999, 1) = 7.381671$, $f(2, 1) = 7.389056$, $f(2.001, 1) = 7.396449$;
$f(1.999, 1.001) = 7.411257$, $f(2, 1.001) = 7.418686$, $f(2.001, 1.001) = 7.426124$.
Find out the gradient and Hessian of the function at the point (2, 1). How many function values did you have to use for each of them?

15. Let $P(x)$ be a polynomial on which successive synthetic division by a chosen quadratic polynomial $x^2 + px + q$ produces the successive quotients $P_1(x), P_2(x)$ and remainders $rx + s$, $ux + v$, such that
$P(x) = (x^2 + px + q) P_1(x) + rx + s,$
$P_1(x) = (x^2 + px + q) P_2(x) + ux + v.$
(a) Observing that the expressions $P_1(x), P_2(x)$ and the numbers r, s, u, v all depend upon p and q in the chosen expression $x^2 + px + q$, differentiate the expression for $P(x)$ above partially with respect to p and q, and simplify to obtain expressions for $\frac{\partial r}{\partial p}, \frac{\partial r}{\partial q}, \frac{\partial s}{\partial p}, \frac{\partial s}{\partial q}$. [Hint: At its roots, a polynomial evaluates to zero.]
(b) Frame the Jacobian J of $[r\ s]^T$ with respect to $[p\ q]^T$ and work out an iterative algorithm based on the first order approximation to iterate over the parameters p and q for obtaining $r = s = 0$. In brief, work out an iterative procedure to isolate a quadratic factor from a polynomial. [This is Bairstow's method, often found to be an effective way to solve polynomial equations.]
(c) Implement the procedure to find all roots of the polynomial $P(x) = x^6 - 19x^5 + 125x^4 - 329x^3 + 66x^2 + 948x - 216$ up to two places of decimal, starting with $p = 0$ and $q = 0$, i.e. $x^2$ as the initial divisor expression.

16. Function $f(t)$ is being approximated in the interval [0, 1] by a cubic interpolation formula in terms of the boundary conditions as
$f(t) = [f(0)\ f(1)\ f'(0)\ f'(1)]\, W\, [1\ t\ t^2\ t^3]^T.$
Determine the matrix W.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 1: Important Identities and Results

1. For an $m \times n$ matrix A, $m < n$, $P = (AA^T)^{-1}$ has already been computed. Then, an additional row $a^T$ is appended in A, such that the matrix A gets updated to $\bar{A}$.
In terms of P, develop an update formula in the form
$\bar{P} = (\bar{A}\bar{A}^T)^{-1} = \begin{bmatrix} Q & b \\ b^T & \epsilon \end{bmatrix}, \quad \text{where } \bar{A} = \begin{bmatrix} A \\ a^T \end{bmatrix}.$
Similarly, develop a working rule to update $\bar{P}$ from available P, if $\bar{A}$ is obtained by dropping the last row of A. [In the active set strategy of nonlinear optimization, such updates are routinely involved while including and excluding inequality constraints in and from the active set of constraints.]

2. Let an $(n-m)$-dimensional subspace be defined in $R^n$ as $M = \{d : Ad = 0\}$, where $A \in R^{m \times n}$ is full-rank. Show that the orthogonal projection of a vector to this subspace is accomplished by the transformation $P = I_n - A^T (AA^T)^{-1} A$. [Try to derive the transformation, apart from simply verifying.]

3. For an invertible matrix $A \in R^{n \times n}$ and non-zero column vectors $u, v \in R^n$,
(a) find out the rank of the matrix $uv^T$, and
(b) prove the following identity (Sherman-Morrison formula) and comment on its utility.
$(A - uv^T)^{-1} = A^{-1} + A^{-1} u\, (1 - v^T A^{-1} u)^{-1} v^T A^{-1}.$

4. In a certain application, the inverse of the matrix $I_n + cA^T A$ is required, where A is a full-rank $m \times n$ matrix ($m < n$) and c is a large number. For two reasons, a direct inversion of this matrix is not very advisable. For indirectly obtaining the inverse, prove the identity
$(I_n + cA^T A)^{-1} = I_n - cA^T (I_m + cAA^T)^{-1} A$
and verify it for $c = 10$, $n = 3$, $m = 2$ and $A = [2e_1\ 3e_2\ 0]$. Can you figure out what the two reasons are?

5. For a real invertible matrix A, show that $A^T A$ is positive definite. Further, show that, for $\nu > 0$, $A^T A + \nu^2 I$ is positive definite for any real matrix A.

6. For solving $Ax = b$ with symmetric positive definite matrix A, we formulate the error vector $e = Ax - b$, start from an arbitrary point $x_0$ and iterate along selected directions $d_0, d_1, d_2$ etc., as $x_{k+1} = x_k + \alpha_k d_k$, for $k = 0, 1, 2, \cdots$.
(a) Denoting $e_k = Ax_k - b$, determine $\alpha_k$ such that $d_k^T e_{k+1} = 0$, i.e. the step along a chosen direction eliminates any error along that direction in the next iterate.
(b) Then, find out $d_0^T e_2$, $d_0^T e_3$ and $d_1^T e_3$, i.e. errors along old directions.
(c) Generalize the observation for the k-th step.
(d) Work out the conditions that the chosen directions must satisfy such that errors along the old directions also vanish.
[These conditions characterize these directions as conjugate directions for the matrix A.]

7. For $n \times n$ symmetric positive definite L and $m \times n$ full-rank A ($m < n$), prove the identity
$A(L + cA^T A)^{-1} A^T (I_m + cAL^{-1}A^T) = AL^{-1}A^T$
and use the result to prove that an eigenvector v of $AL^{-1}A^T$ with eigenvalue $\sigma$ is also an eigenvector of $A(L + cA^T A)^{-1} A^T$ with corresponding eigenvalue $\frac{1}{c + 1/\sigma}$. [This result has a valuable implication in the theory behind a powerful duality-based algorithm (the augmented Lagrangian method) of nonlinear optimization.]

8. Plot contours of the function $f(x) = x_1^2 x_2 - x_1 x_2 + 8$ in the region $0 < x_1 < 3$, $0 < x_2 < 10$. Develop a quadratic approximation of the function around (2, 5) and superimpose its contours with those of $f(x)$. Are the contour curves of this quadratic approximation elliptic, parabolic or hyperbolic?

9. Find a solution of the equation $e^{-x} = x$ up to two places of decimal. Is the solution unique?

10. Let A be an $m \times n$ matrix ($m < n$) of rank m and let L be an $n \times n$ symmetric positive definite matrix. Then, show that the $(n+m) \times (n+m)$ matrix
$H = \begin{bmatrix} L & A^T \\ A & 0 \end{bmatrix}$
is non-singular, but indefinite.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 2: Optimization Problems and Algorithms

1. Mrs. Anna D'Souza (height 170 cm) called our office requesting an economical size and the best positioning for a mirror that she would get fixed on a door of her bedroom almirah.
Considering that the cost of the mirror is proportional to its area and taking appropriate estimates of dimensions, margins etc., formulate an optimization problem and work out a suitable height, width and positioning of the mirror in which she can view her full image.

2. Design a minimum cost cylindrical tank closed at both ends to contain a fixed volume V of fluid. Assume that the cost depends directly on the area A of sheet metal.

3. Find out the maximum volume of a tank of the above kind that has a given surface area A. Show that the relationship between surface area and volume for the optimal design in this case is equivalent to that in the earlier problem.

4. Design a tank of the above kind for a volume of 250 m³, incorporating the assembly constraint $H \le 10 - D/2$ appearing from the plan of locating it at a particular place in a shed.

5. Using a graphical method, solve the problem
minimize $f(x) = x_1^2 + x_2^2 - 4x_1 + 4$
subject to $x_1 - 2x_2 + 6 \ge 0$, $x_1^2 - x_2 + 1 \le 0$, $x_1, x_2 \ge 0$.

6. Show that the algorithm
$x_{k+1} = A(x_k) = \begin{cases} \frac{1}{2}(x_k + 2) & \text{for } x_k > 1, \\ \frac{1}{4} x_k & \text{for } x_k \le 1 \end{cases}$
for the problem: minimize $\phi(x) = |x|$, is not globally convergent. Explain the reason.

7. Find the order of convergence and convergence ratio of the sequence $\{x_k\}_{k=0}^{\infty}$ if
(a) $x_k = \alpha^k$ for $0 < \alpha < 1$;
(b) $x_k = \alpha^{2^k}$.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 3: Univariate Optimization

1. Identify the stationary points of the following function using the exact method: $f(x) = (x-1)^2 - 0.01x^4$. Check if they are minimum or maximum points.

2. Find out the stationary points of the function $f(x) = 5x^6 - 36x^5 + \frac{165}{2}x^4 - 60x^3 + 36$ and analyze their nature.

3. Maximize $f(x) = -x^3 + 3x^2 + 9x + 10$ in the interval $-2 \le x \le 4$.

4. Implement the bounding phase method (refer to the book by Deb for the algorithm) to bracket a minimum point of the function $f(x) = e^x - x^3$.
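The bracketing strategy of this exercise can be sketched in a few lines of Python. This is a minimal version under the usual step-doubling scheme; the function name `bounding_phase` and the default initial step are illustrative choices, not part of the assignment:

```python
import math

def bounding_phase(f, x0, delta=0.5):
    """Bracket a local minimum of f by step doubling (bounding phase idea)."""
    f_minus, f0, f_plus = f(x0 - delta), f(x0), f(x0 + delta)
    if f_minus >= f0 <= f_plus:
        return (x0 - delta, x0 + delta)   # x0 already sits between higher values
    if f_plus > f0:
        delta = -delta                     # downhill lies to the left
    k, x_prev, x_cur = 0, x0, x0 + delta
    while f(x_cur) < f(x_prev):            # double the step while still descending
        k += 1
        x_prev, x_cur = x_cur, x_cur + (2 ** k) * delta
    x_before = x_prev - (2 ** (k - 1)) * delta if k > 0 else x0 - delta
    return tuple(sorted((x_before, x_cur)))

# Bracket a minimum of f(x) = e^x - x^3, starting near x = 1:
print(bounding_phase(lambda x: math.exp(x) - x**3, 1.0, 0.5))   # -> (1.5, 4.5)
```

From $x_0 = 1$ with step 0.5 this returns the bracket (1.5, 4.5), which contains the local minimum of $e^x - x^3$ near $x \approx 3.73$.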
Next, use the program to bracket minima of the functions in the first three exercises above.

5. Over the bracket arrived at for $f(x) = e^x - x^3$, use two iterations of the Fibonacci search and golden section search methods. Implement one of these (whichever you like through this experience) in a function and use it for all the four functions above up to an accuracy of $10^{-3}$.

6. Over the bracket arrived at for $f(x) = e^x - x^3$, use two iterations of the regula falsi and Newton's methods. Implement one of these (whichever you like through this experience) in a function and use it for all the four functions above up to an accuracy of $10^{-3}$.

7. Compare the above two favourite algorithms from the two classes in terms of their performance on the set of these four functions.

8. Identify the regions over which the function $e^{-x^2}$ is convex and where it is concave. Determine its global minima and maxima.

9. Find all the maxima and minima of the function $\phi(x) = 25(x - \frac{1}{2})^4 - 2(x - \frac{1}{2})^2$, identify its other salient features and sketch its graph.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 4: Fundamentals of Multivariate Optimization

1. Consider the function $f : R^2 \to R$ defined by $f(x) = 2x_1^2 - x_1^4 + x_1^6/6 + x_1 x_2 + x_2^2/2$. Find out all the stationary points and classify them as local minimum, local maximum and saddle points.

2. The Hessian $H(x)$ of a function $f(x)$ is positive semi-definite everywhere.
(a) Using Taylor's theorem (in the remainder form) around the point $x_2$ as
$f(x_1) = f(x_2) + [\nabla f(x_2)]^T (x_1 - x_2) + \frac{1}{2}(x_1 - x_2)^T H[x_2 + \alpha(x_1 - x_2)]\,(x_1 - x_2)$
for $\alpha \in [0, 1]$, show that $f(x_1) \ge f(x_2) + [\nabla f(x_2)]^T (x_1 - x_2)$.
(b) Does the argument remain valid if we know only that the Hessian is positive semi-definite (or positive definite) at the point $x_2$, and not everywhere? Why?
(c) Now, consider an arbitrary line through $x_2$ and select two points y and z on it, on opposite sides of $x_2$. Writing the result of part (a) with y and z as $x_1$ in turn, show that $\beta f(y) + (1 - \beta) f(z) \ge f[\beta y + (1 - \beta) z]$ for some $\beta \in [0, 1]$.
(d) Does the above inequality hold for all $\beta \in [0, 1]$? Why?
(e) Using the definition of a convex function, summarize the entire result in a single sentence.

3. Find the domain in which the function $9(x_1^2 - x_2)^2 + (x_1 - 1)^2$ is convex.

4. (a) Develop a quadratic model of the function $9(x_1^2 - x_2)^2 + (x_1 - 1)^2$ around the origin.
(b) Superimpose the contours of the original function and the quadratic model. (Use a software, e.g. MATLAB, to develop the contours.)
(c) With a circular trust region of radius 0.2 unit, mark the point where a step from the origin should reach.
(d) Obtain the coordinates of this point from the plot and repeat the entire process for one more step.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 5: Basic Methods of Multivariate Optimization

1. Use Nelder and Mead's simplex search method, starting from the origin, to find the minimum point of the function $f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 - 6x - 7y - 8z + 9$.

2. For minimizing the function $f(x) = (x_1 - x_2)^2 + (1 - x_1)^2 + (2x_1 - x_2 - x_3)^2$, consider the origin as the starting solution, i.e. $x_0 = [0\ 0\ 0]^T$.
(a) Evaluate the function $f(x_0)$ and gradient $g(x_0)$ at this point. Using the negative gradient as the search direction, define a new function $\phi(\alpha) = f(x_0 - \alpha g(x_0))$.
(b) Find out the minimizer $\alpha_0$ of $\phi(\alpha)$ and update $x_1 = x_0 - \alpha_0 g(x_0)$.
(c) Similarly, carry out two more such iterations, i.e. find out $f(x_k)$, $g(x_k)$, $\alpha_k$ and $x_{k+1}$ for $k = 1, 2$.
(d) Tabulate the results and analyze them in terms of function value as well as distance from the actual minimum point.

3.
Starting from the point $x_0 = [2\ {-1}\ 0\ 1]^T$, minimize the function
$f(x) = (x_1 + 2x_2 - 1)^2 + 5(x_3 - x_4)^2 + (x_2 - 3x_3)^4 + 10(x_1 - x_4)^4$
by (a) the Hooke-Jeeves method with unit initial step size and (b) the steepest descent method.

4. Show that, with $x_0 = [c\ 1]^T$, steepest descent iterations for the function $f(x) = x_1^2 + cx_2^2$, $c > 0$, are given by $x_m = a_m [c\ (-1)^m]^T$, where $a_m = (c-1)^m / (c+1)^m$. Comment on the behaviour of the method for large values of c.

5. What are the starting points from which a single iteration of the steepest descent algorithm would converge to the minimum point of the function $5x_1^2 + 4x_1 x_2 + 3x_2^2$?

6. While minimizing the function $f(x) = 4x_1^2 - 5x_1 x_2 + 3x_2^2 + 6x_1 - 2x_2$, a transformation of the form $x_1 = ay_1 + by_2$, $x_2 = cy_1 + dy_2$ was used. For the reformulated problem in terms of the variables $y_1, y_2$, the steepest descent method was found to converge to the minimum in a single iteration from an arbitrary starting solution. Find out the conditions that the coefficients a, b, c, d must satisfy for this to happen.

7. (a) Identify the stationary points of the function $f(x) = [x_1^2 + (x_2 + 1)^2][x_1^2 + (x_2 - 1)^2]$.
(b) Classify the stationary points and find out the minimum value(s).
(c) Starting from the point (1, 1), execute a Newton's step and evaluate the step as acceptable or otherwise, in terms of reduction in function value.

8. Consider the function $f(x) = (x_1^2 - x_2)^2 + (x_1 - 1)^2$ and the origin as the starting point.
(a) Determine a step of the pure Newton's method.
(b) Is it a descent step?
(c) Is the associated direction a descent direction?

9. For minimizing the function $f(x) = (x_1^2 - x_2)^2 + (1 - x_1)^2$, perform one iteration of Newton's method from the starting point $[2\ 2]^T$ and compare this step with the direction of the steepest descent method, regarding approach towards the optimum.

10. (a) Develop the first order necessary condition for a minimum point of the function $E(x) = \frac{1}{2}\|Ax - b\|^2$.
(b) Is the resulting system of equations necessarily consistent? Why?
(c) At a solution of this system, does the function necessarily have a minimum value? Why?
(d) Discuss the distribution of minima when $A^T A$ is singular.

11. Solve the following systems of equations by formulating them as optimization problems:
(a) $x^2 - 5xy + y^3 = 2$, $x + 3y = 6$; and
(b) $ze^x - x^2 = 10y$, $x^2 z = 0.5$, $x + z = 1$.

12. Starting from $x = [1\ 1\ 1]^T$, solve the system of equations $16x_1^4 + 16x_2^4 + x_3^4 = 16$, $x_1^2 + x_2^2 + x_3^2 = 3$, $x_1^3 - x_2 = 0$ by Newton's method and the Levenberg-Marquardt method.

13. Find constants $a_1, a_2, a_3, a_4$ and $\lambda$ for a least square fit of the following tabulated data in the form $a_1 + a_2 x + a_3 x^2 + a_4 e^{\lambda x}$.
x:  0   1   2   3   4   5   6   7   8
y: 20  52  69  76  74  67  55  38  17
[Hint: You may attempt it as a five-variable least square problem or as a single-variable optimization problem with a linear least square problem involved in the function evaluation.]

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 6: Multivariate Optimization Methods

1. For the function $f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 - 6x - 7y - 8z + 9$, develop expressions for the gradient and Hessian, and work out three conjugate directions through a Gram-Schmidt procedure. Now, starting from the origin, conduct four sets of line searches along (a) $e_1, e_2, e_3$; (b) three successive steepest descent directions; (c) the three conjugate directions developed; and (d) three directions recommended by the conjugate gradient method. In each case, trace the function values and gradient norms through the steps.

2. Starting from the point $x_0 = [2\ {-1}\ 0\ 1]^T$, minimize the function $f(x) = (x_1 + 2x_2 - 1)^2 + 5(x_3 - x_4)^2 + (x_2 - 3x_3)^4 + 10(x_1 - x_4)^4$ by (a) the Polak-Ribiere/Fletcher-Reeves method and (b) Powell's conjugate direction method, and compare their performance in terms of the number of function evaluations.

3.
Following is an excerpt from the record of line searches in a run of Powell's conjugate directions method applied in a two-variable problem.
$\cdots \to (2, 5) \to (2.9, 6.2) \to (4.2, 6.2) \to (4.5, 6.6) \to (4.9, p) \to (5.05, q) \to (5.09, r) \to \cdots$
What are the values of p, q and r?

4. Starting from the origin and taking the identity matrix as the initial estimate of the Hessian inverse, apply a few steps of the DFP method on the Himmelblau function $f(x_1, x_2) = (x_1^2 + x_2 - 11)^2 + (x_1 + x_2^2 - 7)^2$. Show the progress of the iterations superimposed with a contour plot, and record the development of the inverse Hessian estimate.

5. Using the same starting point, apply a conjugate gradient (Polak-Ribiere or Fletcher-Reeves) method and a quasi-Newton (DFP or BFGS) method to minimize the function $f(x) = [x_1^2 + (x_2 + 1)^2][x_1^2 + (x_2 - 1)^2]$ to an accuracy of $10^{-6}$. In each case, apart from the function value and gradient, also evaluate the exact Hessian (which is not needed and not to be used for the iteration process) after every line search and examine the relation of the two current search directions (previous and next) with the local Hessian. Conduct such experiments with at least two starting points.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 7: Theory of Optimization Methods (optional)

1. For minimizing $f(x) = \frac{1}{2} x^T A x + b^T x$, with positive definite Hessian matrix A, the starting point has been taken as $x_0$ and the starting direction as $d_0 = -g_0$, the negative gradient.
(a) Determine $\alpha_0$ such that $x_1 = x_0 + \alpha_0 d_0$ minimizes the function over the line from $x_0$ along the direction $d_0$.
(b) If at $x_1$ the gradient $g_1 \ne 0$, then determine $\beta_0$ that will make the new direction $d_1 = -g_1 + \beta_0 d_0$ conjugate to $d_0$.
(c) Show that the subspace spanned by $g_0$ and $g_1$ is the same as that spanned by $g_0$ and $Ag_0$, which is also the same as that spanned by $d_0$ and $d_1$, i.e.
$\langle g_0, g_1 \rangle = \langle g_0, Ag_0 \rangle = \langle d_0, d_1 \rangle.$
(d) Next, determine $\alpha_1$ such that $x_2 = x_1 + \alpha_1 d_1$ minimizes the function over the line from $x_1$ along the direction $d_1$.
(e) If $g_2 \ne 0$, then determine $\beta_1$ that will make $d_2 = -g_2 + \beta_1 d_1$ conjugate to $d_1$. Show that the resulting direction $d_2$ is conjugate to $d_0$ as well.
(f) Show the following identity concerning the subspace traversed so far.
$\langle g_0, g_1, g_2 \rangle = \langle g_0, Ag_0, A^2 g_0 \rangle = \langle d_0, d_1, d_2 \rangle.$
(g) In the same manner, determine $\alpha_2$, $x_3$, $\beta_2$ and $d_3$. Establish the conjugacy of $d_3$ to the old directions and the identity similar to the above, over the expanded subspace $\langle d_0, d_1, d_2, d_3 \rangle$.
(h) Finally, write the general result in terms of the step index k and prove it. [In your arguments, you may use the expanding subspace theorem, which has been proved earlier independently.]

2. Noting that the rank-one update on the inverse Hessian
$B_{k+1} = B_k + a_k (p_k - B_k q_k)(p_k - B_k q_k)^T = B_k + a_k \left( p_k p_k^T - B_k q_k p_k^T - p_k q_k^T B_k + B_k q_k q_k^T B_k \right),$
with $a_k = \frac{1}{p_k^T q_k - q_k^T B_k q_k}$, fulfils the key requirement of ensuring the equality $B_{k+1} q_k = p_k$ but fails to guarantee continued positive definiteness, we propose to generalize the update formula as
$B_{k+1} = B_k + \alpha\, p_k p_k^T - \beta\, B_k q_k p_k^T - \gamma\, p_k q_k^T B_k + \delta\, B_k q_k q_k^T B_k.$
(a) Determine the conditions on the coefficients $\alpha, \beta, \gamma$ and $\delta$ so as to fulfil the basic requirement, which is to ensure $B_{k+1} q_k = p_k$.
(b) Further, taking $\gamma = \beta$ for symmetry of $B_{k+1}$, solve the above equations for $\alpha$ and $\delta$ in terms of $\beta$.
(c) Show that the resulting update formula for $B_{k+1}$ in terms of the only remaining free parameter $\beta$ is equivalent to the Broyden family of update formulae.
(d) Imposing one more condition on the coefficients, special cases can be derived. Verify that
i. the condition $\alpha = \delta$ leads to the degenerate rank-one update,
ii. the condition $\beta = 0$ gives the DFP update formula, and
iii. the condition $\delta = 0$ results in the BFGS formula.
(e) Assuming $B_k$ to be positive definite, denote $\sqrt{B_k}\, x = u$ and $\sqrt{B_k}\, q_k = v$ for an arbitrary vector x and develop a simplified expression for $x^T B_{k+1} x$.
(f) Show that $p_k^T q_k = \alpha_k g_k^T B_k g_k$ and hence determine a bound on the value of $\beta$ that will ensure $B_{k+1}$ to be positive definite.
(g) Show that, with any $\beta$ satisfying the above bound for the update of $B_{k+1}$, any starting point $x_0$ and any positive definite matrix $B_0$, the quasi-Newton iterations
$d_k = -B_k g_k$; line search for $\alpha_k$; $p_k = \alpha_k d_k$; $x_{k+1} = x_k + p_k$; $q_k = g_{k+1} - g_k$;
applied on a convex quadratic problem, with constant positive definite Hessian H, lead to the following additional properties:
$p_i^T H p_k = 0$, $q_k^T B_k q_i = 0$ and $B_{k+1} q_i = p_i$ for $0 \le i < k$.

3. The BFGS update on the inverse Hessian is not a rank-two update, but the equivalent of a rank-two update on the corresponding Hessian. To derive it, consider a DFP-like update of the Hessian as
$H_{k+1} = H_k + \frac{q_k q_k^T}{q_k^T p_k} - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k}$
and apply the following two steps of the Sherman-Morrison formula to modify its inverse.
(a) First, put $A = H_k$, $A^{-1} = B_k$, $u = \mu q_k$, $v = \mu q_k$, where $\mu^2 = \frac{1}{q_k^T p_k}$, and develop the intermediate update $B' = (A + uv^T)^{-1}$, which is the inverse of $H' = H_k + \frac{q_k q_k^T}{q_k^T p_k}$.
(b) Next, put $A = H'$, $A^{-1} = B'$, $u = -\nu H_k p_k$, $v = \nu H_k p_k$, where $\nu^2 = \frac{1}{p_k^T H_k p_k}$, and work out the final update as $B_{k+1} = (A + uv^T)^{-1}$, which is now the inverse of
$H_{k+1} = H' - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k} = H_k + \frac{q_k q_k^T}{q_k^T p_k} - \frac{H_k p_k p_k^T H_k}{p_k^T H_k p_k}.$

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 8: Framework of Constrained Optimization

1. Sketch the feasible region described by the constraints $x_1 - x_3 + 1 = 0 = x_1^2 + x_2^2 - 2x_1$. Identify irregular points of the constraints, if any. Can you reduce an optimization problem over this domain to a 2-variable problem? To a single-variable problem? How?

2.
Reduce the problem
minimize $f(x)$ subject to $Ax = b$, $g_i(x) \le 0$ for $1 \le i \le q$,
to a k-variable problem (where $k = n - p$), given that $A \in R^{p \times n}$ has full row rank.

3. We want to determine the minimum sheet metal needed to construct a right cylindrical can (including bottom and cover) of capacity at least 1.5 litres, with diameter between 5 cm and 12 cm, and height between 10 cm and 18 cm. Making sensible assumptions, write down the KKT conditions, identify the salient points of the domain, where constraint boundaries meet, as KKT candidate points and then test the conditions on those points.

4. For the problem
minimize $f(x) = 0.01x_1^2 + x_2^2$
subject to $g_1(x) = 25 - x_1 x_2 \le 0$, $g_2(x) = 2 - x_1 \le 0$;
obtain the solution using the KKT conditions, sketch the domain with the solution and verify the second order sufficient condition for optimality. Estimate the new optimal value of the function if (a) the first constraint is changed to $g_1 = 26 - x_1 x_2 \le 0$, or (b) the second constraint is changed to $g_2 = 3 - x_1 \le 0$.

5. Verify that (1, 0, 3) is a KKT point of the NLP problem
minimize $f(x) = -x_1^3 + x_2^3 - 2x_1 x_3^2$
subject to $2x_1 + x_2^2 + x_3 = 5$, $5x_1^2 - x_2^2 - x_3 \ge 2$, $x_1, x_2, x_3 \ge 0$.
Examine the second order conditions. Is this a convex programming problem?

6. Identify the KKT points (i.e. points satisfying the KKT conditions) of the problem
minimize $f(x) = x_1^2 - x_2^2$ subject to $x_1^2 + 2x_2^2 = 4$,
and examine them through the second order conditions for optimality.

7. Locate the KKT point(s) of the NLP problem
minimize $f(x) = x_1^2 + x_2^2 - 2x_2 - 1$
subject to $g_1(x) = 4(x_1 - 4)^2 + 9(x_2 - 3)^2 - 36 \le 0$, $g_2(x) = 9(x_1 - 4)^2 + 4(x_2 - 3)^2 - 36 \le 0$
over a sketch of the domain.

8. Write down the formal KKT conditions of the NLP problem
minimize $f(x) = x^2 - 8x + 10$ subject to $x \ge 6$.
Develop the Lagrangian $L(x, \mu)$ of this problem, evaluate its derivatives up to the second order and construct contours of the Lagrangian function on the $x$-$\mu$ plane.

9.
For the problem
minimize $f(x) = (x_1 - 3)^2 + (x_2 - 3)^2$ subject to $2x_1 + x_2 \le 2$;
develop the dual function, maximize it and find the corresponding point in x-space. Compare the optimal values of the primal and dual functions.

10. Show that convex functions $g_i(x)$ for all i in $g(x) \le 0$, along with linear equality constraints in $h(x) = 0$, define a convex domain.

11. Suppose that a regular point $x^*$ of a convex programming problem satisfies the KKT conditions.
(a) If it is not a local minimum, then show that the assumption of an arbitrarily close feasible point y, such that $f(y) < f(x^*)$, leads to a contradiction.
(b) Now that we are forced to admit $x^*$ as a local minimum point, let us suppose that it is not a global minimum point. Then, taking a point z somewhere in the domain such that $f(z) < f(x^*)$, show that you can always find another point y satisfying all the premises of part (a), and hence leading to a contradiction.
(c) Finally, suppose that $x^*$ is a global minimum point, but it is not unique. Then, considering another global minimum point w, show that every point in the line segment joining $x^*$ and w is also a global minimum point.
(d) Summarize the complete result in the form of a statement on KKT conditions in the context of a convex programming problem.

Department of Mechanical Engineering, Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 9: Linear and Quadratic Problems

1. Formulate the problem
maximize $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$
subject to $x_1^{\beta_{i1}} x_2^{\beta_{i2}} \cdots x_n^{\beta_{in}} \le b_i$ for $i = 1$ to m, and $x_j \ge 1$ for $j = 1$ to n,
as a linear programming problem and justify your formulation.

2. Using the simplex method, solve the following LP problems.
(a) Minimize $x_1$ subject to $2x_1 + x_2 \le 2$, $x_1 + 5x_2 + 10 \ge 0$, $x_2 \le 1$.
(b) Minimize $3x_1 + x_2$ subject to $4x_1 + x_2 \ge 3$, $4x_1 + 3x_2 \le 6$, $x_1 + 2x_2 \le 3$, $x_1, x_2 \ge 0$.

3.
Implement the simplex method in a general program, use it on the following LP problems, and report your results and experience.
(a) Minimize $2x_1 + 3x_2$ subject to $4x_1 - 5x_2 \le 17$, $-3x_1 + 2x_2 + 10 \ge 7$, $x_1, x_2 \le 0$.
(b) Minimize $3x_1 + 4x_2$ subject to $3x_1 + 2x_2 \le 12$, $x_1 + 2x_2 \le 6$, $2x_1 - 7x_2 \ge 10$, $x_1, x_2 \ge 0$.

4. Maximize $f(x) = 2x_1 + 9x_2 + 3x_3$ subject to $x_1 + x_2 + x_3 \le 1$, $x_1 + 4x_2 + 2x_3 \le 2$, $x_1, x_2, x_3 \ge 0$; and find out the Lagrange multipliers corresponding to the constraints at the optimal point.

5. Consider the two-variable optimization problem
minimize $f(x, y) = c_1 x + c_2 y$
subject to $a_{11} x + a_{12} y = b_1$, $a_{21} x + a_{22} y \le b_2$, $y \ge 0$.
(a) Develop the Lagrangian for the problem and work out the KKT conditions.
(b) Use these conditions to express the optimal function value in terms of the Lagrange multipliers and determine its sensitivity to $b_1$ and $b_2$.
(c) Develop the dual problem.
(d) Work out the KKT conditions for this dual problem.

6. In three-dimensional space, we have a line segment with known end-points A $(a_1, a_2, a_3)$ and B $(b_1, b_2, b_3)$. Similarly, we have a triangle with known vertices P $(p_1, p_2, p_3)$, Q $(q_1, q_2, q_3)$ and R $(r_1, r_2, r_3)$. Formulate the problem of finding the closest distance between the line segment and the triangle as an optimization problem. Develop the KKT conditions for the problem. If a given pair of points (on the line segment and on the triangle) together satisfies the KKT conditions, can we say that this pair gives a local minimum for the distance?

7. Using a quadratic programming approach, solve the problem formulated in the previous exercise, for the triangle PQR with P(10, 0, 0), Q(0, 8, 0) and R(0, 0, 6), for the following cases of line segment AB: (i) A(1, -1, 1), B(6, 9, 6); (ii) A(1, 3, 8), B(3, 9, 12); and (iii) A(8, 5, 0), B(3, 1, 6). Attempt both active set and slack variable strategies with several starting solutions.

8.
Starting from the origin and using square trust regions by imposing artificial bounds on the variables (take the initial size as 0.4 units), use quadratic programming as the iterative step for the unconstrained minimization of the function 9(x1^2 − x2)^2 + (x1 − 1)^2. Use the exact gradient and Hessian for defining the quadratic model function at every iteration.

9. Consider a QP problem with two variables and a single inequality constraint (x, p ∈ R^2):
minimize f(x) = (1/2) x^T Q x − b^T x + c
subject to p^T x ≤ d.
(a) Write down the complete KKT conditions. Identify the number of unknowns to be solved for, and the number and type (linear/nonlinear) of equations and inequalities to be satisfied. (b) Develop formulas (in terms of Q, b, c, p, d, which are the data of the problem) for these unknowns if the constraint is inactive. (c) Develop formulas for these unknowns if the constraint is active. (d) Develop algorithmic steps to take care of both these cases for any given set of data, with positive definite Q.

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 10: A Case Study

Consider the NLP problem
minimize f(x) = 2(x1^2 + x2^2 − 1) − x1
subject to g1(x) = 4(x1 − 4)^2 + 9(x2 − 3)^2 − 36 ≤ 0,
g2(x) = 9(x1 − 4)^2 + 4(x2 − 3)^2 − 36 ≤ 0. (1)

1. Linearizing the objective function and constraint functions around the selected point (3, 2), set up the LP problem according to the Frank-Wolfe formulation.

2. Sketch and describe the domain of this LP problem.

3. Find the solution of this LP problem. Rather than solving the LP problem formally, you can identify the solution from the sketch and establish it convincingly.

4. Is the resulting point a feasible solution of the original NLP problem (1)? If not, then what can be done to obtain a feasible solution to proceed to the next iteration?

5. Conduct one more iteration of the above procedure.

6.
In a fresh attempt, starting from the original point (3, 2), linearize only the constraint functions, leaving the objective function as it is.

7. Solve the resulting quadratic programming problem.

8. Is the resulting point feasible for the original NLP problem? If not, then what would you do to obtain a feasible solution to proceed to the next iteration?

9. Your friend Reeta wants to solve the NLP problem (1) approximately, but refuses to learn any optimization algorithm. However, she is good at geometry and would not mind drawing a few ellipses and circles. Chalk out a clear and economical action plan for her to capture the optimal solution roughly through geometric construction.

10. Develop the diagram(s) Reeta would produce following your advice.

11. How many KKT points do you expect for the NLP problem (1): none, one, multiple or infinitely many? Support your answer with clear arguments.

12. Identify one KKT point of the problem. Use (and spell out) discretion in abandoning branches of fruitless calculation.

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 11: Methods of Constrained Optimization

1. We want to minimize f(x) = x^2 − 8x + 10 subject to x ≥ 6 by using the penalty function (1/2) c {max[0, g(x)]}^2, where g(x) = 6 − x. Minimize a sequence of penalized functions, with the penalty parameter values c = 0, 0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000.

2. Starting from the origin and using square trust regions by imposing artificial bounds on the variables (take the initial size as 0.4 units), use quadratic programming as the iterative step for the unconstrained minimization of the function 9(x1^2 − x2)^2 + (x1 − 1)^2. Use the exact gradient and Hessian for defining the quadratic model function at every iteration.

3. Consider the problem: minimize f(x) = 2(x1^2 + x2^2 − 1) − x1 subject to x1^2 + x2^2 − 1 = 0.
(a) Show that x* = [1 0]^T is the minimizer and find the associated Lagrange multiplier. (b) Suppose that xk = [cos θ sin θ]^T, where θ ≈ 0. Verify feasibility and closeness to optimality. (c) Set up and solve the corresponding quadratic program. (d) With a full Newton step xk+1 = xk + dk, examine feasibility at xk+1 and compare the function values at xk and xk+1. (e) From this exercise, can you draw any significant conclusion about an active set method?

4. Solve the NLP problem: minimize f(x) = −3x − 4y − 5z subject to x^2 + y^2 + (z − 1)^2 ≤ 4, x^2 + y^2 + (z + 1)^2 ≤ 4, (x − 1)^2 + y^2 + z^2 ≤ 4; by the cutting plane method.

5. A chain is suspended from two thin hooks that are 160 cm apart on a horizontal line. The chain consists of 20 links of steel, each 10 cm in length. The equilibrium shape of the chain is found by formulating the problem as
minimize Σ_{i=1}^{n} ci yi
subject to Σ_{i=1}^{n} yi = 0 and L − Σ_{i=1}^{n} √(l^2 − yi^2) = 0,
where ci = n − i + 1/2, n = 20, l = 10, L = 160. Derive the dual function for this problem and work out a complete steepest ascent formulation for maximizing the dual function, and hence solving the original problem. Implement this formulation in a steepest ascent loop and obtain the optimal values of the Lagrange multipliers, the equilibrium configuration and the corresponding (minimum) potential energy, i.e. Σ_{i=1}^{n} ci yi.

6. Starting from the origin (which is the unconstrained minimum point of the objective function), use the augmented Lagrangian method to minimize the function 5x1^2 + 4x1x2 + 3x2^2 over the domain defined by the constraints 2 sin x1 ≤ x2 ≤ 2 cos x1 and x1 + x2^2 = 15. (Try penalty parameter values 2, 10 and 20.)

7. Use an alternative method to solve the above NLP problem. Now, rather than using any formal method of constrained optimization, use variable elimination, study of the domain, function plots and common sense to crack the problem.
In brief, solve the problem the way you would if solving it were utterly necessary for your survival and you had not taken this optimization course. (You do not need to go all the way and plot contours like Reeta!)

8. Starting from the point (1, 1), perform two iterations of the feasible directions (Zoutendijk) method to find the point, farthest from the point C(1.5, 4), in the domain defined by 4.5x1 + x2^2 ≤ 18, 2x1 − x2 ≥ 1, x1, x2 ≥ 0.

9. Starting from the feasible solution (1, −2), and with initial line-search bound αU = 0.25, solve the problem: minimize x1^2 + 2x2^2 subject to x1^2 + x2^2 ≥ 5 by the generalized reduced gradient method, using (a) the slack variable strategy, (b) the active set strategy.

10. Use the active set formulation of the gradient projection method to solve the NLP problem: minimize (x1 − 1)^2 + 4(x2 − 3)^2 subject to x1^2 + x2^2 ≤ 5, (x1 − 1)^2 + x2^2 ≥ 1, with initial line-search bound αU = 0.25 and starting point (a) (0, 0) and (b) (1, 1.5).

Department of Mechanical Engineering
Indian Institute of Technology Kanpur
ME 752: Optimization Methods in Engineering Design (2008-2009 II)
Assignment 12: Miscellaneous Topics

1. Minimize f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 − 6x − 7y − 8z + 9 if z can take only integer values. Repeat the exercise, considering y and z both taking values from the set {0, 1.2, 2.8, 3.9, 6.2}.

2. For a particle of mass m, define the (Lagrangian) function L(t, x, y, ẋ, ẏ) = (m/2)(ẋ^2 + ẏ^2) − mgy and develop the (action) integral s = ∫ L dt. The problem is to determine the trajectory x(t), y(t) of the particle from (0, 0) at time t = 0 to (a, b) at time t = T, along which s is minimum, or at least stationary. (a) Verify that x(t) = αt(t − T) + at/T, y(t) = βt(t − T) + bt/T is a feasible trajectory, and develop the function s(α, β). (b) Formally find the values of α and β that minimize s, and hence determine the required trajectory x(t), y(t). Find ẋ, ẏ, ẍ, ÿ. Which law of physics did you just derive, in a way?
(c) Now, bypass the work of the two previous steps and work on a more direct theme. Work out the variation δs resulting from arbitrary variations δx(t), δy(t) that respect the given boundary conditions, along with consistent variations in their rates. Insist on δs = 0 to derive the same result as above. [Hint: To get rid of the δẋ and δẏ terms, integrate the corresponding terms by parts.]

3. We want to solve the Blasius problem in the form
f'''(x) + f(x) f''(x) = 0, f(0) = f'(0) = 0, f'(5) = 1
by the Galerkin method. Let us choose x^2, x^3, ..., x^8 as the basis functions, which already satisfy the first two conditions. Taking 1, x, x^2, ..., x^5 as trial functions and using the boundary condition at x = 5, determine the solution.
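The Galerkin setup of the last problem gives six weighted-residual equations plus the derivative boundary condition, i.e. seven equations in the seven coefficients c2, ..., c8 of f(x) = Σ ck x^k. A minimal numerical sketch of this system is given below; the function names and the use of SciPy's solver and quadrature routines are my own choices, not part of the assignment, and convergence of the root-finder depends on the initial guess.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import fsolve

def make_f(c):
    """Build f(x) = c2 x^2 + ... + c8 x^8 and its first three derivatives.
    The zero constant and linear terms enforce f(0) = f'(0) = 0 automatically."""
    f = np.polynomial.Polynomial(np.concatenate(([0.0, 0.0], c)))
    return f, f.deriv(1), f.deriv(2), f.deriv(3)

def equations(c):
    """Six Galerkin equations with trial functions 1, x, ..., x^5,
    plus the remaining boundary condition f'(5) = 1."""
    f, f1, f2, f3 = make_f(c)
    resid = lambda x: f3(x) + f(x) * f2(x)   # Blasius residual f''' + f f''
    eqs = [quad(lambda x: resid(x) * x**j, 0.0, 5.0)[0] for j in range(6)]
    eqs.append(f1(5.0) - 1.0)
    return eqs

c0 = np.zeros(7)
c0[0] = 0.1                 # rough initial guess for c2 (hypothetical choice)
c = fsolve(equations, c0)   # seven equations, seven unknowns
```

With a solution vector `c` in hand, `make_f(c)[1](5.0)` can be checked against the boundary value 1 as a quick sanity test of the computed approximation.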