Matlab-based Optimization: the Optimization Toolbox

Gene Cliff (AOE/ICAM - [email protected])
3:00pm - 4:45pm, Monday, 11 February 2013

AOE: Department of Aerospace and Ocean Engineering
ICAM: Interdisciplinary Center for Applied Mathematics


Matlab's Optimization Toolbox
- Classifying Optimization Problems ⇐
- A Soup Can Example
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve


Solver Categories

There are four general categories of Optimization Toolbox solvers:

Minimizers. This group of solvers attempts to find a local minimum of the objective function near a starting point x0. They address problems of unconstrained optimization, linear programming, quadratic programming, and general nonlinear programming.

Multiobjective minimizers. This group of solvers attempts to either minimize the maximum value of a set of functions (fminimax), or to find a location where a collection of functions is below some prespecified values (fgoalattain).

Least-squares (curve-fitting) solvers. This group of solvers attempts to minimize a sum of squares. This type of problem frequently arises in fitting a model to data. The solvers address problems of finding nonnegative solutions, bounded or linearly constrained solutions, and fitting parameterized nonlinear models to data.

Equation solvers. This group of solvers attempts to find a solution to a scalar- or vector-valued nonlinear equation f(x) = 0 near a starting point x0. Equation-solving can be considered a form of optimization because it is equivalent to finding the minimum norm of f(x) near x0.


Generic Optimization Problem

    min over x   f(x1, x2, ..., xn)

subject to

    equality constraints:      ceq_i(x1, x2, ..., xn) = 0,    i = 1, 2, ..., l
    inequality constraints:    c_j(x1, x2, ..., xn) <= 0,     j = 1, 2, ..., m
    simple bound constraints:  x_i^L <= x_i <= x_i^U,         i = 1, 2, ..., n


Classifying a Problem

Identify your objective function as one of five types:
- Linear
- Quadratic
- Sum-of-squares (least squares)
- Smooth nonlinear
- Nonsmooth

Identify your constraints as one of five types:
- None (unconstrained)
- Bound
- Linear (including bound)
- General smooth
- Discrete (binary integer)


Problem classification table

[Table of objective types vs. constraint types, from the Toolbox documentation.] We focus on fmincon.


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example ⇐
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve


Soup Can Example (from MathWorks Training Docs)

We are to design a soup can in the shape of a right circular cylinder. We are to choose values for:
 1. the diameter (d)
 2. the height (h)

Requirements are:
- the volume (pi*d^2*h/4) must be 333 cm^3
- the height can be no more than twice the diameter
- the cost is proportional to the surface area (pi*d^2/2 + pi*d*h), and should be minimized

Since the cost function and the volume constraint are nonlinear, we select fmincon.
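Before building separate function files, the whole soup-can problem can be sketched with anonymous functions (a minimal illustration, not from the original slides; since no gradients are supplied here, fmincon falls back on finite differences):

```matlab
% Soup-can problem with x = [d; h], volume fixed at 333 cm^3
cost = @(x) pi*x(1)*(x(2) + x(1)/2);        % area = pi*d^2/2 + pi*d*h

% fmincon expects [c, ceq] = nonlcon(x); here there are no nonlinear
% inequalities and one equality (the volume)
volume  = 333;
nonlcon = @(x) deal([], volume - (pi/4)*x(2)*x(1)^2);

A  = [-2 1];  b = 0;     % h <= 2*d  rewritten as  -2*d + h <= 0
x0 = [6; 10];            % initial guess
x_star = fmincon(cost, x0, A, b, [], [], [], [], nonlcon);
```

The deal idiom lets a single anonymous function return both constraint outputs; the function-file version used in the slides can additionally return gradients.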
Soup can design space

[Figure: the (d, h) design space for the soup can.]


Soup can: cost function

    function [val, val_x] = cost_soup_can(x)
    % Evaluate the cost function for the soup-can example
    %
    %  x(1) - diameter of the can
    %  x(2) - height   of the can
    %
    %  area = 2*(pi*d^2)/4 + pi*d*h
    val = pi*x(1)*(x(2) + x(1)/2);
    % Evaluate the gradient
    if nargout > 1
        val_x = pi*[x(1) + x(2); x(1)];
    end
    end


Soup can: volume constraint

    function [c, ceq, c_x, ceq_x] = con_soup_can(x, volume)
    % Evaluate the constraint for the soup-can example
    %  x(1) - diameter of the can
    %  x(2) - height   of the can
    %
    c   = [];                            % no nonlinear inequalities
    ceq = volume - (pi/4)*x(2)*x(1)^2;   % volume = pi*d^2*h/4
    % compute the Jacobians
    if nargout > 2
        c_x   = [];
        ceq_x = -(pi/4)*x(1)*[2*x(2); x(1)];
    end
    end


Soup can: set-up script

    % Script to set up soup-can example
    %
    % We are to design a right-cylindrical (circular) can of a
    % given volume and with minimum surface area (material cost).
    % The height can be no more than twice the diameter
    %   volume = pi*d^2*h/4
    %   area   = 2*(pi*d^2)/4 + pi*d*h
    %   h <= 2*d  ==>  -2*d + h <= 0
    % In our optimization problem we have
    %   x = [d; h];
    % The specified volume is 333 cm^3
    % We have external function files
    %   cost_soup_can.m
    %   con_soup_can.m

    %% define handle to the constraint function with the specified volume value
    volume = 333;
    h_con  = @(x) con_soup_can(x, volume);

    % Arrays for the linear inequality
    A = [-2 1];
    b = 0;

    % lower/upper bounds
    lb = [4;  5];
    ub = [8; 15];

    % initial guess
    x0 = [6; 10];


optimtool: soup can example

[Screenshot of the optimtool GUI set up for the soup-can problem.]


Command Window: soup can example

    >> soup_can_2

    Iter  F-count    f(x)     Max constraint  Steplength  Directional deriv  First-order optimality
     0       3     245.044      50.26
     1       6     247.138      35.41             1            3.93              1.2
     2       9     265.113       1.713            1           17.3               2.68
     3      12     265.948       0.05285          1            6.92              0.798
     4      15     265.92        0.06939          1           -0.0716            0.0899
     5      18     265.956       0.0001174        1            6.49              0.00326
     6      21     265.957       4.871e-08        1            0.53              6.88e-05

    Local minimum possible. Constraints satisfied.

    fmincon stopped because the predicted change in the objective function
    is less than the selected value of the function tolerance and
    constraints were satisfied to within the selected value of the
    constraint tolerance.

    <stopping criteria details>

    No active inequalities.

    >>
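The reported optimum can be checked by hand: the linear constraint h <= 2d turns out to be inactive, so the problem reduces to the classical fixed-volume cylinder, whose minimum-area shape has h = d (a quick sanity check, not part of the original slides):

```matlab
% Eliminate h via the volume: h = 4*V/(pi*d^2), so
% A(d) = pi*d^2/2 + 4*V/d, and A'(d) = pi*d - 4*V/d^2 = 0 at d = (4*V/pi)^(1/3)
V = 333;
d = (4*V/pi)^(1/3);        % ~7.51 cm, inside the bounds [4, 8]
h = 4*V/(pi*d^2);          % equals d, so h <= 2*d is indeed inactive
A = pi*d^2/2 + pi*d*h      % ~265.96, matching fmincon's final f(x)
```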
Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo ⇐
- A Trajectory Example
- 2nd Trajectory Example: fsolve


fmincon: choice of algorithms

'trust-region-reflective' requires you to provide a gradient, and allows only bounds or linear equality constraints, but not both. Within these limitations, the algorithm handles both large sparse problems and small dense problems efficiently. It is a large-scale algorithm, and can use special techniques to save memory, such as a Hessian multiply function. For details, see Trust-Region-Reflective Algorithm.

'active-set' can take large steps, which adds speed. The algorithm is effective on some problems with nonsmooth constraints. It is not a large-scale algorithm.

'sqp' satisfies bounds at all iterations. The algorithm can recover from NaN or Inf results. It is not a large-scale algorithm.

'interior-point' handles large, sparse problems, as well as small dense problems. The algorithm satisfies bounds at all iterations, and can recover from NaN or Inf results. It is a large-scale algorithm, and can use special techniques for large-scale problems.


Large-Scale vs Medium-Scale

An optimization algorithm is large scale when it uses linear algebra that does not need to store, nor operate on, full matrices. This may be done internally by storing sparse matrices, and by using sparse linear algebra for computations whenever possible. Furthermore, the internal algorithms either preserve sparsity, such as a sparse Cholesky decomposition, or do not generate matrices, such as a conjugate gradient method. Large-scale algorithms are accessed by setting the LargeScale option to 'on', or setting the Algorithm option appropriately (this is solver-dependent). In contrast, medium-scale methods internally create full matrices and use dense linear algebra.
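The storage gap between the two styles is easy to see in MATLAB itself (an illustration, not specific to any solver):

```matlab
% A 10000x10000 tridiagonal matrix: dense storage needs n^2 doubles
% (~800 MB here), sparse storage only the ~3n nonzeros.
n = 1e4;
e = ones(n, 1);
S = spdiags([e -2*e e], -1:1, n, n);   % sparse tridiagonal matrix
whos S                                 % a few hundred kilobytes
% full(S) would materialize all n^2 entries -- avoid for large n
```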
If a problem is sufficiently large, full matrices take up a significant amount of memory, and the dense linear algebra may require a long time to execute. Medium-scale algorithms are accessed by setting the LargeScale option to 'off', or setting the Algorithm option appropriately (this is solver-dependent). Don't let the name "large-scale" mislead you; you can use a large-scale algorithm on a small problem. Furthermore, you do not need to specify any sparse matrices to use a large-scale algorithm. Choose a medium-scale algorithm to access extra functionality, such as additional constraint types, or possibly for better performance.


fmincon: command line inputs

    x = fmincon(fun, x0, A, b, Aeq, beq, lb, ub, nonlcon, options)

- fun: function handle for the cost function
- x0: initial guess for the solution
- A, b: matrix and rhs vector for the linear inequality constraints (A x <= b)
- Aeq, beq: matrix and rhs vector for the linear equality constraints
- lb, ub: lower/upper bounds for the solution vector
- nonlcon: function handle for the nonlinear inequality and equality constraints; [c, ceq] = nonlcon(x)
- options: structure of options for the algorithm


fmincon: additional outputs

    [x, fval, exitflag, output, lambda, grad, hessian] = fmincon(...)

- exitflag 1: first-order optimality measure was less than options.TolFun, and the maximum constraint violation was less than options.TolCon
- exitflag 0: number of iterations exceeded options.MaxIter, or number of function evaluations exceeded options.MaxFunEvals
- output: structure of data about the performance of the algorithm
- lambda: structure of the Lagrange multipliers
- grad: gradient of the Lagrangian
- hessian: Hessian of the Lagrangian


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo
- A Trajectory Example ⇐
- 2nd Trajectory Example: fsolve


Trajectory Example

We are to launch an object at speed v0; we seek an initial elevation angle for maximum range. In the classical case with no drag, the best elevation is pi/4.
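The classical no-drag result is easy to verify numerically: the range is R(gamma) = v0^2*sin(2*gamma)/g, which a one-dimensional search maximizes at gamma = pi/4 (a quick check using fminbnd, with the same speed and gravity values used later in the example):

```matlab
% No-drag range R(gamma) = v0^2*sin(2*gamma)/g, maximized at gamma = pi/4
v0 = 25;  g = 9.8;
R  = @(gam) v0^2*sin(2*gam)/g;
gam_star = fminbnd(@(gam) -R(gam), 0, pi/2);   % minimize the negative range
% gam_star ~ pi/4 (0.7854), with maximum range v0^2/g ~ 63.8 m
```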
Suppose we have a simple drag force, b*v^2. We then:
- formulate an initial-value problem for the projectile motion
- the initial position and speed are given; the initial elevation angle (gamma(0)) is unknown
- the final range (x(tf), to be maximized) occurs when the height returns to its initial value (the final time tf is unknown)

Since the cost function and the final-height constraint are nonlinear functions of the unknowns, we select fmincon.


Trajectory: Setup and solve an IVP

    function [range, altitude] = trajectory(gam_0, t_f, param)
    % Solve an IVP for the ballistic trajectory
    % Evaluate the final altitude and range
    %
    %  gam_0 is the initial flight-path angle (radians)
    %  t_f   is the final time (s)
    %
    %  range/altitude are the final values
    %
    %  param is a data structure
    %   param.b_coef is the drag coefficient
    %   param.grav   is the gravitational acceleration (m/s^2)
    %   param.vel_0  is the initial speed (m/s)

    % anonymous function handle with specified parameters
    h_rhs    = @(t, z) ballistic_rhs(t, z, param.b_coef, param.grav);
    z_0      = [0; 0; param.vel_0; gam_0];   % set the initial state
    [~, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
    range    = Z(end, 1);
    altitude = Z(end, 2);
    end

Note that to evaluate the cost we need the range, and to evaluate the constraint we need the altitude. Do we really have to solve the IVP twice to evaluate both?


Trajectory: the RHS of the ODE system

    function z_dot = ballistic_rhs(~, z, b_coef, grav)
    % Evaluate rhs of eq. of motion for a ballistic object
    %  z = [x, h, v, gamma]
    %   x     - range
    %   h     - altitude
    %   v     - speed
    %   gamma - flight-path angle
    sin_g = sin(z(4));
    cos_g = cos(z(4));
    v     = max(z(3), 0.1);   % guard against zero divisor
    z_dot = [ z(3)*cos_g;
              z(3)*sin_g;
             -b_coef*z(3)*z(3) - grav*sin_g;
             -grav*cos_g/v ];
    end


Matlab: ObjectiveandConstraints

    function [cost, nonlincon] = ObjectiveandConstraints(param)
    % Encapsulates cost and constraint functions for fmincon
    %  cost and nonlincon are function handles
    %  param is a structure that encodes parameters for the
    %  cost/constraint fcns

    % Initialize variables and make them available to the nested functions
    range    = [];
    altitude = [];
    LastZ    = [];   % initialize

    cost      = @objective;
    nonlincon = @constraints;

    % Nested functions
        function [val, val_Z] = objective(z)
            if ~isequal(z, LastZ)   % update for this value
                % Solve the IVP
                [range, altitude] = trajectory(z(1), z(2), param);
                LastZ = z;
            end
            % Evaluate cost
            val   = -range;   % minimize the negative range
            val_Z = [];       % gradient not computed in this version
        end
    %
        function [c, ceq, c_Z, ceq_Z] = constraints(z)
            if ~isequal(z, LastZ)   % update for this value
                % Solve the IVP
                [range, altitude] = trajectory(z(1), z(2), param);
                LastZ = z;
            end
            % Evaluate constraints
            c     = [];   % no inequality constraints
            ceq   = altitude;
            c_Z   = [];   % Jacobians not computed
            ceq_Z = [];
        end
    end


ObjectiveandConstraints: insights

Invoking ObjectiveandConstraints defines the function handles cost and nonlincon. Since the variables param, range, altitude, LastZ are defined at the high level, they are available to the nested functions objective and constraints. If z ~= LastZ we solve the IVP and return range and altitude. If z == LastZ we use the stored values of range and altitude. This approach is useful in cases wherein evaluating the cost/constraint functions requires an expensive calculation, such as the solution of an ODE/IVP or a PDE/BVP. Future documentation of the Optimization Toolbox will include this description.


fmincon: trajectory example

    % Script to set parameters for and then run
    % the max-range trajectory problem
    %
    % param is a structure of data for the problem
    %  param.b_coef is the drag coefficient
    %  param.grav   is the gravitational acceleration (m/s^2)
    %  param.vel_0  is the initial speed (m/s)

    param.b_coef = 0.1;
    param.grav   = 9.8;
    param.vel_0  = 25.0;

    % define handles for functions evaluating the cost/constraints
    [cost, nonlcon] = ObjectiveandConstraints(param);

    % lower/upper bounds
    lb = [0;    0.5*param.vel_0/param.grav];
    ub = [pi/4; 5*lb(2)];

    % initial guess
    x0 = 0.5*(lb + ub);

    %% set parameters and invoke fmincon
    OPT = optimset('fmincon');
    OPT = optimset(OPT, 'Algorithm', 'active-set', 'Display', 'iter', ...
                   'UseParallel', 'always');
    % x_star = fmincon(fun, x0, A, b, Aeq, beq, lb, ub, nonlcon, options)
    x_star = fmincon(cost, x0, [], [], [], [], lb, ub, nonlcon, OPT);


Matlab's Optimization Toolbox
- Classifying Optimization Problems
- A Soup Can Example
- Intermezzo
- A Trajectory Example
- 2nd Trajectory Example: fsolve ⇐


2nd Trajectory Example: fsolve

With the same dynamics as earlier, we now seek an initial elevation angle (gamma_0) and a final time (tf) so that the trajectory ends at a specified point (xf, hf) in the vertical plane.

- since the IVP solution depends on time, as well as on the initial elevation angle, we write the range and height functions as x(t; gamma_0) and h(t; gamma_0), respectively
- we want to find values of tf and gamma_0 that lead to zero for the vector-valued function:

      f1(gamma_0, tf) = x(tf; gamma_0) - xf
      f2(gamma_0, tf) = h(tf; gamma_0) - hf

- we use the function fsolve from the Optimization Toolbox


Modified trajectory code

This version can return the time/state history [T, Z].

    function [residual, T, Z] = trajectory(gam_0, t_f, param)
    % Solve an IVP for the ballistic trajectory
    % Evaluate the residual in the final altitude and range
    %
    %  gam_0 is the initial flight-path angle (radians)
    %  t_f   is the final time (s)
    %
    %  param is a data structure
    %   param.b_coef is the drag coefficient
    %   param.grav   is the gravitational acceleration (m/s^2)
    %   param.vel_0  is the initial speed (m/s)
    %   param.x_f    is the specified target range (m)
    %   param.h_f    is the specified target altitude (m)

    % anonymous function handle with specified parameters
    h_rhs = @(t, z) ballistic_rhs(t, z, param.b_coef, param.grav);
    z_0   = [0; 0; param.vel_0; gam_0];   % set the initial state
    if nargout == 1
        [~, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
        residual = Z(end, 1:2)' - [param.x_f; param.h_f];
    else
        [T, Z]   = ode23(h_rhs, [0 t_f], z_0);   % solve the IVP
        residual = Z(end, 1:2)' - [param.x_f; param.h_f];
    end
    end
    % 'local' ballistic_rhs function goes here


fsolve: 2nd trajectory example

    % Script to set parameters for and then run
    % a trajectory target problem
    %
    % param is a structure of data for the problem
    %  param.b_coef is the drag coefficient
    %  param.grav   is the gravitational acceleration (m/s^2)
    %  param.vel_0  is the initial speed (m/s)
    %  param.x_f    is the specified target range (m)
    %  param.h_f    is the specified target altitude (m)

    param.b_coef = 0.1;
    param.grav   = 9.8;
    param.vel_0  = 25.0;
    param.x_f    = 8.0;
    param.h_f    = 2.0;

    % define handle for the function evaluating the residual
    f_hndl = @(x) trajectory(x(1), x(2), param);

    % initial guess
    x0 = [pi/4; 0.5*param.vel_0/param.grav];

    %% set parameters and invoke fsolve
    OPT = optimset('fsolve');
    OPT = optimset(OPT, 'Display', 'iter', ...
                   'UseParallel', 'always');

    % [x_star, fval, exitflag] = fsolve(FUN, X0, OPTIONS)
    [x_star, ~, flag] = fsolve(f_hndl, x0, OPT);


fsolve: 2nd trajectory example (continued)

    % [x_star, fval, exitflag] = fsolve(FUN, X0, OPTIONS)
    [x_star, ~, flag] = fsolve(f_hndl, x0, OPT);
    if flag == 1
        [~, T, Z] = trajectory(x_star(1), x_star(2), param);
        figure
        plot(Z(:,1), Z(:,2), '--k', 'LineWidth', 2); hold on; grid on
        plot(param.x_f, param.h_f, 'ro')
        xlabel('range (m)'); ylabel('height (m)')
    else
        fprintf(1, '\n flag = %02i \n', flag);
    end


2nd trajectory example: fsolve

Note that, as in the zero-drag case, the problem has two solutions:
- Low trajectory
- High trajectory

[Figure: the low and high trajectories through the target point.]


THE END

Please complete the evaluation form: http://www.fdi.vt.edu/training/evals/
Thanks


Backup - underlying ideas - problem w/o constraints

Problem P0: Find x* in R^n to minimize a smooth function f : R^n -> R. We assume that f is twice continuously differentiable in the neighborhood of a solution.

If x* is a minimizer for P0, then x* is a stationary point for f, so that grad f(x*) = 0 in R^n; furthermore, the Hessian of f is positive semi-definite: grad^2 f(x*) >= 0.

Applying Newton's method to grad f = 0 we get the update

    p_k = -[grad^2 f(x_k)]^(-1) grad f(x_k),    x_{k+1} = x_k + p_k

Algorithms for P0 generate estimates for grad^2 f based on computed changes in grad f. The update is commonly generalized to x_{k+1} = x_k + alpha*p_k, where alpha > 0 is a step-size. Trust-region methods minimize a quadratic approximation to f near x_k subject to a bound on the step size (the trust-region radius).
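The Newton update above can be sketched in a few lines (an illustrative example on a Rosenbrock-type function with hand-coded gradient and Hessian; a practical code would add the step-size or trust-region safeguards just described):

```matlab
% Pure Newton iteration x_{k+1} = x_k - (grad^2 f)^{-1} grad f  on
% f(x) = (x1 - 1)^2 + 10*(x2 - x1^2)^2
grad = @(x) [2*(x(1)-1) - 40*x(1)*(x(2)-x(1)^2);
             20*(x(2)-x(1)^2)];
hess = @(x) [2 - 40*x(2) + 120*x(1)^2, -40*x(1);
             -40*x(1),                  20];
x = [-1; 1];
for k = 1:20
    p = -hess(x)\grad(x);              % Newton step
    x = x + p;
    if norm(grad(x)) < 1e-10, break, end
end
% converges to the minimizer x = [1; 1]
```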
Backup - underlying ideas - Newton update

Write grad F(x + p) ~ grad F(x) + grad^2 F(x) p.

Newton step: compute p so that grad F(x_k + p) = 0:

    grad^2 F(x_k) p = -grad F(x_k)

Quasi-Newton update for (grad^2 F)_{k+1}: choose the new estimate so that

    (grad^2 F)_{k+1} p~ = grad F(x_k + p~) - grad F(x_k)

Q-N rules are commonly based on rank-two, least-change ideas:

    (grad^2 F)_{k+1} = (grad^2 F)_k + Delta,    where Delta has rank two

In the period 1960 - 1980 a great deal of work was done: Davidon, Fletcher, Powell, Broyden, Goldfarb, Shanno.


Backup - underlying ideas - problem w/ equality constraints

Problem Pc: Find x* in R^n to minimize a smooth function f : R^n -> R, subject to g(x) = 0 in R^m, where g : R^n -> R^m. We assume that f and g are twice continuously differentiable in the neighborhood of a solution.

If x* is a minimizer for Pc and the Jacobian J = grad g has full rank at x*, then there exists a vector lambda-hat in R^m such that x* is a stationary point for the Lagrange function

    L(x) = f(x) + <lambda-hat, g(x)>

Furthermore, x* is a local minimizer for L in the null space of J(x*). The latter condition implies that the projected Hessian of L is positive semi-definite:

    Z' grad^2 L(x*) Z >= 0

where the columns of Z span the null space of J(x*).


Backup - underlying ideas - problem w/ inequality constraints

Problem Pi: the constraints l = k+1, ..., m are inequalities, g_l <= 0. Karush-Kuhn-Tucker theory implies that lambda_l >= 0 (NB: in some formulations lambda_l <= 0). Many algorithms are based on an active-set strategy: some set A, a subset of {k+1, ..., m}, of inequalities is treated as equalities in a version of problem Pc. At each (major) iteration the set A is adjusted:

 1. if g_l > 0 for some l not in A, then add l to the active set
 2. if lambda_l < 0 for some l in A, then remove l from the active set
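One concrete instance of the rank-two, least-change idea is the BFGS update (Broyden, Fletcher, Goldfarb, Shanno). A minimal sketch of a single update, with s the step and y the change in gradient (the numbers are illustrative):

```matlab
% BFGS: B_new = B - (B*s)(s'*B)/(s'*B*s) + (y*y')/(y'*s)
% The correction is rank two and enforces the secant condition B_new*s = y.
B = eye(2);            % current Hessian estimate
s = [0.1; -0.2];       % step  x_{k+1} - x_k
y = [0.3;  0.1];       % gradient change; y'*s > 0 keeps B positive definite
B = B - (B*s)*(s'*B)/(s'*B*s) + (y*y')/(y'*s);
norm(B*s - y)          % ~0: the secant condition holds up to round-off
```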