SE524/EC524 Optimization Theory and Methods
Yannis Paschalidis, [email protected], http://ionia.bu.edu/
Department of Electrical and Computer Engineering, Division of Systems Engineering, and Center for Information and Systems Engineering, Boston University

Lecture 15: Outline
1. Introduction to Nonlinear Programming (NLP).
2. Some NLP formulations.
3. Unconstrained optimization.
4. Gradient methods.
5. Stepsize selection.

Some background material for NLP
- Norms || · || on Rⁿ. Euclidean norm: ||x|| = √(x′x).
- Open ball around a with radius r: {y : ||y − a|| < r}.
- A ⊂ Rⁿ is compact iff it is closed and bounded (Heine-Borel).
- Consider a function f : A → R. It is:
  - continuous at x ∈ A if lim_{y→x} f(y) = f(x);
  - right-continuous at x if lim_{y↓x} f(y) = f(x);
  - left-continuous at x if lim_{y↑x} f(y) = f(x);
  - lower-semicontinuous at x if f(x) ≤ lim inf_{k→∞} f(xk) for every sequence xk → x;
  - upper-semicontinuous at x if f(x) ≥ lim sup_{k→∞} f(xk) for every sequence xk → x;
  - coercive if lim_{k→∞} f(xk) = ∞ for every sequence satisfying ||xk|| → ∞.

Some background material for NLP (cont.)
Theorem (Weierstrass). Let A ⊂ Rⁿ be closed and non-empty, and let f : A → R be lower-semicontinuous at every x ∈ A.
(a) If A is compact, then ∃ x ∈ A s.t. f(x) = inf_{z∈A} f(z).
(b) If f is coercive, then ∃ x ∈ A s.t. f(x) = inf_{z∈A} f(z).

- Gradient: f : Rⁿ → R ⇒ ∇f(x) = (∂f(x)/∂x₁, …, ∂f(x)/∂xₙ).
  For f = (f₁, …, fₘ) : Rⁿ → Rᵐ ⇒ ∇f(x) = [∇f₁(x) ⋯ ∇fₘ(x)].
- Hessian: f : Rⁿ → R ⇒ ∇²f(x) = ∇(∇f(x)) = [∂²f(x)/∂xᵢ∂xⱼ].
- Taylor expansion:
  f(x + y) = f(x) + y′∇f(x) + (1/2) y′∇²f(x) y + o(||y||²).

Formulation and definitions
Unconstrained optimization problem:
  min f(x) s.t. x ∈ Rⁿ.
- x* is a local minimum if ∃ ε > 0 s.t. f(x*) ≤ f(x) for all x with ||x − x*|| < ε.
- x* is a global minimum if f(x*) ≤ f(x) for all x ∈ Rⁿ.
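To make the Taylor expansion concrete, here is a small numerical sketch (the test function f and the points x, y are illustrative choices, not from the slides): the first-order remainder f(x + y) − f(x) − y′∇f(x) is O(||y||²), so shrinking the step tenfold should shrink the remainder roughly a hundredfold.

```python
import numpy as np

# Illustrative smooth function (an assumption, not from the slides):
# f(x) = (1/2) x'x + (1/4) (x'x)^2.
def f(x):
    return 0.5 * x @ x + 0.25 * (x @ x) ** 2

def grad_f(x):
    # Analytic gradient: x + (x'x) x.
    return x + (x @ x) * x

x = np.array([1.0, -2.0, 0.5])
y = np.array([0.3, 0.1, -0.2])

# Remainder of the first-order Taylor expansion at steps t*y, t = 1, 0.1, 0.01.
errors = []
for t in [1.0, 0.1, 0.01]:
    remainder = f(x + t * y) - f(x) - (t * y) @ grad_f(x)
    errors.append(abs(remainder))
# Each tenfold shrink of the step shrinks the remainder ~100x, as O(||y||^2) predicts.
```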
Necessary Conditions
Proposition. Let x* be an unconstrained local min of f, and let f : Rⁿ → R be continuously differentiable in an open set S containing x*. Then
  ∇f(x*) = 0.   (1st order)
If f : Rⁿ → R is twice continuously differentiable within S, then
  ∇²f(x*) ⪰ 0.   (2nd order)

Convexity
Proposition. Let f : C → R be convex over the convex set C ⊂ Rⁿ.
(a) A local min of f over C is also a global min over C. If f is strictly convex, there exists at most one global min.
(b) If f is convex and C is open, the condition ∇f(x*) = 0 is necessary and sufficient for x* ∈ C to be a global min of f over C.

Sufficient Conditions
Proposition. Let f : Rⁿ → R be twice continuously differentiable in an open set S ⊂ Rⁿ. Let also x* ∈ S s.t. ∇f(x*) = 0 and ∇²f(x*) ≻ 0. Then x* is a strict unconstrained local min of f; that is, ∃ γ, ε > 0 s.t.
  f(x) ≥ f(x*) + (γ/2) ||x − x*||²,   for all x with ||x − x*|| < ε.

Gradient Methods
Generic gradient method: xk+1 = xk + αk dk, where, if ∇f(xk) ≠ 0, the direction dk is chosen so that ∇f(xk)′ dk < 0 (descent direction).
An interesting class of gradient methods: xk+1 = xk − αk Dk ∇f(xk).
- Steepest descent: Dk = I.
- Newton's method: Dk = (∇²f(xk))⁻¹.
- Diagonally scaled steepest descent:
  Dk = diag( (∂²f(xk)/∂(x₁)²)⁻¹, …, (∂²f(xk)/∂(xₙ)²)⁻¹ ).
- Modified Newton's method: Dk = (∇²f(x⁰))⁻¹.

Least squares problems
  min f(x) = (1/2) ||g(x)||² s.t. x ∈ Rⁿ.
Note that ∇f(x) = ∇g(x) g(x).
Gauss-Newton method for least squares:
  xk+1 = xk − αk (∇g(xk) ∇g(xk)′)⁻¹ ∇g(xk) g(xk).
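As a sketch of how two choices of Dk behave (the quadratic Q, b and the stepsize below are illustrative assumptions, not from the slides): on a strictly convex quadratic, steepest descent with a small constant stepsize converges linearly, while Newton's method, whose Dk is the exact inverse Hessian, reaches the minimizer in a single iteration.

```python
import numpy as np

# Illustrative problem: minimize f(x) = (1/2) x'Qx - b'x,
# whose unique minimizer x* solves Q x* = b.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])      # symmetric positive definite (assumed data)
b = np.array([1.0, 2.0])
x_star = np.linalg.solve(Q, b)

def grad(x):
    return Q @ x - b

# Steepest descent (Dk = I) with constant stepsize s < 2/lambda_max(Q).
x = np.zeros(2)
s = 0.2
for _ in range(200):
    x = x - s * grad(x)

# Newton's method (Dk = inverse Hessian = Q^{-1}): one exact step from x0 = 0.
x0 = np.zeros(2)
x_newton = x0 - np.linalg.solve(Q, grad(x0))
```

The contrast is the point: steepest descent needs many iterations whose count depends on the conditioning of Q, while Newton's method solves a quadratic exactly because its model of f is exact.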
Stepsize Selection
- Minimization rule: f(xk + αk dk) = min_{α≥0} f(xk + αdk).
- Limited minimization rule: f(xk + αk dk) = min_{α∈[0,s]} f(xk + αdk).
- Constant stepsize: αk = s.
- Diminishing stepsize: αk → 0 with Σ_{k=0}^{∞} αk = ∞.
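A minimal sketch of the diminishing-stepsize rule (the function f(x) = x²/2 and the particular sequence αk = 1/(k+2) are illustrative assumptions): the sequence satisfies αk → 0 and Σ αk = ∞ (harmonic tail), so the iterates never stall short of the minimizer, but progress is slow.

```python
# Gradient descent on the illustrative f(x) = x^2/2, with gradient f'(x) = x,
# using the diminishing stepsize alpha_k = 1/(k+2).
x = 5.0
for k in range(1000):
    alpha = 1.0 / (k + 2)
    x = x - alpha * x
# The recursion gives x_{k+1} = x_k * (k+1)/(k+2), so after K steps
# x_K = x_0 / (K+1): guaranteed convergence to 0, at a sublinear rate.
```

This illustrates the trade-off named on the slide: a diminishing stepsize needs no knowledge of the function's curvature, unlike a constant stepsize, but pays for that robustness with a slower rate.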