Introduction to optimization methods and line search
Jussi Hakanen, Post-doctoral researcher, [email protected]
spring 2014, TIES483 Nonlinear optimization

How to find optimal solutions?
Trial and error – widely used in practice, but not efficient and with a high possibility of missing good solutions.
Better to use a systematic way to find optimal solutions.
Typically we know only
– function value(s) at the current trial point,
– possibly gradients at the current trial point.
How can we know which solution is optimal? How can we find optimal solutions?

Optimality conditions
How can we know that a solution is optimal? One way is to utilize optimality conditions.
Necessary optimality conditions = conditions that an optimal solution has to satisfy (do not guarantee optimality).
Sufficient optimality conditions = conditions that guarantee optimality when satisfied.
First order conditions use first order derivatives, second order conditions use second order derivatives.

Global vs. local minimizers
A solution x* ∈ ℝⁿ is a global minimizer if f(x*) ≤ f(x) for all x ∈ S.
A solution x* ∈ ℝⁿ is a local minimizer if there exists ε > 0 s.t. f(x*) ≤ f(x) for all x ∈ S with ‖x − x*‖ < ε.
Convexity: a local minimizer is also a global minimizer.
Global minimizers are preferred, but local minimizers are usually easier to identify.

Solving an optimization problem
Find optimal values x* for the variables.
Some problems can be solved analytically: min x², when x ≥ 3 → x* = 3.
Usually it is impossible to solve analytically → the problem must be solved numerically → an approximation of the solution.
In mathematical optimization a starting point is iteratively improved.

Numerical solution
Modelling → a mathematical model of the problem.
Numerical methods → a numerical simulation model for the mathematical model.
Optimization method → solve the problem utilizing the numerical simulation model.
Modelling → simulation → optimization.

Optimization method
Algorithm: a mathematical description.
Method: an algorithm with the required numerical methods included.
Software: a method implemented as a computer programme.
Example of an algorithm:
1. Choose a stopping parameter ε > 0, a starting point x¹ and a symmetric positive definite n × n matrix D₁ (e.g. D₁ = I). Set y¹ = x¹ and k = j = 1.
2. If ‖∇f(y^j)‖ < ε, stop. Otherwise, set d^j = −D_j ∇f(y^j).
3. Let λ_j be a solution of min f(y^j + λ d^j) s.t. λ ≥ 0. Set y^{j+1} = y^j + λ_j d^j.
4. If j = n, set y¹ = x^{k+1} = y^{j+1}, k = k + 1, j = 1 and repeat from step 2.
5. Compute D_{j+1}. Set j = j + 1 and go to step 2.
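To make the steps concrete, below is a minimal Python sketch of this kind of iteration. It is only a sketch under simplifying assumptions: D_j is kept equal to the identity matrix (so the direction is the negative gradient) and the exact line search of step 3 is replaced by a crude step-halving rule; the names numerical_gradient and descent_method are illustrative, not from the course material.

```python
# A simplified sketch of the example algorithm above (illustrative names).
# D_j is kept as the identity matrix, so d^j = -grad f(y^j), and the exact
# line search of step 3 is replaced by a crude step-halving rule.
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def descent_method(f, x1, eps=1e-6, max_iter=1000):
    """Iteratively improve the starting point x1 until ||grad f(y)|| < eps."""
    y = np.asarray(x1, dtype=float)
    for _ in range(max_iter):
        grad = numerical_gradient(f, y)
        if np.linalg.norm(grad) < eps:      # step 2: stopping criterion
            break
        d = -grad                           # step 2: search direction (D_j = I)
        lam = 1.0                           # step 3: crude line search
        while f(y + lam * d) >= f(y) and lam > 1e-12:
            lam *= 0.5
        y = y + lam * d                     # step 3: move to the next point
    return y

# Minimize f(x) = (x1 - 1)^2 + (x2 + 2)^2; the minimizer is (1, -2).
if __name__ == "__main__":
    f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    print(descent_method(f, [5.0, 5.0]))
```

The step-halving loop is only a placeholder: the line search methods discussed later in this lecture are what would normally be used in step 3.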
Structure of optimization methods
Typically
– constraint handling converts the problem to (a series of) unconstrained problems,
– in unconstrained optimization a search direction is determined at each iteration,
– the best solution in the search direction is found with line search.
Constraint handling → unconstrained optimization → line search.

Local optimization methods
Find a (closest) local optimum. Fast. Usually utilize derivatives. Mathematical convergence.
For example
– direct search methods (pattern search, Hooke & Jeeves, Nelder & Mead, …),
– gradient based methods (steepest descent, Newton's method, quasi-Newton methods, conjugate gradient, SQP, interior point methods, …).

Global optimization methods
Try to get as close to the global optimum as possible. No mathematical convergence. Do not assume much of the problem. Slow, use lots of function evaluations. Heuristic, contain randomness.
The most well known are nature-inspired methods (TIES451 Selected topics in soft computing)
– based on improving a population of solutions at a time instead of a single solution.

Hybrid methods
Combination of global and local methods.
Try to combine the benefits of both – a rough estimate with a global method, fine tuning with a local method.
Challenge: how should the methods be combined? E.g. when to switch from global to local? (speed vs. accuracy)

Line search
What did you find out about line search?

Line search
The idea of line search is to optimize a given function with respect to a single variable.
Optimization algorithms for multivariable problems iteratively generate search directions in which better solutions are sought – line search is used to find them!
An exact minimum is not required, only an approximation of it within a given tolerance ε > 0
– it is enough to know that x* ∈ [a*, b*] where b* − a* < ε.

Optimality conditions
Necessary: Let f: ℝ → ℝ be differentiable. If x* is a local minimizer, then f′(x*) = 0. In addition, if f is twice continuously differentiable and x* is a local minimizer, then f″(x*) ≥ 0.
Sufficient: Let f: ℝ → ℝ be twice continuously differentiable. If f′(x*) = 0 and f″(x*) > 0, then x* is a strict local minimizer.

Examples
f(x) = (x − 2)² − 4, f′(x) = 2x − 4, f″(x) = 2.
If x* = 2, then both the necessary and sufficient optimality conditions are satisfied.

f(x) = (x − 2)³ − 4, f′(x) = 3(x − 2)², f″(x) = 6x − 12.
If x* = 2, then the necessary optimality conditions are satisfied although x* = 2 is not a local minimizer – it is a saddle point. The sufficient optimality conditions are not satisfied at x* = 2.

Note on optimality conditions
If f is not differentiable, then a local minimizer can be at a point where f is 1) not differentiable or 2) discontinuous. Example: f(x) = |x|.
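As a quick numerical companion to the two examples, here is a small Python sketch that checks the first and second order conditions at x* = 2 using the hand-written derivatives above; the helper name check_optimality and the tolerance are illustrative only.

```python
# Check the optimality conditions at x* = 2 for the two example functions,
# using the derivatives written out on the slides (illustrative helper name).
def check_optimality(f1, f2, x_star, tol=1e-9):
    """Return (necessary, sufficient) for the conditions at x_star."""
    necessary = abs(f1(x_star)) < tol and f2(x_star) >= -tol   # f'(x*) = 0 and f''(x*) >= 0
    sufficient = abs(f1(x_star)) < tol and f2(x_star) > tol    # f'(x*) = 0 and f''(x*) > 0
    return necessary, sufficient

# f(x) = (x - 2)^2 - 4: f'(x) = 2x - 4, f''(x) = 2 -> (True, True)
print(check_optimality(lambda x: 2 * x - 4, lambda x: 2.0, 2.0))

# f(x) = (x - 2)^3 - 4: f'(x) = 3(x - 2)^2, f''(x) = 6x - 12 -> (True, False), saddle point
print(check_optimality(lambda x: 3 * (x - 2) ** 2, lambda x: 6 * x - 12, 2.0))
```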
Finding a unimodal interval
Most line search methods assume that the search is started from a unimodal interval [a, b].
f is unimodal in [a, b] if there is exactly one x* ∈ [a, b] s.t. for all x₁, x₂ ∈ [a, b] for which x₁ < x₂ it holds that
– if x₂ < x*, then f(x₁) > f(x₂), and
– if x₁ > x*, then f(x₁) < f(x₂).

Search with fixed steps
Let (A, B) be the interval where we want to find a minimum of f.
Compute the values of f at n equally spaced points xⁱ in (A, B): xⁱ = A + (i/(n + 1))(B − A), i = 1, …, n.
When points xⁱ, xⁱ⁺¹ and xⁱ⁺² are found s.t. f(xⁱ) > f(xⁱ⁺¹) < f(xⁱ⁺²), we know that there exists at least one local minimizer in (xⁱ, xⁱ⁺²).
The interval can then be further reduced.

Line search methods
Assume that f is unimodal in [a, b].
The general idea is to reduce the interval [a, b] so that the minimizer is still included in it.
An approximation of the minimizer is found when the length of the interval is smaller than a pre-determined tolerance.
Line search methods can be divided into
– elimination methods,
– interpolation methods (often use derivatives).

The method of bisection
Elimination method.
1) Choose a small but significant constant 2ε > 0 and an allowable length L > 0 for the final interval. Let [a₁, b₁] be the original (unimodal) interval. Set h = 1.
2) If b_h − a_h < L, stop: the minimizer x* ∈ [a_h, b_h]. Otherwise, compute the values of f at x^h = (a_h + b_h)/2 − ε and y^h = (a_h + b_h)/2 + ε.
3) If f(x^h) < f(y^h), set a_{h+1} = a_h and b_{h+1} = y^h. Otherwise, set a_{h+1} = x^h and b_{h+1} = b_h. Set h = h + 1 and go to step 2).

The method of bisection (cont.)
Efficiency:
– The length of the interval after h iterations is (1/2^h)(b₁ − a₁) + 2ε(1 − 1/2^h).
– The number of iterations required if the final length should be L is (why?) h = −ln((L − 2ε)/(b₁ − a₁ − 2ε)) / ln 2.
– For each iteration, the objective function is evaluated 2 times (at x^h and y^h) → in total 2h evaluations.

Golden section
Assume that we want to separate a sub-interval (of length y) from an interval of length L such that L/y = y/(L − y).
Then y = ((√5 − 1)/2) L ≈ 0.618 L.
It is said that the interval is then divided in the ratio of the golden section.
Theorem: Divide an interval [a, b] in the ratio of the golden section, first from the right (point c) and then from the left (point d). Then point d divides the interval [a, c] in the ratio of the golden section and point c does the same for [d, b].

Golden section search
Elimination method, closely related to Fibonacci search. Let C = (√5 − 1)/2.
1) Choose an allowable length L > 0 for the final interval. Let [a₁, b₁] be the original (unimodal) interval. Set x¹ = a₁ + (1 − C)(b₁ − a₁) = b₁ − C(b₁ − a₁) and y¹ = a₁ + C(b₁ − a₁). Compute f(x¹) and f(y¹). Set h = 1.
2) If b_h − a_h < L, stop: the minimizer x* ∈ [a_h, b_h]. Otherwise, if f(x^h) ≤ f(y^h), go to step 4).
3) Set a_{h+1} = x^h and b_{h+1} = b_h. Further set x^{h+1} = y^h and y^{h+1} = a_{h+1} + C(b_{h+1} − a_{h+1}). Compute f(y^{h+1}) and go to step 5).
4) Set a_{h+1} = a_h and b_{h+1} = y^h. Further set y^{h+1} = x^h and x^{h+1} = a_{h+1} + (1 − C)(b_{h+1} − a_{h+1}). Compute f(x^{h+1}).
5) Set h = h + 1 and go to step 2).
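A minimal Python sketch of the method of bisection described above, assuming f is unimodal on [a, b]; the name bisection_search and the default tolerances are illustrative, and 2ε must be smaller than L for the loop to terminate.

```python
# Sketch of the bisection (elimination) method for a unimodal function on [a, b].
def bisection_search(f, a, b, L=1e-4, eps=1e-5):
    """Shrink the unimodal interval [a, b] until its length is below L."""
    while b - a >= L:
        mid = 0.5 * (a + b)
        x, y = mid - eps, mid + eps          # two evaluation points, 2*eps apart
        if f(x) < f(y):
            b = y                            # minimizer lies in [a, y]
        else:
            a = x                            # minimizer lies in [x, b]
    return a, b                              # x* is somewhere in [a, b]

# f(x) = (x - 2)^2 - 4 is unimodal on [0, 5] with minimizer x* = 2.
print(bisection_search(lambda x: (x - 2) ** 2 - 4, 0.0, 5.0))
```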
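Similarly, a minimal Python sketch of the golden section search described above; note how only one new function value is computed per iteration because one of the interior points x^h, y^h is reused, which is exactly the point of the golden section ratio. The name golden_section_search is illustrative.

```python
# Sketch of the golden section search for a unimodal function on [a, b].
def golden_section_search(f, a, b, L=1e-6):
    """Shrink the unimodal interval [a, b] until its length is below L."""
    C = (5 ** 0.5 - 1) / 2                   # golden section ratio, about 0.618
    x = a + (1 - C) * (b - a)
    y = a + C * (b - a)
    fx, fy = f(x), f(y)
    while b - a >= L:
        if fx <= fy:                         # keep [a, y] (step 4)
            b, y, fy = y, x, fx
            x = a + (1 - C) * (b - a)
            fx = f(x)
        else:                                # keep [x, b] (step 3)
            a, x, fx = x, y, fy
            y = a + C * (b - a)
            fy = f(y)
    return a, b

# Same test problem as before; the returned interval should contain x* = 2.
print(golden_section_search(lambda x: (x - 2) ** 2 - 4, 0.0, 5.0))
```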
Golden section search (cont.)
Efficiency:
– The length of the interval after h iterations is C^h (b₁ − a₁).
– The number of iterations required if the final length should be L is (why?) h = ln(L/(b₁ − a₁)) / ln C.
– For each iteration (except the last), the objective function is evaluated one time (at x^{h+1} or y^{h+1}) plus at the beginning in two points (x¹ and y¹) → in total h + 1 evaluations.

Quadratic interpolation
The idea is to approximate f with a quadratic polynomial whose minimizer is known.
Taylor's second order polynomial is used: q(x) = f(x^h) + f′(x^h)(x − x^h) + ½ f″(x^h)(x − x^h)².
If f″(x^h) ≠ 0, then q(x) has a critical point at x^{h+1}, obtained from q′(x^{h+1}) = 0 → x^{h+1} = x^h − f′(x^h)/f″(x^h).
This is Newton's method for solving f′(x) = 0!
Interpolation can also be applied in the case where no derivatives are available (find out the idea by yourself).

Programming assignment
Form the pairs!
Start programming by implementing some line search method. Any programming language is ok.
Test your implementation with some optimization problems where you know the minimizer.

Topic of the lectures on January 20th & 22nd
Mon, Jan 20th: unconstrained optimization with multiple variables, optimality conditions and methods that don't utilize gradient information (= direct search methods).
Wed, Jan 22nd: methods that utilize gradient information.
Study this before the lecture! Questions to be considered:
– What kind of optimality conditions exist?
– What kind of techniques do direct search methods use to find a local minimizer?
– How is gradient information utilized?
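As a possible starting point for the programming assignment, here is a minimal Python sketch of the quadratic interpolation / Newton idea from the slides, using finite differences in place of analytic derivatives; the name newton_line_search and the tolerances are illustrative, and the test problem is one whose minimizer (x* = 2) is known.

```python
# Sketch of the Newton / quadratic interpolation iteration
# x_{h+1} = x_h - f'(x_h) / f''(x_h), with finite-difference derivatives.
def newton_line_search(f, x0, tol=1e-8, h=1e-5, max_iter=100):
    """Search for a critical point of f; converges to a minimizer when f'' > 0."""
    x = float(x0)
    for _ in range(max_iter):
        d1 = (f(x + h) - f(x - h)) / (2 * h)            # approximate f'(x)
        d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2  # approximate f''(x)
        if abs(d2) < 1e-12:
            break                                       # quadratic model is degenerate
        step = d1 / d2
        x -= step
        if abs(step) < tol:
            break
    return x

# Test on a problem with a known minimizer: f(x) = (x - 2)^2 - 4, x* = 2.
print(newton_line_search(lambda x: (x - 2) ** 2 - 4, x0=0.0))
```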