Mathematics

Performance of Numerical Optimization Routines

Sponsoring Faculty Member: Dr. Jon Ernstberger
Kayla S. Cline

Introduction

Nonlinear iterative optimization estimates a root of a function by refining an initial estimate, possibly subject to certain constraints. We also use nonlinear optimization to achieve more accurate fits of models to data by approximating roots of objective functions such as

J(q) = \sum_i [ f(x_i, q) - \hat{y}(x_i) ]^2,    (1)

where q is an initial estimate of the parameters of a mathematical model f(x_i, q) and \hat{y}(x_i) is its corresponding data point. We will explore different methods for optimization and compare them for efficiency and accuracy. First, we examine the effectiveness of the Newton, Secant, and Chord methods for approximating roots with respect to the type of function and the accuracy of the initial guess. Then, we briefly explore the application of built-in optimization tools in MATLAB, in particular fminsearch, a Nelder-Mead based method.

Newton-Based Root Finders

For a function f(x), a root is defined as a value x* such that f(x*) = 0. Iterative, Newton-based methods can be used to solve for a root of a given function. We can apply these techniques to optimization problems by iteratively estimating an input parameter to a model in order to approximate a root of a related objective function.

Derivation of Newton's Method

Consider the Taylor polynomial for a function f(x):

f(x) = f(x_0) + f'(x_0)(x - x_0) + E(x),    (2)

where E(x) is the error term [1], defined by

E(x) = \frac{f''(\xi)}{2!}(x - x_0)^2

for some \xi between x_0 and x. We will use this polynomial as a starting point to develop an iterative method for estimating the root of the function. As x approaches a root, the higher-order terms of the Taylor polynomial approach zero, so we may treat them as negligible. Truncating these terms gives a linear approximation to f(x),

f(x) \approx f(x_0) + f'(x_0)(x - x_0).    (3)

Since we are solving for a root, we set this approximation equal to 0 and solve for x:

x = x_0 - \frac{f(x_0)}{f'(x_0)}.    (4)

We now have an equation that approximates the value x at which f(x) equals zero, given an initial estimate x_0. From this, we can derive the general form of the Newton-Raphson Method,

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.    (5)

This creates an iterative process that produces a closer root approximation with each iteration.

Quadratic Example

We now approximate a root of a function f(x) using Newton's Method. Consider the quadratic equation f(x) = x^2 - x - 6, whose graph is shown in Figure 1(a). We will iterate Newton's Method to approximate a zero of f(x). We initialize the method by choosing x_0 = 6 and take one Newton step:

x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 6 - \frac{24}{11} \approx 3.82.    (6)

Figure 1: (a) Graph of f(x) = x^2 - x - 6 and (b) Newton's Method for the Quadratic Example.

We now have a closer approximation to the root of the function. We repeat the same process using x_1 as our new estimate and continue until we converge to the value of x at the root of f. Newton's Method successfully converges to the root x = 3.

Since Newton's Method is derivative based, the derivative at the initial estimate may determine the root to which the method converges. The derivative at the initial estimate determines where the tangent line places the next approximation, which in turn determines the direction in which Newton's Method approaches a root. The convergence to x = 3, rather than to the root x = -2, follows from the fact that our initial estimate x_0 = 6 was closer to x = 3 than to x = -2.
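The iteration above is straightforward to carry out directly. The following MATLAB sketch applies Equation (5) to f(x) = x^2 - x - 6 with x_0 = 6; the function handles, tolerance, and iteration cap are illustrative choices and not the implementation distributed with this paper.

```matlab
% Minimal sketch of Newton's Method for f(x) = x^2 - x - 6, starting at x0 = 6.
f  = @(x) x.^2 - x - 6;     % function whose root we seek
fp = @(x) 2*x - 1;          % analytic derivative f'(x)

x   = 6;                    % initial estimate x0
tol = 1e-6;                 % stop once |f(x)| is this small (illustrative tolerance)
for n = 1:50                % cap the number of iterations
    x = x - f(x)/fp(x);     % Newton step, Equation (5)
    fprintf('x_%d = %.6f\n', n, x);
    if abs(f(x)) < tol      % sufficiently close to a root
        break
    end
end
```

Starting from x_0 = 6, the printed iterates decrease toward x = 3, consistent with the convergence described above.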
Convergence Criteria

According to C. T. Kelley in Iterative Methods for Optimization [3], several criteria must be met in order to guarantee convergence to a root. The criteria for convergence to a root x* on [a, b] are:

1. The initial estimate x_0 lies in the interval [a, b]. An initial estimate sufficiently far outside [a, b] may result in divergence because of the difference in derivative values.
2. The function f is twice continuously differentiable (f \in C^2[a, b]). For convergence, the derivatives must exist with no continuity issues.
3. The derivative of the function at x* is nonzero (f'(x*) \neq 0), so that we never divide by zero and numerical precision is maintained.
4. A root x* exists for the function, so that f(x*) = 0.

Failure to meet even one of these criteria may cause the algorithm for Newton's Method to fail.

Problems with Newton's Method

While Newton's Method is the current state of the art for approximating roots of functions, several problems may arise depending on the initial estimate or the type of function. Consider functions for which the derivative f'(x) approaches zero near the root. As the derivative evaluations become smaller, we divide by numbers ever closer to zero, so Newton's Method may diverge and never reach the true root. This is demonstrated by the function f(x) = e^x. For this example, Newton's Method never converges to a root because no root exists. Algebraically, the iteration simplifies to

x_{n+1} = x_n - \frac{e^{x_n}}{e^{x_n}} = x_n - 1,

so the algorithm ultimately diverges to negative infinity. Other functions, such as f(x) = arctan(x), shown in Figure 2, will diverge if the initial estimate is not extremely close to the actual root of the function.

Figure 2: Graph of f(x) = arctan(x).

Different types of functions can also cause a breakdown of quadratic convergence in Newton's Method. Consider the function f(x) = x^2, which has a double root at x = 0. The general Newton iteration becomes

x_{n+1} = x_n - \frac{x_n^2}{2x_n} = \frac{x_n}{2},

which generates a slowed, linear convergence rate. For functions containing a root x = p of multiplicity m, where m > 1, a modification to the Newton step from Equation (5) is given as

x_{n+1} = x_n - m\,\frac{f(x_n)}{f'(x_n)}.    (7)

To ensure quadratic convergence for functions containing repeated roots without knowing the multiplicity in advance, we can instead rewrite the Newton step in terms of u(x) = f(x)/f'(x), which has a simple root wherever f has a repeated root, giving a modified version of Newton's Method:

x_{n+1} = x_n - \frac{u(x_n)}{u'(x_n)} = x_n - \frac{f(x_n)\,f'(x_n)}{[f'(x_n)]^2 - f(x_n)\,f''(x_n)}.    (8)

This does not resolve every problem that arises when f'(x*) = 0. However, it does ensure that functions with repeated roots maintain a quadratic convergence rate.
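The effect of the multiplicity correction is easy to see on the double root of f(x) = x^2. The sketch below compares the plain step of Equation (5) with the corrected step of Equation (7); the handles and the hard-coded multiplicity m = 2 are specific to this illustrative example.

```matlab
% Sketch comparing the plain Newton step with the multiplicity-corrected
% step of Equation (7) on f(x) = x^2, which has a double root at x = 0.
f  = @(x) x.^2;
fp = @(x) 2*x;
m  = 2;                               % known multiplicity of the root
x  = 1;                               % initial estimate

% Plain Newton step: the error is only halved each iteration (linear rate).
for n = 1:5
    x = x - f(x)/fp(x);
    fprintf('plain Newton,    n = %d: x = %.6f\n', n, x);
end

% Multiplicity-corrected step, Equation (7): one step lands on the root exactly.
x = 1;
x = x - m*f(x)/fp(x);
fprintf('modified Newton, n = 1: x = %.6f\n', x);
```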
Algorithmic Improvements

Several alterations can be made to Newton's Method to produce a better algorithmic implementation. Simple algorithmic constraints can be added to decrease runtime; these constraints improve the runtime of Newton's Method and also provide a way to terminate the algorithm if divergence occurs. Further, variations on the Newton step can avoid extra function evaluations, which may result in faster convergence to the root.

Algorithmic Enhancements

There are three main modifications we can add to the algorithm to ensure that we do not continue to iterate once the estimate of the root is sufficient. We can set a maximum number of iterations so that, if Newton's Method diverges, we terminate the iteration and output an error message. This allows the user to diagnose problems with the defined function while using computer resources efficiently.

We must also understand what it means to converge to a root. If Newton's Method is converging to the root x* = 3, then at the eighth iteration we may have x_8 = 3.00001 but may never reach the true root x* = 3 exactly. However, if the next iteration from x_8 is an extremely small step toward the root, we can say that we have sufficiently converged to the root value. This is enforced by terminating the algorithm when the absolute difference of successive input values is small,

|x_{n+1} - x_n| < \tau_x,    (9)

where \tau_x is an extremely small value (e.g., 1 \times 10^{-6}). We can place the same type of constraint on the function evaluation. As Newton's Method converges to the root value, the function evaluation converges to f(x*) = 0. Therefore, we can set a tolerance such that the algorithm terminates when the function evaluation is close enough to zero,

|f(x_n)| < \tau_f.    (10)

The combined effect of these convergence inequalities ensures that a sufficient number of iterations are completed.

Modification of the Newton Step

Several iterations, each requiring a function evaluation and a derivative evaluation, can be time consuming. This is especially true when the calculations involve several multiplications, divisions, or nonlinear function calls, each taking extra computational time. One remedy is the Secant Method: instead of calculating the derivative at each iteration, we use function evaluations to approximate the derivative. For our examples, we approximate f'(x) using a forward difference,

f'(x) \approx \frac{f(x + h) - f(x)}{h},    (11)

where h is small. This creates a secant line instead of the tangent line used in Newton's Method. This potentially less expensive approximation to the derivative may result in faster convergence to the local root of the function.

Function evaluations may require significant time in these algorithms. If a curve is continuously differentiable, then as we approach a root the curve is either increasing or decreasing, so the derivative should maintain the same sign as the iterations of Newton's Method approach the root value. The Chord Method is a modified Newton's Method that takes advantage of these similar derivative values: instead of calculating f'(x) at every iteration, f'(x) is recalculated only every mth iteration, where m is a number such as m = 50. This may result in a greater number of iterations with fewer derivative calculations, and potentially quicker convergence to an approximation of the root. A sketch of both variants appears below.
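The following sketch shows how the forward-difference derivative of Equation (11) and the chord strategy of reusing a derivative value can be dropped into the Newton loop. The example function, tolerances, iteration caps, and the refresh rule are illustrative choices, not the implementation whose timings are reported in the next section.

```matlab
% Sketch of the derivative-saving variants, applied to f(x) = x^2 - x - 6.
f   = @(x) x.^2 - x - 6;
x0  = 6;                              % initial estimate
h   = 1e-7;                           % forward-difference step, Equation (11)
m   = 50;                             % chord: refresh the derivative every m iterations
tol = 1e-6;

% Secant-style step: replace f'(x) with a forward-difference approximation.
x = x0;
for n = 1:100
    dapprox = (f(x + h) - f(x)) / h;  % Equation (11)
    x = x - f(x)/dapprox;
    if abs(f(x)) < tol, break, end
end
fprintf('secant-style estimate: %.6f after %d iterations\n', x, n);

% Chord-style step: reuse one derivative value for m iterations at a time.
fp = @(x) 2*x - 1;                    % analytic derivative of the example function
x  = x0;
d  = fp(x);
for n = 1:1000
    if mod(n, m) == 0
        d = fp(x);                    % refresh every mth iteration
    end
    x = x - f(x)/d;
    if abs(f(x)) < tol, break, end
end
fprintf('chord-style estimate:  %.6f after %d iterations\n', x, n);
```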
Comparison of Newton, Secant, and Chord Methods

We now run each of these methods on the same function for the purpose of comparison. Consider the cubic function

f(x) = (x - 1)^3 - 4x + 4,    (12)

with roots at x* = -1, 1, and 3. Let the initial estimate be x_0 = 30. Because of the derivative evaluation and the tangent line at the initial estimate, we expect the methods to converge to the root x = 3. When each method is implemented, Newton's Method and the Secant Method converge to x = 3, each with different runtimes, iteration counts, and function evaluations, as shown in Table 1. The first three iterations of Newton's Method for this function are shown in Figure 3.

Figure 3: Newton's Method for f(x) = (x - 1)^3 - 4x + 4.

There are several differences in the runtimes and numbers of iterations between the methods. The convergence of Newton's Method took approximately twice as long as the Secant Method. The Chord Method had the longest runtime, due to the divergence of the method. The Secant Method converged in the same number of iterations as Newton's Method, while the Chord Method required far fewer derivative evaluations but many more iterations. Not all three methods were able to converge to the root x = 3: the Chord Method diverged and returned "Not a Number" due to a lack of numerical precision. Table 1 demonstrates that the convergence of the Chord Method may depend on the type of function.

Specification              Newton     Secant     Chord
Run Time (Seconds)         0.000837   0.000360   0.00143
Total Iterations           13         13         48
Derivative Calculations    13         13         3
Root Found                 x = 3      x = 3      Not a Number

Table 1: Results with x_0 = 30 for f(x) = (x - 1)^3 - 4x + 4.

Fitting Models to Data Using MATLAB Methods

Several of these methods can be used to locate the minimum of a multidimensional objective function (i.e., a function from R^N to R) in order to fit a model to data. Several such methods are included in MATLAB. We will focus on the function fminsearch, which uses the Nelder-Mead algorithm and requires an initial estimate of the model parameters in order to locate the minimum of an objective function defined in a manner similar to

J(q) = \sum_i [ f(x_i, q) - \hat{y}(x_i) ]^2.    (13)

From the initial estimate, fminsearch creates two new estimates to form a simplex, such as the one shown in Figure 4 for a two-dimensional plane.

Method

A simplex is a geometric object with N + 1 vertices in a space of N dimensions. The objective function is evaluated at each of the N + 1 vertices created by fminsearch. The vertex with the optimal value of the cost function is used to form a line segment in a determined direction, and the objective function is evaluated at several points along that segment. A new simplex is formed using the point at which the objective function takes its lowest value, and the process is repeated. For fminsearch, this continues until the new line segment formed in each iteration reaches a determined minimum length. Figure 4 exemplifies the Nelder-Mead algorithm converging to a point in a two-dimensional space [4].

Figure 4: (a) One Iteration of Nelder-Mead and (b) Several Iterations Converging to (3, 2).

Problems with Nelder-Mead

Although this is an efficient algorithm, convergence to a minimum of the objective function depends directly on the accuracy of the initial parameter estimate. By the nature of the algorithm, it converges to the nearest minimum of the objective function, which may be only a local minimum rather than a global minimum; it therefore may not locate the optimal parameters for the model. Finally, fminsearch does not account for any constraints on the model parameters, which limits its use and can result in nonsensical parameter estimates.

Application

Consider the output generated by y(x_i) = x_i^3 + 2x_i^2 + 3x_i + 4. We will use this to create an optimization problem that demonstrates the effectiveness of the algorithm. We generate several data points to which the cubic model is a perfect fit. We then write the model in terms of the parameters q = [a, b, c, d] as f(x; q) = q(1)x^3 + q(2)x^2 + q(3)x + q(4). We want to show that fminsearch will take an incorrect initial estimate of the parameters and match them to our original coefficients. Let the initial estimate be q_0 = [5, 2, 7, 8] and the objective function be

J(q) = \sum_i [ f(x_i; q) - \hat{y}(x_i) ]^2,    (14)

where \hat{y}(x_i) is the generated data point corresponding to x_i.
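A minimal sketch of how such a fit can be set up with fminsearch follows. The sampling grid, option defaults, and variable names are illustrative assumptions rather than the exact script that produced the timings reported below.

```matlab
% Sketch of fitting the cubic model with fminsearch, assuming data generated
% from y(x) = x^3 + 2x^2 + 3x + 4 on an illustrative grid of x values.
x    = linspace(-5, 5, 50)';                    % sample points (illustrative choice)
yhat = x.^3 + 2*x.^2 + 3*x + 4;                 % "perfect" data from the true model

model = @(q, x) q(1)*x.^3 + q(2)*x.^2 + q(3)*x + q(4);
J     = @(q) sum( (model(q, x) - yhat).^2 );    % objective function, Equation (14)

q0 = [5, 2, 7, 8];                              % incorrect initial parameter estimate
[qf, Jf] = fminsearch(J, q0);                   % Nelder-Mead search for the minimizer
fprintf('q = [%.3f %.3f %.3f %.3f], J = %.3g\n', qf, Jf);
```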
The objective function evaluation at the initial estimate is J(q_0) = 1.80 \times 10^{13}. However, after a search time of 1.93 seconds, the algorithm reports optimal parameter values of q_f = [0.999, 2.000, 3.000, 3.999]. At these values, the objective function evaluation is J(q_f) = 4.90 \times 10^{-5}. Adjusting the constraints on the maximum number of iterations and the tolerance values can affect the accuracy of the output parameters.

Conclusion

Nonlinear optimization involves iterative processes that include root finding and data fitting. Newton-based root finders offer an efficient method for locating the root of a function; however, the speed and accuracy of these methods depend on the initial estimates and the types of functions involved. Newton-based methods, as well as the Nelder-Mead algorithm, can be applied to mathematical models to achieve more accurate fits to data. These methods can be applied in research throughout many science, technology, engineering, and mathematics fields.

References

[1] Kendall E. Atkinson. "The Taylor Polynomial Error Formula". 2003.
[2] Richard L. Burden and John Douglas Faires. Numerical Analysis. Cengage Learning, 1993.
[3] C. T. Kelley. Iterative Methods for Optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999.
[4] John H. Mathews and Kurtis K. Fink. Numerical Methods Using MATLAB. Prentice-Hall, Upper Saddle River, NJ, 2004.
[5] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, New York, NY, 2007.

MATLAB Implementation of Algorithms

Implementations of these algorithms in MATLAB can be found at: http://home.lagrange.edu/jernstberger/kscline/kscline_code.zip