An algorithm for finding a solution of simultaneous nonlinear equations

by R. H. HARDAWAY
Collins Radio Company
Dallas, Texas

INTRODUCTION

In many practical problems the need for a solution of a set of simultaneous nonlinear algebraic equations arises. The problems vary greatly from one discipline to another, but the basic mathematical formulation remains the same. A general digital computer solution for all sets of simultaneous nonlinear equations does not seem to exist at the present time; however, several recent techniques make the solution of certain systems more feasible than in the past.

The algorithm described here is a slight variation of one of the methods described by C. G. Broyden¹ in "A Class of Methods for Solving Nonlinear Simultaneous Equations." This modified version of Newton's method converges quadratically for a convex space. It includes Broyden's technique of approximating the initial Jacobian and then updating its inverse at each step rather than recomputing and reinverting the Jacobian at each iteration. A procedure is given which helped to circumvent the difficulty of an initially singular Jacobian in several test cases.

The examples given include applications in several engineering fields. A simple hydraulic network and the equivalent nonlinear resistive network are given to show the identical mathematical formulation. Applications to the stress analysis of a cable, to the analysis of a hydraulic network, to optimal control problems, to the determination of nonlinear stability domains and to statistical modeling are mentioned as examples of usage.

Statement of the problem

Let a system of n nonlinear equations in n unknowns be given as

$$f_1(x_1, x_2, \ldots, x_n) = 0$$
$$f_2(x_1, x_2, \ldots, x_n) = 0$$
$$\cdots$$
$$f_n(x_1, x_2, \ldots, x_n) = 0 \qquad (1)$$

This may be represented more concisely in vector notation as

$$f(x) = 0 \qquad (2)$$

where $x$ is a column vector of independent variables and $f$ is a column vector of functions. A solution of the system of n equations is a vector $x$ which satisfies each $f_j$ in the system simultaneously.

Newton's method

In Newton's method for a one dimensional case, the iterative procedure is given by

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (3)$$

The need for a good initial estimate, the computation of the derivative, and failure to find multiple roots are usually cited as major disadvantages of the method. However, from a "good" initial estimate, convergence of the method has been frequently proven and has been shown to be quadratic.
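The scalar iteration in equation (3) is only a few lines of code. The sketch below is ours (Python rather than the Fortran V used later in the paper), with hypothetical names; it illustrates the dependence on a good initial estimate and on the derivative:

```python
def newton_1d(f, fprime, x0, tol=1e-10, max_iter=50):
    """Iterate equation (3): x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)   # fails where f'(x) = 0, one cited weakness
        x = x - step
        if abs(step) < tol:       # near a simple root the convergence is quadratic
            return x
    raise RuntimeError("no convergence from this initial estimate")

# Example: the positive root of f(x) = x^2 - 5, converging to sqrt(5) = 2.2360679...
root = newton_1d(lambda x: x * x - 5.0, lambda x: 2.0 * x, x0=2.0)
```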
For an n dimensional system we extend this notation. The initial limitations and the advantage of quadratic convergence are maintained. For proofs of the convergence, see Henrici's Elements of Numerical Analysis.² Rather than a single independent variable, we have an n dimensional vector of independent variables $x$. An n dimensional vector of functions $f$ replaces the single function equation, and the Jacobian replaces the derivative of the function. The Jacobian is defined as an (n x n) matrix composed of the partial derivatives of the functions with respect to the various components of the $x$ vector. A general term of the Jacobian $J$ may be denoted as $a_{ij} = \partial f_j / \partial x_i$, where $f_j$ and $x_i$ are components of the $f$ and $x$ vectors, respectively. Since in equation (3) the derivative appears in the denominator, we must consider the inverse of the Jacobian evaluated at $x_i$. This will be denoted as $J_i^{-1}$.

Therefore, for an n dimensional case, Newton's method may be rewritten as

$$x_{i+1} = x_i - J_i^{-1} f(x_i) \qquad (4)$$

The obvious difficulties of the method lie in choosing an initial vector $x_0$ and in computing and evaluating the inverse Jacobian. Despite the oversimplification, it will be assumed that enough knowledge of the system exists to enable one to make a "good" initial guess for $x_0$. The Jacobian is considerably more difficult. With the method that was presented by Broyden, not only can an initial approximation to the Jacobian be used, but also at each iteration the inverse Jacobian may be updated rather than recomputed.

Broyden's variation of Newton's method

Notation

$x_i$                      i'th approximation to the solution
$f_i = f(x_i)$             set of functions evaluated at $x_i$
$J_i$ or $J_i^{-1}$        Jacobian or its inverse evaluated at $x_i$
$A_i$ or $A_i^{-1}$        i'th approximation to the Jacobian or its inverse, evaluated at $x_i$
$t_i$                      scalar chosen to prevent divergence
$y_i = f_{i+1} - f_i$      the difference in $f_{i+1}$ and $f_i$
$p_i = -A_i^{-1} f_i$      the negative product of $A_i^{-1}$ and $f_i$
$p_i^T$                    the transpose of $p_i$

Method

It is assumed that an initial approximation of the Jacobian $A_0$ exists. The iterative procedure seeks to find a better approximation as it also seeks to find the solution of the system. In this process the function vector $f$ will tend to zero and indicate the convergence of the system. If we let $p_i = -A_i^{-1} f_i$ as given above, then equation (4) becomes

$$x_{i+1} = x_i + t_i p_i \qquad (5)$$

where $t_i$ is a scalar chosen to prevent the divergence of the iterative procedure. The value of $t_i$ is chosen so that the Euclidean norm of $f_{i+1}$ is less than or equal to the Euclidean norm of $f_i$; hence, convergence is not ensured but divergence is prevented. A complete discussion of the backward differencing method used is given in a following paragraph.
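A single step of the method, in code, is just the assembly of these quantities. The sketch below uses our own names (A_inv for $A_i^{-1}$) and assumes $f$ returns a numpy vector; the caller keeps $t_i = 1$ when the new norm does not exceed the old one, and otherwise reduces it, as described above:

```python
import numpy as np

def broyden_step(f, x, f_x, A_inv, t=1.0):
    """One application of equation (5): x_{i+1} = x_i + t_i p_i, p_i = -A_i^{-1} f_i."""
    p = -A_inv @ f_x        # correction direction p_i
    x_next = x + t * p      # equation (5)
    f_next = f(x_next)      # f_{i+1}; convergence shows up as ||f|| tending to zero
    y = f_next - f_x        # y_i = f_{i+1} - f_i, the data for the coming update
    return x_next, f_next, p, y
```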
In order to define the modification method for the inverse Jacobian, we let

$$x' = x_i + t\,p_i \qquad (6)$$

Then the vector of functions $f(x')$ may be considered a function of a single variable $t$. The first derivative of $f$ with respect to $t$ will exist, since the Jacobian has previously been assumed to exist. It follows that

$$\frac{df}{dt} = \frac{\partial f}{\partial x'} \frac{\partial x'}{\partial t} \qquad (7)$$

When we determine $df/dt$, we have established a necessary condition on the Jacobian. To approximate $df/dt$, Broyden suggests a differencing method where each component of $f$ may be expanded as a Taylor series about $s$ as follows:

$$f_i = f(t_i - s) = f_{i+1} - \frac{df}{dt}\,s - \cdots \qquad (8)$$

Disregarding the higher terms, an approximate expression for $df/dt$ is

$$\frac{df}{dt} \approx \frac{f_{i+1} - f_i}{s} = \frac{y_i}{s} \qquad (9)$$

The choice of $s$ is such that the approximation to the derivative is as accurate as possible, but care must be taken that the rounding error introduced in the division is not significant. Since we have chosen Broyden's method of full step reduction, $t_i$ is set equal to $s$ and (9) becomes

$$\frac{df}{dt} \approx \frac{y_i}{t_i} \qquad (10)$$

If we now combine equations (7) and (10) we have another necessary condition which the Jacobian will satisfy,

$$J\,p_i = \frac{y_i}{t_i} \qquad (11)$$

Since $A_i$, an approximation to the Jacobian, exists and we are seeking a better approximation, $J$ is replaced in equation (11) with $A_{i+1}$,

$$A_{i+1}\,p_i = \frac{y_i}{t_i} \qquad (12)$$

This equation gives the relationship between the change in the function vector $f$ and the change in the $x$ vector in the direction $p_i$. Since we have no knowledge of changes in any direction other than $p_i$, we will assume that there is no change in the function vector $f$ in any direction orthogonal to $p_i$. Using this assumption and equation (12), $A_{i+1}$ can be determined uniquely and is expressed as follows:

$$A_{i+1} = A_i + \frac{\left(\frac{y_i}{t_i} - A_i p_i\right) p_i^T}{p_i^T p_i} \qquad (13)$$

Since we actually need the inverse of $A_{i+1}$, Broyden uses a modification given by Householder³ which enables us to obtain a modification formula for the inverse from the information we already have. Householder's formula is

$$(A + xy^T)^{-1} = A^{-1} - \frac{A^{-1} x y^T A^{-1}}{1 + y^T A^{-1} x} \qquad (14)$$

where $A$ and $(A + xy^T)$ are nonsingular matrices and $x$ and $y$ are vectors, all of order n. Therefore,

$$A_{i+1}^{-1} = A_i^{-1} + \frac{(t_i p_i - A_i^{-1} y_i)\, p_i^T A_i^{-1}}{p_i^T A_i^{-1} y_i} \qquad (15)$$

is the modification we have been seeking. This method of updating the inverse of an initial approximate Jacobian was shown by Broyden to give a better approximation as $i$ increases if the terms omitted in the Taylor series expansion are small. Since $s = t_i$ is always chosen to be less than or equal to 1, this condition is satisfied. With this improved approximation of the inverse, a new $p_i$ is computed and the iteration is repeated. As $x_i$ approaches the solution, the convergence becomes quadratic, as Henrici² has shown. The function vector tends to zero and the Jacobian tends to the actual Jacobian evaluated at the solution vector. Any one of several methods can be used to determine the accuracy of a solution.

Since the computation of partial derivatives has been simplified by the approximation procedure, this method presents advantages over those which require explicit evaluation of partial derivatives. The step size at each iteration is such that the norm is reduced rather than minimized. The time spent in evaluating the set of functions repeatedly, and the storage required to save various vectors to determine the minimum, negate the advantage of norm minimization. This method combines the use of initial approximations with an iteration procedure that is computationally simple, to produce an efficient algorithm.
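Equation (15) is a rank-one correction, so the new inverse costs a few matrix-vector products instead of a fresh inversion. A sketch, assuming numpy and names of our choosing:

```python
import numpy as np

def update_inverse(A_inv, p, y, t):
    """Equation (15): A_{i+1}^{-1} = A_i^{-1}
       + (t_i p_i - A_i^{-1} y_i) p_i^T A_i^{-1} / (p_i^T A_i^{-1} y_i)."""
    A_inv_y = A_inv @ y                 # the vector A_i^{-1} y_i
    denom = p @ A_inv_y                 # the scalar p_i^T A_i^{-1} y_i
    return A_inv + np.outer(t * p - A_inv_y, p @ A_inv) / denom
```

A quick algebraic check: multiplying the updated matrix into $y_i$ gives back $t_i p_i$, which is the inverse form of the secant condition (12).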
Computational procedure

Details on the implementation of this algorithm are given in the next paragraphs. The general flow chart in Figure 1 summarizes the procedure.

FIGURE 1-Flow chart (initialize x; evaluate the set of functions; evaluate and invert the Jacobian; choose t to reduce the norm, a procedure in which the vector x and the set of functions are recalculated; exit on convergence; otherwise update the Jacobian inverse and repeat)

The initial vector

The initial vector $x_0$ may be selected arbitrarily or from knowledge of the system. With a nonlinear resistive network, for example, the elemental values could be used. If no knowledge at all exists, a unity vector is generated as $x_0$.

The Jacobian

Since the initial Jacobian may be approximated, several methods are available to compute this matrix. The simplest is to let the inverse equal a given value and not attempt to compute or invert the Jacobian. So much valuable information is lost that, despite the simplicity, this method is not chosen. At the opposite extreme is the possibility of computing all the necessary partial derivatives, evaluating the Jacobian at $x_0$, and then inverting the matrix. The evaluation of the partial derivatives is generally very laborious, so this method was not considered. Further, the assumption that $A_0$ may be approximated makes this degree of precision unnecessary. The third alternative is to approximate the partial derivatives. This will provide better information than the first case and give less computational difficulty than the second case. Since any term of $A$ may be denoted as $a_{ij} = \partial f_j / \partial x_i$, we have

$$a_{ij} = \lim_{h \to 0} \frac{f_j(x_i + h) - f_j(x_i)}{h} \qquad (16)$$

where $f_j$ and $x_i$ are components of $f$ and $x$. For the purpose of evaluating $A_0$, a relatively small value of $h$ is selected and equation (16) is used to generate each element. For the test program, a value for $h$ was computed as .001 of $x_i$.

The inverse Jacobian

When the Jacobian has been evaluated, the next step is to invert it. If the Jacobian is nonsingular, the inversion is performed and the iteration proceeds. A standard Gaussian inversion routine is used. If, however, the Jacobian is singular, the inverse does not exist. The first solution given in the previous section is one method of sidestepping this problem. During the testing of the algorithm, an attempt was made to determine an optimum arbitrary matrix. One choice was to use the results that had been stored in the inverse matrix by the Gaussian elimination procedure at the time the singularity was detected. This will give a reasonably good approximation for some entries in the inverse. At the next step, the modification procedure will improve the initial approximate inverse. This is the matrix that was used if the Jacobian was singular.

Selection of the scalar $t_i$

The Euclidean norm of the vector $f_i$ should approach zero as a solution is reached. When $t_i$ is selected so that the norm of $f_i$ is a nonincreasing function of $i$, divergence is prevented. The first guess for $t_i$ is always +1. If this satisfies the condition that the Euclidean norm of $f_{i+1}(x_i + t_i p_i)$ is nonincreasing with respect to the norm of $f_i$, then $t_i = 1$ is used. If not, then a quadratic minimization procedure similar to one given by Broyden¹ is used. Ten attempts are made to find a good value of $t_i$. At that point the final value of $t_i$ is used and the corresponding modifications are made on the inverse Jacobian and the $f$ and $x$ vectors. The Euclidean norm may not be decreased on such a step, but at the next step the directional change in the correction vector will result in norm reduction with relative ease. In the process of selecting $t_i$, the quantities $f_{i+1}$, $x_{i+1}$ and the norm of $f_{i+1}$ are all computed. These values are all saved for future use in the computational procedure.

Convergence

The necessary degree of accuracy should be determined by the application and specified for each case. The norm of the function vector can be used as a convergence criterion; the absolute value of each element in $f$ can be checked to see how closely it approaches zero; or a comparison between the norm of $p_i$ and the norm of $x_i$ may be used. This last method implies that if the norm of the vector $p_i$ which is used to correct the solution $x_i$ is less than some epsilon times the norm of the solution, then convergence is already attained. Since $p_i = -A_i^{-1} f_i$, an implicit test is made on $f_i$. This last criterion presents an advantage because the prediction vector as well as the function vector is considered, so this was the criterion selected.

If the iterative procedure is not converging, provision must be made to terminate the problem. A value equal to $2n^2$, where n is the order of the system, is computed. The maximum of this value or 50 is used as an upper limit on the number of iterations allowed. Termination is enforced when this number of iterations has been attained, whether or not a solution has been found.

Modification of inverse Jacobian

If the convergence criterion is not satisfied, then the inverse Jacobian is modified according to equation (15). The new values of $x_i$ and $f_i$ that are stored as $x_{i+1}$ and $f_{i+1}$ are placed in the appropriate vectors.

Continuing the iterative process

The value of $i$ is set equal to $i + 1$. If the maximum number of iterations will not be exceeded, the iteration is repeated from the evaluation of $p_i$.

The subprogram

The procedure described was programmed in generalized subroutine form using variable dimensioning and making use of the option of an external subroutine to evaluate the set of functions. The external subroutine can be varied from one application to the next without affecting the main subroutine that performs all the other invariant calculations. As a second option, the subroutine that evaluates the initial approximation to the Jacobian is declared external. The usual approximation is given as follows,

$$a_{ij} \approx \frac{f_j(x_i + h) - f_j(x_i)}{h}, \qquad h = (.001)(x_i) \qquad (17)$$

This approximation has been programmed into a subroutine. However, in any specific case, if a better method of approximation exists, a subroutine which evaluates this approximation may be declared external and replace the first approximation subroutine. Results using the first approximation were so satisfactory that this second option was not utilized in the cases discussed here.
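Collecting the pieces of the computational procedure (the finite-difference $A_0$ with $h = .001\,x_i$, a Gaussian inversion, the search on $t_i$, the $\|p_i\| \le \epsilon \|x_i\|$ convergence test, the $\max(2n^2, 50)$ iteration cap, and the update (15)) gives a routine of roughly the following shape. This is our sketch in Python rather than the paper's Fortran V; simple step-halving stands in for the quadratic minimization of $t_i$, the singular-Jacobian fallback is omitted, and the guard for zero components of $x$ in the differencing is our addition:

```python
import numpy as np

def approx_jacobian(f, x, rel_h=1e-3):
    """Initial A_0 by forward differences, equation (17): h = (.001) x_i
    per component (the guard for zero components is ours)."""
    f0 = f(x)
    n = x.size
    A = np.empty((n, n))
    for i in range(n):
        h = rel_h * x[i] if x[i] != 0.0 else rel_h
        xh = x.copy()
        xh[i] += h
        A[:, i] = (f(xh) - f0) / h       # column i holds the df_j/dx_i terms
    return A

def broyden_solve(f, x0, eps=1e-7):
    x = np.asarray(x0, dtype=float)
    n = x.size
    f_x = f(x)
    A_inv = np.linalg.inv(approx_jacobian(f, x))   # Gaussian inversion step
    for _ in range(max(2 * n * n, 50)):            # iteration cap from the text
        p = -A_inv @ f_x                           # correction vector p_i
        if np.linalg.norm(p) <= eps * np.linalg.norm(x):
            return x                               # test ||p_i|| <= eps ||x_i||
        t, norm_f = 1.0, np.linalg.norm(f_x)       # first guess is always +1
        for _ in range(10):                        # ten attempts, as in the text
            x_new = x + t * p                      # equation (5)
            f_new = f(x_new)
            if np.linalg.norm(f_new) <= norm_f:    # norm nonincreasing: accept
                break
            t *= 0.5        # halving stands in for the quadratic minimization
        y = f_new - f_x                            # y_i
        A_inv_y = A_inv @ y
        A_inv = A_inv + np.outer(t * p - A_inv_y, p @ A_inv) / (p @ A_inv_y)  # (15)
        x, f_x = x_new, f_new
    raise RuntimeError("iteration limit reached without convergence")
```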
Timing and accuracy

The method was programmed in Fortran V on the Univac 1108 for single precision input and output. TABLE I, which follows, reflects the order of the system, the accuracy of the result, the number of iterations and the time required on the Univac 1108 for ten sample problems. Following the table, the defining equations are given for each system.

        Order N   Accuracy   Iterations   Time in sec.
  1.       3       10^-6         19           .02
  2.       3       10^-6         11           .01
  3.       2       10^-5         15           .01
  4.       2       10^-5         11           .01
  5.       2       10^-7         21           .23
  6.       4       10^-6          7           .02
  7.       5       10^-7         10           .02
  8.      10       10^-7         13           .09
  9.      20       10^-7         17           .40
 10.      30       10^-7         18           .90

TABLE I-Timing and accuracy of the subprogram

The defining equations, the initial vector, and the solution obtained in each test case are given as follows.

1.
$$f_1 = x_1^2 + x_2^2 + x_3^2 - 5$$
$$f_2 = x_1 + x_2 - 1 \qquad (18)$$
$$f_3 = x_1 + x_3 - 3$$

The iteration converged to $x = (1,\ 0,\ 2)$, one of the two real solutions of the system. F. H. Deist and L. Sefor.⁴

2. Same as 1, from a different initial vector, converging to the other real solution,
$$x = (1.66667,\ -0.66666,\ 1.33333)$$

3.
$$f_1 = x_1^2 + x_2^2 - 1.0$$
$$f_2 = 0.75\,x_1^3 - x_2 + 0.9 \qquad (19)$$

$$x_0 = (-0.4,\ -0.1), \qquad x = (-0.98170,\ 0.19042)$$

V. A. Matveev.⁵

4. Same as 3.
$$x_0 = (1.3,\ -0.3), \qquad x = (0.35697,\ 0.93412)$$

5.
$$f_1 = 10(x_2 - x_1^2)$$
$$f_2 = 1 - x_1 \qquad (20)$$

The iteration converged to the unique solution $x = (1,\ 1)$. H. H. Rosenbrock.⁶

6.
$$f_1 = 20x_1 - \cos^2 x_2 + x_3 - \sin x_3 - 37$$
$$f_2 = \cos 2x_1 + 20x_2 + \log(1 + x_4^2) + 5$$
$$f_3 = \sin(x_1 + x_2) - x_2 + 19x_3 + \arctan x_3 - 12 \qquad (21)$$
$$f_4 = 2\tanh x_2 + e^{-2x_3^2 + 0.5} + 21x_4 - 1$$

$$x = (1.896513,\ -0.210251,\ 0.542087,\ 0.023885)$$

O. G. Mancino.⁷

7., 8., 9., 10. are all defined by the same general equations,

$$f_1 = -(3 - 0.5x_1)x_1 + 2x_2 - 1$$
$$f_i = x_{i-1} - (3 - 0.5x_i)x_i + 2x_{i+1} - 1, \qquad i = 2, \ldots, n - 1 \qquad (22)$$
$$f_n = x_{n-1} - (3 - 0.5x_n)x_n - 1$$

C. G. Broyden.¹
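As a quick reproduction of one of these cases, test case 3 can be handed either to the broyden_solve sketch above or to a library implementation of Broyden's method; the snippet below uses SciPy's broyden1 (our modern stand-in for the paper's environment) and should recover the printed root:

```python
import numpy as np
from scipy.optimize import broyden1   # a present-day implementation of Broyden's method

def f(x):
    # Test case 3: f1 = x1^2 + x2^2 - 1.0, f2 = 0.75 x1^3 - x2 + 0.9
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0,
                     0.75 * x[0] ** 3 - x[1] + 0.9])

x = broyden1(f, [-0.4, -0.1], f_tol=1e-10)
print(x)   # expected: approximately [-0.98170, 0.19042], matching case 3 above
```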
Application of the subprogram

Equivalent systems

To demonstrate the equivalent mathematical formulation of a set of defining equations for different systems, a very simple example of a dual application will be given. Consider the nonlinear resistive network given in Figure 2. Let the voltage drop across each resistor be $V_i = R_i I_i^{1.85}$. Then from Kirchhoff's laws the system may be completely described by three equations, (23), and $I_1$, $I_2$, and $I_3$ may be determined uniquely.

FIGURE 2-A nonlinear resistive network

As a second example, consider the hydraulic network in Figure 3. Let the pressure drop across each member be expressed as $P_i = (aL_i/D_i^{4.86})\,Q_i^{1.85} = B_i Q_i^{1.85}$, where $a$ is a constant, $L_i$ is the length, $D_i$ is the diameter, and $Q_i$ is the flow. The flow through each member may be determined from the equations, (24), which describe the system.

FIGURE 3-A nonlinear hydraulic network

Equations (23) and (24) are identical in form although they originated from different sources. The solution to either example may be determined by solving the same set of nonlinear equations. From this second example an additional feature of the subprogram may be pointed out. The subroutine that evaluates the set of functions may be dependent on other calculations for the elements in the expressions. When the determination of a single element becomes involved, a subroutine may be used for this calculation. Finally, depending on the states of the system, control may be transferred to various segments of the subprogram and the appropriate operation performed. The versatility of a particular subprogram may be greatly increased in this manner.
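In code, a network of either kind enters the subprogram as nothing more than a residual subroutine. The paper's particular networks are not reproduced here, so the sketch below invents a small hydraulic example of the same form (one supply branch feeding two parallel branches); the coefficients $B_i$ and the pump head $H$ are hypothetical:

```python
import numpy as np

B = np.array([2.0, 5.0, 3.0])   # hypothetical B_i = a L_i / D_i^4.86 for three members
H = 40.0                        # hypothetical pump head

def drop(i, q):
    """Pressure drop P_i = B_i Q_i^1.85, written sign-safely for reversed flow."""
    return B[i] * np.sign(q) * abs(q) ** 1.85

def f(Q):
    """Residuals whose simultaneous zero gives the member flows Q_1, Q_2, Q_3."""
    return np.array([
        Q[0] - Q[1] - Q[2],                  # continuity at the junction
        drop(1, Q[1]) - drop(2, Q[2]),       # parallel members share one drop
        drop(0, Q[0]) + drop(1, Q[1]) - H,   # pump head absorbed around the loop
    ])
```

Read $Q$ as branch currents, $B_i$ as resistance coefficients and $H$ as a source voltage, and the same residual describes the resistive network of Figure 2, which is precisely the equivalence the section illustrates.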
Cable tension program

An application to which this method of solution was applied is the stress analysis of a suspended cable. Analysis for a uniformly loaded cable does not give an accurate picture of what actually happens under environmental conditions such as wind and ice, or for concentrated loads. The cable in Figure 4 is suspended between two points, A and B. The stress conditions may be considered forces acting on the cable at certain positions and are represented by the k weights. By writing the static equilibrium equations for each elemental segment, a set of equations expressing the change in each direction as a result of each load is obtained.

FIGURE 4-Suspended cable under stress

From these equations for each segment, the total stress on the cable in each direction may be determined. The position of B calculated by this method will not coincide with the actual position of B. A set of nonlinear equations expresses the change in the position of B. As this change is minimized, the calculated tension in the cable approaches the actual tension. The loads on the cable may represent any environmental conditions, either singly or multiply. Analysis of the stress performed in this way gives a much more accurate solution than the assumption of a uniformly loaded cable which was previously used.

This problem is a good example of the extended capabilities that are available with the external subroutine to compute the function evaluations for the nonlinear solution subprogram. The external subroutine, which we will call CNTRL, acts as a monitor to determine what operations will be performed in a second subroutine, CABLE. The CABLE subroutine reads the data, computes the function evaluations and calculates the final state tension. CABLE, in turn, calls a second subroutine ELEM to evaluate the components of the equations which are evaluated in CABLE. The complexity of the equations and of the individual elements is such that this approach greatly simplifies the programming.

Applications to functional optimization

Finding a minimum of a functional of several variables is a frequently encountered problem. Many optimal control problems fall within this category, so with an appropriate set of equations for the problem this subprogram may be used to find a solution. Both a minimization problem and an optimal control problem will probably have constraints on the solution. In a control problem, both initial and terminal constraints on the state and costate variables and inequality constraints on the state and control variables may be expressed as a penalty function in the formulation of the problem. Lasdon⁸ and others have derived a method of approach involving the conjugate gradient technique which may be adapted to the given subprogram.

The determination of stability domains for nonlinear dynamical systems also involves functional optimization. As in the optimal control problem, provision must be made for the constraints that operate on the given system. The second method of Liapunov may be used as a theoretical basis for stability determination. A nonlinear dynamical model may be expressed as the following n-dimensional system of autonomous state differential equations,

$$\dot{x} = Ax + f(x) = g(x), \qquad g(0) = 0 \qquad (25)$$

where $f(x)$ contains all the nonlinear terms. The method is based on choosing a quadratic Liapunov function $V$ which yields the largest estimate of the domain of attraction for the system given in equation (25). Figure 5 shows the projection in two dimensions of the quadratic Liapunov function. To find the optimal or largest stability domain, one needs to maximize the area of the ellipse represented by $V(x) = C$ subject to the conditions $\dot{V}(x) \le 0$ and $x \ne 0$. A complete discussion of this problem is given by G. R. Geiss.⁹

FIGURE 5-Two dimensional portrayal of region of stability (showing the domain of asymptotic stability, bounded by $V(x) = C$, one of a set of nested ellipses, and the regions where $\dot{V}(x) < 0$ and $\dot{V}(x) > 0$)
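Although the paper leaves the details to Geiss⁹, the maximization just described can itself be cast as a set of simultaneous nonlinear equations of exactly the kind the subprogram solves. In one standard formulation (ours, stated only for concreteness), the best constant is the smallest value of $V$ on the surface where $\dot{V}$ vanishes, and the Lagrange conditions for that constrained minimum give $n + 1$ equations in the $n$ components of $x$ and a multiplier $\lambda$:

$$\dot{V}(x) = \nabla V(x)^T g(x) = 0$$
$$\nabla V(x) - \lambda\,\nabla \dot{V}(x) = 0$$

Solving this system yields the contact point $x^*$, and $C = V(x^*)$ then defines the largest of the nested ellipses of Figure 5 that remains inside the region where $\dot{V}(x) \le 0$.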
Hydraulics network

The analysis of a hydraulic cooling system is a specific example of how the subprogram may be used in network analysis. The temperature drop across any heat exchanger may be calculated if we know the corresponding pressure drop. If we assume that the positions of the valves are held stationary, then from the conservation equations that describe the system one can determine the pressure drop across each heat exchanger. With the known environmental conditions, the external temperature in the vicinity of each heat exchanger may ultimately be determined. On the other hand, if a specific external temperature is desired, the position of each valve may be calculated to give the proper flow through each member to cause an appropriate pressure drop.

A very general program may be written to analyze a very general network. The elements and their connections may be read into the computer along with the various parameters that are necessary. The complete process of analyzing and solving the network is supervised by a program which will eventually use the subprogram for solution of simultaneous nonlinear equations. As networks become more complex, this approach will greatly reduce the time spent in tedious calculations.

Application to statistical modeling

Many statisticians are involved in constructing models on the basis of experimental data. The model may then be used to draw conclusions and predict future outcomes. Frequently these models are nonlinear and will result in nonlinear equations. After a model has been developed, its accuracy must be constantly verified. Using the equations of the model and the equations that describe the data, the validity of the model may be determined. Since the model is assumed to be nonlinear, the resulting analysis will involve simultaneous nonlinear equations. Other statistical applications that may require the solution of a set of simultaneous nonlinear equations are nonlinear regression analysis, testing likelihood estimator functions, and survival time analysis. The many fields that use decision theory would then have the same applications.

As a simple illustration of the preceding application, suppose that it has been observed that in a sample of two hundred persons, twenty-two possess a certain genetic characteristic. Suppose, further, that the characteristic is inherited according to a hypothesis which predicts that one-eighth of those sampled can be expected to possess the characteristic. The model would be a frequency function that would enable the observer to infer future outcomes and detect disagreements with his theory. The solution of an appropriate set of nonlinear equations will express the relationship between the model and the outcome. By this type of analysis the hypothesis may be accepted, rejected or reformulated.
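The paper stops short of carrying the computation out. As an illustration built on the section's own numbers, the maximum-likelihood equation of the binomial model is itself a (one-dimensional) instance of the nonlinear equations treated here; the closing comparison against the hypothesized one-eighth uses a standard normal approximation that we add for completeness:

```python
import numpy as np

n, k = 200, 22       # sample size and observed carriers, from the text
p0 = 1.0 / 8.0       # proportion predicted by the genetic hypothesis

# Score equation dL/dp = k/p - (n - k)/(1 - p) = 0, solved by Newton's method;
# a one-equation instance of the general problem (the closed form is p = k/n).
p = 0.5
for _ in range(50):
    score = k / p - (n - k) / (1.0 - p)
    dscore = -k / p ** 2 - (n - k) / (1.0 - p) ** 2
    step = score / dscore
    p = p - step
    if abs(step) < 1e-12:
        break
# p is now 0.11, i.e., k/n

# Normal-approximation test of the hypothesis p0 = 0.125: |z| is about 0.64,
# well inside 1.96, so this sample gives no reason to reject the hypothesis.
z = (p - p0) / np.sqrt(p0 * (1.0 - p0) / n)
```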
CONCLUSIONS

The algorithm presented here can be programmed for a digital computer with relative ease. The speed of computation, the number of iterations and the accuracy of the solution compare very favorably with other methods in current use. Since the information from each preceding iteration can be used to modify the inverse Jacobian, the time involved and the complexity of operations in each iteration are minimized. In this way the inversion of the Jacobian at each iteration is avoided. The examples presented demonstrate usages in mathematics, statistics, and several engineering fields, and these examples do not begin to exhaust the application areas. The usages are so numerous that it would seem desirable to have this type of program widely available, particularly in any software library designed for application purposes.

REFERENCES

1 C G BROYDEN, A class of methods for solving nonlinear simultaneous equations, Mathematics of Computation, Vol 19, No 92, Oct 1965
2 P K HENRICI, Elements of Numerical Analysis, John Wiley & Sons Inc, New York, 1964
3 A S HOUSEHOLDER, Principles of Numerical Analysis, McGraw-Hill, New York, 1953
4 F H DEIST, L SEFOR, Solution of systems of nonlinear equations by parameter variation, The Computer Journal, May 1967
5 V A MATVEEV, Method of approximate solution of systems of nonlinear equations, Defense Documentation Center, AD 637 966, June 1966
6 H H ROSENBROCK, An automatic method for finding the greatest or least value of a function, The Computer Journal, Vol 3, 1960
7 O G MANCINO, Resolution by iteration of some nonlinear systems, Journal of the ACM, Vol 14, 1967
8 L S LASDON, S J MITTER, A D WAREN, The conjugate gradient method for optimal control problems, IEEE Transactions on Automatic Control, Vol AC-12, April 1967
9 G R GEISS, J V ABBATE, Study on determining stability domains for nonlinear dynamical systems, Grumman Research Department Report RE-282, Feb 1967