JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 62, No. 2, AUGUST 1989
Some Effective Methods for Unconstrained
Optimization Based on the Solution of Systems
of Ordinary Differential Equations¹
A. A. BROWN² AND M. C. BARTHOLOMEW-BIGGS³
Communicated by L. C. W. Dixon
Abstract. In this paper, we review briefly some methods for minimizing
a function F(x), which proceed by following the solution curve of a
system of ordinary differential equations. Such methods have often been
thought to be unacceptably expensive; but we show, by means of
extensive numerical tests, using a variety of algorithms, that the ODE
approach can in fact be implemented in such a way as to be more than
competitive with currently available conventional techniques.
Key Words. Unconstrained minimization, trajectory following, ODE
methods for optimization, computational algorithms.
1. Introduction
We are concerned in this paper with methods for solving the unconstrained minimization problem

min_x F(x),    (1)

by following the solution curve of a system of ordinary differential equations.
Methods of this type for optimization (and for the related problem of solving sets of nonlinear equations) have been proposed by a number of authors, for instance, Botsaris and Jacobson (Ref. 1), Botsaris (Refs. 2-4), Boggs (Ref. 5), Zirilli et al. (Refs. 6-11), Zghier (Ref. 12), and Snyman (Refs. 13, 14). This reference list is by no means exhaustive, but gives some indication of recent activity in the area.
¹ This work was supported by a SERC research studentship for the first author. Both authors
are indebted to Dr. J. J. McKeown and Dr. K. D. Patel of SCICON Ltd, the collaborating
establishment, for their advice and encouragement.
² Staff Member, Numerical Algorithms Group, Oxford, England.
³ Lecturer, School of Information Science, Hatfield Polytechnic, Hatfield, England.
The idea of proceeding from an initial guess x^(0) to the solution x^(*) of (1), via a smooth, curvilinear trajectory, seems rather attractive. Moreover,
for highly nonlinear minimization problems, it may offer distinct advantages
over the approach favored by conventional optimization techniques which
take finite steps along straight line search directions. It is well known that
finding a suitable stepsize (the line search) can be difficult when F is
nonquadratic and has large third derivatives; and the resulting step may in
fact be very small (as, for instance, when the search is negotiating the
bottom of a narrow, curving valley). In such circumstances, the conventional
methods make slow progress and we might expect a more effective approach
to result from a deliberate attempt to follow a curvilinear path. A secondary
argument that is sometimes used in favor of ODE methods for optimization
is that they allow us to pose problem (1) in a form for which there is a
considerable amount of specialized and very sophisticated numerical
integration software already available.
In spite of the comments in the previous paragraph, however, ODE
methods for unconstrained minimization do not seem to have gained wide
acceptance. The discussion on pages 59-60 of Ref. 15 appears to reflect the
common feeling that, when compared with, say, a conventional quasi-Newton algorithm, such methods are probably too expensive to be of general use. Instead, their role is seen as some kind of occasional last resort for the
case when other methods experience difficulty. Our purpose in this paper
is to present some numerical evidence showing that this view of ODE
methods is not altogether justified. It is true that they can be expensive;
but this depends very much upon the way in which they are implemented.
In the next section, we shall review some published algorithms for solving
(1) via systems of differential equations, and we shall then show that their
performance is significantly affected by the choice of integration method
and the way in which the stepsize is controlled. In Section 3, we report a
quite extensive numerical comparison between the best of the ODE techniques and two well-known conventional algorithms. Our results suggest
that, when suitably implemented, ODE methods deserve a place in the
mainstream of optimization algorithm development.
2. ODE Methods for Optimization
2.1. Differential Equations Which Define Solution Trajectories. The
simplest path from an arbitrary initial point x^(0) to x^(*), the solution of (1),
is the continuous, steepest descent trajectory defined by
dx/dt = -∇F(x(t)),    (2)
with initial condition x = x^(0) at t = 0. We can also consider the continuous
Newton equation
dx/dt = -G^(-1)(x(t)) ∇F(x(t)),    (3)

with initial condition x = x^(0) at t = 0. In (3), G denotes the Hessian matrix ∇²F(x(t)), which for the moment we assume to be nonsingular for all x.
Notice that (2) and (3) each represent an autonomous system, since t does not appear explicitly on the right-hand side. If, for either of these equations, a solution x(t) exists for t > 0 such that lim_{t→∞} x(t) = x^(*), then x^(*) is a stationary point of F(x). It is easy to show that the solution to (2) is a curve in x-space which is always downhill w.r.t. the function F(x). The
same can also be said of the solution to (3), provided G(x(t)) is positive
definite at every point on the path.
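The downhill property follows directly from the chain rule; along any solution of (2),

```latex
\frac{d}{dt}F(x(t)) \;=\; \nabla F(x(t))^{T}\,\frac{dx}{dt}
\;=\; -\,\|\nabla F(x(t))\|^{2} \;\le\; 0,
```

with strict decrease whenever ∇F(x(t)) ≠ 0; for (3), the corresponding expression is -∇F'G^(-1)∇F, which is negative as long as G(x(t)) is positive definite.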
It is also worth considering the linearized form of (2), namely,
dx/dt = -∇F(x^(k)) - G(x^(k))[x(t) - x^(k)],    (4)
with x = x^(k) at t = t^(k) as the initial condition. The linearization is about some point x^(k) on the path from x^(0) to x^(*); and (4) defines the local steepest descent curve away from x^(k). Equation (4) has been discussed and
used by Botsaris and Jacobson (Ref. 1) and also by Botsaris (Refs. 2-4).
Trajectories defined by second-order differential equations have been
suggested by some authors. Zirilli et al. (Refs. 6-11) have developed a
method for solving a system of nonlinear equations
f(x) = 0,    (5)

by minimizing the sum-of-squares function

Q(x) = f(x)'f(x).
The method can be adapted to deal with problem (1) if we take f(x) = ∇F(x).
Zirilli et al. use the differential equation
μ(t) d²x/dt² + β(t) dx/dt = -∇Q(x(t)),    (6)
where μ(t) and β(t) denote positive, real-valued functions. They describe and justify a practical procedure for choosing μ and β during the integration of (6) so that, as t → ∞, the solution trajectory resembles that of Newton's
method. The authors claim that use of second-order equations gives a larger
domain of convergence than for a first-order system. They also argue that
the form of (6) facilitates greater control of the trajectory, since at t = 0 we must specify not only the initial point x^(0) but also dx^(0)/dt, and different
choices for these derivatives may lead to different solutions of the original
problem. A drawback with (6) when applied to function minimization is
of course that its solution will have a limit point at any stationary point of
F(x), and there is no intrinsic mechanism for avoiding maxima or saddle
points.
Snyman (Ref. 13) has also suggested a rather similar second-order
system of ODEs specifically for the unconstrained minimization problem,
namely,
d²x/dt² = -∇F(x(t)).    (7)
A more detailed discussion of these equations can be found in the references
cited earlier, and also in Brown (Ref. 16). However, we now turn our
attention to the important practical question of how best to solve systems
such as (2)-(4), (6), and (7).
2.2. Integration Methods for ODEs Arising in Optimization. We need
to consider carefully what methods are appropriate for solving equations
like (2)-(4), (6), and (7). Notice that, in the context of function minimization,
"'appropriate" need not mean the same as "accurate", since we are primarily
concerned with finding a limit point, rather than the whole solution.
Moreover, it should be remembered that we have a measure of progress
toward x^(*) which is not normally available during the numerical solution
of a system of ODEs (namely, the value of the objective function which
we would like to reduce at every integration step). There is, therefore, a
choice to be made, in implementing an ODE technique for optimization,
between using a "black box" routine which computes the trajectory to high
accuracy and employing a simpler, low-order integration scheme whose
stepsize is controlled by monitoring F(x). For the most part, it is the second
of these alternatives which has been preferred by the authors whose work
has been cited in the preceding paragraphs.
In this section, we shall outline a number of ODE optimization
algorithms in terms of the differential equation they use and the integration
scheme and stepsize control that they adopt. These algorithms have all been
programmed, and a comparison and discussion of their numerical behavior
appears in this section and the next section. Brown (Ref. 16) considers a
much wider range of algorithms than we deal with here, and in particular
he looks at a number of different ways of using information about the
objective function to adjust the stepsize. We shall simply distinguish between
strategy A, in which the usual methods of truncation error estimation are
used to control the stepsize and maintain accuracy in the solution, and
strategy B, in which the stepsize is increased (doubled) if F(x) is reduced
and decreased (halved) otherwise. In strategy B, an integration step is not
accepted until a reduction in F has been obtained.
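As an illustration, a minimal sketch of strategy B might read as follows; the function names, the doubling and halving factors applied per step, and the lower stepsize bound are our own choices rather than details taken from the paper.

```python
def strategy_b_step(x, h, F, take_step, h_min=1e-12):
    """One accepted step under strategy B: the stepsize is halved until the
    integration step reduces F, and doubled for the next iteration after a
    successful step."""
    fx = F(x)
    while h > h_min:
        x_new = take_step(x, h)      # one integration step, e.g. via Eq. (9)
        if F(x_new) < fx:            # accept only if the objective decreases
            return x_new, 2.0 * h    # success: double the stepsize
        h *= 0.5                     # failure: halve the stepsize and retry
    return x, h                      # no acceptable step could be found
```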
The first two algorithms that we mention are entitled LSOSTD and
LSONEW. LSOSTD involves the use of the package LSODE (Hindmarsh,
Ref. 17) to solve the steepest descent equation (2). LSODE is a variable-order, variable-step code which can deal with stiff systems. Stepsize adjustment is via strategy A, and no account is taken in LSOSTD of the behavior
of the objective function, except of course that the norm of ∇F is monitored
at every output point, and the algorithm is terminated if this is sufficiently
small. LSONEW uses LSODE in a similar way to solve the continuous
Newton equation (3).
By contrast, as examples of the use of low-order integration schemes,
the algorithms IMPSTD and RK1NEW employ only first-order techniques.
Specifically, IMPSTD applies the implicit Euler method to Eq. (2), while
RK1NEW uses the explicit Euler method to solve (3). In both cases, stepsize control is via strategy B. It is worth noting that RK1NEW computes each new point x^(k+1) by solving

[G(x^(k))](x^(k+1) - x^(k)) = -h^(k) ∇F(x^(k)),    (8)

where h^(k) is the stepsize. If the matrix G^(k) is found to be nonpositive
definite during factorization, then its diagonal terms are modified so as to
ensure that the step is downhill with respect to F. Such a safeguard cannot
be incorporated into LSONEW, however, because of its use of the "black
box" routine LSODE.
We now turn to the differential equation (4). Suppose that we employ
the implicit Euler method to calculate x^(k+1) as an estimate of x(t^(k) + h). Then, it is easy to show that the step Δx = x^(k+1) - x^(k) is given by

(hG(x^(k)) + I) Δx = -h∇F(x^(k)).    (9)
It is worth noting that this method of calculating each step is different from
that involved in IMPSTD where the implicit Euler scheme is applied to
(2). In IMPSTD, a set of nonlinear equations will usually need to be solved
for Δx, whereas (9) is a linear system. As is well known, the step Δx obtained from (9) tends to the steepest descent direction -∇F(x^(k)) as h → 0 and to the Newton direction as h → ∞. In other words, as we increase the size of the step to be taken from x^(k), the new point x^(k+1) lies on a path
sometimes known as a spiral. There is obviously a connection here with
trust region methods for function minimization [see for instance, Goldfeld,
Quandt, and Trotter (Ref. 18)]. This will be taken up again in a later section.
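The limiting behavior just described can be read off directly from (9):

```latex
\Delta x(h) = -h\,\bigl(hG(x^{(k)}) + I\bigr)^{-1}\nabla F(x^{(k)}),
\qquad
\lim_{h\to 0}\frac{\Delta x(h)}{h} = -\nabla F(x^{(k)}),
\qquad
\lim_{h\to\infty}\Delta x(h) = -G(x^{(k)})^{-1}\nabla F(x^{(k)}),
```

the last limit requiring G(x^(k)) to be nonsingular.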
The algorithm IMPBOT is based on Eq. (9); and it uses strategy B for choosing the stepsize h on each step.
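Putting (9) together with strategy B gives the following sketch of an IMPBOT-style iteration. This is an outline under our own assumptions about tolerances, iteration limits, and safeguards; it is not the authors' FORTRAN implementation.

```python
import numpy as np

def impbot(x, F, grad, hess, h=1.0, gtol=1e-6, max_iter=200, h_min=1e-14):
    """IMPBOT-style iteration: each step solves (h*G + I) dx = -h * grad F
    (Eq. (9)), with the stepsize h governed by strategy B."""
    n = len(x)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < gtol:      # convergence test on ||grad F||
            return x
        G, fx = hess(x), F(x)
        while h > h_min:
            dx = np.linalg.solve(h * G + np.eye(n), -h * g)   # Eq. (9)
            if F(x + dx) < fx:            # strategy B: accept on a decrease in F
                x, h = x + dx, 2.0 * h
                break
            h *= 0.5                      # otherwise halve h and re-solve
        else:
            return x                      # stepsize has collapsed; give up
    return x
```

For a quick check, the sketch can be driven by scipy.optimize's rosen, rosen_der, and rosen_hess as F, grad, and hess.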
Zghier (Ref. 12) applies the generalized trapezoidal rule to Eq. (4) and,
in a similar vein to Eq. (9), obtains

(hθG(x^(k)) + I) Δx = -h∇F(x^(k))    (10)

as the calculation for each step, where θ is a parameter. When θ = 0, (10) corresponds to the explicit Euler method; and, when θ = 1/2, it becomes the trapezoidal rule. When θ = 1, of course, (9) and (10) are identical.
Zghier's algorithm adjusts h and θ at every step in order to obtain θ → 1 as h → ∞ and also θ → 0 as h → 0. In particular, it is suggested that the relationship between h and θ on each step be determined by reference to the stability of the numerical solution of the scalar test equation dz/dt = -λz. Hence,
in this case, h is chosen neither by the usual accuracy criteria nor on the
basis of the value of the objective function. All these ideas have been
implemented in a code entitled ZGHIER.
Another algorithm, called QUABOT, is based upon the iteration
(hB^(k) + I) Δx = -h∇F(x^(k)),    (11)

which in turn comes from the differential equation (4) with a matrix B^(k) approximating the true Hessian G(x^(k)). The BFGS updating formula is used to generate the sequence {B^(k)}, starting from the usual initial choice B^(0) = I. In other respects, QUABOT is identical to IMPBOT.
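The BFGS update itself, in its standard textbook form, is sketched below; the curvature safeguard that skips the update when s'y is not sufficiently positive is a common convention and our own assumption here, not a detail from the paper.

```python
import numpy as np

def bfgs_update(B, s, y, tol=1e-10):
    """Standard BFGS update of the approximation B, given the step
    s = x_new - x_old and gradient change y = grad_new - grad_old:
        B+ = B - (B s s' B) / (s' B s) + (y y') / (y' s)."""
    sy = s @ y
    if sy <= tol * np.linalg.norm(s) * np.linalg.norm(y):
        return B                    # skip the update to keep B positive definite
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```

In QUABOT, the matrix produced by such an update stands in for G(x^(k)) in the step calculation (11).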
All the algorithms mentioned so far have been programmed in
FORTRAN specifically for the experiments described in this paper. Thus,
although they employ ideas that have been suggested previously, they are
new optimization codes as regards the details of implementation. Two
further programs, previously published in the literature, have also been
included to illustrate the use of the second-order differential equations (6)
and (7). The algorithm DAFNE [Zirilli et al. (Ref. 11)], from the ACM
software library, uses Eq. (6). The numerical solution of this system is
performed by rewriting it as a set of first-order equations, namely,
dx/dt = v(t),    (12a)
dv/dt = [-β(t)v(t) - ∇F(x(t))]/μ(t).    (12b)
The right-hand side of (12b) is then linearized w.r.t. x and the resulting
equations solved by the implicit Euler method, giving an iteration rather
similar to (9). Finally, the code LFOPI(b) is one given by Snyman (Ref.
14), based on following the solution curve of (7). The integration of these
equations is based on rewriting them as the first-order system

dx/dt = v(t),
dv/dt = -∇F(x(t)),
and then applying the explicit Euler method to the first equation and the
implicit Euler method to the second. This is the so-called leapfrog method.
The value of h is increased from one iteration to the next so long as the
scalar product ∇F(x^(k+1))'∇F(x^(k)) is greater than zero, indicating that the
trajectory is proceeding steadily downhill. If the scalar product is negative
or zero, however, then h is reduced on the assumption that the search has
reached the bottom of a valley. Notice that this strategy does not take any
account of the changes in function value.
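One plausible reading of a single leapfrog iteration, including the gradient-product test on the stepsize, is sketched below; the growth and reduction factors are our own illustrative values, not those used in LFOPI(b).

```python
import numpy as np

def leapfrog_step(x, v, h, grad):
    """One leapfrog-style step for Eq. (7): explicit Euler for dx/dt = v,
    implicit Euler for dv/dt = -grad F, then adjust h according to the sign
    of grad F(x_new) . grad F(x_old)."""
    g_old = grad(x)
    x_new = x + h * v                 # explicit Euler for the position
    g_new = grad(x_new)
    v_new = v - h * g_new             # implicit Euler for the velocity
    if g_new @ g_old > 0.0:
        h_new = 2.0 * h               # still heading downhill: increase h
    else:
        h_new = 0.5 * h               # apparently past a valley bottom: reduce h
    return x_new, v_new, h_new
```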
Alongside all the ODE algorithms mentioned in this section, we also
consider two conventional minimization methods: E04KDF, from the NAG
subroutine library; and OPVM, from the NOC OPTIMA package. E04KDF
is a modified Newton method (Ref. 19) and OPVM is a quasi-Newton
technique (Ref. 20).
The programs were used initially to solve a group of six well-known
(and comparatively easy) optimization problems, mainly with a view to
assessing the relative merits of the various ODE methods. The problems
considered were the Rosenbrock function, the Powell quartic function, the
Wood function, and the three exponential least-squares functions (EXP2,
EXP3, and EXP4), given by Biggs (Ref. 20). For each of these problems,
all the methods were provided with analytical expressions for the function
and its first derivatives. Where second derivatives were required, they were
obtained by finite differences. Calculations were performed in double precision on the Hatfield Polytechnic DEC1091 computer, with the exception of those involving E04KDF, which were carried out on a VAX-11/85. The termination criterion for each method was ||∇F(x)|| < 10^(-6). Performance of the codes is summarized in Table 1. The entries in the table are in the form IT/EFE, where IT denotes the number of iterations and EFE the number of equivalent function evaluations (function calls + n × gradient calls) needed
for each problem.
Clearly, there is considerable variation in the performance of the
methods. At one extreme, the high-accuracy ODE solvers LSOSTD and
LSONEW appear to be very expensive, performing many more integration
steps than the low-order schemes. The integration steps used by IMPSTD,
although comparatively few in number, are each rather expensive, because
they involve the solution of a set of nonlinear equations, and ∇²F may need
to be evaluated several times. By contrast, LFOPI(b), which does not use
second derivatives, has a very inexpensive calculation on each step; but,
since it appears to require quite a large number of steps, its EFE totals are
not very competitive. The other ODE-based methods (RK1NEW, IMPBOT,
ZGHIER, and QUABOT) do, however, compare quite well with the two
conventional techniques. We should, however, remember here that OPVM
and QUABOT do not use second derivatives directly, and hence have an
advantage if we use EFE as the only measure of performance. We observe
that QUABOT usually comes second to OPVM on the problems in Table
1 (although it appears to be the best method for the Wood function). We
remark too that the Newton method E04KDF is sometimes outperformed
by several of the ODE codes which also use second derivatives.
Table 1. Performance of methods on six easy test problems.

Method      Rosenbrock   Powell      Wood           EXP2      EXP3        EXP4
LSOSTD      145/643      160/1325    336/3593       129/413   1199/622    147/1169
LSONEW      175/2881     360/27081   2311/2166661   F         334/14749   F
IMPSTD      8/2433       12/4317     46/124650      9/246     12/772      13/5130
RK1NEW      22/164       17/362      43/921         10/75     10/138      12/265
IMPBOT      21/161       17/362      40/876         11/80     13/173      17/362
ZGHIER      17/105       16/325      18/365         F         14/172      16/325
QUABOT      26/92        38/203      30/171         15/49     20/84       33/183
DAFNE       7/63         17/445      F              F         10/172      14/361
LFOPI(b)    181/365      1210/4845   373/1497       86/165    129/391     310/1245
E04KDF      21/169       25/546      37/887         9/73      10/179      14/375
OPVM        30/111       28/161      77/443         10/36     13/63       20/112

F denotes that an algorithm terminated at a nonoptimal stationary point.
The failures in this first test set occur for those algorithms which do not specifically check function values at each step (LSONEW, ZGHIER, and DAFNE). It
is not altogether surprising that these methods sometimes terminate at
nonoptimal stationary points, since no precautions are taken to ensure that
F is consistently reduced.
The numerical integration methods used in the successful ODE codes
are all first-order schemes. Brown (Ref. 16) presents results to show that
the use of second-order or higher-order techniques seems to produce an
increase in computational cost per step which is not outweighed by a
corresponding reduction in the number of iterations.
On the basis of the figures in Table 1, it is easy to see how ODE-based
methods may have acquired a reputation for being unacceptably expensive.
If we neglect the first three algorithms, however, there seems to be no very
compelling reason for dismissing all the ODE techniques in this way. In
the next section, therefore, we perform a rather more significant comparison
between methods, using test problems of greater difficulty and employing
ranking procedures which are based on more than a simple count of EFEs.
3. Numerical Experiments Based on Difficult Test Problems
In this section, we carry forward the ODE routines IMPBOT,
QUABOT, and ZGHIER from the previous section and compare their performance with that of OPVM and E04KDF on a further set of problems which can be regarded as being difficult in the sense of being badly scaled or having severe nonlinearities. The routines LSOSTD, LSONEW, and LFOPI(b) have been dropped on grounds of computational cost, and DAFNE has been left out because it has been found to be particularly liable to stop at nonoptimal points. RK1NEW has been discarded not so
much because of any shortcomings as because it is, in essence, rather like
an ordinary Newton method with a weak line search; hence, it does not
represent so clear an alternative to the conventional methods. Brown
(Ref. 16) reports additional experience with these algorithms which supports
our decision not to proceed with them.
The 48 test problems which we use are specified in full in Refs. 16 and
22. Some of these examples have appeared in the literature previously, and
some are new. They include some highly nonlinear and badly scaled problems, involving between two and twenty variables. Several of the examples
are in the form of penalty functions based on constrained problems in the
collection given by Hock and Schittkowski (Ref. 21). Tables which summarize the results for each problem appear in Ref. 22 and record, in detail, the number of equivalent function evaluations (EFE), the computing time in seconds (CPU), and the final function value achieved (F). Here, we simply present an overview of the main conclusions that can be drawn from the
experiments.
The detailed results (Ref. 22) show clearly that some of the problems
are difficult enough to defeat several algorithms. It is interesting, therefore,
to consider the number of failures for each of the tested codes. Here,
"failure" is taken to mean that a method did not locate a stationary point
of any kind, but terminated by exceeding some preset time, or iteration limit, or else because of some numerical breakdown such as overflow or division by zero. It was not counted as a failure in this sense if termination
occurred at a point where the norm of the gradient was close to the level
required for convergence. The success rates of the various methods are
summarized in Table 2.
It is clear that IMPBOT and QUABOT both do very much better than the other methods in terms of reliability. A number of the failures experienced by ZGHIER can be attributed to the fact that it does not use function
values to regulate its progress; hence, it sometimes wanders off into regions
where the variables become very large and the calculations break down
with some condition such as floating overflow. Apart from these cases,
however, the algorithm is, as we shall see below, frequently competitive. A
possible reason for the failures of OPVM is that the quasi-Newton updating
is being required to generate approximations to some very ill-conditioned
Hessian matrices in, for instance, the penalty function examples. At the
same time, however, it should be noted that QUABOT, which uses a similar
updating scheme, appears to be less severely affected. The particularly poor performance of E04KDF is rather surprising, especially when contrasted
with that of IMPBOT which also uses the Hessian matrix and which takes
steps which tend to the Newton correction as h becomes large.
We next attempt to rank the methods from a more positive point of
view, namely, by counting the number of problems for which each code
returns the best performance. By giving one point to the most successful
code, two points to the second, and so on, we can obtain for each algorithm
a total score which reflects the frequency with which it outperforms the
others. Of course, there remains the question of how we should measure
the best performance. In order to present as balanced a view as possible,
we have tried three different criteria. In the first, the method which obtains
the lowest function value is regarded as the most successful for each problem
(but where two or more methods produce the same function value, the one
with lowest CPU time is placed first). Under the second criterion, we rank
the methods in increasing order of CPU time taken to find any local solution
for each problem. The third way of comparing the codes is based upon the number of EFEs needed to find any local stationary point. Rankings of the tested algorithms according to each of these measures of performance appear in Tables 3, 4, and 5.
Table 2. Ranking of methods by a count of failures on the test problems.

Method      Number of failures
IMPBOT      0
QUABOT      4
OPVM        8
ZGHIER      13
E04KDF      20
Table 3. Ranking based on best function value found on each problem.

Method      Total points scored    Times 1st    Times 2nd
IMPBOT      119                    14           9
QUABOT      122                    11           15
OPVM        124                    9            19
ZGHIER      174                    12           1
E04KDF      175                    4            5
Table 4. Ranking based on least CPU time to find a stationary point.

Method      Total points scored    Times 1st    Times 2nd
IMPBOT      113                    10           17
OPVM        136                    16           5
ZGHIER      144                    17           4
QUABOT      148                    14           6
E04KDF      196                    0            8
Table 5. Ranking based on least EFE to find a stationary point.

Method      Total points scored    Times 1st    Times 2nd
QUABOT      99                     17           20
OPVM        119                    17           13
ZGHIER      154                    13           4
IMPBOT      157                    2            6
E04KDF      205                    0            4
We can make several comments on Tables 2-5. We note first that an ODE method appears at the top of every one of them. This is usually IMPBOT, except in the case of Table 5, where the criterion based on
counting EFEs might be expected to favor the quasi-Newton approaches.
The figures for ZGHIER are interesting. In Tables 3-5, it records more first
places than any other method; but the fact that it achieves relatively few
second places illustrates well its unreliability. When it works, it converges
quickly; but it can be seen from Table 2 that it is more vulnerable to outright
failure than either IMPBOT or QUABOT. E04KDF is consistently to be
found at the bottom of the tables, but OPVM tends usually to be quite
competitive with the best technique in each comparison. We make no further
comment here on the behavior of the conventional minimization methods,
except to stress that, in all the tests, they were used in accordance with the
guidelines given in their accompanying documentation.
4. Discussion and Conclusions
In this paper, we have shown, by means of numerical experiments,
that unconstrained minimization techniques based on the solution of the
system of ordinary differential equations (4) can compare very favorably
with conventional Newton and quasi-Newton algorithms as regards reliability, accuracy, and efficiency. The main conclusion that we wish to draw
from this is that the basic ODE approach to function minimization appears
to deserve rather more attention than it has so far been given. The algorithms
which have performed best in our trials are those based on the use of
low-order integration schemes and using information about the objective
function in order to control the stepsize.
The most successful algorithm is IMPBOT, which uses Eq. (9). As was
mentioned in Section 2, the iterations based on (9) have something in
common with trust region methods for optimization, which use the calculation
(λI + G^(k)) Δx = -∇F(x^(k)).    (13)

It can easily be seen that (9) and (13) are equivalent when λ = 1/h. IMPBOT, however, involves the adjustment of a stepsize h in the space of the parameter t, while conventional trust region methods use λ to adjust a bound Δ on the norm of the correction Δx in x-space. The relationship between h and Δ is highly nonlinear, however, and it may be to the advantage of IMPBOT
that it does not have to deal with this.
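The equivalence mentioned above is immediate: dividing (9) through by h gives

```latex
\bigl(hG^{(k)} + I\bigr)\Delta x = -h\,\nabla F(x^{(k)})
\;\Longleftrightarrow\;
\Bigl(G^{(k)} + \tfrac{1}{h}\,I\Bigr)\Delta x = -\nabla F(x^{(k)}),
```

which is (13) with λ = 1/h, so large stepsizes in t correspond to small shifts λ and near-Newton steps, while small stepsizes correspond to heavily damped, steepest-descent-like steps.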
IMPBOT uses the true second derivative matrix G(x^(k)), or at least a
finite-difference approximation to it. The similar algorithm QUABOT avoids
the expense of computing the Hessian matrix by using instead an updated
approximation obtained from the BFGS formula. This routine is somewhat
less reliable than IMPBOT, but is more economical in terms of function
and gradient evaluations.
We do not wish to make excessive claims for the particular codes
IMPBOT and QUABOT on the basis of the computational results in Section
3. We regard these as being prototype codes, and some further development,
both theoretical and practical, is still required. For instance, the relationship
with the trust region approach seems to be worth further exploration. Some
features of Zghier's method, such as the use of different integration schemes,
may be worth including; and there may also be some advantages in keeping
G^(k) constant over several steps of (9) or in using updates other than BFGS
for the matrix B in (11). Most importantly, a convergence analysis is still
needed for both methods.
We would also concede that the conventional minimization algorithms
OPVM and E04KDF used in the comparisons in Section 3 are not necessarily
at the leading edge of optimization algorithm development. Nevertheless,
they are representative of methods quite widely used in practice; therefore,
the results in this paper do adequately support our contention that further
research into the use of (4) for optimization is amply justified. As a final
remark, we note that the success of IMPBOT and QUABOT on the penalty
function examples encourages us to think that ODE-based methods may
also prove useful for the solution of constrained optimization problems.
References
1. BOTSARIS, C. A., and JACOBSON, D. H., A Newton-Type Curvilinear Search
Method for Optimization, Journal of Mathematical Analysis and Applications,
Vol. 54, pp. 217-229, 1976.
2. BOTSARIS, C. A., A Curvilinear Optimization Method Based on Stable Numerical
Integration Techniques, Journal of Mathematical Analysis and Applications, Vol.
63, pp. 396-411, 1978.
3. BOTSARIS, C. A., Differential Gradient Methods, Journal of Mathematical Analysis and Applications, Vol. 63, pp. 177-198, 1978.
4. BOTSARIS, C. A., A Class of Methods for Unconstrained Minimization Based on Stable Numerical Integration Techniques, Journal of Mathematical Analysis and Applications, Vol. 63, pp. 729-749, 1978.
5. BOGGS, P. T., An Algorithm, Based on Singular Perturbation Theory, for Ill-Conditioned Minimization Problems, SIAM Journal on Numerical Analysis, Vol.
15, pp. 830-843, 1977.
6. ZIRILLI, F., INCERTI, S., and PARISI, V., A New Method for Solving Nonlinear
Simultaneous Equations, SIAM Journal on Numerical Analysis, Vol. 16, pp.
779-789, 1979.
7. ZIRILLI, F., INCERTI, S., and ALUFFI, F., Systems of Equations and A-Stable
Integration of Second-Order ODEs, Numerical Optimization and Dynamic Systems, Edited by L. C. W. Dixon and G. P. Szego, Elsevier-North Holland,
Amsterdam, Holland, 1980.
8. ZIRILLI, F., ALUFFI, F., and INCERTI, S., Systems of Simultaneous Equations
and Second-Order Differential Equations, Ottimizzazione Nonlineare e
Applicazioni, Edited by S. Incerti and G. Treccani, Pitagora Editrice, Bologna,
Italy, 1980.
9. ZIRILLI, F., INCERTI, S., and PARISI, V., A FORTRAN Subroutine for Solving
Systems of Nonlinear Simultaneous Equations, Computer Journal, Vol. 24, pp.
87-91, 1981.
10. ZIRILLI, F., ALUFFI, F., and PARISI, V., A Differential Equations Algorithm for
Nonlinear Equations, ACM Transactions on Mathematical Software, Vol. 10,
pp. 299-316, 1984.
11. ZIRILLI, F., ALUFFI, F., and PARISI, V., DAFNE: A Differential Equations
Algorithm for Nonlinear Equations, ACM Transactions on Mathematical Software, Vol. 10, pp. 317-324, 1984.
12. ZGHIER, A. K., The Use of Differential Equations in Optimization, PhD Thesis,
Loughborough University, 1981.
13. SNYMAN, J. A., A New and Dynamic Method for Unconstrained Optimization,
Applied Mathematical Modelling, Vol. 6, pp. 449-462, 1982.
14. SNYMAN, J. A., An Improved Method of the Original Leapfrog Dynamic Method
for Unconstrained Minimization, University of Pretoria, Applied Mathematics
Department, Report No. UP-TW30, 1982.
15. POWELL, M. J. D., Editor, Nonlinear Optimization 1981, Academic Press,
London, England, 1982.
16. BROWN, A. A., Optimization Methods Involving the Solution of Ordinary Differential Equations, PhD Thesis, Hatfield Polytechnic, 1986.
17. HINDMARSH, A. C., LSODE and LSODI: Two New Initial-Value Ordinary
Differential Equation Solvers, ACM Signum Newsletter, Vol. 15, pp. 10-11, 1980.
18. GOLDFELD, D., QUANDT, R. E., and TROTTER, H. F., Maximization by Quadratic Hill-Climbing, Econometrica, Vol. 34, pp. 541-551, 1966.
19. GILL, P. E., MURRAY, W., and PICKEN, S. M., The Implementation of Two
Modified Newton Algorithms for Unconstrained Optimization, National Physical
Laboratory, Report No. NAC-24, 1972.
20. BIGGS, M. C., Minimization Algorithms Making Use of Nonquadratic Properties
of the Objective Function, Journal of the Institute of Mathematics and Its
Applications, Vol. 8, pp. 315-327, 1971.
21. HOCK, W., and SCHITTKOWSKI, K., Test Examples for Nonlinear Programming
Codes, Springer-Verlag, New York, New York, 1981.
22. BROWN, A. A., and BARTHOLOMEW-BIGGS, M. C., Some Effective Methods
for Unconstrained Optimization Based on the Solution of Systems of Ordinary
Differential Equations, Technical Report No. 178, Numerical Optimisation
Centre, Hatfield Polytechnic, 1987.