A GLOBAL CONVERGENCE THEORY FOR

SIAM J. OPTIM.
Vol. 7, No. 1, pp. 177–207, February 1997
c 1997 Society for Industrial and Applied Mathematics
010
A GLOBAL CONVERGENCE THEORY FOR GENERAL
TRUST-REGION-BASED ALGORITHMS FOR EQUALITY
CONSTRAINED OPTIMIZATION∗
J. E. DENNIS, JR.† , MAHMOUD EL-ALEM‡ , AND MARIA C. MACIEL§
Abstract. This work presents a global convergence theory for a broad class of trust-region
algorithms for the smooth nonlinear programming problem with equality constraints. The main
result generalizes Powell’s 1975 result for unconstrained trust-region algorithms.
The trial step is characterized by very mild conditions on its normal and tangential components.
The normal component need not be computed accurately. The theory requires a quasi-normal component to satisfy a fraction of Cauchy decrease condition on the quadratic model of the linearized
constraints. The tangential component then must satisfy a fraction of Cauchy decrease condition
on a quadratic model of the Lagrangian function in the translated tangent space of the constraints
determined by the quasi-normal component. Estimates of the Lagrange multipliers and the Hessians
are assumed only to be bounded.
The other main characteristic of this class of algorithms is that the step is evaluated by using the
augmented Lagrangian as a merit function with the penalty parameter updated using the El-Alem
scheme. The properties of the step and the way that the penalty parameter is chosen are sufficient
to establish global convergence.
As an example, an algorithm is presented that can be viewed as a generalization of the Steihaug–
Toint dogleg algorithm for the unconstrained case. It is based on a quadratic programming algorithm
that uses a step in a quasi-normal direction to the tangent space of the constraints and then takes
feasible conjugate reduced-gradient steps to solve the reduced quadratic program. This algorithm
should cope quite well with large problems for which effective preconditioners are known.
Key words. constrained optimization, global convergence, trust regions, equality constrained,
nonlinear programming, conjugate gradient, inexact Newton Method
AMS subject classifications. 65K05, 49D37
PII. S1052623492238881
1. Introduction. This work is concerned with the development of a global convergence theory for a broad class of algorithms for the equality constrained minimization problem:
minimize f (x)
(EQC) ≡
subject to C(x) = 0.
The functions f : <n → < and C : <n → <m are at least twice continuously differentiable where C(x) = (c1 (x), . . . , cm (x))T and m < n.
Our purpose is to generalize to constrained problems a powerful theorem given in
1975 by Powell for unconstrained problems.
The global convergence theory that we establish in this work holds for a class of
nonlinear programming algorithms for (EQC) that is characterized by the following
features:
∗ Received by the editors October 14, 1992; accepted for publication (in revised form) October 6,
1995. This research was supported by DOE DE-FG005-86ER25017, CRPC CCR-9120008, AFOSRF49620-95-1-0210, and the REDI Foundation.
http://www.siam.org/journals/siopt/7-1/23888.html
† Department of Computational and Applied Mathematics & Center for Research on Parallel
Computation, Rice University, P. O. Box 1892, Houston TX 77251 ([email protected]).
‡ Department of Mathematics, Faculty of Science, Alexandria University, Alexandria, Egypt
([email protected]).
§ Departamento de Matematica, Universidad Nacional del Sur, Avenida Alem 1253, 8000 Bahia
Blanca, Argentina.
177
178
J. DENNIS, M. EL-ALEM , AND M. MACIEL
1. The algorithms of the family use the trust-region approach as a globalization
strategy.
2. All these algorithms generate steps that satisfy our mild conditions on the
trial step’s normal and tangential components. It is important to note that
the condition on our “quasi-normal” step snc , that ksnc k2 ≤ K1 kC(xc )k2 for
some independent constant K1 , is always satisfied under our hypotheses if
a truly normal component is used for the trial step. The other conditions
are that the quasi-normal component satisfies a fraction of Cauchy decrease
condition on the quadratic model of the linearized constraints and that the
tangential component (as measured from the quasi-normal component) satisfies a fraction of Cauchy decrease on the quadratic model of the reduced
Lagrangian function associated with (EQC).
3. The estimates of the Lagrange multiplier vector and the Hessian matrix are
assumed only to be bounded uniformly across all iterations.
4. The step is evaluated for acceptance by using the augmented Lagrangian
function with penalty parameter updated by the El-Alem scheme [9]. A key
point here is that the step is computed before the penalty parameter, which
will be used to evaluate the step, is updated.
Points 1 and 3 are satisfied by the algorithms of Byrd, Schnabel, and Shultz [2];
Omojokun [21]; Celis, Dennis, and Tapia [4]; and Powell and Yuan [23]. The first two
papers require a normal, rather than just a quasi-normal, snc in point 2.
We use the following notation: the sequence of points generated by an algorithm
is denoted by {xk }. This work also uses subscripts -, c, and + to denote the previous,
the current, and the next iterates, respectively. However, when we need to work with
a whole sequence we will use the index k. The matrix Hc denotes the Hessian of the
Lagrangian at the current iterate or an approximation to it. Subscripted functions
mean the function value at a particular point; for example, fc = f (xc ), `c = `(xc , λc ),
and so on. Finally, unless otherwise specified, all the norms are `2 -norms, and we use
the same symbol 0 to denote the real number zero and the zero vector.
The rest of the paper is organized as follows: in section 2, we review the concept
of fraction of Cauchy decrease. In section 3, we review the SQP algorithm. In section
4, we survey existing trust-region algorithms for solving problem (EQC). In section
5, we present a general trust-region algorithm with the conditions that the trial step
must satisfy. In section 6 we state the algorithm. Sections 7 and 8 present the global
convergence theory that we have developed. In section 7.1, we state the assumptions
under which global convergence is established. In section 7.2, we discuss some properties of the trial steps. In section 7.3, we study the behavior of the penalty parameter.
Section 8 presents our main global convergence result. In section 9, we present, as an
example, an algorithm that solves problem (EQC), and we prove that it fits the assumptions of the paper. This algorithm was one we had in mind as motivation for the
convergence theory. It can be viewed as a generalization to the constrained problem
of the Steihaug–Toint dogleg algorithm for the unconstrained case. This algorithm
has worked quite well for some large problems. Finally, we make some concluding
remarks in section 10.
2. Fraction of Cauchy decrease condition. Consider the following unconstrained minimization problem
(UCMIN) ≡
minimize
subject to
f (x)
x ∈ <n ,
206
J. DENNIS, M. EL-ALEM , AND M. MACIEL
algorithm is a research topic. We believe that Tapia [29] will be of considerable value
here.
A related question that has to be looked at is the search for preconditioners to
produce more efficient algorithms. We believe that the reducer matrix W should play
a role in that search; see Dennis and Lewis [6].
This theory is developed for the equality constrained case, but it can be applied
to the general case by one of the strategies known as EQP and IQP. Here, we mean
that in the EQP strategy the choice of the active set is made outside the algorithm
that determines the step, whereas in the IQP strategy, that choice is made inside
the procedure that determines the step. Since the active set may change at each
iteration, the choice of the submatrix B, will be strongly affected. Certainly, this is
an important topic that deserves to be investigated.
Acknowledgments. We wish to thank Richard Byrd for many helpful comments
and the referees for pointing out many unclear points, which we hope have been clarified. We thank Robert Michael Lewis and the referees for pointing out the importance
of dealing with the quasi-normal component. We especially thank Luis Vicente and
Doug Moore for careful and insightful reading.
REFERENCES
[1] C. Bischof, A. Carle, G. Corliss, A. Griewank, and P. Hovland, Adifor – generating
derivative codes from fortran programs, Sci. Programming, 1 (1992), pp. 11–29.
[2] R. H. Byrd, R. B. Schnabel, and G. A. Shultz, A trust region algorithm for nonlinearly
constrained optimization, SIAM J. Numer. Anal., 24 (1987), pp. 1152–1170.
[3] R. Carter, Multi-Model Algorithms for Optimization, Ph.D. thesis, Department of Mathematical Sciences, Rice University, Houston, TX, 1986.
[4] M. R. Celis, J. E. Dennis Jr., and R. A. Tapia, A trust region strategy for nonlinear equality
constrained optimization, in Numerical Optimization 1984, SIAM, Philadelphia, PA, 1985.
[5] E. Craig, The n-step iteration procedures, J. Math. Physics, 34 (1955), pp. 64–73.
[6] J. E. Dennis Jr. and R. M. Lewis, A Comparison of Nonlinear Programming Approaches to
an Elliptic Inverse Problem and a New Domain Decomposition Approach, Tech. report,
Department of Computational and Applied Mathematics, Rice University, Houston, TX,
1994, in preparation.
[7] J. E. Dennis Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, NJ, 1983. Russian edition, Mir
Publishing Office, Moscow, 1988, O. Burdakov, translator.
[8] M. M. El-Alem, A Global Convergence Theory for a Class of Trust Region Algorithms for
Constrained Optimization, Ph.D. thesis, Department of Mathematical Sciences, Rice University, Houston, TX, 1988.
[9] M. M. El-Alem, A global convergence theory for the Celis-Dennis-Tapia trust region algorithm
for constrained optimization, SIAM J. Numer. Anal., 28 (1991), pp. 266–290.
[10] M. M. El-Alem, A robust trust-region algorithm with nonmonotonic penalty parameter scheme
for constrained optimization, SIAM J. Optim., 5 (1995), pp. 348–378.
[11] M. El-Hallabi and R. Tapia, A Global Convergence Theory for Arbitrary Norm Trust Region
Methods for Nonlinear Equations, Tech. report TR93-41(replaces 87-25), Department of
Computational and Applied Mathematics, Rice University, Houston, TX, submitted.
[12] R. Fletcher, A class of methods for nonlinear programming with termination and convergence properties, in Integer and Nonlinear Programming, J. Abadie, ed., North-Holland,
Amsterdam, 1970.
[13] R. Fletcher, An exact penalty function for nonlinear programming with inequalities, Math.
Programming, 5 (1973), pp. 129–150.
[14] R. Fletcher, Practical Methods of Optimization, John Wiley & Sons, New York, 1987.
[15] P. Gill, W. Murray, M. Saunders, and M. Wright, Some Theoretical Properties of an Augmented Lagrangian Merit Function, Tech. Report SOL 86-6, Stanford University, Stanford,
CA, 1986.
A THEORY FOR GENERAL TRUST-REGION-BASED ALGORITHMS
207
[16] P. Gill, W. Murray, and M. Wright, Practical Optimization, Academic Press, New York,
1981.
[17] M. C. Maciel, A Global Convergence Theory for a General Class of Trust Region Algorithm
for Equality Constrained Optimization, Ph.D. thesis, Department of Computational and
Applied Mathematics, Rice University, Houston, TX, 1992.
[18] J. J. Moré, Recent developments in algorithms and software for trust region methods, in
Mathematical Programming. The State of the Art, A. Bachem, M. Grotschel, and B. Korte,
eds., Springer-Verlag, New York, 1983, pp. 258–287.
[19] J. J. Moré and D. C. Sorensen, Computing a trust region step, SIAM J. Sci. Statist. Comput.,
4 (1983), pp. 553–572.
[20] S. G. Nash, Preconditioning of truncated-Newton methods, SIAM J. Sci. Statist. Comput., 6
(1985), pp. 599–616.
[21] E. O. Omojukun, Trust Region Strategies for Optimization with Nonlinear Equality and Inequality Constraints, Ph.D. thesis, Department of Computer Science, University of Colorado, Boulder, CO, 1989.
[22] M. J. D. Powell, Convergence properties of a class of minimization algorithms, in Nonlinear
Programming 2, O. Mangasarian, R. Meyer, and S. Robinson, eds., Academic Press, New
York, 1975, pp. 1–27.
[23] M. J. D. Powell and Y. Yuan, A trust region algorithm for equality constrained optimization,
Math. Programming, 49 (1991), pp. 189–211.
[24] K. Schittkowski, On the convergence of a sequential quadratic programming method with an
augmented Lagrangian line search function, Math. Operationsforch. Statist. Ser. Optim.,
14 (1983), pp. 197–216.
[25] G. A. Shultz, R. B. Schnabel, and R. H. Byrd, A family of trust-region-based algorithms
for unconstrained minimization with strong global convergence properties, SIAM J. Numer.
Anal., 22 (1985), pp. 47–67.
[26] D. Sorensen, Newton’s method with a model trust region modification, SIAM J. Numer. Anal.,
19 (1982), pp. 409–426.
[27] T. Steihaug, The conjugate gradient method and trust regions in large scale optimization,
SIAM J. Numer. Anal., 20 (1983), pp. 626–637.
[28] R. A. Tapia, Quasi-Newton methods for equality constrained optimization: Equivalence of
existing methods and a new implementation, in Nonlinear Programming 3, O. Mangasarian,
R. Meyer, and S. Robinson, eds., Academic Press, New York, 1978, pp. 125–164.
[29] R. A. Tapia, On secant update for use in general constrained optimization, Math. Comp., 51
(1988), pp. 181–202.
[30] P. L. Toint, Towards an efficient sparsity exploiting Newton method for minimization, Tech.
Report 80/4, Departément de Mathématique, Facultés Universitaires de Namur, Belgium,
1980.
[31] A. Vardi, Trust Region Strategies for Unconstrained and Constrained Minimization, Ph.D. thesis, School of Operations Research and Industrial Engineering, Cornell University, Ithaca,
NY, 1980.
[32] A. Vardi, A trust region algorithm for equality constrained minimization: Convergence properties and implementation, SIAM J. Numer. Anal., 22 (1985), pp. 575–591.
[33] K. A. Williamson, A robust trust region algorithm for nonlinear programming, Ph.D. thesis,
Department of Mathematical Sciences, Rice University, Houston, TX, 1990.
[34] J. Zhang, N.-H. Kim, and L. Lasdon, An improved successive linear programming algorithm,
Management Sci., 31 (1985), pp. 1312–1331.