Chemical Engineering Science, Vol. 47, No. 4, pp. 851-864, 1992.
Printed in Great Britain.
0009-2509/92 $5.00 + 0.00
© 1992 Pergamon Press plc

DECOMPOSITION STRATEGIES FOR LARGE-SCALE DYNAMIC OPTIMIZATION PROBLEMS

J. S. LOGSDON and L. T. BIEGLER
Department of Chemical Engineering, Carnegie-Mellon University, Pittsburgh, PA 15213, U.S.A.

(Received for publication 6 August 1991)
Abstract: Recently, efficient strategies have been developed to solve dynamic simulation and optimization
problems in a simultaneous manner. These rely on the ability to obtain an accurate algebraic discretization
of the differential equations as well as the ability to solve large optimization problems in an efficient
manner. These concerns have been addressed by applying orthogonal collocation on finite elements to these
systems and solving the nonlinear program (NLP) with a reduced-space successive quadratic programming
(SQP) approach. In a recent study we discussed theoretical properties of these differential-algebraic
equation (DAE) systems and cautioned that application of orthogonal collocation may not yield a stable
discretization nor an accurate solution to the control problem. As a result, preanalysis of the DAE
system is required and appropriate approximation error criteria must be embedded within the nonlinear
program. In this paper we tailor this approach to the accurate solution of optimal control problems. The
optimal control problem has a natural partitioning of control variables and state variables for the NLP.
Note here that the partitioned spaces are not orthogonal. We develop a decomposition strategy to: (1) exploit
the block matrix form of the discretized differential equations which results from using collocation on finite
elements, and (2) allow us to perform the optimization in the control space. Here the state variables for each
finite element are determined by linearized differential equations, and a coordination step is used to update
the control variables and integration length. Information is passed from element to element by chainruling
the state information. While the approach has much in common with earlier quasilinearization approaches,
the nonlinear programming strategy has a great deal of flexibility in determining control variable
discontinuities, enforcing a wide variety of state and control variable constraints and ensuring the accurate
determination of both state and control variable profiles. Two classes of problems are investigated: first we
consider problems where the differential equations are linear in the state variables, and then we consider the
general nonlinear (states and controls) problem. Example problems are illustrated for both classes of
problems.
1. INTRODUCTION
Several recent studies have dealt with the development of optimization-based model-predictive control
algorithms. For linear reference models, the recent
DMC (Cutler and Ramaker, 1979) and QDMC (Prett
and Garcia, 1987) approaches have been popular in
dealing with MIMO systems and allowing the direct
incorporation of process constraints. For nonlinear
reference models, similar approaches have been developed which rely on nonlinear programming (NLP)
strategies to solve dynamic optimization problems
on-line (Patwardhan et al., 1989; Eaton et al., 1988;
Renfro et al., 1987; Li and Biegler, 1989). However, the
determination of optimal control profiles for large
chemical processes, described by models with both
differential and algebraic equations (DAEs), remains a
challenging problem. In particular, the solution and
optimization of the differential equations and algebraic equations requires the numerical algorithm to
be able to handle state and control variable (equality
and inequality) constraints in the reference model. In
the simultaneous approach, one can deal with state
path constraints and control path constraints simply
by including them in the NLP formulation. Also, the
differential equations are converted to algebraic equations using orthogonal collocation on finite elements.
This approach has been used by Cuthrell and Biegler
(1987, 1989), Renfro et al. (1987), and Logsdon and
Biegler (1989) to achieve solutions efficiently for the
optimal control problem.
While the simultaneous approach offers a number
of advantages for dynamic optimization problems, the
nonlinear programming formulations for these problems can become large. Consequently, some form of
decomposition or exploitation of problem structure is
required in order to solve the resulting NLP efficiently.
Vasantharajan and Biegler (1988) and Vasantharajan et al. (1990) developed a general purpose
decomposition algorithm for successive quadratic programming (SQP) and demonstrated its efficiency
and reliability with respect to other general purpose NLP solvers. Logsdon et al. (1990) also
applied this approach to the optimal operation of batch
distillation systems. However, this general purpose
approach does not take advantage of the block-like
structure of the collocation equations; more efficient
approaches can, therefore, be constructed.
For this reason, we develop in this study a special
purpose decomposition algorithm for dynamic optimization problems. Here we take advantage of the
natural partitioning of control and state variables and
of the collocation matrix structure, which occurs from
the discretization of the differential equations. Because initial conditions for the state variables are
usually specified, and final-state variables are determined by the control variables, we choose as independent (decision) variables the control variables and
the finite element lengths. We, therefore, exploit this
partitioning of variables by developing an SQP decomposition technique where the optimization step
[solution to the quadratic programming (QP) subproblem] is performed in the control space.
In particular, this approach has special advantages
for problems which are linear in the state variables.
Here the interior states can be eliminated entirely
from the QP step by solving for the complete state
trajectories for a set of control variables. Also,
enforcement of state path and control variable constraints remains straightforward within this formulation.
In the next section we review the general NLP
formulation for optimal-control problems using
collocation on finite elements and SQP. In Section 3
we consider the general purpose optimization approach and tailor it to this problem class by partitioning the variables and creating a smaller (QP) subproblem. In Section 4, we develop the decomposition
technique for linear-state problems, present some examples, and show that the nonlinear case can be
extended from the linear case by using a
Newton-Raphson technique. For the SQP algorithm
we also show that convergence can be accelerated by
preprocessing the initial Hessian when using a quasi-Newton method. Finally, in Section 5 we summarize
the numerical results, draw conclusions, and discuss
future directions for the work.
2. NLP FORMULATION
In this section we briefly review the formulation of
the NLP for control problems using collocation on
finite elements. Consider the following general control
problem for $t \in [a, b]$:

$$
\begin{aligned}
\min_{u(t),\,z(t),\,p}\quad & \psi[z(b), p] + \int_a^b G[z(t), u(t), p]\,dt \\
\text{such that}\quad & \dot z(t) = F[z(t), u(t), p] \\
& g[u(t), z(t)] \le 0 \\
& g_f[z(b)] \le 0 \\
& z(a) = z_0 \\
& z(t)^L \le z(t) \le z(t)^U \\
& u(t)^L \le u(t) \le u(t)^U
\end{aligned}
\tag{CP1}
$$

where
$\psi[z(b), p]$ = component of objective function evaluated at final conditions
$\int_a^b G[z(t), u(t), p]\,dt$ = component of objective function over a period of time
$g$ = inequality design constraint vector
$z(t)$ = state profile vector
$u(t)$ = control profiles
$p$ = design parameters, not time-dependent
$g_f$ = inequality constraints at final conditions
$z_0$ = initial condition for state vector
$z(t)^L$, $z(t)^U$ = state profile bounds
$u(t)^L$, $u(t)^U$ = control profile bounds.

In order to use the NLP formulation, we convert
the differential equations to algebraic equations using
collocation on finite elements. Here, we use a polynomial approximation for the discretization of the
ordinary differential equations and apply orthogonal
collocation to construct the residual equations, which
are solved as a set of algebraic equations. These
residuals are evaluated at the shifted roots of an
orthogonal Legendre polynomial.
Consider the initial-value problem over a finite
element i with time $t \in [\zeta_i, \zeta_{i+1}]$:

$$
\dot z = F[z(t), u(t), p], \qquad t \in [\zeta_i, \zeta_{i+1}]
\tag{1}
$$

for state profiles z(t), control profiles u(t), and design
parameters p. We approximate the solution by
Lagrange polynomials over element i, $\zeta_i \le t \le \zeta_{i+1}$:

$$
\begin{aligned}
z_{K+1}(t) &= \sum_{j=0}^{K} z_{ij}\,\phi_j(t), &
\phi_j(t) &= \prod_{k=0,\,k\ne j}^{K} \frac{t - t_{ik}}{t_{ij} - t_{ik}} && \text{in element } i \\
u_{K}(t) &= \sum_{j=1}^{K} u_{ij}\,\theta_j(t), &
\theta_j(t) &= \prod_{k=1,\,k\ne j}^{K} \frac{t - t_{ik}}{t_{ij} - t_{ik}} && \text{in element } i
\end{aligned}
\qquad i = 1, \dots, NE.
\tag{2}
$$

Here the products run over $k \ne j$. Also $z_{K+1}(t)$ is a (K + 1)th
order (degree < K + 1) piecewise polynomial and
$u_K(t)$ is a Kth order (degree < K) piecewise polynomial. The difference in orders is due to the existence
of the initial conditions for z(t), for each element i.
Also, the Lagrange polynomial has the desirable property that [for $z_{K+1}(t)$, for example]

$$
z_{K+1}(t_{ij}) = z_{ij}
\tag{3}
$$

which is due to the Lagrange condition $\phi_k(t_j) = \delta_{kj}$,
where $\delta_{kj}$ is the Kronecker delta. This polynomial
form allows for the direct bounding of the states and
controls, i.e. path constraints can be imposed on the
problem formulation.
By using K point, orthogonal collocation on finite
elements as shown in Fig. 1, and by defining the basis
functions so that they are normalized over each
element $\Delta\zeta_i$ ($\tau \in [0, 1]$), one can write the residual
equation as follows:

$$
\Delta\zeta_i\, r(t_{ik}) = \sum_{j=0}^{K} z_{ij}\,\dot\phi_j(\tau_k) - \Delta\zeta_i\, F(z_{ik}, u_{ik}, p),
\qquad i = 1, \dots, NE, \quad k = 1, \dots, K
\tag{4}
$$

where $\dot\phi_j(\tau_k) = d\phi_j/d\tau$ and is calculated off-line. Note
that $t_{ik} = \zeta_i + \Delta\zeta_i \tau_k$. This form is convenient to work
with when the element lengths are included as decision variables because it is still defined if $\Delta\zeta_i$ goes to
zero during the solution of the optimization problem.

Fig. 1. Finite-element collocation discretization for state profiles z(t), control profiles u(t), and element
lengths $\Delta\zeta_i$.
The element lengths are also used to find possible
points of discontinuity for the control profiles and to
ensure that the integration accuracy is within a numerical tolerance. Additionally,
we enforce the continuity of the states at element endpoints (interior
knots $\zeta_i$, i = 2, . . . , NE), but we allow the control
profiles to have discontinuities at these endpoints.
Here

$$
z_{K+1}^{i}(\zeta_i) = z_{K+1}^{i-1}(\zeta_i), \qquad i = 2, \dots, NE
\tag{5}
$$

or

$$
z_{i0} = \sum_{j=0}^{K} z_{i-1,j}\,\phi_j(\tau = 1), \qquad i = 2, \dots, NE.
\tag{6}
$$

These endpoints also provide the initial conditions for
the next element states. Note that the $\phi_j(\tau_k)$ and the
$\dot\phi_j(\tau_k)$ terms (basis functions and their derivatives) are
calculated beforehand [see Villadsen and Michelsen
(1978)], since they depend only on the Legendre root
locations.
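The off-line tabulation of the basis functions and their derivatives can be sketched in a few lines. The example below is a minimal illustration in Python with NumPy (an assumption; the paper does not name an implementation language), and the function name `collocation_tables` is hypothetical. It places the collocation points at the shifted Legendre roots on (0, 1), adds $\tau_0 = 0$ for the element's initial condition, and evaluates each Lagrange basis polynomial at $\tau = 1$ and its derivative at the collocation points.

```python
import numpy as np

def collocation_tables(K):
    """Tabulate tau_j, phi_j(1) and dphi_j/dtau at tau_k for K-point collocation.

    Returns tau with shape (K+1,), phi_at_1 with shape (K+1,), and dphi with
    shape (K+1, K), where dphi[j, k-1] is the derivative of phi_j at tau_k.
    """
    # Shifted Legendre roots: Gauss-Legendre nodes mapped from (-1, 1) to (0, 1).
    nodes, _ = np.polynomial.legendre.leggauss(K)
    tau = np.concatenate(([0.0], np.sort(0.5 * (nodes + 1.0))))
    phi_at_1 = np.empty(K + 1)
    dphi = np.empty((K + 1, K))
    for j in range(K + 1):
        # Lagrange basis phi_j: one at tau_j, zero at every other point.
        others = np.delete(tau, j)
        coeffs = np.poly(others) / np.prod(tau[j] - others)
        phi_at_1[j] = np.polyval(coeffs, 1.0)
        dphi[j, :] = np.polyval(np.polyder(coeffs), tau[1:])
    return tau, phi_at_1, dphi
```

Because the tables depend only on K, they can be computed once and reused for every element and every optimization iteration.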
Because of properties of Lagrange polynomials, the
imposition of state variable constraints is straightforward. However, for these optimal-control problems,
numerical difficulties are encountered if
state path constraints are active and/or singular arc
segments occur. These systems are equivalent to the
solution of high-index DAE systems. Here, preanalysis of the ODE model is necessary to determine
the potential index of the DAE system, and the appropriate collocation (or implicit Runge-Kutta) method,
if it exists, should be used for the discretization
(Logsdon and Biegler, 1989). This preanalysis can be
performed by examining the Kuhn-Tucker conditions of (NLP1).
After the potential index of the system has been
determined, the order of the collocation method
(number of collocation points) can be specified in
order to formulate the NLP. This formulation, denoted (NLP1), consists of the ODE model discretized on finite elements,
the continuity equation for state variables, and any
other equality and inequality constraints that may be
required. In (NLP1), i refers to the element and j to the collocation
point. Also, $\Delta\zeta_i$ are the finite-element lengths for
i = 1, . . . , NE, $z_f$ is the value of the state at the final
time, and the constraint $g_f$ is evaluated at the final
time. Note that $z_{ij}$, $u_{ij}$ are collocation coefficients for
the state and control profiles and p are any additional
design parameters.
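For orientation, the ingredients listed above can be assembled into a single discretized problem. The display below is only a sketch put together from (CP1), eq. (4) and eq. (6). It assumes that the integral term of (CP1) has been absorbed into the terminal term $\psi$ (for instance by adding a state), and it should not be read as the authors' exact statement of (NLP1).

```latex
\begin{aligned}
\min_{z_{ij},\,u_{ij},\,p,\,\Delta\zeta_i}\quad & \psi(z_f, p) \\
\text{such that}\quad
& \sum_{j=0}^{K} z_{ij}\,\dot\phi_j(\tau_k) - \Delta\zeta_i\,F(z_{ik}, u_{ik}, p) = 0,
  && i = 1,\dots,NE,\ k = 1,\dots,K \\
& z_{10} = z_0, \qquad
  z_{i0} = \sum_{j=0}^{K} z_{i-1,j}\,\phi_j(\tau = 1), && i = 2,\dots,NE \\
& z_f = \sum_{j=0}^{K} z_{NE,j}\,\phi_j(\tau = 1) \\
& g(u_{ij}, z_{ij}) \le 0, \qquad g_f(z_f) \le 0 \\
& z_{ij}^{L} \le z_{ij} \le z_{ij}^{U}, \qquad u_{ij}^{L} \le u_{ij} \le u_{ij}^{U}
\end{aligned}
```

The first two constraint rows are the collocation residuals and the continuity equations, which together play the role of the equality constraints in the decomposition that follows.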
Problem (NLP1) can now be solved by any large-scale nonlinear programming solver. For this, we use
SQP for the optimization step. We next consider the
general quadratic problem needed for the solution of
the NLP:

$$
\min_{z}\ \Phi(z) \quad \text{such that} \quad g(z) \le 0, \quad h(z) = 0
\tag{NLP2}
$$

where
$\Phi: \mathbb{R}^n \to \mathbb{R}$, objective function
$g$: inequality constraints (vector-valued function on $\mathbb{R}^n$)
$h: \mathbb{R}^n \to \mathbb{R}^m$, equality constraints
$z \in \mathbb{R}^n$, set of variables.
However, as the number of variables becomes large
(say, over 100), SQP can become inefficient, because a
dense n x n Hessian approximation matrix must be
stored and because most quadratic programming
algorithms used in SQP codes are dense implementations. To avoid this limitation, SQP decomposition
procedures have been used successfully for general
purpose problems. For example, the approach by
Vasantharajan and Biegler (1988) partitions the problem space into the range (or equation) space [Y(z), an
(n x m) basis matrix] and the null (or optimization)
space [Z(z), an (n x (n - m)) basis matrix]. At each
iteration k of the SQP method

$$
z^{k+1} = z^{k} + d
$$

the search direction is thus partitioned into

$$
d = Y d_Y + Z d_Z .
$$

Note that the choice of the matrices Y and Z is general
for any choice of variable partitioning. Vasantharajan
and Biegler (1988) choose the range and null basis
matrices to be orthogonal to each other. Now the
range space direction is determined by

$$
d_Y = -(\nabla h^T Y)^{-1} h
$$

which can be interpreted as a least-squares projection
if the Y basis matrix is orthogonal to Z. Also, assuming that the range space direction is small (it vanishes
as h approaches zero), the reduced quadratic programming subproblem can be solved to yield the null
space direction, i.e.

$$
\begin{aligned}
\min_{d_Z}\quad & (Z^T \nabla\Phi)^T d_Z + \tfrac{1}{2}\, d_Z^T (Z^T B Z)\, d_Z \\
\text{such that}\quad & \nabla g^T Z\, d_Z \le -g + \nabla g^T Y (\nabla h^T Y)^{-1} h
\end{aligned}
\tag{QP1}
$$

where B is an approximation to the Hessian of the
Lagrange function. Here the reduced gradients are
given by $Z^T \nabla\Phi$ and $Z^T \nabla g$ for the objective and constraint functions, respectively. Moreover, the reduced
Hessian matrix $Z^T B Z$ is updated directly by a quasi-Newton formula. With the (QP1) solution for $d_Z$ and
u, the multiplier estimates for the inequality constraints, the remaining multiplier estimates for the
equality constraints can be determined by

$$
v = -(Y^T \nabla h)^{-1}\, Y^T (\nabla\Phi + \nabla g\, u).
$$
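The range-space and null-space moves can be made concrete with a small numerical sketch. The Python/NumPy example below is an illustration under stated assumptions, not the OPT code: the function name `reduced_space_step` is hypothetical, orthogonal bases are obtained here from a QR factorization of the constraint gradients (one convenient way to make the two bases orthogonal), and inequality constraints are omitted so that the reduced QP has a closed-form solution.

```python
import numpy as np

def reduced_space_step(grad_obj, B, jac_h, h):
    """One equality-constrained reduced-space SQP step (no inequalities).

    grad_obj : (n,) objective gradient
    B        : (n, n) Hessian approximation
    jac_h    : (n, m) matrix whose columns are the constraint gradients
    h        : (m,) constraint residuals
    Returns the search direction d and the equality multiplier estimate v.
    """
    n, m = jac_h.shape
    # Full QR factorization: the first m columns of Q span the range space,
    # the remaining n - m columns span the null space of jac_h^T.
    Q, _ = np.linalg.qr(jac_h, mode="complete")
    Y, Z = Q[:, :m], Q[:, m:]
    # Range-space move: satisfies the linearized constraints jac_h^T d + h = 0.
    d_Y = -np.linalg.solve(jac_h.T @ Y, h)
    # Null-space move: minimizer of the reduced QP without inequality constraints.
    d_Z = -np.linalg.solve(Z.T @ B @ Z, Z.T @ grad_obj)
    d = Y @ d_Y + Z @ d_Z
    # Multiplier estimate for the equality constraints (no inequality multipliers).
    v = -np.linalg.solve(Y.T @ jac_h, Y.T @ grad_obj)
    return d, v
```

For the objective $\tfrac12\|z\|^2$ with the single constraint $z_1 + z_2 + z_3 = 3$, a call at $z = 0$ moves directly onto the constraint through the range-space step, while the null-space step is zero because the reduced gradient vanishes there.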
For our problem the dependent (or “range”) space is
the state variable space, and the reduced (or “null”)
space is the control space. However, because of the
finite-element structure of the collocation equations,
the general purpose approach of Vasantharajan and
Biegler (1988) and the least-squares projection for the
range space step can lead to considerable storage and
computational effort. Instead, if we were to use a
feasible-path method, such as a reduced-gradient
method, for the differential equation equality constraints, we only need to work in the reduced space
(we keep the range-space move at zero), and the calculation of the state
variables is performed as we solve the collocation
equations forward in time. Now for problems linear in
the state variables, these equations can be satisfied
exactly, once the control variables and element
lengths are fixed, by applying the linear, element by
element, decomposition approach developed in the
next section. Thus, the problem reduces to one in the
control variable space. Similarly, for problems that
are nonlinear in the state variables, we can use a
Newton-Raphson approach to maintain feasibility.
Finally, note that from (QPl) the enforcement of
the inequalities for the state variable path constraints
is done in the QP step. Here we eliminate the equality
constraints from the state differential equations and
calculate reduced gradients with respect to the objective function and (state and control variable) inequality constraints. In the next section we develop
this tailored decomposition technique by exploiting
the sparsity of the block matrices that result from the
finite-element equations.
3. ALGORITHM FOR PROBLEMS LINEAR IN STATES
In this section we develop an algorithm for problems linear in the state variables (with possibly nonlinear controls). A set of control variables determines the
solution trajectories for the states. We construct these
trajectories by solving the ODEs forward in time
using the finite-element structure and passing the
information from element to element. This allows us
to exploit the sparsity of the ODEs and the collocation formulation. Once the trajectories have been
computed, and the derivative information (sensitivity
of states to control variables) is obtained, we chainrule
this information in order to obtain the reduced
gradients of the objective and constraint functions.
We then call the optimization program (SQP) to
determine the optimal-control profile. Because of the
linear property of the differential equations, the resulting method is a reduced-gradient, feasible-path approach, as the collocation equations are solved at
each optimization iteration.
Formulation
To motivate this section, we first consider a simple,
linear optimal-control problem. This problem, described in Cuthrell and Biegler (1987), consists of
starting and stopping a car in minimum time for a
fixed distance (300 units), and is given by
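$$
\begin{aligned}
\min\quad & t_f \\
\text{such that}\quad & \dot z_1 = z_2, \qquad \dot z_2 = u \\
& z_1(0) = 0, \qquad z_1(t_f) = 300 \\
& z_2(0) = 0, \qquad z_2(t_f) = 0 \\
& -2 \le u \le 1 .
\end{aligned}
$$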
Next consider the structure which results from discretizing the differential equations using collocation
on finite elements for two-point collocation. For each
finite element, we solve six equations: four collocation
residual equations and two continuity equations. Figure 2 shows the incidence matrix for the two-element
formulation.
Note that each incident x represents the appearance of a variable in an equation. To determine the
initial conditions for each element, one can examine
the continuity equations and find the first appearance
of the state. This first incident x is the initial condition
for that differential equation, and the last incident x is
the initial condition for the next element. For the first
continuity equation, the initial condition is the first
variable, the next two x's are the interior states, and
the last x is the endpoint or the initial condition for
the next element. Finally, the decision variables are
ordered with the two control variables first and the
integration length following for each finite element.
We can now exploit this structure by passing the
information from element to element.
Consider the first-element residual equations as
shown in Fig. 3. By fixing the control variables and
the element length, we can easily solve for the (linear)
states within the element. Let $z_i^k$ represent the interior
states in element i and b the right-hand sides, both at
iteration k of the optimization algorithm. Now $z_i^k$ is
determined in each element by $z_i^k = A^{-1} b$, which
results from the collocation equations given by eq. (4).
In particular, we see that for the collocation equations
($h_i = 0$) we have

$$
h_i(z_{i0}, u_{ij}, z_{ij}, \Delta\zeta_i) = 0, \qquad i = 1, \dots, NE, \quad j = 1, \dots, K
$$

with $A = \partial h_i / \partial z_i$ and $b$ the corresponding right-hand side, both evaluated at iteration
k of OPT.

Fig. 2. Incidence matrix for the car problem example.
Fig. 3. First-element incidence matrix of the car problem.
Fig. 4. Decomposition for element to element solution approach.
We further apply the continuity equations to determine the initial conditions for the next finite element
and continue the forward elimination of the collocation equations. This leads to the decomposition
strategy shown in Fig. 4 for the Jacobian matrix.
So, for an initial set of state variables, we integrate
forward to form a set of final-state variables which are
functions only of the control variables, the element
lengths and the initial-state conditions:

$$
z_f = f(z_0, \Delta\zeta_1, u_1, \Delta\zeta_2, u_2, \dots, \Delta\zeta_{NE}, u_{NE}).
$$

Note that the flow of information from element to
element is passed forward through the continuity
equations. A schematic diagram of this decomposition
is illustrated in Fig. 5.
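The element-by-element forward elimination can be sketched for a system of the form $\dot z = A_c z + B_c u$ with a scalar control. The Python/NumPy example below is a minimal illustration, not the authors' implementation: the names `collocation_tables` and `integrate_linear_states` are hypothetical, and the constant matrices `A_c` and `B_c` are an assumption that covers the car problem ($\dot z_1 = z_2$, $\dot z_2 = u$). Within each element the interior states follow from one linear solve, and the continuity equation supplies the next element's initial condition.

```python
import numpy as np

def collocation_tables(K):
    """Return tau_j, phi_j(1) and dphi_j/dtau at tau_k (shifted Legendre roots)."""
    nodes, _ = np.polynomial.legendre.leggauss(K)
    tau = np.concatenate(([0.0], np.sort(0.5 * (nodes + 1.0))))
    phi_at_1 = np.empty(K + 1)
    dphi = np.empty((K + 1, K))
    for j in range(K + 1):
        others = np.delete(tau, j)
        coeffs = np.poly(others) / np.prod(tau[j] - others)
        phi_at_1[j] = np.polyval(coeffs, 1.0)
        dphi[j, :] = np.polyval(np.polyder(coeffs), tau[1:])
    return tau, phi_at_1, dphi

def integrate_linear_states(A_c, B_c, z0, controls, lengths, K=2):
    """Solve the collocation equations forward, one finite element at a time.

    controls : (NE, K) control values at the collocation points
    lengths  : (NE,) element lengths
    Returns the state at the end of the last element.
    """
    _, phi_at_1, dphi = collocation_tables(K)
    n = len(z0)
    z_init = np.asarray(z0, dtype=float)
    for u_i, dzeta in zip(controls, lengths):
        # Linear system A z = b for the K interior states of this element.
        A = np.kron(dphi[1:, :].T, np.eye(n)) - dzeta * np.kron(np.eye(K), A_c)
        b = (-np.outer(dphi[0, :], z_init) + dzeta * np.outer(u_i, B_c)).ravel()
        z_int = np.linalg.solve(A, b).reshape(K, n)
        # Continuity: extrapolate to tau = 1 to start the next element.
        z_init = phi_at_1[0] * z_init + phi_at_1[1:] @ z_int
    return z_init
```

For the car problem with u = 1 on an element of length 20 followed by u = -2 on an element of length 10, the sketch returns a final distance of 300 and a final velocity of 0 to within round-off, since two-point collocation is exact for the quadratic distance profile.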
Note also that for two-point boundary value problems (TPBVP),
we can also include these functions of
the final states as equality constraints. On the other
hand, if the objective function is one of the final-state
conditions and no other state conditions are specified,
then we simply include the state condition to be
optimized as the objective function and solve the
other state differential equations. We do not require
any other final-state conditions as additional equality
constraints within the NLP.

Fig. 5. ODE solver for state differential equations using collocation on finite elements with information
processed from element to element.
In addition, if we have inequality constraints that
depend on state variables ($z_c$) within some (or all)
intermediate elements, i.e. at element c, then these can
also be expressed by

$$
z_c = f(z_0, \Delta\zeta_1, u_1, \Delta\zeta_2, u_2, \dots, \Delta\zeta_c, u_c).
$$
Now to illustrate the calculation procedure for the
reduced gradients of the objective and constraint
functions, recall that we are solving within each element for the interior points:

$$
z_i^k = A^{-1} b.
\tag{7}
$$

Differentiating eq. (7), we have in each element

$$
\frac{\partial z_i}{\partial e} = -A^{-1}\,\frac{\partial h_i}{\partial e}
$$

where e represents the control variables and the element length for each finite element (e.g. time for the
car problem). An analogous equation holds for the
sensitivities of $z_i$ to $z_{i,0}$. We then construct the states
at the endpoints ($z_i^f$) by using the continuity equations
and then compute the sensitivities within each element for each state variable:

$$
z_i^f = z_{i+1,0} = \sum_{j=0}^{K} z_{ij}\,\phi_j(\tau = 1).
\tag{9}
$$

We then proceed to the next element and calculate
the interior states for that element. We must also
calculate the interior-state sensitivity to the previous
element control variables by chainruling the derivatives. Note that the chainruling is done through the
final-state variables within each element i, starting
from each control variable in every element j, up to
element i.
This forward elimination and chainruling scheme acts
as a simplified ODE solver. Once state variable vectors and their sensitivities are calculated, reduced
gradients for the objective and the constraint functions,
$g(z_c)$, with respect to the jth control variable are
constructed by the following straightforward relations:

$$
\frac{d\Phi}{du_j} = \frac{\partial\Phi}{\partial z_f}\,\frac{dz_f}{du_j} + \frac{\partial\Phi}{\partial u_j},
\qquad
\frac{dg}{du_j} = \frac{\partial g}{\partial z_c}\,\frac{dz_c}{du_j} + \frac{\partial g}{\partial u_j}.
\tag{11}
$$

Results of the gradient calculations are then transferred to the SQP optimization strategy (OPT,
Cuthrell and Biegler, 1985) and the optimal control
problems are solved in the control variable (and element length) space. The algorithm for this approach is
described below.

Algorithm for problems linear in states
(0.0) Examine the structure of the state variable
constraints and determine the maximum
likely index of the resulting DAE system, if
any state variable constraints were to become
active. Choose the corresponding number of
collocation points based on this index [see
Logsdon and Biegler (1989) for details].
(1.0) For a set of decision variables, begin with the
initial states and start constructing the state
trajectories. Within an element, perform the
following operations.
(1.1) For a set of fixed decision variables, begin
with the initial states and solve the residual
equations ($h_i$) to obtain the interior states:

$$
z_i^k = A^{-1} b.
$$

(1.2) Calculate the derivatives for this element's
decision variables and its initial conditions $z_{i,0}$
by using

$$
\left[\frac{\partial z_i}{\partial e}\ \ \frac{\partial z_i}{\partial z_{i,0}}\right]
= -A^{-1}\left[\frac{\partial h_i}{\partial e}\ \ \frac{\partial h_i}{\partial z_{i,0}}\right].
$$

Note that $[\partial h_i/\partial e\ \ \partial h_i/\partial z_{i,0}]$ is determined analytically
from the differential equations.
(1.3) Apply the continuity equations and solve for
the next element's initial conditions:

$$
z_i^f = z_{i+1,0} = \sum_{j=0}^{K} z_{ij}\,\phi_j(\tau = 1).
$$

(1.4) Calculate the residuals for the approximation
error, evaluated at a noncollocation point
(here the endpoint), and average them over the M
residual equations evaluated at the noncollocation point.
This residual error can either be monitored
over the course of the optimization, or a
constraint bounding it by a small error tolerance
can be imposed directly in (NLP1) for each element i.
(1.5) Chainrule the derivatives from previous elements and update:

$$
\frac{dz_i}{de_j} = \frac{dz_i}{dz_{i-1}^f}\,\frac{dz_{i-1}^f}{de_j}.
$$

(2.0) Continue until an intermediate element is
reached that influences an inequality [$g(z_c)$],
or until the last element is reached. Determine
the reduced gradients for the objective and
constraint functions according to eq. (11).
(3.0) Assemble the objective and all of the constraint function values and reduced gradients
from the above steps.
(4.0) Call the OPT algorithm. If Kuhn-Tucker
conditions are satisfied, STOP.
(5.0) Otherwise, OPT solves the following quadratic program [Note that this QP contains all
of the state and control variable constraints. It
differs from (QP1) in that h = 0 and no "range
space move" need be included]:

$$
\begin{aligned}
\min_{\Delta e}\quad & (Z^T \nabla\Phi)^T \Delta e + \tfrac{1}{2}\,\Delta e^T (Z^T B Z)\,\Delta e \\
\text{such that}\quad & \nabla g^T Z\,\Delta e \le -g
\end{aligned}
\tag{QP2}
$$

to determine the search direction $\Delta e$ and
steplengths for the decision variables u and $\Delta\zeta$.
In addition, OPT also updates the reduced
Hessian matrix ($Z^T B Z$) based on the BFGS
formula [see Biegler and Cuthrell (1985) for
more details on OPT].
(6.0) Return to step 1.0 with new set of decision
variables from OPT.

To illustrate and demonstrate this approach we
next consider some straightforward example problems. It should be noted that this approach has also
been applied to optimization of tray by tray batch
distillation models with composition constraints enforced over time (Logsdon, 1990). These will be described in future studies.

4. EXAMPLE PROBLEMS LINEAR IN STATES
First we consider the car problem discussed above:

$$
\begin{aligned}
\min\quad & \Phi(t_f) = t_f \\
\text{such that}\quad & \dot z_1 = z_2, \qquad \dot z_2 = u \\
& z_1(0) = 0, \qquad z_1(t_f) = 300 \\
& z_2(0) = 0, \qquad z_2(t_f) = 0 \\
& -2 \le u \le 1 .
\end{aligned}
$$
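For the car problem, steps (1.1) to (1.5) of the algorithm can be sketched as a forward pass that returns both the final states and their sensitivities to every decision variable. The Python/NumPy example below is a minimal illustration under assumptions: the names `collocation_tables` and `forward_with_sensitivities` are hypothetical, the model is written as $\dot z = A_c z + B_c u$ with a scalar control, and the decision variables of each element are ordered as the K controls followed by the element length.

```python
import numpy as np

def collocation_tables(K):
    """Return tau_j, phi_j(1) and dphi_j/dtau at tau_k (shifted Legendre roots)."""
    nodes, _ = np.polynomial.legendre.leggauss(K)
    tau = np.concatenate(([0.0], np.sort(0.5 * (nodes + 1.0))))
    phi_at_1 = np.empty(K + 1)
    dphi = np.empty((K + 1, K))
    for j in range(K + 1):
        others = np.delete(tau, j)
        coeffs = np.poly(others) / np.prod(tau[j] - others)
        phi_at_1[j] = np.polyval(coeffs, 1.0)
        dphi[j, :] = np.polyval(np.polyder(coeffs), tau[1:])
    return tau, phi_at_1, dphi

def forward_with_sensitivities(A_c, B_c, z0, controls, lengths, K=2):
    """Return the final state z_f and G = d z_f / d e for all decision variables."""
    _, phi_at_1, dphi = collocation_tables(K)
    n, NE = len(z0), len(lengths)
    I_n, I_K = np.eye(n), np.eye(K)
    E = np.kron(phi_at_1[1:][None, :], I_n)   # maps interior states to tau = 1
    z = np.asarray(z0, dtype=float)
    G = np.zeros((n, NE * (K + 1)))
    for i, (u_i, dzeta) in enumerate(zip(controls, lengths)):
        # Step 1.1: interior states from the linear collocation equations.
        A = np.kron(dphi[1:, :].T, I_n) - dzeta * np.kron(I_K, A_c)
        b = (-np.outer(dphi[0, :], z) + dzeta * np.outer(u_i, B_c)).ravel()
        z_int = np.linalg.solve(A, b)
        # Step 1.2: sensitivities to the initial state, controls and length,
        # written as A^{-1} times minus the partial derivatives of h = A z - b.
        db_dz0 = -np.kron(dphi[0, :][:, None], I_n)
        db_du = dzeta * np.kron(I_K, B_c[:, None])
        rhs_len = np.kron(I_K, A_c) @ z_int + np.outer(u_i, B_c).ravel()
        sens = np.linalg.solve(A, np.column_stack([db_dz0, db_du, rhs_len]))
        T = phi_at_1[0] * I_n + E @ sens[:, :n]   # d z_end / d z_init
        S = E @ sens[:, n:]                       # d z_end / d e_i
        # Step 1.5: chainrule earlier elements through this element, then
        # store this element's own sensitivities.
        G = T @ G
        G[:, i * (K + 1):(i + 1) * (K + 1)] = S
        # Step 1.3: continuity gives the next element's initial condition.
        z = phi_at_1[0] * z + E @ z_int
    return z, G
```

The rows of G are the reduced gradients of the final distance and final velocity, which enter the terminal conditions $z_1(t_f) = 300$ and $z_2(t_f) = 0$; the reduced gradient of the objective $t_f$ is simply one for each element length and zero for each control.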
As shown in Logsdon and Biegler (1989), the optimal
solution can be found by solving an equivalent index-one DAE system, because one differentiation is needed
to obtain an expression for $\dot u$ (from the active inequality constraint bounding u). The analytical solution is
the expected bang-bang solution shown in Figs 6-8.
Using a two-point collocation method in the NLP
formulation leads to a solution which matches the
analytical results.

Fig. 6. Acceleration profile, car problem solution; matches analytical values.
Fig. 7. Velocity profile, car problem solution; matches analytical values.
Fig. 8. Displacement profile, car problem solution; matches analytical values.
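As a quick check on the bang-bang structure, the switching time and the minimum time can be worked out by hand. The derivation below assumes a single switch from full acceleration (u = 1) to full braking (u = -2); the symbols $t_s$ (switching time) and $v_s$ (velocity at the switch) are introduced here for illustration only.

```latex
\begin{aligned}
v_s &= t_s, \qquad 0 = v_s - 2\,(t_f - t_s) \;\Rightarrow\; t_f - t_s = \tfrac{1}{2}\,t_s, \\
300 &= \tfrac{1}{2}\,t_s^2 + v_s\,(t_f - t_s) - (t_f - t_s)^2
     = \tfrac{1}{2}\,t_s^2 + \tfrac{1}{2}\,t_s^2 - \tfrac{1}{4}\,t_s^2
     = \tfrac{3}{4}\,t_s^2, \\
t_s &= 20, \qquad t_f = 30 .
\end{aligned}
```

Under this assumption the car accelerates for 20 time units, reaching a velocity of 20 and a distance of 200, and then brakes for 10 time units to stop at a distance of 300.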
The second example is the batch reactor example
found in Ray (1981) and discussed by Biegler (1984)
and Renfro (1987). This problem is of interest because
the control profile becomes saturated, and moving
finite elements are required to find the exact profile.
The optimal-control problem is:

$$
\begin{aligned}
\max\quad & [\,y_2(1.0)\,] \\
\text{such that}\quad & \dot y_1 = -(u + u^2/2)\,y_1 \\
& \dot y_2 = u\,y_1 \\
& y_1(0) = 1, \qquad y_2(0) = 0 \\
& 0 \le u \le 5 .
\end{aligned}
$$

The optimal solution is again equivalent to an index-one system because one differentiation is required to
obtain an expression for $\dot u$ from the optimality conditions. Therefore, two-point collocation should achieve
the solution with good accuracy. Since the problem is linear in the states, we solve for the states within
each element for a set of control variables using the
algorithm presented earlier. In order to accelerate
convergence, the initial reduced Hessian approximation was calculated by perturbing the analytical reduced gradients from the decision variables. Shanno
and Phua (1978) and Liu and Nocedal (1988) discuss
various scaling strategies for the initial Hessian.
Here we calculate the diagonal Hessian elements for
the initial Hessian approximation. This is easily
calculated because we use analytical first derivative
information from a differentiation package (JAKEF, Argonne National Laboratory). For this problem, we
started with a flat profile for the control variable u
= 1.0 and equally spaced integration lengths ($\Delta\zeta_i$
= 1/7) for seven finite elements. By preconditioning
the Hessian, we achieved a solution in nine iterations
for a Kuhn-Tucker error of 1E-6. Figure 9 shows
the control profile and Fig. 10 shows the state variable
profiles.
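The preprocessing of the initial Hessian can be illustrated with a short sketch. The Python/NumPy example below perturbs each decision variable in turn and differences the reduced gradient to estimate the diagonal of the reduced Hessian. The function name `diagonal_hessian_estimate`, the forward-difference step, and the safeguard that takes absolute values and applies a small floor (so that the starting matrix for a BFGS update is positive definite) are all assumptions made for illustration; the paper does not give these details.

```python
import numpy as np

def diagonal_hessian_estimate(reduced_gradient, e, step=1e-6, floor=1e-8):
    """Estimate a diagonal initial Hessian by perturbing the reduced gradient.

    reduced_gradient : callable returning the (analytical) reduced gradient at e
    e                : (n,) current decision variables
    Returns an (n, n) diagonal matrix with positive diagonal entries.
    """
    e = np.asarray(e, dtype=float)
    g0 = reduced_gradient(e)
    diag = np.empty_like(e)
    for i in range(e.size):
        e_pert = e.copy()
        e_pert[i] += step
        # Forward difference of the i-th gradient component along variable i.
        diag[i] = (reduced_gradient(e_pert)[i] - g0[i]) / step
    # Safeguard (assumption): keep the starting matrix positive definite.
    return np.diag(np.maximum(np.abs(diag), floor))
```

The estimate costs one extra gradient evaluation per decision variable, which is modest when the number of controls and element lengths is small relative to the number of states.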
Next we consider a problem that is nonlinear in the
state variables. To solve this problem, a Newton-Raphson solution of the collocation equations is performed in step 1.1 of our algorithm.

Fig. 9. Control profile for example 2.
Example problem 3: nonlinear in states
For problems that are nonlinear in the states, we
have a choice for the solution method in that we can
either converge the equality constraints for each set of
control variables (feasible-path method) or we can
simultaneously optimize the control variables and
converge the equality constraints at the solution. In
particular, for the feasible-path approach we modify
the algorithm of Section 3 by executing step 1.1 until
the collocation equations for that particular element
are converged. For the simultaneous approach, on the
other hand, (QP2) and the OPT algorithm in step 5.0
must be modified appropriately to reflect the fact that $h_i$ is
not zero.
In this study we develop and evaluate the feasible-path approach. Due to the theoretical complexities of
the simultaneous approach, as well as space limitations, we refer the reader to Logsdon (1990) for details
of this approach. With the feasible-path approach, the
interior states can be calculated for a set of control
variables within each element, as in the previous
section. Then the state information is passed on to the
next element through state profile continuity equations. The state variables along the solution trajectory
can be eliminated along with the collocation equations, as discussed in the previous section. However,
the interior-state information still needs to be supplied for upper and lower bounds on the state profiles,
as well as any other inequalities involving state
variables. Also, the derivative information must be
chainruled to obtain the sensitivity of intermediate
and final states to the control variables. This approach is best applied to problems which have a large
number of states and few control variables, and
relatively few state variable constraints.

Fig. 10. State variable profiles for example 2.
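The feasible-path modification of step 1.1 can be sketched for a single element and a scalar state. The Python/NumPy example below is an illustration under assumptions: the names `collocation_tables` and `newton_element` are hypothetical, the model $\dot z = f(z)$ and its derivative are supplied by the caller, and the interior states are initialized at the element's initial condition. Newton-Raphson iterations are repeated until the collocation residuals of the element are converged, after which the continuity equation gives the end state.

```python
import numpy as np

def collocation_tables(K):
    """Return tau_j, phi_j(1) and dphi_j/dtau at tau_k (shifted Legendre roots)."""
    nodes, _ = np.polynomial.legendre.leggauss(K)
    tau = np.concatenate(([0.0], np.sort(0.5 * (nodes + 1.0))))
    phi_at_1 = np.empty(K + 1)
    dphi = np.empty((K + 1, K))
    for j in range(K + 1):
        others = np.delete(tau, j)
        coeffs = np.poly(others) / np.prod(tau[j] - others)
        phi_at_1[j] = np.polyval(coeffs, 1.0)
        dphi[j, :] = np.polyval(np.polyder(coeffs), tau[1:])
    return tau, phi_at_1, dphi

def newton_element(f, dfdz, z0, dzeta, K=2, tol=1e-10, max_iter=50):
    """Converge the collocation equations of one element for a scalar ODE.

    Returns the state at the end of the element and the number of Newton steps.
    """
    _, phi_at_1, dphi = collocation_tables(K)
    D = dphi[1:, :].T                 # D[k, j] = dphi_j/dtau at tau_k, j >= 1
    z = np.full(K, float(z0))         # initial guess: flat profile at z0
    steps = 0
    while steps < max_iter:
        residual = dphi[0, :] * z0 + D @ z - dzeta * f(z)
        if np.max(np.abs(residual)) < tol:
            break
        jacobian = D - dzeta * np.diag(dfdz(z))
        z = z - np.linalg.solve(jacobian, residual)
        steps += 1
    z_end = phi_at_1[0] * z0 + phi_at_1[1:] @ z
    return z_end, steps
```

For example, `newton_element(lambda z: -z**2, lambda z: -2*z, 1.0, 0.1)` integrates a second-order decay over one element of length 0.1; the exact value at the end of the element is 1/1.1, which gives a convenient accuracy check for two-point collocation.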
To illustrate this feasible-path or "Newton-Raphson" approach we consider a nonlinear example
problem (Ray, 1981) which has an index-one solution
with nonlinear states and controls. Renfro (1986)
solved this problem by using piecewise-constant
controls and by scaling the problem to avoid numerical
difficulties. We do not require this restriction for the
solution of this problem.
The problem is a batch reactor with temperature as
the control variable, and it is desired to maximize one
of the products after a fixed reaction time. Here we
consider the following reaction:

$$
A \xrightarrow{\;k_1\;} B \xrightarrow{\;k_2\;} C.
$$

The problem is nonlinear in the rate equations for the
concentration of A. Letting $c_1$ and $c_2$ represent the
concentration of A and B, respectively, the optimal
control problem is:

$$
\begin{aligned}
\max\quad & [\,c_2(1.0)\,] \\
\text{such that}\quad & \frac{dc_1}{dt} = -k_1(T)\,c_1^2 \\
& \frac{dc_2}{dt} = k_1(T)\,c_1^2 - k_2(T)\,c_2 \\
& k_i(T) = A_{i0} \exp[-E_i/RT], \qquad i = 1, 2 \\
& c_1(0) = 1.0, \qquad c_2(0) = 0 \\
& 298 \le T \le 398 .
\end{aligned}
$$

Since the solution is known to be equivalent to an
index-one DAE system, two-point collocation should
be adequate for the solution accuracy. The results of
this problem using the null and range space approach
(Vasantharajan and Biegler, 1988) were reported
earlier (Logsdon and Biegler, 1989). By using the
Newton-Raphson approach, preprocessing the
Hessian, and directly enforcing the residual constraints on the integration error, we accelerated the
convergence from the previously reported 88 iterations to 22 iterations. For the Newton-Raphson
solution, initially 34 iterations were required for each
element to achieve a feasible point and then 2-3
iterations were required to converge the state variables for subsequent control variable movement from
OPT. Here we started with an initial-temperature
profile of 300 and the final-control profile is shown in
Fig. 11, with the state variable profiles shown in
Fig. 12.

Fig. 11. Control profiles for nonlinear states example.
Fig. 12. State profiles for nonlinear states example.

Example problem 4: larger systems
This last example poses a severe test for any NLP-based approach to optimal-control problems. In particular, we demonstrate how our decomposition
approach successfully tackles an NLP formulation with
several thousand variables and several hundred degrees of freedom.
Here we consider a linear system investigated by
Nishida et al. (1976), Jacobson (1968), and Plant and
Athans (1966). The problem description is to move
from an initial position of $x_i(t_0) = 10$ (i = 1,2,3,4) to
a position at a final time inside a unit sphere located at
the origin. The objective is to minimize the final
position, and the optimal-control problem is subject to:

$$
\begin{aligned}
& \dot x_1 = -0.5\,x_1 + 5\,x_2 \\
& \dot x_2 = -5\,x_1 - 0.5\,x_2 + u \\
& \dot x_3 = -0.6\,x_3 + 10\,x_4 \\
& \dot x_4 = -10\,x_3 - 0.6\,x_4 + u \\
& |u| \le 1 \\
& x_i(t_0) = 10, \qquad i = 1,2,3,4 \\
& x_i(t_f) \le 1.0, \qquad i = 1,2,3,4 \\
& t_f = 4.2 .
\end{aligned}
$$
Both of the above studies adopted approximate
methods of solving this problem and obtained suboptimal solutions. Nishida et al. (1976), in particular,
developed fast, heuristic methods tailored to the solution of simple linear problems of this type. Since our
general purpose approach is not tailored to the highly
oscillatory nature of this problem, this is a severe test
for our algorithm.
Nishida et al. (1976) reported the suboptimal-control profile and switching times shown in Table 1. In
their solution, the step size (element length) used for
the integration was 0.0005, which resulted in a reported objective function value of 0.9952 (Nishida et
al.). However, when the above profile was integrated
with LSODE (Hindmarsh, 1980), the objective function obtained was 1.0067. This earlier work overcame
the problem of the switching points by making the
step size small enough so that the switching times
could occur without having to adapt the step size.
However, for a step size of 0.0005 and a final time of 4.2, this would require 8400 finite elements using the NLP approach outlined above. Even using a parameterization approach with only one control variable within each finite element, this would require 16,800 decision variables for SQP.

Table 1. Literature control profile results

Switching time    Control variable
0.0               -1.0
0.1405             1.0
0.9205            -1.0
1.3745             1.0
2.1700            -1.0
2.6210             1.0
3.4345            -1.0
3.8740             1.0
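The re-integration of the literature profile can be sketched without an ODE library, because each coupled pair of states is a damped rotation that has a closed-form solution for constant u. The code below is only an illustration: the function names are hypothetical, the profile is the one listed in Table 1, and the sum of squared final states is an assumed objective, since the extracted problem statement does not show the objective explicitly.

```python
import math

def step_pair(x, y, u, a, w, dt):
    """Exact update over dt for x' = -a*x + w*y, y' = -w*x - a*y + u (u constant)."""
    d = a * a + w * w
    xs, ys = w * u / d, a * u / d          # steady state for this constant u
    ex, ey = x - xs, y - ys                # deviation decays and rotates
    decay = math.exp(-a * dt)
    c, s = math.cos(w * dt), math.sin(w * dt)
    return xs + decay * (c * ex + s * ey), ys + decay * (-s * ex + c * ey)

def final_state(switch_times, controls, t_final, x0=10.0):
    """Integrate the four states under a piecewise-constant (bang-bang) control."""
    x = [x0] * 4
    edges = list(switch_times) + [t_final]
    for k, u in enumerate(controls):
        dt = edges[k + 1] - edges[k]
        x[0], x[1] = step_pair(x[0], x[1], u, 0.5, 5.0, dt)
        x[2], x[3] = step_pair(x[2], x[3], u, 0.6, 10.0, dt)
    return x

# Literature profile from Table 1.
SWITCH_TIMES = [0.0, 0.1405, 0.9205, 1.3745, 2.1700, 2.6210, 3.4345, 3.8740]
CONTROLS = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]

x_end = final_state(SWITCH_TIMES, CONTROLS, t_final=4.2)
print("final state:", x_end)
print("sum of squares (assumed objective):", sum(v * v for v in x_end))
```

Because the update is exact for constant u, no step-size adaptation is needed around the switching times in this check; the price is that it only applies to linear systems of this particular form.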
To reduce the problem size, we need to be able to find the switching points with a smaller number of finite elements. We can accomplish this either by constraining a residual evaluated at a noncollocation point so as to enforce the integration accuracy, or by allowing suitably small element lengths to vary slightly between lower and upper bounds. The first approach requires the enforcement of inequality constraints for each of the differential equations for each of the finite elements. The number of these inequality constraints is proportional to the number of finite elements. This approach should have faster convergence, as demonstrated on the smaller problems presented earlier. The second approach has the advantage of requiring fewer inequality constraints in the QP but in our experience it seems to require more iterations.
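The idea of a residual evaluated at a noncollocation point can be sketched on a single element. The example below assumes a scalar test equation dz/dt = lam*z, two Gauss-Legendre collocation points on the normalized element [0, 1], and the end of the element (tau = 1) as the noncollocation point; these choices and all names are illustrative assumptions, not the settings used in this paper.

```python
import numpy as np
from numpy.polynomial import Polynomial

def lagrange_basis(nodes):
    """Lagrange basis polynomials l_j with l_j(nodes[i]) = 1 if i == j else 0."""
    basis = []
    for j, tj in enumerate(nodes):
        others = [t for i, t in enumerate(nodes) if i != j]
        p = Polynomial.fromroots(others)
        basis.append(p / float(p(tj)))
    return basis

def element_residual(lam, h, z0=1.0, tau_nc=1.0):
    """|dz/dt - lam*z| of the collocation polynomial at the point tau_nc."""
    g = np.sqrt(3.0) / 6.0
    nodes = [0.0, 0.5 - g, 0.5 + g]        # element start plus two Gauss points
    basis = lagrange_basis(nodes)
    dbasis = [b.deriv() for b in basis]
    K = len(nodes) - 1
    # Collocation equations: sum_j z_j l_j'(tau_i) = h*lam*z_i for i = 1..K
    M = np.array([[float(dbasis[j](nodes[i])) for j in range(1, K + 1)]
                  for i in range(1, K + 1)])
    M -= h * lam * np.eye(K)
    rhs = np.array([-z0 * float(dbasis[0](nodes[i])) for i in range(1, K + 1)])
    z = [z0] + list(np.linalg.solve(M, rhs))
    p = sum(float(zj) * b for zj, b in zip(z, basis))
    # Time derivative on the element is (1/h) d/dtau.
    return abs(float(p.deriv()(tau_nc)) / h - lam * float(p(tau_nc)))

for h in (0.4, 0.2, 0.1):
    print(h, element_residual(lam=-5.0, h=h))
```

The residual vanishes (to rounding) at the collocation points by construction, so only a noncollocation point carries information about the integration error; shrinking the element length h reduces it, which is what an inequality on this residual would force the optimizer to do.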
To determine the number of elements required,
we first solved the problem by using 140 elements
(time step of 0.03) and fixed the element length.
Again, we initialized
the Hessian by setting the
diagonal entries corresponding
to the control variables equal to zero. One can determine this initialization analytically by analyzing the second-derivative
information. Here the solution required 11 iterations
and 42 CPU minutes on a Vax 6320. This solution is
compared against the literature solution of Nishida et
al. in Table 2.
From the results in Table 2 we see that the control
profile is suboptimal
because one of the control
variables is not at the bounds and we have missed one
of the breakpoints or switching times. Therefore, we
set the lower and upper bounds on the integration
length between 0.0258 and 0.0342 in order to allow for
switching times. Here we obtained a solution using
this formulation in 77 iterations for a Kuhn-Tucker convergence of 1.0E-6. The solution time was approximately 5.5 h on a Vax 6320. The number of decision variables was 285: 140 integration steps, 140 control variables, and 5 final-state variables. We also considered 2236 state variables by using the decomposition technique. The comparison of the results
is presented in Table 3.
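The problem sizes quoted for this example follow from simple counting, which the short sketch below reproduces. The helper name is hypothetical, and it assumes one element length and one control variable per finite element, as described in the text.

```python
def formulation_size(t_final, n_elements, n_final_state_vars, n_states_eliminated):
    """Count variables when each element carries one length and one control."""
    step = t_final / n_elements
    n_decision = 2 * n_elements + n_final_state_vars
    n_total = n_decision + n_states_eliminated
    return step, n_decision, n_total

# Fine grid needed to resolve the switching times with a fixed step of 0.0005
n_fine = round(4.2 / 0.0005)
print(n_fine, 2 * n_fine)                      # 8400 16800

# 140-element formulation reported in the text
step, n_decision, n_total = formulation_size(4.2, 140, 5, 2236)
print(round(step, 4), n_decision, n_total)     # 0.03 285 2521

# Element-length bounds expressed relative to the nominal step
print(round(0.0258 / step, 2), round(0.0342 / step, 2))   # 0.86 1.14
```

The last line shows that the bounds of 0.0258 and 0.0342 let each element length move by about 14% around the nominal step of 0.03.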
The objective function values that we obtained for the various control profiles are:
(1) Nishida et al. (1976) control profile: 1.0067
(2) Fixed step length: 1.0078
(3) Variable step length: 1.00347.
Note that for this large problem, we obtain convergence simply by allowing the element length to vary between bounds. Let us now look at the state profiles to see why we need so many elements for this problem. Recall that the first two differential equations are coupled together and so are the last two. The profiles are shown in Figs 13 and 14. We can see that the profiles are oscillatory for each set of coupled differential equations and would require the location of the finite elements to be able to handle the various characteristics of the state trajectories. However, the control variable shows up in both sets of coupled differential equations and requires that we solve the four equations simultaneously. The solution trajectories are shown in Fig. 15, from which we can see the need for a large number of elements (or small integration time steps) in order to obtain an accurate solution. Thus, we see that this example presents a severe test for an optimization-based procedure. As seen in Table 4, the algorithm was able to tackle a large NLP with 2521 variables and 285 degrees of freedom.

Table 2. Comparison of control profiles

      Fixed step length                    Nishida et al.
Switching time   Control variable    Switching time   Control variable
0.0              -1.0                0.0              -1.0
0.120             1.0                0.1405            1.0
0.90             -1.0                0.9205           -1.0
1.38              1.0                1.3745            1.0
2.16             -1.0                2.1700           -1.0
2.61              1.0                2.6210            1.0
3.42              0.78
3.45             -1.0                3.4345           -1.0
3.87              1.0                3.8740            1.0

Table 3. Comparison of control profiles

      Variable step length                 Nishida et al.
Switching time   Control variable    Switching time   Control variable
0.0              -1.0                0.0              -1.0
0.11198           1.0                0.1405            1.0
0.89979          -1.0                0.9205           -1.0
1.36428           1.0                1.3745            1.0
2.16960          -1.0                2.1700           -1.0
2.62063           1.0                2.6210            1.0
3.43619          -1.0                3.4345           -1.0
3.87530           1.0                3.8740            1.0

Fig. 13. State profiles for x1 and x2 from optimal-control profile (compared with Nishida et al.).

Fig. 14. State profiles for x3 and x4 from optimal-control profile.

Fig. 15. State profiles (x1, x2, x3, x4) from optimal-control profile.
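The oscillatory character of the state profiles can be anticipated from the eigenvalues of the state matrix of this example. The sketch below computes them with NumPy and relates the oscillation periods to the nominal element length of 0.03; the helper name is hypothetical and the calculation is a back-of-the-envelope check, not part of the algorithm of this paper.

```python
import numpy as np

# State matrix of the fourth-order linear example (the control enters x2 and x4).
A = np.array([[-0.5,   5.0,   0.0,  0.0],
              [-5.0,  -0.5,   0.0,  0.0],
              [ 0.0,   0.0,  -0.6, 10.0],
              [ 0.0,   0.0, -10.0, -0.6]])

def oscillation_summary(A, t_final, n_elements):
    """Return (frequency, period, elements per period) for each oscillatory mode."""
    eig = np.linalg.eigvals(A)
    freqs = sorted({round(abs(ev.imag), 9) for ev in eig if abs(ev.imag) > 0})
    h = t_final / n_elements
    return [(w, 2 * np.pi / w, (2 * np.pi / w) / h) for w in freqs]

for w, period, per_period in oscillation_summary(A, t_final=4.2, n_elements=140):
    print(f"frequency {w:.1f} rad/time, period {period:.4f}, "
          f"{per_period:.1f} elements per period")
```

With 140 elements this gives roughly 42 elements per period for the slower pair of states and roughly 21 for the faster pair, which is consistent with the need for small integration steps noted above.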
In summary, we have demonstrated on four literature example problems how the state variable equality constraints can be eliminated from the optimization quadratic subproblem. The performance of the algorithm for the example problems is given in Table 4.
5. SUMMARY AND CONCLUSIONS
This paper presents a numerical method for obtaining optimal-control profiles which are useful for model-predictive control of chemical processes. Orthogonal collocation is used within an NLP framework in order to solve for the control profiles. Moreover, the NLP framework allows us to enforce state path constraints and control path constraints. Also, switching times and integration step lengths can be posed as optimization variables to obtain an accurate solution of the optimal-control profile. This work thus enhances previous optimization-based studies, in that we explore a decomposition technique in order to reduce the problem size and tailor the decomposition to the problem structure.

Table 4. Summary of example problems

Example                        Control     State       Finite     Iterations   CPU (s)     Kuhn-Tucker
                               variables   variables   elements                            error
Car problem                    7                                  12           18.08       1E-5
Linear batch                   21                                 9            29.28       1E-6
Nonlinear                      18          36          2          20           34.87       1E-8
Large linear (fixed step)      145         2236        140        11           2530.57†    1E-6
Large linear (variable step)   285         2236        140        77           5.5‡        1E-6

† Time is for VAX 6320. Other times are for Micro Vax 3200.
‡ Time is in hours on VAX 6320.
In this study we exploit the structure of the collocation matrix in order to eliminate the state variable
equality constraints. Concurrent with the elimination
of the state variables, we also construct the sensitivity
information
(reduced-gradient
information)
of the
state variables with respect to the control variables.
Thus, we construct a reduced-gradient
method by
processing the state information forward using the
algorithm developed above.
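The reduced-gradient construction can be sketched for a generic linearized problem. The example below assumes state equations of the form C z + N u = b with C nonsingular and a simple quadratic objective; the symbols C, N, b and the function names are illustrative assumptions, and a dense solve stands in for the element-by-element processing that exploits the collocation structure.

```python
import numpy as np

def solve_states(C, N, b, u):
    """Eliminate the states: z(u) from the linear state equations C z + N u = b."""
    return np.linalg.solve(C, b - N @ u)

def reduced_gradient(C, N, grad_z, grad_u):
    """Gradient of f(z(u), u) with respect to u, plus the sensitivities dz/du."""
    dz_du = -np.linalg.solve(C, N)            # state sensitivities
    return grad_u + dz_du.T @ grad_z, dz_du

def objective(z, u):
    """Assumed test objective: 0.5*||z||^2 + 0.5*||u||^2."""
    return 0.5 * z @ z + 0.5 * u @ u

# Placeholder data (assumed for illustration only).
rng = np.random.default_rng(1)
n_z, n_u = 6, 2
C = np.eye(n_z) + 0.1 * rng.standard_normal((n_z, n_z))
N = rng.standard_normal((n_z, n_u))
b = rng.standard_normal(n_z)
u = rng.standard_normal(n_u)

z = solve_states(C, N, b, u)
# For this objective, df/dz = z and df/du = u.
g_red, dz_du = reduced_gradient(C, N, grad_z=z, grad_u=u)
print("reduced gradient:", g_red)
```

The optimizer then only sees the n_u control variables; the n_z states never enter the quadratic subproblem, which is the point of the elimination.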
Four example problems from the literature were
considered in this paper. The first is a small linear
example, the car problem (Cuthrell and Biegler, 1987),
which demonstrates the structure of the collocation
matrix. The second problem is a batch reactor problem (linear in state variables) in which we preprocessed the initial Hessian in order to speed up the convergence. Next we solved a small nonlinear batch reactor problem by using a Newton-Raphson method to converge the equality state constraints and eliminate these constraints from the quadratic program. Again, we preprocessed the initial Hessian in order to speed up the convergence. For processes which can be
modelled using simplified systems, direct enforcement
of the integration
error is useful for constructing
accurate solutions. However,
if a large number of
elements is required, then the user has to make some
simplifying assumptions in order to hold the QP to a
reasonable size.
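The Newton-Raphson convergence of the state equations for fixed controls can be sketched on a second-order decay, dc/dt = -k c^2, of the kind that appears in the nonlinear batch reactor example. For brevity the sketch uses an implicit Euler step in each element as a stand-in for the collocation equations, with an assumed constant rate k = 1; the function names and numerical settings are hypothetical.

```python
def newton_element(c_prev, k, h, tol=1e-12, max_iter=50):
    """Solve c - c_prev + h*k*c**2 = 0 for the end-of-element state c."""
    c = c_prev                      # initial guess: state passed from the previous element
    for n_updates in range(max_iter):
        residual = c - c_prev + h * k * c * c
        if abs(residual) < tol:
            return c, n_updates
        c -= residual / (1.0 + 2.0 * h * k * c)   # Newton-Raphson update
    raise RuntimeError("Newton iteration did not converge")

def integrate(c0, k, t_final, n_elements):
    """Converge each element in turn and pass the state forward."""
    h = t_final / n_elements
    c, total_updates = c0, 0
    for _ in range(n_elements):
        c, n_updates = newton_element(c, k, h)
        total_updates += n_updates
    return c, total_updates

c_end, total_updates = integrate(c0=1.0, k=1.0, t_final=1.0, n_elements=1000)
exact = 1.0 / (1.0 + 1.0 * 1.0 * 1.0)   # analytic solution c0 / (1 + k*c0*t)
print(c_end, exact, total_updates / 1000)
```

Since the states are converged inside each element before the control variables are updated, the equality constraints never appear in the quadratic program; the cost is the inner iteration count per element.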
To illustrate problems with larger NLPs, we considered a problem which requires a large number
of elements because the differential equations were
tightly coupled and the solution has an oscillatory
nature. It represented a severe test of our approach.
This problem was solved by Jacobson (1968) using a
dynamic programming
approach and by Nishida et
al. (1976) using a piecewise-maximization
approach.
Based on solution strategies tailored to this model, they obtained good approximate (but suboptimal) solutions because the system is linear and has only final-state condition inequality constraints. This study
shows how these DAE systems can be handled by
collocation on finite elements. Here the DAE system
requires a large number of finite elements due to
stability and error concerns for the state differential
equations. The resulting formulation had over 2500
variables and almost 300 degrees of freedom.
This NLP
approach, therefore, requires effective
storage and processing of the state information. Here
we can eliminate the state variables by using a reduced-gradient
approach. For nonlinear problems,
the solution of the state variables within the finite
elements requires solving the linearized equality constraints until convergence is achieved. In this way,
optimal-control
problems are solved by eliminating
the state variables from the NLP. This allows us to
use the NLP approach to solve much larger problems than with previous studies, even where a general-purpose decomposition approach is applied (Eaton et al., 1989; Patwardhan et al., 1990; Logsdon and Biegler, 1989). In addition, the determination of
switching times requires the element lengths to adapt
in order to construct accurate profiles. Using the
algorithm presented in this paper, one can obtain
solutions to optimal-control
problems for systems
described by linear state differential equations in a
straightforward
manner. For nonlinear state differential equations, several issues remain regarding a
simultaneous approach versus the Newton-Raphson
approach. These are discussed further in Logsdon
(1990) and are also the topic of a future study.
Finally, note that the number of control variables and the sets of state variable inequalities increase linearly with the number of finite elements. Therefore,
future work needs to focus on dealing with dynamic
optimization problems that require large numbers of
finite elements. An illustration of this was given in
Example 4. Even with our decomposition
approach,
we run into limitations because the number of degrees
of freedom (and the size of the reduced Hessian matrix) still becomes large. A promising alternative for
such problems
was recently proposed
by Wright
(1989, 1991). Here optimality conditions are grouped
into a large banded matrix and can be handled by
efficient band solvers implemented on parallel processors. State variable inequalities are treated by augmenting this banded system with barrier (or penalty) terms and the resulting problem is solved via interior-point methods. This approach has a number of theoretical advantages. Further numerical evaluation as well as application to complex process problems is still required, however.
Acknowledgements - Financial support from the Engineering Design Research Center, an NSF-supported Engineering Center at Carnegie-Mellon University, is gratefully acknowledged.
REFERENCES
Biegler, L. T., 1984, Solution of dynamic optimization problems by successive quadratic programming and orthogonal collocation. Comput. Chem. Engng 8, 243-248.
Biegler, L. T. and Cuthrell, J. E., 1985, Improved infeasible path optimization for sequential modular simulators--II. The optimization algorithm. Comput. Chem. Engng 9, 257-267.
Cuthrell, J. E. and Biegler, L. T., 1987, On the optimization of differential-algebraic process systems. A.I.Ch.E. J. 33, 1257-1270.
Cuthrell, J. E. and Biegler, L. T., 1989, Simultaneous optimization and solution methods for batch reactor control profiles. Comput. Chem. Engng 13, 49-62.
Cutler, C. R. and Ramaker, B. L., 1979, Dynamic matrix control-computer control algorithm. AIChE National Meeting, Houston, TX.
Eaton, J. W., Rawlings, J. B. and Edgar, T. F., 1989, Model predictive control and sensitivity analysis for constrained nonlinear processes, in Proceedings of IFAC Workshop on Model Based Control, pp. 129-134. Pergamon Press, Oxford.
Hindmarsh, A. C., 1980, LSODE and LSODI, two new initial value ordinary differential equation solvers. ACM-SIGNUM Newsletter 15, 10-11.
Jacobson, D. H., 1968, Differential dynamic programming methods for solving bang-bang control problems. IEEE Trans. Autom. Control AC-13, 661.
Li, W. C. and Biegler, L. T., 1989, Multistep, Newton-type control strategies for constrained nonlinear processes. Chem. Engng Res. Des. 67, 562.
Liu, D. C. and Nocedal, J., 1988, On the limited memory BFGS method for large scale optimization. Technical Report NAM 03, Northwestern University.
Logsdon, J. S., 1990, Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh.
Logsdon, J. S. and Biegler, L. T., 1989, Accurate solution of differential-algebraic optimization problems. Ind. Engng Chem. Res. 28, 1628-1639.
Logsdon, J. S., Diwekar, U. M. and Biegler, L. T., 1990, On the simultaneous optimal design and operation of batch distillation columns. Chem. Engng Res. Des. 11, 683.
Nishida, N., Liu, Y. A., Lapidus, L. and Hiratsuka, S., 1976, An effective computational algorithm for suboptimal singular and/or bang-bang control. A.I.Ch.E. J. 22, 505-523.
Patwardhan, A. A., Rawlings, J. B. and Edgar, T. F., 1990, Nonlinear model predictive control. Chem. Engng Commun. 87, 123-141.
Plant, J. B. and Athans, M., 1966, An iterative technique for the computation of time optimal controls. Proceedings of the 3rd International IFAC Conference, London, England, June 1966.
Prett, D. M. and Garcia, C. E., 1988, Fundamental Process Control. Butterworths, Stoneham, MA.
Ray, W. H., 1981, Advanced Process Control. McGraw-Hill, New York.
Renfro, J. G., 1986, Ph.D. Thesis, University of Houston, TX.
Renfro, J. G., Morshedi, A. M. and Asbjornsen, O. A., 1987, Simultaneous optimization and solution of systems described by differential/algebraic equations. Comput. Chem. Engng 11, 503-517.
Shanno, D. F. and Phua, K. H., 1978, Matrix conditioning and nonlinear optimization. Math. Prog. 14, 149-160.
Vasantharajan, S. and Biegler, L. T., 1988, Large-scale decomposition for successive quadratic programming. Comput. Chem. Engng 12, 1089.
Vasantharajan, S., Viswanathan, J. and Biegler, L. T., 1990, Large-scale implementation of reduced successive quadratic programming with smaller degrees of freedom. Comput. Chem. Engng 14, 907-917.
Villadsen, J. and Michelsen, M. L., 1978, Solution of Differential Equation Models by Polynomial Approximation. Prentice-Hall, Englewood Cliffs, NJ.
Wright, S., 1989, Solution of discrete time optimal control problems on parallel computers. Preprint MCS-P89-0789, Argonne National Laboratory, Argonne, IL.
Wright, S., 1991, Interior point methods for optimal control of discrete time systems. Preprint MCS-P226-0491, Argonne National Laboratory, Argonne, IL.