PARAMETRIC DYNAMIC PROGRAMMING FOR DISCRETE EVENT SYSTEMS
*Juan Cardillo, *Ferenc Szigeti, **Jean-Louis Calvet, ***Jean-Claude Hennet
*Universidad de Los Andes, Facultad de Ingeniería, Escuela de Sistemas, Departamento de Sistemas de
Control, Mérida, Venezuela, email: [email protected], [email protected].
**LAAS-CNRS, Groupe MAC, Toulouse, France, email: [email protected].
***LAAS-CNRS, Groupe OCSD, Toulouse, France, email: [email protected].
Abstract: This paper describes an approach based on formal calculus to optimize
trajectories described by a succession of discrete states, by combining the
dynamic programming technique with the formal approach presented in [Cardillo et
al., 2001]. This approach makes it possible to obtain an explicit form of the optimal
control sequence, based on formal calculus. It also makes it possible to introduce
parameters in the system model as well as in the cost function. The control law is then
expressed as an explicit function of these parameters. Copyright © 2002 IFAC
Keywords: Formal Calculus, Dynamic Programming, Discrete Optimization, Discrete
Event Systems.
1. INTRODUCTION
The main motivation of this work is to find a
trajectory of set-points for a particular family of
discrete event dynamical systems subject to control
inputs and external parameters of evolution. The
proposed approach is based on formal calculus. It
uses the dynamic programming technique together
with the symbolic algorithm proposed in [Cardillo et
al., 2001] to generate the explicit (symbolic-parametric)
expressions of the optimal control sequence. Dynamic
programming is combined with a direct
decomposition method to replace the optimization of
a function of several variables by the recursive
resolution of single-variable optimization problems.
The algorithm in [Cardillo et al., 2001], called
SCDO (Symbolic Computation for Discrete
Optimization), iteratively obtains the formal optimal
solution of a minimization problem over a finite set of
Boolean variables. At each stage of the dynamic
programming method, the SCDO is used to obtain the
explicit (symbolic) expression of the optimal control
sequence and, consequently, the trajectory that
describes the optimal succession of discrete states
from the currently evaluated state. The optimal
sequence of controls is obtained at the final stage of
the algorithm, which corresponds to the initial state of
the system. This paper is structured as follows.
Section 2 presents the definitions used to describe the
system dynamics through finite state transitions. This
section also formulates the optimization problem.
Section 3 proposes a generalization of the classical
dynamic programming method, in which parameters
are incorporated in the cost function. Section 4
presents a formal symbolic resolution of the proposed
problem. An illustrative example is described in
Section 5, before the conclusion in Section 6.
2. PROBLEM STATEMENT
2.1 Definitions
State:
Intuitively, the state may be regarded as a kind of
information storage or memory, an accumulation of
past events. We must, of course, require the set of
internal states of a system to be sufficiently rich to
carry all the information on the history of the system
needed to predict the effect of the past upon the
future. We do not insist, however, on the minimality
of the state in carrying such information, although
minimality is often a convenient simplifying assumption.
Decisions:
Depending upon the application, one can talk of
decisions or choices. Decisions generally apply to state
transitions.
Parameters:
Parameters are uncertain quantities, generally present
in expressions to represent a family of possibilities.
Objective function:
An objective function is defined in terms of state and
decision variables over a finite time-horizon of n+1
periods, as follows:

I(x0, u0, ..., un-1, p) = Sum_{i=0}^{n-1} Ci(xi, ui, p) + Cn(xn, p).   (2.1)

The optimal decision problem is to select the feasible
control sequence u0, ..., un-1 such that the cost
function I(x0, u0, ..., un-1, p) takes its minimal
value, denoted I*(x0, p), with:

I*(x0, p) = min_{u0,...,un-1} I(x0, u0, ..., un-1, p).   (2.2)
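A minimal numerical sketch of the cost (2.1) and the brute-force minimization (2.2) follows; the step function, stage costs, control set and horizon are illustrative assumptions, not taken from the paper:

```python
from itertools import product

def total_cost(x0, controls, p, step, stage_cost, final_cost):
    # I(x0, u0..un-1, p) = sum_i Ci(xi, ui, p) + Cn(xn, p)   (2.1)
    x, cost = x0, 0.0
    for u in controls:
        cost += stage_cost(x, u, p)
        x = step(x, u, p)
    return cost + final_cost(x, p)

# Illustrative system: integer states, two controls per stage.
step = lambda x, u, p: x + u             # next-state function (assumed)
stage_cost = lambda x, u, p: p * abs(u)  # Ci, parametrized by p
final_cost = lambda x, p: (x - 3) ** 2   # Cn: penalize missing set-point 3

def best_sequence(x0, p, n=3, U=(-1, 1)):
    # I*(x0, p): enumerate all feasible control sequences   (2.2)
    return min(product(U, repeat=n),
               key=lambda us: total_cost(x0, us, p, step,
                                         stage_cost, final_cost))

print(best_sequence(0, 0.1))   # (1, 1, 1): drive the state to 3
```

Enumeration is exponential in n; the dynamic programming recursion of Section 3 replaces it by stage-wise single-variable minimizations.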
Consider the graph of figure 1. It describes a
sequence of discrete states, xk, which belong to a set
X, under the action of controls uk in U. In addition,
costs are associated with the transitions. One of
the basic characteristics of the considered graphs is
that the set of transitions can be perfectly partitioned
into stages. This property makes it possible to clearly
differentiate present states, with their applicable
controls, from future states. Many discrete event
processes can be described by such a graph, either
directly or through some technique such as the one
presented in [Gimenez 1989].

Fig. 1. Trajectories described by a succession of
discrete states.

The dynamical system defined by the graph of Fig. 1
is specified by the tuple

G = (U, Y, X, P, phi(.,.), eta(.,.)),   (2.3)

where:
U := {U01, ..., U0m0, ..., Un-1,mn-1} is the non-empty
set of inputs. The number of possible controls at stage
i is denoted mi. An input or a sequence of inputs
represents the action which is taken to apply a
decision, based on an evaluation and a judgment.
Y := {Y1, ..., Yn-1} is the non-empty set of outputs.
X := {X01, ..., X0j0, ..., Xn,jn} is the non-empty set of
possible states of the system.
P is the set of possible values of the vector of
parameters p.
phi : X x U x P -> X is the next-state function. It is
proposed to characterize the transition function by
Lagrange interpolation over X and U, in the
following polynomial form:

phi_{i+1}(xi, ui) = Sum_j Xi+1,j  Prod_{k<>j} (xi - Xik)/(Xij - Xik)  Prod_{l<>j} (ui - Uil)/(Uij - Uil),   (2.4)

for i = 0, ..., n-1, and
eta : X x U x P -> Y characterizes the next output.
If the state is fully observed, then
eta_i(.,.) = xi, for i = 0, ..., n-1.

The optimization problem can then be formulated as
follows:

Minimize over u0, u1, ..., un-1 the cost I(x0, u0, ..., un-1, p),
subject to:

xk+1 = phi_k(xk, uk, p), for k = 0, 1, ..., n-1.   (2.5)
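The Lagrange interpolation (2.4) can be built with a computer algebra system; below is a sketch over an illustrative grid of two states and two controls (the transition table used here is an assumption, chosen only to keep the polynomial small):

```python
import sympy as sp

x, u = sp.symbols('x u')

def lagrange_next_state(X_now, U_now, table):
    # phi(x, u) = sum_j Xnext_j * prod_{k != j}(x - X_k)/(X_j - X_k)
    #                           * prod_{l != j}(u - U_l)/(U_j - U_l)   (2.4)
    phi = 0
    for (Xj, Uj), Xnext in table.items():
        term = sp.Integer(Xnext)
        for Xk in X_now:
            if Xk != Xj:
                term *= (x - Xk) / (Xj - Xk)
        for Ul in U_now:
            if Ul != Uj:
                term *= (u - Ul) / (Uj - Ul)
        phi += term
    return sp.expand(phi)

# Illustrative transition table: states {0, 1}, controls {-1, 1}.
table = {(0, -1): 0, (0, 1): 1, (1, -1): 0, (1, 1): 1}
phi = lagrange_next_state([0, 1], [-1, 1], table)

# The polynomial reproduces the table on every grid point:
assert all(phi.subs({x: xi, u: ui}) == xn for (xi, ui), xn in table.items())
print(phi)   # u/2 + 1/2 for this table
```

Off the grid the polynomial takes intermediate values; only its values on the state/control grid are meaningful for the discrete event model.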
3. A DYNAMIC PROGRAMMING APPROACH
According to the formulation of the optimization
problem (2.5), dynamic programming provides a
multistage optimization method through resolution of
the functional equation (Bellman 1957):
Jk(xk, p) = min_{uk} { Ck(xk, uk, p) + Jk+1(phi_k(xk, uk, p), p) },   (3.1)

for k = n-1 down to 0, with the terminal condition:

Jn(xn, p) = Cn(xn, p).   (3.2)
Classically, the last stage optimal control and costs
given by (3.2) are first evaluated and the dynamic
programming algorithm is applied backwards, using
the optimality equation (3.1). However, in the
proposed parametric approach, the evaluation
performed at each stage k and for each possible state
xk, provides formal expressions of the current optimal
cost-to-go, Jk(x k,p) , and of the current optimal
control, u*k, depending on the current state and on the
parameters.
This method yields an optimal feedback control,
u*k = uk(xk), and the optimal value of the
objective function:

I*(x0, p) = J0(x0, p).   (3.3)
Due to the discrete nature of the state and decision sets,
the resolution of functional equation (3.1) amounts to an
implicit enumeration and evaluation of all the paths
associated with possible trajectories. A possible way of
presenting the results is then by means of a table
describing the optimal control in each state.
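A sketch of the backward recursion (3.1)-(3.2) that keeps the parameter p symbolic, so that each cost-to-go remains an explicit function of p; the two-stage graph, transitions and costs below are illustrative assumptions:

```python
import sympy as sp

p = sp.symbols('p', positive=True)

# Illustrative two-stage graph: stage 0 state 'a', stage 1 states 'b', 'c',
# terminal state 'd'. trans encodes phi_k, cost encodes C_k(x, u, p).
trans = {('a', 0): 'b', ('a', 1): 'c', ('b', 0): 'd', ('c', 0): 'd'}
cost = {('a', 0): 3, ('a', 1): 2 * p, ('b', 0): 3, ('c', 0): p}
controls = {'a': [0, 1], 'b': [0], 'c': [0]}
terminal = {'d': sp.Integer(0)}          # J_n(x_n, p) = C_n(x_n, p)   (3.2)

def backward(stages):
    J = dict(terminal)
    for states_k in reversed(stages):    # k = n-1, ..., 0
        for xk in states_k:
            # J_k(x_k, p) = min_u { C_k + J_{k+1}(phi_k(x_k, u, p)) }   (3.1)
            J[xk] = sp.Min(*[cost[(xk, u)] + J[trans[(xk, u)]]
                             for u in controls[xk]])
    return J

J = backward([['a'], ['b', 'c']])
print(J['a'])              # stays parametric: the Min of 6 and 3*p
print(J['a'].subs(p, 1))   # 3: the p-dependent branch is optimal
print(J['a'].subs(p, 4))   # 6: the fixed-cost branch is optimal
```

Substituting a value of p collapses the symbolic Min, which is exactly the "directly applicable to any feasible consign" property claimed for the parametric approach.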
4. A SYMBOLIC SOLUTION
The objective is to obtain a formal expression of the
optimal control sequence. The main motivation for
such an approach is that many optimization problems
cannot be represented by a unique numerical
formulation. In such problems, the model,
the control sequence or the cost functional contains
parameters in its expression. These parameters govern
the behavior of the system and the solution of the
problem. In spite of their advantages in terms of the
information contained in their expression, parametric
models are not much used in optimization, because
classical numerical calculus cannot easily extract
information from parametric expressions.
4.1 Preliminaries

Lemma 1: [Cardillo et al., 2001]
Let us consider a Boolean function f : {-1,1} -> R.
The minimum value of f is obtained at

u* = Sign(f(-1) - f(1)),   (4.1)

where the Sign function is used in the following
sense:

Sign(t) = 1 if t > 0, -1 if t < 0,   (4.2)

and, for t = 0, Sign(0) = {-1,1} is set-valued. In the latter
case, function f achieves its optimum (minimum or
maximum) at both points of the domain.

Lemma 2: [Cardillo et al., 2001]
Consider the set {0, 1, ..., n} and {-1,1}^(k+1). If k is
the smallest integer such that n < 2^(k+1), then the
mapping

psi : {0, 1, ..., n} -> {-1,1}^(k+1),   (4.3)

is uniquely defined by its inverse:

u = chi(u~0, ..., u~k) = (1 + u~0)/2 + ... + 2^(k-1)(1 + u~k).   (4.4)

Application of lemma 2 to the case n = Card(Dg) makes it
possible to transform a function g(x) : Dg -> R into a
function phi(u0, ..., uk) : {-1,1}^(k+1) -> R under the
condition 2^k - 1 < Card(Dg) < 2^(k+1). Function
phi(u0, ..., uk) is then uniquely defined by phi = g o chi.
Using an extension of g, it can be supposed without
restriction that the domain of phi is {-1,1}^(k+1).

4.2 The symbolic optimization algorithm

These two lemmas will now be used to obtain an
explicit (symbolic) form of the expression of the
controller at each stage of the dynamic programming
algorithm for the model described in section 2.

Single binary choice. Consider the graph of figure 2,
in which state Xk leads to Xk+1,1 under control Uk1
with cost Ck1(p), and to Xk+1,2 under control Uk2
with cost Ck2(p).

Fig. 2. Graph of the single decision case.

The transition table for the state xk = Xk is as follows.

uk     u~    xk+1      J*k(xk, p)
Uk1    -1    Xk+1,1    J*k,1(p)
Uk2     1    Xk+1,2    J*k,2(p)

Table 1. Transition table, state xk = Xk.

Applying lemma 1, the symbolic expression of the
optimal control is

u~* = -Sign(-J*k,1(p) + J*k,2(p)),   (4.5)

where J*k,i(p) = min(Ck,i + J*(Xk+1,i, p)).

General case. Consider the graph of figure 3, in which
state Xk leads to the states Xk+1,1, ..., Xk+1,m under
the controls Uk1, ..., Ukm with costs Ck1(p), ..., Ckm(p).

Fig. 3. Graph of the general case.

To apply the formal algorithm in its simplest form,
the previous graph is decomposed into a sequence of
binary choices, as shown on Fig. 4: at each step, one
transition is compared with the aggregate of the
remaining ones, represented by the intermediate states
X~k+1,2, ..., X~k+1,m-1.

Fig. 4. Decomposition of the graph - general case.

The transition table associated with the state xk = Xk is

uk      u~    xk+1       J*k(xk, p)
Uk1     -1    Xk+1,1     J*k,1(p)
U~k2     1    X~k+1,2    min(J*k,2(p), ..., J*k,m(p))

Table 2. Transition table, state xk = Xk.

The corresponding optimal control is

u~* = -Sign(-J*k,1(p) + min(J*k,2(p), ..., J*k,m(p))).   (4.6)

The transition table associated with the state
x~k+1 = X~k+1,2 is

uk      u~    x~k+1      J*k(xk, p)
Uk2     -1    Xk+1,2     J*k,2(p)
U~k3     1    X~k+1,3    min(J*k,3(p), ..., J*k,m(p))

Table 3. Transition table, state x~k+1 = X~k+1,2.

The optimal control associated with this table is

u~* = -Sign(-J*k,2(p) + min(J*k,3(p), ..., J*k,m(p))).   (4.7)

Thus, the computation is iterated until obtaining the
transition table associated with the state
x~k+1 = X~k+1,m-1:

uk        u~    x~k+1       J*k(xk, p)
Uk,m-1    -1    Xk+1,m-1    J*k,m-1(p)
Uk,m       1    Xk+1,m      J*k,m(p)

Table 4. Transition table, state x~k+1 = X~k+1,m-1.

The optimal control for this table is

u~* = -Sign(-J*k,m-1(p) + J*k,m(p)).   (4.8)

The optimal control in state xk = Xk for a particular
value of parameter p is then obtained from the
evaluation of the expressions of u~* at each stage:
1. If the value of u~* is 1, proceed to the next stage.
2. If the value of u~* is -1, the optimal control is the
corresponding one in the transition table.
3. If the value of u~* is zero, any of the two values in
the transition table is an admissible control. In this
case, two possible optimal trajectories may be
generated, and the inspection is continued.

5. EXAMPLE

This example describes a type of application for the
parametric dynamic programming approach.
The problem is to design a supervisor for a simplified
load distribution system in a compression train. The
objective is to drive the system to a load consign,
noted fd, with a compression train composed of 2
compression units. The treatment of this problem
leads to an explicit (symbolic) form of the
expression that governs the decision process, leading
to the achievement of this objective, with the load
as the control parameter.

Fig. 5. The compression train (input pressure
Pin ~ 40-45 psi, constant output pressure
Pout ~ 1480 psi, input flow fin, output flow
fout = fd, unit loads f1 <= F and f2 <= F).

It is assumed that both compression units have the
same characteristics. The automaton which describes
the behavior of a compression unit is taken from
[Calderón et al., 1999]. It is presented on figure 6.
Each unit possesses 7 states that identify the
behavior of the unit. The mission of the supervisor for
this type of system is to obtain a desired load flow,
through starting up or stopping the compressors.
Fig. 6. Automaton for the turbo-compressor unit
(states: Stopped not available, Stopped available,
Start up, Loads not limited, Hole, Loads limited,
Stop sequence; events include: to start up, to go idle,
to load, permissive to load, to go hole, surf, non surf,
to stop, fail or stop).
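As a sketch, the unit automaton of figure 6 can be encoded as a transition dictionary; the state names follow the figure, but the exact event attached to each arc is an assumption made for illustration:

```python
# Transition map for one compression unit (state, event) -> next state.
# State names are transcribed from figure 6; the pairing of events with
# arcs is assumed, not taken from the paper.
UNIT = {
    ('stopped_not_available', 'permissive'): 'stopped_available',
    ('stopped_available', 'to_start_up'): 'start_up',
    ('start_up', 'to_load'): 'loads_not_limited',
    ('loads_not_limited', 'to_go_hole'): 'hole',
    ('hole', 'to_load'): 'loads_not_limited',
    ('loads_not_limited', 'surf'): 'loads_limited',
    ('loads_limited', 'non_surf'): 'loads_not_limited',
    ('loads_not_limited', 'to_stop'): 'stop_sequence',
    ('stop_sequence', 'to_go_idle'): 'stopped_available',
    ('stop_sequence', 'fail_or_stop'): 'stopped_not_available',
}

def fire(state, event):
    # Return the next state; an event that is not enabled leaves the
    # unit where it is.
    return UNIT.get((state, event), state)

print(fire('stopped_available', 'to_start_up'))   # start_up
```

The supervisor of this example acts on two such units in parallel, which is why its own states in tables 7a and 7b are pairs (C1, C2).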
Table 5 presents the load characteristics of the
unit in its different states.

Value   State description        Current load   Load allowed
0       Stopped not available    0              0
1       Stopped available        0              F
2       Start up                 0              F
3       Loads not limited        fi             F
4       Hole                     0              F
5       Loads limited            F~ (F~ < F)    F
6       Stop sequence            0              0

Table 5. Admissible load values in the different
states of the compressor unit.
Consider the task of reaching a load consign for the
described compression train. The involved states are
presented in table 6, with the possible evolutions
from these states.

Value   State description   L/C (current load)   L/A (allowed load)   New state
3       Loads not limited   fi                   F                    > fi, < fi, = fi
4       Hole                0                    F                    > fi, = fi
5       Loads limited       F~                   F~                   < fi, = fi
6       Stop sequence       0                    0                    0

Table 6. Load values admissible as load consigns.
Tables 7a and 7b show the decisions to be taken to
impose a load value on the compression train,
considering the current state of each unit. Likewise,
these tables show the load characteristics and the
value of the control for each decision. For
simplicity, and to show the application of the
approach in a simple manner, only the variations of the
second compressor are considered here.
The automaton that represents the behavior described
in tables 7a and 7b is shown on figure 8. The
decision associated with each arc is represented by
an uppercase letter. Each letter associates a line of
tables 7a and 7b with the columns L/A, policies, and ML - fd.
State C1   State C2   Max. load ML   L/A        Policy (load fd)                              New C1   New C2
3          3          2F             f1 + f2    f1 = f2 = fd/2                                3        3
3          4          2F             f1         f1 = f2 = fd/2                                3        3
3          5          F + F~2        f1 + F~2   fd < F + F~2, fd/2 < F~2: f1 = f2 = fd/2      3        3
3          5          F + F~2        f1 + F~2   fd < F + F~2, fd/2 > F~2: f1 = fd - F~2,
                                                f2 = F~2                                      3        5
3          5          F + F~2        f1 + F~2   fd >= F + F~2: f1 = F, f2 = F~2               5        5
3          6          F              f1         fd < F: f1 = fd                               3        6
3          6          F              f1         fd = F: f1 = fd                               5        6
3          6          F              f1         fd > F: f1 = F                                5        6

Table 7a. Policies for the supervisor, first part.
State C1   State C2   ML        Remaining capacity   ML - fd             u   Cod
3          3          2F        2F - f1 - f2         2F - fd > 0         1   A
3          4          2F        2F - f1 - f2         2F - fd > 0         1   B
3          5          F + F~2   F - f1               F + F~2 - fd > 0    1   C
3          5          F + F~2   F - f1               F + F~2 - fd > 0    2   D
3          5          F + F~2   0                    F + F~2 - fd < 0    3   E
3          6          F         F - f1               F - fd > 0          4   F
3          6          F         0                    F - fd = 0          5   G
3          6          F         0                    F - fd < 0          6   H

Table 7b. Policies for the supervisor, second part.
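The last rows of table 7b reduce to a sign test, in the sense of lemma 1, on the remaining capacity F - fd; a sketch for the state pair (3, 6) (the function names are ours, and Sign(0) is resolved here to the single code 5 rather than to a set):

```python
def sign(t):
    # Sign as used in lemma 1, with the t = 0 case resolved to 0
    # instead of the set {-1, 1}, for a deterministic lookup.
    return (t > 0) - (t < 0)

def decision_state_3_6(F, fd):
    # State (C1, C2) = (3, 6): only unit 1 can load, so table 7b gives
    # u = 4 if F - fd > 0, u = 5 if F - fd = 0, u = 6 if F - fd < 0.
    return {1: 4, 0: 5, -1: 6}[sign(F - fd)]

print(decision_state_3_6(10.0, 7.0))   # 4: remaining capacity is positive
```

The same pattern, one nested sign test per row group, is what the Lagrange/Sign expressions below encode in closed form.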
The dynamic system is described by the Lagrange
interpolation polynomials

x1(k+1) = 3 + 2 [ (u-1)(u-2)(u-4)(u-5)(u-6)/(-12)
                + (u-1)(u-2)(u-3)(u-4)(u-6)/(-24)
                + (u-1)(u-2)(u-3)(u-4)(u-5)/120 ],

x2(k+1) = 3 [ (1 + Sign(2F - fd))/2 ]
            [ (x2-4)(x2-5)(x2-6)/(-6) + (x2-3)(x2-5)(x2-6)/2 ]
        + (x2-3)(x2-4)(x2-6)/(-2) * (1 + Sign(F + F~2 - fd))/2
            * [ (1 + Sign(F~2 - fd/2))/2 * (u-1)(u-3)(u-4)(u-5)(u-6)/24
              + (u-1)(u-2)(u-4)(u-5)(u-6)/(-12) ]
        + (x2-3)(x2-4)(x2-5)/6 * (1 + Sign(F - fd))/2
            * [ (u-1)(u-2)(u-3)(u-5)(u-6)/12
              + (1/2)(u-1)(u-2)(u-3)(u-4)(u-6)/(-24)
              + (u-1)(u-2)(u-3)(u-4)(u-5)/120 ].

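The polynomial for x1(k+1) is a Lagrange interpolation over the decision codes u = 1, ..., 6; the following sketch checks that the reconstructed bracket selects state 5 exactly for u in {3, 5, 6} and state 3 otherwise (the exact groupings in the printed paper may differ):

```python
from fractions import Fraction as Fr

def x1_next(u):
    # x1(k+1) = 3 + 2 * [L3(u) + L5(u) + L6(u)] over the grid {1,...,6}:
    # each Lj is 1 at u = j and 0 at the other grid points, so the bracket
    # is 1 when u is 3, 5 or 6 (compressor 1 moves to state 5) and 0 when
    # u is 1, 2 or 4 (compressor 1 stays in state 3).
    t3 = Fr((u-1)*(u-2)*(u-4)*(u-5)*(u-6), -12)
    t5 = Fr((u-1)*(u-2)*(u-3)*(u-4)*(u-6), -24)
    t6 = Fr((u-1)*(u-2)*(u-3)*(u-4)*(u-5), 120)
    return int(3 + 2 * (t3 + t5 + t6))

print([x1_next(u) for u in range(1, 7)])   # [3, 3, 5, 3, 5, 5]
```

Exact rational arithmetic is used so the grid values come out as integers, which is how the symbolic expressions of Section 4 are evaluated for a given parameter value.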
Fig. 8. The automaton of the supervisor (nodes (3,3),
(3,4), (3,5), (3,6), (5,5), (5,6); arcs labelled 1,A;
1,B; 1,C; 2,D; 3,E; 4,F; 5,G; 6,H).

Consider now the following cost function:

J(x, fd) = min_u ( fd - Sum_{i=1}^{2} fi )^2.

Applying the approach described above, the
expression of the controller is

u~ = (1 + Sign(2F - fd))/2 * (1 + Sign(F - F~2))/2
       * [ (x2-4)(x2-5)(x2-6)/(-6) + (x2-3)(x2-5)(x2-6)/2 ]
   + (x2-3)(x2-4)(x2-6)/(-2) * (1 + Sign(F + F~2 - fd))/2
       * [ Sign(F~2 - fd/2) * (1 + Sign(F~2 - fd/2))/2
         + 2 (1 - Sign(F~2 - fd/2))/2
         + 3 (1 + Sign(fd/2 - F - F~2))/2 * (1 - Sign(fd - F - F~2))/2 ]
   + (x2-3)(x2-4)(x2-5)/6
       * [ 4 (1 + Sign(F - fd))/2 + 5 (1 + Sign(F - fd))/2
         + 6 (1 - Sign(fd - F))/2 ].

The advantage of such a parametric expression is that
it can be directly applied to any feasible consign
without any new calculation.

6. CONCLUSION

The combination of dynamic programming with formal
calculus is a promising approach in the case of
dynamical systems driven by externally defined
parameters. The fact that few authors have explored
this approach is probably due to its computational
complexity and to the difficulty of providing explicit
expressions of the optimal control and value function.
However, these limitations can be overcome in the
case of discrete event systems with relatively few
possible actions, for example at the supervision level
of a plant, when only a few strategic options are
available. The key tools used in this paper to
explicitly obtain the optimal trajectory are the
binary decomposition of control values and the use of
an explicit formula for binary choices.

REFERENCES
Bellman R. (1957). Dynamic Programming.
Princeton University Press, Princeton, N.J.
Boulehmi M. (1999). Mise en oeuvre et évaluation
d'algorithmes d'optimisation et de commande
optimale sur un système de calcul formel.
Doctorate thesis.
Calderón J., Chacón E., Becerra L. (1999). Gas
process automation in petroleum production.
Proc. 3rd International Conference on Industrial
Automation, Montreal, Canada, pp. 2.5-2.7.
Cardillo J., Szigeti F., Hennet J-L., Calvet J-C.
(2001). Symbolic Computation in Discrete
Optimization. LAAS report, submitted to the
Journal of Symbolic Computation.
Gimenez J.L. (1989). Contribution à la décomposition
de systèmes interconnectés par programmation
dynamique non sérielle. Application à des
systèmes de puissance. Doctorate thesis,
Toulouse, France.
Larson R. and Casti J. (1978). Principles of Dynamic
Programming, Control and System Theory, Parts I
and II. Marcel Dekker. ISBN 0-8247-6589-3.