PARAMETRIC DYNAMIC PROGRAMMING FOR DISCRETE EVENT SYSTEMS

*Juan Cardillo, *Ferenc Szigeti, **Jean-Louis Calvet, ***Jean-Claude Hennet

*Universidad de Los Andes, Facultad de Ingeniería, Escuela de Sistemas, Departamento de Sistemas de Control, Mérida, Venezuela, email: [email protected], [email protected].
**LAAS-CNRS, Groupe MAC, Toulouse, France, email: [email protected].
***LAAS-CNRS, Groupe OCSD, Toulouse, France, email: [email protected].

Abstract: This paper describes an approach based on formal calculus to optimize trajectories described by a succession of discrete states, combining the dynamic programming technique with the formal approach presented in [Cardillo et al., 2001]. The approach yields an explicit form of the optimal control sequence, obtained by formal calculus. It also makes it possible to introduce parameters in the system model as well as in the cost function; the control law is then expressed as an explicit function of these parameters. Copyright © 2002 IFAC

Keywords: Formal Calculus, Dynamic Programming, Discrete Optimization, Discrete Event Systems.

1. INTRODUCTION

The main motivation of this work is to find a trajectory of set-points for a particular family of discrete event dynamical systems subject to control inputs and external parameters of evolution. The proposed approach is based on formal calculus. It uses the dynamic programming technique together with the symbolic algorithm proposed in [Cardillo et al., 2001] to generate explicit (symbolic-parametric) expressions of the optimal control sequence. Dynamic programming is combined with a direct decomposition method to replace the optimization of a function of several variables by the recursive resolution of single-variable optimization problems. The algorithm of [Cardillo et al., 2001], called SCDO (Symbolic Computation for Discrete Optimization), iteratively obtains the formal optimal solution of a minimization problem over a finite set of Boolean variables. At each stage of the dynamic programming method, SCDO is used to obtain the explicit (symbolic) expression of the optimal control sequence and, consequently, the trajectory that describes the optimal succession of discrete states from the currently evaluated state. The optimal sequence of controls is obtained at the final stage of the algorithm, which corresponds to the initial state of the system.

This paper is structured as follows. Section 2 presents the definitions used to describe the system dynamics through finite state transitions, and formulates the optimization problem. Section 3 proposes a generalization of the classical dynamic programming method, in which parameters are incorporated in the cost function. Section 4 presents a formal symbolic resolution of the proposed problem. An illustrative example is described in Section 5, before the conclusion in Section 6.

2. PROBLEM STATEMENT

2.1 Definitions

State: Intuitively, the state may be regarded as a kind of information storage or memory, an accumulation of past events. The set of internal states of a system must be sufficiently rich to carry all the information on the history of the system needed to predict the effect of the past upon the future. We do not insist, however, on the minimality of the state in carrying such information, although this is often a convenient simplifying assumption.

Decisions: Depending upon the application, one can speak of decisions or choices. Decisions generally apply to state transitions.
Parameters: Parameters are uncertain quantities, generally present in expressions to represent a family of possibilities.

Objective function: An objective function is defined in terms of state and decision variables over a finite time-horizon of n+1 periods, as follows:

$I(x_0,u_0,\ldots,u_{n-1},p)=\sum_{i=0}^{n-1} C_i(x_i,u_i,p)+C_n(x_n,p)$.   (2.1)

The optimal decision problem is to select the feasible control sequence $u_0,\ldots,u_{n-1}$ such that the cost function $I(x_0,u_0,\ldots,u_{n-1},p)$ takes its minimal value, denoted $I^*(x_0,p)$, with:

$I^*(x_0,p)=\min_{u_0,\ldots,u_{n-1}} I(x_0,u_0,\ldots,u_{n-1},p)$.   (2.2)

Consider the graph of figure 1. It describes a sequence of discrete states $x_k$, which belong to a set X, under the action of controls $u_k$ in U. In addition, costs are associated with the transitions. A basic characteristic of the considered graphs is that the set of transitions can be perfectly partitioned into stages. This property makes it possible to clearly differentiate present states, with their applicable controls, from future states. Many discrete event processes can be described by such a graph, either directly or through some technique such as the one presented in [Gimenez, 1989].

Fig. 1. Trajectories described by a succession of discrete states (states $X_i^j$, controls $U_i^l$ with associated costs $C_i^l$).

The dynamical system defined by the graph of Fig. 1 is specified by the quintuplet

$G(U, Y, X, P, \varphi(\cdot,\cdot), \eta(\cdot,\cdot))$,   (2.3)

where
$U:=\{U_0^1,\ldots,U_0^{m_0},\ldots,U_{n-1}^{m_{n-1}}\}$ is the non-empty set of inputs; the number of possible controls at stage i is denoted $m_i$. An input or a sequence of inputs represents the action taken to apply a decision, based on an evaluation and a judgment;
$Y:=\{Y_1,\ldots,Y_{n-1}\}$ is the non-empty set of outputs;
$X:=\{X_0^1,\ldots,X_0^{j_0},\ldots,X_n^{j_n}\}$ is the non-empty set of possible states of the system;
P is the set of possible values of the vector of parameters p;
$\varphi: X\times U\times P\to X$ is the next-state function. It is proposed to characterize the transition function by Lagrange interpolation over X and U, in the following polynomial form:

$\varphi_{i+1}(\cdot,\cdot)=\sum_{j} X_{i+1}^{j}\,\prod_{k\neq j}\frac{x_i-X_i^{k}}{X_i^{j}-X_i^{k}}\,\prod_{l\neq j}\frac{u_i-U_i^{l}}{U_i^{j}-U_i^{l}}$,   (2.4)

for $i=0,\ldots,n-1$;
$\eta(\cdot,\cdot): X\times U\times P\to Y$ characterizes the next output. If the state is fully observed, then $\eta_i(\cdot,\cdot)=x_i$ for $i=0,\ldots,n-1$.

The optimization problem can then be formulated as follows:

$\min_{u_0,u_1,\ldots,u_{n-1}} I(x_0,u_0,\ldots,u_{n-1},p)$ subject to $x_{k+1}=\varphi_k(x_k,u_k,p)$, for $k=0,1,\ldots,n-1$.   (2.5)

3. A DYNAMIC PROGRAMMING APPROACH

According to the formulation of the optimization problem (2.5), dynamic programming provides a multistage optimization method through resolution of the functional equation (Bellman, 1957):

$J_k(x_k,p)=\min_{u_k}\{C_k(x_k,u_k,p)+J_{k+1}(\varphi_k(x_k,u_k,p))\}$   (3.1)

for k = n-1 down to 0, with the terminal condition:

$J_n(x_n,p)=C_n(x_n,p)$.   (3.2)

Classically, the last-stage optimal controls and costs given by (3.2) are evaluated first, and the dynamic programming algorithm is applied backwards, using the optimality equation (3.1). In the proposed parametric approach, however, the evaluation performed at each stage k and for each possible state $x_k$ provides formal expressions of the current optimal cost-to-go, $J_k(x_k,p)$, and of the current optimal control, $u_k^*$, depending on the current state and on the parameters.
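For illustration only, the parametric backward recursion (3.1)-(3.2) can be sketched as follows. The Python code below uses the sympy library to keep the parameter p symbolic; the states, controls and cost values are toy assumptions introduced for this sketch and are not taken from the paper.

```python
# A minimal sketch of the parametric backward recursion (3.1)-(3.2) on a toy
# two-stage graph; the parameter p is kept symbolic with sympy.
import sympy as sp

p = sp.Symbol('p', positive=True)

# Assumed toy transition map phi[k][(state, control)] -> next state.
phi = {0: {('A', 'u1'): 'B', ('A', 'u2'): 'C'},
       1: {('B', 'u1'): 'D', ('C', 'u1'): 'D'}}
# Assumed stage costs C_k(x_k, u_k, p); one of them depends on the parameter p.
cost = {0: {('A', 'u1'): 2 + p, ('A', 'u2'): sp.Integer(5)},
        1: {('B', 'u1'): sp.Integer(1), ('C', 'u1'): sp.Integer(1)}}
terminal = {'D': sp.Integer(0)}      # C_n(x_n, p), here independent of p

n = 2
J = {n: dict(terminal)}              # J_n(x_n, p) = C_n(x_n, p), cf. (3.2)
policy = {}
for k in range(n - 1, -1, -1):       # backward sweep, k = n-1, ..., 0, cf. (3.1)
    J[k], policy[k] = {}, {}
    for x in {s for (s, _) in phi[k]}:
        # candidate costs-to-go for every admissible control in state x
        cands = {u: cost[k][(s, u)] + J[k + 1][phi[k][(s, u)]]
                 for (s, u) in phi[k] if s == x}
        J[k][x] = sp.Min(*cands.values())   # symbolic minimum over controls
        policy[k][x] = cands                # parametric comparison left explicit

print(sp.simplify(J[0]['A']))        # Min(6, p + 3): optimal cost as a function of p
```

The printed result shows the point of the parametric formulation: the optimal cost-to-go remains an explicit function of p, and the comparison between the two candidate controls is only resolved once a value of p is substituted.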
The recursion thus yields an optimal feedback control, $u_k^*=u_k(x_k)$, and the optimal value of the objective function:

$I^*(x_0,p)=J_0(x_0,p)$.   (3.3)

Due to the discrete nature of the state and decision sets, resolution of the functional equation (3.1) amounts to an implicit enumeration and evaluation of all the paths associated with possible trajectories. A possible way of presenting the results is then by means of a table describing the optimal control in each state.

4. A SYMBOLIC SOLUTION

The objective is to obtain a formal expression of the optimal control sequence. The main motivation for such an approach is that many optimization problems cannot be represented by a unique numerical formulation. In such problems, the model, the control sequence or the cost functional contains parameters in its expression. These parameters govern the behavior of the system and the solution of the problem. In spite of their advantages in terms of the information contained in their expression, parametric models are not much used in optimization, because classical calculus cannot easily extract information from parametric expressions.

4.1 Preliminaries

Lemma 1 [Cardillo et al., 2001]: Consider a Boolean function $f:\{-1,1\}\to\mathbb{R}$. The minimum value of f is obtained at

$u^*=\mathrm{Sign}(f(-1)-f(1))$,   (4.1)

where the Sign function is used in the following sense:

$\mathrm{Sign}(t)=\begin{cases}1 & \text{if } t>0,\\ -1 & \text{if } t<0,\end{cases}$   (4.2)

and, for t = 0, $\mathrm{Sign}(0)=\{-1,1\}$ is set-valued. In the latter case, function f achieves its optimum (minimum or maximum) at both points of the domain.

Lemma 2 [Cardillo et al., 2001]: Consider the set $\{0,1,\ldots,n\}$ and $\{-1,1\}^{k+1}$. If k is the smallest integer such that $n<2^{k+1}$, then the mapping

$\psi:\{0,1,\ldots,n\}\to\{-1,1\}^{k+1}$   (4.3)

is uniquely defined by its inverse:

$u=\chi(\tilde u_0,\ldots,\tilde u_k)=\frac{1+\tilde u_0}{2}+2\,\frac{1+\tilde u_1}{2}+\cdots+2^{k}\,\frac{1+\tilde u_k}{2}$.   (4.4)

Application of Lemma 2 to the case $n=\mathrm{Card}(D_g)$ makes it possible to transform a function $g(x):D_g\to\mathbb{R}$ into a function $\varphi(\tilde u_0,\ldots,\tilde u_k):\{-1,1\}^{k+1}\to\mathbb{R}$, under the condition $2^{k}-1<\mathrm{Card}(D_g)<2^{k+1}$. Function $\varphi(\tilde u_0,\ldots,\tilde u_k)$ is then uniquely defined by $\varphi=g\circ\chi$. Using an extension of g, it can be supposed without restriction that the domain of $\varphi$ is $\{-1,1\}^{k+1}$.

4.2 The symbolic optimization algorithm

These two lemmas are now used to obtain an explicit (symbolic) expression of the controller at each stage of the dynamic programming algorithm, for the model described in Section 2.

Single binary choice. Consider the graph of Fig. 2.

Fig. 2. Graph of the single decision case: controls $U_k^1$ and $U_k^2$, with costs $C_k^1(p)$ and $C_k^2(p)$, lead from $X_k$ to $X_{k+1}^1$ and $X_{k+1}^2$, with optimal costs-to-go $J^*(X_{k+1}^1,p)$ and $J^*(X_{k+1}^2,p)$.

The transition table for the state $x_k=X_k$ is as follows.

Table 1. Transition table, state $x_k=X_k$.

| $x_{k+1}$ | $u_k$ | $\tilde u$ | $J_k^*(x_k,p)$ |
| $X_{k+1}^1$ | $U_k^1$ | -1 | $J_{k,1}^*(p)$ |
| $X_{k+1}^2$ | $U_k^2$ | 1 | $J_{k,2}^*(p)$ |

Applying Lemma 1, the symbolic expression of the optimal control is

$\tilde u^*=-\mathrm{Sign}\left(-J_{k,1}^*(p)+J_{k,2}^*(p)\right)$,   (4.5)

where $J_{k,i}^*(p)=C_k^i(p)+J^*(X_{k+1}^i,p)$.
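For illustration, Lemma 1 and Lemma 2 can be prototyped as follows; this is a minimal sketch under the definitions above, and the function names and numeric values are ours, not the paper's.

```python
# A minimal sketch of Lemma 1 (the Sign formula (4.1) for a binary choice)
# and Lemma 2 (the encoding chi of (4.4) between {0,...,n} and {-1,1}^(k+1)).
from math import ceil, log2

def sign(t):
    """Sign as in (4.2); 0 stands for the set-valued case Sign(0) = {-1, 1}."""
    return 1 if t > 0 else (-1 if t < 0 else 0)

def optimal_binary_choice(f):
    """Lemma 1: the minimizer of f over {-1, 1} is u* = Sign(f(-1) - f(1))."""
    return sign(f(-1) - f(1))

def chi(word):
    """Equation (4.4): chi(u0,...,uk) = sum_i 2**i * (1 + u_i) / 2."""
    return sum(2 ** i * (1 + u) // 2 for i, u in enumerate(word))

def encode(index, n):
    """Inverse mapping psi of (4.3): index in {0,...,n} -> word in {-1,1}^(k+1)."""
    k = max(ceil(log2(n + 1)) - 1, 0)            # smallest k with n < 2**(k+1)
    return [1 if (index >> i) & 1 else -1 for i in range(k + 1)]

# Example: f(-1) = 4 and f(1) = 2, so the minimum is attained at u* = 1.
print(optimal_binary_choice(lambda u: 4 if u == -1 else 2))   # 1
print(encode(5, n=6), chi(encode(5, n=6)))                    # [1, -1, 1] 5
```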
General case. Consider the graph of Figure 3.

Fig. 3. Graph of the general case: m controls $U_k^1,\ldots,U_k^m$, with costs $C_k^1(p),\ldots,C_k^m(p)$, lead from $X_k$ to $X_{k+1}^1,\ldots,X_{k+1}^m$, with optimal costs-to-go $J^*(X_{k+1}^1,p),\ldots,J^*(X_{k+1}^m,p)$.

To apply the formal algorithm in its simplest form, this graph is decomposed into a sequence of binary choices, as shown on Fig. 4: at each step, the choice is between taking the current control $U_k^i$ and postponing the decision to the remaining controls, represented by the fictitious control $\tilde U_k^{i+1}$ and state $\tilde X_{k+1}^{i+1}$.

Fig. 4. Decomposition of the graph into a sequence of binary choices (general case).

The transition table associated with the state $x_k=X_k$ is

Table 2. Transition table, state $x_k=X_k$.

| $x_{k+1}$ | $u_k$ | $\tilde u$ | $J_k^*(x_k,p)$ |
| $X_{k+1}^1$ | $U_k^1$ | -1 | $J_{k,1}^*(p)$ |
| $\tilde X_{k+1}^2$ | $\tilde U_k^2$ | 1 | $\min\left(J_{k,2}^*(p),\ldots,J_{k,m}^*(p)\right)$ |

The corresponding optimal control is

$\tilde u^*=-\mathrm{Sign}\left(-J_{k,1}^*(p)+\min\left(J_{k,2}^*(p),\ldots,J_{k,m}^*(p)\right)\right)$.   (4.6)

The transition table associated with the state $\tilde x_{k+1}=\tilde X_{k+1}^2$ is

Table 3. Transition table, state $\tilde x_{k+1}=\tilde X_{k+1}^2$.

| $\tilde x_{k+1}$ | $u_k$ | $\tilde u$ | $J_k^*(x_k,p)$ |
| $X_{k+1}^2$ | $U_k^2$ | -1 | $J_{k,2}^*(p)$ |
| $\tilde X_{k+1}^3$ | $\tilde U_k^3$ | 1 | $\min\left(J_{k,3}^*(p),\ldots,J_{k,m}^*(p)\right)$ |

The optimal control associated with this table is

$\tilde u^*=-\mathrm{Sign}\left(-J_{k,2}^*(p)+\min\left(J_{k,3}^*(p),\ldots,J_{k,m}^*(p)\right)\right)$.   (4.7)

The computation is thus iterated until the transition table associated with the state $\tilde x_{k+1}=\tilde X_{k+1}^{m-1}$ is obtained:

Table 4. Transition table, state $\tilde x_{k+1}=\tilde X_{k+1}^{m-1}$.

| $\tilde x_{k+1}$ | $u_k$ | $\tilde u$ | $J_k^*(x_k,p)$ |
| $X_{k+1}^{m-1}$ | $U_k^{m-1}$ | -1 | $J_{k,m-1}^*(p)$ |
| $X_{k+1}^{m}$ | $U_k^{m}$ | 1 | $J_{k,m}^*(p)$ |

The optimal control for this table is

$\tilde u^*=-\mathrm{Sign}\left(-J_{k,m-1}^*(p)+J_{k,m}^*(p)\right)$.   (4.8)

The optimal control in state $x_k=X_k$ for a particular value of parameter p is then obtained from the evaluation of the expressions of $\tilde u^*$ at each stage of the cascade, as illustrated in the sketch below:
1. If the value of $\tilde u^*$ is 1, proceed to the next stage.
2. If the value of $\tilde u^*$ is -1, the optimal control is the corresponding one in the transition table.
3. If the value of $\tilde u^*$ is zero, either of the two values in the transition table is an admissible control. In this case, two possible optimal trajectories may be generated, and the inspection is continued.
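The cascade (4.6)-(4.8) and the evaluation procedure above can be sketched as follows; the cost values used in the example call are illustrative assumptions, not data from the paper.

```python
# A minimal sketch of the cascade of binary choices (4.6)-(4.8): an m-way
# decision at state X_k is replaced by a chain of Sign comparisons, each one
# against the minimum of the costs of the remaining arcs.
def sign(t):
    return 1 if t > 0 else (-1 if t < 0 else 0)

def cascade_controls(costs):
    """costs[i] = C_k^i(p) + J*(X_{k+1}^i, p) for the m arcs leaving X_k.
    Returns the list of binary decisions u~ of (4.6)-(4.8)."""
    decisions = []
    for i in range(len(costs) - 1):
        # u~ = -Sign(-J*_{k,i} + min(J*_{k,i+1}, ..., J*_{k,m}))
        decisions.append(-sign(-costs[i] + min(costs[i + 1:])))
    return decisions

def evaluate(costs):
    """Steps 1-3 above: stop at the first u~ equal to -1 (or to 0)."""
    for i, u in enumerate(cascade_controls(costs)):
        if u <= 0:             # -1: take arc i; 0: arc i is one admissible optimum
            return i
    return len(costs) - 1      # every u~ was +1: the last arc is optimal

# Four arcs with costs-to-go 7, 3, 5, 4: the optimum is arc index 1 (cost 3).
print(evaluate([7, 3, 5, 4]))  # 1
```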
5. EXAMPLE

This example describes a typical application of the parametric dynamic programming approach. The problem is to design a supervisor for a simplified load distribution system in a compression train. The objective is to drive the system to a load consign (set-point), noted $f_d$, with a compression train composed of 2 compression units. The treatment of this problem leads to an explicit (symbolic) form of the expression that governs the decision process, with the load as the control parameter.

Fig. 5. The compression train: two units C1 and C2 in parallel, with individual loads $f_1\le F$ and $f_2\le F$, inlet pressure $P_{in}\approx 40$-$45$ psi, constant outlet pressure $P_{out}\approx 1480$ psi and total output flow $f_{out}=f_d$.

It is assumed that both compression units have the same characteristics. The automaton which describes the behavior of a compression unit is taken from [Calderón et al., 1999]. It is presented on figure 6. Each unit possesses 7 states that identify its behavior. The mission of the supervisor for this type of system is to obtain a desired load flow by starting up or stopping the compressors.

Fig. 6. Automaton for the turbo-compressor unit (states: stopped not available, stopped available, start up, loads not limited, hole, loads limited, stop sequence; transitions: to start up, to go idle, to load, to go hole, permissive to load, to stop, fail or stop).

Table 5 presents the load characteristics of the unit in its different states.

Table 5. Admissible load values in the different states of the compressor unit.

| Value | State description | Current load | Load allowed |
| 0 | Stopped, not available | 0 | 0 |
| 1 | Stopped, available | 0 | F |
| 2 | Start up | 0 | F |
| 3 | Loads not limited | $f_i$ | F |
| 4 | Hole | 0 | F |
| 5 | Loads limited | $\tilde F<F$ | $\tilde F$ |
| 6 | Stop sequence | 0 | 0 |

Consider the task of reaching a load consign with the described compression train. The involved states are presented in Table 6, together with the possible evolutions from these states.

Table 6. Load values admissible as load consigns.

| Value | State description | Current load (L/C) | Load allowed (L/A) | New consign with respect to $f_i$ |
| 3 | Loads not limited | $f_i$ | F | $>f_i$, $<f_i$, $=f_i$ |
| 4 | Hole | 0 | F | $>f_i$, $=f_i$ |
| 5 | Loads limited | $\tilde F$ | $\tilde F$ | $<f_i$, $=f_i$ |
| 6 | Stop sequence | 0 | 0 | 0 |

Tables 7a and 7b show the decisions to be taken to impose a load value on the compression train, given the current state of each unit. These tables also show the load characteristics and the value of the control for each decision. For simplicity, and to show the application of the approach in a simple manner, only the variations of the second compressor are considered here. The automaton that represents the behavior described in Tables 7a and 7b is shown on figure 8. Each arc is labelled with the control value and an uppercase letter; each letter associates a line of Tables 7a and 7b with the columns L/A, policies and ML $- f_d$.

Table 7a. Policies for the supervisor, first part.

| State C1 | State C2 | L/A | Max. load ML | Policy (as a function of $f_d$) | New state C1 | New state C2 |
| 3 | 3 | $f_1+f_2$ | $2F$ | $f_1=f_2=f_d/2$ | 3 | 3 |
| 3 | 4 | $f_1$ | $2F$ | $f_1=f_2=f_d/2$ | 3 | 3 |
| 3 | 5 | $f_1+\tilde F_2$ | $F+\tilde F_2$ | if $f_d<F+\tilde F_2$ and $f_d/2<\tilde F_2$: $f_1=f_2=f_d/2$ | 3 | 3 |
| | | | | if $f_d<F+\tilde F_2$ and $f_d/2>\tilde F_2$: $f_1=f_d-\tilde F_2$, $f_2=\tilde F_2$ | 3 | 5 |
| | | | | if $f_d\ge F+\tilde F_2$: $f_1=F$, $f_2=\tilde F_2$ | 5 | 5 |
| 3 | 6 | $f_1$ | $F$ | if $f_d<F$: $f_1=f_d$ | 3 | 6 |
| | | | | if $f_d=F$: $f_1=f_d$ | 5 | 6 |
| | | | | if $f_d>F$: $f_1=F$ | 5 | 6 |

Table 7b. Policies for the supervisor, second part.

| State C1 | State C2 | Remaining capacity | ML $- f_d$ | u | Code |
| 3 | 3 | $2F-f_1-f_2$ | $2F-f_d>0$ | 1 | A |
| 3 | 4 | $2F-f_1-f_2$ | $2F-f_d>0$ | 1 | B |
| 3 | 5 | $F-f_1$ | $F+\tilde F_2-f_d>0$ | 1 | C |
| | | | $F+\tilde F_2-f_d>0$ | 2 | D |
| | | | $F+\tilde F_2-f_d<0$ | 3 | E |
| 3 | 6 | $F-f_1$ | $F-f_d>0$ | 4 | F |
| | | | $F-f_d=0$ | 5 | G |
| | | | $F-f_d<0$ | 6 | H |

Fig. 8. The automaton of the supervisor: the nodes are the state pairs (3,3), (3,4), (3,5), (5,5), (3,6) and (5,6), and the arcs carry the control value and code of Tables 7a and 7b: (1,A), (1,B), (1,C), (2,D), (3,E), (4,F), (5,G), (6,H).

The dynamic system is described by Lagrange interpolation polynomials of the form (2.4), in which the case selected by the consign $f_d$ is encoded by Sign factors:

$x_1(k+1)=\sum_{j=1}^{6} X_1^{j}\,\prod_{\substack{l=1\\ l\neq j}}^{6}\frac{u-l}{j-l}$,

$x_2(k+1)=X_2^{3}(f_d)\,\frac{(x_2-4)(x_2-5)(x_2-6)}{-6}+X_2^{4}(f_d)\,\frac{(x_2-3)(x_2-5)(x_2-6)}{2}+X_2^{5}(f_d)\,\frac{(x_2-3)(x_2-4)(x_2-6)}{-2}+X_2^{6}(f_d)\,\frac{(x_2-3)(x_2-4)(x_2-5)}{6}$,

where $X_1^{j}$ denotes the new state of the first unit prescribed by Tables 7a and 7b under control $u=j$, and $X_2^{j}(f_d)$ the new state of the second unit from state $x_2=j$; the conditions of Tables 7a and 7b are encoded in $X_2^{j}(f_d)$ by the factors $\frac{1+\mathrm{Sign}(2F-f_d)}{2}$, $\frac{1+\mathrm{Sign}(F+\tilde F_2-f_d)}{2}$, $\frac{1+\mathrm{Sign}(\tilde F_2-f_d/2)}{2}$ and $\frac{1+\mathrm{Sign}(F-f_d)}{2}$.
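For illustration, the construction of such a Sign-gated interpolation polynomial can be sketched with the sympy library. The next-state values and the threshold used below are toy assumptions and do not reproduce the actual entries of Tables 7a and 7b.

```python
# A minimal sketch of how a discrete transition table is turned into the
# polynomial/Sign form used above: Lagrange interpolation over the state
# values, with (1 + Sign(.))/2 factors selecting the branch activated by f_d.
import sympy as sp

x2, fd, F = sp.symbols('x2 f_d F', real=True)

def lagrange_basis(nodes, j, var):
    """L_j(var): equal to 1 at nodes[j] and to 0 at the other nodes."""
    expr = sp.Integer(1)
    for i, node in enumerate(nodes):
        if i != j:
            expr *= (var - node) / (nodes[j] - node)
    return expr

nodes = [3, 4, 5, 6]                       # states of the second compressor
# Assumed next states for the two regimes f_d < F and f_d > F (illustration only).
next_if_low  = {3: 3, 4: 3, 5: 3, 6: 3}
next_if_high = {3: 5, 4: 5, 5: 5, 6: 6}

gate_low  = (1 + sp.sign(F - fd)) / 2      # 1 when f_d < F, 0 when f_d > F
gate_high = (1 + sp.sign(fd - F)) / 2

x2_next = sum((gate_low * next_if_low[node] + gate_high * next_if_high[node])
              * lagrange_basis(nodes, j, x2)
              for j, node in enumerate(nodes))

# Substituting a state and a numeric consign recovers the table entry directly.
print(sp.simplify(x2_next.subs({x2: 4, fd: 10, F: 20})))   # 3  (low-load regime)
```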
Consider now the following cost function:

$J(x,f_d)=\min_{u}\left|f_d-\sum_{i=1}^{2}f_i\right|$.

Applying the approach described above, the expression of the controller is obtained in the same form, as a polynomial over the Lagrange basis in $x_2$ whose coefficients are the Sign-gated decisions of Tables 7a and 7b:

$\tilde u = u^{(3)}(f_d)\,\frac{(x_2-4)(x_2-5)(x_2-6)}{-6}+u^{(4)}(f_d)\,\frac{(x_2-3)(x_2-5)(x_2-6)}{2}+u^{(5)}(f_d)\,\frac{(x_2-3)(x_2-4)(x_2-6)}{-2}+u^{(6)}(f_d)\,\frac{(x_2-3)(x_2-4)(x_2-5)}{6}$,

with, for instance, $u^{(3)}(f_d)=\frac{1+\mathrm{Sign}(2F-f_d)}{2}$ and $u^{(5)}(f_d)=\frac{1+\mathrm{Sign}(F+\tilde F_2-f_d)}{2}\left(\frac{1+\mathrm{Sign}(\tilde F_2-f_d/2)}{2}+2\,\frac{1-\mathrm{Sign}(\tilde F_2-f_d/2)}{2}\right)+3\,\frac{1-\mathrm{Sign}(F+\tilde F_2-f_d)}{2}$; $u^{(4)}(f_d)=1$, and $u^{(6)}(f_d)$ takes the value 4, 5 or 6 according to whether $F-f_d$ is positive, zero or negative.

The advantage of such a parametric expression is that it can be directly applied to any feasible consign without any new calculation.

6. CONCLUSION

The combination of dynamic programming with formal calculus is a promising approach for dynamical systems driven by externally defined parameters. The fact that few authors have explored this approach is probably due to its computational complexity and to the difficulty of providing explicit expressions of the optimal control and value function. However, these limitations can be overcome in the case of discrete event systems with relatively few possible actions, for example at the supervision level of a plant, when only a few strategic options are available. The key tools used in this paper to obtain the optimal trajectory explicitly are the binary decomposition of control values and the use of an explicit formula for binary choices.

REFERENCES

Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, N.J.

Boulehmi, M. (1999). Mise en oeuvre et évaluation d'algorithmes d'optimisation et de commande optimale sur un système de calcul formel. Doctorate thesis.

Calderón, J., Chacón, E. and Becerra, L. (1999). Gas process automation in petroleum production. Proc. 3rd International Conference on Industrial Automation, Montreal, Canada, pp. 2.5-2.7.

Cardillo, J., Szigeti, F., Hennet, J.-C. and Calvet, J.-L. (2001). Symbolic computation in discrete optimization. LAAS report, submitted to the Journal of Symbolic Computation.

Gimenez, J.L. (1989). Contribution à la décomposition de systèmes interconnectés par programmation dynamique non sérielle. Application à des systèmes de puissance. Doctorate thesis, Toulouse, France.

Larson, R. and Casti, J. (1978). Principles of Dynamic Programming, Control and System Theory, Parts I and II. Marcel Dekker. ISBN 0-8247-6589-3.