
2009 American Control Conference
Hyatt Regency Riverfront, St. Louis, MO, USA
June 10-12, 2009
An Algorithm for the Long Run Average Cost Problem for
Linear Systems with Non-observed Markov Jump Parameters
Carlos A. Silva and Eduardo F. Costa
Abstract— This paper addresses the long run average cost problem for linear systems with non-observed Markov jump parameters. We present an algorithm that relies on approximating the (infinite horizon) cost by its finite horizon version, and that uses an evolutionary algorithm to solve the finite horizon problem. A numerical example illustrates the proposed algorithm.
I. Introduction
There has been a great deal of attention to linear systems with Markov jump parameters (LSMJP). LSMJP constitute a well-known class of systems that can be successfully employed in applications featuring random failures, environmental changes and other phenomena that lead to random, abrupt changes of behaviour. There exist numerous results, spanning from notions of stability [13] and stabilization [16], [9], through basic features such as stabilizability and detectability [3], to optimal solutions of finite and infinite horizon quadratic cost problems [14], [7], [8] and filtering [10].
The available results for LSMJP are strong enough to allow for parallels with standard linear systems, mainly in scenarios with complete state observation (which includes observation of the Markov state); as an illustration, we mention the existence of generalized Riccati equations that characterize the optimal solution to the long run average cost (LRAC), even in the presence of additive Gaussian noise, see [6]. However, the parallel with standard linear systems is weaker when dealing with incomplete or no observation of the Markov state. For instance, in [11] the authors need to consider a variational-like approach to the finite horizon quadratic cost problem with partial observation.
In the context of the LRAC problem that interests us in this paper, the only available result concerns bounds for the cost [19], which allow one to study the existence of the LRAC and the convergence of the finite horizon average cost to the LRAC, assuming controls in static state feedback form. A direct extension of the above-mentioned variational-like approach does not seem viable, as it typically provides non-stationary controls, whereas the LRAC, as an infinite horizon problem, requires stationary controls. In fact, the LRAC problem for LSMJP with unobserved Markov state can be stated in simple terms as how to obtain a static gain that is not a function of the Markov state and that minimizes the average cost incurred by the controlled Markov jump system. Such controls can be implemented when the Markov variable is not accessible, are simpler to implement than non-stationary controls, and yield stable closed loops (in a certain stochastic sense) provided some mild detectability-like conditions hold, making them appealing for applications. However, to the best of our knowledge, there is no available study on methods for solving the LRAC problem.
In this paper, we present a simple algorithm based on the idea of approximation via the finite horizon cost (FHC). The FHC problem with horizon T is handled by a genetic algorithm (GA), aiming at a suboptimal static gain of the form u_k = K x_k, where x_k is the so-called continuous state and u_k is the control. Next, we employ the algorithm for the FHC with increasing horizon T in order to approximate the LRAC. The LRAC is more complex, as it involves issues of stability and of sensitivity to the initial condition of the system. To handle these difficulties, we employ a modified cost (obtained by including a parameter ε > 0) and analyse its impact on the original cost. We also show that stabilizing gains K are associated with a finite LRAC when ε > 0. These contributions are presented in a quite simple and comprehensible manner, and they are fundamental for the approximation of the LRAC via the FHC and, hence, for our algorithm to provide a solution to the LRAC. Apart from these theoretical results, we employ some new strategies in the genetic algorithm for the FHC problem: (i) use of solutions given by coupled Riccati equations from the LSMJP theory to initialize the population, and (ii) representation of each element of the gain in a polynomial form to increase the number of genes of each element of the population, as detailed in Section III.
The paper is organized as follows. In Section II we
formalize the LRAC and FHC problems for LSMJP. The
algorithm for the FHC is detailed in Section III, and
in Section IV we obtain some results that allow us to
use this algorithm to approximate the LRAC. Section
V presents an illustrative example. We finish with some
concluding remarks.
C. A. Silva and E. F. Costa are with the Depto. de Matemática Aplicada e Estatística, Universidade de São Paulo, C.P. 668, 13560-970, São Carlos, SP, Brazil. [email protected], [email protected]
Research supported by CNPq Grant 304429/2007-4, and FAPESP Grants 2008/02035-8 and 2003/06736-7.
II. Problem Formulation and Preliminary Results

One simple way to describe the discrete-time LSMJP considered in this paper is to take into account, initially, a standard time-varying linear system of the form

  x_{k+1} = S_k x_k + F_k u_k + w_k,   k \ge 0,

with initial condition x_0 \in \mathbb{R}^n, where x \in \mathbb{R}^n is the state, u \in \mathbb{R}^r is the input and w_k is a zero-mean independent Gaussian random variable with covariance matrix E. Consider the cost function

  W_T = \sum_{k=0}^{T} x_k' Q x_k + u_k' R u_k,

with Q = Q' \ge 0 and R = R' > 0. However, instead of the standard hypothesis that the values of S_k and F_k are known a priori for all time instants k \ge 0, we consider that they take values in the collections A = \{A_1, \ldots, A_S\} and B = \{B_1, \ldots, B_S\}, respectively, according to a Markov chain \{\theta_0, \theta_1, \ldots\}:

  S_k = A_{\theta_k},   F_k = B_{\theta_k},   k \ge 0.

The transition probabilities are

  P(\theta_{k+1} = j \mid \theta_k = i) = p_{ij},

with i, j \in \mathcal{S} = \{1, \ldots, S\}, and the initial distribution is \pi = [P(\theta_0 = 1), \ldots, P(\theta_0 = S)] = [\pi_1, \ldots, \pi_S]. We assume that the matrix P = [p_{ij}] and the vector \pi are known. In this situation, the system is inherently stochastic and x_k forms a stochastic process, in such a manner that W_T as defined above is a random variable, and we consider its expected value for optimization purposes,

  Y_T = E\{W_T\}.

It is a well known fact in the LSMJP literature that Y_T can be represented in terms of certain linear operators (denoted here by \mathcal{T}) involving the conditional second moment matrices of the system, as we present in the sequel. Let \mathcal{M}^{r,s} denote the linear space formed by a number S of r \times s-dimensional matrices, \mathcal{M}^{r,s} = \{U = (U_1, \ldots, U_S)\}; also, \mathcal{M}^r \equiv \mathcal{M}^{r,r}. For U, V \in \mathcal{M}^r, U \ge V signifies that U_i - V_i is positive semidefinite for each i \in \mathcal{S}, and similarly for other mathematical relations. Let tr\{\cdot\} denote the trace operator. It is known that \mathcal{M}^{r,s} equipped with the inner product

  \langle U, V \rangle = \sum_{i=1}^{S} tr\{U_i' V_i\}

forms a Hilbert space. Following the notation of [3] we define, for U \in \mathcal{M}^n and V \in \mathcal{M}^n, the operator \mathcal{T}_U : \mathcal{M}^n \to \mathcal{M}^n by

  \mathcal{T}_{U,i}(V) := \sum_{j=1}^{S} p_{ji} U_j V_j U_j',   \forall i \in \mathcal{S},   (1)

and we define for convenience \mathcal{T}^0(V) = V and, for t \ge 1, \mathcal{T}^t(V) = \mathcal{T}(\mathcal{T}^{t-1}(V)). It is simple to check that \mathcal{T} is linear. The next result is an adaptation of results in [6, ch. 3]; see also [19].

Proposition 1. Let X \in \mathcal{M}^n, Q \in \mathcal{M}^n and \Sigma \in \mathcal{M}^n be defined by X_i = x_0 x_0' \pi_i, Q_i = Q and \Sigma_i = E, \forall i \in \mathcal{S}. Then,

  Y_T = \sum_{k=0}^{T} \Big\langle \mathcal{T}_A^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_A^l(\Sigma),\; Q \Big\rangle.   (2)

The collection of matrices X in Proposition 1 represents the conditional second moment of the initial condition, X_i = E\{x_0 x_0' \mid \theta_0 = i\} P(\theta_0 = i). We write Y(X) to emphasize the dependence of Y on X.

Observation and Control Structures

We assume that only the quantity x_k is available to the controller at each time instant k; \theta_k is not observed. In connection, we consider a linear state feedback control of the form

  u_k = K x_k.   (3)

K is referred to as a static gain (in opposition to time-dependent gains of the form K_k). Static gains are among the simplest controls to implement, thus being of interest for many applications.

For the complete observation case (x_k and \theta_k observed), with controls of the form

  u_k = K_{\theta_k} x_k,   (4)

there are strong results in the LSMJP literature, paralleling the deterministic linear systems theory. One result that shall be useful to us is the solution of finite horizon quadratic cost problems via coupled algebraic Riccati equations, allowing one to compute optimal gains K_i in a simple way, as presented in the Appendix.

For a given gain K (or a collection of gains K = \{K_i, i \in \mathcal{S}\} in the complete observation case), we have the closed loop form x_{k+1} = (S_k + F_k K) x_k + w_k (respectively, x_{k+1} = (S_k + F_k K_{\theta_k}) x_k + w_k), giving rise to collections of "closed loop" matrices A_K = (A_1 + B_1 K, \ldots, A_S + B_S K) (respectively, A_K = (A_1 + B_1 K_1, \ldots, A_S + B_S K_S)), to operators \mathcal{T}_{A_K} and to costs given by (2). We write Y_K^T and Y_K^T(X) to emphasize the dependence on K and on X,

  Y_K^T(X) = \sum_{k=0}^{T} \Big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma),\; Q \Big\rangle.   (5)

Problem Formulation

We are interested in the optimization problems

  P1:  \min_G Y_G^T(X),

in the scenario with T < \infty, and

  P2:  \min_G Z_G,

when T = \infty, where

  Z_G = \limsup_{T \to \infty} Y_G^T(X)/T.

P2 is referred to as the LRAC problem. The scenario with T = \infty is more complex in the sense that a finite Z does not ensure that the controlled system is stable, as we shall see later.
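For illustration, (5) can be evaluated numerically by propagating the recursions X^{(k+1)} = \mathcal{T}_{A_K}(X^{(k)}) and W^{(k+1)} = \Sigma + \mathcal{T}_{A_K}(W^{(k)}). The Python/NumPy sketch below is ours, not part of the original development; the helper names apply_T and finite_horizon_cost are hypothetical. Since u_k = K x_k, the sketch uses the running weight Q + K'RK, which accounts for the control term of W_T (the paper's (5) keeps the symbol Q for this weight).

```python
import numpy as np

def apply_T(A_cl, P, V):
    """Operator T of (1): (T_U(V))_i = sum_j p_{ji} U_j V_j U_j'."""
    S = len(A_cl)
    return [sum(P[j, i] * A_cl[j] @ V[j] @ A_cl[j].T for j in range(S))
            for i in range(S)]

def finite_horizon_cost(A, B, P, pi0, K, Q, R, Sigma, x0, T):
    """Evaluate Y_K^T(X) of (5) for a static gain K (illustrative sketch).

    A, B, Sigma: lists of S matrices; P: S x S transition matrix;
    pi0: initial distribution; Q, R: weighting matrices; x0: initial state.
    X_i = x0 x0' pi_i, as in Proposition 1.
    """
    S = len(A)
    A_cl = [A[i] + B[i] @ K for i in range(S)]          # closed-loop matrices A_K
    Qbar = Q + K.T @ R @ K                               # weight including u'Ru
    Xk = [pi0[i] * np.outer(x0, x0) for i in range(S)]   # T^k(X) at k = 0
    Wk = [np.zeros_like(Q) for _ in range(S)]            # sum_{l<k} T^l(Sigma)
    cost = 0.0
    for _ in range(T + 1):
        # <T^k(X) + sum_{l<k} T^l(Sigma), Qbar> with Q_i = Qbar for all i
        cost += sum(np.trace((Xk[i] + Wk[i]) @ Qbar) for i in range(S))
        TW = apply_T(A_cl, P, Wk)
        Wk = [Sigma[i] + TW[i] for i in range(S)]        # advance noise sum
        Xk = apply_T(A_cl, P, Xk)                        # advance T^k(X)
    return cost
```

Dividing the returned value by T gives the finite horizon average cost, which serves later on as an estimate of Z_K.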
III. A Genetic Algorithm for the Problem P1

One important issue that arises when dealing with GAs is how to create an adequate initial population. In principle, one could simply take randomly generated gains K^1, K^2, \ldots. However, we have observed in many numerical examples that the costs associated with many of these gains are extremely high, leading to a high probability of exclusion in the selection process. As a result, the "genetic variety" of the second generation of the population decays and the algorithm performance is poor; see e.g. [18], [15], [17]. In order to overcome this difficulty, we initially solve the optimization problem within the class of controls of the form (4) via the recursive coupled algebraic Riccati equations described in the Appendix, leading to a collection of gains L_1, \ldots, L_S that are optimal for the complete state observation problem, and then we use these gains to obtain the initial population as follows:

  K^{\ell} = \alpha_1 L_1 + \cdots + \alpha_S L_S,   \ell = 1, \ldots, q,   (6)

where L_1, \ldots, L_S are given by (14) (or by (15) in the context of the LRAC problem P2), q is the size of the initial population, and \alpha_i, i \in \mathcal{S}, are zero-mean independent Gaussian variables with some arbitrary covariance matrix P.

The static gain K (or, in the complete observation case, each element of the collection of static gains K) may be low dimensional, as in Example 1, where K = [K(1,1) \; K(1,2) \; K(1,3)]. In these situations, if we simply identify each element of K with one gene of each element of the population, we will have few genes, and we have observed a poor performance of the GA. We therefore employ a polynomial representation for each element of K = [K(i,j)],

  K(i,j) = \beta_1(i,j)^1 + \beta_2(i,j)^2 + \cdots + \beta_p(i,j)^p.   (7)

For the initialization of each element of the population K^{\ell}, we employ randomly generated \beta_1^{\ell}(i,j), \ldots, \beta_{p-1}^{\ell}(i,j) and define \beta_p^{\ell}(i,j) = K^{\ell}(i,j) - \big(\beta_1^{\ell}(i,j)^1 + \cdots + \beta_{p-1}^{\ell}(i,j)^{p-1}\big); the GA will determine the \beta's of the following generations. The algorithm is presented in Table I; a code sketch of the initialization in (6) and (7) follows the table.

TABLE I
Genetic Algorithm for the FHC problem P1
1) Set the GA and system parameters.
2) Initialize the population K^{\ell} as in (6) and (7).
3) Perform genetic and evolutionary operators to obtain a new population and the gains K^{\ell} via (7).
4) For each K^{\ell}, calculate Y^{\ell}(X) via (5).
5) If the stop criterion is not satisfied, return to step 3.
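A minimal sketch of the initialization steps follows, under the assumption of Gaussian weights with a scalar variance (the paper allows an arbitrary covariance matrix P). For simplicity, the power form (7) is replaced here by a plain additive split of each entry into p genes; we state this explicitly as a simplification of ours that serves the same purpose of increasing the gene count.

```python
import numpy as np

rng = np.random.default_rng(0)

def initial_population(L, q, var=1.0):
    """Initial gains per (6): K^l = alpha_1 L_1 + ... + alpha_S L_S,
    with alpha_i zero-mean independent Gaussian variables."""
    return [sum(rng.normal(0.0, np.sqrt(var)) * Li for Li in L)
            for _ in range(q)]

def encode(K, p):
    """Split each gain entry into p additive genes (a simplified variant
    of the polynomial representation (7))."""
    parts = [rng.normal(size=K.shape) for _ in range(p - 1)]
    parts.append(K - sum(parts))   # last gene closes the sum back to K
    return parts

def decode(parts):
    """Recover the gain from its genes."""
    return sum(parts)
```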
IV. An Algorithm for the Problem P2 (LRAC)

The problem P2 (the LRAC problem for LSMJP) is far more complex than the finite horizon problem P1 because the infinite horizon scenario involves the question of stability and related issues of sensitivity to the initial condition of the system. For instance, we may have \lim_{T \to \infty} (1/T) Y_G^T(X) < \infty and \lim_{T \to \infty} (1/T) Y_G^T(W) = \infty for some W \ne X. Moreover, lack of positiveness of \Sigma and Q may lead to finite cost controls that are not stabilizing, when certain conditions of detectability and stabilizability are not satisfied. In order to overcome these difficulties, we consider the standard hypothesis that the Markov chain is ergodic, in such a manner that the limiting distribution of the chain is unique [1]. We also consider the modified cost

  Y_K^{\varepsilon,T}(X) = \sum_{k=0}^{T} \Big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \Big\rangle,

where \mathbf{I} = (I, \ldots, I) \in \mathcal{M}^n, and the related LRAC Z_G^{\varepsilon}, which has the property that Z_K^{\varepsilon} is finite if and only if the gain K is stabilizing; we also evaluate the impact of the modification on the original costs. Let us start by formalizing the stabilizability notion.

Definition 1. We say that a static feedback gain K (respectively, a collection of gains K \in \mathcal{M}^{r,n} in the complete observation scenario) is mean square (MS) stabilizing when, for each \Sigma \in \mathcal{M}^n satisfying \Sigma = \Sigma' \ge 0, there exists \Gamma \in \mathcal{M}^n, \Gamma = \Gamma' \ge 0, such that

  \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma) < \Gamma   (8)

for each X \in \mathcal{M}^n, X = X' \ge 0, provided k \ge \bar{k} for some \bar{k} (possibly dependent on X).

Remark 1. MS-stabilizability as defined above and the related notion of MS-stability are equivalent to the standard MS notions in the LSMJP literature; e.g., K is MS-stabilizing if and only if

  \sum_{l=0}^{\infty} \mathcal{T}_{A_K}^l(X) < \infty,   \forall X = X' \ge 0.   (9)

For the equivalences, we refer to [6].

Proposition 2. If K is not MS-stabilizing, then for each M \in \mathcal{M}^n, M = M' \ge 0, there exists a sufficiently large integer \bar{T} such that \sum_{k=0}^{\bar{T}} \mathcal{T}_{A_K}^k(\mathbf{I}) \ge M.

Proof: From (9), if K is not MS-stabilizing, there exists X such that, for each \Gamma, \sum_{l=0}^{k} \mathcal{T}_{A_K}^l(X) \ge \Gamma for some k. The linearity of \mathcal{T} and the fact that \upsilon \mathbf{I} \ge X for some \upsilon \ge 0 allow us to write \sum_{l=0}^{k} \mathcal{T}_{A_K}^l(\mathbf{I}) \ge \upsilon^{-1} \sum_{l=0}^{k} \mathcal{T}_{A_K}^l(X) \ge \upsilon^{-1} \Gamma. Since \Gamma is arbitrary, the result follows.

Lemma 1. K is MS-stabilizing if and only if Z_K^{\varepsilon} < \infty.
Proof: Necessity. Replacing \Sigma by \Sigma + \varepsilon \mathbf{I} in Definition 1 and employing Proposition 1, we write

  Y_K^{\varepsilon,T}(X)/T
   = (1/T) \sum_{k=0}^{T-1} \big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \big\rangle
   = (1/T) \sum_{k=0}^{\bar{k}-1} \big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \big\rangle
     + (1/T) \sum_{k=\bar{k}}^{T-1} \big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \big\rangle
   \le \Delta/T + ((T - \bar{k})/T) \, \langle \Gamma,\; Q + \varepsilon \mathbf{I} \rangle,   (10)

where we set \Delta = \sum_{k=0}^{\bar{k}-1} \langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \rangle. This leads to

  \limsup_{T \to \infty} Y_K^{\varepsilon,T}(X)/T \le \langle \Gamma,\; Q + \varepsilon \mathbf{I} \rangle < \infty.

Sufficiency. Let \Gamma = Z_K^{\varepsilon} < \infty. Let us deny the statement by assuming that K is not stabilizing. In Proposition 2, set M = \varepsilon^{-2}(\Gamma + 1)\mathbf{I} and consider the corresponding \bar{T}. We can write

  \Gamma = \limsup_{T \to \infty} (1/T) \sum_{\ell=0}^{T-1} \big\langle \mathcal{T}_{A_K}^{\ell}(X) + \sum_{l=0}^{\ell-1} \mathcal{T}_{A_K}^{l}(\Sigma + \varepsilon \mathbf{I}),\; Q + \varepsilon \mathbf{I} \big\rangle
   \ge \varepsilon^2 \limsup_{T \to \infty} (1/T) \sum_{\ell=0}^{T-1} \big\langle \sum_{l=0}^{\ell-1} \mathcal{T}_{A_K}^{l}(\mathbf{I}),\; \mathbf{I} \big\rangle
   = \varepsilon^2 \lim_{T \to \infty} (1/T) \sum_{\ell=0}^{T-1} \big\langle \sum_{l=0}^{\ell-1} \mathcal{T}_{A_K}^{l}(\mathbf{I}),\; \mathbf{I} \big\rangle
   \ge \varepsilon^2 \lim_{T \to \infty} (1/T) \sum_{\ell=\bar{T}}^{T-1} \big\langle \sum_{l=0}^{\bar{T}} \mathcal{T}_{A_K}^{l}(\mathbf{I}),\; \mathbf{I} \big\rangle
   \ge \varepsilon^2 \lim_{T \to \infty} (1/T) \sum_{\ell=\bar{T}}^{T-1} \langle M,\; \mathbf{I} \rangle = \varepsilon^2 \lim_{T \to \infty} ((T - \bar{T})/T) \, \langle M,\; \mathbf{I} \rangle
   \ge \varepsilon^2 \langle \varepsilon^{-2}(\Gamma + 1)\mathbf{I},\; \mathbf{I} \rangle \ge \Gamma + 1,

which is a contradiction, completing the proof.
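In view of Remark 1, MS-stabilizability of a candidate gain K can be tested numerically through a standard equivalent criterion (see e.g. [6]): the spectral radius of the matrix representation of \mathcal{T}_{A_K} must be smaller than one. The sketch below is ours and uses the Kronecker representation vec(U V U') = (U \otimes U) vec(V).

```python
import numpy as np

def ms_stable(A_cl, P, tol=1e-9):
    """Test MS-stability of the closed loop A_cl = (A_1 + B_1 K, ...).

    K is MS-stabilizing iff the spectral radius of the matrix
    representation of T_{A_K} is less than one (standard result, [6]).
    """
    S, n = len(A_cl), A_cl[0].shape[0]
    M = np.zeros((S * n * n, S * n * n))
    for i in range(S):
        for j in range(S):
            # block (i, j) of T: p_{ji} * kron(A_j, A_j), since
            # vec(A_j V_j A_j') = kron(A_j, A_j) vec(V_j)
            M[i*n*n:(i+1)*n*n, j*n*n:(j+1)*n*n] = \
                P[j, i] * np.kron(A_cl[j], A_cl[j])
    return max(abs(np.linalg.eigvals(M))) < 1 - tol
```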
The algorithm we propose in this section seeks the LRAC using the approximation via Y^{\varepsilon,T}(X)/T. Thus, it is important to show that this quantity converges to Z^{\varepsilon} as T \to \infty. This issue was studied and solved in [19, Corollary 2]; for convenience, we present an adaptation to the present context next.

Proposition 3. Assume the Markov chain \theta is ergodic and that K is MS-stabilizing. Then, there exist scalars \alpha, \beta \ge 0 such that

  \big| Z_K^{\varepsilon} - Y_K^{\varepsilon,T}(X)/T \big| \le (\alpha \|X\| + \beta)/T.   (11)

Next, the impact of the scalar \varepsilon on Z_K^{\varepsilon} is evaluated in terms of the original LRAC Z_K, showing that Z_K^{\varepsilon} converges to Z_K as \varepsilon tends to zero, provided the gain K is stabilizing.

Lemma 2. If K is MS-stabilizing, then there exists \xi \ge 0 such that Z_K \le Z_K^{\varepsilon} \le Z_K + \varepsilon \xi.

Proof: The statement that Z_K \le Z_K^{\varepsilon} is trivial. Let us denote the quantities \Gamma and \bar{k} in Definition 1 by \Gamma_{\Sigma} and \bar{k}_X, to emphasize the dependence on \Sigma and on X, respectively, and let \tilde{k} = \max(\bar{k}_X, \bar{k}_0). It is straightforward to check from (5) and the linearity of \mathcal{T} that

  \mathcal{T}_{A_K}^k(0) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\varepsilon \mathbf{I}) = \varepsilon \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\mathbf{I}) \le \varepsilon \Gamma_{\mathbf{I}}   (12)

for k \ge \bar{k}_0. One can check that

  Y_K^{\varepsilon,T}(X) - Y_K^T(X) = \sum_{k=0}^{T-1} \Big\langle \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\varepsilon \mathbf{I}),\; Q \Big\rangle + \sum_{k=0}^{T-1} \Big\langle \mathcal{T}_{A_K}^k(X) + \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\Sigma),\; \varepsilon \mathbf{I} \Big\rangle + \sum_{k=0}^{T-1} \Big\langle \sum_{l=0}^{k-1} \mathcal{T}_{A_K}^l(\varepsilon \mathbf{I}),\; \varepsilon \mathbf{I} \Big\rangle.   (13)

Now, evaluations similar to the one in (10) for the terms on the right-hand side of (13), with \Delta collecting the finitely many terms with k < \tilde{k}, provide

  (Y_K^{\varepsilon,T}(X) - Y_K^T(X))/T \le \Delta/T + ((T - \tilde{k})/T) \big( \varepsilon \langle \Gamma_{\Sigma}, \mathbf{I} \rangle + \varepsilon \langle \Gamma_{\mathbf{I}}, Q \rangle + \varepsilon^2 \langle \Gamma_{\mathbf{I}}, \mathbf{I} \rangle \big),

which leads to

  \limsup_{T \to \infty} (Y_K^{\varepsilon,T}(X) - Y_K^T(X))/T \le \varepsilon \xi,

where we set \xi = \langle \Gamma_{\Sigma}, \mathbf{I} \rangle + \langle \Gamma_{\mathbf{I}}, Q \rangle + \varepsilon \langle \Gamma_{\mathbf{I}}, \mathbf{I} \rangle.

In the algorithm for the LRAC, we use the coupled algebraic Riccati equation (15) (in the Appendix) to initialize the population, instead of the coupled recursive Riccati equations (14). The solution of (15) with positive weighting matrices is connected to stabilizing solutions of the LRAC problem, see e.g. [12], thus providing stabilizing initial gains L_i for (6). We include in the Appendix a simple method for solving the coupled algebraic Riccati equation, for ease of reference. A sketch of the resulting procedure is given below.
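The following Python sketch of ours reuses finite_horizon_cost and initial_population from the earlier fragments; the GA here is a deliberately simplified stand-in for Table I (truncation selection and Gaussian mutation only, omitting crossover and the gene representation (7)), and all names are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_fhc(cost, pop, n_gen=50, step=0.1):
    """Simplified stand-in for the GA of Table I: truncation selection
    plus Gaussian mutation directly on the gain entries."""
    q = len(pop)
    for _ in range(n_gen):
        elite = sorted(pop, key=cost)[: q // 2]          # keep the best half
        pop = elite + [K + step * rng.standard_normal(K.shape) for K in elite]
    return min(pop, key=cost)

def approx_lrac(A, B, P, pi0, Q, R, Sigma, x0, L, eps=1e-3,
                horizons=(10, 50, 100, 500), q=20):
    """Approximate the LRAC (P2): solve the eps-modified FHC problem for
    increasing horizons T; by Proposition 3, Y^{eps,T}/T approaches the
    LRAC as T grows."""
    n = Q.shape[0]
    Qe = Q + eps * np.eye(n)                       # Q + eps*I of the modified cost
    Se = [Si + eps * np.eye(n) for Si in Sigma]    # Sigma + eps*I
    K_best, Z_est = None, None
    for T in horizons:
        def avg_cost(K):
            return finite_horizon_cost(A, B, P, pi0, K, Qe, R, Se, x0, T) / T
        pop = initial_population(L, q - 1) + (
            [K_best] if K_best is not None else initial_population(L, 1))
        K_best = ga_fhc(avg_cost, pop)             # warm-start from previous T
        Z_est = avg_cost(K_best)
    return K_best, Z_est
```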
V. Numerical Example
Example 1 (Magnetic suspension system). We take into account the model of a magnetic suspension system, as presented in [5]. For ease of reference, we present the data of the system:

  A_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1750 & 0 & -34.07 \\ 0 & 0 & -0.0383 \end{bmatrix},   B_1 = \begin{bmatrix} 0 \\ 0 \\ 1.9231 \end{bmatrix}.

Here we consider failures in the control action u. Let the transition probability matrix P be given by

  P = \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix},

and assume that whenever \theta_k = 2 the system is not affected by the controller; accordingly, we set A_2 = A_1 and B_2 = 0. We also consider zero-mean additive noise with covariance matrix GG', with

  G = 10^{-6} \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}.
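As an aside from us (not part of the original example), the data above can be encoded directly as follows, which also makes the role of B_2 = 0 explicit:

```python
import numpy as np

A1 = np.array([[   0.0, 1.0,   0.0   ],
               [1750.0, 0.0, -34.07  ],
               [   0.0, 0.0,  -0.0383]])
B1 = np.array([[0.0], [0.0], [1.9231]])
A2, B2 = A1.copy(), np.zeros((3, 1))    # mode 2: control action is lost (B2 = 0)
A, B = [A1, A2], [B1, B2]

P = np.array([[0.9, 0.1],               # transition probability matrix
              [0.5, 0.5]])
G = 1e-6 * np.array([[1.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0]])
E = G @ G.T                             # noise covariance; Sigma_i = E
Sigma = [E, E]
Q, R = np.eye(3), np.array([[1.0]])     # weighting matrices (see below)
```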
The weighting matrices are R = 1 and Q = I. We consider the initial condition X = I, and we set \varepsilon = 10^{-3}. For the initialization of the algorithm, we employ the coupled Riccati equations (15) (solved using the method presented in the Appendix), which yields

  L_1 \approx [2797 \;\; 66.87 \;\; -53.71],   L_2 = [0 \;\; 0 \;\; 0].

We start with horizon length T = 10. Figure 1 shows the behaviour of the best average cost Y^{\varepsilon,T}(X)/T as a function of the number of iterations (generations) of the method of Table I. Figure 2 shows how the average cost evolves as T increases, illustrating the convergence to the LRAC, Z^{\varepsilon}(X). The obtained static gain K and the attained LRAC are given by

  K \approx [2324 \;\; 57.91 \;\; -44.99],   Z^{\varepsilon}(X) \approx 117569345.

[Fig. 1. Cost Y^{\varepsilon,T}(X)/T with T = 10 versus the number of iterations t of the GA.]

[Fig. 2. Y^{\varepsilon,T}(X)/T converging to the LRAC as the horizon T increases.]

VI. Conclusions

In this paper we have presented a genetic algorithm for the problems of finite horizon cost and of long run average cost, with static feedback controls, for discrete-time linear systems with Markov jump parameters and partial state observation. This is a relevant problem for applications, taking into account the simplicity of the controller (of the form u_k = K x_k) and of its implementation, and the fact that perfect observation of the jump variable \theta may be difficult in many practical situations.

In order to overcome some intrinsic difficulties of the problem, we have proposed a modification of the original cost, parameterized by a scalar \varepsilon, and we have shown that the modified cost Z_K^{\varepsilon} converges to the original one as \varepsilon tends to zero, provided K is MS-stabilizing. We have also shown that finiteness of Z_K^{\varepsilon} is strongly connected with MS-stabilizing gains K, in such a manner that, when the method converges, the obtained solution is stabilizing.

References
[1] E. Cinlar. Introduction to Stochastic Processes. Prentice-Hall,
1975.
[2] E. F. Costa and J. B. R. do Val. Weak detectability and the linear quadratic control problem of discrete-time Markov jump linear systems. International Journal of Control (special issue on Switched and Polytopic Linear Systems), 75:1282–1292, 2002.
[3] E. F. Costa, J. B. R. do Val, and M. D. Fragoso. A new approach to detectability of discrete-time infinite Markov jump
linear systems. SIAM Journal on Control and Optimization,
43:2132–2156, 2005.
[4] E. F. Costa, A. L. P. Manfrim, and J. B. R. do Val. Weak controllability and weak stabilizability concepts for linear systems with Markov jump parameters. In Proc. ACC'06 American Control Conference, Minneapolis, USA, 2006.
[5] E. F. Costa, V. A. Oliveira, and J. B. Vargas. Digital implementation of a magnetic suspension control system for laboratory experiments. IEEE Transactions on Education, 42:315–322, 1999.
[6] O. L. V. Costa, M. D. Fragoso, and R. P. Marques. Discrete-Time Markovian Jump Linear Systems. Springer-Verlag, New York, 2005.
[7] O. L. V. Costa and M.D. Fragoso. Discrete-time LQ-optimal
control problems for infinite Markov jump parameter systems.
IEEE Transactions on Automatic Control, AC-40:2076–2088,
1995.
[8] O. L. V. Costa and E. F. Tuesta. Finite horizon quadratic optimal control and a separation principle for Markovian jump linear systems. IEEE Transactions on Automatic Control, 48, 2003.
[9] C. E. de Souza and D. F. Coutinho. Robust stability of a
class of uncertain Markov jump nonlinear systems. IEEE
Transactions on Automatic Control, 51(11):1825–1831, 2006.
[10] C. E. de Souza and M. D. Fragoso. H∞ filtering for discrete-time linear systems with Markovian jumping parameters. International Journal of Robust and Nonlinear Control, 13:1299–1316, 2003.
[11] J. do Val and T. Basar. Receding horizon control of jump
linear systems and a macroeconomic policy problem. Journal
of Economic Dynamics & Control, 23:1099–1131, 1999.
[12] J. B. R. do Val and E. F. Costa. Stabilizability and positiveness of solutions of the jump linear quadratic problem and
the coupled algebraic Riccati equation. IEEE Transactions on
Automatic Control, 50:691–695, 2005.
[13] M. D. Fragoso and O. L. V. Costa. A unified approach for
stochastic and mean square stability of continuous-time linear
systems with Markovian jumping parameters and additive
disturbances. SIAM Journal on Control and Optimization,
44(4):1165–1191, 2005.
[14] Z. Gajic and R. Losada. Solution of the state-dependent
noise optimal control problem in terms of Lyapunov iterations.
Automatica, 35:951–954, 1999.
[15] G. P. H. Chou and C.-H. Chu. Genetic algorithms for communications network design: an empirical study of the factors that influence performance. IEEE Transactions on Evolutionary Computation, 5:612–619, June 2001.
[16] Y. Ji and H. J. Chizeck. Jump linear quadratic Gaussian
control: Steady-state solution and testable conditions. Control
Theory and Advanced Technology, 6(3):289–319, 1990.
[17] S. P. M. Choi, J. Liu, and S. P. Chan. A genetic agent-based negotiation system. Computer Networks, 37(2):195–204, 2001.
[18] A. B. Simões and E. Costa. Transposition versus crossover: An empirical study. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 612–619, Orlando, Florida, USA, July 1999.
[19] A. N. Vargas, E. F. Costa, and J. B. R. do Val. Bounds for the finite horizon cost of Markov jump linear systems with additive noise and convergence for the long run average cost. In Proc. IEEE Conference on Decision and Control, 2006.
Appendix

Consider the following coupled recursive Riccati equation, with initial condition X_i^0 = Q_i, i \in \mathcal{S},

  X_i^{k+1} = A_i' E_i^k A_i - (A_i' E_i^k B_i)(R + B_i' E_i^k B_i)^{-1}(B_i' E_i^k A_i) + Q_i,   (14)

where E_i^k = \sum_{j=1}^{S} p_{ij} X_j^k is the coupling term. Consider the associated coupled algebraic Riccati equation in the variables X_i = X_i' \ge 0, i \in \mathcal{S},

  X_i = A_i' E_i A_i - (A_i' E_i B_i)(R + B_i' E_i B_i)^{-1}(B_i' E_i A_i) + Q_i,   (15)

where E_i = \sum_{j=1}^{S} p_{ij} X_j. Equation (14) can be solved recursively. We present here, for ease of reference, the following algorithm [2], [4], which converges to the solution of (15) if and only if a solution exists.

Method for solving (15)
Step 1. Set \kappa_i \le 1 as the largest integer for which \sqrt{\kappa_i p_{ii}} A_i is stable.
Step 2. Set X^0 = (X_1^0, \ldots, X_S^0) \in \mathcal{M}^{n+}.
Step 3. For k = 1, 2, \ldots and i = 1, 2, \ldots, S, solve the standard algebraic Riccati equations

  -X_i^k + \kappa_i p_{ii} A_i' X_i^k A_i + A_i' \tilde{E}_i^k A_i - (\kappa_i p_{ii} A_i' X_i^k B_i + A_i' \tilde{E}_i^k B_i)(R + \kappa_i p_{ii} B_i' X_i^k B_i + B_i' \tilde{E}_i^k B_i)^{-1}(\kappa_i p_{ii} B_i' X_i^k A_i + B_i' \tilde{E}_i^k A_i) + Q_i = 0,   (16)

where

  \tilde{E}_i^k = \sum_{j \ne i} p_{ij} X_j^k + (1 - \kappa_i) p_{ii} X_i^k.
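As a closing illustration (ours, not part of the paper), the recursion (14) can be implemented directly. The gain expression L_i = -(R + B_i' E_i B_i)^{-1} B_i' E_i A_i below is the standard LQ formula associated with (14), which we assume is the intended gain for the initialization (6); the ARE-based method for (15) in Steps 1 to 3 is not implemented here, as it requires a standard ARE solver.

```python
import numpy as np

def recursive_riccati(A, B, P, Q, R, iters=200):
    """Iterate the coupled recursive Riccati equation (14) from X_i^0 = Q_i
    (here Q_i = Q for all i, as in Proposition 1) and return the matrices
    X_i together with the associated complete-observation gains L_i."""
    S = len(A)
    X = [Q.copy() for _ in range(S)]
    for _ in range(iters):
        # coupling terms E_i^k = sum_j p_{ij} X_j^k
        E = [sum(P[i, j] * X[j] for j in range(S)) for i in range(S)]
        X = [A[i].T @ E[i] @ A[i]
             - A[i].T @ E[i] @ B[i]
               @ np.linalg.inv(R + B[i].T @ E[i] @ B[i])
               @ B[i].T @ E[i] @ A[i]
             + Q
             for i in range(S)]
    # gains in the standard LQ form (assumed; note L_i = 0 when B_i = 0)
    E = [sum(P[i, j] * X[j] for j in range(S)) for i in range(S)]
    L = [-np.linalg.inv(R + B[i].T @ E[i] @ B[i]) @ B[i].T @ E[i] @ A[i]
         for i in range(S)]
    return X, L
```

Applied to the data of Example 1, this returns L_2 = [0 0 0] by construction, since B_2 = 0 makes the corresponding gain vanish, consistently with the values reported in Section V.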