Elfving, Gustav (1954). "On the Theory of Markoff Chains." Translated by Edwin L. Cox.

ON THE THEORY OF MARKOFF CHAINS

by
Gustav Elfving

Institute of Statistics
Mimeo. Series No. 103
May 1954

ZUR THEORIE DER MARKOFFSCHEN KETTEN
Von Gustav Elfving
ACTA SOCIETATIS SCIENTIARUM FENNICAE
Volume 2, No. 8, 1937, Finland

Translated by Edwin L. Cox under the supervision of Fred J. Allred, Assistant Professor of Modern Languages at North Carolina State College of the University of North Carolina, Raleigh, N. C., at the request of the Department of Experimental Statistics, North Carolina State College. (Edited jointly by the Translation Service and the Department of Experimental Statistics.)

May 17, 1954
ON THE THEORY OF MARKOFF CHAINS

1. By a Markoff chain is understood, as is known, a stochastic process with discrete time and a finite number of possible states, which is moreover homogeneous with regard to time.1/ The theory of this process was established by Markoff and Poincare. Later it was carried further, in particular, by Hostinsky, Hadamard and von Mises.2/ Romanovsky has given in Acta Mathematica, 1936, a systematic and in some measure exhaustive exposition of this theory (7). He relies essentially on the matrix calculus. The application of this technique is made possible by the fact that the probabilities appearing here satisfy certain systems of linear homogeneous difference equations with constant coefficients.
Nearest to the Markoff chains - in the narrow sense - within the system of the probability calculus stand the stochastic processes with continuous time t, homogeneous in time, with a finite number of possible states. Kolmogoroff has shown that the transition probabilities appearing here satisfy certain systems of linear homogeneous differential equations with constant coefficients. If, for a process of this kind, we limit ourselves to a sequence of equidistant time points, then we have a Markoff chain; it is related to the corresponding stationary process roughly as the whole-number powers are related to the exponential function. We denote this continuous process in the following, in short, as a stationary Markoff process.
1/ Compare also: Kolmogoroff (5).

2/ For the history and bibliography of the Markoff chain see B. Hostinsky: Les méthodes analytiques du calcul des probabilités, Gauthier-Villars, Paris 1931; J. Hadamard and M. Fréchet: Sur les probabilités discontinues des événements en chaîne, Z. angew. Math. Mech. XIII (1933).
The greater part of the work on Markoff chains is concerned with the final state of the system considered, that is, with the asymptotic behavior of the process. The following work will be concerned with two somewhat different questions.

First: The parallelism between the Markoff chains and the stationary Markoff processes poses the following question. If a Markoff chain is given, is there a stationary Markoff process whose transition probabilities coincide, for whole-number t, with those of the chain? Is this process uniquely determined? At first sight this is a rather elementary interpolation problem, but the demand that the result should again be a stochastic process brings in a characteristic feature to which we wish to draw attention. Actually it turns out that this problem can have no solution, exactly one, or several - but always only finitely many - solutions.
Secondly: Let the law of a Markoff chain or of a stationary Markoff process be given, i.e., let the transition probabilities be known. Now, if one fixes an absolute probability distribution for a certain time point t_0, then the distribution at each following time point is determined by it. But how do matters stand with the preceding points of time? Can we continue the process backwards unconditionally, or must a certain initial moment be attributed to it? It turns out that, in general, the latter is the case.1/

The content of the problem appears best under a statistical interpretation. Let us consider a very large set of systems that, independently of one another, are all subject to the considered process. The question then reads: if at the time moment t_0 a certain distribution of the systems over the possible states is observed, how long at most can the process have been running?

1/ Moreover compare also Kolmogoroff (4).
Our problem leads us therefore to the concept of the age of a distribution under a given chain law. The age is, for example, the analog of the quantity t in the expression for the normal distribution

    1/√(2πt) · e^{-(x-m)²/2t},

if this distribution is interpreted as the result of a diffusion process, homogeneous in t and x, with diffusion speed 1.
In this paper we shall handle only the analytically especially convenient case in which the matrix of the process possesses only distinct characteristic roots. The essentials of the results obtained below should be independent of this assumption and should extend without difficulty, by continuity considerations, to the case of equal roots.
2. Markoff Chains. We consider a system S which is capable of the states E_1, E_2, ..., E_n. We denote by p_{ik}^{(v)} the probability that S passes in v steps from E_i to E_k; in particular we set p_{ik}^{(1)} = p_{ik}. The transition probabilities obviously satisfy the conditions

(1)    p_{ik} ≥ 0,    Σ_k p_{ik} = 1.

On the other hand, we denote by q_1^{(t)}, ..., q_n^{(t)} the absolute probabilities of the states E_1, ..., E_n at time t; then it is equally true that

(2)    q_i^{(t)} ≥ 0,    Σ_i q_i^{(t)} = 1.

According to the multiplication and addition laws there exist, between the probabilities introduced above, the relations

(3)    p_{ik}^{(v)} = Σ_j p_{ij}^{(v-1)} p_{jk},    q_k^{(t+v)} = Σ_i q_i^{(t)} p_{ik}^{(v)}.
If we set, turning to matrix notation,

(4)    P = (p_{ik})_{i,k=1,...,n},    Q^{(t)} = (q_1^{(t)}, ..., q_n^{(t)}),

then the equations (3) assume the form

(3')    P^{(v)} = (p_{ik}^{(v)}) = P^v,    Q^{(t+v)} = Q^{(t)} P^v.
A matrix P with the properties (1) is called stochastic, following Romanovsky. In what follows we shall think of the matrix P as given once for all. The main task of the theory is then naturally the explicit expression of the numbers p_{ik}^{(v)} as functions of v. This problem has been solved in full generality in Romanovsky's article (7), on the basis of a formula of Perron. We shall handle it here only under the assumptions made at the end of the introduction.
Let

(5)    |λE - P| = |λ - p_{11}, -p_{12}, ..., -p_{1n}; -p_{21}, λ - p_{22}, ..., -p_{2n}; ...; -p_{n1}, -p_{n2}, ..., λ - p_{nn}| = 0

be the characteristic equation of P; its roots, which according to our assumption are all simple, are λ_1, ..., λ_n. We set

(5')    Λ = [λ_1, 0, ..., 0; 0, λ_2, ..., 0; ...; 0, 0, ..., λ_n],

the diagonal matrix of the roots.
Frobenius has shown, as is known,1/ that among the roots of a matrix with non-negative elements there is always a root ≥ 0 that is absolutely greater than or equal to each other root. In the case of the stochastic matrix P this root is equal to one (compare (7), page 149). It is consequently true that λ_1 = 1, |λ_i| ≤ 1 (i = 2, ..., n).

For certain matrices P we can easily specify a narrower range for the roots. Let the principal elements p_{11}, ..., p_{nn} all be positive, and let p be the smallest of these elements; then the matrix P - pE is non-negative, and its roots λ_i - p (i = 1, ..., n) are thus absolutely smaller than or equal to the greatest positive one among them, which obviously is equal to 1 - p. It is therefore true that

(6)    |λ_i - p| ≤ 1 - p,

that is, the λ_i belong to a circular disc which touches the unit circle from the interior at λ = 1.
According to the general theory of linear equation systems there are, for each root λ_i, two characteristic vectors, each determined up to a factor, which we indicate in matrix notation by

    U_{i.} = (u_{i1}, u_{i2}, ..., u_{in}),    V_{.i} = (v_{1i}, v_{2i}, ..., v_{ni})'   (a row and a column vector);

they have the properties

(7)    U_{i.} V_{.j} = 0   (i ≠ j),    U_{i.} V_{.i} ≠ 0   (i = 1, ..., n)

and

(8)    U_{i.} P = λ_i U_{i.},    P V_{.i} = λ_i V_{.i}    (i = 1, ..., n).

1/ Frobenius: Proceedings Akad. Wiss. Berlin, 1908, 1909, 1912.
The elements of V_{.1} are all equal, because of (1); we choose them equal to one. For the remaining V_{.i} we fix some normalization and standardize the U_{i.} accordingly, in such a way that

(9)    U_{i.} V_{.i} = 1    (i = 1, ..., n).

We now form the matrices

(10)    U = (u_{ik})_{i,k=1,...,n},    V = (v_{ik})_{i,k=1,...,n},

whose rows are the U_{i.} and whose columns are the V_{.i}; these are the rotation matrices of the chain. They are uniquely determined - with a fixed ordering of the λ_i - and by (7), (8) and (9) the equations

(11)    UV = E,    UPV = Λ

hold.
Conversely, if to a given chain P there exist a matrix Λ of the form (5') as well as two matrices U, V with the properties (11), then λ_1, ..., λ_n are the characteristic roots of P and U, V are the rotation matrices of the chain. Indeed,

    |λE - P| · |U| · |V| = |λ UV - UPV| = |λE - Λ| = (λ - λ_1) ··· (λ - λ_n);

further, from (11) follow UP = Λ U and PV = V Λ, from which the determining equations (8) for the single rows of U and columns of V are obtained.
Formulas (11) make possible the solution of our iteration problem. Raising the second formula to the v-th power and taking the first into account, we obtain

(12)    P^v = V Λ^v U,

and so, since v_{i1} = 1 and λ_1 = 1, we have

(13)    p_{ik}^{(v)} = u_{1k} + u_{2k} v_{i2} λ_2^v + ··· + u_{nk} v_{in} λ_n^v.

This equation contains all the known results about the asymptotic behavior of the chain. If all the characteristic roots with the exception of λ_1 are numerically smaller than one, then p_{ik}^{(v)} - and consequently q_k^{(v)} as well - tends for v → ∞ to the value p_k = u_{1k}, independently of i. If, on the other hand, there are negative or complex roots of absolute value one, then p_{ik}^{(v)} becomes asymptotically periodic. The formulas (13) hold, of course, only for simple roots; the presentation of Romanovsky (page 221) is, by our reasoning, incorrect on this point.
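The following short Python sketch (an addition to this translation, not part of the original; the numerical 3 x 3 chain is an arbitrary assumption) illustrates how the rotation matrices of (10)-(11) and the spectral formula (12)-(13) are used in practice:

    import numpy as np

    # Arbitrary stochastic matrix with simple roots, chosen only for illustration.
    P = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])

    lam, V = np.linalg.eig(P)                 # roots lambda_i, right characteristic vectors as columns
    k1 = int(np.argmin(np.abs(lam - 1)))      # locate the root lambda_1 = 1
    lam[[0, k1]] = lam[[k1, 0]]
    V[:, [0, k1]] = V[:, [k1, 0]]
    V[:, 0] = V[:, 0] / V[0, 0]               # choose V_{.1} = (1, ..., 1)', as in the text
    U = np.linalg.inv(V)                      # then UV = E and UPV = Lambda, i.e. (11)

    v = 5
    P_v = V @ np.diag(lam ** v) @ U           # formula (12): P^v = V Lambda^v U
    print(np.allclose(P_v, np.linalg.matrix_power(P, v)))   # True
    print(U[0].real)                          # u_{1k}: the limiting values p_k of formula (13)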
3. The Stationary Markoff Process. We consider again a system S which can be found in the states E_1, ..., E_n, but this time a continuous observation is possible. Let P_{ik}(s, t) be the conditional probability that, the state being E_i at time s, the system will be found in the state E_k at time t. Following Kolmogoroff ((3), page 428), we speak of a stationary stochastic process with a finite number of states when the matrix P(s, t) of the transition probabilities tends to E for t → s.

In the following we confine ourselves to a homogeneous stationary process, i.e. to one in which the P_{ik}(s, t) depend only on t - s; we write in short P_{ik}(t) in place of P_{ik}(s, s + t). These quantities obviously satisfy the characteristic probability conditions

(14)    P_{ik}(t) ≥ 0,    Σ_k P_{ik}(t) = 1,

and

(15)    P_{ik}(t_1 + t_2) = Σ_j P_{ij}(t_1) P_{jk}(t_2),

as well as the stationarity conditions

(16)    P_{ik}(t) → 0   (i ≠ k),    P_{ii}(t) → 1    (t → 0).
We shall see later that in the homogeneous case the existence of the limits

(17)    a_{ik} = P'_{ik}(0) = lim_{t→0} P_{ik}(t)/t   (i ≠ k),    a_{ii} = P'_{ii}(0) = lim_{t→0} (P_{ii}(t) - 1)/t

follows from (15) and (16). For the present we shall assume their existence. From (14) and (17) there then follow immediately the properties

(18)    a_{ik} ≥ 0   (i ≠ k),    a_{ii} ≤ 0,    Σ_k a_{ik} = 0.
The functions P_{ik}(t) satisfy, as Kolmogoroff has shown,1/ certain simple differential equations. We have actually, from (15),

    P_{ik}(t + Δt) = Σ_j P_{ij}(t) P_{jk}(Δt),

and from this follows, according to (17), for Δt → 0,

(19)    P'_{ik}(t) = Σ_j P_{ij}(t) a_{jk}    (k = 1, ..., n).

Thus we have the statement: the functions P_{ik}(t) satisfy the differential equations (19) with the initial conditions (16).

1/ In (3), page 428, Kolmogoroff treated the general function P_{ik}(s, t) and assumed the existence of ∂P_{ik}(s, t)/∂t for t > s, not for t = s.
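As an added illustration, not in the original: for n = 2 and the matrix

    A = [-a, a; b, -b]    (a, b ≥ 0, a + b > 0),

which clearly has the properties (18), the system (19) with the initial condition P(0) = E has the explicit solution

    P(t) = (1/(a+b)) [b + a e^{-(a+b)t},  a - a e^{-(a+b)t};  b - b e^{-(a+b)t},  a + b e^{-(a+b)t}],

which indeed satisfies (14), (15) and (16), and which already exhibits the exponential form (27') derived in the next section (here χ_1 = 0, χ_2 = -(a+b)).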
4. Solution of the Differential Equations. Conversely, let a matrix with the properties (18) be given,

(20)    A = (a_{ik})_{i,k=1,...,n}.

We ask for the corresponding solutions of the system (19) with the initial conditions (16) (i = 1, ..., n). The existence and uniqueness of the solutions are insured by the general existence theorems. Kolmogoroff (p. 431) shows that they possess exactly the properties (14), (15). We shall carry out the known explicit solution immediately below, but again only under the assumption that the matrix A possesses only simple roots.
We turn now entirely to matrix notation. With the notation

(21)    P(t) = (P_{ik}(t))_{i,k=1,...,n},

the differential equations (19) can be condensed to

(19')    dP(t)/dt = P(t) A,

while the initial conditions (16) take the form

(16')    P(0) = E.
The equation (19') will now undergo a transformation entirely analogous to the one used in Section 2. We write the roots of the matrix A in the form of the diagonal matrix

(22)    K = [χ_1, 0, ..., 0; 0, χ_2, ..., 0; ...; 0, 0, ..., χ_n].

Because of (18)_3, one of these roots, say the first, is χ_1 = 0. Out of the characteristic vectors we construct, exactly as in Section 2, normalized vectors and from them the rotation matrices U, V, with

(23)    UV = E,    UAV = K.

Here too we can take v_{11} = ··· = v_{n1} = 1. If we pre-multiply (19') by U and post-multiply by V, and set besides

(24)    R(t) = U P(t) V,

then it follows, since the elements r_{ik}(t) of R(t) are put together linearly from those of P(t), that

(25)    dR(t)/dt = R(t) K,

or, writing in full,
(25')    dr_{ik}(t)/dt = χ_k r_{ik}(t).

Because of (16'), R(0) = E, and (25') then gives as solutions

(26)    r_{kk}(t) = e^{χ_k t},    r_{ik}(t) = 0   (i ≠ k),

in short

(26')    R(t) = [1, 0, ..., 0; 0, e^{χ_2 t}, ..., 0; ...; 0, 0, ..., e^{χ_n t}].
Finally, with the help of (24), we return to the original variables and obtain

(27)    P(t) = V R(t) U,

or in detail, completely analogous to equation (13),

(27')    P_{ik}(t) = u_{1k} + u_{2k} v_{i2} e^{χ_2 t} + ··· + u_{nk} v_{in} e^{χ_n t}.

As will be shown in Section 7, if the real parts of χ_2, ..., χ_n are all negative, then it is true that

    P_{ik}(t) → u_{1k}    (t → ∞).
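As a numerical check of (27) (an added sketch; the matrix A below is an arbitrary assumption with the properties (18)), the explicit solution can be compared with the matrix exponential:

    import numpy as np
    from scipy.linalg import expm

    # Arbitrary matrix with the properties (18): non-negative off-diagonal, rows summing to zero.
    A = np.array([[-0.9,  0.6,  0.3],
                  [ 0.4, -0.7,  0.3],
                  [ 0.2,  0.5, -0.7]])

    chi, V = np.linalg.eig(A)                 # roots chi_i and right characteristic vectors
    k1 = int(np.argmin(np.abs(chi)))          # locate chi_1 = 0
    chi[[0, k1]] = chi[[k1, 0]]
    V[:, [0, k1]] = V[:, [k1, 0]]
    V[:, 0] = V[:, 0] / V[0, 0]               # V_{.1} = (1, ..., 1)'
    U = np.linalg.inv(V)                      # UV = E, UAV = K, i.e. (23)

    def P(t):
        """Transition matrix (27): P(t) = V R(t) U with R(t) = diag(e^{chi_i t}), cf. (26')."""
        return (V @ np.diag(np.exp(chi * t)) @ U).real

    t = 0.8
    print(np.allclose(P(t), expm(t * A)))     # True: (27) agrees with the matrix exponential
    print(np.allclose(P(t).sum(axis=1), 1))   # rows sum to one, property (14)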
5. Under application of the aids introduced above, let us now demonstrate the proposition formulated in Section 3: for a homogeneous process the existence of the derivative dP(t)/dt (t ≥ 0) follows from the functional equation

(15')    P(t_1 + t_2) = P(t_1) P(t_2)

and the condition (16').

For this purpose we first make the following remark. If the matrix P has the roots λ_1, ..., λ_n and the rotation matrices U, V, then P^v has the roots λ_1^v, ..., λ_n^v and the same rotation matrices as P. From UPV = Λ it follows, by raising to powers, that U P^v V = Λ^v, and from this, on the basis of the reflection made in Section 2 (p. 6), the correctness of our remark follows.

According to the assumption (16'), we can now take a time interval (0, h) inside which the roots of P(t) all lie in the right half plane. It does no harm, for ease of writing, to take h = 1. Let the rotation matrices of P(1) be U, V and its roots λ_1, ..., λ_n; further set χ_i = log λ_i, where the branch of the logarithm is fixed by the restriction |Im(χ_i)| < π/2.

On the basis of the remark made above, the matrix P(1/2), for example, has the rotation matrices U, V and the roots λ_i^{1/2} = e^{χ_i/2} (i = 1, ..., n); the values -e^{χ_i/2} do not come into consideration, since they lie in the left half plane. In the same way we show generally that the matrix P(1/2^v) has the rotation matrices U, V and the roots e^{χ_i/2^v}. If we now define R(t) in accordance with (26') and P*(t) by

    P*(t) = V R(t) U,

then P(t) and P*(t) agree with one another for t = 1, 1/2, 1/4, ...; because of (15') they then agree for all t values expressible as finite dyadic fractions, and finally, because of the continuity of P(t), it follows from (15') and (16') that they agree for all t. But the elements of P*(t) are evidently differentiable with respect to t; therefore those of P(t) are also, which was to be proved.
6. The Connection between the Chain and the Stationary Processes. If the differential law of a stationary process exists, that is to say, the matrix A is given, and one limits the observations to an equidistant series of time points, say t = 0, 1, 2, ..., then we have a Markoff chain whose matrix P follows from (27) for t = 1. This proposition shows that P has the characteristic roots

(28)    λ_1 = 1,    λ_2 = e^{χ_2}, ..., λ_n = e^{χ_n},

and the rotation matrices U, V.

We turn now to the opposite problem: if P is given, can we produce a matrix A with the properties (18) such that the transition matrix P(t) determined by A agrees with P for t = 1? In short, can the given chain be interpolated stochastically significantly?

From the statements above it follows immediately that the roots of the desired matrix A must have the values χ_v = log λ_v, while its rotation matrices must agree with those of P. If we set (the fixing of the branches of the logarithms may for the moment be left open)

(29)    L = [log λ_1, 0, ..., 0; 0, log λ_2, ..., 0; ...; 0, 0, ..., log λ_n],

then

(30)    A = V L U,

or, writing the desired matrix out in full,

(31)    a_{ik} = u_{2k} v_{i2} log λ_2 + ··· + u_{nk} v_{in} log λ_n    (i, k = 1, ..., n).

Let it be noted that complex roots λ - which always appear in conjugate pairs - bring no difficulties with them, since in (29) we merely take conjugate branches for the logarithms of conjugate roots. If λ_μ, λ_ν are two conjugate roots, then u_{μk}, u_{νk} and v_{iμ}, v_{iν} are also conjugate, and the corresponding terms in (31) come together to give a real term. On the other hand, a negative root λ brings a complex term into (31).
Further, the last condition (18)_3 is automatically fulfilled; for, because UV = E, we have

    Σ_k u_{jk} = Σ_k u_{jk} v_{k1} = 0    (j ≠ 1),

and from this it follows, by summation of (31) over k, that Σ_k a_{ik} = 0.

But even if the matrix (30) is real, and our interpolation problem thereby formally solved, it is not yet certain that the remaining conditions

(32)    a_{ik} ≥ 0    (i ≠ k)

are fulfilled. Actually, examples can be given which permit none, or one, or a finite number of stochastically significant solutions. That infinitely many solutions can never appear we will show in the next section. Anticipating this result, we can put together our results up to this point in the following way:

The Markoff chain determined by P can be interpolated stochastically significantly, and indeed by means of the "matrix of differential transition probabilities" (30), in case the off-diagonal elements of this matrix are non-negative. The problem can have either none, or one, or a finite number of solutions.
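A small Python sketch of this procedure (added here; the chain P is the same arbitrary example as before, and only principal branches of the logarithm are taken - whether this particular chain is interpolable is decided by the printed check):

    import numpy as np

    P = np.array([[0.7, 0.2, 0.1],     # arbitrary illustrative chain, as before
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])

    lam, V = np.linalg.eig(P)
    k1 = int(np.argmin(np.abs(lam - 1)))
    lam[[0, k1]] = lam[[k1, 0]]; V[:, [0, k1]] = V[:, [k1, 0]]
    V[:, 0] = V[:, 0] / V[0, 0]
    U = np.linalg.inv(V)

    L = np.diag(np.log(lam.astype(complex)))   # (29), principal branches
    A = (V @ L @ U).real                       # (30): candidate matrix of differential transition probabilities

    off_diag_ok = np.all(A - np.diag(np.diag(A)) >= -1e-12)   # conditions (32)
    rows_zero   = np.allclose(A.sum(axis=1), 0)               # condition (18)_3, automatic
    print(A)
    print(off_diag_ok and rows_zero)           # True exactly when this branch choice is stochastically significant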
7. Interpolation Criteria. It is of a certain interest to set up criteria that allow us to decide the interpolability of a Markoff chain without calculating the entire matrix (30).

To this end we first make a comment on the position of the roots χ_i of a matrix A with the properties (18). If k is the greatest of the numbers -a_{ii} (i = 1, ..., n), then the matrix A/k + E is stochastic, and its roots χ_i/k + 1 are hence, according to Frobenius (page 5), absolutely smaller than or equal to one. It is thus true that

(33)    |χ_i + k| ≤ k    (i = 1, ..., n),

that is, the roots χ_i lie inside a certain circle which touches the imaginary axis from the left at the origin.
Let now P be a given stochastic matrix and D its determinant. If A is a matrix with the properties (18) that solves the interpolation problem belonging to P, then we can indicate an upper limit for the radius of the above-mentioned circle. In the characteristic equation of A, namely, the coefficient of λ^{n-1} is -(a_{11} + ··· + a_{nn}); thus we have

(34)    a_{11} + ··· + a_{nn} = χ_1 + ··· + χ_n = log(λ_1 ··· λ_n) = log D.

Since the a_{ii} are all ≤ 0, we have k ≤ -(a_{11} + ··· + a_{nn}) = |log D|, the desired limit. The roots of A thus belong to the circular disc

(35)    |χ - log D| ≤ |log D|,

and indeed, because of χ_1 + ··· + χ_n = log D, to its right half.

We conclude first from this that only a finite number of branches of the logarithms log λ_v can give stochastically significant matrices A. As was asserted in Section 6, the interpolation problem is thus solvable only in a finite number of ways.

Further we obtain the criterion: in order that P be interpolable stochastically significantly, it is necessary that D > 0 and that the principal values of log λ_1, ..., log λ_n lie within the circle (35). In order that P be interpolable in an m-fold way it is necessary that |log D| ≥ (m - 1) π.

In particular the above criterion teaches us that the matrix P of an interpolable chain can have no vanishing roots, no roots with absolute value one (apart from λ_1 = 1), and finally no simple negative roots. For to a root χ = log λ = log|λ| + πi belonging to a negative λ there would necessarily correspond the conjugate root log|λ| - πi, and λ would then be a double root.
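These necessary conditions are cheap to test numerically; the following Python sketch (added here; simple roots are assumed, and the chain P is again the arbitrary example used above) merely translates the criteria of this section:

    import numpy as np

    def necessary_criteria(P, tol=1e-9):
        """Necessary conditions of Section 7 for stochastic interpolability (simple roots assumed)."""
        lam = np.linalg.eigvals(P)
        D = np.linalg.det(P)
        if D <= tol:                                   # D > 0 is necessary
            return False
        if np.any(np.abs(lam) < tol):                  # no vanishing roots
            return False
        if np.sum(np.abs(np.abs(lam) - 1) < tol) > 1:  # no roots of modulus one besides lambda_1 = 1
            return False
        if np.any((np.abs(lam.imag) < tol) & (lam.real < -tol)):   # no (simple) negative roots
            return False
        chi = np.log(lam.astype(complex))              # principal values
        if np.any(np.abs(chi - np.log(D)) > np.abs(np.log(D)) + tol):   # circle (35)
            return False
        return True

    P = np.array([[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
    print(necessary_criteria(P))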
8. An Example. We consider the particular Markoff chain whose law is given by the cyclic matrix

(36)    P = [p, p_1, p_2, ..., p_{n-1}; p_{n-1}, p, p_1, ..., p_{n-2}; ...; p_1, p_2, p_3, ..., p].

Let ε be an n-th root of unity; then, as one easily confirms,

    p + p_1 ε + ··· + p_{n-1} ε^{n-1}

is a characteristic root of P; in general,

(37)    λ_v = p + Σ_{μ=1}^{n-1} p_μ e^{2πi(v-1)μ/n}    (v = 1, ..., n).

The characteristic vectors belonging to the roots (37) are determined from the equations (8); these are evidently satisfied by
    v_1 = 1, v_2 = ε, ..., v_n = ε^{n-1};    u_1 = 1/n, u_2 = (1/n) ε^{-1}, ..., u_n = (1/n) ε^{-(n-1)}.

With ε = e^{2πi/n} we can therefore choose

    v_{ik} = ε^{(i-1)(k-1)},    u_{ik} = (1/n) ε^{(i-1)(1-k)},

and the coefficients in (13) and (27') become

(38)    v_{ij} u_{jk} = (1/n) ε^{(i-k)(j-1)}.

Hence we learn that the interpolation matrix of P is likewise cyclic.
To obtain a simple example of the result of the previous section, we consider the case n = 3. By (37) we have the roots λ_1 = 1 and a conjugate pair λ_2, λ_3. We set for the sake of brevity

    λ_2 = ρ e^{iθ},    λ_3 = ρ e^{-iθ},

and find, on applying the formulas (31) and (38), that if the elements of A are denoted in the same way as those of P,

    a = (1/3) log ρ²,    a_1 = -(1/3) log ρ - θ/√3,    a_2 = -(1/3) log ρ + θ/√3.

Here a is evidently negative. In order that a_1, a_2 be non-negative it is necessary and sufficient (we suppose for example θ ≥ 0) that

    θ ≤ (1/√3) log(1/ρ).

We obtain two significant solutions of the interpolation problem as soon as the argument value θ - 2π also yields a_1, a_2 non-negative, that is, if

    2π - θ ≤ (1/√3) log(1/ρ);

one obtains three solutions if

    θ + 2π ≤ (1/√3) log(1/ρ),

and so on.
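This example is easy to reproduce numerically; in the added sketch below the particular row (p, p_1, p_2) is an assumed choice, and the interpolating matrix is computed directly from (30) with principal branches:

    import numpy as np
    from scipy.linalg import expm

    # Cyclic 3x3 chain of the form (36); the row (p, p_1, p_2) is an assumed numerical example.
    p, p1, p2 = 0.6, 0.3, 0.1
    P = np.array([[p, p1, p2], [p2, p, p1], [p1, p2, p]])

    lam, V = np.linalg.eig(P)
    k1 = int(np.argmin(np.abs(lam - 1)))
    lam[[0, k1]] = lam[[k1, 0]]; V[:, [0, k1]] = V[:, [k1, 0]]
    V[:, 0] = V[:, 0] / V[0, 0]
    U = np.linalg.inv(V)

    A = (V @ np.diag(np.log(lam.astype(complex))) @ U).real   # (30) with principal branches
    print(np.round(A, 6))                     # A is again cyclic, as stated in the text
    print(np.allclose(expm(A), P))            # the interpolating process reproduces P at t = 1
    print(np.all(A[~np.eye(3, dtype=bool)] >= 0))   # True for these values: one principal-branch solution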
9. The Age of a Distribution. We turn now to the second of the problems proposed in the introduction. Its treatment proceeds in a wholly parallel way for chains and for stationary processes; in the following we avail ourselves in both cases of the notation p_{ik}^{(t)} for the transition probabilities.

Let an absolute probability distribution be prescribed, say for the moment t = 0:

(39)    Q^{(0)} = (q_1^{(0)}, ..., q_n^{(0)});    q_i^{(0)} ≥ 0,    Σ_i q_i^{(0)} = 1.
We ask first: can we determine, for a given earlier time point -t (t ≥ 0), a distribution Q^{(-t)} = (q_1^{(-t)}, ..., q_n^{(-t)}) with the properties

(40a)    q_i^{(-t)} ≥ 0

and

(40b)    Σ_i q_i^{(-t)} = 1,

such that Q^{(0)} results from Q^{(-t)} under our process, that is,

(41)    q_k^{(0)} = q_1^{(-t)} p_{1k}^{(t)} + ··· + q_n^{(-t)} p_{nk}^{(t)};

in short, is there a stochastically significant solution of the matrix equation

(41')    Q^{(-t)} P^{(t)} = Q^{(0)}?

In formal algebra the question is quite simple to answer; only the inversion of the matrix P^{(t)} is involved. This operation is impossible only when the determinant |P^{(t)}| = 0, that is, whenever the matrix has a vanishing root. This is, as we know, possible only with chains; we put this case aside. In all other cases (P^{(t)})^{-1} is determined in a unique way. Applying the formula found earlier for positive t,

(27)    P^{(t)} = V R(t) U,

(P^{(t)})^{-1} can easily be expressed explicitly; it becomes

    (P^{(t)})^{-1} = U^{-1} (R(t))^{-1} V^{-1} = V (R(t))^{-1} U,

and hence, since evidently (R(t))^{-1} = R(-t),

(42)    (P^{(t)})^{-1} = V R(-t) U = P^{(-t)},

whereby P^{(t)} for t < 0 is defined by (27). Generally, one can by virtue of (27) calculate with the P^{(t)} exactly as with powers.
The solution of the system (41') will hence be

(43)    Q^{(-t)} = Q^{(0)} P^{(-t)},

which, written in full, becomes

(43')    q_i^{(-t)} = q_1^{(0)} p_{1i}^{(-t)} + ··· + q_n^{(0)} p_{ni}^{(-t)}    (i = 1, ..., n).

This solution automatically satisfies the condition (40b); this follows from (41) through summation over k.

But it is not said that (43) also satisfies the condition (40a), i.e., that it describes an actual, stochastically significant distribution. We show easily that if this holds for an earlier time point, it holds also for a later one. Let, say, t_1 < t_2 ≤ 0 and let Q^{(t_1)} be a proper distribution; then it follows that

(44)    Q^{(t_2)} = Q^{(t_1)} P^{(t_2 - t_1)},

and the elements of Q^{(t_2)} must then be ≥ 0.
Hence we conclude that there is a well-determined time point -τ (0 ≤ τ ≤ ∞) with the following properties: for t > -τ, Q^{(t)} is a distribution; for t < -τ it is not. We shall call τ the age of the distribution Q^{(0)}. From what has been said it follows directly that τ + t is the age of Q^{(t)}. In the case of a stationary process, -τ is evidently the zero nearest the origin of any of the probabilities q_i^{(-t)} (t ≥ 0) defined by (43).
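As an added numerical sketch (the chain law and the prescribed distribution below are assumptions), the age of a distribution under a chain law can be found by stepping backwards with (43) until some component turns negative:

    import numpy as np

    P = np.array([[0.7, 0.2, 0.1],            # arbitrary chain law, as before
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])
    P_inv = np.linalg.inv(P)                  # P^{(-1)}, cf. (42); requires |P| != 0

    Q0 = np.array([0.5, 0.3, 0.2])            # prescribed distribution (39), an assumption

    def age_in_steps(Q0, P_inv, max_steps=1000):
        """Largest whole number of steps the distribution can be continued backwards, cf. (43)."""
        Q, age = Q0.copy(), 0
        for _ in range(max_steps):
            Q = Q @ P_inv                     # Q^{(-t-1)} = Q^{(-t)} P^{(-1)}
            if np.any(Q < 0):
                return age                    # the next step would leave the probability simplex
            age += 1
        return np.inf                         # practically unlimited: Q0 is (close to) stationary

    print(age_in_steps(Q0, P_inv))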
10. Finally we wish to give necessary and sufficient conditions for τ = ∞. For this purpose we call attention to the idea of a stationary distribution. A distribution Q = (q_1, ..., q_n) is called stationary if it satisfies the equation Q P^{(t)} = Q for all t. Accordingly Q is, in the case of a chain, a solution of the system

    q_1 p_{1i} + ··· + q_n p_{ni} = q_i    (i = 1, ..., n),

while in the case of a stationary process this distribution satisfies the equations

    q_1 a_{1i} + ··· + q_n a_{ni} = 0    (i = 1, ..., n),

as one can show easily. The solutions of these systems were noted in Sections 2 and 4: they pertain, for the roots λ_1 = 1 or χ_1 = 0 respectively, to the characteristic vector U_{1.}, and their components are in both cases the numbers u_{11}, ..., u_{1n}. Moreover, on account of Σ_i u_{1i} = Σ_i u_{1i} v_{i1} = 1, they are already normalized in such a way that they are positive and have the sum one. There is thus a single, uniquely determined stationary distribution

(45)    Q = (u_{11}, ..., u_{1n}).
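In the numerical setting used above (an added illustration, not in the original), (45) amounts to taking the suitably normalized left characteristic vector for the root one:

    import numpy as np

    P = np.array([[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])   # illustrative chain

    w, Ul = np.linalg.eig(P.T)                      # left characteristic vectors of P = right ones of P'
    q = np.real(Ul[:, np.argmin(np.abs(w - 1))])
    q = q / q.sum()                                 # normalization q_1 + ... + q_n = 1, cf. (45)

    print(q)                                        # the stationary distribution Q = (u_11, ..., u_1n)
    print(np.allclose(q @ P, q))                    # Q P = Q: the defining property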
We need distinguish only two cases: either (1°) all |λ_v| < 1 (v ≥ 2), or (2°) there are, besides λ_1 = 1, other roots with absolute value one. The second case can, as we know, happen only with chains, not with stationary processes; the matrix P is then called imprimitive.

(1°) In this case it follows directly from the equations (13) or (27') that

(46)    p_{ik}^{(t)} → q_k    (t → ∞),

independently of i. We conclude from this that the age of the distribution Q^{(0)} can be infinite when and only when Q^{(0)} = Q.
Indeed, if -τ < -t < 0, then it is true that

(47)    q_k^{(0)} = q_1^{(-t)} p_{1k}^{(t)} + ··· + q_n^{(-t)} p_{nk}^{(t)}.

If now τ = ∞, then t can be chosen so large that the right side comes arbitrarily near the value q_k = lim p_{ik}^{(t)}; it is thus necessary that q_k^{(0)} = q_k. Obviously this condition is also sufficient: in the case Q^{(0)} = Q, (41') is always satisfied by Q^{(-t)} = Q.
(2°) Now consider a Markoff chain with an imprimitive matrix P. For such a matrix the following can be proved (Romanovsky (7) p. 261; compare also von Mises (6) p. 545): there is a whole number r dividing n - let n = rd - such that P^r is totally decomposed, that is, P^r is of the form

    P^r = [M_1, 0, ..., 0; 0, M_2, ..., 0; ...; 0, 0, ..., M_r]

with appropriate enumeration of the states, where M_1, ..., M_r are primitive quadratic partial matrices of order d.

We limit ourselves to a look at the sequence of time points 0, -r, -2r, .... The absolute distribution at one of these time points, say t = -vr, is determined from

    Q^{(-vr)} (P^r)^v = Q^{(0)}.

This system obviously falls apart into r independent systems of d equations each. We look at the first of them; if we denote the elements of M_1^v by m_{ik}^{(v)}, this system takes the form

(48)    m_{1k}^{(v)} q_1^{(-vr)} + ··· + m_{dk}^{(v)} q_d^{(-vr)} = q_k^{(0)}    (k = 1, ..., d).
We reduce it to a form completely analogous to (47), in which we divide by the sum s = q_1^{(0)} + ··· + q_d^{(0)} and set q_i^{(t)}/s = q'_i^{(t)}. We thus obtain the system

(48')    m_{1k}^{(v)} q'_1^{(-vr)} + ··· + m_{dk}^{(v)} q'_d^{(-vr)} = q'_k^{(0)}    (k = 1, ..., d).

We see, exactly as in (1°), that this system permits stochastically significant solutions for arbitrarily large v only if the distribution (q'_1^{(0)}, ..., q'_d^{(0)}) is the stationary distribution belonging to the matrix M_1. If the solution of the system UP = U is, as earlier, U_{1.} = (u_{11}, ..., u_{1n}) (normalized by the relation u_{11} + ··· + u_{1n} = 1), then this vector also satisfies the system U P^{vr} = U; and if this is broken down into the r independent systems, then (u_{11}, ..., u_{1d}) is a solution of the system U M_1 = U. The components of (q_1^{(0)}, ..., q_d^{(0)}) must therefore be proportional to the numbers u_{11}, ..., u_{1d}. If this is the case, then (48') is indeed solvable, with q'_i^{(-vr)} = q'_i^{(0)}, for all v.

These reflections can be carried through for each of the index groups

(49)    1, ..., d;  d+1, ..., 2d;  ...;  (r-1)d + 1, ..., n.

The distribution Q^{(0)} is then and only then arbitrarily far "backwards continuable" when the q_i^{(0)}, inside each group (49), are proportional to the corresponding u_{1i}.

We sum up.
Let there be given the law P of a Markoff process; we assume that P has only simple, non-vanishing characteristic roots. Then there belongs to each given distribution Q^{(0)} a number τ (0 ≤ τ ≤ ∞), the age of the distribution, with the following properties: to each t < τ there is a distribution Q^{(-t)} which, subject to the given process, produces after a lapse of time t the distribution Q^{(0)}; for t > τ there is no such distribution.

In order that τ = ∞ it is necessary and sufficient:

(1°) with stationary processes and with chains having primitive matrices, that Q^{(0)} coincide with the stationary distribution of the process;

(2°) with chains having imprimitive matrices, that the q_i^{(0)} be proportional, inside each of the cyclic groups of states, to the corresponding probabilities of the stationary distribution.
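As a final added sketch (the particular 4-state chain is an assumption), condition (2°) can be made concrete: for a chain of period r = 2 the matrix P² decomposes into blocks, and a distribution proportional to the stationary probabilities within each cyclic group reproduces itself under backward continuation:

    import numpy as np

    # A 4-state chain of period 2: the groups {1,2} and {3,4} alternate (an assumed example).
    P = np.array([[0.0, 0.0, 0.7, 0.3],
                  [0.0, 0.0, 0.4, 0.6],
                  [0.5, 0.5, 0.0, 0.0],
                  [0.2, 0.8, 0.0, 0.0]])

    P2 = P @ P
    print(np.allclose(P2[:2, 2:], 0) and np.allclose(P2[2:, :2], 0))   # True: P^2 is totally decomposed

    w, Ul = np.linalg.eig(P.T)                                         # stationary distribution of P
    q = np.real(Ul[:, np.argmin(np.abs(w - 1))]); q = q / q.sum()

    # A distribution proportional to q inside each cyclic group, with group masses 0.8 and 0.2:
    Q0 = np.concatenate([0.8 * q[:2] / q[:2].sum(), 0.2 * q[2:] / q[2:].sum()])

    Qback = Q0 @ np.linalg.inv(P2)            # one step of length r = 2 backwards, cf. (43)
    print(np.allclose(Qback, Q0))             # True: Q0 reproduces itself, hence tau = infinity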
Literature

(1) J. Hadamard and M. Fréchet: Sur les probabilités discontinues des événements en chaîne. Z. angew. Math. Mech. XIII (1933).

(2) B. Hostinsky: Méthodes générales du calcul des probabilités. Gauthier-Villars, Paris, 1931.

(3) A. Kolmogoroff: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Math. Ann. 104 (1931).

(4) A. Kolmogoroff: Zur Umkehrbarkeit der Naturgesetze. Math. Ann. 104 (1936).

(5) A. Kolmogoroff: Anfangsgründe der Theorie der Markoffschen Ketten mit unendlich vielen möglichen Zuständen. Mat. Sbornik, 1936.

(6) R. v. Mises: Wahrscheinlichkeitsrechnung, etc. Leipzig and Vienna, 1931.

(7) V. Romanovsky: Recherches sur les chaînes de Markoff. Acta Math. 66, 1936.