Asymptotic Properties of Markoff Transition Probabilities

Author(s): J. L. Doob
Source: Transactions of the American Mathematical Society, Vol. 63, No. 3 (May, 1948), pp.
393-421
Published by: American Mathematical Society
Stable URL: http://www.jstor.org/stable/1990566
ASYMPTOTIC PROPERTIES OF MARKOFF
TRANSITION PROBABILITIES
BY
J. L. DOOB
Let $X$ be any space of abstract elements, and let $\mathcal{F}_X$ be any Borel field of $X$ sets, including $X$ itself. Let $P^{(t)}(x, A)$ be a function of $x \in X$, $A \in \mathcal{F}_X$, and the positive real variable $t$, with the following properties:
(a) $0 \le P^{(t)}(x, A) \le P^{(t)}(x, X) = 1$;
(b) for fixed $x$ and $t$, $P^{(t)}(x, A)$ is a completely additive function of sets $A$; for fixed $A$ and $t$, $P^{(t)}(x, A)$ is measurable in $x$ with respect to the field $\mathcal{F}_X$;
(c) with a self-explanatory notation for integration with respect to a set function,
$$P^{(s+t)}(x, A) = \int P^{(s)}(y, A)\, P^{(t)}(x, dG_y), \qquad s, t > 0\ (^{1}). \tag{0.1}$$
A function with these properties can be considered the transition probability function of a Markoff process; it is the probability of a transition into $A$ from initial position $x$, in time $t$. The Markoff process is determined for $t > t_0$ if a distribution of $x$ at time $t = t_0$ is assigned. The process is called stationary, or temporally homogeneous, if the initial distribution reproduces itself, that is, if the set function $\Phi(A)$ defining the initial distribution satisfies the equation
$$\Phi(A) = \int P^{(t)}(x, A)\, \Phi(dG_x). \tag{0.2}$$
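For a finite state space, conditions (0.1) and (0.2) reduce to matrix identities, which may make them easier to check by hand. The following is a minimal sketch under that assumption; the 3-state matrix `P1`, the vector `PHI`, and all function names are hypothetical illustrations and do not come from the paper.

```python
# Finite-state illustration of (0.1) and (0.2).
# Assumption: X = {0, 1, 2}; P1[x][y] plays the role of P^(1)(x, {y}).
P1 = [[0.5, 0.5, 0.0],
      [0.25, 0.5, 0.25],
      [0.0, 0.5, 0.5]]

# Candidate self-reproducing distribution (chosen by hand for this example).
PHI = [0.25, 0.5, 0.25]


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(P, t):
    """t-step transition matrix P^(t) for an integer t >= 1."""
    result = P
    for _ in range(t - 1):
        result = matmul(result, P)
    return result


def chapman_kolmogorov_gap(P, s, t):
    """Largest entrywise gap between P^(s+t) and the integral in (0.1)."""
    lhs = power(P, s + t)
    rhs = matmul(power(P, t), power(P, s))  # sum over y of P^(t)(x,y) P^(s)(y,A)
    n = len(P)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(n) for j in range(n))


def push_forward(phi, P):
    """Right-hand side of (0.2): sum over x of phi(x) P(x, {y})."""
    n = len(P)
    return [sum(phi[x] * P[x][y] for x in range(n)) for y in range(n)]
```

In this sketch `chapman_kolmogorov_gap(P1, 2, 3)` is zero up to rounding, and `push_forward(PHI, P1)` returns `PHI` again, which is the finite analogue of a self-reproducing distribution.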
The probability formalization of a stochastic process is now well known. In the present case the initial distribution and the transition probabilities are used to define a probability measure in the space of all functions $x(t)$, where $t > t_0$, and $x(t)$ is a function which takes on values in $X$. For example, to the set of functions satisfying the condition $x(s) \in A$ ($s > t_0$ is fixed) is assigned as measure
$$P\{x(s) \in A\} = \int P^{(s)}(x, A)\, \Phi(dG_x), \tag{0.3}$$
where $\Phi$ defines the initial distribution. If $\Phi$ is self-reproducing, the definition in (0.3) and the corresponding definitions for more complicated sets become independent of $t_0$, and the probability measure is then defined on the space of all functions $x(t)$, $-\infty < t < \infty$; in this case $x(s)$, for each $s$, has the $\Phi$ distribution. The transformation $T_s$ taking $x(t)$ into $x(t+s)$ is then a measure preserving transformation in function space, and the theory of measure preserving transformations is applicable.
Presented to the Society, November 25, 1945, under the title Markoff chains-denumerable case; received by the editors February 27, 1947.
(1) In the following, the domain of integration will not be indicated explicitly when it is the whole space.
In the following, it will be supposed that $t$ runs through integral values only. Function space then reduces to sequence space. The results obtained will however be applicable, with the obvious changes, when $t$ runs through all real numbers; for example the average $(1/n)\sum_{j=1}^{n} x(j)$ is replaced by the average $(1/t)\int_0^t x(s)\,ds$.
If certain restrictions discussed below are imposed on the given transition probabilities, it follows that
$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} P^{(s)}(x, A) = Q(x, A) \tag{0.4}$$
exists for all $x$. Moreover:
(a) This limit is uniform in $x$ and $A$.
(b) For fixed $x$, $Q(x, A)$ defines a self-reproducing probability distribution.
(c) $X$ can be expressed as a sum of a finite number of disjunct sets, $N, A_1, A_2, \cdots$, where
$$P^{(t)}(x, A_j) = \begin{cases} 1, & x \in A_j, \\ 0, & x \in A_k,\ k \ne j; \end{cases} \qquad Q(x, A) = Q_j(A), \quad x \in A_j; \qquad Q(x, N) = 0. \tag{0.5}$$
The sets $A_1, A_2, \cdots$ are called final sets. If a simple type of cyclic transition is excluded, the Cesàro limit in (0.4) becomes an ordinary limit (2).
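The role of the Cesàro average in (0.4) can be seen in the simplest cyclic case. The sketch below assumes a hypothetical two-state chain that alternates deterministically between its states: the $t$-step probabilities oscillate and have no ordinary limit, while their running averages settle. The matrix `P_CYCLIC` and the function names are illustrative assumptions, not taken from the paper.

```python
# Cesaro average (1/t) * sum_{s=1..t} P^(s) for a purely cyclic two-state chain.
# Assumption: state 0 always moves to state 1 and vice versa.
P_CYCLIC = [[0.0, 1.0],
            [1.0, 0.0]]


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(P, t):
    """t-step transition matrix for an integer t >= 1."""
    result = P
    for _ in range(t - 1):
        result = matmul(result, P)
    return result


def cesaro_average(P, t):
    """Entrywise average of P^(1), ..., P^(t), as in (0.4)."""
    n = len(P)
    total = [[0.0] * n for _ in range(n)]
    current = [row[:] for row in P]  # starts at P^(1)
    for _ in range(t):
        for i in range(n):
            for j in range(n):
                total[i][j] += current[i][j]
        current = matmul(current, P)
    return [[value / t for value in row] for row in total]
```

Here `power(P_CYCLIC, t)[0][0]` alternates between 0 and 1 according to the parity of `t`, whereas `cesaro_average(P_CYCLIC, t)[0][0]` equals one half for every even `t`. This is the kind of cyclic transition that must be excluded before the Cesàro limit can be replaced by an ordinary one.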
(2) See [7] for a more complete statement and for details of the proof. Numbers in brackets refer to the bibliography at the end of the paper.
In order to explain the significance of the conditions imposed on the transition probabilities to obtain these results, a familiar example will be recalled. Let $X$ be any abstract space, and let $\mathcal{F}_X$ be a Borel field of $X$ sets (including $X$ itself). Let $Tx$ be any one-to-one transformation of the points of $X$ which together with its inverse takes $\mathcal{F}_X$ sets into $\mathcal{F}_X$ sets. Define $P^{(t)}(x, A)$ as follows:
$$P^{(t)}(x, A) = \begin{cases} 1 & \text{if } T^t x \in A, \\ 0 & \text{if } T^t x \in X - A. \end{cases} \tag{0.6}$$
This function satisfies the three conditions (a), (b), (c) stated above, and therefore defines a Markoff process (in conjunction with an initial distribution). The average in (0.4) becomes the average number of times $T^t x \in A$ as $t$ increases. It is thus hopeless to expect that without further hypotheses on the transition probabilities averages like that in (0.4) can behave in any more regular fashion than the corresponding averages for point transformations.
If some probability measure, that is, a measure with value 1 on the whole space, has been defined on the sets of $\mathcal{F}_X$, and if the transformation $T$ is measure preserving, the measure of $X$ sets defines a self-reproducing distribution. Even the hypothesis that a Markoff process has a self-reproducing distribution can however at best make the averages in (0.4) behave like the corresponding averages for measure preserving point transformations. The behavior detailed above, beginning with the existence of the limit in (0.4) for all $x$, is actually far more regular than the corresponding behavior for measure preserving point transformations. Now the transition probability $P^{(t)}(x, A)$ defined in (0.6) is for fixed $x$ and $t$ a highly singular set function; it defines a distribution concentrated at a point. The restrictions imposed on the transition probabilities of Markoff processes by various authors can be considered as smoothing restrictions imposed to make the representation of $P^{(t)}(x, A)$ in terms of a point transformation impossible.
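The point-transformation kernel (0.6) is easy to write down concretely. The following sketch assumes a hypothetical space of four points with the cyclic shift $Tx = x + 1 \pmod 4$; the constant `N_POINTS` and the function names are illustrative choices only. The average in (0.4) then becomes the relative frequency with which the orbit of $x$ visits $A$.

```python
# Transition probabilities (0.6) induced by a one-to-one point transformation.
# Assumption: X = {0, 1, 2, 3} and T x = (x + 1) mod 4, a cyclic shift.
N_POINTS = 4


def t_power(x, t):
    """T^t x for the cyclic shift on N_POINTS points."""
    return (x + t) % N_POINTS


def transition(t, x, A):
    """P^(t)(x, A) as in (0.6): a point mass sitting at T^t x."""
    return 1.0 if t_power(x, t) in A else 0.0


def visit_frequency(x, A, t):
    """The average in (0.4): fraction of s in 1..t with T^s x in A."""
    return sum(transition(s, x, A) for s in range(1, t + 1)) / t


def chapman_kolmogorov_holds(s, t, x, A):
    """Check (0.1) by summing over the intermediate point y."""
    lhs = transition(s + t, x, A)
    rhs = sum(transition(s, y, A) * transition(t, x, {y})
              for y in range(N_POINTS))
    return lhs == rhs
```

For fixed `x` and `t` the kernel takes only the values 0 and 1, which is the singular behavior the smoothing restrictions are meant to rule out; `visit_frequency(1, {0}, 400)` counts how often the orbit of the point 1 lands on 0.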
If there is a self-reproducing distribution $\Phi$, it will be shown that under a mild restriction on the field $\mathcal{F}_X$, $X$ can be expressed as the sum of at most a continuum of sets $A_\mu$ and a remainder of $\Phi$ measure 0, such that the $A_\mu$'s have essentially the properties (0.5). The present paper can be considered as an examination of the strengthening of this decomposition theorem which is made possible by the imposition of weak additional restrictions on the transition probability functions.
The theorems of the present paper will be stated, for the most part, in terms of stationary Markoff processes. Note that this is the same as stating theorems on Markoff transition probabilities for which it is known that there exist self-reproducing distributions. Although it is not customary for writers on Markoff processes to presuppose the existence of a self-reproducing distribution, the existence is implied in their hypotheses, since each $Q_j(A)$ in (0.5) defines such a distribution.
The weakest conditions imposed on the transition probabilities to insure the truth of (0.4) and the subsequent paragraph are those of Doeblin and of Kryloff and Bogolioùboff.
Condition of Doeblin. The space $X$ is the interval $(0, 1)$. The field $\mathcal{F}_X$ is the field of Borel subsets of this interval. It is supposed that there is an $s$ and a positive $\epsilon$ such that $P^{(s)}(x, A) \le 1 - \epsilon$ if $A$ has Lebesgue measure less than $\epsilon$.
Condition of Kryloff and Bogolioùboff. The space $X$ and the field $\mathcal{F}_X$ are as in Doeblin's condition. Let $P^{(s)}$ be the linear operator defined by the kernel $P^{(s)}(x, A)$ taking the Banach space $\mathfrak{M}$ of completely additive set functions defined on $\mathcal{F}_X$ into itself,
$$P^{(s)}\phi = \int P^{(s)}(x, A)\, \phi(dG_x). \tag{0.7}$$
(The norm of an element of $\mathfrak{M}$ is its variation.) It is supposed that there is an $s$ and a completely continuous linear operator $V$, also taking $\mathfrak{M}$ into itself, such that $P^{(s)} - V$ has norm less than 1. Note that $P^{(s)}$ is the $s$th power of $P^{(1)}$ (3).
Although the conditions of Doeblin and of Kryloff and Bogolioùboff are very weak, they are insufficiently weak to cover some of the simplest cases commonly encountered. In these cases the uniformity in $x$ of approach to the limit in (0.4) is not true. One such example is furnished by the Markoff process encountered in renewal theory (see the following paper in this volume). Another example is the Gaussian stationary process with correlation function $\rho^t$, one of the simplest Markoff processes. Here $X$ is the real line, $\mathcal{F}_X$ the field of Borel sets, and
$$P^{(t)}(x, A) = [2\pi(1 - \rho^{2t})]^{-1/2} \int_A \exp\left[-\frac{(y - \rho^t x)^2}{2(1 - \rho^{2t})}\right] dy, \tag{0.8}$$
where $\rho$ is a constant between 0 and 1 (4). Evidently
$$\lim_{t \to \infty} P^{(t)}(x, A) = (2\pi)^{-1/2} \int_A \exp\left(-\frac{y^2}{2}\right) dy. \tag{0.9}$$
In this case (0.4) holds as an ordinary limit. It is clear however that the limit is not uniform in $x$. While it is hardly necessary to invoke the general theory of Markoff processes to treat this simple example, it is unsatisfactory that the general theory as developed up to the present does not already cover it.
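The failure of uniformity in the Gaussian example can be made numerically explicit. The sketch below assumes the usual closed form for the $t$-step transition law of the unit-variance Gaussian Markoff process with one-step correlation $\rho$ (conditional mean $\rho^t x$, conditional variance $1 - \rho^{2t}$) and takes $A$ to be an interval $(a, b)$. The function names and the choice of starting point in `far_start` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transition_prob(t, x, a, b, rho):
    """P^(t)(x, (a, b)) for the Gaussian kernel: mean rho^t x, variance 1 - rho^(2t)."""
    mean = (rho ** t) * x
    sd = math.sqrt(1.0 - rho ** (2 * t))
    return normal_cdf((b - mean) / sd) - normal_cdf((a - mean) / sd)


def limit_prob(a, b):
    """Limiting value for large t: the standard normal mass of (a, b)."""
    return normal_cdf(b) - normal_cdf(a)


def far_start(t, rho, shift=10.0):
    """A starting point x whose conditional mean after t steps is still `shift`."""
    return shift / (rho ** t)
```

For any fixed `t`, starting at `far_start(t, rho)` keeps the conditional mean ten standard units from the origin, so `transition_prob(t, far_start(t, rho), -1, 1, rho)` stays near zero while `limit_prob(-1, 1)` is about two thirds. The convergence therefore holds for each `x` but cannot be uniform in `x`.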
There is a self-reproducing distribution in this Gaussian process, the distribution defined by the integral in (0.9), and with this distribution the Gaussian process can be made stationary. The renewal process is another example of a Markoff process, not covered by the hypotheses stated above, which has a self-reproducing distribution. Under the conditions of Doeblin and of Kryloff and Bogolioùboff, $Q_j(A)$ in (0.5) is a self-reproducing distribution. It is therefore natural to study those Markoff processes which have such distributions, that is to study stationary Markoff processes, in order to extend the Doeblin-Kryloff-Bogolioùboff results. The author, Yosida, and Kakutani have made the first steps in this direction; references to this work will be given below.
(3) The space $X$ and field $\mathcal{F}_X$ can be taken considerably more generally in both these conditions. The condition of Kryloff and Bogolioùboff is somewhat weaker than that of Doeblin as stated, but it will be seen below that the two conditions become equivalent if other measures of Borel sets than Lebesgue measure are allowed in Doeblin's condition.
(4) Cf. [4] for a detailed discussion of this process. The discussion covers the continuous parameter case, but the change to $t$ integral is trivial.
The notation used is, in summary:
$X$ is an abstract space of points $x, y, \cdots$.
$A, B, \cdots$ are $X$ sets; $\mathcal{F}_X$ is a given Borel field of these sets. In practice $X$ is usually a Borel set in Euclidean space with the topology determined by Euclidean distance. For this reason no special notation is used to distinguish points and sets of the abstract space $X$ from points and sets of numbers.
$\Omega$ is function (sequence) space as already defined; $\omega$ is a point of $\Omega$, that is, a function (sequence) $x(t)$.
$\Lambda, M, \cdots$ are $\Omega$ sets.
$\mathcal{F}_\Omega$ is the Borel field of $\Omega$ sets generated by sets of the form
$$\{x(t_j) \in A_j,\ j = 1, \cdots, n\}, \qquad A_j \in \mathcal{F}_X.$$
$P(\Lambda)$ is the probability measure on the sets of $\mathcal{F}_\Omega$ determined by a given self-reproducing distribution and the given transition probabilities. For this reason it is convenient to call the sets of $\mathcal{F}_\Omega$ the $P$-measurable sets. (The parameter $t$ was chosen to be integral-valued to avoid certain complications in $P$-measure. In the continuous parameter case a topology is introduced in $X$ and regularity properties are imposed on the transition probabilities to insure measurability of the stochastic process.)
$\alpha(\omega), \beta(\omega), \cdots$ are complex-valued functions of $\omega$, that is, chance variables.
$E\{\alpha\} = \int \alpha(\omega)\, P(d\Lambda_\omega)$ is the expectation of $\alpha$.
$P\{\alpha; \Lambda\}$ and $E\{\alpha; \beta\}$ are respectively the probability of $\Lambda$ and the expectation of $\beta$ conditioned by assigned $\alpha$; they are functions of $\alpha$. It will sometimes be convenient to write $P\{\alpha = x; \Lambda\}$ and $E\{\alpha = x; \beta\}$ to stress a particular value assigned to $\alpha$. The conditional expectation $E\{\alpha; \beta\}$ can be obtained as the integral of $\beta$ in terms of the probability measure $P\{\alpha; \Lambda\}$.
$T_s$ is the transformation of function space already defined. We shall sometimes write $T_s\alpha$ for $\alpha(T_s\omega)$.
$\Phi$ measure is the measure of $X$ sets defined by a preassigned self-reproducing probability measure $\Phi(A)$. This will be used, together with the transition probabilities, to define a stationary Markoff process, so that for every $s$
$$\Phi(A) = P\{x(s) \in A\}. \tag{0.10}$$
Alternatively a stationary Markoff process will be presupposed and $\Phi(A)$ will then be defined by (0.10). In this approach the transition probabilities must be defined indirectly, and the fact that the definition is ambiguous on sets of measure zero will cause complications. These will be discussed below.
A $P$-measurable set $\Lambda$ will be called invariant if $T\Lambda$ and $\Lambda$ differ by at most a set of $P$-measure zero. A set $A$ of $\mathcal{F}_X$ will be called invariant if
$$P\{x(0); x(1) \in A\} = 1 \tag{0.11}$$
almost everywhere ($P$-measure), on the $\Omega$ set $\{x(0) \in A\}$, that is, if
$$P\{x(0) = x; x(1) \in A\} = 1 \tag{0.11'}$$
for almost all $x$ ($\Phi$ measure) in $A$. If $A$ is invariant (0.11) and (0.11') are also true with $x(1)$ replaced by $x(s)$, for every $s > 0$. The set $A$ is invariant if and only if the $\Omega$ set $\{x(0) \in A\}$ is invariant. Moreover it is known that if a set $\Lambda$ is invariant it differs by at most a set of $P$-measure zero from a set of the form $\{x(0) \in A\}$. This fact makes it possible to consider only invariant sets of this simple type. More generally [3, pp. 118-119] it is known that if any $P$-measurable function $\alpha(\omega)$ satisfies the condition $T\alpha = e^{i\lambda}\alpha$ for some constant $\lambda$, neglecting a possible exceptional set of $P$-measure zero, then $\alpha$ differs on at most a set of $P$-measure zero from a $P$-measurable function of the form $f[x(0)]$. A function satisfying this equation with $\lambda = 0$ is called an invariant function; if $e^{i\lambda} \ne 1$, $\log \alpha$ when suitably defined is called an angle variable.
A stationary process (Markoff or not) is said to be metrically transitive if there is no invariant $P$-measurable function not constant with probability 1. In the present case, the $x(t)$ process is a Markoff process, and it has already been noted that in this case, in discussing both metric transitivity and angle variables, it is sufficient to consider functions of $x(0)$. In all cases, when investigating metric transitivity, it is sufficient to consider functions taking on only the values 0 and 1, that is, characteristic functions of point sets. A Markoff process is thus metrically transitive if and only if every invariant $\Phi$ measurable set $A$ has $\Phi$ measure either 0 or 1, that is if and only if every $\Omega$ set of the form $\{x(s) \in A\}$ (fixed $s$, $A$ $\Phi$ measurable) which is unchanged, neglecting sets of zero $P$ measure, by a change of $s$, has $P$ measure 0 or 1 (5).
The theorems of the present paper concern conditional expectations of the form $E\{x(0) = x; T_t\alpha\}$. These conditional expectations are considered at first as functions of $x$ for fixed $\alpha$, and the almost everywhere properties of the functions for $t \to \infty$ are investigated. These results would remain valid if for each $s$ the conditional expectations were changed on a set of probability 0, and in fact these conditional expectations are ordinarily so defined that there is ambiguity on sets of zero probability. Sometimes, however, individual values of $x$ will be considered and such an ambiguity will be inadmissible. It will then be supposed that the conditional probabilities are regular. By this is meant that transition probabilities
$$P\{x(s) = x; x(s + t) \in A\} = P^{(t)}(x, A)$$
are given, as uniquely defined functions, satisfying conditions (a), (b), and (c) of the first paragraph of this paper. If $\alpha$ is a chance variable depending on the functions $x(t)$ for $t \ge s$, $E\{x(s) = x; \alpha\}$ is then unambiguously defined as the integral of $\alpha$ in terms of the conditional probability measure for $x(s) = x$.
(5) The general background of ergodic theory, including concepts like metric transitivity, and so on, is contained in [5]. The specific application to Markoff processes is given in [3].
The hypothesis of regularity of the transition probabilities is not so strong a restriction as might appear at first sight. In the first place Markoff processes are most frequently given in terms of uniquely defined transition probabilities; in this case regularity is intrinsic in the definition. In the second place even if the transition probabilities are not so given, it is usually possible to redefine them to be regular. In fact, this is always possible if the field $\mathcal{F}_X$ is strictly separable (6). This hypothesis is always satisfied, for example, when $\mathcal{F}_X$ consists of the Borel subsets of a Borel set in a Euclidean space. The proof of this possibility of redefinition when $\mathcal{F}_X$ is strictly separable goes as follows. Although for a given $A \in \mathcal{F}_X$, $P\{x(0) = x; x(1) \in A\}$ is not a uniquely defined function of $x$, it can be so defined for each $A \in \mathcal{F}_X$ that, excluding an $X$ set $N$ independent of $A$, with
$$N \in \mathcal{F}_X, \qquad P\{x(0) \in N\} = \Phi(N) = 0, \tag{0.12}$$
$P\{x(0) = x; x(1) \in A\}$ is for fixed $x$ a probability measure in $A$ [3, p. 96]. Define $P^{(1)}(x, A)$ as this function for $x \in X - N$, and as any probability measure in $A$ for $x \in N$. The function $P^{(t)}(x, A)$ defined inductively by
$$P^{(t)}(x, A) = \int P^{(t-1)}(y, A)\, P^{(1)}(x, dG_y) \tag{0.13}$$
is a regular transition probability of the process.
The above discussion made essential use of the hypothesis that $\mathcal{F}_X$ was strictly separable. In the following it will frequently be desirable to perform integrations of the type
$$\int f(y)\, P\{x(0) = x; x(1) \in dG_y\} \tag{0.14}$$
without any assumption of regularity imposed on the transition probabilities. It is known that this is permissible and that the usual rules of integration, taking the limit under the integral sign, and so on, apply, and the definition of (0.14) is in terms of the usual Lebesgue sums [3, p. 98]. This convention will be used below, for integration with respect to $P\{x(0) = x; x(t) \in A\}$ measure or any other conditional probability measure with no further comment.
(6) A Borel field of sets is called strictly separable if it is the Borel field generated by a finite or denumerable subcollection of its sets. Note added in proof: The italicized statement (that redefinition to regular transition probabilities is always possible if $\mathcal{F}_X$ is strictly separable) is incorrect; the proof outlined is based on Theorem 3.1 [3, p. 96] which is incorrect. (It was tacitly assumed in the proof of this theorem that the $t$ image of $\Omega$ had outer $F$ measure 1; this is not necessarily true, although it is true if the $t$ image is a Borel set.) The italicized statement is however correct if, for example, $\mathcal{F}_X$ consists of the Borel subsets of a Borel set in a Euclidean space. The discussion below of (0.14) is correct, with a slightly modified justification.
We shall need the following fact below. If the transition probabilities are regular and if $A_1, A_2, \cdots$ are invariant, there is a set $N$ of $\Phi$ measure 0 such that
$$P\{x(0) = x; x(t) \in A_j - A_j N\} = 1, \qquad \text{if } x \in A_j - A_j N,\ j \ge 1,\ t \ge 1. \tag{0.15}$$
In fact there is a set $N_1$ of $\Phi$ measure 0, independent of $j$ and $t$, such that
$$P\{x(0) = x; x(t) \in A_j\} = 1 \qquad \text{if } x \in A_j - A_j N_1,$$
and a set $N_2 \supset N_1$ of $\Phi$ measure 0, independent of $j$ and $t$, such that
$$P\{x(0) = x; x(t) \in A_j - A_j N_1\} = 1 \qquad \text{if } x \in A_j - A_j N_2, \cdots.$$
Then (0.15) is true with $N = \sum_j N_j$.
THEOREM 1 (STATIONARY PROCESS). (a) Let $\alpha(\omega)$ be any $P$-measurable function, and suppose that for some $p \ge 1$
$$E\{|\alpha|^p\} < \infty. \tag{1.1}$$
Then the average
$$\frac{1}{t} \sum_{s=1}^{t} E\{x(0); T_s\alpha\} \tag{1.2}$$
converges in the mean with the same index $p$ to a function $\alpha^*$:
$$\lim_{t \to \infty} E\left\{\left|\frac{1}{t} \sum_{s=1}^{t} E\{x(0); T_s\alpha\} - \alpha^*\right|^p\right\} = 0. \tag{1.3}$$
(b) If $p > 1$, or more generally if
$$E\{|\alpha| \log^+ |\alpha|\} < \infty\ (^{7}), \tag{1.4}$$
the limit exists pointwise:
$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} E\{x(0) = x; T_s\alpha\} = \alpha^* \tag{1.5}$$
except for an $x$ set of $\Phi$ measure 0. If $A \in \mathcal{F}_X$ the averages in (1.2), (1.3), and (1.5) can be specialized to
$$\frac{1}{t} \sum_{s=1}^{t} P\{x(0); x(s) \in A\}. \tag{1.6}$$
(c) If the process is metrically transitive the limit $\alpha^*$ is $E\{\alpha\}$ which reduces to $P\{x(0) \in A\} = \Phi(A)$ when $t \to \infty$ in (1.6). Conversely if $\alpha^* = E\{\alpha\}$ for every $\alpha$ the process is metrically transitive.
(d) Suppose that the process is a Markoff process and let $Q(x, A)$ be the limit in (1.6) when $t \to \infty$. Then $Q(x, A)$ is the conditional probability of $\{x(0) \in A\}$ relative to the field of invariant sets; that is
$$\int_B Q(x, A)\, \Phi(dG_x) = \Phi(A \cdot B) \tag{1.7}$$
if $B$ is invariant. More generally, if $\alpha$ depends only on $x(0), x(1), \cdots$, $\alpha^*$ is the conditional expectation of $\alpha$ relative to the field of invariant sets. The function $Q(x, A)$ satisfies the equations:
$$Q(x, A) = \int P\{x(0) = y; x(t) \in A\}\, Q(x, dG_y), \tag{1.8}$$
$$Q(x, A) = \int Q(y, A)\, P\{x(0) = x; x(t) \in dG_y\}, \tag{1.9}$$
$$Q(x, A) = \int Q(y, A)\, Q(x, dG_y), \tag{1.10}$$
for almost all values of $x$ ($\Phi$ measure). The exceptional set varies with $A$ and the choice of the conditional probabilities involved. There is metric transitivity if and only if $Q(x, A) = \Phi(A)$ for almost all $x$ ($\Phi$ measure) for each $A$.
(e) Suppose that the process is a Markoff process with regular transition probabilities and suppose that $C = C(\alpha)$ is the $x$ set on which (1.5) is true, where $\alpha$ is bounded and depends only on $x(0), x(1), \cdots$. Then $C$ includes every $x$ satisfying
$$P\left\{x(0) = x; \sum_{t \ge 0} [x(t) \in C]\right\} = 1. \tag{1.11}$$
(7) $\log^+ |a| = \log |a|$ if $|a| \ge 1$, $\log^+ |a| = 0$ if $|a| < 1$.
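Part (d) can be checked by hand on a small reducible chain. The sketch below assumes a hypothetical three-state chain with two invariant sets, $\{0, 1\}$ (a two-cycle) and $\{2\}$ (an absorbing point), together with a self-reproducing distribution `PHI` chosen for the example. It approximates $Q$ by the Cesàro average in (1.6) and then tests the finite forms of (1.7) to (1.10). All names are illustrative and not from the paper.

```python
# Finite-state check of Theorem 1(d) on a chain that is not metrically transitive.
# Assumption: states 0 and 1 swap deterministically; state 2 is absorbing.
P = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]

# A self-reproducing distribution for P (one of many, since the chain is reducible).
PHI = [0.25, 0.25, 0.5]


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(P, t):
    """(1/t) * sum of P^(1), ..., P^(t): the average in (1.6) for point sets."""
    n = len(P)
    total = [[0.0] * n for _ in range(n)]
    current = [row[:] for row in P]
    for _ in range(t):
        for i in range(n):
            for j in range(n):
                total[i][j] += current[i][j]
        current = matmul(current, P)
    return [[value / t for value in row] for row in total]


def integral_over(B, A, Q, phi):
    """Left side of (1.7): sum over x in B of Q(x, A) * phi(x)."""
    return sum(phi[x] * sum(Q[x][y] for y in A) for x in B)


def measure(phi, A):
    """phi(A) for a set A of states."""
    return sum(phi[x] for x in A)


# For this chain the average over an even number of steps already equals the limit.
Q = cesaro_average(P, 1000)
```

In this example `matmul(Q, P)`, `matmul(P, Q)` and `matmul(Q, Q)` all reproduce `Q`, which are the finite forms of (1.8), (1.9) and (1.10). The rows of `Q` differ between the two invariant sets, so `Q` is not equal to `PHI` for every starting point, in line with the criterion for metric transitivity at the end of (d).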
Note that (1.6) is the counterpart of (0.4). Although the limit when $t \to \infty$ in (1.6) may not exist on an $x(0)$ set of $\Phi$ measure 0, in many applications the transition probabilities are regular and the possibility of exceptional points of nonconvergence can be eliminated by the last statement of the theorem.
Various special cases of this theorem have been proved by the author [3, p. 114], by Yosida [8], and by Kakutani [6]. It has always been supposed hitherto, however, in proving (1.3) and (1.5) that the process was Markoff, and, in finding pointwise limits, that $\alpha$ was bounded. It will be clear from the proof that the conditional expectations for preassigned $x(0)$ can be replaced in (a), (b), and (c) by any other conditional expectations without altering either the validity of the theorem or its proof. It would be interesting to prove the existence of the pointwise limit in (1.5) for any $\alpha$ with $E\{|\alpha|\} < \infty$, or to obtain a counterexample.
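Proof of (a). Under the stated hypotheses, $E\{|\alpha|\} < \infty$. Hence according to the point ergodic theorem (strong law of large numbers for stationary processes)
$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} T_s\alpha = \alpha_1 \tag{1.12}$$
exists with probability 1, and according to the mean ergodic theorem if $E\{|\alpha|^p\} < \infty$ for some $p \ge 1$, it follows that $E\{|\alpha_1|^p\} < \infty$ and there is convergence in the mean of order $p$. Hence, in this case, when $t \to \infty$,
$$\begin{aligned}
E\left\{\left|\frac{1}{t} \sum_{s=1}^{t} E\{x(0); T_s\alpha\} - E\{x(0); \alpha_1\}\right|^p\right\}
&= E\left\{\left|E\left\{x(0); \frac{1}{t} \sum_{s=1}^{t} T_s\alpha - \alpha_1\right\}\right|^p\right\} \\
&\le E\left\{E\left\{x(0); \left|\frac{1}{t} \sum_{s=1}^{t} T_s\alpha - \alpha_1\right|^p\right\}\right\} \\
&= E\left\{\left|\frac{1}{t} \sum_{s=1}^{t} T_s\alpha - \alpha_1\right|^p\right\} \to 0.
\end{aligned} \tag{1.13}$$
Thus (1.3) holds, with $\alpha^* = E\{x(0); \alpha_1\}$.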
Proof of (b). To prove (1.5) it will be sufficient to show that if $p = 1$ and if (1.4) is true then (1.12) can be integrated to the limit ($t \to \infty$) with respect to the conditional measure $P\{x(0); \Lambda\}$. This will be shown by showing that (in terms of this conditional measure) excluding perhaps an $x(0)$ set of zero measure there is convergence in (1.12) almost everywhere, and that the averages are dominated by an integrable function. (It has already been noted that this procedure is correct even though strictly speaking the conditional probability may not define a measure.) According to (1.12) the average converges to $\alpha_1$ except on a set $M$ with $P\{M\} = 0$. Since
$$\int P\{x(0) = x; M\}\, \Phi(dG_x) = P\{M\} = 0 \tag{1.14}$$
it follows that $P\{x(0); M\} = 0$ for almost all $x(0)$ ($\Phi$ measure). That is to say, excluding the exceptional values of $x(0)$ the limit in (1.12) exists almost everywhere ($P\{x(0); \Lambda\}$ measure). Moreover according to a theorem of Wiener [9, p. 27] (1.4) implies that
$$E\left\{\operatorname*{L.U.B.}_{t} \left|\frac{1}{t} \sum_{s=1}^{t} T_s\alpha\right|\right\} < \infty. \tag{1.15}$$
Hence, excluding an $x(0)$ set of $\Phi$ measure 0,
$$E\left\{x(0); \operatorname*{L.U.B.}_{t} \left|\frac{1}{t} \sum_{s=1}^{t} T_s\alpha\right|\right\} < \infty. \tag{1.16}$$
This means that, excluding an $x(0)$ set of $P$-measure 0, the average in (1.12) is dominated by an integrable function, as was to be proved.
Proof of (c). According to the ergodic theorem there is metric transitivity if and only if $\alpha_1 = E\{\alpha\}$ for all $\alpha$, in which case, then, $\alpha^* = E\{\alpha\}$ also. If $\alpha$ is defined to be 1 when $x(0) \in A$ and 0 otherwise, the average in (1.2) becomes that in (1.6). The limit is then $E\{\alpha\} = \Phi(A)$ when there is metric transitivity.
Proof of (d). Let $Q(x, A)$ be the limit in (1.6). To prove that $Q(x, A)$ is the stated conditional expectation, that is, that (1.7) is true, we note that since the process is a Markoff process, and since $B$ is invariant,
$$\int_B \frac{1}{t} \sum_{s=1}^{t} P\{x(0) = x; x(s) \in A\}\, \Phi(dG_x) = \frac{1}{t} \sum_{s=1}^{t} P\{x(0) \in B, x(s) \in A\} = \frac{1}{t} \sum_{s=1}^{t} P\{x(s) \in B, x(s) \in A\} = P\{x(0) \in A \cdot B\} = \Phi(A \cdot B). \tag{1.17}$$
When $t \to \infty$, this becomes (1.7). The identification of $\alpha^*$ as a conditional probability is made in the same way. The equalities (1.8), (1.9) and (1.10) are "almost everywhere" equalities. Hence to show their truth it is sufficient to show that integration over an arbitrary $\mathcal{F}_X$ set $B$ yields equality in each case. This is clear in (1.9), using the fact that $\Phi$ is self-reproducing. It is also clear in (1.8) and (1.10), using (1.7), if $B$ is an invariant set. Now it is clear from the definition of $Q(x(0), A)$ that this function of $x(0)$ is invariant, and therefore that it is permissible to restrict $B$ to be invariant in testing (1.8) and (1.10). Thus the three equalities are completely proved. Finally, if there is metric transitivity we have already seen that $Q(x, A) = \Phi(A)$ for almost all $x$ ($\Phi$ measure). Conversely, if this equation is true, and if $A$ is invariant, $\Phi(A) = 1$ or 0, since the conditional probability $Q(x, A)$ is the characteristic function of $A$ if $A$ is invariant. Hence the process is metrically transitive. Moreover it is clearly sufficient if $Q(x, A) = \Phi(A)$ on a collection of sets so large that the Borel field they generate coincides with $\mathcal{F}_X$.
Proof of (e). Suppose that $|\alpha| \le K$, that $\alpha$ depends only on $x(0), x(1), \cdots$, that (1.5) is true for $x \in C$, and that the process is a Markoff process with regular transition probabilities. Let $x$ satisfy (1.11), that is to say it is supposed that starting from $x(0) = x$, $x(s)$ takes on a value in $C$ for some $s \ge 0$ with probability 1. We show that then $x \in C$. Let $\nu$ be the first value of $s$ with $x(s) \in C$. Then if we use the fact that the process is a Markoff process, and that the equality $\nu = s$ is a condition on $x(0), \cdots, x(s)$,
$$E\{x(0) = x; T_s\alpha\} = \sum_{j=0}^{\infty} \int_{\{\nu = j\}} T_s\alpha\, P\{x(0) = x; d\Lambda\}, \tag{1.18}$$
where, if $s \ge j$, the $j$th integral in (1.18) can be written in the form
$$\int_{\{\nu = j\}} E\{x(j); T_s\alpha\}\, P\{x(0) = x; d\Lambda\}. \tag{1.19}$$
Hence
$$\frac{1}{t} \sum_{s=1}^{t} E\{x(0) = x; T_s\alpha\} = \sum_{j=0}^{t} \int_{\{\nu = j\}} \frac{1}{t} \sum_{s=\max(j,1)}^{t} E\{x(j); T_s\alpha\}\, P\{x(0) = x; d\Lambda\} + \frac{1}{t} \sum_{s=1}^{t} \int_{\{\nu > s\}} T_s\alpha\, P\{x(0) = x; d\Lambda\}. \tag{1.20}$$
Now the second sum is at most
$$\frac{K}{t} \sum_{s=1}^{t} P\{x(0) = x; \nu > s\} \tag{1.21}$$
which goes to 0 when $t \to \infty$, since $P\{x(0) = x; \nu > t\} \to 0$. The integrand in the $j$th integral of the first sum in (1.20) is bounded by $K$ and, since $x(j) \in C$ when $\nu = j$, converges to a limit when $t \to \infty$. The integral, and hence the sum, therefore also converges to a limit; this finishes the proof.
It is a remarkable fact, due in the first place to von Neumann, and proved later more generally by Ambrose, Halmos, and Kakutani [1], that any stationary process (with mild restrictions on the field of measurable sets) can be decomposed into metrically transitive processes. Although the various decomposition-limit theorems on Markoff processes are merely reflections of this fact (the proofs vary with the special character of the hypotheses and the predilections of the authors) the exact adaptation of the decomposition theorem to Markoff processes has never been stated explicitly. This is done in the following theorem, whose proof will be given in detail; the theorem is not a corollary of the general theorem because stronger results are desired in this special case.
THEOREM 2 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION PROBABILITIES). Let $Q(x, A)$ be the conditional probability of $A$ ($\in \mathcal{F}_X$) with respect to the field of invariant sets, so that (by Theorem 1)
$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} P\{x(0) = x; x(s) \in A\} = Q(x, A) \tag{2.1}$$
for almost all $x$ ($\Phi$ measure).
If the field $\mathcal{F}_X$ is strictly separable, $Q(x, A)$ can be so defined that the following is true (8). There is a collection of sets $\{A_\mu\}$ where $A_\mu \in \mathcal{F}_X$ and $\mu$ varies through some set of real numbers, with the following properties:
(a) Every set $A \in \mathcal{F}_X$ which is a sum of sets $A_\mu$ is invariant and (neglecting sets of $\Phi$ measure 0) every invariant set can be expressed in this way.
(b) $\Phi(N) = 0$, where $N = X - \sum_\mu A_\mu$. If $x \in X - N$, $Q(x, A)$ is a probability measure in $A \in \mathcal{F}_X$ satisfying
$$Q(x, A) = Q_\mu(A), \qquad x \in A_\mu, \tag{2.2}$$
$$Q_\mu(A_\mu) = 1, \tag{2.3}$$
$$\int Q(x, A)\, \Phi(dG_x) = \Phi(A). \tag{2.4}$$
(c) Each probability measure $Q_\mu(A)$ is self-reproducing,
$$\int P\{x(0) = y; x(1) \in A\}\, Q_\mu(dG_y) = Q_\mu(A), \tag{2.5}$$
and more generally if $\Psi(d\mu)$ is any probability measure in $\mu$ space,
$$\Phi_1(A) = \int Q_\mu(A)\, \Psi(d\mu) \tag{2.6}$$
is a self-reproducing probability measure. Conversely if $\Phi_1(A)$ is self-reproducing and absolutely continuous with respect to $\Phi(A)$ it can be put in the form (2.6). If, for some $\mu$, $\Phi(A_\mu) > 0$, then $Q_\mu(A)$ is proportional to $\Phi(A \cdot A_\mu)$, $Q_\mu(A) = \Phi(A \cdot A_\mu)/\Phi(A_\mu)$.
(d) $P\{x(0) = x; x(t) \in A_\mu\} = 1$ everywhere on $A_\mu$ and for each $\mu$ and each $A \in \mathcal{F}_X$, (2.1) is true for almost all $x$ ($Q_\mu$ measure). The Markoff process with $\Phi$ replaced by $Q_\mu$ is metrically transitive. The original process is metrically transitive if and only if some $A_\mu$ has $\Phi$ measure 1.
(8) Added in proof: (Cf. footnote 6.) It is necessary to strengthen this hypothesis; it will be supposed that $\mathcal{F}_X$ consists of the Borel subsets of a set in a Euclidean space. With this restriction, the discussion of the unique definition of $Q(x, A)$, involved in the definition of $A_\mu$, is correct.
The analogues of (1.9) and (1.10) have been omitted here, since they reduce to trivial identities. Conditions are given below which exclude the possibility that any $A_\mu$ have zero $\Phi$ measure. Under these conditions there can be at most denumerably many of these sets. The stronger conditions of Doeblin and of Kryloff and Bogolioùboff imply that there can be at most a finite number.
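When $X$ is finite and every $A_\mu$ has positive $\Phi$ measure, the decomposition in Theorem 2 can be computed directly from $Q_\mu(A) = \Phi(A \cdot A_\mu)/\Phi(A_\mu)$. The sketch below assumes a hypothetical four-state chain with two invariant sets; the matrix `P`, the distribution `PHI`, the dictionary `FINAL_SETS`, and the mixing weights are all illustrative assumptions rather than anything taken from the paper.

```python
# Finite illustration of Theorem 2(c): conditional measures Q_mu and their mixtures.
# Assumption: invariant sets {0, 1} and {2, 3}; no transitions between them.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.25, 0.75],
     [0.0, 0.0, 0.25, 0.75]]

# A self-reproducing distribution charging both invariant sets.
PHI = [0.25, 0.25, 0.125, 0.375]

# Hypothetical labels mu for the sets A_mu.
FINAL_SETS = {"mu1": [0, 1], "mu2": [2, 3]}


def push_forward(phi, P):
    """One step of the chain applied to the distribution phi, as in (2.5)."""
    n = len(P)
    return [sum(phi[x] * P[x][y] for x in range(n)) for y in range(n)]


def conditional_measure(phi, block):
    """Q_mu(A) = phi(A . A_mu) / phi(A_mu), returned as a vector of point masses."""
    total = sum(phi[x] for x in block)
    return [phi[x] / total if x in block else 0.0 for x in range(len(phi))]


def mixture(weights, components):
    """The measure (2.6): a weighted combination of the Q_mu."""
    n = len(next(iter(components.values())))
    return [sum(weights[mu] * components[mu][x] for mu in components)
            for x in range(n)]


Q_MU = {mu: conditional_measure(PHI, block) for mu, block in FINAL_SETS.items()}
```

Each `Q_MU[mu]` is reproduced by `push_forward`, and so is any `mixture` of them, for example with weights `{"mu1": 0.25, "mu2": 0.75}`. Choosing the weights `PHI(A_mu)` recovers `PHI` itself, which is the finite counterpart of (2.4).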
Definitionof A,. and proof of (a). It has already been remarked that
Q(x(O), A) is an invariantfunction,for fixedA. This implies that the x set
definedby Q(x, A) <k is an invariantset. In the followingwe shall suppose
that , is strictlyseparable, specificallythat it is the Borel fieldgenerated
by the sets Ci, C2, * * * , CjCax.
It can then be supposed that the condi-
tional expectationQ(x, A) is so definedthat, for fixedx, Q(x, A) is a probabilitymeasurein A. It will be supposed below that Q(x, A) is so defined,and
[May
J. L. DOOB
406
be
that this functionis then a uniquely given functionof x and A. Let
X
of
the
collection
denumerable
by
sets
generated
the Borel fieldof invariant
sets of the form
{Q(x, Cn) < r},
(2.7)
n = 1, 2,
; r rational.
If A is an invariantX set, we have seen that Q(x, A) is the characteristic
functionof A, neglectingsets of 4 measure 0. Hence, neglectingsets of 4
measure 0, A is the set definedby the equality Q(x, A) = 1. Since A is in the
this equality definesa set in a-, neglectBorel fieldgeneratedby C1, C2,
ing sets of I) measure0. Hence a-, althoughit may not include A, includes a
fromit by at most a set of 4 measure 0.
set differing
The sets of $\mathcal{F}_I$ which contain no proper non-null subsets in $\mathcal{F}_I$ are called the atoms of $\mathcal{F}_I$. They can be exhibited explicitly in the form

$$\prod_{j \ge 1} D_j', \qquad D_j' = D_j \ \text{ or } \ D_j' = X - D_j, \tag{2.8}$$

where $D_1, D_2, \cdots$ are the sets in (2.7). The non-null intersections in (2.8) are the atoms. The cardinal number of these non-null intersections is at most that of the continuum, and we can therefore denote them by $\{A_\mu\}$, where $\mu$ ranges through some set of real numbers. In the following we shall find it necessary to drop some of these atoms, putting them in a set $N$ of $\Phi$ measure 0. Since every set of $\mathcal{F}_I$ is composed of atoms, every invariant set differs from a sum of $A_\mu$'s by at most a set of $\Phi$ measure 0. The converse, that every $A_\mu$ sum is invariant if it is a set in $\mathcal{F}_X$, is a consequence of (b), to be proved below. In fact if $A \in \mathcal{F}_X$ is a sum of $A_\mu$'s, where $\mu$ ranges over some set $M$, $Q_\mu(A) = 1$ when $\mu \in M$ and $Q_\mu(A) = 0$ otherwise. Hence the set $A$ is defined by the equality $Q_\mu(A) = 1$, that is, neglecting sets of $\Phi$ measure 0, $A$ is defined by the equality $Q(x, A) = 1$. This definition has already been seen to define an invariant set.
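The construction (2.8) can be tried out on a finite space, where every intersection can be enumerated. The following is only a sketch under that assumption: the six-point space, the two generating sets, and the names `atoms`, `X`, `D` are hypothetical and do not come from the paper. It forms every intersection in which each generating set is replaced either by itself or by its complement, and keeps the non-null ones.

```python
from itertools import product


def atoms(space, generators):
    """Return the non-empty cells D'_1 & ... & D'_n, each D'_j being D_j or X - D_j."""
    space = frozenset(space)
    cells = set()
    for keep_flags in product((True, False), repeat=len(generators)):
        cell = space
        for keep, d in zip(keep_flags, generators):
            d = frozenset(d)
            cell = cell & d if keep else cell - d
        if cell:  # drop the null intersections
            cells.add(cell)
    return cells


# Hypothetical toy data: a six-point space and two generating sets.
X = range(6)
D = [{0, 1, 2}, {2, 3}]
toy_atoms = atoms(X, D)
```

In this toy case the atoms partition the space, and every set of the generated field is a sum of atoms, which is the finite analogue of the statement in the text.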
Proof of (b). Since $Q(x, A)$ is $\mathcal{F}_I$ measurable for fixed $A$, it depends only on the atom in which $x$ lies. Equation (2.2) is merely the analytic formulation of this fact. If $A \in \mathcal{F}_I$, the conditional probability $Q(x, A)$ is 1 almost everywhere in $A$ ($\Phi$ measure); that is to say, using (2.2), $Q(x, A) = 1$ in every $A_\mu \subset A$, except in some constituting a set of $\Phi$ measure 0. Drop these atoms from the $A_\mu$ for every set $A$ defined by (2.7). If this is done, (2.3) will be true. Equation (2.4) is obtained by integrating (2.1) with respect to $\Phi$ measure.
Proof of (c). According to Theorem 1, (1.8) is true for almost all $x$ ($\Phi$ measure), that is, (2.5) is true except possibly for a set of $\mu$ corresponding to atoms of total $\Phi$ measure 0. Absorb these in $N$ for all sets $A = C_1, C_2, \cdots$. Then (2.5) will be true for all $A \in \mathcal{F}_X$. If (2.6) is used to define a probability measure $\Phi_1(A)$, integration of (2.5) in $\mu$ (with respect to the measure appearing in (2.6)) gives the equation showing that $\Phi_1$ is self-reproducing. Conversely suppose that $\Phi_1$ is any self-reproducing probability measure. Then

$$\frac{1}{t}\sum_{s=1}^{t}\int P\{x(0) = y;\ x(s) \in A\}\,\Phi_1(dG_y) = \Phi_1(A). \tag{2.9}$$

When $t \to \infty$ the integrand converges to $Q(y, A)$, except possibly on a set of $\Phi$ measure 0. If $\Phi_1$ is absolutely continuous with respect to $\Phi$, the exceptional set has at most zero $\Phi_1$ measure also; when $t \to \infty$, (2.9) becomes

$$\int Q(y, A)\,\Phi_1(dG_y) = \Phi_1(A), \tag{2.10}$$

which is a special case of (2.6), since $\Phi_1(dG_y)$ induces a measure in $\mu$ space. In particular, suppose that $\Phi(A_\mu) > 0$. Then by (2.4), $\Phi(A \cdot A_\mu) = Q_\mu(A)\,\Phi(A_\mu)$.
Proof of (d). It was proved in the introductory remarks of this paper that (since the sets $D_1, D_2, \cdots$ defined by (2.7) are invariant) there is a set $D_0$ of $\Phi$ measure 0 such that $P\{x(0) = x;\ x(1) \in D_j - D_jD_0\} = 1$ for $x \in D_j - D_jD_0$. We shall now suppose that $\mathcal{F}_I$ is the field of sets generated by $D_0, D_1, \cdots$ (rather than by $D_1, D_2, \cdots$). None of the previous arguments require any change because of this. It is now true, however, that $P\{x(0) = x;\ x(1) \in A\} = 1$ for $x \in A$ if $A \in \mathcal{F}_I$ and if $A \subset X - D_0$. Then the first statement of (d) is true if $D_0$, which is a sum of atoms, is incorporated in $N$. Equation (2.1) is true for each $A$ for almost all $x$ ($\Phi$ measure) and hence, using (2.10), if $N_1$ is the exceptional set, $Q(y, N_1) = 0$ for almost all $y$ ($\Phi$ measure). Hence $Q_\mu(N_1) = 0$ for all $\mu$ except some corresponding to $A_\mu$'s of total $\Phi$ measure 0. Absorb these in $N$ for each $A = C_1, C_2, \cdots$. We shall show below that then, for each $\mu$ and each $A \in \mathcal{F}_X$, (2.1) is true for almost all $x$ ($Q_\mu$ measure). According to Theorem 1, the process with $\Phi$ replaced by $Q_\mu$ will be metrically transitive if

$$\lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P\{x(0) = x;\ x(s) \in A\} = Q_\mu(A) \tag{2.11}$$

is true for almost all $x$ ($Q_\mu$ measure), for every $A$. In fact, according to a remark made in proving Theorem 1, it is sufficient that (2.11) hold for a collection of $A$ sets determining the field $\mathcal{F}_X$. This is precisely what we have shown, with $A = C_1, C_2, \cdots$. Hence the reduced process is metrically transitive. It then follows from Theorem 1 that (2.11), that is (2.1), holds for almost all $x$ ($Q_\mu$ measure) for each $A \in \mathcal{F}_X$. If the given process is metrically transitive, the invariant $X$ sets have $\Phi$ measure either 0 or 1. It is then clear from (2.8) that one atom must have $\Phi$ measure 1. This atom can never be absorbed in $N$. Conversely if some $A_\mu$ has $\Phi$ measure 1, say $A_{\mu_0}$, then $Q_{\mu_0}(A) = \Phi(A)$, and the "reduced" process, which has been shown to be metrically transitive, is precisely the original process.
In general, it is not difficult to show that the $\mu$ can be so chosen that any Borel set of $\mu$'s corresponds to a set of atoms which constitute a set of $\mathcal{F}_I$, and conversely any $\mathcal{F}_I$ set can be expressed in this way (ignoring the atoms in $N$). This gives an elegant parametrization of the invariant sets.
The following theorems make Theorem 2 more precise by imposing restrictions on the transition probabilities which make it possible to discuss the truth of (2.1) for individually specified values of $x$.
THEOREM 3 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION PROBABILITIES). Let $P_1(x, A, t)$ be the singular component of the measure $P\{x(0) = x;\ x(t) \in A\}$ with respect to $\Phi$ measure. In the following, $\alpha(\omega)$ will be any $P$ measurable function depending on $x(t)$ for $t \ge 0$, and either bounded or at least with

$$E\{x(0) = x;\ |T_t\alpha|\} < \infty, \qquad x \in X,\ t \ge 1, \tag{3.1}$$

and

$$\operatorname*{L.U.B.}_{y}\ E\{x(0) = y;\ |T_{t_0}\alpha|\} < \infty \tag{3.2}$$

for some $t_0 > 0$.
(a) $P_1(x, X, t)$ is monotone non-increasing in $t$, for fixed $x$.
(b) With the above convention on $\alpha$,

$$\lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} E\{x(0) = x;\ T_s\alpha\} \tag{3.3}$$

exists for all $x$ satisfying

$$\lim_{t \to \infty} P_1(x, X, t) = 0. \tag{3.4}$$

If the process is metrically transitive, the limit in (3.3) is $E\{\alpha\}$.
(c) If $C$ is the $x$ set on which (3.4) is true, $C$ includes every $x$ satisfying

$$P\Big\{x(0) = x;\ \sum_{t \ge 0}\,[x(t) \in C]\Big\} = 1. \tag{3.5}$$

If (3.4) is satisfied on an $X$ set of positive $\Phi$ measure, and if the process is metrically transitive, (3.4) is satisfied by almost all values of $x$ ($\Phi$ measure).
(d) If $x$ satisfies (3.4), and if $A \in \mathcal{F}_X$,

$$\lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P\{x(0) = x;\ x(s) \in A\} = Q(x, A) \tag{3.6}$$

exists. The set function $Q(x, A)$ is a probability measure which is absolutely continuous with respect to $\Phi$ measure and is self-reproducing,

$$\int P\{x(0) = y;\ x(t) \in A\}\,Q(x, dG_y) = Q(x, A). \tag{3.7}$$

If the process is metrically transitive, $Q(x, A) = \Phi(A)$.
(e) If almost all $x$ ($\Phi$ measure) satisfy (3.4), $Q(x, A)$ (for those values of $x$) satisfies

$$\int Q(y, A)\,Q(x, dG_y) = Q(x, A)\ ^{(9)} \tag{3.8}$$

and also satisfies

$$\int Q(y, A)\,P\{x(0) = x;\ x(t) \in dG_y\} = Q(x, A) \tag{3.9}$$

for almost all $x$ ($\Phi$ measure)$^{(10)}$. If all $x$ satisfy (3.4), (3.8) and (3.9) are true for all $x$.
Proof of (a). For each $t$ let $A_t$ be a set of $\Phi$ measure 0 for which

$$P\{x(0) = x;\ x(t) \in A_t\} = P_1(x, X, t). \tag{3.10}$$

The set $A_t$ depends on $x$ and $t$, but $x$ in the following will be held fast. Then

$$P_1(x, X, s + t) = \int P\{x(t) = y;\ x(s + t) \in A_{s+t}\}\,P\{x(0) = x;\ x(t) \in dG_y\}, \qquad s, t > 0. \tag{3.11}$$

Since $\Phi(A_{s+t}) = 0$, the integrand vanishes for almost all $y$ ($\Phi$ measure); if $B$ is the set on which it does not vanish,

$$P_1(x, X, s + t) \le \int_B P\{x(0) = x;\ x(t) \in dG_y\} = P\{x(0) = x;\ x(t) \in B\} \le P_1(x, X, t), \tag{3.12}$$

as was to be proved. It is interesting to note, although we shall not use the fact explicitly, that $P_1$ obeys the same functional equation (0.1) as the transition probabilities themselves.
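Statement (a) can be checked by hand on a finite chain, where the singular component with respect to $\Phi$ is simply the mass carried by the states of $\Phi$ measure 0. The sketch below assumes such a finite setting; the three-state matrix `P`, the vector `PHI`, and the helper names are hypothetical illustrations, not objects from the paper.

```python
from fractions import Fraction as F

# Hypothetical chain: state 0 is transient, states 1 and 2 form the closed class.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(0), F(1, 2), F(1, 2)],
    [F(0), F(1), F(0)],
]
PHI = [F(0), F(2, 3), F(1, 3)]  # self-reproducing: PHI * P == PHI


def step(row, matrix):
    """One transition: the row vector times the transition matrix."""
    n = len(matrix)
    return [sum(row[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def singular_mass(x, t, matrix, phi):
    """Mass that the t-step law started at x puts on the states of phi measure 0."""
    row = [F(int(i == x)) for i in range(len(matrix))]
    for _ in range(t):
        row = step(row, matrix)
    return sum((p for p, w in zip(row, phi) if w == 0), F(0))


masses = [singular_mass(0, t, P, PHI) for t in range(1, 6)]
```

Here `masses` plays the role of $P_1(0, X, t)$ for $t = 1, \cdots, 5$; it halves at every step, so it is non-increasing and tends to 0, which is condition (3.4) for this starting point.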
Proof of (b). The problem is that of justifying going to the limit under an integral sign. In fact in

$$\frac{1}{t}\sum_{s=1}^{t} E\{x(0) = x;\ T_s\alpha\} = E\Big\{x(0) = x;\ \frac{1}{t}\sum_{s=1}^{t} T_s\alpha\Big\} \tag{3.13}$$

the average $(1/t)\sum_{s=1}^{t} T_s\alpha$ converges with probability 1. According to Theorem 1 integration to the limit is justified for almost all $x$ ($\Phi$ measure) if $E\{|\alpha| \log^+ |\alpha|\} < \infty$. The present hypotheses are of a different type. If $0 < s \le t - t_0$,

$$\frac{1}{t}\sum_{j=1}^{t} E\{x(0) = x;\ T_j\alpha\} = \int E\Big\{x(s) = y;\ \frac{1}{t}\sum_{j=s+t_0}^{t} T_j\alpha\Big\}\,P\{x(0) = x;\ x(s) \in dG_y\} + E\Big\{x(0) = x;\ \frac{1}{t}\sum_{j=1}^{s+t_0-1} T_j\alpha\Big\}. \tag{3.14}$$

Now by hypothesis (3.2) is true, and it follows easily that (3.2) is true with $t_0$ replaced by any $t > t_0$, with the same bound, say $K$. Then the $j$th integrand in (3.14) is bounded by $K$. According to Theorem 1 this integrand converges in the mean of order 1, and therefore in probability, that is, in $\Phi$ measure. Unfortunately $\Phi$ measure is not the relevant integration measure. However if $A_s$ is defined as in the proof of (a), the integrand converges on $X - A_s$ in the integration measure. Integration to the limit on $X - A_s$ is therefore allowed and the maximum oscillation of (3.14) when $t \to \infty$ is thus at most $K P_1(x, X, s)$, an upper bound to the integration over $A_s$. By hypothesis this can be made arbitrarily small by choosing $s$ large. Hence the left side of (3.14) converges to a limit when $t \to \infty$. If the process is metrically transitive, the limit is $E\{\alpha\}$, since in that case the almost everywhere limit in (3.13) when $t \to \infty$ is $E\{\alpha\}$, by Theorem 1.

(9) The integration is meaningful even though $Q(y, A)$ is not defined on a $y$ set $N$ of $\Phi$ measure 0 because, according to (d), $N$ is also of $Q(x, dG_y)$ measure 0.
(10) In this integral the integrand is again undefined on a set of $\Phi$ measure 0, but this has integration measure 0 for almost all ($\Phi$ measure) values of $x$.
Proof of (c). If (3.4) is satisfied on an $X$ set $C$ and if $x$ satisfies (3.5), that is, if starting from $x(0) = x$, $x(s)$ takes on a value in $C$ for some $s \ge 0$ with probability 1, we shall now show that $x$ also satisfies (3.4). Let $\nu$ be the first value of $s$ with $x(s) \in C$. Then using the fact that the process is a Markoff process, and that $\nu = s$ is a condition on $x(0), \cdots, x(s)$,

$$P\{x(0) = x;\ x(t) \in A\} = \sum_{s=0}^{\infty} P\{x(0) = x;\ \nu = s,\ x(t) \in A\}, \tag{3.15}$$

where, if $t > s$,

$$P\{x(0) = x;\ \nu = s,\ x(t) \in A\} = \int_{\{\nu = s\}} P\{x(s);\ x(t) \in A\}\,P\{x(0) = x;\ d\omega\}. \tag{3.16}$$

In particular, if $\Phi(A) = 0$,

$$P\{x(0) = x;\ \nu = s,\ x(t) \in A\} \le \int_{\{\nu = s\}} P_1(x(s), X, t - s)\,P\{x(0) = x;\ d\omega\}. \tag{3.17}$$

If $A$ is chosen so that the left side of (3.15) becomes $P_1(x, X, t)$, it then follows that if $t \ge n$,

$$\begin{aligned} P_1(x, X, t) &\le \sum_{s=0}^{n}\int_{\{\nu = s\}} P_1(x(s), X, t - s)\,P\{x(0) = x;\ d\omega\} + \sum_{s=n+1}^{\infty} P\{x(0) = x;\ \nu = s,\ x(t) \in A\} \\ &\le \sum_{s=0}^{n}\int_{\{\nu = s\}} P_1(x(s), X, t - s)\,P\{x(0) = x;\ d\omega\} + \sum_{s=n+1}^{\infty} P\{x(0) = x;\ \nu = s\}. \end{aligned} \tag{3.18}$$

Now the last series in (3.18) is the remainder of a convergent series:

$$\sum_{s=0}^{\infty} P\{x(0) = x;\ \nu = s\} = 1 \tag{3.19}$$

and can be made arbitrarily small by choosing $n$ large. When $\nu = s$, $x(s) \in C$. Hence the integrands in (3.18) converge to 0 ($t \to \infty$) by the definition of $C$. Since the series of integrals is dominated term by term by the series (3.19), it follows that the left side of (3.18) converges to 0. We have thus shown that $C$ includes the values of $x$ satisfying (3.5). If the process is metrically transitive, $\Phi(C) = 0$ or $\Phi(C) = 1$. In fact if $\Phi(C) > 0$, and if $n(t)$ is the number of times $x(s) \in C$ for $0 < s \le t$, $n(t)/t \to \Phi(C)$, with probability 1, according to the ergodic theorem. Thus for almost all values of $x(0) = x$ ($\Phi$ measure), (3.5) is satisfied and in fact $x(t)$ will even take on values in $C$ for infinitely many values of $t$.
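The role of the series (3.19) can be seen numerically on a finite chain, where the law of the first entrance time $\nu$ is computable exactly. The following sketch makes that finiteness assumption; the matrix `P`, the target set `C`, and the function name `first_entrance_law` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F

# Hypothetical chain: state 0 is outside C, states 1 and 2 make up C.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(0), F(1, 2), F(1, 2)],
    [F(0), F(1), F(0)],
]
C = {1, 2}


def first_entrance_law(x, target, matrix, horizon):
    """P{nu = s} for s = 0..horizon, nu being the first s with x(s) in target, x(0) = x."""
    n = len(matrix)
    # mass[i]: probability of sitting at i at time s with no earlier visit to target
    mass = [F(int(i == x)) for i in range(n)]
    law = []
    for _ in range(horizon + 1):
        law.append(sum((mass[i] for i in target), F(0)))
        # only the paths still outside the target are carried one step further
        mass = [
            sum(mass[i] * matrix[i][j] for i in range(n) if i not in target)
            for j in range(n)
        ]
    return law


law = first_entrance_law(0, C, P, 6)
# remainder of the series (3.19) after the terms s = 0..n
remainders = [1 - sum(law[: n + 1]) for n in range(len(law))]
```

The list `remainders` is the tail that appears as the last sum in (3.18); in this toy case it decreases geometrically, so a moderate $n$ already makes it negligible.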
Proof of (d). If $x$ satisfies (3.4) and if $\alpha(\omega)$ is defined to be 1 when $x(0) \in A$ and 0 otherwise, the average in (3.3) reduces to that in (3.6). Since there is a limit in (3.6) for every $A$, $Q(x, A)$ is a probability measure in $A$. According to (b), $Q(x, A) = E\{\alpha\} = \Phi(A)$ if the process is metrically transitive. The equation

$$\begin{aligned} \frac{1}{\tau}\sum_{t=s+1}^{s+\tau} P\{x(0) = x;\ x(t) \in A\} &= \frac{1}{\tau}\sum_{t=1}^{\tau} P\{x(0) = x;\ x(s + t) \in A\} \\ &= \frac{1}{\tau}\sum_{t=1}^{\tau}\int P\{x(t) = y;\ x(s + t) \in A\}\,P\{x(0) = x;\ x(t) \in dG_y\} \\ &= \int P\{x(0) = y;\ x(s) \in A\}\cdot\frac{1}{\tau}\sum_{t=1}^{\tau} P\{x(0) = x;\ x(t) \in dG_y\} \end{aligned} \tag{3.20}$$

leads (when $\tau \to \infty$) to (3.7). This means that $Q(x, A)$ is a self-reproducing distribution which may or may not coincide with the given one, $\Phi(A)$. If $\Phi(A) = 0$,

$$Q(x, A) = \lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P\{x(0) = x;\ x(s) \in A\} \le \lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P_1(x, X, s) = 0. \tag{3.21}$$

Hence $Q(x, A)$ is absolutely continuous with respect to $\Phi$ measure, for fixed $x$. The absolute continuity may not be uniform in $x$.
Proof of (e). If almost all $x$ ($\Phi$ measure) satisfy (3.4), (3.7) must hold for almost all $x$. Hence for those $x$,

$$\int \frac{1}{t}\sum_{s=1}^{t} P\{x(0) = y;\ x(s) \in A\}\,Q(x, dG_y) = Q(x, A). \tag{3.22}$$

When $t \to \infty$ the integrand converges boundedly to $Q(y, A)$ for almost all $y$ ($\Phi$ measure) and therefore for almost all $y$ in the integration measure, because the latter is absolutely continuous in $\Phi$ measure. Integration to the limit is therefore permissible, and yields (3.8). It has already been seen in Theorem 1 that (3.9) holds for almost all $x$ ($\Phi$ measure). If all $x$ satisfy (3.4), (3.8) must hold for all $x$. Moreover, in

$$\begin{aligned} \frac{1}{\sigma}\sum_{s=t+1}^{t+\sigma} P\{x(0) = x;\ x(s) \in A\} &= \frac{1}{\sigma}\sum_{s=1}^{\sigma} P\{x(0) = x;\ x(s + t) \in A\} \\ &= \int \frac{1}{\sigma}\sum_{s=1}^{\sigma} P\{x(0) = y;\ x(s) \in A\}\,P\{x(0) = x;\ x(t) \in dG_y\}, \end{aligned} \tag{3.23}$$

the integrand converges boundedly to $Q(y, A)$ when $\sigma \to \infty$. Integration to the limit yields (3.9).
The following theorem gives conditions under which the Cesàro limits used up to the present can be replaced by ordinary limits.
THEOREM 4 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION PROBABILITIES AND NO ANGLE VARIABLES). In the following $\alpha$ is a $P$ measurable function satisfying (3.2) for all large $t_0$ and depending only on $x(t)$ for $t \ge$ const. (the constant may vary with $\alpha$).
(a) There is a $t$ set $I$ of relative measure 1$^{(11)}$, independent of $x$ and of $\alpha$, such that if $x$ satisfies (3.4)

$$\lim_{t \to \infty,\ t \in I} E\{x(s) = x;\ T_t\alpha\} \tag{4.1}$$

exists and is independent of $s$. If the process is metrically transitive, the limit is $E\{\alpha\}$.
(11) A $t$ set of relative measure 1 is a set with the property that (number of integers in the set between $n$ and $-n$)$/2n \to 1$ when $n \to \infty$.
(b) If (3.4) is satisfied by almost all $x$ ($\Phi$ measure), (4.1) can be strengthened:

$$\lim_{t \to \infty} E\{x(s) = x;\ T_t\alpha\} \tag{4.2}$$

exists for those $x$, and is $E\{\alpha\}$ if the process is metrically transitive.
With proper choice of $\alpha$, (4.1) and (4.2) become

$$\lim_{t \to \infty,\ t \in I} P\{x(s) = x;\ x(t) \in A\} = Q(x, A) \tag{4.1'}$$

and

$$\lim_{t \to \infty} P\{x(s) = x;\ x(t) \in A\} = Q(x, A). \tag{4.2'}$$

The specialization from (4.1) and (4.2) to (4.1') and (4.2') is clear, and will receive no further comment.
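The notion of a $t$ set of relative measure 1, as defined in footnote (11), can be made concrete with a small computation. The sketch below uses a hypothetical example set (all integers that are not perfect squares); neither this set nor the names `in_I` and `relative_measure` come from the paper, and the set is not claimed to be the set $I$ of the theorem.

```python
from fractions import Fraction as F
from math import isqrt


def in_I(t):
    """Hypothetical example set: every integer that is not a perfect square."""
    return t < 0 or isqrt(t) ** 2 != t


def relative_measure(n, member=in_I):
    """(number of integers of the set between -n and n) / 2n, as in footnote (11)."""
    count = sum(1 for t in range(-n, n + 1) if member(t))
    return F(count, 2 * n)


densities = [relative_measure(n) for n in (100, 10_000)]
```

For this example the ratio moves from $19/20$ at $n = 100$ to $199/200$ at $n = 10{,}000$, consistent with a limit of 1: the excluded squares are too sparse to matter, which is the sense in which a limit along such a set ignores a negligible family of times.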
Proof of (a). According to a theorem of von Neumann and Koopman [5, pp. 36-37] if the process has no angle variables, there is a $t$ set $I$ of relative measure 1, with the property that

$$\lim_{t \to \infty,\ t \in I} E\{\beta(\omega)\,T_t\alpha(\omega)\} \tag{4.3}$$

exists for all chance variables $\alpha$, $\beta$ with $E\{|\alpha|^2\} < \infty$ and $E\{|\beta|^2\} < \infty$. If the process is metrically transitive, the limit is $E\{\alpha\}E\{\beta\}$. Moreover the limit is unchanged if $\alpha$ is replaced by $T_s\alpha$ or if $\beta$ is replaced by $T_s\beta$. It will be convenient to write $P\{x(0) = x;\ x(t) \in A\}$ explicitly as the sum of its absolutely continuous and singular components with respect to $\Phi$ measure,

$$P\{x(0) = x;\ x(t) \in A\} = \int_A p(x, y, t)\,\Phi(dG_y) + P_1(x, A, t). \tag{4.4}$$

The conditional expectation in (4.1) can be written (for large $t$) in the form

$$\begin{aligned} E\{x(s) = x;\ T_t\alpha\} &= \int E\{x(u) = y;\ T_t\alpha\}\,P\{x(s) = x;\ x(u) \in dG_y\} \\ &= \int E\{x(u) = y;\ T_t\alpha\}\,p(x, y, u - s)\,\Phi(dG_y) + \lambda \\ &= E\{p(x, x(u), u - s)\,T_t\alpha\} + \lambda, \end{aligned} \tag{4.5}$$

where

$$|\lambda| = \Big|\int E\{x(u) = y;\ T_t\alpha\}\,P_1(x, dG_y, u - s)\Big| \le \operatorname*{L.U.B.}_{y}\ E\{x(0) = y;\ |T_{t-u}\alpha|\}\,P_1(x, X, u - s). \tag{4.6}$$

If $t \to \infty$, $t \in I$, the last expectation in (4.5) converges to a limit (put $\beta = p(x, x(u), u - s)$ in (4.3)); in fact although the integration hypotheses imposed by Koopman and von Neumann are not satisfied here, a simple approximation extends their result as required. Moreover it was remarked in the proof of Theorem 3 that our hypotheses on $\alpha$ imply that the L.U.B. in (4.6) is bounded in $t$ when $t \to \infty$, say with bound $K$. Hence when $t \to \infty$, with $t \in I$, the left side of (4.6) has oscillation at most $K P_1(x, X, u - s)$. By hypothesis this can be made arbitrarily small by choosing $u$ large. Hence the left side of (4.5) converges to a limit when $t \to \infty$, $t \in I$. If the process is metrically transitive the limit is $E\{\alpha\}$. This can be seen from the fact that the limit in (4.3) is $E\{\alpha\}E\{\beta\}$ in that case, or from the fact that the Cesàro limit is $E\{\alpha\}$ in that case, according to Theorem 3.
Proof of (b). Suppose now that (3.4) is satisfied by almost all $x$ ($\Phi$ measure). For large $t$ we can write

$$E\{x(0) = x;\ T_t\alpha\} = \int E\{x(s) = y;\ T_t\alpha\}\,P\{x(0) = x;\ x(s) \in dG_y\} = \int E\{x(0) = y;\ T_{t-s}\alpha\}\,P\{x(0) = x;\ x(s) \in dG_y\}. \tag{4.7}$$

According to (a) the integrand converges for almost all $y$ when $t \to \infty$ with $t - s \in I$. We let $t \to \infty$ with no restriction, choosing $s = s(t)$ so that

$$t - s \in I, \qquad s \in I, \qquad s \to \infty.$$

Now if $A_s$ is a set of $\Phi$ measure 0 satisfying the equation $P_1(x, A_s, s) = P_1(x, X, s)$, and containing the points of nonconvergence of the integrand in (4.7), the integrand converges boundedly everywhere on $X - \sum_s A_s$ and the integration measures converge. Hence

$$\lim_{t \to \infty}\int_{X - \Lambda} E\{x(s) = y;\ T_t\alpha\}\,P\{x(0) = x;\ x(s) \in dG_y\} \qquad \Big(\Lambda = \sum_s A_s\Big) \tag{4.8}$$

exists. Moreover if $K$ has the same significance as in the proof of (a),

$$\int_{\Lambda} E\{x(s) = y;\ |T_t\alpha|\}\,P\{x(0) = x;\ x(s) \in dG_y\} \le K P_1(x, X, s) \to 0, \tag{4.9}$$

so that the left side of (4.7) converges to a limit when $t \to \infty$, as was to be proved. As in (a), if the process is metrically transitive, the limit is $E\{\alpha\}$.
Theorems 3 and 4 were devoted to the asymptotic character of $E\{x(0) = x;\ T_t\alpha\}$ as conditioned by the validity of (3.4). If (3.4) is assumed true on an $X$ set of positive $\Phi$ measure, the decomposition theorem (Theorem 2) becomes essentially simplified. Only two cases will be discussed: when (3.4) is true for all $x$ (Theorem 5) and when (3.4) is true for almost all $x$ (Theorem 6). The paper is then concluded by a detailed analysis of the hypotheses of Doeblin and of Kryloff and Bogolioùboff as compared with the hypothesis that (3.4) is true for all $x$.
THEOREM 5 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION PROBABILITIES; EVERY VALUE OF X SATISFIES (3.4)). There are finitely or denumerably infinitely many disjunct sets $A_1, A_2, \cdots$, of positive $\Phi$ measure, and corresponding self-reproducing probability measures $Q_1(A), Q_2(A), \cdots$ with the following properties:

$$\Phi(N) = Q_j(N) = 0 \qquad \Big(N = X - \sum_j A_j\Big), \tag{5.1}$$

$$Q_j(A_j) = 1. \tag{5.2}$$

A probability measure $\Phi_1(A)$ is self-reproducing if and only if it can be put in the form

$$\Phi_1(A) = \sum_j a_j Q_j(A), \qquad a_j \ge 0,\ \sum_j a_j = 1. \tag{5.3}$$

In particular every $a_j$ is positive for $\Phi_1 = \Phi$ and more generally these constants are all positive if and only if $\Phi$ is absolutely continuous with respect to $\Phi_1$. Every $\Phi_1$ is absolutely continuous with respect to $\Phi$.
Each $Q_j$ process ($\Phi$ replaced by $Q_j$) is metrically transitive and the $Q_j$'s are the only self-reproducing distributions giving metrically transitive processes.
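The transition probabilities satisfy

$$P\{x(0) = x;\ x(t) \in A_j\} = 1, \qquad x \in A_j, \tag{5.4}$$

and, if $A \in \mathcal{F}_X$,

$$\lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P\{x(0) = x;\ x(s) \in A\} = Q(x, A) \tag{5.5}$$

exists for all $x$, with

$$Q(x, A) = Q_j(A) \quad (x \in A_j); \qquad Q(x, A) = \sum_j a_j(x)\,Q_j(A), \quad x \in N,\ a_j(x) \ge 0,\ \sum_j a_j(x) = 1. \tag{5.6}$$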
For $x \in A_j$, (5.5) can be strengthened as follows. To each $A_j$ corresponds a positive integer $d_j$. If $d_j = 1$,

$$\lim_{t \to \infty} P\{x(0) = x;\ x(t) \in A\} = Q_j(A), \qquad x \in A_j. \tag{5.7}$$

If $d_j > 1$, $A_j$ is the sum of $d_j$ disjunct sets $A_{j1}, \cdots, A_{jd_j}$ and there are probability measures $Q_{j1}, \cdots, Q_{jd_j}$ satisfying

$$Q_j(A) = \frac{1}{d_j}\,[Q_{j1}(A) + \cdots + Q_{jd_j}(A)], \tag{5.8}$$

$$Q_{jk}(A_{jk}) = 1, \qquad Q_j(A_{jk}) = 1/d_j, \tag{5.9}$$

$$P\{x(0) = x;\ x(t) \in A_{j,k+t}\} = 1, \qquad x \in A_{jk}.\ ^{(12)} \tag{5.10}$$

Finally

$$\lim_{s \to \infty} P\{x(0) = x;\ x(d_j s + t) \in A\} = Q_{j,l+t}(A), \qquad x \in A_{jl};\ t = 0, \cdots, d_j - 1, \tag{5.11}$$

$$Q_{j,k+t}(A) = \int P\{x(0) = y;\ x(t) \in A\}\,Q_{jk}(dG_y). \tag{5.12}$$
If (3.4) holds for all $x$, $Q(x, A)$ is defined by (3.6) for all $x$ and all $A \in \mathcal{F}_X$, and satisfies (3.8). According to a theorem of Blackwell [2, p. 567] since $Q(x, A)$ is absolutely continuous with respect to $\Phi(A)$ and satisfies (3.8), $X$ is the sum of disjunct sets $B_0, B_1, A_1, A_2, A_3, \cdots$, finite or denumerably infinite in number, with the $A_j$ of positive $\Phi$ measure, and there are probability measures $Q_1(A), Q_2(A), \cdots$ satisfying

$$\begin{aligned} &Q(x, A) = Q_j(A), \quad x \in A_j; \qquad Q_j(A_j) = 1; \\ &Q_j(A) > 0 \ \text{ if } A \subset A_j,\ \Phi(A) > 0; \\ &Q(x, B_0) = 0; \qquad \Phi(B_1) = 0. \end{aligned} \tag{5.13}$$
We shall only touch on those points which are not obvious in the light of Blackwell's theorem and the preceding theorems of this paper. Equation (0.2) implies that $\Phi(B_0) = 0$. We set $N = B_0 + B_1$ and shall absorb other sets of $\Phi$ measure 0 into $N$ also, as required below. Evaluating (3.8), and using (5.13) and the fact that $Q(x, A) = 0$ when $\Phi(A) = 0$,

$$Q(x, A) = \sum_j Q_j(A)\,Q(x, A_j). \tag{5.14}$$

This is (5.6), with an explicit evaluation of the $a_j(x)$. If $\Phi_1(A)$ is self-reproducing,

$$\Phi_1(A) = \int P\{x(0) = y;\ x(t) \in A\}\,\Phi_1(dG_y) = \int \frac{1}{t}\sum_{s=1}^{t} P\{x(0) = y;\ x(s) \in A\}\,\Phi_1(dG_y). \tag{5.15}$$

When $t \to \infty$ this becomes

$$\Phi_1(A) = \int Q(y, A)\,\Phi_1(dG_y) = \sum_j Q_j(A)\int Q(y, A_j)\,\Phi_1(dG_y) = \sum_j Q_j(A)\,\Phi_1(A_j), \tag{5.16}$$

which is (5.3). In particular if $\Phi_1 = \Phi$, $a_j = \Phi(A_j) > 0$. Since every $Q_j(A)$ is absolutely continuous with respect to $\Phi$, every $\Phi_1$ must be also. Any two self-reproducing distributions whose corresponding constants of combination $a_j$ are simultaneously zero define the same sets of measure 0. The only special feature about $\Phi$ is that all of the corresponding $a_j$'s are positive. If $\Phi$ did not have this property, (3.4) would hold only for almost all $x$, using $\Phi$ as a base measure.

(12) Here and in the following the second subscript $i$ in $A_{ji}$ is to be interpreted modulo $d_j$.
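The representation (5.3) and the evaluation $a_j(x) = Q(x, A_j)$ in (5.14) can be checked exactly on a small reducible chain. The sketch below is an illustration under the assumption of a finite state space; the matrix `P`, the vectors `Q1` and `Q2`, and the helper names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical chain: state 0 belongs to N, states 1 and 2 are the sets A_1 and A_2.
P = [
    [F(0), F(1, 3), F(2, 3)],
    [F(0), F(1), F(0)],
    [F(0), F(0), F(1)],
]
Q1 = [F(0), F(1), F(0)]
Q2 = [F(0), F(0), F(1)]


def step(row, matrix):
    """One transition: the row vector times the transition matrix."""
    n = len(matrix)
    return [sum(row[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def mixture(a1):
    """The combination a1 * Q1 + (1 - a1) * Q2 of the form (5.3)."""
    return [a1 * q1 + (1 - a1) * q2 for q1, q2 in zip(Q1, Q2)]


def cesaro_law(x, t, matrix):
    """(1/t) * sum over s = 1..t of the s-step law started at x, as in (5.5)."""
    n = len(matrix)
    row = [F(int(i == x)) for i in range(n)]
    total = [F(0)] * n
    for _ in range(t):
        row = step(row, matrix)
        total = [a + b for a, b in zip(total, row)]
    return [a / t for a in total]


phi1 = mixture(F(1, 4))
limit_from_0 = cesaro_law(0, 50, P)
```

Every mixture is self-reproducing here, and the Cesàro law started from the state in $N$ is itself the mixture with weights $1/3$ and $2/3$, the probabilities of ending in $A_1$ and $A_2$.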
Equation (3.6) becomes for $x \in A_j$

$$\lim_{t \to \infty}\frac{1}{t}\sum_{s=1}^{t} P\{x(0) = x;\ x(s) \in A\} = Q_j(A). \tag{5.17}$$

Hence by Theorem 1(d) the $Q_j$ process is metrically transitive. It is clear that the $Q_j$'s are the only self-reproducing distributions making the process metrically transitive. For example this follows from the fact that the $A_j$'s and sums of $A_j$'s are the only invariant sets, neglecting sets of $\Phi$ measure 0, that is, with every $Q_j$ vanishing. Since $A_j$ is invariant, (5.4) certainly holds for almost all $x$ in $A_j$ ($Q_j$ measure), and we have seen at the beginning of this paper that $A_j$ can be replaced by a subset of the same $Q_j$ measure for which (5.4) holds. The difference is put into $N$. We shall assume that this has been done.
Define $d_j = 1$ if the $Q_j$ process has no angle variables. Then (5.7) follows from Theorem 4. Now suppose that the $Q_j$ process has an angle variable, that is, that there is a $\Phi$ measurable function $f(x)$ (not identically zero on a set of $\Phi$ measure 1) and a real constant $\lambda$ satisfying

$$f[x(s)] = e^{i\lambda s} f[x(0)], \qquad s = 1, 2, \cdots, \tag{5.18}$$

neglecting $x(s)$ sets of $Q_j$ measure 0. Since $|f|$ is invariant, and since the $Q_j$ process is metrically transitive, $|f| \equiv$ const. on a set of $Q_j$ measure 1. It is no restriction to take this constant as 1. Then

$$f = e^{ig}, \qquad 0 \le g < 2\pi. \tag{5.19}$$

It follows that the $\Omega$ sets

$$\{x(0) \in B\}:\ \{a < \lambda s + g[x(0)] < b \pmod{2\pi}\}, \qquad \{x(s) \in C\}:\ \{a < g[x(s)] < b \pmod{2\pi}\} \tag{5.20}$$

are identical, neglecting sets of probability 0. Thus

$$Q_j(B) = Q_j(C) = P\{a < g[x(s)] < b \pmod{2\pi}\}, \qquad P\{x(0) = x;\ x(s) \in C\} = 1, \quad x \in B, \tag{5.21}$$

neglecting an $x$ set $N(a, b)$ of $Q_j$ measure 0. Let $N_1$ be the sum of the $N(a, b)$ for all rational $a$ and $b$. Unless $g$ is a function which takes only a finite or denumerably infinite set of values (aside from values assumed on a set of $Q_j$ measure 0), there is a sequence of rational pairs $(a_n, b_n)$, shrinking to a point, corresponding to a sequence of pairs of sets $(B_n, C_n)$ satisfying (5.21) and

$$B_1 \supset B_2 \supset \cdots, \qquad C_1 \supset C_2 \supset \cdots, \qquad Q_j(B_n) = Q_j(C_n) > 0, \tag{5.22}$$

such that $\prod_n B_n$ contains points in $A_j - N_1$. Let $x$ be such a point, and choose $s$ so large that

$$P_1(x, X, s) < 1. \tag{5.23}$$

Then

$$P\Big\{x(0) = x;\ x(s) \in \prod_n C_n\Big\} = 1, \qquad Q_j\Big(\prod_n C_n\Big) = 0, \tag{5.24}$$

which contradicts (5.23). Thus there is a constant $c$ which $f$ assumes on a set of positive $Q_j$ measure. The constants $ce^{i\lambda}, ce^{2i\lambda}, \cdots$ are also assumed on sets of the same $Q_j$ measure. This is impossible unless $e^{i\lambda}$ is a root of unity. We can suppose then that $e^{i\lambda}$ is a $d_j$th root of unity and that

$$f(x) = ce^{i\lambda},\ ce^{2i\lambda},\ \cdots$$

on disjunct sets $A_{j1}, \cdots, A_{jd_j}$ respectively, all having the same $Q_j$ measure. The sum of these sets is invariant, and therefore includes all of $A_j$ except possibly a set of $Q_j$ measure 0. Hence $Q_j(A_{jk}) = 1/d_j$. The remainder $A_j - \sum_k A_{jk}$ is absorbed in $N$, and $A_j$ is then reduced again as above, so that (5.4) will still hold. If $x(s) \in A_{jk}$, it follows that $x(s + 1) \in A_{j,k+1}$, that is, (5.21) becomes (5.10), valid for almost all points of $A_{jk}$. The sets $A_{jk}$ and $A_j$ are then modified to ensure the simultaneous validity of (5.4) and (5.10) without exceptional points. Unless there were a maximum value to the possible integers $d_j$, (5.10) would show that for almost all $x$ ($Q_j$ measure) and every $n$ there are sets $B_n$ with $P\{x(0) = x;\ x(s) \in B_n\} = 1$, $Q_j(B_n) < 1/n$. This contradicts (5.23). Hence there is a maximum $d_j$ and it is assumed from now on that this maximum has been chosen above.
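The cyclic behaviour described by (5.10) and (5.11) is easiest to see in the extreme case of a deterministic cycle. The sketch below assumes a hypothetical three-state chain with $d = 3$ (the matrix `P` and the function `law` are illustrative names, not from the paper) and compares the ordinary transition probabilities, their Cesàro average, and the probabilities along the subsequence $t = 3s + 1$.

```python
from fractions import Fraction as F

# Hypothetical deterministic 3-cycle 0 -> 1 -> 2 -> 0, so that d = 3.
P = [
    [F(0), F(1), F(0)],
    [F(0), F(0), F(1)],
    [F(1), F(0), F(0)],
]


def law(x, t, matrix):
    """The t-step law started at x, computed by repeated row-times-matrix products."""
    n = len(matrix)
    row = [F(int(i == x)) for i in range(n)]
    for _ in range(t):
        row = [sum(row[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return row


# Ordinary probabilities of being back at state 0: they oscillate, so no limit exists.
ordinary = [law(0, t, P)[0] for t in range(1, 7)]
# The Cesàro average over 30 steps, as in (5.5).
cesaro = sum(law(0, s, P)[0] for s in range(1, 31)) / 30
# Along the times 3s + 1 the law is constant, as in (5.11).
along_cycle = [law(0, 3 * s + 1, P) for s in range(1, 4)]
```

In this toy case the ordinary probabilities alternate between 0 and 1, the Cesàro average equals $1/3 = 1/d$ in agreement with (5.9), and the law along $t = 3s + 1$ sits entirely on the next set of the cycle.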
The transformation $T$ considered on the $Q_j$ process will be denoted by $T_j$. It has a point spectrum$^{(13)}$ consisting of 0 (corresponding to the fact that 1 is a characteristic function) and $2\pi n/d_j$ (corresponding to the fact that $f$ and its integral powers are characteristic functions) for $n = 1, \cdots, d_j - 1$. These points all have multiplicity 1, since $T_j$ is metrically transitive. Then $T_j^{d_j}$ has a point spectrum consisting of the single point 0, which has multiplicity $d_j$. Corresponding to this fact, the transition probabilities

$$P\{x(0) = x;\ x(d_j t) \in A\}, \qquad t = 1, 2, \cdots$$

determine (together with $Q_j$ measure) a stationary Markoff process which is not metrically transitive, because $A_{j1}, \cdots, A_{jd_j}$ are invariant, but which has no angle variables, and which becomes metrically transitive when reduced still further from $A_j$ to any $A_{jk}$. Then the limit $Q_{jl}(A)$ defined by (5.11) with $t = 0$ must exist, according to Theorem 4. Equations (5.11) and (5.12) are obtained when $s \to \infty$ in

$$\begin{aligned} P\{x(0) = x;\ x(d_j s + t) \in A\} &= \int P\{x(0) = y;\ x(t) \in A\}\,P\{x(0) = x;\ x(d_j s) \in dG_y\} \\ &= \int P\{x(0) = y;\ x(d_j s) \in A\}\,P\{x(0) = x;\ x(t) \in dG_y\}. \end{aligned} \tag{5.25}$$

(13) Cf. [5] for a discussion of the spectrum of a measure preserving transformation, that is, of the corresponding unitary transformation.

THEOREM 6 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION PROBABILITIES). If almost all $x$ ($\Phi$ measure) satisfy (3.4), there is a set $N_0 \in \mathcal{F}_X$ of $\Phi$ measure 0 such that

$$P\{x(0) = x;\ x(t) \in X - N_0\} = 1, \qquad x \in X - N_0, \tag{6.1}$$

and that if $X$ is replaced by $X - N_0$, and the field of sets $A \in \mathcal{F}_X$ by the field of sets $A_0 = A - AN_0$, the Markoff process with this space, these $X$ measurable sets, the transition probabilities (6.1), and the same $\Phi$ measure satisfies the hypotheses of Theorem 5. The decomposition of that theorem is therefore applicable to $X - N_0$.
Before proving Theorem 6, we present the following example. In Theorem 5 choose a self-reproducing measure $\Phi_1$ as defined by (5.3), and suppose that $a_1 = 0$. The present hypotheses are then satisfied, using the basic $\Phi_1$ measure, but Theorem 5 is no longer applicable since (3.4) is no longer true on $A_1$. The present theorem then gives the decomposition of $X - A_1$ but gives no information on $A_1$.
To prove Theorem 6 we need only note that if (3.4) is true on $X - N_0$, by a process already used repeatedly in the proof of Theorem 5 $N_0$ can be increased slightly to make (6.1) true. The theorem is now obvious.
Comparison between the condition that (3.4) be valid for all $x$ and the conditions of Doeblin and of Kryloff and Bogolioùboff. In the following it will be supposed that $X$ is a Borel set of finite measure in Euclidean space and that $\mathcal{F}_X$ is the field of its Borel subsets. The Lebesgue measure of a set $A$ will be denoted by $mA$. Doeblin's hypothesis is that for some $\varepsilon > 0$ and $t_0$

$$P\{x(0) = x;\ x(t_0) \in A\} \le 1 - \varepsilon \quad \text{if } mA \le \varepsilon, \tag{7.1}$$

uniformly in $x$. This condition is shown to imply the existence of the limit $Q(x, A)$ in (3.6) uniformly in $x$ and $A$. The use of Lebesgue measure here is somewhat confusing, since the proofs are equally valid for many other measures of Borel sets. In order to clarify this point, his condition will be put in a somewhat different form. In his case $Q(x, A)$ is a self-reproducing probability measure for each $x$, and the decomposition of Theorem 5 is valid; in fact only a finite number of sets $A_j$ can be present. Moreover, using the notation of Theorem 5, if $mA > 0$, and if $A \subset A_j$, $Q_j(A) > 0$ also. Define $\Phi(A)$ by

$$\Phi(A) = \sum_j a_j Q_j(A)$$

where the $a_j$'s are positive, have sum 1, and are otherwise unrestricted. Then $\Phi(A)$ is a self-reproducing probability distribution, and $\Phi(A) > 0$ if $mA > 0$, that is, $mA$ is absolutely continuous with respect to $\Phi(A)$. Condition (7.1) can therefore be put in the following form:
$$P\{x(0) = x;\ x(t_0) \in A\} \le 1 - \varepsilon \quad \text{if } \Phi(A) \le \varepsilon. \tag{7.1'}$$
Now if this condition is satisfied, and if $P_1(x, A, t)$ is the singular component of $P\{x(0) = x;\ x(t) \in A\}$ with respect to $\Phi(A)$, then certainly

$$P_1(x, X, t_0) \le 1 - \varepsilon \tag{7.2}$$

uniformly in $x$. The proof that $P_1(x, X, t)$ is monotone nonincreasing in $t$ now shows (using (7.2) in (3.12) with $s = t_0$) that

$$P_1(x, X, t_0 + t) \le (1 - \varepsilon)\,P_1(x, X, t). \tag{7.3}$$

Hence

$$P_1(x, X, nt_0) \le (1 - \varepsilon)^n \tag{7.4}$$

and thus (3.4) is true, and the convergence is even uniform in $x$ and exponentially fast. It is this uniformity of convergence to 0 which precludes the possibility of infinitely many sets $A_j$, and the still stronger condition (7.1') implies the uniformity of the limit equation (3.6).
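The passage from (7.3) to (7.4) is a plain iteration, and the resulting geometric bound is easy to tabulate. The sketch below does this with exact fractions; the value chosen for $\varepsilon$, the tolerance, and the function names are hypothetical and serve only to show the rate.

```python
from fractions import Fraction as F


def singular_bound(eps, n):
    """Iterate the one-step estimate (7.3) n times, starting from the trivial bound 1."""
    bound = F(1)
    for _ in range(n):
        bound *= 1 - eps
    return bound


def blocks_needed(eps, tol):
    """Smallest n with (1 - eps)**n < tol; assumes 0 < eps <= 1 and tol > 0."""
    n, bound = 0, F(1)
    while bound >= tol:
        bound *= 1 - eps
        n += 1
    return n


EPS = F(1, 4)  # hypothetical value of the Doeblin constant
bounds = [singular_bound(EPS, n) for n in range(4)]
```

With $\varepsilon = 1/4$ the bound on $P_1(x, X, nt_0)$ runs through $1, 3/4, 9/16, 27/64$; with $\varepsilon = 1/2$, ten blocks of length $t_0$ already push it below $1/1000$, whatever the starting point $x$.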
The formulation of Doeblin's condition in (7.1') explains the rather arbitrary use of Lebesgue measure in the original formulation. Lebesgue measure was a measure of Borel sets with the property that $mA > 0$ implied $Q_j(A) > 0$ for some $j$. Any other measure of Borel sets with this property would serve as well. In the same way the $\Phi(A)$ distribution in Theorem 5 was rather arbitrary (although it had to be self-reproducing); any self-reproducing distribution with the property that if any $Q_j(A) > 0$, then $\Phi(A) > 0$ would have served as well.
The condition of Doeblin is thus stronger than the hypothesis that there is a self-reproducing distribution and that (3.4) is true for all $x$. Doeblin's condition has the advantage that it does not require a priori the existence of a self-reproducing distribution. In some examples, however, the existence of such distributions is already known. This is true in the renewal process (see the following paper in this volume). Doeblin's condition is however essentially stronger and leads to stronger results where it is applicable. The condition of Kryloff and Bogolioùboff is somewhat weaker than that of Doeblin, as stated, but since it leads to exactly the same decomposition theorem, the two are essentially equivalent. In both cases (7.1') is true. The point is that the various possible choices for $X$ measure in Doeblin's condition bring it up to the generality of the Kryloff-Bogolioùboff condition.
As a somewhat trivial example which may however be illuminating, consider the Gaussian Markoff process described in the Introduction. In this case, as already remarked, there is a self-reproducing distribution, and $P_1(x, X, t) \equiv 0$, so that Theorem 5 is applicable. Actually (0.9) shows that there is metric transitivity and no angle variables; there is only one set $A_j$, and there are no cyclic transitions. The convergence in (0.9) is not uniform in $x$; this means that neither the Doeblin nor the Kryloff-Bogolioùboff conditions are satisfied.
BIBLIOGRAPHY
1. W. Ambrose, P. Halmos, and S. Kakutani, The decomposition of measures, II, Duke Math. J. vol. 9 (1942) pp. 43-47.
2. D. Blackwell, Idempotent Markoff chains, Ann. of Math. vol. 43 (1942) pp. 560-567.
3. J. L. Doob, Stochastic processes with an integral-valued parameter, Trans. Amer. Math. Soc. vol. 44 (1938) pp. 87-150.
4. J. L. Doob, The Brownian movement and stochastic equations, Ann. of Math. vol. 43 (1942) pp. 351-369.
5. E. Hopf, Ergodentheorie, Ergebnisse der Mathematik, vol. 5, no. 2.
6. S. Kakutani, Ergodic theorems and the Markoff process with a stable distribution, Proc. Imp. Acad. Tokyo vol. 16 (1940) pp. 49-54.
7. K. Yosida and S. Kakutani, Operator-theoretical treatment of Markoff's process and mean ergodic theorem, Ann. of Math. vol. 42 (1941) pp. 188-228.
8. K. Yosida, The Markoff process with a stable distribution, Proc. Imp. Acad. Tokyo vol. 16 (1940) pp. 43-48.
9. N. Wiener, The ergodic theorem, Duke Math. J. vol. 5 (1939) pp. 1-18.
10. (Added in proof.) A. M. Yaglom, The ergodic principle for Markov processes with stationary distributions, C. R. (Doklady) Acad. Sci. URSS. N.S. vol. 54 (1947) pp. 347-349. The author supposes that $P^{(s)}(x, A)$ is absolutely continuous with respect to a given self-reproducing distribution $\Phi(A)$, with a positive continuous density, and proves that then $\lim_{s \to \infty} P^{(s)}(x, A) = \Phi(A)$. This is a special case of Theorem 5.
UNIVERSITY OF ILLINOIS,
URBANA, ILL.