Asymptotic Properties of Markoff Transition Probabilities Author(s): J. L. Doob Source: Transactions of the American Mathematical Society, Vol. 63, No. 3 (May, 1948), pp. 393-421 Published by: American Mathematical Society Stable URL: http://www.jstor.org/stable/1990566 Accessed: 19/09/2010 01:00 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=ams. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. American Mathematical Society is collaborating with JSTOR to digitize, preserve and extend access to Transactions of the American Mathematical Society. http://www.jstor.org ASYMPTOTIC PROPERTIES OF MARKOFF TRANSITION PROBABILITIES BY J. L. DOOB Let X be any space of abstract elements,and let ax be any Borel fieldof X sets, includingX itself.Let P(t)(x, A) be a functionof xEX, A C8, and the positive real variable t, with the followingproperties: (a) 0 <P(t)(x, A) <P(t)(x, X) = 1; (b) for fixedx and t, P(0)(x, A) is a completelyadditive functionof sets A; forfixedA and t, P(t)(x, A) is measurable in x with respectto the fielda. (c) with a self-explanatorynotation forintegrationwith respect to a set function, (0. 1) P(8+)(x, A) = P(8)(y,A)P(t)(x, dGy), s, t > 0('). A functionwith these propertiescan be consideredthe transitionprobability functionof a Markoffprocess; it is the probabilityof a transitioninto A from initial positionx, in time t. The Markoffprocess is determinedfor t>to if a distributionof x at time t= tois assigned.The process is called stationary,or temporallyhomogeneous,ifthe initialdistributionreproducesitself,that is, if the set function4D(A)definingthe initial distributionsatisfiesthe equation (0.2) 4'(A) = f p(t) (X, A) 4(dGx). The probabilityformalizationof a stochastic process is now well known. In the presentcase the initial distributionand the transitionprobabilitiesare used to definea probabilitymeasure in the space of all functionsx(t), where t> to,and x(t) is a functionwhich takes on values in X. For example, to the set of functionssatisfyingthe conditionx(s) =A (s>to is fixed) is assigned as measure (0.3) P{ x(s) A} =f P(8)(x, A) (dGz), where 1(G) definesthe initialdistribution.If 1 is self-reproducing, the definition in (0.3) and the correspondingdefinitionsformore complicated sets betotheSociety, Presented November 25,1945,under thetitleMarkoff chains-denumerable case;receivedbytheeditorsFebruary27, 1947. (1) In thefollowing, the domainof integration will not be indicatedexplicitlywhenit is thewholespace. 393 [May J. L. DOOB 394 come independentof to,and the probabilitymeasure is then definedon the space of all functionsx(t), - oo <t < co ; in this case x(s), foreach s, has the 4) distribution.The transformationT, takingx(t) into x(t+s) is then a measure in functionspace, and the theoryof measure prepreservingtransformation servingtransformationsis applicable. In the following,it will be supposed that t runs throughintegralvalues only. Function space then reduces to sequence space. The results obtained will howeverbe applicable, with the obvious changes,when t runsthroughall I x(j) is replaced by the real numbers; for example the average (1/n) average (1/t)fOx(s)ds. If certainrestrictionsdiscussed below are imposed on the given transition probabilities,it followsthat I t lim -Y P(s)(x, A) = Q(x, A) (0.4) l40 t existsforall x. Moreover: (a) This limitis uniformin x and A. probabilitydistribution. (b) For fixedx, Q(x, A) definesa self-reproducing of disjunct sets, finite number a sum of as a (c) X can be expressed N, A1, A2, * * *, where P (t)(x, Ai) = { o,.5) O, Q(x, A) = Qi(A), 0. Q(x, N) x E Ai, xC Ak, k $ j; x C Ai, The sets A1, A2, * are called finalsets. If a simple type of cyclic transition is excluded,the Cesaro limitin (0.4) becomes an ordinarylimit(2). In order to explain the significanceof the conditions imposed on the transitionprobabilitiesto obtain these results,a familiarexample will be recalled. Let X be any abstract space, and let !x be a Bornelfieldof X sets (inof the points of X cludingX itself).Let Tx be any one-to-onetransformation which togetherwith its inverse takes Wxsets into Wxsets. Define Pt)(x, A) as follows: (0.6) P(t)(x, A) = 0 f if Ttx CA, - A. if ~TtxC X This functionsatisfiesthe three conditions (a), (b), (c), stated above, and thereforedefinesa Markoffprocess (in conjunctionwith-an initial distribution). The average in (0.4) becomes the average numberof times TtxCA as t and fordetailsoftheproof.Numbersinbrackets statement (2) See [7] fora morecomplete at theendofthepaper. referto thebibliography 1948] MARKOFF TRANSITION PROBABILITIES 395 increases. It is thus hopeless to expect that without furtherhypotheseson the transitionprobabilitiesaverages like that in (0.4) can behave in any more regular fashion than the correspondingaverages for point transformations. If some probabilitymeasure,that is, a measure with value 1 on the whole space, has been definedon the sets of W=,and if the transformationT is measurepreserving,the measureof X sets definesa self-reproducing distribution. Even the hypothesisthat a Markoffprocess has a self-reproducing distributioncan however at best make the averages in (0.4) behave like the correspondingaverages for measure preservingpoint transformations.The behavior detailed above, beginningwith the existence of the limit in (0.4) for all x, is actually far more regular than the correspondingbehavior for measure preservingpoint transformations.Now the transitionprobability P(t)(x, A) definedin (0.6) is forfixedx and t a highlysingularset function; it definesa distributionconcentratedat a point. The restrictionsimposed on the transitionprobabilitiesof Markoffprocesses by various authors can be consideredas smoothingrestrictionsimposed to make the representationof P(t) (x, A) in termsof a point transformation impossible. If thereis a self-reproducing distribution1, it will be shown that undera mild restrictionon the field ,, X can be expressedas the sum of at most a continuumof sets At and a remainderof 4Dmeasure 0, such that the A t's have essentiallythe properties(0.5). The presentpaper can be consideredas an examinationof the strengthening of this decompositiontheoremwhich is made possible by the impositionof weak additional restrictionson the transition probabilityfunctions. The theoremsof the present paper will be stated, for the most part, in termsof stationaryMarkoffprocesses. Note that this is the same as stating theoremson Markofftransitionprobabilitiesforwhichit is knownthat there exist self-reproducing distributions.Althoughit is not customaryforwriters on Markoffprocesses to presuppose the existence of a self-reproducing distribution,the existence is implied in their hypotheses,since each Qj(A) in (0.5) definessuch a distribution. The weakest conditionsimposed on the transitionprobabilitiesto insure the truthof (0.4) and the subsequent paragraphare those of Doeblin and of Kryloffand Bogolioiuboff. Conditionof Doeblin. The space X is the interval (0.1). The field Wxis the fieldof Borel subsets of this interval.It is supposed that thereis an s and a positivee such that P(s)(x, A) < 1 -e ifA has Lebesgue measure less than e. ConditionofKryloffand Bogoliou"boff. The space X and the fieldW,are as in Doeblin's condition.Let P(s) be the linear operatordefinedby the kernel P(s)(x, A) taking the Banach space 9 of completelyadditive set functions definedon W,into itself, (0.7) p = f P(8)(x,A)4(dGx). 396 J. L. DOOB [May (The normof an elementof 9Y is its variation.) It is supposed that thereis an s and a completelycontinuous linear operator V, also taking 9) into itself, such that P(8) - V has norm less than 1. Note that p(s) is the sth power of p(l)(3) Althoughthe conditionsof Doeblin and of Kryloffand Bogolioiuboff are weak to cover some of the simplestcases very weak, they are insufficiently in x of approach to the commonlyencountered.In these cases the uniformity limitin (0.4) is not true.One such example is furnishedby the Markoffprocess encountered in renewal theory (see the followingpaper in this volume). Another example is the Gaussian stationary process with correlationfunction pt, one of the simplest Markoffprocesses. Here X is the real line, W the fieldof Borel sets, and (0.8) P (tx A) p[27r(1 p2t)]12JA exp2(1 wherep is a constantbetween0 and (0.9) lim P(t)(x, A) = 1(4). - p2t) dy, Evidently 2J exp dy. In this case (0.4) holds as an ordinarylimit.It is clear howeverthat the limit is not uniformin x. While it is hardlynecessaryto invokethe generaltheoryof Markoffprocesses to treat this simple example, it is unsatisfactorythat the general theoryas developed up to the presentdoes not already cover it. There is a self-reproducing distributionin this Gaussian process,the distributiondefinedby the integralin (0.9), and withthis distributionthe Gaussian processcan be made stationary.The renewal processis anotherexample of a Markoffprocess,not covered by the hypothesisstated above, which has a self-reproducingdistribution. Under the conditions of Doeblin and of Kryloffand Bogolioiuboff, distribution.It Qj(A) in (0.5) is a self-reproducing is thereforenatural to study those Markoffprocesses which have such distributions,that is to study stationaryMarkoffprocesses,in order to extend the Doeblin-Kryloff-Bogoliotuboff results.The author,Yosida, and Kakutani have made the firststeps in thisdirection;referencesto thisworkwillbe given below. The notation used is, in summary: X is an abstract space of points x, y, (3) The space X and field _can be taken considerablymore generallyin both these conditions. The condition of Kryloffand Bogoliouiboffis somewhat weaker than that of Doeblin as stated, but it will be seen below that the two conditionsbecome equivalent ifother measures of Borel sets than Lebesgue measure are allowed in Doeblin's condition. (4) Cf. [4] fora detailed discussion of this process. The discussion covers the continuous parameter case, but the change to t integralis trivial. 1948] MARKOFF TRANSITION PROBABILITIES 397 are X sets; Wxis a given Borel fieldof these sets. In practiceX A, B, is usually a Borel set in Euclidean space with the topology determinedby Euclidean distance. For this reason no special notation is used to distinguish points and sets of the abstract space X frompoints and sets of numbers. Q is function(sequence) space as already defined;w is a point of Q, that is, a function(sequence) x(t). A, M, * * *are Q sets. , is the Borel fieldof Q sets generated by sets of the form {x(tj) CAi, j = 1, * * , n}, As E a. P(A) is the probabilitymeasure on the sets of a. determinedby a given distributionand the given transitionprobabilities.For this self-reproducing reason it is convenient to call the sets of a,, the P-measurable sets. (The parametert was chosen to be integral-valuedto avoid certain complications in P-measure. In the continuousparametercase a topology is introducedin X and regularitypropertiesare imposed on the transitionprobabilitiesto insure measurabilityof the stochastic process.) are complex-valued functions of co, that is, chance a(w), :(w), variables. a =fa(c)P(dA,) E*I is the expectationof a. a ,;A} and E {a; j3} are respectivelythe probabilityofA and the expectation of 3 conditionedby assigned a; they are functionsof a. It will sometimesbe convenientto writeP {a = x; A } and E {ax= x; 3} to stressa particular value assigned to a. The conditionalexpectationE {a;j } can be obtained as the integralof : in termsof the probabilitymeasure P{ a; Al}. of functionspace already defined.We shall someT78is the transformation times write T8a fora(T8w). 1 measureis the measureof X sets definedby a preassignedself-reproducing probabilitymeasure b(A). This will be used, togetherwith the transition probabilities,to definea stationaryMarkoffprocess,so that forevery s (0.10) 4(A) = Ptx(s) @Aj. Alternativelya stationaryMarkoffprocesswill be presupposedand b (A) will then be definedby (0.10). In this approach the transitionprobabilitiesmust be definedindirectly,and the fact that the definitionis ambiguous on sets of measure zero will cause complications.These will be discussed below. A P-measurable set A will be called invariant if TA and A differby at most a set of P-measure zero. A set A of a- will be called invariantif (0.11) P{x(O); x(1) C A} = 1 almost everywhere(P-measure), on the Q set {x(O) CA }, that is, if (0.11') P{x(O) = x; x(l) CA} = 1 [May J. L. DOOB 398 foralmost all x (1 measure) in A. If A is invariant (0.11) and (0.11') are also true with x(1) replaced by x s), forevery s >0. The set A is invariantif and only if the Q set {x(O) CA } is invariant. Moreover it is knownthat if a set A is invariantit differsby at mosta set of P-measure zero froma set of the form {x(O) CA }. This fact makes it possible to consideronly invariantsets of this simpletype. More generally [3, pp. 118-119] it is knownthat ifany P-measurable functiona(o) satisfiesthe condition Ta=eXla for some constant X, neglectinga possible exceptional set of P-measure zero, then a differson at most a set of P-measure zero from a P-measurable functionof the form f[x(O)]. A functionsatisfyingthis equation with X = 0 is called an invariant function;if eiX#1, log a when suitably definedis called an angle variable. A stationaryprocess (Markoffor not) is said to be metricallytransitive ifthereis no invariantP-measurable functionnot constantwithprobability1. In the presentcase, the x(t) process is a Markoffprocess,and it has already been noted that in this case, in discussingboth metrictransitivityand angle variables, it is sufficientto considerfunctionsof x(O). In all cases, when investigating metric transitivity,it is sufficientto consider functionstaking on only the values 0 and 1, that is, characteristicfunctionsof point sets. A Markoffprocess is thus metricallytransitiveif and only if every invariant 1 measurable set A has 1 measure either0 or 1, that is if and only if every Q set of the form {x(s) CA } (fixed s, A 1?measurable) which is unchanged, neglectingsets of zero P measure,by a change of s, has P measure 0 or 1(5). The theoremsof the present paper concern conditional expectations of the formE { x(O) = x; Tta }. These conditionalexpectationsare consideredat firstas functionsof x forfixeda, and the almost everywherepropertiesof the functionsfort-* oo are investigated. These results would remain valid if for each s the conditional expectations were changed on a set of probability0, and in fact these conditional expectations are ordinarilyso defined that thereis ambiguityon sets of zero probability.Sometimes,however,individual values of x will be consideredand such an ambiguitywill be inadmissible. It will then be supposed that the conditional probabilitiesare regular. By this is meant that transitionprobabilities P{x(s) = x; x(s+ t) CA} = P(t)(x,A) are given,as uniquelydefinedfunctions,satisfyingconditions(a), (b), and (c) of the firstparagraphof this paper. If a is a chance variable dependingon the functionsx(t) fort> s, E { x(s) = x; a } is then unambiguouslydefinedas the integralof a in termsof the conditional probabilitymeasure forx(s) =x. The hypothesisof regularityof the transitionprobabilitiesis not so strong a restrictionas mightappear at firstsight.In the firstplace Markoffprocesses are most frequentlygiven in terms of uniquely defined transition prob(5) The general background of ergodic theory,includingconcepts like metrictransitivity, and so on, is contained in [5]. The specificapplication to Markoffprocesses is given in [3]. 19481 MARKOFF TRANSITION PROBABILITIES 399 abilities; in this case regularityis intrinsicin the definition.In the second place even ifthe transitionprobabilitiesare not so given,it is usually possible to redefinethem to be regular. In fact, this is always possibleif thefield a, is strictlyseparable(6).This hypothesisis always satisfied,forexample, when ! consistsof the Borel subsets of a Borel set in a Euclidean space. The proof when , is strictlyseparable goes as follows. of this possibilityof redefinition Althoughfora given A E j, P {x(0) =x; x(1) EA } is not a uniquely defined functionof x, it can be so definedforeach A Ez that, excludingan X set N independentof A, with (0.12) P { x(0) C N} N C ax, = 4(N) = 0, is for fixedx a probabilitymeasure in A [3, p. 96]. P{x(0)=x; x(l)EA} DefineP(1) (x, A) as this functionforx EX - N, and as any probabilitymeasure in A forxEN. The functionPt)(x, A) definedinductivelyby (0.13) P(t) (x, A) = P(t-)(y, A) P(2) (x, dGV) is a regulartransitionprobabilityof the process. The above discussion made essential use of the hypothesisthat a. was strictlyseparable. In the followingit will frequentlybe desirable to perform integrationsof the type (0.14) ff(y)P{ x(0) = x; x(1) C dG} withoutany assumptionof regularityimposed on the transitionprobabilities. It is known that this is permissibleand that the usual rules of integration, taking the limit under the integralsign,and so on, apply, and the definition of (0.14) is in termsof the usual Lebesgue sums [3, p. 98]. This convention will be used below, for integrationwith respect to P { x(0) = x; x(t) CA measure or any other conditional probability measure with no further comment. We shall need the followingfact below. If thetransitionprobabilitiesare regularand if A1, A2, * are invariant,thereis a set N of 1 measure0 such that (0.15) P{x(0) = x; x(t) EA-Ai N} = 1, ifx C As-A1N, j > 1, t > 1. (6) A Borel fieldof sets is called strictlyseparable if it is the Borel field generated by a finiteor denumerablesubcollectionof its sets. Note added in proof:The italicized statement is incorrect;the proofoutlinedis based on Theorem 3.1 [3, p. 96] whichis incorrect.(It was tacitly assumed in the proofof this theoremthat the t image of Q had outer F measure 1; this is not necessarilytrue, although it is true if the t image is a Borel set.) The italicized statement is however correct if, forexample, !, consists of the Borel subsets of a Borel set in a Euclidean space. The discussion below of (0.14) is correct,with a slightlymodifiedjustification. 400 J. L. DOOB [May In fact thereis a set N1 of 1 measure 0, independentofj and t, such that P{x(O) = x; x(t)C Ai} = 1 if xC As-AiN1, and a set N2DN1 of 4) measure0, independentofj and t,such that PIx(O) = x;x(l)Cz As-AiN,} = 1 if xCA -A N2, Then (0.15) is true with N=,jNj. THEOREM 1 (STATIONARY PROCESS). (a) Let a(X) be any P-measurablefunc- tion,and suppose thatfor some p > 1 El Ia(IP} < oo. (1.1) Then theaverage i t -E E{x(O); t s=1 (1.2) T,a} converges in themean withthesame index p to a functiona*: (1.3) -E lim E -a E{ x(O); Ta} a* = 0. (b) If p > 1, or moregenerallyif (1.4) Et aIx log+ I a < 00(7 thelimitexistspointwise: (1.5) lim-E t-0 t e=1 E{x(O) = x; Ta} =a* exceptfor an x set of 1 measure0. If A Ez (1.5) can be specializedto (1.6) i t -E t = theaveragesin (1.2), (1.3), and P{ x(O); x(s) E A}. (c) If theprocessis metricallytransitivethe limitax*is E {a } whichreduces to P {x(O) GA } = J(A) whent--oo in (1.6). Conversely if a* =E {ca} for every a theprocessis metricallytransitive. (d) Suppose thattheprocessis a Markoffprocessand letQ(x, A) be thelimit in (1.6) whent-> oo. Then Q(x, A) is theconditionalprobabilityof { x(O) CA relativeto thefieldofinvariantsets; thatis (1.7) (7) fQ(x, A) (dGx) = 1(A B) t log+IJa=0 ifIa! <1. log+Ial =log Ial ifIaJ21, 19481 MARKOFF TRANSITION PROBABILITIES 401 if B is invariant.More generally,if a dependsonly on x(0), x(1), a*, is theconditionalexpectationofa relativeto thefieldofinvariantsets. Thefunction Q(x, A) satisfiestheequations: =fP{ x(0) = y; x(t) C A}Q(x; dGV), (1.8) Q(x,A) (1.9) Q(x, A) = (1. 10) Q(x, A) = Q(y, A)P{x(O) = x; x(t) dG}, f Q(y, A)Q(x, dGv), foralmostall values of x (4 measure). The exceptionalset varieswithA and the if and choiceoftheconditionalprobabilitiesinvolved.Thereis metrictransitivity onlyif Q(x, A) = cI$(A)for almostall x (4i measure)for each A. (e) Suppose thatthe process is a Markoffprocess with regulartransition probabilitiesand suppose thatC = C(a) is thex set on which(1.5) is true,where a is boundedand depends only on x(O), x(1), * . Then C includes everyx satisfying (1.11) P{x(0) [X(t) c = X; E tO0 C4 = 1. Note that (1.6) is the counterpartof (0.4). Althoughthe limitwhen t-> co in (1.6) may not exist on an x(O) set of 1 measure0, in manyapplications the transitionprobabilitiesare regular and the possibilityof exceptional points of nonconvergencecan be eliminated by the last statementof the theorem. Various special cases of this theorem have been proved by the author [3, p. 114], by Yosida [8], and by Kakutani [6]. It has always been supposed hitherto,however, in proving (1.3) and (1.5) that the process was Markoff,and, in findingpointwise limits,that a was bounded. It will be clear fromthe proof that the conditional expectations for preassignedx(0) can be replaced in (a), (b), and (c) by any other conditional expectations withoutalteringeitherthe validity of the theoremor its proof.It would be interestingto prove the existence of the pointwiselimit in (1.5) for any a withE { ja j } < oo,or to obtain a counterexample. Proof of (a). Under the stated hypotheses,E { j } < oo. Hence according to the point ergodic theorem (strong law of large numbers for stationary processes) (1.12) 1 lim-Z t--+ t 8=1 T,,a = a, exists with probability 1, and according to the mean ergodic theorem if E{I aIP}< oO forsome p>1, it followsthat E{ fa,IP} < oo and there is con- [May J. L. DOOB 402 vergencein the mean of order p. Hence, in this case, when t-* oo, t -E E{13 E{x(O); t0x(); a, T8a} -E 1 = ( ) a?i {Et E {|Ex(O; =E{ P) | ]}2 P tEE E{x(0);[-Tsaa-al] ?EEx0;a (1.13) II 8- 0. <=E{-ET8a-a1lo Thus (1.3) holds, with a* =E { x(O); a, }. to show that if p-=1 and Proof of (b). To prove (1.5) it will be sufficient if (1.4) is true then (1.12) can be integratedto the limit (t-* oo) with respect to the conditionalmeasure E {x(O); A}. This will be shown by showingthat (in termsof this conditional measure) excludingperhaps an x(O) set of zero measure there is convergence in (1.12) almost everywhere,and that the averages are dominatedby an integrablefunction.(It has already been noted that this procedureis correcteven though strictlyspeaking the conditional probabilitymay not definea measure.) Accordingto (1.12) the bracket converges to a,1except on a set M with P {M } = 0. Since (1.14) fr{x(O) = x; M}4(dGx) = P{M} = 0 it followsthat P{x(O); M} =0 for almost all x(O) (4 measure). That is to say, excludingthe exceptionalvalues of x(O) the limitin (1.12) exists almost everywhere(P{x(O); A} measure). Moreover according to a theorem of Wiener [9, p. 27] (1.4) implies that (1.15) { ~~~~t t = } Hence, excludingan x(O) set of c1 measure 0, (1.16) E{x(O); L.U.B. - T8a < oc. This means that, excludingan x(O) set of P-measure 0, the average in (1.12) is dominatedby an integrablefunction,as was to be proved. Proofof (c). Accordingto the ergodictheoremthereis metrictransitivity if and only if a, =E {ta} forall a, in which case, then,a* =E {t } also. If a is definedto be 1 when x(O) CA and 0 otherwise,the average in (1.2) becomes that in (1.6). The limitis thenE {oa} = c1(A) whenthereis metrictransitivity. 19481 MARKOFF TRANSITION PROBABILITIES 403 Proofof (d). Let Q(x, A) be the limitin (1.6). To prove that Q(x, A) is the stated conditionalexpectation,that is, that (1.7) is true,we note that since the processis a Markoffprocess,and since B is invariant, fA-E P{ x(O) = x; x(s) E A}I4K(dGx) it it =-- t 8=1 P{ x(s) E B, x(s) C A} P{ x(O) G A B } = D(A *B). When t-* oo,this becomes (1.7). The identificationofa* as a conditionalprobability is made in the same way. The equalities (1.8), (1.9) and (1.10) are "almost everywhere"equalities. Hence to show their truthit is sufficientto show that integrationover an arbitrary ax set B yields equality in each It is also case. This is clear in (1.9), using the fact that 4 is self-reproducing. it is clear Now set. clear in (1.8) and (1.10), using (1.7), if B is an invariant is of and this function invariant, that x(O) fromthe definitionof Q(x(O), A) in be B to invariant testing (1.8) thereforethat it is permissibleto restrict and (1.10). Thus the threeequalities are completelyproved. Finally, if there is metrictransitivitywe have already seen that Q(x, A) = 4(A) foralmost all x (4 measure). Conversely,if this equation is true, and if A is invariant, 4'(A) = 1 or 0, since the conditional probabilityQ(x, A) is the characteristic functionof A if A is invariant. Hence the process is metricallytransitive. Moreover it is clearly sufficientif Q(x, A) = 4(A) on a collection of sets so large that the Borel fieldthey generate coincides with a,. Proofof (e). Suppose that ja I < K, thata depends onlyon x(O), x(1), . . that (1.5) is true for xGC, and that the process is a Markoffprocess with regulartransitionprobabilities.Let x satisfy(1.11), that is to say it is supposed that startingfromx(O) =x, x(s) takes on a value in C forsome sO 0 with probability1. We show that then xCC. Let v be the firstvalue of s with x(s) EEC. Then if we use the fact that the process is a Markoffprocess, (s), and that the equality v = s is a conditionon x(0), * (1.18) E I x(O) = x; T,a} 2=Ef i=o {v=i) T.caPI x(0) = x; dA}, where,ifs _j, thejth integralin (1.18) can be writtenin the form (1.19) J E{x(j); T8a}P{x(O) = x; dA}. 404 J. L. DOOB [May Hence it E{x(0) t Ei= T,a} -x; t t (1.20) ZJ j {= - t - IE x(j); Ta}P{x(0) t 8=j = x; dA} f +-E eJ TsaP{x(0) = x;dA}. Now the second sum is at most K t -Z P{X(o) (1.21) t 8=1 = X;v>s} which goes to 0 when t---o, since P{x(0)=x; v>t} >-0. The integrandin thejth integralof the firstsum in (1.20) is bounded by K and, since x(j) EC whenv =j, convergesto a limitwhen t- X. The integral,and hence the sum, thereforealso convergesto a limit; this finishesthe proof. It is a remarkablefact,due in the firstplace to von Neumann, and proved later more generallyby Ambrose,Halmos, and Kakutani [1], that any stationaryprocess (with mild restrictionson the fieldof measurable sets) can be decomposed into metrically,transitiveprocesses. Although the various decomposition-limit theoremson Markoffprocessesare merelyreflections of this fact (the proofsvary with the special character of the hypothesesand the predilectionsof the authors) the exact adaptation of the decomposition theoremto Markoffprocesseshas neverbeen stated explicitly.This is done in the followingtheorem,whose proofwill be given in detail; the theoremis not a corollaryof the generaltheorembecause strongerresultsare desiredin this special case. THEOREM 2 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION Let Q(x, A) be theconditionalprobabilityof A (Cz ) withrespecttothefieldofinvariantsets,so that(by Theorem1) PROBABILITIES). (2.1) i t lim t-+ t 8- P x(0) = x; x(s) A} -Q(x, A) for almostall x (4 measure). If thefieldax is strictly separable,Q(x, A) can be so definedthatthefollowing is true(8).Thereis a collectionof sets {A,,} whereA,.C x and A variesthrough (8) Added in proof: (Cf. footnote6.) It is necessary to strengthenthis hypothesis; it will be supposed that a. consists of the Bore] subsets of a set in a Euclidean space. With this restriction,the discussion of the unique definitionof Q(x, A), involved in the definitionof A., is correct. 1948] MARKOFF TRANSITION PROBABILITIES 405 some set of real numbers,withthefollowingproperties: (a) EverysetA EC whichis a sum of setsA, is invariantand (neglecting sets of 4 measure0) everyinvariantset can be expressedin thisway. 4(N) = 0, (b) If xCX-N, where N = X- ZA,. Q(x, A) is a probabilitymeasurein A C Q(x, A) = Q,(A), (2.2) (2.3) (2.4) satisfying f x C A, Q, (A,) = 1, Q(x,A)4(dGx) = 4(A). (c) Each probabilitymeasureQ,.(A) is self-reproducing, (2.5) P{ x(O) = y; x(1) E A } Q,(dGy) = Q(A), and moregenerallyif T(dgu) is any probabilitymeasurein ,u space, (2.6) (D(A) = f Qj(A)T(dA) is a self-reproducing probabilitymeasure. Converselyif 4b1(A) is self-reproand ducing absolutelycontinuouswithrespectto cJ?(A)it can be put in theform (2.6). If, for some,u,1(A,.) >0, thenQ,(A) is proportionalto cJ?(A .A,), Q,.(A) - 4(A .A )/4)(A,,). (d) P { x(0) = x; x(t) CA, }1 everywhere on A, and for each g and each A Cx, (2.1) is truefor almostall x (Q,,measure). The Markoffprocesswith c1 replaced by Q, is metricallytransitive.The original process is metrically transitive if and onlyif someA,/has 4 measure1. The analogues of (1.9) and (1.10) have been omittedhere,since they reduce to trivialidentities.Conditionsare given below whichexclude the possibility that any A, have zero 1 measure. Under these conditions there can be at most denumerably many of these sets. The strongerconditions of Doeblin and of Kryloffand Bogolioiuboff imply that there can be at most a finitenumber. Definitionof A,. and proof of (a). It has already been remarked that Q(x(O), A) is an invariantfunction,for fixedA. This implies that the x set definedby Q(x, A) <k is an invariantset. In the followingwe shall suppose that , is strictlyseparable, specificallythat it is the Borel fieldgenerated by the sets Ci, C2, * * * , CjCax. It can then be supposed that the condi- tional expectationQ(x, A) is so definedthat, for fixedx, Q(x, A) is a probabilitymeasurein A. It will be supposed below that Q(x, A) is so defined,and [May J. L. DOOB 406 be that this functionis then a uniquely given functionof x and A. Let X of the collection denumerable by sets generated the Borel fieldof invariant sets of the form {Q(x, Cn) < r}, (2.7) n = 1, 2, ; r rational. If A is an invariantX set, we have seen that Q(x, A) is the characteristic functionof A, neglectingsets of 4 measure 0. Hence, neglectingsets of 4 measure 0, A is the set definedby the equality Q(x, A) = 1. Since A is in the this equality definesa set in a-, neglectBorel fieldgeneratedby C1, C2, ing sets of I) measure0. Hence a-, althoughit may not include A, includes a fromit by at most a set of 4 measure 0. set differing The sets of a- which contain no propernon-nullsubsetsin a- are called . They can be exhibitedexplicitlyin the form the atoms of (2.8) D', = ? 1, Di = Dj, D3l = X -Di are the sets in (2.7). The non-nullintersectionsin (2.8) where D1, D2, are the atoms. The cardinal numberof these non-nullintersectionsis at most that of the continuum,and we can thereforedenote them by {A, } where , ranges throughsome set of real numbers. In the followingwe shall find it necessaryto drop some of these atoms, puttingthem in a set N of 4 measure 0. Since every set of W is composed of atoms, every invariant set differs froma sum of A,'s by at most a set of 4 measure 0. The converse,that every A, sum is invariant if it is a set in a- is a consequence of (b), to be proved below. In fact if A Ga. is a sum of A,'s, where , ranges over some set M, QM(A)= 1 when piCM and Q,(A) = 0 otherwise.Hence the set A is definedby the equality Q,(A) =1, that is, neglectingsets of 4kmeasure 0, A is defined by the equality Q(x, A) = 1. This definitionhas already been seen to define an invariantset. Proof of (b). Since Q(x, A) is j5- measurable forfixedA, it depends only on the atom in whichx lies. Equation (2.2) is merelythe analytic formulation the conditionalprobabilityQ(x, A) is 1 almost everyofthisfact. If A C where in A (4 measure); that is to say, using (2.2), Q(x, A) = 1 in every A, except in some constitutinga set of 4 measure 0. Drop these atoms fromthe A, forevery set A definedby (2.7). If this is done, (2.3) will be true. Equation (2.4) is obtaiinedby integrating(2.1) with respect to 4 measure. Proof of (c). According to Theorem 1, (1.8) is true for almost all x ('1 measure), that is, (2.5) is trueexcept possiblyfora set of ,ucorrespondingto atoms of total 4 measure 0. Absorb these in N forall sets A = C1,C2, * . Then (2.5) will be true forall A Ga.. If (2.6) is used to definea probability measure 411(A),integrationof (2.5) in ju (J measure) gives the equation showConverselysuppose that 4) is any self-reproing that (D is self-reproducing. ducing probabilitymeasure. Then 1948] (2.9) MARKOFF TRANSITION PROBABILITIES 407 -E fP { x(0) = y; x(s) E A}I )(dG,) = 4)1(A). t 8-1 When t-*oo the integrandconvergesto Q(y, A), except possiblyon a set of 4) measure 0. If (b, is absolutely continuouswith respect to 4), the exceptional set has at most zero 4b,measure also; when t-- oo, (2.9) becomes f Q(y,A)4)1(dGy)= bP(A), (2.10) which is a special case of (2.6), since 4)1(dGy)induces a measure in A space. In particular,suppose that 4)(A,) > 0. Then by (2.4), 4)(A -A,) = Q ,(A) ) (A,). Proofof (d). It was proved in the introductoryremarksof this paper that (since the sets D1, D2, - * - defined by (2.7) are invariant) there is a set Do =1 forxCDj-DjDo. of 4) measure 0 such that P{x(0) =x; x(1)CDj-DjDo} We shall now suppose that H is the fieldof sets generated by Do, D1 ... (ratherthan by D1, D2, ***). None of the previous argumentsrequire any change because of this. It is now true,however,that P{x(0) =x; x(1) CA } = 1 forxCA ifA Eza and ifA CX-Do. Then the firststatementof (d) is true if Do, which is a sum of atoms, is incorporatedin N. Equation (2.1) is true for each A foralmost all x (4bmeasure) and hence, using (2.10), if N1 is the ex= 0 for ceptional set, Q(y, N1) = 0 foralmost all y (4bmeasure). Hence Q,M(N,) all , except some correspondingto A,'s of total 4) measure 0. Absorb these in N foreach A = C1,C2 * - -. We shall showbelowthat then,foreach /.Z and each A C a, (2.1) is trueforalmost all x (Q, measure). Accordingto Theorem 1, the processwith 4) replaced by Q, will be metricallytransitiveif (2.11) i t lim-EP{x(0) t-l0 t 8=1 = x; x(s) EA} =Q;,(A) is true foralmost all x (Q, measure), forevery A. In fact, accordingto a rethat (2.12) hold fora collecmark made in provingTheorem 1, it is sufficient is This the of A field tion sets determining preciselywhat we have shown, a,. = . . . Hence reduced transitive.It the withA C1,C2, * processis metrically for that is holds almost all then followsfromTheorem 1 that (2.11), (2.1), is x (Q, measure) foreach A ECa If the given process metricallytransitive, the invariantX sets have 4) measureeither0 or 1. It is then clear from(2.8) that one atom musthave 4) measure 1. This atom can neverbe absorbed in N. Converselyif some A, has 4) measure 1, say A, then Q,(A) = 4)(A), and the 'reduced" process,which has been shown to be metricallytransitive,is preciselythe originalprocess. In general,it is not difficultto show that the / can be so chosen that any Borel set of ,t's correspondsto a set of atoms which constitutea set of x, and converselyany W set can be expressedin this way (ignoringthe atoms [May J. L. DOOB 408 in N). This gives an elegant parametrizationof the invariantsets. The followingtheoremsmake Theorem 2 more precise by imposing restrictionson the transitionprobabilitieswhichmake it possible to discuss the truthof (2.1) forindividuallyspecifiedvalues of x. THEOREM MARKOFF 3 (STATIONARY PROCESS WITH REGULAR TRANSITION Let Pl(x, A, t) be the singular componentof the measure P {x(O) = x; x(t) CA } withrespectto 41measure.In thefollowing,a(co) will be any P measurablefunctiondependingon x(t) for t_ 0, and eitherboundedor at least with PROBABILITIES). E{x(O) = X; Ta (3.1) }< oo, xcX,t> 1 and = Y; L.U.B.E{x(O) (3.2) I Tt0 } < oo for someto>0. in t,forfixedx. (a) Pl(x, X, t) is monotonenon-increasing (b) Withtheaboveconventionon a, i t 1 8=1 lim -E{x(O) (3.3) t >?? = x; T8a} existsfor all x satisfying rim P1(X, X, t) = 0. (3.4) If theprocessis metricallytransitive,thelimit in (3.4) is E { a (c) If C is thex set on which(3.4) is true,C includeseveryx satisfying P{x(O) = X; E [X(t)C C]} = 1. (3.5) t20 If (3.4) is satisfiedon an X set of positive f measure,and if the process is (3.4) is satisfiedbyalmostall values ofx (4 measure). metricallytransitive, (d) If x satisfies(3.4), and if A C x, i (3.6) 1im-E t- *0 t t 8=1 = x; x(s) CA} P x(O) I = Q(x, A) exists. The setfunctionQ(x, A) is a probabilitymeasure whichis absolutely continuouswithrespectto 4 measureand is self-reproducing, (3.7) P { X(0) = y; x(t) E A } Q(x, dGv) = Q(x, A). If theprocessis metricallytransitive,Q(x, A) = 4?(A). MARKOFF TRANSITION PROBABILITIES 1948] 409 (e) If almostall x (b measure)satisfy(3.4), Q(x, A) (forthosevalues ofx) satisfies f Q(y,A)Q(x,dGy)= Q(x, A) (9) (3.8) and also satisfies Q(y, A)P { x(O) = x; x(t) E dGy} (3.9) = Q(x, A) for almostall x (4 measure)(10). If all x satisfy(3.4), (3.8) and (3.9) are true for all x. Proof of (a). For each t let At be a set of 4 measure 0 forwhich P{x(O) = x; x(t) C A} = Pi(x, X, t). (3.10) The set At depends on x and t, but x in the followingwill be held fast. Then s+ t) Pi(x,X, =fP{x(t) = y; x(s + t) E A8+t}P{x(O) = x; x(t) E dGy}, s,t > 0. Since 4b(A,+t)= 0, the integrandvanishes for almost all y (4) measure); if B is the set on which it does not vanish, fP{x(O) Pi(x, X, s +t) (3.12) = P{x(O) = = x; x(t) E dGy} x; x(t) E B} < Pi(x, X, t), as was to be proved. It is interestingto note, although we shall not use the fact explicitly,that P1 obeys the same functional equation (0.1) as the transitionprobabilitiesthemselves. Proofof (b). The problemis that of justifyinggoing to the limitunderan integralsign. In fact in EE{x(O) (3.13) = x; Ta} =Ex(O) = x;-Z T8a the average (1/t) Et=, T8a convergeswithprobability1. AccordingtoTheorem 1 integration to the limit is justified for almost all x (4 measure) if E a log+ Ia I } < co. The present hypotheses are of a differenttype. If f (9) The integrationis meaningfuleven though Q(y, A) is not definedon a y set N of ci measure 0 because, accordingto (d), N is also of Q(x, dGy)measure0. (10) In this integralthe integrandis again undefinedon a set of ci measure 0, but this has integrationmeasure 0 for almost all (ci measure) values of x. 410 J. L. DOOB [May 0<s?t-to, - i t t E{x(O) i==1 -x;Tia} = f E{x(s) (3 .14) o + E{x(O) Ti y;- = =x;- q+ to-1 ~~~1 t E+ j-1 P{x(O) =x; x(s) Ezd } T;a} Now by hypothesis(3.2) is true,and it followseasily that (3.2) is true with toreplaced by any t> to,with the same bound, say K. Then thejth integrand in (3.14) is bounded by K. Accordingto Theorem 1 this integrandconverges in the mean of order 1, and thereforein probability,that is, in c1 measure. Unfortunatelyb measure is not the relevant integrationmeasure. However if At is definedas in the proof of (a), the integrand converges on X-Ae in the integrationmeasure. Integrationto the limit on X-Ae is therefore allowed and the maximum oscillation of (3.14) when t-> oo is thus at most KP1(x, X, s), an upper bound to the integrationover A,. By hypothesisthis can be made arbitrarilysmall by choosing s large. Hence the left side of (3.14) convergesto a limitwhen t-> oo. If the processis metricallytransitive, the limit is E {a }, since in that case the almost everywherelimit in (3.13) when t- oo is E { a }, by Theorem 1. Proof of (c). If (3.4) is satisfiedon an X set C and if x satisfies(3.5), that is, if startingfromx(O) =x, x(s) takes on a value in C forsome s >0 with probability1, we shall now show that x also satisfies(3.4). Let v be the first value of s with x(s) CC. Then using the fact that the process is a Markoff x (s), process, and that z = s is a condition on x(0), (3.15) P { x(O) = x; x(t) 00 A} = E 8=0 P x(O) = x; v = s, x(t) C A; where,if t>s, (3.16) P{x(O)=x;v=s,x(t)A} =JPX P x(s);x(t) G A}P{x(O) = x; dA} In particular,if (A) =0 P{ x(O) = x; v = s, x(t) EA} (3.17) < P1(x(s), X, - s)P { x(O) = x; dA}. If A is chosen so that the leftside of (3.15) becomes Pl(x, X, t), it then fol MARKOFF TRANSITION PROBABILITIES 1948] 411 lows that if t _ n, Pi(x(s), X, t -s)P{ X, t) _ Pi(x, 4=0 x(0) = x; dA} {k=8) 00 + E s=nl (3.18) < P{ x(0) = x; v =s, x(t) C A} P1(x(s), X, t - s)P { x(0) = x; dA} s=0 {P=s) 00 + E s=n+l p{x(o)=x;v=s}. Now the last series in (3.18) is the remainderof a convergentseries: 00 E P{ x(O)= x;v = sI}= 1 (3.19) 8=() and can be made arbitrarilysmall by choosingn large. When v= s, x(s) EC. Hence the integrandsin (3.18) convergeto 0 (t-->oo) by the definitionof C. Since the seriesof integralsis dominatedtermby termby the series (3.19), it followsthat the leftside of (3.18) convergesto 0. We have thus shown that C includes the values of x satisfying(3.5). If the process is metricallytransitive, 4(C)=O or b(C)=1. In fact if 4(C)>0, and if n(t) is the number of timesx(s) CC for0< s _ t, n(t)/t-*d1(C),with probability1, accordingto the ergodic theorem.Thus for almost all values of x(O) =x (4 measure), (3.5) is satisfiedand in fact x(t) will even take on values in C forinfinitelymany values of t. Proofof (d). If x satisfies(3.4) and ifa(c) is definedto be 1 whenx(O)GA and 0 otherwise,the average in (3.3) reduces to that in (3.6). Since thereis a limitin (3.6) forevery A, Q(x, A) is a probabilitymeasure in A. According to (b), Q(x, A) =E {Ia }= (A) if the process is metricallytransitive. The equation 1 T (3.20) ?71? , t=8+1 =+ P{x(O) = x;x(t)GA} f P{x(t) = PIx(o) =-ZP{x(O) =x;x(s+ t= 1 = y; x(s + t) C A}P{x(0) = y; x(s) E A}*-I P{x(O) = = t)eA} x; x(t) C dG,} x; x(t) E dG,,} leads (when r-->c ) to (3.7). This means that Q(x, A) is a self-reproducing distribution which may or may not coincide with the given one, 1(A). If 4(A) =0, [May J. L. DOOB 412 Q(x,A)=lim-ZP{x(O)=x;x(s)GA} =1 t 1 t < lim- E PN(x,X, s) = 0. t40O (3.21) I0 8=1 Hence Q(x, A) is absolutelycontinuouswithrespectto 1 measure,forfixedx. The absolute continuitymay not be uniformin x. Proof of (e). If almost all x (1 measure) satisfy(3.4), (3.7) must hold for almost all x. Hence forthose x, f-t E P x(o) t 1 (3.22) 8=1 = y; x(s) C A }Q(x, dGy)= Q(x, A). When t-> co the integrandconvergesboundedlyto Q(y, A) foralmost all y (b measure) and thereforeforalmost all y in the integrationmeasure, because the latter is absolutely continuousin 4 measure. Integrationto the limit is thereforepermissible,and yields (3.8). It has already been seen in Theorem 1 that (3.9) holds foralmost all x (db measure). If all x satisfy (3.4), (3.8) must holds for all x. Moreover when o- >oo in 1 t+a - P{x(O) = A} =- x; x(s) 1 a P{x(O)=x; x(s + t) E A} 1 (3.23) = Jw-Z P { x(O) = y; x(s) C A} P { x(O) = x; x(t) E dGy1}, the integrandconvergesboundedlyto Q(y, A) when a--->oo. Integrationto the limit yields (3.9). The followingtheorem gives conditions under which the Cesaro limits used up to the presentcan be replaced by ordinarylimits. THEOREM 4 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION AND NO ANGLE VARIABLES). In thefollowinga is a P measurable functionsatisfying(3.2) for all largetoand dependingonlyon x(t) for t? const. PROBABILITIES (theconstantmay varywitha). (a) Thereis a, t set I of relativemeasure 1("), independentof x and of a such thatif x satisfies(3.4) (4.1) lim 1- O , t I E{x(s) = x; Tta} thelimitis existsand is independentof s. If theprocessis metricallytransitive, E{a}. (1")A t set of relative measure 1 is a set withthe propertythat (number of integersin the set between n and -n)/2n--+1 when n-- co. MARKOFF 1948] TRANSITION 413 PROBABILITIES (b) If (3.4) is satisfiedbyalmostall x ((D measure),(4.1) can be strengthened: lim E{ x(s) = x; Tta} (4.2) existsfor thosex, and is E I a } if theprocessis metricallytransitive. Withproperchoiceof a, (4.1) and (4.2) become (4.1') l-e lim ,tEI P{ x(s) - x; x(t) EA} = Q(x, A) and lim P{x(s) (4.2') = x; x(t) E A} = Q(x, A). The specialization from (4.1) and (4.2) to (4.1') and (4.2') is clear, and will receive no furthercomment. Proof of (a). Accordingto a theoremof von Neumann and Koopman [5, pp. 36-37] if the process has no angle variables, there is a t set I of relative measure 1, with the propertythat (4.3) lim t-+oo,tEI E{I3(co)Tta(co)} } < oo. If the existsforall chance variables a, / withE { a 12} <co and E { 1321 processis metrically transitive, thelimitis E { a } E { 13}.Moreoverthelimit is unchangedif a is replaced by T8a or if : is replaced by T43. It will be convenientto writePI{x(O) =x; x(t)CA } explicitlyas the sum of its absolutely continuousand singularcomponentswith respect to 1 measure, P I x(O) = x; x(t) E A} = (4.4) P(x, y, t)cI(dGy)+ Pi(x, A, t). The conditionalexpectationin (4.1) can be written(for large t) in the form E{ x(s) = x; Tta} = (4.5) = = E x(u) = y; Tta } P{ x(s) = x; x(u) E dGy} f E{x(u) = y; T ta} p(x, y, i- E p(x, x(,u), u - s) Tta } + s)J2(dG,)+ , where |IX| (4.6) = f E{ x(u) = ? L.U.B. E{x(O) y; Tta}Pl(x, dGy,u = - s) y;I Tt_uat }Pi(x, X, - s). n 414 J. L. DOOB [May If tI->c, tEl, the last expectation in (4.5) converges to a limit (put /3=p(x, x(u), u -s) in (4.3)); in fact althoughthe integrationhypothesesimposed by Koopman and von Neumann are not satisfiedhere, a simple approximationextends their result as required. Moreover it was remarkedin the proofof Theorem 3 that our hypotheseson a imply that the L.U.B. in (4.6) is bounded in t when t-*oo, say with bound K. Hence when t-* o, with tCI, the left side of (4.6) has oscillation at mostKP,(x, X, u-s). By hypothesisthis can be made arbitrarilysmall by choosingu large. Hence the leftside of (4.5) convergesto a limitwhen t-> co, tCI. If the processis metrically transitivethe limit is E { a}. This can be seen fromthe fact that the limit in (4.3) is E { a } E {3 } in that case, or fromthe fact that the Cesxaro limitis EJ{a} in that case, accordingto Theorem 3. Proof of (b). Suppose now that (3.4) is satisfiedby almost all x ((b measure). For large t we tan write E{x(O) = x; Tta} = E{ x(s) = y; Tta} P{ x(O) = x; x(s)C dG,} (4.7) =JE { x(o) = y; Tt,a } P { x(O) = x; x(s) E dG,}. Accordingto (a) the integrandconvergesfor almost all y when t->oo with t- sGI. We let t->oc withno restriction,choosings = s(t) so that t-s C I, s E I, s -->c. Now if A8 is a set of 4> measure 0 satisfyingthe equation Pl(x, A8,s) =Pl(x, X, s), and containingthe points of nonconvergenceof the integrand in (4.7), the integrandconverges boundedly everywhereon X-I#sA8 and the integrationmeasures converge. Hence (4. 8) lim f E { x(s) = y; Tta } P { x(O) = x; x(s) E dG7,} (A = EA.) exists. Moreover if K has the same significanceas in the proofof (a), (4.9) fE{x(s) = y; TtaI}Px(O) = x; x(s) E dGY} ? KP,(x, X, s)-*O so that the left side of (4.7) convergesto a limit when t-- oc, as was to be proved. As in (a), if the process is metricallytransitive,the limitis E {al. Theorems3 and 4 weredevoted to the asymptoticcharacterofE {x(0) =x; Tta } as conditionedby the validityof (3.4). If (3.4) is assumed true on an X set of positive 4 measure,the decompositiontheorem(Theorem 2) becomes essentiallysimplified.Only two cases will be discussed: when (3.4) is true for all x (Theorem 5) and when (3.4) is true foralmost all x (Theorem 6). The 1948] MARKOFF TRANSITION PROBABILITIES 415 paper is then concluded by a detailed analysis of the hypothesesof Doeblin and of Kryloffand Bogolioiuboff as compared with the hypothesisthat (3.4) is true forall x. THEOREM PROBABILITIES; 5 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION VALUE OF X SATISFIES (3.4)). There are finitelyor EVERY denumerably many disjunctsetsA1, A2, * * *, of positive4) measure, infinitely and correspondingself-reproducing probabilitymeasures Q1(A), Q2(A), withthefollowingproperties: 4)(N) = Qi(N) = (5 .1) (N = X-, o Ai) Qj(A ) =1. (5.2) A probability measure4'1(A) is self-reproducing if and onlyif it can beput in the form (5.3) as _ 0, E as = 1. -1(A) = E a7Q(A), i i In particulareverya1 is positivefor (D = 4 and moregenerallytheseconstants are all positiveif and only if 4 is absolutelycontinuouswith respectto 4)D. Every4) is absolutelycontinuouswithrespectto (. Each Qj process (( replacedby Qj) is metricallytransitiveand theQj's are theonly self-reproducing distributions givingmetricallytransitiveprocesses. The transitionprobabilitiessatisfy (5.4) P{x(O) = x E Ai, x; x(t) E A } =1, and, if AGeI, (5.5) limt-oo it t P x(O)=x; x(s) CA} = Q(x, A) = existsfor all x, with (x C A?) Q(x, A) =Qj(A) (5.6) = x C N, ax ai(x)Qi(A) ai( x) = For xCAj, (5.5) can be strengthened as follows. To each Aj correspondsa positiveintegerdj. If dj = 1, (5.7) lim P{x(O) t-4 o = x; x(t) A} =Qj(A), xCGAi. If dj > 1, A j is thesum ofdj disjunctsetsA j, * * * , A id,and thereare probability measuresQjl, * * *, Qjdj satisfying 416 [May J. L. DOOB (5.8) Q3(A) - (5.9) Qjk(Ajk) (5.10) [Qil(A)+ = PI x(0) + Qid(A)1, Qj(Ajk) =Ildp 1, x; x(t) E = Ajk} = X 4kZ+t, e AJ(1) Finally lim P{ x(0) = x; x(d1s+ t) C A} =Qil+ (A), (5.11) (5.12) x EzAil;t O, , di -1, P{ x(0) = y; x(t) E A} Qjk(dGy). Qik+t(A) = If (3.4) holds forall x, Q(x, A) is definedby (3.6) forall x and all A CE, and satisfies (3.8). According to a theorem of Blackwell [2, p. 567] since Q(x, A) is absolutely continuouswith respect to 4(A) and satisfies(3.8), X is the sum of disjunct sets Bo, B1, A1, A2, A3, . . . , finiteor denumerablyinfinitein number,with the As of positive 4) measure,and thereare probability measures Q1(A), Q2(A), * * *satisfying Q(x, A) = Qi(A), = Qj(A) 1, if A C Aj, 1(A) > 0, Qi(A) > 0 (5.13) 0, Q(x, Bo) -:b(B x E Ai, = 0. We shall only touch on those points which are not obvious in the light of Blackwell's theoremand the precedingtheoremsof this paper. Equation (0.2) implies that 4(Bo) =0. We set N=Bo+B1 and shall absorb other sets of fD measure into N also, as required below. Evaluating (3.8), and using (5.13) and the fact that Q(x, A)=0 when (A)=0, Q(x, A) (5.14) = E Qi(A)Q(x, A,). This is (5.6), with an explicit evaluation of the as(x). If 41(A) is self-reproducing, -1(A) f P{ x(O) = y; x(t) E A}I 1(dGy) (5.15) 1 -J (12) t EPIx(O)= y; x(t) C A}ib(dGy). modulodi. i in A1iis to be interpreted thesecondsubscript Hereand in thefollowing 1948] 417 MARKOFF TRANSITION PROBABILITIES When t-* oo this becomes Q(y,A)4)j(dGy) = E2Qj(A) 41(A) Q(y, Aa)4b)(dGy) i (5.16) E Qj(A)(bi(A j) ~~~~~= (5. 16) which is (5.3). In particular if 4b=P, aj=4b(Aj)>0. Since every Qj(A) is absolutelycontinuouswithrespectto 4), every ib must be also. Any two selfreproducingdistributionswhose correspondingconstants of combination a1 are simultaneouslyzero definethe same sets of measure 0. The only special featureabout 4) is that all of the correspondingaj's are positive. If 4'i did not have this property,(3.4) would hold only foralmost all x, using (D as a base measure. Equation (3.6) becomes forxEEA1 (5.17) i t lim- g-?o t P{ x(0) = x; x(s) A} = Qj(A). = Hence by Theorem 1(d) the Qj processis metricallytransitive.It is clear that the Qj's are the only self-reproducingdistributions making the process metricallytransitive. For example this followsfromthe fact that the Aj's and sums of Aj's are the only invariantsets, neglectingsets of 4) measure 0, that is, with every Qj vanishing.Since A; is invariant, (5.4) certainlyholds foralmost all x in Aj (Qj measure),and we have seen at the beginningof this paper that Aj can be replaced by a subset of the same Q; measure forwhich is put into N. We shall assume that this has been (5.4) holds. The difference done. Define d,= 1 if the Qj process has no angle variables. Then (5.7) follows fromTheorem 4. Now suppose that the Qj process has an angle variable, that is, that there is a 4) measurable functionf(x) (not identicallyzero on a set of 4) measure 1) and a real constantX satisfying f[x(s)] = eixsf[x(O)], (5.18) neglectingx(s) sets of Qj measure 0. Since s = 1, 2, * * Ifl is invariant, and since the transitive, Qj processis metrically IfIeconst. on a set ofQj measure1. It is no restrictionto take this constant as 1. Then (5.19) f=eh7, It followsthat the Q sets (5.20) {x(0) C B}: I{x(s) CC}: {a < Xs + g[x(0)] < b (mod27r)}, {a <g[x(s)] <b(mod2,r)} are identical,neglectingsets of probability0. Thus 0?g<27r. (5.21) [May J. L. DOOB 418 Qi(B) = Qj(C) = P{a < g[x(s)] < b (mod 27r)}, P{x(O) = x; x(s) EC} = 1, x E B, neglectingan x set N(a, b) of Qj measure0. Let N1 be the sum ofthe N(a, b)for all rational a and b. Unless g is a functionwhich takes only a finiteor denumerablyinfiniteset of values (aside fromvalues taken on on a set of Qj measure 0), thereis a sequence of rational pairs (ay,bj), shrinkingto a point, correspondingto a sequence of pairs of sets (By, Cj) satisfying(5.21) and (5.22) B1 D B2 D * , Cl D C2 D ** Q,(Bn) = Qj(Cn) > 0 contains points in Ai - N1. Let x be such a point,and choose such that1J7JBn s so large that (5.23) Pi(x, X, s) < 1. Then (5.24) P{x(O) = x; x(s) C JCn}= 1, Q1(JICn) =O whichcontradicts(5.23). Thus thereis a constantc whichf assumes on a set of positive Qj measure. The constantsceiX,ce2AX,* are also assumed on sets of the same Qj measure.This is impossibleunless eiXis a root of unity.We can suppose then that eiXis a djth root of unityand that f(x) = ceA~', ce2X, *I. on disjunct sets A11, *, Aid, respectively,all having the same Qj measure. ji The sum of these sets is invariant, and thereforeincludes all of Aj except possiblya set of Qj measure0. Hence Qj(Ajk) = 1/dj.The remainderA j-ikA,jk is absorbed in N, and A j is thenreduced again as above, so that (5.4) will still hold. If x(s)CAJk, it follows that x(s+1)CAik+1, that is, (5.21) becomes (5.10), valid foralmost all points of Ajk. The sets A jkand Aj are then modified to ensure the simultaneousvalidity of (5.4) and (5.10) without exceptional points. Unless therewere a maximumvalue to the possible integersdj, (5.10) would show that for almost all x (Q measure) and every n there are sets Bn withP {x(0) =x; x(s) CBn } = 1, Qj(Bn) < 1/n. This contradicts (5.23). Hence thereis a maximumdj and it is assumed fromnow on that this maximum has been chosen above. The transformationT consideredon the Qj processwill be denoted by Tj. It has a point spectrum("3)consistingof 0 (correspondingto the fact that 1 is a characteristicfunction)and 2rn/dj (correspondingto the fact thatf and its integral powers are characteristic functions) for n = 1, * - - , dj -1. These that is, (13) Cf. [5 ] fora discussionof the spectrumof a measure preservingtransformation, of the correspondingunitarytransformation. MARKOFF 19481 TRANSITION 419 PROBABILITIES points all have multiplicity1, since Tj is metricallytransitive.Then Tdi has a point spectrumconsistingof the single point 0, which has multiplicitydj. Correspondingto this fact, the transitionprobabilities P{x(O) = x; x(dt) E A}, t = 1,2,... determine(togetherwith Qj measure) a stationaryMarkoffprocess which is not metricallytransitive,because A jio * * *, A jdj are invariant,but whichhas no angle variables, and which becomes metricallytransitivewhen reduced still furtherfromAi to any A,k. Then the limitQj,(A) definedby (5.11) with t= 0 must exist,accordingto Theorem 4. Equations (5.11) and (5.12) are obtained when s- oo in P{x(O) = fJ P{ x(o) = y; x(t) E A} P { x(0) = x; x(dis) E dG,I (5 . 25) = THEOREM x; x(d,s+ t) CA} P{X(O)= y; x(d'is) C A } P { x(O) = x; x(t) C dG} . 6 (STATIONARY MARKOFF PROCESS WITH REGULAR TRANSITION If almostall x (4) measure)satisfy(3.4), thereis a set NoG0 of 4) measure0 suchthat PROBABILITIES). (6.1) P{ x(O) = x; x(t) E X - No} = 1, x C X -No, and thatif X is replacedbyX - No, and thefieldofsetsA E a bythefieldofsets Ao =A -A No, theMarkoffprocesswiththisspace, theseX measurablesets,the transitionprobabilities(6.1), and the same 4) measuresatisfiesthehypotheses applicable to of Theorem5. The decompositionof that theoremis therefore X-No. BeforeprovingTheorem 6, we presentthe followingexample. In Theorem measure 4b)as definedby (5.3), and suppose that 5 choose a self-reproducing = are then satisfied,using the basic 4b,measure, The 0. hypotheses present a, but Theorem 5 is no longerapplicable since (3.4) is no longertrue on Al. The presenttheoremthengives the decompositionof X-Al but gives no information on A1. To prove Theorem 6 we need only note that if (3.4) is true on X -No, by a process already used repeatedly in the proofof Theorem 5 No can be increased slightlyto make (6.1) true. The theoremis now obvious. Comparisonbetweentheconditionthat(3.4) be validfor all x and thecondiIn the followingit will be tions of Doeblin and of Kryloffand Bogolioz"boff. in Euclidean space and that measure X of finite is set a Borel that supposed of a set A will be The measure subsets. Borel the of its Lebesgue field ax is and to 0 for some is that e > Doeblin's denoted by mA. hypothesis (7.1) [May J. L. DOOB 420 P{x(O) = 1- 6 x; x(to) EA} ifmA ? e, uniformlyin x. This condition is shown to imply the existence of the limit Q(x, A) in (3.6) uniformlyin x and A. The use of Lebesgue measure here is somewhatconfusing,since the proofsare equally valid formany othermeasures of Borel sets. In orderto clarifythis point,his conditionwill be put in a probability somewhatdifferent form.In his case Q(x, A) is a self-reproducing measureforeach x, and the decompositionof Theorem 5 is valid; in fact only a finitenumber of sets Aj can be present. Moreover, using the notation of Theorem 5, if mA >0, and ifA CAj, Qj(A) >0 also. Define 4(A) by )(A)-= ajQi(A) wherethe aj's are positive,have sum 1, and are otherwiseunrestricted.Then 4?(A) is a self-reproducing probabilitydistribution,and 4D(A)> 0 if mA > 0, that is, mA is absolutelycontinuouswithrespectto 4(A). Condition (7.1) can thereforebe put in the followingform: (7.1') P{x(O) = x; x(to) eA} < 1 - E if (A) < Now if this conditionis satisfied,and ifP1(x, A, t) is the singularcomponent of P{x(0) =x; x(t)EA } with respect to 4(A), then certainly (7.2) Pi(x, X, to) < 1 - e uniformlyin x. The proofthat Pl(x, X, t) is monotonenonincreasingin t now shows (using (7.2) in (3.12) with s=to) that (7.3) Pi(x, X, to+ t) ? (1 - e)P1(x, X, t). Hence (7.4) Pi(x, X, nto)< (1 n and thus (3.4) is true,and the convergenceis even uniformin x and exponenof convergenceto 0 which precludesthe postially fast. It is this uniformity sibilityof infinitelymany sets Aj, and the still strongercondition (7.1') implies the uniformity of the limitequation (3.6). The formulationof Doeblin's conditionin (7.1') explains the ratherarbitraryuse of Lebesgue measurein the originalformulation.Lebesgue measure was a measure of Borel sets with the propertythat mA >0 implied Qj(A) >0 forsomej. Any othermeasureof Borel sets with this propertywould serve as well. In the same way the b(A) distributionin Theorem 5 was ratherarbidistribuany self-reproducing trary (althoughit had to be self-reproducing); tion withthe propertythat ifany Qj(A) >0, then 4(A) >0 would have served as well. The conditionof Doeblin is thus strongerthan the hypothesisthat there 1948] MARKOFF TRANSITION PROBABILITIES 421 is a self-reproducing distributionand that (3.4) is true for all x. Doeblin's conditionhas the advantage that it does not requirea priorithe existenceof a self-reproducing distribution.In some examples, however, the existence of such distributionsis already known. This is true in the renewal process (see the followingpaper in thisvolume). Doeblin's conditionis howeveressentially strongerand leads to strongerresultswhereit is applicable. The conditionof Kryloffand Bogolioiuboff is somewhatweakerthan that of Doeblin, as stated, but since it leads to exactly the same decompositiontheorem,the two are essentially equivalent. In both cases (7.1') is true. The point is that the various possible choices forX measure in Doeblin's conditionbringit up to the generalityof the Kryloff-Bogolioiuboff condition. As a somewhattrivial example which may howeverbe illuminating,consider the Gaussian Markoffprocess described in the Introduction. In this case, as already remarked, there is a self-reproducingdistribution,and Pl(x, X, t) -O, so that Theorem 5 is applicable. Actually (0.9) shows that there is metrictransitivityand no angle variables; there is only one set Aj, and thereare no cyclic transitions.The convergencein (0.9) is not uniform in x; this means that neitherthe Doeblin nor the Kryloff-Boglioiuboff conditions are satisfied. BIBLIOGRAPHY 1. W. Ambrose, P. Halmos, and S. Kakutani, The decompositionof measures,II, Duke Math. J. vol. 9 (1942) pp. 43-47. 2. D. Blackwell, IdempotentMarkoffchains, Ann. of Math. vol. 43 (1942) pp. 560-567. 3. J. L. Doob, Stochasticprocesseswithan integral-valued parameter,Trans. Amer. Math. Soc. vol. 44, (1938) pp. 87-150. , The Brownian movement 4. and stochasticequations,Ann. of Math. vol. 43 (1942) pp. 351-369. 5. E. Hopf, Ergodentheorie, Ergebnisse der Mathematik, vol. 5, no. 2. 6. S. Kakutani, Ergodic theoremsand the Markoffprocesswitha stabledistribution,Proc. Imp. Acad. Tokyo vol. 16 (1940) pp. 49-54. 7. K. Yosida and S. Kakutani, Operator-theoretical treatmentof Markoff's process and mean ergodictheorem, Ann. of Math. vol. 42 (1941) pp. 188-228. 8. K. Yosida, The Markoffprocesswitha stabledistribution,Proc. Imp. Acad. Tokyo vol. 16 (1940) pp. 43-48. 9. N. Wiener, The ergodictheorem,Duke Math. J. vol. 5 (1939) pp. 1-18. 10. (Added in proof.)A. M. Yaglom, The ergodicprinciplefor Markovprocesseswithstationary distributions,C. R. (Doklady) Acad. Sci. URSS. N.S. vol. 54 (1947) pp. 347-349. The author supposes that P(s)(x, A) is absolutely continuous with respect to a given self-reproducingdistribution ?1(A), with a positive continuous density, and proves that then lim8_".P(s)(x, A) = 1(A). This is a special case of Theorem 5. UNIVERSITY OF ILLINOIS, URBANA, ILL.
© Copyright 2026 Paperzz