CONTINUOUS PARAMETER MARKOV CHAINS†

By KAI-LAI CHUNG

† This paper was prepared with the partial support of the Office of Scientific Research of the United States Air Force.
The name given in the title is an abbreviation of 'Markov processes with continuous time parameter, denumerable state space and stationary transition probabilities'. This theory is to the discrete parameter theory as functions of a real variable are to infinite sequences. New concepts and problems arise which have no counterpart in the latter theory. Owing to the sharply defined nature of the process, these problems are capable of precise and definitive solutions, and the methodology used well illustrates the general notions of stochastic processes. It is possible that the results obtained in this case will serve as a guide in the study of more general processes. The theory has contacts with the theories of martingales and of semi-groups, contacts which have been encouraging and may become flourishing. For lack of space the developments from the standpoint of semi-groups or systems of differential equations cannot be discussed here. Terms and notation not explained below follow more or less standard usage, such as in [11].
Let (Ω, 𝔉, P) be a probability triple where (𝔉, P) is complete; T = [0, ∞), T° = (0, ∞), 𝔅 the usual Borel field on T. Let {x_t, t ∈ T} be a Markov chain with the minimal state space I, the stationary transition matrix (p_ij(·)) and an arbitrary fixed initial distribution. By the minimal state space we mean the smallest denumerable set (of real numbers) such that P{x_t(ω) ∈ I} = 1 for every t ∈ T. The transition matrix is characterized by the following properties: for every i, j ∈ I, s, t ∈ T°:

\[ p_{ij}(t) \ge 0, \qquad \sum_{j \in I} p_{ij}(t) = 1, \qquad p_{ij}(s+t) = \sum_{k \in I} p_{ik}(s)\, p_{kj}(t). \tag{1} \]
The last of these relations is the semi-group property. In order to have a separable and measurable Markov chain with the given I and (p_ij) it is sufficient (and essentially necessary) that

\[ \lim_{t \downarrow 0} p_{ii}(t) = 1 \tag{2} \]

for every i ∈ I. In this case we define p_ij(0) = δ_ij. Each p_ij is then uniformly continuous on T. We shall confine ourselves to this case. We may suppose, by going to a standard modification, that the process is separable
(relative to the class of closed sets) and measurable. The basic sample function properties, due largely to Doob [9] and Lévy [17], can then be deduced. We cannot detail these properties but, as a consequence of them, there is a specific version (or realization) which has desirable properties. To obtain this let us take I to be the set of positive integers and compactify it by adjoining one fictitious state ∞, in the usual manner. Let us call a function f on T lower right semi-continuous with respect to a denumerable set R dense in T if f(t) = lim_{R ∋ s ↓ t} f(s) for every t. Then there is a version of the given Markov chain such that

(i) x(·, ·) is measurable 𝔅 × 𝔉, or Borel measurable;

(ii) each sample function x(·, ω) is lower right semi-continuous with respect to any R.
Other properties which are valid for almost all sample functions may be further imposed; we need not elaborate them here. Clearly (i) implies measurability and (ii) implies well-separability, namely separability with respect to any denumerable set dense in T. We mention that, despite (ii), it is possible that for all ω, the t-set S_∞(ω) where x(t, ω) = ∞ is everywhere dense in T (see [14]).
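The defining conditions (1) and (2) can be exhibited numerically on a finite example. The following sketch is an illustration added here, not part of the original text: it assumes a hypothetical 3-state rate matrix Q and takes p(t) = exp(tQ), which is one standard way of producing a matrix satisfying (1) and (2), and then checks those conditions directly.

```python
# Numerical illustration of conditions (1) and (2) for a transition
# matrix generated from a hypothetical 3-state rate matrix Q.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-2.0,  1.0,  1.0],
              [ 3.0, -5.0,  2.0],
              [ 1.0,  4.0, -5.0]])

def p(t):
    return expm(t * Q)          # p(t) = exp(tQ)

s, t = 0.3, 0.7
assert np.all(p(t) >= -1e-12)                      # p_ij(t) >= 0
assert np.allclose(p(t).sum(axis=1), 1.0)          # sum_j p_ij(t) = 1
assert np.allclose(p(s + t), p(s) @ p(t))          # semi-group property
assert np.allclose(p(1e-8), np.eye(3), atol=1e-6)  # (2): p_ii(t) -> 1 as t -> 0
```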
It has been known for some time that an important concept in the study of general Markov processes is the so-called strong Markov property (see [3,12,22]). It turns out that the version specified above has this property, which we now proceed to describe (in a slightly restricted form). Let 𝔉_t be the Borel subfield of 𝔉 generated by {x_s, 0 ≤ s < t} and augmented by all sets of probability zero. A random variable α with domain of finiteness Δ_α is said to be optional (or 'stopping time' or 'independent of the future') if for every t we have

\[ \{\omega : \alpha(\omega) < t\} \in \mathfrak{F}_t. \]
The collection of sets Λ in 𝔉 such that Λ ∩ {ω: α(ω) < t} ∈ 𝔉_t is a Borel field 𝔉_α called the pre-α field. The process {ξ_t, t ∈ T} on the reduced triple (Δ_α, Δ_α ∩ 𝔉, P(· | Δ_α)), where

\[ \xi(t, \omega) = x(\alpha(\omega) + t,\, \omega), \]

is called the post-α process, and the augmented Borel field generated by this process the post-α field 𝔉'_α. Observe that if α is optional then so is α + t for each t > 0. For the sake of brevity we shall suppose that Δ_α = Ω in the following. The following assertions, collectively referred to as the strong Markov property here, are true for every optional α.

(a) For each t ∈ T°, ξ_t is finite (i.e. ξ_t ∈ I) with probability one.

(b) The post-α process is a Markov chain in T° whose state space and transition matrix are restrictions of those of the given Markov chain.
(c) For each t ∈ T, the pre-(α+t) and post-(α+t) fields are conditionally independent given ξ_t, wherever the latter is finite (hence in particular almost everywhere if t ∈ T°, by (a)). Thus if Λ ∈ 𝔉_{α+t} and Λ' ∈ 𝔉'_{α+t} then on the set {ω: ξ_t(ω) ∈ I} we have

\[ P\{\Lambda \Lambda' \mid \xi_t\} = P\{\Lambda \mid \xi_t\}\, P\{\Lambda' \mid \xi_t\}. \]

(d) The post-α process is well-separable and Borel measurable as it stands.

Furthermore, let us consider for each given Λ ∈ 𝔉_α and s < t, the conditional probability

\[ P\{x(t, \omega) = j \mid \Lambda;\, \alpha = s\} = r_j(s, t \mid \Lambda) \tag{3} \]

defined for almost all s according to the measure induced by α on 𝔅.
The following additional assertions are true.

(e) For each j ∈ I and almost all s according to the α-measure: the function r_j(s, · | Λ) satisfies conditions analogous to (1) and is continuous in (s, ∞); and we have

\[ P\{\xi(t, \omega) = x(\alpha(\omega) + t,\, \omega) = j \mid \Lambda;\, \alpha = s\} = r_j(s, s+t \mid \Lambda). \tag{4} \]

(f) The pre-α and post-α fields are absolutely independent if and only if r_j(s, t | Λ), as a function of the pair (s, t), is a function of t − s, for each Λ and j.
Let us observe, comparing (3) and (4), that the assertion (e) is a non-trivial substitution property for the conditional probability.

A preliminary view of the above assertions, together with a justification of the name 'strong Markov property', may be obtained by considering the particular case α = constant. In this case the assertions (a)-(d) become the defining properties of {x_t, t ∈ T}, while (e) reduces to the continuity of each p_ij. This simple observation implies the truth of (a)-(e) if α is denumerably-valued, and shows that for a discrete parameter Markov chain the corresponding assertions hold almost trivially.
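For a finite chain, assertions (a) and (b) can be tested by simulation. The sketch below is a hypothetical illustration added here (a two-state chain with an arbitrary rate matrix): α is taken to be the first entrance time into state 1, and the empirical distribution of x(α + t) is compared with p_{1·}(t), which is what (b) asserts.

```python
# Monte Carlo check of the strong Markov property: after the first
# entrance into state 1 (an optional time alpha), the process should
# evolve afresh with transition law p_{1.}(t).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])    # hypothetical 2-state chain
q = -np.diag(Q)

def state_after_first_hit(t):
    """Run from state 0; after the first entrance into state 1,
    run a further time t and return x(alpha + t)."""
    state, hit, clock = 0, False, 0.0
    while True:
        hold = rng.exponential(1.0 / q[state])
        if hit:
            if clock + hold > t:
                return state
            clock += hold
        state = 1 - state                   # the only possible jump
        if state == 1:
            hit = True

t, n = 0.5, 20000
emp = sum(state_after_first_hit(t) == 1 for _ in range(n)) / n
print("empirical P{x(alpha+t)=1} ~", emp)
print("p_11(t)                  =", expm(t * Q)[1, 1])
```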
Proofs of the above assertions except (d), in somewhat more precise terms, are given in [7]. Similar results which overlap these are given by Yushkevic [22]†; for another proof of (a), (b) and part of (c) see Austin [2].
The essence of the strong Markov property may be briefly stated as follows: the ordinary Markov property valid at a fixed time t remains valid at a variable time α chosen according to the evolution of the process but without prevision of the future. The classical illustration is that of a gambler who chooses his turn of playing according to a gambling system which he has devised without the aid of prescience.

† His assertion involving another random variable ≤ α and measurable 𝔉_α follows easily from (e).
Similar concepts for a martingale have been developed by Doob [11].
Let us discuss some applications of the strong Markov property. It should be remarked that, while its invocation is a basic step in each of the following cases, further work is needed to establish the results to be mentioned.
(A) The simplest case of an optional random variable is the first entrance time into a given state. This satisfies (f), and its judicious use yields various 'decomposition formulas'. For example, let _H p_ij(t) denote the transition probability from i to j in time t under the taboo H (namely, before entering any state in H), and _{H∪k} p_ij(t) the analogous probability where H is replaced by the union of H and k; _H F_ik the first entrance time distribution from i to k under the taboo H. The intuitive meaning of the following formula is clear: if k ∉ H, we have

\[ {}_{H}p_{ij}(t) = {}_{H \cup k}\,p_{ij}(t) + \int_0^t {}_{H}p_{kj}(t-s)\; d\,{}_{H}F_{ik}(s); \]

but its rigorous proof requires the strong Markov property, in particular (e). Specialization of H to one state leads to ratio limit theorems of the Doeblin type concerning

\[ \int_0^t p_{ij}(s)\, ds \Big/ \int_0^t p_{ii}(s)\, ds \quad \text{as } t \to \infty; \]

see [8].
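The decomposition lends itself to a naive Monte Carlo illustration (added here; the 4-state chain, its rates, and the choice of i, j, k, H are all hypothetical): among paths from i that are at j at time t without having entered H, those that never entered k estimate the taboo-(H ∪ k) term, while those that did enter k estimate the convolution term.

```python
# Monte Carlo partition of paths behind the first-entrance formula:
# _H p_ij(t) splits according to whether the path passes through k.
import numpy as np

rng = np.random.default_rng(1)
Q = np.array([[-3.0,  1.0,  1.0,  1.0],
              [ 1.0, -3.0,  1.0,  1.0],
              [ 1.0,  1.0, -3.0,  1.0],
              [ 1.0,  1.0,  1.0, -3.0]])   # hypothetical 4-state chain
q = -np.diag(Q)
J = Q / q[:, None]
np.fill_diagonal(J, 0.0)                   # jump-chain matrix

def run(i, t):
    """Return (state at time t, set of states entered in (0, t))."""
    state, clock, seen = i, 0.0, set()
    while True:
        hold = rng.exponential(1.0 / q[state])
        if clock + hold > t:
            return state, seen
        clock += hold
        state = rng.choice(4, p=J[state])
        seen.add(int(state))

i, j, k, H, t, n = 0, 1, 2, {3}, 1.0, 30000
avoid_k = through_k = 0
for _ in range(n):
    end, seen = run(i, t)
    if end == j and not (seen & H):        # contributes to _H p_ij(t)
        if k in seen:
            through_k += 1                 # the convolution term
        else:
            avoid_k += 1                   # the _{H u k} p_ij(t) term
print(avoid_k / n, "+", through_k / n, "=", (avoid_k + through_k) / n)
```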
Next, let us recall that the state i is called stable or instantaneous according as q_i = −p'_ii(0), which always exists, is finite or infinite. Let i ≠ j and consider in a recurrent class (see [8]) the successive returns to i via j (the intervention of j is necessary only if i is instantaneous). These return times partition the time axis T into independent blocks to which Doeblin's method of treating a functional of the Markov chain can be applied. In this way the classical limit theorems, like the laws of large numbers, the law of the iterated logarithm and the central limit theorem, can be easily extended. For the discrete case see [4], where there are some errors in the proofs which can be corrected (see the last footnote in [8]).
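The block structure is easily exhibited in simulation. In this illustrative sketch (added here, with a stable 3-state chain standing in for the general recurrent class, so the detour via j is unnecessary) the path is cut at successive returns to state 0, and the time spent in state 1 during each block gives the i.i.d. summands to which the classical theorems apply.

```python
# Cut a simulated path at successive returns to state 0; the
# occupation times of state 1 over the blocks are then i.i.d.,
# so the classical limit theorems apply to their partial sums.
import numpy as np

rng = np.random.default_rng(2)
q = np.array([1.0, 2.0, 3.0])              # hypothetical exit rates
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])            # jump chain

def block():
    """One excursion from state 0 back to 0; returns time spent in 1."""
    state, time_in_1 = 0, 0.0
    while True:
        hold = rng.exponential(1.0 / q[state])
        if state == 1:
            time_in_1 += hold
        state = rng.choice(3, p=P[state])
        if state == 0:
            return time_in_1

blocks = np.array([block() for _ in range(5000)])
print("block mean", blocks.mean(), "block sd", blocks.std())
# laws of large numbers / CLT now apply to blocks.cumsum()
```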
Finally, Kolmogorov's example [16], in which there is exactly one instantaneous state, can be analysed probabilistically by use of certain entrances into this state and taboo probabilities. It can be shown as a consequence that the construction of sample functions of this process given by Kendall and Reuter [15] with semi-group methods is indeed the unique one. Namely, the version specified above having Kolmogorov's transition matrix must have the properties implied by the Kendall-Reuter construction.
(B) Let the process start at a stable state i and consider the first exit time from i (see [6]). This is optional and the condition in (f) is again satisfied. Denoting for Λ = Ω the corresponding r_j(s, s+t) by r_ij(t), which is continuous in t by (e), we have readily

\[ p_{ij}(t) = \delta_{ij}\, e^{-q_i t} + q_i \int_0^t e^{-q_i s}\, r_{ij}(t-s)\, ds. \]

This integral representation implies the existence of a continuous derivative p'_ij(t) = q_i [r_ij(t) − p_ij(t)] and various complements, including an interpretation of Kolmogorov's first (backward) system of differential equations. (The second (forward) system can be dually treated and falls under (A) above.) This gives a probabilistic proof of a result which was first established by analytic means by Austin [1]. A proof similar to the one sketched here was announced by Yushkevic but has not yet appeared.
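The derivative formula is immediate from the representation; the following routine verification is added here for the reader's convenience. The substitution s → t − s puts the representation in the form

\[ p_{ij}(t) = \delta_{ij}\, e^{-q_i t} + q_i \int_0^t e^{-q_i (t-s)}\, r_{ij}(s)\, ds, \]

and since r_ij is continuous one may differentiate:

\[ p_{ij}'(t) = -q_i \delta_{ij}\, e^{-q_i t} + q_i r_{ij}(t) - q_i^2 \int_0^t e^{-q_i(t-s)}\, r_{ij}(s)\, ds = q_i\,[\,r_{ij}(t) - p_{ij}(t)\,]. \]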
The independence of the pre-exit and post-exit fields implies the fundamental observation due to Lévy [17] that the lengths of the stable intervals are independently distributed; the separability of the post-exit process asserted in (d) then yields the negative exponential distributions for these lengths; see [5].
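One elementary mechanism behind the exponential law can be shown in a few lines (an illustrative aside with hypothetical rates, added here): the sojourn in a stable state ends at the first of several competing exponential alarms, one per accessible state, and the minimum of independent exponentials is again exponential, with rate q_i equal to the sum of the individual rates.

```python
# Minimum of independent exponential alarms: the sojourn in a
# stable state i ends at rate q_i = sum of the individual rates.
import numpy as np

rng = np.random.default_rng(3)
rates = np.array([0.5, 1.0, 1.5])          # hypothetical rates q_ij
alarms = rng.exponential(1.0 / rates, size=(100000, 3))
sojourn = alarms.min(axis=1)
print(sojourn.mean(), "~", 1.0 / rates.sum())        # mean 1/q_i
print(sojourn.var(),  "~", 1.0 / rates.sum() ** 2)   # var 1/q_i^2
```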
(C) Let the process start at an instantaneous state i and put

\[ S_i(\omega) = \{t : x(t, \omega) = i\}, \qquad \mu_i(t, \omega) = \mu\{S_i(\omega) \cap (0, t)\}, \]

where μ is the Borel-Lebesgue measure on 𝔅. Then for each s, the random variable α_s defined by

\[ \alpha_s(\omega) = \inf\{t : \mu_i(t, \omega) > s\} \]

is optional. In words, α_s is the first time when the total amount of time spent in the state i exceeds s. This idea, which is a partial analogue of the exit time from a stable state discussed under (B), is due to Lévy [17,18,19]. Lévy makes use of the more general device of counting time only on a selected set of states, thereby annihilating the remaining states together with the time spent in them. This idea remains to be fully exploited.
As a simple example, if i ≠ j, then the total time spent in i before entering j has the negative exponential distribution with mean

\[ \int_0^\infty {}_{j}p_{ii}(t)\, dt \]

in our previous 'taboo' notation.
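For a stable chain (the instantaneous case is of course beyond naive simulation) the statement is easy to test empirically. In the hypothetical 3-state example below (added here as an illustration), the accumulated time in i before the first entrance into j should have variance equal to the square of its mean, the signature of the exponential law.

```python
# Total time spent in state i before the first entrance into j:
# check that it is exponentially distributed (variance = mean^2).
import numpy as np

rng = np.random.default_rng(4)
q = np.array([2.0, 1.0, 3.0])              # hypothetical exit rates
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])            # jump chain
i, j = 0, 1

def occupation_before_hit():
    state, total = i, 0.0
    while True:
        hold = rng.exponential(1.0 / q[state])
        if state == i:
            total += hold
        state = rng.choice(3, p=P[state])
        if state == j:
            return total

samples = np.array([occupation_before_hit() for _ in range(20000)])
print("mean", samples.mean())
print("var ", samples.var(), "(~ mean^2 for an exponential law)")
```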
(D) In this last application we touch upon a chapter of the theory of continuous parameter Markov chains which has yet to be written. It is to be observed that the strong Markov property fails on the set where ξ_0 = ∞. While the assertions (a)-(d) are always valid for t ∈ T°, our information is inadequate as the critical time α is approached from the
right if the sample function values approach ∞ there. This failure may be formally attributed to the crude one-point compactification we have adopted, which does not distinguish between the various modes of approaching ∞ that ought to correspond to distinct adjoined fictitious states rather than the one and only ∞. From this point of view the main task, called the 'boundary problem' by certain authors, is the proper compactification of the minimal state space I so as to restore the strong Markov property on the set where ξ_0 ∉ I and to induce the appropriate boundary behavior (as in potential theory). Without loss of generality we may suppose that this set has probability one. For fixed j ∈ I and t_0 ∈ T° let us consider the process {η_j(t), 0 < t ≤ t_0}, where

\[ \eta_j(t, \omega) = p_{\xi_t(\omega)\, j}(t_0 - t). \]

Since {ξ_t} is a Markov chain by (b), the new process {η_j} is easily seen to be a martingale. Applying the martingale convergence theorem we see that lim_{t↓0} η_j(t, ω) exists and is finite with probability one, and the limit has certain gratifying properties. The idea of considering this sort of martingale is due to Doob [9], and the present application to the post-α process will undoubtedly play a role in the compactification problem. For other formulations of the boundary problem see Feller [13], Reuter [21] and Ray [20].
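The martingale property of {η_j} follows from (b) together with the semi-group relation in (1); the short verification below is added for completeness. For 0 < t < t' ≤ t_0, conditioning on {ξ_u, u ≤ t},

\[ E\{\eta_j(t') \mid \xi_u,\, u \le t\} = \sum_{k \in I} p_{\xi_t k}(t' - t)\, p_{kj}(t_0 - t') = p_{\xi_t j}(t_0 - t) = \eta_j(t), \]

the middle step being precisely the semi-group property.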
The preceding discussion is centered around the strong Markov property as a convenient rallying point. Lest the impression should have been made that there was nothing else to be done, I should like to conclude my discussion by mentioning some other problems not directly connected with the above.
A very natural circle of problems concerns the analytical properties (not to say characterization) of the elements of a transition matrix defined by (1) and (2). These may be regarded as problems in pure analysis. For example, it is still an open problem whether p'_ij(t) exists if t > 0 and both i and j are instantaneous.† The solution of such a problem would be the more interesting if probabilistic significance is found. In this connection Jurkat [14] has observed that the differentiability results discussed under (B) hold even if the second condition in (1) is omitted, the condition (2) being assumed of course. The following even more primitive and probabilistically meaningful result is only a few weeks old: each p_ij is either identically zero or never zero. The original proof of this result, due to Austin, makes ingenious use of the strong Markov property. It is almost 'unfortunate' that a simplification has been found by myself which uses only the separability and measurability of an associated process.

† (Added in proof.) D. Ornstein has now proved that for every i and j, p'_ij(t) exists and is continuous for t > 0.
This is not the only example where a purely analytic and simple-sounding proposition has so far been proved only by properly probabilistic methods.† It follows from this result, as observed by Austin, that if all states communicate then each of the Kolmogorov systems holds as soon as one of its equations holds for one value of the argument.
Another circle of problems is the approximation of the continuous parameter chain ℭ = {x_t, t ∈ T} by its discrete skeletons ℭ_s = {x_{ns}, n ∈ N}, where s > 0 and N is the sequence of non-negative integers. In what sense and how well do the skeletons ℭ_s approximate ℭ as s ↓ 0? This does not appear to be as simple a matter as might be expected. To cite a specific example: let m_ij denote the mean first entrance time (or return time if i = j) from i to j in ℭ, and let m_ij(s) denote the analogous quantity in ℭ_s. The well-known theorem that lim_{t→∞} p_ii(t) exists (see [17]) implies that if i is stable then m_ii(s) = q_i m_ii for every s. If i and j are distinct states in a positive (or strongly ergodic) class then it can be shown that lim_{s↓0} s m_ij(s) = m_ij by a rather devious method. But I do not know what the situation is with moments of higher order. We may also mention the open problem of characterizing a discrete parameter Markov chain which can be imbedded in a continuous parameter one, namely which is a skeleton of the latter.
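For a two-state chain the skeleton quantities can be computed exactly, which gives a cheap numerical illustration (added here; the rate matrix is a hypothetical example): the first entrance from 0 to 1 in ℭ_s is geometric with mean 1/p_01(s) steps, and s·m_01(s) = s/p_01(s) should approach m_01 = 1/q_0 as s ↓ 0.

```python
# Skeleton approximation of a mean first entrance time: in a
# 2-state chain the first entrance 0 -> 1 in C_s is geometric
# with success probability p_01(s), so s * m_01(s) = s / p_01(s).
import numpy as np
from scipy.linalg import expm

Q = np.array([[-2.0, 2.0], [1.0, -1.0]])   # hypothetical rates, q_0 = 2
m_01 = 0.5                                  # continuous-time mean, 1/q_0
for s in [1.0, 0.1, 0.01, 0.001]:
    p01 = expm(s * Q)[0, 1]
    print(s, "->", s / p01)                 # tends to m_01 = 0.5
```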
Finally, let me mention an annoying kind of problem. Various models of Markov chains can be easily described by so-to-speak word-pictures, but the rigorous verification that they are indeed Markovian is often laborious. The well-known construction by Doob [10] is an example. Other examples are given by Lévy [17], of which one (his example II.10.5) may be roughly described as follows. Consider first the infinite descending escalator such that from the state i+1 one necessarily goes into the state i, while the mean sojourn times in all the states form a convergent series, the process terminating at the state 1. This is a Markov chain in T° and one need only hitch it on to a new state at the beginning to obtain a Markov chain in T. In fact, the resulting process is the second example given by Kolmogorov [16], which like the first one mentioned under (A) above has been analysed in detail by Kendall and Reuter [15]. Now modify this scheme by allowing, upon leaving each step, the alternative of either entering the next lower step or starting all over again from the (infinite) top of the escalator. By proper choice of the probabilities of the alternatives it is possible to jump to and return from infinity a nondenumerably infinite number of times. It seems 'intuitively obvious' that the resulting process is still Markovian, but if so why does it elude a simple proof?
† (Added in proof.) D. Ornstein has now found an analytical proof of this result.
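The unmodified escalator, at least, is easy to realize numerically. In the sketch below (an illustration added here, with hypothetical mean sojourn times 2^{-i}, which form a convergent series) the expected time to descend from step N to step 1 stays bounded as N grows, which is what permits the chain to come down 'from infinity' in finite time.

```python
# The descending escalator: from step i+1 one necessarily goes to
# step i, with mean sojourn time 2^-i at step i; the total descent
# time from step N remains bounded as N grows.
import numpy as np

rng = np.random.default_rng(5)

def descent_time(N):
    """Total time to go from step N down to step 1."""
    return sum(rng.exponential(2.0 ** -i) for i in range(1, N + 1))

for N in [10, 100, 1000]:
    times = [descent_time(N) for _ in range(2000)]
    print(N, np.mean(times))                # approaches sum 2^-i = 1
```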
REFERENCES

[1] Austin, D. G. On the existence of derivatives of Markoff transition probability functions. Proc. Nat. Acad. Sci., Wash., 41, 224-226 (1955).
[2] Austin, D. G. A new proof of the strong Markov theorem of Chung. Proc. Nat. Acad. Sci., Wash., 44, 575-578 (1958).
[3] Blumenthal, R. M. An extended Markov property. Trans. Amer. Math. Soc. 85, 52-72 (1957).
[4] Chung, K. L. Contributions to the theory of Markov chains. II. Trans. Amer. Math. Soc. 76, 397-419 (1954).
[5] Chung, K. L. Foundations of the theory of continuous parameter Markov chains. Proceedings Third Berkeley Symposium on Mathematical Statistics and Probability, 2, 29-40 (1955).
[6] Chung, K. L. Some new developments in Markov chains. Trans. Amer. Math. Soc. 81, 195-210 (1956).
[7] Chung, K. L. On a basic property of Markov chains. Ann. Math. (2), 68, 126-149 (1958).
[8] Chung, K. L. Some aspects of continuous parameter Markov chains. Publ. Inst. Statist. Univ. Paris, 6, 271-287 (1957).
[9] Doob, J. L. Topics in the theory of Markoff chains. Trans. Amer. Math. Soc. 52, 37-64 (1942).
[10] Doob, J. L. Markoff chains—denumerable case. Trans. Amer. Math. Soc. 58, 455-473 (1945).
[11] Doob, J. L. Stochastic Processes. New York, 1953.
[12] Dynkin, E. and Yushkevic, A. Strong Markov processes. Theory of Probability and Its Applications, 1, 149-155 (1956).
[13] Feller, W. On boundaries and lateral conditions for the Kolmogorov differential equations. Ann. Math. (2), 65, 527-570 (1957).
[14] Feller, W. and McKean, H. A diffusion equivalent to a countable Markov chain. Proc. Nat. Acad. Sci., Wash., 42, 351-354 (1956).
[15] Kendall, D. G. and Reuter, G. E. H. Some pathological Markov processes with a denumerable infinity of states and the associated semi-groups of operators on l. Proceedings International Congress of Mathematicians, Amsterdam, 3, 377-415 (1954).
[16] Kolmogorov, A. N. On some problems concerning the differentiability of the transition probabilities in a temporally homogeneous Markov process having a denumerable set of states. Učenye Zapiski (Matem.) Moskov. Gos. Univ. (4), 148, 53-59 (1951).
[17] Lévy, P. Systèmes markoviens et stationnaires; cas dénombrables. Ann. Sci. Éc. Norm. (3), 68, 327-381 (1951).
[18] Lévy, P. Compléments à l'étude des processus de Markoff. Ann. Sci. Éc. Norm. (3), 69, 26-212 (1952).
[19] Lévy, P. Processus markoviens et stationnaires du cinquième type (infinité dénombrable des états possibles, paramètre continu). C.R. Acad. Sci., Paris, 236, 1630-1632 (1953).
[20] Ray, D. (To appear.)
[21] Reuter, G. E. H. Denumerable Markov processes and the associated contraction semi-groups on l. Acta Math. 97, 1-46 (1957).
[22] Yushkevic, A. On strong Markov processes. Theory of Probability and Its Applications, 2, 187-213 (1957).