LIMIT THEOREMS FOR STOCHASTIC PROCESSES A. V.

LIMIT THEOREMS FOR STOCHASTIC PROCESSES
A. V. Skorokhod 1
1 Introduction
1.1. In recent years the limit theorems of probability theory, which previously
dealt primarily with the theory of summation of independent random variables,
have been extended rather widely to the theory of stochastic processes. Among the
work done in this connection, we mention the articles of A. N. Kolmogorov [1, 2],
M. Donsker [3], I. I. Gikhman [4], Yu. V. Prokhorov [5, 6], the author [7, 8], and
N. N. Chentsov [9].
Kolmogorov, Donsker, Gikhman and the author treat various more or less special
important cases of this kind of limit theorems. Yu. V. Prokhorov indicates a general
approach to limit theorems for stochastic processes, basing his work on compactness
criteria of measures in a complete separable metric space. It seems to me that the
restriction to a complete metric space is not very natural, since in every specific
case it becomes necessary to find in the trajectory space of the random process a
complete metric satisfying definite conditions. This is not always possible, and even
when it is, it is not always simple.
The present article suggests a new approach to limit theorems which can be used
for many topological spaces in which in general no complete metric exists.
1.2. Limit theorems for stochastic processes contain primarily statements from
which one can tell whether or not certain quantities f [ξn (t)], which depend on a
sequence of random processes ξn (t), have the same distribution for n → ∞ as the
distribution of f [ξ (t)], where ξ (t) is the process which is the limit of the sequence
ξn (t) (assuming, usually, that the finite dimensional distributions of ξn (t) converge
weakly to the finite dimensional distributions of ξ (t)). Such limit theorems are useful for the following two reasons.
In the first place, the limit process ξ (t) often has a simpler structure; moreover
there often exists a good analytic apparatus for its analysis, as is true, for instance,
in the case of convergence to processes with independent increments or to Markov
processes, describable by differential and integro-differential equations. For these
cases it is often much easier to analyze the distribution of f [ξ (t)] than it is to analyze
that of f [ξn (t)] and this can be done when we have the proper limit theorems (an
example is Kolmogorov’s well known statistical criterion).
Secondly, a process ξ (t) can often be approximated by a sequence of processes
ξn (t) whose structure is simpler. In order to analyze the properties of ξ (t) in this
way we must know which properties of ξn (t) are maintained in the transition to the
limit, and this can also be reduced to studying the limit distributions of f [ξn (t)],
where f is some function defined on the trajectory of the process. Although such
treatments are still, very rare and often not entirely rigorous (an example is Feller’s
1
Original paper published in Teorija veroyatn. i prim. 1 (1956), no.3, 289–319. English translation:
Theory of Probability and its Applications 1 (1956), no.3, 261–290.
Ó Springer International Publishing Switzerland 2016
A. A. Dorogovtsev et al. (eds.), Selected Works
19
20
1 Limit Theorems for Stochastic Processes
[10] attempt to analyze some Markov processes by approximating them by discrete
ones), analysis of stochastic processes based on approximating them by simpler
ones are quite powerful and may soon be very widely used. It therefore becomes
even more important to have limit theorems for stochastic processes.
1.3. We shall obtain the limit theorems in this article in the following way.
Consider a sequence of processes ξn (t) whose finite dimensional distributions
converge to the finite dimensional distributions of the process ξ (t). We shall assume
that with probability 1 the trajectories of both ξn (t) and ξ (t) belong to a certain set K
of functions x(t), and it is natural to suppose that the functions of K are determined
by their values on a certain countable everywhere dense set N of values of t. It then
turns out that one can construct processes ξ n (t) and ξ (t) so that ξ n (t) converges
with probability 1 to ξ (t) for all t in N, so that ξ n (t) and ξn (t) as well as ξ (t)
and ξ (t) have the same distributions, and so that the trajectories of ξ n (t) and ξ (t)
belong to K with probability 1. Now let F be a class of functions defined on K,
which are continuous in some topology S defined in K. We will find the conditions
which must be satisfied by a sequence xn (t) in K which converges to x(t) in K for t
in N, in order that xn (t) converges to x(t) in the topology S. If these conditions are
fulfilled by the sequence ξ n (t), then ξ n (t) converges with probability 1 to ξ (t) in
the topology S, and this means that for any f ∈ F the sequence f [ξ n (t)] converges
with probability 1 to f [ξ (t)]. Furthermore, the distribution of f [ξ n (t)] will converge
to the distribution of f [ξ (t)], and since the distribution of f [ξ n (t)] coincides with
that of f [ξn (t)], while that of f [ξ (t)] coincides with that of f [ξ (t)], we find that the
distribution of f [ξn (t)] converges to the distribution of f [ξ (t)].
This method of obtaining the limit theorems is used when K is the set of all
functions without second order discontinuities. This is perhaps the simplest class of
functions which can be defined by its values on an everywhere dense set of values
of t (except, of course, the set of all continuous functions; for them, however, the
problem has been solved by Prokhorov [5]). In addition, there is a wide class of processes whose trajectories have, with probability 1, no second order discontinuities.
The basic difficulty which arises here is in the choice of reasonable classes of
functions, that is in the choice of the topology. In the space of functions without
second order discontinuities the author has already suggested a topology [7] in connection with the study of the convergence of processes with independent increments.
It was found later that the applicability of this topology is wider than was thought,
and that the topology is a very natural one 2 . In the present article some other topologies, occurring naturally from certain points of view, are also considered. Section 2
is devoted to a study of the convergence conditions in these topologies, convergence
conditions which contain the above mentioned subsidiary conditions must be fulfilled by a sequence of functions converging on an everywhere dense set in order to
have convergence in the desired topology. Section 3 contains the limit theorems for
stochastic processes.
I should like to note that although the suggested method is used only for a rather
special choice of topology and a rather narrow class of processes, it can be used
2
See A. N. Kolmogorov, Probability Theory and Its Applications, this journal, Vol. 1, No. 2 (1956).
Limit Theorems for Stochastic Processes
21
for other topologies and for more complicated processes such as, for instance, for
stochastic functions of many variables. If this has not been done in the present article, it is only because the theory of random functions of many variables is not yet
sufficiently developed for one to be able to judge which classes of functions are
the most natural to consider (except, of course, continuous functions, for which the
problem has been solved by Prokhorov [5]).
I take this opportunity to express my deep gratitude to E. B. Dynkin and
A. N. Kolmogorov for valuable consultation during the many discussions of the
results here presented.
2 The Space KX . Topologies in this Space
2.1. Let X be a complete metric separable space with elements x, y. · · · , and let
ρ (x, y) denote the distance between x and y. We denote by KX the space of all functions x(t) which are defined on the interval [0, 1], whose values lie in X, and which
at every point have a limit on the left and are continuous on the right (and on the left
at t = 1).
Let us consider certain properties of the functions which belong to KX . A function
x(t) will be said to have a discontinuity ρ (x(t0 − 0), x(t0 + 0)) at the point t0 .
2.1.1. If x(t) ∈ KX , then for any positive ε there exists only a finite number of
values of t such that the discontinuity of x(t) is greater than ε . (This follows from the
fact that if there exists a sequence for which tk → t0 such that ρ (x(tk +0), x(tk −0)) >
ε , then at t0 the function x(t) would have no limit either on the right or on the left.)
2.1.2. Let t1 ,t2 , · · · ,tk be all the points at which x(t) has discontinuities no less
than ε . Then there exists a δ such that if |t − t | < δ and if t and t both belong
to the same one of the intervals (0,t1 ), (t1 ,t2 ), · · · , (tk , 1), then ρ (x(t ), x(t )) < ε .
To prove this, assume the contrary. Then there would exist sequences tn and tn
which converge to some point t0 and belong to the same one of the intervals
(0,t1 ), · · · , (tk , 1), and the sequences would have the property that ρ (x(tn ), x(tn )) ≥
ε . Now the points tn and tn lie on opposite sides of t0 (otherwise ρ (x(tn ), x(tn )) ≥ ε
would be impossible), so that ρ (x(t0 + 0), x(t0 − 0)) ≥ ε . Therefore t0 is one of the
points t1 ,t2 , · · · ,tk , which contradicts the statement that tn and tn belong to the same
one of the intervals (0,t1 ), (t1 ,t2 ), · · · , (tk , 1).
2.1.3. If x(t) ∈ KX , then for all η > 0 there is a δ > 0 such that every point
t ∈ [0, 1] satisfies one of the inequalities
sup
0<t1 −t<δ
sup
0<t−t1 <δ
ρ (x(t), x(t1 )) < η ,
ρ (x(t1 ), x(t)) < η .
(2.1)
This assertion follows directly from 2.1.2. It may be stated in the following more
compact way.
2.1.4. If x(t) ∈ KX , then
22
1 Limit Theorems for Stochastic Processes
sup
−δ <t1 −t<0<t2 −t<δ
min[ρ (x(t1 ), x(t)); ρ (x(t), x(t2 ))] → 0 for δ → 0.
(2.2)
Finally, the more or less converse assertion can also be made.
2.1.5. If x(t) satisfies (2.2) then, there exists a x(t) ∈ KX which coincides with
x(t) at all of its points of continuity.
We shall show that x(t) has, at every point, a limit from the left and from the
right. If this were not so, there would exist an η > 0 and three points t1 < t2 < t3
arbitrarily close to each other such that ρ (x(t1 ), x(t2 )) > η and ρ (x(t2 ), x(t3 )) > η .
But this contradicts (2.2). We note also that x(t) must definitely coincide either with
x(t − 0) or x(t + 0). Setting x(1) = limt→J x(t) and x(t) = lim
x(t ), we obtain the
t →t
t >t
desired function.
2.2. We shall now go on to define the topologies in the space KX . A general
analysis of this problem (that is, of classifying the possible topologies and studying their properties) is probably of independent interest. Here we shall discuss the
five topologies which are simplest from our point of view, and to which we are led
by considering functionals of trajectories of processes usually studied in probability
theory. In addition, these are natural generalizations of the uniform topology on the
space of continuous functions, generalizations in harmony with the whole development of mathematics.
2.2.1. DEFINITION. The sequence of functions xn (t) converges uniformly to x(t)
at the point t0 if for all ε > 0 there exists a δ such that
lim sup ρ (xn (t), x(t)) < ε .
n→∞ |t−t |<δ
0
(2.3)
Obviously if xn (t) converges uniformly to x(t) at every point of some closed set,
then xn (t) converges uniformly to x(t) on this whole set.
A necessary condition for convergence in all the topologies considered here will
be that the sequence xn (t) converges to x(t) uniformly at every point of continuity
of x(t).
Thus for continuous functions, all topologies reduce to ordinary uniform convergence. Our topologies begin to differ when we consider the behavior of xn (t)
in the neighborhood of points of discontinuity of x(t). The simplest behavior we
might suggest is that the convergence be uniform also at points of discontinuity, in
which case we obtain ordinary uniform convergence (in our notation we shall call
this topology U).
Although we shall consider this topology in addition to others, we remark that
it is not the most natural one for KX . This is because the uniform topology in KX
requires that the convergence of xn (t) to x(t) imply that there exists a number such
that for all n greater than or equal to this number the points of discontinuity of xn (t)
coincide with the points of discontinuity of x(t). This means that if t is considered to
be the time, we must assume the existence of an instrument which will measure time
exactly, and physically this is an impossibility. It is much more natural to suppose
Limit Theorems for Stochastic Processes
23
that the functions we can obtain from each other by small deformation of the times
scale lie close to each other. We thus are led to propose the topology J1 .
2.2.2. DEFINITION. The sequence xn (t) is called J1 -convergent to x(t) if there
exists a sequence of continuous one-to-one mappings λn (t) of the interval [0, 1] onto
itself, such that
lim sup ρ (xn (t), x(λn (t))) = 0; lim sup |λn (t) − t| = 0.
n→∞ t
n→∞ t
(2.4)
Let us analyze in more detail how we define the behavior of xn (t) in the neighborhood of a discontinuity point t0 if we require that xn (t) converges to x(t) in one of
our topologies. Since x(t) has a limit from the left and the right at every point t, we
know that in a sufficiently small neighborhood of t0 it will be approximately equal
to x(t0 − 0) for t < t0 and to x(t0 + 0) for t ≥ t0 . Therefore on the left of t0 there exist
intervals where xn (t) is approximately equal to x(t0 − 0), and on the right there are
those in which xn (t) is approximately equal to x(t0 + 0) (since the points of continuity of x(t) are everywhere dense, and at these points the convergence is uniform).
Thus the question of how xn (t) behaves in the neighborhood of t0 is equivalent to
the question how xn (t) changes from x(t0 − 0) to x(t0 + 0). In the topologies U and
J1 this transition takes the form of a single jump. In both of these topologies, for
values of t close to t0 the function xn (t) can take on values which are either close
to x(t0 − 0) or to x(t0 + 0). If we wish to keep this last property, but do not require
that the transition be in the form of a single jump, that is that a function xn (t) may
change back and forth between the values x(t0 − 0) and x(t0 + 0) several times in the
neighborhood of a point t0 , then we obtain topology J2 .
2.2.3. DEFINITION. A sequence xn (t) is said to be J2 -convergent to x(t) if there
exists a sequence of one-to-one mappings λn (t) of the interval [0, 1] onto itself such
that
sup |λn (t) − t| → 0 and sup ρ (xn (t), x(λn (t))) → 0 as n → ∞.
(2.5)
t
t
We shall define the two following topologies for the case in which X is a complete
linear normalized space. In this case, together with every function belonging to KX
we shall consider its graph Γx(t) , constructed in the following way.
Let X × T be the space composed of the pairs (x,t), where x ∈ X and t ∈ [0, 1].
In this space we define the metric R by
R[(x1 ,t1 ); (x2 ,t2 )] = ρ (x1 , x2 ) + |t1 − t2 |.
(2.6)
We then define the graph Γx(t) as the closed set in X × T which contains all pairs
(x,t), such that for all t the point x belongs to the segment joining x(t − 0) and
x(t + 0) (henceforth we shall denote the segment joining x1 and x2 by [x1 , x2 ]). The
graph Γx(t) is a continuous curve in X × T, and therefore we can define convergence
for graphs in the same way as we usually do for space curves. We then obtain the
topology M1 .
2.2.4. DEFINITION. The pair of functions (y(s),t(s)) gives a parametric representation of the graph Γx(t) if those and only those pairs (x,t) belong to it for which
24
1 Limit Theorems for Stochastic Processes
an s can be found such that x = y(s),t = t(s), where y(s) is continuous, and t(s) is
continuous and monotonically increasing (the functions y(s) and t(s) are defined on
the segment [0, 1]).
We note that if (y1 (s),t1 (s)) and (y2 (s),t2 (s)) are parametric representations
of Γx(t) , there exists a monotonically increasing function λ (s) such that y1 (s) =
y2 (λ (s)) and t1 (s) = t2 (λ (s)).
2.2.5. DEFINITION. The sequence xn (t) is called M1 -convergent to x(t) if there
exist parametric representations (y(s),t(s)) of Γx(t) and (yn (s),tn (s)) of Γxn (t) such
that
lim sup R[(yn (s),tn (s)); (y(s),t(s))] = 0.
(2.7)
n→∞ s
We can characterize the topology M1 in the following way from the point of
view of the behavior at a point of discontinuity t0 of the function x(t). The transition
from x(t0 − 0) to x(t0 + 0) is such that first xn (t) is arbitrarily close to the segment
[x(t0 − 0), x(t0 )] and second that xn (t) moves from x(t0 − 0) to x(t0 ) almost always
advancing. More rigorously, what this last condition means is that if x(t0 − 0) <
x1 < x2 < x(t0 ) (using the natural ordering for the segment), then having reached
the neighborhood of x2 , we will not return to the neighborhood of x1 .
If we drop the second requirement and assume only that xn (t) goes from x(t0 − 0)
to x(t0 + 0) remaining always arbitrarily close to the segment [x(t0 − 0), x(t0 )], we
obtain the topology M2 .
2.2.6. DEFINITION. The sequence xn (t) is called M2 -convergent to x(t) if
lim
sup
inf
n→∞ (y ,t )∈Γ (y2 ,t2 )∈Γx (t)
n
1 1
x(t)
R[(y1 ,t1 ); (y2 ,t2 )] = 0.
(2.8)
Let S be any one of our topologies. We shall denote convergence in the topology
S
S by the symbol xn (t) → x(t). Let us consider the relation between our topologies. It
is clear that U is stronger than J1 , and that this in turn is stronger than J2 . It is also
clear that M1 is stronger than M2 . We recall that a topology S1 is stronger than S2 if
convergence in S1 implies convergence in S2 . If X is a linear space, we can use any
of our topologies. It is easily seen that convergence in M2 follows from convergence
in any of the other topologies, and that convergence in J1 implies convergence in any
of the other topologies except U. Examples can be used to show that the topologies
M1 and J2 cannot be compared. (Indeed, let X be the real line. We set
0;t < 12 ,
x(t) =
1;t ≥ 12 ,
⎧
⎪
0;t ∈ 0, 12 − 1n ,
⎪
⎪
⎪
⎪
⎨ xn (t) = n2 t − 12 + 1n ; t ∈ 12 − 1n , 12 + 1n ,
⎪
⎪
⎪
⎪
⎪
⎩1; t ∈ 1 + 1 , 1 ,
2
n
Limit Theorems for Stochastic Processes
25
⎧
1
⎪
⎪0;t < 2 − 1/n,
⎪
⎨1; 1 − 1/n ≤ t < 1 ,
2
2
xn (t) =
⎪0; 12 ≤ t < 12 + 1/n,
⎪
⎪
⎩
1; t ≥ 12 + 1/n,
M
J
so that xn (t) →1 x(t) although this does not converge 3 to x(t) in J2 , while xn (t) →2
x(t) although this does not converge to x(t) in M1 ). Denoting the statement “the
topology S1 is stronger than S2 ” by S1 → S2 all the above can be summarized by
U → J1
M1
M.
2
(2.9)
J2
Let us also mention the relation between our topologies and functionals ordinarily used in probability theory. We
shall consider X to be the real line. Consider the following functionals (§ 2.2.7 – § 2.2.9)
2.2.7. The upper and lower bound of a function on some segment [t1 ,t2 ], namely
M[t1 ,t2 ] [x(t)] = sup x(t); m[t1 ,t2 ] [x(t)] = inf x(t).
t1 ≤t≤t2
t1 ≤t≤t2
[a,b]
[x(t)]
1 ,t2 ]
2.2.8. The number of intersections ν[t
with the strip [a, b] on the segment [t1 ,t2 ] is equal to k if it is possible
to find k + 1 points t (0) < t (1) < · · · < t (k) on [t1 ,t2 ] such that either x(t (0) ) ≤ a, x(t (1) ) ≥ b, x(t (2) ) ≤ a, · · · , or x(t (0) ) ≥
b, x(t (1) ) ≤ a, x(t (2) ) ≥ b, · · · , and is impossible to give k+2 points with these properties.
+
[x(t)]. Let
2.2.9. The first value γ[t+ ,t ],a [x(t)] above some level a on the segment [t1 ,t2 ]. Let us first define γ[0,1],a
1 2
τa [x(t)] = infx(t)≥a t, if sup x(t) ≥ a and τa [x(t)] = 1, if sup x(t) < a. Then
+
γ[0,1],a
[x(t)] =
Further, we set
x(τa [x(t)]) − a for τa [x(t)] < 1,
−1
for τa [x(t)] = 1.
⎧
⎪
⎨x(t1 + 0),t < t1 ,
+
γ[t+ ,t ],a [x(t)] = γ[0,1],a
[x(t)], where x(t) = x(t),t1 ≤ t ≤ t2 ,
1 2
⎪
⎩x(t − 0),t > t .
2
2
The following assertions are easily proved, but we shall not present the proofs here.
M
2.2.10. A necessary and sufficient condition for xn (t) →2 x(t) is that
M[t1 ,t2 ] [xn (t)] → M[t1 ,t2 ] [x(t)] and m[t1 ,t2 ] [xn (t)] → m[t1 ,t2 ] [x(t)],
where t1 and t2 are points of continuity of x(t).
M
2.2.11. A necessary and sufficient condition for xn (t) →1 x(t) is that
Remark of the editors. We define the sequence (xn (t))t∈[0,1] above in a way different from that
given mistakenly by the author:
⎧
⎪
⎨0;t < 1/2 − 1/n;
xn (t) = n(t − 1/2) + 1/2; 1/2 − 1/n ≤ t < 1/2 + 1/n;
⎪
⎩1; t > 1/2 + 1/n.
3
One can easily reveal that this sequence does not converge on any of the topologies
U, J1 , J2 , M1 , M2 .
26
1 Limit Theorems for Stochastic Processes
[a,b]
[xn (t)]
1 ,t2 ]
ν[t
[a,b]
[x(t)]
1 ,t2 ]
→ ν[t
for all t1 and t2 which are points of continuity of x(t) and almost all a and b.
J
2
2.2.12. A necessary and sufficient condition for xn (t) →
x(t) is that
γ[t+ ,t
1 2 ],a
[xn (t)] → γ[t+ ,t
1 2 ],a
[x(t)]
for almost all a and for all t1 and t2 which are points of continuity of x(t).
J
1
2.2.13. A necessary and sufficient condition for xn (t) →
x(t) is that
γ[t+ ,t
1 2 ],a
[xn (t)] → γ[t+ ,t
1 2 ],a
[a,b]
[xn (t)]
1 ,t2 ]
[x(t)] and ν[t
[a,b]
[x(t)]
1 ,t2 ]
→ ν[t
for all t1 and t2 which are points of continuity of x(t) and almost all a and b.
Let us now go on to a more detailed analysis of our topologies, starting with the
weaker ones and going on to the stronger.
2.3. The topology M2 .
We shall first find a condition for M2 -convergence which is equivalent to the
definition but which is given not in terms of graphs but of the functions xn (t) and
x(t) themselves.
M
2.3.1. A necessary and sufficient condition for xn (t) →2 x(t) is that
lim lim ΔM2 (c; xn (t), x(t)) = 0,
c→0 n→∞
where
(2.10)
ΔM2 (c; y(t), x(t)) = sup H(x(t), y(tc ), x(tc∗ )),
tc = max[0,t − c], tc∗ = min[1,t + c],
and H(x1 , x2 , x3 ) is the distance of x2 from the segment [x1 , x3 ].
M
Proof. NECESSITY. Let xn (t) →2 x(t). We shall show that this implies that by
choosing c sufficiently small we can make limn→∞ ΔM2 (c; xn (t), x(t)) arbitrarily
small. Let c be such that if t1 < t2 < t3 ,t2 − t1 < 2c, and t3 − t2 < 2c, then
min[ρ (x(t1 ), x(t2 )); ρ (x(t2 ), x(t3 ))] < ε . Let us now denote by t1 ,t2 , · · · ,tk all the
points at which the discontinuities are greater than or equal to ε , and let c be so
small that ρ (x(t ), x(t )) < ε so long as |t − t | < c and t and t both belong to the
same one of the intervals (0,t1 ), (t1 ,t2 ), · · · , (tk , 1). We choose n so large that
sup
inf
(y1 ,t1 )∈Γx(t) (y2 ,t2 )∈Γxn (t)
R[(y1 ,t1 ); (y2 ,t2 )] < c/2.
(2.11)
Now if the points tc ,t,tc∗ all belong to the same one of the intervals (0,t1 ), (t1 ,t2 ), · · · ,
(tk , 1), then
H(x(tc ), xn (t), x(tc∗ )) ≤ c/2 + 2ε .
For, it follows from (2.11) that the graph Γx(t) contains a point (x,t) such that |t −t| <
c/2 and ρ (x, xn (t)) < c/2, and hence if t < t, then
H(x(tc ), xn (t), x(tc∗ )) < ρ (x(tc ), x(t − 0)) + ρ (x(t − 0), x) + ρ (x, xn (t)),
Limit Theorems for Stochastic Processes
27
while if t > t, then
H(x(tc ), xn (t), x(tc∗ )) < ρ (x(tc∗ ), x(t + 0)) + ρ (x(t + 0), x) + ρ (x, xn (t)).
Let us now find H(x(tc ), xn (t), x(tc∗ )) when tc < ti < t < tc∗ (the case tc < t < ti < tc∗
is entirely symmetric). It follows from (2.11) that Γx(t) contains a point (
x, t ) such
that ρ (
x, xn (t)) < c/2 and |
t − t| < c/2. If t = t, then either ti > t and
H(x(tc ), xn (t), x(tc∗ )) ≤ ρ (x(tc ), x(
t − 0)) + ρ (x(
t − 0), x) + ρ (
x, xn (t)) ≤ 3ε + c/2,
or ti < t and
H(x(tc ), xn (t), x(tc∗ )) ≤ ρ (x(tc∗ ), x(
t + 0)) + ρ (x(
t + 0), x) + ρ (
x, xn (t)) ≤ 3ε + c/2.
If, however, ti = t , then
H(x(tc ), xn (t), x(tc∗ )) ≤ H(x(tc ), x, x(tc∗ )) + ρ (
x, xn (t))
≤ ρ (x(tc ), x(ti − 0)) + ρ (x(tc∗ ), x(ti + 0)) + H(x(ti − 0), x, x(ti + 0)) + ρ (
x, xn (t))
≤ 2ε + c/2,
since H(x(ti − 0), x, x(ti + 0)) = 0. This proves the necessity.
SUFFICIENCY. Assume that (2.10) is fulfilled. This means that there exists a c
sufficiently small so that for n sufficiently large we have
ΔM2 (c; xn (t), x(t)) < ε ,
(2.12)
and in addition that c can be chosen so small that if t and t belong to the same one
of the intervals
(0,t1 ), (t1 ,t2 ), · · · , (tk , 1); |t − t | < 2c, then ρ (x(t ), x(t )) < ε
and 2c is less than the length of any of these intervals. Let us show that for every
point of Γxn (t) there is a point of Γx(t) such that in the metric R the distance between
these points is less than δ , and that this last quantity can be made arbitrarily small
by appropriate choice of ε and c.
From the fact that H(x(tc ), xn (t), x(tc∗ )) < ε , if tc and tc∗ belong to the same one
of the intervals (0,t1 ), (t1 ,t2 ), · · · , (tk , 1), it follows that ρ (xn (t), x(tc )) < 3ε , which
means that
ρ (xn (t), x(t)) < 4ε .
(2.13)
x,t) of Γx(t)
In this case, therefore, for a point (y,t) of Γxn (t) we can find a point (
(with the same t) such that ρ (
x, y) < 4ε .
Now assume that tc < ti < t < tc∗ . Then
H(x(tc ), xn (t), x(tc∗ )) < ε and H(x(tc ), xn (t − 0), x(tc∗ )) < ε .
28
1 Limit Theorems for Stochastic Processes
For every point of the segment [xn (t − 0), xn (t)], therefore, there is a point x
on [x(tc ), x(tc∗ )] such that ρ (x , x) < 2ε . Now since ρ (x(tc ), x(ti − 0)) < ε and
ρ (x(tc∗ ), x(ti + 0)) < 2ε , it follows that on the segment [x(ti − 0), x(ti + 0)] there
is a point x such that ρ (x , x) < 2ε .
Thus R[(y,t), (
x,ti )] < 4ε + c.
This proves the sufficiency.
2.3.2. REMARK. We may consider, instead of H(x1 , x2 , x3 ), any other quantity
such that
ϕ1 (H(x1 , x2 , x3 )) ≤ H(x1 , x2 , x3 ) ≤ ϕ2 (H(x1 , x2 , x3 )),
where ϕi (t) are nonnegative continuous functions which vanish only for t = 0. If X
is a Hilbert space, we may set
H(x1 , x2 , x3 ) = ρ (x1 , x2 ) + ρ (x2 , x3 ) − ρ (x2 , x3 ).
2.3.3. GENERAL CONDITIONS FOR M2 -CONVERGENCE. Necessary and
M
sufficient conditions for xn (t) →2 x(t) are that
(a) xn (t) converges to x(t) on a set of values of t which contains 0 and 1 and
which is everywhere dense on [0, 1] and
(b) limc→0 limn→∞ ΔM2 [c; xn (t)] = 0, where
ΔM2 [c; y(t)] =
sup
t∈[0,1];t1 ∈[tc ,tc +c/2]
t2 ∈[tc∗ −c/2,tc∗ ]
H(y(t1 ), y(t), y(t2 )).
M
Proof. The necessity of (a) is implied by the fact that if xn (t) →2 x(t), then
xn (t) → x(t)
at every point of continuity of x(t).
NECESSITY OF (b).
H(xn (t1 ), xn (t), xn (t2 )) ≤ H(
x1 , xn (t), x2 ) + ρ (
x1 , xn (t1 )) + ρ (
x2 , xn (t2 )).
M
Since xn (t) →2 x(t), there exist points t1 and t2 such that |t1 −t1 |, |t2 −t2 |, ρ (
x1 , xn (t1 )),
and ρ (
x2 , xn (t2 )) are small, where the points x1 and x2 belong to the segments
[x(t1 − 0), x(t1 )] and [x(t2 − 0), x(t2 )], respectively. Now let c be such that x(t) cannot have two discontinuities equal to or greater than ε in a distance 3c, and such
that if t 2 − t 1 ≤ 2c and if there is no discontinuity between these points which is
equal to or greater than ε , then ρ (x(t 2 ), x(t 1 )) < ε . Let us evaluate H(
x1 , xn (t), x2 )
assuming that |t1 − t1 | < c/4 and |t2 − t2 | < c/4. If in (t1 ,t2 ) and (tc ,tc∗ ) there are no
discontinuities of x(t) equal to or greater than ε , then
H(
x1 , xn (t), x2 ) ≤ H(x(tc ), xn (t), x(tc∗ )) + 4ε .
If, however, one of the intervals (t1 ,t2 ) or (tc ,tc∗ ) has a discontinuity equal to or
greater than ε at some point t, we consider the following two possibilities.
Limit Theorems for Stochastic Processes
29
1) t1 < t < t2 ; tc < t < tc∗ . Then ρ (x(tc ), x1 ) ≤ 2ε and ρ (x(tc∗ ), x2 ) ≤ 2ε , so that
H(x1 , xn (t), x2 ) ≤ 4ε + H(x(tc ), xn (t), x(tc∗ )).
2) t lies between t1 and tc (and may coincide with t1 ). Then in the interval
∗
∗ ) all the discontinuities of x(t) are less than ε , so that ρ (x(t
(tc/4 ,tc/4
c/4 ), x(tc/4 )) < ε .
∗
∗
This means that ρ (x(tc/4 ), xn (t)) < ε + H(x(tc/4 ), xn (t), x(tc/4 )). In addition, since
∗ ), x(t ∗ )) < ε , and the discontinuity in x(t) at t is also less than ε , we have
ρ (x(tc/4
c
2
∗
ρ (xn (t), x2 ) < 3ε + H(x(tc/4 ), xn (t), x(tc/4
)).
Therefore
∗
H(
x1 , xn (t), x2 ) < 3ε + H(x(tc/4 ), xn (t), x(tc/4
)).
This proves the necessity.
SUFFICIENCY. We must show that if conditions (a) and (b) are fulfilled, then
lim lim ΔM2 (c; xn (t), x(t)) = 0.
c→0 n→∞
Let us assume, on the contrary, that there exist sequences ck → 0, nkj → ∞ (where
the arrow indicates that we are considering the limit as j → ∞) and an η such that
ΔM2 (c; xn (t), x(t)) > η .
nk
nk
nk
nk
nk
nk
nk
This means that there exist points t1 j ,t2 j ,t3 j , such that t3 j − t2 j = t2 j − t1 j = ck
nk
(since the ti j lie, for sufficiently small ck , within the segment [0, 1] because (a) and
(b) imply that xn (t) converges uniformly to x(t) at 0 and 1) and
nk
nk
nk
H(x(t1 j ), xnk (t2 j ), x(t3 j )) > η .
j
Condition (a) implies that for all ε and δ there is an n0 such that in any interval of
length δ there exists a point t ∗ for which ρ (xn (t ∗ ), x(t ∗ )) ≤ ε when n > n0 . Let δ1
be so small that
sup
−δ1 <t1 −t2 <0<t3 −t2 <δ1
min[ρ (x(t1 ), x(t2 )); ρ (x(t2 ), x(t3 ))] < ε .
(2.14)
Further, (b) implies that there exist c1 , c2 , and n such that for n > n we have
ΔM2 (c, xn (t)) < ε ; ΔM2 (c2 , xn (t)) < ε ;
c1 > 2c2 ; c1 > ck > c2 ; 2c1 < δ1 .
We then have either
nk
nk
nk
ρ (x(t1 j ), x(t)) < 2ε , t ∈ (t2 j − c1 ,t2 j − c1 /2),
(2.15)
30
1 Limit Theorems for Stochastic Processes
nk
nk
nk
nk
nk
nk
nk
ρ (x(t3 j ), x(t)) < 2ε , t ∈ (t2 j + c1 /2,t2 j + c1 ),
or
nk
ρ (x(t1 j ), x(t)) < 2ε , t ∈ (t2 j − c2 ,t2 j − c2 /2),
nk
ρ (x(t3 j ), x(t)) < 2ε , t ∈ (t2 j + c2 /2,t2 j + c2 )
(this follows from (2.14) and from the fact that the four intervals do not intersect in
pairs). If we now choose δ < c2 /2, then for nkj > max(n0 , n ) there exist three points
nk
t1 ,t2 j ,t3 such that
nk
H(xnk (t1 ), xnk (t2 j ), xnk (t3 )) > η − 12ε ,
j
j
j
nk
nk
and either c1 /2 ≤ |t2 j − t j | ≤ c1 or c2 /2 ≤ |t2 j − t j | < c2 , i = 1, 3. Thus either
ΔM2 (c1 , xnk (t)) > η − 12ε or ΔM2 (c2 , xnk (t)) > η − 12ε , which contradicts (2.15)
j
j
if we take 13ε < η .
We note that in proving the sufficiency we have proved the following stronger
assertion.
M
2.3.4. A sufficient condition for xn (t) →2 x(t) is that (a) be fulfilled and that
(b ) limc→0 limn→∞ ΔM2 (c, xn (t)) = 0.
Thus we can obtain various sufficient conditions by making different choices of
sequences ck → 0.
In studying the compactness of sets of functions in the topologies J1 , J2 , M1 , and
M2 , the following lemma plays an important role.
2.3.5. Consider a sequence xn (t) which converges to an arbitrary function x(t)
for t ∈ N (some set, everywhere dense on [0, 1], containing 0 and 1), and let
lim lim (ΔM2 (c, xn (t)) + sup ρ (xn (t), xn (0)) + sup ρ (xn (1), xn (t))) = 0.
c→0 n→∞
1−c<t<1
0<t<c
M
Then there exists x(t) ∈ KX such that xn (t) →2 x(t).
Proof. Let ti be the points of N. Let us prove the existence of
lim x(ti ) and
ti →t,ti >t
lim x(ti ).
ti →t,ti <t
It is sufficient to prove that one of these limits, say the limit from the right, exists.
First let t = 0. Let us assume that limti →0 x(ti ) does not exist, which means that there
is a sequence tik and an ε such that ρ (x(tik ), x(tik+1 )) > ε .
Let us choose c so small that limn→∞ ΔM2 (c, xn (t)) < ε /4, and k such that tik and
tik+1 are both less than c/2. Then, if tr is a point such that
tik + c/2 < tr < tik + c and tik+1 + c/2 < tr < tik+1 + c,
Limit Theorems for Stochastic Processes
31
we have
H(x(tik ), x(tik+1 ), x(tr )) < ε /4 and H(x(tik+1 ), x(tik ), x(tr )) < ε /4.
But this contradicts the fact that ρ (x(tik ), x(tik+1 )) > ε . Let us now assume that the
limit limti →t,ti >t x(ti ) does not exist, where t is an internal point of [0, 1]. We then
have the following two possibilities.
1) There exists an ε such that in any neighbourhood of a point t we can find three
points tr1 < tr2 < tr3 lying to the right of t for which
ρ (x(tri ), x(tr j )) > ε (i = j; i, j = 1, 2, 3).
2) There exist x1 and x2 such that every sequence x(tnk ) (tnk > t,tnk → t) can be
separated into two sequences x(tn ) and x(tn ) (the set {nk } is the union of {nk } and
k
k
{nk }) such that
lim x(tn ) = x1 and lim x(tn ) = x2 .
tn →t
k
k
tn →t
k
k
We shall show that both of these possibilities contradict the conditions of the lemma.
The first case. Choose c0 so that limn→∞ ΔM2 (c, xn (t)) < μ if c < c0 (we shall
choose μ later). Let t and t be points of N such that if 0 < t − t < c0 /2, then,
t − c0 < t < t < c0 /2 and t + c0 /2 < t < t + c0 /2. Then, for t < ti < t, x(ti ) will be
at a distance equal to or greater than μ from the segment [x(t ), x(t )]. Now let A be
the set of points x of the segment [x(t ), x(t )] for which limti →t,ti >t ρ (x(ti ), x) ≤ μ ,
and let us write x1 = infx∈A x and x2 = supx∈A x (the order on the segment [x(t ), x(t )]
is defined by the fact that we consider x(t ) < x(t )). Now consider a point x of A
such that ρ (x1 , x) > ε − 2μ and ρ (x2 , x) > ε − 2μ . If t and t are points which, for
some c < c0 , satisfy t − c < t < t − c/2, t + c/2 < t < t + c for 0 < t − t < c/4 and
ρ (x, x(t )) < μ , we find that if 0 < t − t < c/4, the point x(
t ) lies at a distance no
greater than 2μ from the segment [x(t ), x]. Therefore x1 and x2 must lie at a distance
no greater than 3μ from [x(t ), x] which is impossible if ε > 5μ .
In the second case we choose, for every ε , a δ such that for 0 < t1 − t < δ we
have either ρ (x(t1 ), x1 ) < ε or ρ (x(t1 ), x2 ) < ε . If ε < ρ (x1 , x2 )/4, only one of the
possibilities is realized. Choose c0 such that limn→∞ ΔM2 (c, xn (t)) < ε for c < c0 .
Then if t1 is a point for which 0 < t1 −t < δ and 0 < t1 −t < c0 and ρ (x1 , x(t1 )) < ε ,
t ), x1 ) < ε for (t1 − t)/3 < t − t < 23 (t1 − t) and t ∈ N, since
we find that ρ (x(
H(x1 , x(
t ), x(t1 )) < ε ,
and therefore
H(x1 , x(
t ), x1 ) < 2ε ,
t )) < ε (there are only two possibilities,
t )) < 2ε , and therefore ρ (x1 , x(
or ρ (x1 , x(
t )) > 3ε ). Choosing t2 from N such that 12 t1 <
t )) < ε , or ρ (x1 , x(
namely ρ (x1 , x(
t2 < 23 t1 , we find that ρ (x(
t ), x1 ) < ε for (t2 − t)/3 < t − t < 23 (t2 − t). Continuing
t ), x1 ) < ε for t < t < 23 t1 and therefore
this process ad infinitum, we find that ρ (x(
limt >t,t →t x(t ) = x1 . Thus x(t) has limits from the right and the left at every point.
32
1 Limit Theorems for Stochastic Processes
Setting
x(t) =
lim
t >t,
t →t,
t ∈N
x(
t ), x(1) = lim x(
t ),
t ∈N,
t →1
M
we obtain a function from KX . We now have to prove that x1 (t) →2 x(t). To do this
it is sufficient to show that xn (t) converges to x(t) at all points of continuity of x(t).
Let t0 be a point of continuity of x(t). Then if we choose t and t from N such
that t0 − c < t < t0 − c/2 and t0 + c/2 < t < t0 + c, where c is chosen so that
limn→∞ ΔM2 (c, xn (t)) < ε , ρ (x(t ), x(t0 )) < ε , and ρ (x(t ), x(t0 )) < ε , we find that
limn→∞ ρ (x(t0 ), x(t0 )) < 4ε . This proves the lemma.
2.4. The topology M1 .
2.4.1. CONVERGENCE CONDITIONS. A necessary and sufficient condition
M
that xn (t) →1 x(t) is that 2.3.3 (a) be fulfilled and that
(b) limc→0 limn→∞ ΔM1 (c; xn (t)) = 0,
where
ΔM1 (c; y(t)) =
sup
H(y(t1 ), y(t2 ), y(t3 )).
t2 −c<t1 <t2 <t3 <t2 +c
M
Proof. THE NECESSITY OF (b). Let xn (t) →1 x(t). This means that there exist
parametric representations (y(s),t(s)) for Γx(t) and (yn (s),tn (s)) for Γxn (t) such that
limn→∞ sups |tn (s) − t(s)| = 0 and limn→∞ sups ρ (y(s), yn (s)) = 0. Let t1 < t2 < t3
(n) (n) (n)
(n)
and s1 , s2 , s3 be chosen so that tn (si ) = ti , (i = 1, 2, 3). Then
H(xn (t1 ), xn (t2 ), xn (t3 ))
(n)
(n)
(n)
(n)
(n)
(n)
= H(yn (s1 ), yn (s2 ), yn (s3 )) ≤ H(y(s1 ), y(s2 ), y(s3 ))
+3 sup ρ (yn (s), y(s)).
s
It remains to prove that for s1 < s2 < s3 we can make H(y(s1 ), y(s2 ), y(s3 )) arbitrarily small if t(s3 ) − t(s1 ) is sufficiently small.
Let us consider three cases.
1. If t(s1 ) = t(s2 ) = t(s3 ), so that H(y(s1 ), y(s2 ), y(s3 )) = 0.
2. If t(s1 ) = t(s2 ) < t(s3 ), then either the discontinuity in x(t) at t(s1 ) is less than
ε , i.e. ρ (y(s1 ), y(s2 )) < ε and therefore H(y(s1 ), y(s2 ), y(s3 )) < ε , or the discontinuity in x(t) at t(s1 ) is greater than ε and then for sufficiently small t(s3 ) − t(s2 ) we
have ρ (y(s3 ), x(t(s1 ))) < 2ε . This means that
H(y(s1 ), y(s2 ), y(s3 )) < 2ε .
3. If, finally, t(s1 ) < t(s2 ) < t(s3 ), then either there is no discontinuity at any of
these points greater than ε and therefore
H(y(s1 ), y(s2 ), y(s3 )) < H(x(t(s1 )), x(t(s2 )), x(t(s3 ))) + 3ε
(2.16)
(it follows from 2.1.4 that (2.16) is small), or at t(s1 ) the discontinuity is greater
than ε . This means that at the other points the discontinuities are less than ε
Limit Theorems for Stochastic Processes
33
for sufficiently small t(s3 ) − t(s1 ) so that ρ (x(t(s2 )), x(t(s3 ))) is small and so is
H(y(s1 ), y(s2 ), y(s3 )) (if the discontinuity greater than ε is at t(s3 ), the considerations are similar), or the discontinuity greater than ε is at t(s2 ) in which case
ρ (y(s1 ), x(t(s2 − 0))) and ρ (x(t(s2 )), y(s2 )) are small, and therefore so is
H(y(s1 ), y(s2 ), y(s3 )).
This proves the necessity.
SUFFICIENCY. First we shall show that if ΔM1 (c; x(t)) < h and if (x1 ,t1 ),
(x2 ,t2 ), (x3 ,t3 ) are points of Γx(t) such that t2 − c < t1 < t2 < t3 < t2 + c, then
H(x1 , x2 , x3 ) < h. To do this it is sufficient to prove the following two assertions:
1) if H(x1 , x2 , x3 ) < h, H(x1 , x2 , x3 ) < h, and x1 ∈ [x1 , x1 ], then H(x1 , x2 , x3 ) < h; 2)
if H(x1 , x2 , x3 ) < h, H(x1 , x2 , x3 ) < h, and x2 ∈ [x2 , x2 ], then H(x1 , x2 , x3 ) < h. Assertion (1) follows from the fact that if y ∈ [x1 , x3 ] and y ∈ [x1 , x3 ], where ρ (y , x2 ) < h
and ρ (y , x2 ) < h, then any point y on the intersection of [y , y ] and [x1 , x3 ] will also
satisfy ρ (x2 , y) < h. Assertion (2) follows from the fact that if y and y are points of
[x1 , x3 ] such that ρ (y , x2 ) < h, ρ (y , x2 ) < h, and x2 = α x2 +(1 − α )x2 , (0 < α < 1),
then we know about y + α y + (1 − α )y that it satisfies
ρ (x2 , y) = ρ (α x2 + (1 − α )x2 , α y + (1 − α )y )
≤ ρ (α x2 , α y ) + ρ ((1 − α )x2 , (1 − α )y )
≤ αρ (x2 , y ) + (1 − α )ρ (x2 , y ) ≤ h
(it should be recalled that although in this case X is a Banach space, we are using
the distance rather than the norm for uniformity of notation). To prove our assertion,
it is sufficient to construct, for every ε and for all sufficiently large n, parametric
representations (yεn (s),tnε (s)) for Γxn (t) and (yε (s),t ε (s)) for Γx(t) such that
sup |tnε (s) − t ε (s)| < ε ; sup ρ (yεn (s), yε (s)) < ε .
s
s
Let (y(s),t(s)) be some parametric representation of Γx(t) and let 0 = s0 < s1 < s2 <
· · · < sm = 1 be values of the parameter such that t(si ) are either points of continuity of x(t) or points where the discontinuities are no less than ε , and further let
t(si ) − t(si − 1) < η for i = 1, · · · , m − 1 and min[ρ (y(si ), y(s)); ρ (y(s), y(si+1 ))] <
ε /2 for s ∈ (si , si+1 ). If finally t(sk ) = t(sk+1 ), then ε /4 < ρ (y(sk ), y(sk+1 )) < ε /2.
Let us choose an n0 sufficiently large and an η < ε such that if t(si ) is a point
of continuity of x(t), then ρ (xn (t(si )), x(t(si ))) < ε /4 and ΔM1 (η , xn (t)) < ε /8
for n > n0 . Then if the point t = t(sr ) = · · · = t(sr+ j ) is a point of discontinuity of x(t) at which the discontinuity is no less than ε , then there exist on
(n) (n)
(n) (n)
(n) (n)
n − t | < η and
Γxn (t) points (yr ,tr ), (yr+1 ,tr+1 ), · · · , (yr+ j ,tr+ j ) such that |tr+1
(n)
ρ (yr+i , y(sr+i )) < 78 ε (because the segment [y(sr−1 ), y(sr+ j+1 )] lies at a distance
no greater than ε /4 from [xn (t(sr−1 )), xn (t(sr+ j+1 ))] and at a distance no greater
than ε /2 from [x(t(sr ) − 0), x(t(sr+ j ))], while all the points of Γx(t) with t(sr−1 ) <
t < t(sr+ j+1 ) lie no further than ε /8 from [xn (t(sr−1 )), xn (t(sr+ j+1 ))].
34
1 Limit Theorems for Stochastic Processes
Thus if (yn (s),tn (s)) is some parametric representation of Γxn (t) , then there
(n)
(n)
(n)
(n)
exist points 0 = s0 < s1 < · · · < sm = 1 such that |tn (si ) − t(si )| < ε and
(n)
ρ (yn (si ), y(si )) < ε . Let λ (s) be the monotonic continuous function which maps
(n)
[0, 1] onto itself, such that λ (si ) = si . Then (yn (λ (s)),tn (λ (s))) will also be a
parametric representation of Γxn (t) , and it is easily shown that
|tn (λ (s)) − t(s)| < 2ε , ρ (yn (λ (s)), y(s)) < 4ε .
This completes the proof.
2.5. The topology J2 .
J
2.5.1. A necessary and sufficient condition that xn (t) →2 x(t) is that
lim lim ΔJ2 [c; xn (t), x(t)] = 0,
c→0 n→∞
where
ΔJ2 (c; y(t), x(t)) = sup[min{ρ (x(tc ), y(t)); ρ (y(t), x(tc∗ ))}]
t
(the points tc and
tc∗
are defined as previously).
J
Proof. NECESSITY. Let xn (t) →2 x(t), so that there exists a sequence λn (t) such
that
lim sup ρ (xn (t), x(λn (t)) = 0; lim sup |t − λn (t)| = 0.
(2.17)
n→∞ t
n→∞ t
Therefore
ρ (x(tc ), xn (t)) ≤ ρ (x(tc ), x(λn (t))) + ρ (x(λn (t)), xn (t)),
ρ (xn (t), x(tc∗ )) ≤ ρ (x(λn (t)), x(tc∗ )) + ρ (x(λn (t)), xn (t)).
Hence
min{ρ (x(tc ), xn (t)), ρ (xn (t)), x(tc∗ ))} ≤ ρ (x(λn (t)), xn (t))
+ min{ρ (x(λn (t)), x(tc )); ρ (x(λn (t)), x(tc∗ ))}.
(2.18)
In view of (2.17) we have only to show that
lim lim sup min{ρ (x(tc ), x(λn (t)); ρ (x(λn (t)), x(tc∗ ))} = 0.
c→0 n→∞ t
(2.19)
But if |λn (t) − t| < c, we have tc ≤ λn (t) ≤ tc∗ , so that (2.19) follows from 2.1.4.
SUFFICIENCY. Assume there exists such a sequence ck → 0 and nk → ∞ that
ΔJ2 (ck ; xn (t), x(t)) < εk for n > nk and εk → 0.
Let t1 ,t2 , · · · ,tk be the points at which the discontinuities of x(t) are no less than
ε . Let us surround these points by intervals Δ1 , Δ2 , · · · , Δk so small that
ρ (x(t), x(ti − 0)) < ε for t < ti ,t ∈ Δi ,
ρ (x(t), x(ti + 0)) < ε for t > ti ,t ∈ Δi .
(2.20)
Limit Theorems for Stochastic Processes
Let us choose ck so small that if t ∈ [0, 1] −
35
k
i=1 Δ j
we have
ρ (x(t), x(tck − 0)) < 2ε ,
ρ (x(t), x(tc∗k )) < 2ε .
(2.21)
Then for n > nk and t ∈ Δi we have either
ρ (xn (t), x(ti − 0)) < 3ε + εk ,
or
ρ (xn (t), x(ti + 0)) < 3ε + εk .
(2.22)
If ρ (x(ti − 0), x(ti + 0)) ≥ 6ε + εk , then only one of (2.22) can be fulfilled. Let
ε ,ε
us now construct the function λn k (t), which maps the segment [0, 1] onto itself
ε ,εk
in the following way: λn (t) = t everywhere except on those intervals Δi where
ε ,ε
ρ (x(ti − 0), x(ti + 0)) ≥ 6ε + 2εk , and on these intervals λn k (t) maps the set of
values of t for which ρ (xn (t), x(ti − 0)) < 3ε + εk into the set t < ti ,t ∈ Δi , and the
set of values of t for which ρ (xn (t), x(ti + 0)) < 3ε + εk into the set t ≥ ti ,t ∈ Δi .
ε ,ε
Obviously |λn k (t) − t| is equal to or less than the greatest of the Δi intervals,
ε ,ε
and ρ (xn (t), x(λn k (t))) ≤ 4ε + εk for n > nk . Now by choosing the sequences
ε → 0, k → ∞ and δ = {max length Δi } → 0, we obtain the desired result.
It is easily seen that in proving the sufficiency we have proved the following
stronger assertion.
J
2.5.2. A sufficient condition for xn (t) →2 x(t) is that
lim lim ΔJ2 (c; xn (t), x(t)) = 0.
c→0 n→∞
2.5.3. CONVERGENCE CONDITION IN THE TOPOLOGY J2 . Necessary and
sufficient conditions for the functions xn (t) to be J2 -convergent to x(t) are that
(a) xn (t) converge to x(t) on an everywhere dense set of [0, 1] which contains 0
and 1, and
(b) limc→0 limn→∞ ΔJ2 (c, xn (t)) = 0, where
ΔJ2 (c, y(t)) =
sup
tc ≤t1 ≤tc +c/2,t∈[0,1],tc∗ −c/2≤t2 ≤tc∗
min[ρ (y(t1 ), y(t)); ρ (y(t), y(t2 ))].
J
Proof. NECESSITY OF CONDITION (b). Let xn (t) →2 x(t). Then there exists a
sequence λn (t) such that supt |λn (t) − t| and supt ρ (xn (t), x(λn (t))) → 0 as n → ∞.
Therefore
ρ (xn (t1 ), xn (t)) ≤ ρ (x(λn (t1 )), x(λn (t))) + 2 sup ρ (x(λn (t)), xn (t)),
t
ρ (xn (t), xn (t2 )) ≤ ρ (x(λn (t), x(λn (t2 ))) + 2 sup ρ (x(λn (t)), xn (t)).
t
Hence
36
1 Limit Theorems for Stochastic Processes
min[ρ (xn (t1 ), xn (t)); ρ (xn (t), xn (t2 ))] ≤ 2 sup ρ (xn (t), x(λn (t)))
t
+ min[ρ (x(λn (t1 )), x(λn (t))); ρ (x(λn (t)), x(λn (t2 )))].
By choosing c sufficiently small and n so large that |λn (t) − t| < c/2, we can make
the expression
sup
tc ≤t1 ≤tc +c/2;t∈[0,1],tc∗ −c/2≤t2 ≤tc∗
min[ρ (x(λn (t1 )), x(λn (t))); ρ (x(λn (t)), x(λn (t2 )))]
arbitrarily small, so that if t ∈ [c, 1 − c], then λn (t1 ) < λn (t) < λn (t2 ) and λn (t2 ) −
λn (t1 ) < 3c. This means we can apply 2.1.4. Now for t ≤ c and t ≥ 1 − c, all the
quantities x(λn (t1 )), x(λn (t)), x(λn (t2 )) can be made, arbitrarily close to x(0) or x(1)
by choosing c sufficiently small. The necessity is proven. To prove the sufficiency,
we shall prove the following stronger assertion.
J
2.5.4. A sufficient condition for xn (t) →2 x(t) is that condition 2.5.3(a) be fulfilled,
and
(b ) limc→0 limn→∞ ΔJ2 (c, xn (t)) = 0.
Proof. We have
ρ (x(tck ), xn (t)) ≤ ρ (x(tck ), x(t )) + ρ (xn (t), xn (t )) + ρ (xn (t ), x(t )),
ρ (xn (t), x(tc∗k )) ≤ ρ (xn (t), xn (t )) + ρ (xn (t ), x(t )) + ρ (x(t ), x(tc∗k )).
Since xn (t) converges to x(t) on an everywhere dense set, there is for every t a t such
that |t − t | < δ , and as n → ∞, we have ρ (xn (t ), x(t )) → 0; further ρ (x(t), x(t )) <
ε . Then for δ < ck /4
lim ΔJ2 (ck , xn (t), x(t)) ≤ lim ΔJ2 (ck + δ , xn (t)) + 2ε .
n→∞
n→∞
Therefore if
lim lim ΔJ2 (ck , xn (t)) = 0,
ck →0 n→∞
we have
lim lim ΔJ2 (ck , xn (t), x(t)) = 0,
ck →0 n→∞
which, in view of 2.5.2, proves our assertion.
2.6. The topologies J1 and U.
2.6.1. CONVERGENCE CONDITIONS IN THE TOPOLOGY J1 . Necessary
J
and sufficient conditions for xn (t) →1 x(t) are that
(a) xn (t) converges to x(t) on an everywhere dense set containing 0 and 1, and
(b) limc→0 limn→∞ ΔJ1 (c, xn (t)) = 0, where
ΔJ1 (c, y(t)) =
sup
t−c<t1 <t≤t2 <t+c
min[ρ (y(t1 ), y(t)); ρ (y(t), y(t2 ))].
Limit Theorems for Stochastic Processes
37
J
Proof. NECESSITY OF CONDITION (b). If x(t) →1 x(t), then there exists a sequence of monotonic continuous functions λn (t), which map [0, 1] onto itself such
that limn→∞ supt |λn (t) − t| = 0 and limn→∞ supt ρ (xn (t), x(λn (t))) = 0. Then
min[ρ (xn (t1 ), xn (t)); ρ (xn (t), xn (t2 ))] ≤ 4 sup ρ (xn (t), x(λn (t)))
t
+ min[ρ (x(λn (t1 )), x(λn (t))); ρ (x(λn (t)), x(λn (t2 )))],
and since (1) λn (t1 ) < λn (t) < λn (t2 , ) and (2), as c → 0 and n → ∞, we know that
λn (t2 ) − λn (t1 ) becomes arbitrarily small, it follows that ΔJ1 (c, xn (t)) approaches
zero as n → ∞ and c → 0.
SUFFICIENCY. Assume that (a) and (b) are fulfilled. We must construct a sequence of monotonic functions λn (t) mapping [0, 1] onto itself such that
supt ρ (xn (t), x(λn (t))) → 0 and |λn (t) − t| → 0 as n → ∞. To do this it is sufficient
to construct continuous monotonic functions λnε (t) mapping [0, 1] onto itself, such
that for n > nε we have ρ (xn (t), x(λnε (t))) < ε and limε →0 limn→∞ |λnε (t) − t|. Let
us attempt to construct such functions.
Let t1 ,t2 , · · · ,tk be all the points where the discontinuities of x(t) are no less than
μ . Let us choose c and N so that for n > N we have
ΔJ1 (c, xn (t)) < μ .
(2.23)
Further, let this c be so small that if t and t both belong to the same one of the
intervals (0,t1 ), (t1 ,t2 ), · · · , (tk , 1), then ρ (x(t ), x(t )) < μ so long as |t − t | < c.
The fact that xn (t) converges to x(t) on an everywhere dense set implies that
(c)
(c)
(c)
(c)
(c)
(c)
there exist points 0 = τ0 < τ1 < τ2 < · · · < τm = 1 such that τi − τi−1 <
(c)
(i)
c/2, (i = 1, 2, · · · , m), and ρ (xn (τi ), x(τi )) < μ for n > N . Let tl lie between
(c)
(c)
τil and τil +1 . Then if the discontinuity at tl is greater than 8μ , it follows that
(c)
(c)
(c)
(c)
ρ (x(τil ), x(τil +1 )) > 6μ . This implies that ρ (xn (τil ), xn (τil +1 )) > 4μ for n >
(c)
(c)
(c)
N . But if n > max{N , N } and t ∈ (τil , τil +1 ), we have min{ρ (xn (τil ), xn (t)),
(c)
(c)
(c)
ρ (xn (τil +1 ), xn (t))} < μ . Therefore in the interval (τil , τil +1 ) there is a point t l
(c)
(c)
(c)
such that t ∈ (τil ,t l ) implies ρ (xn (τil ), xn (t)) < μ and t ∈ (t l , τil +1 ) implies
(c)
ρ (xn (t), xn (τil +1 )) < μ . (If this were not so, there would be points t1 < t2 < t3 ∈
(c)
(c)
(τil , τil +1 ) such that
(c)
(c)
(c)
ρ (xn (t1 ), x(τil ) < μ , ρ (xn (t2 ), xn (τil +1 ) < μ , ρ (xn (t3 ), x(τil ) < μ ,
or ρ (xn (t1 ), xn (t2 )) > 2μ and ρ (xn (t2 ), xn (t3 )) > 2μ , which is impossible according
to (2.23)).
Now let tk1 ,tk2 , · · · ,tkr be points where the discontinuities are greater than 8μ ,
and let the t ki be points constructed for the tki in exactly the same way as t l was
constructed for tl . Let λnε (t) be a function continuous and monotonic on each of the
38
1 Limit Theorems for Stochastic Processes
intervals (0,tk1 ), (tk1 ,tk2 ), · · · , (tkr , 1) of the form
λnε (t) = ait + bi , λnε (tki ) = t ki .
(c)
(c)
Obviously |λnε (t) − t| < c/2. Let us evaluate ρ (xn (t), x(λn (t))) for t ∈ (τ j , τ j+1 ).
If in this interval there is not a single point t ki (and therefore no point tki ), then
(c)
(c)
ρ (xn (t), x(λnε (t))) ≤ μ + min{ρ (xn (t), xn (τ j )); ρ (xn (t), xn (τ j+1 ))}
(c)
(c)
+ max{ρ (x(λnε (t)), x(τ j )); ρ (x(λnε (t)), x(τ j+1 ))} < 12μ .
(c)
(c)
(c)
(c)
If t ∈ (τ j , τ j+1 ), however, and tkν and t kν also belong to (τ j , τ j+1 ), then
(c)
(c)
either t ∈ (τ j ,tkν ), λnε (t) < t kν or t ∈ (t kν , τ j+1 ), λnε (t) > t kν . This implies that
ρ (xn (t), x(λnε (t)) < 3μ . If we set ε = 12μ , we obtain the desired result.
2.6.2. CONVERGENCE CONDITIONS IN THE TOPOLOGY U. Necessary
U
and sufficient conditions for xn (t) → x(t) are:
J
(a) xn (t) →1 x(t) and
(b) if t1 ,t2 , · · · ,tk denote points where the discontinuities of x(t) are greater than
ε , where ε is arbitrary except that x(t) has no discontinuities whose magnitude is
(n) (n)
(n)
ε , and if t1 ,t2 , · · · ,tkn denote the points where the discontinuities of xn (t) are
(n)
greater than ε , then for n greater than a certain value we have kn = k,ti = ti .
The necessity of both conditions is obvious. The sufficiency is seen if we note
that by constructing λnε (t) (see the proof of the sufficiency in 2.6.1), we obtain
λnε (t) = t.
2.7. Compactness conditions in the topologies J1 , J2 , M1 and M2 . We note the
following conditions necessary for compactness.
2.7.1. If a set of functions K ⊂ KX is to be compact in one of the topologies
J1 , J2 , M1 , M2 , it is necessary that for all t ∈ [0, 1] and x(t) ∈ K the values of x(t)
belong to a single compact set A of X.
Indeed, if we have a sequence of points xk (tk ), then by choosing a sequence nk ,
M
such that xnk (t) →2 x0 (t),tnk → t0 , we find that the distance between xnk (tnk ) and the
segment [x0 (t0 − 0), x0 (t0 )] approaches zero, which means that xnk (tnk ) is compact,
so that the segment is compact.
2.7.2. Sufficient condition for compactness. The set of functions K is compact in
a topology S, where S is J1 , J2 , M1 or M2 , if 2.7.1 is fulfilled and if
lim lim sup(ΔS (c, x(t)) + sup ρ (x(0), x(t)) + sup ρ (x(1), x(t))) = 0
c→0 x(t)∈K
0<t<c
1−c<t<1
(by lim sup of a certain numerical set we mean its maximum limit point).
(2.24)
Limit Theorems for Stochastic Processes
39
Proof. Choosing some everywhere dense set N of values of t which contains 0 and
1, we may take from any sequence xn (t) a subsequence xnk (t) such that if t ∈ N then
limnk →∞ xnk (t) exists. Since ΔM2 (c, x(t)) ≤ ΔS (c, x(t)), we have
lim lim (ΔM2 (c, xnk (t)) + sup ρ (x(0), x(t)) + sup ρ (x(1), x(t))) = 0.
c→0 nk →∞
1−c<t<1
0<t<c
M
According to 2.3.5, therefore, there exists an x(t) ∈ KX such that xnk (t) →2 x(t). This
means that xnk (t) converges to x(t) on an everywhere dense set containing 0 and 1,
S
and that (2.24) is fulfilled. Hence xnk (t) → x(t).
Bearing in mind that ΔJ1 (c, x(t)) and ΔM1 (c, x(t)) are monotonic functions of c,
it is easy to obtain the following:
2.7.3. Necessary and sufficient condition for compactness in the topologies
J1 , M1 .4
Let S1 denote either J1 or M1 . Conditions 2.7.1 and
lim sup (ΔS1 (c, x(t)) + sup ρ (x(0), x(t)) + sup ρ (x(1), x(t))) = 0
c→0 x(t)∈K
1−c<t<1
0<t<c
are necessary and sufficient for K to be compact in S1 . The next example shows that
condition (2.24) is not necessary either for J2 or for M2 .
2.7.4. Example. Consider the set K consisting of the functions xs (t), s ∈ 14 , 12 ,
given by
⎧
0, 0 ≤ t < s,
⎪
⎪
⎪
⎨1, s ≤ t < 1 + s/2,
4
xs (t) =
⎪0, 14 + s/2 ≤ t < 12 ,
⎪
⎪
⎩
1, 12 ≤ t ≤ 1.
The set is J2 -compact and therefore also M2 -compact, but
lim sup ΔJ2 (c, xs (t)) = 1,
s
1
lim sup ΔM2 (c, xs (t)) = 1, for c < .
2
s
2.8. A certain inequality. The inequality we are about to derive hardly has any
independent meaning, but will be needed in the proof of the limit theorems.
Let us write
R(δ , x(t), y(t)) =
sup
inf
0≤t≤1−δ t≤t1 ≤t+δ
ρ (x(t), y(t)),
R(δ , x(t), y(t)) = max{ρ (x(0), y(0)); ρ (x(1), y(1)); R(δ , x(t), y(t))}.
Further, we set
4
The compactness conditions for the topology J1 have been obtained by A. N. Kolmogorov.
40
1 Limit Theorems for Stochastic Processes
k(c, x(t)) = max{ sup ρ (x(0), x(t)); sup ρ (x(1), x(t))}.
0≤t≤c
1−c≤t≤1
Then for c < c/4, we have
ΔS (c, y(t)) ≤ L(R(α c, x(t), y(t)) + ΔJ1 (3c, x(t)) + k(2c, x(t)) + ΔS (c, y(t))),
where S denotes one of the topologies M2 , M1 , J2 , or J1 , and L and α < 1 are certain
absolute constants.
We shall give the proof for M2 . We note two properties of the quantity ΔJ1 (c, x(t)).
These are: (1) on a segment of length c the function x(t) cannot have two discontinuities larger than 2ΔJ1 (c, x(t)) and (2) on a segment of length c there can
exist two points t1 and t2 such that ρ (x(t1 ), x(t2 )) > 4ΔJ1 (c, x(t)) only if there
exists t1 < t ∗ < t2 such that ρ (x(t ∗ − 0), x(t ∗ + 0)) > 2ΔJ1 (c, x(t)). Let us now
evaluate H(y(t1 ), y(t2 ), y(t3 )), where t1 ∈ [t2c ,t2c + c/2] and t3 ∈ [t2∗c − c/2,t2∗c ]. If
t ∈ [c, 1 − c], then there exist points ti ,ti , i = 1, 2, 3, with ti ∈ [tic ,tic + c/2],ti ∈ [ti∗c −
c/2,ti∗c ] such that ρ (x(ti ), y(ti )) and ρ (x(ti ), y(ti )) are less than R( 12 c, x(t), y(t)). If
in the interval (t2 ,t2 ) the function x(t) has a discontinuity greater than 2ΔJ1 (2c, x(t)),
then ρ (x(t1 ), x(t2 )) and ρ (x(t2 ), x(t2 )) < 4ΔJ1 (2c, x(t)), and therefore
ρ (y(t1 ), y(t2 )) < ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 4ΔJ1 (2c, x(t)).
Similarly,
ρ (y(t3 ), y(t2 )) < ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 4ΔJ1 (2c, x(t)).
(2.25)
Therefore
H(y(t1 ), y(t2 ), y(t3 )) < 2ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 4ΔJ1 (2c, x(t)).
If, on the other hand, in the interval (t2 ,t2 ) the function x(t) has no discontinuity
greater than 2ΔJ1 (2c, x(t)), then either
ρ (x(t1 ), x(t2 )) < 4ΔJ1 (2c, x(t)); ρ (x(t1 ), x(t2 )) < 4ΔJ1 (2c, x(t)),
which means
ρ (y(t1 ), y(t3 )) < 2ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 4ΔJ1 (2c, x(t)),
or
ρ (x(t3 ), x(t2 )) < 4ΔJ1 (2c, x(t)); ρ (x(t3 ), x(t2 )) < 4ΔJ1 (2c, x(t))
which means
ρ (y(t2 ), y(t3 )) < 2ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 4ΔJ1 (2c, x(t)),
so that (2.25) remains valid. Now let t2 < c. Then
Limit Theorems for Stochastic Processes
41
ρ (y(t2 ), x(0)) < ΔM2 (c, y(t)) + R(c/2, x(t), y(t)) + k(c, x(t)),
so that
ρ (y(t1 ), y(t2 )) < 2ΔM2 (c, y(t)) + 2R(c/2, x(t), y(t)) + 2k(c, x(t)).
The case in which 1 − c < t2 < 1 is entirely analogous.
3 Fundamental Limit Theorems
Let us consider a stochastic process ξ (t) given over a time interval [0, 1] and taking
on values from a certain complete separable metric space X. This means that there
exists a space Ω of elementary outcomes ω with a probability measure P on Ω , and
we are investigating functions ξ (t, ω ) defined for t ∈ [0, 1] and for almost all ω ∈ Ω
in the sense of the measure P, and which take their values from X. A function ξ (t, ω )
is measurable in the measure P for every fixed t, which means that for every Borel
set A ⊂ X, the set of those ω for which ξ (t, ω ) ∈ A is a P-measurable set. The
probabilities P{ξ (t1 ) ∈ A1 ; ξ (t2 ) ∈ A2 ; · · · ; ξ (tk ) ∈ Ak } (the Ai are Borel sets in X)
are called the finite dimensional distributions of the process ξ (t, ω ). We shall say
that the finite dimensional distributions of the processes ξn (t, ω ) converge to the
finite dimensional distributions of the process ξ (t) if
P{ξn (t1 ) ∈ A1 ; ξn (t2 ) ∈ A2 ; · · · ; ξn (tk ) ∈ Ak } →
P{ξ (t1 ) ∈ A1 ; ξ (t2 ) ∈ A2 ; · · · ; ξ (tk ) ∈ Ak }
for all k, for any collection t1 ,t2 , · · · ,tk ∈ [0, 1], and for any collection of Borel sets
A1 , A2 , · · · , Ak for which
P{ξ (ti ) ∈ Ai ∩ X − Ai } = 0 (i = 1, 2, · · · , k).
In what follows we shall assume without further mention that for almost all ω
(with probability 1) we have ξ (t, ω ) ∈ KX .
We shall henceforth write ξ (t, ω ) only when Ω is some specific space. Otherwise
we shall write simply ξ (t).
3.1. Consider a sequence of random X-valued variables ξn converging in probability to some random X-valued variable ξ , which is to say
lim P{ρ (ξn , ξ ) > ε } = 0 for arbitrary ε .
n→∞
Then it is easy to see that the distribution of ξn converges weakly to the distribution of ξ . The converse statement is, of course, not true, since it is possible to give
ξn and ξ even on different spaces of elementary outcomes. Nevertheless we shall
now prove an assertion similar to the converse.
42
1 Limit Theorems for Stochastic Processes
3.1.1. Assume that the distributions of the quantities ξn converge weakly to the
distribution of ξ0 . Then there exist random variables xn (ω ) with n = 0, 1, · · · , such
that
P{ξn ∈ A} = P{xn (ω ) ∈ A} (n = 0, 1, · · · )
and
P{xn (ω ) → x0 (ω ), n → ∞} = 1.
(Here Ω is the segment [0, 1], and P is the ordinary Lebesgue measure).
Proof. Let us choose a Borel set Si1 ,i2 ,··· ,ik ⊂ X, where k, i1 , i2 , · · · , ik take on all
integral values, such that the following properties are fulfilled:
1) Si1 ,i2 ,··· ,ik and Si ,i ,··· ,i have no intersection if ik = ik ;
1 2
k
∞
2) ∞
ik =1 Si1 ,i2 ,··· ,ik−1 ,ik = Si1 ,i2 ,··· ,ik−1 ;
i1 =1 Si1 = X;
k
3) The diameter of Si1 ,i2 ,··· ,ik is less than or equal to 12 ;
4) P{ξ0 ∈ Si1 ,i2 ,··· ,ik ∩ X − Si1 ,i2 ,··· ,ik } = 0.
(k) (k)
The set Si1 ,i2 ,··· ,ik can be constructed in the following way. Let x1 , x2 , · · · be
a sequence of points such that every point of X lies at a distance no greater than
1 k+1
(k)
from at least one point of the sequence. We denote by Sr (xi ) the set of
2
k
k+1
(k)
< rk < 12 , such
points x such that ρ (x, xi ) < r, and we can choose an rk , 12
that
(k)
(k)
P{ξ0 ∈ Sr k (xi ) ∩ X − Sr k (xi )} = 0 for all i,
(3.1)
since there is at most a denumerable number of values of r where (3.1) is positive
for at least one value of i. Let us write
(k)
Dki = Sr k (xi ) −
i−1
(k)
Sr k (x j ),
j=1
Si1 ,i2 ,··· ,ik = D1i1 ∩ D2i2 ∩ · · · ∩ Dkik .
(n)
We denote by Δi1 ,i2 ,··· ,ik intervals of the segment [0, 1] defined as follows. The
(n)
(n)
(n)
intervals Δi1 ,··· ,ik and Δi ,··· ,i have no intersection if ik = ik , and Δi1 ,i2 ,··· ,ik lies to the
1
(n)
k
left of Δi ,i ,··· ,i if there exists an r such that i j = ij for j = 1, 2, · · · , r − 1 and ir < ir .
1 2
k
(n)
Finally the length of Δi1 ,i2 ,··· ,ik is equal to P{ξn ∈ Si1 ,i2 ,··· ,ik }. From each set Si1 ,i2 ,··· ,ik
we now choose one point xi1 ,i2 ,··· ,ik . We define the functions xnm (ω ) with ω ∈ [0, 1]
by
(n)
xnm (ω ) = xi1 ,i2 ,··· ,ik for ω ∈ Δi1 ,i2 ,··· ,im .
m
m+p
m
It is easily seen that ρ (xn (ω ), xn (ω )) ≤ 21 , so that, since X is a complete
(n)
space, the limit xn (ω ) = limm→∞ xnm (ω ) exists. Since the length of Δi1 ,i2 ,··· ,im tends
(0)
(0)
to that of Δi1 ,i2 ,··· ,im , for all internal points of the intervals Δi1 ,i2 ,··· ,im we have
Limit Theorems for Stochastic Processes
43
lim ρ (x0 (ω ), xn (ω )) ≤
n→∞
m−1
1
,
2
and therefore for all ω except possibly a denumerable set, we have xn (ω ) → x0 (ω ).
We have yet to verify that the xn (ω ) and ξn have the same distributions. For this
purpose it is sufficient, for instance, to show that P{xn (ω ) ∈ A} = P{ξn ∈ A} for A
such that
P{ξn ∈ A ∩ X − A} = 0.
Assume, therefore, that A is such a set. By A(m) we denote the sum of the Si1 ,i2 ,··· ,im
belonging to A, and by A(m) the sum of the Si1 ,i2 ,··· ,im which do not belong to X − A.
It is obvious that A(m) ⊂ A ⊂ A(m) and
P{xn (ω ) ∈ A(m) } = P{ξn ∈ A(m) }; P{xn (ω ) ∈ A(m) } = P{ξn ∈ A(m) }.
(3.2)
Further, if C(m) is the set of points whose distance from A ∩ X − A is equal to or less
than ( 12 )m , then A(m) − A(m) ⊂ C(m) , and since P{ξn ∈ A ∩ X − A} = 0 we find that
P{ξn ∈ A(m) − A(m) } → 0 as m → ∞. Together with (3.2) this proves our assertion.
What we have just proved leads to the following corollary for sequences of processes.
3.1.2. Consider a sequence of processes ξn (t) whose trajectories belong to KX
with probability 1, and let their finite dimensional distributions tend to the finite
dimensional distributions of ξ0 (t). Then it is possible to construct processes xn (t, ω )
with n = 0, 1, · · · , (Ω is the segment [0, 1] and P is the Lebesgue measure on this
segment), such that the finite dimensional distributions of xn (t, ω ) and ξn (t) are the
same, xn (t, ω ) ∈ KX for almost all ω ∈ [0, 1], and xn (t, ω ) → x(t, ω ) for almost all
ω and for all t belonging to some denumerable set N which is everywhere dense on
[0, 1], but whose choice is otherwise arbitrary.
To prove this, let N = {t 1 ,t 2 , · · · } be some everywhere dense set. Denote by X (∞)
the space of the sequences (x1 , x2 , · · · ), where xi ∈ X. In X (∞) we introduce the
metric ρ (∞) , defined by
ρ (∞) [(x1 , x2 , · · · ), (y1 , y2 , · · · )] =
∞
1
∑ (1 − e−ρ (xk ,yk ) ) k! .
k=1
With this metric, X (∞) is a complete separable metric space. Let us now consider in
X (∞) the random variable
ξn = {ξn (t 1 ), ξn (t 2 ), · · · }.
It is easily shown that the distributions of the ξn converge weakly to the distribution
(∞)
of ξ0 . It is therefore possible to construct functions xn (ω ) whose values lie in X (∞)
(∞)
(∞)
and which are defined for ω ∈ [0, 1], such that xn (ω ) → x0 (ω ) with probability
(∞)
1 and such that the distributions of the xn (ω ) and ξn are the same.
(∞)
(n)
(n)
Let xn (ω ) = (x1 (ω ), x2 (ω ), · · · ). We write
44
1 Limit Theorems for Stochastic Processes
(n)
xn (t i , ω ) = xi (ω ).
Since the joint distributions of xn (t i , ω ) and ξn (t i ) are the same, there almost certainly exists the limit
xn (t, ω ) = lim xn (t j , ω ).
(3.3)
t j →t
t j >t
The functions xn (t, ω ) are exactly those which we have wished to find out. That they
belong to KX with probability 1 follows from the fact that the joint distributions of
xn (t j , ω ) and ξn (t j ) coincide and from (3.3). By construction we have xn (t j , ω ) →
x0 (t j , ω ) with probability 1.
Clearly, for our purposes it was not necessary that the finite dimensional distributions of ξn (t) converge to those of ξ (t) for all t. It would have been sufficient that
this be true for all t in N.
3.2. Let us now recall the convergence conditions in the topologies J1 , J2 , M1 ,
and M2 , and let us relate these conditions to the results just obtained. The way in
which we will obtain the desired limit theorems then becomes immediately evident.
Let S be one of the topologies J1 , J2 , M1 , M2 . Let F be some metric separable
space, and let FS be the set of all functions f [x(t)] defined on KX whose values lie
in F and which are continuous in S.
3.2.1. Theorem. If for any f ∈ FS the distribution of f [ξn (t)] is to converge, as
n → ∞, to the distribution of f [ξ0 (t)], it is sufficient that
(a) the finite dimensional distributions of ξn (t) converge to those of ξ0 (t) for t in
some set N everywhere dense on [0, 1] and containing 0 and 1, and
(b) limc→0 limn→∞ P{ΔS (c, ξn (t)) > ε } = 0 for all ε .
Proof. If condition (a) is fulfilled, we can construct a sequence xn (t, ω ) which
converges with probability 1 to x0 (t, ω ) as n → ∞, such that xn (t, ω ) has the
same finite dimensional distributions as ξn (t). This means that the distributions
of f [ξn (t)] and f [xn (t, ω )] are the same. To prove the theorem it is sufficient to
show that f [xn (t, ω )] → f [x0 (t, ω )] in probability. According to condition (b) we
can choose sequences ck → 0, nk → ∞, and εk → 0, such that for n > nk we have
P{ΔS (ck , ξn (t)) > εk } < 1/k2 . This means that
P{ΔS (ck , xn (t, ω )) > εk } < 1/k2 .
(3.4)
Consider a sequence mk such that mk > nk and
P{R(α ck , xmk (t, ω ), x0 (t, ω )) > εk } < 1/k2 .
(3.5)
From inequality (2.8) we obtain
ΔS (c, xmk (t, ω )) ≤ L[ΔS (ck ; xmk (t, ω )) + R(α ck , xmk (t, ω ), x0 (t, ω ))
+k(2c, x0 (t, ω )) + ΔJ1 (2c, x(t, ω ))].
Since ΔS (ck , xmk (t, ω )) and R(α ck , xmk (t, ω )) approach zero with probability 1 (as
follows from (3.4) and (3.5)), we have
Limit Theorems for Stochastic Processes
45
P{ lim ΔS (c, xmk (t, ω )) > ε } ≤
n→∞
ε
.
P k(2c, x0 (t, ω )) + ΔJ1 (2c, x0 (t, ω )) >
2L
(3.6)
On the other hand, from the fact that x0 (t, ω ) ∈ KX with probability 1, it follows that
P{lim k(2c, x0 (t, ω )) = 0} = 1
c→0
and
P{lim ΔJ1 (2c, x0 (t, ω )) = 0} = 1.
c→0
Therefore
S
P{lim lim ΔJ1 (2c, xmk (t, ω )) = 0} = 1, P{xmk (t, ω ) → x0 (t, ω )} = 1,
c→0 n→∞
which means that
P{ lim xmk (t, ω ) = x0 (t, ω )} = 1.
mk →∞
In other words, no matter what the sequence f [xnk (t, ω )], it always contains a subsequence f [xmk (t, ω )] such that P{ f [xmk (t, ω )] → f [x0 (t, ω )]} = 1. This, on the other
hand, means that f [xn (t, ω )] converges to f [x0 (t, ω )] in probability. This proves the
theorem.
It is easily seen that conditions (a) and (b) of Theorem 3.2.1 are not necessary
for all spaces F. (If F consists, for instance, of only a single point, the distributions
of all the f [ξn (t)] coincide.) If F is a discrete space, conditions (a) and (b) are also
not necessary.
It turns out that these conditions are necessary if F contains a subset which can be
mapped continuously onto a line segment. This assertion follows from the following
proposition.
3.2.2. Theorem. If for any bounded functional f [x(t)] (by functional we mean
numerical function, i.e., in the present case F is the real line) continuous in the topology S, the distribution of f [ξn (t)] converges weakly to the distribution of f [ξ0 (t)],
then
(a) the finite dimensional distributions of ξn (t) converge to those of ξ0 (t) for t in
some everywhere dense set N of [0, 1] containing 0 and 1, and
(b) limc→0 limn→∞ P{ΔS (c, ξn (t)) > ε } = 0 for all ε .
To prove (a) we show that if t1 ,t2 , · · · ,tk are stochastic continuity points of ξ0 (t)
(which means that the probability of ξ0 (t) converges to that of ξ0 (ti ) as t → ti , i =
1, · · · , k), and if A1 , A2 , · · · , Ak ⊂ X are sets such that P{ξ0 (ti ) ∈ Ai ∩ X − Ai } = 0, i =
1, 2, · · · , k, then
lim P{ξn (ti ) ∈ Ai ; i = 1, 2, · · · , k} = P{ξ0 (ti ) ∈ Ai ; i = 1, 2, · · · , k}.
n→∞
To do this, consider the functional
(3.7)
46
1 Limit Theorems for Stochastic Processes
k
fε∗ [x(t)] = ∑ sup gti ,δ (t)[λ (Aεi , x(t)) + 1]/[λ (Aεi , x(t)) + 2],
(3.8)
i=1 t
where gti ,δ (t) is a continuous function never greater than 1, equal to 1 on (ti −
δ ,ti + δ ), and vanishing outside of (ti − 2δ ,ti + 2δ ); the Aεi are the sets of internal
points of the Ai whose distance from the boundaries of the Ai is greater than ε ,
and λ (A, x(t)) = infy∈A ρ (x(t), y). The functional of (3.8) is continuous in all of
the topologies J1 , J2 , M1 , M2 , since λ (A, x(t)) gives a continuous mapping of KX
onto KR1 (where R1 is the real line) in all the topologies. There therefore exists ε1
arbitrarily small such that
k
k
∗
∗
P fε [ξn (t)] < + ε1 → P fε [ξ0 (t)] < + ε1 .
2
2
Since supt gti (t)(λ (Aεi , x(t)) + 1)/(λ (Aεi , x(t)) + 2) ≤ 12 , it follows that fε∗ [ξn (t)] <
k/2 + ε1 implies
gti ,δ (t)(λ (Aεi , ξn (t)) + 1)/(λ (Aεi , ξn (t)) + 2) <
1
+ ε1
2
for all i, and therefore λ (Aεi , ξn (ti )) < ε1 for all i. If ε1 < ε , therefore, we have
k
∗
lim P{ξn (ti ) ∈ Ai ; i = 1, 2, · · · , k} ≥ P fε [ξ0 (t)] < + ε1 .
2
n→∞
On the other hand, from the fact that the ti are stochastic continuity points of ξ0 (t),
it follows that the event {λ (Aεi , ξ0 (ti )) < ε1 /k; i = 1, 2, · · · , k} implies the event
{ fε∗ [ξ0 (t)] < k/2 + 2ε1 } if one neglects a set whose measure can be made arbitrarily small by choosing δ sufficiently small. Since for sufficiently small ε and ε1 ,
P{λ (Aεi , ξ0 (ti )) < ε1 ; i = 1, · · · , k} is arbitrarily close to P{ξ0 (ti ) ∈ Ai ; i = 1, · · · , k},
we have
k
∗
P fε [ξ0 (t)] < + ε1 ≥ P{ξ0 (ti ) ∈ Ai ; i = 1, · · · , k} − μ ,
2
where μ is an arbitrarily small number. Therefore
lim P{ξn (ti ) ∈ Ai ; i = 1, 2, · · · , k} ≥ P{ξ0 (ti ) ∈ Ai ; i = 1, 2, · · · , k}.
(3.9)
n→∞
Since X = Ai ∪CAi (where CAi is the complement of Ai ) and the sets CAi have the
same properties as Ai (Ai ∩ X − Ai = CAi ∩ X −CAi ), it is not difficult to obtain (3.7)
from (3.9).
Let us demonstrate this for k = 2. We have the inequalities
Limit Theorems for Stochastic Processes
47
lim P{ξn (t1 ) ∈ A1 , ξn (t2 ) ∈ A2 } ≥ P{ξ0 (t1 ) ∈ A1 , ξ0 (t2 ) ∈ A2 },
n→∞
lim P{ξn (t1 ) ∈ A1 , ξn (t2 ) ∈ CA2 } ≥ P{ξ0 (t1 ) ∈ A1 , ξ0 (t2 ) ∈ CA2 },
n→∞
lim P{ξn (t1 ) ∈ CA1 , ξn (t2 ) ∈ A2 } ≥ P{ξ0 (t1 ) ∈ CA1 , ξ0 (t2 ) ∈ A2 },
(3.10)
n→∞
lim P{ξn (t1 ) ∈ CA1 , ξn (t2 ) ∈ CA2 } ≥ P{ξ0 (t1 ) ∈ CA1 , ξ0 (t2 ) ∈ CA2 }.
n→∞
If (3.7) were not true, there would be a sequence nk such that
lim P{ξnk (t1 ) ∈ A1 , ξnk (t2 ) ∈ A2 } > P{ξ0 (t1 ) ∈ A1 , ξ0 (t2 ) ∈ A2 }.
nk
If we were to add to this inequality the last three of (3.10), we would obtain 1 > 1.
Thus (a) is proved.
The proof of (b) is almost the same for the different topologies. Let us go through
it for the case of S = J1 . Let
(m)
ΔJ1 (c, x(t)) = max (gc (t1 ,t,t2 ))m {min[ρ (x(t1 ), x(t)); ρ (x(t), x(t2 ))]},
t1 <t<t2
and let gc (t1 ,t,t2 ) be a continuous function of the arguments t1 ,t,t2 such that
= 1 for t − c < t1 < t < t2 < t + c,
gc (t1 ,t,t2 )
< 1 for the other t1 ,t,t2 .
(m)
(m)
For every m and c the functional ΔJ1 (c, x(t)) is J1 -continuous and ΔJ1 (c, x(t)) >
ΔJ1 (c, x(t)). Therefore
(m)
lim P{ΔJ1 (c, ξn (t)) > ε } ≤ P{ΔJ1 (c, ξ0 (t)) > ε }.
n→x
(m)
Since limm→∞ ΔJ1 (c, x(t)) = ΔJ1 (c, x(t)), we have
lim P{ΔJ1 (c, ξn (t)) > ε } ≤ P{ΔJ1 (c, ξ0 (t)) > ε /2}.
n→∞
It remains to be shown that
lim P{ΔJ1 (c, ξ0 (t)) > ε } = 0.
c→0
This, however, follows from the fact that for all x(t) in KX we have
lim ΔJ1 (c, x(t)) = 0.
c→0
This proves the theorem.
If we review the proof of Theorem 3.2.1, we see that it is not necessary to require
S
that f [x(t)] be continuous. It would be sufficient to use f [x(t)] for which ξn (t) →
48
1 Limit Theorems for Stochastic Processes
ξ0 (t) with probability 1 implies f [ξn (t)] → f [ξ0 (t)] with probability 1. Let μξ0 be
the measure on KX of the process ξ0 (t). Then such functions f [x(t)] will be functions
continuous in the topology S almost everywhere in terms of the measure μξ0 . We
then arrive at the following theorem.
3.2.3. Theorem. Conditions (a) and (b) of 3.2.1 imply that for every function
f defined on KX whose values lie in F, which is continuous in the topology S almost everywhere in terms of the measure μξ0 , the distribution of f [ξn (t)] converges
weakly to the distribution of f [ξ0 (t)] as n → ∞.
3.2.4. Corollary. Let K be a set of KX open in the topology S, such that P{ξ0 (t) ∈
KS } = 0 (where KS is the boundary of K in the topology S). Then conditions (a) and
(b) of 3.2.1 imply that
lim P{ξn (t) ∈ K} = P{ξ0 (t) ∈ K}.
n→∞
3.2.5. REMARK. If the space F contains an inverse image of the line segment,
the convergence of the distribution of f [ξn (t)] to that of f [ξ0 (t)] for all f in FS
implies the convergence of the distribution of f [ξn (t)] to that of f [ξ0 (t)] for all f
which are S-continuous almost everywhere in terms of the measure μξ0 .
3.2.6. REMARK. The extension of Theorem 3.2.1 to an almost everywhere continuous functional increases significantly the region of its applicability. Thus, of all
the examples we have considered in 2.2.6–2.2.8, only m[t1 ,t2 ] [x(t)] and M[t1 ,t2 ] [x(t)]
are continuous in U. Nevertheless all these functionals are almost everywhere continuous in the appropriate topologies (see 2.2.9–2.2.12) so long as t1 and t2 are
stochastic continuity points of ξ0 (t), and a and b are such that at a local extremum
ξ0 (t) differs from a and b with probability 1.
3.3. Let us now go on to a consideration of the convergence conditions for functionals continuous in the uniform
topology. Let FU be the set of functions defined on KX , which take their values from F (some complete metric separable
space) and are measurable with respect to the Borel closure of all cylindrical sets. (This last requirement does not in any
sense follow from continuity in the topology U, although it would for other topologies considered in this article.)
3.3.1. Assume the following two conditions to be true.
(ε ) (ε )
(ε )
(a) Let ε be such that the probability is 1 that ξ0 (t) has no discontinuities equal to ε ; let τn,1 , τn,2 , · · · , τn,ν ε be
n
points where the discontinuities of ξn (t) are greater than ε , and let
(ε )
(ε )
P{ξn (t1 ) ∈ A1 , · · · , ξn (tk ) ∈ Ak ; τn,1 ∈ B1 , · · · , τn,m
∈ Bm ; νnε = m}
approach. as n → ∞.
(ε )
(ε )
P{ξ0 (t1 ) ∈ A1 , · · · , ξ0 (tk ) ∈ Ak ; τ0,1 ∈ B1 , · · · , τ0,m ∈ Bm ; νnε = m}
(3.11)
(3.12)
for all t1 , · · · ,tk ∈ [0, 1], all Borel sets Ai in X such that
P{ξ0 (ti ) ∈ Ai ∩ X − Ai } = 0,
and all Borel sets B1 , B2 , · · · , Bm on [0, 1] so that the absolute difference between (3.11) and (3.12) is equal to or less
than some function λn (t1 , · · · ,tk , A1 , · · · , Ak ) which approaches zero as n → ∞ with fixed t1 , · · · ,tk , A1 , · · · , Ak ;
(b) limc→0 limn→∞ P{ΔJ1 (c, ξn (t)) > h} = 0 for all h > 0.
If both of these conditions are fulfilled then for any function f ∈ FU the distribution of f (ξn (t)) converges to that of
f (ξ0 (t)).
Proof. Let us first assume that f is uniformly continuous in U, that is, for every ε there exists a δ such that
rF ( f (x(t)), f (y(t))) < ε ,
Limit Theorems for Stochastic Processes
49
if supt ρ (x(t), y(t)) < δ .
If the theorem is not true for f , there exists a G ⊂ F such that
P{ f (ξ0 (t)) ∈ G ∩ F − G} = 0,
(3.13)
but P{ f (ξn (t)) ∈ G} does not approach P{ f (ξ0 (t)) ∈ G}. One can therefore find a > 0 and subsequences nk such that
|P{ f (ξnk (t)) ∈ G} − P{ f (ξ0 (t)) ∈ G}| > a.
Let, for instance,
P{ f (ξnk (t)) ∈ G} > P{ f (ξ0 (t)) ∈ G} + a.
We define the random processes ξn∗ (t) as follows. Choose points 0 = t0 < t1 < · · · < tN = 1, and set ξn∗ (t) = ξn (ti ) if
(ε )
ti ≤ t < ti+1 and if between ti and ti+1 there are either no points τn,k or no less than two points; if, on the other hand, in
(ε )
(ti ,ti+1 ) there lies one point τn, j , we set
ξn∗ (t) =
(ε )
ξn (ti )
for ti ≤ t < τn, j ,
(ε )
ξn (ti+1 ) for τn, j ≤ t < ti+1 .
It is not difficult to show that
P{sup ρ (ξn∗ (t), ξn (t)) > 8ε } ≤ P{ΔJ1 (c, ξn (t)) > ε },
(3.14)
t
where c = maxi=1,··· ,N−1 (ti+1 − ti ).
+
Therefore if f (ξn (t)) ∈ G, we have f (ξn∗ (t)) ∈ G+
δ , where Gδ is the set of all points whose distance from G is no
greater than δ , and where δ is such that
rF ( f (y(t)), f (x(t))) < δ for sup ρ (x(t), y(t)) < 8ε
t
if ΔJ1 (c, ξn (t)) ≤ ε .
−
On the other hand, if f (ξ0∗ (t)) ∈ G−
δ , we have f (ξ0 (t)) ∈ G, where Gδ is the set of all points whose distance from
F − G is no less than δ so long as ΔJ1 (c, ξ0 (t)) ≤ ε .
Therefore
∗
−
P{ f (ξn∗k (t)) ∈ G+
δ } > a + P{ f (ξ0 (t)) ∈ Gδ }
−P{ΔJ1 (c, ξnk (t)) > ε } − P{ΔJ1 (c, ξ0 (t)) > ε }.
Further,
∗
+
∗
+
−
P{ f (ξn∗k (t)) ∈ G+
δ } > a + P{ f (ξ0 (t)) ∈ G2δ } − P{ f (ξ0 (t)) ∈ G2δ − Gδ }
−P{ΔJ1 (c, ξ0 (t)) > ε } − P{ΔJ1 (c, ξnk (t)) > ε }.
(3.15)
−
Since G+
2δ − Gδ describes, as δ → 0, a monotonically decreasing sequence of sets converging to G ∩ F − G, it follows
from (3.14) that
−
lim P{ f (ξ0∗ (t)) ∈ G+
2δ − Gδ } ≤ P{Δ J1 (c, ξn (t)) > ε }.
δ →0
Therefore by choosing ε , δ , and c sufficiently small, all the negative terms in (3.15) can be made so small that their sum
is less than a/2. Then
∗
+
P{ f (ξn∗k (t)) ∈ G−
(3.16)
δ } > a/2 + P{ f (ξ0 (t)) ∈ G2δ }.
Obviously the distribution of ξn∗ (t) is uniquely defined by the probabilities
(ε )
P{ξn (ti ) ∈ Ai ; i = 0, 1, · · · , N; τn, j ∈ B j ; j = 1, 2, · · · , m; νnε = m}.
Consider sets Ari , i = 0, 1, · · · , N; r = 1, 2, · · · , R, such that
P{ξ0 (ti ) ∈ Ari ∩ X − Ari } = 0,
∑ P{ξ0 (ti ) ∈ Ari ; i = 1, · · · , R} > 1 − a/4,
(3.17)
50
1 Limit Theorems for Stochastic Processes
where the sum is taken over all possible different collections r1 , r2 , · · · , rN , in which each ri takes on one of the values
1, · · · , R independently of the others. In addition, assume that the diameter of each of the Ari is no greater than μ , where
μ is a number such that
rF ( f (x(t)), f (y(t))) < δ /4 if sup ρ (x(t), y(t)) < μ .
t
let
∗
Let us now construct new processes ξ n (t) in the following
(r)
y be any point different from all of these xi . We set
∗
ξ n (ti ) =
(r)
way. From each of the Ari we choose one point xi and
(r)
xi , if ξn∗ (ti ) ∈ Ari ,
/ r Ari ,
y,
if ξn∗ (ti ) ∈
∗
∗
and the discontinuity points of ξ n (t) coincide with those of ξn∗ (t), and we set ξ n (t) equal to a constant where ξn∗ (t) is
constant. Then it follows from the choice of μ , (3.16), and (3.17) that
∗
P{ f (ξ n (t)) ∈ G+
5δ /4 } >
k
∗
a
+ P{ f (ξ 0 (t)) ∈ G+
7δ /4 }.
4
(r )
(r )
(3.18)
(r )
There exists therefore at least one m and a collection of points x0 0 , x1 1 , · · · , xN N for which
∗
∗
(r )
ε
i
lim P{ f (ξ n (t)) ∈ G+
5δ /4 , νn = m, ξ n (ti )) = xi ; i = 1, · · · , N}
n→∞
∗
∗
(r )
ε
i
> P{ f (ξ 0 (t)) ∈ G+
7δ /4 , ν0 = m, ξ 0 (ti )) = xi ; i = 1, · · · , N},
which contradicts (a), since it implies that
(ε )
(ε )
P{νnε = m, τn,1 ∈ B1 , · · · , τn,m
∈ Bm , ξn (t0 ) ∈ A00 , · · · , ξn (tN ) ∈ ANN },
r
r
r
r
converges, for fixed m, A11 , · · · , ANN to
(ε )
(ε )
P{ν0ε = m, τ0,1 ∈ B1 , · · · , τ0,m ∈ Bm , ξ0 (t0 ) ∈ A00 , · · · , ξ0 (tN ) ∈ ANN }
r
r
in the sense of variation of signed measures.
This completes the proof of the theorem for uniformly continuous functions. The transition to arbitrary continuous
functions, and even to almost everywhere continuous functions in the measure generated by ξ0 (t) can be performed in
exactly the same way as in one of the author’s previous works [7] if we note that the distance in the uniform metric from
a measurable set is a measurable uniformly continuous function.
3.3.2. REMARK. Condition (a) can be replaced by the following condition: (a ) The variation of the differ(ε )
(ε )
(ε )
(ε )
ence between the distribution of the τn,1 , · · · , τn,m for νnε = m and the distribution of the τ0,1 , · · · , τ0,m for ν0ε = m
(ε )
(ε ) , ξn (t1 ), · · ·
n,νn
converges to zero; the joint distribution of the τ
(ε )
τ (ε ) , ξ0 (t1 ), · · ·
0,ν0
, ξn (tk ) converges to the joint distribution of the
, ξ0 (tk ) if t1 , · · · ,tk belong to some set everywhere dense on [0, 1] which contains 0 and 1.
3.3.3. REMARK. Condition (a ) in 3.3.2 and condition (b) in 3.3.1 are necessary and sufficient for Theorem 3.3.1
if F is a segment of the line. The necessity of (b) follows from 3.2.2. The necessity of (a ) can be shown by considering
functionals of the form
(ε )
(ε )
f (ξn (t)) = g(ξn (t1 ), ξn (t2 ), · · · , ξn (tk ); τn,1 , · · · , τn,m
),
where g is continuous in the first set of variables and Borel measurable in the second set.
All the limit theorems obtained can be applied to various concrete cases. The
author hopes in the near future to publish applications of these theorems to processes
with independent increments and to Markov processes.
Received by the editorial office
October 10, 1956.
Limit Theorems for Stochastic Processes
51
References
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
A.N.Kolmogorov, Izv. Akad. Nauk SSSR. Ser. Mat. Estestv. Nauk, (1931), p. 959.
A.N.Kolmogorov, Izv. Akad. Nauk SSSR. Set. Mat. Estestv. Nauk, (1933), p. 63.
M.Dosker, Mem. Amer. Math. Sot., 6 (1951).
I.I.Gikhman, Kiiv. Derzh. Univ. Mat. Sb., (1953), pp. 7, 75.
Yu.V.Prokhorov, Uspekhi Mat. Nauk, (1953), pp. 8, 165.
Yu.V.Prokhorov, Teor. Veroyatnost. Primenen., 1:2 (1956).
A. V. Skorokhod, Dokl. Akad. Nauk SSSR., 104:3 (1955).
A. V. Skorokhod, Dokl. Akad. Nauk SSSR., 106:5 (1956).
N.N.Chentsov, Teor. Veroyatnost. Primenen., 1:1 (1956).
W.Feller, Trans. Amer. Math. Sot., 77:1 (1954).
LIMIT THEOREMS FOR STOCHASTIC PROCESSES
A. V. SKOROKHOD (MOSCOW)
(Summary)
Let us consider a sequence of processes ξn (t) such that the multivariate distribution of ξn (t1 ), ξn (t2 ), · · · , ξn (tk ) tends to the multivariate distribution of ξ0 (t1 ),
ξ0 (t2 ), · · · ,ξ0 (tk ) for all k and t1 ,t2 , · · · ,tk .
Let f be the functional for which f (ξn (t)) are determined with a probability of
1, the latter being random variables (i.e. those that have probability distributions).
This paper contains several sufficient conditions, for which the distributions of
f (ξn (t)) tend to the distribution of f (ξ0 (t)) as n → ∞.
Let K be the space of all functions not having discontinuities higher than simple
jumps, and let us assume that ξn (t) with a probability of 1 is in K.
Several topologies in K are defined. The necessary and sufficient conditions are
found for all functionals f that are continuous in these topologies for which the
distribution of f (ξn (t)) tends to the distribution of f (ξ0 (t)).
The results are demonstrated in the example of topology J1 which is defined as
follows.
The sequence xn (t) tends to x0 (t) in topology J1 if there exists a sequence of
monotonic continuous functions λn (t) for which
λn (0) = 0, λn (1) = 1, lim sup |λn (t) − t| = 0,
n→∞ t
lim sup |xn (λn (t)) − x0 (t)| = 0.
n→∞ t
Theorem. The distribution of f (ξn (t)) tends to the distribution of f (ξ0 (t)) for
all f that are continuous in topology J1 , if and only if
a) the multivariate distribution of ξn (t1 ), · · · , ξn (tk ) tends to the multivariate distribution of ξ0 (t1 ), · · · , ξ0 (tk ) for all k, and t1 ,t2 , · · · ,tk from some set N that is dense
on [0, 1].
b) for all ε > 0
lim lim P{
c→0 n→∞
sup
t−c<t1 <t<t2 <t+c
min[|ξn (t1 ) − ξn (t)|; |ξn (t) − ξn (t2 )|] > ε } = 0.
http://www.springer.com/978-3-319-28505-4