ME/MATH 577
Stochastic Systems in Science and Engineering
Chapter #03
Mean Square Calculus
This chapter addresses the fundamentals of mean-square calculus for solving stochastic differential equations. In particular,
we will focus on the Hilbert space L^2(P) on the probability space (R^n, B(R^n), P) for n ∈ N, where P : B(R^n) → [0, 1] is the
probability measure (refer to Chapter #01). Let us denote the Hilbert space L^2(P) by H; in this Hilbert space, the inner
product is defined as:
⟨X, Y⟩_H ≜ Trace(E[X Y^T]) = E[X^T Y] = E[Σ_{k=1}^{n} x_k y_k]
where the random vectors X and Y belong to the probability space (R^n, B(R^n), P) and have finite second moments.
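For a concrete numerical check (an illustrative sketch only; the Gaussian distribution, seed, dimension, and sample size below are arbitrary choices), the following Python snippet estimates the inner product by Monte Carlo sampling and confirms that E[X^T Y] and Trace(E[X Y^T]) coincide.

import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200_000                      # dimension and number of Monte Carlo samples (arbitrary)

# Draw correlated Gaussian random vectors X, Y in R^n (an arbitrary example distribution).
A = rng.standard_normal((2 * n, 2 * n))
Z = rng.standard_normal((N, 2 * n)) @ A.T
X, Y = Z[:, :n], Z[:, n:]

# <X, Y>_H = E[X^T Y]: average the per-sample scalar products.
inner_direct = np.mean(np.sum(X * Y, axis=1))

# Trace(E[X Y^T]): average the outer products over the samples, then take the trace.
inner_trace = np.trace(X.T @ Y / N)

print(inner_direct, inner_trace)       # the two estimates coincide, as the identity holds sample-by-sample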
Furthermore, if ‖x_k − x‖_H = (E[(x_k − x)^T (x_k − x)])^{1/2} → 0 as k → ∞, then we say l.i.m. x_k = x, where l.i.m.
stands for limit in the mean. In that case, l.i.m. x_k = x ⇔ lim_{k→∞} E[(x_k − x)^T (x_k − x)] = 0.
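As an illustration of the l.i.m. definition (a sketch with an arbitrarily chosen sequence; the choice x_k = x + noise/k below is only one example of a mean-square-convergent sequence), the Monte Carlo estimate of E[(x_k − x)^2] decays to zero as k grows.

import numpy as np

rng = np.random.default_rng(1)
N = 100_000                              # Monte Carlo samples per index k (arbitrary)

x = rng.standard_normal(N)               # the limiting random variable x (standard normal here)
for k in [1, 10, 100, 1000]:
    noise = rng.standard_normal(N)
    x_k = x + noise / k                  # a sequence with E[(x_k - x)^2] = 1/k^2
    mse = np.mean((x_k - x) ** 2)        # Monte Carlo estimate of E[(x_k - x)^2]
    print(k, mse)                        # decays like 1/k^2, so l.i.m. x_k = x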
Theorem 0.1: (Mean Square Convergence) Let {x_k}, {y_k} and {v_k} be random sequences and let z be a random variable,
each of which belongs to H. Let {c_k} be a sequence of real numbers and let a and b be real constants. If l.i.m. x_k = x,
l.i.m. y_k = y, l.i.m. v_k = v, and lim c_k = c, then
(i) l.i.m. c_k = c.
Proof 1: Since c_k ∈ R ∀k, it follows that E[(c_k − c)^2] = (c_k − c)^2 and lim (c_k − c)^2 = 0.
(ii) l.i.m. z = z.
Proof 2: Consider the constant sequence z_k = z ∀k; it follows that E[|z_k − z|^2] = E[|z − z|^2] = 0, which implies that l.i.m. z = z.
(iii) l.i.m. z c_k = z c.
Proof 3: lim E[|z c_k − z c|^2] = lim E[|z(c_k − c)|^2] = E[|z|^2] lim |c_k − c|^2 = 0, since E[|z|^2] < ∞ and lim c_k = c.
(iv) l.i.m.(a x_k + b y_k) = a x + b y, i.e., l.i.m. is a linear operation.
Proof 4: E[|(a x_k + b y_k) − (a x + b y)|^2] = a^2 E[|x_k − x|^2] + 2ab E[(x_k − x)(y_k − y)] + b^2 E[|y_k − y|^2].
It follows from the Cauchy-Schwarz inequality that |E[(x_k − x)(y_k − y)]| ≤ (E[(x_k − x)^2] E[(y_k − y)^2])^{1/2} → 0 as k → ∞.
Hence, l.i.m.(a x_k + b y_k) = a x + b y.
(v) lim E[x_k] = E[l.i.m. x_k] = E[x], i.e., the operations l.i.m. and E[•] commute.
Proof 5: By setting the second random variable equal to 1, it follows from the Cauchy-Schwarz inequality that
|E[x_k − x]| ≤ (E[(x_k − x)^2])^{1/2} → 0 as k → ∞. Hence, lim E[x_k] = E[l.i.m. x_k] = E[x].
(vi) lim E[x_k y_k] = E[x y].
Proof 6: It follows from the Cauchy-Schwarz inequality that
|E[x_k y_k − x y]| = |E[(x_k − x)(y_k − y)] + E[(x_k − x) y] + E[x (y_k − y)]|
≤ |E[(x_k − x)(y_k − y)]| + |E[(x_k − x) y]| + |E[x (y_k − y)]|
≤ (E[(x_k − x)^2] E[(y_k − y)^2])^{1/2} + (E[(x_k − x)^2] E[y^2])^{1/2} + (E[x^2] E[(y_k − y)^2])^{1/2} → 0.
(vii) If E[x_k y_k] = E[v_k] for all but finitely many k's, then E[x y] = E[v].
Proof 7: By part (vi), lim E[x_k y_k] = E[x y], and by part (v), lim E[v_k] = E[v]. Since E[x_k y_k] = E[v_k] for all but finitely many k, the two limits coincide, i.e., E[x y] = E[v].
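As a numerical illustration of parts (v) and (vi) (a sketch only; the correlated Gaussian example and the 1/k noise scaling below are arbitrary choices), the sample means of x_k and of x_k y_k approach E[x] and E[x y] as k grows.

import numpy as np

rng = np.random.default_rng(2)
N = 200_000                                # Monte Carlo samples (arbitrary)

# Limiting random variables x and y (correlated standard normals, an arbitrary choice).
x = rng.standard_normal(N)
y = 0.5 * x + rng.standard_normal(N)

for k in [1, 10, 100]:
    # Mean-square-convergent sequences: E[|x_k - x|^2] and E[|y_k - y|^2] decay like 1/k^2.
    x_k = x + rng.standard_normal(N) / k
    y_k = y + rng.standard_normal(N) / k
    # Part (v): lim E[x_k] = E[x];  part (vi): lim E[x_k y_k] = E[x y].
    print(k, np.mean(x_k), np.mean(x_k * y_k))

print("E[x] =", np.mean(x), " E[x y] =", np.mean(x * y))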
I. CONCEPTS OF CONTINUITY
Continuity of a stochastic process is defined in terms of convergence. As there are different concepts of convergence of
random sequences (see Chapter #01), so there are different concepts of continuity. Let x : T × Ω → R, where T ⊆ [0, ∞), be a
stochastic process on a probability space (Ω, E, P). A few examples follow.
• Sure continuity (also called sample point continuity): The stochastic process x_t(ζ) is surely continuous at a time instant
t⋆ ∈ T if ∀ε > 0 ∀ζ ∈ Ω ∃δ(ε, t⋆) > 0 such that |t − t⋆| < δ ⇒ |x_t(ζ) − x_{t⋆}(ζ)| < ε.
Equivalently, ∀ζ ∈ Ω, lim_{t→t⋆} |x_t(ζ) − x_{t⋆}(ζ)| = 0.
• Almost sure continuity (also called sample pointwise almost everywhere continuity): ∀ε > 0 ∀ζ ∈ Ω \ Θ, where P[Θ] = 0,
there exists δ(ε, t⋆) > 0 such that |t − t⋆| < δ ⇒ |x_t(ζ) − x_{t⋆}(ζ)| < ε.
Equivalently, ∀ζ ∈ Ω \ Θ, lim_{t→t⋆} |x_t(ζ) − x_{t⋆}(ζ)| = 0.
• Continuity in the rth mean, where r ∈ [1, ∞); for r = 2, it is called mean-square continuity. The stochastic process x_t(ζ)
is rth-mean continuous at a time instant t⋆ ∈ T if
‖x_t − x_{t⋆}‖_{L^r(P)} ≜ (E[|x_t − x_{t⋆}|^r])^{1/r} → 0 as t → t⋆
• Continuity in probability measure (also called p-continuity): The stochastic process x_t(ζ) is continuous in probability (or
p-continuous) at a time instant t⋆ ∈ T if ∀ε > 0, lim_{t→t⋆} P[|x_t − x_{t⋆}| > ε] = 0.
Notice that p-continuity is weaker than almost sure continuity. Compare the following expressions:
(a.s. continuity at time t⋆): ∀ε > 0, P[{ζ ∈ Ω : lim_{t→t⋆} |x_t(ζ) − x_{t⋆}(ζ)| > ε}] = 0
(p-continuity at time t⋆): ∀ε > 0, lim_{t→t⋆} P[{ζ ∈ Ω : |x_t(ζ) − x_{t⋆}(ζ)| > ε}] = 0
Notice also that p-continuity is weaker than rth-mean continuity (in particular, mean-square continuity), which follows from the generalized Markov and Chebyshev inequalities in Chapter #01: P[|x_t − x_{t⋆}| > ε] ≤ E[|x_t − x_{t⋆}|^r]/ε^r.
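To make mean-square continuity concrete, the following sketch (illustrative only; the standard Wiener process is used as an assumed example, and the sample sizes are arbitrary) estimates E[|x_t − x_{t⋆}|^2] for t → t⋆ and shows it vanishing.

import numpy as np

rng = np.random.default_rng(3)
N = 200_000                                # number of independent realizations (arbitrary)

# For a standard Wiener process, W_{t*+h} - W_{t*} ~ N(0, h), so
# E[|W_{t*+h} - W_{t*}|^2] = h -> 0 as h -> 0, i.e., mean-square continuity at t*.
for h in [1.0, 0.1, 0.01, 0.001]:
    increment = np.sqrt(h) * rng.standard_normal(N)    # samples of W_{t*+h} - W_{t*}
    print(h, np.mean(increment ** 2))                  # Monte Carlo estimate of E[|W_{t*+h} - W_{t*}|^2]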
TAYLOR'S THEOREM
Let f be a real-valued function on [a, b] ⊂ R, which satisfies the following conditions:
1) Given n ∈ N, the (n − 1)th derivative f^{(n−1)} is continuous on [a, b] and f^{(n)}(t) exists for all t ∈ (a, b).
2) For α, β ∈ [a, b] and α < β, let P(t) ≜ Σ_{k=0}^{n−1} (f^{(k)}(α)/k!) (t − α)^k.
Then, there exists x ∈ (α, β) such that
f(β) = P(β) + (f^{(n)}(x)/n!) (β − α)^n
Proof 8: Let M ∈ R be defined by f(β) = P(β) + M(β − α)^n and let g(t) ≜ f(t) − P(t) − M(t − α)^n for t ∈ [a, b].
Then, it follows that
g^{(n)}(t) = f^{(n)}(t) − n! M ∀t ∈ (a, b)
Next we show that g^{(n)}(x) = 0 for some x ∈ (α, β).
Since P^{(k)}(α) = f^{(k)}(α) for k = 0, · · · , n − 1, it follows that g(α) = g^{(1)}(α) = · · · = g^{(n−1)}(α) = 0.
By the choice of M, g(β) = 0; hence, by the mean value theorem, g^{(1)}(x_1) = 0 for some x_1 ∈ (α, β). Since
g^{(1)}(α) = 0, one may conclude similarly that g^{(2)}(x_2) = 0 for some x_2 ∈ (α, x_1). After n such iterations, it is concluded that
g^{(n)}(x_n) = 0 for some x_n ∈ (α, x_{n−1}), i.e., x_n ∈ (α, β). Therefore, f^{(n)}(x_n) = n! M, i.e., M = f^{(n)}(x_n)/n!, which yields the claimed expression for f(β) with x = x_n.
Remark 1.1: For n = 1, Taylor's theorem reduces to the mean value theorem. In general, Taylor's theorem states that the
real function f can be approximated by a polynomial of degree (n − 1), and it can be used to obtain the estimation error if
the bounds on |f^{(n)}| are known.
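As an illustration of this error estimate (a sketch only; f(t) = sin t, the expansion point, and the order n below are arbitrary choices), the actual approximation error stays within the bound (β − α)^n / n! obtained from |f^{(n)}| ≤ 1.

import math

def taylor_poly_sin(alpha: float, beta: float, n: int) -> float:
    """Evaluate P(beta) = sum_{k=0}^{n-1} f^{(k)}(alpha)/k! (beta - alpha)^k for f = sin."""
    # Derivatives of sin cycle through sin, cos, -sin, -cos.
    derivs = [math.sin(alpha), math.cos(alpha), -math.sin(alpha), -math.cos(alpha)]
    return sum(derivs[k % 4] / math.factorial(k) * (beta - alpha) ** k for k in range(n))

alpha, beta, n = 0.0, 0.5, 4                      # arbitrary illustration values
error = abs(math.sin(beta) - taylor_poly_sin(alpha, beta, n))
bound = (beta - alpha) ** n / math.factorial(n)   # |f^(n)| <= 1 for sin, so |error| <= (beta - alpha)^n / n!
print(error, bound)                               # the actual error respects the Taylor remainder bound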