
1 Existence and uniqueness of adapted solutions, additive noise
Consider the SDE with additive noise, with given initial condition $x \in \mathbb{R}^d$,
$$X_t = x + \int_0^t b(X_s)\,ds + W_t.$$
We can solve this equation pathwise.
Lemma 1 Assume that b satisfies one of the following two properties:
i) it is bounded and locally Lipschitz continuous;
ii) $b(x) = Ax + B(x, x)$, where $A$ is linear, $B$ is bilinear, and $\langle B(x, y), y\rangle = 0$ for every $x, y$.
Let $\omega \in \Omega$ be such that $t \mapsto W_t(\omega)$ is continuous. Then the deterministic equation
$$X_t(\omega) = x + \int_0^t b(X_s(\omega))\,ds + W_t(\omega) \tag{1}$$
has a unique continuous solution $X_\cdot(\omega)$. It is the limit, uniform in $t$ over compact sets, of $X_t^\delta(\omega)$ as $\delta \to 0$, $\delta > 0$, where $X_t^\delta(\omega)$ is defined explicitly, step by step, on successive intervals of length $\delta$, by the equation
$$X_t^\delta(\omega) = x + \int_0^{(t-\delta)\vee 0} b\big(X_s^\delta(\omega)\big)\,ds + W_t(\omega). \tag{2}$$
Proof. Step 1: case (i). Assume first that $b$ is bounded and locally Lipschitz continuous. Consider equation (2) on some $[0, T]$. Its solution $X_t^\delta(\omega)$ is bounded and continuous. Moreover, for $T \ge t \ge \tau \ge 0$,
$$\big|X_t^\delta(\omega) - X_\tau^\delta(\omega)\big| \le \int_{(\tau-\delta)\vee 0}^{(t-\delta)\vee 0} \big|b\big(X_s^\delta(\omega)\big)\big|\,ds + |W_t(\omega) - W_\tau(\omega)| \le C\,|t - \tau| + |W_t(\omega) - W_\tau(\omega)|,$$
hence the family of functions $\{X_\cdot^\delta(\omega)\}_{\delta > 0}$ is equi-uniformly continuous.
By the Ascoli-Arzelà theorem, there is a sequence $X_\cdot^{\delta_n}(\omega)$ which converges uniformly to some continuous function $X_\cdot(\omega)$. From the equation it follows that this function is a solution to (1). Thus equation (1) has at least one solution. If we prove it is unique, then the full family $\{X_\cdot^\delta(\omega)\}_{\delta > 0}$ must converge, uniformly, to $X_\cdot(\omega)$ as $\delta \to 0$. In such a case the statement is proved (for bounded locally Lipschitz $b$).
To prove uniqueness, let $X_\cdot^{(i)}(\omega)$, $i = 1, 2$, be two solutions. The difference $V_t(\omega) = X_t^{(1)}(\omega) - X_t^{(2)}(\omega)$ satisfies
$$V_t(\omega) = \int_0^t \big[b\big(X_s^{(1)}(\omega)\big) - b\big(X_s^{(2)}(\omega)\big)\big]\,ds.$$
We claim that $V$ is identically zero (uniqueness). If not, let $t_0$ be the infimum of the times where $V_t(\omega)$ is not zero. We have
$$V_t(\omega) = \int_{t_0}^t \big[b\big(X_s^{(1)}(\omega)\big) - b\big(X_s^{(2)}(\omega)\big)\big]\,ds,$$
hence
$$|V_t(\omega)| \le \int_{t_0}^t L\,|V_s(\omega)|\,ds,$$
where $L$ is the Lipschitz constant of $b$ over a ball that contains both solutions. Gronwall's lemma then implies that $V$ vanishes on $[t_0, T]$, and by the definition of $t_0$ it also vanishes on $[0, t_0]$, so $V \equiv 0$, a contradiction. Hence $V$ is identically zero, which completes the proof of Step 1.
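For reference, the two versions of Gronwall's lemma used in this proof are the following standard statements (not spelled out in the notes). Integral form: if $u \ge 0$ is continuous on $[t_0, T]$ and
$$u(t) \le a + \int_{t_0}^t L\,u(s)\,ds \quad \text{for all } t \in [t_0, T]$$
with constants $a, L \ge 0$, then $u(t) \le a\,e^{L(t - t_0)}$; with $a = 0$, as for $|V|$ above, this forces $u \equiv 0$. Differential form: if $u$ is absolutely continuous and $u'(t) \le C_1 u(t) + C_2$ on $[0, T]$, then
$$u(t) \le u(0)\,e^{C_1 t} + \int_0^t e^{C_1 (t - s)}\,C_2\,ds,$$
which is the bound applied to $|Y_t(\omega)|^2$ in Step 2 below.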
Step 2: case (ii). Assume now (ii). The field $b$ is locally Lipschitz but not bounded. Given the initial value $x$ and $T > 0$, let us prove that there exists a constant $R = R(|x|, T, \omega) > 0$ such that any continuous solution $X_\cdot(\omega)$ of (1) on any interval $[0, b] \subset [0, T]$ satisfies $|X_t(\omega)| \le R$ for $t \in [0, b]$.
Set $Y_t(\omega) = X_t(\omega) - W_t(\omega)$. We have
$$Y_t(\omega) = x + \int_0^t b(X_s(\omega))\,ds,$$
hence
$$\frac{dY_t(\omega)}{dt} = b(X_t(\omega)) = b(Y_t(\omega) + W_t(\omega))$$
and therefore, omitting the arguments $t$ and $\omega$, using $\langle B(Y + W, Y), Y\rangle = 0$ and absorbing the factors of $2$ into the constants $C_A$, $C_B$,
$$\begin{aligned}
\frac{d|Y|^2}{dt} &= 2\,\langle b(Y + W), Y\rangle \\
&= 2\,\langle AY, Y\rangle + 2\,\langle AW, Y\rangle + 2\,\langle B(Y + W, Y), Y\rangle + 2\,\langle B(Y + W, W), Y\rangle \\
&\le C_A\big(|Y|^2 + |W|\,|Y|\big) + C_B\,|Y + W|\,|W|\,|Y| \\
&\le C_A\big(2|Y|^2 + |W|^2\big) + C_B\,|W|\,|Y|^2 + C_B\,|W|^2\,|Y| \\
&\le (2C_A + C_B\,|W| + C_B)\,|Y|^2 + C_A\,|W|^2 + C_B\,|W|^4 \\
&\le C_1(T, \omega)\,|Y|^2 + C_2(T, \omega),
\end{aligned}$$
which implies, by Gronwall's lemma, that
$$|Y_t(\omega)|^2 \le |Y_0(\omega)|^2 e^{C_1(T,\omega)t} + \int_0^t e^{C_1(T,\omega)(t-s)}\,C_2(T, \omega)\,ds \le |x|^2 e^{C_1(T,\omega)T} + T\,e^{C_1(T,\omega)T} C_2(T, \omega).$$
Hence
$$|X_t(\omega)|^2 \le 2 \sup_{t \in [0,T]} |W_t(\omega)|^2 + 2\Big(|x|^2 e^{C_1(T,\omega)T} + T\,e^{C_1(T,\omega)T} C_2(T, \omega)\Big).$$
Call this constant $R^2(|x|, T, \omega)$.
Having proved this “a priori bound”, we may cut off $b$ outside the ball $B_{2R}$ of radius $2R(|x|, T, \omega)$, so that the modified $b$ is still locally Lipschitz (as $b$ is) and bounded. Then the modified equation has a unique solution, obtained as the limit of the solutions of (2) (with the modified $b$). Let $[0, b] \subset [0, T]$ be any interval such that the solution $X_t(\omega)$ of the modified problem is still in $B_{2R}$ (intervals with this property exist by continuity of $X_t(\omega)$ and the fact that $|x| \le R(|x|, T, \omega)$). Then, on $[0, b]$, $X_t(\omega)$ is a solution of the original equation and thus, by the a priori bound, it is contained in $B_R$, not only in $B_{2R}$. This implies that over the full interval $[0, T]$ it cannot leave $B_R$, and thus the solution of the modified equation is in fact a solution of the original problem. Since $X_t^\delta(\omega)$, the solution of (2) with the modified $b$, converges uniformly on $[0, T]$ to $X_t(\omega)$, $X_t^\delta(\omega)$ also lives in $B_{2R}$ on $[0, T]$, at least for small $\delta$. Hence $X_t^\delta(\omega)$ is also a solution of (2) with the original $b$, for small $\delta$. The proof is complete.
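The construction of $X^\delta$ in the proof is completely explicit, and it may help to see it numerically. The following Python sketch (not part of the notes; the drift used in the example lines is an arbitrary illustration) evaluates the approximants of equation (2) on a time grid of step $h$ dividing $\delta$: since the integral in (2) only involves $X_s^\delta$ for $s \le (t - \delta) \vee 0$, every quantity on the right-hand side is already known when $X_t^\delta$ is computed.

```python
import numpy as np

def delayed_scheme(b, x0, W, h, delta):
    """Sketch of scheme (2): X^d_t = x0 + int_0^{(t-delta) v 0} b(X^d_s) ds + W_t.

    W is a sampled noise path on the grid t_k = k*h (with W[0] = 0) and delta is
    assumed to be an integer multiple of h.  The integral only uses values of X^d
    already computed at earlier grid points, so the scheme is explicit.
    """
    d = max(int(round(delta / h)), 1)   # number of grid steps in one delay window
    n = len(W)
    X = np.empty(n)
    integral = 0.0                      # running value of int_0^{(t_k - delta) v 0} b(X^d_s) ds
    for k in range(n):
        X[k] = x0 + integral + W[k]
        j = k - d                       # index of the next left Riemann point entering the integral
        if j >= 0:
            integral += h * b(X[j])
    return X

# Example use with the hypothetical drift b(x) = -x (an arbitrary choice, not from the notes).
rng = np.random.default_rng(0)
h, T, delta = 1e-3, 1.0, 1e-2
n = int(T / h) + 1
W = np.concatenate([[0.0], np.cumsum(np.sqrt(h) * rng.standard_normal(n - 1))])
X = delayed_scheme(lambda x: -x, x0=1.0, W=W, h=h, delta=delta)
```

Here the time grid adds a Riemann-sum discretization on top of the delay $\delta$; the lemma itself concerns the exact approximants $X^\delta$, which converge uniformly on compact sets as $\delta \to 0$.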
Theorem 2 Under any one of the two assumptions of the previous lemma, the stochastic equation
$$X_t = x + \int_0^t b(X_s)\,ds + W_t$$
has a unique continuous adapted solution.
Proof. Existence and uniqueness of a function $X_t(\omega)$, defined for $P$-a.e. $\omega \in \Omega$ (those such that $t \mapsto W_t(\omega)$ is continuous) and continuous in $t$, such that the equation holds true, follow from the lemma. We have only to prove that it is an adapted process.
Equation (2) defines $X_t^\delta(\omega)$ explicitly, step by step, on successive intervals of length $\delta$. Because of this, it is easy to see that $X_t^\delta$ is a (continuous) adapted process. For $P$-a.e. $\omega \in \Omega$ (those such that $t \mapsto W_t(\omega)$ is continuous), $X_\cdot^\delta(\omega)$ converges uniformly to $X_\cdot(\omega)$ on any time interval $[0, T]$. This easily implies that $X_t$ is also a (continuous) adapted process. The proof is complete.
2 Markov property
2.1 Discrete case
It is easy to formulate and understand the Markov property for a discrete-time, discrete-space stochastic system. Consider a stochastic process $(X_n)_{n \in \mathbb{N}}$, defined on some probability space $(\Omega, \mathcal{F}, P)$, with values in a finite set $S$ of “states”: $X_n : \Omega \to S$. We may think of $S$ as a finite number of “nodes” of a graph. The stochastic dynamics behind the process $(X_n)_{n \in \mathbb{N}}$ jumps from one node to another at random; $X_n$ is the position (the node occupied) at time $n$. The nodes of the graph are connected by arrows that declare which transitions from one node to another are possible.
With this graph-like interpretation it is natural to attach a probability to each arrow: the probability that the stochastic dynamics makes that jump. More precisely, if we denote by $i$ and $j$ the nodes connected by the arrow (the arrow goes from $i$ to $j$), the probability $p_{ij}$ on the arrow is the probability that the system, when in state $i$, jumps to $j$. We must have
$$p_{ij} \in [0, 1] \ \text{ for every } i, j, \qquad \sum_j p_{ij} = 1 \ \text{ for every } i.$$
This scheme (or mathematical model) looks very natural, but implicitly we have imposed two basic properties: i) the system has no memory: when it is in state $i$, it decides where to go (in a probabilistic sense) independently of its past history; ii) the dynamics is statistically invariant under time translations (it is called homogeneous, or autonomous): the probability of performing a certain jump does not depend on the time $n$; only being in state $i$ matters. This is the time-homogeneous Markov property. The Markov property in itself, without time-homogeneity, reads
$$P(X_{n+1} = j \mid X_n = i) = P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0)$$
for all values of the parameters. If we add time-homogeneity, we require that $P(X_{n+1} = j \mid X_n = i)$ is independent of $n$, thus
$$P(X_1 = j \mid X_0 = i) = P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0).$$
For the definition and interpretation of conditional probability we refer to any book on basic probability.
A notation: we set $p_{ij} := P(X_1 = j \mid X_0 = i)$,
$$P(A|i) := \sum_{j \in A} p_{ij} = P(X_1 \in A \mid X_0 = i),$$
$$P_k(A|i) := P(X_k \in A \mid X_0 = i), \qquad k \ge 1.$$
Exercise 3 For $k \ge 2$,
$$P_k(A|i) = \sum_{j_2, \dots, j_k} p_{i j_2}\, p_{j_2 j_3} \cdots p_{j_{k-1} j_k}\, P(A|j_k).$$
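To make the discrete notation concrete, here is a small Python sketch (not part of the notes; the 3-state transition matrix is an arbitrary illustration) that stores the transition probabilities $p_{ij}$ as a row-stochastic matrix, simulates the chain, and computes $P_k(A|i)$ via matrix powers, which agrees with the sum in Exercise 3.

```python
import numpy as np

# Hypothetical 3-state chain; each row sums to 1, with p_ij = P[i, j].
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.2, 0.8]])

def simulate(P, i0, n_steps, rng):
    """Sample a trajectory X_0, ..., X_n: from state i, jump to j with probability p_ij."""
    path = [i0]
    for _ in range(n_steps):
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return path

def P_k(P, k, A, i):
    """P_k(A|i) = P(X_k in A | X_0 = i), computed as the sum over j in A of (P^k)_{ij}."""
    return np.linalg.matrix_power(P, k)[i, list(A)].sum()

rng = np.random.default_rng(0)
print(simulate(P, i0=0, n_steps=10, rng=rng))   # one sample path
print(P_k(P, k=5, A={1, 2}, i=0))               # five-step probability of landing in {1, 2}
```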
2.2 Continuous (or general) case
If $(X_t)_{t \ge 0}$ is a stochastic process on a probability space $(\Omega, \mathcal{F}, P)$ with values in $\mathbb{R}^d$, like the solution of a stochastic differential equation, how do we define the Markov property, so that we may ask ourselves whether the solutions of SDEs are Markov? Events like $X_t = x$ have zero probability (usually), hence the previous definition cannot work so easily. If the random variables $X_t$ have densities, we may try to use concepts like conditional densities. However, this is not fully general, so we look for a general definition.
Let us transform the definition of the discrete case until it takes a general form. The reason why the previous definition was not good is the presence of zero-probability events like $X_n = i$. If we succeed in using only events of the form $X_n \in A$, we can generalize.
First, one can check that it is equivalent to (we omit the details of this step since the result is intuitively clear; it could be taken as the very definition of the homogeneous Markov property)
$$P_k(A|x) = P\big(X_{n+k} \in A \mid X_n = x,\ (X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0\big),$$
which must hold true for every $n, k \ge 0$, every $x \in S$, and every $A, B_0, \dots, B_{n-1} \subset S$. But the condition at the “present time” $n$ is still of the form $X_n = x$.
Let us rewrite it as
$$P_k(A|x)\, P\big(X_n = x,\ (X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0\big) = P\big(X_{n+k} \in A,\ X_n = x,\ (X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0\big)$$
or
$$E\big[P_k(A|x)\, 1_{X_n = x}\, 1_{(X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0}\big] = P\big(X_{n+k} \in A,\ X_n = x,\ (X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0\big).$$
Given $B_n \subset S$, we sum over $x \in B_n$ and get
$$E\Big[1_{(X_{n-1}, \dots, X_0) \in B_{n-1} \times \cdots \times B_0} \sum_{x \in B_n} P_k(A|x)\, 1_{X_n = x}\Big] = P\big(X_{n+k} \in A,\ (X_n, \dots, X_0) \in B_n \times \cdots \times B_0\big) = E\big[1_{X_{n+k} \in A}\, 1_{(X_n, \dots, X_0) \in B_n \times \cdots \times B_0}\big].$$
We have
$$\sum_{x \in B_n} P_k(A|x)\, 1_{X_n = x} = P_k(A|X_n)\, 1_{X_n \in B_n}$$
because both random variables are equal to zero if $X_n(\omega) \notin B_n$ and equal to $P_k(A|x)$ with $x = X_n(\omega)$ when $X_n(\omega) \in B_n$. Thus we reach the identity
$$E\big[P_k(A|X_n)\, 1_{(X_n, \dots, X_0) \in B_n \times \cdots \times B_0}\big] = E\big[1_{X_{n+k} \in A}\, 1_{(X_n, \dots, X_0) \in B_n \times \cdots \times B_0}\big].$$
This identity is expressed entirely in terms of nontrivial events. It is very
easy to see that we may go backwards and prove that it is equivalent to the
original condition.
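The identity just derived can also be checked numerically on a small example. The sketch below (again not part of the notes; the chain, the sets and the horizon are arbitrary choices) estimates both sides of $E[P_k(A|X_n)\,1_{(X_n,\dots,X_0)\in B_n\times\cdots\times B_0}] = E[1_{X_{n+k}\in A}\,1_{(X_n,\dots,X_0)\in B_n\times\cdots\times B_0}]$ by simulation; the two sample means should agree up to Monte Carlo error.

```python
import numpy as np

# Hypothetical 3-state chain (rows sum to 1), started at X_0 = 0.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.2, 0.8]])
rng = np.random.default_rng(1)
n, k = 2, 3                         # "present" time n and look-ahead k
A = {2}                             # target set for X_{n+k}
B = [{0, 1}, {0, 1, 2}, {1, 2}]     # constraint sets B_0, ..., B_n
Pk_A = np.linalg.matrix_power(P, k)[:, list(A)].sum(axis=1)   # x -> P_k(A|x)

lhs, rhs = [], []
for _ in range(100_000):
    path = [0]                                    # X_0 = 0
    for _ in range(n + k):
        path.append(rng.choice(3, p=P[path[-1]]))
    constraint = all(path[m] in B[m] for m in range(n + 1))
    lhs.append(Pk_A[path[n]] * constraint)        # P_k(A|X_n) times the indicator of the past
    rhs.append((path[n + k] in A) * constraint)   # 1_{X_{n+k} in A} times the same indicator

print(np.mean(lhs), np.mean(rhs))                 # the two estimates should be close
```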
There is still, however, a nontrivial object in the identity above: $P_k(A|X_n)$. In the discrete case it is defined as the composition of the function $x \mapsto P_k(A|x)$ with the r.v. $\omega \mapsto X_n(\omega)$. But what is the meaning of $P_k(A|x)$ in the continuous case?
Let us start not simply from a single process $(X_t)_{t \ge 0}$ but from a family of processes $(X_t^x)_{t \ge 0}$ indexed by $x \in \mathbb{R}^d$. This will be the case when dealing with solutions of SDEs with initial condition $x$. We define
$$P_t(A|x) := P(X_t^x \in A)$$
for every $t \ge 0$, $x \in \mathbb{R}^d$, and every Borel set $A$ of $\mathbb{R}^d$. We say that the family $(X_t^x)_{t \ge 0, x \in \mathbb{R}^d}$ is a time-homogeneous Markov family (or process) if
$$E\Big[P_s\big(A \mid X_{t_n}^x\big)\, 1_{(X_{t_n}^x, \dots, X_{t_0}^x) \in B_n \times \cdots \times B_0}\Big] = E\Big[1_{X_{t_n + s}^x \in A}\, 1_{(X_{t_n}^x, \dots, X_{t_0}^x) \in B_n \times \cdots \times B_0}\Big]$$
for all $x \in \mathbb{R}^d$, $s \ge 0$, $t_n \ge \dots \ge t_0 \ge 0$ and Borel sets $A, B_0, \dots, B_n$. By an easy argument of linearity and density, this is equivalent to asking that
$$E\big[P_s(A \mid X_t^x)\, Z\big] = E\big[1_{X_{t+s}^x \in A}\, Z\big]$$
for every $x \in \mathbb{R}^d$, $s, t \ge 0$, every Borel set $A$, and every r.v. $Z$ measurable with respect to the $\sigma$-algebra $\mathcal{F}_t$ generated by $X_r^x$ for $r \in [0, t]$.
For every $\varphi \in C_b(\mathbb{R}^d)$, the space of bounded continuous functions from $\mathbb{R}^d$ to $\mathbb{R}$, define the function $P_t\varphi$ from $\mathbb{R}^d$ to $\mathbb{R}$ as
$$(P_t\varphi)(x) := E\big[\varphi(X_t^x)\big],$$
so that this notion extends $P_t(A|x)$: $(P_t 1_A)(x) = P_t(A|x)$. Again an easy argument of linearity and density implies that the previous condition is equivalent to
$$E\big[(P_s\varphi)(X_t^x)\, Z\big] = E\big[\varphi(X_{t+s}^x)\, Z\big]$$
for every $x \in \mathbb{R}^d$, $s, t \ge 0$, $\varphi \in C_b(\mathbb{R}^d)$, and every $\mathcal{F}_t$-measurable r.v. $Z$. This is our final condition, usually written in the form
$$(P_s\varphi)(X_t^x) = E\big[\varphi(X_{t+s}^x) \mid \mathcal{F}_t\big]$$
for those who know the concept of conditional expectation.
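As a concrete reading of the definition $(P_t\varphi)(x) = E[\varphi(X_t^x)]$, the following Python sketch (not part of the notes; the drift and the test function are arbitrary illustrations) estimates $(P_t\varphi)(x)$ by Monte Carlo, simulating the additive-noise SDE with a simple Euler scheme in dimension one.

```python
import numpy as np

def estimate_Pt_phi(b, phi, x, t, n_paths=10_000, n_steps=200, rng=None):
    """Monte Carlo estimate of (P_t phi)(x) = E[phi(X_t^x)] for dX = b(X) dt + dW, X_0 = x.

    Illustrative sketch using an Euler discretization with n_steps time steps.
    """
    rng = rng or np.random.default_rng()
    h = t / n_steps
    X = np.full(n_paths, float(x))                # n_paths independent copies of X_0 = x
    for _ in range(n_steps):
        X = X + h * b(X) + np.sqrt(h) * rng.standard_normal(n_paths)
    return phi(X).mean()

# Hypothetical one-dimensional example: b(x) = -x and phi(x) = cos(x).
print(estimate_Pt_phi(b=lambda x: -x, phi=np.cos, x=1.0, t=0.5,
                      rng=np.random.default_rng(0)))
```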
2.3 Solutions to SDEs are Markov
We simply state the following fact; the proof is given in the CIME notes.
Theorem 4 For every $x \in \mathbb{R}^d$, denote by $X_t^x$ the unique continuous adapted solution of the equation
$$X_t = x + \int_0^t b(X_s)\,ds + W_t,$$
where $b$ satisfies one of the two assumptions of Lemma 1. Then $(X_t^x)_{t \ge 0, x \in \mathbb{R}^d}$ is a time-homogeneous Markov family.