The Smoluchowski-Kramers approximation for state dependent friction

The Smoluchowski-Kramers approximation for
stochastic differential equations with
arbitrary state dependent friction
by
Scott Hottovy
A Dissertation Submitted to the Faculty of the
Department of Mathematics
In Partial Fulfillment of the Requirements
For the Degree of
Doctor of Philosophy
In the Graduate College
The University of Arizona
2013
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE
As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Scott Hottovy entitled
The Smoluchowski-Kramers approximation for stochastic differential equations with
arbitrary state dependent friction
and recommend that it be accepted as fulfilling the dissertation requirement for the
Degree of Doctor of Philosophy.
Date: April 10, 2013
Jan Wehr
Date: April 10, 2013
Thomas Kennedy
Date: April 10, 2013
Sunder Sethuraman
Date: April 10, 2013
Joseph Watkins
Final approval and acceptance of this dissertation is contingent upon
the candidate’s submission of the final copies of the dissertation to
the Graduate College.
I hereby certify that I have read this dissertation prepared under
my direction and recommend that it be accepted as fulfilling the
dissertation requirement.
Date: April 10, 2013
Jan Wehr
Statement by Author
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and
is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is
made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the
head of the major department or the Dean of the Graduate College
when in his or her judgment the proposed use of the material is in the
interests of scholarship. In all other instances, however, permission
must be obtained from the author.
Signed:
Scott Hottovy
Dedication
For my Mom, Dad, and Anna.
Acknowledgments
I would like to thank my advisor, Professor Jan Wehr, who first enticed me to attend
the University of Arizona with an enlightening talk on quantum communication. After
a semester of work on quantum communication, Jan introduced me to my dissertation
problem. I also thank him for introducing me to our collaborator, Giovanni Volpe,
with whom I had many interesting discussions about problems at the boundary of physics
and mathematics, especially the Brownian particle in a diffusion gradient [Sec. 1.1.1].
Thank you to my committee members, Professor Tom Kennedy, Professor Sunder
Sethuraman, and Professor Joe Watkins. All of the members were integral in my
success at the University of Arizona, from classwork to oral comprehensive exam to
dissertation writing. They deserve thanks for always being willing to listen to an argument
on how to prove weak convergence or compactness, and for agreeing to be on my
dissertation committee.
I am grateful that Professor Lenya Ryzhik agreed to be the external reviewer
for this dissertation. I also would like to thank Lenya for collaborating with me on
an NSF Postdoctoral fellowship proposal. While the proposal was not accepted, his
advice was extremely helpful.
I would like to acknowledge Professor Thomas Kurtz, with whom I had fruitful conversations during his visit to Tucson. He also pointed us to the work of [KP91],
which is an instrumental part of this dissertation.
I am thankful for all my teachers from primary to graduate school who encouraged
me to study mathematics. And to my undergraduate advisor Professor George Avalos,
who first made me realize that I could become a mathematician, and for his patience
in working with me at the University of Nebraska.
Table of Contents
List of Figures
Abstract
Chapter 1. Introduction and History
  1.1. Introduction
    1.1.1. Motivating example
    1.1.2. Statement of the main result
  1.2. History
  1.3. Outline of the dissertation
Chapter 2. Stochastic Differential Equations and the Smoluchowski-Kramers approximation
  2.1. Introduction
  2.2. Stochastic Differential Equations
    2.2.1. Construction of the Stochastic Integral
    2.2.2. Existence and uniqueness of solutions to SDE
    2.2.3. The Backward and Forward Kolmogorov Equations
    2.2.4. An existence theorem using Lyapunov functions
  2.3. Mathematical Modeling using Stochastic Differential Equations
    2.3.1. Chemical reactions
    2.3.2. Stability of orbiting satellites
    2.3.3. Nucleation
    2.3.4. Climate models
  2.4. Small parameter limits
    2.4.1. Modes of convergence
    2.4.2. Approximation of the Wiener process
  2.5. Previous results for the Smoluchowski-Kramers approximation
    2.5.1. Homogenization calculation
    2.5.2. Smoluchowski-Kramers approximation with colored noise
    2.5.3. Other small mass results
Chapter 3. Proof of the Main Theorem
  3.1. Definitions and Notation
  3.2. Main theorem and assumptions
  3.3. Outline of the proof
  3.4. Proof of the main theorem
    3.4.1. Convergence of Stochastic Integrals
    3.4.2. Convergence of the Langevin equation
    3.4.3. Integration by parts to satisfy assumption of the limit theorem
    3.4.4. Lyapunov equation
    3.4.5. Reformulate the SDE
    3.4.6. Check Condition 1
Chapter 4. Applications of Main Theorem
  4.1. One Dimension
    4.1.1. Smoluchowski-Kramers limit as different conventions of the stochastic integral
    4.1.2. An equation for α(x)
    4.1.3. Brownian particle in a diffusion gradient
  4.2. Ornstein-Uhlenbeck Colored Noise
    4.2.1. Colored noise with constant friction
  4.3. Thermophoresis using OU colored noise
    4.3.1. Physical interpretation: Drift and probability density
    4.3.2. Analysis of the limiting equation and discussion
  4.4. Higher dimensions
    4.4.1. Three dimensional Brownian particle with non-conservative force
Chapter 5. Summary
  5.1. Conclusion
  5.2. Future work
    5.2.1. Active Brownian Motion
References
List of Figures
Figure 1.1. Experimental setup for a synthetic bead [VHB+ 10]
Figure 1.2. Experimental results from [VHB+ 10]
Figure 1.3. Potential well
Figure 2.1. One dimensional Brownian motion
Figure 2.2. Double potential well
Figure 2.3. Trajectory of a Brownian particle in a bistable potential well
Figure 2.4. A three dimensional potential function
Figure 4.1. One-dimensional Convergence for α = 1
Figure 4.2. One dimensional convergence of trajectories for α = 1
Figure 4.3. α vs. λ
Figure 4.4. One-dimensional Convergence for α = 2
Figure 4.5. One-dimensional Convergence for γ = cσ
Figure 4.6. Plot of D(x)
Figure 4.7. Thermophoresis of DNA [DB06a]
Figure 4.8. Thermophoresis in two cases
Figure 4.9. Drift and stationary distribution for µ(T) constant
Figure 4.10. Drift and stationary distribution for µ(T) linear
Figure 4.11. Drift and stationary distribution for µ(T) quadratic
Figure 5.1. An example of Active Brownian motion [SWS+ 10]
Abstract
In this dissertation a class of stochastic differential equations is considered in the limit
as mass tends to zero, called the Smoluchowski-Kramers limit. The Smoluchowski-Kramers approximation is useful in simplifying the dynamics of a system. For example, the problems of calculating rates of chemical reactions, describing dynamics of
complex systems with noise, and measuring ultra small forces, are simplified using the
Smoluchowski-Kramers approximation. In this study, we prove strong convergence in
the small mass limit for a multi-dimensional system with arbitrary state-dependent
friction and noise coefficients. The main result is proved using a theory of convergence
of stochastic integrals developed by Kurtz and Protter. The framework of the main
theorem is sufficiently arbitrary to include systems of stochastic differential equations
driven by both white and Ornstein-Uhlenbeck colored noises.
Chapter 1
Introduction and History
1.1 Introduction
Most physical, chemical, biological and economic phenomena present an intrinsic
degree of randomness. These are typically modeled by stochastic differential equations (SDEs) [Øks03]. SDEs were introduced at the beginning of the 20th century to
describe Brownian motion by adding a random driving function to an ordinary differential equation (ODE). Since then, SDEs have come into widespread use in many
fields, e.g., physics, biology, and economics. The class of equations studied in this
dissertation is the Langevin equation, the stochastic Newton equation,
\[ m\,\frac{dv_t}{dt} = \sum_i F_i, \tag{1.1.1} \]
where $v_t$ is the velocity of a small particle, $m$ is its mass, and $F_i$ are the forces of the
system, at least one of which is random. In practice, the dynamics of the position $x_t$
are of interest. Defining $v_t = dx_t/dt$, the system is
\[ \frac{dx_t}{dt} = v_t, \tag{1.1.2} \]
\[ m\,\frac{dv_t}{dt} = \sum_i F_i(x_t). \tag{1.1.3} \]
In many applications, e.g. molecular dynamics, the dimension of the state space, $x_t \in \mathbb{R}^d$,
is large ($d \gg 1$), and the above system is solved for twice as many variables, $(x_t, v_t)$.
Valid approximations that reduce the state space are crucial for both theoretical and
computational applications.
In this dissertation we prove a small mass limit, called the Smoluchowski-Kramers
approximation, for a wide class of systems. In short, this approximation for the system
[Eq. (1.1.2)] reduces the state space from (xt , vt ) to xt cutting the dimension in half.
We begin the discussion with a motivating example of an experimental system of
a Brownian particle, in which the Smoluchowski-Kramers approximation provides a
way to measure the forces of the system by measuring the position of the particle at
short times.
We consider the 2d-dimensional stochastic differential equation (SDE):
\[
\begin{cases}
dx_t^m = v_t^m\,dt, & x_0^m = x,\\[4pt]
dv_t^m = \left[\dfrac{F(x_t^m)}{m} - \dfrac{\gamma(x_t^m)}{m}\,v_t^m\right]dt + \dfrac{\sigma(x_t^m)}{m}\,dW_t, & v_0^m = v,
\end{cases}
\tag{1.1.4}
\]
with $x_t^m, v_t^m \in \mathbb{R}^d \times [0,T]$, $F : \mathbb{R}^d \to \mathbb{R}^d$, and $W$ a $k$-dimensional Wiener process
on the probability space $(\Omega, \mathcal{F}, P)$. We assume that $\gamma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$ is an invertible,
continuously differentiable matrix function with positive eigenvalues,
and that $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times k}$ is a continuously differentiable matrix function. The above
SDE provides a framework to model many physical systems, from colloidal particles
in a fluid to a camera tracking an object [Pap10].
1.1.1 Motivating example
A very important example is a model of the experiment [VHB+ 10] of a synthetic bead,
with diameter on the order of 1µm, in water close to the bottom of a container. The
experimental setup is pictured in figure 1.1. The particle is much farther from the side walls
in the transverse directions than from the planar bottom wall, so equation (1.1.4) is
used in one dimension, with $x \in (0, a) \subset \mathbb{R}$ denoting the distance from the wall. A mathematical
explanation of the experimental results in [VHB+ 10] was the original motivation for
this dissertation.
In the experiment, the particle experiences a potential force due to an electrostatic
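potential and $x_t$ is measured in one dimension. The random forcing term is approximated by white noise, the diffusion coefficient $D(x)$ is given by a hydrodynamic diffusion gradient with friction coefficient $\gamma(x) = k_B T / D(x)$, and the noise coefficient follows from
the fluctuation-dissipation relation $\sigma(x) = \sqrt{2 k_B T \gamma(x)}$, where $k_B$ is the Boltzmann
constant and $T$ is the temperature of the fluid [TKS92]. The SDE model for the
Brownian particle is
\[
\begin{cases}
dx_t^m = v_t^m\,dt, & x_0^m = x,\\[4pt]
dv_t^m = \left[\dfrac{F(x_t^m)}{m} - \dfrac{k_B T}{m\,D(x_t^m)}\,v_t^m\right]dt + \dfrac{\sqrt{2}\,k_B T}{m\sqrt{D(x_t^m)}}\,dW_t, & v_0^m = v.
\end{cases}
\tag{1.1.5}
\]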
Note that the above example is special in that its coefficients satisfy the fluctuation-dissipation relation [TKS92]
\[ \gamma(x) \propto \sigma(x)^2. \tag{1.1.6} \]
In [VHB+ 10], to measure the external force $F$, the drift and equilibrium
distribution measurement methods are used and compared to each other. For the
equilibrium distribution measurement method the system must be at thermodynamic
equilibrium. Measurements are made infrequently to sample the particle’s steady
state distribution. The joint steady state distribution for the position and velocity
$\rho(x, v)$ is given by the Boltzmann and Gibbs distributions,
\[ \rho(x, v) = C \exp\left(-\frac{m v^2}{2 k_B T}\right) \exp\left(-\frac{1}{k_B T}\,U(x)\right), \tag{1.1.7} \]
where $C$ is a normalizing constant and $U(x)$ is the potential function, i.e. $U'(x) =
-F(x)$. However, if the system is not in thermodynamic equilibrium then the drift
measurement method must be used. Because the Brownian particle's motion is irregular, a measurement of the particle's velocity is difficult to obtain. Thus the particle's position is measured instead, and a relationship between the force
$F$ and the position must be derived. This is done with the Smoluchowski-Kramers
approximation, the limit as $m \to 0$.
The limiting equation, as m → 0, for equation (1.1.5) can be found in many ways,
as will be discussed in Section 2.5. For the experiment in [VHB+ 10], the external
force can be measured by first modeling the particle with small mass by
\[
dx_t = \Bigg[\underbrace{\frac{F(x_t)\,D(x_t)}{k_B T}}_{F/\gamma} + \underbrace{D'(x_t)}_{\text{spurious drift}}\Bigg]dt + \underbrace{\sqrt{2 D(x_t)}}_{\sigma/\gamma}\,dW_t, \qquad x_0 = x.
\tag{1.1.8}
\]
Figure 1.1. The experimental setup as described in [VHB+ 10], measuring the position of the synthetic bead performing Brownian motion using total internal reflection
microscopy. The forces acting on the particle are gravity and electrostatic interactions. I(x) is the intensity of the light, which is measured by the photomultiplier
(PMT). The measurements can then be converted into a trajectory.
This is the Smoluchowski-Kramers approximation for SDE (1.1.5). The importance
of the above equation is that when the friction and noise coefficients are constant,
$\gamma(x) = \gamma$ and $\sigma(x) = \sigma$, the spurious drift term is zero. To measure the force,
the position measurements of the particle at times $t_i$ over many experiments are used
to approximate the average instantaneous velocity. Correcting for the spurious drift gives $F$:
\[
F(x) = \frac{k_B T}{D(x)}\,E\left[\frac{x_{t_j} - x_{t_{j-1}}}{t_j - t_{j-1}}\right] - \underbrace{\frac{k_B T}{D(x)}\,D'(x)}_{\text{correction}}.
\tag{1.1.9}
\]
Without the correction term, the results of the drift measurement method are inaccurate and
can even give the wrong sign for the sum of the external forces (see figure 1.2).
Figure 1.2. The experimental results as described in [VHB+ 10]. The force is measured in femtonewtons ($10^{-15}$ N) and the distance from the bottom of the container
$x$ is in nanometers ($10^{-9}$ m). The blue circles are the drift measurement method
without the correction term in equation (1.1.9). The blue squares are the drift measurement method with the correction term in equation (1.1.9). The yellow squares are an
independent measurement using the equilibrium distribution measurement method.
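The role of the correction term in equation (1.1.9) can be illustrated with a minimal Python sketch. The diffusion profile D(x) = x/(1 + x), the constant force F = -0.5, the choice k_B T = 1, and the function names below are hypothetical choices made only for illustration; they are not taken from [VHB+ 10].

```python
def smoluchowski_drift(F, D, dD, kBT):
    """Drift coefficient of the limiting equation (1.1.8): F*D/(kB*T) + D'."""
    return lambda x: F(x) * D(x) / kBT + dD(x)


def force_from_drift(drift, D, dD, kBT, corrected=True):
    """Estimate the force from a measured drift, following equation (1.1.9).

    With corrected=False the spurious-drift correction is left out.
    """
    def estimate(x):
        value = kBT / D(x) * drift(x)
        if corrected:
            value -= kBT / D(x) * dD(x)  # correction term of (1.1.9)
        return value
    return estimate


# Hypothetical coefficients, chosen only for illustration.
kBT = 1.0
F = lambda x: -0.5                    # constant force toward the wall
D = lambda x: x / (1.0 + x)           # diffusion increases away from the wall
dD = lambda x: 1.0 / (1.0 + x) ** 2   # derivative D'(x)

drift = smoluchowski_drift(F, D, dD, kBT)
F_corrected = force_from_drift(drift, D, dD, kBT, corrected=True)
F_uncorrected = force_from_drift(drift, D, dD, kBT, corrected=False)
```

In this toy setting the corrected estimate recovers the prescribed force at x = 0.5, while the uncorrected estimate has the opposite sign there, which is the kind of discrepancy the text attributes to omitting the correction.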
1.1.2 Statement of the main result
In this dissertation, for $x_t^m, v_t^m \in \mathbb{R}^d \times [0,T]$, the limiting equation, as $m \to 0$, is
derived and proved. Under the assumptions stated in Section 3.2 and defining $J$ as
the matrix that solves the Lyapunov equation,
\[ -J\gamma^* - \gamma J = -\sigma\sigma^*, \tag{1.1.10} \]
the $x$-component of the solution of equation (1.1.4) converges in $L^2(\Omega, \mathcal{F}, P)$ to the
solution of the equation (with the original initial condition $x_0 = x$),
\[ dx_t = \left[\gamma^{-1}(x_t)F(x_t) + S(x_t)\right]dt + \gamma^{-1}(x_t)\sigma(x_t)\,dW_t, \tag{1.1.11} \]
where the $i$th component of $S$ is defined as
\[ S_i(x) = \sum_{\alpha,n,j} \frac{\partial}{\partial x_\alpha}\left(\gamma^{-1}_{ij}(x)\right)\gamma^{-1}_{\alpha n}(x)\,G(x)_{j,n}, \tag{1.1.12} \]
where $G = J\gamma^*$. The convergence is with respect to $L^2$ of the probability space
$(\Omega, \mathcal{F}, P)$ that arises from the Wiener process, and also the topology on $C_U([0,T])$,
that is, the space of continuous functions from $[0,T]$ to $U \subset \mathbb{R}^d$, an open bounded
subset, with the uniform metric. It is crucial that the limiting SDE be interpreted
as an Itô equation. Note that for $m > 0$ the process $x_t^m$ has bounded variation
and thus all definitions of a stochastic integral lead to the same interpretation of the
equation (1.1.4) (see Lemma 1, Section 2.2.1). The term $S(x)\,dt$ is a noise-induced
drift, an additional drift term appearing in the limiting equation. It was the experimental observation of such a term in the case of a Brownian particle in a diffusion
gradient that motivated the present study [VHB+ 10].
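As a consistency check, the noise-induced drift can be worked out in the scalar case. The following is a sketch under the assumption $d = k = 1$ with scalar $\gamma(x) > 0$ and $\sigma(x)$; the last line additionally assumes the coefficients of the motivating example, $\gamma = k_B T / D$ and $\sigma^2 = 2 k_B T \gamma$.

```latex
\begin{align*}
-2\gamma J = -\sigma^2
  \;&\Longrightarrow\; J = \frac{\sigma^2}{2\gamma},
  \qquad G = J\gamma = \frac{\sigma^2}{2},\\
S(x) &= \frac{d}{dx}\!\left(\frac{1}{\gamma(x)}\right)\frac{1}{\gamma(x)}\,\frac{\sigma(x)^2}{2}
      = -\frac{\gamma'(x)\,\sigma(x)^2}{2\,\gamma(x)^3},\\
\gamma = \frac{k_B T}{D},\ \sigma^2 = 2 k_B T \gamma
  \;&\Longrightarrow\;
  S(x) = -\frac{k_B T\,\gamma'(x)}{\gamma(x)^2}
       = k_B T\,\frac{d}{dx}\!\left(\frac{1}{\gamma(x)}\right) = D'(x).
\end{align*}
```

Under these assumptions the general formula reduces to the spurious drift $D'(x)$ appearing in equation (1.1.8).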
The main theorem of this dissertation is stated below:
Theorem 1. Suppose SDE (1.1.4) satisfies Assumptions 1-2 stated in Chapter 3.
Let $(x_t^m, v_t^m) \in U \times \mathbb{R}^d$, with $U$ a bounded domain in $\mathbb{R}^d$, be the solution
of SDE (1.1.4) with initial conditions $(x, v)$ for each fixed $m > 0$, and let $x_t$ be the
solution to the Itô SDE (1.1.11) with the same initial condition $x_0 = x$. Then
\[ \lim_{m \to 0} E\left[\left(\sup_{0 \le t \le T} |x_t^m - x_t|\right)^2\right] = 0. \tag{1.1.13} \]
The main result does not assume the fluctuation-dissipation relation and is presented in multidimensional form, and is thus much more general than the original
motivating example. The essence of Theorem 1 is that for small $m$, the position of
the particle $x_t^m$ may be accurately approximated by the solution of SDE (1.1.11) on
the level of trajectories, i.e. with the same Wiener process $W(\omega)$ for every realization $\omega$
in the probability space Ω.
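Convergence on the level of trajectories can be illustrated in the simplest possible setting. The sketch below is not the dissertation's method: it assumes one dimension, constant friction and noise coefficients, zero external force, zero initial conditions, and an Euler discretization with hypothetical parameter values. Both equations are driven by the same Wiener increments.

```python
import math
import random


def simulate_pair(m, gamma=1.0, sigma=1.0, T=1.0, N=20000, seed=0):
    """Euler scheme for (1.1.4) and for its limit (1.1.11), same noise path.

    Assumes d = 1, F = 0, constant gamma and sigma, and x_0 = v_0 = 0.
    Returns (sup |x^m - x|, sup |v^m|) over the time grid.
    """
    rng = random.Random(seed)
    dt = T / N
    xm, v, x = 0.0, 0.0, 0.0
    sup_diff, sup_v = 0.0, 0.0
    for _ in range(N):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, variance dt
        # Small-mass system: position uses the old velocity (explicit Euler).
        xm, v = xm + v * dt, v - (gamma / m) * v * dt + (sigma / m) * dW
        # Limiting equation: no spurious drift, since gamma is constant.
        x = x + (sigma / gamma) * dW
        sup_diff = max(sup_diff, abs(xm - x))
        sup_v = max(sup_v, abs(v))
    return sup_diff, sup_v


sup_diff, sup_v = simulate_pair(m=0.01)
```

For this scheme the two paths satisfy $x^m_n - x_n = -(m/\gamma)\,v_n$ at every grid point, so the trajectory error is $m/\gamma$ times the largest velocity. Since the velocity fluctuations grow only like $m^{-1/2}$, one would expect the error to shrink roughly like $\sqrt{m}$ in this special case.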
The above general framework also applies to systems with colored noise, another
class of SDE of interest in this dissertation. Consider the one-dimensional Langevin
equation driven by Ornstein-Uhlenbeck (OU) colored noise with a short correlation
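time $\tau$:
\[ m\ddot{x}_t = -\dot{x}_t + f(x_t)\,\eta_t^\tau. \tag{1.1.14} \]
Using the SDE for $\eta_t^\tau$, the system is rewritten as
\[
\begin{cases}
dx_t^{m,\tau} = v_t^{m,\tau}\,dt,\\[4pt]
dv_t^{m,\tau} = \left[-\dfrac{v_t^{m,\tau}}{m} + \dfrac{f(x_t^{m,\tau})\,\eta_t^\tau}{m}\right]dt,\\[4pt]
d\eta_t^\tau = -\dfrac{\alpha\,\eta_t^\tau}{\tau}\,dt + \dfrac{\sqrt{2\lambda}}{\tau}\,dW_t.
\end{cases}
\tag{1.1.15}
\]
Defining $\mathbf{v}_t^{m,\tau} = (v_t^{m,\tau}, \eta_t^\tau)$ and setting $\tau = \tau_0 m$ (as in [PS08]), the above system can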
be written in the form of equation (1.1.4) (see Sec 4.2). Thus the main result applies
to Langevin equations with both white and OU colored noises in the limit as m → 0.
Important aspects of Theorem 1 include:
1. In SDE (1.1.4), $x_t^m \in U \subset \mathbb{R}^d$, $v_t^m \in \mathbb{R}^d$, and the friction $\gamma = \gamma(x)$ is a state-dependent, matrix-valued function,
2. the convergence is in $L^2(\Omega, \mathcal{F}, P)$ with respect to the space of continuous functions from $[0,T]$ to $U \subset \mathbb{R}^d$ with the uniform metric (i.e. $C_U[0,T]$),
3. the proof holds for physically realistic coefficients F , γ, σ, e.g. in [VHB+ 10]
where γ, σ blow up at the boundaries of U,
4. γ is sufficiently arbitrary to include systems with colored noise, as modeled by
equation (1.1.15), and the limit is as m and τ , the correlation time of the noise,
go to zero.
In the following chapter, we will clarify the modes of convergence as well as give
previous convergence theorems.
1.2 History
Taking a small mass limit to aid in measuring the external force applied to a colloidal
particle is just one example of the usefulness of the Smoluchowski-Kramers approximation of SDEs. As the name suggests, Kramers and Smoluchowski [Smo16] were the
first to compute a small parameter limit for a Brownian particle, but to an entirely
different end. In this section, we describe how a Brownian particle, a particle with
diameter on the order of 1nm-1µm in a fluid, can be modeled by partial differential
equations (PDE). We also describe how Kramers first used the approximation that
came to be known as the Smoluchowski-Kramers approximation in [Kra40].
A small dense particle immersed in fluid will collide with surrounding molecules
causing irregular motion of the particle (see figure 2.2). Such particle movement is
called Brownian motion, in honor of its observer Robert Brown [Bro28]. In 1905, a kinetic theory was underway with the assumption that Brownian motion of microscopic
particles was due to bombardment by the surrounding molecules of the fluid. By the
law of equipartition of energy in statistical mechanics [TKS92] the kinetic energy of
the translation of the particle and of a molecule of the fluid should be equal. All that
remains is to measure the velocity of the particle, a feat only accomplished in the
last few years [LKMR10]. Independently, Einstein and Smoluchowski first developed
a theory of Brownian motion by circumventing the problem of measuring velocity
[Nel67].
Einstein’s theory consisted of describing how ρ = ρ(x, t), the probability density
that a Brownian particle is at $x \in \mathbb{R}^d$ at time $t$, evolves in time. Using probabilistic
arguments, he derived the diffusion equation
\[ \frac{\partial \rho}{\partial t} = D \sum_{i=1}^{d} \frac{\partial^2 \rho}{\partial x_i^2}, \tag{1.2.1} \]
where the diffusion constant $D$ is, through a physical argument (see [Nel67, TKS92,
Sch80]),
\[ D = \frac{k_B T}{m\gamma}, \tag{1.2.2} \]
where kB is the Boltzmann constant, T the absolute temperature of the fluid, and γ
the friction coefficient. Equation (1.2.2) is called the Einstein diffusion relation. Equation (1.2.1) models the probability density of a Brownian particle with no external
forces. The connection between equation (1.2.1) and the Langevin equation (1.1.4)
is given in Chapter 2 by the Chapman-Kolmogorov equation (see Section 2.2.3).
Smoluchowski derived the diffusion equation for the position x of a particle under the
influence of an external force. In one dimension the Smoluchowski equation is,
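\[ \frac{\partial \rho}{\partial t} = \frac{\partial}{\partial x}\left(-\bar{F}(x)\rho + D\,\frac{\partial \rho}{\partial x}\right). \tag{1.2.3} \]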
First we describe how Kramers used Einstein and Smoluchowski’s theory to study
chemical reaction rates.
For a Brownian particle with position x in one dimension, the probability density
ρ = ρ(x, v, t) that the particle is at x with velocity v at time t is governed by the
diffusion equation,
\[ \frac{\partial \rho}{\partial t} = -F(x)\frac{\partial \rho}{\partial v} - v\frac{\partial \rho}{\partial x} + \eta\,\frac{\partial}{\partial v}\left(v\rho + k_B T\,\frac{\partial \rho}{\partial v}\right), \tag{1.2.4} \]
where $F$ is the force from a potential well, i.e. $F(x) = -U'(x)$ where $U$ is depicted
in figure 1.3, and $\eta$ is the viscosity. Suppose the particle is in the potential well near
$A$ in figure 1.3. With purely deterministic dynamics, the particle will come to rest at
$A$ due to friction. However, the noise of the system will cause the particle to leave
the potential well and escape to $x > B$, even for large barrier heights compared to
$k_B T$.
Figure 1.3. An example of the potential well that Kramers considered in his paper
[Kra40].
Kramers was interested in finding the escape rate $r$ for a particle near $A$. The
formula given by transition state theory is
\[ r = \mu\,\frac{\omega_A}{2\pi}\,e^{-\Delta E / k_B T}, \tag{1.2.5} \]
where $\Delta E$ is the height of the barrier between $A$ and $B$, and $\mu$ is the pre-factor that
Kramers was interested in calculating. For the state space $(x, v)$, calculating $\mu$ is difficult;
however, if the state space is reduced to $x$ then $\mu$ can be found explicitly. This is
exactly what Kramers did in the limit of large viscosity $\eta$. For this limit, we assume
the following:
1. the random forces on the velocity are much larger than the external
force $F$,
2. $F$ does not change much over the distance $\sqrt{k_B T}/\eta$,
3. given an arbitrary initial condition $\rho(x, v, 0) = \rho_0(x, v)$, a Maxwell [TKS92, Ris89]
velocity distribution will be valid after a time on the order of $1/\eta$. That is,
\[ \rho(x, v, t) \approx \sigma(x, t)\,e^{-mv^2/(2 k_B T)}. \tag{1.2.6} \]
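Equation (1.2.4) is written as
\[
\frac{\partial \rho}{\partial t} = \eta\left(\frac{\partial}{\partial v} - \frac{1}{\eta}\frac{\partial}{\partial x}\right)\left(v\rho + k_B T\,\frac{\partial \rho}{\partial v} - \frac{F(x)}{\eta}\rho + \frac{k_B T}{\eta}\frac{\partial \rho}{\partial x}\right) - \frac{\partial}{\partial x}\left(\frac{F(x)}{\eta}\rho - \frac{k_B T}{\eta}\frac{\partial \rho}{\partial x}\right).
\tag{1.2.7}
\]
Integrating both sides over $\mathbb{R}$ along the line $x + v/\eta = x_0$, with $x_0$ constant, and denoting $\int_{x+v/\eta=x_0} \rho\,dv = \tilde{\rho}(x_0)$, we obtain
\[
\begin{aligned}
\frac{\partial \tilde{\rho}}{\partial t}
&= \int_{x+v/\eta=x_0}\left[\eta\left(\frac{\partial}{\partial v} - \frac{1}{\eta}\frac{\partial}{\partial x}\right)\left(v\rho + k_B T\,\frac{\partial \rho}{\partial v} - \frac{F(x)}{\eta}\rho + \frac{k_B T}{\eta}\frac{\partial \rho}{\partial x}\right) - \frac{\partial}{\partial x}\left(\frac{F(x)}{\eta}\rho - \frac{k_B T}{\eta}\frac{\partial \rho}{\partial x}\right)\right]dv\\
&\approx -\frac{\partial}{\partial x_0}\left(\frac{F(x_0)}{\eta}\,\tilde{\rho} - \frac{k_B T}{\eta}\frac{\partial \tilde{\rho}}{\partial x_0}\right).
\end{aligned}
\tag{1.2.8}
\]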
This leads to the Smoluchowski-Kramers approximation. Kramers then used this
approximation to calculate µ in equation (1.2.5) exactly.
The utility of the Smoluchowski-Kramers approximation can also be seen by considering the problem of finding the expected time of escape over the barrier, $\tau(x, v)$,
depending on the initial condition $(x_0^m, v_0^m) = (x, v)$. The function $\tau$ is governed by
the PDE
\[ -F(x)\frac{\partial \tau}{\partial v} - v\frac{\partial \tau}{\partial x} + \eta\,\frac{\partial}{\partial v}\left(v\tau + k_B T\,\frac{\partial \tau}{\partial v}\right) = -1, \tag{1.2.9} \]
for the particle in phase space $(x, v) \in \mathbb{R}^2$ [Gar04, Ris89]. Notice that the left hand
side of the above equation is the same differential operator acting on $\tau$ as the right
hand side of equation (1.2.4) acting on $\rho$. If an approximation is found to reduce
the phase space to $x \in \mathbb{R}$, then the above PDE (1.2.9) reduces to a one dimensional
problem. Using the Smoluchowski-Kramers approximation for the expected escape
time example from the potential well, the problem is reduced to
\[ -\frac{d}{dx_0}\left(\frac{F(x_0)}{\eta}\,\tilde{\tau}(x_0) - \frac{k_B T}{\eta}\frac{d\tilde{\tau}(x_0)}{dx_0}\right) = -1. \tag{1.2.10} \]
At first glance, taking a large viscosity limit does not seem equivalent to the small
mass limit taken above in equation (1.1.4). To see the connection, let $u = \sqrt{m}\,v$ and rescale time by
a factor of $\sqrt{m}$; then for one dimensional $x$ and $v$ equation (1.1.4) becomes
\[
\begin{cases}
dx_t = u_t\,dt,\\[4pt]
du_t = \left[F(x_t) - \dfrac{\gamma(x_t)}{\sqrt{m}}\,u_t\right]dt + \dfrac{\sigma(x_t)}{m^{1/4}}\,dW_t.
\end{cases}
\tag{1.2.11}
\]
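The rescaling can be checked term by term. The following sketch assumes the substitution $u = \sqrt{m}\,v$ and the new time variable $s = t/\sqrt{m}$; the symbols $s$ and $\widetilde{W}$ are introduced here only for illustration. By Brownian scaling, $\widetilde{W}_s = m^{-1/4}\,W_{\sqrt{m}\,s}$ is again a standard Wiener process, so $dW_t = m^{1/4}\,d\widetilde{W}_s$.

```latex
\begin{align*}
dx &= v\,dt = \frac{u}{\sqrt{m}}\,\sqrt{m}\,ds = u\,ds,\\
du &= \sqrt{m}\,dv
    = \sqrt{m}\left[\frac{F(x)}{m} - \frac{\gamma(x)}{m}\,\frac{u}{\sqrt{m}}\right]\sqrt{m}\,ds
      + \sqrt{m}\,\frac{\sigma(x)}{m}\,m^{1/4}\,d\widetilde{W}_s\\
   &= \left[F(x) - \frac{\gamma(x)}{\sqrt{m}}\,u\right]ds + \frac{\sigma(x)}{m^{1/4}}\,d\widetilde{W}_s .
\end{align*}
```

Renaming $s$ as $t$ and $\widetilde{W}$ as $W$ gives the system (1.2.11).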
The probability density ρ for the solution of the above SDE is governed by the diffusion
equation,
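\[ \frac{\partial \rho}{\partial t} = -u\frac{\partial \rho}{\partial x} - F(x)\frac{\partial \rho}{\partial u} + \frac{1}{\sqrt{m}}\frac{\partial}{\partial u}\left(\gamma(x)\,u\rho + \sigma(x)\frac{\partial \rho}{\partial u}\right). \tag{1.2.12} \]
If we set $\gamma(x) = \eta$ and, by Einstein's diffusion relation [Eq. (1.2.2)], $\sigma = k_B T \eta$, then
we recover equation (1.2.4) where, for small mass, the factor $m^{-1/2}$ is absorbed into $\eta$ for
large viscosity.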
1.3 Outline of the dissertation
The outline for the dissertation is as follows: In Section 2.2 we derive the SDE (1.1.4)
and define the stochastic integral with respect to the Wiener process [Sec. 2.2.1].
We give existence and uniqueness theorems for SDE [Sec. 2.2.2, Sec. 2.2.4] and derive the relationship between the forward and backward Kolmogorov equations and
SDE [Sec. 2.2.3]. In Section 2.3 we give examples of mathematical models using the
Langevin equation in applications. In Section 2.4 we define modes of convergence and
in Section 2.5 we give previous results of the Smoluchowski-Kramers approximation.
The main proof is found in Chapter 3. In Chapter 4 we use the main theorem to
derive the Smoluchowski-Kramers limit in the one-dimensional general case [Sec. 4.1]
and the original motivating example [Sec. 4.1.3]. We also apply the theorem to the
Langevin equation driven by OU colored noise [Sec. 4.2] including an application of
thermophoresis [Sec. 4.3] (i.e. Brownian particles moving due to a temperature gradient). In Section 4.4.1 we derive the Smoluchowski-Kramers limit for a Brownian
particle in three dimensions with a non-conservative external force. In Chapter 5 we
discuss the results and possible future research.
Chapter 2
Stochastic Differential Equations and the
Smoluchowski-Kramers approximation
2.1 Introduction
In this chapter we give a brief introduction to stochastic differential equations (SDE)
with an emphasis on classical results and estimates that will be used to study
small parameter limits. As mentioned in the previous chapter, there is a connection
between the Fokker-Planck PDEs (e.g. equation (1.2.4)) and SDEs that will be explicitly stated later in the chapter [Sec. 2.2.3]. However, the main theorem of this
dissertation pertains to the SDE description and we will focus on the study of SDEs.
We start by giving a physically motivated derivation of equation (1.1.4) using
SDEs. The dynamics of a Brownian particle of mass m are modeled by the stochastic
Newton equation:
\[ m\,\frac{d^2 x_t}{dt^2} = F(x_t) - \gamma(x_t)\frac{dx_t}{dt} + \sigma(x_t)\,\eta_t, \tag{2.1.1} \]
where $x_t \in \mathbb{R}^d \times [0,T]$ with initial condition $x_0 = x$ and $dx_t/dt = v_t$. The motion
of the Brownian particle is driven by deterministic and random forces. The function
$F : \mathbb{R}^d \to \mathbb{R}^d$ is the sum of external forces. The second term on the right hand
side of equation (2.1.1) is a friction force, which describes the dissipation of energy,
with friction coefficient $\gamma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$. The last term is the force caused by
random collisions with the molecules of the fluid, where $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times k}$ is the
intensity and $\eta_t : [0,T] \to \mathbb{R}^k$ is a random process accounting for the noise. The
noise is too complex to analyze exactly because the number of degrees of freedom is
of the order of $10^{23}$ [Sch80]. A first approximation depends on many factors including
time and distance scalings. For example, in the experiment of a Brownian particle
in a diffusion gradient [VHB+ 10], the time series of measurements is on the order of
microseconds while the random collisions are correlated on the order of picoseconds
[FGB+ 11]. Therefore $\eta_t$ can be modeled by Gaussian white noise, a process whose
values at distinct times are independent. In this dissertation we consider two different processes
for $\eta_t$, white and colored noises.
Before proceeding, some notational definitions are necessary. We take $x_t$ in a
bounded open domain $U \subset \mathbb{R}^d$, i.e. $x_t \in U \times [0,T]$ and $v_t \in \mathbb{R}^d \times [0,T]$. On the
probability space Ω with σ-algebra $\mathcal{F}$ and probability measure $P$, denoted $(\Omega, \mathcal{F}, P)$,
$W_t$ is a $k$-dimensional standard Wiener process with components $W_i$, $i = 1, ..., k$, mutually
independent. The minimum of two numbers, or random variables, $a, b$ is denoted
$\min\{a, b\} = a \wedge b$ and the maximum $\max\{a, b\} = a \vee b$. We write $A \in \mathbb{R}^{n \times m}$ for an $n$
by $m$ real valued matrix, and $A^*$ for its Hermitian transpose. For a vector $a \in \mathbb{R}^d$, the
norm $|a| = |a|_2$ is the Euclidean norm unless otherwise specified. The matrix norm
is the induced operator norm. The space of continuous functions from $[0,T]$ to $\mathbb{R}^d$ is
denoted $C_{\mathbb{R}^d}[0,T]$.
2.2 Stochastic Differential Equations
We consider the case where $\eta_t$ in equation (2.1.1) is a one dimensional process
called Gaussian white noise. That is, $\eta_t$ has the properties:
1. $t_1 \neq t_2$ implies $\eta_{t_1}$ and $\eta_{t_2}$ are independent.
2. $\{\eta_t\}_{t \ge 0}$ is stationary.
3. $E[\eta_t] = 0$ for all $t$,
4. and
\[ E[\eta_t \eta_s] = \begin{cases} 0, & t \neq s,\\ 1, & t = s. \end{cases} \tag{2.2.1} \]
By this definition, $\eta_t$ will not have continuous paths and will not even be measurable
with respect to $\mathcal{B} \times \mathcal{F}$, where $\mathcal{B}$ is the Borel σ-algebra on $\mathbb{R}$ [Øks03]. We will circumvent
this problem by replacing $\eta_t$ with a suitable process. To define the equations of
motion, we follow the derivation in [Øks03] by considering the discrete version of
equation (2.1.1) for $v_t, x_t \in C_{\mathbb{R}}[0,T]$, defining $dx_t/dt = v_t$ and $\Delta t = t_i - t_{i-1}$,
\[
\begin{aligned}
x_{t_i} - x_{t_{i-1}} &= v_{t_{i-1}}\,\Delta t,\\
m\,(v_{t_i} - v_{t_{i-1}}) &= F(x_{t_{i-1}})\,\Delta t - \gamma(x_{t_{i-1}})\,v_{t_{i-1}}\,\Delta t + \sigma(x_{t_{i-1}})\,\eta_{t_{i-1}}\,\Delta t.
\end{aligned}
\tag{2.2.2}
\]
We wish to replace ηti−1 ∆t by the increment of a process that has stationary,
independent increments with mean 0 (from properties 1-4). It can be shown that the
Wiener process, Wt , is the only such process with continuous paths [Øks03]; its
increments Wtk − Wtk−1 are stationary with mean zero. We cannot simply set
dWt /dt = ηt , because the Wiener process is nowhere differentiable (see figure 2.1).
Figure 2.1. A one dimensional realization of the Wiener process. Note that the
process is rough and nowhere differentiable.
Instead, we substitute ∆W for ηt ∆t and sum over the partition of [0, t] to obtain,
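\[
\begin{aligned}
x_t &= x_0 + \sum_{i=1}^{N} v_{t_{i-1}}\,\Delta t, \\
v_t &= v_0 + \sum_{i=1}^{N} \frac{F(x_{t_{i-1}})}{m}\,\Delta t
       - \sum_{i=1}^{N} \frac{\gamma(x_{t_{i-1}})}{m}\,v_{t_{i-1}}\,\Delta t
       + \sum_{i=1}^{N} \frac{\sigma(x_{t_{i-1}})}{m}\,\Delta W.
\end{aligned}
\tag{2.2.3}
\]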
Taking the limit as the partition gets finer, we wish to define,
\[
\lim_{N \to \infty} \sum_{i=1}^{N} f(x_{t_{i-1}})\,\Delta W = \int_0^t f(x_s)\,dW_s,
\tag{2.2.4}
\]
for a certain class of functions f .
2.2.1
Construction of the Stochastic Integral
For a measurable function f (t) : R → R, the Riemann integral of f (t) is defined on
the interval [0, 1] by taking sums of the form
\[
S_N^{*} = \sum_{i=1}^{N} f(t_i^{*})\,(t_i - t_{i-1}),
\tag{2.2.5}
\]
where t0 , ..., tN is a partition of [0, 1] and t∗i ∈ [ti−1 , ti ]. We parameterize t∗i by writing
\[
t_i^{*} = \alpha t_i + (1 - \alpha)\,t_{i-1}, \qquad 0 \le \alpha \le 1.
\tag{2.2.6}
\]
For the stochastic integral, we restrict the class of integrable functions to f ∈ V where
V on [0, T ] has the following properties:
1. f (t, ω) is B × F- measurable, where B is the Borel σ-algebra on [0, T ].
2. f (t, ω) is Ft -adapted.
3. $E\left[\int_0^T f(t,\omega)^2\,dt\right] < \infty$.
For f ∈ V on [0, T ] we construct sums of the form
\[
\lim_{N \to \infty} \phi_N(\alpha) = \lim_{N \to \infty} \sum_{i=1}^{N} f(t_i^{*},\omega)\,(W_{t_i} - W_{t_{i-1}}),
\tag{2.2.7}
\]
where Wt is the standard Wiener process. For α = 0, the sum converges in L2 (Ω, F, P )
to the Itô integral $\int_0^T f(t,\omega)\,dW_t$. If we take α = 1/2, the sum converges in L2 to the
Stratonovich integral, denoted $\int_0^T f(t,\omega) \circ dW_t$. A convenient property of the Riemann
sum is that, as the partition of [0, 1] is made finer, the different sums $S_N^{*} = S_N(\alpha)$ all
converge to the same value, denoted $\int_0^1 f(t)\,dt$. The Itô and Stratonovich integrals,
however, do not always coincide. For example,
\[
\int_0^t W_s\,dW_s = \frac{1}{2}\left(W_t^2 - t\right),
\quad \text{and} \quad
\int_0^t W_s \circ dW_s = \frac{1}{2} W_t^2,
\tag{2.2.8}
\]
which are not equal for t > 0 [Øks03]. For the SDE of interest [Eq. (1.1.4)], there
may be a question as to how to construct the stochastic integral. However, if the
integrand of the stochastic integral varies smoothly in time then all constructions are
equivalent to the Itô construction.
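The difference between the two constructions in equation (2.2.8) can be observed numerically. The following is a minimal sketch, not part of the original derivation: the function name `alpha_sum`, the number of steps, and the seed are illustrative assumptions. The Wiener path is sampled on a grid twice as fine as the partition so that its value at the midpoint of each interval is available for the α = 1/2 sum.

```python
import math
import random


def alpha_sum(n_steps=20000, t_final=1.0, seed=0):
    """Approximate the sums phi_N(alpha) of (2.2.7) for f = W, alpha = 0 and 1/2.

    Hypothetical helper (not from the source). Each partition interval is
    split in two so that W is known at the interval midpoint.
    Returns (ito_sum, stratonovich_sum, W_T).
    """
    rng = random.Random(seed)
    half_dt = t_final / (2 * n_steps)
    sd = math.sqrt(half_dt)
    w_left = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        w_mid = w_left + rng.gauss(0.0, sd)    # W at the interval midpoint
        w_right = w_mid + rng.gauss(0.0, sd)   # W at the right endpoint
        dw = w_right - w_left
        ito += w_left * dw      # alpha = 0: integrand at the left endpoint
        strat += w_mid * dw     # alpha = 1/2: integrand at the midpoint
        w_left = w_right
    return ito, strat, w_left


if __name__ == "__main__":
    ito, strat, w_t = alpha_sum()
    print("Ito sum         :", ito, "vs (W_T^2 - T)/2 =", 0.5 * (w_t ** 2 - 1.0))
    print("Stratonovich sum:", strat, "vs W_T^2 / 2     =", 0.5 * w_t ** 2)
```

For a fine partition the two sums should differ by approximately t/2, which is the gap between the two right hand sides of equation (2.2.8).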
Lemma 1. For f ∈ V on [0, T ], if there exist K < ∞ and ε > 0 such that
\[
E\left[|f(s,\omega) - f(t,\omega)|^2\right] \le K |s - t|^{1+\epsilon},
\tag{2.2.9}
\]
for 0 ≤ s, t ≤ T , then
\[
\lim_{N \to \infty} \phi_N(\alpha) = \lim_{N \to \infty} \phi_N(0) = \int_0^T f(s,\omega)\,dW_s,
\tag{2.2.10}
\]
in L2 (Ω, F, P ), for all α ∈ [0, 1], where the stochastic integral on the right hand side
is Itô.
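Proof: exercise 3.10 [Øks03]. Define ∆Wi = Wti+1 − Wti and consider
\[
\begin{aligned}
E\left[|\phi_N(\alpha) - \phi_N(0)|^2\right]
&= E\left[\Big|\sum_i f(t_i^{*},\omega)\,\Delta W_i - \sum_i f(t_i,\omega)\,\Delta W_i\Big|^2\right] \\
&= E\left[\Big|\sum_i \big(f(t_i^{*},\omega) - f(t_i,\omega)\big)\,\Delta W_i\Big|^2\right] \\
&= \sum_{i,j} E\left[\big(f(t_i^{*},\omega) - f(t_i,\omega)\big)\big(f(t_j^{*},\omega) - f(t_j,\omega)\big)\,\Delta W_i\,\Delta W_j\right].
\end{aligned}
\tag{2.2.11}
\]
By the Cauchy-Schwarz inequality,
\[
E\left[|\phi_N(\alpha) - \phi_N(0)|^2\right] \le \sum_{i,j} E\left[\big(f(t_i,\omega) - f(t_i^{*},\omega)\big)^2 \big(f(t_j,\omega) - f(t_j^{*},\omega)\big)^2\right] E\left[(\Delta W_i)^2 (\Delta W_j)^2\right]
\tag{2.2.12}
\]
\[
\le \sum_{i,j} K^2\,|t_i - t_i^{*}|^{1+\epsilon}\,|t_j - t_j^{*}|^{1+\epsilon}\,E\left[(\Delta W_i)^2 (\Delta W_j)^2\right],
\tag{2.2.13}
\]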
with the last inequality coming from the assumption [Eq. (2.2.9)]. The Wiener process
has independent increments and
E[(∆Wi )2 (∆Wj )2 ] = E[(∆Wi )2 ]E[(∆Wj )2 ] = (ti+1 − ti )(tj+1 − tj ).
(2.2.14)
Therefore,
\[
E\left[|\phi_N(\alpha) - \phi_N(0)|^2\right] \le \sum_{i,j} K^2\,|t_i - t_i^{*}|^{1+\epsilon}\,|t_j - t_j^{*}|^{1+\epsilon}\,(t_{i+1} - t_i)(t_{j+1} - t_j).
\tag{2.2.15}
\]
Taking the limit as N → ∞, the right hand side becomes a double integral with each
integral going to zero. This completes the proof.
One important property of the Itô integral, which will be exploited several times in
the main proof, is the Itô isometry.
Lemma 2 (Itô isometry). For all f ∈ V on [0, T ],
\[
E\left[\left(\int_0^T f(t,\omega)\,dW_t\right)^2\right] = \int_0^T E\left[f(t,\omega)^2\right] dt.
\tag{2.2.16}
\]
For the proof, see [Øks03, KS91].
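As a sanity check of Lemma 2, the isometry can be tested by Monte Carlo for the integrand f (t, ω) = Wt , for which the right hand side of (2.2.16) is T²/2. The sketch below is illustrative only: the function name `ito_isometry_check`, the sample sizes, and the seed are assumptions, and the left hand side is estimated with left-endpoint (Itô) sums.

```python
import math
import random


def ito_isometry_check(n_paths=5000, n_steps=100, t_final=1.0, seed=1):
    """Monte Carlo estimate of both sides of the Ito isometry for f = W_t.

    Hypothetical helper (not from the source). The left side averages the
    squared left-endpoint sum over n_paths sample paths; the right side is
    the integral of E[W_t^2] = t over [0, T], i.e. T^2 / 2.
    Returns (lhs_estimate, rhs_exact).
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        w = 0.0
        integral = 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, sd)
            integral += w * dw   # Ito sum: integrand at the left endpoint
            w += dw
        total += integral * integral
    lhs = total / n_paths
    rhs = 0.5 * t_final ** 2
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = ito_isometry_check()
    print("E[(int W dW)^2] ~", lhs, "  int E[W^2] dt =", rhs)
```

The estimate carries both a sampling error and a small discretization bias, so agreement is only expected up to a few percent for the sample sizes chosen here.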
The construction of the stochastic integral gives a well defined meaning of the
differential,
dxt = b(xt ) dt + σ(xt ) dWt ,
(2.2.17)
as well as the integral equation,
\[
x_t = x_0 + \int_0^t b(x_s)\,ds + \int_0^t \sigma(x_s)\,dW_s,
\tag{2.2.18}
\]
where the stochastic integral will be interpreted as the Itô integral. The definition of
SDE easily extends to functions b(t, x). However, in this dissertation we only assume
time homogeneous coefficients.
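The Itô interpretation of equation (2.2.17) suggests a simple numerical scheme: step forward in time with the coefficients evaluated at the left endpoint of each interval, exactly as in the α = 0 sums. The following Euler-Maruyama sketch is offered under stated assumptions and is not a method used in this dissertation; the function name `euler_maruyama` and its arguments are hypothetical, and the state is taken to be scalar.

```python
import math
import random


def euler_maruyama(b, sigma, x0, t_final, n_steps, seed=0):
    """Euler-Maruyama approximation of dx = b(x) dt + sigma(x) dW (Ito sense).

    Illustrative sketch (name and arguments are assumptions). Coefficients
    are evaluated at the left endpoint of each step, mirroring the
    alpha = 0 sums that define the Ito integral.
    Returns the list [x_0, x_dt, ..., x_T] for a scalar state.
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    path = [x0]
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)               # Wiener increment ~ N(0, dt)
        x = x + b(x) * dt + sigma(x) * dw     # left-endpoint (Ito) update
        path.append(x)
    return path


if __name__ == "__main__":
    # Example: dx = -x dt + 0.5 dW with x_0 = 1 on [0, 1].
    path = euler_maruyama(lambda x: -x, lambda x: 0.5, 1.0, 1.0, 1000)
    print("x_T for one sample path:", path[-1])
```

With σ ≡ 0 the scheme reduces to the explicit Euler method for the ODE dx/dt = b(x), which gives a quick deterministic check of an implementation.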
In one dimension, we show that if the stochastic integral is constructed in some
other sense, denoted $\int_0^t \sigma(x_s) \circ_\alpha dW_s$ where α is defined in equation (2.2.6), then the
integral can be written as the sum of a drift term and an Itô stochastic integral.
Stochastic integrals as modified Itô integrals: A stochastic integral with a given α can
be expressed as an Itô integral, i.e. α = 0, with an additional noise-induced drift, i.e.
a “spurious drift.” To justify this claim, consider the α-SDE
dxt = b(xt )dt + σ(xt ) ◦α dWt ,
(2.2.19)
where the SDE is defined by α and the solution xt is real valued and one dimensional.
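The integrated equation is
\[
x_t = x_0 + \int_0^t b(x_s)\,ds + \lim_{N \to \infty} \sum_{n=0}^{N} \sigma(x_{t_n^{\alpha}})\,\Delta W_{t_n},
\qquad \text{with } t_n^{\alpha} = \frac{n + \alpha}{N}\,t.
\tag{2.2.20}
\]
By expanding σ(xt ), we see that this corresponds to
\[
x_t = x_0 + \int_0^t b(x_s)\,ds + \int_0^t \alpha\,\sigma(x_s)\,\frac{d\sigma(x_s)}{dx_s}\,ds
      + \lim_{N \to \infty} \sum_{n=0}^{N} \sigma(x_{t_n})\,\Delta W_{t_n},
\qquad \text{with } t_n = \frac{n}{N}\,t.
\tag{2.2.21}
\]
This can be interpreted as the Itô (α ≡ 0) equation
\[
dx_t = b(x_t)\,dt + \alpha\,\sigma(x_t)\,\frac{d\sigma(x_t)}{dx_t}\,dt + \sigma(x_t)\,dW_t,
\tag{2.2.22}
\]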
where we omit the ◦0 for all Itô integrals. The same argument can be carried out in
d-dimensions through a multi-dimensional Taylor expansion. The special cases we
will mention are α = 0 (the Itô integral), α = 1/2 (the Stratonovich integral), and
α = 1 (the anti-Itô integral).
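To make the noise-induced drift concrete, consider a worked example that is not part of the original text: we assume b ≡ 0 and the linear coefficient σ(x) = x, chosen only because the resulting equations can be solved in closed form. The closed-form solution follows from applying Itô's formula to log xt .

```latex
% Worked example (assumptions: b = 0 and \sigma(x) = x, chosen for illustration).
\begin{align*}
  dx_t &= x_t \circ_\alpha dW_t
  && \text{($\alpha$-SDE)} \\
  dx_t &= \alpha\, x_t\,dt + x_t\,dW_t
  && \text{(It\^o form, since } \sigma\,\tfrac{d\sigma}{dx} = x\text{)} \\
  x_t &= x_0 \exp\!\left( \left(\alpha - \tfrac{1}{2}\right) t + W_t \right)
  && \text{(It\^o's formula applied to } \log x_t\text{)}
\end{align*}
```

For α = 0 the solution is x0 exp(Wt − t/2), for α = 1/2 it is x0 exp(Wt ), and for α = 1 it is x0 exp(Wt + t/2), so the same formal equation produces three different processes depending on the convention.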
2.2.2
Existence and uniqueness of solutions to SDE
Given an Itô stochastic differential equation,
dxt = b(xt ) dt + σ(xt ) dWt ,
x0 = x,
(2.2.23)
where xt ∈ Rd for t ∈ [0, T ], b : Rd → Rd , σ : Rd → Rd×k and Wt is a k dimensional Wiener
process, consider the existence and uniqueness of a solution xt . For the motivating
example of the Brownian particle in a diffusion gradient [VHB+ 10], the friction and,
by the Einstein diffusion relation [Eq. 1.2.2], noise coefficients grow unbounded as
the position of the particle approaches the wall (x → 0). Thus it is natural to ask:
what are the conditions on the coefficients b(x) and σ(x) such that SDE (2.2.23) has
a unique solution xt for all t ≥ 0?
We answer this question in stages, starting with the classical existence and uniqueness theorem:
Theorem 2 (5.2.1 [Øks03]). Let T > 0, and let the SDE (2.2.23) have the initial
condition x0 = x with E[|x|^2 ] < ∞. If the coefficients b, σ satisfy the growth
condition
\[
|b(x)| + |\sigma(x)| \le C(1 + |x|), \qquad x \in \mathbb{R}^d,
\tag{2.2.24}
\]
for some constant C, where $|\sigma|^2 = \sum_{i,j} |\sigma_{ij}|^2$, and the Lipschitz condition
\[
|b(x) - b(y)| + |\sigma(x) - \sigma(y)| \le D\,|x - y|, \qquad x, y \in \mathbb{R}^d,
\tag{2.2.25}
\]
for some constant D, then there is a unique solution to SDE (2.2.23) for all time
t ∈ [0, T ] with probability 1.
The first condition [Eq. (2.2.24)] assures that the solution will not explode (i.e.
|xt | → ∞ in finite time). The second condition [Eq. (2.2.25)] is a Lipschitz condition,
as often seen in existence and uniqueness theorems of ordinary differential equations
(ODEs) [Har02]. The proof of existence is carried out much in the way of ODEs (see
[Øks03, KS91] for the SDE treatment and for ODE see [Har02]). We provide the
proof of uniqueness because we use an estimate many times in Chapter 3.
Proof of uniqueness. Let xt and x̂t be two solutions to equation (2.2.23), with
initial conditions x0 and x̂0 respectively. Consider,
\[
E\left[|x_t - \hat{x}_t|^2\right] = E\left[\left| x_0 - \hat{x}_0 + \int_0^t \big(b(x_s) - b(\hat{x}_s)\big)\,ds + \int_0^t \big(\sigma(x_s) - \sigma(\hat{x}_s)\big)\,dW_s \right|^2\right].
\tag{2.2.26}
\]
An estimate used many times in this dissertation is the Cauchy-Schwarz inequality,
\[
\left| \sum_{i=1}^{N} a_i \right|^2 \le N \sum_{i=1}^{N} |a_i|^2.
\tag{2.2.27}
\]
Therefore, for T < ∞,
\[
\begin{aligned}
E\Big[\sup_{0 \le t \le T} |x_t - \hat{x}_t|^2\Big]
&\le 3E\left[|x_0 - \hat{x}_0|^2\right]
   + 3E\Big[\sup_{0 \le t \le T} \Big|\int_0^t \big(b(x_s) - b(\hat{x}_s)\big)\,ds\Big|^2\Big] \\
&\quad + 3E\Big[\sup_{0 \le t \le T} \Big|\int_0^t \big(\sigma(x_s) - \sigma(\hat{x}_s)\big)\,dW_s\Big|^2\Big] \\
&\le 3E\left[|x_0 - \hat{x}_0|^2\right]
   + 3T\,E\Big[\int_0^T |b(x_s) - b(\hat{x}_s)|^2\,ds\Big] \\
&\quad + 12\,E\Big[\int_0^T |\sigma(x_s) - \sigma(\hat{x}_s)|^2\,ds\Big],
\end{aligned}
\tag{2.2.28}
\]
with the last inequality using Hölder’s inequality for the Lebesgue integral (whose
bound is largest at t = T ), and the Itô isometry and Doob’s maximal inequality
(see Lemma 4) for the stochastic integral, which is a martingale. Now the Lipschitz
condition [Eq. (2.2.25)] yields
\[
E\Big[\sup_{0 \le t \le T} |x_t - \hat{x}_t|^2\Big] \le 3E\left[|x_0 - \hat{x}_0|^2\right] + 3(1 + T)D^2 \int_0^T E\left[|x_s - \hat{x}_s|^2\right] ds.
\tag{2.2.29}
\]
Applying the Gronwall inequality (see [HS74] for a proof) to this estimate yields
\[
E\Big[\sup_{0 \le t \le T} |x_t - \hat{x}_t|^2\Big] \le 3E\left[|x_0 - \hat{x}_0|^2\right] \exp\{3(1 + T)D^2\}.
\tag{2.2.30}
\]
Now, since xt and x̂t both solve equation (2.2.23) with the same initial condition,
x0 = x̂0 . Therefore the right hand side of (2.2.30) vanishes. So, by Chebyshev’s
inequality, for all ε > 0,
\[
P\Big(\sup_{0 \le t \le T} |x_t - \hat{x}_t|^2 \ge \epsilon\Big) \le \frac{1}{\epsilon}\,E\Big[\sup_{0 \le t \le T} |x_t - \hat{x}_t|^2\Big] = 0,
\tag{2.2.31}
\]
and thus
\[
P\left(x_t = \hat{x}_t \ \text{for all } t \in [0, T]\right) = 1.
\tag{2.2.32}
\]
A solution to equation (2.2.23) is thereby unique.
The Lipschitz condition [Eq. (2.2.25)] can be dropped and the result holds with existence and uniqueness in the weak sense (see [KS91] prop. 5.3.6). Because we are
only interested in strong solutions, e.g. the result of Theorem 2, we do not cover this
case.
Before we discuss another existence theorem that is essential for applications of
the main theorem, we must introduce the connection between SDE and PDE. As
mentioned in Chapter 1 [Sec. 1.2], the motion of a Brownian particle is also described
by the deterministic PDEs, the Fokker-Planck (or forward Kolmogorov) and backward
Kolmogorov equations. In the next section we give a standard derivation of the
Fokker-Planck equation given SDE (2.2.23).
2.2.3
The Backward and Forward Kolmogorov Equations
In this section we give a non-rigorous derivation of the Fokker-Planck equation for
the solution to equation (2.2.23), where the process xt is one dimensional. Let
ρ(x, t) be the probability density of the stochastic process xt . The solution xt of an
SDE is a Markov process (see [Øks03]), i.e., for times t1 < t2 < t3 , the conditional
density of xt3 = x3 given xt1 = x1 and xt2 = x2 depends only on the most recent state,
\[
\rho(x_3, t_3 \,|\, (x_1, t_1), (x_2, t_2)) = \rho(x_3, t_3 \,|\, x_2, t_2).
\tag{2.2.33}
\]
For any Markov process, the Chapman-Kolmogorov equation is satisfied,
\[
\rho(x_3, t_3 \,|\, x_1, t_1) = \int \rho(x_3, t_3 \,|\, x_2, t_2)\,\rho(x_2, t_2 \,|\, x_1, t_1)\,dx_2.
\tag{2.2.34}
\]
Next, we assume that the Markov process xt is invariant in time. That is
\[
\rho(x, t_1 + s) = \rho(x, t_1).
\tag{2.2.35}
\]
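To derive the Fokker-Planck equation we follow the strategy of [CKW04] closely.
Consider the integral
\[
I = \int_{-\infty}^{\infty} h(y)\,\frac{\partial \rho(y, t \,|\, x)}{\partial t}\,dy
\tag{2.2.36}
\]
with h(y) a smooth function with compact support. We write the derivative with
respect to t as a limit and interchange the limit and the integral,
\[
I = \lim_{\Delta t \to 0} \int_{-\infty}^{\infty} h(y)\,\frac{\rho(y, t + \Delta t \,|\, x) - \rho(y, t \,|\, x)}{\Delta t}\,dy.
\tag{2.2.37}
\]
Now we use the Chapman-Kolmogorov identity [Eq. (2.2.34)] on the right hand side,
letting z be the intermediate point, to obtain
\[
I = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \left[ \int_{-\infty}^{\infty} h(y) \int_{-\infty}^{\infty} \rho(y, \Delta t \,|\, z)\,\rho(z, t \,|\, x)\,dz\,dy - \int_{-\infty}^{\infty} h(y)\,\rho(y, t \,|\, x)\,dy \right].
\tag{2.2.38}
\]
Changing the order of integration in the first term, renaming the integration variable
y to z in the second term, and using the fact that $\int_{-\infty}^{\infty} \rho(y, \Delta t \,|\, z)\,dy = 1$,
\[
I = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \left[ \int_{-\infty}^{\infty} \rho(z, t \,|\, x) \int_{-\infty}^{\infty} \rho(y, \Delta t \,|\, z)\,\big(h(y) - h(z)\big)\,dy\,dz \right].
\tag{2.2.39}
\]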
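By assumption, h(y) is smooth, and therefore can be expanded as a Taylor series
about z for y sufficiently close to z. Therefore the above integral can be written as
\[
I = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{-\infty}^{\infty} \rho(z, t \,|\, x) \int_{-\infty}^{\infty} \rho(y, \Delta t \,|\, z) \left( \sum_{n=1}^{\infty} h^{(n)}(z)\,\frac{(y - z)^n}{n!} \right) dy\,dz.
\tag{2.2.40}
\]
Now define the function
\[
D^{(n)}(z) = \frac{1}{n!}\,\frac{1}{\Delta t} \int_{-\infty}^{\infty} (y - z)^n\,\rho(y, \Delta t \,|\, z)\,dy.
\tag{2.2.41}
\]
Integral I is written as
\[
\int_{-\infty}^{\infty} h(y)\,\frac{\partial \rho(y, t \,|\, x)}{\partial t}\,dy = \int_{-\infty}^{\infty} \rho(z, t \,|\, x) \sum_{n=1}^{\infty} D^{(n)}(z)\,h^{(n)}(z)\,dz.
\tag{2.2.42}
\]
Integrating by parts n times, it follows that the integrands are equal,
\[
\frac{\partial \rho(z, t \,|\, x)}{\partial t} = \sum_{n=1}^{\infty} \left( -\frac{\partial}{\partial z} \right)^{n} \left[ D^{(n)}(z)\,\rho(z, t \,|\, x) \right].
\tag{2.2.43}
\]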
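With the assumption that D(i) (z) is negligible for all i ≥ 3, it can be derived (see
[CKW04, Ris89]) that
\[
D^{(1)} = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t},
\qquad
D^{(2)} = \lim_{\Delta t \to 0} \frac{(\Delta x)^2}{2\,\Delta t}.
\tag{2.2.44}
\]
This gives the drift and diffusion coefficients respectively. Setting D(1) (x) = b(x) and
D(2) (x) = σ(x)^2 /2 (the factor 1/2 comes from the 1/n! in definition (2.2.41)), then
equation (2.2.43) becomes
\[
\frac{\partial \rho(x, t)}{\partial t} = -\frac{\partial}{\partial x}\big(b(x)\,\rho(x, t)\big) + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\big(\sigma(x)^2\,\rho(x, t)\big),
\tag{2.2.45}
\]
which is the Fokker-Planck equation in one variable. The Fokker-Planck (or forward
Kolmogorov) equation for x = (x1 , x2 , ..., xd )∗ ∈ Rd is,
\[
\frac{\partial \rho(x, t)}{\partial t} = -\sum_{i=1}^{d} \frac{\partial}{\partial x_i}\big(b_i(x)\,\rho(x, t)\big) + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} \frac{\partial^2}{\partial x_i \partial x_j}\big([\sigma(x)\sigma^{*}(x)]_{i,j}\,\rho(x, t)\big),
\tag{2.2.46}
\]
which we write,
\[
\frac{\partial \rho}{\partial t} = L^{*} \rho,
\tag{2.2.47}
\]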
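where
\[
L^{*} = -\sum_{i=1}^{d} \frac{\partial}{\partial x_i}\big(b_i(x)\,\cdot\big) + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} \frac{\partial^2}{\partial x_i \partial x_j}\big([\sigma(x)\sigma^{*}(x)]_{i,j}\,\cdot\big).
\tag{2.2.48}
\]
The notation L∗ suggests that the differential operator is the adjoint of some
differential operator L. This operator is the one that generates the backward Kolmogorov
equation,
\[
\frac{\partial \rho(x, t)}{\partial t} = \sum_{i=1}^{d} b_i(x)\,\frac{\partial}{\partial x_i}\,\rho(x, t) + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} [\sigma(x)\sigma^{*}(x)]_{i,j}\,\frac{\partial^2}{\partial x_i \partial x_j}\,\rho(x, t),
\tag{2.2.49}
\]
or
\[
\frac{\partial \rho}{\partial t} = L \rho,
\tag{2.2.50}
\]
where
\[
L = \sum_{i=1}^{d} b_i(x)\,\frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i=1}^{d} \sum_{j=1}^{d} [\sigma(x)\sigma^{*}(x)]_{i,j}\,\frac{\partial^2}{\partial x_i \partial x_j}.
\tag{2.2.51}
\]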
The operator L is often called the infinitesimal generator for the Markov process xt .
This name comes from the theory of semigroups and for suitable bounded functions
f ∈ C 2 (Rd ), the operator L satisfies
\[
\lim_{t \downarrow 0} \frac{E[f(x_t) \,|\, x_0 = x] - f(x)}{t} = L f(x),
\tag{2.2.52}
\]
for all x ∈ Rd . We will not go into this theory further. For more reading on semigroups and generators, including convergence, see [EK86]. For other derivations of
the Fokker-Planck and backward Kolmogorov equations see [Øks03, KS91, Ris89].
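As a short worked example of the one-variable Fokker-Planck equation (2.2.45), one can compute its stationary solution in a case where everything is explicit. The choice of coefficients below, b(x) = −x and a constant σ, is an assumption made purely for illustration and is not one of the models studied in this dissertation.

```latex
% Assumptions for illustration: b(x) = -x and \sigma(x) = \sigma constant.
\begin{align*}
  0 &= \frac{\partial}{\partial x}\big(x\,\rho(x)\big)
       + \frac{\sigma^2}{2}\,\frac{\partial^2 \rho(x)}{\partial x^2}
  && \text{(stationary form of (2.2.45))} \\
  0 &= x\,\rho(x) + \frac{\sigma^2}{2}\,\rho'(x)
  && \text{(integrate once; the flux vanishes at } \pm\infty\text{)} \\
  \rho(x) &= \frac{1}{\sqrt{\pi \sigma^2}}\,\exp\!\left(-\frac{x^2}{\sigma^2}\right)
  && \text{(Gaussian with mean } 0 \text{ and variance } \sigma^2/2\text{)}
\end{align*}
```

The stationary density is therefore Gaussian, with a variance set by the balance between the restoring drift and the noise intensity.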
2.2.4
An existence theorem using Lyapunov functions
For the case of the experiment [VHB+ 10], the friction coefficient blows up at the
boundary (x = 0), and thus the assumptions of the existence and uniqueness theorem
[Thm 2], notably equation (2.2.24), are not satisfied. Therefore, Theorem 2 can not be
used to prove existence and uniqueness of equation (1.1.4) for coefficients of interest.
To prove existence and uniqueness we use the theory of Lyapunov functions and
ergodic theory (see [RB06] for an introduction and [Kha12] for a full treatment).
Definition 1. A continuous function V : Rd → R is a Lyapunov function if V (x) ≥ 1
and
\[
\lim_{|x| \to \infty} V(x) = \infty.
\tag{2.2.53}
\]
Theorem 3. Consider equation (2.2.23) with b and σ locally Lipschitz, and
assume there exists a Lyapunov function V (x) such that,
\[
L V(x) \le c\,V(x),
\tag{2.2.54}
\]
for some constant c, where L is the infinitesimal generator for equation (2.2.23) and
V is in the domain of L. Then the solution of equation (2.2.23) is defined for all time
t > 0 and is unique.
Proof. For b, σ locally Lipschitz, there exists a local solution xt for small times.
Define the stopping time,
\[
\tau_n = \inf\{t > 0 : V(x_t) \ge n\}.
\tag{2.2.55}
\]
Consider the stopped process xt∧τn ; then by inequality (2.2.54),
\[
\left( \frac{\partial}{\partial t} + L \right) \left( V(x_{t \wedge \tau_n})\,e^{-ct} \right) \le 0.
\tag{2.2.56}
\]
Now we use Dynkin’s formula, which is stated without proof:
Lemma 3 (Dynkin’s Formula). Let f : Rd → R be a twice differentiable function
with compact support. Let τ be any stopping time with E[τ ] < ∞. Then
\[
E[f(x_\tau)] = f(x) + E\left[ \int_0^{\tau} \left( \frac{\partial}{\partial t} + L \right) f(x_s)\,ds \right],
\tag{2.2.57}
\]
where L is the infinitesimal generator of xt .
Using Dynkin’s formula,
\[
E\left[V(x_{t \wedge \tau_n})\,e^{-c\,(t \wedge \tau_n)}\right] - V(x_0) = E\left[ \int_0^{t \wedge \tau_n} \left( \frac{\partial}{\partial t} + L \right) \left( V(x_s)\,e^{-cs} \right) ds \right].
\tag{2.2.58}
\]
Inequality (2.2.56) with the above equation yields,
\[
E[V(x_{t \wedge \tau_n})] \le V(x_0)\,e^{ct}.
\tag{2.2.59}
\]
Also,
\[
E[V(x_{t \wedge \tau_n})] \ge E\left[V(x_{t \wedge \tau_n})\,\chi_{\{\tau_n < t\}}\right] = n\,P(\tau_n < t),
\tag{2.2.60}
\]
where χA is the indicator function on the set A. Therefore,
\[
P(\tau_n < t) \le \frac{E[V(x_{t \wedge \tau_n})]}{n} \le \frac{e^{ct}\,V(x_0)}{n}.
\tag{2.2.61}
\]
Taking n → ∞ yields
\[
P\left( \lim_{n \to \infty} \tau_n = \infty \right) = 1.
\tag{2.2.62}
\]
Therefore, the paths of xt almost surely do not reach infinity in finite time. The
coefficients b and σ will be locally Lipschitz on the set {V (xt ) < n0 } and thus the
solution is unique. This proves the theorem.
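To illustrate how Theorem 3 applies when Theorem 2 does not, consider the following worked example. It is an assumption made for illustration and does not come from the source: a one dimensional SDE with cubic drift, which violates the linear growth condition (2.2.24), together with the candidate function V (x) = 1 + x², which satisfies V ≥ 1 and grows without bound. The generator is written out explicitly for this equation.

```latex
% Illustrative example (assumed, not from the source):
% dx_t = -x_t^3 dt + dW_t, with candidate Lyapunov function V(x) = 1 + x^2.
\begin{align*}
  L &= -x^3\,\frac{\partial}{\partial x} + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}
  && \text{(generator of } dx_t = -x_t^3\,dt + dW_t\text{)} \\
  L V(x) &= -x^3 \cdot 2x + \frac{1}{2} \cdot 2 = 1 - 2x^4
  && \text{(since } V' = 2x,\ V'' = 2\text{)} \\
  L V(x) &\le 1 \le V(x)
  && \text{(so (2.2.54) holds with } c = 1\text{)}
\end{align*}
```

The drift −x³ is locally but not globally Lipschitz, so under these assumptions Theorem 3 yields a unique solution for all time even though the hypotheses of Theorem 2 fail.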
2.3
Mathematical Modeling using Stochastic Differential Equations
Stochastic differential equations have been used to model many different physical systems with much success. In this section we consider a few applications of the Langevin
equation to numerous other fields of study. We stress that the main theorem is a rigorous proof of a limit as the small parameter m → 0. We justified in the introduction
[Sec. 1.1] that when the white noise is replaced with an Ornstein-Uhlenbeck process
to model colored noise with correlation time τ , by setting τ = τ0 m the system of
SDEs could be reformulated to fit the original Langevin equation (1.1.4). Therefore,
the main theorem can be applied if any of the following applications have small parameter limits that can be reformulated, for example, through a change of variables,
to fit the framework of equation (1.1.4).
There are many references with introductions to applications of SDE including
[KP92, Øks03, Ris89, vK81, KS91]. We introduce a few interesting specific examples
and give citations for further study.
2.3.1
Chemical reactions
In the motivating example of Section 1.2, Kramers was interested in calculating chemical reaction rates by studying the small mass approximation of a Brownian particle
in a potential well. A molecule is composed of many particles held by chemical bonds.
The molecule collides with other molecules causing external forcing on the particles.
The chemical bonds are much stronger than these external forces. Thus the Langevin
equation is used to model the particles. Let xt be the position and vt the velocity of
a particle at time t. Then the motion is approximated by,
dxt = vt dt,
mdvt = (F (xt ) − γvt ) dt + σ dWt .
(2.3.1)
The force F is the force of the chemical bonds and σ << |F | is the strength of the
noise. The force F is considered to be a potential force arising from a potential function φ. This function, in its simplest form, has a single local minimum describing the
current state of the particle (see the discussion in Section 1.2 particularly figure 1.3).
There is much interest in studying more complicated potentials. For example
in [MDG99], a bistable potential (a potential function with two wells shown in figure 2.2) is studied with an emphasis on the transition rates (see figure 2.3). The study
of dynamical systems perturbed by small noise, including phenomena such as stochastic resonance, is a well developed field [FW12]. The noise effects can cause many changes to the deterministic
dynamics, e.g. shifts in the limit-cycle oscillations or occurrences of limit-cycles when
no deterministic cycles (i.e. when ε = 0) are present [RS94].
For a system with xt ∈ Rd , the potential function can have many local minima
like in figure 2.4. If the energy barriers are large compared to the random forcing,
the particle will spend long times in each minimum before transitioning to another.
The path that the particle takes is of particular interest in the work of [WVE10].
In many of these systems, the analysis is carried out on the Smoluchowski-Kramers
approximation,
\[
dx_t = \frac{F(x_t)}{\gamma}\,dt + \frac{\sigma}{\gamma}\,dW_t,
\tag{2.3.2}
\]
instead of the 2d dimensional system [Eq. (2.3.1)]. Thus, the dimension has
been reduced by a factor of 2. We stress that for more complicated state dependent
frictions (i.e. γ = γ(xt )), the limit must be modified to equation (1.1.11) before any
analysis can be performed.
Figure 2.2. An example of a bistable potential well studied in [MDG99].
Figure 2.3. A typical trajectory of a particle in a bistable potential well.
Figure 2.4. An example of a potential function in three dimensions with many
different local minima. The dynamics of the particle's trajectories from minimum to
minimum are studied in [WVE10].
2.3.2
Stability of orbiting satellites
Satellites with magnetic elements are influenced by fluctuations in the geomagnetic
field. A magnetic rod is used to stabilize the oscillations of the satellite. The yaw
oscillations ψ ∈ R are modeled by the equation,
B ψ̈ + kH 2 ψ̇ + HI sin ψ = 0,
(2.3.3)
where B is the moment of inertia with respect to the yaw axis, k is a constant
depending on the magnetic properties of the satellite, I is the moment of inertia with
respect to the pitch axis, and H is the intensity of the magnetic field. The intensity
of the magnetic field is modeled by H = H0 (1 + ηt ) where ηt is white noise accounting
for the small fluctuations of the field. Thus H^2 = H0^2 (1 + 2ηt + ηt^2 ), where the ηt^2 term
is negligible. The above equation is modeled using the Langevin equation with x = ψ
and v = ψ̇,
\[
\begin{aligned}
dx_t &= v_t\,dt, \\
dv_t &= \frac{H_0 I}{B}\,\sin x_t\,dt - \frac{k H_0}{B}\,v_t\,dt + \frac{H_0 (\sin x_t + 2 v_t)}{B}\,dW_t.
\end{aligned}
\tag{2.3.4}
\]
Note that the noise term depends on vt as well as xt , which differs from equation (1.1.4). See [Sag72] for a more detailed study.
2.3.3
Nucleation
In dilute solutions, certain polymers fold and crystallize into thin single crystals.
In [LM98], the Langevin equation is used for simulations to understand early-stage
polymer crystallization. Consider d-particles, where d is large, that form a chain (the
polymer). Let xt ∈ Rd and vt ∈ Rd be the positions and velocities of the d particles
of the polymer. The position of each particle is modeled by the Langevin equation,
d[xt ]i = [vt ]i dt
d[vt ]i = (−[∇U (xt )]i − γ[vt ]i ) dt + σd(Wt )i .
(2.3.5)
The friction coefficient γ and the intensity of the noise σ are constant. The dynamics
of the particles are largely influenced by the potential function U . The potential is
made up of four terms: bond stretching, bond angle, bond torsion, and a Lennard-Jones
potential. These potential functions cause the particles to form a chain and the
random motion perturbs the chain into a fold. In [LM98], simulations are run with d
between 100 and 1000 and chain folding is observed.
2.3.4
Climate models
SDEs are frequently used to simplify systems with many degrees of freedom to analyze
dynamics on longer time scales. Weather modeling consists of many variables with
different length and time scales. For example, in climate modeling, there are three
main time scales: fast time scales called weather modeled by stochastic processes,
intermediate time scales or dominant modes consisting of trends which we want to
predict (e.g. average temperature over decades), and slow time scales viewed as
external forcing (occurring over centuries and millennia).
In [BG02], a Langevin type equation is used to study transitions between equilibria
that model different long time climate trends (e.g. ice ages). Let xt ∈ R be the state
of the climate modeled by the SDE,
dxt = F (xt , t) dt + σG(xt , t) dWt .
(2.3.6)
This equation can be interpreted as the Smoluchowski-Kramers limit. Therefore, F
is considered to be derived from a potential U , in this case F (x, t) =
−∂U (x, t)/∂x. The potential U has two local minima, or potential wells, with some
barrier height that changes in time. The paper studies when the state leaves one
minimum and stays in the other for a long period of time for certain functions G.
An example of modeling the intermediate modes is in [MTVE99]. The paper
is concerned with modeling the coupled atmosphere and oceanic system, with an
emphasis on the mean and variance of storm tracks. This can be used to model such
phenomena as El Niño, North Atlantic oscillations, and mid-Atlantic storm tracks.
It was shown that 57 variables can be effectively modeled by three (xt , yt1 , yt2 ) (see
[MTVE99] and references therein). For example, consider xt a zonal averaging, and
yt1 and yt2 wave variables, satisfying the Langevin equation,
\[
\begin{aligned}
dx_t &= \frac{A_3}{\epsilon}\,y_t^1 y_t^2\,dt, \\
dy_t^1 &= \frac{a}{\epsilon}\,y_t^2\,dt + \frac{A_1}{\epsilon}\,x_t y_t^2\,dt - \frac{\gamma_1}{\epsilon^2}\,y_t^1\,dt + \frac{\sigma_1}{\epsilon}\,dW_t^1, \\
dy_t^2 &= \frac{b}{\epsilon}\,y_t^1\,dt + \frac{A_2}{\epsilon}\,x_t y_t^1\,dt - \frac{\gamma_2}{\epsilon^2}\,y_t^2\,dt + \frac{\sigma_2}{\epsilon}\,dW_t^1.
\end{aligned}
\]
(2.3.7)-(2.3.10)
Note that this equation is different from SDE (1.1.4), in that dxt = cyt1 yt2 dt, and, in
terms of a possible small mass approximation, the variables (yt1 , yt2 ) have two different
scales with order ε^−1 and ε^−2 .
2.4
Small parameter limits
The main theorem of this dissertation concerns a small parameter limit of a stochastic
differential equation. This leads to a family of SDEs: for each m ≥ 0, there is a
different solution $x_t^m$ parameterized by m. In this section we clarify the meaning of
$\lim_{m \to 0} x_t^m = x_t$. That is, what does it mean to say that the solution of SDE (1.1.4)
converges, as m → 0, to the solution of SDE (1.1.11)?
2.4.1
Modes of convergence
We start with the classic definitions for random variables. Let {X n }n≥0 be a sequence
of random variables, that is X n : Ω → Rd , for each n ≥ 0. We define the following
modes of convergence:
Definition 2. We say that X n converges to X almost surely or with probability one,
if
\[
P\left( \lim_{n \to \infty} X^n = X \right) = 1.
\tag{2.4.1}
\]
Definition 3. X n converges to X in probability if for all ε > 0,
\[
\lim_{n \to \infty} P\left( |X^n - X| > \epsilon \right) = 0.
\tag{2.4.2}
\]
Definition 4. X n converges to X in Lp if
\[
\lim_{n \to \infty} E\left[ |X^n - X|^p \right] = 0.
\tag{2.4.3}
\]
Definition 5. X n converges to X in distribution, law, or weakly, if for every bounded
uniformly continuous function f : Rd → R,
\[
\lim_{n \to \infty} E[f(X^n)] = E[f(X)].
\tag{2.4.4}
\]
Convergence in distribution can be characterized in many different ways, all of which
are equivalent by the Portmanteau theorem [Bil99]. For this dissertation we use the
above definition.
A chain of implications follows from these definitions. Almost sure convergence
implies convergence in probability and weak convergence. Lr implies Ls for r >
s ≥ 1. Lp implies convergence in probability, and convergence in probability implies
convergence in distribution. Convergence in distribution is the weakest form because
all other modes imply convergence in distribution.
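The implications above cannot in general be reversed. A standard counterexample, included here as an illustration and not taken from the source, shows a sequence that converges almost surely and in probability but not in L1 . The probability space is assumed to be Ω = [0, 1] with Lebesgue measure.

```latex
% Standard counterexample (assumed setting: \Omega = [0,1] with Lebesgue measure P).
\begin{align*}
  X^n(\omega) &= n\,\mathbf{1}_{[0,\,1/n]}(\omega), \qquad X \equiv 0, \\
  P\big(|X^n - X| > \epsilon\big) &\le \frac{1}{n} \;\longrightarrow\; 0
  && \text{(convergence in probability)} \\
  X^n(\omega) &= 0 \ \text{ for all } n > 1/\omega, \ \ \omega \in (0,1]
  && \text{(almost sure convergence)} \\
  E\big[|X^n - X|\big] &= n \cdot \frac{1}{n} = 1 \;\not\longrightarrow\; 0
  && \text{(no convergence in } L^1\text{)}
\end{align*}
```

The mass of X n concentrates on a shrinking set while its height grows, which is exactly the behavior that the expectation in Definition 4 detects and the probabilities in Definitions 2 and 3 do not.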
Mathematically, for each n ≥ 0, X n has a distribution function which defines a
probability measure Pn and X defines a probability measure P on Rd . It follows that
X n converges in distribution to X if and only if Pn ⇒ P , that is, for all bounded
continuous functions f , $\int f(x)\,dP_n$ converges to $\int f(x)\,dP$, where the integration is
over the probability space. In terms of the SDE (1.1.4), there is a separate probability
measure for each m which makes it difficult to compare two trajectories $x_t^m$ and $x_t^{\tilde{m}}$.
For example, if we know that xm converges to x in distribution, then the probability
measures converge. However, if we want to analyze properties of two trajectories xm
and xm̃ , e.g. Lyapunov exponents, there may be no common probability measure
without taking the limit as m, m̃ → 0. For any other mode of convergence, there is
only one probability measure, and thus one space (Ω, F, P ) on which all trajectories
are defined, and $x_t^m$ and $x_t^{\tilde{m}}$ can be compared on (Ω, F, P ).
Now consider a collection of random variables indexed by a continuous time parameter t ∈ [0, T ]. These are stochastic processes Xtn : Ω×[0, T ] → Rd . If we consider
the sequence of processes at a fixed time t̂, {Xt̂n }n≥0 , then the modes of convergence
are no different than those stated above, because Xt̂n are random variables. However,
if a process Xtn converges to Xt for arbitrary t, X n does not necessarily converge
to X as a process. That is, X n does not necessarily converge to X in the space
of continuous functions CRd [0, T ]. We alter the above definitions for convergence of
processes:
Definition 6. We say X n converges to X almost surely or with probability one with
respect to CRd [0, T ], if for all ε > 0,
\[
\lim_{n \to \infty} P\left( \sup_{N \ge n}\ \sup_{0 \le t \le T} |X_t^N - X_t| \ge \epsilon \right) = 0.
\tag{2.4.5}
\]
Definition 7. X n converges to X in probability with respect to CRd [0, T ] if for all
ε > 0,
\[
\lim_{n \to \infty} P\left( \sup_{0 \le t \le T} |X_t^n - X_t| > \epsilon \right) = 0.
\tag{2.4.6}
\]
Definition 8. X n converges to X in Lp with respect to CRd [0, T ] if
\[
\lim_{n \to \infty} E\left[ \sup_{0 \le t \le T} |X_t^n - X_t|^p \right] = 0.
\tag{2.4.7}
\]
Definition 9. X n converges to X in distribution, law, or weakly with respect to
CRd [0, T ] if for every bounded uniformly continuous function f : CRd [0, T ] → R (i.e.
for all ε > 0 there exists δ > 0 such that, for x, y ∈ CRd [0, T ], sup0≤t≤T |xt − yt | < δ
implies |f (x) − f (y)| < ε),
\[
\lim_{n \to \infty} E[f(X^n)] = E[f(X)].
\tag{2.4.8}
\]
The same chain of implications holds for convergence of processes. We stress that
the convergence of the main theorem is with respect to continuous processes. The
theory of convergence of processes with respect to the Skorokhod topology (the space
of processes that are right continuous and have left hand limits at every point) is well
developed (see [EK86, Bil99] and references therein).
2.4.2
Approximation of the Wiener process
We start the discussion of small parameter limits by considering an important example
of replacing white noise with a continuous process.
In the derivation of the motion of a Brownian particle in water [Sec. 2.1] the
random forcing term was described as,
Frand (t) = σ(xt )ηt ,
(2.4.9)
where ηt was Gaussian white noise and satisfied some simplified properties. We
interpret the random force as the integral equation,
\[
\int_0^t F_{rand}(s)\,ds = \int_0^t \sigma(x_s)\,dW_s,
\tag{2.4.10}
\]
where Wt is a standard Wiener process on R. The differential equation analog is,
\[
F_{rand}(t) = \sigma(x_t)\,\frac{dW_t}{dt}.
\tag{2.4.11}
\]
However, we know that dWt /dt exists nowhere with probability one. A reasonable
approximation is to define ηt = ηtτ , dependent on some small parameter τ , as a
continuous process such that,
\[
\int_0^t \eta_s^{\tau}\,ds \to W_t,
\tag{2.4.12}
\]
as τ → 0. The random force [Eq. (2.4.9)] is then piecewise continuous, and the
Langevin equation is a well defined random differential equation,
mẍt = F (xt ) − γ(xt )ẋt + σ(xt )ηtτ .
(2.4.13)
One question we could ask is: does the integral $\int_0^t \sigma(x_s)\,\eta_s^{\tau}\,ds$ converge to $\int_0^t \sigma(x_s)\,dW_s$
as τ → 0? Such stochastic integrals were studied in [WZ65], where it was shown that
as the correlation time of the noise goes to zero, the stochastic integral converges to
the Stratonovich integral (α = 1/2 defined in equation (2.2.6)) with respect to the
Wiener process.
Theorem 4 ([WZ65]). Let Wtn ∈ CR [0, T ] be a process with piecewise continuous
derivatives with probability one, such that W n → W , where W is a one dimensional
Wiener process, almost surely with respect to CR [0, T ]. Let xnt be the solution to the
random ODE,
\[
dx_t^n = b(x_t^n)\,dt + \sigma(x_t^n)\,dW_t^n, \qquad x_0^n = x,
\tag{2.4.14}
\]
where b, σ ∈ CR1 [0, T ], are globally Lipschitz, and σ(x) > 0 or σ(x) < 0 for all x ∈ R.
Define the integral,
\[
\int_0^t \sigma(x_s^n)\,dW_s^n = \lim_{N \to \infty} \sum_{i=1}^{N} \sigma(x_{s_i}^n)\,\big(W_{s_{i+1}}^n - W_{s_i}^n\big),
\tag{2.4.15}
\]
i.e. the left hand sum (also referred to as the Itô integral). Let xt be the solution to
\[
dx_t = b(x_t)\,dt + \frac{1}{2}\,\sigma(x_t)\,\sigma'(x_t)\,dt + \sigma(x_t)\,dW_t, \qquad x_0 = x.
\tag{2.4.16}
\]
Then xn → x almost surely with respect to CR [0, T ].
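The correction term in equation (2.4.16) can be seen explicitly in a case where the random ODE is solvable by separation of variables. The example below is an illustration under assumptions not made in the source: b ≡ 0, σ(x) = (1 + x²)^{1/2} (which is positive, continuously differentiable, and globally Lipschitz), and W0n = 0.

```latex
% Worked example (assumptions: b = 0, \sigma(x) = \sqrt{1 + x^2}, W^n_0 = 0).
\begin{align*}
  \frac{dx_t^n}{\sqrt{1 + (x_t^n)^2}} &= dW_t^n
  && \text{(random ODE (2.4.14), separated)} \\
  x_t^n &= \sinh\!\big(\operatorname{arcsinh}(x) + W_t^n\big)
  && \text{(ordinary calculus, path by path)} \\
  x_t &= \sinh\!\big(\operatorname{arcsinh}(x) + W_t\big)
  && \text{(limit as } W^n \to W\text{)} \\
  dx_t &= \tfrac{1}{2}\,x_t\,dt + \sqrt{1 + x_t^2}\,dW_t
  && \text{(It\^o's formula; note } \tfrac{1}{2}\sigma\sigma' = \tfrac{1}{2}x\text{)}
\end{align*}
```

The limiting process therefore satisfies equation (2.4.16) with the additional drift σσ′/2, and not the Itô equation without that term.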
This result is crucial with respect to two aspects of this dissertation. First, it
answers the question that we posed: does $\int_0^t \sigma_s\,dW_s^n \to \int_0^t \sigma_s\,dW_s$ if W n → W in some
sense? The answer, from the theorem of [WZ65], is no, because the integral with
respect to the approximation W n is a left hand sum, but the integral converges to
the Stratonovich construction which is a midpoint sum. Given processes Xtn and Ytn
that converge to Xt and Yt , what are the conditions needed such that
$\int_0^t X_s^n\,dY_s^n \to \int_0^t X_s\,dY_s$? This question is answered in [KP91] (see also references therein) and
the results generalize Theorem 4. This theory is essential to the proof of the main
theorem in Chapter 3.
The second way Theorem 4 is used is in the Smoluchowski-Kramers approximation
with colored noise. The fact that integrating with respect to a piecewise
differentiable process yields the Stratonovich integral in the limit will be used
many times in Section 2.5.2.
2.5
Previous results for the Smoluchowski-Kramers approximation
The main limit considered in this dissertation is the Smoluchowski-Kramers approximation. There are many equivalent limits, such as the over-damped, high friction,
and large viscosity limits. Here, we consider the small mass limit, which is equivalent
to the former limits through a transformation of variables.
The limit as mass tends to zero has been studied by many authors beginning with
Smoluchowski [Smo16] and Kramers [Kra40] as described in Section 1.2. For clarity, we list the SDE of interest again in this section,
\[
\begin{cases}
dx^m_t = v^m_t\,dt, & x^m_0 = x,\\[4pt]
dv^m_t = \left[\dfrac{F(x^m_t)}{m} - \dfrac{\gamma(x^m_t)}{m}\,v^m_t\right]dt + \dfrac{\sigma(x^m_t)}{m}\,dW_t, & v^m_0 = v.
\end{cases}
\tag{2.5.1}
\]
Nelson proved that in the case where γ and σ are constant, the solution to equation (2.5.1) converges to the solution of equation (1.1.11) almost surely [Nel67].
Theorem 5 ([Nel67]). Let $F : \mathbb{R}^d \to \mathbb{R}^d$ satisfy a global Lipschitz condition and let $W$ be the $\mathbb{R}^d$ Wiener process. Let $(x^m_t, v^m_t)$ be the solution of SDE (2.5.1) with $\gamma = 1$ and $\sigma = 1$. Let $x_t$ be the solution to
\[
dx_t = F(x_t)\,dt + dW_t, \qquad x_0 = x.
\tag{2.5.2}
\]
Then for all $v$,
\[
\lim_{m\to 0} x^m_t = x_t,
\tag{2.5.3}
\]
with probability 1 (almost surely) uniformly for $t$ in compact subsets of $[0,\infty)$.
The convergence holds uniformly for all $t$ in compact intervals, and thus the convergence is almost sure with respect to $C_{\mathbb{R}^d}[0,T]$.
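The statement of Theorem 5 can be explored numerically. The following is a minimal Python sketch, not taken from [Nel67], that drives the small mass system and the limiting equation (2.5.2) with the same Brownian increments in one dimension and records the uniform distance between the two position paths. The Euler discretization, the linear force $F(x) = -x$, the step size, the seed, and all function names are assumptions chosen for illustration.

```python
import random


def simulate_small_mass(m, increments, dt, x0=0.0, v0=0.0, force=lambda x: -x):
    """Euler scheme for dx = v dt, m dv = (F(x) - v) dt + dW (gamma = sigma = 1)."""
    x, v = x0, v0
    path = [x]
    for dw in increments:
        x_next = x + v * dt
        v = v + (force(x) - v) * dt / m + dw / m
        x = x_next
        path.append(x)
    return path


def simulate_limit(increments, dt, x0=0.0, force=lambda x: -x):
    """Euler-Maruyama scheme for the limiting SDE dx = F(x) dt + dW."""
    x = x0
    path = [x]
    for dw in increments:
        x = x + force(x) * dt + dw
        path.append(x)
    return path


def sup_distance(path_a, path_b):
    """Uniform (sup) distance between two discretized paths."""
    return max(abs(a - b) for a, b in zip(path_a, path_b))


rng = random.Random(1)
dt = 1e-4          # must be small compared with the smallest mass used
n_steps = 10000    # T = 1
increments = [rng.gauss(0.0, dt ** 0.5) for _ in range(n_steps)]

limit_path = simulate_limit(increments, dt)
err_large_mass = sup_distance(simulate_small_mass(0.1, increments, dt), limit_path)
err_small_mass = sup_distance(simulate_small_mass(0.001, increments, dt), limit_path)
```

In this discretization the difference between the two paths is controlled by $m v^m_t$, so the distance for the smaller mass is expected to be markedly smaller than for the larger one.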
The case including an external force was also treated in [Sch80] and references
therein, particularly [LS78], but by entirely different methods. The method used was
a homogenization technique that is outlined later [Sec. 2.5.1].
The problem of identifying the limit for position-dependent noise and friction was
first studied in [Han82]. The argument used was as follows: consider equation (2.5.1) in one dimension, with scalar $x^m_t$ and $v^m_t$, and $F(x) = -U'(x)$ for a potential function $U$. Consider the deterministic flow,
\[
\frac{dx^m_t}{dt} = v^m_t, \qquad
\frac{dv^m_t}{dt} = -\frac{1}{m}\frac{\partial U(x^m_t)}{\partial x^m_t} - \frac{\gamma(x^m_t)}{m}\,v^m_t.
\tag{2.5.4}
\]
For a system with noise, the diffusion coefficient is derived from the Einstein relation $D(x) = 2k_BT/\gamma(x)$ [TKS92]. Through the detailed balance relation (see Section 6.4 of [Ris89]), the deterministic flow as $m \to 0$ is
\[
\frac{dx_t}{dt} = -\frac{1}{\gamma(x_t)}\left[\frac{\partial U(x_t)}{\partial x_t} + \frac{k_BT}{\gamma(x_t)}\frac{\partial \gamma(x_t)}{\partial x_t}\right].
\tag{2.5.5}
\]
Taking $\gamma$ scalar and $\sigma = \sqrt{2k_BT\gamma}$ in SDE (1.1.4), the limit of the main theorem [Eq. (1.1.11)] in one dimension is
\[
dx_t = \left[\frac{F(x_t)}{\gamma(x_t)} - \frac{k_BT\,\gamma'(x_t)}{\gamma(x_t)^2}\right]dt + \sqrt{\frac{2k_BT}{\gamma(x_t)}}\,dW_t.
\tag{2.5.6}
\]
The drift is identical to the one derived in equation (2.5.5).
The one dimensional case where the fluctuation dissipation relation [Eq. (1.1.6)] is
not necessarily satisfied is studied rigorously in [SSMD82] and the multidimensional
case is also discussed there, but without complete proof. We first go through the
heuristic argument presented in [SSMD82].
Heuristic argument [SSMD82]. For $x^m_t, v^m_t \in \mathbb{R}$ the solution to SDE (2.5.1), note that $dx^m_t = v^m_t\,dt$, and
\begin{align}
m\,dv^m_t &= F(x^m_t)\,dt + \sigma(x^m_t)\,dW_t - \gamma(x^m_t)\,v^m_t\,dt \tag{2.5.7}\\
&= F(x^m_t)\,dt + \sigma(x^m_t)\,dW_t - \gamma(x^m_t)\,dx^m_t. \tag{2.5.8}
\end{align}
Without going into detail on the exact convergence, we expect that as $m \to 0$,
\[
\int_0^t F(x^m_s)\,ds \to \int_0^t F(x_s)\,ds,
\tag{2.5.9}
\]
which is clear for $F$ continuous, and
\[
\int_0^t \sigma(x^m_s)\,dW_s \to \int_0^t \sigma(x_s)\,dW_s,
\tag{2.5.10}
\]
because for all $m > 0$ the integral on the left is Itô, and so the integral on the right will be Itô. For the last integral, the authors of [SSMD82] argue that
\[
\int_0^t \gamma(x^m_s)\,dx^m_s \to \int_0^t \gamma(x_s)\circ dx_s,
\tag{2.5.11}
\]
the Stratonovich integral, because by [WZ65], the limit of Riemann integrals converges to the Stratonovich stochastic integral. With the fact $m v^m_t \to 0$ (see Lemma 10 of the main proof), this gives the SDE, in Itô form,
\[
0 = F(x_s)\,ds - \gamma(x_s)\,dx_s - \frac{1}{2}\gamma'(x_s)(dx_s)^2 + \sigma(x_s)\,dW_s.
\tag{2.5.12}
\]
Solving for $dx_s$,
\[
dx_s = \frac{F(x_s)}{\gamma(x_s)}\,ds - \frac{1}{2}\frac{\gamma'(x_s)}{\gamma(x_s)}(dx_s)^2 + \frac{\sigma(x_s)}{\gamma(x_s)}\,dW_s.
\tag{2.5.13}
\]
Now we see that
\[
(dx_s)^2 = \left(\frac{\sigma(x_s)}{\gamma(x_s)}\right)^2 ds,
\tag{2.5.14}
\]
which implies
\[
dx_s = \frac{F(x_s)}{\gamma(x_s)}\,ds - \frac{1}{2}\frac{\gamma'(x_s)\sigma(x_s)^2}{\gamma(x_s)^3}\,ds + \frac{\sigma(x_s)}{\gamma(x_s)}\,dW_s.
\tag{2.5.15}
\]
The task of proving the above argument in multi-dimensions is formidable. However, the techniques of this dissertation could lead to a possible proof of the convergence of the above integrals. Also, the heuristics could result in a limiting equation
without having to solve the Lyapunov equation as in the main proof of this dissertation.
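To make the one dimensional limit concrete, the following minimal Python sketch evaluates the drift and diffusion coefficients of equation (2.5.15) at a point and checks that, under the choice $\sigma = \sqrt{2k_BT\gamma}$, the drift reduces to the one in equation (2.5.6). The function name and the particular $F$, $\gamma$, and value of $k_BT$ are hypothetical and serve only as an example.

```python
import math


def sk_limit_coefficients(x, F, gamma, dgamma, sigma):
    """Drift and diffusion of the one dimensional limit, following Eq. (2.5.15)."""
    g = gamma(x)
    s = sigma(x)
    drift = F(x) / g - dgamma(x) * s ** 2 / (2.0 * g ** 3)
    diffusion = s / g
    return drift, diffusion


kT = 0.5                                   # assumed value of k_B T


def gamma(x):
    return 1.0 + x * x                     # example friction, bounded below by 1


def dgamma(x):
    return 2.0 * x                         # derivative of the example friction


def force(x):
    return -x                              # example force F = -U' with U = x^2 / 2


def sigma(x):
    return math.sqrt(2.0 * kT * gamma(x))  # fluctuation-dissipation choice


x_point = 0.7
drift, diffusion = sk_limit_coefficients(x_point, force, gamma, dgamma, sigma)

# Drift and diffusion as written in Eq. (2.5.6)
drift_fd = force(x_point) / gamma(x_point) - kT * dgamma(x_point) / gamma(x_point) ** 2
diffusion_fd = math.sqrt(2.0 * kT / gamma(x_point))
```

The agreement of `drift` with `drift_fd` is just the identity $\sigma^2\gamma'/(2\gamma^3) = k_BT\,\gamma'/\gamma^2$ when $\sigma^2 = 2k_BT\gamma$.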
We state and give the main ideas of the proof of the rigorous theorem due to its
simplicity and potential extensions to future research.
Theorem 6 ([SSMD82]). Let $(x^m_t, v^m_t)^* \in \mathbb{R}^2$ be the solution to SDE (2.5.1) with $\gamma$ a scalar function such that $\gamma(x) \ge a > 0$ and $\sigma$ a scalar function for all $x \in \mathbb{R}$. Define
\[
\Gamma(x) = \int_0^x \gamma(z)\,dz,
\tag{2.5.16}
\]
$\bar F(x) = F(\Gamma^{-1}(x))$ and $\bar\sigma(x) = \sigma(\Gamma^{-1}(x))$. Assume that $\Gamma^{-1}$, $\bar F$, and $\bar\sigma$ are globally Lipschitz with constants $A$, $B$ and $C$ respectively. Let $x_t$ be the solution to
\[
dx_t = \left[\frac{F(x_t)}{\gamma(x_t)} - \frac{\gamma'(x_t)}{2\gamma(x_t)^3}\,\sigma(x_t)^2\right]dt + \frac{\sigma(x_t)}{\gamma(x_t)}\,dW_t.
\tag{2.5.17}
\]
Then for $t \in [0,\infty)$,
\[
\lim_{m\to 0} E\left[(x^m_t - x_t)^2\right] = 0.
\tag{2.5.18}
\]
Note that the convergence is in $L^2$ for $x^m_t$ at a single fixed time $t$, that is, it is pointwise in time.
Proof. Let $y^m_t = \Gamma(x^m_t)$. Then
\[
dy^m_t = \gamma(x^m_t)\,v^m_t\,dt,
\tag{2.5.19}
\]
and using equation (2.5.1),
\[
dy^m_t = -m\,dv^m_t + \bar F(y^m_t)\,dt + \bar\sigma(y^m_t)\,dW_t.
\tag{2.5.20}
\]
Defining $y_t = \Gamma(x_t)$, then
\[
dy_t = \bar F(y_t)\,dt + \bar\sigma(y_t)\,dW_t.
\tag{2.5.21}
\]
Because $\Gamma^{-1}$ is Lipschitz with constant $A$, for fixed time $t \in [0,\infty)$,
\[
\lim_{m\to 0} E\left[|x^m_t - x_t|^2\right] = \lim_{m\to 0} E\left[|\Gamma^{-1}(y^m_t) - \Gamma^{-1}(y_t)|^2\right] \le A^2 \lim_{m\to 0} E\left[|y^m_t - y_t|^2\right].
\tag{2.5.22}
\]
It suffices to show that $E[|y^m_t - y_t|^2] \to 0$ as $m \to 0$. The change of variables from $x_t$ to $y_t$ eliminates the spurious drift term so that we integrate a constant with respect to $v^m_t$ in equation (2.5.20). This differs from the main proof, where we integrate $\gamma^{-1}(x_t)$ with respect to $v^m_t$ (see Chapter 3). For the rest of the proof, the goal is to use equations (2.5.20) and (2.5.21), the Lipschitz condition, and Gronwall's lemma to derive an inequality similar to the proof in Theorem 2 for uniqueness. The only thing left to show is that $m v^m_t \to 0$ in $L^2$ for fixed $t \in [0,\infty)$. This is done in the proof of the main theorem (see Lemma 10).
2.5.1 Homogenization calculation
Another method used to compute the Smoluchowski-Kramers approximation is homogenization of the backward Kolmogorov equation. Using homogenization techniques [Pap77, Sch80, PS08] one can systematically compute the limiting backward
Kolmogorov equation as mass is taken to zero. This method, similar to [Sch80], is
a powerful method from asymptotic PDE theory and takes advantage of the Backward Kolmogorov equation [Eq. (2.2.49)]. We derive the limit in one dimension for
SDE (2.5.1).
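To simplify further analysis, we substitute $u^m_t = \sqrt{m}\,v^m_t$ in Eq. (2.5.1), obtaining the following two-dimensional SDE:
\[
\begin{cases}
dx^m_t = \dfrac{1}{\sqrt{m}}\,u^m_t\,dt,\\[6pt]
du^m_t = \left[\dfrac{F(x^m_t)}{\sqrt{m}} - \dfrac{\gamma(x^m_t)}{m}\,u^m_t\right]dt + \dfrac{\sigma(x^m_t)}{\sqrt{m}}\,dW_t,
\end{cases}
\tag{2.5.23}
\]
with initial conditions $x^m_0 = x$ and $u^m_0 = \sqrt{m}\,v$.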
To determine the limiting equation we use a multiscale analysis of the backward
Kolmogorov equation of the SDE (2.5.1). Let $g(x',u',t'\,|\,x,u,t)$ be the probability density of the distribution of the position and (rescaled) velocity $(x',u')$ of the particle at time $t'$ given their values $(x,u)$ at a time $t < t'$. Then the backward Kolmogorov equation for the SDE (2.5.1) is
\[
\frac{\partial g(x',u',t'\,|\,x,u,t)}{\partial t}
= \frac{u}{\sqrt{m}}\frac{\partial g(x',u',t'\,|\,x,u,t)}{\partial x}
+ \frac{\sigma(x)^2}{2m}\frac{\partial^2 g(x',u',t'\,|\,x,u,t)}{\partial u^2}
+ \frac{F(x)}{\sqrt{m}}\frac{\partial g(x',u',t'\,|\,x,u,t)}{\partial u}
- \frac{\gamma(x)u}{m}\frac{\partial g(x',u',t'\,|\,x,u,t)}{\partial u}.
\tag{2.5.24}
\]
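Because the equation involves derivatives with respect to the $x$, $u$, and $t$ variables we write $g(x',u',t'\,|\,x,u,t) = g(x,u,t)$ to shorten notation. Eq. (2.5.24) can be rewritten as
\[
\frac{\partial g}{\partial t} = \left(\frac{1}{m}L_1 + \frac{1}{\sqrt{m}}L_2\right)g,
\]
with
\[
L_1 = \frac{\sigma(x)^2}{2}\frac{\partial^2}{\partial u^2} - \gamma(x)\,u\,\frac{\partial}{\partial u},
\qquad
L_2 = u\,\frac{\partial}{\partial x} + F(x)\,\frac{\partial}{\partial u}.
\]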
Notice that the operator $L_1$ is the generator for an Ornstein-Uhlenbeck (OU) process with coefficients dependent on $x$. We denote this process as $\tilde u$, and write the stochastic differential equation
\[
d\tilde u_t = -\gamma(x)\,\tilde u_t\,dt + \sigma(x)\,dW_t.
\tag{2.5.25}
\]
The invariant density for $\tilde u_t$ is
\[
g^*(\tilde u) = C(x)\exp\left(-\frac{\gamma(x)\,\tilde u^2}{\sigma(x)^2}\right),
\tag{2.5.26}
\]
where $C(x)$ is a normalizing constant.
We postulate that the solution of the Kolmogorov equation has an asymptotic expansion $g = g_0 + \sqrt{m}\,g_1 + m\,g_2 + \dots$ [Pap77, PS08, Sch80]. We match powers of $m$ to obtain the following equations,
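\begin{align}
L_1 g_0 &= 0, \tag{2.5.27}\\
L_1 g_1 &= -L_2 g_0, \tag{2.5.28}\\
\frac{\partial g_0}{\partial t} &= L_1 g_2 + L_2 g_1, \tag{2.5.29}
\end{align}
where $L_1$, $L_2$ are differential operators. Solving Eq. (2.5.27) results in
\[
g_0(x,u,t) = C_1(x,t)\int_{-\infty}^{u} e^{\gamma(x)\hat u^2/\sigma(x)^2}\,d\hat u + C_2(x,t).
\]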
Because the first term is not integrable in $u$, $C_1$ must be zero and thus $g_0 = g_0(x,t)$, independent of $u$. By the Fredholm alternative, the solvability condition for Eq. (2.5.28) is given as
\[
\int_{-\infty}^{\infty} g^* L_2 g_0\,du = 0,
\]
for all $g^*$ such that $L_1^* g^* = 0$ [CH53, Lax02]. Here $L_1^*$ is the adjoint of $L_1$:
\[
L_1^* = \frac{\sigma(x)^2}{2}\frac{\partial^2}{\partial u^2} + \gamma(x)\frac{\partial}{\partial u}(u\,\cdot),
\]
where the $(\cdot)$ denotes where the differential operator is applied to a function. The relevant (integrable) solution is a mean-zero Gaussian given in Eq. (2.5.26), which satisfies $g^*(u) = g^*(-u)$, thus
\[
\int_{-\infty}^{\infty} g^* L_2 g_0\,du = \frac{\partial g_0}{\partial x}\int_{-\infty}^{\infty} u\,g^*(u)\,du = 0.
\]
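Next the solvability condition for Eq. (2.5.29) is
\[
\int_{-\infty}^{\infty}\left(L_2 L_1^{-1} L_2 g_0 + \frac{\partial g_0}{\partial t}\right) g^*\,du = 0,
\tag{2.5.30}
\]
which follows from Eq. (2.5.29) after substituting $g_1 = -L_1^{-1}L_2 g_0$ from Eq. (2.5.28). First we set
\[
V = L_1^{-1} L_2 g_0,
\]
which by the previous solvability condition is well defined. Thus $L_1 V = L_2 g_0$, or
\[
\frac{\sigma(x)^2}{2}\frac{\partial^2 V}{\partial u^2} - \gamma(x)\,u\,\frac{\partial V}{\partial u} = u\,\frac{\partial g_0}{\partial x}.
\tag{2.5.31}
\]
Notice that the function
\[
V = -\frac{u}{\gamma(x)}\frac{\partial g_0}{\partial x}
\]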
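is a particular solution of Eq. (2.5.31). From Eq. (2.5.30) we must have
\[
\int_{-\infty}^{\infty}\left[-\frac{u^2}{\gamma(x)}\frac{\partial^2 g_0}{\partial x^2} + \frac{u^2}{\gamma(x)^2}\frac{d\gamma(x)}{dx}\frac{\partial g_0}{\partial x} - \frac{F(x)}{\gamma(x)}\frac{\partial g_0}{\partial x}\right] g^*\,du = -\frac{\partial g_0}{\partial t},
\tag{2.5.32}
\]
for any $g^*$ satisfying $L_1^* g^* = 0$, in particular for the invariant density of the OU process $\tilde u_t$. After Gaussian integration over $u$, Eq. (2.5.32) becomes
\[
\frac{\sigma(x)^2}{2\gamma(x)^2}\frac{\partial^2 g_0}{\partial x^2} + \left[\frac{F(x)}{\gamma(x)} - \frac{\sigma(x)^2}{2\gamma(x)^3}\frac{d\gamma(x)}{dx}\right]\frac{\partial g_0}{\partial x} = \frac{\partial g_0}{\partial t}.
\tag{2.5.33}
\]
This gives the Smoluchowski-Kramers approximation to the backward Kolmogorov equation. The corresponding (Itô) SDE is
\[
dx_t = \left[\frac{F(x_t)}{\gamma(x_t)} - \frac{\sigma(x_t)^2}{2\gamma(x_t)^3}\frac{d\gamma(x_t)}{dx}\right]dt + \frac{\sigma(x_t)}{\gamma(x_t)}\,d\tilde W_t.
\tag{2.5.34}
\]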
Because we derived this equation from the convergence of the infinitesimal operators,
rather than directly studying the limit of the SDE (2.5.1), the convergence, under
some assumptions, is in law. In [PV03] convergence in distribution is proven rigorously for equations of the same type as equation (2.5.1), under somewhat stronger
assumptions than those made here.
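The Gaussian integration step that turns Eq. (2.5.32) into Eq. (2.5.33) only needs the second moment of the invariant density (2.5.26), which is $\sigma(x)^2/(2\gamma(x))$ for fixed $x$. The following minimal Python sketch checks this value by a midpoint quadrature. The function name, the truncation width, and the sample values of $\gamma$ and $\sigma$ are assumptions made for illustration.

```python
import math


def second_moment(gamma, sigma, half_width=12.0, n_cells=20000):
    """Midpoint-rule estimate of the second moment of g*(u) ~ exp(-gamma u^2 / sigma^2).

    The normalizing constant C(x) cancels because the result is a ratio of sums.
    """
    std = sigma / math.sqrt(2.0 * gamma)   # standard deviation of the Gaussian
    limit = half_width * std               # truncate the real line at +/- limit
    h = 2.0 * limit / n_cells
    numerator = 0.0
    denominator = 0.0
    for i in range(n_cells):
        u = -limit + (i + 0.5) * h
        weight = math.exp(-gamma * u * u / sigma ** 2)
        numerator += u * u * weight
        denominator += weight
    return numerator / denominator


gamma_value = 2.0   # assumed value of gamma(x) at a fixed x
sigma_value = 1.5   # assumed value of sigma(x) at the same x
estimate = second_moment(gamma_value, sigma_value)
exact = sigma_value ** 2 / (2.0 * gamma_value)
```

Substituting this moment for $u^2$ in Eq. (2.5.32) and multiplying by $-1$ reproduces the coefficients $\sigma^2/(2\gamma^2)$ and $\sigma^2\gamma'/(2\gamma^3)$ of Eq. (2.5.33).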
The same result was also obtained in [CBEA12], with the fluctuation-dissipation
relation, in a manner very similar to the above discussion. The limit is also analyzed
with respect to entropy production.
2.5.2 Smoluchowski-Kramers approximation with colored noise
In physical systems of Brownian particles, the noise driving Brownian motion is not
white, but colored, due to hydrodynamic memory [FGB+ 11]. In this section we
consider a Langevin equation driven by colored noise:
\[
\begin{cases}
dx^{m,\tau}_t = v^{m,\tau}_t\,dt,\\[4pt]
dv^{m,\tau}_t = \left[\dfrac{F(x^{m,\tau}_t)}{m} - \dfrac{\gamma(x^{m,\tau}_t)}{m}\,v^{m,\tau}_t + \dfrac{\sigma(x^{m,\tau}_t)}{m}\,\eta^\tau_t\right]dt,
\end{cases}
\tag{2.5.35}
\]
where $\eta^\tau_t$ is a $k$-dimensional random process whose stationary process is zero-mean with short correlation time $\tau$. We also assume that $\int_0^t \eta^\tau_s\,ds \to W_t$, in some sense, the $k$ dimensional Wiener process, as $\tau \to 0$. With our previous discussions, we already have a notion of what the limit should be as we take $m \to 0$ and $\tau \to 0$ separately.
The heuristic argument for the convergence of $x^{m,\tau}_t \in \mathbb{R}$ (in one dimension) as both $m, \tau \to 0$ is as follows: For $\tau \to 0$ first, the above integrals are well defined. The integral $\int_0^t \frac{\sigma(x^{m,\tau}_s)}{m}\,\eta^\tau_s\,ds$ will converge to $\int_0^t \frac{\sigma(x^{m,\tau}_s)}{m}\circ dW_s$ (the Stratonovich integral) as $\tau \to 0$ by Theorem 4. Because $x^{m,\tau}_t$ is differentiable, then by Lemma 1, $\int_0^t \frac{\sigma(x^{m,\tau}_s)}{m}\circ dW_s = \int_0^t \frac{\sigma(x^{m,\tau}_s)}{m}\,dW_s$, the Itô integral. Thus, as $m \to 0$, the limit will be equation (2.5.34).
If $m$ is taken to zero first, then we should obtain equation (2.5.34) with $dW_t$ replaced by $\eta^\tau_t\,dt$. Then as $\tau \to 0$, the stochastic integral $\int_0^t \frac{\sigma(x^\tau_s)}{\gamma(x^\tau_s)}\,\eta^\tau_s\,ds$ will converge to the Stratonovich integral. This is the result found in [Fre04] for $\gamma$ constant.
Theorem 7 ([Fre04]). Let $x^{m,\tau} \in \mathbb{R}^d$ be the solution to equation (2.5.35) with $\gamma = 1$, and $F$, $\sigma$ globally Lipschitz, with the initial conditions $(x, v)$ for all $\tau$ and $m$, and $\eta^\tau_t \in \mathbb{R}^k$ colored noise such that
\[
\sup_{0\le t\le T}\left|\int_0^t \eta^\tau_s\,ds - W_t\right| \to 0,
\tag{2.5.36}
\]
as $\tau \to 0$ with probability one. Let $x_t \in \mathbb{R}^d \times [0,T]$ be the solution to
\[
dx_t = F(x_t)\,dt + \sigma(x_t)\circ_\alpha dW_t,
\tag{2.5.37}
\]
where the integral is constructed with respect to $\alpha$ [Eq. (2.2.6)]. Then, the process $x^{m,\tau}$ converges in probability with respect to $C_{\mathbb{R}^d}[0,T]$ (i.e. the space of continuous functions from $[0,T]$ to $\mathbb{R}^d$ with the uniform metric) to the solution of equation (2.5.37) with Itô's ($\alpha = 0$) stochastic integral as $m, \tau \to 0$ such that $\tau < f(m)$ for a function $f$ defined in the proof. If $m, \tau \to 0$ such that $m\exp(\tau^{-1}) \to 0$ then $x^{m,\tau}$ converges to the solution of equation (2.5.37) with the Stratonovich stochastic integral ($\alpha = 1/2$).
This proof, like many others in this section, relies on integrating equation (2.5.35)
(See outline of proof of Theorem 9) and estimating the different terms. This can be
done because the friction matrix γ is a scalar constant 1.
We know the limit as τ → 0 then m → 0 and vice versa. However, what if
τ, m → 0 in a particular way that is not handled by Theorem 7? If the smoothed
white noise is chosen as an Ornstein-Uhlenbeck (OU) process, one can add the SDE
for the noise to the original system of equations, thus obtaining a higher-dimensional
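system of SDEs. That is, let $\eta^\tau_t$ be a $k$ dimensional OU process defined by the SDE,
\[
d\eta^\tau_t = -\frac{A}{\tau}\,\eta^\tau_t\,dt + \frac{B}{\tau}\,dW_t,
\tag{2.5.38}
\]
where $A \in \mathbb{R}^{k\times k}$ and $B \in \mathbb{R}^{k\times \ell}$ are constant matrices and $W_t$ is an $\ell$ dimensional Wiener process. Note that the stationary solution is a zero-mean Gaussian process with correlation $\tau$. Then equation (2.5.35) is written,
\[
\begin{cases}
dx^{m,\tau}_t = v^{m,\tau}_t\,dt,\\[4pt]
dv^{m,\tau}_t = \left[\dfrac{F(x^{m,\tau}_t)}{m} - \dfrac{\gamma(x^{m,\tau}_t)}{m}\,v^{m,\tau}_t + \dfrac{\sigma(x^{m,\tau}_t)}{m}\,\eta^\tau_t\right]dt,\\[4pt]
d\eta^\tau_t = -\dfrac{A}{\tau}\,\eta^\tau_t\,dt + \dfrac{B}{\tau}\,dW_t,
\end{cases}
\tag{2.5.39}
\]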
with the initial conditions $(x^{m,\tau}_0, v^{m,\tau}_0) = (x, v)$ and $\eta^{m,\tau}_0 = \eta$, a vector whose entries are normal random variables with mean zero and variance $1/\tau$, mutually independent and independent of the Wiener process $W_t$. Such a system was studied in [KPS04], outlining a rough proof by estimating the rate of $v^{m,\tau}_t$ as $m, \tau \to 0$. The authors also include a short homogenization method similar to Section 2.5.1. The result is stated here in one dimension, with scalar $x^{m,\tau}_t$ and $v^{m,\tau}_t$, and $F(x) = 0$, $\gamma = 1$, $A = a \in (0,\infty)$, $B = \sqrt{b} \in (0,\infty)$. Then if $m = \tau_0\varepsilon^2$ and $\tau = \varepsilon^2$ the limit as $\varepsilon \to 0$ is
\[
dx_t = \frac{b}{2a^2(1+\tau_0 a)}\,\sigma(x_t)\sigma'(x_t)\,dt + \frac{\sqrt{b}}{a}\,\sigma(x_t)\,dW_t.
\tag{2.5.40}
\]
Recall that $\tau_0 = m/\tau$. This equation shows the interpolation between the Itô integral ($\tau_0 = \infty$, which implies $\tau \to 0$ then $m \to 0$) and the Stratonovich integral ($\tau_0 = 0$, which implies $m \to 0$ then $\tau \to 0$). This limit is shown, using the framework of the main theorem, in Section 4.2.
The rigorous proof, with $L^2$ convergence with respect to $C_{\mathbb{R}^d}[0,T]$ and where $\eta_t$ is an infinite dimensional process given by a stochastic partial differential equation, is done in [PS05] by different methods. The theorem is much more involved than its one-dimensional analogue and it states,
Theorem 8 ([PS05]). Let $x^{m,\tau}_t \in \mathbb{R}^d$ be given by equation (2.5.39), where $\gamma = 1$, and $A$ is a strictly positive self-adjoint operator on a Hilbert space $H$ with
\[
A = \mathrm{diag}\{\alpha_j\},
\tag{2.5.41}
\]
and $W_t = \sum_{k=1}^{\infty}\sqrt{\lambda_k}\,e_k W^k_t$ with $e_k$ the standard basis in $\ell^2$, and $W^k_t$ mutually independent one-dimensional Wiener processes. Assume that $F$ and $\sigma$ are globally Lipschitz, and that the eigenfunctions of $A$ and their derivatives are bounded in a certain way (see [PS05]). For $\tau = \varepsilon^2$ and $m = \varepsilon^\beta$:
for $\beta \in (0,2)$, $x^{m,\tau} \to x^1$ in $L^{2p}$ with respect to $C_{\mathbb{R}^d}[0,T]$ where $x^1_t$ is the solution to
\[
dx^1_t = F(x^1_t)\,dt + \sigma(x^1_t)A^{-1}\,dW_t.
\tag{2.5.42}
\]
Define $\Theta$ and $\hat\Theta$, operators such that
\[
\Theta = \mathrm{diag}\left\{\frac{\lambda_j}{2\alpha_j^2}\right\}, \qquad
\hat\Theta = \mathrm{diag}\left\{\frac{\lambda_j}{2\alpha_j^2(1+\alpha_j)}\right\}.
\tag{2.5.43}
\]
Then for $\beta \in (2,\infty)$, $x^{m,\tau} \to x^2$ in $L^{2p}$ with respect to $C_{\mathbb{R}^d}[0,T]$ where $x^2_t$ is the solution to
\[
dx^2_t = F(x^2_t)\,dt + \nabla\cdot\left(\sigma(x^2_t)\Theta\sigma^*(x^2_t)\right)dt - \sigma(x^2_t)\Theta\nabla\cdot\sigma^*(x^2_t)\,dt + \sigma(x^2_t)A^{-1}\,dW_t.
\tag{2.5.44}
\]
For $\beta = 2$, $x^{m,\tau} \to x^3$ in $L^{2p}$ with respect to $C_{\mathbb{R}^d}[0,T]$ where $x^3_t$ is the solution to
\[
dx^3_t = F(x^3_t)\,dt + \nabla\cdot\left(\sigma(x^3_t)\hat\Theta\sigma^*(x^3_t)\right)dt - \sigma(x^3_t)\Theta\nabla\cdot\sigma^*(x^3_t)\,dt + \sigma(x^3_t)A^{-1}\,dW_t.
\tag{2.5.45}
\]
The case of state-dependent friction $\gamma(x_t)$ for $x_t \in \mathbb{R}^d$ is considered in [FH11] with $\sigma = 1$. The framework is much like that of the earlier work of Freidlin [Fre04] [Thm. 7]. As discussed above, if $\tau \to 0$ first, then $m \to 0$, then the double limit is essentially the Smoluchowski-Kramers approximation. However, in [FH11] it is shown that, defining $x_t$ as the solution to
\[
dx_t = \gamma^{-1}(x_t)F(x_t)\,dt + \gamma^{-1}(x_t)\circ_\alpha dW_t,
\tag{2.5.46}
\]
$x^{m,\tau}_t$ does not converge to $x_t$ for any construction of the stochastic integral. If $m \to 0$ then $\tau \to 0$, $x^{m,\tau} \to x$ for the Stratonovich construction ($\alpha = 1/2$) in probability with respect to $C_{\mathbb{R}^d}[0,T]$.
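Theorem 9 ([FH11]). Let $x^{m,\tau}_t \in \mathbb{R}^d$ be the solution of SDE (2.5.39) with $F$ globally Lipschitz, $\gamma : \mathbb{R}^d \to \mathbb{R}$ a scalar function, $\sigma = 1$ and $\eta_t$, instead of an OU process, be defined as,
\[
\int_0^t \eta^\tau_s\,ds = \frac{1}{\tau}\int_0^\infty W_s\,\rho\left(\frac{s-t}{\tau}\right)ds,
\tag{2.5.47}
\]
with $\rho \in C^\infty_{\mathbb{R}}(\mathbb{R})$ with compact support on $[0,1]$ and $\int_0^1 \rho\,ds = 1$. If $m \to 0$, then $\tau \to 0$, $x^{m,\tau} \to x$ in $L^1$ with respect to $C_{\mathbb{R}^d}[0,T]$ where $x_t$ is the solution to,
\[
dx_t = \gamma^{-1}(x_t)F(x_t)\,dt + \gamma^{-1}(x_t)\circ dW_t,
\tag{2.5.48}
\]
that is, the Stratonovich SDE.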
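Outline of proof. Much like the proof of the main theorem, $v^m_t$ is solved with $x^m_t$ continuous and of bounded variation (see Lemma 10 of the main proof). Unlike the proof of the main theorem of this dissertation, the solution $v^m_t$ is integrated to obtain,
\begin{align}
x^m_t &= x + \int_0^t v^m_s\,ds \tag{2.5.49}\\
&= x + v\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\,ds \tag{2.5.50}\\
&\quad + \frac{1}{m}\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\int_0^s e^{\frac{1}{m}\int_0^r \gamma(x^m_u)\,du}\,F(x^m_r)\,dr\,ds \tag{2.5.51}\\
&\quad + \frac{1}{m}\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\int_0^s e^{\frac{1}{m}\int_0^r \gamma(x^m_u)\,du}\,dW_r\,ds. \tag{2.5.52}
\end{align}
The right hand side is estimated through an integration by parts of the integral
\begin{align}
\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\,ds
&= \int_0^t \frac{-m}{\gamma(x^m_s)}\,d\left(e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\right) \tag{2.5.53}\\
&= -m\Bigg[\frac{e^{-\frac{1}{m}\int_0^t \gamma(x^m_s)\,ds}}{\gamma(x^m_t)} - \frac{1}{\gamma(x)} \tag{2.5.54}\\
&\qquad\quad - \int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\,d\left(\frac{1}{\gamma(x^m_s)}\right)\Bigg]. \tag{2.5.55}
\end{align}
This is used to show,
\[
v\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\,ds \to 0,
\tag{2.5.56}
\]
as $m \to 0$ in probability with respect to $C_{\mathbb{R}^d}[0,T]$, and
\[
\frac{1}{m}\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\int_0^s e^{\frac{1}{m}\int_0^r \gamma(x^m_u)\,du}\,F(x^m_r)\,dr\,ds \to \int_0^t \frac{F(x_s)}{\gamma(x_s)}\,ds,
\tag{2.5.57}
\]
as $m \to 0$ in probability with respect to $C_{\mathbb{R}^d}[0,T]$. However,
\[
\frac{1}{m}\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\int_0^s e^{\frac{1}{m}\int_0^r \gamma(x^m_u)\,du}\,dW_r\,ds \not\to \int_0^t \frac{1}{\gamma(x_s)}\,dW_s,
\tag{2.5.58}
\]
as $m \to 0$. Instead if $dW_s$ is replaced by $\eta^\tau_s\,ds$ as defined in equation (2.5.47), then
\[
\frac{1}{m}\int_0^t e^{-\frac{1}{m}\int_0^s \gamma(x^m_r)\,dr}\int_0^s e^{\frac{1}{m}\int_0^r \gamma(x^m_u)\,du}\,\eta^\tau_r\,dr\,ds \to \int_0^t \frac{1}{\gamma(x^m_s)}\,\eta^\tau_s\,ds,
\tag{2.5.59}
\]
as $m \to 0$ in probability with respect to $C_{\mathbb{R}^d}[0,T]$. As $\tau \to 0$ the right hand side will converge to the Stratonovich integral in probability with respect to $C_{\mathbb{R}^d}[0,T]$.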
2.5.3 Other small mass results
There are many other results that serve as special cases of the Smoluchowski-Kramers
approximation. We outline a few of them here referring to the corresponding papers
for more details.
Zero friction: For all of the previous theorems and limits, including the main theorem, the friction coefficient is assumed to be bounded below, i.e. $|\gamma(x)| > c > 0$ for some $c > 0$ and all $x \in U \subseteq \mathbb{R}^d$. The case when $\gamma(x) = 0$ for $x \in \bar U \subset U$ is treated in [FHW13]. The analysis considers $\gamma(x_t) + \varepsilon I$ to be the friction coefficient, where $I$ is the identity matrix, and $\gamma(x)$ is zero on some subset of $U$. On this subset, the analysis is carried out as a singular limit as $\varepsilon \to 0$.
Potential landscape: In [AKQ07, SCY+ 12], the Smoluchowski-Kramers approximation is studied where the external force F = −∇φ(x) for φ : Rd → R is a potential
function. Thus the Smoluchowski-Kramers approximation is of the form
\[
\gamma(x_t)\,dx_t = -\nabla\phi(x_t)\,dt + \sigma(x_t)\,dW_t.
\tag{2.5.60}
\]
As discussed before, for non-zero mass, the $2d$ SDE (2.5.1) has a stochastic integral $\int_0^t \sigma(x^m_s)\,dW_s$. Because $x^m_t$ is differentiable, all constructions of the stochastic integral are equivalent to the Itô construction. However, $x_t$ defined by equation (2.5.60) is continuous and nowhere differentiable. Therefore, the stochastic integral $\int_0^t \sigma(x_s)\,dW_s$ will change when the construction changes. The papers [AKQ07, SCY+12] consider different interpretations of the stochastic integral, and
the resulting stationary solution to the corresponding Fokker-Planck equation. When
the integral is constructed in the anti-Itô sense, then the Gibbs distribution is recovered, i.e. the stationary solution is proportional to $e^{-\frac{1}{2}\phi(x)}$.
Fractional Brownian motion: In [BT05] the Smoluchowski-Kramers approximation is
studied for the Langevin equation driven by fractional Brownian motion. That is,
$x^m_t \in \mathbb{R}^d$ is the solution to
\[
\begin{cases}
dx^m_t = v^m_t\,dt,\\[4pt]
dv^m_t = \dfrac{F(x^m_t)}{m}\,dt - \dfrac{v^m_t}{m}\,dt + \dfrac{1}{m}\,dB^H_t,
\end{cases}
\tag{2.5.61}
\]
where $B^H_t$ is fractional Brownian motion. Fractional Brownian motion is a Gaussian process with covariance,
\[
E[B^H_s B^H_t] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right).
\tag{2.5.62}
\]
Note that when $H = 1/2$ then
\[
E[B^{1/2}_s B^{1/2}_t] = \min\{t, s\},
\tag{2.5.63}
\]
and $B^{1/2}_t = W_t$. The paper proves that for $F$ globally Lipschitz and every $H \in (0,1)$, $x^m_t$ converges to $x_t$, the solution of
\[
dx_t = F(x_t)\,dt + dB^H_t,
\tag{2.5.64}
\]
almost surely with respect to $C_{\mathbb{R}}[0,T]$.
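As a quick sanity check of the covariance in equation (2.5.62), the following minimal Python sketch evaluates it directly. The function name is an assumption made for this example. For $H = 1/2$ the expression $\frac{1}{2}(t + s - |t-s|)$ equals the smaller of the two times, which is the Wiener covariance, and on the diagonal it gives the variance $t^{2H}$.

```python
def fbm_covariance(s, t, hurst):
    """Covariance of fractional Brownian motion at times s and t, Eq. (2.5.62)."""
    power = 2.0 * hurst
    return 0.5 * (t ** power + s ** power - abs(t - s) ** power)


wiener_case = fbm_covariance(0.3, 0.8, 0.5)   # should equal min(0.3, 0.8)
variance_case = fbm_covariance(2.0, 2.0, 0.7)  # should equal 2.0 ** 1.4
```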
Infinite number of degrees of freedom: In [CF06] stochastic partial differential equations (SPDE) are considered. Let $x \in U \subset \mathbb{R}^d$ and $u^m(t,x) : [0,T]\times\mathbb{R}^d \to \mathbb{R}$ be governed by the SPDE,
\[
\begin{aligned}
m\,u^m_{tt}(t,x) &= \Delta u^m(t,x) - u^m_t(t,x) + b(x,u^m) + \frac{\partial W^Q}{\partial t},\\
u(0,x) &= u_0,\\
u_t(0,x) &= v_0,\\
u(t,x) &= 0, \quad x \in \partial U,
\end{aligned}
\tag{2.5.65}
\]
where $W^Q$ is a Gaussian mean zero random field $\tau$ correlated in time and $Q$ describes the correlation in space. For $x$ in one dimension, $\partial^2 W^Q/\partial x\,\partial t$ is considered to be space time white noise. Through rigorous analysis, as $m \to 0$ it is shown that $P(|u^m - u|_{C_H[0,T]} > \varepsilon) \to 0$, where $H$ is a Hilbert space, and $u$ is the solution to the SPDE,
\[
\begin{aligned}
u_t &= \Delta u + b(x,u) + \frac{\partial W^Q}{\partial t}, \quad x \in U,\\
u(0,x) &= u_0,\\
u(t,x) &= 0, \quad x \in \partial U.
\end{aligned}
\tag{2.5.66}
\]
Chapter 3
Proof of the Main Theorem
3.1 Definitions and Notation
In the proof, we use terminology and techniques from the theory of continuous
time stochastic processes. We begin with definitions and state properties, without
proof, where they are useful. Assume for these definitions that Xt ∈ R × [0, T ]. All
the following definitions can be extended to processes with state space Rd .
Definition 10. An $\{\mathcal{F}_t\}$-adapted process $X_t$ is of bounded variation if each path $t \mapsto X_t(\omega)$ is of finite variation $V_t(X)$ and $X_0 = 0$, where the total variation is defined as
\[
V_t(X,\omega) = \int_0^t |dX_s(\omega)| = \sup\sum_{i=1}^{n} |X_{s_i}(\omega) - X_{s_{i-1}}(\omega)|,
\tag{3.1.1}
\]
where the supremum is over all partitions of $[0,t]$.
Note that for a differentiable process $X_t$,
\[
V_t(X,\omega) = \int_0^t \left|\frac{d}{ds}X_s\right| ds.
\tag{3.1.2}
\]
Definition 11. For an $\{\mathcal{F}_t\}$-adapted process $X_t$, define
\[
\langle X\rangle_t = \sup\sum_i |X_{t_{i+1}} - X_{t_i}|^2,
\tag{3.1.3}
\]
where the supremum is over all partitions of $[0,t]$, to be the quadratic variation of the process $X_t$.
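To illustrate the sums appearing in Definitions 10 and 11, the following minimal Python sketch evaluates them on uniform partitions for a smooth path. The choice of path, the use of uniform partitions with shrinking mesh rather than a supremum over all partitions, and the function names are assumptions made for this example. For the differentiable path $X_t = \sin(2\pi t)$ on $[0,1]$ the total variation equals $\int_0^1 |2\pi\cos(2\pi s)|\,ds = 4$, in agreement with (3.1.2), while the sum of squared increments shrinks proportionally to the mesh size.

```python
import math


def total_variation(path):
    """Sum of absolute increments of a discretized path, as in Eq. (3.1.1)."""
    return sum(abs(b - a) for a, b in zip(path, path[1:]))


def sum_of_squared_increments(path):
    """Sum of squared increments of a discretized path, as in Eq. (3.1.3)."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def sample(fn, n_cells, T=1.0):
    """Values of fn on the uniform partition of [0, T] with n_cells cells."""
    return [fn(T * i / n_cells) for i in range(n_cells + 1)]


def smooth_path(t):
    return math.sin(2.0 * math.pi * t)


coarse = sample(smooth_path, 100)
fine = sample(smooth_path, 10000)

tv_fine = total_variation(fine)
sq_coarse = sum_of_squared_increments(coarse)
sq_fine = sum_of_squared_increments(fine)
```

This is the sense in which a differentiable process such as $x^m_t$ contributes no second order term, in contrast with a Wiener path, whose squared increments over $[0,t]$ sum to approximately $t$.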
Definition 12. An $\{\mathcal{F}_t\}$-adapted process $X_t$ is a martingale (submartingale) if for every $0 \le s < t < \infty$, $E[|X_t|] < \infty$ and $E[X_t|\mathcal{F}_s] = X_s$ ($E[X_t|\mathcal{F}_s] \ge X_s$).
Definition 13. An $\{\mathcal{F}_t\}$-adapted process $M_t$ is a local martingale if there exists a sequence of increasing stopping times $\tau_n$ such that $P(\tau_n < T) \to 0$ as $n \to \infty$ and $M_{\tau_n\wedge t}$ is a martingale.
Definition 14. A process Yt is called a semimartingale if it has the decomposition,
\[
Y_t = M_t + A_t, \qquad t \in [0,\infty),
\tag{3.1.4}
\]
where Mt is a {Ft } local martingale and At is a {Ft } process of bounded variation.
The Doob-Meyer decomposition (see [KS91]) assures uniqueness of this decomposition for continuous semimartingales Yt .
A key inequality, that is used many times in this chapter, is Doob’s maximal
inequality.
Lemma 4 (Doob’s maximal inequality). If $X_t \in C_{\mathbb{R}}[0,T]$ is a submartingale (i.e. $E[X_t|\mathcal{F}_s] \ge X_s$) adapted to $\{\mathcal{F}_t\}$, then for every $p > 1$,
\[
E\left[\left(\sup_{0\le t\le T} X_t\right)^p\right] \le \left(\frac{p}{p-1}\right)^p E[X_T^p].
\tag{3.1.5}
\]
For a proof see [RY99]. The above lemma is applied with $p = 2$. Because $|X_t|$ is a submartingale adapted to $\{\mathcal{F}_t\}$ the following inequality is used many times in the proof
\[
E\left[\left(\sup_{0\le t\le T} |X_t|\right)^2\right] \le 4E[|X_T|^2].
\tag{3.1.6}
\]
For the main theorem, let $\{\mathcal{F}_t\}$ be a filtration. Suppose $H \in C_{\mathbb{R}^k}[0,T]$ is an $\{\mathcal{F}_t\}$-adapted semi-martingale and denote its Doob-Meyer decomposition $H_t = M_t + A_t$, where $M_t$ is an $\mathcal{F}_t$-local martingale and $A_t$ is a process of locally bounded variation [RY99]. Define $\tau_c = \inf\{t : |M_t| \ge c \text{ or } V_t(A) \ge c\}$. For a continuous $\{\mathcal{F}_t\}$-adapted process $X \in C_{\mathbb{R}^{d\times k}}[0,T]$ define,
\[
\int_0^t X_s\,dH_s = \lim\sum_i X_{t_i}(H_{t_{i+1}} - H_{t_i}),
\tag{3.1.7}
\]
where $\{t_i\}$ is a partition of $[0,T]$ and the limit is taken as the maximum of $t_{i+1} - t_i$ goes to zero. For a continuous process $X_s$ such that
\[
P\left(\int_0^{t\wedge\tau_c} |X_s|^2\,d\langle M\rangle_s + \int_0^{t\wedge\tau_c} |X_s|^2\,dV_s(A) < \infty\right) = 1,
\tag{3.1.8}
\]
the stochastic integral [Eq. (3.1.7)] exists and has finite paths with probability one [Pro05].
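3.2 Main theorem and assumptions
Let $(x^m_t, v^m_t) \in U \times \mathbb{R}^d$, where $U \subset \mathbb{R}^d$ is an open bounded set, be the solution to the SDE
\[
\begin{cases}
dx^m_t = v^m_t\,dt, & x^m_0 = x,\\[4pt]
dv^m_t = \left[\dfrac{F(x^m_t)}{m} - \dfrac{\gamma(x^m_t)}{m}\,v^m_t\right]dt + \dfrac{\sigma(x^m_t)}{m}\,dW_t, & v^m_0 = v,
\end{cases}
\tag{3.2.1}
\]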
with $F : U \to \mathbb{R}^d$, $\gamma : U \to \mathbb{R}^{d\times d}$, $\sigma : U \to \mathbb{R}^{d\times k}$ and $W$ a $k$-dimensional Wiener process on the probability space $(\Omega, \mathcal{F}, P)$.
Let the limit process $x_t \in U$ be the solution to
\[
dx_t = \left[\gamma^{-1}(x_t)F(x_t) + S(x_t)\right]dt + \gamma^{-1}(x_t)\sigma(x_t)\,dW_t, \qquad x_0 = x,
\tag{3.2.2}
\]
where the $i$th component of $S$ is defined as
\[
S_i(x) = \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x)\right)(\gamma^{-1})_{\alpha n}(x)\,J(x)_{jn}\,(\gamma^*)_{n\ell}(x),
\tag{3.2.3}
\]
where we use Einstein summation notation throughout the paper and $J$ is the matrix that solves the Lyapunov equation,
\[
-J\gamma^* - \gamma J = -\sigma\sigma^*.
\tag{3.2.4}
\]
Here and throughout the paper $B^*$ denotes the transpose of a real matrix $B$.
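For fixed $x$ the Lyapunov equation (3.2.4) is a linear system in the entries of $J$. The following minimal Python sketch solves it by writing out that linear system and applying Gaussian elimination. It is only an illustration, with hypothetical function names and example matrices, and is not the method used in the proof. In the scalar case the solution is $J = \sigma^2/(2\gamma)$.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def solve_lyapunov(gamma, sigma):
    """Solve gamma J + J gamma^T = sigma sigma^T for J (matrices as nested lists)."""
    d, k = len(gamma), len(sigma[0])
    A = [[0.0] * (d * d) for _ in range(d * d)]
    rhs = [0.0] * (d * d)
    for i in range(d):
        for j in range(d):
            row = i * d + j  # equation for entry (i, j); unknown J[a][b] has index a*d+b
            rhs[row] = sum(sigma[i][r] * sigma[j][r] for r in range(k))
            for a in range(d):
                A[row][a * d + j] += gamma[i][a]  # (gamma J)_{ij}
                A[row][i * d + a] += gamma[j][a]  # (J gamma^T)_{ij}
    flat = solve_linear(A, rhs)
    return [[flat[i * d + j] for j in range(d)] for i in range(d)]


def lyapunov_residual(gamma, sigma, J):
    """Largest entry of |gamma J + J gamma^T - sigma sigma^T|."""
    d, k = len(gamma), len(sigma[0])
    worst = 0.0
    for i in range(d):
        for j in range(d):
            lhs = sum(gamma[i][a] * J[a][j] + J[i][a] * gamma[j][a] for a in range(d))
            target = sum(sigma[i][r] * sigma[j][r] for r in range(k))
            worst = max(worst, abs(lhs - target))
    return worst


gamma_example = [[2.0, 1.0], [0.0, 3.0]]   # eigenvalues 2 and 3, both positive
sigma_example = [[1.0, 0.0], [0.5, 1.0]]
J_example = solve_lyapunov(gamma_example, sigma_example)
J_scalar = solve_lyapunov([[2.0]], [[3.0]])  # expected sigma^2 / (2 gamma) = 2.25
```

The linear system is uniquely solvable here because the eigenvalues of the example friction matrix are positive, so no two of them sum to zero.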
For the main theorem, we assume $x^m_t, x_t \in U \subset \mathbb{R}^d$, an open bounded set, and $v^m_t \in \mathbb{R}^d$ for all $0 \le t \le T$. For an arbitrary vector $a \in \mathbb{R}^d$, $|a|$ is the Euclidean norm and for a $d \times d$ matrix $A \in \mathbb{R}^{d\times d}$, $|A|$ is the induced operator norm. We now state the assumptions and main theorem.
Assumption 1. The coefficients $F$, $\gamma$, $\sigma$ are continuously differentiable functions. Furthermore, the matrix $\gamma$ has positive real eigenvalues
\[
0 < c_\lambda \le \lambda_0(x) \le \dots \le \lambda_d(x),
\tag{3.2.5}
\]
and $|\gamma(x)| > c_\gamma > 0$ for all $x \in U$.
Remark 1. The lower bounds on $\gamma$ and its eigenvalues are crucial for the estimates of the proof. However, the case of vanishing friction, $|\gamma(x)| = 0$ for $x$ in a subset of $U$, is treated in [FHW13].
Assumption 2. With probability one, there exist global unique solutions to equation (1.1.11) and, for each $m$, to equation (1.1.4), such that for all $m$, $x^m_t \in U_k \subset U$, a compact set. In particular, there are no explosions.
Assumption 2 can be verified in many ways. In Section 4.1, we use a class of
Lyapunov functions to do it for the case when xt is one-dimensional.
Theorem 1. Suppose SDE (1.1.4) satisfies Assumptions 1 and 2. Let $(x^m_t, v^m_t) \in U \times \mathbb{R}^d$ be the solution of SDE (1.1.4) with initial conditions $(x, v)$ constant for every $m$ and let $x_t$ be the solution to the Itô SDE (1.1.11) with the same initial condition $x_0 = x$. Then
\[
\lim_{m\to 0} E\left[\left(\sup_{0\le t\le T} |x^m_t - x_t|\right)^2\right] = 0.
\tag{3.2.6}
\]
3.3 Outline of the proof
First we give an outline of the proof starting with the general theory of convergence of SDE in [KP91]. Let $(U^m, H^m, X^m)$ satisfy the SDE, written in integral form,
\[
X^m_t = X_0 + U^m_t + \int_0^t f(X^m_s)\,dH^m_s,
\tag{3.3.1}
\]
where $f : U \to \mathbb{R}^{d\times k}$, and $H^m_t$ is a semimartingale with respect to $\{\mathcal{F}_t\}$ with the Doob-Meyer decomposition $H^m = A^m + M^m$, and $H^m_0 = 0$ for all $m$. Define $X_t$ to be the solution of
\[
X_t = X_0 + \int_0^t f(X_s)\,dH_s,
\tag{3.3.2}
\]
where $H_t \in C_{\mathbb{R}^d}[0,T]$ is a semi-martingale with respect to the filtration $\{\mathcal{F}_t\}$. The main lemma is,
Lemma 5 ([KP91], Theorem 5.10). Assume that there exists a unique and global solution to equation (3.3.2), that $f$ is a continuous function, $(U^m, H^m) \to (0, H)$ in probability with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T]$ and the following condition is satisfied:
Condition 1. $\{V_t(A^m)\}$ is stochastically bounded for each $t > 0$, i.e. $P(V_t(A^m) > L) \to 0$ as $L \to \infty$, uniformly in $m$.
Then $X^m \to X$ in probability with respect to $C_U([0,T])$ as $m \to 0$, where $X_t$ is the solution of equation (3.3.2).
This sets up the general framework for the main theorem and gives the key assumption (Condition 1) that is needed to apply Lemma 5. Given the limiting equation,
\[
dx_t = \left[\gamma^{-1}(x_t)F(x_t) + S(x_t)\right]dt + \gamma^{-1}(x_t)\sigma(x_t)\,dW_t, \qquad x_0 = x,
\tag{3.3.3}
\]
the continuous function $f$ must, at least, contain
\[
f(x) = \left(\gamma^{-1}(x)F(x),\ \gamma^{-1}(x)\sigma(x),\ S(x),\ \dots\right).
\tag{3.3.4}
\]
We say “at least” because $f$ may contain more columns if the limit process $H_t$ has zeros in the corresponding rows, i.e.
\[
H_t = \begin{pmatrix} t \\ W_t \\ t \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\tag{3.3.5}
\]
This will be the case, later in the proof.
To use Lemma 5, we show the following,
Lemma 10. For each $m$, let $x^m_t$ have continuous paths in a compact set $U_k$, and define $v^m_t$ as the solution to the SDE given by the second equation in (3.2.1) with functions $F$, $\gamma$, and $\sigma$ satisfying Assumptions 1-2. Then $m v^m \to 0$ as $m \to 0$ in $L^2$ with respect to $C_{\mathbb{R}^d}[0,T]$ with the uniform metric and hence also in probability.
To put the original SDE (3.2.1) in a form to which Lemma 5 applies, we solve the $dv^m_t$ equation for $v^m_t\,dt$ and substitute to obtain,
\[
dx^m_t = v^m_t\,dt = \gamma^{-1}(x^m_t)F(x^m_t)\,dt + \gamma^{-1}(x^m_t)\sigma(x^m_t)\,dW_t - m\gamma^{-1}(x^m_t)\,dv^m_t.
\tag{3.3.6}
\]
The last term on the right hand side must now be integrated by parts, because $m v^m_t$ does not satisfy Condition 1. After integrating by parts $x^m_t$ is defined as
\begin{align}
x^m_t &= x + U^m_t + \int_0^t \gamma^{-1}(x^m_s)F(x^m_s)\,ds + \int_0^t \gamma^{-1}(x^m_s)\sigma(x^m_s)\,dW_s \tag{3.3.7}\\
&\quad + \int_0^t S(x^m_s)\,ds + \sum_{i=1}^{d}\int_0^t f^i(x^m_s)\,d\left[m(v^m_s)_i\,m v^m_s\right], \tag{3.3.8}
\end{align}
where $U^m_t \to 0$ in $L^2$ as $m \to 0$ with respect to $C_{\mathbb{R}^d}[0,T]$ and $f^i(x) : U \to \mathbb{R}^{d\times d}$ are continuous functions of $x$ for $i = 1, \dots, d$. This allows us to define $f : U \to \mathbb{R}^{d\times(1+k+1+d^2)}$ as
\[
f = \left(\gamma^{-1}(x)F(x),\ \gamma^{-1}(x)\sigma(x),\ S(x),\ f^1, \dots, f^d\right),
\tag{3.3.9}
\]
and $H^m_t \in C_{\mathbb{R}^{1+k+1+d^2}}[0,T]$ as
\[
H^m_t = \begin{pmatrix} t \\ W_t \\ t \\ m(v^m_t)_1\,m v^m_t - m v_1\,m v \\ \vdots \\ m(v^m_t)_d\,m v^m_t - m v_d\,m v \end{pmatrix}.
\tag{3.3.10}
\]
From Lemma 10, $m(v^m_t)_i\,m v^m_t \to 0$ as $m \to 0$ in $L^2$ with respect to $C_{\mathbb{R}^d}[0,T]$. To use Lemma 5, $H^m_t$ must satisfy Condition 1. It suffices to show that $m(v^m_t)_i\,m v^m_t$ satisfies Condition 1.
Therefore, by Assumption 2 there exists a unique global solution to the limit equation [SDE (3.2.2)], $f$ is continuous, $(U^m_t, H^m_t) \to (0, H_t)$ in probability with respect to $C_{\mathbb{R}^d\times\mathbb{R}^{1+k+1+d^2}}[0,T]$, and $H^m_t$ satisfies Condition 1. All of the assumptions of Lemma 5 are satisfied and thus $x^m_t \to x_t$ in probability with respect to $C_U[0,T]$. Finally, convergence in probability in a bounded set implies convergence in $L^2$.
3.4 Proof of the main theorem
To study convergence of SDE (3.2.1) we use a theorem of Kurtz and Protter [KP91],
which, for greater clarity, we state here in a slightly less general form, sufficient for
our purposes.
3.4.1 Convergence of Stochastic Integrals
Consider $(U^m, H^m)$ with paths in $C_{\mathbb{R}^d\times\mathbb{R}^n}[0,T]$ adapted to $\{\mathcal{F}_t\}$ where $H^m_t$ is a semi-martingale with respect to $\mathcal{F}_t$. Let $H^m_t = M^m_t + A^m_t$ be the Doob-Meyer decomposition of $H^m_t$ [RY99]. Let $H_t$ with paths in $C_{\mathbb{R}^k}[0,T]$ be a semi-martingale with respect to the filtration $\{\mathcal{F}_t\}$. Let $f : U \to \mathbb{R}^{d\times k}$ be a continuous matrix valued function and let $X^m$, with paths in $C_{\mathbb{R}^d}[0,T]$, satisfy the SDE
\[
X^m_t = X_0 + U^m_t + \int_0^t f(X^m_s)\,dH^m_s,
\tag{3.4.1}
\]
where $X_0 \in \mathbb{R}^d$ is the same initial condition for all $m$. Define $X$ with paths in $C_{\mathbb{R}^d}[0,T]$ to be the solution of
\[
X_t = X_0 + \int_0^t f(X_s)\,dH_s.
\tag{3.4.2}
\]
Note that $U^m_0 = 0$ for all $m$.
Lemma 5 ([KP91], Theorem 5.10). Assume that there exists a unique and global solution to equation (3.4.2), $(U^m, H^m) \to (0, H)$ in probability with respect to $C_{\mathbb{R}^d\times\mathbb{R}^n}[0,T]$, i.e., for all $\varepsilon > 0$,
\[
P\left(\sup_{0\le s\le T}\left(|U^m_s| + |H^m_s - H_s|\right) > \varepsilon\right) \to 0,
\tag{3.4.3}
\]
as $m \to 0$, and the following condition is satisfied:
Condition 1. $\{V_t(A^m)\}$ is stochastically bounded for each $t > 0$, i.e. $P(V_t(A^m) > L) \to 0$ as $L \to \infty$, uniformly in $m$.
Then, as $m \to 0$, $X^m \to X$, the solution of equation (3.4.2), in probability with respect to $C_U([0,T])$.
The proof of Lemma 5 relies on a theorem about weak convergence of stochastic integrals with respect to the Skorokhod topology. In our case both $x^m_t$ and $v^m_t$ are continuous processes. From Theorem 3.10.2 of [EK86], if a process $X^m \Rightarrow X$ with respect to the Skorokhod topology, and $X$ is continuous with probability one, then $X^m \Rightarrow X$ with respect to the space of continuous functions with the uniform metric (see also [Bil99] section 18). Therefore, a direct application of Theorem 5.10 of [KP91] gives $x^m_t \to x_t$, for continuous $x_t$, with respect to $C_{\mathcal{U}}[0,T]$. However, the proofs in [KP91] and the references therein are quite technical, especially for compactness. Therefore, we prove equivalent lemmas for the space of continuous functions.
Consider $(U^m, H^m)$ with paths in $C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T]$, $X^m_t$ with paths in $C_{\mathbb{R}^d}[0,T]$, and $X_t$ with paths in $C_{\mathbb{R}^d}[0,T]$. Define the stopping time $\tau^m_{b,f} = \inf\{t : |f(X^m_t)| \ge b\}$, and let $(U^m, X^{m,b}, H^m)$ satisfy
\[ X^{m,b}_t = X_0 + U^m_t + \int_0^t f(X^{m,b}_s)\,\chi\{0 \le t < \tau^m_{b,f}\}\,dH^m_s, \tag{3.4.4} \]
where $\chi\{A\}$ is the indicator function of a set $A$. Note that $X^{m,b}_t = X^m_t$ for $t \in [0, \tau^m_{b,f})$.
Lemma 6. For each $m$, let $(X^m, Y^m)$ be an $\{\mathcal{F}_t\}$-adapted process with sample paths in $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k}[0,T]$, and let $Y^m$ be an $\{\mathcal{F}_t\}$-semimartingale. Let $Y^m = M^m + A^m$ be the Doob-Meyer decomposition of $Y^m$ into an $\{\mathcal{F}_t\}$-local martingale and a process with finite variation. Suppose Condition 1 is satisfied. If $(X^m, Y^m) \Rightarrow (X, Y)$ as $m \to 0$ (weak convergence with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k}[0,T]$ with the uniform metric, $\sup_{0\le s\le T}|x_s - y_s|$), then $Y$ is a semimartingale with respect to a filtration to which $X$ and $Y$ are adapted, and $(X^m, Y^m, \int X^m\,dY^m) \Rightarrow (X, Y, \int X\,dY)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^d}[0,T]$ (where $\int_0^t X^m_s\,dY^m_s$ and $\int_0^t X_s\,dY_s$ are considered as processes).
Proof. The idea of the proof is to approximate $X^m$ and $X$ by differentiable processes which are adapted to the filtration with respect to which $Y^m$ is a semimartingale. Fix $\epsilon > 0$. Because $X$ has paths in $C_{\mathbb{R}^{d\times k}}([0,T])$, $X_t$ is uniformly continuous and there exists a $\delta > 0$ such that $|X_t - X_s| < \epsilon$ for all $|t - s| < \delta$. Letting $n = n(X, \epsilon) \in \mathbb{N}$ be the smallest number such that $1/n < \delta$, define
\[ X^\epsilon_t = \begin{cases} X_0, & t \le 0,\\ n\int_{t-\frac{1}{n}}^{t} X_s\,ds, & 0 < t < T,\\ X_T, & t \ge T. \end{cases} \tag{3.4.5} \]
By uniform continuity,
\[ |X^\epsilon_t - X_t| = \left| n\int_{t-\frac{1}{n}}^{t} (X_s - X_t)\,ds \right| \le n\int_{t-\frac{1}{n}}^{t} |X_s - X_t|\,ds < \epsilon, \tag{3.4.6} \]
for all $t \in [0,T]$. By this argument, for every $X^m$ with paths in $C_{\mathbb{R}^{d\times k}}[0,T]$ there exists $n$, and subsequently $X^{m,n} = X^{m,\epsilon}$, which is adapted to $\{\mathcal{F}_t\}$ and satisfies $\sup_{0\le s\le T}|X^m_s - X^{m,\epsilon}_s| < \epsilon$ for all $m > 0$. First we show weak convergence of $X^{m,\epsilon}$ to $X^\epsilon$, using the criterion of the Portmanteau theorem [Bil99]: $X^m \Rightarrow X$ if and only if $\lim_{m\to 0} E[f(X^m)] = E[f(X)]$ for all bounded uniformly continuous real valued functions $f$ on the space of continuous functions with the uniform metric. First consider an arbitrary bounded uniformly continuous function $f : C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}}[0,T] \to \mathbb{R}$,
\begin{align}
&|E[f(X^m, Y^m, X^{m,\epsilon})] - E[f(X, Y, X^\epsilon)]| \tag{3.4.7}\\
&\quad\le E[|f(X^m, Y^m, X^{m,\epsilon}) - f(X^m, Y^m, X^m)|] \tag{3.4.8}\\
&\qquad + |E[f(X^m, Y^m, X^m)] - E[f(X, Y, X)]| + E[|f(X, Y, X) - f(X, Y, X^\epsilon)|]. \tag{3.4.9}
\end{align}
By uniform continuity of $f$, for all $\tilde\epsilon > 0$ there exists $\epsilon > 0$ such that $\sup_{0\le s\le T}|(X^m_s, Y^m_s, X^m_s) - (X^m_s, Y^m_s, X^{m,\epsilon}_s)| < \epsilon$ implies $|f(X^m, Y^m, X^m) - f(X^m, Y^m, X^{m,\epsilon})| < \tilde\epsilon$, and similarly for $X$. Because $\sup_{0\le s\le T}|X^{m,\epsilon}_s - X^m_s| < \epsilon$,
\begin{align}
&\liminf_{m\to 0}|E[f(X^m, Y^m, X^{m,\epsilon})] - E[f(X, Y, X^\epsilon)]| \tag{3.4.10}\\
&\quad\le \liminf_{m\to 0}\Big( E[|f(X^m, Y^m, X^{m,\epsilon}) - f(X^m, Y^m, X^m)|] \tag{3.4.11}\\
&\qquad + |E[f(X^m, Y^m, X^m)] - E[f(X, Y, X)]| + E[|f(X, Y, X) - f(X, Y, X^\epsilon)|]\Big) \tag{3.4.12}\\
&\quad\le 2\tilde\epsilon + \liminf_{m\to 0}|E[f(X^m, Y^m, X^m)] - E[f(X, Y, X)]|. \tag{3.4.13}
\end{align}
Note that $f(X^m, Y^m, X^m) = g(X^m, Y^m)$ where $g : C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k}[0,T] \to \mathbb{R}$ is uniformly continuous and bounded, and $(X^m, Y^m) \Rightarrow (X, Y)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k}[0,T]$ by assumption. Thus,
\[ \liminf_{m\to 0}|E[f(X^m, Y^m, X^{m,\epsilon})] - E[f(X, Y, X^\epsilon)]| \le 2\tilde\epsilon, \tag{3.4.14} \]
where $\tilde\epsilon$ is arbitrarily small. Therefore, $(X^m, Y^m, X^{m,\epsilon}) \Rightarrow (X, Y, X^\epsilon)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}}[0,T]$.
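The moving-average construction of equation (3.4.5) can be illustrated numerically. The following is a minimal sketch, not part of the original proof: the function names `averaged_path` and `sup_error` are hypothetical, the path is taken to be the deterministic function $\sin$ (Lipschitz with constant 1) rather than a random process, and the integral is approximated by a midpoint rule. For a 1-Lipschitz path the bound in (3.4.6) gives a uniform error of at most $1/(2n)$.

```python
import math

def averaged_path(x, t, n, T=1.0, quad_points=200):
    """Moving average n * int_{t-1/n}^{t} x(s) ds, as in (3.4.5).

    Assumption for this sketch: x is extended by x(0) for s < 0, and the
    integral is approximated with a midpoint rule.
    """
    if t <= 0.0:
        return x(0.0)
    if t >= T:
        return x(T)
    a = t - 1.0 / n
    h = (t - a) / quad_points
    total = 0.0
    for j in range(quad_points):
        s = a + (j + 0.5) * h           # midpoint of the j-th subinterval
        total += x(max(s, 0.0))         # constant extension to the left of 0
    return n * total * h

def sup_error(x, n, T=1.0, grid=400):
    """Maximum of |x^eps_t - x_t| over a uniform grid on [0, T]."""
    return max(
        abs(averaged_path(x, T * i / grid, n, T) - x(T * i / grid))
        for i in range(grid + 1)
    )

if __name__ == "__main__":
    for n in (10, 100):
        print(n, sup_error(math.sin, n), "bound:", 0.5 / n)
```

Increasing `n` shrinks the averaging window, so the uniform error decreases, which is the property the proof uses to choose $n = n(X, \epsilon)$.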
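Now that the differentiable approximations converge, we show that the stochastic integrals converge as $m \to 0$. Define $U^m_t = \int_0^t X^m_s\,dY^m_s$ in $C_{\mathbb{R}^d}[0,T]$ and $U^{m,\epsilon}_t = \int_0^t X^{m,\epsilon}_s\,dY^m_s$ in $C_{\mathbb{R}^d}[0,T]$, with similar definitions for $U_t$ and $U^\epsilon_t$. Consider
\begin{align}
R^{m,\epsilon}_t \equiv U^m_t - U^{m,\epsilon}_t &= \int_0^t (X^m_s - X^{m,\epsilon}_s)\,dY^m_s \tag{3.4.15}\\
&= \int_0^t (X^m_s - X^{m,\epsilon}_s)\,dM^m_s + \int_0^t (X^m_s - X^{m,\epsilon}_s)\,dA^m_s. \tag{3.4.16}
\end{align}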
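First we derive an estimate for the first moment of $R^{m,\epsilon}_t$. We start with the stochastic integral with respect to the martingale $M^m_t$. For any stopping time $\tau$, by Jensen's inequality and Doob's maximal inequality,
\begin{align}
E\left[\sup_{0\le s\le T\wedge\tau}\left|\int_0^s (X^m_r - X^{m,\epsilon}_r)\,dM^m_r\right|\right]^2 &\le E\left[\sup_{0\le s\le T\wedge\tau}\left|\int_0^s (X^m_r - X^{m,\epsilon}_r)\,dM^m_r\right|^2\right] \tag{3.4.17}\\
&\le 4E\left[\left|\int_0^{T\wedge\tau} (X^m_r - X^{m,\epsilon}_r)\,dM^m_r\right|^2\right] \tag{3.4.18}\\
&= 4E\left[\int_0^{T\wedge\tau} (X^m_r - X^{m,\epsilon}_r)^2\,d\langle M^m\rangle_r\right], \tag{3.4.19}
\end{align}
with the last equality resulting from Itô isometry, where $\langle M^m\rangle_r$ is the quadratic variation of $M^m$ at $r$. Recall that $\sup_{0\le s\le T}|X^m_s - X^{m,\epsilon}_s| < \epsilon$, therefore the right hand side of (3.4.19) is bounded above by $4\epsilon^2 E[\langle M^m\rangle_{T\wedge\tau}]$. Taking the square root of both sides of the end result we obtain
\[ E\left[\sup_{0\le s\le T\wedge\tau}\left|\int_0^s (X^m_r - X^{m,\epsilon}_r)\,dM^m_r\right|\right] \le 2\epsilon\,E[\langle M^m\rangle_{T\wedge\tau}]^{1/2}. \tag{3.4.20} \]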
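For the second integral in equation (3.4.16), we use a bound on the total variation,
\[ E\left[\sup_{0\le s\le T\wedge\tau}\left|\int_0^s (X^m_r - X^{m,\epsilon}_r)\,dA^m_r\right|\right] \le E\left[\sup_{0\le s\le T\wedge\tau}\int_0^s |X^m_r - X^{m,\epsilon}_r|\,dV_r(A^m)\right]. \tag{3.4.21} \]
Again we use $\sup_{0\le s\le T}|X^m_s - X^{m,\epsilon}_s| < \epsilon$ and the definition of the total variation,
\begin{align}
E\left[\sup_{0\le s\le T\wedge\tau}\left|\int_0^s (X^m_r - X^{m,\epsilon}_r)\,dA^m_r\right|\right] &\le \epsilon\,E\left[\sup_{0\le s\le T\wedge\tau} V_s(A^m)\right] \tag{3.4.22}\\
&\le \epsilon\,E\left[V_{T\wedge\tau}(A^m)\right], \tag{3.4.23}
\end{align}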
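because $V_t(A^m)$ is an increasing function of $t$. Putting together the estimates of inequalities (3.4.20) and (3.4.23) yields
\[ E\left[\sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s|\right] \le \epsilon\left(2E[\langle M^m\rangle_{T\wedge\tau}]^{1/2} + E[V_{T\wedge\tau}(A^m)]\right), \tag{3.4.24} \]
and a similar bound holds for $R^\epsilon$. To show weak convergence of $U^m$ to $U$, we have for any bounded uniformly continuous function $f : C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}\times\mathbb{R}^d}[0,T] \to \mathbb{R}$,
\begin{align}
&|E[f(X^m, Y^m, X^{m,\epsilon}, U^m)] - E[f(X, Y, X^\epsilon, U)]| \tag{3.4.25}\\
&\quad\le E[|f(X^m, Y^m, X^{m,\epsilon}, U^m) - f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})|] \tag{3.4.26}\\
&\qquad + |E[f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})] - E[f(X, Y, X^\epsilon, U^\epsilon)]| \tag{3.4.27}\\
&\qquad + E[|f(X, Y, X^\epsilon, U^\epsilon) - f(X, Y, X^\epsilon, U)|]. \tag{3.4.28}
\end{align}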
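First we bound the first term of the right hand side by considering when
\begin{align}
&\sup_{0\le s\le T\wedge\tau}|(X^m_s, Y^m_s, X^{m,\epsilon}_s, U^m_s) - (X^m_s, Y^m_s, X^{m,\epsilon}_s, U^{m,\epsilon}_s)| \tag{3.4.29}\\
&\quad= \sup_{0\le s\le T\wedge\tau}|U^m_s - U^{m,\epsilon}_s| \tag{3.4.30}\\
&\quad= \sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| < \delta \tag{3.4.31}
\end{align}
for some $\delta > 0$. We use the standard notation for a set $A \in \mathcal{F}$,
\[ E[f(X); A] = \int_A f(X)\,dP. \tag{3.4.32} \]
By uniform continuity of $f$, for all $\epsilon > 0$ there exists $\delta > 0$ such that
\[ \sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| = \sup_{0\le s\le T\wedge\tau}|(X^m_s, Y^m_s, X^{m,\epsilon}_s, U^m_s) - (X^m_s, Y^m_s, X^{m,\epsilon}_s, U^{m,\epsilon}_s)| < \delta \tag{3.4.33} \]
implies that
\[ |f(X^m, Y^m, X^{m,\epsilon}, U^m) - f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})| < \epsilon \tag{3.4.34} \]
and
\begin{align}
&E\left[|f(X^m, Y^m, X^{m,\epsilon}, U^m) - f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})|;\ \sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| < \delta\right] \tag{3.4.35}\\
&\quad\le \epsilon\,P\left[\sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| < \delta\right] \le \epsilon. \tag{3.4.36}
\end{align}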
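When $\sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| \ge \delta$, by Chebyshev's inequality and the boundedness of $f$ (say $|f| \le C$),
\begin{align}
&E\left[|f(X^m, Y^m, X^{m,\epsilon}, U^m) - f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})|;\ \sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| \ge \delta\right] \tag{3.4.37}\\
&\quad\le 2C\,P\left[\sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s| \ge \delta\right] \tag{3.4.38}\\
&\quad\le \frac{2C}{\delta}\,E\left[\sup_{0\le s\le T\wedge\tau}|R^{m,\epsilon}_s|\right] \tag{3.4.39}\\
&\quad\le \frac{2C\epsilon}{\delta}\left(2E[\langle M^m\rangle_{T\wedge\tau}]^{1/2} + E[V_{T\wedge\tau}(A^m)]\right). \tag{3.4.40}
\end{align}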
Now take $\tau = \tau^m_c = \inf\{t : |M^m_t| \ge c \text{ or } V_t(A^m) \ge c\}$ and let $\alpha > 0$ be an arbitrary time. Note that convergence in distribution of $\{Y^m\}$ implies that $\{\sup_{0\le t\le T}|Y^m_t|\}$ is stochastically bounded. Using the Doob-Meyer decomposition, with $M^m_0 = 0$ for all $m$,
\[ \sup_{0\le t\le\alpha}|M^m_t| = \sup_{0\le t\le\alpha}|Y^m_t - A^m_t| \le \sup_{0\le t\le\alpha}|Y^m_t| + V_T(A^m), \tag{3.4.41} \]
which, by Condition 1, is stochastically bounded uniformly in $m$ for all $\alpha > 0$. Therefore, we can choose $c_\alpha$ such that $P(\tau^m_{c_\alpha} \le \alpha) \le 1/\alpha$ and hence $\tau^m_{c_\alpha} > T$ with probability 1 for fixed $T < \infty$. Because $M^m_t$ and $A^m_t$ are continuous, $2E[\langle M^m\rangle_{t\wedge\tau^m_{c_\alpha}}]^{1/2} + E[V_{t\wedge\tau^m_{c_\alpha}}(A^m)] \le 2c_\alpha + c_\alpha = 3c_\alpha$ which, due to Condition 1, is bounded independently of $m$.
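A similar bound holds for $E[|f(X, Y, X^\epsilon, U^\epsilon) - f(X, Y, X^\epsilon, U)|]$. All that is left is to show that $(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})$ converges weakly to $(X, Y, X^\epsilon, U^\epsilon)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}\times\mathbb{R}^d}[0,T]$. Note that, from the construction [Eq. (3.4.5)], $X^{m,\epsilon}$ and $X^\epsilon$ are continuously differentiable on $[0,T]$. Integrating by parts,
\begin{align}
U^{m,\epsilon}_t &= \int_0^t X^{m,\epsilon}_s\,dY^m_s \notag\\
&= X^{m,\epsilon}_t Y^m_t - X^{m,\epsilon}_0 Y^m_0 - \int_0^t n\left(X^m_s - X^m_{s-1/n}\right)Y^m_s\,ds, \tag{3.4.42}
\end{align}
and similarly for $U^\epsilon_t$. The right hand side of equation (3.4.42) is a continuous mapping from $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}}[0,T]$ to $C_{\mathbb{R}^d}[0,T]$. Thus,
\[ f(X, Y, X^\epsilon, U^\epsilon) = g(X, Y, X^\epsilon), \tag{3.4.43} \]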
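where $g : C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}}[0,T] \to \mathbb{R}$ is a bounded uniformly continuous function. We have shown that $(X^m, Y^m, X^{m,\epsilon}) \Rightarrow (X, Y, X^\epsilon)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}}[0,T]$. Therefore, $(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})$ converges weakly to $(X, Y, X^\epsilon, U^\epsilon)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^{d\times k}\times\mathbb{R}^d}[0,T]$. Consequently,
\begin{align}
&\liminf_{m\to 0}|E[f(X^m, Y^m, X^{m,\epsilon}, U^m)] - E[f(X, Y, X^\epsilon, U)]| \tag{3.4.44}\\
&\quad\le \liminf_{m\to 0}\Big( E[|f(X^m, Y^m, X^{m,\epsilon}, U^m) - f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})|] \tag{3.4.45}\\
&\qquad + |E[f(X^m, Y^m, X^{m,\epsilon}, U^{m,\epsilon})] - E[f(X, Y, X^\epsilon, U^\epsilon)]| \tag{3.4.46}\\
&\qquad + E[|f(X, Y, X^\epsilon, U^\epsilon) - f(X, Y, X^\epsilon, U)|]\Big) \tag{3.4.47}\\
&\quad\le \tilde{C}\epsilon, \tag{3.4.48}
\end{align}
where $\tilde{C}$ depends on $C$, $c$ and $\delta$. Therefore, $(X^m, Y^m, \int X^m\,dY^m) \Rightarrow (X, Y, \int X\,dY)$ as $m \to 0$ with respect to $C_{\mathbb{R}^{d\times k}\times\mathbb{R}^k\times\mathbb{R}^d}[0,T]$. Q.E.D.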
To show relative compactness, we need to state two tightness criteria for the space
of continuous functions. Note that by Prohorov’s theorem [Bil99], for the separable
and complete space of continuous functions, a set is tight if and only if the set is
relatively compact.
Lemma 7. The sequence $\{X^m\}$ is tight in $C_{\mathbb{R}^d}[0,T]$ if and only if these two conditions hold:
1. For each positive $\eta$, there exists an $a$ such that
\[ P(|X^m_0| > a) \le \eta, \tag{3.4.49} \]
for $m > 0$.
2. For each $\epsilon > 0$ and $\eta > 0$, there exist a $0 < \delta < T$ and an $m_0 > 0$ such that
\[ P\left(\sup_{|t-s|<\delta}|X^m_t - X^m_s| \ge \epsilon\right) \le \eta, \tag{3.4.50} \]
for all $0 < m \le m_0$.
Proof. See Theorem 8.2 [Bil99].
Another tightness criterion we use is a moment bound.
Lemma 8. The sequence $\{X^m\}$ is tight with respect to $C_{\mathbb{R}^d}[0,T]$ if $\{X^m_0\}$ is tight, and there exist global constants $a \ge 0$ and $b > 1$, and a nondecreasing continuous function $F : [0,T] \to \mathbb{R}$ such that
\[ E[|X^m_t - X^m_s|^a] \le |F(t) - F(s)|^b, \tag{3.4.51} \]
for all $0 \le t, s \le T$ and $m > 0$.
Proof. See Theorem 12.3 [Bil99].
To use this, we prove a lemma about fourth moments of stochastic integrals with respect to local martingales. First we define some notation. For $M_t$, $N_t$ local martingales with respect to the filtration $\mathcal{F}_t$, define the joint quadratic variation as
\[ \langle M, N\rangle_t = \frac{1}{4}\left(\langle M + N\rangle_t - \langle M - N\rangle_t\right). \tag{3.4.52} \]
Note that $\langle M, M\rangle_t = \langle M\rangle_t$, the quadratic variation of $M_t$.
Lemma 9. Let $M_t$ with paths in $C_{\mathbb{R}^k}[0,T]$ be a continuous local martingale with quadratic variation $\langle M\rangle_t$. Then for any continuous function $f : [0,T] \to \mathbb{R}^{d\times k}$ such that $\int_0^T |f_s|^2\,d\langle M\rangle_s < \infty$ with probability one,
\[ E\left[\left|\int_0^t f_s\,dM_s\right|^4\right] = 6\int_0^t\!\!\int_0^s E[|f_s|^2|f_r|^2]\,d\langle M\rangle_r\,d\langle M\rangle_s, \tag{3.4.53} \]
where the expectation on the right hand side is finite with probability one and the quadratic variation $\langle M\rangle_t$ is non-random.
Proof. Set $Y_t = \int_0^t f_s\,dM_s$. Then, by Itô's formula,
\begin{align}
d|Y_t|^4 &= 4|Y_t|^2 Y_t^*\,dY_t + 6|Y_t|^2\,d\langle Y, Y\rangle_t \notag\\
&= 4|Y_t|^2 Y_t^*\,dY_t + 6|Y_t|^2|f_t|^2\,d\langle M\rangle_t. \tag{3.4.54}
\end{align}
We are interested in $E[|Y_t|^4]$, and since $Y_t$ is a martingale, $\int_0^t 4|Y_s|^2 Y_s^*\,dY_s$ is a martingale and thus $E[\int_0^t 4|Y_s|^2 Y_s^*\,dY_s] = 0$. Now we use Itô's formula for $|Y_t|^2$ to obtain
\begin{align}
d|Y_t|^2 &= 2Y_t^*\,dY_t + d\langle Y, Y\rangle_t \tag{3.4.55}\\
&= 2Y_t^*\,dY_t + |f_t|^2\,d\langle M\rangle_t. \tag{3.4.56}
\end{align}
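Substituting the expression for $|Y_t|^2$ into the integral form of equation (3.4.54) with the expectation yields
\begin{align}
E[|Y_t|^4] &= E\left[6\int_0^t\left(\int_0^s 2Y_r^*\,dY_r + \int_0^s |f_r|^2\,d\langle M\rangle_r\right)|f_s|^2\,d\langle M\rangle_s\right] \tag{3.4.57}\\
&= 6E\left[\int_0^t\left(\int_0^s 2Y_r^*\,dY_r\right)|f_s f_s^*|\,d\langle M\rangle_s\right] \tag{3.4.58}\\
&\quad + 6E\left[\int_0^t\!\!\int_0^s |f_r|^2|f_s|^2\,d\langle M\rangle_r\,d\langle M\rangle_s\right]. \tag{3.4.59}
\end{align}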
The second term of the right hand side is the result, and we now show that the first term is zero. We use the integration by parts formula (see [RW00] Theorem IV.32.4) to obtain
\begin{align}
&6E\left[\int_0^t\left(\int_0^s 2Y_r^*\,dY_r\right)|f_s f_s^*|\,d\langle M\rangle_s\right] \tag{3.4.60}\\
&\quad= 6E\left[\int_0^t 2Y_s^*\,dY_s\int_0^t |f_s f_s^*|\,d\langle M\rangle_s - \int_0^t\left(\int_0^s |f_r|^2\,d\langle M\rangle_r\right)2Y_s^*\,dY_s\right] \tag{3.4.61}\\
&\qquad + 6E\left[\left\langle\int_0^t |f_s|^2\,d\langle M\rangle_s,\ \int_0^t 2Y_s^*\,dY_s\right\rangle\right]. \tag{3.4.62}
\end{align}
Because $Y_t$ is a martingale, the first two terms of the right hand side are martingales with respect to $\mathcal{F}_t$. The last term is the joint quadratic variation of a local martingale ($\int_0^t 2Y_s^*\,dY_s$) and a process of bounded variation ($\int_0^t |f_s|^2\,d\langle M\rangle_s$). Therefore the joint quadratic variation is zero. Q.E.D.
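As a quick sanity check of formula (3.4.53), one can specialize to the simplest case. The choices below ($d = k = 1$, $M_t = W_t$ a standard Wiener process so that $\langle W\rangle_t = t$, and $f \equiv 1$) are assumptions made only for this illustration; the formula then reduces to the familiar fourth moment of a Gaussian.

```latex
% Illustrative special case of (3.4.53): d = k = 1, M_t = W_t, f \equiv 1
\begin{align*}
6\int_0^t\!\!\int_0^s E\big[|f_s|^2|f_r|^2\big]\,d\langle W\rangle_r\,d\langle W\rangle_s
  &= 6\int_0^t\!\!\int_0^s dr\,ds
   = 6\cdot\frac{t^2}{2} = 3t^2,\\
E\big[W_t^4\big] &= 3t^2 \quad\text{since } W_t \sim N(0,t).
\end{align*}
```

Both sides agree, which is consistent with the constant 6 in the lemma.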
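Proof of Lemma 5. For SDE (3.4.4), the initial conditions $X^{m,b}_0 = X_0$ and $H^m_0 = H_0$ are independent of $m$ and $U^m_0 = 0$. Therefore the first part of Lemma 7 is satisfied. Convergence of $(U^m, H^m)$ in probability with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T]$ implies that $\{U^m, H^m\}$ is tight, and the second condition of Lemma 7 requires us to estimate $P(\sup_{|t-s|<\delta}|X^{m,b}_t - X^{m,b}_s| \ge \epsilon)$. Note that
\begin{align}
&P\left(\sup_{|t-s|<\delta}|X^{m,b}_t - X^{m,b}_s| \ge \epsilon\right) \tag{3.4.63}\\
&\quad= P\left(\sup_{|t-s|<\delta}\left|U^m_t - U^m_s + \int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dH^m_r\right| \ge \epsilon\right) \tag{3.4.64}\\
&\quad\le P\left(\sup_{|t-s|<\delta}|U^m_t - U^m_s| + \sup_{|t-s|<\delta}\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dH^m_r\right| \ge \epsilon\right) \tag{3.4.65}\\
&\quad\le P\left(\sup_{|t-s|<\delta}|U^m_t - U^m_s| \ge \frac{\epsilon}{2}\right) \tag{3.4.66}\\
&\qquad + P\left(\sup_{|t-s|<\delta}\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dH^m_r\right| \ge \frac{\epsilon}{2}\right), \tag{3.4.67}
\end{align}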
and it is sufficient to show that $\{U^m\}$ and $\{\int f(X^{m,b})\,\chi\{0 \le t < \tau^m_{b,f}\}\,dH^m\}$ are tight separately. Convergence of $U^m$ in probability to zero with respect to $C_{\mathbb{R}^d}[0,T]$ implies $\{U^m\}$ is relatively compact and hence tight.
For tightness of $\{\int f(X^{m,b})\,\chi\{0 \le t < \tau^m_{b,f}\}\,dH^m\}$ we use Lemma 8. The initial condition $\int_0^0 f(X^{m,b}_s)\,\chi\{0 \le t < \tau^m_{b,f}\}\,dH^m_s = 0$ for all $m > 0$, thus $\{\int_0^0 f(X^{m,b}_s)\,\chi\{0 \le t < \tau^m_{b,f}\}\,dH^m_s\}$ is tight. We will achieve the bound in equation (3.4.51) with $a = 4$.
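Consider
\begin{align}
&E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dH^m_r\right|^4\right] \tag{3.4.68}\\
&\quad= E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dM^m_r + \int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dA^m_r\right|^4\right] \tag{3.4.69}\\
&\quad\le 8E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dM^m_r\right|^4\right] \tag{3.4.70}\\
&\qquad + 8E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dA^m_r\right|^4\right]. \tag{3.4.71}
\end{align}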
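We will estimate each term of the right hand side separately. For the first term on the right hand side of inequality (3.4.68), we use Lemma 9 to obtain
\begin{align}
&8E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dM^m_r\right|^4\right] \tag{3.4.73}\\
&\quad= 48E\left[\int_s^t\!\!\int_s^r |f(X^{m,b}_u)|^2|f(X^{m,b}_r)|^2\,\chi\{0 \le r < \tau^m_{b,f}\}\,d\langle M^m\rangle_u\,d\langle M^m\rangle_r\right] \tag{3.4.74}\\
&\quad\le 48b^4 E\left[\int_s^t\!\!\int_s^r d\langle M^m\rangle_u\,d\langle M^m\rangle_r\right] \le 48b^4 E\left[(\langle M^m\rangle_t - \langle M^m\rangle_s)^2\right]. \tag{3.4.75}
\end{align}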
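The second term on the right hand side of inequality (3.4.68) is bounded using a standard inequality,
\begin{align}
8E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dA^m_r\right|^4\right] &\le 8b^4 E\left[\left|\int_s^t dV_r(A^m)\right|^4\right] \tag{3.4.76}\\
&\le 8b^4 E\left[(V_t(A^m) - V_s(A^m))^4\right]. \tag{3.4.77}
\end{align}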
Next define $F_t : [0,T] \to \mathbb{R}$ as
\[ F_t = \max\left\{\sup_m 2\sqrt{14}\,b^2 E[(\langle M^m\rangle_t)],\ \sup_m 2\sqrt{2}\,b^2 E[(V_t(A^m))^2]\right\}. \tag{3.4.78} \]
By the justification given in the proof of Lemma 6, there exist a constant $c_\alpha$ and a stopping time $\tau^m_{c_\alpha} = \inf\{t : |M^m_t| \ge c_\alpha \text{ or } V_t(A^m) \ge c_\alpha\}$ such that $P(\tau^m_{c_\alpha} \le \alpha) \le 1/\alpha$, thus $\tau^m_{c_\alpha} > T$ with probability 1 and $\sup_m E[\langle M^m\rangle_{t\wedge\tau^m_{c_\alpha}} + V_{t\wedge\tau^m_{c_\alpha}}(A^m)] < \infty$. Therefore, $F$ is finite. Because $\langle M^m\rangle_t$ and $V_t(A^m)$ are increasing continuous functions of $t$, so is $F_t$. From inequalities (3.4.75) and (3.4.77),
\[ E\left[\left|\int_s^t f(X^{m,b}_r)\,\chi\{0 \le r < \tau^m_{b,f}\}\,dH^m_r\right|^4\right] \le (F_t - F_s)^2. \tag{3.4.79} \]
By Lemma 8, $\{X^{m,b}\}$ is tight and hence relatively compact in $C_{\mathbb{R}^d}[0,T]$.
The function $f$ is assumed to be continuous and, because $\{X^{m,b}\}$ is relatively compact, $\{f(X^{m,b})\}$ is relatively compact with respect to $C_{\mathbb{R}^{d\times k}}[0,T]$. Note the convergence is with respect to the space of continuous functions with the uniform metric on the time interval $[0,T]$. So $\tau^m_{b,f} = \inf\{t \in [0,T] : |f(X^{m,b}_t)| \ge b\}$ is bounded above by $T$ and $\{\tau^m_{b,f}\}$ is relatively compact. Therefore $\{U^m, H^m, X^{m,b}, \tau^m_{b,f}\}$ is relatively compact in $C_{\mathbb{R}^d\times\mathbb{R}^k\times\mathbb{R}^d}[0,T] \times [0,T]$.
Let $\{U, H, X^b, \tau^0_{b,f}\}$ denote a weak limit point, where $X^b$ depends on $b$. Then, by the Skorokhod representation theorem [Bil99], there exists a subsequence $\{m_n\}$ such that $\{U^{m_n}, H^{m_n}, X^{m_n,b}, \tau^{m_n}_{b,f}\} \to \{0, H, X^b, \tau^0_{b,f}\}$ as $n \to \infty$ almost surely. By Lemma 6, the limit point satisfies the SDE
\[ X^b_t = X_0 + \int_0^t f(X^b_s)\,dH_s, \tag{3.4.80} \]
for $t < \tau^0_{b,f}$.
Note that
\[ \tau_{b,f} = \inf\{t : |f(X^b_t)| \ge b\} \le \lim_{n\to\infty}\inf\{t : |f(X^{m_n,b}_t)| \ge b\} = \tau^0_{b,f} \tag{3.4.81} \]
and there exists a possible gap between $\tau_{b,f}$ and $\tau^0_{b,f}$ where the sequence $X^{m_n,b}$ will not converge. Take $d < b$; then $\tau_{d,f} < \tau_{b,f}$, or else $f$ would have a jump at $f(x) = b$, which contradicts the assumption that $f$ is continuous. Thus $\{U^{m_n}, H^{m_n}, X^{m_n,b}, \tau^{m_n}_{d,f}\} \Rightarrow \{0, H, X^b, \tau^0_{d,f}\}$ as $n \to \infty$, where $X^b$ is the solution of equation (3.4.80) for $0 \le t \le \tau_{d,f}$. By assumption, there exists a unique global solution to equation (3.4.2) and we take $d \to \infty$. Therefore, $(U^m, H^m, X^m)$ converges in distribution (weakly) to $(0, H, X)$ as $m \to 0$ with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k\times\mathbb{R}^d}[0,T]$.
Given that $(U^m, H^m)$ converges in probability to $(0, H)$ with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T]$, we must show convergence of $X^m$ to $X$ in probability. Let $f : C_{\mathbb{R}^d}[0,T] \to \mathbb{R}$ be a bounded continuous function and let $g : C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T] \to \mathbb{R}$ be a bounded continuous function. The weak convergence of the triple $(U^m, H^m, X^m)$, with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k\times\mathbb{R}^d}[0,T]$, implies
\[ \lim_{m\to 0} E[f(X^m)g(U^m, H^m)] = E[f(X)g(U, H)]. \tag{3.4.82} \]
Convergence of $(U^m, H^m)$ in probability, with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k}$, implies
\begin{align}
&|E[f(X^m)g(U, H)] - E[f(X)g(U, H)]| = \big|E[f(X^m)(g(U, H) - g(U^m, H^m))] \tag{3.4.83}\\
&\qquad + E[f(X^m)g(U^m, H^m)] - E[f(X)g(U, H)]\big| \tag{3.4.84}\\
&\quad\le C\,E[|g(U, H) - g(U^m, H^m)|] + |E[f(X^m)g(U^m, H^m)] - E[f(X)g(U, H)]|. \tag{3.4.85}
\end{align}
By the continuous mapping theorem, $(U^m, H^m) \to (U, H)$ in probability implies $g(U^m, H^m) \to g(U, H)$ in probability. Therefore the right hand side of equation (3.4.85) converges to zero as $m \to 0$ and
\[ \lim_{m\to 0} E[f(X^m)g(U, H)] = E[f(X)g(U, H)]. \tag{3.4.86} \]
Because $X$ is the unique global solution to the SDE (3.4.2), there exists a bounded and measurable function $\tilde{g} : C_{\mathbb{R}^d\times\mathbb{R}^k}[0,T] \to \mathbb{R}$ such that $f(X) = \tilde{g}(U, H)$. We can approximate (in $L^1$) any measurable function by a continuous function, hence $\tilde{g}$ is assumed to be continuous. Therefore,
\begin{align}
\lim_{m\to 0} E[(f(X^m) - f(X))^2] &= \lim_{m\to 0} E[f(X^m)^2 - 2f(X^m)f(X) + f(X)^2] \tag{3.4.87}\\
&= \lim_{m\to 0} E[f(X^m)^2 - 2f(X^m)\tilde{g}(U, H) + f(X)^2] \tag{3.4.88}\\
&= 0. \tag{3.4.89}
\end{align}
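Recall that $f$ is a continuous function of $X^m$ with paths in $C_{\mathbb{R}^d}[0,T]$ and for all $\epsilon > 0$, there exists a $\delta(\epsilon) > 0$ such that
\[ P\left(\sup_{0\le t\le T}|X^m_t - X_t| > \delta(\epsilon)\right) \le P(|f(X^m) - f(X)| > \epsilon). \tag{3.4.90} \]
Using Chebyshev's inequality, and taking the limit,
\begin{align}
\lim_{m\to 0} P\left(\sup_{0\le t\le T}|X^m_t - X_t| > \delta(\epsilon)\right) &\le \lim_{m\to 0} P(|f(X^m) - f(X)| > \epsilon) \tag{3.4.91}\\
&\le \lim_{m\to 0}\frac{1}{\epsilon^2}E[(f(X^m) - f(X))^2] \tag{3.4.92}\\
&= 0 \tag{3.4.93}
\end{align}
and $(U^m, H^m, X^m) \to (U, H, X)$ in probability with respect to $C_{\mathbb{R}^d\times\mathbb{R}^k\times\mathbb{R}^d}$. Q.E.D.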
3.4.2 Convergence of the Langevin equation
Proof of Theorem 1. We first state and prove a lemma about the convergence of the process $mv^m$ to zero. First we define a compact set where $x^m_t$ is contained for a sufficiently long time for all $m$. Because $\mathcal{U}$ is open, there exists a sequence of compact sets such that $\mathcal{U}_k \subseteq \mathcal{U}_{k+1}$ for all $k$ and $\cup_k\,\mathcal{U}_k = \mathcal{U}$.
Lemma 10. For each $m$, let $x^m_t$ be any process with continuous paths in $\mathcal{U}_k$ and define $v^m_t$ as the solution to the SDE given by the second equation in (1.1.4) with functions $F$, $\gamma$, and $\sigma$ satisfying Assumptions 1-2. Then $mv^m \to 0$ as $m \to 0$ in $L^2$, and hence in probability, with respect to $C_{\mathbb{R}^d}[0,T]$, i.e.
\[ \lim_{m\to 0} E\left[\left(\sup_{0\le t\le T}|mv^m_t|\right)^2\right] = 0, \tag{3.4.94} \]
and, for all $\epsilon > 0$,
\[ \lim_{m\to 0} P\left(\sup_{0\le t\le T}|mv^m_t| > \epsilon\right) = 0. \tag{3.4.95} \]
Proof. We solve the second equation in (1.1.4) for $v^m_t$ exactly. Because it is a linear SDE with additive noise and variable coefficients (see [Arn74] and [KP92] Sec. 4.4) the solution is
\[ v^m_t = \Phi(t)v + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\,ds + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\,dW_s, \tag{3.4.96} \]
where $0 \le t \le T$ and $\Phi(t) : [0,T] \to \mathbb{R}^{d\times d}$ is the fundamental solution matrix of the matrix differential equation
\[ \frac{d}{dt}\Phi(t) = -\frac{\gamma(x^m_t)}{m}\Phi(t). \tag{3.4.97} \]
Note that $x^m_t$ is of bounded variation and therefore all interpretations of the stochastic integral in equation (3.4.96) are equivalent to the Itô interpretation. For any continuous function $f(t) : [0,\infty) \to \mathbb{R}^d$ an estimate from [Har02] (Lemma 4.2) yields
\[ \left|\int_0^t \Phi(t)\Phi^{-1}(s)f(s)\,ds\right| \le C_d\int_0^t \exp\left\{-\int_s^t \frac{\lambda_0(x^m_r)}{m}\,dr\right\}|f(s)|\,ds, \tag{3.4.98} \]
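where $C_d$ is a constant dependent only on the dimension $d$, and $\lambda_0$ is the smallest eigenvalue of $\gamma$. Using the time substitutions $\tilde{r} = r/m$ and $\tilde{s} = s/m$,
\begin{align}
\sup_{0\le t\le T}\left|\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\,ds\right| &\le \int_0^T |\Phi(t)\Phi^{-1}(s)F(x^m_s)|\,ds \tag{3.4.99}\\
&\le C_d\int_0^T \exp\left\{-\int_s^T \frac{\lambda_0(x^m_r)}{m}\,dr\right\}|F(x^m_s)|\,ds \notag\\
&= mC_d\int_0^{T/m} \exp\left\{-\int_{\tilde{s}}^{T/m} \lambda_0(x^m_{m\tilde{r}})\,d\tilde{r}\right\}|F(x^m_{m\tilde{s}})|\,d\tilde{s}. \tag{3.4.100}
\end{align}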
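Recall that $x^m_t$ lies in the compact set $\mathcal{U}_k$, and define $C_k$ as the constant that bounds the continuous functions $|F(x)|$, $|\gamma(x)|$, and $|\sigma(x)|$ for all $x \in \mathcal{U}_k$. Also, by Assumption 1, $\lambda_0(x) \ge c_\lambda > 0$ for all $x$. Therefore
\[ \sup_{0\le t\le T}\left|\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\,ds\right| \le mC_d\int_0^{T/m} \exp\left\{-c_\lambda\left(\frac{T}{m} - \tilde{s}\right)\right\}C_k\,d\tilde{s} \le C_{k,d}\,m, \tag{3.4.101} \]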
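for some constant $C_{k,d}$ dependent on $k$ and $d$. For the stochastic integral, using Itô isometry,
\[ E\left[\left|\int_0^T \Phi(T)\Phi^{-1}(s)\sigma(x^m_s)\,dW_s\right|^2\right] \le \int_0^T E\left[|\Phi(T)\Phi^{-1}(s)|^2|\sigma(x^m_s)|^2\right]ds. \tag{3.4.102} \]
Performing a change of variables and using similar bounds as before, we obtain
\begin{align}
E\left[\left|\int_0^T \Phi(T)\Phi^{-1}(s)\sigma(x^m_s)\,dW_s\right|^2\right] &\le mC_k^2 C_d\int_0^{T/m} E\left[\exp\left\{-c_\lambda\left(\frac{T}{m} - \tilde{s}\right)\right\}\right]d\tilde{s} \notag\\
&\le \tilde{C}_{k,d}\,m. \tag{3.4.103}
\end{align}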
E[ sup
0≤t≤T
|mvtm |2 ]
≤3E
2
sup |mΦ(t)v|
(3.4.104)
0≤t≤T
"
Z t
2 #
−1
m
+3E sup Φ(t)Φ (s)F (xs ) ds
0≤t≤T
0
"
Z t
2 #
−1
m
+3E sup Φ(t)Φ (s)σ(xs ) dWs ≤ Cm,
0≤t≤T
0
where we have used the Cauchy-Schwartz inequality,
N 2
N
X X
ai ≤ N
|ai |2 ,
i=1
(3.4.105)
i=1
and Doob’s maximal inequality for the Itô integral, which is a martingale. Therefore
"
2 #
≤ Cm.
(3.4.106)
E
sup |mvt |
0≤t≤T
Q.E.D.
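The statement of Lemma 10 can be explored numerically in the simplest setting. The sketch below is illustrative only and is not the method of the proof: it assumes $d = 1$, $F = 0$, and constant $\gamma$ and $\sigma$, so that $m\,dv = -\gamma v\,dt + \sigma\,dW$ is an Ornstein-Uhlenbeck equation whose transition law can be sampled exactly. The function name `sup_abs_mv` and all default parameter values are hypothetical choices.

```python
import math
import random

def sup_abs_mv(m, gamma=1.0, sigma=1.0, T=1.0, dt=1e-5, v0=0.0, seed=0):
    """Return sup over the time grid of |m * v_t| for m dv = -gamma v dt + sigma dW.

    Assumptions for this sketch: scalar process, F = 0, constant gamma and
    sigma. The exact Ornstein-Uhlenbeck transition is used, so the update is
    stable even when dt is not small compared with m / gamma.
    """
    rng = random.Random(seed)
    decay = math.exp(-gamma * dt / m)
    # Stationary variance of v is sigma^2 / (2 gamma m); the one-step noise
    # variance is that value times (1 - decay^2).
    noise_std = math.sqrt(sigma ** 2 / (2.0 * gamma * m) * (1.0 - decay ** 2))
    v = v0
    sup = abs(m * v0)
    for _ in range(int(round(T / dt))):
        v = decay * v + noise_std * rng.gauss(0.0, 1.0)
        sup = max(sup, abs(m * v))
    return sup

if __name__ == "__main__":
    for m in (1e-2, 1e-3, 1e-4):
        print(m, sup_abs_mv(m))
```

In this toy model $mv_t$ has stationary standard deviation $\sigma\sqrt{m/(2\gamma)}$, so the printed suprema should shrink as $m$ decreases, in line with the $O(m)$ bound on the second moment in (3.4.106).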
To determine the limit of SDE (1.1.4) as $m \to 0$, we use the equation for $v^m_t$ to write
\[ \gamma(x^m_t)v^m_t\,dt = F(x^m_t)\,dt + \sigma(x^m_t)\,dW_t - m\,dv^m_t. \tag{3.4.107} \]
By Assumption 1, $\gamma(x)$ is invertible, thus
\[ dx^m_t = v^m_t\,dt = \gamma^{-1}(x^m_t)F(x^m_t)\,dt + \gamma^{-1}(x^m_t)\sigma(x^m_t)\,dW_t - m\gamma^{-1}(x^m_t)\,dv^m_t, \tag{3.4.108} \]
or, in integral form,
\[ x^m_t = x + \int_0^t \gamma^{-1}(x^m_s)F(x^m_s)\,ds + \int_0^t \gamma^{-1}(x^m_s)\sigma(x^m_s)\,dW_s - \int_0^t m\gamma^{-1}(x^m_s)\,dv^m_s. \tag{3.4.109} \]
Remark 2. One may be tempted to use Lemma 5 on the above equation because $mv^m_t \to 0$. However, the theorem would yield the limiting equation
\[ dx_t = \gamma^{-1}(x_t)F(x_t)\,dt + \gamma^{-1}(x_t)\sigma(x_t)\,dW_t. \tag{3.4.110} \]
This is not the equation proven to be satisfied (see [SSMD82, CBEA12] for the one dimensional case and [FH11] for a proof that $x^m_t \in \mathbb{R}^d$ does not converge to the solution of equation (3.4.110)). In view of Lemma 10, if $\gamma(x) = \gamma_0$ is a constant matrix for all $x$, then
\[ \lim_{m\to 0} E\left[\left(\sup_{0\le t\le T}\left|\int_0^t m\gamma_0^{-1}\,dv^m_s\right|\right)^2\right] = \lim_{m\to 0} E\left[\left(\sup_{0\le t\le T}\left|\gamma_0^{-1}mv^m_t - \gamma_0^{-1}mv\right|\right)^2\right] = 0, \tag{3.4.111} \]
and hence in probability, similarly to [Nel67, Fre04]. However, with $\gamma(x)$ dependent on position, the limit will be non-zero because $mv^m_t$ does not satisfy Condition 1.
Note that from the SDE (1.1.4) for $dv^m_t$,
\[ mv^m_t = mv + \underbrace{\int_0^t \left(F(x^m_s) - \gamma(x^m_s)v^m_s\right)ds}_{A^m_t\ \text{(bounded variation)}} + \underbrace{\int_0^t \sigma(x^m_s)\,dW_s}_{M^m_t\ \text{(local martingale)}}. \tag{3.4.112} \]
Because the limits of integration are finite, $A^m_t$ has bounded variation for fixed $m > 0$. Note that $O(V_t(A^m)) = O(v^m_t)$. Recall the solution for $v^m_t$,
\[ v^m_t = \Phi(t)v + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\,ds + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\,dW_s. \tag{3.4.113} \]
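From the bounds in the proof of Lemma 10, the correct scaling is a time substitution of $\tilde{s} = s/m$. Thus,
\[ v^m_t = \Phi(t)v + \int_0^{t/m} \Phi(t)\Phi^{-1}(\tilde{s}m)F(x^m_{\tilde{s}m})\,d\tilde{s} + \frac{1}{m}\int_0^{t/m} \Phi(t)\Phi^{-1}(\tilde{s}m)\sigma(x^m_{\tilde{s}m})\,dW_{\tilde{s}m}. \tag{3.4.114} \]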
The first two terms will be bounded in $m$. The last term is one that grows unbounded because the Wiener process needs to be rescaled. Thus,
\begin{align}
v^m_t &\approx \frac{1}{m}\int_0^{t/m} \Phi(t)\Phi^{-1}(\tilde{s}m)\sigma(x^m_{\tilde{s}m})\,dW_{\tilde{s}m} = \frac{1}{m^{3/2}}\int_0^{t/m} \Phi(t)\Phi^{-1}(\tilde{s}m)\sigma(x^m_{\tilde{s}m})\,dW_{\tilde{s}} \notag\\
&= O(m^{-1/2}), \tag{3.4.115}
\end{align}
where the $1/m$ power is integrated out as in the proof of Lemma 10. Therefore $O(V_t(A^m)) = O(m^{-1/2})$ and Lemma 5 cannot be used.
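The $m^{-1/2}$ growth of the total variation of $A^m$ can be observed in a toy simulation. This is a hedged sketch under simplifying assumptions that are not in the original text: $d = 1$, $F = 0$, constant $\gamma$ and $\sigma$, so that $V_T(A^m) = \int_0^T |\gamma v^m_s|\,ds$. The function name `drift_variation` and the parameter defaults are hypothetical. In this model the stationary mean of $|v|$ is $\sigma/\sqrt{\pi\gamma m}$, so one expects $V_T(A^m) \approx \sigma T\sqrt{\gamma/(\pi m)}$.

```python
import math
import random

def drift_variation(m, gamma=1.0, sigma=1.0, T=1.0, dt=1e-5, seed=1):
    """Riemann-sum estimate of V_T(A^m) = int_0^T |gamma * v_s| ds with F = 0.

    Assumptions for this sketch: scalar Ornstein-Uhlenbeck velocity with
    constant gamma and sigma, started at v = 0 and sampled exactly on a grid.
    """
    rng = random.Random(seed)
    decay = math.exp(-gamma * dt / m)
    noise_std = math.sqrt(sigma ** 2 / (2.0 * gamma * m) * (1.0 - decay ** 2))
    v = 0.0
    variation = 0.0
    for _ in range(int(round(T / dt))):
        variation += abs(gamma * v) * dt    # |dA| = |gamma v| dt when F = 0
        v = decay * v + noise_std * rng.gauss(0.0, 1.0)
    return variation

def predicted_variation(m, gamma=1.0, sigma=1.0, T=1.0):
    """Heuristic stationary prediction sigma * T * sqrt(gamma / (pi m))."""
    return sigma * T * math.sqrt(gamma / (math.pi * m))

if __name__ == "__main__":
    for m in (1e-2, 1e-3, 1e-4):
        print(m, drift_variation(m), predicted_variation(m))
```

Reducing $m$ by a factor of 100 should increase the estimate by roughly a factor of 10, which is why $\{V_t(A^m)\}$ fails to be stochastically bounded uniformly in $m$ and Condition 1 does not hold for $mv^m$.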
3.4.3 Integration by parts to satisfy assumption of the limit theorem
We now continue the proof of the main result. To determine the limit as $m \to 0$, we consider the one dimensional integrals (using Einstein summation notation and noting that $v^m_0 = v$),
\begin{align}
\int_0^t m(\gamma^{-1})_{ij}(x^m_s)\,d(v^m_s)_j ={}& (\gamma^{-1})_{ij}(x^m_t)\,m(v^m_t)_j - (\gamma^{-1})_{ij}(x)\,mv_j \notag\\
& - \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)m(v^m_s)_j\,d(x^m_s)_\alpha, \tag{3.4.116}
\end{align}
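by integration by parts (note that $x^m_t$ has bounded variation, hence the Itô term in the integration by parts formula is zero). For $d(x^m_t)_\alpha = (v^m_t)_\alpha\,dt$ we use equation (3.4.108),
\begin{align}
\int_0^t m(\gamma^{-1})_{ij}(x^m_s)\,d(v^m_s)_j ={}& (\gamma^{-1})_{ij}(x^m_t)(mv^m_t)_j - (\gamma^{-1})_{ij}(x)(mv)_j \tag{3.4.117}\\
& - \int_0^t \frac{\partial}{\partial x_\alpha}\left[(\gamma^{-1})_{ij}(x^m_s)\right](mv^m_s)_j\Big[(\gamma^{-1})_{\alpha n}(x^m_s)F_n(x^m_s)\,ds \notag\\
& + (\gamma^{-1})_{\alpha n}(x^m_s)\sigma_{n\ell}(x^m_s)\,d(W_s)_\ell - m(\gamma^{-1})_{\alpha n}(x^m_s)\,d(v^m_s)_n\Big]. \notag
\end{align}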
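Define
\begin{align}
(\hat{U}^m_t)_i ={}& (\gamma^{-1})_{ij}(x^m_t)(mv^m_t)_j - (\gamma^{-1})_{ij}(x)(mv)_j \notag\\
& - \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)(mv^m_s)_j\Big[(\gamma^{-1})_{\alpha n}(x^m_s)F_n(x^m_s)\,ds \notag\\
& + (\gamma^{-1})_{\alpha n}(x^m_s)\sigma_{n\ell}(x^m_s)\,d(W_s)_\ell\Big]. \tag{3.4.118}
\end{align}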
We later show that $\hat{U}^m$ converges to zero, as a process, as $m \to 0$ [Lemma 11]. To identify the limit of the remaining term in
\begin{align}
\int_0^t m(\gamma^{-1})_{ij}(x^m_s)\,d(v^m_s)_j ={}& (\hat{U}^m_t)_i \tag{3.4.119}\\
& + \int_0^t \frac{\partial}{\partial x_\alpha}\left[(\gamma^{-1})_{ij}(x^m_s)\right](\gamma^{-1})_{\alpha n}(x^m_s)\underbrace{(mv^m_s)_j\,d(mv^m_s)_n}_{(dZ^m_s)_{jn}}, \notag
\end{align}
we will show that $Z^m_t = \int_0^t mv^m_s\,d(mv^m_s)^*$ has a limit and rewrite it as a process that satisfies Condition 1.
3.4.4 Lyapunov equation
Define the matrix-valued process $Z^m_t$ with paths in $C_{\mathbb{R}^{d\times d}}[0,T]$ as
\[ Z^m_t = \int_0^t mv^m_s\,d(mv^m_s)^*. \tag{3.4.120} \]
Its entries are $(Z^m_t)_{jk} = \int_0^t (m(v^m_s)_j)\,d(m(v^m_s)_k)$ and by Itô's formula for $d(m(v^m_s)_j\,m(v^m_s)_k)$,
\[ dZ^m_t + d(Z^m_t)^* = d[mv^m_t(mv^m_t)^*] - \sigma(x^m_t)\sigma^*(x^m_t)\,dt. \tag{3.4.121} \]
The diagonal elements of $Z^m_t$ are thus
\[ (Z^m_t)_{ii} = \frac{1}{2}\left([(mv^m_t)_i]^2 - (mv_i)^2\right) - \int_0^t \frac{1}{2}\sigma_{i\ell}^2(x^m_s)\,ds. \tag{3.4.122} \]
For the off-diagonal elements we need more information. It is contained in the differential
\begin{align}
dZ^m_t = mv^m_t\,d(mv^m_t)^* ={}& mv^m_t F^*(x^m_t)\,dt - mv^m_t\left(\gamma(x^m_t)v^m_t\right)^*dt \notag\\
& + mv^m_t\left(\sigma(x^m_t)\,dW_t\right)^*. \tag{3.4.123}
\end{align}
We expect the integrals of the first and the third term to converge to zero as $m \to 0$ in $L^2$. Define
\[ \tilde{U}^m_t = \int_0^t mv^m_s F^*(x^m_s)\,ds + \int_0^t mv^m_s\left(\sigma(x^m_s)\,dW_s\right)^*. \tag{3.4.124} \]
Therefore,
\[ dZ^m_t = d\tilde{U}^m_t - mv^m_t(v^m_t)^*\gamma^*(x^m_t)\,dt, \tag{3.4.125} \]
where we later show $\tilde{U}^m_t \to 0$ in $L^2$ [Lemma 11]. Substituting the above equation into equation (3.4.121) yields
\begin{align}
& d\tilde{U}^m_t - mv^m_t(v^m_t)^*\gamma^*(x^m_t)\,dt + d(\tilde{U}^m_t)^* - \gamma(x^m_t)\,mv^m_t(v^m_t)^*\,dt \tag{3.4.126}\\
&\quad= d[mv^m_t(mv^m_t)^*] - \sigma(x^m_t)\sigma^*(x^m_t)\,dt. \notag
\end{align}
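Moving the $\tilde{U}$ terms to the right hand side we obtain
\begin{align}
& -mv^m_t(v^m_t)^*\gamma^*(x^m_t)\,dt - \gamma(x^m_t)\,mv^m_t(v^m_t)^*\,dt \tag{3.4.127}\\
&\quad= d[mv^m_t(mv^m_t)^*] - \sigma(x^m_t)\sigma^*(x^m_t)\,dt - d\tilde{U}^m_t - d(\tilde{U}^m_t)^*. \notag
\end{align}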
The goal is to write the differential $mv^m_t(v^m_t)^*\,dt$ in a different manner and substitute it back into equation (3.4.125). Equation (3.4.127) can be written as
\begin{align}
& [mv^m_t(v^m_t)^*\,dt][-\gamma^*(x^m_t)] + [-\gamma(x^m_t)][mv^m_t(v^m_t)^*\,dt] \tag{3.4.128}\\
&\quad= d[mv^m_t(mv^m_t)^*] - \sigma(x^m_t)\sigma^*(x^m_t)\,dt - d\tilde{U}^m_t - d(\tilde{U}^m_t)^*. \notag
\end{align}
Defining $V = mv^m_t(v^m_t)^*\,dt$, the above matrix equation is of the form
\[ AV + VA^* = C, \tag{3.4.129} \]
where $A = -\gamma(x^m_t)$ and $C$ is the right hand side of equation (3.4.128). This equation is referred to as Lyapunov's equation (or, in general, Sylvester's equation) [Ort87, Bel97]. By [Ort87] Theorem 6.4.2, if the real parts of all the eigenvalues of $A$ are negative then there exists a unique solution to Lyapunov's equation for all $C$. Furthermore, it is given by (see [Bel97], Chapter 11)
\[ V = -\int_0^\infty e^{Ay}Ce^{A^*y}\,dy. \tag{3.4.130} \]
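A one-line scalar check may help fix the sign conventions in (3.4.129) and (3.4.130). The scalar setting $A = -a$ with $a > 0$ and real $C$ is an assumption made only for this illustration.

```latex
% Scalar check of (3.4.129)-(3.4.130), assuming A = -a with a > 0
\begin{align*}
AV + VA^* = -2aV = C \quad&\Longrightarrow\quad V = -\frac{C}{2a},\\
-\int_0^\infty e^{Ay}\,C\,e^{A^*y}\,dy = -C\int_0^\infty e^{-2ay}\,dy &= -\frac{C}{2a}.
\end{align*}
```

The integral converges precisely because $a > 0$, which mirrors the requirement that the eigenvalues of $A$ have negative real parts.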
Letting $A = -\gamma(x^m_t)$, by Assumption 1, $-\gamma(x)$ has strictly negative eigenvalues and there exists a unique solution to the Lyapunov equation (3.4.128). Because a solution exists and is unique, the integral on the right hand side of equation (3.4.130) exists and is finite. Thus, solving equation (3.4.128) for $V$ we get
\begin{align}
mv^m_t(v^m_t)^*\,dt ={}& -\int_0^\infty e^{-\gamma(x^m_t)y}\Big(d[mv^m_t(mv^m_t)^*] - \sigma(x^m_t)\sigma^*(x^m_t)\,dt \tag{3.4.131}\\
& \qquad - d\tilde{U}^m_t - d(\tilde{U}^m_t)^*\Big)e^{-\gamma^*(x^m_t)y}\,dy \notag\\
={}& \underbrace{-\int_0^\infty e^{-\gamma(x^m_t)y}\,d[mv^m_t(mv^m_t)^*]\,e^{-\gamma^*(x^m_t)y}\,dy}_{dC^1_t} \notag\\
& + \underbrace{\int_0^\infty e^{-\gamma(x^m_t)y}\left(\sigma(x^m_t)\sigma^*(x^m_t)\,dt\right)e^{-\gamma^*(x^m_t)y}\,dy}_{dC^2_t} \notag\\
& + \underbrace{\int_0^\infty e^{-\gamma(x^m_t)y}\left(d\tilde{U}^m_t + d(\tilde{U}^m_t)^*\right)e^{-\gamma^*(x^m_t)y}\,dy}_{dC^3_t}. \notag
\end{align}
We will interpret each term in a different way. After substituting the above expression into equation (3.4.125), the term with $C^1_t$ will be included in the $H^m_t$ process (in the notation of Lemma 5), the $C^2_t$ term will become a part of the spurious drift term $S$ in the limiting equation (1.1.11), and the $C^3_t$ term will become a part of $U^m_t$, which will be shown to converge to zero. First,
\begin{align}
d(C^1_t)_{ij} &= -\int_0^\infty (e^{-\gamma(x^m_t)y})_{i,k_1}\,d[m(v^m_t)_{k_1}(mv^m_t)^*_{k_2}]\,(e^{-\gamma^*(x^m_t)y})_{k_2,j}\,dy \notag\\
&= -\int_0^\infty (e^{-\gamma(x^m_t)y})_{i,k_1}(e^{-\gamma^*(x^m_t)y})_{k_2,j}\,dy\ d[m(v^m_t)_{k_1}(mv^m_t)^*_{k_2}], \tag{3.4.132}
\end{align}
where the integral exists and is finite for all $t \in [0,T]$.
For the second term we set $dC^2_t = J(x^m_t)\,dt$ and note that the integral exists and is finite. That is, we define $J(x) : \mathcal{U} \to \mathbb{R}^{d\times d}$ as the solution to the Lyapunov equation
\[ -J\gamma^* - \gamma J = -\sigma\sigma^*. \tag{3.4.133} \]
Note that the differentials are absent from the above equation. We justify this as follows: consider the integral equation
\[ \int_0^t \left(-J(x^m_s)\gamma^*(x^m_s) - \gamma(x^m_s)J(x^m_s)\right)ds = -\int_0^t \sigma(x^m_s)\sigma^*(x^m_s)\,ds. \tag{3.4.134} \]
Because all the corresponding integrals are Lebesgue, taking the derivative of each side with respect to $t$ yields equation (3.4.133).
Remark 3. The matrix function $J$ is significant in that it contributes to the drift term in the limiting equation (1.1.11). Equation (3.4.130) is useful for studying the analytical properties of the solution to Lyapunov's equation. For an explicit solution in low dimensions, we use, in Section 4.2, a symbolic computer program (Mathematica).
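For a concrete feel for equation (3.4.133), the following sketch solves it in the special case where $\gamma$ is a diagonal matrix with positive entries. This restriction, and the helper names `outer_sigma`, `lyapunov_diagonal` and `residual`, are assumptions made for illustration and are not taken from the original (which uses a symbolic program for the general case). For $\gamma = \mathrm{diag}(\lambda_1, \dots, \lambda_d)$ the integral formula (3.4.130) gives $J_{ij} = (\sigma\sigma^*)_{ij}/(\lambda_i + \lambda_j)$.

```python
def outer_sigma(sigma):
    """Return sigma * sigma^T for a real matrix given as a list of rows."""
    d = len(sigma)
    return [
        [sum(sigma[i][l] * sigma[j][l] for l in range(len(sigma[i]))) for j in range(d)]
        for i in range(d)
    ]

def lyapunov_diagonal(lams, S):
    """Solve J*gamma^T + gamma*J = S when gamma = diag(lams), all lams > 0.

    Assumption for this sketch: gamma is diagonal, so the matrix exponentials
    in J = int_0^inf exp(-gamma y) S exp(-gamma^T y) dy are diagonal and the
    integral evaluates entrywise to S_ij / (lam_i + lam_j).
    """
    d = len(lams)
    return [[S[i][j] / (lams[i] + lams[j]) for j in range(d)] for i in range(d)]

def residual(lams, J, S):
    """Largest entry of |J*gamma^T + gamma*J - S| for gamma = diag(lams)."""
    d = len(lams)
    return max(
        abs(J[i][j] * lams[j] + lams[i] * J[i][j] - S[i][j])
        for i in range(d)
        for j in range(d)
    )

if __name__ == "__main__":
    lams = [1.0, 3.0]                      # hypothetical friction eigenvalues
    sigma = [[1.0, 0.0], [1.0, 2.0]]       # hypothetical noise matrix
    S = outer_sigma(sigma)
    J = lyapunov_diagonal(lams, S)
    print(J, residual(lams, J, S))
```

In one dimension this reduces to $J = \sigma^2/(2\gamma)$. For a non-diagonal $\gamma$ one would instead need a general linear solve of the $d^2$ equations, which this sketch does not attempt.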
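Using the equation for $\tilde{U}^m$ [Eq. (3.4.124)], the entries of $C^3$ are given by
\begin{align}
(C^3_t)_{i,j} ={}& \int_0^t\!\!\int_0^\infty (e^{-\gamma(x^m_s)y})_{i,k_1}\Big([mv^m_s F^*(x^m_s)]_{k_1,k_2}\,ds \tag{3.4.135}\\
& + [mv^m_s(\sigma(x^m_s)\,dW_s)^*]_{k_1,k_2} + [F(x^m_s)(mv^m_s)^*]_{k_1,k_2}\,ds \notag\\
& + [\sigma(x^m_s)\,dW_s(mv^m_s)^*]_{k_1,k_2}\Big)(e^{-\gamma^*(x^m_s)y})_{k_2,j}\,dy \notag\\
={}& \sum_{k_1,k_2}\int_0^t\!\!\int_0^\infty (e^{-\gamma(x^m_s)y})_{i,k_1}(e^{-\gamma^*(x^m_s)y})_{k_2,j}\,dy\,\Big([mv^m_s F^*(x^m_s)]_{k_1,k_2}\,ds \notag\\
& + [mv^m_s(\sigma(x^m_s)\,dW_s)^*]_{k_1,k_2} + [F(x^m_s)(mv^m_s)^*]_{k_1,k_2}\,ds \notag\\
& + [\sigma(x^m_s)\,dW_s(mv^m_s)^*]_{k_1,k_2}\Big). \notag
\end{align}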
We substitute the expression for $mv^m_t(v^m_t)^*\,dt$ back into equation (3.4.125) to obtain
\[ dZ^m_t = d\tilde{U}^m_t - dC^1_t\,\gamma^*(x^m_t) - dC^3_t\,\gamma^*(x^m_t) - J(x^m_t)\gamma^*(x^m_t)\,dt. \tag{3.4.136} \]
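3.4.5 Reformulate the SDE
We substitute the expression (3.4.136) for $dZ^m_t = mv^m_t\,d(mv^m_t)^*$ into equation (3.4.109). In the resulting formula for $x^m_t$, we put together the contributions from $\hat{U}$, $\tilde{U}$ and $C^3$ to form a vector valued process $U^m$. Integrating equation (3.4.109) by parts and using equation (3.4.119),
\begin{align}
(x^m_t)_i ={}& x_i + (U^m_t)_i + \int_0^t \left(\gamma^{-1}(x^m_s)F(x^m_s)\right)_i ds \tag{3.4.137}\\
& + \left(\int_0^t \gamma^{-1}(x^m_s)\sigma(x^m_s)\,dW_s\right)_i \notag\\
& + \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)(\gamma^{-1})_{\alpha n}(x^m_s)\,J_{j\ell}(x^m_s)(\gamma^*)_{\ell n}(x^m_s)\,ds \notag\\
& + \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)(\gamma^{-1})_{\alpha n}(x^m_s)\,\times \notag\\
& \qquad \left(-\int_0^\infty (e^{-\gamma(x^m_s)y})_{j,k_1}\left(e^{-\gamma^*(x^m_s)y}\gamma^*(x^m_s)\right)_{k_2,n}dy\right)d[(mv^m_s)_{k_1}(mv^m_s)_{k_2}], \notag
\end{align}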
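where $U^m_t$ is
\begin{align}
(U^m_t)_i ={}& -(\gamma^{-1})_{ij}(x^m_t)(mv^m_t)_j + (\gamma^{-1})_{ij}(x)(mv)_j \tag{3.4.138}\\
& + \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)(mv^m_s)_j\Big[(\gamma^{-1})_{\alpha n}(x^m_s)F_n(x^m_s)\,ds \tag{3.4.139}\\
& \qquad + (\gamma^{-1})_{\alpha n}(x^m_s)\sigma_{n\ell}(x^m_s)\,d(W_s)_\ell\Big] \tag{3.4.140}\\
& + \int_0^t \frac{\partial}{\partial x_\alpha}\left((\gamma^{-1})_{ij}(x^m_s)\right)(\gamma^{-1})_{\alpha n}(x^m_s)\Big[\Big([mv^m_s F^*(x^m_s)]_{j,n}\,ds \tag{3.4.141}\\
& \qquad + [mv^m_s(\sigma(x^m_s)\,dW_s)^*]_{j,n} + [F(x^m_s)(mv^m_s)^*]_{j,n}\,ds \tag{3.4.142}\\
& \qquad + [\sigma(x^m_s)\,dW_s(mv^m_s)^*]_{j,n}\Big) \tag{3.4.143}\\
& \qquad + \int_0^\infty (e^{-\gamma(x^m_s)y})_{j,k_1}\left(e^{-\gamma^*(x^m_s)y}\gamma^*(x^m_s)\right)_{k_2,n}dy\ \times \tag{3.4.144}\\
& \qquad \Big([mv^m_s F^*(x^m_s)]_{k_1,k_2}\,ds + [mv^m_s(\sigma(x^m_s)\,dW_s)^*]_{k_1,k_2} \tag{3.4.145}\\
& \qquad + [F(x^m_s)(mv^m_s)^*]_{k_1,k_2}\,ds + [\sigma(x^m_s)\,dW_s(mv^m_s)^*]_{k_1,k_2}\Big)\Big]. \tag{3.4.146}
\end{align}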
Now we prove that $U^m_t \to 0$ in $L^2$ with respect to $C_{\mathbb{R}^d}[0,T]$ and hence in probability. By Lemma 10, the first two terms on the right hand side of equation (3.4.138) go to zero in $L^2$ with respect to $C_{\mathbb{R}^d}[0,T]$. The rest of the terms in $U^m$ are Lebesgue or Itô integrals with integrands that are products of continuous functions and $m(v^m_t)_i$. We prove a lemma about the convergence of these integrals to zero.
Lemma 11. Let $x^m_t$ be of bounded variation in the compact set $U_k$ for $0 \le t \le T$. If $g : U_k \to \mathbb{R}$ is a continuous function, then
\[
\lim_{m\to 0} E\left[ \sup_{0\le t\le T} \left| \int_0^t g(x^m_s)\, m(v^m_s)_i\, ds \right|^2 \right] = 0, \tag{3.4.147}
\]
and
\[
\lim_{m\to 0} E\left[ \sup_{0\le t\le T} \left| \int_0^t g(x^m_s)\, m(v^m_s)_i\, d(W_s)_j \right|^2 \right] = 0, \tag{3.4.148}
\]
for all $i, j = 1, \dots, d$.
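Proof. First note that
\[
E\left[ \sup_{0\le t\le T} \left| \int_0^t g(x^m_s)\, m(v^m_s)_i\, ds \right|^2 \right] \le E\left[ \left( \int_0^T |g(x^m_s)\, m(v^m_s)_i|\, ds \right)^2 \right]. \tag{3.4.149}
\]
By the Cauchy-Schwarz inequality,
\[
E\left[ \left( \int_0^T |g(x^m_s)\, m(v^m_s)_i|\, ds \right)^2 \right] \le T \int_0^T E\left[ |g(x^m_s)\, m(v^m_s)_i|^2 \right] ds \le \tilde C_k T \int_0^T E\left[ (m(v^m_s)_i)^2 \right] ds, \tag{3.4.150}
\]
where the constant $\tilde C_k$, depending on $k$, results from the continuous function $g$ being bounded on the compact set $U_k$. Taking the limit as $m \to 0$ of both sides, and using the Lebesgue dominated convergence theorem, Lemma 10 implies
\[
\lim_{m\to 0} E\left[ \left( \int_0^T |g(x^m_s)\, m(v^m_s)_i|\, ds \right)^2 \right] = 0. \tag{3.4.151}
\]
Therefore,
\[
\lim_{m\to 0} E\left[ \sup_{0\le t\le T} \left| \int_0^t g(x^m_s)\, m(v^m_s)_i\, ds \right|^2 \right] \le \lim_{m\to 0} E\left[ \left( \int_0^T |g(x^m_s)\, m(v^m_s)_i|\, ds \right)^2 \right] = 0. \tag{3.4.152}
\]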
To estimate the Itô integral [Eq. (3.4.148)], we first use the Itô isometry:
\[
E\left[ \left( \int_0^T g(x^m_s)\, m(v^m_s)_i\, d(W_s)_j \right)^2 \right] = \int_0^T E\left[ |g(x^m_s)\, m(v^m_s)_i|^2 \right] ds \le \tilde C_k \int_0^T E\left[ |m(v^m_s)_i|^2 \right] ds. \tag{3.4.153}
\]
Using the Lebesgue dominated convergence theorem and Doob's maximal inequality,
\[
E\left[ \sup_{0\le t\le T} \left| \int_0^t g(x^m_s)\, m(v^m_s)_i\, d(W_s)_j \right|^2 \right] \le 4 E\left[ \left( \int_0^T g(x^m_s)\, m(v^m_s)_i\, d(W_s)_j \right)^2 \right] \to 0 \tag{3.4.154}
\]
as $m \to 0$. Q.E.D.
We use Lemma 11 to show that $U^m$ converges to zero as $m \to 0$ with respect to $C_{\mathbb{R}^d}[0,T]$. Note that all functions in the expression (3.4.138) for $U^m$ are continuous. The terms $\int_0^\infty (e^{-\gamma(x^m_s)y})_{j,k_1} (e^{-\gamma^*(x^m_s)y})_{k_2,n}\, dy$ are continuous, because $\gamma$ and the matrix exponential are continuous and the integrand decays exponentially with $y$. Therefore,
\[
\lim_{m\to 0} E\left[ \sup_{0\le t\le T} |U^m_t|^2 \right] = 0, \tag{3.4.155}
\]
and, as a consequence, $U^m \to 0$ as $m \to 0$ in probability with respect to $C_{\mathbb{R}^d}[0,T]$.
To verify the rest of the assumptions of Lemma 5, including Condition 1, we first write equation (3.4.137) in the form
\[
x^m_t = x + U^m_t + \int_0^t f(x^m_s)\, dH^m_s. \tag{3.4.156}
\]
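Define $f : U \to \mathbb{R}^{d\times(1+k+1+d^2)}$ as
\[
f(x) = \big(\gamma^{-1}(x)F(x),\ \gamma^{-1}(x)\sigma(x),\ S(x),\ f^1(x), \dots, f^d(x)\big), \tag{3.4.157}
\]
where $S(x) : U \to \mathbb{R}^d$ is defined, by components, as
\[
S_i(x) = \frac{\partial}{\partial x_\alpha}\big((\gamma^{-1})_{ij}(x)\big)(\gamma^{-1})_{\alpha n}(x)\, J_{jn}(x)(\gamma^*)_{n\ell}(x), \tag{3.4.158}
\]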
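where $J$ is the solution to the Lyapunov equation (3.4.133) and $f^{k_2}(x) : U \to \mathbb{R}^{d\times d}$ is defined, by components, as
\[
f^{k_2}_{i,k_1}(x) = \frac{\partial}{\partial x_\alpha}\big[(\gamma^{-1})_{ij}(x)\big](\gamma^{-1})_{\alpha n}(x) \times \left( -\int_0^\infty (e^{-\gamma(x)y})_{j,k_1} (e^{-\gamma^*(x)y}\gamma^*(x))_{k_2,n}\, dy \right) \tag{3.4.159}
\]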
for $k_1, k_2 = 1, 2, \dots, d$. Next, $H^m_t$, with paths in $C_{\mathbb{R}^{1+k+1+d^2}}[0,T]$, is defined as
\[
H^m_t = \begin{pmatrix} t \\ W_t \\ t \\ (m v^m_t)_1\, m v^m_t - (m v)_1\, m v \\ \vdots \\ (m v^m_t)_d\, m v^m_t - (m v)_d\, m v \end{pmatrix}. \tag{3.4.160}
\]
By Lemma 10, $H^m \to H$ as $m \to 0$ in $L^1$ with respect to $C_{\mathbb{R}^{1+k+1+d^2}}[0,T]$, where
\[
H_t = \begin{pmatrix} t \\ W_t \\ t \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{3.4.161}
\]
Therefore, $(U^m_t, H^m_t) \to (0, H_t)$ as $m \to 0$ in probability with respect to $C_{\mathbb{R}^d \times \mathbb{R}^{1+k+1+d^2}}[0,T]$. All that is left, to be able to use Lemma 5, is to check Condition 1.
3.4.6 Check Condition 1
To check Condition 1, we find the Doob-Meyer decomposition of $H^m_t$ and stochastically bound, in $m$, the bounded variation part of the decomposition, denoted $A^m_t$. Only the last $d^2$ rows of $H^m$ depend on $m$. Furthermore, the columns of the matrix $m v^m_t (m v^m_t)^*$ make up the last $d^2$ rows of $H^m$. That is, the first column of the matrix $m v^m_t (m v^m_t)^*$ is rows $1+k+1+1$ through $1+k+1+d$ of $H^m$. The second column is rows $1+k+1+d+1$ through $1+k+1+2d$ of $H^m$, and so on. Consider
\[
d\big(m v^m_t (m v^m_t)^*\big) = dZ^m_t + d(Z^m_t)^* + \sigma\sigma^*(x^m_t)\, dt, \tag{3.4.162}
\]
where $Z^m_t$ is given by equation (3.4.120). Substituting the expression for $dZ^m_t$ and $d(Z^m_t)^*$ from equation (3.4.123),
\[
\begin{aligned}
d\big(m v^m_t (m v^m_t)^*\big) ={}& m v^m_t F^*(x^m_t)\, dt - m v^m_t (\gamma(x^m_t) v^m_t)^*\, dt \\
&+ F(x^m_t)(m v^m_t)^*\, dt - \gamma(x^m_t) v^m_t\, m (v^m_t)^*\, dt + \sigma\sigma^*(x^m_t)\, dt \\
&+ m v^m_t (\sigma(x^m_t)\, dW_t)^* + \sigma(x^m_t)\, dW_t\, (m v^m_t)^*.
\end{aligned}
\tag{3.4.163}
\]
Because the stochastic integrals are local martingales, $A^m_t$ is built from the columns of the Lebesgue integrals in the above expression. That is,
\[
A^m_t = \begin{pmatrix} t \\ 0 \\ t \\ (A^m_t)^1 \\ \vdots \\ (A^m_t)^d \end{pmatrix}, \tag{3.4.164}
\]
where
\[
\begin{aligned}
\big( (A^m_t)^1, (A^m_t)^2, \cdots, (A^m_t)^d \big) ={}& \int_0^t m v^m_s F^*(x^m_s)\, ds + \int_0^t F(x^m_s)(m v^m_s)^*\, ds \\
&- \int_0^t m v^m_s (\gamma(x^m_s) v^m_s)^*\, ds - \int_0^t \gamma(x^m_s) v^m_s\, m (v^m_s)^*\, ds \\
&+ \int_0^t \sigma(x^m_s)\sigma^*(x^m_s)\, ds.
\end{aligned}
\tag{3.4.165}
\]
We must show that $A^m_t$ are stochastically bounded. Because $m v^m \to 0$ in $L^2$, the first and second terms go to zero in $L^2$. To prove stochastic boundedness of the third and fourth terms, it is enough to show that $E[|m (v^m_s)_i (v^m_s)|]$ is bounded uniformly in $m$. The fifth term is stochastically bounded in $m$.
Indeed, based on previous works [PS08, KPS04], we expect $\sqrt{m}\, v^m_t$ to be of order one. Note that $m v^m_t (v^m_t)^*$ is a Hermitian matrix. For the rows we have $|m (v^m_s)_i (v^m_s)| \le c_d\, |m v^m_t (v^m_t)^*|$ for every $i = 1, \dots, d$, and we bound the matrix norm using the formula for $v^m_t$ in terms of $x^m_t$ in equation (3.4.96). Thus
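\[
\begin{aligned}
E\big[ |m v^m_t (v^m_t)^*| \big] ={}& E\Bigg[ \Bigg| m \left( \Phi(t)v + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\, ds + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\, dW_s \right) \\
&\quad \times \left( \Phi(t)v + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\, ds + \frac{1}{m}\int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\, dW_s \right)^* \Bigg| \Bigg] \\
\le{}& 9m\, E\big[ |\Phi(t) v v^* \Phi(t)^*| \big] \\
&+ \frac{9}{m} E\left[ \left| \int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\, ds \cdot \int_0^t F^*(x^m_s)\, [\Phi(t)\Phi^{-1}(s)]^*\, ds \right| \right] \\
&+ \frac{9}{m} E\left[ \left| \int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\, dW_s \cdot \left( \int_0^t \Phi(t)\Phi^{-1}(s)\sigma(x^m_s)\, dW_s \right)^* \right| \right].
\end{aligned}
\tag{3.4.166}
\]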
For small $m$, the first term is bounded independently of $m$. For an arbitrary operator $A$ we use the equality $|AA^*| = |A|^2$. From the proof of Lemma 10,
\[
E\left[ \left| \frac{1}{m} \int_0^t \Phi(t)\Phi^{-1}(s)F(x^m_s)\, ds \right|^2 \right] \le C_{k,d}^2. \tag{3.4.167}
\]
For the stochastic integral, we use inequality (3.4.167), which is bounded independently of $m$. Thus, by the Chebyshev inequality, $\{V_t(A^m)\}$ is stochastically bounded, and this proves that $H^m_t$ satisfies Condition 1.
Therefore, $x^m_t \to x_t$ as $m \to 0$ in probability. We will use this together with boundedness to prove $L^2$ convergence; because $x^m_t$ lies in a bounded set $U$, there exists $N > 0$ such that $P(|x^m_t| \le N) = 1$ for all $t$ and $m$. Therefore,
\[
\begin{aligned}
\lim_{m\to 0} E\left[ \sup_{0\le t\le T} |x^m_t - x_t|^2 \right] &= \lim_{m\to 0} \int_0^\infty P\left( \sup_{0\le t\le T} |x^m_t - x_t|^2 \ge x \right) dx \\
&= \int_0^{(2N)^2} \lim_{m\to 0} P\left( \sup_{0\le t\le T} |x^m_t - x_t|^2 \ge x \right) dx \\
&= 0.
\end{aligned}
\tag{3.4.168}
\]
This completes the proof.
Chapter 4
Applications of Main Theorem
4.1 One Dimension
As a first example we apply Theorem 1 to a one-dimensional model of a Brownian particle. This is the model studied in [SSMD82]; see also [CBEA12]. The particle's position satisfies
\[
\begin{cases}
dx^m_t = v^m_t\, dt \\[4pt]
dv^m_t = \left[ \dfrac{F(x^m_t)}{m} - \dfrac{\gamma(x^m_t)}{m} v^m_t \right] dt + \dfrac{\sigma(x^m_t)}{m}\, dW_t,
\end{cases}
\tag{4.1.1}
\]
with initial conditions $x^m_0 = x$ and $v^m_0 = v$. Assumption 4 below justifies restricting $x_t$ to an interval $(a, b)$ for $0 \le t \le T$. To find the expression for $\mathcal{J}(x) = J(x)$ in the limit [Eq. (1.1.11)], note that the Lyapunov equation is trivial to solve in one dimension. Equation (3.4.133) in one dimension is
\[
2J(x)\gamma(x) = \sigma(x)^2. \tag{4.1.2}
\]
This defines $J(x) = \dfrac{1}{2}\dfrac{\sigma(x)^2}{\gamma(x)}$.
Thus the limiting equation for $x_t$ is
\[
dx_t = \left[ \frac{F(x_t)}{\gamma(x_t)} - \frac{\gamma'(x_t)}{2\gamma(x_t)^3}\, \sigma(x_t)^2 \right] dt + \frac{\sigma(x_t)}{\gamma(x_t)}\, dW_t, \tag{4.1.3}
\]
with $x_0 = x$, which agrees with prior results [FH11, CBEA12, SSMD82]. It is instructive to illustrate on this simple example the key quantities entering the proof of Lemma 5, namely $f$ and $H^m_t$. Define $f$, a continuous function from $(a, b)$ to $\mathbb{R}^4$, as
\[
f(x) = \left( \frac{F(x)}{\gamma(x)},\ \frac{\sigma(x)}{\gamma(x)},\ -\frac{\gamma'(x)}{2\gamma(x)^3}\, \sigma(x)^2,\ -\frac{\gamma'(x)}{\gamma(x)^3} \right), \tag{4.1.4}
\]
and $H^m_t$ with paths in $C_{\mathbb{R}^4}[0,T]$ as
\[
H^m_t = \begin{pmatrix} t \\ W_t \\ t \\ \frac{1}{2}\big[ (m v^m_t)^2 - (m v)^2 \big] \end{pmatrix}. \tag{4.1.5}
\]
Therefore, $\lim_{m\to 0} H^m_t = (t, W_t, t, 0)^*$, and the limiting equation (4.1.3) is recovered.
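The drift and diffusion coefficients of the limiting equation (4.1.3) can be evaluated pointwise once $F$, $\gamma$, $\gamma'$ and $\sigma$ are given. The following Python sketch does this; the function name `limiting_coefficients` and the sample coefficients are hypothetical choices made for illustration and are not taken from the text.

```python
def limiting_coefficients(F, gamma, dgamma, sigma, x):
    """Return (drift, diffusion) of the limiting SDE (4.1.3) at position x.

    F, gamma, dgamma (the derivative of gamma) and sigma are callables.
    """
    g = gamma(x)
    s = sigma(x)
    # F/gamma is the naive drift; the second term is the noise-induced drift.
    drift = F(x) / g - dgamma(x) * s ** 2 / (2.0 * g ** 3)
    diffusion = s / g
    return drift, diffusion


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
F = lambda x: -x
gamma = lambda x: 1.0 + x ** 2
dgamma = lambda x: 2.0 * x
sigma = lambda x: 1.0

drift, diffusion = limiting_coefficients(F, gamma, dgamma, sigma, 1.0)
print(drift, diffusion)
```

At $x = 1$ these sample coefficients give $\gamma = 2$, so the naive drift is $-1/2$ and the noise-induced correction is $-2/(2 \cdot 8) = -1/8$.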
We give conditions under which Assumptions 1 and 2, as stated in Theorem 1, are satisfied. First, we denote the potential of the force $F$ entering equation (4.1.1) by $U$:
\[
\frac{d}{dx} U(x) = -F(x), \quad \text{for } x \in (a, b). \tag{4.1.6}
\]
Assumptions 1 and 2 are implied by the following:
Assumption 3. The coefficients F, γ, σ are continuously differentiable functions with
γ > 0 for x ∈ (a, b).
Assumption 4. The potential function $U(x)$ defined in equation (4.1.6) is positive, and $U$ grows to infinity at $x = a$ and $x = b$ such that
\[
\lim_{x\to a^+} \frac{e^{\frac{1}{(x-a)}}}{U(x)},\quad \lim_{x\to b^-} \frac{e^{-\frac{1}{(x-b)}}}{U(x)} \in [0, \infty). \tag{4.1.7}
\]
Also $\sigma$ and $\gamma$ satisfy the following: there exists a positive constant $c_1$ such that
\[
\sigma(x)^2 \le c_1 U(x), \quad \text{for all } x \in (a, b), \tag{4.1.8}
\]
and there exists $c_2 > 0$ such that
\[
F(x)\gamma'(x) \le c_2 U(x), \quad \text{for all } x \in (a, b). \tag{4.1.9}
\]
Remark 4. The condition that U (x) grows to infinity exponentially at x = a and
x = b confines the particle’s position to the finite interval (a, b).
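Assumption 3 is the one-dimensional equivalent of Assumption 1. To show that Assumptions 3 and 4 imply Assumption 2, we define Lyapunov functions which will be used to prove that SDE (4.1.1) and SDE (4.1.3) have globally defined solutions. The infinitesimal generators for $(x^m_t, v^m_t)$ and for $x_t$ are
\[
L_m = v \frac{\partial}{\partial x} + \left( \frac{F(x)}{m} - \frac{\gamma(x) v}{m} \right) \frac{\partial}{\partial v} + \frac{\sigma(x)^2}{2m^2} \frac{\partial^2}{\partial v^2}, \tag{4.1.10}
\]
\[
L_0 = \left( \frac{F(x)}{\gamma(x)} - \frac{\sigma(x)^2}{2\gamma(x)^3}\, \gamma'(x) \right) \frac{\partial}{\partial x} + \frac{\sigma(x)^2}{2\gamma(x)^2} \frac{\partial^2}{\partial x^2}, \tag{4.1.11}
\]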
respectively [Øks03]. Define the Lyapunov functions
\[
V_m(x, v) = \frac{m v^2}{2} + U(x), \tag{4.1.12}
\]
and
\[
V_0(x) = U(x). \tag{4.1.13}
\]
Given the infinitesimal generators, equations (4.1.10)-(4.1.11), and Assumption 4, there exists a positive constant $c_m$, dependent on $m$, such that
\[
L_m V_m \le c_m V_m, \tag{4.1.14}
\]
which follows from inequality (4.1.8). To prove a similar inequality for $L_0$ note that
σ(x)2
σ(x)2
2
0
0
L0 V0 = − F (x) + F (x)
+ F (x)γ (x)
.
(4.1.15)
2γ(x)2
2γ(x)3
If $U(x) = A e^{\frac{B}{(x-a)}}$ near $x = a$ with $A, B > 0$, then $U$ grows just fast enough to satisfy the growth condition in Assumption 4. In a neighborhood of $x = a$,
B
ABe (x−a)
,
F (x) = −U (x) = −
(x − a)2
0
and
B
(4.1.16)
B
(AB)2 e (x−a) + 2(x − a)ABe (x−a)
F (x) =
.
(x − a)4
0
(4.1.17)
We analyze the first term of equation (4.1.15) and note that
B
2
2
B
σ(x)2
ABe (x−a)
σ(x)
σ(x)
F (x) − F (x)
=
AB e (x−a) −
+ 2(x − a)
. (4.1.18)
γ(x)2
(x − a)4
γ(x)2
γ(x)2
2
0
For $x$ near $a$, the above expression is positive. Therefore $F(x)^2 \ge F'(x)\sigma(x)^2/\gamma(x)^2$, and there exists $c_0 > 0$ such that
\[
L_0 V_0 \le c_0 V_0. \tag{4.1.19}
\]
This assures global existence of the strongly unique solutions $x^m_t$ to SDE (1.1.4) and $x_t$ to SDE (1.1.11) ([RB06], Theorem 5.9).
Remark 5. An argument must still be made to justify the existence of a compact set where $x^m_t$ is contained for $0 \le t \le T$ and uniformly in $m$. The mathematical proof of this claim is an open question.
Therefore, by Theorem 1, $x^m_t \to x_t$ as $m \to 0$ in $L^2(\Omega, \mathcal{F}, P)$ with respect to $C_{(a,b)}[0,T]$.
4.1.1 Smoluchowski-Kramers limit as different conventions of the stochastic integral
Recall that in one dimension the stochastic integral with the α interpretation can be
written as an Itô integral (see Sec 2.2.1) by the formula
\[
\int_0^t \sigma(x_s) \circ^{\alpha} dW_s = \int_0^t \alpha\, \sigma(x_s)\sigma'(x_s)\, ds + \int_0^t \sigma(x_s)\, dW_s. \tag{4.1.20}
\]
In this section, we discuss the limit [Eq. (4.1.3)] as an interpretation of the stochastic
integral.
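The dependence on $\alpha$ can be seen numerically in the simplest possible case, the integral of $W$ against $dW$. The sketch below assumes that the $\alpha$-integral is approximated by Riemann sums in which the integrand is evaluated at the convex combination $(1-\alpha)W_{t_i} + \alpha W_{t_{i+1}}$; this convention is an assumption, since Sec. 2.2.1 is not reproduced here, and the helper name `alpha_sum` is hypothetical. For this integrand the correction term analogous to the one in (4.1.20) is $\alpha$ times the quadratic variation, which is close to $\alpha t$.

```python
import random


def alpha_sum(W, alpha):
    """Riemann sum for the integral of W dW, with the integrand evaluated at
    (1 - alpha) * W_left + alpha * W_right on each subinterval."""
    total = 0.0
    for left, right in zip(W[:-1], W[1:]):
        total += ((1.0 - alpha) * left + alpha * right) * (right - left)
    return total


random.seed(0)
n, T = 10_000, 1.0
dt = T / n

# One discretized Wiener path on [0, T].
W = [0.0]
for _ in range(n):
    W.append(W[-1] + random.gauss(0.0, dt ** 0.5))

# Discrete quadratic variation; it is close to T for a fine grid.
qv = sum((b - a) ** 2 for a, b in zip(W[:-1], W[1:]))

ito = alpha_sum(W, 0.0)       # alpha = 0
strat = alpha_sum(W, 0.5)     # alpha = 1/2
anti_ito = alpha_sum(W, 1.0)  # alpha = 1

print(ito, strat, anti_ito, qv)
```

For any discretized path the three sums differ exactly by multiples of the quadratic variation: the $\alpha = 1/2$ sum equals $W_T^2/2$, and the $\alpha = 1$ sum exceeds the $\alpha = 0$ sum by `qv`.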
For modeling, various preferences regarding the appropriate choice of α have emerged in the numerous fields where SDEs have been applied [Øks03]. For example, the martingale property of the Itô integral, i.e. its specific feature of “not looking into the future” (when the integral is approximated by a sum, the leftmost point of each interval is used), is the main reason for its popularity in economics [Øks03] and biology [Tur77, Ao08]. In general, the Stratonovich integral emerges naturally when the Wiener process is replaced by a sequence of approximating deterministic processes, and it has the advantage of leading to ordinary chain rule formulas under a change of variable [Gar04]. However, the fact that Stratonovich integrals are not martingales gives the Itô integral an important computational advantage [KP92, Sus78]. Finally, the anti-Itô integral has been shown to be the most appropriate to describe physical phenomena that are in equilibrium with a heat bath, for which the Einstein fluctuation-dissipation relation holds [EM78, LBLO01, LL07, VHB+10, BVH+11]. In particular, equations that satisfy the fluctuation-dissipation relation occur in molecular dynamics. As described in the introduction [Sec. 1.1], the limiting equation is constrained to be of the anti-Itô type so that its invariant distribution is the Gibbs distribution.
A classical example of a phenomenon in equilibrium with a heat bath is the motion
of a mesoscopic particle of mass m immersed in a fluid, i.e. Brownian motion. If we
assume that the particle moves in one dimension under the action of a continuous
force $F(x)$, its position being $x^m_t \in \mathbb{R}$ at all times $t \ge 0$ in a finite interval, the corresponding Newton equation is:
\[
\begin{aligned}
dx^m_t &= v^m_t\, dt \\
m\, dv^m_t &= \big[ F(x^m_t) - \gamma(x^m_t) v^m_t \big]\, dt + \sigma(x^m_t)\, dW_t
\end{aligned}
\tag{4.1.21}
\]
with the initial conditions $v^m_0 = v$ and $x^m_0 = x$. The friction coefficient $\gamma(x) > 0$ and the intensity (standard deviation) of the noise $\sigma(x) > 0$ are, in general, position-dependent; we also assume that $F$, $\gamma$, and $\sigma$ are differentiable functions of $x$ smooth enough so that the process $(x_t, v_t)$ exists for all $t$ on a finite interval. It is well known that, since the derivative of $x^m_t$, i.e. $\dot x^m_t = v^m_t$, exists, the stochastic integral in Eq. (4.1.21) is equivalent under all interpretations [Øks03, KS91, Gar04].
In general, similar limits involve additional drift terms, i.e. “spurious drifts.”
A statement of our results in terms of different definitions of stochastic integral is
also possible and, in some cases, straightforward. By naively setting m = 0 in Eq.
(4.1.21), we obtain an SDE for $x^0_t = x_t$:
\[
dx_t = \underbrace{\frac{F(x_t)}{\gamma(x_t)}\, dt}_{\text{deterministic}} + \underbrace{\frac{\sigma(x_t)}{\gamma(x_t)}\, dW_t}_{\text{stochastic}}, \tag{4.1.22}
\]
with initial condition $x_0 = x$. Eq. (4.1.22) is called the Smoluchowski-Kramers (SK) approximation to Eq. (4.1.21). Unlike the solution of Eq. (4.1.21), the solution of Eq. (4.1.22) depends on the interpretation of the stochastic term, i.e. on the choice of α.
[Figure 4.1 (image not reproduced): panel (a) plots γ(x) and σ(x) against x for x between −100 and 100; panel (b) plots x against t for 0 ≤ t ≤ 1000, with curves for decreasing values of m (starting from m = 100 and m = 10), Itô (α = 0) and anti-Itô (α = 1), and an inset covering 900 ≤ t ≤ 1000.]
Figure 4.1. (a) For a Brownian particle γ(x) (dark line) and σ(x) (grey line) are related by the Einstein fluctuation-dissipation relation [Eq. (4.1.32)]; in this case, γ(x) = (1 + x/100). (b) The solutions of the Newton equations [Eq. (4.1.21)] for m → 0 (dashed lines) converge to the solution of the SK approximation [Eq. (4.1.22)] for α = 1, i.e. anti-Itô integral (black solid line); the solution for α = 0, i.e. Itô integral (grey solid line), is also given for comparison. All solutions are obtained for the same Wiener process. The inset is a blow-up of the final part of the trajectories (dashed square).
We can gain some insight into this zero-mass limit procedure considering numerical
solutions of Eq. (4.1.21) for various decreasing values of m, but for the same Wiener
process [Fig. 4.1]. For a Brownian particle the Einstein fluctuation-dissipation relation holds:
\[
\gamma(x) \propto \sigma(x)^2. \tag{4.1.23}
\]
In Fig. 4.1(a) γ(x) (dark line) and σ(x) (grey line) are presented. The dashed
lines in Fig. 4.1(b) represent some solutions of Eq. (4.1.21) for decreasing m: they
become rougher and rougher as m decreases. They converge towards the anti-Itô
(α = 1) solution of Eq. (4.1.22) (black solid line); this is in agreement with the
recent experimental demonstration [VHB+ 10, BVH+ 11] that for a Brownian particle
the correct interpretation is the anti-Itô integral. We remark that the Itô (α = 0)
solution of Eq. (4.1.22) (grey solid line) presents clear deviations from the correct
one, as can be clearly seen in the inset of Fig. 4.1(b).
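A numerical experiment of this kind can be sketched in a few lines of Python. The code below is not the scheme used to produce Fig. 4.1, which the text does not specify; it is a minimal illustration that integrates Eq. (4.1.21) with a semi-implicit Euler step (friction treated implicitly, an assumption made here for stability) and the limiting Itô equation (4.1.3) with Euler-Maruyama, both driven by the same Wiener increments. All function names and the sample coefficients are hypothetical.

```python
import math
import random


def simulate_newton(m, F, gamma, sigma, x0, v0, dW, dt):
    """Integrate Eq. (4.1.21): m dv = (F - gamma v) dt + sigma dW, dx = v dt.

    The friction term is treated implicitly, which keeps the step stable for
    small m. The time step must still be much smaller than m / gamma, or the
    velocity relaxation (and with it the noise-induced drift) is not resolved.
    """
    x, v = x0, v0
    for dw in dW:
        g = gamma(x)
        v = (m * v + F(x) * dt + sigma(x) * dw) / (m + g * dt)
        x += v * dt
    return x


def simulate_limit(F, gamma, dgamma, sigma, x0, dW, dt):
    """Euler-Maruyama for the limiting Ito SDE (4.1.3)."""
    x = x0
    for dw in dW:
        g, s = gamma(x), sigma(x)
        drift = F(x) / g - dgamma(x) * s ** 2 / (2.0 * g ** 3)
        x += drift * dt + (s / g) * dw
    return x


random.seed(1)
dt, n = 1e-4, 10_000  # final time T = 1
dW = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

# Hypothetical coefficients with state-dependent friction.
F = lambda x: -x
gamma = lambda x: 2.0 + math.sin(x)
dgamma = lambda x: math.cos(x)
sigma = lambda x: 1.0

x_limit = simulate_limit(F, gamma, dgamma, sigma, 0.0, dW, dt)
errors = {
    m: abs(simulate_newton(m, F, gamma, sigma, 0.0, 0.0, dW, dt) - x_limit)
    for m in (1.0, 0.1, 0.01)
}
for m, err in errors.items():
    print(f"m = {m}: |x^m_T - x_T| = {err:.4f}")
```

Reusing the list `dW` for every value of m is what makes the trajectories comparable pathwise, mirroring the "same Wiener process" used in the figure.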
In this section we study the zero-mass limiting behavior of a larger class of equations that have the form of Eq. (4.1.21), but for which γ(x) and σ(x) are allowed to
vary independently from each other. This can be the case, e.g., in the description of
the evolution of complex systems [AKQ07]. This section identifies for given γ(x) and
σ(x) the drift term and hence the corresponding α [Eq. (4.1.25)]: we find that in general it can be a function of x. We remark that the spurious drifts are defined assuming
the Itô stochastic calculus convention, while the values of α are defined with reference to SDE (4.1.22); we introduce the notation $\circ^{\alpha(x)}\, dW_t$ to indicate the presence of such extra drift. Interestingly, we find that when a generalized fluctuation-dissipation relation holds, i.e.
\[
\gamma(x) \propto \sigma(x)^{\lambda}, \tag{4.1.24}
\]
α is only a function of the exponent λ and is independent of x [Eq. (4.1.28)]. In particular, for λ = 0 we retrieve the Itô interpretation and for λ = 2 the anti-Itô interpretation, while the Stratonovich interpretation is only retrieved asymptotically for λ → ∞. Interestingly, values of $\alpha \notin [0, 1]$ also occur for $\lambda \in (0, 2)$. Although in this section we always consider the variable x to be one-dimensional, the general multi-dimensional case can be studied using similar methods [PS08].
4.1.2 An equation for α(x)
We derive an equation for α(x) depending on the friction γ(x) and diffusion σ(x) by comparing the SDE [Eq. (4.1.22)] with Eq. (4.1.3) and solving for α(x):
\[
\alpha(x) = \frac{\gamma'(x)\sigma(x)}{2\big(\gamma'(x)\sigma(x) - \gamma(x)\sigma'(x)\big)}, \tag{4.1.25}
\]
where $\gamma'(x) = \frac{d\gamma(x)}{dx}$ and $\sigma'(x) = \frac{d\sigma(x)}{dx}$. This equation shows that in general, α varies with position and can even take values outside the interval [0, 1]. Interestingly, α never takes the value $\frac{1}{2}$, i.e. we never obtain a Stratonovich correction.
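Formula (4.1.25) is easy to evaluate for concrete coefficients. The Python sketch below is an illustration only: the function name `alpha_of_x` and the sample choices of γ and σ are hypothetical. It covers a constant friction, a friction proportional to σ², and a friction that varies independently of σ, for which α changes with position.

```python
import math


def alpha_of_x(gamma, dgamma, sigma, dsigma, x):
    """Evaluate Eq. (4.1.25) at x.

    The expression is singular where gamma' * sigma == gamma * sigma',
    for example when gamma is proportional to sigma.
    """
    numerator = dgamma(x) * sigma(x)
    denominator = 2.0 * (dgamma(x) * sigma(x) - gamma(x) * dsigma(x))
    if denominator == 0.0:
        raise ZeroDivisionError("alpha(x) is singular at this point")
    return numerator / denominator


# Hypothetical noise intensity sigma(x) = 1 + x^2.
sigma = lambda x: 1.0 + x * x
dsigma = lambda x: 2.0 * x

# Constant friction: the numerator vanishes, so alpha = 0 (Ito).
alpha_const = alpha_of_x(lambda x: 3.0, lambda x: 0.0, sigma, dsigma, 0.7)

# gamma = sigma^2 (fluctuation-dissipation form): alpha = 1 (anti-Ito).
alpha_einstein = alpha_of_x(
    lambda x: sigma(x) ** 2, lambda x: 2.0 * sigma(x) * dsigma(x), sigma, dsigma, 0.7
)

# gamma = exp(x), unrelated to sigma: alpha = (1 + x^2) / (2 (1 - x)^2) depends on x.
alpha_var = [alpha_of_x(math.exp, math.exp, sigma, dsigma, x) for x in (0.2, 0.5)]

print(alpha_const, alpha_einstein, alpha_var)
```

In the last case the closed form $(1+x^2)/(2(1-x)^2)$ follows by cancelling the common factor $e^x$ in (4.1.25), and it already leaves the interval [0, 1] at x = 0.5.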
γ(x) ≡ γ0: Constant friction. The case in which γ(x) ≡ γ0, while σ(x) is allowed to vary [Fig. 4.2(a)], has often been the object of mathematical studies. For example, Freidlin [Fre04] and later Pavliotis and Stuart [PS05] proved that the limiting equation has a stochastic term with α = 0; this result is rederived here. Physically, an example of this system arises in the framework of the Maxey-Riley model of inertial particles in a Gaussian field [SS02] with correlation time assumed to be very short. In Fig. 4.2(b), we show how the numerical solutions for m → 0 converge towards the Itô (α = 0) solution of Eq. (4.1.22).
[Figure 4.2 (image not reproduced): panel (a) plots γ(x) and σ(x) against x for x between −100 and 100; panel (b) plots x against t for 0 ≤ t ≤ 1000, with curves for m = 100, m = 10, m = 1, Itô (α = 0) and anti-Itô (α = 1), and an inset covering 900 ≤ t ≤ 1000.]
Figure 4.2. (a) σ(x) (grey line) and $\gamma(x) = \sigma(x)^0$ = constant (dark line). (b) The solutions of the Newton equations [Eq. (4.1.21)] for m → 0 (dashed lines) converge to the solution of the SK approximation [Eq. (4.1.22)] for α = 0, i.e. Itô integral (grey solid line); the solution for α = 1, i.e. anti-Itô integral (black solid line), is also given for comparison. All solutions are obtained for the same Wiener process. The inset is a blow-up of the final part of the trajectories (dashed square).
$\gamma(x) \propto \sigma(x)^2$: Brownian motion. The particular case when the Einstein fluctuation-dissipation relation is satisfied in its standard form [Eq. (4.1.32)] is particularly important because it describes the diffusion of Brownian particles. If $D(x)$ denotes the hydrodynamic diffusion coefficient and $k_B T$ the thermal energy, then
\[
\gamma(x) = \frac{k_B T}{D(x)} \tag{4.1.26}
\]
and
\[
\sigma(x) = \frac{\sqrt{2}\, k_B T}{\sqrt{D(x)}}. \tag{4.1.27}
\]
This case was studied experimentally in [VHB+10, BVH+11], showing that the correct value of α for m → 0 is α = 1; see Section 4.1.3.
$\gamma(x) \propto \sigma(x)^{\lambda}$: Constant α. All the cases for which α(x) ≡ α can be obtained by equating the right-hand side of Eq. (4.1.25) to a constant different from $\frac{1}{2}$. After a simple calculation, we obtain $\gamma(x) = c\,\sigma(x)^{\lambda}$, where c is a constant, i.e. Eq. (4.1.24). It follows that
\[
\alpha = \frac{\lambda}{2(\lambda - 1)}. \tag{4.1.28}
\]
[Figure 4.3 (image not reproduced): plot of α against λ, both ranging from −3 to 3.]
Figure 4.3. α as a function of λ for the case when $\gamma(x) \propto \sigma(x)^{\lambda}$ [Sec. 4.1.2]. For λ → 1, α diverges asymptotically (dashed line), leading to the singular case discussed in § 4.1.2. The Itô integral (α = 0) is obtained for λ = 0 (square) and the anti-Itô integral (α = 1) for λ = 2 (diamond); the Stratonovich integral (α = 0.5) is only obtained asymptotically (dotted line) for λ → ∞.
[Figure 4.4 (image not reproduced): panel (a) plots γ(x) and σ(x) against x for x between −100 and 100; panel (b) plots x against t for 0 ≤ t ≤ 1000, with curves for m = 100, m = 10, m = 1, Itô (α = 0), anti-Itô (α = 1) and α = 2, and an inset covering 900 ≤ t ≤ 1000.]
Figure 4.4. (a) σ(x) (grey line) and $\gamma(x) = \sigma(x)^{4/3}$ (dark line). (b) The solutions of the Newton equations [Eq. (4.1.21)] for m → 0 (dashed lines) converge to the solution of the SK approximation [Eq. (4.1.22)] for α = 2 (dark grey solid line); the solutions for α = 0, i.e. Itô integral (grey solid line), and α = 1, i.e. anti-Itô integral (black solid line), are also given for comparison. All solutions are obtained for the same Wiener process. The inset is a blow-up of the final part of the trajectories (dashed square).
The value of α depends on the exponent λ as shown in Fig. 4.3. This result includes as particular cases γ(x) ≡ γ0, for which α = 0 [Sec. 4.1.2], and $\gamma(x) = c\,\sigma(x)^2$, for which α = 1 [Sec. 4.1.2]. However, we remark that the value $\alpha = \frac{1}{2}$ is only obtained asymptotically for λ → ∞.
Interestingly, values of α outside the interval [0, 1] can be achieved for certain friction-diffusion relations. For example, the relation $\gamma(x) = \sigma(x)^{4/3}$ gives α = 2 by formula (4.1.28). Figure 4.4 gives insight into the zero-mass limit. Different constructions of the stochastic integral are given for Itô (grey solid line), anti-Itô (black solid line), and α = 2 (dark grey solid line), for the same Wiener process.
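The special values quoted above follow directly from Eq. (4.1.28) and can be checked with exact rational arithmetic. The helper name `alpha_from_lambda` in this short Python sketch is hypothetical.

```python
from fractions import Fraction


def alpha_from_lambda(lam):
    """Eq. (4.1.28): alpha = lambda / (2 (lambda - 1)); singular at lambda = 1."""
    if lam == 1:
        raise ValueError("alpha diverges at lambda = 1 (gamma proportional to sigma)")
    return lam / (2 * (lam - 1))


# Exact values for the cases discussed in the text.
alpha_ito = alpha_from_lambda(Fraction(0))        # constant friction
alpha_anti_ito = alpha_from_lambda(Fraction(2))   # gamma proportional to sigma^2
alpha_two = alpha_from_lambda(Fraction(4, 3))     # gamma = sigma^(4/3)

# Large lambda approaches the Stratonovich value 1/2 without reaching it.
alpha_large = alpha_from_lambda(Fraction(1000))

print(alpha_ito, alpha_anti_ito, alpha_two, float(alpha_large))
```

Using `Fraction` avoids floating-point rounding, so the comparison with the values 0, 1 and 2 is exact.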
γ(x) ∝ σ(x): A singular case. When γ(x) ∝ σ(x) [Fig. 4.5(a)], the stochastic term in Eq. (4.1.22) gets multiplied by a constant factor, i.e. $\frac{\sigma(x_t)}{\gamma(x_t)}$ = constant, and thus there is no ambiguity in its solution. However, the zero-mass limit of Eq. (4.1.21) does not converge to this solution. This can be seen by setting γ(x) = cσ(x) and using equation (4.1.3) directly; the limiting equation is
\[
dx_t = \left[ \frac{F(x_t)}{c\,\sigma(x_t)} - \frac{\sigma'(x_t)}{2c^2 \sigma(x_t)} \right] dt + \frac{1}{c}\, dW_t. \tag{4.1.29}
\]
Here we see that there is a correction to the drift term. In Fig. 4.5(b), we show how the numerical solutions for m → 0 (dashed lines) converge towards the solution of the SK approximation (black solid line), while the solution of Eq. (4.1.22) without the correction to the drift clearly diverges (grey solid line).
[Figure 4.5 (image not reproduced): panel (a) plots γ(x) and σ(x) against x for x between −100 and 100; panel (b) plots x against t for 0 ≤ t ≤ 1000, with curves for m = 100, m = 10, m = 1, "S-K approx" and "S-K w/o correction", and an inset covering 900 ≤ t ≤ 1000.]
Figure 4.5. (a) γ(x) = cσ(x). (b) For m → 0, the solutions of Eq. (1.1.4) (dashed lines) converge to the solution of the limiting Eq. (4.1.29) (dark solid line). The grey solid line represents the solution of Eq. (4.1.22) disregarding the noise-induced drift. All solutions are obtained for the same Wiener process. The inset is a blow-up of the final part of the trajectories (dashed square).
4.1.3 Brownian particle in a diffusion gradient
We study a particle of mass m in water contained in a cylinder. The particle's position is described in one dimension as $x_t \in (0, a)$ for $a > 0$, the distance away from the bottom of the cylinder. The particle is modeled by
\[
m \ddot x^m_t = F(x^m_t) - \frac{k_B T}{D(x^m_t)}\, \dot x^m_t + \eta(t, x^m_t), \qquad x_0 = x, \tag{4.1.30}
\]
where $F$ is given as
\[
F(x) = B e^{-\kappa x} - G_{\mathrm{eff}} - U'(x). \tag{4.1.31}
\]
The first term is due to the electrostatic potential, with $\kappa^{-1}$ and $B$ dependent on surface charge densities of the particle and the wall. The second term is the effective gravity, which is constant with respect to x [VHB+10]. The third term is a force term confining the particle's position to (0, a). Thus $U(x)$ has the growth condition in Assumption 4.
The frictional force is dependent on the position of the particle with respect to the bottom of the cylinder. The exact form of $D$ is an infinite sum and can be found in [HB65]. For the analysis, it is enough to know the behavior of $D$ as shown in Figure 4.6. The last term in equation (4.1.30) is a random force describing noise in the system due to interactions between the particle and the surrounding fluid. We assume that the noise is Gaussian white noise, and that the Einstein fluctuation-dissipation relation [TKS92] holds:
\[
\eta(t, x)\, dt = \frac{\sqrt{2}\, k_B T}{\sqrt{D(x)}}\, dW_t, \tag{4.1.32}
\]
[Figure 4.6 (image not reproduced): plot of $D(x)/D_\infty$ (vertical axis, 0 to 0.6) against the distance from the plate, x [m] (horizontal axis, 0 to 5 × 10^{-7}).]
Figure 4.6. Plot of $D(x)$. Note that $D(x)$ is not a linear function of x, and $D'(0) \neq 0$.
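where $W_t$ is a standard Wiener process. Setting $v^m_t = \dot x^m_t$, we write an SDE that models the dynamics of this Brownian particle:
\[
\begin{aligned}
dx^m_t &= v^m_t\, dt, & x^m_0 &= x, \\
dv^m_t &= \left[ \frac{F(x^m_t)}{m} - \frac{k_B T}{m D(x^m_t)}\, v^m_t \right] dt + \frac{\sqrt{2}\, k_B T}{m \sqrt{D(x^m_t)}}\, dW_t, & v^m_0 &= v.
\end{aligned}
\tag{4.1.33}
\]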
We are interested in the effective SDE for the particle's position in the limit as mass goes to zero. From Theorem 1 and equation (4.1.3), the limit as m → 0 is
\[
dx_t = \left[ \frac{F(x_t) D(x_t)}{k_B T} + D'(x_t) \right] dt + \sqrt{2 D(x_t)}\, dW_t, \qquad x_0 = x. \tag{4.1.34}
\]
This limit has been identified both experimentally [VHB+10] and through a multiscale analysis [CBEA12]. The advantages of Theorem 1 are that it identifies the limit for equation (4.1.33) and proves convergence in $L^2(\Omega, \mathcal{F}, P)$, which is stronger than the weak convergence proofs from multiscale analyses. First we give a Lyapunov function that, in addition to proving existence and uniqueness of the solutions of SDE (4.1.33) and SDE (4.1.34), also proves exponential convergence of the initial distribution to the unique steady state distribution. This property is called geometric ergodicity [RB06].
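The step from equation (4.1.3) to equation (4.1.34) is a substitution of $\gamma = k_B T / D$ and $\sigma = \sqrt{2}\, k_B T / \sqrt{D}$, and it can be checked numerically. In the Python sketch below the profile `D`, the force `F`, the unit choice `K_B_T = 1` and both function names are hypothetical and serve only to compare the two expressions at one point.

```python
import math

K_B_T = 1.0  # thermal energy in arbitrary units (an assumption for this check)


def limit_via_general_formula(F, D, dD, x):
    """Insert gamma = kT/D and sigma = sqrt(2) kT / sqrt(D) into Eq. (4.1.3)."""
    gamma = K_B_T / D(x)
    dgamma = -K_B_T * dD(x) / D(x) ** 2
    sigma = math.sqrt(2.0) * K_B_T / math.sqrt(D(x))
    drift = F(x) / gamma - dgamma * sigma ** 2 / (2.0 * gamma ** 3)
    return drift, sigma / gamma


def limit_closed_form(F, D, dD, x):
    """Drift and diffusion as written in Eq. (4.1.34)."""
    return F(x) * D(x) / K_B_T + dD(x), math.sqrt(2.0 * D(x))


# Hypothetical diffusion profile that increases away from the wall at x = 0.
D = lambda x: x / (1.0 + x)
dD = lambda x: 1.0 / (1.0 + x) ** 2
F = lambda x: -1.0

general = limit_via_general_formula(F, D, dD, 0.5)
closed = limit_closed_form(F, D, dD, 0.5)
print(general, closed)
```

Both routes give the same drift and diffusion, which shows that the noise-induced drift of (4.1.3) reduces to $D'(x)$ under the fluctuation-dissipation relation.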
Define $D^* = \max_{x \in (0,a)} \{D(x), D'(x)\} < \infty$ and, for some $b, d > 0$, where $b$ will be chosen later, assume
Assumption 5.
kB T D(x)(x − a/2)F (x) (kB T )2
b
b
1
+
≤ − U (x) −
x2 + d,
m
mD(x)
m
m 6D∞ D∗ a
(4.1.35)
for all x ∈ (0, a).
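Consider the Lyapunov function
\[
V(x, v) = \frac{1}{2} m v^2 + U(x) + \frac{k_B T}{3 D_\infty D^* a}\, D(x)\, v x + \frac{1}{6} \frac{(k_B T)^2}{D_\infty D^* a\, m}\, x^2. \tag{4.1.36}
\]
First we give an initial estimate of $V(x, v)$. For $c \ge 0$ we have for all $\delta > 0$,
\[
\left( \sqrt{\delta}\, \frac{v}{\sqrt{2}} - \frac{c x}{\sqrt{2\delta}} \right)^2 \ge 0, \tag{4.1.37}
\]
which gives
\[
c x v \le \frac{\delta}{2} v^2 + \frac{c^2}{2\delta} x^2. \tag{4.1.38}
\]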
Using this bound we have
1
kB T D(x)
V (x, v) ≤ mv 2 + U (x) +
2
3D∞ D∗ a
(kB T )2 2
δ 2
1 2
v + x +
x.
2
2δ
D∞ D∗ am
(4.1.39)
Choosing
δ=
kB T D∞ m
,
6(kB T )2 − m
(4.1.40)
for m small enough and bounding D(x) ≤ D∞ for all x we have
1
(kB T )2 D∞
1
2
V (x, v) ≤ m
+
x2 ,
v
+
U
(x)
+
2 18D∗ (kB T )2 − 3mD∗
6D∞ D∗ a
1
D∞
1
≤m
+
v 2 + U (x) +
x2 ,
∗
∗
2 18D
6D∞ D a
(4.1.41)
(4.1.42)
because m 1, or
− V (x, v) ≥ −m
1
D∞
+
2 18D∗
v 2 − U (x) −
1
x2 .
6D∞ D∗ a
Then the infinitesimal generator, given by equation (4.1.10) with γ =
√
2kB T
√
,
D
(4.1.43)
kB T
D
and σ =
applying to V gives
F (x)
kB T
Lm V = mv
−
v − F (x)v
m
mD(x)
kB T D(x) F (x)
kB T
kB T D(x) 2
+
−
v x+
v
∗
3D∞ D a
m
mD(x)
3D∞ D∗ a
kB T D0 (x) 2
1 (kB T )2
(kB T )2
+
xv
+
2
xv
+
m
3D D∗ a
6 D∞ D∗ am
m2 D(x)
∞
−kB T
kB T D0 (x)x kB T D(x)
v2
=
+
+
D(x)
3D∞ D∗ a
3D∞ D∗ a
|
{z
}
I
kB T D(x)xF (x) (kB T )2
+
+
.
3D∞ D∗ a
m
mD(x)
|
{z
}
II
(4.1.44)
(4.1.45)
(4.1.46)
(4.1.47)
(4.1.48)
We start with the first term I. Let
18kB T D∗
b=
.
2
18D∞ D∗ + 3D∞
(4.1.49)
The inequality
−3kB T D∞ D∗ b + kB T D0 (x)D(x)x + kB T D(x)2 2 −kB T 2
v ≤
v
3D∞ D∗ bD(x)
3D∞
b
1
D∞
=− m
+
v2,
m
2 18D∗
I=
(4.1.50)
(4.1.51)
is shown by noting that, for $x \in (0, a)$, the numerator can be bounded above using the estimates $D(x) \le D_\infty$ and $D'(x) \le D^*$, and by maximizing the denominator. Part II is bounded by Assumption 5; therefore, with inequality (4.1.43),
\[
L_m V \le -\frac{b}{m} V + bd. \tag{4.1.52}
\]
This inequality is stronger than inequality (4.1.14) and shows geometric ergodicity of SDE (4.1.33). In particular, inequality (4.1.52) also proves existence and uniqueness of the solution to equation (4.1.33). For the limiting equation (4.1.34), we use the Lyapunov function in equation (4.1.13) to show existence and uniqueness of $x_t$, which allows us to apply Theorem 1. Therefore $x^m_t \to x_t$ as $m \to 0$ in $L^2$ with respect to $C_{(0,a)}[0,T]$.
4.2 Ornstein-Uhlenbeck Colored Noise
In physical systems of Brownian particles, the noise driving Brownian motion is not white but colored, due to hydrodynamic memory [FGB+11]. We work through a couple of examples with Ornstein-Uhlenbeck colored noise to demonstrate computing the limit and agreement with prior results. We limit ourselves to calculating the limiting equations without stating explicit conditions for existence and uniqueness, assumed in Theorem 1. In this section we consider a Langevin equation, a multidimensional form of equation (4.1.1) that was studied in Section 4.1, with $x_t \in U \subset \mathbb{R}^d$ and $v_t \in \mathbb{R}^d$, driven by colored noise:
\[
\begin{cases}
dx_t = v_t\, dt \\[4pt]
dv_t = \left[ \dfrac{F(x_t)}{m} - \dfrac{\gamma(x_t)}{m} v_t + \dfrac{\sigma(x_t)}{m} \eta_t \right] dt,
\end{cases}
\tag{4.2.1}
\]
where $\eta_t$ is a $k$-dimensional random process whose stationary process is zero-mean with short correlation time τ. To use the framework of the theorem, we consider a special type of noise, the Ornstein-Uhlenbeck process defined by the SDE
\[
d\eta_t = -\frac{A}{\tau}\, \eta_t\, dt + \frac{\lambda}{\tau}\, dW_t, \tag{4.2.2}
\]
where $A$ is a $k$ by $k$ constant invertible matrix, $\lambda$ is a $k$ by $\ell$ constant matrix, and $W$ is an $\ell$-dimensional Wiener process. Defining the variable $N_t$ by the SDE $dN_t = \eta_t\, dt$, we use the framework above by setting $\bar x = (x, N)$ and $\bar v = (v, \eta)$. To clarify this case, we use Theorem 1 to show convergence, as the correlation time τ and mass m tend to zero, in two systems driven by colored noise, studied in earlier works.
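For intuition about the role of τ, the scalar case of the Ornstein-Uhlenbeck equation (4.2.2) can be written down explicitly. The Python sketch below assumes k = ℓ = 1 with scalars a > 0 and λ in place of the matrices A and λ; this restriction, the function names and the sample parameter values are assumptions made for illustration. It uses the standard exact one-step update of a scalar Ornstein-Uhlenbeck process and the corresponding stationary variance.

```python
import math


def ou_exact_step(eta, dt, a, lam, tau, xi):
    """One exact-in-distribution step of d(eta) = -(a/tau) eta dt + (lam/tau) dW.

    xi is a standard normal sample supplied by the caller.
    """
    decay = math.exp(-a * dt / tau)
    std = math.sqrt(lam ** 2 / (2.0 * a * tau) * (1.0 - decay ** 2))
    return eta * decay + std * xi


def ou_stationary_variance(a, lam, tau):
    """Stationary variance (lam/tau)^2 / (2 a / tau) = lam^2 / (2 a tau)."""
    return lam ** 2 / (2.0 * a * tau)


a, lam, tau = 2.0, 1.0, 0.5  # hypothetical parameter values
variance = ou_stationary_variance(a, lam, tau)

# The stationary autocovariance is variance * exp(-a |t| / tau); its integral
# over all t equals variance * 2 * tau / a = lam^2 / a^2, independent of tau.
integrated_autocovariance = variance * 2.0 * tau / a

# Without noise (xi = 0) the step reduces to pure exponential relaxation.
relaxed = ou_exact_step(1.0, 0.1, a, lam, tau, 0.0)
print(variance, integrated_autocovariance, relaxed)
```

As τ decreases the stationary variance grows like 1/τ while the correlation time τ/a shrinks, so the integrated autocovariance stays fixed; this is the sense in which the process approaches white noise.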
4.2.1 Colored noise with constant friction
Consider the system
mẍt = F (xt ) + (−√ẋt + f (xt )ηt ) dt
t
η̇t = − aη
dt + 2λ
dWt .
2
2
(4.2.3)
This is equivalent to the example in [PS08] section 11.7.6 with the substitution ηt =
1
η˜ ,
t
where η̃t is the colored noise in the text. Set m = τ 2 , then in SDE form,


 dxt = vt dt
dvt = F (xt ) + − τvt2 + f (xτ t2)ηt dt
√


2λ
t
dηt = − aη
dt
+
dWt .
2
2
(4.2.4)
In the framework of Theorem 1, define Nt =
Rt
0
ηs ds, xt = (xt , Nt )∗ and vt = (vt , ηt )∗ .
We then have
dxt = vt dt
mdvt = F (xt ) − γ(xt )vt dt + σ(xt )dWt ,
(4.2.5)
− f (xτ t )
τ
0
a
(4.2.6)
with m = 2 and
F (xt )
F (xt ) =
,
0
1
γ(xt ) =
0
.
σ(xt ) = √
2λ
,
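To compute the spurious drift term, we must solve the Lyapunov equation
\[
-\gamma J - J \gamma^* = -\sigma\sigma^*. \tag{4.2.7}
\]
We use Mathematica® to find a closed form for $J$,
\[
J(x) = \begin{pmatrix} \dfrac{\lambda f(x)^2}{a(1 + a\tau)} & \dfrac{\lambda f(x)}{a(1 + a\tau)} \\[10pt] \dfrac{\lambda f(x)}{a(1 + a\tau)} & \dfrac{\lambda}{a} \end{pmatrix}. \tag{4.2.8}
\]
So $G(x)$ is given as
\[
G(x) = J(x)\gamma^*(x) = \begin{pmatrix} 0 & \dfrac{\lambda f(x)}{1 + \tau a} \\[10pt] -\dfrac{\lambda f(x)}{1 + \tau a} & \lambda \end{pmatrix}. \tag{4.2.9}
\]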
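The closed forms (4.2.8) and (4.2.9) can be checked numerically without a computer algebra system. The Python sketch below assumes that the friction matrix has rows (1/τ, −f/τ) and (0, a) and that σ = (0, √(2λ))ᵀ; this is one reading of (4.2.6) that is consistent with (4.2.8)-(4.2.9), and it should be treated as an assumption. The numerical values of τ, a, λ and f(x) are arbitrary, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]


# Arbitrary sample values; f stands for f(x) at a fixed position x.
tau, a, lam, f = 0.3, 1.7, 0.9, 2.5

# Assumed reading of the friction and noise coefficients in (4.2.6).
gamma = [[1.0 / tau, -f / tau], [0.0, a]]
sigma_sigma_T = [[0.0, 0.0], [0.0, 2.0 * lam]]

# Closed form (4.2.8).
c = a * (1.0 + a * tau)
J = [[lam * f ** 2 / c, lam * f / c], [lam * f / c, lam / a]]

gamma_J = matmul(gamma, J)
G = matmul(J, transpose(gamma))  # G = J gamma^*, Eq. (4.2.9)

# Residual of the Lyapunov equation gamma J + J gamma^* = sigma sigma^*.
residual = max(
    abs(gamma_J[i][j] + G[i][j] - sigma_sigma_T[i][j]) for i in range(2) for j in range(2)
)
print(residual, G)
```

A residual at the level of rounding error confirms that, under the stated assumption on γ and σ, the matrix in (4.2.8) solves (4.2.7) and that its product with γ* has the antisymmetric off-diagonal structure shown in (4.2.9).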
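Now we compute the spurious drift in the first component (i = 1) using equation (3.2.3), noting that the sum is zero for α = 2, j = 1:
\[
\begin{aligned}
S_1(x_t) &= \sum_{\alpha, j, k} \frac{\partial}{\partial x_\alpha}\big( \gamma^{-1}_{1j}(x_t) \big)\, \gamma^{-1}_{\alpha k}(x_t)\, G_{jk}(x_t) \\
&= \frac{f'(x_t)}{a}\, \tau \left( -\frac{\lambda f(x_t)}{1 + \tau a} \right) + \frac{f'(x_t)}{a}\, \frac{f(x_t)}{a}\, \lambda \\
&= \frac{\lambda f'(x_t) f(x_t)}{a^2 (1 + \tau a)}.
\end{aligned}
\tag{4.2.10}
\]
Therefore the effective SDE for $x_t$ is
\[
dx_t = \left[ \tau F(x_t) + \frac{\lambda f'(x_t) f(x_t)}{a^2 (1 + \tau a)} \right] dt + \sqrt{\frac{2\lambda}{a^2}}\, f(x_t)\, dW_t, \tag{4.2.11}
\]
which is in agreement with [PS08].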
Figure 4.7. An experiment from [DB06a], where the letters “DNA” are heated 2°C warmer than the water, of 50 nm DNA particles moving from cooler regions (left) to warmer regions (right) when the temperature of the water is raised from 3°C to 20°C.
4.3 Thermophoresis using OU colored noise
The same type of limit is used to model a phenomenon called thermophoresis, the movement of small particles in a temperature gradient. Experimental methods to separate and group small particles have been used with much success [Pia08]. Theoretical models of this phenomenon are still in question. An example of thermophoresis is the experiment [DB06a], where DNA particles 50 nm in size float freely in a shallow chamber of water (see Figure 4.7). When the temperature of the water is 3°C and a region is heated to 5°C, the DNA particles move to the colder regions; when the water is 20°C and a region is heated to 22°C, the particles move to warmer regions. In the limit as m, τ → 0, the particle's average velocity will change direction and move toward the hotter regions or toward the colder regions depending on the ratio m/τ. Similarly, the stationary probability distribution of the particle will change from peaking in the hotter regions to peaking in the colder regions, depending on m/τ as well. We now show this limit using Theorem 1. For a friction coefficient dependent on position, γ = γ(x), we consider the Langevin equation for a Brownian particle with OU colored noise:
 m
dx
= vtm dt


 t
γ(xm
t ) m
dvtm =
F (xm
vt +
t )−
m



dηt = − 2ητ t dt + τ2 dWt ,
√
γ(xm
t )
2D(xm
t )ηt
m
dt
(4.3.1)
117
with Wt a one-dimensional Wiener process. This is an approximation: by the fluctuation-dissipation
relation, the time correlation of the noise implies memory effects in the friction term [McL89],
and these are left out here.
For a spherical particle of radius R immersed in a fluid of viscosity µ, which in
general depends on the absolute temperature T, i.e. µ = µ(T), the friction coefficient
γ satisfies Stokes' law,
\[ \gamma(T) = 6\pi\mu(T)R, \tag{4.3.2} \]
and the diffusion coefficient D is related to γ by the fluctuation-dissipation relation
[TKS92],
\[ D(T) = \frac{k_B T}{\gamma(T)}. \tag{4.3.3} \]
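As a quick numerical illustration of eqs. (4.3.2) and (4.3.3), the following minimal sketch evaluates γ and D for one set of parameters. The radius, viscosity and temperature below are hypothetical illustration values, not taken from the experiments cited in the text, and the function names are ours.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_friction(mu, radius):
    """Friction coefficient gamma = 6*pi*mu*R, eq. (4.3.2)."""
    return 6.0 * math.pi * mu * radius


def diffusion_coefficient(temperature, mu, radius):
    """Diffusion coefficient D = k_B*T/gamma, eq. (4.3.3)."""
    return K_B * temperature / stokes_friction(mu, radius)


# Hypothetical illustration values (assumptions, not from the text):
radius = 25e-9       # particle radius in m
mu_water = 1.0e-3    # viscosity in Pa*s, roughly water near room temperature
temperature = 293.0  # absolute temperature in K

gamma = stokes_friction(mu_water, radius)
D = diffusion_coefficient(temperature, mu_water, radius)
print(f"gamma = {gamma:.3e} kg/s, D = {D:.3e} m^2/s")
```

Since D is inversely proportional to γ, any temperature dependence of µ(T) enters the diffusion coefficient twice: through the explicit factor T and through γ(T).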
For ease of argument, we will assume that the particle's motion is one-dimensional
in a horizontal direction perpendicular to gravity, with position denoted by xt ∈ R
for all times t ≥ 0. We also assume viscosity to depend only on temperature, and
fluid thermal expansion and convection to be negligible. If more complex models are
needed, e.g. to account for interactions between the fluid and the particle or for the
thermal expansion of the fluid, then the above model may need to be modified and
the results may change, but the approach to such problems will be the same
as described in this section. The relation in eq. (4.3.3) assumes local thermodynamic
equilibrium [OS06], which implies that the temperature gradients should not be
too steep; such conditions have been verified experimentally, e.g., in
Ref. [DB06b]. The resulting motion of the particle is governed by the stochastic
Newton equation:
\[ m\ddot{x}_t = -\gamma(x_t)\dot{x}_t + \gamma(x_t)\sqrt{2D(x_t)}\,\frac{\eta_t}{\sqrt{\tau}}, \tag{4.3.4} \]
with initial conditions x0 = x and ẋ0 = v. The coloured noise ηt is an Ornstein-Uhlenbeck
process (OUP) defined by the stochastic differential equation (SDE):
\[ d\eta_t = -\frac{2}{\tau}\,\eta_t\,dt + \sqrt{\frac{4}{\tau}}\,dW_t, \tag{4.3.5} \]
where τ > 0 is the noise correlation time and Wt is a standard Wiener process. Its stationary
solution is a zero-mean Gaussian process with E[(ηt ηs)/τ] = (1/τ) exp(−(2/τ)|t − s|),
and its covariance function converges to the delta function as τ tends to zero.
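The convergence of the covariance to a delta function can be checked numerically. The sketch below integrates the kernel (1/τ) exp(−2|t|/τ) over a fixed window [−ε, ε] for decreasing τ: the total mass of the kernel is 1 for every τ, and the fraction of it inside the window, 1 − exp(−2ε/τ), tends to 1. The window half-width and the values of τ are hypothetical choices made for illustration.

```python
import math


def ou_covariance(t, tau):
    """Covariance of eta_t/sqrt(tau) at time lag t: (1/tau)*exp(-2|t|/tau)."""
    return math.exp(-2.0 * abs(t) / tau) / tau


def mass_within(eps, tau, n=20000):
    """Midpoint-rule integral of the covariance over the window [-eps, eps]."""
    h = 2.0 * eps / n
    return sum(ou_covariance(-eps + (i + 0.5) * h, tau) for i in range(n)) * h


eps = 0.05                      # hypothetical window half-width
taus = (1.0, 0.1, 0.01, 0.001)  # decreasing correlation times
masses = [mass_within(eps, tau) for tau in taus]
for tau, mass in zip(taus, masses):
    print(f"tau = {tau:g}: mass in window = {mass:.6f}")
```

As τ decreases, the mass concentrates in any fixed neighbourhood of the origin, which is the sense in which ηt/√τ approximates white noise.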
Typically, the hydrodynamic and inertial memory time-scales, i.e. τ and m/γ respectively, are
very fast (in particular, much faster than the typical time resolution
at which the particle's Brownian motion is sampled experimentally), and their effects
accumulate over the longer diffusive time-scale and eventually result in transport
dynamics. Therefore, following Langevin's approach, it is customary to drop the inertial
term, i.e. set the left hand side to zero in eq. (4.3.4) [Nel67], and take the time
correlation of the noise to zero. However, the limit of eq. (4.3.4) as m → 0, i.e. σ → 0,
has to be studied with care, requiring a nontrivial computation, as, in general, similar
limits involve additional drift terms [KPS04, Fre04]. Here we are interested in the
long-time behaviour of xt and in how the particle undergoes a deterministic drift in
response to a temperature gradient as both the inertial and the noise characteristic
times are taken to zero.
For the analysis we introduce the non-dimensional quantity
\[ \theta(x) = \frac{m}{\gamma(x)\tau}. \tag{4.3.6} \]
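For m and τ of the same order and taken to zero, we write equation (4.3.1) in terms
of τ:
\[ \begin{cases}
dx_t = v_t\,dt \\[1ex]
dN_t = \eta_t\,dt \\[1ex]
dv_t = \left[F(x_t) - \dfrac{1}{\theta\tau}\,v_t + \dfrac{\sqrt{2D(x_t)}\,\eta_t}{\theta\tau}\right] dt \\[2ex]
d\eta_t = -\dfrac{2\eta_t}{\tau}\,dt + \dfrac{2}{\tau}\,dW_t.
\end{cases} \tag{4.3.7} \]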
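Unlike in previous sections, τ is the small parameter instead of m. Define the vectors x =
(x, N)∗, v = (v, η)∗, and
\[ \gamma(x) = \begin{pmatrix} \dfrac{1}{\theta(x)} & -\dfrac{\sqrt{2D(x)}}{\theta(x)} \\[2ex] 0 & 2 \end{pmatrix}, \qquad \sigma(x) = \begin{pmatrix} 0 \\ 2 \end{pmatrix}. \tag{4.3.8} \]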
From the above, γ is invertible, with
\[ \gamma^{-1}(x) = \begin{pmatrix} \theta(x) & \dfrac{\sqrt{2D(x)}}{2} \\[2ex] 0 & \dfrac{1}{2} \end{pmatrix}. \tag{4.3.9} \]
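To compute the spurious drift term, we must solve the Lyapunov equation,
\[ -\gamma J - J\gamma^* = -\sigma\sigma^*. \tag{4.3.10} \]
We use Mathematica® to find a closed form for J,
\[ J(x) = \begin{pmatrix} \dfrac{2D(x)}{1+2\theta(x)} & \dfrac{\sqrt{2D(x)}}{1+2\theta(x)} \\[2ex] \dfrac{\sqrt{2D(x)}}{1+2\theta(x)} & 1 \end{pmatrix}. \tag{4.3.11} \]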
So G(x) is given as
\[ G(x) = J(x)\gamma^*(x) = \begin{pmatrix} 0 & \dfrac{2\sqrt{2D(x)}}{1+2\theta(x)} \\[2ex] -\dfrac{2\sqrt{2D(x)}}{1+2\theta(x)} & 2 \end{pmatrix}. \tag{4.3.12} \]
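The closed forms for J and G can be verified without a computer algebra system. The following sketch, which uses only plain Python lists, substitutes the matrices of eq. (4.3.8) and the stated J into the Lyapunov equation γJ + Jγ∗ = σσ∗ and checks that Jγ∗ reproduces G. The numerical values of θ and D are hypothetical, and the helper names are ours; since all entries are real, the adjoint is taken to be the transpose.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def closed_forms(theta, D):
    """gamma and sigma from eq. (4.3.8), J from (4.3.11), G from (4.3.12)."""
    s = math.sqrt(2.0 * D)
    gamma = [[1.0 / theta, -s / theta], [0.0, 2.0]]
    sigma = [[0.0], [2.0]]
    J = [[2.0 * D / (1 + 2 * theta), s / (1 + 2 * theta)],
         [s / (1 + 2 * theta), 1.0]]
    G = [[0.0, 2 * s / (1 + 2 * theta)],
         [-2 * s / (1 + 2 * theta), 2.0]]
    return gamma, sigma, J, G


theta, D = 0.7, 1.3  # hypothetical test values
gamma, sigma, J, G = closed_forms(theta, D)
gamma_T = transpose(gamma)

# Left-hand side gamma*J + J*gamma^T versus right-hand side sigma*sigma^T.
lhs = [[x + y for x, y in zip(r1, r2)]
       for r1, r2 in zip(matmul(gamma, J), matmul(J, gamma_T))]
rhs = matmul(sigma, transpose(sigma))
residual = max_abs_diff(lhs, rhs)
g_residual = max_abs_diff(matmul(J, gamma_T), G)
print(residual, g_residual)
```

Both residuals should be at the level of floating-point round-off for any θ > 0 and D > 0.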
Using equation (1.1.11), as τ, m → 0 such that θ is held constant, and the fact
that θ′ = −θγ′/γ, the limiting equation for x is given as
\[ dx_t = \left[\frac{F(x_t)}{\theta(x_t)} + \frac{\gamma(x_t)D'(x_t) - 4\theta(x_t)\gamma'(x_t)D(x_t)}{2\gamma(x_t)\,(1+2\theta(x_t))}\right] dt + \sqrt{2D(x_t)}\,dW_t. \tag{4.3.13} \]
4.3.1 Physical interpretation: Drift and probability density
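We write the effective limiting SDE [eq. (4.3.13)] as dxt = A(xt) dt + B(xt) dWt, with drift A(x)
and noise coefficient B(x) = √(2D(x)). Since this SDE has been constructed using a stochastic
integral with the Itô convention, the expected position is
\[ E[x_t] = E[x_0] + E\left[\int_0^t A(x_s)\,ds\right]. \tag{4.3.14} \]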
The sign of A(x) determines the direction towards which a Brownian particle is expected to travel. Therefore, if, e.g., we have A(x) > 0, the particle will on average
travel towards increasing x until it reaches some boundaries, which can be either
absorbing boundaries or reflecting boundaries.
Figure 4.8. (a) Schematic representation of the behaviour of thermophoretic particles in the presence of absorbing boundaries: black (white) particles have Ab (x) > 0
(Aw (x) < 0) [eq. (4.3.14)] and are mostly pushed towards the right (left) boundary, where they are removed from the channel. (b) Schematic representation of the
behaviour of thermophoretic particles in the presence of reflecting boundaries: the
particles eventually reach a steady state probability density ρ∞ (x) [eq.(4.3.17)], which
is different for particles with Ab (x) > 0 (black) and Aw (x) < 0 (white).
In the presence of absorbing boundaries, particles disappear from the system as
soon as they reach a boundary. Therefore, the sign of A(x) determines the boundary
at which particles are preferentially absorbed, as schematically shown in figure 4.8(a).
This situation can be experimentally realized, for example, within a relatively long
microfluidic channel where particles are steadily injected at a certain position, e.g.,
x = 0, and removed once they reach either end of the channel. In the presence
of a temperature gradient, these particles move towards increasing or decreasing x
depending on the sign of A(x) and, therefore, can be sorted and classified on the basis
of their physical and chemical properties that influence A(x). Such a sorting process
is more efficient the longer the channel and the smaller the noise term, i.e. B(x).
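The sorting effect with absorbing boundaries can be illustrated with a short Euler-Maruyama simulation. The sketch below makes the simplifying assumption, not made in the text, that A and B are constant on a channel (−L, L) with injection at x = 0; in that idealized case the probability of being absorbed at the right boundary is 1/(1 + exp(−2AL/B²)), which the simulation should reproduce up to sampling and discretization error. All numerical values and function names are hypothetical.

```python
import math
import random


def exit_right_probability(A, B, L):
    """Exact probability of reaching +L before -L from x = 0 (constant A, B)."""
    return 1.0 / (1.0 + math.exp(-2.0 * A * L / B**2))


def simulate_exit_fraction(A, B, L, dt=2e-3, n_paths=1000, seed=0):
    """Euler-Maruyama paths of dx = A dt + B dW, absorbed at x = -L or x = +L."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    absorbed_right = 0
    for _ in range(n_paths):
        x = 0.0
        while -L < x < L:
            x += A * dt + B * sqrt_dt * rng.gauss(0.0, 1.0)
        if x >= L:
            absorbed_right += 1
    return absorbed_right / n_paths


A, B, L = 1.0, 1.0, 1.0  # hypothetical drift, noise coefficient, half-length
estimate = simulate_exit_fraction(A, B, L)
exact = exit_right_probability(A, B, L)
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

In this idealized setting the exponent 2AL/B² makes the statement about efficiency quantitative: a longer channel or a smaller noise coefficient pushes the absorption probability at the preferred boundary towards 1.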
In the presence of reflecting boundaries, particles are reflected back into the system
when they reach a boundary. In this way, once a particle has interacted with the
boundaries multiple times, the particle’s position xt reaches a steady state probability
density ρ∞(x), as schematically indicated in figure 4.8(b). The time-dependent ρ(x, t)
is the solution to the following Fokker-Planck equation [Eq. (2.2.46)] (also known as the
forward Kolmogorov equation),
\[ \frac{\partial\rho}{\partial t} = -\frac{\partial}{\partial x}\big[A(x)\rho\big] + \frac{\partial^2}{\partial x^2}\left[\frac{B(x)^2}{2}\,\rho\right], \tag{4.3.15} \]
with a given initial condition ρ(x, 0) = ρ0 (x). Note that x is referring to the position
variable with respect to the density function ρ, not the initial condition. Since the
Fokker-Planck equation is deterministic, its solution, i.e. the evolution of the probability density over time, does not involve any randomness. As t → +∞, the solution
to eq. (4.3.15) converges to the steady state distribution ρ∞ (x), under certain conditions [Ris89, sec. 5.2]. If we assume the motion of the particle to be restricted to the
interval (a, b), a < b, then we can solve the stationary Fokker-Planck equation for the
SDE (4.3.13) [Gar04]. Given A(x) and B(x) the stationary solution is
\[ \rho_\infty(x) = \frac{C}{B(x)^2}\exp\left(2\int_a^x \frac{A(\tilde{x})}{B(\tilde{x})^2}\,d\tilde{x}\right), \tag{4.3.16} \]
which can also be expressed as a function of D(x), γ(x) and θ(x) as
\[ \rho_\infty(x) = C\,D(x)^{-\frac{1+4\theta(x)}{2+4\theta(x)}}\,\gamma(x)^{-\frac{2\theta(x)}{1+2\theta(x)}}, \tag{4.3.17} \]
where C is a normalizing constant. This situation can be experimentally realized in
closed systems where a temperature gradient is present. Interestingly, this is the case
of most experiments performed to study thermophoresis and the Soret effect, where
a suspension of particles in a thermal gradient is given enough time to relax to its
steady state distribution [Pia08, SS11].
4.3.2 Analysis of the limiting equation and discussion
For the following discussion, we will express the thermophoretic drift as a function of
T and µ(T), and since µ(T) is interpreted as an expansion around some temperature
T0, we will write ∆T = T(x) − T0, and (d/dx) µ(T) = µ′(T)∆T. From eq. (4.3.14), using
eqs. (4.3.2) and (4.3.3), we find that the effective thermophoretic drift is
\[ A(x) = k_B T'\,\frac{\mu(T) - \mu'(T)\Delta T\,[1+4\theta]}{12\pi R\,\mu^2(T)\,[1+2\theta]}, \tag{4.3.18} \]
where we have set T = T(x) and θ = θ(x), suppressing the dependence on x
for brevity, and µ′(T) and T′ are derivatives with respect to T and x respectively.
Eq. (4.3.18) shows that the thermophoretic drift is determined not only by T′, but
also by the dependence of µ on T. Interestingly, if µ′(T) > 0 and µ(T) > µ′(T)∆T,
from eq. (4.3.18), there is a critical θ, denoted θc, at which the drift A(x) changes
sign:
\[ \theta_c = \frac{\mu(T) - \mu'(T)\Delta T}{4\mu'(T)\Delta T}. \tag{4.3.19} \]
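The sign change at θc is easy to confirm numerically. The sketch below implements eqs. (4.3.18) and (4.3.19) directly and evaluates the drift below, at and above θc. The viscosity coefficients, ∆T, T′ and the radius are hypothetical values chosen only so that the hypotheses µ′(T) > 0 and µ(T) > µ′(T)∆T hold; the function names are ours.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermophoretic_drift(T_prime, mu, mu_prime, delta_T, theta, radius):
    """Effective drift A(x) from eq. (4.3.18)."""
    numerator = mu - mu_prime * delta_T * (1.0 + 4.0 * theta)
    denominator = 12.0 * math.pi * radius * mu**2 * (1.0 + 2.0 * theta)
    return K_B * T_prime * numerator / denominator


def critical_theta(mu, mu_prime, delta_T):
    """Critical value theta_c from eq. (4.3.19)."""
    return (mu - mu_prime * delta_T) / (4.0 * mu_prime * delta_T)


# Hypothetical values: viscosity linear in delta_T, mu = mu0 + mu1*delta_T.
mu0, mu1, delta_T = 1.0e-3, 1.0e-5, 10.0
mu = mu0 + mu1 * delta_T
T_prime, radius = 1.0, 25e-9

theta_c = critical_theta(mu, mu1, delta_T)
below = thermophoretic_drift(T_prime, mu, mu1, delta_T, 0.5 * theta_c, radius)
at_c = thermophoretic_drift(T_prime, mu, mu1, delta_T, theta_c, radius)
above = thermophoretic_drift(T_prime, mu, mu1, delta_T, 2.0 * theta_c, radius)
print(theta_c, below, at_c, above)
```

For T′ > 0 the drift is positive (towards the hot side) for θ < θc and negative for θ > θc, and vanishes at θc up to round-off.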
The stationary density is given as
\[ \rho_\infty(x) = C\left[\frac{\mu(T(x))}{T(x)^{1+4\theta}}\right]^{\frac{1}{2+4\theta}}, \tag{4.3.20} \]
which is an inverse power of T unless µ(T) is at least quadratic in T.
Figure 4.9. µ(T) constant. (a) Thermophoretic drift A(x) for various θ (dashed,
dashed-dotted and dotted lines) in the presence of a temperature gradient (solid line).
(b) The corresponding theoretical steady state distributions (lines) are always peaked towards
the cold side. The symbols represent the steady state distributions resulting
from Brownian dynamics simulations.
Below, we consider in more detail the three simplest cases, i.e. µ constant, µ
linear in T and µ quadratic in T . We stress that, even though these are the first three
orders of approximation, it might be necessary to consider more complex situations
in real applications because of the large range over which T can vary.
µ(T) constant. In the simplest case µ does not depend on T, i.e. µ(T) ≡ µ0 > 0, and
the thermophoretic drift term (4.3.18) becomes
\[ A(x) = \frac{k_B T'}{12\pi R\mu_0(1+2\theta)}. \tag{4.3.21} \]
We stress that in this case θ is independent of x. For example, let us consider the
temperature gradient [grey solid line in figure 4.9(a)]. The associated thermophoretic
drift [eq. (4.3.21)] is presented in figure 4.9(a) for three values of θ, corresponding to
the situations where the time correlation of the coloured noise dominates (θ = 0, blue
dashed line), the time-scale of the particle inertia dominates (θ = +∞, red dotted-dashed
line) and the two time-scales are comparable (θ = 1, green dotted line). For
all θ ≥ 0 the drift term has the same sign as T′, so the instantaneous drift pushes the
Brownian particle towards the hotter region. Equation (4.3.18) features two
terms, one dependent on the frictional gradient and the other on the coloured
noise; for µ constant, since there is no frictional gradient, the drift is driven only by
the coloured noise and the resulting flow increases with decreasing θ.
If now the particles are allowed to interact repeatedly with the boundaries, they
will eventually reach their stationary distribution, using eq. (4.3.17),
\[ \rho_\infty(x) \propto T(x)^{-\frac{1+4\theta}{2+4\theta}}. \tag{4.3.22} \]
As shown in figure 4.9(b), the particles will accumulate towards the areas of low
temperature for all θ ≥ 0, which is in agreement with most experiments [SS11]. This
result is in striking contrast with the fact that the instantaneous thermophoretic drift
actually pushes the particles in the opposite direction towards the hotter regions. Such
a difference between the instantaneous drift and the long-term stationary distribution
has also been observed in systems at thermodynamic equilibrium [LBLO01, AKQ07,
VHB+ 10, BVH+ 11].
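For constant µ, the passage from the general stationary solution (4.3.16) to the power law (4.3.22) can be checked by direct quadrature. The sketch below works in units where kB/γ = 1, so that D(x) = T(x), A(x) = T′(x)/(2(1 + 2θ)) and B(x)² = 2T(x); it evaluates (4.3.16) with the trapezoidal rule and compares the normalized result with the normalized closed form. The linear temperature profile on [0, 1] and the choice of units are assumptions made for illustration.

```python
import math


def trapezoid(values, xs):
    """Trapezoidal-rule integral of sampled values over the grid xs."""
    return sum(0.5 * (values[i] + values[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def density_by_quadrature(T, T_prime, theta, xs):
    """Normalized rho from eq. (4.3.16) with D = T, A = T'/(2(1+2θ)), B^2 = 2D."""
    rho, integral, prev_x, prev_f = [], 0.0, None, None
    for x in xs:
        D = T(x)
        A = T_prime(x) / (2.0 * (1.0 + 2.0 * theta))
        f = A / (2.0 * D)  # integrand A/B^2
        if prev_x is not None:
            integral += 0.5 * (f + prev_f) * (x - prev_x)
        prev_x, prev_f = x, f
        rho.append(math.exp(2.0 * integral) / (2.0 * D))
    area = trapezoid(rho, xs)
    return [r / area for r in rho]


def density_closed_form(T, theta, xs):
    """Normalized rho from eq. (4.3.22)."""
    rho = [T(x) ** (-(1.0 + 4.0 * theta) / (2.0 + 4.0 * theta)) for x in xs]
    area = trapezoid(rho, xs)
    return [r / area for r in rho]


def temperature(x):
    return 1.0 + x  # hypothetical linear profile, cold at x = 0, hot at x = 1


def temperature_gradient(x):
    return 1.0


xs = [i / 2000 for i in range(2001)]
max_error = {}
cold_over_hot = {}
for theta in (0.0, 1.0, 100.0):
    q = density_by_quadrature(temperature, temperature_gradient, theta, xs)
    c = density_closed_form(temperature, theta, xs)
    max_error[theta] = max(abs(a - b) for a, b in zip(q, c))
    cold_over_hot[theta] = c[0] / c[-1]
print(max_error, cold_over_hot)
```

The ratio of the density at the cold end to that at the hot end exceeds 1 for every θ, consistent with accumulation in the cold region even though the drift points towards the hot side.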
µ(T) linear. We now consider the case when µ(T) is linear, i.e. µ(T(x)) = µ0 +
µ1∆T(x) > 0. Interestingly, if µ1 > 0 (e.g. Ref. [Sha10]), A(x) changes sign at a
critical value of θ [eq. (4.3.19)]
\[ \theta_c = \frac{\mu_0}{4\mu_1\Delta T}. \tag{4.3.23} \]
The resulting effective thermophoretic drift (4.3.18) is shown in figure 4.10(a) for the
cases of θ = 0 < θc, θ = 1 > θc, and θ = ∞. In particular,
\[ A(x) = \frac{k_B T'\mu_0}{12\pi R(\mu_0+\mu_1\Delta T)^2} \quad \text{for } \theta = 0 \tag{4.3.24} \]
Figure 4.10. Same as figure 4.9 for µ(T ) linear. Note in (a) the sign change of A(x)
as θ crosses θc .
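and
\[ A(x) = -\frac{k_B T'\mu_1\Delta T}{6\pi R(\mu_0+\mu_1\Delta T)^2} \quad \text{for } \theta \to +\infty, \tag{4.3.25} \]
whose dependence on the thermal gradient clearly shows opposite signs if µ1 > 0.
For the stationary distribution [figure 4.10(b)],
\[ \rho_\infty(x) \propto \left[\mu_0\,\Delta T(x)^{-(1+4\theta)} + \mu_1\,\Delta T(x)^{-4\theta}\right]^{\frac{1}{2+4\theta}}. \tag{4.3.26} \]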
For µ1 > 0 it is clear that ρ∞ will be an inverse power of T(x). If µ1 < 0, since
γ(x) > 0 for all x, then µ0 > |µ1∆T(x)| and µ0∆T(x)^{−(1+4θ)} > |µ1∆T(x)^{−4θ}|. Thus
ρ∞ will be an inverse power of temperature for all admissible µ0, µ1. This suggests
that the particle will more likely be found in the colder regions. Interestingly, unlike
the expected drift, derived from eqs. (4.3.24)-(4.3.25), there is no qualitative change
in behaviour at θc.
Figure 4.11. Same as figure 4.9 for µ(T ) quadratic. Note in (b) the change of the
distribution peak from cold to hot as a function of θ.
µ(T) quadratic. We finally consider the case when µ(T) is quadratic in T, i.e. µ(T(x)) =
µ0 + µ1∆T(x) + µ2∆T(x)². Again, A(x) changes sign at [eq. (4.3.19)]
\[ \theta_c = \frac{\mu_0 - \mu_2\Delta T^2}{4(\mu_1\Delta T + 2\mu_2\Delta T^2)}, \tag{4.3.27} \]
if µ0 > µ2∆T² and µ1∆T + 2µ2∆T² > 0, or µ0 < µ2∆T² and µ1∆T + 2µ2∆T² < 0.
The resulting effective thermophoretic drift (4.3.18) for the extreme cases is
\[ A(x) = \frac{k_B T'(\mu_0 - \mu_2\Delta T^2)}{12\pi R(\mu_0+\mu_1\Delta T+\mu_2\Delta T^2)^2} \quad \text{for } \theta = 0 \tag{4.3.28} \]
and
\[ A(x) = -\frac{k_B T'(\mu_1\Delta T + 2\mu_2\Delta T^2)}{6\pi R(\mu_0+\mu_1\Delta T+\mu_2\Delta T^2)^2} \quad \text{for } \theta \to +\infty, \tag{4.3.29} \]
whose dependence on the thermal gradient shows opposite signs for µ0 > µ2∆T². In
figure 4.11(a), we study the case µ0 < µ2∆T² and µ1∆T + 2µ2∆T².
The stationary distribution is
\[ \rho_\infty \propto \left[\mu_0\,\Delta T^{-(1+4\theta)} + \mu_1\,\Delta T^{-4\theta} + \mu_2\,\Delta T^{(1-4\theta)}\right]^{\frac{1}{2+4\theta}}, \tag{4.3.30} \]
where it is clear that there are combinations of µ0, µ1 and µ2 for which the density
will undergo a transition from peaking at colder regions to peaking at hotter regions, as shown
in figure 4.11(b). This may lead to interesting behaviours as a function of the temperature.
For example, in reference [DB06a], DNA particles (θ ≪ 1) change from accumulating
in colder regions when the minimum temperature is T = 276 K to accumulating in
warmer regions when the minimum temperature is raised to T = 293 K and the gradient is
left unchanged (2 K between the colder and warmer regions). In agreement with
this experiment we predict that, for T = 276 K, since µ0 + µ1∆T > µ2∆T² when µ(T) is
expanded around T0 = 273 K, the stationary distribution ρ∞ [eq. (4.3.30)] peaks in colder
regions for all θ, and that, for T = 293 K, since µ2∆T² > µ0 + µ1∆T, the stationary
distribution peaks in hotter regions for θ < 1/4, which is verified in this case. However, we
note that the model presented here leads to a different explanation than the one in
reference [DB06a], and there may be numerous other factors influencing thermophoresis,
including thermal expansion of the fluid and convection.
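The role of the threshold θ < 1/4 can be seen by evaluating eq. (4.3.30) at the two ends of a temperature interval. In the sketch below the coefficients µ0, µ1, µ2 and the values of ∆T at the cold and hot ends are hypothetical, chosen so that the quadratic term dominates; they are not fitted to any experiment. For θ < 1/4 the exponent 1 − 4θ of the µ2 term is positive, so that term grows with ∆T, whereas for θ > 1/4 all three exponents are negative.

```python
def density_shape(delta_T, theta, mu0, mu1, mu2):
    """Unnormalized stationary density from eq. (4.3.30)."""
    bracket = (mu0 * delta_T ** (-(1.0 + 4.0 * theta))
               + mu1 * delta_T ** (-4.0 * theta)
               + mu2 * delta_T ** (1.0 - 4.0 * theta))
    return bracket ** (1.0 / (2.0 + 4.0 * theta))


# Hypothetical coefficients and interval ends (assumptions for illustration).
mu0, mu1, mu2 = 1.0, 0.1, 1.0
cold_end, hot_end = 2.0, 4.0  # values of delta_T at the two boundaries

peaks = {}
for theta in (0.0, 0.1, 1.0):
    at_cold = density_shape(cold_end, theta, mu0, mu1, mu2)
    at_hot = density_shape(hot_end, theta, mu0, mu1, mu2)
    peaks[theta] = "hot" if at_hot > at_cold else "cold"
print(peaks)
```

With these values the larger density sits at the hot end for θ = 0 and θ = 0.1 and at the cold end for θ = 1, which is the qualitative transition described above.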
4.4 Higher dimensions
4.4.1 Three dimensional Brownian particle with non-conservative force
As a generalization of the example in section 4.1.3 we consider a Brownian particle in a
fluid, with position x_t^m ∈ R³ and spatially varying noise coefficient σ(x), satisfying the
fluctuation-dissipation theorem [TKS92] in multi-dimensional form
\[ \gamma(x) = \frac{\sigma(x)\sigma(x)^*}{k_B T} = \frac{\sigma\sigma^*(x)}{k_B T}. \]
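Given an external force F, equation (1.1.4) is written
\begin{align}
dx_t^m &= v_t^m\,dt, & x_0 &= x, \tag{4.4.1} \\
dv_t^m &= \left[\frac{F(x_t^m)}{m} - \frac{\sigma\sigma^*(x_t^m)}{m k_B T}\,v_t^m\right] dt + \frac{\sigma(x_t^m)}{m}\,dW_t, & v_0 &= v, \tag{4.4.2}
\end{align}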
for xt ∈ U ⊂ R³ a bounded open set. If F is a conservative force, i.e. F = −∇U,
the limiting equation (1.1.11) can be written by using the Gibbs distribution, ρ∞ =
C exp{−U(x) − m|v|²/2}, as a solution to the stationary Fokker-Planck equation
corresponding to equation (1.1.11) and solving for G. For a non-conservative force
F, the stationary solution will not be Gibbs and the limit can be identified using
Theorem 1. To find the limit, the Lyapunov equation that must be solved is
\[ \begin{aligned} -\gamma J - J\gamma^* &= -\sigma\sigma^* \\ -\sigma\sigma^* J - J\sigma\sigma^* &= -\sigma\sigma^*. \end{aligned} \tag{4.4.3} \]
A solution to the above equation is J = ½ I, where I is the identity matrix, and the
solution is unique. Thus the matrix G is
\[ G(x) = J(x)\gamma^*(x) = \frac{1}{2}\,\sigma\sigma^*(x). \tag{4.4.4} \]
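Using equation (1.1.11), the limit as m → 0 is
\[ dx_t = \left[(\sigma\sigma^*(x_t))^{-1} k_B T\,F(x_t) - k_B T\,S(x_t)\right] dt + [\sigma(x_t)^*]^{-1} k_B T\,dW_t, \tag{4.4.5} \]
where the ith component of S is defined as
\[ S_i(x) = \frac{1}{2}\sum_{\alpha,j,n} \frac{\partial}{\partial x_\alpha}\Big([(\sigma\sigma^*(x))^{-1}]_{ij}\Big)\,[(\sigma\sigma^*(x))^{-1}]_{\alpha,n}\,[\sigma\sigma^*(x)]_{j,n}. \tag{4.4.6} \]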
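Summing over n and noting that σσ∗ is Hermitian,
\[ S_i(x) = \frac{1}{2}\sum_{\alpha,j} \frac{\partial}{\partial x_\alpha}\Big([(\sigma\sigma^*(x))^{-1}]_{ij}\Big)\,I_{\alpha,j} = \frac{1}{2}\sum_{\alpha} \frac{\partial}{\partial x_\alpha}\Big([(\sigma\sigma^*(x))^{-1}]_{i\alpha}\Big). \tag{4.4.7} \]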
Chapter 5
Summary
5.1 Conclusion
In this dissertation we proved a convergence theorem for a system of SDEs with arbitrary state-dependent friction. The assumptions on the friction matrix allow the
theorem to include systems with Ornstein-Uhlenbeck colored noise. We showed that
this theorem can be used to find an approximation for a Brownian particle in a diffusion gradient. Measuring the external forces of a Brownian particle in a diffusion
gradient was studied in [VHB+ 10]. This was the original motivation for extensions
that became this dissertation. We also applied the theorem to a Brownian particle in
a temperature gradient to study the dynamics on a long diffusive time scale.
5.2 Future work
In Sections 4.1.1 and 4.1.2, we identified the Itô form of the limiting Langevin equation
in one dimension and discussed its equivalent interpretation in terms of other
definitions of stochastic integrals. We introduced the notation ◦α(x) dWt, which is
interpreted as an Itô integral with an additional well-defined drift term. Future work
would be to give a rigorous mathematical construction of this integral and discuss its
properties.
In Section 4.3, we gave a systematic analysis of a system with two short time-scales:
the time correlation of the coloured noise τ and the inertial relaxation time σ.
We derived the effective thermophoretic drift A(x) and the steady-state probability
distribution ρ∞(x) in the limit as these time scales go to zero. We applied these results
to study the thermophoretic motion of a Brownian particle in a temperature gradient,
showing how, in agreement with experiments, ρ∞(x) tends to be peaked towards
the colder region (positive thermophoresis), but can switch to hotter regions (negative
thermophoresis) under the right conditions. As possible future lines of research,
noises that are not OUP can be considered: these might be particularly promising to
model the effects of the chemical interaction between a particle and a solvent, of the
viscoelasticity of the medium or of depletion forces due, e.g., to polymers or small
particles in the solution. The equations of motion were an approximation, and the
friction term should include delay, due to hydrodynamic memory. Stochastic differential
delay equations (SDDEs) are used to model such systems. A future research
project is to perform a similar analysis, if possible, in the framework of SDDEs.
One of the applications of the main theorem considers a Brownian particle in three
dimensions with an external force F that is non-conservative. There is no systematic
study of the steady state distribution in space (the x variable) for this case. Using
the small mass approximation a simplified steady state Fokker-Planck equation is
derived. This steady state Fokker-Planck equation is multi-dimensional and there is
no systematic study of the reduced problem. However, the reduction of the state
space allows for simplified numerical solutions.
The framework of Theorem 1 includes OU colored noise. However, the solution to
the SDE driven by OU colored noise cannot be Taylor expanded to first order. This is
because the OU process is not regular enough (i.e. it is continuous but not differentiable).
Theorem 1 can be extended to more regular colored noises by writing a system of
n SDEs to model the colored noise, which yields an (n − 1)-times differentiable process. For
example, take an OU process defined by the SDE,
\[ dy_t^1 = -a_1 y_t^1\,dt + b_1\,dW_t, \tag{5.2.1} \]
and replace the white noise with another OU process y_t^2. That is,
\begin{align}
dy_t^1 &= -a_1 y_t^1\,dt + b_1 y_t^2\,dt \tag{5.2.2} \\
dy_t^2 &= -a_2 y_t^2\,dt + b_2\,dW_t. \tag{5.2.3}
\end{align}
This procedure can be iterated until there are n SDEs with only dy_t^n having a dWt term.
Two questions can be asked in this situation. 1: How does the noise y_t^1 change with
each successive addition of another OU process, especially its infinitesimal generator?
2: What happens in the limit as n, the number of SDEs, tends to infinity?
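The gain in regularity from one additional level can be observed in a short Euler-Maruyama simulation of the chain (5.2.2)-(5.2.3). The sketch below is a simplified illustration: it uses the same coefficients a and b at every level, a fixed step size and a fixed random seed, all of which are hypothetical choices. It compares the mean absolute increment of y¹ for a single OU process (n = 1) with that for the two-level chain (n = 2).

```python
import math
import random


def simulate_chain(n_levels, a, b, dt, n_steps, seed=0):
    """Euler-Maruyama path of y^1 for the chain
    dy^k = (-a*y^k + b*y^(k+1)) dt for k < n, dy^n = -a*y^n dt + b dW."""
    rng = random.Random(seed)
    y = [0.0] * n_levels
    path = [0.0]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        new = list(y)
        for k in range(n_levels - 1):
            new[k] = y[k] + (-a * y[k] + b * y[k + 1]) * dt
        new[-1] = y[-1] - a * y[-1] * dt + b * sqrt_dt * rng.gauss(0.0, 1.0)
        y = new
        path.append(y[0])
    return path


def mean_abs_increment(path):
    return sum(abs(q - p) for p, q in zip(path, path[1:])) / (len(path) - 1)


dt, n_steps = 1e-3, 5000  # hypothetical step size and path length
rough = mean_abs_increment(simulate_chain(1, 1.0, 1.0, dt, n_steps))
smooth = mean_abs_increment(simulate_chain(2, 1.0, 1.0, dt, n_steps))
print(f"n = 1: {rough:.2e}, n = 2: {smooth:.2e}")
```

For n = 1 the increments of y¹ scale like √dt, as for any process driven directly by dWt, while for n = 2 they scale like dt, reflecting that y¹ is then differentiable with derivative −a y¹ + b y².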
If the noise driving the system is colored, but not of OU type, then the system
must be studied in another manner. One such way may be to use fractional Brownian
motion. Every Gaussian colored noise can be written as a solution of a linear system
of stochastic differential equations with fractional Brownian motion [CDS13]. Then
a system of SDEs can be written, as in the case of OU colored noise, and the
analysis may be similar to the work of the Smoluchowski-Kramers approximation
with fractional Brownian motion [BT05].
5.2.1 Active Brownian Motion
There are particles which have an internal motor, e.g. sperm cells, that along with
thermal noise, cause the particle to move randomly. These particles, called active
Brownian particles, exhibit Brownian motion in a much different way than particles
mentioned earlier. Physical examples of active Brownian motion include self-propelled
cells, macroscopic animals (for example insects and schools of fish) and interacting
particles [RBE+ 12]. These particles will aggregate and form patterns, as in figure 5.1
of actin filaments. A system of N active Brownian particles in R² is modeled by a
2N-dimensional equation
\begin{align}
dx_t &= v_t\,dt \tag{5.2.4} \\
dv_t &= \frac{F(x_t)}{m}\,dt - \frac{\gamma(x_t)}{m}\,v_t\,dt + \frac{\sigma(x_t)}{m}\,dW_t. \tag{5.2.5}
\end{align}
SDE theory can be used to study what patterns are formed, their stability, and the
sensitivity of the patterns as functions of the parameters they depend on.
Figure 5.1. An experiment from [SWS+ 10], of actin filaments propelled by immobilized molecular motors in a planar geometry.
Mean field limit of interacting particles: One phenomenon I will research is pattern
formation by active Brownian particles. The set up is as follows: for N active Brownian
particles with positions in R² given by (x_t^i, x_t^{i+1}) for i = 1, ..., 2N, equation (5.2.4)
is the equation of motion for the set of positions of all N particles (i.e. xt ∈ R^{2N}).
Equation (5.2.4) needs modification to suitably model active Brownian particles. The
friction is now nonlinear and depends on velocity, γ = γ(x, v), and F is the sum of
external forces and an interaction term. Pattern formation has been observed in numerical experiments of many active Brownian particles [SGMRM95, MV12, KtW+ 13].
For active interacting particles in R2 modeled by equation (5.2.4), the force F = ∇U
is a potential field generated by the interactions between the particles. This term is
often replaced, in the limit of large numbers of particles, by a mean field limit. To
start, equation (5.2.4) will be analyzed, varying the number of particles N , to determine the ranges of N where the patterns form and when they do not. This analysis
pertains to what regime, for the number of particles, the mean field limit is valid. To
begin this work, the SDE and the stationary Fokker-Planck equation will be studied
numerically.
Pattern formation of interacting particles: Another interesting phenomenon of active
Brownian motion is the different types of patterns formed by interacting particles,
their stability, and the sensitivity of the patterns as functions of the parameters they
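depend on. To begin, using white noise to model the noise, the system of SDEs
\[ \begin{aligned}
dx_t^i &= v_t^i\,dt \\
dv_t^i &= \left[F^i(v_t^1, \ldots, v_t^N) - \nabla U(x_t^1, \ldots, x_t^N)\right] dt - \big(a - b\,(v_t^i)^2\big)\,v_t^i\,dt + \sqrt{2D}\,dW_t
\end{aligned} \tag{5.2.6} \]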
for i = 1, ..., N models the motion of the particles, with a Rayleigh-Helmholtz frictional
force and an external potential U. The force F^i is a dissipative interaction
between particles and can be modeled in a variety of ways, including the mean field
limit mentioned before. One quantity of interest is the velocity of the center of mass
of the ensemble, defined as u_t^N = (1/N) Σ_{i=1}^N v_t^i. I will study the stationary state of u_t^N,
both for finite N and in the limit as N tends to infinity.
Pattern formation on a manifold: Up to this point, this discussion has been focused
on active Brownian particles on a flat two dimensional surface. Pattern formation and
aggregation of active Brownian particles often happen on curved spaces, for example
on growing tissue. A long term goal of the above work on active Brownian motion
would be to study these processes on curved surfaces, starting with the surface of a
sphere and using methods of stochastic analysis on manifolds [Str00].
References
[AKQ07]
Ping Ao, Chulan Kwon, and Hong Qian, On the existence of potential
landscape in the evolution of complex systems, Complexity 12 (2007),
no. 4, 19–27. MR 2308458 (2008a:37057)
[Ao08]
P. Ao, Emerging of stochastic dynamical equalities and steady state thermodynamics from Darwinian dynamics, Commun. Theor. Phys. (Beijing) 49 (2008), no. 5, 1073–1090. MR 2489637 (2010d:82098)
[Arn74]
Ludwig Arnold, Stochastic differential equations: theory and applications, Wiley-Interscience [John Wiley & Sons], New York, 1974, Translated from the German. MR 0443083 (56 #1456)
[Bel97]
Richard Bellman, Introduction to matrix analysis, Classics in Applied
Mathematics, vol. 19, Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA, 1997, Reprint of the second (1970) edition,
With a foreword by Gene Golub. MR 1455129 (98a:01021)
[BG02]
Nils Berglund and Barbara Gentz, Metastability in simple climate models: pathwise analysis of slowly driven Langevin equations, Stoch. Dyn.
2 (2002), no. 3, 327–356, Special issue on stochastic climate models.
MR 1943556 (2004c:86003)
[Bil99]
Patrick Billingsley, Convergence of probability measures, second ed., Wiley Series in Probability and Statistics: Probability and Statistics, John
Wiley & Sons Inc., New York, 1999, A Wiley-Interscience Publication.
MR 1700749 (2000e:60008)
[Bro28]
Robert Brown, A brief account of microscopical observations made in
the months of june, july, and august, 1827, on the particles contained
in the pollen of plants; and on the general existence of active molecules
in organic and inorganic bodies, Philosophical Magazine N.S. (1828),
161–173.
[BT05]
Brahim Boufoussi and Ciprian A. Tudor, Kramers-Smoluchowski
approximation for stochastic evolution equations with FBM, Rev.
Roumaine Math. Pures Appl. 50 (2005), no. 2, 125–136. MR 2156107
(2006m:60086)
[BVH+ 11]
Thomas Brettschneider, Giovanni Volpe, Laurent Helden, Jan Wehr,
and Clemens Bechinger, Force measurement in the presence of brownian
noise: Equilibrium-distribution method versus drift method, Phys. Rev.
E 83 (2011), 041113.
[CBEA12]
A. Celani, S. Bo, R. Eichhorn, and E. Aurell, Anomalous thermodynamics at the micro-scale, ArXiv e-prints (2012).
[CDS13]
G. Cottone, M. Di Paola, and R. Santoro, A Novel Exact Representation of Stationary Colored Gaussian Processes (Fractional Differential
Approach), ArXiv e-prints (2013).
[CF06]
Sandra Cerrai and Mark Freidlin, On the Smoluchowski-Kramers approximation for a system with an infinite number of degrees of freedom,
Probab. Theory Related Fields 135 (2006), no. 3, 363–394. MR 2240691
(2007j:60090)
[CH53]
R. Courant and D. Hilbert, Methods of mathematical physics. Vol.
I, Interscience Publishers, Inc., New York, N.Y., 1953. MR 0065391
(16,426a)
[CKW04]
W. T. Coffey, Yu. P. Kalmykov, and J. T. Waldron, The Langevin
equation, second ed., World Scientific Series in Contemporary Chemical
Physics, vol. 14, World Scientific Publishing Co. Inc., River Edge, NJ,
2004, With applications to stochastic problems in physics, chemistry
and electrical engineering. MR 2053912 (2005b:82065)
[DB06a]
S. Duhr and D. Braun, Why molecules move along a temperature gradient, Proc. Natl. Acad. Sci. USA 103 (2006), 19678–19682.
[DB06b]
Stefan Duhr and Dieter Braun, Thermophoretic depletion follows boltzmann distribution, Phys. Rev. Lett. 96 (2006), 168301.
[EK86]
Stewart N. Ethier and Thomas G. Kurtz, Markov processes, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, John Wiley & Sons Inc., New York, 1986, Characterization and convergence. MR 838085 (88a:60130)
[EM78]
Donald L. Ermak and J. A. McCammon, Brownian dynamics with hydrodynamic interactions, The Journal of Chemical Physics 69 (1978),
no. 4, 1352–1360.
[FGB+ 11]
Thomas Franosch, Matthias Grimm, Maxim Belushkin, Flavio M. Mor,
Giuseppe Foffi, Laszlo Forro, and Sylvia Jeney, Resonances arising from
hydrodynamic memory in brownian motion, Nature 478 (2011), 85–88.
[FH11]
M. Freidlin and W. Hu, Smoluchowski-Kramers approximation in the
case of variable friction, Journal of Mathematical Sciences 179 (2011),
184–207, 10.1007/s10958-011-0589-y.
[FHW13]
Mark Freidlin, Wenqing Hu, and Alexander Wentzell, Small mass
asymptotic for the motion with vanishing friction, Stochastic Process.
Appl. 123 (2013), no. 1, 45–75. MR 2988109
[Fre04]
Mark Freidlin, Some remarks on the Smoluchowski-Kramers approximation, J. Statist. Phys. 117 (2004), no. 3-4, 617–634. MR 2099730
(2005k:82074)
[FW12]
Mark I. Freidlin and Alexander D. Wentzell, Random perturbations of
dynamical systems, third ed., Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol.
260, Springer, Heidelberg, 2012, Translated from the 1979 Russian original by Joseph Szücs. MR 2953753
[Gar04]
C. W. Gardiner, Handbook of stochastic methods for physics, chemistry and the natural sciences, third ed., Springer Series in Synergetics,
vol. 13, Springer-Verlag, Berlin, 2004. MR 2053476 (2004m:00008)
[Han82]
Peter Hänggi, Nonlinear fluctuations: the problem of deterministic limit
and reconstruction of stochastic dynamics, Phys. Rev. A (3) 25 (1982),
no. 2, 1130–1136. MR 643593 (83b:82040)
[Har02]
Philip Hartman, Ordinary differential equations, Classics in Applied
Mathematics, vol. 38, Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA, 2002, Corrected reprint of the second (1982)
edition [Birkhäuser, Boston, MA; MR0658490 (83e:34002)], With a foreword by Peter Bates. MR 1929104 (2003h:34001)
[HB65]
John Happel and Howard Brenner, Low Reynolds number hydrodynamics with special applications to particulate media, Prentice-Hall Inc.,
Englewood Cliffs, N.J., 1965. MR 0195360 (33 #3562)
[HS74]
Morris W. Hirsch and Stephen Smale, Differential equations, dynamical
systems, and linear algebra, Academic Press [A subsidiary of Harcourt
Brace Jovanovich, Publishers], New York-London, 1974, Pure and Applied Mathematics, Vol. 60. MR 0486784 (58 #6484)
[Kha12]
Rafail Khasminskii, Stochastic stability of differential equations, second
ed., Stochastic Modelling and Applied Probability, vol. 66, Springer,
Heidelberg, 2012, With contributions by G. N. Milstein and M. B. Nevelson. MR 2894052
[KP91]
Thomas G. Kurtz and Philip Protter, Weak limit theorems for stochastic
integrals and stochastic differential equations, Ann. Probab. 19 (1991),
no. 3, 1035–1070. MR 1112406 (92k:60130)
[KP92]
Peter E. Kloeden and Eckhard Platen, Numerical solution of stochastic
differential equations, Applications of Mathematics (New York), vol. 23,
Springer-Verlag, Berlin, 1992. MR 1214374 (94b:60069)
[KPS04]
R. Kupferman, G. A. Pavliotis, and A. M. Stuart, Itô versus
Stratonovich white-noise limits for systems with inertia and colored multiplicative noise, Phys. Rev. E (3) 70 (2004), no. 3, 036120, 9. MR
2130323 (2005k:60220)
[Kra40]
H. A. Kramers, Brownian motion in a field of force and the diffusion
model of chemical reactions, Physica 7 (1940), 284–304. MR 0002962
(2,140d)
[KS91]
Ioannis Karatzas and Steven E. Shreve, Brownian motion and stochastic
calculus, second ed., Graduate Texts in Mathematics, vol. 113, Springer-Verlag, New York, 1991. MR 1121940 (92h:60127)
[KtW+ 13]
F. Kümmel, B. ten Hagen, R. Wittkowski, I. Buttinoni, G. Volpe,
H. Löwen, and C. Bechinger, Circular motion of asymmetric self-propelling particles, ArXiv e-prints (2013).
[Lax02]
Peter D. Lax, Functional analysis, Pure and Applied Mathematics (New
York), Wiley-Interscience [John Wiley & Sons], New York, 2002. MR
1892228 (2003a:47001)
[LBLO01]
P. Lançon, G. Batrouni, L. Lobry, and N. Ostrowsky, Drift without flux:
Brownian walker with a space-dependent diffusion coefficient, EPL (Europhysics Letters) 54 (2001), no. 1, 28.
[LKMR10]
Tongcang Li, Simon Kheifets, David Medellin, and Mark G. Raizen,
Measurement of the instantaneous velocity of a Brownian particle, Science 328 (2010), no. 5986, 1673–1675.
[LL07]
A. W. C. Lau and T. C. Lubensky, State-dependent diffusion: thermodynamic consistency and its path integral formulation, Phys. Rev. E (3)
76 (2007), no. 1, 011123, 17. MR 2365504 (2008i:82122)
[LM98]
C. Liu and M. Muthukumar, Langevin dynamics simulations of early-stage polymer nucleation and crystallization, The Journal of Chemical
Physics 109 (1998), no. 6, 2536–2542.
[LS78]
Edward W. Larsen and Zeev Schuss, Diffusion tensor for atomic migration in crystals, Phys. Rev. B 18 (1978), 2050–2058.
[McL89]
J.A. McLennan, Introduction to nonequilibrium statistical mechanics,
Prentice-Hall advanced reference series: Physical and life sciences, Prentice Hall, 1989.
[MDG99]
Lowell I. McCann, Mark Dykman, and Brage Golding, Thermally activated transitions in a bistable three-dimensional optical trap, Nature
402 (1999).
[MTVE99]
Andrew J. Majda, Ilya Timofeyev, and Eric Vanden Eijnden, Models
for stochastic climate prediction, Proc. Natl. Acad. Sci. USA 96 (1999),
no. 26, 14687–14691 (electronic). MR 1731439 (2000h:86007)
[MV12]
M. Mijalkov and G. Volpe, Sorting of Chiral Microswimmers, ArXiv
e-prints (2012).
[Nel67]
Edward Nelson, Dynamical theories of Brownian motion, Princeton
University Press, Princeton, N.J., 1967. MR 0214150 (35 #5001)
[Øks03]
Bernt Øksendal, Stochastic differential equations, sixth ed., Universitext, Springer-Verlag, Berlin, 2003, An introduction with applications.
MR 2001996 (2004e:60102)
[Ort87]
James M. Ortega, Matrix theory, The University Series in Mathematics, Plenum Press, New York, 1987, A second course. MR 878977
(88a:15002)
[OS06]
José M. Ortiz and Jan V. Sengers, Hydrodynamic fluctuations in fluids
and fluid mixtures, Elsevier, Amsterdam, 2006.
[Pap77]
George C. Papanicolaou, Introduction to the asymptotic analysis of
stochastic equations, Modern modeling of continuum phenomena (Ninth
Summer Sem. Appl. Math., Rensselaer Polytech. Inst., Troy, N.Y.,
1975), Amer. Math. Soc., Providence, R.I., 1977, pp. 109–147. Lectures
in Appl. Math., Vol. 16. MR 0458590 (56 #16790)
[Pap10]
Andrew Papanicolaou, Filtering for fast mean-reverting processes,
Asymptot. Anal. 70 (2010), no. 3-4, 155–176. MR 2761191
(2011k:60141)
[Pia08]
R. Piazza, Thermophoresis: moving particles with thermal gradients,
Soft Matt. 4 (2008), 1740–1744.
[Pro05]
Philip E. Protter, Stochastic integration and differential equations,
Stochastic Modelling and Applied Probability, vol. 21, Springer-Verlag,
Berlin, 2005, Second edition. Version 2.1, Corrected third printing. MR
2273672 (2008e:60001)
[PS05]
G. A. Pavliotis and A. M. Stuart, Analysis of white noise limits for
stochastic systems with two fast relaxation times, Multiscale Model.
Simul. 4 (2005), no. 1, 1–35 (electronic). MR 2164708 (2006e:60086)
[PS08]
Grigorios A. Pavliotis and Andrew M. Stuart, Multiscale methods, Texts
in Applied Mathematics, vol. 53, Springer, New York, 2008, Averaging
and homogenization. MR 2382139 (2010a:60003)
[PV03]
È. Pardoux and A. Yu. Veretennikov, On Poisson equation and diffusion
approximation. I,II,III, Ann. Probab. 31 (2003), no. 3, 1166–1192. MR
1988467 (2004d:60156)
[RB06]
Luc Rey-Bellet, Ergodic properties of Markov processes, Open quantum
systems. II, Lecture Notes in Math., vol. 1881, Springer, Berlin, 2006,
pp. 1–39. MR 2248986 (2008g:60224)
[RBE+ 12]
P. Romanczuk, M. Bär, W. Ebeling, B. Lindner, and L. Schimansky-Geier, Active Brownian particles, The European Physical Journal - Special Topics 202 (2012), 1–162, 10.1140/epjst/e2012-01529-y.
[Ris89]
H. Risken, The Fokker-Planck equation, Springer-Verlag, New York,
1989.
[RS94]
Wouter-Jan Rappel and Steven H. Strogatz, Stochastic resonance in an
autonomous system with a nonuniform limit cycle, Phys. Rev. E 50
(1994), 3249–3250.
[RW00]
L. C. G. Rogers and David Williams, Diffusions, Markov processes,
and martingales. Vol. 2, Cambridge Mathematical Library, Cambridge
University Press, Cambridge, 2000, Itô calculus, Reprint of the second
(1994) edition. MR 1780932 (2001g:60189)
[RY99]
Daniel Revuz and Marc Yor, Continuous martingales and Brownian
motion, third ed., Grundlehren der Mathematischen Wissenschaften
[Fundamental Principles of Mathematical Sciences], vol. 293, Springer-Verlag, Berlin, 1999. MR 1725357 (2000h:60050)
[Sag72]
Peter S. Sagirow, The stability of a satellite with parametric excitation
by the fluctuations of the geomagnetic field, Stability of stochastic dynamical systems (Proc. Internat. Sympos., Univ. Warwick, Coventry,
1972), Springer, Berlin, 1972, pp. 311–316. Lecture Notes in Math.,
Vol. 294. MR 0416151 (54 #4227)
[Sch80]
Zeev Schuss, Theory and applications of stochastic differential equations,
John Wiley & Sons Inc., New York, 1980, Wiley Series in Probability
and Statistics. MR 595164 (84h:60114)
[SCY+ 12]
Jianghong Shi, Tianqi Chen, Ruoshi Yuan, Bo Yuan, and Ping Ao,
Relation of a new interpretation of stochastic differential equations to
Itô process, Journal of Statistical Physics 148 (2012), 579–590 (English).
[SGMRM95]
Lutz Schimansky-Geier, Michaela Mieth, Helge Rosé, and Horst Malchow, Structure formation by active Brownian particles, Physics Letters
A 207 (1995), no. 3–4, 140–146.
[Sha10]
Sharma S. et al., Viscoelastic solution of long polyoxyethylene chain
phytosterol/monoglyceride/water systems, Coll. Pol. Sci. 288 (2010),
405–414.
[Smo16]
M. Smoluchowski, Drei Vorträge über Diffusion, Brownsche Bewegung und
Koagulation von Kolloidteilchen, Phys. Z. 17 (1916), 557–585.
[SS02]
H. Sigurgeirsson and A. M. Stuart, A model for preferential concentration, Phys. Fluids 14 (2002), no. 12, 4352–4361. MR 1938235
(2003i:76053)
[SS11]
S. Srinivasan and M. Z. Saghir, Experimental approaches to study thermodiffusion – a review, Int. J. Therm. Sci. 50 (2011), 1125–1137.
[SSMD82]
J. M. Sancho, M. San Miguel, and D. Dürr, Adiabatic elimination for
systems of Brownian particles with nonconstant damping coefficients, J.
Statist. Phys. 28 (1982), no. 2, 291–305. MR 666513 (83k:82056)
[Str00]
Daniel W. Stroock, An introduction to the analysis of paths on a Riemannian manifold, Mathematical Surveys and Monographs, vol. 74,
American Mathematical Society, Providence, RI, 2000. MR 1715265
(2001m:60187)
[Sus78]
Héctor J. Sussmann, On the gap between deterministic and stochastic
ordinary differential equations, Ann. Probability 6 (1978), no. 1, 19–41.
MR 0461664 (57 #1649)
[SWS+ 10]
Volker Schaller, Christoph Weber, Christine Semmrich, Erwin Frey, and
Andreas R. Bausch, Polar patterns of driven filaments, Nature 467
(2010).
[TKS92]
M. Toda, R. Kubo, and N. Saitô, Statistical physics. I. Equilibrium
statistical mechanics, second ed., Springer Series in Solid-State Sciences,
vol. 30, Springer-Verlag, Berlin, 1992.
[Tur77]
Michael Turelli, Random environments and stochastic calculus, Theoret.
Population Biology 12 (1977), no. 2, 140–178. MR 0465290 (57 #5195)
[VHB+ 10]
Giovanni Volpe, Laurent Helden, Thomas Brettschneider, Jan Wehr,
and Clemens Bechinger, Influence of noise on force measurements,
Phys. Rev. Lett. 104 (2010), 170602.
[vK81]
N. G. van Kampen, Stochastic processes in physics and chemistry, Lecture Notes in Mathematics, vol. 888, North-Holland Publishing Co.,
Amsterdam, 1981. MR 648937 (84h:60003)
[WVE10]
Weinan E and Eric Vanden-Eijnden, Transition-path theory and path-finding algorithms for the study of rare events, Physical Chemistry 61
(2010).
[WZ65]
E. Wong and M. Zakai, On the convergence of ordinary integrals to
stochastic integrals, Ann. Math. Stat. 36 (1965), 1560–1564.