SOME RECENT TRENDS IN THE MATHEMATICAL THEORY OF

SOME RECENT TRENDS IN THE MATHEMATICAL THEORY
OF DIFFUSION
WILLIAM FELLER
, ,
1. Introduction. This lecture may seem to sail under a false flag since it
appears under the general heading of Applied Mathematics. Only a small part
of the mathematical diffusion theory is connected with immediate applications,
and much of the recent literature on the subject has an abstract character.
The situation is here as in the theory of harmonic functions which likewise stems
from practical problems and has many applications, but which has spread into
and influenced many fields: complex variable theory, boundary value problems
for general differential equations, subharmonic functions, generalized potentials,
etc.
Similarly, the diffusion theory has outgrown its origin as a special topic in
partial differential equations. Nevertheless, not even this particular chapter is
closed: recent applications to the theory of genetic evolution lead us to singular
diffusion equations and confront us with a new type of boundary problems.
We find in this connection unsolved problems, but of even greater interest is
the fact that in diffusion theory we find for the first time an intimate interplay
between differential equations and measure theory in function spaces. The latter
throws new light even on the classical parts of diffusion theory; it leads to new
types of solutions, and opens a new avenue of attack on problems connected
with boundary conditions which were left open for a long time. It turns out that
the adjoint of a differential equation is not necessarily itself a differential equation, and this new information is of value for the general theory of semigroups
of operators. This is underlined by the fact that the probability approach leads
us to consider diffusion equations which are not of local character and which
can be treated by the methods developed, under different forms, by Bochner,
M. Riesz, and L. Schwartz. In this way the theory of differential equations is
tied in with other theories and we shall find unexpected connections between
harmonic functions and diffusion processes.
It is the purpose of the present address to outline some of the new results and
open problems. It is believed that the new methods promise to be fruitful also
in other fields, in particular for other types of differential equations. [Added in
proof: Since this was written, new results were obtained and some of the problems
are no longer open; cf. [8].]
2. Classical diffusion and the Wiener space. For a first orientation consider
the classical diffusion (or heat-conduction) equation
(2.1)
ut = uxx
over the interval — oo < x < oo. Its fundamental solution (or Green function)
is given by the normal density function
322
MATHEMATICAL THEORY OF DIFFUSION
(2.2)
^
u(y; t, x) = ^ p
exp {- (x -
323
yf/tt].
Under very general conditions on f(x) the function
-foo
/
f(y)u(y; t, x) dy
-oo
represents the unique solution of (2.1) which, for fixed t > 0, is integrable with
respect to x and which tends to f(x) as t —> 0.
Now equation (2.1) is supposed to be connected with Brownian movement,
the description of which involves a function space. In fact, the position of a
Brownian particle is a function X(t), and "qbserving a particle" means observing
X(t). Thus each X(t) represents a sample point in our experiment, and the
problem is to describe the properties of the possible paths X(t) in probabilistic
terms. This was first done by N. Wiener [25]. He .starts from the assumption
that (2.2) defines the transition probability density of our process, that is, the
conditional probability density of the relation X(U + t) = x if it is known that
X(tQ) = y; it is furthermore assumed that the increments AX(t) = X(t + h) —
X(t) over nonoverlapping time intervals are statistically independent (or that
the corresponding probabilities multiply). Wiener (and by other methods P.
Levy [20]) shows that with probability one the path function is continuous but
of unbounded variation in every ^-interval; the set of ^-values for which X(t) = 0
has the structure of a Cantor set; with probability one X(t) satisfies a Lipschitz
condition of order 1/2 but no Lipschitz condition of order 1/2 + e (more precise results are given by the local law of the iterated logarithm).
Obviously statements of this type describe the hypothetic diffusion process
more directly than does the differential equation (2.1). It is less obvious that
even the formal theory of (2.1) can profit from the function space approach.
Consider, for example, the boundary value problem for (2.1) for a region bounded
by the part x > a of the œ-axis and a curve x = <p(t) with <p(0) = a. We are
here confronted with a situation similar to that in the Dirichlet problem for
Au = 0. In general, an essentially unique solution exists, but it does not assumo
the prescribed boundary values at all points of the curve. As Fortet [10] noticed,
the question whether a particular point xQ of the boundary curve is an exceptional point can be solved by a comparision of the local properties of the boundary with those of the path curves X(t). A final criterion is obtained from the
local law of the iterated logarithm. Thus the measure-theoretical approach
leads on one hand to an existence theorem which is more general than those
obtained by purely analytical methods, and on the other hand we obtain a regularity criterion which admits of a direct interpretation in terms of path functions.
We want to show that even the simpler problem of diffusion in a finite fixed
interval is intimately related to the theory of path functions and apparently
cannot be treated in a satisfactory manner without the function space approach.
It is well known that in the case of diffusion in a finite interval the solution is
not determined by the initial values alone, but that different boundary conditions
324
'
WILLIAM FELLER
can be imposed. Perhaps the most familiar boundary condition is that of an
absorbing barrier, which is usually defined formally by the condition u(t, XQ) = 0.
It must be emphasized that this condition is obtained only heuristically by a
passage to the limit from a discrete random walk and is justified by the results.
This could be considered satisfactory were it not that the heuristic method
completely breaks down in other cases. In fact, a perusal of the literature shows
that even in comparatively simple physical problems the appropriate boundary
conditions remain unknown^ We shall see later on that in a certain case the absorbing barrier is completely described by the condition u(t, x0) < oo f but again
this was found only from the properties of the solutions to which this condition
leads. In other cases we shall see that no boundary conditions can be imposed.
In short, the appropriate analytical formulation of various boundary conditions
is an unsolved problem. Apparently it is unavoidable to revert to the original
definition of the various conditions, and this definition usually refers directly
to the path functions. Thus, diffusion with absorbing barriers can be described
by saying that the laws of free diffusion prevail in the interior but that the process stops when the particle for the first time reaches a boundary, that is, when
X(t) equals x0 of xi .
3. The adjoint diffusion equations. Consider now a more general diffusion
process in the infinite line with transition probability u(y; t, x). By this we mean
that if at any time fa the position of the particle is X(tQ) = y, then
(3.1)
Pr {a < X(t + to) < b} =* f u(y; t, x) dx.
Ja
If the initial position X(0) of the particle is a random variable with probability
density f(x), then the probability density of X(t) is given by the integral (2.3).'
A general stochastic process cannot be described solely in tèrmo öf the initial
distribution and the transition probabilities. A diffusion process, however, is öf
the Markov type which means, very roughly, that the future depends statistically on the present state, but hot on the past history which led tö it. 1 This
implies in particular the fundamental identity
Z
+OO
u(y) s, £>(£; t, x) d£,
oo
known as the Chapman-Koltnogörov equation. By virtue of the Markov property
the transition'probability u(y; t, x) determines all probability relation in thé
space of path functions X(t). In particular, the joint probability density of
(X(ti), X(h), • • • , X(Q) for 0 < h < h < • • • < tn is given by ""
(3.3)
u(h , xi)u(xi ; h — h , x2) • • • u(xn-i ; tn — tn-x , xn)3
where u(t, x) is defined by (2.3). Thus the properties of the path functions X(t)
are implicitly given by u(y; t, x), and could be derived exactly as Wiener ob1
Cf. the discussion in §8.
MATHEMATICAL THEORY OF DIFFUSION
325
tained the properties of the paths in Brownian motion from (2.2). Actually,
Wiener's results have been only partially generalized; for example, general conditions which guarantee continuity of almost all path functions seem to be
unknown.
Among all stochastic processes for which the Chapman-Kolmogorov equation
(3.2) holds, the diffusion processes are characterized by the requirement that
the probability of a change exceeding e during a short time interval At is small
as compared to At; more precisely, we assume2 that
(3.4)
~[
u(y;At,x)dx-*0
lit J|a;-|/|>e
for every fixed y.
If at some time t the position is X(t) = y, then the mean and the variance of
the displacement AX(t) during the following time interval of duration At are
+00
-+00
/
(x — y)u(y; At, x) dx
00
and
/
(x — y)2u(y; At, x) dx.
J—OO
Actually these integrals may diverge,3 and we introduce therefore the truncated
moments
(3.6)
/
(x - y)u(y; At, x) dx = a(At; y),
J | sa—y | < e
(3.7)
[
(x - yfu{y; At, x) dx = 2b(Ai; y).
J I»—2/|<«
In view of (3.4) the asymptotic behavior of a(At; y) and b(At; y) as t —> 0 is
independent of e, and the physical significance of these quantities is essentially
the same as that of the moments (3.5), except that less weight is attributed to
large displacements. It is therefore natural to suppose that the limits
(3.8)
a(,) = lim a -^p>,
b(y) = lim
^
exist.
2
This definition was given in [5] where diffusion processes are called Markov processes
of the "purely continuous" type. It seems to be an open problem whether (3.4) implies that
the path functions X(t) are continuous with probability one.
3
An (unpublished) example is given hy the densitjr u(y; i, x) = I2wlf2lll2}~1<f>'(x)
e x p { - [<l>(x) - 4> G/)]2/4*) where <j>(x) = log1/4 (100 + x/(l -f e"*)). This is a solution of the
equations (3.12)-(3.13) with b(x) = \<i>'{x)}-\ a{x) = (1/2) b'{x).
4
It will be seen from the following derivation (which follows [5]) that our assumptions
actually imply the existence of the partial derivative ut{y; t, x). Conversely, if it is assumed
that u is differentiable and that (3.4) holds, then it suffices to assume that one of the ratios
occurring in (3.8) has a finite point of accumulation as At —» 0. Letting then At approach
zero through an appropriate sequence of values it follows from (3.11) that for this particular
sequence also the other limit in (3.8) exists, and this leads to (3.12). The first derivation
of (3.12) is due to Kolmogorov [17] who, however, uses strong uniformity conditions and
assumes the existence of the moments (3.5).
326
WILLIAM FELLER
It is then easy to< derive a differential equation for u(y; tj x) as a function of
t and y. From (3.2) and (3.4) we have
< ''
+0O
v>.y; •
u(y; At, Qu{£\ *, a) c?£
/
<
<>
= /"
• i*(y; A«, Ött(ft t, x) dì + o(A*),
•
J.|È-»I<«
,
. »
provided w(£; £, a;) is uniformly bounded for every fixed t > 0. If we now assume
that for fixed t, x the function u(£; t, x) has two continuous derivatives with
respect to £, then the Taylor formula leads to
u(y; t + At, x) = u(y; t, x) /
u(y; At, £) d£
J|?-2/l<e
(3.10)
+ .^(2/; *, s) f
(£ -
+ J «w(w t, x) [
2
V)ufo;
A«, £) d£
(£ - j/) V t t A«, 0 d£ + oGM).
J|f-2/I<6
Now the integral of u(y; At, £) should be identically unity.5 Using, then, (3.4)
we can write (3.10) as
(3.11)
u(y; t + At, x) - u(y; t, x)
= b(At; y)uyy(y; t, x) + a(At; y)uy(y; t, x) + o(At).
Hence, as a function of t and y, the transition probability u(y; t, x) satisfies the
first (or backward) diffusion equation
(3.12)
zt(t, y) » b(y)zyy(tKy) + a(y)zv(t, y).
The obvious initial conditions on transition probabilities show that u(y; t, x)
is, in fact, a fundamental solution of (3.12). One should therefore expect that
for fixed y the function u(y; t, x) should satisfy the adjoint equation
(3.13)
wt(t, x) = {b(x)w(t, x)}xx - {a(x)w(t7 x)}x ,
known as second (or forward) diffusion equation, and also as Fokker-Planck
equation.6
We shall see (in §5) that this surmise is true in general, but that there exist
cases where u(y; t, x) satisfies only the first diffusion equation; instead of (3.13)
we get in these cases an inequality, and a positive operator is added to the right
side. This phenomenon is connected with the existence of a new type of solutions which has not been encountered before. It changes our concept of adjoint
operators and throws new light on the theory of semigroups in general. ,
6
It is possible to relax this condition so as to take account of more general situations.
It is possible to introduce additional terms in (3.12) and (3.13) which will account for
the possibility of particles disappearing or being created.
8
MATHEMATICAL THEORY OF DIFFUSION
327
4. Singular equations. The theory of the preceding section applies also if. thp
infinite œ-axis is replaced by a finite or semi-infinite interval, except that no;w
boundary conditions have to be specified. As has been mentioned before, ilçie
derivation of the familiar boundary conditions (such as absorbing, reflecting,
and elastic barriers) is essentially of a heuristic naturse, and it would be desirable
to derive them frpm the basic assumptipns in the same way a§ the, diffusion
equation itself has been obtained. We want now to, disouss some, nqw aspects
of the boundary problems for a type of equations which is of both theoretical and
practica^( jnterest.
In,physical applications the coefficient b(y) is essentially positive. Now modem diffusion theory has found a new and interesting field of applications in
biology, in, particular in the mathematical theory of evolution (cf. [6]). Here
the particle uncier, diffusion is to be interpreted symbolically and stands for a
population size, gene frequency, etc. For example, the frequency of a particular
gene in a population is a random variable X(t) which by definition is restricted
to the interval 0 < x < 1, and Sewall Wright's theory of evolution assumes
essentially that X(t) is subject to a diffusion process with
(4.1)
b(x) = ßx(l - x),
a(x) = -yx
+ 8(1 - x)
where ß, y, ò are positive constants. Similarly, in the simplest growth process
X(t) stands for a population size and can assume all positive values; the underlying stochastic process is described by our diffusion equations with
(4.2)
b(x) = ßx,
a(x) = ax + y.
The fact that b(x) vanishes on the boundaries represents a singularity and
has important and curious consequences.
Consider, for example, the diffusion equations with the coefficient^ (4.2),
which are discussed in [7]. If 7 __ 0, then there exists a unique solution so thjat
no boundary conditions can be imposed. For this solution the probability
,
(4.3)
/
u(y; t,x) dx = ir(t,y)
Jo
decreases steadily in time. The difference 1 — ir(t, y) represents the probability
that a population of initial size y dies out before time t. (This, is the absorption
or extinction probability.)
The situation changes radically when 0 < 7 < ß, and resembles the more
familiar circumstances of ordinary diffusion. There exist infinitely many solutions, among which there is one for which ir(t, y) = 1; this corresponds to reflecting barriers. For all other solutions ir(t, y) decreases, and the solution for
which the rate of decrease is greatest obviously corresponds to absorbing barriers.
It is the only solution for which u(y; t, 0) < 00, so that in this particular case
an absorbing barrier is described by the boundary condition u < 00.
For 7 > ß the situation changes once more. In passing to the limit 7 —» ß —,
the absorbing and the reflecting barrier solutions become identical, and for
328
WILLIAM FELLER
7 > io we are once more in the situation that no boundary conditions can be
imposed and that our problem is completely determined by the differential
equation itself.7
Accordingly, we can say that for y _§ 0 and y > ß the origin acts as natural
boundary where no artificial conditions can be imposed, while for 0 < y < ß
we are confronted with conditions similar to those in physical diffusion theory.
It is possible (cf. [8]) to give criteria to determine whether for a particular
singular equation the boundary acts as a natural one or whether there exist
infinitely many solutions, but we are still unable to formulate the appropriate
boundary conditions in all cases. Moreover, the biological applications lead us
to a new type of boundary problems.
The classical concept of an elastic barrier at x = 0 may be interpreted by
saying that whenever the particle reaches x = 0 it has a probability p to be
absorbed and probability q = 1 — p to be reflected. The limiting cases p = 0
and p = 1 represent reflecting and absorbing barriers, respectively. Now biological problems compel us to consider the more general case where the particle
is only temporarily absorbed; that is, if the particle is absorbed, it remains
fixed at the boundary for a finite time, and reverts then to the diffusion process.
The sojourn time at the boundary is a random variable with an exponential
distribution. In the familiar terminology of a probability mass spread over the
x-axis this means that in addition to the solution u(t, x) we have to consider a
finite mass m(t) concentrated at the origin. This mass flows at a rate proportional to m(t) back into the interval x > 0, but it is partly replaced by new mass
being absorbed at the origin. If the diffusion starts with all the initial mass
distributed over x > 0, then m(0) = 0, and one would expect that m(t) will
increase to a certain saturation value.
Unfortunately the appropriate boundary conditions have not been formulated
and the precise conditions on a(y) and b(y) under which such a process can take
place are not known. For an interesting special case which is of some importance
in mathematical genetics cf. [6].
5, Existence problems. Pathological solutions. We now return to the case of
an infinite interval and the two diffusion equations (3.12) and (3.13). As was
mentioned at the end of §3, the whole setup as well as the analogy with the
classical diffusion equation uy — uxx leads one to assume that the transition
probability u(y; t, x) will satisfy thô two equations (3.12) and (3.13) and, conversely, that each of these equations should have a fundamental solution which
is essentially uniquely determined and can serve as u(y; t, x). However, already
the first attempt to prove this conjecture showed that it cannot be true under
7
However, for y > ß there exist solutions for which ir(t, y) increases. These solutions
have no probability significance and correspond to the solutions of the theory of heat with
heat flowing into the medium. The existence of analogous solutions for the whole axis was
discovered by Hille [13]; his "explosive" solutions are of this kind with ic(t, y) —> «> as
t->*o>0.
MATHEMATICAL THEORY OF DIFFUSION
329
all circumstances. In fact, it was shown in [5] that whenever one of the two
integrals
(5.1)
f
a~ll2(x)dx,
f_ a~ll2(x) dx
converges, an appropriate b(x) can be found such that (3.12) admits of a solution which is non-negative, integrable, and such that w(0, x) = 0. Thus in this
case the equation (3.12) does not suffice to describe our process. However, assuming that both integrals (5.1) diverge, the construction of [5] shows that our
conjecture is valid under very mild regularity restrictions on a(x) and b(x).
In other words, in case of divergence of (5.1) the whole classical theory carries
over to our diffusion equations.8 The problem was taken up by Yosida [26] and
Hille [11, 12] who obtained similar results under slightly different conditions.
In particular, Hille showed that (3.13) has a unique, non-negative solution
u(i, x) satisfying the conditions
Z
+oo
oo
provided that
(5.3)
and that the integral
(5.4)
--H»
u(t} x) dx = /
f(x) dx
J—oo
\a'(x) - b(x) \ _S K{ \ x | + 1}
f-TN^
J a(x)
diverges both at + oo and — oo ; conversely, if (5.3) holds, the divergence of
(5.4) is a necessary condition for the theorem.9
These unexpected results were at first rather disturbing since they seemed to
indicate that the diffusion equations are, after all, an inadequate description of
the actual process. Fortunately a satisfactory explanation can be found which
makes the theory more harmonious than it would be if our diffusion equations
always behaved as the classical equation ut = uxx . At the same time we shall
be led to a new type of solution of the backward equation and to an interesting
phenomenon concerning the forward equation.
The singular diffusion equations discussed in the preceding section point in
8
In [5] the more general case is treated where the coefficients a and b depend on t. The
later investigations depend on the theoiy of semigroups and apply therefore only to the
temporally homogeneous case treated in the text.
9
The apparent discrepancy between Hille's conditions and Feller's condition (5.1) is
readily explained. If one of the integrals in (5.1) converges, then it is impossible that
| a'{x) | < K ( | x | -f 1), and hence (5.3) can hold only if the growth of b(x) is adjusted in
a special way to that of a(x). On the other hand, if | a'{x) \ < K (| x | + 1), then a(x) —
0(x2) and the divergence of (5.1) implies that of (5.4). [Added in proof. Unfortunately
the text takes account only of Hille's paper [12] in which (3.13) is treated. The author
is indebted to Hille for the manuscript and for many inspiring discussions. Subsequently
Hille treated also (3.12) and obtained conclusive results concerning the uniqueness problems for the two equations; they are announced in [11].]
330
WILLIAM FELLER
the proper direction. T?he first point to be noted,is that there is.no essential
difïerence between diffusion in a finite or infinite interval. In fact, let 7(5) be an
arbitrary positive function and consider the transformation of variables
y(s) ds,
, . . , . , . , ' ; ;
n = / y(s) ds.
,
> ,
5
Jo .
.
•
' r1 . , \ •
ïî'MiJ'X$satisfies the diffusion equations (i 12) and (3.13), then u*(i;t, Ç)/y(i)
(as a'functiònòf H, rj and ôï t\ £, respectively) satisfies similar*1 equations with
à(y) and 8"(#)" replàcedby a(y)-yy2(y) and1 Xid(!y)'y'(y) + b(y)-y(y). Moreover1, if
u(y\ V,'x) is,'äs a'funotiôïi òf xx a probability density; then the same'is tnieof
ti* as a function of'£. Thus wé hd,vô a whole group of trarisformatïoné which
change'thé pair of diffusion equatiohs'into^equivalent equation^. We can make
use of this fact either to simplify the equations or to change the interval. In
particular, it is always possible to transform a finite interval into the1 entire réâl
axis and vice versa. However, even if the coefficients are regular in" the interior,
the transformed equations w^ill in general be of a singular type, jvith the coefficients unbounded at ,the endpòints, or. d(x) vanishing at an endpoint.
In particular, we may take the ordinary diffusion equation ut = .uxx fpr the
finite interval — 1 < x <C 1, and transform the latter into — 00 < £ < 00. We
obtain in this way a pair òf diffusion equations for the infinite interval which
exhibits all the anomalies which at first appeared so perplexing. There'is no
uniqueness, since we have the transforms of the classical absorbing and reflecting
barrier solutions, etc. Furthermore, there exist positive solutions (well known
from .the theory of heat conduction) with zero initial values and so on.
In probabilistic language we can describe the situation as follows. The coefficients a and 6 can be such that with probability one the particle never reaches
a boundary, In this case no boundary conditions need (or can) *be imposed, and
we have,the phenomenon of natural boundaries. The typical example is the
homogeneous diffusion on the entire axis, since Wiener'gj result that almost all
path functions are continuous implies that the particle cannot reach infinity.
. . If the coefficients a and b are suc}i that the particle has positive probability
to reach a boundary, several contingencies may arise, For example, the drift
toward, the barrier can be so strong that the boundary automatically, acts as an
absorbing barrier, and again no boundary conditions can be imposed. A typical
example is the origin in the case of the singular equation with coefficients given
by (4.2) when 7 __ 0 (cf. §4). Alternatively, the situation may fye as'with homogeneous, diffusion in a finite interval, where various boundary conditions can be
imposed. \
'
'
' "v -v :
" In" partièulâr, it is possible that there' 'exists a unique transition "probability
U/(y; t,'x)['which Satisfies/th,e. two equations^(3112) and (3.13), but for which the
tptaì probability mass ir(t, y) (defined in (4$)) is steadily decreasing^ tlere the
•boundaries are natural in the sense that no boundary conditions in the classical
sense can be imposed. Nevertheless, we are led to a new type of solutions .at
least for the backward equation (3.12) and a curious phenomenon regarding
the forward equation.
MATHEMATICAL THEORY OF DIFFUSION
331
Following a remark of Doob10 we can modify our diffusion process by stipulating that whenever the particle reaches a boundary, it is instantaneously
transferred back to the interior. More precisely, let <j>i(x) (i = 1, 2) be two density functions and let us assume that if the particle reaches the right or left
boundary, it jumps to a position Yi or Y2 , where the Yi are random variables
with probability densities <t>i(x). It is easy to calculate the transition probability
u*(y; t, x) of the so modified process, but for our purposes the explicit form is of
no particular interest. It is rather obvious that u*(y; t, x) behaves asymptotically for t —» 0 as the transition probability u(y; t, x) of the original process, so
that according to the theory of §3 also u*(y; t, x) must satisfy the backward
equation (3.12). It follows that w* — u is a non-negative solution of (3.12) with
identically vanishing initial values, so that in this case the initial valus problem
for (3.12) cannot be determined. The interesting fact is that our phenomenon can
occur even when the initial value problem for (3.13) is uniquely determined.
In this case u*(y; t, x) cannot satisfy (3.13) and we have thus a diffusion process
where the transition probability satisfies the backward equation but not the forward
equation. Instead, the inequality sign _^ holds in (3.13) and the right side is to
be modified by adding a positive operator.11 We are thus led to the conclusion
that the adjoint of the backward equation (3.12) is not always given by (3.13),
and is in general not even a differential equation. This is a new phenomenon
which seems of interest for the general theory of semigroups and of differential
operators.
6. Ito's approach. Before passing to more general processes a word should
be said on a new way of describing our diffusion processes which is due to Ito
[14,15]. He does not make direct use of the differential equations or the transition
probabilities, but expresses (at least in certain cases) the path functions of a
general process by means of those of the Wiener process.
Let X(t) represent the path functions of the latter. We have seen that with
probability one X(t) is not of bounded variation so that the classical definition
of the integral
(6.1)
ff(r)dX(r)
Jo
10
Doob discussed in detail a similar phenomenon for discrete Markov chains, cf. [2J.
For details cf. [8], In the case of discrete Markov chains one is led to two infinite
systems of ordinary differential equations playing the role of our diffusion equations. As
Doob has shown, the analogue of (3.12) holds always, but the analogue of (3,13) holds only
with the equality replaced by the sign _t. We now see why a direct derivation of (3.13) is
impossible. In attempting a derivation along the lines which led to (3.12) one would introduce the function F{t, y) = J_M u(y; t, x)f(x) dx where f(x) vanishes for | x | > A and
has, say, three continuous derivatives. Then (*) F(t-\- At, y) = / i £ u{y; At, z)F(t, z) dz g£
/_«> w(j/; t, z) dz'f\a-z\<i f(x)u(z; At, x) dx. If one had here the equality sign, it would
be eaBy to derive (3.13) under fairly mild conditions (essentially supposing that the limits
in (3.4) and (3.5) are attained boundedly in every finite interval). However, in order to replace the inequality sign in (*) by the equality sign one has to introduce strong uniformity
conditions which are actually not justified even in relatively simple cases.
11
332
WILLIAM FELLER
breaks down, even if f(t) is continuous: in fact, for almost every X(t) we can
find a sequence of subdivisions tin) such that the corresponding Riemann sums
will diverge. Nevertheless, it is possible to give a meaning to the integral (6.1).
In fact, consider a fixed sequence of subdivisions, say tin) = k/2n. The corresponding Riemann sum
HMn))AX(4n))
(6.2)
k
is a random variable, and it has been shown by various authors that it converges in probability to a random variable which has all the desired properties
of an integral.
Using this notion Ito shows that under certain restrictions on a and b the stochastic integral equation
(6.3)
Y(t) = c + [ b(Y(r)) dr+
Jo
[ a(Y(r))
dX(r)
Jo
admits of an essentially unique solution Y(t). The latter is defined as a random
variable on the Wiener space {X(t)}. It is fairly obvious from (6.3) that the
infinitesimal transition probabilities of Y(t) have the desired properties (3.4)
and (3.8), and thus we have arrived at a representation of a stochastic process of
the required type.
This approach has the advantage that it permits a direct study of the properties of the path functions Y(t), such as their continuity, etc. In principle, we
have here a possibility of proving existence theorems for the partial differential
equations (3.12)-(3.13) directly from the properties of the path functions. However, the method is for the time being restricted to the infinite interval and the
conditions on a and b are such as to guarantee the uniqueness of the solution.
So far, therefore, we cannot obtain any new information concerning the "pathological" cases.
7. Diffusion in phase space. The method of §3 works also in two and more
dimensions. In two dimensions the transition probability is of the form
u
(Vi J 2/2 ; ty #1 j #2), and the equations corresponding to (3.12) and (3.13) are
(7.1)
and
Ut = ]C UiiUyiVj - Il ï>iUui ,
(7.2)
Ut = 2
(aurìxixj + _C Q>iu)x.,
i,j = 1, 2
i,j = 1, 2
where the coefficients may depend on the two space variables and the matrix
(flu) is symmetric and positive definite. The meaning of the coefficients an and
bi is the same as before, and the mixed coefficient ai,2 gives the infinitesimal
covariance:
«12(2/1,2/2)
(7 3)
1 ff
= lim —- / /
A*-»Q /It
JJ\Xi—ui]<€
(xi — 2/1) (x2 — y2)u(yi, y2 ; At, xi, x2) dxi dx2.
MATHEMATICAL THEORY OF DIFFUSION
333
These equations were first derived by Kolmogorov [17]. Feller's constructions
of the transition probability was generalized to two dimensions by Dressel [4],
In the time homogeneous case the equations were integrated by Yosida [26]
using the theory of semigroups. However, nothing is known for the case of
singular equations or the case where no uniqueness exists, and practically nothing is known about the appropriate boundary conditions in various problems
involving bounded domains.
We shall not pursue this line of investigation but shall be content to indicate
how our equations lead to a refined model of diffusion in which the path functions
have derivatives.
The fact that the original model of diffusion leads to the conclusion that the
particles have no velocities has been pointed out by many authors. Actually
this fact is neither perturbing nor surprising: the whole theory is based on the
assumption that the process is Markovian, that is, that the particle has no
memory. Now if the particle had a finite velocity, shocks in the direction of
motion would be less probable, so that any change of velocity would affect the
chances of further changes. In other words, finite velocities would imply that
the position of the particle is a random variable of a stochastic process with
aftereffects.
In a Section meeting at this Congress two proposals Avere made to modify
the basic assumptions so as to endow the particles with finite velocities. In
both cases the diffusion equations would be replaced by equations of the hyperbolic type. Now the general solution of the initial value problem for such equations does not preserve positivity or the integral mean. Moreover, it depends
not only on the initial value but also on their derivatives, which have no apparent
probabilistic meaning. In short, the hyperbolic equations in question cannot
serve as appropriate descriptions of a stochastic process, and if a particular
transition probability is a solution, this is a lucky coincidence.
However, a refined model of diffusion in which the particles have finite velocities (but no accelerations) is due to Ornstein and Uhlenbeck.12 This theory
assumes that the particle has a velocity X2(t) which is the subject of an ordinary
diffusion process regulated by (2.1). The position Xi(i) of the particle is then
obtained from
(7.4)
Xi(0 = f X2(r) dr,
Jo
the integral having a meaning since X2(t) is continuous with probability one.
As shown by Doob, Xi(t) is the variable of a Gaussian (non-Markovian) process
whose probability relations approach for large t those of the ordinary Brownian
motion process.
An alternative way of reaching the same conclusion consists in studying the
vector (Xi(t), X2(t)) as the variable of a two-dimensional diffusion process (so
12
Cf. [23]. A thorough discussion of this process which is more appropriate for our purposes is contained in Doob [3].
334
WILLIAM FELLER
that the plane is really the phase space of a particle in one-dimensional motion).
Under the above assumptions this two-dimensional process is Markovian, and
the corresponding transition probability u(yi ,y2;t,xi,
x2) should satisfy a pair
of equations of the form (7.1), (7.2), Now according to Ornstein and Uhlenbeck
the variable X2(t) is subject to a symmetric homogeneous diffusion'process, so
that b2 = 0 and 022 is a constant which we can assume as unity. From (7.4) it
follows that if at any time Xi(t) = yi, X2(t) = y2, then
E(Xi{tx + At) - Xi(0) = y*At + o(At)
E((Xi(t + At) - Xi(t))2) = o(At),
#,nd hence an = 0, òi = y2 . From the second equation (7.5) and Schwarz' ine;
quality it follows that also au = 0. Hence (7.1) reduces to
(7.6)
'
ut = uV2V2 - y2uyi .
This equation has first been given by Kolmogorov [19]. Ijf the condition of
homogeneity is dropped, the same consideration leads to the more general equation
(7.7)
ut = b(y1, y2)uy2y2 + a(yx, y2)uy2 - y2uvi .
This equation is of a degenerate type and practically nothing is known about
the appropriate boundary conditions for finite domains. A fundamental solution
for the infinite plane was constructed by M. Weber [24].
To prove the equivalence of the two approaches one should prove that the
solution of (7.7) satisfies the condition (7.4). This is intuitively rather obvious,
but a satisfactory proof has not been given. Otherwise the same method could be
applied to the calculation of various Wiener functionals, which were the object
of investigation by Cameron and Martin, and by Kac (cf. [16]). Let again
X2(t) be the random variable of the simplest diffusion process, and, put
(7.8)
Xi(t) = f V(X2(r)) dr,
Jo
where V(x) is a given function, say x2. Then the pair (Xi, X2) is the variable of
a two-dimensional diffusion process, and should satisfy an equation of the form
(7.6) with the coefficient y2 replaced by V(y2). The difficulty of this approach
(as well as in Ito's approach) seems to he in a direct verification that the transition probability of the pair (Xi, X2) has all the properties assumed in the derivation of (7.6). Formally, however, one can start with our diffusion equation and
derive Kac's results from it.
In the Ornstein-Uhlenbeck model of diffusion the position Xi(t) is not the
variable of a Markovian process, but the process becomes Markovian when the
state of the particle is defined so as to include position and velocity. Thus the
same physical process can be described as a Markovian or non-Markovian
process depending on the parameters used for its description. For example, in
MATHEMATICAL THEORY OF DIFFUSION
335
cosmic ray showers the most interesting variable is the total number of particles,
but it is not the variable of a Markov process. However, we may use the conceptual simplicity of and the tools available for Markov processes at the expense
of describing the state of the system by the positions, masses, and velocities of
each particle. In this way we are led to a Markov process in a rather complicated
phase space. If the process is continuous (which excludes instantaneous changes
like splitting of particles), we shall again be led to partial differential equations
of the diffusion type.
The most radical step in this direction was taken bj7" Feynman [9] in his new
approach to quantum physics. The starting point of Feynman's theory is the
remark that the phase-time space can always be split into two parts involving
the "past" and "future" in such a way that the physical process becomes what
we would call a stochastic process of the Markov type. As a physicist Feynman
does not mind generalizations and abstractions which would make mere mathematicians shudder, and thus his "past" is permitted even to include future
events (thus introducing advance effects instead of aftereffects). Feynman then
introduces an assumption equivalent to continuity and shows heuristically that
the diffusion equation to which one is led is essentially the Schrödinger equation.
It has complex coefficients since Feynman deals with complex probability
amplitudes rather than with classical transition probabilities. It is not clear at
present whether this is only an excellent analytical tool or whether we have here
an essential generalization of the classical concept of a stochastic process.
8. Generalizations by means of M. Riesz' potentials. Usually partial differential equations appear to play a special role among functional equations, but
from a probabilistic point of view the diffusion processes are only a particular
type of a Markov process. Starting from the Chapman-Kolmogorov equation
(3.2) we have derived a diffusion equation by imposing certain conditions on the
asymptotic behavior of the transition probability u(y; t, x) as t —» 0. Probability
considerations suggest to us also more general conditions which then lead to a
generalization of the diffusion equation in which certain integrals involving u
are added to the right-hand member (cf. [5] and Yosida [27]). However, even the
integro-differential equations thus obtained describe only a special class of
Markov processes and represent from a certain point of view an unnatural type
of functional equations. Our real problem is to find a linear functional equation
equivalent to the Chapman-Kolmogorov equation (3.2). In the language of the
theory of semigroups this amounts to saying that we desire to find the infinitesimal generators to the most general semigroup corresponding to (3.2).
Before proceeding it should be emphasized that the Chapman-Kolmogorov
equation (3.2) is by no means restricted to probability problems, but that an
analogous relation holds for all linear differential equations. In fact, suppose that
the solution u(t, x) of an equation (whose coefficients do not depend on t) is
determined by its initial values. Then u(t, x) will be given by an integral of the
form (2.3), and the Chapman-Kolmogorov equation merely expresses the fact
336
WILLIAM FELLER
that the solution u(s + t, x) can be calculated either directly in terms of its
initial values at time 0 or, alternatively, in terms of its values at time t which,
in turn, are expressible in terms of the initial values. Equating the two expressions
we obtain (3.2). It is clear that a similar reasoning holds, for example, for a
hyperbolic differential equation- except that a vector notation must be used
since now the problem is determined by1 the initial values of the solution and its
time derivative.
In short, a kernel u(y; t, x) induces two associated linear transformations
Ttf(x) = J f(y)u(y;t,x)dy,
(8.1)
Stg(y) = J g(x)u(y; t, x) dx
and the Chapman-Kolmogorov equation (3.2) merely expresses the fact that as
t varies each of these transformations forms a semigroup. In operator theoretical
language the equation uy = uxx can be interpreted as stating that thç infinitesimal
generator (time derivative) of Tt is given by the operator d2/dx2, or symbolically
«>
ÌT>-TÌ-
A formal integration leads to the symbolic equation
(8.3)
Tt = etdVdx\
If /(») is an entire function, then (8.3) gives
(8.4)
Ttf(x) = eldVdx*f(x) F= j _
tk/klfm(x),
which agrees with the classical solution (2.2) and (2.3). The more general diffusion equations merely replace the operator d2/dx2 on the right by other differential operators. In the case of the integro-differential equations mentioned
above we have on the right a combination of a differential and integral operator.
However, for a general solution we shall have to consider much more general
operators.
The stable distributions are the best known class of solutions of the ChapmanKolmogorov equation which are not connected with either differential or integrodifferential equations. Bochner [1] was the first to notice that the symmetric
stable distributions can be interpreted as solutions of an operator equation
ut = Au,
where the operator A is a certain fractional power of d2/dx2. A better understanding of this important observation may be derived from an application of a
technique developed by M. Riesz [22] which also points to various generalizations.
MATHEMATICAL THEORY OF DIFFUSION
337
A fractional potential of order a is defined, according to Riesz, by
(8.5)
F f(x) =
*
r^y) \y-x I""1 dy.
2T(a) cos ira/2 J-«,
This integral diverges in general, but Riesz showed that a meaning can be
attached to it for a large class of functions f(x). In particular, I°f(x) = f(x)i
and I~2nf(x) = (— l)nf(2n)(x), whenever n is a positive integer. This relation is
interesting for two reasons. First, it is clear that in general Ia is a global operator,
that is, I"f(x) depends on all values of f(x). For the particular values a = 0,
—2, — 4, • • • , however, the operator takes on a purely local character. Secondly,
since I~2 is the same as —d2/dx2, we can interpret J - 1 as a square root of —d2/dx2,
and similarly for other powers. This enables us to pursue in more detail Bochner's
remark. We are thus led to consider the general functional equation
(8.6)
ut = - J T V
If a = 2, this reduces to the classical diffusion equation ut = uxx , but for other
values of a the functional equation (8.6) has quite different a character, since
the operator on the right is of a global character.
From Bochner's remark one should expect that for 0 < a g 1 the equation
(8.6) should lead to symmetric stable distributions. This is so, and we proceed to
verify it in a purely formal fashion for a = 1.
The symmetric stable distribution of order a = 1 is characterized by the
density
(8.7)
u(y; t9 x)
*
(x-y)*+P'
which is known as Cauchy distribution. It plays for the Cauchy process the role
of the normal density (2.2), and the solution (2.3) reduces in the present case to
the function u(t, x) which is harmonic in the half plane t > 0, vanishes at infinity,
and reduces for t «= 0 to f(x).
If we integrate (8.6) for a = 1 formally by analogy with (8.4), we are led to
an expansion
(8.8)
u(t,x)
=±t^r"f(x).
Substituting from the definition (8.5) it is easily verified that the right side
reduces to the anticipated solution
(8.9)
u
«>*)=llj{y)(y-l)>+l?dy-
The formal steps leading to this formula require justification. However, by
similar calculations all symmetric stable distributions can be obtained, and by
a slight generalization of Riesz' definitions of potentials we can derive also the
unsymmetric stable distribution. It appears, in fact, that all probabilistic solu-
338
WILLIAM FELLER
tions of the Chapman-Kolmogorov equation can be obtained in thiä way, except
that the parameters of the potentials on the right side will depend on x.
It is of particular interest that we have obtained the solution of a Dirichlet
problem for the Laplace equation Au = 0 as the solution of a generalized diffusion
problem. The explanation fies in the semigroup property of the Green function
(8.7), and a similar statement applies also to the Dirichlet problem for closed
curves. The curve itself corresponds to the a?-axis, and certain closed curves in thè
interior to the lines t = const. These contract to a single point which corresponds
to t — oo. From the values of u along the boundary curve the values along the
curves in the interior can be calculated by the methods just described. The
fundamental fact that as t —» oo all the values tend to the same limit follows
then from the ergodic principle for Markov processes on closed manifolds. We
have thus a direct connection of the Dirichlet problem for elliptic equations with
certain generalized diffusion equations, which in turn are closely connected with
measure in function spaces where almost no function is continuous.
REFERENCES
1. S. BOCHNER, Diffusion equation and stochastic processes, Proc. Nat. Acad. Sci. U.S.A.
vol. 35 (1949) pp. 368-370.
2. J. L. DOOB, Markoff chains—^enumerable case, Trans. Amer. Math. Soc. vol. 68
(1945) pp. 455-473.
3.
, The Brownian movement and stochastic equations, Ann. of Math. vol. 43 (1942)
pp. 351-369.
4. F. G. DRESSEL, The fundamental solution of the parabolic differential equation, Duke
Math. J. vol. 7 (1940) pp. 186-203.
5. W. FELLER, Zur Theorie der Stochastischen Prozesse (Existenz und Eindeutigkeitssätze),
Math. Ann. vol. 113 (1936) pp. 113-160.
6.
, Diffusion processes in the mathematical theory of Evolution, Proceedings of the
Second Symposium on Probability and Mathematical Statistics, Berkeley [to appear],
7.
, Two singidar diffusion problems, Ann. of Math. vol. 52 (1951), to appear»
8.
, The parabolic differential equations and the associated semi-groups of transfer*
mations, to appear, with a companion paper, in Ann. of Math.
9. R. P. FEYNMAN, Space-time approach to non-relativistic quantum mechanics, Reviews
of Modern Physics vol. 20 (1948) pp. 367-387.
10. R. FORTET, Les fonctions aléatoires du type de Markoff associées à certaines équations
linéaires aux dérivées partielles du type parabolique, J. Math. Pures Appi. vol. 22 (1943)
pp. 177-243.
11. E. HILLE, Les probabilités continues en chaîne, G. R. Acad. Sci. Paris vol. 230 (1950)
pp. 34-35.
X2.
, On the integration problem for Fokker-Planck's equation in the theory of stochastic processes, [To appear in the proceedings of the Tenth Congress of Scandinavian
Mathematicians in Trondheim, 1950].
13.
, "Explosive" solutions of Fokker-Plahck's equation, Proceedings of the International Congress of Mathematicians, Cambridge, Mass., 1950.
14. KIYOSI ITO, On a stochastic integral equation, Proceedings of the Japan Academy of
Sciences vol. 22 (1946) pp. 32-35.
1
15.
, On stochastic differential equations, Memoirs of the American Mathematical
Society, no. 4 (1951).
,
16. M, KAO, On the distribution of certain Wiener functionals, Trans, Amer. Math. Soc.
vol. 65 (1949) pp. 1-13.
MATHEMATICAL THEORY OF DIFFUSION
339
17. A. KOLMOGOROFF, lieber die analytischen Methoden in der Wahrscheinlichkeitsrech~
nung, Math. Ann. vol. 104 (1931) pp. 415-458.
18f
1 Zur Theorie der stetigen zufälligen Prozesse, Math. Ann. vol. 108 (1933) pp.
149-160.
19,
> Zufällige Bewegungen, Ann. of Math. vol. 35 (1934) pp. 116-117,
20. P. LEVY, Théorie de Vaddition des variables aléatoires, Paris, Gauthier-Vili ars, 1937.
21. •
, Processus stochastiques et mouvement Brownien, Paris, Gauthier-Villars,
1948.
22. M. RIESZ, L*intégrale de Riemann-Liouville et le problème de Cauchy, Acta Math.
vol. 81 (1948) pp. 1-223,
23. G. E. UHLENBEOK and L. S. ORNSTEIN, On the theory of the Brownian motion, Physical
Reviews vol. 36 (1930) pp. 823-841.
24. M. WEBER, The fundamental solution of the diffusion equation in phase space [To
appear].
25. N. WIENER, Generalized harmonic analysis, Acta Math. vol. 55 (1930) pp. 117-258.
26. K. YOSIDA, An operator-theoretical treatment of temporally homogeneous Markoff
process, Journal of the Mathematical Society of Japan vol. 1 (1949) pp. 244-253.
27.
9 An extension of Fokker-Planck's equation, Proceedings of the Japan Academy
of Sciences vol. 25 (1949) pp. 1-3.
28.
, On the differentiability and the representation of one-parameter semi-group of
linear operators, Journal of the Mathematical Society of Japan vol. 1 (1948) pp. 15-21.
PRINCETON UNIVERSITY,
PRINCETON, N. J., U, S. A.
INTERNATIONAL CONGRESS
OF
MATHEMATICIANS
Cambridge, Massachusetts, U. S. A.
1950
CONFERENCE IN TOPOLOGY
Committee
Hassler Whitney (Chairman)
Deane Montgomery
N. E. Steenrod