Markov Chains as Models in Statistical Mechanics. Eugene Seneta

Moyal Lecture : 3 Nov., 2006
SUMMARY
The D. Bernoulli (1769)/Laplace (1812) urn model and the Ehrenfest (1907) urn model for mixing are instances of simple Markov
chain models called random walks. Both can be used to suggest
a probabilistic resolution of the controversy concerning the irreversibility and recurrence paradoxes inherent in Boltzmann’s H-Theorem. Marian von Smoluchowski (1914) also modelled, by a
simple Markov chain with analogous properties, fluctuations over
time in the number of particles contained in a geometrically well-defined
small element of volume in a solution.
The lecture explores lightly the themes of entropy, recurrence and
reversibility within the framework of such Markov chains.
Eugene Seneta: Emeritus Professor, School of Mathematics and Statistics, University of Sydney, N.S.W. 2006.
Markov Chain.
A PROBABILITY MODEL WHICH ALLOWS FOR STATISTICAL DEPENDENCE BETWEEN OBSERVATIONS AT SUCCESSIVE TIME POINTS 0, 1, · · · ON THE STATE OF A SYSTEM, $X_0, X_1, X_2, \ldots$, EVOLVING OVER TIME.
Markov Property:
\[ \Pr(X_{m+1} = j \mid X_m = i, X_{m-1} = i_{m-1}, \ldots, X_0 = i_0) = p_{ij}, \quad i, j \in S. \]
Matrix Structure:
\[ P \ge 0, \qquad P\mathbf{1} = \mathbf{1}. \]
Matrices with this property are called stochastic. P is the transition matrix of the Markov chain.
Matrix Powers.
\[ P^n = \{p_{ij}^{(n)}\}, \quad \text{where } p_{ij}^{(n)} = \Pr(X_{m+n} = j \mid X_m = i). \]
Finite Markov Chain: There are N + 1 states, S = {0, 1, · · · N }.
Simplest Markov Chain: S = {0, 1}.
\[ P = \begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix}. \]
Stationary/Ergodic Properties.
If P is finite and irreducible, there is a unique solution vector $\pi$
of
\[ \pi^T P = \pi^T, \qquad \pi^T \mathbf{1} = 1. \]
Result 1. $\pi > 0$, and since $\pi^T \mathbf{1} = 1$ its elements form a probability distribution with strictly positive entries which add to one:
\[ \pi^T = \{\pi_0, \pi_1, \ldots, \pi_N\}, \qquad \pi_0 + \pi_1 + \cdots + \pi_N = 1. \]
If the Markov chain with transition matrix P starts off at time 0
with this distribution over its states, this is the distribution at all
time points n = 0, 1, 2, · · ·:
\[ \Pr(X_n = j) = \pi_j, \quad j = 0, 1, \ldots, N. \]
π is called the stationary distribution vector.
Result 2. If P is irreducible and aperiodic (that is, $P^n > 0$ for
some n), then as $n \to \infty$:
\[ P^n \longrightarrow \mathbf{1}\pi^T. \]
The limiting stochastic matrix $\mathbf{1}\pi^T$ has all its rows identical. This
is called the ergodic property. As $n \to \infty$:
\[ p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i) \longrightarrow \pi_j. \]
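Results 1 and 2 can be checked numerically on the simplest chain, S = {0, 1}. The following is a minimal sketch, not part of the lecture: the values of `p01` and `p10` are hypothetical, and the helper names `mat_mul` and `mat_pow` are introduced only for this illustration.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(P, n):
    """Return the n-th power of P by repeated multiplication (n >= 0)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


# Hypothetical transition probabilities, chosen only for illustration.
p01, p10 = 0.3, 0.1
P = [[1 - p01, p01],
     [p10, 1 - p10]]

# For two states, pi^T P = pi^T with pi^T 1 = 1 gives
# pi = (p10, p01) / (p01 + p10).
pi = [p10 / (p01 + p10), p01 / (p01 + p10)]

# Result 1: pi is stationary, so pi^T P reproduces pi^T.
pi_P = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

# Result 2: P > 0 here, so every row of P^n approaches pi^T.
P50 = mat_pow(P, 50)

print("pi   =", pi)
print("pi P =", pi_P)
print("P^50 =", P50)
```

Both rows of `P50` should agree with `pi` to many decimal places, because the second eigenvalue of this matrix, 1 − p01 − p10, has modulus less than one.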
D. Bernoulli (1769)/Laplace (1812) Model.
Two Urns, A and B. Each urn has N balls, so the total number of balls
is 2N.
The totality of 2N balls consists of N white and N black.
Let Xn be the number of black balls in Urn A after n interchanges.
An interchange consists of selecting a ball at random
from Urn A and a ball at random from Urn B, and
placing each in the other urn.
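Then
\[ p_{i,i-1} = \left(\frac{i}{N}\right)^2, \qquad p_{i,i+1} = \left(\frac{N-i}{N}\right)^2, \qquad p_{i,i} = 2\left(\frac{i}{N}\right)\left(\frac{N-i}{N}\right), \quad i = 0, 1, \ldots, N. \]
The Markov chain $\{X_n\}$, n = 0, 1, · · ·, has stationary/limiting
distribution $\pi^T = \{\pi_0, \pi_1, \ldots, \pi_N\}$ given by
\[ \pi_i = \frac{\binom{N}{i}\binom{N}{N-i}}{\binom{2N}{N}}. \]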
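The transition probabilities and the hypergeometric stationary distribution of the Bernoulli-Laplace model can be checked exactly for a small N. This is a sketch under stated assumptions: N = 5 is an arbitrary illustrative value, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_laplace_matrix(N):
    """Transition matrix on states 0..N (black balls in Urn A), as exact fractions."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            # A black ball leaves Urn A and a white ball arrives from Urn B.
            P[i][i - 1] = Fraction(i, N) ** 2
        if i < N:
            # A white ball leaves Urn A and a black ball arrives from Urn B.
            P[i][i + 1] = Fraction(N - i, N) ** 2
        # The two selected balls have the same colour, so nothing changes.
        P[i][i] = 2 * Fraction(i, N) * Fraction(N - i, N)
    return P


def hypergeometric_pi(N):
    """pi_i = C(N, i) C(N, N - i) / C(2N, N), i = 0..N."""
    return [Fraction(comb(N, i) * comb(N, N - i), comb(2 * N, N))
            for i in range(N + 1)]


N = 5  # illustrative value
P = bernoulli_laplace_matrix(N)
pi = hypergeometric_pi(N)

rows_sum_to_one = all(sum(row) == 1 for row in P)
pi_P = [sum(pi[i] * P[i][j] for i in range(N + 1)) for j in range(N + 1)]
is_stationary = (pi_P == pi)

print("rows sum to one:", rows_sum_to_one)
print("pi is stationary:", is_stationary)
```

Using `Fraction` avoids rounding, so the equality test for πᵀP = πᵀ is exact rather than approximate.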
Ehrenfest (1907) Model.
Two Urns, A and B. Total number of balls is N.
All N balls are black, and labeled 1 to N.
Let Xn be the number of (black) balls in Urn A after n interchanges.
An interchange consists of selecting a number at random from the set {1, 2, · · · , N }, finding the ball with this
number and placing it in the other urn.
\[ p_{i,i-1} = \frac{i}{N}, \qquad p_{i,i+1} = \frac{N-i}{N}, \quad i = 0, 1, \ldots, N. \]
The Markov chain $\{X_n\}$, n = 0, 1, · · ·, has stationary/limiting
distribution $\pi^T = \{\pi_0, \pi_1, \ldots, \pi_N\}$ given by
\[ \pi_i = \binom{N}{i}\left(\frac{1}{2}\right)^N. \]
Symmetric Binomial Distribution. This distribution is a two-container case of Boltzmann-Maxwell ‘statistics’.
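The same kind of exact check applies to the Ehrenfest model. The sketch below assumes an illustrative N = 6 and uses hypothetical function names; it verifies that the symmetric binomial distribution satisfies πᵀP = πᵀ.

```python
from fractions import Fraction
from math import comb


def ehrenfest_matrix(N):
    """Transition matrix on states 0..N (balls in Urn A), as exact fractions."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, N)      # the chosen ball was in Urn A
        if i < N:
            P[i][i + 1] = Fraction(N - i, N)  # the chosen ball was in Urn B
    return P


def binomial_pi(N):
    """pi_i = C(N, i) (1/2)^N, i = 0..N."""
    return [Fraction(comb(N, i), 2 ** N) for i in range(N + 1)]


N = 6  # illustrative value
P = ehrenfest_matrix(N)
pi = binomial_pi(N)

pi_P = [sum(pi[i] * P[i][j] for i in range(N + 1)) for j in range(N + 1)]
is_stationary = (pi_P == pi)

print("pi is stationary:", is_stationary)
```

The code tests stationarity rather than convergence of the matrix powers. Each drawing changes Xn by exactly one, so this chain alternates between even and odd states, and a direct numerical check of the limit would need an averaged or two-step version of the chain.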
Statistical Mechanics.
The Ehrenfest Model was created in response to certain paradoxes
which appeared in Boltzmann’s (≈ 1872) efforts, following Maxwell,
to explain the thermodynamics of gases on the basis of kinetic
theory.
Gases were to be viewed as aggregates of particles undergoing movement at different velocities, and collisions between the particles were
to accord with the principles of Newtonian mechanics. Hence the term
Statistical Mechanics.
• One paradox was the apparent conflict in Boltzmann’s theory
between irreversibility (as manifested by increasing entropy)
and recurrence of states as expected from the assumptions
of Newtonian mechanics. ( ≈ Zermelo’s Paradox.)
• Loschmidt’s Reversibility Paradox.
Recurrence of States.
Both chains are finite and irreducible, so ‘positive recurrent’, which means that
every state recurs with probability one, and the mean
time between recurrences is finite.
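Mean Recurrence time of state i:
\[ \mu_i = \frac{1}{\pi_i}, \quad i = 0, 1, \ldots, N. \]
Thus for the
Ehrenfest model:
\[ \mu_i = \frac{1}{\binom{N}{i}\left(\frac{1}{2}\right)^N}. \]
If N = 20,000 and i = 0,
\[ \mu_0 = 2^{20000} \text{ time units (say secs)} \approx 10^{6000} \text{ years}. \]
However, if $i \approx N/2$,
\[ \mu_{N/2} \approx 175 \text{ time units}. \]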
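These orders of magnitude can be reproduced without forming the huge number 2^20000 explicitly, by working with logarithms. The sketch below is an illustration only; `SECONDS_PER_YEAR` encodes the assumption that one time unit is one second, and the function name is hypothetical.

```python
from math import exp, lgamma, log


def log_mean_recurrence(N, i):
    """Natural log of mu_i = 1 / (C(N, i) (1/2)^N) for the Ehrenfest chain."""
    log_binom = lgamma(N + 1) - lgamma(i + 1) - lgamma(N - i + 1)
    return N * log(2) - log_binom


N = 20_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: one time unit = one second

# State 0 (all balls in Urn B): report the exponent of 10 only.
log10_mu0_seconds = log_mean_recurrence(N, 0) / log(10)
log10_mu0_years = log10_mu0_seconds - log(SECONDS_PER_YEAR) / log(10)

# State N/2 (equal split): small enough to exponentiate directly.
mu_half = exp(log_mean_recurrence(N, N // 2))

print(f"log10(mu_0 in seconds) = {log10_mu0_seconds:.1f}")
print(f"log10(mu_0 in years)   = {log10_mu0_years:.1f}")
print(f"mu_(N/2) in time units = {mu_half:.1f}")
```

The exponent for state 0 comes out a little above 6000, in line with the figure quoted above. For the central state, Stirling's approximation gives μ ≈ √(πN/2), which is about 177 for N = 20,000, the same order as the rounded value of 175.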
An Entropy Property of Both Models.
The Ehrenfests chose as an analogue of the negative entropy of Boltzmann (which is supposed to be decreasing with
time) the quantity:
\[ 2\left|X_n - \frac{N}{2}\right|, \quad n = 0, 1, 2, \ldots \]
Kohlrausch and Schrödinger (1926) reported a simulation study
with 5000 successive drawings, when N = 100 and X0 = 100. By
drawing no. 200, the plot of this quantity against drawing number
(the Treppenkurve, the analogue of Boltzmann’s H-Kurve) oscillates
closely about 0. Define entropy as a function of E(Xn).
For both our models:
\[ E\left(X_{n+1} - \frac{N}{2}\right) = \left(1 - \frac{2}{N}\right) E\left(X_n - \frac{N}{2}\right). \]
An example of a scalar function of E(Xn) which decreases as n increases (negative ‘ENTROPY’):
\[ \varphi(E(X_n)) = 2\left|E(X_n) - \frac{N}{2}\right|. \]
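The recursion for E(Xn) and the behaviour of a single sample path can both be illustrated for the Ehrenfest model with N = 100 and X0 = 100, the setting of the Kohlrausch and Schrödinger study. This is a sketch, not a reconstruction of their experiment: the seed, the 200-step horizon and all function names are assumptions made for the illustration.

```python
import random


def ehrenfest_step_distribution(dist, N):
    """Push a probability vector over states 0..N through one Ehrenfest step."""
    new = [0.0] * (N + 1)
    for i, mass in enumerate(dist):
        if mass == 0.0:
            continue
        if i > 0:
            new[i - 1] += mass * i / N
        if i < N:
            new[i + 1] += mass * (N - i) / N
    return new


def expected_values(N, x0, steps):
    """E(X_n) for n = 0..steps, starting from X_0 = x0, by propagating the law."""
    dist = [0.0] * (N + 1)
    dist[x0] = 1.0
    means = []
    for _ in range(steps + 1):
        means.append(sum(i * p for i, p in enumerate(dist)))
        dist = ehrenfest_step_distribution(dist, N)
    return means


def simulate_path(N, x0, steps, seed=0):
    """One sample path; the seed is arbitrary and fixed for reproducibility."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(steps):
        # A uniformly chosen ball lies in Urn A with probability x / N.
        x += -1 if rng.random() < x / N else 1
        path.append(x)
    return path


N, x0, steps = 100, 100, 200
means = expected_values(N, x0, steps)
closed_form = [N / 2 + (1 - 2 / N) ** n * (x0 - N / 2) for n in range(steps + 1)]
max_gap = max(abs(a - b) for a, b in zip(means, closed_form))

phi = [2 * abs(m - N / 2) for m in means]            # function of E(X_n)
path = simulate_path(N, x0, steps)
treppenkurve = [2 * abs(x - N / 2) for x in path]    # one realised path

print("max |E(X_n) - closed form| =", max_gap)
print("phi at n = 0, 50, 200:", phi[0], phi[50], phi[200])
print("last five Treppenkurve values:", treppenkurve[-5:])
```

The sequence `phi` decreases at every step, while the realised `treppenkurve` falls quickly at first and then fluctuates, which is the distinction between the ‘entropy’ of the expected value and that of a single trajectory.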
Models for Heat Exchange.
The Ehrenfest model is a physical model for heat exchange between
two isolated bodies at unequal temperatures, the two urns symbolizing the isolated bodies, and the number of black balls in each symbolizing their temperatures. The realistic value of the model is supported
by:
\[ E\left(X_k - \frac{N}{2}\right) = \left(1 - \frac{2}{N}\right)^k E\left(X_0 - \frac{N}{2}\right), \]
since in the limit as $k \to \infty$, $N \to \infty$, $\tau \to 0$, in such a way that
$N\tau/2 \to \gamma^{-1}$ and $k\tau = t$ = constant = time, where $\tau$ is the unit of
time, the factor
\[ \left(1 - \frac{2}{N}\right)^k \longrightarrow e^{-\gamma t}, \]
which is Newton’s law of cooling.
D. Bernoulli (1769) already had the Newton’s law of cooling argument. Transparently, the Bernoulli-Laplace model could
have been used also to resolve by analogy the paradoxes in statistical mechanics without the introduction of
the Ehrenfest model.
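The passage to Newton's law of cooling can be seen numerically by letting N grow while holding γ and t fixed. In the sketch below the values of `gamma` and `t` are hypothetical, and τ is set to 2/(γN) so that Nτ/2 equals 1/γ exactly for every N.

```python
from math import exp


def discrete_factor(N, gamma, t):
    """(1 - 2/N)^k with tau = 2 / (gamma * N) and k = t / tau drawings."""
    tau = 2.0 / (gamma * N)   # time per drawing, so that N * tau / 2 = 1 / gamma
    k = round(t / tau)        # number of drawings made up to time t
    return (1.0 - 2.0 / N) ** k


gamma, t = 0.5, 3.0           # hypothetical cooling rate and elapsed time
target = exp(-gamma * t)

errors = {}
for N in (100, 1_000, 10_000, 100_000):
    approx = discrete_factor(N, gamma, t)
    errors[N] = abs(approx - target)
    print(f"N = {N:>7}: (1 - 2/N)^k = {approx:.6f}, error = {errors[N]:.2e}")

print(f"exp(-gamma t) = {target:.6f}")
```

The error shrinks roughly in proportion to 1/N, as expected from the expansion log(1 − 2/N) = −2/N − 2/N² − · · ·.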
Marian Smoluchowski’s Model. 1.
Let $X_n$, n = 0, 1, 2, · · ·, denote the number of individuals at time
n, where movement from time n to time n + 1 is defined by:
\[ X_{n+1} = Z_1^{(n+1)} + Z_2^{(n+1)} + \cdots + Z_{X_n}^{(n+1)} + I_{n+1}. \]
Here $Z_j^{(n+1)}$ is the number of offspring of the jth individual existing at time n, and $I_{n+1}$ is the number of immigrants coming into the
population to supplement these offspring in forming the totality of
the number of individuals $X_{n+1}$ at time n + 1.
All the random variables $Z_j^{(k)}$, $I_k$, j, k ≥ 1, are assumed independent. All the $Z_j^{(k)}$’s are assumed to have the same probability
distribution, and all the $I_k$’s are assumed to have the same probability distribution:
\[ \Pr(Z = 0) = P = 1 - \Pr(Z = 1) \]
and
\[ \Pr(I = j) = \frac{e^{-\lambda}\lambda^j}{j!}, \quad j = 0, 1, \ldots \]
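The defining relation translates directly into a simulation: each existing individual is retained with probability 1 − Pr(Z = 0), and a Poisson(λ) number of immigrants is added. The sketch below writes Pr(Z = 0) as `P`; the parameter values, the seed and the function names are assumptions made for illustration, and the Poisson sampler is Knuth's multiplication method, which is adequate only for small λ.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def smoluchowski_step(x, P, lam, rng):
    """X_{n+1}: each of x individuals has Z = 1 w.p. 1 - P, plus Poisson(lam) immigrants."""
    offspring = sum(1 for _ in range(x) if rng.random() >= P)
    return offspring + poisson_sample(lam, rng)


def simulate(x0, P, lam, steps, seed=0):
    """Sample path X_0, ..., X_steps; the seed is arbitrary."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(steps):
        path.append(smoluchowski_step(path[-1], P, lam, rng))
    return path


P, lam = 0.5, 1.0             # hypothetical parameter values
path = simulate(x0=0, P=P, lam=lam, steps=20_000)
sample_mean = sum(path) / len(path)

print("sample mean of X_n:", sample_mean)
print("lambda / P        :", lam / P)
```

Taking expectations in the defining relation gives E(X_{n+1}) = (1 − P)E(X_n) + λ, whose fixed point is λ/P, so the long-run sample mean should settle near that value.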
Marian Smoluchowski’s Model. 2.
From the expression relating $X_{n+1}$ to $X_n$ above, it follows that the sequence
$\{X_n\}$, n ≥ 0, is a Markov chain on the countably infinite state
space S = {0, 1, 2, · · ·}.
For Smoluchowski’s choices for the offspring (binary) and immigration (Poisson($\lambda$)) distributions, the chain is positive recurrent, and has a limiting stationary distribution which is
Poisson($\mu$):
\[ \Pr(X_n = j) = \frac{e^{-\mu}\mu^j}{j!}, \quad j = 0, 1, \ldots, \qquad \mu = \frac{\lambda}{P}. \]
Moreover
\[ E(X_{n+1} - \mu) = (1 - P)\,E(X_n - \mu). \]
Thus this ‘entropy’ feature is common to all 3 models.
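The Poisson(µ) stationary distribution can be checked numerically from the transition probabilities implied by the model: given Xn = i, a Binomial(i, 1 − P) number of individuals remain and an independent Poisson(λ) number arrive. The sketch below is an illustration under assumptions: the parameter values are hypothetical, the infinite state space is truncated at `M`, and the convolution form of `transition_prob` is derived from the model definition rather than quoted from Smoluchowski's paper.

```python
from math import comb, exp, factorial


def poisson_pmf(j, mean):
    """Pr(Poisson(mean) = j)."""
    return exp(-mean) * mean ** j / factorial(j)


def transition_prob(i, j, P, lam):
    """p_ij: k of the i individuals remain and j - k immigrants arrive."""
    return sum(comb(i, k) * (1 - P) ** k * P ** (i - k) * poisson_pmf(j - k, lam)
               for k in range(min(i, j) + 1))


P, lam = 0.4, 1.2     # hypothetical parameter values
mu = lam / P          # stationary mean
M = 60                # truncation level; the Poisson tail beyond it is negligible

# Stationarity: sum_i pi_i p_ij should reproduce pi_j.
pi = [poisson_pmf(j, mu) for j in range(M)]
pi_next = [sum(pi[i] * transition_prob(i, j, P, lam) for i in range(M))
           for j in range(20)]
max_error = max(abs(pi_next[j] - pi[j]) for j in range(20))

# Linear conditional mean: E(X_{n+1} | X_n = i) = (1 - P) i + lam.
i = 5
cond_mean = sum(j * transition_prob(i, j, P, lam) for j in range(M))

print("max |(pi P)_j - pi_j| =", max_error)
print("E(X_{n+1} | X_n = 5)  =", cond_mean, "vs", (1 - P) * i + lam)
```

The second check is the conditional form of the ‘entropy’ relation: subtracting µ = λ/P from both sides of E(X_{n+1} | X_n = i) = (1 − P)i + λ gives (1 − P)(i − µ).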
Reversibility.
Another common feature of each of the 3 models is that in their
stationary regime, the Markov chain is reversible:
\[ \Pr(X_{n+1} = j, X_n = i) = \Pr(X_{n+1} = i, X_n = j). \]
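In the stationary regime the joint probability Pr(X_{n+1} = j, X_n = i) equals π_i p_ij, so reversibility is the detailed balance condition π_i p_ij = π_j p_ji. The sketch below checks this exactly for the two urn models with an illustrative N = 5, and contrasts them with a deterministic three-state cycle, a hypothetical counterexample added here to show that the condition can fail.

```python
from fractions import Fraction
from math import comb


def ehrenfest(N):
    """Ehrenfest transition matrix on states 0..N."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, N)
        if i < N:
            P[i][i + 1] = Fraction(N - i, N)
    return P


def bernoulli_laplace(N):
    """Bernoulli-Laplace transition matrix on states 0..N."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, N) ** 2
        if i < N:
            P[i][i + 1] = Fraction(N - i, N) ** 2
        P[i][i] = 2 * Fraction(i, N) * Fraction(N - i, N)
    return P


def is_reversible(P, pi):
    """Detailed balance: pi_i p_ij == pi_j p_ji for all pairs of states."""
    n = len(P)
    return all(pi[i] * P[i][j] == pi[j] * P[j][i]
               for i in range(n) for j in range(n))


N = 5  # illustrative value
pi_ehrenfest = [Fraction(comb(N, i), 2 ** N) for i in range(N + 1)]
pi_bl = [Fraction(comb(N, i) ** 2, comb(2 * N, N)) for i in range(N + 1)]

# Contrast: the cycle 0 -> 1 -> 2 -> 0 has a uniform stationary law
# but a preferred direction, so detailed balance fails.
zero, one = Fraction(0), Fraction(1)
cycle = [[zero, one, zero],
         [zero, zero, one],
         [one, zero, zero]]
pi_cycle = [Fraction(1, 3)] * 3

ehrenfest_reversible = is_reversible(ehrenfest(N), pi_ehrenfest)
bl_reversible = is_reversible(bernoulli_laplace(N), pi_bl)
cycle_reversible = is_reversible(cycle, pi_cycle)

print("Ehrenfest reversible:        ", ehrenfest_reversible)
print("Bernoulli-Laplace reversible:", bl_reversible)
print("3-cycle reversible:          ", cycle_reversible)
```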
Loschmidt’s Reversibility Paradox
Loschmidt first drew attention to the
fact that in view of the essential symmetry of the laws of mechanics in the past
and future, all molecular processes must
be reversible from the point of view of
statistical mechanics. This is in apparent contradiction with the point of view
held in thermodynamics that certain processes are irreversible.
Chandrasekhar, 1943, p. 55
Essential Structure of the 3 Models.
All 3 models possess a common special MATRIX SPECTRAL structure.
• This is responsible for the ‘entropy’ recurrence relations:
\[ E(X_{n+1} \mid X_n) = a + bX_n, \quad b \ne 1. \]
Putting $\mu = a/(1 - b)$, in matrix terms:
\[ \sum_j p_{ij}(j - \mu) = b(i - \mu). \]
Thus b is an eigenvalue of P , the transition matrix, corresponding to right eigenvector {j − µ}, j = 0, 1, 2, · · ·.
If the chain is in stationary regime, EXn = µ.
• Further, reversibility for the two finite state space cases comes
from the fact that random walk Markov chains are reversible,
which is just a matrix property of the structure of the transition matrix. The transition matrix P of any finite
reversible Markov chain has all eigenvalues real.
• For the Ehrenfest model the complete set of eigenvalues is
\[ \lambda_n = 1 - \frac{2n}{N}, \quad n = 0, 1, \ldots, N, \]
and the corresponding right eigenvector is given by the nth Krawtchouk
(Kravchuk) polynomial. These polynomials are orthogonal with respect to the symmetric Binomial distribution.
• For the Bernoulli-Laplace model the complete set of eigenvalues is
\[ \lambda_n = 1 - \frac{n(2N + 1 - n)}{N^2}, \quad n = 0, 1, \ldots, N, \]
and the corresponding right eigenvector is given by the nth Hahn polynomial. These polynomials are orthogonal with respect
to the Hypergeometric distribution.
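Both eigenvalue formulas, and the linear right eigenvector {j − N/2} with b = 1 − 2/N, can be verified exactly for a small N. The sketch below does not compute the Krawtchouk or Hahn polynomials; it only checks that det(P − λI) = 0 for each claimed λ, using rational arithmetic. N = 6 and all function names are assumptions made for the illustration.

```python
from fractions import Fraction


def ehrenfest(N):
    """Ehrenfest transition matrix on states 0..N."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, N)
        if i < N:
            P[i][i + 1] = Fraction(N - i, N)
    return P


def bernoulli_laplace(N):
    """Bernoulli-Laplace transition matrix on states 0..N."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, N) ** 2
        if i < N:
            P[i][i + 1] = Fraction(N - i, N) ** 2
        P[i][i] = 2 * Fraction(i, N) * Fraction(N - i, N)
    return P


def determinant(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [row[:] for row in M]
    n = len(A)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det


def is_eigenvalue(P, lam):
    """True when det(P - lam I) vanishes exactly."""
    n = len(P)
    shifted = [[P[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return determinant(shifted) == 0


def has_linear_eigenvector(P, N):
    """Check sum_j p_ij (j - N/2) == (1 - 2/N)(i - N/2) for every state i."""
    mu, b = Fraction(N, 2), 1 - Fraction(2, N)
    return all(sum(P[i][j] * (j - mu) for j in range(N + 1)) == b * (i - mu)
               for i in range(N + 1))


N = 6  # illustrative value
ehrenfest_ok = all(is_eigenvalue(ehrenfest(N), 1 - Fraction(2 * n, N))
                   for n in range(N + 1))
bl_ok = all(is_eigenvalue(bernoulli_laplace(N),
                          1 - Fraction(n * (2 * N + 1 - n), N * N))
            for n in range(N + 1))
linear_ok = (has_linear_eigenvector(ehrenfest(N), N)
             and has_linear_eigenvector(bernoulli_laplace(N), N))

print("Ehrenfest eigenvalues confirmed:        ", ehrenfest_ok)
print("Bernoulli-Laplace eigenvalues confirmed:", bl_ok)
print("linear eigenvector for b = 1 - 2/N:     ", linear_ok)
```

Each formula gives N + 1 distinct values, which matches the size of the matrix, so confirming every one as a root of the characteristic polynomial accounts for the whole spectrum.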
Structure and Terminology.
1. The Bernoulli-Laplace and Ehrenfest models, each an urn
model, can also be viewed as special cases of the Markov chain
models called random walks.
2. Smoluchowski’s model is a special case of the Markov chain
models called branching processes with immigration.
3. The general concept of a Markov chain arose with a paper in 1906 of the Russian mathematician Andrei Andreievich
Markov (1856-1922).
4. The physical idea of a transition probability is present
even in Smoluchowski’s (1914) paper, where an explicit form for $p_{ij}$
for his model is written down.
Marian Smoluchowski’s Model. 3.
Smoluchowski (1914) used the process {Xn} as a model for the
fluctuation in the number of particles contained in a geometrically
well-defined small element of volume, v, in a much larger volume
of solution containing particles exhibiting random motion, the system being in its equilibrium (= stationary regime). Observations
Xn, n ≥ 1, on v are made at time points spaced at equal intervals, τ,
apart.
Probability After-effect (Wahrscheinlichkeitsnachwirkung).
• The number P is the probability that a particle somewhere inside v will have emerged from v during a time interval τ. Explicit
and numerically calculable expressions for P and µ are obtained by Smoluchowski (1914), in terms of the various physical
constants and colloid theory.
• He also shows how to estimate P and µ statistically from observation of a trajectory (sample path) of {Xn}.
• The necessarily ad hoc statistical comparisons in Smoluchowski
(1914) and Chandrasekhar (1943) show very good agreement
between the model value and estimated values of P and µ.
Time Series Analysis of Smoluchowski’s Model.
Needed: A large sample statistical test to test the null hypothesis that an observed non-negative data sequence comes from
• A branching process with immigration;
• More narrowly: from a Smoluchowski simple branching process
with immigration.
Such tests were finally developed by Mills and Seneta (1989, 1991)
as analogues of Quenouille’s test in time series, likewise using
sample partial autocorrelations.
Smoluchowski’s model was found to give a striking simplification
of the general case, with sample partial autocorrelations at lag k ≥ 2
asymptotically independent and Gaussian.
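The population-level fact behind the use of partial autocorrelations can be illustrated directly. Because E(X_{n+1} | X_n) = a + bX_n with b = 1 − P, the stationary autocorrelation at lag k is b^k, and the Durbin-Levinson recursion then gives partial autocorrelations that vanish at every lag k ≥ 2. The sketch below shows only this calculation, not the Mills and Seneta test statistics; the value of `P` and the function name are hypothetical.

```python
def partial_autocorrelations(rho, max_lag):
    """Durbin-Levinson recursion; rho[k] is the lag-k autocorrelation, rho[0] = 1."""
    pacf = [1.0]       # lag 0, by convention
    phi_prev = []      # phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


P = 0.3                # hypothetical escape probability
b = 1 - P              # slope of the linear conditional mean
max_lag = 6

rho = [b ** k for k in range(max_lag + 1)]   # stationary autocorrelations
pacf = partial_autocorrelations(rho, max_lag)

for k, value in enumerate(pacf):
    print(f"lag {k}: partial autocorrelation = {value:.6f}")
```

With data, the same recursion would be applied to sample autocorrelations, and the large-sample behaviour of the resulting values at lags k ≥ 2 is what the tests described above exploit.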
Reinhold Fürth (1918-1919) and passers-by in Prague.
Marian Smoluchowski (1872-1917).
• Born to a Polish family in Vorderbrühl, near Vienna.
• The times: the Austro-Hungarian Monarchy. North-east part:
Galicia (German: Galizien), with its university cities Krakau (Polish:
Kraków; English: Cracow), the old capital of Poland, and
Lemberg (Ukrainian: L’viv; Polish: Lwów; Russian: L’vov).
• Smoluchowski held the Chair of Mathematical Physics at the
University in Lemberg from 1903 to 1913. His most important
work was done during this time, and is heavily probabilistic and
statistical.
• From about 1900 Smoluchowski worked on Brownian motion.
He had waited to use experimental data for verification of his
theory, and was beaten to publication by Einstein.
• Smoluchowski was one of the strong tradition of Viennese statistical physicists, and a close friend of F. Hasenöhrl, who edited
Boltzmann’s Wissenschaftliche Abhandlungen (1909).
• In 1913 he became Professor of Experimental Physics at the Jagiellonian University in Cracow. He died in the summer of 1917, of dysentery, at age 45. Rectorship of the Jagiellonian University.
References
[1] M.S. Bartlett, An Introduction to Stochastic Processes. Cambridge University Press, Cambridge. 1955.
[2] S. Chandrasekhar, Stochastic problems in physics and astronomy. Reviews of Modern Physics, 15(1943), 1–89. (Also in N.
Wax (Ed.) Selected Problems on Noise and Stochastic Processes. Dover, New York, 1954, pp. 3–91.)
[3] P. and T. Ehrenfest, Über zwei bekannte Einwände gegen
das Boltzmannsche H-Theorem. Physikalische Zeitschrift,
8(1907), 311–314.
[4] R. Fürth, Statistik und Wahrscheinlichkeitsnachwirkung. Physikalische Zeitschrift, 19(1918), 421–426;
ibid. 20(1919), 21.
[5] C.C. Heyde and E. Seneta, Estimation theory for growth and
immigration rates in a multiplicative process. J. Appl. Prob.,
9(1972), 235–256.
[6] M. Kac, Random Walk and the Theory of Brownian Motion.
American Mathematical Monthly, 54(1947), 369–391. (Also
in N. Wax (Ed.) Selected Problems on Noise and Stochastic
Processes. Dover, New York, 1954, pp. 295–317.)
[7] K.W.F. Kohlrausch and E. Schrödinger, Das Ehrenfestsche Modell der H-Kurve. Physikalische Zeitschrift, 27(1926), 306–313.
[8] S. Ku and E. Seneta, Practical estimation from the sum of AR(1)
processes. Commun. Statist.-Simula., 27(1998), 981–998.
[9] T.M. Mills and E. Seneta, Goodness-of-fit for a branching process with immigration using sample partial autocorrelations.
Stochastic Processes and their Applications, 33(1989), 151–161.
[10] T.M. Mills and E. Seneta, Independence of partial autocorrelations for a classical immigration branching process. Stochastic
Processes and their Applications, 37(1991), 275–279.
[11] P.A.P. Moran, Entropy, Markov processes and Boltzmann’s H-Theorem. Proceedings of the Cambridge Philosophical Society, 57(1961), 833–842.
[12] J.E. Moyal, Stochastic processes and statistical physics. J. Roy.
Statist. Soc., Ser. B, 11(1949), 150–210.
[13] J.E. Moyal, Discontinuous Markov Processes. Acta Mathematica, 98(1957), 221–264.
[14] E. Seneta, Entropy and Martingales in Markov Chain Models. In
J. Gani and E.J. Hannan (Eds.) Essays in Statistical Science.
Papers in honour of P.A.P. Moran. J. Appl. Prob., 19A(1982),
367–381.
[15] E. Seneta, Marian Smoluchowski. In C.C. Heyde and E. Seneta
(Eds.) Statisticians of the Centuries. Springer, New York.
2001. pp. 299–302.
[16] M. von Smoluchowski, Studien über Molekularstatistik von
Emulsionen und deren Zusammenhang mit der Brown’schen
Bewegung. Sitzungsberichte. Akademie der Wissenschaften.
Mathematisch-Naturwissenschaftliche Klasse. CXXIII Band X
Heft. Dez. 1914. Abteilung IIA. Wien. pp. 2381–2405.
[17] S. Ulam, Marian Smoluchowski and the theory of probabilities
in physics. American Journal of Physics, 25(1957), 475–481.