Moyal Lecture: 3 Nov., 2006

Markov Chains as Models in Statistical Mechanics.

Eugene Seneta¹

SUMMARY

The D. Bernoulli (1769)/Laplace (1812) urn model and the Ehrenfest (1907) urn model for mixing are instances of simple Markov chain models called random walks. Both can be used to suggest a probabilistic resolution of the controversy concerning the irreversibility and recurrence paradoxes inherent in Boltzmann’s H-Theorem. Marian von Smoluchowski (1914) also modelled by a simple Markov chain, with analogous properties, fluctuations over time in the number of particles contained in a geometrically well defined small element of volume in a solution. The lecture explores lightly the themes of entropy, recurrence and reversibility within the framework of such Markov chains.

¹ Emeritus Professor, School of Mathematics and Statistics, University of Sydney, N.S.W. 2006. Email: [email protected]

Markov Chain.

A PROBABILITY MODEL WHICH ALLOWS FOR STATISTICAL DEPENDENCE BETWEEN OBSERVATIONS AT SUCCESSIVE TIME POINTS $0, 1, \cdots$ ON THE STATE OF A SYSTEM EVOLVING OVER TIME: $X_0, X_1, X_2, \cdots$

Markov Property:

$$\Pr(X_{m+1} = j \mid X_m = i, X_{m-1} = i_{m-1}, \cdots, X_0 = i_0) = p_{ij}, \quad i, j \in S.$$

Matrix Structure: $P \ge 0$, $P\mathbf{1} = \mathbf{1}$. Matrices with this property are called stochastic. $P$ is the transition matrix of the Markov chain.

Matrix Powers. $P^n = \{p_{ij}^{(n)}\}$, where $p_{ij}^{(n)} = \Pr(X_{m+n} = j \mid X_m = i)$.

Finite Markov Chain: There are $N + 1$ states, $S = \{0, 1, \cdots, N\}$.

Simplest Markov Chain: $S = \{0, 1\}$,

$$P = \begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix}.$$

Stationary/Ergodic Properties.

If $P$ is finite and irreducible, there is a unique solution vector $\pi$ of

$$\pi^T P = \pi^T, \quad \pi^T \mathbf{1} = 1.$$

Result 1. $\pi > 0$, and since $\pi^T \mathbf{1} = 1$ its elements form a probability distribution with strictly positive entries which add to one:

$$\pi^T = \{\pi_0, \pi_1, \cdots, \pi_N\}, \quad \pi_0 + \pi_1 + \cdots + \pi_N = 1.$$
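The defining equations $\pi^T P = \pi^T$, $\pi^T \mathbf{1} = 1$ can be solved directly for a small chain. The following is a minimal Python sketch, not part of the lecture: the function name `stationary_distribution` and the numerical entries of the two-state matrix are illustrative assumptions. It uses exact rational arithmetic and assumes the chain is finite and irreducible, so that the linear system has a unique solution.

```python
from fractions import Fraction


def stationary_distribution(P):
    """Solve pi^T P = pi^T, pi^T 1 = 1 exactly for a finite irreducible chain.

    P is a list of rows of a stochastic matrix (hypothetical helper, for illustration).
    """
    n = len(P)
    # Row i of A encodes the balance equation sum_j pi_j p_ji - pi_i = 0.
    A = [[Fraction(P[j][i]) - (1 if i == j else 0) for j in range(n)]
         for i in range(n)]
    b = [Fraction(0)] * n
    # One balance equation is redundant; replace it by the normalisation sum(pi) = 1.
    A[n - 1] = [Fraction(1)] * n
    b[n - 1] = Fraction(1)
    # Gauss-Jordan elimination with row swaps.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [a * inv for a in A[col]]
        b[col] = b[col] * inv
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] = b[r] - f * b[col]
    return b


# Two-state chain with illustrative entries p01 = 3/10, p10 = 2/5.
P2 = [[Fraction(7, 10), Fraction(3, 10)],
      [Fraction(2, 5), Fraction(3, 5)]]
print(stationary_distribution(P2))  # → [Fraction(4, 7), Fraction(3, 7)]
```

For the two-state chain the result agrees with the closed form $\pi^T = (p_{10}, p_{01})/(p_{01} + p_{10})$, and both entries are strictly positive, as Result 1 asserts.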
If the Markov chain with transition matrix $P$ starts off at time 0 with this distribution over its states, this is the distribution at all time points $n = 0, 1, 2, \cdots$:

$$\Pr(X_n = j) = \pi_j, \quad j = 0, 1, \cdots, N.$$

$\pi$ is called the stationary distribution vector.

Result 2. If $P$ is irreducible and aperiodic (that is, $P^n > 0$ for some $n$), then as $n \to \infty$:

$$P^n \longrightarrow \mathbf{1}\pi^T.$$

The limiting stochastic matrix $\mathbf{1}\pi^T$ has all its rows identical. This is called the ergodic property. As $n \to \infty$:

$$p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i) \longrightarrow \pi_j.$$

D. Bernoulli (1769)/Laplace (1812) Model.

Two Urns, A and B. Each urn has $N$ balls, so total number of balls is $2N$. The totality of $2N$ balls consists of $N$ white, and $N$ black.

Let $X_n$ be the number of black balls in Urn A after $n$ interchanges. An interchange consists of selecting a ball at random from Urn A, and a ball at random from Urn B, and placing each in the other urn. Then

$$p_{i,i-1} = \left(\frac{i}{N}\right)^2, \quad p_{i,i+1} = \left(\frac{N-i}{N}\right)^2, \quad p_{i,i} = 2\,\frac{i}{N}\cdot\frac{N-i}{N}, \quad i = 0, 1, \cdots, N.$$

The Markov chain $\{X_n\}$, $n = 0, 1, \cdots$, has stationary/limiting distribution $\pi^T = \{\pi_0, \pi_1, \cdots, \pi_N\}$ given by

$$\pi_i = \frac{\binom{N}{i}\binom{N}{N-i}}{\binom{2N}{N}}.$$

Ehrenfest (1907) Model.

Two Urns, A and B. Total number of balls is $N$. (All $N$ balls are black, and labeled 1 to $N$.)

Let $X_n$ be the number of (black) balls in Urn A after $n$ interchanges. An interchange consists of selecting a number at random from the set $\{1, 2, \cdots, N\}$, finding the ball with this number and placing it in the other urn.

$$p_{i,i-1} = \frac{i}{N}, \quad p_{i,i+1} = \frac{N-i}{N}, \quad i = 0, 1, \cdots, N.$$

The Markov chain $\{X_n\}$, $n = 0, 1, \cdots$, has stationary/limiting distribution $\pi^T = \{\pi_0, \pi_1, \cdots, \pi_N\}$ given by

$$\pi_i = \binom{N}{i}\left(\frac{1}{2}\right)^N.$$

Symmetric Binomial Distribution. This distribution is a two-container case of Boltzmann-Maxwell ‘statistics’.

Statistical Mechanics.

The Ehrenfest Model was created in response to certain paradoxes which appeared in Boltzmann’s (≈ 1872) efforts, following Maxwell, to explain thermodynamics of gases on the basis of kinetic theory.
Gases were to be viewed as aggregates of particles undergoing movement at different velocities, and collisions between the particles were to accord with the principles of Newtonian mechanics. Hence the term Statistical Mechanics.

• One paradox was the apparent conflict in Boltzmann’s theory between irreversibility (as manifested by increasing entropy) and recurrence of states as expected from the assumptions of Newtonian mechanics. (≈ Zermelo’s Paradox.)
• Loschmidt’s Reversibility Paradox.

Recurrence of States.

Both chains are finite and irreducible, so ‘positive recurrent’, which means that every state recurs with probability one, and the mean time between recurrences is finite.

Mean recurrence time of state $i$:

$$\mu_i = \frac{1}{\pi_i}, \quad i = 0, 1, \cdots, N.$$

Thus for the Ehrenfest model:

$$\mu_i = \frac{1}{\binom{N}{i}\left(\frac{1}{2}\right)^N}.$$

If $N = 20{,}000$ and $i = 0$, $\mu_0 = 2^{20000}$ time units (say secs), $\approx 10^{6000}$ years. However, if $i \approx N/2$, $\mu_{N/2} \approx 175$ time units.

An Entropy Property of Both Models.

The Ehrenfests chose as an analogue of the negative entropy of Boltzmann (which is supposed to be decreasing with time) the quantity:

$$2\left|X_n - \frac{N}{2}\right|, \quad n = 0, 1, 2, \cdots$$

Kohlrausch and Schrödinger (1926) reported a simulation study with 5000 successive drawings, when $N = 100$ and $X_0 = 100$. By drawing no. 200, the plot of this quantity against drawing number (the Treppenkurve, the analogue of Boltzmann’s H-Kurve) oscillates closely about 0.

Define entropy as a function of $E X_n$. For both our models:

$$E\left(X_{n+1} - \frac{N}{2}\right) = \left(1 - \frac{2}{N}\right) E\left(X_n - \frac{N}{2}\right).$$

An example of a scalar function of $E(X_n)$ which decreases as $n$ increases (negative ‘ENTROPY’):

$$\varphi(E(X_n)) = 2\left|E(X_n) - \frac{N}{2}\right|.$$

Models for Heat Exchange.
The Ehrenfest model is a physical model for heat exchange between two isolated bodies at unequal temperatures, the two urns symbolizing the isolated bodies, and the number of black balls in each symbolizing their temperatures. The realistic value of the model is supported by:

$$E\left(X_k - \frac{N}{2}\right) = \left(1 - \frac{2}{N}\right)^k E\left(X_0 - \frac{N}{2}\right)$$

since in the limit as $k \to \infty$, $N \to \infty$, $\tau \to 0$, in such a way that $N\tau/2 \to \gamma^{-1}$ and $k\tau = t$ = constant = time, where $\tau$ is the unit of time, the factor:

$$\left(1 - \frac{2}{N}\right)^k \longrightarrow e^{-\gamma t}.$$

This is Newton’s law of cooling. D. Bernoulli (1769) already had the Newton’s law of cooling argument.

Transparently, the Bernoulli-Laplace model could have been used also to resolve by analogy the paradoxes in statistical mechanics without the introduction of the Ehrenfest model.

Marian Smoluchowski’s Model. 1.

Let $X_n$, $n = 0, 1, 2, \cdots$, denote the number of individuals at time $n$, where movement from time $n$ to time $n + 1$ is defined by:

$$X_{n+1} = Z_1^{(n+1)} + Z_2^{(n+1)} + \cdots + Z_{X_n}^{(n+1)} + I_{n+1}.$$

Here $Z_j^{(n+1)}$ is the number of offspring of the $j$th individual existing at time $n$, and $I_{n+1}$ is the number of immigrants coming into the population to supplement these offspring in forming the totality of the number of individuals $X_{n+1}$ at time $n + 1$.

All the random variables $Z_j^{(k)}$, $I_k$, $j, k \ge 1$, are assumed independent. All the $Z_j^{(k)}$’s are assumed to have the same probability distribution, and all the $I_k$’s are assumed to have the same probability distribution:

$$\Pr(Z = 0) = P = 1 - \Pr(Z = 1)$$

and

$$\Pr(I = j) = \frac{e^{-\lambda}\lambda^j}{j!}, \quad j = 0, 1, \cdots$$

Marian Smoluchowski’s Model. 2.

From the expression relating $X_{n+1}$ to $X_n$ above, it follows that the sequence $\{X_n\}$, $n \ge 0$, is a Markov chain on the countably infinite state space $S = \{0, 1, 2, \cdots\}$.

For Smoluchowski’s choices for the offspring (binary) and immigration (Poisson($\lambda$)) distributions, the chain is positive recurrent, and has limiting stationary distribution which is Poisson($\mu$):

$$\Pr(X_n = j) = \frac{e^{-\mu}\mu^j}{j!}, \quad j = 0, 1, \cdots, \quad \mu = \frac{\lambda}{P}.$$

Moreover

$$E(X_{n+1} - \mu) = (1 - P)\,E(X_n - \mu).$$
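The Poisson($\mu$) stationary distribution and the mean recurrence stated above can be checked numerically. The sketch below is illustrative only: the function names, the parameter values $P = 0.25$, $\lambda = 0.5$ and the truncation level 60 are assumptions made for the example, not quantities from the lecture. It builds $p_{ij}$ as a convolution of a Binomial($i$, $1 - P$) number of remaining particles with a Poisson($\lambda$) number of immigrants.

```python
import math


def poisson_pmf(mean, j):
    """Pr(I = j) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean**j / math.factorial(j)


def binomial_pmf(n, q, k):
    """Pr(k successes in n independent trials with success probability q)."""
    return math.comb(n, k) * q**k * (1 - q)**(n - k)


def transition_prob(i, j, P, lam):
    """p_ij: k of the i particles remain (Z = 1, prob. 1 - P) and j - k immigrate."""
    return sum(binomial_pmf(i, 1 - P, k) * poisson_pmf(lam, j - k)
               for k in range(min(i, j) + 1))


# Illustrative parameter values (assumptions, not from the lecture).
P, lam = 0.25, 0.5
mu = lam / P   # stationary mean, here 2.0
M = 60         # truncation level; Poisson(2) mass beyond 60 is negligible

# Stationarity: sum_i pi_i p_ij should reproduce pi_j.
max_gap = max(
    abs(sum(poisson_pmf(mu, i) * transition_prob(i, j, P, lam)
            for i in range(M + 1)) - poisson_pmf(mu, j))
    for j in range(11)
)
print(f"largest stationarity gap for j <= 10: {max_gap:.2e}")

# Mean recurrence: E(X_{n+1} | X_n = i) - mu should equal (1 - P)(i - mu).
i = 5
cond_mean = sum(j * transition_prob(i, j, P, lam) for j in range(M + 1))
print(cond_mean - mu, (1 - P) * (i - mu))
```

The two printed numbers in the last line should agree up to floating-point and truncation error, which is the conditional form of the relation $E(X_{n+1} - \mu) = (1 - P)E(X_n - \mu)$.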
Thus this ‘entropy’ feature is common to all 3 models.

Reversibility.

Another common feature of each of the 3 models is that in their stationary regime, the Markov chain is reversible:

$$\Pr(X_{n+1} = j, X_n = i) = \Pr(X_{n+1} = i, X_n = j).$$

Loschmidt’s Reversibility Paradox

Loschmidt first drew attention to the fact that in view of the essential symmetry of the laws of mechanics in the past and future, all molecular processes must be reversible from the point of view of statistical mechanics. This is in apparent contradiction with the point of view held in thermodynamics that certain processes are irreversible.

Chandrasekhar, 1943, [2], p. 55

Essential Structure of the 3 Models.

All 3 models possess a common special MATRIX SPECTRAL structure.

• This is responsible for the ‘entropy’ recurrence relations:

$$E(X_{n+1} \mid X_n) = a + bX_n, \quad b \ne 1.$$

Putting $\mu = a/(1 - b)$, in matrix terms:

$$\sum_j p_{ij}(j - \mu) = b(i - \mu).$$

Thus $b$ is an eigenvalue of $P$, the transition matrix, corresponding to right eigenvector $\{j - \mu\}$, $j = 0, 1, 2, \cdots$. If the chain is in stationary regime, $E X_n = \mu$.

• Further, reversibility for the two finite state space cases comes from the fact that random walk Markov chains are reversible, which is just a matrix property of the structure of the transition matrix. The transition matrix $P$ of any finite reversible Markov chain has all eigenvalues real.

• For the Ehrenfest model the complete set of eigenvalues is

$$\lambda_n = 1 - \frac{2n}{N}, \quad n = 0, 1, \cdots, N,$$

and the corresponding right eigenvector is the $n$th Krawtchouk (Kravchuk) polynomial. These polynomials are orthogonal with respect to the symmetric Binomial distribution.

• For the Bernoulli-Laplace model the complete set of eigenvalues is

$$\lambda_n = 1 - \frac{n(2N + 1 - n)}{N^2}, \quad n = 0, 1, \cdots, N,$$

and the corresponding right eigenvector is the $n$th Hahn polynomial. These polynomials are orthogonal with respect to the Hypergeometric distribution.

Structure and Terminology.

1.
The Bernoulli-Laplace and Ehrenfest models, each an urn model, can be viewed also as special cases of the Markov chain models called random walks.

2. Smoluchowski’s model is a special case of the Markov chain models called branching processes with immigration.

3. The general concept of a Markov chain arose with a paper in 1906 of the Russian mathematician Andrei Andreievich Markov (1856-1922).

4. The physical idea of a transition probability is present even in Smoluchowski’s (1914) paper, where an explicit form for $p_{i,j}$ for his model is written down.

Marian Smoluchowski’s Model. 3.

Smoluchowski (1914) used the process $\{X_n\}$ as a model for the fluctuation in the number of particles contained in a geometrically well-defined small element of volume, $v$, in a much larger volume of solution containing particles exhibiting random motion, the system being in its equilibrium (= stationary regime). Observations $X_n$, $n \ge 1$, on $v$ are made at points of time at equal intervals, $\tau$, apart.

Probability After-effect (Wahrscheinlichkeitsnachwirkung).

• The number $P$ is the probability that a particle somewhere inside $v$ will have emerged from $v$ during time interval $\tau$. Explicit and numerically calculable expressions for $P$ and $\mu$ are obtained by Smoluchowski (1914), in terms of the various physical constants and colloid theory.
• He also shows how to estimate $P$ and $\mu$ statistically from observation of a trajectory (sample path) of $\{X_n\}$.
• The necessarily ad hoc statistical comparisons in Smoluchowski (1914) and Chandrasekhar (1943) show very good agreement between the model values and estimated values of $P$ and $\mu$.

Time Series Analysis of Smoluchowski’s Model.

Needed: A large sample statistical test to test the null hypothesis that an observed non-negative data sequence comes from
• A branching process with immigration;
• More narrowly: from a Smoluchowski simple branching process with immigration.
Such tests were finally developed by Mills and Seneta (1989), (1991) as analogues of Quenouille’s test in time series, using likewise partial sample autocorrelations. Smoluchowski’s model was found to give a striking simplification of the general case, with sample partial autocorrelations at lag $k \ge 2$ asymptotically independent and Gaussian.

Reinhold Fürth (1918-1919) and passers-by in Prague.

Marian Smoluchowski (1872-1917).

• Born to a Polish family in Vorderbrühl, near Vienna.
• The times: the Austro-Hungarian Monarchy. North-east part: Galicia (German: Galizien), with its university cities Krakau (Polish: Kraków; English: Cracow), the old capital of Poland, and Lemberg (Ukrainian: L’viv; Polish: Lwów; Russian: L’vov).
• Smoluchowski held the Chair of Mathematical Physics at the University in Lemberg from 1903 to 1913. His most important work was done during this time, and is heavily probabilistic and statistical.
• From about 1900 Smoluchowski worked on Brownian motion. He had waited to use experimental data for verification of his theory, and was beaten to publication by Einstein.
• Smoluchowski was one of the strong tradition of the Viennese statistical physicists, and a close friend of F. Hasenöhrl, who edited Boltzmann’s Wissenschaftliche Abhandlungen (1909).
• In 1913 became Professor of Experimental Physics at the Jagiellonian University in Cracow. Died in summer of 1917, of dysentery, at age 45. Rectorship of the Jagiellonian University.

References

[1] M.S. Bartlett, An Introduction to Stochastic Processes. Cambridge University Press, Cambridge, 1955.
[2] S. Chandrasekhar, Stochastic problems in physics and astronomy. Reviews of Modern Physics, 15 (1943), 1–89. (Also in N. Wax (Ed.) Selected Problems on Noise and Stochastic Processes. Dover, New York, 1954, pp. 3–91.)
[3] P. and T. Ehrenfest, Über zwei bekannte Einwände gegen das Boltzmannsche H-Theorem. Physikalische Zeitschrift, 8 (1907), 311–314.
[4] R. Fürth, Statistik und Wahrscheinlichkeitsnachwirkung. Physikalische Zeitschrift, 19 (1918), 421–426; ibid. 20 (1919), 21.
[5] C.C. Heyde and E. Seneta, Estimation theory for growth and immigration rates in a multiplicative process. J. Appl. Prob., 9 (1972), 235–256.
[6] M. Kac, Random Walk and the Theory of Brownian Motion. American Mathematical Monthly, 54 (1947), 369–391. (Also in N. Wax (Ed.) Selected Problems on Noise and Stochastic Processes. Dover, New York, 1954, pp. 295–317.)
[7] K.W.F. Kohlrausch and E. Schrödinger, Das Ehrenfestsche Modell der H-Kurve. Physikalische Zeitschrift, 27 (1926), 306–313.
[8] S. Ku and E. Seneta, Practical estimation from the sum of AR(1) processes. Commun. Statist.-Simula., 27 (1998), 981–998.
[9] T.M. Mills and E. Seneta, Goodness-of-fit for a branching process with immigration using sample partial autocorrelations. Stochastic Processes and their Applications, 33 (1989), 151–161.
[10] T.M. Mills and E. Seneta, Independence of partial autocorrelations for a classical immigration branching process. Stochastic Processes and their Applications, 37 (1991), 275–279.
[11] P.A.P. Moran, Entropy, Markov processes and Boltzmann’s H-Theorem. Proceedings of the Cambridge Philosophical Society, 57 (1961), 833–842.
[12] J.E. Moyal, Stochastic processes and statistical physics. J. Roy. Statist. Soc., Ser. B, 11 (1949), 150–210.
[13] J.E. Moyal, Discontinuous Markov Processes. Acta Mathematica, 98 (1957), 221–264.
[14] E. Seneta, Entropy and Martingales in Markov Chain Models. In J. Gani and E.J. Hannan (Eds.) Essays in Statistical Science. Papers in honour of P.A.P. Moran. J. Appl. Prob., 19A (1982), 367–381.
[15] E. Seneta, Marian Smoluchowski. In C.C. Heyde and E. Seneta (Eds.) Statisticians of the Centuries. Springer, New York, 2001, pp. 299–302.
[16] M. von Smoluchowski, Studien über Molekularstatistik von Emulsionen und deren Zusammenhang mit der Brown’schen Bewegung. Sitzungsberichte. Akademie der Wissenschaften. Mathematisch-Naturwissenschaftliche Klasse.
CXXIII Band X Heft. Dez. 1914. Abteilung IIA. Wien. pp. 2381–2405.
[17] S. Ulam, Marian Smoluchowski and the theory of probabilities in physics. American Journal of Physics, 25 (1957), 475–481.