Stirling's Formula Derived from the Poisson Distribution

Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466
Topics in
Probability Theory and Stochastic Processes
Rating
Mathematicians Only: prolonged scenes of intense rigor.
Section Starter Question
What is the Poisson distribution? What kind of circumstances does a Poisson
distribution describe?
Key Concepts
1. Stirling’s Formula, also called Stirling’s Approximation, is the asymptotic relation
\[ n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}. \]
2. The formula is useful in estimating large factorial values, but its main
mathematical value is in limits involving factorials.
3. The Poisson distribution with parameter $\lambda$ is the discrete probability distribution defined on the non-negative integers 0, 1, 2, . . . with
probability mass
\[ P[X = k] = p(k; \lambda) = \frac{\lambda^k}{k!} e^{-\lambda}. \]
Vocabulary
1. Stirling’s Formula, also called Stirling’s Approximation, is the asymptotic relation
\[ n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}. \]
2. The Poisson distribution with parameter $\lambda$ is the discrete probability distribution defined on the non-negative integers 0, 1, 2, . . . with
probability mass
\[ P[X = k] = p(k; \lambda) = \frac{\lambda^k}{k!} e^{-\lambda}. \]
Mathematical Ideas
Stirling’s Formula
Stirling’s Formula, also called Stirling’s Approximation, is the asymptotic
relation
\[ n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}. \]
The formula is useful in estimating large factorial values, but its main mathematical value is in limits involving factorials. Another attractive form of
Stirling’s Formula is
\[ n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^{n}. \]
An improved inequality version of Stirling’s Formula is
\[ \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n+1)} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n)}. \]
See Stirling’s Formula in MathWorld.com.
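As a quick numerical illustration of the asymptotic relation and the inequality version, the following minimal sketch compares $n!$ with the approximation and the two bounds. The function names `stirling` and `stirling_bounds` and the sample values of $n$ are chosen here for illustration only; they are not part of the original notes.

```python
import math

def stirling(n):
    """Stirling's approximation sqrt(2*pi) * n**(n + 1/2) * exp(-n)."""
    return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)

def stirling_bounds(n):
    """Lower and upper bounds from the inequality version of the formula."""
    base = stirling(n)
    lower = base * math.exp(1 / (12 * n + 1))
    upper = base * math.exp(1 / (12 * n))
    return lower, upper

# Illustrative values of n; floating point overflows for n much above 140.
for n in (1, 2, 5, 10, 20, 50):
    exact = math.factorial(n)
    lower, upper = stirling_bounds(n)
    print(f"n={n:3d}  n!/stirling={exact / stirling(n):.6f}  "
          f"bounds hold: {lower < exact < upper}")
```

The ratio column should drift toward 1 as $n$ grows, which is what the asymptotic relation asserts; the bounds column checks the two-sided inequality directly.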
Intuitive Probabilistic Proof
This intuitive argument is adapted from [6, page 171-172] and also from the
short article by Hu, [4].
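Let $X_1, X_2, \ldots$ be independent Poisson random variables each having
mean 1. Let $S_n = \sum_{j=1}^{n} X_j$ and then note that the mean and the variance
of $S_n$ are equal to $n$.
\[
\begin{aligned}
P[S_n = n] &= P[n - 1 < S_n \le n] \\
&= P\left[ -1/\sqrt{n} < (S_n - n)/\sqrt{n} \le 0 \right] \\
&\approx \int_{-1/\sqrt{n}}^{0} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx \\
&\approx \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{n}}.
\end{aligned}
\]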
On the other hand $S_n$ is Poisson with mean $n$, and so
\[ P[S_n = n] = \frac{e^{-n} n^n}{n!}. \]
Equating the two expressions for $P[S_n = n]$ and rearranging, we obtain
\[ n! \approx \sqrt{2\pi}\, n^{n+1/2} e^{-n}. \]
This “proof” relies on having a version of the Central Limit Theorem
that has not been proved using Stirling’s Formula! Such a version of the
Central Limit Theorem itself is pretty advanced. It also relies on several other
advanced facts, such as the distribution of Poisson random variables, the fact
that the sum of independent Poisson random variables is again Poisson and
the fact that the variance of the sum of independent random variables is the
sum of the variances. On the other hand, it is short and simple!
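The heuristic can be checked numerically by comparing the exact Poisson probability $P[S_n = n] = e^{-n} n^n / n!$ with the normal-approximation value $1/\sqrt{2\pi n}$. This is only a sketch: the helper names below are hypothetical, and the probability is evaluated in log space with `math.lgamma` to avoid overflow, a choice made here and not taken from the notes.

```python
import math

def poisson_pmf_at_mean(n):
    """P[S_n = n] = exp(-n) * n**n / n!, evaluated in log space."""
    return math.exp(-n + n * math.log(n) - math.lgamma(n + 1))

def clt_approximation(n):
    """The heuristic value 1 / sqrt(2*pi*n) from the normal approximation."""
    return 1.0 / math.sqrt(2 * math.pi * n)

# Illustrative values of n.
for n in (1, 10, 100, 1000, 10000):
    exact = poisson_pmf_at_mean(n)
    approx = clt_approximation(n)
    print(f"n={n:6d}  exact={exact:.6e}  approx={approx:.6e}  "
          f"ratio={exact / approx:.6f}")
```

The ratio of the two columns approaching 1 is exactly the statement of Stirling’s Formula, so the table illustrates the claim without proving it.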
Rigorous Derivation of Stirling’s Formula
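The characteristic function with variable $\theta \in \mathbb{R}$ of the Poisson distribution
$p(k; \lambda) = \frac{\lambda^k}{k!} e^{-\lambda}$ with parameter $\lambda$ is
\[ \hat{p}(\theta; \lambda) = \sum_{k=0}^{\infty} p(k; \lambda) e^{ik\theta} = e^{\lambda(e^{i\theta} - 1)}. \]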
For further properties see Breiman, [1, page 170 ff.], especially Definition
8.26 and Proposition 8.27, or Chung, [2, page 142 ff.], especially item 6 on
page 147. The characteristic function is a 2π-periodic function with a series
definition which converges uniformly on R, meaning we can integrate the
series term-by-term on [−π, π] to obtain
\[ p(k; \lambda) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \hat{p}(\theta; \lambda) e^{-ik\theta} \, d\theta \tag{1} \]
which is valid for λ > 0 and k = 0, 1, 2, 3, . . .. This is the Fourier inversion
formula for the characteristic function. See also Breiman, [1], Theorem 8.39,
page 178; Chung, [2], Theorem 6.2.3; or Feller, [3], page 509.
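The inversion formula (1) can be checked numerically. The sketch below approximates the integral with the trapezoid rule on a uniform grid, which is very accurate here because the integrand is smooth and $2\pi$-periodic. The function names, the grid size, and the parameter value $\lambda = 3.5$ are assumptions made for illustration.

```python
import cmath
import math

def poisson_pmf(k, lam):
    """p(k; lam) = lam**k * exp(-lam) / k!, evaluated in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def char_fn(theta, lam):
    """Poisson characteristic function exp(lam * (exp(i*theta) - 1))."""
    return cmath.exp(lam * (cmath.exp(1j * theta) - 1))

def inversion_integral(k, lam, num_points=512):
    """Trapezoid-rule value of (1/(2*pi)) * integral over [-pi, pi] of
    char_fn(theta) * exp(-i*k*theta); on a periodic grid this is a plain average."""
    total = 0j
    for j in range(num_points):
        theta = -math.pi + 2 * math.pi * j / num_points
        total += char_fn(theta, lam) * cmath.exp(-1j * k * theta)
    return total / num_points

lam = 3.5  # hypothetical parameter value chosen for illustration
for k in range(6):
    recovered = inversion_integral(k, lam)
    print(f"k={k}  pmf={poisson_pmf(k, lam):.12f}  "
          f"inversion={recovered.real:.12f}  imag part={recovered.imag:.1e}")
```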
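In equation (1) set $\lambda = k$, and define $I_k$ by
\[ I_k = \frac{k^k e^{-k}}{k!} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta, \qquad k = 0, 1, 2, 3, \ldots \]
Compare this representation to the Gaussian integral with variance $1/k$ defined
as
\[ J_k = \frac{1}{\sqrt{2\pi k}} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-k\theta^2/2} \, d\theta. \]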
The general plan is to show that $I_k \approx J_k$, so that then we can assert that
\[ \frac{k^k e^{-k}}{k!} \approx \frac{1}{\sqrt{2\pi k}} \]
which can be rearranged to Stirling’s Formula. The detailed plan is to show
that this approximation can be expressed as an asymptotic limit.
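Before the detailed estimates, the plan $I_k \approx J_k$ can be previewed numerically. This sketch evaluates the integral defining $I_k$ with the trapezoid rule (valid because the integrand is $2\pi$-periodic for integer $k$) and compares it with both $k^k e^{-k}/k!$ and $J_k = 1/\sqrt{2\pi k}$. The names `I_k`, `J_k`, `poisson_at_mean`, and the grid size are illustrative assumptions.

```python
import cmath
import math

def I_k(k, num_points=4096):
    """Trapezoid-rule value of (1/(2*pi)) * integral over [-pi, pi] of
    exp(k * (exp(i*theta) - 1 - i*theta))."""
    total = 0j
    for j in range(num_points):
        theta = -math.pi + 2 * math.pi * j / num_points
        total += cmath.exp(k * (cmath.exp(1j * theta) - 1 - 1j * theta))
    return (total / num_points).real

def J_k(k):
    """Gaussian comparison value 1 / sqrt(2*pi*k)."""
    return 1.0 / math.sqrt(2 * math.pi * k)

def poisson_at_mean(k):
    """k**k * exp(-k) / k!, evaluated in log space."""
    return math.exp(-k + k * math.log(k) - math.lgamma(k + 1))

# Illustrative values of k.
for k in (1, 5, 25, 125):
    print(f"k={k:4d}  I_k={I_k(k):.8f}  k^k e^-k/k!={poisson_at_mean(k):.8f}  "
          f"J_k={J_k(k):.8f}  I_k/J_k={I_k(k) / J_k(k):.6f}")
```

The first two columns should agree to rounding error, confirming the integral representation, while the last column shows the ratio $I_k / J_k$ approaching 1.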
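Break $I_k$ into pieces to make the following definitions
\[ I_k = \frac{1}{2\pi} \int_{|\theta| \le 1} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta + \frac{1}{2\pi} \int_{1 < |\theta| \le \pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta = I_k^{(1)} + I_k^{(2)} \]
and
\[ J_k = \frac{1}{2\pi} \int_{|\theta| \le 1} e^{-k\theta^2/2} \, d\theta + \frac{1}{2\pi} \int_{1 < |\theta|} e^{-k\theta^2/2} \, d\theta = J_k^{(1)} + J_k^{(2)}. \]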
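Lemma 1. The complex exponential has the following properties and estimates:
1. For $A \in \mathbb{C}$,
\[ |e^{A}| = e^{\Re A}. \tag{2} \]
2. For $\theta \in \mathbb{R}$,
\[ |e^{i\theta} - 1 - i\theta| \le \theta^2/2. \tag{3} \]
3. For $\theta \in \mathbb{R}$,
\[ |e^{i\theta} - 1 - i\theta + \theta^2/2| \le |\theta|^3/3!. \tag{4} \]
4. For $A, B \in \mathbb{C}$,
\[ |e^{A} - e^{B}| \le |A - B| \, e^{\max[\Re A, \Re B]}. \tag{5} \]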
Proof. Left as exercises.
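A numerical spot check is no substitute for the proofs, but it can catch a misremembered constant. The sketch below tests (2) through (5) on random inputs. The sampling ranges, the tolerance `tol` that absorbs floating-point rounding, and the function name `check_lemma` are all assumptions made here for illustration.

```python
import cmath
import math
import random

def check_lemma(num_samples=10000, seed=0, tol=1e-12):
    """Return True if inequalities (2)-(5) hold on random samples, up to rounding."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        theta = rng.uniform(-10, 10)
        a = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
        b = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
        e = cmath.exp(1j * theta)
        # (2): |exp(A)| equals exp(Re A), checked with a relative tolerance.
        if abs(abs(cmath.exp(a)) - math.exp(a.real)) > tol * math.exp(a.real):
            return False
        # (3): first-order Taylor remainder bound.
        if abs(e - 1 - 1j * theta) > theta ** 2 / 2 + tol:
            return False
        # (4): second-order Taylor remainder bound.
        if abs(e - 1 - 1j * theta + theta ** 2 / 2) > abs(theta) ** 3 / 6 + tol:
            return False
        # (5): Lipschitz-type bound for the complex exponential.
        bound = abs(a - b) * math.exp(max(a.real, b.real))
        if abs(cmath.exp(a) - cmath.exp(b)) > bound + tol:
            return False
    return True

print("Lemma 1 inequalities hold on all samples:", check_lemma())
```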
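Use the triangle inequality for integrals and equation (2) to bound $I_k^{(2)}$
and $J_k^{(2)}$. For $I_k^{(2)}$,
\[
\begin{aligned}
|I_k^{(2)}| &\le \frac{1}{2\pi} \int_{1 < |\theta| \le \pi} |e^{k(e^{i\theta} - 1 - i\theta)}| \, d\theta \\
&\le \frac{1}{2\pi} \int_{1 < |\theta| \le \pi} e^{k(\cos(\theta) - 1)} \, d\theta \\
&\le e^{k(\cos(1) - 1)} \frac{1}{2\pi} \int_{1 < |\theta| \le \pi} d\theta \\
&\le e^{k(\cos(1) - 1)}.
\end{aligned}
\]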
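Use a technique similar to the proof of Markov’s inequality to estimate $J_k^{(2)}$
as
\[ J_k^{(2)} = \frac{1}{2\pi} \int_{|\theta| > 1} e^{-k\theta^2/2} \, d\theta \le \frac{1}{2\pi} \int_{|\theta| > 1} |\theta| e^{-k\theta^2/2} \, d\theta = \frac{1}{2\pi} \frac{2}{k} e^{-k/2}. \]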
Both $I_k^{(2)}$ and $J_k^{(2)}$ tend to zero at an exponential rate.
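The two tail bounds can be compared with numerical values of the tails themselves. In this sketch $I_k^{(2)}$ is approximated by a midpoint rule, using the conjugate symmetry of the integrand to integrate over $(1, \pi]$ only, and $J_k^{(2)}$ is written in closed form as $\operatorname{erfc}(\sqrt{k/2})/\sqrt{2\pi k}$, which follows from the substitution $t = \theta\sqrt{k/2}$. The function names and the grid size are assumptions made for illustration.

```python
import cmath
import math

def tail_I(k, num_points=20000):
    """Midpoint-rule value of I_k^(2); by conjugate symmetry it equals
    (1/pi) * Re(integral over (1, pi] of exp(k*(exp(i*theta) - 1 - i*theta)))."""
    h = (math.pi - 1.0) / num_points
    total = 0j
    for j in range(num_points):
        theta = 1.0 + (j + 0.5) * h
        total += cmath.exp(k * (cmath.exp(1j * theta) - 1 - 1j * theta))
    return (total * h).real / math.pi

def tail_J(k):
    """J_k^(2) in closed form via the complementary error function."""
    return math.erfc(math.sqrt(k / 2)) / math.sqrt(2 * math.pi * k)

def bound_I(k):
    """The bound exp(k*(cos(1) - 1)) on |I_k^(2)|."""
    return math.exp(k * (math.cos(1.0) - 1))

def bound_J(k):
    """The bound (1/(2*pi)) * (2/k) * exp(-k/2) on J_k^(2)."""
    return math.exp(-k / 2) / (math.pi * k)

# Illustrative values of k.
for k in (1, 5, 20, 50):
    print(f"k={k:3d}  |I2|={abs(tail_I(k)):.3e} <= {bound_I(k):.3e}   "
          f"J2={tail_J(k):.3e} <= {bound_J(k):.3e}")
```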
Now the effort is to estimate the closeness of $I_k^{(1)}$ and $J_k^{(1)}$. Write
\[ I_k^{(1)} - J_k^{(1)} = \frac{1}{2\pi} \int_{-1}^{1} \left( e^{k(e^{i\theta} - 1 - i\theta)} - e^{-k\theta^2/2} \right) d\theta. \]
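For $|\theta| \le 1$, use inequality (4) to derive
\[
\begin{aligned}
|\cos(\theta) - 1 + \theta^2/2| &\le |\cos(\theta) + i \sin(\theta) - 1 - i\theta + \theta^2/2| \\
&\le |e^{i\theta} - 1 - i\theta + \theta^2/2| \\
&\le \frac{|\theta^3|}{6} \le \frac{\theta^2}{6}.
\end{aligned}
\tag{6}
\]
Therefore, $\cos(\theta) - 1 \le -\theta^2/2 + \theta^2/6 = -\theta^2/3$.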
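Now put these together using inequalities (4), (5) and (6). Apply (5) with $A = k(e^{i\theta} - 1 - i\theta)$ and $B = -k\theta^2/2$: by (4), $|A - B| \le k|\theta|^3/3!$, and by (6) both real parts are at most $-k\theta^2/3$ when $|\theta| \le 1$. Hence
\[ |I_k^{(1)} - J_k^{(1)}| \le \frac{1}{2\pi} \int_{-1}^{1} \left| e^{k(e^{i\theta} - 1 - i\theta)} - e^{-k\theta^2/2} \right| d\theta \le k \int_{-1}^{1} \frac{|\theta|^3}{3!} e^{-k\theta^2/3} \, d\theta. \]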
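Change variables with $\phi = \sqrt{k}\,\theta$ to obtain (the derivation is left as an exercise)
\[
\begin{aligned}
k \int_{-1}^{1} \frac{|\theta|^3}{3!} e^{-k\theta^2/3} \, d\theta &= \frac{1}{6k} \int_{-\sqrt{k}}^{\sqrt{k}} |\phi^3| e^{-\phi^2/3} \, d\phi \\
&= \frac{3}{2k} - \frac{e^{-k/3}}{2} - \frac{3 e^{-k/3}}{2k}.
\end{aligned}
\]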
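The closed form of this integral can be sanity-checked by numerical quadrature before working the exercise. The sketch below uses a midpoint rule on $[0, \sqrt{k}]$ and doubles the result, since the integrand is even. The function names `lhs` and `closed_form` and the grid size are hypothetical choices made for illustration.

```python
import math

def lhs(k, num_points=100000):
    """(1/(6k)) * integral over [-sqrt(k), sqrt(k)] of |phi|**3 * exp(-phi**2/3),
    computed by the midpoint rule on [0, sqrt(k)] and doubled by symmetry."""
    h = math.sqrt(k) / num_points
    total = 0.0
    for j in range(num_points):
        phi = (j + 0.5) * h
        total += phi ** 3 * math.exp(-phi * phi / 3)
    return 2 * total * h / (6 * k)

def closed_form(k):
    """3/(2k) - exp(-k/3)/2 - 3*exp(-k/3)/(2k)."""
    return 3 / (2 * k) - math.exp(-k / 3) / 2 - 3 * math.exp(-k / 3) / (2 * k)

# Illustrative values of k.
for k in (1, 4, 25, 100):
    print(f"k={k:4d}  quadrature={lhs(k):.10f}  closed form={closed_form(k):.10f}")
```

For large $k$ the exponential terms are negligible and the value behaves like $3/(2k)$, which is the rate that drives the rest of the argument.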
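Then putting all these together, $|I_k - J_k| \le |I_k^{(1)} - J_k^{(1)}| + |I_k^{(2)}| + |J_k^{(2)}|$ is bounded by a constant multiple of $1/k$, so $\sqrt{k}\,|I_k - J_k| \to 0$. Recalling the definitions of $I_k$
and $J_k$, this is the same as
\[ \sqrt{k} \left| \frac{k^k e^{-k}}{k!} - \frac{1}{\sqrt{2\pi k}} \right| \to 0 \]
as $k \to \infty$. This is equivalent to Stirling’s Formula
\[ \lim_{k \to \infty} \frac{\sqrt{2\pi k}\, k^k e^{-k}}{k!} = 1. \]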
An alternate proof using the Lebesgue Dominated Convergence
Theorem
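Start with the definition of $I_k$
\[ I_k = \frac{k^k e^{-k}}{k!} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta. \]
Make the change of variables $y = \theta\sqrt{k}$ with $dy = d\theta\,\sqrt{k}$ to obtain
\[ I_k \sqrt{k} = \frac{\sqrt{k}\, k^k e^{-k}}{k!} = \frac{1}{2\pi} \int_{-\pi\sqrt{k}}^{\pi\sqrt{k}} e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})} \, dy. \]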
Consider the integrand $e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})}$. The exponent converges pointwise
\[ k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k}) \to -y^2/2 \]
as $k \to \infty$ by using equation (4), so the integrand converges pointwise to
\[ e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})} \to e^{-y^2/2}. \]
The integrand is bounded pointwise by
\[ |e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})}| = e^{k(\cos(y/\sqrt{k}) - 1)} \]
using equation (2). Using the half-angle identity, this can be written as
\[ e^{k(\cos(y/\sqrt{k}) - 1)} = e^{-2k \sin^2(y/(2\sqrt{k}))}. \]
Finally,
\[ e^{-2k \sin^2(y/(2\sqrt{k}))} \le e^{-2y^2/\pi^2} \]
if and only if $-2k \sin^2(y/(2\sqrt{k})) \le -2y^2/\pi^2$ or
\[ \frac{\sin^2(y/(2\sqrt{k}))}{(y/\sqrt{k})^2} \ge \frac{1}{\pi^2} \]
on the domain of integration $[-\pi\sqrt{k}, \pi\sqrt{k}]$. But the function $\sin^2(y/(2\sqrt{k}))/(y/\sqrt{k})^2$ has
a limit of $1/4$ as $y \to 0$, is symmetric around $y = 0$, and is decreasing on
$(0, \pi\sqrt{k})$. The minimum value of the function occurs at $\pi\sqrt{k}$ and is $1/\pi^2$.
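The domination can be checked on a grid by comparing exponents: $k(\cos(y/\sqrt{k}) - 1)$ against $-2y^2/\pi^2$ for $|y| \le \pi\sqrt{k}$. This is a minimal sketch; the function names, the grid size, and the small tolerance that absorbs rounding at the endpoints (where the two sides are equal) are assumptions made here.

```python
import math

def exponent(y, k):
    """Real part of the exponent of the integrand: k * (cos(y/sqrt(k)) - 1)."""
    return k * (math.cos(y / math.sqrt(k)) - 1)

def dominating_exponent(y):
    """Exponent of the dominating function exp(-2*y**2/pi**2)."""
    return -2 * y * y / math.pi ** 2

def bound_holds(k, num_points=2001, tol=1e-9):
    """Check the domination on a uniform grid over [-pi*sqrt(k), pi*sqrt(k)]."""
    limit = math.pi * math.sqrt(k)
    for j in range(num_points):
        y = -limit + 2 * limit * j / (num_points - 1)
        if exponent(y, k) > dominating_exponent(y) + tol:
            return False
    return True

# Illustrative values of k.
for k in (1, 2, 10, 100, 1000):
    print(f"k={k:5d}  domination holds on the grid: {bound_holds(k)}")
```

The restriction to the domain of integration matters: for $k = 1$ and $y = 2\pi$, which lies outside $[-\pi, \pi]$, the exponent $\cos(2\pi) - 1$ is essentially 0 while the dominating exponent is $-8$, so the inequality fails there.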
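Then using the Lebesgue Dominated Convergence Theorem, [2, page 42] or
[1, page 33, Theorem 2.44],
\[ \sqrt{k} \int_{-\pi}^{\pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta = \int_{-\pi\sqrt{k}}^{\pi\sqrt{k}} e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})} \, dy \to \int_{-\infty}^{\infty} e^{-y^2/2} \, dy = \sqrt{2\pi}. \]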
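Putting this together with $I_k \sqrt{k} = \sqrt{k}\, k^k e^{-k}/k!$ and the factor $1/(2\pi)$ in its integral representation,
\[ \frac{\sqrt{k}\, k^k e^{-k}}{k!} \to \frac{\sqrt{2\pi}}{2\pi} = \frac{1}{\sqrt{2\pi}} \]
as $k \to \infty$. This is equivalent to Stirling’s Formula
\[ \lim_{k \to \infty} \frac{\sqrt{2\pi k}\, k^k e^{-k}}{k!} = 1. \]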
Discussion
These proofs establish Stirling’s asymptotic formula fairly easily,
but are not enough to show the inequality
\[ \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n+1)} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n)}. \]
Establishing the inequality requires bounds on the rate of approach
of
\[ I_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta \]
to
\[ J_k = \frac{1}{\sqrt{2\pi k}} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-k\theta^2/2} \, d\theta. \]
Such an estimate requires bounds on the rate of approach of
\[ e^{k(e^{iy/\sqrt{k}} - 1 - iy/\sqrt{k})} \to e^{-y^2/2}, \]
which is possible with careful estimation.
Sources
The heuristic proof using the Central Limit Theorem is adapted from Ross
[6, pages 171-172], which in turn is based on Hu [4]. The rigorous proof is
adapted from the short article by Pinsky [5].
Problems to Work for Understanding
1. Show that for $A \in \mathbb{C}$, $|e^{A}| = e^{\Re A}$.
2. Show that for $\theta \in \mathbb{R}$, $|e^{i\theta} - 1 - i\theta| \le \theta^2/2$.
3. Show that for $\theta \in \mathbb{R}$, $|e^{i\theta} - 1 - i\theta + \theta^2/2| \le |\theta^3/3!|$.
4. Use standard theorems from calculus (either the Fundamental Theorem
of Calculus or the Mean Value Theorem) applied to the function $f(t) = e^{tA + (1-t)B}$
to show that for $A, B \in \mathbb{C}$, $|e^{A} - e^{B}| \le |A - B| \, e^{\max[\Re A, \Re B]}$.
5. Show that
\[ \frac{1}{6k} \int_{-\sqrt{k}}^{\sqrt{k}} |\phi^3| e^{-\phi^2/3} \, d\phi = \frac{3}{2k} - \frac{e^{-k/3}}{2} - \frac{3 e^{-k/3}}{2k}. \]
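6. Derive inequalities to estimate the size of the difference
\[ I_k - J_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{k(e^{i\theta} - 1 - i\theta)} \, d\theta - \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-k\theta^2/2} \, d\theta, \]
where the second integral term equals $J_k = 1/\sqrt{2\pi k}$.
Use these inequalities to derive inequalities for $k!$ refining Stirling’s
asymptotic limit formula to an inequality.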
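7. In the intuitive proof, the key approximation is
\[
\begin{aligned}
P[S_n = n] &= P[n - 1 < S_n \le n] \\
&= P\left[ -1/\sqrt{n} < Z_n \le 0 \right] \\
&\approx \int_{-1/\sqrt{n}}^{0} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx \\
&\approx \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{n}},
\end{aligned}
\]
where $Z_n = (S_n - n)/\sqrt{n}$.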
In the Poisson probability, the interval (for example) (n − 2/3, n + 2/3]
could replace the interval (n − 1, n]. But then the value of the integral
is proportional to the length 4/3 of the interval, producing the wrong
result. If (for example) (n − 1/2, n + 1/2] replaces the interval (n −
1, n], the result is correct. How does the intuitive proof change when
calculating the respective probabilities with an interval of length not
equal to 1?
Reading Suggestion:
References
[1] Leo Breiman. Probability. Addison Wesley, 1968.
[2] Kai Lai Chung. A Course in Probability Theory. Academic Press, 1974.
[3] William Feller. An Introduction to Probability Theory and Its Applications,
volume II. John Wiley and Sons, second edition, 1971.
[4] T.-C. Hu. A statistical method of approach to Stirling’s formula. American Statistician, 42:204–205, 1988.
[5] Mark A. Pinsky. Stirling’s formula via the Poisson distribution. American
Mathematical Monthly, 114(3):256–258, March 2007.
[6] Sheldon M. Ross. Introduction to Probability Models. Elsevier, 6th edition,
1997.
Outside Readings and Links:
Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1
Email to Steve Dunbar, sdunbar1 at unl dot edu
Last modified: Processed from LATEX source on June 23, 2017