Central limit theorem
• Central limit theorem. Suppose that X1, . . ., Xn are IID with E(Xi) = µ and Var(Xi) = σ². Let X̄ = (X1 + · · · + Xn)/n. Then the distribution of √n(X̄ − µ)/σ is approximately N(0, 1) in the sense that

  P(√n(X̄ − µ)/σ ≤ x) ≈ P(N(0, 1) ≤ x)

for every x when n is large.
– Recall that from (6) and (7) in the handout “Mean, variance and standard deviation”, we have

  E(X̄) = µ and Var(X̄) = σ²/n,

so

  √n(X̄ − µ)/σ = (X̄ − E(X̄))/(Var(X̄))^(1/2)

is a random variable with mean 0 and variance 1.
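As a quick numeric illustration (not part of the handout), the following sketch draws many samples of size n from a skewed distribution, Exp(1), which has mean 1 and variance 1, standardizes each sample mean as above, and compares the empirical probability P(Z ≤ 1) with P(N(0, 1) ≤ 1); the sample size n = 50 and the number of repetitions are arbitrary choices.

```python
# Simulation check of the CLT: standardized means of IID Exp(1) draws
# should be approximately N(0, 1) for moderately large n.
import random
from math import sqrt, erf

random.seed(0)
n, reps = 50, 20000
mu, sigma = 1.0, 1.0   # mean and standard deviation of Exp(1)

count = 0
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    z = sqrt(n) * (xbar - mu) / sigma   # standardized sample mean
    if z <= 1.0:
        count += 1

empirical = count / reps
normal_cdf_1 = 0.5 * (1 + erf(1.0 / sqrt(2)))   # P(N(0,1) <= 1), about 0.8413
print(empirical, normal_cdf_1)
```

The two printed probabilities should agree to roughly two decimal places, even though the individual observations are far from normal.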
• Example in the text (Page 280 in the 14th Ed or Page 288 in the 15th
Ed). The Quality Assurance Department for Cola, Inc., maintains records
regarding the amount of cola in its Jumbo bottle. Its records indicate that
the amount of cola follows the normal probability distribution. The mean
amount per bottle is 31.2 ounces and the population standard deviation
is 0.4 ounces. At 8 A.M. today the quality technician randomly selected
16 bottles from the filling line. The mean amount of cola contained in the
bottles is 31.38 ounces. Is this an unlikely result? Is it likely the process
is putting too much soda in the bottles?
Solution.
– Suppose that (X1 , . . . , X16 ) is a random sample from N (µ, σ 2 ), where
µ = 31.2 and σ = 0.4. Let X̄ be the sample mean and
  Z = √16(X̄ − 31.2)/0.4,

then Z ∼ N(0, 1) and

  P(X̄ > 31.38) = P(√16(X̄ − 31.2)/0.4 > √16(31.38 − 31.2)/0.4)
               = P(Z > 1.8) = 0.5 − P(0 < Z < 1.8).

From the table “Normal probabilities”, P(0 < Z < 1.8) ≈ 0.4641, so

  P(X̄ > 31.38) = 0.5 − P(0 < Z < 1.8) ≈ 0.5 − 0.4641 = 0.0359.
It is unusual that the mean amount of cola contained in the 16
bottles is 31.38 ounces since the probability P (X̄ > 31.38) is small
(less than 0.05). It is likely the process is putting too much soda in
the bottles.
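The computation above can be checked directly, using the values from the example and the standard normal CDF via math.erf in place of the printed table:

```python
# Cola example: P(X-bar > 31.38) when X-bar is the mean of 16 draws
# from N(31.2, 0.4^2).
from math import sqrt, erf

def norm_cdf(x):
    """P(N(0, 1) <= x)."""
    return 0.5 * (1 + erf(x / sqrt(2)))

mu, sigma, n = 31.2, 0.4, 16
z = sqrt(n) * (31.38 - mu) / sigma   # standardized value, = 1.8
p = 1 - norm_cdf(z)                  # P(Z > 1.8)
print(round(z, 2), round(p, 4))      # 1.8 0.0359
```

This reproduces the table-based answer 0.0359 to four decimal places.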
• Normal approximation to Bin(n, p). Suppose that X1, · · ·, Xn are IID Bin(1, p); then E(X1) = p and Var(X1) = p(1 − p). Let

  Z = √n(X̄ − p)/√(p(1 − p)),

then the distribution of Z is approximately N(0, 1) for large n. Expressing nX̄ in terms of Z, we have nX̄ = np + √(np(1 − p)) Z, so

  P(Bin(n, p) ≤ x) = P(np + √(np(1 − p)) Z ≤ x)
                   = P(Z ≤ (x − np)/√(np(1 − p)))
                   ≈ P(N(0, 1) ≤ (x − np)/√(np(1 − p)))
                   = P(N(np, np(1 − p)) ≤ x).

The formula

  P(Bin(n, p) ≤ x) ≈ P(N(np, np(1 − p)) ≤ x)   (1)

provides one way to approximate P(Bin(n, p) ≤ x). However, when x = k is an integer, the formula

  P(Bin(n, p) ≤ k) ≈ P(N(np, np(1 − p)) < k + 0.5)   (2)

provides a better approximation. See the experiment result at the end of this handout that supports the use of (2).
• Apply (2) with k replaced by k − 1 and we have

  P(Bin(n, p) < k) ≈ P(N(np, np(1 − p)) < k − 0.5).   (3)

From (2) and (3), we have

  P(Bin(n, p) ≥ k) ≈ P(N(np, np(1 − p)) > k − 0.5)

and

  P(Bin(n, p) > k) ≈ P(N(np, np(1 − p)) > k + 0.5).
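A small sketch (not from the handout) makes the gain from the continuity correction concrete: it computes the exact binomial CDF from math.comb and compares the plain approximation (1) with the corrected version (2); the evaluation point k = 22 is an arbitrary choice.

```python
# Compare approximations (1) and (2) with the exact Bin(40, 0.5) CDF.
from math import comb, sqrt, erf

def norm_cdf(x, mean=0.0, sd=1.0):
    """P(N(mean, sd^2) <= x)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

def binom_cdf(k, n, p):
    """Exact P(Bin(n, p) <= k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p, k = 40, 0.5, 22
mean, sd = n * p, sqrt(n * p * (1 - p))
exact = binom_cdf(k, n, p)
plain = norm_cdf(k, mean, sd)            # formula (1), no correction
corrected = norm_cdf(k + 0.5, mean, sd)  # formula (2), with correction
print(exact, plain, corrected)
```

The corrected value lands much closer to the exact binomial probability than the plain one does.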
Example 1. Suppose that X ∼ Bin(80, 0.7). Approximate P (X < 60)
and P (X ≥ 60) using normal probabilities.
Sol. E(X) = 80 × 0.7 = 56 and V ar(X) = 80 × 0.7 × (1 − 0.7) = 16.8.
  P(X < 60) ≈ P(N(56, 16.8) < 60 − 0.5)
            = P(N(0, 1) < (59.5 − 56)/√16.8)
            ≈ P(N(0, 1) < 0.85) ≈ 0.5 + 0.3023 = 0.8023.

  P(X ≥ 60) = 1 − P(X < 60) ≈ 1 − 0.8023 = 0.1977.
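Example 1 can be double-checked numerically: the sketch below evaluates the continuity-corrected approximation for P(X < 60) = P(X ≤ 59) and the exact binomial sum.

```python
# Example 1 check: X ~ Bin(80, 0.7), approximate P(X < 60).
from math import comb, sqrt, erf

def norm_cdf(x, mean=0.0, sd=1.0):
    """P(N(mean, sd^2) <= x)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

n, p = 80, 0.7
mean, sd = n * p, sqrt(n * p * (1 - p))   # 56 and sqrt(16.8)
approx = norm_cdf(59.5, mean, sd)         # continuity-corrected, as in (2)
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(60))
print(round(approx, 4), round(exact, 4))
```

The approximation differs slightly from the table answer 0.8023 because the table rounds the standardized value to z = 0.85, while the code keeps full precision.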
• An experiment. We plot P(N(np, np(1 − p)) ≤ x) and P(Bin(n, p) ≤ x) for 1000 x’s in [−1, 40] with (n, p) = (40, 0.5). The R script and the output are given below. The values of P(Bin(n, p) ≤ x) are plotted in red. The plot shows that the normal probabilities approximate the binomial probabilities well.
x <- seq(-1, 40, length=1000)
n <- 40; p <- 0.5
# Normal CDF with mean np and standard deviation sqrt(np(1-p))
plot(x, pnorm(x, mean=n*p, sd=sqrt(n*p*(1-p))), type="l", ylab="")
# Exact binomial CDF, drawn in red on the same axes
lines(x, pbinom(x, size=n, prob=p), col="red")
title("P(N(np, np(1-p)) <= x) and P(Bin(n, p) <= x)")