September 30

ST 380
Probability and Statistics for the Physical Sciences
The Distribution of the Sample Mean
Suppose that X1 , X2 , . . . , Xn are a simple random sample from some
distribution with expected value µ and standard deviation σ.
The most important parameter of most distributions is the expected
value µ, and it is often estimated by the sample mean X̄ .
So the sampling distribution of the sample mean X̄ plays a central
role in estimating µ.
Some aspects of that sampling distribution are known exactly, and for
some others we have useful approximations for large n.
1 / 13
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
Mean and Standard Deviation
For any n ≥ 1, the sampling distribution of X̄ has the properties:
E (X̄ ) = µX̄ = µ;
σ2
V (X̄ ) = σX̄2 = ;
n
and hence the standard deviation of X̄ is
σ
σX̄ = √
n
2 / 13
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
Normal Populations
If X1 , X2 , . . . , Xn are a sample from a normal distribution, then for
any n ≥ 1, X̄ is also normally distributed.
We√already know its expected value is µ and its standard deviation is
σ/ n, so
σ2
X̄ ∼ N µ,
.
n
In particular, for any z,
P
3 / 13
X̄ − µ
√ ≤z
σ/ n
= Φ(z).
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
Other Populations
If X1 , X2 , . . . , Xn are a simple random sample from any distribution
with expected value µ and standard deviation σ, then for large n, X̄
is approximately normally distributed.
Again,
√ we know its expected value is µ and its standard deviation is
σ/ n, so
σ2
.
X̄ ≈ N µ,
n
The Central Limit Theorem states that the approximation holds in
the limit:
X̄ − µ
√ ≤ z → Φ(z) as n → ∞.
P
σ/ n
4 / 13
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
Binomial Distribution
We can use the Central Limit Theorem to approximate the binomial
distribution.
Suppose that X1 , X2 , . . . , Xn are the success indicators in a binomial
experiment. That is, each is a Bernoulli variable with P(Xi = 1) = p
for some 0 < p < 1.
Then
E (Xi ) = p
and
V (Xi ) = p(1 − p).
5 / 13
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
The Central Limit Theorem implies that for large n
!
X̄ − p
P p
≤ z ≈ Φ(z).
p(1 − p)/n
So, if X = X1 + X2 + · · · + Xn = nX̄ , then
!
X − np
P p
≤ z ≈ Φ(z).
np(1 − p)
The approximation is improved by replacing X − np by X − np + 1/2
(a continuity correction).
6 / 13
Joint Probability Distributions
Distribution of the Sample Mean
ST 380
Probability and Statistics for the Physical Sciences
Distribution of a Linear Combination
The sample mean X̄ and the sample total nX̄ are both examples of a
linear combination of X1 , X2 , . . . , Xn .
A general linear combination is of the form
Y = a1 X1 + a2 X2 + · · · + an Xn
for some constants a1 , a2 , . . . , an .
For example, X̄ is the special case ai = 1/n, i = 1, 2, . . . , n.
7 / 13
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
Mean and Variance
Suppose that
E (Xi ) = µi and V (Xi ) = σi2 , i = 1, 2, . . . , n.
Then
E (Y ) = a1 E (X1 ) + a2 E (X2 ) + · · · + an E (Xn )
= a 1 µ1 + a 2 µ2 + · · · + a n µn
and, if X1 , X2 , . . . , Xn are uncorrelated,
V (Y ) = a12 V (X1 ) + a22 V (X2 ) + · · · + an2 V (Xn )
= a12 σ12 + a22 σ22 + · · · + an2 σn2 .
8 / 13
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
If X1 , X2 , . . . , Xn are correlated, the variance becomes
V (Y ) =
n X
n
X
ai aj Cov(Xi , Xj )
i=1 j=1
=
n
X
ai2 V (Xi )
i=1
=
n
X
i=1
+2
n−1 X
n
X
ai aj Cov(Xi , Xj )
i=1 j=i+1
ai2 σi2
+2
n−1 X
n
X
ai aj σi σj ρi,j .
i=1 j=i+1
In the last expression, we use the definition
ρi,j = Corr(Xi , Xj ) =
9 / 13
Cov(Xi , Xj )
.
σi σj
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
The proofs of these results are straightforward if tedious, and depend
on nothing more than the fact that if g (X1 , X2 , . . . , Xn ) and
h(X1 , X2 , . . . , Xn ) are any two functions of X1 , X2 , . . . , Xn , then
E [g (X1 , X2 , . . . , Xn ) + h(X1 , X2 , . . . , Xn )]
= E [g (X1 , X2 , . . . , Xn )] + E [h(X1 , X2 , . . . , Xn )].
The earlier statements about X̄ , that
E (X̄ ) = µ and V (X̄ ) =
σ2
,
n
are just the special case for uncorrelated X1 , X2 , . . . , Xn and
ai = 1/n, µi = µ, σi = σ, i = 1, 2, . . . , n.
10 / 13
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
Difference Between Two Variables
We often need to compare two measurements.
For example,
X1 = blood pressure before taking a medication
X2 = blood pressure 1 hour after taking medication
Y = X2 − X1 = change in blood pressure.
This is the special case n = 2, a1 = −1, a2 = 1.
11 / 13
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
So
E (Y ) = µ2 − µ1
and
V (Y ) = σ12 + σ22 − 2ρ1,2 σ1 σ2
and, if ρ1,2 = 0,
V (Y ) = σ12 + σ22 .
Note that when X1 and X2 are uncorrelated, the variances add,
because a12 = a22 = 1.
12 / 13
Joint Probability Distributions
Distribution of a Linear Combination
ST 380
Probability and Statistics for the Physical Sciences
Normal Variables
If X1 , X2 , . . . , Xn are independent and normally distributed, then any
linear combination
Y = a1 X1 + a2 X2 + · · · + an Xn
is also normally distributed.
This general result includes as a special case the fact that X̄ is
normally distributed when X1 , X2 , . . . , Xn are independent and
normally distributed.
A more general Central Limit Theorem states that Y is
approximately normally distributed when n is large, provided no ai Xi
contributes substantially to the sum.
13 / 13
Joint Probability Distributions
Distribution of a Linear Combination