Central Limit Theorem

Paul Cohen ISTA 370
April, 2012
Central Limit Theorem
Suppose you are drawing samples of size N from a
population with mean µ and standard deviation σ. You
take the mean, x̄, of each sample. As N approaches
infinity, the sampling distribution of x̄ approaches a
Gaussian (or Normal) distribution with mean µ and
standard deviation σ/√N.
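
Written a little more formally, the statement above can be sketched as follows. This assumes the draws are independent and that σ is finite, conditions the slide leaves implicit:

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i ,
\qquad
\frac{\bar{x} - \mu}{\sigma / \sqrt{N}} \;\xrightarrow{\,d\,}\; \mathcal{N}(0, 1)
\quad \text{as } N \to \infty ,
\qquad \text{so for large } N:\quad
\bar{x} \;\approx\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^{2}}{N}\right).
```

The standard deviation of the sampling distribution, σ/√N, is the quantity the later slides compute as WeibullSD/sqrt(N).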
Central Limit Theorem on the Homework
“This exercise demonstrates the sampling distribution of the mean
and the Central Limit Theorem. We begin with a highly skewed
population. Many mathematical distributions will generate skewed
populations and the details of these distributions don’t matter to
this exercise. We’ll use the Weibull distribution:”
> rweibull(30,.7,1.2)
 [1] 0.60155001 0.57239234 0.50945882 0.29879757 2.14286034
 [6] 0.02430550 0.08412866 0.41569648 0.67287305 2.21902605
[11] 0.01023894 4.57922140 0.05074523 2.10108635 0.23411717
[16] 1.17381195 5.13185834 0.00066688 0.03702882 0.46193620
[21] 3.20908404 8.89052093 0.68149142 0.68527976 2.58642056
[26] 2.62773495 0.09371633 3.74760238 0.00022452 1.19712924
Central Limit Theorem on the Homework
“The formulae for the mean and standard deviation of the Weibull
are more than we need to go into, here, so the easiest thing is to
draw an enormous sample from the Weibull and use its mean and
standard deviation as the population mean and standard deviation:”
> W<-rweibull(100000,.7,1.2)
> print(WeibullMu<-mean(W))
[1] 1.5258
> print(WeibullSD<-sd(W))
[1] 2.2361
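
As a cross-check on the simulated values, the closed-form mean and standard deviation of a Weibull distribution can be computed directly. This is a minimal sketch, not part of the homework: it assumes the two arguments in rweibull(n,.7,1.2) are shape = 0.7 and scale = 1.2 (R's argument order), and the variable names are made up for illustration.

```r
# Closed-form mean and standard deviation of a Weibull(shape, scale) distribution.
# shape = 0.7 and scale = 1.2 are taken from the rweibull(n, .7, 1.2) calls above.
shape <- 0.7
scale <- 1.2

# mean = scale * Gamma(1 + 1/shape)
weibull_mu_exact <- scale * gamma(1 + 1 / shape)

# sd = scale * sqrt(Gamma(1 + 2/shape) - Gamma(1 + 1/shape)^2)
weibull_sd_exact <- scale * sqrt(gamma(1 + 2 / shape) - gamma(1 + 1 / shape)^2)

# Hypothetical names; compare these with WeibullMu and WeibullSD from the large sample.
print(c(mean = weibull_mu_exact, sd = weibull_sd_exact))
```

The exact values should land close to, but not exactly on, the estimates from the 100000-draw sample, because that sample is itself subject to sampling error.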
Central Limit Theorem on the Homework
“To show you how skewed this population is, here’s a picture of the
density of W:”
[Figure: density plot of W; y-axis: Density; x-axis label: N = 100000 Bandwidth = 0.1142]
Central Limit Theorem on the Homework
“Your task is to generate the sampling distribution of the mean of
samples drawn from this distribution at three sample sizes, 30, 300,
3000; plot the sampling distributions; find their means and standard
deviations; and compare these to the values predicted by the
Central Limit Theorem.”
Almost everyone did this wrong. Typical answers were not
sampling distributions of the mean of the Weibull, but the
Weibull distributions themselves.
This, plus wrong answers in recent classes, makes me think that a
lot of people don’t understand what a sampling distribution is.
Divide into groups and solve the problem correctly.
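
The distinction between a sample from the Weibull and the sampling distribution of the mean can be sketched in a few lines of R. This is an illustrative sketch only: the seed, the 2000 replications, and the variable names are arbitrary choices, not part of the assignment.

```r
set.seed(370)  # arbitrary seed, only so the sketch is reproducible

# One sample of 30 raw values: this is a sample from the Weibull itself.
one_sample <- rweibull(30, shape = 0.7, scale = 1.2)

# The sampling distribution of the mean: draw many samples of size 30
# and keep only the mean of each one.
sample_means <- replicate(2000, mean(rweibull(30, shape = 0.7, scale = 1.2)))

length(one_sample)    # 30 raw observations
length(sample_means)  # 2000 means, one per sample

# Spread of the raw values versus spread of the means.
sd(one_sample)
sd(sample_means)
```

Plotting one_sample (or any set of raw draws) reproduces the skewed Weibull shape. Plotting sample_means shows the distribution the exercise asks for.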
My Answers
“Generate the sampling distribution of the mean of samples drawn
from this distribution at three sample sizes, 30, 300, 3000; plot the
sampling distributions; find their means and standard deviations;
and compare these to the values predicted by the Central Limit
Theorem.”
You can sample either from the Weibull distribution using
rweibull(N,.7,1.2) or from the very large sample, W.
I’ll solve the problem both ways.
My Answers: N=30
> SamplingDist30<-replicate(10000,mean(rweibull(30,.7,1.2)))
> mean(SamplingDist30)
[1] 1.5233
> WeibullMu
[1] 1.5258
> WeibullSD/sqrt(30)
[1] 0.40826
> sd(SamplingDist30)
[1] 0.41286
> hist(SamplingDist30)
[Figure: Histogram of SamplingDist30; x-axis: SamplingDist30; y-axis: Frequency]
My Answers: N=300
> SamplingDist300<-replicate(10000,mean(rweibull(300,.7,1.2)))
> mean(SamplingDist300)
[1] 1.5209
> WeibullMu
[1] 1.5258
> WeibullSD/sqrt(300)
[1] 0.1291
> sd(SamplingDist300)
[1] 0.12927
> plot(density(SamplingDist300))
[Figure: density.default(x = SamplingDist300); y-axis: Density; x-axis label: N = 10000 Bandwidth = 0.01808]
My Answers: N=3000
> SamplingDist3000<-replicate(10000,mean(rweibull(3000,.7,1.2)))
> mean(SamplingDist3000)
[1] 1.5189
> WeibullMu
[1] 1.5258
> WeibullSD/sqrt(3000)
[1] 0.040826
> sd(SamplingDist3000)
[1] 0.040228
> plot(density(SamplingDist3000))
[Figure: density.default(x = SamplingDist3000); y-axis: Density; x-axis label: N = 10000 Bandwidth = 0.005816]
My Answers: N=3000, sampling from W
> SamplingDist3000<-replicate(10000,mean(sample(W,size=3000,replace=TRUE)))
> mean(SamplingDist3000)
[1] 1.5254
> WeibullMu
[1] 1.5258
> sd(SamplingDist3000)
[1] 0.040865
> WeibullSD/sqrt(3000)
[1] 0.040826
> plot(density(SamplingDist3000))
[Figure: density.default(x = SamplingDist3000); y-axis: Density; x-axis label: N = 10000 Bandwidth = 0.005841]
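
The three sample sizes can also be compared side by side in one pass. The following is a rough sketch rather than the solution used on the slides: it uses 2000 replications instead of 10000 to keep the run short, an arbitrary seed, and made-up names (mu_hat, sd_hat, results) standing in for WeibullMu and WeibullSD.

```r
set.seed(370)  # arbitrary seed for reproducibility

shape <- 0.7
scale <- 1.2
reps <- 2000   # fewer replications than the slides' 10000, to keep it quick

# Stand-ins for WeibullMu and WeibullSD, estimated from one very large sample.
W <- rweibull(100000, shape, scale)
mu_hat <- mean(W)
sd_hat <- sd(W)

sizes <- c(30, 300, 3000)

# For each sample size, build the sampling distribution of the mean and
# compare its mean and sd with the Central Limit Theorem predictions.
results <- t(sapply(sizes, function(n) {
  means <- replicate(reps, mean(rweibull(n, shape, scale)))
  c(N = n,
    observed_mean = mean(means),
    predicted_mean = mu_hat,
    observed_sd = sd(means),
    predicted_sd = sd_hat / sqrt(n))
}))

print(round(results, 4))
```

Each row holds one sample size. The observed and predicted columns should agree more closely in relative terms as the number of replications grows, and the standard deviations should shrink by roughly a factor of √10 from one row to the next.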