Confidence intervals

Mathematics 241
Sampling Distributions
October 6
Goals for the day:
1. Words: confidence interval, confidence level, margin of error
2. R:
3. Big idea: We make a statement with confidence about the value of an unknown mean. (But alas we have to know
the unknown standard deviation.)
Warning. We are following the book here which has an approach that is typical in engineering statistics texts. We
are also using the terminology and approach of the fundamentals of engineering exam. But it is a mistake and a
confusion that we will repair next week!
Words in a book.
Population:  All words in a book.
Variable:    length of word (in letters)
Parameter:   µ
Sample:      n = 290 words
Statistic:   x̄
The probability is 95% that x̄ will be within 1.96 σ/√n of µ. That means that the probability is 95% that µ is within
1.96 σ/√n of x̄.
If the sample size n is large, the interval
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n)
is an approximate 95% confidence interval for µ.
There are two approximations:
1. Central Limit Theorem says the distribution of X̄ is approximately normal.
2. We are using s to approximate σ since σ is unknown.
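The interval above can be sketched in R as follows. This is only a minimal sketch, not code from the M241 package: the function name ci95 and the simulated vector word_lengths are made up for illustration, and s (computed by sd) stands in for σ as in the second approximation.

```r
# Approximate 95% confidence interval for a population mean.
# ci95 is a hypothetical helper, not part of the M241 package.
ci95 <- function(x) {
  n <- length(x)
  xbar <- mean(x)
  se <- sd(x) / sqrt(n)        # s / sqrt(n), with s approximating sigma
  c(lower = xbar - 1.96 * se, upper = xbar + 1.96 * se)
}

# Made-up stand-in data: 290 simulated "word lengths" (NOT the real sample).
set.seed(241)
word_lengths <- rpois(290, lambda = 4) + 1
ci95(word_lengths)
```

With a real sample, the same call would take the vector of observed values in place of word_lengths.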
Write a 95% confidence interval for the mean length of words in the population:
In the M241 package are several datasets that come from samples from populations.
Population                       Variable                                    Sample size  Variable name
All Calvin College students      Body Temperature                            130          normtemp$Temp
Day of the year                  Maximum wind velocity at San Diego Airport  6209         wind$Wind
Calvin students taking GRE test  quantitative score                          43           gre$quant
Glass manufacturing process      breaking strength                           31           windowstrength$ksi
On the basis of these data, write 95% confidence intervals for
The mean body temperature of Calvin students
The mean maximum daily wind velocity at the San Diego airport
The mean quantitative GRE score of Calvin students who might take the GRE test
The mean breaking strength of glass produced by the process
Just as there are 95% confidence intervals, it is easy to conceive of 99% or 90% confidence intervals. A confidence
interval for µ has the form
x̄ ± (critical value) · s/√n
We can change the confidence level by changing the critical value from 1.96 to some other value. Examine how we
got 1.96 and compute the critical values for the following intervals. (You might want to use qnorm to find these values.)
90% confidence interval
99% confidence interval
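One way to see where 1.96 comes from: a 95% interval leaves 2.5% in each tail of the standard normal curve, so the critical value is the 0.975 quantile. The sketch below wraps that reasoning around qnorm; the helper name critical_value is made up for illustration.

```r
# Critical value for a two-sided confidence interval at the given level.
# critical_value is a hypothetical helper name.
critical_value <- function(level) {
  alpha <- 1 - level           # total probability left in the two tails
  qnorm(1 - alpha / 2)         # upper cutoff of the standard normal
}

critical_value(0.95)   # about 1.96
```

Calling the same function with another confidence level gives the critical value for that level.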
If we use 10 samples to generate 90% confidence intervals, we might expect 9 of them to capture the mean of the
population. We’ll do an experiment.
Recall that sr$GPA has the GPAs of 1,333 seniors as of February, 2005. Compute µ, the mean GPA of this population:
µ =
Select 10 different samples of size 30 from this population (sample(sr$GPA,30)) and compute a 90% confidence
interval for the mean from each sample. How many of your 10 intervals contained µ?
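The experiment might be organized along these lines. This is a sketch under stated assumptions: sr$GPA is not available here, so the vector population is simulated and its values are made up (they are not the real GPAs), and ci90 is a hypothetical helper name.

```r
# Stand-in population of 1,333 values. These numbers are simulated
# and are NOT the real GPAs in sr$GPA.
set.seed(2005)
population <- pmin(4, pmax(0, rnorm(1333, mean = 3.2, sd = 0.5)))
mu <- mean(population)

# 90% confidence interval for the mean from one sample.
# ci90 is a hypothetical helper name.
ci90 <- function(x) {
  z <- qnorm(0.95)             # 5% in each tail for a 90% interval
  mean(x) + c(-1, 1) * z * sd(x) / sqrt(length(x))
}

# Draw 10 samples of size 30 and record whether each interval captures mu.
captured <- replicate(10, {
  ci <- ci90(sample(population, 30))
  ci[1] <= mu && mu <= ci[2]
})
sum(captured)   # how many of the 10 intervals contain mu
```

With the course data loaded, population would be replaced by sr$GPA. The count varies from run to run, so 9 out of 10 is an expectation rather than a guarantee.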
The Central Limit Theorem gives us an approximate result. The approximation improves as n grows, and it is
better for population distributions that are not highly skewed and that do not have large outliers.
We do not know the population distribution but can get an idea from the sample. In which of the examples above
do you think the use of the Central Limit Theorem is most suspect?
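A small simulation can make the effect of skewness and sample size concrete. The sketch below assumes a strongly right-skewed population (exponential, chosen only for illustration) and compares how skewed the distribution of x̄ remains for a small and a large n; the helper names skewness and sample_means are made up.

```r
# Sample skewness (third standardized moment); a hypothetical helper.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Means of many samples of size n from a right-skewed population
# (exponential with rate 1, used here purely as an illustration).
set.seed(6)
sample_means <- function(n, reps = 5000) replicate(reps, mean(rexp(n)))

skew_small <- skewness(sample_means(5))     # n = 5
skew_large <- skewness(sample_means(100))   # n = 100
c(n5 = skew_small, n100 = skew_large)
```

A skewness near 0 indicates that the distribution of x̄ is close to symmetric, as the normal approximation requires. For a real sample, hist(x) or qqnorm(x) gives a quick look at skewness and outliers.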