Sect 8.2 Estimating Pop Props

CHAPTER 8
ESTIMATING WITH CONFIDENCE
Section 8.2
Estimating a Population Proportion
A large company is concerned that many of its employees
are in poor physical condition, which can result in
decreased productivity. To determine how many steps
each employee takes per day, on average, the company
provides a pedometer to 50 randomly selected employees
to use for one 24-hour period. After collecting the data,
the company statistician reports a 95% confidence
interval of 4547 steps to 8473 steps.
(a) Interpret the confidence interval.
(b) What is the point estimate that was used to create the
interval? What is the margin of error?
(c) Recent guidelines suggest that people aim for 10,000
steps per day. Is there convincing evidence that the
employees of this company are not meeting the guideline,
on average? Explain.
(a) We are 95% confident that the interval from 4547
to 8473 captures the true mean number of steps
taken per day for employees at this company.
(b) The point estimate is the midpoint of the interval:
$\frac{4547 + 8473}{2} = 6510$. The margin of error is the distance
from the point estimate to each endpoint of the
interval: 8473 – 6510 = 1963.
(c) Because all the values in the interval are less than
10,000, there is convincing evidence that the
employees of this company are not taking 10,000
steps per day, on average. That is, there is convincing
evidence that the employees are not meeting the
guideline.
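The arithmetic in parts (b) and (c) can be sketched in a few lines of Python. The function name `point_estimate_and_margin` is a hypothetical helper introduced only for illustration; it assumes the interval is symmetric about the point estimate, as the intervals in this section are.

```python
def point_estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from interval endpoints.

    Assumes the interval has the form: point estimate +/- margin of error.
    """
    point_estimate = (lower + upper) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error


estimate, margin = point_estimate_and_margin(4547, 8473)
print(estimate, margin)  # 6510.0 1963.0

# Part (c): the guideline of 10,000 steps lies above the whole interval.
guideline = 10_000
print(guideline > 8473)  # True
```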
We saw in Section 8.1 that a confidence interval can be used
to estimate an unknown population parameter. We are
often interested in estimating the proportion p of some
outcome in the population.
• What proportion of U.S. adults are unemployed right
now?
• What proportion of high school students have cheated
on a test?
• What proportion of pine trees in a national park are
infested with beetles?
• What proportion of college students pray daily?
• What proportion of a company’s laptop batteries last as
long as the company claims?
Conditions for Estimating p
Mrs. Richardson’s class wants to construct a confidence interval for the
proportion p of goldfish that have smiles. Their sample had 107
goldfish with smiles and 144 goldfish that did not have smiles.
How can we use this information to find a confidence interval for p?
The sample proportion is $\hat{p} = \frac{107}{251} \approx 0.426$.
• If the sample size is large enough that both np and n(1 − p) are at least 10, the sampling
distribution of $\hat{p}$ is approximately Normal.
• The mean of the sampling distribution of $\hat{p}$ is p.
• The standard deviation of $\hat{p}$ is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$.
Conditions for Estimating p
Check the conditions for estimating p from our sample.
Random: The class took an SRS of 251 goldfish, which gave $\hat{p} = \frac{107}{251} \approx 0.426$.
Normal: Both np and n(1 – p) must be at least 10. Since we don’t know p, we
check that
$n\hat{p} = 251\left(\frac{107}{251}\right) = 107$ and $n(1 - \hat{p}) = 251\left(1 - \frac{107}{251}\right) = 144$.
The counts of successes (goldfish with smiles) and failures (goldfish with no smiles)
are both ≥ 10.
Independent: If you are sampling without replacement (as in an SRS), check the rule of
thumb that the population is at least 10 times the sample size. Here 10 × 251 = 2510, which is
less than the population of goldfish. If you ran an experiment (spinning coins, etc.), just
make sure that no trial influences another.
Since all three conditions are met, it is safe to construct a confidence interval.
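The two numeric checks above (Normal and Independent) can be sketched as a small helper. The function name `check_conditions` and its return format are assumptions made for illustration. The Random condition cannot be checked from the counts alone, so it is left to the reader. The population size is optional because the slides do not state how many goldfish are in the bin.

```python
def check_conditions(successes, failures, population_size=None):
    """Check the Normal (large counts) and 10% conditions for a proportion.

    population_size is optional; when it is None the 10% check is skipped
    and reported as None. The Random condition is not checked here.
    """
    n = successes + failures
    large_counts = successes >= 10 and failures >= 10
    if population_size is None:
        ten_percent = None  # unknown population: cannot verify numerically
    else:
        ten_percent = population_size >= 10 * n
    return {"n": n, "large_counts": large_counts, "ten_percent": ten_percent}


# Goldfish sample: 107 with smiles, 144 without.
print(check_conditions(107, 144))
```

When sampling from a population of known size, pass it as the third argument so the 10% check is evaluated as well.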
Constructing a Confidence Interval for p
We can use the general formula from Section 8.1 to construct a
confidence interval for an unknown population proportion p:
statistic ± (critical value) · (standard deviation of statistic)
The sample proportion pˆ is the statistic we use to estimate p.
When the Independent condition is met, the standard deviation
of the sampling distribution of $\hat{p}$ is
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
Since we don’t know p, we replace it with the sample proportion $\hat{p}$.
This gives us the standard error (SE) of the sample proportion:
$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Definition:
When the standard deviation of a statistic is estimated from
data, the result is called the standard error of the statistic.
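As a quick numerical illustration, the standard error for the goldfish sample (107 successes out of 251) could be computed as follows. The helper name `standard_error` is hypothetical, and the sketch uses the unrounded sample proportion rather than 0.426.

```python
import math


def standard_error(successes, n):
    """Standard error of the sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)


se = standard_error(107, 251)
print(f"SE = {se:.4f}")  # SE = 0.0312
```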
Finding a Critical Value
How do we find the critical value for our confidence interval?
statistic ± (critical value) · (standard deviation of statistic)
If the Normal condition is met, we can use a Normal curve. To find a
level C confidence interval, we need to catch the central area C
under the standard Normal curve.
For example, to find a 95% confidence
interval, we use a critical value of 2 based
on the 68-95-99.7 rule. Using Table A or a
calculator, we can get a more accurate
critical value.
Note, the critical value z* is actually 1.96
for a 95% confidence level.
Finding a Critical Value
Use Table A to find the critical value z* for an 80% confidence
interval. Assume that the Normal condition is met.
Since we want to capture the
central 80% of the standard
Normal distribution, we leave
out 20%, or 10% in each tail.
Search Table A to find the point
z* with area 0.1 to its left.
The closest entry is z = – 1.28.
z       .07     .08     .09
–1.3    .0853   .0838   .0823
–1.2    .1020   .1003   .0985
–1.1    .1210   .1190   .1170
So, the critical value z* for an 80% confidence interval is z* = 1.28.
Use Table A to find the critical value z* for a 96%
confidence interval. Assume that the Large Counts
condition is met.
For a 96% confidence interval, we need to capture the
middle 96% of the standard Normal distribution. This
leaves out 2% in each tail, so we want to find the z-score
with an area of 0.02 to its left. The closest entry is z =
−2.05, so the critical value we want is z* = 2.05.
Using technology: The command invNorm(area:0.02, µ:0, σ:1) gives
z = –2.05, so z* = 2.05, which matches the value
we got from Table A.
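For readers without a graphing calculator, the same lookup can be sketched with Python's standard library. `NormalDist().inv_cdf` plays the role of invNorm here; the wrapper name `critical_value` is an assumption made for illustration.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Return z* so that the central `confidence_level` area lies in (-z*, z*)."""
    tail_area = (1 - confidence_level) / 2      # area left out in each tail
    return -NormalDist().inv_cdf(tail_area)     # inv_cdf gives the negative z


for level in (0.80, 0.90, 0.95, 0.96, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
# 80% confidence: z* = 1.282
# 90% confidence: z* = 1.645
# 95% confidence: z* = 1.960
# 96% confidence: z* = 2.054
# 99% confidence: z* = 2.576
```

The values agree with Table A to two decimal places; the small differences come from the table's rounding.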
One-Sample z Interval for a Population Proportion
Once we find the critical value z*, our confidence interval for the
population proportion p is
statistic ± (critical value) · (standard deviation of statistic)
$= \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
One-Sample z Interval for a Population Proportion
When the conditions are met, a C% confidence interval for the
unknown proportion p is
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
where z* is the critical value for the standard Normal curve with
C% of its area between −z* and z*.
One-Sample z Interval for a Population Proportion
Suppose you took an SRS of goldfish from the bin and got 107 that had
smiles and 144 that didn’t have smiles. Calculate and interpret a 90%
confidence interval for the proportion of goldfish that have smiles. Your
teacher claims 50% of the goldfish have smiles. Use your interval to
comment on this claim.
• We checked the conditions earlier.
• Sample proportion = 107/251 = 0.426
• For a 90% confidence level, z* = 1.645. Table A excerpt:
z       .03     .04     .05
–1.7    .0418   .0409   .0401
–1.6    .0516   .0505   .0495
–1.5    .0630   .0618   .0606
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.426 \pm 1.645\sqrt{\frac{(0.426)(1-0.426)}{251}}$
$= 0.426 \pm 0.051$
$= (0.375, 0.477)$
We are 90% confident that the
interval from 0.375 to 0.477 captures
the true proportion of goldfish that
have smiles.
Since this interval gives a range of
plausible values for p and since 0.5
is not contained in the interval, we
have reason to doubt her claim.
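The goldfish calculation can be reproduced with a short sketch. The function name `one_prop_z_interval` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than Table A, so the result is not rounded at intermediate steps the way the hand calculation is.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(successes, n, confidence_level):
    """One-sample z interval for a proportion: p_hat +/- z* * SE."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(107, 251, 0.90)
print(f"90% interval: ({low:.3f}, {high:.3f})")

# Compare the teacher's claim with the plausible values in the interval.
claim = 0.5
if low <= claim <= high:
    print("0.5 is a plausible value for p")
else:
    print("0.5 is not in the interval, so there is reason to doubt the claim")
```

Because the sketch keeps the unrounded sample proportion, the upper endpoint may differ from the slide's 0.477 in the third decimal place.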
The Four-Step Process
We can use the familiar four-step process
whenever a problem asks us to construct and
interpret a confidence interval.
Confidence Intervals: A Four-Step Process
State: What parameter do you want to estimate, and at what confidence
level?
Plan: Identify the appropriate inference method. Check conditions.
Do: If the conditions are met, perform calculations.
Conclude: Interpret your interval in the context of the problem.
The Gallup Youth Survey asked a random sample of 439 U.S.
teens aged 13 to 17 whether they thought young people
should wait to have sex until marriage. Of the sample, 246
said “Yes”. Construct and interpret a 95% confidence
interval for the proportion of all teens who would say
“Yes” if asked this question.
STATE: We want to estimate the actual proportion p of all teens in the U.S. who
would say that young people should wait until marriage, with 95% confidence.
Here n = 439 and $\hat{p} = \frac{246}{439} \approx 0.56$.
PLAN: We should use a one-sample z interval for p if the conditions are satisfied.
Random: Gallup surveyed a random sample of 439 U.S. teens.
Normal: We check the counts of “successes” and “failures”:
$n\hat{p} = 246 \geq 10$ and $n(1 - \hat{p}) = 193 \geq 10$
Independent: The responses of each teen are independent of each other.
And since we are sampling without replacement, make sure your rule of
thumb is satisfied: 10*n = 10*439=4390 which is smaller than the U.S.
teen population.
DO:
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.56 \pm 1.96\sqrt{\frac{(0.56)(0.44)}{439}}$
$= 0.56 \pm 0.046$
$= (0.514, 0.606)$
CONCLUDE: We are 95% confident that the interval from 0.514 to 0.606
captures the true proportion of 13-17 year olds in the U.S.
who would say that teens should wait until marriage to
have sex.
Using your calculator
• If you decide to use your calculator, it is recommended
you show the calculation with the appropriate formula
and then check your work with your calculator. If you
decide to only use your calculator, be sure to list all your
variables, name the procedure (one-proportion z interval,
etc) and give the interval (0.514 to 0.606).
• Press STAT, arrow over to TESTS, and scroll down to option A, which is 1-PropZInt (the one-proportion z interval).
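If no calculator is at hand, a rough stand-in for the calculator check can be sketched in Python. It takes the same three inputs the calculator asks for (x, n, and the confidence level) and reports the interval together with the sample proportion. The function name `one_prop_z_int` and the dictionary layout are assumptions for illustration, not the calculator's actual output format.

```python
import math
from statistics import NormalDist


def one_prop_z_int(x, n, c_level):
    """Return the interval, sample proportion, and n for a one-proportion z interval."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return {"interval": (p_hat - margin, p_hat + margin), "p_hat": p_hat, "n": n}


# Gallup Youth Survey: 246 of 439 teens said "Yes", 95% confidence.
result = one_prop_z_int(x=246, n=439, c_level=0.95)
low, high = result["interval"]
print(f"({low:.3f}, {high:.3f}), p-hat = {result['p_hat']:.3f}, n = {result['n']}")
```

The endpoints should agree with the hand calculation to within rounding, since the hand calculation rounds the sample proportion to 0.56 before computing the margin of error.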
In her first-grade social studies class, Jordan learned
that 70% of the Earth’s surface was covered in water.
She wondered if this was really true and asked her
dad for help. To investigate, he tossed an inflatable
globe to her 50 times, being careful to spin the globe
each time. When she caught it, he recorded where
her right index finger was pointing. In 50 tosses, her
finger was pointing to water 33 times. Construct and
interpret a 95% confidence interval for the proportion
of the Earth’s surface that is covered in water.
State: We want to estimate p = the true proportion of the Earth’s surface that
is covered in water with 95% confidence. Here n = 50 and $\hat{p} = \frac{33}{50} = 0.66$.
Plan: We should use a one-sample z interval for p if the conditions are met.
• Random: The 50 locations are a random sample of all possible locations on
the globe.
• 10%: We do not need to check the 10% condition because the locations
were not selected without replacement. Each toss is independent of other
tosses.
• Large Counts: $n\hat{p} = 33 \geq 10$ and $n(1 - \hat{p}) = 17 \geq 10$
Do:
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.66 \pm 1.96\sqrt{\frac{0.66(1-0.66)}{50}}$
$= 0.66 \pm 0.131 = (0.529, 0.791)$
Conclude: We are 95% confident that the interval from 0.529 to 0.791
captures the true proportion of the Earth’s surface that is covered in water.
This is consistent with the claim that 70% of the Earth’s surface is covered in
water, because 0.70 is one of the plausible values in the interval.
Warmup
1. Are the conditions for calculating a confidence interval for the
population proportion, p, met in the following settings?
a) An AP Stats class at a large high school conducts a survey. They ask the
first 100 students to arrive at school one morning whether or not they
slept at least 8 hours the night before. Only 17 students say “yes”.
b) A quality control inspector takes a random sample of 25 bags of potato
chips from the thousands of bags filled in an hour. Of the bags
selected, 3 had too much salt.
2. Alcohol abuse has been described by college presidents as the number
one problem on campus, and it is an important cause of death in young
adults. How common is it? A survey of 10,904 randomly selected U.S.
college students collected information on drinking behavior and alcohol-related
problems. The researchers defined “frequent binge drinking” as
having five or more drinks in a row three or more times in the past two
weeks. According to this definition, 2486 students were classified as
frequent binge drinkers. Calculate a 99% confidence interval.
1. a) The Random condition is not met because this
was a convenience sample. 10% condition: 10n
= 10(100) = 1000 < population of the school, so this condition is
met. Normal condition: np̂ = 100(17/100) = 17 ≥ 10
and n(1 − p̂) = 83 ≥ 10, so this condition is met.
b) Random: met because the inspector chose an SRS
of bags. 10% condition: 10n = 10(25) = 250 < number of
bags filled in an hour, so this is met. Normal: np̂ =
3 < 10, so this condition is not met.
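For question 2, the slides give the data but no worked answer. The sketch below shows one way the 99% interval could be computed from the survey counts. The variable names are illustrative, and the critical value comes from `statistics.NormalDist` rather than Table A.

```python
import math
from statistics import NormalDist

n = 10_904          # students surveyed
successes = 2_486   # classified as frequent binge drinkers
failures = n - successes

# Large counts check before using the Normal approximation.
assert successes >= 10 and failures >= 10

p_hat = successes / n
z_star = NormalDist().inv_cdf(1 - (1 - 0.99) / 2)   # about 2.576
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"p-hat = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"99% interval: ({low:.3f}, {high:.3f})")
```

The interval is narrow because the sample is very large; the margin of error is about one percentage point.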
Choosing the Sample Size
In planning a study, we may want to choose a sample size that allows us to
estimate a population proportion within a given margin of error.
Calculating a Confidence Interval
To determine the sample size n that will yield a level C confidence
interval for a population proportion p with a maximum margin of
error ME, solve the following inequality for n:
$z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq ME$
where $\hat{p}$ is a guessed value for the sample proportion. The margin of error
will always be less than or equal to ME if you take the guess $\hat{p}$ to be 0.5.
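To see why a guess of 0.5 is the safe choice, the sketch below compares the margin of error for several guessed values at a fixed sample size. The sample size of 400 and the function name `margin_for_guess` are hypothetical, chosen only for illustration; z* = 1.96 corresponds to 95% confidence.

```python
import math


def margin_for_guess(guess, n, z_star=1.96):
    """Margin of error z* * sqrt(guess * (1 - guess) / n) for a guessed proportion."""
    return z_star * math.sqrt(guess * (1 - guess) / n)


n = 400  # hypothetical sample size, for illustration only
margins = {guess: margin_for_guess(guess, n) for guess in (0.1, 0.3, 0.5, 0.7, 0.9)}
for guess, margin in margins.items():
    print(f"guess = {guess}: margin of error = {margin:.4f}")

# The product guess * (1 - guess) is largest at 0.5, so that guess
# gives the largest margin of error and hence the most cautious n.
print(max(margins, key=margins.get))  # 0.5
```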
Example: Customer Satisfaction
A company has received complaints about its customer
service. The managers intend to hire a consultant to carry
out a survey of customers. Before contacting the
consultant, the company president wants some idea of the
sample size that she will be required to pay for. One critical
question is the degree of satisfaction with the company’s
customer service, measured on a five-point scale. The
president wants to estimate the proportion p of customers
who are satisfied (that is, who choose either “satisfied” or
“very satisfied”, the two highest levels on the five-point
scale). She decides that she wants the estimate to be
within 3% (0.03) at a 95% confidence level. How large a
sample is needed?
Example: Determining sample size
Problem: Determine the sample size needed to estimate p within 0.03
with 95% confidence.
• The critical value for 95% confidence is z* = 1.96.
• We have no idea about the true proportion p of satisfied
customers, so we decide to use p-hat = 0.5 as our guess.
• Because the company president wants a margin of error of no
more than 0.03, we need to solve the inequality
$1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq 0.03$
for n with $\hat{p} = 0.5$:
$\sqrt{\frac{0.5(1-0.5)}{n}} \leq \frac{0.03}{1.96}$
$\frac{0.5(1-0.5)}{n} \leq \left(\frac{0.03}{1.96}\right)^2$
$0.25\left(\frac{1.96}{0.03}\right)^2 \leq n$
$1067.111 \leq n$
We round up to 1068 respondents to ensure that the
margin of error is no more than 3%.
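The same algebra can be wrapped in a small function. This is a minimal sketch: the name `required_sample_size` is hypothetical, and z* = 1.96 is passed in explicitly so the arithmetic matches the hand calculation above.

```python
import math


def required_sample_size(margin_of_error, z_star, guess=0.5):
    """Smallest n with z* * sqrt(guess * (1 - guess) / n) <= margin_of_error."""
    exact_n = (z_star / margin_of_error) ** 2 * guess * (1 - guess)
    return math.ceil(exact_n)  # always round up to stay within the margin


print(required_sample_size(0.03, z_star=1.96))  # 1068
```

Rounding up rather than to the nearest integer is deliberate: rounding down would give a margin of error slightly larger than the target.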
Suppose that you want to estimate p = the true proportion of
students at your school who have a tattoo with 95%
confidence and a margin of error of no more than 0.10.
Determine how many students should be surveyed to
estimate p within 0.10 with 95% confidence.
Because we don’t have any previous knowledge about the proportion
of students with a tattoo, we will use $\hat{p} = 0.5$ to estimate the sample size
needed:
$1.96\sqrt{\frac{0.5(1-0.5)}{n}} \leq 0.10 \rightarrow \left(\frac{1.96}{0.10}\right)^2 (0.5)(1-0.5) \leq n \rightarrow n \geq 96.04$
We need to survey at least 97 students to estimate the true proportion
of students with a tattoo with 95% confidence and a margin of error of
at most 0.10.
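A quick check shows why the answer is rounded up to 97 rather than down to 96. The sketch computes the margin of error at both sample sizes, using the guess of 0.5 and z* = 1.96 from the calculation above; the function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(n, guess=0.5, z_star=1.96):
    """Margin of error for a proportion at sample size n."""
    return z_star * math.sqrt(guess * (1 - guess) / n)


for n in (96, 97):
    me = margin_of_error(n)
    print(f"n = {n}: margin of error = {me:.4f}, within 0.10: {me <= 0.10}")
# n = 96: margin of error = 0.1000, within 0.10: False
# n = 97: margin of error = 0.0995, within 0.10: True
```

With 96 students the margin of error is just barely above 0.10, so 97 is the smallest sample size that meets the requirement.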