CHAPTER 8 ESTIMATING WITH CONFIDENCE
Section 8.2 Estimating a Population Proportion

A large company is concerned that many of its employees are in poor physical condition, which can result in decreased productivity. To determine how many steps each employee takes per day, on average, the company provides a pedometer to 50 randomly selected employees to use for one 24-hour period. After collecting the data, the company statistician reports a 95% confidence interval of 4547 steps to 8473 steps.

(a) Interpret the confidence interval.
(b) What is the point estimate that was used to create the interval? What is the margin of error?
(c) Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain.

(a) We are 95% confident that the interval from 4547 to 8473 captures the true mean number of steps taken per day for employees at this company.
(b) The point estimate is the midpoint of the interval: $(4547 + 8473)/2 = 6510$. The margin of error is the distance from the point estimate to each endpoint of the interval: $8473 - 6510 = 1963$.
(c) Because all the values in the interval are less than 10,000, there is convincing evidence that the employees of this company are not taking 10,000 steps per day, on average. That is, there is convincing evidence that the employees are not meeting the guideline.

We saw in Section 8.1 that a confidence interval can be used to estimate an unknown population parameter. We are often interested in estimating the proportion p of some outcome in the population.

• What proportion of U.S. adults are unemployed right now?
• What proportion of high school students have cheated on a test?
• What proportion of pine trees in a national park are infested with beetles?
• What proportion of college students pray daily?
• What proportion of a company’s laptop batteries last as long as the company claims?

Conditions for Estimating p

Mrs.
Richardson’s class wants to construct a confidence interval for the proportion p of goldfish that have smiles. Their sample had 107 goldfish with smiles and 144 goldfish that did not have smiles. How can we use this information to find a confidence interval for p?

If the sample size is large enough that both $np$ and $n(1 - p)$ are at least 10, the sampling distribution of $\hat{p}$ is approximately Normal. The mean of the sampling distribution of $\hat{p}$ is $p$. The standard deviation of $\hat{p}$ is

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

For this sample, $\hat{p} = \frac{107}{251} \approx 0.426$.

Check the conditions for estimating p from our sample.

Random: The class took an SRS of 251 goldfish.

Normal: Both $np$ and $n(1 - p)$ must be at least 10. Since we don’t know p, we check that

$$n\hat{p} = 251\left(\frac{107}{251}\right) = 107 \quad \text{and} \quad n(1 - \hat{p}) = 251\left(1 - \frac{107}{251}\right) = 144$$

The counts of successes (goldfish with smiles) and failures (goldfish with no smiles) are both ≥ 10.

Independent: If you are using an SRS, make sure the 10% rule of thumb is satisfied: the population must be at least 10 times as large as the sample. Here $10n = 10(251) = 2510$, which is less than the population of goldfish. If you did an experiment (spinning coins, etc.), just make sure each event does not influence the other events.

Since all three conditions are met, it is safe to construct a confidence interval.

Constructing a Confidence Interval for p

We can use the general formula from Section 8.1 to construct a confidence interval for an unknown population proportion p:

statistic ± (critical value) · (standard deviation of statistic)

The sample proportion $\hat{p}$ is the statistic we use to estimate p. When the Independent condition is met, the standard deviation of the sampling distribution of $\hat{p}$ is

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

Since we don’t know p, we replace it with the sample proportion $\hat{p}$. This gives us the standard error (SE) of the sample proportion:

$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Definition: When the standard deviation of a statistic is estimated from data, the result is called the standard error of the statistic.
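The goldfish numbers above can be checked with a few lines of Python. This is only a sketch: the function names `large_counts_ok` and `standard_error` are made up for illustration and do not come from any textbook or library.

```python
from math import sqrt

# Hypothetical helper names, chosen for this illustration only.
def large_counts_ok(successes: int, failures: int, minimum: int = 10) -> bool:
    """Normal condition: both observed counts must be at least `minimum`."""
    return successes >= minimum and failures >= minimum

def standard_error(successes: int, n: int) -> float:
    """Standard error of p-hat: sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    return sqrt(p_hat * (1 - p_hat) / n)

# Goldfish sample from the example: 107 with smiles, 144 without.
smiles, no_smiles = 107, 144
n = smiles + no_smiles
p_hat = smiles / n
se = standard_error(smiles, n)

print(f"n = {n}, p_hat = {p_hat:.3f}")
print(f"Normal condition met: {large_counts_ok(smiles, no_smiles)}")
print(f"10n = {10 * n} (the population must be at least this large)")
print(f"SE of p_hat = {se:.4f}")
```

The Random and Independent conditions depend on how the data were collected, so they still have to be checked by reading the problem, not by computation.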
Finding a Critical Value

How do we find the critical value for our confidence interval?

statistic ± (critical value) · (standard deviation of statistic)

If the Normal condition is met, we can use a Normal curve. To find a level C confidence interval, we need to catch the central area C under the standard Normal curve.

For example, to find a 95% confidence interval, we use a critical value of 2 based on the 68-95-99.7 rule. Using Table A or a calculator, we can get a more accurate critical value. Note that the critical value z* is actually 1.96 for a 95% confidence level.

Use Table A to find the critical value z* for an 80% confidence interval. Assume that the Normal condition is met.

Since we want to capture the central 80% of the standard Normal distribution, we leave out 20%, or 10% in each tail. Search Table A to find the point z* with area 0.1 to its left. The closest entry is z = −1.28.

   z      .07     .08     .09
 −1.3    .0853   .0838   .0823
 −1.2    .1020   .1003   .0985
 −1.1    .1210   .1190   .1170

So, the critical value z* for an 80% confidence interval is z* = 1.28.

Use Table A to find the critical value z* for a 96% confidence interval. Assume that the Large Counts condition is met.

For a 96% confidence interval, we need to capture the middle 96% of the standard Normal distribution. This leaves out 2% in each tail, so we want to find the z-score with an area of 0.02 to its left. The closest entry is z = −2.05, so the critical value we want is z* = 2.05.

Using technology: The command invNorm(area: 0.02, μ: 0, σ: 1) gives z = −2.05, so z* = 2.05, which matches the value we got from Table A.
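The same critical values can be reproduced in Python as a cross-check on Table A and invNorm. This sketch uses `NormalDist.inv_cdf` from the standard library, which plays the role of invNorm; the function name `critical_value` is a hypothetical one chosen for this illustration.

```python
from statistics import NormalDist

def critical_value(confidence: float) -> float:
    """Return z* so that the standard Normal curve has area `confidence`
    between -z* and z*. `critical_value` is an illustrative name."""
    tail_area = (1 - confidence) / 2           # area left out in each tail
    z_left = NormalDist(mu=0, sigma=1).inv_cdf(tail_area)  # negative z-score
    return -z_left

for level in (0.80, 0.90, 0.95, 0.96):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

Software returns more decimal places than Table A, so expect agreement only to about two decimal places.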
One-Sample z Interval for a Population Proportion

Once we find the critical value z*, our confidence interval for the population proportion p is

statistic ± (critical value) · (standard deviation of statistic)

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

When the conditions are met, a C% confidence interval for the unknown proportion p is

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

where z* is the critical value for the standard Normal curve with C% of its area between −z* and z*.

Suppose you took an SRS of goldfish from the bin and got 107 that had smiles and 144 that didn’t have smiles. Calculate and interpret a 90% confidence interval for the proportion of goldfish that have smiles. Your teacher claims 50% of the goldfish have smiles. Use your interval to comment on this claim.

We checked the conditions earlier.

   z      .03     .04     .05
 −1.7    .0418   .0409   .0401
 −1.6    .0516   .0505   .0495
 −1.5    .0630   .0618   .0606

For a 90% confidence level, z* = 1.645 (the tail area 0.05 falls midway between the Table A entries for z = −1.64 and z = −1.65).

Sample proportion = 107/251 = 0.426

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.426 \pm 1.645 \sqrt{\frac{(0.426)(1 - 0.426)}{251}} = 0.426 \pm 0.051 = (0.375, 0.477)$$

We are 90% confident that the interval from 0.375 to 0.477 captures the true proportion of goldfish that have smiles. Since this interval gives a range of plausible values for p and since 0.5 is not contained in the interval, we have reason to doubt her claim.

The Four-Step Process

We can use the familiar four-step process whenever a problem asks us to construct and interpret a confidence interval.

Confidence Intervals: A Four-Step Process
State: What parameter do you want to estimate, and at what confidence level?
Plan: Identify the appropriate inference method. Check conditions.
Do: If the conditions are met, perform calculations.
Conclude: Interpret your interval in the context of the problem.

The Gallup Youth Survey asked a random sample of 439 U.S. teens aged 13 to 17 whether they thought young people should wait to have sex until marriage.
Of the sample, 246 said “Yes”. Construct and interpret a 95% confidence interval for the proportion of all teens who would say “Yes” if asked this question.

STATE: We want to estimate the actual proportion, p, of all teens in the U.S. who would say that young people should wait until marriage, at a 95% confidence level. Here $n = 439$ and $\hat{p} = \frac{246}{439} = 0.56$.

PLAN: We should use a one-sample z interval for p if the conditions are satisfied.
Random: Gallup surveyed a random sample of 439 U.S. teens.
Normal: We check the counts of “successes” and “failures”: $n\hat{p} = 246 \ge 10$ and $n(1 - \hat{p}) = 193 \ge 10$.
Independent: The responses of each teen are independent of each other. And since we are sampling without replacement, make sure the 10% rule of thumb is satisfied: $10n = 10(439) = 4390$, which is smaller than the U.S. teen population.

DO:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.56 \pm 1.96 \sqrt{\frac{(0.56)(0.44)}{439}} = 0.56 \pm 0.046 = (0.514, 0.606)$$

CONCLUDE: We are 95% confident that the interval from 0.514 to 0.606 captures the true proportion of 13-17 year olds in the U.S. who would say that teens should wait until marriage to have sex.

Using your calculator

• If you decide to use your calculator, it is recommended that you show the calculation with the appropriate formula and then check your work with your calculator. If you decide to use only your calculator, be sure to list all your variables, name the procedure (one-proportion z interval, etc.), and give the interval (0.514 to 0.606).
• Press STAT, go over to TESTS, and scroll down to option A, which is 1-PropZInt.

In her first-grade social studies class, Jordan learned that 70% of the Earth’s surface was covered in water. She wondered if this was really true and asked her dad for help. To investigate, he tossed an inflatable globe to her 50 times, being careful to spin the globe each time. When she caught it, he recorded where her right index finger was pointing. In 50 tosses, her finger was pointing to water 33 times.
Construct and interpret a 95% confidence interval for the proportion of the Earth’s surface that is covered in water.

State: We want to estimate p = the true proportion of the Earth’s surface that is covered in water with 95% confidence. Here $n = 50$ and $\hat{p} = \frac{33}{50} = 0.66$.

Plan: We should use a one-sample z interval for p if the conditions are met.
• Random: The 50 locations are a random sample of all possible locations on the globe.
• 10%: We do not need to check the 10% condition because the locations were not selected without replacement. Each toss is independent of other tosses.
• Large Counts: $n\hat{p} = 33 \ge 10$ and $n(1 - \hat{p}) = 17 \ge 10$

Do:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.66 \pm 1.96 \sqrt{\frac{0.66(1 - 0.66)}{50}} = 0.66 \pm 0.131 = (0.529, 0.791)$$

Conclude: We are 95% confident that the interval from 0.529 to 0.791 captures the true proportion of the Earth’s surface that is covered in water. This is consistent with the claim that 70% of the Earth’s surface is covered in water, because 0.70 is one of the plausible values in the interval.

Warmup

1. Are the conditions for calculating a confidence interval for the population proportion, p, met in the following settings?
a) An AP Stats class at a large high school conducts a survey. They ask the first 100 students to arrive at school one morning whether or not they slept at least 8 hours the night before. Only 17 students say “yes”.
b) A quality control inspector takes a random sample of 25 bags of potato chips from the thousands of bags filled in an hour. Of the bags selected, 3 had too much salt.

2. Alcohol abuse has been described by college presidents as the number one problem on campus, and it is an important cause of death in young adults. How common is it? A survey of 10,904 randomly selected U.S. college students collected information on drinking behavior and alcohol-related problems. The researchers defined “frequent binge drinking” as having five or more drinks in a row three or more times in the past two weeks.
According to this definition, 2486 students were classified as frequent binge drinkers. Calculate a 99% confidence interval.

1. a) Random: not met, because this was a convenience sample.
10% condition: $10n = 10(100) = 1000$, which is less than the population of the school, so this condition is met.
Normal condition: $n\hat{p} = 100(17/100) = 17 \ge 10$ and $n(1 - \hat{p}) = 83 \ge 10$, so this condition is met.
b) Random: met, because the inspector chose an SRS of bags.
10% condition: $10n = 10(25) = 250$, which is less than the number of bags filled in an hour, so this condition is met.
Normal condition: $n\hat{p} = 3 < 10$, so this condition is not met.

Choosing the Sample Size

In planning a study, we may want to choose a sample size that allows us to estimate a population proportion within a given margin of error.

Calculating a Confidence Interval
To determine the sample size n that will yield a level C confidence interval for a population proportion p with a maximum margin of error ME, solve the following inequality for n:

$$z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le ME$$

where $\hat{p}$ is a guessed value for the sample proportion. The margin of error will always be less than or equal to ME if you take the guess $\hat{p}$ to be 0.5.

Example: Customer Satisfaction

A company has received complaints about its customer service. The managers intend to hire a consultant to carry out a survey of customers. Before contacting the consultant, the company president wants some idea of the sample size that she will be required to pay for. One critical question is the degree of satisfaction with the company’s customer service, measured on a five-point scale. The president wants to estimate the proportion p of customers who are satisfied (that is, who choose either “satisfied” or “very satisfied”, the two highest levels on the five-point scale). She decides that she wants the estimate to be within 3% (0.03) at a 95% confidence level. How large a sample is needed?

Example: Determining sample size

Problem: Determine the sample size needed to estimate p within 0.03 with 95% confidence.
The critical value for 95% confidence is z* = 1.96. We have no idea about the true proportion p of satisfied customers, so we decide to use $\hat{p} = 0.5$ as our guess. Because the company president wants a margin of error of no more than 0.03, we need to solve the inequality margin of error ≤ 0.03 for n:

$$1.96 \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le 0.03$$

$$1.96 \sqrt{\frac{0.5(1 - 0.5)}{n}} \le 0.03$$

$$\left(\frac{1.96}{0.03}\right)^2 (0.5)(1 - 0.5) \le n$$

$$\left(\frac{1.96}{0.03}\right)^2 (0.25) \le n$$

$$1067.111 \le n$$

We round up to 1068 respondents to ensure that the margin of error is no more than 3%.

Suppose that you want to estimate p = the true proportion of students at your school who have a tattoo with 95% confidence and a margin of error of no more than 0.10. Determine how many students should be surveyed to estimate p within 0.10 with 95% confidence.

Because we don’t have any previous knowledge about the proportion of students with a tattoo, we will use $\hat{p} = 0.5$ to estimate the sample size needed:

$$1.96 \sqrt{\frac{0.5(1 - 0.5)}{n}} \le 0.10 \;\rightarrow\; \left(\frac{1.96}{0.10}\right)^2 (0.5)(1 - 0.5) \le n \;\rightarrow\; n \ge 96.04$$

We need to survey at least 97 students to estimate the true proportion of students with a tattoo with 95% confidence and a margin of error of at most 0.10.
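Both sample-size calculations above follow the same pattern, which can be sketched in Python. The function names `required_sample_size` and `margin_of_error` are hypothetical, and the rounded value z* = 1.96 is assumed so that the results match the hand calculations.

```python
from math import ceil, sqrt

# Hypothetical helper names, chosen for this illustration only.
def required_sample_size(max_margin: float, z_star: float = 1.96,
                         p_guess: float = 0.5) -> int:
    """Smallest n satisfying z* * sqrt(p_guess * (1 - p_guess) / n) <= max_margin."""
    n_exact = (z_star / max_margin) ** 2 * p_guess * (1 - p_guess)
    return ceil(n_exact)  # always round up so the margin is not exceeded

def margin_of_error(n: int, z_star: float = 1.96, p_guess: float = 0.5) -> float:
    """Margin of error for a sample of size n under the guessed proportion."""
    return z_star * sqrt(p_guess * (1 - p_guess) / n)

n_customers = required_sample_size(0.03)  # customer satisfaction example
n_students = required_sample_size(0.10)   # tattoo example

print(f"Customers needed: {n_customers}")
print(f"Students needed:  {n_students}")
print(f"Margin with {n_customers} customers: {margin_of_error(n_customers):.4f}")
```

Rounding up with `ceil` rather than rounding to the nearest integer matters here: under the same assumptions, one respondent fewer in the customer example would push the margin of error just above 0.03.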