R Lab #5: Central Limit Theorem and Confidence Intervals

Central Limit Theorem

Let X be a random variable with mean µ and standard deviation σ. For a large sample size n, the distribution of the sample mean is approximately

\[ \bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]

Example

The weights of apples collected from a farm are normally distributed with a mean of 5.2 oz and a standard deviation of 1.1 oz, X ∼ N(5.2, 1.1).

Find the probability that a randomly selected apple weighs less than 5 oz. Note, this is a single apple.

    pnorm(5, mean = 5.2, sd = 1.1)
    ## [1] 0.4279

Find the probability that a simple random sample of 10 apples has a sample mean less than 5 oz. Note, this is a mean with a sample size of 10, so the standard deviation is replaced by the standard error σ/√n.

    pnorm(5, mean = 5.2, sd = 1.1/sqrt(10))
    ## [1] 0.2827

Find the probability that a simple random sample of 30 apples has a sample mean less than 5 oz.

    pnorm(5, mean = 5.2, sd = 1.1/sqrt(30))
    ## [1] 0.1597

Find the probability that a sample of 100 apples will have a sample mean greater than 5.3 oz.

    1 - pnorm(5.3, mean = 5.2, sd = 1.1/sqrt(100))
    ## [1] 0.1817

What is the probability that a sample of 100 apples will have a sample mean between 5.1 oz and 5.3 oz?

    pnorm(5.3, mean = 5.2, sd = 1.1/sqrt(100)) - pnorm(5.1, mean = 5.2, sd = 1.1/sqrt(100))
    ## [1] 0.6367

Confidence Intervals

How can the central limit theorem help us construct a confidence interval? First we need to define confidence. Three levels of confidence are commonly used in statistics: 95% (the most common), 90%, and 99%. A 95% confidence interval always refers to the middle 95% of the distribution. For example, take a simple random sample of size 30 from our apple population.

Figure 1: plot of chunk unnamed-chunk-6

If the shaded area represents the middle 95%, how much is on either side? The remaining 5% is split evenly, leaving 2.5% in each tail. These values are important, as they help us define the 95% confidence interval.

So if the mean is 5.2, the standard deviation is 1.1, and the sample size is 30, I am 95% confident that a sample mean will be between...
    round(qnorm(0.025, 5.2, 1.1/sqrt(30)), 2)
    ## [1] 4.81
    round(qnorm(1 - 0.025, 5.2, 1.1/sqrt(30)), 2) # or .975
    ## [1] 5.59

I am 90% confident that the sample mean will be between...

    round(qnorm(0.05, 5.2, 1.1/sqrt(30)), 2)
    ## [1] 4.87
    round(qnorm(1 - 0.05, 5.2, 1.1/sqrt(30)), 2) # or .95
    ## [1] 5.53

I am 99% confident that the sample mean will be between...

    round(qnorm(0.005, 5.2, 1.1/sqrt(30)), 2)
    ## [1] 4.68
    round(qnorm(1 - 0.005, 5.2, 1.1/sqrt(30)), 2) # or .995
    ## [1] 5.72

This is great in theory, as it tells you the possible values of x̄ when µ is known. In practice, all you have is x̄, which is an estimate of µ. We can change this around. To start, recall that approximately 95% of the data is within 2 standard deviations of the mean, or in this case 2 standard errors.

\[ P\left(\mu - 2\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 2\frac{\sigma}{\sqrt{n}}\right) = .95 \]

\[ P\left(-2\frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le 2\frac{\sigma}{\sqrt{n}}\right) = .95 \]

\[ P\left(\bar{x} - 2\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 2\frac{\sigma}{\sqrt{n}}\right) = .95 \]

We can rewrite this in a nicer form, and say we are approximately 95% confident that the population mean is between

\[ \bar{x} \pm 2\frac{\sigma}{\sqrt{n}} \]

If we want to be exact, we can use R to calculate the exact multiplier.

    qnorm(0.025, 0, 1)
    ## [1] -1.96
    qnorm(0.975, 0, 1)
    ## [1] 1.96

\[ \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} \]

We also don't know the value of σ, so we will replace it with the sample standard deviation s. Therefore, the formula for a 95% confidence interval for the mean, assuming n is large, is

\[ \bar{x} \pm 1.96\frac{s}{\sqrt{n}} \]

The 1.96 s/√n part of the equation is called the margin of error, and it is broken up into two parts: a multiplier (1.96 in this case) and the standard error s/√n.

Example 1

For a chemical reaction, 50 repeated trials showed that the mean amount of time for the reaction to finish was 149.2 seconds with a standard deviation of 44.1 seconds. Construct a 95% confidence interval for the mean reaction time.

    1.96 * 44.1/sqrt(50)
    ## [1] 12.22
    149.2 - 1.96 * 44.1/sqrt(50)
    ## [1] 137
    149.2 + 1.96 * 44.1/sqrt(50)
    ## [1] 161.4

There are multiple ways to report this answer.
There is the margin of error format: 149.2 ± 12.2 seconds

Or you can use the interval format: (137.0, 161.4) seconds

Note that the units are clearly stated and that the bounds are reported to the same number of decimal places as the mean.

Large Sample Confidence Interval for the Mean

The form for any C% confidence interval for the mean, assuming n is large (30 or more), is

\[ \bar{x} \pm Z_c\frac{s}{\sqrt{n}} \]

The value of Zc depends on the level of confidence.

• For C=95%, Zc = 1.96
• For C=90%, Zc = 1.645
• For C=99%, Zc = 2.576

Example 2

A drug company claims that the time it takes to relieve a headache after taking a pill is 10 minutes. The drug is administered to 45 people, and the average time to relief was 13.4 minutes with a standard deviation of 6.7 minutes. Find the 99% C.I. for the mean time to relief. Does the claim seem accurate?

    13.4 - 2.576 * 6.7/sqrt(45)
    ## [1] 10.83
    13.4 + 2.576 * 6.7/sqrt(45)
    ## [1] 15.97

The 99% confidence interval is (10.8, 16.0) minutes. Since 10 minutes is not in the 99% confidence interval, the claim appears to be incorrect.
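The large-sample interval formula above can be wrapped in a small function so that the multiplier Zc is computed with qnorm instead of being looked up. This is only a sketch: the function name z_interval and its arguments are made up for illustration and are not part of the lab, and it assumes n is large enough (30 or more) for the normal approximation to hold.

```r
# Hypothetical helper (not from the lab): large-sample confidence interval
# for a mean, using xbar +/- Zc * s / sqrt(n).
z_interval <- function(xbar, s, n, conf = 0.95) {
  alpha <- 1 - conf
  z <- qnorm(1 - alpha / 2)   # multiplier Zc, e.g. about 1.96 for conf = 0.95
  se <- s / sqrt(n)           # standard error of the sample mean
  moe <- z * se               # margin of error
  c(lower = xbar - moe, upper = xbar + moe, margin = moe)
}

# Example 1: 50 trials, mean 149.2 seconds, sd 44.1 seconds, 95% confidence
ex1 <- z_interval(149.2, 44.1, 50, conf = 0.95)
print(round(ex1, 1))

# Example 2: 45 people, mean 13.4 minutes, sd 6.7 minutes, 99% confidence
ex2 <- z_interval(13.4, 6.7, 45, conf = 0.99)
print(round(ex2, 1))
```

Because qnorm returns the multiplier to full precision rather than the rounded 1.96 or 2.576, the bounds can differ from the hand calculations in later decimal places, but they agree with both examples to one decimal place.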