6.5 One- and Two-Sample Comparisons of Proportions (Only Confidence Intervals)

Confidence Interval for a Single Proportion

Example 17: Nickel-Cadmium Cells

n = 235 cells (independent), x = 9 had shorts. The sample proportion is

\[ \hat{p} = \frac{9}{235} = 0.038, \]

so 3.8% had shorts.

How do we find a confidence interval for p, the population proportion of shorts? How variable (how precise) is $\hat{p}$ as an estimator of p?

The observed proportion can be looked at as an average of a bunch of 1's and 0's. Let $X_i = 1$ if the i-th cell has a short and $X_i = 0$ if it does not. Then

\[ \hat{p} = \frac{\text{Count}(X_i = 1)}{n} = \frac{\sum X_i}{n} = \bar{X}. \]

The mean of a single $X_i$ is

\[ \mu = E(X_i) = \sum x \, P(x) = 0 \cdot P(0) + 1 \cdot P(1) = p, \]

so $E(\hat{p}) = p$: $\hat{p}$ is an unbiased estimator of p. On average it gives the right value.

The variance of a single $X_i$ is

\[ \sigma^2 = \mathrm{Var}(X_i) = E(x - \mu)^2 = \sum (x - \mu)^2 P(x) = (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1 - p), \]

so

\[ SE(\hat{p}) = \sqrt{\frac{\sigma^2}{n}} = \sqrt{\frac{p(1 - p)}{n}}. \]

Or, starting from X ~ Binomial(n, p) with $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$:

\[ E(\hat{p}) = E\!\left(\frac{X}{n}\right) = \frac{E(X)}{n} = \frac{np}{n} = p \]

\[ \mathrm{Var}(\hat{p}) = \mathrm{Var}\!\left(\frac{X}{n}\right) = \frac{1}{n^2} \mathrm{Var}(X) = \frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n} \]

An approximate confidence interval for p is

\[ \hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}. \]

As long as n is big enough and the distribution of 1's and 0's is not too skewed (asymmetric), the Central Limit Theorem tells us that $\hat{p}$ is approximately normal. One can do computations (via computer) with the exact binomial distribution of X, but the normal approximation is still very useful. One rule for when you can use the normal approximation is

o Successes ≥ 5: $np \ge 5$
o Failures ≥ 5: $n(1 - p) \ge 5$

Nickel-Cadmium cells: 95% Confidence Interval

With n = 235, x = 9 and $\hat{p} = 9/235 = 0.038$:

\[ 0.038 \pm 1.96 \sqrt{\frac{0.038 (1 - 0.038)}{235}} = 0.038 \pm 1.96 (0.0125) = 0.038 \pm 0.024, \]

that is, 0.014 to 0.062.

Example 18: $n_1 = n_2 = 100$ pellets

Interest is in the fraction conforming to specs, for small and large shot sizes.

Note: If conforming is defined through a measured variable such as strength, say $Y \ge 10$ means "conform", and Y ~ normal($\mu$, $\sigma$), we would be best off using $\bar{Y}$ and $S_Y$ and the normal table to estimate the % conforming.

Normal Reliability Estimates of p

Problem 4 in Section 5.3 gives lifetimes of n = 23 bearings. We are interested in approximating the fraction of bearings whose lifetime is at most $100 \times 10^6$ revolutions, the threshold used in the calculations for this example.
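Using the observed fraction of lifetimes ≤ 100 (in units of $10^6$ revolutions), the approximate confidence interval based on the binomial distribution is

\[ \hat{p} = \frac{18}{23} = 0.783, \qquad SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.086 \]

\[ P_{Low} = 0.783 - 1.96 (0.086) = 0.614, \qquad P_{High} = 0.783 + 1.96 (0.086) = 0.951 \]

Using the fraction $\hat{p}$ is a less efficient use of the data; we are ignoring some of the data's information, namely the actual lifetimes. The data for log(lifetimes) are fairly normal.

[Figure: normal quantile plot of the log lifetimes. Horizontal axis: Log Life (about 1.0 to 2.4). Vertical axis: Normal Quantiles (-2.50 to 2.50).]

Let X = log(lifetime), with $\bar{X} = 1.802$ and $S = 0.232$. The point estimate is

\[ P(\text{Lifetime} \le 100) = P(X \le \log 100) \approx P\!\left(Z \le \frac{2.000 - 1.802}{0.232}\right) = P(Z \le 0.853) = 0.803. \]

Let $Z = (x - \bar{X}) / S = 0.853$. The limits on the Z scale are

\[ Z_{low} = Z - \frac{Z_{0.025}}{\sqrt{n}} \sqrt{1 + \frac{n Z^2}{2(n - 1)}} = 0.853 - \frac{1.96}{\sqrt{23}} \sqrt{1 + \frac{23 (0.853^2)}{2(23 - 1)}} = 0.373 \]

\[ Z_{high} = Z + \frac{Z_{0.025}}{\sqrt{n}} \sqrt{1 + \frac{n Z^2}{2(n - 1)}} = 0.853 + \frac{1.96}{\sqrt{23}} \sqrt{1 + \frac{23 (0.853^2)}{2(23 - 1)}} = 1.333 \]

and the limits on the probability scale are

\[ P_{Low} = P(Z \le Z_{low}) = P(Z \le 0.373) = 0.645, \qquad P_{High} = P(Z \le Z_{high}) = P(Z \le 1.333) = 0.909. \]

The 95% approximate normal distribution confidence interval for the reliability, the fraction less than or equal to x, is 0.645 to 0.909.

o The width of this interval is 0.909 – 0.645 ≈ 0.263 (computed before rounding the endpoints)
o The width of the binomial interval is 0.951 – 0.614 = 0.337
o The normal distribution based interval is more precise, but it rests on the more stringent assumption of normality

Confidence Interval for Two Proportions

For comparing 2 independent proportions,

\[ \mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2) = \frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}. \]

Confidence Interval for $p_1 - p_2$:

\[ \hat{p}_1 - \hat{p}_2 \pm z \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} \]

This applies as long as the numbers of successes and failures are each ≥ 5, separately for both groups.

Using the formulas from Section 6.5:

Example 18: 2 shot sizes, with $n_1 = 100$, $\hat{p}_1 = 0.38$, $n_2 = 100$, $\hat{p}_2 = 0.29$. The 90% confidence interval is

\[ (0.38 - 0.29) \pm 1.645 \sqrt{\frac{0.38 (0.62)}{100} + \frac{0.29 (0.71)}{100}} = 0.09 \pm 0.109, \]

that is, -0.019 to 0.199.

Because this interval includes 0, even a difference of 9% (38% vs. 29%) with $n_1 = n_2 = 100$ is not particularly impressive. Using the more informative quantitative information in $\bar{y}_1, s_1, \bar{y}_2, s_2$, we could do better.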
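The interval calculations in this section can be checked numerically. The following is a minimal Python sketch, not part of the original notes. The function names (`z_value`, `one_prop_ci`, `normal_reliability_ci`, `two_prop_ci`) are hypothetical and chosen for illustration. The normal-theory reliability limits assume the form $Z \pm (Z_{\alpha/2}/\sqrt{n}) \sqrt{1 + nZ^2 / (2(n-1))}$, which reproduces the values 0.373 and 1.333 quoted for the bearing data.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def z_value(conf):
    """Two-sided standard normal critical value (about 1.96 for conf=0.95)."""
    return NormalDist().inv_cdf(0.5 + conf / 2)


def one_prop_ci(x, n, conf=0.95):
    """Approximate (normal-theory) interval for one proportion: p_hat +/- z*SE."""
    p_hat = x / n
    half = z_value(conf) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def normal_reliability_ci(x, mean, sd, n, conf=0.95):
    """Interval for P(X <= x) assuming X is normal (assumed formula, see lead-in).

    Limits are formed on the Z scale and then mapped through the normal CDF.
    """
    z = (x - mean) / sd
    half = z_value(conf) / math.sqrt(n) * math.sqrt(1 + n * z * z / (2 * (n - 1)))
    std_normal = NormalDist()
    return std_normal.cdf(z - half), std_normal.cdf(z + half)


def two_prop_ci(p1, n1, p2, n2, conf=0.90):
    """Approximate interval for p1 - p2 from two independent samples."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half = z_value(conf) * se
    diff = p1 - p2
    return diff - half, diff + half


if __name__ == "__main__":
    print("Ni-Cd cells, 95%:      ", one_prop_ci(9, 235))
    print("Bearings, binomial 95%:", one_prop_ci(18, 23))
    print("Bearings, normal 95%:  ", normal_reliability_ci(2.000, 1.802, 0.232, 23))
    print("Shot sizes, 90%:       ", two_prop_ci(0.38, 100, 0.29, 100))
```

Run as a script, this prints the four intervals discussed in the section. The results should agree with the hand calculations to within rounding; small differences in the third decimal place arise because the hand calculations round $\hat{p}$ and the standard error before combining them.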
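Required Sample Size n

To be 95% certain that $\hat{p}$ is within ± Δ of the true population proportion p (Δ denotes the desired margin), set

\[ \Delta = 1.96 \cdot SE(\hat{p}) = 1.96 \sqrt{\frac{p(1 - p)}{n}} \]

and solve for n.

The "worst case scenario", with the largest SE, is when p = 0.5. More generally, the worst case is the potential p closest to 0.5.

o For example, suppose we figure p is somewhere between 0.7 and 0.9.
o The worst case situation, with the largest SE, is p = 0.7.

Solve

\[ \Delta = 1.96 \cdot SE(\hat{p}) = 1.96 \sqrt{\frac{p_{WC} (1 - p_{WC})}{n}} \quad\Longrightarrow\quad n = \frac{1.96^2 \, p_{WC} (1 - p_{WC})}{\Delta^2} \]

o $p_{WC}$ = worst case p = potential p closest to 0.5

Suppose we want to be 95% certain that $\hat{p}$ is within ± 0.05 of the true population proportion p, and 0.3 ≤ p ≤ 0.8. This range contains 0.5, so $p_{WC} = 0.5$:

\[ 1.96 \sqrt{\frac{0.5 (1 - 0.5)}{n}} = 0.05 \quad\Longrightarrow\quad n = \frac{1.96^2 (0.5)(1 - 0.5)}{0.05^2} = 384.2 \]

The closest integer is n = 384; rounding up to n = 385 guarantees that the margin is no larger than 0.05.

If 0.85 ≤ p ≤ 1.0, then $p_{WC} = 0.85$:

\[ 1.96 \sqrt{\frac{0.85 (1 - 0.85)}{n}} = 0.05 \quad\Longrightarrow\quad n = \frac{1.96^2 (0.85)(1 - 0.85)}{0.05^2} = 195.9 \]

The closest integer with n ≥ 195.9 is n = 196.

It's easier to estimate proportions closer to 0 or 1: for the same margin, fewer observations are needed.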
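The sample size rule can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original notes. The names `worst_case_p` and `required_n` are hypothetical, and the code assumes the normal-approximation formula $n = z^2 \, p_{WC}(1 - p_{WC}) / \Delta^2$ with the result rounded up to the next integer.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def worst_case_p(p_low, p_high):
    """Value of p in [p_low, p_high] closest to 0.5 (largest standard error)."""
    return min(max(0.5, p_low), p_high)


def required_n(margin, p_low=0.0, p_high=1.0, conf=0.95):
    """Return (exact n, n rounded up) so that z * SE(p_hat) <= margin.

    Uses the worst-case p within the plausible range [p_low, p_high].
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    p_wc = worst_case_p(p_low, p_high)
    n_exact = z ** 2 * p_wc * (1 - p_wc) / margin ** 2
    return n_exact, math.ceil(n_exact)


if __name__ == "__main__":
    # Plausible range contains 0.5, so the worst case is p = 0.5.
    print(required_n(0.05, 0.3, 0.8))
    # Plausible range is near 1, so the worst case is p = 0.85.
    print(required_n(0.05, 0.85, 1.0))
```

Rounding up with `math.ceil` is a deliberate choice here: it guarantees the margin is met, whereas rounding to the nearest integer can leave the margin very slightly above the target.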