Chapter 8: Inference for Means

8.2 Comparing Two Proportions

Overview

Confidence intervals and tests designed to compare two population proportions are based on the difference in the sample proportions

$$D = \hat{p}_1 - \hat{p}_2,$$

where $\hat{p}_i = X_i / n_i$ for $i = 1$ or $2$, and $X_i$ is the number of successes in the sample of size $n_i$. When both sample sizes are sufficiently large, the sampling distribution of the difference $D$ is approximately normal. Inference procedures for comparing proportions are z procedures based on the normal approximation and on standardizing the difference $D$.

The first step is to obtain the mean and standard deviation of $D$. By the addition rule for means, the mean of $D$ is the difference of the means:

$$\mu_D = \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2.$$

That is, the difference $D = \hat{p}_1 - \hat{p}_2$ between the sample proportions is an unbiased estimator of the population difference $p_1 - p_2$. Similarly, because the two samples are independent, the addition rule for variances tells us that the variance of $D$ is the sum of the variances:

$$\sigma_D^2 = \sigma_{\hat{p}_1}^2 + \sigma_{\hat{p}_2}^2 = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}.$$

When $n_1$ and $n_2$ are large, $D$ is approximately normal with mean $\mu_D = p_1 - p_2$ and standard deviation

$$\sigma_D = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}.$$

Significance tests

Significance tests for the equality of the two proportions, $H_0: p_1 = p_2$, use a different standard error for the difference in the sample proportions. It is based on a pooled estimate of the common (under $H_0$) value of $p_1$ and $p_2$:

$$\hat{p} = \frac{\text{number of successes in both samples}}{\text{number of observations in both samples}} = \frac{X_1 + X_2}{n_1 + n_2}.$$

Significance test for Comparing Two Proportions

To test the hypothesis $H_0: p_1 = p_2$, compute the z-statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{D_p}},$$

where the pooled standard error is

$$SE_{D_p} = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

and where

$$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}.$$

In terms of a standard normal random variable $Z$, the P-value for a test of $H_0$ against

$H_a: p_1 > p_2$ is $P(Z \ge z)$
$H_a: p_1 < p_2$ is $P(Z \le z)$
$H_a: p_1 \ne p_2$ is $2P(Z \ge |z|)$

Example. Are men and women college students equally likely to be frequent binge drinkers? We examine the survey data to answer the question.
Here is the data summary:

Population          1 (men)    2 (women)    Total
n                   7180       9916         17096
X                   1630       1684         3314
$\hat{p} = X/n$     0.227      0.170        0.194

The sample proportions are certainly quite different, but we will perform a significance test to see if the difference is large enough to lead us to believe that the population proportions are not equal. Formally, we test the hypotheses

$$H_0: p_1 = p_2$$
$$H_a: p_1 \ne p_2$$

The pooled estimate of the common value of $p$ is

$$\hat{p} = \frac{1630 + 1684}{7180 + 9916} = \frac{3314}{17096} = 0.194.$$

The test statistic is calculated as follows:

$$SE_{D_p} = \sqrt{(0.194)(0.806)\left(\frac{1}{7180} + \frac{1}{9916}\right)} = 0.006126$$

$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{D_p}} = \frac{0.227 - 0.170}{0.006126} = 9.34$$

The proportions are displayed to three decimal places, but the values 0.006126 and 9.34 are obtained by carrying the unrounded proportions through the calculation.

The P-value is $2P(Z \ge 9.34)$. The largest value of $z$ in Table A is 3.49, so from this table we can conclude that

$$P < 2 \times 0.0002 = 0.0004.$$

Since the P-value is less than 0.05, we reject $H_0: p_1 = p_2$ at the 0.05 level. We report: among college students in the study, 22.7% of the men and 17% of the women were frequent binge drinkers; the difference is statistically significant (z = 9.34, P < 0.0004).
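The arithmetic in this example can be checked with a few lines of code. The following is a minimal sketch in Python using only the standard library; it is not part of the original notes. The function name `two_proportion_z_test` and the variable names are hypothetical, chosen only for illustration, and the two-sided P-value is computed from the standard normal tail through `math.erfc` instead of being bounded from Table A.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2 against Ha: p1 != p2.

    Hypothetical helper written for illustration only.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common proportion under H0.
    p_pooled = (x1 + x2) / (n1 + n2)
    # Pooled standard error of the difference in sample proportions.
    se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    # Two-sided P-value: 2 * P(Z >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1_hat, p2_hat, p_pooled, se_pooled, z, p_value


# Counts from the data summary: men (population 1), women (population 2).
p1_hat, p2_hat, p_pooled, se_pooled, z, p_value = two_proportion_z_test(
    1630, 7180, 1684, 9916
)

print(f"p1_hat = {p1_hat:.3f}, p2_hat = {p2_hat:.3f}, pooled = {p_pooled:.3f}")
print(f"SE = {se_pooled:.6f}, z = {z:.2f}, two-sided P = {p_value:.3g}")
```

Because the sketch carries unrounded proportions through every step, it should agree with the standard error 0.006126 and the statistic z = 9.34 reported above. The exact two-sided P-value it prints is far smaller than the bound 0.0004 obtained from Table A, which is consistent with that bound.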