Objectives (PSLS Chapter 22)

The chi-square test for two-way tables

- Two-way tables
- Hypotheses for the chi-square test for two-way tables
- Expected counts in a two-way table
- Conditions for the chi-square test
- Chi-square test for two-way tables
- Simpson's paradox

Two-way tables

Two-way tables organize data about two categorical variables with a finite number of levels/treatments.

High school students were asked whether they smoke and whether their parents smoke:

First factor: parent smoking status (columns). Second factor: student smoking status (rows).

                          Both parents   One parent   Neither parent
Student smokes                 400           416            188
Student does not smoke        1380          1823           1168

Marginal distribution

The marginal distributions (in the "margins" of the table) summarize each factor independently. With two factors, there are two marginal distributions.

Marginal distribution for parental smoking:

P(both parents) = ??/?? = ??%
P(one parent) = ??%
P(neither parent) = ??%

[Bar chart: percent of all students by parental smoking status (both parents, one parent, neither parent)]

Marginal distribution for student smoking:

P(student smokes) = ??/?? = ??%
P(student doesn't) = ??/?? = ??%

[Bar chart: percent of all students by student smoking status (student smokes, student doesn't)]

Conditional distribution

The cells of the two-way table represent the intersection of a given level of one factor with a given level of the other factor. They can be used to compute the conditional distributions.

Conditional distribution of student smoking for different parental smoking statuses:

P(student smokes | both parents) = ??/?? = ??%
P(student smokes | one parent) = ??/?? = ??%
P(student smokes | neither parent) = ??/?? = ??%

Hypotheses

A two-way table has r rows and c columns.

H0: There is no association between the row and column variables.
Ha: There is an association/relationship between the two variables.

We will compare actual counts from the sample data with expected counts given the null hypothesis of no relationship.
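The marginal and conditional distributions described above can be computed directly from the six cell counts. The following is a minimal sketch, not part of the original slides; the dictionary layout and the names `counts`, `parent_marginal`, `student_marginal`, and `student_given_parent` are illustrative assumptions.

```python
# Cell counts from the parental/student smoking table.
# Keys are (parent status, student status); the labels are illustrative.
counts = {
    ("both", "smokes"): 400, ("both", "does not"): 1380,
    ("one", "smokes"): 416, ("one", "does not"): 1823,
    ("neither", "smokes"): 188, ("neither", "does not"): 1168,
}

parents = ["both", "one", "neither"]
students = ["smokes", "does not"]
total = sum(counts.values())

# Marginal distributions: each factor on its own, as a share of all students.
parent_marginal = {
    p: sum(counts[(p, s)] for s in students) / total for p in parents
}
student_marginal = {
    s: sum(counts[(p, s)] for p in parents) / total for s in students
}

# Conditional distribution of student smoking given parental status:
# each cell is divided by its parent-status total, not the table total.
student_given_parent = {
    p: {
        s: counts[(p, s)] / sum(counts[(p, t)] for t in students)
        for s in students
    }
    for p in parents
}

for p in parents:
    print(f"P({p}) = {parent_marginal[p]:.1%}")
for s in students:
    print(f"P(student {s}) = {student_marginal[s]:.1%}")
for p in parents:
    print(f"P(student smokes | {p}) = {student_given_parent[p]['smokes']:.1%}")
```

The only difference between a marginal and a conditional proportion is the denominator: the table total for the former, the total of the conditioning level for the latter.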
Expected counts in a two-way table

A two-way table has r rows and c columns. H0 states that there is no association between the row and column variables (factors) in the table. The expected count in any cell of a two-way table when H0 is true is:

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}$$

Conditions for the chi-square test

The chi-square test for two-way tables looks for evidence of association between two categorical variables (factors) in sample data. The samples can be drawn either:

- By randomly selecting SRSs from different populations (or from a population subjected to different treatments)
    - girls vaccinated for HPV (human papillomavirus) or not among 8th-graders and 12th-graders
    - remission or no remission for different treatments
- Or by taking one SRS and classifying the individuals according to two categorical variables (factors)
    - obesity and ethnicity among high school students

We can safely use the chi-square test when:

- very few (no more than 1 in 5) expected counts are < 5.0
- all expected counts are ≥ 1.0

[Note: If one factor has many levels and too many expected counts are too low, you might be able to "collapse" some of the levels (regroup them) and thus have large enough expected counts.]

The chi-square test for two-way tables

H0: there is no association between the row and column variables
Ha: H0 is not true

The χ² statistic is summed over all r × c cells in the table:

$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$

When H0 is true, the χ² statistic follows approximately a χ² distribution with (r − 1)(c − 1) degrees of freedom.

P-value: P(χ² variable ≥ calculated χ²)

Expected counts computation

Each cell shows the observed count, followed by the computation of the expected count.

                      Student smokes                    Student does not smoke             Row total
Both parents smoke    400   1780×1004/5375 = 332.49     1380   1780×4371/5375 = 1447.51      1780
One parent smokes     416   2239×1004/5375 = 418.22     1823   2239×4371/5375 = 1820.78      2239
Neither smokes        188   1356×1004/5375 = 253.29     1168   ??                            1356
Column total          1004                              4371                                 5375

Chi-square statistic computation

$$
\begin{aligned}
\chi^2 &= \frac{(400 - 332.49)^2}{332.49} + \frac{(1380 - 1447.51)^2}{1447.51} \\
&\quad + \frac{(416 - 418.22)^2}{418.22} + \frac{(1823 - 1820.78)^2}{1820.78} \\
&\quad + \frac{(188 - 253.29)^2}{253.29} + \frac{(1168 - 1102.71)^2}{1102.71} \\
&= 13.709 + 3.149 + 0.012 + 0.003 + 16.829 + 3.866 = 37.566
\end{aligned}
$$

Influence of parental smoking

Here is a computer output for a chi-square test performed on the data from a random sample of high school students (rows are parental smoking habits; columns are the students' smoking habits). What does it tell you?

- Sample size?
- Hypotheses?
- Are the data okay for a χ² test?
- Interpretation?

Table D

Example: with df = 6, if χ² = 15.9 the P-value is between 0.01 and 0.02.

Cocaine addiction

Cocaine produces short-term feelings of physical and mental well-being. To maintain the effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users will feel tired, sleepy, and depressed.

A study compares the rates of successful rehabilitation for cocaine addicts following one of three treatment options:

1. antidepressant treatment (desipramine)
2. standard treatment (lithium)
3. placebo ("sugar pill")

[Chart: observed % and expected % for each treatment; the expected % is 35% for all three]

Expected relapse counts

The first cell is computed as 25×26/74 ≈ 8.78.

               No                   Yes
Desipramine    25×0.351 = 8.78      25×0.649 = 16.22
Lithium        26×0.351 = 9.13      26×0.649 = 16.87
Placebo        23×0.351 = 8.07      23×0.649 = 14.93

Table of counts: "actual/expected," with three rows and two columns:

               No relapse      Relapse
Desipramine    15 / 8.78       10 / 16.22
Lithium         7 / 9.13       19 / 16.87
Placebo         4 / 8.07       19 / 14.93

df = (3 − 1)(2 − 1) = 2

We compute the χ² statistic:

$$
\begin{aligned}
\chi^2 &= \frac{(15 - 8.78)^2}{8.78} + \frac{(10 - ??)^2}{??} \\
&\quad + \frac{(7 - 9.14)^2}{9.14} + \frac{(19 - 16.86)^2}{16.86} \\
&\quad + \frac{(4 - 8.08)^2}{8.08} + \frac{(19 - 14.93)^2}{14.93} \\
&= ??
\end{aligned}
$$

(The expected counts 9.14, 16.86, and 8.08 used here come from the unrounded proportions; the table values 9.13, 16.87, and 8.07 come from the rounded proportions 0.351 and 0.649.)

Using Table D: 10.60 < χ² < 11.98, so ?? > P > ??

The P-value is very small (software gives P = 0.0047) and we reject H0. There is a significant relationship between treatment type (desipramine, lithium, placebo) and outcome (relapse or not).

Interpreting the χ² output

When the χ² test is statistically significant, the largest components indicate which condition(s) are most different from H0.
You can also compare the observed and expected counts, or compare the computed proportions in a graph.

χ² components:

               No relapse   Relapse
Desipramine       4.41        2.39
Lithium           0.50        0.27
Placebo           2.06        1.12

The largest χ² component, 4.41, is for desipramine/no relapse. Desipramine has the highest success rate (see graph).

A 2013 Gallup study investigated how phrasing affects the opinions of Americans regarding physician-assisted suicide. Telephone interviews were conducted with a random sample of 1,535 national adults. Using random assignment, 719 heard the question in Form A ("end the patient's life by some painless means") and 816 the one in Form B ("assist the patient to commit suicide").

                    Allowed   Not allowed   No opinion   Total
"Painless means"      503         194           22        719
"Commit suicide"      416         367           33        816

The chi-square test statistic for these data is χ² = 57.88. Conclude using α = 0.05.

A. There is significant evidence of a relationship between opinions and question wording.
B. We failed to find significant evidence of a relationship between opinions and question wording.
C. The test assumptions are not met.

χ² = 57.88

We found that phrasing significantly (P < 0.0005) influences opinions about physician-assisted suicide. Specifically, the phrasing of "painless means" resulted in a substantially higher approval (70% in favor) than the phrasing of "commit suicide" (51% in favor).

                                                           Should be allowed   Should not be allowed   No opinion   Number interviewed
Form A "End the patient's life by some painless means"           70%                   27%                 3%              719
Form B "Assist the patient to commit suicide"                    51%                   45%                 4%              816

Caution with categorical data

Beware of lurking variables! An association that holds for all of several groups can reverse direction when the data are combined to form a single group. This reversal is called Simpson's paradox.
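As a numerical check on the Gallup example above, the expected counts, the χ² components, and the test statistic can be recomputed from the observed counts alone. The sketch below is an illustration rather than the software used in the slides; the function names `chi_square_two_way` and `p_value_even_df` are hypothetical. The P-value helper uses the closed form of the χ² upper tail, which holds only for an even number of degrees of freedom. That covers this table, since df = (2 − 1)(3 − 1) = 2.

```python
import math


def chi_square_two_way(observed):
    """Return (chi2, df, expected, components) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)

    # Expected count = row total * column total / table total.
    expected = [[r * c / table_total for c in col_totals] for r in row_totals]

    # One component per cell: (observed - expected)^2 / expected.
    components = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    chi2 = sum(sum(row) for row in components)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected, components


def p_value_even_df(chi2, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = chi2 / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# Rows: Form A ("painless means"), Form B ("commit suicide").
# Columns: allowed, not allowed, no opinion.
gallup = [[503, 194, 22], [416, 367, 33]]
chi2, df, expected, components = chi_square_two_way(gallup)
p_value = p_value_even_df(chi2, df)

# Conditions: all expected counts >= 1, at most 1 in 5 below 5.
flat_expected = [e for row in expected for e in row]
conditions_ok = (
    min(flat_expected) >= 1.0
    and sum(e < 5.0 for e in flat_expected) <= len(flat_expected) / 5
)

print(f"chi2 = {chi2:.2f}, df = {df}, P = {p_value:.2e}, conditions ok: {conditions_ok}")
```

The same function applies to any r × c table of counts, for example the cocaine treatment table; only the P-value helper is restricted to even df.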
Kidney stones

A study compared the success rates of two different procedures for removing kidney stones: open surgery and percutaneous nephrolithotomy (PCNL), a minimally invasive technique.

             Open surgery   PCNL
Success          273         289
Failure           77          61
% failure        22%         17%

Can you think of a possible lurking variable here?

The procedures are not chosen randomly by surgeons! In fact, the minimally invasive procedure is most likely used for smaller stones (with a good chance of success) whereas open surgery is likely used for more problematic conditions.

Small stones

             Open surgery   PCNL
Success           81         234
Failure            6          36
% failure         7%         13%

Large stones

             Open surgery   PCNL
Success          192          55
Failure           71          25
% failure        27%         31%

For both small stones and large stones, open surgery has a lower failure rate. This is Simpson's paradox. The more challenging cases with large stones tend to be treated more often with open surgery, making it appear as if the procedure were less reliable overall.
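The reversal can be reproduced from the four stratified cells alone. The sketch below uses the counts from the small-stone and large-stone tables above; the data layout and the names `data`, `by_size`, `combined`, and `large_share` are illustrative assumptions.

```python
# (successes, failures) for each stone size and procedure.
data = {
    "small": {"open surgery": (81, 6), "PCNL": (234, 36)},
    "large": {"open surgery": (192, 71), "PCNL": (55, 25)},
}
procedures = ("open surgery", "PCNL")


def failure_rate(successes, failures):
    """Proportion of failures among all cases."""
    return failures / (successes + failures)


# Failure rate within each stone-size group.
by_size = {
    size: {proc: failure_rate(*cells[proc]) for proc in procedures}
    for size, cells in data.items()
}

# Failure rate after pooling the two groups, and the share of each
# procedure's cases that involved large stones (the lurking variable).
combined = {}
large_share = {}
for proc in procedures:
    successes = sum(data[size][proc][0] for size in data)
    failures = sum(data[size][proc][1] for size in data)
    combined[proc] = failure_rate(successes, failures)
    large_share[proc] = sum(data["large"][proc]) / (successes + failures)

for size in data:
    print(size, {proc: f"{by_size[size][proc]:.0%}" for proc in procedures})
print("combined", {proc: f"{combined[proc]:.0%}" for proc in procedures})
print("large-stone share", {proc: f"{large_share[proc]:.0%}" for proc in procedures})
```

The large-stone share makes the mechanism visible: most open-surgery cases involve large stones, while most PCNL cases involve small ones, so the pooled rates mix the two groups in very different proportions.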