Carolyn Anderson & Youngshil Paek (Slide contributors: Shuai Wang, Michael Culbertson, Yi Zheng & Haiyan Li)
Department of Educational Psychology
University of Illinois at Urbana-Champaign

The rest of the semester will all be hypothesis testing!
- Test of one proportion/mean
- Comparing two proportions/means: independent samples; matched samples
- Comparing the means of multiple groups: one-way ANOVA; two-way ANOVA
- Test of association between two categorical variables: chi-square test; Fisher's exact test
- Comparing groups by ranking: Wilcoxon test and Wilcoxon signed-ranks test; sign test; Kruskal-Wallis test
- Significance of regression models

Key Points
1. Comparing Conditional Proportions
2. Independence vs. Dependence
3. Purpose of Hypothesis Testing

Example: Is There an Association Between Happiness and Family Income?
Standard conventions when constructing tables with conditional distributions:
- Make the response variable the column variable
- Compute conditional proportions for the response variable within each row
- Include the total sample sizes
The percentages in a particular row of such a table are conditional proportions: they form the conditional distribution of happiness, given a particular income level.

Independence vs. Dependence
- Two variables are independent if the population percentage in any category of one variable is the same for all categories of the other variable.
- Two variables are dependent (or associated) if the population percentages in the categories are NOT all the same.
- The conditional distributions in the happiness-by-income table are similar but not exactly identical. It is tempting to conclude that the variables are dependent. However, the definition of independence between variables refers to a population.
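As a concrete aside, the row-wise conditional proportions described above can be computed mechanically from any table of counts. This is a minimal Python sketch (the course's software is R); the counts are hypothetical, since the actual happiness-by-income GSS table appears on the slides only as an image:

```python
# Row-wise conditional proportions for a contingency table.
# NOTE: these counts are hypothetical, for illustration only --
# the real happiness-by-income table is an image on the slide.
table = {
    "Above average": [272, 294, 49],
    "Average":       [454, 835, 131],
    "Below average": [185, 527, 208],
}
for income, counts in table.items():
    n = sum(counts)                              # total sample size for this row
    props = [round(c / n, 3) for c in counts]    # conditional distribution, given income
    print(income, props, "n =", n)
```

Each printed row is one conditional distribution; within rounding, the proportions in a row sum to 1.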
The table is only a sample, not a population.
- Even if variables are independent, we would not expect the sample conditional distributions to be identical.
- Because of sampling variability, each sample percentage typically differs somewhat from the true population percentage.
- The purpose of hypothesis testing is to determine whether the difference in conditional distributions observed in the sample is plausible (merely due to randomness) if the variables were independent.

Independence vs. Homogeneous Distributions
These are different null hypotheses. Independence is a concept covered when we talk about probability.
- H0 (independence): P(row = a and column = b) = P(row = a) × P(column = b)
- H0 (homogeneous distributions): P(column = b | row = a) is the same for every row a, where the conditional proportion in row a is estimated by the cell count divided by n_a, and n_a equals the number in row a.
Different null hypotheses, different conclusions, but the same test statistic!

Key Points Revisited
1. Comparing Conditional Proportions
2. Independence vs. Dependence
3. Purpose of Hypothesis Testing

Key Points
1. A Significance Test for Categorical Variables
2. What Do We Expect for Cell Counts if the Variables Are Independent?
3. The Chi-Squared Test Statistic
4. The Chi-Squared Distribution
5. The Five Steps of the Chi-Squared Test of Independence
6. Chi-Squared and the Test Comparing Proportions in 2x2 Tables
7. Limitations of the Chi-Squared Test
8. Fisher's Exact Test

A Significance Test for Categorical Variables
- Create a table of frequencies divided into the categories of the two variables.
- The hypotheses for the test are H0: the two variables are independent; Ha: the two variables are dependent (associated).
- The test assumes random sampling and a large sample size (expected cell counts in the frequency table all at least 5).

What Do We Expect for Cell Counts if the Variables Are Independent?
The expected cell count is found under the presumption that H0 is true. For a particular cell,

    Expected cell count = (Row total × Column total) / Grand total

The expected frequencies have the same row and column totals as the observed counts, but their conditional distributions are identical (this is the assumption of the null hypothesis).

Example: How to Find Expected Cell Counts?
If party and opinion are independent:

    P(Democrat and Stricter) = P(Democrat) × P(Stricter) = (516/1215) × (1012/1215)

So the expected cell count in Democrat and Stricter is:

    P(Democrat and Stricter) × Grand total = (516/1215) × (1012/1215) × 1215
                                           = (516 × 1012) / 1215
                                           = (Row total × Column total) / Total sample size

                  Should be stricter   Should be less strict   Total
    Democrat      454                  62                      516
    Independent   195                  37                      232
    Republican    363                  104                     467
    Total         1012                 203                     1215

The Chi-Squared Test Statistic
The chi-squared statistic summarizes how far the observed cell counts in a contingency table fall from the expected cell counts for a null hypothesis:

    χ² = Σ over all cells of (observed count − expected count)² / expected count

This is "Pearson's chi-square". The chi-squared statistic will not change if we exchange the response variable and the explanatory variable (i.e., swap the column variable and the row variable).

The larger the χ² value, the greater the evidence against the null hypothesis of independence and in support of the alternative hypothesis that the two variables are associated. To obtain a P-value, we compare the χ² test statistic to the sampling distribution of the χ² test statistic.
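The expected-count formula and Pearson's χ² statistic can be verified numerically for the party-by-opinion table above. A minimal Python sketch (the course uses R; this is only an illustrative hand computation):

```python
# Expected cell counts and Pearson's chi-squared statistic for the
# party-by-opinion table: expected = (row total x column total) / grand total.
observed = [
    [454, 62],   # Democrat:    stricter, less strict
    [195, 37],   # Independent
    [363, 104],  # Republican
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
chi2 = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)
print(round(expected[0][0], 2))  # Democrat/stricter cell: 429.79
print(round(chi2, 2))            # 18.64
```

Note how the Democrat/stricter expected count reproduces the slide's (516 × 1012)/1215 calculation.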
For large sample sizes, this sampling distribution is well approximated by the chi-squared probability distribution.

The Chi-Squared Distribution
Main properties of the chi-squared distribution:
- It falls on the positive part of the real number line.
- The precise shape of the distribution depends on the degrees of freedom: df = (r − 1)(c − 1), where r = number of rows and c = number of columns.
- The mean of the distribution equals the df value.
- It is skewed to the right.
- The larger the χ² value, the greater the evidence against H0: independence.

The Five Steps of the Chi-Squared Test of Independence
1. Assumptions: two categorical variables; independent random samples; expected counts ≥ 5 in all cells.
2. Hypotheses: H0: the two variables are independent; Ha: the two variables are dependent (associated).
3. Test statistic: χ² = Σ (observed count − expected count)² / expected count
4. P-value: right-tail probability above the observed value, for the chi-squared distribution with df = (r − 1)(c − 1).
5. Conclusion: report the P-value and interpret it in context. If P-value ≤ significance level, reject H0: the data support the conclusion that the two variables are associated. If P-value > significance level, fail to reject H0: there is not enough evidence to conclude that the variables are associated (this does not prove independence).

Example: Is There an Association between Happiness and Family Income?
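The right-tail P-value in step 4 can be computed without chi-squared tables. This is a minimal from-scratch Python sketch (hedged: in practice the course uses R's pchisq/chisq.test; the series below is a standard expansion of the regularized incomplete gamma function, written out only to make the step concrete):

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X > x) for a chi-squared distribution
    with df degrees of freedom, via the series expansion of the
    regularized lower incomplete gamma function."""
    s, half_x = df / 2.0, x / 2.0
    # lower incomplete gamma series: sum_n x^n / (s (s+1) ... (s+n))
    term = 1.0 / s
    total = term
    n = 0
    while term >= total * 1e-15:
        n += 1
        term *= half_x / (s + n)
        total += term
    cdf = total * math.exp(-half_x + s * math.log(half_x) - math.lgamma(s))
    return 1.0 - cdf

# df = (r-1)(c-1); a 2x2 table has df = 1, and 3.841 is the familiar
# 5% critical value for df = 1:
print(round(chi2_sf(3.841, 1), 2))  # 0.05
```

For a 2x2 table this also ties back to the two-proportion z test: chi2_sf(z², 1) equals the two-sided normal P-value for z.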
Equivalence of Chi-Squared Test and the Test Comparing Proportions in 2x2 Tables
- Denote the population proportion of success by p1 in group 1 and p2 in group 2.
- If the response variable is independent of the group, the conditional distributions are equal ("homogeneous distributions"), so p1 = p2.
- "H0: p1 = p2" in the test comparing two proportions (Lecture 10) is equivalent to "H0: the two variables are independent" in the chi-squared test.
- The test statistics are also related: z² = χ², where z = (p̂1 − p̂2) / se0.

Limitations of the Chi-Squared Test
- If the P-value is very small, strong evidence exists against the null hypothesis of independence. But...
- The chi-squared statistic and the P-value tell us nothing about the nature of the association.
- We know that there is statistical significance, but the test alone does not indicate whether there is practical significance as well.

Looking at the Nature of Association
There are many measures, but we will only look at "residuals". Under independence, the value of

    (observed − expected) / √expected

is approximately normal with mean 0 and standard deviation 1. These standardized residuals can be used to describe the "mis-fit" of independence. Example from the GSS...

Two Items from the GSS
- Item 1: A working mother can establish just as warm and secure a relationship with her children as a mother who does not work.
- Item 2: Working women should have paid maternity leave.
The observed frequencies are below (Item 1 response in the columns, Item 2 response in the rows).

The Data (observed frequencies)

    Item 2 (paid leave)   Item 1: Strongly agree   Agree   Disagree   Strongly disagree   Total
    Strongly agree        97                       102     42         9                   250
    Agree                 96                       199     102        18                  415
    Neither               22                       48      25         7                   102
    Disagree              17                       38      36         10                  101
    Strongly disagree     2                        5       7          2                   16
    Total                 234                      392     212        46                  884

The Expected Frequencies

    Item 2 (paid leave)   Item 1: Strongly agree   Agree    Disagree   Strongly disagree   Total
    Strongly agree        66.18                    110.86   59.96      13.01               250
    Agree                 109.85                   184.03   99.53      21.60               415
    Neither               27.00                    45.23    24.46      5.31                102
    Disagree              26.74                    44.79    24.22      5.26                101
    Strongly disagree     4.24                     7.10     3.84       0.83                16
    Total                 234                      392      212        46                  884

The Test
H0: independence vs. Ha: dependence
X² = 47.576, df = (4 − 1)(5 − 1) = 12, P-value < .01
The data support the conclusion that responses to the items are dependent (i.e., reject H0). We'll look at standardized residuals to see why we rejected H0.

Standardized Residuals
What is the nature of the dependency?

    Item 2 (paid leave)   Item 1: Strongly agree   Agree   Disagree   Strongly disagree
    Strongly agree        3.79                     -0.84   -2.32      -1.11
    Agree                 -1.32                    1.10    0.25       -0.77
    Neither               -0.96                    0.41    0.11       0.73
    Disagree              -1.88                    -1.01   2.39       2.07
    Strongly disagree     -1.09                    -0.79   1.61       1.28

Limitations of the Chi-Squared Test
The chi-squared test is often misused. Some examples are:
- When some of the expected frequencies are too small
- When separate rows or columns are dependent samples, e.g., marginal proportions (for dependent samples, we use McNemar's test)
- When data are not random
- When quantitative data are classified into categories, which results in loss of information

"Goodness of Fit" Chi-Squared Tests
The chi-squared test can also be used for testing particular proportion values for a categorical variable. The null hypothesis is that the distribution of the variable follows a given probability distribution; the alternative is that it does not.
The test statistic is calculated in the same manner, where the expected counts are what would be expected in a random sample from the hypothesized probability distribution. McNemar's test is a "goodness of fit" test.

Fisher's Exact Test
- The chi-squared test of independence is a large-sample test. When the expected frequencies are small (any of them less than about 5), small-sample tests are more appropriate.
- Fisher's exact test is a small-sample test of independence for two-way tables.
- The calculations for Fisher's exact test are tedious; statistical software can be used to obtain the P-value.
- The smaller the P-value, the stronger the evidence that the variables are associated.

How to do chi-square in R?

    data <- matrix(c(9, 4, 15, 12, 5, 10, 15, 4), nrow = 2)
    chisq.test(data, correct = FALSE)
    fisher.test(data)

Illinois Admission Scandal

                  Admission to Illinois
                  No       Yes      Total
    "I"-list      37       123      160
    General       8000     18000    26000
    Total         8037     18123    26160

A Small Sample Example: "Imposing Views, Imposing Shoes"
Alper & Raymond (1995): Classes were randomly assigned to one of two groups: professors wore Nikes or not. After three times per week for 14 weeks, they checked whether students purchased Nikes.

                            Students buy Nikes?
                            Yes   No
    Professor wore Nikes?
    Yes                     4     6
    No                      7     9

Fisher's P-value = 1.00 (chi-square test P-value = .83)

Key Points Revisited
1. A Significance Test for Categorical Variables
2. What Do We Expect for Cell Counts if the Variables Are Independent?
3. The Chi-Squared Test Statistic
4. The Chi-Squared Distribution
5. The Five Steps of the Chi-Squared Test of Independence
6. Chi-Squared and the Test Comparing Proportions in 2x2 Tables
7. Limitations of the Chi-Squared Test
8. Fisher's Exact Test
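For a 2x2 table, Fisher's exact two-sided P-value can be computed directly from the hypergeometric distribution, by summing the probabilities of all tables (with the same margins) that are no more likely than the observed one. A minimal Python sketch of this idea for the Nike table above (the slides use R's fisher.test; the function name here is my own):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table."""
    r1, r2 = a + b, c + d            # row totals
    c1, n = a + c, a + b + c + d     # first column total, grand total
    denom = comb(n, c1)
    def prob(k):                     # P(k successes land in the first row)
        return comb(r1, k) * comb(r2, c1 - k) / denom
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-7))

# Nike example from the slide: observed table [[4, 6], [7, 9]]
print(round(fisher_exact_2x2(4, 6, 7, 9), 2))  # 1.0
```

Here the observed table is the most probable one given the margins, so every table counts toward the sum and the P-value is 1.00, matching the slide.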