Comparing two categorical variables

Carolyn Anderson & Youngshil Paek
(Slide contributors: Shuai Wang, Michael
Culbertson, Yi Zheng & Haiyan Li)
Department of Educational Psychology
University of Illinois at Urbana-Champaign
The rest of the semester will all be hypothesis
testing!
 Test of one proportion/mean
 Comparing two proportions/means
 Independent samples
 Matched samples
 Comparing the means of multiple groups
 One-way ANOVA
 Two-way ANOVA
 Test of association between two categorical variables
 Chi-square test
 Fisher’s Exact test
 Compare groups by ranking
 Wilcoxon test and Wilcoxon signed-ranks test
 Sign test
 Kruskal-Wallis test
 Significance of regression models
Key Points
1. Comparing Conditional Proportions
2. Independence vs. Dependence
3. Purpose of Hypothesis Testing
Example: Is There an Association Between
Happiness and Family Income?
Example: Is There an Association Between
Happiness and Family Income?
 Standard conventions when constructing tables
with conditional distributions:
 Make the response variable the column variable
 Compute conditional proportions for the
response variable within each row
 Include the total sample sizes
Example: Is There an Association Between
Happiness and Family Income?
 The percentages in a particular row of the table are conditional proportions
 They form the conditional distribution of happiness, given a particular income level
Example: Is There an Association Between
Happiness and Family Income?
Independence vs. Dependence
 Two variables are independent if the population
percentage in any category of one variable is the same
for all categories of the other variable
 Two variables are dependent (or associated) if the population percentages in the categories are NOT all the same
Independence vs. Dependence
 The conditional distributions in the table below are
similar but not exactly identical. It is tempting to
conclude that the variables are dependent.
 However, the definition of independence between
variables refers to a population. The table is only a
sample, not a population
Independence vs. Dependence
 Even if variables are independent, we would not expect
the sample conditional distributions to be identical
 Because of sampling variability, each sample
percentage typically differs somewhat from the true
population percentage
 The purpose of hypothesis testing is to determine whether the difference in conditional distributions observed in the sample would be plausible (merely due to sampling variability) if the variables were independent.
Independence vs. Homogeneous
Distributions
 These are different null hypotheses
 Independence is a concept covered when we talk about
probability
 Ho: Independence means that
P(row = a & col = b) = P(row = a) P(col = b)
 Ho: Homogeneous distributions means that
P(column = b | row = a) is the same for every row a,
so the expected count in cell (a, b) is P(column = b | row = a) × na,
where na equals the number in row a.
Different null hypotheses, different conclusions,
but same test statistic!
Key Points Revisited
1. Comparing Conditional Proportions
2. Independence vs. Dependence
3. Purpose of Hypothesis Testing
Key Points
1. A Significance Test for Categorical Variables
2. What Do We Expect for Cell Counts if the Variables Are Independent?
3. The Chi-Squared Test Statistic
4. The Chi-Squared Distribution
5. The Five Steps of the Chi-Squared Test of Independence
6. Chi-Squared and the Test Comparing Proportions in 2x2 Tables
7. Limitations of the Chi-Squared Test
8. Fisher’s Exact Test
A Significance Test for Categorical Variables
 Create a table of frequencies divided into the
categories of the two variables. The hypotheses for
the test are:
H0: The two variables are independent
Ha: The two variables are dependent
(associated)
The test assumes random sampling and a large sample size (all expected cell counts in the frequency table are at least 5)
What Do We Expect for Cell Counts if the
Variables Are Independent?
 The expected cell count is found under the
presumption that H0 is true
 Expected Cell Count: For a particular cell,
Expected cell count = (Row total × Column total) / Grand total
 The expected frequencies are values that have
the same row and column totals as the observed
counts, but for which the conditional
distributions are identical (this is the
assumption of the null hypothesis).
Example: How to Find Expected Cell Counts?
If party and opinion are independent:
P(Democrat and Stricter) = P(Democrat) × P(Stricter) = (516/1215) × (1012/1215)
So the expected cell count in the Democrat and Stricter cell is:
P(Democrat and Stricter) × Grand total = (516/1215) × (1012/1215) × 1215
= (516 × 1012)/1215 = (Row total × Column total)/(Total sample size) ≈ 429.8

                Should be    Should be
                stricter     less strict    Total
Democrat           454           62           516
Independent        195           37           232
Republican         363          104           467
Total             1012          203          1215
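The calculation above can be sketched in a few lines of Python (not part of the original slides; the counts are the party-by-opinion table from this example):

```python
# Sketch: expected cell counts under independence for the
# party-by-opinion table on this slide.
observed = [
    [454, 62],   # Democrat: stricter, less strict
    [195, 37],   # Independent
    [363, 104],  # Republican
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

print(round(expected[0][0], 1))  # Democrat & "Should be stricter" -> 429.8
```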
The Chi-Squared Test Statistic
 The chi-squared statistic summarizes how far the
observed cell counts in a contingency table fall from the
expected cell counts for a null hypothesis
χ² = Σall cells (observed count − expected count)² / (expected count)
 “Pearson’s chi-square”
 The chi-squared statistic will not change if we
exchange the response variable and the explanatory
variable (i.e., exchange the column variable and
row variable).
The Chi-Squared Test Statistic
 The larger the χ² value, the greater the evidence
against the null hypothesis of independence and in
support of the alternative hypothesis that the two
variables are associated
 To obtain a P-value, we compare χ2 test statistic
to the sampling distribution of the χ2 test
statistic.
 For large sample sizes, this sampling distribution
is well approximated by the chi-squared
probability distribution
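As a sketch (using the party-by-opinion counts from the earlier expected-counts example), the statistic and its P-value can be computed in stdlib Python; the closed-form tail probability used here is valid only for even df:

```python
# Sketch: Pearson chi-squared statistic and its P-value.
# For even df, P(X > x) = exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
import math

observed = [[454, 62], [195, 37], [363, 104]]
row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r-1)(c-1) = 2

def chi2_sf_even_df(x, df):
    """Right-tail probability of the chi-squared distribution, even df only."""
    return math.exp(-x / 2) * sum((x / 2) ** k / math.factorial(k)
                                  for k in range(df // 2))

p_value = chi2_sf_even_df(chi2, df)
print(round(chi2, 2), p_value)  # chi2 about 18.64, p well below .001
```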
The Chi-Squared Distribution
The Chi-Squared Distribution
 Main properties of the chi-squared distribution:
 It falls on the positive part of the real number line
 The precise shape of the distribution depends on the
degrees of freedom:
df = (r – 1)(c – 1)
where r=number of rows and c=number of columns
 The mean of the distribution equals the df value
 It is skewed to the right
 The larger the χ2 value, the greater the evidence
against H0: independence
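As an illustrative sketch (the simulation setup is mine, not the slides'), the "mean equals df" property can be checked by simulating chi-squared draws as sums of squared standard normals:

```python
# Sketch: a chi-squared variable with df degrees of freedom is a sum of
# df squared standard normal draws; its sample mean should be close to df.
import random

random.seed(0)  # reproducible
df = 4
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(df))
         for _ in range(20000)]
mean = sum(draws) / len(draws)
print(round(mean, 2))  # close to df = 4
```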
The Chi-Squared Distribution
The Five Steps of the Chi-Squared Test of
Independence
1. Assumptions:
 Two categorical variables
 Independent random samples
 Expected counts ≥ 5 in all cells
The Five Steps of the Chi-Squared Test of
Independence
2. Hypotheses:
 H0: The two variables are independent
 Ha: The two variables are dependent
(associated)
3. Test Statistic:
χ² = Σ (observed count − expected count)² / (expected count)
The Five Steps of the Chi-Squared Test of
Independence
4. P-value:
Right-tail probability above the observed value, for the chi-squared distribution with df = (r – 1)(c – 1)
5. Conclusion:
Report P-value and interpret in context
 P-value ≤ significance level → reject H0; the data support the conclusion that the two variables are associated.
 P-value > significance level → fail to reject H0; the data do not provide sufficient evidence that the two variables are associated.
Example: Is There an Association between
Happiness and Family Income?
Equivalence of Chi-Squared Test and the Test
Comparing Proportions in 2x2 Tables
 Denote the population proportion of success by p1
in group 1 and p2 in group 2
 If the response variable is independent of the group, the conditional distributions are equal (“homogeneous distributions”), so p1 = p2
 The “H0: p1 = p2” in the test comparing two proportions (Lecture 10) is equivalent to the “H0: two variables are independent” in the Chi-squared test
 The test statistics are also related:
z² = χ², where z = (p̂1 − p̂2)/se0
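This equivalence is easy to verify numerically. Here is a sketch with a hypothetical 2x2 table (the counts are made up for illustration); se0 uses the pooled proportion:

```python
# Sketch: for a 2x2 table, z^2 from the two-proportion test (pooled se)
# equals Pearson's chi-squared. The counts below are hypothetical.
import math

successes = [30, 45]   # hypothetical successes in group 1 and group 2
n = [100, 100]         # hypothetical group sizes

p1, p2 = successes[0] / n[0], successes[1] / n[1]
pooled = sum(successes) / sum(n)
se0 = math.sqrt(pooled * (1 - pooled) * (1 / n[0] + 1 / n[1]))
z = (p1 - p2) / se0

# Pearson chi-squared for the same 2x2 table
observed = [[30, 70], [45, 55]]
row_t = [sum(r) for r in observed]
col_t = [sum(c) for c in zip(*observed)]
total = sum(row_t)
chi2 = sum((observed[i][j] - row_t[i] * col_t[j] / total) ** 2
           / (row_t[i] * col_t[j] / total)
           for i in range(2) for j in range(2))

print(round(z ** 2, 4), round(chi2, 4))  # both 4.8
```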
Limitations of the Chi-Squared Test
 If the P-value is very small, strong evidence
exists against the null hypothesis of
independence
But…
 The chi-squared statistic and the P-value tell us
nothing about the nature of the association
 We know that there is statistical significance,
but the test alone does not indicate whether
there is practical significance as well
Looking at Nature of Association
 There are many measures but we will only look at
“residuals”.
 The value of
(observed − expected) / √(expected)
is approximately normal with mean = 0 and standard deviation = 1.
 These standardized residuals can be used to describe
the “mis-fit” of independence.
 Example from GSS…
Two items from GSS
 Item 1: A working mother can establish just as warm and secure a relationship with her children as a mother who does not work.
 Item 2: Working women should have paid maternity
leave.
 The observed frequencies are on the next slide
The Data

                       Item 1
                 Strongly                        Strongly
Item 2           agree      Agree    Disagree    disagree    Total
Strongly agree      97        102        42          9         250
Agree               96        199       102         18         415
Neither             22         48        25          7         102
Disagree            17         38        36         10         101
Strongly disagree    2          5         7          2          16
Total              234        392       212         46         884
The Expected Frequencies

                     Mom Establishes Warm Relationship (Item 1)
Paid maternity   Strongly                        Strongly
leave (Item 2)   agree      Agree    Disagree    disagree    Total
Strongly agree    66.18     110.86     59.96       13.01       250
Agree            109.85     184.03     99.53       21.60       415
Neither           27.00      45.23     24.46        5.31       102
Disagree          26.74      44.79     24.22        5.26       101
Strongly disagree  4.24       7.10      3.84        0.83        16
Total               234        392       212          46       884
The Test
 Ho: Independence vs Ha: Dependence
 X² = 47.576
 df = (4 − 1)(5 − 1) = 12
 P-value < .01
 The data support the conclusion that responses to the two items are dependent (i.e., reject Ho).
 We’ll look at standardized residuals to see why we rejected Ho
Standardized Residuals
What is the nature of the dependency?

                     Mom Establishes Warm Relationship (Item 1)
Paid maternity   Strongly                        Strongly
leave (Item 2)   agree      Agree    Disagree    disagree    Total
Strongly agree     3.79      -0.84     -2.32       -1.11       250
Agree             -1.32       1.10      0.25       -0.77       415
Neither           -0.96       0.41      0.11        0.73       102
Disagree          -1.88      -1.01      2.39        2.07       101
Strongly disagree -1.09      -0.79      1.61        1.28        16
Total               234        392       212          46       884
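A sketch (not from the slides) that recomputes the statistic and the standardized residuals from the observed counts above:

```python
# Sketch: chi-squared statistic and standardized residuals for the
# GSS two-item table (Item 2 rows, Item 1 columns).
import math

observed = [
    [97, 102, 42, 9],
    [96, 199, 102, 18],
    [22, 48, 25, 7],
    [17, 38, 36, 10],
    [2, 5, 7, 2],
]
row_t = [sum(r) for r in observed]
col_t = [sum(c) for c in zip(*observed)]
n = sum(row_t)

expected = [[r * c / n for c in col_t] for r in row_t]
residuals = [[(o - e) / math.sqrt(e) for o, e in zip(orow, erow)]
             for orow, erow in zip(observed, expected)]
chi2 = sum(r ** 2 for row in residuals for r in row)

print(round(chi2, 2))             # about 47.58, matching the slide's 47.576
print(round(residuals[0][0], 2))  # 3.79 for "strongly agree" on both items
```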
Limitations of the Chi-Squared Test
 The chi-squared test is often misused. Some
examples are:
 When some of the expected frequencies are too
small
 When separate rows or columns are dependent samples, e.g., when comparing marginal proportions (for dependent samples, we use McNemar’s Test)
 When data are not random
 When quantitative data are classified into
categories --- results in loss of information
“Goodness of Fit” Chi-Squared Tests
 The Chi-Squared test can also be used for testing
particular proportion values for a categorical
variable.
 The null hypothesis is that the distribution of the
variable follows a given probability distribution; the
alternative is that it does not.
 The test statistic is calculated in the same manner
where the expected counts are what would be
expected in a random sample from the hypothesized
probability distribution.
 McNemar’s test is a “goodness of fit” test.
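A minimal sketch with hypothetical die-roll counts (testing H0: all six faces equally likely), to show how the same statistic is built from hypothesized proportions:

```python
# Sketch: chi-squared goodness-of-fit test with hypothetical die-roll
# counts against H0: all faces have probability 1/6.
observed = [18, 22, 21, 15, 25, 19]       # hypothetical counts of faces 1-6
n = sum(observed)                         # 120 rolls
hypothesized = [1 / 6] * 6                # H0 proportions
expected = [n * p for p in hypothesized]  # 20 per face

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                    # number of categories minus 1

print(round(chi2, 2), df)  # 3.0 with df = 5
```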
Fisher’s Exact Test
 The chi-squared test of independence is a large-sample test.
 When the expected frequencies are small, any of them being less than about 5, small-sample tests are more appropriate.
 Fisher’s exact test is a small-sample test of independence for 2-way tables.
 The calculations for Fisher’s exact test are tedious; statistical software can be used to obtain the P-value.
 The smaller the P-value, the stronger the evidence that the variables are associated.
How to do chi-square in R?
# enter the observed counts as a 2 x 4 table (filled column by column)
data <- matrix(c(9, 4, 15, 12, 5, 10, 15, 4), nrow = 2)
# Pearson chi-squared test of independence (no continuity correction)
chisq.test(data, correct = FALSE)
# Fisher's exact test
fisher.test(data)
Illinois Admission Scandal

Admission to Illinois
              No        Yes      Total
“I”-list       37        123       160
General      8000      18000     26000
Total        8037      18123     26160
A Small Sample Example
 “Imposing Views, Imposing Shoes”
 Alper & Raymond (1995): Classes were randomly assigned to one of 2 groups: professors wore Nikes or not. After 3 times/week for 14 weeks, checked to see if students purchased Nikes

                    Students Buy Nikes?
                      Yes     No
Professor     Yes      4       6
wore Nikes?   No       7       9

Fisher’s p-value = 1.00 (chi-square test p-value = .83)
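Fisher's exact test for this 2x2 table can be sketched with the hypergeometric distribution; the two-sided P-value is the sum of the probabilities of all tables (with the same margins) no more likely than the observed one:

```python
# Sketch: two-sided Fisher's exact test for the Nike 2x2 table, using the
# hypergeometric probability of each table with the same margins.
from math import comb

a, b, c, d = 4, 6, 7, 9         # observed 2x2 counts
row1, row2 = a + b, c + d       # 10, 16
col1, n = a + c, a + b + c + d  # 11, 26

def table_prob(x):
    """P(top-left cell = x) under fixed margins (hypergeometric)."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

p_obs = table_prob(a)
lo, hi = max(0, col1 - row2), min(row1, col1)
p_value = sum(table_prob(x) for x in range(lo, hi + 1)
              if table_prob(x) <= p_obs * (1 + 1e-9))

print(round(p_value, 2))  # 1.0, matching the slide
```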
Key Points Revisited
1. A Significance Test for Categorical Variables
2. What Do We Expect for Cell Counts if the Variables Are Independent?
3. The Chi-Squared Test Statistic
4. The Chi-Squared Distribution
5. The Five Steps of the Chi-Squared Test of Independence
6. Chi-Squared and the Test Comparing Proportions in 2x2 Tables
7. Limitations of the Chi-Squared Test
8. Fisher’s Exact Test