Chapter 12
Lesson 12.2a
Comparing Two Populations or Treatments
12.2: Test for Homogeneity and
Independence in a Two-way Table
X² Test for Homogeneity
Null Hypothesis:
H0: The distribution is the same for all the different
groups
Alternative Hypothesis:
Ha: The distribution is not the same for all the different
groups
Test Statistic:
$$X^2 = \sum_{\text{all cells}} \frac{(\text{observed cell count} - \text{expected cell count})^2}{\text{expected cell count}}$$
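As a rough illustration of the test statistic formula, the sketch below adds up (observed - expected)² / expected over every cell of a two-way table. The function name `chi_square_statistic` and the small 2 x 2 table are made up for illustration; they are not part of the lesson.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over all cells of a two-way table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for obs, exp in zip(obs_row, exp_row):
            total += (obs - exp) ** 2 / exp
    return total


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
observed = [[10, 20], [20, 10]]
expected = [[15.0, 15.0], [15.0, 15.0]]

x2 = chi_square_statistic(observed, expected)
print(round(x2, 3))
```

Each of the four cells contributes 25/15 here, so the statistic is 4 times that amount.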
X² Test for Homogeneity
Continued . . .
Assumptions:
1) The data are in counts.
2) Data are from random samples or from subjects
who were assigned at random to treatment
groups.
3) All expected cell counts are at least 5. (Columns
can be combined if this is not true)
X² Test for Homogeneity
Continued . . .
Expected Counts: (assuming H0 is true)
$$\text{expected cell count} = \frac{(\text{row marginal total})(\text{column marginal total})}{\text{grand total}}$$
df = (number of rows – 1)(number of columns – 1).
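The expected-count and df rules can be sketched in a few lines. This is a minimal example under stated assumptions: the helper names `expected_counts` and `degrees_of_freedom` and the 2 x 3 table are hypothetical, and the input holds only the observed counts (no totals row or column).

```python
def expected_counts(observed):
    """Expected count for each cell: (row total)(column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(observed):
    """df = (number of rows - 1)(number of columns - 1), totals not included."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical 2 x 3 table of observed counts (totals are NOT part of the input).
observed = [[10, 20, 30], [20, 10, 30]]

expected = expected_counts(observed)
df = degrees_of_freedom(observed)
print(expected)
print(df)
```

Because the totals are computed from the counts themselves, there is no risk of accidentally counting a totals row or column when finding df.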
P-value:
The P-value associated with the computed test statistic
value is the area to the right of X² under the appropriate
chi-square curve.
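For readers who want to see the "area to the right" as a number, here is a small sketch. It relies on a closed form that holds only when df is even (which covers the df values used in this lesson); for odd df a chi-square table or statistical software would be needed instead. The function name is hypothetical.

```python
import math


def chi_square_upper_tail_even_df(x2, df):
    """Area to the right of x2 under the chi-square curve, for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x2 / 2.0
    # For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# 9.488 is the familiar 5% critical value for df = 4, so this is close to 0.05.
p_value = chi_square_upper_tail_even_df(9.488, 4)
print(round(p_value, 3))
```

A larger X² value moves further into the right tail, so the P-value shrinks as the observed counts move away from the expected counts.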
A study was conducted to determine if collegiate soccer
players had an increased risk of concussions over other
athletes or students. The two-way frequency table below
displays the number of previous concussions for students
in independently selected random samples of 91 soccer
players, 96 non-soccer athletes, and 53 non-athletes.

                      Number of Concussions
                      0      1      2      3 or more   Total
Soccer Players        45     25     11     10          91
Non-Soccer Players    68     15     8      5           96
Non-Athletes          45     5      3      0           53
Total                 158    45     22     15          240

The counts in the body of the table are the observed counts.
The row and column totals are the marginal totals.
The value 240 is the grand total.
Soccer Players Continued . . .
State the hypotheses.
H0: The distribution of the number of concussions is the same for
all three categories.
Ha: The distribution of the number of concussions is not the same
for all three categories.
To find df, count the number of rows and columns – not including
totals!
df = (number of rows – 1)(number of columns – 1)
Another way to find df – you can also cover one row and one column,
then count the number of cells left (not including totals).
df = (2)(3) = 6
Soccer Players Continued . . .
                      Number of Concussions
                      0            1            2           3 or more   Total
Soccer Players        45 (59.9)    25 (17.1)    11 (8.3)    10 (5.7)    91
Non-Soccer Players    68 (63.2)    15 (18.0)    8 (8.8)     5 (6.0)     96
Non-Athletes          45 (34.9)    5 (10.0)     3 (4.9)     0 (3.3)     53
Total                 158          45           22          15          240

Expected counts are shown in the parentheses next to the observed
counts.
Notice that NOT all the expected counts are at least 5.
So combine the column for 2 concussions and the column for 3 or more
concussions into a single "2 or more" column, with observed (expected)
counts 21 (14.0), 13 (14.8), and 3 (8.2).
This combined table has df = (2)(2) = 4.
Test Statistic:
$$X^2 = \frac{(45 - 59.9)^2}{59.9} + \cdots + \frac{(3 - 8.2)^2}{8.2} = 20.6$$
P-value < .001
α = .05
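The steps on this slide can be checked with a short sketch that uses only the observed counts from the lesson's table. The helper name `expected_counts` is hypothetical, and the P-value line relies on the closed form for the chi-square upper tail that applies when df = 4. The computed expected counts should agree with the parenthesized values to within about 0.1, since the slide values are rounded.

```python
import math

# Observed counts from the lesson's table; columns are 0, 1, 2, "3 or more".
observed = [
    [45, 25, 11, 10],  # Soccer Players
    [68, 15, 8, 5],    # Non-Soccer Players
    [45, 5, 3, 0],     # Non-Athletes
]


def expected_counts(table):
    """(row total)(column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Assumption check: the smallest expected count in the 4-column table is below 5.
smallest_before = min(min(row) for row in expected_counts(observed))

# Combine the last two columns into a single "2 or more" column.
combined = [row[:2] + [row[2] + row[3]] for row in observed]
expected = expected_counts(combined)
smallest_after = min(min(row) for row in expected)

x2 = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(combined, expected)
    for obs, exp in zip(obs_row, exp_row)
)
df = (len(combined) - 1) * (len(combined[0]) - 1)

# df = 4 is even, so the upper-tail area has the closed form exp(-x/2)(1 + x/2).
p_value = math.exp(-x2 / 2) * (1 + x2 / 2)

print(round(smallest_before, 1), round(smallest_after, 1))
print(round(x2, 1), df, p_value < 0.001)
```

Combining columns before computing the statistic matters: the expected-count assumption has to hold for the table that is actually tested, not the original one.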
Soccer Players Continued . . .
                      Number of Concussions
                      0            1            2 or more   Total
Soccer Players        45 (59.9)    25 (17.1)    21 (14.0)   91
Non-Soccer Players    68 (63.2)    15 (18.0)    13 (14.8)   96
Non-Athletes          45 (34.9)    5 (10.0)     3 (8.2)     53
Total                 158          45           37          240
Since the P-value < α, we reject H0. There is
enough evidence to suggest the category proportions for the
number of concussions are not the same for the 3 groups.
Is that all I can say – that there is a difference in proportions for
the groups?
We can look at the chi-square contributions – which of the cells
above have the greatest contributions to the value of the
X² statistic?
The cells highlighted on the slide had the largest contributions to
the X² test statistic.
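One way to answer that question is to list each cell's contribution, (observed - expected)² / expected, from largest to smallest. The sketch below does this for the combined table; the variable names are illustrative only, and the expected counts are recomputed from the observed counts rather than taken from the rounded values on the slide.

```python
# Combined table from the lesson; columns are 0, 1, "2 or more".
groups = ["Soccer Players", "Non-Soccer Players", "Non-Athletes"]
categories = ["0", "1", "2 or more"]
observed = [
    [45, 25, 21],
    [68, 15, 13],
    [45, 5, 3],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# One (contribution, group, category) entry per cell.
contributions = []
for i, group in enumerate(groups):
    for j, category in enumerate(categories):
        expected = row_totals[i] * col_totals[j] / grand_total
        cell = (observed[i][j] - expected) ** 2 / expected
        contributions.append((cell, group, category))

# Largest contributions first.
contributions.sort(reverse=True)
for cell, group, category in contributions:
    print(f"{group:20s} {category:10s} {cell:.2f}")
```

In this sketch the three Non-Soccer Players cells come out with the smallest contributions, while the Soccer Players and Non-Athletes cells account for most of the X² value. Comparing observed with expected counts in those cells shows the direction of each difference.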
Practice Handout
• Medical Researchers…
Homework
• Pg.722: #12.14, 16, 21
–(extra practice #23)