One-Way ANOVA F tests - UAB School of Public Health

Relationship of One-Way and Two-Way ANOVA
One-Way ANOVA Source Table
ANOVA MODEL: Yij = µ* + αj + εij
H0: µ1 = µ2 = . . . = µJ or H0: Σα²j = 0

Source                                Sum of Squares                                                 df      Mean Squares           F
Between Groups (Explained Variance)   SSB = \sum_{j=1}^{J} n_j (\bar{Y}_j - \bar{Y}_*)^2              J − 1   MSB = SSB/(J − 1)      MSB/MSW = [(N − J) SSB] / [(J − 1) SSW]
Within Groups (Error Variance)        SSW = \sum_{j=1}^{J} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2    N − J   MSW = SSW/dfW
Total Variance                        SST = \sum_{i=1}^{N} (Y_i - \bar{Y}_*)^2                        N − 1   s²Y = SST/(N − 1)
where N = total number of cases, J = number of groups, Ȳ* = the grand mean of Y across all groups,
Yi = each individual score on Y, Ȳj = the mean for group j, and nj = the number of cases in group j.
R² = η² = SSB/SST is the Proportion of Variance Explained by Group Differences. This is also known as the
Proportional Reduction in Error (PRE).
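The source-table formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original handout (which uses SAS and SPSS); the function name `one_way_anova` and the dictionary layout are illustrative assumptions. The scores are the example data listed in the SAS program later in this handout.

```python
# Scores by group, copied from the example data set used later in this handout.
groups = {
    "OCT": [2, 3, 2, 4, 4],
    "MCH": [5, 6, 7, 4, 8],
    "NTS": [10, 9, 9, 8, 9],
    "EPI": [5, 7, 6, 7, 5],
}


def one_way_anova(groups):
    """Return the one-way ANOVA source-table quantities (hypothetical helper)."""
    scores = [y for ys in groups.values() for y in ys]
    n_total, n_groups = len(scores), len(groups)
    grand_mean = sum(scores) / n_total

    # Between-groups SS: n_j times squared distance of each group mean from the grand mean.
    ssb = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-groups SS: squared distance of each score from its own group mean.
    ssw = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys)
    # Total SS: squared distance of each score from the grand mean.
    sst = sum((y - grand_mean) ** 2 for y in scores)

    msb = ssb / (n_groups - 1)
    msw = ssw / (n_total - n_groups)
    return {
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "dfB": n_groups - 1, "dfW": n_total - n_groups,
        "MSB": msb, "MSW": msw,
        "F": msb / msw,
        "eta_sq": ssb / sst,  # R² = η² = SSB/SST
    }


table = one_way_anova(groups)
for key, value in table.items():
    print(f"{key:7s} {value:.4f}")
```

The p-value is left out on purpose: it needs the F distribution, which the Python standard library does not provide.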
Pairwise Post-Hoc Comparisons of Means
However, a statistically significant F ratio only indicates that, if the Null Hypothesis
H0: µ1 = µ2 = . . . = µJ
were true, results this extreme would be unlikely to occur by chance. We interpret this as, “at least one mean is
different.”
We don’t know which one(s)!
A common approach is to use a post-hoc test for group comparisons. I personally like Tukey’s Honestly
Significant Difference (HSD) for pairwise comparisons of means. Tukey’s HSD controls for inflation of the
Type I error rate when J(J - 1)/2 pairwise comparisons are made. Because of this adjustment of the
significance level, it is possible to obtain a statistically significant F-ratio when no pairwise comparisons are
significant. The q statistic for Tukey’s HSD can be computed as follows:
\[ q = \frac{\bar{Y}_L - \bar{Y}_S}{\sqrt{\dfrac{MS_W}{2}\left(\dfrac{1}{n_L} + \dfrac{1}{n_S}\right)}} \]
Then q is compared to a critical value obtained from the Studentized Range Table (q-distribution).
Thus, when two means are compared in a pairwise fashion, if the calculated q statistic is larger than the
q-critical value then this pairwise difference in means is statistically significant.
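As an illustration of the q computation, here is a small Python sketch. The function name `tukey_q` is hypothetical. The inputs (means 6 and 3, five cases per group, MSW = 1.25) and the critical value 4.046 are taken from the worked example later in this handout, not computed here.

```python
import math


def tukey_q(mean_large, mean_small, n_large, n_small, msw):
    """Studentized range statistic for one pair of group means (hypothetical helper)."""
    standard_error = math.sqrt((msw / 2) * (1 / n_large + 1 / n_small))
    return (mean_large - mean_small) / standard_error


# Critical value quoted in this handout for 4 groups and 16 error df at alpha = .05.
Q_CRIT = 4.046

q = tukey_q(mean_large=6.0, mean_small=3.0, n_large=5, n_small=5, msw=1.25)
print(f"q = {q:.3f}, significant: {q > Q_CRIT}")
```

In practice the critical value comes from a Studentized Range table or statistical software; it is hard-coded here only to keep the sketch dependency-free.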
Similarly, one can solve for the Minimum Mean Difference that would be “Honestly Significant”
\[ HSD = q_{critical} \sqrt{\frac{MS_W}{2}\left(\frac{1}{n_L} + \frac{1}{n_S}\right)} \]
This HSD concept can also be used to construct Confidence Intervals:
\[ CI: (\bar{Y}_L - \bar{Y}_S) \pm q_{critical} \sqrt{\frac{MS_W}{2}\left(\frac{1}{n_L} + \frac{1}{n_S}\right)} \]
If the Confidence Interval contains the null value of ZERO then you cannot claim an Honestly Significant
Difference. If the Confidence Interval DOES NOT contain the null value of ZERO then you can claim that there is an
Honestly Significant Difference.
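The interval rule can be sketched in a few lines of Python. The helper name `hsd_interval` is an assumption for illustration. The numbers are those of the Group 1 versus Group 2 comparison worked later in this handout, with the critical value 4.046 taken as given.

```python
import math


def hsd_interval(mean_large, mean_small, n_large, n_small, msw, q_crit):
    """Return (HSD, lower, upper) for one pairwise difference (hypothetical helper)."""
    hsd = q_crit * math.sqrt((msw / 2) * (1 / n_large + 1 / n_small))
    diff = mean_large - mean_small
    return hsd, diff - hsd, diff + hsd


hsd, lower, upper = hsd_interval(6.0, 3.0, 5, 5, msw=1.25, q_crit=4.046)

# The difference is "honestly significant" only if zero lies outside the interval.
contains_zero = lower <= 0.0 <= upper
print(f"HSD = {hsd:.3f}, CI = ({lower:.3f}, {upper:.3f}), contains zero: {contains_zero}")
```

Checking whether the interval excludes zero is equivalent to checking whether the observed difference exceeds the HSD.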
ANOVA MODEL: Yij = µ* + αj + εij

                 Total                            Within Group                  Between Group
Group    Y    Y*   (Y−Y*)   (Y−Y*)²      Ȳj   (Y−Ȳj)   (Y−Ȳj)²      Ȳj   Y*   (Ȳj−Y*)   (Ȳj−Y*)²

Occupational Therapy (j = 1) (OCT): n1 = 5, Ȳ1 = 3, SSW1 = 4, S1² = 1, S1 = 1
OCT      2    6     -4        16          3     -1        1           3    6     -3         9
OCT      3    6     -3         9          3      0        0           3    6     -3         9
OCT      2    6     -4        16          3     -1        1           3    6     -3         9
OCT      4    6     -2         4          3      1        1           3    6     -3         9
OCT      4    6     -2         4          3      1        1           3    6     -3         9

Maternal Child Health (j = 2) (MCH): n2 = 5, Ȳ2 = 6, SSW2 = 10, S2² = 2.5, S2 = 1.58
MCH      5    6     -1         1          6     -1        1           6    6      0         0
MCH      6    6      0         0          6      0        0           6    6      0         0
MCH      7    6      1         1          6      1        1           6    6      0         0
MCH      4    6     -2         4          6     -2        4           6    6      0         0
MCH      8    6      2         4          6      2        4           6    6      0         0

Nutrition Science (j = 3) (NTS): n3 = 5, Ȳ3 = 9, SSW3 = 2, S3² = .5, S3 = .707
NTS     10    6      4        16          9      1        1           9    6      3         9
NTS      9    6      3         9          9      0        0           9    6      3         9
NTS      9    6      3         9          9      0        0           9    6      3         9
NTS      8    6      2         4          9     -1        1           9    6      3         9
NTS      9    6      3         9          9      0        0           9    6      3         9

Epidemiology (j = 4) (EPI): n4 = 5, Ȳ4 = 6, SSW4 = 4, S4² = 1, S4 = 1
EPI      5    6     -1         1          6     -1        1           6    6      0         0
EPI      7    6      1         1          6      1        1           6    6      0         0
EPI      6    6      0         0          6      0        0           6    6      0         0
EPI      7    6      1         1          6      1        1           6    6      0         0
EPI      5    6     -1         1          6     -1        1           6    6      0         0

Y* = 6                      SST = 110                     SSW = 20                          SSB = 90
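The three column blocks of the table split each score's deviation from the grand mean into a within-group part and a between-group part. A minimal Python sketch of that split, using the handout's 20 scores (variable names are illustrative assumptions):

```python
# Scores by group, copied from the handout's example data.
groups = {
    "OCT": [2, 3, 2, 4, 4],
    "MCH": [5, 6, 7, 4, 8],
    "NTS": [10, 9, 9, 8, 9],
    "EPI": [5, 7, 6, 7, 5],
}

scores = [y for ys in groups.values() for y in ys]
grand_mean = sum(scores) / len(scores)

rows = []  # (group, Y, total deviation, within deviation, between deviation)
for name, ys in groups.items():
    group_mean = sum(ys) / len(ys)
    for y in ys:
        rows.append((name, y, y - grand_mean, y - group_mean, group_mean - grand_mean))

sst = sum(total ** 2 for _, _, total, _, _ in rows)
ssw = sum(within ** 2 for _, _, _, within, _ in rows)
ssb = sum(between ** 2 for _, _, _, _, between in rows)

# The cross-product of within and between deviations sums to zero,
# which is why the squared deviations add up: SST = SSW + SSB.
cross_product = sum(within * between for _, _, _, within, between in rows)

print(f"SST = {sst}, SSW = {ssw}, SSB = {ssb}, cross-product = {cross_product}")
```

Each row satisfies (Y − Y*) = (Y − Ȳj) + (Ȳj − Y*). The sums of squares are additive only because the cross-product term vanishes.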
(Σα²j) estimated by SSBetween = Σ nj (Ȳj − Ȳ*)² = 5(3-6)² + 5(6-6)² + 5(9-6)² + 5(6-6)² = 90.
(Σε²ij) estimated by SSWithin = Σ (Sj²)(nj − 1) = (1x4) + (2.5x4) + (.5x4) + (1x4) = 20.

ANOVA Source Table
H0: µ1 = µ2 = µ3 = µ4
Source                 SS     df                     MS               F
Between (Explained)    90     J − 1 = 3              90/3 = 30        30/1.25 = 24.00
Within (Error)         20     N − J = 20 − 4 = 16    20/16 = 1.25
Total                 110     N − 1 = 19             110/19 = 5.79
η² = 90/110 = .82
F(3,16) = 24.00, p < .05, η² = .82. Reject H0: µ1 = µ2 = µ3 = µ4.

Pairwise Comparisons using Tukey’s HSD.
\[ q_{jk} = \frac{\bar{Y}_j - \bar{Y}_k}{\sqrt{(MS_W/2)(1/n_j + 1/n_k)}} \]
\[ q_{12} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{(1.25/2)(1/n_1 + 1/n_2)}} = \frac{3 - 6}{\sqrt{(1.25/2)(1/5 + 1/5)}} \]
In absolute value, q12 = 3/(.5) = 6.
q12 = 6 > 4.046 from Studentized Range Table; thus, p < .05.
q13 = 12, p < .05. q14 = 6, p < .05. q23 = 6, p < .05. q24 = 0, ns, p > .05. q34 = 6, p < .05.
Conclusion: HA: µ1 < (µ2 = µ4) < µ3
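All six pairwise q values can be reproduced in one loop. This is a sketch under the handout's numbers (group means, n = 5 per group, MSW = 1.25, critical value 4.046); the dictionary names are illustrative assumptions.

```python
import math
from itertools import combinations

# Group means, common group size, and error mean square from the worked example.
means = {"OCT": 3.0, "MCH": 6.0, "NTS": 9.0, "EPI": 6.0}
n_per_group = 5
MSW = 1.25
Q_CRIT = 4.046  # critical value quoted in this handout (4 groups, 16 error df, alpha = .05)

# With equal group sizes every pair shares the same standard error.
standard_error = math.sqrt((MSW / 2) * (1 / n_per_group + 1 / n_per_group))

results = {}
for first, second in combinations(means, 2):
    q = abs(means[first] - means[second]) / standard_error
    results[(first, second)] = (q, q > Q_CRIT)

for (first, second), (q, significant) in results.items():
    flag = "p < .05" if significant else "ns"
    print(f"{first} vs {second}: q = {q:.2f} ({flag})")
```

Only the MCH versus EPI pair falls below the critical value, which matches the ordering stated in the conclusion.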
\[ \text{The } HSD = q_{critical} \sqrt{\frac{MS_W}{2}\left(\frac{1}{n_L} + \frac{1}{n_S}\right)} = 4.046(0.5) = 2.023 \]
The 95% CI for the difference between Groups 1 and 2 is: 3 ± 2.023 (0.977 ↔ 5.023).
The 95% CI does NOT CONTAIN ZERO, so there is an Honestly Significant Difference between OCT and MCH.
Complex Contrasts of Means.
H0: ψ = 0, where ψ = a1Ȳ1 + a2Ȳ2 + . . . + aJȲJ and
a1 + a2 + . . . + aJ = 0. In this case, ψ = a1Ȳ1 + a2Ȳ2 + a3Ȳ3 + a4Ȳ4 and a1 + a2 + a3 + a4 = 0.
For example, comparing Nutrition Science (j = 3) to a combination of Maternal Child Health (j = 2) and Epidemiology (j = 4)
yields ψ = (0)(3) + (1/2)(6) + (-1)(9) + (1/2)(6) = -3 or, as a simpler method with whole-number coefficients,
ψ = (0)(3) + (1)(6) + (-2)(9) + (1)(6) = -6.
Doubling every coefficient doubles ψ but leaves the F test for the contrast unchanged, because ψ² and its squared standard error are both multiplied by 4.
The error term for any contrast of this form is:
\[ SE_{\hat{\psi}}^2 = MS_W \sum_{j=1}^{J} \frac{a_j^2}{n_j} \]
Note this assumes a common or pooled error term (MSW)
For this contrast SE²ψ = (1.25)(0/5 + 1/5 + 4/5 + 1/5) = (1.25)(1.2) = 1.5
An F test with 1 and dfw degrees-of-freedom is used to test the statistical significance of this contrast.
F(1, dfw) = ψ²/SE²ψ = (-6)²/1.5 = 36/1.5 = 24.00, which is statistically significant.
Thus, F(1,16) = 24.00, p < .05, Reject H0: ψ = 0
To compare Occupational Therapy (j = 1) to a combination of the other three groups
yields ψ = (-3)(3) + (1)(6) + (1)(9) + (1)(6) = 12.
SE²ψ = (1.25)(9/5 + 1/5 + 1/5 + 1/5) = (1.25)(2.4) = 3; and F = (12)²/3 = 48.00.
Thus, F(1,16) = 48.00, p < .05, Reject H0: ψ = 0
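Both contrast tests follow the same three steps (ψ, squared standard error, F), so they can be wrapped in one function. A minimal Python sketch; `contrast_f` is a hypothetical helper name, and the pooled error term MSW = 1.25 is taken from the source table.

```python
def contrast_f(coefs, means, ns, msw):
    """Return (psi, squared SE, F) for a contrast of group means (hypothetical helper)."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    psi = sum(a * m for a, m in zip(coefs, means))
    se_sq = msw * sum(a * a / n for a, n in zip(coefs, ns))  # assumes pooled MSW
    return psi, se_sq, psi ** 2 / se_sq


means = [3.0, 6.0, 9.0, 6.0]  # OCT, MCH, NTS, EPI
ns = [5, 5, 5, 5]
MSW = 1.25

nts_vs_pair = contrast_f([0, 1, -2, 1], means, ns, MSW)
oct_vs_rest = contrast_f([-3, 1, 1, 1], means, ns, MSW)
# Same comparison as the first contrast, written with fractional coefficients.
nts_vs_pair_halves = contrast_f([0, 0.5, -1, 0.5], means, ns, MSW)

for label, (psi, se_sq, f) in [
    ("NTS vs MCH and EPI (whole numbers)", nts_vs_pair),
    ("NTS vs MCH and EPI (halves)", nts_vs_pair_halves),
    ("OCT vs others", oct_vs_rest),
]:
    print(f"{label}: psi = {psi:.2f}, SE^2 = {se_sq:.3f}, F(1, 16) = {f:.2f}")
```

The whole-number and fractional versions give different ψ values but the same F, which is why the whole-number coefficients are the more convenient choice.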
___________________________________________________________________________________
Pairwise Effect Sizes.
\[ ES_{jk} = \frac{\bar{Y}_j - \bar{Y}_k}{\sqrt{SS_T/(N - 1)}} \]
For example, ES12 = (3-6)/2.41 = -1.25.
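The pairwise effect size divides a mean difference by the total standard deviation of Y. A short Python sketch with the handout's values (SST = 110, N = 20); the function name `pairwise_effect_size` is an illustrative assumption.

```python
import math


def pairwise_effect_size(mean_j, mean_k, sst, n_total):
    """Mean difference scaled by the total standard deviation of Y (hypothetical helper)."""
    total_sd = math.sqrt(sst / (n_total - 1))
    return (mean_j - mean_k) / total_sd


total_sd = math.sqrt(110.0 / (20 - 1))
es_12 = pairwise_effect_size(3.0, 6.0, sst=110.0, n_total=20)
print(f"total SD = {total_sd:.2f}, ES12 = {es_12:.2f}")
```

Note that the divisor is the total standard deviation (2.41), not the pooled within-group standard deviation.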
F tests can also be converted to Effect Sizes by the following:
ES =
dfn F
d fnF + d fd
2
or
r =
d fnF
dfd
/* Ratings for four programs; sch and ori are +1/-1 codes for school and orient */
data progs;
  input group name $ rate school $ sch orient $ ori;
  datalines;
1 OCT 2 SHRP 1 APP 1
1 OCT 3 SHRP 1 APP 1
1 OCT 2 SHRP 1 APP 1
1 OCT 4 SHRP 1 APP 1
1 OCT 4 SHRP 1 APP 1
2 MCH 5 SOPH -1 APP 1
2 MCH 6 SOPH -1 APP 1
2 MCH 7 SOPH -1 APP 1
2 MCH 4 SOPH -1 APP 1
2 MCH 8 SOPH -1 APP 1
3 NTS 10 SHRP 1 RES -1
3 NTS 9 SHRP 1 RES -1
3 NTS 9 SHRP 1 RES -1
3 NTS 8 SHRP 1 RES -1
3 NTS 9 SHRP 1 RES -1
4 EPI 5 SOPH -1 RES -1
4 EPI 7 SOPH -1 RES -1
4 EPI 6 SOPH -1 RES -1
4 EPI 7 SOPH -1 RES -1
4 EPI 5 SOPH -1 RES -1
;
run;

/* One-way ANOVA by program name; order=data keeps the levels as OCT MCH NTS EPI */
proc glm data=progs order=data;
  class name;
  model rate = name;
  means name / tukey cldiff;
  contrast 'NTS vs (MCH-EPI)' name  0  1 -2  1;
  contrast 'OCT VS OTHERS'    name -3  1  1  1;
  contrast 'SHRP vs SOPH'     name  1 -1  1 -1;
  contrast 'Appl vs Research' name  1  1 -1 -1;
  contrast 'Interaction'      name  1 -1 -1  1;
run;

/* Same one-way model using the numeric group code (levels sort as 1 2 3 4) */
proc glm data=progs;
  class group;
  model rate = group;
  means group / tukey;
  contrast 'NTS vs (MCH-EPI)' group  0  1 -2  1;
  contrast 'OCT VS OTHERS'    group -3  1  1  1;
  contrast 'SHRP vs SOPH'     group  1 -1  1 -1;
  contrast 'Appl vs Research' group  1  1 -1 -1;
  contrast 'Interaction'      group  1 -1 -1  1;
run;

/* Two-way ANOVA: school, orientation, and their interaction as class effects */
proc glm data=progs;
  class school orient;
  model rate = school orient school*orient;
run;

/* Same two-way model fitted as a regression on the +1/-1 coded variables */
proc glm data=progs;
  model rate = sch ori sch*ori;
run;
The GLM Procedure
Class Level Information
Class    Levels    Values
name     4         OCT MCH NTS EPI

Number of Observations Read    20
Number of Observations Used    20

Dependent Variable: rate
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3        90.0000000     30.0000000      24.00    <.0001
Error              16        20.0000000      1.2500000
Corrected Total    19       110.0000000

R-Square    Coeff Var    Root MSE    rate Mean
0.818182     18.63390    1.118034     6.000000

Source    DF    Type III SS    Mean Square    F Value    Pr > F
name       3    90.00000000    30.00000000      24.00    <.0001
Tukey's Studentized Range (HSD) Test for rate
NOTE: This test controls the Type I experimentwise error rate, but it generally has a
higher Type II error rate than REGWQ.

Alpha                                     0.05
Error Degrees of Freedom                    16
Error Mean Square                         1.25
Critical Value of Studentized Range    4.04609
Minimum Significant Difference           2.023

Comparisons significant at the 0.05 level are indicated by ***.
name          Difference         Simultaneous 95%
Comparison    Between Means      Confidence Limits
NTS - MCH         3.0000          0.9770    5.0230    ***
NTS - EPI         3.0000          0.9770    5.0230    ***
NTS - OCT         6.0000          3.9770    8.0230    ***
MCH - EPI         0.0000         -2.0230    2.0230
MCH - OCT         3.0000          0.9770    5.0230    ***
EPI - OCT         3.0000          0.9770    5.0230    ***

Means with the same letter are not significantly different.
Tukey Grouping      Mean    N    name
A                 9.0000    5    NTS
B                 6.0000    5    MCH
B                 6.0000    5    EPI
C                 3.0000    5    OCT
Dependent Variable: rate
Contrast            DF    Contrast SS    Mean Square    F Value    Pr > F
NTS vs (MCH-EPI)     1    30.00000000    30.00000000      24.00    0.0002
OCT VS OTHERS        1    60.00000000    60.00000000      48.00    <.0001
SHRP vs SOPH         1     0.00000000     0.00000000       0.00    1.0000
Appl vs Research     1    45.00000000    45.00000000      36.00    <.0001
Interaction          1    45.00000000    45.00000000      36.00    <.0001
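The Contrast SS column can be reproduced from the group means with SSψ = ψ² / Σ(a²j/nj), and dividing by MSW gives the F values. The sketch below is an illustration in Python, not SAS output. It also shows that the last three contrasts, which are mutually orthogonal, add up to the between-groups sum of squares of 90.

```python
means = [3.0, 6.0, 9.0, 6.0]  # OCT, MCH, NTS, EPI
ns = [5, 5, 5, 5]
MSW = 1.25

# Coefficients copied from the CONTRAST statements in the SAS program.
contrasts = {
    "NTS vs (MCH-EPI)": [0, 1, -2, 1],
    "OCT VS OTHERS": [-3, 1, 1, 1],
    "SHRP vs SOPH": [1, -1, 1, -1],
    "Appl vs Research": [1, 1, -1, -1],
    "Interaction": [1, -1, -1, 1],
}


def contrast_ss(coefs):
    """Sum of squares for a single-df contrast (hypothetical helper)."""
    psi = sum(a * m for a, m in zip(coefs, means))
    return psi ** 2 / sum(a * a / n for a, n in zip(coefs, ns))


ss = {label: contrast_ss(coefs) for label, coefs in contrasts.items()}
f_values = {label: value / MSW for label, value in ss.items()}

# The three factorial contrasts partition the 3 between-groups degrees of freedom.
factorial_total = ss["SHRP vs SOPH"] + ss["Appl vs Research"] + ss["Interaction"]

for label in contrasts:
    print(f"{label:18s} SS = {ss[label]:6.2f}  F = {f_values[label]:6.2f}")
print(f"School + Orientation + Interaction = {factorial_total:.2f}")
```

This additivity is the link between the two analyses: the two-way ANOVA is the one-way ANOVA with its between-groups sum of squares split into these three orthogonal pieces.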
The GLM Procedure
Class Level Information
Class     Levels    Values
school    2         SHRP SOPH
orient    2         APP RES

Number of Observations Read    20
Number of Observations Used    20

Dependent Variable: rate
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3        90.0000000     30.0000000      24.00    <.0001
Error              16        20.0000000      1.2500000
Corrected Total    19       110.0000000

R-Square    Coeff Var    Root MSE    rate Mean
0.818182     18.63390    1.118034     6.000000

Source           DF    Type III SS    Mean Square    F Value    Pr > F
school            1     0.00000000     0.00000000       0.00    1.0000
orient            1    45.00000000    45.00000000      36.00    <.0001
school*orient     1    45.00000000    45.00000000      36.00    <.0001
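The two-way sums of squares can also be computed directly from the marginal and cell means. The Python sketch below assumes a balanced design (equal cell sizes, as in this example). With unequal cell sizes the Type III sums of squares reported by SAS would not reduce to these simple formulas. Names such as `main_effect_ss` are illustrative.

```python
# Scores in each school-by-orientation cell, copied from the handout's data.
cells = {
    ("SHRP", "APP"): [2, 3, 2, 4, 4],    # OCT
    ("SOPH", "APP"): [5, 6, 7, 4, 8],    # MCH
    ("SHRP", "RES"): [10, 9, 9, 8, 9],   # NTS
    ("SOPH", "RES"): [5, 7, 6, 7, 5],    # EPI
}


def mean(values):
    return sum(values) / len(values)


all_scores = [y for ys in cells.values() for y in ys]
grand_mean = mean(all_scores)


def main_effect_ss(factor_index):
    """SS for one factor from its marginal means (balanced design assumed)."""
    levels = {}
    for key, ys in cells.items():
        levels.setdefault(key[factor_index], []).extend(ys)
    return sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in levels.values())


ss_school = main_effect_ss(0)
ss_orient = main_effect_ss(1)
# The SS among the four cell means equals the one-way between-groups SS.
ss_cells = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in cells.values())
ss_interaction = ss_cells - ss_school - ss_orient
ss_error = sum((y - mean(ys)) ** 2 for ys in cells.values() for y in ys)

print(f"school = {ss_school}, orient = {ss_orient}, "
      f"school*orient = {ss_interaction}, error = {ss_error}")
```

The interaction is obtained by subtraction: whatever part of the one-way between-groups variation is not accounted for by the two main effects.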
Oneway
Descriptives
Y
                                                    95% Confidence Interval for Mean
         N     Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
OCT      5    3.0000        1.0000          .4472        1.7583        4.2417       2.00      4.00
MCH      5    6.0000        1.5811          .7071        4.0368        7.9632       4.00      8.00
NUT      5    9.0000         .7071          .3162        8.1220        9.8780       8.00     10.00
EPI      5    6.0000        1.0000          .4472        4.7583        7.2417       5.00      7.00
Total   20    6.0000        2.4061          .5380        4.8739        7.1261       2.00     10.00

ANOVA
Y
                  Sum of Squares    df    Mean Square       F       Sig.
Between Groups         90.000        3       30.000      24.000     .000
Within Groups          20.000       16        1.250
Total                 110.000       19
Post Hoc Tests
Multiple Comparisons
Dependent Variable: Y
Tukey HSD
                                                             95% Confidence Interval
(I) D   (J) D   Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
OCT     MCH          -3.0000*             .7071      .003      -5.0231        -.9769
        NUT          -6.0000*             .7071      .000      -8.0231       -3.9769
        EPI          -3.0000*             .7071      .003      -5.0231        -.9769
MCH     OCT           3.0000*             .7071      .003        .9769        5.0231
        NUT          -3.0000*             .7071      .003      -5.0231        -.9769
        EPI            .0000              .7071     1.000      -2.0231        2.0231
NUT     OCT           6.0000*             .7071      .000       3.9769        8.0231
        MCH           3.0000*             .7071      .003        .9769        5.0231
        EPI           3.0000*             .7071      .003        .9769        5.0231
EPI     OCT           3.0000*             .7071      .003        .9769        5.0231
        MCH            .0000              .7071     1.000      -2.0231        2.0231
        NUT          -3.0000*             .7071      .003      -5.0231        -.9769
* The mean difference is significant at the .05 level.
Homogeneous Subsets
Y
Tukey HSD
               Subset for alpha = .05
D       N        1         2         3
OCT     5     3.0000
MCH     5               6.0000
EPI     5               6.0000
NUT     5                         9.0000
Sig.          1.000     1.000     1.000
Means for groups in homogeneous subsets are displayed.
a Uses Harmonic Mean Sample Size = 5.000.
Contrast Coefficients
                                          D
Contrast   Occupational Therapy   Maternal Child Health   Nutrition Science   Epidemiology
1                   1                       1                    -1                -1
2                   1                      -1                     1                -1
3                   1                      -1                    -1                 1

Contrast Tests
                                     Contrast   Value of Contrast   Std. Error      t        df      Sig. (2-tailed)
Y   Assume equal variances              1           -6.0000           1.0000     -6.000     16           .000
                                        2             .0000           1.0000       .000     16          1.000
                                        3           -6.0000           1.0000     -6.000     16           .000
    Does not assume equal variances     1           -6.0000           1.0000     -6.000     11.765       .000
                                        2             .0000           1.0000       .000     11.765      1.000
                                        3           -6.0000           1.0000     -6.000     11.765       .000
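The fractional df of 11.765 in the "Does not assume equal variances" rows is consistent with a Welch-Satterthwaite approximation built from the separate group variances (1, 2.5, .5, 1). The handout does not state which formula SPSS applies, so the Python sketch below is an assumption that happens to reproduce the printed standard error and df; `welch_contrast` is a hypothetical name.

```python
import math


def welch_contrast(coefs, means, variances, ns):
    """Contrast test with separate variances; returns (psi, SE, t, df) (hypothetical helper)."""
    psi = sum(a * m for a, m in zip(coefs, means))
    # Each group's contribution to the squared standard error of the contrast.
    terms = [a * a * v / n for a, v, n in zip(coefs, variances, ns)]
    se = math.sqrt(sum(terms))
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = sum(terms) ** 2 / sum(t * t / (n - 1) for t, n in zip(terms, ns))
    return psi, se, psi / se, df


means = [3.0, 6.0, 9.0, 6.0]      # OCT, MCH, NTS, EPI
variances = [1.0, 2.5, 0.5, 1.0]  # group variances S1² to S4² from the worked example
ns = [5, 5, 5, 5]

psi, se, t_value, df = welch_contrast([1, 1, -1, -1], means, variances, ns)
print(f"psi = {psi:.4f}, SE = {se:.4f}, t = {t_value:.3f}, df = {df:.3f}")
```

Because every coefficient is +1 or -1, all three contrasts share the same standard error and the same approximate df, which is why 11.765 appears three times.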
Univariate Analysis of Variance
Descriptive Statistics
Dependent Variable: Y
SCHOOL   Orient      Mean     Std. Deviation    N
SHRP     Applied    3.0000       1.00000         5
         Exper      9.0000        .70711         5
         Total      6.0000       3.26599        10
SOPH     Applied    6.0000       1.58114         5
         Exper      6.0000       1.00000         5
         Total      6.0000       1.24722        10
Total    Applied    4.5000       2.01384        10
         Exper      7.5000       1.77951        10
         Total      6.0000       2.40613        20
Tests of Between-Subjects Effects
Dependent Variable: Y
Source             Type III Sum of Squares    df    Mean Square       F        Sig.
Corrected Model          90.000(a)             3       30.000       24.000     .000
Intercept               720.000                1      720.000      576.000     .000
SCHOOL                     .000                1         .000         .000    1.000
Orient                   45.000                1       45.000       36.000     .000
SCHOOL*Orient            45.000                1       45.000       36.000     .000
Error                    20.000               16        1.250
Total                   830.000               20
Corrected Total         110.000               19
a R Squared = .818 (Adjusted R Squared = .784)
[Figure: Estimated Marginal Means of Y plotted against ORIENTATION (Applied, Experimental), with separate lines for SCHOOL (SHRP, SOPH). Labeled points: Occupational Therapy, M = 3.00, SD = 1.00; Nutrition Science, M = 9.00, SD = 0.71; Maternal Child Health, M = 6.00, SD = 1.58; Epidemiology, M = 6.00, SD = 1.00.]
[Figure: The same Estimated Marginal Means plotted against SCHOOL (SHRP, SOPH), with separate lines for ORIENTATION (Applied, Experimental).]