p - Cengage

Chapter 15
Introduction to the Analysis of Variance
I The Omnibus Null Hypothesis
$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_p$
$H_1\colon \mu_j \ne \mu_{j'}$ for some $j$ and $j'$
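A. Answering General Versus Specific Research Questions
1. Population contrast, $\psi_i$, and sample contrast, $\hat{\psi}_i$
$\psi_1 = \mu_1 - \mu_2$
$\hat{\psi}_1 = \bar{Y}_1 - \bar{Y}_2$
2. Pairwise and nonpairwise contrasts
Pairwise contrasts among four means:
$\hat{\psi}_1 = \bar{Y}_1 - \bar{Y}_2$
$\hat{\psi}_2 = \bar{Y}_1 - \bar{Y}_3$
$\hat{\psi}_3 = \bar{Y}_1 - \bar{Y}_4$
$\hat{\psi}_4 = \bar{Y}_2 - \bar{Y}_3$
$\hat{\psi}_5 = \bar{Y}_2 - \bar{Y}_4$
$\hat{\psi}_6 = \bar{Y}_3 - \bar{Y}_4$
Nonpairwise contrasts:
$\hat{\psi}_7 = \bar{Y}_1 - \dfrac{\bar{Y}_2 + \bar{Y}_3}{2}$
$\hat{\psi}_8 = \bar{Y}_1 - \dfrac{\bar{Y}_2 + \bar{Y}_3 + \bar{Y}_4}{3}$
$\hat{\psi}_9 = \dfrac{\bar{Y}_1 + \bar{Y}_2}{2} - \dfrac{\bar{Y}_3 + \bar{Y}_4}{2}$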
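A contrast can be computed as a weighted sum of means whose weights sum to zero. The sketch below is illustrative only: the helper name `contrast` is hypothetical, and the three means are borrowed from the diet means reported in Table 2 of this chapter.

```python
# Sample means for diets a1, a2, a3 (values reported in Table 2).
means = [8.00, 9.00, 12.00]

def contrast(coefficients, means):
    """Return sum(c_j * mean_j). Hypothetical helper; coefficients must sum to 0."""
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefficients, means))

# Pairwise contrast: mean 1 minus mean 2.
pairwise_1_vs_2 = contrast([1, -1, 0], means)

# Nonpairwise contrast: mean 1 minus the average of means 2 and 3.
nonpairwise_1_vs_23 = contrast([1, -0.5, -0.5], means)

print(pairwise_1_vs_2)
print(nonpairwise_1_vs_23)
```

A pairwise contrast uses the coefficients 1 and −1 on two means and 0 elsewhere; a nonpairwise contrast spreads the weights over three or more means.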
B. Analysis of Variance Versus Multiple t Tests
1. Number of pairwise contrasts among p means is given by p(p – 1)/2
• p = 3: 3(3 – 1)/2 = 3
• p = 4: 4(4 – 1)/2 = 6
• p = 5: 5(5 – 1)/2 = 10
2. If C = 3 contrasts among p = 3 means are tested using a t statistic at $\alpha = .05$, the probability of one or more Type I errors is less than
$1 - (1 - \alpha)^C = 1 - (1 - .05)^3 = .14$
3. As C increases, the probability of making one or more Type I errors using a t statistic increases dramatically.
Prob. of one or more Type I errors
• C = 4: $1 - (1 - .05)^4 = .19$
• C = 5: $1 - (1 - .05)^5 = .23$
• C = 6: $1 - (1 - .05)^6 = .26$
• C = 7: $1 - (1 - .05)^7 = .30$
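The two formulas in this section, p(p – 1)/2 and 1 − (1 − α)^C, are easy to tabulate. The following is a minimal sketch; the function names are made up for illustration and are not part of the chapter.

```python
def n_pairwise_contrasts(p):
    """Number of pairwise contrasts among p means: p(p - 1)/2."""
    return p * (p - 1) // 2

def familywise_error_bound(c, alpha=0.05):
    """Bound on the probability of one or more Type I errors across c tests."""
    return 1 - (1 - alpha) ** c

for p in (3, 4, 5):
    print("p =", p, "pairwise contrasts =", n_pairwise_contrasts(p))

for c in range(3, 8):
    print("C =", c, "bound =", round(familywise_error_bound(c), 2))
```

With α fixed at .05 the bound grows with every added contrast, which is the motivation for testing the omnibus null hypothesis first.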
4. Analysis of variance tests the omnibus null hypothesis, $H_0\colon \mu_1 = \mu_2 = \cdots = \mu_p$, and controls the probability of making a Type I error at, say, $\alpha = .05$ for any number of means.
5. Rejection of the null hypothesis makes the alternative hypothesis, $H_1\colon \mu_j \ne \mu_{j'}$ for some $j$ and $j'$, tenable.
II Basic Concepts in ANOVA
A. Notation
1. Two subscripts are used to denote a score, Xij.
The i subscript denotes one of the i = 1, . . . , n
participants in a treatment level. The j subscript
denotes one of the j = 1, . . . , p treatment levels.
2. The jth level of treatment A is denoted by aj.
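Treatment Levels
        a1                    a2                    a3                    a4
        X11                   X12                   X13                   X14
        X21                   X22                   X23                   X24
        ...                   ...                   ...                   ...
        Xn1                   Xn2                   Xn3                   Xn4
Mean    $\bar{X}_{\cdot 1}$   $\bar{X}_{\cdot 2}$   $\bar{X}_{\cdot 3}$   $\bar{X}_{\cdot 4}$

$\bar{X}_{\cdot 1} = \dfrac{X_{11} + X_{21} + \cdots + X_{n1}}{n}$
$\bar{X}_{\cdot 4} = \dfrac{X_{14} + X_{24} + \cdots + X_{n4}}{n}$
$\bar{X}_{\cdot\cdot} = \dfrac{\bar{X}_{\cdot 1} + \bar{X}_{\cdot 2} + \cdots + \bar{X}_{\cdot 4}}{p}$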
B. Composite Nature of a Score
1. A score reflects the effects of four variables:
• independent variable
• characteristics of the participants in the experiment
• chance fluctuations in the participant’s performance
• environmental and other uncontrolled variables
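2. Sample model equation for a score
$$\underbrace{X_{ij}}_{\text{Score}} = \underbrace{\bar{X}_{\cdot\cdot}}_{\text{Grand Mean}} + \underbrace{(\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})}_{\text{Treatment Effect}} + \underbrace{(X_{ij} - \bar{X}_{\cdot j})}_{\text{Error Effect}}$$
3. The statistics estimate parameters of the model equation as follows
$$\underbrace{X_{ij}}_{\text{Score}} = \underbrace{\mu}_{\text{Grand Mean}} + \underbrace{(\mu_j - \mu)}_{\text{Treatment Effect}} + \underbrace{(X_{ij} - \mu_j)}_{\text{Error Effect}}$$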
4. Illustration of the sample model equation using the weight-loss data in Table 1.
Table 1. One-Month Weight Losses for Three Diets
Treatment Levels (Diets)
        a1      a2      a3
        7       10      12
        9       13      11
        8       9       15
        ...     ...     ...
        6       7       14
$\bar{X}_{\cdot 1} = 8$, $\bar{X}_{\cdot 2} = 9$, $\bar{X}_{\cdot 3} = 12$
$\bar{X}_{\cdot\cdot} = 9.67$
5. Let X11 = 7 denote Joan’s weight loss. She used diet a1. Her score is a composite that tells a story.
$$X_{ij} = \bar{X}_{\cdot\cdot} + (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot}) + (X_{ij} - \bar{X}_{\cdot j})$$
$$\underbrace{7}_{\text{Score}} = \underbrace{9.67}_{\text{Grand Mean}} + \underbrace{(8 - 9.67)}_{\text{Treatment Effect} \,=\, -1.67} + \underbrace{(7 - 8)}_{\text{Error Effect} \,=\, -1}$$
6. Joan used a less effective diet than other girls (8 – 9.67 = –1.67), and she lost less weight than other girls on the same diet (7 – 8 = –1).
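Joan's decomposition can be checked numerically. This sketch uses only the values quoted in the text (score 7, diet mean 8, grand mean 9.67, the latter rounded as in Table 1); the variable names are illustrative.

```python
score = 7.0            # X11, Joan's one-month weight loss
treatment_mean = 8.0   # mean for diet a1
grand_mean = 9.67      # grand mean, rounded to two decimals as in the text

treatment_effect = treatment_mean - grand_mean   # how diet a1 compares with all diets
error_effect = score - treatment_mean            # how Joan compares with others on a1

# The three components add back up to the observed score.
reconstructed = grand_mean + treatment_effect + error_effect

print(round(treatment_effect, 2), round(error_effect, 2), round(reconstructed, 2))
```

The identity holds for any score, because the grand mean and the treatment mean each appear once with a plus sign and once with a minus sign.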
C. Partition of the Total Sum of Squares (SSTO)
1. The total variability among scores in the diet experiment
$$SSTO = \sum_{j=1}^{p} \sum_{i=1}^{n} (X_{ij} - \bar{X}_{\cdot\cdot})^2$$
also is a composite that can be decomposed into
• between-groups sum of squares (SSBG)
$$SSBG = n \sum_{j=1}^{p} (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2$$
• within-groups sum of squares (SSWG)
$$SSWG = \sum_{j=1}^{p} \sum_{i=1}^{n} (X_{ij} - \bar{X}_{\cdot j})^2$$
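The decomposition SSTO = SSBG + SSWG can be verified directly from the three definitions. The scores below are hypothetical (they are not the weight-loss data, which this outline shows only in part) and were chosen to keep the arithmetic small: p = 3 treatment levels with n = 4 scores each.

```python
# Hypothetical scores, NOT the weight-loss data: one inner list per treatment level.
groups = [
    [3.0, 5.0, 4.0, 4.0],
    [6.0, 7.0, 5.0, 6.0],
    [9.0, 8.0, 10.0, 9.0],
]
p = len(groups)       # number of treatment levels
n = len(groups[0])    # participants per level (equal n assumed)

group_means = [sum(g) / n for g in groups]
grand_mean = sum(sum(g) for g in groups) / (n * p)

# Total: every score's squared deviation from the grand mean.
ssto = sum((x - grand_mean) ** 2 for g in groups for x in g)
# Between groups: n times the squared deviation of each group mean from the grand mean.
ssbg = n * sum((m - grand_mean) ** 2 for m in group_means)
# Within groups: every score's squared deviation from its own group mean.
sswg = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print(round(ssto, 3), round(ssbg, 3), round(sswg, 3))
```

The sketch assumes equal group sizes, matching the formulas above, which multiply by a single n.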
D. Degrees of Freedom for SSTO, SSBG, and SSWG
1. dfTO = np – 1
2. dfBG = p – 1
3. dfWG = p(n – 1)
E. Mean Squares, MS, and F Statistic
1. SSTO/(np – 1) = MSTO
2. SSBG/(p – 1) = MSBG
3. SSWG/[p(n – 1)] = MSWG
4. F = MSBG/MSWG
F. Nature of MSBG and MSWG
1. Expected value of MSBG and MSWG when the null hypothesis is true.
$$E(MSBG) = E(MSWG) = \sigma^2$$
2. Expected value of MSBG and MSWG when the null hypothesis is false.
$$E(MSBG) = \sigma^2 + n \sum_{j=1}^{p} (\mu_j - \mu)^2 / (p - 1)$$
$$E(MSWG) = \sigma^2$$
3. MSBG represents variation among participants
who have been treated differently—received
different treatment levels.
4. MSWG represents variation among participants
who have been treated the same—received
the same treatment level.
5. Values of F = MSBG/MSWG close to 1 suggest that the treatment levels did not affect the dependent variable; large values suggest that the treatment levels had an effect.
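The expected values of MSBG and MSWG can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the population means, σ = 2, n = 10, the number of replications, and the seed are arbitrary choices, and the function names are hypothetical.

```python
import random

def mean_squares(groups):
    """Return (MSBG, MSWG) for equal-sized groups."""
    p = len(groups)
    n = len(groups[0])
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / p
    ssbg = n * sum((m - grand_mean) ** 2 for m in group_means)
    sswg = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return ssbg / (p - 1), sswg / (p * (n - 1))

def average_mean_squares(population_means, sigma=2.0, n=10, reps=4000, seed=1):
    """Average MSBG and MSWG over many simulated experiments (normal scores)."""
    rng = random.Random(seed)
    total_bg = 0.0
    total_wg = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in population_means]
        msbg, mswg = mean_squares(groups)
        total_bg += msbg
        total_wg += mswg
    return total_bg / reps, total_wg / reps

# Null hypothesis true: all population means equal, so both averages should be near sigma^2 = 4.
null_bg, null_wg = average_mean_squares([10.0, 10.0, 10.0])

# Null hypothesis false (assumed means): MSBG is inflated, MSWG still estimates sigma^2.
alt_bg, alt_wg = average_mean_squares([8.0, 9.0, 12.0])

print(round(null_bg, 2), round(null_wg, 2))
print(round(alt_bg, 2), round(alt_wg, 2))
```

Under the assumed means 8, 9, and 12, the formula for E(MSBG) gives 4 + 10(8.667)/2 ≈ 47.3, so the simulated average of MSBG should land near that value while MSWG stays near 4.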
III Completely Randomized Design (CR-p Design)
A. Characteristics of a CR-p Design
1. Design has one treatment, treatment A, with p
levels.
2. N = n1 + n2 + . . . + np participants are randomly
assigned to the p treatment levels.
3. It is desirable, but not necessary, to have the same
number of participants in each treatment level.
B. Comparison of layouts for a t-test design for independent samples and a CR-3 design

t-test design for independent samples
                  Treat. level
Participant1      a1
Participant2      a1
...               ...
Participant10     a1      $\bar{X}_{\cdot 1}$
Participant11     a2
Participant12     a2
...               ...
Participant20     a2      $\bar{X}_{\cdot 2}$

CR-3 design
                  Treat. level
Participant1      a1
Participant2      a1
...               ...
Participant10     a1      $\bar{X}_{\cdot 1}$
Participant11     a2
Participant12     a2
...               ...
Participant20     a2      $\bar{X}_{\cdot 2}$
Participant21     a3
Participant22     a3
...               ...
Participant30     a3      $\bar{X}_{\cdot 3}$
C. Descriptive Statistics for Weight-Loss Data in Table 1
Table 2. Means and Standard Deviations for Weight-Loss Data
Diet                    a1      a2      a3
$\bar{X}_{\cdot j}$     8.00    9.00    12.00
$\hat{\sigma}_j$        2.21    2.21    2.31
[Figure 1: box plots for diets a1, a2, and a3; horizontal axis: One-Month Weight Loss, scaled from 4 to 16]
Figure 1. Stacked box plots for the weight-loss data. The distributions are relatively symmetrical and have similar dispersions.
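Table 3. Computational Procedures for CR-3 Design
                            a1      a2      a3
                            7       10      12
                            9       13      11
                            8       9       15
                            ...     ...     ...
                            6       7       14
$\sum_{i=1}^{n} X_{ij}$     80      90      120

$$\sum_{j=1}^{p} \sum_{i=1}^{n} X_{ij} = 7 + 9 + 8 + \cdots + 14 = 290.000$$
$$[AS] = \sum_{j=1}^{p} \sum_{i=1}^{n} X_{ij}^2 = 7^2 + 9^2 + 8^2 + \cdots + 14^2 = 3026.000$$
$$[X] = \frac{\left( \sum_{j=1}^{p} \sum_{i=1}^{n} X_{ij} \right)^2}{np} = \frac{(290)^2}{(10)(3)} = 2803.333$$
$$[A] = \sum_{j=1}^{p} \frac{\left( \sum_{i=1}^{n} X_{ij} \right)^2}{n} = \frac{(80)^2}{10} + \frac{(90)^2}{10} + \frac{(120)^2}{10} = 2890.000$$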
D. Sum of Squares Formulas for CR-3 Design
SSTO = [AS] − [X] = 3026.000 − 2803.333 = 222.667
SSBG = [A] − [X] = 2890.000 − 2803.333 = 86.667
SSWG = [AS] − [A] = 3026.000 − 2890.000 = 136.000
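The bracketed computational terms can be reproduced from the summary values in Table 3 alone. In this sketch the inputs are the three column sums, the sum of squared scores [AS] = 3026, and n = 10, p = 3; the variable names are illustrative.

```python
group_sums = [80.0, 90.0, 120.0]   # column sums for a1, a2, a3 (Table 3)
term_as = 3026.0                   # [AS], the sum of all squared scores (Table 3)
n, p = 10, 3                       # participants per diet, number of diets

# [X]: squared grand total divided by the total number of scores.
term_x = sum(group_sums) ** 2 / (n * p)
# [A]: sum over diets of (squared column sum divided by n).
term_a = sum(s ** 2 / n for s in group_sums)

ssto = term_as - term_x
ssbg = term_a - term_x
sswg = term_as - term_a

print(round(term_x, 3), round(term_a, 3))
print(round(ssto, 3), round(ssbg, 3), round(sswg, 3))
```

Because SSTO, SSBG, and SSWG are all differences of the same three terms, SSBG + SSWG equals SSTO by construction.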
Table 4. ANOVA Table for Weight-Loss Data
Source                                    SS         df                MS        F
1. Between groups (BG), Three diets       86.667     p – 1 = 2         43.334    [1/2] 8.60*
2. Within groups (WG)                     136.000    p(n – 1) = 27     5.037
3. Total                                  222.667    np – 1 = 29
*p < .002
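The MS and F entries of Table 4 follow from the sums of squares and degrees of freedom. The sketch below also checks the footnote p < .002 without a statistics library. It relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here: P(F > f) = (1 + 2f/df_WG)^(−df_WG/2). The function name is hypothetical.

```python
ssbg, sswg = 86.667, 136.000   # sums of squares from Table 4
p, n = 3, 10                   # number of diets, participants per diet

df_bg = p - 1                  # between-groups degrees of freedom
df_wg = p * (n - 1)            # within-groups degrees of freedom

msbg = ssbg / df_bg
mswg = sswg / df_wg
f_ratio = msbg / mswg

def f_upper_tail_df1_2(f_value, df_denominator):
    """Upper-tail probability of F(2, df_denominator); valid for numerator df = 2 only."""
    return (1 + 2 * f_value / df_denominator) ** (-df_denominator / 2)

p_value = f_upper_tail_df1_2(f_ratio, df_wg)

print(round(mswg, 3), round(f_ratio, 2), round(p_value, 4))
```

For designs with more than three treatment levels this shortcut does not apply, and the upper-tail probability would have to come from an F table or a statistics package.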
E. Assumptions for CR-p Design
1. The model equation, $X_{ij} = \mu + (\mu_j - \mu) + (X_{ij} - \mu_j)$, reflects all of the sources of variation that affect $X_{ij}$.
2. Random sampling or random assignment
3. The j = 1, . . . , p populations are normally
distributed.
4. Variances of the j = 1, . . . , p populations are
equal.