MEAN SEPARATION TESTS (LSD AND Tukey’s Procedure)

If $H_0: \mu_1 = \mu_2 = \dots = \mu_n$ is rejected, we need a method to determine which means are
significantly different from the others.

We’ll look at three separation tests during the semester:
1. F-protected least significant difference (F-protected LSD)
2. Tukey’s Procedure
3. Orthogonal linear contrasts (covered at the end of the semester)

F-protected Least Significant Difference

The LSD we will use is called an F-protected LSD because it is calculated and used only
when $H_0$ is rejected.

Sometimes, when one fails to reject $H_0$ and an LSD is calculated anyway, the LSD will wrongly
suggest that there are significant differences between treatment means.

To guard against this conflict, we calculate the LSD only when $H_0$ is rejected.
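
$$\mathrm{LSD} = t_{\alpha/2} \cdot s_{\bar{Y}_1 - \bar{Y}_2}, \quad \text{with df for } t = \text{Error df}$$

If $r_1 = r_2 = \dots = r_n$, then $s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\dfrac{2\,\text{ErrorMS}}{r}}$

If $r_i \ne r_{i'}$, then $s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s^2\left(\dfrac{1}{r_i} + \dfrac{1}{r_{i'}}\right)}$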

If the difference between two treatment means is greater than the LSD, then those
treatment means are significantly different at the $100(1-\alpha)\%$ level of confidence.

Example

Given an experiment analyzed as a CRD that has 7 treatments and 4 replicates with the following
analysis:

SOV         Df   SS          MS        F
Treatment    6   5,587,174   931,196   9.83**
Error       21   1,990,238    94,773
Total       27   7,577,412

and the following treatment means:

Treatment   A       B       C       D       E       F       G
Mean        2,127   2,678   2,552   2,128   1,796   1,681   1,316

Calculate the LSD and show which means are significantly different at the 95% level of
confidence.

Step 1. Calculate the LSD

$$\mathrm{LSD} = t_{\alpha/2} \cdot s_{\bar{Y}_1 - \bar{Y}_2} = 2.080\sqrt{\dfrac{2\,\text{ErrorMS}}{r}} = 2.080\sqrt{\dfrac{2(94{,}773)}{4}} = 452.8 \approx 453$$

Step 2. Rank treatment means from low to high

Treatment   G       F       E       A       D       C       B
Mean        1,316   1,681   1,796   2,127   2,128   2,552   2,678

Step 3. Calculate differences between treatment means to determine which ones are significantly
different from each other.

If the difference between two treatment means is greater than the LSD, then those treatment
means are significantly different at the 95% level of confidence.

Treatment F vs. Treatment G    1681 – 1316 = 365ns
Treatment E vs. Treatment G    1796 – 1316 = 480*

Since the mean of Treatment E is significantly greater than that of Treatment G, every mean
greater than that of Treatment E is also significantly different from Treatment G.
Thus there is no need to keep comparing Treatment G with treatments whose means are
greater than the mean of Treatment E.

Treatment E vs. Treatment F    1796 – 1681 = 115ns
Treatment A vs. Treatment F    2127 – 1681 = 446ns
Treatment D vs. Treatment F    2128 – 1681 = 447ns
Treatment C vs. Treatment F    2552 – 1681 = 871*
*Therefore Treatment B must also be different from Treatment F

Treatment A vs. Treatment E    2127 – 1796 = 331ns
Treatment D vs. Treatment E    2128 – 1796 = 332ns
Treatment C vs. Treatment E    2552 – 1796 = 756*
*Therefore Treatment B must also be different from Treatment E

Treatment D vs. Treatment A    2128 – 2127 = 1ns
Treatment C vs. Treatment A    2552 – 2127 = 425ns
Treatment B vs. Treatment A    2678 – 2127 = 551*

Treatment C vs. Treatment D    2552 – 2128 = 424ns
Treatment B vs. Treatment D    2678 – 2128 = 550*

Treatment B vs. Treatment C    2678 – 2552 = 126ns
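
As a cross-check on Step 3, the following Python sketch compares every pair of means against the rounded LSD of 453. It tests all 21 pairs rather than using the shortcut of stopping once a larger mean is significant; the function name `significant_pairs` is an illustrative assumption.

```python
from itertools import combinations

# Treatment means from the example
means = {"A": 2127, "B": 2678, "C": 2552, "D": 2128,
         "E": 1796, "F": 1681, "G": 1316}
LSD = 453  # rounded LSD from Step 1


def significant_pairs(means, threshold):
    """Map (higher, lower) treatment pairs to True when their difference exceeds the threshold."""
    ranked = sorted(means.items(), key=lambda item: item[1])
    result = {}
    for (low_name, low), (high_name, high) in combinations(ranked, 2):
        result[(high_name, low_name)] = (high - low) > threshold
    return result


pairs = significant_pairs(means, LSD)
print(pairs[("F", "G")])  # False: 365 is not greater than 453
print(pairs[("E", "G")])  # True: 480 is greater than 453
```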
Step 4. Place lowercase letters behind treatment means to show which treatments are
significantly different.
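Step 4.1. Write the treatment letters horizontally, in ranked order.

G   F   E   A   D   C   B

Step 4.2. Underline treatments that are not significantly different.

G   F   E   A   D   C   B
-----
    -------------
        ---------
            ---------
                -----
                    -----

Step 4.3. Ignore those lines that fall within the boundary of another line (here the E to D line,
which lies within the F to D line, and the D to C line, which lies within the A to C line).

G   F   E   A   D   C   B
-----
    -------------
            ---------
                    -----

Step 4.4. Label each line, beginning with the top one, with lowercase letters beginning with “a.”

G   F   E   A   D   C   B
-----                        a
    -------------            b
            ---------        c
                    -----    d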

Step 4.5. Add lowercase letters behind the respective means.

Treatment   G         F          E         A          D          C          B
Mean        1,316 a   1,681 ab   1,796 b   2,127 bc   2,128 bc   2,552 cd   2,678 d
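
The underline method of Steps 4.1 to 4.5 can be sketched in Python as below. This is a minimal sketch that assumes equal replication, so a single threshold applies to every pair; the function name `letter_groups` is hypothetical.

```python
def letter_groups(means, threshold):
    """Return {treatment: letters} using the underline method of Step 4."""
    ranked = sorted(means.items(), key=lambda item: item[1])
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]
    n = len(values)

    # Step 4.2: from each mean, extend a line to the farthest mean
    # that is NOT significantly different (difference <= threshold).
    lines = []
    for start in range(n):
        end = start
        while end + 1 < n and values[end + 1] - values[start] <= threshold:
            end += 1
        lines.append((start, end))

    # Step 4.3: ignore lines that fall within the boundary of another line.
    kept = [
        (s, e) for (s, e) in lines
        if not any((a, b) != (s, e) and a <= s and e <= b for (a, b) in lines)
    ]

    # Steps 4.4 and 4.5: label the kept lines a, b, c, ... from the top
    # and attach each letter to every mean the line covers.
    letters = {name: "" for name in names}
    for k, (s, e) in enumerate(kept):
        for idx in range(s, e + 1):
            letters[names[idx]] += chr(ord("a") + k)
    return letters


means = {"A": 2127, "B": 2678, "C": 2552, "D": 2128,
         "E": 1796, "F": 1681, "G": 1316}
print(letter_groups(means, 453))
# {'G': 'a', 'F': 'ab', 'E': 'b', 'A': 'bc', 'D': 'bc', 'C': 'cd', 'B': 'd'}
```

A treatment that differs from every other treatment keeps its own one-point line and therefore receives its own letter.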

F-protected LSD when $r_j \ne r_{j'}$

$$\mathrm{LSD} = t_{.05/2;\,\text{error df}} \sqrt{s^2\left(\dfrac{1}{r_j} + \dfrac{1}{r_{j'}}\right)}$$

Given:

SOV         Df   SS      MS      F
Treatment    3   0.978   0.326   6.392**
Error       13   0.660   0.051
Total       16   1.638

And

Treatment   A     B     C     D
n           5     3     5     4
Mean        2.0   1.7   2.4   2.1

How many LSDs do we need to calculate?

Step 1. Calculate the LSDs.

LSD #1) Treatment A or C vs. Treatment B: $2.160\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{3}\right)} = 0.356 \approx 0.4$

LSD #2) Treatment A or C vs. Treatment D: $2.160\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{4}\right)} = 0.327 \approx 0.3$

LSD #3) Treatment A vs. C: $2.160\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{5}\right)} = 0.309 \approx 0.3$

LSD #4) Treatment B vs. D: $2.160\sqrt{0.051\left(\dfrac{1}{3} + \dfrac{1}{4}\right)} = 0.373 \approx 0.4$

Step 2. Write down the means in order from low to high.

Treatment   B     A     D     C
n           3     5     4     5
Mean        1.7   2.0   2.1   2.4

Step 3. Calculate differences between treatment means to determine which ones are significantly
different from each other.

If the difference between two treatment means is greater than the LSD, then those treatment
means are significantly different at the 95% level of confidence.

Treatment A vs. Treatment B (LSD #1)    2.0 – 1.7 = 0.3ns
Treatment D vs. Treatment B (LSD #4)    2.1 – 1.7 = 0.4ns
Treatment C vs. Treatment B (LSD #1)    2.4 – 1.7 = 0.7*

Treatment D vs. Treatment A (LSD #2)    2.1 – 2.0 = 0.1ns
Treatment C vs. Treatment A (LSD #3)    2.4 – 2.0 = 0.4*

Treatment C vs. Treatment D (LSD #2)    2.4 – 2.1 = 0.3ns
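
The comparisons above use a different LSD for each pair of replicate numbers. The sketch below follows the same convention as the worked example, assuming both the LSD and the difference are rounded to one decimal place before comparing; the function name `compare_unequal` is hypothetical.

```python
import math
from itertools import combinations

T_CRIT = 2.160    # t(0.025, 13 df), copied from the example
ERROR_MS = 0.051
means = {"A": 2.0, "B": 1.7, "C": 2.4, "D": 2.1}
reps = {"A": 5, "B": 3, "C": 5, "D": 4}


def compare_unequal(means, reps, t_crit, error_ms):
    """Flag each pair whose rounded difference exceeds its own rounded LSD."""
    results = {}
    for a, b in combinations(sorted(means), 2):
        lsd = t_crit * math.sqrt(error_ms * (1 / reps[a] + 1 / reps[b]))
        diff = abs(means[a] - means[b])
        # Round both sides to one decimal, as the worked example does,
        # which also avoids floating-point noise such as 0.40000000000000013.
        results[(a, b)] = round(diff, 1) > round(lsd, 1)
    return results


results = compare_unequal(means, reps, T_CRIT, ERROR_MS)
significant = sorted(pair for pair, is_sig in results.items() if is_sig)
print(significant)  # [('A', 'C'), ('B', 'C')]
```

Comparing unrounded values would be the stricter choice; the rounding here is only to mirror the hand calculation.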
Step 4. Place lowercase letters behind treatment means to show which treatments are
significantly different.
Step 4.1. Write the treatment letters horizontally, in ranked order.

B   A   D   C

Step 4.2. Underline treatments that are not significantly different.

B   A   D   C
---------
    -----
        -----

Step 4.3. Ignore those lines that fall within the boundary of another line (here the A to D line,
which lies within the B to D line).

B   A   D   C
---------
        -----

Step 4.4. Label each line, beginning with the top one, with lowercase letters beginning with “a.”

B   A   D   C
---------        a
        -----    b

Step 4.5. Add lowercase letters behind the respective means.

Treatment   B       A       D        C
n           3       5       4        5
Mean        1.7 a   2.0 a   2.1 ab   2.4 b
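
F-protected LSD with Sampling when $r_j s_k \ne r_{j'} s_{k'}$ or $r_j s_k = r_{j'} s_{k'}$

$$\mathrm{LSD} = t_{.05/2;\,\text{error df}} \sqrt{s^2\left(\dfrac{1}{r_j s_k} + \dfrac{1}{r_{j'} s_{k'}}\right)}$$

If $r_j s_k = r_{j'} s_{k'}$: $\mathrm{LSD} = t_{.05/2;\,\text{error df}} \sqrt{\dfrac{2s^2}{rs}}$
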
Tukey’s Procedure

This test takes into consideration the number of means involved in the comparison.

Tukey’s procedure uses the distribution of the studentized range statistic.

$$q = \dfrac{\bar{y}_{\max} - \bar{y}_{\min}}{\sqrt{MS_{Error}/r}}$$

where $\bar{y}_{\max}$ and $\bar{y}_{\min}$ are the largest and smallest treatment means, respectively, out of a
group of $p$ treatment means.

Appendix Table VII, pages 621 and 622, contains values of $q_\alpha(p, f)$, the upper $\alpha$ percentage
points of $q$, where $f$ is the number of degrees of freedom associated with the Mean Square
Error.

As the number of means involved in a comparison increases, the studentized range statistic
increases.

The basis behind Tukey’s Procedure is that, in general, as the number of means involved in a
test increases, the probability that they will all be alike decreases (i.e., the
probability of detecting differences increases).

Tukey’s Procedure accounts for this fact by increasing the studentized range statistic as the
number of treatments ($p$) increases, such that the probability that the means will be alike
remains the same.

If $r_i = r_{i'}$, Tukey’s statistic is $T_\alpha = q_\alpha(p, f)\sqrt{\dfrac{MS_{Error}}{r}}$

Two treatment means are considered significantly different if the difference between their
means is greater than $T_\alpha$.

Example (using the same data previously used for the LSD example)

Given an experiment analyzed as a CRD that has 7 treatments and 4 replicates with the following
analysis:

SOV         Df   SS          MS        F
Treatment    6   5,587,174   931,196   9.83**
Error       21   1,990,238    94,773
Total       27   7,577,412

and the following treatment means:

Treatment   A       B       C       D       E       F       G
Mean        2,127   2,678   2,552   2,128   1,796   1,681   1,316

Use Tukey’s procedure to show which means are significantly different at the 95%
level of confidence.

Step 1. Calculate Tukey’s statistic.

$$T_\alpha = q_\alpha(p, f)\sqrt{\dfrac{MS_{Error}}{r}}$$

$$T_{0.05} = q_{0.05}(7, 21)\sqrt{\dfrac{94{,}773}{4}} = 4.60\sqrt{23{,}693.25} = 708.06 \approx 708$$

Step 2. Rank treatment means from low to high

Treatment   G       F       E       A       D       C       B
Mean        1,316   1,681   1,796   2,127   2,128   2,552   2,678

Step 3. Calculate differences between treatment means to determine which ones are significantly
different from each other.

If the difference between two treatment means is greater than $T_\alpha$, then those treatment means are
significantly different at the 95% level of confidence.

Treatment F vs. Treatment G    1681 – 1316 = 365ns
Treatment E vs. Treatment G    1796 – 1316 = 480ns
Treatment A vs. Treatment G    2127 – 1316 = 811*

Since the mean of Treatment A is significantly greater than that of Treatment G, every mean
greater than that of Treatment A is also significantly different from Treatment G.
Thus there is no need to keep comparing Treatment G with treatments whose means are
greater than the mean of Treatment A.

Treatment E vs. Treatment F    1796 – 1681 = 115ns
Treatment A vs. Treatment F    2127 – 1681 = 446ns
Treatment D vs. Treatment F    2128 – 1681 = 447ns
Treatment C vs. Treatment F    2552 – 1681 = 871*
*Therefore Treatment B must also be different from Treatment F

Treatment A vs. Treatment E    2127 – 1796 = 331ns
Treatment D vs. Treatment E    2128 – 1796 = 332ns
Treatment C vs. Treatment E    2552 – 1796 = 756*
*Therefore Treatment B must also be different from Treatment E

Treatment D vs. Treatment A    2128 – 2127 = 1ns
Treatment C vs. Treatment A    2552 – 2127 = 425ns
Treatment B vs. Treatment A    2678 – 2127 = 551ns

Because Treatment B is not significantly different from Treatment A, the remaining pairs among
Treatments A, D, C, and B, whose differences are all smaller, are not significantly different either.
Step 4. Place lowercase letters behind treatment means to show which treatments are
significantly different.
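Step 4.1. Write the treatment letters horizontally, in ranked order.

G   F   E   A   D   C   B

Step 4.2. Underline treatments that are not significantly different.

G   F   E   A   D   C   B
---------
    -------------
        ---------
            -------------
                ---------
                    -----

Step 4.3. Ignore those lines that fall within the boundary of another line (here the E to D line,
which lies within the F to D line, and the D to B and C to B lines, which lie within the A to B line).

G   F   E   A   D   C   B
---------
    -------------
            -------------

Step 4.4. Label each line, beginning with the top one, with lowercase letters beginning with “a.”

G   F   E   A   D   C   B
---------                    a
    -------------            b
            -------------    c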

Step 4.5. Add lowercase letters behind the respective means.

Treatment   G         F          E          A          D          C         B
Mean        1,316 a   1,681 ab   1,796 ab   2,127 bc   2,128 bc   2,552 c   2,678 c
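
Because both worked examples use the same data, the two procedures can be compared directly. The sketch below counts how many of the 21 pairs exceed each threshold (453 for the LSD, 708 for Tukey’s statistic); the function name `count_significant` is an illustrative assumption.

```python
from itertools import combinations

means = {"A": 2127, "B": 2678, "C": 2552, "D": 2128,
         "E": 1796, "F": 1681, "G": 1316}
LSD = 453    # rounded LSD from the first example
TUKEY = 708  # rounded Tukey statistic from this example


def count_significant(means, threshold):
    """Count pairs of means whose absolute difference exceeds the threshold."""
    return sum(1 for a, b in combinations(means.values(), 2)
               if abs(a - b) > threshold)


print(count_significant(means, LSD))    # 11
print(count_significant(means, TUKEY))  # 8
```

On these data the larger Tukey threshold declares fewer pairs different, which is why its letter groups are wider than those from the LSD.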

Tukey-Kramer Procedure

Used for unbalanced data (i.e., $r_i \ne r_{i'}$).

$$T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}}\sqrt{\text{Error MS}\left(\dfrac{1}{r_i} + \dfrac{1}{r_{i'}}\right)}$$

Example

Given:

SOV         Df   SS      MS      F
Treatment    3   0.978   0.326   6.392**
Error       13   0.660   0.051
Total       16   1.638

And

Treatment   A     B     C     D
n           5     3     5     4
Mean        2.0   1.7   2.4   2.1
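
How many $T_\alpha$ values do we need to calculate?

Step 1. Calculate the $T_\alpha$ values,

where $T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}}\sqrt{\text{Error MS}\left(\dfrac{1}{r_i} + \dfrac{1}{r_{i'}}\right)}$ and $q_\alpha(p, f) = q_{0.05}(4, 13) = 4.15$.

$T_\alpha$ #1) Treatment A or C vs. Treatment B: $\dfrac{4.15}{\sqrt{2}}\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{3}\right)} = 0.484 \approx 0.5$

$T_\alpha$ #2) Treatment A or C vs. Treatment D: $\dfrac{4.15}{\sqrt{2}}\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{4}\right)} = 0.445 \approx 0.4$

$T_\alpha$ #3) Treatment A vs. C: $\dfrac{4.15}{\sqrt{2}}\sqrt{0.051\left(\dfrac{1}{5} + \dfrac{1}{5}\right)} = 0.419 \approx 0.4$

$T_\alpha$ #4) Treatment B vs. D: $\dfrac{4.15}{\sqrt{2}}\sqrt{0.051\left(\dfrac{1}{3} + \dfrac{1}{4}\right)} = 0.506 \approx 0.5$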

Step 2. Write down the means in order from low to high.

Treatment   B     A     D     C
n           3     5     4     5
Mean        1.7   2.0   2.1   2.4

Step 3. Calculate differences between treatment means to determine which ones are significantly
different from each other.
If the difference between two treatment means is greater than the $T_\alpha$ value, then those
treatment means are significantly different at the 95% level of confidence.
Treatment A vs. Treatment B (Tα value #1) 2.0 – 1.7 = 0.3ns
Treatment D vs. Treatment B (Tα value #4) 2.1 – 1.7 = 0.4ns
Treatment C vs. Treatment B (Tα value #1) 2.4 – 1.7 = 0.7*
Treatment D vs. Treatment A (Tα value #2) 2.1 – 2.0 = 0.1ns
Treatment C vs. Treatment A (Tα value #3) 2.4 – 2.0 = 0.4ns
Step 4. Place lowercase letters behind treatment means to show which treatments are
significantly different.
Step 4.1. Write the treatment letters horizontally, in ranked order.

B   A   D   C

Step 4.2. Underline treatments that are not significantly different.

B   A   D   C
---------
    ---------
        -----

Step 4.3. Ignore those lines that fall within the boundary of another line (here the D to C line,
which lies within the A to C line).

B   A   D   C
---------
    ---------

Step 4.4. Label each line, beginning with the top one, with lowercase letters beginning with “a.”

B   A   D   C
---------        a
    ---------    b

Step 4.5. Add lowercase letters behind the respective means.

Treatment   B       A        D        C
n           3       5        4        5
Mean        1.7 a   2.0 ab   2.1 ab   2.4 b
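
Tukey’s Procedure with Sampling

$$T_\alpha = q_\alpha(p, f) \cdot s_{\bar{Y}}, \quad \text{where } s_{\bar{Y}} = \sqrt{\dfrac{s^2}{rs}}$$

Tukey-Kramer Procedure with Sampling

$$T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}}\sqrt{s^2\left(\dfrac{1}{r_j s_k} + \dfrac{1}{r_{j'} s_{k'}}\right)}$$
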
Output of the Proc Anova Command

The ANOVA Procedure

Class Level Information
Class   Levels   Values
trt     3        a b c

Number of Observations Read   12
Number of Observations Used   12

Dependent Variable: yield

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2      300.5000000   150.2500000      3.56   0.0725
Error              9      379.5000000    42.1666667
Corrected Total   11      680.0000000

R-Square   Coeff Var   Root MSE   yield Mean
0.441912   17.55023    6.493587   37.00000

Source   DF   Anova SS      Mean Square   F Value   Pr > F
trt       2   300.5000000   150.2500000      3.56   0.0725

t Tests (LSD) for yield

Note: This test controls the Type I comparisonwise error rate, not the
experimentwise error rate.

Alpha                          0.05
Error Degrees of Freedom       9
Error Mean Square              42.16667
Critical Value of t            2.26216
Least Significant Difference   10.387

Means with the same letter are not significantly different.

t Grouping   Mean     N   trt
     A       43.000   4   c
     A
B    A       37.250   4   b
B
B            30.750   4   a

Tukey's Studentized Range (HSD) Test for yield

Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error
rate than REGWQ.

Alpha                                 0.05
Error Degrees of Freedom              9
Error Mean Square                     42.16667
Critical Value of Studentized Range   3.94840
Minimum Significant Difference        12.82

Means with the same letter are not significantly different.

Tukey Grouping   Mean     N   trt
      A          43.000   4   c
      A
      A          37.250   4   b
      A
      A          30.750   4   a
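
The two thresholds printed in the SAS output can be reproduced from the formulas given earlier for equal replication. This is a minimal Python sketch using only the values shown in the output (critical values, Error Mean Square, and N = 4 per treatment); the variable names are illustrative assumptions.

```python
import math

ERROR_MS = 42.16667  # Error Mean Square from the output
R = 4                # observations per treatment (N in the output)
T_CRIT = 2.26216     # Critical Value of t
Q_CRIT = 3.94840     # Critical Value of Studentized Range

# LSD = t * sqrt(2 * ErrorMS / r)
lsd = T_CRIT * math.sqrt(2 * ERROR_MS / R)
# Tukey minimum significant difference = q * sqrt(ErrorMS / r)
msd = Q_CRIT * math.sqrt(ERROR_MS / R)

print(round(lsd, 3))  # 10.387
print(round(msd, 2))  # 12.82

# Largest difference between means: trt c (43.000) vs. trt a (30.750)
largest_diff = 43.000 - 30.750
print(largest_diff > lsd)  # True: c and a get different letters under the LSD
print(largest_diff > msd)  # False: all means share the letter A under Tukey
```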