Unit 12: Analysis of
Single Factor Experiments
Statistics 571: Statistical Methods
Ramón V. León
7/16/2004
Unit 12 - Stat 571 - Ramón V. León
Introduction
• Chapter 8: How to compare two treatments.
• Chapter 12:
– How to compare more than two treatments
– Limited to a single treatment factor
• Example of single factor experiment:
– Compare the flight distances of three types of golf balls differing in the
shape of dimples on them: circular, fat elliptical, and thin elliptical
– Treatment factor: type of ball
– Factor levels: circular, fat elliptical, and thin elliptical
– Treatments: circular, fat elliptical, and thin elliptical
• How would an experiment with more than one treatment
factor look?
Experimental Designs
                      Two Treatments               More Than Two Treatments
Independent Samples   Independent Samples Design   Completely Randomized Design
Dependent Samples     Matched Pair Design          Randomized Block Design
Completely Randomized Design
Random sample drawn in each of six molding stations.
Runs should be in random order to protect against time trend
Completely Randomized Design Notation
If the sample sizes $n_1, \dots, n_a$ are equal, the design is balanced; otherwise the design is unbalanced.
The total sample size is
$$N = \sum_{i=1}^{a} n_i$$
Completely Randomized Design: Comments
• In a CRD the experimental units are randomly
assigned to each treatment
• Similar data also arise in observational studies,
where the units are not assigned to the different
groups by the investigator
• Stronger conclusions are possible with
experimental data
Completely Randomized Design Data Inspection
[Figure: data table; the callout marks a nominal variable.]
CRD Side-by-Side Box Plots
[Figure: side-by-side box plots of Weights (vertical axis, roughly 51 to 52.5) by Station (1 to 6).]
Station 5 has two outliers.
Stations 4, 5, and 6, which are supplied by feeder 2, have a higher average as a group than stations 1, 2, and 3, which are supplied by feeder 1. Is this difference real or the result of sampling variation?
CRD Model and Estimation
Model assumption: the data on the i-th treatment are a random sample from an $N(\mu_i, \sigma^2)$ population:
$$Y_{ij} = \mu_i + \varepsilon_{ij} \quad (i = 1, 2, \dots, a;\ j = 1, 2, \dots, n_i)$$
where the $\varepsilon_{ij}$ are independent and identically distributed (i.i.d.) $N(0, \sigma^2)$ random errors.
CRD Model and Estimation
The treatment means $\mu_i$ and the error variance $\sigma^2$ are unknown parameters. The primary interest is in comparing the means.
Frequently, we write $\mu_i = \mu + \tau_i$, where $\mu$ is the "grand mean", defined as the weighted average of the $\mu_i$:
$$\mu = \frac{\sum_{i=1}^{a} n_i \mu_i}{\sum_{i=1}^{a} n_i} = \frac{\sum_{i=1}^{a} \mu_i}{a} \ \text{ if the } n_i = n \text{ are equal,}$$
and $\tau_i = \mu_i - \mu$ is the deviation of the i-th treatment mean from this grand mean.
We refer to $\tau_i$ as the i-th treatment effect.
CRD Model and Estimation
Alternative Formulation of the Model:
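$$Y_{ij} = \mu + \tau_i + \varepsilon_{ij} \quad (i = 1, 2, \dots, a;\ j = 1, 2, \dots, n_i)$$
The $\tau_i$ are subject to the constraint
$$\sum_{i=1}^{a} n_i \tau_i = 0 \quad \left(\text{which reduces to } \sum_{i=1}^{a} \tau_i = 0 \text{ if the } n_i = n \text{ are equal}\right).$$
So there are only $a - 1$ linearly independent $\tau_i$'s.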
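The definitions of the grand mean and the treatment effects can be checked numerically. The sketch below uses made-up treatment means and sample sizes (they are not from the slides) and the hypothetical helper names `grand_mean` and `treatment_effects`; it only illustrates that the weighted constraint holds by construction.

```python
# Hypothetical treatment means and sample sizes (illustration only).
means = [10.0, 12.0, 17.0]
sizes = [4, 6, 10]


def grand_mean(means, sizes):
    """Weighted average of the treatment means, with weights n_i."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def treatment_effects(means, sizes):
    """tau_i = mu_i - mu, the deviation of each mean from the grand mean."""
    mu = grand_mean(means, sizes)
    return [m - mu for m in means]


mu = grand_mean(means, sizes)
taus = treatment_effects(means, sizes)

# The constraint: the n_i-weighted sum of the effects is zero.
weighted_sum = sum(n * t for n, t in zip(sizes, taus))
print(mu, taus, weighted_sum)
```

Because the effects are tied together by one linear constraint, knowing any a - 1 of them determines the last one.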
CRD Parameter Estimates
$\hat{\sigma}^2 = s^2$: a measure of the common experimental error
ANOVA in JMP’s Fit Model Platform
Note that the Station variable is nominal
CRD Parameter Estimates
[Figure: JMP parameter estimates output, annotated with $\hat{\mu}$, $\hat{\tau}_1, \dots, \hat{\tau}_5$, and $s^2$.]
How do we find the value of $\hat{\tau}_6$?
Relationship to Dummy Variable Regression
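$$z_i = \begin{cases} 1 & \text{if station } i \\ -1 & \text{if station 6} \\ 0 & \text{otherwise} \end{cases} \qquad i = 1, 2, \dots, 5$$
$$y = 51.57 + 0.09 z_1 - 0.23 z_2 - 0.33 z_3 + 0.05 z_4 + 0.13 z_5 + \varepsilon$$
$$y = \hat{\mu} + \hat{\tau}_1 z_1 + \hat{\tau}_2 z_2 + \hat{\tau}_3 z_3 + \hat{\tau}_4 z_4 + \hat{\tau}_5 z_5 + \varepsilon$$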
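The coding above can be written out directly. The following is a minimal sketch, assuming the rounded coefficients shown in the fitted equation; the function names `effect_code` and `predicted_mean` are hypothetical. Under this coding the six effects sum to zero, so the sixth effect is minus the sum of the other five.

```python
def effect_code(station, n_stations=6):
    """Return (z_1, ..., z_5) for an observation from the given station."""
    if station == n_stations:
        # The last station is coded -1 on every dummy variable.
        return [-1] * (n_stations - 1)
    return [1 if station == i else 0 for i in range(1, n_stations)]


# Rounded estimates read off the fitted equation on the slide.
mu_hat = 51.57
tau_hat = [0.09, -0.23, -0.33, 0.05, 0.13]


def predicted_mean(station):
    """Fitted mean for a station: mu_hat plus the coded effects."""
    z = effect_code(station)
    return mu_hat + sum(t * zi for t, zi in zip(tau_hat, z))


# Effects sum to zero under this coding, so tau_6 = -(tau_1 + ... + tau_5).
tau6_hat = -sum(tau_hat)
print(round(tau6_hat, 2), round(predicted_mean(6), 2))
```

Since the inputs are rounded to two decimals, the recovered value of the sixth effect is only approximate.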
CRD Parameter Estimates
[Figure: JMP output, annotated with $s^2$.]
CRD (1-α)-level Confidence Interval
$$\bar{y}_i - t_{N-a,\alpha/2} \frac{s}{\sqrt{n_i}} \le \mu_i \le \bar{y}_i + t_{N-a,\alpha/2} \frac{s}{\sqrt{n_i}}$$
However, usually we are more interested in comparing
the µi with each other than estimating them separately.
[Figure: output from JMP's Fit Y by X platform.]
Mean Diamonds in JMP
Why do all the
diamonds have
the same height?
Analysis of Variance
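Homogeneity hypothesis:
$$H_0: \mu_1 = \mu_2 = \dots = \mu_a \quad \text{vs.} \quad H_1: \text{not all the } \mu_i \text{ are equal,}$$
or equivalently
$$H_0: \tau_1 = \tau_2 = \dots = \tau_a = 0 \quad \text{vs.} \quad H_1: \text{at least some } \tau_i \ne 0.$$
Note: SSA = treatment sum of squares.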
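The quantities in a one-way ANOVA table can be computed by hand. The sketch below uses small made-up data and a hypothetical function name `one_way_anova`; it follows the standard decomposition of the total sum of squares into SSA (treatments) and SSE (error), with the F statistic taken as MSA/MSE. The p-value is omitted because it needs an F distribution routine that the Python standard library does not provide.

```python
def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F statistic."""
    a = len(groups)                       # number of treatments
    N = sum(len(g) for g in groups)       # total number of observations
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]

    # Between-treatment and within-treatment sums of squares.
    ssa = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    msa = ssa / (a - 1)
    mse = sse / (N - a)                   # this is s^2, the pooled variance
    return {
        "SSA": ssa, "SSE": sse,
        "df_treatment": a - 1, "df_error": N - a,
        "MSA": msa, "MSE": mse, "F": msa / mse,
    }


# Hypothetical data: three treatments, three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

A large F relative to the F distribution with (a - 1, N - a) degrees of freedom is evidence against the homogeneity hypothesis.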
ANOVA in JMP
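Wrong ANOVA table (model: $Y = \beta_0 + \beta_1\,\text{Station} + \varepsilon$, with Station treated as a continuous variable):
Note that the SS has the wrong number of degrees of freedom (1 instead of a - 1 = 5).
Correct ANOVA table (model: $Y = \mu + \tau_1 z_1 + \tau_2 z_2 + \tau_3 z_3 + \tau_4 z_4 + \tau_5 z_5 + \varepsilon$):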
Model Diagnostics: Residuals versus Fitted Value
$$e_{ij} = y_{ij} - \bar{y}_i$$
Part of the "Fit Model" output.
This plot checks the assumption of constant error variance $\sigma^2$.
A cone shape in this plot would suggest a log transformation of the response.
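In the CRD the fitted value for every observation is its treatment mean, so the residuals are easy to compute directly. The following sketch uses made-up data and a hypothetical helper `fitted_and_residuals`; comparing the per-treatment standard deviations of the residuals is only an informal stand-in for reading the plot.

```python
from statistics import mean, stdev


def fitted_and_residuals(groups):
    """For each treatment, return (fitted value, list of residuals)."""
    result = []
    for ys in groups:
        fitted = mean(ys)                 # fitted value = treatment mean
        result.append((fitted, [y - fitted for y in ys]))
    return result


# Hypothetical data: three treatments, three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
fits = fitted_and_residuals(groups)

# Residual spread per treatment; roughly equal values are consistent
# with a constant error variance.
spread = [stdev(res) for _, res in fits]
print(fits, spread)
```

If the spread grew with the fitted value, the points in the plot would fan out into the cone shape mentioned above.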
Model Diagnostic: Assumption of Equal Variances
(More Formal Tests)
Model Diagnostics: Residual Versus Row (Time?) Order
Fit Model Platform:
A time pattern here would be confounded with a station effect.
The rows of the JMP table should be in the random order in which the data are supposed to have been collected.
Model Diagnostics: Normal Plot of Residuals
Strong indication that
errors are normally
distributed.
Multiple Comparison of Means
If $H_0: \mu_1 = \dots = \mu_a$ is rejected, all that we can say is that the treatment means are not equal. The F-test does not pinpoint which treatment means are significantly different from each other.
We could test all pairwise equality hypotheses $H_{0ij}: \mu_i = \mu_j$.
Reject $H_{0ij}$ if
$$|t_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{s \sqrt{1/n_i + 1/n_j}} > t_{N-a,\alpha/2} \iff |\bar{y}_i - \bar{y}_j| > t_{N-a,\alpha/2}\, s \sqrt{1/n_i + 1/n_j} = \text{LSD}$$
(LSD = least significant difference)
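As a worked special case (a sketch, assuming a balanced design with common sample size n, which the slide does not require), the LSD threshold simplifies so that every pair of means is compared against the same number:

```latex
\[
\mathrm{LSD}
  = t_{N-a,\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{1}{n}}
  = t_{N-a,\alpha/2}\, s \sqrt{\frac{2}{n}},
\qquad N = an .
\]
```

With unequal sample sizes the threshold has to be recomputed for each pair (i, j).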
Pairwise Equality Hypotheses
Since each of the 15 pairwise tests has level α, the type I error
probability of declaring at least one pairwise difference
falsely significant will exceed α.
Familywise error rate (FWE):
FWE = P{Reject at least one true null hypothesis}
If all six means are actually equal in the plastic container example
FWE = 0.350 when each LSD test is done at the 0.05 level.
Fisher’s protected LSD method:
Use LSD method only after the F-test rejects
(This method is not recommended today.)
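The count of 15 tests and two rough reference values for the FWE can be reproduced with a few lines. This is only a sketch with hypothetical function names: the first value pretends the 15 tests are independent and the second is the Bonferroni upper bound. Neither is the exact FWE. The 0.350 quoted above lies below both, which is consistent with the pairwise tests being dependent, since they share the same sample means and the same s.

```python
from math import comb


def number_of_pairs(a):
    """Number of pairwise comparisons among a treatment means."""
    return comb(a, 2)


def fwe_if_independent(n_tests, alpha):
    """FWE if the tests were independent (they are not for pairwise LSD)."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_bound(n_tests, alpha):
    """Upper bound on the FWE that holds under any dependence."""
    return min(1.0, n_tests * alpha)


n_pairs = number_of_pairs(6)              # six stations
approx = fwe_if_independent(n_pairs, 0.05)
bound = bonferroni_bound(n_pairs, 0.05)
print(n_pairs, round(approx, 3), bound)
```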
LSD Method in JMP
[Figure: JMP output for the LSD method, with the overlap marks labeled.]
If the overlap marks overlap, the two means are not
significantly different according to the LSD criterion.
LSD Method in JMP
[Figure: LSD output from JMP's Fit Y by X platform.]
Tukey Method
Recommended method: FWE = α if the sample sizes are equal, and the method is slightly conservative (i.e., the actual FWE is < α) when the sample sizes are unequal.
This report shows the ranked differences, from highest to lowest, with
a confidence interval band overlaid on the plot.
Confidence intervals that do not fully contain their corresponding bar
are significantly different from each other.
Tukey Method Confidence Intervals
This is a way of constructing 100(1-α)% simultaneous confidence intervals
(SCIs) for all pairwise differences of means.
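For reference, the usual form of the Tukey (Tukey-Kramer) intervals is sketched below. The symbol $q_{a,N-a,\alpha}$, the upper α critical point of the Studentized range distribution with a and N - a degrees of freedom, is notation assumed here; it does not appear in the extracted slide text.

```latex
\[
\mu_i - \mu_j \;\in\;
\left[\, \bar{y}_i - \bar{y}_j \;\pm\;
  q_{a,N-a,\alpha}\, \frac{s}{\sqrt{2}}
  \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \,\right],
\qquad 1 \le i < j \le a .
\]
% Balanced case n_i = n_j = n: the half-width reduces to q_{a,N-a,\alpha}\, s / \sqrt{n}.
```

Two means are declared significantly different when their interval excludes zero.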
Tukey Method
Confidence Intervals
Compare to the Minitab
output at the bottom of Figure
12.6 of your textbook. How
would you get the top output
in that figure?
Dunnett Method for Comparisons with a Control
Dunnett Method
in JMP
Hsu Method for Comparison with the Best
Box Plots for Teaching Method
[Figure: box plots of Test Score (vertical axis, 10 to 40) by teaching Method: Case, Equation, Formula, Unitary Analysis.]
Hsu Method in
JMP
Explanation on the next page.
Hsu Method
in JMP
The Unitary method is best.
We can't tell which is the worst method.
Randomized Block Design
• Blocking helps to reduce the experimental error variation caused by
differences among the experimental units by grouping them into
homogeneous sets (called blocks).
• Treatments are randomly assigned within each block.
Randomized Block Design Model: Fixed Block Effects
$$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \dots, a;\ j = 1, \dots, b)$$
where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$,
$\mu$ is called the grand mean,
$\tau_i$ is called the i-th treatment effect,
$\beta_j$ is called the j-th block effect.
$$\sum_{i=1}^{a} \tau_i = 0 \quad \text{and} \quad \sum_{j=1}^{b} \beta_j = 0,$$
so there are
$a - 1$ independent treatment effects and
$b - 1$ independent block effects.
“Mystery of Degrees of Freedom Explained”
Counting the grand mean there are 1 + (a -1) + (b -1) = a + b − 1
unknown parameters. (This many degrees of freedom are needed
to estimate these parameters.)
There are N = ab observations (total degrees of freedom).
So there are ν = ab − (a + b − 1) = (a − 1)(b − 1) degrees of
freedom for estimating the error variation
(degrees of freedom for error).
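The counting argument above can be checked for any design size. The sketch below uses a hypothetical function name and made-up values of a and b; it simply confirms that ab - (a + b - 1) equals (a - 1)(b - 1).

```python
def rbd_degrees_of_freedom(a, b):
    """Degrees-of-freedom bookkeeping for an RBD with a treatments, b blocks."""
    n_parameters = 1 + (a - 1) + (b - 1)  # grand mean + treatments + blocks
    n_observations = a * b                # one observation per cell
    return {
        "parameters": n_parameters,
        "observations": n_observations,
        "error": n_observations - n_parameters,
    }


# Hypothetical design: 4 treatments in 3 blocks.
df = rbd_degrees_of_freedom(4, 3)
print(df)
```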
No Interactions Between Treatments and Blocks
The difference in mean responses between any two treatments
is the same across all blocks
$$\mu_{ij} - \mu_{i'j} = (\mu + \tau_i + \beta_j) - (\mu + \tau_{i'} + \beta_j) = \tau_i - \tau_{i'}$$
which is independent of the particular block j.
We say that there are no interactions between treatments and blocks
Example: Consider the treatments to be fertilizer and the blocks
to be different fields. Then no interaction implies that the difference
in mean yields between any two fertilizers is the same for all fields.
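A small numerical illustration of the additive (no interaction) structure follows. The grand mean, fertilizer effects, and field effects are made-up values, and `treatment_difference` is a hypothetical helper; the point is only that the difference between two treatments comes out the same in every block.

```python
# Hypothetical parameters: 3 fertilizers (treatments), 4 fields (blocks).
mu = 50.0
tau = [-2.0, 0.5, 1.5]           # treatment effects, sum to zero
beta = [3.0, -1.0, -2.0, 0.0]    # block effects, sum to zero

# Cell means under the additive model: mu_ij = mu + tau_i + beta_j.
cell_mean = [[mu + t + b for b in beta] for t in tau]


def treatment_difference(cell_mean, i, i2):
    """Difference in mean response between treatments i and i2, per block."""
    return [cell_mean[i][j] - cell_mean[i2][j] for j in range(len(cell_mean[i]))]


# The block effect cancels, leaving tau[0] - tau[2] in every field.
diffs = treatment_difference(cell_mean, 0, 2)
print(diffs)
```

If an interaction were present, the entries of `diffs` would vary from block to block.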
RBD Example
Notice that interest is in the differences among the positions. We assume that these differences are the same for all three batches except for random error; that is, we assume no interaction between batch and position.
JMP Analysis of Drip Loss Experiment
[Figure: JMP data table; the callout marks the modeling type as nominal.]
JMP Analysis of Drip Loss Experiment
Position and batch explain 86% of the variation in drip loss
SSModel = SSTreatment + SSBlocks
True because we assume no interaction between
treatment and block. (See next slide.)
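The decomposition can be verified on a small table. The sketch below uses made-up responses (not the drip loss data) with one observation per treatment-block cell and a hypothetical function name; it computes the treatment, block, and error sums of squares and the share of the total variation explained by the model.

```python
def rbd_sums_of_squares(y):
    """y[i][j] is the response for treatment i in block j (one per cell)."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    trt_mean = [sum(row) / b for row in y]
    blk_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    ss_trt = b * sum((m - grand) ** 2 for m in trt_mean)
    ss_blk = a * sum((m - grand) ** 2 for m in blk_mean)
    ss_err = sum(
        (y[i][j] - trt_mean[i] - blk_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_tot = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
    ss_model = ss_trt + ss_blk            # no interaction term in the model
    return {
        "SSTreatment": ss_trt, "SSBlocks": ss_blk, "SSModel": ss_model,
        "SSError": ss_err, "SSTotal": ss_tot, "RSquare": ss_model / ss_tot,
    }


# Hypothetical data: 3 treatments (rows) by 3 blocks (columns).
ss = rbd_sums_of_squares([[5, 7, 6], [8, 9, 10], [4, 6, 5]])
print(ss)
```

The ratio SSModel/SSTotal plays the role of the percentage of variation explained that is quoted for the drip loss data.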
JMP 4 Analysis of Drip Loss Experiment. III
These two tables were not the same in regression. They are equal here because the design is balanced.
Also, in regression the sum of the Type III sums of squares is not equal to the model sum of squares. This is only true here because the design is balanced.
(Type III) Model SS = 56.654971
Recall: The sum of the Type I sums of squares is always equal to the model sum of squares.
The P-values show that there are significant position effects. We recommend ignoring the Block (Batch) test because it is not meaningful for the RBD.
Drip Loss in Meat Loaves: Residual Plots
The predicted versus residual plot is part
of the standard output of the Fit Model
platform. The normal plot was obtained
by saving the residuals and then going to
the Distribution platform.
Tukey Method for the RBD
Use the Fit Model platform with both batch and position in the model. It is important that both variables be included.
Warning: Don’t use the Fit Y by X platform to
do Tukey’s test as you will use the wrong number
of degrees of freedom.
Mixed Effects Model for the RB Design
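$$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \dots, a;\ j = 1, \dots, b)$$
where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$
and the $\beta_j$ are i.i.d. $N(0, \sigma_B^2)$, with the $\varepsilon_{ij}$ and the $\beta_j$ independent.
$\mu$ is called the grand mean,
$\tau_i$ is called the i-th treatment effect,
the $\beta_j$'s are called the block effects.
$$\sum_{i=1}^{a} \tau_i = 0,$$
so there are $a - 1$ independent treatment effects.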
Compare with the results in Section 12.4.5, Example 12.16 of your textbook.
The variability due to batches accounts for about 58.4% of the total variability in drip loss.
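A percentage of this kind is a ratio of variance components. The sketch below assumes the method-of-moments (ANOVA) estimators for a balanced RBD with random blocks, where the expected block mean square is the error variance plus a times the block variance; the software may use a different estimation method, and the mean squares shown are made up rather than taken from the drip loss output. The function name is hypothetical.

```python
def variance_components(ms_blocks, ms_error, a):
    """ANOVA-type estimates for a balanced RBD with random block effects.

    a is the number of treatments, i.e. observations per block.
    """
    sigma2_hat = ms_error
    # Uses E(MS_blocks) = sigma^2 + a * sigma_B^2; truncate at zero.
    sigma2_b_hat = max(0.0, (ms_blocks - ms_error) / a)
    total = sigma2_hat + sigma2_b_hat
    return {
        "sigma2": sigma2_hat,
        "sigma2_B": sigma2_b_hat,
        "block_share": sigma2_b_hat / total,
    }


# Hypothetical mean squares, for illustration only.
vc = variance_components(ms_blocks=10.0, ms_error=2.0, a=4)
print(vc)
```

Here `block_share` is the fraction of the total variance (block plus error) attributed to blocks, the same kind of quantity as the 58.4% quoted above.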