1 Factorial Design Terminology

Suppose we have more than one independent variable that we think is important. Can we manipulate two (or more) things at once? Example: Let's
study verbal memory and gender. In a standard free-recall task, participants
see a list of words during the study phase. After a delay period, there is
a test phase in which the participants write down as many words as they
can. One variable we can manipulate is word frequency. Low-frequency
words (e.g., armadillo) are better recalled than high-frequency words
(e.g., dog). What about gender?
• IV 1: Word Frequency (levels? within/between)
• IV 2: Gender (levels? within/between)
• DV: # of words recalled
• This design is called a 2x2 mixed factorial design.
• 2x2: The first IV has 2 levels and the second IV has 2 levels
• mixed: Some IVs are within; others are between
• factorial: all combinations are present.
                Word Frequency
          High Frequency   Low Frequency
Male             8               12
Female          10               14
• What is a 2x4 within-subject factorial design?
• What is a 3x2 between-subject factorial design?
• What is a 2x2x3 mixed factorial design?
2 Questions in Analysis
In a two-variable design, we are generally interested in the following questions:
1. What is the main effect of one of the IVs?
2. What is the main effect of the other IV?
3. Is there an interaction?
Let's take these questions in turn:
2.1 Main Effects
A main effect is the overall effect of one IV, averaged over the levels of
the other IV.
                Word Frequency
          High Frequency   Low Frequency   Average
Male             8               12           10
Female          10               14           12
Average          9               13           11
The overall grand mean is 11. The overall mean for males is 10; it is 12
for females. The main effect of gender is ±1. Likewise, the overall mean for
High Frequency words is 9; the overall mean for Low Frequency words is 13.
The main effect of word frequency is ±2.
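The marginal means and main effects can be checked with a few lines of code. This is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the notes. A main effect is computed here as a marginal mean's deviation from the grand mean.

```python
import numpy as np

# Cell means from the table: rows = gender (male, female),
# columns = word frequency (high, low).
cells = np.array([[8.0, 12.0],
                  [10.0, 14.0]])

grand_mean = cells.mean()
gender_means = cells.mean(axis=1)      # averages across frequency
frequency_means = cells.mean(axis=0)   # averages across gender

# Main effect = marginal mean minus grand mean.
gender_effects = gender_means - grand_mean
frequency_effects = frequency_means - grand_mean

print("grand mean:", grand_mean)
print("gender effects (male, female):", gender_effects)
print("frequency effects (high, low):", frequency_effects)
```

The gender effects come out as -1 and +1, and the word-frequency effects as -2 and +2.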
[Figure: free-recall score (y-axis) by word frequency (High Frequency, Low Frequency), with separate lines for males (M) and females (F).]
Main effects without an interaction appear as parallel lines. If an
experiment does not have interactions (that is, it only has main effects),
then the effect of one variable does not depend on the level of the other.
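[Figure: free-recall score (y-axis) by gender (Female, Male), with separate lines for low-frequency (L) and high-frequency (H) words.]
[Figure: free-recall score (y-axis) by gender (Female, Male), shown separately for High Frequency and Low Frequency words.]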
2.2 Main-Effects Statistical Model
Data are denoted by $y_{ijk}$, where $i$ refers to the level of the first factor (say gender,
$i = 1$ for men and $i = 2$ for women), $j$ refers to the level of the second factor
(say frequency, $j = 1$ for low and $j = 2$ for high), and $k$ refers to the specific
observation in the cell (so $k = 5$ refers to the fifth observation in the cell).
The main-effects statistical model is
$$y_{ijk} \overset{iid}{\sim} \mathrm{Normal}(\mu + \alpha_i + \beta_j, \sigma),$$
where $\mu$ is the grand mean, $\alpha_i$ is the main effect of the $i$th level of gender, and
$\beta_j$ is the main effect of the $j$th level of frequency.
The Main-Effect Predictions are the predictions of this model for cell
means.
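As an illustration, the main-effect predictions for a table of cell means can be computed as grand mean plus row effect plus column effect. The sketch below assumes NumPy; the function name `main_effect_predictions` is hypothetical, and rows and columns follow the order of the table in Section 2.1 (male, female; high, low) rather than the $i$, $j$ coding above.

```python
import numpy as np

def main_effect_predictions(cells):
    """Return mu + alpha_i + beta_j for every cell of a table of cell means."""
    cells = np.asarray(cells, dtype=float)
    mu = cells.mean()                   # grand mean
    alpha = cells.mean(axis=1) - mu     # row (first factor) effects
    beta = cells.mean(axis=0) - mu      # column (second factor) effects
    # Broadcasting adds every row effect to every column effect.
    return mu + alpha[:, None] + beta[None, :]

# Word-frequency/gender cell means: rows = male, female; columns = high, low.
recall = [[8, 12],
          [10, 14]]
predicted = main_effect_predictions(recall)
print(predicted)
```

For these data the predictions reproduce the observed cell means exactly, which is what a pattern with main effects only looks like.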
2.3 Interactions
What is an interaction? An interaction is when the effect of one variable
depends on the level of the other. There are no interactions in the
word-frequency/gender data above. Here is an experiment with only an interaction
and no main effects:
We are interested in designing the perfect ice-cream. We want to know
people’s preference for sugar and fat in ice-cream. We are going to give
people a taste of ice-cream and ask them how much they would pay for a
half-gallon.
                   Sugar
          Not a lot   Some   Lots
Lo Fat        6         5      4
Creamy        4         5      6
What is the overall average? What is the effect of not a lot of sugar, some
sugar, lots of sugar? What is the effect of lo fat, creamy? Are there any
main effects?
Yet, something is going on. This is an interaction: an effect above and
beyond the main effects. Let's make a graph.
[Figure: price in dollars (y-axis) by sugar level (No Sugar, Low Sugar, Lots-O-Sugar), with separate lines for lo fat (L) and creamy (C).]
                   Sugar
          Not a lot   Some   Lots   Ave
Lo Fat        6         5      4     5
Creamy        4         5      6     5
Ave           5         5      5     5
Here are the Main Effect Predictions:
                   Sugar
          Not a lot   Some   Lots
Lo Fat        5         5      5
Creamy        5         5      5
The fact that the main effect predictions do not account for the data
means that there are interactions. The effect of fat is not consistent—it
depends on the amount of sugar.
2.4 Main Effects + Interactions Model
The main effects + interactions statistical model is
$$y_{ijk} \overset{iid}{\sim} \mathrm{Normal}(\mu + \alpha_i + \beta_j + \pi_{ij}, \sigma),$$
where $\pi_{ij}$ are the interaction terms. For the ice-cream data there are 6 of
them, one per cell. They are:
                   Sugar
          Not a lot   Some   Lots
Lo Fat        1         0     -1
Creamy       -1         0      1
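The interaction terms are what is left after the main-effect predictions are subtracted from the cell means. The following is a minimal sketch of that decomposition for the ice-cream data, assuming NumPy; the variable names are illustrative only.

```python
import numpy as np

# Ice-cream prices: rows = fat (lo fat, creamy),
# columns = sugar (not a lot, some, lots).
price = np.array([[6.0, 5.0, 4.0],
                  [4.0, 5.0, 6.0]])

mu = price.mean()                   # grand mean
alpha = price.mean(axis=1) - mu     # main effects of fat
beta = price.mean(axis=0) - mu      # main effects of sugar

# Main-effect predictions, then the interaction terms as residuals.
additive = mu + alpha[:, None] + beta[None, :]
pi = price - additive

print("fat effects:", alpha)
print("sugar effects:", beta)
print("interaction terms:")
print(pi)
```

Both sets of main effects are zero here, so every prediction equals the grand mean of 5 and the whole pattern sits in the interaction terms. The interaction terms also sum to zero across each row and each column.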
3 Example
Here is another example. We are trying to find the right amount of water
to give to our plants. There are two levels of light (shade and sun) and two
levels of water (little, lots). Here are the data. Are there main effects? Are
there interactions?
                     Light
                 shade   sun
little water       1      1
lots of water      2      6
• Let's graph it.
• Let's figure out the main effects:
                     Light
                 shade   sun   ave
little water       1      1     1
lots of water      2      6     4
ave               1.5    3.5   2.5
• What are the main effect predictions?
– We can do it formally. The main-effects model is
$y_{ijk} \overset{iid}{\sim} \mathrm{Normal}(\mu + \alpha_i + \beta_j, \sigma)$. Let $\alpha$ refer to light and $\beta$ refer to water. In this
case, $\mu = 2.5$; $\alpha_1 = -1$ (shade) and $\alpha_2 = 1$ (sun); and $\beta_1 = -1.5$
(little water) and $\beta_2 = 1.5$ (lots of water).
Predictions:
shade and little: $\mu + \alpha_1 + \beta_1 = 2.5 - 1 - 1.5 = 0$
sun and little: $\mu + \alpha_2 + \beta_1 = 2.5 + 1 - 1.5 = 2$
shade and lots: $\mu + \alpha_1 + \beta_2 = 2.5 - 1 + 1.5 = 3$
sun and lots: $\mu + \alpha_2 + \beta_2 = 2.5 + 1 + 1.5 = 5$
– Alternatively, we can do it within the table, which is what I like:
                              Light
                 shade                 sun
little water     2.5 − 1.5 − 1 = 0     2.5 − 1.5 + 1 = 2
lots of water    2.5 + 1.5 − 1 = 3     2.5 + 1.5 + 1 = 5
Cleaning up:
                     Light
                 shade   sun
little water       0      2
lots of water      3      5
• What are the interactions? Subtract each main-effect prediction from the observed cell mean:
                          Light
                 shade            sun
little water     1 − 0 = 1        1 − 2 = −1
lots of water    2 − 3 = −1       6 − 5 = 1
Cleaning up:
                     Light
                 shade   sun
little water       1     -1
lots of water     -1      1
4 Another Example
It is pretty easy to compare the sizes of animals from imagination. For
example, “Which is bigger, a bear or a whale?” Notice how we ask which
is bigger for big things. It is far less common to hear “Which is smaller, a
bear or a whale?” Yet, it is not uncommon to use “smaller” for small things;
e.g., “Which is smaller, a mouse or a flea?” Why is it that bigger is used for
big things and smaller is used for small things? Is it harder to answer the
mismatched questions; e.g., “Which is smaller, a bear or a whale?” and
“Which is bigger, a mouse or a flea?”
My design is simple. Participants are asked to compare two animals
on size. For each question, the two animals are from the same category of
size. Small animals are fleas, mice, and rats; medium animals are dogs,
hyenas, and monkeys; large animals are bears, hippos, and whales. These
size categories form the first IV. The second IV is the form of the question:
either “which is smaller” or “which is larger.” Here are some sample data:
           Category of Animal Size
          Small   Medium   Large
Smaller    650     700      850
Bigger     750     600      550
The goal is to determine if the question-type mismatches really slowed
performance.
• Plot the data.
• Are there main effects?
• Are there interactions?
• Which characteristic, main effects or interactions, answers the question?
           Category of Animal Size
          Small   Medium   Large   Ave
Smaller    650     700      850    733
Bigger     750     600      550    633
Ave        700     650      700    683
Main Effects:
Category:
• Small: +17
• Medium: -33
• Large: +17
Question:
• Smaller +50
• Larger -50
Main Effect Predictions:
           Category of Animal Size
          Small   Medium   Large
Smaller    750     700      750
Bigger     650     600      650
Interactions:
           Category of Animal Size
          Small   Medium   Large
Smaller   -100      0       100
Bigger     100      0      -100
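Recode Data (the two cells of each type and their mean):
Match (“bigger” for large, “smaller” for small): 550, 650; mean 600
Neutral (medium animals): 700, 600; mean 650
Mismatch (“bigger” for small, “smaller” for large): 750, 850; mean 800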
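The whole analysis of this example (main effects, main-effect predictions, interactions, and the recoded means) can be reproduced in a few lines. This is a sketch under assumptions: it uses NumPy, the variable names are illustrative, and the scores are treated as plain numbers because the notes do not state their units.

```python
import numpy as np

# Sample data: rows = question form ("smaller", "bigger"),
# columns = size category (small, medium, large).
scores = np.array([[650.0, 700.0, 850.0],
                   [750.0, 600.0, 550.0]])

mu = scores.mean()                          # about 683
question_effects = scores.mean(axis=1) - mu # about +50, -50
category_effects = scores.mean(axis=0) - mu # about +17, -33, +17

predictions = mu + question_effects[:, None] + category_effects[None, :]
interactions = scores - predictions

# Recode the six cells by whether the question matches the size category.
match = np.mean([scores[0, 0], scores[1, 2]])     # "smaller"/small, "bigger"/large
neutral = np.mean([scores[0, 1], scores[1, 1]])   # medium animals
mismatch = np.mean([scores[1, 0], scores[0, 2]])  # "bigger"/small, "smaller"/large

print(np.round(predictions))
print(np.round(interactions))
print(match, neutral, mismatch)
```

The recoded means rise from match to neutral to mismatch, which is the same pattern carried by the interaction terms rather than by either main effect.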
5 Example
An experimenter wanted to test the hypothesis that males are more creative
than females. She also hypothesized that the male superiority in creativity
would be heightened under conditions involving ego. The design used was
a 2x2 between-participants factorial design in which the variables were sex
and degree of ego involvement. This latter variable was manipulated with
instructions. The high-ego group was told the task was an intelligence test
with the results posted by name on a bulletin board. The low-ego group
was told the task was part of developing new measures for clinical diagnosis
and that their scores would be averaged together. Her test of creativity was to
give people an object such as a hammer and ask them to write down
as many unusual uses of that object as possible in 5 minutes. Twenty-five
males from ROTC and 25 members of a sorority pledge class were recruited.
The objects were a monkey wrench and a compass. Means were as follows:
             Ego
         Low    High
Men      4.1    7.6
Women    3.2    2.4
• Before critiquing the experiment, let's ask whether the hypotheses were supported.
Sex? Ego?
• Plot the data. Interpret the interactions.
• Let's list some critiques. How would they affect the data pattern?
[Figure: lines for Men (slope 3.5) and Women (slope −.8) across Low Ego and High Ego.]
Main Effect of Gender? Yes: Men = 5.85, Women = 2.80, Diff = 3.05
Main Effect of Ego? Yes: Low = 3.65, High = 5, Diff = 1.35
Interaction: Yes: Slopes are different; lines aren't parallel. Difference in slope = 4.3
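The marginal means and the difference in slopes for the observed data can be verified directly from the four cell means. This is a minimal sketch assuming NumPy, with illustrative variable names; the slope of each line is taken as the high-ego mean minus the low-ego mean.

```python
import numpy as np

# Cell means: rows = sex (men, women), columns = ego involvement (low, high).
means = np.array([[4.1, 7.6],
                  [3.2, 2.4]])

sex_means = means.mean(axis=1)   # men, women
ego_means = means.mean(axis=0)   # low, high

sex_difference = sex_means[0] - sex_means[1]
ego_difference = ego_means[1] - ego_means[0]

# Slope of each line in the plot: change from low ego to high ego.
slopes = means[:, 1] - means[:, 0]          # men, women
slope_difference = slopes[0] - slopes[1]    # nonzero means non-parallel lines

print("sex means:", np.round(sex_means, 2), "diff:", round(sex_difference, 2))
print("ego means:", np.round(ego_means, 2), "diff:", round(ego_difference, 2))
print("slopes:", np.round(slopes, 2), "diff:", round(slope_difference, 2))
```

A difference in slopes of zero would mean parallel lines and no interaction; here the difference is 4.3.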
Suppose the task is too male-centric. Suppose we used a
more gender-neutral task.
[Figure: lines for Men (slope 3.5) and Women (slope −.8) across Low Ego and High Ego.]
Main Effect of Gender? No: Men = 5.85, Women = 5.85, Diff = 0
Main Effect of Ego? Yes: Low = 5.05, High = 6.4, Diff = 1.35
Interaction: Yes: Slopes are different; lines aren't parallel. Difference in slope = 4.3
ROTC men were more competitive than other men;
sorority women were less competitive than other women.
[Figure: lines for Men (slope 1) and Women (slope 1) across Low Ego and High Ego.]
Main Effect of Gender? Yes: Men = 5.85, Women = 2.80, Diff = 3.05
Main Effect of Ego? Yes: Low = 3.65, High = 5, Diff = 1.35
Interaction: No