Experiment Basics: Variables

Experiment Basics: Control
Psych 231: Research
Methods in Psychology
Quiz 7 due Friday Mar. 4
Exam 2 three weeks from today

Announcements


Independent variables
Dependent variables

Measurement
• Scales of measurement
• Errors in measurement

Extraneous variables



Control variables
Random variables
Confound variables
Variables

Control variables


Holding things constant - Controls for excessive random
variability
Random variables – may freely vary, to spread variability
equally across all experimental conditions

Randomization
• A procedure that assures that each level of an extraneous variable has an
equal chance of occurring in all conditions of observation.
Extraneous Variables

Mythbusters examine: Yawning (4 mins)
Earlier version of exp1 (6.5 mins)



What sort of sampling method?
Why the control group?
Should they have confirmed?
• Probably not: if you do the stats, with this sample size
the 4% difference isn’t big enough to reject the null
hypothesis
• What the stats do: quantify how much random variability
(error) there is compared to the observed variability, and help
you decide whether the observed variability is likely due to
error or to the manipulation
Experimental Control
ReggieNet: Provine (2005). Yawning.
American Scientist, 93(6), 532-539.
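
The point about the 4% difference can be sketched as a significance test. The counts below (10 of 34 people yawning in the primed group, 4 of 16 in the control group) are illustrative assumptions chosen to give roughly a 4% gap; they are not taken from the slides. Fisher's exact test, written out by hand here, is one reasonable choice for a small 2x2 table, not necessarily the test used in class.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that x of the col1 "yawners" fall in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Add up the probability of every table at least as unlikely as the observed one
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)

# Hypothetical counts (assumption, not from the slides)
primed_yawned, primed_not = 10, 24      # 10/34 is about 29%
control_yawned, control_not = 4, 12     # 4/16 = 25%

p_value = fisher_exact_two_sided(primed_yawned, primed_not,
                                 control_yawned, control_not)
print(f"p = {p_value:.3f}")
```

With counts like these the p-value is far above the usual .05 cutoff, so the difference could easily be random variability.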

Confound variables


Variables that haven’t been accounted for (manipulated,
measured, randomized, controlled) that can impact changes in
the dependent variable(s)
Co-vary with both the dependent AND an independent
variable
Extraneous Variables

Divide into two groups:


men
women

Instructions: Read aloud the COLOR that the words are
presented in. When done raise your hand.

Women first. Men please close your eyes.
Okay ready?

Colors and words
Blue
Green
Red
Purple
Yellow
Green
Purple
Blue
Red
Yellow
Blue
Red
Green
List 1



Okay, now it is the men’s turn.
Remember the instructions: Read aloud the
COLOR that the words are presented in. When
done raise your hand.
Okay ready?
Blue
Green
Red
Purple
Yellow
Green
Purple
Blue
Red
Yellow
Blue
Red
Green
List 2

So why the difference between the results for
men versus women?

Is this support for a theory that proposes:


“Women are good color identifiers, men are not”
Why or why not? Let’s look at the two lists.
Our results
Matched
List 1
List 2
Women
Men
Blue
Green
Red
Purple
Yellow
Green
Purple
Blue
Red
Yellow
Blue
Red
Green
Blue
Green
Red
Purple
Yellow
Green
Purple
Blue
Red
Yellow
Blue
Red
Green
Mis-Matched

Blue
Green
Red
Purple
Yellow
Green
Purple
Blue
Red
Yellow
Blue
Red
Green


What resulted in the performance
difference?

Our manipulated independent variable
(men vs. women) Our question of interest

The other variable match/mis-match?
Because the two variables are
perfectly correlated we can’t tell
This is the problem with confounds
IV
?
DV
Co-vary together
Confound
Confound that we can’t rule out
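
A small sketch of why the two variables can't be separated in this demo. It assumes the women read the matched list and the men read the mis-matched list; the participant records and the helper name `is_perfectly_confounded` are hypothetical.

```python
from collections import Counter

# Hypothetical record of the classroom demo (assumption: women read the
# matched list and men read the mis-matched list)
participants = [
    {"gender": "women", "list_type": "matched"},
    {"gender": "women", "list_type": "matched"},
    {"gender": "men", "list_type": "mis-matched"},
    {"gender": "men", "list_type": "mis-matched"},
]

def is_perfectly_confounded(rows, iv, ev):
    """True if the EV varies overall but takes only one value within each IV level."""
    pairs = Counter((row[iv], row[ev]) for row in rows)
    # Count how many distinct EV values show up inside each level of the IV
    ev_values_per_iv_level = Counter(iv_level for iv_level, _ in pairs)
    ev_varies_overall = len({ev_value for _, ev_value in pairs}) > 1
    one_ev_value_per_level = all(count == 1
                                 for count in ev_values_per_iv_level.values())
    return ev_varies_overall and one_ev_value_per_level

confounded = is_perfectly_confounded(participants, "gender", "list_type")
print(confounded)  # True: list type changes only when gender changes
```

If both genders had read both kinds of list, the check would return False and the two variables could be told apart.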

What DIDN’T result in the performance
difference?

Extraneous variables

Control
• # of words on the list
• The actual words that were printed

Random
• Age of the men and women in the groups
• Majors, class level, seating in classroom,…

These are not confounds, because
they don’t co-vary with the IV

Our goal:

To test the possibility of a systematic relationship
between the variability in our IV and how that
affects the variability of our DV.
IV

DV
variability
Control is used to:
• Minimize excessive variability
• To reduce the potential of confounds (systematic
variability not part of the research design)
Experimental Control

Our goal:

To test the possibility of a systematic relationship
between the variability in our IV and how that
affects the variability of our DV.
T = NRexp + NRother + R
Nonrandom (NR) Variability
NRexp: manipulated independent variables (IV)
• Our hypothesis: the IV will result in changes in the DV
NRother: extraneous variables (EV) which covary with IV
• Confounds
Random (R) Variability
• Imprecision in measurement (DV)
• Randomly varying extraneous variables (EV)
Experimental Control

Variability in a simple experiment:
T = NRexp + NRother + R
Treatment group: NRother + NRexp + R
“perfect experiment”: no confounds (NRother = 0)
Control group: absence of the treatment (NRexp = 0), so only R
Bigger the weight = more variability from a source
Experimental Control: Weight analogy

Variability in a simple experiment:
T = NRexp + NRother + R
Control group
Treatment group
NR
exp
R
R
Difference Detector
Our experiment is a “difference detector”
We can’t “see”
what’s on the
scales
This is the only
part we “see”
Experimental Control: Weight analogy

If there is an effect of the treatment then NRexp will ≠ 0
Control group
Treatment group
R
NR
exp
R
Difference Detector
Our experiment can detect the
effect of the treatment
Experimental Control: Weight analogy

Variability in a simple experiment:
Try it out at home: using coins as weights, with your
eyes closed, can you tell different combinations apart?
Treatment group
Difference Detector
Control group
Bigger the weight = more
variability from a source
Experimental Control: Weight analogy

Potential Problems


Excessive random variability
Confounding
Treatment group
Control group
Difference Detector
Things making detection difficult

Excessive random variability

If experimental control procedures are not applied
• Then R component of data will be excessively large, and
may make NRexp undetectable
Potential Problems

If R is large relative to NRexp then detecting a difference
may be difficult
R
R
NR
exp
Difference
Detector
Experiment can’t detect the
effect of the treatment
Excessive random variability

But if we reduce the size of NRother and R relative to
NRexp then detecting gets easier
 So try to minimize this by using good measures of
DV, good manipulations of IV, etc.
R
NR
exp
R
Difference
Detector
Our experiment can detect the
effect of the treatment
Reduced random variability
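
The weight analogy can be tried out in a quick simulation: keep the treatment effect (NRexp) fixed and change only the size of R. This is a rough sketch; the effect size, noise levels, group size, and seed are arbitrary assumptions, and the statistic is a Welch-style t computed by hand.

```python
import random
from statistics import mean, variance

def simulate_t(effect, noise_sd, n_per_group=30, seed=231):
    """Simulate one two-group experiment and return a Welch-style t statistic."""
    rng = random.Random(seed)
    # Control group: only random variability (R)
    control = [rng.gauss(0, noise_sd) for _ in range(n_per_group)]
    # Treatment group: treatment effect (NRexp) plus random variability (R)
    treatment = [effect + rng.gauss(0, noise_sd) for _ in range(n_per_group)]
    observed_difference = mean(treatment) - mean(control)
    standard_error = (variance(treatment) / n_per_group
                      + variance(control) / n_per_group) ** 0.5
    return observed_difference / standard_error

t_small_r = simulate_t(effect=5, noise_sd=1)    # R small relative to NRexp
t_large_r = simulate_t(effect=5, noise_sd=50)   # R large relative to NRexp
print(f"small R: t = {t_small_r:.2f}; large R: t = {t_large_r:.2f}")
```

The same treatment effect gives a much larger t when R is small, which is the "difference detector" becoming easier to read.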

Confound

If an EV co-varies with IV, then NRother component
of data will be present, and may lead to
misattribution of effect to IV
This relationship may
or may not exist
IV
DV
Co-vary together
EV
IV = independent var
DV = dependent var
EV = extraneous var
Potential Problems

Confound

Hard to detect the effect of NRexp because the effect looks like it
could be from NRexp but could be due to the NRother
R
NR
other
NR
R
exp
Difference
Detector
Experiment can detect an effect,
but can’t tell where it is from
Confounding

Confound

Hard to detect the effect of NRexp because the effect looks like it
could be from NRexp but could be due to the NRother
These two situations look
the same
R
NR
other
R
NR
NR
exp
other
R
Difference
Detector
There is an effect of the IV
Confounding
R
Difference
Detector
There is not an effect of the IV

Confound

Hard to detect the effect of NRexp because the effect looks like it
could be from NRexp but could be due to the NRother
Use experimental control to eliminate
the variability from the confound
Use experimental control to spread
the variability equally across
conditions
NRother
R
NR
exp
R
NR
R
exp
Difference
Detector
R
R
Difference
Detector
Removing Confounding

How do we introduce control?

Methods of Experimental Control
• Constancy/Randomization
• Comparison
• Production
Controlling Variability

Constancy/Randomization

If there is a variable that may be related to the DV that you
can’t (or don’t want to) manipulate
• Control variable: hold it constant (so there isn’t any variability
from that variable, no R weight from that variable)
• Random variable: let it vary randomly across all of the
experimental conditions (so the R weight from that variable is the
same for all conditions)
Methods of Controlling Variability

Comparison

An experiment always makes a comparison, so it must have at
least two groups (2 sides of our scale in the weight analogy)
• Sometimes there are control groups
• This is often the absence of the treatment
Training group vs. No training (Control) group
• Without control groups it is harder to see what is really
happening in the experiment
• It is easier to be swayed by plausibility or
inappropriate comparisons (see diet crystal example)
Useful for eliminating potential confounds (think about our
list of threats to internal validity)
Methods of Controlling Variability

Comparison

An experiment always makes a comparison, so it must have at
least two groups
• Sometimes there are control groups
• This is often the absence of the treatment
• Sometimes there are a range of values of the IV
1 week of
Training group
2 weeks of
Training group
3 weeks of
Training group
Methods of Controlling Variability

Production

The experimenter selects the specific values of the Independent
Variables
1 week of
Training group
2 weeks of
Training group
3 weeks of
Training group

Figure: from the variability in duration of taking the training
program, the experimenter selects the specific values:
1 week, 2 weeks, 3 weeks
Methods of Controlling Variability

Production

The experimenter selects the specific values of the Independent
Variables
1 week of
Training group
2 weeks of
Training group
3 weeks of
Training group
• Need to do this carefully
• Suppose that you don’t find a difference in the DV across your
different groups
• Is this because the IV and DV aren’t related?
• Or is it because your levels of IV weren’t different enough?
Methods of Controlling Variability


So far we’ve covered a lot of the general details of
experiments
Now let’s consider some specific experimental
designs.


Some bad (but not uncommon) designs (and potential fixes)
Some good designs
• 1 Factor, two levels
• 1 Factor, multi-levels
• Factorial (more than 1 factor)
• Between & within factors
Experimental designs

Bad design example 1: Does standing close to
somebody cause them to move? (theory of personal space)


“hmm… that’s an empirical question. Let’s see what
happens if …”
So you stand closely to people and see how long before they
move
Problem: no control group to establish the comparison group
(this design is sometimes called “one-shot case study design”)
Fix: introduce a (or some) comparison group(s)
Very Close (.2 m)
Close (.5 m)
Not Close (1.0 m)
Poorly designed experiments

Bad design example 2:

Does a relaxation program decrease the urge to
smoke?

2 groups
• relaxation training group
• no relaxation training
group

Training
group
No training
(Control) group
The participants choose
which group to be in
Poorly designed experiments

Bad design example 2: Non-equivalent control groups
participants -> Self Assignment -> Independent Variable (Training
group or No training (Control) group) -> Dependent Variable (Measure)
Problem: selection bias for the two groups
Fix: need to do random assignment to groups (Random Assignment in
place of Self Assignment)
Poorly designed experiments
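
The fix can be sketched in a few lines: the experimenter, not the participant, decides who goes where, and the decision is left to chance. The participant IDs, the seed, and the function name `randomly_assign` are hypothetical.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical participant IDs (assumption for illustration)
participants = [f"P{number:02d}" for number in range(1, 21)]
groups = randomly_assign(participants,
                         ["training", "no training (control)"], seed=231)
for condition, members in groups.items():
    print(condition, members)
```

Because chance makes the assignment, participant characteristics (such as motivation to quit smoking) have no systematic way to line up with one group.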

Bad design example 3:

Does a relaxation program decrease the urge to
smoke?

Pre-test desire level
Give relaxation training program
Post-test desire to smoke


Poorly designed experiments

Bad design example 3: One group pretest-posttest
design
participants -> Pre-test Measure (Dependent Variable) -> Training
group (Independent Variable: Pre vs. Post) -> Post-test Measure
(Dependent Variable)
Fix: Add another factor: a No Training group that also gets a
Pre-test Measure and a Post-test Measure
Problems include: history, maturation, testing, and more
Poorly designed experiments



Good design example

What are our IV and DV?
How does anxiety level affect test performance?
• Two groups take the same test
• Grp1(low anxiety group): 5 min lecture on how good
grades don’t matter, just trying is good enough
• Grp2 (moderate anxiety group): 5 min lecture on the
importance of good grades for success

1 Factor (Independent variable), two levels
• Basically you want to compare two treatments (conditions)
• The statistics are pretty easy, a t-test
1 factor - 2 levels

Good design example

How does anxiety level affect test performance?
participants -> Random Assignment -> Anxiety (Low or Moderate) ->
Dependent Variable (Test)
1 factor - 2 levels

Good design example

How does anxiety level affect test performance?
Figure: test performance by anxiety (one factor, two levels):
low = 60, moderate = 80
Use a t-test to see if these points are statistically different
t = (Observed difference between conditions) / (Difference expected by chance)
1 factor - 2 levels

Advantages:


Simple, relatively easy to interpret the results
Is the independent variable worth studying?
• If no effect, then usually don’t bother with a more complex
design

Sometimes two levels is all you need
• One theory predicts one pattern and another predicts a
different pattern
1 factor - 2 levels

Disadvantages:

“True” shape of the function is hard to see
• Interpolation and Extrapolation are not a good idea
Interpolation
test performance
What happens within the ranges that you test?
low
1 factor - 2 levels
moderate
anxiety

Disadvantages:

“True” shape of the function is hard to see
• Interpolation and Extrapolation are not a good idea
Extrapolation
test performance
What happens outside of the ranges that you test?
low
moderate
anxiety
1 factor - 2 levels
high




For more complex theories you will typically
need more complex designs (more than two
levels of one IV)
1 factor - more than two levels


Basically you want to compare more than two
conditions
The statistics are a little more difficult, an ANOVA
(Analysis of Variance)
1 Factor - multilevel experiments

Good design example (similar to earlier ex.)

How does anxiety level affect test performance?
• Groups take the same test
• Grp1(low anxiety group): 5 min lecture on how good grades
don’t matter, just trying is good enough
• Grp2 (moderate anxiety group): 5 min lecture on the
importance of good grades for success
• Grp3 (high anxiety group): 5 min lecture on how the
students must pass this test to pass the course
1 Factor - multilevel experiments
participants -> Random Assignment -> Anxiety (Low, Moderate, or High) ->
Dependent Variable (Test)
1 factor - 3 levels
Figure: test performance by anxiety: low = 60, mod = 80, high = 60
1 Factor - multilevel experiments

Advantages

Gives a better picture of the relationship
(functions other than just straight lines)
Figure: test performance by anxiety, drawn once with 2 levels
(low, moderate) and once with 3 levels (low, mod, high)
Generally, the more levels you have, the less
you have to worry about your range of the
independent variable
1 Factor - multilevel experiments

Disadvantages


Needs more resources (participants and/or
stimuli)
Requires more complex statistical analysis
(ANOVA [Analysis of Variance] & follow-up
pair-wise comparisons)
1 Factor - multilevel experiments

The ANOVA just tells you that not all of the groups
are equal.

If this is your conclusion (you get a “significant ANOVA”)
then you should do further tests to see where the differences
are
• High vs. Low
• High vs. Moderate
• Low vs. Moderate
Pair-wise comparisons
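
The list of follow-up comparisons can be generated rather than written by hand, which helps once there are more than three levels. A minimal sketch using the anxiety levels from the example:

```python
from itertools import combinations

levels = ["Low", "Moderate", "High"]  # anxiety levels from the example

# Every unordered pair of levels is one follow-up comparison
pairs = list(combinations(levels, 2))
for first, second in pairs:
    print(f"{first} vs. {second}")

# k levels give k * (k - 1) / 2 pair-wise comparisons
k = len(levels)
assert len(pairs) == k * (k - 1) // 2
```

The count grows quickly (4 levels give 6 comparisons, 5 levels give 10), which is part of why multilevel designs need more complex analysis.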



Two or more factors

Some vocabulary
• Factors - independent variables
• Levels - the levels of your independent variables
• 2 x 4 design means two independent variables, one with 2 levels and one with 4
levels
• “Conditions” or “groups” is calculated by multiplying the levels, so a 2x4
design has 8 different conditions
B1
B2
B3
B4
A1
A2
Factorial experiments
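
Counting conditions by multiplying levels can be checked by listing every combination. A short sketch for the 2 x 4 grid above, using the slide's A1..A2 and B1..B4 labels:

```python
from itertools import product

factor_a = ["A1", "A2"]
factor_b = ["B1", "B2", "B3", "B4"]

# Each condition is one level of A combined with one level of B
conditions = list(product(factor_a, factor_b))
for condition in conditions:
    print(condition)
print(len(conditions))  # 8 = 2 levels x 4 levels
```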
Two or more factors


Main effects - the effects of your independent variables ignoring
(collapsed across) the other independent variables
Interaction effects - how your independent variables affect each
other
• Example: 2x2 design, factors A and B
• Interaction:
• At A1, B1 is bigger than B2
• At A2, B1 and B2 don’t differ
Figure: Dependent Variable plotted against A (A1, A2), with
separate lines for B1 and B2
Everyday interaction = “it depends on …”
Factorial experiments

Rate how much you would want to see a new movie
(1 no interest, 5 high interest):


Hail, Caesar! – new Coen Brothers movie in 2016 (Feb. 5)
Ask men and women – looking for an effect of gender
Figure: mean rating (1 to 5) for Men and Women
Not much of a difference: no effect of gender
Interaction effects

Maybe the gender effect depends on whether you know
who is in the movie. So you add another factor:

Suppose that George Clooney or Scarlett Johansson
might star. You rate your preference if each were to star
and if neither were to star.
Figure: mean rating (1 to 5) for Men and Women with No star,
when Clooney stars, and when Johansson stars
Effect of gender depends on whether George or Scarlett stars in
the movie or not
This is an interaction
Interaction effects
A video lecture from ThePsychFiles.com podcast

The complexity & number of outcomes increases:
• A = main effect of factor A
• B = main effect of factor B
• AB = interaction of A and B
• With 2 factors there are 8 basic possible patterns of results:
1) No effects at all
2) A only
3) B only
4) AB only
5) A & B
6) A & AB
7) B & AB
8) A & B & AB
Results of a 2x2 factorial design
                 A1                    A2                    Marginal means
B1               Condition mean A1B1   Condition mean A2B1   B1 mean
B2               Condition mean A1B2   Condition mean A2B2   B2 mean
Marginal means   A1 mean               A2 mean
Main effect of A: compare the A1 mean with the A2 mean
Main effect of B: compare the B1 mean with the B2 mean
Interaction of AB: What’s the effect of A at B1? What’s the effect of A at B2?
2 x 2 factorial design
                   A1    A2    Main Effect of B
B1                 30    60    45
B2                 30    60    45
Main Effect of A   30    60
Main effect of A: ✓
Main effect of B: X
Interaction of A x B: X
Examples of outcomes
                   A1    A2    Main Effect of B
B1                 60    60    60
B2                 30    30    30
Main Effect of A   45    45
Main effect of A: X
Main effect of B: ✓
Interaction of A x B: X
Examples of outcomes
                   A1    A2    Main Effect of B
B1                 60    30    45
B2                 30    60    45
Main Effect of A   45    45
Main effect of A: X
Main effect of B: X
Interaction of A x B: ✓
Examples of outcomes
                   A1    A2    Main Effect of B
B1                 30    60    45
B2                 30    30    30
Main Effect of A   30    45
Main effect of A: ✓
Main effect of B: ✓
Interaction of A x B: ✓
Examples of outcomes
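
The marginal means and the interaction in these examples can be worked out mechanically from the four condition means. This sketch uses the cell means from the four example outcomes; the function name `effects_2x2` is hypothetical. It only describes the pattern in the means, and a real analysis would still need a statistical test to separate these differences from random variability.

```python
def effects_2x2(cells):
    """cells is [[A1B1, A2B1], [A1B2, A2B2]], a grid of condition means."""
    (a1b1, a2b1), (a1b2, a2b2) = cells
    a1_mean, a2_mean = (a1b1 + a1b2) / 2, (a2b1 + a2b2) / 2
    b1_mean, b2_mean = (a1b1 + a2b1) / 2, (a1b2 + a2b2) / 2
    return {
        "main effect of A": a2_mean - a1_mean,   # difference in A marginal means
        "main effect of B": b2_mean - b1_mean,   # difference in B marginal means
        # Effect of A at B1 minus effect of A at B2
        "interaction A x B": (a2b1 - a1b1) - (a2b2 - a1b2),
    }

# Cell means from the four example outcomes
examples = {
    "A only": [[30, 60], [30, 60]],
    "B only": [[60, 60], [30, 30]],
    "AB only": [[60, 30], [30, 60]],
    "A, B, and AB": [[30, 60], [30, 30]],
}
results = {name: effects_2x2(cells) for name, cells in examples.items()}
for name, effects in results.items():
    print(name, effects)
```

A value of 0 corresponds to an X in the examples, and any nonzero value corresponds to a ✓.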
Let’s add another variable: test difficulty.
Test performance by anxiety and test difficulty:
Test difficulty          low    mod    high   main effect of difficulty
hard                     35     80     35     50
medium                   65     80     65     70
easy                     80     80     80     80
main effect of anxiety   60     80     60
Interaction? Yes: effect of anxiety depends on
level of test difficulty
Anxiety and Test Performance

Advantages

Interaction effects
– Consider the interaction effects before trying to interpret the
main effects
Adding factors decreases the variability
– Because you’re controlling more of the variables that
influence the dependent variable
– This increases the statistical power of the statistical tests
Increases generalizability of the results
– Because you have a situation closer to the real world (where
all sorts of variables are interacting)
Factorial Designs

Disadvantages



Experiments become very large, and unwieldy
The statistical analyses get much more complex
Interpretation of the results can get hard
• In particular for higher-order interactions
• Higher-order interactions involve more than two
factors (e.g., ABC)
Factorial Designs

Consider the results of our class experiment

Main effect of word
type

Main effect of depth of
processing

No Interaction between
word type and depth of
processing
Dr. Kahn's reporting stats page
Factorial designs



What is the effect of presenting words in
color on memory for those words?

So you present lists of words for recall either in
color or in black-and-white.
Clock
Chair
Cab

Clock
Chair
Cab
Two different designs to examine this question
Example

Between-Groups Factor
• 2-levels
• Each of the participants is in only one level of the IV
Figure: participants are split across the levels, Colored words
(Clock, Chair, Cab) or BW words (Clock, Chair, Cab), followed by a Test

Within-Groups Factor
• Sometimes called “repeated measures” design
• 2-levels, All of the participants are in both levels of
the IV
Figure: all participants get Colored words (Clock, Chair, Cab) then a
Test, and BW words (Clock, Chair, Cab) then a Test

Between-subjects
designs
 Each participant
participates in one and
only one condition of
the experiment.

Within-subjects designs

All participants
participate in all of the
conditions of the
experiment.
Colored
words
participants
Test
BW
words
participants
Colored
words
Test
BW
words
Test
Between vs. Within Subjects Designs


Advantages:
Figure: participants get either Colored words or BW words
(Clock, Chair, Cab), followed by a Test

Independence of groups (levels of the IV)
• Harder to guess what the experiment is about without
experiencing the other levels of IV
• Exposure to different levels of the independent variable(s)
cannot “contaminate” the dependent variable
• Sometimes this is a ‘must,’ because you can’t reverse the
effects of prior exposure to other levels of the IV
• No order effects to worry about
• Counterbalancing is not required
Between subjects designs

Disadvantages

Individual differences between the people in the
groups
• Excessive variability
• Non-Equivalent groups
Between subjects designs

The groups are composed of different
individuals
participants
Colored
words
BW
words
Individual differences
Test

The groups are composed of different
individuals
participants

Colored
words
BW
words
Excessive variability due to individual
differences

Test
Harder to detect the effect of the IV if there
is one
Individual differences
NR
R
R

The groups are composed of different
individuals
participants

Colored
words
Test
BW
words
Non-Equivalent groups (possible confound)

The groups may differ not only because of the IV, but also because the
groups are composed of different individuals
Individual differences

Strive for Equivalent groups



Created equally - use the same process to
create both groups
Treated equally - keep the experience as
similar as possible for the two groups
Composed of equivalent individuals
• Random assignment to groups - eliminate bias
• Matching groups - match each individual in one
group to an individual in the other group on relevant
characteristics
Dealing with Individual Differences
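
Matching and random assignment can be combined: pair up people who are alike on the relevant characteristic, then let chance decide which member of each pair goes to which group. A minimal sketch, assuming a single matching variable (age); the participant data and the function name `matched_assignment` are hypothetical.

```python
import random

def matched_assignment(people, key, seed=None):
    """Pair people with adjacent values of `key`, then split each pair at random."""
    rng = random.Random(seed)
    ranked = sorted(people, key=key)
    group_a, group_b = [], []
    # Walk the ranked list two at a time; an odd person out is left unassigned
    for index in range(0, len(ranked) - 1, 2):
        pair = [ranked[index], ranked[index + 1]]
        rng.shuffle(pair)  # random assignment within the matched pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical participants as (ID, age) tuples (assumption for illustration)
people = [("P1", 21), ("P2", 23), ("P3", 22), ("P4", 22),
          ("P5", 21), ("P6", 23), ("P7", 22), ("P8", 22)]
group_a, group_b = matched_assignment(people, key=lambda person: person[1],
                                      seed=231)
print(sorted(age for _, age in group_a))
print(sorted(age for _, age in group_b))
```

Both groups end up with the same age profile, so age can't be what makes the groups differ.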
Matching groups
Group A                            Group B
Red, short, 21yrs       matched    Red, short, 21yrs
Blue, tall, 23yrs       matched    Blue, tall, 23yrs
Green, average, 22yrs   matched    Green, average, 22yrs
Brown, tall, 22yrs      matched    Brown, tall, 22yrs

Matched groups


Trying to create
equivalent groups
Also trying to reduce
some of the overall
variability
• Eliminating variability
from the variables
that you matched
people on
Color
Height
Age

Between-subjects
designs
 Each participant
participates in one and
only one condition of
the experiment.

Within-subjects designs

All participants
participate in all of the
conditions of the
experiment.
Colored
words
participants
Test
participants
Colored
words
Test
BW
words
Test
BW
words
Between vs. Within Subjects Designs

Advantages:

Don’t have to worry about individual differences
• Same people in all the conditions
• Variability between conditions is smaller (statistical
advantage)

Fewer participants are required
Within subjects designs

Disadvantages


Range effects
Order effects:
• Carry-over effects
• Progressive error
• Counterbalancing is probably necessary to address these
order effects
Within subjects designs

Range effects – (context effects) can cause
a problem


The range of values for your levels may impact
performance (typically best performance in
middle of range).
Since all the participants get the full range of
possible values, they may “adapt” their
performance (the DV) to this range.
Within subjects designs

Carry-over effects


Transfer between conditions is possible
Effects may persist from one condition into
another
• e.g. Alcohol vs no alcohol experiment on the effects on
hand-eye coordination. Hard to know how long the
effects of alcohol may persist.
Condition 1 -> test -> (How long do we wait for the effects to
wear off?) -> Condition 2 -> test
Order effects

Progressive error


Practice effects – improvement due to repeated
practice
Fatigue effects – performance deteriorates as
participants get bored, tired, distracted
Order effects

Counterbalancing is probably necessary

This is used to control for “order effects”
• Ideally, use every possible order
• (n!, e.g., AB = 2! = 2 orders; ABC = 3! = 6 orders, ABCD = 4! = 24 orders, etc).

All counterbalancing assumes Symmetrical
Transfer
• The assumption that AB and BA have reverse effects
and thus cancel out in a counterbalanced design
Dealing with order effects
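
The n! growth in the number of orders is easy to see by listing them. A short sketch of complete counterbalancing; the function name `all_orders` is hypothetical.

```python
from itertools import permutations
from math import factorial

def all_orders(conditions):
    """Every possible ordering of the conditions (complete counterbalancing)."""
    return ["".join(order) for order in permutations(conditions)]

for conditions in ("AB", "ABC", "ABCD"):
    orders = all_orders(conditions)
    # The number of orders grows as n!
    assert len(orders) == factorial(len(conditions))
    print(conditions, len(orders), orders[:6])
```

Participants would then be spread evenly across the orders, which is why complete counterbalancing becomes impractical beyond a few conditions.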

Simple case


Two conditions A & B
Two counterbalanced orders:
• AB
• BA
participants are split between the two orders:
• AB: Colored words -> Test -> BW words -> Test
• BA: BW words -> Test -> Colored words -> Test
Counterbalancing

Often it is not practical to use every possible
ordering

Partial counterbalancing
• Latin square designs – a form of partial
counterbalancing, so that each group of trials occurs in
each position an equal number of times
Counterbalancing

Example: consider four conditions
Recall: ABCD = 4! = 24 possible orders
1) Unbalanced Latin square: each condition appears
in each position (4 orders)

Order 1: A B C D
Order 2: B C D A
Order 3: C D A B
Order 4: D A B C
Partial counterbalancing
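
The four orders above come from rotating the condition list one step at a time. A minimal sketch of that construction; the function name `cyclic_latin_square` is hypothetical, and rotation is only one of several ways to build a Latin square.

```python
def cyclic_latin_square(conditions):
    """Build orders by rotating the condition list one step at a time."""
    n = len(conditions)
    return [[conditions[(start + position) % n] for position in range(n)]
            for start in range(n)]

orders = cyclic_latin_square(["A", "B", "C", "D"])
for number, order in enumerate(orders, start=1):
    print(f"Order {number}:", " ".join(order))

# Each condition occupies each position exactly once across the orders
for position in range(4):
    assert sorted(order[position] for order in orders) == ["A", "B", "C", "D"]
```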

Example: consider four conditions
Recall: ABCD = 4! = 24 possible orders
2) Balanced Latin square: each condition appears
before and after all others (8 orders)

A B C D        A B D C
B C D A        B C A D
C D A B        C D B A
D A B C        D A C B
Partial counterbalancing
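
One way to check the difference between the 4-order and 8-order versions is to list which conditions ever come immediately before which others. Reading "before and after all others" as immediate precedence is an assumption made for this sketch; the function name `immediate_pairs` is hypothetical.

```python
from itertools import permutations

def immediate_pairs(orders):
    """Set of (earlier, later) pairs that occur back to back in any order."""
    return {(order[i], order[i + 1])
            for order in orders for i in range(len(order) - 1)}

four_orders = ["ABCD", "BCDA", "CDAB", "DABC"]
eight_orders = ["ABCD", "ABDC", "BCDA", "BCAD",
                "CDAB", "CDBA", "DABC", "DACB"]

all_pairs = set(permutations("ABCD", 2))  # the 12 possible (earlier, later) pairs
missing_from_four = all_pairs - immediate_pairs(four_orders)
missing_from_eight = all_pairs - immediate_pairs(eight_orders)
print(len(missing_from_four), len(missing_from_eight))
```

In the 4-order square A is only ever followed by B, so 8 of the 12 pairs never occur; the 8 orders cover all 12.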

Mixed factorial designs


Treat some factors as within-subjects
(participants get all levels of that factor) and
others as between-subjects (each level of
this factor gets a different group of
participants).
This only works with factorial (multi-factor)
designs
Mixed factorial designs