Experimental design

Quasi-experiments
(not quite experiments)
Definition: Experiments that lack random
assignment of subjects to treatment
conditions and lack an independent
variable manipulated by the
experimenter (e.g., automobile
deindividuation study).
Smoking Cessation Study
• A company is offering
a free stop smoking
programme to all its
workers.
• The researchers plan
to test the efficacy of
relaxation training in
getting smokers to
quit.
This U.S. government employee
may be smoking on the job and
wasting taxpayers’ money through time
away from his desk and increased
sick leave claims.
Confounded Variables
Definition: unintended/uncontrolled variation that may
explain the obtained pattern of results rather than the
expected relationship between independent and
dependent variables.
Problems with the One Group Pretest-Posttest Design
• History: any event not part of the
experimental manipulation that
occurs between the pre- and
posttest (e.g., a celebrity dies of
lung cancer, The Insider debuts in
theatres and gets extensive media
coverage).
• Maturation of subjects: refers to
systematic changes to the subjects
as a function of time (e.g., age,
subjects get more concerned about
their health as they get older,
maybe people get more responsible
once they have children, boredom
and hunger also increase over
time).
• Testing: refers to the problem of the pretest influencing
or changing behaviour (e.g., if subjects were asked to
keep a smoking diary, merely logging their cigarette
consumption may make them realize that they smoke
much more than they previously thought).
• Instrument decay: refers to a change in the accuracy of
the data recording over the course of the research. If a
smoking diary were to be kept, possibly by Week 4,
subjects are less motivated, or are bored or tired and fail
to record all cigarettes actually smoked.
• Statistical regression: when subjects are selected on the
basis of extreme scores, their scores on retesting tend to be
less extreme (lower if they were selected for high scores,
higher if selected for low scores) and closer to the mean.
For example, smokers selected for extremely high smoking
frequency may, for some unusual reason, have been smoking
well above their normal level at the time of selection (they
had a supply of duty-free cigarettes, smuggled smokes, etc.).
On retesting at a later time, they appear to have decreased
their cigarette consumption as a result of the experimental
manipulation (relaxation training), when the change is only
regression toward their usual level.
Statistical regression (cont.)
Statistical regression often occurs when unreliable
measures are used. See the Sports Illustrated effect for
batters and pitchers and the example of the best and
worst performing Canadian mutual funds.
Statistical Regression: Canadian Mutual Funds
(from Report on Business, January 2000, p. 74)
Canadian Equity Funds (1989)              Top 5    Bottom 5
Average Return in 1989                    31.9%    7.1%
Average Annual Return over next 10 yrs    11.7%    8.1%
Regression to the Mean
This phenomenon is observed when there is a less
than perfect correlation between two measures. The
more extreme the selection of scores, the greater the
regression to the mean. It occurs in all types of
measurement situations (e.g., Sports Illustrated
effect, parental heights, IQs). As with most
statistical phenomena, regression to the mean is true
of groups of observations and is probabilistic (i.e., it
does not happen every time). Remember the effect is
for groups of scores rather than the score(s) of
individual group members.
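The points above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the score distribution (mean 100, SD 15), the amount of measurement error, and the 5% selection cutoff are all hypothetical values chosen for illustration, not data from any study. The same simulated subjects are tested twice with an unreliable measure, and the extreme subgroups are picked from the first test only.

```python
import random
import statistics


def regression_demo(n=10_000, noise_sd=10.0, fraction=0.05, seed=0):
    """Test and retest the same simulated subjects with an unreliable measure."""
    rng = random.Random(seed)
    # Hypothetical stable "true" scores (assumed mean 100, SD 15).
    true_scores = [rng.gauss(100, 15) for _ in range(n)]
    # Each testing adds independent measurement error, so the
    # correlation between test and retest is less than perfect.
    test = [t + rng.gauss(0, noise_sd) for t in true_scores]
    retest = [t + rng.gauss(0, noise_sd) for t in true_scores]

    # Select the extreme subgroups on the basis of the FIRST test only.
    ranked = sorted(range(n), key=lambda i: test[i])
    k = int(n * fraction)
    bottom, top = ranked[:k], ranked[-k:]

    def group_mean(scores, members):
        return statistics.mean([scores[i] for i in members])

    return {
        "group mean (test)": statistics.mean(test),
        "top subgroup (test)": group_mean(test, top),
        "top subgroup (retest)": group_mean(retest, top),
        "bottom subgroup (test)": group_mean(test, bottom),
        "bottom subgroup (retest)": group_mean(retest, bottom),
    }


if __name__ == "__main__":
    for label, value in regression_demo().items():
        print(f"{label}: {value:.1f}")
```

No treatment of any kind is applied between the two testings, yet the top subgroup's retest mean falls and the bottom subgroup's retest mean rises, both toward the overall group mean. Individual simulated subjects may move in either direction; the effect holds for the subgroup means.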
[Figure: Regression to the mean. Labels: initial test mean of group; subgroup mean; subgroup mean for retest. Values shown: 95, 87, 60, 35, 30.]
Mortality or Experimental Mortality
Mortality refers to subjects dropping out of an
experiment. It is a problem with longitudinal
designs or longer experimental time
commitments from the subjects. There is a
danger of differential dropout in certain
conditions which would then cause the different
experimental groups to be nonequivalent.
Example: heavy smokers may be sicker over the
course of the study and have a higher dropout
rate, so that by the time the study is over, only
the light smokers remain.
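The heavy-smoker example can be sketched as a simulation. Everything numeric here is an assumption made for illustration: the hypothetical distribution of cigarettes per day and the rule that the chance of dropping out rises with consumption. In this sketch nobody changes how much they smoke, so any apparent drop is produced by dropout alone.

```python
import random
import statistics


def dropout_demo(n=2000, seed=1):
    """Compare everyone who started with those who finished the study."""
    rng = random.Random(seed)
    # Hypothetical cigarettes per day at the start (assumed values).
    # Consumption never changes in this sketch: there is no treatment effect.
    starters = [max(1.0, rng.gauss(20, 8)) for _ in range(n)]

    # Assumption: the probability of dropping out rises with consumption
    # (capped at 0.9), so heavy smokers are more likely to leave the study.
    completers = [c for c in starters if rng.random() > min(0.9, c / 40)]

    return statistics.mean(starters), statistics.mean(completers), len(completers)


if __name__ == "__main__":
    start_mean, end_mean, n_left = dropout_demo()
    print(f"mean at start (all subjects): {start_mean:.1f}")
    print(f"mean at end (completers only): {end_mean:.1f}")
    print(f"subjects remaining: {n_left}")
```

The completers' mean is lower than the starting mean even though no individual smoked less, because the remaining sample contains proportionally more light smokers.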
Mortality or Experimental Mortality
Differential dropout is the reason for using a pretest: it lets
the researcher check whether the subjects who dropped out
differed from those who remained. Otherwise, if the sample
size is reasonable, randomization should handle the problem
of one group having a different type of subject. Pretests can
be awkward or, worse, can sensitize the subjects, tipping them
off to the demand characteristics of the experiment.
Well-designed Experiments
• Since random assignment was used, maturation
and history affect both groups the same way.
• What other forms of assignment of subjects to
conditions could be performed? What are the
advantages or disadvantages of this approach?
• If pretests were to be used, what problems might
exist (e.g., demand characteristics, experimental
mortality)?
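The random assignment referred to in the first bullet above can be sketched as follows. The function name, the condition labels, and the subject identifiers are hypothetical, chosen only to match the smoking cessation example; this is one simple way to do it, not a prescribed procedure.

```python
import random


def randomly_assign(subjects, conditions=("relaxation training", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for position, subject in enumerate(shuffled):
        # Dealing in turn keeps the group sizes within one of each other.
        groups[conditions[position % len(conditions)]].append(subject)
    return groups


if __name__ == "__main__":
    # Hypothetical subject identifiers.
    smokers = [f"S{i:02d}" for i in range(1, 21)]
    for condition, members in randomly_assign(smokers, seed=42).items():
        print(condition, members)
```

Because chance alone decides who lands in which group, subject characteristics such as motivation or smoking level are expected to balance out across groups as the sample size grows.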
Experimental Design Issues
• What are the defining characteristics of experiments?
• Can all possible variables affecting experimental
subjects be controlled for or assessed? If not, is
experimental work possible? How is this
problem/issue handled?
• When do we use correlational designs?
• In operant conditioning work, ABAB designs are
frequently used. Each subject is his/her own control.
What advantages/disadvantages does this design
pose for those working in the real world? Policy
issues?
Nonequivalent Control Group Design
Unlike the one group pretest-posttest design, here a control group is
added. But there is still a problem with confounding because the
subjects in the treatment (experimental) group and the control group
are not equivalent. This may be due to selection differences: although
all subjects are smokers, a motivational difference may exist such that
more motivated, heavy smokers have volunteered to receive relaxation
training, while nonvolunteering smokers are in the control group. Or it
may be that the relaxation training group is recruited from the
maintenance department of a large manufacturing corporation, while the
controls are recruited from the electrical engineering department.
Here social class, educational variables, intelligence, etc., may vary
between the groups, and smoking may be correlated with those variables.
Ex Post Facto Study
(after the fact study)
Group selected -> Naturally occurring event (no direct manipulation) -> Measure
Do the Haitian earthquake victims experience post-traumatic stress
disorder?
No causal statements are possible because rival hypotheses cannot be
eliminated: many confounding variables could produce the same pattern
of results.
“All sex offenders reported prior exposure to pornography.”
Experimental Design
Between-subjects design: a research design with two or
more groups with subjects assigned to only one group.
Within-subjects design: all subjects receive all of the
treatments, so each subject serves as his/her own control.
Within-subjects designs are vulnerable to order (sequence)
effects: the order of treatments makes a difference in the
dependent variable(s). The participant’s behaviour in later
parts of the experiment may be influenced by what was
presented earlier in the experiment.
Dealing with Order Effects: Counterbalancing
Solution: Counterbalance the order of treatments (e.g., some
drivers experience high congestion first and then low
congestion, while others experience the reverse order, to
rule out fatigue or the mere passage of time as the source
of stress).
Latin square designs may also be used.
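Both approaches can be sketched in a few lines. The function names are hypothetical, and the Latin square shown is the simple cyclic kind, which is an assumption about which variant is meant: it puts every treatment in every ordinal position exactly once, but it does not balance which treatment immediately precedes which.

```python
from itertools import permutations


def full_counterbalance(treatments):
    """Every possible order of the treatments (practical only for a few)."""
    return [list(order) for order in permutations(treatments)]


def latin_square(treatments):
    """Cyclic Latin square: one row per order, each treatment once per position."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]


if __name__ == "__main__":
    # Two conditions, as in the driving example: two orders in total.
    for order in full_counterbalance(["high congestion", "low congestion"]):
        print(order)
    # Four treatments would need 24 orders if fully counterbalanced;
    # the Latin square uses only 4.
    for order in latin_square(["A", "B", "C", "D"]):
        print(order)
```

Subjects are then spread evenly across the rows, so that any effect of fatigue or the passage of time falls equally on every treatment.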
Statistical Issues
Null hypothesis: the hypothesis that there is no relationship
between two or more variables. The null hypothesis may be
rejected (which is the researcher’s hope), but never accepted.
Power: the ability to find differences (or to avoid Type 2 errors). It
is the ability to find significant differences when differences truly
exist.
Type 1 error: rejecting the null hypothesis (no difference between
the treatment groups) when it is true. Declaring a difference
between the treatment groups statistically significant when it is
really due to chance or random variation.
Type 2 error: failure to reject the null hypothesis when it is false.
Failing to find a difference between the treatment groups
(independent variable) when there really is a difference
(relationship) between them.
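These definitions can be made concrete by simulating many two-group experiments and counting how often the null hypothesis is rejected. This is a sketch under assumptions: the group size, the number of simulated experiments, the true difference of half a standard deviation, and the use of a normal-approximation test with a two-tailed .05 cutoff of 1.96 are all illustrative choices.

```python
import math
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _variance(xs):
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def rejection_rate(true_difference, n_per_group=50, n_experiments=2000, seed=0):
    """Proportion of simulated experiments that reject the null hypothesis."""
    rng = random.Random(seed)
    critical_z = 1.96  # two-tailed, alpha = .05, normal approximation
    rejections = 0
    for _ in range(n_experiments):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        standard_error = math.sqrt(
            _variance(control) / n_per_group + _variance(treated) / n_per_group
        )
        z = (_mean(treated) - _mean(control)) / standard_error
        if abs(z) > critical_z:
            rejections += 1
    return rejections / n_experiments


if __name__ == "__main__":
    # Null hypothesis true: every rejection is a Type 1 error.
    print("Type 1 error rate:", rejection_rate(true_difference=0.0))
    # Null hypothesis false: the rejection rate is the power,
    # and 1 minus the power is the Type 2 error rate.
    power = rejection_rate(true_difference=0.5)
    print("Power:", power)
    print("Type 2 error rate:", 1 - power)
```

When the true difference is zero, the rejection rate should come out close to the chosen significance level. When a real difference exists, the rejection rate is the power; it rises with larger samples or larger true differences.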