Name _________________
Designing Studies
Suppose that you want to collect data to study whether the expression “an apple a day
keeps the doctor away” has any validity (does eating apples have any health benefit).
Consider four different designs of such a study:
1. You take a random sample of individuals, identify which do and do not eat apples
regularly, and then follow them for six months to see who requires a visit to a
doctor and who does not.
2. You take a random sample of physicians and ask them whether they have noticed
any health benefits of eating apples.
3. You take a random sample of individuals, randomly assign half to eat an apple a
day for the next six months and the other half not to, and then see who requires a
visit to a doctor and who does not.
4. You recall that your Uncle Joe loved apples and was never sick a day in his life,
while your Uncle Tom despised apples and was often ill.
These four examples illustrate four types of studies:
• Anecdotes, in which the investigator merely recounts instances known to him
or her.
• Surveys, in which the investigator asks people to answer questions about their
opinions or practices.
• Observational studies, in which the investigator passively observes and
records information on observational units, such as people’s practices.
• Experiments, in which the investigator deliberately imposes some condition on
the subjects (or experimental units) and then observes and records the results.
1. Identify the type of study represented by each of designs 1-4 above.
2. For studies 1 and 3, identify the observational units (cases) and the two variables
involved. Also indicate whether the variables are categorical or quantitative.
The variable whose effect one wants to study is called the explanatory (or independent)
variable. The variable that one suspects is affected by the other is known as the
response (or dependent) variable.
3. Identify which is the explanatory and which is the response variable in studies 1 & 3.
AP Stats
We can denote the structure of such a study through a simple graphic:

                        +--> Group 1 --+
Observational units ----+              +--> compare results
                        +--> Group 2 --+
                     Explanatory variable   Response variable

where we can list any number of groups for the explanatory variable.
4. Produce a graphic that describes studies 1 & 3, replacing the generic terms with a
description of the variables in this context.
5. Explain why study 3 is a more effective design than study 1 for establishing whether
or not there is a cause-and-effect relationship between apples and health.
An experiment is necessary to establish a cause-and-effect relationship between
variables. In contrast to an observational study, one of the key characteristics of an
experiment is that the experimenter actively imposes the treatment on the subjects. The
experimenter then hopes to see the direct effect of this explanatory variable on the
response.
6. What would be lacking in an experiment that took a random sample of individuals,
asked them to eat an apple a day for the next few months, and recorded the number of
visits to a physician by each person?
A well-designed experiment exerts several forms of control on the subjects to help
minimize the effects of extraneous variables, so that any effects on the response variable
can be directly attributed to the explanatory variable. One fundamental form of control is
that of comparison.
Suppose that a high school is considering adopting a foreign language requirement, partly
on the grounds that students who study a foreign language tend to do better on the verbal
portion of the SAT exam than students who do not. You decide to investigate this
argument by examining records of random samples of students taking and not taking a
foreign language. You find the following data.
Foreign language study:     570  570  600  610  660  660  670  700  700  780
No foreign language study:  450  480  490  490  550  570  570  610  610  690
7. Identify the explanatory and response variables in this study and produce a graphic
summarizing the structure of the study.
8. Is this an observational study or a controlled experiment? Explain.
9. Do these sample data indicate that those studying a foreign language perform better
on the verbal portion of the SAT than those who do not study a foreign language?
Explain.
10. Can you suggest an alternative explanation for why this difference might occur even
if foreign language study does not improve students’ verbal skills?
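One way to check the comparison asked for in question 9 is to compute a center for each group. The sketch below is only an illustration: it assumes the flattened data rows alternate between the two groups (foreign language first), and the function name `summarize` is a hypothetical helper, not part of the worksheet.

```python
from statistics import mean, median

# SAT verbal scores from the data above (assumed grouping: rows alternate
# between the two columns, foreign language first).
foreign_language = [570, 570, 600, 610, 660, 660, 670, 700, 700, 780]
no_foreign_language = [450, 480, 490, 490, 550, 570, 570, 610, 610, 690]


def summarize(scores):
    """Return (mean, median) for a list of scores."""
    return mean(scores), median(scores)


fl_mean, fl_median = summarize(foreign_language)
no_mean, no_median = summarize(no_foreign_language)

print(f"Foreign language:    mean={fl_mean}, median={fl_median}")
print(f"No foreign language: mean={no_mean}, median={no_median}")
print(f"Difference in means: {fl_mean - no_mean}")
```

A difference in sample centers describes these two samples only; it does not by itself say why the groups differ.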
An observational study does not control for possible effects of variables that are not
considered in the study but could have an effect on the response variable. These
unmonitored variables are often called lurking variables. These lurking variables can
have effects on the response variable that are confounded with those of the explanatory
variable. A confounding variable is one whose effects on the response are
indistinguishable from those of the explanatory variable. This prevents the investigator
from isolating the effects of each variable.
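To see how a lurking variable can produce a difference on its own, here is a small simulation sketch. Everything in it is an assumption made for illustration: the "aptitude" scale, the score formula, and the rule for who chooses a language are invented, and foreign language study is deliberately given no effect on the score.

```python
import random
from statistics import mean


def simulate(n_students=10_000, seed=1):
    """Simulate scores where language study has NO effect on the score."""
    rng = random.Random(seed)
    takers, non_takers = [], []
    for _ in range(n_students):
        aptitude = rng.gauss(0, 1)  # lurking variable (hypothetical scale)
        # Higher-aptitude students are more likely to choose a language.
        takes_language = aptitude + rng.gauss(0, 1) > 0
        # The score depends on aptitude only, never on takes_language.
        score = 550 + 80 * aptitude + rng.gauss(0, 30)
        (takers if takes_language else non_takers).append(score)
    return mean(takers), mean(non_takers)


taker_mean, non_taker_mean = simulate()
print(f"Language takers: {taker_mean:.0f}, non-takers: {non_taker_mean:.0f}")
```

In this made-up setting the language takers still score higher on average, because aptitude influences both who takes a language and how they score. An observational comparison cannot separate the two.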
11. Suggest a potential confounding variable for the observational study of “an apple a
day keeps the doctor away” (study 1) at the beginning of this worksheet.
12. If you could design an experiment to assess the possible effect of foreign language
study on SAT score, would you want all of the high verbal aptitude students to be in
one group and all the low verbal aptitude students in the other group? How would
you want these students divided?
13. Even if you did not have access to measures of verbal aptitude, how might you assign
students to treatment groups in an effort to balance out the aptitudes between the two
groups? Provide a graphic which illustrates your experiment.
14. Now if there turned out to be a significant difference in the SAT scores between the
two groups, would you be able to reasonably attribute the difference to the foreign
language study alone? Explain.
Concluding that a causal relationship exists between an explanatory and a response
variable is appropriate only when the data come from a well-designed, controlled
experiment. Randomization plays a crucial role in such experiments by aiming to
balance out potential effects of lurking variables.
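Random assignment can be carried out with any chance device, such as a coin, a random digit table, or software. The following is a minimal sketch in Python, assuming a hypothetical roster of 20 students; the function name `randomly_assign` and the student labels are invented for illustration.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


students = [f"student_{i}" for i in range(1, 21)]  # hypothetical roster
treatment, control = randomly_assign(students, seed=42)
print("Foreign language:   ", treatment)
print("No foreign language:", control)
```

Because chance alone decides who lands in each group, high and low aptitude students tend to be spread across both groups, even when aptitude is never measured.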
The very fact that subjects in the treatment group realize that they are being given
something that researchers expect will improve their condition may affect their responses
differently than those of subjects who are not given any treatment. This phenomenon has
been detected in many circumstances, especially medical studies, and is known as the
placebo effect. Experimenters control for this confounding variable by administering a
placebo to those subjects in the control group. This method of control is called
blindness, since subjects are not told whether or not they are receiving the real treatment
or the placebo. When possible, experiments should be double blind in that the person
responsible for evaluating the subjects should also be unaware of which subjects receive
which treatment. In this way the evaluator’s judgment is not influenced by any hidden
biases.
This worksheet was adapted from “Workshop Statistics”, 2nd edition, by Alan Rossman, Beth Chance, and J.
Barr Von Oehsen.