Quiz 4

Chapter 4 Review
1. A researcher suspected a relationship between people’s preferences in movies and their preferences in pizza toppings.
A random sample of 100 people produced the following two-way table:
a. Enter the overall (marginal) distributions on the table.
                      Favorite Movie
Topping        Argo   Pacific Rim   Captain Phillips   Totals
Pepperoni        20             8                 15       43
Ground Beef       5            15                  2       22
Mushrooms        10            12                 13       35
Totals           35            35                 30      100
b. Compute (in percents) the conditional distribution of favorite movie among those who prefer ground
beef topping.
Favorite Movie      Ground Beef
Argo                5/22 ≈ 22.7%
Pacific Rim         15/22 ≈ 68.2%
Captain Phillips    2/22 ≈ 9.1%
Totals              22 or 100%
c. Briefly describe your finding in words.
Of people who prefer ground beef topping on pizza, the majority (68%) prefer the movie Pacific Rim,
while 23% prefer Argo and the fewest (9%) prefer Captain Phillips.
d. Compute (in percentages) the conditional distribution of favorite pizza topping among those who
chose Captain Phillips as their favorite movie. Show the distribution in a table.
                    Favorite Movie:
Topping             Captain Phillips
Pepperoni           15/30 = 50%
Ground Beef         2/30 ≈ 6.7%
Mushrooms           13/30 ≈ 43.3%
Totals              30 or 100%
e. Briefly describe your finding in words.
Of those who prefer Captain Phillips, 50% (15/30) prefer pepperoni, 43% (13/30) prefer mushrooms,
and 7% (2/30) prefer ground beef.
f. Of those who prefer pepperoni, what proportion prefer Argo?
20/43 ≈ 46.5%
g. Of those who prefer Pacific Rim, what proportion prefer ground beef?
15/35 ≈ 42.9%
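The marginal and conditional distributions in Problem 1 can be checked with a short script. This is only a sketch: the counts come from the two-way table above, while the function names movie_given_topping and topping_given_movie are made up for illustration.

```python
# Counts from the two-way table: rows are toppings, columns are movies.
movies = ["Argo", "Pacific Rim", "Captain Phillips"]
counts = {
    "Pepperoni":   [20, 8, 15],
    "Ground Beef": [5, 15, 2],
    "Mushrooms":   [10, 12, 13],
}

# Marginal totals: sum across each row and down each column.
row_totals = {topping: sum(row) for topping, row in counts.items()}
col_totals = [sum(row[j] for row in counts.values()) for j in range(len(movies))]
grand_total = sum(row_totals.values())


def movie_given_topping(topping):
    """Conditional distribution (in percent) of favorite movie within one topping group."""
    total = row_totals[topping]
    return {m: 100 * n / total for m, n in zip(movies, counts[topping])}


def topping_given_movie(movie):
    """Conditional distribution (in percent) of favorite topping within one movie group."""
    j = movies.index(movie)
    return {t: 100 * row[j] / col_totals[j] for t, row in counts.items()}


# Part b: favorite movie among those who prefer ground beef.
for movie, pct in movie_given_topping("Ground Beef").items():
    print(f"{movie}: {pct:.1f}%")

# Part d: favorite topping among those who prefer Captain Phillips.
for topping, pct in topping_given_movie("Captain Phillips").items():
    print(f"{topping}: {pct:.1f}%")
```

Conditioning on a row divides by that row's total and conditioning on a column divides by that column's total. That is why parts f and g use different denominators (43 and 35).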
2. Sarah’s parents are concerned that she seems short for her age. Their pediatrician has the following
record of Sarah’s height:
Age (in months)    Height (cm)
36                 86
48                 90
51                 91
57                 94
60                 95
A scatterplot showed a strong positive association between age and height, and a least-squares
regression line was found to be HEIGHT = 71.95 + 0.3833 AGE. The correlation was r = 0.994. The
doctor wants to predict Sarah’s height in middle age (50 years old) if there is no intervention (growth
hormone), and he uses the regression line for this prediction. Check the doctor’s prediction, and then
comment on this procedure.
50 years = 600 months
HEIGHT = 71.95 + 0.3833 AGE
       = 71.95 + 0.3833(600)
       = 301.9 cm
Sarah's height would be about 3 meters, according to the model. The recorded ages run only from 36 to
60 months, so this is clearly a case of extrapolation, and it leads to an unreasonable conclusion.
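The arithmetic behind the doctor's prediction can be reproduced in a few lines. This sketch takes the regression coefficients exactly as reported in the problem. The helper names predicted_height_cm and is_extrapolation are hypothetical, and the range check is just one simple way to flag a prediction made outside the observed ages.

```python
# Regression line reported in the problem: HEIGHT (cm) = 71.95 + 0.3833 * AGE (months).
INTERCEPT_CM = 71.95
SLOPE_CM_PER_MONTH = 0.3833

# Ages (in months) that appear in the pediatrician's record.
observed_ages = [36, 48, 51, 57, 60]


def predicted_height_cm(age_months):
    """Height predicted by the reported least-squares line."""
    return INTERCEPT_CM + SLOPE_CM_PER_MONTH * age_months


def is_extrapolation(age_months):
    """True when the age lies outside the range of ages used to fit the line."""
    return not (min(observed_ages) <= age_months <= max(observed_ages))


age = 50 * 12  # 50 years expressed in months
print(f"Predicted height at {age} months: {predicted_height_cm(age):.1f} cm")
print(f"Outside the observed age range: {is_extrapolation(age)}")
```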
3. There is a strong positive correlation between years of schooling completed (x) and lifetime earnings
(y) for American men. One possible reason for this association is causation: more education leads to
higher-paying jobs. But lurking variables may explain some of the correlation. Suggest some lurking
variables that would explain why men with more education earn more.
Socioeconomic status may well have an impact here. Students from wealthier families likely have
greater access to education and to higher-paying jobs. They may be more likely to be part of a family
business or have resources to start their own company.
Health may be a lurking variable. Men with excellent health may be more likely to obtain education
and high-paying jobs, as compared to men with health problems.
4. The National Halothane Study was a major investigation of the safety of the anesthetics used in
surgery. Records of over 850,000 operations performed in 34 major hospitals showed the following
death rates for four common anesthetics.
Anesthetic    Death rate
A             1.7%
B             1.7%
C             3.4%
D             1.9%
Do these data prove that anesthetic C is causing more deaths than the others, or is there another possible
explanation? Explain in a short narrative OR draw an appropriate diagram (and label variables
appropriately). Use an additional sheet if necessary.
Anesthetic C could be used for surgeries that have a higher risk of complications, which would lead us
to conclude falsely that the anesthetic caused the deaths when the true cause was related to the type of
illness. The severity of the patients' illnesses is also likely a lurking variable. Different
anesthesiologists may prefer to use different anesthetics, so there may also be differences due to the
practitioners.
5. A student experimenting with a pendulum counted the number of full swings the
pendulum made in 20 seconds for various lengths of string. Her data are shown below:
Length (in)     6    9   12   15   18   21   24   27   30   33
# of swings    23   20   17   16   14   13   13   12   11   10
Propose a reasonable model for the number of swings of the pendulum (in 20 seconds) for various string
lengths. Show all work. Use your model to predict the number of swings in 20 seconds for a
42-inch-long string.
Linear Model:
The scatterplot shows a strong negative nonlinear relationship between string length and the number of
swings in 20 seconds.
Linear Fit
Number of Swings = 23.369697 - 0.4343434*Length
The patterned residual plot shows that the linear model is not a good fit.
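The linear fit and its residual pattern can be reproduced from the data table. This is a minimal sketch in plain Python. The helper least_squares is written here for illustration and is not taken from any statistics package.

```python
lengths = [6, 9, 12, 15, 18, 21, 24, 27, 30, 33]     # string length (in)
swings = [23, 20, 17, 16, 14, 13, 13, 12, 11, 10]    # full swings in 20 seconds


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = least_squares(lengths, swings)
print(f"Number of Swings = {intercept:.6f} + ({slope:.7f})*Length")

# Residual = observed - predicted. A run of same-sign residuals signals curvature.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(lengths, swings)]
for xi, r in zip(lengths, residuals):
    print(f"Length {xi:2d}: residual {r:+.2f}")
```

The residuals are positive at the shortest lengths, negative in the middle, and positive again at the longest lengths. That curved pattern is what the answer refers to.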
Exponential Model:
Linear Fit
Log(Number of Swings) = 1.4007577 - 0.012388*Length
The residual plot still has some pattern, but is a definite improvement over the linear model.
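The exponential model amounts to a straight-line fit of log(swings) against length. The sketch below assumes base-10 logarithms, which agrees with the coefficients reported above. The helper least_squares is again an illustrative name.

```python
import math

lengths = [6, 9, 12, 15, 18, 21, 24, 27, 30, 33]     # string length (in)
swings = [23, 20, 17, 16, 14, 13, 13, 12, 11, 10]    # full swings in 20 seconds


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Exponential model: transform the response only, then fit a line.
log_swings = [math.log10(s) for s in swings]
intercept, slope = least_squares(lengths, log_swings)
print(f"Log(Number of Swings) = {intercept:.7f} + ({slope:.6f})*Length")

# Residuals on the log scale, used to judge whether a pattern remains.
residuals = [v - (intercept + slope * xi) for xi, v in zip(lengths, log_swings)]
print([round(r, 3) for r in residuals])
```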
Power Model:
Linear Fit
Log(Number of Swings) = 1.7429559 - 0.4718291*Log(Length)
The residual plot is patternless, confirming the linear model (of the transformed data).
So,
ŷ = 10^1.74296 * x^(-0.47183), where x is length and y is number of swings.
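The power model can be refit and then used for the 42-inch prediction that the problem asks for. The sketch below assumes base-10 logarithms on both variables. The names least_squares and predict_swings are illustrative. Because 42 inches lies beyond the longest string tested (33 inches), the result is a mild extrapolation and should be read with that in mind. With these data the model gives roughly 9.5 swings.

```python
import math

lengths = [6, 9, 12, 15, 18, 21, 24, 27, 30, 33]     # string length (in)
swings = [23, 20, 17, 16, 14, 13, 13, 12, 11, 10]    # full swings in 20 seconds


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Power model: take logs of both variables, then fit a line.
log_lengths = [math.log10(v) for v in lengths]
log_swings = [math.log10(v) for v in swings]
intercept, slope = least_squares(log_lengths, log_swings)
print(f"Log(Number of Swings) = {intercept:.7f} + ({slope:.7f})*Log(Length)")


def predict_swings(length_in):
    """Back-transform the fitted line: y-hat = 10^intercept * length^slope."""
    return 10 ** (intercept + slope * math.log10(length_in))


prediction_42 = predict_swings(42)
print(f"Predicted swings in 20 seconds for a 42-inch string: {prediction_42:.1f}")
```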