4.3

Warm Up
I. Pilots or Senators: Which team has played better?
                 vs. East                   vs. West
            WINS  LOSSES  PCT.        WINS  LOSSES  PCT.
PILOTS        4      1    .800          3      7    .300
SENATORS      7      3    .700          1      4    .200

1) Make a new table of the teams’ overall performance (wins/losses/pct.)
2) Which team had a better winning percentage when divvied up by East/West division?
3) Which team had a better overall winning percentage?
4) What happened? What is this called?
5) What do you notice about team records that might cause this?
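One way to check the warm-up arithmetic is a short script. This is a minimal sketch in plain Python using only the win/loss counts from the table; the names `records`, `pct`, and `overall` are our own, chosen for illustration.

```python
# (wins, losses) for each team, split by opponent division, from the warm-up table
records = {
    "PILOTS": {"East": (4, 1), "West": (3, 7)},
    "SENATORS": {"East": (7, 3), "West": (1, 4)},
}

def pct(wins, losses):
    """Winning percentage as a fraction of games played."""
    return wins / (wins + losses)

def overall(team):
    """Add up a team's wins and losses across both divisions."""
    wins = sum(w for w, _ in records[team].values())
    losses = sum(l for _, l in records[team].values())
    return wins, losses

for team, splits in records.items():
    for division, (w, l) in splits.items():
        print(f"{team:9s} vs. {division}: {w}-{l}  {pct(w, l):.3f}")
    w, l = overall(team)
    print(f"{team:9s} overall : {w}-{l}  {pct(w, l):.3f}")
```

The Pilots lead within each division (.800 vs. .700 and .300 vs. .200) but trail overall (7-8 vs. 8-7). This kind of reversal is usually called Simpson's paradox. Here it comes from the two teams playing very different numbers of games against each division.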
II. Consider the following two-way table:
Gender: Male, Female
Type of college: 4 yr private, 4 yr public, Community
Counts: 5, 7, 9, 18, 25, 26

1) Identify the row and column variables
2) Find the marginal distributions by percents
3) Find the conditional distribution of males and females going to a community college
4) Find the conditional distribution of community college attendees who are male and female.
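The marginal and conditional calculations can be sketched in a few lines of Python. Important assumption: the slide as extracted does not show which count belongs in which cell, so the layout below (counts filled row by row, Male first) is only one possible reading, used for illustration. The helper `percent` and the variable names are our own.

```python
# ASSUMPTION: cell layout is a guess (counts filled row by row, Male first).
# Swap in the real table before trusting any of the percentages.
columns = ["4 yr private", "4 yr public", "Community"]
table = {
    "Male": [5, 7, 9],
    "Female": [18, 25, 26],
}

def percent(part, whole):
    """Share of the whole, as a percent rounded to one decimal place."""
    return round(100 * part / whole, 1)

total = sum(sum(row) for row in table.values())
col_totals = [sum(row[i] for row in table.values()) for i in range(len(columns))]

# Marginal distributions: row totals and column totals over the grand total.
gender_marginal = {g: percent(sum(row), total) for g, row in table.items()}
college_marginal = {c: percent(t, total) for c, t in zip(columns, col_totals)}

# Conditional on gender: what percent of each gender attends a community college?
k = columns.index("Community")
community_given_gender = {g: percent(row[k], sum(row)) for g, row in table.items()}

# Conditional on community college: what percent of those attendees is each gender?
gender_given_community = {g: percent(row[k], col_totals[k]) for g, row in table.items()}

print("Marginal (gender):", gender_marginal)
print("Marginal (college):", college_marginal)
print("Community | gender:", community_given_gender)
print("Gender | community:", gender_given_community)
```

Note that the last two results divide the same cell counts by different totals (row totals vs. the Community column total), which is why questions 3 and 4 have different answers.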
Consider the following list of sixteen factors. Eight of the factors have a
strong correlation (+ or -) with test scores; the other eight don’t seem to
matter. The list is taken from the 2005 best-selling book Freakonomics. Try to
guess which are which:
The child has highly educated parents.
The child’s family is intact.
The child’s parents have high socioeconomic status.
The child’s parents recently moved into a better neighborhood.
The child’s mother was 30 or older at the time of the first child’s birth.
The child’s mother didn’t work between birth and kindergarten.
The child had low birth weight.
The child attended Head Start.
The child’s parents speak English in the home.
The child’s parents regularly take him/her to museums.
The child is adopted.
The child is regularly spanked.
The child’s parents are involved in the PTA.
The child frequently watches television.
The child has many books in his home.
The child’s parents read to him nearly every day.
Which ones correlate with test scores?
Lurking Variables


Often the relationship between two variables is strongly influenced by one or more lurking variables.
Ex: Studies show that men who complain of chest pain are more likely to get detailed tests and aggressive treatment, such as bypass surgery, than are women with similar complaints. Is this association between gender and treatment due to discrimination?
4.3: Establishing Causation
(Types of Associations)



Causation: Changes in x cause changes in y.
Common response: Both x and y respond to changes in some unobserved variable.
Confounding: The effect of x on y is hopelessly mixed up with the effects of other variables.
Causation
Examples of observed associations between x and y
1) x = mother’s body mass index
y = daughter’s body mass index
2) x = amount of artificial sweetener saccharin in a rat’s diet
y = count of tumors in a rat’s bladder
Careful: A strong association is not necessarily causation!

An article in a women’s magazine reported that mothers who nurse their babies feel more receptive toward their infants than mothers who bottle-feed. The author concluded that breast-feeding (x) led to a more positive attitude (y) toward the child. Problems with this?
Common Response
1) x = Ice Cream Sales
y = # of shark attacks on swimmers
2) x = Skirt Length
y = Stock Prices
3) x = # of cavities in elementary school kids
y = vocabulary knowledge
4) x = a high school senior’s SAT score
y = the student’s first-year college grade point average
5) x = monthly flow of money into stock market funds
y = monthly rate of return for the stock market
Confounding
CONFOUNDING ONLY EXISTS if there is CONFUSION about whether changes in the confounding variable or changes in the explanatory variable are leading to the observed changes in the response variable.
1) x = whether a person regularly attends religious services
y = how long the person lives
2) x = the number of years a worker has
y = the worker’s income
Example of a (spurious) correlation between the number of Methodist ministers in New England and the amount of Cuban rum imported to Boston over the years (by # of barrels).

YEAR    Ministers     Rum
1860        63        8376
1865        48        6406
1870        53        7005
1875        64        8486
1880        72        9595
1885        80       10643
1890        85       11265
1895        76       10071
1900        80       10547
1905        83       11008
1910       105       13885

1) Calculate r
2) Is the increasing number of ministers causing people to drink more? What could be the lurking variable?
3) What type of association is this?
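Question 1 can be done on a calculator or with a short script. This is a minimal sketch in plain Python, assuming the Ministers and Rum values are listed in year order (1860 to 1910 in five-year steps). The function name `correlation` is our own. It computes r as the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations, which is algebraically the same as the usual formula for r.

```python
from math import sqrt

# Data from the slide, assumed to be in year order
years = list(range(1860, 1911, 5))
ministers = [63, 48, 53, 64, 72, 80, 85, 76, 80, 83, 105]
rum = [8376, 6406, 7005, 8486, 9595, 10643, 11265, 10071, 10547, 11008, 13885]

def correlation(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = correlation(ministers, rum)
print(f"r(ministers, rum)  = {r:.4f}")

# Both variables also rise over time, which hints at one candidate lurking variable.
print(f"r(year, ministers) = {correlation(years, ministers):.4f}")
print(f"r(year, rum)       = {correlation(years, rum):.4f}")
```

A value of r close to 1 describes how tightly the points follow a line. It says nothing about whether ministers cause rum imports. The two year correlations are only a hint for question 2, not a complete answer.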