Correlation and Hypothesis Tests

Correlation or Causation - Solutions
1
There is a positive correlation between the amount of sugar and the number of calories in
breakfast cereals.
Might be causal. Sugar could directly cause the higher calorie count, although other
ingredients also add to it. For example, there is an outlier, Kellogg’s Crunchy Nut Clusters.
Its high calorie reading could be because this cereal contains nuts, which are high in calories.
Also note that the vertical axis does not start at zero, so any correlation may look exaggerated.
2
It is reported, based on a sample of students aged between 11 and 16, that taller children
have a better vocabulary.
Might not be causal. The variable that is probably causing this correlation is age.
3
There is a positive correlation between the time spent running on the treadmill and the
number of calories the person burns.
Likely to be causal as we know exercise uses up calories.
4
A news story starts "Scottish neighbourhoods with the most licensed premises have the
highest rates of alcohol-related illness and deaths, according to a new study."
Open for discussion:
Might be causal - if there were fewer places selling alcohol, it is possible that alcohol-related
illnesses would drop. Does supply create demand or demand cause supply?
Might not be causal - discuss other variables that could affect either alcohol-related
illnesses or the number of licensed premises: unemployment, investment in medical care,
education on the risks of drinking alcohol, interest in sport (shown on the big screens in pubs).
5
Between 1992 and 2006 there is a negative correlation between the number of first time
marriages for women and life expectancy in England and Wales.
Might not be causal. The variable that is probably causing this correlation is the year.
There were more marriages in 1992, but female life expectancy was lower.
6
Organic food may be the cause of the increase in the number of cases of autism in the
USA.
Not causal - the graph only shows that both variables increase in a similar way over the time
period. It offers no scientific explanation of why this may happen.
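The third-variable explanation used in several of these solutions (for example age in question 2) can be illustrated with a small simulation. This is only a sketch with invented numbers: the model below assumes that age drives both height and vocabulary and that height has no direct effect on vocabulary. The function name `pearson` and all of the coefficients are made up for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented model (an assumption, not real data): age drives both variables.
ages = [rng.randint(11, 16) for _ in range(600)]
heights = [140 + 5 * (a - 11) + rng.gauss(0, 4) for a in ages]        # cm
vocab = [20000 + 3000 * (a - 11) + rng.gauss(0, 2000) for a in ages]  # words

# Correlation across the whole sample of 11 to 16 year olds.
r_all = pearson(heights, vocab)

# Hold the third variable fixed: look only at the 13 year olds.
h13 = [h for h, a in zip(heights, ages) if a == 13]
v13 = [v for v, a in zip(vocab, ages) if a == 13]
r_13 = pearson(h13, v13)

print(f"all ages:    r = {r_all:.2f}")
print(f"age 13 only: r = {r_13:.2f}")
```

In this invented model the correlation is strong across all ages but close to zero within a single age group, which is what we would expect if age, not height, is responsible.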
Correlation or Causation?
For each relationship:
(a) State what you think the evidence is trying to suggest.
(b) Give a reason why you agree or disagree with what the evidence is suggesting.
(c) Do you think the relationship is probably causal or probably not causal?

Example
Relationship: There is a positive correlation between the number of ice-creams sold and the
number of drownings.
(a) Suggests eating ice-cream causes drowning.
(b) When the weather is hot more people swim and more people eat ice-cream. There is a third
variable affecting both, the outside temperature.
(c) Probably causal / Probably not causal
1 There is a positive correlation between the amount of sugar and the number of
calories in breakfast cereals
[Scatter graph: Kcal per 100g vs Sugar per 100g for breakfast cereals. Horizontal axis: Sugar per 100g (g), 0 to 40. Vertical axis: Kcal/100g, 350 to 450.]
Probably causal / Probably not causal
2 It is reported, based on a sample of students aged between 11 and 16, that
taller children have a better vocabulary
Probably causal / Probably not causal
3 There is a positive correlation between the time spent running on the treadmill
and the number of calories the person burns
Probably causal / Probably not causal
4 A news story starts "Scottish neighbourhoods with the most licensed premises have
the highest rates of alcohol-related illness and deaths, according to a new study"
Probably causal / Probably not causal
5 Between 1992 and 2006 there is a negative correlation between the number
of first time marriages for women and life expectancy in England and Wales
[Scatter graph: Female life expectancy vs number of females marrying. Horizontal axis: Number of first time marriages, 20000 to 34000. Vertical axis: Female life expectancy at birth (years), 79.0 to 82.0.]
Probably causal / Probably not causal
6 Organic food may be the cause of the increase in the number of cases of
autism in the USA
Data obtained from the Organic Trade Association Survey 2011 Organic Industry Survey; US Department of Education Office for Special Education Program
Probably causal / Probably not causal
Popular Degrees
Most popular degrees based on UCAS applications in 2014
https://www.ucas.com/
Rank these degree courses in order of popularity
1 is the most popular and 10 is the least popular

Degree course                                                        Student's rank   Actual rank (to be given by the teacher)
Social Sciences (eg Psychology, Sociology)                           ____             ____
Physical Sciences (eg Physics, Chemistry, Geology, Earth Sciences)   ____             ____
Medical Related (Nursing, Physiotherapy, Healthcare)                 ____             ____
Medicine and Dentistry                                               ____             ____
Computer Sciences                                                    ____             ____
Law                                                                  ____             ____
Engineering                                                          ____             ____
Biological Studies                                                   ____             ____
Creative Arts and Design                                             ____             ____
Business and Admin Studies                                           ____             ____

Produced by MEI on behalf of OCR
© OCR 2015
Popular Degrees Template
[Blank scatter graph: Popularity of UK Degree Courses. Horizontal axis: Actual Rank, 0 to 10. Vertical axis: Student Rank, 0 to 10.]
What type of correlation is this?
[ ] Perfect positive correlation
[ ] Strong positive correlation
[ ] Weak positive correlation
[ ] No correlation
[ ] Weak negative correlation
[ ] Strong negative correlation
[ ] Perfect negative correlation
State anything that has surprised or interested you about your plot
----------------------------------------------------------------
----------------------------------------------------------------
----------------------------------------------------------------
----------------------------------------------------------------
Student's ranks
Rank these degree courses in order of popularity based on the number of applications
1 is the most popular and 10 is the least popular
Course examples: Psychology; Physics, Chemistry, Geology, Natural Sciences; Nursing, Physiotherapy
1 2 3 4 5 6 7 8 9 10
Most popular degrees based on UCAS applications in 2014
https://www.ucas.com/

Degree course                Actual rank   Number of applicants in 2014
Social Sciences              5             231,495
Physical Sciences            9             109,335
Medical Related              1             399,500
Medicine and Dentistry       10            97,560
Computer Sciences            8             112,300
Law                          7             122,660
Engineering                  6             161,140
Biological Studies           4             255,470
Creative Arts and Design     3             272,345
Business and Admin Studies   2             329,685
Plot the coordinates (x, y) = (Actual rank, Student Rank)
First point to plot in this case: (5, 7)
Perfect positive correlation
Strong positive correlation
No correlation
Strong negative correlation
Perfect negative correlation
Most popular degrees based on UCAS applications in 2014
https://www.ucas.com/

Degree courses               Rank   Number of applicants in 2014   Number of acceptances in 2014
Social Sciences              5      231,495                        43,325
Physical Sciences            9      109,335                        19,920
Medical Related              1      399,500                        52,405
Medicine and Dentistry       10     97,560                         8,915
Computer Sciences            8      112,300                        23,585
Law                          7      122,660                        23,695
Engineering                  6      161,140                        29,110
Biological Studies           4      255,470                        48,815
Creative Arts and Design     3      272,345                        53,330
Business and Admin Studies   2      329,685                        63,660
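As a check on the table, the ranks can be derived from the applicant numbers, and the same ranking can be applied to the acceptances to see where the two orders differ. This is a minimal sketch: the figures are copied from the table, while the helper name `rank_descending` is made up for illustration and assumes there are no tied values.

```python
# (course, applicants in 2014, acceptances in 2014), copied from the table
courses = [
    ("Social Sciences", 231495, 43325),
    ("Physical Sciences", 109335, 19920),
    ("Medical Related", 399500, 52405),
    ("Medicine and Dentistry", 97560, 8915),
    ("Computer Sciences", 112300, 23585),
    ("Law", 122660, 23695),
    ("Engineering", 161140, 29110),
    ("Biological Studies", 255470, 48815),
    ("Creative Arts and Design", 272345, 53330),
    ("Business and Admin Studies", 329685, 63660),
]

def rank_descending(values):
    """Give rank 1 to the largest value (hypothetical helper, assumes no ties)."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

applicant_ranks = rank_descending([c[1] for c in courses])
acceptance_ranks = rank_descending([c[2] for c in courses])

# Courses whose rank by acceptances differs from their rank by applicants.
differing = [
    name
    for (name, _, _), a, b in zip(courses, applicant_ranks, acceptance_ranks)
    if a != b
]

for (name, _, _), a, b in zip(courses, applicant_ranks, acceptance_ranks):
    print(f"{name:28} applicants rank {a:2}   acceptances rank {b:2}")
print("Ranks differ for:", ", ".join(differing))
```

The ranks by applicants reproduce the Rank column, and only a few courses change position when ranked by acceptances instead.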
Might be causal?
True or False
If a scatter graph is used to show how 20
people scored on 2 variables, the scatter
diagram will have 2 axes and 20 points plotted.
True or False
This season, every time Jim has worn his blue
scarf his team has won. His friend says that if
Jim wears his blue scarf to the next game the
team Jim supports will definitely win.
True or False
Some islanders noticed that whenever they
were sick they didn't have lice whereas they
did have lice whenever they were healthy.
So, as long as you have lice you will be healthy.
True or False
If you are the tallest student in your class and
there is a positive relationship between height
and another variable, your point will be
plotted in the top right of a scatter graph of
height against this other variable.
True or False
Correlation always implies causation.
True or False
If two variables are negatively correlated, high
values for one variable suggest low values for
the other variable.
Spearman's rank correlation coefficient
[Spreadsheet: one row per degree course (Social Sciences, Physical Sciences, Medical Related, Medicine and Dentistry, Computer Science, Law, Engineering, Biological Studies, Creative Arts and Design, Business and Admin Studies), with columns Actual Rank X, Student Rank Y, the difference between the ranks and the squared difference.]
Means: Actual Rank 5.5, Student Rank 5.5
Squared differences: 1, 16, 25, 25, 25, 4, 64, 36, 81, 1
Sum: 278
r = -0.685
[Scatter graph: Student Rank vs Actual Rank. Horizontal axis: Actual rank, 0 to 12. Vertical axis: Student rank, 0 to 12.]
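The value r = -0.685 can be checked from the squared rank differences in the spreadsheet (1, 16, 25, 25, 25, 4, 64, 36, 81, 1). The sketch below assumes the spreadsheet uses the usual formula for ranks without ties, r = 1 - 6Σd² / (n(n² - 1)); the recovered numbers are consistent with that assumption.

```python
# Squared rank differences recovered from the spreadsheet
d_squared = [1, 16, 25, 25, 25, 4, 64, 36, 81, 1]

n = len(d_squared)        # number of degree courses
sum_d2 = sum(d_squared)   # sum of squared differences

# Spearman's rank correlation coefficient (assumed formula, valid with no ties)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2)          # 278
print(round(r_s, 3))   # -0.685
```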
A macro is included in this spreadsheet.
Press Ctrl c: this will randomly order the student ranks, plot a scatter graph
and calculate Spearman's rank correlation coefficient for the new student
ranks.
Each time you press Ctrl c, put into column G any values of r you consider far
enough away from zero to convince you that there is an association
between the populations of student ranks and actual ranks.
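For anyone without the spreadsheet, the idea behind the macro can be imitated in a few lines. This is a rough sketch, not the macro itself: it shuffles the student ranks many times, calculates Spearman's coefficient each time with the no-ties formula, and reports how often a purely random ordering gives a value at least as far from zero as 0.685. The function name `spearman`, the seed and the number of trials are arbitrary choices made for illustration.

```python
import random

def spearman(xs, ys):
    """Spearman's coefficient for two lists of ranks with no ties."""
    n = len(xs)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Actual ranks of the ten degree courses, in the order used in the tables
actual = [5, 9, 1, 10, 8, 7, 6, 4, 3, 2]

rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 10000          # arbitrary number of random orderings

results = []
for _ in range(trials):
    student = list(range(1, 11))
    rng.shuffle(student)            # a random set of student ranks
    results.append(spearman(actual, student))

# How often is a random ordering at least as far from zero as 0.685?
cutoff = 0.685
share = sum(round(abs(r), 3) >= cutoff for r in results) / trials
print(f"share of random orderings with |r| >= {cutoff}: {share:.3f}")
```

A small share suggests that a value this far from zero is unlikely to come from random ranking alone, which is the judgement the column G exercise asks for.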
True or False - Solutions
1 If a scatter graph is used to show how 20 people scored on 2 variables, the scatter
diagram will have 2 axes and 20 points plotted.
True
2 This season, every time Jim has worn his blue scarf his team has won. His friend says
that if Jim wears his blue scarf to the next game the team Jim supports will definitely win.
False. This is just luck and nothing to do with the way Jim’s team plays.
3 Some islanders noticed that whenever they were sick they didn't have lice whereas they
did have lice whenever they were healthy. So, as long as you have lice you will be healthy.
False. Lice leave unhealthy people.
4 If you are the tallest student in your class and there is a positive relationship between
height and another variable, it is certain that your point will be plotted in the top right
of a scatter graph of height against this other variable.
False. This could be an outlier.
5 Correlation always implies causation.
False. We have just seen several examples of this not being true.
6 If two variables are negatively correlated, high values for one variable suggest low
values for the other variable.
True
True or False
For each statement, write True or False.
1 If a scatter graph is used to show how 20 people scored on 2 variables, the scatter
diagram will have 2 axes and 20 points plotted.
2 This season, every time Jim has worn his blue scarf his team has won. His friend says
that if Jim wears his blue scarf to the next game the team Jim supports will definitely win.
3 Some islanders noticed that whenever they were sick they didn't have lice whereas they
did have lice whenever they were healthy. So, as long as you have lice you will be healthy.
4 If you are the tallest student in your class and there is a positive relationship between
height and another variable, it is certain that your point will be plotted in the top right
of a scatter graph of height against this other variable.
5 Correlation always implies causation.
6 If two variables are negatively correlated, high values for one variable suggest low
values for the other variable.