Q - Gore High School

Statistical
Literacy
SLO
Three averages (central tendency) and their
advantages/disadvantages
Important: It is unlikely that you will be asked to calculate
statistics in the exam. You are far more likely to be asked to
interpret given statistics.
SLO:
To find the Mean Average
Copy into
Your notes
Mean
Mean = Sum of values ÷ Number of values
Use the mean to describe the middle of a set of data that does not
have an outlier.
Advantages:
There is only one answer.
popular
Disadvantages:
Affected by extreme values (outliers): see next slide
http://www.youtube.com/watch?v=UVnelMfhtrg
Youtube video how to find mean average (Very strange man!)
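The mean formula, and the way one extreme value pulls it, can be sketched in Python. The data values and the outlier of 200 are made up for illustration only; they are not taken from the slides.

```python
def mean(values):
    """Mean = sum of values ÷ number of values."""
    return sum(values) / len(values)


data = [21, 18, 24, 19, 27]   # illustrative values (an assumption)
with_outlier = data + [200]   # 200 is a made-up extreme value

print(mean(data))          # mean without the outlier
print(mean(with_outlier))  # mean with the outlier, pulled far upwards
```

One extreme value is enough to move the mean well away from where most of the data sits.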
Potential Problem with Means
(Figure: two panels, each marking the Median and the Mean.)
Calculating the mean
SLO:
To find the Median Average
Copy into
Your notes
Median
The median is the middle number when all numbers are in order.
If there are two middle numbers, you need to find what is
halfway between them.
A way to find the middle of those two numbers is to add them
up and divide by two.
http://www.youtube.com/watch?v=loAAovIKLGw
(youtube video to find median)
Copy into
Your notes
Median
Use the median to describe the middle of a set
of data that does have an outlier.
Advantages:
Outliers do not affect the Median significantly.
There is only one answer.
Disadvantages:
Not as popular as mean.
Your Turn:
Find the Median of the following
21, 18, 24, 19, 27
Step 1 – Arrange the numbers in order from least to greatest.
18, 19, 21, 24, 27
Step 2 – Find the middle number.
21 is your median number.
Your Turn:
Find the Median of the following
21, 25, 19, 28, 27, 18
Step 1 – Arrange the numbers in order from least to greatest.
18, 19, 21, 25, 27, 28
Step 2 – Find the middle number.
Step 3 – As there are two middle numbers, find the middle
of these two numbers. (46 ÷ 2)
23 is your median number.
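The two worked examples above can be checked with a short Python sketch. The function name `median` is just a label chosen here; the steps follow the slides (order the data, then take the middle value or the halfway point of the two middle values).

```python
def median(values):
    """Return the middle value, or halfway between the two middle values."""
    ordered = sorted(values)   # Step 1: arrange from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # odd count: a single middle number
        return ordered[mid]
    # even count: add the two middle numbers and divide by two
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([21, 18, 24, 19, 27]))      # five values
print(median([21, 25, 19, 28, 27, 18]))  # six values
```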
Outliers and the median and mean
SLO:
To find the mode average
Copy into
Your notes
Mode/Modal
The most common item is called the mode.
Use the mode when the data is non-numeric or when asked to
choose the most popular item.
Advantages:
Outliers do not affect the mode.
Disadvantages:
May be more than one answer
When no values repeat in the data set, the mode is every value and
is useless.
When there is more than one mode, it is difficult to interpret and/or
compare.
http://www.youtube.com/watch?v=dam-TCRbkFw
(youtube clip to find mode average, deals with multiple modes)
Your Turn:
Find the mode
21, 18, 24, 19, 18
Mode = 18
Your Turn:
Find the mode
29, 8, 4, 8, 19, 4
Mode = 4 and 8
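A minimal Python sketch of finding the mode, using the standard library's `statistics.multimode` (Python 3.8 or later), which returns every value that is tied for most common. The colour list is a made-up example of non-numeric data.

```python
from statistics import multimode

# The two "Your Turn" data sets from the slides
print(sorted(multimode([21, 18, 24, 19, 18])))   # one mode
print(sorted(multimode([29, 8, 4, 8, 19, 4])))   # two modes

# Non-numeric data (made-up example)
print(multimode(["red", "blue", "red", "green"]))

# When no value repeats, every value is returned as a mode
print(multimode([1, 2, 3]))
```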
When the mode is not appropriate
A survey is carried out among university students.
The results are represented in this table:
Numbers of sports played | 0 | 1 | 2 | 3 | 4 | 5 | 6
Frequency | 20 | 17 | 15 | 10 | 9 | 3 | 2
A newspaper reporter writes:
“You may be surprised to learn
that the average number of sports
played by university students is 0.”
Do you think this is a fair
representation of the data?
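To help judge the reporter's claim, the three averages for the survey table can be worked out with a short Python sketch. It expands the frequency table into a list of individual answers; the variable names are chosen here for illustration.

```python
from statistics import mean, median, mode

# Number of sports played -> frequency, from the survey table
frequency = {0: 20, 1: 17, 2: 15, 3: 10, 4: 9, 5: 3, 6: 2}

# Write out every student's answer as a list
answers = [sports for sports, count in frequency.items() for _ in range(count)]

survey_mode = mode(answers)
survey_median = median(answers)
survey_mean = round(mean(answers), 2)

print("mode:", survey_mode)
print("median:", survey_median)
print("mean:", survey_mean)
```

The mode is 0, but the median and the mean are both close to 2, which is why quoting only the mode gives a misleading picture here.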
Questions to do from the book
Achieve
Gamma
P264 EX20.01 Q1 – 12
EAS
P284 EX21.01 Q4
P18 Q30 – 38
Merit
Excellence
Spread
Spread looks at the distribution of the data.
SLO:
To find the range of a set of data
Copy into
Your notes
Range
Range is a measure of Spread
Range = highest value – lowest value
When the range is small; the values are similar in size.
When the range is large; the values vary widely in size.
Advantage
Quick to use
Easy to understand
Disadvantage
Affected by outliers
Your Turn:
Find the Range
21, 18, 24, 19, 27
Step 1: Find the lowest and highest numbers.
Step 2: Find the difference between these 2 numbers.
27 – 18 = 9
The range is 9
Calculating the mean, median and range
SLO
Comparing sets of data using mean and range
The range
Here are the high jump scores for two girls in metres.
Joanna | 1.62 | 1.41 | 1.35 | 1.20 | 1.15
Kirsty | 1.59 | 1.45 | 1.41 | 1.30 | 1.30
Calculate the mean and the range for each girl.
 | Joanna | Kirsty
Mean | 1.35 m | 1.41 m
Range | 0.47 m | 0.29 m
Use these results to decide which one you would enter
into the athletics competition and why.
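The mean and range for each girl can be checked with a few lines of Python. The helper name `summarise` is an assumption made for this sketch; results are rounded to two decimal places to match the slide.

```python
def summarise(jumps):
    """Return (mean, range) of a list of jumps, rounded to 2 d.p."""
    mean = sum(jumps) / len(jumps)
    spread = max(jumps) - min(jumps)   # range = highest - lowest
    return round(mean, 2), round(spread, 2)


scores = {
    "Joanna": [1.62, 1.41, 1.35, 1.20, 1.15],
    "Kirsty": [1.59, 1.45, 1.41, 1.30, 1.30],
}

for name, jumps in scores.items():
    mean, spread = summarise(jumps)
    print(f"{name}: mean {mean} m, range {spread} m")
```

Kirsty has the higher mean and the smaller range, so she jumps higher on average and more consistently; Joanna has the single best jump.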
Comparing sets of data
Here is a summary of Chris and Rob’s performance in the 200
metres over a season. They each ran 10 races.
 | Mean | Range
Chris | 24.8 seconds | 1.4 seconds
Rob | 25.0 seconds | 0.9 seconds
Which of these conclusions are correct?
Rob is more reliable.
Rob is better because his mean is higher.
Chris is better because his range is higher.
Chris must have run a better time for his quickest race.
On average, Chris is faster but he is less consistent.
Comparing hurdles scores
Here are the top eleven hurdles scores in seconds for Year 9 and Year 10.
Year 9: 12.1, 14.0, 15.3, 15.4, 15.4, 15.6, 15.7, 15.7, 16.1, 16.7, 17.0
Year 10: 12.3, 13.7, 15.5, 15.5, 15.6, 15.9, 16.0, 16.1, 16.1, 17.1, 22.9
Work out the mean and range.
 | Year 9 | Year 10
Mean | 15.4 | 16.1
Range | 4.9 | 10.6
Which year group do you think is better and why?
Why might Year 10 feel the comparison is unfair?
Copy into
Your notes
In conclusion
 | HOW | Advantage | Disadvantage
AVERAGE
mean | (sum ÷ #) | 1 answer, popular | Outliers
mode | Most common | Non numerical, Outliers OK | More than one, No answer
median | middle | Outliers OK, 1 answer | unpopular
SPREAD
range | Big - small | quick | outliers
Leave a gap at bottom of table to add one more line at a later date
Web resources
http://www.bbc.co.uk/schools/gcsebitesize/maths/statistics/measuresofaverageact.shtml
(Notes and individual interactive mean, median and mode questions)
http://www.youtube.com/watch?v=oNdVynH6hcY
(Mean, median and mode song)
http://www.bbc.co.uk/bitesize/ks2/maths/data/mode_median_mean_range/quiz/q10083371/
(mean, mode, median, range quiz)
SLO
Quartiles and interquartile range
Copy into
Your notes
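Quartiles and Median
The ordered data is split into four quarters (1st, 2nd, 3rd and 4th quartile), each holding ¼ of the data.
Q1 (Lower Quartile): ¼ of the way through the data
Q2 (Median): ½ of the way through the data
Q3 (Upper Quartile): ¾ of the way through the data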
SLO: Know how to find the:
Lowest, highest, Lower
Quartile, Upper
Quartile and Median
Copy into
Your notes
To find the lowest, highest, lower quartile, upper
quartile and median.
2 4 6 7 8 9 13 14 15 24 56
Lower
Quartile
Median
Upper
Quartile
Step 1: Put the data in order
Step 2: Find the middle piece of data (Median)
Step 3: Find the ‘middle’ piece of data on the left (lower quartile)
Step 4: Find the ‘middle’ piece of data on the right (upper quartile)
Step 5: Find the smallest piece of data.
Step 6: Find the biggest piece of data
E.g. Find the lowest, highest, lower
quartile, upper quartile and median for
the data below.
9, 4, 5, 2, 5, 6, 7, 10, 7, 8, 7
Put the numbers in order
2 4 5 5 6 7 7 7 8 9 10
Lowest = 2
Q1 = 5
Q2 = 7
Q3 = 8
Highest = 10
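The six steps can be sketched in Python as below. This follows the method used on these slides, where the quartiles are the medians of the values to the left and right of the median (the median itself is left out when there is an odd number of values). Other textbooks and software sometimes use slightly different quartile rules, so the function names here are illustrative rather than standard.

```python
def middle(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(data):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(data)                  # Step 1: put the data in order
    n = len(ordered)
    left = ordered[: n // 2]                # values to the left of the median
    right = ordered[n // 2 + n % 2 :]       # values to the right of the median
    return (ordered[0], middle(left), middle(ordered), middle(right), ordered[-1])


print(five_number_summary([9, 4, 5, 2, 5, 6, 7, 10, 7, 8, 7]))
```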
Your Turn:
Find the lowest, highest, lower quartile, upper quartile
and median for the data.
13, 8, 10, 1, 4, 15, 9, 12, 7, 3, 9
Put the numbers in order
1 3 4 7 8 9 9 10 12 13 15
Lowest = 1
Q1 = 4
Q2 = 9
Q3 = 12
Highest = 15
Your Turn:
Find the lowest, highest, lower quartile, upper
quartile and median for the data.
7, 3, 5, 6, 7, 9, 15, 5, 4, 8
Put the numbers in order
3 4 5 5 6 7 7 8 9 15
L = 3
Q1 = 5
Q2 = 6.5
Q3 = 8
H = 15
Your Turn:
Find the lowest, highest, lower quartile, upper quartile
and median for the data.
5, 22, 1, 4, 6, 8, 9, 5, 7
Put the numbers in order!
1 4 5 5 6 7 8 9 22
L = 1
Q1 = 4.5
Q2 = 6
Q3 = 8.5
H = 22
Your Turn:
Find the lowest, highest, lower quartile, upper quartile and median
for the following.
Question | Data | Lowest | Lower Quartile | Median | Upper Quartile | Highest
A | 3 5 7 12 16 24 27 30 34 39 42 | 3 | 7 | 24 | 34 | 42
B | 1 4 5 6 8 10 27 30 43 48 70 | 1 | 5 | 10 | 43 | 70
C | 4 9 12 17 19 26 30 | 4 | 9 | 17 | 26 | 30
D | 83 62 13 40 87 23 80 | 13 | 23 | 62 | 83 | 87
E | 19 9 4 8 2 17 31 4 12 10 3 | 2 | 4 | 9 | 17 | 31
F | 1 1 2 2 3 3 3 4 4 5 5 5 7 8 8 | 1 | 2 | 4 | 5 | 8
G | 1 3 7 8 9 10 16 22 24 35 | 1 | 7 | 9.5 | 22 | 35
H | 2 6 8 12 14 17 19 20 | 2 | 7 | 13 | 18 | 20
I | 47 40 34 38 43 50 18 30 | 18 | 32 | 39 | 45 | 50
SLO
How to find the interquartile range
Copy into
Your notes
Interquartile Range (IQR).
Interquartile Range = Upper Quartile – Lower Quartile.
Advantage
Not affected by outliers
Disadvantage
Harder to understand
Interquartile Range (IQR).
E.g. Find the IQR for the following.
23 47 12 46 22 58 35 68 10 14
In order: 10 12 14 22 23 35 46 47 58 68
L.Q. = 14
Median = 29
U.Q. = 47
Interquartile Range = 47 – 14 = 33
(Range = 68 – 10 = 58)
What if there is an outlier?
In order, with 370 added: 10 12 14 22 23 35 46 47 58 68 370
L.Q. = 14
Median = 35
U.Q. = 58
Interquartile Range = 58 – 14 = 44
(Range = 370 – 10 = 360)
Notice how the IQR and range change.
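The effect of the outlier on the two measures of spread can be reproduced with a short Python sketch. It assumes the quartile method used on these slides (medians of the lower and upper halves); the function names are chosen for illustration.

```python
def middle(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def spread(data):
    """Return the range and interquartile range of the data."""
    ordered = sorted(data)
    n = len(ordered)
    lower_q = middle(ordered[: n // 2])
    upper_q = middle(ordered[n // 2 + n % 2 :])
    return {"range": ordered[-1] - ordered[0], "iqr": upper_q - lower_q}


data = [23, 47, 12, 46, 22, 58, 35, 68, 10, 14]
print(spread(data))           # without the outlier
print(spread(data + [370]))   # with the outlier 370 added
```

The range grows about sixfold when 370 is added, while the IQR changes far less.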
Finding the interquartile range
Your Turn:
Find the interquartile range for the following.
Question | Data | Lower Quartile | Upper Quartile | Interquartile range
A | 3 5 7 12 16 24 27 30 34 39 42 | 7 | 34 | 27
B | 1 4 5 6 8 10 27 30 43 48 70 | 5 | 43 | 38
C | 4 9 12 17 19 26 30 | 9 | 26 | 17
D | 83 62 13 40 87 23 80 | 23 | 83 | 60
E | 19 9 4 8 2 17 31 4 12 10 3 | 4 | 17 | 13
F | 1 1 2 2 3 3 3 4 4 5 5 5 7 8 8 | 2 | 5 | 3
G | 1 3 7 8 9 10 16 22 24 35 | 7 | 22 | 15
H | 2 6 8 12 14 17 19 20 | 7 | 18 | 11
I | 47 40 34 38 43 50 18 30 | 32 | 45 | 13
SLO
How to use the interquartile range
Sarah’s exam marks: 88%, 90%, 89%, 91%, 92%, 93%, 89%, 90%
Range = 93 – 88 = 5% (Small Range)
A small range means that Sarah is very consistent, predictable, reliable.
A small interquartile range shows that the middle half of the numbers are bunched together. (Sarah’s IQR = 2.5)
Mark’s exam marks: 92%, 88%, 89%, 91%, 94%, 90%, 92%, 32%
Range = 94 – 32 = 62% (Big Range)
A big range means that Mark is very inconsistent, unpredictable, unreliable.
But Mark is predictable! It was only the 32% that gave the impression that he is inconsistent! I will try the interquartile range.
L.Q. = 88.5, Median = 90.5, U.Q. = 92
Interquartile Range = 92 – 88.5 = 3.5 (Small Interquartile Range)
This small interquartile range also shows that Mark is consistent.
Discuss the calculations below.
Battery Life:
The life of 12 batteries recorded in hours is:
2, 5, 6, 6, 7, 8, 8, 8, 9, 9, 10, 15
Mean = 93/12 = 7.75 hours and the range = 15 – 2 = 13 hours.
2, 5, 6, 6, 7, 8, 8, 8, 9, 9, 10, 15
Median = 8 hours and the inter-quartile range = 9 – 6 = 3 hours.
The averages are similar but the measures of spread are
significantly different since the extreme values of 2 and 15 are
not included in the inter-quartile range.
Your Turn:
Use the quartiles to describe the spread of the data
Data: pages read of book
Lowest = 3, Lower Quartile = 7, Median = 24, Upper Quartile = 34, Highest = 42
The smallest number of pages read was 3
The greatest number of pages read was 42
25% of the students have read between 3 and 7 pages
25% of the students have read between 7 and 24 pages
25% of the students have read between 24 and 34 pages
25% of the students have read between 34 and 42 pages
50% of the students have read between 7 and 34 pages
Your Turn:
Use the quartiles to describe the spread of the data
Data: money in bank account
Lowest = 1, Lower Quartile = 5, Median = 10, Upper Quartile = 43, Highest = 70
1) Find the smallest amount of money in a bank account
1
2) Use 25% to describe the amount of money in an account
(several possible answers)
??
3) The interquartile range covers the middle 50% of the data. Use this in a sentence to describe the
amount of money in the bank accounts.
50% of the bank accounts have between $5 and $43.
Your Turn:
Use the quartiles to describe the spread of the data
Data: albums on Ipod
Lowest = 4, Lower Quartile = 9, Median = 17, Upper Quartile = 26, Highest = 30
1) Find the smallest number of albums on an ipod
4
2) Use 25% to describe the number of albums on an ipod
(several possible answers)
??
3) The interquartile range covers the middle 50% of the data. Use this in a sentence to describe the
number of albums on the ipods.
50% of the ipods have between 9 and 26 albums.
Your Turn:
Use the quartiles to describe the spread of the data
Data: percentage in maths tests this year
Lowest = 13, Lower Quartile = 23, Median = 62, Upper Quartile = 83, Highest = 87
Describe what the above is showing
Your Turn:
Use the quartiles to describe the spread of the data
Class A: percentage in maths test
Lowest = 13, Lower Quartile = 23, Median = 62, Upper Quartile = 83, Highest = 87
Class B: percentage in maths test
Lowest = 6, Lower Quartile = 60, Median = 62, Upper Quartile = 70, Highest = 75
Compare the two classes' mathematics results
Your Turn:
Use the quartiles to describe the spread of the data
Class A: money in bank account
Lowest = 1, Lower Quartile = 8, Median = 10, Upper Quartile = 13, Highest = 94
Class B: money in bank account
Lowest = 17, Lower Quartile = 21, Median = 52, Upper Quartile = 80, Highest = 89
Compare the savings of the two classes
Questions to do from the book
Achieve
Gamma
EAS
Merit
P273 EX 20.03
Q1 – 10
P21 Q39 – 44 P22 Q45 – 48
(copy raw data from
answers and use this
to make comparisons)
Excellence
Copy into
Your notes
In conclusion (add to earlier table)
 | HOW | Advantage | Disadvantage
AVERAGE (Central Tendency)
mean | (sum ÷ #) | 1 answer, popular | Outliers
mode | Most common | Non numerical, Outliers OK | More than one, No answer
median | middle | Outliers OK, 1 answer | unpopular
SPREAD
Range (with mean) | Big - small | quick | outliers
Interquartile range (IQR) (with median) | Upper quartile minus Lower quartile | Outliers OK | Harder to calculate/understand
SLO
Tally charts and frequency tables
SLO: To draw a tally chart
Tally Charts
Tally charts are used to organise data.
Data
Red
llll
Yellow
llll llll
llll ll
White
llll ll
Orange
l
Green
llll
Tally
represents a tally of 5
Example: Organise the diameters of
tomatoes given below into a tally chart.
58
56
59
57
60
56
62
62
58
56
58
59
56
59
56
59
57
58
60
62
61
58
59
62
Step 1: Identify the largest and smallest values
Lowest number 56
Highest number 62
Step 2: Draw a tally chart from smallest to largest
Step 3: Fill in the tally chart
Diameter | Tally
56 | lllll
57 | ll
58 | lllll
59 | lllll
60 | ll
61 | l
62 | llll
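The tallying steps for the tomato diameters can be sketched in Python with `collections.Counter`. The printed layout (one `|` per tomato, followed by the count) is a choice made for this sketch; on paper every fifth stroke would cross the previous four.

```python
from collections import Counter

# Tomato diameters from the example
diameters = [58, 56, 59, 57, 60, 56, 62, 62, 58, 56, 58, 59,
             56, 59, 56, 59, 57, 58, 60, 62, 61, 58, 59, 62]

tally = Counter(diameters)

# Steps 1 and 2: rows run from the smallest to the largest value
for diameter in range(min(diameters), max(diameters) + 1):
    count = tally[diameter]
    # Step 3: one stroke per tomato, then the frequency
    print(f"{diameter}  {'|' * count:<6} {count}")
```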
SLO
How to make a frequency table
Copy into
Your notes
Frequency Tables
A frequency table is a tally chart with an extra column
for the total (frequency) of each tally.
Diameter
Tally
Frequency
Red
lll
3
Blue
llll
4
llll llll
llll llll lll
9
13
Green
Black
http://www.youtube.com/watch?v=Ng3SMd-_ys8
(Video: How to draw a tally graph and frequency table)
SLO
Calculating statistics from tables
Using tables
From a frequency table we can calculate mean, mode, median and
range.
In the table below students were asked how many siblings they have
Siblings | 0 | 1 | 2 | 3 | 4 | 5
frequency | 3 | 5 | 6 | 7 | 0 | 1
Copy into
Your notes
Using tables
Siblings | 0 | 1 | 2 | 3 | 4 | 5
frequency | 3 | 5 | 6 | 7 | 0 | 1
To visualise the numbers, sometimes it is easier to list the numbers
0 0 0 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 5
Range = 5 – 0 = 5
Median = 2 (As there are 22 pieces of data, this is half way between
the 11th and 12th piece of data)
Copy into
Your notes
Using tables
Siblings | frequency | Frequency x value
0 | 3 | 0
1 | 5 | 5
2 | 6 | 12
3 | 7 | 21
4 | 0 | 0
5 | 1 | 5
Total | 22 | 43
Mean
To find the mean it is easier if an extra column and row is
added, as above
Mean = 43 ÷ 22 = 1.95
Mode = 3 (take care, 7 is not the mode, this is the frequency)
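The same calculations can be sketched in Python straight from the frequency table. The variable names are chosen for this sketch only.

```python
from statistics import median

# Siblings -> frequency, from the table
table = {0: 3, 1: 5, 2: 6, 3: 7, 4: 0, 5: 1}

total_frequency = sum(table.values())                                # the "Total" of the frequency column
total_value = sum(value * freq for value, freq in table.items())     # the "Total" of frequency x value

mean = round(total_value / total_frequency, 2)
mode = max(table, key=table.get)       # the value with the highest frequency, not the frequency itself

# Listing every value out makes the median and range easy to find
listed = [value for value, freq in table.items() for _ in range(freq)]
middle_value = median(listed)
data_range = max(listed) - min(listed)

print(total_frequency, total_value, mean, mode, middle_value, data_range)
```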
http://www.bbc.co.uk/bitesize/ks2/maths/data/frequency_diagrams/play/
(interactive frequency table questions)
Questions to do from the book
Achieve
P268 EX 20.02 Q1 – 5
Gamma
P313 EX 22.01 Q6 – 9
EAS
P25 Q49 – 52
Merit
P284 EX 21.01 Q 1 – 9
P289 EX21.02
Excellence
SLO
Data display
Important: It is unlikely that you will be asked to draw
graphs in the exam. You are far more likely to be asked to
interpret given graphs.
SLO
Bar graphs
Copy into
Your notes
Bar charts
Bar charts can be used to display categorical (non-numerical) data
or discrete (numerical) data.
The bars of a bar chart do not touch each other (like the bars in a jail).
(Bar chart: How children travel to school. Horizontal axis: Method of travel (walk, train, car, bicycle, bus, other); vertical axis: Number of children.)
(Bar chart: Number of CDs bought in a month. Horizontal axis: Number of CDs bought (0 to 5); vertical axis: Number of children.)
Bar graphs
Bars should be separate.
20
14
12
5
1
3
4
6
7
9
What is wrong with
this bar graph?
Make a list of the 8
mistakes.
The bars must be the same
width.
The gaps must be the
same width.
The scales must go up by
equal intervals.
The numbers on the
horizontal axis must
appear in the middle of the
bar.
The axes must be labelled.
There should be a title.
Bar line graphs
Bar line graphs are the same as bar charts except that lines are
drawn instead of bars.
For example, this bar line graph shows a set of test results.
Number of pupils
Mental maths test results
Mark out of ten
Bar charts for two sets of data
Two or more sets of data can be shown on a bar chart.
For example, this bar chart shows favourite subjects for a group
of boys and girls.
(Bar chart: Girls' and boys' favourite subjects. Horizontal axis: Favourite subject (Maths, Science, English, History, PE); vertical axis: Number of pupils; separate bars for Girls and Boys.)
Bar charts for two sets of data
For example, this bar chart shows favourite sport for a group
of boys and girls.
1) Which is the most popular sport? Basketball
2) What sport is most popular with the girls?
Hard to tell as pink part of bar is approximately the same size for all
SLO
Histograms
(Histogram: Heights of Year 8 pupils. Horizontal axis: Height (cm), 140 to 175; vertical axis: Frequency, 0 to 35.)
Copy into
Your notes
Histograms
The bars of a histogram touch each other.
A histogram is for continuous data.
Age Group
Frequency
1-5
4
6 - 10
16 - 20
30
27
50
21 - 25
11
11 - 15
http://www.youtube.com/watch?v=g1wUk-JV7JU&feature=related
(You tube video explanation of how to draw a histogram)
Copy into
Your notes
Skewed data
Data that is heavily weighted towards one end of the data set is
said to be skewed.
When data is skewed, the mode is not an appropriate average.
(Two charts of Frequency against Numbers of sports played: one labelled Negatively skewed data and one labelled Positively skewed data.)
SLO:
To be able to tell the difference
between a bar graph and a histogram
Your Turn:
Is each of the following a histogram or a bar graph?
A
D
G
B
E
H
C
F
I
Questions to do from the book
Achieve
Gamma
EAS
Merit
Excellence
P285 EX21.01 Q6 – 8
P39 Q69 – 73
(do not do questions using mid points)
The word proportion will need to be explained.
SLO
Pie graph
Pie charts
A pie chart is a circle divided up into sectors which are representative
of the data.
In a pie chart, each category is shown as a fraction of the circle.
Methods of travel to work
For example, in a survey
half the people asked
drove to work, a quarter
walked and a quarter
went by bus.
Car
Walk
Bus
Pie charts
This pie chart shows the distribution of drinks sold in a cafeteria
on a particular day.
Drinks sold in a cafeteria
coffee
soft drinks
tea
Altogether 300 drinks were
sold.
Estimate the number of each
type of drink sold.
Coffee: 75
Soft drinks: 50
Tea: 175
Reading pie charts
The following pie chart shows the favourite chip flavours of 72
children.
(Pie chart with sectors for Cheese and chive, Salt and vinegar, Smokey bacon, Cheese and onion, and Ready salted. The angles shown are 55º, 35º, 85º, 50º and 135º; the Ready salted sector is 135º.)
How many children preferred ready salted chips?
Hint: Find the proportion of children who preferred ready salted:
135 ÷ 360 = 0.375
The number of children who preferred ready salted is:
0.375 × 72 = 27
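The hint can be written as a small Python helper. The function name `count_from_angle` is made up for this sketch; it turns a sector angle into a proportion of the full 360º and then into a number of children.

```python
def count_from_angle(angle_degrees, total):
    """Number of items represented by a pie chart sector."""
    proportion = angle_degrees / 360   # fraction of the whole circle
    return proportion * total


ready_salted = count_from_angle(135, 72)
print(ready_salted)
```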
SLO
Comparing pie charts
Pie charts
These two pie charts compare the proportions of boys and girls in
two classes.
Mr Humphry's class
Number of
boys
Number of
girls
Mrs Payne's class
Number of
boys
Number of
girls
Dawn says, “There are more girls in Mrs Payne’s class than in Mr
Humphry’s class.” Is she right?
NO, pie charts compare proportions and not actual amounts.
Drinking habits among young people in 2013
Compare the results for 1988 with 2013.
1988
2013
20%
25%
39%
38%
18%
15%
11%
12%
Last week
1-4 weeks
1-6 months
More than 6 months
Never
10%
12%
Compare the two sets of data in the
pie charts
Compare the two sets of data in the
pie charts
Compare the two sets of data in the
pie charts
Compare the two sets of data in the
pie charts
Questions to do from the book
Achieve
Gamma
EAS
Merit
P34 Q61a,b
Excellence
SLO
Box and Whisker Graphs
SLO:
To draw a box and whisker diagram
Copy into
Your notes
Box and Whisker Diagrams.
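(Diagram: a box and whisker plot on a number line from 4 to 12, labelled from left to right Lowest Value, Lower Quartile, Median, Upper Quartile and Highest Value. Each whisker holds 25% of the data. The box holds 50% of the data, with 25% on each side of the median.)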
Copy into
Your notes
Drawing Box and Whisker
Diagrams.
Given the following data, plot a box and whisker plot
Lowest = 6
Lower Quartile = 10
Median = 12
Upper Quartile = 18
Highest = 21
Step 1: Draw a number line from lowest to highest
Step 2: Draw 5 small vertical lines at the 5 data points
Step 3: Join lines to make Box and whisker
(Number line from 6 to 22.)
http://www.youtube.com/watch?v=GMb6HaLXmjY
(you tube video explanation of how to draw a box and whisker plot)
Your turn: Drawing Box and
Whisker Diagrams.
Draw a box and Whisker diagram for the following data.
6, 3, 9, 8, 4, 10, 8, 4, 15, 8, 10
Step 1: Order the data
3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15
Step 2: Find the 5 data points Q2, Q1, Q3, Min, Max
Lowest (Min) = 3
Lower Quartile (Q1) = 4
Median (Q2) = 8
Upper Quartile (Q3) = 10
Highest (Max) = 15
Box and whisker plots from dot plots
For the dot plot below draw a box and whisker graph
List the data in order:
24
18 19 19 19 21 21 22 23 23 23 24 24 24 24 24 25 25 26 26 26 28 28
Find the 5 data points
Draw the box and whisker graph
18
20
22
24
26
28
Copy into
Your notes
Outliers on box and whisker graphs
Sometimes the outliers of a set of data are not included as part of the
whiskers. The outliers are represented by dots to left or right of the
whiskers.
Limit of where whiskers can extend to
Points that are more than 1.5 times the width of the box away from the edge
of the box should not be part of the whisker.
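The whisker limit can be sketched in Python as below. The function name `find_outliers` is made up for this sketch, and the data set is borrowed from the interquartile range example earlier in these notes, with its quartiles of 14 and 58.

```python
def find_outliers(data, lower_quartile, upper_quartile):
    """Return the values that lie beyond the whisker limits."""
    box_width = upper_quartile - lower_quartile          # the interquartile range
    low_limit = lower_quartile - 1.5 * box_width
    high_limit = upper_quartile + 1.5 * box_width
    return [x for x in data if x < low_limit or x > high_limit]


data = [10, 12, 14, 22, 23, 35, 46, 47, 58, 68, 370]
outliers = find_outliers(data, lower_quartile=14, upper_quartile=58)
print(outliers)
```

Here the box is 44 wide, so the whiskers may reach at most 66 beyond each edge of the box; only 370 falls outside and would be drawn as a separate dot.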
SLO
Comparing box and whisker plots
Copy into
Your notes
Sampling:
If we sample data we are unlikely to get the true picture of the
population.
The bigger the sample the more likely we are to get a better
picture of the population.
Copy into
Your notes
Achieve level comparisons
As the boxes do not overlap we say that
‘B tends to be larger than A for the population’.
We do not definitely know if this is true for the
population as the sample might be unrepresentative.
Copy into
Your notes
Achieve level comparisons
A
B
25%
25%
25%
25%
50%
You can compare box and whisker graphs by
using the above percentages e.g. 50% of the
results in sample A are in the top 25% of the
sample for B
Copy into
Your notes
Merit Level comparison
Boxes that overlap could just be because of variation
due to sampling.
If the medians and boxes overlap you can only say
that ‘B tends to be bigger than A for the population’
if the distance between the medians is bigger than
1/3 of the visible spread.
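The 1/3 rule can be sketched in Python as below. This sketch assumes that "visible spread" means the distance from the lowest lower quartile to the highest upper quartile of the two boxes; the slides do not define it, so treat that as an assumption. The function name and the example numbers are made up for illustration.

```python
def tends_to_be_bigger(box_a, box_b):
    """Check whether B tends to be bigger than A using the 1/3 rule.

    Each box is (lower quartile, median, upper quartile).
    """
    a_lq, a_median, a_uq = box_a
    b_lq, b_median, b_uq = box_b
    # Assumed definition: spread covered by both boxes together
    visible_spread = max(a_uq, b_uq) - min(a_lq, b_lq)
    return (b_median - a_median) > visible_spread / 3


# Hypothetical samples: medians 5 apart, visible spread 10
print(tends_to_be_bigger((10, 12, 16), (13, 17, 20)))
# Hypothetical samples: medians 1 apart, visible spread 12
print(tends_to_be_bigger((10, 14, 18), (13, 15, 22)))
```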
Copy into
Your notes
Merit Level comparison
The difference between the medians should also be discussed in
relation to the ranges.
(Figure: two pairs of box plots, A and B.)
In the above example, the difference between the medians in the
two pairs of box plots is greater in diagram A.
Your Turn
Describe the difference between the two box plots.
A
B
A tends to be larger than B because the middle
50% of A is outside the middle 50% of B
Your Turn
Describe the difference between the two box plots.
A
B
A tends to be larger than B because the median
of A is outside the middle 50% of B
Your Turn
Describe the difference between the two box plots.
A
B
As the boxes overlap, and the distance between
the medians is less than 1/3 of the visible spread
there is not a significant difference.
Your Turn
Describe the difference between the two box plots.
A
B
As the boxes overlap, and the distance between
the medians is more than 1/3 of the visible spread
there is a significant difference.
Outline the differences between the
two box and whisker diagrams
(hrs life of battery)
Natural brand tends to be larger than Regular
brand because the middle 50% of Natural is
outside the middle 50% of Regular
Outline the differences between the
two box and whisker diagrams
(Score in test)
Girls tends to score higher than Boys because the
median of Boys is outside the middle 50% of Girls.
Outline the differences between the
two box and whisker diagrams
(time to walk to get up in morning)
As the boxes overlap, and the distance between
the medians is less than 1/3 of the visible spread
there is not a significant difference.
Outline the differences between the
two box and whisker diagrams
(time to service car)
Service B tends to take longer than A because
the median of B is outside the middle 50% of A
Outline the differences between
attendance of Year 9 and Year 11 students
Year 11 tend to be absent more than Year 9 because the
middle 50% for Year 11 is above the largest number for Year 9.
Comparing sets of data
Here are box-and-whisker diagrams representing James’ lap times and
Tom’s lap times for a karting race. Who is better and why?
James’ lap times
52 54
53
58
91
Tom’s lap times
52 54
60
65
86
Tom is slower than James because the median
for Tom is outside the middle 50% of James.
Web resources:
http://www.brainingcamp.com/resources/math/box-plots/lesson.php
(Good interactive notes for teaching box and whisker, but a little long)
http://www.brainingcamp.com/resources/math/box-plots/interactive.php
(interactive box and whisker, could be good for teaching)
Questions to do from the book
Achieve
Gamma
EAS
Merit
P284 EX21.01 Q1 – 3, 9
P27 Q53 – 56
Excellence
SLO
Stem and leaf Graphs
SLO
To know what a stem and leaf are
Copy into
Your notes
Leaf: The last digit on the right of the number.
Stem: The digit or digits that remain when the leaf is dropped.
E.g. In 284 the stem is 28 and the leaf is 4.
Copy into
Your notes
What is a Stem and leaf Plot?
A stem and leaf plot is made up from two parts.
1) All the stem numbers in order
2) All the leaf numbers in order
SLO
To plot a stem and leaf graph
Copy into
Your notes
Stem and Leaf plots are used to show data more clearly
Step 1: Draw an empty stem and leaf diagram
Step 2: Fill in the stem
Step 3: Fill in the leaves
Step 4: Rearrange the leaf side in order (least to greatest)
http://www.youtube.com/watch?v=Gn4Izx_o7Pg
(You tube video explanation of stem and leaf plots)
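The four steps can be sketched in Python as below. The function name `stem_and_leaf` is made up for this sketch, and it assumes whole-number data where the leaf is the last digit. The heights are those of Rugby Team 2 from the back to back example in these notes, deliberately listed out of order.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole numbers into {stem: [leaves in order]}."""
    rows = defaultdict(list)
    for value in sorted(values):          # sorting first keeps the leaves in order
        stem, leaf = divmod(value, 10)    # leaf = last digit, stem = the rest
        rows[stem].append(leaf)
    return dict(rows)


heights = [166, 140, 184, 153, 147, 171, 160, 142, 181, 154, 146, 176, 148, 161, 184]
plot = stem_and_leaf(heights)

for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

A stem with no data would be skipped by this sketch, whereas a hand-drawn plot would normally still show the empty stem.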
Back to Back Stem and Leaf
Sometimes to compare two sets of data two sets of
leaves can share the same stem as below.
SLO
To read values from a stem and leaf plot
Stem | Leaf
4 | 2 3 6 7 8 9 9
5 | 0 0 1 1 1 1 2 2 4 4 4 4 5 5 5 6 6 6 7 7 7 7 8
6 | 0 1 1 1 2 4 4 5 8 9
1. How many presidents were 51 years old at their inauguration? 4
2. What age is the youngest president to be inaugurated? 42
3. What is the age of the oldest president to be inaugurated? 69
4. How many presidents were 40-49 years old at their inauguration? 7
A back to back stem-leaf helps us to compare two sets of data.
Rugby Team 1 Heights (cm) | Stem | Rugby Team 2 Heights (cm)
4 2 1 | 14 | 0 2 6 7 8
7 6 | 15 | 3 4
8 5 0 | 16 | 0 1 6
6 4 3 3 1 | 17 | 1 6
7 0 | 18 | 1 4 4
How tall was the tallest player? 187 cm
How short was the shortest player? 140 cm
Calculations with stem-and-leaf diagrams
Stem (tens) | Leaf (units)
0 | 5 5 7 7 7
1 | 0 1 3 5 7 7 9
2 | 0 0 2 3 4
3 | 0 0 8
4 | 1 5
Mode
The mode is 7.
Mean
There are 22 people in the survey and they smoke a total of 427 cigarettes a week.
Mean = 427 ÷ 22 = 19 (to the nearest whole number)
Median
The median is halfway between 17 and 19. This is 18.
Range
45 – 5 = 40
SLO
Comparing stem and leaf plots
Fill in the blanks
a) The lowest mark from either class was 57.
b) The highest score from either class was 98.
c) There were more higher scores in class 1.
d) The most common score was in the 70’s for both classes.
Class 1
5 | 7 7 8
6 | 0 3
7 | 0 2 5 5 6 9
8 | 1 1 2 3 5
9 | 2 4 7 8
Class 2
5 | 7 8 9
6 | 0 2 5 5 9
7 | 0 3 5 7 7 9
8 | 1 2 5 8
9 | 7 8
Outline the differences between the
two stem and leaf plots.
Outline the differences between the two
stem and leaf plots.
(money spent on leisure each fortnight)
Outline the differences between the
two stem and leaf plots.
Outline the differences between the
two stem and leaf plots.
Questions to do from the book
Achieve
Gamma
EAS
Merit
????
????
Excellence
Web resources:
http://www.wisc-online.com/Objects/ViewObject.aspx?ID=tmh1101 (individual
interactive notes )
http://www.learner.org/courses/learningmath/data/session3/part_a/making.html
(individual interactive stem and leaf plot at bottom of page but not very exciting )
SLO
Line graph
Copy into
Your notes
Line graphs
Line graphs are most often used to show trends over time.
For example, this line graph shows the temperature in London, in
ºC, over a 12-hour period.
(Line graph: Temperature in London. Horizontal axis: Time, 6 am to 6 pm; vertical axis: Temperature (ºC), 0 to 20.)
Line graphs
This line graph compares the percentage of boys and girls gaining
Merit passes at maths in a particular school.
(Line graph: Percentage of boys and girls gaining A* to C passes at GCSE, 1998 to 2004, with one line for Girls and one for Boys; vertical axis 0 to 70.)
What trends are shown by this graph?
Regular smoking in Years 7 to 10
(Line graph: Percentage of regular smokers by Year, 1982 to 2002, with one line for Boys and one for Girls; vertical axis 0 to 16.)
Explain what this graph shows
Comparing smoking for Years 7 to 10 and Year 11
(Line graph: Percentage of regular smokers by Year, 1982 to 2002, with one line for Years 7 to 10 and one for Year 11; vertical axis 0 to 35.)
Explain what this graph shows
Questions to do from the book
Achieve
Gamma
EAS
Merit
P30 Q57 – 60
Excellence
SLO
Time series graphs
Copy into
Your notes
Time Series Components
Trend
Cyclical
Seasonal
Irregular
Copy into
Your notes
Trend (long-term)
Persistent, overall upward or downward pattern
Sales
Time
Copy into
Your notes
Cyclical
Repeating up & down movements
Sales
Time
Copy into
Your notes
Seasonal
• Regular pattern of up & down fluctuations
• Due to weather, customs etc.
• Occurs within one year
Summer
Response
Mo., Qtr.
Copy into
Your notes
Irregular
• Erratic
• Due to random variation or unforeseen events
• Short duration & nonrepeating
Your Turn:
Describe what the following graph shows
Your Turn:
Describe what the following graph shows
Your Turn:
Describe what the following graph shows
Your Turn:
Describe what the following graph shows
Your Turn:
Describe what the following graph shows
Your Turn:
Describe what the following graph shows
Questions to do from the book
Achieve
Gamma
EAS
Merit
P296 EX 21.03 Q1 – 5
P299 EX 21.04 Q1, 2
P53 Q83
Excellence
Scatter Plots
Scatter plots are most often used to show correlations or relationships
among data.
(Scatter plot: How Study Time Affects Grades. Horizontal axis: Time in hours, 0 to 5; vertical axis: Overall grade, 0 to 120.)
http://www.youtube.com/watch?v=soi-1wXQLoM&feature=channel&list=UL
(video: explanation of scatter plots)
Scatter graphs
What does this scatter graph show about the relationship
between the height and weight of twenty Year 10 boys?
(Scatter graph: Weight (kg), 40 to 60, against Height (cm), 140 to 190.)
As height increases, weight increases.
Scatter graphs
What does this scatter graph show?
(Scatter graph: Life expectancy, 50 to 85, against Number of cigarettes smoked in a week, 0 to 120.)
It shows that life expectancy decreases as the number of
cigarettes smoked increases.
This appears easy but students should realise what a
scatter graph is showing in terms of the variables.
Copy into
Your notes
The line of best fit
The line of best fit is drawn by eye so that there are roughly an equal
number of points below and above the line.
It should not be used for predictions outside the range of data used .
The steeper the line of best fit the faster the change in one of the
variables.
(Four scatter graphs, each on axes from 0 to 25, labelled Strong positive correlation, Weak positive correlation, Strong negative correlation and Weak negative correlation.)
The stronger the correlation, the closer the points are to the line and
the more confident we can be when predicting patterns.
http://staff.argyll.epsb.ca/jreed/math9/strand4/scatterPlot.htm
(interactive line of best fit, use second graph down, ok for teaching)
Your Turn:
Identifying correlation from scatter graphs
Decide whether each of the following graphs shows strong positive correlation, strong negative correlation, zero correlation, weak positive correlation or weak negative correlation.
(Six scatter graphs, A to F.)
Answer labels on the slide: weak –ve, no correlation, strong +ve, weak +ve, strong –ve, weak –ve
Your Turn:
Life expectancy and the number of cigarettes smoked in a week
(Scatter graph: Life expectancy, 50 to 85, against Number of cigarettes smoked in a week, 0 to 100.)
1) Estimate the life expectancy for someone who smokes 10 cigarettes a
week.
71
2) Why would an estimate of the life expectancy for someone who
smoked 40 cigarettes a week not be reliable?
Weak correlation, 6 very different values at this point.
3) Can you explain why there are so many outliers for this data?
So many other factors that affect life expectancy.
Questions to do from the book
Achieve
Gamma
EAS
Merit
P289 EX21.02 Q1 – 7
P35 Q63 – 66
Excellence
SLO
Dot plots
SLO:
To draw a dot plot
http://www.youtube.com/watch?v=_zurDAF1Fw4
(You tube video explanation of dot plots)
Drawing Dot Plots
Step 1: Draw a number line from smallest to largest value
Step 2: Transfer each number to a dot on the number line
12, 12, 13, 15, 15, 21, 23, 29, 32, 32, 40, 40, 41, 41, 51, 54, 55, 55, 55, 57
10
20
30
40
50
60
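The two steps can be sketched as a text dot plot in Python. Using `*` for each dot and printing one row per value (rather than stacking dots above a number line) is a simplification made for this sketch.

```python
from collections import Counter

data = [12, 12, 13, 15, 15, 21, 23, 29, 32, 32,
        40, 40, 41, 41, 51, 54, 55, 55, 55, 57]

dots = Counter(data)   # how many dots sit on each value

# Step 1: run from the smallest to the largest value
# Step 2: draw one dot for every time the value appears
for value in range(min(data), max(data) + 1):
    if dots[value]:
        print(f"{value:>3} {'*' * dots[value]}")

most_common_value, height = dots.most_common(1)[0]
print("tallest column:", most_common_value, "with", height, "dots")
```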
SLO
To read values from dot plots
Copy into
Your notes
Reading Dot Plots
The highest column of dots shows which value was most common, i.e. 28 raisins.
The left hand dot shows the smallest value, i.e. 25 raisins.
The column on the right shows the largest value, i.e. 31 raisins.
Your Turn:
Number of people living in a house
1) What is the most common number of
people living in a house? 4
2) How many people in the house with the most people? 6
3) How many people in the house with the least number
of people? 2
Your Turn:
Age of people at a party
1) How old was the youngest person at the party? 7
2) What was the most common age of the person at the party? 18
3) How old was the oldest person at the party? 72
Your Turn:
Lengths of Male Bears
1) What was the most common length of a male bear? 64
2) How long was the longest male bear? 83
3) How long was the shortest male bear? 37
SLO
Comparing dot plots
Outline the main differences between
these two dot plots
(Length of 600m reel of wire)
Outline the main differences between
these two dot plots
(Florida and New York sales of laptops
by individual stores last year)
Outline the main differences between
these two dot plots (heights of girls)
Outline the main differences between
these two dot plots
Questions to do from the book
No good dot plot questions to do from the books, but now is a good time to do a mix
of all graph questions from below.
Achieve
Gamma
EAS
EAS
Merit
Excellence
P268 EX 20.02 Q1 – 4
P273 EX 20.03 Q1 – 10
P278 EX 20.4 Q1 – 3
P27 – 44 Q53 – 73
P45 – 56 Q74 – 82, 84