Day 57 Practice - High School Math Teachers

Day 57 Practice
Name _____________________________________________
EXAMINING THE EFFECT OF OUTLIERS
Use the data below to answer questions 1 – 6.
In this worksheet you will be investigating how an outlier affects the mean and median of a set
of data. By the end of the lesson you will be able to explain which measure of central tendency
most accurately represents a set of data with an outlier.
DATA SET 1: Rushing Yards Gained by San Diego Chargers Football Players
The table below shows the rushing yards gained by San Diego Chargers football players during
the 2006 season.
Player                  Rushing Yards
LaDainian Tomlinson     1815
Michael Turner          502
Lorenzo Neal            140
Philip Rivers           49
Andrew Pinnock          25
Erick Parker            19
Vincent Jackson         16
Charlie Whitehurst      13
Keenan McCardell        8
Brandon Manumaleuna     1
Billy Volek             -3
Mike Scifres            -7
1. a) Which player is an outlier in the data?
b) How many rushing yards did he have?
Algebra1Teachers @ 2015
2. Calculate the mean and median for the rushing yards, but DO NOT include the outlier in your
calculations.
3. Now, recalculate the mean and median for the rushing yards, but this time INCLUDE the
outlier in your calculations.
SUPPORTING QUESTIONS:
Answer all supporting questions in complete sentences and justify your answers by referring
back to your calculations.
4. Look at your calculations for the mean and median when you DID NOT include the outlier.
a) How many players had a rushing total that was less than the mean?
b) How many players had a rushing total that was greater than the mean?
c) How many players had a rushing total that was less than the median?
d) How many players had a rushing total that was greater than the median?
5. Look at your calculations for the mean and median when you DID include the outlier.
a) How many players had a rushing total that was less than the mean?
b) How many players had a rushing total that was greater than the mean?
c) How many players had a rushing total that was less than the median?
d) How many players had a rushing total that was greater than the median?
6. Look at your answers for questions #4 and #5. If you wanted to accurately represent
the number of yards that a TYPICAL San Diego Charger gained rushing, should you use the mean
or the median to report the data? Justify your answer with supporting details.
Use the following information to answer questions 7 - 13
DATA SET 2: Populations of the 10 Largest Cities in Maryland
The table below shows the populations of the 10 largest cities in Maryland.
City                Population
Baltimore           651,154
Columbia            88,254
Silver Spring       76,540
Dundalk             62,306
Wheaton-Glenmont    57,694
Ellicott City       56,397
Germantown          55,419
Bethesda            55,277
Frederick           52,816
Gaithersburg        52,455
7. a) Which city is an outlier in the data?
b) What is the population of that city?
8. Calculate the mean and median for the populations, but DO NOT include the outlier in your
calculations. Show your work below.
9. Now, recalculate the mean and median for the populations, but this time INCLUDE the outlier
in your calculations.
10. Finally, calculate how the outlier affected your mean and median. Calculate the difference
between the second calculations and the first calculations.
Mean (Mean with outlier – Mean without outlier)
Median (Median with outlier – Median without outlier)
SUPPORTING QUESTIONS:
Answer all supporting questions in complete sentences and justify your answers by referring
back to your calculations.
11. Look at your calculations for the difference between the two mean populations. Did the
outlier have a significant effect on the value of the mean population? If so, what was the
effect?
12. Look at your calculations for the difference between the two median populations. Did the
outlier have a significant effect on the value of the median population? If so, what was the
effect?
13. Look at your answers for questions #11 and #12. Summarize how an outlier affects the mean
and median of a set of data.
Use the following information to answer questions 14 - 20
DATA SET 3: Gross Domestic Product (GDP) of the 12 wealthiest countries
• Record the name of each country and the GDP.
• Report the GDP in billions. For example (United States), $11,667,515,000,000.00 would
be 11,667 billion dollars. For another example (Spain), $991,442,000,000.00 would be 991
billion dollars.
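The conversion in the two examples above can be sketched in Python. This is an illustration only and not part of the original worksheet: the helper name `to_billions` is hypothetical, and it assumes the worksheet drops the remainder instead of rounding, since $11,667,515,000,000 is reported as 11,667 billion and not 11,668 billion.

```python
def to_billions(dollars: int) -> int:
    """Convert a whole-dollar amount to whole billions, dropping the remainder.

    Assumption: the worksheet truncates rather than rounds.
    """
    return dollars // 1_000_000_000


us_example = to_billions(11_667_515_000_000)   # 11667
spain_example = to_billions(991_442_000_000)   # 991

print(f"{us_example:,} billion dollars")
print(f"{spain_example:,} billion dollars")
```

If rounding to the nearest billion is preferred instead, `round(dollars / 1_000_000_000)` would give 11,668 for the first example, which is why the truncation assumption is stated explicitly.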
Country           GDP (in billions of dollars)
United States     17,947
China             10,982
Japan             4,123
Germany           3,357
United Kingdom    2,849
France            2,421
India             2,090
Italy             1,815
Brazil            1,772
Canada            1,552
South Korea       1,376
Russia            1,324
14. a) Which country is an outlier in the data?
b) What is the GDP of that country?
15. Calculate the mean and median for the GDP, but DO NOT include the outlier in your
calculations.
16. Now, recalculate the mean and median for the GDP, but this time INCLUDE the outlier in
your calculations.
SUPPORTING QUESTIONS:
Answer all supporting questions in complete sentences and justify your answers by referring
back to your calculations.
17. Look at your calculations for the mean and median when you DID NOT include the outlier.
a) How many countries had a GDP less than the mean GDP?
b) How many countries had a GDP greater than the mean GDP?
c) How many countries had a GDP less than the median GDP?
d) How many countries had a GDP greater than the median GDP?
18. I. Look at your calculations for the mean and median when you DID include the outlier.
a) How many countries had a GDP less than the mean GDP?
b) How many countries had a GDP greater than the mean GDP?
c) How many countries had a GDP less than the median GDP?
d) How many countries had a GDP greater than the median GDP?
II. Look at your answers for questions #17 and #18 (part I). When the GDP of the United States is
included in the calculations, which measure of central tendency (mean or median) most
accurately represents the GDP of a TYPICAL country in the top twelve?
CONCLUDING QUESTIONS:
Now that you have examined three sets of data you are ready to make some general
conclusions. Answer each question in complete sentences and justify your answer by referring
back to calculations you made with the data sets.
19. When there is an outlier in a data set, how is the value of the mean affected? How is the
value of the median affected? Does the outlier have a greater effect on the mean or the
median? Remember to justify your answer with examples from your calculations.
20. You want to accurately represent a typical number in a data set. If there is an outlier in the
data, which measure of central tendency (mean or median) should you use to represent the
data?
BONUS: In all our data sets the outlier was significantly higher than the rest of the data points.
An outlier can also be a data point that is significantly lower than the rest of the data. How do
you think that an outlier that is lower than the rest of the data will affect the mean? How will it
affect the median?
Answer Key
EXAMINING THE EFFECT OF OUTLIERS
1. a) LaDainian Tomlinson
b) 1815 rushing yards
2. \[ \text{Mean} = \frac{502+140+49+25+19+16+13+8+1+(-3)+(-7)}{11} = \frac{763}{11} \approx 69.36 \]
\[ \text{Median} = 16 \]
3. \[ \text{Mean} = \frac{1815+502+140+49+25+19+16+13+8+1+(-3)+(-7)}{12} = \frac{2578}{12} \approx 214.83 \]
\[ \text{Median} = \frac{16+19}{2} = 17.5 \]
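The calculations in answers 2 and 3 can be checked with a short Python sketch. This is one possible way to verify the arithmetic, not part of the original key; it uses the standard `statistics` module, and the variable names are chosen here for illustration.

```python
from statistics import mean, median

# Rushing yards from Data Set 1, listed largest first.
rushing_yards = [1815, 502, 140, 49, 25, 19, 16, 13, 8, 1, -3, -7]

# The outlier is the largest value (LaDainian Tomlinson's 1815 yards).
outlier = max(rushing_yards)
without_outlier = [y for y in rushing_yards if y != outlier]

mean_without = round(mean(without_outlier), 2)   # 763 / 11, about 69.36
median_without = median(without_outlier)         # 16
mean_with = round(mean(rushing_yards), 2)        # 2578 / 12, about 214.83
median_with = median(rushing_yards)              # 17.5

print("Without outlier:", mean_without, median_without)
print("With outlier:   ", mean_with, median_with)
```

Adding one player raises the mean by more than 145 yards, while the median moves by only 1.5 yards.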
4. a) 9 players
b) 2 players
c) 5 players
d) 5 players
5. a) 10 players
b) 2 players
c) 6 players
d) 6 players
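The counts in answers 4 and 5 can be reproduced with a small helper. The sketch below is an illustration only; the function name `count_below_above` is hypothetical and does not come from the worksheet.

```python
from statistics import mean, median


def count_below_above(values, center):
    """Return (how many values are below center, how many are above it)."""
    below = sum(1 for v in values if v < center)
    above = sum(1 for v in values if v > center)
    return below, above


without_outlier = [502, 140, 49, 25, 19, 16, 13, 8, 1, -3, -7]
with_outlier = [1815] + without_outlier

# Question 4: outlier excluded.
q4_mean = count_below_above(without_outlier, mean(without_outlier))      # (9, 2)
q4_median = count_below_above(without_outlier, median(without_outlier))  # (5, 5)

# Question 5: outlier included.
q5_mean = count_below_above(with_outlier, mean(with_outlier))            # (10, 2)
q5_median = count_below_above(with_outlier, median(with_outlier))        # (6, 6)

print(q4_mean, q4_median, q5_mean, q5_median)
```

In both cases the median splits the players evenly, while the mean leaves most players below it.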
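6. If you want to accurately represent the number of yards that a TYPICAL San Diego
Charger gained rushing, you should use the median to report the data. The outlier pulls the
mean up to 214.83, a total that only 2 of the 12 players exceeded, while the median of 17.5
has 6 players below it and 6 players above it.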
7. a) Baltimore
b) 651,154
8. \[ \text{Mean} = \frac{88,254 + 76,540 + 62,306 + 57,694 + 56,397 + 55,419 + 55,277 + 52,816 + 52,455}{9} = \frac{557,158}{9} \approx 61,906 \]
\[ \text{Median} = 56,397 \]
9. \[ \text{Mean} = \frac{651,154 + 88,254 + 76,540 + 62,306 + 57,694 + 56,397 + 55,419 + 55,277 + 52,816 + 52,455}{10} = \frac{1,208,312}{10} = 120,831.2 \]
\[ \text{Median} = \frac{57,694 + 56,397}{2} = 57,045.5 \]
10. Mean (Mean with outlier – Mean without outlier)
Difference between mean populations = 120,831.2 – 61,906 = 58,925.2
Median (Median with outlier – Median without outlier)
Difference between median populations = 57,045.5 – 56,397 = 648.5
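The differences in answer 10 can be checked with the Python sketch below. It is an illustration, not part of the original key. It assumes, as the key appears to do, that the mean without the outlier is rounded to the nearest whole number (61,906) before subtracting; with the unrounded mean the difference would be about 58,924.8.

```python
from statistics import mean, median

# Populations from Data Set 2, largest first.
populations = [651_154, 88_254, 76_540, 62_306, 57_694,
               56_397, 55_419, 55_277, 52_816, 52_455]
without_outlier = populations[1:]  # drop Baltimore, the first entry

mean_with = mean(populations)                 # 120831.2
mean_without = round(mean(without_outlier))   # 61906, rounded as in the key
median_with = median(populations)             # 57045.5
median_without = median(without_outlier)      # 56397

mean_difference = round(mean_with - mean_without, 1)   # 58925.2
median_difference = median_with - median_without       # 648.5

print("Mean difference:  ", mean_difference)
print("Median difference:", median_difference)
```

The mean shifts by roughly 90 times as much as the median, which is the contrast questions 11 and 12 ask about.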
11. Yes, the outlier had a significant effect on the value of the mean population. Including
Baltimore raised the mean by 58,925.2, from 61,906 to 120,831.2, so the mean no longer gives
a realistic picture of a typical city's population.
12. The outlier didn’t have a significant effect on the value of the median population. The
difference is very small.
13. With the outlier included, the mean changed significantly. The median changed only
slightly.
14. a) United States
b) $17,947 billion
15. \[ \text{Mean} = \frac{10,982 + 4,123 + 3,357 + 2,849 + 2,421 + 2,090 + 1,815 + 1,772 + 1,552 + 1,376 + 1,324}{11} = \frac{33,661}{11} \approx 3,060.1 \]
\[ \text{Median} = 2,090 \]
16. \[ \text{Mean} = \frac{17,947 + 10,982 + 4,123 + 3,357 + 2,849 + 2,421 + 2,090 + 1,815 + 1,772 + 1,552 + 1,376 + 1,324}{12} = \frac{51,608}{12} \approx 4,300.7 \]
\[ \text{Median} = \frac{2,421 + 2,090}{2} = 2,255.5 \]
17. a) 8 countries
b) 3 countries
c) 5 countries
d) 5 countries
18. I. a) 10 countries
b) 2 countries
c) 6 countries
d) 6 countries
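Answers 15 through 18 (part I) can be verified together with one summary function. The sketch below is an illustration only; the function name `summarize` and the dictionary keys are hypothetical and not taken from the worksheet. The counts are taken against the unrounded mean, and only the reported mean is rounded to one decimal place.

```python
from statistics import mean, median


def summarize(values):
    """Return the mean, the median, and how many values fall on each side of them."""
    m = mean(values)
    med = median(values)
    return {
        "mean": round(m, 1),
        "median": med,
        "below_mean": sum(v < m for v in values),
        "above_mean": sum(v > m for v in values),
        "below_median": sum(v < med for v in values),
        "above_median": sum(v > med for v in values),
    }


# GDP in billions of dollars from Data Set 3, largest first.
gdp = [17_947, 10_982, 4_123, 3_357, 2_849, 2_421,
       2_090, 1_815, 1_772, 1_552, 1_376, 1_324]

without_us = summarize(gdp[1:])  # drop the United States, the first entry
with_us = summarize(gdp)

print(without_us)
print(with_us)
```

With the United States included, 10 of the 12 countries fall below the mean, while the median still splits the list 6 and 6.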
II. When the GDP of the United States is included in the calculations, the median most
accurately represents the GDP of a typical country in the top twelve.
19. When a data set contains a high outlier, the value of the mean is noticeably greater than
it is without the outlier. For example, in Data Set 3 the mean is 4,300.7 with the outlier
and 3,060.1 without it.
The value of the median is also greater with the outlier, but only slightly. In Data Set 3
the median is 2,255.5 with the outlier and 2,090 without it.
The outlier has a greater effect on the mean.
20. If there is an outlier in the data and you want to accurately represent a typical number
in the data set, you should use the median to represent the data.
BONUS:
When a data set contains an outlier that is significantly lower than the rest of the
data, the value of the mean would be much lower than it is without the outlier.
The value of the median would also be lower, but only slightly.
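The bonus answer can be illustrated with a small made-up example. The scores below are hypothetical and are not one of the worksheet's data sets; they were chosen only to show a single low outlier (12) among otherwise similar values.

```python
from statistics import mean, median

# Hypothetical quiz scores: NOT from the worksheet, invented for illustration.
typical_scores = [85, 88, 90, 92, 95]
scores_with_low_outlier = [12] + typical_scores

mean_without = mean(typical_scores)              # 450 / 5 = 90
median_without = median(typical_scores)          # 90
mean_with = mean(scores_with_low_outlier)        # 462 / 6 = 77
median_with = median(scores_with_low_outlier)    # (88 + 90) / 2 = 89

mean_drop = mean_without - mean_with             # 13
median_drop = median_without - median_with       # 1

print("Mean drops by", mean_drop)
print("Median drops by", median_drop)
```

In this made-up case the low outlier pulls the mean down by 13 points but the median by only 1, mirroring what the high outliers did in the three data sets.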