Analyzing the Effects of Outliers on Mean and Median

Lesson Plan
Subject Area: Math
Grade Levels: Grades 4–12 (ages 9–18)
Time: At least one 50-minute class period; time outside of class as necessary
Lesson Objectives:
Students will:
• Develop an understanding of the concepts of mean and median, and how they can be used to
describe a set of data.
• Develop an understanding of how outliers can impact central tendency, and the importance of
that concept in real-world situations.
• Build data literacy skills by using statistics and dynamic, visual plots to analyze data, interpret
results, and draw conclusions.
• Explain their findings with writing and visual slide shows.
Standards:
Common Core State Standards [1]:
Common Core State Standards for Mathematics:
Mathematical Practices
• Reason abstractly and quantitatively.
• Use appropriate tools strategically.
Interpreting Categorical and Quantitative Data
• Summarize, represent, and interpret data on a single count or measurement variable.
Measurement and Data
• Represent and interpret data.
Statistics and Probability
• Summarize and describe distributions.
College and Career Readiness Anchor Standards for Writing:
Standard 6. Use technology, including the Internet, to produce and publish writing and to
interact and collaborate with others.
© 2011 Inspiration Software, Inc. You may use and modify this lesson plan for any non-commercial, instructional use.
Overview:
Students are often asked to calculate the values of the three measures of center of a set of data:
mean, median, and mode. While completing these calculations can be straightforward, deciding which
measure to use as a description of the "typical" value of a set of data is not. Students will use the
InspireData Alligators database to explore the effect of an outlier on the values of the mean and
median. Then they will apply what they learn to another real-world situation. Which is the best
measure of the "center" of a data set containing an outlier? Throughout the process, students will gain
a deeper understanding of the pros and cons of using mean and median to express the central
tendency of a data set. They will explain their findings and analyses in annotated slide shows.
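The effect described above can be previewed outside InspireData with a few lines of Python. This is a minimal sketch, assuming a small made-up list of weights; the numbers are hypothetical and are not taken from the Alligators database, and the helper name summarize is an illustration only.

```python
from statistics import mean, median

# Hypothetical weights in kg, invented for illustration.
# These are NOT the values from the InspireData Alligators database.
weights_kg = [13, 15, 17, 18, 20, 23, 25, 290]


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)


# Summary with every value included, then with the single largest value removed.
with_outlier = summarize(weights_kg)
without_outlier = summarize([w for w in weights_kg if w != max(weights_kg)])

print("With outlier:    mean = %.1f, median = %.1f" % with_outlier)
print("Without outlier: mean = %.1f, median = %.1f" % without_outlier)
```

With these made-up values, removing the one very heavy animal moves the mean from about 52.6 to about 18.7, while the median only moves from 19 to 18. That contrast is the pattern students are asked to notice in the plots.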
Preparation:
• This lesson requires the InspireData® software application published by Inspiration Software,
Inc. You can download a 30-day trial at http://www.inspiration.com/InspireData.
Lesson:
1. Begin the lesson with a quick discussion of how to calculate the values of mean, median, and
mode, and how to interpret their meaning as measures of the center of a set of data. Ask
students to record guesses for the mean and median ages of students in the room. Have
students share their answers and discuss why many of them wrote down the same age for
both mean and median. What is it about calculating those values that would make them very
close? Now ask students to guess how the inclusion of your age would impact the mean and
median age of people in the room (up, down, or about the same). Did that change either one
or both of the values? Which one(s) and why?
2. Open the Alligators and Outliers database: InspireData Starter>Databases>Mathematics>Alligators.
3. Discuss the contents of the table,
including the inclusion of lengths of
individual alligators and weights in both
pounds and kilograms.
4. Demonstrate how to switch to Plot View
and create a Stack plot of the alligator
weights by clicking Stack plot, then clicking X Axis to choose either Weight (kg) or
Weight (lbs). Ask the class to describe the distribution of the alligator weights shown in the
stack plot. Do the alligators tend to be light (less than 60 kg, for instance) or heavy (more
than 60 kg, for instance)? Do one or more alligators stand apart from the rest?
Optional: Click X Axis to select the field name, then change the Step so that alligators are
grouped in different intervals. How does this change the distribution or observations?
Select Reset to return to default settings.
5. Add the mean alligator weight and the median alligator weight to the Stack plot. (Select
Options and then choose Show Mean and Show Median). Discuss their meaning in the
context of the distribution of alligator weights. Why is the mean higher than the median?
6. Exclude the heaviest alligator from the Stack plot. (Click its icon and choose the Plot
menu>Exclude Selected Icons.) This also excludes the heaviest alligator from the mean and
median calculations. Elicit student observations about these new mean and median values
and then exclude the next heaviest alligator. How does an outlier in a data set affect the value
of the mean? How does it affect the value of the median?
7. Click the Back button at the far left of the Toolbar to return to the original stack plot
containing data for all of the alligators. Lead a discussion of which measure of center—mean
or median—best describes the average weight of an alligator in the dataset.
8. Students can find the mode by changing the Stack plot axis type to Category. (Click
Range>Category to create a subdivision for each unique value.) Note that the alligator data
set is bimodal. Is the mode generally a good measure of center? Why or why not?
9. Ask the class what real-world situations they can think of where the outlier should be excluded.
For example, Honolulu, Hawaii, and Juneau, Alaska, could be excluded from an evaluation of
distances between state capitals and Washington, D.C.
10. Demonstrate for students how to click the Ages + Outliers tab for an example of another
dataset with outliers. Read the table notes and explain the data in the table.
11. To review, ask for a volunteer to come forward and demonstrate how to plot the ages and
calculate the mean and median.
12. Demonstrate how to record observations and analyses of the plot with the Notes area and
create a slide by using the Capture Slide button in the Slide Sorter.
13. Click the Database Template tab and
explain that students will be analyzing data
on the ages of class members, plus your
age (if you are willing) and/or a famous
person’s age. Enter the data directly into the
table, or use the Survey or e-Survey tools
to collect it. For more information, refer to
Help> Documentation>Handouts>Learn to Use Surveys.
14. Divide students into as many groups as there are computers available, and have each group
access the database containing information for the entire class. Direct students to create a
stack plot for Age in Months and find the mean, median, and mode age for the class. Have
students record all three measures and any other observations in the Notes area before
capturing a slide.
15. Lead a class discussion about the effect the outlier(s) had on the Stack plot and the values of
the mean, median, and mode. Which is the best measure of the center of a data set containing
an outlier?
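The exclusion steps in the lesson (removing the largest value, then the next largest) and the search for the mode can also be sketched in Python for teachers who want to check results away from the software. This is only an illustration: the ages below are invented, the helper exclude_largest is a hypothetical name, and statistics.multimode requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Hypothetical ages in months for a small class plus one adult (the outlier).
# These values are invented for illustration only.
ages_months = [121, 122, 122, 123, 124, 124, 125, 126, 480]


def exclude_largest(values, count):
    """Return a sorted copy of values with the `count` largest entries removed."""
    return sorted(values)[: len(values) - count]


# Recompute the measures of center after removing 0, 1, and 2 of the largest values.
for removed in range(3):
    remaining = exclude_largest(ages_months, removed)
    print(
        "removed:", removed,
        "mean:", round(mean(remaining), 1),
        "median:", median(remaining),
        "modes:", multimode(remaining),
    )
```

In this made-up class, the mean falls sharply once the adult's age is removed, the median barely changes, and the list of modes (two values here, so the data are bimodal) does not change at all.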
Adaptations/Extensions:
• For younger students, do the entire lesson as a whole class activity.
• For older students or those experienced with InspireData, less explanation may be necessary
than that presented in the lesson.
• Have students choose another area of interest that they think might involve outliers, such as
sports statistics, housing prices, or salaries, and conduct research to collect the data. They
can use InspireData to determine the mean and median, and then use those values to
determine the best measure to describe the set of data.
• Ask students to calculate the mean and median before confirming the values with the Show
Mean and Show Median features.
• Pass out the "Learn to Use Stack Plots" and "Learn to Use Plots" handouts from InspireData
for student reference (Help>Documentation>Handouts).
• Students can enhance their plots by adding other InspireData features and computations. You
can also encourage them to create different plot types, such as Axis plots. Direct students to
the other InspireData handouts for help with different plot types and product features
(Help>Documentation>Handouts).
• If students have experience with describing the distribution of data (uniform, normal, skewed
left, skewed right), have them use the terms to describe the distribution of data in the plots.
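For the adaptation in which students calculate the mean and median themselves before confirming with Show Mean and Show Median, the hand procedure can be written out step by step. The sketch below is one possible way to express it in Python; the function names and the house prices are hypothetical and chosen only for illustration.

```python
def mean_by_hand(values):
    """Add up the values and divide by how many there are."""
    total = 0
    for v in values:
        total += v
    return total / len(values)


def median_by_hand(values):
    """Sort the values, then take the middle one (or average the middle two)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[middle]
    # Even count: average the two values on either side of the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical house prices in thousands of dollars, invented for illustration.
prices = [180, 195, 210, 220, 1500]
print("mean:", mean_by_hand(prices), "median:", median_by_hand(prices))
```

For these invented prices the mean is 461 and the median is 210, so a single expensive house pulls the mean well above what most of the houses cost.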
[1] © Copyright 2010. Common Core State Standards. National Governors Association Center for Best Practices
and Council of Chief State School Officers. All rights reserved. Learn more online at
http://www.corestandards.org.