This Week on Prancing with the B

Analyze Data
The most commonly used statistic is the average, which describes where the middle of the data lies. There are three
ways to measure the average: the mean, median, and mode.
Why three ways? Good question. Each will give you a different way of looking at the numbers; depending on
the question you're trying to answer (or the argument you're trying to make), any of the three could prove the
most useful.
The mean is the most commonly used measure of the average. (In fact, in everyday language, people
often use the word "average" simply to mean, um, "mean.") Finding the mean is simple: just add up all the
numbers in a data set and divide by the number of data entries.
The median is the middle number in a data set. However, the data must be in numerical order (least to greatest
or greatest to least) before finding this average. If the data set has an even number of entries, the middle lies
between two numbers, so find the mean of those two numbers (add them together and divide by 2).
The mode is probably the least common way of finding the average, and in most cases is the least useful. To
find the mode, just look for the number that occurs the most. There can be more than one mode, or none at all.
Finally there is the range. The range is NOT a measure of the average; however, it is often taught along with
averages because it's another helpful way to measure a set of data. The range measures the "spread" of the data,
how far apart the smallest and largest values are. To find the range, subtract the smallest value in the data set
from the largest.
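The four definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample numbers are made up for this example, and returning an empty list for "no mode" is an assumption that follows the rule that a data set with no repeated value has no mode.

```python
from collections import Counter


def mean(values):
    """Add up all the numbers and divide by the number of entries."""
    return sum(values) / len(values)


def median(values):
    """Middle number of the ordered data (mean of the two middle numbers if the count is even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """Every value tied for the most occurrences; an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no number occurs more than once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


def data_range(values):
    """Largest value minus smallest value (a measure of spread, not of the average)."""
    return max(values) - min(values)


sample = [3, 7, 7, 2, 9, 2]  # hypothetical data, not from the text
print(mean(sample))        # 5.0
print(median(sample))      # 5.0
print(modes(sample))       # [2, 7]
print(data_range(sample))  # 7
```

Notice that the sample has two modes, which the text says is allowed.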
This Week on Prancing with the B-List Celebrities
Here are the contestants' scores on this week's episode of our favorite show, Prancing with the B-List
Celebrities:
Evan L 52
Nicole S 50
Pamela A 47
Chad O 44
Erin A 39
Jake P 38
Niecy N 36
Kate G 32
Now, let's find the three averages and the range for the contestants' scores.
Stat | How to Find | Explanation
Mean | (52 + 50 + 47 + 44 + 39 + 38 + 36 + 32) ÷ 8 = 42.25 | Add all the scores and divide by 8, the number of contestants. The mean is 42.25.
Median | (44 + 39) ÷ 2 = 41.5 | First put the scores in order, then find the middle value. In this set, the middle value lies between 44 and 39, so add these middle numbers together and divide by 2. The median is 41.5.
Mode | No mode | No score occurs more than once, so there is no mode for this data set.
Range | 52 - 32 = 20 | Subtract the smallest score from the largest. The range is 20 points.
For this data set, there are really only two measures of the average, since there is no mode. Both the mean and
median could be used to describe the average. If you were Evan, would you rather call out the mean or the
median? What if you were Kate?
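If you want to check these numbers yourself, here is a small sketch using Python's standard `statistics` module. The variable names are just for illustration. It counts repeats directly instead of calling `statistics.multimode`, because that function would list every score when nothing repeats.

```python
import statistics
from collections import Counter

# This week's scores, copied from the list of contestants.
scores = {
    "Evan L": 52, "Nicole S": 50, "Pamela A": 47, "Chad O": 44,
    "Erin A": 39, "Jake P": 38, "Niecy N": 36, "Kate G": 32,
}
values = list(scores.values())

mean_score = statistics.mean(values)          # 338 divided by 8
median_score = statistics.median(values)      # mean of the two middle scores
has_mode = max(Counter(values).values()) > 1  # True only if some score repeats
score_range = max(values) - min(values)

print("mean:", mean_score)      # mean: 42.25
print("median:", median_score)  # median: 41.5
print("has a mode:", has_mode)  # has a mode: False
print("range:", score_range)    # range: 20

# How far Evan and Kate sit from each average (positive means above it).
for name in ("Evan L", "Kate G"):
    print(name, scores[name] - mean_score, scores[name] - median_score)
# Evan L 9.75 10.5
# Kate G -10.25 -9.5
```

The last two lines of output give you something concrete to weigh when you answer the Evan and Kate questions.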
Look Out: the range of a data set does NOT measure the average of the data.
Box-and-whisker plots are a handy way to display data broken into four groups by its quartiles, each group with an
equal number of data values. The box-and-whisker plot doesn't show frequency, and it doesn't display each individual
data value, but it clearly shows where the middle of the data lies. It's a nice plot to use when analyzing how your
data is skewed.
There are a few important vocabulary terms to know in order to graph a box-and-whisker plot. Here they are:
Q1 – quartile 1, the median of the lower half of the data set
Q2 – quartile 2, the median of the entire data set
Q3 – quartile 3, the median of the upper half of the data set
IQR – interquartile range, the difference between Q3 and Q1 (Q3 minus Q1)
Extreme Values – the smallest and largest values in a data set
Let's start by making a box-and-whisker plot (also known as a "box plot") of the geometry test scores we saw
earlier:
90, 94, 53, 68, 79, 84, 87, 72, 70, 69, 65, 89, 85, 83, 72
Step 1: Order the data from least to greatest.
53, 65, 68, 69, 70, 72, 72, 79, 83, 84, 85, 87, 89, 90, 94
Step 2: Find the median of the data.
This is also called quartile 2 (Q2). With 15 scores, the median is the 8th value: Q2 = 79.
Step 3: Find the median of the data less than Q2.
This is the lower quartile (Q1). The seven scores below Q2 are 53, 65, 68, 69, 70, 72, 72, so Q1 = 69.
Step 4: Find the median of the data greater than Q2.
This is the upper quartile (Q3). The seven scores above Q2 are 83, 84, 85, 87, 89, 90, 94, so Q3 = 87.
Step 5: Find the extreme values: these are the largest and smallest data values.
Extreme values = 53 and 94.
Step 6: Create a number line that will contain all of the data values.
It should stretch a little beyond each extreme value.
Step 7: Draw a box from Q1 to Q3 with a line dividing the box at Q2. Then extend "whiskers" from each
end of the box to the extreme values.
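Steps 1 through 5 can be sketched in Python as follows. The function names are invented for this example, and the sketch assumes the "median of each half" method described in the steps, leaving the middle value out of both halves when the number of data values is odd. Other quartile conventions exist and can give slightly different values.

```python
def median(ordered):
    """Median of a list that is already in order."""
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Extreme values plus Q1, Q2, and Q3, using the median-of-halves method."""
    ordered = sorted(values)                   # Step 1
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                     # data below the middle position
    upper = ordered[half + (n % 2):]           # data above the middle position
    return {
        "min": ordered[0],                     # Step 5
        "Q1": median(lower),                   # Step 3
        "Q2": median(ordered),                 # Step 2
        "Q3": median(upper),                   # Step 4
        "max": ordered[-1],                    # Step 5
    }


geometry_scores = [90, 94, 53, 68, 79, 84, 87, 72, 70, 69, 65, 89, 85, 83, 72]
summary = five_number_summary(geometry_scores)
print(summary)  # {'min': 53, 'Q1': 69, 'Q2': 79, 'Q3': 87, 'max': 94}
print("IQR:", summary["Q3"] - summary["Q1"])  # IQR: 18
```

These five numbers are everything you need to draw the box and the whiskers in Steps 6 and 7.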
This plot is broken into four different groups: the lower whisker, the lower half of the box, the upper half of the
box, and the upper whisker. Since there is an equal number of data values in each group, each of those sections
represents 25% of the data.
Using this plot we can see that 50% of the students scored between 69 and 87 points, 75% of the students scored
lower than 87 points, and 50% scored above 79. If your score was in the upper whisker, you could feel pretty
proud that you scored better than 75% of your classmates. If you scored somewhere in the lower whisker, you
may want to find a little more time to study.
Outliers
Outliers are values that are much bigger or smaller than the rest of the data. These are represented by a dot at
either end of the plot. Our geometry test example did not have any outliers. Even though the score of 53 seemed
much smaller than the rest, it wasn't small enough.
In order to be an outlier, the data value must be:
larger than Q3 by at least 1.5 times the interquartile range (IQR), or
smaller than Q1 by at least 1.5 times the IQR.
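Here is a minimal sketch of that rule in Python, using Q1 = 69 and Q3 = 87 from the geometry test box plot. The function name `find_outliers` and the extra score of 40 are hypothetical, added only to show what an outlier would look like. The sketch follows the "at least 1.5 times the IQR" wording above, so a value sitting exactly on a cutoff counts as an outlier.

```python
def find_outliers(values, q1, q3):
    """Return the values at least 1.5 * IQR below Q1 or above Q3."""
    iqr = q3 - q1
    low_cutoff = q1 - 1.5 * iqr
    high_cutoff = q3 + 1.5 * iqr
    return [v for v in values if v <= low_cutoff or v >= high_cutoff]


geometry_scores = [90, 94, 53, 68, 79, 84, 87, 72, 70, 69, 65, 89, 85, 83, 72]
q1, q3 = 69, 87  # quartiles of the geometry test scores

print(q1 - 1.5 * (q3 - q1))                 # 42.0, the low cutoff
print(q3 + 1.5 * (q3 - q1))                 # 114.0, the high cutoff
print(find_outliers(geometry_scores, q1, q3))         # []
print(find_outliers(geometry_scores + [40], q1, q3))  # [40]
```

The score of 53 is above the low cutoff of 42, which is why it doesn't count as an outlier.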