Measures of Spread

The terms spread, dispersion, and variation all refer to a measure of the way a data set is distributed about some central value (i.e. they indicate how closely a set of data clusters around its center).

Why do we care about spread?

Range
The range of data is the difference between the maximum and the minimum value.

Box and Whisker Plot
A graphical representation of the quartiles of data.

Quartiles
Three values that divide a set of ordered data into four groups with equal numbers of data in each group.

Variance (coming up...)
Standard Deviation (stay tuned...)

Box and Whisker Plots
• a representation of the spread of the data using the medians of the data
• need to know:
  range: minimum value to maximum value
  quartiles:
    lower quartile (Q1): median of lower half of data
    median (Q2)
    upper quartile (Q3): median of upper half of data

Let's construct one!

Data = { 26, 26, 27, 28, 30, 32, 34, 35, 39, 42, 44, 45, 49, 51, 64 }, n = 15

1) Make a number line containing at least the range of your data.
   25 30 35 40 45 50 55 60 65 70
2) Determine Q1, Q2 and Q3 of the data.
   { 26, 26, 27, 28, 30, 32, 34, 35, 39, 42, 44, 45, 49, 51, 64 }
   lower quartile is the (1+7)/2 = 4th entry, 28
   median is the (1+15)/2 = 8th entry, 35
   upper quartile is the (9+15)/2 = 12th entry, 45
3) Mark Q1, Q2 and Q3 on the number line, and then complete the box.
4) Draw a line along the number line from the left side of the box to the minimum value, and one from the right side of the box to the maximum value.
5) Enjoy your completed box and whisker plot!

Analysing Box and Whisker Plots
• box and whisker plots provide information about the spread of the data when it is divided up into quarters
• consider....
[Figure: box and whisker plot on a number line from 25 to 70]
• What is the range of this data set?
• Determine the four intervals which each have an equal number of data points or 25% of the data.
• What is the interquartile range of this data set?
  The interquartile range (IQR) is the difference between the upper and lower quartiles. It covers the middle 50% of the data, so it is not affected by outliers.
[Figure: box and whisker plot on a number line from 25 to 70]
• Describe the difference in spread between the 1st quartile and the 4th quartile.
• Describe the difference in spread between the 2nd quartile and the 3rd quartile.
• Describe the difference in spread between the 1st quartile and the 3rd quartile.

Interquartile Range (IQR)
The interquartile range (IQR) is the difference between the third and the first quartiles (Q3 - Q1). The IQR contains the middle 50% of data (recall: the box in a box and whisker plot).
Note: the larger the interquartile range, the larger the spread of data.

Your turn...
A farmer keeps track of the number of corn cobs produced by 11 different stalks in two different fields. Find the quartile values and calculate the IQR to compare the two fields.

Field A: 25 23 33 28 27 30 25 30 27 29 31
Field B: 24 28 27 30 36 31 29 29 27 30 28

Field A produces a slightly greater spread in the number of corn cobs than Field B (IQR = 30 - 25 = 5 for Field A versus IQR = 30 - 27 = 3 for Field B). In other words, Field B is more consistent than Field A.

Variance
A measure of dispersion that is found by averaging the squares of the deviation (from the mean) of each piece of data (i.e. the mean of the squares of the deviations).

Standard Deviation
A statistic used to measure how much the data is spread out around the mean. The square root of the variance. The lowercase Greek letter sigma, σ, is the symbol for standard deviation.
Standard Deviation: The smaller the standard deviation, the more compact (less spread) the data.

Standard Deviation for Grouped Data
A student keeps track of the time one has to wait for an oil change at a Toyota garage. Find the standard deviation of the wait times.
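The variance and standard deviation definitions above can be tried on the corn cob data. This is a minimal sketch, assuming the population form (divide by n), which is how "the mean of the squares of the deviations" reads. The function names population_variance and population_std_dev are made up for this illustration.

```python
import math


def population_variance(values):
    """Mean of the squared deviations from the mean (assumes division by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_std_dev(values):
    """Square root of the population variance (sigma)."""
    return math.sqrt(population_variance(values))


# Corn cob counts for the two fields in the exercise.
field_a = [25, 23, 33, 28, 27, 30, 25, 30, 27, 29, 31]
field_b = [24, 28, 27, 30, 36, 31, 29, 29, 27, 30, 28]

for name, data in (("Field A", field_a), ("Field B", field_b)):
    variance = population_variance(data)
    sigma = population_std_dev(data)
    print(name, "variance:", round(variance, 2), "std dev:", round(sigma, 2))
```

With these numbers the sketch gives a variance of 8.0 (σ ≈ 2.83) for Field A and about 8.18 (σ ≈ 2.86) for Field B. The two are nearly equal, and Field B comes out marginally larger because its single high count of 36 is squared into the total, whereas the IQR ignores that value.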
Homework: pg. 168 #1, 2ac, 3ac, 4, 5, 6

Summary of Measures of Spread
• range = maximum - minimum
• quartiles; interquartile range (IQR)
• box and whisker plot
• variance
• standard deviation

You need to know how to calculate each of the quantities and generate the plots. You will also need to know how to analyse these measures of spread to determine something about the data set (e.g. comparing the variance of different data sets to determine which one has the greater spread).
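The range, quartile and IQR calculations in the summary can be sketched in a few lines. This is a rough sketch that follows the median-of-halves method used in the box and whisker example, where the median is left out of both halves when n is odd. Splitting the data evenly when n is even is an assumption, since the notes only show odd-sized data sets. The function names are made up for this illustration.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half of the data."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For odd n the median itself is excluded from both halves.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(lower), median(data), median(upper)


def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range = Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


# Data set from the box and whisker construction (n = 15).
data = [26, 26, 27, 28, 30, 32, 34, 35, 39, 42, 44, 45, 49, 51, 64]
print(quartiles(data), data_range(data), iqr(data))
```

For the 15-value data set this reproduces Q1 = 28, Q2 = 35 and Q3 = 45 from the worked example, with a range of 38 and an IQR of 17. Statistical software often uses other quartile conventions, so its results can differ slightly from this hand method.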