Unit 2: Exploring Data

2.1 Variables and their Graphs
Name______________________

Lab: Questions on Backs
Student Survey

Categorical vs Quantitative Variables
What is the difference between a categorical and a quantitative variable?
Do we ever use numbers to describe the values of a categorical variable? Give some examples.
What is a distribution?

Example: US Census Data
Here is information about 10 randomly selected US residents from the 2000 census.

State          Number of Family Members   Age   Gender   Marital Status          Total Income   Travel time to work
Kentucky       2                          61    Female   Married                 21000          20
Florida        6                          27    Female   Married                 21300          20
Wisconsin      2                          27    Male     Married                 30000          5
California     4                          33    Female   Married                 26000          10
Michigan       3                          49    Female   Married                 15100          25
Virginia       3                          26    Female   Married                 25000          15
Pennsylvania   4                          44    Male     Married                 43000          10
Virginia       4                          22    Male     Never married/single    3000           0
California     1                          30    Male     Never married/single    40000          15
New York       4                          34    Female   Separated               30000          40

(a) Who are the individuals in this data set?
(b) What variables are measured? Identify each as categorical or quantitative. In what units were the quantitative variables measured?
(c) Describe the individual in the first row.

Graphs of Data:
Categorical
Quantitative

[Chapter 3]
2.2 Analyzing and Displaying Categorical Data
What graphs are used for categorical data?
Bar Graph:
Pie Graph:
What is the most important thing to remember when making pie charts and bar graphs?
Why do statisticians prefer bar graphs?
Segmented Bar Graph:
[Blank axes: vertical axis "Percent" from 0 to 1 in steps of 0.1; horizontal categories Male and Female]

Frequency and Relative Frequency Tables:

Color     Freq.   Rel. Freq.   Percent
Blue      13
Red       7
Orange    11
Green     9
Yellow    8
Brown     7
TOTAL     55      1.000        100%

What are some common ways to make a misleading graph?
What is wrong with these graphs?

Two-way tables:
What is a contingency (two-way) table?
[Blank two-way table: rows Male, Female, TOTAL; last column TOTAL]
What is a marginal distribution?
What is a conditional distribution?
The conditional distribution of political preference, conditional on being male:

          Liberal   Moderate   Conservative   TOTAL
Male

The conditional distribution of political preference, conditional on being female:

          Liberal   Moderate   Conservative   TOTAL
Female

What is the conditional relative frequency distribution of gender among conservatives?

Classwork: Transportation and Gender

[Chapter 4-5]
2.3 Analyzing and Displaying Quantitative Data
What graphs are used to display quantitative data?
Dotplots:
Stemplots (stem and leaf):

Example: Make a stemplot for the following data, the price per ounce for various brands of dandruff shampoo at a local grocery store.
0.32  0.21  0.29  0.54  0.17  0.28  0.36  0.23
Can you make a stemplot with this data?
What is the most important thing to remember when making a stemplot?

Back-to-Back Stemplots:
Example: Tobacco use in G-rated Movies
Total tobacco exposure time (in seconds) for Disney movies:
223  176  548  37  158  51  299  37  11  165  74  9  2  6  23  206  9
Total tobacco exposure time (in seconds) for other studios’ movies:
205  162  6  1  117  5  91  155  24  55  17
Make a back-to-back stemplot.

Boxplots:
Example: We will use the following data, representing tornadoes per year in Oklahoma from 1995 through 2004 (Sullivan, 2nd edition, p. 167), to construct a modified box plot.
79  47  55  83  145  44  61  18  78  62

Describing Distributions:
Briefly describe/illustrate the following distribution shapes:
Symmetric
Skewed right
Skewed left
Unimodal
Bimodal
Uniform

Identify the shape of the following distributions:

Example: Smart Phone Battery Life
Here is the estimated battery life for each of 9 different smart phones (in minutes). Make a graph of the data and describe what you see.
Smart Phone          Battery Life (minutes)
Apple iPhone         300
Motorola Droid       385
Palm Pre             300
Blackberry Bold      360
Blackberry Storm     330
Motorola Cliq        360
Samsung Moment       330
Blackberry Tour      300
HTC Droid            460

Lab: Features of Distributions
Center:
Unusual Features:
Spread:
Shape’s Impact on Mean and Median:
Resistant Measures:

2.4 Histograms
Why would we prefer a relative frequency histogram to a frequency histogram?
What will cause you to lose points on tests and projects (and cause Miss Hartman to go crazy…or crazier)?

The following table presents the average points scored per game (PPG) for the 30 NBA teams in the 2009–2010 regular season. Make a histogram of the distribution.

Team                      PPG
Atlanta Hawks             101.7
Boston Celtics            99.2
Charlotte Bobcats         95.3
Chicago Bulls             97.5
Cleveland Cavaliers       102.1
Dallas Mavericks          102
Denver Nuggets            106.5
Detroit Pistons           94
Golden State Warriors     108.8
Houston Rockets           102.4
Indiana Pacers            100.8
Los Angeles Clippers      95.7
Los Angeles Lakers        101.7
Memphis Grizzlies         102.5
Miami Heat                96.5
Milwaukee Bucks           97.7
Minnesota Timberwolves    98.2
New Jersey Nets           92.4
New Orleans Hornets       100.2
New York Knicks           102.1
Oklahoma City Thunder     101.5
Orlando Magic             102.8
Philadelphia 76ers        97.7
Phoenix Suns              110.2
Portland Trail Blazers    98.1
Sacramento Kings          100
San Antonio Spurs         101.4
Toronto Raptors           104.1
Utah Jazz                 104.2
Washington Wizards        96.2

Here is some data on time spent on the internet. Graph the data using a histogram.

Time on Internet (min.)   Frequency
0                         7
10                        1
20                        3
30                        7
40                        1
45                        1
60                        15
90                        3
120                       14
180                       10
210                       1
240                       10
270                       2
300                       9
360                       3

2.5 Comparing Two Distributions
Example: McDonald’s Beef Sandwiches
Here is data for the amount of fat (in grams) for McDonald’s beef sandwiches. Calculate the median and the IQR.
Are there any outliers in the beef sandwich distribution?
Here is data for the amount of fat (in grams) for McDonald’s chicken sandwiches. Are there any outliers in this distribution?
Draw parallel boxplots for the beef and chicken sandwich data. Compare these distributions.

Beef sandwiches
Sandwich                                   Fat (g)
Hamburger                                  9
Cheeseburger                               12
Double Cheeseburger                        23
McDouble                                   19
Quarter Pounder®                           19
Quarter Pounder® with Cheese               26
Double Quarter Pounder® with Cheese        42
Big Mac®                                   29
Big N' Tasty®                              24
Big N' Tasty® with Cheese                  28
Angus Bacon & Cheese                       39
Angus Deluxe                               39
Angus Mushroom & Swiss                     40
McRib®                                     26
Mac Snack Wrap                             19

Chicken sandwiches
Sandwich                                   Fat (g)
McChicken®                                 16
Premium Grilled Chicken Classic Sandwich   10
Premium Crispy Chicken Classic Sandwich    20
Premium Grilled Chicken Club Sandwich      17
Premium Crispy Chicken Club Sandwich       28
Premium Grilled Chicken Ranch BLT Sandwich 12
Premium Crispy Chicken Ranch BLT Sandwich  23
Southern Style Crispy Chicken Sandwich     17
Ranch Snack Wrap® (Crispy)                 17
Ranch Snack Wrap® (Grilled)                10
Honey Mustard Snack Wrap® (Crispy)         16
Honey Mustard Snack Wrap® (Grilled)        9
Chipotle BBQ Snack Wrap® (Crispy)          15
Chipotle BBQ Snack Wrap® (Grilled)         9

Example: Energy Cost: Top vs. Bottom Freezers
How do the annual energy costs (in dollars) compare for refrigerators with top freezers and refrigerators with bottom freezers? The data below is from the May 2010 issue of Consumer Reports.
[Figure: Dotplot of EnergyCost vs Type; groups "bottom" and "top"; horizontal axis EnergyCost from 56 to 140]

Example: Which gender is taller, males or females?
A sample of 14-year-olds from the United Kingdom was randomly selected using the CensusAtSchool website. Here are the heights of the students (in cm). Make a back-to-back stemplot and compare the distributions.
Male: 154, 157, 187, 163, 167, 159, 169, 162, 176, 177, 151, 175, 174, 165, 165, 183, 180
Female: 160, 169, 152, 167, 164, 163, 160, 163, 169, 157, 158, 153, 161, 165, 165, 159, 168, 153, 166, 158, 158, 166

Lab: Matching Graphs to Variables

2.6 Standard Deviation
Lab: Guess My Age
In the distribution below, how far are the values from the mean, on average?
[Figure: dot plot "Collection 1"; horizontal axis "Data" from 0 to 8]

What does the standard deviation measure?
What are some similarities and differences between the range, IQR, and standard deviation?
How is the standard deviation calculated?
What is the variance?
What are some properties of the standard deviation?

Example: A random sample of 5 students was asked how many minutes they spent doing HW the previous night. Here are their responses (in minutes): 0, 25, 30, 60, 90. Calculate and interpret the standard deviation.

Unit 2 FRAPPY
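
One way to check the homework example from 2.6 (responses of 0, 25, 30, 60, 90 minutes) is a short Python sketch. It assumes the sample standard deviation is wanted, so the sum of squared deviations is divided by n - 1 because the five students are a sample. The function name sample_std_dev is made up for illustration and is not part of the worksheet.

```python
import math
import statistics

# HW minutes reported by the 5 sampled students (from the 2.6 example)
minutes = [0, 25, 30, 60, 90]


def sample_std_dev(values):
    """Return (mean, variance, standard deviation) for a sample.

    Assumption: divides by n - 1 (sample formula), not n.
    """
    n = len(values)
    mean = sum(values) / n
    # Squared distance of each value from the mean
    squared_devs = [(x - mean) ** 2 for x in values]
    variance = sum(squared_devs) / (n - 1)
    return mean, variance, math.sqrt(variance)


mean, variance, sd = sample_std_dev(minutes)
print(f"mean = {mean}, variance = {variance}, sd = {sd:.2f}")
# prints: mean = 41.0, variance = 1205.0, sd = 34.71

# Cross-check against the standard library's sample standard deviation
print(math.isclose(sd, statistics.stdev(minutes)))
```

Under that assumption, the responses typically fall about 34.7 minutes from the mean of 41 minutes. Dividing by n instead (statistics.pstdev) would give a smaller value, which is why the choice of formula is stated up front.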