Chapter 3 Section 2 day 7 2016s Notes.notebook
March 03, 2016

Honors Statistics
Thursday March 3, 2016

Daily Agenda
3. Review OTL C3#10
4. Review Sprint data
5. Space Shuttle activity

r
(0, a)
residual = observed y - predicted y
can be explained by least squares regression line of

Study the formulas sheet for a quiz tomorrow.

A skip none
pg 198: 70
Use the LSRL worksheet. Examine the point (116, 41) and determine its influence.

I made the lists LRUSH, LPTSC.
PLEASE CHECK YOUR DATA LISTS BEFORE CONTINUING WITH THE WORKSHEET.

Below are the results for the statistics:
[rushing yards statistics] [points scored statistics]

[Scatterplot: points scored (y) vs. rushing yards (x)]

The scatterplot shows a weak positive linear association between total rushing yards and points scored in each game of the 2011 Jacksonville NFL season.

LSRL: y = 11.39 + 0.03x

I pick x = 250 yards rushing.
y = 11.39 + 0.03(250) = 18.89
I suppose this is feasible ... it is pretty difficult to score 19 points but ....
Jacksonville did score 41.

s = 8.26

[Residual plot: residuals vs. rushing yards]

y = 7.23 + 0.05x

This point is located in a position that would make us think that it is a residual outlier (or regression outlier). Without the point, the association between rushing yards and points scored becomes stronger (from r = 0.10 to r = 0.32), and the standard deviation of the residuals drops from s = 8.26 to s = 4.18. Therefore, this point is a residual outlier (or regression outlier).

Final Conclusion:
A linear model is appropriate for this data set. The residual plot and scatterplot have no curved pattern. The correlation coefficient r = 0.10 and coefficient of determination r^2 = 0.01 are very weak. The standard deviation of the residuals is s = 8.26, so the predictions for the number of points scored are typically off by about 8.26 points. There is one data point that is a regression outlier. Without this point the association becomes stronger. So while a linear model is the right choice, the model is not very accurate and should be used with extreme caution. Or perhaps not at all.

3.1 calc 3.2 Sentence 3.3 Sentence 3.4 3.5 calc 3.6 3.7 Sentence 3.8 3.9 calc 3.10 3.11 calc 3.12 Sentence calc 3.13 Sentence 3.14 3.15 3.16

The Final Conclusion is like the summary paragraph of a term paper. Address the following statements.
1. Should a linear model be used to model this data set? How do you know? What is your evidence? Look back at the original data plot and the evidence of the residual plot.
2. If a line is appropriate, what is the strength of the model? How do you know? What is your evidence? Look back at the r, r^2, and s values. What strength do they show?
3. Are there any special points that affect the model? What are they, and do they make the model stronger or weaker when they are removed? Look back at the last pattern deviations investigation. What does it show?

Final Conclusion:

Regression to the mean video: https://youtu.be/B98XzmOA7eg
A music video called "Regression to the mean": https://youtu.be/7Td0kSVXoI0

Should we use a line?
How good is the line at predicting y from x?
How do we calculate the "error" of the prediction?

REGRESSION EXAMPLE
Data is collected from a small statistics class. Members participated in a 40-yard sprint and the long jump.
Data lists are called ....

[Scatterplot with the point (7.25, 110) marked]
This scatterplot displays a strong negative linear association between a student's 40-yard sprint time in seconds and their long jump distance in inches.

The correlation coefficient r = -0.84 verifies a strong negative linear association between a student's 40-yard sprint time in seconds and their long jump distance in inches.

Track and Field Day
x = sprint time
y = long jump distance
LSRL: y = 414.79 - 45.74x
When the sprint time is 0 seconds, the predicted long jump distance is 414.79 inches.
As the sprint time increases by 1 second, the long jump distance is predicted to decrease by 45.74 inches.
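A minimal sketch of how the Track and Field Day line can be used, assuming the LSRL y = 414.79 - 45.74x from the notes. The function names predict_jump and residual are hypothetical and chosen only for illustration. The sketch checks the y-intercept and slope interpretations, then computes the residual for the observed point (7.25, 110) using residual = observed y - predicted y.

```python
def predict_jump(sprint_seconds, intercept=414.79, slope=-45.74):
    """Predicted long jump distance (inches) from the LSRL y = a + b*x."""
    return intercept + slope * sprint_seconds


def residual(observed_y, predicted_y):
    """residual = observed y - predicted y"""
    return observed_y - predicted_y


# y-intercept: the predicted jump when the sprint time is 0 seconds
at_zero = predict_jump(0)

# slope: the change in the predicted jump for a 1-second increase in sprint time
change_per_second = predict_jump(7) - predict_jump(6)

# residual for the observed point (7.25, 110)
predicted_at_point = predict_jump(7.25)
residual_at_point = residual(110, predicted_at_point)

print(f"prediction at x = 0: {at_zero:.2f} inches")
print(f"change per extra second: {change_per_second:.2f} inches")
print(f"prediction at x = 7.25: {predicted_at_point:.2f} inches")
print(f"residual for (7.25, 110): {residual_at_point:.2f} inches")
```

A positive residual means the student jumped farther than the line predicts for that sprint time, so the point sits above the line.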
Track and Field
long jump = 414.79 - 45.74(15 seconds) =

Track and Field Day
r^2 = 0.70
70% of the variation in the long jump distances can be explained by the LSRL of inches on seconds (that is, of jump distances on 40-yard sprint times).

s = 22.38 inches
22.38 inches is the standard deviation of the residuals. It is the typical amount by which an observed jump distance differs from the jump distance predicted by the LSRL.

7.47

The residual plot shows a scatter of points, so a linear model is appropriate. The residuals are rather large, so the model will not be very accurate.

[Scatterplot with the point (7.25, 110) marked]

         With the point (7.25, 110)    Without the point (7.25, 110)
LSRL     y = 414.79 - 45.74x           y = 447.33 - 51.33x
r        -0.84                         -0.86
r^2      0.70                          0.75
s        22.38                         21.27

The point (7.25, 110) changed the y-intercept a little and the slope a little. When removed, it made the association a small bit stronger, but not enough to call it a regression outlier. (7.25, 110) is an influential observation.

Final Conclusion:
A linear model is the correct model for this field day data because the original scatterplot of the data does not show any curving and the residual plot appears scattered (it has no pattern). The linear model for the sprint time vs. long jump data is a strong model. The r = -0.84 and r^2 = 0.70 values are strong. The standard deviation of the residuals is 22.38, so our predictions for long jumps are typically off by 22.38 inches. The point (7.25, 110) only changes the line and correlation slightly. Use the model with a bit of caution because the predictions are only somewhat accurate.
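The with-and-without comparison used to judge a point's influence can be sketched as follows. This is only an illustration: the class data lists are not included in these notes, so the sprint and jump values below are HYPOTHETICAL, and the function names lsrl_summary and compare_without are made up for this sketch. It assumes s is computed with n - 2 in the denominator.

```python
import math


def lsrl_summary(xs, ys):
    """Intercept a, slope b, r, r^2 and s for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    r = sxy / math.sqrt(sxx * syy)
    # residual = observed y - predicted y
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # standard deviation of the residuals (assumed n - 2 denominator)
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return {"a": a, "b": b, "r": r, "r2": r ** 2, "s": s}


def compare_without(xs, ys, point):
    """Return (summary with all points, summary with `point` removed)."""
    kept = [(x, y) for x, y in zip(xs, ys) if (x, y) != point]
    kept_x = [x for x, _ in kept]
    kept_y = [y for _, y in kept]
    return lsrl_summary(xs, ys), lsrl_summary(kept_x, kept_y)


# HYPOTHETICAL data, not the class lists: sprint times (s) and long jumps (in)
sprint = [5.5, 6.0, 6.5, 7.0, 7.25, 7.5, 8.0]
jump = [170, 150, 120, 95, 110, 70, 45]

with_pt, without_pt = compare_without(sprint, jump, (7.25, 110))
for label, fit in (("with", with_pt), ("without", without_pt)):
    print(f"{label:8s} y = {fit['a']:.2f} + ({fit['b']:.2f})x  "
          f"r = {fit['r']:.2f}  r^2 = {fit['r2']:.2f}  s = {fit['s']:.2f}")
```

If the slope and y-intercept shift when the point is removed, the point is influential. If r and s also change a lot, the point is a candidate regression outlier.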
Managing Diabetes

Fasting Plasma 406.77 30.23 4.78 HbA 20.62

This scatterplot displays a moderately weak positive linear association between HbA and fasting plasma.

r = 0.48
The correlation coefficient verifies the moderately weak positive linear association between HbA and fasting plasma.

That's enough for you to get started; we will go over the rest in class tomorrow!