Aim #97: How do we distinguish between scatter plots that model a linear versus a nonlinear equation, and how do we write the linear regression equation for a set of data using our calculator?

5-9-17
Homework: Handout

Do Now:

1) A scatter plot is an informative way to display numerical data with two variables. Here is a scatter plot of the data on elevation and mean number of clear days.
a) Do you see a pattern in the scatter plot, or does it look like the data points are scattered?
b) How would you describe the relationship between elevation and mean number of clear days for these 14 cities? That is, does the mean number of clear days tend to increase as elevation increases, or does the mean number of clear days tend to decrease as elevation increases?
c) Do you think that a straight line would be a good way to describe the relationship between the mean number of clear days and elevation? Why do you think this?

2) The scatter plot below shows number of cell phone calls and age. Is there a relationship between number of cell phone calls and age? If there is a relationship between number of cell phone calls and age, does the relationship appear to be linear?

3) Below are three scatter plots. Each one represents a data set with eight observations. The scales on the x and y axes have been left off these plots on purpose so you will have to think carefully about the relationships.
a) If one of these scatter plots represents the relationship between height and weight for eight adults, which scatter plot do you think it is and why?
b) If one of these scatter plots represents the relationship between height and SAT math score for eight high school seniors, which scatter plot do you think it is and why?
c) If one of these scatter plots represents the relationship between the weight of a car and fuel efficiency for eight cars, which scatter plot do you think it is and why?
d) Which of these three scatter plots does not appear to represent a linear relationship?
Explain the reasoning behind your choice.

4) The scatter plot below compares frying time and moisture content. Is there a relationship between moisture content and frying time, or do the data points look scattered? If there is a relationship, does it look linear or nonlinear? If nonlinear, what type of association exists between frying time and moisture content?

5) Describe the type of association for each type of scatter plot.

6) The scatter plot below shows a straight line that can be used to model the relationship between elevation and mean number of clear days. The equation of this line is y = 83.6 + 0.008x.
a) There are 14 US cities shown in the scatter plot above. Should you see more clear days per year in Los Angeles, which is near sea level, or in Denver, which is known as the mile high city? Justify your choice.
b) One of the cities in the data set was Albany, New York, which has an elevation of 275 feet. What would you predict the mean number of clear days for Albany to be, based on the equation of the line that describes the relationship between elevation and mean number of clear days?
c) Another city in the data set was Albuquerque, New Mexico. It has an elevation of 5,311 feet. What would you predict the mean number of clear days for Albuquerque to be, using the equation of the line?
d) The actual value for Albany is 69 clear days and the actual value for Albuquerque is 167 clear days. Was the prediction of the mean number of clear days based on the line closer to the actual value for Albany or for Albuquerque? How could you tell this from looking at the scatter plot with the line shown above?

7) Kendra likes to watch crime scene investigation shows on television. She watched a show where investigators used a shoe print to help identify a suspect in a case. She questioned how it is possible to predict someone's height from his shoe print. To investigate, she collected data on shoe length (in inches) and height (in inches) from 10 adult men. Her data appears in the table and scatter plot below.
Steps for finding the linear regression equation (also known as the line of best fit)
1. Stat → Edit: Enter your data into L1 and L2
2. Stat → Calc: Choose LinReg (option 4)

a) Using your calculator, write the linear regression equation for the table above, where shoe length is the independent variable. Round the slope to the nearest hundredth and the y-intercept to the nearest tenth.
b) Is there a relationship between shoe length and height? Explain.
c) Do the men with longer shoe lengths tend to be taller?
d) Using the equation of the line of best fit from part (a), predict the height of a man with a shoe length of 12 inches. Round to the nearest hundredth.
e) Use the equation of the line of best fit to predict the height of a man with a shoe length of 12.6 inches.
f) How does the prediction from part (e) compare to the first data point in the table?

Since his actual height was different from the predicted height, you can calculate the prediction error by subtracting the predicted value from the actual value. This prediction error is called a residual. For the first data point, the residual is calculated as follows:

Residual = actual y value - predicted y value
= 74 - 71.42
= 2.58 inches

g) For the line y = 3.66x + 25.3, calculate the missing values, to the nearest hundredth, and complete the table.

[Table: actual, predicted, (residual)^2]

h) Why is the residual in the table's first row positive, and the residual in the second row negative?
i) What is the sum of the residuals? ________
j) Why did you get a number close to zero for this sum?
k) Does this mean that all of the residuals were close to 0?

When you use a line to describe the relationship between two numerical values, the best line is the line that makes the residuals as small as possible overall.

l) If the residuals tend to be small, what does this say about the fit of the line to the data?
m) Calculate the residuals squared and fill in the column in the table above.
The way we determine the line of best fit is to make the sum of the squared residuals as small as possible.

n) Calculate the sum of the squared residuals. Fill this in the table above.
o) Why do we use the sum of the squared residuals instead of just the sum of the residuals (without squaring)?
p) Assuming that the 10 men in the sample are representative of adult men in general, what height would you predict for a man whose shoe length is 12.5 inches?
q) What shoe length would you predict for a man whose height is 60 inches?
r) Give an interpretation of the slope of the linear regression equation for predicting height from shoe length for adult men.
s) What is the y-intercept of y = 3.66x + 25.3?
t) Explain why it does not make sense to interpret the y-intercept of 25.3 as the predicted height for an adult male whose shoe length is zero.

Sum it up!
A scatter plot can be used to investigate whether or not there is a relationship between two numerical variables. This relationship can be described as linear or nonlinear.
The line that has the smallest sum of squared residuals for a data set is called the least-squares line. This line can also be called the linear regression equation or the line of best fit.
The least-squares line (best-fit line) can be used to predict the value of either variable given the other.
Residual = actual y value - predicted y value
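
For readers who want to check the ideas in the summary off the calculator, here is a minimal Python sketch. It first reproduces the residual worked out for the first data point (shoe length 12.6 inches, height 74 inches) using the line y = 3.66x + 25.3, then finds a least-squares line by hand. The function names (predict, residual, least_squares) are hypothetical, and the four (x, y) pairs in the second half are made-up illustration values, not Kendra's table, which is not reproduced here.

```python
def predict(x, slope=3.66, intercept=25.3):
    """Predicted y for a given x; defaults are the worksheet line y = 3.66x + 25.3."""
    return slope * x + intercept


def residual(x, actual_y, slope=3.66, intercept=25.3):
    """Residual = actual y value - predicted y value."""
    return actual_y - predict(x, slope, intercept)


def least_squares(xs, ys):
    """Slope and intercept of the line with the smallest sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# First data point from the worksheet: shoe length 12.6 in, height 74 in.
first_prediction = round(predict(12.6), 2)      # 71.42
first_residual = round(residual(12.6, 74), 2)   # 2.58
print(first_prediction, first_residual)

# ASSUMPTION: made-up data for illustration only (not the worksheet's table).
xs = [10.0, 11.0, 12.0, 13.0]   # shoe lengths (inches)
ys = [62.0, 66.0, 69.0, 73.0]   # heights (inches)

slope, intercept = least_squares(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
sum_of_residuals = sum(residuals)
sum_of_squares = sum(r ** 2 for r in residuals)

print(slope, intercept)      # about 3.6 and 26.1
print(sum_of_residuals)      # essentially 0: positives and negatives cancel
print(sum_of_squares)        # about 0.2: squaring keeps every error positive
```

On the made-up data the residuals add up to essentially zero even though none of them is zero, which is why the sum of squared residuals, not the plain sum, is used to measure fit.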