Probability and Statistics β Mrs. Leahy Unit 3 Homework Packet: Correlation and Regression Problems should be completed on your own paper. Questions are due the class period after they are assigned. Scatter Diagrams and Linear Correlation 1. When drawing a scatter diagram, along which axis is the explanatory variable placed? Along which axis is the response variable placed? 2. Supposed two variables are positively correlated. Does the response variable increase or decrease as the explanatory variable increases? 3. Suppose two variables are negatively correlated. Does the response variable increase or decrease as the explanatory variable increases? 4. Describe the relationship between two variables (strong/weak, positive/negative/none, etc.) when the correlation coefficient, r, is a) near -1 b) near 0 c) near 1 5. Look at the following diagrams. State whether each diagrams shows high linear correlation (strong), moderate or low linear correlation (weak), or no linear correlation. 6. Look at the following diagrams. Does each diagram show high linear correlation, moderate or low linear correlation or no linear correlation? 7. Over the past few years, there has been a strong positive correlation between the annual consumption of diet soda drinks and the number of traffic accidents. a) Do you think increasing consumption of diet soda drinks causes traffic accidents? b) What lurking variables might be causing the increase in one or both of the variables? Explain. 8. Over the past decade, there has been a strong positive correlation between teacher salaries and prescript drug costs. a) Do you think paying teachers more causes prescription drugs to cost more? b) What lurking variables might be causing the increase in one or both of the variables? Explain. 9. Over the past 50 years, there has been a strong negative correlation between average annual incomes and record times to run 1 mile. In other words, average annual incomes have been rising while the record time to run 1 mile has been decreasing. a) Do you think increasing incomes cause decreasing times to run the mile? b) What lurking variables might be causing the increase in one variable and the decrease in the other variable? Explain. 10. Over the past 30 years in the United States, there has been a strong negative correlation between the number of infant deaths at birth and the number of people over the age of 65. In other words, the number of infant deaths at birth has been decreasing and the number of people over the age of 65 has been increasing. a) Do you think that people living longer is causing a decrease in infant mortalities at birth? b) What lurking variables might be causing the increase in one variable and the decrease in the other variable? Explain. 11. How much should a healthy Shetland pony weigh? Let x be the age of the pony (in months) and let y be the average weight of the pony (in kilograms). The following information is based on data from the Merck Veterinary Manual (a reference used in most veterinary colleges). a) Make a scatter diagram and draw the line you think best fits the data. (Use graph paper) b) Would you say the correlation is low, moderate, or strong? Is it positive or negative? 12. The following data are based on information from Domestic Affairs. Let x be the average number of employees in a group health insurance plan, and let y be the average administrative cost as a percentage of claims. a) Find β π₯ , β π₯ 2 , β π¦ , β π¦ 2 , πππ β π₯π¦ b) Use your answers from part a) to compute r. As x increases, does the value of r imply that y should tend to increase or decrease? Explain. Linear Regression and the Coefficient of Determination 13. For the least squares line, π¦Μ = 5 β 2π₯, what is the value of the slope? 14. For the least square line, π¦Μ = 5 + 3π₯, what is the marginal change in π¦Μ for each unit of change in x? 15. When we use the least squares line to predict y values for x values beyond the range of x values found in the data, are we extrapolating or interpolating? Are there any concerns about such predictions? 16. If two variables have a negative linear correlation, is the slope of the least-squares line positive or negative? 17. An economist is studying the job market in Denver area neighborhoods. Let x represent the total number of jobs in a given neighborhood, and let y represent the number of entry level jobs in the same neighborhood. A sample of six Denver neighborhoods gave the following information (units in hundreds of jobs). Given that a) Draw a scatter diagram displaying the data. (Use graph paper!) b) Compute the correlation coefficient, r c) Find π₯Μ , π¦Μ , π and π. Then, find the equation of the least squares line π¦Μ = π + ππ₯ d) Graph your least squares line on your scatter diagram. e) Find the value of the coefficient of determination r2. What percentage of the variation in y can be explained by the corresponding variation in x and the least squares line? What percentage is unexplained? f) For a neighborhood with x = 40 jobs, how many are predicted to be entry level jobs? 18. Data for this problem are based on information from STATS Basketball Scoreboard. It is thought that basketball teams that make too many fouls in a game tend to lose the game even if they otherwise play well. Let x be the number of fouls more than (i.e., over and above) the opposing team. Let y be the percentage of times the team with the larger number of fouls wins the game. x y xy x2 y2 SHOW YOUR WORK: DO NOT USE A GRAPHING CALCULATOR. a) (Hint: make a table) b) Calculate the correlation coefficient βrβ c) d) Compute the value of a and b. e) What is the equation of the least squares line? f) Make a scatterplot (use graph paper!). Graph the least squares line on your scatter plot. g) Find the coefficient of determination βr2β. h) What % of variation in y can be explained? What % of variation in y cannot be explained? i) If a team had x = 4 fouls over and above the opposing team, what does the least-squares equation predict for y? 19. Let x be the age of a licensed driver in years. Let y be the percentage of all fatal accidents (for a given age) due to failure to yield the fight of way. For example, the first data pair states that 5% of all fatal accidents involving 37-year-olds are due to failure to yield the right of way. The Wall Street Journal reported the following data: SHOW YOUR WORK: DO NOT USE A GRAPHING CALCULATOR. a) (Hint: make a table) b) Calculate the correlation coefficient βrβ c) d) Compute the value of a and b. e) What is the equation of the least squares line? f) Make a scatterplot (use graph paper!). Graph the least squares line on your scatter plot. g) Find the coefficient of determination βr2β. h) What % of variation in y can be explained? What % of variation in y cannot be explained? i) Predict the percentage of all fatal accidents due to failing to yield the right of way for 70-year-olds. 20. It is thought that prehistoric Indians did not take their best tools, pottery, and household items when they visited higher elevations for their summer camps. It is hypothesized that archaeological sites tend to lose their cultural identity and specific cultural affiliation as the elevation of the site increases. Let x be the elevation (in thousands of feet) of an archaeological site in the southwesten United States. Let y be the percentage of unidentified artifacts (no specific cultural affiliation) at a given elevation. The follow data were obtained for a collection of archaeological sites in New Mexico: a) USE A GRAPHING CALCULATOR to find the least squares line for the data. b) State correlation coefficient βrβ c) Find the coefficient of determination βr2β. d) What % of variation in y can be explained? What % of variation in y cannot be explained? e) Predict the percentage of unidentified objects at an elevation of x = 6.5 (thousand feet).
© Copyright 2026 Paperzz