Chapter 3 Online Quiz
Chapter 3: Describing Relationships

1. A researcher wants to study the effect of regular exercise on cholesterol level. The researcher compares the cholesterol levels of 50 people who belong to a local gym and exercise there regularly with the cholesterol levels of 50 people from the same community who do not exercise regularly. The cholesterol levels of the gym members are substantially lower. The researcher can conclude that
A. belonging to a gym reduces cholesterol level.
AR. Incorrect. This study shows an association between gym membership and cholesterol level, but it cannot show that gym membership causes lower cholesterol. Confounding variables may be present. For example, individuals who are concerned about their health may be more likely to both join gyms and control their cholesterol levels (for example, by keeping track of what foods they eat).
B. exercising regularly at a gym reduces cholesterol level.
BR. Incorrect. This study shows an association between regular exercise and cholesterol level, but it cannot show that exercise causes lower cholesterol. Confounding variables may be present. For example, individuals who are concerned about their health may be more likely to exercise regularly and watch their cholesterol levels (for example, by keeping track of what foods they eat).
*C. members of a local gym who exercise there regularly have lower cholesterol levels than people in the community who do not exercise regularly.
CR. Correct. We can conclude that there is an association only for those individuals who participated in the study. Confounding variables may be present. See (A) or (B).

2. You wish to conduct a study to examine how the religious affiliation of a person may influence his or her opinion about the permissibility of late-term abortions. Which of the following statements about the variable “religious affiliation” is correct?
A. It is the explanatory variable and is quantitative.
AR. Incorrect.
“Religious affiliation” is categorical, not quantitative.
B. It is the response variable and is categorical.
BR. Incorrect. We are using “religious affiliation” to try to explain observed differences in opinion about late-term abortions, not the other way around.
*C. It is the explanatory variable and is categorical.
CR. Correct. We are using “religious affiliation” to try to explain observed differences in opinion about late-term abortions, and the variable “religious affiliation” is indeed categorical.

© W.H. Freeman/BFW Publishers 2011 The Practice of Statistics for AP*, 4e

3. The points in the scatterplot represent paired observations (x, y) where x is an individual’s weight and y is the time (in seconds) it takes for walking on a treadmill to raise the individual’s pulse rate to 140 beats per minute. The o’s correspond to females and the +’s to males. From the scatterplot, which conclusion can we make?
A. There is a positive correlation r between gender and weight because men tend to weigh more than women.
AR. Incorrect. The correlation r measures association between two quantitative variables. Since “gender” is not quantitative, r is not an appropriate measure of association between gender and weight.
*B. There is a negative correlation r between weight and time for both males and females.
BR. Correct. If one looks only at the o’s corresponding to females, there is a clear downward trend indicating a negative correlation between weight and time. The same is true for the +’s corresponding to males.
C. In general, males tend to take less time to have their pulse rate raised to 140 bpm while walking on the treadmill.
CR. Incorrect. If this statement were true, then the cluster of +’s corresponding to males would lie distinctly below the cluster of o’s corresponding to females (that is, the +’s would tend to have smaller y-coordinates than the o’s), which is clearly not the case here.

4.
Which of the following statements does not contain a blunder?
A. There is a correlation of r = 0.54 between the position a football player plays and his or her weight.
AR. Incorrect. This statement contains a blunder because r measures association between quantitative variables and the variable “position” is categorical.
*B. The correlation between amount of fertilizer and yield of tomatoes was found to be r = 0.33.
BR. Correct. Both variables are quantitative in this case, so r would be an appropriate measure of association. Also, in this case, we would expect the correlation to be positive.
C. The correlation between the gas mileage of a car and its weight is r = –0.71 gallon-pounds.
CR. Incorrect. This statement contains a blunder because the correlation coefficient r has no units. If the value had been reported as r = –0.71 with no units, then the statement would have been correct.

5. There is a strongly linear association between the weight of a football player and the time in seconds it takes for that player to run a 100-yard dash. Knowing this, a reasonable value for the correlation r between weight and 100-yard dash time would be
*A. r = 0.8
AR. Correct. The value of r is greater than 0, reflecting the fact that lighter players tend to run faster (take less time) and heavier players tend to run slower (take more time). It is also close to 1, reflecting the strength of the linear association.
B. r = 0
BR. Incorrect. r = 0 indicates either that there is no linear relationship between the variables or that the relationship takes some form other than a straight line. Neither situation applies here.
C. r = –0.8
CR. Incorrect. If this were a reasonable value of r, then heavier players would tend to run faster than lighter players, which does not make physical sense.

6. Foresters use linear regression to predict the volume of timber in a tree using easily measured quantities such as diameter.
Let y be the volume of timber in cubic feet produced by a tree and let x be the tree’s diameter in feet (measured at a height of 3 feet above the ground). One set of paired data gives the prediction equation ŷ = –30 + 60x. The predicted volume of timber for a tree of diameter 18 inches is
A. 1050 cubic feet
AR. Incorrect. The prediction equation was computed from data in which the explanatory variable x was measured in feet. You have used the equation with x = 18 measured in inches. You need to convert x to feet.
B. 90 cubic feet
BR. Incorrect. In the prediction equation, we multiply the value of x in feet (1.5) by 60 and then add the intercept, –30. You have neglected to add the intercept in your calculation.
*C. 60 cubic feet
CR. Correct. After converting x to feet (18 inches = 18/12 or 1.5 feet), the prediction equation for x = 1.5 yields ŷ = –30 + 60(1.5) = –30 + 90 = 60.

7. Foresters use linear regression to predict the volume of timber in a tree using easily measured quantities such as diameter. Let y be the volume of timber in cubic feet produced by a tree and let x be the tree’s diameter in feet (measured at a height of 3 feet above the ground). One set of paired data gives the prediction equation ŷ = –30 + 60x. The residual of a tree of diameter 2 feet that yields 120 cubic feet of timber is
*A. 30 cubic feet
AR. Correct. The predicted amount of timber yielded by a tree with diameter x = 2 feet is, from the prediction equation, ŷ = –30 + 60(2) = 90 cubic feet. The observed value of y for x = 2 is actually y = 120 cubic feet. The residual = observed y – predicted y = 120 – 90 = 30.
B. –30 cubic feet
BR. Incorrect. Recall that residual = observed y – predicted y. You have interchanged the values of observed y and predicted y in your calculations.
C. 90 cubic feet
CR. Incorrect.
The predicted amount of timber yielded by a tree with diameter x = 2 feet is, from the prediction equation, ŷ = –30 + 60(2) = 90 cubic feet. However, to obtain the residual, you must subtract this predicted value from the observed y-value, 120. You have not done so in this case.

8. Which of the following statements about the slope of the least-squares regression line is true?
*A. It has the same sign as the correlation coefficient r.
AR. Correct. Recall that b = r(s_y/s_x), where b is the slope of the line, r is the correlation coefficient, and the ratio of the sample standard deviations s_y/s_x is always positive. Therefore, b and r must always have the same sign.
B. The square of the slope equals the proportion of the variation in the response variable that is explained by the explanatory variable.
BR. Incorrect. The square of the correlation coefficient r has this property, not the square of the slope of the line. Although the slope and the correlation have the same sign, their numerical values and other properties are different.
C. It is unitless.
CR. Incorrect. The units of the slope are units of y divided by units of x. The correlation coefficient, not the slope, is unitless. Although the slope and the correlation have the same sign, their numerical values and other properties are different.

9. For 10 pairs of data (x, y), we obtain the following summary statistics: The 10 x-values have sample mean 0.30 and sample standard deviation 0.02. The 10 y-values have sample mean 0.28 and sample standard deviation 0.04. The correlation coefficient r = 0.896. The equation of the least-squares regression line of y on x is
A. ŷ = 0.1456 + 0.448x
AR. Incorrect. You used the incorrect equation b = r(s_x/s_y) to calculate the slope of the LSRL. The correct equation is b = r(s_y/s_x).
*B. ŷ = –0.2576 + 1.792x
BR. Correct.
From the equation b = r(s_y/s_x), we obtain b = (0.896)(0.04/0.02) = 1.792. Since the LSRL passes through the point (0.30, 0.28), the equation of the LSRL is given by ŷ – 0.28 = 1.792(x – 0.30) or ŷ – 0.28 = 1.792x – 0.5376. Solving for ŷ yields the desired equation, ŷ = –0.2576 + 1.792x.
C. ŷ = –0.20176 + 1.792x
CR. Incorrect. You calculated the slope correctly, but in determining the equation of the line, you assumed that the line passed through the point (0.28, 0.30). It in fact passes through the point (0.30, 0.28). You have interchanged the sample means.

10. A study showed that students who spend more time studying for statistics tests tend to achieve better scores on their tests. In fact, the number of hours studied turned out to explain 81% of the observed variation in test scores among the students who participated in the study. What is the value of the correlation between number of hours studied and test score?
A. r = 0.81
AR. Incorrect. You have misinterpreted r², the proportion of variation in the response variable that can be explained by regression on the explanatory variable, as r.
B. r = 0.656
BR. Incorrect. The given information is the value of r², so you need its square root to recover r. You squared 0.81 instead.
*C. r = 0.9
CR. Correct. The given information implies that r² = 0.81. Taking the (positive) square root yields r = 0.9. (We use the positive root because the first sentence of the problem indicates that the direction of the association between number of hours studied and test score is positive.)

11. Suppose that the circled point were removed from the scatterplot. Which of the following would happen as a result?
*A. The slope of the regression line would increase (in the positive direction).
AR. Correct. The circled point is an influential point because it is an outlier in the x direction.
Including it has a marked effect on the regression line. Specifically, including it flattens the line and decreases the slope. Removing it would cause the slope of the regression line to increase because the line would now pass through the main body of points.
B. The correlation coefficient would decrease.
BR. Incorrect. Deleting the circled point would improve the linear fit because the line would now pass through the main body of points and not be influenced by the circled point. The correlation would therefore increase, rather than decrease.
C. The fit of the regression line to the data would be worse.
CR. Incorrect. The circled point is an influential point because it is an outlier in the x direction. Including it has a marked effect on the regression line. Specifically, it flattens the line and decreases the slope. Removing it would cause the fit of the line to the data to improve because the line would now pass through the main body of points.

12. In the scatterplot, the world-record time (in minutes) in the marathon is plotted against the year in which the record was set. The plotting symbol o is used for men and x for women. The data include only records set between 1908 and 1988. Based on the plot, which statement would be a valid conclusion?
A. We can expect the world-record time for women to be lower than that for men sometime before the year 2010.
AR. Incorrect. This is extrapolation: extending the linear relationship beyond the scope of the data. Such predictions are often unreliable.
*B. The world-record times for women show a greater rate of improvement (a more rapid decrease) than the world-record times for men.
BR. Correct. A line fitted to the x’s would have a more negative slope than a line fitted to the o’s, which would indicate a greater rate of improvement (a more rapid decrease) in record times for women.
C.
By the year 2010, the world-record time for men will reach a plateau beyond which no improvement will be possible.
CR. Incorrect. Although it is true that the plot has been leveling off, once again this statement is extrapolation. The relationship has been extended beyond the scope of the data, and such predictions are often unreliable.

13. A residual plot displays a “reverse fan” arrangement, with the spread of points about the line (residual = 0) gradually decreasing from left to right (that is, as x increases). Which statement would be a correct interpretation of this plot?
A. The original data display a nonlinear relationship (curved pattern of association).
AR. Incorrect. If the association were nonlinear, then the residual plot would display a curved pattern of some sort. The residual plot described here shows a change in spread, not a curve.
B. Predictions using the regression line will be more reliable for small x than for large x.
BR. Incorrect. According to the plot, the linear fit is better for large x than for small x, so predictions will be more reliable for large x than for small x.
*C. Predictions using the regression line will be more reliable for large x than for small x.
CR. Correct. According to the plot, the linear fit is better for large x than for small x, so predictions will be more reliable for large x than for small x.
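
Worked check of the arithmetic in Questions 6, 7, 9, and 10. The short Python sketch below is not part of the original quiz; it simply recomputes the numeric answers from the values given in those questions. The function names (predict, residual, lsrl_from_summary) are hypothetical helpers chosen for illustration.

```python
import math


def predict(intercept, slope, x):
    """Predicted response from a line y-hat = intercept + slope * x."""
    return intercept + slope * x


def residual(observed_y, predicted_y):
    """Residual = observed y minus predicted y."""
    return observed_y - predicted_y


def lsrl_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Least-squares line from summary statistics.

    Slope b = r * (s_y / s_x); the line passes through (mean_x, mean_y),
    so the intercept is a = mean_y - b * mean_x.
    """
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Question 6: convert 18 inches to feet before using y-hat = -30 + 60x.
volume = predict(-30, 60, 18 / 12)

# Question 7: tree of diameter 2 feet that yields 120 cubic feet.
res = residual(120, predict(-30, 60, 2))

# Question 9: summary statistics for the 10 (x, y) pairs.
a, b = lsrl_from_summary(0.30, 0.02, 0.28, 0.04, 0.896)

# Question 10: r^2 = 0.81 and the association is positive.
r = math.sqrt(0.81)

print("Q6 predicted volume:", volume)
print("Q7 residual:", res)
print("Q9 intercept and slope:", round(a, 4), round(b, 3))
print("Q10 correlation:", round(r, 3))
```

Swapping sd_x and sd_y in the call to lsrl_from_summary reproduces the wrong slope in option A of Question 9, and swapping mean_x and mean_y reproduces the wrong intercept in option C.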