9/29/2014

Scatterplots & Correlation
Section 3.1A
Relationships between quantitative variables.

Relationships between Two Variables
O A study found that short women are more likely to have heart attacks than tall women.
O Smokers on average die younger than nonsmokers.
O But to draw these conclusions we must first eliminate the effect of other variables.
NOTE: Statistical relationships are overall tendencies, NOT absolute rules!
Example: A smoker who lives to age 90.

Lurking Variables
O Can strongly influence the relationship between two variables.

Math SAT Scores
[Figure: math SAT scores by country of origin]
Does this mean that we should conclude that country of origin is the cause of the difference in math SAT scores?
No. A broad cross-section of all U.S. students takes the test, while other countries might test a more select group.

Case of the Missing Cookies
[Figure: scatterplot of hand span (cm) and height (cm)]

Variables
Response
O A response variable measures the outcome of a study.
Explanatory
O An explanatory variable may help explain or influence changes in a response variable.

Which is the Explanatory and which is the Response Variable?
O We think that car weight helps to explain accident death rate.
   Explanatory: Car Weight
   Response: Accident Death Rate
O We think that smoking influences life expectancy.
   Explanatory: # of Cigarettes Smoked
   Response: Life Expectancy

O It is easiest to identify explanatory and response variables when we actually specify values of one variable to see how it affects another variable.
O When we don't specify the values of either variable but just observe both variables, there may or may not be explanatory and response variables.

Scatterplot
O The most useful graph to show the relationship between two quantitative variables measured on the same individuals. Each individual in the data appears as a point in the graph.

Points to Remember about a Scatterplot
O Put the correct variable on the x-axis and on the y-axis.
   O Explanatory variable goes on the horizontal axis. (eXplanatory goes on the x-axis.)
   O If there is no explanatory variable, then either variable can go on the horizontal axis.
O Label and scale your axes.
O Plot your points.

Types of Correlation
[Figure: three scatterplots labeled Strong Positive Linear, Strong Negative Linear, and No Correlation]

Has the increase been constant?
[Figure: "Would Vote for a Woman", % responding yes against years (since 1900)]

Describe the correlation
O Apples: circumference, weight
O College freshmen: shoe size, weight
O People: age, grip strength
O Drivers: blood alcohol, reaction time

Caution...
Association Does Not Imply Causation!

Interpreting Scatterplots
O Look for DIRECTION (positive, negative, none)
O Look at the FORM of the relationship
   O Straight or curved
   O Any clusters
O Look at the STRENGTH
   O How closely does it follow the form
O Look for outliers
   O Individual value that falls outside the overall pattern of the relationship

When writing to describe:
O There appears to be a (strong, weak, moderate) (positive/negative) (linear, nonlinear) relationship between _____ (give the x variable) and ______ (give the y variable).
O Do not just say between x & y!

Interpret
Direction: Negative. States in which higher percentages of high school graduates take the SAT tend to have lower mean SAT Math scores.
Form: Slightly curved. It appears that most states fall into one of two distinct clusters.
Strength: Strength is determined by how closely the points follow a clear form. The overall relationship in this figure is moderately strong: states with similar percents taking the SAT tend to have roughly similar mean SAT Math scores.

Let's look at a scatterplot of the number of registered boats in Florida and the number of manatees killed by boats for the years 1977-2007.
Graph

Using a calculator:

Interpret
Direction: Positive. The more boats registered, the more manatees killed.
Form: Linear. The overall pattern follows a straight line from lower left to upper right.
Strength: Strong. The points don't deviate greatly from a line. There are no obvious outliers.
NOTE: Although the scatterplot shows a strong linear relationship between the variables, we can NOT conclude that the increase in manatee deaths was caused by the change in boat registrations.

Interpret...
1. Direction
2. Form
3. Strength
4. Outliers
[Figure: scatterplot of long jump (in) against sprint time (seconds), with one point marked "Influential Pt!"]

Sprint Time (sec):  5.41  5.05  9.49  8.09  7.01  7.17  6.83  6.73  8.01  5.68  5.78  6.31  6.04
Long Jump (in):     171   184   48    151   90    65    94    78    71    130   173   143   141

A point is INFLUENTIAL if removing it would markedly change the result of the calculation.

The following data represent 9th grade students who go on a backpacking trip.

Body wt (lb):   120  187  109  103  131  165  158  116
Backpack (lb):  26   30   26   24   29   35   31   28

Interpret: Backpack
Direction: Positive. Lighter students carry lighter backpacks.
Form: Somewhat linear. The overall pattern follows a straight line from lower left to upper right.
Strength: Moderately strong. The points vary somewhat from the linear pattern. One possible outlier: the hiker with body weight 187 pounds and pack weight 30 pounds.

O The Starnes family arrived at Old Faithful after it had erupted. They wondered how long it would be until the next eruption. Here is a scatterplot that plots the interval between consecutive eruptions of Old Faithful against the duration of the previous eruption, for the month prior to their visit.

Answer the following questions:
O Describe the direction of the relationship. Explain why this makes sense.
POSITIVE - The longer the duration of the eruption, the longer the wait between eruptions.
One reason for this may be that if the geyser erupted for longer, it expended more energy, and it will take longer to build up the energy needed to erupt again.
Pg. 148

Answer the following questions:
O What form does the relationship take? Why are there 2 clusters of points?
LINEAR. The clusters indicate that in general there are two types of eruptions, one shorter, the other somewhat longer.
O How strong is the relationship? Justify.
FAIRLY STRONG.
O Are there any outliers?
THERE ARE A FEW OUTLIERS AROUND THE CLUSTERS, but not many and not very distant from the main grouping of points.
O What information does the Starnes family need to predict when the next eruption will occur?
The Starnes family needs to know how long the last eruption was in order to predict how long it will be until the next one.

Homework
O Page 159 (1-13) odd
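
Optional check: influential points in the backpack data
The backpack example flags the hiker at 187 lb body weight and 30 lb pack weight as a possible outlier. The sketch below checks how much that one point matters. It assumes that "the calculation" in the INFLUENTIAL definition is the correlation coefficient r (Pearson's r), which the slides do not state. The function name pearson_r is illustrative, not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Backpack data from the slides (explanatory: body weight, response: pack weight)
body_wt = [120, 187, 109, 103, 131, 165, 158, 116]
pack_wt = [26, 30, 26, 24, 29, 35, 31, 28]

r_all = pearson_r(body_wt, pack_wt)

# Drop the possible outlier: the hiker with body weight 187 lb and pack weight 30 lb
kept = [(b, p) for b, p in zip(body_wt, pack_wt) if (b, p) != (187, 30)]
r_without = pearson_r([b for b, _ in kept], [p for _, p in kept])

print(f"r with all 8 hikers:      {r_all:.3f}")
print(f"r without the 187 lb hiker: {r_without:.3f}")
```

With these eight hikers, r is about 0.79 with everyone included and about 0.95 once that hiker is removed. Under the assumption above, that one point noticeably weakens the measured linear relationship, which is what the definition of an influential point describes.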