Scatterplots and Correlations

Positive correlation

1. SAT scores and college achievement: among college students, those with higher SAT scores also tend to have higher grades.
2. Happiness and helpfulness: as people’s happiness level increases, so does their helpfulness (conversely, as people’s happiness level decreases, so does their helpfulness).

This table shows some sample data. Each person reported annual income and years in school.

Sample   Annual income   Years in school
1        125,000         19
2        100,000         20
3         40,000         16
4         35,000         16
5         41,000         18
6         29,000         12
7         35,000         14
8         24,000         12
9         50,000         16
10        60,000         17

The correlation is .79.

We find that people with higher income tend to have more years of education. (You can also phrase it as: people with more years of education tend to have higher income.)

When we know there is a correlation between two variables, we can make a prediction. If we know a group’s income, we can predict their years of education.

Negative correlation

1. Education and years in jail: people who have more years of education tend to have fewer years in jail (or, phrased the other way, people with more years in jail tend to have fewer years of education).
2. Crying and being held: among babies, those who are held more tend to cry less (or, phrased the other way, babies who are held less tend to cry more).

Is there a correlation between TV viewing and class grades? The correlation is -.63. There is a negative correlation between TV viewing and class grades: students who spend more time watching TV tend to have lower grades.

Advantage and Disadvantage

An advantage of the correlation method is that we can make predictions about things when we know about correlations. If two variables are correlated, we can predict one based on the other. For example, we know that SAT scores and college achievement are positively correlated. So when college admission officials want to predict who is likely to succeed at their schools, they will choose students with high SAT scores.
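The .79 reported for the income and years-in-school sample can be checked by hand or with a few lines of code. The sketch below is one way to do it, assuming the usual Pearson formula; the helper name pearson_r and the variable names are illustrative and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Sample data from the table: annual income and years in school, samples 1-10
income = [125_000, 100_000, 40_000, 35_000, 41_000,
          29_000, 35_000, 24_000, 50_000, 60_000]
years_in_school = [19, 20, 16, 16, 18, 12, 14, 12, 16, 17]

r_income_education = pearson_r(income, years_in_school)
print(round(r_income_education, 2))  # 0.79
```

Because r is computed from deviations scaled by their spread, it does not change if income is expressed in thousands of dollars instead of dollars.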
The problem that most people have with the correlation method is remembering that correlation does not measure cause. Take a minute and chant to yourself: Correlation is not Causation! Correlation is not Causation! Correlation is NOT Causation!

Predicting outcome through stats...

Beers   5      2      9      8      3      7      3      5
BAC     0.10   0.03   0.19   0.12   0.04   0.095  0.07   0.06

Beers   3      5      4      6      5      7      1      4
BAC     0.02   0.05   0.07   0.10   0.085  0.09   0.01   0.05

r = .894338 (all 16 students)

[Scatterplot: student volunteers’ BAC vs. beer intake]

Correlation measures the strength of a linear relationship. Patterns closer to a straight line have correlations closer to 1 or -1.

More facts about correlation:
• Correlation requires that both variables be quantitative.
• “r” does not describe curved relationships between variables, no matter how strong they are.
• Like the mean and standard deviation, the correlation is not resistant: r is strongly affected by a few outliers.
• Correlation is not a complete summary of two-variable data. Give the means and standard deviations of both x and y along with the correlation.

A scatterplot displays the relationship between two quantitative variables measured on the same individuals (x = explanatory variable, y = response variable). Plot points with different colors or symbols to see the effect of a categorical variable in a scatterplot.

In examining a scatterplot, always look for an overall pattern showing direction, form, and strength, and then look for outliers or other deviations from this pattern.
Direction: either positive or negative.
Form: when the points show a straight-line relationship, the pattern is said to be linear. Linear relationships are an important form of relationship between variables.
Strength: determined by how closely the points in the scatterplot follow a clear form.

Correlation r measures the strength and direction of the linear association between 2 quantitative variables.
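As a concrete check on the r = .894338 quoted for the beer and BAC data, the sketch below recomputes it. It assumes the two rows of eight values belong to one group of 16 student volunteers, paired in the order listed; the function and variable names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Assumption: both rows of the slide are one sample of 16 students,
# with each beer count paired to the BAC printed beneath it.
beers = [5, 2, 9, 8, 3, 7, 3, 5,
         3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]

r_beers_bac = pearson_r(beers, bac)
print(round(r_beers_bac, 3))  # 0.894
```

The value is positive and close to 1, which is what a scatterplot of these points would suggest: BAC rises in a roughly straight-line pattern as the number of beers increases.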
If r > 0, the linear relationship has a positive association.
If r < 0, the linear relationship has a negative association.
-1 ≤ r ≤ 1. Perfect correlation is r = ±1, which occurs only when all the points lie exactly on a straight line.
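These properties of r are easy to see on small made-up data sets. The sketch below uses invented numbers (not taken from the slides) to show r = 1 and r = -1 for points exactly on a line, r = 0 for a strong but curved pattern, and the effect of a single outlier.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: points lying exactly on a straight line
x = [1, 2, 3, 4, 5]
y_rising = [2 * v + 1 for v in x]     # positive slope
y_falling = [10 - 3 * v for v in x]   # negative slope
r_rising = pearson_r(x, y_rising)     # 1, up to floating-point rounding
r_falling = pearson_r(x, y_falling)   # -1, up to floating-point rounding

# A perfect but curved (U-shaped) relationship: r misses it entirely
x_sym = [-2, -1, 0, 1, 2]
r_curved = pearson_r(x_sym, [v ** 2 for v in x_sym])  # 0

# One outlier added to the perfectly rising line flips the sign of r
r_with_outlier = pearson_r(x + [6], y_rising + [-20])

print(r_rising, r_falling, r_curved, r_with_outlier)
```

The last case illustrates why r is called not resistant: five points on a perfect rising line plus one extreme point give a negative correlation.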