S1 – Correlation
Chapter 5 A: Measuring Correlation

Measure the strength of any connection between two variables.

Data that involves 2 variables = bivariate data.

What do the graphs tell you about any connection between the two variables?

Snails: Snails with larger foot areas weigh more.
Coins: Generally older coins weigh less, although the relationship is weak.
Reactions: Those with a higher heart rate generally have a shorter reaction time.
Blood pressure: Little relationship between blood pressure and weight.

[Scatter diagram of y against x]
Would you say that this data has a poor, good, excellent or perfect positive correlation?

We need a measure of how well correlated data sets are.

If we look at the diagram again we can divide it into 4 quadrants using the lines $x = \bar{x}$ and $y = \bar{y}$: quadrant 1 is top right, quadrant 2 is top left, quadrant 3 is bottom left and quadrant 4 is bottom right.

If we treat the quadrant lines as our new axes, the coordinates of the points are now all of the form:

$(x - \bar{x},\ y - \bar{y})$

Now, let us consider the sign of the product $(x - \bar{x})(y - \bar{y})$ for every point in each quadrant:

Quad   (x - x̄)   (y - ȳ)   (x - x̄)(y - ȳ)
1         +         +            +
2         -         +            -
3         -         -            +
4         +         -            -

So if we have a positive correlation:
• The majority of the points lie in the first and third quadrants.
• The sum of these terms will be positive and large.

For negative correlation:
• The majority of the points lie in the second and fourth quadrants.
• The sum of these terms will be negative and large.

For no or very weak correlation:
• The points are scattered randomly in all four quadrants.
• The values of $(x - \bar{x})(y - \bar{y})$ will be both positive and negative and will mostly cancel each other out.
• The sum of the values will be very small.

$S_{xy} = \sum (x - \bar{x})(y - \bar{y})$

It appears that the sum could be used to measure how strong a correlation is. The actual sum itself is not very useful since it does not take into account:
• The number of data items
• The units of x and y, and hence the spread of the data.
But a good starting point!

The sign of 'the sum' $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$ tells us what type of correlation there is between x and y.
$S_{xy} > 0$: positive correlation
$S_{xy} < 0$: negative correlation
$S_{xy} \approx 0$: no correlation

Pearson's product moment correlation coefficient (PMCC)

Takes the number of data items and the spread of the data into account by standardising the values $(x - \bar{x})$ and $(y - \bar{y})$. This is done in a similar way to standardising the normal distribution (using the standard deviation).

$r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

where

$S_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$

$S_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$

$S_{xy} = \sum xy - \dfrac{\sum x \sum y}{n}$

The value of r always lies in the interval: $-1 \le r \le 1$

If
r = 1 THEN perfect positive correlation
r = -1 THEN perfect negative correlation
r = 0 THEN the two variables are uncorrelated

Example 1
Calculate the Pearson's product moment correlation coefficient.

x   85   50   75   56   65
y   70   66   68   40   50

$\sum x = 331$
$\sum y = 294$
$\sum x^2 = 22711$
$\sum y^2 = 17980$
$\sum xy = 19840$

Work through Example 1, p73.
Exercise A, page 73, Numbers 1 and 2.
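
As a check on the arithmetic in Example 1, here is a minimal Python sketch that applies the summary formulas for $S_{xx}$, $S_{yy}$, $S_{xy}$ and $r$ to the five data pairs. The function name `pmcc` and the variable names are illustrative choices, not part of the textbook; the data values are those given in the example.

```python
from math import sqrt

# Data pairs from Example 1.
x = [85, 50, 75, 56, 65]
y = [70, 66, 68, 40, 50]


def pmcc(xs, ys):
    """Return (S_xx, S_yy, S_xy, r) using the summary-statistic formulas.

    'pmcc' is a hypothetical helper name chosen for this sketch.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # S_xx = sum(x^2) - (sum x)^2 / n, and similarly for S_yy.
    s_xx = sum(v * v for v in xs) - sum_x ** 2 / n
    s_yy = sum(v * v for v in ys) - sum_y ** 2 / n
    # S_xy = sum(xy) - (sum x)(sum y) / n.
    s_xy = sum(a * b for a, b in zip(xs, ys)) - sum_x * sum_y / n
    # r = S_xy / sqrt(S_xx * S_yy).
    r = s_xy / sqrt(s_xx * s_yy)
    return s_xx, s_yy, s_xy, r


s_xx, s_yy, s_xy, r = pmcc(x, y)
print(f"S_xx = {s_xx:.1f}, S_yy = {s_yy:.1f}, S_xy = {s_xy:.1f}, r = {r:.3f}")
# prints: S_xx = 798.8, S_yy = 692.8, S_xy = 377.2, r = 0.507
```

With these values $S_{xy}$ is positive, so the correlation is positive, and r is roughly 0.51, which lies well inside the interval from -1 to 1.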