Statistics: Linear Regression
May 7, 2009

1. Data: (x1, y1), ..., (xn, yn)  (n individuals, two variables x and y)

2. Least-squares regression line: ŷ = a + bx.

3. Residuals: y − ŷ. The "error" made in using ŷ to predict y.

4. Choose a and b to minimize the sum of the squares of the residuals.

5. Properties of the line:
   • The slope is b = r (sy / sx), where sx and sy are the standard deviations of x and y, and r is the correlation coefficient.
   • The line passes through the point (x̄, ȳ).

6. Explaining variation in y. Why aren't all the y values equal?
   (a) x "explains" some of the variation.
   (b) Natural variation ("error") explains the rest.
   (c) Total variation in y to be explained: SStotal = Σ (y − ȳ)².
   (d) Variation not explained by x: SSerror = Σ (y − ŷ)².
   (e) Variation explained by x: SSmodel = SStotal − SSerror.
   (f) Proportion of variation explained by x (often quoted as a percentage): R² = SSmodel / SStotal.

7. Describing the result of a linear regression: use either R² or b.
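The formulas above can be sketched in a few lines of standard-library Python. The data values below are made up purely for illustration; the code computes the slope via b = r·(sy/sx), recovers the intercept from the fact that the line passes through (x̄, ȳ), and then carries out the sum-of-squares decomposition from item 6.

```python
import math

# Hypothetical data (x1, y1), ..., (xn, yn) for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n

# Sample standard deviations sx, sy and correlation coefficient r
sx = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
sy = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

# Item 5: slope b = r * (sy / sx); the line passes through (xbar, ybar),
# so the intercept is a = ybar - b * xbar.
b = r * sy / sx
a = ybar - b * xbar

# Item 6: decompose the total variation in y.
yhat = [a + b * x for x in xs]          # predicted values
ss_total = sum((y - ybar) ** 2 for y in ys)
ss_error = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
ss_model = ss_total - ss_error
r_squared = ss_model / ss_total          # equals r**2 in simple linear regression

print(f"yhat = {a:.3f} + {b:.3f} x,  R^2 = {r_squared:.4f}")
```

A useful sanity check on the algebra: R² computed from the sum-of-squares decomposition agrees with the square of the correlation coefficient, which is why either R² or b (item 7) summarizes the fit.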