Psychology 282
Lecture #3 Outline
Simple Linear Regression (SLR)

Given variables X, Y. Sample of n observations.

In the study and use of correlation coefficients, X and Y are interchangeable. In regression analysis, one variable is defined as dependent (Y) and the other as independent (X).

SLR represents Y as a linear function of X. (If we have more than one IV, the method is called multiple linear regression, or MLR.) The relationship may be causal, but not necessarily.

Regression is used for various purposes:
• Explanation
• Prediction

Will study SLR in terms of both geometric and algebraic representations.

Geometric representation of SLR

Recall the scatterplot:

[Scatterplot of Y against X]

Consider representing or approximating the relationship between X and Y using a straight line:

[Scatterplot of Y against X with a fitted straight line]

Using this line, for any selected individual, we can obtain a predicted Y, called Ŷ.

Can then define the residual, or error in prediction, e, as the vertical deviation of a point from the line. Residuals can be defined for the entire sample. Note that different lines produce different residuals.

Consider the objective of choosing the best line: find the line that produces the smallest residuals for the sample.

Algebraic representation of SLR

Equation for a straight line:

    Ŷ = B0 + B_YX X

where Ŷ is the predicted value of Y.

B0 is the intercept: the value of Y where the line crosses the Y-axis, or the predicted value of Y when X = 0.

B_YX is the regression coefficient, or slope of the line. The slope is defined as the ratio of vertical change to horizontal change, ΔY/ΔX. This value represents the change in predicted Y corresponding to a 1-unit increase in X.

For any individual, given X and Y and an equation of the form Ŷ = B0 + B_YX X, we can obtain the predicted value Ŷ and the residual e = Y − Ŷ.

Can obtain residuals for each of the n individuals in the sample.

Problem: Would like to find the equation Ŷ = B0 + B_YX X that makes these residuals small, thus providing the best approximation of the observed Y values.
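The predicted-value and residual definitions above can be sketched in Python for one candidate line. (The data, intercept, and slope here are made-up illustrations, not values from the lecture.)

```python
# Sketch: predicted values and residuals for one candidate line.
# Hypothetical data and coefficients, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [82.0, 90.0, 95.0, 104.0, 110.0]

b0, b_yx = 75.0, 7.0  # candidate intercept B0 and slope B_YX

y_hat = [b0 + b_yx * xi for xi in x]                # predicted values, Y-hat
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # e = Y - Y-hat

sse = sum(e ** 2 for e in residuals)  # sum of squared residuals, to be minimized
print(y_hat)      # [82.0, 89.0, 96.0, 103.0, 110.0]
print(residuals)  # [0.0, 1.0, -1.0, 1.0, 0.0]
print(sse)        # 3.0
```

Trying a different B0 or B_YX changes every residual, which is why an aggregate criterion over the whole sample is needed to compare lines.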
Need to define an aggregate measure of the size of the residuals in the sample. Cannot just sum them to obtain ∑e, because positive and negative residuals will cancel out.

Consider squaring the residuals, then summing: ∑e². This value becomes smaller as the selected equation, and corresponding line, approximate the relationship better.

Objective: Given scores for n individuals on X and Y, find B0 and B_YX such that ∑e² is as small as possible, where e = Y − Ŷ and Ŷ = B0 + B_YX X.

This is the principle of least squares, and ∑e² is the sum of squared residuals, to be minimized.

Minimizing the sum of squared residuals:

Define θ = ∑e² = ∑(Y − Ŷ)² = ∑(Y − (B0 + B_YX X))² = ∑(Y − B0 − B_YX X)²

Problem: Find B0 and B_YX that minimize this quantity.

Solution obtained by basic calculus:
• Obtain the partial derivatives of θ with respect to B0 and B_YX.
• Set these two partial derivatives equal to zero.
• Solve this system of two equations for B0 and B_YX.

Result:

    B_YX = (∑XY − n X̄ Ȳ) / (∑X² − n X̄²)

and

    B0 = Ȳ − B_YX X̄

These equations provide the linear regression coefficient and intercept for SLR that minimize the sum of squared residuals. Any other intercept and slope would produce a higher ∑e².

See applet: http://hadm.sph.sc.edu/COURSES/J716/demos/LeastSquares/LeastSquaresDemo.html

• Relationship between the regression coefficient and the correlation coefficient:

    B_YX = r (sd_Y / sd_X)

Given r, we can compute B_YX easily.

• Expressing the regression equation in terms of Y:

Given Ŷ = B0 + B_YX X and residuals e = Y − Ŷ, then Y = Ŷ + e, which implies:

    Y = B0 + B_YX X + e

• Relationships among X, Y, Ŷ, and e:

    r_XY is observed from the data.
    r_XŶ = 1 (Ŷ is an exact linear function of X; the correlation is −1 when the slope is negative)
    r_Xe = 0
    r_YŶ = r_XY (in absolute value)

SLR for standardized variables

The development of SLR above was for raw scores on X and Y.
Suppose we wished to do SLR after standardizing the IV and DV:

    z_X = (X − X̄) / sd_X
    z_Y = (Y − Ȳ) / sd_Y

Means are zero: z̄_X = 0 and z̄_Y = 0.

Variances are 1.0: ∑z_X² / (n − 1) = 1 and ∑z_Y² / (n − 1) = 1.

Regression equation:

    ẑ_Y = B0* + B*_zYzX z_X

Residuals:

    e_zY = z_Y − ẑ_Y

Wish to find the SLR equation that will make the residuals as small as possible. Apply the principle of least squares.

Problem: Find B0* and B*_zYzX that will minimize ∑e²_zY.

Solution: Use the solution for the raw-score regression coefficient and intercept, converting to standardized variables.

Results:

Standardized regression coefficient:

    B*_zYzX = (∑z_X z_Y − n z̄_X z̄_Y) / (∑z_X² − n z̄_X²)
            = (∑z_X z_Y − n(0)(0)) / ((n − 1) − n(0)²)
            = ∑z_X z_Y / (n − 1)
            = r_XY

Standardized intercept:

    B0* = z̄_Y − B*_zYzX z̄_X = 0 − B*_zYzX (0) = 0

In SLR for standardized variables, the intercept will be zero, meaning the regression line must pass through the origin. The slope, or standardized regression coefficient, will be equal to the Pearson correlation coefficient.

SLR for standard scores can thus be represented as:

    ẑ_Y = r_XY z_X
    z_Y = r_XY z_X + e_zY

Thus, the Pearson correlation coefficient has two distinct interpretations and uses:
• Measure of linear relationship.
• Standardized regression coefficient in SLR.

SLR using raw vs. standard scores: the choice is based on context and purpose. Desire to predict a raw score, vs. desire to predict relative standing with respect to the mean.

Relationship:

    B_YX = r_XY (sd_Y / sd_X)        r_XY = B_YX (sd_X / sd_Y)

Regression toward the mean

Standard scores represent deviation from the mean, in sd units.

SLR for standard scores: ẑ_Y = r_XY z_X.

Bounds on the value of r_XY: |r_XY| ≤ 1.0.

Implies that: |ẑ_Y| ≤ |z_X|.

The predicted score on Y must be relatively closer to the mean than was the observed score on X. This is regression toward the mean, a statistical phenomenon associated with regression and least squares. Substantive interpretation of this phenomenon is not justified.
Measures of strength of association

In SLR we are interested in the strength of association between the IV and DV. This can be measured by r_XY and by ∑e². Consider other possible measures, using the notion of partitioning the variance in Y: the variance in Y is partially accounted for by X, with the remainder unaccounted for. Determine these portions.

Consider first the standardized SLR:

    z_Y = r_XY z_X + e_zY
    z_Y = ẑ_Y + e_zY
    Var(z_Y) = Var(ẑ_Y) + Var(e_zY)
    Var(z_Y) = Var(r_XY z_X) + Var(e_zY)
    1 = r²_XY Var(z_X) + Var(e_zY)
    1 = r²_XY + Var(e_zY)

This expression shows that the variance in z_Y, which is 1.0, can be partitioned into two parts:
• Variance accounted for by X, which is equal to r².
• Variance not accounted for by X, which is equal to (1 − r²).

Thus, r² indicates the proportion of variance in the DV accounted for by its linear relationship with the IV. This is an important device for interpreting the strength of relationship implied by r.

We can define a similar partitioning of the variance in Y using the raw variance. Divide that variance into two portions:

Variance in Y accounted for by its linear relationship with X:

    sd²_Ŷ = r² sd²_Y

Variance in Y not accounted for by its linear relationship with X:

    sd²_(Y−Ŷ) = (1 − r²) sd²_Y

In regression analysis we are often interested in the standard deviation of the residuals. From this last term we can define the standard deviation of the raw residuals:

    sd_(Y−Ŷ) = √((1 − r²) sd²_Y)

This is a sample statistic. We often wish to estimate the standard deviation of the residuals that would be obtained in the population. This value is called the standard error of estimate (SE). The sample value of sd_(Y−Ŷ) tends to underestimate this value; i.e., it is a biased estimate of the true standard error of estimate. An unbiased estimate of SE can be obtained by:

    SE = √( ∑(Y − Ŷ)² / (n − 2) )

This value is provided by regression software and is used to construct confidence intervals for predicted scores and for other purposes.