ST430: Introduction to Regression Analysis
Chapter 9, Sections 1-4
Luo Xiao
November 16, 2015

Special Topics

Some complex model-building problems can be handled with the linear regression approach covered up to this point: for example, piecewise regression, including piecewise linear regression and spline regression. Others require more general nonlinear approaches: for example, logistic and probit regression for binary responses.

Piecewise regression

Sometimes theory suggests that a relationship is a straight line, but with different slopes in different parts of the data. For example,

    E(Y) = \beta_0 + \beta_1 x,       x \le \xi,
    E(Y) = \beta_0^* + \beta_1^* x,   x > \xi.

Usually we want the model to be continuous at $x = \xi$:

    \beta_0 + \beta_1 \xi = \beta_0^* + \beta_1^* \xi.

Instead of imposing this as a constraint, the model is usually reparameterized as

    E(Y) = \beta_0 + \beta_1 x + \beta_2 \max(x - \xi, 0).

Example 9.1
Y: compressive strength of concrete; X: ratio of water to cement (by mass). Note that compressive strength decreases as the ratio increases.

R code (output in "output1.txt"):

setwd("~/Dropbox/teaching/2015Fall/R_datasets/Exercises&Exampl
load("CEMENT.RData")
with(CEMENT, plot(STRENGTH ~ RATIO, pch = 20))
lq = lm(STRENGTH ~ RATIO + I(RATIO^2), CEMENT)
summary(lq)
curve(predict(lq, data.frame(RATIO = x)), add = TRUE)

[Figure: scatter plot of STRENGTH against RATIO with the fitted quadratic curve.]

In the example, suppose theory suggests that $\xi = 70$ is the critical ratio.

R code (output in "output2.txt"):

setwd("~/Dropbox/teaching/2015Fall/R_datasets/Exercises&Exampl
load("CEMENT.RData")
with(CEMENT, plot(STRENGTH ~ RATIO, pch = 20))
xi = 70
lp70 = lm(STRENGTH ~ RATIO + pmax(RATIO - xi, 0), CEMENT)
summary(lp70)
curve(predict(lp70, data.frame(RATIO = x)), add = TRUE, col = "red")

Note the slightly higher $R^2$ and $R_a^2$ (adjusted $R^2$) compared with the quadratic fit.

[Figure: scatter plot of STRENGTH against RATIO with the fitted piecewise linear curve, with a change of slope at RATIO = 70.]

Sometimes the lines are not constrained to be continuous.

Example 9.2
Y: reading test score; X: age of children/young adults.

R code:

setwd("~/Dropbox/teaching/2015Fall/R_datasets/Exercises&Exampl
load("READSCORES.RData")
with(READSCORES, plot(SCORE ~ AGE, pch = 20))

[Figure: scatter plot of SCORE against AGE.]

Models

We may fit a quadratic regression. Alternatively, to fit separate lines for AGE ≤ 14 and AGE > 14, include the interaction of AGE and the indicator variable for AGE > 14 (the sketch below shows how this formula expands). The piecewise regression model has a higher $R_a^2$ than the quadratic model.
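Here is a minimal sketch (using a small made-up data frame, not the READSCORES data) of how R expands the formula AGE * (AGE > 14) into a design matrix with a separate intercept and slope on each side of 14:

toy = data.frame(AGE = c(10, 12, 14, 16, 18))
model.matrix(~ AGE * (AGE > 14), toy)
# Four columns: the intercept, AGE, the indicator of AGE > 14, and
# their product. The last two shift the intercept and the slope for
# AGE > 14, so the two fitted lines are not forced to meet at 14.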
R code:

setwd("~/Dropbox/teaching/2015Fall/R_datasets/Exercises&Exampl
load("READSCORES.RData")
with(READSCORES, plot(SCORE ~ AGE, pch = 20))
l14 = lm(SCORE ~ AGE * (AGE > 14), READSCORES)
lq = lm(SCORE ~ AGE + I(AGE^2), READSCORES)
curve(predict(l14, data.frame(AGE = x)), to = 14, add = TRUE, col = "blue")
curve(predict(l14, data.frame(AGE = x)), from = 15, add = TRUE, col = "blue")
curve(predict(lq, data.frame(AGE = x)), add = TRUE, col = "red")

[Figure: scatter plot of SCORE against AGE with the piecewise linear fit (blue) and the quadratic fit (red).]

Inverse prediction

One use of a regression model $E(Y) = \beta_0 + \beta_1 X$ is to predict $Y$ for a new value of $X$, say $x_0$. Sometimes, instead, we observe a new $y_0$ and want to make an inference about the corresponding $x_0$. Often $X$ is expensive to measure but $Y$ is cheap; the relationship is determined from a calibration dataset.

Because $y_0 = \beta_0 + \beta_1 x_0 + \epsilon_0$, we can solve for $x_0$:

    x_0 = (y_0 - \beta_0 - \epsilon_0) / \beta_1.

We do not observe $\epsilon_0$, but we know that $E(\epsilon_0) = 0$. Similarly, we do not know $\beta_0$ and $\beta_1$, but we have the estimates $\hat\beta_0$ and $\hat\beta_1$.

This suggests the estimate

    \hat{x}_0 = (y_0 - \hat\beta_0) / \hat\beta_1,

which is known as inverse prediction. An approximate $100(1-\alpha)\%$ prediction interval for $x_0$ is

    \hat{x}_0 \pm t_{\alpha/2} \frac{s}{\hat\beta_1} \sqrt{1 + \frac{1}{n} + \frac{(\hat{x}_0 - \bar{x})^2}{SS_{xx}}}.

An alternative approach is to fit the inverse regression

    X = \gamma_0 + \gamma_1 Y + \epsilon,

and then use the standard prediction interval

    \hat{x}_0 \pm t_{\alpha/2} \, s_{x|y} \sqrt{1 + \frac{1}{n} + \frac{(y_0 - \bar{y})^2}{SS_{yy}}},

where $\hat{x}_0 = \hat\gamma_0 + \hat\gamma_1 y_0$. This is not supported by the standard theory because, in the calibration data, $X$ is fixed and $Y$ is random; nevertheless, it has been shown to work well in practice.
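To make the formulas concrete, here is a minimal sketch (simulated calibration data, not an example from the text) that computes the inverse prediction and its approximate 95% interval by hand:

set.seed(430)                       # arbitrary seed; illustration only
n = 30
x = seq(1, 10, length.out = n)      # fixed calibration design
y = 5 + 2 * x + rnorm(n, sd = 1)
fit = lm(y ~ x)
b = unname(coef(fit))               # b[1] = beta0-hat, b[2] = beta1-hat
s = sigma(fit)                      # residual standard error
y0 = 20                             # new, cheaply measured response
x0.hat = (y0 - b[1]) / b[2]         # inverse prediction of x0
SSxx = sum((x - mean(x))^2)
me = qt(0.975, df = n - 2) * (s / b[2]) *
  sqrt(1 + 1/n + (x0.hat - mean(x))^2 / SSxx)
c(lower = x0.hat - me, estimate = x0.hat, upper = x0.hat + me)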
Weighted least squares

Recall the linear regression equation

    E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k.

We have estimated the parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ by minimizing the sum of squared residuals

    SSE = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2
        = \sum_{i=1}^n [Y_i - (\hat\beta_0 + \hat\beta_1 X_{i,1} + \hat\beta_2 X_{i,2} + \cdots + \hat\beta_k X_{i,k})]^2.

Sometimes we want to give some observations more weight than others. We achieve this by minimizing a weighted sum of squares:

    WSSE = \sum_{i=1}^n w_i (Y_i - \hat{Y}_i)^2
         = \sum_{i=1}^n w_i [Y_i - (\hat\beta_0 + \hat\beta_1 X_{i,1} + \hat\beta_2 X_{i,2} + \cdots + \hat\beta_k X_{i,k})]^2.

The resulting $\hat\beta$s are called weighted least squares (WLS) estimates, and the WLS residuals are $\sqrt{w_i}\,(Y_i - \hat{Y}_i)$.

Why use weights?

Suppose that the variance is not constant: $\mathrm{var}(Y_i) = \sigma_i^2$. If we use weights

    w_i \propto 1 / \sigma_i^2,

the WLS estimates have smaller standard errors than the ordinary least squares (OLS) estimates. That is, the OLS estimates are inefficient relative to the WLS estimates.

In fact, using weights proportional to $1/\sigma_i^2$ is optimal: no other weights give smaller standard errors. When you specify weights, regression software calculates standard errors on the assumption that the weights are proportional to $1/\sigma_i^2$.
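As an illustration, here is a minimal simulation sketch (made-up data, not from the text) in which the error standard deviation grows with x, so OLS is inefficient; the WLS slope is reported with a smaller standard error:

set.seed(1)                          # arbitrary seed; illustration only
n = 200
x = runif(n, 1, 10)
y = 2 + 3 * x + rnorm(n, sd = x)     # var(Y_i) = x_i^2 grows with x
ols = lm(y ~ x)
wls = lm(y ~ x, weights = 1 / x^2)   # w_i proportional to 1/sigma_i^2
summary(ols)$coefficients["x", "Std. Error"]
summary(wls)$coefficients["x", "Std. Error"]   # smaller than the OLS value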
How to choose the weights

If you have many replicates for each unique combination of the Xs, use the sample variance $s_i^2$ to estimate $\mathrm{var}(Y \mid X_i)$. Often you will not have enough replicates to give good variance estimates; the text suggests grouping observations that are "nearest neighbors". Alternatively, you can use the regression diagnostic plots.

Example 9.4: Florida road contracts
Y: winning (lowest) bid price; X: length of new road.

R code (run the code!):

setwd("~/Dropbox/teaching/2015Fall/R_datasets/Exercises&Exampl
load("DOT11.RData")
fit = lm(BIDPRICE ~ LENGTH, DOT11)
plot(fit)

The first plot uses unweighted residuals $y_i - \hat{y}_i$, but the others use weighted residuals. Also recall that the plots label them "Standardized residuals",

    z_i^* = (y_i - \hat{y}_i) / (s \sqrt{1 - h_i}),

which are called Studentized residuals in the text. With weights, the standardized residuals are

    z_i^* = \sqrt{w_i} \, (y_i - \hat{y}_i) / (s \sqrt{1 - h_i}).

Note that the "Scale-Location" plot shows an increasing trend. Try weights that are proportional to powers of x = LENGTH:

fit1 = lm(BIDPRICE ~ LENGTH, DOT11, weights = 1/LENGTH)
plot(fit1)
fit2 = lm(BIDPRICE ~ LENGTH, DOT11, weights = 1/LENGTH^2)
plot(fit2)

summary() shows that the fitted equations are all very similar, and weights = 1/LENGTH gives the smallest standard errors.

Note

Standard errors are computed as if the weights were known constants. But for the "nearest neighbors" method in the textbook, the weights are based on a preliminary OLS fit. Theory shows that in large samples the standard errors are also valid with estimated weights.

Note

When you specify weights $w_i$, lm() fits the model

    \sigma_i^2 = \sigma^2 / w_i,

and the "Residual standard error" $s$ is an estimate of $\sigma$:

    s^2 = \sum_{i=1}^n w_i (y_i - \hat{y}_i)^2 / (n - k - 1).

If you change the weights, the meaning of $\sigma$ (and of $s$) changes, so you cannot compare the residual standard errors for different weighting schemes (cf. page 488, foot).
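Finally, a minimal sketch (simulated data with hypothetical weights, not from the text) checking that the residual standard error reported by lm() matches the formula above:

set.seed(2)                                  # arbitrary seed; illustration only
n = 50
x = runif(n)
w = 1 / (1 + x)                              # hypothetical weights
y = 1 + 2 * x + rnorm(n, sd = sqrt(1 / w))   # var(y_i) = sigma^2 / w_i, sigma = 1
fit = lm(y ~ x, weights = w)
s2 = sum(w * resid(fit)^2) / (n - 1 - 1)     # k = 1 predictor
c(sigma(fit)^2, s2)                          # the two values agree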