J.M. Hughes-Oliver, ST708 Applied Least Squares, Fall 2006

Remedial Measures for Collinearity

Impact of collinearity wrt inferential goals:

• causes variance inflation of estimated parameters
  “. . . collinearity creates serious problems if the purpose of the regression is to understand the process, to identify important variables in the process, or to obtain meaningful estimates of the regression coefficients.” (Rawlings, Pantula, Dickey, 1998, p. 458)
• ineffective determination of “relative importance” of explanatory variables
  “The best recourse to the collinearity problem when the objective is to assign relative importance is to recognize that the data are inadequate for the purpose and obtain better data, perhaps from controlled experiments.” (Rawlings, Pantula, Dickey, 1998, p. 446)
• no effect on precision of estimated responses (and predictions) at observed points in the X-space
  “If the regression analysis is intended solely for prediction of the dependent variable, the presence of near singularities in the data does not create serious problems as long as certain very important conditions are met . . . ” (Rawlings, Pantula, Dickey, 1998, p. 457)
• causes variance inflation of estimated responses (and predictions) at unobserved points in the X-space

OLS estimator is BLUE:

• collinearity maintains unbiasedness
• collinearity causes increase in variances

Sacrifice unbiasedness to get smaller variance ⇒ Biased Regression:

• improved (wrt MSE) estimation of parameters
• improved (wrt MSE) estimation of estimated responses at unobserved points in certain regions of the X-space
• (still) ineffective determination of “relative importance” of explanatory variables

Some biased regression approaches:

• Principal Components Regression
• Ridge Regression
• Partial Least Squares

Principal Components Regression

• Assuming $X$ is $n \times (p+1)$ with a column of ones,

\[
\begin{aligned}
Y &= X\beta + \epsilon \\
  &= 1_n\beta_0 + X^{*}\beta^{*} + \epsilon \\
  &= 1_n\beta_0 + \bar{X}^{*}\beta^{*} + X_c\beta^{*} + \epsilon \\
  &= \underbrace{1_n\beta_0 + \bar{X}^{*}\beta^{*}}_{1_n\bar{Y}}
     + \underbrace{X_c D^{-1/2}}_{Z}\,\underbrace{D^{1/2}\beta^{*}}_{\delta} + \epsilon \\
  &= 1_n\bar{Y} + Z\delta + \epsilon
\end{aligned}
\]

  where $X^{*}$ is $n \times p$, $X = [1_n \; X^{*}]$, $\beta' = (\beta_0 \; \beta^{*\prime})$;
  where $\bar{X}^{*} = [1_n\bar{X}_1 \; 1_n\bar{X}_2 \; \cdots \; 1_n\bar{X}_p]$ is $n \times p$, and $X_c = X^{*} - \bar{X}^{*}$ is the centered version of $X^{*}$;
  where $D = \mathrm{diag}(X_c'X_c)$;
  and $Z$ is the centered and scaled version of $X$, omitting the intercept column.

• $Z = U L^{1/2} V'$ by singular value decomposition
• $W = ZV$ converts the $p$ columns of $Z$ into $p$ principal components corresponding to eigenvalues $\lambda_1 \ge \cdots \ge \lambda_p$.
  – columns of $W$ are orthogonal
  – reasonable to drop principal components with “small” eigenvalues
• $Y = 1_n\bar{Y} + Z\delta + \epsilon \iff Y = 1_n\bar{Y} + \underbrace{ZV}_{W}\,\underbrace{V'\delta}_{\gamma} + \epsilon = 1_n\bar{Y} + W\gamma + \epsilon$

Steps:

1. Regress $Y$ on $W$ to get $\hat{\gamma}$.
2. Eliminate all principal components that (a) have condition index > 10 and (b) have nonsignificant coefficients $\gamma_j$. End-product is $V_{(g)}$ and $\hat{\gamma}_{(g)}$.
3. Convert back to centered and scaled variables:
   \[ \delta^{+}_{(g)} = V_{(g)}\hat{\gamma}_{(g)}, \qquad \mathrm{var}(\delta^{+}_{(g)}) = V_{(g)} L_{(g)}^{-1} V_{(g)}' \, MSE, \]
   where $MSE$ is from the reduced model (this is what SAS does).
4. Convert back to original variables:
   \[ \beta^{*+}_{(g)} = D^{-1/2}\delta^{+}_{(g)} = D^{-1/2} V_{(g)}\hat{\gamma}_{(g)} \]

Why is this called biased regression?

In fact,
\[ \delta^{+}_{(g)} = V_{(g)}V_{(g)}'\,\hat{\delta}_{OLS} \;\Rightarrow\; E(\delta^{+}_{(g)}) = V_{(g)}V_{(g)}'\,\delta \ne \delta. \]
\[ \text{bias} = E(\delta^{+}_{(g)} - \delta) = -V_{(s)}\gamma_{(s)}, \]
where $V_{(s)}$ is the set of principal components dropped earlier and $\gamma_{(s)}$ is the corresponding set of coefficients. By only omitting nonsignificant $\gamma_j$’s in step 2, we ensure a small bias.

Algae Example–13-degree non-orthogonal case

Summary:

• VIF flags all explanatory variables. Drop them all?
• COLLINOINT flags 9 condition indices > 10, with all explanatory variables appearing in the flagged principal components
• Test if we can drop the bottom 9 principal components:
  \[ F = \frac{[(RMSE_9)^2(14+9) - (RMSE)^2(14)]/9}{(RMSE)^2} = 2.6809, \qquad p\text{-value} = \Pr(F_{9,14} > 2.6809) = .04781 \]
  Cannot drop bottom 9 principal components.
• Test if we can drop the bottom 8 principal components:
  \[ F = 0.3184, \qquad p\text{-value} = \Pr(F_{8,14} > 0.3184) = 0.94571 \]
  Drop bottom 8 principal components.
• After dropping bottom 8 principal components, get estimate as

                 intercept   day      day2      day3     day4      day5      day6 ... day13
  β+(g)          3.75067     .39956   −.01968   .00065   −.00023   −.00004   0 0 0 0 0 0 0 0
  s.e.(β+(g))    .05671      .02522   .00349    .00075   .00003    .00002    0 0 0 0 0 0 0 0

options nodate ls=85 ps=25 nonumber;

data algae;
  input day density @@;
  datalines;
1 .530 1 .184 2 1.183 2 .664 3 1.603 3 1.553 4 1.994 4 1.910
5 2.708 5 2.585 6 3.006 6 3.009 7 3.867 7 3.403 8 4.059 8 3.892
9 4.349 9 4.367 10 4.699 10 4.551 11 4.983 11 4.656 12 5.100 12 4.754
13 5.288 13 4.842 14 5.374 14 4.969
;
title1 height=.15in "Algae Example: Polynomial Regression";
run;

/* Center day at 7.5 and build powers 2 through 13 */
data algae2;
  set algae;
  day=day-7.5;
  day2=day*day; day3=day2*day; day4=day3*day; day5=day4*day;
  day6=day5*day; day7=day6*day; day8=day7*day; day9=day8*day;
  day10=day9*day; day11=day10*day; day12=day11*day; day13=day12*day;
run;

/* 13-degree polynomial, based on "day-7.5" variable */
/* Principal components regression */
proc reg data=algae2 outest=fixcoll noprint;
  title2 height=.15in "13-degree polynomial around day 7.5";
  title3 height=.15in "Principal Components Regression";
  model density=day day2 day3 day4 day5 day6 day7 day8 day9 day10
        day11 day12 day13 / vif collin pcomit=1 to 12 edf outseb;
run;
proc print data=fixcoll;
run;

/* Keep the full-model row (first row) and the IPC rows; carry the full-model MSE and error df */
data testdrop;
  set fixcoll;
  if _n_=1 or _type_="IPC";
  if _n_=1 then do;
    mse0=_rmse_*_rmse_;
    edf0=_edf_;
  end;
  retain mse0 edf0;
run;

/* Partial F test for dropping the bottom j principal components */
data testdrop;
  set testdrop;
  j=_n_-1;
  edf=edf0+j;
  ftest=(_rmse_*_rmse_*edf - mse0*edf0)/( j*mse0 );
  pvalue=1-cdf('F',ftest,j,edf0);
run;
proc print;
run;

SAS Output

Algae Example: Polynomial Regression
13-degree polynomial around day 7.5
Principal Components Regression

Obs _MODEL_ _TYPE_ _DEPVAR_ _RIDGE_ _PCOMIT_ _RMSE_ Intercept day day2 day3 day4 day5 day6 day7 day8 day9 day10 day11 day12 day13 density _IN_ _P_ _EDF_ _RSQ_
1 MODEL1 PARMS density . . 0.21287 3.83499 0.31061 -0.12773 0.12901 0.036240 -0.038954 -.004839719 0.004613589 0.000282030 -.000249945 -.000007271 0.000006163 0.000000068 -5.57374E-8 -1 13 14 14 0.99091
2 MODEL1 SEB density . . 0.21287 0.13527 0.26248 0.18359 0.22876 0.061370 0.061351 0.007760111 0.006891571 0.000436953 0.000361705 0.000011034 0.000008738 0.000000101 7.795563E-8 -1 . . . .
3 MODEL1 IPC density . 1 0.20919 3.83499 0.46060 -0.12773 -0.02351 0.036240 0.003337 -.004839719 -.000178178 0.000282030 0.000002015 -.000007271 0.000000077 0.000000068 -1.47597E-9 -1 . . . .
4 MODEL1 IPCSEB density .
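The partial F statistics quoted in the summary can be checked by hand from the `_RMSE_` values that PROC REG writes to the OUTEST= data set. The following is only an illustrative sketch in Python, not part of the original SAS program: the function name `partial_f` is hypothetical, and the inputs are the rounded values shown in the listing, so agreement with SAS is to about three decimals. It reproduces the statistic only; the p-values in the summary come from the SAS `cdf` function.

```python
def partial_f(rmse_reduced, rmse_full, edf_full, j):
    """Partial F for dropping the bottom j principal components.

    Follows the slide formula
        F = [RMSE_j^2 * (edf_full + j) - RMSE^2 * edf_full] / j / RMSE^2,
    where RMSE_j is the root MSE after omitting j components and
    RMSE, edf_full belong to the full 13-degree model.
    """
    sse_reduced = rmse_reduced ** 2 * (edf_full + j)  # error SS, reduced model
    sse_full = rmse_full ** 2 * edf_full              # error SS, full model
    return (sse_reduced - sse_full) / j / rmse_full ** 2


# Rounded values copied from the SAS listing (full model, PCOMIT=9, PCOMIT=8).
RMSE_FULL = 0.21287
EDF_FULL = 14

f9 = partial_f(0.27408, RMSE_FULL, EDF_FULL, 9)
f8 = partial_f(0.18461, RMSE_FULL, EDF_FULL, 8)

# Both should be close to the 2.6809 and 0.3184 reported by SAS.
print(round(f9, 3), round(f8, 3))
```

Because the statistic depends only on the two root mean squared errors and the error degrees of freedom, the same function can be applied to any other `_PCOMIT_` row of the listing.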
1 0.20919 0.13293 0.14753 0.18042 0.06516 0.060308 0.008712 0.007625813 0.000418654 0.000429392 0.000004170 0.000010843 0.000000173 0.000000100 3.140033E-9 -1 . . . . 5 MODEL1 IPC density . 2 0.20549 3.78899 0.46060 -0.02126 -0.02351 -0.002636 0.003337 0.000191853 -.000178178 -.000002818 0.000002015 -.000000079 0.000000077 0.000000002 -1.47597E-9 -1 . . . . 6 MODEL1 IPCSEB density . 2 0.20549 0.11142 0.14493 0.08104 0.06401 0.014064 0.008558 0.000797163 0.000411266 0.000010305 0.000004096 0.000000310 0.000000170 0.000000006 3.084627E-9 -1 . . . . 7 MODEL1 IPC density . 3 0.20058 3.78899 0.41038 -0.02126 0.00336 -0.002636 -0.000422 0.000191853 0.000004307 -.000002818 0.000000213 -.000000079 0.000000002 0.000000002 -1.1229E-10 -1 . . . . 8 MODEL1 IPCSEB density . 3 0.20058 0.10876 0.08840 0.07910 0.02022 0.013727 0.001187 0.000778107 0.000008473 0.000010059 0.000000523 0.000000303 0.000000004 0.000000006 2.64953E-10 -1 . . . . 9 MODEL1 IPC density . 4 0.19530 3.80343 0.41038 -0.03928 0.00336 0.000784 -0.000422 -.000005966 0.000004307 -.000000277 0.000000213 -.000000002 0.000000002 1.38218E-10 -1.1229E-10 -1 . . . . 10 MODEL1 IPCSEB density . 4 0.19530 0.09036 0.08607 0.03431 0.01969 0.002735 0.001156 0.000035186 0.000008250 0.000001179 0.000000510 0.000000006 0.000000004 6.29802E-10 2.57978E-10 -1 . . . . 11 MODEL1 IPC density . 5 0.19083 3.80343 0.43635 -0.03928 -0.00375 0.000784 0.000011 -.000005966 0.000001365 -.000000277 0.000000022 -.000000002 -5.7976E-11 1.38218E-10 -1.6296E-11 -1 . . . . 12 MODEL1 IPCSEB density . 5 0.19083 0.08829 0.05001 0.03352 0.00516 0.002673 0.000063 0.000034381 0.000002507 0.000001152 0.000000032 0.000000006 3.49858E-10 6.15397E-10 3.2638E-11 -1 . . . . 13 MODEL1 IPC density . 6 0.18640 3.78959 0.43635 -0.03051 -0.00375 0.000031 0.000011 0.000003618 0.000001365 0.000000049 0.000000022 -1.9966E-10 -5.7976E-11 -3.4562E-11 -1.6296E-11 -1 . . . . 14 MODEL1 IPCSEB density . 
6 0.18640 0.07187 0.04884 0.01259 0.00504 0.000297 0.000062 0.000006155 0.000002448 0.000000045 0.000000031 0.000000002 3.4172E-10 8.43127E-11 3.18788E-11 -1 . . . . 15 MODEL1 IPC density . 7 0.18541 3.78959 0.39956 -0.03051 0.00065 0.000031 -0.000042 0.000003618 -.000000790 0.000000049 -.000000005 -1.9966E-10 2.12317E-10 -3.4562E-11 1.14131E-11 -1 . . . . 16 MODEL1 IPCSEB density . 7 0.18541 0.07149 0.02533 0.01252 0.00075 0.000295 0.000017 0.000006122 0.000000185 0.000000045 0.000000002 0.000000002 1.50804E-10 8.38678E-11 5.50334E-12 -1 . . . . 17 MODEL1 IPC density . 8 0.18461 3.75067 0.39956 -0.01968 0.00065 -0.000233 -0.000042 -.000001891 -.000000790 0.000000011 -.000000005 0.000000001 2.12317E-10 3.97994E-11 1.14131E-11 -1 . . . . 18 MODEL1 IPCSEB density . 8 0.18461 0.05671 0.02522 0.00349 0.00075 0.000033 0.000017 0.000000240 0.000000184 0.000000014 0.000000002 5.04088E-10 1.50155E-10 1.46208E-11 5.47965E-12 -1 . . . . 19 MODEL1 IPC density . 9 0.27408 3.75067 0.27437 -0.01968 0.00460 -0.000233 0.000050 -.000001891 0.000000167 0.000000011 -.000000012 0.000000001 -5.5104E-10 3.97994E-11 -1.7135E-11 -1 . . . . 20 MODEL1 IPCSEB density . 9 0.27408 0.08419 0.01407 0.00518 0.00020 0.000048 0.000002 0.000000357 0.000000064 0.000000021 0.000000002 7.48362E-10 7.02193E-11 2.17058E-11 1.89009E-12 -1 . . . . 21 MODEL1 IPC density . 10 0.32105 3.57408 0.27437 -0.00346 0.00460 -0.000087 0.000050 -.000002107 0.000000167 -.000000050 -.000000012 -.000000001 -5.5104E-10 -2.7115E-11 -1.7135E-11 -1 . . . . 22 MODEL1 IPCSEB density . 10 0.32105 0.07360 0.01649 0.00067 0.00023 0.000017 0.000002 0.000000410 0.000000075 0.000000010 0.000000003 2.26086E-10 8.22544E-11 5.27968E-12 2.21404E-12 -1 . . . . 23 MODEL1 IPC density . 11 0.97296 3.57408 0.03958 -0.00346 0.00147 -0.000087 0.000040 -.000002107 0.000000986 -.000000050 0.000000024 -.000000001 5.61273E-10 -2.7115E-11 1.32346E-11 -1 . . . . 24 MODEL1 IPCSEB density . 
11 0.97296 0.22306 0.00585 0.00204 0.00022 0.000052 0.000006 0.000001243 0.000000146 0.000000029 0.000000003 6.85169E-10 8.29035E-11 1.60004E-11 1.95483E-12 -1 . . . .
25 MODEL1 IPC density . 12 1.00738 3.36007 0.03958 0.00000 0.00147 0.000000 0.000040 0 0.000000986 0 0.000000024 0 5.61273E-10 0 1.32346E-11 -1 . . . .
26 MODEL1 IPCSEB density . 12 1.00738 0.19038 0.00605 0.00000 0.00022 0.000000 0.000006 0 0.000000151 0 0.000000004 0 8.5836E-11 0 2.02398E-12 -1 . . . .

SAS Output

Algae Example: Polynomial Regression
13-degree polynomial around day 7.5
Principal Components Regression

Obs _MODEL_ _TYPE_ _DEPVAR_ _RIDGE_ _PCOMIT_ _RMSE_ Intercept day day2 day3 day4 day5 day6 day7 day8 day9 day10 day11 day12 day13 density _IN_ _P_ _EDF_ _RSQ_ mse0 edf0 j edf ftest pvalue
1 MODEL1 PARMS density . . 0.21287 3.83499 0.31061 -0.12773 0.12901 0.036240 -0.038954 -.004839719 0.004613589 0.000282030 -.000249945 -.000007271 0.000006163 6.765007E-8 -5.57374E-8 -1 13 14 14 0.99091 0.045313 14 0 14 . .
2 MODEL1 IPC density . 1 0.20919 3.83499 0.46060 -0.12773 -0.02351 0.036240 0.003337 -.004839719 -.000178178 0.000282030 0.000002015 -.000007271 0.000000077 6.765008E-8 -1.47597E-9 -1 . . . . 0.045313 14 1 15 0.4853 0.49743
3 MODEL1 IPC density . 2 0.20549 3.78899 0.46060 -0.02126 -0.02351 -0.002636 0.003337 0.000191853 -.000178178 -.000002818 0.000002015 -.000000079 0.000000077 1.704715E-9 -1.47597E-9 -1 . . . . 0.045313 14 2 16 0.4553 0.64335
4 MODEL1 IPC density . 3 0.20058 3.78899 0.41038 -0.02126 0.00336 -0.002636 -0.000422 0.000191853 0.000004307 -.000002818 0.000000213 -.000000079 0.000000002 1.704715E-9 -1.1229E-10 -1 . . . . 0.045313 14 3 17 0.3647 0.77955
5 MODEL1 IPC density .
4 0.19530 3.80343 0.41038 -0.03928 0.00336 0.000784 -0.000422 -.000005966 0.000004307 -.000000277 0.000000213 -.000000002 0.000000002 1.38218E-10 -1.1229E-10 -1 . . . . 0.045313 14 4 18 0.2879 0.88096 6 MODEL1 IPC density . 5 0.19083 3.80343 0.43635 -0.03928 -0.00375 0.000784 0.000011 -.000005966 0.000001365 -.000000277 0.000000022 -.000000002 -5.7976E-11 1.38218E-10 -1.6296E-11 -1 . . . . 0.045313 14 5 19 0.2540 0.93074 7 MODEL1 IPC density . 6 0.18640 3.78959 0.43635 -0.03051 -0.00375 0.000031 0.000011 0.000003618 0.000001365 0.000000049 0.000000022 -1.9966E-10 -5.7976E-11 -3.4562E-11 -1.6296E-11 -1 . . . . 0.045313 14 6 20 0.2225 0.96287 8 MODEL1 IPC density . 7 0.18541 3.78959 0.39956 -0.03051 0.00065 0.000031 -0.000042 0.000003618 -.000000790 0.000000049 -.000000005 -1.9966E-10 2.12317E-10 -3.4562E-11 1.14131E-11 -1 . . . . 0.045313 14 7 21 0.2760 0.95322 9 MODEL1 IPC density . 8 0.18461 3.75067 0.39956 -0.01968 0.00065 -0.000233 -0.000042 -.000001891 -.000000790 0.000000011 -.000000005 0.000000001 2.12317E-10 3.97994E-11 1.14131E-11 -1 . . . . 0.045313 14 8 22 0.3184 0.94571 10 MODEL1 IPC density . 9 0.27408 3.75067 0.27437 -0.01968 0.00460 -0.000233 0.000050 -.000001891 0.000000167 0.000000011 -.000000012 0.000000001 -5.5104E-10 3.97994E-11 -1.7135E-11 -1 . . . . 0.045313 14 9 23 2.6809 0.04781 11 MODEL1 IPC density . 10 0.32105 3.57408 0.27437 -0.00346 0.00460 -0.000087 0.000050 -.000002107 0.000000167 -.000000050 -.000000012 -.000000001 -5.5104E-10 -2.7115E-11 -1.7135E-11 -1 . . . . 0.045313 14 10 24 4.0593 0.00878 12 MODEL1 IPC density . 11 0.97296 3.57408 0.03958 -0.00346 0.00147 -0.000087 0.000040 -.000002107 0.000000986 -.000000050 0.000000024 -.000000001 5.61273E-10 -2.7115E-11 1.32346E-11 -1 . . . . 0.045313 14 11 25 46.2079 0.00000 13 MODEL1 IPC density . 12 1.00738 3.36007 0.03958 0.00000 0.00147 0.000000 0.000040 0 0.000000986 0 0.000000024 0 5.61273E-10 0 1.32346E-11 -1 . . . . 
0.045313 14 12 26 47.3571 0.00000
(The last two columns of this listing are ftest and pvalue.)

Ridge Regression

• Assuming $X$ is $n \times (p+1)$ with a column of ones, the same centering and scaling as for principal components regression gives
  \[ Y = X\beta + \epsilon = 1_n\bar{Y} + Z\delta + \epsilon, \qquad Z = X_c D^{-1/2}, \quad \delta = D^{1/2}\beta^{*}, \]
  where $X = [1_n \; X^{*}]$, $\beta' = (\beta_0 \; \beta^{*\prime})$, $X_c = X^{*} - \bar{X}^{*}$ is the centered version of the $n \times p$ matrix $X^{*}$, $D = \mathrm{diag}(X_c'X_c)$, and $Z$ is the centered and scaled version of $X$, omitting the intercept column.
• Ridge Estimator from Ridge Factor $k \ge 0$ is
  \[ \tilde{\delta}(k) = (Z'Z + kI)^{-1} Z'Y; \qquad \tilde{\beta}(k) = \begin{pmatrix} \tilde{\beta}_0(k) \\ \tilde{\beta}^{*}(k) \end{pmatrix}, \quad \tilde{\beta}^{*}(k) = D^{-1/2}\tilde{\delta}(k), \quad \tilde{\beta}_0(k) = \bar{Y} - \bar{x}^{*\prime}\tilde{\beta}^{*}(k), \]
  where $\bar{x}^{*}$ is the $p$-vector of column means of $X^{*}$, so that $\bar{X}^{*} = 1_n\bar{x}^{*\prime}$.
• How to choose $k$?
  – $k = p \cdot MSE \,/\, \sum_{j=1}^{p} [\tilde{\delta}(0)_j]^2$ (Hoerl, Kennard, Baldwin, 1975)
  – Ridge Trace: plot $\tilde{\beta}(k)_j$ versus $k$ for each $j = 1, \ldots, p$. Select $k$ when estimates stop changing rapidly (not when get a flat line!)
  – When $VIF_j$ stabilizes wrt $k$, for each $j = 1, \ldots, p$.

Why is this called biased regression?

\[ \tilde{\delta}(k) = (Z'Z + kI)^{-1}(Z'Z)\,\hat{\delta}_{OLS} \]
\[ \Rightarrow\; \text{bias} = E[\tilde{\delta}(k) - \delta] = \left[(Z'Z + kI)^{-1} Z'Z - I\right]\delta = -k\,(Z'Z + kI)^{-1}\delta, \]
so that bias increases as Ridge Factor $k$ increases.

Variance decreases as Ridge Factor $k$ increases:
\[ \mathrm{var}[\tilde{\delta}(k)] = \sigma^2 (Z'Z + kI)^{-1}(Z'Z)(Z'Z + kI)^{-1} \]

Why is this called a shrinkage estimator?

• $\tilde{\delta}(k) \to 0$ as $k$ increases
• Bayesian interpretation:
  – prior is $\delta \sim \left(0, \tfrac{\sigma^2}{k} I\right)$
  – large $k$ indicates strong belief that $\delta \approx 0$; prior dominates
  – small $k$ indicates little prior knowledge; data dominates

Algae Example–13-degree non-orthogonal case

Summary:

• VIF flags all explanatory variables. Drop them all?
• COLLINOINT flags 9 condition indices > 10, with all explanatory variables appearing in the flagged principal components
• PCR: drop bottom 8 principal components, get estimate as

                 intercept   day      day2      day3     day4      day5      day6 ... day13
  β+(g)          3.75067     .39956   −.01968   .00065   −.00023   −.00004   0 0 0 0 0 0 0 0
  s.e.(β+(g))    .05671      .02522   .00349    .00075   .00003    .00002    0 0 0 0 0 0 0 0

• Ridge Reg’n: what k to use?
  – k = 2.1077 × 10^−9 (Hoerl, Kennard, Baldwin, 1975)
  – Ridge Trace: k = .001
  – VIF_j: k = .004. I choose this one. Get estimate as

                    intercept   day      day2      day3     day4      day5      day6 ... day13
  β̃(.004)           3.77660     .39375   −.02762   .00001   −.00000   −.00002   0 0 0 0 0 0 0 0
  s.e.[β̃(.004)]     .07919      .03081   .01152    .00166   .00030    .00003    0 0 0 0 0 0 0 0

options nodate ls=85 ps=25 nonumber;

data algae;
  input day density @@;
  datalines;
1 .530 1 .184 2 1.183 2 .664 3 1.603 3 1.553 4 1.994 4 1.910
5 2.708 5 2.585 6 3.006 6 3.009 7 3.867 7 3.403 8 4.059 8 3.892
9 4.349 9 4.367 10 4.699 10 4.551 11 4.983 11 4.656 12 5.100 12 4.754
13 5.288 13 4.842 14 5.374 14 4.969
;
title1 height=.15in "Algae Example: Polynomial Regression";
run;

data algae2;
  set algae;
  day=day-7.5;
  day2=day*day; day3=day2*day; day4=day3*day; day5=day4*day;
  day6=day5*day; day7=day6*day; day8=day7*day; day9=day8*day;
  day10=day9*day; day11=day10*day; day12=day11*day; day13=day12*day;
run;

/* 13-degree polynomial, based on "day-7.5" variable */
/* Ridge regression */
proc reg data=algae2 outest=fixcoll noprint;
  title2 height=.15in "13-degree polynomial around day 7.5";
  title3 height=.15in "Ridge Regression";
  model density=day day2 day3 day4 day5 day6 day7 day8 day9 day10
        day11 day12 day13 / ss1 ss2 vif collinoint
        ridge=0 to 0.02 by .001 edf outseb outstb outvif;
  plot / ridgeplot vref=0 nomodel nostat;
  title4 height=.15in "Ridge Trace";
run;

data choosek;
  set fixcoll;
run;
proc gplot data=choosek;
  where (_type_="RIDGEVIF" and day13<30);
  plot (day--day13) * _ridge_ / overlay legend;
  title4 "Ridge VIFs";
run;
proc print data=fixcoll;
  title4;
run;

data chooseka;
  set fixcoll;
  if (_type_="PARMS");
run;
data choosekb;
  set fixcoll(drop=_in_ _p_ _edf_ _rsq_);
  if (_type_="RIDGESTB" and _ridge_=0.00);
run;
data choosek;
  merge chooseka choosekb;
  k=_in_*_rmse_*_rmse_/
    ( (day**2+day2**2+day3**2+day4**2+day5**2+day6**2+
       day7**2+day8**2+day9**2+day10**2+day11**2+day12**2+day13**2) *
      (_edf_*_rmse_*_rmse_/(1-_rsq_)) );
run;
title4 "Hoerl, Kennard, Baldwin Recommendation for k";
proc print;
run;

SAS Output

The REG Procedure

Algae Example: Polynomial Regression
13-degree polynomial around day 7.5
Ridge Regression

Obs _MODEL_ _TYPE_ _DEPVAR_ _RIDGE_ _PCOMIT_ _RMSE_ Intercept day day2 day3 day4 day5 day6 day7 day8 day9 day10 day11 day12 day13 density _IN_ _P_ _EDF_ _RSQ_
1 MODEL1 PARMS density . . 0.21287 3.83499 0.311 -0.13 0.13 0.04 -0.04 -0.00 0.00 0.00 -0.00 -0.00 0.00 0.00 -0.00 -1 13 14 14 0.99091
2 MODEL1 SEB density . .
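To make the shrinkage and variance statements for the ridge estimator concrete, here is a small sketch in Python (not SAS, and not part of the original program). It assumes a toy problem with only two standardized predictors whose correlation is 0.99; the values in `ZTZ` and `ZTY` are made up for illustration and are not the algae data, and the helper names `inv2`, `matmul2` and `ridge` are hypothetical. It evaluates $\tilde{\delta}(k) = (Z'Z + kI)^{-1}Z'Y$ and the diagonal of $(Z'Z + kI)^{-1}(Z'Z)(Z'Z + kI)^{-1}$, which is the ridge VIF when $Z'Z$ is in correlation form.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][t] * y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def ridge(ztz, zty, k):
    """Return (delta_tilde(k), ridge VIFs) for a two-predictor problem."""
    a = inv2([[ztz[0][0] + k, ztz[0][1]],
              [ztz[1][0], ztz[1][1] + k]])          # (Z'Z + kI)^{-1}
    delta = [a[0][0] * zty[0] + a[0][1] * zty[1],
             a[1][0] * zty[0] + a[1][1] * zty[1]]   # (Z'Z + kI)^{-1} Z'Y
    cov = matmul2(matmul2(a, ztz), a)               # var[delta_tilde(k)] / sigma^2
    return delta, [cov[0][0], cov[1][1]]


# Made-up inputs: correlation 0.99 between the two predictors,
# and correlations 0.90 and 0.88 of each predictor with the response.
R = 0.99
ZTZ = [[1.0, R], [R, 1.0]]
ZTY = [0.90, 0.88]

results = {k: ridge(ZTZ, ZTY, k) for k in (0.0, 0.01, 0.1)}
norms = {k: (d[0] ** 2 + d[1] ** 2) ** 0.5 for k, (d, _) in results.items()}

for k, (delta, vif) in results.items():
    print(k, [round(v, 4) for v in delta], round(norms[k], 4), round(vif[0], 3))
```

In this toy case the length of $\tilde{\delta}(k)$ and the ridge VIF both fall as $k$ grows, and at $k = 0$ the VIF equals the usual $1/(1 - r^2)$. The same pattern, on a much larger scale, is what the RIDGEVIF rows of the SAS listing show for the algae polynomial.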
0.21287 0.13527 0.262 0.18 0.23 0.06 0.06 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 3 MODEL1 RIDGEVIF density 0.000 . . . 691.776 4332.20 524275.06 883079.00 54373099.72 24912055.76 1110720884.98 140632757.42 5205889507.87 160682024.12 5293103867.19 24351249.58 742862608.10 -1 . . . . 4 MODEL1 RIDGE density 0.000 . 0.21287 3.83499 0.311 -0.13 0.13 0.04 -0.04 -0.00 0.00 0.00 -0.00 -0.00 0.00 0.00 -0.00 -1 . . . . 5 MODEL1 RIDGESTB density 0.000 . 0.21287 0.00000 0.793 -1.17 10.41 14.14 -119.31 -79.33 568.56 195.06 -1270.56 -212.86 1307.64 83.95 -496.60 -1 . . . . 6 MODEL1 RIDGESEB density 0.000 . 0.21287 0.13527 0.262 0.18 0.23 0.06 0.06 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 7 MODEL1 RIDGEVIF density 0.001 . . . 15.065 26.40 107.68 76.78 33.62 33.13 57.14 33.89 19.60 5.11 5.12 47.48 55.98 -1 . . . . 8 MODEL1 RIDGE density 0.001 . 0.22364 3.78767 0.416 -0.03 -0.00 0.00 -0.00 0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -0.00 -1 . . . . 9 MODEL1 RIDGESTB density 0.001 . 0.22364 0.00000 1.061 -0.28 -0.13 0.05 -0.06 0.03 0.06 -0.01 0.07 -0.01 0.02 0.00 -0.06 -1 . . . . 10 MODEL1 RIDGESEB density 0.001 . 0.22364 0.08392 0.041 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 11 MODEL1 RIDGEVIF density 0.002 . . . 11.056 20.34 54.41 35.01 16.76 19.05 27.60 10.96 8.50 3.49 3.26 24.17 28.16 -1 . . . . 12 MODEL1 RIDGE density 0.002 . 0.22495 3.78308 0.406 -0.03 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -0.00 0.00 -0.00 -0.00 -1 . . . . 13 MODEL1 RIDGESTB density 0.002 . 0.22495 0.00000 1.036 -0.27 -0.06 0.02 -0.06 0.03 0.01 0.01 0.04 -0.01 0.02 -0.01 -0.01 -1 . . . . 14 MODEL1 RIDGESEB density 0.002 . 0.22495 0.08179 0.035 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 15 MODEL1 RIDGEVIF density 0.003 . . . 9.337 17.20 34.17 23.79 11.36 14.45 16.43 5.74 4.90 2.87 2.58 17.44 17.94 -1 . . . . 16 MODEL1 RIDGE density 0.003 . 0.22616 3.77957 0.399 -0.03 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -0.00 0.00 -0.00 0.00 -1 . . . . 
17 MODEL1 RIDGESTB density 0.003 . 0.22616 0.00000 1.019 -0.26 -0.02 0.01 -0.07 0.03 -0.01 0.01 0.02 -0.00 0.02 -0.01 0.01 -1 . . . . 18 MODEL1 RIDGESEB density 0.003 . 0.22616 0.08031 0.032 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 19 MODEL1 RIDGEVIF density 0.004 . . . 8.357 14.96 24.25 18.44 8.78 11.85 11.02 3.70 3.23 2.47 2.23 13.99 12.96 -1 . . . . 20 MODEL1 RIDGE density 0.004 . 0.22738 3.77660 0.394 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -0.00 0.00 -1 . . . . 21 MODEL1 RIDGESTB density 0.004 . 0.22738 0.00000 1.005 -0.25 0.00 -0.00 -0.07 0.03 -0.02 0.02 0.01 0.00 0.02 -0.01 0.02 -1 . . . . 22 MODEL1 RIDGESEB density 0.004 . 0.22738 0.07919 0.031 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 23 MODEL1 RIDGEVIF density 0.005 . . . 7.694 13.23 18.61 15.15 7.29 10.05 7.98 2.68 2.32 2.18 2.00 11.77 10.12 -1 . . . . 24 MODEL1 RIDGE density 0.005 . 0.22863 3.77399 0.389 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -0.00 0.00 -1 . . . . 25 MODEL1 RIDGESTB density 0.005 . 0.22863 0.00000 0.994 -0.25 0.02 -0.01 -0.07 0.02 -0.03 0.02 0.00 0.00 0.02 -0.01 0.02 -1 . . . . 26 MODEL1 RIDGESEB density 0.005 . 0.22863 0.07833 0.030 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 27 MODEL1 RIDGEVIF density 0.006 . . . 7.195 11.83 15.05 12.87 6.32 8.70 6.10 2.08 1.77 1.96 1.84 10.16 8.32 -1 . . . . 28 MODEL1 RIDGE density 0.006 . 0.22995 3.77164 0.385 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -0.00 0.00 -1 . . . . 29 MODEL1 RIDGESTB density 0.006 . 0.22995 0.00000 0.983 -0.24 0.04 -0.02 -0.06 0.02 -0.03 0.02 -0.00 0.01 0.02 -0.00 0.02 -1 . . . . 30 MODEL1 RIDGESEB density 0.006 . 0.22995 0.07769 0.029 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 31 MODEL1 RIDGEVIF density 0.007 . . . 6.793 10.68 12.64 11.15 5.63 7.64 4.85 1.69 1.40 1.77 1.72 8.92 7.10 -1 . . . . 32 MODEL1 RIDGE density 0.007 . 
0.23132 3.76952 0.381 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -0.00 0.00 -1 . . . .
33 MODEL1 RIDGESTB density 0.007 . 0.23132 0.00000 0.974 -0.24 0.05 -0.02 -0.06 0.02 -0.04 0.02 -0.01 0.01 0.01 -0.00 0.02 -1 . . . .
34 MODEL1 RIDGESEB density 0.007 . 0.23132 0.07721 0.028 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .
35 MODEL1 RIDGEVIF density 0.008 . . . 6.454 9.72 10.90 9.81 5.11 6.77 3.98 1.42 1.15 1.62 1.61 7.94 6.21 -1 . . . .
36 MODEL1 RIDGE density 0.008 . 0.23275 3.76757 0.378 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . .
37 MODEL1 RIDGESTB density 0.008 . 0.23275 0.00000 0.965 -0.23 0.06 -0.02 -0.06 0.02 -0.04 0.02 -0.01 0.01 0.01 0.00 0.02 -1 . . . .
38 MODEL1 RIDGESEB density 0.008 . 0.23275 0.07687 0.028 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .
39 MODEL1 RIDGEVIF density 0.009 . . . 6.159 8.91 9.61 8.72 4.70 6.05 3.35 1.22 0.97 1.49 1.53 7.13 5.54 -1 . . . .
40 MODEL1 RIDGE density 0.009 . 0.23423 3.76576 0.375 -0.03 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . .
41 MODEL1 RIDGESTB density 0.009 . 0.23423 0.00000 0.957 -0.23 0.07 -0.03 -0.06 0.01 -0.04 0.02 -0.01 0.01 0.01 0.00 0.02 -1 . . . .
42 MODEL1 RIDGESEB density 0.009 . 0.23423 0.07664 0.027 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .
43 MODEL1 RIDGEVIF density 0.010 . . . 5.898 8.22 8.60 7.82 4.36 5.44 2.87 1.07 0.83 1.38 1.45 6.45 5.02 -1 . . . .
44 MODEL1 RIDGE density 0.010 . 0.23575 3.76408 0.372 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . .
45 MODEL1 RIDGESTB density 0.010 . 0.23575 0.00000 0.949 -0.23 0.08 -0.03 -0.05 0.01 -0.04 0.02 -0.01 0.01 0.01 0.00 0.02 -1 . . . .
46 MODEL1 RIDGESEB density 0.010 .
0.23575 0.07650 0.027 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 47 MODEL1 RIDGEVIF density 0.011 . . . 5.662 7.63 7.79 7.07 4.07 4.93 2.50 0.95 0.73 1.29 1.39 5.88 4.59 -1 . . . . 48 MODEL1 RIDGE density 0.011 . 0.23730 3.76251 0.369 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . . 49 MODEL1 RIDGESTB density 0.011 . 0.23730 0.00000 0.942 -0.22 0.09 -0.03 -0.05 0.01 -0.04 0.01 -0.01 0.01 0.01 0.01 0.02 -1 . . . . 50 MODEL1 RIDGESEB density 0.011 . 0.23730 0.07643 0.026 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 51 MODEL1 RIDGEVIF density 0.012 . . . 5.448 7.11 7.14 6.43 3.83 4.48 2.20 0.86 0.65 1.21 1.33 5.39 4.24 -1 . . . . 52 MODEL1 RIDGE density 0.012 . 0.23888 3.76102 0.366 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . . 53 MODEL1 RIDGESTB density 0.012 . 0.23888 0.00000 0.935 -0.22 0.09 -0.04 -0.05 0.01 -0.04 0.01 -0.02 0.01 0.01 0.01 0.02 -1 . . . . 54 MODEL1 RIDGESEB density 0.012 . 0.23888 0.07643 0.026 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 55 MODEL1 RIDGEVIF density 0.013 . . . 5.251 6.65 6.59 5.88 3.61 4.10 1.96 0.78 0.58 1.14 1.28 4.96 3.95 -1 . . . . 56 MODEL1 RIDGE density 0.013 . 0.24049 3.75962 0.364 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . . 57 MODEL1 RIDGESTB density 0.013 . 0.24049 0.00000 0.928 -0.22 0.10 -0.04 -0.04 0.01 -0.04 0.01 -0.02 0.01 0.00 0.01 0.02 -1 . . . . 58 MODEL1 RIDGESEB density 0.013 . 0.24049 0.07648 0.026 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 59 MODEL1 RIDGEVIF density 0.014 . . . 5.069 6.25 6.12 5.40 3.42 3.76 1.77 0.72 0.53 1.08 1.23 4.59 3.69 -1 . . . . 60 MODEL1 RIDGE density 0.014 . 0.24212 3.75829 0.361 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . . 61 MODEL1 RIDGESTB density 0.014 . 0.24212 0.00000 0.922 -0.22 0.11 -0.04 -0.04 0.01 -0.04 0.01 -0.02 0.01 0.00 0.01 0.02 -1 . . . . 
62 MODEL1 RIDGESEB density 0.014 . 0.24212 0.07657 0.026 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 63 MODEL1 RIDGEVIF density 0.015 . . . 4.900 5.90 5.72 4.98 3.25 3.46 1.60 0.66 0.48 1.02 1.18 4.27 3.48 -1 . . . . 64 MODEL1 RIDGE density 0.015 . 0.24376 3.75701 0.359 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 0.00 0.00 -1 . . . . 65 MODEL1 RIDGESTB density 0.015 . 0.24376 0.00000 0.916 -0.21 0.11 -0.04 -0.04 0.00 -0.04 0.01 -0.02 0.01 0.00 0.01 0.01 -1 . . . . 66 MODEL1 RIDGESEB density 0.015 . 0.24376 0.07670 0.025 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 67 MODEL1 RIDGEVIF density 0.016 . . . 4.743 5.58 5.37 4.61 3.10 3.20 1.47 0.62 0.45 0.97 1.14 3.98 3.28 -1 . . . . 68 MODEL1 RIDGE density 0.016 . 0.24541 3.75580 0.356 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -1 . . . . 69 MODEL1 RIDGESTB density 0.016 . 0.24541 0.00000 0.910 -0.21 0.12 -0.04 -0.03 0.00 -0.04 0.01 -0.02 0.01 -0.00 0.01 0.01 -1 . . . . 70 MODEL1 RIDGESEB density 0.016 . 0.24541 0.07686 0.025 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 71 MODEL1 RIDGEVIF density 0.017 . . . 4.595 5.30 5.06 4.29 2.96 2.97 1.35 0.58 0.42 0.93 1.10 3.73 3.11 -1 . . . . 72 MODEL1 RIDGE density 0.017 . 0.24708 3.75463 0.354 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -1 . . . . 73 MODEL1 RIDGESTB density 0.017 . 0.24708 0.00000 0.904 -0.21 0.12 -0.05 -0.03 0.00 -0.04 0.01 -0.02 0.01 -0.00 0.01 0.01 -1 . . . . 74 MODEL1 RIDGESEB density 0.017 . 0.24708 0.07704 0.025 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . . 75 MODEL1 RIDGEVIF density 0.018 . . . 4.457 5.05 4.78 4.00 2.83 2.76 1.25 0.54 0.39 0.89 1.06 3.50 2.96 -1 . . . . 76 MODEL1 RIDGE density 0.018 . 0.24875 3.75351 0.352 -0.02 0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -1 . . . . 77 MODEL1 RIDGESTB density 0.018 . 
0.24875 0.00000 0.899 -0.21 0.13 -0.05 -0.03 0.00 -0.04 0.01 -0.02 0.01 -0.00 0.01 0.01 -1 . . . .
78 MODEL1 RIDGESEB density 0.018 . 0.24875 0.07725 0.025 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .
79 MODEL1 RIDGEVIF density 0.019 . . . 4.327 4.82 4.54 3.74 2.71 2.58 1.16 0.51 0.37 0.85 1.03 3.30 2.82 -1 . . . .
80 MODEL1 RIDGE density 0.019 . 0.25042 3.75242 0.350 -0.02 0.00 -0.00 -0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -1 . . . .
81 MODEL1 RIDGESTB density 0.019 . 0.25042 0.00000 0.894 -0.21 0.13 -0.05 -0.02 -0.00 -0.04 0.01 -0.02 0.01 -0.01 0.01 0.01 -1 . . . .
82 MODEL1 RIDGESEB density 0.019 . 0.25042 0.07747 0.024 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .
83 MODEL1 RIDGEVIF density 0.020 . . . 4.204 4.61 4.31 3.51 2.60 2.41 1.08 0.49 0.35 0.82 1.00 3.12 2.69 -1 . . . .
84 MODEL1 RIDGE density 0.020 . 0.25210 3.75138 0.348 -0.02 0.00 -0.00 -0.00 -0.00 -0.00 0.00 -0.00 0.00 -0.00 0.00 0.00 -1 . . . .
85 MODEL1 RIDGESTB density 0.020 . 0.25210 0.00000 0.889 -0.20 0.14 -0.05 -0.02 -0.00 -0.04 0.01 -0.02 0.01 -0.01 0.02 0.01 -1 . . . .
86 MODEL1 RIDGESEB density 0.020 . 0.25210 0.07771 0.024 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -1 . . . .

SAS Output

Algae Example: Polynomial Regression
13-degree polynomial around day 7.5
Ridge Regression
Hoerl, Kennard, Baldwin Recommendation for k

Obs         1
_MODEL_     MODEL1
_TYPE_      RIDGESTB
_DEPVAR_    density
_RIDGE_     0
_PCOMIT_    .
_RMSE_      0.21287
Intercept   0
day         0.79318
day2        -1.16696
day3        10.4058
day4        14.1412
day5        -119.310
day6        -79.3262
day7        568.565
day8        195.058
day9        -1270.56
day10       -212.863
day11       1307.64
day12       83.9458
day13       -496.605
density     -1
_IN_        13
_P_         14
_EDF_       14
_RSQ_       0.99091
k           2.1078E-9
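The Hoerl, Kennard, Baldwin value of k in this last listing can be recomputed from the other numbers printed with it. The sketch below is an illustration in Python of the same arithmetic as the final SAS data step; the function name `hkb_k` is hypothetical, and the inputs are the rounded values from the listing, so the result matches the reported 2.1078E-9 only to about three significant digits. It assumes, as in the SAS code, that each squared standardized estimate multiplied by the corrected total sum of squares gives the squared coefficient $[\tilde{\delta}(0)_j]^2$ on the centered and scaled variables.

```python
def hkb_k(p, rmse, edf, rsq, stb):
    """Hoerl-Kennard-Baldwin ridge factor k = p * MSE / sum_j delta_j^2.

    p    : number of regressors (_IN_)
    rmse : root MSE of the full OLS fit (_RMSE_)
    edf  : error degrees of freedom (_EDF_)
    rsq  : R-square of the full OLS fit (_RSQ_)
    stb  : standardized OLS estimates (the RIDGESTB row at _RIDGE_ = 0)
    """
    mse = rmse ** 2
    sst = edf * mse / (1.0 - rsq)            # SSE / (1 - R^2) = corrected total SS
    sum_delta_sq = sum(b * b for b in stb) * sst
    return p * mse / sum_delta_sq


# Values copied from the final SAS listing (rounded as printed).
STB = [0.79318, -1.16696, 10.4058, 14.1412, -119.310, -79.3262, 568.565,
       195.058, -1270.56, -212.863, 1307.64, 83.9458, -496.605]

k = hkb_k(p=13, rmse=0.21287, edf=14, rsq=0.99091, stb=STB)
print(f"{k:.3e}")  # close to the 2.1078E-9 reported by SAS
```

The tiny value reflects how large the standardized OLS coefficients are under this degree of collinearity, which is consistent with the slide preferring the ridge trace and VIF criteria for choosing k.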