Multiple Regression Models

The Multiple Regression Model

The relationship between one dependent variable and two or more independent variables is a linear function.

Population model:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + \varepsilon_i$

Sample model:
$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_p X_{pi} + e_i$

Here $\beta_0$ is the population Y-intercept, $\beta_1, \ldots, \beta_p$ are the population slopes, and $\varepsilon_i$ is the random error. In the sample model, $Y_i$ is the dependent (response) variable, $X_{1i}, \ldots, X_{pi}$ are the independent (explanatory) variables, $b_0, \ldots, b_p$ are the estimated coefficients, and $e_i$ is the residual.

Multiple Regression Model: Example

Develop a model for estimating the heating oil used by a single-family home in the month of January, based on average temperature and amount of insulation in inches.

Oil (gal)   Temp (°F)   Insulation (in)
275.30      40          3
363.80      27          3
164.30      40          10
40.80       73          6
94.30       64          6
230.90      34          6
366.70      9           6
300.60      8           10
237.80      23          10
121.40      63          3
31.40       65          10
203.50      41          6
441.10      21          3
323.00      38          3
52.50       58          10

Sample Multiple Regression Model: Example

$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_p X_{pi}$

Excel output:

               Coefficients
Intercept      562.1510092
X Variable 1   -5.436580588
X Variable 2   -20.01232067

$\hat{Y}_i = 562.151 - 5.437 X_{1i} - 20.012 X_{2i}$

For each one-degree increase in temperature, the estimated average amount of heating oil used decreases by 5.437 gallons, holding insulation constant.

For each one-inch increase in insulation, the estimated average use of heating oil decreases by 20.012 gallons, holding temperature constant.

Interpretation of Estimated Coefficients

Slope ($b_i$): The estimated average value of Y changes by $b_i$ for each 1-unit increase in $X_i$, holding all other variables constant.
For example: if $b_1 = -2$, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature ($X_1$), given the inches of insulation ($X_2$).

Intercept ($b_0$): The intercept is the estimated average value of Y when all $X_i = 0$.

Using the Model to Make Predictions

Predict the amount of heating oil used for a home if the average temperature is 30°F and the insulation is 6 inches.
$\hat{Y}_i = 562.151 - 5.437 X_{1i} - 20.012 X_{2i} = 562.151 - 5.437(30) - 20.012(6) = 278.969$

The predicted heating oil used is 278.97 gallons.

Developing the Model

• Checking for problems.
• Being sure the model passes all tests for model quality.

Identifying Problems

• Do all the residual tests listed for simple regression.
• Check for multicollinearity.

Multicollinearity

• This occurs when there is a high correlation between the explanatory variables.
• This leads to unstable coefficients.
• The variance inflation factor (VIF) is used to measure collinearity (values exceeding 5 are not good, and values exceeding 10 are a big problem):

$VIF_j = \dfrac{1}{1 - R_j^2}$

where $R_j^2$ is the coefficient of multiple determination of $X_j$ with all the other explanatory variables.

Is the fit to the data good?

Coefficient of Multiple Determination

Excel output:

Regression Statistics
Multiple R          0.982654757
R Square            0.965610371   (r²)
Adjusted R Square   0.959878766   (adjusted r²)
Standard Error      26.01378323
Observations        15

The adjusted r² lowers r² to account for the number of explanatory variables relative to the sample size; the adjustment matters most in small samples.

Do the variables collectively pass the test?

Testing for Overall Significance

• Shows whether there is a linear relationship between all of the X variables taken together and Y.
• Hypotheses:
  H0: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$ (no linear relationship)
  H1: At least one $\beta_i \neq 0$ (at least one independent variable affects Y)

Test for Overall Significance: Excel Output

ANOVA
             df   SS         MS         F          Significance F
Regression   2    228014.6   114007.3   168.4712   1.65411E-09
Residual     12   8120.603   676.7169
Total        14   236135.2

The regression df is p = 2, the number of explanatory variables. The total df is n - 1 = 14. The F test statistic is MSR / MSE, and Significance F is the p-value.

Test for Overall Significance: Example Solution

H0: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$
H1: At least one $\beta_i \neq 0$
$\alpha = 0.05$
df = 2 and 12
Critical value: F = 3.89
Test statistic: F = 168.47 (Excel output)
Decision: Reject H0 at $\alpha = 0.05$, since 168.47 > 3.89.
Conclusion: There is evidence that at least one independent variable affects Y.

Test for Significance: Individual Variables

• Shows whether there is a linear relationship between each variable $X_i$ and Y.
• Hypotheses:
  H0: $\beta_i = 0$ (no linear relationship)
  H1: $\beta_i \neq 0$ (linear relationship between $X_i$ and Y)

t Test Statistic: Excel Output

               Coefficients   Standard Error   t Stat
Intercept      562.151009     21.09310433      26.65094
X Variable 1   -5.4365806     0.336216167      -16.1699
X Variable 2   -20.012321     2.342505227      -8.54313

$t = \dfrac{b_k}{S_{b_k}}$

The t Stat entry for X Variable 1 is the test statistic for $X_1$ (temperature); the entry for X Variable 2 is the test statistic for $X_2$ (insulation).

t Test: Example Solution

Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$.

H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
df = n - p - 1 = 15 - 2 - 1 = 12
Critical values: -2.1788 and 2.1788 (0.025 in each tail); reject H0 if t falls outside this range.
Test statistic: t = -16.1699
Decision: Reject H0 at $\alpha = 0.05$.
Conclusion: There is evidence of a significant effect of temperature on oil consumption.

Confidence Interval Estimate for the Slope

Provide the 95% confidence interval for the population slope $\beta_1$ (the effect of temperature on oil consumption).

$b_1 \pm t_{n-p-1} S_{b_1}$

               Coefficients   Lower 95%       Upper 95%
Intercept      562.151009     516.1930837     608.108935
X Variable 1   -5.4365806     -6.169132673    -4.7040285
X Variable 2   -20.012321     -25.11620102    -14.90844

$-6.169 \le \beta_1 \le -4.704$

The estimated average consumption of oil is reduced by between 4.70 and 6.17 gallons for each 1°F increase in temperature.

Special Regression Topics

Dummy-variable Models

• Create a categorical variable (dummy variable) with 2 levels, for example yes and no, or male and female. The data are coded as 0 or 1.
• The coding makes the intercepts different.
• This analysis assumes equal slopes.
• The regression model has the same form:
  $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + \varepsilon_i$

Dummy-variable Models: Assumption

Given $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$, where
Y = assessed value of house
X1 = square footage of house
X2 = desirability of neighborhood (0 if undesirable, 1 if desirable)

Desirable ($X_2 = 1$): $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2(1) = (b_0 + b_2) + b_1 X_{1i}$
Undesirable ($X_2 = 0$): $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2(0) = b_0 + b_1 X_{1i}$

The two equations have the same slope and different intercepts.

[Figure: Y (assessed value) plotted against X1 (square footage). Two parallel lines with the same slope; the intercepts differ, $b_0 + b_2$ for desirable and $b_0$ for undesirable neighborhoods.]

Interpretation of the Dummy Variable Coefficient

For example:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} = 20 + 5 X_{1i} + 6 X_{2i}$

Y: annual salary of college graduate, in thousands of dollars
X1: GPA
X2: 0 = female, 1 = male

The coefficient 6 means that, given the same GPA, a male college graduate is estimated to earn 6 thousand dollars more than a female graduate on average.
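
As a recap of the heating oil example from earlier in these notes, the following is a minimal sketch of how the Excel figures could be recomputed with ordinary least squares in Python. It assumes NumPy is available and that the data table was transcribed correctly; the variable names (`data`, `b`, `se_b`, `vif`, and so on) are illustrative and do not come from the slides. Because there are only two explanatory variables, the VIF is computed from their pairwise correlation, which is what $R_j^2$ reduces to in that case.

```python
import numpy as np

# Heating oil data transcribed from the example table:
# columns are oil (gal), average temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
y = data[:, 0]
# Design matrix: a column of ones for the intercept, then X1 and X2.
X = np.column_stack([np.ones(len(y)), data[:, 1], data[:, 2]])
n, p = len(y), X.shape[1] - 1          # n observations, p explanatory variables

# Least-squares estimates b0, b1, b2.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sums of squares for the ANOVA table.
resid = y - X @ b
sse = float(resid @ resid)                     # residual SS
sst = float(((y - y.mean()) ** 2).sum())       # total SS
ssr = sst - sse                                # regression SS

# Fit statistics.
r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F test statistic: MSR / MSE.
mse = sse / (n - p - 1)
f_stat = (ssr / p) / mse

# Standard errors and t statistics, t = b_k / S_bk.
se_b = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
t_stats = b / se_b

# VIF with two explanatory variables: R_j^2 is their squared correlation.
r12 = np.corrcoef(data[:, 1], data[:, 2])[0, 1]
vif = 1 / (1 - r12 ** 2)

# Prediction for 30 deg F and 6 inches of insulation.
y_hat = float(np.array([1.0, 30.0, 6.0]) @ b)

print("coefficients:", b)
print("r2:", r2, "adjusted r2:", adj_r2)
print("F:", f_stat)
print("t statistics:", t_stats)
print("VIF:", vif)
print("prediction:", y_hat)
```

The prediction from the unrounded coefficients comes out at about 278.98 gallons, slightly above the 278.969 obtained from the rounded equation. Comparing `f_stat` and `t_stats` with the critical values quoted in the slides (3.89 and 2.1788) gives the same reject decisions.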