Notebook Tab 6, Pages 183 to 196
© 2014 ConteSolutions

When the assumed relationship is best fit by a straight-line model (the absolute value of r, Pearson's correlation coefficient, is close to 1), the approach is known as Linear Regression Analysis.

Excel and Minitab give the r² value. r is the square root of r², with a + or - sign that matches the sign of the slope.

When the model is a straight line and includes only one independent variable, it is known as Simple Linear Regression Analysis.

Test of Hypothesis (TOH) in the Linear Regression Model (ANOVA)

The null hypothesis contains the = sign and a parameter.

$y = b_1 x + b_0 + e$

Null hypothesis: all beta coefficients are equal to zero.
Criterion: if the P-value from Minitab is less than 0.05, we reject the null hypothesis in favor of the statistically significant alternative, based on α = 0.05.
Excel users note: 5.45E-05 is exponential notation and is equal to 0.0000545 (see Wikipedia).

Response variable and input variable

Objective: predict the dependent variable from the independent variable.
Example: predict customer satisfaction based on wait time.
The strength of the relationship is measured by the correlation coefficient.
The regression equation contains two values of interest, the slope and the intercept, plus an error term:

$y = ax + b$
$y = b_1 x + b_0 + e$

The line is fitted by the least squares method and is called the least squares regression line.
Example: x is wait time, y is customer satisfaction.

The following data was collected to see if weight can be predicted from a person's height:

HEIGHT   WEIGHT
70       155
63       150
72       180
60       135
66       156
70       168
74       178
65       160
62       132
67       145
65       139

Correlations: HEIGHT, WEIGHT

R-Sq = 75.92%
Square root of 0.7592 = 0.87
Pearson correlation of HEIGHT and WEIGHT = +0.87
Pearson correlation coefficients range between -1.0 and +1.0, so they can be positive or negative. For the customer satisfaction model, r could be + or -.
P-Value = 0.000477
H0: Slope and intercept values (beta0 and beta1) = 0 (no linear correlation)
Ha: Slope and intercept values not = 0 (there appears to be a linear correlation)

Regression Analysis: WEIGHT versus HEIGHT

The regression equation is:
WEIGHT = -62.8509 + 3.2553 x HEIGHT

Predictor   Coef      SE Coef   T       P
Constant    -62.851   40.85     -1.54   0.158
HEIGHT      3.2553    0.6455    6.28    0.000

Based on this analysis, the independent variable, Height, does appear to be significant in predicting a person's weight.

Is the Model Statistically Useful?

Model Diagnostics
For this model:
S = 8.42: the standard deviation of the distances between the actual values and the fitted line (smaller is better).
R-Sq = 75.92%: this model explains 75.92% of the variation in weights.

H0: The model's beta coefficients are all zero (none are useful in predicting the response variable).
Ha: At least one of the model's beta coefficients is not zero (at least one independent variable is useful in predicting the response variable).
What is your conclusion now?

A weak correlation coefficient does not mean that the model is not useful. Remember that the correlation coefficient only tests for linear relationships, and the relationship among the variables may be curvilinear.

A low R-Sq / R-Sq (adj) does not mean that the model is not useful. It just means that it is not complete and that other terms need to be added to make it more effective at predicting the variation in the response variable.

Desktop
Open Minitab
File/Open Worksheet
  ◦ Villanova/Correlation Exercise.XLS
Graph/Scatter Plot (with regression)
Stat/Regression/Regression
  ◦ Response=WEIGHT, Predictor=Height
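The slope, intercept, r, R-Sq, and S quoted for the WEIGHT versus HEIGHT model can also be checked outside Minitab. The following is a minimal sketch in plain Python, assuming the eleven height and weight pairs listed in these notes and the standard textbook least squares formulas. The function name `simple_linear_regression` and the dictionary keys are illustrative choices, not part of Minitab or Excel.

```python
import math

# Height (x) and weight (y) pairs as listed in the notes.
HEIGHT = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
WEIGHT = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]


def simple_linear_regression(x, y):
    """Fit y = b1*x + b0 by least squares (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                  # slope
    b0 = y_bar - b1 * x_bar         # intercept
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation, carries the slope's sign

    # S: standard deviation of the residuals about the fitted line,
    # using n - 2 degrees of freedom (two estimated coefficients).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    return {"b0": b0, "b1": b1, "r": r, "r_sq": r * r, "s": s}


fit = simple_linear_regression(HEIGHT, WEIGHT)
print(f"WEIGHT = {fit['b0']:.4f} + {fit['b1']:.4f} x HEIGHT")
print(f"r = {fit['r']:.2f}, R-Sq = {fit['r_sq']:.2%}, S = {fit['s']:.2f}")
```

A predicted weight for a given height follows by substituting into the fitted equation, for example `fit["b0"] + fit["b1"] * 68`. The sketch stops at the fitted line and its diagnostics; P-values need a t-distribution routine, which the standard library does not provide.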