Notebook Tab 6 Pages 183 to 196

© 2014 ConteSolutions




When the assumed relationship best fits a
straight-line model (the absolute value of r,
Pearson’s correlation coefficient, is close to 1),
this approach is known as Linear Regression Analysis.
Excel and Minitab give the r² value.
r is the square root of r², with a + or - sign
(the same sign as the slope).
When the modeling analysis best fits a
straight line and includes only one
independent variable, it is known as Simple
Linear Regression Analysis.
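As a small illustration of recovering r from the r² value that Excel or Minitab reports, the sketch below takes the square root and attaches the sign of the fitted slope. The inputs 0.7592 and 3.2553 are taken from the height/weight example in these notes; the function name `r_from_r_squared` is a hypothetical helper, not part of any package.

```python
import math

def r_from_r_squared(r_squared: float, slope: float) -> float:
    """Return Pearson's r given r-squared and the fitted slope (hypothetical helper).

    In simple linear regression, r carries the same sign as the slope.
    """
    return math.copysign(math.sqrt(r_squared), slope)

# Values taken from the height/weight example in these notes.
r = r_from_r_squared(0.7592, 3.2553)
print(round(r, 2))
```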


Test of Hypothesis (TOH) in the Linear Regression model (ANOVA)
The null hypothesis is an equality (= sign) statement about a parameter
$y = b_1 x + b_0 + e$




Null Hypothesis: All beta coefficients are equal to
zero
Criterion: if the P-value from Minitab is less than α = 0.05,
we reject the null hypothesis
and accept the statistically significant alternative (based on α = 0.05)
Excel users note: 5.45E-05 is exponential notation
and is equal to 0.0000545 (see Wikipedia)
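The decision rule above can be sketched in a few lines. This is only an illustration: the P-value is the 5.45E-05 example from the Excel note, and the variable names are assumptions made for the sketch.

```python
ALPHA = 0.05  # significance level used in these notes

# Excel displays very small P-values in exponential notation.
p_value = float("5.45E-05")  # the same number as 0.0000545

# Reject the null hypothesis when the P-value is below alpha.
reject_null = p_value < ALPHA
print(f"{p_value:.7f}", reject_null)
```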





Response variable and input variable
Objective: predict the dependent (response) variable based on
the independent (input) variable
Example: predict customer satisfaction based on
wait time
The strength of the relationship is measured by the “Correlation Coefficient”
The regression equation contains two values of
interest: the slope and the intercept
$y = ax + b$
$y = b_1 x + b_0 + e$




Slope (b1), intercept (b0), and error term (e)
Least squares method
Least squares regression line
Example: x is wait time, y is customer satisfaction
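For reference, the least squares method chooses the slope and intercept that minimize the sum of squared errors. The standard closed-form result for one independent variable is shown below. The symbols $\bar{x}$ and $\bar{y}$ (sample means) and $n$ (number of observations) are notation introduced here and do not appear on the slides.

```latex
\min_{b_0,\,b_1} \sum_{i=1}^{n} \left(y_i - b_0 - b_1 x_i\right)^2
\quad\Longrightarrow\quad
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```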
The following data was collected to see if weight can be predicted from a
person’s height:

HEIGHT   WEIGHT
  70      155
  63      150
  72      180
  60      135
  66      156
  70      168
  74      178
  65      160
  62      132
  67      145
  65      139
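A minimal sketch of the least squares calculation on the eleven height/weight pairs above, using only the Python standard library. The function name `least_squares_fit` is a hypothetical helper. Minitab or Excel would normally do this step, and the result can be compared with the regression equation reported in these notes.

```python
heights = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weights = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]

def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the least squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = least_squares_fit(heights, weights)
print(f"WEIGHT = {intercept:.4f} + {slope:.4f} x HEIGHT")
```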
Correlations: HEIGHT, WEIGHT
R-Sq = 75.92%
Square root of 0.7592 = 0.87
Pearson correlation of HEIGHT and WEIGHT = +0.87
Pearson correlation coefficients range between -1.0 and +1.0
They can be positive or negative
For the Customer Satisfaction model, the coefficient could be + or -
P-Value = 0.000477
H0: Slope (beta1) = 0 (no linear correlation)
Ha: Slope (beta1) not = 0 (there appears to be a linear
correlation)
The intercept (beta0) has its own P-value in the regression output
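The correlation figures above can be checked directly from the data. The sketch below computes Pearson's r from its definition; `pearson_r` is a hypothetical helper name, and the data lists repeat the height/weight table from these notes.

```python
import math

heights = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weights = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]

def pearson_r(xs, ys):
    """Pearson correlation coefficient from its definition (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(heights, weights)
r_squared = r ** 2
print(round(r, 2), f"{r_squared:.2%}")
```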
Regression Analysis: WEIGHT versus HEIGHT
The regression equation is:
WEIGHT = -62.8509 + 3.2553 x HEIGHT

Predictor   Coef      SE Coef   T       P
Constant    -62.851   40.85     -1.54   0.158
HEIGHT      3.2553    0.6455    6.28    0.000
Based on this analysis the independent variable, Height, does appear to
be significant in predicting a person’s weight.
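Once the regression equation is accepted, using it for prediction is a single line of arithmetic. The sketch below plugs a height into the fitted equation from these notes; the input of 68 is a hypothetical value chosen inside the observed range (60 to 74), and `predict_weight` is a hypothetical helper name.

```python
# Coefficients from the regression equation in these notes.
INTERCEPT = -62.8509
SLOPE = 3.2553

def predict_weight(height: float) -> float:
    """Predicted WEIGHT for a given HEIGHT (hypothetical helper)."""
    return INTERCEPT + SLOPE * height

example_height = 68  # hypothetical input, within the observed height range
predicted = predict_weight(example_height)
print(round(predicted, 1))
```

Predictions for heights far outside the observed range are extrapolations and should be treated with caution.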
Is the Model Statistically Useful?
Model Diagnostics
For this model:
S = 8.42: the standard deviation of the distances between the actual
values and the fitted line (smaller is better)
R-Sq = 75.92%: this model explains 75.92% of the variation
in weights
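Both diagnostics can be reproduced from the residuals. The sketch below assumes the usual definitions, S = sqrt(SSE / (n - 2)) and R-Sq = 1 - SSE / SST, and repeats the height/weight data from these notes. It gives S of about 8.43, in line with the 8.42 shown above once rounding or truncation is allowed for.

```python
import math

heights = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weights = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

# Least squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
s_xx = sum((x - x_bar) ** 2 for x in heights)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residuals: actual value minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(heights, weights)]
sse = sum(e ** 2 for e in residuals)          # unexplained variation
sst = sum((y - y_bar) ** 2 for y in weights)  # total variation

s = math.sqrt(sse / (n - 2))  # two parameters estimated, so n - 2 degrees of freedom
r_sq = 1 - sse / sst
print(round(s, 2), f"{r_sq:.2%}")
```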
H0:
The model’s beta coefficients are all zero (none are useful in predicting the
response variable)
Ha:
At least one of the model’s beta coefficients is not zero (at least one
independent variable is useful in predicting the response variable)
What is your conclusion now?
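One way to reach a conclusion is the ANOVA F test. The sketch below computes F = MSR / MSE for the height/weight data and compares it with a critical value. The value 5.117 is the tabulated F(0.05; 1, 9) from a standard F table and is an assumption brought in for this sketch. Minitab reports a P-value instead, which leads to the same decision.

```python
heights = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weights = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
s_xx = sum((x - x_bar) ** 2 for x in heights)
sst = sum((y - y_bar) ** 2 for y in weights)

ssr = s_xy ** 2 / s_xx  # variation explained by the regression (1 degree of freedom)
sse = sst - ssr         # unexplained variation (n - 2 degrees of freedom)

f_stat = (ssr / 1) / (sse / (n - 2))

F_CRITICAL = 5.117  # assumed table value: F(0.05; 1, 9)
reject_null = f_stat > F_CRITICAL
print(round(f_stat, 2), reject_null)
```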

A weak correlation coefficient does not mean that the model
is not useful.
Remember that the correlation coefficient only tests for linear
relationships and the relationship among the variables may
be curvilinear.

A low R-Sq / R-Sq(adj) does not mean that the model is not
useful.
It just means that it is not complete and that other terms
need to be added to make it more effective at predicting the
variation in the response variable.
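To illustrate the first point, the sketch below uses a small made-up data set (hypothetical, not from the course files) in which y is exactly x squared. The relationship is perfectly predictable, yet Pearson's r is zero because the relationship is curvilinear rather than linear.

```python
import math

# Hypothetical data: a perfect curvilinear (quadratic) relationship.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

def pearson_r(xs, ys):
    """Pearson correlation coefficient from its definition (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(xs, ys)
print(r)
```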



Desktop
Open Minitab
File/Open Worksheet
◦ Villanova/Correlation Exercise.XLS


Graph/Scatter Plot (with regression)
Stat/Regression/Regression
◦ Response=WEIGHT, Predictor=Height