HZAU MULTIVARIATE HOMEWORK #2
MULTIPLE AND STEPWISE LINEAR REGRESSION

Using the malt quality dataset on the class’s Web page:

1. Determine the simple linear correlation of extract with the remaining variables.
   a. Which of the variables have a correlation with extract that is significantly different from zero?
   b. Would you consider the variables identified in part a to have a strong or a weak association with extract? Explain your answer.

2. Determine the partial correlation between extract and viscosity, while controlling for beta-glucan content.
   a. Report the partial correlation value.
   b. When controlling for beta-glucan content, would you consider the relationship between malt extract and viscosity to be spurious? Explain your answer.

3. Develop a regression model to predict the percent malt extract using the remaining variables as the independent variables.
   a. What is the regression equation?
   b. Which independent variables have regression coefficients significantly different from zero?
   c. What percent of the variation in extract is explained by your regression model?
   d. Would you consider the regression model to adequately explain the variation in extract? Explain your answer.

4. Using stepwise regression, develop a regression model that includes those independent variables that significantly contribute to explaining the variation in extract.
   a. What is the regression equation?
   b. What percent of the variation in extract is explained by your regression model?
   c. Would you consider the regression model to adequately explain the variation in extract? Explain your answer.

options pageno=1;
data hmwk2;
  /* Variable order must match the column order of the data on the class Web page */
  input Line $ Plump Protein Extract amylase DP kolbach
        Solprot Color FAN Betagluc Viscosity Fructose Glucose Maltose Maltotriose;
  datalines;
Insert malt quality data from class Web page.
. . .
;;

/* Question 1: simple correlations of Extract with each remaining variable */
proc corr;
  var extract;
  with Plump Protein amylase DP kolbach Solprot Color FAN Betagluc
       Viscosity Fructose Glucose Maltose Maltotriose;
  title 'Correlation of Extract with Independent Variables Related to Quality';
run;

/* Question 2: partial correlation of Extract and Viscosity, controlling for Betagluc */
proc corr;
  var extract viscosity;
  partial betagluc;
  title 'Partial Correlation of Extract and Viscosity While Controlling Betaglucan content';
run;

/* Question 3: multiple regression of Extract on all remaining variables */
proc reg;
  model extract=Plump Protein amylase DP kolbach Solprot Color FAN Betagluc
                Viscosity Fructose Glucose Maltose Maltotriose;
  title 'Multiple Regression of Extract with the Remaining Quality Traits';
run;

/* Question 4: stepwise selection among the same independent variables */
proc stepwise;
  model extract=Plump Protein amylase DP kolbach Solprot Color FAN Betagluc
                Viscosity Fructose Glucose Maltose Maltotriose;
  title 'Stepwise Regression of Extract with the Remaining Quality Traits';
run;


12:28 Thursday, June 21, 2012

Correlation of Extract with Independent Variables Related to Quality

The CORR Procedure

14 With Variables:  Plump Protein amylase DP kolbach Solprot Color FAN Betagluc
                    Viscosity Fructose Glucose Maltose Maltotriose
 1      Variables:  Extract

Simple Statistics

Variable        N         Mean      Std Dev          Sum      Minimum      Maximum
Plump          61     93.01475      1.97753         5674     89.10000     96.55000
Protein        61     12.53934      0.36721    764.90000     11.75000     13.45000
amylase        61     67.36557      6.41523         4109     55.95000     81.45000
DP             61    143.04590     12.44975         8726    118.30000    176.20000
kolbach        61     50.69426      3.69884         3092     40.60000     57.80000
Solprot        61      6.35467      0.45995    387.63500      4.83500      7.14000
Color          61      2.57869      0.52974    157.30000      1.80000      3.75000
FAN            61    320.06803     31.37553        19524    258.05000    406.80000
Betagluc       61    175.49262     36.41330        10705    124.40000    261.95000
Viscosity      61      1.46426      0.02101     89.32000      1.41500      1.51500
Fructose       61      0.08822      0.07869      5.38150            0      0.31700
Glucose        61      1.04295      0.09717     63.62000      0.89000      1.28500
Maltose        61      3.99230      0.16880    243.53000      3.50500      4.27000
Maltotriose    61      0.98893      0.07334     60.32500      0.68000      1.25500
Extract        61     78.31721      0.70651         4777     76.80000     79.70000

Pearson Correlation Coefficients, N = 61
Prob > |r| under H0: Rho=0

Variable       Extract (r)    Prob > |r|
Plump             -0.10224        0.4330
Protein           -0.26267        0.0408
amylase            0.04370        0.7381
DP                -0.26047        0.0426
kolbach            0.16305        0.2093
Solprot            0.05783        0.6580
Color              0.18480        0.1539
FAN               -0.07966        0.5417
Betagluc          -0.45925        0.0002
Viscosity         -0.31965        0.0120
Fructose           0.12030        0.3557
Glucose            0.19140        0.1395
Maltose           -0.05729        0.6610
Maltotriose        0.18916        0.1443


Partial Correlation of Extract and Viscosity While Controlling Beta-glucan content

The CORR Procedure

1 Partial Variables:  Betagluc
2         Variables:  Extract Viscosity

Simple Statistics

                                                                                 Partial      Partial
Variable      N        Mean     Std Dev         Sum     Minimum     Maximum     Variance      Std Dev
Betagluc     61   175.49262    36.41330       10705   124.40000   261.95000
Extract      61    78.31721     0.70651        4777    76.80000    79.70000      0.40055      0.63289
Viscosity    61     1.46426     0.02101    89.32000     1.41500     1.51500    0.0004387      0.02095

Pearson Partial Correlation Coefficients, N = 61
Prob > |r| under H0: Partial Rho=0

                 Extract    Viscosity
Extract          1.00000     -0.28477
                               0.0274
Viscosity       -0.28477      1.00000
                  0.0274


Multiple Regression of Extract with the Remaining Quality Traits

Number of Observations Read    61
Number of Observations Used    61

Analysis of Variance

                            Sum of        Mean
Source             DF      Squares      Square    F Value    Pr > F
Model              14     12.79922     0.91423       2.45    0.0113
Error              46     17.15020     0.37283
Corrected Total    60     29.94943

Root MSE            0.61060    R-Square    0.4274
Dependent Mean     78.31721    Adj R-Sq    0.2531
Coeff Var           0.77965

Parameter Estimates

                      Parameter     Standard
Variable       DF      Estimate        Error    t Value    Pr > |t|
Intercept       1     104.48451     28.92954       3.61      0.0007
Plump           1       0.02014      0.05121       0.39      0.6959
Protein         1      -0.75220      2.00083      -0.38      0.7087
amylase         1       0.00776      0.01664       0.47      0.6432
DP              1      -0.01021      0.01091      -0.94      0.3544
kolbach         1      -0.06250      0.48064      -0.13      0.8971
Solprot         1       0.52908      3.91020       0.14      0.8930
Color           1       0.13624      0.31566       0.43      0.6680
FAN             1      -0.00303      0.00299      -1.01      0.3175
Betagluc        1      -0.00673      0.00276      -2.44      0.0187
Viscosity       1     -12.02975      4.45003      -2.70      0.0096
Fructose        1       1.44893      1.32733       1.09      0.2807
Glucose         1      -0.44204      1.17175      -0.38      0.7077
Maltose         1       0.36908      0.72849       0.51      0.6148
Maltotriose     1       0.41147      1.45583       0.28      0.7787


Stepwise Regression of Extract with the Remaining Quality Traits

The STEPWISE Procedure
Model: MODEL1
Dependent Variable: Extract

Number of Observations Read    61
Number of Observations Used    61

Stepwise Selection: Step 1

Variable Betagluc Entered: R-Square = 0.2109 and C(p) = 6.3871

Analysis of Variance

                            Sum of        Mean
Source             DF      Squares      Square    F Value    Pr > F
Model               1      6.31677     6.31677      15.77    0.0002
Error              59     23.63265     0.40055
Corrected Total    60     29.94943

               Parameter     Standard
Variable        Estimate        Error    Type II SS    F Value    Pr > F
Intercept       79.88098      0.40203         15814    39479.1    <.0001
Betagluc        -0.00891      0.00224       6.31677      15.77    0.0002

Bounds on condition number: 1, 1

Stepwise Selection: Step 2

Variable Viscosity Entered: R-Square = 0.2749 and C(p) = 3.2468

Analysis of Variance

                            Sum of        Mean
Source             DF      Squares      Square    F Value    Pr > F
Model               2      8.23325     4.11663      10.99    <.0001
Error              58     21.71617     0.37442
Corrected Total    60     29.94943

               Parameter     Standard
Variable        Estimate        Error    Type II SS    F Value    Pr > F
Intercept       92.34866      5.52445     104.62588     279.44    <.0001
Betagluc        -0.00816      0.00219       5.17312      13.82    0.0005
Viscosity       -8.60486      3.80337       1.91648       5.12    0.0274

Bounds on condition number: 1.0235, 4.0941

Stepwise Selection: Step 3

Variable DP Entered: R-Square = 0.3424 and C(p) = -0.1711

Analysis of Variance

                            Sum of        Mean
Source             DF      Squares      Square    F Value    Pr > F
Model               3     10.25321     3.41774       9.89    <.0001
Error              57     19.69621     0.34555
Corrected Total    60     29.94943

               Parameter     Standard
Variable        Estimate        Error    Type II SS    F Value    Pr > F
Intercept       97.60920      5.73588     100.06673     289.59    <.0001
DP              -0.01539      0.00636       2.01996       5.85    0.0188
Betagluc        -0.00714      0.00215       3.80458      11.01    0.0016
Viscosity      -10.81702      3.76662       2.84984       8.25    0.0057

Bounds on condition number: 1.0898, 9.7265

Stepwise Selection: Step 4

Variable Protein Entered: R-Square = 0.3786 and C(p) = -1.0835

Analysis of Variance

                            Sum of        Mean
Source             DF      Squares      Square    F Value    Pr > F
Model               4     11.33903     2.83476       8.53    <.0001
Error              56     18.61039     0.33233
Corrected Total    60     29.94943

               Parameter     Standard
Variable        Estimate        Error    Type II SS    F Value    Pr > F
Intercept      104.18019      6.69752      80.40991     241.96    <.0001
Protein         -0.39460      0.21830       1.08582       3.27    0.0760
DP              -0.01341      0.00634       1.48878       4.48    0.0388
Betagluc        -0.00621      0.00217       2.71760       8.18    0.0059
Viscosity      -12.22975      3.77565       3.48674      10.49    0.0020

Bounds on condition number: 1.1602, 18.191

All variables left in the model are significant at the 0.1500 level.
No other variable met the 0.1500 significance level for entry into the model.

Summary of Stepwise Selection

        Variable     Variable    Number     Partial       Model
Step    Entered      Removed     Vars In    R-Square    R-Square       C(p)    F Value    Pr > F
   1    Betagluc                       1      0.2109      0.2109     6.3871      15.77    0.0002
   2    Viscosity                      2      0.0640      0.2749     3.2468       5.12    0.0274
   3    DP                             3      0.0674      0.3424    -0.1711       5.85    0.0188
   4    Protein                        4      0.0363      0.3786    -1.0835       3.27    0.0760
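
The summary statistics in the listings can be cross-checked by hand, which may help when answering questions 2a, 3c, and 4b. The following is a sketch of that arithmetic using only the sums of squares printed in the output. The symbols n (observations), p (independent variables in the model), and SSE(.) (error sum of squares for a model containing the listed variables) are notation introduced here, and the adjusted R-square for the final stepwise model is a derived value that the STEPWISE listing does not print.

```latex
% Full model from PROC REG: n = 61, p = 14
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
     = \frac{12.79922}{29.94943} \approx 0.4274 \\
R^2_{\text{adj}} &= 1 - \frac{SS_{\text{Error}}/(n-p-1)}{SS_{\text{Total}}/(n-1)}
     = 1 - \frac{17.15020/46}{29.94943/60} \approx 0.2531 \\
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}}
     = \frac{0.91423}{0.37283} \approx 2.45
\end{align*}

% Final stepwise model (Step 4): n = 61, p = 4
\begin{align*}
R^2 &= \frac{11.33903}{29.94943} \approx 0.3786 \\
R^2_{\text{adj}} &= 1 - \frac{18.61039/56}{29.94943/60} \approx 0.334
\end{align*}

% Squared partial correlation of Extract and Viscosity given Betagluc,
% from the error sums of squares at stepwise Steps 1 and 2
\begin{align*}
r^2_{\text{Extract,Viscosity}\,\cdot\,\text{Betagluc}}
  &= \frac{SSE(\text{Betagluc}) - SSE(\text{Betagluc},\text{Viscosity})}{SSE(\text{Betagluc})} \\
  &= \frac{23.63265 - 21.71617}{23.63265}
   = \frac{1.91648}{23.63265} \approx 0.0811 \\
|r| &\approx 0.2848
\end{align*}
```

The sign of the partial correlation follows the sign of the Viscosity coefficient at Step 2 (negative), which agrees with the -0.28477 reported by PROC CORR. The full model has the higher R-square (0.4274 against 0.3786), but its adjusted R-square is lower (0.2531 against roughly 0.334), which is relevant when judging model adequacy in questions 3d and 4c.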