NCSU ST512 QUIZ 3 Sum2 2011 (July 22, 2011)

1) A study of the effect of water acidity on plant growth uses 10 plants watered at each of 5 acidity levels (Treatment = A, B, C, D, E) in a completely randomized design, getting these means and variances:

Treatment      Mean $\bar{y}_{i\cdot}$   Variance $s_i^2$   $\sum_j (y_{ij} - \bar{y}_{i\cdot})^2$
A              16                        10                 10*9 = 90
B              12                        10                 10*9 = 90
C              11                         6                  6*9 = 54
D               9                        10                 10*9 = 90
E              10                         6                  6*9 = 54
Overall mean   11.6
SUM                                                         378

a) Compute the error sum of squares SS[E] = 378 and its degrees of freedom = 5*(10-1) = 45.

Since $s_i^2 = \frac{\sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\cdot})^2}{10 - 1}$, we have $\sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\cdot})^2 = (10 - 1)\, s_i^2$, and therefore

$SS[E] = \sum_{i=1}^{5} \sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\cdot})^2 = \sum_{i=1}^{5} (10 - 1)\, s_i^2 = 378$

2) All questions refer to this: I have regressed Y (yield) on P = soil pH, M = soil moisture, T = soil temperature, and MT, where MT = M*T is the product of moisture and temperature. Let Y be the column vector of observed yields and X the X matrix containing a column of 1s plus columns for the explanatory variables. Here is a partial PROC REG output:

Dependent Variable: Y

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       4   2214.67512       553.66878     34.051    0.0001
Error      46    747.95233        16.2598
C Total    50   2962.62745

Model is $y_i = \beta_0 + \beta_1 P + \beta_2 M + \beta_3 T + \beta_4 MT + e_i$

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|
INTERCEP    1   -34.468222           93.43489693      -0.369                  0.7139
P           1    -1.866855            2.16613910      -0.862                  0.3932
M           1     1.854880            2.40241982       0.772                  0.4440
T           1     1.611417            1.39034011       1.159                  0.2524
MT          1     0.009275            0.03598928       0.258                  0.7978

Variable   DF   Type I SS      Type II SS
INTERCEP    1   1249887         2.212767
P           1   14.4795        12.077159
M           1   1406.756208     9.692811
T           1   792.359286     21.841861
MT          1   1.079988        1.079988

Type I SS for P: 2214.67512 - (1406.756208 + 792.359286 + 1.079988) = 14.4795

a) How many rows and columns does my X matrix have? 51 rows and 5 columns.

b) Where possible, fill in the blanks in the above output.

Your boss says that temperature has no effect on yield and no term involving temperature should be included in the regression.
She says, "Look at the p-values 0.2524 and 0.7978 and you will see you can leave out both terms with T."

c) What hypothesis is being tested by the test statistic tcalc = 1.159 for temperature (T)? Give the degrees of freedom for the test statistic.

$H_0: \beta_3 = 0$ vs $H_1: \beta_3 \neq 0$. The t-test has (n-1-4) = 46 df.

d) Compute the test statistic for testing the null hypothesis that the regression coefficients for T and MT are both simultaneously equal to 0.

Reduced model (under H0): $y_i = \beta_0 + \beta_1 P + \beta_2 M + e_i$

Under H0,

$F = \frac{(SS[E]_r - SS[E]_f) / (dfe_r - dfe_f)}{MSE_f} = \frac{(792.359286 + 1.079988) / 2}{16.2598} = 24.3988$

e) Write the null hypothesis and the df of the test statistic. Use 4.5 as the critical value. Do you reject H0?

$H_0: \beta_3 = \beta_4 = 0$ vs $H_1$: not all $\beta_i = 0$ ($i = 3, 4$). The test statistic has 2 and 46 df. Fcalc = 24.3988, which is greater than the stated critical value 4.5 and greater than F(2, 46, 0.05) = 3.19, so we reject H0.

f) Do you agree with your boss's statement? Explain your answer.

No. Although the individual tests for T and MT are nonsignificant, the simultaneous test of both coefficients $\beta_3$ and $\beta_4$ is highly significant, which means that we should not drop both variables. The nonsignificant p-values for the tests that each partial regression coefficient equals zero are due to the high collinearity present among the explanatory variables. It will be necessary to analyze the pattern of correlation among the variables and the nature of the MT interaction. Note that MT has a small Type I SS = 1.079988, and its p-value = 0.7978.

3) In an experimental study, a researcher incorporated a polychlorinated biphenyl (PCB) mixture in the diets of lab mice fed ad libitum. Rates were 0, 62.5, 250, and 1000 ppm. After two weeks, the mice were injected with Nembutal and their sleeping times recorded. Each diet was randomly assigned to three mice for a total of 12 mice.
The researcher wants to determine whether the four PCB diets (treatments) have an effect on sleeping times, and whether this effect, if significant, may be presented as a linear relationship between sleeping time and PCB rate.

Analysis of variance table for Treatments
Dependent Variable: SleepingTimes
Source            DF   Sum of Squares   Mean Square    F Value   Pr > F
Treatment          3   5352.916667      1784.305556    12.10     0.0024
Error              8   1180.000000       147.500000
Corrected Total   11   6532.916667

a) Give the model (regression) sum of squares when fitting the following regression model:

$Y_j = \beta_0 + \beta_1 X_j + \beta_2 X_j^2 + \beta_3 X_j^3 + e_j, \quad j = 1, \ldots, 12$

SS[R] = 5352.916667 (a cubic in X through four distinct rates reproduces the four treatment means, so SS[R] equals the Treatment sum of squares).

To answer whether a linear regression line is adequate to represent the relationship between Treatment (PCB-tailored diets) and mean response (SleepingTimes), the following analysis was carried out.

b) Fill in the blanks in the table below.

The GLM Procedure
Class Level Information
Class    Levels   Values
CRATES   4        0 62.5 250 1000

Dependent Variable: SleepingTimes
Source   DF   Type I SS      Mean Square    F Value   Pr > F
rates     1    542.688300     542.688300     3.68     0.0914
CRATES    2   4810.228367    2405.114183    16.31     0.0015

Type I SS for CRATES: 5352.916667 - 542.688300 = 4810.228

c) Calculate the F value for CRATES. What hypothesis is tested by this F-test? Write out the hypotheses and p-value, and then interpret the results of this test.

F = 2405.1141/147.5 = 16.31, with numerator df = 2 and denominator df = 8, where 147.5 is the MSE from the full model. This is the lack-of-fit test for the linear regression on rates: H0: the treatment means lie on a straight line in rate ($\beta_2 = \beta_3 = 0$ in the cubic model) vs H1: they do not. The p-value is 0.0015.

d) Compute the pure error sum of squares.

Pure error SS = SS[E] from the full model = 1180.0 with 8 df, which gives the pure error mean square MSE = 1180.0/8 = 147.5.

e) Summarize the findings of this analysis. Would you recommend using a linear equation to represent the relationship between SleepingTimes and PCB diet content? Support your answer.

Since the lack-of-fit test from the linear regression of SleepingTimes on rates was significant, there is evidence (p = 0.0015 < 0.05) that the relationship between rates and SleepingTimes is not just linear, but of higher degree, either quadratic or cubic.
It is therefore necessary to test the quadratic regression coefficient and the lack of fit from the quadratic model.
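
The arithmetic behind the three answers can be re-checked with a short script. This is only a sketch: the quiz itself uses SAS output, Python is swapped in here for convenience, and every variable name below is illustrative rather than taken from the quiz. All numbers are copied from the tables above.

```python
# Values are copied from the quiz tables; variable names are illustrative only.

# Problem 1: SS[E] from the per-treatment sample variances (10 plants each).
n_per_trt = 10
variances = [10, 10, 6, 10, 6]           # s_i^2 for treatments A..E
ss_e = sum((n_per_trt - 1) * s2 for s2 in variances)
df_e = len(variances) * (n_per_trt - 1)  # 5 * (10 - 1)

# Problem 2: the Type I (sequential) SS add up to the Model SS,
# so the missing Type I SS for P is found by subtraction.
ss_model = 2214.67512
type1 = {"M": 1406.756208, "T": 792.359286, "MT": 1.079988}
type1["P"] = ss_model - sum(type1.values())

# Partial F test of H0: beta3 = beta4 = 0 (T and MT entered last),
# using the full-model MSE as the denominator.
mse_full = 16.2598
f_partial = ((type1["T"] + type1["MT"]) / 2) / mse_full

# Problem 3: lack-of-fit F for the linear regression on rates.
ss_trt = 5352.916667                     # Treatment SS, 3 df
ss_linear = 542.688300                   # Type I SS for rates, 1 df
ss_pure_error, df_pure_error = 1180.0, 8
mse = ss_pure_error / df_pure_error      # pure error mean square
ss_lof = ss_trt - ss_linear              # Type I SS for CRATES, 2 df
f_lof = (ss_lof / 2) / mse

print(f"SS[E] = {ss_e}, df = {df_e}")
print(f"Type I SS for P = {type1['P']:.4f}")
print(f"Partial F (T, MT) = {f_partial:.4f}")
print(f"MSE = {mse}, lack-of-fit F = {f_lof:.2f}")
```

Up to rounding in the last digit, the printed values agree with the answers above: SS[E] = 378 on 45 df, a Type I SS for P of about 14.48, a partial F of 24.3988, and a lack-of-fit F of 16.31 computed against MSE = 147.5.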