St512 Exercises SSII10

1. I have 5 fish tanks with water at each of 4 pH values (4, 5, 6, and 7) in a completely randomized design. Each of these 20 tanks holds 1 fish, so altogether there are 20 fish. The weight gain of the fish in tank j at pH level i is $Y_{ij}$.

a) Write down the linear model.

b) The following model recognizes that we have four groups from an experimental setting, where the interest is in comparing the effects of the different pH levels on the response, weight gain:

$$Y_{ij} = \mu + \tau_i + e_{ij}, \qquad i = 1, \dots, 4 \ \text{(pH levels)}, \quad j = 1, \dots, 5 \ \text{(tanks per pH level)}$$

An associated regression model is

$$Y_{ij} = \beta_0 + \beta_1 X_{1j} + \beta_2 X_{2j} + \beta_3 X_{3j} + \beta_4 X_{4j} + e_{ij}, \qquad i = 1, \dots, 4 \ \text{(pH levels)}, \quad j = 1, \dots, 5 \ \text{(tanks per pH level)}$$

where

X1 = 1 if the observation belongs to pH level 1, and 0 otherwise;
X2 = 1 if the observation belongs to pH level 2, and 0 otherwise;
X3 = 1 if the observation belongs to pH level 3, and 0 otherwise;
X4 = 1 if the observation belongs to pH level 4, and 0 otherwise.

These 0/1 dummy variables identify the group to which each observation belongs.

c) Write down the X matrix.

For the intercept and the dummy variables X1-X4, the X matrix (20 rows, 5 columns) is

         int  X1  X2  X3  X4
          1    1   0   0   0
          1    1   0   0   0
          1    1   0   0   0
          1    1   0   0   0
          1    1   0   0   0
          1    0   1   0   0
          1    0   1   0   0
    X  =  1    0   1   0   0
          1    0   1   0   0
          1    0   1   0   0
          1    0   0   1   0
          1    0   0   1   0
          1    0   0   1   0
          1    0   0   1   0
          1    0   0   1   0
          1    0   0   0   1
          1    0   0   0   1
          1    0   0   0   1
          1    0   0   0   1
          1    0   0   0   1

(taken from DD ST512 notes)

Note
a. This matrix is called the design matrix because it incorporates the experimental setting and the experimental factor of the study.
b. Because the four dummy columns add up to the intercept column, the columns of X are linearly dependent (X'X is singular). One column is redundant and, in this case, could be dropped without losing any information about the effect of pH on weight gain.

d) I am interested in analyzing the following comparisons between the pH levels:
a. The largest vs the smallest pH level
b. The two central pH levels
c. Find a third comparison orthogonal to the first two.
This question is directly tied to the main objective of running an experiment: we want to be able to say something about the pH levels studied and their effects (and differences) on the response. A simple way to do this is to set up a set of comparisons of interest to the researcher. These comparisons are described as linear combinations of the pH-level effects (group means). A linear combination whose coefficients sum to zero is called a contrast, and the associated null hypothesis sets the contrast equal to zero. A contrast compares two groups: one made up of the pH levels with positive coefficients and the other of the pH levels with negative coefficients.

Two contrasts C1 and C2 are orthogonal if the sum of the cross products of their coefficients is zero, SUM_C1_i*C2_i = 0. In our case a complete set of orthogonal contrasts has (number of pH levels - 1) contrasts. Several such sets exist, and we choose the one that best serves our objectives, if possible. The objectives cannot always be represented by an orthogonal set.

How do we choose the coefficients of each orthogonal contrast? Use the "description":

    pH level                        4    5    6    7
    C1 "lowest vs largest"         -1    0    0    1
    C2 "compare middle levels"      0    1   -1    0

Note that the columns for pH follow the natural order seen in the X matrix (important!!).

Are C1 and C2 orthogonal?

    C1 = (-1)(pH 4) + (0)(pH 5) + (0)(pH 6) + (1)(pH 7)
    C2 = (0)(pH 4) + (1)(pH 5) + (-1)(pH 6) + (0)(pH 7)

    Sum_C1*C2 = (-1)(0) + (0)(1) + (0)(-1) + (1)(0) = 0, thus C1 and C2 are orthogonal.

How many orthogonal contrasts do we have available? There are 4 - 1 = 3 orthogonal contrasts that may be used instead of the dummy variables to analyze the effect of pH on fish weight gain in the experiment. We need a third contrast, C3, orthogonal to C1 and C2.
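As a minimal sketch of how such a third contrast could be searched for, the snippet below enumerates candidate coefficient vectors and keeps those that sum to zero and are orthogonal to both C1 = (-1, 0, 0, 1) and C2 = (0, 1, -1, 0). Restricting the coefficients to -1, 0, and 1 is an assumption made only to keep the search small; the helper names `dot` and `is_contrast` are hypothetical.

```python
from itertools import product

# Contrast coefficients in the pH order 4, 5, 6, 7.
C1 = (-1, 0, 0, 1)   # lowest vs largest pH
C2 = (0, 1, -1, 0)   # the two middle pH levels

def dot(u, v):
    """Sum of cross products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))

def is_contrast(c):
    """A contrast has coefficients summing to zero and is not all zeros."""
    return sum(c) == 0 and any(c)

# Assumption: search only coefficients in {-1, 0, 1}.
candidates = [
    c for c in product((-1, 0, 1), repeat=4)
    if is_contrast(c) and dot(c, C1) == 0 and dot(c, C2) == 0
]

print("C1 . C2 =", dot(C1, C2))
print("candidates for C3:", candidates)
```

Under that restriction the only survivors are one vector and its negative, which describe the same comparison with the roles of the two groups swapped.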
A suggestion is to compare the two groups involved in C1 and C2, i.e., (pH 4 and 7) vs (pH 5 and 6):

    pH level                        4    5    6    7
    C3 "extremes vs middle"         1   -1   -1    1

    C1*C3 = (-1)(1) + (0)(-1) + (0)(-1) + (1)(1) = 0
    C2*C3 = (0)(1) + (1)(-1) + (-1)(-1) + (0)(1) = 0

Thus C1, C2, and C3 are an orthogonal set of contrasts that may be used instead of the dummy variables to analyze the effect of pH on weight gain.

2. Here is a PROC PRINT and a PROC REG where you see I have 5 treatments A through E (from a completely randomized design) and some ORTHOGONAL columns C1 through C4. My model is the usual:

Y(ij) = Mu + Tau(i) + e(ij)

$$Y_{ij} = \mu + \tau_i + e_{ij}, \qquad i = 1, \dots, 5 \ \text{(treatments)}, \quad j = 1, \dots, 3 \ \text{(repetitions per treatment)}$$

SAS output: 5 treatments, completely randomized design

    TRT    Y    C1    C2    C3    C4
    A      ??    1    -1     0     2
    A      ??    1    -1     0     2
    A      ??    1    -1     0     2
    B      ??    0     0    -1    -3
    B      ??    0     0    -1    -3
    B      ??    0     0    -1    -3
    C      ??   -1    -1     0     2
    C      ??   -1    -1     0     2
    C      ??   -1    -1     0     2
    D      ??    0     2     0     2
    D      ??    0     2     0     2
    D      ??    0     2     0     2
    E      ??    0     0     1    -3
    E      ??    0     0     1    -3
    E      ??    0     0     1    -3

    C1 is A vs C
    C2 is D vs (A and C)
    C3 is B vs E
    C4 is (A, C, and D) vs (B and E)

Model: MODEL1
Dependent Variable: Y

Analysis of Variance

    Source     DF    Sum of Squares    Mean Square    F Value    Prob>F
    Model       4    631.00403         157.75101      32.621     0.0001
    Error      10     48.35803           4.83580
    C Total    14    679.36207

Parameter Estimates

    Variable    DF    Parameter Estimate    Standard Error    T for H0: Parameter=0    Prob > |T|    Type I SS
    INTERCEP     1    36.501133             0.56779123         64.286                  0.0001        19984.98
    C1           1     4.271500             0.89775677          4.758                  0.0008        109.4742
    C2           1    -0.350833             0.51832011         -0.677                  0.5138        2.215507
    C3           1    -7.514333             0.89775677         -8.370                  0.0001        338.791
    C4           1     1.416267             0.23179980          6.110                  0.0001        180.5230

a) Compute, if possible, the F test for the null hypothesis that the 5 treatments all have the same mean.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$$

$$F = \frac{\text{Model MS}}{\text{Error MS}} = \frac{157.75101}{4.83580} = 32.621$$

The p-value is 0.0001, so we reject the null hypothesis: at least one treatment mean differs from the others.
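Because the contrast columns are orthogonal to each other and to the intercept, each Type I SS in the output equals the squared estimate times the sum of squared column entries, and each standard error equals the square root of MSE divided by that same sum. The following sketch re-derives those quantities and the overall F from the printed estimates; the variable names are illustrative, and small differences from the SAS output come from rounding in the printed values.

```python
import math

reps = 3          # repetitions per treatment
mse = 4.83580     # Error Mean Square from the ANOVA table

# Contrast coefficients per treatment, in the order A, B, C, D, E.
coef = {
    "C1": (1, 0, -1, 0, 0),
    "C2": (-1, 0, -1, 2, 0),
    "C3": (0, -1, 0, 0, 1),
    "C4": (2, -3, 2, 2, -3),
}

# Parameter estimates as printed by PROC REG.
est = {"C1": 4.271500, "C2": -0.350833, "C3": -7.514333, "C4": 1.416267}

type1_ss = {}
std_err = {}
for name, c in coef.items():
    # Sum of squared entries of the full 15-row column.
    col_ss = reps * sum(v * v for v in c)
    type1_ss[name] = est[name] ** 2 * col_ss
    std_err[name] = math.sqrt(mse / col_ss)

# With orthogonal columns the Type I SS add up to the Model SS.
model_ss = sum(type1_ss.values())
f_overall = (model_ss / len(coef)) / mse

for name in coef:
    print(name, round(type1_ss[name], 4), round(std_err[name], 6))
print("Model SS:", round(model_ss, 3), " overall F:", round(f_overall, 3))
```

The recovered Model SS and F should agree with 631.00403 and 32.621 up to rounding, which is a useful check that the columns really are orthogonal.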
b) Compute, if possible, the F test for the null hypothesis that treatments E and B have the same mean in the population (H0: Tau(2) = Tau(5)).

Note that
. the mean for treatment 1 is $\mu_1 = \mu + \tau_1$, for treatment 2 it is $\mu_2 = \mu + \tau_2$, and so on;
. equality of means implies equality of effects.

We want to test $H_0: \tau_2 = \tau_5$, or equivalently $H_0: \tau_2 - \tau_5 = 0$. This comparison is the same as contrast C3, so the null hypothesis may be expressed as $H_0: \beta_{C3} = 0$.

From the ANOVA results, the requested F value is the F value for C3:

$$F_{C3} = \frac{338.791 / 1}{4.83580} = 70.06 = (-8.370)^2 = t^2 \ \text{(for C3)}$$

The p-value is 0.0001, so we reject the null hypothesis and conclude that the means of treatments B and E are significantly different.

c) Let b2 and b3 denote the estimated regression coefficients for columns C2 and C3 respectively. Compute the standard error of the difference b2 - b3 from the standard errors of the two coefficients.

Standard error of b2 - b3 is ____

The hypothesis is $H_0: \beta_{C2} = \beta_{C3}$, or $H_0: \beta_{C2} - \beta_{C3} = 0$.

Since the two contrasts are orthogonal, the covariance of their estimates is zero, so

$$\operatorname{var}(b_2 - b_3) = \operatorname{var}(b_2) + \operatorname{var}(b_3) \quad \text{and} \quad \text{s.e.}(b_2 - b_3) = \sqrt{\operatorname{var}(b_2) + \operatorname{var}(b_3)}$$

which gives

$$\text{s.e.}(b_2 - b_3) = \sqrt{0.51832011^2 + 0.89775677^2} = 1.036640$$

d) Compute the partial SS for C2, C4.

Because the columns are orthogonal, each partial SS equals the corresponding sequential SS:

Partial SS (C2 | int, C1, C3, C4) = 2.215507 = Seq SS (C2 | int, C1)
Partial SS (C4 | int, C1, C2, C3) = 180.5230 = Seq SS (C4 | int, C1, C2, C3)

e) I want to test the null hypothesis that the coefficients of C2 and C3 can simultaneously be set to 0 in the above regression. Give the F test statistic F = _________ for this hypothesis (2 and 10 degrees of freedom).

$$H_0: \beta_{C2} = \beta_{C3} = 0$$

The variables should enter the model in the order int, C1, C4, C2, C3, so that the two variables being tested enter last. We then use

$$F = \frac{(\text{SS Regression}_{\text{full}} - \text{SS Regression}_{\text{reduced}}) / 2}{\text{MS Error}} = \frac{(2.215507 + 338.791) / 2}{4.83580} = 35.2585$$

Note that since C2 and C3 are orthogonal to the other columns, their Type I SS and Type II SS are the same, which is why the printed Type I SS can be used directly. In general this shortcut may not be valid, and we would have to rerun the model with C2 and C3 entering last (sequentially).
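The arithmetic in parts b), c), and e) can be re-checked with a few lines. This is only a sketch that plugs in the values printed in the SAS output; the variable names are illustrative, and the shortcut in part e) relies on the orthogonality of the contrast columns.

```python
import math

mse = 4.83580                          # Error Mean Square
ss_c2, ss_c3 = 2.215507, 338.791       # Type I SS for C2 and C3
se_b2, se_b3 = 0.51832011, 0.89775677  # standard errors of b2 and b3
t_c3 = -8.370                          # t statistic for C3

# b) One-degree-of-freedom F for C3; it should match t squared.
f_c3 = (ss_c3 / 1) / mse

# c) Orthogonal columns give zero covariance, so the variances add.
se_diff = math.sqrt(se_b2 ** 2 + se_b3 ** 2)

# e) Joint test of C2 and C3 with 2 numerator degrees of freedom.
f_joint = ((ss_c2 + ss_c3) / 2) / mse

print("F for C3      ~", round(f_c3, 2), " t^2 ~", round(t_c3 ** 2, 2))
print("s.e.(b2 - b3) ~", round(se_diff, 6))
print("joint F(2,10) ~", round(f_joint, 4))
```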