ASSUMPTIONS IN THE ANOVA

Assumptions in the ANOVA and the mathematical model may not always be true in data from experiments.

1. The error terms or residual effects, eij, are independent from observation to observation and are randomly and normally distributed with zero mean and the same variance σ². This can be expressed as: eij are iN(0, σ²).
2. Variances of different samples are homogeneous.
3. Variances and means of different samples are not correlated, i.e., are independent.
4. The main effects (block and treatment) are additive.

Need to ensure that the data fit the assumptions of the analysis. Know the assumptions and the tests for violations of the assumptions.

Weights, lb, of vitamin-treated and control animals in an RCBD (from Little and Hills)

                                Block
Treatment               I       II      III       IV     Total     Mean
Mice-control          0.18     0.30     0.28     0.44      1.2      0.3
Mice-vitamin          0.32     0.40     0.42     0.46      1.6      0.4
  Subtotal            0.50     0.70     0.70     0.90      2.8     0.35
Chickens-control       2.0      3.0      1.8      2.8      9.6     2.40
Chickens-vitamin       2.5      3.3      2.5      3.3     11.6     2.90
  Subtotal             4.5      6.3      4.3      6.1     21.2     2.65
Sheep-control        108.0    140.0    135.0    165.0    548.0    137.0
Sheep-vitamin        127.0    153.0    148.0    176.0    604.0    151.0
  Subtotal           235.0    293.0    283.0    341.0   1152.0    144.0
Total                240.0    300.0    288.0    348.0   1176.0
Mean                  40.0     50.0     48.0     58.0              49.0

ANOVA

Source                 df           SS          MS          F    F.05   F.01
Total                  23   111,567.40
Treatment             (5)   108,713.68   21,742.74   174.43**    2.90   4.56
  Species               2   108,321.16   54,160.58   434.51**    3.68   6.36
  Vitamin               1       142.11      142.11     1.14      4.54   8.68
  Species x Vitamin     2       250.41      125.20     1.00      3.68   6.36
Block                   3       984.00      328.00     2.63      3.29   5.42
Error                  15     1,869.72      124.65

Weights of mice, chickens and sheep are significantly different, which is hardly a surprise. No effect of vitamin is seen.

Test whether assumptions are met.

1. Normally-distributed, Random and Independent Errors

Generally, deviations from the assumption of normality do not seriously affect the validity of the analysis of variance. One informal test for normality is to graph the data. This is also very useful for detecting outliers.
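The sums of squares in the ANOVA table above can be checked with a short script. The following is a minimal sketch in plain Python, assuming the usual RCBD correction-factor formulas; the function name `rcbd_anova` and the dictionary layout are hypothetical choices made for this illustration, not part of the original handout.

```python
# Weights (lb) from the table above: one row per treatment, one column per block (I-IV).
weights = {
    "Mice-control":     [0.18, 0.30, 0.28, 0.44],
    "Mice-vitamin":     [0.32, 0.40, 0.42, 0.46],
    "Chickens-control": [2.0, 3.0, 1.8, 2.8],
    "Chickens-vitamin": [2.5, 3.3, 2.5, 3.3],
    "Sheep-control":    [108.0, 140.0, 135.0, 165.0],
    "Sheep-vitamin":    [127.0, 153.0, 148.0, 176.0],
}


def rcbd_anova(data):
    """Return (df, SS, MS, F) per source for a randomized complete block design.

    `data` maps each treatment name to its observations, ordered by block.
    (Hypothetical helper written for this illustration.)
    """
    rows = list(data.values())
    t = len(rows)        # number of treatments
    r = len(rows[0])     # number of blocks
    grand_total = sum(sum(row) for row in rows)
    cf = grand_total ** 2 / (t * r)   # correction factor

    ss_total = sum(x * x for row in rows for x in row) - cf
    ss_trt = sum(sum(row) ** 2 for row in rows) / r - cf
    block_totals = [sum(row[j] for row in rows) for j in range(r)]
    ss_block = sum(b * b for b in block_totals) / t - cf
    ss_error = ss_total - ss_trt - ss_block   # error SS by subtraction

    df_trt, df_block = t - 1, r - 1
    df_error = df_trt * df_block
    ms_trt = ss_trt / df_trt
    ms_block = ss_block / df_block
    ms_error = ss_error / df_error
    return {
        "Total": (t * r - 1, ss_total, None, None),
        "Treatment": (df_trt, ss_trt, ms_trt, ms_trt / ms_error),
        "Block": (df_block, ss_block, ms_block, ms_block / ms_error),
        "Error": (df_error, ss_error, ms_error, None),
    }


anova = rcbd_anova(weights)
for source, (df, ss, ms, f) in anova.items():
    ms_text = f"{ms:12.2f}" if ms is not None else " " * 12
    f_text = f"{f:10.2f}" if f is not None else ""
    print(f"{source:<10}{df:>4}{ss:14.2f}{ms_text}{f_text}")
```

Run on the weights as tabulated, this gives the Total, Treatment, Block and Error sums of squares shown in the ANOVA table (111,567.40, 108,713.68, 984.00 and 1,869.72). Splitting the treatment sum of squares into species, vitamin and interaction is left out to keep the sketch short.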
A better test for normality is to calculate and graph the error components or residuals for each observation. This is equivalent to graphing the data after correcting for treatment and block effects.

Calculation of error component:

    eij = Xij - (treatment mean) - (block mean) + (grand mean)

For example, for mice-control in block I: 0.18 - 0.3 - 40.0 + 49.0 = 8.88.

Error components in vitamin experiment

                                Block
Treatment               I       II      III       IV     Total
Mice-control          8.88    -1.00     0.98    -8.86       0
Mice-vitamin          8.92    -1.00     1.02    -8.94       0
Chickens-control      8.60    -0.40     0.40    -8.60       0
Chickens-vitamin      8.60    -0.60     0.60    -8.60       0
Sheep-control       -20.00     2.00    -1.00    19.00       0
Sheep-vitamin       -15.00     1.00    -2.00    16.00       0
Totals                   0        0        0        0       0

These errors appear to occur in groups, rather than randomly. Graphing the errors gives the following:

[Figure: Errors in vitamin experiment]

Range of errors is larger for the larger means. For means over 100, the errors increase linearly as the means increase. This hardly appears random. This data set fails the assumption of normally-distributed, random and independent errors.

2. Homogeneity of Variances

Calculate the variance for each treatment. For treatment 1:

    SS(t1) = ΣX² - (ΣX)²/r = 0.18² + ... + 0.44² - (1.2)²/4 = 0.0344
    s²(t1) = SS/df = 0.0344/3 = 0.0115

Perform Bartlett's test for homogeneity of variance.

Treatment      df        s²    Coded s² (s² x 1000)    Log coded s²
Mice - C        3    0.0115                    11.5            1.06
Mice - V        3    0.0035                     3.5            0.54
Chick - C       3    0.3467                   346.7            2.54
Chick - V       3    0.2133                   213.3            2.33
Sheep - C       3     546.0                 546,000            5.74
Sheep - V       3     404.7                 404,700            5.61
Total          18                           951,275           17.82
Mean                                        158,546
Log of mean                                   5.200

The variances range from 0.0035 to 546.0 and are clearly not homogeneous.

3. Independence of Means and Variances

To check whether the variances are homogeneous and independent of the means, compare each treatment's variance and standard deviation with its mean:

Treatment      Mean         s²         s    s²/Mean    s/Mean
Mice - C        0.3    0.01147     0.107       0.04      0.36
Mice - V        0.4    0.00347     0.059       0.01      0.15
Chick - C       2.4     0.3467     0.589       0.14      0.24
Chick - V       2.9     0.2133     0.462       0.07      0.16
Sheep - C     137.0      546.0    23.367       3.98      0.17
Sheep - V     151.0      404.7    20.116       2.68      0.13

Standard deviation is closely related to the mean: s/Mean stays within a narrow range while s²/Mean does not. This suggests that a log transformation should be used.

4. Additivity

Terms in the mathematical model for a design are additive.
In an RCB, the treatment and block effects are assumed to be additive. This means the treatment effects are the same in all blocks and the block effects are the same in all treatments.

Additive effects

             Block I                 Block II
Trt 1          10        (+10)          20
             (+20)                    (+20)
Trt 2          30        (+10)          40

Multiplicative effects

             Block I                 Block II
Trt 1          10       (+100%)         20
            (+200%)                  (+200%)
Trt 2          30       (+100%)         60

Log transformation can transform multiplicative effects into additive effects.

Log transformed effects

             Block I                 Block II
Trt 1         1.00      (+0.30)        1.30
            (+0.48)                  (+0.48)
Trt 2         1.48      (+0.30)        1.78

Tukey's test for additivity

                                Block
Treatment               I       II      III       IV      Mean   Trt Effect
Mice-control          0.18     0.30     0.28     0.44      0.3        -48.7
Mice-vitamin          0.32     0.40     0.42     0.46      0.4        -48.6
Chickens-control       2.0      3.0      1.8      2.8     2.40        -46.6
Chickens-vitamin       2.5      3.3      2.5      3.3     2.90        -46.1
Sheep-control        108.0    140.0    135.0    165.0    137.0         88.0
Sheep-vitamin        127.0    153.0    148.0    176.0    151.0        102.0
Mean                  40.0     50.0     48.0     58.0     49.0
Block Effect          -9.0      1.0     -1.0      9.0

ANOVA

Source             df         SS         MS        F    F.05
Error (BxT)        15    1869.72
  Nonadditivity     1    1822.94    1822.94    545.7    4.60
  Residual Err     14      46.78       3.34

The F for nonadditivity far exceeds the tabulated value, so the assumption of additivity is incorrect.

POSSIBLE COURSES OF ACTION

1. Analyze species separately. Have valid tests for each species, but no information on interactions.

Species     Source     df         SS        MS          F
Mice        Block       3     0.0400    0.0133       8.31
            Vitamin     1     0.0200    0.0200      12.50*
            Error       3     0.0048    0.0016
Chickens    Block       3       1.64     0.547      41.00**
            Vitamin     1       0.50     0.500      37.5**
            Error       3       0.04     0.013
Sheep       Block       3     2834.0     944.7      157.4**
            Vitamin     1      392.0     392.0       65.3**
            Error       3       18.0       6.0

See a significant effect of vitamin in each species. No test for effect of species or for interaction.

2. Transform data and re-analyze. Because standard deviation is proportional to the mean, use a log transformation.

Source         df          SS          MS           F
Block           3     0.12075     0.04025      13.77**
Treatment       5    28.60738     5.72148    1959.41**
  Vitamin       1     0.04860     0.04860      16.62**
  Species       2    28.54926    14.27463    4883.00**
  VxS           2     0.00952     0.00476       1.63
Error          15     0.04385     0.00292

Effect of species is significant.
Effect of vitamin is significant.
Interaction is not significant.

Implications: Can use a mouse model instead of the larger and more expensive sheep.

After transforming, the tests of the assumptions need to be redone to make sure the assumptions are now met.

Example: Bartlett's test on transformed data

Treatment      df        s²    Coded s²    Log coded s²
Mice - C        3    0.0243        24.3            1.39
Mice - V        3    0.0040         4.0            0.60
Chick - C       3    0.0118        11.8            1.07
Chick - V       3    0.0048         4.8            0.68
Sheep - C       3    0.0062         6.2            0.79
Sheep - V       3    0.0038         3.8            0.58
Total          18                  54.9            5.11
Mean                               9.15
Log of mean                      0.9614

The assumption of homogeneity of variance is now met.

TRANSFORMATIONS

1. Log transformation

Use when
- standard deviation is proportional to the mean
- main effects are multiplicative rather than additive
- data are whole numbers and cover a wide range of values
Cannot be used if the data have negative values.

Examples
- number of insects per plot
- number of egg masses per plant

Coding
- if data have values <10, multiply by a constant (power of 10)
- if data have 0's, use X + 1

Meaning of analysis
- Original data: does the amount of change in X vary in response to treatment?
- Transformed data: does the proportion or percent change in X vary in response to treatment?

2. Square root transformation

Use when
- variance is proportional to the mean (Poisson distribution)
- data are counts of rare events
- data are small whole numbers
- data are percentages where the range is 0 to 30% or 70 to 100%

Examples
- number of infected plants per plot
- number of insects caught in traps
- number of weeds in a plot

Coding
- if data have values <10, use X + ½

Meaning of analysis
- detransformed means are weighted with more weight given to smaller values (smaller variates are measured with less sampling error than the larger ones)
- increases the precision with which differences between small means are measured
- deviations from assumptions and corrections generally smaller than for log transformation

3. Arcsine or angular transformation

Transform by taking the arcsine of the square root of each data point expressed as a proportion (percentage/100). Tables are available to go directly from percentages to arcsine transformation.

Use when
- data are based on counts and expressed as percentages or proportions of sample total
- data follow a binomial distribution
- variances are small on the two ends of the range of values, but larger in the middle
- range of percentages is over 40%

Examples
- percent germination

Coding
- none

Meaning of analysis
- means are weighted with more emphasis given to means at the two ends of the range
- precision of comparisons is increased at the ends of the range

Tests of Means

Comparison of means of transformed data by multiple range tests
- use transformed means
- use variance of the transformed data

Transformed means are weighted means and are correct for the transformed data.

Presentation of Means

For presentation or publication, detransform the means to the original units to make them more understandable. Showing mean differences by letters can be confusing because detransformed means may not look very different from each other, although they are significantly different when transformed. Explain the transformation that was used and why. Discuss the mean differences in the text.
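The three transformations and the detransformation of means described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a prescribed procedure: logs are taken to base 10, the arcsine is reported in degrees, and all function names are hypothetical. The vitamin-experiment weights are used to compare the spread of treatment variances before and after the log transformation.

```python
import math
from statistics import mean, variance


def log_transform(x, add_one=False):
    """log10(X), or log10(X + 1) when the data contain zeros."""
    return math.log10(x + 1) if add_one else math.log10(x)


def sqrt_transform(x, add_half=False):
    """sqrt(X), or sqrt(X + 1/2) when the data contain values below 10."""
    return math.sqrt(x + 0.5) if add_half else math.sqrt(x)


def arcsine_transform(percent):
    """Arcsine of the square root of the proportion, in degrees (assumed unit)."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def detransformed_log_mean(values, add_one=False):
    """Mean computed on the log scale, expressed in the original units."""
    m = mean(log_transform(x, add_one) for x in values)
    return 10 ** m - 1 if add_one else 10 ** m


# Weights (lb) from the vitamin experiment, one list per treatment (blocks I-IV).
weights = {
    "Mice-control":     [0.18, 0.30, 0.28, 0.44],
    "Mice-vitamin":     [0.32, 0.40, 0.42, 0.46],
    "Chickens-control": [2.0, 3.0, 1.8, 2.8],
    "Chickens-vitamin": [2.5, 3.3, 2.5, 3.3],
    "Sheep-control":    [108.0, 140.0, 135.0, 165.0],
    "Sheep-vitamin":    [127.0, 153.0, 148.0, 176.0],
}

# Treatment variances on the original scale and on the log scale.
raw_var = {name: variance(obs) for name, obs in weights.items()}
log_var = {name: variance([log_transform(x) for x in obs])
           for name, obs in weights.items()}

# Ratio of largest to smallest variance: a quick look at homogeneity.
raw_ratio = max(raw_var.values()) / min(raw_var.values())
log_ratio = max(log_var.values()) / min(log_var.values())
print(f"largest/smallest variance, original scale: {raw_ratio:,.0f}")
print(f"largest/smallest variance, log scale:      {log_ratio:.1f}")

# Means for presentation: arithmetic mean versus detransformed log mean.
for name, obs in weights.items():
    print(f"{name:<18}{mean(obs):10.3f}{detransformed_log_mean(obs):10.3f}")
```

The detransformed log mean is a geometric mean, so it is somewhat smaller than the arithmetic mean of the same observations. That is one reason to state which transformation was used when presenting detransformed means.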