Biometrika Trust The Transformation of Data from Entomological Field Experiments so that the Analysis of Variance Becomes Applicable Author(s): Geoffrey Beall Source: Biometrika, Vol. 32, No. 3/4 (Apr., 1942), pp. 243-262 Published by: Biometrika Trust Stable URL: http://www.jstor.org/stable/2332128 . Accessed: 27/01/2011 11:45 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at . http://www.jstor.org/action/showPublisher?publisherCode=bio. . Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika. http://www.jstor.org THE TRANSFORMATION OF DATA FROM ENTOMOLOGICAL FIELD EXPERIMENTS SO THAT THE ANALYSIS OF VARIANCE BECOMES APPLICABLEt By GEOFFREY BEALL Dominion EntomologicalLaboratory,Chatham,Ontario,Canada 1. INTRODUCTORY THE presentpaper deals with experimentson the controlof insects in the field. In such experimentalworkthe problemto be investigatedis whethermoreinsects survive on plots which have been subjected to one treatment than on plots subjected to another. It will be shown in the presentpaper that the numbersof insectsfoundper plot must vary in such a way that one cannot, strictly,subject the results to the analysis of variance, and it is proposed to findhow the data may be transformedso that analysis of variance becomes applicable. Such transformationhas been discussed by Bartlett (1936a,b) in connexion with entomologicalexperiments,and by Tippett (1934) in connexionwith industrial experiments. 2. EXPERIMENTAL RESULTS CONSIDERED The data used in the followingwork are results from seven insecticidal experimentsarrangedby the author at Chatham, Ontario. The workwas carried out with replicatedblocks containingplots subjected to treatmentsof which the assignmentwas random. This procedure,normalin agronomicwork,was supplemented by one repetitionof each treatmentwithina block. The assignmentof the repetitionof a treatmentwas independentof the firstfor that treatment, except that, of course, the same plot could not be chosen twice. This repetition was carriedout to obtain estimates of variabilitywithinblocks. In these experiments complete counts were not made but random sampling was employed. Experiments on Pyrausta nubilalis Hubn., reportedby Beall et al. (1939), for whichresultsare shownin Tables 1 and 2, weremade on one area at two different periods,whereas experimentson LeptinotarsadecemlineataSay, forwhichresults are indicated in Tables 3 and 4, were carriedout on contiguousareas at the same time. Three,similar experimentswere carried out in one place on the tobacco hornworm,Phlegethontius quinquemaculataHaw., forwhich the data are shown in Tables 5-7. Reference is also made to the data from a uniformitytrial on insects of Beall (1939). ScienceService,DepartmentofAgriculture, t PublicationNo. 2101,DivisionofEntomology, Ottawa,Canada. 244 Transformation ofdatafromentomological fieldexperiments Table 1. Numbersofan insect,Pyraustanubilalis,perplot. Experiment I Block T rea t- ment 1 1 2 2 3 3 4 4 5 5 6 6 7 7 _ _ _ _ - _ _ _ _ _ _ - _ _ _ _ - _ _ _ _ _ _ - _ _ _ _ - _ _ _ _ _ _ - _ _ _ 1 2 3 4 5 6 7 8 9 10 15 27 19 11 16 19 14 34 16 23 12 15 43 47 23 20 12 28 15 16 23 21 16 12 14 16 28 81 21 23 34 37 22 18 10 9 19 12 17 15 35 30 31 33 16 16 25 21 19 34 26 10 10 28 36 69 22 34 20 26 13 19 17 19 15 12 14 13 50 35 14 27 10 18 21 24 18 9 11 17 24 22 69 29 18 17 24 19 18 21 7 15 18 13 24 11 62 71 11 13 23 13 38 18 18 16 23 21 17 7 63 47 21 20 14 10 27 12 8 12 27 12 7 5 42 50 34 26 13 9 10 18 17 12 6 9 3 4 40 43 Table 2. Numbersofan insect,Pyraustanubilalis,perplot. Experiment II Block T reat-- ment 1 1 2 2 3 3 4 4 5 5 6 6 7 7 _ _ _ _- _ _ _ -- _ _ - _ _ 1 2 3 4 5 6 7 8 9 10 32 18 6 9 10 4 2 24 13 17 13 17 37 44 38 40 23 14 21 21 17 13 2 22 10 9 58 71 27 39 8 20 25 26 11 13 5 23 21 29 28 55 7 12 4 13 10 4 3 10 0 8 4 5 11 20 13 19 3 15 13 9 10 6 18 14 10 18 24 26 14 26 18 14 20 14 10 14 10 16 8 15 44 27 26 30 26 15 33 30 26 28 33 26 17 19 30 43 25 19 27 19 48 18 13 11 23 22 15 16 44 52 22 18 17 19 28 27 22 34 20 15 13 27 56 39 30 28 19 10 27 18 17 7 34 34 16 23 45 58 _ 245 GEOFFREY BEALL Table 3. Numbersof an insect,Leptinotarsa decemlineata, per plot. ExperimentIII Block Treatment 1 1 2 2 3 3 4 4 1 2 3 4 5 6 7 305 207 97 93 270 153 7 12 391 364 49 51 105 190 42 2 420 639 21 25 341 348 34 22 355 527 12 37 469 212 8 4 287 293 3 4 82 100 1 1 175 248 10 12 57 285 10 3 454 397 10 1 221 309 4 3 Table 4. Numbersof an insect,Leptinotarsa decemlineata, per plot. ExperimentIV Block Treatment 1 1 2 2 3 3 4 4 1 2 3 4 5 6 253 239 16 95 18 40 2 2 145 265 13 54 130 137 0 1 309 166 74 159 165 118 22 31 665 230 110 108 137 142 6 8 99 302 14 14 153 239 129 9 93 237 5 13 78 63 3 8 1 Table 5. Alumbersof an insect,Phlegethontiusquinquemaculata, per plot. ExperimentV Block Treatment 1 1 2 2 3 3 1 2 3 4 5 6 6 4 0 2 15 12 5 15 1 2 17 22 6 13 1 1 22 16 13 6 0 4 28 11 6 10 1 1 8 13 11 15 1 1 16 25 246 Transformationof data from entomologicalfieldexperiments Table 6. Numbersofan insect,Phlegethontiusquinquemaculata, perplot. Experiment VI Block Treatment > 3 1 2 3 4 5 6 1 1 2 2 3 12 13 13 20 13 9 6 9 9 5 8 9 5 4 7 1 4 4 11 5 5 12 8 4 10 4 4 5 5 6 6 1 2 13 11 1 2 0 2 12 4 8 2 1 1 3 1 9 4 4 4 6 9 6 4 3 5 11 8 5 12 3 1 7 7 7 8 _ 9 9 7 5 6 10 _ 4 4 | _ 3 7 7 7 9 2 Table 7. Numbersofan insect,Phlegethontiusquinquemaculata, perplot. Experiment VII Block Treatment 1 1 2 2 3 3 4 4 0 5 6 6 3. THE RELATIONSHIP 1 2 3 10 7 11 17 0 1 3 5 3 5 11 9 20 14 21 11 7 2 12 6 3 5 15 22 14 12 16 14 3 1 4 3 3 6 15 16 4 I 10 23 17 17 2 1 5 5 1 1 13 1 0 I 5 6 17 20 19 21 3 0 5 5 3 2 26 26 14 13 7 13 1 4 2 4 6 4 24 13 BETWEEN THE STANDARD DEVIATION AND THE MEAN IN THE EXPERIMENTAL DATA If x is the numberof insects on one of a group of small contiguousareas, say plots, within a larger area, say a block, let the expectation of x over all these plots be M and the standard deviation be o; then over a number of the larger GEOFFREY BEALL 247 areas, when the insectsare distributedin a completelyrandom fashion,fromthe (1) Poisson distribution, 0.2 = M. As is discussed by 'Student' (1919) one cannot, however,anticipate that (1) will be satisfiedwhen organisms occur in groups, as, say, when insects come from masses of eggs, or when thereis a change in expectation fromplot to plot within a block. Generally,0-2 will tend to be greaterthan M and we can only say 02 = f(M). (2) The formoff(M), in (2), must be consideredcarefully,since it bears on the form of the transformationwhich may be developed to make the standard deviation independentof the mean. In dealingwith (2), Bartlett (1936a) startedbysupposingthat,approximately, 0.2= KM, (3) whereK is a constant. Generally,in fielddata, however,the relationshipbetween 0.2 and M, or of theirrespectiveestimates,S2 and x, does not, as in Fig. 1, appear to be linear; rather,the departureof S2 fromx3becomes disproportionatelygreat as x increases. This relationshipbetween departures and the magnitude of the mean has been discussed by Clapham (1936) in connexion with data on the frominsects as much as floweringplants, and distributionof organismsdiffering with very low mean have the squared distributions he showed that only those standard deviation close to the mean. Our discussionabove on the shortcomingsof (3) suggeststhe conclusionthat 0,2- Moc M (4) is generallyuntrue. We propose to considerthe possibilitythat the curvilinearity of (2) mightbe bettermet by supposing that o,2- Equation (5) leads to 2= Moc M2. (5) M + kM2, (6) wherek is a constant. It will be noticed that k = (o2 - M) M-2 (7) is the Charlier coefficientof disturbance from a Poisson distribution. This was employed by Beall (1935). coefficient It is possible to considerthe suitabilityof (3), as comparedwith (6), by finding and how,respectively,they fitobservationson S2 and x. To fitexactly is difficult, it was foundnecessaryto fall back on an empiricaldeterminationof K and of k; thus, if there are a number of pairs of estimates, x and s2, from(3) and (6) we K = Zs2/L'x, (8) estimate k = (L's2-2X2)/2?;2, whereL representsthe summationover all pairs. (9) 248 Transformationof data from entomologicalfield experiments Since in the work presentedin ? 2, x and 82, being based on only two observations,are highlyvariable, these experimentsdo not show clearlythe suitability of (3) and (6). Accordingly,referenceis made instead to the data fromthe uniformitytrial on LeptinotarsadecemlineataSay of Beall (1939). When the mean and standard deviation of 144 samplingunits withineach of 16 areas were considered, the estimates from (8) and (9) were K = 2*405 and k = 0-2548. For these values from(1), (3) and (6), curves,describedas lines 1, 2 and 3 respectively, are plottedin Fig. 1, the observedvalues ofmean and squared standard deviation 3 25- 2 20I 15- z u O 4 X6 I10 Fig. 1. The squaredstandarddeviationplottedagainstthe meanfor144 smallareas withineach of16 largeareas; line 1 is fromequation(1), line2 from(3) and line3 from(6). The countshad decemlineatc been made on Leptinotarsa Say. are also shown. In the cases where the mean is near unity the departure of the squared standard deviation fromthe mean, i.e. fromline 1, appears to be trivial, but as the mean increases the departure becomes more marked. It can be seen that the observations lie more snugly about line 3 from(6) than about line 2 from (3). Generally, for the data from field studies the same effecthas been observed. Such resultssuggestthat (6) may be generallya betterapproximation to the formoff(M) than (3) and make it preferableto proceed with the analysis of data fromthe assumption (6). 4. THE TRANSFORMATIONS OF FIELD DATA Fig. 1 shows clearly how, within an area, the variability of the numbers of insects on sub-areas is related to the mean numberof insectsper sub-area. This relationshipwill make invalid the use of the analysis ofvariance on experimental GEOFFREY BEALL 249 results involving counts on insects, since the expectation of the variance should be the same for all plots. To overcome this invalidity,Bartlett (1936a) suggestedtransforming the observations,x, fromthe basis of (3). The transformation found was xi, which Bartlett modifiedto (x + J)k. From ? 3 it was seen, however, that for field data the relationship between standard deviation and mean may be representedbetterby equation (6) than by (3), and, since the form of the transformation depends on the formoff(M), a freshtransformation must be sought.A transformation, as is developed in the Appendix to the presentpaper, is suggestedby the method of Tippett (1934), i.e. x' = k-i sinh-1(kx)i. (10) An advance note of this transformationwas published by Beall (1940). The adequacy of this transformationmust be judged from the extent to which it stabilizes variability. In (10), if we express sinh-1(kx)k,when kx < 1, as a welknown series,we have xi=xi-= k2xl (11) kxi+ 430- 5X2kx+..., where it is obvious that for k = O, x' = xi. Of course, for large values of kx, x' varies almost as log x, or as the log (x+ 1) used by Williams (1937), and so our proposed expansion may be regarded,for practical purposes, as embracing the root and logarithmictransformations. Table 8 gives the transformation,(10), fora probable range of observations, x, and fork at intervalswhichwillprobablybe close enoughforpracticalpurposes. This table was computedin part by inverseinterpolationfromthe table ofhyperbolic functionsof the SmithsonianMathematicalTables (Becker & Van Orstrand, 1931), and in part from (12). Should values of x' be required outside those of Table 8, these can convenientlybe calculated from x = k-i loge {(kx)i + (1+ kx)i}. (12) In preparingTable 8 the question arose of whether,instead of dealing with k-i sinh-1(kx)k, one should not use k-i sinh-l {kk(x+ J)k} in the same way as Bartlett (1936a) dealt with the transformation,(x+ 1)i, instead of xi. This modificationwas rejected on the basis of results of the transformation,as disand cussed in ? 5, since it was foundthat the addition of 2 made little difference did not give, consistently,an improvement. For fielddata, in making the transformation(10), it is necessaryto estimate the value of k empiricallyby (9) forwhich estimates x, of the mean and s, of the standarddeviation,mustbe found.The mostobviousmethodin practiceofmaking these estimatesseems to be to put more than one plot subjected to a given treatmentin a block and so to estimatethe chance variation of resultsfora plot within a block. In the presentwork,as is discussed in ? 2, two plots were subjected to a given treatmentin each block and this is probably good practice. Transformationof data from entomologicalfield experiments 250 *< t- t- cX ~~~~~~~~~X= N C) CC G OC t-O O~~~~~~Cp i O 0O . . . . . s r. CD . .* .q + t-~~ t- ee . tn.. . X x tvm _ t-t t:) (X) N OC C5 O t-m Q t ~~~~~ 6 ~~~~~~~~~~~~~~6 O t-X w ~~~~ ~~~~~ - O0 "- m __ O c O~ O-'-- ~ O t-~M 6 Q 'IQ s G _ 10 x . - t-. . - t C -- .ll t- q. M 00 aq OXtCO crN c:M:G- t- t- 16, 16cccc~ 4c X2 C'.- ",d C . . . c cs cs cs cs cc "dq to0 c~ m c~ c~-I 1c qc M 03 M t- N XXX c . ~ c~14c 16 1e0t Q>e sc s N CY. :C aC t< :t t-:c:s cc,, cr, CD CD cc:*u CD <u CC . (\ O cs cs ce c+ c* C< c:) c:)tt-x x 1 r.O Q x d o~ o~ o~-c c6 t-C5 cs cs CererC er*++:uu e t- c~ C~D444444c cc~ ~-CtC- -c = 0 - t m t t- -O C~.c ~c c 444 _ __ __ Nt- c~~~c c~444 cc~~~~~~~~~~~~~~c~~~~~ 6. ~ ~~~t ~~~~---~~~~~~i 1 et C'1C'16l <1 mcm m6,1-. co t 7 aq 1 16 c~c~c~ 16 1c16 X ___ __. QC N 0e u *c x Tt1 sest" m c csc mm t-s X Q c d . 7. N = 6 N or, c er~~~~~~~~~~~~~~~~6 tX o 61 ceu: i~~~~~~~~~~~t co ; sm VDX CX N m1 s CO Lce} _r M Me CD<3 'Id. 16 m .c m _ :1.C'1 C'1c'l c'l c'l c'l 6q N' *:' CQC QCC1 t- . 4c~6 6 O N: co) t-c 00 <t cO 6 O O' tct- ^ NNNN - aq to 00 cc c c t-o tO _ - 1>X CC. ~~~c ~ in st-sc~ir~f ~ 41. r- N V X o---_ G--CQ61 :'l 51 0 . = o *&. O ci CC ( cc . t- (M O _4 __ o I- a N (M ~~~~~~~ G~ co o - 00 _ _ ifG-,r 15co t-1-t I' M "- N I*4444o I-*I- ~c ~c o 4.. arz. e o o o o - csi _ _ _ cs cs c 4(ce _ cec~444 ,_ _ _ _ _, _ _ , _ _ _ _ _. 444444 _~~~~~ ~~~~~~, ~ _ _, _ EH - - -- -----o----- 9 o O CQ _-~ o o o es CQ - M- t~~-od O C; ~- - -- ~ 7 , - ~ -.- C;Dcs cs > nX ++++++ :u ce c;t c ce CDu:u XX co 00 Ct- cs c; C - ~~~~~~~~~cs O~~~~~~~6 o1 61 O O -- c------ ~ C;4 c~~~~~c~~~~c~~~~ 4444444444 ~~~ o -- ,::_ t-s00 00 t-o cQ Q_o QC 00 C q - _UC U_C cs cs CA)c D*+++u oc ec re :u :nu:u :u ~CO U<COOC: 4 44 4i~ii + - lstXsOOOOOssXtssXe o O O O t-o es s c: Xoce +Uc:c c: t-X + O _ Ce CA cc sC :stXso ~~~~ CCC,~~~m c. N C, 444 l q 00 ~ 44c~'4444 . .4. m N N 0 C ~C~C4444~ 4444 44444 010C10C 4444444~~~~~~~~~. . .4 . r4 N r N444 00 C~C. ~C~C ;~C~ ccm10t-440 ; ; ~c~C~C~C. cCcc~~44 44 ~C~T1~C c 00m 0 ~444444 . .44 aqNNalN*INNa x mMX ~c~~ c~cc~ . .. 251 BEALL GEOFFREY 4. . . r444 0 = 1', 444 N0 00a 10 'Zi l1 M 0 0000 m 00 0Cmx -4 ______ m .~C C~,-"ZM' tmI 0 44 4444444444. C~C~C~C; ;I c 44 444444444i0 10 44 INa ;4_ -C0 .1 r- o 0MNN . . 4444444iWmM. M m - - --------- 10 t-o 4 4lf l~f~4~lf xxxii Biommetrika l~lf 1701 l~ ~~~--------- t-O f~lb4~~bIb - -~ -b b -b b e- 0m 0t e- e- e- x c-t-t e-C e-~ xt bG 252 Transformationof data from entom,ological field experimbents In the special case wherethereare two plots forthe ith treatment(i = ,n) in thejth block (j = 1, ..., N), and so two observations,xjl and Xij2 the estimate of the mean will be writtenxij and of the squared standard deviation (13) S9) = 2(Xijl-Xij2 Then from(9), we estimate k from k= 2 (in N EE(xijl n -xij2)2- N j(xijl i( + xij2)j n N Xjl 1+ A-1 Xi2)j (14) and the calculation is very light. 5. RESULTS SHOWING ON THE THE EFFECT VARIABILITY OF THE TRANSFORMATION OF DATA Trheadequacy of our proposed transformationmay be judged in two ways: first,with respectto its effect,which we shall considerin the presentsection,on the differences between repetitionsof a treatmentwithina block, and secondly, with respect to its effect,which we shall considerin ? 6, on the behaviour of the quantities submittedto the analysis of variance. It is a fundamentalassumption in the analysis of variance that the chance variabilityforeach plot shall be, when the effectof block and of treatmentare removed, normallydistributedwith a standard deviation common to all plots, in which situation of course the standard deviation of the chance variabilityfor a given plot is independentof the expectation forthat plot. In the data of the presentwork, where each treatmentis repeated in each block, it is possible to examine the estimates of this standard deviation,8ij, and of the expectation,x,j.. For a clear graphical illustrationofthe situationconsiderFig. 2, as obtained from the original data of Experiment III on Leptinotarsadecemlineata,where sij is plotted against xij, and contrast this situation with that obtaining for the correspondingquantities s'j and x4j, obtained after transformation(k = 008) in Fig. 3. In Fig. 2 the points are widely scattered as is natural froma sample of two; nevertheless,it is apparent that forthe smallest values of xij the values of s8 are correspondingly small and fall in a close group. In Fig. 3 the clusterof observations in the lower left-handcornerof the previous diagram has disappeared, and generallythe scatter appears to be independentof x4j,so that apparently the transformation gave satisfactoryresults.The nature of the material involved that it is such does not seem possible to examine the relationshipunder considerationmore exactly, nor to summarizeexactly the correspondingresultsfor the other treatments;it can only be said that the same type of result appeared although the magnitudeof the relationshipbeforetransformationdepended on the magnitudeof the differences between the effectsof treatments. The resultsshown in Figs. 2 and 3 suggestthat the proposed transformation has tendedto make thestandarddeviationiindependentofthe mean,in accordance GEOFFREY 253 BEALL theanalysisofvariance.In usingthisprocedure underlying withtheassumptions one actuallyassumes,morebroadly,thata commonstandarddeviationexists,so should beforeand aftertransformation ofobservations thatthehomoscedasticity be tested.Thus it is assumedthatxijl and xij2 are observationsfroma normal 200- 0 50 100- * 50- 0 00 0 200 100 Xij 400 300 500 Fig. 2. The standarddeviationand meanas estimatedfromplotsby pairs, data on Leptinotarsa decemlineata. withuntransformed 2-0 . 1*5(J* 1.0~~~~ 0 1*0- 0 *~~~~~~ 0.5-~~~~ 05 * 0 2 0 * 4 * / Xijj 6 * @0 8 Fig. 3. The standarddeviationand meanas estimatedfromplotsbypairs,withthe transformed data on Leptinotarsa decemlineata Say, i.e. usingx' = k-i sinh-' (kx)1,(k = 0 08). ofi and j. Then populationwitha standarddeviation,o, whichis independent _(x (Zzk- )2}-2 is as x2 withonedegreeoffreedom.tAccordingly, distributed t In theL1 test,discussedby Nayer(1936),thiscase ofestimatesofstandarddeviationwith sincezerovaluestendto arisewhendealingwithgroupedor is troublesome one degreeoffreedom to a geometric integralobservations.Whenthisis the case L1, whichis theratioofan arithmetic have a wider maytherefore meanof sumsofsquares,cannotbe calculated.The presenttreatment atrnlication. I7-2 254 Transformationof data from entomologicalfield experiments yi= (x6ijl- xj2)/V2o- should be distributednormallywith unit standard deviation forall i and j. In orderto test the hypothesisofnormalitywithunit standard deviation it is only necessaryto test forleptokurtosis;forthe distributionmust and thereforeof yij, is a matter of be symmetricalsince the sign of differences, chance. Since the numberof items involved will almost certainlybe < 100, and since the population,mean is zero, the wncriterionofGeary (1935) will providean appropriate test. In using this criterionwe must find the ratio of the mean deviation to the standard deviation,i.e. Wn nnN = n ( I xjl -Xij2 i=lj=l N _ EE Xj2) (Xijlf nNi=lj=1 (15) Of course,values of w,,may be calculated fortransformeddata by substitution of XZjk for Xijk. Table 9. The w. teston thehomoscedaisticity ofcountsbyplots a six within blockfor fieldexperiments Untransformed | {Experi- mxent anN I II III IV V VI 70 70 28 24 18 36 Upper Lower lImit limit 0-841 0 841 0 866 5% 0 872 0 881 0 857 5% Transformed (Beall) (Bartlett) - _ _ - _ _ _- _ Wn Departure by S.D. wn 0 757 0 757 0 737 0 6659 0 7885 0 5973 - 5 33 - 0 49 - 5 33 0 7525 0 7807 0 6431 0 728 0 745 0 7554 0 7959 -1 16 - 0 22 0 8155 0 8166 0 732 V Transformed 0 5692 - 5 67 0 6948 -_ _ - - _ - Departure by S.D. Estimated Employed n Departure by S.D. 191 079 -415 0 078 0 046 0 084 0 08 0 04 0 08 0-7846 0 7499 0 6838 - 0-64 - 2 01 - 3 11 0 08 0 02 0-7823 0 8130 - - 267 0 285 0 082 +0 13 +0 38 1 0 019 0 30 0 7370 - 166 -0 58 +0 28 Values of w.,fromthe untransformedobservationsand fromthe transformed observations, both following Bartlett (i.e. the transformation(x + )i) and followingthe line suggested in the present paper, are shown in Table 9 for the thevalues ofk as calculated fielddata ofTables 1-6. For the secondtransformation from(14) are shown as well as the nearest value of k entered in Table 8. For each experimentthe value of nN and also the 0 05 limits of probability,from Geary (1935), are shown. There are also shown the departuresof observed Wn from the expected value in termsof the standard deviation, a useful criterion since the distributionof wn is almost normal. From Table 9 it can be seen that out of the three experimentsin which Wn fell beyond the lower 5 % limit of probability for the untransformeddata and the data transformedas (x+ 1)i, in only one experimentdid Wn fall so with the finaltransformation.The results forExperimentII, in which Wn is decreased by the transformation, are peculiar. Considerationof the departuresfromthe mean in termsof the standard deviation indicates more clearly the improvementeffectedby each transformation GEOFFREY BEALL 255 and how the transformationsuggestedin the presentwork secures an improvement of the same, but more marked, character than that secured from the of Bartlett. The resultssuggestthat whilehomoscedasticitymay transformation not be attained always, it will be approached by means of the proposed transformation. 6. THE EFFECT OF THE TRANSFORMATION ON THE ANALYSIS OF VARIANCE besides As was indicatedat the beginningof ? 5, our proposedtransformation makingthe variabilitywithina block fora repeated treatmentthe same for all treatmentsand blocks, should also provide quantities satisfyingthe assumptionsunderlyingthe analysis of variance. Since it is not quite clear how,in so far as the transformation is satisfactoryin the firstway, it will necessarilybe satisfactoryin the second, it will be well to considerdirectlythe suitabilityof our transformedvalues forthe analysis ofvariance. In the application of the analysis of variance one would deal with xij rather than withXijk and suppose that (16) xii. = A + Bi + Cj + Dij, whereA is a contributionfromthe general level of population on the experimental area, Bi the contributionof the ith treatmentand Cj the contributionof thejth block. The remainderterm,Dij, is called the interactionof treatmentsand blocks. Of course, the present discussion on the untransformedvalues, xij, holds for the transformedvalues, xj = (xt1 + x*j2) when the appropriate symbols,A', B*, CQand Dj are used. In material satisfyingthe conditionsunderlyingthe analysis of variance, for the observationsunder each treatment,the calculated squared standard devia1 N tion is )2. (17) s N-1 jl(x,j. Following the argumentof the analysis of variance, xj - xi, of which the mean is 0, is an estimateof Cj + Dij, in whichthe two termsare independent;hence the expectationofS2is C2 = 0-2 (18) where ocj and oDj are the standard deviations of the parameters,Ci and Dij, respectively,and are independent of treatment.Accordingly,si should be independentoftreatmentand distributedas an estimateofoi, havingN - 1 degrees of freedom. Conversely,if xij. cannot be built of the independenttermsof (16), then the various values of si will not be distributedas estimates of a single standard deviation. The hypothesisthat the values of si in any one experiment are estimates of one quantity may be tested.t withDr R. W. B. Jackson,thewriterhas learnedthathe had arrived t Fromcorrespondence at the same test. independently 256 Transformationof data from entomologicalfieldexperiments The results of the tests on the homogeneityof the values, si, withinthe six experimentstreated in the present paper are presentedin Table 10, where the value of the L1 criterionis shown forthe original data and forthe transformed values togetherwith the appropriate0*05and 0.01 levels of probability(Nayer, 1936). From Table 10 it can be seen that of the values of L1 obtained fromthe original data, all but one are near or beyond the 0 05 level of significance,but that aftertransformationall are moved in to less significantvalues. Accordingly, the values ofsi, whencalculated fromthe originaldata, appear heterogeneousbut the correspondingvalues obtained afterthe transformation appear homogeneous. Thus it is more probable that the analysis of variance is applicable to the transformeddata than to the untransformed. Table 10. The homogeneity, as measuredby thecriterionL1, of theestimates si for various values of i beforeand aftertransformation Experiment 1 LI beforetransformation LI aftertransformation 1 % limit 5 % limit 0 867 0 864 0 757 03812 2 0 0 0 0 833 941 757 812 3 4 0.325 0 766 0 604 0 707 0 0 0 0 657 688 542 656 [5 0 344 0 813 0 514 0 648 6 0 0 0 0 680 730 583 673 In Table 10 we have tested the homogeneityof the estimates,si, as in ? 5 we tested the homogeneityof sij, that is without referenceto the values of the associated means. In view ofour originalassumptionswe are, however,interested in the possibilitythat the standard deviations, as calculated, might show every sign of being estimates of a common standard deviation and yet be dependent on the associated means. Accordingly,we have investigated such dependence roughlyby fittingby least squares a firstorderregressionof si on xi.. From this fittingwe recordthe sign of the regressionas follows: Experiment Before transformation Aftertransformation I 1 2 31 +* + +* - + - I 4 5 6 + ** + + + + -* GEOFFREY BEALL 257 By a single asteriskwe have indicated cases where the reductionin variability effectedby the regressionpassed the 5 % probability limit and by a double asteriskwhereit passed the 1 % limit. Several points may be noted. (1) In two cases (Experiments 2 and 6) after transformationthe residual sun1 of squares about the regressionwas greaterthan the reductionin squares due to the regression, whereas it was consistentlyless beforetransformation.(2) As can be seen above, the regressiongenerallydid not effecta significantreductionin variability aftertransformation but did before(the small numberofdegreesoffreedommade high significancedifficultof attainment). (3) Aftertransformationthe sign of the regressionseemed to be a chance matter,whereas beforetransformationit was consistentlypositive. These results suggest that the transformationproposed did tend to make the variabilitywithina given treatmentindependentof the mean forthat treatment. 7. THE EFFECT OF TRANSFORMATION UPON THE CONCLUTSIONS FROM THE ANALYSIS OF VARIANCE It has been shown in ?? 5 and 6 that the analysis of variance can be made on entomologicaldata when a suitable transformationhas been effected.It is of willhave upon practicalinterestto see what numericaleffectsuch transformation tests on the significanceof, say, the effectof treatmentand the significanceof differences fortreatments. First, consider the numerical results to be obtained from the analysis of variance (1) withoutand (2) with transformation.Thus the mean square ascribable to blocks, treatmentsand their interactionis shown in Table 11, for six experimentsof which the data are given in ? 2; parallel resultsare presentedfor untransformedobservationsand forobservationstransformedby (10) with the values of k fromTable 9. To facilitatethe comparisonof the results,the mean square for blocks and for treatmentsis expressed in terms of the estimate for interaction,as the F of Snedecor (1934), and presentedin each case. The transformationof the data has modifiedthe conclusionsto be drawnfromthe analysis ofvariance in Table 11, in that thereare considerablechangesin the criterion,F, fortreatmentsor forblocks. In the examples shownthe effectof treatmentswas highlysignificantin all cases and so the changes introducedby transformation did not alter the conclusions,as would have been the case forless definiteeffects. Consider next the effectof transformationon the significanceof differences between the means fortreatmentsas tested by the criterion,t, calculated with such estimates of mean square as the interactionof Table 11. For illustration, values of t, fromthe data on Leptinotarsadecemlineata(Experiment III), are shownin Table 12 foreach possible comparisonoftreamentswhenuntransformed data are used, when the transformation, (x + 1)i, as suggestedby Bartlett (1936a) is used, and when the transformation, k-i sinh-1(kx)i, as suggestedin the present paper is used. In orderthat the influenceof the level of population under each Table 11. The analysis of variance of untransformed and transformed data in six experiments D egrees Variation Untransformed data of freedom __ _ _ _ __ Transformed data _ _ _ _ _ _ _ _ Mean Mean square square _ _ _ F Betweenblocks Betweentreatments Interaction ExperimentI. P. nubilalis 9 92-8 1 21 6 2,839-0 36*95 54 76-8 0-565 7-51 0-341 1 66 22-03 1 Betweenblocks Betweentreatments 1 Interaction ExperimentII. P. nubilalis 9 577-0 6-72 6 1,721-0 20-04 54 85-9 537 8-69 0 509 10-55 17-07 Betweenblocks Betweentreatments Interaction ExperimentIII. L. decemlineata 6 20,172-0 2-26 3 390,932-0 43-77 18 8,931-0 4-67 111-5 1 54 3 03 72-16 Betweenblocks Betweentreatmients Interaction ExperimentIV. L. decemlineata 5 12,960.0 2-06 3 124,054.0 20-40 15 6,7270 2-06 20-40 1-05 1-96 19-41 Betweenblocks Betweentreatments Interaction ExperimentV. P. quinquemaculata 5 27.4 2*74 0*349 2 752-0 75.33 19.5 10 9-98 0-083 Betweenblocks Betweentreatments Interaction ExperimentVI. P. quinquemaculata 5 41-7 4*13 5 66*2 6*55 25 10*1 - 1.50 3*33 0*365 4-19 233 77 - 412 9*14 - Table 12. The values of t, in the comparisonof means, as calculatedfrom the and thetransformed data of ExperimentIII on Leptinotarsa untransformed decemlineata urMeansfor data Comparison IXl. -x4.. x2 -x3 . IX2..-x4. . X3__-X4 1 [ f 362 30 362 224 362 11 30 224 f30 11 224 11 t without transformation +9.27** +3.84** + 9.82** - 5.43** +0 54 +5.98** t from (x+ -1)i t from k-I sinh-I(kx)i +11-33** + 3.52** + 12.89** + 9.92** + 2.11* + 12.46** - 7.81** + 1*56 + 9.37** - 7.81** + 2.54* +10.35** 259 GEOFFREY BEALL treatmentmay be judged, there are shown, also in Table 12, the means forthe untransformeddata. The values of t fallingbeyond the 0.01 level of significance have been marked with two asterisksand the values beyond the 0 05 level with one. It can be seen that the transformationresultedin a profoundalteration in the conclusions.Apparentlyon account of the dependence of variance on mean in untransformeddata, the pooled estimate of variance was originallytoo low forthe treatmentswhich resultedin high populations and too high forthe treatments which resulted in low populations. Thus, in the comparison of the first and thirdtreatments,which appeared to have the two highestsurvivingpopulavalues was high. In the other tions,the value of t calculated fromuntransformed extreme case, the comparison between the second and fourthtreatments,the values was very low. It can be seen value of t,as calculated fromuntransformed further,that the firsttransformationonly secured in part the modificationin the value of t that was secured by the second transformation. OF TRANSFORMATION IN PRACTICE THE PROCEDURE 8. The methods which were found applicable in the preceding discussion will ofthe data shownin Table 7 (Experinient now be illustratedin the transformation quinquemaculata,of the same type as the experiments VII) on Phlegethontiu8 previouslydiscussed in the presentpaper. The steps in the analysis will be set out withthe purpose of providinga model forprocedurein estimatingthe constant, k, which will be used to effecta transformationof the data so that the analysis of variance may be made. Supposing that the experimenthas been laid out with a repetitionof each treatmentin each block, the procedureof estimatingk makes it firstnecessaryto findthe sum and the absolute difference of each pair of plots subjected to a given n treatmentin a givenplot andthen to sum the sums, z squared, nN N i=l j z (xijl + xij2), and the sums n + xiJ2)2, and also the differencessquared, E X(jl ( .N (Xijl - Xi2)2, over all such pairs and by substitutingthe resultsin (14) to findk. In the case being used for an illustrationthe two plots subjected to the firsttreatmentin each block gave respectively10 and 7, 20 and 14, 14 and 12, 10 and 23, 17 and 20, 14 and 13, so that N Similarly, (Xljl + Xlj2) = (IO + 7) + (20 + 14) + (14 + 12) + N z j=1 (Xljl + Xlj2)2 = (10 + 7)2 + (20 + 14)2+... = 5308 and similarly, N (Xljl-Xlj2 )2 = (10 - 7)2 + (20-14)2+ = 228. 174. 260 Tran8formationof data from entomologicalfield experiment8 Of course, in estimatingk the summationsare not limitedto one treatmentbut must be extended over all in the experiments.If this is done we find n N i=l j=l n N (X1jl+ xij2) = 684, E i=l j=l j(xjl +xj2)2 = 19,656, k = 2(708 -684) 19,656 From (14) we estimate n N E E i=l j=l (x j - Xij2)2 = -708. 0002, and referring to Table 8, p. 250, use k = 0*00as the nearestvalue occurringthere. Of course,in this case, the transformationis simplyxi. Now fromthe above resultit will be possible to replace the observedvalues of Table 7 with the correspondingtransformedvalues from the firstcolumn of Table 8. Thus in Table 7 replace in the firstrow: 10, 20, 14, 10, 17 and 14, by 3'16, 4.47, 3*74,3.16, 4412and 3-74. With such transformedvalues we can now proceed to carry out a routineanalysis of variance which will be facilitatedby workingwiththe sum foreach pair ofplots in a givenblockwitha giventreatment. For example, the finalanalysis of variance forExperimentVII would be carried out with the values of Table 13. Table 13. Transformedand summedvalues to be used in theanalysis of variancefor ExperimentVII on P. quinquemaculata Block Treatment 1 2 3 4 5 6 5-81 7.44 1 00 3*97 3*97 6*32 9. 2 3 4 8*21 7.90 4-06 5-91 3 97 8-56 7-20 7*74 2*73 3.73 4-18 7*87 7*96 8-24 2*41 4-48 2-0) 6 77 5 8-59 8-94 1-73 4-48 3 14 i 10-20 6 7.35 6-26 3 00 3-41 4 45 8-51 SUTMMARY AND CONCLUSIONS The foregoingworkis a study of experimentalresultsfromseven fieldexperiments on the control of insects. In such data, the standard deviation of the number of insects per plot varies with the mean. By the transformation, x' = k-I sinh-1(kx)1,where k is a constant and x an observation,the data were put in a formforwhichthe standard deviationapproached a constantindependent of the mean. The estimationof the one constant,k, necessaryforthe transformation was made possible by the design of the experimentswith repetitionof treatments withinblocks. In practice, the transformationgave good results so that GEOFFREY BEALL 261 analysis of variance could be made. From the analysis of the transformeddata, the results were found to differmarkedly fromthose which would have been obtained fromthe untransformeddata. ACKNOWLEDGEMENTS The presentworkwas suggestedto the writerby Prof. E. S. Pearson to whom the writeris furtherindebted for advice as the work progressed.The writeris furtherbeholden to Dr R. W. B. Jackson, to Dr B. L. Welch and to Dr M. S. Bartlett for criticismand advice. Dr G. M. Stirrettand Mr F. A. Stinson very kindlypermittedthe use of unpublisheddata fromtheirexperimentscarriedout at Delhi, Ontario,on Phlegethontiu8 quinquemaculataHaw. APPENDIX As has been said, the transformation of (10) was suggestedby the methodused by Tippett (1934, p. 61). The procedureis as follows. It is required to find x' = f(x), such that the standard deviation, yxof x', shall be approximatelyconstant. Let us write x' = f(M)+f'(M) (x-M)+.*., (19) whereM is the expectation of x and whence, approximately, (x'-M') = f '(M) (x-M), whereM' is the expectation of x'. Hence o02, = {f '(M)}2 o2, (20) (21) where o-is the standard deviation of the observations,x. Replacing ox in (21) by a constant,c, as is the purpose of our operation,and suibstituting forC from equation (6), p. 247, we have (22) f '(M) = c(M + kM2)-A, where k is, as has been previously discussed, a constant peculiar to our data. Integratingin (22), f(M) = 2ck-isinh-1(kM)i. (23) From (23) the formof the functionsuggestedis sinh-1(kx)i, but it is wise instead to use k-I sinh-1(kx)1,since the transformation thenbecomes identical,as shown in (11), with the established transformation, xi, when k = 0. As Tippett (1934) says: 'This derivation is not mathematicallysound, and the result is only justifiedif on application it is found to be satisfactory.' The writerwould have hesitated to have used it had it not already led to useful transformationsin cases analogous to the present,namely to xi where x comes froma Poisson distribution,to sin-lpi wherep comes froma binomial distribution and, accordingto Tippett, to tanh-1r. wherer is the correlationcoefficient. 262 Transformation ofdatafromentomological fieldexperiments REFERENCES M. S. (1936a). Square root transformationin analysis of variance. J. Roy. Statist.Soc. Suppl. 3, 68-78. (1936b). Some notes on insecticidetests in the laboratoryand in the field. J. Roy. Statist.Soc. Suppl. 3, 185-94. BEALL, G. (1935). Study of arthropodpopulations by the method of sweeping. Ecology, 16, 216-25. (1939). Methods of estimatingthe population of insects in a field. Biometrika, 30, 422-39. of data fromentomologicalfieldextperiments.Canadian (1940). The transformation Ent. 72, 168. BEALL, G., STIRRETT, G. M. & CONNERS, I. L. (1939). A fieldexperimenton the control of the European cornborer,PyraustanubilalisHubn., by BeauveriaBassiana Vuill. II. Sci. Agric. 19, 531-4. BECKER, G. F. & VAN ORSTRAND, C. E. (1931). Hyperbolicfunctions(4th reprint). SmithTables. sonian Mathematiccal in grasslandcommunitiesand the use of statistical CLAPHAM, A. R. (1936). Over-dispersion methodsin plant ecology. J. Ecol. 24, 232-51. GEARY, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika,27, 310-32. NAYER, P. P. N. (1936). An investigationinto the application of Neyman and Pearson's L, test, with tables of percentagelimits. Statist.Res. Mem. 1, 38-51. G. W. (1934). Calculation and Interpretationof Analysis of Variance and SNEDECOR, Covariance. Pp. 96. Ames, Iowa: CollegiatePress Inc. BARTLETT, 'STUDENT' (1919). An explanation of deviations from Poisson's law in practice. Biometrika, 12, 211-15. TIPPETT, L. H. C. (1934). Statistical methods in textile research. Part 2. Uses of the binomial and Poisson distributions.ShirleyInst. Mem. 13, 35-72. ofcertainentomological WILLiAMs,C. B. (1937). The use oflogarithmsin the interpretation problems. Ann. Appl. Biol. 24, 404-14.
© Copyright 2026 Paperzz