A Comparison of Procedures for Multiple Comparisons of Means with Unequal Variances Author(s): Ajit C. Tamhane Source: Journal of the American Statistical Association, Vol. 74, No. 366 (Jun., 1979), pp. 471480 Published by: American Statistical Association Stable URL: http://www.jstor.org/stable/2286358 Accessed: 21/10/2010 17:45 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=astata. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to Journal of the American Statistical Association. http://www.jstor.org A Comparisonof Proceduresfor MultipleComparisons of Means With UnequalVariances AJITC. TAMHANE* Nine proceduresformultiplecomparisonsof means withunequal cedures based on the correspondingjoint confidence in someproceduresare provariancesare reviewed.Modifications or easier im- procedures. in theirperformance posed eitherforimprovement The proceduresincluded in this study fall into two A MonteCarlosamplingstudyis carriedoutforpairplementation. and theprocedures groups. One group consistsof procedureshaving resoluas wellas a fewselectedcontrasts wisedifferences are comparedbased on the resultsofthisstudy.Recommendations forthe choiceof the proceduresare given.Robustnessof two pro- tions (Gabriel 1969) forall linearcombinationsof the Ai; variancesunderviolationofthat the secondgroupconsistsofprocedureshavingresolutions ceduresdesignedforhomogeneous is also examinedin theMonteCarlostudy. assumption for all pairwise differences that can be extended to all KEY WORDS: Multiple comparisons;Unequal variances; One- contrastsamongthe Ai. (Brownand Forsythe'sprocedure problem. ANOVA; Behrens-Fisher wayfixed-effects also has resolutionfor all contrastsbut it is somewhat 1. INTRODUCTION model of Consider the usual one-way fixed-effects analysis of variance: Xij = Ai + eij whereall the eij are independentwitheii - N(0, j = 1, 2, ..., ni; i = 1, 2, ..., k. The means U-2) for j.i and variances o-2are assumed to be unknown.Let xi denote the sample mean and let Si2 denote an unbiased estimate of U,2 based on vi degrees of freedom(df) that is independentofXi; forthe mostpart we shall take si2 to be the usual sample variance based on vi = ni- 1 df. In recentyearsconsiderableattentionhas been focused on the problem of multiplecomparisons,among the A when the ai2 are unequal; for example, see Ury and Wiggins (1971), Spjotvoll (1972), Brown and Forsythe (1974), Games and Howell (1976), Hochberg (1976), Tamhane (1977), and Dalal (1978). The primarypurpose of the presentarticleis to give a briefreviewof theseprocedures, point out any relationships between them, propose improvementsin some proceduresand, finally, make comparisonsbased on an extensive Monte Carlo (MC) samplingstudy. The criteriaused forcomparison are (a) the confidencelevel of the joint confidenceintervals (Cl's) of a suitable familyof parametricfunctions or contrasts)forthe ,uiand (b) the (pairwisedifferences widthsof these Cl's. Note that these two quantities respectivelycorrespondto (a) the familywiseType I error rate and (b) the power of the simultaneoustest pro* AjitC. Tamhaneis AssistantProfessor, ofIndustrial Department University, and ManagementSciences,Northwestern Engineering Evanston,IL 60201. The authoris gratefulto two referees,an associateeditor,and the Editorforpointingout severalreferences in theearlierdraft.This workis manyimprovements and suggesting supportedby NSF GrantENG 77-06112. different fromthe other proceduresin the second group as pointedout in Section 2.1.2.) By carryingout the MC study for pairwiseas well as general contrastswe have triedto provide "homegrounds"forboth groups of procedures,thus makingthe comparisonfair to the extent that is possible. It would have been preferableifa simple modificationof the proceduresin the firstgroup were available that would reduce their resolution from all linear combinationsto all contrasts.No such modification seems to exist,however. The secondarypurpose of this article is to study the robustness propertiesunder variance heterogeneityof two generalized-Tukeyprocedures(generalizedto cover the case of unequal ni's) that are designedforthe homogeneous variances case. These two procedures(Spjotvoll and Stoline's 1973 Ext-T and Hochberg's 1974 GT2) were selectedforcomparisonbecause they performquite well relative to their competitors(see Ury 1976). The reason forincludingonly the generalizedTukey and not, for example, the Scheffeprocedure,was because of our predominantinterestin pairwisecomparisonsforwhich the Tukey-typeproceduresare knownto be morepowerful.The second reason fornot includingmoreprocedures was, ofcourse,to keep the size ofthe studyto manageable proportions. The inclusion of the generalized-Tukey procedures also serves a subsidiary purpose; namely, when the ai2 are in fact equal, as is the case for some configurationsstudied in the sequel, they provide a standardagainstwhichthe performance ofthe procedures designedforunequal o-2 can be compared. In the remainderof this article,Section 2 reviewsthe various procedures;Section 3 proposes modificationsof 471 ? Journal of the American Statistical Association June 1979, Volume 74, Number 366 Theory and Methods Section Journalof the AmericanStatisticalAssociation,June 1979 472 some of the procedureseitherfortheireasierimplementation in practiceor forimprovementin theirperformance. In part of Section 3 and in Section 4 some preliminary comparisonsamong the competingproceduresare carried being eliminatedin the process. out, a fewnoncontenders The detailsof the MC studyand the resultsare presented in Section 5. Finally, the discussionof the results and recommendationsfor the choice of the proceduresare given in Section 6. 2. A REVIEWOF THE PROCEDURES 2.1 ProceduresforUnequal aU2 The proceduresin the firstgroup guaranteethe designated joint confidencelevel exactly. Because the contrasts problemis a generalizationof the Behrens-Fisher problemfor which no exact solution is known to exist, the proceduresin the second group are inexact; that is, they are either conservativeor approximate. Now we describethe proceduresin the two groups. 2.1.1 Proceduresfor all linear combinationsof the ,u,. Dalal (1978) proposed a familyof proceduresbased on Holder's inequality.Let p ? 0, q > 0 satisfyi/p + 1/q = 1, and let dq,a( = dq,a(vi, .. ., Vk) denote the upper a 1 point of the distributionof dq = ( it"= q)lq wheret, denotesa Student's t randomvariable (rv) with v df and the t^,are independont(i = 1, . . ., k). Then the exact 100 (1 - a)% joint CI's for all linear combinations =1 a.sAare given by k k Z aA C: ai[ i a? dq i=l i=l k PsiP/np(2)1 I p i= Dalal remarksthat these CI's are competitiveforall q, although it is possible that for a specificsubclass one mightdominate the others. In general it is difficultto compute the distributionof dq and, consequently,to compute dq,a. For q = x, p = 1 it is easy to see that doo,a = doo,(a, ..., Vk) is given by the solution in d to the equation k TI{2F,4(d) i=1 - 1} = 1 - a (2.1) and a = (1 k 2/b) E - i=l {1v/(vi - 2)} We shall referto this procedureas the S procedure. Hochberg (1976) proposed the followingprocedure based on a generalization of the Tukey method of multiplecomparisons.Let ha' ha'(vi, . . ., Vk) denote theuppera pointoftheaugmentedrangeR' oftv1,..., t; R' = R'(vi, . . ., Vk) = max { maxi It, I, maxi,j t - t' II. Then the exact 100(1 - a)%o joint CI's for all linear are given by combinationsJk=1 aq,Ai k k E aiAi E [S aii where bk) = M(b1, .l., ? ha'lM(bi, ..., k max(Z bi+, i=l bk)], k i=l (2.3) bi-), bi+ = max(bt,0), bi- = max(-b,, 0), and bi = aisi/A\n (i = 1, ..., k). We shall referto this procedureas the Hi procedure. in computingha,',HochbergsugBecause of difficulties gestedinstead using ha, = the upper a point of the range R of tv, .. ., tvk; R = R(vi, . . ., Vk) = max,jIt,i -t,jI; he noted that ha' is well approximatedby ha, provided k > 3 and a < .05. In Section 5.2 we have developed an integral expressionfor the distributionof R' based on whichha,' can be easily computedforarbitrarycombinations of the vi. In the MC studies we used Hi in its exact form. 2.1.2 Proceduresfor all contrastsamongthe jti. In this group we include the proceduresproposed by Ury and Wiggins, Brown and Forsythe, Games and Howell, Hochberg, and Tamhane. All these procedures except that of Brown and Forsythe involve firstconstructing joint CI's forall pairwisedifferencesi - Aj(i, j = 1, . . .. k; i < j) and then extendingthem to all contrastsby using Lemma 3.1 of Hochberg (1975). (This extensionis not in the originalarticlesof Uryand Wigginsand Games and Howell.) Thus, if wij denotes the width of the 100(1 - a)%0 joint CI forpi - j(i, j =1, . . ., k; i < j), then the correspondingjoint CI's for all contrasts i=1 csiA,where i=1 ci = 0, are given by whereF,(.) denotes the distributionfunctionof a t, rv. We shall referto this procedureas the D procedure. For the special case q = 2, the above Cl's were proCi(-Cj)WijE E k k (c) (c) posed by Spjotvoll (1972). He approximated the disjeO ki ? (2 4) Ci,i E = c tributionof d22 = Ek=1 t,i2by a scaled F distributionand by using the methodof momentsgave the followingapwhere c = (cl, ..., ck)', @(c) = {i: ci > 0}, and nI(c) proximationto d2,a = d2, a (vl, . . ., Vk) = {j: cj < 01. On the otherhand, Brown and Forsythe (2.2) d2, .2 aF(c(k,b), obtain joint Cl's forall contrastsdirectlyusing Scheff6's where F, (m, n) denotes the upper a point of an F dis- projection method. Also the previous procedurestypitributionwithm and n df, cally involve bounding the probability of the joint statement related to the Cl's for (2) differences k k i j=1, ,k; i < j) by a functionof the , ItV,2(V, -1)/ (vi-2)2(vi-4) ) Ai(k-2)[ L Itvi/(vi-2)I]2+4ki=l i=l1 probabilities of individual statements by means of a k k inequality.All the proceduresare based Bonferroni-type i=l on some solutionto the Behrens-Fisherproblem,usually i=l E= J , Procedures forUnequalVariances Tamhane:MultipleComparison Welch's (1938) approximatesolution.Thus, most of the procedures described in the followingparagraphs are approximate-conservative(approximate because of the Welch solution; conservativebecause of the Bonferronitypeinequalityused). The validityof the Welch solution in the case of the two-sampleproblemhas been demonstrated by Wang (1971). Wang recommends that it should be true that ni > 6 foreach sample. Ury and Wiggins (1971) proposed a procedurebased on the Welch approximate solution and the Bonferroni inequality.Accordingto this procedurethe approximateconservative 100(1 - a)% joint Cl's for all pairwise differences,ui- Auj(i,j = 1 ..., k; i < j) are given by 473 given by (t,i ,,sillni+ t4i"2sj2/nlj) where y = {1 - (1 - a)l/'k We shall refer to this procedureas the TI procedure.The second procedurewas based on the Welch solution and the gidak inequality. According to the second procedure approximate-conservative 100(1- a)% joint Cl's forall pairwisedifferj E[xt - Ai encesi - j(ij t-p = tj :4 , 1, .. ., pi-yj E [Xi-Xj k; i < j) are given by i t^ij, (s2/n + 12/nj)1] where Pij is given by (2.5). We shall referto this latter procedureas the T2 procedure.In the MC studiescarried out in Tamhane (1977) it was demonstratedthat Ti is + sP2/nj)f]l -.iUj E [E-i j t^i>t(si11ni highly conservativerelative to T2 and hence we shall drop TI fromfurtherconsideration.It is reviewedhere where4,0 denotesthe upper : point of a t, rv, 3 /2k', for the sake of completeness. ki'= (k),and Finally, we describeBrownand Forsythe's(1974) pro(s82/ni + sj2/nj)2 cedure. According to this procedure approximate A.j =(2.5) where 100(1 - a)% joint Cl's forall contrasts '= ci,Ai, + sj4/nj2(nj - 1)} { si4/n-2 (ni-1) J=,ci = 0, are given by We shall referto this procedureas the UW procedure. k L k of (2.5), E, Ciipi E E, CiXti Uryand Wigginsadvocated a slightmodification i=1 i1=1 but we shall take up this modificationin Section 3.2. k Ci2Si2] Hochberg was by given procedure A closely related i{(k -1) F,(k-1p)8 ni i=l (1976). According to this procedure the approximateconservative 100(1 - a) % joint CI's for all pairwise where vcis the Welch df given by differencesAi- A(ij j = 1, ..., Ik; i < j) are given by '-1 2 k 4 k A- ij E E -Xi t4 g(S2/ni Sj2/nj)'] + wherega solves the equation k E - ic Ci2Si2\ E i_=1~ni - C,4 i i., n2(i;n-3 i (2.6) We shall referto this procedureas the BF procedure. Note that BF gives the followingapproximate-conservative 100(1 - ax)% joint Cl's forall pairwisedifferences k E Pt ltvjii> g) = a = 1 j=i+1 HAi- A(j j =1, ... I k; < j): in g, and v is given by (2.5). We shall referto this Ai -'j E[ti - Xj procedureas the H2 procedure. A: (k - 1)Fa(k - 1, vii) (2/ni + s2/nj)] Games and Howell (1976) proposed the followingapproximate100(1- a)% joint Cl's forall pairwisediffer- where Pij is given by (2.5). Closely related procedures have been describedin Naik (1967). 1 ...,kk;i < j) encesli - (jj(iJj Ai - E X xi j 1 v/2 (si2/ni + gj1/nj)1 forEqual -,2 2.2 Procedures Spj0tvoll and Stoline (1973) extendedTukey's multiple comparison procedureto take into account the case of whereqV,k,adenotesthe upper a point of the studentized unequal ni's. According to their procedure the exact rangedistribution(see Miller1966,Ch. 2, fora definition) the variances assumption holds) (if homogeneous with parametersk and v and Pij is given by (2.5). We - a)% for all linear combinations Cl's 100(1 joint shall referto this procedureas the GH procedure.The ; are given by , ailAi use ofthe studentizedrangestatisticin the GH procedure k k does not seem to have been adequately justified;in fact A: qk,, _sM(b,., _ b)] E aigi GE [ it turnsout in the MC studiesthat in some instancesGH i=l i=1 procedureyields familywiseType I error rate greater whereqk,v,a' denotesthe upper a pointof the studentized than the specifiedlevel a, that is, it is radical. In Tamhane (1977) we proposed two procedures,the augmented range distributionwith parametersk and v firstof whichis based on Banerjee's (1961) conservative (see Miller 1966, Ch. 2, fora definition);82 iS the usual solution to the Behrens-Fisherproblem and gidAk's "within" estimate of the common variance with lcX- df; bt a1/VnX and M is as definedin (1967) multiplicativeinequality. Accordingto this pro- 3/= -= cedure conservative100(1 -a) %0joint CI's forall pair- (2.3). If a is a contrastvectorand the miare equal, then kc;i < j) are qk,v,at can be replacedby qk,w, (See Theorem2.3 of jst h(i, j = 1, .. wise differences 3 - ., 474 Journalof the AmericanStatisticalAssociation,June 1979 Hochberg 1975), thus yielding Tukey's original pro- wheredoo,a and tvz,(i = 1, . . ., k) satisfy cedure.For thisreason we shall referto this procedureas -= a P I Itv,I doo,,, i= 1, ... ,1k} the TSS procedure. Hochberg (1975) proposedthe followingprocedurefor = PI ItvI < tvtb i = 1, ..., k} the case of unequal ni's that he referredto as the GT2 procedure;we shall use the same nomenclaturebecause Therefore,there exists a number v* E (minivi,maxivi) of its widespread familiarity.According to GT2, the such that if Vi,vj > v*then (3.1) is violated; if vP,vj < v* conservative(if the homogeneousvariances assumption then (3.1) is satisfied;in all othercases (3.1) is satisfied from is highlydifferent holds) 100(1- a)o joint Cl's forall pairwisedifferences (with high probability)if oq2/rn, than D' for D to do better we should expect Thus oaj2/nj. j = 1, ..., k; i < j) are given by Al Aj(i, those pairwisecomparisonsinvolvinghighlyunbalanced , - Ai E [-Ti -fi +4 IM k',v,aS(1/n1i + 1/n,) 2] (< 2/i)-values. In spite of this one advantage of D over D' we drop D fromfurtherconsiderationbecause of the where Imik' ,,a denotes the upper a point of the stuin favor of D' cited earlierand because, on advantages dentizedmaximummodulusdistributionwithparameters of D and D' are similar. (The the performances average, k' and v (see Miller 1966, Ch. 2, for a definition).One D fromthe author.) for are available MC results can extend the aforementionedpairwiseintervalsto all H2 similar manner.The modified modified in a can be contrastsusing (2.4). H2 (denoted by H2') has the same advantages over H2 that D' has over D. For this reason we drop H2 from 3. MODIFICATIONS OF SOME PROCEDURES furtherconsideration.Now note that the modifiedproAND COMPARISONS cedure H2' is identical to UW and, therefore,H2' will 3.1 A Modificationof the D Procedure not be consideredseparately. We note that forsolving (2.1) one must use a tedious trial-and-errormethod with a hand calculator or use a 3.2 Modificationof (2.5) and ProceduresAffectedby It digital computer,except possibly for the case ni Ury and Wiggins (1971) pointedout that P', given by a with = nk = 'u (say). For thiscase we have do,a = tn_(2.5) ranges between min(rni- 1, nj - 1) and ?--+ nm which can interbe obtained 2{1(1 by a)l/k}, 2= polating in the t tables. This suggests the following - 2, but usually tends to be on the conservativeside. of the D procedurethat can be implemented Based on the workof Pratt (1964), they advocated that modification with the help of the t tables alone even for the case of V,jbe taken to be equal to its upper limit mp+ nj - 2 holds: unequal nr's. Accordingto the modifiedprocedurethe when at least one of the followingfourconditioins exact 100(1 - a)%o joint Cl's forall linear combinations < 10/9; 1. 9/10 < nli/?1T i=1aiA,are givenby 2. k E i=l1= k k a,uj E [E a,1 i E i=l tvi,bIaiIsN/Vn1] We shall refer to this modifiedprocedure as the D' procedure.It has the followingadvantages over D: 1. D' is easier to implementin practice. 2. In D' the CI fora linear combinationEiEE where E C {1, 2, ..., ajAi1 k} depends only on the vi fori E E. In D, however,this CI depends on all the vi that is unappealingforobvious reasons. The performanceof D' should be on the average (over (2) pairwisecomparisons)equivalentto that of D; in fact if the v, are equal then D and D' are identical. In the unequal v, case, if all the vi -c or, equivalently, if all the ai2 are known,then do,a and all the t, a tend to zb wherezbdenotesthe upper6 pointofthe standardnormal distribution.Thus D and D' are again equivalent. If the vi are unequal but small then it mightbe noted that if using D Wii,D denotes the width of the CI for Ai - ,j, and Wii,D', that using D' then (a.s.) we have + t^3,asj/V/n,) (ti,asi/V/n1 , (3.1) (s/n +sV3 9/10 < 1)/ (Sj2/nl,) < 10/9; (S,2/nt1 3. 4/5 < n,/n,< 5/4 and 1/2 < (s,2/n1)/(sj2/n) < 2; 4. 2/3 < ni/n, < 3/2 and 3/4 < (s,2/ni)/(sj2/llj) < 4/3. This modificationessentiallymeans that if the sample sizes of the two groupsare approximatelybalanced or the standard errorsof the correspondingsample means are approximatelybalanced thenone should use the "usual" df ni + ij - 2. This modifiedvalue of vi,can be used in UW, T2, GH, and BF (for pairwise comparisons) procedures; denote the modifiedproceduresby UW', T2', GH', and BF', respectively.(For generalcontrastswe shall use (2.6) for the df vCfor BF withoutany modificationalthough we shall still referto that procedureas BF'.) Because UW, T2, and BF (forpairwisecomparisons)are approximateconservative,one can anticipate that this modification will make UW', T2', and BF' sharper without making them radical; this is indeed so and we drop originalprocedures UW, T2, and BF from furtherconsideration. The same statementcannotbe made about GH', however. In fact,in our MC studies GH' was triedbut turnedout to be substantiallyradical. Therefore,we retainledGH, which itselfis somewhatradical. For the convenienceof Tamhane: MultipleComparisonProceduresfor Unequal Variances 475 5. MONTE CARLO STUDY the reader we provide a glossary of all the procedures consideredso far. 5.1 Choice of (2.2, n)-Configurations and OtherParameters The sampling experimentswere conducted for all pairwise differencesof the i, for k = 4 and 8 and for Procedure Mlnemoiic selected contrastsfor k = 8. The values of a used were Dalal's (1978) procedureand its modified .05 and .10 althoughhere we reportthe resultsonly for D, D' a = .05; the patternsin the resultsfora = .10 are quite version,respectively similar. For each combinationof k and a, eight (a2, n)S Spjotvoll's (1972) procedure were studied. Our concernis mainlywith configurations Hochberg's (1976) two procedures Hi, H2 the small-samplebehavior of the procedures,and this Ury and Wiggins's (1971) procedureand UW, UW' guided our choice of the n,'s in the range 7 to 13. These its modifiedversion,respectively were orderedin termsof a measureof unGames and Howell's (1976) procedureand configurations GH, GH' balance in the values of its modifiedversion,respectively Ti, T2, T2' Tamhane's (1977) two proceduresand the var(,) = a2.2, n(i-1, . . h;) modifiedversionof his latter procedure, respectively This measure p(r) is given by Brown and Forsythe's (1974) procedure BF, BF' k and its modifiedversion,respectively E (T, - T)2 9(s TSS Spjotvoll and Stoline's (1973) procedure i=l Hochberg's (1974) procedure GT2 where T = Ek=, Ti/k. The same measure was used by Keselman and Rogan (1978) although they used it for 4. SOME FURTHERCOMPARISONS measuring unbalanceof J,2'S; we believethatT is a more First note that while UW' is based on the Bonferroni relevantparameterin the presentproblemthan a-2. The inequality,T2' is based on the multiplicativeSidak in- configuirationsin their "natural" order (primed serial equality; thus : < yand t,, > t, . Therefore,T2' uni- numbers)are listed in Table 1. The sovalues associated formlydominates UW' in terms of the CI widths and with the configurations and theirorderaccordingto the hence we drop UW' fromfurtherconsideration. sovalues (unprimedserialnumbers)are listedin the same It is easy to check that qvk ,/V2 < t, z with equality table. Thus for both k = 4, 8, configuration1' with holding iffk = 2. Thus GH uniformlydominates T2. yp(r)= 0 is the most balanced while configuration8' As mentionedearlier,however,GH tends to be radical with =() - 1.479 is the most unbalanced. AInalternaand hencethe comparisonis not exactlyfair.In any case, tive measureof unbalance that can be used is we use T2' and not T2 and it is not clear whetherGH 1 uniformlydominates T2' because the df of the critical Q(r) = max n/nmin T, i<i?k 1<t<k points used in the two procedurescan now be different. A directcomparisonbetweenD' and Hi seemsdifficult Note that i1 givesthe same orderingas soexceptthatranks but one can compareD and Hi as follows.Comparingthe of configurations 3' and 6' are interchanged,that is they respectiveassociated widthsWi,D and w,H, of the Cl's are 6 and 7 accordingto p while7 and 6 accordingto b. forA, - ,i(i, j = 1, ..., k; i < j), one has (a.s.) The MC resultsare presentedin termsof the unprimed serial numbersof the configurations, thus enabling the h'P, ,Pk > TVij,H1 Wij,D reader to see how the react to indifferent procedures . . ., dho a'(P1, creasingunbalance in the r,'s. (si/Vn/i+ s\/Vn/i) (4.1) The class of general contrasts of practical interest involvescomparingthe average of one subset of typically max(sj/&\/nj,sj-\4.nj) the the average of anothersubsetof the ,us.We against , It is easy to verifythat the leftside of (4.1) is greaterthan selected three representativecontrastsfromthis class for and a values one and forthe (pi, ..., vk)-combinations in Thus fork = 8, we selected the MC study experiments. studied in this article we verifiedthat it is in the range one :4 and two middlecontrast high-order (4 comparison) 1.35 to 1.5 by computingha' and d:,a; the values of ha' are order :2 contrasts (2 comparisons). They are given in Table 2. The rightside of (4.1) is greater than one (a.s.), and it would be close to one with large Contrast 1: ( 1 + A 2 + A/3 + it4) Thus probabilityif o-i2/niand ojl2/njare highlydifferent. 51 -4(A5 + J6 + 117 + A8); (5.1) (au//ni)exceptforcomparisonsinvolvinghighlyunequal 7 + 1L8) values, it is likelythat Hi would dominateD (and there- Contrast2: 2(Al + 42)- (results. the MC foreD'); this is supportedby Contrast 3: 2(I( + 4) - (O5 + L6) Based on these considerationswe finallykeep eight proceduresin our MC study: D', 5, Hi, T2', GH, BF', The pairwisecomparisonsare, of course,the lowest-order contrasts. TSS, and GT2. GLOSSARY OF THE PROCEDURES IIlk) CONSIDERED 476 Journalof the AmericanStatisticalAssociation,June 1979 1. Configuration No. k 4 8 o- 2,U2 n)-Configurations (j-2 n, n2, .k 7,7,7,7 7,7,7,7 7,7,7,7 7,9,11,13 7,9,11,13 13,11,9,7 7,9,11,13 13,11,9,7 0 1 4 6 2 3 7 5 8 1' 2' 3' 4' 5' 6' 7' 8' 1,1,1,1,1,1,1,1 1,1,2,2,3,3,4,4 1,1,4,4,7,7,10,10 1,1,1,1,1,1,1,1 1,1,2,2,3,3,4,4, 1,1,2,2,3,3,4,4 1,1,4,4,7,7,10,10 1,1,4,4,7,7,10,10 7,7,7,7,7,7,7,7 7,7,7,7,7,7,7,7 7,7,7,7,7,7,7,7 7,7,7,7,7,7,7,7 7,7,9,9,11,11,13,13 13,13,11,11,9,9,7,7 7,7,9,9,11,11,13,13 13,13,11,11,9,9,7,7 0 1 4 6 2 3 7 5 8 . = PI max (Ti)-Tj ? h' forall jI k E i=O P{Ti > Tj for all j Ti -h' k k = j=1 7) 1,1,1,1 1,2,3,4 1,4,7,10 1,1,1,1 1,2,3,4 1,2,3,4 1,4,7,10 1,4,7,10 In this section we describehow the criticalpoints for the various proceduresincluded in the MC study were obtained. The actual details of the MC study are given in the followingsection. Unusual t values needed forD' and T2' wereobtained by using the IMSL subroutineMDSTI. Similarly,the critical point d2,a, for S and the F values for BF' were obtained by using the IMSL subroutineMDFI. Note that both MDSTI and MDFI give exact resultseven in the case of fractionaldf. The q and q' values needed for GH and TSS wereobtainedby interpolationin the tables given by Harter (1960) and Stoline (1978), respectively. The ImI values needed for GT2 were obtained by interpolationin the tables given by Stoline and Ury (1978). Linear harmonic interpolationwith respect to the df was used in all threecases. Tables of values of ha' needed forHi are not available in the literature.To computeha' we develop an integral expressionforthe distributionfunctionof R' that seems to be new and hence is given here. Represent R' = maxo<i<j<k IT, - Tj where To - 0, Ti is distributedas a t4 rv (denotedby TM 4 t), and T1, ... Tk To obtain ha' ha'(,,v ... are independent. Vk) one solves the followingequation in h': = nk 1' 2' 3' 4' 5' 6' 7' 8' 5.2 CriticalPointsforthe VariousProcedures 1-a Configuration No. in Terms of I{F,2(0) - F,(-h')} - t=1 F; h' k j=81 -FPY - i= j 1} I {Fvj(t) si (t - h') }dF, (t) k k i==1 J, 1 + {F,j(h') k hl + E | 4 (t -h') } f, (t)]dt , j|l [Z {FP,(t) (5.2) .447 .610 .235 .262 .639 .500 1.479 .447 .610 .235 .262 .639 .550 1.479 where f4(.) denotes the densityfunctionof a t, rv. The IMSL (1978) subroutinieMDTD was used to evaluate F,(.) and ZSYSTM was used to solve (5.2); Romberg quadrature methodwas used to evaluate the integral. ifthen, are equal), ImIk',Y,a, The valuesofqk,,,a'(qk,v,, 11 and ... I >k) ha'(i. vk)used in the MC study d2,aci ( are given in Table 2. 2. CriticalPointsforthe VariousProcedures (1 - a k P-'I, I '2 ., Vk V h' = .95) d2, qk.v,a q,V I m kv, 4 6,6,6,6 6,8,10,12 24 36 4.747 4.374 4.156 3.812 3.901 - 3.810 2.851 2.775 8 6,6,6,6,6,6,6,6 48 6,6,8,8,10,10,12,12 72 5.903 5.359 5.396 4.906 4.481 - 4.415 3.286 3.228 5.3 Details of the MC Study For each combinationof k and the (q2, n)-configuration 1,000 experimentswere run. For studyingthe CI's we can take all the , equal to zero withoutloss ofgenerality. Therefore,in each experiment,k independentpairs of _ rv'ss i -~ N(0, ot2/n1) and s82 cr-2X"-/V, weregenerated. For generatingnormal rv's, the Box-MVilleralgorithm was used; the chi-squaredrv's were generatedby using the relation (for v even) xv2 - ' i=2 loge Ui wherethe Ui are independentuniform[0, 1] rv's. The FORTRAN libraryfunctionRANF was used to generatethe uniform rv's. We carriedout separate runs forpairwisecomparisons and contrasts.For pairwise comparisons,for each procedure we obtained the estimates of (a) the achieved of the joint confidencelevel, (b) the expectedhalf-widths and (c) the average CI's forall (2) pairwisedifferences, of the (") expected half-widths.For a given procedure, the estimate of the joinitconfidencelevel was obtained by keepinga count of the numberof runs in which zero Tamhane: MultipleComparisonProceduresfor Unequal Variances 477 3. EstimatedConfidenceLevels forAll PairwiseComparisons (1 O.95)a - No. (~2n)-Configuration k Procedure 1 2 3 4 5 6 7 8 D' S Hi1 ~~.999 .994 .985 .955 .946 .998 .993 .984 .956 .944 .995 .990 .975 .953 .941 .998 .993 .983 .954 .942 .996 .991 .989 .963 .950 .993 .987 .979 .947 .940 .998 .992 .986 .968 .957 .988 .982 .975 .948 .936* .950 .951 .960 .955 .968 .963 .938* .950 .970 .967 .998 .997 .985 .965 .935* .990 .962 .962 .992 .992 .981 .942 .916* .975 .927* .942 ~~T2' 4 GH BF' TSS GT2 D' S Hi ~~T2' 8 8 ~~~~GH BF' TSS GT2 .957 .964 ~~.999 ~~~~ ~~.997 .992 .966 .941 .990 .960 .965 ~ .996 .995 .980 .963 .943 .985 .959 .960 .964 ~ .962 .973 .953 .975 I .9551 .992 .993 .984 .966 .948 .987 .965 .962 .993 .993 .983 .963 .949 .977 .914* .938* .998 .996 .986 .966 .946 .990 .922* .916* .993-i .994 .984 .973 .949 .988 .897* .882* .930* .940* .945 .928* .909* .891~ Asterisk indicatesthattheachievedconfidencetevelis less thanthe designatedconfidencelevel =95 at 10 percentsignificance. The standarderrorofanyentry P is given by(P(l - P)/1,000)21. was included in all (2) Cl's for the differences unless the unbalance in the ri values is extreme.Keep in k; i < j). The results regarding mind that both TSS and GT2 are conservativefor the estimatedconfidencelevels are presentedin Table 3. For problem of pairwise comparisons (if the homogeineous lack of space we have not reportedhere the estimatesof variances assumption holds) except when the ni are the expected half-widthsforthe individual (k) compari- equal when TSS is exact. Thus the apparent robust sons, but in Table 4 we have given the average (over (t) behavior of these procedures might be due to their of the Cl's forpairwisediffer- inherentconservativenature. Cl's) expectedhalf-widths ences produced by each procedure.The half-widthsof Amongthe proceduresdesignedforunequal variances the Cl's produced by each procedurefor selected con- we findthat only GH tends to be liberal but theredoes trastsare given in Table 5. Note that the resultsforTSS not seem to be any specificpatternof configurations for and GT2 are not included,in an effortto keep the table whichGH is liberal.This liberalnatureof GH was noted small and also because sufficient informationabout the in Games and Howell (1976), althoughin the MC study relativeperformanceof these two proceduresis obtained done by Keselman and Rogan (1978) GH is shown to fromthe pairwisecomparisonsdata. The standarderrors controlthe confidencelevel moreprecisely.We also find of the estimatesare given as parentheticalentriesin the that BF', HI, S, and D' are increasinglymoreconservasame tables. In arrangingthe tables we have put the tive. This should not come as a surprisebecause BF' is proceduresof similar type togetherso that their per- designedforall contrasts,whiletheotherthreeprocedures formancescan be more readilycompared; thus we have are designedforall linear combinations.T2' controlsthe put D', S, and Hi togetherbecause they possess the confidencelevel fairlywell. One can compare the percommon property of having resolutionsfor all linear formanceof T2' withthat ofT2 (see Table 1 of Tamhane combinations.The discussion of the resultsis given in 1977; T2 is referredto as the W procedurethere) and note the reductionin overprotection the followingsection. due to the use of the Ury and Wigginsdf modification. Next turnto Table 4. Amongthe proceduresdesigned 6. DISCUSSIONAND RECOMMENDATIONS for unequal variances we note that GH produces the Let us study Table 3 first.To identifythe liberal pro- shortestCI's for all the eight configurationswhile T2' cedureswe have markedwithan asteriskthoseconfidence and BF' come next; HI, S, and D' produceincreasingly levels that fall in the critical region for the one-sided wider Cl's. Althoughit seems that GH gives the best hypothesis-testingproblem Ho: 1 - a = .95 vs. H1: performance forpairwisecomparisons,one musttake into 1 - a < .95 at 10 percentlevel of significance(i.e., con- account the factthat GH is also somewhatliberal.If one fidencelevels < .942). Thus we note that TSS and GT2 does not want to run the risk of frequentlyliberal CI's tend to be liberalforconfigurations 6, 7, and 8; forother producedby GH, thenT2' seems to offerthe best choice configurations they seem to controlthe designatedcon- for pairwise comparisonsbecause it controls the confidencelevel fairlywell. One can thus conclude that for fidencelevel more preciselyand produces the Cl's that pairwise comparisonsTSS and GT2 are fairly robust are only slightlywider. - i j = 1, ..., 478 Journalof the AmericanStatisticalAssociation,June 1979 4. Average Estimated Cl Half-Widthsfor Pairwise Comparisons (1 - a 0.95) (2, k 1 2 3 4 5 6 7 8 D' 2.541 (.017)a 1.987 (.011) 2.930 (.016) 3.924 (.027) 4.109 (.022) 5.532 (.039) 3.212 (.021) 4.633 (.032) S 2.173 (.014) 2.000 (.014) 1.643 (.010) 1.586 (.010) 1.692 (.011) 1.457 (.007) 1.506 (.007) 1.713 (.009) 1.580 (.010) 1.376 (.008) 1.309 (.007) 1.424 (.008) 1.323 (.005) 1.260 (.005) 2.576 (.014) 2.370 (.015) 2.022 (.011) 1.933 (.010) 2.103 (.011) 2.205 (.009) 2.100 (.009) 3.418 (.024) 3.285 (.027) 2.583 (.018) 2.535 (.018) 2.661 (.018) 2.335 (.012) 2.393 (.012) 3.720 (.020) 3.594 (.022) 2.930 (.016) 2.797 (.016) 3.045 (.017) 3.339 (.014) 3.181 (.014) 4.930 (.037) 4.901 (.042) 3.727 (.028) 3.722 (.030) 3.839 (.029) 3.371 (.018) 3.485 (.018) 2.809 (.019) 2.774 (.023) 2.368 (.018) 2.227 (.017) 2.426 (.018) 1.983 (.008) 1.889 (.008) 4.129 (.031) 4.201 (.034) 3.560 (.030) 3.330 (.027) 3.628 (.030) 2.840 (.013) 2.705 (.013) D' 2.976 (.020) 3.341 (.018) 6.524 (.046) 3.708 (.024) 5.359 (.037) 2.830 (.018) 2.498 (.018) 2.086 3.305 (.018) 2.910 (.017) 2.500 4.534 (.036) 4.374 (.030) 3.994 (.037) 3.225 4.672 (.025) S 2.284 (.014) 2.207 (.013) 1.934 (.013) 1.721 4.750 (.026) 4.325 (.027) 3.607 6.423 (.048) 6.050 (.051) 4.736 3.612 (.024) 3.362 (.027) 2.993 5.297 (.038) 5.070 (.043) 4.526 (.014) 1.947 (.013) 2.367 (.015) 1.687 (.005) 1.750 (.006) (.010) 1.590 (.009) 1.973 (.011) 1.522 (.004) 1.471 (.004) (.014) 2.332 (.013) 2.894 (.016) 2.530 (.006) 2.445 (.007) (.022) 3.057 (.022) 3.658 (.024) 2.641 (.009) 2.739 (.009) (.020) 3.357 (.018) 4.169 (.023) 3.817 (.011) 3.689 (.011) (.035) 4.572 (.034) 5.372 (.038) 3.932 (.015) 4.077 (.016) (.024) 2.705 (.022) 3.365 (.027) 2.291 (.007) 2.215 (.007) (.039) 4.044 (.033) 5.035 (.041) 3.281 (.010) 3.171 (.010) Procedure H1 T2' 4 GH BF' TSS GT2 Hi T2' 8 n)-Configuration No. GH BF' TSS GT2 The entries in parentheses are the standard errors of the corresponding estimates. 8 whereTr = 1/13 and Tk = 10/7. It is not It is not the purpose of this article to carry out an configuration such detailed comparisonshere because give to possible note we GT2. But and extensivecomparisonbetweenTSS of space. of lack uninvolving in passing that for all the configurations Finally, we turn to Table 5. In this table we have numbers2, 3, 5, 7, and 8) equal ni's (i.e., configuration listed the estimatedhalf-widthsof the Cl's forselected while for the other GT2 producesshorterCl's than TSS, for all the procedures designed for unequal contrasts Cl's. shorter produces TSS cases, as one would expect, (i.e., excludingTSS and GT2). Here we find that variances T2' with and GH of widths CI By comparingthe BF' the best performancein all the cases. For gives that re1 and numbers 2, of TSS and GT2 forconfiguration second-best performance,the contendersare S and the variances the for which configurations spectively(i.e., the betterthan GH forthehigh-order seems to perform GH. S in loss efficiency idea of the are homogeneous),one getsan 1 in the case of middle-ordercontrasts2 of and, case in the contrast comparisons, for pairwise ifgood procedures 7 and 8 correspondingto unconfigurations and for 3, when used unequal variances such as GH and T2', are performsbetterin the other GH while values balanced seem that ri the underlyingvariances are equal. It would comparisons,T2' proof pairwise case the cases. As in in be kept must it the loss is not substantial,although wider than those proare slightly only duces CI's that sample the when pertains only mindthat this conclusion D' very wide Cl's, Hl produce Hl and by GH. duced sizes are not too different. D'. than better compariperforming previous the all out that It shouldbe pointed Perhaps a verysurprisingresult (at least to us) of the sons pertainto average CI widths. If the widthsof the compared, MC study for contrastswas the performanceof S. Alare differences pairwise (2) for individualCl's we anticipatedthat BF' would performverywell T2'. D' though beats there are few instances in which even we did not quite anticipatethat S would be highly for contrasts, are values the two when Typically this occurs Ti of D' and Hl even The poor performance the second best. = for 8) hk(k for CI different, 4, namely, for the Al Tamhane: MultipleComparisonProceduresfor Unequal Variances 479 5. Estimated Cl Half-Widthsfor Contrasts (1 - a= 0.95) Contrast No. (CT2, Procedure 1 2 3 4 5 6 7 8 1.436 (.004) 2.316 2.285 (.007) 1.120 (.003) 1.858 3.351 (.009) 1.681 (.004) 2.828 4.586 (.015) 2.281 (.008) 4.020 4.658 (.012) 2.431 (.006) 4.377 6.527 (.022) 3.349 (.012) 6.210 3.686 (.012) 1.868 (.007) 3.469 5.361 (.019) 2.797 (.011) 5.350 (.008) 2.085 (.006) (.007) 1.740 (.006) (.008) 2.511 (.006) (.018) 3.292 (.012) (.014) 3.648 (.010) (.028) 4.820 (.017) (.016) 3.130 (.014) (.024) 4.841 (.022) GH 1.945 (.006) 1.599 (.005) 2.339 (.006) 3.149 (.012) 3.387 (.009) 4.725 (.019) 2.788 (.011) 4.252 (.018) BF' 1.022 (.003) 0.867 (.003) 1.286 (.003) 1.658 (.006) 1.863 (.005) 2.477 (.010) 1.499 (.006) 2.284 (.011) D' 2.969 (.013) 2.016 (.009) 2.383 (.011) 2.081 (.009) 1.941 (.008) 1.571 (.007) 2.383 (.010) 1.612 (.007) 1.979 (.011) 1.842 (.010) 1.671 (.008) 1.365 (.007) 3.272 (.012) 2.310 (.008) 2.918 (.012) 2.449 (.009) 2.280 (.008) 1.847 (.007) 4.489 (.022) 3.211 (.017) 4.325 (.028) 3.300 (.018) 3.230 (.019) 2.590 (.015) 4.292 (.016) 3.270 (.013) 4.581 (.021) 3.516 (.015) 3.256 (.013) 2.662 (.011) 6.136 (.031) 4.668 (.027) 6.721 (.043) 4.787 (.027) 4.901 (.031) 3.872 (.024) 3.891 (.020) 2.771 (.016) 3.928 (.026) 3.663 (.025) 3.146 (.020) 2.501 (.015) 5.578 (.031) 4.173 (.026) 6.126 (.040) 5.852 (.041) 4.907 (.033) 3.866 (.025) D' 2.980 (.013) 5.023 (.018) 6.917 (.031) 3.481 (.014) 5.143 (.021) 2.025 (.009) 2.401 (.011) 2.090 (.009) 1.951 (.009) 1.578 (.007) 3.429 (.013) 2.428 (.009) 2.855 (.012) 2.568 (.009) 2.398 (.009) 1.942 (.007) 4.684 (.022) S 2.187 (.008) 1.544 (.006) 1.820 (.008) 1.642 (.007) 1.532 (.006) 1.245 (.005) 3.199 (.015) 3.847 (.021) 3.299 (.016) 3.094 (.015) 2.502 (.012) 3.574 (.013) 4.270 (.018) 3.785 (.013) 3.530 (.012) 2.862 (.010) 4.743 (.021) 5.821 (.032) 4.892 (.022) 4.609 (.022) 3.721 (.017) 2.473 (.010) 3.050 (.017) 2.683 (.012) 2.487 (.011) 2.031 (.009) 3.670 (.015) 4.600 (.025) 4.008 (.018) 3.705 (.016) 3.033 (.014) D' 2.975 (.009)a S Hi T2' S H1 2 n)-Configuration T2' GH BF' Hi T2' GH BE' "The entries in parentheses are the standard errors of the corresponding estimates. for contrastswould seem to rule out their use in most practicalapplications.It shouldbe mentionedthat Hi is less conservativeforpairwisecomparisonsthan S, while D' always seems to be the most conservative. On the whole, we would recommendGH and T2' for pairwisecomparisons,GH givingslightlynarrowerCl's than T2' at the riskof not attainingthe designatedconfidencelevel by a small amountin some cases. For general contrastcomparisonswe recommendthe BF' procedure. [ReceivedOctober1977. RevisedDecember1978.] REFERENCES Banerjee, Saibal K. (1961), "On ConfidenceInterval forTwo Means Problem Based on Separate Estimates of Variances and Tabulated Values of t-Variable," Sankhyd,Ser. A, 23, 359-378. Brown, Morton B., and Forsythe, Alan B. (1974), "The ANOVA and Multiple Comparisons for Data With Heterogeneous Variances," Biometrics,30, 719-724. Dalal, Siddharth R. (1978), "Simultaneous Confidence Procedures forUnivariate and Multivariate Behrens-FisherType Problems," Biometrika,65, 221-224. Gabriel, K. Ruben (1969), "Simultaneous Test Procedures-Some Theory of Multiple Comparisons," Annals of Mathematical Statistics,40, 224-250. Games, Paul A., and Howell, John F. (1986), "Pairwise Multiple Comparison Procedures With Unequal N's and/or Variances: A Monte Carlo Study," Journalof Educational Statistics,1, 113-125. Harter, H. Leon (1960), "Tables of Range and Studentized Range," Annals of MathematicalStatistics,31, 1122-1147. Hochberg, Yosef (1974), "Some Generalizations of the T-Method in Simultaneous Inference," Journal of Multivariate Analysis, 4, 224-234. (1975), "An Extension of the T-Method to General Unbalanced Models of Fixed Effects,"Journal of theRoyal Statistical Society,Ser. B, 37, 426-433. (1976), "A Modification of the T-Method of Multiple Comparisons fora One-Way Layout With Unequal Variances," Journal of theAmerican StatisticalAssociation,71, 200-203. International Mathematical and Statistical Libraries (1978), IMSL Manual (Vols. I and II), Houston, Tex.: IMSL. 480 Journal of the American StatisticalAssociation, June1979 Keselman,H.J., and Rogan,JoanneC. (1978), "A Comparisonof MethodsofMultipleComparisons theModifiedTukeyand Scheffe for Pairwise Contrasts," Journal of the American Statistical Association,73, 47-52. Miller, Rupert G. (1966), Simultaneous Statistical Inference,New Unequal Sample Sizes," Journal of the American Statistical Association,68, 975-978. Stoline,MichaelR. (1978), "Tables oftheStudentizedAugmented Range and Applicationsto Problemsof MultipleComparison," Journal of theAmerican StatisticalAssociation,73, 656-660. York: McGraw-HillBook Co. and Ury, Hans K. (1979), "Tables of the Studentized MaximumModulusDistributionand an Applicationto Multiple Naik, UmeshD. (1967), "SimultaneousConfidenceBounds ConComparisons cerningMeans in the Case of UnequalVariances,"TheKarnatak AmongMeans," Technometrics, to appear. UniversityJournal: Science, XII, 93-104. in Model I OneTamhane,Ajit C. (1977), "MultipleComparisons Way ANOVA WithUnequal Variances,"Communications Pratt,JohnW. (1964), "Robustnessof Some Proceduresforthe in StaTwo-Sample Location Problem," JournaloftheAmericanStatistical Association,59, 665-680. Confidence RegionsforMeans SidAk,Zbynek(1967), X"Rectangular Journalof theAmerican of MultivariateNormalDistributions," StatisticalAssociation,62, 626-633. Spj0tvoll,Emil (1972), "JointConfidenceIntervalsforAll Linear Functionsof Means in ANOVA With UnknownVariances," Biometrika,59, 684-685. , and Stoline,Michael R. (1973), "An Extensionof the T-Methodof MultipleComparisonsto Include the Cases With tistics,A6 (1), 15-32. Ury,Hans K. (1976),"A Comparison ofFourProcedures forMultiple ComparisonsAmongMeans (PairwiseContrasts)forArbitrary Sample Sizes," Technometrics, 18, 89-97. --, and Wiggins,Alvin,D. (1971), "Large Sampleand Other Multiple Comparisons Among Means," BritishJournal of Mathematical and StatisticalPsychology,24, 174-194. Wang,Y.Y. (1971), "Probabilitiesof Type I Errorof WelchTests for the Behrens-Fisher Problem," Journal of the American StatisticalAssociation,66, 605-608.
© Copyright 2024 Paperzz