UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.

ILLUSTRATIONS OF SOME SCHEFFE-TYPE TESTS FOR SOME
BEHRENS-FISHER-TYPE REGRESSION PROBLEMS

by

Richard F. Potthoff

October 1963

Contract No. AF AFOSR-62-196

It is sometimes required to compare two simple regression lines without assuming that the variances of the two sets of error terms are necessarily equal: one may wish to (A) test whether two parallel regression lines are identical, or (B) test whether two regression lines are parallel. In this paper, a t-test for each of these problems is explained and illustrated numerically. These t-tests bear a certain resemblance to Scheffé's test for the Behrens-Fisher problem, but (unlike the latter) are not randomized tests. This paper is on a practical rather than a theoretical level; the more technical aspects of the tests are covered in a separate paper (Mimeo Series No. 374).

This research was supported in part by Educational Testing Service; it was supported in part by the Air Force Office of Scientific Research; and it was supported in part by a grant from the NATO Research Grants Programme under a joint arrangement between the Institut de Statistique of the University of Paris, the Istituto di Calcolo delle Probabilita of the University of Rome, and the Department of Statistics of the University of North Carolina.

Institute of Statistics Mimeo Series No. 377

ILLUSTRATIONS OF SOME SCHEFFE-TYPE TESTS FOR SOME
BEHRENS-FISHER-TYPE REGRESSION PROBLEMS

by Richard F. Potthoff
University of North Carolina

1. Introduction.

In educational and psychological applications, it is sometimes necessary to make certain comparisons between two regression lines when the two variances are different.
Such problems arise particularly in connection with experiments comparing two curriculums or two teaching methods (which often have different variances), as is explained more fully in [1] and [2]. The two principal types of problems are (A) testing whether two parallel regression lines are the same, and (B) testing whether two regression lines are parallel. These will be referred to as Problem A and Problem B respectively.

By generalizing an idea which Scheffé [4] used to devise a test for the Behrens-Fisher problem, another paper [3] gives a theoretical development of Scheffé-type tests for Problems A and B. The purpose of the present paper is to provide numerical illustrations of some of the techniques presented in [3]. The illustrations of this paper are all for the case of simple regression, even though Scheffé-type tests are available in [3] both for simple regression and for the more involved case of multiple regression. Scheffé's test itself is a randomized test, but the tests in [3] include both randomized and non-randomized tests. The tests illustrated in this paper, however, are all non-randomized (under the provision that the independent variates all have distinct values), for it is suggested in [3] that, for the case of simple regression, the non-randomized tests would ordinarily be used in preference to the randomized tests.

An entirely different test for Problem A is described in [2], and for Problem B in [1]; other possible tests for Problem B which are in the literature are mentioned in [3].
All these tests, however, are inexact (i.e., the actual level of significance is only approximately equal to the stated level of significance) and tend to be less exact the smaller the sample. The Scheffé-type tests, on the other hand, are exact tests based on the t-distribution. Thus these Scheffé-type t-tests are particularly useful for small sample sizes.

The mathematical models for Problems A and B may be expressed respectively as follows:

Problem A. We suppose that we have one group consisting of M pairs (Y_1, X_1), (Y_2, X_2), ..., (Y_M, X_M), and another group consisting of N pairs (Z_1, W_1), (Z_2, W_2), ..., (Z_N, W_N). For each i, the observed value Y_i follows the relation

(1.1a)    Y_i = α_Y + β X_i + e_i ,

and for each j, the observed value Z_j follows the relation

(1.1b)    Z_j = α_Z + β W_j + f_j .

The α's and β are unknown parameters (regression coefficients), the X_i's and W_j's are specified constants, and the e_i's and f_j's are error terms which are independent and normal with mean 0 and unknown variances σ_e² and σ_f² respectively. The hypothesis to be tested is that α_Y = α_Z, i.e., that the two parallel regression lines associated with (1.1a) and (1.1b) are identical. We will also be able to obtain confidence bounds on (α_Z − α_Y). Standard regression methods cannot be used for this problem, of course, since two different variances σ_e² and σ_f² are involved.

Problem B. The model is the same as for Problem A, except that in place of (1.1) we have

(1.2a)    Y_i = α_Y + β_Y X_i + e_i

and

(1.2b)    Z_j = α_Z + β_Z W_j + f_j .

The hypothesis to be tested is that β_Y = β_Z, i.e., that the two regression lines associated with (1.2a) and (1.2b) are parallel. Confidence bounds on (β_Z − β_Y) can also be obtained.

For definiteness, we will always assume (with no loss of generality) that M ≤ N. The reader should refer to [1] and [2] for a further discussion of the relation of Problems A and B to the experimental situation of comparing two curriculums or two teaching methods.

2. The test for Problem A.
This section tells how to calculate the test statistic for Problem A, and then Section 3 will present a numerical example.

Since this test, like that of Scheffé [4] for the Behrens-Fisher problem, is based on pairing of the two samples, it is first necessary to effect this pairing. To start, let the first group be arranged in order of increasing X and the second group in order of increasing W, so that X_1 < X_2 < ... < X_M and W_1 < W_2 < ... < W_N. (In case there are tied values of the X's or of the W's, it will be necessary to break the tie in some way, either by randomizing or else by some other objective but less arbitrary method. An alternative to randomizing, e.g., would be to break the tie on the basis of some second variable similar to X and W which otherwise would not be utilized at all.)

Paired to each of the M members of the first group will be a member of the second group. Thus M of the W_j's will be paired, but the remaining (N−M) will be unpaired. Let W(i) denote that W (of the second group) which is paired to X_i. The pairing is to be done according to the formula

(2.1)    W(i) = W_{N+1−i} ,   1 ≤ i ≤ ν ,
         W(i) = W_{M+1−i} ,   ν+1 ≤ i ≤ M ,

where the integer ν is determined in a way which will be described shortly. Note, however, that in the special case M = N, no ν is necessary, and the pairing (2.1) simply reduces to matching the X's and W's in opposite order.

The optimal way to choose ν (for the general case) is to set it equal to the smallest integer ν such that the criterion quantity δ_ν of (2.2) is negative; δ_ν weighs a sum of the W's which enter the pairing (2.1) against √(N/M) times a corresponding expression in the X's. [The function δ_ν decreases as ν increases, and will turn from positive to negative somewhere between ν = 0 and ν = M−1.] Actually, any objective method of choosing ν may be used and the probability of Type I error will be unaltered; however, the power of the test will be affected. If the experimenter prefers
not to use the optimal method for choosing ν because he feels it may be too lengthy computationally, the following simple but somewhat sub-optimal method is recommended instead: choose ν = M/2 if M is even, and choose ν = (M−1)/2 [or (M+1)/2] if M is odd.

After completing the pairing, we proceed as follows. Let

(2.3)    X̄ = (1/M) Σ_{i=1}^{M} X_i ,      Ȳ = (1/M) Σ_{i=1}^{M} Y_i ,
         W̄ = (1/N) Σ_{j=1}^{N} W_j ,      Z̄ = (1/N) Σ_{j=1}^{N} Z_j ,
         W̄° = (1/M) Σ_{i=1}^{M} W(i) ,    Z̄° = (1/M) Σ_{i=1}^{M} Z(i) ,

where Z(i) denotes the Z which accompanies W(i). The statistic t of (2.4) is formed from (Z̄ − Ȳ) − (u*TA/u*u)(W̄ − X̄), divided by its estimated standard error; here u*TA is a cross-product quantity and u*u a sum-of-squares quantity, both computed from the paired samples by (2.5)–(2.7). The statistic t (2.4) is used to test the null hypothesis α_Y = α_Z. It follows the t-distribution with (M−2) degrees of freedom if the null hypothesis is true.

3. Numerical example of the test for Problem A.

Suppose there are M = 4 observations in the first group and N = 6 observations in the second group. Arranging the first group in order of increasing X and the second group in order of increasing W, we have the ordered values

X_1 = 0 , X_2 = 7 , X_3 = 8 , X_4 = 9   and   W_1 = 1 , W_2 = 2 , W_3 = 3 , W_4 = 4 , W_5 = 6 , W_6 = 8 .

The next step is to find ν. We do not have to calculate δ_ν for all values of ν, but rather we need only find the place where δ_ν goes from positive to negative. Let us start with an intermediate value, ν = 2. Using (2.2), we find that δ_2 < 0, so the first negative could occur at ν = 2 or at ν < 2. We next calculate δ_1, which is also < 0. Evaluating δ_ν for the next lower value, ν = 0, we find δ_0 > 0, which establishes that ν = 1. [Note: if we had simply chosen ν = M/2, then ν would be 2, not 1.]

Setting ν = 1 in (2.1) then gives us the pairing

W(1) = W_6 = 8    Z(1) = Z_6 = 17.2
W(2) = W_3 = 3    Z(2) = Z_3 = 2.5
W(3) = W_2 = 2    Z(3) = Z_2 = 5.5
W(4) = W_1 = 1    Z(4) = Z_1 = 6.9

With respect to the quantities in (2.3), we may write

M X̄ = 24 ,  M Ȳ = 67.9 ,  N W̄ = 24 ,  N Z̄ = 54.5 ,  M W̄° = 14 ,  M Z̄° = 32.1 .
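The pairing rule (2.1) and the paired means of (2.3) are easy to mechanize. The sketch below uses the same 1-based indexing as the paper; the helper names are ours, and the two unpaired Z's in the usage example are placeholders (they never enter the computation).

```python
def paired_w_indices(M, N, v):
    """1-based indices of the W's paired to X_1 < ... < X_M under (2.1):
    X_i is paired with W_(N+1-i) for i = 1,...,v (the v largest W's, in
    reverse order) and with W_(M+1-i) for i = v+1,...,M."""
    assert 0 <= v <= M - 1 and M <= N
    return [N + 1 - i if i <= v else M + 1 - i for i in range(1, M + 1)]

def paired_means(ws, zs, idx):
    """The means Wbar0 and Zbar0 of (2.3), taken over the M paired
    members of the second group (idx holds 1-based indices)."""
    M = len(idx)
    wbar0 = sum(ws[j - 1] for j in idx) / M
    zbar0 = sum(zs[j - 1] for j in idx) / M
    return wbar0, zbar0
```

With the data of Section 3, `paired_w_indices(4, 6, 1)` gives `[6, 3, 2, 1]`, i.e. W(1) = W_6, W(2) = W_3, W(3) = W_2, W(4) = W_1, and the paired means come out to W̄° = 3.5 and Z̄° = 8.025, matching M W̄° = 14 and M Z̄° = 32.1 above.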
Hence Z̄ − Ȳ = −7.89 and W̄ − X̄ = −2. In order to get t (2.4), we must obtain (2.5)–(2.7), for which we must first calculate sums of squares and cross-products such as Σ X_iY_i = 504.5. These combine into

u*TA = 97.1 − (−63.524) − (−60.217) + 39.633 = 260.474   and   u*u = 131.387 ,

so that the estimate of (α_Z − α_Y) is (Z̄ − Ȳ) − (u*TA/u*u)(W̄ − X̄) = −7.89 − (260.474/131.387)(−2) = −3.93, with estimated standard error .277. Finally, (2.4) becomes

t = −3.93/.277 = −14.19 .

The number of degrees of freedom for t is M − 2 = 2. Hence, for a two-sided test at the α = .05 level, the critical region is |t| > 4.30. Since |−14.19| is > 4.30, we reject the null hypothesis α_Y = α_Z.

Confidence bounds on (α_Z − α_Y) are easily obtained. For example, a 95 % confidence interval for (α_Z − α_Y) is given by −3.93 ± 4.30(.277); i.e., we state that −5.12 < (α_Z − α_Y) < −2.74 with 95 % confidence.

4. The test for Problem B.

This section tells how to calculate the test statistic for Problem B, and then Section 5 will present two numerical examples.

As before, the first thing to do is to pair the two samples. Let us start again by assuming X_1 < X_2 < ... < X_M and W_1 < W_2 < ... < W_N. Let ν be the smallest integer such that

(4.1)    γ_ν = (1/2)(W_{N−ν} + W_{M−ν}) − [1/(M−1)] [ Σ_{j=1}^{M−ν−1} W_j + Σ_{j=N−ν+1}^{N} W_j ]

is < 0. [This function γ_ν, like (2.2), decreases as ν increases. As in the case of the previous test, it is permissible to choose ν by an objective method different from the one just recommended. No ν is needed if M = N.]

Now define the two pairings

(4.2a)   W⁺(i) = W_i ,         1 ≤ i ≤ M−ν ,
         W⁺(i) = W_{N−M+i} ,   M−ν+1 ≤ i ≤ M ,

and

(4.2b)   W⁻(i) = W_{N+1−i} ,   1 ≤ i ≤ ν ,
         W⁻(i) = W_{M+1−i} ,   ν+1 ≤ i ≤ M .

Only one of the two pairings (4.2) will be used. To find out which one it is to be, we compute the correlation coefficients between the X_i's and W⁺(i)'s and between the X_i's and W⁻(i)'s. We select whichever pairing leads to the correlation coefficient with larger absolute value. (Note that the two correlation coefficients have the same denominator.)
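The two candidate pairings (4.2) and the selection rule just described can be sketched as follows. Since the two correlation coefficients share a denominator, the sketch compares only the cross-product numerators; the helper names are ours.

```python
def pairings(ws, M, v):
    """The two candidate pairings (4.2), given the sorted W's
    (ws[0] = W_1 < ... < ws[N-1] = W_N).  Returns (w_plus, w_minus),
    where w_plus[i-1] = W+(i) per (4.2a) and w_minus[i-1] = W-(i)
    per (4.2b)."""
    N = len(ws)
    w_plus = [ws[i - 1] if i <= M - v else ws[N - M + i - 1]
              for i in range(1, M + 1)]
    w_minus = [ws[N - i] if i <= v else ws[M - i]
               for i in range(1, M + 1)]
    return w_plus, w_minus

def select_pairing(xs, w_plus, w_minus):
    """Choose the pairing whose correlation with the X's is larger in
    absolute value; comparing the cross-product numerators suffices
    because the two correlations have the same denominator."""
    M = len(xs)
    xbar = sum(xs) / M
    def numer(ws):
        return sum(x * w for x, w in zip(xs, ws)) - M * xbar * sum(ws) / M
    return w_plus if abs(numer(w_plus)) >= abs(numer(w_minus)) else w_minus
```

With the data of the first example of Section 5 (M = 6, ν = 3, W's 0, 1, 2, 3, 5, 7, 9), this reproduces W⁺ = 0, 1, 2, 5, 7, 9 and W⁻ = 9, 7, 5, 2, 1, 0, and selects W⁺ since |116| > |−109|.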
We use W(i) to denote the pairing which is chosen (i.e., either W⁺(i) or W⁻(i), whichever it turns out to be), and the correlation coefficient is then

(4.3)    ρ = [ Σ_{i=1}^{M} X_i W(i) − M X̄ W̄° ] / √( (M σ_X²)(N σ_W°²) ) .

[In this section, we define all means the same as in (2.3).] Some future formulas will contain (±) and (∓) signs. The upper sign is to be chosen in all places if pairing (4.2a) was selected, and the lower sign if pairing (4.2b) was selected. We define

(4.4)    M σ_X² = Σ_{i=1}^{M} X_i² − M X̄² ,    N σ_W² = Σ_{j=1}^{N} W_j² − N W̄² ,
         N σ_W°² = Σ_{i=1}^{M} W(i)² − M W̄°² ,

and calculate

(4.5)    R = √( N σ_W² / N σ_W°² ) .

At this stage, we specify two cases: Case I if |ρ| ≤ 1/R, and Case II if |ρ| > 1/R.

Case I. |ρ| ≤ 1/R. We calculate

(4.6)    t = (β̂_Z − β̂_Y) / √( λ̂ Q̂ / (M−3) ) ,

where

(4.7)    λ̂ = λ̂(R, ρ) = (1 + R² ∓ 2ρR) / (1 − ρ²) ,

(4.8)    β̂_Y = [ (u_YTB) ± ρR (u_ZTB) ] / (1 − ρ²) ,

(4.9)    β̂_Z = [ ± ρR (u_YTB) + R² (u_ZTB) ] / (1 − ρ²) ,

Q̂ is the residual-variance term defined by (4.10), an expression built from 1/(M σ_X²), 1/(N σ_W²), the sums of squares Σ Y_i² − M Ȳ² and Σ Z(i)² − M Z̄°², and the slope estimates β̂_Y and β̂_Z, and

(4.11)   u_YTB = ( Σ_{i=1}^{M} X_i Y_i − M X̄ Ȳ ) / (M σ_X²) ∓ ( Σ_{i=1}^{M} X_i Z(i) − M X̄ Z̄° ) / √( (M σ_X²)(N σ_W²) ) ,

(4.12)   u_ZTB = ∓ ( Σ_{i=1}^{M} Y_i W(i) − M Ȳ W̄° ) / √( (M σ_X²)(N σ_W²) ) ± ( Σ_{i=1}^{M} W(i) Z(i) − M W̄° Z̄° ) / (N σ_W²) .

Case II. |ρ| > 1/R. We calculate

(4.13)   t = (β̂_Z − β̂_Y) / [ (1/|ρ|) √( Q̂ / (M−3) ) ] ,

where

(4.14)   β̂_Y = [ (u_YTB) + (u_ZTB) ] / (1 − ρ²) ,

(4.15)   β̂_Z = [ (u_YTB) + (1/ρ²)(u_ZTB) ] / (1 − ρ²) ,

Q̂ is the Case II analogue, defined by (4.16), of the residual-variance term (4.10), and

(4.17)   u_YTB = ( Σ_{i=1}^{M} X_i Y_i − M X̄ Ȳ ) / (M σ_X²) ∓ ρ ( Σ_{i=1}^{M} X_i Z(i) − M X̄ Z̄° ) / √( (M σ_X²)(N σ_W°²) ) ,

(4.18)   u_ZTB = ∓ ρ ( Σ_{i=1}^{M} Y_i W(i) − M Ȳ W̄° ) / √( (M σ_X²)(N σ_W°²) ) ± ρ² ( Σ_{i=1}^{M} W(i) Z(i) − M W̄° Z̄° ) / (N σ_W°²) .

Note also that (4.14) and (4.15) imply the formula β̂_Z − β̂_Y = (u_ZTB)/ρ².

To test the hypothesis β_Y = β_Z, the statistic t (4.6 or 4.13) is referred to tables of the t-distribution with (M−3) degrees of freedom.

5. Numerical examples of the test for Problem B.

We present two examples, one for Case I and the other for Case II.
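The case split and the slope combinations (4.8)–(4.9) and (4.14)–(4.15) can be collected into one helper. This is a sketch for the upper-sign variant only (i.e., when pairing (4.2a) was selected); the inputs u_YTB, u_ZTB, ρ, and R are assumed to have been computed already, and the function name is ours.

```python
def slope_estimates(u_y, u_z, rho, R):
    """Combine u_YTB and u_ZTB into the slope estimates
    (beta_Y-hat, beta_Z-hat).  Case I (|rho| <= 1/R) uses (4.8)-(4.9)
    with the upper signs; Case II (|rho| > 1/R) uses (4.14)-(4.15)."""
    d = 1.0 - rho ** 2
    if abs(rho) <= 1.0 / R:                      # Case I
        b_y = (u_y + rho * R * u_z) / d
        b_z = (rho * R * u_y + R ** 2 * u_z) / d
    else:                                        # Case II
        b_y = (u_y + u_z) / d
        b_z = (u_y + u_z / rho ** 2) / d
    return b_y, b_z
```

Feeding in the rounded intermediate quantities of the two examples of Section 5 reproduces β̂_Y ≈ .2986, β̂_Z ≈ 2.5959 (Case I) and β̂_Y ≈ 1.3809, β̂_Z ≈ 3.6853 (Case II), up to the rounding carried in those examples.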
For the first example, suppose the first group contains the M = 6 observations

Y_1 = 0.7   X_1 = 0
Y_2 = 2.4   X_2 = 2
Y_3 = 1.9   X_3 = 4
Y_4 = 2.4   X_4 = 6
Y_5 = 4.2   X_5 = 13
Y_6 = 4.5   X_6 = 17

and suppose the second group contains the N = 7 observations which, ordered by increasing W, have W_1, ..., W_7 = 0, 1, 2, 3, 5, 7, 9; the Z's accompanying W_1, W_2, W_3, W_5, W_6, W_7 are 3.2, 5.0, 8.5, 15.7, 20.6, 25.5 respectively. (The Z accompanying W_4 is not needed below, because W_4 remains unpaired under either pairing.)

Using (4.1), we find that γ_2 = 0.2 > 0 and γ_3 = −1.9 < 0, so that ν = 3 is the proper ν-value to use in (4.2). Thus we obtain

W⁺(1), ..., W⁺(6) = 0, 1, 2, 5, 7, 9   and   W⁻(1), ..., W⁻(6) = 9, 7, 5, 2, 1, 0

from (4.2). As for the means (2.3), we will need the values

M X̄ = 42 ,  M Ȳ = 16.1 ,  N W̄ = 27 ,  M W̄° = 24 ,  M Z̄° = 78.5 .

Now we find

Σ X_i W⁺(i) − M X̄ W̄° = 284 − 168 = 116   and   Σ X_i W⁻(i) − M X̄ W̄° = 59 − 168 = −109 .

Since |116| > |−109|, we take W(i) = W⁺(i), so that

Z(1) = 3.2    W(1) = 0
Z(2) = 5.0    W(2) = 1
Z(3) = 8.5    W(3) = 2
Z(4) = 15.7   W(4) = 5
Z(5) = 20.6   W(5) = 7
Z(6) = 25.5   W(6) = 9

The sums of squares and cross-products which we will need are

Σ X_iY_i = 157.9 ,  Σ Y_i² = 53.51 ,  Σ X_iZ(i) = 839.5 ,  Σ X_iW(i) = 284 ,  Σ_{j=1}^{N} W_j² = 169 .

Thus we obtain

M σ_X² = 220 ,  N σ_W² = 64.857 ,  N σ_W°² = 64 ,

ρ = 116/√(220 × 64) = .977590 ,   and   R = √(64.857/64) = √1.013393 = 1.006674

from (4.4), (4.3), and (4.5). Since R|ρ| = .984114 < 1, this example falls within Case I.

Wherever there are two signs in (4.7)–(4.12), we choose the upper sign, since the upper signs are for W⁺(i) and the lower signs for W⁻(i). Thus (4.11)–(4.12) become

u_YTB = (1/220)(45.2) − [1/√(220 × 64.857)](290.0) = −2.222317

and

u_ZTB = −[1/√(220 × 64.857)](23.7) + (1/64.857)(160.2) = 2.271636 ,

and so (4.8)–(4.9) are

β̂_Y = [−2.222317 + .984114(2.271636)] / (1 − .955682) = .29856 ,

β̂_Z = [.984114(−2.222317) + 1.013393(2.271636)] / (1 − .955682) = 2.5959 ,

and, from (4.10), Q̂ = .021884. Finally, (4.7) is

λ̂ = [1 + 1.013393 − 2(.984114)] / (1 − .955682) = 1.0191 ,

and thus the test statistic (4.6) is

t = (2.5959 − .29856) / √( (1.0191)(.021884)/(6−3) ) = 2.2973/.08622 = 26.64 .

This t has M − 3 = 3 degrees of freedom. Hence the critical region for a two-sided test at the α = .05 level is |t| > 3.18. Since |26.64| > 3.18, we reject the null hypothesis β_Y = β_Z.
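The Case I statistic (4.6)–(4.7) can be sketched as below. The residual-variance term Q̂ of (4.10) is taken here as already computed (it is .021884 in the example above), and the λ̂ formula is the upper-sign variant; the function name is ours.

```python
import math

def case1_t(b_y, b_z, rho, R, Q, M):
    """Case I statistic (4.6): t = (b_z - b_y)/sqrt(lam*Q/(M-3)),
    with lam from (4.7) for the upper-sign pairing (4.2a); Q is the
    residual-variance term of (4.10), assumed precomputed."""
    lam = (1.0 + R ** 2 - 2.0 * rho * R) / (1.0 - rho ** 2)
    return (b_z - b_y) / math.sqrt(lam * Q / (M - 3))
```

With the numbers of the first example, this gives t ≈ 26.64 on M − 3 = 3 degrees of freedom, matching the computation above.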
A 95 % confidence interval on (β_Z − β_Y) is given by 2.30 ± 3.18(.0862), i.e., 2.03 < (β_Z − β_Y) < 2.57.

For our second example, we will use exactly the same figures as for the first example, except that we will suppose that X_4 = 9 instead of 6. Since the W_j's are the same as before, ν will also be the same as before, and therefore the W⁺(i)'s and W⁻(i)'s will likewise be as before. The means will stay the same except that M X̄ = 45. We find

Σ X_i W⁺(i) − M X̄ W̄° = 299 − 180 = 119   and   Σ X_i W⁻(i) − M X̄ W̄° = 65 − 180 = −115 .

Thus we take W(i) = W⁺(i) (since |119| > |−115|); this means that the W(i)'s and Z(i)'s are exactly as before. The sums of squares and cross-products will be the same as before except for

Σ X_i² = 559 ,  Σ X_iY_i = 165.1 ,  Σ X_iZ(i) = 886.6 ,  Σ X_iW(i) = 299 .

Hence M σ_X² = 221.5 now, but N σ_W², N σ_W°², and R are unaltered. We obtain

ρ = 119/√(221.5 × 64) = .999471 .

This time, R|ρ| = 1.006141 > 1, which means that we are thrown into Case II. It remains to calculate (4.17)–(4.18), (4.14)–(4.16), and (4.13). We find

u_YTB = (1/221.5)(44.35) − [.999471/√(221.5 × 64)](297.85) = −2.300067 ,

u_ZTB = −[.999471/√(221.5 × 64)](23.7) + (.998942/64)(160.2) = 2.301528 ,

β̂_Y = (−2.300067 + 2.301528) / (1 − .998942) = 1.3809 ,

β̂_Z = [−2.300067 + (1/.998942)(2.301528)] / (1 − .998942) = 3.6853 ,

and, from (4.16), Q̂ = .022953. Finally,

t = (3.6853 − 1.3809) / [ (1/.999471) √( .022953/(6−3) ) ] = 2.3044/.08752 = 26.33 .

Since |26.33| > 3.18, we reject the null hypothesis β_Y = β_Z. A 95 % confidence interval for (β_Z − β_Y) is given by 2.30 ± 3.18(.0875), i.e., 2.03 < (β_Z − β_Y) < 2.58.

REFERENCES

[1] Potthoff, Richard F., "Illustration of a Technique Which Tests Whether Two Regression Lines Are Parallel When the Variances Are Unequal", Mimeo Series No. 320, Department of Statistics, University of North Carolina, 1962.

[2] Potthoff, Richard F., "Illustration of a Test Which Compares Two Parallel Regression Lines When the Variances Are Unequal", Mimeo Series No.
323, Department of Statistics, University of North Carolina, 1962.

[3] Potthoff, Richard F., "Some Scheffé-Type Tests for Some Behrens-Fisher-Type Regression Problems", Mimeo Series No. 374, Department of Statistics, University of North Carolina, 1963.

[4] Scheffé, Henry, "On Solutions of the Behrens-Fisher Problem, Based on the t-Distribution", Annals of Mathematical Statistics, volume 14 (1943), pp. 35-44.