TIYIE KJ~STKTU'vrE OIF STATXSTK(CS THE UNIVERSITY OF NORTH CAROLINA · ' THE DEFINITION OF PARAMETERS IN GENERAL LINEAR MODELS by Ronal d W. Helms Department of Biostatistics University of North Carolina at Chapel Hill Institute of Statistics Mimeo Series No. 1320. Decembe r 1980 ~-------"-' - - - - - - - DEPARTMENT OF BIOSTATISTICS Chapel Hill, North Carolina 1. INTRODUCTION In recent years a number of authors have addressed the question. "What hypothesis is being tested by such-and-such general linear model program (or procedure)?" These questions have arisen largely because of the use of traditional. less-than-full-rank (LTFR). ANOVA linear models by automatic or semi-automatic programs such as GLM in SAS (Goodnight. 1976). Although one can avoid these questions by using full rank models in conjunction with non-automatic software such as LINMOD (Helms. Christiansen. and Hosking. 1979). the inconvenience of non-automatic software. especially for large problems. is simply too great. One can expect substantial advances in automatic linear model software in the next few years. A number of authors have addressed a variety of aspects of the "What hypothesis is being tested" issue. and related issues. over the past decade. See. for example. Urquhart. Weeks and Henderson (1973). Hocking and Speed (1975). Speed and Hocking (1976). Seeley (1977). and the recent entire special issue of Communications in Statistics (Vol. A9. No.2. 1980). (The papers are listed in the List of References.) The issue addressed in this paper is a fundamental one which will be of special importance to producers of automated linear model software. viz .• what are the definitions of the various parameters in a linear model? The point of view adopted here (taken from Scheffe (1959). Chapter 3). is that the fundamental quantities in a linear model E(Y) = XB. are Xand the elements of E(Y). - -- - - The parameter vector in many cases. defined as a vector-valued linear function of X and E(Y). case the appropriate interpretation of a arises from the definition of a may - be. In such a a. For illust- ration. Scheffe (1959. section 3.1 defined a model for a one-way design by Yij = ai +eij • i=1.2 ••.•• I; j=1.2••.•• J i {e ij } are independently N(O 0 2 ). Here the definition of each element of a- is a.1 =E(YiJ')' a.1 has a simple. direct inter- The interpretation of an estimate of ai follows from this definition: ai is an estimate of the population mean of observations from the i-th group. If the elements of a can be defined as linear combinations of the elements of X and E(Y) pretation. ,. - then one can reasonably and directly interpret the elements of ~ - - as known as linear combinations of (unknown) expected values--means of well defined populations. The in- terpretation may. or may not. be simple. depending on the complexity of the linear combination. If a has a direct interpretation then an estimate of a also has a direct interpretation. -2- In the usual ANOVA model for the one-way design. without side restrictions. E(Yij) • ~+ai' the parameter ~ does not have a definition in the sense described above. Moreover. there is no reasonable interpretation of an estimate of squares estlmat~). ~ (e.g •• a least for one can obtaI" another least squares estimate ot arbitrary constant. ~ by adding an Thus. definability of a parameter is an important part of the in- terpretation of the parameter and its estimate. The parameter ~ in the model E(!) c ~~ is called a primary parameter; a secondarY parameter is a linear combination of the form ~ c £~-2 [or ~ • ~(!) - 2]' A primary parameter may (or may not) be defined as a linear combination of X and E(!). Similarly. a secondary parameter may (or may not) be defined as a linear combination of E(!) and fixed. known matrices (~. £. etc.). If ~ is so defined. the interpretation of the elements of 0. and the interpretation of an estimator of 0. stem directly from the definition. The importance of a definition is the close link between the definition and interpretation of a parameter. The interpretation of a well defined parameter (and its estimator) is straightforward. though possibly complex. As illustrated by the one-way ~ above. an undefinable parameter. and its estimate. probably cannot be interpreted. Section 2 of this paper contains a definition of primary parameter definitions in the general linear model and a simple illustration. The issue of alternative. but equivalent. definitions is addressed. and a canonical definition is defined. Section 3 addresses similar issues for secondary parameters. The results presented in this paper may be used by the developer of automatic or semi-automatic linear model software. When a model is constructed for a user the program can display the canonical definitions of ~ and/or any definable secondary parameters. together with equivalent. more easily interpreted definitions. Given such definitions the user will be able. in a straightforward manner. to answer the questions. "Whal 1.10 lhese paramelers mean?" and "What hypothesis Is being tested?" -3- 2. DEFINITIONS OF LINEAR MODEL PRIMARY PARAMETERS 2.1 Notation and Terminology We shall assume. as a basis for discussion. the general linear univariate model with an Nxl vector of random observations. !. an Nxq "design matrix". r & ~. with Rank(X)<q<N, such that -- =~ ~ V(!) = 02!. (2.1 ) E(!) (2.2) (Generalizations of (2.2) will be discussed below.) The qxl vector ~ vector of primary parameters whose values are unknown; presumably can take on any ~ is a nonstochastic value in Eq. q-dimensional Euclidean space. The model equation (2.1) is assumed to be q consistent, i.e., whatever the value of E(!) there exists a 8eE such that (2.1) is satisfied. Thus, manifold of columns of X - • {teEn: - = linear for some 8eEq t"' xaL E(Y)e~(X) - - Thus a is an exact solution of (2.1). tion, for -- -Y ~ - This contrasts with the inconsistent data equa- Xb which does not have an exact solution. ~,produces [The least squares estimator, b, - an approximate solution of the data equation and an exact solution of the consistant normal equations, (~'~)~ • ~'!.] We are interested in studying specifications and definitions of the primary parameter ~. DEFINITION 2.1. A specification of a parameter is a collection of one or more conditions and matrix equations which involve: (1) the parameter, (2) E(!), and (3) known, constant matrices (such as ~). DEFINITION 2.2. A definition of a parameter is a specification which uniquely determines the parameter as a vector-valued function of E(!) and known, constant matrices (such as ~), i.e., for each permissible. fixed value of E(Y) and the known constant matrices there is one unique value of the parameter which satisfies the specification •. A specification may not result in a para~eter being well-defined in the mathematical sense; a definition results in a mathematically well-defined parameter (whose value, like , that of E(Y) is unknown). For example. the model equation E(Y) iff for each E(!) e M(~) = ~~ is a specification of there exists a unique value of a satisfying ~; it is adefinition the equation. The -4- following matrix theorem [adapted from Searle (1973), TheoreM 4. page 12] is useful in the ·present context: THEOREM 2.1. The number of linearly independent solutions to the consistent model equation E(!) • ~~ is q+l-Rank(~) where ~ is Nxq, q<N. The following results are immediate consequences. COROLLARY 2.1.1. The consistent model equation E(!) • and is, therefore, a definition of ~, iff Rank (~) ~~ has a unique solution for ~ • q, i.e., iff X is Full Rank (FR), in which case the unique solution is given by B • (X'Xt-1X'E(Y). (2.3) Let S be a specification of a parameter E(!) = ~~). We shall say that ~ ~ (as, for example. the model equation is well defined (or "definable") if and only if S is a definition. The primary parameter. B. in the usual Less Than Full Rank (LTFR) ANOVA model without side restrictions is not well defined. Some workers leave the parameters unrestricted but place restrictions on the estimates to determine a solution to the normal equations. This procedure leaves the parameters undefined. If one places appropriate restrictions on the parameters of the LTFR model there are two consequences: ferent; imposing restrictions changes the model itself. (1) The model is dif- (2) The model. with the addi- tional restrictions. admits a unique solution and the specification becomes a definition. The well defined. restricted parameters are different from the undefined, unrestricted parameters. DEFINABILITY and ESTlMABILITY. Obviously, a primary parameter is well defined if and only if it is estimable; in a sense, then. the two concepts are equivalent. However. as will be seen. the study of definitions and their properties is quite different from the study of estimators and their properties. Even though definabi1ity and estimabi1ity are equivalent concepts in this context, the concepts stem from different origins and it is useful to use different names for the two concepts. 2.2 Canonical. Alternative, and Equivalent Primary Parameter Definitions In this section we consider unconstrained, full rank models in which the primary parameter is well defined. If we let the model equation E(!) following is a reasonable consequence of Corollary 2.1.1. ~~.specify ~ then the -5- DEFINITION 2.3. Let ~ be the primary parameter specified by the full rank model equa- tion E(Y) • xa. then the canonical definition of a is a = (~I~)-l~'E(!). (2.3) There are other definitions of a primary parameter and it is often the case that alternative definitions are more useful for human interpretation than the canonical definition. The following simple illustration will help make this point. Illustration. Y' = (Yl' Consider ~ hypothetical experiment with three observations, Y2' Y3), in which one knows from experimental considerations that E(Yl) but E(Y3) may be different. and its parameters. Consider the following intuitive definition of a FR model The definition is labelled D2' Dl will be the label for the canon- ical definition. (2.4) The oarameter vector defined by D will be denoted aliI i 02: a!2J = E[YIJ = £[Y2J a!21 = E[Y3] = (alil,aJiJ),. - E[Yl] or, in matrices, (2.5) Solving (2.4) equation, in matrices: (2.6) where aliI is defined by the canonical definition: (2.7) 01: a(11 (~,~)-l~'E(!) (1/2) or, equivalently (2.8) = E(Y2), r t 1 1 01 l - l 2J -6- There are at least two reasons for considering alternative definitions. First, if one constructs a model by first defining readily interpretable parameters (as with (2.4)} and then constructing an appropriate ~ matrix the "original", or "intuitive", definition will usually not be the same as the canonical definition. setting" the intuitive definition matrix, ~I' For an "ANOYA will usually contain mostly O's, l's and -l's. which make the intuitive definition more readily interpretable than the canonical definition. Secondly, if one first constructs the ~I ~ and then determines the canonical definition, matrix may be difficult to interpret. As will be illustrated later, it is pos- sible to construct an equivalent definition, ~(2) • ~2E(!), which is much more easily interpretable, especially for incomplete and/or unbalanced designs. This "example" illustrates the fact that a primary parameter may have two or more definitions, which mayor may not be "equivalent". DEFINITION 2.4. Equivalent Definitions of FR Model Primary Parameters. Let E(!) .. ~~ specify a full rank linear model and consider two alternative definitions of the primary parameter, a, of this model: D.: .(. where ~I ~ ~2. -aU! .. A.E(Y) , i " 1,2, -"- Then 01 and O2 are equivalent definitions if and only if they define the same parameter, that is, if and only if for all E(Y)eM(X), (2.9) COROLLARY 2.4.1. In the context of Definition 2.4, definitions 01 and 02 are equivalent if and only if (2.10) which is equivalent to (2.11) Proof. The equivalence of (2.10) and (2.11) is a well-known theorem from matrix theory. Substituting E(Y) .. Xa in (2.9) and changing a to d produces (2.11). i.e., (2.9) holds iff (2.11) holds. QED. The most important case is the equivalence of an alternative, or "intuitive", definition to the canonical definition. This issue is addressed in the following corollaries. -7- COROLLARY 2.4.2. In the context of Definition 2.4, let 0 1 be the canonical definition of~, f.e., ~I ; (~,~)-l~,; then O2 is equivalent to the canonical definition if and only if ~2~ ; !q. (Proof: Equivalence <==> ~2~ = ~I~ = (~,~)-l~,~ = !q. QEO). COROLLARY 2.4.3. Class of Definitions Equivalent to the Canonical Definition. In the context of Definition 2.4 let 01 be the canonical definition [~I Then definition n2 = (~,~)-l~,]. is equivalent to the canonical definition, 01 if and only if there exists a qxN matrix K such that = ( ~ ' ~ )-1 ~ , -~, (2.12a) A .2 (2.12b) KX ; O. and That is, the class of all definitions equivalent to the canonical definition is generated by matrices Proof (1). ~2 of the form (2.12a) satisfying (2.12b). Let A2 be of the form (2.12a) and let KX = O. - = [(~'~) ~2~ . - -1 ~'-~]~ Note AIX =1. Then ... - O2 equivalent. QEO(l). --... = ! => 01 , Proof (2). Assume 02 is equivalent to 01 so that A2 X =!. ~2 is fixed. Let ...K = (X'X)-lX'-A ...... ... ... 2 and note that, solving for A2 , (2.12a) is satisfied. Also, KX - = (X'X)-lX'X-A ...... ......... 2 X ; ...O. QEO(2) • In the simple illustration given above, since 0 1 is the canonical definition one can easily demonstrate equivalence by verifying ~2~ =!. The class of definitions equivalent to the canonical definition can be found with a bit of arithmetic. The condition KX = 0 (2.12b) requires that K be of the form K= [a -a 0] b -b 0 for arbitrary scalars a,b. Further, one can easily verify that the ~2 in the illustra- tion is generated by a ; -1/2, b ; 1/2. Non-equivalent definitions are easily constructed. parameters, if one-makes an error in constructing tions will generally not be equivalent. a1"= E(yll = E(Y2) a (,I ; E(y,) 2 or alsl = A,E(Y); . .. the intuitive and canonical defini- To continue the illustration consider a third definition 0,: ~, In fact, after defining the n °Q1 LO 0 ~ E(.Y) -8- A simple calculation shows ~3~ 'I !. as expected. because 61 z1 is defined as a "cell lllean" rather than a difference of cell means (as 6!zl). model which would have an 03 defines a "cell mean" matrix different from the one for definitions 01 and Oz ~ This illustrates the point that the definition of 6 is taken in the context of the model equation E(!) = ~~. The model equation may contain information not contained in the definition of 6 alone. In equation (2.5) for example, ~z does not specify that E(Yd .. E(yz)' 2.3 Computing "Simpler" Definitions from X Although Corollary 2.4.3 defines the class of primary parameter definitions equivalent to the canonical definition. no algorithm is provided for computing such a definition. The following algorithm will produce a canonical-equivalent definition which is often easily interpreted. The use of Gaussian elimination (row operations) to "reduce" the matrix [~,!] to Hermite normal form can be written in matrices as (2.13) where or ~z is qxN • !t is (N- qlxN. [I ~q AZ] o R ~ That is, if one uses row operations to reduce (~, !n] to Hermite normal form. the result of the computations will be as shown in the right side of (2.13). The matrix Az forms an alternative definition of The matrix R satisfies onical definition. ~~ =~ ~ which is equivalent to the can- and represents model restrictions on E(Y): RE(Y) = RXa = O. R will contain information such as E(Yl)-E(yz) = 0 in the illustration. ~ ~ will also contain restrictions such as an assumption, in a factorial design. that certain interactions are zero. To continue the illustration, the algorithm is applied to the matrix [X,I] to produce: -9- We see that 0 = ~E(!) = -E(Yl)+ E(yz). the restriction that E(Yl) • E(yz). The Az is the same matrix originally used in Oz. a coincidence. not unexpected in this simple illustration. 2.4 Definitions from Cell Means and "First Observations" In ANOVA-like models with no concommitant variables one can frequently reduce the effort required to form a definition. Consider. for example. an AxB factorial treatment design run as a completely random experimental design. with Yabk denoting the k-th observation from the a-th level of factor A. b-th level of factor b. With no concommitant variables we may define. for each non-empty cell: ~ab = E [Yabk] • k • 1.2 •••.• Nab and. in particular: ~ab (2.14) = E[Yabl]' If we extract from the model equation E(Y) = XB those rows corresponding to the first observation from each cell (Yabl)' we have the "essence" model. say E(!E) that only! and of cell means. ~ are affected. ~ab' arranged i~ not~. But. from above. ~ = E(!E)' where NC~N. ~ is the set a vector. in the order of the elements E(Yabl) in E(!E)' Letting NC denote the number of (non-empty) cells in the design. all have NC rows. = ~E~' Note ~. E(!E)' and ~E Often NC is substantially less than N. DefiniHons of B may be constructed in this "essence" model. An essence-model canonical definition of B is: (2.15) Applying the algorithm described above to [~E' !NC] produces an alternative. possibly more intuitively appealing definition of [~E' (2.16) !NC] A19. [!q ~ where ~E2 is ':l)(NC •. ~E is ~ in the context of the "essence" model: ~E2j ~E (NC- q) xNC and °E2: ~ • ~2~' which is. by construction. equivalent to DEl in the context of the "essence" model. If the original model imposes constraints on the estimation space (e.g •• AxB interactions are zero). the n th e number 0 f parame t ers i n ber of cell means: ~. i .e .• q. 1S . 1ess than the num- q<NC. and the restrictions are defined by no constraints. q. NC and the ~E matrix is null (OxO). RE~ =~. If there are -10- The extraction of the "essence" model may be performed as a matrix multiplication, with [E(!E)' ~E] .. ~[E(!), ~] NCxl NCxq NCxN Nxl Nxq where Uis a matrix of l's and O's which "picks off" the rows of E(!) and ing to the first observation in each cell. The nonzero columns of ~ ~ correspond- collectively form INC· "Essence" model definitions are easily expanded to full model definitions using the matrix U: ~ .. ~EE(!E) .. ~E~(!) .. ~E(!) with ~ = ~EY. (The result of this operation is essentially the same as inserting one ~E column of zeros into to correspond to each non-first observation (Yabk' k>l) in each cell.) The correspondence between "essence" model definitions and full model definitions is established in the following theorem: NOTATION. Let a full rank linear model be specified by the model equation E(Y) .. XB where E(Y) is Nx1, X is Nxq, and let the canonical definition of B be ~ 01: (t~)-l~'E(!) .. ~IE(!). .. Let NC be the number of distinct rows in X; (1) Each column of ~,~ (2) ~ NC~N. Let U (NCxN) satisfy: is zero or a column of !NC; .. !NC (Which insures that exactly once.) ~ contains every column of INC . Let ~E = ~, !E .. Let the model E(!E) = ~E ~ so that E(!E) .. ~(!). be called the essence model. By definition the essence model canonical definition of B is DEl: Obviously ~El~E ~ .. ~EIE(!E) .. (~E ~E)-l~E E(!E)· .. !q. THEOREM 2.2. With the notation given above, when definition DEl is.extended to the full model the corresponding definition is equivalent to the canonical definition. That is, if the extension of DEl to the full model is DEE1 : ~ .. (~El~) E(!) .. ~£E1E(!) then -11- Proof. As noted above, ~E1~E = !i substituting: = ~E1Y~ =~E1~E =!. ~EE1~ QED. 2.5 An Example The incomplete, unbalanced part of a 3x4 factorial, with four missing cells, shown in Figure 2.1, is taken from Searle (1971, page 287). Unlike Searle, we shall assume the interactions are zerOi they are omitted from the model. The "reference cell" technique will be used to construct a full rank model with directly interpretable parameters. be the reference level. For convenience, let the first level of each factor (The models constructed by the SAS GLM procedure are, in effect, essentially equivalent to reference cell models, but the GLM algorithm uses the last level of each factor as the reference level.) In reference cell models "main effects" are taken at the reference levels of the other factors, not averaged over the levels of the other factors as in other methods. In- stead of a "general mean", one uses the mean of the reference cell as a measure of the 'e "genera1 level of response". defined: Us ing F;gure 2.1, the following reference cell parameters are III I E[Y llk ], reference ce11 mean Q2 E[Y 21k ]-1l1l =1121-1111 6, E[Y 13k ]-1l1I = I'll-JJII = E[Y 14k ]-JJII = 6~ JJI~-JJIl' The missing cells do not permit the direct definitions of CJ 3 and 6 at the reference 2 (first) levels of the other factors, but by taking note of the no-interaction assumption one can define Q, 62 = E[Y 33k ]-E[Y13k] = JJ13-1l13 = E[Y22k]-E[Y21k'] = JJ22-JJ21. Expressing these results in matrix form produces the matrix which specifies the "intuitive definition" of 6. ~2, shown in Table 2.1, Note that A2 uses only the first ob- servation from each cell. - The design matrix, X, and the Al specifying the canonical definition of ~ are shown in Tables 2.2 and 2.3. . Clearly, '~l and ~2 are differenti for interpretive purposes the definitions given above are generally preferable to the canonical definitions. calculation shows ~2~ = !' which proves the definition specified by the canonical definition specified by ~l' ~2 An easy is equivalent to Note that one of the differences between ~I ""'"-' -12- and ~2 is that the canonical definition assigns equal weight to the expected values of observations in one cell, i.e., to all observations with the same expectation under the model. This would generally not be the case if the model contained concommitant variables. The design matrix, ~E' for the "essence" model (for the first observation from each cell) is constructed by selecting the rows of ~ marked with an asterisk (*) in Table'2.2. The ~E matrix was used to compute ~El' (see equation 2.15) the matrix which is the basis of the canonical definition of ~El ~El is shown in Table 2.4. was expanded to correspond to non-first observations, i.e., readily verify ~E~E = !6 and = !6' ~EE1~ note the substantial difference between ~i the matrix in Table 2.4) and Finally, the matrix are shown in Table 2.5. [~E' in the essence model. The transpose of ~ ~EEl by inserting columns of zeroes to = ~El~ as ~El in Theorem 2.2. One can In spite of the equivalence of definitions, ~EEl (obtained by inserting rows of zeroes into in Table 2.3. !sJ was transformed to Hermite normal form; the results The resulting definition matrix, ~E3' when expanded to the full model, defines all the parameters, except a, just as they are defined by A2 3 - - (Table 2.1). The restrictions ~E~ = ~ are: o = ~11-~13-~21+~22-~32+~33 An examination of these linear combinations reveals that these are combinations of interactions. The matrices and ~E3 are not uniquely defined by the algorithm. ~E In fact, since the linear combinations above are zero, one can modify the definition of a parameter by adding zero (in the form of one of the rows of To illustrate, consider the a 3 = ~E3 ~E~) to the definition of the parameter. definition of a from Table 2.5, panel 2, row 3: 3 ~11+~2i-~22+lJ32 The first restriction above is: Thus, adding the expressions for a and 0 yields 3 a +0 3 = ~U-~13 which is the same as the A2 definition in Table 2.1. of the first yields a = ~3.-~1.' 3 Using the second row of ~E~ instead Since the interactions are zero, the equivalence of all these definitions of a is obvious. The example illustrates one way in which different, J -13- but equivalent, definitions of a parameter may arise from one model. As a result of this discussion one can observe that any linear combination of the rows of ~E to any row(s) of would be equivalent to the one from the original ~E3 may be modified by adding ~E3 ~E3; the resulting definition and to the canonical definition. Computationally, this device may be used to "simplify" a computed "intuitively pleasing" definition. The example also shows that a computed definition, as a in 3 ~E3 in Table 2.5, may contain more non-zero terms than, and may not be as easily interpreted as, other equivalent definitions. 2.6 Weighted Models: The Case V(!) = a2~ To this point we have assumed V(Y) ~ If one assumes instead, V(!) = a2~, where is a known, positive definite symmetric matrix then one can transform from the origin- al model. E(Y) 0_ -0 21. (2.17) = XB. to: - -- E(l-ly) = (l-lX)8 ........ V(l-l y) = a 2 _N 1 where We shall refer to (2.17) as the "transformed model". All the preceeding results apply to this model directly, with (~-l!) in place of ! and (~-l~) in place of ~. The canonical definition of 0T1: ~ ~ in model (2.17) is, after some simplification: = (~,~-l~)-l~,(~-l)' E(l-ly) • L :'.1 E(l-ly). - - Now consider the canonical definition of 8 from the original model, but in the context of the transformed model: 0T2: ~ • (~,~)-l~'E(!) • (X'X)-lX'l E(l-ly) -- -• ~Tl(~-l!) ........ A quick application of Corollary 2.4.1 to 0T2 in the contet of model (2. 17). ~T2 (l-l X)= !. demonstrates that 0T2 and 0Tl are equivalent. Note that both definitions can be written in the context of the original model: 001 • DT1 : ~. ~olE(Y), ~01 002 • 0T2: ~ = ~02E(!). ~02 • (~,~-l~)-~x,~-l • (X'X)-lX' -14Another application of Corollary 2.4.1 shows 001 and 002 • hence 0Tl and 0T2. are equivalent 1n the context of the original model. The parameter ~ 1s identically the same. whether defined in the original model or 1n a transformed model such as (2.17). Thus: (1) whether or not one is using a weight- ed (transformed) model. and (2) whether or not one should be using a weighted model. the parameter defined by 002 in the original model is equivalent to the corresponding parameter defined by 0T1 in the transformed model. Any 002-equivalent alternative definitions in the original model are also equivalent to 0T1 and its equivalents in the transformed model. Thus interpretation of ~ may be made in the simplest. most intuitively appealing setting. and the interpretation applied directly to ~he canonical definition in the transformed model. Figure 2.1 Incomplete. Unbalanced Part of a 3x4 Factorial Design (Searle) Cell Means: Mab Numbers of Observations. Nab b a = Levels of Factor B 2 3 4 =1 Levels of Factor A 3 a 2 0 2 2 2 0 0 3 0 2 2 4 Table 2.1 b Levels of Factor A c so E[YabkJ Levels of Factor B 1 2 3 4 JJ 11 -- JJu JJH so 2 JJ21 JJ22 -- 3 -- JJu JJ33 -\JH -A2• The Matrix of the Intuitive Oefinition of 6- PARAMETER OBSERVATION SUBSCRIPT 111 112 113 131 141 142 211 212 221 222 321 322 331 332 341 342 343 344 * * * * * * * JJ 11 Q2 -1 Q, -1 -1 B2 B, -1 B_ -1 *Notes: 1. Each row marked with an asterisk is also a row of A . E2 2. Zero values have been omitted. * -15- Table 2.2 X. The Reference Cell I~del Design r~trix for the Example PAIWIETER OBSERVATION SUBSCRIPT \.I 11 111 * 112 113 131 * 141 142 211 * 212 • 221 * 222 321 * 322 331 * 332 341 * 342 343 344 1 1 1 1 1 1 1 1 \ 1. Rows marked with an asterisk are the rows of XE• 2. Zero values have been omitted. *Notes: Table 2.3 ~El' The Transpose of the Hatrix of the Essence Hodel Canonical Definition PARAMETER OBSERVATION SUBSCRIPT 111 131 141 211 221 321 331 341 \.111 02 as 82 Ss S~ 0.8 O. 1 0.1 0.2 -0.2 0.2 -0.1 -0.1 -0.6 -0.2 -0.2 0.6 0.4 -0.4 0.2 0.2 -0.2 -0.4 -0.4 0.2 -0.2 0.2 0.4 0.4 -0.4 0.2 0.2 -0.6 0.6 0.4 -0.2 -0.2 -0.7 0.6 0.1 -0.3 0.3 -0.3 0.4 -0.1 -0.7 0.1 0.6 -0.3 0.3 -0.3 -0.1 0.4 -16Table 2.4 AE1 • The Essence Model Canonical Definition for the Example e OBSERVATION SUBSCRIPT PARAMETER llu <l2 lX3 B2 B3 B_ 111 131 141 211 221 321 331 341 0.8 -0.6 -0.2 -0.4 -0.7 -0.7 O. 1 -0.2 -0.4 0.2 0.6 O. 1 0.1 -0.2 -0.4 0.2 0.1 0.6 0.2 0.6 0.2 -0.6 -0.3 -0.3 -0.2 0.4 -0.2 0.6 0.3 0.3 0.2 -0.4 0.2 0.4 -0.3 -0.3 0.1 0.2 0.4 -0.2 0.4 -0.1 0.1 0.2 0.4 -0.2 0.1 0.4 Table 2.5 Matrices Used in Computing Definitions of Parameters for the Example q NC 1. = Number = Number of parameters = Rank (!) of non-empty cells -8 [~E' !NC]* PARAMETER CELL lJ I IClz CLj B2 B3 B_ 11 13 14 21 22 32 33 34 2. PARAMETER Ilu CELL 10 11 13 14 21 22 32 33 34 1 Aft., R.do,tio" to Henoit...,..1 Form, [; ~E3] III I PARAMETER* Q3 B2 B3 B_ Clz 1 1 Or! 1 CJa B2 B3 B_ "Zero-l" "Zero-2" =6 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 CELL ID 11 13 14 21 22 32 33 34 1 1 -1 1 -1 1 -1 0 -1 1 -1 1 1 -1 -------'---1 -1 0 -1 1 -1 1 0 1 0 -1 -1 1 -1 0 1 *Note: Most zero values have been omitted. -17- 3. DEFINITIONS OF SECONDARY PARAMETERS Notation and Definitions; Definability 3.1 Much of the work of a general linear model analysis is performed via the estimation of secondary parameters and testing hypotheses about these parameters. Given the general linear model, E{Y) = XB, either Full Rank (FR) or Less Than Full Rank (LTFR), secondary parameters are commonly specified in terms of the primary parameters, ~I = ~~-21 (3.1) or in terms of E{Y), ~2 (3.2) where ~ and ~2 = ~2E{!)-22 are specified matrices of constants and 21 and 22 are specified vectors of constants. As with primary parameters, three basic issues arise: (1) Is a particular secondary parameter well defined? (That is, is the specification of a a definition?) (2) What is a canonical definition for a definable secondary parameter? (3) Given two different definitions for a secondary parameter, are the two definitions equivalent? The following results address these issues. DEFINITION 3.1 Well-Defined ("Definable") Secondary Parameter. In the context of the linear model E(Y) = XB, a secondary parameter, a, specified by (3.1) or (3.2) is said -to be well-defined (or "definable") if and only if for an arbitrary specified value of E{!) e M(~) ~ there exists a unique value of which satisfied the specification of ~. In particular: (1) A secondary parameter, ~2 of the form (3.2) is well-defined and the specification (3.2) is a definition of ~2' (2) A secondary parameter, only if for arbitrary specified values of ~I satisfying (3.1), invariant to choice ~I ~ specified by (3.1) is well-defined if and and E(!) e of~satisfYing M(~) there is a unique value of E{!) =XB. If61 is well-defined, (3.1) is a definition of 6 I ' In a LTFR model without side restrictions the primary parameter B is not welldefined but the parameter specified by (3.1) mayor may not be well-defined. Clearly, in a FR model all secondary parameters specified by (3.1) or (3.2) are well-defined. Let X have full column rank in the linear model E(Y) = XB and let aI' ... -... 82 be specified by (3.1) and (3.2), respectively. Then (3.1) is a definition of 61' COROLLARY 3.1.1. - -18- (3.2) is a definition of 62. and both 61 and 6 2 are well-defined. Proof. - -- First, by Definition 3.1,92 is well-defined and (3.2) is a definition. ·has full rank then by Corollary 2. 1.1 ~ whic~ unique value of ~ is well-defined: for any E(!) e M(~) If X there is a = ~~. Therefore. since all other matrices in satisfies E(!) (3.1) are fixed constants. for any E(Y) e M(X) there is a unique value of 91 given by (3.1). which satisfies (3.1). QED. The following theorem gives necessary and sufficient conditions for a LTFR model secondary parameter to be well-defined. THEOREM 3.1. A secondary parameter. 61. specified by (3.1) is well-defined if and only if (3.3a) or equivalently. C e M(X') (3.3b) - - for any generalized inverse. ~-, if~. columns of A. If (3.4) g 1 = CX-E(Y)-gl ~ and where M(8) denotes the space spanned by the is definable any specification of the form is a definition of e I ' Proof. First. ~I is unique iff e ~ is unique. Now. by Rao and Mitra (l971). Theorem _I _1 =-C13 has a unique value for each solution, a. of E(Y) = Xa. iff the - - .... __ = CX-XS __ __ = CX-E(Y), __ _ equivalent conditions (3.3a) and (3.3b) hold. From (3.3a) e_I -g_1 = Ca 2.3.1(c). 91-~1 - which implies (3.4). QED. As a result of this theorem one can observe that a secondary parameter is we11defined if and only if it is estimable. just as is the case for primary parameters. As before. "definabil ity" stems from a different conceptual base than does estimabil ity. Although the concepts are "tangent" at this point, the study of alternative definitions and equivalence of definitions is different from the study of estimators and their properties. As illustrated by teh proof of Theorem 3.1. however, some theory tools are useful in both studies. 3.2 Equivalence of Alternative Definitions of a Secondary Parameter As with primary parameters it may be possible to have different. but equivalent. definitions of a secondary parameter. Since only well defined secondary parameters need -19- be considered, there is no need to distinguish between LTFR and FR models. The basic idea of equivalence of definitions of secondary parameters is straightforward: two definitions are equivalent iff they define the same parameter. Some additional notation will facilitate manipulation and comparison of definitions. A secondary parameter 6 specified by (3.1) or (3.2) is a vector-valued linear function of X, which is fixed, and E(V) e M(X). Since M(X) = {XB: BeE q}, it is convenient to regard a as a vector-valued linear function of S, either directly as in (3.1) - or indirectly via E(V) = XS in (3.2). - We shall use the notation 6 = 6(S) to denote this relationship. Let 6 1(S) and 62(S) be two possibly different, well-defined secondary parameters. By varying E(V) over all of M(~), or equivalently, by varying ~ over Eq , one generates all possible values of ~I(~) and ~2(~)' Therefore, ~I(~) and ~2(~) are identical iff q 61(S) = 62(S) for all S e E . - - With- this notation it is easy to define equivalence of secondary parameter defini- tions. DEFINITION 3.2. Equivalence of Secondary Parameter Definitions. Let 01 and O2 be definitions of the secondary parameters ~I =~I(~) and ~2 = ~2(~) in the linear model = ~~. The definitions 01 and O2 are said to be equivalent, and eland 62are said to be equivalently defined if and only if 6 ds) = 6 2(S) for all S e Eq , i.e .. iff 6 I E(!) and 62 are identical. The following theorem gives straightforward necessary and sufficient conditions for the equivalence of secondary parameter definitions. THEOREt' 3.2. Necessary and Sufficient Conditions for Equivalence of Secondary Parameter Definitions. Under the linear model E(V) = XS let four secondary parameters be defined by: (3.5) Dlj : ~Ij • ~Ij(~) = ~j~-2I' j • 1,2; (3.6) D2j : ~2j = ~2j(~) = ~2jE(!)-~2' j = 1,2. Then: (1) 011 and 012 are equivalent iff ~I (if) 021 and 022 are equivalent iff ~21~ (ifi) Oil and 021 are equivalent iff = ~2' = ~22~' (a) gl = g2' and (b) ~1 = H21 X. -20- Proof (i). D11 and D12 are equivalent iff ~ = ~11(6)-a12(~) = (£1-£2)~ <=> fl Proof (ii). q for all ~ e E for all 6 e Eq = f2' QED(i). D21 and D2 2 are equivalent iff ~ = ~21(~)~22(~) for all ~ e = (~21-~22)~~ for all ~ e Eq <=> ~21~ = ~22~' QED(ii). Proof (iii). q E D1l and D21 are equivalent iff q for all 6 e E -·(~1-~;1~)~+(21-~2) for ;11 ~ e Eq 0=8 11(6)-6 21 (6) letting ~ = ~ establishes ~1 = ~2; q (fl-~21~)~ = ~ for all ~ e E <=> fl = H21X. QED(iii). 3.3 Canonical Definitions of Secondary Parameters The objective of constructing a canonical definition is to provide one definition which is easily and uniquely determined from any member of a class of equivalent definitions. The canonical definition is thus a unique representative of the class of equivalent definitions. The canonical definition of 6 in the full rank model is easily justified: 6 = (~,~)-l~'E(!) is the unique solution of the model equation E(!) equivalent defintions of ~ necessarily involve another matrix, ~, = ~~. Alternative, as shown in Corollary 2.4.3. Thus, the canonical definition of 6 is not only the unique solution of the model equa- tion but is also, in one sense, the most concise definition in the class of equivalent definitions. The justification of the canonical definition of a secondary parameter lies primarily in the fact that it is a unique definition in the sense that, beginning with any equivalent alternative definition, application of a simple formula always results in the same canonical definition. THEOREM 3.3. let four equivalent definitions of a secondary parameter = ~j~-~' j = 1,2 ! = ~jE(!)-2' j . 1,2. D1j : ~ D2j : Then, This statement is proven in the following theorem. ~2 be given by: ·21- ~ .. ~1~(r~ff (3.7) .. ~2~(r~)-r .. £r (r~fr .. £2 (r~ft for any generalized inverse, (~'~). of ~'~. Proof. By equivalence and Theorem 3.2, ~1~ = ~2~ = £1 = £2' Postmultiplying each of these terms by (X'X)-X' produces H. variant to choice of ~j(~'~)-~' (~'~)-. = ~~(~'~)-~' Also, by Theorem 3.1, C1 = ~2 Note that X(X'X)-X' is ine M(~') => ~j .. ~~, so is also invarjant to choice of (~'~)-. QED. Theorem 3.3 and the preceeding discussion motivate the following. DEFINITION 3.3. Canonical Definition of a Secondary Parameter. (1) let a secondary parameter, e, in the model E(Y) .. Xs be defined by (3.8) Then the canonical definition of e is 'e D!: ~ (3.9) = [:(~'~f~'J E(!)-~ (2) Let a secondary parameter, e, in the model E(Y) = Xs be defined by (3.10) Then the canonical definition of e is D!: e =[HX(X'Xfx'J E(Y)-g (3.11) - -- - - - -- In each case (X'X)- is an arbitrary generalized inverse of (X'X). COROLLARY 3.3.1. In Definition 3.3, 01 and Dt are equivalent and 02 and D~ are equi- valent, i.e., a secondary parameter definition is equivalent to its canonical definition. Proof. For D1 and D*1 note that (x'xfx' is a generalized inverse of X, i.e., X- .. (X'X)-X'. - - - let -H* -- - =CX· _ in Df;. using (3.3a) we have _H*X_ - = __ CX-X_ = _C. Thus D1 and D*1 are equivalent by Theorem 3.2(iii). Now, D2 and D: are equivalent, by.Theorem 3.2(ii) iff HX is equal to [HX(X'X)-X']X = HXX·X .. HX. QED. It is interesting that although (X'X)·X' is a generalized inverse for X, the expression ~~- [: e M(~')J is not invariant to choice of ~., but :(~'~)-~' ~ invariant to choice of (~'~)-. Thus, in a sense, !3.9) represents the simplest g.i.-invariant canooical definition of e in terms of C, X, and E(Y). Similarly since ~~- is not invariant -22- to choice of ~-, but ~(~'~)-~' is invariant to choice of (~'~)-, (3.11) is the simpl- est g.i.-invariant canonical definition of ~ from definition D2 (3.10). The following corollary shows that an expression which appears to be slightly simpler than (3.9) is, in fact, identical. COROLLARY 3.3.2. In the setting of Definition 3.2, £(~'~)-~' = £~+ where ~+is the Moore-Penrose g.i. of X. Thus an identical expression for (3.9) is: Dt: ~ Proof. z ~~+E(~)-~~ Graybill (1976), Theorem 1.5.26, shows ~ => £(~'~)-~' = ~~(~'~)-~' = ~~~+ COROLLARY 3.3.3. z ~~ + = - ~(~'~) ~'. Now £ e M(~') -> C = £~+. QED. In the context of Theorem 3.2, two secondary parameter definitions of the forms (3.5) or (3.6) are equivalent if and only if they have the same canonical definition. Proof (i). Using the notation of Theorem 3.2 and Corollary 3.3.1, the canonical def- initions of e l' are - J where ~lj • £j(~'~)-~' , j - 1,2. By Theorem 3.2, Dll and D12 are equivalent iff £1 i.e., ~lland ~12have = £2. the same canonical definition. canonical definition,i.e., ~11 = ~12' Clearly £1 Suppose ~11 and = £2=> ~12 ~11 = ~12' have the same then ~11(~)~12(~) = (~11-~12)~~ = 0 for all ~ e ~q => ~11 and ~12 are identical and D11 and DI2 are equivalent. Proof (ii). The canonical definitions of ~2j QED(i). are ~2j = ~;l(!)-~' where H!. _J =H _ 2j (X'X)-X' - __ , j = 1,2. If D21 and D22 are equivalent, by original specification (3.6) and Theorem 3.2, ~21~ = ~22~' nition. If which implies ~21 and ~22have ~2! = ~2~' i.e., ~21 and ~22 have the same canonical defi- the same canonical definition, i.e., ~21(~)~22(~) = (~!j-~!j)~~ = ~ q for all ~ e E -> D21 and D22 are equivalent. QED(ii). ~2~ = ~2~' then -23- Proof (iii). The canonical definitions of ei -I .. .' z Hi· E(Y)-9," -I i -- ~II and ~21 are = 1,2 with ~tl= ~I(~'~f~' ~;I = ~21~(~'~)-~" If Oil and 0 21 are equivalent then (Theorem 3.2(iii» ~I • ~2 and ~I • ~21~ => ~tl = ~~I' i.e., the canonical definitions are identical. if the canonical definitions are identical, i.e., ~I = ~;1 and ~I • ~2' then ~11(~)~2' ~) • (~11-~;1)~~ for all ~ Oil and 0 21 are equivalent. => QEO(iii). The following result is useful in the next section: COROLLARY 3.3.4. A secondary parameter defined by (3.5) has an equivalent definition in the form (3.6), and vice versa. (3.6). For ~2j [The canonical definition of (3.5) is of the form of the form (3.6), let ~ = ~2j~ to obtain the form (3.5); equivalence follows from Theorem 3.2(iii).] 3.4 Definitions and Equivalence in the Essence Model All of the definitions and results in the preceeding sections may be applied directly to the essence model, i.e., the linear model for the first observation from each cell, where all observations from a cell have the same expected value. Parameter definitions made in the essence model may be extended to equivalent definitions in the full model. Essence model canonical definitions are equivalent to, but generally not identical to, their extensions to the full model. The following result establishes most of the possibilities. THEOREM 3.3. In the context of the model E(Y) = Xs, with X Nxq, let U be an NCxN matrix, NC<N, whose rows are a subset of the rows of!N hence, !E • ~ is the vector of first observations from the cells. essence model is E(!E) = ~E~' ~~' Define • !NC such that ~E • ~~. the let the following be equivalent secondary parameter definitions in the essence model: DEl: e = CS-g It follows that definition of ~ DE2 : ~ = ~EE(!E)-2 • ~E~E • ~E~~ = ~~ with ~ = ~E~' and that the essence model canonical e may be stated in the two equivalent forms -24- DEl: e • ~(~'~)-~ E(!E)-~ DEz: e = ~E~E(~E'~E)-~Er E(!E)-g· D~: ~. £(!'!)-!'E(!)-2 01: ~ = ~~(~'~)-~'E(V)-9 O2: e = HE(V)-g. Define: Then all of DEl' DE2 , DEl' DEz ' n:, D~, and O2 are equivalent definitions of ~. Proof. The equivalence of DE~ and DEl is assumed. the equivalence of DEl and DEl to Dlz DEl and Now, by Theorem 3.1, ~ ignored. ~~ <=> DEl and follow from Corollary 3.2.1. £e M(~') <=> £= CX-X. = ~~-~E Since <=> ~ ~ is the same vector throughout it may be e M(~E') <a> ~ ~E s.t. ~. ~E~E = Consider the difference of the parameters defined by D~: (CB-9) - [C(X'X)·X' E(V) -g] -- - - - - - - • Ca-CX-Xa = (C·CX-X)B = O. q The two parameters are identical for all Be E , i.e., DEl and D~ are equivalent. By the equivalence of DEl and DEZ , £ = ~~. Simple manipulations show ~ • ~U'. thus ~ = ~~ .. ~~'~~ .. ~~, which establishes the equivalence of DEl and O2 and. therefore, of which is the canonical definition of D2. We have established a chain such that q for all ~ e E , all six definitions of ~ produce identical values of e. All the D~ definitions are therefor equivalent. QED. Secondary parameter definitions are often more conveniently constructed in the essence model than in the full model because the vector of cell means, fewer elements than E(V). ~ • E(!E) has There is no difference for secondary parameters of the form e .. CB-9. of course. - --- 3.5 Examples As a first example reconsider the design and model presented in section 2.5. We shall work with the essence model to keep the matrices small. The first obvious secondary parameters to define are the A and B main effects, defined in terms of the primary parameters: ~A where ~~ .. ~B • ~B~ • .. [~J [i;] -25- .. [~ 1 0 0 1 82 83 84 0 0 0 0 C _B = [~ 0 0 0 0 0 0 1 0 0 ~ll c _A .' 02 Q! ~J 0 1 0 ~] An obvious second set of definitions of these effects may be made in terms of cell means: ~'" (~11 ~2= ~2' ~13 ~1" lI2l ~22 ~!2 ~!3 [-1 0 0 -1 0 0 1 0 0 0 0 0 0 1 [-: 0 1 0 0 0 1 -1 0 0 1 0 0 0 0 0 0 0 0 -1 DA2 : ~A .. ~A2~ .. ~A2 E(!E) = ~,..) ~J ~] [~21-lJllJ ~!3-~13 DBZ : ~B .. ~Bz!: .. e ['' "' J ~11-ll11 ~1 ..-lJll Since definitions 0AZ and DBZ coincide with original intuitive definitions of the parameters (02. a,)' and (8 2, B3 , B4 ) ' , one would expect equivalence between 0Al and DAZ and between DBl and 0BZ" This equivalence is easily verified by demonstrating that £A = ~AiE and £8 2= ~BZ~E· Since the model assumes zero interactions we may define, as secondary parameters, cell means for the empty cells. Some obvious definitions are: ~l2 = lJll+B 2 .. ~11+lJ22-ll21 ll23 .. llll+C12+B 3 • -~11+)Jl!+ll21 ll2lt .. llll+C12+B 4 = -llll+)J1It+)J21 ~31 .. ~ll+OJ • ~ll-~l!+lJJ3 Given these secondary parameters it is interesting to define "average distance between t the lines" main effects for A and B. DM ' !.A' 1/4 These are defined as: b=l [lJ2b - ~lb] )J3b - )JIb -26- DB3 : ~B = 1/3 Ua 2 t Ual - • H _B U ua ' - Ual ua ' - Ual a=l An overall mean is also interesting: DU3 : eU = 1/12 aE IIEU ab • -H".. 3-U After some minor calculations. the matrices of these definitions are shown to be: Un UI' U21 ~4-1 0 -2 0 -1 4 1 ' t 1 3 1 0 0 2 -2 2 Ull ~A3 = 1/4 ~A3 • 1/3 -3 -3 ".", H -UI = 1/12 [1 U22 U52 US! 0 0 -1 1 0 2 2 0 0 1 -1 0 0 0 0 0 2 2 Simple matrix multiplication demonstrates that and ~B ~B3~E U" ~J n -1 2 £A • 1J (DAl and DA3 are equivalent) ~3~E = ~B3~E (DB1 and DB3 are equivalent). as expected. One can compute ~U • = 1/12 [12 4 4 3 3 3J. i.e .• the equivalent definition of ~U in terms of ~ is eU '" UII +(a2+a3)/3 + (lh+B3+B .. )/4. This definition appeals to one's intuition if one recalls that. by definition. al '" 0 and 61 '" 0; the terms in parentheses are averages of the c; and 6j • respectively. A Reparameterized Model. A well-known method for obtaining a full rank model in ANOYA problems is to "reparameterize" 1he lTFR ANOYA model for the example described abo~e and in section 2.5; we continue to omit interactions. One device for reparameter- ization. based on the notation of "sum to zero" restrictions. is as follows. Require: 61+62+61+6, = 0 => One omits al and 61 from the model. by the expression above. 61 '" -(6 2+6 1+6,) When E[YabkJ involves CIt or 61t the term is replaced Thus for example: lTFR Model: E[Y llk ] = u+ol+61 Reparaooterized flodel: E[YllkJ = u-~-al-B2-BI-6, Application of this technique to the "essence" LTFR model produces the!E shown in Table 3.1. -27- 2 a2is a comparison of the second A typical interpretation of the parameter a is: level of A!!. (minus) the first level of A. Similarly, as is often interpreted as comparing the third vs. the first levels of A. . ' the 6j and levels of B. The parameter ~ Similar interpretations are applied to is interpreted as an overall mean. We shall examine these interpretations . Define the following secondary parameters (which are just subsets of the primary parameters) : [:] [:l D~ , ~B' D~: The matrices £A' £B and • -AC.6 ~B~ e~ = ~ = £~~ £~ are shown in Table 3.1. If the interpretations given above are correct, the following definitions, in terms of cell means, should be equivalent to the ones above: ~B .[~22-~2 J ~13-~11 ~H-~ll D 2: ~ where the ~ab e-~ • (1/12) I I ~ a b ab The matrices ~A3 ~A2' ~B2 = ~~E' ~B3 Parameter: -CA3 • presented earlier. = ~B~E ~ [~ C -~3 and ~v3' ~~ are above. and ~~3 • ~~~E are: ~ Os 62 6, e~ 0 0 0 0 ~] 2 1 1 2 U 0 0 0 0 0 0 1 1 1 :] [1 0 0 0 0 0] ~, ~B' ~A is indentical to 2 These matrices are compared with 2 and C in Table 3.1. -~ • C shows that the parameter -~ ~~ The matrices 1 ~B3 • -~ -~- for missing cells are constructed as above. identical to the matrices First, C = H~ ~ is truly an overall mean in the sense -28- described by D112 above, where secondary parameters are used for means for empty cells. Obviously SA ~ £A3 and SB ~ £B3' Thus for example, the parameter~ is not a com- parison of the second level of A ~.the first level of A; that comparison is made by (from the first row of SA3) 202+09 '" ~!one of thea or Bb have the interpre2 tations suggested above: Although this type of reparameterization is a useful tech1l21-\.l11' nique, especially if one wishes only to test hypotheses, interpretation of the parameter estimates from such a model requires considerable care. Remarks on Interpretation of Results. The canonical definition of a secondary parameter is primarily useful in concept rather than application: it is useful to have one unique definition which results from application of a simple algorithm to any member of a class of equivalent definitions. In practice the coefficients in the canonical definition matrix are often too complicated for direct interpretation; there are simpler, equivalent definitions which are much more useful for interpretation. However, the formal equivalence of definability and estimability becomes useful, for the coefficients of the canonical definition of e are the same as the coefficients of the BLUE of ~, VCr) .. under the assumption a2~. That is, if the canonical def- A inition of e is D: e" HE(Y)-g, then e .. HY-g is the BLUE of e. - -One could insist upon a data oriented interpretation, in contrast to the parameter oriented approach taken in this paper, in which one would interpret ~ in terms of the A coefficients expressing ~ as a vector-valued linear function of!. Since the same coefficients, the matrix H above, also form the canonical definition of e the two interpretative approaches are very similar at this point. An advantage of the parametric approach is that there are equivalent, simpler definitions of permit easy, straightforward interpretation. ~ which often In contrast, while there are unbiased A A - - estimators other than the BLUE, e, those estimators are not equivalent toe from an estimation point of view and it would not be appropriate, in a data oriented approach, A to interpret e in terms of non-equivalent estimators. One can usefully combine the two approaches. For example, in the reference cell model example the BLUE02 is estimating (from Table 2.3) E[~J = E[-3/l4 + 5/28~2 Nab where Yab- '"l=l Yabk' .lb_ -5/42)\31 -5/42)\~. + 9/28.lb. -5/ 28Y32. + 5/84~ u + 5/ 84yJ ... .J However, by adding information from alternative definitions -29- of O , 2 ~A2 from or ~A3 above. one can note that E[aZ J = Ilz l-Illl • E[y Z 1 kJ - E[y 11 k· J subject to the no-interaction assumption. Surely the latter expression for E[&ZJ is . easier to interpret than the former. (Even the latter has pitfalls; the no inter- A action assumption means YZI. Table 3.1 ~E ~ IlZl.) for the Full Rank Reparameterized Essence Model and C Matrices for First Definitions l e PARAMETER CELL ID Il a Z 0 ! B Z ! -1 1 -1 -1 21 22 1" 1 0 0 -1 1 32 33 34 0 0 0 1 1 1 0 0 0 1 0 0 0 0 ~J 0 ~] -1 0 0 1 1 -1 -1 0 0 0 0 0 1 1 -8 [g 0 0 0 0 0 0 0 0 0 C .. [ 1 0 0 0 0 = C .. 1 0 1 0 0 0 -Il -1 -1 -1 -1 [~ C _A . B 13 14 11 -1 B 1 o] =X -E
© Copyright 2025 Paperzz