A TEST OF SIGNIFICANCE FOR COMPARING TWO DIFFERENT SYSTEMS OF STRATIFYING THE SAME POPULATION

D. O. Yandle and R. J. Hader
Institute of Statistics Mimeograph Series No. 456
December 1965

TABLE OF CONTENTS

LIST OF TABLES
LIST OF ILLUSTRATIONS
1. INTRODUCTION AND REVIEW OF LITERATURE
2. GENERAL DEVELOPMENT
   2.1 Formulation of the Null Hypothesis
   2.2 Linear Transformations
   2.3 Some Properties of the a_ij
   2.4 Transformation of H0
   2.5 Derivation of the Test Statistic
3. TWO SPECIAL CASES OF THE GENERAL RESULT
   3.1 The Case with Equal Cell Proportions
   3.2 The Case with r = c = 2
4. GENERAL SOLUTION FOR r > 2 AND c = 2, ..., r
   4.1 General Remarks
   4.2 Explicit Algebraic Representation of the Equation d*'B^{-1}UB^{-1}d* = 0
   4.3 An Example Illustrating Use of the Methods
   4.4 Numerical Solution for Problems of Size Greater than r = c = 4
5. EMPIRICAL SAMPLING EXPERIMENTS
6. SUMMARY AND CONCLUSIONS
7. LIST OF REFERENCES
8. APPENDICES
   8.1 Theorems and Derivations
       8.1.1 Theorem on U^2
       8.1.2 Theorem on the Determinant of U
       8.1.3 Relationship Between the Sums of Principal Minors of a Matrix and Those of Its Inverse
       8.1.4 Theorem on the Inverse of B
   8.2 Matrices and Sums of Principal Minors for Problem Sizes to r = c = 4
   8.3 Computer Program for Transformations
   8.4 Computer Program to Obtain Polynomial Roots

LIST OF TABLES

4.1. Summary of white oak sp. sawlog data
5.1. Row and column dimensions and sample sizes for empirical sampling experiments
5.2. Transformation of rectangular distribution of four-digit random numbers to random normal deviates
8.1. Example of computer program output

LIST OF ILLUSTRATIONS

5.1. Empirical distributions for r=c=2 with equal cell sizes
5.2. Empirical distributions for r=3, c=2 with equal cell sizes
5.3. Empirical distributions for r=c=3 with equal cell sizes
5.4. Empirical distributions for r=c=2 with unequal cell sizes - part one
5.5. Empirical distributions for r=c=2 with unequal cell sizes - part two
5.6. Empirical and theoretical power curves for r=c=2 with equal cell sizes
5.7. Empirical and theoretical power curves for r=3, c=2 with equal cell sizes
5.8. Empirical and theoretical power curves for r=c=3 with equal cell sizes
5.9. Empirical and theoretical power curves for r=c=2 with unequal cell sizes - part one
5.10. Empirical and theoretical power curves for r=c=2 with unequal cell sizes - part two

1. INTRODUCTION AND REVIEW OF LITERATURE

The problem considered in this dissertation was first brought to the author's attention in the context of grades for sawlogs. The problem is posed by the question: given two systems for grading sawlogs, how should we obtain and use a sample from a population of sawlogs to compare the two grading systems in order to make some judgment as to which is the "better" system?

In order to give perspective to the problem, we will discuss briefly how logs are graded and the uses of log grades. A set of specifications for grading logs is usually, of necessity, based on characteristics that are identifiable on the whole log; whereas, the value of the log is the sum of the values of the end products (lumber, veneer, etc.) less the cost of conversion.
The identifiable characteristics of a log are those of size, length, and diameter; irregular geometry exhibited by sweep, crook, eccentricity, and excessive taper; and such defects as rot, stain, seams, bumps, burls, knots, and others on the bark surfaces and ends of the log. In addition, the relative positions of defects or of clear areas between defects are visible characteristics. The set of specifications for a grading system sets forth the permissible characteristics and their range, either singly or in combination, for each of the grades in the system. For illustration, we reproduce one set of specifications for grading Southern pine logs as given by Campbell (3, p. 4):

   Southern pine logs are graded in two steps. First they are given a tentative grade based on diameter and K count 1/; secondly, they are given a final grade based on other degrading factors.

   Step 1 consists of determining D 2/ and total K count on all four faces. Establish a tentative grade according to the following tabulation:

      Grade   Minimum scaling diameter (D)   Maximum knot count (K)
      I       17                             D/8
      II      10                             D/2
      III     5                              no limit
      IV      5                              no limit

   As step 2, determine in the sequence listed:

   Sweep.--Degrade any tentative I, II, or III grade log one grade if sweep is at least 3 inches and equals or exceeds D/3. (This is the final grade if the log has no evidence of heart rot and no rotten or oversize knots.)

   Heart rot.--Degrade any tentative I, II, or III grade log one grade if conk, massed hyphae, or other evidence of advanced heart rot is found. (This is the final grade if the log has no unsound or oversize knots.)

   Unsound or oversize knots.--Degrade any tentative grade III log to grade IV if unsound or oversize knots are dispersed so that they cannot be contained in one quarter face.

In virtually all the papers reviewed, the authors have indicated the need for log grades in statements such as these by Petro (19, p.
5): "The steady rise in production costs and increased market competition over the years, has provided an impetus to the need for evaluating the quality of hardwood sawlogs."

1/ K count = number of overgrown knots plus sum of diameters of sound knots plus twice the sum of diameters of unsound knots.
2/ D = average diameter at small end of log inside bark to nearest whole inch.

By Vaughan (23, p. 1): "Increasingly diversified hardwood utilization further complicates the task of determining the highest use for and appraising the value of hardwood logs or standing timber." Or more simply by Newport and O'Regan (18, p. 1): "Workable log grading systems are needed as aids to marketing timber and other phases of forest management." However, a further statement by Newport and O'Regan appears to be unique in that it refers to log grading by a term not used by other authors. They state: "Regardless of the manner of selecting the grade specifications they need testing to determine the effectiveness of stratification." The key word, of course, is "stratification." Many other papers essentially discuss stratification of logs but without using the term directly. If we consider log grades to be the definition of strata in a given population of logs, then we can interpret the need for log grades in a statistical sense as being the need to increase precision by stratifying the population.

In this paper we will consider that a system of log grades, having specifications that define a log as uniquely belonging to one and only one grade, defines the stratification of a population of logs. Further, we will limit consideration to a single measurable characteristic or response variable: log value. Thus, we will treat the problems of log grading as equivalent to those of stratified sampling.

Newport et al. (17, p. 28) recognized three problem areas requiring further work in developing adequate log grading systems:

   a. Techniques are needed for the testing and selection of the controlling factors to be included in grade specifications. This includes the determination of breaking points between grades for certain factors such as knot size.

   b. Methods are needed for comparing the effectiveness and reliability of grading systems and of grades within a system.

   c. Techniques are needed for the calculation of end-product performance data.

First, let us dispense with problem area c. This is the problem of determining the appropriate estimators. The best estimator for a given situation will depend on many considerations in addition to stratification, e.g., we may at times be satisfied with simple estimates for means or totals, but at other times it may be necessary to use ratio or regression estimators or other techniques to satisfy our purpose. Since in this paper we will be concerned with a problem of stratification, we will not elaborate further on estimation procedures. Problem area a is essentially that of constructing optimum stratification, which in the past fifteen years has been the object of a considerable research effort by Dalenius (5, 6, 7, 8), Dalenius and Gurney (9), Dalenius and Hodges (10, 11), Ekman (12), Cochran (4), and others. This is a very difficult problem except under the simplest of conditions. Dalenius first showed how to determine optimum stratum boundaries for a fixed number of strata when the frequency function is known. Further work has been primarily concerned with particular cases or with the development of more rapid approximate methods for estimating the stratum boundaries. Cochran (4) compared four of these approximate methods as applied to eight frequency distributions having different degrees of skewness. The results of this paper indicate (although it is not explicitly stated) that a gain in precision could be realized by using the frequency distribution of the variate from a recent survey.
A great amount of further work in this direction is required before methods applicable to log grading or similar problems will be available.

The use of principal components to construct strata in a situation having similarities to the log grading problem was reported by Hagood and Bernert (16). Analogous to the log characteristics previously discussed, they use twelve population and agricultural variables which are termed "control" variables. The main points of the method are given as follows (16, p. 335):

   To utilize information on all the selected control variables in stratification, mutually uncorrelated component indexes were employed. Each index is a linear function of the twelve control variables, with the weights for the variables being determined by component analysis of the matrix of intercorrelations of the variables.

And further (16, p. 337):

   The principles generally followed in the present stratification were: (1) to use an index for each component explaining more than 10 percent of the total variation of the control variables; (2) to use more class intervals for the index of the component which explained the greatest proportion of the variation of the control variables.

Log grading systems in use today can be broadly classified according to the method of developing the system:

Judgment - Log grade is based on an estimate of performance. The system specifications set forth the minimum amounts of key products to be produced from the log. The log grader must use his own judgment in determining whether or not a given log meets the specification. Thus the system is developed on the basis of statements of desired performance, and its success is dependent upon the experience and skill of the grader. It is obvious that such a system does not uniquely define the stratification of a population of logs. Fortunately, systems of this kind are being replaced with others that are better defined and less dependent on judgment.
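The Hagood-Bernert device quoted above can be sketched numerically. This is an invented illustration, not code from the thesis: mutually uncorrelated component indexes are computed from the matrix of intercorrelations of the control variables, each component explaining more than 10 percent of the total variation is kept, and the dominant index is cut into class intervals to form strata. The data, the 200-observation size, and the quartile binning are assumptions of the example.

```python
import numpy as np

def component_indexes(X, min_share=0.10):
    # standardize the control variables and diagonalize their correlation matrix
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigval)[::-1]          # largest component first
    eigval, eigvec = eigval[order], eigvec[:, order]
    share = eigval / eigval.sum()             # proportion of total variation
    keep = share > min_share                  # the "more than 10 percent" rule
    return Z @ eigvec[:, keep], share[keep]

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))              # one common underlying factor
X = np.hstack([base + 0.3 * rng.normal(size=(200, 1)) for _ in range(4)])
indexes, shares = component_indexes(X)
# the dominant index gets the most class intervals, e.g. quartile strata:
strata = np.digitize(indexes[:, 0], np.quantile(indexes[:, 0], [0.25, 0.5, 0.75]))
```

Because the four invented control variables share one strong common factor, only the first component survives the 10-percent rule here; with genuinely multivariate controls, each surviving index would be binned in turn, with the most intervals on the first.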
Arbitrary definition - The set of specifications is based on the visible log characteristics but defined arbitrarily, albeit by men experienced in woods and mill operations.

Analytical development - The set of specifications is based on the visible log characteristics after a study of data obtained expressly for the purpose of studying the relationship of log characteristics to product yields or values.

Either the second or third method of developing log grades as just outlined can lead to grading specifications that uniquely define a system of stratification. (Although the problem of misclassification will exist to some extent even with the best defined system.) It is systems such as these that we consider in this paper.

Most of the log grading systems that have been developed analytically have had that development take place through the use of some form of regression or correlation analysis. The system, previously outlined, by Campbell (3) was partially developed by studying the multiple regression of log grade value on several quantitative log characteristics. For a more thorough discussion of multiple regression, including consideration of appropriate models, in the development of log grades, see the paper by Newport and O'Regan (18).

The second major problem area outlined above is further elaborated by Newport et al. (17, p. 30):

   a great deal of effort will be spent on the comparison of existing grading systems in order to select the most effective ones and on the testing of grading systems for possible use on other species or in other areas. Therefore techniques are needed for such testing

Rephrasing this statement in the more general context of sampling theory we can say: techniques are needed to compare the effectiveness of two different stratifications of the same population.
In order to learn the state of knowledge with respect to such techniques, we have turned to the literature of sampling theory and methods, with the surprising result of finding that the problem seems to have been completely ignored. To be sure, this problem cannot be completely divorced from other problems relating to stratification, particularly the problem of finding optimum stratification; however, the direction is different, and it is the lack of other work in this direction that is striking. It is the work toward a partial solution of this problem with which we will be concerned in this paper.

In approaching this problem we must first have a suitable working definition of what is meant by the "better" of the two systems of stratification to be compared. First, we will consider some standards for log grading systems as recommended by Newport et al. (17, p. 18):

   a. The grades in a grading system must group the logs or trees so that the variability in value and/or product yields is reduced to a reasonable limit.

   b. The square root of the variance of value per unit volume should be [no greater than] 7 percent of the mean value per unit volume for each grade within a grading system.

   c. For a given log size one grade should differ from another by not less than 10 percent of the mean value of the higher of the two grades under consideration. The difference in mean value between the several grades should be approximately equal.

It is obvious that these standards are unsatisfactory: a is indefinite, b is an arbitrary standard which could be difficult to obtain for some populations and even impossible for others, and taken together b and c could for some populations be contradictory. We will adopt the following definition: the "better" of two systems of stratification is that one for which the weighted average within-stratum variance is smaller.
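The criterion just adopted is easy to compute directly. The following is an invented illustration, not part of the thesis: the weighted average within-stratum variance of a finite population, with weights equal to the stratum proportions, is evaluated for two stratifications of the same population, and the "better" system is the one with the smaller value.

```python
import numpy as np

def weighted_within_variance(values, strata):
    # weighted average within-stratum variance; weights are stratum proportions
    values, strata = np.asarray(values, float), np.asarray(strata)
    return sum((strata == s).mean() * values[strata == s].var()
               for s in np.unique(strata))

rng = np.random.default_rng(0)
# a population made of two clusters of log values (invented data)
y = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])
system_R = np.repeat([0, 1], 500)            # strata that separate the clusters
system_C = rng.integers(0, 2, size=1000)     # strata assigned at random
better_is_R = (weighted_within_variance(y, system_R)
               < weighted_within_variance(y, system_C))
```

Here `system_R` captures the cluster structure and so has the smaller weighted within-stratum variance; under proportional allocation it would give the smaller variance for the estimated population mean.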
This is equivalent to defining the "better" system to be that one which, when used together with proportional allocation of the sample, would result in the smaller variance for the estimated population mean or total. Thus we will ignore all costs and physical or administrative problems involved in delineating the strata or obtaining samples.

The problem of comparing two systems of stratification will be treated as one of devising a test of the hypothesis of equality of the weighted average within-stratum variances for the two systems. This test will be derived from the ratio of maximum likelihoods

   \lambda_H = L(\hat{\omega}) / L(\hat{\Omega})

where L(\hat{\omega}) is the maximized likelihood function in the space restricted by the null hypothesis and L(\hat{\Omega}) is the maximized likelihood function in the unrestricted space.

As will be apparent in the ensuing sections, the null hypothesis is a non-linear function of the parameters. Examples of tests derived from likelihood ratios when the null hypothesis is non-linear are given by (1) and (24). Watson (24), in the investigation of equatorial distributions on a sphere, was able to formulate his problem in such a way that the Lagrangian multiplier, appearing in the equations to be solved for the estimates of parameters in the restricted space, can be shown to be the smallest root of a given matrix; hence, the maximum likelihood solution is given by the characteristic vector associated with this least root. Anderson and Bancroft (1) give an example in deriving a test of second-order interaction in a 2 x 2 x 2 contingency table. In that problem, the restricted maximum likelihood estimates are found as functions of the Lagrangian multiplier. These functions must be substituted into the equation of restriction, i.e., the null hypothesis, and a numerical solution obtained for the Lagrangian multiplier. This numerical solution is then used to obtain the numerical estimates of the parameters. Except for two special cases, the problem being treated
here will require a numerical solution parallel to that described by Anderson and Bancroft.

2. GENERAL DEVELOPMENT

2.1 Formulation of the Null Hypothesis

The two systems of stratification that are to be compared will be designated as "R", composed of r strata, and "C", composed of c strata. As a convention, we will set r >= c. The population to be sampled may be considered as a two-way array with r rows and c columns being defined as the strata of systems R and C respectively. The (ij)th cell in this array defines a sub-population with mean \mu_{ij} and containing a proportion, p_{ij}, of the entire population, i.e., \sum_{ij} p_{ij} = 1.

We wish to construct a statistic that will be a test of the hypothesis that the average variance within strata of system R (within rows) and the average variance within strata of system C (within columns) are equal, i.e.,

   \sum_i p_{i.} \sigma_{i.}^2 = \sum_j p_{.j} \sigma_{.j}^2                                (2.1.1)

If we consider the population in any stratum of either system to be a composite of the populations of all cells contained in the stratum, then the frequency function for any stratum can be expressed in terms of the frequency functions of the cells, e.g., for the ith stratum in system R,

   f_i(y_i) = (p_{i1}/p_{i.}) f_{i1}(y_{i1}) + ... + (p_{ic}/p_{i.}) f_{ic}(y_{ic})         (2.1.2)

The first two moments about the origin for f_i(y_i) are:

   \mu'_{(i.)} = p_{i.}^{-1} [p_{i1}\mu_{i1} + ... + p_{ic}\mu_{ic}]                        (2.1.3)

and

   \mu'_{2(i.)} = p_{i.}^{-1} [p_{i1}\mu'_{2(i1)} + ... + p_{ic}\mu'_{2(ic)}]               (2.1.4)

Thus, we can write the variance within the ith stratum as

   \sigma_{i.}^2 = p_{i.}^{-1} [p_{i1}\mu'_{2(i1)} + ... + p_{ic}\mu'_{2(ic)}]
                   - p_{i.}^{-2} [p_{i1}\mu_{i1} + ... + p_{ic}\mu_{ic}]^2                  (2.1.5)

Similarly,

   \sigma_{.j}^2 = p_{.j}^{-1} [p_{1j}\mu'_{2(1j)} + ... + p_{rj}\mu'_{2(rj)}]
                   - p_{.j}^{-2} [p_{1j}\mu_{1j} + ... + p_{rj}\mu_{rj}]^2                  (2.1.6)

Upon substituting (2.1.5) and (2.1.6) and simplifying, equation (2.1.1) becomes

   \sum_i [p_{i.}^{-1} (\sum_j p_{ij}\mu_{ij})^2] = \sum_j [p_{.j}^{-1} (\sum_i p_{ij}\mu_{ij})^2]   (2.1.7)

We will see later (section 2.5) that a sample of size n = n_{..}
= \sum_{ij} n_{ij} will be required to make the test and, further, that the sample will be proportional, i.e.,

   n_{ij} = n p_{ij}                                                                        (2.1.8)

Thus, the equality in (2.1.7) can be written equivalently as

   \sum_i [n_{i.}^{-1} (\sum_j n_{ij}\mu_{ij})^2] = \sum_j [n_{.j}^{-1} (\sum_i n_{ij}\mu_{ij})^2]   (2.1.9)

or

   \sum_i [n_{i.}^{-1} (\sum_j n_{ij}\mu_{ij})^2] - \sum_j [n_{.j}^{-1} (\sum_i n_{ij}\mu_{ij})^2] = 0   (2.1.10)

2.2 Linear Transformations

In order to have a better framework within which to describe certain relationships and to make many of the required derivations more tractable, it is desirable to work with linear transforms of both the observations and the parameters. Equation (2.1.10) can be rewritten as

   \Delta_R - \Delta_C = 0                                                                  (2.2.1)

where

   \Delta_R = S.S. Rows (unadj. for Cols.), in terms of the parameters, \mu_{ij},
   \Delta_C = S.S. Cols. (unadj. for Rows), in terms of the parameters, \mu_{ij}.

Since the r x c array has unequal proportions in the cells, the linear functions comprising the orthogonal set associated with \Delta_R are not orthogonal to those associated with \Delta_C. For this reason, we will find it convenient to make two parallel, nonorthogonal transformations, examine the relationships of one to the other, and show how one set of the transformed parameters can be expressed in terms of the other.

Consider the vector of observations to be arranged as

   y = (y_{111}, ..., y_{11n_{11}}, ..., y_{ijk}, ..., y_{rc1}, ..., y_{rcn_{rc}})'   (n x 1)

The vector of parameters is then

   E(y) = \eta = (\eta'_{11}, ..., \eta'_{1c}, ..., \eta'_{r1}, ..., \eta'_{rc})'

where \eta_{ij} = \mu_{ij} 1, a vector of n_{ij} elements each equal to \mu_{ij}.             (2.2.5)

Now, we will define the transformations

   \delta = C\eta,  \delta' = (\delta_1', \delta_2', \delta_3', \delta_4', \delta_5'),
      corresponding to the row partition of C into
      C_1 (1 x n), C_2 (r-1 x n), C_3 (c-1 x n), C_4 ((r-1)(c-1) x n), C_5 (n-rc x n)        (2.2.6)

and

   \gamma = Q\eta,  \gamma' = (\gamma_1', \gamma_2', \gamma_3', \gamma_4', \gamma_5'),
      corresponding to the row partition of Q into
      Q_1 (1 x n), Q_2 (c-1 x n), Q_3 (r-1 x n), Q_4 ((r-1)(c-1) x n), Q_5 (n-rc x n)        (2.2.7)

where C and Q are each orthonormal matrices with C_1 = Q_1, C_4 = Q_4, and C_5 = Q_5.        (2.2.8)

Similarly, for the observations, we define

   d = Cy  and  t = Qy                                                                      (2.2.9)

Further, C and Q are constructed such that the following statements hold:

   \eta'C_1'C_1\eta = \eta'Q_1'Q_1\eta = \delta_1'\delta_1 = \gamma_1'\gamma_1 = S.S. due to overall mean
   \eta'C_2'C_2\eta = \delta_2'\delta_2 = \Delta_R = S.S. Rows (unadj. for Cols.)
   \eta'C_3'C_3\eta = \delta_3'\delta_3 = S.S. Cols. (adj. for Rows)
   \eta'Q_2'Q_2\eta = \gamma_2'\gamma_2 = \Delta_C = S.S. Cols. (unadj. for Rows)
   \eta'Q_3'Q_3\eta = \gamma_3'\gamma_3 = S.S.
Rows (adj. for Cols.)
   \eta'C_4'C_4\eta = \eta'Q_4'Q_4\eta = \delta_4'\delta_4 = \gamma_4'\gamma_4 = S.S. R x C (adj. for Rows and Cols.)
   \eta'C_5'C_5\eta = \eta'Q_5'Q_5\eta = \delta_5'\delta_5 = \gamma_5'\gamma_5 = S.S. Within Cells (= 0)
   \eta'C'C\eta = \eta'Q'Q\eta = S.S. Total                                                 (2.2.10)

A parallel set of relationships holds using d and t except, of course, S.S. Within Cells = S_w, which is not zero.

We will now examine a single term, \gamma_{\alpha i}^2, from \gamma_\alpha'\gamma_\alpha (\alpha = 2 or 3; i = 1, ..., c-1 or r-1):

   \gamma_{\alpha i}^2 = (q_{\alpha i}'\eta)^2                                              (2.2.11)

Since C is an orthonormal matrix, we have from (2.2.6)

   \eta = C'\delta                                                                          (2.2.12)

which upon substitution into (2.2.11) gives

   \gamma_{\alpha i}^2 = \delta' C q_{\alpha i} q_{\alpha i}' C' \delta                     (2.2.13)

From equations (2.2.8) we know that all row vectors in C_1, C_4, and C_5 are orthogonal to q_{\alpha i}; therefore,

   C q_{\alpha i} q_{\alpha i}' C' =
      [ 0          0                              0                              0   0
        0   C_2 q_{\alpha i} q_{\alpha i}' C_2'   C_2 q_{\alpha i} q_{\alpha i}' C_3'   0   0
        0   C_3 q_{\alpha i} q_{\alpha i}' C_2'   C_3 q_{\alpha i} q_{\alpha i}' C_3'   0   0
        0          0                              0                              0   0
        0          0                              0                              0   0 ]    (2.2.14)

If we let

   \delta* = [\delta_2; \delta_3]                                                           (2.2.15)

   C* = [C_2; C_3]                                                                          (2.2.16)

and

   A_i = C* q_{\alpha i} q_{\alpha i}' C*'                                                  (2.2.17)

then \gamma_{\alpha i}^2 is uniquely expressed in terms of the \delta-parameter set by

   \gamma_{\alpha i}^2 = \delta*' A_i \delta*                                               (2.2.18)

Now, let

   a_i = C* q_{\alpha i} = (a_{i1}, a_{i2}, ..., a_{i,r+c-2})'   (i = 1, ..., r+c-2)        (2.2.19)

where as a convention we consider that the vectors q_{\alpha i} are arranged in the sequence q_{21}, q_{22}, ..., q_{2,c-1}, q_{31}, q_{32}, ..., q_{3,r-1}, so that it will be clear that it is the vectors comprising Q_2 that are under consideration when we restrict the range of the index to i = 1, ..., c-1, as will sometimes be necessary.        (2.2.20)

Using the definition given by (2.2.19) we can rewrite A_i = C* q_{\alpha i} q_{\alpha i}' C*' as A_i = a_i a_i', or

   A_i = [ a_{i1}^2            a_{i1}a_{i2}   ...   a_{i1}a_{i,r+c-2}
           a_{i1}a_{i2}        a_{i2}^2       ...   a_{i2}a_{i,r+c-2}
           ...
           a_{i1}a_{i,r+c-2}   ...                  a_{i,r+c-2}^2     ]                     (2.2.21)

2.3 Some Properties of the a_{ij}

Let us consider that

   \delta_{21}^2 = linear component of S.S. Rows
   \delta_{22}^2 = quadratic component of S.S. Rows (adj. for linear)
   ...
   \delta_{2,r-1}^2 = (r-1)st degree component of S.S. Rows (adj. for linear, quadratic, ..., (r-2)nd degree components)
quadratic component of SoS. eals. for Rows and linear component of Cols.) of, 0 -1 := (c -1 )st degree component of 8.8. Cols. (ad.j. for Rows and li.near, quadr"atic, • 0 0' (c - 2 )nd degree components of eolso) and Yi1 = Yi2 Yi, linear component of 8. S. eols. quadratic component of S. 8. Cols. (adj. for linear) 0-1 = (c - l)st degree component of 8.8. Cols. (adj. for li.near, quadratic, 0 •• , (c - 2)nd degree components) Yf1 = linear component of 8.8. Rows (adj. for Cols.) Yf2 = quadratic component of 8.8 0 Rows (adj. for Cols. and linear component of Rows) Yf,r-1 = (r - l)st degree component of 8 S. Rows (adj. for 0 Cols. and linear, quadrati.c, ••• , (r - 2) nd degree components of Rows) The subdivisions into linear, quadrati.c, etc., components is taken for purposes of discussion only. Obviously, from the definitions of C and Q, any other subdivisions into individual components would serve equally well. 20 NOW, 1 e t ... , US + _.'l.n th e conSl.d er a 11 componen'JO -, ~b-'1":"oe.I. . . cw V31. r. 2 vJ: 2 ...... 0 J Oi,r~l) and then for the linear component of colllilli~s (adjusted for rows), 0;1' However, each of the components ofa, ••. , Of,o-l is invariant with respect to the crder of aO.justment by all preceding terms so that 0s2 a, ••• , 2 os, 0 ~1 are unchanged if the adjustment is first by the linear component of columns (unadjusted) and then by rows (adjusted for linear component of columns). ••• , 5f,0-1 Therefore, each of ofa, is orthogonal to the linear component of columns 2 (unadjusted) which is Yr;n' Now, and Therefore, from the above argument, or 2 Using a similar argQment for Yaa, for i = 1, ••• , c - 2. o Q 0 , Y~-a, this can be generalized as Thus, we can rewrite equation (2.2.21) as 21 O(r + i -1, c - i - I ) ·· · * O(c - i - 1, r + i - 1) O(c - i - 1, c - i - 1) for i ::: 1, ••• , c - 1. 
We will now proceed to show some additional relationships that exist among the a_{ij}. From equations (2.2.10) we have

   \delta_2'\delta_2 + \delta_3'\delta_3 = \gamma_2'\gamma_2 + \gamma_3'\gamma_3            (2.3.8)

which can be rewritten as

   \eta'[c_{21}c_{21}' + ... + c_{2,r-1}c_{2,r-1}' + c_{31}c_{31}' + ... + c_{3,c-1}c_{3,c-1}']\eta
      = \eta'[q_{21}q_{21}' + ... + q_{2,c-1}q_{2,c-1}' + q_{31}q_{31}' + ... + q_{3,r-1}q_{3,r-1}']\eta   (2.3.9)

Since c_{21}'c_{21} = 1 and c_{21}'c_{\alpha\beta} = 0 for (\alpha,\beta) not equal to (2,1), if we pre-multiply by c_{21}' and post-multiply by c_{21} on both sides of equation (2.3.9) we obtain

   1 = a_{11}^2 + ... + a_{i1}^2 + ... + a_{r+c-2,1}^2                                      (2.3.10)

and if we pre-multiply by c_{21}' and post-multiply by, say, c_{22} on both sides of equation (2.3.9) we obtain

   0 = a_{11}a_{12} + ... + a_{i1}a_{i2} + ... + a_{r+c-2,1}a_{r+c-2,2}                     (2.3.11)

By considering all vectors c_{\alpha j}, we can immediately generalize (2.3.10) and (2.3.11) as

   \sum_{i=1}^{r+c-2} a_{ij}^2 = 1   (j = 1, ..., r+c-2)                                    (2.3.12)

and

   \sum_{i=1}^{r+c-2} a_{ij}a_{i\beta} = 0   (j, \beta = 1, ..., r+c-2 and \beta not equal to j)   (2.3.13)

By similar operations with the q_{\alpha j} we can show

   \sum_{j=1}^{r+c-2} a_{ij}^2 = 1   (i = 1, ..., r+c-2)                                    (2.3.14)

and

   \sum_{j=1}^{r+c-2} a_{ij}a_{\alpha j} = 0   (i, \alpha = 1, ..., r+c-2 and \alpha not equal to i)   (2.3.15)

2.4 Transformation of H_0

From (2.2.10) we have

   \Delta_C = \gamma_{21}^2 + ... + \gamma_{2,c-1}^2

and using (2.2.18), this becomes

   \Delta_C = \delta*'A_1\delta* + ... + \delta*'A_{c-1}\delta* = \delta*'[\sum_{i=1}^{c-1} A_i]\delta*   (2.4.1)

If we now express \Delta_R = \delta_2'\delta_2 as

   \Delta_R = \delta*'[ I(r-1, r-1)     O(r-1, c-1)
                        O(c-1, r-1)     O(c-1, c-1) ]\delta* = \delta*'K\delta*   (say)     (2.4.2)

and set

   U = \sum_{i=1}^{c-1} A_i - K                                                             (2.4.3)

the null hypothesis can be stated as

   H_0: \delta*'U\delta* = 0                                                                (2.4.4)

This is the form of the null hypothesis that will be used throughout the remainder of this paper. If the alternative hypothesis is two-sided, we have

   H_1: \delta*'U\delta* not equal to 0                                                     (2.4.5)

Since \delta*'U\delta* = \Delta_C - \Delta_R, the one-sided alternative that corresponds to \Delta_C > \Delta_R is

   H_1: \delta*'U\delta* > 0                                                                (2.4.6)

and similarly \delta*'U\delta* < 0 corresponds to \Delta_R > \Delta_C.

2.5 Derivation of the Test Statistic

Let y be a sample of size n comprised of the independent samples y_{ij}, each of size n_{ij}, from the rc cells of the arrayed population.
If we let the within-cell density functions be N(\mu_{ij}, \sigma^2), then the likelihood function of the sample is

   L(y; \eta, \sigma^2) = (2\pi\sigma^2)^{-n/2} exp[-(1/2\sigma^2)(y - \eta)'(y - \eta)]    (2.5.1)

or

   L(y; \delta, \sigma^2) = (2\pi\sigma^2)^{-n/2} exp{-(1/2\sigma^2)[(d_1 - \delta_1)^2
      + (d_2 - \delta_2)'(d_2 - \delta_2) + (d_3 - \delta_3)'(d_3 - \delta_3)
      + (d_4 - \delta_4)'(d_4 - \delta_4) + (d_5 - \delta_5)'(d_5 - \delta_5)]}             (2.5.2)

The usual straight-forward method of differentiating ln L with respect to each parameter in turn, equating each derivative to zero, and solving each of the resulting equations yields the necessary maximum likelihood estimates, \hat{\delta} = d and \hat{\sigma}^2 = S_w/n, in the unrestricted parameter space. Replacing the parameters in (2.5.2) with these estimates gives the likelihood of the sample in the unrestricted space:

   L(\hat{\Omega}) = [2\pi S_w/n]^{-n/2} exp(-n/2)                                          (2.5.3)

In order to find the m.l. estimators in the parameter space restricted by H_0, it is necessary to find \hat{\delta}, \hat{\sigma}^2, and \hat{\lambda} that maximize the function

   F = L' - \lambda(\delta*'U\delta*)                                                       (2.5.4)

where L' = ln L and \lambda is a Lagrangian multiplier.

Since the \delta's are all linearly independent, it follows that F will be a maximum with respect to \delta_1, \delta_4, and \delta_5 when \hat{\delta}_1 = d_1, \hat{\delta}_4 = d_4, and (d_5 - \hat{\delta}_5)'(d_5 - \hat{\delta}_5) = S_w. Therefore, we will temporarily be concerned with maximizing F only with respect to \delta_2 and \delta_3. To this end, it is equivalent if we minimize

   F' = (\delta* - d*)'(\delta* - d*) + \lambda(\delta*'U\delta*)                           (2.5.5)

The derivative of F' with respect to \delta* is

   \partial F'/\partial \delta* = 2\delta* - 2d* + 2\lambda U\delta*                        (2.5.6)

Equating the derivative to 0, the (r+c-2 x 1) vector whose elements are all zeros, and rearranging yields

   (I + \lambda U)\delta* = d*                                                              (2.5.7)

Or, if we let w = 1/\lambda,

   (wI + U)\delta* = w d*                                                                   (2.5.8)

Thus, the m.l. estimator of \delta* is

   \hat{\delta}* = w(wI + U)^{-1} d*                                                        (2.5.9)
Replacing \delta* with \hat{\delta}* in (\delta* - d*)'(\delta* - d*), we have

   (\hat{\delta}* - d*)'(\hat{\delta}* - d*)
      = [w(wI+U)^{-1}d* - d*]'[w(wI+U)^{-1}d* - d*]
      = d*'[w(wI+U)^{-1} - I][w(wI+U)^{-1} - I]d*
      = d*'(wI+U)^{-1}[wI - (wI+U)][wI - (wI+U)](wI+U)^{-1}d*
      = d*'(wI+U)^{-1}U^2(wI+U)^{-1}d*                                                      (2.5.10)

Thus, replacing the parameters \delta, \sigma^2 with their m.l. estimators in equation (2.5.2) gives

   L(y; \hat{\delta}, \hat{\sigma}^2) = (2\pi\hat{\sigma}^2)^{-n/2}
      exp{-(1/2\hat{\sigma}^2)[S_w + d*'(wI+U)^{-1}U^2(wI+U)^{-1}d*]}                       (2.5.11)

from which we find the m.l. estimator of \sigma^2 to be

   \hat{\sigma}^2 = [S_w + d*'(wI+U)^{-1}U^2(wI+U)^{-1}d*]/n                                (2.5.12)

Substitution of \hat{\sigma}^2 for \sigma^2 in equation (2.5.11) yields the likelihood function for the sample in the restricted parameter space:

   L(\hat{\omega}) = (2\pi\hat{\sigma}^2)^{-n/2} exp(-n/2)                                  (2.5.13)

From equations (2.5.3) and (2.5.13), the likelihood ratio is found to be

   \lambda_H = [S_w / (S_w + d*'(wI+U)^{-1}U^2(wI+U)^{-1}d*)]^{n/2}                         (2.5.14)

Taking the large sample theory result due to Wilks (25), that -2 ln \lambda_H is distributed as \chi^2 with degrees of freedom being the difference in dimensionality of the parameter spaces \Omega and \omega, we have as a test statistic

   T_H = n ln[1 + d*'(wI+U)^{-1}U^2(wI+U)^{-1}d* / S_w]                                     (2.5.15)

which is approximately distributed as \chi_1^2. A single degree of freedom for \chi^2 results from the \omega-space having dimension one less than the \Omega-space, due to the single constraint that is expressed as H_0.

In order to obtain the explicit value of the test statistic given by (2.5.15) it is necessary to find the solution for the Lagrangian multiplier w by substitution of \hat{\delta}* for \delta* in (2.4.4), i.e.,

   d*'(wI + U)^{-1} U (wI + U)^{-1} d* = 0                                                  (2.5.16)

which from (8.1.4.2) is seen to be, in general, a polynomial of degree 2(r+c-3) in w. In section three, we will discuss two special cases that result in second degree polynomials, and in section four we will consider cases having polynomials of higher degree.
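The procedure of this section can be sketched numerically. This is a hedged illustration, not the thesis program of Appendix 8.4: the grid scan, the bisection refinement, and the selection of the sum-of-squares-minimizing root are implementation choices of the sketch, made under the assumption that U is symmetric and the roots of (2.5.16) lie in the scanned interval.

```python
import numpy as np

def g(w, U, d):
    # left side of (2.5.16): d*'(wI+U)^{-1} U (wI+U)^{-1} d*
    B_inv = np.linalg.inv(w * np.eye(len(d)) + U)
    return d @ B_inv @ U @ B_inv @ d

def ss(w, U, d):
    # constrained residual sum of squares (2.5.10): d*' B^{-1} U^2 B^{-1} d*
    B_inv = np.linalg.inv(w * np.eye(len(d)) + U)
    return d @ B_inv @ U @ U @ B_inv @ d

def T_H(U, d, S_w, n, lo=-10.0, hi=10.0, steps=19997):
    # scan for sign changes of g, refine each crossing by bisection, keep the
    # root giving the smallest constrained sum of squares, then apply (2.5.15)
    grid = np.linspace(lo, hi, steps)
    vals = np.array([g(w, U, d) for w in grid])
    roots = []
    for i in np.flatnonzero(np.sign(vals[:-1]) * np.sign(vals[1:]) < 0):
        a, b = grid[i], grid[i + 1]
        for _ in range(60):
            m = 0.5 * (a + b)
            if g(a, U, d) * g(m, U, d) <= 0:
                b = m
            else:
                a = m
        roots.append(0.5 * (a + b))
    w = min(roots, key=lambda r: ss(r, U, d))
    return n * np.log(1.0 + ss(w, U, d) / S_w)
```

For the equal-cell-proportion case of section 3.1 with r = c = 2, U = diag(-1, 1), d_2'd_2 = SSR, and d_3'd_3 = SSC, so the routine can be checked against the closed form T_H = n ln[1 + ((SSR)^{1/2} - (SSC)^{1/2})^2 / (2 S_w)].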
3. TWO SPECIAL CASES OF THE GENERAL RESULT

3.1 The Case with Equal Cell Proportions

If we let the cell proportions, p_{ij}, be equal and subsequently obtain equal numbers of observations per cell, then from (2.3.1) and (2.3.3) we see that

   \gamma_{2i}^2 = \delta_{3i}^2                                                            (3.1.1)

and

   \gamma_{3i}^2 = \delta_{2i}^2                                                            (3.1.2)

so that (2.4.1) becomes

   \Delta_C = \delta*'[ O(r-1, r-1)     O
                        O               I(c-1, c-1) ]\delta* = \delta_3'\delta_3            (3.1.3)

For (2.4.3) we now obtain

   U = \sum_{i=1}^{c-1} A_i - K = [ -I(r-1, r-1)     O
                                     O               I(c-1, c-1) ]                          (3.1.4)

Therefore,

   U^2 = I(r+c-2, r+c-2)                                                                    (3.1.5)

and

   B = (wI + U) = [ (w-1)I     O
                    O          (w+1)I ]                                                     (3.1.6)

and

   B^{-1} = (wI + U)^{-1} = [ (1/(w-1))I     O
                              O              (1/(w+1))I ]                                   (3.1.7)

Now, we find that for the foregoing conditions, equation (2.5.16) becomes

   -d_2'd_2/(w-1)^2 + d_3'd_3/(w+1)^2 = 0                                                   (3.1.8)

or

   (w+1)^2 d_2'd_2 = (w-1)^2 d_3'd_3                                                        (3.1.9)

or, by taking the square root of both sides of the equation,

   (w+1)(d_2'd_2)^{1/2} = +/-(w-1)(d_3'd_3)^{1/2}

Rearranging, we have the two solutions for the Lagrangian multiplier:

   w = [(d_3'd_3)^{1/2} +/- (d_2'd_2)^{1/2}] / [(d_3'd_3)^{1/2} -/+ (d_2'd_2)^{1/2}]        (3.1.10)

We will designate as w_1 the solution having a plus sign in the numerator and a minus sign in the denominator, and the reverse as w_2.

Referring to equation (2.5.10), we see that the sum of squares to be minimized is, by application of (3.1.5) and (3.1.7),

   (\hat{\delta}* - d*)'(\hat{\delta}* - d*) = d*'B^{-1}U^2B^{-1}d* = d*'(B^{-1})^2 d*
      = d_2'd_2/(w-1)^2 + d_3'd_3/(w+1)^2
      = [(w+1)^2 d_2'd_2 + (w-1)^2 d_3'd_3] / (w^2-1)^2                                     (3.1.11)

Substituting w_1 for w in the numerator of (3.1.11) gives

   8(d_2'd_2)(d_3'd_3) / [(d_3'd_3)^{1/2} - (d_2'd_2)^{1/2}]^2                              (3.1.12)

And, substituting w_1 for w in the denominator of (3.1.11) gives

   16(d_2'd_2)(d_3'd_3) / [(d_3'd_3)^{1/2} - (d_2'd_2)^{1/2}]^4                             (3.1.13)
where

A₁ = a₁a₁' = [ a₁₁^2   a₁₁a₁₂ ; a₁₁a₁₂   a₁₂^2 ]   and   K = kk' = [ 1  0 ; 0  0 ] ,  with k' = (1, 0).

Therefore, from (2.4.3),

U = a₁a₁' − kk' .

Noting from (2.3.14) that a₁₁^2 + a₁₂^2 = 1, and using (8.1.1.2), we have

U = [ −a₁₂^2   a₁₁a₁₂ ; a₁₁a₁₂   a₁₂^2 ]   and   U^2 = a₁₂^2 I .

From (8.1.4.2) we find, for r = c = 2,

B^{-1} = (1/|B|)[ wI − (U − p₁I) ] ,                                                  (3.2.1)

where p₁ = trace U = 0, and thus, since |B| = w^2 + w·trace U + |U| = w^2 − a₁₂^2,

B^{-1} = (wI − U)/(w^2 − a₁₂^2) .

Also, using U^2 = a₁₂^2 I, we have

(B^{-1})^2 = (w^2 I − 2wU + U^2)/(w^2 − a₁₂^2)^2 = [ (w^2 + a₁₂^2)I − 2wU ] / (w^2 − a₁₂^2)^2 .

To find the necessary solutions for w, we substitute into (2.5.16):

ℓ*'(wI − U)U(wI − U)ℓ* = 0 ,

or, since U = a₁a₁' − kk',

[ a₁'(wI − U)ℓ* ]^2 = [ k'(wI − U)ℓ* ]^2 ,

so that

a₁'(wI − U)ℓ* = ± k'(wI − U)ℓ* ,                                                     (3.2.8)

which, after expanding and using U^2 = a₁₂^2 I, simplifies to

w₁ = a₁₂[ a₁₂ℓ₂₁ + (1 − a₁₁)ℓ₃₁ ] / [ (1 − a₁₁)ℓ₂₁ − a₁₂ℓ₃₁ ]                        (3.2.9)

as the solution using the negative signs, and to the corresponding expression (3.2.10) as the solution, w₂, using the positive signs in the numerator and denominator of (3.2.8).

Using (3.2.5) and (3.2.7) in equation (2.5.10) gives

ℓ*'B^{-1}U^2B^{-1}ℓ* = a₁₂^2 ℓ*'(B^{-1})^2 ℓ* .                                      (3.2.11)

Substituting w₁ for w and simplifying gives

ℓ*'B^{-1}U^2B^{-1}ℓ* = [ a₁₂ℓ₂₁ − (1 − a₁₁)ℓ₃₁ ][ (1 − a₁₁)ℓ₂₁ − a₁₂ℓ₃₁ ] / (2a₁₂) .  (3.2.12)

Replacing the parameters with observations in (2.2.18), we can write

t₂₁^2 = ℓ*'A₁ℓ* ,   t₂₁ = a₁₁ℓ₂₁ + a₁₂ℓ₃₁ ,
t₃₁^2 = ℓ*'A₂ℓ* ,   t₃₁ = a₂₁ℓ₂₁ + a₂₂ℓ₃₁ .

From (2.3.13) and (2.3.15) we have two equations which, when solved simultaneously, give (3.2.15); substituting into (3.2.14), we obtain (3.2.16), and thus (3.2.12) becomes

ℓ*'B^{-1}U^2B^{-1}ℓ* = [ t₃₁ − ℓ₃₁ ][ ℓ₂₁ − t₂₁ ] / (2a₁₂) .                          (3.2.17)

Similarly, using w₂ for w in (3.2.11) gives the same result as (3.2.17) but with positive signs. Again, if we take the convention that the square roots are positive, the minimum sum of squares is given by (3.2.17), which may be written as

ℓ*'B^{-1}U^2B^{-1}ℓ* = [ (SSR adj)^{1/2} − (SSC adj)^{1/2} ][ (SSR)^{1/2} − (SSC)^{1/2} ] / (2a₁₂) .   (3.2.18)
Therefore, when r = c = 2, we have for the test statistic:

T_H = n ln[ 1 + ((SSR adj)^{1/2} − (SSC adj)^{1/2})((SSR)^{1/2} − (SSC)^{1/2}) / (2 a₁₂ S_w) ] .   (3.2.19)

4. GENERAL SOLUTION FOR r > 2 AND c = 2, ..., r

4.1 General Remarks

In the preceding sections we have shown that the Lagrangian multiplier, w, can be found by the solution of a simple quadratic equation, and that the explicit algebraic form of the test statistic, T_H, can be given, for two special cases: for equal n_ij with any number of rows and columns, and for r = c = 2 with the n_ij not equal. For unequal n_ij, when the problem size is increased to r = 3 and c = 2 the polynomial in w is of degree 2(r + c − 3) = 4; when the problem size is increased to r = c = 3 the polynomial in w is of degree 2(r + c − 3) = 6; and so on. We know that a general polynomial of fourth degree is solvable in terms of radicals; however, it does not appear that the required cumbersome algebra is warranted in order to obtain a solution for one additional special case, especially since this would contribute nothing toward a solution for the general case. We also know that the algebraic solution of the general polynomial of degree greater than four is impossible. Certain special forms of higher-degree polynomials can be solved by direct factoring or other means, but the case in question does not hold the promise of a ready solution, due to the manner in which the polynomial coefficients are functions of both the observation values and the numbers of observations per cell, n_ij.

We have, therefore, directed the work toward obtaining a method of numerical solution for those problems having r > 2 and c = 2, ..., r. Two somewhat different routes leading to numerical solutions are presented in the next two sections. These two differ only in the point at which the algebraic development terminates and the numerical solution is started.
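The dissertation's appendix 8.4 obtains the roots with Bairstow's method; as an illustration of the numerical route (our sketch, not the original program), a simpler locator that scans for sign changes and bisects is adequate for the polynomials that arise here. Applied to the degree-six polynomial (4.3.9) of the white oak example in section 4.3 it recovers the four real roots, and the statistic then follows from the minimizing sum of squares found in that example:

```python
from math import log

def horner(coeffs, x):
    """Evaluate a polynomial (coefficients listed highest power first) at x."""
    v = 0.0
    for c in coeffs:
        v = v * x + c
    return v

def real_roots(coeffs, lo=-10.0, hi=10.0, step=0.001):
    """Locate real roots in [lo, hi] by sign-change scanning plus bisection.
    Adequate for simple (non-repeated) roots, as in the cases treated here."""
    roots, x = [], lo
    while x < hi:
        if horner(coeffs, x) * horner(coeffs, x + step) < 0:
            a, b = x, x + step
            for _ in range(60):            # bisect the bracketing interval
                m = (a + b) / 2
                if horner(coeffs, a) * horner(coeffs, m) <= 0:
                    b = m
                else:
                    a = m
            roots.append((a + b) / 2)
        x += step
    return roots

# Coefficients of the example polynomial (4.3.9), highest power first.
P439 = [70835.812, 221837.42, -70801.40, -364907.70,
        26180.293, 150373.66, 30256.238]
roots = real_roots(P439)                   # the four real roots, all negative

# With the minimizing sum of squares 14,784.071 found in section 4.3,
# n = 768 and S_w = 176,732.82, the statistic (4.3.13) is
TH = 768 * log(1 + 14784.071 / 176732.82)  # about 61.7
```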
4.2 Explicit Algebraic Representation of the Equation ℓ*'B^{-1}UB^{-1}ℓ* = 0

For problems of size up to and including r = c = 4, i.e., polynomials up to degree ten, the polynomial coefficients have been derived algebraically. Numerical solution of the polynomial then leads directly to a calculated value for the test statistic. In this section we will illustrate the derivation of the polynomial coefficients by using the problem with r = c = 3 as an example. The coefficients, together with other necessary quantities, are given in appendix 8.2 for all combinations of rows and columns up to r = c = 4.

For r = c = 3 (recalling that a₁₄ = 0) we have the symmetric matrix

U = [ a₁₁^2 + a₂₁^2 − 1    a₁₁a₁₂ + a₂₁a₂₂    a₁₁a₁₃ + a₂₁a₂₃    a₂₁a₂₄
      a₁₁a₁₂ + a₂₁a₂₂     a₁₂^2 + a₂₂^2 − 1   a₁₂a₁₃ + a₂₂a₂₃    a₂₂a₂₄
      a₁₁a₁₃ + a₂₁a₂₃     a₁₂a₁₃ + a₂₂a₂₃     a₁₃^2 + a₂₃^2      a₂₃a₂₄
      a₂₁a₂₄              a₂₂a₂₄              a₂₃a₂₄             a₂₄^2 ] ,

and from (8.1.1.2) its square is block diagonal. The inverse of U can be written down and confirmed by inspection; its elements are simple functions of the a_ij, and it enters the development below only through the quadratic form ℓ*'U^{-1}ℓ*.

It is now necessary to find the sums of principal minors of U. The sum of the one-rowed principal minors, P₁, is

P₁ = trace U = Σ_{j=1}^{3} a₁ⱼ^2 + Σ_{j=1}^{4} a₂ⱼ^2 − 2 ,

or, since a₁₄ = 0,

P₁ = Σ_{j=1}^{4} a₁ⱼ^2 + Σ_{j=1}^{4} a₂ⱼ^2 − 2 .

From (2.3.14), each of the first two terms is unity; therefore, P₁ = 0.

The sum of the two-rowed principal minors of U is

P₂ = [ (a₁₁^2 + a₂₁^2 − 1)(a₁₂^2 + a₂₂^2 − 1) − (a₁₁a₁₂ + a₂₁a₂₂)^2 ]
   + [ (a₁₁^2 + a₂₁^2 − 1)(a₁₃^2 + a₂₃^2) − (a₁₁a₁₃ + a₂₁a₂₃)^2 ]
   + [ (a₁₁^2 + a₂₁^2 − 1)a₂₄^2 − a₂₁^2 a₂₄^2 ] + ⋯ ,

which by rearrangement and use of equations (2.3.14) and (2.3.15) becomes

P₂ = −(a₁₃^2 + a₂₃^2 + a₂₄^2) .

Rather than proceed directly to find the sum of the three-rowed principal minors of U, P₃, we can save much algebraic manipulation by making use of the relationship between the sums of principal minors of a matrix and those of its inverse, as given in appendix 8.1.3.
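The sums of principal minors P₁, ..., P_n are, up to sign, the coefficients of the characteristic polynomial, so for a numerical check they can be produced without enumerating minors at all. A sketch (our illustration, not part of the dissertation's programs) using the Faddeev-LeVerrier recursion:

```python
def principal_minor_sums(a):
    """Return [P1, ..., Pn] for the n x n matrix `a`, where Pk is the sum of
    the k-rowed principal minors.  Uses the Faddeev-LeVerrier recursion:
    with det(tI - A) = t**n + c1*t**(n-1) + ... + cn, Pk = (-1)**k * ck."""
    n = len(a)
    def matmul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    def trace(x):
        return sum(x[i][i] for i in range(n))
    b = [row[:] for row in a]              # B1 = A
    p = []
    for k in range(1, n + 1):
        c = -trace(b) / k                  # c_k
        p.append((-1) ** k * c)            # P_k
        if k < n:
            shifted = [[b[i][j] + (c if i == j else 0) for j in range(n)]
                       for i in range(n)]
            b = matmul(a, shifted)         # B_{k+1} = A(B_k + c_k I)
    return p
```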
By inspection we see that the sum of the one-rowed principal minors of U^{-1} is

P₁' = trace U^{-1} = 0 ,

and from equation (8.1.3.3) the sum of the three-rowed principal minors of U is

P₃ = P₁' |U| = 0 .

From the proof given in appendix 8.1.2 we have, for the sum of the four-rowed principal minors of U,

P₄ = |U| = a₁₃^2 a₂₄^2 .

Summarizing, we have

P₁ = 0 ,   P₂ = −(a₁₃^2 + a₂₃^2 + a₂₄^2) ,   P₃ = 0 ,   P₄ = a₁₃^2 a₂₄^2 .           (4.2.4)

In appendix 8.1.4 it is shown that the inverse of B (4×4) can be expressed as

B^{-1} = (1/|B|)[ w^3 I − w^2 (U − P₁I) + w(U^2 − P₁U + P₂I) − (U^3 − P₁U^2 + P₂U − P₃I) ] .   (4.2.5)

Since in this case P₁ = P₃ = 0, we have

B^{-1} = (1/|B|)[ w^3 I − w^2 U + w(U^2 + P₂I) − (U^3 + P₂U) ] .                      (4.2.6)

(It should be noted that P₁ = P₃ = 0 does not hold in the other case that yields 4×4 matrices U and B, namely r = 4 and c = 2; see appendix 8.2.)

Using equation (4.2.6) we can expand B^{-1}UB^{-1}:

B^{-1}UB^{-1} = (1/|B|)^2 { w^6 U − 2w^5 U^2 + w^4 [3U^3 + 2P₂U] − 4w^3 [U(U^3 + P₂U)]
              + w^2 [(3U^2 + P₂I)(U^3 + P₂U)] − 2w[U(U^2 + P₂I)(U^3 + P₂U)] + U(U^3 + P₂U)^2 } .   (4.2.7)

If we express the inverse of U as given by equation (8.1.3.1),

U^{-1} = (−1/P₄)[ U^3 − P₁U^2 + P₂U − P₃I ] = (−1/P₄)[ U^3 + P₂U ] ,

then rearranging we have

U^3 + P₂U = −P₄ U^{-1}                                                               (4.2.8)
and
U^3 = −P₄ U^{-1} − P₂U .                                                             (4.2.9)

Substituting (4.2.8) and (4.2.9) into (4.2.7) gives

B^{-1}UB^{-1} = (1/|B|)^2 { w^6 U − 2w^5 U^2 + w^4 [3(−P₄U^{-1} − P₂U) + 2P₂U]
              + 4P₄ w^3 I + w^2 [(3U^2 + P₂I)(−P₄U^{-1})] + 2P₄ w(U^2 + P₂I) + P₄^2 U^{-1} } .   (4.2.10)

Thus, from equations (4.2.10) and (4.2.4) we can write ℓ*'B^{-1}UB^{-1}ℓ* = 0 as a polynomial in w:

[ℓ*'Uℓ*]w^6 − [2ℓ*'U^2ℓ*]w^5 + [(a₁₃^2 + a₂₃^2 + a₂₄^2)ℓ*'Uℓ* − 3a₁₃^2 a₂₄^2 ℓ*'U^{-1}ℓ*]w^4
+ [4a₁₃^2 a₂₄^2 ℓ*'ℓ*]w^3 − [a₁₃^2 a₂₄^2 (3ℓ*'Uℓ* − (a₁₃^2 + a₂₃^2 + a₂₄^2)ℓ*'U^{-1}ℓ*)]w^2
+ [2a₁₃^2 a₂₄^2 (ℓ*'U^2ℓ* − (a₁₃^2 + a₂₃^2 + a₂₄^2)ℓ*'ℓ*)]w + a₁₃^4 a₂₄^4 ℓ*'U^{-1}ℓ* = 0 ,   (4.2.11)

where ℓ*'Uℓ*, ℓ*'U^2ℓ*, and ℓ*'U^{-1}ℓ* are quadratic forms in the elements ℓ₂₁, ℓ₂₂, ℓ₃₁, ℓ₃₂ of ℓ*; their explicit expansions follow directly from the matrix U given above.

4.3 An Example Illustrating Use of the Methods

Two systems of log grading
(stratifying), each comprising three grades (strata), were developed from different logical starting points. System "R" was developed with primary consideration given to size and relative position of clear areas on the log surface. In developing system "C", emphasis was placed on size and character of visible defects such as knots. In addition, the two systems have certain features in common, e.g., each specifies the minimum log diameter for each grade, although these minima are not identical for the two systems.

A sampling study of white oak sp. logs was made in order to compare the effectiveness of the two grading systems. The methods discussed in the previous section will be illustrated with the data from this study.1/ The pertinent information from these data is summarized in table 4.1. The upper number in each cell of the table is the number of logs, n_ij, and the lower number is the sum, Y_ij., of the observations, y_ijk, on the n_ij logs. The observations, y_ijk, express log value (i.e., total value of all boards produced from the log) on a per-unit basis, with one thousand board-feet as the unit.

1/ Data furnished by U. S. Forest Service, Forest Products Laboratory, Madison, Wisconsin.

In addition to the values given in table 4.1 we will need the pooled within-cell sum of squares, S_w = 176,732.82, and the total sum of squares, S_T = 353,057.98. Using the computer program described in appendix 8.3, the following quantities were computed from the data appearing in table 4.1:

a₁₁ = 0.48759129      a₂₁ = 0.19093343
a₁₂ = 0.0037259931    a₂₂ = 0.51739437
a₁₃ = 0.87306405      a₂₃ = −0.10884114
a₁₄ = 0.0             a₂₄ = 0.82704357

ℓ₂₁ = 338.70728       ℓ₃₁ = 66.045057
ℓ₂₂ = 231.73764       ℓ₃₂ = 46.768679
Table 4.1. Summary of white oak sp. sawlog data
(upper entry: number of logs, n_ij; lower entry: sum of observations, Y_ij.)

                          System C
System R      Grade 1     Grade 2     Grade 3     Totals
Grade 1       49          88          7           144
              5321.10     9047.85     650.05      15019.00
Grade 2       5           299         60          364
              415.50      24560.65    4444.35     29420.50
Grade 3       3           75          182         260
              260.80      4824.56     11036.85    16122.21
Totals        57          462         249         768
              5997.40     38433.06    16131.25    60561.71

t₂₁ = 206.84622 ,   t₂₂ = 234.10209 .

Using the above quantities we have, from equation (4.2.4),

P₂ = −1.4580883 ,   P₄ = 0.52137355 ;

from equations (4.2.12), (4.2.13), and (4.2.14),

ℓ*'Uℓ* = −70,835.812 ,   ℓ*'U^2ℓ* = 110,918.71 ,   ℓ*'U^{-1}ℓ* = −11,130.558 ;

and from equation (4.2.9),

ℓ*'U^3ℓ* = −P₄ ℓ*'U^{-1}ℓ* − P₂ ℓ*'Uℓ* = −45,253.084 .

Also, we find

ℓ*'ℓ* = 174,974.21 .

Using the quantities given by (4.3.5), (4.3.6), and (4.3.8) to calculate the numerical values for the coefficients, the polynomial equation (4.2.11) becomes

70,835.812w^6 + 221,837.42w^5 − 70,801.40w^4 − 364,907.70w^3 + 26,180.293w^2 + 150,373.66w + 30,256.238 = 0 .   (4.3.9)

Numerical solution of this polynomial, using the computer program described in appendix 8.4, yields four calculated real roots, w₁, w₂, w₃, w₄.

We must now determine which of these roots minimizes the sum of squares ℓ*'B^{-1}U^2B^{-1}ℓ*. Note that since the left side of equation (4.2.11) is |B|^2 ℓ*'B^{-1}UB^{-1}ℓ*, we can obtain ℓ*'B^{-1}U^2B^{-1}ℓ* by simply increasing the exponent of U by one in that equation. Therefore, by substituting the numerical values from equations (4.3.5) through (4.3.8) into the modified equation (4.2.11) we have

ℓ*'B^{-1}U^2B^{-1}ℓ* = [ 110,918.71w^6 + 90,506.170w^5 − 111,951.51w^4 − 147,727.68w^3 − 56,114.538w^2 + 60,512.476w + 47,563.306 ] / |B|^2 .

From equation (8.1.4.7) we have |B| = w^4 + P₂w^2 + P₄, which upon substituting for the P_i gives

|B| = w^4 − 1.4580883w^2 + 0.52137355 .

Using the four real roots of ℓ*'B^{-1}UB^{-1}ℓ* = 0 we have

ℓ*'B^{-1}U^2B^{-1}ℓ* = 328,166.22   for w₁ ,
                     = 14,784.071   for w₂ ,
                     = 162,307.97   for w₃ ,
                     = 79,772.325   for w₄ .

Thus, we have for the test statistic

T_H = n ln[ 1 + ℓ*'B^{-1}U^2B^{-1}ℓ* / S_w ]
    = 768 ln[ 1 + 14,784.071/176,732.82 ]
    = 768 ln(1.0836521)
    = 61.
73 .   (4.3.13)

Since χ^2(1, 0.95) = 3.84, we conclude that there is sufficient evidence to reject the null hypothesis. From equations (4.3.3) and (4.3.4) we have

SSR = ℓ₂₁^2 + ℓ₂₂^2   and   SSC = t₂₁^2 + t₂₂^2 = 97,589.148 ,

from which we find the following estimates of variance within rows and within columns:

s^2 (within rows) = (S_T − SSR)/(n − 3) = 241.35 ,
s^2 (within columns) = (S_T − SSC)/(n − 3) = 334.03 .

Therefore, we conclude that system "R" is the better of the two systems with respect to the reduction of variance within grades (strata).

4.4 Numerical Solution for Problems of Size Greater than r = c = 4

It is obvious that the methods presented in the two preceding sections are suitable only for problems of modest size. Larger problems can be handled more easily if the numerical solution is started directly with B^{-1} as given by equation (4.2.5), or in the general equation (8.1.4.2). In order to express B^{-1} as a polynomial in w with numerical matrix coefficients we must find the numerical values for P₁, ..., P_{r+c−3} and for U, U^2, ..., U^{r+c−3}. Were it not for electronic computers, the calculation of these quantities would be prohibitive in most cases. However, every research computing center today has available programs for matrix multiplication and solution of determinants for large matrices. The maximum size matrix accommodated, of course, depends on computer capacity and related factors. Available programs could, in most cases, be used for the necessary calculations with only slight modification.

The amount of calculation can be considerably reduced by making use of theorem (8.1.1) and the relationship given in (8.1.3). Using theorem (8.1.1) we have

U^2 = [ −U₁₁   0 ; 0   U₂₂ ] ,

from which it follows that

U^3 = U·U^2 ,   U^4 = (U^2)^2 = [ U₁₁^2   0 ; 0   U₂₂^2 ] ,

and so on. Thus, we need not find the (r+c−3)rd power of U but can work with lesser powers of the partitions of U.
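The relationship of appendix 8.1.3 between the principal-minor sums of a matrix and those of its inverse — P'_k = P_{n−k}/P_n, with P₀ = 1 — is what allows the larger P's to be read off from U^{-1}. It can be checked exactly on a diagonal matrix, whose k-rowed principal minors are just the k-fold products of diagonal entries; a small sketch of ours, using rational arithmetic:

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def esym(vals, k):
    """k-th elementary symmetric function: for a diagonal matrix this is
    the sum of its k-rowed principal minors."""
    return sum(prod(c) for c in combinations(vals, k))

d = [Fraction(2), Fraction(3), Fraction(4), Fraction(5)]   # diagonal of a matrix
inv = [1 / x for x in d]                                   # diagonal of its inverse
n = len(d)
P = [esym(d, k) for k in range(n + 1)]     # P[0] = 1, ..., P[n] = determinant
Pp = [esym(inv, k) for k in range(n + 1)]  # minor sums of the inverse

# the relationship of appendix 8.1.3:  P'_k = P_{n-k} / P_n  for every k
ok = all(Pp[k] == P[n - k] / P[n] for k in range(n + 1))
```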
If r+c−3 is even, the highest power of the partitions that is required is (r+c−3)/2, and if r+c−3 is odd, the highest required power is (r+c−2)/2. In finding the values for P₁, ..., P_{r+c−3}, if we use the relationship between the sums of principal minors of U and U^{-1}, as given by equation (8.1.3.3), we can work with determinants having about one-half as many rows and columns as would otherwise be necessary.

For purposes of illustration, suppose that r = c = 10, in which case U would have dimensions (18 × 18) and we would need to find all powers of U up to (r+c−3) = 17. By using the approach outlined above, we could accomplish this by finding all powers up to (r+c−2)/2 = 9 of U₁₁ and U₂₂, and the products U₁₁U₁₂, U₂₁U₁₂, ..., where the partitions each have dimensions (9 × 9). To find P₁, ..., P₁₇ we would first find P₁, ..., P₉ directly from U, then find U^{-1} from U and derive P₁', ..., P₈'. Using equation (8.1.3.3), P₁₀, ..., P₁₇ would then be directly available from P₁', ..., P₈'. Thus, the maximum size determinant to be solved is reduced from (17 × 17) to (9 × 9).

5. EMPIRICAL SAMPLING EXPERIMENTS

In the development of a test statistic, we have used the result from large-sample theory that, when H₀ is true, T_H = −2 ln λ has the χ^2 distribution; however, in actual practice we usually will be working with relatively modest-sized samples. Therefore, empirical sampling experiments have been made in order to study the behavior of T_H for some small sample sizes. These empirical experiments included a study of the power of the test as well as a study of the central distribution.

Thirteen combinations of row and column dimensions and sample sizes were tested. These combinations are given in table 5.1. For each of these combinations the trial values of T_H were found by a two-step process using two computer programs. This process will be described using the combination having r = c = 2 and all n_ij = 10 to illustrate.
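For the equal-cell designs, each trial value is just statistic (3.1.16) computed from one simulated two-way array. A sketch of that computation in modern form (our illustration — the original programs worked from punched-card sums rather than raw observations):

```python
from math import log, sqrt

def t_h_equal_cells(cells):
    """cells[i][j] is the list of the m observations y_ijk in cell (i, j);
    every cell holds the same number m of observations.  Returns (3.1.16):
    T_H = n * ln(1 + (sqrt(SSR) - sqrt(SSC))**2 / (2 * S_w))."""
    r, c, m = len(cells), len(cells[0]), len(cells[0][0])
    n = r * c * m
    grand = sum(y for row in cells for cell in row for y in cell)
    cf = grand ** 2 / n                                          # correction factor
    row_tot = [sum(y for cell in row for y in cell) for row in cells]
    col_tot = [sum(y for i in range(r) for y in cells[i][j]) for j in range(c)]
    ssr = max(sum(t * t for t in row_tot) / (c * m) - cf, 0.0)   # between rows
    ssc = max(sum(t * t for t in col_tot) / (r * m) - cf, 0.0)   # between columns
    s_w = sum(sum(y * y for y in cell) - sum(cell) ** 2 / m      # within cells
              for row in cells for cell in row)
    return n * log(1 + (sqrt(ssr) - sqrt(ssc)) ** 2 / (2 * s_w))
```

For the null-case runs the y_ijk would be drawn as random normal deviates, exactly as produced by the card-transformation step described next.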
The steps performed by the first computer program are:

(a) Input to the program is 4000 cards, each containing 72 random digits (i.e., 10 four-digit random numbers plus an additional 32 random digits) from the RAND tables (20).

(b) Each four-digit random number is transformed to a random normal deviate as shown in table 5.2.

(c) For the ten random normal deviates generated from a single input card, the sum and sum of squares were calculated and punched on an output card. Each of the 4000 output cards also contained the last 12 random digits from the input card.

Table 5.1. Row and column dimensions and sample sizes for empirical sampling experiments

Experiment   Number of rows   Number of observations              Total number of
number       and columns      per cell                            observations
1            r=c=2            all n_ij = 5                        20
2            r=c=2            all n_ij = 10                       40
3            r=c=2            all n_ij = 20                       80
4            r=3, c=2         all n_ij = 5                        30
5            r=3, c=2         all n_ij = 10                       60
6            r=3, c=2         all n_ij = 20                       120
7            r=c=3            all n_ij = 5                        45
8            r=c=3            all n_ij = 10                       90
9            r=c=3            all n_ij = 20                       180
10           r=c=2            n11=20, n12=10, n21=5,  n22=15      50
11           r=c=2            n11=40, n12=20, n21=10, n22=30      100
12           r=c=2            n11=20, n12=5,  n21=5,  n22=20      50
13           r=c=2            n11=40, n12=10, n21=10, n22=40      100

Table 5.2. Transformation of rectangular distribution of four-digit random numbers to random normal deviates^a

Random    Normal     Random    Normal     Random    Normal
number    deviate    number    deviate    number    deviate
3         -33        1711      -10        9115       13
7         -32        1975      - 9        9263       14
11        -31        2263      - 8        9391       15
15        -30        2575      - 7        9503       16
19        -29        2911      - 6        9599       17
27        -28        3263      - 5        9679       18
39        -27        3631      - 4        9743       19
51        -26        4011      - 3        9799       20
67        -25        4403      - 2        9843       21
91        -24        4799      - 1        9879       22
119       -23        5199        0        9907       23
155       -22        5595        1        9931       24
199       -21        5987        2        9947       25
255       -20        6367        3        9959       26
319       -19        6735        4        9971       27
399       -18        7087        5        9979       28
495       -17        7423        6        9983       29
607       -16        7735        7        9987       30
735       -15        8023        8        9991       31
883       -14        8287        9        9995       32
1055      -13        8531       10        9999       33
1251      -12        8747       11
1467      -11        8943       12

^a A random number less than or equal to the number in the first column yields the normal deviate in the second column.
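Table 5.2 is, in effect, a discretized inverse of the normal distribution function: deviate d (in tenths of a standard deviation) is selected when the four-digit number does not exceed roughly 10000·Φ((d + 0.5)/10). A sketch reconstructing such a lookup with modern library calls — an approximation of how the table appears to have been built, not a transcription of it; a few entries in the extreme tails differ by a unit or two from thresholds computed this way:

```python
from bisect import bisect_left
from statistics import NormalDist

PHI = NormalDist().cdf

# Cut-offs analogous to table 5.2: deviate d (in tenths) is chosen when the
# four-digit number u satisfies u <= round(10000 * Phi((d + 0.5)/10)).
# Deviate 33 is a catch-all for the upper tail.
DEVIATES = list(range(-33, 34))
CUTOFFS = [round(10000 * PHI((d + 0.5) / 10)) for d in DEVIATES[:-1]] + [9999]

def to_normal_deviate(u):
    """Map a four-digit random number u in 0..9999 to a deviate in tenths."""
    return DEVIATES[bisect_left(CUTOFFS, u)]
```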
The second computer program completed the following steps:

(a) The desired values for μ₁₁, μ₁₂, μ₂₁, and μ₂₂ were read into the computer from parameter cards. These values of the μ_ij were precalculated to give a required value for the variance ratio, V_r = σ_R^2/σ_C^2. One computer run was made for each of the desired values of V_r. In all, four runs were made, with V_r = 1 (null case), V_r = 1.5, V_r = 2, and V_r = 3.

(b) The input cards for the second program were the 4000 output cards from the first program, read into the computer in groups of four. The summary information contained in each group of four cards was used, together with the parameters for the desired V_r, to calculate a value of the test statistic. To shorten the computations, the form of the test statistic was changed as follows:

Pr{ [(SSR)^{1/2} − (SSC)^{1/2}]^2 / (2 S_w)  ≥  exp(χ^2(1, 1−α)/n) − 1 } = α .

Thus, the program calculated only [(SSR)^{1/2} − (SSC)^{1/2}]^2 / (2 S_w) for each sample.

(c) A deck of 1000 cards, each containing one trial value of the test statistic, was the output for each computer run. These cards were sorted to arrange the trial values in ascending order, i.e., to give the empirical distribution of the test statistic.

(d) Between successive runs for V_r = 1, 1.5, 2, 3, the input card deck was sorted on three columns of the twelve random numbers contained in each card. Thus, the empirical results between values of V_r for a given size array and size sample are not independent. This means that the individual points generated to construct an empirical power curve are correlated. However, since new random deviates were generated for each of the thirteen combinations given in table 5.1, their respective power curves are independent.

The empirical frequency distributions generated by the thirteen sampling experiments are shown in figures 5.1 through 5.5. These distributions are arranged to show the number of observed values of T_H that occurred in each ten-percentile range of a true χ^2 distribution. Using E(number of observations) = 100 for each percentile group, χ^2 for goodness of fit was calculated and is shown on each of the figures.

A study of these empirical distributions indicates that for small samples the upper tail of the T_H distribution is larger than that of the χ^2 distribution. However, when the number of observations per cell is ten or more, the discrepancy is small enough to be quite tolerable; i.e., for a nominal type I error α = 0.10, the test may have an actual α equal to 0.11 or 0.12.

Since we have neither the central nor the non-central small-sample distribution for T_H, we cannot compare the empirical power curves to the theoretical curves. In fact, if the theoretical curves were known, there would be no need for the empirical sampling experiments. However, it is informative to compare the empirical power curves for T_H with the power of an alternative test procedure.
These dis- tributions are arranged to show the number of observed values of T H 2 tha.t occurred in each ten=percentile range of a true X distribution. 'Using E (number of observations) = 100 2 for each percentile group, X for goodness-of-fit was calculated and is shown on each of the figures • .A study of these empirical distributions indicates that for small samples the 11pper tail of the 2 the X distribution. T H distribution is larger than that of However, when the number of observations per cell is ten or recre, the discrepancy is small enough to be quite tolerable, 10~0, for a nominal type I error, a =. 0010, the test may have an actual a equal 0.11 or 0.12. Since we have neither the central nor the nOrl=central small sample distribution for 'Tn' we cannot compare the empirical power curves to the theoretical curves. In fact, if the theoretical curves were known, there would be no need for the empiri.cal sampling experiments o However, it is inf'orrnative to compare the empil'ical power curves for 'T H with the power of an alternative test proced.uree If the total sampling ef'f'ort is divided so that n/2 observations are 55 160 140 120 r-- 100 ""--1-- ---- ~ 80 ~ 60 i n=20 40 all n .. = 5 20 2 X = 27.24 ~J ** o 120 -- - 100 80 ~r-- r--- -- l>a = 40 n 60 40 all n g. 20 ~ X = 5.22 N.S. 0 S CD ij = 10 2 120 100 80 n = 80 60 all n 40 01----1_--L._-L._..L..._J.----lL......-..L_--L._..J----I 10 20 30 40 50 60 70 80 90 100 Nominal percentile group ~. ',. " " = 20 2 X = 10.36 N.S. 20 . . . .'. .\...... ~ . . : .~" ij Figure 5.1. Empirical distributions for r=e=2 with e~ual cell sizes 56 1_-a.__ 80 n == 30 n == 60 60 40 20 o 120 100 ---- - --' ~ - f-- _r--- 80 60 40 all n = 10 ij 20 ?C == 14.18 N.S. 2 o 120 100 80 n == 120 60 40 all n .. = 20 ~J 20 _ _4___l______l._....l 70 80 90 100 04----II.--.L.-..L--l--~_~ 10 20 30 40 50 60 A 2 == 6.12 N°.S. Nominal percentile group Figure 5.2. 
Empirical distributions for r=3 J c=2 with equal cell sizes 57 140 ~ 12 c 10{ e--- ....~ 8 01-- - - - I--- n =:; 45 n := 90 60 40 20 0 120 100 80 60 p" 0 l:l \l)l ~ ~7: CD 40 all n, .:"~ 10 lJ 20 2 X = 10.,16 NoS. 0 J-; ~·:r.i 120 100 .-- "'-1--- I I-- 80 11 ,,~ 180 60 8,11 n 40 . -- l,J 20 2 X :::: 4~ 88 1\1". S. 20 o 10 20 30 40 50 60 70 80 90 100 Nominal percentile group Figure 5.3. Empirical distributions for r=c=3 with e~lal cell sizes 58 n = 100 rill = 40 n 21 2 X := == 10 n 12 = 10 n 22 == 40 10.90 N.S. 140 120 100 n == 50 80 n 60 n 40 ll 21 == 20 n = 5 n 12 22 = 5 = 20 X 2 :::: 7.4 2 N.S • 20 0 10 20 30 40 50 60 70 80 90 100 Nominal percentile group Figure 5.4. Empirical distributions for r=c=2 with unequal cell sizes-part one 59 140 12 n = 100 10'-1-----1 8 n 6 ll = 40 n 12 = 20 n = 10 n = 30 22 21 2 X = 10.90 N. S. 4 2 n n = 50 ll = 20 n 12 n =10 = 5 n =15 22 21 2 X = 22.42 ** 10 20 30 40 50 60 70 80 90 100 Nominal percentile group Figure 5.5. Empirical distributions :for r=e=2 with unequal cell sizes~part two 60 taken to estimate average within-row variance and another independent sample of n/2 observations is taken to estimate average within-column variance, then the ratio of MS (within rows) to MS (within columns) is distributed approximately as F(~ - r,¥ - C)e for F are much easier than those for T Since the calculations , we might choose F as the test H statistic i f the two tests were about equal in power. Power curves for the F..,test with type I error fixed at Ol = e05 over a range of V sufficient for comparative purposes were obtained r from tables given by Tang (22) and from nomographs by Fox curves together with the empirical curves for 5.6 through 5.10e T H (14). These are show.n in figures It is apparent upon inspection of these curves that the gain in power of over F is sufficiently large that in most H applications we would prefer T as the test statistic. H T 61 ~------/ , 1.0 / / .9 / ' / I I tJ;:l0 / / I I /. 
/ / I .6 II / I I / "I'"f ~ () Q) ... Q) p:< 4-t I '/ I ..... I ~ I / I .4 ,0 I / / I l>. ~ "I'"f ~ / / I .5 0 ..-f I I I / I I I .1 / / / / / / I I , .2 " / "/ ./ ,/ /1I' .3 / / / / I .7 W ' / f /' / / .8 ~ /r / , ~' n. = 5 l.J n. 0=10 l.J n .= 20 l.J o ------------ o l 0.0 '-----L 1 ---JI-- ..J- 3 2 2/ 2 Vr=CYRCY C Figure 5.6. Empirical and theoretical power curves for r=e=2 with equal cell.sizes ...._ 62 ,, 1.0 L:x--- .9 , !( , ..., Q) 05 CH , of"f ~ .4 ,0 e .3 I I .2 / I I / I I I I I I I I I I I I r I Po! / / /I I °I>t ~ / / i'I ,I ~ () Q) r-i I I !I .6 of"f of"f / / 1I 0 ,f / I, p:j I III ' .7 ~ / , .8 l:I:l / I / / / / / / / / / ,!, f Il .. = 5 ~J n .. = 10 - - ~J " n .. = 20 - - - - - ~J .1 o.ol----L.---------L.---------I..-----1 2 3 2 V r =crR/ Figure 5.7. 2 crc Empirical and theoretical power curves for r=3,c=2 withe~ual cell sizes 63 /~-/ 1.0 t( / .9 I .- I I / / I , !J I I , .8 I I / II .7 I I ' I II / I I III I I .6 I I .5 / I / / / / +iI I I I I .4 I II I' I I I I I I I' I~ I' I n .. = lJ 5 n .. = 10 - - - - - - - lJ n. 0= lJ 20 - - - - - - - o. 0 lI.--l~---------:!:2-------~3:!:------ Figure 5.8. Empirical and theoretical power curves for r=e=3 with equal cell sizes 64 1.0 .9 .8 .7 II? tlO 1=1 .6 ..-I ~ () Q) .., ~ CH 0 .5 ~ ~ '1"'1 r-I or-! .g 84 n .c0 n J.I Pi ll 12 = 20 = 10 5 21 = n = 15 22 Yl .3 n .2 ll = 40 n 12 = 20 n n .1 = 10 21 22 ------- = 30 o. 0 l...---Ll--------~2-----------,J3L...--------T2/ 2 Vr=O'RO'C Figure 5.9. Empirical and theoretical power curves for r=e=2 with unequal cell sizes-part one 65 1.0 / .9 / .8 .7 :i ~ orf .6 +:> () Q) .,.., Q) fl:\ Cl-f 0 .5 ~ +:> orf ri "I"'l ~ .4 n .0 0 n ~ n .3 n n .2 n n n 11 12 21 22 11 12 21 22 = 20 = 5 = 5 = 20 = 40 = 10 = 10 ------ = 40 .1 o. 0 l--~1--------~2L---------3:!:------"""'Vr = Figure 5.10. cr~/cr~ Empirical and theoretical power curves for r=e=2 with unequal cell sizes-part two 66 6. 
6. SUMMARY AND CONCLUSIONS

The primary result obtained in this dissertation is the derivation of a test statistic for comparing the average within-strata variances for two systems of stratification of a single population. The population conceptually is considered as a two-way array, with rows defined by the strata of one system and columns defined by the strata of the other system. The general form of the test statistic is

T_H = n ln[ 1 + ℓ*'(wI + U)^{-1} U^2 (wI + U)^{-1} ℓ* / S_w ] ,

where

n   = total sample size;
S_w = sum of squares within cells of the two-way array;
ℓ*  = vector of linear functions of the observations (see section 2.2);
U   = a matrix that is a function of the hypothesis tested and of the structure of the sample (see sections 2.2 through 2.4);
w   = a Lagrange multiplier.

A direct algebraic solution for w is not possible in general; however, a solution is obtained under certain restrictions, thus yielding an explicit algebraic solution for T_H in two special cases. When the population cell proportions are all equal, and thus all n_ij = n/rc, then

T_H = n ln[ 1 + ((SSR)^{1/2} − (SSC)^{1/2})^2 / (2 S_w) ] ,

where SSR = sum of squares between rows and SSC = sum of squares between columns.

In the case with unequal cell proportions but with r = c = 2, i.e., two rows and two columns, the test statistic is shown to be

T_H = n ln[ 1 + ((SSR adj)^{1/2} − (SSC adj)^{1/2})((SSR)^{1/2} − (SSC)^{1/2}) / (2 a₁₂ S_w) ] ,

where SSR adj = sum of squares between rows, adjusted for columns; SSC adj = sum of squares between columns, adjusted for rows; and a₁₂, together with the associated quantities ℓ₂₁, ℓ₃₁, t₂₁, t₃₁, is defined in section 3.2.

When the cell proportions are unequal and r > 2 (c = 2, ..., r), the Lagrange multiplier, w, must be found by solution of an equation, ℓ*'(wI + U)^{-1} U (wI + U)^{-1} ℓ* = 0, which is a polynomial of degree 2(r + c − 3) in w. For arrays up to size r = c = 4, the coefficients of w^0, w^1, ..., w^{2(r+c−3)} have been derived algebraically and are given in the appendix.
Numerical solution of this polynomial is necessary to obtain the w to use in calculating T_H for a given sample. A computer program to obtain this numerical solution, based on Bairstow's method, is given in the appendix. This procedure for numerical solution is illustrated with data from an array with three rows and three columns; the problem illustrated is a comparison of two systems for grading (stratifying) white oak sp. sawlogs. For arrays of size greater than r = c = 4, the algebraic derivation of the polynomial coefficients becomes unwieldy, and it is recommended that the numerical solution be started a step earlier.

The test statistic has been derived from the ratio of maximum likelihoods, λ, by taking T_H = −2 ln λ. Although the large-sample distribution of T_H is known to be that of χ^2, it is necessary to obtain some indication as to how well the χ^2 distribution approximates that of T_H for relatively small samples. Empirical sampling experiments that were performed indicate that the upper tail of the T_H distribution is slightly larger than that of the χ^2 distribution. When the number of observations per cell is ten or more, the actual type I error may be α = 0.11 or 0.12 for a nominal α = 0.10 based on χ^2. Thus, we conclude that the χ^2 approximation to T_H is sufficiently close so that in application the critical region for T_H can be taken as T_H ≥ χ^2(1, 1−α).

The empirical sampling was extended beyond the null case to construct power curves for the test. These empirical curves for T_H were compared with the theoretical power of F(n/2 − r, n/2 − c), which could be used as a test of the null hypothesis being considered. This alternative test requires subdivision of the total sample into two parts in order to obtain an independent estimate of the average within-strata variance for each of the two systems of stratification. The comparison of the power of the two tests shows that T_H is greatly superior to F(n/2 − r, n/2 − c) in the ability to reject the null hypothesis when it is false.

7. LIST OF REFERENCES

1. Anderson, R. L. and T. A. Bancroft. 1952. Statistical Theory in Research. McGraw-Hill Book Company, New York.
2. Browne, E. T. 1958. Introduction to the Theory of Determinants and Matrices. University of North Carolina Press, Chapel Hill, N. C.
3. Campbell, R. A. 1964. Forest Service log grades for southern pine. U. S. Forest Service, Southeastern Forest Experiment Station (Asheville, N. C.) Research Paper SE-11.
4. Cochran, W. G. 1960. Comparison of methods for determining stratum boundaries. Bulletin of the International Statistical Institute 38(2):345-358.
5. Dalenius, T. 1950. The problem of optimum stratification. Skandinavisk Aktuarietidskrift 33:203-213.
6. Dalenius, T. 1952. The problem of optimum stratification in a special type of design. Skandinavisk Aktuarietidskrift 35:61-70.
7. Dalenius, T. 1953. The economics of one-stage stratified sampling. Sankhya 12(4):351-356.
8. Dalenius, T. 1962. Recent advances in sample survey theory and methods. Annals of Mathematical Statistics 33(2):325-349.
9. Dalenius, T. and M. Gurney. 1951. The problem of optimum stratification II. Skandinavisk Aktuarietidskrift 34:133-148.
10. Dalenius, T. and J. L. Hodges, Jr. 1957. The choice of stratification points. Skandinavisk Aktuarietidskrift 40:198-203.
11. Dalenius, T. and J. L. Hodges, Jr. 1959. Minimum variance stratification. Journal of the American Statistical Association 54:88-101.
12. Ekman, G. 1959. An approximation useful in univariate stratification. Annals of Mathematical Statistics 30:219-229.
13. Feller, W. 1950. An Introduction to Probability Theory and Its Applications. John Wiley, Inc., New York.
14. Fox, M. 1956. Charts of the power of the F-test. Annals of Mathematical Statistics 27:484-496.
15. Grandage, A. 1958. Orthogonal coefficients for unequal intervals.
Biometrics 14(2):287-289. 16. Hagood, M. J. and E. H. Bernert. 1945. Component indexes as a basis for stratification in sampling. Journal of the American Statistical Association 40:330-337. 17. Newport, C. A., C. R. Lockard, and C. L. Vaughan. 1958. Log and tree grading as a means of measuring quality. U. S. Forest Service, Washington, D. C. 18. Newport, C. A. and W. G. OIRegan. 1963. An analysis technique for testing log grades. U. S. Forest Service, Pacific Southwest Forest and Range Experiment Station (Berkeley, Calif.) Research Paper PSW-P3. 19. Petro, F. J. 1962. How to evaluate the quality of hardwood logs for factory lumber. Canada Department of Forestry (Ottawa) Technical Research Note No. 34. 20. RrND Corporation. 1955. Million Random Digits with 100,000 Normal Deviates. Macmillan Company, New York. 21. Robson, D. S. 1959. A simple method for constructing orthogonal polynomivl S W'l'i' the independent variable is unequally spaced. Biometrics 15(2):187-191. 22. Tang, P. S. 1938. The power function of the analysis of variance tests with tables and illustrations of their use. Statistical Research Memoirs (University College, London) 2: 126-157. 23. Vaughan, C. L. 1958. Development of log and bolt grades for hardwoods. U. S. Forest Service, Forest Products Laboratory (Madison, Wisconsin) Research Paper No. TGUR-16. 24. Watson, G. S. 1965. Equatorial distributions on a sphere. Biometrika 52:193-201. 25. Wilks, S. S. 1938. The large-sample distribution of the likelihood ration for testing composite hypotheses. Annals of Mathematical Statistics 9:60-62. 26. Wishart, J. and T. Metakides. 1953. fitting. Biometrika 40:361-369. Orthogonal polynomial • 71 8. 8.1 8.1.1 Theorem on APPENDICES Thec;rems and Derivatior:s u2 I f U is the matrix defined by (2.4.3) and i f we define the .partitio:J.s: Ull(r-l,r-l) (S.1.1.1) U{2(c-l,r-l) then (8.1.1.2) • Proof: .h c th and-ce~ the (a,~)t~ element in A~ is the product of the a th row l coll~ of A. 
, i.e.,

(A_i^2)_{ab} = SUM_g (a_{ia} a_{ig})(a_{ig} a_{ib}) = a_{ia} a_{ib} ( SUM_g a_{ig}^2 ).

The expression enclosed by parentheses is given by equation (2.3.14) to be unity; therefore, the (a,b)th element in A_i^2 is a_{ia} a_{ib}, which is seen to be the same as the (a,b)th element in A_i. Therefore, A_i is an idempotent matrix, i.e.,

A_i^2 = A_i.  (8.1.1.3)

The (a,b)th element in A_i A_j (i not equal to j) is

a_{ia} a_{jb} ( a_{i1} a_{j1} + a_{i2} a_{j2} + ... + a_{i,r+c-2} a_{j,r+c-2} ),

which from (2.3.15) is seen to be zero; therefore,

A_i A_j = 0  (i not equal to j).  (8.1.1.4)

Now, partition the (r+c-2)-square matrix A_i as in (8.1.1.5). Then

U = SUM_{i=1}^{c-1} A_i - K,  (8.1.1.6)

where K is the matrix defined in section 2.4. Expanding U^2 gives

U^2 = [ SUM A_i - K ]^2 = [ SUM A_i ]^2 + K^2 - [ SUM A_i ]K - K[ SUM A_i ]  (i = 1, ..., c-1),

which, by using equations (8.1.1.3), (8.1.1.4), and (8.1.1.6), and also noting that K^2 = K, becomes (8.1.1.2). This completes the proof.

8.1.2 Theorem on the Determinant of U

The determinant of U is given by (8.1.2.1).

Proof: Write the (a,b)th element of U as

u_{ab} = SUM_{i=1}^{c-1} a_{ia} a_{ib} - e_{ab},  a = 1, ..., r+c-2; b = 1, ..., r+c-2,  (8.1.2.2)

where e_{ab} = 1 for a = b = 1, ..., r-1 and e_{ab} = 0 otherwise.  (8.1.2.3)

Multiplying the a-th row and the b-th column in the determinant by a_{c-1,a} and a_{c-1,b} for all a and b gives

|U| = ( 1 / PROD_{j=1}^{r+c-2} a_{c-1,j}^2 ) . | a_{c-1,a} a_{c-1,b} ( SUM_i a_{ia} a_{ib} - e_{ab} ) |.  (8.1.2.5)

Now, the sum of the a-th row, for a <= r-1, of the determinant on the right is

SUM_{b=1}^{r+c-2} [ a_{c-1,a} a_{c-1,b} SUM_{i=1}^{c-1} a_{ia} a_{ib} ] - a_{c-1,a}^2
= SUM_b [ a_{c-1,a} a_{c-1,b} ( a_{1a} a_{1b} + ... + a_{c-1,a} a_{c-1,b} ) ] - a_{c-1,a}^2
= a_{c-1,a} a_{1a} [ SUM_b a_{c-1,b} a_{1b} ] + ... + a_{c-1,a} a_{c-2,a} [ SUM_b a_{c-1,b} a_{c-2,b} ] + a_{c-1,a}^2 [ SUM_b a_{c-1,b}^2 ] - a_{c-1,a}^2
= 0,

since from (2.3.14) and (2.3.15) the first c-2 terms in brackets are zero and the last term in brackets is unity.
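The two identities used above, A_i^2 = A_i (8.1.1.3) and A_i A_j = 0 for i not equal to j (8.1.1.4), are easy to check numerically for any pair of orthonormal coefficient vectors. A minimal sketch, assuming numpy is available; the vectors here are arbitrary orthonormal columns from a QR factorization, not the thesis's own coefficient vectors:

```python
import numpy as np

# Two orthonormal vectors playing the roles of a_i and a_j (hypothetical data).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 2)))
a_i, a_j = Q[:, 0], Q[:, 1]

A_i = np.outer(a_i, a_i)   # A_i = a_i a_i', a rank-one matrix
A_j = np.outer(a_j, a_j)

assert np.allclose(A_i @ A_i, A_i)               # (8.1.1.3): A_i is idempotent
assert np.allclose(A_i @ A_j, np.zeros((6, 6)))  # (8.1.1.4): A_i A_j = 0
```

The same computation exhibits the step in the proof: the (a,b) element of A_i^2 is a_{ia} a_{ib} times the squared length of a_i, which is 1 for a unit vector.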
Similarly, the sum of the a-th row, for a = r, ..., r+c-2, is a_{c-1,a}^2. Therefore, the last row of the determinant is

( 0, ..., 0, a_{c-1,r}^2, a_{c-1,r+1}^2, ..., a_{c-1,r+c-2}^2 ),

where the first r-1 elements are zero. If, in the same way, we sum columns, the last column takes the same form,  (8.1.2.8)

or, factoring a_{c-1,r+c-2}^2 from the last row and column,  (8.1.2.9)

Repeating these operations, using the a_{c-2,j}, ..., a_{1j} respectively on the first (r+c-3), ..., (r-1) rows and columns, gives

|U| = ( 1 / PROD_{i=1}^{c-1} a_{i,r-1+i}^2 ) . | U11  O(r-1,c-1) ; O(c-1,r-1)  V |,  (8.1.2.10)

where V is the (c-1)-square matrix with (i,j) element SUM_{a=r}^{r+c-2} a_{ia} a_{ja}. And, by Laplace's theorem (2, p. 20),

|U| = ( 1 / PROD_{i=1}^{c-1} a_{i,r-1+i}^2 ) . |U11| . |V|.  (8.1.2.11)

Recalling from (2.3.6) that, for i = 1, ..., c-2,

a_{i,r+i+1} = ... = a_{i,r+c-2} = 0,

the second determinant above is the symmetric determinant with first row ( a_{1r}^2, a_{1r}a_{2r}, a_{1r}a_{3r}, ..., a_{1r}a_{c-1,r} ), second row involving SUM_{a=r}^{r+1} a_{2a}a_{ja}, and last diagonal element SUM_{a=r}^{r+c-2} a_{c-1,a}^2. If we factor a_{1r} from the first row and first column and a_{2,r+1} from the second row and second column, the leading elements become unity.  (8.1.2.13)

Subtracting (a_{2r}/a_{2,r+1} times row 1) from row 2, and the corresponding multiples of row 1 from rows 3, ...,
c-1 respectively, gives

|V| = a_{1r}^2 a_{2,r+1}^2 . | SUM_{a=r+1}^{r+2} a_{3a}^2  ...  SUM_{a=r+1}^{r+2} a_{3a}a_{c-1,a} ; ... ; ...  SUM_{a=r+1}^{r+c-2} a_{c-1,a}^2 |.  (8.1.2.14)

By repetition of this process we finally find

|V| = PROD_{i=1}^{c-1} a_{i,r-1+i}^2.  (8.1.2.15)

Therefore, from (8.1.2.11) we have (8.1.2.16). From (8.1.1.2) we have the block structure of U^2; the matrix U11 has dimensions (r-1, r-1), and therefore

|U.U| = (-1)^{r-1} |U11| . |U22| = |U| . |U|,

which, from (8.1.2.16), gives (8.1.2.17). It now remains to evaluate |U22|. From (8.1.1.1) and (2.3.6) we have |U22| as the (c-1)-square symmetric determinant with entries such as SUM_{i=1}^{c-1} a_{ir}^2, SUM_{i=2}^{c-1} a_{ir}a_{i,r+1}, ..., a_{c-1,r}a_{c-1,r+c-2}, and final diagonal element a_{c-1,r+c-2}^2,  (8.1.2.18)

or, upon factoring ( a_{c-1,r+c-2} / a_{c-1,r+c-3} ) from the last row and last column, a determinant of the same pattern whose last row and column are proportional to a_{c-1,r+c-3}.  (8.1.2.19)

If we now subtract the last row, the (c-1)st, from the (c-2)nd row, subtract the (c-1)st column from the (c-2)nd column, and then multiply the last row and last column by 1/(a_{c-1,r+c-3}), we have:  (8.1.2.
20)

Repeating the above operations, with 1/(a_{c-1,r+c-4}), ..., 1/(a_{c-1,r}) being the factors from the last row and last column for the successive repetitions, and subtracting the last row and column from the (c-3)rd, ..., 1st row and column for the successive repetitions, gives a determinant whose last row and last column vanish except for a final unit element.  (8.1.2.21)

Noting that the principal minor in the upper left is of the same form as (8.1.2.19), we see that continued reductions must result in

|U22| = ( a_{c-1,r+c-2}^2 a_{c-2,r+c-3}^2 ... a_{1r}^2 ) / ( PROD_{i=1}^{c-1} a_{i,r-1+i}^2 ),  (8.1.2.22)

which, together with (8.1.2.17), gives (8.1.2.1). This completes the proof.

8.1.3 Relationship Between the Sums of Principal Minors of a Matrix and Those of Its Inverse

A direct result of the Cayley-Hamilton theorem is that the inverse of an n-square matrix M, with rank n, can be expanded as follows:

M^{-1} = ( 1 / ((-1)^n p_n) ) [ M^{n-1} - p_1 M^{n-2} + p_2 M^{n-3} - ... + (-1)^{n-1} p_{n-1} I ],  (8.1.3.1)

where p_i is the sum of the i-rowed principal minors of M. Rearranging terms and multiplying both sides by (M^{-1})^{n-2} gives an expansion of M in powers of M^{-1},  (8.1.3.2)

which is seen to be of the same form as (8.1.3.1). Therefore, matching coefficients,

p*_{n-i} = p_i / p_n,

where p*_{n-i} is the sum of the (n-i)-rowed principal minors of M^{-1}.

8.1.4 Theorem on the Inverse of B

The inverse of the (r+c-2)-square matrix

B = U + wI  (8.1.4.1)

is

B^{-1} = (1/p*_n) { w^{n-1} I - w^{n-2} [U - p_1 I] + w^{n-3} [U^2 - p_1 U + p_2 I] - ...
  + (-1)^{r-1} w^{n-r} [U^{r-1} - p_1 U^{r-2} + p_2 U^{r-3} + ... + (-1)^{r-1} p_{r-1} I] + ...
  + (-1)^{n-1} [U^{n-1} - p_1 U^{n-2} + p_2 U^{n-3} + ... + (-1)^{n-1} p_{n-1} I] }.
(8.1.4.2)

where p_r = the sum of the r-rowed principal minors of U, p*_n = the determinant of B, and n = r+c-2.

Proof: First it will be necessary to establish the relationship between p_r and p*_r = the sum of the r-rowed principal minors of B. Any r-rowed principal minor of B can be represented as

|B_i| = |U_i + wI|,  (8.1.4.3)

where B_i represents the (r x r) matrix of elements composed of the intersection of the r rows and r columns contained in the combination. Expanding the right side of (8.1.4.3) gives

|B_i| = w^r + w^{r-1} p_{i1} + w^{r-2} p_{i2} + ... + p_{ir},  (8.1.4.4)

where p_{ij} = the sum of the j-rowed principal minors of U_i. Therefore, the sum of the r-rowed principal minors of B is

p*_r = SUM_{i=1}^{C(n,r)} [ w^r + w^{r-1} p_{i1} + ... + p_{ir} ].  (8.1.4.5)

Now we will consider the coefficient of w^{r-j} in p*_r. The quantity SUM_i p_{ij} is the sum of the j-rowed principal minors of all C(n,r) r-rowed principal minors of U. Each of the r-rowed principal minors of U contains C(r,j) j-rowed principal minors; thus, their sum (including all duplications) must contain C(r,j).C(n,r) j-rowed principal minors of U. But U contains only C(n,j) j-rowed principal minors, and by symmetry all must be duplicated equally; therefore

SUM_{i=1}^{C(n,r)} p_{ij} = ( C(r,j) C(n,r) / C(n,j) ) p_j = C(n-j, r-j) p_j,  (8.1.4.6)

and

p*_r = SUM_{j=0}^{r} C(n-j, r-j) w^{r-j} p_j.  (8.1.4.7)

Inverting (8.1.4.7) to express the p's in terms of the p*'s, the coefficient of w^i p*_{r-i} in p_r (i = 1, ..., r) is an alternating sum of products of binomial coefficients; since the expression inside the parentheses is zero (13, p. 47), it reduces to

(-1)^i C(n-r+i, i).
(8.1.4.8)

Thus, the p's are expressed in terms of the p*'s:

p_0 = p*_0 = 1
p_1 = p*_1 - C(n,1) w p*_0
p_2 = p*_2 - C(n-1,1) w p*_1 + C(n,2) w^2 p*_0
...
p_r = p*_r - C(n-r+1,1) w p*_{r-1} + C(n-r+2,2) w^2 p*_{r-2} - ... + (-1)^{r-1} C(n-1,r-1) w^{r-1} p*_1 + (-1)^r C(n,r) w^r p*_0,  (8.1.4.9)

for r = 0, ..., n. If we now rearrange the terms within the braces of (8.1.4.2), collecting in the k-th row the terms in w^{n-k},

w^{n-1} I p_0
+ w^{n-2} I p_1 - w^{n-2} U p_0
+ w^{n-3} I p_2 - w^{n-3} U p_1 + w^{n-3} U^2 p_0
...  (8.1.4.10)

and substitute for the p's from equations (8.1.4.9), we obtain an array in the p*'s. If we sum the coefficients of p*_0 from the first parentheses in rows 1, ..., n-1, from the second parentheses in rows 2, ..., n-1, etc.; and sum the coefficients of p*_1 from the first parentheses in rows 2, ..., n-1, from the second parentheses in rows 3, ..., n-1, etc.; and similarly for p*_2, ..., p*_{n-1}; we obtain sums of the form

p*_0 { w^{n-1} I [C(n,n-1) - C(n,n-2) + ... ] - w^{n-2} U [C(n,n-2) - C(n,n-3) + ... ] + ... }
+ p*_1 { ... } + ...,  (8.1.4.11)

which may be rearranged as (8.1.4.12). But, from (13, p. 48), we have the identity for alternating partial sums of binomial coefficients,

C(m,k) - C(m,k-1) + C(m,k-2) - ... = C(m-1,k).

Therefore, (8.1.4.12) becomes:
(1/p*_n) { p*_0 [ C(n-1,n-1) w^{n-1} I + C(n-1,n-2) w^{n-2} U + ... + C(n-1,1) w U^{n-2} + C(n-1,0) U^{n-1} ]
 - p*_1 [ C(n-2,n-2) w^{n-2} I + C(n-2,n-3) w^{n-3} U + ... + C(n-2,0) U^{n-2} ]
 + ...
 + (-1)^{n-2} p*_{n-2} [ C(1,1) w I + C(1,0) U ]
 + (-1)^{n-1} p*_{n-1} [ C(0,0) I ] }.  (8.1.4.14)

Substituting into the right side of (8.1.4.2) gives (8.1.4.15), which from (8.1.3.1) is seen to be B^{-1}.

8.2 Matrices and Sums of Principal Minors for Problem Sizes to r=c=4

For all values of the parameters r and c, r >= c, up to r=c=4, the sums of principal minors of U, the matrices U and U^{-1}, and the polynomial equation g*' B^{-1} U B^{-1} g* = 0 are as follows.

For r=c=2:  p_1 = 0, and the polynomial is

g*' [ w^2 U + 2w p_2 I - p_2 U ] g* = 0.

For r=3, c=2:

p_1 = -1,  p_2 = -a_{13}^2,  p_3 = a_{13}^2,

U = [ a_{11}^2 - 1   a_{11}a_{12}   a_{11}a_{13} ;  .   a_{12}^2 - 1   a_{12}a_{13} ;  .   .   a_{13}^2 ]  (symmetric),

U^{-1} = (1/a_{13}^2) [ -a_{13}^2   0   a_{11}a_{13} ;  .   -a_{13}^2   a_{12}a_{13} ;  .   .   a_{13}^2 ].

For r=c=3:

p_3 = 0,  p_4 = a_{13}^2 a_{24}^2,

U has (a,b) element a_{1a}a_{1b} + a_{2a}a_{2b} - e_{ab} (a, b = 1, ..., 4); its inverse carries the factor 1/(a_{13}^2 a_{24}^2), with entries such as a_{11}a_{13}a_{24}^2 and -a_{13}a_{24}(a_{11}a_{23} - a_{13}a_{21}). The polynomial is

g*' [ w^6 U - 2w^5 U^2 - w^4 (p_2 U + 3p_4 U^{-1}) + 4w^3 p_4 I - w^2 p_4 (3U + p_2 U^{-1}) + 2w p_4 (U^2 + p_2 I) + p_4^2 U^{-1} ] g* = 0.

For r=4, c=2:

p_1 = -2,  p_2 = 1 - a_{14}^2,  p_3 = 2a_{14}^2,  p_4 = -a_{14}^2,

U = [ a_{11}^2 - 1   a_{11}a_{12}   a_{11}a_{13}   a_{11}a_{14} ;  .   a_{12}^2 - 1   a_{12}a_{13}   a_{12}a_{14} ;  .   .   a_{13}^2 - 1   a_{13}a_{14} ;  .   .   .   a_{14}^2 ],

U^{-1} = (1/a_{14}^2) [ -a_{14}^2   0   0   a_{11}a_{14} ;  .   -a_{14}^2   0   a_{12}a_{14} ;  .   .   -a_{14}^2   a_{13}a_{14} ;  .   .   .   a_{14}^2 ],

g*' { w^6 U - 2w^5 [U^2 - p_1 U] - w^4 [p_1 U^2 - (p_1^2 - p_2) U + 3p_3 I + 3p_4 U^{-1}] - 2w^3 [p_3 U - (p_1 p_2 + 2p_4) I + p_1 p_4 U^{-1}] + w^2 [p_3 U^2 - (p_1 p_3 + 3p_4) U + (3p_1 p_4 + p_2 p_3) I - p_2 p_4 U^{-1}] + 2w p_4 [U^2 - p_1 U + p_2 I] + p_4^2 U^{-1} } g* = 0.

For r=4, c=3:

p_2 = -(a_{14}^2 + a_{24}^2 + a_{25}^2),  p_3 = a_{14}^2 + a_{24}^2 + a_{25}^2,

and the matrix U has (a,b) element a_{1a}a_{1b} + a_{2a}a_{2b} - e_{ab}:
its printed rows begin (a_{11}^2 + a_{21}^2 - 1), (a_{11}a_{12} + a_{21}a_{22}), (a_{11}a_{13} + a_{21}a_{23}), (a_{11}a_{14} + a_{21}a_{24}), a_{21}a_{25}, and so on; U^{-1} carries the factor 1/(a_{14}^2 a_{25}^2), with entries such as a_{12}a_{14}a_{25}^2 and a_{13}a_{14}a_{25}^2. The polynomial is

g*' { w^8 U - 2w^7 [U^2 - p_1 U] + w^6 [3U^3 - 4p_1 U^2 + (p_1^2 + 2p_2) U] - 2w^5 [p_3 U - (2p_4 + p_1 p_3) I + (2p_5 + p_1 p_4) U^{-1} - p_1 p_5 U^{-2}] + w^4 [p_3 U^2 - (3p_4 + p_1 p_3) U + (5p_5 + 3p_1 p_4 + p_2 p_3) I - (3p_1 p_5 + p_2 p_4) U^{-1} + p_2 p_5 U^{-2}] - 2w^3 [-p_4 U^2 + (2p_5 + p_1 p_4) U - (2p_1 p_5 + p_2 p_4) I + p_2 p_5 U^{-1}] + w^2 [-p_4 U^3 + (3p_5 + p_1 p_4) U^2 - (3p_1 p_5 + p_2 p_4) U + (3p_2 p_5 + p_3 p_4) I - p_3 p_5 U^{-1}] + 2w [p_4 p_5 U^{-1} - p_5^2 U^{-2}] + p_5^2 U^{-1} } g* = 0.

For r=c=4:

p_4 = a_{14}^2 (a_{25}^2 + a_{35}^2 + a_{36}^2) + (a_{24}a_{35} - a_{25}a_{34})^2 + a_{36}^2 (a_{24}^2 + a_{25}^2),  p_5 = 0,

U has (a,b) element a_{1a}a_{1b} + a_{2a}a_{2b} + a_{3a}a_{3b} - e_{ab} (a, b = 1, ..., 6), with rows such as ( SUM_{i=1}^{3} a_{i1}^2 - 1, SUM_i a_{i1}a_{i2}, ..., a_{31}a_{36} ); and U^{-1} involves the factor a_{14}a_{25}a_{36}, its first column containing a_{11}a_{25}a_{36}, -a_{36}(a_{11}a_{24} - a_{14}a_{21}), and a_{11}(a_{24}a_{35} - a_{25}a_{34}) - a_{14}(a_{21}a_{35} - a_{25}a_{31}), with analogous entries built from a_{12} and a_{13}.

8.3 Computer Program for Transformations

A program was written in FORTRAN IV for the IBM 1410 computer to calculate the following:

(a) The set of orthogonal vectors of linear coefficients

C* = ( c_{21}, ..., c_{2,c-1}, c_{31}, ..., c_{3,c-1} ).  (8.3.1)

In the printed computer output these vectors are identified by the heading: LINEAR COEFFICIENTS SET 1.

(b) A set of vectors analogous to (a) but not defined elsewhere,

Q* = ( q_{21}, ..., q_{2,c-1}, q_{31}, ..., q_{3,r-1} ).  (8.3.2)

These vectors are identified in the computer output by the heading: LINEAR COEFFICIENTS SET 2.
(c) The matrix having rows defined by (2.2.19) for i = 1, ..., c-1:

A = [ a_{11}  a_{12}  ...  a_{1,r+c-2} ;  a_{21}  a_{22}  ...  a_{2,r+c-2} ;  ... ;  a_{c-1,1}  a_{c-1,2}  ...  a_{c-1,r+c-2} ].  (8.3.3)

This is identified in the output as MATRIX A. (Note: this should not be confused with the matrix A as defined by (2.2.21).)

(d) The vectors t' = ( t_{21}, ..., t_{2,c-1} ) and l*' = ( l_{21}, ..., l_{2,r-1}, l_{31}, ..., l_{3,c-1} ). These are identified as LINEAR FUNCTION L and LINEAR FUNCTION T in the output.

It must be explained that the computer program does not calculate the quantities in (a) through (d) exactly as defined. Referring to section (2.2), we see that any vector c_{ab} or q_{ab} is of length n but that it is composed of only rc possibly unique elements; thus, for computing purposes we consider c_{ab} and q_{ab} to be of length rc. In addition, the program as presented here does not normalize the vectors, i.e., c_{ab}'c_{ab} is not 1 and q_{ab}'q_{ab} is not 1. It was found to be desirable in the early stages of the present study to have the exact values of the coefficients, at the expense of a small amount of hand calculating to later produce the normalized quantities. We felt that a program that would calculate the exact coefficients for orthogonal contrasts when the numbers are unequal would be useful for other purposes as well, and thus we have left it in that form. With some modest revisions the program could be made to produce the normalized quantities.

There are a number of papers in the literature (15, 21, 26) that give methods for calculating orthogonal polynomials; however, none of these is directly applicable to the problem at hand. The method used here is most nearly related to the technique that has come to be known as "step-wise regression" or "step-wise reduction." If we consider, for example, the 3 x 3 array as given in table 4.1, we have X_1, an indicator (0-1) design matrix with one row per observation and columns ordered m, R_1, R_2, C_1, C_2: the m column is identically 1, and each row carries a 1 marking the observation's row stratum and
a 1 marking its column stratum (there is no indicator column for the third stratum in either direction).  (8.3.4)

Then

X_1'X_1 = [ n..  n1.  n2.  n.1  n.2 ;  n1.  n1.  0  n11  n12 ;  n2.  0  n2.  n21  n22 ;  n.1  n11  n21  n.1  0 ;  n.2  n12  n22  0  n.2 ]  (8.3.5)

= [ 768  144  364  57  462 ;  .  144  0  49  88 ;  .  .  364  5  299 ;  .  .  .  57  0 ;  .  .  .  .  462 ]  (upper triangle)  (8.3.6)

and

X_1'y = ( Y..., Y1.., Y2.., Y.1., Y.2. )' =
[ 1 1 1 1 1 1 1 1 1 ;
  1 1 1 0 0 0 0 0 0 ;
  0 0 0 1 1 1 0 0 0 ;
  1 0 0 1 0 0 1 0 0 ;
  0 1 0 0 1 0 0 1 0 ] y_1 = Z_1 y_1,  (8.3.7)

where

y_1 = ( Y11., Y12., Y13., Y21., Y22., Y23., Y31., Y32., Y33. )'.  (8.3.8)

In most computational schemes, we would augment the upper-triangular portion of X_1'X_1 with the vector X_1'y and then proceed by the selected method to directly calculate sums of squares of the observations and other desired quantities from the data. For our purposes, we require not only the sums of squares but also the individual vectors of coefficients associated with single "degrees of freedom" of the sums of squares; therefore, we augment the upper-triangular portion of X_1'X_1 with the matrix Z_1 that is defined in equation (8.3.7) and proceed with operations on this array. We will represent the array as

[ upper triangular X_1'X_1 | Z_1 ] = [ a_{11}  ...  a_{1,r+c-1}  a_{1,r+c}  ...  a_{1,rc+r+c-1} ;  ...  a_{2,r+c-1}  a_{2,r+c}  ...  a_{2,rc+r+c-1} ;  ... ;  a_{r+c-1,r+c-1}  a_{r+c-1,r+c}  ...  a_{r+c-1,rc+r+c-1} ].  (8.3.9)

Instead of the usual "step-wise reduction" calculation,

b_{ij} = a_{i+1,j+1} - ( a_{1,i+1} a_{1,j+1} ) / a_{11},

we will use

b_{ij} = a_{11} a_{i+1,j+1} - a_{1,i+1} a_{1,j+1}

to perform the reduction of (8.3.9) to

[ b_{11}  b_{12}  ...  b_{1,r+c-2}  b_{1,r+c-1}  ...  b_{1,rc+r+c-2} ;  ... ;  b_{r+c-2,r+c-2}  ...  b_{r+c-2,rc+r+c-2} ].  (8.3.10)

Similarly, we can compute c_{ij} = b_{11} b_{i+1,j+1} - b_{1,i+1} b_{1,j+1} to obtain a second reduction, and so on. In this way, by eliminating the division, the exact values of the coefficients are developed on the right side of the array. One additional dodge is employed in order to keep the size of the numbers as small as possible. We see that:
c_{ij} = b_{11} b_{i+1,j+1} - b_{1,i+1} b_{1,j+1}
       = ( a_{11}a_{22} - a_{12}^2 )( a_{11}a_{i+2,j+2} - a_{1,i+2}a_{1,j+2} ) - ( a_{11}a_{2,i+2} - a_{12}a_{1,i+2} )( a_{11}a_{2,j+2} - a_{12}a_{1,j+2} )
       = a_{11} [ a_{11}a_{22}a_{i+2,j+2} - a_{22}a_{1,i+2}a_{1,j+2} - a_{12}^2 a_{i+2,j+2} - a_{11}a_{2,i+2}a_{2,j+2} + a_{12}a_{2,i+2}a_{1,j+2} + a_{12}a_{2,j+2}a_{1,i+2} ].  (8.3.11)

That is, all the elements in the array at any reduction are exactly divisible by the (1,1) element in the reduction two steps previous.

By using X_1 and Z_1, the procedure will yield the orthogonal vectors of coefficients associated with the individual components of the sums of squares for rows and columns (adjusted for rows) as given by equations (2.3.1). If we let X_2 be the rearrangement of X_1 such that the columns are ordered m, C_1, C_2, R_1, R_2, and then define X_2'X_2 and Z_2 accordingly, the computational procedure will give the vectors of coefficients associated with the individual components of the sums of squares for columns and rows (adjusted for columns) as given by equation (2.3.2).

Input to the computer program is in the following order:

(a) Parameter card. Contents: r+c-1 in columns 1-2; rc+r+c-1 in columns 3-5; 2rc-r-c in columns 6-8; c in columns 9-10; columns 11-80 blank.

(b) Upper triangular portion of X_1'X_1. Each element is a whole number right justified in a five-digit field. Elements are entered on cards serially by rows of X_1'X_1, i.e., a_{11}, ..., a_{1,r+c-1}; a_{22}, ..., a_{2,r+c-1}; ...; a_{r+c-1,r+c-1}.

(c) Representation of Z_1. One card for each element equal to unity in Z_1 except those elements of the first row. Card format is: row number of element in columns 1-3; column number of element in columns 4-6; columns 7-80 blank. The column number required is that of the augmented array and not that of Z_1 alone, e.g., elements of the first column of Z_1 are found in the (r+c)th column of the augmented array.

(d) Upper triangular portion of X_2'X_2. Card format is the same as for (b).

(e) Representation of Z_2. Card format is the same as for (c).

(f) Cell totals. Each cell total is entered as a whole number right justified in a ten-digit field; cell totals are entered serially by rows into the cards. (i.e., Y_{ij}
must be scaled to a whole number for the analysis.)

(g) Number of observations per cell. Each n_{ij} is entered as a whole number right justified in a five-digit field. They are entered serially by rows into the cards.

(h) Indicator card. Placed at the end of each problem set. This card contains 00 or 01 in columns one and two for, respectively, the last problem set and any other problem set.

The program is as follows:

C     PROGRAM FOR LINEAR TRANSFORMATIONS
C     THIS PROGRAM CALCULATES
C        C* = LINEAR COEFFICIENTS SET 1
C        Q* = LINEAR COEFFICIENTS SET 2
C        MATRIX A
C        LINEAR FUNCTIONS L
C        LINEAR FUNCTIONS T
      DIMENSION X(11,45),FL(10),FT(5)
00001 FORMAT(16F5.0)
00002 FORMAT(I3,I3)
00003 FORMAT(5F10.0)
00004 FORMAT(I2,I3,I3,I2)
00101 FORMAT(///10X,26HLINEAR COEFFICIENTS SET ,I2)
00102 FORMAT(5H ROW ,I2/(5F25.0))
00103 FORMAT(///10X,8HMATRIX A)
00104 FORMAT(///10X,19H LINEAR FUNCTIONS L//(5F25.0))
00105 FORMAT(///10X,19H LINEAR FUNCTIONS T//(5F25.0))
00106 FORMAT(I2)
00005 READ(1,4)M,MY,NU,K2
      NT=1
      DO 7 I=1,M
      DO 7 J=1,MY
00007 X(I,J)=0.0
      READ(1,1)((X(I,J),J=I,M),I=1,M)
      DO 8 L=1,NU
      READ(1,2)N1,N2
00008 X(N1,N2)=1.0
      JO=M+1
      DO 50 J=JO,MY
00050 X(1,J)=1.0
      D=1.0
      DO 10 M1=2,M
      K=M1-1
      DO 9 I=M1,M
      DO 9 J=I,MY
00009 X(I,J)=(X(K,K)*X(I,J)-X(K,J)*X(K,I))/D
00010 D=X(K,K)
      WRITE(3,101)NT
      DO 12 I=2,M
      NOROW=I-1
00012 WRITE(3,102)NOROW,(X(I,J),J=JO,MY)
      IF(NT.GT.1)GO TO 14
      NT=2
      DO 13 I=2,M
00013 WRITE(4)(X(I,J),J=JO,MY)
      REWIND 4
      GO TO 5
00014 DO 15 I=2,M
00015 WRITE(5)(X(I,J),J=JO,MY)
      REWIND 5
      READ(1,3)(X(M,J),J=JO,MY)
      READ(1,1)(X(M-1,J),J=JO,MY)
      IM=K2-1
      JM2=M-1
      DO 16 I=1,IM
      DO 16 J=1,JM2
00016 X(I,J)=0.0
      DO 17 I=2,K2
00017 READ(5)(X(I,J),J=JO,MY)
      DO 18 I=1,IM
      FT(I)=0.0
      DO 18 J=JO,MY
00018 FT(I)=FT(I)+X(I+1,J)*X(M,J)
      DO 19 L1=1,JM2
      READ(4)(X(1,J),J=JO,MY)
      DO 19 L2=1,IM
      DO 19 L3=JO,MY
00019 X(L2,L1)=X(L2,L1)+X(1,L3)*X(L2+1,L3)*X(M-1,L3)
      REWIND 4
      REWIND 5
      WRITE(3,103)
      DO 20 I=1,IM
      NOROW=I
00020 WRITE(3,102)NOROW,(X(I,J),J=1,JM2)
      DO 21 I=1,JM2
      READ(4)(X(I,J),J=JO,MY)
      FL(I)=0.0
      DO 21 J=JO,MY
00021 FL(I)=FL(I)+X(I,J)*X(M,J)
      REWIND 4
      WRITE(3,104)(FL(I),I=1,JM2)
      WRITE(3,105)(FT(I),I=1,IM)
      READ(1,106)NOPROB
      IF(NOPROB.EQ.1)GO TO 5
      STOP
      END

The following terms used in the program are defined:

M = r+c-1
MY = rc+r+c-1
NU = 2rc-r-c
K2 = c
N1 = row number for elements equal unity in Z_1 or Z_2
N2 = column number for elements equal unity in Z_1 or Z_2
D = 1 initially, then successively a_{11}, b_{11}, ...
NT = 1 when program is operating on X_1'X_1 and Z_1; = 2 when program is operating on X_2'X_2 and Z_2
NOROW = row number in program output
FL = linear function of observations, l_{ij}
FT = linear function of observations, t_{ij}
NOPROB = 1 in all problem sets except the last, = 0 in the last problem set.

All other terms used are indices and are defined in the program.

An example of the program output using the data from table 4.1 is given in table 8.1. The divisors required to normalize the vectors in the two sets are found by calculating

sqrt( SUM_{i=1}^{r} SUM_{j=1}^{c} n_{ij} (coefficient_{ij})^2 )

and are as follows:

Vector   Divisor
c_{21}   sqrt(69,009,405) = 8,307.1901 (approx.)

Table 8.1. Example of computer program output

LINEAR COEFFICIENTS SET 1
624. -144. -144. 0. 37440. -52416. -4637360. -187200. -157248. 51253384. 91350460. 384551092.
624. -144. -144. 0. 37440. -52416. -4637360. -187200. -157248. -496913352. -456816276. -163615644.
711. 711. 711. 0. 0. 0. 920304. -5636862. -5636862. 9077244. 153211146. -403613562.
-57. -57. -57. 14193. 14193. 14193. 5308182. -1248984. -1248984. -26286204. 117847698. -438977010.
-57. -57. -57. -26334. -26334. -26334. 6372828. -184338. -184338. 266914428. 411048330. -145776378.
22597632. 23003136. 420992053248. 295150946688. -4849515474336. 0 25644176064. 404231406592. 115398573. 12068008308.
LINEAR COEFFICIENTS SET 2
ROW 1:  624. -144. -144. 0. 37440. -52416. 8990800. 13440960. 13470912. -21059272. 19037804. 312238436.
ROW 2
ROW 3
ROW 4

MATRIX A
ROW 1:  0.
ROW 2:  817648128.  2458587364161264.

LINEAR FUNCTIONS L
281370576.  38086166048492.

LINEAR FUNCTIONS T
D¢ 6 I=l,Nl C¢EFN(I)::C¢E (I)/C¢E (1) IF(N-2) 22,23,5 IF (C¢E(2))9,2,9 BETAl ~ C¢EFN(N)/.OOl BETA2 '" COEFN(N)!.OOl a¢ T¢ 8 BETAl =e¢EFN(N)/C¢EFN(2) BETA2 =COEFN(Nl)/COEFN(2) D¢ 15 J==l,lOO B(l)::C¢EFN(l) e(l)=B(l) B(2)= C¢EFN(2) + B(l)* BETAl C(2)= B(2) + eel) * BETAl 107 D¢ 10 I:=' 3,NPl B(r) == e¢EFN(I) + B(l-l)* BETj\~ -I- B(I-2)J,(- BETA2 10 e(I) :::: B(l) + C(I-l)* BETAl + C(I-2) .* BETA2 DELTA ,: C(NP-l) *e(NP-l) - e(NP-2) * a(NE) DBETl :::: (C(NP1-3) * B(NP1) - e(NPl-2) * B(NP1-l) )/ DELTA DBET2 == (C(NP1-l) * B(NP1-l) - e(NPl-2) B(NP1) DELTA BETIP = BETAl BET2P "" BETA2 BETAl :: BETAl + DBETl BET.42 == BETA2 + DBET2 IF (ABSF(BETIP - BETA1) - ACC (ABSF(BETAl)))11,11,15 11 IF (ABSF(BET2P - BETA2) - ACC (ABSF(BETA2)))16,16,15 15 C¢l\1TINUE ACe=ACC*lO. IF(ACC-.l) 120,120,9 120 PRINT 12 12 P¢RMAT(87H PR¢GRA.M HAS ITERATED 100 TIMES. TRY lAGAIN lHTH II1EW TRIAL S¢LillI¢NS) IF (:nnnc )1, :3 ~ 1 3 BETAl = -.2 BETA2 == 2. INDIC = 1 a¢ T¢ 8 16 NP= NP-2 NP1==NPl-2 BETA1= -BETAl BETA2= -BETA2 DO 4 I=l, NPl 4 COEFN(I) = B(l) 19 RAIl == BETAl BETAl - 4.0 BETA2 IF (RAD) 20,21,21 20 R(K) = - BETAl72.0 R(K+l) = R(K) CR(K) = SQRTF(-RAD) / 2.0 CR(K+l) = - CR(K) K = K + 2 IF(FIN) 1,24,1 24 IF(NP-2) 22,23,9 21 R(K) == (-BETAl + SQRTF(RAD)) / 2.0 R(K+l) = -(BETAl + SQRTF(RAD))/ 2.0 . CR(K) = O. CR(K+l) = O. * * * * K = K + 2 25 22 23 1 IF(FIN) 1,25,1 IF (NP - 2}22,23,9 R(K) = - C¢EFN(2) ('~ T¢ 1 FIN = 1 BETAl = C¢EFN(2) BETA2 == C¢EFN(3) a¢ T¢ 19 RETURN END * )1 108 The f'ollowing terms usee). in the program are a.ef'ined: N = degree COE(I) R of' polynomial = coef'f'icient = real of' term of i th degree (i=l, ••• , N+l) part of' a root CR = coefficient of the imaginary part of a root. Program output for the example is: I COE R CR 1 0.70835812E 05 -0. 53806485E 00 O.OOOOOOOOE 00 2 0.21183742E 06 -0.26564956E 01 O.OOOOOOOOE 00 3 -0.70801400E 05 -0. 
24406735E 00   0.00000000E 00
4  -0.36490770E 06  -0.13883567E 01   0.00000000E 00
5   0.26180293E 05   0.91822165E 00   0.19682898E 00
6   0.15037366E 06   0.91822165E 00  -0.19682898E 00
7   0.30256238E 05
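The subroutine above works by repeatedly finding a quadratic factor x^2 + px + q of the polynomial (the FORTRAN carries the negated coefficients BETA1, BETA2) and deflating. A minimal modern sketch of the same idea, assuming numpy is available: this is an illustrative variation, not a transcription of the FORTRAN; it Newton-iterates on the two remainder coefficients of the synthetic division, estimating the Jacobian by finite differences instead of the classical c-array recurrence, and the test polynomial is hypothetical.

```python
import numpy as np

def quad_remainder(a, p, q):
    """Synthetic division of a[0]*x^n + ... + a[n] by x^2 + p*x + q;
    returns the two remainder coefficients (b[n-1], b[n])."""
    b = [a[0], a[1] - p * a[0]]
    for k in range(2, len(a)):
        b.append(a[k] - p * b[k - 1] - q * b[k - 2])
    return b[-2], b[-1]

def quadratic_factor(a, p=0.0, q=0.0, tol=1e-10, h=1e-7):
    """Drive the remainder of the division by x^2 + p*x + q to zero with
    Newton's method; the 2x2 Jacobian is estimated by finite differences."""
    for _ in range(100):
        r1, r0 = quad_remainder(a, p, q)
        if abs(r1) + abs(r0) < tol:
            break
        d1p, d0p = quad_remainder(a, p + h, q)   # perturb p
        d1q, d0q = quad_remainder(a, p, q + h)   # perturb q
        J = np.array([[(d1p - r1) / h, (d1q - r1) / h],
                      [(d0p - r0) / h, (d0q - r0) / h]])
        dp, dq = np.linalg.solve(J, [r1, r0])
        p, q = p - dp, q - dq
    return p, q

a = [1.0, -2.0, -5.0, 6.0]       # (x - 1)(x + 2)(x - 3), a hypothetical example
p, q = quadratic_factor(a)
roots = np.roots([1.0, p, q])    # the two roots of the extracted factor
assert all(abs(np.polyval(a, r)) < 1e-6 for r in roots)
```

Once a factor converges, the roots of the quadratic are roots of the original polynomial; the quotient coefficients from the same synthetic division give the deflated polynomial for the next extraction, exactly as the FORTRAN's loop over NP does.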