PSYCHOMETRIKA--VOL. ~, NO. 2, JUNE 1970

THE RELATION BETWEEN SAMPLE AND POPULATION CHARACTERISTIC VECTORS*

NORMAN CLIFF
UNIVERSITY OF SOUTHERN CALIFORNIA

Data are reported which show the statistical relation between the sample and population characteristic vectors of correlation matrices with squared multiple correlations as communality estimates. Sampling fluctuations were found to relate only to differences in the square roots of characteristic roots and to sample size. A principle for determining the number of factors to rotate and interpret after rotation is suggested.

A number of statistical estimation procedures involve the computation of the characteristic vectors of a covariance or correlation matrix, but very little is known concerning the sampling behavior of such characteristic vectors. Characteristic vectors are of particular importance in the currently used methods of estimation in factor analysis, in which case R_j^2, the sample squared multiple correlation of variable j with all the remaining variables in the matrix, is often substituted for unity as the diagonal element of row j of the correlation matrix R. This paper reports a Monte Carlo study of the sampling characteristics of the characteristic vectors of such reduced correlation matrices.

Method

The procedure used follows a methodology essentially the same as that used by Jöreskog [1963], Browne [1968], Hamburger [1963], Cliff and Pennell [1967], and Pennell [1968]. In it, a number of sample correlation matrices are generated from a given correlation matrix according to a procedure developed by Pennell and Young [1967], following a scheme set forth by Browne [1968] and summarized by Cliff and Pennell [1967]. The method assumes multivariate normality for the variables correlated. Five population matrices were used in the present study. Four contained 12 variables, and the fifth, 20.
*This study was supported by the National Science Foundation, Grant GB4230. The author wishes to express his appreciation for the use of the Western Data Processing Center and the Health Sciences Computing Facility, UCLA. He also thanks Dr. Roger Pennell for extremely valuable assistance in a number of phases of the study.

Since the purpose of the study focussed on common factor analysis, all five of the matrices had the property that their rank, in the population, reduced exactly to four if the correct "communalities," which are not the population squared multiple correlations, were inserted as diagonal entries. Two of the matrices were designed to have common factor loadings which were "univocal," i.e., each variable had only one non-zero loading. This also implies that sections of the correlation matrix consisted of zero elements. The remaining two matrices consisted of variables with "complex" loading patterns, i.e., few zero loadings and few or no zero correlations. The two members of each pair differed in the pattern of their characteristic roots; one univocal and one complex matrix had nearly equal non-zero characteristic roots, while the other two matrices had non-zero roots that were of different sizes, the smallest being quite small.

It was hypothesized that the behavior of characteristic vectors of sample reduced correlation matrices would depend not upon the characteristic roots and vectors of the correlation matrices themselves, nor upon those of the correlation matrices with population R_j^2 as diagonal estimates, but rather upon those of the "expected value" reduced correlation matrices. The expected value of R̂_j^2, the sample squared multiple correlation, is given by the formula

(1)    E(R̂_j^2) = R_j^2 + [(n - 1)/(N - 1)](1 - R_j^2),

where E(R̂_j^2) is the expected value of R̂_j^2, N is the sample size, and n is the number of variables. This formula is adapted from Kendall and Stuart [1961, v. 2, p. 341] and is accurate as an approximation within 1/2N.
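Formula (1) can be applied directly to a population correlation matrix to build the "expected value" reduced matrix used below. The following sketch (numpy; the 3-variable matrix and sample size are hypothetical, not taken from the article) computes the population squared multiple correlations from the inverse of R and replaces the unit diagonal:

```python
import numpy as np

# Hypothetical 3-variable population correlation matrix and sample size
# (illustrative values, not from the article).
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
N = 100
n = R.shape[0]

# Population squared multiple correlations: R_j^2 = 1 - 1/(R^{-1})_jj.
r2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Formula (1): expected value of the sample squared multiple correlation.
ev = r2 + (n - 1) / (N - 1) * (1.0 - r2)

# "Expected value" (EV) reduced correlation matrix.
R_ev = R.copy()
np.fill_diagonal(R_ev, ev)
```

Since the correction term in (1) is positive whenever R_j^2 < 1, the EV diagonal always lies between the population squared multiple correlation and unity.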
The expected value of a sample Pearson correlation coefficient is the population parameter ρ within the same degree of precision, 1/2N [Kendall & Stuart, 1963, v. 1, p. 390]. The approximations improve as R_j^2 departs from 0.50 and as ρ departs from zero, respectively. The sample sizes assumed in the present study were 100 and 600. Consequently, the expected value of the correlation matrix with R̂_j^2 in the diagonal will be closely approximated by the population correlation matrix itself with the diagonal replaced by (1). Such matrices were used here as "expected value" (EV) correlation matrices. The characteristic roots and vectors of these expected value matrices were computed. The characteristic roots are given in Table 1. These roots and vectors will be referred to as EV roots and vectors.

An identification problem exists with respect to considering the sampling characteristics of characteristic vectors. Population characteristic vectors may be identified by their roots, so long as the latter are distinct, but this cannot be done in samples, since the sample roots will differ from sample to sample as well as from the population values. What does a sample characteristic vector estimate? How is it matched with any one population parameter rather than another?
Table 1
Characteristic Roots of Expected Value Correlation Matrices

        Complex,       Complex,       Univocal,      Univocal,      Complex,
        different,     similar,       different,     similar,       different,
        12-variable    12-variable    12-variable    12-variable    20-variable(a)
Root    100    600     100    600     100    600     100    600     100

 1    2.842  2.786   2.020  1.952   2.805  2.796   1.422  1.359   2.896
 2    1.080  1.001   1.128  1.057   1.040   .961   1.223  1.147   1.215
 3     .640   .561    .893   .825    .761   .672   1.074   .997    .838
 4     .171   .096    .623   .557    .342   .243    .973   .892    .386
 5     .015  -.067    .136   .070    .036  -.027   -.071  -.152    .152  .072
 6    -.008  -.067    .018  -.064    .036  -.027   -.071  -.152    .116  .072
 7    -.008  -.071    .018  -.064   -.018  -.063   -.078  -.153    .116  .072
 8    -.126  -.108    .001  -.066   -.018  -.063   -.078  -.153    .112  .072
 9    -.040  -.120   -.055  -.129   -.049  -.138   -.085  -.155    .112  .056
10    -.066  -.128   -.081  -.152   -.049  -.138   -.085  -.155    .112  .036
11    -.111  -.176   -.162  -.226   -.076  -.155   -.090  -.161    .105  .005
12    -.116  -.178   -.168  -.232   -.076  -.155   -.090  -.161    .073 -.003

(a) The second column of entries in rows 5-12 contains roots 13-20 for N = 100; this matrix was not studied at N = 600.

Discussing the whole set of characteristic vectors of a matrix is not fruitful because, on the one hand, they may be permuted, provided the roots are correspondingly permuted, and thus their order of listing is not fixed. On the other hand, any one set of characteristic vectors is always an orthogonal transformation of any other set of the same order. That is, any sample set is a transformation of any population set of the same order, and there is no way to guarantee circumstances under which the transformation will be an identity transformation. The approach to the identification problem taken here was to use the rank in magnitude of characteristic roots to identify the vectors, both population and sample.
The sampling problem was then posed as follows: when sample and population normalized characteristic vectors are ordered by magnitude of characteristic root, what size can one expect for the elements of the transformation that carries the sample vectors into the population?

If two symmetric matrices are of the same order, then matrices composed of their normalized characteristic vectors are orthogonal transformations of each other, because the characteristic vectors of each set are orthogonal. The EV and sample characteristic vector matrices, V and V̂, are such matrices; their columns are n characteristic vectors, so

(2)    VV' = V̂V̂' = I.

Then

(3)    V̂(V̂'V) = V,

so V̂'V is a transformation that changes V̂ into V. Its elements are also the scalar products of each vector of V with each of those of V̂ (from the definition of the scalar product). Also, because the scalar product, Σ x_i y_i, between vectors X and Y is

(4)    c = |X| |Y| cos θ,

where |X| and |Y| are the respective lengths of X and Y [Birkhoff & MacLane, 1953, p. 158], the elements of V̂'V are the cosines of the angles between their constituent vectors.

The procedure used here was to generate a number of sample values of reduced R, compute their characteristic vector matrices V̂, form the product V̂'V, and tabulate the distribution of each entry of T = V̂'V. If t_mM, the scalar product (cosine) between sample vector m and population vector M, is typically small for m ≠ M, and close to unity for m = M, then the sample vectors resemble the corresponding EV factors closely. If the t_mM are not near zero and unity, this implies that the sample vectors are hodgepodge combinations of the EV ones. These t_mM are also the "Tucker phis" [Tucker, 1951] between sample and population principal factors, since for normalized vectors the phi coefficient reduces to the cosine.
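The construction of T = V̂'V can be sketched as follows (a toy 3-variable example in numpy; a small symmetric perturbation stands in for sampling error, and all values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "population" reduced matrix and a slightly perturbed "sample" version
# (hypothetical; the article's matrices are 12- and 20-variable).
A = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
E = rng.normal(scale=0.01, size=(3, 3))
B = A + (E + E.T) / 2               # keep the perturbation symmetric

def ordered_eigvecs(M):
    """Columns are eigenvectors of a symmetric matrix, ordered by
    descending characteristic root."""
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1]]

V = ordered_eigvecs(A)               # "EV" vectors
V_hat = ordered_eigvecs(B)           # "sample" vectors
T = V_hat.T @ V                      # T[m, M] = cosine between vectors m and M
```

Because both vector sets are orthonormal, T is itself orthogonal, and with a small perturbation its diagonal entries sit near ±1 (the sign of an eigenvector is arbitrary).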
The overall index of a vector's (and factor's) sampling stability was the root mean square, over samples, of the sample t_mM. The mean is of little interest since the sign of a characteristic vector is arbitrary. These t_mM were compared to various characteristics of the respective EV matrices in an attempt to discover the parameters which influence the size of t_mM.

It is also of interest to find some overall index of the degree to which a given population vector, and the corresponding factor, are "recoverable" by rotation of the sample principal factors. Equation (3) repeats the well-known fact that a given population vector is always completely recoverable from the complete set of sample factors, but this is certainly not the case if only the r sample vectors corresponding to the r largest roots are rotated, as is the typical case in applications of factor analysis. The sum of the squared cosines between a population vector and the r largest sample vectors is a good index of its recoverableness. It is the phi that could be obtained, on the average, between sample and population vectors by appropriate rotation of the sample vectors corresponding to the largest roots.

One hundred sample correlation matrices were derived by the Pennell and Young [1967] procedure for each population matrix and sample size combination. Sample and expected value characteristic vectors were arranged by order of size of characteristic root. Then each sample characteristic vector matrix was multiplied by the transpose of the appropriate EV characteristic vector matrix, yielding the cosines of the angles between sample and population vectors. The root mean squares over samples of the entries in this cosine matrix were computed as an index of the sampling stability of the characteristic vectors.

Results

Two of the nine matrices of these root mean square cosines are given in Table 2.
It is immediately apparent that the angle of rotation between sample and EV vectors depends on the difference in size of the corresponding EV characteristic roots. This seemed reasonable on intuitive grounds. It remained to attempt to specify the nature of the relationship more exactly. Closer examination led to the hypothesis that it was the difference in the square roots of the characteristic roots, rather than the roots themselves, that was important. This hypothesis was tested for the positive EV roots, using Fig. 1, which plots the root mean square t_mM, the cosine of the angle between sample vector m (m = 1, 2, ..., n) and EV vector M (M = 1, 2, ..., n), against the reciprocal of the difference in square roots of EV characteristic roots, |λ_m^(1/2) - λ_M^(1/2)|^(-1). The plots appear to be linear in the lower ranges.

Table 2
Mean Squared Cosines between Sample and Population Factors
(Decimals omitted. Rows are sample factors 1-12; columns are population factors 1-12. Upper panel: univocal, same-sized factors, N = 100; lower panel: complex, different-sized factors, N = 600.)

Figure 1
Root mean square cosine between sample and EV vectors plotted against the reciprocal of the root difference. Upper panel: N = 600 (12-variable CS and 12-variable CD matrices); lower panel: N = 100 (12-variable CS, 12-variable CD, and 20-variable CD matrices).

Recovery of Population Factors from Sample Data

Table 3 presents data on the degree of resemblance (phi-squared) between a sample factor and the corresponding population factor, the factors being identified by order of root size. The entries on each line correspond to the diagonal entries of Table 2, but data are presented for all nine matrices studied. With a few exceptions, these entries are large for all the real factors and small for the error factors. Those entries for real factors that are not high (less than .85) correspond to instances where there were one or more population roots, whether real or error, which were nearly equal in size to the given one. For N = 100, there will be a high degree of resemblance (phi-squared greater than .81) between sample and population factors only in those few cases where the root is really different from all the others. There will be a fair degree of resemblance on the average (phi-squared greater than .50) in the majority of cases, but difficulty in recognizing all the factors is to be expected in a fair proportion of samples. It may also be noted that there is very little resemblance between sample and population "error" factors. A low resemblance between corresponding factors will not be serious if the r largest sample factors are simply a transformation of the r largest population factors.
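The sum-of-squared-cosines index of "recoverableness" introduced earlier can be computed directly from the cosine matrix T. The helper below is a hypothetical sketch (not from the article), assuming both vector sets are orthonormal, so the maximum squared cosine with a linear combination of the r largest sample vectors equals the sum of the individual squared cosines:

```python
import numpy as np

def multiple_phi_squared(T, r):
    """T[m, M] is the cosine between sample vector m and population vector M
    (both sets orthonormal).  For each population vector, returns the maximum
    squared cosine attainable with a linear combination of the r largest
    sample vectors: the sum of its squared cosines with those vectors."""
    return (np.asarray(T)[:r, :] ** 2).sum(axis=0)

# With an orthogonal T, every population vector lies exactly in the space
# of all n sample vectors, so the index must reach 1.0 at r = n.
rng = np.random.default_rng(2)
T, _ = np.linalg.qr(rng.normal(size=(4, 4)))
phi2_all = multiple_phi_squared(T, 4)
phi2_two = multiple_phi_squared(T, 2)
```

Truncating to r < n can only shrink the index, so values near 1.0 indicate that the population factor lies almost entirely within the retained sample-factor space.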
Data on the question of whether or not this will be the case are presented in Table 4. Its entries are "multiple phi-squared," the maximum cosine-squared possible between a population factor and a linear combination of the four largest sample factors. Thus it is a measure of the degree to which a population factor is within the space of the sample factors. Here, the results are very encouraging for all the N = 600 data, but less so for N = 100. As long as the real roots are all substantially greater than the error roots in the population, the factors are recoverable, but, bearing in mind that these figures are averages, it appears that when this is not the case one or even two population factors may not be recoverable, even when a Procrustes rotation is used. The situation is especially bad in the case of the 20-variable matrix, where the presence of 16 error dimensions seems to provide that many more ways in which the solution may wobble.

Table 4's data concern the "recoverableness" of population principal factors, and one can wonder about the "composition" of the sample factors, i.e., the degree to which a given sample factor includes real rather than error variance. The symmetry of the matrices in Table 2 shows that the proportion of a sample factor's variance that is from real rather than error population factors will be virtually identical to the "recoverableness" of the corresponding population factor. Therefore, when N was 600, even the smallest sample factor is almost completely derived from real variance, but when N = 100 the fourth (and sometimes even the third) sample factor contains a good proportion of error when the corresponding root is small.

Table 3
Mean Square Cosines (Phi-Squared) between Population Factor and Corresponding Sample Factor

             "Real" factors             "Error" factors
Matrix      1     2     3     4      5     6     7     8     9    10    11    12
N = 100
CS       .878  .694  .638  .651   .172  .188  .170  .121  .135  .152  .227  .204
CD-12    .962  .820  .710  .438   .177  .166  .132  .093  .103  .138  .214  .150
US       .641  .552  .539  .566   .156  .122  .118  .139  .143  .118  .118  .117
UD       .955  .732  .659  .551   .231  .155  .341  .317  .138  .150  .194  .177
CD-20*   .915  .662  .541  .307   .161  .143  .132  .105  .058  .050  .056  .051
                                  .057  .074  .059  .063  .060  .049  .050  .055
N = 600
CS       .983  .895  .850  .916   .213  .251  .158  .221  .263  .311  .400  .471
CD       .933  .966  .949  .833   .190  .233  .155  .161  .146  .175  .299  .310
US       .814  .720  .685  .732   .115  .101  .152  .147  .138  .113  .143  .122
UD       .991  .934  .921  .916   .160  .425  .120  .256  .222  .189  .239  .265

*Factors 13-20 on second line

Table 4
"Recoverableness" of Population Factors: Average Multiple-Phi-Squared with Four Largest Sample Factors

             "Real" factors             "Error" factors
Matrix      1     2     3     4      5     6     7     8     9    10    11    12
N = 100
CS       .972  .938  .906  .865   .052  .049  .058  .041  .033  .034  .017  .020
CD-12    .984  .939  .869  .537   .121  .091  .090  .104  .101  .060  .052  .051
US       .951  .944  .932  .819   .024  .036  .035  .031  .029  .029  .030  .036
UD       .982  .925  .891  .695   .092  .110  .011  .009  .074  .078  .060  .063
CD-20*   .950  .814  .718  .495   .090  .094  .085  .084  .068  .076  .082  .070
                                  .064  .066  .071  .061  .066  .048  .046  .034
N = 600
CS       .995  .990  .964  .979   .007  .010  .009  .011  .029  .035  .043  .046
CD-12    .997  .990  .977  .844   .005  .004  .003  .003  .019  .014  .007  .009
US       .993  .992  .990  .989   .004  .004  .004  .004  .004  .004  .004  .004
UD       .994  .988  .981  .949   .002  .002  .017  .016  .043  .014  .014  .010

*Factors 13-20 on second line

One may conceptualize the results from the point of view of factor analysis as follows. In factor analysis, only the r largest roots and the corresponding vectors are retained for further study, the remainder being discarded. The population factor matrix F_r is defined as

(10)    F_r = V H_r Λ_r^(1/2),

in which H_r is the supermatrix

(11)    H_r = [ I_r ]
              [  0  ],

I_r being an r by r identity matrix and 0 an (n - r) by r zero matrix. In samples, we have the corresponding equation

(12)    F̂_r = V̂ H_r Λ̂_r^(1/2),

or, since V̂ = VT',

(13)    F̂_r = V T' H_r Λ̂_r^(1/2).

Some idea of the influences affecting the difference between F_r and F̂_r can be gathered by expressing (13) in supermatrices, partitioning T'H_r into T_rr' stacked over T_(r,n-r)'. Carrying out the multiplication,
(17)    F̂_r = V_r T_rr' Λ̂_r^(1/2) + V_(n-r) T_(r,n-r)' Λ̂_r^(1/2).

Let us consider the two parts of the right side of (17) separately. The second states in effect that insofar as T_(r,n-r) contains non-zero elements, the error vectors of the EV matrix will be involved in the sample factor matrix. This section of T is the one involving cosines between "real" and "error" vectors. The evidence presented earlier argues that these coefficients will be near zero if the differences in the corresponding roots are large. These differences will be large if the first r EV roots are all large and the remaining n - r are near zero or negative.

Members of the two sets can approach each other in size from either direction. One or more of the error roots can be moderate sized if all the E(R̂_j^2) are not accurate communality estimates. Consideration of conditions under which one or more of the r largest are relatively small suggests that this will be the case if the rotated factors do not exhibit good orthogonal simple structure, or if all factors are not equally well defined, but those are only preliminary suggestions.

It may be noted that neither Λ_(n-r) nor Λ̂_(n-r) enters into (17) directly: the absolute size of their elements is not important. What is important is that all of the elements of Λ_(n-r) be as different as possible from those of Λ_r. The first part of the expression may not be as great an influence on sampling fluctuations as the second, especially if the possibility of rotation is introduced. The transformation H which gives a least squares fit of F̂_r to F_r = V_r Λ_r^(1/2) is

(18)    H = (F̂_r' F̂_r)^(-1) F̂_r' F_r

(see Mosier, 1939), provided Λ̂_r and T_rr are non-singular. The transformation is oblique in general. If the rightmost part of (17) is zero, then the fit is exact, although the transformation remains oblique. In either case, the degree of obliquity depends on how different the λ̂_m and λ_M (m, M ≤ r) are, both within and between sets, since if they were all equal their matrices would be scalar and would cancel each other, leaving T_rr'. This will be an orthogonal transformation if T_(r,n-r) = 0.
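The split in (17) can be checked numerically: with V̂ = VT', the sample factor matrix decomposes exactly into a part lying in the space of the r largest EV vectors and a part drawn in from the remaining "error" vectors through T_(r,n-r). A sketch with hypothetical toy matrices (numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2

# Hypothetical symmetric "population" (EV) matrix and a perturbed "sample".
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0
E = rng.normal(size=(n, n))
B = A + 0.01 * (E + E.T)

def eig_desc(M):
    """Roots and vectors of a symmetric matrix, ordered by descending root."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

lam, V = eig_desc(A)              # EV roots and vectors
lam_hat, V_hat = eig_desc(B)      # sample roots and vectors
T = V_hat.T @ V                   # cosine matrix

# Sample factor matrix from the r largest sample vectors and roots.
L_half = np.diag(np.sqrt(np.abs(lam_hat[:r])))
F_hat = V_hat[:, :r] @ L_half

# Decomposition (17): a "real" part in the space of the r largest EV
# vectors plus an "error" part drawn in through T_{r,n-r}.
part_real = V[:, :r] @ T[:r, :r].T @ L_half
part_error = V[:, r:] @ T[:r, r:].T @ L_half
```

The identity part_real + part_error = F̂_r holds exactly (to floating-point precision), since it is pure algebra; what sampling determines is only how large the error part is.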
These considerations support the practice of looking for a break in the size of the characteristic roots and rotating those factors corresponding to roots larger than the break point. The reasoning here is that the sample characteristic roots will follow the EV roots rather closely in size (cf. Cliff and Hamburger, 1967); therefore, a sample break will correspond to an EV break, and the elements of T_(r,n-r) will be small if r is taken as indicated. In that case the r retained factors will be relatively uncontaminated by the remaining ones. This is no guarantee that r is the number of factors in the population, but the indication is that even if there are one or two more population factors, any additional sample factors retained will include a substantial proportion of error.

The present results may explain a minor aspect of Pennell's [1968] data. He studied the influence of loading size and sample size upon the sampling errors of loadings and found that, except for two instances, the sampling variances of loadings, as determined empirically, were proportional to N^(-1) and (1 - h^2)^2. The present results suggest an explanation for his two disparate instances. The hypothesis is that in those two instances the EV roots for factors beyond r were of moderate size, were close in magnitude to the smaller real roots, and the variables studied had appreciable loadings on the corresponding vectors. This could appreciably increase the instability of the loadings, since it would have the effect of randomly adding to or subtracting from the loading, or, looked at another way, give the test vector the appearance of "wobbling" in the common factor space. The latter occurs not simply because the vector wobbles, but because the definition of the common factor space changes from sample to sample. Suppose, for example, there was a good-sized difference between the fourth and fifth roots, but the fifth, sixth, seventh, etc., were of nearly equal size.
One should resist the temptation to keep five factors, even if he has theoretical grounds for doing so, because the fifth will contain a substantial proportion of variance from the sixth, seventh, etc., population factors, which are presumably error. It is tempting, although undoubtedly premature, to take Formula (9) literally and decide on the number of factors by computing the complete table of differences in successive roots, the corresponding cosines, and the resulting expected proportions of variance from smaller factors. If further studies were to confirm the present ones, it would be worthwhile to work out such decision procedures in more detail. They would be based not on the "significance" of a factor but on the proportion of real variance in it, although the nature of the currently available significance tests would indicate that the decisions would be similar.

The results presented here are for the principal-factors-with-squared-multiples procedure for getting factor loadings. This procedure, while it is currently very widely practiced, does not have a very well-integrated theoretical basis. Our results are designed to be of interest to users of this method. Currently, a related but much better grounded procedure for getting factor loadings is gaining acceptance. This is the Rao [1955] procedure which Harris [1962, 1967] has discussed. In it, rather than subtracting S^2 = I - R_j^2 from the diagonal of the correlation matrix, the matrix is pre- and post-multiplied by S^(-1). Thus S^(-1)RS^(-1) is factored rather than R - S^2. Jöreskog [1963] has done a Monte Carlo study of S^(-1)RS^(-1), but he reports results only for individual loadings rather than for the complete vectors. He did find, however, what he termed an "ill-conditioned case," one in which the roots were very nearly equal. This gave loadings with very high sampling errors, indicating that the same kinds of effects will occur, but the parametric question of their magnitude is left open.
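The contrast between the two reductions can be sketched as follows (a hypothetical 3-variable R; both constructions use the residual variances S_j^2 = 1 - R_j^2 = 1/(R^{-1})_jj):

```python
import numpy as np

# Hypothetical 3-variable correlation matrix (not from the article).
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

# Residual variances: S_j^2 = 1 - R_j^2 = 1/(R^{-1})_jj.
s2 = 1.0 / np.diag(np.linalg.inv(R))
S_inv = np.diag(1.0 / np.sqrt(s2))

scaled = S_inv @ R @ S_inv       # the matrix factored in the Rao/Harris approach
reduced = R - np.diag(s2)        # classical reduced matrix: diagonal is R_j^2
```

The scaled matrix has diagonal entries 1/S_j^2 ≥ 1, which is the sense in which the S^(-1) multiplication tends to push the error roots toward a common value rather than toward zero.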
It would be of considerable interest to study the sampling behavior of the characteristic vectors of S^(-1)RS^(-1). Since the multiplication by S^(-1) tends to have the effect of equalizing the error roots, the sampling behavior of the characteristic vectors may be improved slightly for a given correlation matrix. Browne's [1968] results are somewhat in support of this view. Although he did not explicitly study S^(-1)RS^(-1), he did study a closely related matrix, and found its sampling behavior to be slightly better than that for R - S^2. Again, parametric information of the present kind is not available from his study.

REFERENCES

Birkhoff, G. & MacLane, S. A brief survey of modern algebra. (Rev. ed.) New York: Macmillan, 1953.
Browne, M. W. A comparison of factor analytic techniques. Psychometrika, 1968, 33, 267-334.
Cliff, N. & Hamburger, C. D. The study of sampling errors in factor analysis by means of artificial experiments. Psychological Bulletin, 1967, 58, 430-445.
Cliff, N. & Pennell, R. The influence of communality, factor strength, and loading size on the sampling characteristics of factor loadings. Psychometrika, 1967, 32, 309-326.
Harris, C. W. Some Rao-Guttman relationships. Psychometrika, 1962, 27, 247-263.
Harris, C. W. On factors and factor scores. Psychometrika, 1967, 32, 363-379.
Jöreskog, K. G. Statistical estimation in factor analysis. Stockholm: Almquist and Wiksell, 1963.
Kendall, M. G. & Stuart, A. The advanced theory of statistics. Vol. 2. London: Charles Griffin, 1961.
Kendall, M. G. & Stuart, A. The advanced theory of statistics. (2nd ed.) Vol. 1. London: Charles Griffin, 1963.
Pennell, R. J. & Young, F. W. An IBM 7094 program for generating random factor matrices. Behavioral Science, 1967, 12, 165-166.
Pennell, R. J. The effect of communality and N on the sampling distributions of factor loadings. Psychometrika, 1968, 33, 423-440.
Tucker, L. R. A method for synthesis of factor studies. Personnel Research Section Report No. 984. Department of the Army, 1951 (mimeo.).

Manuscript received 12/30/67
Revised manuscript received ~/~5/69