A Method for Detecting Structure in Sociometric Data
Author(s): Paul W. Holland and Samuel Leinhardt
Source: American Journal of Sociology, Vol. 76, No. 3 (Nov., 1970), pp. 492-513
Published by: The University of Chicago Press
Stable URL: http://www.jstor.org/stable/2775735

Paul W. Holland,1 Harvard University
Samuel Leinhardt,2 Carnegie-Mellon University

The authors focus on developing standardized measures for models of structure in interpersonal relations. A theorem is presented which yields expectations and variances for measures based on triads.
Random models for these measures are discussed and the procedure is carried out for a model of a partial order. This model contains as special cases a number of previously suggested models, including the structural balance model of Cartwright and Harary, Davis's clustering model, and the ranked-clusters model of Davis and Leinhardt. In an illustrative example, eight sociograms are analyzed and the general model is compared with the special case of ranked clusters.

1 Research supported by National Science Foundation grants GP-8774 and GS2044X1.
2 Research supported by a Social Science Research Council postdoctoral fellowship.

1. INTRODUCTION

A variety of models have been proposed which relate group structure to interdependences among interpersonal relations. Graph theory, because it is concerned with the characteristics of sets of points connected by relations, has been a natural language in which to express these models of small-scale social structure. In their classic paper, Cartwright and Harary (1956) used graph theory to restate Heider's (1946) balance theory and proposed a theorem which, by showing that balance implied the dichotomization of groups along lines of interpersonal sentiment, provided an important insight into the nature of the social structure of groups. Davis (1967) recognized that the decomposition of groups into only two cliques was simply not empirically demonstrable; he expanded upon structural balance by indicating what conditions were necessary for clustering, the development of one or more cliques. Balance, thus, was made a special case of a more general graph theoretic model which seemed more adequately to describe the structure of interpersonal sentiment. Following this work, Davis and Leinhardt (1970) combined two common social structural components, status and clusters, by showing how directed sentiment relations could generate a structure incorporating a system of hierarchically arranged cliques. In a recent report (Holland and Leinhardt 1970), we have shown that even this ranked-clusters model is a special case of a more generally applicable structural model, that of a partial order, and, incidentally, with this model we have reestablished the connection to Heider's theory by proposing that transitivity is an important structural property of positive interpersonal sentiment (Heider 1958).

The most general import of these mathematical models is that they demonstrate how complex networks can be the result of interdependencies among interpersonal sentiment relations. Nonetheless, while meaningful theoretical insights can result from expressing social theory mathematically, these are not sufficient in themselves to argue for the acceptance of theory. It still remains for empirical verification to help us distinguish formal theory from formal nonsense; in this effort graph theory, because of its deterministic quality, has limited use. For example, a statement which implies that a group cannot be balanced if a line between two points serves to link two otherwise disconnected components may follow logically from the axioms of graph theory; it does not make much sense in the logic of empirical sociology. The problem is no less severe in models which purport to be closer representations of reality. A group may similarly fail to be partially ordered or possess a ranked clustering because of one contradictory line. The added complication in these cases is that the offending relation is not readily observed.

If these problems are to be avoided and these models are to be of scientific use, they must be expressed probabilistically and measures which gauge the fit of empirical data to them must be developed. However, when deterministic graph theoretic statements are replaced by propositions of tendency, the acceptance or rejection of a hypothesis becomes complicated, and techniques are necessary which permit the statistical significance of a measured tendency to be judged. While some effort has been made to develop measures of tendency, strength, and fit for the various structural models, there has been only limited discussion of randomness in this context and only meager information exists on the distributions of these structural indices (Harary, Norman, and Cartwright 1965, pp. 339-62; Davis and Leinhardt 1970). Thus, it is extremely difficult (if not impossible) to know the significance of some measured tendency or to gauge, in general, the closeness with which the graph theoretic models of structure describe social behavior.

It is our aim in this paper to present techniques which can assist in solving this problem by developing an index of transitivity for a general model, a partial order, noting that the procedure is applicable to all special cases. We then present a theorem which we use to generate expectations and variances for our structural model. This is followed by a discussion of the appropriateness of several random models for sociometric data, and an explanation of why we have chosen one which constrains pairs. This random distribution is then employed in generating tables of probabilities for our standardized transitivity measure. The approximate normality of this measure is substantiated through the analysis of a simulation study. To demonstrate the use of these techniques, eight sociograms are analyzed and the results obtained are compared with those for the ranked-clusters model of Davis and Leinhardt.

2. A STRUCTURAL MODEL

Sociometric data are often in the form of a set of individuals, X, together with a binary "choice relation," C, defined on X by xCy, if and only if person x chooses person y in the sociometric test. (To avoid trivial exceptions, we make the convention that xCx for all x, even though there is no implication that x actually chooses himself in the sociometric test.) In this paper, sociometric data, that is, (X,C), will be said to exhibit structure if the choice relation C is transitive. In technical terms, this means that for any x, y, z in X, if xCy and yCz, then xCz.
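The transitivity condition just stated can be checked mechanically. The following is a minimal sketch, assuming the sociogram is stored as a Python set of ordered pairs (x, y) meaning xCy; the function name and data layout are illustrative and do not come from the paper. Because of the convention xCx, only triples of distinct individuals need to be examined.

```python
from itertools import permutations

def is_transitive(choices, people):
    """Return True if xCy and yCz imply xCz for all distinct x, y, z.

    `choices` is a set of ordered pairs (x, y) meaning "x chooses y".
    Self-choices are ignored, following the convention that xCx always holds.
    """
    for x, y, z in permutations(people, 3):
        if (x, y) in choices and (y, z) in choices and (x, z) not in choices:
            return False  # found one intransitivity
    return True

# A three-person chain with the "shortcut" choice present is transitive;
# removing the shortcut makes it intransitive.
print(is_transitive({("a", "b"), ("b", "c"), ("a", "c")}, ["a", "b", "c"]))
print(is_transitive({("a", "b"), ("b", "c")}, ["a", "b", "c"]))
```

The check visits every ordered triple, so its cost grows with the cube of the group size, which is unproblematic for groups of the sizes discussed here.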
Briefly, our structural model is that (X,C) is a partially ordered set (not necessarily antisymmetric), and the central aim of this paper is to present a method for detecting tendencies in the direction of this type of structure which can be applied to sociometric data. For our present purposes, transitivity is a convenient ideal structural model against which we may compare actual sociograms.

The assumption that C is transitive is general and contains as special cases several earlier models: for example, balance (Cartwright and Harary 1956), clustering (Davis 1968, pp. 544-51), ranked clustering (Davis and Leinhardt 1970), transitive tournaments (Landau 1951a, 1951b, 1953), and quasi-series (Hempel 1952, pp. 58-62). We discuss the substantive basis and implications of this model and relate it to previous ones elsewhere (Holland and Leinhardt 1970); consequently we will only consider these important topics briefly in the next section of this report.

3. TRANSITIVITY AND TRIADS

In a given sociogram, to verify whether or not C is transitive, every ordered triple of distinct individuals (x,y,z) must be examined. If xCy and yCz, then for C to be transitive xCz must also hold. If for some triple x,y,z it occurs that xCy and yCz but x does not choose z, then C is intransitive. On the other hand, if xCy but y does not choose z, or if yCz but x does not choose y, or if neither xCy nor yCz, then the question of transitivity is not relevant because the hypothesis of the transitivity condition (that is, the "if" component) is not satisfied; x may or may not choose z without contradicting the transitivity of C. When the hypothesis of the transitivity condition is not satisfied for a particular triple x,y,z, their relationship to each other is vacuously transitive.

Figure 1 shows all of the sixteen possible relationships that can obtain between any three people in a sociogram. A single arrow x -> y means that xCy but y did not choose x (an asymmetric pair).
A double arrow x <-> y means that xCy and yCx (a mutual pair). No arrow means that neither x nor y chose the other (a null pair). The triads are labeled according to the number of mutual, asymmetric, and null pairs (in that order) which they contain. Thus, a 201 triad contains two mutual, no asymmetric, and one null pair. Nonisomorphic triad types with equal numbers of pair types are further differentiated by letters (e.g., 021U, 021D, and 021C).

On the left side of figure 1 are the nine triad types that exhibit no intransitivities (the transitive and vacuously transitive triads); on the right side are the seven triad types that exhibit at least one intransitivity (the intransitive triads). By an intransitivity, we mean that for some orientation of three individuals, say x,y,z, we have xCy and yCz but x does not choose z. To say that C is a transitive relation on X is equivalent to saying that none of the seven intransitive triads appear in the sociogram. To discover whether or not C is transitive for a given sociogram, it is sufficient to examine every unordered triple of distinct individuals to see if the triad they form falls on the left or the right side of figure 1. If they all fall on the left, C is transitive; otherwise, it is not.

By considering the triads of figure 1, we may see how the several structural models we have mentioned can be considered to be special cases of a partial order. Were we unable to find in our examination of a graph's triples either intransitive or 012 triads, then we would have a ranked clustering. Additional absences of 003, 021U, 021D, and 102 triads would mean that a quasi-order existed. If only 003 and 102 triads appeared, the structure would be a clustering. If no 003 triads occurred and only 102 triads were found, balance would be indicated. Finally, 030T triads as the sole arrangements of triples would mean that the graph was a transitive tournament.
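The triad-by-triad examination described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the sociogram is a set of ordered pairs (x, y) meaning xCy, the function names are hypothetical, and the label returned is just the three-digit mutual/asymmetric/null count without the distinguishing letters (U, D, C, T).

```python
from itertools import combinations, permutations

def man_label(choices, triple):
    """Three-digit label: numbers of mutual, asymmetric, and null pairs."""
    m = a = n = 0
    for x, y in combinations(triple, 2):
        forward, backward = (x, y) in choices, (y, x) in choices
        if forward and backward:
            m += 1
        elif forward or backward:
            a += 1
        else:
            n += 1
    return f"{m}{a}{n}"

def is_intransitive(choices, triple):
    """True if some ordering x, y, z of the triple has xCy, yCz, but not xCz."""
    return any(
        (x, y) in choices and (y, z) in choices and (x, z) not in choices
        for x, y, z in permutations(triple)
    )

def count_intransitive_triads(choices, people):
    """Number of unordered triples that form an intransitive triad."""
    return sum(is_intransitive(choices, t) for t in combinations(people, 3))

# A three-cycle is the 030C triad and is intransitive.
cycle = {(1, 2), (2, 3), (3, 1)}
print(man_label(cycle, (1, 2, 3)), count_intransitive_triads(cycle, [1, 2, 3]))
```

C is transitive exactly when `count_intransitive_triads` returns zero, which matches the "all triads fall on the left side of figure 1" criterion.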
Returning to our own model, for a "live" sociogram, C will almost certainly not be transitive. Therefore, the number of intransitive triads possessed by the sociogram is a measure of how transitive C is. If there are only a few intransitive triads in the sociogram, then C is tending toward transitivity. On the other hand, if there are about as many intransitive triads as would be expected by chance, then there is little evidence that C has a tendency to be transitive. Let T be the number of intransitive triads in a given sociogram and μ_T and σ_T be the mean and standard deviation of T under the null hypothesis that the sociogram is "random" (precisely what this means will be discussed in detail in the next section). Define the index τ by:

$$\tau = \frac{T - \mu_T}{\sigma_T}. \tag{1}$$

[Figure 1 (diagram not reproduced). Transitive and vacuously transitive triads: 003, 012, 102, 021D, 021U, 030T, 120D, 120U, 300. Intransitive triads: 021C, 111D, 111U, 030C, 201, 120C, 210.]
FIG. 1. All sixteen triad types arranged vertically by number of choices made and divided horizontally into those with no intransitivities and those with at least one.

The measure τ is our proposed transitivity index. It is a minimum when C is transitive and has mean zero and variance one under the assumption that the sociogram is "random."

This type of structural index is similar to that proposed in the paper of Davis and Leinhardt (1970). Their structural model is also defined in terms of triads, but their "non-permissible" triads include one which our model considers to be permissible (i.e., transitive). Their structural index is

$$\delta = \frac{D - \mu_D}{\mu_D}, \tag{2}$$

where D is the number of nonpermissible triads (in their sense) and μ_D is its expected value under the chance hypothesis. The index δ has a minimum of -1.00 and may be expressed in terms of the percentage of fewer nonpermissible triads than expected. When their paper was written, the variance of δ was not known and thus significance levels could not be given for it.
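As a small worked illustration of the two indices, the sketch below evaluates equations (1) and (2) for made-up numbers. The function names and the numerical inputs are hypothetical; in practice the chance mean and standard deviation come from the formulas of section 5.

```python
def tau_index(T, mu_T, sigma_T):
    """Equation (1): standardized count of intransitive triads."""
    return (T - mu_T) / sigma_T

def delta_index(D, mu_D):
    """Equation (2): proportional shortfall of nonpermissible triads."""
    return (D - mu_D) / mu_D

# Hypothetical values: 4 intransitive triads observed where chance predicts
# 10 with standard deviation 3 gives tau = -2.0.
print(tau_index(4, 10.0, 3.0))
# With no nonpermissible triads at all, delta reaches its minimum of -1.0.
print(delta_index(0, 12.0))
```

Negative values of either index indicate fewer offending triads than the chance model predicts; only τ is scaled by a standard deviation, which is what allows it to be referred to a normal table.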
There are various theoretical reasons why τ and δ should both have approximate normal distributions under the chance hypothesis; in section 5 we shall indicate the results of a simulation study that support this conjecture. Because τ is in standardized form, approximate significance levels for an observed value of τ may be obtained by referring it to a table of the percentage points of the normal distribution. The appropriate test is one-sided, and rejection of randomness in favor of the transitivity model is indicated by sufficiently large negative values of τ.

The actual computations of T from a sociogram, and of μ_T and σ_T from the formulas given in section 5, are very tedious. Consequently, in practice the value of τ must be found with the aid of a computer. One of us (Leinhardt) has written a FORTRAN IV program that does sociometric analyses, including the computation of τ and δ.3

It should also be pointed out that τ and δ may be viewed as generalizations of the index of intransitivity given by Kendall and Smith (1939) for paired comparison data. In that case, there are no mutual or null choices; therefore only two possible triads can occur, and one of these falls on the right side of figure 1. Their measure counted all the examples of intransitive triads in the graph.

3 This program has been used on the Harvard Computer Center's IBM 360-65, and Carnegie-Mellon University's IBM 360-67. Copies may be obtained from Samuel Leinhardt, Carnegie-Mellon University, School of Urban and Public Affairs, Pittsburgh, Pennsylvania 15213.

4. RANDOM DISTRIBUTIONS IN SOCIOMETRY

The use of "random" distributions to evaluate the statistical significance of observed features in sociograms is a standard practice. Moreno (1953) simulated "random groups" by hand before the theoretical analyses were available. We feel that there are certain subtleties that arise in the choice of random distributions which have not received proper attention. Consequently, this section is devoted to developing a guideline for choosing random distributions for sociometric analyses.

It is convenient to discuss random models in sociometry in the context of sociomatrices rather than sociograms. Let X_ij = 1 if person i chooses person j, and X_ij = 0 otherwise. Let X_ii = 0 by convention; (X_ij) is the g × g sociomatrix. The row total X_i+ = X_i1 + X_i2 + ... + X_ig is the number of choices made by person i. The column total X_+j = X_1j + X_2j + ... + X_gj is the number of choices received by person j. The number of mutual choices in the sociogram is

$$M = \sum_{i<j} X_{ij} X_{ji}.$$

An early finding was that sociometric status (the distribution of the X_+j) is not in accord with what would be expected by chance. In this case "chance" means that each person i distributed his X_i+ choices at random to the g - 1 other group members. More technically, this means that the "chance" distribution of the X_+j was computed under the assumption that all sociomatrices, (X_ij), with the given values of the row totals X_1+, ..., X_g+ are equally likely. In other words, the chance distribution of choices received was computed conditionally on the given values of the choices made. In many applications all subjects make the same number of choices and this simplifies the theoretical calculations.

Another finding was that the number of mutual choices M was larger than expected by chance. As before, "chance" means that the distribution of M was computed under the assumption that all sociomatrices (X_ij) with the given row totals were equally likely. However, it seems to us that once the first finding (differential popularity) has been observed, it is inappropriate to use the same random distribution to evaluate the extent of mutuality. In particular, people receiving many choices are quite likely to be involved in mutual choices, while those receiving none cannot be involved in any. This suggests that, once differential popularity is detected, the appropriate chance model for evaluating tendencies toward mutual choices should assume that all sociomatrices (X_ij) with the given values of the column totals X_+1, ..., X_+g as well as the row totals X_1+, ..., X_g+ are equally likely. In other words, inferences about mutuality should be conditional on both the row and column totals of the sociomatrix. Katz, Tagiuri, and Wilson (1958) indicated the difference between the two approaches toward mutuality. Results of Katz and Powell (1954) are basic to further progress in this direction, but as yet these have not been successfully applied to this problem. This practical issue notwithstanding, let us pursue the idea further.

The structural measure τ proposed in section 3 is concerned with the chance distribution of triads. This is a step further into the structure of the sociogram beyond choices received and mutual choices. Accordingly, the appropriate chance distribution for T should fix (1) choices made X_i+, (2) choices received X_+j, and (3) mutual choices M, and assume that all sociomatrices (X_ij) with the given values of these quantities are equally likely. Unfortunately, the mathematical results needed to implement this analysis are not available. Until they are, a reasonable attitude is to ask how these three conditions can be relaxed in order to produce a feasible and yet reasonable chance distribution for T. One promising direction would be to fix choices made and the number of mutual pairs. Even this is not available at present. Weakening this criterion one further step leaves us with two possibilities: (1) fix choices made only, or (2) fix the number of mutual, asymmetric, and null pair relations. Since the first alternative is what we are trying to avoid, we have adopted the second. This random distribution was also used by Davis and Leinhardt (1970). Its advantage is that it allows us to eliminate the effect of the number of mutual, asymmetric, and null pairs in the group.
Its disadvantage is that it does not allow for the fact that everyone in the group may have made the same number of choices, nor does it allow for the effect of a "star" who receives significantly more choices than the others and the "isolate" who receives significantly fewer choices. Both of these additional constraints eventually should be brought into the analysis.

In detail, the random model we shall use to compute μ_T and σ_T is as follows. Let m, a, and n denote the actual number of mutual, asymmetric, and null pairs, respectively, in a given sociogram. Then m + a + n = g(g - 1)/2, the total number of pairs. Randomly and without replacement, these m (mutual), a (asymmetric), and n (null) pairs are distributed to the pairs of group members so that all arrangements are equally likely.

In practice this might be done as follows. Assume the individuals are numbered from 1 to g. Put g(g - 1)/2 balls, numbered consecutively from 1 to g(g - 1)/2, into an urn. Let ball 1 refer to the pair (1,2), ball 2 to pair (1,3), etc.; ball g - 1 to pair (1,g), ball g to pair (2,3), ball g + 1 to pair (2,4), etc.; and finally ball g(g - 1)/2 refer to pair (g - 1,g). This is a triangular enumeration of the unordered pairs. The balls are then drawn out of the urn, one at a time without replacement. To the pairs corresponding to the numbers on the first m balls that are drawn, assign mutual pairs. For the next a balls that are drawn, assign asymmetric pairs to the corresponding pairs of individuals. The directions of these asymmetric choices are then decided by a tosses of a fair coin. The remaining n pairs of individuals corresponding to the n undrawn balls are assigned null pair relations. If m = 0 and n = 0, this is the usual random distribution used in the analysis of paired comparisons.

5. THE DISTRIBUTION OF T

In this section, the mean and variance of T, the number of intransitive triads in a randomly constructed sociogram, are derived.
We also discuss the results of a simulation study that bears on the question of the approximate normality of the chance distribution of the standardized variable τ.

Some notation is necessary. Let the intransitive triad types 021C, 030C, 111D, 111U, 120C, 201, and 210 be called type 1, type 2, ..., type 7, respectively. Let T_i be the number of triads of type i that appear in a given sociogram. Then T, the total number of intransitive triads, is

$$T = T_1 + T_2 + \cdots + T_7. \tag{3}$$

From equation (3) and standard formulas for the mean and variance of a sum of random variables we have the relations

$$\mu_T = \sum_i E(T_i) \tag{4}$$

and

$$\sigma_T^2 = \sum_i \operatorname{Var}(T_i) + 2 \sum_{i<j} \operatorname{Cov}(T_i, T_j). \tag{5}$$

Hence, in order to calculate μ_T and σ_T² it is sufficient to compute E(T_i), Var(T_i), and Cov(T_i,T_j). Theorem 1, below, expresses these quantities in terms of certain probabilities which may be computed from the random model. In defining these probabilities it is convenient to let (1,2,3) denote the triad formed by the persons labeled 1, 2, and 3. The "type" of a triad will refer to the seven distinct types of intransitive triads listed above. The probabilities that appear in theorem 1 are defined as follows.

p(j) = Probability that triad (1,2,3) is of type j.
p_2(i,j) = Probability that triad (1,2,3) is of type i and triad (2,3,4) is of type j.
p_1(i,j) = Probability that triad (1,2,3) is of type i and triad (3,4,5) is of type j.
p_0(i,j) = Probability that triad (1,2,3) is of type i and triad (4,5,6) is of type j.

Note that the subscript k on p_k(i,j) refers to the number of nodes that are common to the two triads in question.
Theorem 1: If T_i is the number of triads of type i in a random graph on g points then,

(a)
$$E(T_i) = \binom{g}{3} p(i),$$

(b)
$$\operatorname{Var}(T_i) = \binom{g}{3} p(i)[1 - p(i)] + 3(g-3)\binom{g}{3}[p_2(i,i) - p_0(i,i)] + 3\binom{g-3}{2}\binom{g}{3}[p_1(i,i) - p_0(i,i)] + \binom{g}{3}^{(2)}[p_0(i,i) - p^2(i)],$$

(c)
$$\operatorname{Cov}(T_i, T_j) = -\binom{g}{3} p(i)p(j) + 3(g-3)\binom{g}{3}[p_2(i,j) - p_0(i,j)] + 3\binom{g-3}{2}\binom{g}{3}[p_1(i,j) - p_0(i,j)] + \binom{g}{3}^{(2)}[p_0(i,j) - p(i)p(j)].$$

The quantities $\binom{g}{3}$ and $\binom{g-3}{2}$ that appear in theorem 1 are the binomial coefficients, and the expression $\binom{g}{3}^{(2)}$ is the descending factorial.

The main value of this theorem is that it indicates what probabilities must be computed from the model and how they are combined to obtain the means, variances, and covariances of the T_i. The proof of theorem 1 is straightforward but tedious and we omit it. A similar theorem is proved by Moran (1947) for the paired comparison case mentioned above, and the technique used there generalizes to prove theorem 1 as well.

Finally, we mention that theorem 1 does not depend on the particular random model we have adopted for its validity. As it is stated (that is, without giving specific values to p(i), p_2(i,i), etc.), theorem 1 is true for any method of constructing random graphs that does not depend on the labels of the points.

In the case of the random distribution we have adopted, there is some simplification in theorem 1. This is due to the fact that p_1(i,j) = p_0(i,j), and hence the third terms in the expressions for Var(T_i) and Cov(T_i,T_j) vanish.

In order to use theorem 1 in conjunction with equations (4) and (5) to compute μ_T and σ_T², the values of p(i) and p_k(i,j) must be computed. These are given in tables 1, 2, and 3.
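Part (b) of theorem 1 can be sketched as a short function. This is a hedged reading of the printed formula: the four terms correspond to a triad paired with itself, and to ordered pairs of distinct triads sharing two, one, or no nodes, of which there are 3(g - 3), 3·C(g-3, 2), and C(g-3, 3) per triad. The function name and argument names are illustrative, and the probabilities passed in are assumed to be supplied by the user.

```python
from fractions import Fraction
from math import comb

def variance_of_triad_count(g, p, p2, p1, p0):
    """Var(T_i) as in theorem 1(b), for a group of g >= 3 people.

    p  : probability a given triad is of type i
    p2 : both of two triads sharing two nodes are of type i
    p1 : both of two triads sharing one node are of type i
    p0 : both of two disjoint triads are of type i
    """
    triads = comb(g, 3)
    return (
        triads * p * (1 - p)
        + 3 * (g - 3) * triads * (p2 - p0)
        + 3 * comb(g - 3, 2) * triads * (p1 - p0)
        + triads * (triads - 1) * (p0 - p * p)  # descending factorial of C(g,3)
    )

# With g = 4 every pair of triads shares two nodes, so the result must equal
# 4 p (1 - p) + 12 (p2 - p^2) whatever p0 is.
p, p2, p0 = Fraction(1, 4), Fraction(1, 8), Fraction(1, 10)
print(variance_of_triad_count(4, p, p2, p0, p0))
```

Using exact fractions avoids rounding when the four terms nearly cancel, which they do whenever triads are close to independent.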
Throughout these three tables, the descending factorial notation is used, that is,

$$x^{(k)} = x(x-1)(x-2)\cdots(x-k+1). \tag{6}$$

The denominators of these probabilities, D_1, D_2, D_3, are given by

$$D_1 = \binom{g}{2}^{(3)}, \tag{7}$$

$$D_2 = \binom{g}{2}^{(6)}, \tag{8}$$

$$D_3 = \binom{g}{2}^{(5)}. \tag{9}$$

To illustrate how these probabilities are calculated we shall consider three examples: p(1), p_0(1,1), and p_2(1,3).

TABLE 1
D_1 p(i) AS FUNCTION OF m, a, n

Triad     D_1 p(i)
021C      (3/2) a^(2) n
030C      (1/4) a^(3)
111D      3 m a n
111U      3 m a n
120C      (3/2) m a^(2)
201       3 m^(2) n
210       3 m^(2) a

To compute p(1), note from figure 1 that the triad labeled 021C has two asymmetric pairs and one null pair. The probability of getting these three pairs is

$$\frac{a(a-1)n}{\binom{g}{2}\left[\binom{g}{2}-1\right]\left[\binom{g}{2}-2\right]} \cdot \frac{1}{2^2}. \tag{10}$$

The terminal factor of 1/2² comes from the necessity to orient the direction of the two asymmetric choices. But there are three possible positions for the null pair, that is, (1,2), (1,3), or (2,3). Furthermore, once the position of the null pair has been specified there are two ways the two asymmetric pairs can be oriented with respect to each other to produce a triad of type 1. Hence the value of p(1) is 3 · 2 = 6 times the expression in equation (10), or

$$p(1) = \frac{(3/2)\, a^{(2)} n}{\binom{g}{2}^{(3)}}. \tag{11}$$

To compute p_0(1,1) it is useful to remember that, except for the fact that the m, a, and n pairs are distributed without replacement, any two triads with no common edges are statistically independent. Hence p_0(1,1) is essentially the "square" of p(1), except that descending factorials are used rather than powers. Thus

$$p_0(1,1) = \left(\frac{3}{2}\right)^2 \frac{a^{(2)} n \,(a-2)^{(2)} (n-1)}{\binom{g}{2}^{(6)}} = \frac{(9/4)\, a^{(4)} n^{(2)}}{\binom{g}{2}^{(6)}}. \tag{12}$$

The numerator of this expression appears in the (1,1) position of table 2.

[Tables 2 and 3, which give the numerators of p_0(i,j) and p_2(i,j) as functions of m, a, and n, are not legible in this copy.]
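The single-triad probabilities and the chance mean of T can be sketched as follows. This is an illustrative sketch, not the authors' FORTRAN program: the function names are hypothetical, and the coefficients 3/2 and 1/4 in the 021C, 030C, and 120C entries are a reading of the printed table that agrees with the counting argument given for p(1). The mean uses theorem 1(a) summed over the seven intransitive types, as in equation (4).

```python
from fractions import Fraction
from math import comb

def falling(x, k):
    """Descending factorial x^(k) = x (x-1) ... (x-k+1), equation (6)."""
    result = 1
    for step in range(k):
        result *= x - step
    return result

def triad_probabilities(g, m, a):
    """p(i) for the seven intransitive triad types, with n = C(g,2) - m - a."""
    pairs = comb(g, 2)
    n = pairs - m - a
    d1 = falling(pairs, 3)  # D_1, equation (7)
    numerators = {
        "021C": Fraction(3, 2) * falling(a, 2) * n,
        "030C": Fraction(1, 4) * falling(a, 3),
        "111D": 3 * m * a * n,
        "111U": 3 * m * a * n,
        "120C": Fraction(3, 2) * m * falling(a, 2),
        "201": 3 * falling(m, 2) * n,
        "210": 3 * falling(m, 2) * a,
    }
    return {triad: Fraction(value) / d1 for triad, value in numerators.items()}

def expected_intransitive(g, m, a):
    """Chance mean of T: C(g,3) times the sum of the seven p(i)."""
    return comb(g, 3) * sum(triad_probabilities(g, m, a).values())

# Pure paired-comparison case on three people (m = 0, n = 0): the only
# intransitive triad is the cycle, which has chance 1/4.
print(expected_intransitive(3, 0, 3))
```

For a group of four with one mutual and two asymmetric pairs, the same function gives a chance mean of 8/5 intransitive triads.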
To compute probabilities like p_2(1,3), it is necessary to recognize that there are two possibilities for the common edge between the two triads (1,2,3) and (2,3,4). If triad (1,2,3) is to be of type 1 and triad (2,3,4) is to be of type 3, their common edge (2,3) can be either an asymmetric or a null pair. These give two essentially different cases which are depicted in figure 2. First consider figure 2(a). There are four versions of this case: two possible orientations of the two asymmetric choices in triad (1,2,3) times the two positions for the mutual choice in triad (2,3,4). (The orientation of the asymmetric choice in a triad of type 3 is determined once the position of the mutual choice has been specified.) Each of the four versions of figure 2(a) has probability

$$\frac{m\, a^{(3)} n}{\binom{g}{2}^{(5)}} \cdot \frac{1}{2^3}. \tag{13}$$

Thus the contribution to p_2(1,3) from the cases like figure 2(a) is four times this, or

$$\frac{(1/2)\, m\, a^{(3)} n}{\binom{g}{2}^{(5)}}. \tag{14}$$

There are four versions of figure 2(b): two possible orientations for the asymmetric choice between (2,3) times two possible positions for the other asymmetric in (1,2,3). Each of these four versions of figure 2(b) has probability

$$\frac{m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}} \cdot \frac{1}{2^2}, \tag{15}$$

hence the contribution to p_2(1,3) from the cases like figure 2(b) is four times this, or

$$\frac{m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}}. \tag{16}$$

Adding (14) and (16) together gives the value of p_2(1,3), that is:

$$p_2(1,3) = \frac{(1/2)\, m\, a^{(3)} n + m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}}. \tag{17}$$

The numerator of equation (17) appears in the (1,3) position of table 3.

In order to check our calculations of the mean and variance of T, and to ascertain how well the standard normal distribution approximates the distribution of τ (for the purpose of computing significance levels), we performed a simulation study. Twenty-seven sets of 100 random groups each were generated by a computer program. In each set of 100 random groups, the values of g, m, a, and n were fixed at designated values and random sociograms were generated using an algorithm much like the one described at the end of section 4.
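The urn procedure of section 4 can be sketched as a short generator. This is a minimal sketch under stated assumptions, not the program used in the study: the function name, the representation of a sociogram as a set of ordered pairs, and the use of Python's `random` module are all illustrative choices. Shuffling the list of pairs plays the role of drawing balls without replacement.

```python
import random
from itertools import combinations

def random_sociogram(g, m, a, rng=None):
    """Random choice relation on persons 1..g with exactly m mutual and
    a asymmetric pairs; the remaining pairs are null."""
    rng = rng or random.Random()
    pairs = list(combinations(range(1, g + 1), 2))  # triangular enumeration
    if m + a > len(pairs):
        raise ValueError("m + a exceeds the number of pairs g(g-1)/2")
    rng.shuffle(pairs)  # order in which the balls leave the urn
    choices = set()
    for i, j in pairs[:m]:  # first m balls: mutual pairs
        choices.add((i, j))
        choices.add((j, i))
    for i, j in pairs[m:m + a]:  # next a balls: asymmetric, fair-coin direction
        choices.add((i, j) if rng.random() < 0.5 else (j, i))
    return choices  # undrawn pairs stay null

# One random group of six people with 3 mutual and 4 asymmetric pairs.
group = random_sociogram(6, 3, 4, random.Random(0))
print(len(group))  # 2 * 3 + 4 directed choices
```

Repeating the call many times with fixed g, m, and a, and computing T for each result, gives an empirical check on the theoretical mean and variance.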
For each random sociogram generated, τ was computed. For each set of 100 simulations, the mean and variance of the 100 values of τ were found, and we recorded the number of times in the set of 100 simulations that τ was less than -1.282, -1.645, and -2.326 (the one-sided negative 10 percent, 5 percent, and 1 percent points for the standard normal distribution, respectively).

[Figure 2 (diagram not reproduced): panels (a) and (b).]
FIG. 2. Two essentially different ways that (1,2,3) can be of type 1 (021C) while (2,3,4) is of type 3 (111D).

The agreement between the theoretical and observed means and variances of τ for each of the twenty-seven sets of 100 simulations is excellent. This implies that our formulas for computing the mean and variance of T are correct and, more importantly, that our computer program for performing these calculations is correct. It also argues for the adequacy of the pseudorandom number generator used to perform the simulations.

Table 4 summarizes the results of the simulation study that bear on the question of the adequacy of the approximation of the standard normal distribution to the distribution of τ. The overall agreement, across all the values of g, m, a, and n, is very good. Of the total of 2,700 simulations, 10.2 percent of the time τ was less than the negative 10-percent point of the normal distribution. The corresponding figures for the 5 percent and 1 percent points are 4.7 percent and 1.1 percent, respectively.

The actual distribution of τ is discrete, of course, and one expects the normal approximation to be best when the total number of possible values of τ is large. This number is a function of g, the size of the group, but also of the number of mutual (m) and asymmetric choices (a). Examination of the individual values of τ that result for small groups (size 5, 6, and 7) reveals that the number of possible values of τ is small (sometimes as few as four or five), especially when m and a are also small.

In table 4 the twenty-seven different values of g, m, a, and n used are grouped into six classes by the size of g. The number of times τ was significant at the 10, 5, and 1 percent levels for the corresponding 100 simulations is given in the same row. The average of the number of times τ was significant for each of the six classes is also given. These averages range from 7.50 to 14.33 at the 10 percent level, 4.20 to 5.83 at the 5 percent level, and 0.40 to 1.67 at the 1 percent level.

TABLE 4
RESULTS OF SIMULATION STUDY*

  g      m      a      n       10%      5%      1%
 34     70     28    463       15       6       1
 35    179    119    297        8       3       2
 34    112     56    393       11       7       1
 35    119    119    357        9       7       1
 Average %                     10.75    5.75    1.25

 26     41     16    268       15       4       0
 25     90     60    150        9       5       3
 25     60     30    210        9       7       2
 25     60     60    180        9       4       0
 Average %                     10.50    5.00    1.25

 20     24      9    157        8       5       1
 20     57     38     95       12       3       1
 20     38     19    133       13       7       1
 20     38     38    114        4       4       0
 Average %                      9.25    4.75    0.75

 16     15      6     99       11       8       1
 16     36     24     60        5       5       3
 16     24     12     84        7       3       2
 15     21     21     63        7       1       0
 Average %                      7.50    4.25    1.50

 13      4     10     64       16       7       1
 12     13     20     33       15       6       3
 12      7     13     46       17       5       0
 11     11     11     33       18       3       3
 10      9      9     27       11       8       1
  9      7     11     18        9       6       2
 Average %                     14.33    5.83    1.67

  7      1      3     17        1       1       0
  7      2      4     15       14       5       1
  6      3      3      9       12       4       1
  5      2      3      5        4       4       0
  5      1      2      7        7       7       0
 Average %                      7.60    4.20    0.40

 Overall average %             10.2     4.7     1.1

* Number of times τ exceeded the 10 percent, 5 percent, and 1 percent cutoff points for the normal distribution for selected values of g, m, a, and n; 100 simulations for each choice of g, m, a, and n.
Although there is some evidence that the discreteness of the distribution causes trouble in a few cases (g = 7, m = 1, a = 3; g = 5, m = 2, a = 3; g = 5, m = 1, a = 2; and perhaps even g = 20, m = 38, a = 38), the approximation seems to work very well for most of the situations simulated. There is some evidence that for groups of size 11 to 13, the 10 percent point is really about a 15 percent point, but this effect is not carried over to the smaller significance levels. The simple practical conclusion is that if τ is referred to tables of the percentage points of the standard normal distribution and found to be significant at the 5 percent level or less, this is not due to the inadequacy of the normal approximation.

6. A COMPARISON OF τ AND δ

In this section we compare the values of τ and δ in eight sociograms drawn arbitrarily from the sociometry literature. The models upon which these two measures are based differ solely in the acceptability of 012 triads. This triad is vacuously transitive, but nonpermissible for the model of ranked clusters. This difference is an important one, both empirically and theoretically. In the analyses which follow we shall see that some groups possess far more 012 triads than the random model predicts. While these surpluses do not bear directly on the transitivity hypothesis, they are fatal to the ranked-clusters model. Indeed, Davis and Leinhardt (1970) reported that in analyses of sixty groups their hypotheses were strongly contradicted only in the case of the 012 triad. To understand the substantive significance of these findings, it will be necessary to review the structures the two models describe, so that we may see what role 012 triads play in each.

Briefly, Davis and Leinhardt predict that group structure will tend to be arranged into a system of hierarchically arranged levels, each of which may contain one or more cliques of one or more group members.
This structure is a product of tendencies in pair relations. People in the same clique, they suggest, will tend to choose and be chosen by each other, while members of different cliques will tend to refrain from choosing one another. A hierarchy is introduced because lower-status group members tend also to choose higher-status members who fail to reciprocate these choices. Postulating that an arrangement of group members which contradicted these tendencies would be "inconsistent" or "uncomfortable" and that group members would tend to avoid them, Davis and Leinhardt singled out triads which presented such inconsistencies, proved that their model was implied when these triads were absent, and showed that there was some empirical justification for their theory. Their example of such a structure appears in figure 3. Since this model of ranked clusters assumes only one hierarchical system, Davis and Leinhardt concluded that the 012 triad was inconsistent: "The two N relations imply that (the group members) are all on the same level, although in different cliques; but the A relation . . . implies that (one group member) is in a higher level": a contradiction (Davis and Leinhardt 1970).

FIG. 3.-An example of ranked clusters from Davis and Leinhardt (1970). [Diagram of levels (High, Middle, Low), cliques, and relations.]

For our model, however, the structural proposition is the association of the transitive property with the interpersonal relation. Since the hypothesis of this condition is not met in the 012 triad, we consider it to be vacuously transitive. Now, if we examine this triad in light of Davis and Leinhardt's conceptualization of the structural role of pairs, we can place the two members linked by an asymmetric pair into a status hierarchy which is unrelated to the third member. This structure is a perfectly legitimate partial ordering. The prevalence of 012 triads, then, might be considered as evidence for the existence within a group of at least two components each of which may contain orderings. Multiple orderings with more than one connected component are common forms of group structure. An excellent example occurs in the often noted "sex cleavage" of children's groups. Figure 4 presents an illustration of an idealized dichotomized children's group in which the boys' and girls' subgroups are separate systems of ranked clusterings. This structure would be a perfect example of the transitivity model, while the prevalence of 012 triads in the group as a whole would contradict the hypothesis of the ranked clustering model.

In table 5, we have listed the results of analyses of eight groups. The last four columns of the table give values for δ, δSTD (a version of δ standardized to have mean zero and variance one), τ, and a standardized measure for 012 triads (012STD). The two new standardized values are computed in a manner analogous to that for τ. Theorem 1 is used to generate variance and covariance terms. While all necessary probabilities are not presented in our tables, their computation is straightforward and the measures are calculated in our computer program. An argument similar to that put forth for the approximate normality of τ supports referring these standardized variables to the normal distribution.
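The role of the 012 triad can be made concrete with a small triad classifier. The sketch below is a minimal illustration under stated assumptions: a sociogram is represented as a set of ordered (chooser, chosen) pairs, and the function names and the four-person example group are hypothetical rather than taken from the authors' computer program. It labels a triad by its numbers of mutual, asymmetric, and null pairs and checks transitivity directly.

```python
from itertools import combinations, permutations


def pair_type(choices, i, j):
    """Return 'M' (mutual), 'A' (asymmetric), or 'N' (null) for the pair i, j."""
    forward = (i, j) in choices
    backward = (j, i) in choices
    if forward and backward:
        return "M"
    if forward or backward:
        return "A"
    return "N"


def man_counts(choices, triad):
    """Return (mutual, asymmetric, null) pair counts for a three-member triad."""
    types = [pair_type(choices, i, j) for i, j in combinations(triad, 2)]
    return types.count("M"), types.count("A"), types.count("N")


def is_transitive(choices, triad):
    """True unless some i->j and j->k in the triad lacks the matching i->k."""
    for i, j, k in permutations(triad, 3):
        if (i, j) in choices and (j, k) in choices and (i, k) not in choices:
            return False
    return True


def count_012(choices, members):
    """Count the triads with 0 mutual, 1 asymmetric, and 2 null pairs."""
    return sum(1 for t in combinations(members, 3)
               if man_counts(choices, t) == (0, 1, 2))


# Hypothetical group with two unconnected components: a chooses b, c chooses d.
example_members = ["a", "b", "c", "d"]
example_choices = {("a", "b"), ("c", "d")}

print(man_counts(example_choices, ("a", "b", "c")))  # (0, 1, 2)
print(count_012(example_choices, example_members))   # 4
print(all(is_transitive(example_choices, t)
          for t in combinations(example_members, 3)))  # True
```

Every triad in this two-component example is a 012 triad, and every one passes the transitivity check because no chain i->j, j->k ever occurs. The same group would therefore count fully in favor of the transitivity model while contradicting the ranked-clusters model.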
The Davis-Leinhardt model of ranked clusterings predicts that δ will be negative. On the basis of a small simulation study, Davis and Leinhardt suggest that a value of δ that is less than -.05 ought to be considered statistically significant. To check this, we compare δ with δSTD (whose statistical significance may be ascertained by reference to normal tables) for the eight groups listed in table 5. Two facts emerge from this comparison. First, δSTD is not monotonically related to δ (e.g., for group 1, δ = -.152, while δSTD = -3.63; and for group 2, δ = -.091, while δSTD = -4.88). Second, if we use -1.645 as the 5 percent cutoff point for δSTD, the value of -.05 for δ suggested by Davis and Leinhardt is fairly well supported by these eight groups. For only one group (group 4), δ is not significant while δSTD is: δ is greater than -.05, while δSTD is less than -1.645. These two findings suggest to us that the -.05 cutoff point for δ is roughly right, but the strength of the significance is not correctly given by the size of δ.

FIG. 4.-A possible group with a two-component system of ranked clusters (N pairs are not connected). [Diagram of separate boys' and girls' subgroups.]

The measures τ and δSTD may be used to compare the transitivity model with Davis and Leinhardt's model of ranked clusterings; τ is significant in all but one case (group 5), while δSTD is significant in all but two cases (groups 5 and 8). Overall, the significant values of τ are more negative than the significant values of δSTD, although not always so (e.g., in group 1 δSTD = -3.63, while τ = -2.401).

[TABLE 5: results of the analyses of eight groups, with columns for δ, δSTD, τ, and 012STD; the table body is not recoverable.]

The last column of table 5 contains a standardized measure of the number of 012 triads in the group. Since the 012 triad distinguishes the transitivity model from the model of ranked clusterings, it is of interest to see how often this triad occurs in a sociogram. If we use a two-sided 5 percent test that the 012 triads occur at random, there are only two groups (groups 4 and 8) with significant values of 012STD (i.e., that exceed 1.96 in absolute value). In both of these groups, the value of 012STD indicates that the number of 012 triads is larger than expected by chance. We offer the following tentative explanation of this finding. Leinhardt (1968) found that when classroom groups were divided into sexually homogeneous subgroups and the subgroups analyzed separately, the number of 012 triads became fewer than expected, whereas when the groups were not divided there were more 012 triads than expected by chance. As indicated in figure 4, this finding is consonant with a model that incorporates sex cleavage as well as ranking and cliquing. Sex cleavage of classroom groups has been observed to become stronger through the elementary grades and then weaker during the college years. Returning to the last column of table 5, we note that the two groups with significantly more 012 triads than expected were both eighth-grade classes consisting of boys and girls. The other groups are either older or sexually homogeneous. Our explanation is, therefore, that sex cleavage has created the excess of 012 triads observed in groups 4 and 8.

7. DISCUSSION

Sociologists, with growing frequency and sophistication, are turning to mathematics as a language in which to model social behavior. Two assumptions underlie this trend. One holds mathematics to be a clearer, better way of expressing relationships between variables. The other suggests that mathematical expressions, because they are easily manipulated, will render new, non-obvious relationships apparent (Beauchamp 1970).
Clearly, these assumptions have been implicit in the work on graph theoretic models of structure in interpersonal relations and both have been corroborated. The models have produced new understanding of the interdependent relations which link group members, and have suggested a sociological rather than psychological interpretation for consistency or balance theories (Davis 1968). Nonetheless, the test of sociological theory, be it mathematical or verbal, must be empirical.

With this in mind we have developed procedures which permit testing of tendencies in sociometric data toward a variety of graph theoretic models of structure. We have presented a theorem which specifies the probabilities needed for standardized measures based on triad frequencies, and have provided formulas for a general model, a partial order. While these formulas are dependent upon the random model chosen, the procedures used to generate them are not, and we discussed why we thought more research was needed on random models for sociometric data. Since our interest in this report was principally methodological, we refrained from data analysis save an illustrative example in which the partial order model was compared with a special case, the ranked-clusters model of Davis and Leinhardt (1970).

REFERENCES

Beauchamp, Murry A. 1970. Elements of Mathematical Sociology. New York: Random House.
Cartwright, Dorwin, and Frank Harary. 1956. "Structural Balance: A Generalization of Heider's Theory." Psychological Review 63:277-93.
Davis, James A. 1967. "Clustering and Structural Balance in Graphs." Human Relations 20:181-87.
Davis, James A. 1968. "Social Structures and Cognitive Structures." In Theories of Cognitive Consistency: A Sourcebook, edited by R. P. Abelson, E. Aronson, W. J. McGuire, T. M. Newcomb, M. J. Rosenberg, and P. H. Tannenbaum. Chicago: Rand McNally.
Davis, James A., and Samuel Leinhardt. 1970. "The Structure of Positive Interpersonal Relations in Small Groups." In Sociological Theories in Progress, edited by Joseph Berger, Morris Zelditch, Jr., and Bo Anderson. Vol. 2. Boston: Houghton Mifflin (in press).
Harary, Frank, Robert Z. Norman, and Dorwin Cartwright. 1965. Structural Models. New York: Wiley.
Hayes, M. L., and M. E. Conklin. 1953. "Intergroup Attitudes and Experimental Change." Journal of Experimental Education 22:19-36.
Heider, Fritz. 1946. "Attitudes and Cognitive Organization." Journal of Psychology 21:107-12.
Heider, Fritz. 1958. The Psychology of Interpersonal Relations. New York: Wiley.
Hempel, Carl G. 1952. Fundamentals of Concept Formation in Empirical Science. In Encyclopedia of Unified Science. Vol. 2, no. 7. Chicago: University of Chicago Press.
Holland, Paul W., and Samuel Leinhardt. 1970. "A Unified Treatment of Some Structural Models for Sociometric Data." Technical Report, Carnegie-Mellon University, January 1970.
Horace Mann-Lincoln Institute of School Experimentation. 1947. How to Construct a Sociogram. New York: Bureau of Publications, Teachers College, Columbia University.
Katz, L., and J. H. Powell. 1954. "The Number of Locally Restricted Directed Graphs." Proceedings of the American Mathematical Association 5:621-26.
Katz, L., R. Tagiuri, and T. Wilson. 1958. "A Note on Estimating the Statistical Significance of Mutuality." Journal of General Psychology 58:97-103.
Kendall, M. G., and B. B. Smith. 1939. "On the Method of Paired Comparisons." Biometrika 31:324-345.
Landau, H. G. 1951a. "On Dominance Relations and the Structure of Animal Societies. I. Effect of Inherent Characteristics." Bulletin of Mathematical Biophysics 13:1-19.
Landau, H. G. 1951b. "On Dominance Relations and the Structure of Animal Societies. II. Some Effects of Possible Social Factors." Bulletin of Mathematical Biophysics 13:245-62.
Landau, H. G. 1953. "On Dominance Relations and the Structure of Animal Societies. III. The Condition for a Score Structure." Bulletin of Mathematical Biophysics 15:143-48.
Leinhardt, Samuel. 1968. "The Development of Structure in the Interpersonal Relations of Children." Ph.D. dissertation, University of Chicago.
Moran, P. A. P. 1947. "On the Method of Paired Comparisons." Biometrika 34:363-65.
Moreno, J. L. 1953. Who Shall Survive? New York: Beacon House.
Taba, H. E., E. Brady, J. Robinson, and W. Vickery. 1951. Diagnosing Human Relations Needs. Washington: American Council on Education.
Taba, H. E., and D. Elkins. 1950. With Focus on Human Relations. Washington, D.C.: American Council on Education.
Zeleny, L. D. 1947. "Selection of the Unprejudiced." Sociometry 10:396-401.
Zeleny, L. D. 1950. "Adaptation of Research Findings in Social Leadership to College Classroom Procedures." Sociometry 13:314-28.