A Method for Detecting Structure in Sociometric Data

Author(s): Paul W. Holland and Samuel Leinhardt
Source: American Journal of Sociology, Vol. 76, No. 3 (Nov., 1970), pp. 492-513
Published by: The University of Chicago Press
Stable URL: http://www.jstor.org/stable/2775735 .
A Method for Detecting Structure in Sociometric Data
Paul W. Holland¹
Harvard University
Samuel Leinhardt²
Carnegie-Mellon University
The authors focus on developing standardized measures for models of structure in interpersonal relations. A theorem is presented which yields expectations and variances for measures based on triads. Random models for these measures are discussed and the procedure is carried out for a model of a partial order. This model contains as special cases a number of previously suggested models, including the structural balance model of Cartwright and Harary, Davis's clustering model, and the ranked-clusters model of Davis and Leinhardt. In an illustrative example, eight sociograms are analyzed and the general model is compared with the special case of ranked clusters.
1. INTRODUCTION
A variety of models have been proposed which relate group structure to interdependences among interpersonal relations. Graph theory, because it is concerned with the characteristics of sets of points connected by relations, has been a natural language in which to express these models of small-scale social structure. In their classic paper, Cartwright and Harary (1956) used graph theory to restate Heider's (1946) balance theory and proposed a theorem which, by showing that balance implied the dichotomization of groups along lines of interpersonal sentiment, provided an important insight into the nature of the social structure of groups. Davis (1967) recognized that the decomposition of groups into only two cliques was simply not empirically demonstrable; he expanded upon structural balance by indicating what conditions were necessary for clustering, the development of one or more cliques. Balance, thus, was made a special case of a more general graph theoretic model which seemed more adequately to describe the structure of interpersonal sentiment. Following this work, Davis and Leinhardt (1970) combined two common social structural components, status and clusters, by showing how directed sentiment relations could generate a structure incorporating a system of hierarchically arranged cliques. In a recent report (Holland and Leinhardt 1970), we have shown that even this ranked-clusters model is a special case of a more generally applicable structural model, that of a partial order, and, incidentally, with this model we have reestablished the connection to Heider's theory by proposing that transitivity is an important structural property of positive interpersonal sentiment (Heider 1958).

¹ Research supported by National Science Foundation grants GP-8774 and GS2044X1.
² Research supported by a Social Science Research Council postdoctoral fellowship.
The most general import of these mathematical models is that they demonstrate how complex networks can be the result of interdependencies among interpersonal sentiment relations. Nonetheless, while meaningful theoretical insights can result from expressing social theory mathematically, these are not sufficient in themselves to argue for the acceptance of theory. It still remains for empirical verification to help us distinguish formal theory from formal nonsense; in this effort graph theory, because of its deterministic quality, has limited use. For example, a statement which implies that a group cannot be balanced if a line between two points serves to link two otherwise disconnected components may follow logically from the axioms of graph theory; it does not make much sense in the logic of empirical sociology. The problem is no less severe in models which purport to be closer representations of reality. A group may similarly fail to be partially ordered or possess a ranked clustering because of one contradictory line. The added complication in these cases is that the offending relation is not readily observed.
If these problems are to be avoided and these models are to be of scientific use, they must be expressed probabilistically and measures which gauge the fit of empirical data to them must be developed. However, when deterministic graph theoretic statements are replaced by propositions of tendency, the acceptance or rejection of a hypothesis becomes complicated, and techniques are necessary which permit the statistical significance of a measured tendency to be judged. While some effort has been made to develop measures of tendency, strength, and fit for the various structural models, there has been only limited discussion of randomness in this context and only meager information exists on the distributions of these structural indices (Harary, Norman, and Cartwright 1965, pp. 339-62; Davis and Leinhardt 1970). Thus, it is extremely difficult (if not impossible) to know the significance of some measured tendency or to gauge, in general, the closeness with which the graph theoretic models of structure describe social behavior.
It is our aim in this paper to present techniques which can assist in solving this problem by developing an index of transitivity for a general model, a partial order, noting that the procedure is applicable to all special cases. We then present a theorem which we use to generate expectations and variances for our structural model. This is followed by a discussion of the appropriateness of several random models for sociometric data, and an explanation of why we have chosen one which constrains pairs. This random distribution is then employed in generating tables of probabilities for our standardized transitivity measure. The approximate normality of this measure is substantiated through the analysis of a simulation study. To demonstrate the use of these techniques, eight sociograms are analyzed and the results obtained are compared with those for the ranked-clusters model of Davis and Leinhardt.
2. A STRUCTURAL MODEL
Sociometric data are often in the form of a set of individuals, X, together with a binary "choice relation," C, defined on X by xCy, if and only if person x chooses person y in the sociometric test. (To avoid trivial exceptions, we make the convention that xCx for all x, even though there is no implication that x actually chooses himself in the sociometric test.) In this paper, sociometric data, that is, (X,C), will be said to exhibit structure if the choice relation C is transitive. In technical terms, this means that for any x,y,z in X, if xCy and yCz, then xCz. Briefly, our structural model is that (X,C) is a partially ordered set (not necessarily antisymmetric), and the central aim of this paper is to present a method for detecting tendencies in the direction of this type of structure which can be applied to sociometric data. For our present purposes, transitivity is a convenient ideal structural model against which we may compare actual sociograms.
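The definition of structure can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `is_transitive`, the representation of C as a set of ordered pairs, and the three example names are hypothetical and do not come from the paper. Because of the convention xCx, it is enough to examine triples of distinct individuals.

```python
from itertools import permutations

def is_transitive(choices, people):
    """Return True if xCy and yCz imply xCz for all distinct x, y, z.

    `choices` is a set of ordered pairs (x, y) meaning "x chooses y".
    Self-choices are implied by convention and need not be listed.
    """
    for x, y, z in permutations(people, 3):
        if (x, y) in choices and (y, z) in choices and (x, z) not in choices:
            return False
    return True

people = ["ann", "bob", "cy"]
chain = {("ann", "bob"), ("bob", "cy")}   # ann -> bob -> cy, but not ann -> cy
closed = chain | {("ann", "cy")}          # the missing choice is added

print(is_transitive(chain, people))   # False
print(is_transitive(closed, people))  # True
```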
The assumption that C is transitive is general and contains as special cases several earlier models: for example, balance (Cartwright and Harary 1956), clustering (Davis 1968, pp. 544-51), ranked clustering (Davis and Leinhardt 1970), transitive tournaments (Landau 1951a, 1951b, 1953), and quasi-series (Hempel 1952, pp. 58-62). We discuss the substantive basis and implications of this model and relate it to previous ones elsewhere (Holland and Leinhardt 1970); consequently, we will only consider these important topics briefly in the next section of this report.
3. TRANSITIVITY AND TRIADS
In a given sociogram, to verify whether or not C is transitive, every ordered triple of distinct individuals (x,y,z) must be examined. If xCy and yCz, then for C to be transitive xCz must also hold. If for some triple x,y,z it occurs that xCy and yCz but x does not choose z, then C is intransitive. On the other hand, if xCy but y does not choose z, or if yCz but x does not choose y, or if neither xCy nor yCz, then the question of transitivity is not relevant because the hypothesis of the transitivity condition (that is, the "if" component) is not satisfied: x may or may not choose z without contradicting the transitivity of C. When the hypothesis of the transitivity condition is not satisfied for a particular triple x,y,z, their relationship to each other is vacuously transitive.
Figure 1 shows all of the sixteen possible relationships that can obtain between any three people in a sociogram. A single arrow x -> y means that xCy but y did not choose x (an asymmetric pair). A double arrow x <-> y means that xCy and yCx (a mutual pair). No arrow means that neither x nor y chose the other (a null pair).
The triads are labeled according to the number of mutual, asymmetric, and null pairs (in that order) which they contain. Thus, a 201 triad contains two mutual, no asymmetric, and one null pair. Nonisomorphic triad types with equal numbers of pair types are further differentiated by letters (e.g., 021U, 021D, and 021C).
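The three-digit part of a triad label can be computed directly from the pair counts. The sketch below is illustrative only; it assumes the sociogram is stored as a 0/1 matrix `X` with `X[i][j] = 1` when i chooses j (the sociomatrix convention of section 4), and the helper name `man_label` is hypothetical. It does not assign the distinguishing letters U, D, C, or T.

```python
from itertools import combinations

def man_label(X, i, j, k):
    """Return the counts of mutual, asymmetric, and null pairs as a label."""
    mutual = asym = null = 0
    for u, v in combinations((i, j, k), 2):
        ties = X[u][v] + X[v][u]   # 2 = mutual, 1 = asymmetric, 0 = null
        if ties == 2:
            mutual += 1
        elif ties == 1:
            asym += 1
        else:
            null += 1
    return f"{mutual}{asym}{null}"

# Persons 0 and 1 choose each other, as do 1 and 2; 0 and 2 ignore each other.
X = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
print(man_label(X, 0, 1, 2))  # 201
```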
On the left side of figure 1 are the nine triad types that exhibit no intransitivities (the transitive and vacuously transitive triads); on the right side are the seven triad types that exhibit at least one intransitivity (the intransitive triads). By an intransitivity, we mean that for some orientation of three individuals, say x,y,z, we have xCy and yCz but x does not choose z.
To say that C is a transitive relation on X is equivalent to saying that none of the seven intransitive triads appear in the sociogram. To discover whether or not C is transitive for a given sociogram, it is sufficient to examine every unordered triple of distinct individuals to see if the triad they form falls on the left or the right side of figure 1. If they all fall on the left, C is transitive; otherwise, it is not.
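The examination of unordered triples can be sketched in a few lines. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the sociogram is assumed to be stored as a 0/1 matrix `X` with `X[i][j] = 1` when i chooses j. A triad is counted as intransitive when some orientation x,y,z has xCy and yCz without xCz.

```python
from itertools import combinations, permutations

def triad_is_intransitive(X, triple):
    """True if some ordering (x, y, z) of the triple has x->y, y->z, not x->z."""
    return any(X[x][y] and X[y][z] and not X[x][z]
               for x, y, z in permutations(triple))

def count_intransitive(X):
    """Number of intransitive triads among all unordered triples."""
    g = len(X)
    return sum(triad_is_intransitive(X, t) for t in combinations(range(g), 3))

# Four persons: 0 chooses 1 and 2, 1 chooses 2, 2 chooses 3.
X = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
print(count_intransitive(X))  # 2: triads (0,2,3) and (1,2,3)
```

A count of zero corresponds to every triad falling on the left side of figure 1.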
By considering the triads of figure 1, we may see how the several structural models we have mentioned can be considered to be special cases of a partial order. Were we unable to find in our examination of a graph's triples either intransitive or 012 triads, then we would have a ranked clustering. Additional absences of 003, 021U, 021D, and 102 triads would mean that a quasi-order existed. If only 003 and 102 triads appeared, the structure would be a clustering. If no 003 triads occurred and only 102 triads were found, balance would be indicated. Finally, 030T triads as the sole arrangements of triples would mean that the graph was a transitive tournament.
[Figure 1 (diagrams not reproduced). Transitive and vacuously transitive triads: 003, 012, 102, 021U, 021D, 030T, 120U, 120D, 300. Intransitive triads: 021C, 030C, 111D, 111U, 120C, 201, 210.]
FIG. 1. All sixteen triad types arranged vertically by number of choices made and divided horizontally into those with no intransitivities and those with at least one.

Returning to our own model, for a "live" sociogram, C will almost certainly not be transitive. Therefore, the number of intransitive triads possessed by the sociogram is a measure of how transitive C is. If there are only a few intransitive triads in the sociogram, then C is tending toward transitivity. On the other hand, if there are about as many intransitive triads as would be expected by chance, then there is little evidence that C has a tendency to be transitive. Let T be the number of intransitive triads in a given sociogram and μ_T and σ_T be the mean and standard deviation of T under the null hypothesis that the sociogram is "random" (precisely what this means will be discussed in detail in the next section). Define the index τ by:

$$\tau = \frac{T - \mu_T}{\sigma_T}. \tag{1}$$

The measure τ is our proposed transitivity index. It is a minimum when C is transitive and has mean zero and variance one under the assumption that the sociogram is "random."
This type of structural index is similar to that proposed in the paper of Davis and Leinhardt (1970). Their structural model is also defined in terms of triads, but their "non-permissible" triads include one which our model considers to be permissible (i.e., transitive). Their structural index is

$$\delta = \frac{D - \mu_D}{\mu_D}, \tag{2}$$

where D is the number of nonpermissible triads (in their sense) and μ_D is its expected value under the chance hypothesis. The index δ has a minimum of -1.00 and may be expressed in terms of the percentage of fewer nonpermissible triads than expected. When their paper was written, the variance of δ was not known and thus significance levels could not be given for it. There are various theoretical reasons why τ and δ should both have approximate normal distributions under the chance hypothesis; in section 5 we shall indicate the results of a simulation study that support this conjecture. Because τ is in standardized form, approximate significance levels for an observed value of τ may be obtained by referring it to a table of the percentage points of the normal distribution. The appropriate test is one-sided, and rejection of randomness in favor of the transitivity model is indicated by sufficiently large negative values of τ.
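Once T, its chance mean, and its chance standard deviation are in hand, the two indices and the one-sided normal significance level are simple to evaluate. The sketch below assumes these quantities have already been computed; the function names and the numerical inputs are hypothetical and are not results from the paper. The lower-tail normal probability is obtained from the complementary error function in place of a printed table.

```python
import math

def tau_index(T, mu_T, sigma_T):
    """Standardized transitivity index: (T - mu_T) / sigma_T."""
    return (T - mu_T) / sigma_T

def delta_index(D, mu_D):
    """Relative deficit of nonpermissible triads: (D - mu_D) / mu_D."""
    return (D - mu_D) / mu_D

def lower_tail_p(z):
    """P(Z <= z) for a standard normal Z, used for the one-sided test."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Hypothetical values chosen only to show the arithmetic.
tau = tau_index(T=12, mu_T=20.0, sigma_T=4.0)
print(tau)                           # -2.0
print(round(lower_tail_p(tau), 4))   # 0.0228
print(delta_index(D=15, mu_D=20.0))  # -0.25
```

Large negative values of the first index give small lower-tail probabilities, which is the direction of rejection described above.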
The actual computations of T from a sociogram, and of μ_T and σ_T from the formulas given in section 5 are very tedious. Consequently, in practice the value of τ must be found with the aid of a computer. One of us (Leinhardt) has written a FORTRAN IV program that does sociometric analyses, including the computation of τ and δ.³
It should also be pointed out that τ and δ may be viewed as generalizations of the index of intransitivity given by Kendall and Smith (1939) for paired comparison data. In that case, there are no mutual or null choices; therefore only two possible triads can occur, and one of these falls on the right side of figure 1. Their measure counted all the examples of intransitive triads in the graph.
³ This program has been used on the Harvard Computer Center's IBM 360-65, and Carnegie-Mellon University's IBM 360-67. Copies may be obtained from Samuel Leinhardt, Carnegie-Mellon University, School of Urban and Public Affairs, Pittsburgh, Pennsylvania 15213.
4. RANDOM DISTRIBUTIONS IN SOCIOMETRY
The use of "random" distributions to evaluate the statistical significance of observed features in sociograms is a standard practice. Moreno (1953) simulated "random groups" by hand before the theoretical analyses were available. We feel that there are certain subtleties that arise in the choice of random distributions which have not received proper attention. Consequently, this section is devoted to developing a guideline for choosing random distributions for sociometric analyses.
It is convenient to discuss random models in sociometry in the context of sociomatrices rather than sociograms. Let X_ij = 1 if person i chooses person j, and X_ij = 0 otherwise. Let X_ii = 0 by convention; (X_ij) is the g × g sociomatrix. The row total X_i+ = X_i1 + X_i2 + ... + X_ig is the number of choices made by person i. The column total X_+j = X_1j + X_2j + ... + X_gj is the number of choices received by person j. The number of mutual choices in the sociogram is

$$M = \sum_{i<j} X_{ij} X_{ji}.$$
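These three summaries of a sociomatrix are easy to compute. The following sketch is illustrative only; the function names are hypothetical, the matrix is a made-up example, and M is taken as a sum over unordered pairs (i < j), so that each mutual pair is counted once.

```python
def choices_made(X):
    """Row totals X_i+: number of choices made by each person."""
    return [sum(row) for row in X]

def choices_received(X):
    """Column totals X_+j: number of choices received by each person."""
    return [sum(col) for col in zip(*X)]

def mutual_pairs(X):
    """M: number of unordered pairs (i, j) with both X_ij = 1 and X_ji = 1."""
    g = len(X)
    return sum(X[i][j] * X[j][i] for i in range(g) for j in range(i + 1, g))

X = [[0, 1, 1],
     [1, 0, 0],
     [0, 1, 0]]
print(choices_made(X))      # [2, 1, 1]
print(choices_received(X))  # [1, 2, 1]
print(mutual_pairs(X))      # 1
```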
An early finding was that sociometric status (the distribution of the X_+j) is not in accord with what would be expected by chance. In this case "chance" means that each person i distributed his X_i+ choices at random to the g - 1 other group members. More technically, this means that the "chance" distribution of the X_+j was computed under the assumption that all sociomatrices, (X_ij), with the given values of the row totals X_1+, ..., X_g+ are equally likely. In other words, the chance distribution of choices received was computed conditionally on the given values of the choices made. In many applications all subjects make the same number of choices and this simplifies the theoretical calculations.
Another finding was that the number of mutual choices M was larger than expected by chance. As before, "chance" means that the distribution of M was computed under the assumption that all sociomatrices (X_ij) with the given row totals were equally likely. However, it seems to us that once the first finding has been observed (differential popularity) it is inappropriate to use the same random distribution to evaluate the extent of mutuality. In particular, people receiving many choices are quite likely to be involved in mutual choices, while those receiving none cannot be involved in any. This suggests that, once differential popularity is detected, the appropriate chance model for evaluating tendencies toward mutual choices should assume that all sociomatrices (X_ij) with the given values of the column totals X_+1, ..., X_+g as well as the row totals X_1+, ..., X_g+ are equally likely. In other words, inferences about mutuality should be conditional on both the row and column totals of the sociomatrix. Katz, Tagiuri, and Wilson (1958) indicated the difference between the two approaches toward mutuality. Results of Katz and Powell (1954) are basic to further progress in this direction, but as yet these have not been successfully applied to this problem. This practical issue notwithstanding, let us pursue the idea further.
The structural measure τ proposed in section 3 is concerned with the chance distribution of triads. This is a step further into the structure of the sociogram beyond choices received and mutual choices. Accordingly, the appropriate chance distribution for T should fix (1) choices made X_i+, (2) choices received X_+j, and (3) mutual choices M, and assume that all sociomatrices (X_ij) with the given values of these quantities are equally likely. Unfortunately, the mathematical results needed to implement this analysis are not available. Until they are, a reasonable attitude is to ask how these three conditions can be relaxed in order to produce a feasible and yet reasonable chance distribution for T. One promising direction would be to fix choices made and the number of mutual pairs. Even this is not available at present. Weakening this criterion one further step leaves us with two possibilities: (1) fix choices made only, or (2) fix the number of mutual, asymmetric, and null pair relations. Since the first alternative is what we are trying to avoid, we have adopted the second. This random distribution was also used by Davis and Leinhardt (1970). Its advantage is that it allows us to eliminate the effect of the number of mutual, asymmetric, and null pairs in the group. Its disadvantage is that it does not allow for the fact that everyone in the group may have made the same number of choices, nor does it allow for the effect of a "star" who receives significantly more choices than the others and the "isolate" who receives significantly fewer choices. Both of these additional constraints eventually should be brought into the analysis.
In detail, the random model we shall use to compute μ_T and σ_T is as follows. Let m, a, and n denote the actual number of mutual, asymmetric, and null pairs, respectively, in a given sociogram. Then m + a + n = g(g - 1)/2, the total number of pairs. Randomly and without replacement, these m (mutual), a (asymmetric), and n (null) pairs are distributed to the pairs of group members so that all arrangements are equally likely. In practice this might be done as follows. Assume the individuals are numbered from 1 to g. Put g(g - 1)/2 balls, numbered consecutively from 1 to g(g - 1)/2, into an urn. Let ball 1 refer to the pair (1,2), ball 2 to pair (1,3), etc.; ball g - 1 to pair (1,g), ball g to pair (2,3), ball g + 1 to pair (2,4) etc.; and finally ball g(g - 1)/2 refer to pair (g - 1,g). This is a triangular enumeration of the unordered pairs. The balls are then drawn out of the urn, one at a time without replacement. To the pairs corresponding to the numbers on the first m balls that are drawn, assign mutual pairs. For the next a balls that are drawn, assign asymmetric pairs to the corresponding pairs of individuals. The directions of these asymmetric choices are then decided by a tosses of a fair coin. The remaining n pairs of individuals corresponding to the n undrawn balls are assigned null pair relations. If m = 0 and n = 0, this is the usual random distribution used in the analysis of paired comparisons.
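The urn scheme translates almost directly into code. The sketch below is one possible rendering, not the authors' program: the function name `random_sociogram` is hypothetical, individuals are numbered from 0 rather than 1, and shuffling the list of pairs stands in for drawing the balls one at a time without replacement.

```python
import random
from itertools import combinations

def random_sociogram(g, m, a, n, rng=random):
    """Return a g x g 0/1 sociomatrix with m mutual, a asymmetric, n null pairs."""
    pairs = list(combinations(range(g), 2))   # triangular enumeration of pairs
    if m + a + n != len(pairs):
        raise ValueError("m + a + n must equal g(g - 1)/2")
    rng.shuffle(pairs)                        # order in which balls are drawn
    X = [[0] * g for _ in range(g)]
    for i, j in pairs[:m]:                    # first m balls: mutual pairs
        X[i][j] = X[j][i] = 1
    for i, j in pairs[m:m + a]:               # next a balls: asymmetric pairs
        if rng.random() < 0.5:                # fair coin decides the direction
            X[i][j] = 1
        else:
            X[j][i] = 1
    return X                                  # undrawn pairs stay null

for row in random_sociogram(5, 2, 3, 5, random.Random(0)):
    print(row)
```

Passing a seeded `random.Random` instance makes a run reproducible; the default uses the module-level generator.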
5. THE DISTRIBUTION OF T
In this section, the mean and variance of T (the number of intransitive triads in a randomly constructed sociogram) are derived. We also discuss the results of a simulation study that bears on the question of the approximate normality of the chance distribution of the standardized variable τ.
Some notation is necessary. Let the intransitive triad types 021C, 030C, 111D, 111U, 120C, 201, and 210 be called type 1, type 2, ..., type 7, respectively. Let T_i be the number of triads of type i that appear in a given sociogram. Then T, the total number of intransitive triads, is

$$T = T_1 + T_2 + \cdots + T_7. \tag{3}$$

From equation (3) and standard formulas for the mean and variance of a sum of random variables we have the relations

$$\mu_T = \sum_i E(T_i) \tag{4}$$

and

$$\sigma_T^2 = \sum_i \mathrm{Var}(T_i) + 2 \sum_{i<j} \mathrm{Cov}(T_i, T_j). \tag{5}$$
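Equations (4) and (5) amount to summing a vector of means and all entries of a covariance matrix. A minimal sketch follows; the function name and the two-type numerical example are hypothetical and serve only to show the bookkeeping.

```python
import math

def mean_and_sd_of_total(means, cov):
    """Return (mu_T, sigma_T) for T = sum of the T_i.

    `means[i]` is E(T_i); `cov[i][j]` is Cov(T_i, T_j), with the
    variances Var(T_i) on the diagonal.
    """
    k = len(means)
    mu = sum(means)                                         # equation (4)
    var = sum(cov[i][i] for i in range(k))                  # sum of variances
    var += 2 * sum(cov[i][j] for i in range(k) for j in range(i + 1, k))
    return mu, math.sqrt(var)                               # equation (5)

# Two hypothetical triad types, for illustration only.
mu_T, sigma_T = mean_and_sd_of_total([1.0, 2.0], [[4.0, 1.0], [1.0, 3.0]])
print(mu_T, sigma_T)  # 3.0 3.0
```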
Hence, in order to calculate μ_T and σ_T it is sufficient to compute E(T_i), Var(T_i), and Cov(T_i,T_j). Theorem 1, below, expresses these quantities in terms of certain probabilities which may be computed from the random model. In defining these probabilities it is convenient to let (1,2,3) denote the triad formed by the persons labeled 1, 2, and 3. The "type" of a triad will refer to the seven distinct types of intransitive triads listed above. The probabilities that appear in theorem 1 are defined as follows.

p(j) = Probability that triad (1,2,3) is of type j.
p2(i,j) = Probability that triad (1,2,3) is of type i and triad (2,3,4) is of type j.
p1(i,j) = Probability that triad (1,2,3) is of type i and triad (3,4,5) is of type j.
p0(i,j) = Probability that triad (1,2,3) is of type i and triad (4,5,6) is of type j.

Note that the subscript k on pk(i,j) refers to the number of nodes that are common to the two triads in question.
Theorem 1: If T_i is the number of triads of type i in a random graph on g points then,

(a)
$$E(T_i) = \binom{g}{3} p(i)$$

(b)
$$\mathrm{Var}(T_i) = \binom{g}{3} p(i)[1 - p(i)] + 3(g - 3)\binom{g}{3}[p_2(i,i) - p_0(i,i)] + 3\binom{g-3}{2}\binom{g}{3}[p_1(i,i) - p_0(i,i)] + \binom{g}{3}^{(2)}[p_0(i,i) - p^2(i)]$$

(c)
$$\mathrm{Cov}(T_i, T_j) = -\binom{g}{3} p(i)p(j) + 3(g - 3)\binom{g}{3}[p_2(i,j) - p_0(i,j)] + 3\binom{g-3}{2}\binom{g}{3}[p_1(i,j) - p_0(i,j)] + \binom{g}{3}^{(2)}[p_0(i,j) - p(i)p(j)]$$
The quantities $\binom{g}{3}$ and $\binom{g-3}{2}$ that appear in theorem 1 are the binomial coefficients and the expression $\binom{g}{3}^{(2)}$ is the descending factorial. The main value of this theorem is that it indicates what probabilities must be computed from the model and how they are combined to obtain the means, variances, and covariances of the T_i. The proof of theorem 1 is straightforward but tedious and we omit it. A similar theorem is proved by Moran (1947) for the paired comparison case mentioned above, and the technique used there generalizes to prove theorem 1 as well. Finally, we mention that theorem 1 does not depend on the particular random model we have adopted for its validity. As it is stated (that is, without giving specific values to p(i), p2(i,i), etc.) theorem 1 is true for any random method of constructing graphs that does not depend on the labels of the points.

In the case of the random distribution we have adopted, there is some simplification in theorem 1. This is due to the fact that p1(i,j) = p0(i,j), and hence the third terms in the expressions for Var(T_i) and Cov(T_i,T_j) vanish.
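The way theorem 1 combines the probabilities can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the probabilities are passed in as plain numbers rather than computed from any particular random model, and the descending factorial of the binomial coefficient is written out as C(g,3)[C(g,3) - 1]. The four terms correspond to pairs of triads that coincide, share an edge, share a single node, and the correction taken over all pairs of distinct triads.

```python
from math import comb

def expected_count(g, p_i):
    """Part (a): E(T_i) = C(g,3) p(i)."""
    return comb(g, 3) * p_i

def triad_moment(g, p_i, p_j, p0, p1, p2, same_type):
    """Part (b), Var(T_i), if same_type is True; part (c), Cov(T_i, T_j), otherwise."""
    n_tri = comb(g, 3)
    if same_type:
        same_triad = n_tri * p_i * (1 - p_i)
    else:
        same_triad = -n_tri * p_i * p_j
    share_edge = 3 * (g - 3) * n_tri * (p2 - p0)         # two common nodes
    share_node = 3 * comb(g - 3, 2) * n_tri * (p1 - p0)  # one common node
    all_pairs = n_tri * (n_tri - 1) * (p0 - p_i * p_j)   # descending factorial term
    return same_triad + share_edge + share_node + all_pairs

# If triads were fully independent, every pk(i,i) would equal p(i)^2 and the
# variance would reduce to the binomial value C(g,3) p (1 - p).
p = 0.1
print(triad_moment(6, p, p, p * p, p * p, p * p, same_type=True))
```

Setting `p1` equal to `p0` reproduces the simplification noted above for the adopted random distribution, since the third term is then zero.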
In order to use theorem 1 in conjunction with equations (4) and (5) to compute μ_T and σ_T², the values of p(i) and pk(i,j) must be computed. These are given in tables 1, 2, and 3. Throughout these three tables, the descending factorial notation is used, that is,

$$x^{(k)} = x(x - 1) \cdots (x - k + 1). \tag{6}$$

The denominators of these probabilities D1, D2, D3 are given by

$$D_1 = \binom{g}{2}^{(3)} \tag{7}$$

$$D_2 = \binom{g}{2}^{(6)} \tag{8}$$

$$D_3 = \binom{g}{2}^{(5)} \tag{9}$$

To illustrate how these probabilities are calculated we shall consider three examples: p(1), p0(1,1), and p2(1,3).

TABLE 1
D1 p(i) AS FUNCTION OF m, a, n

TRIAD    D1 p(i)
021C     (3/2) a^(2) n
030C     (1/4) a^(3)
111D     3 m a n
111U     3 m a n
120C     (3/2) m a^(2)
201      3 m^(2) n
210      3 m^(2) a
To compute p(1), note from figure 1 that the triad labeled 021C has two asymmetric pairs and one null pair. The probability of getting these three pairs is

$$\frac{a(a - 1)n}{\binom{g}{2}\left[\binom{g}{2} - 1\right]\left[\binom{g}{2} - 2\right]} \cdot \frac{1}{2^2}. \tag{10}$$

The terminal factor of 1/2² comes from the necessity to orient the direction of the two asymmetric choices. But there are three possible positions for the null pair, that is, (1,2), (1,3), or (2,3). Furthermore, once the position of the null pair has been specified there are two ways the two asymmetric pairs can be oriented with respect to each other to produce a triad of type 1. Hence the value of p(1) is 3·2 = 6 times the expression in equation (10) or

$$p(1) = \frac{(3/2)\, a^{(2)} n}{\binom{g}{2}^{(3)}}. \tag{11}$$
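The same counting argument gives the chance probability of each intransitive triad type, and hence the expected value of T. The sketch below is illustrative and rests on an assumption that should be stated plainly: the numerators used are (3/2)a^(2)n, (1/4)a^(3), 3man, 3man, (3/2)ma^(2), 3m^(2)n, and 3m^(2)a for types 1 through 7, obtained by repeating the reasoning for p(1) for each type, with denominator C(g,2)^(3). The function names are hypothetical.

```python
from math import comb

def falling(x, k):
    """Descending factorial x(x - 1)...(x - k + 1); equals 1 when k = 0."""
    out = 1
    for step in range(k):
        out *= x - step
    return out

def intransitive_probs(g, m, a, n):
    """Chance probability p(i) of each intransitive triad type."""
    d1 = falling(comb(g, 2), 3)
    numerators = {
        "021C": 1.5 * falling(a, 2) * n,
        "030C": 0.25 * falling(a, 3),
        "111D": 3 * m * a * n,
        "111U": 3 * m * a * n,
        "120C": 1.5 * m * falling(a, 2),
        "201": 3 * falling(m, 2) * n,
        "210": 3 * falling(m, 2) * a,
    }
    return {triad: num / d1 for triad, num in numerators.items()}

def expected_T(g, m, a, n):
    """mu_T = C(g,3) times the sum of the seven p(i)."""
    return comb(g, 3) * sum(intransitive_probs(g, m, a, n).values())

# Four persons with 1 mutual, 3 asymmetric, and 2 null pairs.
print(round(expected_T(4, 1, 3, 2), 4))  # 2.15
```

For very small groups the value can be checked by enumerating every arrangement of the pairs and every orientation of the asymmetric choices.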
To compute p0(1,1) it is useful to remember that, except for the fact that the m, a, and n pairs are distributed without replacement, any two triads with no common edges are statistically independent. Hence p0(1,1) is essentially the "square" of p(1), except that descending factorials are used rather than powers. Thus

$$p_0(1,1) = \left(\frac{3}{2}\right)^2 \frac{a^{(2)} n \,(a - 2)^{(2)} (n - 1)}{\binom{g}{2}^{(6)}} = \frac{(9/4)\, a^{(4)} n^{(2)}}{\binom{g}{2}^{(6)}}. \tag{12}$$

The numerator of this expression appears in the (1,1) position of table 2.

[Tables 2 and 3 (bodies not reproduced): numerators of p0(i,j) and p2(i,j), respectively, as functions of m, a, n.]
To compute probabilities like p2(1,3), it is necessary to recognize that there are two possibilities for the common edge between the two triads (1,2,3) and (2,3,4). If triad (1,2,3) is to be of type 1 and triad (2,3,4) is to be of type 3, their common edge (2,3) can be either an asymmetric or a null pair. These give two essentially different cases which are depicted in figure 2. First consider figure 2(a). There are four versions of this case: two possible orientations of the two asymmetric choices in triad (1,2,3) times the two positions for the mutual choice in triad (2,3,4). (The orientation of the asymmetric choice in a triad of type 3 is determined once the position of the mutual choice has been specified.) Each of the four versions of figure 2(a) has probability

$$\frac{m\, a^{(3)} n}{\binom{g}{2}^{(5)}} \cdot \frac{1}{2^3}. \tag{13}$$

Thus the contribution to p2(1,3) from the cases like figure 2(a) is four times this or

$$\frac{(1/2)\, m\, a^{(3)} n}{\binom{g}{2}^{(5)}}. \tag{14}$$

There are four versions of figure 2(b): two possible orientations for the asymmetric choice between (2,3) times two possible positions for the other asymmetric in (1,2,3). Each of these four versions of figure 2(b) has probability

$$\frac{m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}} \cdot \frac{1}{2^2}; \tag{15}$$

hence the contribution to p2(1,3) from the cases like figure 2(b) is four times this or

$$\frac{m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}}. \tag{16}$$

Adding (14) and (16) together gives the value of p2(1,3), that is:

$$p_2(1,3) = \frac{(1/2)\, m\, a^{(3)} n + m\, a^{(2)} n^{(2)}}{\binom{g}{2}^{(5)}}. \tag{17}$$

The numerator of equation (17) appears in the (1,3) position of table 3.
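The two contributions to p2(1,3) can be evaluated numerically as a check on the arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the formula coded is the sum of the two contributions described above, (1/2) m a^(3) n + m a^(2) n^(2), divided by the descending factorial C(g,2)^(5).

```python
from math import comb

def falling(x, k):
    """Descending factorial x(x - 1)...(x - k + 1)."""
    out = 1
    for step in range(k):
        out *= x - step
    return out

def p2_type1_type3(g, m, a, n):
    """P(triad (1,2,3) is 021C and triad (2,3,4) is 111D) under the random model."""
    denominator = falling(comb(g, 2), 5)
    three_asym_case = 0.5 * m * falling(a, 3) * n         # cases like figure 2(a)
    two_asym_case = m * falling(a, 2) * falling(n, 2)      # cases like figure 2(b)
    return (three_asym_case + two_asym_case) / denominator

# Four persons with 1 mutual, 3 asymmetric, and 2 null pairs.
print(p2_type1_type3(4, 1, 3, 2))  # 0.025
```

With g = 4 the two triads use five of the six pairs, so the value can be confirmed by enumerating all arrangements.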
In order to check our calculations of the mean and variance of T, and to ascertain how well the standard normal distribution approximates the distribution of τ (for the purpose of computing significance levels), we performed a simulation study. Twenty-seven sets of 100 random groups each were generated by a computer program. In each set of 100 random groups, the values of g, m, a, and n were fixed at designated values and random sociograms were generated using an algorithm much like the one described at the end of section 4. For each random sociogram generated, τ was computed. For each set of 100 simulations, the mean and variance of the 100 values of τ were found, and we recorded the number of times in the set of 100 simulations that τ was less than -1.282, -1.645, and -2.326 (the one-sided negative 10 percent, 5 percent, and 1 percent points for the standard normal distribution, respectively). The agreement between the theoretical and observed means and variances of τ for each of the twenty-seven sets of 100 simulations is excellent. This implies that our formulas for computing the mean and variance of T are correct and, more importantly, that our computer program for performing these calculations is correct. It also argues for the adequacy of the pseudorandom number generator used to perform the simulations.

[Figure 2 (diagrams not reproduced): panels (a) and (b), each showing nodes 1, 2, 3, and 4.]
FIG. 2. Two essentially different ways that (1,2,3) can be of type 1 (021C) while (2,3,4) is of type 3 (111D).
Table 4 summarizes the results of the simulation study that bear on the question of the adequacy of the approximation of the standard normal distribution to the distribution of τ. The overall agreement, across all the values of g, m, a, and n, is very good. Of the total of 2,700 simulations, 10.2 percent of the time τ was less than the negative 10-percent point of the normal distribution. The corresponding figures for the 5 percent and 1 percent points are 4.7 percent and 1.1 percent, respectively.
The actual distributionof r is discrete,of course,and one expectsthe
normal approximationto be best when the total numberof possible
values of r is large.This numberis a functionofg, the size ofthe group,
but also of the number of mutual (m) and asymmetricchoices (a).
505
AmericanJournalof Sociology
Examinationof the individualvalues of r that resultforsmallgroupssize 5, 6, and 7-reveals that the numberof possiblevalues of r is small
(sometimes as few as four or five), especially when m and a are also small. In table 4 the twenty-seven different values of g, m, a, and n used are grouped into six classes by the size of g. The number of times τ was significant at the 10, 5, and 1 percent levels for the corresponding 100 simulations is given in the same row. The average of the number of times τ was significant for each of the six classes is also given. These averages range from 7.50 to 14.33 at the 10 percent level, 4.20 to 5.83 at the 5 percent level, and 0.40 to 1.67 at the 1 percent level. Although there is some evidence that the discreteness of the distribution causes trouble in a few cases (g = 7, m = 1, a = 3; g = 5, m = 2, a = 3; g = 5, m = 1, a = 2; and perhaps even g = 20, m = 38, a = 38), the approximation seems to work very well for most of the situations simulated. There is some evidence that for groups of size 11 to 13, the 10 percent point is really about a 15 percent point, but this effect is not carried over to the smaller significance levels. The simple practical conclusion is that if τ is referred to tables of the percentage points of the standard normal distribution and found to be significant at the 5 percent level or less, this is not due to the inadequacy of the normal approximation.

TABLE 4
RESULTS OF SIMULATION STUDY*

   g       m       a       n       10%      5%      1%
  34      70      28     463       15       6       1
  35     179     119     279        8       3       2
  34     112      56     393       11       7       1
  35     119     119     357        9       7       1
  Average % ..................   10.75    5.75    1.25

  26      41      16     268       15       4       0
  25      90      60     150        9       5       3
  25      60      30     210        9       7       2
  25      60      60     180        9       4       0
  Average % ..................   10.50    5.00    1.25

  20      24       9     157        8       5       1
  20      57      38      95       12       3       1
  20      38      19     133       13       7       1
  20      38      38     114        4       4       0
  Average % ..................    9.25    4.75     .75

  16      15       6      99       11       8       1
  16      36      24      60        5       5       3
  16      24      12      84        7       3       2
  15      21      21      63        7       1       0
  Average % ..................    7.50    4.25    1.50

  13       4      10      64       16       7       1
  12      13      20      33       15       6       3
  12       7      13      46       17       5       0
  11      11      11      33       18       3       3
  10       9       9      27       11       8       1
   9       7      11      18        9       6       2
  Average % ..................   14.33    5.83    1.67

   7       1       3      17        1       1       0
   7       2       4      15       14       5       1
   6       3       3       9       12       4       1
   5       2       3       5        4       4       0
   5       1       2       7        7       7       0
  Average % ..................    7.60    4.20    0.40

  Overall average % ..........   10.2     4.7     1.1

* Number of times τ exceeded the 10 percent, 5 percent, and 1 percent cutoff points for the normal distribution for selected values of g, m, a, and n; 100 simulations for each choice of g, m, a, and n.
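As a rough aid to reading counts of this kind, the sketch below compares an observed number of rejections with what the nominal level would predict, treating the simulation runs as independent binomial trials. This is an illustrative assumption, not a procedure described by the authors; the function name `exceedance_z` and the example count are hypothetical.

```python
import math

def exceedance_z(count, n_sims, level):
    """z-score of an observed rejection count against its binomial expectation.

    Assumes (for illustration only) that the n_sims runs are independent
    and that each rejects with probability equal to the nominal level.
    """
    expected = n_sims * level
    sd = math.sqrt(n_sims * level * (1.0 - level))
    return (count - expected) / sd

# Hypothetical count: 15 rejections in 100 runs at the nominal 10 percent level.
z = exceedance_z(15, 100, 0.10)
print(round(z, 2))
```

A single row with 15 rejections in 100 runs is therefore only mildly surprising under this assumption; a consistent excess across several rows is more informative than any one count.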
6. A COMPARISON OF τ AND δ
In this section we compare the values of τ and δ in eight sociograms drawn arbitrarily from the sociometry literature. The models upon which these two measures are based differ solely in the acceptability of 012 triads. This triad is vacuously transitive, but nonpermissible for the model of ranked clusters. This difference is an important one, both empirically and theoretically. In the analyses which follow we shall see that some groups possess far more 012 triads than the random model predicts. While these surpluses do not bear directly on the transitivity hypothesis, they are fatal to the ranked-clusters model. Indeed, Davis and Leinhardt (1970) reported that in analyses of sixty groups their hypotheses were strongly contradicted only in the case of the 012 triad. To understand the substantive significance of these findings, it will be necessary to review the structures the two models describe, so that we may see what role 012 triads play in each.
Briefly, Davis and Leinhardt predict that group structure will tend to be arranged into a system of hierarchically arranged levels, each of which may contain one or more cliques of one or more group members. This structure is a product of tendencies in pair relations. People in the same clique, they suggest, will tend to choose and be chosen by each other, while members of different cliques will tend to refrain from choosing one another. A hierarchy is introduced because lower-status group members tend also to choose higher-status members who fail to reciprocate these choices. Postulating that an arrangement of group members which contradicted these tendencies would be "inconsistent" or "uncomfortable" and that group members would tend to avoid them, Davis and Leinhardt singled out triads which presented such inconsistencies, proved that their model was implied when these triads were absent, and showed that there was some empirical justification for their theory. Their example of such a structure appears in figure 3. Since this model of ranked clusters assumes only one hierarchical system, Davis and Leinhardt concluded that the 012 triad was inconsistent: "The two N relations imply that (the group members) are all on the same level, although in different cliques; but the A relation . . . implies that (one group member) is in a higher level," a contradiction (Davis and Leinhardt 1970).
FIG. 3.-An example of ranked clusters from Davis and Leinhardt (1970). [Diagram: levels (high, middle, low), cliques 1 through 5, and the relations among them.]
For our model, however, the structural proposition is the association of the transitive property with the interpersonal relation. Since the hypothesis of this condition is not met in the 012 triad, we consider it to be vacuously transitive. Now, if we examine this triad in light of Davis and Leinhardt's conceptualization of the structural role of pairs, we can place the two members linked by an asymmetric pair into a status hierarchy which is unrelated to the third member. This structure is a perfectly legitimate partial ordering. The prevalence of 012 triads, then, might be considered as evidence for the existence within a group of at least two components each of which may contain orderings. Multiple orderings with more than one connected component are common forms of group structure. An excellent example occurs in the often noted "sex cleavage" of children's groups. Figure 4 presents an illustration of an idealized dichotomized children's group in which the boys' and girls' subgroups are separate systems of ranked clusterings. This structure would be a perfect example of the transitivity model, while the prevalence of 012 triads in the group as a whole would contradict the hypothesis of the ranked clustering model.
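The point about 012 triads can be made concrete with a small sketch. It assumes a sociogram stored as a 0/1 adjacency matrix (`adj[i][j] == 1` when i chooses j) and labels each triad by its counts of mutual, asymmetric, and null pairs. The helper names and the four-person group are hypothetical and are not taken from figure 4.

```python
from itertools import combinations

def dyad_type(adj, i, j):
    """Classify a pair as 'M' (mutual), 'A' (asymmetric), or 'N' (null)."""
    ij, ji = adj[i][j], adj[j][i]
    if ij and ji:
        return "M"
    if ij or ji:
        return "A"
    return "N"

def triad_man(adj, i, j, k):
    """Three-digit label: numbers of M, A, and N pairs in the triad."""
    types = [dyad_type(adj, i, j), dyad_type(adj, i, k), dyad_type(adj, j, k)]
    return "".join(str(types.count(t)) for t in "MAN")

def count_triads(adj, label):
    """Number of triads in the group carrying the given M-A-N label."""
    members = range(len(adj))
    return sum(1 for t in combinations(members, 3) if triad_man(adj, *t) == label)

def is_transitive(adj):
    """True if i->j and j->k imply i->k for all distinct i, j, k."""
    g = range(len(adj))
    return all(adj[i][k] for i in g for j in g for k in g
               if len({i, j, k}) == 3 and adj[i][j] and adj[j][k])

# Hypothetical group with two unconnected components, each a two-person
# hierarchy: member 0 chooses 1, member 2 chooses 3, no choices across.
adj = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(count_triads(adj, "012"), is_transitive(adj))
```

In this toy group every triad is a 012 triad, yet the relation is transitive (vacuously, since no two choices chain together). The same configuration would count against a ranked-clusters reading of the group as a whole.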
In table 5, we have listed the results of analyses of eight groups. The last four columns of the table give values for δ, δSTD (a version of δ standardized to have mean zero and variance one), τ, and a standardized measure for 012 triads (012STD). The two new standardized values are computed in a manner analogous to that for τ. Theorem 1 is used to generate variance and covariance terms. While all necessary probabilities are not presented in our tables, their computation is straightforward and the measures are calculated in our computer program. An argument similar to that put forth for the approximate normality of τ supports referring these standardized variables to the normal distribution.
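Standardizing to mean zero and variance one amounts to subtracting the expected value under the random model and dividing by the standard deviation. The sketch below shows only that final step; the expectation and variance would come from Theorem 1, and the numbers used here are hypothetical placeholders rather than values from the paper.

```python
import math

def standardize(observed, expected, variance):
    """Return (observed - expected) / sqrt(variance).

    `expected` and `variance` are the moments of the statistic under the
    chosen random model; they are supplied by the caller.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return (observed - expected) / math.sqrt(variance)

# Hypothetical moments, for illustration only.
print(standardize(10.0, 16.0, 9.0))
```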
FIG. 4.-A possible group with a two-component system of ranked clusters (N pairs are not connected). [Diagram: separate Boys and Girls components; members labeled J, I, A, K, B, C.]
The Davis-Leinhardt model of ranked clusterings predicts that δ will be negative. On the basis of a small simulation study, Davis and Leinhardt suggest that a value of δ that is less than -.05 ought to be considered statistically significant. To check this, we compare δ with δSTD (whose statistical significance may be ascertained by reference to normal tables) for the eight groups listed in table 5. Two facts emerge from this comparison. First, δSTD is not monotonically related to δ (e.g., for group 1, δ = -.152, while δSTD = -3.63; and for group 2, δ = -.091, while δSTD = -4.88). Second, if we use -1.645 as the 5 percent cutoff point for δSTD, the value of -.05 for δ suggested by Davis and Leinhardt is fairly well supported by these eight groups. For only one group (group 4), δ is not significant while δSTD is: δ is greater than -.05, while δSTD is less than -1.645. These two findings suggest to us that the -.05 cutoff point for δ is roughly right, but the strength of the significance is not correctly given by the size of δ.
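The lack of a monotonic relation can be checked directly from the two groups quoted above. The sketch below ranks groups 1 and 2 by δ and by δSTD and computes one-sided lower-tail normal probabilities; the helper `normal_cdf` is an illustrative assumption about how "reference to normal tables" might be carried out today.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# (delta, delta_std) for groups 1 and 2, as quoted in the text.
groups = {1: (-0.152, -3.63), 2: (-0.091, -4.88)}

by_delta = sorted(groups, key=lambda g: groups[g][0])      # most negative delta first
by_delta_std = sorted(groups, key=lambda g: groups[g][1])  # most negative delta_std first
print(by_delta, by_delta_std)

for g, (delta, delta_std) in groups.items():
    # One-sided test: small lower-tail probability means significantly negative.
    print(g, delta < -0.05, normal_cdf(delta_std) < 0.05)
```

Group 1 has the more negative δ but group 2 has the more negative δSTD, so the two orderings disagree even though both groups pass either cutoff.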
The measures τ and δSTD may be used to compare the transitivity model with Davis and Leinhardt's model of ranked clusterings; τ is significant in all but one case (group 5), while δSTD is significant in all but two cases (groups 5 and 8). Overall, the significant values of τ are more negative than the significant values of δSTD, although not always so (e.g., in group 1 δSTD = -3.63, while τ = -2.401).
TABLE 5 [table body not recoverable from the source; per the text, it reports δ, δSTD, τ, and 012STD for the eight groups in its last four columns]
The last column of table 5 contains a standardized measure of the number of 012 triads in the group. Since the 012 triad distinguishes the transitivity model from the model of ranked clusterings, it is of interest to see how often this triad occurs in a sociogram. If we use a two-sided 5 percent test that the 012 triads occur at random, there are only two groups (groups 4 and 8) with significant values of 012STD (i.e., that exceed 1.96 in absolute value). In both of these groups, the value of 012STD indicates that the number of 012 triads is larger than expected by chance. We offer the following tentative explanation of this finding. Leinhardt (1968) found that when classroom groups were divided into sexually homogeneous subgroups and the subgroups analyzed separately, the number of 012 triads became fewer than expected, whereas when the groups were not divided there were more 012 triads than expected by chance. As indicated in figure 4, this finding is consonant with a model that incorporates sex cleavage as well as ranking and cliquing. Sex cleavage of classroom groups has been observed to become stronger through the elementary grades and then weaker during the college years. Returning to the last column of table 5, we note that the two groups with significantly more 012 triads than expected were both eighth-grade classes consisting of boys and girls. The other groups are either older or sexually homogeneous. Our explanation is, therefore, that sex cleavage has created the excess of 012 triads observed in groups 4 and 8.
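The cutoffs used in this section (1.645 for a one-sided and 1.96 for a two-sided 5 percent test) are percentage points of the standard normal distribution. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`, shows where they come from and how a two-sided test on a standardized value such as 012STD might be applied; the function names and the sample values passed to them are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def one_sided_cutoff(alpha):
    """Upper percentage point for a one-sided test at level alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)

def two_sided_cutoff(alpha):
    """Upper percentage point for a two-sided test at level alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)

def significant_two_sided(z, alpha=0.05):
    """True if a standardized value exceeds the two-sided cutoff in absolute value."""
    return abs(z) > two_sided_cutoff(alpha)

print(round(one_sided_cutoff(0.05), 3), round(two_sided_cutoff(0.05), 2))
# Hypothetical standardized values, not taken from table 5.
print(significant_two_sided(2.5), significant_two_sided(-1.0))
```

The sign of a significant value then indicates direction: a positive value means more 012 triads than the random model predicts, a negative value fewer.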
7. DISCUSSION
Sociologists, with growing frequency and sophistication, are turning to mathematics as a language in which to model social behavior. Two assumptions underlie this trend. One holds mathematics to be a clearer, better way of expressing relationships between variables. The other suggests that mathematical expressions, because they are easily manipulated, will render new, non-obvious relationships apparent (Beauchamp 1970). Clearly, these assumptions have been implicit in the work on graph theoretic models of structure in interpersonal relations and both have been corroborated. The models have produced new understanding of the interdependent relations which link group members, and have suggested a sociological rather than psychological interpretation for consistency or balance theories (Davis 1968). Nonetheless, the test of sociological theory, be it mathematical or verbal, must be empirical.
With this in mind we have developed procedures which permit testing of tendencies in sociometric data toward a variety of graph theoretic models of structure. We have presented a theorem which specifies the probabilities needed for standardized measures based on triad frequencies, and have provided formulas for a general model, a partial order. While these formulas are dependent upon the random model chosen, the procedures used to generate them are not, and we discussed why we thought more research was needed on random models for sociometric data. Since our interest in this report was principally methodological, we refrained from data analysis save an illustrative example in which the partial order model was compared with a special case, the ranked-clusters model of Davis and Leinhardt (1970).
REFERENCES
Beauchamp, Murry A. 1970. Elements of Mathematical Sociology. New York: Random House.
Cartwright, Dorwin, and Frank Harary. 1956. "Structural Balance: A Generalization of Heider's Theory." Psychological Review 63:277-93.
Davis, James A. 1967. "Clustering and Structural Balance in Graphs." Human Relations 20:181-87.
Davis, James A. 1968. "Social Structures and Cognitive Structures." In Theories of Cognitive Consistency: A Sourcebook, edited by R. P. Abelson, E. Aronson, W. J. McGuire, T. M. Newcomb, M. J. Rosenberg, and P. H. Tannenbaum. Chicago: Rand McNally.
Davis, James A., and Samuel Leinhardt. 1970. "The Structure of Positive Interpersonal Relations in Small Groups." In Sociological Theories in Progress, edited by Joseph Berger, Morris Zelditch, Jr., and Bo Anderson. Vol. 2. Boston: Houghton Mifflin (in press).
Harary, Frank, Robert Z. Norman, and Dorwin Cartwright. 1965. Structural Models. New York: Wiley.
Hayes, M. L., and M. E. Conklin. 1953. "Intergroup Attitudes and Experimental Change." Journal of Experimental Education 22:19-36.
Heider, Fritz. 1946. "Attitudes and Cognitive Organization." Journal of Psychology 21:107-12.
Heider, Fritz. 1958. The Psychology of Interpersonal Relations. New York: Wiley.
Hempel, Carl G. 1952. Fundamentals of Concept Formation in Empirical Science. In Encyclopedia of Unified Science. Vol. 2, no. 7. Chicago: University of Chicago Press.
Holland, Paul W., and Samuel Leinhardt. 1970. "A Unified Treatment of Some Structural Models for Sociometric Data." Technical Report, Carnegie-Mellon University, January 1970.
Horace Mann-Lincoln Institute of School Experimentation. 1947. How to Construct a Sociogram. New York: Bureau of Publications, Teachers College, Columbia University.
Katz, L., and J. H. Powell. 1954. "The Number of Locally Restricted Directed Graphs." Proceedings of the American Mathematical Association 5:621-26.
Katz, L., R. Tagiuri, and T. Wilson. 1958. "A Note on Estimating the Statistical Significance of Mutuality." Journal of General Psychology 58:97-103.
Kendall, M. G., and B. B. Smith. 1939. "On the Method of Paired Comparisons." Biometrika 31:324-45.
Landau, H. G. 1951a. "On Dominance Relations and the Structure of Animal Societies. I. Effect of Inherent Characteristics." Bulletin of Mathematical Biophysics 13:1-19.
Landau, H. G. 1951b. "On Dominance Relations and the Structure of Animal Societies. II. Some Effects of Possible Social Factors." Bulletin of Mathematical Biophysics 13:245-62.
Landau, H. G. 1953. "On Dominance Relations and the Structure of Animal Societies. III. The Condition for a Score Structure." Bulletin of Mathematical Biophysics 15:143-48.
Leinhardt, Samuel. 1968. "The Development of Structure in the Interpersonal Relations of Children." Ph.D. dissertation, University of Chicago.
Moran, P. A. P. 1947. "On the Method of Paired Comparisons." Biometrika 34:363-65.
Moreno, J. L. 1953. Who Shall Survive? New York: Beacon House.
Taba, H. E., E. Brady, J. Robinson, and W. Vickery. 1951. Diagnosing Human Relations Needs. Washington: American Council on Education.
Taba, H. E., and D. Elkins. 1950. With Focus on Human Relations. Washington, D.C.: American Council on Education.
Zeleny, L. D. 1947. "Selection of the Unprejudiced." Sociometry 10:396-401.
Zeleny, L. D. 1950. "Adaptation of Research Findings in Social Leadership to College Classroom Procedures." Sociometry 13:314-28.