Hierarchical Grouping to Optimize an Objective Function

Hierarchical Grouping to Optimize an Objective Function
Author(s): Joe H. Ward, Jr.
Source: Journal of the American Statistical Association, Vol. 58, No. 301 (Mar., 1963), pp. 236244
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2282967
Accessed: 23/09/2010 10:19
Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at
http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless
you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you
may use content in the JSTOR archive only for your personal, non-commercial use.
Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at
http://www.jstor.org/action/showPublisher?publisherCode=astata.
Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed
page of such transmission.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact [email protected].
American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to Journal
of the American Statistical Association.
http://www.jstor.org
GROUPING TO OPTIMIZE
HIERARCHICAL
OBJECTIVE FUNCTION*
AN
JoEH. WARD, JR.
AerospaceMedical Division,LacklandAir ForceBase
A procedurefor forminghierarchicalgroupsof mutuallyexclusive
subsets,each of whichhas membersthat are maximallysimilarwith
is suggestedfor use in large-scale
respectto specifiedcharacteristics,
(n > 100) studieswhena preciseoptimalsolutionfora specifiednumber
of groupsis not practical. Given n sets, this procedurepermitstheir
theunionofall
reductionto n -1 mutuallyexclusivesetsby considering
possiblen(n -1)/2 pairs and selectinga unionhavinga maximalvalue
for the functionalrelation,or objective function,that reflectsthe
criterionchosen by the investigator.By repeatingthis process until
only one group remains,the complete hierarchicalstructureand a
quantitativeestimate of the loss associated with each stage in the
groupingcan be obtained. A general flowcharthelpfulin computer
and a numericalexampleare included.
programming
S
ITUATIONS
1. FORMULATION
OF THE GROUPING
PROBLIEM
oftenarisein whichit is desirableto clusterlargenumbersof
objects, symbols,or persons into smaller numbersof mutually exclusive
groups,each having membersthat are as much alike as possible. Groupingin
this mannermakes it easier to considerand understandrelationsin large colofmanagement.Grouping,however,
lections;henceit oftenincreasesefficiency
of
information
that may be quantifiedin a
ordinarilyresults in some loss
number.
"value-reflecting"
In earlierformulationsofthe groupingproblem,Cox [2] and Fisher [31were
concernedprimarilywithgroupingbased on similaritywith respectto a single
variable. Cox treatedthe special case whenthe variableis normallydistributed,
and brieflydiscussed the two-dimensionalproblem. Fisher described a technique for groupingwithout regard for the distributionof the variable. His
procedureis designed to minimizea weighted sum of squares about group
fromboth.
means on one variable. Our formulationof the problemdiffers
For our purposes,it was necessaryto take account of the similarityof group
memberswithrespectto many variables. Our desirewas to formeach possible
numberof groups,n, n -1, * * * , 1, in a mannerthat would minimizethe loss
associated with each grouping,and to quantifythat loss in a formthat could
be readily interpreted.Since it was not feasible to obtain the many optimal
solutionsthat would be requiredforlarge-scalestudies (n > 100), a compromise
was necessary. Therefore,we proposed (a) to reduce the number of groups
fromn to n -1 in a mannerthat would minimizethe loss and then, without
modifyingthe groupsformed,to repeat the processuntilthe numberof groups
was systematicallyreduced fromn to 1, if desired,and (b) to evaluate loss in
termsof whateverfunctionalrelationbest expressedan investigator'scriterion
forgrouping.
* The researchreportedin this paper was sponsoredby 6570th PersonnelResearch Laboratory,Aerospace
Medical Division,underAFSC Project 7784, Task 773403.
236
GROUPING
237
TO OPTIMIZE
Function
Objective
{2, 6, 5, 6,2, 2, 2,0, 0,0 }, a common
Givena setofratings
for10individuals,
practiceis to use the meanvalue to represent
all the scoresratherthan to
consider
individualscores.The "loss"in information
thatresultsfromtreating
the 10 scoresas one groupwitha meanof2.5 can be indicatedby a 'valuereflecting"
number,
theerrorsumofsquares(ESS).
The errorsumofsquaresis givenbythefunctional
relation,
1
2
ESS=Z 0=1x --(n
/n
xi)
2
=1
wherexi is thescoreoftheithindividual.The ESS fortheexampleis
10
ESS(one
group)
X -- 0
2
1
2
0
xi
= 113 -62.5
= 50.5.
ifthe10 individuals
are classified
to theirscoresintofour
Similarly,
according
sets,
{0
{o,o0,O}
2, 2 ,2 2},
{5}2
{6, 6}
can be evaluatedas the sumofthefourerrorsumsof squares,
thisgrouping
ESS (fourgroups) = ESS (Group1) + ESS (Group2) + ESS (Group3) + ESS (Group4).
A functional
relationthatprovidesa "value-reflecting"
numberofthistype
willbe referred
In general,an objective
to hereas an "objectivefunction."
selectsto reflect
function
maybe anyfunctional
relationthatan investigator
ofgroupings.
In thisexample,theobjectivefunction
is
therelativedesirability
as reflected
'loss ofinformation"
by theerrorsumofsquares.The mostdesiris its minimum
able levelofthisobjectivefunction
value,00. The objectivevalues that werecomputedindicatethe information
lost whenthe
function
10 scoresare treatedas a singleset (50.5) and as foursets(0.0). The natureof
ofcourse,dictatethechoiceand interpretation
theproblemand thecriterion,
oftheobjectivefunction.
A numberofdifferent
objectivefunctions
have beendevelopedand put to
inapplications
ofthegrouping
use.Theseareexemplified
operational
procedure
theclusternowprogrammed
ResearchLaboratory
to accomplish
byPersonnel
withrespectto measuredcharingof (a) personsto maximizetheirsimilarity
timewhenmenare reassigned
acteristics,
(b) jobs to minimizecross-training
to minimizeerrorsin
accordingto establishedpolicies,(c) job descriptions
and
a largenumberof jobs witha smallnumberof descriptions,
describing
theloss ofpredictive
(d) regression
equationsto minimize
efficiency
resulting
in the numberofequations.In (a) the objectivefunctionis
fromreductions
the grandsum of the squareddeviationsabout the meansof all measured
The objectivefunction
for(b) is expectedcross-training
characteristics.
time,
assumingrandommovement
amongjobs in each cluster;whereasin (c) it is
amountofjob timeincorrectly
andin (d) lossofpredictive
described
efficiency.
the same generalcomputerprogramforgroupingcan be used
Fortunately,
In fact,the four
withmanyobjectivefunctions
thatappearto be different.
238
AMERICAN
STATISTICAL
ASSOCIATION
JOURNAL,
MARCH 1963
objectivefunctionsjust describedall lead to the use ofthe same program[1, 7].
In some situations,the desirednumberofgroupscan be specifiedin advance;
in others,this is difficultto do.' With this groupingprocedure,one need not
value associated with
specifythe numberin advance, forthe objective-function
any given numberof groups (fromn to 1) is available. Indeed, the changesin
these values as the numberof groupsis systematicallyreduced furnishuseful
clues to the appropriatenumberof groupsto be used foroperationalpurposes.
HierarchicalGroups
The groupingprocedurereportedhereis based on the premisethat the greatas indicated by an objective function,is available
est amount of information,
whena set of n membersis ungrouped.Hence the groupingprocessstartswith
these n members,which are termedgroups,or subsets,althoughthey contain
only one member.The firststep in groupingis to select two of these n subsets
which,whenunited,will reduce by one the numberof subsetswhile producing
the least impairmentof the optimalvalue of the objective function.The n-1
resultingsubsets then are examinedto determineif a thirdmembershould be
unitedwiththe firstpair or anotherpairingmade in orderto securethe optimal
value of the objective functionforn-2 groups. This procedure can be continued,if desired,until all n membersof the originalarray are in one group.
* , 1), the
Since the numberof subsets is systematicallyreduced (n, n-i,
processis termed"hierarchicalgrouping"and the resultingmutuallyexclusive
groups "hierarchicalgroups."
Applicationsof HierarchicalGroupingProcedure
Hierarchicalgroups,formedin the mannerdescribed,are particularlyuseful
for classificationpurposes [2, 3, 4, 5, 6, 7]. They may be used, for example,
to establish taxonomiesof plants and animals with respect to genetic background. Such groups also permit organizingand cataloging materials, e.g.,
librarydocuments,so as to facilitatethe storageand retrievalof information.
Similarly,hierarchicalgroupingsof jobs may be helpful in identifyingjob
"types" and "subtypes."
As indicated previously,the usefulnessof this hierarchicalapproach is not
restrictedto classificationproblems.No seriousloss of predictiveaccuracy resulted,forinstance,when a hierarchicalgroupingof regressionequations was
used to select a smallernumberto replace the 60 equations developed forpredictionof success in Air Force technicalschools. While a hierarchicalarrangement may not always be needed forthe "best" partitioninginto groups,it is
preferredin many practical situations.Even when a nonhierarchicalgrouping
is sought, the approach described here probably will yield a good solution,
althoughit may be one that does not optimizethe objective functionforthe
specifiednumberof groups.
A problemoftenlinked with the groupingproblem is that of assigningor
1 When it is desiredto assign objects to k, a specifiednumberof groups-withoutregardforthe subdivisions
of these groups-the goal is only to identifythe k groupsthat optimizethe objective function.If the objective
problem.If the objectivefunction
functionis a linearform,the problemcan be formulatedas a linearprogramming
problem [2]; howevercertain
is nonlinearin form,the problemcan be formulatedas a nonlinearprogramming
must be considered.
computationaldifficulties
GROUPING
239
TO OPTIMIZE
newobjectsintoacceptedgroups.This problem,as wellas those
classifying
and decidingupon
associatedwithselectingappropriateobjectivefunctions
thenumberofgroupsto be utilized,willnot be discussedhere.The purpose
believedto be of
procedure
grouping
ofthisarticleis to reporta hierarchical
sinceit has manyapplications.
generalinterest
2. HIERARCHICAL
GROUPING
PROCEDURE
Cycle
Grouping
Hierarchical
startswith a uniprocedure
S(i, n). The grouping
Optimalunionofsubsets
subsets,i.e.,wellofn one-element
versalset (U), {el,e2,* I,en1,consisting
numberedin sedefinedobjects,symbols,or persons.These are arbitrarily
in computerprocessing.
To reducethe
identification
quenceforconvenient
thechangein the
numberofsubsetsto n-1, onenewsubset,whichminimizes
by unitingtwooftheoriginaln subsets,
value,is formed
objectivefunction's
say
[S(1, n)] U [S(2, n)]
=
{ei,
e2}I
foreachofthen(n-1)/2
Thisrequiresan evaluationoftheobjectivefunction
possibleunionsof subsetsS(i, n), i= 1, 2, * * , n,wherei refersto the number
n at thisstage,refersto the
the set, and the secondparameter,
identifying
in turn,the
As each unionis considered
numberof setsunderconsideration.
hypothesized
and
is
computed
function
objective
value of the corresponding
union.The identityof
to be "equal to or betterthan"thatofany preceding
This
ofcomparisons.
the
is
sequence
union
maintained
"best"
throughout
the
value
of thatunionwhichhas an objective-function
identification
facilitates
"equal to or betterthan"thatofany of then(n-1)/2 possibleunions.This
ofsubsetsis reduced
whenthenumber
unionis acceptedas an optimalgrouping
fromn to n-1.
value.For the
ofnewsubsetand its associatedobjective-function
Designation
in
subsetsare
which
the
to
show
a
out
sequence
of
display
purpose printing
machine
in
it
is
convenient
100
to
studies
in
1000),
(n=
united large-scale
new
each
to
subset
(resulting
identify
designations
special
to give
processing
in
fromacceptanceof a union)and its members.Thus the unionresulting
is
denoted:
n-1 subsets
S(pn
-
1)
=
[S(pn-1,
n)] U [S(q,-, n)]
where
Pn-1
the subsetin the n
used to identify
=the smallerof the twonumbers
thenewsubset.
originalsubsets.Thisnumberis used to identify
the subsetin the n
to
used identify
qn-l=the largerof the two numbers
after
"inactive"
it is used at this
is
number
This
subsets.
original
beenunited.
subsets
have
which
two
the
for
showing
printout
stage
it with
valueis denotedin thesamemannerto identify
The objective-function
thisunion:
Z[pn-1)
qn-1,n - 1].
240
AMERICAN
STATISTICAL
ASSOCIATION
JOURNAL,
MARCH 1963
The termat theright,n-1, showsthenumberofsubsetsremaining
afterthe
numbers
union;theleft-hand
and centertermsare the originalidentification
oftheunitedsubsets.
Optimalunionofsubsets
S(i, n-1),
*, S(i, n- [n-i]). Withn-I subsets,now,we have to defineand consider
S(i, n
wheni==1,2,
,
-
1)
S(i, n)
=
n; i5pPn1; anditqn-i
[S(pn-,, n) |J [S(qn-1,n) ]
and S(pn_1, n-1)
when i-pn4.
Selectionof an optimalunionto reducethe n -1 subsetsto n -2 requires
ofthe (n-I) (n-2)/2 possibleunionsin a manner
evaluationand comparison
analogousto thatused to reducethe n subsetsto n-1. Whenthishas been
value are
done, the acceptedunion and its associatedobjective-function
designated
S(pn2y
n - 2) = [S(pn-2,n - 1)] U
[S(qn-2,
n-1)]
and
Z [pn-2
qn-2)
n - 21
(pn-2 < qn-2).
are maintainedas before,i.e., Pn-2 iS the identification
The identifications
value and qn- is thatwiththelargernunumberwiththesmallernumerical
mericalvalue.
ifdesired,untilall subsetshave been
Thisgrouping
cyclecan be continued,
unitedto formthe universalset, U. At any phase in whichk mutuallyexthe objective-function
clusivesubsetsare underconsideration,
value,and the
as
unionwithwhichit is associatedwouldbe expressed
Z [i,j, k-1 ] associatedwith[S(i, k)]U [S(j, k)]
where
i=1,2,*
-,n-1
i 5? qn-1) qn-2)
j =i+
1,i+2,
qk
**,n
j y?qn-1l) qn-2) * * * ) qkc
Followingselectionof an optimalunion,this union and its corresponding
value wouldbe designated
objective-function
S(pk-1,
k - 1) = [S(pk-1,k)]UJ[S(qk-1,iG)]
and
Z[Pk-ly qk-1, kc
11
(pk-1 < qk-1).
the elementsofany subset,S(i, c), wouldbe designated
Furthermore,
S(i, k) =
{em1, ..
, ema, *
, e.,)
GROUPING
241
TO OPTIMIZE
where
in subset
t=number ofelements
inthesubset.
numberofathelement
ma= identification
NumericalExample ofHierarchicalGroupingCycle
let us considera smallproblemin whichfive
the procedure,
To illustrate
areto be groupedon thebasisofratingstheyhavegivenone object
individuals
is the sum of the
value to be minimized
(Figure1). The objective-function
likethisare
squareddeviationsaboutthegroupmean(ESS). Whileproblems
numbersand
readilydoneby handand do notrequiretheuse ofidentification
whenn> 25. Thereofdata is seldomeconomical
hand-processing
designations,
programming
thatmaybe helpfulin computer
fore,wegivea generalflowchart
operationsforthe exampleis
(Figure2). The patternforthe computational
shownin Figure3.
Numerical
Example
PrintoutDisplayingHierarchicalGroupingResults
Per- Object
son Rating Person
1
2
Numberof Groups
3
4
5
1
1
1
1
1
1
1
1
2
7
3
3
3
3
3
3
3
2
2
2
2
2
2
2
4
9
4
4
4
5
12
FIGURE.
5
5
ObjectiveFunction
Value
86.80
(ESS)
A1
A2
5
73.63
< -------------4
4
<---------------___-___
13.17
62.96
5
10.67
2.50
8.67
4
5
2.00
.50
1.50
5
.50
.00
1. Summaryof resultsof hierarchicalgroupingfornumericalexample.
thehierarchical
summarizing
thefinalmachineprintout
Figure1 illustrates
whenan investigator
is interested
primarily
provided
as itis ordinarily
grouping
in thelast groupsformed,
althoughthenumberofgroupscan be arrangedin
and thevaluesin
thereverseorder.(Machineroutinesto obtainthisprintout
gives
theAl and A2rowsare notincludedin Figures2 and 3.) This printout
structure.
on thecompletehierarchical
information
The
interest.
The valuesin thelast threerowsof Figure1 are ofparticular
valueswhichin thiscase showthe
ESS rowcontainsthe objective-function
errorassociatedwitheach step in the grouping.Study of the hierarchical
to the Al entriesbetweencolumnsthat
is facilitatedby reference
structure
equlpothsin (nuinbe
ian
of lemets
BLOCK
l
.
2
er thanbs all tlies pul
som iniial valu worset
k1to , L
snalsaciedntifiscaton
prsnumber
equal,
BLOCK 3
activ
PuRequace tol thleofir[st
q,_=
nundie grater
t*_=and
k-1JnbtZ[i,c,tk-1]
BLOCK 74-
a
CoI equa
wdliictho the
ut Zto kh last assciated
8
aBLOCK
Isut[i j,
er
ka -II bextthger
t ancetiva identfctonnlb
YESE
BLOCK 6
IF
Isji equal to tnexttlast activeideiitification
number
NO
YE
10
lBLOCK
Put j equial to next higber active identification n:i:m:be:r.]T
BLOCK
s beqtunolotowexo
~
~
~
~
BOC
acieenfudadi identificainnmer?
seslast
12
lBLOCK
Pth ieult
~
11
cieidentification
niimber.qliatle
ethge
BLOCK 13
l
~
~
cnieatind)i
estk unione of two ses hs benfunde
flrBLOCK
~
~
~
~
~
ienultified
12
on by1 th nubrP1admk
|Identiftk new uni
tial
to
- k1.
.T.
FIGURE,2. Flowebart for hierarchical grouping procedure.
LOK
243
GROUPING TO OPTIMIZE
Block
Number
1
Time
Through
1
2
1
3
4
5
6
6
6
6
7
8
4
5
6
6
6
6
7
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
1
Operation
k=5
Z[p4,
i=1
q4,
4] =a highvalue, say, 100
j=2
50-32 =18
Z[1, 2, 4] = [(1)2+(7)2]-.50[(1+7)2]=
Is Z[1, 2, 4]<Z[p4, q4,4]? i.e., Is 18<100? Yes; go to Block 6
Z[I, 2, 4] replacesZ[p4, q4, 41 as temporarybest
18 replaces 100 as temporarybest
(i = 1) replacesP4
(j =2) replacesq4
Isj =5? i.e., Is 2 =5? No; go to Block 8
Set j =next higheractive number=3. Go to Block 4
Z[l, 3, 4]= [(1)2+(2)2] -.50 [(1+2)2I=5-4.5=.50
Is Z[l, 3, 4]<Z[p4, q4, 4]? i.e., Is .50<18? Yes; go to Block 6
Z[l, 3, 4] replacesZ[p4, q4, 4] as temporarybest
.50 replaces18
(i=1) replacesP4
(j=3) replacesq4
Is j= 5? i.e., Is 3=5? No; go to Block 8
FIGURE 3. Patternof computationaloperationsforhierarchicalgroupingforexample.
showthe changein the errortermat each unionand to the A2 entriesindicating
the accelerationof error.Results of the groupingprocedureindicate that Persons 1 and 3 gave similarratingsto the object and that the five individuals
formtwo distinct,relativelyhomogeneousgroups. Detailed informationon
specificpersonsand groupsis easily extracted.
3. SUMMARY
A procedurehas been describedforforminghierarchicalgroupsof mutually
exclusivesubsets on the basis of theirsimilaritywithrespectto specifiedcharacteristics. Given k subsets, this method permits their reduction to k-1
mutuallyexclusive subsets by consideringthe union of all possible k(k-1)/2
pairs that can be formedand acceptingthe union withwhichan optimalvalue
of the objective functionis associated. The process can be repeated until all
subsetsare in one group. The computerprogramdescribedalso yields displays
of the hierarchicalstructureshowingthe sequence in which the subsets have
been united. A flowchartand numericalexample are provided.
REFERENCES
[11 Bottenberg,RobertA., and Christal,RaymondE., An IterativeTechniqueforClusterLackland Air Force Base,
ing CriteriawhichRetains OptimumPredictiveEfficiency.
Texas: PersonnelLaboratory,WrightAir DevelopmentDivision,March 1961. (Technical NoteWADD-TN-61-30).
[2] Cox, D. R., "Note on grouping,"Journalof theAmericanStatisticalAssociation,52
(1957), 543-7.
JournaloftheAmerican
[3] Fisher,WalterD., "On groupingformaximumhomogeneity,"
StatisticalAssociation,53 (1958), 789-98.
244
AMERICAN
STATISTICAL
ASSOCIATION
JOURNAL,
MARCH 1963
[4] Tanimoto,T. T., and Loomis,RobertG., A TaxonomyProgramfortheIBM 704. New
York: InternationalBusinessMachinesCorporation(Data SystemsDivision,Mathematics& ApplicationsDepartment),1960. (M&A-6, TheIBM TaxonomyApplication)
" Psychometrika,
18 (1953), 267-76.
[51 Thorndike,RobertL., "Who belongsin thefamily?,
[61 Thorndike,Robert L., Hagen, Elizabeth P., Orr, David B., and Rosner,Benjamin,
An EmpiricalApproachtotheDetermination
ofAir Force Job Families. Lackland Air
Force Base, Texas: Air Force Personneland TrainingResearchCenter,August1957.
(TechnicalReportAFPTRC-TR-57-5, ASTIA Document No. 134 239)
[7] Ward,Joe H., Jr.,and Hook, Marion E., A HierarchicalGroupingProcedureApplied
toa ProblemofGroupingProfiles.Lackland AirForce Base, Texas: PersonnelLaboratory,AeronauticalSystemsDivision,October 1961. (TechnicalNoteASD-TN-61-55)