Dileep P. Bhandarkar (S'70-M'73) was
born in Bombay, India, on July 16, 1949.
He received the B. Tech. degree in electrical
engineering from the Indian Institute of
Technology, Bombay, in 1970, and the M.S.
and Ph.D. degrees in electrical engineering
from Carnegie-Mellon University, Pittsburgh, Pa., in 1971 and 1974, respectively.
From 1970 to 1973 he was a Research Assistant in the Departments of Electrical Engineering and Computer Science, Carnegie-Mellon University. Since October 1973 he has been a Member of Technical Staff in the Central Research Laboratories of Texas Instruments, Incorporated. His current research interests include computer organization and architecture, performance evaluation, and memory organizations using magnetic bubbles and charge-coupled devices.
Dr. Bhandarkar is a member of the Association for Computing
Machinery.
A Branch and Bound Clustering Algorithm
WARREN L. G. KOONTZ, PATRENAHALLI M. NARENDRA, STUDENT MEMBER, IEEE, AND KEINOSUKE FUKUNAGA
Abstract-The problem of clustering N objects into M classes may be viewed as a combinatorial optimization problem. In the literature on clustering, iterative hill-climbing techniques are used to find a locally optimum classification. In this paper, we develop a clustering algorithm based on the branch and bound method of combinatorial optimization. This algorithm determines the globally optimum classification and is computationally efficient.
Index Terms-Branch and bound, clustering, combinatorial
optimization, pattern recognition.
Manuscript received February 7, 1974; revised January 9, 1975. This work was supported in part by the National Science Foundation under Grant GJ-35722.
W. L. G. Koontz is with Bell Laboratories, Whippany, N. J.
P. M. Narendra and K. Fukunaga are with the School of Electrical Engineering, Purdue University, Lafayette, Ind.

I. INTRODUCTION

CLUSTERING may be viewed as a combinatorial optimization problem in that, of the M^N/M! ways in which N objects may be assigned to M distinct classes (let M be given), the "best" classification must be determined. This statement is made under the assumption that there exists a criterion by which each classification can be evaluated. Given such a criterion, the clustering problem becomes that of finding an efficient algorithm for determining the optimal classification.
Most clustering algorithms described in the literature
may be described as iterative, hill-climbing techniques.
Such an algorithm generates a sequence of classifications
converging to a local optimum. Each successive classification is a perturbation of its predecessor calculated to
improve the criterion. The K-means algorithm [1]-[3] is
a well-known example of this kind of algorithm. Although
these algorithms are computationally efficient, they do not
guarantee a globally optimum solution, except in special
cases.
Exhaustive enumeration can, of course, determine the
optimum solution and has been attempted [3]. However,
the computation required is prohibitive. The method of
branch and bound [4]-[7] is a much more reasonable
approach to combinatorial optimization problems. It will
determine the global optimum and is quite efficient. In this
paper we will develop a practical branch and bound clustering algorithm.
In Section II, we will present the basic branch and bound
algorithm. We will focus on a particular form of clustering
criterion and derive an important property of the basic
algorithm. In Section III, we will develop an extended
algorithm with significantly improved efficiency. Experimental results are presented in Section IV.
II. THE BASIC BRANCH AND BOUND ALGORITHM

We will begin this section with a mathematical definition of the clustering problem in terms of a clustering criterion. We will then state a branch and bound algorithm which solves this problem. The efficiency of this algorithm depends upon the structure of the clustering criterion. We will study the behavior of the algorithm with a class of recursively computable criteria, in particular, the within-class scatter criterion. Some properties of the basic algorithm will be exploited in the extended algorithm presented in Section III.

Let X1, X2, ..., XN be N vectors of dimension n. These vectors are to be clustered into M classes denoted 1, 2, ..., M. Let ωi be the class to which Xi is assigned by the clustering, i.e., ωi ∈ {1,2,...,M}. Let the classification Ω and the configuration X be defined by

Ω = [ω1 ω2 ... ωN]^T   (1)

and

X = [X1^T X2^T ... XN^T]^T.   (2)

The clustering criterion J(Ω;X) is a scalar function of Ω and X which maps the possible classifications of a given configuration onto the set of real numbers. That is, J is the cost of the classification. The clustering problem is then defined as: for a given configuration X, find the classification Ω* such that

J(Ω*;X) = min_Ω J(Ω;X),   (3)

i.e., find the minimum cost classification.

The clustering problem as just stated is a combinatorial problem. There are M^N/M! distinct classifications of N vectors into M classes.¹ A solution, therefore, is to evaluate J for each of these classifications and choose the one which minimizes J. Even for modest values of M and N, however, the number of classifications is prohibitively large and explicit enumeration is not a practical solution technique. The method of branch and bound [4]-[6] is a more efficient approach to combinatorial optimization problems.

¹ Although there are M^N different ways to assign values to the ωi, there are M! ways to number M classes. For example, {1,2,1,2,...} is equivalent to {2,1,2,1,...}.

General Formulation

The method of branch and bound is based on the evaluation of partial classifications of the form [ω1 ω2 ... ωk], 1 ≤ k ≤ N (we refrain from more general partial solutions in order to simplify our presentation). Let b1, b2, ..., bN be a sequence of lower bounds satisfying

bk = bk(ω1,ω2,...,ωk) ≤ J(Ω;X).   (4)

Without loss of generality, we can assume

b1 ≤ b2 ≤ ... ≤ bN = J(Ω;X)   (5)

since if bj < bi for j > i, we can replace bj with bi. This is valid because bi is a function of (ω1,...,ωi) and therefore, trivially, a function of (ω1,...,ωk). This satisfies the requirement implied in (4) that bk be a function of (ω1,...,ωk). Let B be an upper bound on the optimal J, i.e.,

J(Ω*;X) ≤ B.   (6)

Now if, for a given classification Ω^a, we have

bk(ω1^a, ω2^a, ..., ωk^a) > B,   (7)

it follows that
1) Ω^a is suboptimal, since J(Ω^a;X) ≥ bk(ω1^a, ω2^a, ..., ωk^a) > B ≥ J(Ω*;X) (explicit rejection), and
2) all classifications of the form [ω1^a, ω2^a, ..., ωk^a, ωk+1, ..., ωN] are suboptimal, since bk is unchanged (implicit rejection).

Thus, given a sequence of tight lower bounds and a tight upper bound as defined in (4) and (6), we can eliminate many suboptimal classifications by evaluating a single partial classification. This is the basis of the branch and bound approach to combinatorial optimization. We will present a branch and bound algorithm, initially in terms of a general sequence of lower bounds. We will then consider a family of recursively computable clustering criteria for which lower bounds can be easily specified. The within-class scatter criterion belongs to this family.

The branch and bound algorithm stated below is an alternating sequence of partial classification generation and partial classification evaluation steps. All possible classifications are either explicitly evaluated or implicitly rejected, and the optimum classification is thereby determined. The φ variables, which are further explained in the subsequent discussion, control the generation of partial classifications. B0 is a given upper bound on J (e.g., B0 = ∞).

Step 1 (Initialization): Set B = B0, φij = 0 for i = 1,2,...,N and j = 1,2,...,M, and k = 1.

Step 2 (Partial Classification Generation): Compute ωk such that

bk(ω1,ω2,...,ωk) = min_{j ∈ {1,2,...,M}; φkj = 0} bk(ω1,ω2,...,ωk-1, j).
Step 3 (Partial Classification Evaluation): If bk < B, go
to Step 4; else, go to Step 5.
Step 4 (Forward Step): If k = N, go to Step 7; else, set φkωk = 1, increment k by 1, and return to Step 2.
Step 5 (Backward Step): Set φkj = 0 for j = 1,2,...,M and decrement k by 1. If k = 0, stop; else, continue to Step 6.
Step 6 (Seek New Classification at Level k): If φkj = 1 for j = 1,2,...,M, return to Step 5; else, return to Step 2.
Step 7 (Store Classification and Cost): Set B = bN and ωi* = ωi for i = 1,2,...,N and go to Step 5.
Upon termination of the algorithm, Ω* is the optimal
classification and B is its cost. The algorithm is illustrated
by the flow chart in Fig. 1.
The φ variables control the enumeration of classifications as follows. If φkj = 1, Xk may not be assigned to class j. Initially, the φ variables are all set to zero. As each Xk is classified, the corresponding φkωk is set to 1 (Step 4). If the algorithm backtracks to Xk, a different class will be considered. If the algorithm backtracks from Xk, the φkj are reset to zero for j = 1,2,...,M since some Xi will be reclassified for i < k. This procedure guarantees that all classifications will be either explicitly or implicitly evaluated and the optimum classification will be determined.
Recursive Clustering Criteria

The branch and bound algorithm has been stated in terms of a general clustering criterion and the bk have not been explicitly specified. We will now confine our attention to a class of recursive clustering criteria and provide a more complete specification of the bk.

For k ≥ 1, let

JR(ω1,ω2,...,ωk; X1,X2,...,Xk) ≡ JR(1,k)   (8)

be the cost of assigning Xi to class ωi for i = 1,2,...,k. Assume JR(1,k) can be computed recursively as

JR(1,k+1) = JR(1,k) + ΔJR(1,k),   k = 1,2,...   (9)

where

ΔJR(1,k) ≥ 0,   (10)

and, without loss of generality,

JR(1,1) = 0.   (11)
Fig. 1. Flow chart of the basic branch and bound algorithm.
Equations (8)-(11) define the class of recursive criteria to which our remaining discussion is addressed. The recursive criterion is related to the general criterion as

J(Ω;X) = JR(1,N).   (12)

From (9) and (10) we have

JR(1,k) ≤ JR(1,N),   k = 1,2,...,N,   (13)

so that a sequence of lower bounds on JR(1,N) may be expressed as

bk(ω1,ω2,...,ωk) = JR(1,k) + ε(k+1,N),   k = 1,2,...,N,   (14)

where ε(k+1,N) is a lower bound on J(1,N) - J(1,k). The bk(ω1,...,ωk) of (14) do not always satisfy all the inequalities of relation (5), but they do satisfy the following inequality,

bk ≤ J(Ω;X),   k = 1,...,N.

This is sufficient for the optimality of the basic algorithm. However, when ε(k+1,N) ≡ 0 for all k, the corresponding bounds bk in (14) do satisfy relation (5). This property is used in Algorithm I defined below. We now define Algorithm I as the branch and bound algorithm with B0 = ∞ and bk given by (14) with ε(k+1,N) = 0 for all k. For ε(k+1,N) > 0, we have a sequence of tighter bounds. In Section III, we will develop such a sequence.

For Algorithm I, (14) may be rewritten as

bk(ω1,ω2,...,ωk) = JR(1,k)
   = JR(1,k-1) + ΔJR(1,k-1)
   = bk-1(ω1,ω2,...,ωk-1) + ΔJR(1,k-1).   (15)

Thus, only ΔJR need be computed at each stage.

Within-Class Scatter Criterion

Let us consider the application of Algorithm I to the within-class scatter criterion defined by

JW(1,k) = Σ_{i=1}^{k} ||Xi - Cωi(1,k)||²   (16)

where

Cj(1,k) = [1/Nj(1,k)] Σ_{i=1; ωi=j}^{k} Xi   (17)
and Nj(1,k) is the number of vectors among X1,X2,...,Xk assigned to class j. The notation ||·|| indicates the Euclidean norm. A recursive equation for JW may be derived as follows. Assuming ωk+1 = j, we have

JW(1,k+1) - JW(1,k) = ||Xk+1 - Cj(1,k+1)||² + Σ_{i=1}^{k} [||Xi - Cωi(1,k+1)||² - ||Xi - Cωi(1,k)||²]   (18)
   = ||Xk+1 - Cj(1,k+1)||² + Nj(1,k) ||Cj(1,k+1) - Cj(1,k)||².   (19)

Equation (19) follows from (18) because of (17) and the fact that

Cl(1,k+1) = Cl(1,k),   for all l ≠ j.   (20)

Cj is given recursively by

Cj(1,k+1) = [Nj(1,k)Cj(1,k) + Xk+1] / [Nj(1,k) + 1]   (21)

and, trivially,

Nl(1,k+1) = Nj(1,k) + 1,   l = j,
Nl(1,k+1) = Nl(1,k),   l ≠ j.   (22)

Substituting (21) into (19), we have

ΔJW(1,k) = [Nj(1,k) / (Nj(1,k) + 1)] ||Xk+1 - Cj(1,k)||².   (23)

Equation (23) is substituted into (15) to obtain a sequence of lower bounds for Algorithm I.

Equations (21) and (22) are state equations, and the Cj and Nj are state variables which characterize the forward step process of the branch and bound algorithm. If the state variables are stored at each step, the algorithm can backtrack any number of steps without the necessity of recomputing the Cj and Nj. The decision to store the state variables is based on the computer storage/time tradeoff. In our program, which is discussed in Section IV, the state variables are stored for each step.
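A minimal Python sketch of this forward-step bookkeeping, following (21)-(23); the class name, the NumPy arrays, and the zero-based class indices are assumptions made for illustration, not a description of the authors' program.

import numpy as np

class ScatterState:
    # State variables C_j and N_j of (21)-(22), with the increment ΔJ_W of (23).
    def __init__(self, M, n):
        self.C = np.zeros((M, n))        # class mean vectors C_j(1,k)
        self.N = np.zeros(M, dtype=int)  # class populations N_j(1,k)

    def delta(self, x, j):
        # ΔJ_W(1,k) of (23) if vector x were assigned to class j.
        Nj = self.N[j]
        return (Nj / (Nj + 1.0)) * float(np.sum((x - self.C[j]) ** 2))

    def assign(self, x, j):
        # Forward step: update C_j by (21) and N_j by (22).
        self.C[j] = (self.N[j] * self.C[j] + x) / (self.N[j] + 1)
        self.N[j] += 1

    def snapshot(self):
        # Store the state so the search can backtrack without recomputation.
        return self.C.copy(), self.N.copy()

    def restore(self, snap):
        self.C, self.N = snap[0].copy(), snap[1].copy()

Accumulating bk = bk-1 + ΔJW as in (15) and keeping one snapshot per level is one way to realize the storage/time tradeoff described above.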
Optimal Partial Classifications

We will conclude this section by establishing some inequalities for recursive criteria as defined by (9)-(11) and a property of Algorithm I that will be used to develop the extended algorithm in Section III.

First of all, for any classification ω1,ω2,...,ωN, we have

JR(1,1) ≤ JR(1,2) ≤ ... ≤ JR(1,N)   (24)

because of (10). Now let J*(1,k) be the cost associated with the optimal clustering of X1,...,Xk for k = 1,2,...,N. Note that the optimal classification may be completely different for each k. We have

JR(1,k+1) = JR(1,k) + ΔJR(1,k) ≥ JR(1,k).

Therefore,

J*(1,k+1) = min_{(ω1,...,ωk+1)} JR(1,k+1) ≥ J*(1,k)   (25)

so that

J*(1,1) ≤ J*(1,2) ≤ ... ≤ J*(1,N).   (26)

We may now show that, in the process of determining J*(1,N), Algorithm I can also determine J*(1,1), J*(1,2), ..., J*(1,N-1). First of all, for any partial classification we have

bk(ω1,ω2,...,ωk) = JR(1,k) ≥ J*(1,k)   (27)

for k = 1,2,...,N. Let ω1^k, ω2^k, ..., ωk^k be the optimal classification of X1,X2,...,Xk. This partial classification must be evaluated by Algorithm I since

bk(ω1^k,ω2^k,...,ωk^k) = J*(1,k) ≤ J*(1,N) ≤ B,   (28)

that is, this partial classification is not rejected. Then, in view of (27) and (28), J*(1,k) is the minimum value of bk computed in the course of Algorithm I.

III. EXTENDED BRANCH AND BOUND ALGORITHM

Computer experiments have indicated that the computation required for the basic branch and bound algorithm increases rapidly with the sample size N. In this section, we will develop extensions to the basic algorithm which greatly reduce the computation required for large (N > 100) samples. These extensions are based on the following.

1) The computation required to cluster a given sample exceeds, by a significant margin, the total computation required to cluster two or more subsets of the sample.
2) The branch and bound algorithm is considerably more efficient if tighter bounds (B0 and the ε(k,N)) are used.

Let us assume that the sample has been partitioned into L subsets, {X1,X2,...,Xk1}, {Xk1+1,Xk1+2,...,Xk2}, ..., {XkL-1+1,XkL-1+2,...,XkL} (kL = N), and that each subset has been clustered by Algorithm I. First, we will present an algorithm which uses the partial solutions to determine improved bounds and clusters the entire sample by one branch and bound search. We will then consider a more general procedure for hierarchically combining the subsets.

The extensions presented here place one additional requirement on the clustering criterion. Let JR(k1,k2) be the cost of clustering Xk1, Xk1+1, ..., Xk2. This is a straightforward generalization of the recursive criterion defined in Section II. We require that for any classification and for any l such that k1 ≤ l < k2
JR(k1,k2) ≥ JR(k1,l) + JR(l+1,k2).   (29)

In particular, for the within-class scatter criterion, it can be shown that

JR(k1,k2) = JR(k1,l) + JR(l+1,k2)
   + Σ_{j=1}^{M} [Nj(k1,l)Nj(l+1,k2) / (Nj(k1,l) + Nj(l+1,k2))] ||Cj(k1,l) - Cj(l+1,k2)||²   (30)

where Cj(i,k) and Nj(i,k) are straightforward generalizations of the corresponding definitions in Section II.

Clustering Combined Subsets

Improved Upper Bound (B0): Clearly, we may let B0 = JR(1,N) for any classification, and a good classification will yield a tight B0. Algorithm I determines a classification for each subset and, therefore, a potentially good classification for the entire sample. Thus, our basic strategy is to concatenate the classifications of the subsets and let B0 = JR(1,N) for this classification.

In concatenating subset classifications, we must recognize that equivalent classes may be labeled differently in each subset. A simple procedure is to compute the mean vector for each class in each subset and, using any subset as a basis, classify the mean vectors (and, therefore, the classes) of the other subsets by the nearest neighbor rule. This procedure is heuristic, but it affects only the tightness of B0, not the optimality of the final result.

Improved Lower Bounds (ε(k+1,N)): Let the vectors in the L subsets be reverse ordered. Then, in addition to L classifications, Algorithm I determines the following quantities:

J*(1,k1), J*(2,k1), ..., J*(k1,k1)
J*(k1+1,k2), J*(k1+2,k2), ..., J*(k2,k2)
...
J*(kL-1+1,kL), J*(kL-1+2,kL), ..., J*(kL,kL).

Applying (29), we have, for 1 ≤ k < N,

JR(1,N) ≥ JR(1,k) + JR(k+1,N)
   ≥ JR(1,k) + J*(k+1,N).   (31)

Further, if k_{p-1} < k ≤ k_p for 1 ≤ p ≤ L (with k_0 ≡ 0),

J*(k+1,N) ≥ J*(k+1,k_p) + Σ_{l=p+1}^{L} J*(k_{l-1}+1,k_l) ≡ ε*(k+1,N),   p < L,
J*(k+1,N) ≥ J*(k+1,k_L) ≡ ε*(k+1,N),   p = L,   (32)

so that

JR(1,N) - JR(1,k) ≥ ε*(k+1,N).   (33)

Thus, ε*(k+1,N), which can be computed in a straightforward manner, can replace ε(k+1,N) in (14) and provide a tighter lower bound.

We define Algorithm II as a procedure for clustering combined subsets which have been clustered by Algorithm I. Algorithm II is the basic branch and bound algorithm with B0 obtained by concatenating the subset classifications and ε(k+1,N) given by (32).
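As a concrete reading of (32), the following sketch computes ε*(k+1,N) from a table of optimal subset costs; the list/dictionary representation of the subset boundaries k_1,...,k_L and of J*(·,·) is an assumption made for the example.

def epsilon_star(k, boundaries, jstar):
    # Lower bound ε*(k+1,N) of (32).
    # boundaries = [k_1, k_2, ..., k_L] with k_L = N;
    # jstar[(a, b)] = J*(a, b), the optimal cost of clustering X_a, ..., X_b.
    L = len(boundaries)
    # locate p such that k_{p-1} < k <= k_p (taking k_0 = 0)
    p = next(i + 1 for i, kp in enumerate(boundaries) if k <= kp)
    kp = boundaries[p - 1]
    total = 0.0 if k + 1 > kp else jstar[(k + 1, kp)]   # tail of subset p
    for l in range(p + 1, L + 1):                       # whole subsets p+1, ..., L
        total += jstar[(boundaries[l - 2] + 1, boundaries[l - 1])]
    return total

The entries J*(m, k_p) needed here for every m inside subset p are exactly what running Algorithm I on the reverse-ordered subset provides.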
Hierarchical Subset Combination

As the sample size N, and therefore the number of subsets L, increases, the lower bound given by (32) becomes less tight and the algorithm becomes less efficient. This problem can be combated with a hierarchical scheme for combining subsets. The method will be illustrated for L = 4.

Again, suppose the subsets have been clustered by Algorithm I. The first stage of hierarchical combination is to cluster {X1,X2,...,Xk2} and {Xk2+1,Xk2+2,...,Xk4} using Algorithm II. The classifications obtained at this stage are concatenated to determine B0 for the entire sample. Then {X1,X2,...,Xk4} is clustered by branch and bound using the following lower bounds:

εH(k+1,N) = J*(k+1,k1) + J*(k1+1,k2) + J*(k2+1,k4),   1 ≤ k < k1,
εH(k+1,N) = J*(k+1,k2) + J*(k2+1,k4),   k1 ≤ k < k2,
εH(k+1,N) = J*(k+1,k3) + J*(k3+1,k4),   k2 ≤ k < k3,
εH(k+1,N) = J*(k+1,k4),   k3 ≤ k < k4.   (34)

These bounds are improved relative to those of (32) for 1 ≤ k < k2 since J*(k2+1,k4), which is determined in the previous stage, satisfies

J*(k2+1,k4) ≥ J*(k2+1,k3) + J*(k3+1,k4).   (35)

The extension of this procedure to general values of L (L ≥ 3) is straightforward, and L need not be a power of 2. Thus we have defined an algorithm, which we will call Algorithm III, to cluster a large sample by hierarchical combination of subsets.

The efficiency of Algorithm III will vary with the order in which the subsets are combined. Referring to (34), we see that the tightness of εH(k+1,N) increases with J*(k2+1,k4). Thus, the two subsets for which the total clustering cost is highest should be combined first. But this cost is initially unknown. Therefore, we offer a heuristic distance criterion for determining which pairs of subsets to combine at a given stage. The distance between subsets i and j is defined by
dij = Σ_{k=1}^{M} [Nk^i Nk^j / (Nk^i + Nk^j)] ||Ck^i - Ck^j||²,   (36)

where Nk^i and Ck^i are the population and mean vector for class k of subset i and M is the number of classes. Those distinct pairs for which dij is largest are combined at this stage.
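A small Python sketch of this subset-distance heuristic; it assumes the classes of the two subsets have already been matched (e.g., by the nearest neighbor rule mentioned earlier) and that each subset is summarized by per-class (population, mean) pairs, which is an assumed data layout.

import numpy as np

def subset_distance(stats_i, stats_j):
    # d_ij of (36); stats_* maps a class label to (N_k, C_k) for one clustered subset.
    d = 0.0
    for k in stats_i:
        Ni, Ci = stats_i[k]
        Nj, Cj = stats_j[k]
        if Ni + Nj > 0:
            Ci, Cj = np.asarray(Ci, float), np.asarray(Cj, float)
            d += (Ni * Nj / (Ni + Nj)) * float(np.sum((Ci - Cj) ** 2))
    return d

At each stage of Algorithm III, the pairs with the largest d_ij would then be combined first.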
Fig. 2. Scatter plot of the data used in computer experiments.
The choice of the above heuristic is guided by the following considerations:

1) The bounds εH(k+1,N) in (34) should be as close to J(k+1,N) as possible for the algorithm to be efficient.

2) Consider two subsets of samples i and j that have been individually clustered. Let the corresponding clustering costs be Ji* and Jj*. If Jij* is the total clustering cost for the subsets i and j, then it can be shown, as a direct consequence of (30), that

Ji* + Jj* ≤ Jij* ≤ Ji* + Jj* + dij.   (37)

We see that for subset pairs with small dij, Ji* + Jj* is a very close bound of Jij*. For these pairs, the bounds computed using Ji* + Jj* are already very tight. It is the pairs that yield large dij that will yield loose bounds for the next stage if they are not combined now.

3) By combining subsets that yield large dij, it is reasonable to expect that at the final stage of the hierarchical combination, each subset will adequately represent the entire sample space. Intuitively, then, the concatenated optimal classifications of the subsets will be close to the optimal classification of the entire set of samples. This directly implies tight upper and lower bounds. Moreover, this procedure of recombination renders Algorithm III practically independent of the initial ordering of the sample vectors.

Nonoptimal Clustering

We have presented two extensions to the basic branch and bound algorithm which improve its efficiency while maintaining global optimality. We can further improve the efficiency if we allow optimality to be compromised. For example, we can replace bk with bk/θ (0 < θ < 1) and thereby eliminate more solutions, including, possibly, the optimal solution. Thus, if we do not require the true optimum, we may employ heuristic "bounds" to obtain a suboptimal solution with less computation.

IV. COMPUTER EXPERIMENTS

In this section, we will present the results of some experimental applications of the branch and bound clustering algorithms. The within-class scatter criterion is assumed throughout this section. The experiments were run on a CDC 6500 computer.

Data

Pseudorandom vectors were generated for each of the two classes. The vectors were normally distributed for each class with the following mean vectors and covariance matrices:

M1 = [0.0  0.0]^T   Σ1 = [0.5  0.0; 0.0  5.0]
M2 = [5.0  1.0]^T   Σ2 = [3.0  0.0; 0.0  0.5].

A scatter plot of 60 vectors from each class is shown in Fig. 2 (○ = class 1, △ = class 2).

Application of Algorithm I

Algorithm I was applied to a sample of size 40 (20 from each class). A CPU time of 7.5 min was required to find the optimal classification. This performance is quite poor in comparison with other algorithms. It appears, therefore, that the major justification for Algorithm I is its application within Algorithms II and III.

Application of Algorithms II and III

Algorithm III (which includes Algorithms I and II) was applied to a sample of size 120. This sample was divided randomly into 8 groups of 15 vectors. Fig. 3 illustrates the sequence of stages executed by the algorithm. Subsets were combined according to the metric given by (36). The optimal clustering costs, initial bounds, subset compositions, and CPU time requirements are shown for each stage. Note that the initial bounds are very tight in stages 3 and 4. The total CPU time required was 28 s, which clearly indicates the improved performance of the extended algorithm.

Fig. 3. Performance of the extended algorithm.
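For readers who want to reproduce the experimental setup, here is a short sketch that generates data of the kind described in the Data subsection; the covariance entries are read from the garbled text as diag(0.5, 5.0) and diag(3.0, 0.5) and should be treated as assumptions.

import numpy as np

rng = np.random.default_rng(0)
m1, s1 = np.array([0.0, 0.0]), np.diag([0.5, 5.0])   # class 1 mean and (assumed) covariance
m2, s2 = np.array([5.0, 1.0]), np.diag([3.0, 0.5])   # class 2 mean and (assumed) covariance
class1 = rng.multivariate_normal(m1, s1, size=60)    # 60 vectors per class, as in Fig. 2
class2 = rng.multivariate_normal(m2, s2, size=60)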
V. SUMMARY
The method of branch and bound is a feasible approach
to the clustering problem. Its primary advantage is that
it will determine the globally optimum classification with
a reasonable amount of computation. Hill-climbing procedures require a comparable amount of computation to
determine a local optimum.
The basic algorithm, Algorithm I, is a straightforward
application of branch and bound to the clustering problem.
However, the efficiency of Algorithm I seems far too low to
warrant practical application. Algorithms II and III,
which are more complex, provide the kind of efficiency
needed in practice. Although experimental results are
presented only for the within-class scatter criterion, the
algorithms have been stated more generally.
REFERENCES
[1] J. MacQueen, "Some methods for classification and analysis of
multivariate observations," in Proc. 5th Berkeley Symp., vol. I,
1966, pp. 281-297.
[2] G. H. Ball and D. J. Hall, "ISODATA-A novel method of data
analysis and pattern classification," Stanford Res. Institute,
Menlo Park, Calif., Tech. Rep., SRI Project 5533, May 1973.
[3] J. J. Fortier and H. Solomon, "Clustering procedures," in Multivariate Analysis, P. R. Krishnaiah, Ed. New York: Academic,
1966, pp. 493-506.
[4] S. W. Golomb and L. D. Baumert, "Backtrack programming,"
J. Ass. Comput. Mach., vol. 12, pp. 516-524, Jan. 1965.
[5] A. L. Chernyavskii, "Algorithms for the solution of combinatorial problems based on a method of implicit enumeration,"
Automat. Remote Contr., vol. 33, pp. 252-260, Feb. 1972.
[6] E. L. Lawler and D. E. Wood, "Branch-and-bound methods: A survey," Oper. Res., vol. 14, pp. 699-719, 1966.
[7] N. J. Nilsson, Problem Solving Methods in Artificial Intelligence.
New York: McGraw-Hill, 1971, ch. 3.
[8] J. D. Swift, "Isomorph rejection in exhaustive search techniques," in Proc. Amer. Math. Soc. Symp. Appl. Math., vol. 10,
1960, pp. 195-200.
Warren L. G. Koontz was born in Tuscaloosa, Ala., on June 23, 1944. He received the B.S.E.E. degree from the University of Maryland, College Park, in 1966, the M.S.E.E. degree from the Massachusetts Institute of Technology, Cambridge, in 1967, and the Ph.D. degree from Purdue University, Lafayette, Ind., in 1971.
During the summer of 1963 he worked in the Quality Evaluation Laboratory of the Naval Ammunition Depot, Crane, Ind. During the summers of 1964 and 1965 he was with Harry Diamond Laboratories, Washington, D.C. In June 1966 he joined the Electronic Switching Division of Bell Laboratories and participated in their Graduate Study Program. From 1968 to 1971 he was a Graduate Instructor at Purdue University. In 1971 he returned to Bell Laboratories, Whippany, N. J., as a member of the Loop Transmission Division, where he is currently engaged in loop network studies.
Dr. Koontz is a member of Sigma Xi, Tau Beta Pi, Eta Kappa Nu, Omicron Delta Kappa, and Phi Kappa Phi.
Patrenahalli M. Narendra (S'73) was
born in Jog Falls, Mysore State, India on
March 16, 1951. He received the Bachelor
of Engineering (electronics) degree from
Bangalore University, Bangalore, India,
in 1971, the M.S. degree in electrical engineering from Purdue University, Lafayette,
Ind. in December 1972.
He is currently working toward the Ph.D.
degree in electrical engineering at Purdue
University, where he has been employed as a
Teaching and Research Assistant since 1971. His principal research
interests include digital information processing, pattern recognition,
and operations research.
Mr. Narendra is a member of the Honor Society of Phi Kappa Phi.
Keinosuke Fukunaga was born in Himeji-shi, Japan, on July 23, 1930. He received
the B.S. degree from Kyoto University,
Kyoto, Japan, in 1953, the M.S.E.E. degree
from the University of Pennsylvania, Philadelphia, in 1959, and the Ph.D. degree from
Kyoto University, Japan, in 1962.
From 1953 to 1957 he was with the Central
Research Laboratories of Mitsubishi Electric
Company, Japan, where he performed system
analysis on automatic control systems.
From 1957 to 1959 he was with the Moore School, University of
Pennsylvania, working in the field of switching theory. From 1959
to 1966, he was with the Mitsubishi Electric Company, first with the
Central Research Laboratories, working on computer applications
in control systems, and then with the Computer Division, where he
was in charge of hardware development. Since 1966 he has been
with Purdue University, Lafayette, Ind., where he is a Professor of
Electrical Engineering. During the summers of 1967, 1968, and 1969,
he worked with IBM in Endicott, N. Y., and Rochester, Minn. His
interests include pattern recognition and pattern processing.
Dr. Fukunaga is a member of Eta Kappa Nu.
Picture Reconstruction from Projections
R. L. KASHYAP, MEMBER, IEEE, AND M. C. MITTAL
Abstract-In this paper we will present two methods for the reconstruction of a two-dimensional picture from its one-dimensional
projections at various angles. The first method is recursive, and the
second is nonrecursive. Both of these methods have potential applications in picture transmission, and offer valuable insight into
the more general problem of reconstruction of three-dimensional
objects from their two-dimensional projections. We will illustrate
the numerical details of each method by considering the reconstruction of a picture discretized on a 32 X 32 grid, and compare their
relative merits. These two methods will also be compared with
other methods suggested in the literature.
Index Terms-Algebraic reconstruction methods, image reconstruction, image reconstruction by optimization, picture coding,
picture processing, picture quality, picture reconstruction, picture
transmission, two-dimensional image.
I. INTRODUCTION
IN this paper, we will consider the reconstruction of
a two-dimensional picture from its one-dimensional
projections. Such a problem has potential applications in
the field of picture transmission and, in addition, the
method of solution can be extended to solve the more
general problems of reconstruction of three-dimensional
structures from their two-dimensional projections. The
latter problem arises in connection with the determination
of the three-dimensional structure of biological entities
like ribosomes, from their two-dimensional electron micrographs.

Manuscript received February 11, 1974; revised January 9, 1975. This work was partially supported by the National Science Foundation under Grant GK-36721.
The authors are with the School of Electrical Engineering, Purdue University, West Lafayette, Ind. 47907.
There are a number of approaches to the solution of the
problem, namely, the Fourier techniques [1], the convolution method [9], [10], and the algebraic techniques [2],
etc. The relative merits of these approaches are a controversial topic [3], [12], and not enough computational experience is available to warrant a definitive judgment.
The basic idea in the algebraic approach is to choose
a reasonable criterion function to measure the discrepancy
between the actual two-dimensional picture and the
reconstructed picture and develop a reconstruction method
to minimize the criterion function. We will transform the
picture reconstruction problem to one of constrained
minimization. The minimizing function or criterion function is the sum of two quadratic functions J1 and J2. The
first function J1 is used to ensure the preservation of local
continuity in the reconstructed picture. The second
function J2 is the variance of the cell intensities. Gordon
et al. [2] and Herman et al. [11] have given the algebraic
reconstruction technique (ART) algorithm which minimizes the functional J2 under certain conditions. The
algorithms of Gilbert [14] do not minimize any particular
criterion functions. Our intention is to show that the
algorithm to be developed in this paper gives better reconstructions from the same projections than other algorithms such as ART.
Two methods of solution will be developed for the reconstruction problem. The first solution is a recursive algorithm which is comp.utationally feasible even when the