Fuzzy Clustering Algorithms

SSIE 617 2nd Presentation
Benjamin James Bush
05/02/2012
What is Clustering?
Crisp & Fuzzy Clustering
CRISP: Each point belongs to exactly one cluster.
C-Means Clustering
FUZZY: Cluster membership is a matter of degree.
Fuzzy C-Means Clustering (FCM)
Fuzzy Min-Max Clustering Neural Network
C-Means Clustering
Fixed number of clusters. One centroid per cluster.
Each data point belongs to the cluster corresponding to the closest centroid.
Figure by Andrey A. Shabalin, Ph.D.
C-Means Clustering
Annotated terms of the cost function:
number of clusters
distance between data point and cluster center
cost of the ith cluster
data points belonging to the ith group
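In the standard formulation, the cost function these labels describe can be written as below. The symbol names ($J$, $J_i$, $G_i$, $\mathbf{c}_i$, $\mathbf{x}_k$) are assumptions of this sketch and are not taken from the slide.

```latex
J \;=\; \sum_{i=1}^{c} J_i
  \;=\; \sum_{i=1}^{c} \Bigl( \sum_{k,\ \mathbf{x}_k \in G_i} \lVert \mathbf{x}_k - \mathbf{c}_i \rVert^{2} \Bigr)
```

Here $c$ is the number of clusters, $\lVert \mathbf{x}_k - \mathbf{c}_i \rVert$ is the distance between a data point and a cluster center, $J_i$ is the cost of the ith cluster, and $G_i$ is the set of data points belonging to the ith group.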
C-Means Clustering
Pick c centroids at random.
Assign each data point to the cluster corresponding to the nearest centroid.
Move each centroid to the mean value of its cluster’s data points.
Repeat the last two steps until the assignments stop changing.
Animation by Andrey A. Shabalin, Ph.D.
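The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `c_means`, the `max_iter` and `seed` parameters, and the choice to draw the initial centroids from the data points are all assumptions.

```python
import math
import random

def c_means(points, c, max_iter=100, seed=0):
    """Crisp C-means on a list of equal-length tuples.

    Returns (centroids, labels). Drawing the initial centroids from the
    data is an assumption of this sketch.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, c)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assign each point to the cluster of its nearest centroid.
        new_labels = [
            min(range(c), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its cluster's points.
        for i in range(c):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

Calling `c_means(points, 2)` on a list of coordinate tuples returns one centroid per cluster and one integer label per point.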
Fuzzy C-Means Clustering (FCM)
Fixed number of clusters. One centroid per cluster.
Clusters are fuzzy sets.
Membership degree of a point can be any number between 0 and 1.
The membership degrees of a point must sum to 1.
Figure by Matteo Matteucci, Ph.D.
Fuzzy C-Means Clustering (FCM)
C-Means
Fuzzy C-Means (FCM)
summing over all data points
fuzziness exponent
membership degree
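In the standard formulation, the FCM cost function that these labels annotate weights each squared distance by a membership degree raised to the fuzziness exponent. The symbols ($u_{ij}$, $m$, $n$, $\mathbf{x}_j$, $\mathbf{c}_i$) are assumptions of this sketch, not the slide's notation.

```latex
J \;=\; \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{\,m}\, \lVert \mathbf{x}_j - \mathbf{c}_i \rVert^{2},
\qquad \text{subject to } \sum_{i=1}^{c} u_{ij} = 1 \ \text{ for every } j
```

Here the inner sum runs over all $n$ data points, $m > 1$ is the fuzziness exponent, and $u_{ij} \in [0, 1]$ is the membership degree of point $j$ in cluster $i$. Restricting every $u_{ij}$ to 0 or 1 recovers the crisp C-means cost.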
Fuzzy C-Means Clustering
Pick c centroids at random.
Assign membership degrees according to:
Move each centroid to the following position:
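In the standard FCM formulation, the two updates named in the steps above take the following form. The symbols ($u_{ij}$, $d_{ij}$, $\mathbf{c}_i$, $\mathbf{x}_j$, $m$, $n$) are assumptions of this sketch rather than the slide's own notation.

```latex
u_{ij} \;=\; \frac{1}{\displaystyle\sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{2/(m-1)}},
\qquad d_{ij} = \lVert \mathbf{x}_j - \mathbf{c}_i \rVert

\mathbf{c}_i \;=\; \frac{\displaystyle\sum_{j=1}^{n} u_{ij}^{\,m}\, \mathbf{x}_j}{\displaystyle\sum_{j=1}^{n} u_{ij}^{\,m}}
```

The first expression gives the membership degree of point $j$ in cluster $i$; the second places each centroid at the membership-weighted mean of all data points.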
Note: the formulas result from applying the method of Lagrange multipliers to the aforementioned cost function. Proof left as an exercise.
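The FCM loop can be sketched in plain Python using the standard membership and centroid updates. This is an illustrative sketch under stated assumptions, not the presenter's code: the function name `fuzzy_c_means`, the default fuzziness exponent `m=2.0`, the stopping tolerance `tol`, and the initialization from data points are all hypothetical choices.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Returns (centroids, u); u[i][j] is the degree of point j in cluster i."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, c)]
    power = 2.0 / (m - 1.0)
    n = len(points)
    u = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Membership update: degrees of each point sum to 1 across clusters.
        for j, p in enumerate(points):
            d = [math.dist(p, centroids[i]) for i in range(c)]
            zeros = [i for i in range(c) if d[i] == 0.0]
            for i in range(c):
                if zeros:
                    # Point sits on a centroid: share full membership there.
                    u[i][j] = 1.0 / len(zeros) if i in zeros else 0.0
                else:
                    u[i][j] = 1.0 / sum((d[i] / d[k]) ** power for k in range(c))
        # Centroid update: mean of all points weighted by u ** m.
        new_centroids = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            if total == 0.0:  # degenerate case: keep the old centroid
                new_centroids.append(centroids[i])
                continue
            new_centroids.append(
                [sum(wj * x for wj, x in zip(w, col)) / total
                 for col in zip(*points)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break  # centroids have effectively stopped moving
    return centroids, u
```

Unlike the crisp version, every point contributes to every centroid, with a weight that shrinks quickly as the point gets farther away.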
How Many Clusters?
Fuzzy Min-Max Clustering NN
Variable number of clusters. Each cluster has a Hyperbox Fuzzy Set.
Degrees inside the hyperbox are 1. Degrees outside the hyperbox decrease linearly with distance from the box.
Total degrees for a point need not add up to 1.
Hyperboxes are not allowed to overlap.
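The hyperbox membership described above can be sketched as a simple function. This is a simplified illustration and not necessarily the exact membership function used in the presentation: the per-dimension averaging and the slope parameter `gamma` are assumptions.

```python
def hyperbox_membership(point, box_min, box_max, gamma=1.0):
    """Degree of `point` in the hyperbox spanned by box_min and box_max.

    Sketch only: 1 inside the box, falling linearly with the distance
    outside it in each dimension, clipped at 0, then averaged over
    dimensions. `gamma` is a hypothetical slope parameter.
    """
    total = 0.0
    for x, lo, hi in zip(point, box_min, box_max):
        outside = max(lo - x, 0.0, x - hi)  # 0 when lo <= x <= hi
        total += max(0.0, 1.0 - gamma * outside)
    return total / len(point)
```

Because each box is evaluated on its own, the degrees a point receives from several boxes are independent of each other and need not sum to 1.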
Hyperbox Fuzzy Sets
Start Mathematica...
Hyperbox Fuzzy Sets
Easy to implement as artificial neural networks (ANNs).
Potential to take advantage of massively parallel processing.
Initialize a population of 250 randomly chosen individuals, each with a random # of boxes. For each box, choose the min point and max point at random.
Create a child individual from each member of the population. When creating a child, add a Gaussian random variable to each component of the min and max points, and change the # of boxes with probability 0.5.
Evaluate the fitness of each individual based on its Minimum Description Length (MDL), which combines a goodness-of-fit term with a penalty for the # of clusters.
Eliminate half of the individuals via round-robin tournament competition.
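The initialization and child-creation steps above can be sketched as follows. Several details are assumptions made only for illustration: data scaled to the unit cube, the `max_boxes` cap, the mutation width `sigma`, and changing the box count by adding or removing a single box. The MDL fitness and the tournament are left out because their exact form is not given here.

```python
import random

def random_box(dim, rng):
    """One hyperbox as (min_point, max_point) inside the unit cube."""
    a = [rng.random() for _ in range(dim)]
    b = [rng.random() for _ in range(dim)]
    lo = [min(x, y) for x, y in zip(a, b)]
    hi = [max(x, y) for x, y in zip(a, b)]
    return lo, hi

def init_population(size=250, dim=2, max_boxes=10, seed=0):
    """Each individual is a list of boxes with a random box count."""
    rng = random.Random(seed)
    return [
        [random_box(dim, rng) for _ in range(rng.randint(1, max_boxes))]
        for _ in range(size)
    ]

def make_child(parent, rng, sigma=0.05):
    """Copy the parent with Gaussian noise on every min/max component."""
    dim = len(parent[0][0])
    child = []
    for lo, hi in parent:
        a = [v + rng.gauss(0.0, sigma) for v in lo]
        b = [v + rng.gauss(0.0, sigma) for v in hi]
        # Re-sort per component so the min point stays below the max point.
        child.append(([min(x, y) for x, y in zip(a, b)],
                      [max(x, y) for x, y in zip(a, b)]))
    if rng.random() < 0.5:  # change the number of boxes with probability 0.5
        if len(child) > 1 and rng.random() < 0.5:
            child.pop(rng.randrange(len(child)))  # assumed: drop one box
        else:
            child.append(random_box(dim, rng))    # assumed: add one box
    return child
```

One generation would then consist of calling `make_child` on every member, scoring parents and children with a fitness function, and keeping half of the combined pool.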
Applications
Applications of Fuzzy C-Means
Applications of Min-Max Clustering NN
Bibliography
Ch. 15
Ch. 1
Videolectures.net: MDL Tutorial
http://videolectures.net/icml08_grunwald_mld/