
A New Similarity Measure for Random
Signatures: Perceptually Modified Hausdorff
Distance
Bo Gun Park, Kyoung Mu Lee, and Sang Uk Lee
School of Electrical Eng., ASRI, Seoul National University,
151-600, Seoul, Korea
[email protected], [email protected], [email protected]
Abstract. In most content-based image retrieval systems, low-level visual features such as color, texture, and region play an important role. A variety of dissimilarity measures have been introduced for uniformly quantized visual features, i.e., histograms. However, a cluster-based representation, or signature, has proven to be more compact and theoretically better founded, in terms of accuracy and robustness, than a histogram. Despite these advantages, only a few dissimilarity measures for signatures have been proposed so far. In this paper, we present a novel dissimilarity measure for random signatures, the Perceptually Modified Hausdorff Distance (PMHD), based on the Hausdorff distance. To demonstrate the performance of the PMHD, we retrieve relevant images for a set of queries on a real image database using only color information. The precision vs. recall results show that the proposed dissimilarity measure generally outperforms all other dissimilarity measures on an unmodified commercial image database.
1 Introduction
With the explosive growth of digital image collections, Content-Based Image Retrieval (CBIR) has become one of the most active and challenging problems in computer vision and multimedia applications [22][23]. Many image retrieval systems based on the query-by-example scheme have been developed, including QBIC [20], PhotoBook [24], VisualSEEK [25], and MARS [26]. However, closing the gap between human perceptual concepts and the low-level visual contents extracted by computer remains an open problem. To deal with this semantic gap, many techniques have been introduced to improve visual features and similarity measures [22][4][27][28].
In most image retrieval systems based on visual features, a histogram (i.e., a fixed-binning histogram) is widely used as a visual feature descriptor due to its simple implementation and insensitivity to similarity transformations [4]. However, in some cases, histogram-based indexing methods fail to match perceptual dissimilarity [1]. The performance of a retrieval system employing a histogram as a descriptor depends heavily on the quantization of the feature space, because a histogram is inflexible under varying feature distributions.
J. Blanc-Talon et al. (Eds.): ACIVS 2006, LNCS 4179, pp. 990–1001, 2006.
© Springer-Verlag Berlin Heidelberg 2006

To overcome these drawbacks, a clustering-based representation, the signature (or adaptive-binning histogram), has been proposed [1][3][15]. A signature compactly
represents a set of clusters in feature space and the distribution of visual features. Therefore, it can reduce the complexity of representation and the cost
of the retrieval process. Once two sets of visual features, represented as histograms or signatures, are given, one needs to determine how similar they are. A number of different dissimilarity measures have been proposed in various areas of computer vision. Specifically for histograms, the Jeffrey divergence, histogram intersection, χ2-statistics, and others have been known to be successful. However, these dissimilarity measures cannot be directly applied to signatures. Rubner et al. [1]
proposed a novel dissimilarity measure for matching signatures called the Earth
Mover’s Distance (EMD), which was able to overcome most of the drawbacks
in histogram based dissimilarity measures and handled partial matches between
two images. Dorado et al. [3] also used the EMD as a metric to compare fuzzy
color signatures. However, the computational complexity of the EMD is very
high compared to other dissimilarity measures. And Leow et al.[15] proposed a
new dissimilarity measure, Weighted Correlation (WC) for signatures, which is
more reliable than Euclidean distance and computationally more efficient than
EMD. The performance of WC was generally better than EMD and comparable
to other dissimilarity measures for image retrieval and image classification, but,
in some cases, it was worse than the Jeffrey divergence (JD) [14].
In this paper, we propose a novel dissimilarity measure for comparison of
random signatures, which is based on the Hausdorff distance. The Hausdorff
distance is an effective metric for measuring the dissimilarity between two sets of points [6][7][8][10], while remaining insensitive to changes in the characteristics of the points.
In this paper, we modify the general Hausdorff distance into the Perceptually
Modified Hausdorff Distance (PMHD) in order to evaluate the dissimilarity between random signatures and to satisfy human perception. The experimental
results on a real image database show that the proposed metric outperforms
other dissimilarity measures.
2 A Visual Feature Descriptor: A Random Signature
To retrieve images visually similar to a query image using visual information, a proper visual feature descriptor must be extracted from each image. It has been shown that a signature can describe a feature distribution more efficiently than a histogram [1][3][15]. Moreover, a signature describes each image independently of the other images in the database.
In this paper, we represent an original image as a random signature, defined
as
\[ S = \{(s_i, w_i, \Sigma_i) \mid i = 1, \ldots, N\}, \tag{1} \]
where N is the number of clusters, s_i is the mean feature vector of the i-th cluster, w_i is the fraction of the features that belong to the i-th cluster, and Σ_i is the covariance matrix of the i-th cluster. A variety of clustering methods can be used to construct a random signature from a color image. In this paper, we used K-means clustering [12] to cluster the visual features.
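As an illustration, the construction of a random signature from an image's pixel features can be sketched as follows. This is a minimal sketch with a plain K-means loop; the function and variable names are ours, not from the paper:

```python
import numpy as np

def random_signature(pixels, n_clusters=10, n_iter=50, seed=0):
    """Build a signature S = {(s_i, w_i, Sigma_i)} from pixel features:
    cluster mean s_i, weight w_i (fraction of pixels in the cluster),
    and covariance matrix Sigma_i, via a plain K-means loop."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_clusters, replace=False)]
    for _ in range(n_iter):
        # assign every pixel to its nearest center
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its members (skip empty clusters)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = pixels[labels == k].mean(axis=0)
    # final assignment and signature construction
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    signature = []
    for k in range(n_clusters):
        members = pixels[labels == k]
        if len(members):
            signature.append((centers[k],
                              len(members) / len(pixels),
                              np.cov(members, rowvar=False)))
    return signature
```

Empty clusters are simply dropped, so the returned signature may contain fewer than `n_clusters` entries; the weights of the surviving clusters still sum to one.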
Fig. 1. Sample images quantized using K-means clustering: (a) original image with 256,758 colors, and quantized images based on a random signature with (b) 10 colors and (c) 30 colors
Fig. 1 shows two sample images quantized by using a random signature with
color information as a visual feature.
3 A Novel Dissimilarity Measure for a Random Signature

3.1 Hausdorff Distance
It has been shown in the computer vision literature that the Hausdorff distance (HD) is an effective metric for measuring the dissimilarity between two sets of points [6][7][8][9], while remaining insensitive to changes in the characteristics of the points. In this section, we briefly describe the HD; more details can be found in [6][7][8][9]. Given two finite point sets, P_1 = {p_1^1, . . . , p_N^1} and P_2 = {p_1^2, . . . , p_M^2}, the HD is defined as
\[ D_H(P_1, P_2) = \max\{d_H(P_1, P_2), d_H(P_2, P_1)\}, \tag{2} \]
where
\[ d_H(P_1, P_2) = \max_{p^1 \in P_1} \min_{p^2 \in P_2} \|p^1 - p^2\|, \tag{3} \]
and the function d_H is the directed HD between the two point sets.
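For point sets stored as arrays, the HD of Eqs. (2)-(3) can be computed directly; a short sketch, with names of our own choosing:

```python
import numpy as np

def directed_hd(P1, P2):
    """Directed HD, Eq. (3): the largest distance from a point of P1
    to its nearest neighbor in P2."""
    dists = np.linalg.norm(P1[:, None, :] - P2[None, :, :], axis=2)
    return dists.min(axis=1).max()

def hausdorff(P1, P2):
    """Symmetric HD, Eq. (2): the larger of the two directed distances."""
    return max(directed_hd(P1, P2), directed_hd(P2, P1))
```

Note the asymmetry of the directed distance: `directed_hd(P1, P2)` and `directed_hd(P2, P1)` generally differ, which is why Eq. (2) takes their maximum.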
3.2 Perceptually Modified Hausdorff Distance

In this paper, we propose a novel dissimilarity measure, the Perceptually Modified Hausdorff Distance (PMHD), based on the HD, for the comparison of random signatures.
Fig. 2. An example of perceptual dissimilarity based on the densities of two color features
Given two random signatures S_1 = {(s_i^1, w_i^1, Σ_i^1) | i = 1, . . . , N} and S_2 = {(s_j^2, w_j^2, Σ_j^2) | j = 1, . . . , M}, a novel dissimilarity measure between the two random signatures is defined as
\[ D_H(S_1, S_2) = \max\{d_H(S_1, S_2), d_H(S_2, S_1)\}, \tag{4} \]
where d_H(S_1, S_2) and d_H(S_2, S_1) are the directed Hausdorff distances between the two random signatures.
The directed Hausdorff distance is defined as
\[ d_H(S_1, S_2) = \frac{\sum_i w_i^1 \times \min_j \frac{d(s_i^1, s_j^2)}{\min(w_i^1, w_j^2)}}{\sum_i w_i^1}, \tag{5} \]
where d(s_i^1, s_j^2) is the distance between two visual features of the same type, s_i^1 and s_j^2, which measures the difference between the two features.
In (5), we divide the distance between two feature vectors by the minimum of
two feature vectors’ densities. Let’s consider an example in Fig. 2(a). There are
two pairs of feature vectors represented as circles centered at mean feature vectors. The size of each circle represents the density of the corresponding feature.
If we compute only the geometric distance without considering the densities of
two feature vectors, the two distances d_1 and d_2 are equal. However, perceptually, d_2 must be smaller than d_1. Another example is given in Fig. 2(b), with three feature vectors. If we consider only the geometric distance, regardless of the densities, d_1 is smaller than d_2; perceptually, however, d_2 is smaller than d_1. The desired distance should reflect these observations. Therefore, we divide the geometric distance by the minimum of the two feature vectors' densities to match perceptual dissimilarity.
As a result, the PMHD is insensitive to changes in the characteristics of the mean features in a signature, and it is theoretically well founded in that it incorporates human intuition and perception into the metric.
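Under these definitions, the PMHD of Eqs. (4)-(5) can be sketched as follows. Signatures are represented here as (mean, weight) pairs with a Euclidean ground distance; the covariance matrices are omitted since this ground distance does not use them, and the names are ours:

```python
import numpy as np

def directed_pmhd(S1, S2):
    """Directed PMHD, Eq. (5): each ground distance is divided by the
    minimum of the two cluster weights, then averaged with weights w_i."""
    total = 0.0
    for s1, w1 in S1:
        total += w1 * min(np.linalg.norm(s1 - s2) / min(w1, w2)
                          for s2, w2 in S2)
    return total / sum(w for _, w in S1)

def pmhd(S1, S2):
    """Symmetric PMHD, Eq. (4): maximum of the two directed distances."""
    return max(directed_pmhd(S1, S2), directed_pmhd(S2, S1))
```

Dividing by min(w_i, w_j) means that a given geometric distance between two dense clusters contributes less than the same distance between sparse ones, which is exactly the perceptual weighting argued for above.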
Fig. 3. Example query images from four categories in the Corel database: (a) Eagle, (b) Cheetah
3.3 Partial PMHD Metric for Partial Matching
If a user is interested in only a part of an image, or needs to retrieve partially similar images, a global descriptor is not appropriate for the task. Like a histogram, a signature is a global descriptor of a whole image, and the proposed distance for random signatures in (4) can include possible outliers, since it sums over all cluster distances. As indicated in [1][5][9], this kind of distance cannot cope with occlusion and clutter in image retrieval or object recognition. To handle partial matching, Huttenlocher et al. [6] proposed a partial HD based on ranking, which measures the difference between portions of the point sets, and Azencott et al. [8] further modified the rank-based partial HD using order statistics. However, these distances were shown to be sensitive to parameter changes. To address these problems, Sim et al. [9] proposed two robust HD measures, M-HD and LTS-HD, based on robust statistics such as M-estimation and the Least Trimmed Square (LTS). Unfortunately, they are not appropriate for an image retrieval system, because they are computationally too complex for searching a large database.
In this section, we explicitly remove outliers from the proposed distance to address the partial matching problem. Let us define the outlier test function as
\[ f(i) = \begin{cases} 1, & \min_j \frac{d(s_i^1, s_j^2)}{\min(w_i^1, w_j^2)} < D_{th}, \\ 0, & \text{otherwise}, \end{cases} \tag{6} \]
where D_{th} is a pre-specified threshold for outlier detection.
Then we compute two directed Hausdorff distances, d_H^a(S_1, S_2) and d_H^p(S_1, S_2), defined as
\[ d_H^a(S_1, S_2) = \frac{\sum_i w_i^1 \times \min_j \frac{d(s_i^1, s_j^2)}{\min(w_i^1, w_j^2)}}{\sum_i w_i^1}, \qquad d_H^p(S_1, S_2) = \frac{\sum_i w_i^1 \times \min_j \frac{d(s_i^1, s_j^2)}{\min(w_i^1, w_j^2)} \times f(i)}{\sum_i w_i^1 \times f(i)}. \tag{7} \]
Now, let us modify the directed Hausdorff distance in (5) as
\[ d_H(S_1, S_2) = \begin{cases} d_H^a(S_1, S_2), & \frac{\sum_i w_i^1 \times f(i)}{\sum_i w_i^1} < P_{th}, \\ d_H^p(S_1, S_2), & \text{otherwise}, \end{cases} \tag{8} \]
where P_{th} is a pre-specified threshold controlling the fraction of information loss.
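The outlier-rejecting directed distance of Eqs. (6)-(8) can then be sketched as follows, again with (mean, weight) signatures and a Euclidean ground distance, and with names of our own choosing:

```python
import numpy as np

def partial_directed_pmhd(S1, S2, D_th, P_th):
    """Directed PMHD with outlier rejection, Eqs. (6)-(8): clusters of S1
    whose best perceptual distance to S2 exceeds D_th are flagged as
    outliers; if enough inlier weight survives, outliers are dropped."""
    weights = [w for _, w in S1]
    best = [min(np.linalg.norm(s1 - s2) / min(w1, w2) for s2, w2 in S2)
            for s1, w1 in S1]
    f = [1.0 if b < D_th else 0.0 for b in best]          # Eq. (6)
    d_a = sum(w * b for w, b in zip(weights, best)) / sum(weights)
    kept = sum(w * fi for w, fi in zip(weights, f))
    if kept / sum(weights) < P_th:    # Eq. (8): too much information lost
        return d_a
    return sum(w * b * fi for w, b, fi in zip(weights, best, f)) / kept
```

When the surviving weight fraction falls below P_th, the full distance d_H^a is used instead of the trimmed d_H^p, so the metric degrades gracefully rather than comparing nearly empty signatures.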
4 Experimental Results

4.1 The Database and Queries
To evaluate the performance of the proposed metric, several experiments were conducted on a real database using color as the visual feature.
We used 5,200 images selected from the commercially available Corel color image database without any modification. There are 52 semantic categories, each containing 100 images. Among those, we chose four sets of query data: Cheetah, Eagle, Pyramids, and Royal guards. Some example images from the queries are given in Fig. 3. In this experiment, we used every image in these four categories as a query. As Fig. 3 shows, the grouping of images into categories was not based on color information. Nonetheless, in a broad sense, all images in the same category were considered relevant images, i.e., correct answers, with respect to color information. We computed a precision and recall pair for each query category, which is a commonly used retrieval performance measurement [11]. Precision P and recall R are defined as
\[ P = r/n, \qquad R = r/m, \tag{9} \]
where r is the number of retrieved relevant images, n is the total number of
retrieved images, and m is the total number of relevant images in the whole
database. Precision P measures the accuracy of the retrieval and recall R measures the robustness of the retrieval performance.
In this paper, we used only color as the visual feature. Thus we consider three different distances for d(s_i^1, s_j^2) in (5): the Euclidean distance, the CIE94 color difference, and the Mahalanobis distance. To guarantee that the distance is perceptually uniform, the CIE94 color difference equation is used instead of the Euclidean distance in the CIELab color space [17], and the Mahalanobis distance explicitly accounts for the distribution of the color features after clustering [16]. The three distances are defined as follows.
(i) Euclidean distance:
\[ d_E(s_i^1, s_j^2) = \left[ \sum_{k=1}^{3} (s_i^1(k) - s_j^2(k))^2 \right]^{1/2}, \tag{10} \]
where s_i(k) is the k-th element of the feature vector s_i.
Fig. 4. Precision-recall curves for various dissimilarity measures on the four query categories: (a) Eagle, (b) Cheetah, (c) Pyramids, and (d) Royal guards
(ii) CIE94 color difference equation:
\[ d_{CIE94}(s_i^1, s_j^2) = \left[ \left( \frac{\Delta L^*}{k_L S_L} \right)^2 + \left( \frac{\Delta C^*}{k_C S_C} \right)^2 + \left( \frac{\Delta H^*}{k_H S_H} \right)^2 \right]^{1/2}, \tag{11} \]
where ΔL^*, ΔC^*, and ΔH^* are the differences in lightness, chroma, and hue between s_i^1 and s_j^2.
(iii) Mahalanobis distance:
\[ d_M(s_i^1, s_j^2) = (s_i^1 - s_j^2)^T \Sigma_i^{-1} (s_i^1 - s_j^2). \tag{12} \]

4.2 Retrieval Results for Queries
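Before turning to the results, the three candidate ground distances of Section 4.1 can be sketched as follows. This is a minimal illustration: the CIE94 weighting functions use the standard graphic-arts parameters, which is our assumption, since the paper does not list them, and Eq. (12) is kept in the squared form the paper uses:

```python
import numpy as np

def d_euclidean(s1, s2):
    """Eq. (10): Euclidean distance between two 3-vector color features."""
    return float(np.linalg.norm(s1 - s2))

def d_cie94(s1, s2, kL=1.0, kC=1.0, kH=1.0):
    """Eq. (11): CIE94 difference of two CIELab colors [L*, a*, b*],
    with the graphic-arts weights S_L = 1, S_C = 1 + 0.045 C_1,
    S_H = 1 + 0.015 C_1 (an assumption; not listed in the paper)."""
    L1, a1, b1 = s1
    L2, a2, b2 = s2
    C1, C2 = np.hypot(a1, b1), np.hypot(a2, b2)
    dL, dC = L1 - L2, C1 - C2
    # Delta-H^2 follows implicitly from Delta-a, Delta-b, and Delta-C
    dH2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    SL, SC, SH = 1.0, 1.0 + 0.045 * C1, 1.0 + 0.015 * C1
    return float(np.sqrt((dL / (kL * SL)) ** 2 + (dC / (kC * SC)) ** 2
                         + dH2 / (kH * SH) ** 2))

def d_mahalanobis(s1, s2, Sigma):
    """Eq. (12): Mahalanobis distance with the cluster covariance Sigma."""
    diff = s1 - s2
    return float(diff @ np.linalg.inv(Sigma) @ diff)
```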
The performance of the proposed PMHD was compared with five well-known dissimilarity measures: Histogram Intersection (HI), χ2-statistics, Jeffrey Divergence (JD), and the Quadratic Form (QF) distance for the fixed-binning histogram, and the EMD for the signature. Let H_1 and H_2 represent two color histograms or signatures. These dissimilarity measures are defined as follows.
– Histogram Intersection (HI) [18]:
\[ d(H_1, H_2) = 1 - \sum_i \min(h_i^1, h_i^2) / \sum_i h_i^2, \tag{13} \]
where h_i^j is the number of elements in the i-th bin of H_j.
– χ2-statistics:
\[ d(H_1, H_2) = \sum_i (h_i^1 - m_i)^2 / m_i, \tag{14} \]
where m_i = (h_i^1 + h_i^2)/2.
– Jeffrey Divergence (JD) [14]:
\[ d(H_1, H_2) = \sum_i \left( h_i^1 \log \frac{h_i^1}{m_i} + h_i^2 \log \frac{h_i^2}{m_i} \right), \tag{15} \]
where again m_i = (h_i^1 + h_i^2)/2.
– Quadratic Form (QF) distance [19][20]:
\[ d(H_1, H_2) = \sqrt{(H_1 - H_2)^T A (H_1 - H_2)}, \tag{16} \]
where A is a similarity matrix that encodes the cross-bin relationships based on the perceptual similarity of the representative colors of the bins.
– EMD [1][13]:
\[ d(H_1, H_2) = \sum_{i,j} g_{ij} d_{ij} / \sum_{i,j} g_{ij}, \tag{17} \]
where d_{ij} denotes the dissimilarity between the i-th and j-th bins, and g_{ij} is the optimal flow between the two distributions. The total cost \(\sum_{i,j} g_{ij} d_{ij}\) is minimized subject to the constraints
\[ g_{ij} \ge 0, \quad \sum_i g_{ij} \le h_j^2, \quad \sum_j g_{ij} \le h_i^1, \quad \sum_{i,j} g_{ij} = \min\Big( \sum_i h_i^1, \sum_j h_j^2 \Big). \tag{18} \]
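For reference, the EMD of Eqs. (17)-(18) can be computed by solving the transportation linear program directly, e.g. with SciPy's `linprog`. This is a small-scale sketch for clarity, not the efficient solver used in [1]:

```python
import numpy as np
from scipy.optimize import linprog

def emd(w1, x1, w2, x2):
    """EMD, Eqs. (17)-(18): minimize sum g_ij d_ij over flows g_ij subject
    to the capacity constraints, then normalize by the total flow.
    w1, w2 are cluster weights; x1, x2 are cluster centers (one per row)."""
    n, m = len(w1), len(w2)
    D = np.linalg.norm(x1[:, None, :] - x2[None, :, :], axis=2)  # d_ij
    c = D.ravel()                     # variables g_ij, flattened row-major
    A_ub, b_ub = [], []
    for i in range(n):                # sum_j g_ij <= w1_i
        row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1.0
        A_ub.append(row); b_ub.append(w1[i])
    for j in range(m):                # sum_i g_ij <= w2_j
        row = np.zeros(n * m); row[j::m] = 1.0
        A_ub.append(row); b_ub.append(w2[j])
    flow = min(np.sum(w1), np.sum(w2))            # total-flow constraint
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=[np.ones(n * m)], b_eq=[flow], bounds=(0, None))
    return res.fun / flow
```

The number of LP variables is n × m, which hints at why the EMD is far more expensive than the bin-wise measures above.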
As reported in [13], the EMD yielded very good retrieval performance for small sample sizes, while JD and χ2 performed very well for larger sample sizes. Leow et al. [15] proposed a novel dissimilarity measure, Weighted Correlation (WC), which can be used to compare two histograms with different
Fig. 5. Comparison of the retrieval performance for varying numbers of color features in a signature: (a) Eagle, (b) Cheetah, (c) Pyramids, and (d) Royal guards
binnings. In image retrieval, the performance of WC was comparable to other dissimilarity measures; however, JD always outperformed WC, and thus we evaluated only JD in this paper. To represent a color image as a fixed histogram, the RGB color space was uniformly partitioned into 10 × 10 × 10 = 1000 color bins, and each color was quantized to the centroid of its cubic bin. In contrast, as mentioned in Section 2, a random signature was extracted by applying K-means clustering. To compare the performance of the signature-based dissimilarity with the fixed-histogram-based ones, the quantization level was matched by clustering each color image into only 10 color feature clusters. The mean color quantization error of the 10 × 10 × 10-bin histogram is 5.99 CIE94 units, and that of the quantized image based on a random signature containing 10 color feature vectors is 5.26 CIE94 units. Note that the difference between the two quantization errors is smaller than the perceptibility threshold of 2.2 CIE94 units [21], below which two colors are perceptually indistinguishable [15].
The retrieval performance of the proposed metric and the other dissimilarity measures is summarized by the precision-recall curves in Fig. 4. Note that the proposed PMHD dissimilarity measure significantly outperformed the other dissimilarity measures for all query images. The precision of the PMHD is, on average, 20-30% higher than the second-highest precision rate over the meaningful recall values. The performance of the PMHD with the Euclidean distance is almost the same as that of the PMHD with CIE94, and it usually performed best in the image retrieval. Somewhat surprisingly, the EMD performed worse than the other dissimilarity measures in all query categories except the "Eagle" category. This result does not coincide with those reported in [13] and [1], where the EMD performed very well for small sample sizes and compact representations, but not so well for large sample sizes and wide representations. As indicated in [15], the image size, the number of color features in a signature, and the ground distance may degrade the overall performance of the EMD. However, as mentioned before, we used a signature with only 10 color features in this experiment, which is a very compact representation. We suspect that the large image size of roughly 98,304 pixels and the Euclidean ground distance may severely degrade the performance of the EMD.
4.3 Dependency on the Number of Color Features in a Signature
In general, the quantization level of a feature space, that is, the number of clusters in a signature or the number of bins in a fixed histogram, has an important effect on the overall image retrieval performance. To investigate the dependency of the retrieval performance on the quantization level, we compared the retrieval performance of the proposed method with the number of color features in a signature set to 10 and to 30. The mean color error of the quantized image based on a random signature with 30 color feature vectors is 3.38 CIE94 units, which is significantly smaller than the 5.26 CIE94 units obtained with a random signature of 10 color feature vectors. Fig. 1 shows two sample images quantized using K-means clustering with 10 and 30 colors, respectively. Note that the quantized image based on a random signature with 30 color features is almost indistinguishable from the original image, which contains 256,758 colors.
Fig. 5 plots the precision-recall curves of the image retrieval results for varying numbers of color features in a signature. We compared the retrieval performance of the proposed PMHD with that of the EMD, since the EMD is the only other dissimilarity measure applicable to signatures. The precision rate of the EMD does not vary significantly as the number of color features in a signature increases, as depicted in Fig. 5. However, the precision rate of the PMHD with 30 color features is slightly higher than that of the PMHD with 10 color features. From this result, it can be expected that the performance of the proposed PMHD improves as the quantization error decreases. Moreover, this implies that the PMHD performs well for large sample sizes as well as for compact representations.
5 Conclusion
In this paper, we proposed a novel dissimilarity measure for random signatures, the Perceptually Modified Hausdorff Distance (PMHD), based on the Hausdorff distance. The PMHD is insensitive to changes in the characteristics of the mean features in a signature and theoretically well founded in that it incorporates human intuition and perception into the metric. Extensive experimental results on a real database showed that the proposed PMHD outperformed the other dissimilarity measures; its retrieval precision is, on average, 20-30% higher than the second-highest precision rate. In this paper, we used only color information, which was shown to be insufficient to close the semantic gap without texture information, multi-resolution representation, relevance feedback, and so on. Thus, combining texture information and representing signatures in a multi-resolution framework will be our future work.
Acknowledgement
This work has been supported in part by the ITRC (Information Technology Research Center) support program of the Korean government and by the IIRC (Image Information Research Center) of the Agency for Defense Development, Korea.
References
1. Y.Rubner and C.Tomasi, Perceptual metrics for image database navigation, Kluwer
Academic Publishers, January 2001.
2. G.Qiu and K.M.Lam, “Frequency layered color indexing for content-based image
retrieval,” IEEE Trans. Image Processing, vol.12, no.1, pp.102-113, January 2003.
3. A.Dorado and E.Izquierdo, “Fuzzy color signature,” IEEE Int’l Conference on Image Processing, vol.1, pp.433-436, 2002.
4. A.W.M.Smeulders et al., “Content-based image retrieval at the end of the early
years,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol.22, no.12,
pp.1349-1380, December 2000.
5. V.Gouet and N.Boujemaa, “About optimal use of color points of interest for
content-based image retrieval,” Research Report RR-4439, INRIA Rocquencourt,
France, April 2002.
6. D.P.Huttenlocher, G.A.Klanderman, and W.J.Rucklidge, “Comparing images using
the Hausdorff distance,” IEEE Trans. Pattern Analysis and Machine Intelligence,
vol.15, no.9, pp.850-863, September 1993.
7. M.P.Dubuisson and A.K.Jain, “A modified Hausdorff distance for object matching,” Proceedings of IEEE International Conference on Pattern Recognition,
pp.566-568, October 1994.
8. R.Azencott, F.Durbin, and J.Paumard, “Multiscale identification of building in
compressed large aerial scenes,” Proceedings of IEEE International Conference on
Pattern Recognition, vol.2, pp.974-978, Vienna, Austria, 1996.
9. D.G.Sim, O.K.Kwon, and R.H.Park, “Object matching algorithms using robust
Hausdorff distance measures,” IEEE Trans. Image Processing, vol.8, no.3, pp.425-428, March 1999.
10. S.H.Kim and R.H.Park, “A novel approach to video sequence matching using color
and edge features with the modified Hausdorff distance,” in Proc. 2004 IEEE Int.
Symp. Circuit and Systems, Vancouver, Canada, May 2004.
11. Alberto Del Bimbo, Visual information retrieval, Morgan Kaufmann Publishers
Inc., San Francisco, CA, 1999.
12. R.O.Duda, P.E.Hart, and D.G.Stork, Pattern classification, Wiley & Sons Inc., New
York, 2001.
13. J.Puzicha, J.M.Buhmann, Y.Rubner, and C.Tomasi,“Empirical evaluation of dissimilarity measures for color and texture,” Proceedings of IEEE International Conference on Computer Vision, pp.1165-1173, 1999.
14. J.Puzicha, T.Hofmann, and J.Buhmann, “Nonparametric similarity measures for
unsupervised texture segmentation and image retrieval,” Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition, pp.267-272, June 1997.
15. W.K.Leow and R.Li,“The analysis and applications of adaptive-binning color histograms,” Computer Vision and Image Understanding, vol.94, pp.67-91, 2004.
16. F.H.Imai, N.Tsumura, Y.Miyake, “Perceptual color difference metric for complex
images based on Mahalanobis distance,” Journal of Electronic Imaging, vol.10,
no.2, pp.385-393, 2001.
17. K.N.Plataniotis and A.N.Venetsanopoulos, Color image processing and applications, Springer, New York, 2000.
18. M.Swain and D.Ballard, “Color indexing,” International Journal of Computer Vision, vol.7, no.1, pp.11-32, 1991.
19. J.Hafner, H.S.Sawhney, W.Equitz, M.Flickner, and W.Niblack, “Efficient color histogram indexing for quadratic form distance functions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol.17, no.7, pp.729-735, July 1995.
20. M.Flickner, H.Sawhney, W.Niblack, J.Ashley, Q.Huang, B.Dom, M.Gorkani,
J.Hafner, D.Lee, D.Petkovic, D.Steele, P.Yanker, “Query by image and video content: the QBIC system,” IEEE Computer, vol.28, no.9, pp.23-32, 1995.
21. T.Song and R.Luo, “Testing color-difference formulae on complex images using a
CRT monitor,” in Proc. 8th Color Imaging Conference, 2000.
22. Y.Rui, T.S.Huang, and S.F.Chang, “Image retrieval : Current techniques, promising directions and open issues,” Journal of Visual Communication and Image Representation, 1999.
23. W.Y.Ma and H.J.Zhang, “Content-based image indexing and retrieval,” Handbook
of Multimedia Computing, CRC Press, 1999.
24. A.Pentland, R.W.Picard, and S.Sclaroff, “Photobook : Content-based manipulation of image databases,” International Journal of Computer Vision, vol.18, no.3,
pp.233-254, 1996.
25. J.R.Smith and S.F.Chang, “VisualSEEK : A fully automated content-based image
query system,” ACM Multimedia, Boston MA, 1996.
26. Y.Rui, T.Huang, and S.Mehrotra, “Content-based image retrieval with relevance
feedback in MARS,” IEEE Int’l Conference on Image Processing, 1997.
27. T.Wang, Y.Rui, and J.G.Sun, “Constraint based region matching for image retrieval,” International Journal of Computer Vision, vol.56, no.1/2, pp.37-45, 2004.
28. K.Tieu and P.Viola, “Boosting image retrieval,” International Journal of Computer
Vision, vol.56, no.1/2, pp.17-36, 2004.