International Journal of Communication and Computer Technologies
Volume 01 – No.5, Issue: 02 September 2012
ISSN NUMBER : 2278-9723
A Survey Paper on an Efficient Salient Feature Extraction by
Using Saliency Map Detection with Modified K-Means
Clustering Technique
JYOTI VERMA1, VINEET RICHHARIYA 2
Lakshmi Narain College of Technology, Kalchuri Nagar, Raisen Road, P.O. Kolua
Bhopal, Madhya Pradesh- 462 021, India
Abstract
The human eye is perceptually more sensitive to certain colors and
intensities, and objects with such features are considered more
salient. Detection of salient image regions is useful in
applications such as object-based image retrieval, adaptive
content delivery, adaptive region-of-interest based image
compression, and smart image resizing. The problem can be
handled by mapping pixels into various feature spaces. This
paper proposes a methodology that couples K-means
clustering with a saliency map detection technique to determine
salient regions in images using the low-level features of luminance
and color, and then extracts those features. The proposed methodology is
simple to implement, computationally efficient, and generates
high-quality saliency maps and salient objects of the same size
and resolution as the input image. The presented scheme
identifies salient regions as those regions of an image that are
visually more conspicuous by virtue of their contrast with
respect to surrounding regions.
Keywords: Clustering, Feature Extraction, K-means,
Saliency
1. Introduction
A salient feature is obtained through segmentation, which
extracts the main features of an image: given a color input, it
computes feature maps for color, brightness, contrast, and
orientation at different scales of the static image.
Segmentation divides a digital image into
segments (sets of pixels, also known as super-pixels). The
purpose of segmentation is to simplify the representation of an
image into something that is more meaningful and easier to
analyze [1]. Image segmentation is typically used
to locate objects and boundaries (lines, curves, etc.) in
pictures. More specifically, image segmentation is the process
of assigning a label to every pixel in an image such that pixels
with the same label share certain visual properties. The most
salient objects visually stand out from their surroundings and
are likely to attract human attention. An important property
that makes an object salient is its visual difference from the
background: a polar bear is striking against dark rocks but
almost invisible in snow. The detection of visual saliency is
important in many computer vision applications, ranging from
general object recognition and the thumbnailing of Web images
to computing a common focus of attention for human-robot
interaction [1-3].
Visual saliency and, more generally, visual attention have been
widely investigated in neurobiology and psychophysics [4], and
many computational models [5-7] have been built on these
findings. Current approaches to saliency are based on
computational and mathematical ideas and are generally less
biologically motivated. They range from computing
entropy [8,9] and determining the features that best
distinguish a target from a null hypothesis [10] to learning
the optimal feature combination with machine learning
techniques [11].
Identification of salient regions is useful for many
applications like object based image retrieval, adaptive
content delivery, adaptive region-of-interest based image
compression, and smart image resizing [12-15]. We
identify salient regions as those regions of an image that
are visually more conspicuous by virtue of their contrast
with respect to surrounding regions.
This paper introduces a modified k-means clustering
technique that decomposes the input image into k parts,
generates a matrix for each part, and selects one initial
cluster center from each. This selection approach for initial
cluster centers is then applied to finding salient regions in images.
After a brief overview of previous work in Section 2, we
describe saliency detection, region saliency calculation, and
k-means clustering in Sections 3 and 4, and our
proposed k-means algorithm and its architecture in
Sections 5 and 6. Finally, conclusions are
presented in Section 7.
2. Techniques for Saliency Detection
The approaches used to compute low-level saliency are based
either on biological models or on purely computational ones.
Some approaches assess saliency across multiple scales, while
others operate on a single scale. In general, all methods
determine the contrast of local image regions with their
surroundings using one or more features, such as color,
intensity, and orientation.
Ma and Zhang [13] introduced a local contrast-based
method for generating saliency maps that operates on a
single scale and is not based on a biological model. The
input to the local contrast computation is a resized and
color-quantized CIELUV image divided into pixel blocks.
The CIELUV color space, adopted by the
International Commission on Illumination (CIE) in 1976,
is a simple-to-compute transformation of the CIE XYZ color
space that attempts perceptual uniformity. It is widely used
for applications dealing with colored lights, such as computer
graphics. Although additive mixtures of differently colored
lights fall on a line in CIELUV's uniform chromaticity
diagram, such mixtures do not, contrary to popular belief,
fall along a line in the CIELUV color space unless they are
constant in lightness. Saliency is obtained from the
differences between image pixels and their respective
neighboring pixels in a small neighborhood. This framework
extracts points and regions of attention, and a fuzzy-growing
method then segments salient regions from the saliency map.
Hu et al. [16] propose creating saliency maps by
thresholding the color, intensity, and orientation maps
using histogram entropy thresholding analysis in place
of a scale-space approach. They then use a spatial
compactness measure, computed as the area of the convex
hull encompassing the salient region, together with a saliency
density, which is a function of the magnitude of the saliency
values in the saliency feature map, to weight the individual
saliency maps before they are combined.
Itti et al. [6] introduced a computational model of
spatial visual attention based on a biologically plausible
architecture. They compute maps of salient features for
luminance, color, and orientation at different scales, which
are aggregated at each point of the image into a combined,
bottom-up saliency map. Saliency maps produced by Itti's
approach have been used by other researchers for applications
such as adapting images for small devices [17] and
unsupervised segmentation of objects [18].
Because Itti's method produces saliency maps that are smaller
than the input image (e.g., a 30x20 pixel saliency map for a
480x320 pixel image), segmentation using such sub-sampled
saliency maps requires more complex approaches. For
example, a Markov random field model may be used to grow
salient object regions from seed values taken from the saliency
map, integrating the low-level features of color, texture, and
edges [18].
Ko and Nam [19], on the other hand, use a support
vector machine trained on image segment features to
identify the regions of interest in an image, which are
then clustered to extract the most salient regions.
More recently, Frintrop et al. [20] used integral images [21]
in VOCUS (Visual Object detection with a Computational
Attention System) to speed up the computation of the
center-surround differences when searching for salient
regions with separate feature maps for color, intensity,
and orientation.
3. Saliency Detection and Saliency Map
This section presents the details of saliency determination
and its use in segmenting whole objects. Using the
saliency calculation method described below, saliency maps
are created at different scales. These maps are added
pixel-wise to obtain the final saliency map. The input image
is then over-segmented, and the segments whose average
saliency exceeds a certain threshold are chosen.
3.1 Saliency calculation
Saliency is determined as the local contrast of
an image region with respect to its neighborhood at various
scales [12].
Fig. 1 (a) Contrast detection filter showing inner square
region R1 and outer square region R2. (b) The width of
R1 remains constant while that of R2 ranges according to
Equation 3 by halving it for each new scale. (c) Filtering
the image at one of the scales in a raster scan fashion.
Here, the distance between the average feature vector of the
pixels of an inner sub-region and the average feature vector
of the pixels of its surrounding region is evaluated. This
yields one combined feature map at each scale based on
per-pixel feature vectors, instead of combining individual
saliency maps of scalar values for each feature.
3.2 Saliency Detection Algorithm
At a given scale, the contrast-based saliency value c_{i,j} for
a pixel at position (i, j) in the image is determined as the
distance D between the average feature vector of the pixels of
the inner region R1 and that of the outer region R2:

c_{i,j} = D[ (1/N1) Σ_{p=1}^{N1} v_p , (1/N2) Σ_{q=1}^{N2} v_q ]        (1)

where N1 and N2 are the numbers of pixels in regions R1 and R2
respectively, and v is the vector of feature elements
corresponding to a pixel. D is a Euclidean distance if v is a
vector of uncorrelated feature elements, and a Mahalanobis (or
other suitable) distance if the elements of the vector are
correlated. In this work, the CIELAB color space is used,
assuming RGB input images, to generate feature vectors for
color and luminance. The CIELAB color space is the most
complete color space specified by the International Commission
on Illumination; it describes all the colors visible to the
human eye. Its three coordinates represent the lightness of the
color (L* = 0 yields black and L* = 100 indicates diffuse
white), its position between red/magenta and green (a*, where
negative values indicate green and positive values indicate
magenta), and its position between yellow and blue (b*, where
negative values indicate blue and positive values indicate
yellow). The red/green and yellow/blue opponent channels are
computed as differences of lightness transformations of the
cone responses. Since perceptual differences in the CIELAB
color space are approximately Euclidean, D in Equation 1 is:

D = || v1 − v2 || = sqrt( (L1 − L2)^2 + (a1 − a2)^2 + (b1 − b2)^2 )        (2)
where v1 = [L1, a1, b1]^T and v2 = [L2, a2, b2]^T
are the average feature vectors for the inner region R1 and the
outer region R2. Since only the average feature vectors of R1
and R2 are needed, the integral image approach of [21] is used
for computational efficiency. A change in scale is effected by
scaling the region R2 instead of scaling the image. Scaling the
filter in place of the image allows the generation of saliency
maps of the same size and resolution as the input image. For
an image of width w pixels and height h pixels, the width of
region R2, namely w_{R2}, is varied as:

w/2 ≥ w_{R2} ≥ w/8        (3)
Here w is assumed to be smaller than h (otherwise h is used to
decide the dimensions of R2). This is based on the observation
that the largest regions R2 (wider than w/2) and the smallest
ones (narrower than w/8) are of little use in finding salient
regions: the former may highlight non-salient regions as
salient, while the latter act essentially as edge detectors.
Filtering is therefore performed at three different scales for
each image, and the final saliency map is determined as a sum
of the saliency values across the scales S:

m_{i,j} = Σ_{s ∈ S} c_{i,j}^{s}        (4)

for all i ∈ [1, w] and j ∈ [1, h], where m_{i,j} is an element
of the combined saliency map M obtained by point-wise
summation of the saliency values across the scales.
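As a concrete illustration of Equations 1-4, the multi-scale contrast computation can be sketched in NumPy as follows. This is a minimal sketch, assuming the per-pixel feature vectors (e.g. the CIELAB channels) are already available as an h x w x 3 array; an integral image (summed-area table) gives the mean of any rectangular region in constant time, as in [21]. The function names and parameter defaults are illustrative, not from the paper.

```python
import numpy as np

def integral_image(feat):
    """Summed-area table with a zero top row and left column, per channel."""
    ii = np.cumsum(np.cumsum(feat, axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0), (0, 0)), mode="constant")

def region_mean(ii, i0, j0, i1, j1):
    """Mean feature vector over rows [i0, i1) and cols [j0, j1) in O(1)."""
    n = (i1 - i0) * (j1 - j0)
    s = ii[i1, j1] - ii[i0, j1] - ii[i1, j0] + ii[i0, j0]
    return s / n

def saliency_map(feat, r1=1, scales=(2, 4, 8)):
    """Sum of per-scale contrast maps (Eq. 4): at each pixel, the Euclidean
    distance (Eq. 2) between the mean feature vectors of an inner window R1
    and an outer window R2 whose half-width shrinks per scale (Eq. 3)."""
    h, w, _ = feat.shape
    ii = integral_image(feat)
    m = np.zeros((h, w))
    for s in scales:
        r2 = max(min(w, h) // s // 2, r1 + 1)  # outer half-width for this scale
        c = np.zeros((h, w))
        for i in range(h):
            for j in range(w):
                i0a, i1a = max(i - r1, 0), min(i + r1 + 1, h)
                j0a, j1a = max(j - r1, 0), min(j + r1 + 1, w)
                i0b, i1b = max(i - r2, 0), min(i + r2 + 1, h)
                j0b, j1b = max(j - r2, 0), min(j + r2 + 1, w)
                v1 = region_mean(ii, i0a, j0a, i1a, j1a)  # mean of R1
                v2 = region_mean(ii, i0b, j0b, i1b, j1b)  # mean of R2
                c[i, j] = np.linalg.norm(v1 - v2)         # Eq. 1 with Eq. 2
        m += c  # point-wise summation across scales (Eq. 4)
    return m
```

On a synthetic image containing a bright square on a dark background, the resulting map is highest on and around the square.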
4. K-means Clustering
Clustering can be considered the most important
unsupervised learning problem: as with every problem of
this kind, it deals with finding structure in a collection
of unlabeled data. A definition of clustering could be "the
process of organizing objects into groups whose members
are similar in some way". A cluster is therefore a
collection of objects that are "similar" to one another
and "dissimilar" to the objects belonging to other
clusters. Many clustering algorithms exist in the
literature. A classical partitioning algorithm organizes
n objects into k partitions (k ≤ n), where each partition
represents a cluster. The well-known
and commonly used partitioning methods are k-means and
k-medoids, and their variations. The k-means algorithm takes
an input parameter, k, and partitions a set of n objects into k
clusters so that the resulting intra-cluster similarity is high
but the inter-cluster similarity is low. Cluster similarity
is measured with respect to the mean value of the objects in a
cluster, which can be viewed as the cluster centroid. The
algorithm randomly selects k of the objects, each of which
initially represents a cluster mean or center. Each of the
remaining objects is assigned to the cluster to which it is
most similar, based on the distance between the object and the
cluster mean. The algorithm then computes the new mean of each
cluster and iterates until the criterion function converges.
The squared-error criterion is used, defined as

E = Σ_{i=1}^{k} Σ_{c ∈ C_i} | c − m_i |^2

where E is the sum of the squared error over all objects in
the data set, c is the point in space representing a given
object, and m_i is the mean of cluster C_i. This criterion
tries to make the resulting k clusters as compact and as
separated as possible.

5. Proposed Modification of the K-means Clustering
Algorithm

The efficiency of k-means depends on the initial selection of
the k cluster centers. In the proposed method, the approach
used to select the initial centers is modified as follows.

Step 1: First decompose the input image into k parts and
generate a matrix for each part; then choose one initial
center from the matrix of each part instead of from a row
vector, with the guarantee that no two centers come from the
same part or the same point. This selection approach for the
initial cluster centers is then applied to finding salient
regions in images.
Step 2: Assign each object to the group that has the
closest center.
Step 3: When all objects have been assigned, recalculate
the positions of the k centers.
Step 4: Repeat Steps 2 and 3 until the centers no longer
move.

6. Proposed Architecture

Steps of the whole process:
(1) First, compute the saliency map of the input image using
Equations 1-4, as described in Section 3.2.
(2) Apply the modified k-means clustering algorithm
(described in Section 5) to the input image.
(3) Calculate the average saliency value of each cluster r_k
obtained in Step 2:

A_k = (1 / |r_k|) Σ_{(i,j) ∈ r_k} m_{i,j}        (5)

where m_{i,j} is defined in Equation 4.
(4) Calculate the threshold saliency value from the saliency
map computed in Step 1. The formula used to calculate the
threshold saliency value is

T = (1 / (X · Y)) Σ_{x=1}^{X} Σ_{y=1}^{Y} S(x, y)

where X and Y are the width and height of the saliency map
respectively, and S(x, y) is the saliency value of the pixel
at position (x, y).
(5) Select the clusters (from Step 2) whose average saliency
value (from Step 3) is greater than the threshold value
(from Step 4).
(6) Set the RGB value of the pixels in the clusters selected
in Step 5 to 1 (white) and of all remaining pixels to 0
(black), producing a bitmap image.
(7) Finally, select from the original input image those pixels
whose value in the bitmap is 1, discarding the rest, to obtain
the salient object.
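The steps above can be sketched end-to-end as follows. This is a minimal NumPy illustration under stated assumptions: the image is decomposed into k parts by splitting its pixel list into k equal bands, the initial center of each part is taken as that part's mean pixel, pixels are clustered on their RGB values, and a precomputed saliency map (for example, from the sketch in Section 3) supplies the per-pixel values m_{i,j}. All function names are illustrative.

```python
import numpy as np

def modified_kmeans(pixels, parts, iters=50):
    """K-means where each initial center is drawn from a different part
    (Section 5, Step 1) -- here the part's mean pixel -- so no two centers
    come from the same part."""
    k = len(parts)
    centers = np.stack([p.mean(axis=0) for p in parts])
    for _ in range(iters):
        # Step 2: assign each pixel to the closest center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: recompute each center as its cluster mean.
        new = np.stack([pixels[labels == c].mean(axis=0) if np.any(labels == c)
                        else centers[c] for c in range(k)])
        if np.allclose(new, centers):  # Step 4: centers no longer move
            break
        centers = new
    return labels

def salient_object(image, sal, k=3):
    """Steps 1-7 of the proposed architecture on an h x w x 3 image,
    given its saliency map `sal` (Step 1, computed elsewhere)."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    parts = np.array_split(pixels, k)      # k parts: horizontal pixel bands
    labels = modified_kmeans(pixels, parts).reshape(h, w)  # Step 2
    thresh = sal.mean()                    # Step 4: threshold = mean saliency
    mask = np.zeros((h, w), dtype=bool)
    for c in range(k):                     # Steps 3 and 5: keep salient clusters
        members = labels == c
        if np.any(members) and sal[members].mean() > thresh:
            mask |= members
    # Steps 6-7: bitmap mask, then extract the salient object's pixels.
    return np.where(mask[..., None], image, 0.0)
```

With a synthetic image whose bright patch coincides with the high-saliency region, only that patch survives in the output.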
Fig. 2 Flow chart of the proposed method
7. Conclusion
A modified k-means clustering algorithm, in which the
selection of the initial cluster centers is adapted to finding
salient regions in images, has been presented in this paper.
The experimental results validate the efficiency of the
modified method. The proposed method achieves a better DBI
(Davies-Bouldin index) value, which signifies better
clustering, whereas the time taken by the proposed method is
slightly increased due to the modified approach of dividing
the image into k parts and then selecting a cluster center
from the matrix generated for each part.
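For reference, the Davies-Bouldin index (DBI) used above rewards compact, well-separated clusters (lower is better). The following is a generic NumPy sketch of the index, not the paper's evaluation code.

```python
import numpy as np

def davies_bouldin(points, labels, k):
    """Davies-Bouldin index: the average, over clusters, of the worst-case
    ratio (s_i + s_j) / d(c_i, c_j), where s_i is the mean distance of
    cluster i's points to its centroid and d(c_i, c_j) is the distance
    between the two centroids."""
    centroids = np.stack([points[labels == i].mean(axis=0) for i in range(k)])
    scatter = np.array([np.linalg.norm(points[labels == i] - centroids[i],
                                       axis=1).mean() for i in range(k)])
    dbi = 0.0
    for i in range(k):
        ratios = [(scatter[i] + scatter[j]) /
                  np.linalg.norm(centroids[i] - centroids[j])
                  for j in range(k) if j != i]
        dbi += max(ratios)  # worst-case similarity for cluster i
    return dbi / k
```

Two tight clusters placed far apart yield a small index, while overlapping clusters drive it up.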
References
[1] D. A. Klein and S. Frintrop. Center-surround divergence of feature
statistics for salient object detection.
[2] L. Marchesotti, C. Cifarelli, and G. Csurka. A framework for visual
saliency detection with applications to image thumbnailing. In Proc. of
ICCV, 2009.
[3] B. Schauerte and G. A. Fink. Focusing computational visual attention
in multi-modal human-robot interaction. In Proc. of Int'l Conf. on
Multimodal Interfaces (ICMI), 2010.
[4] H. Pashler. The Psychology of Attention. MIT Press, Cambridge, MA,
1997.
[5] J. K. Tsotsos. Towards a computational model of visual attention. In
Early Vision and Beyond. MIT Press, 1995.
[6] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual
attention for rapid scene analysis. IEEE Trans. on PAMI, 20(11), 1998.
[7] S. Frintrop. VOCUS: A Visual Attention System for Object Detection
and Goal-directed Search, volume 3899 of Lecture Notes in Artificial
Intelligence (LNAI). Springer, 2006.
[8] S. Frintrop, E. Rome, and H. I. Christensen. Computational visual
attention systems and their cognitive foundation: A survey. ACM Trans.
on Applied Perception, 7(1), 2010.
[9] T. Kadir, A. Zisserman, and M. Brady. An affine invariant salient
region detector. In Proc. of ECCV, 2004.
[10] D. Gao and N. Vasconcelos. Bottom-up saliency is a discriminant
process. In Proc. of ICCV, 2007.
[11] T. Liu, Z. Yuan, J. Sun, J. Wang, N. Zheng, X. Tang, and H.-Y.
Shum. Learning to detect a salient object. IEEE Trans. on PAMI, 2009.
[12] R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk. Salient region
detection and segmentation.
[13] Y.-F. Ma and H.-J. Zhang. Contrast-based image attention analysis
by using fuzzy growing. In Proc. of the Eleventh ACM International
Conference on Multimedia, pages 374-381, November 2003.
[14] V. Setlur, S. Takagi, R. Raskar, M. Gleicher, and B. Gooch.
Automatic image retargeting. In Proc. of the 4th International
Conference on Mobile and Ubiquitous Multimedia (MUM'05), pages
59-68, October 2005.
[15] S. Avidan and A. Shamir. Seam carving for content-aware image
resizing. ACM Transactions on Graphics, 26(3):10, July 2007.
[16] Y. Hu, X. Xie, W.-Y. Ma, L.-T. Chia, and D. Rajan. Salient region
detection using weighted feature maps based on the human visual
attention model. Springer Lecture Notes in Computer Science,
3332(2):993-1000, October 2004.
[17] L. Chen, X. Xie, X. Fan, W.-Y. Ma, H.-J. Zhang, and H. Zhou. A
visual attention model for adapting images on small displays. ACM
Transactions on Multimedia Systems, 9:353-364, November 2003.
[18] J. Han, K. Ngan, M. Li, and H. Zhang. Unsupervised extraction of
visual attention objects in color images. IEEE Transactions on Circuits
and Systems for Video Technology, 16(1):141-145, January 2006.
[19] B. C. Ko and J.-Y. Nam. Object-of-interest image segmentation
based on human attention and semantic region clustering. Journal of the
Optical Society of America A, 23(10):2462-2470, October 2006.
[20] S. Frintrop, M. Klodt, and E. Rome. A real-time visual attention
system using integral images. In International Conference on Computer
Vision Systems (ICVS'07), March 2007.
[21] P. Viola and M. Jones. Rapid object detection using a boosted
cascade of simple features. In Proc. of IEEE Conference on Computer
Vision and Pattern Recognition (CVPR'01), 1:511-518, December 2001.
AUTHORS PROFILE
Jyoti Verma
Completed Bachelor of Engineering in Information Technology at RKDF
Institute of Science and Technology, Rajiv Gandhi University, Bhopal
(M.P.), India in 2009. Final-year student of Master of Technology in
Software Engineering at Lakshmi Narain College of Technology, Rajiv
Gandhi University, Bhopal (M.P.), India.
Vineet Richhariya
Vineet Richhariya is Head of the Department of Computer Science and
Engineering, Lakshmi Narain College of Technology, Bhopal, India. He
received his M.Tech (Computer Science & Engineering) from BITS Pilani
in 2001 and his B.E. (Computer Science & Engineering) from Jiwaji
University, Gwalior, India in 1990.