June 4, 2004
High-level image classification
S. Papadopoulos, V. Mezaris, Y. Kompatsiaris,
M.G. Strintzis
Informatics and Telematics Institute CE.R.T.H
Aim of automatic image classification
• eliminating the considerable manual effort involved in annotating multimedia content
• accelerating search in multimedia databases
• automating object and concept recognition
Methods of tackling the problem
The ‘global’ approach
• Extract features from the whole image and apply a pattern-recognition technique globally
The region-based approach
• Segment the image, extract features from each region and apply classification to the regions
Feature Extraction
Currently the following features are available:
– Color Features (MPEG-7 Dominant Color, Lab
centers, RGB centers, Ohta color space centers, color
histogram)
– Edge Direction Features (modified MPEG-7 edge
direction descriptor, edge direction histogram)
– Contour Shape Features (MPEG-7 Curvature Scale
Space)
– Position Features (only for regions)
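As an illustration, the following is a minimal Python/NumPy sketch of two simple global colour features from the list above (per-channel statistics and a coarse colour histogram), not the actual implementation; the Lab and Ohta centres and the MPEG-7 descriptors would need additional colour-space conversions and are not shown.

import numpy as np

def color_features(image: np.ndarray, bins_per_channel: int = 4) -> np.ndarray:
    """image: H x W x 3 uint8 RGB array; returns a 1-D feature vector."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    means = pixels.mean(axis=0)                        # RGB centers (3 values)
    stds = pixels.std(axis=0)                          # color standard deviation (3 values)
    # Coarse joint color histogram: quantize each channel into a few bins.
    hist, _ = np.histogramdd(pixels, bins=bins_per_channel, range=[(0, 256)] * 3)
    hist = hist.flatten() / hist.sum()                 # normalized histogram (64 values)
    return np.concatenate([means, stds, hist])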
Region Feature Extraction
• The features are extracted for
each region after the image has
been segmented.
• An n-dimensional feature vector f = (f1, f2, …, fn) is obtained for each region.
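A minimal sketch of the per-region extraction, assuming the segmentation result is available as an integer label map (one region label per pixel); only colour statistics and the vertical-position feature are sketched here, while shape and edge-direction features are omitted.

import numpy as np

def region_feature_vectors(image: np.ndarray, labels: np.ndarray) -> dict:
    """image: H x W x 3 RGB array, labels: H x W integer region labels."""
    features = {}
    for region_id in np.unique(labels):
        mask = labels == region_id
        region_pixels = image[mask].astype(np.float64)   # N x 3 pixel values
        means = region_pixels.mean(axis=0)               # color centers
        stds = region_pixels.std(axis=0)                 # color standard deviation
        rows, _ = np.nonzero(mask)
        y_position = rows.mean() / image.shape[0]        # normalized vertical position
        features[int(region_id)] = np.concatenate([means, stds, [y_position]])
    return features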
Training procedure
• A representative subset of the images to be tested
is chosen.
• A semantic tag is attached to each image or region. Since we use
binary classifiers, this tag can be either 0 or 1, meaning that the
image (or region) belongs or does not belong to the semantic
category of interest.
• After gathering the two groups of feature vectors corresponding to
the images (regions) of the two aforementioned discrete semantic
categories, we calculate an average vector and a covariance matrix
for each group. Let the average vectors be m1 and m2, and the
covariance matrices C1 and C2.
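A minimal sketch of this training step, assuming the two groups of feature vectors have already been collected into k x n arrays (the names positive_vectors and negative_vectors are illustrative):

import numpy as np

def train_gaussian_model(vectors: np.ndarray):
    """vectors: k x n array, one n-dimensional feature vector per training example."""
    m = vectors.mean(axis=0)               # average vector m_i
    C = np.cov(vectors, rowvar=False)      # n x n covariance matrix C_i
    return m, C

# m1, C1 = train_gaussian_model(positive_vectors)   # tag 1: belongs to the concept
# m2, C2 = train_gaussian_model(negative_vectors)   # tag 0: does not belong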
Classifier
• The ‘concept model’ in our case consists of the average vector mi and the covariance matrix Ci. Thus, the concept space is described in a statistical way.
• The two semantic categories are modeled as multivariate Gaussian distributions. Given the feature vector y of an unknown image (or region), two class-conditional probabilities are computed and combined through Bayes’ theorem:
$$P(y \mid \omega_i) = \frac{1}{(2\pi)^{n/2}\,|C_i|^{1/2}}\; e^{\delta_i}$$

where:

$$\delta_i = -\tfrac{1}{2}\,(y - m_i)^T C_i^{-1} (y - m_i)$$
and P(y|ωi) is the probability of observing the vector y given that it belongs to class ωi.
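The formula above is the n-dimensional Gaussian density; the following is a minimal, illustrative Python sketch of evaluating it (not the authors' implementation). The same value can also be obtained with scipy.stats.multivariate_normal(mean=m, cov=C).pdf(y).

import numpy as np

def class_conditional(y: np.ndarray, m: np.ndarray, C: np.ndarray) -> float:
    """P(y | w_i) for the Gaussian concept model (m_i, C_i)."""
    n = y.shape[0]
    diff = y - m
    delta = -0.5 * diff @ np.linalg.inv(C) @ diff           # exponent delta_i
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(C))
    return float(np.exp(delta) / norm)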
Classifier
• In order to decide the class of the unknown vector, we apply the MAP (maximum a posteriori probability) criterion: the image (region) represented by the feature vector y is assigned to class ω1 if the following condition holds:
$$\frac{p(y \mid \omega_1)}{p(y \mid \omega_2)} > \frac{P(\omega_2)}{P(\omega_1)}$$
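A minimal sketch of this decision rule, reusing the class_conditional() helper sketched above; the priors P(ω1) and P(ω2) are passed in explicitly (equal priors reduce the rule to a maximum-likelihood decision).

def map_decision(y, m1, C1, m2, C2, p1=0.5, p2=0.5) -> int:
    """Return 1 if y is assigned to class w1, otherwise 2."""
    # Equivalent to p(y|w1) / p(y|w2) > P(w2) / P(w1)
    if class_conditional(y, m1, C1) * p1 > class_conditional(y, m2, C2) * p2:
        return 1
    return 2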
Experimental Results
• The classification technique described above has been tested on the following domains:
– images containing faces versus images without faces,
– city images versus natural-scene images,
– images containing sky versus images without sky,
– images containing horses versus images without horses,
– images containing tigers versus images without tigers.
• The aim of testing was to compare the global and the region-based approaches.
• In the result slides that follow, the percentage in parentheses is the rate of correct classification for that category (e.g. 167 faces missed out of 414 corresponds to 59.7 % correctly recognized), as illustrated by the short sketch below.
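For clarity, the percentages reported on the following slides can be reproduced as in this small sketch:

def success_rate(missed: int, total: int) -> float:
    """Percentage of items classified correctly, given how many were missed."""
    return 100.0 * (total - missed) / total

# success_rate(167, 414) -> 59.66..., reported as 59.7 % on the faces slide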
Experimental Results
• Domain: Faces
Global Approach
Faces not recognized: 167 / 414 (59.7 %)
Non-faces falsely recognized: 80 / 202 (60.4 % )
Features used: LAB centers + color standard
deviation + 3 dominant edge directions
Region-based Approach
Faces not recognized: 53 / 414 (87.2 %)
Non-faces falsely recognized: 38 / 202 (81.2 %)
Features used: LAB centers + color standard
deviation + MPEG7 contour shape
Experimental Results
• Domain: Faces
Use of MPEG-7 Dominant Color
Descriptor instead of LAB centers
Global Approach
Faces not recognized: 165 / 414 (60.1 %)
Non-faces falsely recognized: 31 / 202 (84.6 % )
Features used: Most dominant color + color
variances + modified edge direction histogram
Region-based Approach
Faces not recognized: 62 / 414 (85.0 %)
Non-faces falsely recognized: 48 / 202 (76.2 %)
Features used: Most dominant color + color
variances + MPEG7 contour shape
Experimental Results
• Domain: City – Nature
Global Approach
City not recognized: 23 / 267 (91.4 %)
Nature falsely recognized: 16 / 210 (92.4 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
Region-based Approach
City not recognized: 46 / 267 (82.8 %)
Nature falsely recognized: 32 / 210 (84.8 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
No significant change was observed when the most dominant color was used instead of the Lab centers (80.5 % and 85.7 % success, respectively)
Experimental Results
• Domain: Sky
Global Approach
Sky not recognized: 20 / 103 (80.6 %)
Non-sky falsely recognized: 25 / 295 (91.5 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge directions
Region-based Approach
Sky not recognized: 22 / 103 (78.6 %)
Non-sky falsely recognized: 8 / 295 (97.3 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction + y position
Experimental Results
• Domain: Horses
Global Approach
Horses not recognized: 25 / 100 (75 %)
Non-horses falsely recognized: 3 / 299 (98.9 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
Region-based Approach
Horses not recognized: 12 / 100 (88 %)
Non-horses falsely recognized: 43 / 299 (85.6 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
Experimental Results
• Domain: Tigers
Global Approach
Tigers not recognized: 33 / 99 (66.7 %)
Non-tigers falsely recognized: 40 / 300 (86.7 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
Region-based Approach
Tigers not recognized: 37 / 99 (62.6 %)
Non-tigers falsely recognized: 38 / 300 (87.3 %)
Features used: LAB centers + color standard
deviation + MPEG7 edge direction
Conclusions
It is impossible to implement a binary classifier that is 100% reliable.
The region-based approach performs better in most domains, especially in domains characterized by concrete objects.
The performance of the region-based approach depends strongly on the quality of the segmentation result.
A larger, more realistic test database is necessary (VCE-1 database?).