Image Categorization by Learning and Reasoning with Regions

Support Vector Machines

Pattern Recognition, Second Edition
Sergios Theodoridis, Konstantinos Koutroumbas

A Tutorial on Support Vector Machines for Pattern Recognition
C. J. C. Burges, Data Mining and Knowledge Discovery, 1998
Separable Case
Maximum Margin Formulation

Separable Case

 Label the training data
   {x_i, y_i},  i = 1, ..., l,  y_i ∈ {-1, +1},  x_i ∈ R^d

 Hyperplanes satisfy  g(x) = w · x + b = 0
   w: normal to the hyperplane
   |b| / ||w||: perpendicular distance from the hyperplane to the origin
   d+ (d-): shortest distance from the hyperplane to the closest positive (negative) example
   margin: d+ + d-
Separable Case

[Figure: separating hyperplane with positive and negative examples at distances d+ and d-]
Separable Case

 Suppose that all the training data satisfy the following constraints:
   x_i · w + b ≥ +1  for y_i = +1   (class 1)
   x_i · w + b ≤ -1  for y_i = -1   (class 2)

 These can be combined into one set of inequalities:
   y_i (x_i · w + b) - 1 ≥ 0,  ∀i

 Distance of the closest points from the hyperplane:
   d+ = d- = 1 / ||w||
Separable Case

 Having a margin of  1/||w|| + 1/||w|| = 2/||w||  → maximize it

 Task: compute the parameters w, b of the hyperplane
   minimize    J(w) = (1/2) ||w||²
   subject to  y_i (x_i · w + b) - 1 ≥ 0,  ∀i
Separable Case

 Karush-Kuhn-Tucker (KKT) conditions:
   ∂L(w, b, λ)/∂w = 0,  ∂L(w, b, λ)/∂b = 0
   λ_i ≥ 0,  i = 1, 2, ..., N
   λ_i [y_i (x_i · w + b) - 1] = 0,  i = 1, 2, ..., N

 λ: vector of the Lagrange multipliers
 L(w, b, λ): Lagrangian function

   L(w, b, λ) = (1/2) ||w||² - Σ_{i=1}^{N} λ_i [y_i (x_i · w + b) - 1]

 The stationarity conditions give
   w = Σ_{i=1}^{N} λ_i y_i x_i
   Σ_{i=1}^{N} λ_i y_i = 0
Separable Case

 Wolfe dual representation form:
   maximize    L(w, b, λ)
   subject to  w = Σ_{i=1}^{N} λ_i y_i x_i,  Σ_{i=1}^{N} λ_i y_i = 0,  λ_i ≥ 0

 Substituting the constraints into L gives the dual problem:
   max_λ  Σ_{i=1}^{N} λ_i - (1/2) Σ_{i,j} λ_i λ_j y_i y_j x_i^T x_j
   subject to  Σ_{i=1}^{N} λ_i y_i = 0,  λ_i ≥ 0
Image Categorization by Learning and
Reasoning with Regions
Yixin Chen
University of New Orleans
James Z. Wang
The Pennsylvania State University
Journal of Machine Learning Research 5 (2004)
(Submitted 7/03; Revised 11/03; Published 8/04)
Introduction

 Automatic image categorization
 Difficulties
   Variable & uncontrolled image conditions
   Complex and hard-to-describe objects in images
   Objects occluding other objects
 Applications
   Digital libraries, space science, Web searching, geographic information systems, biomedicine, surveillance and sensor systems, commerce, education
Overview

 Given a set of labeled images, can a computer program learn such knowledge or semantic concepts from the implicit information of objects contained in the images?
Related Work

 Multiple-Instance Learning
   Diverse Density Function (1998)
   MI-SVM (2003)
 Image Categorization
   Color Histograms (1998-2001)
   Subimage-based Methods (1994-2004)
Motivation

 Correct categorization of an image depends on identifying multiple aspects of the image
 Extension of MIL → a bag must contain a number of instances satisfying various properties
A New Formulation of Multiple-Instance Learning

 Maximum margin problem in a new feature space defined by the DD function → DD-SVM
 In the instance feature space, a collection of feature vectors, each of which is called an instance prototype, is determined according to DD
A New Formulation of Multiple-Instance Learning

 Instance prototype: a class of instances (or regions) that is more likely to appear in bags (or images) with the specific label than in the other bags
 Maps every bag to a point in the bag feature space
 Standard SVMs are then trained in the bag feature space
Outline

 Image segmentation & feature representation
 DD-SVM, an extension of MIL
 Experiments & results
 Conclusions & future work
Image Segmentation

 Partition the image into non-overlapping blocks of size 4×4 pixels
 Each feature vector consists of six features
   Average color components in a block (LUV color space)
   Square root of the second-order moment of wavelet coefficients in high-frequency bands
Image Segmentation

 One-level Daubechies-4 wavelet transform, giving four bands:
   LL  HL
   LH  HH
 For each 4×4 block, the corresponding 2×2 coefficients c_{k+i, l+j} in a high-frequency band yield one texture feature:

   f = ( (1/4) Σ_{i=0}^{1} Σ_{j=0}^{1} c²_{k+i, l+j} )^{1/2}

 Moments of wavelet coefficients in various frequency bands are effective for representing texture (Unser, 1995)
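The texture feature above can be sketched in NumPy; for brevity this uses a one-level Haar transform as a stand-in for the Daubechies-4 filter named on the slide:

```python
import numpy as np

def haar2d_level1(img):
    # One-level 2-D Haar transform (a simple stand-in for the
    # Daubechies-4 filter used in the paper).
    a = (img[0::2] + img[1::2]) / 2.0      # rows: low-pass
    d = (img[0::2] - img[1::2]) / 2.0      # rows: high-pass
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hl = (a[:, 0::2] - a[:, 1::2]) / 2.0
    lh = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, hl, lh, hh

def band_feature(band):
    # f = sqrt((1/4) * sum of squares of the 2x2 coefficients),
    # computed for every 2x2 block of the band -> one value per
    # 4x4 block of the original image.
    h, w = band.shape
    blocks = band.reshape(h // 2, 2, w // 2, 2)
    return np.sqrt((blocks ** 2).mean(axis=(1, 3)))

img = np.random.rand(8, 8)
_, hl, lh, hh = haar2d_level1(img)
texture = np.stack([band_feature(b) for b in (hl, lh, hh)], axis=-1)
print(texture.shape)  # (2, 2, 3): three texture features per 4x4 block
```

Together with the three average LUV color components per block, this yields the six-dimensional block feature vector described above.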
Image Segmentation
k-means algorithm: cluster the feature
vectors into several classes with every
class corresponding to one “region”
 Adaptively select N by gradually
increasing N until a stopping criterion
is met (Wang et al. 2001)

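A minimal sketch of this adaptive clustering, assuming SciPy's kmeans; the distortion threshold and stopping rule here are illustrative stand-ins for the actual criterion of Wang et al. (2001):

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

def segment(features, max_k=8, distortion_thresh=0.5):
    # Cluster block feature vectors, growing the number of regions k
    # until the mean distance to the nearest centroid (distortion)
    # falls below a threshold.
    for k in range(2, max_k + 1):
        codebook, distortion = kmeans(features, k, seed=0)
        if distortion < distortion_thresh:
            break
    labels, _ = vq(features, codebook)   # region label per block
    return labels, codebook

# Two well-separated "regions" in a 6-D feature space (toy data).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (50, 6)),
                   rng.normal(5, 0.1, (50, 6))])
labels, codebook = segment(feats)
print(len(codebook))   # stops at k = 2 for this data
```

Each cluster of 4×4-block feature vectors then corresponds to one "region" of the image.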
Segmentation Results
Image Representation

 f_j: the mean of the set of feature vectors corresponding to each region R_j
 Shape properties of each region
   Normalized inertia of order γ = 1, 2, 3 (Gersho, 1979):

     I(R_j, γ) = Σ_{r ∈ R_j} ||r - r̂||^γ / V_j^{1 + γ/2}

   where r̂ is the centroid of R_j and V_j is the number of pixels in R_j

Image Representation

 Shape feature of region R_j:

     s_j = [ I(R_j, 1)/I_1 , I(R_j, 2)/I_2 , I(R_j, 3)/I_3 ]^T

   where I_γ is the minimum normalized inertia of order γ (achieved by a circle)
 An image B_i
   Segmentation: {R_j : j = 1, ..., N_i}
   Feature vectors: {x_ij : j = 1, ..., N_i}

     x_ij = [ f_j^T , s_j^T ]^T   (9-dimensional feature vector)
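A sketch of the normalized inertia and the resulting shape feature, with the normalizing constants I_γ derived analytically for a disk (an assumption consistent with the minimizer cited from Gersho, 1979):

```python
import numpy as np

def normalized_inertia(coords, gamma):
    # I(R, gamma) = sum_{r in R} ||r - r_hat||^gamma / V^(1 + gamma/2)
    # coords: (V, 2) array of pixel coordinates of region R.
    centroid = coords.mean(axis=0)
    d = np.linalg.norm(coords - centroid, axis=1)
    V = len(coords)
    return (d ** gamma).sum() / V ** (1 + gamma / 2)

def shape_feature(coords):
    # s_j = [I(R,1)/I_1, I(R,2)/I_2, I(R,3)/I_3], normalizing by the
    # inertia of a disk: I_gamma = 2 / ((gamma + 2) * pi**(gamma/2)),
    # obtained by integrating r^gamma over a disk of area V.
    return np.array([normalized_inertia(coords, g) /
                     (2.0 / ((g + 2) * np.pi ** (g / 2)))
                     for g in (1, 2, 3)])

# A roughly disk-shaped region should give values near 1.
yy, xx = np.mgrid[-20:21, -20:21]
disk = np.argwhere(yy ** 2 + xx ** 2 <= 20 ** 2)
sf = shape_feature(disk.astype(float))
print(sf)
```

Elongated or irregular regions score above 1, so these three values capture how far a region departs from a compact circular shape.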
An Extension of Multiple-Instance Learning

 Maximum margin formulation of MIL in a bag feature space
 Constructing a bag feature space
   Diverse density
   Learning instance prototypes
   Computing bag features
Maximum Margin Formulation of MIL in a Bag Feature Space

 Basic idea of the new MIL framework:
   Map every bag to a point in a new feature space, named the bag feature space
   Train SVMs in the bag feature space

   λ* = arg max_λ  Σ_{i=1}^{l} λ_i - (1/2) Σ_{i,j=1}^{l} y_i y_j λ_i λ_j K(φ(B_i), φ(B_j))

   subject to  Σ_{i=1}^{l} λ_i y_i = 0,  0 ≤ λ_i ≤ C,  i = 1, ..., l
Constructing a Bag Feature Space

 Clues for classifier design: what is common in positive bags and does not appear in the negative bags
 Instance prototypes are computed from the DD function
 A bag feature space is then constructed using the instance prototypes
Diverse Density (Maron and Lozano-Pérez, 1998)

 A function defined over the instance space
 DD value at a point in the feature space: the probability that the point agrees with the underlying distribution of positive and negative bags
Diverse Density

   DD(x, w) = Π_{i=1}^{l} [ (1 + y_i)/2 - y_i Π_{j=1}^{N_i} ( 1 - e^{-||x_ij - x||_w²} ) ]

   where ||x||_w = ( x^T Diag(w)² x )^{1/2} is a weighted norm

 It measures a co-occurrence of instances from different (diverse) positive bags
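The DD function translates directly to NumPy. The bags below are made-up toy data, chosen so that a point shared by the positive bags scores high while an instance of the negative bag scores zero:

```python
import numpy as np

def dd(x, w, bags, labels):
    # Diverse density at point x with feature weights w.
    # bags: list of (N_i, d) arrays; labels: +1 / -1 per bag.
    val = 1.0
    for B, y in zip(bags, labels):
        d2 = (((B - x) * w) ** 2).sum(axis=1)       # ||x_ij - x||_w^2
        inner = np.prod(1.0 - np.exp(-d2))
        # +1 bag contributes 1 - inner (noisy-OR); -1 bag contributes inner
        val *= (1.0 + y) / 2.0 - y * inner
    return val

bags = [np.array([[0.0, 0.0], [5.0, 5.0]]),     # positive bag
        np.array([[0.1, 0.0], [-6.0, 3.0]]),    # positive bag
        np.array([[3.0, 3.0]])]                 # negative bag
labels = [1, 1, -1]
w = np.ones(2)
print(dd(np.zeros(2), w, bags, labels))          # near 1: shared by positives
print(dd(np.array([3.0, 3.0]), w, bags, labels)) # 0: sits on the negative bag
```

The product over bags is what makes the density "diverse": a point must be close to some instance of every positive bag, not just many instances of one.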
Learning Instance Prototypes

 An instance prototype represents a class of instances that is more likely to appear in positive bags than in negative bags
 Learning instance prototypes then becomes an optimization problem: finding local maximizers of the DD function in a high-dimensional space

Learning Instance Prototypes

 How do we find the local maximizers?
   Start an optimization at every instance in every positive bag
 Constraints on the prototypes that are kept:
   Need to be distinct from each other
   Have large DD values
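A sketch of this search, assuming a quasi-Newton maximization of log DD; the DD threshold (0.5) and the deduplication distance are hypothetical choices for illustration, not values from the paper:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_dd(z, bags, labels):
    # z packs the candidate point x and the weights w; minimizing
    # -log DD is equivalent to maximizing DD.
    d = bags[0].shape[1]
    x, w = z[:d], z[d:]
    total = 0.0
    for B, y in zip(bags, labels):
        d2 = (((B - x) * w) ** 2).sum(axis=1)
        p = (1.0 + y) / 2.0 - y * np.prod(1.0 - np.exp(-d2))
        total -= np.log(max(p, 1e-12))        # clip for numerical safety
    return total

def learn_prototypes(bags, labels, dd_min=0.5, dedup=0.1):
    d = bags[0].shape[1]
    protos = []
    for B, y in zip(bags, labels):
        if y != 1:
            continue
        for inst in B:                        # one start per positive instance
            z0 = np.concatenate([inst, np.ones(d)])
            res = minimize(neg_log_dd, z0, args=(bags, labels),
                           method="L-BFGS-B")
            x_s, w_s, dd_val = res.x[:d], res.x[d:], np.exp(-res.fun)
            distinct = all(np.linalg.norm(x_s - p[0]) > dedup for p in protos)
            if dd_val > dd_min and distinct:  # keep distinct, high-DD maxima
                protos.append((x_s, w_s, dd_val))
    return protos

bags = [np.array([[0.0, 0.0], [5.0, 5.0]]),
        np.array([[0.1, 0.0], [-6.0, 3.0]]),
        np.array([[3.0, 3.0]])]
labels = [1, 1, -1]
protos = learn_prototypes(bags, labels)
print([p[0] for p in protos])   # maximizer(s) near the shared region (0, 0)
```

Starts that land in low-DD plateaus (here, the isolated instances far from the shared region) fail the DD threshold and are discarded.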
Computing Bag Features

 Let {(x*_k, w*_k) : k = 1, ..., n} be the collection of instance prototypes
 Bag features φ(B_i) for B_i = {x_ij : j = 1, ..., N_i}:

   φ(B_i) = [ min_{j=1,...,N_i} ||x_ij - x*_1||_{w*_1} ,
              min_{j=1,...,N_i} ||x_ij - x*_2||_{w*_2} ,
              ... ,
              min_{j=1,...,N_i} ||x_ij - x*_n||_{w*_n} ]^T
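This mapping is a few lines of NumPy (toy bag and prototypes for illustration):

```python
import numpy as np

def bag_feature(bag, prototypes):
    # phi(B_i): the k-th coordinate is the weighted distance from the
    # closest instance of the bag to prototype (x*_k, w*_k).
    return np.array([
        np.sqrt((((bag - x_s) * w_s) ** 2).sum(axis=1)).min()
        for x_s, w_s in prototypes
    ])

bag = np.array([[0.0, 0.0], [3.0, 4.0]])
prototypes = [(np.zeros(2), np.ones(2)),
              (np.array([3.0, 0.0]), np.ones(2))]
print(bag_feature(bag, prototypes))   # [0. 3.]
```

Every image thus becomes one n-dimensional point, and a standard SVM can be trained on these points.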
Experimental Setup for Image Categorization

 COREL Corp.: 2,000 images
   20 image categories
   JPEG format, size 384×256 (or 256×384)
 Each category is randomly divided into a training set and a test set (50/50)
 SVM^light [Joachims, 1999] software is used to train the SVMs
Sample Images (COREL)
Image Categorization Performance

[Figure: average classification accuracies over 5 random test sets with 95% confidence intervals, images from Cat.0 ~ Cat.9; DD-SVM improves on the method of Chapelle et al., 1999 by 14.8% and on MI-SVM (Andrews et al., 2003) by 6.8%]
Image Categorization Experiments

 Sensitivity to image segmentation
   k-means clustering algorithm with 5 different stopping criteria
   1,000 images for Cat.0 ~ Cat.9

Robustness to Image Segmentation

[Figure: classification results under the 5 stopping criteria; values shown: 6.8%, 9.5%, 11.7%, 13.8%, 27.4%]
Robustness to the Number of Categories in a Data Set

[Figure: average classification accuracy 81.5% on 10 categories vs. 67.5% on 20 categories; the difference in average classification accuracies over the compared method grows from 6.8% to 12.9%]
Sensitivity to the Size of Training Images

Sensitivity to the Diversity of Training Images
MUSK Data Sets
Speed

 40 minutes to train on a set of 500 images (4.31 regions per image on average)
   Pentium III 700MHz PC running the Linux operating system
   Algorithm implemented in Matlab and C
 The majority of the time is spent on learning instance prototypes
Conclusions

 A region-based image categorization method using an extension of MIL → DD-SVM
 Image → collection of regions (k-means algorithm)
 Image → a point in a bag feature space (defined by a set of instance prototypes learned with the DD function)
 SVM-based image classifiers are trained in the bag feature space
 DD-SVM outperforms two other methods
 DD-SVM generates highly competitive results on the MUSK data sets
Future Work

 Limitations
   Region naming (Barnard et al., 2003)
   Texture dependence
 Improvements
   Image segmentation algorithm
   DD function
   Scene category can be a vector
   Semantically-adaptive searching
   Art & biomedical images