Support Vector Machines

Based on:
Pattern Recognition, 2nd edition, Sergios Theodoridis and Konstantinos Koutroumbas
C. J. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
Maximum Margin Formulation

Separable Case
Label the training data $\{x_i, y_i\}$, $i = 1, \dots, l$, with $y_i \in \{-1, +1\}$ and $x_i \in \mathbb{R}^d$
Points on the hyperplane satisfy $g(x) = w \cdot x + b = 0$
w: normal to the hyperplane
$|b| / \|w\|$: perpendicular distance from the hyperplane to the origin
$d_+$ ($d_-$): distance from the hyperplane to the closest positive (negative) example; the margin is $d_+ + d_-$
Separable Case
[Figure: positive and negative examples separated by a hyperplane, with margins $d_+$ and $d_-$]
Separable Case

Suppose that all the training data satisfy the following constraints:
$x_i \cdot w + b \geq +1$ for $y_i = +1$ (class 1)
$x_i \cdot w + b \leq -1$ for $y_i = -1$ (class 2)
These can be combined into one set of inequalities:
$y_i (x_i \cdot w + b) - 1 \geq 0 \quad \forall i$
Distance of the closest point from the hyperplane:
$d_+ = d_- = \frac{1}{\|w\|}$
Separable Case
1
1
2
Having a margin of
|| w || || w || || w ||
maximize
Task
compute the parameter w, b of the hyperplane
minimize
1
J ( w) || w ||2
2
subject to yi ( xi w b) 1 0 i
Separable Case

Karush-Kuhn-Tucker (KKT) conditions:
$\frac{\partial}{\partial w} L(w, b, \lambda) = 0, \qquad \frac{\partial}{\partial b} L(w, b, \lambda) = 0$
$\lambda_i \geq 0, \quad i = 1, 2, \dots, N$
$\lambda_i \left[ y_i (x_i \cdot w + b) - 1 \right] = 0, \quad i = 1, 2, \dots, N$
$\lambda$: vector of the Lagrange multipliers
$L(w, b, \lambda)$: Lagrangian function
$L(w, b, \lambda) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{N} \lambda_i \left[ y_i (x_i \cdot w + b) - 1 \right]$
Setting the derivatives to zero gives
$w = \sum_{i=1}^{N} \lambda_i y_i x_i, \qquad \sum_{i=1}^{N} \lambda_i y_i = 0$
Separable Case

Wolfe dual representation form:
maximize $L(w, b, \lambda)$
subject to $w = \sum_{i=1}^{N} \lambda_i y_i x_i$, $\sum_{i=1}^{N} \lambda_i y_i = 0$, $\lambda_i \geq 0$
Substituting the equality constraints into the Lagrangian gives the dual problem
$\max_{\lambda} \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j x_i^T x_j$
subject to $\sum_{i=1}^{N} \lambda_i y_i = 0, \quad \lambda_i \geq 0$
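This dual is a quadratic program in $\lambda$. As a concrete illustration, the following minimal sketch solves it on toy, linearly separable 2-D data with SciPy's general-purpose constrained optimizer; the toy data, the solver choice, and all variable names are assumptions for illustration (dedicated QP/SMO solvers are used in practice).

```python
# Minimal sketch: hard-margin SVM dual solved with SciPy (illustrative only).
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data: x_i in R^2, labels y_i in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)

# H_ij = y_i y_j (x_i . x_j), the matrix appearing in the dual objective.
H = (y[:, None] * X) @ (y[:, None] * X).T

# Dual: maximize sum_i lam_i - 0.5 lam^T H lam  <=>  minimize its negative.
def neg_dual(lam):
    return 0.5 * lam @ H @ lam - lam.sum()

res = minimize(
    neg_dual,
    x0=np.zeros(N),
    bounds=[(0.0, None)] * N,                                # lam_i >= 0
    constraints={"type": "eq", "fun": lambda lam: lam @ y},  # sum_i lam_i y_i = 0
)
lam = res.x

# KKT stationarity recovers w; any support vector (lam_i > 0) gives b.
w = ((lam * y)[:, None] * X).sum(axis=0)
sv = int(np.argmax(lam))
b = y[sv] - X[sv] @ w
print("w =", w, "b =", b)
```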
Image Categorization by Learning and
Reasoning with Regions
Yixin Chen
University of New Orleans
James Z. Wang
The Pennsylvania State University
Journal of Machine Learning Research 5 (2004)
(Submitted 7/03; Revised 11/03; Published 8/04)
Introduction
Automatic image categorization
Difficulties
Variable & uncontrolled image conditions
Complex and hard-to-describe objects in image
Objects occluding other objects
Applications
Digital libraries, Space science, Web searching,
Geographic information systems, Biomedicine,
Surveillance and sensor systems, Commerce,
Education
Overview

Given a set of labeled images, can a computer program learn such knowledge or semantic concepts from the implicit information about objects contained in the images?
Related Work
Multiple-Instance Learning
Diverse Density Function (1998)
MI-SVM (2003)
Image Categorization
Color Histograms (1998-2001)
Subimage-based Methods
(1994-2004)
Motivation
Correct categorization of an image
depends on identifying multiple
aspects of the image
Extension of MIL → a bag must contain a number of instances satisfying various properties
A New Formulation of
Multiple-Instance Learning
Maximum margin problem in a new
feature space defined by the DD
function
DD-SVM
In the instance feature space, a
collection of feature vectors, each of
which is called an instance prototype,
is determined according to DD
A New Formulation of
Multiple-Instance Learning
Instance prototype:
• A class of instances (or regions) that is
more likely to appear in bags (or images)
with the specific label than in the other
bags
Maps every bag to a point in
bag feature space
Standard SVMs are then trained in the bag feature space
Outline
Image segmentation & feature
representation
DD-SVM, an extension of MIL
Experiments & results
Conclusions & future work
Image Segmentation
Partition the image into non-overlapping blocks of size 4×4 pixels
Each feature vector consists of six
features
Average color components in a block
• LUV color space
Square root of the second order
moment of wavelet coefficients in
high-frequency bands
Image Segmentation

A one-level Daubechies-4 wavelet transform decomposes each block into four frequency bands (LL, HL, LH, HH), each containing 2×2 coefficients.
For the coefficients $c_{k,l}$ in a high-frequency band, the texture feature is
$f = \left( \frac{1}{4} \sum_{i=0}^{1} \sum_{j=0}^{1} c_{k+i, l+j}^2 \right)^{1/2}$
Moments of wavelet coefficients in various frequency bands are effective for representing texture (Unser, 1995).
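As an illustration of this step, here is a sketch assuming the PyWavelets (pywt) package; it computes the same per-block quantity over the whole image at once, and the boundary handling and function names are assumptions rather than details from the paper.

```python
# Sketch: per-block texture features from a one-level Daubechies-4 transform.
import numpy as np
import pywt

def block_texture_features(gray):
    """For each 4x4 image block, return the square root of the second-order
    moment of the wavelet coefficients in the three high-frequency bands."""
    # pywt.dwt2 returns the approximation band and the three detail bands.
    _, details = pywt.dwt2(np.asarray(gray, dtype=float), "db4")
    feats = []
    for band in details:
        h, w = band.shape
        band = band[: h - h % 2, : w - w % 2]          # trim to even size
        blocks = band.reshape(h // 2, 2, w // 2, 2)    # 2x2 coefficient blocks
        feats.append(np.sqrt((blocks ** 2).mean(axis=(1, 3))))
    return np.stack(feats, axis=-1)  # one feature per block per detail band
```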
Image Segmentation
k-means algorithm: cluster the feature vectors into several classes, with every class corresponding to one "region"
Adaptively select the number of clusters N by gradually increasing N until a stopping criterion is met (Wang et al., 2001), as sketched below
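A minimal sketch of this adaptive selection, assuming scikit-learn's KMeans; the particular stopping rule (a relative drop in distortion) is an assumption standing in for the criterion of Wang et al. (2001).

```python
# Sketch: grow the number of clusters until distortion stops improving much.
from sklearn.cluster import KMeans

def adaptive_kmeans(features, max_k=8, rel_tol=0.15):
    """features: (num_blocks, 6) array of per-block feature vectors."""
    best = KMeans(n_clusters=2, n_init=10).fit(features)
    for k in range(3, max_k + 1):
        km = KMeans(n_clusters=k, n_init=10).fit(features)
        # Stop when adding a cluster reduces distortion by less than rel_tol.
        if (best.inertia_ - km.inertia_) / best.inertia_ < rel_tol:
            break
        best = km
    return best  # best.labels_ assigns each 4x4 block to a "region"
```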
Segmentation Results
Image Representation

$f_j$: the mean of the set of feature vectors corresponding to each region $R_j$
Shape properties of each region: normalized inertia of order 1, 2, 3 (Gersho, 1979)
$I(R_j, \gamma) = \frac{\sum_{r \in R_j} \|r - \hat{r}\|^{\gamma}}{V_j^{1 + \gamma/2}}$
where $\hat{r}$ is the centroid and $V_j$ the number of pixels of region $R_j$
Image Representation

Shape feature of region $R_j$:
$s_j = \left[ \frac{I(R_j, 1)}{I_1}, \frac{I(R_j, 2)}{I_2}, \frac{I(R_j, 3)}{I_3} \right]^T$
An image $B_i$:
Segmentation: $\{R_j : j = 1, \dots, N_i\}$
Feature vectors: $\{x_{ij} : j = 1, \dots, N_i\}$
$x_{ij} = [f_j^T, s_j^T]^T$: a 9-dimensional feature vector
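A sketch of assembling the 9-dimensional descriptor from these definitions; the function names are hypothetical, and the normalizers $I_1, I_2, I_3$ are supplied by the caller.

```python
# Sketch: 9-D region feature x_ij = [f_j, s_j] from the definitions above.
import numpy as np

def normalized_inertia(coords, gamma):
    """Normalized inertia of order gamma; coords is a (V_j, 2) array of the
    pixel positions of region R_j, V_j its number of pixels."""
    r_hat = coords.mean(axis=0)                       # region centroid
    V = len(coords)
    dists = np.linalg.norm(coords - r_hat, axis=1)
    return (dists ** gamma).sum() / V ** (1.0 + gamma / 2.0)

def region_feature(coords, f_mean, I_norm):
    """f_mean: 6-D mean color/texture feature; I_norm: (I_1, I_2, I_3)."""
    s = np.array([normalized_inertia(coords, g) / I_norm[g - 1]
                  for g in (1, 2, 3)])
    return np.concatenate([f_mean, s])                # the 9-dimensional x_ij
```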
An extension of
Multiple-Instance Learning
Maximum margin formulation of MIL in
a bag feature space
Constructing a bag feature space
Diverse density
Learning instance prototypes
Computing bag features
Maximum Margin Formulation of
MIL in a Bag Feature Space
Basic idea of new MIL framework:
Map every bag to a point in a new
feature space, named the bag feature
space
To train SVMs in the bag feature space
$\lambda^* = \arg\max_{\lambda} \sum_{i=1}^{l} \lambda_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \lambda_i \lambda_j K(\phi(B_i), \phi(B_j))$
subject to $\sum_{i=1}^{l} y_i \lambda_i = 0, \quad C \geq \lambda_i \geq 0, \ i = 1, \dots, l$
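Once every bag is mapped to $\phi(B_i)$, training reduces to a standard soft-margin SVM. The paper trains with SVM^light; as a sketch of the same step, here is the equivalent call with scikit-learn on stand-in data (the data and hyperparameters are assumptions).

```python
# Sketch: standard soft-margin SVM trained on bag features (toy stand-in data).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
bag_features = rng.normal(size=(100, 5))          # stand-in for phi(B_i), n = 5 prototypes
labels = np.where(bag_features[:, 0] > 0, 1, -1)  # toy labels in {-1, +1}

clf = SVC(kernel="rbf", C=10.0)                   # C and the kernel are assumptions
clf.fit(bag_features, labels)
print(clf.predict(bag_features[:3]))
```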
Constructing a Bag Feature Space
Clues for classifier design:
What is common in positive bags and
does not appear in the negative bags
Instance prototypes computed from the
DD function
A bag feature space is then constructed
using the instance prototypes
Diverse Density (Maron and Lozano-Perez, 1998)
A function defined over the instance
space
DD value at a point in the feature space
The probability that the point agrees
with the underlying distribution of
positive and negative bags
Diverse Density

$DD_D(x, w) = \prod_{i=1}^{l} \left( \frac{1 + y_i}{2} - y_i \prod_{j=1}^{N_i} \left( 1 - e^{-\|x_{ij} - x\|_w^2} \right) \right)$
where $\|x\|_w = \left( x^T \, \mathrm{Diag}(w)^2 \, x \right)^{1/2}$ is a weighted norm
It measures a co-occurrence of instances from different (diverse) positive bags
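The formula transcribes directly into code; in the sketch below, the bag representation (a list of instance arrays) is an assumption.

```python
# Sketch: the diverse density DD(x, w) over a labeled set of bags.
import numpy as np

def weighted_sq_dist(a, b, w):
    """||a - b||_w^2 = (a - b)^T Diag(w)^2 (a - b)."""
    d = (a - b) * w
    return float(d @ d)

def diverse_density(x, w, bags, labels):
    """bags: list of (N_i, d) arrays of instances; labels: y_i in {-1, +1}."""
    dd = 1.0
    for instances, y in zip(bags, labels):
        prod = np.prod([1.0 - np.exp(-weighted_sq_dist(xij, x, w))
                        for xij in instances])
        dd *= (1.0 + y) / 2.0 - y * prod
    return dd
```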
Learning Instance Prototype
An instance prototype represents a
class of instances that is more likely to
appear in positive bags than in
negative bags
Learning instance prototypes then becomes an optimization problem: finding local maximizers of the DD function in a high-dimensional space
Learning Instance Prototype
How do we find the local maximizers?
Start an optimization at every instance
in every positive bag
Constraints:
Need to be distinct from each other
Have large DD values
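A sketch of this multi-start search, reusing the diverse_density function from the previous sketch; the optimizer (L-BFGS-B on the negative log-DD), the joint (x, w) parameterization, and the omission of the deduplication step are assumptions.

```python
# Sketch: start a local ascent of DD from every instance in every positive bag.
import numpy as np
from scipy.optimize import minimize

def learn_prototypes(bags, labels, dim):
    starts = [xij for instances, y in zip(bags, labels) if y == 1
              for xij in instances]
    candidates = []
    for x0 in starts:
        z0 = np.concatenate([x0, np.ones(dim)])      # optimize (x, w) jointly
        neg_log_dd = lambda z: -np.log(
            max(diverse_density(z[:dim], z[dim:], bags, labels), 1e-300))
        res = minimize(neg_log_dd, z0, method="L-BFGS-B")
        candidates.append((res.x[:dim], res.x[dim:], -res.fun))
    # Keep maximizers that are distinct and have large DD values
    # (the deduplication step is omitted in this sketch).
    return candidates
```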
Computing Bag Features

Let $\{(x_k^*, w_k^*) : k = 1, \dots, n\}$ be the collection of instance prototypes
Bag features $\phi(B_i)$ for $B_i = \{x_{ij} : j = 1, \dots, N_i\}$:
$\phi(B_i) = \left[ \min_{j=1,\dots,N_i} \|x_{ij} - x_1^*\|_{w_1^*}, \ \min_{j=1,\dots,N_i} \|x_{ij} - x_2^*\|_{w_2^*}, \ \dots, \ \min_{j=1,\dots,N_i} \|x_{ij} - x_n^*\|_{w_n^*} \right]^T$
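A sketch of the bag-feature map $\phi$ given learned prototypes; the names are hypothetical, and the weighted norm matches $\|x\|_w = \|\mathrm{Diag}(w)\, x\|_2$ defined earlier.

```python
# Sketch: phi(B_i) = vector of min weighted distances to each prototype.
import numpy as np

def bag_feature(bag, prototypes):
    """bag: (N_i, d) array of instances; prototypes: list of (x_star, w_star)."""
    return np.array([
        min(np.linalg.norm((xij - x_star) * w_star) for xij in bag)
        for x_star, w_star in prototypes
    ])
```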
Experimental Setup for Image Categorization

COREL Corporation image set: 2,000 images
20 image categories
JPEG format, size 384×256 (or 256×384)
Each category is randomly divided into a training set and a test set (50/50)
The SVM^light software [Joachims, 1999] is used to train the SVMs
Sample Images (COREL)
Image Categorization Performance

DD-SVM outperforms the histogram-based SVM of Chapelle et al. (1999) by 14.8% and MI-SVM of Andrews et al. (2003) by 6.8% in average classification accuracy
(5 random test sets, 95% confidence intervals; the images belong to Cat. 0 to Cat. 9)
Image Categorization Experiments

Sensitivity to image segmentation: k-means clustering algorithm with 5 different stopping criteria
1,000 images from Cat. 0 to Cat. 9

Robustness to Image Segmentation
[Figure: average classification accuracies under the five segmentation settings; the values 6.8%, 9.5%, 11.7%, 13.8%, and 27.4% appear in the chart]
Robustness to the Number of Categories in a Data Set

DD-SVM achieves 81.5% average classification accuracy on 10 categories and 67.5% on 20 categories, leading the next best method by 6.8% and 12.9%, respectively
[Figure: difference in average classification accuracies as the number of categories grows]
Sensitivity to the Size of the Training Set
Sensitivity to the Diversity of Training Images
MUSK Data Sets
Speed

Training takes about 40 minutes for a training set of 500 images (4.31 regions per image on average)
Pentium III 700 MHz PC running the Linux operating system
The algorithm is implemented in the Matlab and C programming languages
The majority of the training time is spent on learning instance prototypes
Conclusions
A region-based image categorization method
using an extension of MIL → DD-SVM
Image → collection of regions → k-means alg.
Image → a point in a bag feature space
(defined by a set of instance prototypes learned with the DD func.)
SVM-based image classifiers are trained in
the bag feature space
DD-SVM outperforms two other methods
DD-SVM generates highly competitive results on the MUSK data sets
Future Work
Limitations
Region naming (Barnard et al., 2003)
Texture dependence
Improvement
Image segmentation algorithm
DD function
Scene category can be a vector
Semantically-adaptive searching
Art & biomedical images