
[Title-slide images: S. Savarese, 2003; P. Bruegel, 1562.]
Constellation model of object categories
Fischler & Elschlager 1973
Brunelli & Poggio ’93
Cootes, Lanitis, Taylor et al. ’95
Perona et al. ‘95, ‘96, ’98, ’00, ’03
Yuille ‘91
Lades, v.d. Malsburg et al. ‘93
Amit & Geman ‘95, ’99
Many more recent works…
Each detected region is described by two things:
X (location): the (x, y) coordinates of the region center.
A (appearance): the patch is normalized and projected onto a fixed PCA basis, giving coefficients c1, c2, …, c10.
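A minimal sketch of this appearance descriptor, assuming a fixed 10-vector PCA basis and mean (pca_basis, pca_mean) learned offline; the function name, patch handling, and normalization details are illustrative rather than the authors' exact pipeline:

    import numpy as np

    def appearance_descriptor(patch, pca_basis, pca_mean):
        # patch: grayscale region cropped around a detected interest point (h x w)
        # pca_basis: (10, h*w) fixed basis vectors, assumed learned offline
        # pca_mean: (h*w,) mean patch used when the basis was built
        v = patch.astype(float).ravel()
        v = (v - v.mean()) / (v.std() + 1e-8)   # intensity normalization
        return pca_basis @ (v - pca_mean)       # the coefficients c1 ... c10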
The Generative Model
[Graphical model: the image I yields interest regions; the hypothesis node h links the model's parts to regions, whose locations X and appearances A are generated by the shape and appearance densities. X = (x, y) coordinates of a region center; A = projection of the normalized patch onto a fixed PCA basis (c1, c2, …, c10).]
Hypothesis (h) node
[Figure: an image with ten numbered interest regions (1–10) and the graphical model.]
h is a mapping from interest regions to parts, e.g. hi = [3, 5, 9].
The hypothesis (h) node
[Same figure: the ten numbered interest regions, with a different assignment highlighted.]
Another hypothesis over the same regions: hj = [2, 4, 8].
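As a small illustration of the hypothesis variable, the sketch below simply enumerates every assignment of P parts to distinct detected regions; the function and variable names are made up for the example:

    from itertools import permutations

    def enumerate_hypotheses(n_regions, n_parts):
        # every assignment of the n_parts model parts to distinct detected regions
        return list(permutations(range(1, n_regions + 1), n_parts))

    # With 10 detected regions and 3 parts there are 10*9*8 = 720 candidates,
    # among them (3, 5, 9) and (2, 4, 8) as in the two slides above.
    hypotheses = enumerate_hypotheses(10, 3)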
The spatial node
X X
A A
h
X
I
(x2,y2)
(x1,y1)
A
(x3,y3)
Spatial parameters node
Joint Gaussian density over the locations of all parts.
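A sketch of that joint Gaussian shape term, using scipy; mu_X and Sigma_X stand for the shape mean and covariance, and the coordinate stacking order is an assumption:

    import numpy as np
    from scipy.stats import multivariate_normal

    def shape_density(part_locations, mu_X, Sigma_X):
        # part_locations: [(x1, y1), (x2, y2), (x3, y3)] for the regions picked by h
        x = np.asarray(part_locations, dtype=float).ravel()   # stack into one vector
        return multivariate_normal.pdf(x, mean=mu_X, cov=Sigma_X)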
The appearance node
X X
A A
PCA coefficients
on fixed basis
Pt 1. (c1, c2, c3,…)
h
X
I
A
Pt 2. (c1, c2, c3,…)
Pt 3. (c1, c2, c3,…)
Appearance parameter node
X X
P
A A
Gaussian
h
X
A
I
Fixed PCA basis
Independence assumed between the P parts
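Under that independence assumption, the appearance term for one hypothesis is simply a product of per-part Gaussians. A sketch, where mu_A[p] and Sigma_A[p] denote assumed per-part parameters:

    import numpy as np
    from scipy.stats import multivariate_normal

    def appearance_density(part_coeffs, mu_A, Sigma_A):
        # part_coeffs: one (c1 ... c10) vector per part; independence across parts
        # means the joint density is a product of per-part Gaussians.
        density = 1.0
        for p, c in enumerate(part_coeffs):
            density *= multivariate_normal.pdf(c, mean=mu_A[p], cov=Sigma_A[p])
        return density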
Maximum Likelihood interpretation
X X
P
A A
hidden variable
h
X
parameters
A
observed variables
I
Also have background model – constant for given image
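Putting the pieces together, a schematic of the per-image likelihood being maximized: a sum over hypotheses of the shape and appearance terms, with the background contribution folded into a constant. This reuses the shape_density and appearance_density sketches above; theta and p_background are placeholder names:

    def image_likelihood(locations, appearances, theta, hypotheses, p_background=1.0):
        # Sum over hypotheses h; each term scores the shape and appearance of the
        # regions that h assigns to the parts. The background density for the
        # remaining regions is constant for a given image, so it is a single factor.
        total = 0.0
        for h in hypotheses:
            X = [locations[i - 1] for i in h]
            A = [appearances[i - 1] for i in h]
            total += (shape_density(X, theta["mu_X"], theta["Sigma_X"])
                      * appearance_density(A, theta["mu_A"], theta["Sigma_A"]))
        return total * p_background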
MAP solution
Choose a conjugate form: introduce priors over the parameters.
Normal-Wishart distributions:
P(µ, Γ) = p(µ | Γ) p(Γ) = N(µ | m, βΓ) W(Γ | a, B)
[Graphical model: Normal-Wishart priors {m0X, β0X, a0X, B0X} and {m0A, β0A, a0A, B0A} sit above the shape and appearance parameters; h remains the hidden variable, X and A the observed variables.]
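For intuition, a sketch of drawing (µ, Γ) from such a Normal-Wishart prior, assuming the common parameterization in which the precision is Wishart-distributed and the mean is Gaussian given the precision; the slides' exact scaling convention for β may differ:

    import numpy as np
    from scipy.stats import wishart, multivariate_normal

    def sample_normal_wishart(m, beta, a, B, rng=None):
        # Precision Gamma ~ W(a, B), then mu | Gamma ~ N(m, (beta * Gamma)^-1).
        Gamma = wishart.rvs(df=a, scale=B, random_state=rng)
        mu = multivariate_normal.rvs(mean=m, cov=np.linalg.inv(beta * Gamma),
                                     random_state=rng)
        return mu, Gamma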
Variational Bayesian model
Estimate the posterior distribution on the parameters:
– approximate it with a Normal-Wishart
– it has parameters {mX, βX, aX, BX, mA, βA, aA, BA}
[Same graphical model: priors {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A} over the shape and appearance parameters; h hidden, X and A observed.]
ML/MAP Learning
Performed by EM.
[Graphical model: a point estimate of θ is learned; h is hidden, X and A are observed for the P parts, with a plate over the n training images.]
where θ = {µX, ΓX, µA, ΓA}
Weber et al. '98, '00; Fergus et al. '03
Variational Learning
Performed by Variational Bayesian EM.
[Graphical model: Normal-Wishart priors {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A} over the shape and appearance parameters θ; h hidden, X and A observed for the P parts, with a plate over the n training images.]
Parameters to estimate: {mX, βX, aX, BX, mA, βA, aA, BA}, i.e. the parameters of the Normal-Wishart distribution.
Fei-Fei et al. '03, '04
Variational EM
Random initialization, then iterate: the E-Step and M-Step alternate, producing new θ's and a new estimate of p(θ | train); prior knowledge of p(θ) enters the updates.
(Attias, Hinton, Beal, etc.)
Weakly supervised learning
• No labeling
• No segmentation
• No alignment
Experiments
• Training: 1–6 images (randomly drawn)
• Detection test: 50 fg / 50 bg images; decide object present/absent
Datasets: foreground and background
The Caltech-101 Object Categories
www.vision.caltech.edu/feifeili/Datasets.htm
The prior
• Captures commonality between different classes
• Crucial when training from few images
• Constructed by:
– learning many ML models from other classes,
– treating each model as a point in θ space,
– fitting a Normal-Wishart distribution to these points by moment matching, i.e. estimating {m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A} (see the sketch below).
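An illustrative, deliberately simplified version of that moment-matching step: it matches only first moments plus the scatter of the ML means, and the choice a0 = d + 2 is an assumption, so it should be read as a sketch rather than the authors' exact recipe:

    import numpy as np

    def fit_prior_by_moment_matching(ml_means, ml_precisions, a0=None):
        # ml_means: list of (d,) ML mean vectors, one per previously learned class
        # ml_precisions: list of (d, d) ML precision matrices for the same classes
        mus = np.stack(ml_means)
        gammas = np.stack(ml_precisions)
        d = mus.shape[1]

        m0 = mus.mean(axis=0)                      # centre of the ML means
        mean_precision = gammas.mean(axis=0)
        if a0 is None:
            a0 = d + 2                             # assumed degrees of freedom
        B0 = mean_precision / a0                   # so that E[Gamma] = a0 * B0
        scatter = np.cov(mus, rowvar=False) + 1e-8 * np.eye(d)
        beta0 = np.trace(np.linalg.inv(mean_precision)) / np.trace(scatter)
        return m0, beta0, a0, B0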
What priors tell us? – 1. means
[Figure: likely vs. unlikely mean appearance and mean shape under the prior.]
What priors tell us? – 2. variability
[Figure: variability of appearance and shape under the prior, illustrated with face portraits: Renoir; Da Vinci, 1507; Warhol, 1967; Picasso, 1951; Magritte, 1928; Picasso, 1936; Arcimboldo, 1590; Miro, 1949.]
The prior on Appearance
[Plot legend: Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
The prior on Shape
[Plots of the shape prior over the x- and y-coordinates. Legend: Blue: Airplane; Green: Leopards; Red: Faces; Magenta: Background.]
Motorbikes
• 6 training images
• Classification task (Object present/absent)
Grand piano
Cougar faces
Number of classes in prior
How good is the prior alone?
Performance over all 101 classes
Conclusions
• Hierarchical Bayesian parts-and-structure model
• Learning and recognition of new classes assisted by transferring information from unrelated object classes
• Variational Bayes superior to MAP
Visualization of learning
Sensitivity to quality of feature detector
Discriminative evaluation
Mean on diagonal: 18%.
More recent work by Holub, Welling & Perona: 40%, using a generative/discriminative hybrid.