Deformable Part Model
Presenter: Liu Changyu
Advisor: Prof. Alex Hauptmann
Interest: Multimedia Analysis
April 11th, 2013
Contents
Introduction
Model
Learning
Experiment
Conclusion
CMU - Language Technologies Institute
Introduction
1. Research Question
1) Object Bank [3] is only an image representation for high-level
visual tasks; it should be combined with an efficient training
method.
2) For difficult tasks, such as extending Object Bank to over
1000 objects or the PASCAL Challenge benchmarks, a new training
method is needed to improve the average precision.
3) We therefore want to combine Object Bank with the deformable
part model proposed at CVPR 2008 [1].
Introduction
2. What Is the Deformable Part Model?
The Deformable Part Model is a discriminatively trained, multiscale
model for object detection [1]. Its training method aims to make
possible the effective use of more latent information, such as
hierarchical (grammar) models and models involving latent
three-dimensional pose.
Model---Deformable Part
Fig. 1 Deformable Part Model: (a) person detection example; (b1) coarse template; (b2) part templates; (b3) spatial model
The deformable model includes both a coarse global template covering an
entire object and higher-resolution part templates. The templates represent
histogram of oriented gradients (HOG) features.
Model---Deformable Part
Fig. 2 illustrates a placement of such a model in a HOG pyramid. The
root filter location defines the detection window (the pixels inside the
cells covered by the filter). The part filters are placed several levels
down in the pyramid, so the HOG cells at that level have half the size of
cells in the root filter level.
Fig. 2 Pyramids of Deformable Part Model: (a) image pyramid; (b) HOG feature pyramid
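The relation between root and part levels can be sketched in a few lines. This is only an illustration under stated assumptions: the names `effective_cell_size` and `part_level`, the 8-pixel cell size, and the parameter `levels_per_octave` (the number of levels after which the resolution changes by a factor of two) are hypothetical and do not come from the slides.

```python
def effective_cell_size(level, levels_per_octave, cell_size=8):
    """Size, in original-image pixels, covered by one HOG cell at a level.

    Assumption: level 0 is the full-resolution image and each level shrinks
    the image by a factor of 2 ** (1 / levels_per_octave), so a fixed-size
    cell covers more of the original image at coarser levels.
    """
    return cell_size * 2 ** (level / levels_per_octave)


def part_level(root_level, levels_per_octave):
    """Level used for the part filters: one octave finer than the root."""
    level = root_level - levels_per_octave
    if level < 0:
        raise ValueError("no pyramid level with twice the root resolution")
    return level


root = 10
parts = part_level(root, levels_per_octave=5)
root_cell = effective_cell_size(root, 5)
part_cell = effective_cell_size(parts, 5)
print(parts, root_cell, part_cell)
```

With these assumed values, the part cells cover half as many original-image pixels per side as the root cells, which matches the statement that the part level has cells of half the size.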
Model---Deformable Parts
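The score of a placement is given by the scores of each
filter (the data term) plus a score of the placement of each
part relative to the root (the spatial term), following [1]:
$$\sum_{i=0}^{n} F_i \cdot \phi(H, p_i) + \sum_{i=1}^{n} \left[ a_i \cdot (\tilde{x}_i, \tilde{y}_i) + b_i \cdot (\tilde{x}_i^2, \tilde{y}_i^2) \right]$$
where
$F_i$ is the $w \times h \times 9 \times 4$ weight vector of the $i$-th filter ($F_0$ is the root filter);
$\phi(H, p_i)$ are the features in a $w \times h$ subwindow of a
HOG pyramid $H$ at position $p_i$;
$(\tilde{x}_i, \tilde{y}_i) = ((x_i, y_i) - 2(x, y) + v_i) / s_i$ gives the location
of the $i$-th part relative to the root location $(x, y)$, where $v_i$ and $s_i$ are the center and size of the box of allowed positions for part $i$;
$a_i$ and $b_i$ are two-dimensional coefficient vectors for
measuring a score for each possible placement of the $i$-th part.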
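As a rough illustration of the data term plus spatial term, the sketch below scores one placement in plain Python. It is not the authors' MATLAB implementation. The function names, the flat-list representation of filters and features, and the normalization of each part's displacement by an anchor `v` and a box size `s` (taken from [1]) are assumptions made for this example.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def relative_location(part_xy, root_xy, v, s):
    """Part location relative to the root: ((x_i, y_i) - 2 (x, y) + v_i) / s_i.

    The factor 2 reflects that parts live at twice the root resolution.
    """
    return tuple((p - 2 * r + vi) / s for p, r, vi in zip(part_xy, root_xy, v))


def placement_score(filters, features, root_xy, part_xys, anchors, sizes, a, b):
    """Data term (filter responses) plus spatial term (part placement).

    filters[0] / features[0] belong to the root; the rest belong to parts.
    """
    data = sum(dot(F, phi) for F, phi in zip(filters, features))
    spatial = 0.0
    for xy, v, s, a_i, b_i in zip(part_xys, anchors, sizes, a, b):
        xt, yt = relative_location(xy, root_xy, v, s)
        spatial += dot(a_i, (xt, yt)) + dot(b_i, (xt * xt, yt * yt))
    return data + spatial


# Toy example: one root filter and one part filter with 2 weights each.
score = placement_score(
    filters=[[1.0, 2.0], [0.5, 0.5]],
    features=[[1.0, 1.0], [2.0, 4.0]],
    root_xy=(3, 4),
    part_xys=[(7, 9)],
    anchors=[(0, 0)],
    sizes=[2],
    a=[(0.0, 0.0)],
    b=[(-1.0, -1.0)],
)
print(score)
```

In this toy setup a negative `b` penalizes the part for drifting away from its anchor, so the spatial term lowers the total score as the displacement grows.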
Learning
Latent SVMs
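The model uses a latent SVM for classification. Following [1], an example $x$ is scored by
$$f_\beta(x) = \max_{z \in Z(x)} \beta \cdot \Phi(x, z)$$
where $\beta$ is a vector of model parameters and $z$ is a set of latent values.
The parameters $\beta$ must be learned first from labeled examples $(x_i, y_i)$, $y_i \in \{-1, 1\}$, according to
$$\beta^* = \arg\min_\beta \lambda \|\beta\|^2 + \sum_{i=1}^{n} \max(0, 1 - y_i f_\beta(x_i))$$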
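The latent SVM score and its training objective can be sketched as below. This is a minimal sketch, assuming the regularized hinge-loss form given in [1]; the names `latent_score`, `objective`, and `lam`, and the representation of each example as an explicit list of feature vectors (one per latent value z), are hypothetical. It only evaluates the objective and does not perform the optimization.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def latent_score(beta, latent_features):
    """Score of one example: the best dot product over all latent values z.

    latent_features holds one feature vector Phi(x, z) per latent value z.
    """
    return max(dot(beta, phi) for phi in latent_features)


def objective(beta, examples, lam):
    """lam * ||beta||^2 plus the hinge loss summed over labeled examples.

    examples is a list of (latent_features, y) pairs with y in {-1, +1}.
    """
    regularizer = lam * dot(beta, beta)
    loss = sum(
        max(0.0, 1.0 - y * latent_score(beta, feats)) for feats, y in examples
    )
    return regularizer + loss


beta = [1.0, -1.0]
examples = [
    ([[2.0, 0.0], [0.0, 1.0]], +1),   # best latent score 2.0, no loss
    ([[0.5, 0.0], [0.0, 2.0]], -1),   # best latent score 0.5, loss 1.5
]
print(objective(beta, examples, lam=0.5))
```

Because the score takes a maximum over latent values, the objective is not convex in `beta` in general, which is why training typically alternates between fixing latent values and updating the parameters.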
Experiment
We run the MATLAB code as…..
Conclusion
1) The experiment is not yet complete; more object models are needed
for deformable part training.
2) Computation needs to be sped up.
References
[1] P. Felzenszwalb, D. McAllester, D. Ramanan. A Discriminatively Trained,
Multiscale, Deformable Part Model. IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2008
[2] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object
Detection with Discriminatively Trained Part Based Models. IEEE
Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No.
9, Sep. 2010.
[3] L.-J. Li, H. Su, E. P. Xing, L. Fei-Fei. Object Bank: A High-Level
Image Representation for Scene Classification and Semantic Feature
Sparsification. Proceedings of the Neural Information Processing
Systems (NIPS), 2010.
Thank you!