Exemplar-SVM for Action Recognition

Week 10
Presented by Christina Peterson
Movement Exemplar-SVMs


Tran and Torresani [1] based the MEX-SVM on the work of Malisiewicz et al. [2]
Linear SVMs applied to histograms of space-time interest points (STIPs) calculated from sub-volumes of the video
◦ Trained on one positive sample and many negative samples
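The one-positive, many-negatives training setup can be illustrated with a minimal sketch. This is not the authors' implementation: the solver (plain subgradient descent on a weighted hinge loss), the weights `c_pos` and `c_neg`, and the toy 3-bin histograms are all assumptions chosen for illustration.

```python
import random

def train_exemplar_svm(positive, negatives, c_pos=0.5, c_neg=0.01,
                       lam=1e-3, epochs=100, lr=0.1, seed=0):
    """Fit a linear SVM (w, b) for a single exemplar.

    The lone positive gets a larger hinge-loss weight (c_pos) than each
    negative (c_neg) so it is not swamped by the negatives. Both weights
    are illustrative assumptions.
    """
    rng = random.Random(seed)
    w = [0.0] * len(positive)
    b = 0.0
    data = [(positive, 1.0, c_pos)] + [(x, -1.0, c_neg) for x in negatives]
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y, c in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularization: shrink the weights a little at every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move toward classifying x correctly.
                w = [wi + lr * c * y * xi for wi, xi in zip(w, x)]
                b += lr * c * y
    return w, b

def score(w, b, x):
    """Raw (uncalibrated) SVM score of feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy stand-ins for STIP histograms (hypothetical data).
positive = [1.0, 0.0, 0.2]
negatives = [[0.0, 0.2 + 0.04 * i, 0.2] for i in range(20)]

w, b = train_exemplar_svm(positive, negatives)
pos_score = score(w, b, positive)
best_neg_score = max(score(w, b, n) for n in negatives)
print(pos_score > best_neg_score)
```

Each exemplar gets its own classifier of this kind, so the raw scores of different exemplars are not directly comparable until they are calibrated.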

Calibrate MEX-SVMs using Platt's method
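As a rough sketch of what Platt calibration does, the code below fits a sigmoid that maps raw SVM scores to probabilities. It is a simplification under stated assumptions: plain gradient descent replaces the optimizer in Platt's original procedure, the regularized target labels are omitted, and the held-out scores and labels are made up for illustration.

```python
import math

def sigmoid_prob(s, a, b):
    """Platt's form: P(y = 1 | s) = 1 / (1 + exp(a * s + b))."""
    z = a * s + b
    if z >= 0:
        # Rewrite to avoid overflow in exp for large positive z.
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def fit_platt(scores, labels, lr=0.1, iters=2000):
    """Fit (a, b) by minimizing the negative log-likelihood."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(iters):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid_prob(s, a, b)
            # With z = a*s + b, d(NLL)/dz = y - p.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical held-out SVM scores with binary labels (not separable,
# so the fitted sigmoid has finite parameters).
scores = [-2.0, -1.5, -1.0, -0.5, 0.3, 0.8, 1.2]
labels = [0, 0, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
probs = [sigmoid_prob(s, a, b) for s in scores]
print(a < 0)  # a negative slope means higher scores map to higher probabilities
```

After calibration, the outputs of different exemplar classifiers lie on a common 0 to 1 scale and can be compared or pooled.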
Overview of MEX-SVM
Results
Action        MEX-SVM Accuracy (%)    Exemplar Avg. Accuracy (%)
Catch         53.3                    26.67
Dribble       14.4                    16.00
Ride Bike     25.6                    34.00
Dive          24.4                    21.33
Fencing       43.3                    14.00
Golf          68.9                    40.00
Ride Horse    36.7                    54.00
Jump          28.9                    24.00
Kick Ball     12.2                    10.00
Walk          11.1                    12.00
Results
[Figure: Performance. Accuracy % vs. Number of Exemplars (0 to 30) for Catch, Dribble, Ride Bike, Dive, Fencing]
[Figure: Performance. Accuracy % vs. Number of Exemplars (0 to 30) for Golf, Ride Horse, Jump, Kick Ball, Walk]
Reasons for Discrepancies

Different training/testing set
◦ MEX-SVMs trained on UCF50 data set, tested on HMDB51
◦ Exemplar-SVM trained and tested on UCF50 data set

Exemplar Feature Vector
◦ MEX-SVM used the ground-truth bounding box
◦ Exemplar-SVM used the entire video

Mid-Level Feature Vector
◦ MEX-SVM Mid-Level Feature Dimension = Na x Ns x Np
  Na = Number of exemplars
  Ns = Number of exemplar template scales
  Np = Number of spatio-temporal pyramid cells
  185 x 3 x (1 + 8 + 64) = 40,515
◦ Exemplar-SVM Mid-Level Feature Dimension = Na
  Varied between 250 and 1,500
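The dimension arithmetic above can be checked with a few lines of code. The helper `pyramid_cells` is hypothetical and assumes that pyramid level l splits each of the x, y, and t axes into 2^l parts, which reproduces the 1 + 8 + 64 cell counts on the slide.

```python
def pyramid_cells(levels):
    """Total cells in a space-time pyramid with the given number of levels.

    Assumes level l has 2**l splits along each of x, y, and t,
    i.e. 8**l cells at that level.
    """
    return sum(8 ** level for level in range(levels))

n_exemplars = 185               # Na
n_scales = 3                    # Ns
n_cells = pyramid_cells(3)      # Np = 1 + 8 + 64 = 73

mex_dim = n_exemplars * n_scales * n_cells
print(mex_dim)  # 40515

# The Exemplar-SVM feature has one entry per exemplar, so its dimension
# equals Na, reported as between 250 and 1,500.
```

Even at its largest setting of 1,500 exemplars, the Exemplar-SVM feature is roughly 27 times shorter than the 40,515-dimensional MEX-SVM feature.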
References
[1] D. Tran and L. Torresani. MEXSVMs: Mid-level Features for Scalable Action Recognition. Dartmouth Computer Science Technical Report TR2013-726, January 2013.
[2] T. Malisiewicz, A. Gupta, and A. A. Efros. Ensemble of Exemplar-SVMs for Object Detection and Beyond. In Proc. ICCV, 2011.