Week 6

Fatemeh Yazdiananari
Feature Extraction
• 75 validation videos
• Sent to the cluster for DTF (dense trajectory features) extraction
• The extracted feature files are large
• Max size: 152 GB
• Min size: 540 MB
• Total size: 1.6 TB
• Programming
• Because the files are large, they were read with textscan, fgetl, and dlmread
• Parsed each feature file and saved all the information into a .mat file
• Used the .mat files to run the rest of the code
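With files of up to 152 GB, reading a whole file in one call is not practical, so a line-by-line pass (the role fgetl plays in MATLAB) keeps memory use bounded. The sketch below is in Python rather than the MATLAB used for these slides, and it assumes a whitespace-separated text file with one feature per line. The names `iter_feature_rows` and `iter_chunks` and the chunk size are hypothetical, not the code that was actually run.

```python
import io
from typing import Iterator, List, TextIO


def iter_feature_rows(stream: TextIO) -> Iterator[List[float]]:
    """Yield one parsed feature row at a time, never loading the whole file."""
    for line in stream:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        yield [float(token) for token in line.split()]


def iter_chunks(stream: TextIO, chunk_rows: int) -> Iterator[List[List[float]]]:
    """Group rows into fixed-size chunks so each chunk can be processed or saved on its own."""
    chunk: List[List[float]] = []
    for row in iter_feature_rows(stream):
        chunk.append(row)
        if len(chunk) == chunk_rows:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter, chunk
        yield chunk


# Tiny in-memory stand-in for a feature file (assumed format: numbers separated by spaces).
demo = io.StringIO("1 0.5 0.25\n2 0.1 0.9\n\n3 0.0 1.0\n")
chunks = list(iter_chunks(demo, chunk_rows=2))
```

For a real file the stream would come from `open(path)`, and each chunk could be written out separately, much as the slides save intermediate results to .mat files.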
Histogram code
• For each video
• Obtain the 4 feature matrices (Tr, Hof, Hog, Mbh)
• For each feature, find the closest codebook term and save its index
• Run a histogram function on the saved indices
• Save: video name; first 10 elements of the features; indices of Tr, Hof, Hog, and Mbh; and histograms of Tr, Hof, Hog, and Mbh
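The quantization and histogram steps above can be sketched as follows. This is a minimal Python illustration, not the code behind the slides. It assumes that "closest" means smallest squared Euclidean distance (the slides do not name the metric), and the tiny codebook and descriptors are made-up values.

```python
from typing import List, Sequence


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to the descriptor.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[int]:
    """Count how many descriptors fall on each codebook entry."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    return counts


# Made-up 2-D example; real descriptors (Tr, Hof, Hog, Mbh) are much longer.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.2], [0.9, 1.1], [4.0, 4.5], [1.2, 0.8]]
indices = [nearest_codeword(d, codebook) for d in descriptors]
histogram = bow_histogram(descriptors, codebook)
```

In the pipeline described above, this would be run once per descriptor type, each with its own codebook, giving four index lists and four histograms per video.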
Overview
15 validation videos
• Holistic
• Histograms of the 15 videos
• Binary SVM
• Sliding windows (10-second windows)
• Histograms of the sliding windows
• Binary SVM
Histograms
• 15 validation videos
• Normalized histograms
HOF histogram for one video
HOG histogram for one video
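The slides do not say which normalization was applied to the histograms. As one common choice, the Python sketch below uses L1 normalization (dividing each bin by the total count), so that videos with different numbers of features become comparable. Both the choice of L1 and the function name `l1_normalize` are assumptions.

```python
from typing import List, Sequence


def l1_normalize(histogram: Sequence[float]) -> List[float]:
    """Scale a histogram so its bins sum to 1 (assumed L1 normalization)."""
    total = sum(histogram)
    if total == 0:  # empty histogram: return zeros instead of dividing by zero
        return [0.0] * len(histogram)
    return [count / total for count in histogram]


normalized = l1_normalize([1, 2, 1])
```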
Histograms (cont.)
MBH histogram for one video
TR histogram for one video
Action recognition steps
• Binary SVM
• Trained on the UCF101 train split 1
• Obtained the trained models
• Tested using the 15 validation videos
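The slides do not spell out how binary SVMs lead to a predicted class label. One plausible reading is a one-vs-rest setup: one binary model per UCF101 class, with the highest-scoring model giving the prediction. The Python sketch below illustrates only that decision step. The one-vs-rest setup, the linear form of the models, and all weights shown are assumptions for illustration, not trained values.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical model store: class id -> (weight vector, bias) of one binary SVM.
Model = Tuple[Sequence[float], float]


def decision_value(weights: Sequence[float], bias: float,
                   histogram: Sequence[float]) -> float:
    """Linear SVM score w . x + b for one binary classifier."""
    return sum(w * h for w, h in zip(weights, histogram)) + bias


def predict_class(models: Dict[int, Model], histogram: Sequence[float]) -> int:
    """Pick the class whose binary classifier gives the highest score."""
    return max(models, key=lambda c: decision_value(*models[c], histogram))


# Made-up 3-bin models for three classes; real models would be trained on UCF101 split 1.
models = {
    1: ([1.0, -1.0, 0.0], 0.0),
    2: ([-1.0, 1.0, 0.0], 0.0),
    38: ([0.0, 0.0, 1.0], -0.1),
}
predicted = predict_class(models, [0.1, 0.2, 0.7])
```

Under this reading, a test histogram that resembles none of the trained actions still receives a label, namely whichever class happens to score highest.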
SVM & Results
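Added two UCF101 features (class 1 and class 2)
Ran the binary SVM
Classification results (figure at right)
Classes 1 and 2 of UCF101 were predicted correctly, for an accuracy of 11.76%. This figure matches 2 correct out of 17 test samples (15 validation videos plus the 2 UCF101 samples).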
Sliding window steps
• Obtain the frame rate of the video (VideoReader)
• Compute the window size in frames: multiply 10 seconds by the frame rate
• Load the histograms of the 15 videos
• Using the window size, read in the histograms window by window and save them into a structure
• After all the videos have been divided into windows and saved, load the windows
• Run normalization
• Binary SVM
• Trained on UCF101 split 1
• Tested on all the windows
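The windowing steps above can be sketched in Python as follows. The sketch assumes that each feature carries a frame number and a codeword index, that frames are numbered from 0, and that windows do not overlap (stride equal to window size). The slides state none of these details, and both function names are hypothetical.

```python
from typing import Dict, List, Sequence


def window_size_in_frames(frame_rate: float, seconds: float = 10.0) -> int:
    """Number of frames in one window: window duration times frame rate."""
    return int(round(frame_rate * seconds))


def window_histograms(frame_numbers: Sequence[int],
                      codeword_indices: Sequence[int],
                      codebook_size: int,
                      window_size: int) -> List[List[int]]:
    """Build one codeword histogram per window.

    Assumptions: frames are numbered from 0 and windows do not overlap.
    Windows that contain no features are skipped.
    """
    windows: Dict[int, List[int]] = {}
    for frame, index in zip(frame_numbers, codeword_indices):
        window_id = frame // window_size
        counts = windows.setdefault(window_id, [0] * codebook_size)
        counts[index] += 1
    return [windows[key] for key in sorted(windows)]


size = window_size_in_frames(30.0)  # 10 seconds at 30 frames per second
per_window = window_histograms(
    frame_numbers=[0, 10, 299, 300, 650],
    codeword_indices=[0, 1, 1, 2, 0],
    codebook_size=3,
    window_size=size,
)
```

Each resulting histogram would then be normalized and passed to the SVM as its own test sample.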
Sliding Window Histogram
• 15 validation videos
• Divided into 10-second sliding windows
• Normalized histograms
Hof histogram for one sliding window
Hog histogram for one sliding window
Sliding Window Histogram (cont.)
Mbh histogram of one sliding window
Tr histogram of one sliding window
SVM & Results
Added two UCF101 features
Class 1 and class 2
Ran Binary SVM
Classification results
Classes 1 and 2 of UCF101 were classified correctly
The sliding windows were misclassified as class 38
Conclusion
• Action recognition with existing methods was run on temporally untrimmed videos.
• The videos are long YouTube videos (THUMOS’14).
• The approach based on one bag-of-words histogram per video (state-of-the-art) failed, as expected.
• The sliding window approach failed because:
• The histograms extracted from the untrimmed clips may not include any particular action;
• such histograms happen to be similar to typical histograms of class 38, hence the misclassification as class 38.
• Not having the exact boundaries of the shots/clips has a nontrivial impact on the resulting histograms.
Next Steps
Visual comparison of the class 38 histograms with the untrimmed histograms: a visual similarity should be observed.
Extracting the histograms from UCF101 videos and classifying them using the same code used for the untrimmed videos: a sanity check.
Manually identifying the boundaries of the action in the untrimmed videos and selecting the windows accordingly: investigating the impact of unknown boundaries and partial windows.
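As a possible numeric complement to the visual comparison proposed above, the similarity between a class 38 histogram and an untrimmed-window histogram could be scored with histogram intersection. The slides do not mention this measure, so the Python sketch below is only a suggestion. It assumes both histograms are L1-normalized, so the score lies between 0 and 1, and the example values are made up.

```python
from typing import Sequence


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of bin-wise minima; equals 1.0 for identical L1-normalized histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(a, b))


# Made-up normalized histograms standing in for a class 38 video and an untrimmed window.
class_38_hist = [0.25, 0.5, 0.25]
window_hist = [0.5, 0.25, 0.25]
similarity = histogram_intersection(class_38_hist, window_hist)
```

Scores close to 1 for many untrimmed windows would support the explanation given in the conclusion for the class 38 misclassifications.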