SmartPlayer: User-Centric Video
Fast-Forwarding
K.-Y. Cheng, S.-J. Luo, B.-Y. Chen, and
H.-H. Chu
ACM CHI 2009
(International Conference on Human Factors in Computing Systems)
Outline
• Introduction
• SmartPlayer
– User-Centric Video Fast-Forwarding
– Skimming Model
– User Interface
• Results
• Conclusion
Introduction
• Microsoft Windows Media Player
– Play, pause, stop, fast-forward, rewind/reverse video
Introduction
• Video summarization
– Still-image abstraction: key frame extraction
• Example: image mosaic
– Video skimming
• Short video summary
• Video analysis techniques
– Image/video features
– Different video types
Introduction
• SmartPlayer
– Adjust playback speed
• Complexity of the current scene
• Predefined semantic events
– Learn user’s preferences
• About predefined semantic events
• User’s favorite playback speed
– Play video continuously
• Avoids missing any undefined events
User Behavior Observation And Inquiry
• User inquiry
– 10 participants: 5 males and 5 females
Video type | Number of people who fast-forward
Surveillance video | 10
Sport video | 9
Movies | 0
Lecture videos | 2
– How do users fast-forward these videos?
User Behavior Observation And Inquiry
• User inquiry
– Surveillance, baseball, tennis, golf, and wedding videos
– Training videos
– Prototype player
• Accelerate and decelerate playback (1x to 16x)
• Can jump back to normal speed
Figure: One user's watching pattern for a baseball video.
User-Centric Video Fast-Forwarding
• User behavior
– Users tend to maintain a constant playback speed
within a video shot.
– Users prefer gradual increases of playback speed.
– Users set the playback rate based on several
minutes of recently viewed shots.
• SmartPlayer
– Cut the video into segments
– Adjust the playback speed gradually across segment
boundaries
– Speed control
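The gradual adjustment across segment boundaries can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `smooth_segment_speeds` and the `max_step` limit are assumptions introduced for the example, and each segment is assumed to play at one constant speed.

```python
def smooth_segment_speeds(target_speeds, max_step=1.0):
    """Limit the speed change between consecutive segments to max_step.

    target_speeds: desired playback speed per segment (hypothetical input).
    max_step: largest allowed change between neighbours (assumed parameter).
    """
    if not target_speeds:
        return []
    smoothed = [target_speeds[0]]
    for target in target_speeds[1:]:
        delta = target - smoothed[-1]
        # Clamp the change so the speed never jumps at a segment boundary.
        delta = max(-max_step, min(max_step, delta))
        smoothed.append(smoothed[-1] + delta)
    return smoothed


# Four segments whose desired speeds jump from 1x to 8x and back.
print(smooth_segment_speeds([1.0, 8.0, 8.0, 1.0], max_step=2.0))
# → [1.0, 3.0, 5.0, 3.0]
```

Clamping in both directions is a design choice of this sketch; a player could equally allow an immediate drop to normal speed.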
Skimming Model
• Speed control
– motion complexity
– speed of the previous content
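One way to combine the two inputs listed above is sketched below. The linear mapping from motion complexity to speed, the `inertia` weight, and the function names are all assumptions for illustration; only the 1x to 16x range comes from the prototype player described earlier.

```python
MIN_SPEED = 1.0   # normal playback
MAX_SPEED = 16.0  # fastest rate offered by the prototype player


def speed_from_motion(motion):
    """Map a motion complexity in [0, 1] to a speed (assumed linear mapping).

    High motion complexity gives a slow speed, low complexity a fast one.
    """
    motion = max(0.0, min(1.0, motion))
    return MAX_SPEED - motion * (MAX_SPEED - MIN_SPEED)


def next_speed(motion, previous_speed, inertia=0.5):
    """Blend the motion-based speed with the speed of the previous content.

    inertia is a hypothetical weight: 1.0 keeps the previous speed,
    0.0 follows the motion-based speed alone.
    """
    return inertia * previous_speed + (1.0 - inertia) * speed_from_motion(motion)


# A busy scene (motion = 1.0) after content played at 3x.
print(next_speed(1.0, 3.0, inertia=0.5))  # → 2.0
```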
Skimming Model
• Motion layer
– Color[1]
• detect shot boundaries
– Motion
• extract optical flows between frames using the
Lucas-Kanade method
[1] Lienhart, R. Comparison of automatic shot boundary detection
algorithms. SPIE Storage and Retrieval for Image and Video Databases VII
3656, (1999), 290-301.
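Shot boundary detection from color can be illustrated with a histogram-difference test, one of the families of methods compared in [1]. The sketch below is an assumption-laden simplification: frames are flat lists of 0-255 intensities standing in for color values, and the bin count, threshold, and function names are hypothetical.

```python
def histogram(frame, bins=4):
    """Normalized intensity histogram of a frame (a non-empty list of 0-255 values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def detect_shot_boundaries(frames, threshold=0.5):
    """Return indices i where frame i starts a new shot.

    A boundary is declared when half the L1 distance between consecutive
    histograms (a value in [0, 1]) exceeds the assumed threshold.
    """
    boundaries = []
    if not frames:
        return boundaries
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        difference = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
        if difference > threshold:
            boundaries.append(i)
        previous = current
    return boundaries


# Two dark frames followed by two bright frames: the cut is at index 2.
frames = [[10] * 8, [12] * 8, [200] * 8, [210] * 8]
print(detect_shot_boundaries(frames))  # → [2]
```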
Skimming Model
• Semantic layer
– Extract semantic event points in video
– Manual annotation
Video type | Events
Baseball | Pitch, hit, home run, ...
Surveillance | Appearance of pedestrians, cars, bicycles
Wedding | Formal wedding procedure
News | Political, financial, life, international events
Drama | No events
Skimming Model
• Personalization layer
– Learning from user input
– $S_e' = \alpha S_e + (1 - \alpha) S_{eu}$
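The personalization update is a weighted blend, which the following sketch makes concrete. It assumes the first term is the speed currently stored for an event type and the last term is the speed the user just chose for that event; that reading, the function name, and the sample values are assumptions, not statements from the slides.

```python
def update_event_speed(learned_speed, user_speed, alpha):
    """Blend the stored speed with the user's chosen speed.

    alpha close to 1 trusts the stored speed; alpha close to 0 follows the user.
    """
    return alpha * learned_speed + (1.0 - alpha) * user_speed


# The stored speed for an event is 2x; the user repeatedly plays it at 4x.
speed = 2.0
for _ in range(3):
    speed = update_event_speed(speed, 4.0, alpha=0.5)
    print(speed)
# prints 3.0, then 3.5, then 3.75
```

With repeated feedback the stored speed moves toward the user's choice, faster for smaller alpha.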
User Interface
Results
• Personalized adaptive fast-forwarding
– 20 participants: 13 males and 7 females
Results
• Comparisons of different video players
– Video watching time
– Video content understanding rate
Results
• Average rating of three types of video players
Conclusion
• Automatically adapts its playback speed according to:
– Scene complexity
– Predefined events of interest
– The user's preferences with respect to playback speed
• Learns the user's preferred event types and playback speeds for those event types
• Does not skip any segments
An Extended Framework for
Adaptive Playback-Based Video
Summarization
Kadir A. Peker and Ajay Divakaran
SPIE ITCOM 2003
Features
• Visual complexity
– Motion activity: motion vector
– Spatial complexity: DCT coefficient
– Visual complexity = (motion vector) · (DCT coefficient), computed for each DCT coefficient
– For each frame: visual complexity = mean(cumulative energy at each visual complexity value)
Features
• Audio classes
– 1-s segments
– GMM-based classifiers
– Silence, ball hit, applause, female speech, male
speech, speech and music, music, and noise
– Sport highlights detection
• Face detection
– Viola-Jones face detector based on boosting[2]
[2] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of
simple features," In Proc. of IEEE Conference on Computer Vision and Pattern
Recognition, Kauai, HI, December 2001.
Features
• Cut detection
– Software tool Webflix
• Camera motion[3]
– Translation parameters and a zoom factor
– Camera motion and close-up object motion
[3] Yap-Peng Tan; Saur, D.D.; Kulkarni, S.R.; Ramadge, P.J., "Rapid estimation of
camera motion from compressed video with application to video annotation," IEEE
Trans. on Circuits and Systems for Video Technology, vol. 10, Feb. 2000, Page(s):
133-146.
Summarization Method
• Shot level
– Find key frames
• Local maxima in the face-size curve
• Local maxima of the camera motion
• Combine close key frame points as one segment
– Adaptive fast playback
• According to visual complexity
• Normal playback at highlight points
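The shot-level steps above can be sketched as follows. This is a simplified illustration under stated assumptions: only the face-size curve is used to find key frames, the inverse relation between speed and visual complexity is one plausible choice, and all function names, `max_gap`, `target`, and `max_speed` are hypothetical.

```python
def local_maxima(curve):
    """Indices of interior local maxima of a per-frame curve."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]]


def merge_close_points(points, max_gap):
    """Combine key frame points at most max_gap frames apart into (start, end) segments."""
    segments = []
    for point in sorted(points):
        if segments and point - segments[-1][1] <= max_gap:
            segments[-1] = (segments[-1][0], point)
        else:
            segments.append((point, point))
    return segments


def playback_speeds(complexity, highlights, target=1.0, max_speed=8.0):
    """Per-frame speed: 1x inside highlights, otherwise faster for simpler frames."""
    speeds = []
    for i, value in enumerate(complexity):
        if any(start <= i <= end for start, end in highlights):
            speeds.append(1.0)          # normal playback at highlight points
        elif value <= 0:
            speeds.append(max_speed)    # nothing to see: play as fast as allowed
        else:
            speeds.append(max(1.0, min(max_speed, target / value)))
    return speeds


face_size = [0, 1, 3, 1, 0, 2, 5, 2, 0, 0, 0, 4, 0]
key_frames = local_maxima(face_size)                      # [2, 6, 11]
highlights = merge_close_points(key_frames, max_gap=4)    # [(2, 6), (11, 11)]
print(playback_speeds([0.25] * len(face_size), highlights))
```

In this example the frames inside the two highlight segments play at 1x and the rest at 4x.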
Results