
Broadcast Court-Net Sports Video Analysis Using Fast 3-D Camera Modeling
Jungong Han
Dirk Farin
Peter H. N. de With
IEEE CSVT 2008
Introduction
 In consumer videos, sports video attracts a large audience
 Pixel/object-level analysis
 Extract highlights
 Event-based system
 Construct a general framework
System Architecture
Camera Calibration Introduction
 Map points in real-world coordinates to the image domain
 Assume the ground plane is placed at z = 0, so the mapping reduces to a 3 × 3 homography matrix H
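The reduction to a homography can be written out as follows. This is a sketch of the standard pinhole-camera argument; the entry names m_ij and the homogeneous image coordinates (u, v, w) are notation assumed here, not taken from the slides.

```latex
\begin{pmatrix} u \\ v \\ w \end{pmatrix}
=
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34}
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\quad\xrightarrow{\; z = 0 \;}\quad
\begin{pmatrix} u \\ v \\ w \end{pmatrix}
=
\underbrace{\begin{pmatrix}
m_{11} & m_{12} & m_{14} \\
m_{21} & m_{22} & m_{24} \\
m_{31} & m_{32} & m_{34}
\end{pmatrix}}_{H}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
```

With z = 0 the third column of the projection matrix never contributes, so only the remaining nine entries (eight up to scale) have to be estimated from the court lines.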
Computing the Ground-Plane Homography
1. Line-Pixel Detection
 Detect white pixels
 Use an additional constraint to prevent large white areas from being extracted
 Structure-tensor-based filter
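One way to realize the white-pixel test with the extra constraint is sketched below in Python/NumPy. It assumes a pixel is a line candidate when it is bright and the pixels at distance `tau` on both sides (horizontally or vertically) are clearly darker, which rejects large white areas. The function name, the thresholds `bright` and `diff`, and the default `tau` are illustrative assumptions, not values from the slides, and the structure-tensor filter is not included.

```python
import numpy as np

def detect_line_pixels(lum, tau=2, bright=128, diff=20):
    """Return a boolean mask of candidate court-line pixels.

    lum    : 2-D array of luminance values
    tau    : assumed approximate line width in pixels
    bright : assumed minimum luminance of a white pixel
    diff   : assumed minimum contrast to the pixels tau away
    """
    lum = lum.astype(np.int32)          # avoid uint8 wrap-around in differences
    h, w = lum.shape
    mask = np.zeros((h, w), dtype=bool)

    # Center pixels and their neighbors at distance tau in four directions.
    c = lum[tau:h - tau, tau:w - tau]
    left = lum[tau:h - tau, 0:w - 2 * tau]
    right = lum[tau:h - tau, 2 * tau:w]
    up = lum[0:h - 2 * tau, tau:w - tau]
    down = lum[2 * tau:h, tau:w - tau]

    # A thin line is darker on both sides, either horizontally or vertically.
    horiz = (c - left > diff) & (c - right > diff)
    vert = (c - up > diff) & (c - down > diff)

    mask[tau:h - tau, tau:w - tau] = (c >= bright) & (horiz | vert)
    return mask
```

Pixels inside a large white region fail both the horizontal and the vertical test because their neighbors are equally bright, so only thin structures survive.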
Computing the Ground-Plane Homography
2. Line-Parameter Estimation
 Use a RANSAC-like algorithm to detect dominant lines
 Refine each line with a least-squares approximation
P: the set of court-line pixels
τ: the approximate line width
Figure: line g with width τ
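The two bullets above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each hypothesis line is scored by summing max(τ − d, 0) over all pixels in P, where d is the point-to-line distance, and it uses an orthogonal (total) least-squares fit via SVD for the refinement. The function names, the iteration count, and the seed are hypothetical.

```python
import numpy as np

def line_from_points(p, q):
    """Return (n, c) with unit normal n such that n . x = c on the line through p, q."""
    d = q - p
    n = np.array([-d[1], d[0]], dtype=float)
    n /= np.linalg.norm(n)
    return n, float(n @ p)

def ransac_line(points, tau=2.0, iters=200, seed=0):
    """Find the dominant line in an (N, 2) array of line-pixel coordinates."""
    rng = np.random.default_rng(seed)
    best_score, best = -1.0, None
    for _ in range(iters):
        # Hypothesize a line through two randomly chosen pixels.
        i, j = rng.choice(len(points), size=2, replace=False)
        n, c = line_from_points(points[i], points[j])
        # Assumed score: pixels closer than tau contribute (tau - distance).
        dist = np.abs(points @ n - c)
        score = np.maximum(tau - dist, 0.0).sum()
        if score > best_score:
            best_score, best = score, (n, c)

    n, c = best
    inliers = points[np.abs(points @ n - c) < tau]

    # Refinement: orthogonal least-squares fit to the inliers.
    centroid = inliers.mean(axis=0)
    _, _, vt = np.linalg.svd(inliers - centroid)
    n = vt[-1]                      # direction of least variance = line normal
    return n, float(n @ centroid), inliers
```

To extract several dominant lines, the inliers of each detected line would be removed from the point set before the search is repeated.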
Computing the Ground-Plane Homography
3. Court Model Fitting
 Determine correspondences between the 4 detected lines and the lines in the court model
 Compute the model matching error E for every configuration
M: the collection of line segments
: the closest line segment in the image
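For one candidate configuration, a homography can be estimated from four point correspondences, for example the intersections of two pairs of detected lines matched to the corresponding court-model corners. The sketch below uses the standard direct linear transform (DLT) with NumPy. The function names are hypothetical, and the matching error E itself is not reproduced because its formula is not given on the slide.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) such that dst ~ H * src from >= 4 point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)        # null vector of the equation system
    return H / H[2, 2]              # assumes H[2, 2] != 0 (non-degenerate view)

def project(H, pts):
    """Map (N, 2) court coordinates to image coordinates with H."""
    pts = np.asarray(pts, dtype=float)
    hom = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return hom[:, :2] / hom[:, 2:3]
```

Each configuration yields one such H. Projecting the model segments with `project` gives the image positions against which a matching error can be evaluated.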
Computing the Ground-Plane Homography
4. Model Tracking
 Assume the change in camera speed is small
 Refine the camera calibration parameters
Playing Frame Detection
 Define a frame with a court as a playing frame
 Count the number of white pixels in the current frame
 If $(F(t) - \mu_F)^2 < 2\sigma_F$, the frame is a playing frame
 If not, this is not a playing frame: switch to court detection
$F(t)$: the number of white pixels in the court region
$\mu_F$: the mean number of white pixels from $t_0$ to $t_n$
$\sigma_F$: the variance of the number of white pixels from $t_0$ to $t_n$
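The test above is simple enough to state directly in code. The following is a minimal sketch, assuming the white-pixel counts of earlier playing frames are kept in a list; the function name and argument names are hypothetical.

```python
import numpy as np

def is_playing_frame(f_t, history):
    """Apply the test (F(t) - mu_F)^2 < 2 * sigma_F.

    f_t     : white-pixel count in the court region of the current frame
    history : white-pixel counts of the frames t_0 .. t_n
    """
    mu_f = float(np.mean(history))
    sigma_f = float(np.var(history))    # sigma_F is a variance, as defined above
    return (f_t - mu_f) ** 2 < 2.0 * sigma_f
```

A frame that fails the test would trigger a fresh court detection rather than continued tracking.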
Moving Player Segmentation
1. Build a background model
 Use 3 Gaussians to model the RGB color space
 Compute the Mahalanobis distance
2. EM-based background subtraction
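A rough sketch of the background model follows. It assumes the three Gaussians are one per color channel (R, G, B), which makes the Mahalanobis distance a variance-normalized Euclidean distance. The fixed `threshold` is a stand-in for the EM-based subtraction, which is not reproduced here. All function names and the threshold value are hypothetical.

```python
import numpy as np

def fit_background(samples):
    """Fit one Gaussian per channel to (N, 3) background color samples."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    var = samples.var(axis=0) + 1e-6    # small floor avoids division by zero
    return mean, var

def mahalanobis_map(image, mean, var):
    """Per-pixel Mahalanobis distance of an (H, W, 3) image to the background."""
    diff = image.astype(float) - mean
    return np.sqrt((diff ** 2 / var).sum(axis=-1))

def foreground_mask(image, mean, var, threshold=3.0):
    """Mark pixels whose color is far from the background model (assumed rule)."""
    return mahalanobis_map(image, mean, var) > threshold
```

Pixels close to the dominant court color get a small distance and are treated as background. Player pixels stand out with a large distance.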
Moving Player Segmentation
3. Player body bounding
 Detect the foot position
 The bounding box is computed from the player’s real height
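With a calibrated camera, the bounding box can be obtained by projecting the foot point and a point one body height above it. The sketch below assumes a full 3 × 4 projection matrix `P` is available, a nominal player height of 1.8 m, and a box width proportional to the height. These values and the function names are illustrative assumptions.

```python
import numpy as np

def project_point(P, xyz):
    """Project a 3-D world point with a 3x4 camera matrix P."""
    u, v, w = P @ np.append(np.asarray(xyz, dtype=float), 1.0)
    return np.array([u / w, v / w])

def player_box(P, foot_xy, height=1.8, width=0.6):
    """Return (left, top, right, bottom) in pixels for a player standing at foot_xy."""
    x, y = foot_xy
    foot = project_point(P, (x, y, 0.0))       # foot lies on the ground plane z = 0
    head = project_point(P, (x, y, height))    # assumed real-world body height
    pixel_height = abs(foot[1] - head[1])
    half_width = 0.5 * width / height * pixel_height
    top, bottom = min(foot[1], head[1]), max(foot[1], head[1])
    return (foot[0] - half_width, top, foot[0] + half_width, bottom)
```

Because the box follows from the camera model, its size adapts automatically to the player's distance from the camera.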
Occlusion Handling
 The occlusion has two properties
 Obtain the contour of the players in the binary map
 Find the peak
 Use a Gaussian distribution to represent the contour
Player Tracking
 Determine the correspondence between one known player in the previous frame and one blob in the current frame
$T_i$: the known player
$D_j$: the $j$-th candidate in the current frame
$x$: the predicted player motion, represented by speed and direction
: the covariance matrix
 Adopt the double exponential smoothing (DES) operator to smooth and refine the motion of each player [23]
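As an illustration of the smoothing step, the sketch below applies Brown's double exponential smoothing to one coordinate of a player trajectory. Whether [23] uses exactly this variant is an assumption, and the smoothing factor `alpha` and the function name are hypothetical.

```python
def double_exponential_smoothing(xs, alpha=0.5):
    """Smooth a 1-D sequence with Brown's double exponential smoothing.

    Returns the smoothed estimate 2*s1 - s2 for every time step, where s1 is
    the exponentially smoothed signal and s2 is s1 smoothed once more.
    """
    s1 = s2 = xs[0]
    out = []
    for x in xs:
        s1 = alpha * x + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
        out.append(2 * s1 - s2)
    return out
```

The function would be applied separately to the x and y court coordinates of each tracked player. A larger `alpha` follows the measurements more closely, a smaller one smooths more strongly.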
Scene Level
 Feature factors
$PR_1$: the relative location between the two players and the court field
$PR_1 = 0$ if both players are in the baseline region
$PR_1 = 1$ if one of the players is in the net region
$PR_2$: the horizontal relative location between the two players
$PR_2 = 0$ if both players are on the same horizontal half
$SC = 1$ if accelerating, $-1$ if decelerating, $0$ otherwise
$TR$: the temporal order of the event
 Event classification
 Service in a singles game
 Both-net in a doubles game
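The feature factors can be computed from the players' court coordinates, as in the sketch below. Several details are assumptions made for illustration: the coordinate convention (x measured from the center line, y as distance from the net), the net-zone depth `net_zone`, the speed tolerance `eps`, and the value $PR_2 = 1$ for players on different halves, which the slide does not state.

```python
from dataclasses import dataclass

@dataclass
class PlayerState:
    x: float           # lateral position in meters, 0 at the center line (assumed)
    y: float           # distance from the net in meters (assumed)
    speed_prev: float  # speed in the previous frame
    speed_now: float   # speed in the current frame

def pr1(a, b, net_zone=4.0):
    """0 if both players are in the baseline region, 1 if one is in the net region."""
    return 1 if (a.y < net_zone or b.y < net_zone) else 0

def pr2(a, b):
    """0 if both players are on the same horizontal half; 1 otherwise (assumed)."""
    return 0 if (a.x >= 0) == (b.x >= 0) else 1

def sc(p, eps=0.2):
    """1 if accelerating, -1 if decelerating, 0 otherwise."""
    delta = p.speed_now - p.speed_prev
    if delta > eps:
        return 1
    if delta < -eps:
        return -1
    return 0
```

An event classifier would then combine these per-frame values with the temporal order TR, for example by checking which feature patterns follow one another within a rally.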
Experiment Results
 Test sequences are recorded from TV broadcasts
 4 tennis, 3 badminton, and 2 volleyball games
 Resolutions: 720 × 576 and 320 × 240
 Robustness
Experiment Results
Precision=98.04%
Recall=94.39%
Experiment Results
 The performance of player position refinement
Experiment Results
Figure panels: service, baseline rally, net approach
System Efficiency
 The efficiency depends on image resolution and content complexity
 E.g., 473.8 ms per frame
Camera calibration: 30%
Player detection: 64%
Conclusion
 The new algorithm shows a detection rate/accuracy of 90-98%
 At the scene level, the system was able to classify some simple events.