Broadcast Court-Net Sports Video Analysis Using Fast 3-D Camera Modeling
Jungong Han, Dirk Farin, Peter H. N.
IEEE CSVT 2008

Introduction
- Among consumer videos, sports video attracts a large audience
- Pixel/object-level analysis
  - Extract highlights
- Event-based system
  - Construct a general framework

System Architecture

Camera Calibration: Introduction
- Map points in real-world coordinates to the image domain
- Assume the ground plane lies at $z = 0$, so the mapping for court points reduces to the homography matrix $H$

Computing the Ground-Plane Homography
1. Line-Pixel Detection
   - Detect white pixels
   - Use an additional constraint to prevent large white areas from being extracted
   - Structure-tensor-based filter
2. Line-Parameter Estimation
   - Use a RANSAC-like algorithm to detect dominant lines
   - Refine each line by a least-squares approximation
   - $g$: a candidate line
   - $P$: the set of court-line pixels
   - $\tau$: the approximate line width
3. Court Model Fitting
   - Determine correspondences between the 4 detected lines and the lines in the court model
   - Compute the model matching error $E$ for every configuration
   - $M$: the collection of line segments
   - Each segment is compared with the closest line segment in the image
4. Model Tracking
   - Assume the change in camera speed is small
   - Refine the camera calibration parameters

Playing Frame Detection
- Define a frame that contains a court as a playing frame
- Count the number of white pixels in the current frame
- If $(F_t - \mu_F)^2 < 2\sigma_F$, the frame is a playing frame
- If not: switch to court detection; this is not a playing frame
- $F_t$: the number of white pixels in the court region
- $\mu_F$: the mean number of white pixels from $t_0$ to $t_n$
- $\sigma_F$: the variance of the number of white pixels from $t_0$ to $t_n$

Moving Player Segmentation
1. Build a background model
   - Use 3 Gaussians to model the RGB color space
   - Compute the Mahalanobis distance
2. EM-based background subtraction
3. Player body bounding
   - Detect the foot position
   - The bounding box is computed from the player's real height

Occlusion Handling
- The occlusion has two properties
- Obtain the contour of the players in the binary map
- Find the peak
- Use a Gaussian distribution to represent the contour

Player Tracking
- Determine the correspondence between one known player in the previous frame and one blob in the current frame
- $T_i$: the known player
- $D_j$: the $j$-th candidate in the current frame
- $x$: the predicted player motion, represented by speed and direction
- The matching also uses the covariance matrix
- Adopt the DES operator to smooth and refine the motion of each player [23]

Scene-Level Feature Factors
- $PR_1$: the relative location between the two players and the court field
  - $PR_1 = 0$ if both players are in the baseline region
  - $PR_1 = 1$ if one of the players is in the net region
- $PR_2$: the horizontal relative location between the two players
  - $PR_2 = 0$ if both players are on the same horizontal half
- $SC = \begin{cases} 1, & \text{accelerating} \\ -1, & \text{decelerating} \\ 0, & \text{otherwise} \end{cases}$
- $TR$: the temporal order of the event

Event Classification
- Service in a singles game
- Both-net in a doubles game

Experiment Results
- Test sequences are recorded from TV broadcasts
- 4 tennis, 3 badminton, and 2 volleyball games
- Resolutions: 720 × 576 and 320 × 240
- Robustness
- Precision = 98.04%, Recall = 94.39%
- The performance of player position refinement
- Classified events: service, baseline rally, net approach

System Efficiency
- The efficiency depends on image resolution and content complexity
- E.g., 473.8 ms per frame
  - Camera calibration: 30%
  - Player detection: 64%

Conclusion
- The new algorithm shows a detection rate/accuracy of 90-98%
- At the scene level, the system was able to classify some simple events
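The camera-calibration slides state that placing the ground plane at z = 0 turns the world-to-image mapping into a homography H. A minimal sketch of that step follows, assuming the standard pinhole model in which a 3x4 projection matrix P maps homogeneous world points to image points. Under that assumption, H is simply P with its third (z) column removed. The function names and the matrix values are illustrative and are not taken from the paper.

```python
def ground_plane_homography(P):
    """Return the 3x3 homography for the plane z = 0.

    P is a 3x4 projection matrix given as three rows of four numbers.
    For z = 0 the third column of P never contributes, so it is dropped.
    """
    return [[row[0], row[1], row[3]] for row in P]


def project_ground_point(H, x, y):
    """Map a court point (x, y) on the ground plane to image coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity in the image")
    return (u / w, v / w)


def project_world_point(P, x, y, z):
    """Full 3-D projection, used here only to cross-check the homography."""
    u, v, w = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in P)
    if w == 0:
        raise ValueError("point maps to infinity in the image")
    return (u / w, v / w)


# Hypothetical projection matrix, chosen only so the arithmetic is easy to follow.
P_EXAMPLE = [
    [100.0, 0.0, 20.0, 320.0],
    [0.0, 50.0, -80.0, 240.0],
    [0.0, 0.5, 0.25, 1.0],
]

H_EXAMPLE = ground_plane_homography(P_EXAMPLE)
print(project_ground_point(H_EXAMPLE, 2.0, 2.0))
```

For any point with z = 0, `project_ground_point` and `project_world_point` agree, which is the property that lets the court lines alone determine H. Points above the ground, such as a player's head, still need the full matrix P.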
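The playing-frame test compares the white-pixel count of the current frame with the statistics of earlier frames. The sketch below follows the slide's condition $(F_t - \mu_F)^2 < 2\sigma_F$ literally, treating $\sigma_F$ as the variance, as the slide defines it. The function name, the use of the population variance, and the sample counts are assumptions made for illustration.

```python
from statistics import mean, pvariance


def is_playing_frame(white_count, history):
    """Apply the test (F_t - mu_F)^2 < 2 * sigma_F.

    white_count: number of white pixels in the court region of the current frame.
    history: white-pixel counts of the frames t_0 .. t_n.
    sigma_F is taken to be the population variance of the history (assumption).
    """
    if len(history) < 2:
        raise ValueError("need at least two earlier frames to estimate the variance")
    mu = mean(history)
    var = pvariance(history, mu)
    return (white_count - mu) ** 2 < 2 * var


# Hypothetical white-pixel counts from earlier playing frames.
HISTORY = [1000, 1020, 980, 1010, 990]

print(is_playing_frame(1015, HISTORY))  # count close to the running mean
print(is_playing_frame(400, HISTORY))   # count far from the running mean
```

A frame that fails the test would, per the slide, trigger court detection again rather than being rejected outright by this check alone.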