Video-Based Soccer Ball Detection in Difficult Situations Josef Halbinger and Juergen Metzler(B) Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Fraunhoferstr. 1, 76131 Karlsruhe, Germany [email protected] http://www.iosb.fraunhofer.de Abstract. The interest in video-based systems for acquiring and analyzing player and ball data of soccer games is increasing in several domains such as media and professional training. Consequently, tracking systems for live acquisition of quantitative motion data are becoming widely used. The interest for such systems is especially high for training purposes but the demands concerning the precision of the data are very high. Current systems reach a satisfying precision due to heavy interaction of operators. In order to increase the level of automation while retaining a constantly high precision, more robust tracking systems are required. However, this demand is accompanied by an increasing trend to stand-alone, mobile, low-cost soccer tracking systems due to cost concerns, stadium infrastructure, media rights etc. As a consequence, the live data acquisition has to be accomplished by using only a few cameras so that there are generally only few perspectives of the players and the ball. In addition, only lowresolution images are available in many cases. The low-resolution images strongly exacerbate the problem of detection and tracking the soccer ball. Apart from the challenge that arises from the appearance of the ball, situations where the ball is occluded by the players make the detection of the ball difficult. The lower the number of the cameras is, the lower generally is the number of available perspectives - and thus the more difficult it is to gather precise motion data. This paper presents a soccer ball detection approach that is applicable to difficult situations such as occluded cases. It handles low-resolution images from single static camera systems and can be used e.g. for ball trajectory reconstruction. The performance of the approach is analyzed on a data set of a Bundesliga match. Keywords: Soccer system 1 · Sport analysis · Ball detection · Video-tracking Introduction The increasing professionalization of soccer is accompanied by a growing media attention as well as game analysis and professional training. Especially, the automation of live analysis of soccer games is interesting for several domains such as media. This, however, requires a robust acquisition of player and ball c Springer International Publishing Switzerland 2015 J. Cabri et al. (Eds.): icSPORTS 2013, CCIS 464, pp. 17–24, 2015. DOI: 10.1007/978-3-319-17548-5 2 18 J. Halbinger and J. Metzler (a) (b) Fig. 1. (a) Variety of the appearance of the ball extracted from one image sequence and (b) examples for partially occluded situations. data that still relies heavily on the interaction of operators (so-called scouts) in current systems. Live acquisition of quantitative motion data such as distances covered by players, distances between players or ball possession can only be done by sophisticated automation. Our overall two-camera tracking system (one camera per half of the pitch) provides this kind of quantitative data for supporting a scout and for the automated acquisition of the relevant data [1]. It automatically detects, classifies and tracks the ball, the 22 soccer players, the referee and the two linesmen in one stitched image sequence of approximately double Full HD resolution. The main contribution of this work is the detection of the ball in situations in which the ball is close to a player or even partially covered by one. Detection of the ball in image sequences generally is a difficult task as the appearance of the ball varies from image to image. For instance, the high accelerations occurring at the ball may cause motion blur so that the shape of the ball is then more of an ellipse than a circle (see Fig. 1(a)). In addition, the color of the ball may vary from image to image due to changes of the illumination conditions. It might also have the same color as the lines of the pitch which exacerbates the ball detection task. Another challenge is the image resolution of the ball which is usually very small so that confusions with player body parts may occur. Depending on the camera perspective, the ball is in front of a complex image background such as the audience which exacerbates its detection as well. Besides difficulties that arise from the appearance of the ball itself, the detection of the ball is very challenging in situations where it is occluded by the players (see Fig. 1(b)). Every time a player touches the ball there is a chance that the ball is not fully visible for a short time as parts of the player’s body can move between ball and the camera. However, as long as the ball is at least partially visible, there is an opportunity to identify the ball in the image. The motivation of this work is to find a solution which can help to detect the ball in such cases. At the first stage, a circle detection method is applied. At the second stage, the detected circles are evaluated by examining the Freeman chain code [6] of the found contours. There are several publications for detecting and tracking the soccer ball as seen in [2]. However, most of these approaches focus on Video-Based Soccer Ball Detection in Difficult Situations 19 Fig. 2. Sample snapshot of the processed input image sequence including detected ball tracklets marked by rectangles (top) and one for the corresponding motion history image (bottom). tracking the ball in broadcast soccer videos and require a high image resolution [3]. The approach presented in this contribution is applicable to static camera systems, even for low-resolution cameras. So it can be used in huge tracking systems consisting of several cameras usually fixed installed in stadiums as well as for low-cost tracking systems that generally consist of 1–3 cameras capturing the entire pitch. The contribution is structured as follows: In Sect. 2, the module for the ball detection is described. It is a two-stage approach: At the first stage, the ball is detected in situations in which it is generally not occluded. Robust partial ball trajectories (tracklets) are extracted. In order to acquire detection hypotheses in occluded situations (within the gaps between the tracklets), a ball detector specialized for occluded situations is used at the second stage. Then, in Sect. 3, results of an evaluation on data sets of a Bundesliga match are presented. 2 Ball Detection The reconstruction of the ball trajectory requires a reliable soccer ball detection. However, this is challenging as there are usually a lot of occlusions. Furthermore, a high detection rate should be achieved at a low false alarm rate. We follow a detection approach that has been widely established in order to be able to fulfill this requirement (see e.g. [2]): the detection task is divided into two stages in which ball candidates are extracted and verified. First stage: in not occluded situations the ball is detected and confirmed by its appearance as a single object. Second stage: if there is a (partially) occluded situation which means that the ball has not been detected as a single object, a two-step approach is applied that first detects circles in the image and then analyzes the Freeman chain code [6]. 20 J. Halbinger and J. Metzler (a) (b) (c) (d) (e) Fig. 3. (a) Input image, (b) foreground/background segmentation of the input image, (c) chain code representation of the outer contour: lef t- and right-values (yellow and light blue/horizontal lines) of the chain code are of particular interest, (d) detected Hough circles (gray/smaller circles) and circular RoI RoIbc in which the CCH is calculated (blue/upper, brown/central and yellow/lower circle), (e) detected Hough circles and identified ball (green/lower cirlce) (Color figure online). 2.1 Not Occluded Situations At the first stage, the soccer ball has to be detected as a single object. Due to the real-time constraint for live applications, a feature-based detection with e.g. a sliding windows approach cannot be used. Instead, as the images from a static camera are captured, the foreground/background segmentation from Kim et al. [4] is applied at first. Temporal static background like the pitch and marking lines are segmented as background, whereas moving objects generate changing appearance and are therefore segmented as foreground. During the ball candidate extraction, all foreground regions are extracted and checked for their size using calibration information of the cameras. Foreground regions that are no candidates for the soccer ball due to their size are removed. Out of the remaining regions, the external contours are extracted as a sequence of points and analyzed afterwards. If the number of the contour pixels is higher than a specific bias, an ellipse is fitted to it and the mean squared error between every sequence point and the ellipse is calculated. Ball candidates with a high mean squared error are removed. The remaining candidates are kept as verified foreground regions in the foreground/background segmented image. Then, a dilatation is applied and the last n binary images are accumulated to a so-called Motion History Image (MHI) of verified ball candidates. Finally, ball tracklets (robust partial ball trajectories) are finally extracted from the MHI. Figure 2 shows a MHI and some results of detected/extracted ball tracklets. 2.2 Partially Occluded Situations The foreground/background segmentation of the first stage often merges ball and player into a single silhouette if they are either close to each other or partially Video-Based Soccer Ball Detection in Difficult Situations 21 occlude each other. Thus, the ball is not a singular object and only appears as a bump poking out of the player’s silhouette in the resulting image (Fig. 3(a) and (b)). At the second stage, the goal is to identify these bumps. In order to achieve this, we apply a two-step approach again. In the first step, circles (or at least parts of circles) in the image are detected via Hough transform [5]. In the second step, the Freeman chain code is considered to decide if a detected circle is a soccer ball. In the following, the details of the procedure are given: At the beginning of the first step, all the player silhouettes of the foreground/background segmented image are extracted into separate images. On each of these silhouette images, a Hough transform for circle detection is applied. All detected circles and circular arcs that approximately match the predefined ball dimensions are determined as ball candidates. A resulting Hough circle c is characterized by the center coordinates x and y as well as the radius r : c = (x, y, r). The Hough transform variant chosen in this work is called the Hough gradient method [7]. Unlike comparable methods, this variant only uses a two-dimensional accumulator instead of a three dimensional one. This is achieved by incrementing only accumulator cells along the gradient direction of each non zero pixel of the edge map instead of incrementing a complete circle and therefore keeping a separate accumulator for every predefined possible circle radius. This is beneficial to the running time of the algorithm. The downside is a lower recognition rate of circles with a concentric counterpart. But this flaw is acceptable since concentric circles do not occur in the segmented image material. At the beginning of the second step, the outer contour of the silhouette image is calculated. Then the Freeman chain code of the contour is determined (Fig. 3(c)). Now, in a circular Region of Interest (RoI) around the ball candidates that were identified before, the Chain Code Histogram (CCH) is computed [8]. The circular RoI is constructed around the center coordinates x and y of the ball candidate, adding a small Δ to the radius r (Fig. 3(d)). The Δ is added to encounter the problem that the detected circles of the Hough gradient method tend to be slightly smaller than they actually are. As a result, the considered RoI around the ball candidate RoIbc is defined as RoIbc = (x, y, r + Δ). As described in [8], the CCH is a discrete function p(k) = nk , k = 0, 1, ..., K − 1, n (1) where nk is the number of chain code values k in a chain code, and n is the number of links in a chain code. In case of the Freeman chain code there are K = 8 possible directions. Generally, a bump has a high amount of lef t and right-values of the chain code at the same time, while the lef t-values are on the upper side of the bump and right-values on the lower side of the bump. As a consequence, a RoIbc with a CCH that provides certain frequencies of occurrence of lef t- and right-values λ and ρ is defined to indicate a bump in the silhouette. If this frequency lies beyond a certain threshold τ (and RoIbc originates from inside the silhouette), it is assumed that a bump exists in this area. As this bump also matches the 22 J. Halbinger and J. Metzler dimensions of the soccer ball, the examined RoIbc is identified to have the soccer ball in it (Fig. 3(e)): 1 : λ ≥ τ and ρ ≥ τ Ball = . (2) 0 : else 3 Experimental Results We tested the first-stage of our approach - the tracklet extraction - on a data set of a Bundesliga match consisting of an image sequence with about 140.000 images of double Full HD resolution (see Fig. 2 for an example). There are 1428 tracklets to detect, in situations where the ball is neither occluded nor merged with a player. In these situations the approach detected 1343 tracklets and missed 85. There was no false alarm, i.e. all detected ball tracklets were correctly detected as such. The second-stage - ball detection in occluded situations - was tested on two data sets of a Bundesliga match. Both sets consist of image regions that were extracted from the same Bundesliga sequence as in the first stage test. The first data set consists of 1408 non-consecutive images with a resolution of 50 × 100 pixels, 704 of them showing the ball close to a player or partially occluded by 1 0,9 0,8 True Positive Rate 0,7 0,6 0,5 data set 1 0,4 data set 2 0,3 0,2 0,1 0 0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1 False Positive Rate Fig. 4. ROC curve of the tested second-stage approach: the ball detection approach for occluded situations. “data set 1” consists of 1408 non-consecutive images: 704 of them showing the ball close to a player or partially occluded by a player and the other 704 images don’t show a ball. “data set 2” consists of 9634 consecutive images: 111 of them showing the ball close to a player or partially occluded by a player and 9323 images don’t show the ball. Video-Based Soccer Ball Detection in Difficult Situations (a) (b) 23 (c) Fig. 5. Difficult cases for the ball detector: (a) Ball between player’s legs, (b) ball right in front of a player and (c) player without a ball, although there is a bump in the segmented image. a player. The other 704 images don’t show a ball. The second data set consists of 9634 consecutive images with a resolution of 64 × 128 pixels, 111 of them showing the ball close to a player or partially occluded by a player. 9323 images don’t show the ball. As mentioned in Sect. 2.2, the threshold τ describes the required frequencies of occurrence λ and ρ of lef t and right chain code values inside of RoIbc . In order to determine the optimal threshold, τ is iterated from a specified minimum to a specified maximum in both data sets. The range is set in a way that all possible cases are covered: it starts with a configuration that identifies every Hough circle as a ball and ends with a configuration that detects no single Hough circle as a ball. The results are displayed in a ROC (Receiver Operating Characteristics) curve that puts the true positive rate of a data set in relationship with its false positive rate (see Fig. 4). In the second data set, a true positive respectively false positive rate of 1.0 could not be reached. The reason for this is that no Hough circles matching the predefined ball dimensions were found. This leads to scenarios where varying the τ -threshold has no effect. The results also differ because the second data set has more difficult cases: On the one hand, there are several images in which the ball is between the player’s legs as illustrated in Fig. 5(a) or right in front of the foot as shown in Fig. 5(b). As a result, the ball does not appear as a bump in the segmented image. On the other hand, there are segmented images that have a strong bump, although there is no ball on the input image as shown in Fig. 5(c). 4 Conclusions In this paper, a two-stage approach for the detection of the soccer ball has been presented. The focus is on occluded situations in which the ball is partially occluded or merged with a player. We could yield a reliable extraction of ball tracklets in not occluded situations. Also, the ball detector for occluded situations is able to reliably detect balls in cases where the ball is partially occluded. With the exception of the delay in the output of the ball coordinates, which depends on the length of the motion history, the proposed approach is real-time capable. 24 J. Halbinger and J. Metzler References 1. Herrmann, C., Manger, D., Metzler J.: Feature-based localization refinement of players in soccer using plausibility maps. In: Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition IPCV (WORLDCOMP), vol. 2, Las Vegas (2011) 2. D’Orazi, T., Leo, M.: A review of vision-based systems for soccer video analysis. Pattern Recognit. 43(8), 2911–2926 (2010) 3. D’Orazi, T., Guaragnella, C., Leo, M., Distante, A.: A new algorithm for ball recognition using circle Hough transform and neural classifier. Pattern Recognit. 37(3), 393–408 (2004) 4. Kim, K., Chalidabhongse, T.H., Harwood, D., Davis, L.: Real-time foregroundbackground segmentation using codebook model. Real-Time Imag. Spec. Iss. Video Object Process. 11(3), 172–185 (2004) 5. Kimme, C., Ballard, D., Sklansky, J.: Finding circles by an array of accumulators. Commun. ACM 18(2), 120–122 (1975) 6. Freeman, H.: On the encoding of arbitrary geometric configurations. IRE Trans. Electron. Comput. EC 10(2), 260–268 (1961) 7. Bradski, G., Kaehler, A.: Learning OpenCV: Computer Vision with the OpenCV Library. O’Reilly Media, Sebastopol (2008) 8. Iivarinen, J., Visa, A.J.E.: Shape recognition of irregular objects. Proc. SPIE 2904, 25–32 (1996) http://www.springer.com/978-3-319-17547-8
© Copyright 2026 Paperzz