Probabilistic Tracking of Soccer Players and Ball
Kyuhyoung Choi, Yongdeuk Seo, and Sang Wook Lee
Dept. of Media Technology, Sogang University
Shinsu-dong 1, Mapo-gu, Seoul 121-742, Korea
{Kyu, Yndk, Slee}@sogang.ac.kr
Abstract. This paper proposes an effective system for simultaneously tracking multiple players and the ball in broadcast soccer video. The system uses a particle filter with images synthesized from templates to track players of the same team through occlusion. The synthesized image, from which an adaptive color histogram is computed, represents the expected image for each particle and gives a more precise likelihood evaluation of the particles. For ball tracking, an ordinary particle filter estimates the state of the ball while it is in ballistic motion without interference from players. When the ball is judged to be possessed by one or more players, the tracker stops and waits for the ball to reappear in the area around those players. The tracker performs well on ordinary broadcast soccer match videos.
1 Introduction
Analysis of soccer video sequences has been an active application area in computer vision, as the abundance of recent papers shows, quite apart from the popularity of soccer itself. This paper focuses on tracking the players and the ball in ordinary broadcast soccer match video. In this setting, the most challenging problem is occlusion between players of the same team, which causes the particles of one player to separate and spread over two or more players. A similar kind of input video was studied in [1], which tracked multiple players in a video of American football.

Our approach is based on the particle filter [7, 8]. However, simply attaching a particle filter to each player does not perform well: through the iteration of likelihood evaluation and re-sampling, the particles of an object move very easily to the regions of adjacent objects that give higher likelihoods, as observed in previous papers on multi-object tracking. The JPDAF (joint probabilistic data association filter) has been a solution for associating measurements with the identities of multiple tracked objects [9]. The OAP (occlusion alarm probability) successfully controlled the particle population by probabilistically weighting the likelihood of a particle according to the distance to its neighbors [2]. The tracker proposed here exploits an image synthesized from templates of players on the verge of occlusion to give a more reasonable measurement.

The sequential Monte-Carlo method is explained in Section 2. Section 3 deals with image pre-processing and Section 4 covers ordinary single player tracking. Section 5 explains the tracking of same-team players in occlusion. Ball tracking is discussed in Section 6. Section 7 provides experimental results and Section 8 concludes the paper.
2 Sequential Monte-Carlo
In brief, the sequential Monte-Carlo (SMC) algorithm sequentially estimates a non-parametric representation of the posterior distribution $p(x_t \mid z_{1:t})$, where $x_t$ is the state and $z_t$ is the measurement at time $t$, given a sequential dynamic system with a Gauss-Markov process. The posterior is represented by random particles, or samples, drawn from the posterior distribution. When it is not possible to sample directly from the posterior, a proposal distribution $q$ with a known random sampler can be adopted instead; in this case the posterior at time $t$ is represented by pairs of a particle $s$ and its weight $w$, updated sequentially:

$$ w_t = w_{t-1} \frac{p(z_t \mid x_t)\, p(x_t \mid x_{t-1})}{q(x_t \mid x_{0:t-1}, z_{1:t})} \tag{1} $$

After the weights $w_t$ are computed for the particles generated from $q$ and normalized so that $\sum_{i=1}^{N} w_t^i = 1$, where $N$ is the number of particles, the set of particles comes to represent the posterior distribution. The particle filter takes the proposal distribution to be $q = p(x_t \mid x_{t-1})$, which gives $w_t = w_{t-1}\, p(z_t \mid x_t)$: the posterior can be estimated by evaluating the likelihood at each time step using particles generated from the prediction process of the system dynamics. After re-sampling, all particles have the same weight $1/N$, so with re-sampling incorporated the weight update reduces further to $w_t = p(z_t \mid x_t)$, with weight normalization implied afterwards.
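The predict, weight and re-sample cycle described above can be sketched for a scalar state as follows. This is a minimal illustration rather than the implementation used in this paper: the function names, the Gaussian motion and measurement models, the noise levels and the uniform-weight fallback are all assumptions made for the example.

```python
import math
import random

def sir_step(particles, predict, likelihood, rng):
    """One predict / weight / re-sample step for a scalar state."""
    # Draw each particle from the proposal q = p(x_t | x_{t-1}).
    predicted = [predict(x, rng) for x in particles]
    # With this proposal the weight reduces to the likelihood p(z_t | x_t).
    weights = [likelihood(x) for x in predicted]
    total = sum(weights)
    if total <= 0.0:
        # Degenerate case: fall back to uniform weights (an assumption).
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Weighted mean of the particles as the state estimate.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    # Multinomial re-sampling: afterwards every particle has weight 1/N.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

def demo():
    """Track a hypothetical static target observed at z = 5.0."""
    rng = random.Random(0)
    z = 5.0
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    predict = lambda x, r: x + r.gauss(0.0, 0.5)
    likelihood = lambda x: math.exp(-0.5 * (z - x) ** 2)
    estimate = None
    for _ in range(5):
        particles, estimate = sir_step(particles, predict, likelihood, rng)
    return estimate
```

Calling `demo()` returns an estimate close to the assumed measurement of 5.0; in the tracker, the scalar state would be replaced by the player or ball state vector and the likelihood by the histogram comparison of Section 4.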
3 Pre-image processing
Fig. 1. Image processing
Figure 1 illustrates the image processing steps that automatically detect players and identify their classes (e.g., player of team A, player of team B, goalie of team A, goalie of team B and referee). From the original image $I^{ori}$, the ground area is segmented out according to the 3D histogram of the image to give $I^{sub}$, as shown in Figure 1. Then, by applying morphological filtering ($I^{mor}$), connected component labelling ($I^{ccl}$) and size filtering ($I^{siz}$), we obtain $I^{pla}$, which contains only the candidate blobs of players. The rest of the image is considered to be ground, spectators or other stadium facilities and is marked black. Note that, in the application of the particle filter, only non-black pixels are considered in the likelihood evaluation, and from the second frame onward only $I^{sub}$ is used for tracking both the players and the ball.
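The connected component labelling and size filtering steps can be sketched on a small binary mask as below. This is only an illustration under stated assumptions: the mask is a list of lists of 0/1 values, 4-connectivity is assumed (the connectivity is not specified here), the size bounds are hypothetical, and ground segmentation and morphological filtering are taken to have been applied already.

```python
from collections import deque

def label_components(mask):
    """Label 4-connected foreground blobs; return (labels, sizes)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                # Start a new blob and flood-fill it breadth-first.
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                count = 0
                while queue:
                    y, x = queue.popleft()
                    count += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                sizes[next_label] = count
    return labels, sizes

def size_filter(mask, min_size, max_size):
    """Keep only blobs whose pixel count lies in [min_size, max_size]."""
    labels, sizes = label_components(mask)
    keep = {lab for lab, n in sizes.items() if min_size <= n <= max_size}
    return [[1 if lab in keep else 0 for lab in row] for row in labels]
```

With suitable bounds, isolated noise pixels and very large regions are discarded, leaving blobs of roughly player size as candidates.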
4 Single Player Tracking
Tracking starts from the second frame. For the image of the $t$-th frame, $I_t^{sub}$, the states of the players ($p_t$) and the ball ($b_t$) are estimated by the particle filters assigned to them. The state vector of a player $p$ is $(r^T, w, h)^T$, where $r = (r_x, r_y)^T$ is the center position of the rectangle that represents the player, and $w$ and $h$ are the half width and height of the rectangle. Constant velocity is assumed for the dynamics of the position, and zero velocity for the width and height:

$$ r_t = 2 r_{t-1} - r_{t-2} \tag{2} $$
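As a small illustration of these dynamics, the following sketch propagates one player state. The Gaussian process noise, its standard deviations and the lower bound on the rectangle size are assumptions made for the example and are not specified in this paper.

```python
import random

def predict_player(state_prev, state_prev2, rng, pos_sigma=2.0, size_sigma=0.5):
    """Propagate (rx, ry, w, h): constant velocity for position, none for size."""
    rx1, ry1, w1, h1 = state_prev
    rx2, ry2, _, _ = state_prev2
    # Equation (2): r_t = 2 r_{t-1} - r_{t-2}, plus assumed Gaussian noise.
    rx = 2.0 * rx1 - rx2 + rng.gauss(0.0, pos_sigma)
    ry = 2.0 * ry1 - ry2 + rng.gauss(0.0, pos_sigma)
    # Width and height stay near their previous values (zero velocity).
    w = max(1.0, w1 + rng.gauss(0.0, size_sigma))
    h = max(1.0, h1 + rng.gauss(0.0, size_sigma))
    return (rx, ry, w, h)
```

With both noise levels set to zero, the function reduces to the deterministic extrapolation of equation (2).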
For particle filtering, each player has $N$ samples, or particles, whose weights are determined by likelihood evaluation, that is, by histogram comparison. Each player is assigned a class as in Section 3, and each class has its model color histogram. When the color histogram of the region of $s_A^i$, the $i$-th ($i \in N$) sample of player A, is $h_A^i$ and the model color histogram of the corresponding class is $h_A$, the likelihood $L_i$ of $s_A^i$ can be expressed with the total divergence $D$ [10]:
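$$ L_i = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( \frac{-D(h_A, h_A^i)^2}{2\sigma^2} \right) \tag{3} $$

$$ D(h_i, h_j) = 2\log 2 + \sum_{y \in Both} \left\{ h_i(y) \log \frac{h_i(y)}{h_i(y) + h_j(y)} + h_j(y) \log \frac{h_j(y)}{h_i(y) + h_j(y)} \right\} \tag{4} $$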
where $y$ is a bin index and $Both$ is the set of $y$ satisfying both $h_i(y) > 0$ and $h_j(y) > 0$. The weights are obtained by normalizing $L$, and the weighted sum of the particles gives $\hat{p}$, the estimate of $p$ at this frame.
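Equations (3) and (4) and the weighted state estimate can be sketched as follows. The sketch assumes that histograms are lists of bin values normalized to sum to one and that particles are tuples of numbers; the function names are chosen for illustration only.

```python
import math

def total_divergence(hi, hj):
    """Equation (4): sum over bins where both histograms are positive."""
    d = 2.0 * math.log(2.0)
    for a, b in zip(hi, hj):
        if a > 0.0 and b > 0.0:
            d += a * math.log(a / (a + b)) + b * math.log(b / (a + b))
    return d

def likelihood(model_hist, particle_hist, sigma):
    """Equation (3): Gaussian of the divergence from the model histogram."""
    d = total_divergence(model_hist, particle_hist)
    return math.exp(-d * d / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def estimate_state(particles, likelihoods):
    """Normalize the likelihoods into weights and average the particles."""
    total = sum(likelihoods)
    weights = [l / total for l in likelihoods]
    dim = len(particles[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, particles))
                 for k in range(dim))
```

For normalized histograms, the divergence is zero when the two histograms are identical and reaches $2 \log 2$ when they share no bins, so the likelihood is largest for the particle whose region best matches the class model.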
5 Image Synthesis from Templates
As Figure 2 shows, the particle filter by itself cannot prevent the features of one player from attracting the particles of another player of the same team while the two are close to each other. In Figure 2, during the occlusion, part of the image of player A is hidden by the foreground player B. For player A, that much color information is lost, so the otherwise best particles fail to match the model color histogram well, since that histogram is derived from a template of the player's whole body. The more A is occluded by B, the harder it is to observe A and the lower the confidence in the observation becomes. Therefore, it is more appropriate to treat the occluding part of player B as part of A's image and compare it with the corresponding synthesized image. This fully exploits the given image information and takes into account the changes around the tracked target. For this, an observation model as in Figure 3 is required.

Fig. 2. Particle distribution during occlusion

From the tracking results at every frame, we can anticipate occlusions between players of the same team, and for those candidates who are close enough to each other, an image template of each is saved. During occlusion, a synthesized image is derived from the templates and its color histogram is used for the likelihood evaluation of the particles. If the distance between players A and B of the same team at frame $t-1$ becomes smaller than a proper threshold during single player tracking as in Section 4, that is, if the players are expected to overlap in the coming frames, the template images $T_A$ and $T_B$ corresponding to the expectations $E(p_A)$ and $E(p_B)$ are obtained and saved.

To compute the likelihood of the $i$-th particle of A at frame $t$, the synthesized image $T_S^i$ is made from $T_A$ and $T_B$. If the position expectation of B, $E(r_{B,t})$, is not yet available, the velocity $E_{t-1}(r_B) - E_{t-2}(r_B)$ is added to $E_{t-1}(r_B)$ to give the predicted position $r_{B,t^-}$. Otherwise, $E(r_{B,t})$ is used as $r_{B,t^-}$. $T_A$ and $T_B$ are aligned according to the relative positions of $r_{B,t^-}$ and $r_{A,t}^i$ of the particle $s_{A,t}^i$. The intersection areas of the two templates, $T_{A,sub}$ and $T_{B,sub}$, are derived through this alignment as in Figure 4.
$$ T_{A,sub} = \{ p : x_{A,from} \le p_x \le x_{A,to},\ y_{A,from} \le p_y \le y_{A,to} \} \tag{5} $$

$$ T_{B,sub} = \{ p : x_{B,from} \le p_x \le x_{B,to},\ y_{B,from} \le p_y \le y_{B,to} \} \tag{6} $$

where $p$ is a pixel with RGB values as its elements, and $(x_{from}, y_{from})$ and $(x_{to}, y_{to})$ are the upper-left and lower-right points of each template, respectively. Then $\alpha_S^i$, the pixel constituting $T_S^i$, is defined according to $r_{y,B,t^-}$ and $r_{y,A,t}^i$ as follows.

When $r_{y,A,t}^i > r_{y,B,t^-}$, that is, when A occludes B because A is located lower than B in image coordinates,

$$ \alpha_S^i(j,k) = \begin{cases} \alpha_B(j - x_{A,from},\ k + y_{B,from}), & \text{if } \alpha(j,k) \in T_{A,sub} \text{ and } V(\alpha(j,k)) \in B_{field} \\ \alpha_A(j,k), & \text{otherwise} \end{cases} \tag{7} $$

When $r_{y,A,t}^i < r_{y,B,t^-}$,

$$ \alpha_S^i(j,k) = \begin{cases} \alpha_B(j - x_{A,from},\ k + y_{B,from}), & \text{if } \alpha(j,k) \in T_{A,sub} \text{ and } V(\alpha(j,k)) \notin B_{field} \\ \alpha_A(j,k), & \text{otherwise} \end{cases} \tag{8} $$

where $V$ maps the color of a pixel to a bin of a color histogram. Unlike single player tracking with no players around the target, during the frames of occlusion the particle of maximum likelihood is taken as the state of a player. Examples of synthesized images are shown in Figure 4(b).

Fig. 3. Observation model

Fig. 4. Image synthesis and example images: (a) image synthesis, (b) examples of synthesized images
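The compositing rule of equations (7) and (8) can be sketched under one possible reading. The sketch assumes that templates are lists of rows of pixel values, that `offset` gives the position of the upper-left corner of B's template in A's template coordinates, and that a hypothetical predicate `is_field` plays the role of the test $V(\cdot) \in B_{field}$. It further assumes that the test applies to A's pixel when A is in front and to B's pixel when B is in front, which the equations do not state explicitly.

```python
def synthesize(template_a, template_b, offset, a_in_front, is_field):
    """Compose a synthesized image on the grid of A's template."""
    dx, dy = offset
    rows_b, cols_b = len(template_b), len(template_b[0])
    out = []
    for k, row in enumerate(template_a):        # k: row (y), j: column (x)
        out_row = []
        for j, pixel_a in enumerate(row):
            jb, kb = j - dx, k - dy             # same pixel in B's coordinates
            if 0 <= kb < rows_b and 0 <= jb < cols_b:
                pixel_b = template_b[kb][jb]
                if a_in_front:
                    # A occludes B: B shows only through A's field pixels.
                    use_b = is_field(pixel_a)
                else:
                    # B occludes A: B's non-field pixels cover A.
                    use_b = not is_field(pixel_b)
                out_row.append(pixel_b if use_b else pixel_a)
            else:
                # Outside the intersection area A's pixel is kept.
                out_row.append(pixel_a)
        out.append(out_row)
    return out
```

The color histogram of the returned image would then replace the class model histogram in the likelihood evaluation of the corresponding particle.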
6 Ball Tracking
The overall model for ball tracking is similar to that for players, except that the ball is modeled as an ellipse with four parameters: the center position and the lengths of the long and short axes. Unlike player tracking, a contour tracking module is added to the histogram matching module [11]. That is, the intensity gradient along the normal of the circumference is measured as another observation cue, in addition to the color histogram of the area inside the contour. The weight $w_i$ of the $i$-th particle among the $N_b$ ball particles, computed via the dual module of contour and color histogram, is
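$$ w_i = \frac{Y_u(i)}{\sum_{i=1}^{N_b} Y_u(i)}, \qquad Y_u(i) = \frac{u_i - \min_{j \in N_b} u_j}{\max_{j \in N_b} u_j - \min_{j \in N_b} u_j} \tag{9} $$

where

$$ u_i = Y_{u_g}(i) + Y_{u_c}(i), \qquad u_g = \frac{1}{N_c} \sum_{k=1}^{N_c} g(k), \qquad u_c = D(h, h_b) \tag{10} $$

with $N_c$ the number of pixels on the circumference, and

$$ g(k) = \begin{cases} 1, & \text{if } c(k) \in E \\ 0.5, & \text{if } c^-(k) \in E \text{ or } c^+(k) \in E \\ 0, & \text{otherwise} \end{cases} \tag{11} $$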
where $c(k)$ is the pixel corresponding to $k$ ($k \in N_c$), $c^-(k)$ and $c^+(k)$ are the pixels in front of and behind $c(k)$ along its normal, respectively, $E$ is the set of Canny edge pixels, and $h$ and $h_b$ are the histograms of a particle and of the model ball template, respectively.

As in Figure 5, the states of the ball can be classified into two cases: a player has the ball, or the ball is in ballistic motion as an elastic body. An ordinary particle filter is applied in the latter case. In the former, the filter stops tracking and the position of the player possessing the ball is taken as the estimate of the ball position. In the following frames, the tracker waits for the ball to separate from the player, and the particle filter is re-initialized after the ball reappears so that tracking resumes. That is, if $E(b_t)$, the resulting estimate of the ball at frame $t$, indicates that one or more players have the ball, that is, if the distance between the ball and the player(s) is smaller than a proper threshold, the mean position $E(r_t)$ of the $N_c$ players $p_i$ ($i \in N_c$) within a proper distance $d$ of the estimated ball is taken as the ball position. From the next frame, rectangular bounding boxes of proper size around those players are searched to detect the reappearance of the ball. In the image $I^{ccl}$, a suitable image blob of the separating ball is searched for, and its likelihood is evaluated in $I^{sub}$. The blob of maximum likelihood above some threshold is taken as the reappeared ball and is assigned a new particle filter.

Fig. 5. Ball tracking
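The ball weighting of equations (9) to (11) and the possession test described above can be sketched as follows. Several points are assumptions made for the example: pixels are coordinate tuples, `edge_pixels` is a set standing in for $E$, an all-equal input to the min-max normalization maps to 1.0, and the color scores passed in are already oriented so that larger means a better match (the orientation of $u_c$ is not stated alongside equation (10)). The function names are hypothetical.

```python
import math

def edge_cue(samples, edge_pixels):
    """u_g of equation (10): mean of g(k) over (c, c_minus, c_plus) triples."""
    total = 0.0
    for c, c_minus, c_plus in samples:
        if c in edge_pixels:
            total += 1.0                      # contour pixel lies on an edge
        elif c_minus in edge_pixels or c_plus in edge_pixels:
            total += 0.5                      # an edge lies one step along the normal
    return total / len(samples)

def min_max(values):
    """Y(.) of equation (9): rescale the values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)            # assumption for the degenerate case
    return [(v - lo) / (hi - lo) for v in values]

def ball_weights(edge_scores, color_scores):
    """Combine both cues as in equation (10) and normalize as in (9)."""
    u = [g + c for g, c in zip(min_max(edge_scores), min_max(color_scores))]
    y = min_max(u)
    total = sum(y)
    return [v / total for v in y]

def possessed_position(ball_pos, player_positions, d):
    """Mean position of the players within distance d of the ball, or None."""
    near = [p for p in player_positions
            if math.hypot(p[0] - ball_pos[0], p[1] - ball_pos[1]) < d]
    if not near:
        return None                           # ball is free: keep the particle filter
    n = len(near)
    return (sum(p[0] for p in near) / n, sum(p[1] for p in near) / n)
```

When `possessed_position` returns a position, the tracker would stop the ball filter and report that position until a blob reappears near those players.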
7 Experiments
Experiments were carried out on video sequences of size 640 × 360 and 960 × 540. Figure 6 shows some frames of the results, the details of which are contained in the accompanying video clip. The initial position of the ball was given by the user, and no two players were assumed to appear as a connected blob in the initial frame. In the goal-scoring situation of Figure 6(a), the ball is lost when the goalkeeper appears to be catching it, because the ball actually passes by him without being touched. In Figure 6(b), two players in white tops overlap each other and the trackers keep tracking them through the occlusion.
Fig. 6. Examples of results, panels (a) and (b)

8 Conclusion

This paper presents an effective system for tracking the players and the ball in a soccer match video sequence. To deal with objects of the same class, namely players of the same team during occlusion, we exploited an image synthesized from template images saved before the occlusion for the likelihood evaluation of the particles. Simple color histogram matching combined with the proposed method gave lower performance than expected, since the spatial and local information of colors is not considered in the histogram.
References
1. Intille, S., Bobick, A.: Closed-world tracking. In: Proc. Int. Conf. on Computer Vision. (1995)
2. OK, H., Seo, Y., Hong, K.: Multiple soccer players tracking by condensation with occlusion alarm probability. In: Int. Workshop on Statistically Motivated Vision Processing, in conjunction with ECCV 2002, Copenhagen, Denmark. (2002)
3. Reid, I., Zisserman, A.: Goal-directed video metrology. In: Proc. European Conf. on Computer Vision. (1996)
4. Kim, T., Seo, Y., Hong, K.: Physics-based 3D position analysis of a soccer ball from monocular image sequences. In: Proc. Int. Conf. on Computer Vision. (1998) 721–726
5. Inamoto, N., Saito, H.: Immersive observation of virtualized soccer match at real stadium model. In: IEEE and ACM International Symposium on Mixed and Augmented Reality. (2003)
6. Taki, T., Hasegawa, J., Fukumura, T.: Group motion features for teamwork evaluation and its application to soccer games. In: 14th International Conference on Pattern Recognition. (1998)
7. Blake, A., Isard, M.: Active Contours. Springer-Verlag (1997)
8. Doucet, A., Godsill, S., Andrieu, C.: On sequential Monte-Carlo sampling methods for Bayesian filtering. (2000)
9. Bar-Shalom, Y., Fortmann, T.: Tracking and Data Association. Academic Press (1998)
10. Lee, L.: Similarity-Based Approaches to Natural Language Processing. PhD thesis, Harvard University, Cambridge, MA (1997)
11. Birchfield, S.: Elliptical head tracking using intensity gradients and color histograms. In: Proc. IEEE Conf. Computer Vision and Pattern Recognition. (1998)