Priors for the Ball Position in Football Match using Contextual Information

CARLOS DUQUE

Master of Science Thesis
Stockholm, Sweden 2010

Master’s Thesis in Computer Science (30 ECTS credits) at the School of Electrical Engineering, Royal Institute of Technology, year 2010
Supervisor at CSC was Josephine Sullivan
Examiner was Stefan Carlsson

TRITA-CSC-E 2010:048
ISRN-KTH/CSC/E--10/048--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract

Due to the growth of media services in football broadcasting, a Swedish company called TRACAB is working on the reconstruction of football matches in 3-D. TRACAB has developed a system that tracks the players’ positions for the whole match using cameras installed in each stadium. However, this system is less reliable when applied to tracking the football. Therefore this Master’s Thesis focuses on estimating the ball’s position from the players’ trajectories. In particular, we investigate how contextual information can be exploited to localize the ball. The position of the ball is estimated with different supervised learning methods from feature vectors built from the players’ positions. First, exact coordinates are estimated in general situations where no priors are given. In light of those results, the work then selects the regions where the ball is most likely to be found. A second scenario is investigated where it is assumed that we know which player has the ball, and we try to predict the region to which he will pass it. With such priors, TRACAB could find the ball on the pitch more efficiently. Results are reported on data provided by TRACAB. These data are the coordinates of the ball and the players’ positions for one complete match.
Histograms and percentages of success illustrate the results and the error approximations in each case.

Contents

1 Introduction
1.1 Background
1.2 Problem this project addresses
1.3 Overview of how to exploit contextual information
1.4 Supervised Learning Methods
2 Data utilized in this work
2.1 Collection of needed data
3 Estimation of ball position
3.1 Nearest Neighbour Regression
3.2 Nearest Neighbour regression with players in grid
3.3 Nearest Neighbour with players in grid and temporal window
4 Heat Maps: Scenario given no prior information
4.1 Scenario 1: General Situations
5 Heat Maps: Scenario where players pass the ball
5.1 Scenario 2: Particular Situations
5.1.1 Scenario 2: Feature Vector
5.1.2 Results and Conclusions
6 Conclusions and Future Work
References

Acknowledgement

After I attended the course Image Analysis and Computer Vision, Javier Romero González, who taught the laboratory sessions, introduced me to the CVAP department at KTH. Therefore, my first thanks go to him. I met Josephine Sullivan there and she offered me a project I could not refuse. It was first called “Action recognition for football players”. Hence, thanks to Josephine Sullivan, who has been my supervisor since then, who has dedicated all the time I needed to answer my questions, and who has taught me additional mathematical theory when I needed it. She has guided me, step by step, through this Master’s Thesis.
I cannot forget Mohammad Rastegari, who was at first my workmate in the department and later became a real friend. I am grateful to him for his patience in answering my queries about programming in MATLAB. My infinite thanks go to my parents, who have let me finish my degree abroad. Thanks also to Ricardo Oña, Sergio García, Guillermo Tato and Víctor Herrero, my football mates, who added their points of view about this project, and to Umi Morita and Helena Pérez, who have helped me with this report.

During my stay in Sweden, I have met lots of friends from many countries. Thanks to them for making this year the best and most meaningful one of my life. Day by day, they have made me laugh and I have had a great time with them. I wish them the best and I hope that we will see each other again, here or in other parts of the world.

Chapter 1
Introduction

1.1 Background

Nowadays, developments in image processing technology and the possibility of broadcasting a huge number of sporting events have increased customers’ demand for additional services. Spectators are not content only to watch the game. They also want more information on the screen, such as statistics, names and data, replays from different angles, etc. An important company that provides such services is TRACAB [1]. The firm has recently been selected by FIFA to provide these services at the biggest football competition, the 2010 World Cup in South Africa.

Figure 1.1: TRACAB: example of services

Since 2003, TRACAB has been improving its technology in order to develop new products. One of them provides virtual representations of games in real time, helping the sports industry and the media create new products for the end consumer. SAAB and several universities are involved in this exciting project.
In particular KTH, which is one of Sweden’s leading technical universities, has a close collaboration with TRACAB. This Master’s Thesis is a part of this on-going collaboration.

1.2 Problem this project addresses

One of TRACAB’s ambitious goals is to reconstruct a real match in 3-D in real time. This Thesis is a small part of this huge project and is devoted to estimating the position of the ball. There are many possible approaches to ball detection. The best and most widely used is conventional visual tracking. TRACAB has 8 pairs of cameras covering the whole pitch during matches, and by using stereo image processing they capture the movements of all objects, such as players, officials and the ball, at a rate of 25 images per second.

Figure 1.2: TRACAB: Covering the whole pitch

Image and video data are therefore used to track the players and the ball at every moment. TRACAB’s tracking of people is reliable, but its tracking of the ball is not. The reasons are explained below. When the algorithm fails at a certain moment, knowing where the ball was frequently makes it easy to predict where it will be in the near future. With this prediction, it is possible to search for the ball and find it. This method is sufficient in many situations. Moreover, improvements have been made to tracking algorithms and ball detection algorithms [2]. However, these algorithms still lose track of the ball at some points and then have to search the whole image to find it again. This is potentially a computationally expensive task and there is no certainty of finding the ball during this search.

There are often situations where detecting the ball is hard and, as noted above, tracking the ball is sometimes unreliable.
For example, the ball may be occluded by a player standing in front of it, or its appearance may be similar to a player’s sock or to another object, such as a bag, that may be on the pitch. Lighting is another common problem with this technique. In some stadiums, depending on the time of the match, some parts of the pitch are in sunlight and others in shadow. Moreover, the ball is small and can move very quickly, faster than the players, which is a problem for methods that analyze images: predicting its position from one frame to the next may require a large search area, as Figure 1.4 confirms. In addition, when the ball moves quickly its shape appears oval, and its appearance may vary a lot due to motion blur. Figure 1.5 illustrates situations where some of these problems could appear.

Figure 1.3: Sixteen cameras are installed in the stadium.

Figure 1.4: Ball, in green, moves faster than players, in red and blue. Then, predicting the position of the ball may require a large area

Figure 1.5: Different problems may appear in the scene causing unreliable ball tracking
(a) Ball appearance may be similar to player’s sock
(b) Ball appearance may be blurred and oval because of motion
(c) The ball may be occluded behind the players
(d) The ball could be in the sunlit part or in the shadowed one

This Thesis explores methods for highlighting where the football could be. These new algorithms are based on exploiting contextual information. Here the contextual information consists solely of the players’ positions on the pitch over time, and the ball position is estimated from this information alone. Intuitively, one might think that human actions are unpredictable because human behaviour is driven by impulse. Although many patterns can be detected, a person may at any time act in an unpredictable way. On the other hand, patterns are easy to find when players behave as they usually do, and then the ball position could be estimated with a small error. Hence, this work does not aim to find the exact ball position but to make the ball detector’s task easier and help it locate the ball faster.

The methods used in this Thesis are applied in two different scenarios. The first one is: given no information about the situation, use the positions of the players in a time window to predict the plausible regions where the ball could be. The second one is: given that the ball was in the possession of player x at time t, and given the positions of the players in the subsequent time window, locate the ball. The goal in both cases is to find ‘priors’ for the ball location. In other words, imagine that the ball detector has lost the ball in the image. This report then constructs priors, given the general situation, by marking the area where the ball is most likely to be.

1.3 Overview of how to exploit contextual information

As mentioned above, this work uses data provided by TRACAB. This data is a collection of the positions of the ball and of all players, indexed by the frame number of the football match video. The data is exploited as follows: it is collected and organized into a list of training data.
With machine learning, any new query is compared against the training data, and the algorithm returns its best estimate of the ball position. The estimate will be more or less precise depending on the learning algorithm. This Thesis uses one part of the provided data for learning and the other part for testing, in order to check how far the predicted ball position is from the real one. A detailed explanation of how the machine learning works is given in the next part of this report.

1.4 Supervised Learning Methods

Basically, the methods used in this work try to learn a mapping from inputs to outputs. The function being sought is too hard to specify explicitly, so this is not a true regression method. In order to learn this mapping, examples are stored in a database where every input has its own output. Afterwards, each new input is compared with the database and, depending on the algorithm used, the resulting output will be more or less close to the expected value. An infinitely large database cannot be stored, and searching a very large one would take too much time; on the other hand, a database with only a short list of examples would give results very far from the real value. The mechanisms used in this work to describe this mapping are nearest neighbour and distance-weighted nearest neighbour. A first introduction is given in [6]. The theory of this chapter has been studied from the book Machine Learning [3].

Figure 1.6: Inputs and outputs from the database are in black. New input and new output in red. This is a mapping from space X to the space O

Nearest neighbour finds the most suitable input in the database for the new input and returns the corresponding stored output as the output for the new input. The most suitable input is the one at minimum distance from the new input.
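The nearest neighbour lookup described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation (which was written in MATLAB): the function name `nearest_neighbour` and the toy database of two-dimensional inputs are assumptions made only for this example.

```python
import math

def nearest_neighbour(database, query):
    """Return (output, distance) of the stored example closest to the query.

    database: list of (input_vector, output_vector) pairs (hypothetical layout).
    query:    input vector with the same dimension as the stored inputs.
    """
    best_output, best_dist = None, math.inf
    for stored_input, stored_output in database:
        dist = math.dist(stored_input, query)  # Euclidean distance
        if dist < best_dist:
            best_output, best_dist = stored_output, dist
    return best_output, best_dist

# Toy database: each input maps to a made-up ball position (x, y) in meters.
database = [
    ((0.0, 0.0), (10.0, 5.0)),
    ((3.0, 4.0), (50.0, 30.0)),
    ((10.0, 10.0), (90.0, 60.0)),
]
estimate, distance = nearest_neighbour(database, (2.5, 3.5))
print(estimate, distance)
```

The search is a plain linear scan, which is enough to show the idea; the efficiency mechanisms cited in the text address exactly the cost of this scan on large databases.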
Additional mechanisms for learning this mapping efficiently have been studied in [7] and by Weber, Schek and Blott in [8] and [9]. Distance-weighted nearest neighbour, on the other hand, assigns a weight to each input in the database depending on its distance to the new input, so that closer neighbours receive more weight. The most suitable input in this case is the one with minimum distance once this weight is applied. Other alternatives are taken from [10], and a full review of metric spaces is given in [5]. The Euclidean distance is used to compute all distances in all the metric spaces used in this work.

Chapter 2
Data utilized in this work

2.1 Collection of needed data

TRACAB has provided the data from the match between AIK Solna and Örgryte in the Swedish football league, which took place in Rasunda Stadium, Sweden, on 21 September 2009. This data contains the X and Y coordinates of the players and referees, and the X, Y and Z coordinates of the ball on the pitch, for every frame that their algorithm and software could extract. Although they recorded 90 minutes of the match, ball data was tracked for only 66 of these 90 minutes. In total there are 98511 frames, since the video has 25 frames per second. The gaps in the ball data can be seen in Figure 2.1. This is normal behaviour for the ball tracker, since the ball spends a considerable time off the pitch.

Figure 2.1: Presence of ball data

The thick white line in the middle of the picture represents the break between the first and the second half. Thin white lines mark the moments where the ball is missing from the frames of the match, and they separate the sequences of the match. The data useful for our supervised learning task is the players’ positions in the frames where ball data also exists.
Although there are many frames where the ball is not tracked, the most common reason is that the ball is off the pitch. The tracking algorithm then has to be re-initialized: the program has to search for the ball on the pitch, where the problems mentioned in Section 1.2 may appear. Figures 2.2a and 2.2b show the last tracked frame of one sequence and the first tracked frame of the next. Thus, the algorithm needs a re-initialization to start tracking the ball when it is on the pitch again. Some examples of how TRACAB provides this data are given in the following tables.

Figure 2.2: The tracking program needs to be re-initialized after the ball goes out. The ball is represented as the big green point in both images
(a) In the last tracked frame from a sequence, the ball is almost outside
(b) First tracked frame from the next sequence, when the ball is on the pitch again

Player Data

Frame    Team 0/1   Player number   X coordinate   Y coordinate
163923   0          22              34,5660        51,0544
163923   0          24              43,2075        31,2596
163923   1          27              99,5190        33,9864
163923   1          20              73,4475        31,8104
163923   0          18              52,7835        32,3680
163923   1          4               71,5575        15,9120
163923   1          11              54,5475        42,2076
163923   1          3               73,3320        42,0648
163923   0          5               55,1985        62,7708
163923   1          10              52,9725        25,0512
163923   1          19              57,5505        50,3268
163923   1          7               58,2645        12,8860
163923   1          5               63,2205        35,8496
163923   0          40              52,9515        34,1224
163923   0          17              31,2585        21,7600
163923   0          7               44,2155        12,3692
163923   1          30              61,2465        28,9816
163923   0          30              0              33,5648
163923   0          6               46,6515        25,5816
163923   0          44              30,4185        39,2020
163923   1          18              72,6810        55,9912
163923   0          19              52,3215        12,2876

Ball Data

Frame    X coordinate   Y coordinate   Z coordinate
163923   52,8885        31,2188        0,3000
163924   52,7520        31,3004        0,3000
163925   52,5840        31,3956        0,3000
163926   52,2165        31,5724        0,3000

These coordinates have the origin in the top right of the screen.
What is more, a transformation is necessary to obtain the coordinates as the match is seen in the video. This is a simple transformation, where the X position is computed as X − 105 and the Y position as Y − 68, because the length and width of the pitch are 105 and 68 meters respectively. The example of player and ball data is illustrated in Figure 2.3.

Figure 2.3: Illustrated example of how TRACAB provides player and ball data. Red and blue points represent both teams and the green point is the position of the ball

Besides, they provided videos of the match. TRACAB installed four additional cameras covering the whole pitch to record the match. Their images are shown in Figure 2.4. These videos are very useful for checking, at certain points, what is really happening during the match. They also make it possible to confirm whether the players’ positions are correct, or whether TRACAB’s algorithm has made mistakes in a particular case.

Figure 2.4: TRACAB installed HD cameras to record the whole pitch and get a higher resolution video than that used for the tracking system
(a) Camera 1
(b) Camera 2
(c) Camera 3
(d) Camera 4

Chapter 3
Estimation of ball position

3.1 Nearest Neighbour Regression

The simplest method applied in this Master’s Thesis is to collect a database and then search for the frame that is most similar to the test frame. The database consists of the coordinates of each player for each frame of the learning data, and it is also called the training data. Thus, the algorithm has to learn a mapping from players’ positions to ball position. The feature vector is therefore made up of the two coordinates of each player at one frame.

\[
\vec{x}_{i,t} \in \Re^{2}, \qquad i = 1, \ldots, 22, \qquad t = 1, \ldots, T
\]
\[
\vec{X}_{t} = \left( \vec{x}_{1,t}^{\,T}, \vec{x}_{2,t}^{\,T}, \ldots, \vec{x}_{22,t}^{\,T} \right)^{T} \in \Re^{44}
\]
\[
f : \Re^{44} \rightarrow \Re^{2}, \qquad f(\vec{X}_{t}) \rightarrow \vec{x}_{b,t} \in \Re^{2} \tag{3.1}
\]

Here \vec{x}_{i,t} is the position of player i at time t on the pitch (coordinates X and Y), a matrix of dimension 2x1, and \vec{x}_{b,t} is the ball position. The function f is defined via nearest neighbours, and its argument \vec{X}_{t} is a matrix of dimension 44x1. The Euclidean distance is used to find the training example at minimum distance from the test data. It is defined in 3.2.

\[
d(\vec{X}_{a}, \vec{X}_{b}) = \left\| \vec{X}_{a} - \vec{X}_{b} \right\|^{2} \tag{3.2}
\]

This method has some problems. First of all, there may be a problem if a player swaps position with another player. For example, if a defender swaps position with a midfielder, the distance between the team formations at these two time instances increases. This problem may also appear if the coach orders players to change. It is also necessary to transform the coordinates of the second half so that the teams are on the same sides of the field in both halves; then both halves can be handled easily. The first results of this method were not encouraging: as can be seen in Figure 3.1, the players are almost in the same positions but the estimated ball is very far from its real position.

Figure 3.1: Test data against selected training data. It is selected because it has the minimum Euclidean distance to the test frame. Blue and red points represent both teams from one test frame. Magenta and cyan points are players’ positions selected as the nearest neighbour from the database. Although the players are approximately in the same position, in one case the ball is at the big green point while in the other case it is at the big black point. Hence, the estimation of the ball position is very far from the real position.

3.2 Nearest Neighbour regression with players in grid

In order to solve some of the problems that appeared in the last section, a rectangular grid is placed over the pitch. The pitch is thus divided into several regions, and the number of players in each region is counted. This method solves many problems.
For example, with the grid it is no longer necessary to transform the coordinates of the second half, nor to take changes in players’ positions into account, because players are no longer associated with a number. Therefore, the algorithm does not check the number of each player. The grid defines a histogram which shows where the players are in general. It also solves the problem that could appear if a player receives a red card during the match, because the histogram is normalized by the total number of players present on the pitch.

Grid defines N rectangular regions: R_1, ..., R_N

\[
\vec{H}_{t} = (h_{1,t}, \ldots, h_{N,t})^{T} \quad \text{where } h_{i,t} \in [0, 1]
\]
\[
h_{i,t} = \frac{1}{N_{t}^{P}} \sum_{j=1}^{N_{t}^{P}} \Im\left[ \vec{x}_{j,t} \in R_{i} \right] \tag{3.3}
\]
\[
\text{Euclidean Distance}: \quad d(\vec{H}_{a}, \vec{H}_{b}) = \left\| \vec{H}_{a} - \vec{H}_{b} \right\|^{2}
\]

N_t^P is the number of players on the pitch at time t. ℑ is the indicator function, which adds 1 if \vec{x}_{j,t} ∈ R_i. In addition, \vec{H}_t is a 2-D histogram of the players’ positions. It is necessary to blur this histogram in order to spread the players’ positions and smooth the borders of the grid. Therefore, each histogram is convolved with itself to smooth the borders of the regions, as can be seen in Figure 3.2. The players’ positions corresponding to these histograms can be seen in Figure 2.3. The following tables give the number of players in each region before and after blurring.

Before blurring:

0  0  0  0  2  1  0  0  0  0
0  0  1  0  1  1  1  0  0  0
1  0  1  0  1  3  2  0  0  1
0  0  0  1  0  2  1  0  0  0
0  0  0  0  0  1  1  0  0  0

After blurring:

 2   6   4   7  16  17   6   1   4   2
 2   6  12  10  16  20  14   4   2   2
 8   6  12   9  18  25  22   6   2   6
 6   2   8  10  10  18  18   8   0   4
 2   3   2   6   4  12  14   5   0   2

This idea is implemented in MATLAB, and new results are obtained. How to implement these methods in MATLAB has been studied from [11].

Figure 3.2: These figures illustrate histograms that summarize the players’ positions on the pitch
(a) Histogram of players’ positions
(b) Convolved histogram of players’ positions

The results of this nearest neighbour classification are represented in a histogram of the error, in meters, between the estimated ball position and the real ball position, together with the average error. The chosen grid has 10 x 5 bins covering the whole pitch, and the database is collected from all the available frames of the second half. In total, there are 39627 frames. There are 1005 test frames from the first half; they have not been taken randomly, but spaced far enough apart to represent the whole first half. The first histogram can be seen in Figure 3.3.

Figure 3.3: The histogram of errors with parameters: bins 10 x 5. Average Error: 22,54 meters. The error is measured by the distance from the estimated ball position to the real ball position

The size of the grid has been chosen to minimize the average error in the estimation of the ball position. A validation set, rather than the test set, is used to estimate this quantity: the free parameters are set according to performance on the validation set, and the test set is then evaluated with the parameter settings learned from the validation set. Figure 3.4 shows the same histogram as Figure 3.3, but with a grid of 20 x 10 bins. The average error is 24,04 meters, whereas with 10 x 5 bins it was 22,54 meters, as can be seen in the previous histogram. It is essential to note that the test frames are the same in both cases, so the different methods can be compared.

Figure 3.4: The histogram of errors with parameters: bins 20 x 10. Average Error: 24,04 meters.

3.3 Nearest Neighbour with players in grid and temporal window

As has been seen before, at different time instances the players can be in very similar positions while the ball is at very different locations.
In order to try to solve this problem, a temporal window is created. This temporal window introduces the concept of a sequence. The histogram that was compared to find the nearest neighbours is now concatenated with the histograms taken a number of time steps forward and backward. The selected nomenclature takes the grids of the frames following this rule:

t − K·S, ..., t − 2·S, t − S, t, t + S, t + 2·S, ..., t + K·S

\[
\vec{H}_{t,K} = \left( H_{t-KS}^{T}, H_{t-(K-1)S}^{T}, \ldots, H_{t}^{T}, \ldots, H_{t+(K-1)S}^{T}, H_{t+KS}^{T} \right)^{T} \tag{3.4}
\]
\[
\text{Euclidean Distance}: \quad d(\vec{H}_{a,K}, \vec{H}_{b,K}) = \left\| \vec{H}_{a,K} - \vec{H}_{b,K} \right\|^{2}
\]

Here K is the width of the temporal window, S is the step of the temporal window and t is the "present" of the tested frame. K equal to zero corresponds to the experiment carried out before. The distance to be minimized is written as:

\[
\text{Distance} \equiv \sum_{t \in \{-K \cdot S, \ldots, +K \cdot S\}} \; \sum_{i=1}^{N} \left\| X^{\text{test}}_{i,\,t_{j}+t} - X^{\text{training}}_{i,\,t_{j}+t} \right\|^{2} \tag{3.5}
\]

First of all, K = 1 and S = 10 frames are selected as parameters. This means that a sequence of 20 frames is used in total, almost one second. In this situation the distance to be minimized is:

\[
\text{Distance} \equiv \sum_{t \in \{-10,\, 0,\, +10\}} \; \sum_{i=1}^{N} \left\| X^{\text{test}}_{i,\,t_{j}+t} - X^{\text{training}}_{i,\,t_{j}+t} \right\|^{2} \tag{3.6}
\]

The same 1005 frames are used for testing and the same 39627 database frames are used as training data. Results are shown in the following histogram, where the average error is 23,03 meters in this case.

Figure 3.5: The histogram of errors with parameters: K = 1, bins 10 x 5. Average Error: 23,02 meters

Therefore, the temporal window has to be longer. The next experiment explores the effect of varying K > 1. The reason why the training data consists of exactly 39627 frames is explained now.
Although the whole match is composed of 98511 frames, the 49279 frames of the first half are dedicated to testing and the remaining 49232 frames to learning data. The problem is that, as explained before, there are holes in the data where ball data is missing for some frames, and not all frames are useful for training: for each frame, all the frames from −K·S to +K·S relative to it must exist. With K = 10 at most and S = 10 frames, the total number of useful frames is 39627 in the second half and 38185 in the first half. There are fewer useful frames because the 66 minutes of ball data are not continuous, as shown in Figure 2.1. The histogram in Figure 3.7 has been calculated using K = 10, that is, a temporal window from frame_i − 100 to frame_i + 100. In seconds, the temporal window has a duration of 8 seconds.

Figure 3.6: The histogram of errors with parameters: K = 5, bins 10 x 5. Average Error: 22,02 meters

With K = 10 the average error improves by a little over 2 meters with respect to K = 1. Errors between 0 and 20 meters also account for the largest number of frames. In conclusion, the method has improved in general. To observe the improvement for different values of K, another experiment has been carried out, using different values of K for one particular frame. The errors, which tend to decrease as K increases, are given in the following table. All experiments correspond to the same frame.

K     Error (meters)
1     17,07
2     34,54
3     4,93
4     7,29
5     6,06
6     7,29
7     6,06
8     3,97
9     2,29
10    3,97

Figure 3.7: The histogram of errors with parameters: K = 10, bins 10 x 5. Average Error: 20,81 meters

The improvement shown in the table can be visualized in the next figures.
The estimated position of the ball (red point) and the real position of the ball (green point) are drawn on each pitch. In all these cases, 1005 frames are used for testing. The average error would presumably decrease if more test frames were used.

Figure 3.8: Figures showing the distance between the estimated (in red) and the real position of the ball (in green) for K = 1, ..., 9 on the pitch
(a) K = 1, Error: 17,07 m.
(b) K = 2, Error: 34,54 m.
(c) K = 3, Error: 4,93 m.
(d) K = 4, Error: 7,29 m.
(e) K = 5, Error: 6,06 m.
(f) K = 6, Error: 7,29 m.
(g) K = 7, Error: 6,06 m.
(h) K = 8, Error: 3,97 m.
(i) K = 9, Error: 2,29 m.

Chapter 4
Heat Maps: Scenario given no prior information

The previous experiments have shown that the estimated ball position often falls within some range of the real position. Heat maps are covered in two chapters, which describe two different football situations. In both of them, the objective is to apply the learned mapping to select the zones where the ball is most likely to be found. Therefore, the same supervised learning methods are used in both scenarios, and only the feature vector changes in each case.

4.1 Scenario 1: General Situations

This part of the Master’s Thesis describes what is called a kernel density estimation. A grid of bins is created, but this time for the ball. In this new grid, each bin receives a score that depends on the distance between the players in the test frame and the players in each example of the learning data. This score is expressed as a probability:

\[
p(\text{ball in region } j) \propto \sum_{i \;\text{s.t.}\; \vec{x}_{b,i} \,\text{in region}\, j} w_{i}, \qquad w_{i} = e^{-\lambda \left\| x_{i} - x_{\text{test}} \right\|^{2}} \quad \text{for } i = 1, \ldots, N_{\text{training examples}} \tag{4.1}
\]

Scores are normalized for each frame to help the visualization of the experiment. When these scores are mapped to a grey scale, the whitest bins correspond to the highest scores and the blackest ones to the lowest.
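The scoring rule of equation (4.1) can be sketched as follows. This is a hedged illustration rather than the thesis code: the function name `heat_map`, the value of `lam`, the normalization by the maximum score and the toy two-dimensional features are all assumptions made for the example; the 20 x 10 ball grid over a 105 x 68 meter pitch is the one used in this chapter.

```python
import math

def heat_map(test_feature, training, lam=1.0,
             bins_x=20, bins_y=10, length=105.0, width=68.0):
    """Score each ball bin by summing kernel weights of training examples.

    training: list of (feature_vector, (ball_x, ball_y)) pairs (hypothetical layout).
    Returns a bins_y x bins_x grid scaled so that the best bin equals 1.
    """
    cell_w, cell_h = length / bins_x, width / bins_y
    scores = [[0.0] * bins_x for _ in range(bins_y)]
    for feature, (ball_x, ball_y) in training:
        sq_dist = sum((a - b) ** 2 for a, b in zip(feature, test_feature))
        weight = math.exp(-lam * sq_dist)  # w_i in equation (4.1)
        col = min(max(int(ball_x // cell_w), 0), bins_x - 1)
        row = min(max(int(ball_y // cell_h), 0), bins_y - 1)
        scores[row][col] += weight
    top = max(max(row) for row in scores)
    if top > 0:
        # Assumed normalization: divide by the maximum so the best bin is white.
        scores = [[s / top for s in row] for row in scores]
    return scores

# Toy data: two training frames resemble the test frame, one does not.
training = [
    ((0.10, 0.20), (53.0, 35.0)),
    ((0.10, 0.25), (54.0, 36.0)),
    ((0.90, 0.80), (10.0, 10.0)),
]
scores = heat_map((0.10, 0.20), training, lam=5.0)
print(scores[5][10], scores[1][1])
```

In this toy example the two similar frames place the ball in the same bin, so that bin collects almost all of the weight, while the dissimilar frame contributes a score close to zero.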
This method is used to see whether the bin selected through the minimum distance between player histograms is also the highest-scored bin, or whether another bin nearer to the real position of the ball scores higher. In other words, the purpose of this chapter is not to produce an explicit estimate of the ball position, but to select zones where the ball is located with higher probability. Examples of these heat maps are shown below.

Figure 4.1: Example of a heat map. The green point is the real position of the ball and the red point is the position estimated with the method explained in 3.3. The whitest bin is the highest-scored one, which is supposed to be the most probable zone for the real position of the ball. Therefore, the heat map works perfectly in this case

Figure 4.1 displays a heat map where the real position of the ball, the big green point, lies inside the white bin, which is the highest-scored bin. This bin is supposed to be the most probable zone for the real position of the ball, so the heat map method works perfectly in this case. The estimated position of the ball, the big red point, is 10 meters away from the real position. Hence, in this case the heat map also works better than the nearest neighbour regression method.

In addition, Figure 4.2 is the same as Figure 3.8, but with heat maps included. The heat maps show how the highest-priority zones get closer to the real ball position as the value of K, defined in 3.3, increases. However, heat maps do not give an exact ball position with particular x and y coordinates. Nevertheless, since the length and the width of the bins are 5,25 and 6,8 meters respectively, the error in the estimated ball position with this method would be ± 2,625 meters for the x coordinate and ± 3,4 meters for the y coordinate, as Figure 4.3 shows.
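As a worked check of these figures, assuming a grid of 20 x 10 bins over the 105 x 68 meter pitch, the bin dimensions and the corresponding half-widths are:

```latex
\[
\Delta x = \frac{105}{20} = 5.25\ \text{m}, \qquad
\Delta y = \frac{68}{10} = 6.8\ \text{m}
\]
\[
\pm \frac{\Delta x}{2} = \pm 2.625\ \text{m}, \qquad
\pm \frac{\Delta y}{2} = \pm 3.4\ \text{m}
\]
```

The error bound is simply half of the bin size in each direction, which is the largest possible distance from the bin centre to a ball that lies inside the bin along that axis.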
These error bounds depend on the number of bins used to divide the pitch. As mentioned before, 20 x 10 bins are selected for this experiment in order to reduce the possible positions of the ball to regions.

Figure 4.2: Heat maps showing the distance between the estimated and the real position of the ball on the pitch for K = 1, ..., 9. Panels: (a) K = 1, error 17.07 m; (b) K = 2, error 34.54 m; (c) K = 3, error 4.93 m; (d) K = 4, error 7.29 m; (e) K = 5, error 6.06 m; (f) K = 6, error 7.29 m; (g) K = 7, error 6.06 m; (h) K = 8, error 3.97 m; (i) K = 9, error 2.29 m. The grey scale changes as the density of players changes with the team formation. The real ball position is the large green point, and the large red point is the ball position estimated with the method studied in Section 3.3. The highest-scoring region, shown as the whitest region below the green point, is very near the real position of the ball.

The experiment checks whether the real position of the ball corresponds to the highest-scoring region. The results show that this does not hold in most cases. The outcome is expressed as a percentage that indicates how near the real position of the ball is to the highest-scoring bin. After all, the aim of this chapter is to give an idea of where the ball is located with higher probability.

Figure 4.3: Heat maps show a region where the ball could be found; the error of the estimate is therefore ± 2.625 meters in the x coordinate and ± 3.4 meters in the y coordinate.

In conclusion, heat maps show how the zones where the ball is most likely to be located change as the density of players changes. Sometimes more than one bin obtains the highest score, so more than one bin can be completely white.
Besides, the bin in second position (or in third or fourth position) in the table of highest-scoring bins could be a perfect candidate, and it may be the bin where the real position of the ball is located. For this reason, histograms of the values obtained by the five highest-scoring bins have been calculated; see Figure 4.5. Since the highest-scoring bin always has a value of 1, which means that it is completely white, its histogram is trivial and is not shown.

Figure 4.4: Heat maps show a larger region because the neighbours have been included. The ball could be found in this region; the error of the estimate is therefore ± 7.875 meters in the x coordinate and ± 10.2 meters in the y coordinate.

To obtain these results, it is reasonable to set a threshold below the value 1. If the real position of the ball is located in a bin whose value is above this threshold, the test counts the bin estimate as successful. Finally, the eight neighbours of the bin where the ball is located are included and the percentage is computed again. This gives an important improvement, although the error in the estimated ball position increases because the predicted region is now larger. The error in this case is ± 7.875 meters in the x coordinate and ± 10.2 meters in the y coordinate, as illustrated in Figure 4.4.

Figure 4.5: Histograms of the score of the second, third, fourth and fifth bin in the table of highest-scoring bins. Panels: (a) second position; (b) third position; (c) fourth position; (d) fifth position.

The evolution of the histogram that shows the probability of selecting a correct bin or region can be seen in Figure 4.6. The percentages that summarize how well this method works are illustrated in Figure 4.7.
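The success test described above can be sketched as follows. The text does not spell out exactly how the eight neighbours enter the computation, so this sketch adopts one possible reading as an assumption: a frame counts as a success if the highest normalized score within the region (the ball's bin alone, or the 3 x 3 block around it) exceeds the threshold. The function names and the default threshold of 0.75 (the value used in Figure 4.6) are illustrative.

```python
def is_success(scores, ball_row, ball_col, threshold=0.75, with_neighbours=False):
    """True if the region holding the real ball scores above the threshold.

    scores: normalized heat map as a list of rows (best bin equals 1).
    Assumed reading: with neighbours, the best score in the 3 x 3 block counts.
    """
    n_rows, n_cols = len(scores), len(scores[0])
    reach = 1 if with_neighbours else 0
    best = 0.0
    for r in range(max(0, ball_row - reach), min(n_rows, ball_row + reach + 1)):
        for c in range(max(0, ball_col - reach), min(n_cols, ball_col + reach + 1)):
            best = max(best, scores[r][c])
    return best > threshold

def success_rate(cases, **kwargs):
    """Percentage of (scores, ball_row, ball_col) cases counted as successes."""
    hits = sum(is_success(s, r, c, **kwargs) for s, r, c in cases)
    return 100.0 * hits / len(cases)
```

A 3 x 3 block extends 1.5 bins on either side of its centre, which gives the ± 7.875 m (1.5 x 5.25) and ± 10.2 m (1.5 x 6.8) quoted above.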
Figure 4.6: Histograms of the value of the bin where the real position of the ball is located. Panels: (a) histogram over the values 0, 0.1, ..., 0.9, 1; (b) histogram with a threshold at 0.75; (c) histogram with a threshold at 0.75 and with the neighbours of the bin included.

Figure 4.7: Percentages of the score of the enlarged region, once its neighbours are included, where the real ball position is located. As can be seen, only 21 % of the cases are successful.

Chapter 5

Heat Maps: Scenario where players pass the ball

5.1 Scenario 2: Particular Situations

The second scenario investigated for ball prediction is a pass during the match. This section describes a method that determines, when there is a pass, which player is most likely to receive the ball. The zone where the ball is most likely to be located in the near future can therefore be predicted. The prior information given is that a player has the ball in a certain frame and is going to pass it.

First of all, we define when a completed pass occurs. This is the case when:

1. A player, the passer, currently has possession of the ball.
2. Another player receives the ball, that is, he obtains possession of it.
3. The player who receives the ball is a teammate of the player who passed it.

Summary of a pass:

1. When a pass is detected, the following parameters of that pass can be calculated:
   a) The team that completed the pass.
   b) The position and the identity of the player who passes.
   c) The position and the identity of the player who receives.
   d) ρ: the distance of the pass in meters.
   e) θ: the angle of the pass in degrees. Horizontal passes, orthogonal to the goal line, are considered to have zero degrees.
   f) The duration of the pass, computed in frames and in seconds.
   g) The velocity of the ball, computed as an average in meters per second and in kilometers per hour.
   h) The half in which the pass took place, first or second. This is very useful for representing the pass in a figure.
2. If the player who receives the ball belongs to the opposing team, the pass is cancelled, but he obtains possession of the ball.

Detection of passes in a game:

1. A player has possession of the ball when the following conditions are fulfilled:
   a) The ball is inside an imaginary circle around him.
   b) The radius of this circle could be varied, but it has been fixed to 1.5 meters in this work.
   c) The height of the ball, coordinate Z, must be less than 2 meters, because the ball often passes above a player who cannot actually touch it. In that situation he is considered not to be in possession of the ball.
   d) If the ball lies inside the circles of two players, the player with the minimum distance to the ball has possession.
      i. A problem appears in this case. If both players belong to the same team and possession switches from one to the other, a pass is recorded, but its velocity and duration cannot be taken into account because the pass would last only one frame. The computed velocity would therefore be huge and physically impossible.
2. The velocity of the pass has to be less than 40 meters per second, that is, about 144 kilometers per hour. Passes with a velocity above this threshold are not recorded because they are not true passes. Such a conservative speed threshold is selected because sometimes a player kicks the ball hard and it still reaches a teammate who manages to control it. Experiments estimating the velocity of a kicked ball can be seen in [14].
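The possession rules and the speed filter listed above can be sketched as follows. The thresholds (1.5 m, 2 m, 40 m/s) come from the text; everything else is an assumption made for illustration, in particular the function names, the dictionary of player positions and the `fps` parameter, since the frame rate of the tracking data is not given in this section.

```python
import math

POSSESSION_RADIUS_M = 1.5   # radius of the possession circle (from the text)
MAX_BALL_HEIGHT_M = 2.0     # ball must be below this height (from the text)
MAX_PASS_SPEED_MPS = 40.0   # faster "passes" are discarded (from the text)

def player_in_possession(ball_xyz, players):
    """Return the id of the player in possession, or None.

    players: hypothetical dict mapping a player id to (x, y) in meters.
    """
    bx, by, bz = ball_xyz
    if bz >= MAX_BALL_HEIGHT_M:
        return None                      # ball is flying over the players
    best_id, best_dist = None, POSSESSION_RADIUS_M
    for pid, (px, py) in players.items():
        dist = math.hypot(px - bx, py - by)
        if dist <= best_dist:            # inside the circle and closest so far
            best_id, best_dist = pid, dist
    return best_id

def is_valid_pass(passer_xy, receiver_xy, n_frames, fps):
    """Keep a pass only if its average speed is below the 40 m/s limit.

    fps is the frame rate of the data (an assumed input, not from the text).
    """
    if n_frames <= 0:
        return False
    distance = math.hypot(receiver_xy[0] - passer_xy[0],
                          receiver_xy[1] - passer_xy[1])
    speed = distance / (n_frames / fps)  # average speed in meters per second
    return speed < MAX_PASS_SPEED_MPS
```

The one-frame handover between two nearby teammates described in item d) i. is rejected by the same filter whenever the resulting average speed exceeds the limit.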
Besides, when the ball is thrown rather than kicked, its coordinates are not tracked accurately and a player receives the ball very quickly.

MATLAB code has therefore been written to obtain the information and the parameters of all the completed passes of the match. It is based on loops that record, for every frame, the number of the player and the team in possession of the ball, or the fact that nobody has possession. Basically, whenever possession changes from one player to another player of the same team, a pass is recorded. Following the definition of a completed pass given above, the distances from all players to the ball have to be computed in order to decide who has possession when two players are very close to the ball.

However, there are many situations in which this method can fail. For example, if a player has the ball and an opponent runs up behind him and kicks it, the program will not recognize that it is actually the opponent who kicks the ball and not the player it considers to be in possession. Otherwise, the pass detector works quite well and produces coherent pass parameters. For every frame, a picture illustrates the current situation on the pitch. An example can be seen in Figure 5.2.

Figure 5.1: The parameters of each completed pass are saved in a text file.

Moreover, the parameters of each completed pass are saved in a text file. An example of how these parameters are saved can be seen in Figure 5.1.

Figure 5.2: (a) Part 1: player 30 of the red team has possession of the ball. (b) Part 2: the ball goes towards the position of player 4. In these pictures, the ball is the green point, the teams are represented with red and blue dots, and their circles of possession are also drawn.
The sequence shows a pass from player 30 to player 4 of the red team. In 5.2a player 30 has possession of the ball, and in 5.2b the ball goes towards player 4. The second part of this sequence can be seen in Figure 5.3. The player in possession of the ball is indicated at the top of each picture.

Figure 5.3: (a) Part 3: the ball continues towards the position of player 4. (b) Part 4: player 4 obtains possession of the ball. These figures are the second part of the sequence shown in Figure 5.2, a pass from player 30 to player 4 of the red team. The ball goes to player 4 in 5.3a; player 4 gets the ball in 5.3b and the pass is completed. The player in possession of the ball is indicated at the top of each picture.

In this particular match, 838 completed passes are recorded. At first, they are classified by their distance and angle, and polar coordinates are used to help this classification. Polar coordinates have been reviewed in [12], and further background has been taken from [13]. A coordinate transformation is necessary for passes in the second half and also depending on the team to which they belong. With this rule, forward passes always have zero degrees, that is, they point to the right, even if they happen in the second half or belong to the team that attacks towards the left.

Figure 5.4: Distance and direction of the passes of each team. (a) Scatter plot of the red team; forward passes point to the right. (b) Scatter plot of the blue team; forward passes point to the left. The red team completed 546 passes and the blue team only 292. Furthermore, most of the passes are forward.

Figure 5.4 shows two scatter plots, one per team, in which all the passes are classified by their distance and angle.
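The polar parameters of a pass and the direction normalization can be sketched as below. The boolean `attacks_right`, which would be derived from the team and the half, and the choice of a 180 degree rotation as the coordinate transformation are assumptions; the text only states that forward passes must end up at zero degrees.

```python
import math

def pass_polar(passer_xy, receiver_xy, attacks_right):
    """Return (rho, theta): pass distance in meters and angle in degrees.

    theta = 0 is a straight forward pass for the passing team.
    attacks_right is a hypothetical flag: True if the passing team is
    attacking towards increasing x in the current half.
    """
    dx = receiver_xy[0] - passer_xy[0]
    dy = receiver_xy[1] - passer_xy[1]
    if not attacks_right:
        dx, dy = -dx, -dy                     # rotate the pitch by 180 degrees
    rho = math.hypot(dx, dy)
    theta = math.degrees(math.atan2(dy, dx))  # angle in [-180, 180]
    return rho, theta
```

After this normalization, passes from both teams and both halves can be pooled in the same (ρ, θ) space.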
The analysis of the passes may reflect the course of the match. For example, in this match 546 passes were recorded for the red team and 292 for the blue team, and the red team won the match by 3 goals.

5.1.1 Scenario 2: Feature Vector

The feature vector for this scenario consists of a grid of bins in polar coordinates. There are 10 regions in the radial direction, spaced irregularly (non-linearly) from 1 meter to 100 meters, and 16 regularly spaced regions in the angular direction. The positions of the players relative to the player in possession are recorded in the corresponding bins. The radial direction is spaced irregularly because there tend to be more players close to the passer than far away from him. An example of relative positions can be seen in Figure 5.5.

Figure 5.5: Positions of every player relative to the player who passed the ball, who is situated at the center of the figure. This particular pass belongs to the blue team and is drawn as a green line.

A problem may appear if a player is located between two bins or near a border; it is solved by smoothing the players' positions. To smooth the border between each bin and its neighbours, the array cannot simply be convolved as was done in the other scenario. This time the grid is in polar coordinates, which means that the last element of each row is a neighbour of the first one. To solve this problem, each player's position is represented as a white point on a black background and each image is blurred with a Gaussian kernel. Each player is thus represented as a number of points around his position, and each of these points has a weight given by the Gaussian function. This idea can be better understood from Figures 5.6 and 5.7.
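The construction described above can be sketched without an image library by spreading each player over a small patch of Gaussian-weighted points in Cartesian space and binning every point in polar coordinates, so that the angular wrap-around is handled by the angle computation itself. This is a simplified stand-in for the image blurring used in the thesis. The logarithmic spacing of the radial edges, the 5 x 5 patch, the value of `sigma` and the clamping of distances outside 1 to 100 m are all assumptions; the text only says that the radial spacing is non-linear.

```python
import math

N_RADIAL, N_ANGULAR = 10, 16
# Assumed logarithmic spacing of the 11 radial edges between 1 m and 100 m.
RADIAL_EDGES = [10 ** (2 * k / N_RADIAL) for k in range(N_RADIAL + 1)]

def polar_bin(dx, dy):
    """Map an offset from the passer (meters) to (radial index, angular index)."""
    r = math.hypot(dx, dy)
    r_idx = 0
    while r_idx < N_RADIAL - 1 and r >= RADIAL_EDGES[r_idx + 1]:
        r_idx += 1                                   # clamps r >= 100 m to the last ring
    angle = math.atan2(dy, dx) % (2 * math.pi)       # wrap into [0, 2*pi)
    a_idx = min(int(angle / (2 * math.pi / N_ANGULAR)), N_ANGULAR - 1)
    return r_idx, a_idx

def feature_vector(passer_xy, players_xy, sigma=1.0):
    """Flattened 10 x 16 grid; each player spreads unit weight over a 5 x 5 patch."""
    steps = (-2, -1, 0, 1, 2)
    # Gaussian weight of the point at offset (i*sigma, j*sigma) from the player.
    kernel = {(i, j): math.exp(-(i * i + j * j) / 2.0) for i in steps for j in steps}
    total = sum(kernel.values())
    grid = [0.0] * (N_RADIAL * N_ANGULAR)
    for px, py in players_xy:
        for (i, j), w in kernel.items():
            dx = px + i * sigma - passer_xy[0]
            dy = py + j * sigma - passer_xy[1]
            r_idx, a_idx = polar_bin(dx, dy)
            grid[r_idx * N_ANGULAR + a_idx] += w / total
    return grid
```

A player standing exactly on the border between the first and the last angular sector then contributes weight to both sectors, which is the behaviour that a plain convolution of the flattened array would miss.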
All these points are kept in their corresponding bins, which creates the feature vector for each pass.

Figure 5.6: Positions of the players relative to the player who passed the ball. (a) Original image, not blurred. (b) Image blurred with a Gaussian function. Both pictures represent the same situation, but in 5.6a the players are single white points and in 5.6b each player is represented as blurred points. This method solves the problem that could appear if a player is located between two or more bins.

Once the relative positions of all the players for each completed pass are kept in an array, the next step is to learn a mapping. This mapping is learnt from inputs, the relative positions of the players, to outputs, the length and direction of each pass. In this case all passes are independent of each other, so each pass can be used as one test. The number of tests is therefore 838, the total number of completed passes recorded for this match.

Figure 5.7: The matrix that stores the weight of the players in the bins, and thus represents the feature vector, is flattened into a 1-D array so that it can be displayed. (a) Weight of the players in the bins without blurring their positions. (b) Weight of the players in the bins after blurring their positions. The effect of blurring the players' positions relative to the passer can be seen in 5.7b: if a player is between two or more bins, he gives weight to each of them. This cannot happen in 5.7a, where each player gives weight to only one bin.

In addition, supervised learning methods are used to learn this mapping: the distance from the test feature vector to the feature vectors of all the training examples is calculated. The most suitable passes for each situation can then be selected; they are also called the most similar nearest neighbours. The mathematical expression is given in the following equation.
The nearest neighbour is found by minimizing this expression, where X denotes the feature vector space and the distance is the Euclidean distance. The expression is:

$$\text{Distance} \equiv \operatorname*{argmin}_{1 \le i \le N} \lVert X^{\text{test}} - X_i^{\text{training}} \rVert^2 \tag{5.1}$$

The purpose of this chapter is to compute, from the list of passes with their corresponding relative positions of the players, the probability that the ball will be located in each region in the near future. These probabilities are easy to determine because it is known where the teammates of the passer are and which passes are most common in each situation. The training data are therefore restricted to passes whose distance and angle from the passer point to a region containing a teammate. Hence, only a subset of the training data is used to compute the score of each region, following this expression:

$$\text{score}_{\text{region } j} = \text{score}_{\text{region } j} + e^{-\lambda \lVert X^{\text{test}} - X_i^{\text{training}} \rVert^2} \tag{5.2}$$

To illustrate the scores, a heat map is created in which a grid of 15 x 8 bins covers the whole pitch. The bins where teammates are located are drawn in a grey scale in which the whitest region corresponds to the zone where the ball is most likely to go. Figure 5.8 summarizes this concept.

Figure 5.8: Heat map of a particular pass. In this case the blue team has possession of the ball. The player in possession is the yellow point, and the green point shows where the ball finally went in this situation. On some occasions players do not pass the ball directly to another player. As can be seen in the figure, the bins where the teammates are located are drawn in a grey scale that depends on the probability of each player receiving the ball.
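The accumulation of Equation (5.2) can be sketched as follows. The data layout is an assumption made for illustration: each training pass is a pair of its feature vector and the pitch region in which that pass would end if it were played from the current passer's position, and `teammate_regions` lists the regions that currently contain a teammate. The function name and the default `lam` are hypothetical as well.

```python
import math

def region_scores(test_fv, training, teammate_regions, lam=1.0):
    """Accumulate Eq. (5.2) over the training passes that reach a teammate.

    training: list of (feature_vector, destination_region) pairs, where the
    destination is assumed to be precomputed for the current passer position.
    """
    scores = {region: 0.0 for region in teammate_regions}
    for fv, dest in training:
        if dest not in scores:
            continue                      # pass ends where no teammate stands
        d2 = sum((a - b) ** 2 for a, b in zip(test_fv, fv))
        scores[dest] += math.exp(-lam * d2)
    return scores
```

The region with the largest accumulated score, for example `max(scores, key=scores.get)`, is the one drawn whitest in the heat map.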
In the example of Figure 5.8, the ball was passed to the zone with the highest probability, the whitest one.

Finally, the last step of the method assigns these percentages to the teammates. For each pass, every teammate obtains a percentage that expresses how likely he is to gain possession of the ball in the near future. It depends on how far he is from the player who currently has the ball and on how the opponents are placed on the pitch. The teammates are then ranked by this percentage. When two players are in the same region, the percentage of that region is split between them. Which individual player is going to receive the ball is not analyzed in this work. Figures 5.9 and 5.10 show two examples in which a player has possession of the ball and the percentages of his teammates are analyzed. Players shown without an explicit percentage have obtained zero percent.

Figure 5.9: In this example the red team has possession of the ball, and the yellow point marks the position of the player who currently has it. His teammates are labelled with their numbers and their percentages of receiving the ball in the near future. The percentage of a player depends on the colour of his region. The percentages are listed from the highest to the lowest. This example is selected because it contains all the particular situations. Players 18, 4 and 27 have the same percentage although they are not in the same region. Players 10 and 19 share the percentage of their region. The green point shows where the ball is actually located in the near future. Player 7 is situated in that region, and he had the highest percentage of receiving the ball.
In conclusion, the method has worked perfectly here because it correctly estimates the region where the ball could be found in the near future.

Figure 5.10: Another selected example. In this case player 30 obtains the highest probability of receiving the ball. He belongs to the blue team, so the numbers of his teammates are shown in the figure. The result is very reasonable because he is by far the closest to the player who has the ball. He therefore obtains almost 90 % probability of receiving the ball, and the ball actually went to his region, as the green point shows. Some players are not in the percentage list because they have not received any score.

5.1.2 Results and Conclusions

In conclusion, players nearer the ball have a higher probability of obtaining the ball in the near future. When a defender in possession passes the ball to the strikers, who have no priority, this method does not work efficiently. Although the most similar situations are selected with supervised learning methods, short passes are more common than long ones, so the teammates closer to the passer acquire higher percentages.

The results are expressed as percentages. To demonstrate how successful this method can be, the number of times that the whitest bin was actually the region to which the player passed the ball is counted. In almost 60 % of the cases the ball went to the first or the second most probable region, as can be checked in Figure 5.11. This method could therefore be used as an alternative when tracking is not enough to find the ball. In most cases it will successfully decide which zone is the most suitable one in which to look for the ball after a reinitialization of the system that analyzes the image.

Figure 5.11: Percentage of the cases in which the ball actually went to the most probable zone, and to the second, third, fourth and fifth most probable zones.

Chapter 6

Conclusions and Future Work

As the error histograms of the first scenario show, the results are not as good as expected. There are different ways to improve them. Firstly, a better database has to be collected. This means not only storing more ball data but also having cleaner data. For example, misleading data can appear when the referee blows the whistle and the players stop; the ball may then not be in any expected position because the situation is not valid play. This could be improved by adding sound to the videos or simply by selecting the data carefully. Including the height of the ball above the pitch could add more information to the feature vector, but then the database would have to be huge. The height of the ball is only used in the second scenario, to decide whether a player can touch the ball and have possession of it. The most important improvement could come from complementing the feature vector with the velocity and acceleration of the players, but in order to estimate where the ball is at every moment, the database would have to be built from many scenarios and many different matches.

On the other hand, in the second scenario it could be more useful to calculate the probabilities taking into account whether any opponent is near a teammate. In that case, his percentage of receiving the ball would be lower. Velocity and acceleration in the feature vector could add more information about the situation, because in this scenario there is no temporal sequence.
Another suggestion for the second scenario is to use irregular, circular or oval regions in which teammates could receive the ball from the player in possession. With rectangular regions there may be some variation when a player is near the border of a region. This improvement has not been implemented in the project because it is difficult to decide how those regions should be shaped when two teammates are very near each other, which is a geometric problem. With circular regions, the heat maps could look as in Figure 6.1.

Figure 6.1: How heat maps look when circular regions are used in which teammates may receive the ball from the player who currently has possession. In this case the regions with a higher probability of having the ball in the near future are painted in black.

In addition, since the lengths of the passes recorded in the match are known, it could be possible to predict where the player who is going to receive the ball could be. It may then be possible to predict how far that player could move, depending on the length of the pass. The length of a pass would depend on the distance from each teammate to the player who has the ball. In this work, only the positions of the teammates at the moment when the passer has the ball are taken into account.

Additionally, the data provided by TRACAB are not 100 % accurate. This project would help them to locate the ball better, but at the same time it has to deal with their tracked ball data. Sometimes these data are wrong and impossible situations appear, for example when the ball is in a particular position and appears 2 or 3 meters away in the next frame. It is impractical to check frame by frame for mistakes because there are almost 100,000 frames. An example of this event can be seen in Figures 6.2 and 6.3.
Figure 6.2: (a) Player 22 has possession of the ball in frame 2070. (b) Player 22 has possession of the ball in frame 2071. In this sequence it can be seen how possession of the ball changes from player 22 to player 44 in one frame, although they are far from each other. The second part of this sequence can be seen in Figure 6.3.

Figure 6.3: (a) Player 44 has possession of the ball in frame 2072. (b) Player 44 continues in possession of the ball in frame 2075. This figure is the second part of the sequence shown in Figure 6.2. Player 44 obtains possession of the ball in one frame, although the two players are far from each other. The third part of this sequence can be seen in Figure 6.4.

Figure 6.4: (a) Nobody has possession of the ball in frame 2076. (b) Nobody has possession of the ball in frame 2077. This figure is the third part of the sequence shown in Figures 6.2 and 6.3. Suddenly the ball appears near player 22 again, and it seems that he has passed the ball to the goalkeeper.

References

[1] TRACAB. http://www.tracab.com, 2003.
[2] Rafael Osorio. “Ball detection via Machine Learning”. Master's Thesis, KTH, Stockholm, Sweden, 2009.
[3] Tom M. Mitchell. “Machine Learning”. Pages 230-249, 1997.
[4] Richard Hartley and Andrew Zisserman. “Multiple View Geometry in Computer Vision”. Second Edition, 2003.
[5] Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal and Michal Batko. “Similarity Search - The Metric Space Approach”. Advances in Database Systems, Vol. 32, 2006.
[6] Beyer, K., Goldstein, J., Ramakrishnan, R., and Shaft, U. “When is nearest neighbor meaningful?”. Proceedings of the 7th ICDT, Jerusalem, Israel, 1999.
[7] Arya, S., D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Y. Wu. “An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions”. Journal of the ACM, vol. 45, no. 6, pp. 891-923.
[8] Weber, Schek, Blott. “A quantitative analysis and performance study for similarity search methods in high dimensional spaces”. Zurich, Switzerland.
[9] Weber, Blott. “An Approximation-Based Data Structure for Similarity Search”. Zurich, Switzerland.
[10] Vaidya, P. M. “An O(n log n) Algorithm for the All-Nearest-Neighbors Problem”. USA, 1987.
[11] Shoichiro Nakamura. “Numerical Analysis and Graphic Visualization with MATLAB”. Prentice Hall, 2002.
[12] Howard Anton, Irl Bivens and Stephen Davis. “Calculus”. Anton Textbooks, 2002.
[13] Codruţa Vancea, Florin Vancea and Antoniu Nicula. “On Double Interpolation in Polar Coordinates”. Oradea, Romania.
[14] H. Matsui, K. Kobayashi. “Analysis of powerful ball kicking”. In Biomechanics VIII, Japan, 1983.