Elijah Molloy
Bradley Korabik
Brandon Harris

NHL Game Prediction

Project Description

Our group project revolves around finding an accurate way to predict the winner of a hockey game between teams in the Central Division of the NHL's Western Conference. The statistics our algorithms use include a team's win-loss record, the number of games it plays at home versus away, and various league averages that can be used to compare one team to another. We use data from the 2014/2015 season for the predictions.

The formal definition of predicting win probability is as follows: "Given input variables from two separate NHL teams, find the probability of winning by using a discrete probability distribution (Poisson distribution)." Our error measure is

$$\sum_{k=1}^{n} (P_k - A_k)^2$$

where $P_k$ is the predicted and $A_k$ the actual outcome of game $k$ over $n$ games. Our goal is to bring this sum as close to 0 as possible, eliminating as much error as we can. In addition, we use the Poisson distribution as a tool to help generate the predictions.

In researching this problem, our group found that little analysis had been done to predict hockey results on a game-by-game basis. Most of the existing work predicts soccer games, and soccer is similar to hockey in its statistics and overall game play. We used our research into soccer predictions to develop our algorithms for predicting hockey games.

Our first game prediction algorithm, Poisson Distribution, uses the statistics "Home Team Goals For", "Home Team Goals Against", "Away Team Goals For", and "Away Team Goals Against." The Poisson distribution is defined as

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

and is used to determine which team would win a game.
Here λ represents the average number of goals that a particular team is expected to score in a game, while k represents the actual number of goals scored by that team.

Start with one team, team one, and calculate the average number of goals it scores by dividing its total goals scored by the number of games it has played. Next, determine team one's attack strength by dividing that average by the league scoring average, which shows how team one ranks offensively against the rest of the league. To find the defensive strength of the opposing team, team two, take the number of goals team two has allowed and divide it by the number of games team two has played, then divide that average by the league average for goals allowed. With the offensive strength of team one and the defensive strength of team two in hand, we can derive the average number of goals team one is most likely to score. To find team two's average number of goals scored, repeat the method with the roles reversed, substituting team two for team one and vice versa.

Once you have the averages for team one and team two, the Poisson distribution can be used to predict how many goals each team will score. We evaluate the distribution only from zero to eight goals, because NHL games from last season show that a team is very unlikely to score more than eight. Once the Poisson probabilities are calculated, each team's win percentage is found by summing the products of the two teams' probabilities over every score in which that team wins: 1 to 0, 2 to 0, 2 to 1, and so on up to 8 to 7. The team with the higher win percentage is the predicted winner of that game.
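The steps above can be sketched in Python as follows. This is a minimal illustration rather than the project's actual code. The function names are hypothetical, the season totals in the example are made up, and the way attack strength and defense strength are combined (multiplying both by the league average, as is common in the soccer models this method borrows from) is an assumption, since the report does not spell out that step. A single league average is used for both goals scored and goals allowed, which are equal per game across a league.

```python
from math import exp, factorial

MAX_GOALS = 8  # the report truncates the distribution at eight goals


def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return lam ** k * exp(-lam) / factorial(k)


def expected_goals(goals_scored, games_played,
                   opp_goals_allowed, opp_games_played, league_avg):
    """Expected goals for a team against a given opponent.

    Assumption: attack strength x opponent defense strength x league average.
    """
    attack_strength = (goals_scored / games_played) / league_avg
    defense_strength = (opp_goals_allowed / opp_games_played) / league_avg
    return attack_strength * defense_strength * league_avg


def win_probabilities(lam_one, lam_two, max_goals=MAX_GOALS):
    """Return (P(team one wins), P(team two wins), P(tied score))."""
    p_one = p_two = p_tie = 0.0
    for i in range(max_goals + 1):          # goals by team one
        for j in range(max_goals + 1):      # goals by team two
            p = poisson_pmf(i, lam_one) * poisson_pmf(j, lam_two)
            if i > j:
                p_one += p
            elif j > i:
                p_two += p
            else:
                p_tie += p
    return p_one, p_two, p_tie


# Hypothetical season totals, for illustration only (not the report's data).
lam_one = expected_goals(220, 82, 210, 82, 2.7)
lam_two = expected_goals(205, 82, 195, 82, 2.7)
p_one, p_two, p_tie = win_probabilities(lam_one, lam_two)
print(f"team one {p_one:.3f}, team two {p_two:.3f}, tied {p_tie:.3f}")
```

Tied scores are returned separately because the report sums only the winning scores for each team and does not say how ties are handled. Because the distribution is cut off at eight goals, the three values add up to slightly less than 1.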
The second game prediction algorithm uses "Average Home Team Goals Scored" and "Average Away Team Goals Scored." The Skellam distribution is defined as

$$P(K = k;\, \mu_1, \mu_2) = e^{-(\mu_1 + \mu_2)} \left(\frac{\mu_1}{\mu_2}\right)^{k/2} I_k\!\left(2\sqrt{\mu_1 \mu_2}\right)$$

where $I_k(z)$ is a modified Bessel function of the first kind. It is used to determine the probability of one team beating another based only on the home and away games between the two teams, with no other statistics for either team. Specifically, it gives the probability of the difference of two statistically independent random variables with expected values $\mu_1$ and $\mu_2$, representing "Average Home Team Goals Scored/Game" and "Average Away Team Goals Scored/Game." The algorithm determines the probability of a specific difference in goals scored by team one and team two. Summing the probabilities of team one winning by each possible margin gives the total probability of team one winning, and the same goes for team two. After determining the total probability of team one and of team two winning by any number of goals, the team with the higher probability can be expected to win.

The third game prediction algorithm uses the total probability of a specific team winning, found using the Skellam distribution, together with the total number of trials and the number of successes over those trials. The negative binomial distribution can be defined as

$$P(X = k) = \binom{k-1}{r-1}\, p^{r} (1 - p)^{k - r}$$

where p is the probability of success in a single trial. This algorithm determines the probability, P, of reaching a certain number of successes, r, in a specified number of trials, k, with the r-th success falling on the final trial. Using the Skellam probability of team one beating team two as p, and the number of games those teams have played against each other over a season as k, one can determine the probability of team one collecting r wins against team two in those k games. Using the series of Chicago playing St. Louis, Dallas, and Nashville throughout the season, we were able to calculate the percentage of games each team would win both home and away.
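The Skellam and negative binomial steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the project's code. To stay within the standard library, the Skellam probability is computed as a sum over pairs of Poisson scores, which is mathematically equal to the Bessel-function form. The function names, the truncation limits (`terms`, `max_diff`), and the goal averages in the example are all hypothetical.

```python
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return lam ** k * exp(-lam) / factorial(k)


def skellam_pmf(diff, mu1, mu2, terms=60):
    """P(goals_one - goals_two == diff) for independent Poisson scores.

    Sums P(team one scores j + diff) * P(team two scores j) over j.
    The sum is truncated after `terms` terms (an assumed cutoff).
    """
    start = max(0, -diff)  # keeps j + diff non-negative
    return sum(poisson_pmf(j + diff, mu1) * poisson_pmf(j, mu2)
               for j in range(start, start + terms))


def win_probability(mu1, mu2, max_diff=20):
    """Total probability that team one wins by any margin up to max_diff."""
    return sum(skellam_pmf(d, mu1, mu2) for d in range(1, max_diff + 1))


def negative_binomial_pmf(k, r, p):
    """Probability that the r-th success occurs on trial k (r >= 1)."""
    if k < r:
        return 0.0
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)


# Hypothetical goal averages, for illustration only (not the report's data).
mu_home, mu_away = 2.9, 2.6
p_home = win_probability(mu_home, mu_away)
print(f"home win probability: {p_home:.3f}")

# Probability that the home team's third win comes in the fifth game.
print(f"third win in game five: {negative_binomial_pmf(5, 3, p_home):.3f}")
```

If the quantity of interest is instead the chance of exactly r wins anywhere in k games, the binomial distribution would be the matching tool. The sketch keeps the negative binomial form because that is the distribution the report names.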
Overall, our results were not as good as we had hoped to get from our algorithms. We acknowledge that picking the winner simply by choosing the team with the higher calculated percentage led to some inaccuracy. Many of the percentages were close to one another, so the team that was not predicted to win sometimes did win. Our predictions for the four- or five-game series between Chicago, Dallas, St. Louis, and Nashville often got 2 or 3 of the games right, which is not bad for a small sample size.

In future work we would like to use a larger sample set, covering Chicago versus any team over the entire season, and to consider more factors than goals scored and allowed. With more data, our algorithms could predict games for all teams in the NHL. We would also like to include other variables in the algorithms, such as power plays, injuries, penalty minutes, games played on consecutive days, and fan attendance. We think that adding some of these factors could help predict the outcome of games more accurately. Furthermore, we would like to develop a way to model a prediction graph using live game results. It would be interesting to see how such a graph changes with real-time stats and goals. The algorithms we designed piqued our interest in this problem, and we hope to continue working on it in the future.