Predicting the Winner of an NFL Football Game

Matt Gray
CS/ECE 539
Reasons to Predict
• NFL Football is watched by millions of
people every weekend during the season.
• Vast amounts of money are invested in NFL
football.
• Prediction Polls such as Weekly Football
Polls, etc.
Main Problem & Goal
• Problem:
– Most available predictions carry a human bias
that stems from personal opinion, which can
introduce errors into the predictions.
• Goal:
– Eliminate this human error by having a Multilayer Perceptron (MLP) perform the prediction.
Why MLP
• Non-linear:
– Data from NFL football games covers a vast
amount of information, such as home versus
away, yards gained, yards allowed, etc.
– No one piece of the data always correlates to
a win or loss as there are many ways in which
a team can win or lose.
Why MLP
• MLPs
– Multi-Layer Perceptrons are capable of
predicting outcomes of non-linear data.
– Multi-Layer Perceptrons reduce the problem
to a Neural Network prediction problem and
remove personal bias about a team's
performance from the prediction.
Data Collection
• I collected the data from
http://www.statfox.com/nfl/nfllogs.htm
• Regular Season and Post Season Data per team was
put into an Excel file.
• To keep data even per team, Post Season data was
removed.
• Averages and Standard Deviations for both Yards
Gained and Yards Allowed per team were calculated
from the team’s games up through the current game.
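The running average and standard deviation described above can be sketched in Python (the deck itself used Excel and Matlab). This is a minimal sketch under stated assumptions: the function name `running_stats` and the yardage values are hypothetical, and the population form of the standard deviation is assumed because the slides do not say which form was used.

```python
import math

def running_stats(values):
    """Return one (mean, std) pair per game, computed over games 1..i.

    Assumption: population standard deviation (divide by i, not i - 1).
    """
    stats = []
    total = 0.0
    total_sq = 0.0
    for i, v in enumerate(values, start=1):
        total += v
        total_sq += v * v
        mean = total / i
        # Clamp at zero to absorb tiny negative values from rounding.
        variance = max(total_sq / i - mean * mean, 0.0)
        stats.append((mean, math.sqrt(variance)))
    return stats

# Hypothetical total yards gained by one team in its first four games.
yards_gained = [300, 350, 250, 400]
for game, (mean, std) in enumerate(running_stats(yards_gained), start=1):
    print(f"game {game}: mean={mean:.1f} std={std:.1f}")
```

The same function would be applied separately to yards gained and yards allowed for each team.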
Data Collection
• To make the data easier to manage, the separate
values for Rushing and Passing Yards gained and
allowed were removed, keeping only Total Yards gained
and allowed.
• Teams were given unique team IDs for purposes of
arranging the data.
• The Excel File was then saved as a text document that
could be read in by Matlab.
• Team IDs were removed by Matlab so that they do not
affect the data.
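Reading the exported text file and dropping the team IDs could look roughly like the Python sketch below. The tab-delimited layout, the column names, and the sample values are all assumptions made for illustration; the slides do not describe the actual file format.

```python
import csv
import io

# Hypothetical export: team ID, average yards gained, average yards
# allowed, and a win flag (1 = win, 0 = loss).
exported = (
    "team_id\tavg_gained\tavg_allowed\twon\n"
    "3\t325.0\t310.5\t1\n"
    "17\t298.2\t340.0\t0\n"
)

def load_without_ids(text, id_column="team_id"):
    """Parse tab-delimited text and drop the ID column from every row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for record in reader:
        record.pop(id_column)  # IDs only arrange the data; they are not features
        rows.append([float(v) for v in record.values()])
    return rows

rows = load_without_ids(exported)
print(rows)
```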
Preliminary Results
• Data was formatted in Matlab and then fed
into a modified MLP Matlab program
provided from the class website.
• The order of the data was randomized, and the data was
then separated into a training set and a testing set.
• Without scaling data, division by zero
occurs, so data was scaled.
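The randomize, split, and scale steps can be sketched in Python as follows. The slides do not state the split ratio or the scaling method, so the 80/20 split, the min-max scaling to [0, 1], the function names, and the sample rows are all assumptions.

```python
import random

def shuffle_and_split(rows, train_fraction=0.8, seed=0):
    """Randomize row order, then cut into training and testing sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def fit_min_max(rows, n_features):
    """Find each feature's minimum and maximum on the training rows."""
    lows = [min(r[j] for r in rows) for j in range(n_features)]
    highs = [max(r[j] for r in rows) for j in range(n_features)]
    return lows, highs

def apply_min_max(rows, lows, highs):
    """Scale the leading feature columns; trailing columns (labels) pass through."""
    scaled = []
    for r in rows:
        feats = []
        for j, (lo, hi) in enumerate(zip(lows, highs)):
            span = hi - lo
            # A constant column has zero span; map it to 0.0 instead of dividing.
            feats.append((r[j] - lo) / span if span > 0 else 0.0)
        scaled.append(feats + list(r[len(lows):]))
    return scaled

# Hypothetical rows: [avg yards gained, avg yards allowed, won (1) or lost (0)].
games = [
    [325.0, 310.5, 1],
    [298.2, 340.0, 0],
    [410.0, 280.0, 1],
    [275.5, 365.0, 0],
    [350.0, 300.0, 1],
]
train, test = shuffle_and_split(games)
lows, highs = fit_min_max(train, n_features=2)
train_scaled = apply_min_max(train, lows, highs)
test_scaled = apply_min_max(test, lows, highs)
print(train_scaled)
```

The scaling ranges are taken from the training rows only and then reused for the testing rows, so the testing set does not influence the preprocessing.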
Preliminary Results
• An MLP with a 2-5-2 structure was used as the
initial setup.
• Multiple tests were run with alpha and momentum
held at their default values of 0.1 and 0.8,
respectively.
• Averaged over these runs, the classification rate
on the training data was 57.71% for predicting
whether or not a team won its game.
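For readers unfamiliar with the setup, a 2-5-2 network trained by back-propagation with a learning rate (alpha) of 0.1 and momentum of 0.8 can be sketched in plain Python. This is not the class-provided Matlab program: the sigmoid units, squared-error deltas, per-sample updates, weight initialization range, and the toy samples are all assumptions chosen to keep the sketch short.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MLP:
    """One hidden layer with sigmoid units; defaults give a 2-5-2 structure."""

    def __init__(self, n_in=2, n_hidden=5, n_out=2, seed=0):
        rng = random.Random(seed)
        # Each row holds one unit's incoming weights; the last entry is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, y

    def train_step(self, x, target, alpha=0.1, momentum=0.8):
        xb, hb, y = self.forward(x)
        # Output deltas for squared error with sigmoid outputs.
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
        # Hidden deltas, back-propagated through the not-yet-updated output weights.
        d_hid = [
            hb[j] * (1.0 - hb[j]) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
            for j in range(len(hb) - 1)
        ]
        self._update(self.w2, self.dw2, d_out, hb, alpha, momentum)
        self._update(self.w1, self.dw1, d_hid, xb, alpha, momentum)

    @staticmethod
    def _update(weights, prev, deltas, inputs, alpha, momentum):
        for k, delta in enumerate(deltas):
            for j, v in enumerate(inputs):
                step = alpha * delta * v + momentum * prev[k][j]
                weights[k][j] += step
                prev[k][j] = step

    def predict(self, x):
        y = self.forward(x)[2]
        return max(range(len(y)), key=lambda k: y[k])

def classification_rate(net, samples):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(1 for x, label in samples if net.predict(x) == label)
    return correct / len(samples)

# Hypothetical scaled samples: ([yards gained, yards allowed], 1 = win, 0 = loss).
samples = [([0.9, 0.2], 1), ([0.8, 0.3], 1), ([0.2, 0.9], 0), ([0.3, 0.7], 0)]
net = MLP()
for epoch in range(200):
    for x, label in samples:
        target = [1.0 if k == label else 0.0 for k in range(2)]
        net.train_step(x, target)
rate = classification_rate(net, samples)
print(f"training classification rate: {rate:.2%}")
```

The two output units encode loss and win as separate classes, which matches the 2-5-2 shape; a single output unit would be an equally reasonable design for a binary outcome.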
Preliminary Results
• Training Error from one test run of the initial setup
Preliminary Results
• Training Error from a second test run of the initial setup
Further Plans
• My next steps in obtaining better
prediction results include the following.
– Perform feature reduction on data that may not be
necessary or that causes confusion.
– Run multiple loops that vary alpha, momentum, the
number of layers, and the number of hidden neurons.
– For each game, combine the data from the two
opposing teams to hopefully form a stronger
correlation of the data.
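One way the last step might look is to concatenate both teams' statistics into a single input vector per game. The Python sketch below is only an illustration: the dictionary keys, the sample values, and the choice of concatenation (rather than, say, differences between the two teams) are assumptions, not the method the slides commit to.

```python
def combine_game(home, away):
    """Build one input vector from both teams' running averages."""
    return [
        home["avg_gained"], home["avg_allowed"],
        away["avg_gained"], away["avg_allowed"],
    ]

# Hypothetical running averages for the two teams in one game.
home_team = {"avg_gained": 325.0, "avg_allowed": 310.5}
away_team = {"avg_gained": 298.2, "avg_allowed": 340.0}

features = combine_game(home_team, away_team)
print(features)
```

With four inputs per game, the network's input layer would need four units instead of two.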
References
• http://www.statfox.com/nfl/nfllogs.htm
• http://www.nfl.com
• Newman, M. E. J., and Park, Juyong. A network-based ranking
system for US college football. Department of Physics and Center
for the Study of Complex Systems, University of Michigan, Ann
Arbor, MI 48109. arXiv:physics/0505169 v4, 31 Oct 2005.
• Purucker, Michael C. Neural network quarterbacking: How different
training methods perform in calling the games. IEEE Potentials,
August/September 1996.
http://ieeexplore.ieee.org/iel1/45/11252/00535226.pdf?arnumber=535226