Matrix theory applications to baseball rankings

Written by Prof W. D. Joyner, [email protected]. Last modified 2016-12-02.

Dedicated to the memory of T.S. Michael.
In this note, we compare different matrix-oriented ranking methods for
the men's baseball teams in the 2016 regular season of the Patriot League.
These methods include Massey's method, Colley's method, Keener's method,
the related random walker method, and the minimum upset method.
A good reference for this material is the book Who's # 1, by A. Langville
and C. Meyer [LM].
Contents

1 The data to be ranked: Patriot League men's baseball, 2016
2 Application: (a variation on) Massey's ranking
3 Pre-tournament (Massey-like) ranking
4 "Multi-graph" pre-tournament Massey ranking
5 Pre-tournament Keener ranking
6 Pre-tournament random walker ranking
7 The random walker system of ODEs
8 The minimum upset method
9 The Elo rating method
1 The data to be ranked: Patriot League men's baseball, 2016
In our applications, we shall consider Patriot League men’s baseball:
1. Army (U.S. Military Academy),
2. Bucknell,
3. Holy Cross,
4. Lafayette,
5. Lehigh,
6. Navy (U.S. Naval Academy).
The cumulative results of the 2016 regular season are given in Figure 3. The
pre-tournament2 results are displayed in Figure 1. For a win/loss diagram,
see Figures 2 and 5.
x\y          Army    Bucknell  Holy Cross  Lafayette  Lehigh  Navy
Army         ×       16-14     13-14       24-14      12-10   19-8
Bucknell     14-16   ×         30-27       16-18      20-23   22-10
Holy Cross   14-13   27-30     ×           15-19      13-17   16-9
Lafayette    14-24   18-16     19-15       ×          23-12   39-17
Lehigh       10-12   23-20     17-13       12-23      ×       18-12
Navy         8-19    10-22     9-16        17-39      12-18   ×

Figure 1: Regular season results, sorted/ordered as Army vs Bucknell, Army
vs Holy Cross, Army vs Lafayette, . . . , Lehigh vs Navy.
2 We count only the games played in the Patriot League, not including the
Patriot League post-season tournament. In the table, the total score (since
the teams play multiple games against each other) of the team in the vertical
column on the left is listed first. In other words, "a-b" in row i and column
j means the total runs scored by team i against team j is a, and the total
runs allowed by team i against team j is b. Here, we order the 6 teams as
above (team 1 is Army, team 2 is Bucknell, and so on). For instance, if X
played Y and the scores were 10-0, 0-1, 0-1, 0-1, 0-1, 0-1, then the table
would read 10-5 in the position of row X and column Y.
Figure 2: Win/loss diagram for the pre-tournament Patriot League (0=Army,
1=Bucknell, . . . , 5=Navy).
The cumulative results of the 2016 regular season3 are collected in Figure
3.
x\y          Army    Bucknell  Holy Cross  Lafayette  Lehigh  Navy
Army         ×       16-14     13-14       24-14      12-10   19-8
Bucknell     14-16   ×         30-27       16-18      20-23   42-28
Holy Cross   14-13   27-30     ×           15-19      43-53   30-12
Lafayette    14-24   18-16     19-15       ×          23-12   39-17
Lehigh       10-12   23-20     27-13       12-23      ×       18-12
Navy         8-19    28-42     43-53       17-39      12-18   ×

Figure 3: 2016 regular season Patriot League men's baseball.

3 We count only the games played in the Patriot League, including the
Patriot League tournament.
2 Application: (a variation on) Massey's ranking
In this section we give an application of orthogonal projection to the ranking
of team sports. Massey’s method, currently in use by the NCAA (for football,
where teams typically play each other once), was developed by Kenneth P.
Massey while an undergraduate math major in the late 1990s. We present
a possible variation of Massey’s method adapted to baseball, where teams
typically play each other multiple times.
There are exactly 15 pairings between these teams. These pairs are sorted
lexicographically, as follows:

(1,2), (1,3), (1,4), . . . , (5,6).

In other words, sorted as Army vs Bucknell, Army vs Holy Cross, Army vs
Lafayette, . . . , Lehigh vs Navy. In this ordering, we record their (sum total)
win-loss record (a 1 for a win, a -1 for a loss) in a 15 × 6 matrix:

$$
M = \begin{pmatrix}
-1 &  1 &  0 &  0 &  0 & 0 \\
 1 &  0 & -1 &  0 &  0 & 0 \\
-1 &  0 &  0 &  1 &  0 & 0 \\
-1 &  0 &  0 &  0 &  1 & 0 \\
-1 &  0 &  0 &  0 &  0 & 1 \\
 0 & -1 &  1 &  0 &  0 & 0 \\
 0 &  1 &  0 & -1 &  0 & 0 \\
 0 &  1 &  0 &  0 & -1 & 0 \\
 0 & -1 &  0 &  0 &  0 & 1 \\
 0 &  0 &  1 & -1 &  0 & 0 \\
 0 &  0 &  1 &  0 & -1 & 0 \\
 0 &  0 & -1 &  0 &  0 & 1 \\
 0 &  0 &  0 & -1 &  1 & 0 \\
 0 &  0 &  0 & -1 &  0 & 1 \\
 0 &  0 &  0 &  0 & -1 & 1
\end{pmatrix}.
$$
We also record their total losses (the positive run differential of each
pairing), in the same ordering:

$$
b = (2, 1, 10, 2, 11, 3, 2, 3, 14, 4, 14, 10, 11, 22, 6)^t.
$$
The Massey ranking of these teams is a vector r which best fits the equation

M r = b.

While the corresponding linear system is over-determined, we can look for a
best (in the least squares sense) approximate solution using the orthogonal
projection formula

P_V = B (B^t B)^{-1} B^t.                                              (1)

Unfortunately, in this case B = M does not have linearly independent
columns, so (1) does not apply.

Massey's clever idea is to solve

M^t M r = M^t b                                                        (2)

by row-reduction and determine the rankings from the parameterized form
of the solution.

To this end, we compute

$$
M^t M = \begin{pmatrix}
 5 & -1 & -1 & -1 & -1 & -1 \\
-1 &  5 & -1 & -1 & -1 & -1 \\
-1 & -1 &  5 & -1 & -1 & -1 \\
-1 & -1 & -1 &  5 & -1 & -1 \\
-1 & -1 & -1 & -1 &  5 & -1 \\
-1 & -1 & -1 & -1 & -1 &  5
\end{pmatrix}
$$
and

$$
M^t b = (-24, -10, 10, -29, -10, 63)^t.
$$
Then we compute the rref of

$$
A = (M^t M, M^t b) = \begin{pmatrix}
 5 & -1 & -1 & -1 & -1 & -1 & -24 \\
-1 &  5 & -1 & -1 & -1 & -1 & -10 \\
-1 & -1 &  5 & -1 & -1 & -1 &  10 \\
-1 & -1 & -1 &  5 & -1 & -1 & -29 \\
-1 & -1 & -1 & -1 &  5 & -1 & -10 \\
-1 & -1 & -1 & -1 & -1 &  5 &  63
\end{pmatrix},
$$
which is

$$
rref(M^t M, M^t b) = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & -1 & -87/6 \\
0 & 1 & 0 & 0 & 0 & -1 & -73/6 \\
0 & 0 & 1 & 0 & 0 & -1 & -53/6 \\
0 & 0 & 0 & 1 & 0 & -1 & -92/6 \\
0 & 0 & 0 & 0 & 1 & -1 & -73/6 \\
0 & 0 & 0 & 0 & 0 &  0 &     0
\end{pmatrix}.
$$
If r = (r_1, r_2, r_3, r_4, r_5, r_6) denotes the rankings of Army, Bucknell, Holy
Cross, Lafayette, Lehigh, Navy, in that order, then

r_1 = r_6 - 87/6,  r_2 = r_6 - 73/6,  r_3 = r_6 - 53/6,  r_4 = r_6 - 92/6,  r_5 = r_6 - 73/6.

Therefore

Lafayette < Army < Bucknell = Lehigh < Holy Cross < Navy.
If we use this ranking to predict wins/losses over the season, it fails to
correctly predict Army vs Holy Cross (Army won), Bucknell vs Lehigh, and
Lafayette vs Army. This gives a prediction failure rate of 3/15 = 20%.
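These computations are easy to reproduce in software; the calculations in this
note were done in Sage [S]. The following is a minimal plain Python/NumPy
sketch (not the original code) of the Massey computation above. The names
winners and margins are our own: they transcribe the data already encoded in
the matrix M and the vector b, with teams indexed 0-5 in the order above.

    import itertools
    import numpy as np

    teams = ["Army", "Bucknell", "Holy Cross", "Lafayette", "Lehigh", "Navy"]

    # Winner of each season pairing (as encoded in M) and the positive run
    # differential (the vector b above); teams are 0-indexed.
    winners = {(0, 1): 1, (0, 2): 0, (0, 3): 3, (0, 4): 4, (0, 5): 5,
               (1, 2): 2, (1, 3): 1, (1, 4): 1, (1, 5): 5,
               (2, 3): 2, (2, 4): 2, (2, 5): 5,
               (3, 4): 4, (3, 5): 5, (4, 5): 5}
    margins = {(0, 1): 2, (0, 2): 1, (0, 3): 10, (0, 4): 2, (0, 5): 11,
               (1, 2): 3, (1, 3): 2, (1, 4): 3, (1, 5): 14,
               (2, 3): 4, (2, 4): 14, (2, 5): 10,
               (3, 4): 11, (3, 5): 22, (4, 5): 6}

    pairs = list(itertools.combinations(range(6), 2))   # lexicographic order
    M = np.zeros((len(pairs), 6))
    b = np.zeros(len(pairs))
    for k, (i, j) in enumerate(pairs):
        w = winners[(i, j)]
        l = i if w == j else j
        M[k, w], M[k, l] = 1.0, -1.0    # +1 for the winner, -1 for the loser
        b[k] = margins[(i, j)]

    # Solve the normal equations M^t M r = M^t b; the matrix is singular
    # (ratings are only determined up to an additive constant), so use lstsq.
    r, *_ = np.linalg.lstsq(M.T @ M, M.T @ b, rcond=None)
    for name, rating in sorted(zip(teams, r - r[5]), key=lambda t: t[1]):
        print(f"{name:11s} {rating:+8.4f}")   # e.g. Army at -87/6 relative to Navy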
3 Pre-tournament (Massey-like) ranking
We shall use the above method to determine the ranking before the Patriot
league tournament. The ranking used by the Patriot league is simply the
win-loss record:
Army (6-13) < Lafayette (7-13) < Bucknell (9-11)
< Lehigh (9-10) < Holy Cross (13-7) < Navy (15-5).
The pre-tournament results were given in Figure 1 above. In this case, their
total losses are

$$
b = (2, 1, 10, 2, 11, 3, 2, 3, 12, 4, 4, 7, 11, 22, 6)^t
$$

and

$$
M^t b = (-24, -8, 3, -29, 0, 58)^t.
$$

Then we compute the rref of

$$
A = (M^t M, M^t b) = \begin{pmatrix}
 5 & -1 & -1 & -1 & -1 & -1 & -24 \\
-1 &  5 & -1 & -1 & -1 & -1 &  -8 \\
-1 & -1 &  5 & -1 & -1 & -1 &   3 \\
-1 & -1 & -1 &  5 & -1 & -1 & -29 \\
-1 & -1 & -1 & -1 &  5 & -1 &   0 \\
-1 & -1 & -1 & -1 & -1 &  5 &  58
\end{pmatrix}
$$

to get

$$
rref(A) = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & -1 & -82/6 \\
0 & 1 & 0 & 0 & 0 & -1 & -66/6 \\
0 & 0 & 1 & 0 & 0 & -1 & -55/6 \\
0 & 0 & 0 & 1 & 0 & -1 & -87/6 \\
0 & 0 & 0 & 0 & 1 & -1 & -58/6 \\
0 & 0 & 0 & 0 & 0 &  0 &     0
\end{pmatrix}.
$$
If r = (r_1, r_2, r_3, r_4, r_5, r_6) denotes the rankings of Army, Bucknell, Holy
Cross, Lafayette, Lehigh, Navy, in that order, then

r_1 = r_6 - 82/6,  r_2 = r_6 - 66/6,  r_3 = r_6 - 55/6,  r_4 = r_6 - 87/6,  r_5 = r_6 - 58/6.

Therefore,

Lafayette < Army < Bucknell < Lehigh < Holy Cross < Navy.

This also gives a prediction failure rate of 20%.
4 “Multi-graph” pre-tournament Massey ranking
This section follows formulas explained to me by T.S. Michael.4

4 I thank T.S. for these formulas, as well as for comments on the preceding sections.

In this multi-graph version, we record the win-loss record (a 1 for a win,
a -1 for a loss) in a 59 × 6 matrix M, one row for each game. The display of
the matrix is omitted as it won't fit on the page, but it's similar to the
incidence matrix used in the previous sections. However, we do display the
product

$$
M^t M = \begin{pmatrix}
19 & -4 & -4 & -4 & -3 & -4 \\
-4 & 20 & -4 & -4 & -4 & -4 \\
-4 & -4 & 20 & -4 & -4 & -4 \\
-4 & -4 & -4 & 20 & -4 & -4 \\
-3 & -4 & -4 & -4 & 19 & -4 \\
-4 & -4 & -4 & -4 & -4 & 20
\end{pmatrix}.
$$
We must also record the loss vector b (which has length 59), but it too
is omitted as it won't fit on the page. It records the (positive) difference
(number of runs of winner) - (number of runs of loser), one entry for each
game. However, we do display the augmented matrix

$$
A = (M^t M, M^t b) = \begin{pmatrix}
19 & -4 & -4 & -4 & -3 & -4 & -24 \\
-4 & 20 & -4 & -4 & -4 & -4 & -14 \\
-4 & -4 & 20 & -4 & -4 & -4 &  11 \\
-4 & -4 & -4 & 20 & -4 & -4 & -29 \\
-3 & -4 & -4 & -4 & 19 & -4 &  -8 \\
-4 & -4 & -4 & -4 & -4 & 20 &  64
\end{pmatrix},
$$
as well as its rref:

$$
rref(A) = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & -1 & -122/33 \\
0 & 1 & 0 & 0 & 0 & -1 &   -13/4 \\
0 & 0 & 1 & 0 & 0 & -1 &  -53/24 \\
0 & 0 & 0 & 1 & 0 & -1 &   -31/8 \\
0 & 0 & 0 & 0 & 1 & -1 &  -98/33 \\
0 & 0 & 0 & 0 & 0 &  0 &       0
\end{pmatrix}.
$$
If r = (r_1, r_2, r_3, r_4, r_5, r_6) denotes the rankings of Army, Bucknell, Holy
Cross, Lafayette, Lehigh, Navy, in that order, then

r_1 = r_6 - 976/264,  r_2 = r_6 - 858/264,  r_3 = r_6 - 583/264,  r_4 = r_6 - 1023/264,  r_5 = r_6 - 784/264.

Therefore,

Lafayette < Army < Bucknell < Lehigh < Holy Cross < Navy.

This too gives a prediction failure rate of 20%.
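Although the 59 individual game results are not reproduced in this note, the
game-level computation has exactly the same shape as before: one signed
incidence row per game. A brief sketch under that assumption (the game list
below is a made-up placeholder, not the actual 2016 schedule):

    import numpy as np

    # (winner index, loser index, winner runs - loser runs), one per game.
    # Hypothetical placeholder data; the real list has 59 entries.
    games = [(1, 0, 3), (0, 2, 2), (5, 4, 1)]

    M = np.zeros((len(games), 6))
    b = np.zeros(len(games))
    for k, (w, l, margin) in enumerate(games):
        M[k, w], M[k, l] = 1.0, -1.0
        b[k] = margin

    # With the real 59 rows, M^t M is the matrix with diagonal
    # (19, 20, 20, 20, 19, 20) displayed above; the solve step is unchanged.
    r, *_ = np.linalg.lstsq(M.T @ M, M.T @ b, rcond=None)
    print(r - r[5])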
5 Pre-tournament Keener ranking
Suppose T teams play each other. Let A = (a_ij), 1 ≤ i, j ≤ T, be a non-negative
square matrix determined by the results of their games, called the preference
matrix. In his 1993 paper [Ke], Keener defined the score of the ith team to be

$$
s_i = \frac{1}{n_i} \sum_{j=1}^{T} a_{ij} r_j,
$$

where n_i denotes the total number of games played by team i and r =
(r_1, r_2, ..., r_T) is the rating vector (where r_i ≥ 0 denotes the rating of
team i).

One possible preference matrix is the matrix A of total scores obtained from
the pre-tournament table:

$$
A = \begin{pmatrix}
 0 & 14 & 14 & 14 & 10 &  8 \\
16 &  0 & 27 & 18 & 23 & 28 \\
13 & 30 &  0 & 19 & 27 & 43 \\
24 & 16 & 15 &  0 & 12 & 17 \\
12 & 20 & 43 & 23 &  0 & 12 \\
19 & 42 & 30 & 39 & 18 &  0
\end{pmatrix}.
$$

(In this case, n_i = 4, so we ignore the 1/n_i factor.) Recall, the pre-tournament
table of scores was given in Figure 3.
In his paper, Keener proposed a ranking method where the ranking vector
r is proportional to its score. The score is expressed as a matrix product Ar,
where A is a square preference matrix. In other words, there is a constant
ρ > 0 such that s_i = ρ r_i for each i. This is the same as saying Ar = ρr.

The Frobenius-Perron theorem5 implies that A has an eigenvector r =
(r_1, r_2, r_3, r_4, r_5, r_6) having positive entries associated to the largest
eigenvalue λ_max of A, which has (geometric) multiplicity 1. Indeed, A has
maximum eigenvalue λ_max = 110.0385..., of multiplicity 1, with eigenvector

r = (1, 1.8313..., 2.1548..., 1.3177..., 1.8015..., 2.2208...).

Therefore the teams, according to Keener's method, are ranked

Army < Lafayette < Lehigh < Bucknell < Holy Cross < Navy.

This gives a prediction failure rate of just 6.7%.

5 From the wikipedia entry: Perron-Frobenius theorem for irreducible matrices:
Let A be an irreducible non-negative n × n matrix with spectral radius ρ. Then
the following statements hold.

• The number ρ is a positive real number and it is an eigenvalue of the matrix
A (called the Perron-Frobenius eigenvalue).

• The Perron-Frobenius eigenvalue ρ is simple. Both right and left eigenspaces
associated with ρ are one-dimensional.

• A has a left eigenvector v with eigenvalue ρ whose components are all
positive. Likewise, A has a right eigenvector w with eigenvalue ρ whose
components are all positive.

• The only eigenvectors whose components are all positive are those associated
with the eigenvalue ρ.
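In software, the Keener ranking is a single eigenvector computation. A minimal
NumPy sketch (the note itself used Sage); the matrix A is copied from the
display above, everything else is standard NumPy:

    import numpy as np

    A = np.array([[ 0, 14, 14, 14, 10,  8],
                  [16,  0, 27, 18, 23, 28],
                  [13, 30,  0, 19, 27, 43],
                  [24, 16, 15,  0, 12, 17],
                  [12, 20, 43, 23,  0, 12],
                  [19, 42, 30, 39, 18,  0]], dtype=float)

    evals, evecs = np.linalg.eig(A)
    k = np.argmax(evals.real)        # the Perron-Frobenius eigenvalue
    r = np.abs(evecs[:, k].real)     # its eigenvector has one-signed entries
    r /= r[0]                        # normalize so r_1 = 1, as in the text
    print(evals[k].real)             # about 110.04
    print(r)                         # about (1, 1.83, 2.15, 1.32, 1.80, 2.22)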
If we instead use the mollified scoring matrix M whose ijth entry is
M_ij = A_ij/(A_ij + A_ji + 1):

$$
M = \begin{pmatrix}
    0 & 14/31 &   1/2 & 14/39 & 10/23 &   2/7 \\
16/31 &     0 & 27/58 & 18/35 & 23/44 & 28/71 \\
13/28 & 15/29 &     0 & 19/35 & 27/71 & 43/74 \\
 8/13 & 16/35 &   3/7 &     0 &   1/3 & 17/57 \\
12/23 &  5/11 & 43/71 & 23/36 &     0 & 12/31 \\
19/28 & 42/71 & 15/37 & 13/19 & 18/31 &     0
\end{pmatrix}.
$$
Indeed, M has maximum eigenvalue λ_max = 2.4053..., of multiplicity 1, with
eigenvector

r = (1, 1.1531..., 1.1933..., 1.0253..., 1.2255..., 1.3543...).

Therefore the teams, according to Keener's method, are ranked

Army < Lafayette < Bucknell < Holy Cross < Lehigh < Navy.

This gives a prediction failure rate of 20%.
Let f(x) = 1/2 + (1/2) sign(x - 1/2) \sqrt{|2x - 1|} (see Figure 4) and use the
mollified scoring matrix M whose ijth entry is M_ij = f(A_ij/(A_ij + A_ji + 1)).
The maximum eigenvalue is λ_max = 2.28..., of multiplicity 1, with eigenvector

r = (1, 1.38..., 1.53..., 1.03..., 1.51..., 1.80...).

Therefore the teams, according to Keener's method, are ranked

Army < Lafayette < Bucknell < Lehigh < Holy Cross < Navy.

This ranking has a prediction failure rate of 13.3%.
Figure 4: The mollifier function f(x) = 1/2 + (1/2) sign(x - 1/2) \sqrt{|2x - 1|}
suggested by Keener.

6 Pre-tournament random walker ranking
This method has elements in common with the Keener method presented in
the last section, in that both use eigenvectors of a matrix. It is also closely
related to the Google pagerank method.
We follow the presentation in the paper by Govan and Meyer [GM]. The
table of “score differentials” based on Figure 3 is:
x\y          Army  Bucknell  Holy Cross  Lafayette  Lehigh  Navy
Army         0     0         1           0          0       0
Bucknell     2     0         0           2          3       0
Holy Cross   0     3         0           4          14      0
Lafayette    10    0         0           0          0       0
Lehigh       2     0         0           11         0       0
Navy         11    14        8           22         6       0

This leads to the following matrix:

$$
M_0 = \begin{pmatrix}
 0 &  0 & 1 &  0 &  0 & 0 \\
 2 &  0 & 0 &  2 &  3 & 0 \\
 0 &  3 & 0 &  4 & 14 & 0 \\
10 &  0 & 0 &  0 &  0 & 0 \\
 2 &  0 & 0 & 11 &  0 & 0 \\
11 & 14 & 8 & 22 &  6 & 0
\end{pmatrix}.
$$

The edge-weighted score-differential graph associated to M_0 (regarded as a
weighted adjacency matrix) is in Figure 5.

Figure 5: The score-differential graph for the pre-tournament Patriot League
(0=Army, 1=Bucknell, . . . , 5=Navy).
This matrix M_0 must be normalized to create a (row) stochastic matrix:

$$
M = \begin{pmatrix}
    0 &     0 &    0 &     0 &    1 &    0 \\
  2/7 &     0 &    0 &   2/7 &  3/7 &    0 \\
    0 &   1/7 &    0 &  4/21 &  2/3 &    0 \\
    1 &     0 &    0 &     0 &    0 &    0 \\
 2/13 &     0 &    0 & 11/13 &    0 &    0 \\
11/61 & 14/61 & 8/61 & 22/61 & 6/61 &    0
\end{pmatrix},
$$

where, in each row, the third and fifth columns of M_0 keep their places
(row 1 of M_0 has its single nonzero entry in column 3, so row 1 of M is
(0, 0, 1, 0, 0, 0), and so on).
Next, to ensure it is irreducible, we replace M by A = (M + J)/2, where J
is the 6 × 6 doubly stochastic matrix with every entry equal to 1/6:

$$
A = \begin{pmatrix}
   1/12 &    1/12 &    7/12 &    1/12 &   1/12 & 1/12 \\
  19/84 &    1/12 &    1/12 &   19/84 &  25/84 & 1/12 \\
   1/12 &   13/84 &    1/12 &    5/28 &   5/12 & 1/12 \\
   7/12 &    1/12 &    1/12 &    1/12 &   1/12 & 1/12 \\
 25/156 &    1/12 &    1/12 &  79/156 &   1/12 & 1/12 \\
127/732 & 145/732 & 109/732 & 193/732 & 97/732 & 1/12
\end{pmatrix}.
$$
Let

v_0 = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6).

The ranking determined by the random walker method is the reverse6 of the
left eigenvector of A associated to the largest eigenvalue λ_max = 1; in other
words, the vector7

r* = lim_{n→∞} v_0 A^n.

6 By reverse, I mean that the vector ranks the teams from worst-to-best, not
from best-to-worst, as we have seen in previous ranking methods.

7 Alternatively, you can use the column stochastic matrix

$$
A^t = \begin{pmatrix}
1/12 & 19/84 &  1/12 & 7/12 & 25/156 & 127/732 \\
1/12 &  1/12 & 13/84 & 1/12 &   1/12 & 145/732 \\
7/12 &  1/12 &  1/12 & 1/12 &   1/12 & 109/732 \\
1/12 & 19/84 &  5/28 & 1/12 & 79/156 & 193/732 \\
1/12 & 25/84 &  5/12 & 1/12 &   1/12 &  97/732 \\
1/12 &  1/12 &  1/12 & 1/12 &   1/12 &    1/12
\end{pmatrix}
$$

and compute the ranking via the normalized right eigenvector associated to
λ_max = 1.
This is approximately

r* ≈ (0.2237..., 0.1072..., 0.2006..., 0.2077..., 0.1772..., 0.0833...).

Its reverse gives the ranking:

Army < Lafayette < Bucknell < Lehigh < Holy Cross < Navy.

This gives a prediction failure rate of 13.3%.
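A power-iteration sketch of this computation in NumPy (our own sketch, not
the code used for the note): normalize M_0 row-wise, mix with J, and iterate
v ← vA until it settles on the left Perron eigenvector.

    import numpy as np

    M0 = np.array([[ 0,  0, 1,  0,  0, 0],
                   [ 2,  0, 0,  2,  3, 0],
                   [ 0,  3, 0,  4, 14, 0],
                   [10,  0, 0,  0,  0, 0],
                   [ 2,  0, 0, 11,  0, 0],
                   [11, 14, 8, 22,  6, 0]], dtype=float)

    M = M0 / M0.sum(axis=1, keepdims=True)   # make each row sum to 1
    A = (M + np.full((6, 6), 1 / 6)) / 2     # mix with J to force irreducibility

    v = np.full(6, 1 / 6)                    # v_0
    for _ in range(1000):
        v = v @ A                            # v_0 A^n for large n
    print(v)   # about (0.2237, 0.1072, 0.2006, 0.2077, 0.1772, 0.0833)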
7 The random walker system of ODEs
It's worthwhile exploring the system of ODEs that arises in the random walker
model.

Suppose T teams play each other. Let n_i denote the number of games played
by team i, w_i the number of games won by team i, and ℓ_i the number of games
lost by team i, so ℓ_i + w_i = n_i for all 1 ≤ i ≤ T. Let N_ij denote the
number of games played between team i and team j, so N_ij = N_ji. Let A_ij
denote the number of times team i beat team j minus the number of times team
j beat team i. If N_ij = 0 or 1, then

$$
A_{ij} = \begin{cases} -1, & \text{if team } i \text{ lost to team } j, \\ +1, & \text{if team } i \text{ beat team } j, \\ 0, & \text{if } i = j. \end{cases}
$$
A "random walker" is a randomly selected sports fan who may change
his/her allegiance depending on the results of a game. The number of random
walkers preferring team i is x_i, and the total number of fans is denoted by
F, so x_1 + ... + x_T = F. Let p denote the probability that a random walker
changes8 his/her allegiance from team j to team i, given that team i just
beat team j. Let D = (D_ij) denote the T × T matrix defined by

$$
D_{ij} = \begin{cases} -p \ell_i - (1-p) w_i, & i = j, \\ \frac{1}{2} N_{ij} + \frac{2p-1}{2} A_{ij}, & i \neq j. \end{cases}
$$

The rate of change of the expected number of votes cast by the random
walkers for each of the T teams is governed by the first order system of ODEs

8 In this model, the fan does not have complete information about the results
of the games, but instead is told the result of a randomly selected game
without regard to the time it was played.
x' = Dx,                                                               (3)

where x = (x_1, ..., x_T)^t.
In our case, Figure 3 gives us the values

ℓ_1 = 4, w_1 = 1, D_11 = -4p - (1 - p) = -3p - 1,
ℓ_2 = 2, w_2 = 3, D_22 = -2p - 3(1 - p) = p - 3,
ℓ_3 = 2, w_3 = 3, D_33 = -2p - 3(1 - p) = p - 3,
ℓ_4 = 4, w_4 = 1, D_44 = -4p - (1 - p) = -3p - 1,
ℓ_5 = 3, w_5 = 2, D_55 = -3p - 2(1 - p) = -p - 2,
ℓ_6 = 0, w_6 = 5, D_66 = -0p - 5(1 - p) = 5p - 5,
and

$$
A = (A_{ij}) = \begin{pmatrix}
 0 & -1 &  1 & -1 & -1 & -1 \\
 1 &  0 & -1 &  1 &  1 & -1 \\
-1 &  1 &  0 &  1 &  1 & -1 \\
 1 & -1 & -1 &  0 & -1 & -1 \\
 1 & -1 & -1 &  1 &  0 & -1 \\
 1 &  1 &  1 &  1 &  1 &  0
\end{pmatrix}.
$$
This gives

$$
D = \begin{pmatrix}
-3p-1 &  -p+1 &     p &  -p+1 &  -p+1 & -p+1 \\
    p &   p-3 &  -p+1 &     p &     p & -p+1 \\
 -p+1 &     p &   p-3 &     p &     p & -p+1 \\
    p &  -p+1 &  -p+1 & -3p-1 &  -p+1 & -p+1 \\
    p &  -p+1 &  -p+1 &     p &  -p-2 & -p+1 \\
    p &     p &     p &     p &     p & 5p-5
\end{pmatrix}.
$$
• For example, when p = 1/4, the vector9

x = (675, 290, 345, 609, 406, 155)^t

is a steady-state solution to the system in (3). This represents the
expected population of random walkers voting for each team. This gives
the ranking

Army < Lafayette < Lehigh < Holy Cross < Bucknell < Navy.

This gives a prediction failure rate of 13.3%.

9 Recall, when p < 1/2, the random walker prefers losers, so the ranking is
reversed.
• For example, when p = 3/4, the vector10

x = (693, 1148, 1239, 615, 820, 2709)^t

is a steady-state solution to the system in (3):

$$
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}' =
\begin{pmatrix}
-13/4 &  1/4 &  3/4 &   1/4 &   1/4 &  1/4 \\
  3/4 & -9/4 &  1/4 &   3/4 &   3/4 &  1/4 \\
  1/4 &  3/4 & -9/4 &   3/4 &   3/4 &  1/4 \\
  3/4 &  1/4 &  1/4 & -13/4 &   1/4 &  1/4 \\
  3/4 &  1/4 &  1/4 &   3/4 & -11/4 &  1/4 \\
  3/4 &  3/4 &  3/4 &   3/4 &   3/4 & -5/4
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}.
$$

This represents the expected population of random walkers voting for each
team. This gives the ranking

Army < Lafayette < Lehigh < Bucknell < Holy Cross < Navy.

This agrees with the ranking obtained by Keener's method, and gives a
prediction failure rate of 6.7%.

10 Recall, when p > 1/2, the random walker prefers winners, so the ranking is
in the usual order.
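The steady states above are kernel vectors of D(p). A NumPy sketch under the
definitions of this section (the helper name D is ours; A is the matrix of
aggregate wins and losses displayed above):

    import numpy as np

    A = np.array([[ 0, -1,  1, -1, -1, -1],
                  [ 1,  0, -1,  1,  1, -1],
                  [-1,  1,  0,  1,  1, -1],
                  [ 1, -1, -1,  0, -1, -1],
                  [ 1, -1, -1,  1,  0, -1],
                  [ 1,  1,  1,  1,  1,  0]])

    def D(p):
        wins = (A > 0).sum(axis=1)
        losses = (A < 0).sum(axis=1)
        # off-diagonal entries: N_ij/2 + ((2p-1)/2) A_ij, with N_ij = 1 here
        M = 0.5 + (2 * p - 1) / 2 * A
        np.fill_diagonal(M, -p * losses - (1 - p) * wins)
        return M

    for p in (0.25, 0.75):
        _, s, Vt = np.linalg.svd(D(p))
        x = np.abs(Vt[-1])      # kernel vector = steady state of x' = Dx
        print(p, x / x.sum())   # compare (675,...,155)/2480 and (693,...,2709)/7224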
8 The minimum upset method
Let Team 1 < Team 2 < . . . < Team n denote some ranking of teams and
assume, for simplicity, that each team plays each other. An upset occurs
when a lower ranked team beats an upper ranked team. For each ranking, r,
let U (r) denote the total number of upsets. The goal of the minimum upset
method is to construct a ranking for which U (r) is as small as possible.
Let A_ij denote the number of times team i beat team j minus the number
of times team j beat team i, so

$$
A_{ij} = \begin{cases} -1, & \text{if team } i \text{ lost to team } j, \\ +1, & \text{if team } i \text{ beat team } j, \\ 0, & \text{if } i = j. \end{cases}
$$
We regard this matrix as the signed adjacency matrix of a digraph Γ (see
Figure 2). Our goal is to find a Hamiltonian (undirected) path through the
vertices of Γ which goes the “wrong way” on as few edges as possible.
• Construct the list of spanning trees of Γ (regarded as an undirected
graph).

• Construct the sublist of Hamiltonian paths (from the spanning trees of
maximum degree 2).

• For each Hamiltonian path, compute the associated upset number: the
total number of edges traversed in Γ going the "right way" minus the
total number going the "wrong way."

• Locate a Hamiltonian path for which this upset number is as large as
possible (a brute-force sketch follows this list).
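For six teams it is feasible to skip the spanning-tree enumeration and simply
brute-force all 720 orderings, minimizing the raw upset count directly (which
is equivalent to maximizing right-way minus wrong-way edges). A hedged Python
sketch, not the Sage code used for Example 1 below; when several orderings
tie, which one is returned depends on the enumeration order.

    import itertools

    A = [[ 0, -1,  1, -1, -1, -1],   # A_ij = +1 iff team i beat team j,
         [ 1,  0, -1,  1,  1, -1],   # copied from Example 1 below
         [-1,  1,  0,  1,  1, -1],
         [ 1, -1, -1,  0, -1, -1],
         [ 1, -1, -1,  1,  0, -1],
         [ 1,  1,  1,  1,  1,  0]]

    def upsets(order):
        # `order` lists teams worst-to-best; an upset is an earlier (lower
        # ranked) team having beaten a later (higher ranked) one
        return sum(1 for a, b in itertools.combinations(order, 2) if A[a][b] == 1)

    best = min(itertools.permutations(range(6)), key=upsets)
    print(best, upsets(best))   # a minimizing worst-to-best order and its upset count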
Example 1. In our Patriot League example,

$$
A = \begin{pmatrix}
 0 & -1 &  1 & -1 & -1 & -1 \\
 1 &  0 & -1 &  1 &  1 & -1 \\
-1 &  1 &  0 &  1 &  1 & -1 \\
 1 & -1 & -1 &  0 & -1 & -1 \\
 1 & -1 & -1 &  1 &  0 & -1 \\
 1 &  1 &  1 &  1 &  1 &  0
\end{pmatrix}.
$$
Using Sage, the list 0, 4, 3, 1, 2, 5 was obtained. This gives the ranking

Army < Lafayette < Lehigh < Bucknell < Holy Cross < Navy.

This agrees with the ranking obtained by the previous method and by Keener's
method. This gives a prediction failure rate of 6.7%.
Wessell [We2] describes this method in a different way.

• Construct a matrix M = (M_ij), with rows and columns indexed by the
teams in some fixed order. The entry in the ith row and jth column is
defined by

$$
M_{ij} = \begin{cases} 0, & \text{if team } i \text{ lost to team } j, \\ 1, & \text{if team } i \text{ beat team } j, \\ 0, & \text{if } i = j. \end{cases}
$$

• Reorder the rows (and corresponding columns) into a basic win-loss
order: the teams that won the most games go at the top of M, and those
that lost the most at the bottom.

• Randomly swap rows and their associated columns, each time checking
whether the number of upsets has gone down from the previous time. If
it has gone down, we keep the swap that just happened; if not, we switch
the two rows and columns back and try again.
The basic idea behind Wessell's version of the minimum upset method [We2]
is to find a permutation matrix P such that P^{-1} M P is "as upper-triangular
as possible".
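A small sketch of the swap heuristic just described (our own illustrative
implementation, not Wessell's code, with M derived from the matrix A of
Example 1):

    import random

    A = [[ 0, -1,  1, -1, -1, -1], [ 1,  0, -1,  1,  1, -1],
         [-1,  1,  0,  1,  1, -1], [ 1, -1, -1,  0, -1, -1],
         [ 1, -1, -1,  1,  0, -1], [ 1,  1,  1,  1,  1,  0]]
    M = [[1 if a == 1 else 0 for a in row] for row in A]   # M_ij = 1 iff i beat j

    def upset_count(order):
        # `order` lists teams best-to-worst; upsets are wins by a team placed
        # lower in the order over a team placed higher
        n = len(order)
        return sum(M[order[j]][order[i]] for i in range(n) for j in range(i + 1, n))

    def wessell(trials=2000, seed=0):
        rng = random.Random(seed)
        order = sorted(range(len(M)), key=lambda t: -sum(M[t]))  # most wins on top
        best = upset_count(order)
        for _ in range(trials):
            i, j = rng.sample(range(len(M)), 2)
            order[i], order[j] = order[j], order[i]              # try a swap
            c = upset_count(order)
            if c < best:
                best = c                                         # keep it
            else:
                order[i], order[j] = order[j], order[i]          # undo it
        return order, best

    print(wessell())   # a best-to-worst order and its upset count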
9 The Elo rating method
This system was originally developed by Arpad Elo11 for rating chess players
in the 1950s and 1960s.
We use the following version of his rating system.
As above, assume all the n teams play each other (ties allowed) and let
ri denote the rating of Team i, i = 1, 2, . . . , n.
Let A = (A_ij) denote an n × n matrix of score results:

$$
A_{ij} = \begin{cases} -1, & \text{if team } i \text{ lost to team } j, \\ +1, & \text{if team } i \text{ beat team } j, \\ 0, & \text{if } i = j. \end{cases}
$$

Let S_ij = (A_ij + 1)/2.
In the example of the Patriot League, the digraph associated to A is
visualized in Figure 2.

1. Initialize all the ratings to be 100: r = (r_1, ..., r_n) = (100, ..., 100).

2. After Team i plays Team j, update their rating using the formula

r_i = r_i + K(S_ij - μ_ij),

where K = 10 and

μ_ij = (1 + e^{-(r_i - r_j)/400})^{-1}.

In the example of the Patriot League, the ratings vector is

r = (85.124, 104.79, 104.88, 85.032, 94.876, 124.53).

This gives the ranking

Lafayette < Army < Lehigh < Bucknell < Holy Cross < Navy.

This gives a prediction failure rate of 13.3%.
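An Elo update sketch in Python. The note gives the update for r_i only; the
code below also applies the mirror update to r_j, the standard companion step.
The game list is a hypothetical placeholder, since the game-by-game schedule
is not reproduced here: each entry is (i, j, S_ij) with S_ij = 1 if team i won
and 0 if team i lost.

    import math

    ratings = [100.0] * 6                       # step 1: everyone starts at 100
    games = [(0, 1, 1), (2, 5, 0), (3, 4, 0)]   # hypothetical example games

    K = 10.0
    for i, j, s_ij in games:
        mu_ij = 1.0 / (1.0 + math.exp(-(ratings[i] - ratings[j]) / 400.0))
        ratings[i] += K * (s_ij - mu_ij)
        ratings[j] += K * ((1 - s_ij) - (1 - mu_ij))   # mirror update for team j

    print(ratings)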
Acknowledgements: I thank T.S. Michael for both encouraging comments and
many helpful suggestions on this paper.
11 Elo (1903-1992) was a physics professor at Marquette University in
Milwaukee and a chess master [E]. He won the Wisconsin State Championship
eight times.
References

[E] A. Elo's obituary in the N.Y. Times (1992), available at
http://www.nytimes.com/1992/11/14/obituaries/prof-arpad-e-elo-is-dead-at-89-inventor-of-chess-ratings-system.html

[GM] A. Govan and C. Meyer, Ranking National Football League teams using
Google's PageRank, available at
https://www.ncsu.edu/crsc/reports/ftp/pdf/crsc-tr06-19.pdf

[Ke] J. P. Keener, The Perron-Frobenius theorem and the ranking of football
teams, SIAM Review 35 (1993), 80-93.

[LM] A. Langville and C. Meyer, Who's # 1, Princeton University Press, 2012.

[Ma] K. Massey, Statistical models applied to the rating of sports teams,
available at masseyratings.com.

[S] Sagemath Developers, Sagemath - a mathematical software package, version
7.5, http://www.sagemath.org/.

[We1] C. Wessell, Massey's method, available at
public.gettysburg.edu/~cwessell/RankingPage/massey.pdf

[We2] —, Minimum upset method, available at
public.gettysburg.edu/~cwessell/RankingPage/minupsets.pdf