The Unity of Combinatorics
Ezra Brown and Richard K. Guy
May 31, 2017
One reason why combinatorics has been slow to become accepted as part of mainstream
mathematics is the common belief that it consists of a bag of tricks in many areas:
combinatorial number theory (partitions, integer sequences), combinatorial set
theory, Ramsey theory, partially ordered sets, lattices (in the poset, geometric, and number-theoretic senses), error-correcting codes, combinatorial designs (latin squares and rectangles, projective and affine geometries, Steiner
systems, Kirkman’s schoolgirls problem), combinatorial games, enumerative
combinatorics (recurrence relations, generating functions), 0–1 matrices, graph
theory (including tournaments, topological properties, coloring problems, networks), recreational mathematics, scheduling, combinatorial geometry, packing
and covering (in number-theoretic, set-theoretic, graph-theoretic or geometric
contexts),
with little or no connection between them. We shall see that they have numerous threads
weaving them together into a beautifully patterned tapestry.
We have divided RKG’s original “The Unity of Combinatorics” paper into sections, and
have expanded the exposition, built up background material as needed, and augmented
the original with additional connections as seemed appropriate. We have also included
parts of several of EB’s papers on block designs and automorphism groups with the aim
of enhancing the exposition and readability of this work. The sequence section has been
greatly expanded to include material about the Fibonacci numbers, Pascal’s triangle, and
the Catalan numbers. Regarding the Catalan numbers, we have included part of RKG’s
online paper on Cat Paths as an example of how to take a sequence (namely, the Catalan
numbers) and follow what happens when both Dyck paths and triangles are generalized to
higher dimensions.
We refer to the brief section on Mock Turtles as the Keystone, because it seems to be an
arch between TUOC and Alex Fink and RKG’s more advanced “Rick’s Tricky Six Puzzle:
S5 sits specially in S6.” In this section, we make mention of Golay codes, Mathieu groups,
and the Leech lattice of dimension 24, as these and other combinatorial objects appear in
the “Rick’s Tricky Six” section. EB’s paper on the (11, 5, 2) biplane is a way to connect
up all these objects, as well as the ternary Golay codes and the Steiner systems S(4, 5, 11)
and S(5, 6, 12).
There will also be a section on those ubiquitous objects called matroids, which link
combinatorics, graph theory, linear algebra, geometry, and topology – again, borrowed
liberally from an eminently readable 2009 Mathematics Magazine paper by David Neel
and Nancy Neudauer. In particular, the F7 matroid is another name for (7, 3, 1).
We conclude with two sections on that most unusual of all combinatorial designs, the
Steiner system S(5, 8, 24). This is a collection of octads (8-element subsets of a 24-element
set Ω) such that every 5-element subset of Ω is contained in exactly one octad. Section 18,
much of which is drawn from Maria Beane’s 2011 Masters thesis on S(5, 8, 24), begins with
a question: based only on this definition, what can we learn about the internal structure of
such an S(5, 8, 24)? (Quite a lot, as it turns out.) The last section in the book, Section 19, is
Robert Curtis’s beautiful exposition of the Miracle Octad Generator, an object constructed
from an S(5, 8, 24) that will take a 5-element subset of the 24-element set Ω and find the
unique octad containing that 5-element subset.
Contents
1 Sequences
2 Combinatorial Games
3 Sequences, II
4 Catwalks, Sandsteps, and Pascal Pyramids
5 Unique Rook Circuits
6 Sums, colorings, squared squares, and packings
7 Difference sets and combinatorial designs
8 Geometric connections
9 The groups PSL(2, 7) and GL(3, 2) and why they are isomorphic.
10 Incidence matrices, codes, and geometries
11 Kirkman’s Schoolgirls, Fields, Spreads, and Hats
12 (7, 3, 1) and combinatorics
13 (7, 3, 1) and algebraic systems
14 (7, 3, 1) and Matroids
15 Coin Turning Games and Mock Turtles
16 The Fabulous (11, 5, 2) Biplane
17 Rick’s Tricky Six Puzzle: S5 sits specially in S6
18 S(5, 8, 24)
19 The Miracle Octad Generator
19.1 An elementary approach
19.2 A more mathematical approach
19.2.1 The exceptional isomorphism A8 ≅ L4(2)
19.2.2 The binary Golay code C
19.2.3 The hexacode
1 Sequences
This section is in the file called sequences1.tex.
Langford sequences and Skolem’s problem
Even before learning the numbers, children like to arrange objects in a line. That seems
like a perfect way to begin our combinatorial journey, so let’s look in on Dudley Langford’s
small son playing with his colored building blocks and see what happens.
The boy arranged them with one block between the red ones (an underscore marks a gap still to be filled),
R _ R
two between the blue ones,
R B R _ B
and three between the green,
G R B R G B.
Replace the colors by the number of blocks between,
3 1 2 1 3 2
and we have an arrangement of a pair of 1s, a pair of 2s, and a pair of 3s in which the
two 1s are one unit apart, the two 2s are two units apart, and the two 3s are three units
apart. Such arrangements are called Langford sequences or Langford arrangements,
and in 1958, Bang [6] showed that there is such a sequence using n pairs if and only if
n ≡ 0 or −1 (mod 4).
There are many algorithms for constructing Langford sequences whenever n ≡ 0 or −1
(mod 4); here is how to see that this condition is necessary.
Suppose there is a Langford arrangement of n pairs. There are 2n positions in such an
arrangement, namely 1, 2, . . . , 2n, and they sum to 2n(2n + 1)/2 = n(2n + 1). On the other
hand, for 1 ≤ j ≤ n, the left position $L_j$ and right position $R_j$ of j differ by j + 1; hence,
\[
2n^2 + n = n(2n + 1) = \sum_{j=1}^{n} (L_j + R_j) = \sum_{j=1}^{n} (L_j + L_j) + \sum_{j=1}^{n} (j + 1) = 2 \sum_{j=1}^{n} L_j + \frac{n(n + 1)}{2} + n.
\]
Isolating n(n + 1)/2 on the right shows that the latter must be an even integer, which
happens only for n ≡ 0 or −1 (mod 4).
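The necessity argument can be compared against a brute-force search for small n. The following Python sketch is an illustration of ours, not one of the constructions alluded to above; the name langford_arrangements is hypothetical, and an arrangement and its mirror image are counted separately.

```python
def langford_arrangements(n):
    """Yield every arrangement of n pairs in which the two copies of j
    have exactly j entries between them (their positions differ by j + 1)."""
    seq = [0] * (2 * n)  # 0 marks an empty position

    def place(j):
        if j == 0:
            yield tuple(seq)
            return
        # try every left position for the pair of j's
        for left in range(2 * n - j - 1):
            right = left + j + 1
            if seq[left] == 0 and seq[right] == 0:
                seq[left] = seq[right] = j
                yield from place(j - 1)
                seq[left] = seq[right] = 0  # undo and try the next slot

    yield from place(n)


for n in range(1, 9):
    count = sum(1 for _ in langford_arrangements(n))
    print(n, n % 4, count)
```

Under these assumptions, the count should be nonzero exactly for those n whose residue mod 4 is 0 or 3.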
The problem generalizes to triples, quadruples, etc. in a straightforward way. Thus, a
Langford (s, n)-sequence is a sequence consisting of s appearances of i, for 1 ≤ i ≤ n, in
which consecutive occurrences of i are separated by i elements of the sequence. However,
the known conditions for solutions to exist are only necessary ones. For example, if a (3, n)-sequence
exists, then n ≡ −1, 0, or 1 (mod 9); see [136] for further details. On the other
hand, here is a sequence of triples i . . . i . . . i (1 ≤ i ≤ 9) arranged such that consecutive
appearances of the number i are separated by i numbers:
1 9 1 2 1 8 2 4 6 2 7 9 4 5 8 6 3 4 7 5 3 9 6 8 3 5 7.
The study of Langford (s, n) sequences and variations thereof continues to be an active
area of research.
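Checking such a sequence by eye is tedious, so here is a short Python sketch that tests the defining property of a Langford (s, n)-sequence. The function name is_langford is our own hypothetical helper; it only verifies a given sequence and does not construct one.

```python
from collections import defaultdict


def is_langford(seq, s):
    """Return True if seq is a Langford (s, n)-sequence: each i in 1..n occurs
    s times, and consecutive occurrences of i have i entries between them."""
    positions = defaultdict(list)
    for index, value in enumerate(seq):
        positions[value].append(index)
    n = len(positions)
    if sorted(positions) != list(range(1, n + 1)):
        return False
    return all(
        len(where) == s
        and all(b - a == i + 1 for a, b in zip(where, where[1:]))
        for i, where in positions.items()
    )


triples = [1, 9, 1, 2, 1, 8, 2, 4, 6, 2, 7, 9, 4, 5,
           8, 6, 3, 4, 7, 5, 3, 9, 6, 8, 3, 5, 7]
print(is_langford(triples, 3))
print(is_langford([3, 1, 2, 1, 3, 2], 2))
```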
Here is a variation on this theme. To the string of blocks
G R B R G B,
we add two white ones with no blocks between:
G R B R G B W W.
Again, replace the colors by the numbers of blocks between,
3 1 2 1 3 2 0 0
and then add one to each number:
4 2 3 2 4 3 1 1.
Now, let’s attach labels from 1 to 8 to the above line of blocks:
blocks: 4 2 3 2 4 3 1 1
labels: 1 2 3 4 5 6 7 8
In so doing, we find that we have solved a problem of Skolem in those cases when there is
a solution, namely n ≡ 0 or 1 (mod 4). The problem is to partition the numbers from 1 to
2n into n pairs whose differences are the numbers 1 to n. Here is the solution for n = 4:
8 − 7 = 1,
4 − 2 = 2,
6 − 3 = 3,
5 − 1 = 4.
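The passage from the labeled line of blocks to the pairs can be mechanized. In this Python sketch (the helper name skolem_pairs is hypothetical), each number d in the line is matched with the two labels at which it appears, and the difference of those labels is d.

```python
def skolem_pairs(line):
    """Map each value d in the line to the pair (later label, earlier label)
    of the two positions, numbered from 1, at which d occurs."""
    first_seen = {}
    pairs = {}
    for label, d in enumerate(line, start=1):
        if d in first_seen:
            pairs[d] = (label, first_seen[d])
        else:
            first_seen[d] = label
    return pairs


pairs = skolem_pairs([4, 2, 3, 2, 4, 3, 1, 1])
for d in sorted(pairs):
    later, earlier = pairs[d]
    print(f"{later} - {earlier} = {d}")
```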
Skolem’s original problem dealt with constructing a sequence with elements in the set
A = {1, 2, . . . , n} in which pairs of differences are the numbers in A. More recent work
has been done on the same problem but in which A is allowed to be any set of n positive
integers.
We return to these sequences in Section 6 when we discuss the x + y = 2z problem.
Beatty sequences
If we don’t restrict ourselves to a particular value of n, we can partition the positive integers
into two sequences A and B with the differences between corresponding terms given in the
bottom line:
A           1  3  4  6  8  9 11 12 14 16 17 19 21 . . .
B           2  5  7 10 13 15 18 20 23 26 28 31 34 . . .
difference  1  2  3  4  5  6  7  8  9 10 11 12 13 . . .
A and B are examples of Beatty sequences, which are sequences of the form {⌊nα⌋ : n =
1, 2, . . .}, where α is an irrational number. In this example, A = {⌊nφ⌋} and B = {⌊nφ²⌋},
where φ := (1 + √5)/2 is the famous golden section. It is a remarkable fact that if α and
β are irrational numbers satisfying 1/α + 1/β = 1, then the sequences {⌊nα⌋} and {⌊nβ⌋}
contain every positive integer without repetition. Two such Beatty sequences are called
complementary.
Here is a proof of this far-from-obvious fact. Let α and β be irrational numbers such
that 1/α + 1/β = 1.
First, we show that the sequences {⌊nα⌋} and {⌊nβ⌋} have no elements in common.
Suppose, to the contrary, that there exist positive integers m, i and j such that m = ⌊iα⌋ =
⌊jβ⌋. Since α and β are irrational, we know that
m < iα < m + 1 and m < jβ < m + 1.
If we divide the first set of inequalities by α and the second by β, and then add the resulting
inequalities, we see that
\[
m = m \cdot 1 = m \left( \frac{1}{\alpha} + \frac{1}{\beta} \right) < i + j < m + 1,
\]
contradicting the assumption that i and j are positive integers.
Second, we show that the sequences exclude no positive integer. Suppose, to the contrary, that the
positive integer m is excluded. Then there exist nonnegative integers i and j such that
iα < m < m + 1 < (i + 1)α and jβ < m < m + 1 < (j + 1)β.
(The inequalities are strict because, as before, α and β are irrational.) Dividing the sets
of inequalities by α and β, respectively, and adding them leads to the string of inequalities
\[
i + j < m \left( \frac{1}{\alpha} + \frac{1}{\beta} \right) = m < m + 1 = (m + 1) \left( \frac{1}{\alpha} + \frac{1}{\beta} \right) < i + j + 2.
\]
There cannot be two integers strictly between i+j and i+j+2, contradicting the assumption
that m was excluded. We conclude that indeed, the sequences {⌊nα⌋} and {⌊nβ⌋} form a
partition of the positive integers. Thus, the sequences come by the name “complementary”
honestly.
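For the golden-section pair A and B, the partition can also be checked numerically. The Python sketch below is ours; to avoid floating-point rounding it uses the identity ⌊nφ⌋ = ⌊(n + √(5n²))/2⌋ with an integer square root, together with φ² = φ + 1. The names beatty_phi and beatty_phi_squared are hypothetical.

```python
from math import isqrt


def beatty_phi(n):
    """floor(n * phi), computed exactly as (n + floor(sqrt(5 n^2))) // 2."""
    return (n + isqrt(5 * n * n)) // 2


def beatty_phi_squared(n):
    """floor(n * phi^2); since phi^2 = phi + 1, this is floor(n * phi) + n."""
    return beatty_phi(n) + n


N = 1000
A = [beatty_phi(n) for n in range(1, N + 1)]
B = [beatty_phi_squared(n) for n in range(1, N + 1)]

# Every integer from 1 up to the largest term of A should occur exactly once.
limit = A[-1]
covered = sorted(x for x in A + B if x <= limit)
print(A[:13])
print(B[:13])
print(covered == list(range(1, limit + 1)))
```

A finite check of this kind is no substitute for the proof, but it is a quick safeguard against a misremembered formula.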
Samuel Beatty is another of the many curious figures that have appeared in the world
of combinatorics. A Canadian by birth, he entered the University of Toronto as a student
in 1903 and stayed there for the rest of his professional life, eventually becoming the
University Chancellor. He was the only doctoral student of John Charles Fields, he of
the Fields Medal. By all accounts a beloved teacher, mentor, and strong supporter of his
students, he is best known for his American Mathematical Monthly Problem 3173 [8]. The
latter is arguably the most studied problem ever to appear in the Monthly.
Are there other ways to generate complementary sequences? Indeed there are, and here
is one of those ways, as described in James Tanton’s delightful book named Mathematics
Galore! [162]:
Begin with any nondecreasing sequence P = {p1 , p2 , . . .} of positive integers, such as
the following sequence:
2, 2, 3, 5, 8, 11, 11, 11, 13, 17, 19, 23, . . . .
Now define the frequency sequence Q = {q1 , q2 , . . .} of nonnegative integers, where qk is the
number of entries in P less than k. Thus,
q1 = number of elements in P less than 1 = 0,
q2 = number of elements in P less than 2 = 0,
q3 = number of elements in P less than 3 = 2,
q4 = number of elements in P less than 4 = 3, . . .
and so Q = {0, 0, 2, 3, 3, 4, 4, 4, 5, 5, 5, 8, 8, 9, . . .}. Next, find the frequency sequence of Q;
this turns out to be {2, 2, 3, 5, 8, 11, 11, 11, 13, . . .}. But that’s the sequence we started with
– that is, the frequency sequence of the frequency sequence of P is P itself!
Finally, construct the two sequences P ∗ = {pn + n : n = 1, 2, . . .} and Q∗ = {qn + n :
n = 1, 2, . . .}; then
P ∗ = {3, 4, 6, 9, 13, 17, 18, 19, . . .},
Q∗ = {1, 2, 5, 7, 8, 10, 11, 12, 14, 15, 16, 20, . . .},
and we see that P ∗ and Q∗ are complementary sequences. For a proof that this method
works every time, see [162, p. 29].
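The computations in this example are easy to reproduce. The following Python sketch is ours and works only with the twelve listed terms of P, so each output is trustworthy only as far as the comments indicate; frequency_sequence is a hypothetical helper name.

```python
def frequency_sequence(seq, length):
    """Return [q_1, ..., q_length], where q_k counts the entries of seq below k."""
    return [sum(1 for entry in seq if entry < k) for k in range(1, length + 1)]


P = [2, 2, 3, 5, 8, 11, 11, 11, 13, 17, 19, 23]

# Exact for k <= 23, because every later entry of P is at least 23.
Q = frequency_sequence(P, 23)

# Exact for k <= 12, because every later entry of Q is at least 12.
P_again = frequency_sequence(Q, 12)

P_star = [p + n for n, p in enumerate(P, start=1)]
Q_star = [q + n for n, q in enumerate(Q, start=1)]

print(Q[:14])
print(P_again == P)
print(sorted(P_star + Q_star) == list(range(1, 36)))
```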
Is every pair of complementary sequences formed in this way a pair of complementary Beatty sequences? No. Consider, for example, the sequence of positive integers
P = Z+ . We see that the frequency sequence of Z+ is the sequence Q of nonnegative
integers. Then
P ∗ = {2, 4, 6, 8, 10, 12, . . .} = 2Z, and
Q∗ = {1, 3, 5, 7, 9, 11, . . .} = 2Z + 1,
the even and odd numbers respectively. Suppose 2Z = {⌊α⌋, ⌊2α⌋, . . .} for some irrational
number α. Then α = 2 + δ for some δ ∈ (0, 1). Choose the integer k ≥ 2 with 1/k < δ < 1/(k − 1);
then 1 < kδ < 2, so 2k = ⌊kα⌋ = ⌊2k + kδ⌋ = 2k + 1, a contradiction. Thus, the odd numbers and even numbers
are complementary sequences that are not a pair of complementary Beatty sequences.
Penrose pieces
The Beatty sequences A and B from the above discussion are associated, via the golden
section φ, with the well-known Fibonacci numbers $f_n$ and Lucas numbers $L_n$, defined
by the recurrences
$f_1 = 1$, $f_2 = 1$, $f_{n+1} = f_n + f_{n-1}$ for n ≥ 2, and
$L_1 = 1$, $L_2 = 3$, $L_{n+1} = L_n + L_{n-1}$ for n ≥ 2,
and displayed in the following table:
rank       . . .  −4  −3  −2  −1  0  1  2  3  4   5   6   7   8   9  . . .
Fibonacci  . . .  −3   2  −1   1  0  1  1  2  3   5   8  13  21  34  . . .
Lucas      . . .   7  −4   3  −1  2  1  3  4  7  11  18  29  47  76  . . .
The association of φ with the Fibonacci numbers comes from the fact that
\[
\varphi = \frac{1 + \sqrt{5}}{2} = \lim_{n \to \infty} \frac{f_{n+1}}{f_n}.
\]
We prove this when we return to, and expand upon, the Fibonacci numbers in Section 3.
A striking example of the appearance of Beatty sequences occurs in the arrangement
of short and long bow-ties in the Conway worms that appear in one of Roger Penrose’s
aperiodic tilings of the plane with kites and darts [59, 6].
Now, a tiling of a surface is a covering of the surface with nonoverlapping pieces that
fit together exactly – that is, the pieces do not meet except in points that are parts of
common edges or in corners. The pieces are called tiles. The tilings of the plane into
congruent regular hexagons, squares, and equilateral triangles are familiar to us all. The
famous wallpaper patterns are tilings that use copies of a finite number of tiles, patterns
that exhibit various symmetries – that is, rigid motions of the entire tiling that bring it
into self-coincidence. Translations, rotations, reflections, and glide-reflections are examples
of such symmetries. A tiling is periodic if it has one of these symmetries. A uniform tiling
is a tiling of the plane by regular polygons such that for any two vertices a and b (corners
of the tiling), there is a symmetry of the tiling that carries a into b.
A tiling is called aperiodic if it does not contain arbitrarily large periodic subtilings.
In 1966, Robert Berger found an aperiodic set of 20,426 tiles that tile the plane and for
which no periodic tiling exists. (This was his doctoral dissertation.) Smaller sets were
found, and in 1971 Raphael M. Robinson found a set of six such tiles. At this point, the
mathematical physicist Roger Penrose got interested in the problem, found another set of
six tiles, reduced the number to four, and finally to the two tiles he called the kite and the
dart.
Kites and darts are made as follows. Begin with a rhombus P QRS with edges of unit
length and angles 72◦ and 108◦ . Draw the long diagonal P R, locate a point X on P R that
is a golden section distance from P , draw the lines QX and SX, and erase the diagonal.
Then P QXS is a kite with head X and tail P and QRSX is a dart with head R
and tail X. In a Penrose tiling, you are not allowed to put a kite and a dart together in
the way we constructed them: they must not form a rhombus.
Once we construct the rhombus, we decorate it with four circular arcs, as we see in
Figure 1. Suppose the sides of the rhombus have length 1. Then the circular arcs labeled
G, H, K, and L have radii $r_G = 1/\varphi$, $r_H = 1/\varphi^2$, $r_K = 1/\varphi^3$, and $r_L = 1/\varphi^2$, respectively.
The radii of both the dashed arcs and the solid arcs are in ratio $r_G / r_L = r_H / r_K = \varphi$. (At
this point, you might want to construct the beginning of your own Penrose tiling, making
sure to remember not to form a rhombus.)
A Conway worm is a curve made of only solid arcs or only dashed arcs. Now, a short
bow-tie is constructed from three kites and two darts, and a long bow-tie is constructed
from five kites and three darts. (See Figure 3.) In each worm the short bow-ties
are separated by just one or two long ones. If you successively label the bow-ties in a worm, then the labels
of the long and short bow-ties form sequences like the pair of Beatty sequences we gave
above.
Figure 1: Rhombus showing the kite, the dart, and four circular arcs for constructing
Conway worms and Penrose tilings.
Figure 2: From left: the rhombus, the kite, and the dart
The Beatty sequences we mentioned above have a connection, via a game invented by
Willem A. Wythoff, to the fascinating world of combinatorial games. Let’s see how.
Figure 3: From left: the ace or false kite, the short bowtie, and the long bowtie
2 Combinatorial Games
This section is in the file combgames.tex.
Wythoff’s game
Willem Abraham Wythoff (1865-1939) was a Dutch mathematician from Amsterdam who
received his doctorate in 1898 with a dissertation on biquaternions. His advisor was the applied mathematician Diederik J. Korteweg, co-inventor of the Korteweg-de Vries equation,
fundamental for the study of shallow-water waves.
Wythoff’s game dates from 1905 [179] and is another example involving complementary
Beatty sequences – specifically the sequences A and B from Section 1. It is a two-person
game played with two heaps of beans. The players alternately take either any number of
beans from one heap or equal numbers from both heaps, the last player being the winner.
Some time around 1960, Rufus Isaacs unwittingly invented an isomorph of this game,
played with a queen on a chessboard. Players alternately move the queen in one of three
directions taking it nearer to a predesignated corner; the player who moves the queen to
the corner wins. See [9, p. 38] or [64] for further details. Isaacs’ version of Wythoff’s game
has recently surfaced in the Mathematical Circles world under the names of Last Biscuit
and the Puppies and Kittens Game.
In our version, called Wyt Queens, any number of queens may occupy the same square;
i.e., we play a disjunctive compound or sum of copies of Wythoff’s game, which itself
contains non-disjunctive moves in which both heaps of beans are altered. (That is, we play
several copies of Wythoff’s game simultaneously, and each move is made in one of the copies.) Figure
4 shows eight copies of Wythoff’s game in play at the same time.
The connection between the Beatty sequences A and B and Wythoff’s game is this: If
you play Wythoff’s game so that the numbers in the two heaps are corresponding members
of the two sequences, you will always win.
This curious statement leads us to explore a tiny bit of a vast area of combinatorics
called combinatorial game theory, of which Wythoff’s game is an interesting example.
However, we need to explain some of the terminology and address some questions. What
do we mean by a game, how do we determine a winning strategy for a game, and what is
a sum of games?
In this section, we answer these questions.
In Wythoff’s game, two players take turns making moves until reaching a winning
position for one of the players. By the description, we see that it is an impartial game,
which is a game in which the allowable moves depend solely on the position and not on
whose move it is. It is also a finite game, for the two heaps of beans are presumably finite.
With these assumptions in mind, we make the following definition.
Figure 4: Wyt Queens, a disjunctive form of Isaacs’ game.
More formally, a finite combinatorial impartial two-player game – or game for
short – satisfies the following rules:
1. There are two players, and the players take turns moving.
2. There is a finite set of positions, and the rules specify all allowable moves from a
given position.
3. For each position, both players have the same set of allowable moves.
4. The game ends when it reaches a terminal position, i.e. a position from which no
moves are possible for the player whose move it is, at which point the previous player wins.
5. The game ends in a finite number of moves no matter how it is played.
Let’s look at some games.
A selection of games
Example 1: A takeaway game ([55]). The “game board” consists of a heap of n beans.
Two players take turns removing beans; a play consists of removing 1, 2, or 3 beans.
The winner is the player who takes the last bean, for an empty board is a terminal position:
the next player has no legal move.
Can either of the players force a win, and if so, which one?
Example 2: Nim. This is a game played with a number of heaps of beans. Players
alternate by taking all or some of the beans in a single heap, but they must not take beans
from two or more heaps. The player who takes the last bean is the winner. Decide on a
winning strategy.
Example 3: A reverse takeaway game ([169]). The game board is an empty table, and
both players have a supply of beans. The players take turns placing one, two or five beans
into a heap on the table until the number of beans in the heap is either (a) the square of
an integer ≥ 2, or (b) greater than 40. How do you find a winning strategy?
For our first example, if n = 1, 2, or 3, then the first player wins by taking all the beans.
If n = 4, the first player loses, because the second player counters each removal of k beans
by removing the remaining 4 − k beans. Similarly, n = 5, 6, and 7 are wins and n = 8 is
a loss for the first player. The strategy now reveals itself: the first player loses if n is a
multiple of 4, otherwise the first player wins.
For Nim, you can see that if there is only one heap, then the first player has a winning
move – take the whole heap. With two heaps, the first player has a winning move if the
heaps are of unequal sizes; otherwise, the second player has a winning move. For three or
more heaps, winning strategies are not at all obvious, and a winning strategy must work
for any number of heaps of arbitrary sizes.
Finally, Example 3 is complicated by the fact that there are multiple ways to win,
because heaps of sizes 4, 9, 16, 25, 36, and “Over-40” are all terminal positions. To find a
strategy appears to be a nontrivial exercise. But help is on the way.
Developing strategies: P-positions and N -positions
To formulate game-winning strategies, combinatorial game theorists introduced the ideas
of P-positions and N-positions. Namely, a game is in a P-position if it secures a win for
the Previous player. A game is in an N-position if it secures a win for the Next player.
More formally, we use Ferguson’s characterization of P-positions and N -positions (see [55,
pp. 1-5]) as follows.
Characteristic Property. P-positions and N -positions are defined recursively by
three statements.
(1) All terminal positions are P-positions.
(2) From every N -position, there is at least one move to a P-position.
(3) From every P-position, every move is to an N -position.
One way to determine the P and N positions is to work backwards, as we did for the
takeaway game in Example 1. For each position x in a game, define F (x) to be the set of
all positions y such that y can be reached from x in a single move. Thus, for that takeaway
game we see that F (0) = ∅, F (1) = {0}, F (2) = {0, 1}, and F (x) = {x − 1, x − 2, x − 3}
for x ≥ 3. Let P and N denote the sets of P-positions and N -positions, respectively. We
label the positions recursively as follows.
Initialize: Set P = {terminal positions} and N = ∅.
For each position x ∉ P ∪ N all of whose followers in F (x) have already been labeled, do the following:
Place x in N if there is a position y ∈ P such that y ∈ F (x).
Otherwise, place x in P.
When this is done, we claim that by the logic of this procedure, every position in the
game has been placed in either P or N. Moreover, P is the set of winning positions for the
previous player and N is the set of winning positions for the next player. For, if x ∈ P,
then x must either (a) move to an N -position or (b) be a terminal position, and if x ∈ N,
then x can always move to a P-position. Thus, if some initial position in the game is in
N, then the first player has a winning strategy; otherwise, the second player has a winning
strategy.
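The labeling procedure translates almost line for line into code. The Python sketch below is ours and assumes the positions are supplied in an order in which every follower of a position precedes it; the names label_positions and takeaway_followers are hypothetical.

```python
def label_positions(followers, positions):
    """Label each position 'P' or 'N'.

    `followers(x)` returns the set F(x); `positions` must list every position
    so that all members of F(x) appear before x (terminal positions first).
    """
    labels = {}
    for x in positions:
        # x is an N-position exactly when some move leads to a P-position;
        # a terminal position has no moves and so is labeled P.
        if any(labels[y] == "P" for y in followers(x)):
            labels[x] = "N"
        else:
            labels[x] = "P"
    return labels


def takeaway_followers(x):
    """F(x) for the takeaway game of Example 1: remove 1, 2, or 3 beans."""
    return [x - k for k in (1, 2, 3) if x - k >= 0]


labels = label_positions(takeaway_followers, range(21))
print([x for x, tag in labels.items() if tag == "P"])
```

For the takeaway game, the P-positions found this way should be the multiples of 4, in agreement with the analysis of Example 1.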
However, knowing that there is a winning strategy is fine, but finding such a strategy is
what matters when playing a game. It turns out that Nim is the key to all combinatorial
impartial two-player games. Let’s see how.
Nim, nimbers, and the Sprague-Grundy theory
The key operation to solving Nim with multiple heaps is binary addition without carrying,
which we call Nim-sum or Nim addition. First, write out the binary expansions of each
heap size; we call such a binary expansion the heap’s nimber. To find the Nim-sum of a
set of nimbers, first write the nimbers as binary strings, including leading zeros if necessary.
Then add the strings by performing a string exclusive-or (XOR), which we represent by ⊕.
For example, to find the Nim-sum of the nimbers 22, 37, and 18, we see that
22 ⊕ 37 ⊕ 18 = 010110 ⊕ 100101 ⊕ 010010 = 100001,
and so the Nim-sum of 22, 37 and 18 is the number that is 100001 in binary, namely 33.
The nimber of a position, also called its Nim-value, is the Nim-sum of the nimbers
of its heaps. It is convenient to represent the position in a game of Nim with heap sizes
x1 , x2 , . . . , xn by the ordered n-tuple (x1 , x2 , . . . , xn ).
Nimbers lead us to the following key theorem on P-positions in Nim, due to C. L.
Bouton [17].
Bouton’s Theorem. A position in n-heap Nim with heap sizes x1 , x2 , . . . , xn is a P-position if and
only if the associated nimber x1 ⊕ x2 ⊕ . . . ⊕ xn is equal to zero. Otherwise, it is an N-position. Every move from a P-position is to an N-position with nonzero nimber, and for
every position with a nonzero nimber, there exists a legal Nim move that transforms the
position into one with nimber 0.
Let’s use the above rule to decide on the nimbers of a couple of Nim games. Picking
two examples, we see that 3 ⊕ 5 ⊕ 7 = 011 ⊕ 101 ⊕ 111 = 001 = 1 and 6 ⊕ 11 ⊕ 13 =
0110 ⊕ 1011 ⊕ 1101 = 0000 = 0. Thus, (3, 5, 7) is an N-position and (6, 11, 13) is a P-position.
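Bouton’s Theorem is constructive, and the move it promises can be found mechanically. In the Python sketch below (ours; nim_sum and winning_move are hypothetical names), a heap h can be reduced to h ⊕ s, where s is the nimber of the position, whenever h ⊕ s < h, and doing so leaves a position with nimber 0.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """Nim-sum (bitwise XOR) of the heap sizes."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap index, new heap size) for a move to a position with
    nimber 0, or None if the position is already a P-position."""
    total = nim_sum(heaps)
    if total == 0:
        return None
    for index, size in enumerate(heaps):
        target = size ^ total
        if target < size:  # legal: beans are removed from a single heap
            return index, target


print(nim_sum([22, 37, 18]))
print(winning_move([3, 5, 7]))
print(winning_move([6, 11, 13]))
```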
For arbitrary combinatorial games, P. M. Grundy [62] described a procedure for assigning, to each position x in the game, a nonnegative integer g(x) called the Grundy number
of the position: g(x) is the smallest nonnegative integer not assigned to any y ∈ F (x). The
integer g(x) is also known as the minimum excluded ordinal of x, written mex(x).
We make a few observations. If t is a terminal position, then F (t) = ∅, so that g(t) = 0.
If g(y) = 0 and y ∈ F (x), then g(x) will be nonzero. If g(y) > 0 for each y ∈ F (x), then
g(x) = 0. This leads us to the following theorem:
Grundy’s Theorem. A position x in a two-person combinatorial game is a P-position
if and only if the Grundy number g(x) of x is equal to 0.
The takeaway game in Example 1 has the following Grundy numbers:
x     0 1 2 3 4 5 6 7 8 9 10 11 . . .
g(x)  0 1 2 3 0 1 2 3 0 1  2  3 . . .
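The table of Grundy numbers can be generated directly from the definition. This Python sketch is ours; mex and grundy_numbers are hypothetical names, and the removals parameter is an assumption that lets the same code handle other takeaway rules.

```python
def mex(values):
    """Smallest nonnegative integer not in `values`."""
    values = set(values)
    candidate = 0
    while candidate in values:
        candidate += 1
    return candidate


def grundy_numbers(max_heap, removals=(1, 2, 3)):
    """Grundy numbers g(0), ..., g(max_heap) for a one-heap takeaway game
    in which a move removes k beans for some k in `removals`."""
    g = []
    for x in range(max_heap + 1):
        g.append(mex(g[x - k] for k in removals if x - k >= 0))
    return g


print(grundy_numbers(11))
```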
As an exercise, analyse the reverse takeaway game in Example 3. Begin by giving the
terminal positions Over-40, 36, 25, 16, 9, and 4 Grundy numbers of zero and assign the rest
recursively by working backwards from Over-40.
The general theorem is the Sprague-Grundy theorem, named for its formulators
Grundy and R. P. Sprague (see [151]), who obtained the result earlier, unbeknownst to
Grundy. It asserts that all (positions in) finite combinatorial impartial two-player games
are equivalent to a set of heaps of beans in the game of Nim. For a game with heaps
H1 , H2 , . . . , Hk in which a player picks a single heap and plays by the rules of that heap, the
Grundy function of a position (x1 , x2 , . . . , xk ) is equal to the Nim-sum g1 (x1 ) ⊕ · · · ⊕ gk (xk )
of the Grundy functions for the component heaps. Bouton’s Theorem is a special case of
the Sprague-Grundy theorem applied to a set of heaps in Nim.
For Isaacs’ version of Wythoff’s game, here is a table of nimbers for a queen on various
squares.
      0   1   2   3   4   5   6   7   8   9  10  11  12
 0    0   1   2   3   4   5   6   7   8   9  10  11  12
 1    1   2   0   4   5   3   7   8   6  10  11   9  13
 2    2   0   1   5   3   4   8   6   7  11   9  10  14
 3    3   4   5   6   2   0   1   9  10  12   8   7  15
 4    4   5   3   2   7   6   9   0   1   8  13  12  11
 5    5   3   4   0   6   8  10   1   2   7  12  14   9
 6    6   7   8   1   9  10   3   4   5  13   0   2  16
 7    7   8   6   9   0   1   4   5   3  14  15  13  17
We see that the positions with Grundy number 0 are precisely the positions (m, n) such
that
(m, n) or (n, m) = (1, 2), (3, 5), (4, 7), (6, 10), (8, 13), . . . ,
in which m and n are corresponding numbers in the Beatty sequences. In Figure 4, the eight
queens are in positions (4, 0), (4, 0), (2, 1), (2, 1), (2, 1), (4, 1), (0, 3), and (3, 4). These have
Grundy numbers 4, 4, 0, 0, 0, 5, 3, and 2, respectively. By the Sprague-Grundy theorem, the
Grundy number of this position is
4 ⊕ 4 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 5 ⊕ 3 ⊕ 2 = 5 ⊕ 3 ⊕ 2 = 4,
and there is a win for the next player. (Find it!)
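The nimbers for a queen can be recomputed from the mex rule, which also gives a check on the value of the eight-queen position. The Python sketch below is ours (wythoff_grundy is a hypothetical name); it assumes a queen at (m, n) may move any positive distance toward the corner along a row, a column, or a diagonal, and it leaves the search for the winning move to the reader.

```python
from functools import lru_cache, reduce
from operator import xor


@lru_cache(maxsize=None)
def wythoff_grundy(m, n):
    """Grundy number of a single queen at (m, n) in Wythoff's game."""
    reachable = set()
    for k in range(1, m + 1):
        reachable.add(wythoff_grundy(m - k, n))      # reduce the first heap
    for k in range(1, n + 1):
        reachable.add(wythoff_grundy(m, n - k))      # reduce the second heap
    for k in range(1, min(m, n) + 1):
        reachable.add(wythoff_grundy(m - k, n - k))  # reduce both equally
    value = 0
    while value in reachable:
        value += 1
    return value


zeros = [(m, n) for m in range(8) for n in range(13) if wythoff_grundy(m, n) == 0]
queens = [(4, 0), (4, 0), (2, 1), (2, 1), (2, 1), (4, 1), (0, 3), (3, 4)]
position_value = reduce(xor, (wythoff_grundy(m, n) for m, n in queens), 0)
print(zeros)
print(position_value)
```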
Recent work of Gabriel Nivasch [114] contains a great deal of detailed information about
the Sprague-Grundy function for Wythoff’s game.
The Grundy numbers for the games we have seen in this section seem to have well-behaved patterns. But there is a game whose Grundy numbers are anything but well-behaved,
and oddly enough, the game is due to Grundy himself [62]. This game begins
with a heap of stones, and the players take turns splitting a heap into two heaps of unequal
sizes. A player loses if there is no legal move available, which means that only heaps of
sizes one or two remain.
This deceptively simple game brings us into deep research waters. For, the sequence of
Grundy numbers for this game begins as follows:
Heap size      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 . . .
Grundy number  0 0 0 1 0 2 1 0 2 1  0  2  1  3  2  1 . . . .
We might ask whether this sequence of Grundy numbers has a detectable pattern. For
example, is the sequence eventually periodic? No one knows the answer to this seemingly
innocent question!
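Readers who wish to experiment with this sequence can generate it themselves. The Python sketch below is ours (grundys_game_values is a hypothetical name); it uses the Sprague-Grundy theorem, so that splitting a heap of size n into unequal heaps of sizes a and n − a leads to a position whose value is the Nim-sum of the values of the two parts.

```python
def grundys_game_values(max_heap):
    """Grundy numbers g(0), ..., g(max_heap) of single heaps in Grundy's game."""
    g = [0] * (max_heap + 1)
    for n in range(max_heap + 1):
        # a runs over the smaller part of each split into unequal parts
        options = {g[a] ^ g[n - a] for a in range(1, (n + 1) // 2)}
        value = 0
        while value in options:
            value += 1
        g[n] = value
    return g


print(grundys_game_values(15))
```

Computing more terms is a matter of raising the argument, although no finite computation of this kind settles the periodicity question.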
Nim arithmetic and Nim algebra
As noted, nimbers can be added by performing an XOR, denoted ⊕, on their binary strings.
One readily sees that the operation ⊕ is associative and commutative, that the set of
nimbers of the same length is closed under ⊕, that the all-zeros string is an additive identity,
and that a nimber is its own additive inverse. In short, the n-bit nimbers $\{0, 1, \ldots, 2^n - 1\}$
form a group under nimber addition. Here are those groups of orders 2, 4, and 8:
⊕  0 1
0  0 1
1  1 0

⊕  0 1 2 3
0  0 1 2 3
1  1 0 3 2
2  2 3 0 1
3  3 2 1 0

⊕  0 1 2 3 4 5 6 7
0  0 1 2 3 4 5 6 7
1  1 0 3 2 5 4 7 6
2  2 3 0 1 6 7 4 5
3  3 2 1 0 7 6 5 4
4  4 5 6 7 0 1 2 3
5  5 4 7 6 1 0 3 2
6  6 7 4 5 2 3 0 1
7  7 6 5 4 3 2 1 0
Notice that these groups are nested.
It turns out that there is a way to multiply nimbers that is associative and commutative,
distributes over nim addition, and has an identity element (namely, 1). John Conway
[34] defined nimbers as ordinal numbers and described nim multiplication in terms of nim
addition and mex. Now, to present this description would lead us too far afield; fortunately,
Conway also gave the following simplification:
For n a nonnegative integer, the nimbers $2^{2^n}$ are the so-called Fermat powers, the
first five being $2 = 2^{2^0}$, $4 = 2^{2^1}$, $16 = 2^{2^2}$, $256 = 2^{2^3}$, and $65536 = 2^{2^4}$. Let us denote the
nim product by ⊗ and let · be the usual integer product.
If k is a nonnegative integer and $k < 2^{2^n}$, define $k \otimes 2^{2^n}$ to be $k \cdot 2^{2^n}$. Thus, 4 ⊗ 3 = 4 · 3 = 12
and 16 ⊗ 13 = 16 · 13 = 208.
Next, define $2^{2^n} \otimes 2^{2^n}$ to be $\frac{3}{2} \cdot 2^{2^n} = 3 \cdot 2^{2^n - 1}$. For example, $256 \otimes 256 = 2^{2^3} \otimes 2^{2^3} = \frac{3}{2} \cdot 2^{2^3} = 384$.
To multiply nimbers in general, express the factors as sums of powers of 2, use the
Fermat powers when possible, and take advantage of the distributive law. For example, 8
is not a Fermat power, but 8 = 2 ⊗ 4 and so
8 ⊗ 8 = (2 ⊗ 4) ⊗ (2 ⊗ 4)
      = (2 ⊗ 2) ⊗ (4 ⊗ 4) . . . rearrange factors
      = 3 ⊗ 6 . . . squaring Fermat powers
      = 3 ⊗ (2 ⊕ 4) . . . using a nim sum
      = (3 ⊗ 2) ⊕ (3 ⊗ 4) = 1 ⊕ 12 . . . distributive law
      = 13.
Hence, 8 ⊗ 8 = 13.
To find 7 ⊗ 11, write 7 = 3 ⊕ 4 and 11 = 3 ⊕ 8, use both the distributive law and
previous computations. It is a surprising fact that 7 ⊗ 11 = 1 – that is, 7 and 11 are
nim-multiplicative inverses!
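The rules above can be turned into a short recursive procedure. The sketch below is one common way to organize them and is not taken from the text: split each factor at the largest Fermat power F not exceeding the larger factor, expand with the distributive law, and replace F ⊗ F by F ⊕ F/2. The function names `nim_mul` and `nim_pow` are illustrative.

```python
def nim_mul(a: int, b: int) -> int:
    """Nim product of two nimbers, via the Fermat-power rules described above."""
    if a < 2 or b < 2:
        return a * b                      # 0 and 1 behave as usual
    # Largest Fermat power F = 2**(2**n) with F <= max(a, b) < F*F.
    m = 1
    while (1 << (2 * m)) <= max(a, b):
        m *= 2
    F = 1 << m
    a1, a0 = divmod(a, F)                 # a = (a1 ⊗ F) ⊕ a0 with a0, a1 < F
    b1, b0 = divmod(b, F)
    high = nim_mul(a1, b1)                # coefficient of F ⊗ F
    cross = nim_mul(a1, b0) ^ nim_mul(a0, b1)
    low = nim_mul(a0, b0)
    # F ⊗ F = F ⊕ F/2, and k ⊗ F = k * F whenever k < F.
    return ((high ^ cross) * F) ^ nim_mul(high, F >> 1) ^ low


def nim_pow(x: int, e: int) -> int:
    """x nim-multiplied by itself e times (e >= 0)."""
    result = 1
    for _ in range(e):
        result = nim_mul(result, x)
    return result


if __name__ == "__main__":
    print(nim_mul(8, 8), nim_mul(7, 11), nim_mul(256, 256))
```

Running it reproduces the worked examples: 8 ⊗ 8 = 13, 7 ⊗ 11 = 1, and 256 ⊗ 256 = 384.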
It is far from obvious, but true, that under ⊕ and ⊗, the set $\{0, 1, \ldots, 2^{2^n} - 1\}$ of nimbers
less than the Fermat power $2^{2^n}$ forms a field. Furthermore, these fields are nested and their
union is also a field. Figure 5 shows the nim multiplication table for the 16 (= $2^{2^2}$)-element
field as well as the table for the 4-element subfield (the rows and columns labelled 0 through 3).
In his book “On Numbers and Games” [34], commonly known as ONAG, Conway gives
a proof that the set N of natural numbers, under nim addition and nim multiplication, is
isomorphic to the quadratic closure $\overline{F_2}$ of the two-element field $F_2$. Lenstra’s monograph
“Nimber Arithmetic” [106, p. 13ff] gives the following explicit construction of $\overline{F_2}$. He describes
the quadratic closure of $F_2$ as the nested set of fields $F_2(x_0, x_1, \ldots)$, where $x_0 = 1$ and for
i > 0 the $x_i$ satisfy the equations
$$x_i^2 + x_i + \prod_{j<i} x_j = 0.$$
Thus $x_1$ is a root of $x^2 + x + 1 = 0$ over $F_2$, and so $F_2(x_0, x_1) = F_2(x_1) = \{0, 1, x_1, x_1 + 1\}$
is a $2^2 = 4$-element field, being a quadratic extension of $F_2$. Similarly, $x_2$ is a root of
$x^2 + x + 1 \cdot x_1 = 0$ over $F_2(x_1)$ and so $F_2(x_0, x_1, x_2)$ is a quadratic extension of $F_2(x_1)$; thus, it
is a 16-element field containing $F_2$.
Let us denote the $2^{2^n}$-element field of nimbers by $F_{2^{2^n}}$. By the above construction,
these fields are nested:
$$F_2 \subseteq F_4 \subseteq F_{16} \subseteq F_{256} \subseteq F_{65536} \subseteq \cdots.$$
It is interesting to note that the map $x_1 \to 2$ induces an isomorphism between the field
$\{F_2(x_0, x_1), +, \times\}$ under mod-2 arithmetic and the field $\{F_4, \oplus, \otimes\}$ under nim arithmetic.
  ⊗ |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
----+------------------------------------------------
  0 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  1 |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
  2 |  0  2  3  1  8 10 11  9 12 14 15 13  4  6  7  5
  3 |  0  3  1  2 12 15 13 14  4  7  5  6  8 11  9 10
  4 |  0  4  8 12  6  2 14 10 11 15  3  7 13  9  5  1
  5 |  0  5 10 15  2  7  8 13  3  6  9 12  1  4 11 14
  6 |  0  6 11 13 14  8  5  3  7  1 12 10  9 15  2  4
  7 |  0  7  9 14 10 13  3  4 15  8  6  1  5  2 12 11
  8 |  0  8 12  4 11  3  7 15 13  5  1  9  6 14 10  2
  9 |  0  9 14  7 15  6  1  8  5 12 11  2 10  3  4 13
 10 |  0 10 15  5  3  9 12  6  1 11 14  4  2  8 13  7
 11 |  0 11 13  6  7 12 10  1  9  2  4 15 14  5  3  8
 12 |  0 12  4  8 13  1  9  5  6 10  2 14 11  7 15  3
 13 |  0 13  6 11  9  4 15  2 14  3  8  5  7 10  1 12
 14 |  0 14  7  9  5 11  2 12 10  4 13  3 15  1  8  6
 15 |  0 15  5 10  1 14  4 11  2 13  7  8  3 12  6  9
Figure 5: Nim multiplication table for the 16-element field of nimbers.
It is a classical result that the nonzero elements of a finite field F form a multiplicative
cyclic group, denoted $F^\times$. A generator for this group is $x_1$ for the field
$F_2(x_0, x_1)$ and $2 = 2^{2^0}$ for the 4-element field of nimbers.
Similarly, the map $x_2 \to 4$ induces an isomorphism between the field $\{F_2(x_0, x_1, x_2), +, \times\}$
under mod-2 arithmetic and the field $\{F_{16}, \oplus, \otimes\}$ under nim arithmetic. The cyclic groups
$F_2(x_0, x_1, x_2)^\times$ and $F_{16}^\times$ are generated by $x_2$ and $4 (= 2^{2^1})$, respectively, and the powers
of 4 in $F_{16}^\times$ are
{4, 6, 14, 5, 2, 8, 11, 7, 10, 3, 12, 13, 9, 15, 1},
in that order.
These two cases might lead one to conjecture that $2^{2^n}$ is a generator for the multiplicative
group of nimbers $F^\times_{2^{2^{n+1}}}$ for n an arbitrary nonnegative integer. This is another case of
the Strong Law of Small Numbers, for $16 = 2^{2^2}$ does not generate the multiplicative group
$F_{256}^\times$ of $F_{256}$. For, $F_{256}^\times$ has order 255 = 3 · 5 · 17, and a bit of calculation reveals that
$16^{17} = 8$ (all powers here being nim powers). We see that $8 = 4^6$ has order 5 in $F_{16}^\times$, and so
$16^{85} = 8^5 = 1$. Hence 16 generates a subgroup of $F_{256}^\times$ of order 85 and so cannot be a
generator of that multiplicative group. However, $32^{17} = 2^{17} \otimes 16^{17} = 3 \otimes 8 = 4$, and 4 has
order 15, so 32 is a generator of $F_{256}^\times$.
In fact, it is not hard to prove that $2^{2^n}$ generates $F^\times_{2^{2^{n+1}}}$ if and only if n = 0 or n = 1.
Two other curious facts about nim multiplication [36, p. 292], in which the outer exponents
denote repeated nim multiplication, are
1. $2^2 = 3$, $4^4 = 5$, $16^{16} = 17$, and in general $(2^{2^n})^{2^{2^n}} = 2^{2^n} + 1$, and
2. $2^3 = 1$, $4^5 = 2$, $16^{17} = 8$, and in general $(2^{2^n})^{2^{2^n}+1} = 2^{2^n - 1}$.
You can find more general information about ordinals, and particular information about
these remarkable fields, in “On Numbers And Games” (ONAG), Conway’s equally remarkable book [34].
Now, nim addition and nimbers have their roots in Bouton’s analysis of Nim, Wythoff’s
game, and the Sprague-Grundy theorem: they are an essential part of the very fiber of
combinatorial game theory. We have seen that Nim multiplication has its origins in ONAG,
and so we might suspect that there is a combinatorial game-theoretic interpretation of that
most unusual operation.
Our suspicions are correct: in Section 15 we meet the so-called coin-turning games of
Mock Turtles, Moebius, and Mogul, and we learn how nim multiplication plays a part in a
certain class of games called – unsurprisingly – product games.
Finally, the fields of nimbers have an interesting connection with yet another combinatorial structure called a difference set, which we preview here.
Let G be a v-element group written additively. A (v, k, λ) difference set in G is a k-element subset D of G such that every nonzero element of G can be written in the form
a − b for a, b ∈ D in exactly λ ways. If G is cyclic, abelian, or nonabelian, then so is
the difference set. For example, if G = Z7 , the integers mod 7, then {1, 2, 4} is a (7, 3, 1)
difference set in that group, because {1 − 2, 1 − 4, 2 − 4, 2 − 1, 4 − 1, 4 − 2} ≡ {6, 4, 5, 1, 3, 2}
mod 7.
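The definition is easy to check by machine for cyclic groups. The following is a small sketch for G = Z_v only; the function names are ours and purely illustrative.

```python
from collections import Counter


def difference_counts(D, v):
    """Count how often each residue mod v arises as a - b with a != b in D."""
    return Counter((a - b) % v for a in D for b in D if a != b)


def is_cyclic_difference_set(D, v, lam):
    """True if D is a (v, len(D), lam) difference set in the integers mod v."""
    counts = difference_counts(D, v)
    return all(counts[g] == lam for g in range(1, v))


if __name__ == "__main__":
    print(is_cyclic_difference_set({1, 2, 4}, 7, 1))
    print(is_cyclic_difference_set({1, 2, 3}, 7, 1))
```

The first call confirms the (7, 3, 1) example above; the second shows that an arbitrary 3-element subset need not work, since 1 arises twice as a difference of elements of {1, 2, 3}.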
It turns out that the additive group of the nimber field F16 contains a (16, 6, 2) difference
set. In Section 7 we learn more about these structures – and about how this particular
difference set was found.
3 Sequences, II
This section is in the file called sequences2.tex.
In this section, we dig a little deeper into the Fibonacci numbers, find them hiding in
Pascal’s Triangle, reimagine Pascal’s Triangle in an interesting way, and explore a sequence
associated with the central column of Pascal’s Triangle: namely, the Catalan numbers.
Fibonacci numbers
It was in 1202 with the appearance of his Liber abaci (the Book of Calculations) that
Leonardo of Pisa (∼1170–?1240), popularly known as Fibonacci, introduced the Western
world to the sequence of numbers that bear his name. As is now well known, the Fibonacci
numbers are defined by the recurrence $f_1 = 1$, $f_2 = 1$, and $f_{n+1} = f_n + f_{n-1}$ for n ≥ 2, and
the famous problem – concerning some rabbits with dubious breeding habits – appears in
Chapter 12, Part 7, Problem 18:
“How many pairs of rabbits can be produced from a single pair in one year if it
is assumed that every month each pair begets a new pair which from the second
month becomes productive?”
Let $f_k$ be the number of pairs in month k, and let $p_k$ be the number of productive pairs in
month k. The pairs that are productive in month k are exactly those that were already alive
in month k − 1, so $p_k = f_{k-1}$. Applying the assumption, we see that $f_1 = f_2 = 1$, and
$f_{k+1} = f_k + p_k = f_k + f_{k-1}$ for k ≥ 2. For the first year, this yields the sequence
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, and 144. Hence there are 144 pairs at the end of twelve months.
A biologically more realistic example that produces the Fibonacci numbers concerns the genealogy of the male bee:
A male bee comes from an unfertilized egg, so he has only a female parent; the female
is produced by a fertilized egg, so she has both male and female parents. How many
grandparents, great-grandparents, etc. does the male bee have? Let $b_n$, $m_n$, and $f_n$ be the
number of bees, males, and females n generations back. Then $m_n = f_{n-1}$ and $f_n = b_{n-1}$
and we see that
$$b_{n+1} = f_{n+1} + m_{n+1} = b_n + f_n = b_n + b_{n-1},$$
as claimed.
The rabbits problem is, of course, well known. Not quite so well known, alas, is the fact
that in the sixth century CE, the Indian mathematician Virahanka defined $s_n$ to be the
number of n-beat strings made up of long (two beats, written −) and short (one beat,
written ◦) syllables:

number of beats   arrangements                     # arrangements
      1           ◦                                      1
      2           ◦◦, −                                  2
      3           ◦◦◦, ◦−, −◦                            3
      4           ◦◦◦◦, ◦◦−, ◦−◦, −◦◦, −−                5

Thus, $s_n = f_{n+1}$.
Another way to look at this example is that $f_{n+1}$ is the number of ways of packing
dominoes in a 2 × n box:
[Figure: the domino packings of successive boxes, with counts 1, 1, 2, 3, 5.]
Yet a third way is that $f_{n+1}$ is the number of ways of paving a strip n feet long using
one-foot and two-foot bricks.
These examples are only three of the large number of counting problems in which the
Fibonacci numbers appear. It would seem useful to have a “nice” formula for the nth
Fibonacci number $f_n$. It happens that Binet’s formula, namely
$$f_n = \frac{1}{\sqrt{5}}\left[\left(\frac{\sqrt{5}+1}{2}\right)^{n} - \left(\frac{-\sqrt{5}+1}{2}\right)^{n}\right],$$
does the trick. The presence of the golden section $\phi = (1 + \sqrt{5})/2$ in the formula gives
us some confidence in this formula. Indeed, the two roots of the quadratic polynomial
$x^2 - x - 1$ are $\phi$ and $(1 - \sqrt{5})/2 = -\phi^{-1}$, so we rewrite the above formula as
$$f_n = \frac{\phi^n - (-\phi^{-1})^n}{\sqrt{5}}.$$
If we can prove that the expression on the right satisfies both the recurrence for fn and
the initial conditions, then we have shown that the two expressions are equal for all n ≥ 1.
Such a proof by induction is easily accomplished. Two short calculations show that for
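n = 1 and 2, both sides are equal to 1, and, because both $\phi$ and $-\phi^{-1}$ satisfy $x^2 = x + 1$,
some algebraic manipulation shows that
$$\frac{\phi^n - (-\phi^{-1})^n}{\sqrt{5}} + \frac{\phi^{n+1} - (-\phi^{-1})^{n+1}}{\sqrt{5}} = \frac{\phi^{n+2} - (-\phi^{-1})^{n+2}}{\sqrt{5}}.$$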
This proves our claim.
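As a numerical sanity check (not a substitute for the induction), one can compare the closed form with the recurrence. The sketch below uses ordinary floating-point arithmetic, so it is only trusted for moderate n; the function names are illustrative.

```python
import math


def fib_recurrence(n: int) -> int:
    """f_n from f_1 = f_2 = 1 and f_{n+1} = f_n + f_{n-1}."""
    a, b = 0, 1            # (f_0, f_1), taking f_0 = 0 as a convenience
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_binet(n: int) -> float:
    """Binet's closed form, evaluated in floating point."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    return (phi ** n - (-1 / phi) ** n) / sqrt5


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, fib_recurrence(n), round(fib_binet(n)))
```

For the first twelve values both columns agree with the rabbit sequence 1, 1, 2, ..., 144.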
J. P. M. Binet is credited with finding this formula in 1843, and as is common in the
history of mathematical names, he was not the first: Donald Knuth points to Abraham de
Moivre from 1730 as the source.
Here is another Fibonacci curiosity. If you perform the Euclidean GCD algorithm on
consecutive Fibonacci numbers, the partial quotients are all ones. Furthermore, the golden
section φ satisfies the equation $\phi^2 = \phi + 1$. Divide both sides by φ, continue with the
substitution $\phi = 1 + \frac{1}{\phi}$ and here’s what happens:
$$\phi = 1 + \frac{1}{\phi} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}.$$
In short, the continued fraction expansion of φ consists of all ones.
Martin Gardner [59, p.21] points out another connection between the Beatty sequences
and the Fibonacci numbers, as follows.
1. Begin with the first Beatty sequence 1, 3, 4, 6, 8, 9, 11, 12, 14, 16, 17, 19, 21, . . ..
2. Construct the sequence of differences 2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 2, 2, . . ..
3. Replace each 2 by a 1 and each 1 by a 0, remove the commas, and place a binary
point to the left of the first 1. Call the resulting number α: thus, in base 2, α = .101101011011 . . . .
4. The exponents on the powers of two in the continued fraction expansion
$$\alpha = \cfrac{1}{1 + \cfrac{1}{2^1 + \cfrac{1}{2^1 + \cfrac{1}{2^2 + \cfrac{1}{2^3 + \cfrac{1}{2^5 + \cfrac{1}{2^8 + \cfrac{1}{2^{13} + \cdots}}}}}}}}$$
of α are the Fibonacci numbers!
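The four steps can be carried out exactly with rational arithmetic. The sketch below makes two assumptions that are ours: the Beatty sequence is taken to be ⌊kφ⌋, computed with integer square roots, and α is truncated to a fixed number of binary digits, so only the first several partial quotients are trustworthy. Function names are illustrative.

```python
from fractions import Fraction
from math import isqrt


def beatty(k: int) -> int:
    """floor(k * phi) for the golden section phi, using integers only."""
    return (k + isqrt(5 * k * k)) // 2


def alpha_truncated(bits: int) -> Fraction:
    """The first `bits` binary digits of alpha, as an exact fraction."""
    seq = [beatty(k) for k in range(1, bits + 2)]
    diffs = [b - a for a, b in zip(seq, seq[1:])]      # each difference is 1 or 2
    digits = [d - 1 for d in diffs]                    # 2 -> 1 and 1 -> 0
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(digits))


def continued_fraction(x: Fraction, terms: int) -> list[int]:
    """The first `terms` partial quotients of the rational number x."""
    out = []
    for _ in range(terms):
        a = x.numerator // x.denominator
        out.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return out


if __name__ == "__main__":
    print(continued_fraction(alpha_truncated(200), 9))
```

With 200 binary digits the leading partial quotients are 0, 1, 2, 2, 4, 8, 32, 256, 8192, that is, 2 raised to the powers 0, 1, 1, 2, 3, 5, 8, 13.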
Finally, the Fibonacci numbers have taken up residence inside Pascal’s triangle. Let’s
see how.
The triangle of Pingala/Al Karaji/Omar Khayyam/Pascal
Pascal’s triangle is one of the most familiar objects in combinatorics, so let’s begin with
the standard picture:
                        1
                      1   1
                    1   2   1
                  1   3   3   1
                1   4   6   4   1
              1   5  10  10   5   1
            1   6  15  20  15   6   1
          1   7  21  35  35  21   7   1
        1   8  28  56  70  56  28   8   1

We learn, as early as middle school, that the kth entry in row n is the binomial coefficient
$\binom{n}{k}$ for 0 ≤ k ≤ n and n = 0, 1, 2, . . ..
Where Fibonacci and Pascal come together is that the diagonals in Pascal’s triangle
sum to the Fibonacci numbers, as you can see by tipping your head on one side:
row sum   entries of the tilted row
    1      1
    1      1
    2      1   1
    3      1   2
    5      1   3   1
    8      1   4   3
   13      1   5   6   1
   21      1   6  10   4
   34      1   7  15  10   1
   55      1   8  21  20   5
   89      1   9  28  35  15   1
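In symbols, the nth tilted row collects the entries $\binom{n-k}{k}$ for k = 0, 1, . . . , ⌊n/2⌋. A quick check of the row sums, using Python's standard `math.comb` (the function name `diagonal_sum` is ours):

```python
from math import comb


def diagonal_sum(n: int) -> int:
    """Sum of the n-th tilted row of Pascal's triangle: C(n-k, k) over k."""
    return sum(comb(n - k, k) for k in range(n // 2 + 1))


if __name__ == "__main__":
    print([diagonal_sum(n) for n in range(11)])
```

The printed list is 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, matching the row sums shown.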
But there is more to The Triangle than meets the eye, especially if we replace the nth
row of The Triangle by the n + 1 terms we obtain by expanding $(a + b)^n$, and then tilt
The General Triangle as we did above:

1
a
a^2      b
a^3      2ab
a^4      3a^2 b     b^2
a^5      4a^3 b     3a b^2
a^6      5a^4 b     6a^2 b^2     b^3
a^7      6a^5 b     10a^3 b^2    4a b^3
a^8      7a^6 b     15a^4 b^2    10a^2 b^3    b^4
a^9      8a^7 b     21a^5 b^2    20a^3 b^3    5a b^4
a^10     9a^8 b     28a^6 b^2    35a^4 b^3    15a^2 b^4    b^5
The resulting row sums for various choices of a and b are quite interesting; let’s look at
some.
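For experimenting with particular values of a and b, note that the nth tilted row sum is $\sum_k \binom{n-k}{k} a^{n-2k} b^k$. The sketch below evaluates it for integer a and b; the function name `tilted_row_sum` is ours.

```python
from math import comb


def tilted_row_sum(n: int, a: int, b: int) -> int:
    """Sum of the n-th tilted row of the General Triangle for given a and b."""
    return sum(comb(n - k, k) * a ** (n - 2 * k) * b ** k
               for k in range(n // 2 + 1))


if __name__ == "__main__":
    for a, b in [(2, -1), (1, 1), (3, -2), (1, 2), (2, 1)]:
        print((a, b), [tilted_row_sum(n, a, b) for n in range(9)])
```

These five choices of (a, b) are the ones examined in this section.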
1. a = 2, b = −1 gives the natural numbers
1, 2, 3, 4, 5, 6, 7, 8, . . . ,
both the foundation of number theory and the objects of its study.
2. a = 1, b = 1 gives the Fibonacci numbers
1, 1, 2, 3, 5, 8, 13, 21, 34, . . . ,
which we have already met and discussed, and concerning which entire books have
been written and much ink and toner has been spilled.
3. a = 3, b = −2 gives the Mersenne numbers $M_n = 2^n - 1$:
1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, . . . .
It is straightforward to prove that if $M_n$ is prime, then so is n. The converse, however,
is false: 11 is prime, but $M_{11} = 2047 = 23 \cdot 89$ is composite. As of September 2016,
49 Mersenne numbers are known to be primes; the first fourteen of these are $M_p$ for
p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, and 607.
The Lucas-Lehmer test for primality of $M_p$, where p is an odd prime, is as follows.
Construct the sequence $s_k$, where $s_0 = 4$ and $s_k = s_{k-1}^2 - 2 \bmod M_p$ for k ≥ 1. Then
$M_p$ is prime if and only if $s_{p-2} \equiv 0 \bmod M_p$.
The proof of this surprisingly useful theorem uses Fermat’s Little Theorem, the Binomial Theorem in a finite field, some facts about quadratic residues, and the intriguing
fact that
$$s_k = (2 + \sqrt{3})^{2^k} + (2 - \sqrt{3})^{2^k}.$$
The theorem, which is not at all obvious, turns out to be a special case of the Lucas
test for primality of arbitrary integers, and it is the test used by followers of the Great
Internet Mersenne Prime Search to find Mersenne primes. It is generally believed that
there are infinitely many Mersenne numbers that are prime, but no one has been able
to prove this.
4. a = 1, b = 2 gives the Jacobsthal numbers
0, 1, 1, 3, 5, 11, 21, 43, 85, 171, 341, . . .
which, apart from the zeroth, are all odd. In fact
$$J_{n+1} = 2J_n + (-1)^n,$$
and among other things, they are also the number of ways of tiling a 3 × (n − 1)
rectangle with 1 × 1 and 2 × 2 square tiles:
[Figure: the tilings of the first few rectangles, with counts 1, 1, 3, 5.]
5. a = 2, b = 1 gives the Brahmagupta-Pell numbers
1, 2, 5, 12, 29, 70, 169, 408, 985, . . .
which were probably known to the Babylonians some four millennia ago. These are
the denominators of the best approximations
$$\frac{1}{1}, \frac{3}{2}, \frac{7}{5}, \frac{17}{12}, \frac{41}{29}, \frac{99}{70}, \frac{239}{169}, \frac{577}{408}, \frac{1393}{985}, \ldots$$
to $\sqrt{2}$; the word “best” has a specific meaning we need not explore here. The name
joins perhaps the greatest Indian mathematician of the first millennium CE with one
of the most obscure mathematicians of the seventeenth century. The numerators $t_n$
and denominators $u_n$ of these approximations satisfy the recurrence $x_{n+1} = 2x_n + x_{n-1}$
with initial values $t_1 = 1$, $t_2 = 3$ and $u_1 = 1$, $u_2 = 2$ respectively, and they are also
solutions to the so-called Pell Equation
$$t^2 - 2u^2 = \pm 1,$$
the sign being $(-1)^n$ for the pair $(t_n, u_n)$.
Brahmagupta made an extensive study of the equations of the form $t^2 - Du^2 = 1$,
where D is a positive nonsquare integer. About a thousand years later, so did Fermat,
and by rights the equation should be named for Fermat or for Brahmagupta and
Fermat. However, Euler attributed the equation to the obscure Englishman John
Pell (1611-1685), who wrote a textbook containing the equation – one of the few
mathematical mistakes Euler ever made.
Tablet No. 7289 from the Yale Collection of Babylonian cuneiform includes the following
three approximations to $\sqrt{2}$ written to three sexagesimal places, to which we have
appended their equivalent expressions as rational numbers:
1; 30 = 3/2
1; 25 = 17/12
1; 24, 51, 10, . . . = 577/408.
The mathematics behind the inscription is the recognition that if x and r are positive
real numbers, then $\frac{1}{2}\left(x + \frac{r}{x}\right)$ is closer to $\sqrt{r}$ than either x or r/x. This is, in effect,
three iterations of Newton’s method for approximating $\sqrt{2}$, with the initial value of
x = 1. Newton’s method produces the second, fourth, eighth, . . . , $2^n$th Brahmagupta-Pell
numbers, and this is an expression of the fact that under favorable conditions,
the method converges quadratically.
6. Finally, there is a connection between the two worlds of combinatorics and numerical
analysis via the Pascal Triangle bridge – namely, the Pell Equation on the combinatorial side and the Chebyshev polynomials $T_n(x)$ and $U_n(x)$ of the first and second
kind, respectively, on the numerical side. They are defined by the recurrences
$T_0(x) = 1$, $T_1(x) = x$, and $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$ for n ≥ 1, and
$U_0(x) = 1$, $U_1(x) = 2x$, and $U_{n+1}(x) = 2xU_n(x) - U_{n-1}(x)$ for n ≥ 1.
Although there are no special values of a and b for which tilted Pascal Triangle row
sums yield the $T_n(x)$, the values a = 2x, b = −1 give the polynomials $U_n(x)$, and here
are the first ten relevant row sums for the tilted Pascal Triangle:
$U_0(x) = 1$
$U_1(x) = 2x$
$U_2(x) = 4x^2 - 1$
$U_3(x) = 8x^3 - 4x$
$U_4(x) = 16x^4 - 12x^2 + 1$
$U_5(x) = 32x^5 - 32x^3 + 6x$
$U_6(x) = 64x^6 - 80x^4 + 24x^2 - 1$
$U_7(x) = 128x^7 - 192x^5 + 80x^3 - 8x$
$U_8(x) = 256x^8 - 448x^6 + 240x^4 - 40x^2 + 1$
$U_9(x) = 512x^9 - 1024x^7 + 672x^5 - 160x^3 + 10x$.
The polynomials $\{U_n(x)\}$ satisfy the differential equations
$$(1 - x^2)U_n''(x) - 3xU_n'(x) + n(n + 2)U_n(x) = 0 \text{ with } U_0 = 1 \text{ and } U_1 = 2x.$$
They form an orthogonal family of functions with respect to the weight function
$\sqrt{1 - x^2}$ on the interval [−1, 1]:
$$\int_{-1}^{1} U_n(x)U_k(x)\sqrt{1 - x^2}\,dx = \begin{cases} 0, & \text{if } n \neq k, \\ \pi/2, & \text{if } n = k, \end{cases}$$
and they also satisfy the trigonometric equations $U_n(\cos\theta) = \dfrac{\sin(n + 1)\theta}{\sin\theta}$. The polynomials
$\{T_n(x)\}$ satisfy similar differential equations, they form an orthogonal family
with respect to $\dfrac{1}{\sqrt{1 - x^2}}$, and they satisfy the trigonometric equations $T_n(\cos\theta) = \cos n\theta$.
For |x| ≤ 1, the latter formula is equivalent to the frequently-seen and
mysterious-looking $T_n(x) = \cos(n \arccos x)$.
What about that combinatorics/numerical connection? It comes from the fact that
the Chebyshev polynomials satisfy the Pell equations
$$T_n^2(x) - (x^2 - 1)U_{n-1}^2(x) = 1.$$
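Returning for a moment to the Lucas-Lehmer test described in item 3, the procedure is short enough to run directly. The sketch below assumes p is an odd prime (the test as stated does not cover p = 2) and uses a naive trial-division helper, `is_prime`, that is ours and only meant for small exponents.

```python
from math import isqrt


def lucas_lehmer(p: int) -> bool:
    """For an odd prime p, report whether M_p = 2**p - 1 is prime."""
    m = (1 << p) - 1
    s = 4                               # s_0 = 4
    for _ in range(p - 2):              # compute s_1, ..., s_{p-2} modulo M_p
        s = (s * s - 2) % m
    return s == 0


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small exponents used here."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


if __name__ == "__main__":
    print([p for p in range(3, 608) if is_prime(p) and lucas_lehmer(p)])
```

The exponents printed are the odd ones among the Mersenne prime exponents up to 607; in particular p = 11 is absent, in line with the factorization of 2047 given earlier.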
We have seen that tilting Pascal’s triangle brings us a whole host of interesting sequences. When we look at the triangle mod 2 we find a connection with the world of
fractals and fractal geometry. Here are rows 0 through 63 of the triangle:
Figure 6: Pascal’s triangle mod 2
This view shows that the mod-2 triangle up to row 63 is tiled by three copies of the mod-2
triangle up to row 31, each with area one-quarter of the larger triangle, together with an
empty triangle of the same area. Each of the three nonempty triangles can be tiled in the
same way. The mod-2 Pascal’s triangle turns out to be a fractal – a self-similar geometric
figure composed of patterns that are in turn composed of smaller versions of themselves.
Fractals are pathways into the relatively new areas of topological dynamics and dynamical
systems – although patterns very much like Figure 6 appear in stone decorations of floors
of Roman churches dating back almost a thousand years.
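The self-similar picture is easy to draw in a terminal. The sketch below relies on a standard fact not stated in the text (a consequence of Lucas's theorem): $\binom{n}{k}$ is odd exactly when every binary digit of k is also a binary digit of n, i.e. when `n & k == k`. The function name is illustrative.

```python
def pascal_mod2_row(n: int) -> list[int]:
    """Row n of Pascal's triangle reduced mod 2, via the bit test n & k == k."""
    return [1 if (n & k) == k else 0 for k in range(n + 1)]


if __name__ == "__main__":
    rows = 32
    for n in range(rows):
        cells = "".join("* " if bit else "  " for bit in pascal_mod2_row(n))
        print(" " * (rows - n) + cells)
```

Changing `rows` to 64 reproduces the range of rows described for Figure 6, with rows 32 through 63 consisting of two copies of rows 0 through 31 separated by an empty triangle.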
Returning to the familiar base-ten presentation of Pascal’s triangle, we find that its
central column hides an especially interesting sequence. Let’s take a closer look.
The Catalan numbers and the central column of Pascal’s triangle
The central column of Pascal’s triangle consists of what are known as the central binomial
numbers; the first few are
$$\binom{0}{0}, \binom{2}{1}, \binom{4}{2}, \binom{6}{3}, \binom{8}{4}, \binom{10}{5}, \binom{12}{6}, \binom{14}{7}, \binom{16}{8}$$
and they give us the sequence 1, 2, 6, 20, 70, 252, 924, 3432, 12870, 48620, 184756. We see that
$\binom{2n}{n}$ is divisible by n + 1, and dividing out that factor gives us the sequence
1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796.
These are the Catalan numbers $C_n = \frac{1}{n+1}\binom{2n}{n}$, and a good introduction to them is to
see how they count the number $p_{n+1}$ of ways to group the product $x_1 x_2 \cdots x_{n+1}$ into pairs
using parentheses. To begin with, we see that $p_1 = p_2 = 1$, and $p_3 = 2$. To continue, count
by grouping the product into two subproducts of lengths r and n − r for r = 1, . . . , n − 1.
As there are $p_r$ ways to group an r-fold product and $p_{n-r}$ ways to group an (n − r)-fold
product, we see that
$$p_n = p_1 p_{n-1} + p_2 p_{n-2} + \cdots + p_{n-1} p_1.$$
Thus, $p_4 = p_3 p_1 + p_2 p_2 + p_1 p_3 = 2 \cdot 1 + 1 \cdot 1 + 1 \cdot 2 = 5 = C_3$, $p_5 = 5 \cdot 1 + 2 \cdot 1 + 1 \cdot 2 + 1 \cdot 5 = 14 = C_4$,
and it appears that $p_{n+1} = C_n$ for n ≥ 1.
Here are the fourteen groupings of a five-fold product as the above method counts them:
p1 · p4          p2 · p3          p3 · p2          p4 · p1
a(b(c(de)))      (ab)(c(de))      ((ab)c)(de)      (a(b(cd)))e
a(b((cd)e))      (ab)((cd)e)      (a(bc))(de)      (a((bc)d))e
a((bc)(de))                                        ((ab)(cd))e
a(((bc)d)e)                                        (((ab)c)d)e
a((b(cd))e)                                        ((a(bc))d)e
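The same splitting argument can be used to generate the groupings, not just count them. The following sketch assumes each factor is a single character and writes a subproduct in parentheses whenever it has more than one factor; the names `wrap` and `groupings` are ours.

```python
def wrap(term: str) -> str:
    """Parenthesize a subproduct unless it is a single factor."""
    return term if len(term) == 1 else "(" + term + ")"


def groupings(factors: str) -> list[str]:
    """All ways to group the product of the given single-character factors."""
    if len(factors) == 1:
        return [factors]
    result = []
    for r in range(1, len(factors)):            # left part has r factors
        for left in groupings(factors[:r]):
            for right in groupings(factors[r:]):
                result.append(wrap(left) + wrap(right))
    return result


if __name__ == "__main__":
    for g in groupings("abcde"):
        print(g)
```

For the five-fold product this prints the fourteen groupings in the table, column by column.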
These groupings form a partially ordered set, called the Tamari lattice, in which one
grouping precedes another if the second grouping may be obtained from the first by leftward
applications of the associative law a(bc) = (ab)c. There is a label $p_1 p_2 \ldots p_n$ that describes
each grouping of $a_1 a_2 \ldots a_{n+1}$ in which $p_i$ is the number of left parentheses in the grouping
that immediately precede $a_i$ for 1 ≤ i ≤ n. Thus, the label 2101 describes the grouping
((a(bc))(de)). This partial ordering gives rise to a directed graph called the associahedron of
order n+1. Figure 7 shows the associahedron of order 5 as a planar 3-regular directed graph
that has three quadrilateral faces, six pentagonal faces, fourteen vertices and twenty-one
arcs.
We see that each grouping of an (n + 1)-fold product corresponds to a string of n L’s
(for Left) and n R’s (for Right) in which the number of L’s never falls below the number
of R’s. Replacing each L by a 1 and each R by a -1 yields a sequence $x_1, x_2, \ldots, x_{2n}$ of n
1’s and n -1’s in which
$$x_1 + x_2 + \cdots + x_k \ge 0 \text{ for } 1 \le k \le 2n.$$
Call such sequences good. Thus, the number of good sequences of length 2n is equal to the
number of groupings $p_{n+1}$ of an (n + 1)-fold product, and we now show that
$$p_{n+1} = \frac{1}{n+1}\binom{2n}{n} \text{ for } n \ge 1.$$
In short, the apparent equality of $p_{n+1}$ and $C_n$ is a fact.
It is easier, however, to count the sequences of n 1’s and n -1’s that are bad, i.e. the
sequences for which the above condition fails. The proof hinges on a clever
trick called André’s Reflection Principle that constructs a bijection between the set of bad
sequences of length 2n and the set of sequences of n + 1 1’s and n − 1 -1’s, the number of
the latter being equal to
$$\binom{2n}{n+1} = \frac{(2n)!}{(n + 1)!(n - 1)!}.$$
Let $S = \{x_1, \ldots, x_{2n}\}$ be a bad sequence of n 1’s and n -1’s. Construct a sequence
$S^* = \{y_1, \ldots, y_{2n}\}$ of n + 1 1’s and n − 1 -1’s as follows. S
is bad, so there is a smallest index i such that $\{x_1, \ldots, x_i\}$ has more -1’s than 1’s. The
sequence $\{x_1, \ldots, x_{i-1}\}$ must have an equal number of 1’s and -1’s, so i = 2j + 1 must
be odd and the first 2j + 1 elements of the sequence consist of j 1’s and j + 1 -1’s. Let
$S^* = \{y_1, \ldots, y_{2n}\}$, where $y_k = -x_k$ or $x_k$ according as k ≤ 2j + 1 or not. We see that $S^*$
is a sequence of n + 1 1’s and n − 1 -1’s.
To reverse the process, locate the least index i in a sequence $\{y_1, \ldots, y_{2n}\}$ of n + 1 1’s
and n − 1 -1’s for which $\{y_1, \ldots, y_i\}$ has more 1’s than -1’s. Changing the signs of $y_1, \ldots, y_i$ and
leaving the rest as they are will result in a sequence of n 1’s and n -1’s such that the first
i elements contain one more -1 than 1. Hence, the resulting sequence is bad.

[Drawing of the associahedron: a graph whose fourteen vertices carry the labels
3001, 2011, 1111, 1201, 2101, 1300, 1120, 2020, 3100, 1210, 2200, 4000, 2110, 3010.]

label   grouping             label   grouping
1111    (a(b(c(de))))        2101    ((a(bc))(de))
1120    (a(b((cd)e)))        2110    ((a(b(cd)))e)
1201    (a((bc)(de)))        2200    ((a((bc)d))e)
1210    (a((b(cd))e))        3001    (((ab)c)(de))
1300    (a(((bc)d)e))        3010    (((ab)(cd))e)
2011    ((ab)(c(de)))        3100    (((a(bc))d)e)
2020    ((ab)((cd)e))        4000    ((((ab)c)d)e)

Figure 7: The associahedron of order 5 and the meanings of the labels.
Since every sequence of n 1’s and n -1’s is either good or bad, and no such sequence is
both, the good and bad sequences partition the set of $\binom{2n}{n}$ sequences of n 1’s and n -1’s.
Finally, since $\frac{1}{n} - \frac{1}{n+1} = \frac{1}{(n+1)n}$, we see that
$$\binom{2n}{n} - \binom{2n}{n+1} = \frac{(2n)!}{n!(n-1)!\,n} - \frac{(2n)!}{(n+1)\,n!(n-1)!} = \frac{(2n)!}{(n+1)\,n!\,n!} = \frac{1}{n+1}\binom{2n}{n},$$
as claimed.
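For small n the reflection argument can be confirmed by brute force. The sketch below enumerates all ±1 sequences with a given number of 1's, applies the sign flip up to the first index where the partial sum turns negative, and checks that the bad sequences are carried bijectively onto the sequences with n + 1 ones. All function names are ours.

```python
from itertools import accumulate, combinations
from math import comb


def sequences(length: int, ones: int):
    """All sequences of +1/-1 of the given length with exactly `ones` entries +1."""
    for positions in combinations(range(length), ones):
        chosen = set(positions)
        yield tuple(1 if i in chosen else -1 for i in range(length))


def is_good(seq) -> bool:
    """True if every partial sum is nonnegative."""
    return all(total >= 0 for total in accumulate(seq))


def reflect(seq):
    """Flip signs up to and including the first index with a negative partial sum."""
    total = 0
    for i, x in enumerate(seq):
        total += x
        if total < 0:
            return tuple(-y for y in seq[: i + 1]) + tuple(seq[i + 1 :])
    raise ValueError("sequence is good; there is nothing to reflect")


if __name__ == "__main__":
    for n in range(1, 7):
        bad = [s for s in sequences(2 * n, n) if not is_good(s)]
        images = {reflect(s) for s in bad}
        print(n, len(bad), len(images), comb(2 * n, n + 1), comb(2 * n, n) - len(bad))
```

In each printed row the second, third, and fourth numbers agree, and the last number, the count of good sequences, is the Catalan number $C_n$.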
It follows that there are three distinct ways to define the Catalan numbers, namely
$C_n = \dfrac{1}{n+1}\dbinom{2n}{n}$ – as a quotient;
$C_n = \dbinom{2n}{n} - \dbinom{2n}{n+1}$ – as a difference; and
$C_n = \displaystyle\sum_{k=1}^{n} C_{k-1} C_{n-k}$ with $C_0 = 1$ – as a convolution.
The next appearance of the Catalan numbers is concerned with counting paths. But
first, some terminology. Let G be a graph, which could be infinite – for example, the
integer lattice in the plane. A walk in G is a sequence $W = (v_0, e_1, v_1, e_2, \ldots, v_{n-1}, e_n, v_n)$,
where the $v_i$ are vertices (points) of G and $e_i$ is an edge joining $v_{i-1}$ and $v_i$. The vertices
$v_1, \ldots, v_{n-1}$ are the internal vertices, and $v_0$ is the origin. W is closed if $v_0 = v_n$. A
trail is a walk with no repeated edge, a path is a walk with no repeated vertex, a circuit
is a closed trail, and a cycle is a circuit with distinct origin and internal vertices. An
n-cycle is a cycle with n vertices (and n edges), and a cycle is odd or even according as it
has an odd or even number of vertices, respectively.
A Dyck path of order n is a path in the plane from (0, 0) to (n, n) that goes only up
and to the right, turns only at lattice points, and satisfies x ≤ y for each point (x, y) on
the path. Such a path consists of 2n steps of length 1; there are n steps up, n steps to the
right, and at no time does the number of steps to the right exceed the number of steps up.
If we label steps with 1 (for “up”) and −1 for “right”, we may view a Dyck path of order
n as a good string of n 1’s and n -1’s. Thus, the number of Dyck paths of order n is equal
to Cn , the nth Catalan number.
We may also label Dyck paths of order n as we did groupings of (n + 1)-fold products,
with a label $p_1 p_2 \ldots p_n$ in which $p_i$ is the number of steps up along the vertical line x = i − 1.
Figure 8 tells the story for Dyck paths of order 3:
[Drawings of the five paths, labelled 111, 120, 210, 201, 300.]
Figure 8: The five Dyck paths from (0, 0) to (3, 3) and their labels
Finally, we can count the number $D_n$ of Dyck paths of order n according to the least
positive k such that the path contains the point (k, k) – the so-called first return. A path P
with first return (k, k) is in two sections. The section from (0, 0) to (k, k) rises to (0, 1) and
does not go below the line y = x + 1 until it returns to (k, k), hence there are $D_{k-1}$ such
sections. The section from (k, k) to (n, n) is counted by $D_{n-k}$. Hence we see that
$$D_n = \sum_{k=1}^{n} D_{k-1} D_{n-k},$$
which is precisely the convolution recurrence for the Catalan numbers. Since $D_0 = 1 = C_0$,
we have another proof that $D_n = C_n$.
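Both the labelling and the first-return count can be checked by listing the paths. The sketch below encodes a Dyck path as a string of U (up) and R (right) steps; this encoding and the function names are ours.

```python
def dyck_paths(n: int):
    """Yield the Dyck paths of order n as strings of 'U' and 'R' with x <= y throughout."""
    def extend(path, x, y):
        if x == n and y == n:
            yield path
            return
        if y < n:
            yield from extend(path + "U", x, y + 1)
        if x < y:
            yield from extend(path + "R", x + 1, y)
    yield from extend("", 0, 0)


def label(path: str) -> str:
    """p_1 p_2 ... p_n, where p_i counts the up-steps on the line x = i - 1."""
    counts = [0] * (len(path) // 2)
    x = 0
    for step in path:
        if step == "U":
            counts[x] += 1
        else:
            x += 1
    return "".join(str(c) for c in counts)


def first_return(path: str) -> int:
    """Least positive k such that the path passes through (k, k)."""
    x = y = 0
    for step in path:
        if step == "U":
            y += 1
        else:
            x += 1
        if x == y:
            return x
    return 0


if __name__ == "__main__":
    print(sorted(label(p) for p in dyck_paths(3)))
    print([sum(1 for _ in dyck_paths(n)) for n in range(8)])
```

The first line of output lists the five labels of Figure 8 and the second the Catalan numbers 1, 1, 2, 5, 14, 42, 132, 429.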
A regular n-gon can be triangulated (chopped up) into n − 2 triangles by connecting n − 3 pairs of vertices with non-crossing diagonal lines. Thus, an equilateral
triangle needs no diagonals, a square needs one, a regular pentagon needs two, etc. Let
$T_n$ be the number of ways to triangulate an n-gon; some experimentation reveals that
$T_3 = 1$, $T_4 = 2$, $T_5 = 5$, and $T_6 = 14$. Figure 9 shows the five triangulations of a regular
pentagon:
Figure 9: The five triangulations of the regular pentagon
We can count these triangulations in a way that mimics the counting of Dyck paths.
Pick two adjacent vertices $v_1$ and $v_2$ of the n-gon; then the edge $v_1 v_2$ is in exactly one
triangle, so count the triangulations according to the third vertex of that triangle. It turns
out that if we define $T_2 = 1$, then we obtain the recurrence
$$T_n = T_2 T_{n-1} + T_3 T_{n-2} + \cdots + T_{n-1} T_2 = \sum_{k=2}^{n-1} T_k T_{n+1-k}.$$
Comparing this with the Catalan recurrence shows that $T_n = C_{n-2}$.
For those who cannot get enough of the Catalan numbers, Richard Stanley’s 2015 book
[153] catalogs 214 different families of objects counted by them.
Finally, we cannot resist showing you this:
un, un, dos, cinc, catorze, quaranta-dos, cent trenta-dos, quatre-cents vint-i-nou, . . ..
We know that the number $D_n$ of Dyck paths of order n is equal to the nth Catalan
number $C_n$. A related problem is to count the number $w_n$ of different walks of n steps
between lattice points, each in a direction N, S, E or W, starting from the origin and
remaining in the upper half-plane. In the next section, we answer this question and see
how it is connected with many other kinds of walks and some variations of Pascal’s triangle.
4 Catwalks, Sandsteps, and Pascal Pyramids
This section is in the file called catwalkschapter.tex.
We begin with a problem posed by Bill Sands. In [140], he observed that the number of
different walks of n steps between lattice points, each in a direction N, S, E or W, starting
from the origin and remaining in the upper half-plane, is
$$w_n = \binom{2n+1}{n} \qquad (1)$$
and asked for a neat proof.
What is wanted is a simple “choice” argument: any offers? This is sequence A001700
in [149], and RKG’s first attempt at any sort of proof was by induction from the formula
$$w_n = 4w_{n-1} - C_n \qquad (2)$$
since a walk of length n is one more step in one of the four directions N, S, E or W, than a
walk of length n − 1, except that a southerly step is not allowed if the walk of length n − 1
terminated on the x-axis; as we have seen in the previous section, the number $D_n$ of such
walks is the n-th Catalan number $C_n$.
One might think that this result is well known, but that is not true. It doesn’t occur
among the 31 manifestations listed by Kuchinski [98], nor can we immediately see any
simple correspondence between the walks and any of the manifestations. However, first
let’s assume that it’s true, and that (1) holds with n − 1 in place of n. Then
$$\begin{aligned}
4w_{n-1} - C_n &= 4\binom{2n-1}{n-1} - \frac{1}{n+1}\binom{2n}{n} \\
&= \frac{4(2n-1)!}{n!(n-1)!} - \frac{(2n)!}{n!(n+1)!} \\
&= \frac{(2n-1)!}{n!(n+1)!}\{4n(n+1) - 2n\} \\
&= \frac{(2n)!(2n+1)}{n!(n+1)!} = \binom{2n+1}{n}.
\end{aligned}$$
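Formulas (1) and (2) can be checked numerically for small n without listing the walks one by one. The sketch below is a simple dynamic program over the height y of the walk: an E or W step keeps the height (two choices), N raises it, and S lowers it when y > 0. The function name and the choice of return values are ours.

```python
from math import comb


def half_plane_walks(n: int) -> tuple[int, int]:
    """Return (number of n-step walks staying in y >= 0, number of those ending on the x-axis)."""
    counts = [1]                                  # counts[y] = walks currently at height y
    for _ in range(n):
        new = [0] * (len(counts) + 1)
        for y, c in enumerate(counts):
            new[y] += 2 * c                       # step E or W
            new[y + 1] += c                       # step N
            if y > 0:
                new[y - 1] += c                   # step S, allowed only above the axis
        counts = new
    return sum(counts), counts[0]


if __name__ == "__main__":
    for n in range(8):
        total, on_axis = half_plane_walks(n)
        print(n, total, comb(2 * n + 1, n), on_axis)
```

The second and third columns agree, and the last column runs through the Catalan numbers 1, 2, 5, 14, . . . , so the count of walks of length n − 1 that end on the axis is $C_n$, as formula (2) requires.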
What is well known is that the number of walks of 2n steps, each N or E, from (0,0)
to (n, n), which don’t cross the diagonal y = x, or the number of walks of 2n + 2 steps
from (0,0) to (n + 1, n + 1) which stay strictly above the diagonal, is Cn , the n-th Catalan
number. This is clearly the same as the number of walks of 2n steps on the positive x-axis,
starting and finishing at the origin.
Let us look at this one-dimensional analog of the Sands problem. We can exhibit the
numbers of walks, w(n, x), of n unit steps, starting at (0,0) and ending at (x, 0), x ≥ 0, in
a “Pascal semi-triangle” (Figure 10).
 n |  k | Total |    0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
 0 |  0 |     1 |    1
 1 |  0 |     1 |         1
 2 |  1 |     2 |    1         1
 3 |  1 |     3 |         2         1
 4 |  2 |     6 |    2         3         1
 5 |  2 |    10 |         5         4         1
 6 |  3 |    20 |    5         9         5         1
 7 |  3 |    35 |        14        14         6         1
 8 |  4 |    70 |   14        28        20         7         1
 9 |  4 |   126 |        42        48        27         8         1
10 |  5 |   252 |   42        90        75        35         9         1
11 |  5 |   462 |       132       165       110        44        10         1
12 |  6 |   924 |  132       297       275       154        54        11         1
13 |  6 |  1716 |       429       572       429       208        65        12         1
14 |  7 |  3432 |  429      1001      1001       637       273        77        13         1
15 |  7 |  6435 |      1430      2002      1638       910       350        90        14         1
16 |  8 | 12870 | 1430      3432      3640      2548      1260       440       104        15         1
Figure 10: Numbers of walks, w(n, x), on the positive x-axis. Rows are indexed by n and the
numbered columns by x.
Columns x = 0 and x = 1 contain the Catalan numbers, sequence A000108 in [149];
column x = 3 (sequence A002057) also occurs in connexion with partitioning a polygon
[31]. Columns x = 2, 4, 6, 8, 10, 12 are sequences A000245, A000344, A000588, A001392,
A000589, A000590 in [149]: they are Laplace transform coefficients: more precisely, w(2n, 2k)
is denoted in [120] by $C_k$, which is defined by:
$$(2\cos\theta)^{2n} \sin\theta = \sum_{k=0}^{n} C_k \sin(2k + 1)\theta. \qquad (3)$$
Presumably there is an analogous formula for w(2n + 1, 2k + 1); compare equation (11)
below.
The first table in Cayley’s paper [31] is for the number of partitions of an r-gon into
k parts by non-intersecting diagonals. His column k = 1 is our main diagonal, and his
column k = 2 is our third diagonal (starting at (n, x) = (4, 0)). More generally, his
column k is our diagonal starting at (2k, 0), except that his entries contain an extra factor
(x + k − 2)!/(x + 1)!(k − 2)!, a generalized Catalan number: in fact, for x = k − 2 it is Ck−2 .
Cayley attributes his results to Kirkman [94] and to Taylor and Rowe [164]; the latter
paper gives some history, mentioning Terquem, Lamé, Rodrigues, Binet, and Catalan.
A near miss for column x = 9 is Stirling numbers of the second kind, $S_n^{(4)}$, but the
entries (not shown here) for rows 17, 19, and 21 are deficient by 1, 18, and 190 respectively.
We omit zero values of w(n, x) from our table: it’s fairly obvious that w(n, x) = 0 if n
and x are of opposite parity, or if x > n. It’s not too difficult to find formulas for the first
few diagonals:
$$w(n, n) = 1, \quad w(n, n-2) = n - 1, \quad w(n, n-4) = \tfrac{1}{2}n(n-3), \quad w(n, n-6) = \tfrac{1}{6}n(n-1)(n-5).$$
In fact there is a comparatively simple formula for all the entries in Figure 10:
\[ w(n,x) = \binom{n}{r} - \binom{n}{r-1}, \tag{4} \]
where $r = \tfrac{1}{2}(n-x)$. Indeed, the formula (4) also works in the apocryphal cases mentioned above, if we take the reasonable interpretation that $\binom{n}{r} = 0$ if r < 0, or if n < r, or if r is not an integer. We shall do this: note that the usual formulas, such as
\[ \binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1} \tag{5} \]
and (13) still hold in these cases. Formula (4) is easily proved by induction, since
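\[
\begin{aligned}
w(n,x) &= w(n-1,x-1) + w(n-1,x+1) \\
&= \left[\binom{n-1}{r} - \binom{n-1}{r-1}\right] + \left[\binom{n-1}{r-1} - \binom{n-1}{r-2}\right] \\
&= \binom{n}{r} - \binom{n}{r-1}.
\end{aligned}
\]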
The well-known result that we mentioned is the special case
\[ w(2n,0) = \binom{2n}{n} - \binom{2n}{n-1} = \frac{1}{n+1}\binom{2n}{n} = C_n. \]
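The induction above can be checked numerically. The following is a minimal sketch, not part of the original paper: it builds the table of Figure 10 from the recurrence w(n, x) = w(n − 1, x − 1) + w(n − 1, x + 1) and compares every entry with formula (4). The function names and the helper `binom` (which encodes the convention that the binomial coefficient vanishes outside 0 ≤ r ≤ n) are illustrative assumptions.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 unless 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def walks_by_recurrence(n_max):
    """Table w[n][x] built from w(n, x) = w(n-1, x-1) + w(n-1, x+1)."""
    # One spare column so that w[n-1][x+1] is always a valid index.
    w = [[0] * (n_max + 2) for _ in range(n_max + 1)]
    w[0][0] = 1  # the empty walk ends at x = 0
    for n in range(1, n_max + 1):
        for x in range(n + 1):
            from_left = w[n - 1][x - 1] if x >= 1 else 0  # walks may not go below 0
            w[n][x] = from_left + w[n - 1][x + 1]
    return w


def walks_by_formula(n, x):
    """Formula (4): C(n, r) - C(n, r-1) with r = (n - x)/2."""
    if (n - x) % 2:  # r would not be an integer
        return 0
    r = (n - x) // 2
    return binom(n, r) - binom(n, r - 1)


table = walks_by_recurrence(16)
agree = all(table[n][x] == walks_by_formula(n, x)
            for n in range(17) for x in range(n + 1))
print(agree, table[16][0])
```

The column x = 0 of `table` gives the Catalan numbers at even n, in line with the special case w(2n, 0) = C_n.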
The total number, w(n), of walks of length n is
\[
\begin{aligned}
& w(n,n) + w(n,n-2) + w(n,n-4) + \cdots \\
&= \left[\binom{n}{0} - \binom{n}{-1}\right] + \left[\binom{n}{1} - \binom{n}{0}\right] + \cdots + \left[\binom{n}{k} - \binom{n}{k-1}\right] = \binom{n}{k},
\end{aligned}
\]
where n − 2k = 0 or 1 according as n is even or odd: i.e., $\binom{2k}{k}$ or $\binom{2k+1}{k}$.
Here it is clear that the number of walks of even length is just twice the number of walks of (odd) length one less:
\[ \binom{2k+2}{k+1} = \frac{2(k+1)(2k+1)!}{(k+1)\,k!\,(k+1)!} = 2\binom{2k+1}{k}. \]
Is there a simple “choice” argument for walks of odd length? If you “know” the Catalan
number result, then we can use a device similar to formula (2):
\[
\begin{aligned}
w(2k+1) &= 2w(2k) - C_k \\
&= 2\binom{2k}{k} - \frac{1}{k+1}\binom{2k}{k} \\
&= \frac{2k+1}{k+1}\binom{2k}{k} = \binom{2k+1}{k},
\end{aligned}
\tag{6}
\]
but this has an air of circularity about it, or at best may be using a sledgehammer to crack
a nut.
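For small n the totals can also be confirmed by sheer enumeration. The sketch below is an illustration only (the names `count_walks` and `catalan` are not from the paper): it lists all 2^n sequences of ±1 steps, keeps those that never go below 0, and checks the central binomial totals together with the relations w(2k + 2) = 2w(2k + 1) and w(2k + 1) = 2w(2k) − C_k.

```python
from itertools import product
from math import comb


def count_walks(n):
    """Count n-step walks of +1/-1 steps from 0 that never go below 0."""
    total = 0
    for steps in product((1, -1), repeat=n):
        position = 0
        for step in steps:
            position += step
            if position < 0:
                break
        else:  # no break: the walk stayed on the non-negative axis
            total += 1
    return total


def catalan(k):
    """k-th Catalan number."""
    return comb(2 * k, k) // (k + 1)


totals = [count_walks(n) for n in range(13)]
print(totals)
```

Exhaustive enumeration is exponential in n, so this is only a sanity check for the first dozen rows of Figure 10.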
Returning to the original problem, we can solve it if we go into more detail than most
people would deem desirable. The numbers, wn (x, y), of walks of n steps from (0,0) to
(x, y), which remain in the half-plane y ≥ 0, may be exhibited in a “Pascal semi-pyramid”
whose layers are shown in Figure 11.
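In each layer the rows run from the top, y = n, down to y = 0; each row lists its nonzero entries from the most negative x to the most positive x.

n = 0: y=0: 1
n = 1: y=1: 1; y=0: 1 1
n = 2: y=2: 1; y=1: 2 2; y=0: 1 3 1
n = 3: y=3: 1; y=2: 3 3; y=1: 3 8 3; y=0: 1 6 6 1
n = 4: y=4: 1; y=3: 4 4; y=2: 6 15 6; y=1: 4 20 20 4; y=0: 1 10 20 10 1
n = 5: y=5: 1; y=4: 5 5; y=3: 10 24 10; y=2: 10 45 45 10; y=1: 5 40 75 40 5; y=0: 1 15 50 50 15 1
n = 6: y=6: 1; y=5: 6 6; y=4: 15 35 15; y=3: 20 84 84 20; y=2: 15 105 189 105 15; y=1: 6 70 210 210 70 6; y=0: 1 21 105 175 105 21 1
n = 7: y=7: 1; y=6: 7 7; y=5: 21 48 21; y=4: 35 140 140 35; y=3: 35 224 392 224 35; y=2: 21 210 588 588 210 21; y=1: 7 112 490 784 490 112 7; y=0: 1 28 196 490 490 196 28 1

Figure 11: Layers of a Pascal semi-pyramid: values of wn (x, y).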
If we sum the rows in the layers of Figure 11 we obtain the numbers, wn (y), of walks
of n steps which start at (0,0) and end at distance y from the x-axis. These are shown in
Figure 12. We shall see that a special case is, as we have already earnested,
\[ w_n(0) = C_{n+1}. \tag{7} \]
In turn, the row sums of Figure 12 are the total numbers, $w_n$, of Sands-type walks of length n. They are listed in column two of Figure 12 and we will confirm another of our earlier statements:
\[ w_n = \binom{2n+1}{n}. \tag{8} \]
At risk of losing some interesting heuristics, we again leap to the conclusion
\[ w_n(x,y) = \binom{n}{r}\binom{n}{s} - \binom{n}{r-1}\binom{n}{s-1}, \quad \text{where } r = \tfrac{1}{2}(n+x-y),\ s = \tfrac{1}{2}(n-x-y). \tag{9} \]
The obvious symmetry wn (x, y) = wn (−x, y) is reflected in formula (9), since changing
the sign of x is equivalent to interchanging r and s. It is also clear that
 n     w_n  w_n(0)  w_n(1)  w_n(2)  w_n(3)  w_n(4)  w_n(5)  w_n(6)  w_n(7)  w_n(8)  w_n(9) w_n(10)
 0       1       1
 1       3       2       1
 2      10       5       4       1
 3      35      14      14       6       1
 4     126      42      48      27       8       1
 5     462     132     165     110      44      10       1
 6    1716     429     572     429     208      65      12       1
 7    6435    1430    2002    1638     910     350      90      14       1
 8   24310    4862    7072    6188    3808    1700     544     119      16       1
 9   92378   16796   25194   23256   15504    7752    2907     798     152      18       1
10  352716   58786   90440   87210   62016   33915   14364    4655    1120     189      20       1

Figure 12: Sums of rows of Figure 11: values of wn (y).
(a) if n + x + y is odd, then r, s are not integers, and
(b) if |x| + y > n, then at least one of r, s is negative,
so that in either of these cases,
wn (x, y) = 0.
We can prove (9) inductively from the recursion (10), which states that the last step
was either N, S, E or W:
\[ w_n(x,y) = w_{n-1}(x,y-1) + w_{n-1}(x,y+1) + w_{n-1}(x-1,y) + w_{n-1}(x+1,y). \tag{10} \]
Notice that the sums of the three arguments in the five terms are all of the same parity. If this is odd, then all the terms are zero. But if (r, s) are integers, then the corresponding values for the four terms on the right of (10) are
(r, s), (r − 1, s − 1), (r − 1, s), (r, s − 1),
and if we assume that formula (9) holds with n − 1 in place of n, then (10) yields
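\[
\begin{aligned}
w_n(x,y) ={}& \binom{n-1}{r}\binom{n-1}{s} - \binom{n-1}{r-1}\binom{n-1}{s-1} + \binom{n-1}{r-1}\binom{n-1}{s-1} - \binom{n-1}{r-2}\binom{n-1}{s-2} \\
&+ \binom{n-1}{r-1}\binom{n-1}{s} - \binom{n-1}{r-2}\binom{n-1}{s-1} + \binom{n-1}{r}\binom{n-1}{s-1} - \binom{n-1}{r-1}\binom{n-1}{s-2},
\end{aligned}
\]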
which becomes formula (9) after some more or less tedious manipulation, depending on
one’s ingenuity or symbol manipulator.
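For readers who would rather let a machine do the tedious part, here is a small sketch (an illustration, not the authors' method) that generates the layers of Figure 11 directly from recursion (10), with the constraint y ≥ 0, and compares each entry with formula (9). The names `half_plane_layers`, `formula_9` and `binom` are assumptions introduced for this example.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 unless 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def half_plane_layers(n_max):
    """layers[n][(x, y)] = number of n-step N/S/E/W walks from (0,0) staying in y >= 0."""
    layer = {(0, 0): 1}
    layers = [layer]
    for _ in range(n_max):
        nxt = {}
        for (x, y), count in layer.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if y + dy >= 0:  # never go below the x-axis
                    key = (x + dx, y + dy)
                    nxt[key] = nxt.get(key, 0) + count
        layer = nxt
        layers.append(layer)
    return layers


def formula_9(n, x, y):
    """Formula (9) with r = (n + x - y)/2 and s = (n - x - y)/2."""
    if (n + x - y) % 2:  # r and s would not be integers
        return 0
    r = (n + x - y) // 2
    s = (n - x - y) // 2
    return binom(n, r) * binom(n, s) - binom(n, r - 1) * binom(n, s - 1)


layers = half_plane_layers(7)
agree = all(layers[n].get((x, y), 0) == formula_9(n, x, y)
            for n in range(8)
            for x in range(-n - 1, n + 2)
            for y in range(n + 2))
print(agree)
```

The comparison deliberately includes points with |x| + y > n and points of the wrong parity, where both the count and the formula should vanish.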
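To find $w_n(y)$, sum (9) over x:
\[
\begin{aligned}
w_n(y) = \sum_{x=-n+y}^{n-y} w_n(x,y) &= \sum_{r=0}^{n-y} w_n(2r-n+y,\,y) \\
&= \left[\binom{n}{0}\binom{n}{n-y} + \binom{n}{1}\binom{n}{n-y-1} + \cdots + \binom{n}{n-y}\binom{n}{0}\right] \\
&\quad - \left[\binom{n}{-1}\binom{n}{n-y-1} + \binom{n}{0}\binom{n}{n-y-2} + \cdots + \binom{n}{n-y-2}\binom{n}{0}\right].
\end{aligned}
\]
The two brackets are the coefficients of $t^{n-y}$ and of $t^{n-y-2}$ in the expansion of $(1+t)^n(1+t)^n$, so that
\[ w_n(y) = \binom{2n}{n-y} - \binom{2n}{n-y-2}, \]
which may be rewritten as
\[ w_n(y) = \binom{2n+1}{n-y} - \binom{2n+1}{n-y-1}. \tag{11} \]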
On comparing this with (4) we see that
wn (y) = w(2n + 1, 2y + 1),
the number of odd length one-dimensional walks which finish, of course, at an odd distance
from the origin.
In particular, (7) is the same as the number of walks from (0,0) to (n, n + 1) which begin with a northward step and do not cross the line joining start to finish,
\[
\begin{aligned}
w(2n+1,1) = w_n(0) &= \binom{2n+1}{n} - \binom{2n+1}{n-1} = \frac{(2n+1)!}{n!\,(n+2)!}\,(n+2-n) \\
&= \frac{2(n+1)(2n+1)!}{(n+1)!\,(n+2)!} = \frac{1}{n+2}\binom{2n+2}{n+1} = C_{n+1},
\end{aligned}
\]
the (n + 1)th Catalan number.
Finally, summing (11) from y = 0 to y = n, gives (8).
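The last three statements are easy to test numerically. The sketch below (illustrative names only) evaluates formula (11), checks that w_n(0) is the Catalan number C_{n+1} as in (7), and checks that the row sums give (8); the sum telescopes, since consecutive terms of (11) cancel and only the first binomial coefficient survives.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 unless 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def walks_ending_at_height(n, y):
    """Formula (11): n-step half-plane walks ending at height y."""
    return binom(2 * n + 1, n - y) - binom(2 * n + 1, n - y - 1)


def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


row_5 = [walks_ending_at_height(5, y) for y in range(6)]
print(row_5, sum(row_5))
```

The printed row can be compared with the n = 5 row of Figure 12.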
We could ask similar questions concerning walks which do not stray outside the positive
quadrant. The numbers of such walks now form a “Pascal quarter-pyramid”, which is
exhibited in Figure 13.
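In each layer the rows run from the top, y = n, down to y = 0; each row lists its nonzero entries in order of increasing x, starting from x = 0 or x = 1.

n = 0: y=0: 1
n = 1: y=1: 1; y=0: 1
n = 2: y=2: 1; y=1: 2; y=0: 2 1
n = 3: y=3: 1; y=2: 3; y=1: 5 3; y=0: 5 1
n = 4: y=4: 1; y=3: 4; y=2: 9 6; y=1: 16 4; y=0: 10 9 1
n = 5: y=5: 1; y=4: 5; y=3: 14 10; y=2: 35 10; y=1: 35 35 5; y=0: 35 14 1
n = 6: y=6: 1; y=5: 6; y=4: 20 15; y=3: 64 20; y=2: 84 90 15; y=1: 140 64 6; y=0: 70 84 20 1
n = 7: y=7: 1; y=6: 7; y=5: 27 21; y=4: 105 35; y=3: 168 189 35; y=2: 378 189 21; y=1: 294 378 105 7; y=0: 294 168 27 1
n = 8: y=8: 1; y=7: 8; y=6: 35 28; y=5: 160 56; y=4: 300 350 70; y=3: 840 448 56; y=2: 840 1134 350 28; y=1: 1344 840 160 8; y=0: 588 840 300 35 1

Figure 13: Layers of a Pascal quarter-pyramid: values of wn′ (x, y).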
The entries in Figure 13 are given, again without motivation, by
\[ w'_n(x,y) = \binom{n}{r}\binom{n+2}{s} - \binom{n+2}{r+1}\binom{n}{s-1} \tag{12} \]
where $r = \tfrac{1}{2}(n+x-y)$, $s = \tfrac{1}{2}(n-x-y)$ as before. Notice that interchange of x and y keeps s fixed and replaces r by n − r. So the symmetry
\[ w'_n(y,x) = w'_n(x,y) \]
follows from the symmetries
\[ \binom{n}{r} = \binom{n}{n-r} \quad \text{and} \quad \binom{n+2}{r+1} = \binom{n+2}{n-r+1}. \tag{13} \]
We may prove (12) as we proved (9), since wn′ (x, y) also satisfies the relation (10).
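As a check on (12), the following sketch (names such as `quarter_plane_layers` and `formula_12` are illustrative assumptions) builds the quarter-pyramid layers from relation (10), now discarding any step that would leave the quadrant x ≥ 0, y ≥ 0, and compares them with the closed form.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 unless 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def quarter_plane_layers(n_max):
    """layers[n][(x, y)] = number of n-step N/S/E/W walks from (0,0) staying in x, y >= 0."""
    layer = {(0, 0): 1}
    layers = [layer]
    for _ in range(n_max):
        nxt = {}
        for (x, y), count in layer.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if x + dx >= 0 and y + dy >= 0:  # stay inside the positive quadrant
                    key = (x + dx, y + dy)
                    nxt[key] = nxt.get(key, 0) + count
        layer = nxt
        layers.append(layer)
    return layers


def formula_12(n, x, y):
    """Formula (12) with r = (n + x - y)/2 and s = (n - x - y)/2."""
    if (n + x - y) % 2:  # r and s would not be integers
        return 0
    r = (n + x - y) // 2
    s = (n - x - y) // 2
    return binom(n, r) * binom(n + 2, s) - binom(n + 2, r + 1) * binom(n, s - 1)


layers = quarter_plane_layers(8)
agree = all(layers[n].get((x, y), 0) == formula_12(n, x, y)
            for n in range(9)
            for x in range(n + 2)
            for y in range(n + 2))
print(agree, layers[8][(0, 0)])
```

The symmetry in x and y is visible in the dictionary: `layers[n][(x, y)]` and `layers[n][(y, x)]` always coincide.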
A remarkable coincidence is that
\[ w'_{2k-1}(0,1) = \tfrac{1}{2} C_k C_{k+1} \]
is the number of inequivalent Hamiltonian rooted maps on 2k vertices (sequence 1647 in
[10]) although Tutte [12] doesn’t give the formula in terms of Catalan numbers. Is there
yet another opportunity for a pure combinatorial proof?
Figure 14 is obtained by summing the rows of Figure 13, and we may find wn′ (y), the
number of walks in the positive quadrant which finish at distance y from the x-axis, by
 n  wn′(0)  wn′(1)  wn′(2)  wn′(3)  wn′(4)  wn′(5)  wn′(6)  wn′(7)  wn′(8)  wn′(9)
 0       1
 1       1       1
 2       3       2       1
 3       6       8       3       1
 4      20      20      15       4       1
 5      50      75      45      24       5       1
 6     175     210     189      84      35       6       1
 7     490     784     588     392     140      48       7       1
 8    1764    2352    2352    1344     720     216      63       8       1
 9    5292    8820    7560    5760    2700    1215     315      80       9       1

Figure 14: Sums of rows of Figure 13: values of wn′ (y).
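summing (12) from x = 0 to x = n − y:
\[
\begin{aligned}
w'_n(y) ={}& \binom{n}{n-y}\binom{n+2}{0} - \binom{n+2}{n-y+1}\binom{n}{-1} \\
&+ \binom{n}{n-y-1}\binom{n+2}{1} - \binom{n+2}{n-y}\binom{n}{0} \\
&+ \cdots + \binom{n}{\frac{1}{2}(n-y)}\binom{n+2}{\frac{1}{2}(n-y)} - \binom{n+2}{\frac{1}{2}(n-y)+1}\binom{n}{\frac{1}{2}(n-y)-1}
\end{aligned}
\]
if n − y is even, but with the last term replaced by
\[ \binom{n}{\frac{1}{2}(n-y+1)}\binom{n+2}{\frac{1}{2}(n-y-1)} - \binom{n+2}{\frac{1}{2}(n-y+3)}\binom{n}{\frac{1}{2}(n-y-3)} \]
if n − y is odd.
Put n − y = 2k or 2k + 1 and
\[
w'_n(y) =
\begin{cases}
\dbinom{n+1}{k}\dbinom{n}{k} - \dbinom{n+1}{k}\dbinom{n}{k-1} & \text{if } n-y = 2k, \\[2ex]
\dbinom{n+1}{k}\dbinom{n}{k+1} - \dbinom{n+1}{k+1}\dbinom{n}{k-1} & \text{if } n-y = 2k+1.
\end{cases}
\]
In particular, if y = 0,
\[ w'_n(0) = \frac{1}{k+1}\binom{2k}{k}\binom{2k+1}{k} \quad \text{or} \quad \frac{1}{k+2}\binom{2k+2}{k+1}\binom{2k+1}{k}, \]
i.e.
\[ \binom{2k+1}{k} C_k \quad \text{or} \quad \binom{2k+1}{k} C_{k+1}, \]
according as n = 2k or n = 2k + 1, where $C_k$ is the k-th Catalan number.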
For walks in the positive quadrant it’s more natural and symmetrical to ask for the
numbers of walks which terminate at various distances from the origin, using the “Manhattan metric”, x + y = n − 2s. Figure 15 shows the sums of the diagonals of Figure
13.
The columns after wn′ are indexed by x + y = 0, 1, ..., 11; a dot marks an omitted zero.

 n     wn′       0       1       2       3       4       5       6       7       8       9      10      11
 0       1       1
 1       2       .       2
 2       6       2       .       4
 3      18       .      10       .       8
 4      60      10       .      34       .      16
 5     200       .      70       .      98       .      32
 6     700      70       .     308       .     258       .      64
 7    2450       .     588       .    1092       .     642       .     128
 8    8820     588       .    3024       .    3414       .    1538       .     256
 9   31752       .    5544       .   12276       .    9834       .    3586       .     512
10  116424    5544       .   31680       .   43230       .   26752       .    8194       .    1024
11  426888       .   56628       .  141570       .  138424       .   69784       .   18434       .    2048

Figure 15: Sums of diagonals of Figure 13: values of wn′′ (x + y).
The entries in Figure 15 are
\[
\begin{aligned}
w''_n(x+y) = w''_n(n-2s) &= \sum_{x+y=n-2s} w'_n(x,y) \\
&= \binom{n+2}{s}\left[\binom{n}{s} + \binom{n}{s+1} + \cdots + \binom{n}{n-s}\right] \\
&\quad - \binom{n}{s-1}\left[\binom{n+2}{s+1} + \binom{n+2}{s+2} + \cdots + \binom{n+2}{n-s+1}\right].
\end{aligned}
\]
Except for small values of s, the truncated binomial expansions do not seem to have a simple closed form:
\[
\begin{aligned}
w''_n(n) &= 2^n \\
w''_n(n-2) &= (n-2)2^n + 2 \\
w''_n(n-4) &= \frac{1}{2!}(n^2-5n+2)2^n + n^2 + 3n - 2 \\
w''_n(n-6) &= \frac{1}{3!}\left[n(n^2-9n+14)2^n + n(n^3+4n^2-n-28)\right] \\
w''_n(n-8) &= \frac{1}{4!}\,n(n-1)(n^2-13n+34)2^n + \frac{1}{72}\,n(n-1)(n^4+4n^3-n^2-64n-204)
\end{aligned}
\]
An amusing curiosity is that wn′′ (n − 2) is twice the genus of the (n + 2)-dimensional
cube ([128], or see Theorem 14 in [79]).
The total number of walks, $w'_n$, the left hand column in Figure 15, has, on the other hand, the comparatively simple formula
\[ w'_n = \binom{n}{\lfloor n/2 \rfloor}\binom{n+1}{\lfloor (n+1)/2 \rfloor}, \tag{14} \]
which again seems to beg for a simple proof.
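While a simple proof is left open, formula (14) and the first closed forms for the diagonal sums are at least easy to test. The sketch below takes formula (12) as given and sums it; the helper names (`quarter_plane`, `total_walks`, `diagonal_sum`) are illustrative assumptions rather than anything from the paper.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 unless 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def quarter_plane(n, x, y):
    """Formula (12): n-step walks from (0,0) to (x,y) inside the positive quadrant."""
    if (n + x - y) % 2:
        return 0
    r = (n + x - y) // 2
    s = (n - x - y) // 2
    return binom(n, r) * binom(n + 2, s) - binom(n + 2, r + 1) * binom(n, s - 1)


def total_walks(n):
    """Sum of (12) over all end points in the quadrant."""
    return sum(quarter_plane(n, x, y) for x in range(n + 1) for y in range(n + 1))


def diagonal_sum(n, d):
    """Sum of (12) over end points at Manhattan distance d = x + y from the origin."""
    return sum(quarter_plane(n, x, d - x) for x in range(d + 1))


totals = [total_walks(n) for n in range(12)]
print(totals)
```

The same functions reproduce the diagonal sums of Figure 15, for example `diagonal_sum(9, 3)`.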
If there is no restriction on the two-dimensional walks, i.e. if they may wander on either
side of the x- and y-axes, then it is fairly easy to see that their number of length n, from
(0,0) to (x, y), is
\[ \binom{n}{r}\binom{n}{s} \tag{15} \]
where r and s are as before, but calculated using the absolute values of x and y.
Of course, the total number of walks of length n is $4^n$.
Although we certainly haven’t found the most aesthetic proofs, the comparative simplicity of the final results tempts us to ask what happens in three dimensions. Let Wn (x, y, z)
be the number of walks of n steps, each in a direction N, S, E, W, up, or down, from
(0,0,0) to (x, y, z), which never go below the (x, y)-plane. We will not attempt to depict
the four-dimensional “Pascal semi-pyramid”, but the sums of its layers now give Wn (z),
the number of walks terminating at height z above the (x, y)-plane, and this satisfies the
recurrence
\[ W_n(z) = W_{n-1}(z-1) + 4W_{n-1}(z) + W_{n-1}(z+1) \tag{16} \]
which may be used to produce the array of Figure 16.
Each entry in Figure 16 is the sum of four times the entry immediately above it and
the two neighbors of that entry, e.g.
W5 (2) = 4 · 99 + 288 + 16 = 700.
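Recurrence (16) is all that is needed to regenerate the array. A minimal sketch follows (the function name `three_dim_heights` is an assumption made for this example): each row is computed from the previous one, with entries outside 0 ≤ z ≤ n − 1 treated as zero.

```python
def three_dim_heights(n_max):
    """rows[n][z] = W_n(z), built from W_n(z) = W_{n-1}(z-1) + 4 W_{n-1}(z) + W_{n-1}(z+1)."""
    rows = [[1]]  # W_0(0) = 1: the empty walk
    for n in range(1, n_max + 1):
        prev = rows[-1] + [0, 0]  # pad so that prev[z] and prev[z+1] always exist
        row = []
        for z in range(n + 1):
            below = prev[z - 1] if z >= 1 else 0  # walks may not come from height -1
            row.append(below + 4 * prev[z] + prev[z + 1])
        rows.append(row)
    return rows


rows = three_dim_heights(9)
for n, row in enumerate(rows):
    print(n, sum(row), row)
```

Each printed line gives n, the row total W_n, and the values W_n(0), W_n(1), ..., W_n(n).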
The columns after Wn are indexed by z = 0, 1, ..., 9.

 n       Wn        0        1        2        3        4        5        6        7        8        9
 0        1        1
 1        5        4        1
 2       26       17        8        1
 3      139       76       50       12        1
 4      758      354      288       99       16        1
 5     4194     1704     1605      700      164       20        1
 6    23460     8421     8824     4569     1376      245       24        1
 7   132339    42508    48286    28476    10318     2380      342       28        1
 8   751526   218318   264128   172508    72128    20180     3776      455       32        1
 9  4290838  1137400  1447338  1026288   481200   156624    35739     5628      584       36        1

Figure 16: Walks in three dimensions: values of Wn (z).
We again suppress the details of discovery of the general formula, and of its inductive
proof: these details seem to be more complicated than before, and we found no obvious
manifestation of the Catalan numbers. The simplest expression for Wn (z) that we have so
far found is not in closed form:
\[ W_n(n-v) = a_{n,0}\binom{n}{v} + a_{n,1}\binom{n}{v-1} + \cdots + a_{n,t}\binom{n}{v-t} \]
where t = ⌊(v + 2)/2⌋ and the coefficients $a_{n,u}$ are of shape
\[ a_{n,u} = \left[\binom{v-u}{u} - 4^2\binom{v-u}{u-2}\right]4^{v-2u} \]
although there are, of course, closed form expressions for small values of v:
\[
\begin{aligned}
W_n(n) &= 1 \\
W_n(n-1) &= 4n \\
W_n(n-2) &= (n-1)(8n+1) \\
W_n(n-3) &= \tfrac{4}{3}\,n(n-2)(8n-5) \\
W_n(n-4) &= \tfrac{1}{6}\,n(n-3)(64n^2-144n+83) \\
W_n(n-5) &= \tfrac{2}{15}\,n(n-1)(n-4)(64n^2-240n+239) \\
W_n(n-6) &= \tfrac{1}{90}\,n(n-1)(n-5)(512n^3-3648n^2+8872n-7233) \\
W_n(n-7) &= \tfrac{2}{315}\,n(n-1)(n-2)(n-6)(512n^3-4800n^2+15496n-17007).
\end{aligned}
\]
We have not found a closed expression for Wn , the total number of walks of n steps
which do not go below the (x, y)-plane, nor have we had an opportunity to examine the
paper [96] which may contain such an expression and may overlap these results in other
ways. The total number of n-step walks in d dimensions, without restriction, is, of course, $(2d)^n$. However, combinatorial proofs of many of the preceding results appear in [71], a
1992 paper by Guy, Krattenthaler, and Sagan.
Walks in graphs or on grids also turn up in recreational mathematics – where many of us first learned that there is a whole lot of math that is not arithmetic, including questions about chessboards. The Catalan numbers count Dyck paths, and we can describe Dyck paths as paths that begin in the southwest corner of a chessboard, in which moves go North or East, and which never cross the Southwest-to-Northeast diagonal and finally return to it. Knight’s tours, walks that consist solely of knight moves (two squares along a row or column followed by one square along a column or row), were investigated by Euler and many others, and the next section is about a similar problem involving rooks, which can move North, South, East, or West.
5 Unique Rook Circuits
A rook tour of an array of squares is a path that visits every empty square exactly
once; from a square you may move to any empty adjacent square (North, South, East or
West, but not diagonally). We will call a rook tour a rook circuit if it starts and ends
on the same square. As you can see from Figure 17, given an array with a set of forbidden
squares, there may be: (a), no circuit; (b) and (c), more than one circuit; or (d), exactly
one circuit. We are especially interested in this last case.
Figure 17: Some examples of rook circuits, (a) to (d).
How do we justify our statements? If we checker the squares of the array, we see that
each unit step in a rook circuit goes from a white square to a black square, or vice versa.
This gives us
The Parity Principle. The numbers of black and white squares in a rook circuit must
be equal. Moreover, the number of vertical (unit) steps and the number of horizontal steps
are both even.
In example (a) the two forbidden squares are both black squares, so there is no hope of a circuit, or even a tour. Forbidden squares are indicated by guideposts, ⊗ on white squares, and ⊕ on black squares.
Examples (b) and (c) make it clear that arrays of size 4 × 4 or bigger with an even
number of squares, none of them forbidden, always have more than one circuit.
Example (d) has a unique circuit. This may not be immediately obvious, but is easily
checked using
The Two Neighbor Principle. If a square has only two neighbors, then it must be
visited via those neighbors.
For example, this applies to the corner squares a, d, i, l, n in Figure 18, and also to squares k and m. This gives us the 11 steps b−a−e−i−j−l−m−n−k−h−d−c, and now squares f and g have only two neighbors and the circuit is determined.
Figure 18: Uniqueness of rook circuit (squares labeled a to n, with two guideposts).
The problem originated with Sidney Kravitz who posed Problem 2212 in [97] – see also
[171]. He asked the readers to find the unique rook circuit in Figure 19.
Figure 19: Sidney Kravitz’s problem: a 10 × 15 array with 18 prohibited squares.
The open question was, is there a configuration of fewer than 18 prohibited squares
which forces a unique rook circuit on a 10 × 15 array? The answer is ‘Yes’, and the reader
can verify that the configuration given in Figure 20, with only 10 guideposts, has a unique
rook circuit.
Figure 20: Best known solution, with 10 guideposts, for a 10 × 15 array.
Although Kravitz’s problem, Figure 19, is not minimal in the number of guideposts
used, it is a better puzzle in the sense that the proof of uniqueness of the circuit is not
as straightforward as in other examples. We leave this as an exercise for the reader, who
should be armed with
The Cul-de-sac Principle. Never draw segments which leave a square with only one
exit.
The best known general results have been found by Marc Paulus, using the 4-Corner
Principle. A 4-corner is a 4 × 4 array of squares in a corner of a larger array.
The 4-Corner Principle. If there is a unique circuit in an m × n array, then at least
one square in each 4-corner must contain a guidepost.
Our proof is tedious and we give only an outline; can the reader find a shorter one?
If there is a rook circuit in an m × n array, then it must visit every square in an empty
4-corner. There are only finitely many ways it can do this and the circuit will cross the
boundary of the 4-corner exactly 2, 4, 6 or 8 times. Label the exits from a 4-corner as in
Figure 21.
Figure 21: Labels for the exits of a 4-corner: 1, 2, 3, 4 and 1̄, 2̄, 3̄, 4̄.
First suppose that the circuit crosses the boundary of the 4-corner exactly twice. Two
examples are shown in Figure 22, where the crossings are at exits {1, 2̄} and {1, 4}. In each
case we see two alternative paths which give a counterexample to the statement that any
circuit with those prescribed exits is unique.
Figure 22: Cases {1, 2̄} and {1, 4}.
We do not need to consider the case {1, 3}, for example, since the Parity Principle shows
that no circuit can cover a 4-corner with those exits (both 1 and 3 are odd). Furthermore,
the counterexamples constructed for exits {1, 4} will also work for {1, 4̄}. Also, those for
{1, 2̄} will work for the case {2, 1̄}, by diagonal reflexion.
In this way we can reduce the cases that still need to be considered to: {1, 2}, {2, 3},
{2, 3̄} and {3, 4}, which we leave to the reader.
Now suppose that the circuit crosses the boundary of a 4-corner exactly four times.
For example, if the circuit crosses at exits {1, 2, 3, 2̄}, we need to consider the two cases
illustrated in Figure 23. The first is where exits {2, 3} (braced in the figure) and exits
{1, 2̄} are connected outside the 4-corner (so that {1, 2} and {2, 2̄} are connected inside).
In the second, the roles are interchanged, with exits {1, 2} (braced) and {3, 2̄} connected
outside. We don’t need to consider the case where {1, 3} and {2, 2̄} are connected outside
the 4-corner since that would require the circuit to intersect itself.
Figure 23: Cases {{1, 2̄}, {2, 3}} and {{1, 2}, {3, 2̄}}.
We list in an appendix the cases with 4, 6 or 8 exits for which we need to construct
counterexamples.
You might think that a statement similar to the 4-Corner Principle must be true if
you consider large enough arrays that are not in a corner, but our general results show
that no such statement is true: you can have an arbitrarily large area of an array without
guideposts but with only one circuit! Moreover, if you look at the examples in Figure
27 below, you will see that a ‘3-corner principle’ can’t be true, and the statement that a
5-corner must contain two guideposts is also false. It seems probable that an 8-corner must
contain two guideposts but we have made no attempt to prove this.
It’s not too hard, when you know the answers, to verify exact results when the board
has a dimension less than 5, or one that is a multiple of 4.
Rectangles of size 1 × n have no rook circuit, and those of size 2 × n have a unique one.
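Small cases such as these can be confirmed by exhaustive search. The following is a brute-force sketch, not the authors' method and only practical for small boards: it treats the free squares as a grid graph and counts Hamiltonian cycles by backtracking. The function name `count_rook_circuits` and the zero-based (row, column) coordinates for forbidden squares are assumptions made for this illustration.

```python
def count_rook_circuits(rows, cols, forbidden=()):
    """Count rook circuits (closed tours through every free square) on a rows x cols array."""
    blocked = set(forbidden)
    cells = [(r, c) for r in range(rows) for c in range(cols) if (r, c) not in blocked]
    if len(cells) < 4:  # a closed circuit on a square grid needs at least four squares
        return 0
    free = set(cells)
    start = cells[0]

    def neighbours(cell):
        r, c = cell
        return [p for p in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)) if p in free]

    found = 0

    def extend(cell, visited):
        nonlocal found
        if len(visited) == len(cells):
            if start in neighbours(cell):  # the tour can close up
                found += 1
            return
        for nxt in neighbours(cell):
            if nxt not in visited:
                visited.add(nxt)
                extend(nxt, visited)
                visited.remove(nxt)

    extend(start, {start})
    return found // 2  # every circuit is traversed once in each direction


print(count_rook_circuits(1, 5), count_rook_circuits(2, 5), count_rook_circuits(4, 4))
```

With one guidepost in a corner, `count_rook_circuits(3, 3, [(0, 0)])` gives a single circuit, while the empty 3 × 3 board has none because its black and white squares are unequal in number.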
For rectangles of size 3 × (3n + j) with n > 0 and j = 0 or ±1 it is necessary and
sufficient to use n + j guideposts to enforce a unique circuit. You can check this by noting
that consecutive posts have to be on squares of opposite colors. One way of placing them
is:
for j = 0, put n guideposts at (3r + 1, 1) for 0 ≤ r < n;
for j = 1, put n + 1 guideposts at (3r + 1, 1) for 0 ≤ r ≤ n;
for j = −1, put n − 1 guideposts at (3r, 1) for 0 < r < n.
Figure 24: Find the unique rook circuits.
It is a curious paradox that, in order to force a unique rook circuit on a 3 × n array,
about 1/9 of its squares need to be occupied by guideposts, whereas larger boards require
smaller fractions. Rectangles of size 4 × n only need 2 guideposts, if you place them at
(1,1) and at (n−1, 2) or (n−1, 3) according as n is even or odd.
Figure 25: Check that the rook circuits are unique.
If one dimension of the rectangle is a multiple of 4, say 4m, then 2m guideposts are seen
to suffice, if you stack m copies of the 4 × n examples that we’ve just seen. For example,
the standard 8 × 8 chessboard has a unique rook circuit if 4 guideposts are placed as in
Figure 26.
Figure 26: Check that the circuit is unique.
We are less certain of the minimum number of guideposts if the dimensions of the
rectangle are 4m+i by 4n+j, with m, n > 0 and i, j = 1, 2 or 3. We believe them to be
2(m+n) − 1 if ij = 1 (first diagram of Figure 27),
2(m+n) if ij is even (next three diagrams), and
2(m+n) + 1 if ij = 3 or 9 (last row of Figure 27).
The fact that these numbers suffice is illustrated by the six diagrams of Figure 27; for
m = n = 2 and (i, j) = (1, 1), (1,2) (first row); (2,2), (2,3) (second row); (3,1), (3,3) (last
row). For the cases (i, j) = (2, 1), (3,2), or (1,3), reflect the appropriate diagram about its
rising diagonal.
In each diagram the framed (4m+i) × 4 and 4 × (4n+j) rectangles each contain two
guideposts, one of each color. These may be deleted (in order to reduce m or n to 1), or
duplicated (to cover cases where m or n is greater than 2) producing rectangles of different
height or width. Enjoy verifying the uniqueness of the rook circuit in each case.
Figure 27: Find the unique rook circuits. Panels: (i, j) = (1, 1), (1, 2); (2, 2), (2, 3); (3, 1), (3, 3).
Here are some other problems.
1. Force unique circuits on different shapes of board. Hamilton might have wondered
about circuits on a dodecahedron or icosahedron.
2. Force unique circuits on different tilings, for example, triangular and hexagonal.
3. What if the array of blocks is a tiling of a torus, a cylinder, a Möbius band, a Klein
bottle, or a projective plane?
4. What is the maximum number of non-adjacent squares that you can block from an
array and still permit a rook circuit?
5. Higher dimensional problems.
Special thanks to Sidney Kravitz, Loren Larson and Bill Sands for various insights.
Appendix
To complete the proof of the 4-Corner Principle we use arguments similar to those in
the text to reduce the number of cases, and find pairs of circuits providing counterexamples
to uniqueness in the cases where there are two pairs of exits:
{{1, 2}, {3, 4}},
{{1, 2}, {4, 3̄}},
{{1, 2}, {2̄, 3̄}},
{{1, 3}, {4, 2̄}},
{{1, 3}, {4, 4̄}},
{{2, 3}, {4, 3̄}},
{{1, 4}, {2, 3}},
{{1, 3̄}, {2, 4}},
{{1, 2̄}, {2, 3̄}},
{{1, 2̄}, {3, 4}},
{{1, 3̄}, {4, 4̄}},
{{2, 3̄}, {3, 4}},
{{1, 2}, {3, 2̄}},
{{1, 2}, {4, 1̄}},
{{1, 2}, {1̄, 2̄}},
{{1, 4}, {2̄, 3̄}},
{{1, 1̄}, {4, 4̄}},
{{1, 2̄}, {2, 3}},
{{1, 1̄}, {2, 4}},
{{1, 1̄}, {2, 2̄}},
{{1, 2̄}, {4, 3̄}},
{{3, 3̄}, {4, 4̄}},
or three pairs of exits:
{{1, 2}, {3, 3̄}, {4, 4̄}},
{{1, 1̄}, {2, 3}, {4, 4̄}},
{{1, 2}, {1̄, 3̄}, {4, 4̄}},
{{1, 2̄}, {2, 3}, {4, 3̄}},
{{1, 1̄}, {2, 3}, {4, 2̄}},
{{1, 2}, {3, 1̄}, {4, 2̄}},
{{1, 3̄}, {2, 3}, {4, 4̄}},
{{1, 3}, {2̄, 3̄}, {4, 4̄}},
{{1, 1̄}, {2, 3̄}, {4, 4̄}},
{{1, 2̄}, {2, 3̄}, {3, 4}},
{{1, 1̄}, {2, 2̄}, {3, 4}},
{{1, 2}, {3, 1̄}, {4, 4̄}},
{{1, 2̄}, {3, 3̄}, {4, 4̄}},
{{1, 2}, {3, 4}, {2̄, 3̄}},
{{1, 2}, {3, 2̄}, {4, 3̄}},
{{1, 2}, {3, 4}, {1̄, 2̄}},
or four pairs:
{{1, 2}, {3, 4}, {1̄, 2̄}, {3̄, 4̄}},
{{1, 4}, {2, 3}, {1̄, 4̄}, {2̄, 3̄}},
{{1, 1̄}, {2, 3}, {4, 2̄}, {3̄, 4̄}},
{{1, 1̄}, {2, 2̄}, {3, 4}, {3̄, 4̄}}.
{{1, 4}, {2, 3}, {1̄, 2̄}, {3̄, 4̄}},
{{1, 3̄}, {2, 4̄}, {3, 4}, {3̄, 4̄}},
{{1, 1̄}, {2, 4̄}, {3, 4}, {2̄, 3̄}},
6 Sums, colorings, squared squares, and packings
Triples satisfying x + y = z
Let’s go back to Langford’s son’s blocks in Section 1, that array of two rows and eight
columns, add 1 to the first label 5 in the bottom row and fill out the bottom row to read
6, 7 . . . , 12:
4 2 3 2 4 3 1 1
5 6 7 8 9 10 11 12
We have solved another problem, namely, to partition the numbers from 1 to 3n into n
triples that are solutions of x + y = z:
1 + 11 = 12,
2 + 6 = 8,
3 + 7 = 10,
4 + 5 = 9.
Since we added 1 to the numbers on the blocks, we see that the x + y = z problem can
be solved precisely when the Skolem problem can be solved, namely when n ≡ 0 or 1 (mod
4). This example began with a Langford sequence of 3 pairs. Let’s follow the complete
process beginning with a Langford sequence for four pairs:
4 1 3 1 2 4 3 2
We adjoin two zeros to this row, add another row below, add 1 to each of the elements of
the first row, add yet another row below, and fill in the bottom row to be 1+5 = 6, 7, . . . , 15:
4 1 3 1 2 4 3 2 0 0
5 2 4 2 3 5 4 3 1 1
6 7 8 9 10 11 12 13 14 15
This gives a partition of the numbers from 1 to 15 into five triples {x, y, z} such that
x + y = z, namely
1 + 14 = 15, 2 + 7 = 9, 3 + 10 = 13, 4 + 8 = 12, 5 + 6 = 11.
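The passage from a row of labels to a set of triples is mechanical, and can be sketched as follows. The function name `triples_from_labels` is an assumption for this example; it expects a row in which each label k = 1, ..., n occurs exactly twice, k places apart (the row obtained above after adding 1 to the blocks), and numbers the bottom row n + 1, n + 2, ..., 3n.

```python
def triples_from_labels(row):
    """Turn a row where each label k appears twice, k places apart, into triples x + y = z."""
    n = len(row) // 2
    first_seen = {}
    triples = []
    for position, label in enumerate(row, start=1):
        if label in first_seen:
            earlier = first_seen[label]
            if position - earlier != label:
                raise ValueError(f"label {label} is not {label} places apart")
            # the bottom row carries n + position, so label + (n + earlier) = n + position
            triples.append((label, n + earlier, n + position))
        else:
            first_seen[label] = position
    return sorted(triples)


print(triples_from_labels([4, 2, 3, 2, 4, 3, 1, 1]))
print(triples_from_labels([5, 2, 4, 2, 3, 5, 4, 3, 1, 1]))
```

Each triple (x, y, z) returned satisfies x + y = z, and together the triples use every number from 1 to 3n exactly once.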
We will use such solutions to find packings and coverings of the complete graph with
triangles and to find cyclic Steiner triple systems. The complete graph Kn is the graph
with n vertices in which every pair of vertices is joined by an edge. Cyclic Steiner systems
are our first examples of combinatorial designs, and we’ll define them in Section 6.
Triples satisfying x + y = 2z.
The corresponding problem where x + y = z is replaced by x + y = 2z does have solutions
for all values of n. For example, for n = 10 we have
1 + 3 = 2 · 2, 4 + 8 = 2 · 6, 7 + 13 = 2 · 10,
20 + 28 = 2 · 24, 11 + 21 = 2 · 16, 17 + 29 = 2 · 23,
12 + 26 = 2 · 19, 14 + 30 = 2 · 22, 9 + 27 = 2 · 18, 5 + 25 = 2 · 15.
Since x + y = 2z if and only if {x, z, y} is a three-term arithmetic progression with common
difference (y − x)/2, we see that {1, 2, 3}, {4, 6, 8}, {7, 10, 13}, . . . , and {5, 15, 25} are ten
arithmetic progressions with differences 1, 2, . . . , and 10, respectively.
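The n = 10 example can be verified in a few lines. This is only a check of the listed triples, written as (x, z, y) so that each is a three-term arithmetic progression; the variable names are illustrative.

```python
# The ten triples from the text, written as arithmetic progressions (x, z, y) with x + y = 2z.
progressions = [
    (1, 2, 3), (4, 6, 8), (7, 10, 13), (20, 24, 28), (11, 16, 21),
    (17, 23, 29), (12, 19, 26), (14, 22, 30), (9, 18, 27), (5, 15, 25),
]

uses_each_number_once = sorted(v for p in progressions for v in p) == list(range(1, 31))
all_satisfy_equation = all(x + y == 2 * z for x, z, y in progressions)
differences = sorted(z - x for x, z, y in progressions)

print(uses_each_number_once, all_satisfy_equation, differences)
```

The sorted list of common differences shows that each of 1, 2, ..., 10 occurs exactly once.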
Coil diagrams and Heawood’s Conjecture
Although we cannot always partition the numbers from 1 to 3n into n triples satisfying
x + y = z, we can partition those from 1 to 3n + 1, omitting one number; for example,
we may omit the number ⌊(5n + 3)/2⌋. The diagram in Figure 28 consists of points and
arcs; with the exception of the two central points (here, 14 and 10), the semicircle joining
the numbers a and c represents b, where a + b = c. For example, the semicircle joining 5 and 20 represents 15, because 5 + 15 = 20. In this way, we display the triples showing that
1 + 24 = 25, 2 + 21 = 23, 4 + 18 = 22, 5 + 15 = 20, 7 + 12 = 19, 8 + 9 = 17, 10 + 6 = 16,
and 14 + 3 = 11. The omitted number is 13.
Figure 28: A coil diagram displaying n triples satisfying x + y = z.
The preceding use of these so-called coil diagrams is very similar to the use made by
Ringel and Youngs [131] when they confirmed Heawood’s conjecture for the genus of the
complete graph Kn . Let us see what this is all about.
The genus of a graph G is the least integer g such that G can be embedded in a torus
with g holes. A k-coloring of a graph is an assignment of one of k colors to each vertex.
A k-coloring is proper provided adjacent vertices are given distinct colors. The famous
Four Color Theorem, proved in 1976, states that four colors are sufficient to properly color
any graph that can be drawn in the plane. In 1890, P. J. Heawood [81] showed that A. B.
Kempe’s 1879 proof of the Four Color Theorem was flawed and proceeded to prove that
every planar graph has a proper 5-coloring. He then conjectured that if g ≥ 1, then every graph that can be drawn on the surface of a g-holed torus can be properly colored using at most χ(g) colors, where
\[ \chi(g) = \left\lfloor \frac{7 + \sqrt{1 + 48g}}{2} \right\rfloor, \]
and ⌊x⌋ is the greatest-integer function. Moreover, that bound is achieved by the complete graph $K_n$, and that graph has genus $\gamma(K_n)$ given by
\[ \gamma(K_n) = \left\lceil \frac{(n-3)(n-4)}{12} \right\rceil, \]
where ⌈x⌉ is the least integer ≥ x. This statement was formerly known as Heawood’s Conjecture, but is now called the Ringel-Youngs Theorem.
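The two formulas fit together in a way that is easy to explore numerically. In the sketch below (function names are assumptions for this example) both are evaluated in exact integer arithmetic: `isqrt` replaces the square root, which is safe because taking the integer part of the root does not change the final floor, and the ceiling is computed by negated floor division.

```python
from math import isqrt


def heawood_number(g):
    """chi(g) = floor((7 + sqrt(1 + 48 g)) / 2), computed with integers only."""
    return (7 + isqrt(1 + 48 * g)) // 2


def genus_complete_graph(n):
    """gamma(K_n) = ceil((n - 3)(n - 4) / 12), for n >= 3."""
    return -(-(n - 3) * (n - 4) // 12)


for g in range(1, 8):
    n = heawood_number(g)
    print(g, n, genus_complete_graph(n))
```

For each genus g the complete graph on χ(g) vertices has genus at most g, which is how the bound is attained; for the one-holed torus this is K_7.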
Proper colorings of maps are similarly defined: regions of a map that share a boundary
edge are given distinct colors. For the familiar one-holed torus of genus g = 1, Heawood’s
formula states that χ(1) = 7. Heawood constructed a map of seven mutually adjacent
hexagons on the torus, which is depicted in Figure 29. The dashed lines in the figure form
a rhombus, and one identifies—matches up—the opposite sides of the rhombus so that it
folds into a torus. This map has a proper 7-coloring, and so it achieves Heawood’s bound.
Figure 29: The Heawood graph on the torus.
Indeed the connection between colorings and coil diagrams is less surprising when you realize that Ringel and Youngs’ main tool was a trivalent current graph on which Kirchhoff’s current law requires x + y = z at each node.
Squaring the Square
Current graphs and Kirchhoff’s laws are also the basis of a method of solving the problem
of finding a perfect squared square. The problem is to find an integral square—one
whose edges have integer lengths—and tile it using only other integral squares such that
no two smaller squares are the same size. The problem is first recorded as being studied
by R. L. Brooks, C. A. B. Smith, A. H. Stone, and W. T. Tutte [18] while they were
undergraduates at Cambridge University. However, the earliest publication of a perfect
squared square was due to Roland Sprague [152] – the “Sprague” of the Sprague-Grundy
theorem from Section 2 and one of the pioneers in combinatorial game theory.
The order of a perfect squared square is the number of smaller squares that comprise
the tiling. Duijvestijn [49] has found a perfect squared square of order 21—the lowest
possible order, which is depicted in Figure 30.
Figure 30: Duijvestijn’s perfect squared square of order 21. The 21 squares have sides 27, 35, 50, 8, 19, 11, 15, 17, 6, 2, 9, 7, 24, 18, 25, 29, 16, 4, 42, 33, 37.
Duijvestijn [50] has since discovered pairs of perfect squared squares containing the same elements but differently arranged and has given examples of such squares whose largest element is not on the boundary.
“Squareorama 4”, a painting on quilted copper by the Blacksburg, VA artist Darcy
Meeker [111], is a beautiful image of Duijvestijn’s smallest perfect squared square.
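A quick arithmetic check on the tiling is possible from the side lengths alone. The sketch below takes the 21 numbers printed in Figure 30 as the sides of the small squares; the side of the large square is not stated in the text, so it is derived here from the total area, and the variable names are illustrative.

```python
from math import isqrt

# Side lengths of the 21 squares as they appear in Figure 30.
sides = [27, 35, 50, 8, 19, 11, 15, 17, 6, 2, 9, 7,
         24, 18, 25, 29, 16, 4, 42, 33, 37]

total_area = sum(s * s for s in sides)
big_side = isqrt(total_area)  # candidate side of the large square

print(len(sides), len(set(sides)), total_area, big_side)
print(big_side * big_side == total_area, 50 + 35 + 27 == big_side)
```

The areas add up to a perfect square, all 21 sides are distinct, and the three squares along the top edge (50, 35 and 27) span the full width.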
Figure 31: Left: Current graph corresponding to Duijvestijn’s perfect squared square. Right: Dual graph corresponding to Duijvestijn’s perfect squared square – identify points B and C.
With each squared square we may associate a graph called the current graph. The nodes of this graph correspond to the horizontal lines in the squared square, and two nodes are joined by an edge provided the corresponding horizontal lines include the top and bottom edges of one of the tiling squares. The edges are labeled with the side lengths of the corresponding squares. We also add an edge (dotted in the figure) joining the top and bottom nodes, because the corresponding horizontal lines are the top and bottom edges of the same square, namely the external square. The lengths of the sides of the squares may be regarded as the magnitudes of the currents flowing in the wires (edges) of the associated current graph. If the edges are directed in the downward direction, then the current diagram becomes a capacitated network. That is, for every node, the sum of the currents flowing into a node equals the sum flowing out of the node. The upper diagram in Figure 31 is the current diagram corresponding to Duijvestijn’s squared square. For example, the edges labeled 50, 35, and 27 are incident with the top vertex, corresponding to the side lengths of the three squares whose upper edges are included in the topmost horizontal line of the large square.
Notice that the lengths of the sides of the squares may be regarded as the magnitudes of
the currents flowing in the “wires” of the circuit, or, if these wires are thought of as having
unit resistance, then as potential differences between the various levels in the square. An
alternative current diagram can be drawn from left to right instead of from top to bottom.
This gives an example of a pair of dual graphs imbedded in the plane with each vertex
of one graph corresponding to a region of the other and corresponding edges intersecting
each other.
Euler’s Polyhedral Formula
We are able to construct these dual graphs precisely because the current graphs corresponding to perfect squared squares and rectangles are planar graphs. A graph is planar
provided it can be embedded in the plane with its vertices as distinct points and its edges
as smooth arcs, such that two edges meet, if at all, only in a common vertex. Moreover, a
planar graph partitions the plane into disjoint faces whose boundaries are edges.
One of the great theorems of mathematics concerns planar graphs, namely Euler’s
Polyhedral Formula. Euler proved that if G is a connected planar graph with v vertices,
e edges, and f faces, then v − e + f = 2. For example, in Figure 31, the upper graph
has v = 11 vertices, e = 22 edges (including the dotted edge joining the top and bottom
vertices) and f = 13 faces (the dotted edge being in the boundary of two faces). Sure
enough, v − e + f = 11 − 22 + 13 = 2.
Euler’s Polyhedral Formula also applies to polyhedra drawn on the surface of a sphere.
In fact, G is planar provided it can be embedded in a sphere, and since a sphere is a torus
with g = 0 holes, we can say that Euler’s Polyhedral Formula holds for connected graphs
of genus γ(G) = 0. For example, the familiar soccer ball clearly has genus 0; it has 12
pentagonal faces, 20 hexagonal faces, 90 edges, and 60 vertices, and once again we see that
v − e + f = 60 − 90 + 12 + 20 = 92 − 90 = 2. Proofs of Euler’s Polyhedral Formula can be
found in most elementary textbooks on graph theory or combinatorics; see, for example,
[169, p. 37].
It happens that there are analogs of Euler’s Polyhedral Formula for nonplanar (i.e.
nonspherical) graphs, namely that if G is a connected graph of genus g with v vertices, e
edges, and f faces, then v − e + f = 2 − 2g.
Heawood’s graph (Figure 29) is an embedding of seven mutually adjacent hexagons in
the torus, so we should expect that it satisfies v − e + f = 2 − 2 · 1 = 0. We see that
this graph has seven faces, and each face has six boundary edges. Now each edge is in the
boundary of exactly two faces, so there are 6 · 7/2 = 21 edges. Each vertex has degree
3 and the sum of the degrees of the vertices is equal to twice the number of edges – this
fact is sometimes called the Handshake Theorem – so there are 42/3 = 14 vertices. Sure
enough, v − e + f = 14 − 21 + 7 = 0 – as predicted by Euler’s Formula for graphs of genus
γ = 1.
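The counting argument above can be replayed numerically. The following is a minimal sketch, not part of the original exposition; the helper names `counts_from_faces` and `euler_characteristic` are ours, and the sketch assumes every vertex has the same degree, as in both examples.

```python
def euler_characteristic(v, e, f):
    """Return v - e + f for an embedded graph."""
    return v - e + f


def counts_from_faces(face_sizes, degree):
    """Derive (v, e, f) from the face sizes and a common vertex degree.

    Each edge borders exactly two faces, and the degrees sum to 2e
    (the Handshake Theorem).
    """
    f = len(face_sizes)
    e = sum(face_sizes) // 2
    v = 2 * e // degree
    return v, e, f


# Heawood's graph on the torus: seven hexagons, every vertex of degree 3.
heawood = counts_from_faces([6] * 7, 3)
# Soccer ball on the sphere: 12 pentagons and 20 hexagons, every vertex of degree 3.
soccer = counts_from_faces([5] * 12 + [6] * 20, 3)

print(heawood, euler_characteristic(*heawood))  # genus 1: compare with 2 - 2*1
print(soccer, euler_characteristic(*soccer))    # genus 0: compare with 2
```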
In Section 6, we described the problem of partitioning the numbers 1, 2, . . . , 3n into n
triples {x, y, z} such that x + y = z for each triple. There is a connection between these
triples and the so-called packing and covering problems for complete graphs, which we now
describe.
Packings and coverings of the complete graph
Let G and H be graphs. An H-packing of G is a collection of subgraphs {G1 , . . . , Gk }
of G such that (a) each Gi is isomorphic to H and (b) no two Gi share an edge of G. An
H-covering of G is a collection of subgraphs {G1 , . . . , Gk } of G such that (a) each Gi is
isomorphic to H and (b) every edge of G is in at least one of the Gi . An H-decomposition
of G is a collection of subgraphs {G1 , . . . , Gk } of G that form both an H-packing and an
H-covering of G.
Recall that the four triples {11, 1, 12}, {2, 6, 8}, {3, 7, 10}, and {5, 4, 9} partition the
numbers 1, 2, . . . , 12 into four triples {x, y, z} that each satisfy x + y = z. We view these
four triples as the “sides” of four different triangles, covering the twelve different “lengths”
of chords in a circle of 25 equally spaced points as shown in Figure 32. In this figure, the
sides of the triangle with vertices 20, 22, and 3 are labeled 2, 6 and 8, corresponding to
the fact that vertex 22 is 2 positions from vertex 20, 3 is 6 positions from 22, and 3 is 8
positions from 20 as we move clockwise around the regular 25-gon.
If we rotate the four triangles in Figure 32 into each of their 25 possible positions, we will
have used up all 25 chords of each of the 12 different lengths. This gives a decomposition
of K25 , the complete graph on 25 points, into 100 copies of the triangle K3 .
The resulting set of 100 triples has another interesting feature: namely, that each of
the 300 pairs of elements from the set {1, . . . , 25} belongs to exactly one of the 100 triples.
For example, the vertices 19 and 4 are ten places apart; the relevant triangle is a clockwise
rotation of triangle {14, 17, 24} into triangle {19, 22, 4}. Such sets of triples that display
this noteworthy symmetry are worthy of both a name and a closer look.
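The rotation argument is easy to check by machine. In the sketch below (our own illustration), each triple {x, y, z} with x + y = z is placed as a base triangle with vertices 0, x, and x + y, so that its chords have lengths x, y, and z; the names `BASE`, `rotations`, and `chord_length` are hypothetical.

```python
from itertools import combinations

N = 25
# Base triangles for the triples {11, 1, 12}, {2, 6, 8}, {3, 7, 10}, {5, 4, 9}.
BASE = [(0, 11, 12), (0, 2, 8), (0, 3, 10), (0, 5, 9)]


def chord_length(x, y, n=N):
    """Length of the chord joining x and y on a circle of n equally spaced points."""
    d = (x - y) % n
    return min(d, n - d)


def rotations(base, n=N):
    """All n rotations of every base triangle, as vertex sets mod n."""
    return [frozenset((x + j) % n for x in tri) for tri in base for j in range(n)]


triangles = rotations(BASE)

# Count how many triangles contain each pair of points.
cover = {}
for tri in triangles:
    for pair in combinations(sorted(tri), 2):
        cover[pair] = cover.get(pair, 0) + 1

base_lengths = sorted(chord_length(x, y) for tri in BASE for x, y in combinations(tri, 2))
print(base_lengths)
print(len(set(triangles)), len(cover), set(cover.values()))
```

If the construction is right, the base triangles use each chord length 1, ..., 12 once, and every one of the 300 pairs is covered by exactly one of the 100 triangles.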
Steiner triple systems
A Steiner triple system (or STS, for short) is a set V of n objects together with a
collection of 3-element subsets of V called blocks, such that each of the $\binom{n}{2}$ pairs of
objects from V belongs to exactly one block together. Steiner triple systems are our first
examples of combinatorial designs – that is, arrangements of finite sets into systems of
subsets that satisfy certain harmonious properties of balance and symmetry.
H. Hanani [76] constructs Steiner triple systems in several different ways, including the
following one that gives rise to that decomposition of K25 into 100 copies of K3 :
Figure 32: Four triangles whose edge-lengths cover the twelve different arc lengths 1, 2, . . . , 12.
{0, 11, 12}     {0, 2, 8}       {0, 3, 10}      {0, 5, 9}
{1, 12, 13}     {1, 3, 9}       {1, 4, 11}      {1, 6, 10}
{2, 13, 14}     {2, 4, 10}      {2, 5, 12}      {2, 7, 11}
{3, 14, 15}     {3, 5, 11}      {3, 6, 13}      {3, 8, 12}
...             ...             ...             ...
...             ...             {14, 17, 24}    ...
...             ...             ...             ...
...             {20, 22, 3}     ...             ...
...             ...             ...             ...
{23, 9, 10}     {23, 0, 6}      {23, 1, 8}      {23, 3, 7}
{24, 10, 11}    {24, 1, 7}      {24, 2, 9}      {24, 4, 8}
The boldface entries correspond to the four triangles depicted in Figure 32. It happens
that an STS on n objects exists if and only if n ≡ 1 or 3 (mod 6).
Now the smallest n for which we may partition the numbers {1, . . . , 3n} into triples that
solve the x + y = z problem is n = 1 – for, 1 + 2 = 3. It turns out that we can rotate the
resulting triangle {1, 2, 4} around a circle of seven equally spaced points, and that yields a
decomposition of K7 into triangles.
Figure 33: Packing K7 with seven copies of K3
The seven rotated triangles also give us the smallest nontrivial STS, namely
S = {{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 0}, {5, 6, 1}, {6, 0, 2}, {0, 1, 3}}.
Not only that, but the set {1, 2, 4} has another useful feature. Namely, every nonzero
integer mod 7 can be expressed as a difference of distinct elements of {1, 2, 4} in exactly
one way, because the mod-7 differences of the numbers 1, 2 and 4 are ±1, ±2, ±3. This
construction also gives us our first example of a large family of combinatorial designs called
difference sets that (a) are of independent interest and (b) have extensive connections to
other areas of combinatorics.
So, let’s talk about difference sets.
7
Difference sets and combinatorial designs
Difference sets
Let v, k, and λ be positive integers with v > k > λ, and let G be a v-element group.
A (v, k, λ) difference set is a k-element subset D = {d1 , . . . , dk } of G such that every
nonidentity element of G can be expressed as $d_i d_j^{-1}$ for di , dj ∈ D in exactly λ ways. If G is
abelian, we use additive notation and say that every nonzero element of G can be expressed
as a difference of distinct elements of D in exactly λ ways. A difference set is planar if
G = Zv and λ = 1. If n is a prime power, then there are planar difference sets with n + 1
elements whose nonzero differences yield all the nonzero residues (mod $n^2 + n + 1$). For
example:
$n$      $n^2 + n + 1$    planar difference set
2        7                {1, 2, 4}
3        13               {0, 1, 3, 9}
$2^2$    21               {0, 2, 7, 8, 11}
5        31               {0, 1, 3, 8, 12, 18}
7        57               {0, 4, 11, 20, 25, 26, 28, 38}
$2^3$    73               {0, 9, 19, 20, 22, 26, 34, 50, 55}
$3^2$    91               {0, 7, 12, 28, 32, 42, 43, 45, 51, 69}.
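Each row of the table can be checked directly from the definition. The sketch below is our own illustration for the cyclic case G = Zv only; the names `difference_counts`, `is_difference_set`, and `PLANAR` are hypothetical.

```python
from collections import Counter


def difference_counts(D, v):
    """Count how often each residue mod v arises as x - y with x != y in D."""
    return Counter((x - y) % v for x in D for y in D if x != y)


def is_difference_set(D, v, lam):
    """True if every nonzero residue mod v is a difference in exactly lam ways."""
    counts = difference_counts(D, v)
    return all(counts[d] == lam for d in range(1, v))


# The planar difference sets listed in the table, keyed by n.
PLANAR = {
    2: [1, 2, 4],
    3: [0, 1, 3, 9],
    4: [0, 2, 7, 8, 11],
    5: [0, 1, 3, 8, 12, 18],
    7: [0, 4, 11, 20, 25, 26, 28, 38],
    8: [0, 9, 19, 20, 22, 26, 34, 50, 55],
    9: [0, 7, 12, 28, 32, 42, 43, 45, 51, 69],
}

for n, D in PLANAR.items():
    v = n * n + n + 1
    print(n, v, len(D) == n + 1, is_difference_set(D, v, 1))
```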
It is a long-standing conjecture that there are no ($n^2 + n + 1$, n + 1, 1) difference sets if
n is not a prime power. Bruck and Ryser [25] proved that no such difference set exists if
n ≡ 1 or 2 mod 4 and if the square-free part of n contains at least one prime factor of the
form 4m + 3. In addition, it is known that there are no such difference sets for n = 6 and
n = 10, and as of November of 2016, that is where the conjecture stands.
Since there are infinitely many prime powers q, there are infinitely many difference
sets with λ = 1. The name planar suggests that there are connections between these
difference sets and geometric planes, and indeed that is the case, and we explore these
connections in Section 8. Here are several families of difference sets that were discovered
in the twentieth century.
1. The Paley or quadratic residue difference sets were constructed by R. E. A. C. Paley
[118] from properties of Hadamard matrices. Let v = p be a prime. The elements of
the difference set are the k = (p − 1)/2 nonzero squares mod p, and there are k(k − 1)
pairs whose differences yield λ copies of all the nonzero residues mod p. Hence,
$$(p - 1)\lambda = k(k - 1) = \frac{p - 1}{2} \cdot \frac{p - 3}{2} = \frac{(p - 1)(p - 3)}{4},$$
and since λ = (p − 3)/4 is an integer, we must have (v, k, λ) = (4n + 3, 2n + 1, n) for
some nonnegative integer n.
This restriction on the parameters is a necessary condition for the 2n + 1 nonzero
squares mod the prime p = 4n+3 to form such a difference set, and proving that those
nonzero squares mod p have this property is not difficult. Here are some examples:
p     (v, k, λ)       Paley difference set
7     (7, 3, 1)       {1, 2, 4}
11    (11, 5, 2)      {1, 3, 4, 5, 9}
19    (19, 9, 4)      {1, 4, 5, 6, 7, 9, 11, 16, 17}
23    (23, 11, 5)     {1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18}
31    (31, 15, 7)     {1, 2, 4, 5, 7, 8, 9, 10, 14, 16, 18, 19, 20, 25, 28}
43    (43, 21, 10)    {1, 4, 6, 9, 10, 11, 13, 14, 15, 16, 17, 21, 23, 24, 25, 31, 35, 36, 38, 40, 41}
Together with 0, the nonzero squares mod the prime p = 4n + 3 also form difference
sets with parameters (4n + 3, 2n + 2, n + 1).
(Query: If p = 4n + 3 is a prime, it appears that there are more squares mod p in the
interval [1, (p − 1)/2] than there are in the interval [(p + 1)/2, p − 1]. Is this true for all
primes p = 4n + 3, or is this another instance of the Strong Law of Small Numbers?)
2. There exist families of Paley-like difference sets comprised of quartic (fourth-power)
residues mod certain primes p ≡ 1 mod 4 with p > 4:
(i) The set Q of quartic residues mod p forms a (p, (p − 1)/4, (p − 5)/16) difference set
if and only if $p = 4a^2 + 1$ for some odd integer a. The four smallest such primes are
37, 101, 197, and 677; for p = 37, Q = {1, 7, 9, 10, 12, 16, 26, 33, 34}.
(ii) The set Q0 = Q ∪ {0} forms a (p, (p + 3)/4, (p + 3)/16) difference set if and only if
$p = 4b^2 + 9$ for some odd integer b. The four smallest such primes are 13, 109, 1453,
and 3373; for p = 13, Q0 = {0, 1, 3, 9}, which is also a planar difference set.
3. Projective geometry is the source of the Singer difference sets described by J. Singer
in [144] and having parameters
$$\left( \frac{q^{n+1} - 1}{q - 1},\ \frac{q^n - 1}{q - 1},\ \frac{q^{n-1} - 1}{q - 1} \right),$$
where q is a prime power. We look at these difference sets in the next section.
4. A Hadamard matrix of order n is an n × n matrix H with entries ±1 such that
$HH^{\mathrm{tr}} = nI$, where I is the identity matrix of order n. The Hadamard difference sets
are difference sets with parameters ($4n^2$, $2n^2 - n$, $n^2 - n$), and a necessary condition
for such a difference set to exist is the existence of a Hadamard matrix of order $4n^2$
whose rows and columns all contain the same number of ones. These difference sets
are a bit harder to come by, but the study of the (16, 6, 2) difference sets goes back
to the 1860s.
As the planar difference sets are connected with geometric planes, so are the (16, 6, 2)
difference sets connected with geometric biplanes. These are sets of points and lines
such that every line contains the same number of points and two distinct lines have
exactly two points in common. In 1869 (see [3]), Camille Jordan described a configuration in one of Kummer’s quartic surfaces, consisting of sixteen points and sixteen
tangent planes, each plane containing six points and each pair of planes having exactly two points in common. Such a configuration is called a (16, 6, 2) block design,
and Jordan described it as a 4 × 4 matrix, in which the points are entries x and the
planes are determined by the entry positions in the row and column of x other than
itself. See below: the bold-faced entries 1, 5, 8, 10, 11 and 13 in the matrix on the
left are the points in the plane corresponding to the entry 9.
 0  1  2  3        0000 0001 0010 0011
 4  5  6  7        0100 0101 0110 0111
 8  9 10 11        1000 1001 1010 1011
12 13 14 15        1100 1101 1110 1111
What is interesting about this matrix is that replacing the entries in the left matrix
by their corresponding nimbers yields a (16, 6, 2) difference set in the additive group
of nimbers in the field F16 , described in Section 2, namely
D = {0001, 0101, 1000, 1010, 1011, 1101}.
You can check that for every nimber m ∈ F16 , the set D ⊕ m is also a difference set.
In addition,
(i) Q ⊕ Z2 , the direct sum of the quaternion group Q = {±1, ±i, ±j, ±k} and Z2 ,
contains the difference set D1 = {(1, 0), (i, 0), (j, 0), (k, 0), (1, 1), (−1, 1)}, and
(ii) Z2 ⊕ Z8 contains the difference set D2 = {(0, 0), (0, 1), (1, 0), (1, 2), (1, 5), (1, 6)}.
Finally, there is more about Hadamard matrices in Section 10.
5. The Whiteman difference sets were constructed by A. L. Whiteman [176] as analogs
of the Paley difference sets, with v a product of two primes and fourth-powers taking
the role of Paley’s squares. Whiteman difference sets have parameters
$$\left( pq,\ \frac{pq - 1}{4},\ \frac{pq - 5}{16} \right),$$
where p and q are distinct primes such that p < q and (p − 1)/4 and (q − 1)/4 are
relatively prime. Set d = (p − 1)(q − 1)/4. If g is a primitive root of both p and q
– that is, {1, . . . , p − 1} = {$g^m$ : 1 ≤ m ≤ p − 1} and the analogous equality holds
for q – then the difference set is {1, g, $g^2$, . . . , $g^{d-1}$, 0, q, . . . , (p − 1)q} mod pq. Such a
difference set exists if and only if (pq − 1)/4 is a perfect square and q = 3p + 2, and
as of this writing, only two such examples have been found, namely
p = 17, q = 53 : (v, k, λ) = (901, 225, 56) and
p = 46817, q = 140453 : (v, k, λ) = (6575588101, 1643897025, 410974256).
The search continues.
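Several of the families in this list can be spot-checked in a few lines. The sketch below is our own illustration, not taken from the cited papers: it generates the power residues mod a prime, tests the difference-set property in an abelian group supplied as a list of elements together with a subtraction function, and evaluates the Whiteman parameter formulas for the two pairs (p, q) quoted above. The names `is_difference_set`, `power_residues`, and `cyclic` are hypothetical.

```python
from collections import Counter


def is_difference_set(D, elements, sub, lam):
    """True if every non-identity element arises exactly lam times as sub(x, y)."""
    counts = Counter(sub(x, y) for x in D for y in D if x != y)
    identity = sub(D[0], D[0])
    return all(counts[g] == lam for g in elements if g != identity)


def power_residues(p, e):
    """Nonzero e-th powers mod the prime p, sorted."""
    return sorted({pow(x, e, p) for x in range(1, p)})


def cyclic(v):
    """Elements and subtraction of the cyclic group Z_v."""
    return range(v), (lambda x, y: (x - y) % v)


# Paley sets: nonzero squares mod p = 4n + 3, parameters (4n + 3, 2n + 1, n).
for p in (7, 11, 19, 23, 31, 43):
    n = (p - 3) // 4
    print(p, is_difference_set(power_residues(p, 2), *cyclic(p), n))

# Quartic residues mod 37 = 4 * 3^2 + 1, and Q0 = Q with 0 adjoined mod 13 = 4 * 1^2 + 9.
Q37 = power_residues(37, 4)
Q0_13 = [0] + power_residues(13, 4)
print(Q37, is_difference_set(Q37, *cyclic(37), 2))
print(Q0_13, is_difference_set(Q0_13, *cyclic(13), 1))

# The (16, 6, 2) set of nimbers: the group operation is bitwise XOR.
D16 = [0b0001, 0b0101, 0b1000, 0b1010, 0b1011, 0b1101]
print(is_difference_set(D16, range(16), lambda x, y: x ^ y, 2))

# Whiteman parameters (pq, (pq - 1)/4, (pq - 5)/16) for the two known pairs.
for p, q in ((17, 53), (46817, 140453)):
    v = p * q
    print(q == 3 * p + 2, (v, (v - 1) // 4, (v - 5) // 16))
```

The Whiteman sets themselves are not constructed here; only their parameters are computed.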
Multipliers
Let D = {x1 , . . . , xk } be a (v, k, λ) difference set mod v. A multiplier of D is an integer m
such that mD = {mxi (mod v) : i = 1, . . . , k} is equal to a translation D + r (mod v) for
some integer r. (Necessarily, m and v are relatively prime, for if not, then multiplication
by m (mod v) would not preserve the size of D.) For example, D = {2, 3, 5} is a (7, 3, 1)
difference set, and 2D mod 7 ≡ D + 1 mod 7 = {3, 4, 6}. Thus, 2 is a multiplier of this
(7, 3, 1) difference set. Similarly, 3 is a multiplier of the (11, 5, 2) biplane difference set
D11 = {1, 3, 4, 5, 9} – but with a twist: 3D11 ≡ D11 mod 11. This latter is a special case
of a very general theorem, first proved by Marshall Hall, which we now state.
The Multiplier Theorem. Let D be a (v, k, λ) difference set, and let p be a prime
such that (p, v) = 1, p > λ, and p is a divisor of k − λ. Then:
(a) p is a multiplier of D, and
(b) there exists j such that p · (D + j) ≡ D + j mod v. More generally, if there is
a (v, k, λ)-difference set D with a multiplier m, then there is a difference set D′ on these
parameters such that D′ ≡ mD′ mod v.
Now, by the Multiplier Theorem, if there is a (21, 5, 1) difference set D, then 2 is a
multiplier of D. But is knowing that a certain integer m might be a multiplier of some
(currently unknown) (v, k, λ) difference set enough to construct such a difference set? The
answer is that sometimes this knowledge is enough to find a difference set, and sometimes
this knowledge is enough to prove that no such difference set exists. In order to use the
Multiplier Theorem, we need to talk about permutations and orbits.
Recall that a permutation on a set S is a 1-1 mapping of the set onto itself. If f is
a permutation on S, and x ∈ S, then the orbit of f containing x is the set of iterated
images {x, f (x), f (f (x)), . . .}. For example, let S = {1, 2, 3, 4, 5}, and define π by π(1) = 3,
π(2) = 5, π(3) = 4, π(4) = 1, π(5) = 2. One can see that the orbits of π are {1, 3, 4} and
{2, 5}. As noted above, if (m, v) = 1, then the mapping x ↦ mx mod v is a permutation
on Zv . Let’s look at these orbits for a few special examples.
Example 1. The orbits of x ↦ 2x mod 7 on Z7 are {0}, {1, 2, 4}, and {3, 6, 5}, and
{1, 2, 4} is a (7, 3, 1) difference set fixed by this map.
Example 2. The orbits of x ↦ 3x mod 11 on Z11 are {0}, {1, 3, 9, 5, 4}, and {2, 6, 7, 10, 8},
and {1, 3, 9, 5, 4} is an (11, 5, 2) difference set fixed by the given mapping.
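Orbits of this kind are quick to compute. The following sketch (our own; the function name `orbits` is hypothetical) lists each orbit in the order in which repeated multiplication visits its elements, and reproduces Examples 1 and 2.

```python
from math import gcd


def orbits(m, v):
    """Orbits of the permutation x -> m*x mod v on Z_v (requires gcd(m, v) = 1)."""
    if gcd(m, v) != 1:
        raise ValueError("m must be relatively prime to v")
    seen, result = set(), []
    for start in range(v):
        if start in seen:
            continue
        orbit, x = [], start
        # Follow x, m*x, m^2*x, ... until the orbit closes up.
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = (m * x) % v
        result.append(orbit)
    return result


print(orbits(2, 7))   # [[0], [1, 2, 4], [3, 6, 5]]
print(orbits(3, 11))  # [[0], [1, 3, 9, 5, 4], [2, 6, 7, 10, 8]]
```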
From this, we may conclude that if a (v, k, λ) difference set D is fixed by a multiplier m,
then D is a union of orbits of the map x ↦ mx mod v. Therefore, if v, k, and λ satisfy the
relation k(k − 1) = λ(v − 1), and p satisfies the conditions in the Multiplier Theorem, then
the set of orbits of x ↦ px mod v just might contain a (v, k, λ) difference set. For such a
setup, look through that set of orbits and find some of them whose union (a) contains k
elements and (b) produces a (v, k, λ) difference set.
The Multiplier Theorem tells us that if D is a (21, 5, 1) difference set, then 2 is a
multiplier of D, and so it fixes a translate of D. The orbits of x ↦ 2x mod 21 are
{0}, {1, 2, 4, 8, 16, 11}, {3, 6, 12}, {5, 10, 20, 19, 17, 13}, {7, 14}, and {9, 18, 15}. We find that
{3, 6, 7, 12, 14} = {3, 6, 12} ∪ {7, 14} is indeed a (21, 5, 1) difference set.
The orbits of x ↦ 2x mod 15 are {0}, {1, 2, 4, 8}, {3, 6, 12, 9}, {5, 10}, and {7, 14, 13, 11},
and {0, 1, 2, 4, 8, 5, 10} is a (15, 7, 3) difference set.
Using this method led to the discovery of these difference sets:
{1, 5, 25, 17, 22, 23} is a (31, 6, 1) difference set with multiplier 5.
{1, 7, 9, 10, 12, 16, 26, 33, 34} is a (37, 9, 2) difference set with multiplier 7.
{0, 1, 3, 5, 9, 15, 22, 25, 26, 27, 34, 35, 38} is a (40, 13, 4) difference set with multiplier 3.
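The search just described can be automated for small parameters. The sketch below is a brute-force illustration under our own naming (`orbits`, `is_difference_set`, `search`); it tries every union of orbits of x ↦ mx mod v with exactly k elements and keeps those that pass the difference-set test. It makes no claim to be the method by which the sets above were originally found.

```python
from collections import Counter
from itertools import combinations


def orbits(m, v):
    """Orbits of x -> m*x mod v on Z_v, assuming gcd(m, v) = 1."""
    seen, result = set(), []
    for start in range(v):
        orbit, x = [], start
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = (m * x) % v
        if orbit:
            result.append(orbit)
    return result


def is_difference_set(D, v, lam):
    """True if every nonzero residue mod v is a difference in exactly lam ways."""
    counts = Counter((x - y) % v for x in D for y in D if x != y)
    return all(counts[d] == lam for d in range(1, v))


def search(v, k, lam, m):
    """All unions of orbits of x -> m*x mod v that are (v, k, lam) difference sets."""
    orbs = orbits(m, v)
    found = []
    for r in range(1, len(orbs) + 1):
        for choice in combinations(orbs, r):
            if sum(len(o) for o in choice) != k:
                continue
            D = sorted(x for o in choice for x in o)
            if is_difference_set(D, v, lam):
                found.append(D)
    return found


print(search(21, 5, 1, 2))
print(search(15, 7, 3, 2))
```

The number of unions grows exponentially with the number of orbits, so this approach is only practical when the multiplier has few orbits.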
But we can also use this method to disprove the existence of certain difference sets. For
example, if a (31, 10, 3) difference set were to exist, then 7 would be a multiplier. However,
the map x ↦ 7x mod 31 has one orbit of size 1 and two of size 15. No union of these can be
of size 10, and so a (31, 10, 3) difference set does not exist. Similarly, a (56, 11, 2) difference
set does not exist. The map x ↦ 3x mod 56 does have orbits with unions of size 11, but
none of them gives rise to such a difference set. As for a (43, 7, 1) difference set, there are
three orbits for m = 2 of size 14 and one of size 1, and there is one orbit for m = 3 of size
42 and one of size 1. Thus, there is no (43, 7, 1) difference set.
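The size obstruction in these examples is a small subset-sum question: can some collection of orbit sizes add up to k? A minimal sketch, with hypothetical helper names `orbit_sizes` and `reachable`:

```python
from math import gcd


def orbit_sizes(m, v):
    """Sorted sizes of the orbits of x -> m*x mod v on Z_v."""
    if gcd(m, v) != 1:
        raise ValueError("m must be relatively prime to v")
    seen, sizes = set(), []
    for start in range(v):
        if start in seen:
            continue
        size, x = 0, start
        while x not in seen:
            seen.add(x)
            size += 1
            x = (m * x) % v
        sizes.append(size)
    return sorted(sizes)


def reachable(sizes, k):
    """True if some sub-collection of the orbit sizes sums to exactly k."""
    sums = {0}
    for s in sizes:
        sums |= {t + s for t in sums if t + s <= k}
    return k in sums


for v, k, m in ((31, 10, 7), (43, 7, 2), (43, 7, 3), (56, 11, 3)):
    sizes = orbit_sizes(m, v)
    print((v, k, m), sizes, reachable(sizes, k))
```

When `reachable` returns False, no union of orbits has k elements and the difference set cannot exist. When it returns True, as for (56, 11, 2), the candidate unions still have to be tested one by one.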
We see that multipliers can make a difference.
Difference sets, de Bruijn sequences, de Bruijn graphs, and n-cubes
Here is an interesting observation about the (7, 3, 1) difference set S = {1, 2, 4}. Replace an
element x in the sequence 1, 2, 3, 4, 5, 6, 7 by 1 if x ∈ S and by 0 if not; this yields the string
of bits 1101000. Augment this string by prepending a 1, make the ordering circular, and
observe that the resulting cyclic sequence contains every possible three-bit string exactly
once:
111 110 101 010 100 000 001 011
A de Bruijn sequence of order n is a string B of $2^n$ bits such that if B is given the cyclic
ordering, then every bit string a1 a2 . . . an of length n occurs exactly once consecutively in
B. Thus, the augmented string 11101000 is a de Bruijn sequence of order 3. The same operation on the (15, 7, 3)
difference set {0, 1, 2, 4, 5, 8, 10}, applied to the sequence 0, 1, . . . , 14, produces the string 111011001010000, and
prepending a 1 gives 1111011001010000, which is a de
Bruijn sequence of order 4. More generally, using this process on the Singer difference sets
with parameters ($2^n - 1$, $2^{n-1} - 1$, $2^{n-2} - 1$) – and we’ll meet them in Section 8 – constructs
de Bruijn sequences of order n for all n ≥ 3.
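Both small cases can be confirmed by listing the cyclic windows. In this sketch (our own; `indicator_string` and `is_de_bruijn` are hypothetical names) the indicator string of the difference set is built first, and the de Bruijn property is tested after a 1 is prepended.

```python
def indicator_string(D, positions):
    """Write 1 for each position that lies in D and 0 otherwise."""
    return "".join("1" if x in D else "0" for x in positions)


def is_de_bruijn(bits, n):
    """True if the cyclic string bits contains every n-bit word exactly once."""
    if len(bits) != 2 ** n:
        return False
    doubled = bits + bits[: n - 1]  # unroll the cyclic ordering
    windows = {doubled[i : i + n] for i in range(len(bits))}
    return len(windows) == 2 ** n


s3 = indicator_string({1, 2, 4}, range(1, 8))
s4 = indicator_string({0, 1, 2, 4, 5, 8, 10}, range(15))
print(s3, is_de_bruijn("1" + s3, 3))
print(s4, is_de_bruijn("1" + s4, 4))
```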
A de Bruijn graph Dn of order n is a directed graph whose vertices are the binary
strings of length n − 1; thus, Dn has $2^{n-1}$ vertices. There is an edge directed from vertex
x to vertex y if and only if the last n − 2 bits of x are identical to the first n − 2 bits of y.
Thus, 1101 is adjacent to both 1010 and 1011, and 1101 is adjacent from both 0110 and
1110. For example, the graph on the left in Figure 34 is a de Bruijn graph of order 3.
More generally, for each vertex v, there are two edges directed toward v and two edges
directed away from v, so that each vertex has both out-degree and in-degree 2. Thus, Dn
has a directed Euler circuit – that is, a directed walk beginning and ending at any given
vertex and passing through each edge exactly once. To find such a directed Euler circuit,
label the edge from b1 . . . bn−1 to b2 . . . bn with the label bn . A de Bruijn cycle of order
n gives an ordering of the labeled edges of Dn that yields a directed Euler circuit. For
example, beginning at vertex 00 in Figure 34 and following the edges labeled 1, 1, 1, 0, 1, 0, 0,
and 0 in that order will produce a directed Euler circuit in D3 .
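The walk described above can be traced mechanically. The sketch below (our own; the function name `walk` is hypothetical) records each edge as a (tail, head) pair, which identifies it uniquely because the two edges leaving a vertex end at different vertices.

```python
def walk(start, labels):
    """Follow labelled edges of the de Bruijn graph, starting from the vertex start.

    The edge labelled b leaves b1...b(n-1) and enters b2...b(n-1)b.
    Returns the traversed edges as (tail, head) pairs.
    """
    edges, vertex = [], start
    for b in labels:
        nxt = vertex[1:] + b
        edges.append((vertex, nxt))
        vertex = nxt
    return edges


edges = walk("00", "11101000")
print(edges)
# A directed Euler circuit of D3 uses each of the 2^3 edges once and returns to its start.
print(len(set(edges)) == 8, edges[-1][1] == "00")
```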
Figure 34: An Eulerian circuit in the de Bruijn graph D3 of order 3: labels follow the
augmented sequence 11101000 defined by the (7, 3, 1) difference set.
The n-cube Qn is a graph whose vertices are the bit strings of length n; two vertices
are adjacent if and only if the strings differ in exactly one place. For all n ≥ 2, Qn has a
Hamiltonian cycle, i.e. a closed path that goes through every vertex once and returns to
its starting point. Alternate vertex labelings for Qn can be found by labeling a vertex with
the string 11 . . . 1 and following any Hamiltonian cycle, labeling according to the ordering
of any de Bruijn sequence of order n. For n = 3, Figure 35 shows the vertices of Q3 labeled
according to the de Bruijn sequence of order 3 mentioned above.
Figure 35: A Hamiltonian cycle in the 3-cube Q3 : vertex labeling follows the de Bruijn sequence 11101000 defined by the (7, 3, 1) difference set.
If D is a (v, k, λ) difference set mod v, then the following is true about the set of
translates {D + j mod v : j = 0, 1, . . . , v − 1}. Namely, each of the v translates contains k
elements, each element of Zv belongs to k translates, and each pair of elements is together
in exactly λ translates. This is an example of a structure called a block design, and we
now present a brief introduction to these most beautiful and useful combinatorial designs.
Block designs
Let v, b, r, k and λ be positive integers, with v > k. A balanced incomplete block
design (or BIBD) with parameters (v, b, r, k, λ) is an arrangement of b subsets (or blocks)
of a v-element set V of elements (or points), such that each block contains k points, each
point appears in r blocks, and each pair of points appears together in exactly λ blocks.
Unless otherwise specified, we refer to BIBDs as block designs.
The parameters are not independent, for we can show that bk = vr and r(k − 1) =
λ(v − 1). First, there are two ways to count the number of pairs {B, x} such that the block
B contains the point x. Each of the b blocks contains k points, making bk pairs in all, and
each of the v points appears in r blocks, making vr pairs in all. Thus, bk = vr. Next, fix a
point x: there are two ways to count the number of pairs {B, y} such that x and y appear
together in a block B. The point x is contained in r blocks and each such block contains
k − 1 other points; also, the point x appears with another point y in λ blocks and there
are v − 1 points y ≠ x in all. It follows that r(k − 1) = λ(v − 1). Thus, the parameters
v, k, and λ are enough to specify a block design and so we may speak of a (v, k, λ) design.
A Steiner Triple System is a BIBD with k = 3 and λ = 1. For example, the decomposition of
K25 into triangles is a (25, 3, 1) STS with b = 100 blocks, and the smallest nontrivial STS described earlier
is also a block design. Thus, the seven triples
{{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 0}, {5, 6, 1}, {6, 0, 2}, {0, 1, 3}}
are the blocks of a (7, 3, 1) block design.
The ways of constructing block designs are many and varied, coming from algebra,
number theory, geometry, and statistics, among other fields. A large class of methods
comes from number theory, and one number-theoretic method uses difference sets. We
have seen that the set D = {1, 2, 4} is a (7, 3, 1) difference set mod 7. The translates
D + j = {1 + j, 2 + j, 4 + j} mod 7 are also (7, 3, 1) difference sets, where j is an integer
mod 7. For example, D + 5 ≡ {1 + 5, 2 + 5, 4 + 5} ≡ {6, 0, 2} mod 7, and in computing
the six differences between distinct elements of D + 5, the 5s cancel out. It is not hard to
show that if v, k, and λ are integers such that D is a (v, k, λ) difference set, then – as we
just mentioned – the v sets {D + j mod v : 0 ≤ j ≤ v − 1} form a (v, k, λ) block design.
Similarly, the eleven translates of the (11, 5, 2) biplane difference set {1, 3, 4, 5, 9} mod 11
form an (11, 5, 2) biplane block design. We will meet this design again when we see how it
relates to the game of Mock Turtles, Golay codes, and Mathieu groups.
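The passage from a difference set to a block design can be verified by direct counting. The sketch below is our own illustration with hypothetical names `translates` and `design_parameters`; it tallies how often each point and each pair of points occurs among the v translates.

```python
from itertools import combinations


def translates(D, v):
    """The v translates D + j mod v, as sets."""
    return [frozenset((x + j) % v for x in D) for j in range(v)]


def design_parameters(blocks, v):
    """Return (b, set of replication numbers r, set of pair counts lambda)."""
    point_count = {x: 0 for x in range(v)}
    pair_count = {pair: 0 for pair in combinations(range(v), 2)}
    for B in blocks:
        for x in B:
            point_count[x] += 1
        for pair in combinations(sorted(B), 2):
            pair_count[pair] += 1
    return len(blocks), set(point_count.values()), set(pair_count.values())


for D, v in (([1, 2, 4], 7), ([1, 3, 4, 5, 9], 11)):
    b, r, lam = design_parameters(translates(D, v), v)
    print(v, b, r, lam)
```

A single value in each of the last two sets means that the translates form a design, and the relations bk = vr and r(k − 1) = λ(v − 1) can then be read off.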
And now, let’s talk about some ways of constructing block designs using ideas from
geometry – specifically, from finite geometries, both projective and affine.
8
Geometric connections
Finite projective geometries and Singer designs
The (7, 3, 1) block design is also the simplest nontrivial example of a finite projective
geometry, namely the projective plane of order 2—also known as the Fano configuration
(see [54]), although it goes back to a question posed by Woolhouse in 1844 [178] and
answered by Kirkman in 1847 [89]. A projective plane consists of a set of points and a
set of lines satisfying these axioms.
1. Every pair of points lie on a unique line.
2. Every pair of lines intersect in a unique point.
3. There exist at least four points such that no three lie on a line.
4. There exist at least four lines such that no three intersect in a point.
More generally, let Fq be a finite field, let n be a positive integer and let Un =
{(x0 , x1 , . . . , xn ) : xi ∈ Fq } − {(0, . . . , 0)} be the set of all nonzero (n + 1)-tuples with elements in the field Fq . Define an equivalence relation ∼ on Un by (x0 , . . . , xn ) ∼ (y0 , . . . , yn )
provided there exists a nonzero constant λ such that xi = λyi for all i. We define the n-dimensional projective geometry P G(n, q) over Fq to be the set of all ∼-equivalence
classes in Un . We also refer to these geometries as n-dimensional projective spaces
and use the terms interchangeably.
A point of P G(n, q) is an equivalence class of (n + 1)-tuples. For an example, consider
the space P G(3, 5) of dimension 3 over the 5-element field. The nonzero scalar multiples
of p = (1, 4, 3, 3) are p itself, 2p = (2, 3, 1, 1), 3p = (3, 2, 4, 4), and 4p = (4, 1, 2, 2), and
so in P G(3, 5), p represents the class of its nonzero multiples. (The same letter refers to
both the element and its equivalence class – the key is to remember that scalar multiples
represent the same class.) The lattice of subspaces of P G(n, q) corresponds to the lattice
of subspaces of $F_q^{n+1}$.
There are $q^{n+1} - 1$ nonzero vectors in $F_q^{n+1}$, and each nonzero vector is in a ∼-equivalence
class containing q − 1 scalar multiples. Hence, P G(n, q) contains $(q^{n+1} - 1)/(q - 1)$ elements,
which are the points.
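For a prime q, where arithmetic in Fq is just arithmetic mod q, the points can be enumerated by choosing one representative from each class. The sketch below is our own illustration and assumes q is prime (prime powers would need genuine finite-field arithmetic); the names `pg_points` and `normalize` are hypothetical, and the representative chosen is the scalar multiple whose first nonzero coordinate equals 1.

```python
from itertools import product


def pg_points(n, q):
    """Canonical representatives of the points of PG(n, q), for q prime."""
    points = []
    for vec in product(range(q), repeat=n + 1):
        first = next((x for x in vec if x != 0), None)
        if first == 1:  # skips the zero vector and non-canonical multiples
            points.append(vec)
    return points


def normalize(vec, q):
    """Scale a nonzero vector so that its first nonzero coordinate is 1."""
    first = next(x for x in vec if x != 0)
    inv = pow(first, -1, q)  # modular inverse (Python 3.8 or later)
    return tuple((inv * x) % q for x in vec)


pts = pg_points(3, 5)
print(len(pts), (5 ** 4 - 1) // (5 - 1))
print({normalize(p, 5) for p in [(1, 4, 3, 3), (2, 3, 1, 1), (3, 2, 4, 4), (4, 1, 2, 2)]})
```

The four scalar multiples from the P G(3, 5) example should all normalize to the same representative.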
In [15], R. C. Bose proved the following theorem about an interesting class of block
designs, now known as Singer designs after their discoverer James Singer, who first
described them in [144].
Bose’s Theorem. The points and (n − 1)-dimensional subspaces of P G(n, q) are the
points and blocks, respectively, of a
$$\left( \frac{q^{n+1} - 1}{q - 1},\ \frac{q^n - 1}{q - 1},\ \frac{q^{n-1} - 1}{q - 1} \right)$$
symmetric balanced incomplete block design.
To see why this is so, we first review some linear algebra. Let H be a d-dimensional
subspace of an n-dimensional vector space. The (right) null-space H ⊥ of H is the set of
vectors v for which Hv = 0, the zero vector – left null-spaces are defined analogously – and
the nullity of H is the dimension of H ⊥ . The Rank-Nullity Theorem – also known as the
Fundamental Theorem of Linear Algebra – tells us that the dimension of H ⊥ is equal to
n − d. Thus, if B is an n-dimensional subspace of $F_q^{n+1}$, then by the Rank-Nullity Theorem,
B⊥ has dimension 1.
Now let K be a d-dimensional subspace of P G(n, q). Then K corresponds to a (d + 1)-dimensional
subspace of $F_q^{n+1}$, so its null-space has dimension n + 1 − (d + 1) = n − d.
Projectively, this null-space corresponds to an (n − d − 1)-dimensional subspace of P G(n, q).
In particular, if K is an (n − 1)-dimensional subspace of P G(n, q), then its null-space has
dimension n − (n − 1) − 1 = 0. In short, the null-space of an (n − 1)-dimensional subspace of
P G(n, q) is a point, and it follows that distinct (n − 1)-dimensional subspaces have distinct
null-spaces. Hence, the points and the (n − 1)-dimensional subspaces – and these will be
our blocks – are in one-to-one correspondence, and there are $v = (q^{n+1} - 1)/(q - 1)$ of each.
Let B be the block whose null-space is the point (a0 , . . . , an ). Then every element
w = (x0 , . . . , xn ) ∈ B satisfies $\sum_{i=0}^{n} a_i x_i = 0$; without loss of generality, suppose a0 ≠ 0.
Then each of the $q^n - 1$ nonzero choices of x1 , . . . , xn determines a unique value of x0 .
However, since the q − 1 scalar multiples of a solution vector w are considered the same,
we divide out by that quantity and see that a block contains $k = (q^n - 1)/(q - 1)$ points.
A similar argument shows that every point is contained in k blocks.
Finally, two blocks are either equal or intersect in an (n − 2)-dimensional subspace of
P G(n, q), and repeating the above argument shows that the intersection of distinct blocks
has $\lambda = (q^{n-1} - 1)/(q - 1)$ points, and each pair of distinct points belongs to λ blocks.
In short, the collection of subspaces of dimension n − 1 in P G(n, q) forms a symmetric
(v, k, λ) block design whose elements are the points of P G(n, q), and this completes the
proof of Bose’s Theorem. These are called Singer designs after James Singer (1906–1976),
a 1931 Ph.D. from Princeton with a dissertation directed by the eminent topologist J. W.
Alexander. Singer was on the mathematics faculty of Brooklyn College from 1936 to 1974,
and was by all accounts an influential and beloved teacher. He became interested in finite
projective geometry, and in his 1938 paper [144], Singer proved the following theorem.
Singer’s Theorem. Let D be an (n − 1)-dimensional subspace of P G(n, q). Then
there is a bijective transformation carrying the $v = (q^{n+1} - 1)/(q - 1)$ points of P G(n, q)
onto the integers {0, 1, . . . , v − 1} in such a way that the resulting integers corresponding
to the $k = (q^n - 1)/(q - 1)$ points of D have the following property. Namely, every nonzero
integer mod v can be expressed as the difference between distinct elements of D in exactly
$\lambda = (q^{n-1} - 1)/(q - 1)$ ways.
In short, Singer proved that with v, k and λ as above, each block of a Singer (v, k, λ)
design is a (v, k, λ) difference set, a circumstance hinted at in Section 7.
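Bose’s Theorem can be checked by brute force in small cases. The sketch below is our own illustration and again assumes q is prime, so that field arithmetic is arithmetic mod q. It takes as blocks the hyperplanes a · x = 0, one for each point a, and tabulates block sizes, pairwise block intersections, and the number of blocks through each pair of points. The names `pg_points`, `hyperplane_design`, and `parameters` are hypothetical, and the sketch does not construct Singer’s bijection onto the integers mod v.

```python
from itertools import combinations, product


def pg_points(n, q):
    """Points of PG(n, q), q prime: tuples whose first nonzero coordinate is 1."""
    return [vec for vec in product(range(q), repeat=n + 1)
            if next((x for x in vec if x != 0), None) == 1]


def hyperplane_design(n, q):
    """Points of PG(n, q) and the hyperplanes a . x = 0, one for each point a."""
    pts = pg_points(n, q)
    blocks = [frozenset(x for x in pts
                        if sum(ai * xi for ai, xi in zip(a, x)) % q == 0)
              for a in pts]
    return pts, blocks


def parameters(pts, blocks):
    """Return (v, block sizes, block intersection sizes, blocks per pair of points)."""
    sizes = {len(B) for B in blocks}
    meets = {len(B & C) for B, C in combinations(blocks, 2)}
    joins = {sum(1 for B in blocks if x in B and y in B)
             for x, y in combinations(pts, 2)}
    return len(pts), sizes, meets, joins


for n, q in ((2, 2), (2, 3), (3, 2)):
    v = (q ** (n + 1) - 1) // (q - 1)
    k = (q ** n - 1) // (q - 1)
    lam = (q ** (n - 1) - 1) // (q - 1)
    print((n, q), (v, k, lam), parameters(*hyperplane_design(n, q)))
```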
A symmetric design on V = {0, . . . , v − 1} is called cyclic if the v blocks are the v
translates D + i mod v of some fixed block D. Singer’s next theorem tells us that in such a
design, every block is a difference set.
Theorem. Let B = {B0 , B1 , . . . , Bv−1 } be the blocks of a cyclic (v, k, λ) design on the
set of points V = {0, 1, . . . , v − 1}. Then each block Bi is a (v, k, λ) difference set.
Let’s prove this.
To simplify the proof, we assume that 0 ∈ B0 . Thus, B0 = {x1 , . . . , xk−1 , 0} for some
xi ∈ V with xi ≠ 0, 1 ≤ i ≤ k − 1. We show that B0 is a (v, k, λ) difference set. Since the design
is symmetric, each point is in k blocks and there are v blocks in all. Since the design is
cyclic, we see that Bj = {x1 + j, . . . , xk−1 + j, j} for 0 ≤ j ≤ v − 1.
Now let d be any nonzero element of V . Then d and 0 are in exactly λ blocks together
– that is, for λ values of j, d, 0 ∈ Bj . A block has no repeated elements, so if d, 0 ∈ Bj ,
then d = xr + j and 0 = xs + j for distinct xr , xs ∈ B0 . Thus, d = d − 0 = xr − xs for
exactly λ pairs (xr , xs ) of distinct elements of B0 . Since d was arbitrary, it follows that
every nonzero number mod v can be expressed as a difference of elements of B0 in exactly
λ ways. In short, B0 is a (v, k, λ) difference set.
Now, if a, b ∈ Bj , then a = xr + j, b = xs + j for xr , xs ∈ B0 , and so a − b =
(xr + j) − (xs + j) = xr − xs . Hence, the differences of elements in Bj are the same as the
differences of elements in B0 , and so each block Bj is a (v, k, λ) difference set, as claimed.
More generalization is possible. In fact, there is a way to make the Singer block designs
contained in P G(n, q) into cyclic designs. Proving this is tedious, so we will not pursue it.
For a proof, see [172, pp. 79-82].
Let’s return to the Fano configuration, depicted in Figure 36. We see that the “lines”
are the seven triples that form the (7, 3, 1) block design
S = {{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 0}, {5, 6, 1}, {6, 0, 2}, {0, 1, 3}}
(counting {2, 3, 5} as a line) and the “points” are the numbers 0, 1, . . . , 6.
Now these axioms are self-dual: they describe the same geometry if the words “point”
and “line” are switched and the phrases “lie on” and “intersect in” are switched. As a
result, such a geometry is self-dual in the sense that points and lines are interchangeable.
For example, give the edges of the triangle opposite the vertices 0, 1 and 4 the dual labels
0, 1, and 4 respectively, give the corresponding medians with base points 2, 3, and 5 the
dual labels 2, 3, and 5 respectively, and give the circular line the dual label 6. Just as in
the original labeling, the points 4, 5, and 0 are collinear (on line with dual label 1), so the
dual lines 4, 5, and 0 are concurrent (meeting in the point with original label 1).
Duality shows up in the block design world. As we have seen, the parameters v, k, and
λ are enough to specify a block design, and such a design has $b = \lambda v(v-1)/(k(k-1))$ blocks. A
design with b = v is called symmetric. The dual design of a symmetric block design
is obtained by interchanging the blocks and the points, and as is the case with the Fano
plane, the (7, 3, 1) design is self-dual.
Figure 36: The Fano configuration: the projective plane of order 2.
We can use the (13, 4, 1) planar difference set {0, 1, 3, 9} to form a projective plane of
order 3 with thirteen points and thirteen lines:
(0, 1, 3, 9) (1, 2, 4, A) (2, 3, 5, B) (3, 4, 6, C) (4, 5, 7, 0) (5, 6, 8, 1) (6, 7, 9, 2)
(7, 8, A, 3) (8, 9, B, 4) (9, A, C, 5) (A, B, 0, 6) (B, C, 1, 7) (C, 0, 2, 8).
Again, due to the duality inherent in a projective plane, the (13, 4, 1) symmetric block
design associated with the projective plane of order 3 is also self-dual. It is also a Singer
$((3^3 - 1)/(3 - 1), (3^2 - 1)/(3 - 1), (3^1 - 1)/(3 - 1))$ design.
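As a quick check of the construction, the sketch below rebuilds the thirteen lines as the translates of {0, 1, 3, 9} mod 13 and tests the incidence properties of a projective plane of order 3. The variable names are ours; the points A, B, C of the listing above appear here as 10, 11, 12.

```python
from itertools import combinations

V = 13
D = (0, 1, 3, 9)

# Line i is the translate D + i (mod 13).
lines = [frozenset((d + i) % V for d in D) for i in range(V)]

def lines_through(p, q):
    """Number of lines containing both points p and q."""
    return sum(1 for L in lines if p in L and q in L)

pair_counts = {lines_through(p, q) for p, q in combinations(range(V), 2)}
meet_sizes = {len(L & M) for L, M in combinations(lines, 2)}
points_per_line = {len(L) for L in lines}
lines_per_point = {sum(1 for L in lines if p in L) for p in range(V)}

print(pair_counts, meet_sizes, points_per_line, lines_per_point)
```

Two points on exactly one common line and two lines meeting in exactly one point are dual statements, which is the self-duality mentioned above.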
Another realization of the projective plane of order 3 comes from the vertical, horizontal,
main diagonal, and contrary diagonal lines of the 3 × 3 matrix with entries 1, . . . , 9 in
lexicographic order, along with the four extra points 0, A, B, and C:
(0, 1, 2, 3) (0, 4, 5, 6) (0, 7, 8, 9)
(A, 1, 4, 7) (A, 2, 5, 8) (A, 3, 6, 9)
(B, 1, 5, 9) (B, 2, 6, 7) (B, 3, 4, 8)
(C, 1, 6, 8) (C, 2, 4, 9) (C, 3, 5, 7) (0, A, B, C).
Affine Geometries
If we regard the points of one line of a projective geometry as being at infinity, then the
remaining points lie on families of parallel lines. For instance, delete the line {0, A, B, C}
and all its points from the geometry constructed in the previous section and we obtain
(1, 2, 3) (4, 5, 6) (7, 8, 9)
(1, 4, 7) (2, 5, 8) (3, 6, 9)
(1, 5, 9) (2, 6, 7) (3, 4, 8)
(1, 6, 8) (2, 4, 9) (3, 5, 7)
This is an affine geometry, i.e., one in which the parallel postulate holds: given a line l
and a point P ∉ l, there exists a unique line m containing P and parallel to l. It contains nine
points on twelve lines, the four rows being sets of three parallel lines each. The twelve
triples also form a (9, 3, 1) Steiner triple system on the nine points.
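The passage from the projective plane to the affine one can be replayed mechanically. The sketch below (our own encoding, with each line written as a four-character string) deletes the line {0, A, B, C} and then tests both the parallel postulate and the Steiner triple system property.

```python
from itertools import combinations

projective_lines = ["0123", "0456", "0789", "A147", "A258", "A369",
                    "B159", "B267", "B348", "C168", "C249", "C357", "0ABC"]
at_infinity = frozenset("0ABC")

# Remove the line at infinity and strip its points from every other line.
affine_lines = [frozenset(L) - at_infinity for L in projective_lines
                if frozenset(L) != at_infinity]
points = sorted(set().union(*affine_lines))

def parallels_through(P, l):
    """Lines through P that have no point in common with l."""
    return [m for m in affine_lines if P in m and not (m & l)]

parallel_postulate = all(len(parallels_through(P, l)) == 1
                         for l in affine_lines for P in points if P not in l)
steiner = all(sum(1 for l in affine_lines if {p, q} <= l) == 1
              for p, q in combinations(points, 2))

print(len(affine_lines), len(points), parallel_postulate, steiner)
```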
It happens that affine geometries have a close connection with magic squares.
Magic Squares
We can’t arrange that all the lines of an affine geometry are magic in the sense that if
the points are numbers then the sum of the points on each line is the same, but we can
manage quite a few of them. An n by n latin square is an arrangement of n copies of n
symbols in a square so that there is one and only one of each symbol in each row and each
column. It is also possible to find diagonal latin squares in which the diagonals (including
the broken diagonals) also contain just one of each symbol. Two latin squares are said to
be orthogonal if, when they are superimposed one upon the other, all $n^2$ ordered pairs
of symbols occur (once each). Below we see, first two 5 by 5 orthogonal diagonal latin
squares, next their superposition, and finally their values as two-digit numbers in base 5
to yield the numbers 0, 1, . . . , 24 arranged in a pandiagonal magic square.
0 2 4 1 3     0 1 2 3 4     00 21 42 13 34      0 11 22  8 19
1 3 0 2 4     2 3 4 0 1     12 33 04 20 41      7 18  4 10 21
2 4 1 3 0     4 0 1 2 3     24 40 11 32 03     14 20  6 17  3
3 0 2 4 1     1 2 3 4 0     31 02 23 44 10     16  2 13 24  5
4 1 3 0 2     3 4 0 1 2     43 14 30 01 22     23  9 15  1 12
Figure 37: A pandiagonal magic square from two order-5 orthogonal latin squares.
The rows, columns and diagonals, including the broken diagonals such as 14 + 2 + 15 +
8 + 21 or 16 + 20 + 4 + 8 + 12, all have magic sum 60. They can be viewed as twenty lines of
an affine geometry on twenty-five points; the other ten lines are obtained by making chess
knight’s moves on a torus (draw a square around the array and identify opposite sides):
(0, 20, 15, 10, 5), (11, 6, 1, 21, 16), (22, 17, 12, 7, 2), (8, 3, 23, 18, 13), (19, 14, 9, 4, 24)
(0, 4, 3, 2, 1), (11, 10, 14, 13, 12), (22, 21, 20, 24, 23), (8, 7, 6, 5, 9), (19, 18, 17, 16, 15).
Only the third member of the first set and the second member of the second set have magic
sum 60.
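The sums quoted above can be recomputed directly. In the sketch below the closed forms (i + 2j) mod 5 and (2i + j) mod 5 are our own shorthand for two orthogonal latin squares that reproduce the magic square of Figure 37; with them the bottom-left entry comes out as 23, the only value consistent with the magic sum 60 and with the knight's-move line (8, 3, 23, 18, 13).

```python
n = 5
# Two orthogonal diagonal latin squares (closed forms are an assumption of this sketch).
A = [[(i + 2 * j) % n for j in range(n)] for i in range(n)]
B = [[(2 * i + j) % n for j in range(n)] for i in range(n)]
# Read each superimposed pair as a two-digit number in base 5.
M = [[n * A[i][j] + B[i][j] for j in range(n)] for i in range(n)]

def line(i, j, di, dj):
    """The five cells reached from (i, j) by repeated steps (di, dj) on the torus."""
    return [((i + s * di) % n, (j + s * dj) % n) for s in range(n)]

directions = {"rows": (0, 1), "columns": (1, 0), "diagonals": (1, 1),
              "contrary diagonals": (1, -1),
              "knight (2,1)": (2, 1), "knight (1,2)": (1, 2)}

magic = {}
for name, (di, dj) in directions.items():
    lines = {frozenset(line(i, j, di, dj)) for i in range(n) for j in range(n)}
    magic[name] = sorted(sum(M[a][b] for a, b in L) for L in lines)

for name, sums in magic.items():
    print(name, sums)
```

The six directions are the six parallel classes of the affine geometry on twenty-five points; the first four classes are entirely magic, and each knight's-move class contains exactly one line with sum 60.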
Heawood’s map on the torus revisited
Another connection between difference sets and geometry involves Heawood’s map of the
complete graph K7 on the torus. We can use the six differences modulo 7 from the planar
difference set {0, 1, 3} to label the nodes of a triangular lattice along its six different
directions, as seen in Figure 38. We choose the labels so that the labeling of each
downward-pointing equilateral triangle corresponds to one of the seven blocks of the (7, 3, 1) block
design, listed at the end of Section 7.
Figure 38: Finding seven nodes and $\binom{7}{2}$ adjacencies on a torus.
If we identify all nodes having the same label we obtain a torus with seven nodes (or
vertices), each of them adjacent to the other six. This graph is the complete graph K7
on seven points. Think of the capitals of seven countries, with roads between each pair,
crossing the frontiers of each pair of contiguous countries.
Figure 39: The Heawood graph on the torus.
Figure 39 is Heawood’s famous map on the torus, showing that at least seven colors are
needed to properly color maps on the torus for which neighboring regions are given distinct
colors. (See also Figure 29.) We begin with a tiling of the plane by regular hexagons,
numbering the regions in such a way that the hexagonal regions labeled 0, 1, 2, 3, 4, 5, and
6 are mutually adjacent and so require distinct colors in a proper coloring of the map. The
dashed lines in the figure form a rhombus, and one identifies—matches up—the opposite
sides of the rhombus so that it folds into a torus. (This figure has also appeared in Section
6.)
Given a map M on a surface, one can construct what is called the dual graph of M
as follows. Place a vertex in each region and join vertices whenever two regions share a
boundary edge. The dual graph of Heawood’s map H has seven vertices, and as the regions
of H are mutually adjacent, it follows that each pair of vertices in the dual graph is joined
by an edge. Thus, the dual graph is the complete graph K7 on seven vertices. This also
shows that the genus of K7 is equal to 1, for it is clear that we cannot embed K7 in the
sphere or the plane.
The toroidal thickness of the complete graph
Heawood’s conjecture, proved by Ringel, Youngs and others in the 1960s, states that the
genus γ(Kn ) of the complete graph Kn is given by
\[ \gamma(K_n) = \left\lceil \frac{(n-3)(n-4)}{12} \right\rceil, \]
where ⌈x⌉ is the least integer ≥ x. Then γ(K7 ) = 1 and γ(Kn ) ≥ 2 for n ≥ 8. Quite clearly,
then, we cannot embed K25 in the (one-holed) torus. But we can ask the question: into
how few parts can we partition its edges with each part forming a graph that is embeddable
in the torus? This number is called the toroidal thickness θ1 of the graph. In this case
the answer is four, because we can take our four equations from Section 6, namely
1 + 11 = 12, 2 + 6 = 8, 3 + 7 = 10, and 4 + 5 = 9,
and use them in place of 1 + 2 = 3 to make four maps like Heawood’s, but working modulo
25 instead of modulo 7.
In the same way you can use two triangles with arc lengths 1, 3, 4 and 2, 5, 7 in a circle
of thirteen points to show that θ1 (K13 ) = 2. You can use three triangles with arc lengths
1, 7, 8; 2, 3, 5; 4, 6, 9 in a circle of nineteen points to give θ1 (K19 ) = 3. Ringel [129] has
proved that in general, θ1 (Kn ) = ⌈(n − 1)/6⌉.
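The arithmetic behind these examples can be checked in a few lines. The sketch below evaluates the two ceiling formulas in integer arithmetic and tests that each family of triangles uses every chord length 1, . . . , (n − 1)/2 exactly once; the function names are ours, and a triple (a, b, c) is accepted when a + b ≡ ±c (mod n).

```python
def genus_complete_graph(n):
    """The value ceil((n - 3)(n - 4) / 12) for n >= 3, in integer arithmetic."""
    return ((n - 3) * (n - 4) + 11) // 12

def toroidal_thickness_complete_graph(n):
    """The value ceil((n - 1) / 6) quoted in the text."""
    return (n - 1 + 5) // 6

def triangles_use_each_difference_once(triangles, n):
    """Check that each triple closes up mod n and that the chord lengths
    min(d, n - d) run over 1..(n - 1)/2 exactly once."""
    chords = []
    for a, b, c in triangles:
        if (a + b - c) % n != 0 and (a + b + c) % n != 0:
            return False
        chords += [min(d % n, -d % n) for d in (a, b, c)]
    return sorted(chords) == list(range(1, (n - 1) // 2 + 1))

print(genus_complete_graph(7), genus_complete_graph(8))
print(triangles_use_each_difference_once([(1, 3, 4), (2, 5, 7)], 13))
print(triangles_use_each_difference_once([(1, 7, 8), (2, 3, 5), (4, 6, 9)], 19))
print(triangles_use_each_difference_once(
    [(1, 11, 12), (2, 6, 8), (3, 7, 10), (4, 5, 9)], 25))
```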
(7, 3, 1) and Nim
There is another way to label the Fano configuration, depicted in Figure 40.
Figure 40: The Fano configuration relabeled.
At first glance it looks as though we’ve merely added one to each of the labels and done
a little rearrangement. But it is more subtle than that. The motivation for the relabeling
is seen if you write the new labels in binary, as in Figure 41, and it has to do with the
Nim-sum operation we have already introduced in Section 2. That is, we treat the labels as
nimbers. Recall that this operation has several equivalent descriptions, namely (a) vector
or coordinate-wise addition over F2 , the two-element field of integers mod 2; (b) addition in
base 2 without carrying; and (c) string-wise XOR, or exclusive-or. Thus, writing ⊕ for the
Nim-sum operation, we see that (001 ⊕ 111) ⊕ 110 = 110 ⊕ 110 = 000, and 011 ⊕ 101 = 110.
For the relabeling in Figure 41, the Nim-sum of three labels on a line is zero – but the
relation with nim goes further. Ignoring orderings, there are 35 three-pile nim games in
which the piles have different sizes and the sizes are taken from {1, . . . , 7}. The “lines” of
the Fano plane are those nim games that are losing positions.
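The count of 35 games and the seven losing positions can be listed by brute force. This is a small sketch using Python's bitwise XOR operator `^` as the Nim-sum; the variable names are ours.

```python
from itertools import combinations

# All three-pile games with distinct pile sizes taken from {1, ..., 7}.
positions = list(combinations(range(1, 8), 3))

# A position is lost for the player about to move exactly when its Nim-sum is zero.
losing = [p for p in positions if p[0] ^ p[1] ^ p[2] == 0]

# Each pair of sizes should occur in exactly one losing position, as in a (7, 3, 1) design.
pair_counts = {sum(1 for p in losing if a in p and b in p)
               for a, b in combinations(range(1, 8), 2)}

print(len(positions), losing, pair_counts)
```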
Figure 41: Binary labels illustrate Nim-sum.
As for the origin of the relabeling rule for the Fano configuration, that comes from a
way to transform the arithmetic of integers on Z (mod 7) to the arithmetic of polynomials
over Z (mod 2), which we now describe.
i   x^i (mod x^3 + x + 1)   string a2 a1 a0   integer   mod 7 blocks   Nim-sum blocks
0   1                       001               1         013            123
1   x                       010               2         124            246
2   x^2                     100               4         235            437
3   x + 1                   011               3         346            365
4   x^2 + x                 110               6         450            671
5   x^2 + x + 1             111               7         561            752
6   x^2 + 1                 101               5         602            514
Table 1: Difference set labels to nimber labels.
Let Z2[x] be the ring of polynomials with coefficients in the integers (mod 2). Polynomial
arithmetic in Z2[x] is the same as ordinary polynomial arithmetic, except that the
coefficients of these polynomials are 0s and 1s and 1 + 1 = 0. Thus, the only non-zero
polynomials of degree less than 3 are the seven listed in the second column of Table 1. Let
$g(x) = x^3 + x + 1 \in Z_2[x]$. In this setting, when we do arithmetic mod g(x), we find that
the powers $x^i$, i = 0, 1, . . . , 6 (mod g(x)) are distinct, and that $x^7 \equiv 1$ (mod g(x)).
In Table 1, the numbers in column 1 are the original (mod 7) labels of the Fano configuration.
Columns 2 and 3 list the corresponding powers of x (mod $x^3 + x + 1$) as polynomials
and as three-bit strings. The numbers in column 4 are the integers represented by the
three-bit strings. Under this transformation, the blocks in the difference-set design in
column 5 are mapped to their respective blocks in the Nim-sum design in column 6.
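The columns of Table 1 can be regenerated with a few bit operations. In this sketch a polynomial over Z2 of degree less than 3 is stored as a 3-bit integer whose bit i is the coefficient of x^i (our own encoding, matching the string a2 a1 a0), and the mod 7 blocks are taken to be the translates of {0, 1, 3}.

```python
G_POLY = 0b1011  # x^3 + x + 1; bit i holds the coefficient of x^i

def times_x(p):
    """Multiply a polynomial of degree < 3 over Z_2 by x and reduce mod x^3 + x + 1."""
    p <<= 1
    if p & 0b1000:      # a term x^3 appeared: replace it by x + 1
        p ^= G_POLY
    return p

# powers[i] is x^i mod g(x), read as an integer (column 4 of Table 1).
powers = [1]
while len(powers) < 7:
    powers.append(times_x(powers[-1]))

mod7_blocks = [[(d + i) % 7 for d in (0, 1, 3)] for i in range(7)]
nim_blocks = [[powers[e] for e in block] for block in mod7_blocks]

for i in range(7):
    print(i, format(powers[i], "03b"), powers[i], mod7_blocks[i], nim_blocks[i])
```

Each Nim-sum block has Nim-sum zero because x^i + x^(i+1) + x^(i+3) = x^i (1 + x + x^3) ≡ 0 (mod g(x)).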
A little study of Figure 40 reveals that the Fano plane – and hence the (7, 3, 1) block
design – has a great deal of symmetry. For example, rotating the figure by a one-third
clockwise turn around the point 5 is a rigid motion that preserves both the set of points and
the set of lines. From Figure 33, we see that the cyclic permutation ρ = (1 2 3 4 5 6 7)
of the points also acts on the lines – i.e., the blocks – by cyclically permuting them.
Given a projective plane, any permutation of its points that also preserves its lines is
called an automorphism of the plane, and the set of all automorphisms of a projective
plane forms a group under composition of mappings. The automorphism group G for the
projective plane of order 2 can be represented by the group GL(3, 2) of nonsingular 3 × 3
matrices with entries in the integers mod 2. The first row can be written in $2^3 - 2^0 = 7$
ways, the second in $2^3 - 2^1 = 6$ ways, and the third in $2^3 - 2^2 = 4$ ways. Thus, the order
of the group G is 7 × 6 × 4 = 168. (We select a triangle of reference by choosing one
point, then a second, then the third not on the line joining the first two. The image of
these three points completely determines the automorphism.)
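The count 7 × 6 × 4 can be confirmed independently by running through all 512 binary 3 × 3 matrices and keeping those with determinant 1 mod 2. This is a brute-force sketch with names of our own choosing.

```python
from itertools import product

def det_mod2(m):
    """Determinant of a 3 x 3 matrix (given as three rows) reduced mod 2."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2

rows = list(product((0, 1), repeat=3))           # the 8 possible rows
invertible = [m for m in product(rows, repeat=3) if det_mod2(m) == 1]

print(len(invertible))
```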
A bijective map on the vertices of a graph that preserves adjacency is called an automorphism
of the graph, and G is also the automorphism group for the Heawood graph: its
elements permute the vertices and induce permutations of the edges. Finally, a bijective
map on the elements of a block design that preserves the set of blocks is called an automorphism
of the design, and it happens that G is also the automorphism group of the (7, 3, 1)
symmetric block design. G can also be realized as the group PSL(2, 7) of 2 × 2 matrices of
determinant 1 over the integers mod 7, where we also identify the identity matrix I with
its negative −I. Thus, PSL(2, 7) is isomorphic to GL(3, 2).
The preceding paragraph raises two interesting questions associated with this group G.
Question 1: We have asserted that GL(3, 2) is the automorphism group for the projective
plane of order 2. Why is this true?
Question 2: For many years, both RKG and EB had pondered this question:
Why is a group of 2 × 2 matrices over the integers mod 7 isomorphic to a group of 3 × 3
matrices over the integers mod 2?
Or, to put it more simply: Why is PSL(2, 7) isomorphic to GL(3, 2)?
Let’s answer Question 1 first.
The automorphism group of the Fano plane
We have boldly asserted that the group Aut(D) of automorphisms of the Fano plane D –
that is, the permutations on the set {1, 2, 3, 4, 5, 6, 7} of points of D that also preserve the
lines of D – can be represented by the group of invertible 3 × 3 matrices with entries in
the integers mod 2. Let’s prove this by constructing the group Aut(D) and making use
of one of the fundamental results about permutation groups, namely the Orbit-Stabilizer
Theorem.
As we have seen, we can identify the Fano plane (see Figure 36) with the blocks and
varieties of the (7, 3, 1) block design, as follows –
B1 = {1, 2, 4}
B2 = {2, 3, 5}
B3 = {3, 4, 6}
B4 = {4, 5, 7}
B5 = {5, 6, 1}
B6 = {6, 7, 2}
B7 = {7, 1, 3}
– and in which we have replaced each occurrence of 0 with a 7. In what follows, we use the
cycle notation to describe permutations. Thus, the permutation ρ : 1 → 2 → 4 → 1, 3 →
6 → 5 → 3, 7 → 7 of the set {1, 2, 3, 4, 5, 6, 7} can also be written as (1, 2, 4)(3, 6, 5)(7).
Formally, an automorphism of a design is a permutation of its varieties that also permutes its blocks. Thus, ρ = (1, 2, 4)(3, 6, 5)(7) is an automorphism of (7, 3, 1): it determines
(or induces) the permutation ρ∗ : (B1 )(B2 , B3 , B5 )(B4 , B7 , B6 ) on the blocks of (7, 3, 1).
On the other hand, (1, 3, 4)(2, 5, 7, 6) is not an automorphism of (7, 3, 1): it maps the block
B1 = {1, 2, 4} to the non-block {3, 5, 1}.
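Whether a given permutation is an automorphism, and which block permutation it induces, can be checked mechanically. The helper names below are ours; the blocks B1, . . . , B7 are those listed above.

```python
# BLOCKS[i - 1] is the block B_i, i.e. {1, 2, 4} shifted by i - 1, with 0 written as 7.
BLOCKS = [frozenset((d + s - 1) % 7 + 1 for d in (1, 2, 4)) for s in range(7)]

def perm_from_cycles(cycles, n=7):
    """Turn cycle notation into a dict x -> image of x; unlisted points are fixed."""
    perm = {x: x for x in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm

def induced_block_map(perm):
    """Return {i: j} meaning B_i is sent to B_j, or None if some image is not a block."""
    result = {}
    for i, block in enumerate(BLOCKS, start=1):
        image = frozenset(perm[x] for x in block)
        if image not in BLOCKS:
            return None
        result[i] = BLOCKS.index(image) + 1
    return result

rho = perm_from_cycles([(1, 2, 4), (3, 6, 5)])
bad = perm_from_cycles([(1, 3, 4), (2, 5, 7, 6)])

print(induced_block_map(rho))   # the cycles (B1)(B2, B3, B5)(B4, B7, B6)
print(induced_block_map(bad))   # None: {1, 2, 4} is sent to {3, 5, 1}
```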
Let G be a group of permutations of a set S, let T ⊆ S and let x ∈ S. The set of
elements OrbG (x) = {y ∈ S : y = g(x) for some g ∈ G} is called the orbit of x under G.
Similarly, OrbG (T ) = {R ⊆ S : R = g(T ) for some g ∈ G} is called the orbit of T under
G. The set of mappings StabG (T ) = {g ∈ G : g(T ) = T } is called the stabilizer of T in G;
we denote StabG ({x}) by StabG (x). The following is one of the most useful and beautiful
theorems in group theory.
The Orbit-Stabilizer Theorem: Let G be a finite group of permutations on a finite
set S, let T ⊆ S, and let |X| denote the cardinality (or order) of X. Then StabG (T ) is
a subgroup of G, and the cardinalities of G, StabG (T ), and OrbG (T ) are related by the
equation
|G| = |OrbG (T )| · |StabG (T )|.
Let D denote the (7, 3, 1) design. We define the following groups:
• G = Aut(D) – the automorphism group of D,
• H = StabG (B1 ) – the automorphisms that fix B1 ,
• K = StabH (1) – the automorphisms that fix B1 and 1, and
• L = StabK (2) – the automorphisms that fix B1 , 1 and 2.
By three applications of the Orbit-Stabilizer Theorem,
|G| = |OrbG (B1 )| · |OrbH (1)| · |OrbK (2)| · |L|.
In order to determine the sizes of the above orbits and stabilizer, here are four automorphisms of (7, 3, 1) that figure in the calculation of these sizes. We also describe their
realizations as symmetries of some familiar geometric figures.
1. Let τ = (1, 2, 3, 4, 5, 6, 7); τ induces the permutation
τ ∗ = (B1 , B2 , B3 , B4 , B5 , B6 , B7 )
on the blocks of (7, 3, 1). Repeated application of τ ∗ to B1 reveals that
|OrbG (B1 )| = 7, and τ acts as a 1/7 turn on the regular heptagon, as seen in Figure 42.
2. Let ρ = (1, 2, 4)(3, 6, 5)(7) be as above; ρ induces the permutation
ρ∗ = (B1 )(B2 , B3 , B5 )(B4 , B7 , B6 )
of the blocks of (7, 3, 1). Now, ρ fixes B1 set-wise, so ρ ∈ H. But ρ cyclically
permutes the elements of B1 , so |OrbH (1)| = 3. Geometrically, ρ acts as a 1/3 turn
on the regular octahedron with its centroid, as seen in Figure 43.
Figure 42: τ = (1, 2, 3, 4, 5, 6, 7) cyclically permutes 1, 2, 3, 4, 5, 6, and 7.
3. Let σ = (1)(2, 4)(3, 5, 7, 6); σ induces the permutation
σ ∗ = (B1 )(B2 , B4 , B6 , B3 )(B5 , B7 )
on the blocks of (7, 3, 1). We see that σ fixes 1 and switches 2 and 4, so σ is in the
stabilizer of both 1 and B1 – that is, σ ∈ K. Since σ switches 2 and 4, we see that
|OrbK (2)| = 2. Geometrically, σ acts as a 1/4 turn with a reversing reflection on the
regular octahedron with its centroid, as seen in Figure 44.
4. Let δ = (1)(2)(4)(3, 5)(6, 7); δ induces the permutation
δ∗ = (B1 )(B2 )(B3 , B4 )(B5 , B7 )(B6 )
of the blocks of (7, 3, 1). Thus, δ is in the stabilizer of the block B1 = {1, 2, 4}. It
happens that δ fixes each element of B1 , so δ fixes B1 pointwise. It can be shown
that L = {identity, δ, σ², δσ²}, and so |StabK (2)| = |L| = 4. Geometrically, δ acts as
a reversing reflection on the regular octahedron with its centroid, as seen in Figure 45.
The following table summarizes the geometric realizations of these four symmetries:
     how to lay out {1, . . . , 7}    geometric symmetry
τ    regular heptagon                1/7 turn
ρ    octahedron with centroid        1/3 turn
σ    octahedron with centroid        1/4 turn with reversing reflection
δ    octahedron with centroid        a reversing reflection
Figure 43: ρ = (1, 2, 4)(3, 6, 5)(7) rotates {1, 2, 4} and {3, 6, 5} and fixes 7.
Figure 44: σ = (1)(2, 4)(3, 5, 7, 6) fixes 1, switches 2 and 4, and cyclically permutes 3, 5, 7,
and 6.
We are now ready to find the order of G = Aut(7, 3, 1), the automorphism group of the
(7, 3, 1) block design. The Orbit-Stabilizer Theorem tells us that for G = Aut(7, 3, 1),
|G| = |OrbG (B1 )| · |OrbH (1)| · |OrbK (2)| · |StabK (2)| = 7 · 3 · 2 · 4 = 168.
It follows that Aut(7, 3, 1) has order 168.
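Since there are only 7! = 5040 permutations of the points, the whole chain G ⊇ H ⊇ K ⊇ L can also be found by exhaustive search. This is a brute-force sketch for comparison with the Orbit-Stabilizer count, not the argument used above; a permutation is stored as a tuple p with p[x − 1] the image of x.

```python
from itertools import permutations

BLOCKS = {frozenset((d + s - 1) % 7 + 1 for d in (1, 2, 4)) for s in range(7)}
B1 = frozenset({1, 2, 4})

def image(p, S):
    """Image of the set S under the permutation p."""
    return frozenset(p[x - 1] for x in S)

# G: permutations sending every block to a block.
G = [p for p in permutations(range(1, 8)) if all(image(p, B) in BLOCKS for B in BLOCKS)]
H = [p for p in G if image(p, B1) == B1]   # fix B1 as a set
K = [p for p in H if p[0] == 1]            # ... and fix the point 1
L = [p for p in K if p[1] == 2]            # ... and fix the point 2

orbit_B1 = {image(p, B1) for p in G}
orbit_1 = {p[0] for p in H}
orbit_2 = {p[1] for p in K}

print(len(orbit_B1), len(orbit_1), len(orbit_2), len(L), len(G))
```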
Many groups, however, have order 168: how do we know that Aut(7, 3, 1) is isomorphic
to GL(3, 2)?
Here’s how. It can be shown that Aut(7, 3, 1) is generated by the elements $\tau\delta\tau^{-1}$ and ρ
of orders 2 and 3, respectively, whose product $\rho\tau\delta\tau^{-1}$ has order 7. In the next section, we
prove that GL(3, 2) is generated by an element of order 2 and an element of order 3, whose
product has order 7. Hence our automorphism group is the group GL(3, 2), as claimed.
But it is also the group PSL(2, 7) and, as stated above, the fact that these two groups
are isomorphic was a mystery to both of your authors. The mystery is solved in the
following section.
Figure 45: δ = (1)(2)(4)(3, 5)(6, 7) switches 3 and 5, switches 6 and 7, fixes 1, 2, and 4.
9 The groups PSL(2, 7) and GL(3, 2) and why they are isomorphic.
In the previous section, we learned that the automorphism group of the Fano plane has
order 168, and that it is, in fact, the group GL(3, 2) of invertible matrices over the
two-element field. Now, the groups of invertible matrices over finite fields are among the first
groups we meet in a beginning course in modern algebra. Eventually, we find out about
simple groups and that the unique simple group of order 168 has two representations as a
group of matrices. And this is where we learn that the group of 2 × 2 unimodular matrices
over a seven-element field, with I and −I identified, is isomorphic to the group of invertible
3 × 3 matrices over a 2-element field. In short, it is a fact that PSL(2, 7) ≅ GL(3, 2).
Many of us are surprised by this fact: why should a group of 2 × 2 matrices with mod-7
integer entries be isomorphic to a group of 3 × 3 binary matrices?
There are a number of proofs of this remarkable theorem. Dickson [47, p. 303] gives
a proof based on his general theorem giving uniform sets of generators and relations for
the family of groups SL(2, q), where q is any prime power. One checks that the relations appearing in Dickson’s presentation of PSL(2, 7) are satisfied by certain generators
of GL(3, 2), implying that these groups have the same presentations and are therefore
isomorphic. Dummit and Foote [51, pp. 207-212] show that every simple group of order 168
is necessarily isomorphic to the automorphism group Aut(F) of the Fano plane F. They
then show that Aut(F) ≅ GL(3, 2) and that PSL(2, 7) is a simple group of order 168;
the isomorphism theorem follows. Rotman gives the result as an exercise [138, Exercise
9.26, p. 281]. A hint is to begin with a simple group G of order 168 and use the seven
conjugates of a Sylow 2-subgroup P of G to construct a seven-point projective plane; the
proof is similar to Dummit and Foote’s proof. Jeurissen [88] proves the result by showing
that both PSL(2, 7) and GL(3, 2) are subgroups of index 2 of the automorphism group of
a Coxeter graph. Elkies [53] gives a clever proof that uses the automorphism group G
of the 3-(8, 4, 1) Steiner system – also known as the Steiner S(3, 4, 8) design. He shows
that PSL(2, 7) is contained in G, which in turn maps homomorphically onto GL(3, 2). The
result follows from the simplicity of the two groups and the fact that they are both of order
168. We remark that there do exist non-isomorphic simple groups of the same order. For
example, Schottenfels showed that PSL(3, 4) and A8 are non-isomorphic simple groups of
order 20,160 [138, Theorem 8.24, p. 233].
We give a proof that PSL(2, 7) ≅ GL(3, 2) that is elementary in the sense that it uses
neither simplicity, nor projective geometry, nor block designs. We will not prove the fact
that any two simple groups of order 168 are isomorphic, nor will we use this fact in our
proof. What makes our proof work is that: (a) we can identify GL(3, 2) with the set of
invertible F2-linear transformations on the finite field with eight elements; (b) $7 = 2^3 - 1$;
(c) the nonzero squares mod 7 are precisely the powers of 2 mod 7; (d) squaring mod 2
is additive (the Freshman’s Dream); and (e) the mapping $k \mapsto -1/k$ mod 7 translates
to a bit-switch mod 2, which is linear. We begin by giving functional descriptions for
both groups, determining their sizes, and exhibiting sets of generators for them. After this
we define a mapping between the groups and prove that the mapping is a bijective group
homomorphism.
Let’s begin with GL(3, 2).
The group GL(3, 2).
Let F2 = {0, 1} be the field with two elements. The group GL(3, 2) consists of all invertible
3 × 3 matrices with entries in F2 . To construct our isomorphism, we need three basic facts
about this group.
1. Functional Description of the Group. Let $F_8 = F_2[X]/\langle X^3 + X + 1 \rangle$, and let
$x = X + \langle X^3 + X + 1 \rangle \in F_8$. On one hand, F8 is an eight-element field whose multiplicative
group is generated by x. On the other hand, F8 is a three-dimensional vector space
over F2 with ordered basis $B = (x^0, x^1, x^2)$. Let GL(F8 ) be the set of all invertible
F2 -linear transformations of this vector space. This means that GL(F8 ) consists of
all bijections L : F8 → F8 such that L(u + v) = L(u) + L(v) for all u, v ∈ F8 . We
note that L(cu) = cL(u) holds automatically, since the only available scalars c are 0
and 1. Let $[L]_B$ denote the matrix of L relative to the ordered basis B. Then the
map $L \mapsto [L]_B$ defines an isomorphism between GL(F8 ) and GL(3, 2). From now on,
we identify these two groups by means of this isomorphism.
2. The Size of GL(3, 2). The following counting argument proves that | GL(3, 2)| = 168.
Let us build an invertible 3 × 3 matrix of 0s and 1s one row at a time. The first
row can be any nonzero bit string of length 3; there are seven such bit strings. The
second row can be any nonzero bit string different from the first row; there are six
such bit strings. When choosing the third row, we must pick a bit string that is not
a linear combination of the first two rows. There are four such linear combinations
(zero, the first row, the second row, or the sum of the first two rows), so there are
8 − 4 = 4 choices for the third row. By the product rule,
| GL(3, 2)| = 7 · 6 · 4 = 168.
3. Generators for GL(3, 2). It will be useful to have a small set of generators for GL(3, 2).
Starting with any matrix A ∈ GL(3, 2), we can use elementary row operations (Gaussian
elimination) to reduce A to the identity matrix. Each elementary operation can
be accomplished by multiplying on the left by one of the following nine elementary
matrices:
\[ E_{12} = \begin{pmatrix} 1&1&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}, \quad
E_{13} = \begin{pmatrix} 1&0&1\\ 0&1&0\\ 0&0&1 \end{pmatrix}, \quad
E_{23} = \begin{pmatrix} 1&0&0\\ 0&1&1\\ 0&0&1 \end{pmatrix}, \]
\[ E_{21} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ 0&0&1 \end{pmatrix}, \quad
E_{31} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 1&0&1 \end{pmatrix}, \quad
E_{32} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&1&1 \end{pmatrix}, \]
\[ S_{12} = \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}, \quad
S_{23} = \begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}, \quad
S_{13} = \begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix}. \]
For example, multiplying A on the left by $E_{12}$ adds the second row of A to the
first row, whereas multiplying A on the left by $S_{13}$ interchanges the first and third
rows of A. Thus, the row-reduction of A to the identity matrix via elementary row
operations translates to a matrix equation of the form $E_1 \cdots E_k A = I$, where each
$E_i$ is an elementary matrix. Solving for A and noting that each elementary matrix
equals its own inverse, we see that GL(3, 2) is generated by the nine elementary
matrices listed above. In fact, many of these matrices are redundant, and the set
$\{E_{23}, S_{12}, S_{23}\}$ already generates the whole group. This remark follows from the
formulas
\[ S_{13} = S_{12} S_{23} S_{12}, \quad E_{13} = S_{12} E_{23} S_{12}, \quad E_{32} = S_{23} E_{23} S_{23}, \]
\[ E_{21} = S_{13} E_{23} S_{13}, \quad E_{12} = S_{13} E_{32} S_{13}, \quad E_{31} = S_{12} E_{32} S_{12}. \]
Now consider the three matrices
\[ A_1 = \begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 0&0&1\\ 1&0&1\\ 0&1&0 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&1 \end{pmatrix}. \]
We have $E_{23} = A_1 A_3$, $S_{12} = A_2^2 A_1 A_2^3 A_3$, $S_{23} = A_1$, and so GL(3, 2) = $\langle A_1, A_2, A_3 \rangle$.
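The conjugation formulas and the three-generator claim can be checked by direct matrix arithmetic mod 2. In the sketch below the helpers `E(i, j)` and `S(i, j)` are our own constructors for the elementary matrices (the identity with an extra 1 in row i, column j, and the row swap), and `generated_group` closes a generating set under left multiplication; the expression for S12 is read as A2² A1 A2³ A3.

```python
from functools import reduce

def matmul(A, B):
    """Product of two 3 x 3 matrices over F_2 (tuples of row tuples)."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))

def prod(*Ms):
    return reduce(matmul, Ms)

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def E(i, j):
    """Identity plus a 1 in row i, column j (1-based): left multiplication adds row j to row i."""
    return tuple(tuple(1 if (r == c or (r, c) == (i - 1, j - 1)) else 0
                       for c in range(3)) for r in range(3))

def S(i, j):
    """Permutation matrix interchanging rows i and j (1-based)."""
    rows = list(I3)
    rows[i - 1], rows[j - 1] = rows[j - 1], rows[i - 1]
    return tuple(rows)

A1 = ((1, 0, 0), (0, 0, 1), (0, 1, 0))
A2 = ((0, 0, 1), (1, 0, 1), (0, 1, 0))
A3 = ((1, 0, 0), (0, 0, 1), (0, 1, 1))

def generated_group(gens):
    """All products of the generators, found by repeated left multiplication."""
    seen, frontier = {I3}, [I3]
    while frontier:
        M = frontier.pop()
        for g in gens:
            N = matmul(g, M)
            if N not in seen:
                seen.add(N)
                frontier.append(N)
    return seen

print(prod(A1, A3) == E(2, 3), prod(A2, A2, A1, A2, A2, A2, A3) == S(1, 2))
print(len(generated_group([E(2, 3), S(1, 2), S(2, 3)])), len(generated_group([A1, A2, A3])))
```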
And now, on to a description of PSL(2, 7).
The group PSL(2, 7).
Let F7 = {0, 1, 2, 3, 4, 5, 6} be the field with seven elements. The group SL(2, 7) consists
of all 2 × 2 matrices with entries in F7 and determinant 1. The group PSL(2, 7) is the
quotient group SL(2, 7)/{I, −I}. To construct our isomorphism, we need three basic facts
about this group.
1. Functional Description of the Group. Let
$\overline{F}_7 = F_7 \cup \{\infty\} = \{0, 1, 2, 3, 4, 5, 6, \infty\}$.
As in complex analysis, we define a linear fractional transformation on $\overline{F}_7$ to be a
function $f : \overline{F}_7 \to \overline{F}_7$ of the form
\[ f(k) = \frac{ak + b}{ck + d} \qquad (k \in \overline{F}_7), \tag{17} \]
where a, b, c, d ∈ F7 are constants such that ad − bc ≠ 0. (The same definition works
for any field F.) In the formula for f (k), division by ck + d means multiplication by the
inverse of ck + d in the field F7 ; any nonzero element divided by 0 is ∞; and anything
(other than ∞) divided by ∞ is 0. We have f (∞) = a/c when c ≠ 0, and f (∞) = ∞
when c = 0. The transformation f is called special if a, b, c, d can be chosen so that
ad − bc = 1. There is a natural map φ from SL(2, 7) to the set SLF(7) of special linear
fractional transformations on $\overline{F}_7$, which sends the matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to the function
f given in (17). A routine calculation shows that φ(A) ◦ φ(B) = φ(AB) (which says
composing linear fractional transformations is done by matrix multiplication), so that
φ is a group homomorphism. Furthermore, one sees that φ(A) = φ(B) iff B = A or
B = −A. It follows that ker(φ) = {I, −I}, so that φ induces a group isomorphism
from PSL(2, 7) onto SLF(7). Henceforth, we identify these two groups by means of
this isomorphism.
2. The Size of PSL(2, 7). The following counting argument proves that |SL(2, 7)| = 336.
Let us build a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with entries in F7 such that ad − bc = 1. There are two
cases: c = 0 or c ≠ 0. In the case where c = 0, choose a to be any nonzero element (6
possibilities); then choose $d = a^{-1}$ to force the determinant to be 1 (one possibility);
then choose b to be anything (7 possibilities). This gives 42 upper-triangular matrices
in SL(2, 7). In the case where c ≠ 0, choose c (6 possibilities); then choose a and d
arbitrarily (7 possibilities each); then we must choose $b = c^{-1}(ad - 1)$ to get the right
determinant. This gives 6 · 7 · 7 = 294 more matrices, for a total of 336. Taking the
quotient by the two-element subgroup {I, −I} cuts the number of group elements in
half, so |PSL(2, 7)| = 336/2 = 168.
3. Generators for PSL(2, 7). We can use the functional description of PSL(2, 7) to find
a convenient set of generators for this group. We define three special linear fractional
transformations r, t, δ by setting
• r(k) = −1/k (the “reflection map”);
• t(k) = k + 1 (the “translation map”); and
• δ(k) = 2k (the “doubling map”).
We will prove that SLF(7) = $\langle r, t, \delta \rangle$. Consider a special linear fractional
transformation f (k) = (ak + b)/(ck + d). If c = 0, we must have ad = 1 and $d = a^{-1}$,
so $f(k) = a^2 k + ab$. The nonzero squares mod 7 are 1, 2, 4, so $f = t^{ab} \circ \delta^j$ for a
suitable j ∈ {0, 1, 2}. For example, given f (k) = (3k + 6)/5, we have f (k) = 2k + 4 =
t(t(t(t(δ(k))))). Next, consider f (k) = (ak + b)/(ck + d) where c ≠ 0. Division gives
\[ f(k) = (ac^{-1}) + \frac{bc - ad}{c(ck + d)} = (ac^{-1}) + \frac{-1}{c^2 k + cd}. \]
Writing $c^2 = 2^j$, it is now evident that $f = t^{ac^{-1}} \circ r \circ t^{cd} \circ \delta^j$. Hence, SLF(7) is
generated by r, t, and δ.
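All three facts about PSL(2, 7) can be confirmed by enumeration. In this sketch a linear fractional transformation is stored as the tuple of its values at 0, 1, . . . , 6, ∞, with the integer 7 standing in for ∞; that encoding, the helper names, and the choice of the matrix (4, 0; 0, 2) to represent the doubling map are our own.

```python
INF = 7                       # stand-in for the point at infinity
POINTS = list(range(7)) + [INF]

def inverse(a):
    """Inverse of a nonzero residue mod 7, via a^(7-2)."""
    return pow(a, 5, 7)

def lft(a, b, c, d):
    """Values of k -> (ak + b)/(ck + d) at 0, 1, ..., 6, infinity."""
    images = []
    for k in POINTS:
        if k == INF:
            images.append(INF if c % 7 == 0 else a * inverse(c) % 7)
        else:
            num, den = (a * k + b) % 7, (c * k + d) % 7
            images.append(INF if den == 0 else num * inverse(den) % 7)
    return tuple(images)

def compose(f, g):
    """The map k -> f(g(k))."""
    return tuple(f[g[k]] for k in POINTS)

SL = [(a, b, c, d) for a in range(7) for b in range(7)
      for c in range(7) for d in range(7) if (a * d - b * c) % 7 == 1]
SLF = {lft(*m) for m in SL}

r = lft(0, 6, 1, 0)       # k -> -1/k
t = lft(1, 1, 0, 1)       # k -> k + 1
delta = lft(4, 0, 0, 2)   # k -> 2k, written with determinant 1

def generated_group(gens):
    identity = tuple(POINTS)
    seen, frontier = {identity}, [identity]
    while frontier:
        f = frontier.pop()
        for g in gens:
            h = compose(g, f)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

print(len(SL), len(SLF), generated_group([r, t, delta]) == SLF)
```

The worked example f(k) = (3k + 6)/5 corresponds to `lft(3, 6, 0, 5)`, which agrees with four applications of `t` after `delta`.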
Constructing an isomorphism of PSL(2, 7) onto GL(3, 2).
We now have all the ingredients needed to define the promised group isomorphism between
PSL(2, 7) and GL(3, 2). Using the functional descriptions of these groups, it will suffice to
define an isomorphism T : SLF(7) → GL(F8 ). We proceed in four stages.
1. Definition of T. For each function f ∈ SLF(7), we need to define an associated
function $T(f) = T_f$ ∈ GL(F8). How can we use the function f, whose domain is F7,
to build a function $T_f$, whose domain is F8? To relate these two domains, we define
$x^\infty = 0$ and then observe that $F_8 = \{x^k : k \in F_7\}$. This observation suggests the
map $x^k \mapsto x^{f(k)}$ as a possibility for $T_f$. However, this map is not always linear, since
zero maps to zero only if $f(\infty) = \infty$. To account for this difficulty, we instead define
$$T_f(x^k) = x^{f(k)} + x^{f(\infty)} \qquad (k \in F_7). \tag{18}$$
With this definition, $T_f(0) = 0$ always holds, though it is not yet evident that $T_f$
must belong to GL(F8).
Let us illustrate the formula by computing T (r), T (t), and T (δ). This computation
will reveal that each of these three functions does indeed lie in GL(F8 ). The function
r(k) = −1/k is given in two-line form as follows:
$$r = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & \infty \\ \infty & 6 & 3 & 2 & 5 & 4 & 1 & 0 \end{pmatrix}.$$
Since $x^3 + x + 1 = 0$ in F8, we have
$$x^3 = x + 1, \quad x^4 = x^2 + x, \quad x^5 = x^2 + x + 1, \quad x^6 = x^2 + 1.$$
Let us represent an element $b_2 x^2 + b_1 x^1 + b_0 x^0$ in F8 by the bit string $b_2 b_1 b_0$. In this
notation,
$$x^0 = 001, \quad x^1 = 010, \quad x^2 = 100, \quad x^3 = 011, \quad x^4 = 110, \quad x^5 = 111, \quad x^6 = 101, \quad x^\infty = 000.$$
Putting all this information into (18), we conclude that
$$T(r) = \begin{pmatrix} 001 & 010 & 100 & 011 & 110 & 111 & 101 & 000 \\ 001 & 100 & 010 & 101 & 110 & 111 & 011 & 000 \end{pmatrix}.$$
Note that $T_r$ just interchanges the first two bits. Thus, $T_r$ is the invertible linear map
on F8 that interchanges the basis vectors $x^1$ and $x^2$, and so the matrix of $T_r$ relative
to the ordered basis $B = (x^0, x^1, x^2)$ is
$$[T(r)]_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} = A_1.$$
It is even easier to compute T(t) and T(δ). We have $T_t(0) = 0$ and, for $k \neq \infty$,
$$T_t(x^k) = x^{t(k)} + x^{t(\infty)} = x^{k+1} = x(x^k).$$
Thus, $T_t$ is simply left-multiplication by x in the field F8. This map is linear by the
distributive law in F8, and it is invertible since the inverse map is left-multiplication
by $x^{-1} = x^6$. The matrix of $T_t$ is
$$[T(t)]_B = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} = A_2.$$
Finally, $T_\delta(0) = 0$ and, for $k \neq \infty$, $T_\delta(x^k) = x^{2k} = (x^k)^2$. So $T_\delta$ is the squaring
map in F8. This map is linear (and even a ring homomorphism) since $(u + v)^2 =
u^2 + 2uv + v^2 = u^2 + v^2$ for u, v ∈ F8. The map is one-to-one (hence onto) since the
kernel is zero. The matrix of $T_\delta$ is
$$[T(\delta)]_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} = A_3.$$
2. The key lemma. Suppose f, g ∈ SLF(7) are two functions such that T (f ) and T (g)
lie in GL(F8 ). Then T (f ◦ g) = T (f ) ◦ T (g), and hence T (f ◦ g) also lies in GL(F8 ).
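Proof: For any k ∈ F7, we compute
$$\begin{aligned}
T_f \circ T_g(x^k) &= T_f(T_g(x^k)) = T_f(x^{g(k)} + x^{g(\infty)}) \\
&= T_f(x^{g(k)}) + T_f(x^{g(\infty)}) && \text{(since $T_f$ is linear)} \\
&= (x^{f(g(k))} + x^{f(\infty)}) + (x^{f(g(\infty))} + x^{f(\infty)}) \\
&= x^{f(g(k))} + x^{f(g(\infty))} && \text{(since $u + u = 0$ for all $u \in F_8$)} \\
&= T_{f \circ g}(x^k).
\end{aligned}$$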
3. Proof that T is a homomorphism mapping into GL(F8 ). We have seen that each
element of SLF(7) can be written as a product of the generators r, t, and δ (using only positive powers, in fact). Since T (r), T (t), and T (δ) are known to lie in
GL(F8 ), repeated application of the lemma shows that T (h) lies in GL(F8 ) for all
h ∈ SLF(7). Having drawn this conclusion, the lemma now shows that T is a group
homomorphism.
4. Proof that T is a bijection. So far, we know that T is a group homomorphism
mapping SLF(7) into GL(F8). T is actually onto, since the image of T contains
⟨T(r), T(t), T(δ)⟩, which is the whole group GL(F8). Since SLF(7) and GL(F8) both
have 168 elements, T must also be one-to-one.
Our proof that PSL(2, 7) ≅ GL(3, 2) is now complete. We leave it as a challenge for the
reader to find an explicit description of the inverse bijection $T^{-1}$ : GL(F8) → SLF(7).
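The four stages can be replayed numerically. The sketch below is only a check under stated assumptions, not the authors' method: elements of F8 are stored as the integers 0 to 7 read as bit strings $b_2 b_1 b_0$ (so addition is XOR), the index 7 stands for ∞, and Python 3.8 or later is assumed for `pow(k, -1, 7)`. The helper names `T`, `matrix`, `closure` and `is_linear` are hypothetical.

```python
INF = 7                           # index standing for the point at infinity
POW = [1, 2, 4, 3, 6, 7, 5, 0]    # POW[k] = x^k as bits b2 b1 b0; POW[INF] = 0

def r(k):
    if k == INF:
        return 0
    if k == 0:
        return INF
    return -pow(k, -1, 7) % 7

def t(k):
    return INF if k == INF else (k + 1) % 7

def delta(k):
    return INF if k == INF else (2 * k) % 7

def table(f):
    return tuple(f(k) for k in range(8))

def compose(f, g):
    """The composite f o g of two maps stored as tuples."""
    return tuple(f[g[k]] for k in range(8))

def T(f):
    """Formula (18): T_f(x^k) = x^f(k) + x^f(inf), indexed by elements of F_8."""
    image = [0] * 8
    for k in range(8):
        image[POW[k]] = POW[f[k]] ^ POW[f[INF]]
    return tuple(image)

def is_linear(m):
    return all(m[u ^ v] == m[u] ^ m[v] for u in range(8) for v in range(8))

def matrix(m):
    """Matrix relative to the basis (x^0, x^1, x^2); column j holds m(x^j)."""
    return [[(m[1 << j] >> i) & 1 for j in range(3)] for i in range(3)]

def closure(gens):
    identity = tuple(range(8))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

R, S, D = table(r), table(t), table(delta)
slf7 = closure([R, S, D])
images = {T(f) for f in slf7}
print(matrix(T(R)), matrix(T(S)), matrix(T(D)))
print(len(slf7), len(images), all(is_linear(m) for m in images))
```

Here `images` having 168 distinct linear maps reflects that T is one-to-one, and 168 = (8 − 1)(8 − 2)(8 − 4) is the order of GL(3, 2).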
And now, back to our story.
Associated with every block design and every finite projective geometry is a matrix
called its incidence matrix, and such matrices lead to some more interesting combinatorial
connections, including error-correcting codes, Hadamard matrices, sphere packing, and
factoring with quadratic forms. So, let’s find out about these matrices.
10 Incidence matrices, codes, and geometries
Let P be a finite projective geometry with p points and q lines. The incidence matrix
of P is a p × q matrix M = [mij ]. The rows and columns are indexed by the points and
lines, respectively, and mij = 1 or 0 according as the ith point is or is not on the jth line.
The following 7 × 7 matrix F is the incidence matrix of the Fano plane:
$$F = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0
\end{pmatrix}.$$
We also use such matrices to describe the relations between points and blocks in a block
design: the rows and columns of the v × b matrix M are indexed by the points and blocks
and mij = 1 or 0 according as the ith point is or is not in the jth block. It so happens that
F is also the incidence matrix of the (7, 3, 1) block design with points 1, . . . , 7 and blocks
123, 145, 167, 246, 257, 347, and 356.
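This matrix can be used to solve special cases of Zarankiewicz’s problem, namely:
What is the maximum number Z(m, n; i, j) of ones that can occur in an m × n 0–1
matrix that doesn’t contain an i × j submatrix consisting entirely of ones? The
incidence matrix above has 21 ones and, since two distinct lines meet in exactly one point,
no 2 × 2 submatrix of ones, so Z(7, 7; 2, 2) ≥ 21. Equality holds: two rows may not have
ones in the same pair of columns, and 22 ones in seven rows would require at least 24
such pairs among only 21 pairs of columns. Hence Z(7, 7; 2, 2) = 21.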
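The lower bound is easy to confirm by machine. This is a small sketch that rebuilds F from the seven blocks 123, 145, 167, 246, 257, 347, 356 and searches for a 2 × 2 submatrix of ones; the variable names are illustrative.

```python
from itertools import combinations

blocks = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
# Row i of F is the indicator vector of the i-th block.
F = [[1 if p in blk else 0 for p in range(1, 8)] for blk in blocks]

ones = sum(map(sum, F))
# A 2x2 all-ones submatrix needs two rows and two columns meeting in four ones.
has_2x2 = any(
    F[r1][c1] and F[r1][c2] and F[r2][c1] and F[r2][c2]
    for r1, r2 in combinations(range(7), 2)
    for c1, c2 in combinations(range(7), 2)
)
print(ones, has_2x2)
```

This only exhibits a matrix with 21 ones and no forbidden submatrix; the matching upper bound comes from the counting argument, not from the code.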
But incidence matrices have many uses – remember F , because it’s important – and
we’ll next see how they are connected to error-correcting codes.
Error-correcting codes
Let’s begin with two parties, Alice and Bob, who want to communicate with each other.
Alice is sending a message to Bob. The message is expressed in some way as a sequence
of strings of characters or codewords, which are sent to a receiver, one at a time. Errors
can happen in the process, so the string Bob receives may fail to be a codeword. They can
stop there, or they may try to correct the error. In every case we will consider, Alice will
build some extra information into each codeword, and we will describe how this is done.
Bob will use the extra information to test a string for an error and—if there is an error—to
replace the bad string with the “closest” codeword (in the sense we’ll describe). We are
not concerned with the process by which the original message is translated into codewords
or vice versa. We are only concerned with Alice sending one codeword at a time to Bob,
possibly with some characters changed by error, and then with Bob trying to reconstruct
the original codeword.
Mathematical schemes to deal with such errors first appeared in the 1940s in the work
of several researchers, including Claude Shannon, Richard Hamming, and Marcel Golay.
These researchers saw the need for something that would automatically detect and correct
errors in signal transmissions across channels that were noisy and hence were likely to
produce such errors. Their work led to a new branch of mathematics called coding theory,
specifically the study of error-detecting and error-correcting codes. They modeled these
signals as sets of n-long strings called blocks, to be taken from a fixed alphabet of size q;
a particular set C of such blocks, or codewords, is called a q-ary code of length n. If q is
a prime number, then a q-ary code of length n is called linear if the codewords form a
subspace of $Z_q^n$, the n-dimensional vector space over $Z_q$, the integers mod q. To correct
errors means to determine the right codeword in case it was incorrectly received. Just
how this correction happens will vary from code to code.
The fact that d errors in transmission change d characters in a block gives rise to the
idea of distance between blocks. If v and w are n-blocks, then the (Hamming) distance
D(v, w) is the number of positions in which v and w differ. Thus, D(11001, 10101) = 2
and D(1102002, 2011012) = 5. If Alice sends the block v and Bob receives the block w,
then D(v, w) errors occurred while sending v. The Hamming sphere of radius d about an
n-block w, denoted S(w, d), is the set of all n-blocks whose Hamming distance from w is at
most d. Finally, the (Hamming) weight of a codeword is the number of nonzero characters.
We see that if the words in a code are all “far apart” in the Hamming distance sense,
then we can detect errors. Even better, if we assume that only a few errors occur,
then we can sometimes change the received block to the correct codeword.
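The definitions of distance, sphere and weight translate directly into code. The following is a minimal sketch with blocks represented as strings of digits; the function names are illustrative and not taken from the text.

```python
from itertools import product

def hamming_distance(v, w):
    """Number of positions in which two blocks of equal length differ."""
    if len(v) != len(w):
        raise ValueError("blocks must have the same length")
    return sum(a != b for a, b in zip(v, w))

def hamming_weight(v):
    """Number of nonzero characters in a block."""
    return sum(c != "0" for c in v)

def sphere(w, d, alphabet="01"):
    """All blocks over the alphabet within Hamming distance d of w."""
    return {"".join(u) for u in product(alphabet, repeat=len(w))
            if hamming_distance(u, w) <= d}

print(hamming_distance("11001", "10101"), hamming_distance("1102002", "2011012"))
print(sorted(sphere("000", 1)))
```

The sphere of radius 1 about 000 consists of 000, 001, 010 and 100, which are exactly the strings that the triplication code decodes to 0.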
A simple example of a binary code of length 3 consists of only two codewords, 000 and
111. If Bob receives 010, then it is most likely that Alice sent 000 and so the intended
message was 0; this is the triplication or majority-vote code. Effectively, a 3-bit codeword
consists of one “message bit” sent three times. More generally, a codeword of length n
contains a certain number k of message bits, and the other n − k check bits are used for
error detection and correction. Such a code is called an (n, k) code: the triplication code
is a (3, 1) code.
We have presented k as the number of message bits, but it can be defined more clearly
as the dimension of the subspace consisting of the codewords. This makes sense only for
linear codes—but we are only concerned with linear codes here.
The minimum distance of a code is the smallest distance between its codewords; this
minimum distance determines the code’s error detection and correction features. For example, show that a code with minimum distance 5 will detect up to 4 errors and correct
up to 2. You can show that a code with minimum distance d will detect up to d − 1
errors and correct up to ⌊(d − 1)/2⌋ errors. We see that if the Hamming spheres S(w, d)
of radius d about all codewords w are pairwise disjoint, then the code can correct up to d
errors. Maximum efficiency in an (n, k) d-error correcting code C occurs when every string
of length n is either a codeword or at a distance of at most d from a unique codeword—
equivalently, when the Hamming spheres of radius d about all codewords partition the set
of all n-blocks. This is a rare event, and a code with this property is called perfect.
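The partition condition for a perfect code can be tested by counting: the spheres of radius d about the $q^k$ codewords must together contain exactly $q^n$ blocks. The sketch below checks only this counting condition, which is necessary for perfection and is sufficient once the spheres are known to be disjoint; the function names are illustrative, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def sphere_size(n, d, q=2):
    """Number of n-blocks over a q-letter alphabet within distance d of a fixed block."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(d + 1))

def spheres_fill_space(n, k, d, q=2):
    """True when q^k spheres of radius d account for all q^n blocks."""
    return q ** k * sphere_size(n, d, q) == q ** n

print(spheres_fill_space(7, 4, 1), spheres_fill_space(3, 1, 1), spheres_fill_space(8, 4, 1))
```

For length 7 and dimension 4 the count is 16 · 8 = 128, so the condition holds; for length 8 and dimension 4 it fails.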
Hamming’s first error correcting scheme was a perfect 1-error correcting code of length
seven with four message bits, three check bits, and minimum distance 3; hence, it could
correct all errors in which a single bit was received incorrectly. Golay extended Hamming’s
work and constructed a family of $(2^n - 1, 2^n - 1 - n)$ linear binary perfect 1-error correcting
codes of minimum distance 3 for all n ≥ 2. These are now known as the binary Hamming
codes, and they include both Hamming’s original (7, 4) code and the (3, 1) triplication code.
The notation H(m, k) refers to a linear binary perfect 1-error correcting code of length m
and dimension k.
H(7, 4), Hamming’s first code—the perfect single-error correcting code of length 7—was
described in 1948 in [143, p. 418], as follows:
Let a block of seven [binary] symbols be X1 , X2 , . . . , X7 . Of these X3 , X5 , X6
and X7 are the message symbols and chosen arbitrarily by the source. The
other three are redundant and calculated as follows:
X4 is chosen to make α = X4 + X5 + X6 + X7 even
X2 is chosen to make β = X2 + X3 + X6 + X7 even
X1 is chosen to make γ = X1 + X3 + X5 + X7 even.
When a block of seven is received, α, β and γ are calculated and if even, called
zero, if odd, called one. The binary number αβγ then gives the subscript of the
Xi that is incorrect (if 0, there was no error).
Now this procedure determines α, β and γ mod 2 in the following way. Suppose exactly
one of the seven bits, say Xj, is incorrect. Write each subscript as a three-bit binary
number. Since α = X4 + X5 + X6 + X7 adds up the Xi whose subscript i has high bit 1,
it follows that α = 1 if and only if j = 4, 5, 6 or 7, that is, if the high bit of j is 1.
Similarly, β = X2 + X3 + X6 + X7 adds up the Xi whose subscript has middle bit 1,
so β = 1 if and only if j = 2, 3, 6 or 7, i.e. if the middle bit of j is 1. Finally,
γ = X1 + X3 + X5 + X7 adds up the Xi whose subscript has low bit 1, and so γ = 1
if and only if j = 1, 3, 5 or 7, i.e. if the low bit of j is 1. Thus, an error in Xj affects
those, and only those, of α, β and γ whose sum contains Xj, and αβγ is the binary
representation of j.
Another way to describe the decoding procedure is that if X = (X1, . . . , X7) is a 7-bit
string, then compute $v = P \cdot X^t$, where
$$P = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}.$$
P is constructed in such a way that if the vector v is identical to the ith column of P, then
Xi is the incorrect bit, and if v = 0 then there is no error.
The free choices for the four message symbols show that there are 16 codewords, and
the condition that $P \cdot v^t = 0$ (when v is a codeword) means that the vector v is in the (right)
nullspace of the matrix P. Thus, the 16 codewords are closed under both addition and
scalar multiplication by 0 and 1. In short, the codewords form a 4-dimensional subspace
of $(Z/2Z)^7$ and we see that the above code is a linear code. More generally, if C is a linear
code that is the nullspace of a matrix Q, then we call Q the parity check matrix for the
code.
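The encoding rule quoted above and the syndrome computation $v = P \cdot X^t$ fit in a few lines. The following is a sketch of that procedure under the conventions of the quotation (message bits X3, X5, X6, X7; check bits X4, X2, X1); the function names `encode`, `syndrome` and `correct` are illustrative.

```python
from itertools import product

# Rows of the parity-check matrix P; column i is i written in binary.
P = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def encode(x3, x5, x6, x7):
    """Place the four message bits and fill in the check bits X4, X2, X1."""
    x4 = (x5 + x6 + x7) % 2
    x2 = (x3 + x6 + x7) % 2
    x1 = (x3 + x5 + x7) % 2
    return [x1, x2, x3, x4, x5, x6, x7]

def syndrome(word):
    """The number whose binary digits are alpha, beta, gamma."""
    alpha, beta, gamma = (sum(p * x for p, x in zip(row, word)) % 2 for row in P)
    return 4 * alpha + 2 * beta + gamma

def correct(word):
    """Flip the bit named by the syndrome; leave the word alone if it is 0."""
    fixed = list(word)
    s = syndrome(word)
    if s:
        fixed[s - 1] ^= 1
    return fixed

codewords = [encode(*bits) for bits in product((0, 1), repeat=4)]
received = list(codewords[5])
received[2] ^= 1                 # corrupt X3
print(syndrome(received), correct(received) == codewords[5])
```

Corrupting X3 gives syndrome 3, the third column of P read as a binary number, and flipping that bit restores the codeword.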
Hamming’s scheme, then, takes every 7-long binary string with a single error and corrects
that error, producing the corrected 7-bit codeword, whence the name “binary single
error-correcting code of length 7”. Since this code has length 7 and dimension 4, we call
it the binary Hamming code H(7, 4). The triplication code is an H(3, 1) binary Hamming
code and its parity-check matrix is
$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}.$$
The parity-check matrix P has another interesting feature. Write P = [P1, . . . , P7],
so that Pi is the ith column of P, and consider the set Ci of columns of P whose dot
products with Pi equal zero:
 i   Pi    binary Ci          decimal Ci        i   Bi        {j : Bi,j = 1}
 1   001   {010, 100, 110}    {2, 4, 6}         1   1110000   {1, 2, 3}
 2   010   {001, 100, 101}    {1, 4, 5}         2   1001100   {1, 4, 5}
 3   011   {011, 100, 111}    {3, 4, 7}         3   1000011   {1, 6, 7}
 4   100   {001, 010, 011}    {1, 2, 3}         4   0101010   {2, 4, 6}
 5   101   {010, 101, 111}    {2, 5, 7}         5   0100101   {2, 5, 7}
 6   110   {001, 110, 111}    {1, 6, 7}         6   0011001   {3, 4, 7}
 7   111   {011, 101, 110}    {3, 5, 6}         7   0010110   {3, 5, 6}
Figure 46: The (7, 3, 1) design (left) and the seven codewords of weight 3 (right).
A second appearance of (7, 3, 1) in this code is in the table on the right side of Figure 46.
This table comes from the seven codewords of weight 3. The blocks Bi = Bi,1 . . . Bi,7 are
the codewords, the points j are the integers 1, . . . , 7, and j ∈ Bi if and only if Bi,j = 1—
another (7, 3, 1) design.
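Both tables of Figure 46 can be regenerated from P alone. The sketch below, with illustrative names, computes the sets Ci from dot products of columns and the supports of the weight-3 codewords from the nullspace of P, then tests the defining property of a (7, 3, 1) design.

```python
from itertools import combinations, product

# Column P_i of the parity-check matrix is i written as three bits.
columns = {i: ((i >> 2) & 1, (i >> 1) & 1, i & 1) for i in range(1, 8)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

# Left table: C_i = indices of the columns orthogonal to P_i.
C = {i: frozenset(j for j in columns if dot(columns[i], columns[j]) == 0)
     for i in columns}

def is_codeword(X):
    """True when P times X is the zero vector mod 2."""
    return all(sum(columns[j + 1][row] * X[j] for j in range(7)) % 2 == 0
               for row in range(3))

# Right table: supports of the codewords of weight 3.
B = {frozenset(j + 1 for j in range(7) if X[j])
     for X in product((0, 1), repeat=7)
     if sum(X) == 3 and is_codeword(X)}

def is_7_3_1(blocks):
    """Seven triples on {1..7} with every pair of points in exactly one block."""
    return (len(blocks) == 7 and all(len(b) == 3 for b in blocks) and
            all(sum(set(pair) <= b for b in blocks) == 1
                for pair in combinations(range(1, 8), 2)))

print(sorted(map(sorted, C.values())))
print(is_7_3_1(set(C.values())), is_7_3_1(B))
```

In this labelling the two families of blocks turn out to be the same seven triples.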
Finally, Hamming’s system of three congruences (mod 2) has a nice pictorial interpretation,
as follows. Draw the usual three-circle Venn diagram for three sets. Next, associate
the region that is in all three sets with X7, associate the regions that are in exactly two of
the sets with X3, X5 and X6, and associate the regions that are in exactly one of the sets
with X1, X2 and X4. We see that each region of the diagram is associated with exactly one
of the Xi, and each Xi appears exactly once (see Figure 47).
Hamming’s scheme can be realized by placing each Xi in its corresponding region and
requiring that the number of 1s in each of the circles be even. Pictorially, if exactly one Xi
is switched from x to 1 − x, then it will be the value in the region contained in exactly those
circles with an odd number of 1s. As a bonus, this picture also shows us one more appearance
of (7, 3, 1) (on the right-hand side of Figure 47), and so the Hamming (7, 4) code gives us
three different views of (7, 3, 1)!
Figure 47: The three-circle Venn diagram (left) with another instance of (7, 3, 1) (right)
Choose any two rows from the matrix F and compare them. They differ in exactly
four places. If we use them as codewords to send a message and one of the bits is changed
en route, the recipient would still be able to tell which word was meant. If two bits were
changed, then she would still know that there had been an error, but might not know which
of two words was intended. We can amplify this code by adding the unit word (the first row in
Figure 48), and then the complements of these eight words, and finally a parity-check
bit (the first column) making the sum of each row even (one of weight 0, one of weight 8,
and the other fourteen of weight 4). This is the extended Hamming code H8, as seen
in Figure 48.
      1 1 1 1 1 1 1 1
      1 1 1 1 0 0 0 0
      1 1 0 0 1 1 0 0
      1 1 0 0 0 0 1 1
      1 0 1 0 1 0 1 0
      1 0 1 0 0 1 0 1
      1 0 0 1 1 0 0 1
      1 0 0 1 0 1 1 0
      0 1 1 0 1 0 0 1
      0 1 1 0 0 1 1 0
      0 1 0 1 1 0 1 0
      0 1 0 1 0 1 0 1
      0 0 1 1 1 1 0 0
      0 0 1 1 0 0 1 1
      0 0 0 0 1 1 1 1
      0 0 0 0 0 0 0 0
Figure 48: The extended Hamming code H8.
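The construction of H8 from the rows of F can be followed step by step. This is a sketch with words stored as bit strings; listing the complements in reverse order is an assumption made here so that the rows come out in the order of Figure 48.

```python
fano_rows = ["1110000", "1001100", "1000011", "0101010",
             "0100101", "0011001", "0010110"]

def complement(w):
    return "".join("1" if c == "0" else "0" for c in w)

def xor(v, w):
    return "".join("1" if a != b else "0" for a, b in zip(v, w))

def distance(v, w):
    return sum(a != b for a, b in zip(v, w))

# The unit word, the seven rows of F, then the complements of these eight words.
words7 = ["1111111"] + fano_rows
words7 += [complement(w) for w in reversed(words7)]
# Prepend a parity bit so that every row has an even number of ones.
H8 = [str(w.count("1") % 2) + w for w in words7]

weights = sorted(w.count("1") for w in H8)
min_distance = min(distance(v, w) for v in H8 for w in H8 if v != w)
closed = all(xor(v, w) in H8 for v in H8 for w in H8)
print(weights, min_distance, closed)
```

The sixteen words are closed under addition mod 2, so they form a linear code, and the minimum distance is 4.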
Sphere packing
The 16 × 8 matrix H8 for the extended Hamming code turns up in connection with a
sphere packing, which refers to an arrangement of nonoverlapping identical spheres in
n-dimensional space. An important sphere-packing problem is to determine the kissing
number, i.e., the maximal number of equal size nonoverlapping spheres in n dimensions
that can touch another sphere of the same size. For example, we know that the kissing
number for n = 2 is 6, because in the usual hexagonal arrangement of congruent nonoverlapping discs in the plane, every disc touches six others, and this is known to be best
possible. The kissing number for three dimensions is 12, and for dimensions n > 3, very
few of these numbers are known – and this is where H8 comes in.
In the extended Hamming code H8, if we replace the zero vector by eight vectors $(1, 0^7)$,
each of the fourteen vectors of weight four by sixteen vectors $((\pm\tfrac{1}{2})^4, 0^4)$ and the
weight-eight vector by eight vectors $(-1, 0^7)$, we have the 240 minimal vectors of the lattice E8,
the best lattice packing of spheres in eight dimensions, displaying the kissing number 240
in eight dimensions.
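The recipe for the 240 vectors can be carried out literally. The sketch below reads $(1, 0^7)$ as a single 1 in any of the eight positions and $((\pm\tfrac{1}{2})^4, 0^4)$ as entries ±1/2 on the four positions where the codeword has a 1; both readings are assumptions about the shorthand. It then checks that every vector has length 1 and that distinct vectors have inner product at most 1/2, which is what touching unit spheres require.

```python
from fractions import Fraction
from itertools import product

fano_rows = ["1110000", "1001100", "1000011", "0101010",
             "0100101", "0011001", "0010110"]
words7 = ["1111111"] + fano_rows
words7 += ["".join("1" if c == "0" else "0" for c in w) for w in words7]
H8 = [str(w.count("1") % 2) + w for w in words7]   # the extended Hamming code

ZERO, ONE, HALF = Fraction(0), Fraction(1), Fraction(1, 2)
vectors = set()
for w in H8:
    weight = w.count("1")
    if weight in (0, 8):
        sign = ONE if weight == 0 else -ONE
        for i in range(8):
            vectors.add(tuple(sign if j == i else ZERO for j in range(8)))
    else:
        support = [i for i, c in enumerate(w) if c == "1"]
        for signs in product((HALF, -HALF), repeat=4):
            v = [ZERO] * 8
            for i, s in zip(support, signs):
                v[i] = s
            vectors.add(tuple(v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

norms = {dot(v, v) for v in vectors}
largest = max(dot(u, v) for u in vectors for v in vectors if u != v)
print(len(vectors), norms, largest)
```

The count is 8 + 14 · 16 + 8 = 240; the code does not by itself prove that 240 is the kissing number.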
E8 can also be identified with a certain structure within W. R. Hamilton’s quaternions
H, i.e., the 4-dimensional normed algebra over R with basis {1, i, j, k}, where $i^2 = j^2 =
k^2 = ijk = -1$. That structure is the ring of so-called icosians, i.e., finite sums of members
of a 120-element subgroup of H consisting of eight elements of shape $(\pm 1, 0^3)$, sixteen of
shape $(\pm\tfrac{1}{2})^4$, and the 96 even permutations of $\tfrac{1}{2}(0, \pm 1, \pm\sigma, \pm\tau)$, where $\sigma, \tau = \tfrac{1}{2}(1 \mp \sqrt{5})$.
You might remember that we first met τ in Section 1, where we discussed Beatty sequences.
Finally, we cannot mention Hamilton’s 4-dimensional quaternion algebra and not mention the 8-dimensional octonion algebra, discovered independently by J. T. Graves [60]
and Arthur Cayley [28], [29]. We will say more about the octonions in a later chapter;
meanwhile, the article by Baez [4] and the book by Conway and Smith [39] are excellent
references.
Hadamard matrices and Hadamard difference sets
If, in the first eight words of the extended binary Hamming code, we replace zeroes by minus
ones (represented by minus signs) and replace ones by plus signs, we obtain a matrix H8
of order 8, where
$$H_8 = \begin{pmatrix}
+ & + & + & + & + & + & + & + \\
+ & + & + & + & - & - & - & - \\
+ & + & - & - & + & + & - & - \\
+ & + & - & - & - & - & + & + \\
+ & - & + & - & + & - & + & - \\
+ & - & + & - & - & + & - & + \\
+ & - & - & + & + & - & - & + \\
+ & - & - & + & - & + & + & -
\end{pmatrix}.$$
H8 is an example of a Hadamard matrix, an n × n matrix of ones and minus ones in
which any distinct pair of rows (or columns) is orthogonal: $HH^{tr} = nI$ (here, n = 8). It is
known that n must be either 1, 2, or a multiple of 4. It is believed that Hadamard matrices
exist for all orders n = 4k, but many remain to be constructed.
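The orthogonality relation for H8 is a finite check. A minimal sketch, with the rows typed in as sign strings and illustrative function names:

```python
rows = ["++++++++", "++++----", "++--++--", "++----++",
        "+-+-+-+-", "+-+--+-+", "+--++--+", "+--+-++-"]
H8 = [[1 if c == "+" else -1 for c in row] for row in rows]

def gram(H):
    """The product of H with its transpose."""
    n = len(H)
    return [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_hadamard(H):
    n = len(H)
    return gram(H) == [[n if i == j else 0 for j in range(n)] for i in range(n)]

print(is_hadamard(H8))
```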
Now the Hadamard matrix H8 we’ve just constructed came from the Hamming code,
which came from the incidence matrix of the projective geometry. We could also construct
a cyclic Hadamard matrix from the other labelling of the Fano configuration, which came
from the (7, 3, 1) difference set {1, 2, 4}:
       0  1  2  3  4  5  6  7
  0    +  +  +  +  +  +  +  +
  1    +  +  +  −  +  −  −  −
  2    +  −  +  +  −  +  −  −
  3    +  −  −  +  +  −  +  −
  4    +  −  −  −  +  +  −  +
  5    +  +  −  −  −  +  +  −
  6    +  −  +  −  −  −  +  +
  7    +  +  −  +  −  −  −  +
As a matter of fact, this is derived from the previous Hadamard matrix by arranging,
first the rows in the order 01753426 and then the columns in order 01423765. Thus,
these two Hadamard matrices are isomorphic. As n gets larger, however, the number of
nonisomorphic Hadamard matrices of order n increases rapidly. For example, there are
13710027 isomorphism classes of Hadamard matrices of order 32.
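The cyclic matrix and its relation to H8 can be checked as follows. This sketch builds the bordered circulant in which row i (for i = 1, ..., 7) carries plus signs in the columns {1, 2, 4} + (i − 1) mod 7, and it reads the phrase "rows in the order 01753426" as: old row i moves to the position given by the ith digit. That reading, like the names `row_pos` and `col_pos`, is an assumption made here.

```python
D = {1, 2, 4}   # the (7, 3, 1) difference set

def entry(i, j):
    """Border of plus signs; inside, + exactly when j lies in D + (i - 1) mod 7."""
    if i == 0 or j == 0:
        return 1
    return 1 if (j - i + 1) % 7 in D else -1

cyclic = [[entry(i, j) for j in range(8)] for i in range(8)]

rows = ["++++++++", "++++----", "++--++--", "++----++",
        "+-+-+-+-", "+-+--+-+", "+--++--+", "+--+-++-"]
H8 = [[1 if c == "+" else -1 for c in row] for row in rows]

row_pos = [0, 1, 7, 5, 3, 4, 2, 6]   # old row i goes to position row_pos[i]
col_pos = [0, 1, 4, 2, 3, 7, 6, 5]   # old column j goes to position col_pos[j]
rearranged = [[0] * 8 for _ in range(8)]
for i in range(8):
    for j in range(8):
        rearranged[row_pos[i]][col_pos[j]] = H8[i][j]

def is_hadamard(H):
    n = len(H)
    return all(sum(H[i][k] * H[j][k] for k in range(n)) == (n if i == j else 0)
               for i in range(n) for j in range(n))

print(is_hadamard(cyclic), rearranged == cyclic)
```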
Multiplying a row or a column by −1 preserves the orthogonality relations, as does
interchanging rows or columns, and by doing so judiciously, one can transform a Hadamard
matrix into one in which the entries of first row and first column are all ones. Such a
Hadamard matrix is said to be normalized, and the orthogonality relations imply that
every row and every column except for the first one contains the same number of 1s and
−1s. One sees that the matrix H8 is normalized; it turns out that if q = 4n − 1 is a prime,
then a normalized Hadamard matrix of order q + 1 = 4n is an essential part of Paley’s
construction of a (4n − 1, 2n − 1, n − 1) difference set consisting of the nonzero squares mod q;
for q = 7 this is the (7, 3, 1) difference set {1, 2, 4}.
A Hadamard difference set is a difference set with parameters $(4n^2, 2n^2 - n, n^2 - n)$, and
these are considerably harder to come by. Such a difference set exists if and only if there
exists a regular Hadamard matrix of order $4n^2$, i.e. one for which each row has the same
number of ones. As noted in Section 7, there is a (16, 6, 2) difference set and so a regular
Hadamard matrix of order 16 exists.
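The difference-set property of the nonzero squares is quick to test for small primes q = 4n − 1. The sketch below counts how often each nonzero residue arises as a difference of two squares; the function names are illustrative.

```python
from collections import Counter

def nonzero_squares(q):
    return {x * x % q for x in range(1, q)}

def difference_counts(D, q):
    """How often each residue occurs as a difference of two distinct elements of D."""
    return Counter((a - b) % q for a in D for b in D if a != b)

for q in (7, 11, 19, 23):
    n = (q + 1) // 4
    D = nonzero_squares(q)
    counts = difference_counts(D, q)
    print(q, len(D), set(counts.values()), n - 1)
```

For each of these primes every nonzero residue occurs exactly n − 1 times, in agreement with the parameters (4n − 1, 2n − 1, n − 1).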
Factoring with quadratic forms
Another place where Hadamard matrices have been put to use is connected to the integer
factoring problem. The pattern in the first Hadamard matrix also occurred when D. H.
Lehmer and Emma Lehmer [104] were using their sieve to factor large integers via quadratic
forms (see Figure 49).
 x^2 − dy^2 :  d =    1   −1    2   −2    3   −3    6   −6
 24k + 1              +    +    +    +    +    +    +    +
 24k + 17             +    +    +    +    −    −    −    −
 24k + 13             +    +    −    −    +    +    −    −
 24k + 5              +    +    −    −    −    −    +    +
 24k + 23             +    −    +    −    +    −    +    −
 24k + 7              +    −    +    −    −    +    −    +
 24k + 11             +    −    −    +    +    −    −    +
 24k + 19             +    −    −    +    −    +    +    −
Figure 49: Lehmer factoring I.
A row represents a residue class (mod 24) of a number N to be factored. A column
represents a quadratic form – for example, the column headed by −2 refers to the quadratic
form x2 −(−2)y 2 = x2 +2y 2 , where x and y are integers. A plus sign means that N might be
representable by the form, a minus sign that it cannot. Reading down the column headed
by −2, we see that if N ≡ 1, 17, 11, or 19 (mod 24) then there may be a representation of
N by the quadratic form x2 + 2y 2 .
Two distinct representations of N by the same quadratic form will lead us to a factorization
of N. For example, let N = 12199148107; then N ≡ 3 mod 8. The congruence
conditions N ≡ 1 or 3 mod 8 are necessary for N to have a representation as $x^2 + 2y^2$, and
as it happens, a search yields the representations
$$N = 67307^2 + 2 \cdot 61923^2 = x_1^2 + 2y_1^2, \quad\text{and}\quad N = 105115^2 + 2 \cdot 23979^2 = x_2^2 + 2y_2^2.$$
A calculation shows that $\mathrm{GCD}(N, x_1 y_2 - x_2 y_1) = 116483 = p$, and $\mathrm{GCD}(N, x_1 y_2 + x_2 y_1) =
104729 = q$. Sure enough, pq = 12199148107 = N.
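The arithmetic of this example can be replayed with integer operations only. A minimal sketch, using the two representations given in the text:

```python
from math import gcd

x1, y1 = 67307, 61923
x2, y2 = 105115, 23979

N = x1 ** 2 + 2 * y1 ** 2
same = N == x2 ** 2 + 2 * y2 ** 2       # both pairs represent the same N

p = gcd(N, abs(x1 * y2 - x2 * y1))
q = gcd(N, x1 * y2 + x2 * y1)
print(N, N % 8, same, p, q, p * q == N)
```

The two cross terms share different factors with N because $(x_1 y_2 - x_2 y_1)(x_1 y_2 + x_2 y_1) = x_1^2 y_2^2 - x_2^2 y_1^2 \equiv 0 \pmod N$ when both pairs represent N by $x^2 + 2y^2$.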
See also [69], the source for many of the numbers factored by this method. The Lehmers
also used the prime 5 as well as 2 and 3, giving the above scheme mod 120, and the resulting
16 × 16 array of 1s and −1s is a Hadamard matrix H16 of order 16 (FIGURE 50 GOES
ABOUT HERE:)
Hadamard matrices and projective geometries
The Hadamard matrix H8 is connected to the projective plane of order 2. For, as seen in
Section 10, the 7 × 7 matrix F is the incidence matrix of the projective plane of order 2,
and if you remove the top row and left column from H8 and replace 1s and −1s by 1s and 0s
respectively, then you obtain F.
In the same way, the array H16 is connected to PG(3, F2), the 3-dimensional projective
geometry of order 2. Omit the first row and column and replace 1s and −1s by 1s and 0s and
we obtain the incidence matrix for the fifteen points of PG(3, F2) (the row headings), seven
of which lie on each of fifteen Fano planes contained in PG(3, F2) (the column headings);
see Figure 51.
 x^2 − dy^2 :  d =    1   −1    2   −2    3   −3    6   −6    5   −5   10  −10   15  −15   30  −30
 120k + 1, 49         +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +
 73, 97               +    +    +    +    +    +    +    +    −    −    −    −    −    −    −    −
 41, 89               +    +    +    +    −    −    −    −    +    +    +    +    −    −    −    −
 17, 113              +    +    +    +    −    −    −    −    −    −    −    −    +    +    +    +
 61, 109              +    +    −    −    +    +    −    −    +    +    −    −    +    +    −    −
 13, 37               +    +    −    −    +    +    −    −    −    −    +    +    −    −    +    +
 29, 101              +    +    −    −    −    −    +    +    +    +    −    −    −    −    +    +
 53, 77               +    +    −    −    −    −    +    +    −    −    +    +    +    +    −    −
 71, 119              +    −    +    −    +    −    +    −    +    −    +    −    +    −    +    −
 23, 47               +    −    +    −    +    −    +    −    −    +    −    +    −    +    −    +
 31, 79               +    −    +    −    −    +    −    +    +    −    +    −    −    +    −    +
 7, 103               +    −    +    −    −    +    −    +    −    +    −    +    +    −    +    −
 11, 59               +    −    −    +    +    −    −    +    +    −    −    +    +    −    −    +
 83, 107              +    −    −    +    +    −    −    +    −    +    +    −    −    +    +    −
 19, 91               +    −    −    +    −    +    +    −    +    −    −    +    −    +    +    −
 43, 67               +    −    −    +    −    +    +    −    −    +    +    −    +    −    −    +
Figure 50: Lehmer factoring II.
In the following description, the points are the nonzero binary strings of length four, the
subscript 2 indicating the binary nature of the strings. The fifteen points may be thought
of as
• the four vertices $(1, 0^3)$ (i.e., $1 = 0001_2$, $2 = 0010_2$, $4 = 0100_2$, $8 = 1000_2$) of a
tetrahedron,
• the six midpoints $(1^2, 0^2)$ (i.e., 3, 5, 6, 9, 10, 12) of its edges,
• the four centroids $(1^3, 0)$ (i.e., 7, 11, 13, 14) of the faces, and
• the body centroid $(1^4)$ (i.e., 15).
Dually, the planes are
• the four faces (x = 0, y = 0, z = 0, w = 0) of the tetrahedron,
• the six medial planes, such as x + y = 0, containing an edge and the midpoint of the
opposite edge,
• the four cones such as x + y + z = 0, joining a vertex to the incircle of the opposite
face, and
• a sphere x + y + z + w = 0 through the midpoints of the edges together with the
tetrahedron’s center.
        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   1    1  1  1  1  1  1  1  0  0  0  0  0  0  0  0
   2    1  1  1  0  0  0  0  1  1  1  1  0  0  0  0
   3    1  1  1  0  0  0  0  0  0  0  0  1  1  1  1
   4    1  0  0  1  1  0  0  1  1  0  0  1  1  0  0
   5    1  0  0  1  1  0  0  0  0  1  1  0  0  1  1
   6    1  0  0  0  0  1  1  1  1  0  0  0  0  1  1
   7    1  0  0  0  0  1  1  0  0  1  1  1  1  0  0
   8    0  1  0  1  0  1  0  1  0  1  0  1  0  1  0
   9    0  1  0  1  0  1  0  0  1  0  1  0  1  0  1
  10    0  1  0  0  1  0  1  1  0  1  0  0  1  0  1
  11    0  1  0  0  1  0  1  0  1  0  1  1  0  1  0
  12    0  0  1  1  0  0  1  1  0  0  1  1  0  0  1
  13    0  0  1  1  0  0  1  0  1  1  0  0  1  1  0
  14    0  0  1  0  1  1  0  1  0  0  1  0  1  1  0
  15    0  0  1  0  1  1  0  0  1  1  0  1  0  0  1
Figure 51: The incidence matrix of the projective geometry PG(3, F2).
Each plane contains seven lines and each line lies in three planes, so that there are 35 lines.
These are the triples of distinct nimbers chosen from {1, . . . , 15} whose Nim-sum is zero.
Choose the first nimber in any of fifteen ways, the second in fourteen ways, and the
Nim-sum rule determines the third; but six different orderings produce the same unordered
triple: 15 × 14/3! = 35.
In short, these 35 lines and 15 points form a (15, 3, 1) Steiner triple system, and the 15
planes and 15 points form a (15, 7, 3) symmetric balanced incomplete block design.
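These counts can be confirmed by enumeration. In the sketch below the points are the integers 1 to 15 read as nonzero 4-bit strings, a line is a triple with Nim-sum (bitwise XOR) zero, and a plane is the set of points whose dot product mod 2 with a fixed nonzero 4-bit vector is zero; labelling planes by such vectors is a convention adopted here.

```python
from itertools import combinations

points = range(1, 16)

def dot(a, b):
    """Dot product mod 2 of two 4-bit vectors stored as integers."""
    return bin(a & b).count("1") % 2

lines = {frozenset(tr) for tr in combinations(points, 3)
         if tr[0] ^ tr[1] ^ tr[2] == 0}
planes = {h: frozenset(p for p in points if dot(h, p) == 0) for h in points}

lines_per_pair = {sum(set(pr) <= ln for ln in lines)
                  for pr in combinations(points, 2)}
planes_per_pair = {sum(set(pr) <= pl for pl in planes.values())
                   for pr in combinations(points, 2)}
lines_per_plane = {sum(ln <= pl for ln in lines) for pl in planes.values()}
planes_per_line = {sum(ln <= pl for pl in planes.values()) for ln in lines}

print(len(lines), lines_per_pair, planes_per_pair, lines_per_plane, planes_per_line)
```

Every pair of points lies on one line and in three planes, matching the parameters (15, 3, 1) and (15, 7, 3).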
Here is the picture:
Figure 52: The projective geometry PG(3, F2).
PG(3, 2) and the (15, 3, 1) Steiner triple system are not merely things of beauty and
elegance. They also share a feature with the (9, 3, 1) Steiner triple system, and it is this.
The twelve triples of the (9, 3, 1) system can be arranged in four sets of three parallel lines
each. Such a block design, in which the blocks can be partitioned into parallel sets of
pairwise disjoint blocks, is called a resolvable design. The composer of possibly the earliest
problem on resolvable designs was an obscure English clergyman, and the problem concerned
schoolgirls taking walks.
So, let’s meet T. P. Kirkman and his fifteen schoolgirls.
Kirkman’s Schoolgirls Problem
In 1850, the Reverend Thomas P. Kirkman posed the problem (see [90])
Fifteen young ladies in a school walk out three abreast for seven days in succession:
it is required to arrange them daily so that no two shall walk twice abreast.
and gave a solution in [91]. What he was looking for was a Steiner triple system of fifteen
schoolgirls in seven parallel sets of five triples, each set containing all fifteen girls. In short,
he was looking for a resolvable (15, 3, 1) block design. (The (9, 3, 1) Steiner triple system
mentioned above yields a solution for nine girls walking out in three rows of three each on
four different days.) This generalizes to $3^n$ girls walking out in $3^{n-1}$ rows of three each on
$(3^n - 1)/2$ different days, forming $3^n(3^n - 1)/6$ different rows in all. (These numbers grow
quickly: for n = 5 there are 9801 rows, and walking them at one row per day would take
the better part of 27 years!) For those $3^n$ girls, we can
use the n-dimensional affine geometry with three points on a line, generalizing the solution
for nine schoolgirls.
Let’s learn more about
We can solve Kirkman’s original problem with five triples for each day of the week
by putting the girls in a circle. Rotate the equilateral triangle (0, 5, 10) for Sunday ($);
rotating the triangles (0, 2, 8) and (1, 2, 5) (the latter moved around two clicks at a time
to reveal the pattern better) will yield 2 × 15 triples that split into six parallel sets for the
week-days (MTWΘFS) (see Figure 53).
 rotations of (0, 5, 10)     rotations of (0, 2, 8)     rotations of (1, 2, 5)
   0   5  10   $               0   2   8   M              1   2   5   Θ
   1   6  11   $               1   3   9   M              3   4   7   Θ
   2   7  12   $               2   4  10   T              5   6   9   F
   3   8  13   $               3   5  11   T              7   8  11   F
   4   9  14   $               4   6  12   W              9  10  13   S
                               5   7  13   W             11  12   0   S
                               6   8  14   T             13  14   2   F
                               7   9   0   T              0   1   4   F
                               8  10   1   W              2   3   6   S
                               9  11   2   W              4   5   8   S
                              10  12   3   F              6   7  10   M
                              11  13   4   M              8   9  12   Θ
                              12  14   5   M             10  11  14   Θ
                              13   0   6   Θ             12  13   1   T
                              14   1   7   S             14   0   3   W
Figure 53: Solution of Kirkman’s schoolgirls problem.
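The circle construction can be verified mechanically. The sketch below generates the three families of triangles by rotation mod 15 and assigns the day labels of Figure 53, with $ for Sunday and Θ for Thursday as in the text; the label strings list the days in the order in which the rotations are generated, and the variable names are illustrative.

```python
from itertools import combinations
from collections import Counter

def tri(a, b, c):
    return frozenset(x % 15 for x in (a, b, c))

sunday = [tri(i, i + 5, i + 10) for i in range(5)]
second = [tri(i, i + 2, i + 8) for i in range(15)]                    # rotations of (0, 2, 8)
third = [tri(1 + 2 * m, 2 + 2 * m, 5 + 2 * m) for m in range(15)]     # (1, 2, 5), two clicks at a time

days = {"$": list(sunday)}
for triple, day in zip(second, "MMTTWWTTWWFMMΘS"):
    days.setdefault(day, []).append(triple)
for triple, day in zip(third, "ΘΘFFSSFFSSMΘΘTW"):
    days.setdefault(day, []).append(triple)

triples = sunday + second + third
pair_counts = Counter(frozenset(p) for tr in triples
                      for p in combinations(sorted(tr), 2))
everyone = set(range(15))
resolvable = all(len(rows) == 5 and set().union(*rows) == everyone
                 for rows in days.values())
print(len(triples), len(pair_counts), set(pair_counts.values()), resolvable)
```

All 105 pairs of girls occur exactly once, and each of the seven days uses every girl exactly once.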
The publication of this problem in [90] and its solution in [91] is one of the starting
points for what has become the vast modern field of combinatorial design theory. Its
poser, Thomas Pennyngton Kirkman (1806-1895), is one of the more intriguing figures
in the history of mathematics. He published his first mathematical paper when he was
40, and was the first to describe many structures in discrete mathematics. Among these
are block designs, which form the basis for the statistical design of experiments; bipartite
graphs, which are essential for such problems as classroom scheduling and medical school
admissions; and Hamiltonian circuits, which are at the heart of the famous Traveling
Salesman Problem. (Biggs [14] gives more details about Kirkman’s life and work.) For
these achievements, combinatorialists regard him as the “Father of Design Theory”.
So, let’s find out more about Kirkman and some interesting connections involving his
Schoolgirls design.
 MON       TUE       WED       THU       FRI       SAT       SUN
 a, b, e   a, c, f   a, d, h   a, g, k   a, j, m   a, n, o   a, i, l
 c, l, o   b, m, o   b, c, g   b, h, l   b, f, k   b, d, i   b, j, n
 d, f, m   d, g, n   e, j, o   c, d, j   c, i, n   c, e, k   c, h, m
 g, i, j   e, h, i   f, l, n   e, m, n   d, e, l   f, h, j   d, k, o
 h, k, n   j, k, l   i, k, m   f, i, o   g, h, o   g, l, m   e, f, g
Figure 54: Walking to School
11 Kirkman’s Schoolgirls, Fields, Spreads, and Hats
Fifteen Young Ladies at School
Imagine fifteen young ladies at the Emmy Noether Boarding School – Anita, Barb, Carol,
Doris, Ellen, Fran, Gail, Helen, Ivy, Julia, Kali, Lori, Mary, Noel, and Olive. Every day,
they walk to school in the Official ENBS Formation, namely, in five rows of three each.
One of the ENBS rules is that during the walk, a student may only talk with the other
students in her row of three. These fifteen are all good friends and like to talk with each
other – and they are all mathematically inclined. One day Julia says, “I wonder if it’s
possible for us to walk to school in the Official Formation in such a way that we all have
a chance to talk with each other at least once a week?” “But that means nobody walks
with anybody else in a line more than once a week,” observes Anita. “I’ll bet we can do
that,” concludes Lori. “Let’s get to work.” And what they come up with is the schedule
in Figure 54. (Figure 54 GOES ABOUT HERE.)
Figure 54 was probably what T. P. Kirkman had in mind when he posed the Fifteen
Schoolgirls question in 1850. Appearing in the unlikely-sounding Lady’s and Gentleman’s
Diary [90], it reads as follows:
“Fifteen young ladies of a school walk out three abreast for seven days in succession: it
is required to arrange them daily so that no two shall walk abreast more than once.”
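The schedule of Figure 54 can be tested against Kirkman's requirement. In this sketch each girl is represented by the initial of her name and each day by five three-letter rows; the dictionary layout is an illustrative choice.

```python
from itertools import combinations
from collections import Counter

schedule = {
    "MON": "abe clo dfm gij hkn",
    "TUE": "acf bmo dgn ehi jkl",
    "WED": "adh bcg ejo fln ikm",
    "THU": "agk bhl cdj emn fio",
    "FRI": "ajm bfk cin del gho",
    "SAT": "ano bdi cek fhj glm",
    "SUN": "ail bjn chm dko efg",
}
girls = set("abcdefghijklmno")

# Every day must place each of the fifteen girls in exactly one row of three.
daily_ok = all(
    len(rows.split()) == 5 and set(rows.replace(" ", "")) == girls
    for rows in schedule.values()
)
# Count how often each pair of girls walks in the same row during the week.
pairs = Counter(frozenset(p) for rows in schedule.values()
                for row in rows.split() for p in combinations(row, 2))
print(daily_ok, len(pairs), set(pairs.values()))
```

Since 7 · 5 · 3 = 105 is the number of pairs of girls, "no pair more than once" forces every pair to walk together exactly once.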
The publication of this problem in [90] and its solution in [91] is one of the starting
points for what has become the vast modern field of combinatorial design theory. Its
poser, Thomas Penyngton Kirkman (1806–1895), is one of the more intriguing figures
in the history of mathematics. He published his first mathematical paper when he was
40, and was the first to describe many structures in discrete mathematics. Among these
are block designs, which form the basis for the statistical design of experiments; bipartite
graphs, which are essential for such problems as classroom scheduling and medical school
admissions; and Hamiltonian circuits, which are at the heart of the famous Traveling
Salesman Problem. (Biggs [14] gives more details about Kirkman’s life and work.) For
these achievements, combinatorialists regard him as the “Father of Design Theory” – yet
his fame outside the field rests entirely on the Schoolgirls Problem and his solution.
This part of combinatorics is a story about the very problem that made Kirkman famous.
His solution is an example of a resolvable (15, 35, 7, 3, 1)-design, and we begin with a review
of block designs. We describe how such a design appears in a most unlikely place: the
algebraic number field K = Q(√2, √3, √5, √7). This proves to be a particularly fertile
field in which several other block designs grow. We then relate this resolvable design to spreads
and packings in finite geometries, and show how a particular packing in the geometry PG(3, 2)
we have just met answers Kirkman’s question. In fact, we’ll see that the design associated
with the PG(3, 2) in Figure 52 is the same as the number field design. Finally, we show
how our design is a solution to a certain problem in recreational mathematics called the
Fifteen Hats Problem.
We begin with a short refresher on the subject of block designs.
Block designs and Kirkman Triple Systems
Design theory began with Euler’s studies of Latin squares in the eighteenth century, interest in which was recently rekindled with the world-wide popularity of Sudoku. Many
decades after their invention by Kirkman, block designs appeared in connection with R. A.
Fisher’s work ([57] and [58]) on the statistical design of agricultural experiments, and the
first comprehensive mathematical study of the field was due to R. C. Bose ([15]). More
recently, they have found applications in coding theory, cryptography, network design,
scheduling, communication theory, and computer science. Finally, designs have always
appealed to mathematicians because of their elegance, beauty, high degree of symmetry,
and connections with many other fields of mathematics, as we have seen with the (7, 3, 1)
design.
A balanced incomplete block design with parameters v, b, r, k, and λ is a collection B of
b subsets (or blocks) of a v-element set V of objects (or varieties) such that each block
contains k varieties, each variety appears in r blocks and each pair of distinct varieties
appears together in λ blocks. Such a design is also called a (v, b, r, k, λ)-design. The
five parameters are not independent. Since there are b blocks, each of size k, there are
bk occurrences of varieties in the design. On the other hand, there are v varieties, each
occurring in r blocks, and so a total of vr varieties appear in the design. Hence bk = vr. A
similar counting argument shows that r(k − 1) = λ(v − 1). Hence r = λ(v − 1)/(k − 1) and
b = λv(v−1)/(k(k−1)). Because of this, such a design is frequently called a (v, k, λ)-design.
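The two counting identities are easy to put to work. Here is a minimal sketch in Python (the function name bibd_parameters is ours, chosen for illustration) that recovers r and b from v, k and λ, and reports when the divisibility conditions rule a design out.

```python
def bibd_parameters(v, k, lam):
    """Return (v, b, r, k, lam) for a (v, k, lam)-design,
    or None when r or b would fail to be an integer."""
    r_numerator = lam * (v - 1)          # r(k - 1) = lam(v - 1)
    b_numerator = lam * v * (v - 1)      # b = lam v(v - 1) / (k(k - 1))
    if r_numerator % (k - 1) != 0 or b_numerator % (k * (k - 1)) != 0:
        return None
    r = r_numerator // (k - 1)
    b = b_numerator // (k * (k - 1))
    assert b * k == v * r                # the first identity, bk = vr
    return (v, b, r, k, lam)


print(bibd_parameters(7, 3, 1))    # (7, 7, 3, 3, 1)
print(bibd_parameters(15, 3, 1))   # (15, 35, 7, 3, 1)
print(bibd_parameters(8, 3, 1))    # None, since r would be 7/2
```

Integrality of r and b is only a necessary condition; the sketch says nothing about whether a design with those parameters actually exists.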
We have already introduced block designs, but it might help to remind ourselves of the
definitions and basics. (It couldn’t hurt.)
Given a block design with varieties x1 , . . . , xv and blocks B1 , . . . , Bb , an efficient way to
represent it is by its incidence matrix. This is a b × v matrix M = [mij ], where mij = 1 if
xj ∈ Bi and mij = 0 otherwise.
A reading of the Kirkman Schoolgirls Problem reveals that he first asks for an arrangement of 15 schoolgirls into sets of size three such that each pair of girls is present in at
most one of these triples. There are five triples for each of seven days, making 35 triples in
all. Moreover, each girl appears in just one triple each day, and over seven days, each girl
would thus appear with each other girl exactly once. We conclude that Kirkman is asking
for a way to arrange the girls into a (15, 3, 1)-design, and note that the incidence matrix
for Kirkman’s design will reappear in our discussion about fifteen hats.
But there is more: he asks for a way to arrange the b = 35 triples into seven days of five
triples each, so that each girl appears in exactly one triple each day. Such a design, whose b
blocks can be arranged into r parallel classes of n = v/k blocks each such that each variety
appears exactly once in each class, is called resolvable. For such a design to exist, v must
be a multiple of k. In Kirkman’s honor, a resolvable (3n, 3, 1)-design is called a Kirkman
Triple System. (A (v, 3, 1)-design is called a Steiner Triple System, despite the fact that
Kirkman described them six years before Jakob Steiner’s publication on the subject — but
that’s another story.)
Do Kirkman Triple Systems exist? Yes, they do. The smallest possibility is v = 3,
with exactly one block and one parallel class, but the smallest nontrivial case is v = 9.
Construction begins with the magic square of order 3, that familiar arrangement of the
numbers 1 through 9 into a 3 × 3 grid such that the triples of numbers in each row, each
column and on the two main diagonals add up to 15. The three rows, three columns,
three extended diagonals parallel to the principal diagonal, and three extended diagonals
parallel to the principal contrary diagonal form the four parallel classes of a resolvable
(9, 3, 1)-design. The following picture tells the tale, with the magic square on the left and
the four parallel classes of the resolvable (9, 3, 1)-design on the right:
8 1 6        {1, 6, 8}   {1, 5, 9}   {1, 4, 7}   {1, 2, 3}
3 5 7        {3, 5, 7}   {2, 6, 7}   {2, 5, 8}   {4, 5, 6}
4 9 2        {2, 4, 9}   {3, 4, 8}   {3, 6, 9}   {7, 8, 9}
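The magic square construction can be checked by machine. The following is a minimal sketch in Python; the helper name is_kirkman_triple_system is ours. It builds the four parallel classes from the rows, the columns and the two families of extended diagonals, and then confirms that every class partitions {1, ..., 9} and that every pair of numbers shares exactly one triple.

```python
from itertools import combinations

square = [[8, 1, 6],
          [3, 5, 7],
          [4, 9, 2]]

rows = [frozenset(square[i][j] for j in range(3)) for i in range(3)]
cols = [frozenset(square[i][j] for i in range(3)) for j in range(3)]
# Extended diagonals parallel to the principal diagonal: cells (i, i + s mod 3).
diags = [frozenset(square[i][(i + s) % 3] for i in range(3)) for s in range(3)]
# Extended diagonals parallel to the contrary diagonal: cells (i, s - i mod 3).
antis = [frozenset(square[i][(s - i) % 3] for i in range(3)) for s in range(3)]

classes = [rows, cols, diags, antis]


def is_kirkman_triple_system(classes, points):
    """Check that each class partitions the points and each pair lies in one block."""
    for parallel_class in classes:
        covered = sorted(x for block in parallel_class for x in block)
        if covered != sorted(points):
            return False
    blocks = [block for parallel_class in classes for block in parallel_class]
    return all(sum(set(pair) <= block for block in blocks) == 1
               for pair in combinations(points, 2))


for parallel_class in classes:
    print(sorted(sorted(block) for block in parallel_class))
print(is_kirkman_triple_system(classes, range(1, 10)))   # True
```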
The next smallest case is v = 15, which is the design Kirkman sought in his query;
where do we look? If we could find a structure containing fifteen objects arranged in
thirty-five sets, with three objects per set, that would be a place to start. It happens that
there are such structures, and we find one of them in the world of algebraic number theory,
specifically in the number field K = Q(√2, √3, √5, √7). The field K contains several
interesting designs, and we’ll talk about them, but first we supply some background about
this area of mathematics.
K = Q(√2, √3, √5, √7) and the designs it contains
Évariste Galois (1811–1832) described relations between the roots of polynomials, number
fields and finite groups, now known as Galois theory. One basic idea is that if p(x) is
a polynomial with rational coefficients, then there is a smallest subfield of the complex
numbers C containing all the roots of p(x). This is the splitting field of p over Q. If
a, b, . . . ∈ C, we write Q(a, b, . . .) to mean the smallest subfield of C containing Q and
a, b, . . .. For example, the splitting field of the polynomial p(x) = (x^2 − 2)(x^2 − 3)(x^2 − 5)
is the field Q(√2, √3, √5). Now it is a fact that Q(a, b, . . .) is a vector space over Q, and
the degree of this field is the dimension of this vector space. These splitting fields have a good bit
of internal structure, which we illustrate with the field Q(√2, √3, √5).
Now by definition, the biquadratic (degree-4) field Q(√2, √3) contains the two elements
√2 and √3, and since it is a field, it also contains √2 · √3 = √6. Hence Q(√2, √3) also
contains three quadratic (degree-2) subfields Q(√2), Q(√3), and Q(√6). A similar
argument shows that Q(√6, √10) contains √15 = √6 · √10/2, and so it also contains the
three quadratic subfields Q(√6), Q(√10), and Q(√15). In the same vein, one can show
that Q(√2, √3, √5) contains seven quadratic subfields Q(√d), for d = 2, 3, 5, 6, 10, 15, and 30, and seven
biquadratic subfields Q(√d1, √d2). Not only does each biquadratic subfield contain three
quadratic subfields, but each quadratic is contained in three biquadratics, and these
subfields of Q(√2, √3, √5) form a (7, 7, 3, 3, 1)-design with the biquadratic fields as the blocks
and the quadratic fields as the varieties. Such a design, in which b = v and r = k, is
called a symmetric design, and we will encounter some more symmetric designs later in
this section.
We now turn to the polynomial p(x) = (x^2 − 2)(x^2 − 3)(x^2 − 5)(x^2 − 7), whose splitting
field is the degree-16 field K = Q(√2, √3, √5, √7), the smallest subfield of the complex
numbers containing Q(√d) for d = 2, 3, 5, and 7. Now let
S = {2, 3, 5, 6, 7, 10, 14, 15, 21, 30, 35, 42, 70, 105, 210}.
Then K contains the fifteen quadratic subfields Q(√d) for d ∈ S. Moreover, each pair of
these quadratics is contained in a unique biquadratic subfield of K, and each biquadratic
contains three quadratics. A counting argument shows that K contains thirty-five biquadratic
subfields Q(√d1, √d2), and it is straightforward to show that each quadratic is
contained in seven biquadratics.
Now consider the block design with the 15 quadratic subfields of K as varieties and the
35 biquadratic subfields of K as blocks. Our work in the previous paragraph shows that
these form a block design with v = 15, b = 35, r = 7, k = 3 and λ = 1, i.e. a (15, 3, 1)
design. But is this design resolvable?
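The counts in the last two paragraphs can be reproduced without any field arithmetic. The sketch below, in Python, rests on one assumption: a biquadratic subfield is recorded by the three radicands d1, d2, d3 of the quadratics it contains, and three radicands belong together exactly when their product is a perfect square (the criterion used later in this section). The function names are ours.

```python
from itertools import combinations
from math import isqrt


def squarefree_products(primes):
    """All products of nonempty subsets of the given distinct primes."""
    products = []
    for size in range(1, len(primes) + 1):
        for subset in combinations(primes, size):
            d = 1
            for p in subset:
                d *= p
            products.append(d)
    return sorted(products)


def is_square(n):
    return isqrt(n) ** 2 == n


def biquadratic_blocks(radicands):
    """Triples of radicands whose product is a perfect square."""
    return [t for t in combinations(radicands, 3)
            if is_square(t[0] * t[1] * t[2])]


S = squarefree_products([2, 3, 5, 7])
blocks = biquadratic_blocks(S)

print(S)
print(len(S), len(blocks))                                    # v and b
print({sum(d in t for t in blocks) for d in S})               # values of r
print({sum(set(pair) <= set(t) for t in blocks)
       for pair in combinations(S, 2)})                       # values of lambda
```

Calling the same two functions with the primes 2, 3, 5 gives the seven varieties and seven blocks of the (7, 7, 3, 3, 1)-design inside Q(√2, √3, √5).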
In fact, it is, and Figure 55 shows the seven columns which are the seven parallel
classes. The three numbers in each of the 35 cells in this figure determine a block, i.e. one
of the 35 biquadratic subfields of K. We began by placing the seven biquadratic subfields
containing Q(√2) in separate classes across the top row and proceeded, mainly by trial
and error, to arrange the thirty-five blocks in seven parallel classes. The end result is a
resolvable (15, 3, 1)-design — in short, a solution to Kirkman’s Schoolgirls problem (Figure
55 GOES ABOUT HERE):
But that is not all. The field K also contains another resolvable (15, 3, 1) design as well
as two other types of designs.
We construct the other Kirkman design as follows. The blocks are the 35 biquadratic
   MON           TUE           WED           THU           FRI           SAT           SUN
2, 3, 6       2, 5, 10      2, 7, 14      2, 15, 30     2, 21, 42     2, 35, 70     2, 105, 210
5, 21, 105    3, 70, 210    3, 5, 15      3, 14, 42     3, 35, 105    3, 7, 21      3, 10, 30
7, 30, 210    6, 14, 21     6, 35, 210    5, 7, 35      5, 6, 30      5, 42, 210    5, 14, 70
10, 14, 35    7, 15, 105    10, 42, 105   6, 70, 105    7, 10, 70     6, 10, 15     6, 7, 42
15, 42, 70    30, 35, 42    21, 30, 70    10, 21, 210   14, 15, 210   14, 30, 105   15, 21, 35
Figure 55: The Kirkman design in Q(√2, √3, √5, √7)
subfields of K, and the varieties are the 15 octic (degree-8) subfields of K, which we
label a through o; notice that a is the subfield Q(√2, √3, √5), with which we began this
section. But in a reversal of the previous construction, a variety (octic field) is a member of
those blocks (biquadratic fields) which it contains as a subfield. That is, “contains” means
“is a subfield of” in this context. Thus, the “block” Q(√21, √35) “contains” the three
“varieties” d, o and k, as shown in Figure 56.
It is straightforward to show that each of the 35 biquadratic subfields of K is a subfield of
exactly three of these octic fields, each octic contains seven biquadratic subfields, and each
pair of biquadratics are subfields of a unique octic. Thus, we have another (15, 3, 1)-design,
which we call KS*.
Is KS* resolvable? Yes, it is, and to see this, we look at Figure 55 again. In it,
each biquadratic is designated by the triple of quadratics it contains. If we replace each
biquadratic in Figure 55 by the triple of octics which contain it, we are led to Figure 54,
the arrangement found by the fifteen ladies at the ENBS.
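The passage from Figure 55 to Figure 54 is mechanical, so here is a minimal sketch of it in Python. The two dictionaries are typed in from Figures 55 and 56; the helper octics_containing is ours. Each biquadratic (a triple of radicands) is replaced by the letters of the octics whose list of radicands contains that triple.

```python
figure55 = {
    "MON": [(2, 3, 6), (5, 21, 105), (7, 30, 210), (10, 14, 35), (15, 42, 70)],
    "TUE": [(2, 5, 10), (3, 70, 210), (6, 14, 21), (7, 15, 105), (30, 35, 42)],
    "WED": [(2, 7, 14), (3, 5, 15), (6, 35, 210), (10, 42, 105), (21, 30, 70)],
    "THU": [(2, 15, 30), (3, 14, 42), (5, 7, 35), (6, 70, 105), (10, 21, 210)],
    "FRI": [(2, 21, 42), (3, 35, 105), (5, 6, 30), (7, 10, 70), (14, 15, 210)],
    "SAT": [(2, 35, 70), (3, 7, 21), (5, 42, 210), (6, 10, 15), (14, 30, 105)],
    "SUN": [(2, 105, 210), (3, 10, 30), (5, 14, 70), (6, 7, 42), (15, 21, 35)],
}

octics = {  # Figure 56: the radicands d with Q(sqrt d) inside each octic field
    "a": {2, 3, 5, 6, 10, 15, 30},     "b": {2, 3, 7, 6, 14, 21, 42},
    "c": {2, 5, 7, 10, 14, 35, 70},    "d": {3, 5, 7, 15, 21, 35, 105},
    "e": {2, 3, 6, 35, 70, 105, 210},  "f": {2, 5, 10, 21, 42, 105, 210},
    "g": {2, 7, 14, 15, 30, 105, 210}, "h": {3, 5, 14, 15, 42, 70, 210},
    "i": {3, 7, 10, 21, 30, 70, 210},  "j": {5, 6, 7, 30, 35, 42, 210},
    "k": {2, 15, 21, 30, 35, 42, 70},  "l": {3, 10, 14, 30, 35, 42, 105},
    "m": {5, 6, 14, 21, 30, 70, 105},  "n": {6, 7, 10, 15, 42, 70, 105},
    "o": {6, 10, 14, 15, 21, 35, 210},
}


def octics_containing(triple):
    """Letters of the octic fields that contain all three quadratics of the triple."""
    return "".join(sorted(name for name, ds in octics.items() if set(triple) <= ds))


figure54 = {day: sorted(octics_containing(t) for t in blocks)
            for day, blocks in figure55.items()}

for day, rows in figure54.items():
    # Every day should use each of the fifteen letters exactly once.
    assert sorted("".join(rows)) == sorted("abcdefghijklmno")
    print(day, rows)
```

The assertion inside the loop is the resolvability check for KS*: each column of the resulting table uses every letter a through o exactly once.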
The field K contains fifteen octic subfields, and each of these contains seven quadratic
subfields. It turns out that each quadratic appears in seven octics, and that each pair of
quadratics appear together in exactly three octics. This gives us a (15, 7, 3) symmetric
design OQ with the quadratics as varieties and the octics as blocks. Each row of Figure 56
begins with a letter referring to an octic field, followed by seven numbers d1, . . . , d7; these
are the values of d for which Q(√d) is contained in that octic field. For example, line l
refers to the octic field L = Q(√3, √10, √14). It contains the seven quadratic subfields
Q(√r) for r = 3, 10, 14, 30, 35, 42 and 105. (FIGURE 56 GOES ABOUT HERE):
Now, the elements of the blocks in Figure 56 can themselves be arranged into block
designs. For, each of the 15 octic subfields of K contains 7 biquadratic subfields (the blocks)
as well as 7 quadratic subfields (the varieties). Each biquadratic contains 3 quadratics,
each quadratic is contained in 3 biquadratics, and each pair of quadratics lie in a unique
biquadratic. Thus, each block is a triple of quadratics, and we conclude that K contains
fifteen (7, 3, 1) symmetric designs. Continuing with line l, we list the triples of the (7, 3, 1)
design contained in the octic field L here:
10, 14, 35; 30, 35, 42; 10, 42, 105; 3, 14, 42; 3, 35, 105; 14, 30, 105; 3, 10, 30.
Octic Field    Contains Q(√d) for these d
a              2, 3, 5, 6, 10, 15, 30
b              2, 3, 7, 6, 14, 21, 42
c              2, 5, 7, 10, 14, 35, 70
d              3, 5, 7, 15, 21, 35, 105
e              2, 3, 6, 35, 70, 105, 210
f              2, 5, 10, 21, 42, 105, 210
g              2, 7, 14, 15, 30, 105, 210
h              3, 5, 14, 15, 42, 70, 210
i              3, 7, 10, 21, 30, 70, 210
j              5, 6, 7, 30, 35, 42, 210
k              2, 15, 21, 30, 35, 42, 70
l              3, 10, 14, 30, 35, 42, 105
m              5, 6, 14, 21, 30, 70, 105
n              6, 7, 10, 15, 42, 70, 105
o              6, 10, 14, 15, 21, 35, 210
Figure 56: The (15, 7, 3) design OQ in Q(√2, √3, √5, √7)
As an exercise, find these seven triples in Figure 55, and observe that they occur in different
columns.
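The parameters claimed for OQ, and the seven triples inside each octic, can also be checked by machine. The sketch below, in Python, makes an assumption that the text does not spell out: it encodes each radicand as a 4-bit mask over the primes 2, 3, 5, 7 and takes one octic for each nonzero mask w, namely the radicands whose mask shares an even number of bits with w. All names in the code are ours, and the result is compared with line l of Figure 56 rather than taken on trust.

```python
from itertools import combinations
from math import isqrt

PRIMES = [2, 3, 5, 7]


def radicand(mask):
    """Product of the primes picked out by the bits of mask (bit i picks PRIMES[i])."""
    d = 1
    for i, p in enumerate(PRIMES):
        if (mask >> i) & 1:
            d *= p
    return d


def parity(x):
    return bin(x).count("1") % 2


quadratics = sorted(radicand(m) for m in range(1, 16))

# Assumption: one octic per nonzero w, holding the radicands whose mask has
# an even number of bits in common with w.
octics = [frozenset(radicand(m) for m in range(1, 16) if parity(m & w) == 0)
          for w in range(1, 16)]

block_sizes = {len(o) for o in octics}
replications = {sum(d in o for o in octics) for d in quadratics}
pair_counts = {sum(set(pair) <= o for o in octics)
               for pair in combinations(quadratics, 2)}


def triples_inside(octic):
    """Biquadratics inside an octic, as triples of radicands with square product."""
    return [t for t in combinations(sorted(octic), 3)
            if isqrt(t[0] * t[1] * t[2]) ** 2 == t[0] * t[1] * t[2]]


line_l = frozenset({3, 10, 14, 30, 35, 42, 105})
print(block_sizes, replications, pair_counts)    # {7} {7} {3}
print(line_l in octics)                          # True
print(triples_inside(line_l))
```

The last line prints the seven triples of the (7, 3, 1) design inside the octic field L, the same seven listed above.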
Finally, if D is a symmetric design, then the dual design D* of D is obtained from D by
a formal exchange of blocks and varieties. Thus, if the variety x belongs to the block B in
D, then the variety B belongs to the block x in D*. In this way, we obtain the dual OQ*
of the (15, 7, 3) symmetric design OQ depicted in Figure 56, again by a formal exchange
of blocks and varieties. We note that this construction fails for nonsymmetric designs, i.e.,
designs in which v ≠ b.
Exchanging the roles of blocks and varieties in a block design is analogous to exchanging
the roles of points and lines in projective geometry. To see this more clearly, we need to
pass to a geometric description of KS. So, let’s talk about finite projective geometries and
spreads.
Spreads in PG(3, 2) and the geometry of Kirkman
One very elegant way to generate a solution to the Kirkman Schoolgirls problem involves a
nice partitioning and packing problem in finite projective geometry. Projective spaces are
hundreds of years old, and were first introduced as an extension of the real Euclidean spaces
that we’re more familiar with. The essential difference between Euclidean and projective
spaces is that in projective spaces you have no notion of parallelism. So if two lines are
lying in the same plane, then they necessarily intersect (in a unique point). Lines can be
skew, but this requires them to be non-coplanar. We’ll see examples of skew lines shortly.
If one uses a finite field to coordinatize a projective space (rather than the more typical
real numbers), one obtains a finite projective space. Once you have a finite set to work
with, you can use combinatorial arguments to count things like points and lines. If you
use the finite field with q elements, and restrict yourself to a 3-dimensional setting, you get
what’s called a finite projective space of dimension 3 and order q, denoted by PG(3, q). It
is not hard to show (using some sassy linear algebra) that this space has exactly q^3 + q^2 + q + 1
points and that every line contains q + 1 points. If you haven’t noticed already,
think about what happens when q = 2. In this case, the finite projective space PG(3, 2)
contains 15 points and every line contains 3 points. Sound familiar?
A solution to Kirkman’s famous problem can be obtained from the lines of PG(3, 2), and
it goes something like this. First you’d have to partition the points of the projective space
into lines. Such a partition of the points of PG(3, q) into lines is called a spread by finite
geometers. A spread in our setting would contain 5 disjoint lines (each containing 3 points).
The points of our projective space would correspond to the girls, and the lines of our spread
would correspond to the groups of girls walking together on the first day. To find the groups
for the second day would require us to find a second spread such that no line from the first
spread gets reused in the second spread. Then we continue in this fashion until we get 7
pairwise disjoint spreads (or 7 days’ worth of partitions). Seems possible, I suppose. But
are we satisfying the condition that no two girls walk together more than once? If this were
not the case, then we would have 2 points of the projective space lying on two different
lines. As you might suspect, this violates one of the axioms for projective geometry. So,
the geometric model actually guarantees us the desired property!
But back to the spreads. To solve Kirkman’s problem, we would need 7 pairwise disjoint
spreads (no two sharing a common line). Hence, we would need to use 35 different lines of
the projective space. Oddly enough, this is exactly the number of lines of PG(3, 2). So,
in geometric terms, we’re trying to partition the lines of PG(3, 2) into 7 disjoint spreads.
Such a partition of lines into spreads is known as a packing. It is fairly well-known (in
the sense that it’s written down somewhere) that spreads and packings exist. In fact, they
exist for projective spaces of any order (i.e., any value of q). The projective space PG(3, q)
contains (q^2 + 1)(q^2 + q + 1) lines, and a packing of PG(3, q) consists of q^2 + q + 1
spreads, each of size q^2 + 1. Hence, packings of PG(3, q) actually give a solution to the
generalized Kirkman Schoolgirls problem:
If (q^2 + 1)(q + 1) schoolgirls go walking each day in q^2 + 1 rows of q + 1, they
can walk for q^2 + q + 1 days so that each girl has walked in the same row as
has every other girl, and with no girl twice.
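To see the numbers behind the generalized problem, here is a minimal sketch in Python that simply evaluates the counting formulas quoted above for a given q and checks that they fit together. The function name pg3_counts and the dictionary keys are ours.

```python
def pg3_counts(q):
    """Counts for PG(3, q), using the formulas quoted in the text."""
    counts = {
        "points": q**3 + q**2 + q + 1,            # the schoolgirls
        "points_per_line": q + 1,                 # girls per row
        "lines": (q**2 + 1) * (q**2 + q + 1),
        "lines_per_spread": q**2 + 1,             # rows per day
        "spreads_per_packing": q**2 + q + 1,      # days
    }
    # A spread partitions the points; a packing partitions the lines.
    assert counts["lines_per_spread"] * counts["points_per_line"] == counts["points"]
    assert counts["spreads_per_packing"] * counts["lines_per_spread"] == counts["lines"]
    return counts


print(pg3_counts(2))   # 15 girls, rows of 3, 35 lines, 5 rows a day, 7 days
print(pg3_counts(3))   # 40 girls, rows of 4, 130 lines, 10 rows a day, 13 days
```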
Incidentally, finite geometry provides a wealth of examples of designs, and Kirkman
designs are no exception. By generalizing the spreads and packings described above, one
can construct resolvable (3n, 3, 1) designs for many values of n simply by varying the
dimension of the space you work in. A very thorough, albeit technical, description of these
methods can be found in [82].
   MON          TUE          WED          THU          FRI          SAT          SUN
1, 2, 3      1, 4, 5      2, 4, 6      1, 6, 7      3, 4, 7      3, 5, 6      2, 5, 7
4, 10, 14    2, 13, 15    1, 8, 9      2, 9, 11     2, 12, 14    2, 8, 10     1, 14, 15
7, 8, 15     3, 9, 10     3, 12, 15    4, 8, 12     1, 10, 11    4, 11, 15    4, 9, 13
5, 9, 12     6, 8, 14     5, 11, 14    3, 13, 14    5, 8, 13     1, 12, 13    3, 8, 11
6, 11, 13    7, 11, 12    7, 10, 13    5, 10, 15    6, 9, 15     7, 9, 14     6, 10, 12
Figure 57: The Kirkman design as a spread in PG(3, 2)
Let us represent the nonzero 4-bit strings by the decimal integers they represent, e.g.
1 = 0001, 2 = 0010, . . . , 10 = 1010, . . . , 15 = 1111. Then the following packing of the
lines of PG(3, 2) into 7 disjoint spreads solves Kirkman’s Schoolgirls Problem (Figure 57
GOES ABOUT HERE):
Notice that the seven blocks in the first row make up a (7, 3, 1) design.
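Figure 57 can be checked directly. In the sketch below, in Python, the seven columns are typed in from the figure. A triple of nonzero 4-bit strings is treated as a line when its bitwise XOR is zero, which is the collinearity condition used later in this section; the variable names are ours.

```python
from itertools import combinations

figure57 = {
    "MON": [(1, 2, 3), (4, 10, 14), (7, 8, 15), (5, 9, 12), (6, 11, 13)],
    "TUE": [(1, 4, 5), (2, 13, 15), (3, 9, 10), (6, 8, 14), (7, 11, 12)],
    "WED": [(2, 4, 6), (1, 8, 9), (3, 12, 15), (5, 11, 14), (7, 10, 13)],
    "THU": [(1, 6, 7), (2, 9, 11), (4, 8, 12), (3, 13, 14), (5, 10, 15)],
    "FRI": [(3, 4, 7), (2, 12, 14), (1, 10, 11), (5, 8, 13), (6, 9, 15)],
    "SAT": [(3, 5, 6), (2, 8, 10), (4, 11, 15), (1, 12, 13), (7, 9, 14)],
    "SUN": [(2, 5, 7), (1, 14, 15), (4, 9, 13), (3, 8, 11), (6, 10, 12)],
}

all_blocks = [t for day in figure57.values() for t in day]

# Every block is a line: its three points XOR to zero.
every_block_is_a_line = all((a ^ b ^ c) == 0 for a, b, c in all_blocks)

# Every day is a spread: its five lines partition the points 1..15.
every_day_is_a_spread = all(
    sorted(x for t in day for x in t) == list(range(1, 16))
    for day in figure57.values())

# The seven spreads use 35 different lines, so together they form a packing.
number_of_distinct_lines = len({frozenset(t) for t in all_blocks})

# The first row is a (7, 3, 1) design on the points 1..7.
first_row = [day[0] for day in figure57.values()]
first_row_is_731 = all(
    sum(set(pair) <= set(t) for t in first_row) == 1
    for pair in combinations(range(1, 8), 2))

print(every_block_is_a_line, every_day_is_a_spread,
      number_of_distinct_lines, first_row_is_731)    # True True 35 True
```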
Now, we have two ways of describing Kirkman’s Fifteen Young Ladies: the packing in
PG(3, 2) and the subfields of an algebraic number field. By a KS(15), we mean a resolvable
(15, 3, 1) block design. As we have seen, the quadratic (varieties) and biquadratic (blocks)
subfields of the degree-16 algebraic number field L = Q(√2, √3, √5, √7), the field called K
above, form a KS(15).
But wait, there’s more. The points in PG(3, 2) are the nonzero bit strings of length 4. There is a
one-to-one correspondence (prove it!) between the points in PG(3, 2) and the quadratic
subfields of L, defined by mapping the nonzero bit string b1 b2 b3 b4 to the quadratic subfield
Q(√(2^b1 · 3^b2 · 5^b3 · 7^b4)). Let’s see how this correspondence acts on a KS(15) design.
A set of three distinct points in PG(3, 2), such as {0110, 1101, 1011}, is collinear provided their
vector sum over the 2-element field is the zero vector. A line in the extension-field version of
the KS(15) design is a set of three quadratic subfields, such as {Q(√15), Q(√42), Q(√70)},
that belong to the same biquadratic subfield of L. Now the condition that the fields
{Q(√p), Q(√q), Q(√r)} lie in a common biquadratic field is that p, q and r are nonsquare
integers whose product is a square; note that 15 · 42 · 70 = 44100 = 210^2 is a square. A
little algebra shows that three strings {a, b, c} in PG(3, 2) sum to zero mod 2 if and only
if their corresponding quadratic fields {Q(√p), Q(√q), Q(√r)} have the property that pqr
is a square. Thus, the correspondence preserves lines, and so is an isomorphism between
the set of 35 lines of a KS(15) in PG(3, 2) and its corresponding KS(15) in L.
And you can see every bit of this in Figure 52.
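The "little algebra" can be replaced by a brute-force check over all 455 triples of points. The following minimal sketch in Python uses the map from the text, sending b1 b2 b3 b4 to 2^b1 · 3^b2 · 5^b3 · 7^b4; the function names are ours.

```python
from itertools import combinations
from math import isqrt


def radicand(point):
    """Send the bit string b1 b2 b3 b4 (an integer 1..15) to 2^b1 * 3^b2 * 5^b3 * 7^b4."""
    b1 = (point >> 3) & 1
    b2 = (point >> 2) & 1
    b3 = (point >> 1) & 1
    b4 = point & 1
    return 2**b1 * 3**b2 * 5**b3 * 7**b4


def is_square(n):
    return isqrt(n) ** 2 == n


points = range(1, 16)

# Lines of PG(3, 2): triples of distinct points whose XOR is zero.
lines = [t for t in combinations(points, 3) if (t[0] ^ t[1] ^ t[2]) == 0]

# The correspondence preserves lines: XOR is zero exactly when pqr is a square.
agree = all(
    ((a ^ b ^ c) == 0) == is_square(radicand(a) * radicand(b) * radicand(c))
    for a, b, c in combinations(points, 3))

print([radicand(p) for p in (0b0110, 0b1101, 0b1011)])   # [15, 42, 70]
print(len(lines), agree)                                 # 35 True
```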
At this point you probably will not be surprised that Kirkman’s famous design has a
connection with two more seemingly unrelated areas of mathematics, namely recreational
mathematics and coding theory. So, let’s talk about the Kirkman design and how it relates
to both a certain guessing game involving fifteen players wearing hats and to Hamming
codes.
Fifteen schoolgirls, fifteen hats: some coding theory
Here is a famous problem in recreational mathematics we’ll call the Three Hats Game.
Three players enter a room and a maroon or orange hat is placed on each person’s head.
The color of each hat is determined by a coin toss, with the outcome of one coin toss having
no effect on the others. Each person can see the other players’ hats but not his own.
No communication of any sort is allowed, except for an initial strategy session before
the game begins. Once they have had a chance to look at the other hats, the players must
simultaneously guess the color of their own hats or pass. The group shares a hypothetical
$3 million prize if at least one player guesses correctly and no players guess incorrectly.
The problem is to find a strategy whereby the group’s chance of winning exceeds 50%.
Mathematicians credit the Three Hats Game to Todd Ebert, a computer science professor at the University of California at Irvine, who introduced it in his Ph.D. thesis in 1998
[52]. The problem was then popularized by an April 2001 article [134] in the New York
Times.
The winning strategy is as follows. Each player looks at the other two hats. A player
who sees two of the same color guesses the missing color. A player who sees two different
colors passes. Now there are eight ways of distributing hats of two colors among three
distinct players. In six of these ways, two players see hats of different colors and they pass;
the third player sees two hats of the same color, guesses the missing color — and that turns
out to be a win. In the other two cases, all hats are the same color; each player guesses the
missing color, and all three are wrong. Hence, the strategy works in six of eight cases, and
so the three players will win 3/4 of the time. This comes as a surprise to most readers.
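Readers who distrust the count can enumerate the eight cases. This is a minimal sketch in Python of the strategy just described; the names guess and team_wins are ours.

```python
from itertools import product

COLORS = ("maroon", "orange")


def guess(seen):
    """A player's guess given the two hats she sees; None means she passes."""
    if seen[0] == seen[1]:
        # Two hats of the same color: guess the missing color.
        return "orange" if seen[0] == "maroon" else "maroon"
    return None


def team_wins(hats):
    """At least one correct guess and no incorrect guesses."""
    guesses = [guess(hats[:i] + hats[i + 1:]) for i in range(3)]
    made = [(g, h) for g, h in zip(guesses, hats) if g is not None]
    return bool(made) and all(g == h for g, h in made)


wins = sum(team_wins(hats) for hats in product(COLORS, repeat=3))
print(wins, "wins out of", len(COLORS) ** 3)    # 6 wins out of 8
```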
We will see how this technique generalizes, with increasingly better odds, to any number
of players of the form 2^n − 1 for n ≥ 3. In particular, it generalizes to a situation where there
are 2^4 − 1 = 15 players (maybe even fifteen schoolgirls), and the analysis involves a family
of lovely mathematical structures that we have already met, namely the Hamming
codes. So, before we describe the general technique, let’s give a brief review of error-correcting codes.
Mathematical schemes to deal with signal errors first appeared in the 1940s in the work
of several researchers, including Claude Shannon, Richard Hamming, and Marcel Golay.
As we have seen, what they came up with was a new branch of mathematics called coding
theory—specifically, the study of error-detecting and error-correcting codes. They modeled
these signals as sets of n-long strings called blocks, to be taken from a fixed alphabet of
size q; a particular set of such blocks, or codewords, is called a q-ary code of length n. If
q is a prime number, then a q-ary code of length n is called linear if the code words form
a subspace of Z_q^n, the n-dimensional vector space over Z_q, the integers mod q. A basis for
such a linear code is called a generating set for the code. One way to describe such a set
is with a generator matrix, which is a q-ary matrix of n columns whose rows generate the
code.
To detect errors means to determine that a codeword was incorrectly received; to correct
errors means to determine the right codeword in case it was incorrectly received. Just how
this correction happens will vary from code to code.
The fact that d errors in transmission change d characters in a block gives rise to the
idea of distance between blocks. If v and w are n-blocks, then the (Hamming) distance
D(v, w) is the number of positions in which v and w differ. Thus, D(11001, 10101) = 2 and
D(1101000, 0011010) = 4. If I send the block v and you receive the block w, then D(v, w)
errors occurred while sending v.
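In code, the Hamming distance is a one-line count. A minimal sketch in Python, with blocks written as strings of characters (the function name is ours):

```python
def hamming_distance(v, w):
    """Number of positions in which two blocks of equal length differ."""
    if len(v) != len(w):
        raise ValueError("blocks must have the same length")
    return sum(a != b for a, b in zip(v, w))


print(hamming_distance("11001", "10101"))        # 2
print(hamming_distance("1101000", "0011010"))    # 4
```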
It follows that if the words in a code are all “far apart” in the Hamming distance sense,
then we can detect errors. Even better, if we assume that only a few errors occur, then
we can sometimes change the received block to the correct codeword. A codeword of length
n contains a certain number k of message bits, and the other n − k check bits are used for
error detection and correction. Such a code is called an (n, k) code.
The minimum distance of a code is the smallest distance between its codewords; this
minimum distance determines the code’s error detection and correction features. It can
be shown that a code with minimum distance d will detect up to d − 1 errors and correct
up to ⌊(d − 1)/2⌋ errors. For an (n, k) code to be efficient, the ratio k/n should be as
large as possible, consistent with its error detection and correction capabilities. Maximum
efficiency in an (n, k) m-error correcting code occurs when it can correct up to m errors,
and no others. Such a code is called perfect.
Hamming’s first error correcting scheme was a perfect 1-error correcting code of length
seven with four message bits, three check bits, and minimum distance 3; hence, it could
correct all errors in which a single bit was received incorrectly. Golay extended Hamming’s
work and constructed a family of (2^n − 1, 2^n − 1 − n) linear binary perfect 1-error correcting
codes of minimum distance 3 for all n ≥ 2. These are now known as the Hamming codes,
and they include both Hamming’s original (7, 4) code and the (3, 1) triplication code.
Here is the connection between the Kirkman Schoolgirls Problem and Hamming codes.
As we have seen, the 35 triples in Figure 57 are the 35 blocks of a resolvable (15, 3, 1) design,
and the numbers 1, . . . , 15 are the varieties. The incidence matrix M for this design is a
35 × 15 matrix of zeros and ones. It is straightforward to show that the row space of M
(that is, the vector space generated by the rows of M) is an 11-dimensional subspace
of Z_2^15, and that this subspace is a (15, 11) Hamming code.
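The dimension claim can be verified with a few lines of elimination over the 2-element field. The sketch below, in Python, makes two labeled choices of ours: it generates the 35 blocks as the XOR-zero triples of {1, ..., 15} (the lines of PG(3, 2), which are the blocks of Figure 57), and it stores each row of M as a 15-bit integer with bit i − 1 standing for variety i. The function names are ours as well.

```python
from itertools import combinations

# The 35 blocks: triples of nonzero 4-bit strings whose XOR is zero.
lines = [t for t in combinations(range(1, 16), 3) if (t[0] ^ t[1] ^ t[2]) == 0]


def row_mask(block):
    """One row of the incidence matrix M, packed into a 15-bit integer."""
    mask = 0
    for point in block:
        mask |= 1 << (point - 1)
    return mask


def gf2_rank(rows):
    """Rank over Z_2 by elimination, keeping one basis row per leading bit."""
    basis = {}
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)


def syndrome(mask):
    """XOR of the labels of the positions where the vector has a 1."""
    s = 0
    for point in range(1, 16):
        if (mask >> (point - 1)) & 1:
            s ^= point
    return s


rows = [row_mask(t) for t in lines]
print(len(rows), gf2_rank(rows))                 # 35 11
print(all(syndrome(r) == 0 for r in rows))       # True
```

Every row has syndrome zero, so the row space sits inside the set of length-15 vectors with zero syndrome. That set has dimension 15 − 4 = 11, and since the rank is also 11, the two coincide.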
We now show how Hamming codes are the keys to understanding the winning strategy
for the Three Hats Game, and how the Kirkman Schoolgirls problem is linked to the Fifteen
Hats Game.
Fifteen schoolgirls, fifteen hats: a solution
Go back and look at the Three Hats Game again. Notice that the triplication code contains
two codewords and six blocks with errors. The six erroneous blocks correspond to the six
winning hat placements for the three players, and the two codewords correspond to the
two losing hat placements. As we see in what follows, that is not an accident.
Here is how a solution to the Kirkman Schoolgirls problem leads to a solution to the
Fifteen Hats Game in which the probability of winning is much greater than 50%: in fact,
it is well over 90%.
First, we number the girls from 1 to 15 in the same way that they are labeled in Figure
57. We think of these as 4-digit nonzero binary numbers.
Now, suppose that the girls enter the room, each obtaining a hat, and circle up in order
1 through 15. Each player now does the following. She looks at the numbers corresponding
to each girl wearing a maroon hat, and she computes the corresponding vector sum. For
example, if girls 1, 3, 5, 8, 10, 12 and 14 are wearing maroon hats, then girl 4 will compute
1 ⊕ 3 ⊕ 5 ⊕ 8 ⊕ 10 ⊕ 12 ⊕ 14. As a mod-2 vector sum, this is
0001 ⊕ 0011 ⊕ 0101 ⊕ 1000 ⊕ 1010 ⊕ 1100 ⊕ 1110 = 0111, or 7.
1. If that sum is equal to her number, she guesses that her color is orange.
2. If that sum is equal to zero, she guesses that her color is maroon.
3. If neither of these two situations occurs, she passes.
Let’s analyze what happens. First suppose that the sequence of all maroon hats corresponds to a vector sum of 0. Then every schoolgirl falls into one of the first two cases. All
of them will guess incorrectly, and the team loses. More precisely, if a particular girl has
on a maroon hat, the corresponding sum that she computes will be equal to her number.
So, she will fall into case 1 above and will therefore guess that her hat is orange. Wrong!
A similar mistake occurs if the girl is wearing orange.
Next, suppose that the sequence of all maroon hats corresponds to a vector sum of
n ≠ 0. Girl k sees a vector sum of n ⊕ k or n, according as she is wearing maroon or
orange, respectively. If k ≠ n, then what Girl k sees is neither her own number nor zero,
so she passes. Girl n, however, sees 0 if she is wearing maroon and sees n if she is wearing
orange; in both of these cases, she guesses correctly and the team wins.
In the previous example, the sequence of all maroon hats corresponds to a
vector sum of 7 ≠ 0. The only girl to see either 0 or her own number is Girl 7, who sees
7. That is her own number so she correctly guesses that her hat is orange, and the team
wins.
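The three rules are simple enough to run on every one of the 2^15 hat placements. Here is a minimal sketch in Python; encoding a placement as a 15-bit integer, with bit k − 1 set when Girl k wears maroon, is our choice, as are the function names.

```python
def maroon_sum(config):
    """XOR of the numbers of all girls wearing maroon."""
    total = 0
    for girl in range(1, 16):
        if (config >> (girl - 1)) & 1:
            total ^= girl
    return total


def team_wins(config):
    """Apply rules 1 to 3 to every girl; win if someone is right and nobody is wrong."""
    total = maroon_sum(config)
    correct = wrong = 0
    for girl in range(1, 16):
        maroon = bool((config >> (girl - 1)) & 1)
        seen = total ^ girl if maroon else total    # she cannot see her own hat
        if seen == girl:
            guesses_maroon = False                  # rule 1: guess orange
        elif seen == 0:
            guesses_maroon = True                   # rule 2: guess maroon
        else:
            continue                                # rule 3: pass
        if guesses_maroon == maroon:
            correct += 1
        else:
            wrong += 1
    return correct > 0 and wrong == 0


# The worked example: girls 1, 3, 5, 8, 10, 12 and 14 wear maroon.
example = sum(1 << (g - 1) for g in (1, 3, 5, 8, 10, 12, 14))
print(maroon_sum(example), team_wins(example))      # 7 True

wins = sum(team_wins(config) for config in range(1 << 15))
print(wins, "wins out of", 1 << 15)                 # 30720 wins out of 32768
```

The 2048 losing placements are exactly those whose maroon hats have vector sum 0.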
As an exercise, suppose that girls 1, 4, 6, 8, 9, 10 and 12 are wearing maroon hats, and
the others are wearing orange hats. Is this a winning configuration, and if so, which girl
makes the correct guess? The solution is at the end of this section.
Why does this work? This is where Hamming codes point the way. The reason is that
the configurations of maroon hats with vector sums of 0 are in one-to-one correspondence
with the binary vectors of length 15 in the row space of M , the incidence matrix of the
Kirkman (15, 3, 1) design, and as previously mentioned, this row space forms a (15, 11)
Hamming code. Recall that the Hamming codes are perfect 1-error-correcting codes with minimum distance
3. This means that every vector in the entire vector space Z_2^15
either (a) is a codeword, or
(b) differs in one coordinate from a unique Hamming codeword. That is, changing just one
special coordinate position of a non-codeword will leave us with a codeword. Thus, in an
arrangement of hats corresponding to a non-codeword, the only one who can detect this
is the girl who occupies that special coordinate position. She can tell what her hat color
should be in order to make the entire configuration a codeword — and so she guesses the
opposite color.
As for the probability of winning with this strategy, it is 15/16, and here is why. We
have seen that the triples corresponding to the Kirkman Schoolgirls problem generate a
vector space, the row space of the incidence matrix M , that corresponds to the (15, 11)
Hamming code. The incorrect guesses will occur exactly when the arrangements of maroon
hats correspond to a vector in the Hamming code. Hence, the probability that the players
lose the game is given by the size of the Hamming code divided by the total number of
$Z_2$-vectors of length 15. This gives us $2^{11}/2^{15} = 1/16$. So the chances of winning are
actually 1 − 1/16, or 15/16. We hope you find this as surprising as we do. By increasing
the number of players, you actually increase your chances of winning.
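The 15/16 figure can also be confirmed by brute force. The following is a minimal sketch, assuming the strategy is the one used in the examples of this section: a girl who sees vector sum 0 guesses maroon, a girl who sees her own number guesses orange, and everyone else passes. The function names are ours and purely illustrative.

```python
from itertools import product

N_GIRLS = 15  # girls are numbered 1..15; each number is read as a 4-bit vector


def guesses(maroon):
    """Map each girl who speaks up to her guess (True = maroon, False = orange).

    `maroon` is the set of girls wearing maroon hats.
    """
    total = 0
    for k in maroon:
        total ^= k  # XOR of the numbers is the vector sum over Z_2
    result = {}
    for k in range(1, N_GIRLS + 1):
        seen = total ^ k if k in maroon else total  # sum over the other 14 hats
        if seen == 0:
            result[k] = True   # a codeword would need orange, so guess maroon
        elif seen == k:
            result[k] = False  # a codeword would need maroon, so guess orange
    return result


def team_wins(maroon):
    """The team wins if somebody guesses and nobody guesses wrongly."""
    g = guesses(maroon)
    return bool(g) and all((k in maroon) == colour for k, colour in g.items())


def count_wins():
    wins = 0
    for bits in product([0, 1], repeat=N_GIRLS):
        maroon = {k for k, bit in enumerate(bits, start=1) if bit}
        wins += team_wins(maroon)
    return wins


wins = count_wins()
print(wins, 2 ** N_GIRLS)  # 30720 of 32768 configurations, i.e. 15/16
```

Calling `guesses({1, 4, 6, 8, 9, 10, 12})` on the configuration of the exercise returns a single guess, from Girl 4, which matches the solution given at the end of this section.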
As for the Three Hats Game, the triplication code is a (3, 1) Hamming code. Its generator matrix is [1 1 1], the set of codewords is {000, 111} and there are 8 binary vectors of
length 3. Hence, the probability of a win is 1 − 1/4, or 3/4.
With that, we leave Thomas Kirkman and his fifteen schoolgirls, whose simple arrangement question has led us into many varied areas of mathematics. Hats off to all fifteen of
you!
Questions
Where can I find out more about Kirkman designs and block designs in general?
Two of the best sources are Marshall Hall’s classic [73] and the more technical book by Beth,
Jungnickel and Lenz [13], both of which are excellent and will take you as far as you want
to go.
Is the Kirkman design found in PG(3, 2) the only solution to Kirkman’s
Schoolgirls Problem? We say that two block designs are isomorphic if there is a 1-1
correspondence between the two sets of varieties that is also a 1-1 correspondence between
the two sets of blocks. It was known for a long time that there are eighty non-isomorphic
(15, 3, 1) designs. In 1922, F. N. Cole [33] proved that only four of these eighty designs are
resolvable. Cole also proved that three of these have two non-isomorphic resolutions, while
the fourth has only one. (Exercise: Determine whether the (15, 3, 1) design presented in
this paper has a resolution not isomorphic to the one in Figure 55.)
For which values of v do resolvable (v, 3, 1)-designs exist? This question dates
back to Kirkman himself ([90, 91]) and was open for over a hundred years. Finally, in 1971
D. K. Ray-Chaudhuri and R. M. Wilson [125] proved that resolvable (v, 3, 1)-designs exist
if and only if v ≡ 3 mod 6.
Are there Kirkman designs in fields other than the degree-16 field described above? These fields certainly contain Steiner triple systems analogous to Kirkman’s design. Let n > 3 and let $p_1, p_2, \ldots, p_n$ be distinct primes. The field
$L_n = Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$ is an extension of degree $2^n$ over the rational numbers Q. In such
fields, the quadratic and the biquadratic subfields of $L_n$ are the varieties and blocks, respectively, of a $(2^n - 1, 3, 1)$ design. As an exercise, show that such a design contains
$b = (2^n - 1)(2^{n-1} - 1)/3$ blocks, and each variety appears in $r = 2^{n-1} - 1$ blocks.
Now, the number of quadratic subfields in $L_n$ is $2^n - 1$, which is a multiple of 3 if and only if n is
even. Thus, if n is odd, then the above Steiner triple system is not resolvable, because the
number of varieties is not a multiple of 3. If n is even, is the $(2^n - 1, 3, 1)$ Steiner triple
system analogous to Kirkman’s design resolvable? We don’t know the answer!
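The block and replication counts in the exercise above can be checked numerically for small n. This is a sketch under a labeling assumption of ours: each quadratic subfield Q(√d) is represented by the nonzero n-bit mask recording which primes divide d, and the three quadratic subfields of a biquadratic subfield are taken to be those whose masks XOR to zero (as happens in the degree-16 case). It checks the counts; it does not prove the formulas.

```python
def xor_triple_system(n):
    """Points are the nonzero n-bit masks; blocks are the triples {a, b, a XOR b}."""
    points = list(range(1, 2 ** n))
    blocks = {frozenset((a, b, a ^ b)) for a in points for b in points if a < b}
    return points, blocks


def parameters(n):
    """Return (v, b, r) for the triple system on 2^n - 1 points."""
    points, blocks = xor_triple_system(n)
    replication = sum(1 in block for block in blocks)  # blocks through point 1
    return len(points), len(blocks), replication


for n in (3, 4, 5):
    v, b, r = parameters(n)
    print(n, v, b, r)
```

For n = 4 this gives v = 15, b = 35 and r = 7, the parameters of Kirkman's design.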
The set of 15 Schoolgirls contains 455 3-element subsets, or trios. Suppose
the school term is 13 weeks long. What if the Schoolgirls wanted to arrange
13 weeks’ worth of walks so that each trio of girls can walk together exactly
once during the term? They can do it. Note that this amounts to partitioning the
455 trios into 13 disjoint Kirkman (15, 3, 1) designs of 35 trios each. Evidently, in 1850 Cayley referred to
Kirkman’s original problem as well as to Sylvester’s extension to 13 walks. In 1974, R. H.
F. Denniston briefly discussed the problem’s history, and then presented a solution [45]. As
an exercise, find a partition of the 84 3-element subsets of {1, . . . , 9} into seven resolvable
(9, 3, 1) designs. Happy walking!
The Schoolgirl Problem connects block designs, finite projective geometries,
algebraic number fields, error-correcting codes, and recreational mathematics.
Are there any other connections? Yes, there is at least one more connection. The set
$G_K = (Z/2Z)^4 = \{(a, b, c, d) : a, b, c, d \in \{0, 1\}\}$ is a group under the operation of coordinatewise addition mod 2. $(Z/2Z)^4$ has 15 subgroups of order 2, 35 subgroups of order 4 and
15 subgroups of order 8; each order-4 subgroup contains three order-2 subgroups, and each
order-2 subgroup is contained in seven order-4 subgroups
and seven order-8 subgroups. (Does this sound familiar?) In fact, $G_K$ is what is known
as the Galois group of the degree-16 field $K = Q(\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{7})$. It is the group of
automorphisms of K that leave Q fixed. There is a one-to-one, order-reversing
correspondence between the subfields of K and the subgroups of GK , and the details of
this correspondence are laid out in the Fundamental Theorem of Galois Theory, one of the
most beautiful theorems in mathematics.
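The subgroup counts for $(Z/2Z)^4$ are small enough to enumerate directly. Here is a minimal sketch, with elements encoded as 4-bit integers and the group operation as XOR; the helper name `span` is ours.

```python
from itertools import combinations


def span(generators):
    """Subgroup of (Z/2Z)^4 generated by the given elements (4-bit integers)."""
    subgroup = {0}
    for g in generators:
        subgroup |= {x ^ g for x in subgroup}
    return frozenset(subgroup)


nonzero = range(1, 16)
subgroups = {span(gens) for size in (1, 2, 3) for gens in combinations(nonzero, size)}
by_order = {order: [h for h in subgroups if len(h) == order] for order in (2, 4, 8)}

counts = {order: len(hs) for order, hs in by_order.items()}
print(counts)  # {2: 15, 4: 35, 8: 15}

point = by_order[2][0]
in_order_4 = sum(point <= h for h in by_order[4])   # order-4 subgroups containing it
in_order_8 = sum(point <= h for h in by_order[8])   # order-8 subgroups containing it
points_per_4 = {sum(p <= h for p in by_order[2]) for h in by_order[4]}
print(in_order_4, in_order_8, points_per_4)  # 7 7 {3}
```

The numbers 15, 35, 7 and 3 are exactly the parameters v, b, r and k of the Kirkman design, with order-2 subgroups as points and order-4 subgroups as blocks.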
What about the solution to that exercise? We know that girls 1, 4, 6, 8, 9, 10 and
12 are wearing maroon hats, and the others are wearing orange hats. The sequence of all
maroon hats yields the vector sum 1 ⊕ 4 ⊕ 6 ⊕ 8 ⊕ 9 ⊕ 10 ⊕ 12, i.e.
0001 ⊕ 0100 ⊕ 0110 ⊕ 1000 ⊕ 1001 ⊕ 1010 ⊕ 1100 = 0100, or 4.
A girl k in a maroon hat sees k ⊕ 4, a girl in an orange hat sees 4, and the only one with a winning view is Girl 4, who sees the all-zeros
vector. Therefore, she guesses maroon, nobody else guesses, and the team wins.
12 (7, 3, 1) and combinatorics
This section is in the file 731tuoccombx.tex.
You may have noticed that the (7, 3, 1) block design has a whole host of connections
with different areas of combinatorics. So far, we have seen these seven three-element sets
turn up as a block design, a Steiner triple system, a perfect difference set, the P-positions
in three-pile nim, the Heawood graph, the projective plane of order two, a Hamming code,
and a three-circle Venn diagram associated with that code. Might there be others?
There are indeed, and it is time for you to meet some of them!
(7, 3, 1) and Latin squares
Let’s begin with a generalization of the Fano plane. A finite projective plane of order n,
abbreviated FPP(n), is a system of $n^2 + n + 1$ points and $n^2 + n + 1$ lines, such that
every point is on n + 1 lines, every line contains n + 1 points, and every pair of points is on
a unique line. Thus, an FPP(n) is an $(n^2 + n + 1, n + 1, 1)$ symmetric design; conversely,
every symmetric (v, k, 1) design is a finite projective plane of order n = k − 1 with $v = n^2 + n + 1$.
We see that the Fano plane is an FPP(2).
One of the major unsolved problems in combinatorics is determining for which values
of n an FPP(n) exists. Their existence is equivalent to the existence of certain families of
designs called Latin squares, designs which got mixed up with one of the most famous false
conjectures in the history of mathematics. So let’s talk briefly about them now.
A Latin square of order n is an n × n array with entries from the set {1, . . . , n} such
that each element of {1, . . . , n} appears in each row and each column of the array exactly
once. Two Latin squares $A = [a_{ij}]$ and $B = [b_{ij}]$ of order n are said to be orthogonal if the
$n^2$ ordered pairs $(a_{ij}, b_{ij})$ are distinct. (A good reference for the general subject of Latin
squares is [44].) Here are three Latin squares of order 4; you can check that (a) they are
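Latin squares and (b) they are orthogonal in pairs.
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \\ 3 & 4 & 1 & 2 \\ 4 & 3 & 2 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 3 & 4 & 2 \\ 2 & 4 & 3 & 1 \\ 3 & 1 & 2 & 4 \\ 4 & 2 & 1 & 3 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 4 & 2 & 3 \\ 2 & 3 & 1 & 4 \\ 3 & 2 & 4 & 1 \\ 4 & 1 & 3 & 2 \end{bmatrix}
\]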
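Both checks are mechanical, so they are easy to hand to a machine. The sketch below encodes the three squares of order 4 (named A, B and C here for convenience; the names are ours) and tests the Latin property and pairwise orthogonality directly from the definitions.

```python
from itertools import combinations

SQUARES = {
    "A": [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]],
    "B": [[1, 3, 4, 2], [2, 4, 3, 1], [3, 1, 2, 4], [4, 2, 1, 3]],
    "C": [[1, 4, 2, 3], [2, 3, 1, 4], [3, 2, 4, 1], [4, 1, 3, 2]],
}


def is_latin(square):
    """Every symbol 1..n appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


def are_orthogonal(p, q):
    """The n^2 ordered pairs (p_ij, q_ij) are all distinct."""
    n = len(p)
    pairs = {(p[i][j], q[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


print(all(is_latin(sq) for sq in SQUARES.values()))  # True
print(all(are_orthogonal(SQUARES[x], SQUARES[y])
          for x, y in combinations(SQUARES, 2)))      # True
```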
It turns out that the existence of this trio of pairwise orthogonal Latin squares of size
4 is equivalent to the existence of a finite projective plane of order 4; in fact, this is true
in general:
Theorem: Let n be an integer greater than 1. Then there exists a finite projective plane
of order n if and only if there exists a set of n − 1 Latin squares of size n which are pairwise
orthogonal.
An outline of the proof is given in [133]—where it is also shown that if n is a prime
power, then there exists a set of n−1 Latin squares of size n which are pairwise orthogonal.
Hence, FPP’s exist for orders 2, 3, 4, 5, 7, 8 and 9. In particular, (7, 3, 1) is an FPP(2) and
so there must be a corresponding set of n − 1 pairwise orthogonal Latin squares of size
n = 2. We’ll see that set later.
By this theorem, there are sets of n − 1 pairwise orthogonal Latin squares of sizes
2, 3, 4, 5, 7, 8, and 9. But what about order 6?
The great Leonhard Euler wondered about 6, too. That prodigious mathematical mind
from the eighteenth century made a study of Latin squares, showed how to construct a
pair of size n if n is not of the form 4k + 2, and saw immediately that it is impossible
to construct a pair of orthogonal Latin squares of size 2 (try it). He then attempted to
construct a pair of size 6; failing to do so, he made the following bold conjecture:
Euler’s Conjecture (1782): For each nonnegative integer k, there does not exist a pair of
orthogonal Latin squares of size 4k + 2.
For over 100 years, nothing happened. Then, in 1900, G. Tarry wrote two papers (see
[163]) proving that Euler was right about 6. But as Bose, Shrikhande and Parker (see [16])
showed in 1960, he was spectacularly wrong for all other values of 4k + 2 greater than 6:
Theorem: There exists a pair of orthogonal Latin squares of order n for all n > 6.
It is now known (see [100] for details) that an FPP(10) does not exist; the smallest
unknown case is for n = 12. So we see that (7, 3, 1) is connected with one of the rare
instances in which Euler was almost totally wrong!
And here is the set of pair-wise orthogonal Latin squares of size 2 guaranteed by the
preceding theorem:
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]
(7, 3, 1) and round-robin tournaments
We have seen that (7, 3, 1) has connections with graph theory – specifically, with the
Heawood graph and the imbedding of seven mutually adjacent hexagons on the torus,
the latter illustrating that the complete graph $K_7$ has genus $\gamma(K_7) = 1$. Now, directed
graphs have already appeared as models of combinatorial games, and the next (7, 3, 1)
connection features directed graphs called tournaments.
In many sports leagues, each team plays every other team exactly once, and there are
no ties. We can model this scenario with a graph as follows. The teams are the vertices,
and for each pair of teams u and v, include the edge (u, v) directed from u to v if u
beats v, and include the edge (v, u) if v beats u. We call such a graph a (round–robin)
tournament; thus, a tournament is a complete graph with a direction assigned to each
edge. (A good reference on tournaments is Moon’s book [112].) The score of a vertex u is
the number of edges (u, v) in the tournament. A tournament is called transitive if every
team has a different score, and regular if every team has the same score. Naturally, the
higher the score, the higher the team’s rank. Thus, in a transitive tournament, the scores
determine the ranking unambiguously, and in a regular tournament, the scores don’t give
any information. (A transitive tournament is so named because it has the property that if
u beats v and v beats w, then u beats w.)
Now, the high schools of Auburn, Blacksburg, Christiansburg, EastMont, Giles, Newman and Radford make up the Riverside League. In most years, one or two schools dominate, but last year the results of the league’s round-robin play were quite different:
Team             Victories Over
Auburn           Blacksburg, Giles, Radford
Blacksburg       Christiansburg, Giles, Newman
Christiansburg   Auburn, Newman, Radford
EastMont         Auburn, Blacksburg, Christiansburg
Giles            Christiansburg, EastMont, Radford
Newman           Auburn, Giles, EastMont
Radford          Blacksburg, EastMont, Newman
Each team had the same score of 3, so this was a regular tournament. But the league
was even more balanced than that: each pair of teams was victorious over exactly one
common opponent. (Such a tournament is called doubly regular.) Let us look at this
a little more carefully, assigning numbers to the teams as follows: A=3, B=0, C=1, E=6,
G=4, N=2, and R=5. If we now rewrite the results in numerical order, the table of victories
begins to look somewhat familiar:
Team   Victories Over
0      1, 2, 4
1      2, 3, 5
2      3, 4, 6
3      4, 5, 0
4      5, 6, 1
5      6, 0, 2
6      0, 1, 3
Our old friend (7, 3, 1) has reappeared: the sets of teams defeated by each member of
the league are the blocks of a (7, 3, 1) symmetric design, and the teams defeated by Team
0 form a (7, 3, 1) difference set.
Of course, this is no accident—and we can prove it.
Theorem: Let p = 4n + 3 be a prime. Define the tournament T by V (T ) = [0..p − 1]
and E(T ) = {(x, x + r) : r is a nonzero square mod p}, with x + r reduced mod p. Then T is a doubly regular tournament
with 4n + 3 vertices, in which every vertex has a score of 2n + 1 and every pair of vertices
defeats n common opponents.
Proof: Because p ≡ 3 mod 4, −1 is not a square mod p, so for distinct x and y exactly one
of y − x and x − y is a nonzero square, and T is indeed a tournament. Since there are
(p − 1)/2 = 2n + 1 nonzero squares mod p, each vertex has a score of
2n + 1. Now let x, y be distinct vertices. Then (x, z) and (y, z) are both edges of T if and
only if there exist distinct nonzero squares r and s such that z − x = r and z − y = s. Hence,
the number of such z is equal to the number of pairs of distinct nonzero squares r, s such that
r − s = y − x. But the nonzero squares form a (4n + 3, 2n + 1, n)-difference set, and so the nonzero
number y − x can be written as a difference r − s in exactly n distinct ways. Hence, there
are n vertices z such that both (x, z) and (y, z) are edges of T , i.e. T is doubly regular.
In short, (7, 3, 1) is a seven-player doubly regular tournament.
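The construction in the theorem is short enough to run. The following sketch builds the tournament for a few primes p ≡ 3 mod 4 and reports the set of scores and the set of common-opponent counts; the function names are ours, and a finite check of this kind illustrates the theorem rather than proving it.

```python
def paley_tournament(p):
    """beats[x] is the set of vertices that x defeats, for a prime p = 3 (mod 4)."""
    squares = {(x * x) % p for x in range(1, p)}  # the nonzero squares mod p
    return {x: {(x + r) % p for r in squares} for x in range(p)}


def common_victims(beats):
    """Set of values |beats[x] & beats[y]| taken over all pairs x < y."""
    verts = sorted(beats)
    return {len(beats[x] & beats[y])
            for i, x in enumerate(verts) for y in verts[i + 1:]}


def is_tournament(beats):
    """Exactly one of (x, y), (y, x) is an edge for every pair of distinct vertices."""
    verts = sorted(beats)
    return all((y in beats[x]) != (x in beats[y])
               for i, x in enumerate(verts) for y in verts[i + 1:])


for p in (7, 11, 19, 23):
    beats = paley_tournament(p)
    scores = {len(victims) for victims in beats.values()}
    print(p, is_tournament(beats), scores, common_victims(beats))
```

For p = 7 the victims of vertex 0 are 1, 2 and 4, every score is 3, and every pair of vertices has exactly one common victim, as in the Riverside League table.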
Now, the adjacency matrix of a tournament T on v vertices x1 , . . . , xv is a v × v matrix
A = [Aij ] such that Aij = 1 if there is an edge from xi to xj , and Aij = 0 otherwise. Thus,
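the (7, 3, 1) doubly regular tournament has the following adjacency matrix A:
\[
A = \begin{bmatrix}
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 & 0
\end{bmatrix},
\qquad
H = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & 1 & -1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & 1 & -1 \\
1 & -1 & -1 & -1 & 1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & 1 & 1 & -1 \\
1 & -1 & 1 & -1 & -1 & -1 & 1 & 1 \\
1 & 1 & -1 & 1 & -1 & -1 & -1 & 1 \\
1 & 1 & 1 & -1 & 1 & -1 & -1 & -1
\end{bmatrix}.
\]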
If we replace all the 0’s in A with −1’s and border the resulting matrix top and left with
a row and column of 1’s, we obtain the above matrix H. If we multiply H by its transpose
$H^T$, it turns out that $HH^T = 8I$, where I is the 8 × 8 identity matrix. In Section 10, we
called an n × n matrix H of 1’s and −1’s for which $HH^T = nI$ a Hadamard matrix of order
n.
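The claim about H can be checked in a few lines. This sketch rebuilds A from the rule "team i beats teams i + 1, i + 2 and i + 4 mod 7", forms H as just described, and compares the product of H with its transpose against 8I. Plain lists are used so that nothing beyond the standard library is needed.

```python
# Adjacency matrix of the seven-team tournament: i beats i+1, i+2, i+4 (mod 7).
A = [[1 if (j - i) % 7 in (1, 2, 4) else 0 for j in range(7)] for i in range(7)]

# Replace 0 by -1, then border with a top row and a left column of 1's.
H = [[1] * 8] + [[1] + [1 if entry else -1 for entry in row] for row in A]


def times_transpose(m):
    """Return the matrix product of m with its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


gram = times_transpose(H)
eight_identity = [[8 if i == j else 0 for j in range(8)] for i in range(8)]
print(gram == eight_identity)  # True
```

The off-diagonal zeros come from the two regularity conditions: each row of A has three 1's, and two distinct rows of A share exactly one 1.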
It turns out that the adjacency matrix of every doubly regular tournament can be
transformed into a skew–Hadamard matrix (one for which $H + H^T = 2I$), and every
skew–Hadamard matrix gives rise to a doubly regular tournament (see [126]).
What this means is that (7, 3, 1) is a skew-Hadamard matrix of order 8.
Hadamard matrices are very useful in constructing error–correcting codes and other
combinatorial designs. You can show that if H is a Hadamard matrix of order n > 1, then
either n = 2 or n ≡ 0 mod 4. Hadamard’s conjecture is that there exists a Hadamard
matrix of order n for every n divisible by 4. As of 2015, the smallest open case is n = 668.
We have seen how (7, 3, 1) connects with complete sets of orthogonal Latin squares,
doubly regular tournaments, and skew-Hadamard matrices. In the next section it reappears
in the lattice of subfields of the splitting field of the polynomial
$p(x) = (x^2 - 2)(x^2 - 3)(x^2 - 5)$.
13 (7, 3, 1) and algebraic systems
This section is in the file 731tuocalg.tex.
(7, 3, 1) and algebraic number fields
As we mentioned in the last section, Galois made the great breakthrough in algebra by
transforming a problem of finding the roots of a given polynomial p(x) with rational coefficients into a problem of determining a certain finite group associated with p(x). We call
that group the Galois group of the splitting field L = L(p) of p(x), namely the smallest
subfield L of the complex numbers containing the rational numbers as well as the roots
of p(x). His work was the starting point for shifting the emphasis in algebra from finding
roots of polynomials to studying algebraic structures – but that is a whole nother story.
It happens that these splitting fields are finite-dimensional vector spaces over the rational numbers Q, and the dimension of such a field over Q is called the degree of the extension. For
the polynomial $p(x) = (x^2 - 2)(x^2 - 3)(x^2 - 5)$, it turns out that $L(p) = Q(\sqrt{2}, \sqrt{3}, \sqrt{5})$.
This is an octic field, so-called because it is a degree-8 extension of Q. This octic field
contains, besides itself and Q, fourteen other subfields which form a (7, 3, 1) block design.
For this particular incarnation of (7, 3, 1), the elements are the seven quadratic (degree-2)
subfields $Q(\sqrt{d})$, where d ∈ {2, 3, 5, 6, 10, 15, 30}, and the blocks are the seven biquadratic
(degree-4) subfields $Q(\sqrt{d_1}, \sqrt{d_2})$ (where $d_1 d_2$ is not a perfect square). See for yourself:

Biquadratic Field     Contains Q(√d) for these d
Q(√2, √3)             2, 3, 6
Q(√3, √5)             3, 5, 15
Q(√5, √6)             5, 6, 30
Q(√6, √15)            6, 15, 10
Q(√15, √30)           15, 30, 2
Q(√30, √10)           30, 10, 3
Q(√10, √2)            10, 2, 5
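The table can be regenerated from one observation, which we take as given for this sketch: the third quadratic subfield of $Q(\sqrt{d_1}, \sqrt{d_2})$ is $Q(\sqrt{d})$ with d the squarefree part of $d_1 d_2$. The helper names below are ours.

```python
from itertools import combinations

D_VALUES = [2, 3, 5, 6, 10, 15, 30]


def squarefree_part(n):
    """Divide out every square factor of the positive integer n."""
    p = 2
    while p * p <= n:
        while n % (p * p) == 0:
            n //= p * p
        p += 1
    return n


# Each pair {d1, d2} determines the block {d1, d2, squarefree part of d1*d2}.
blocks = {frozenset((d1, d2, squarefree_part(d1 * d2)))
          for d1, d2 in combinations(D_VALUES, 2)}

pair_counts = {pair: sum(set(pair) <= block for block in blocks)
               for pair in combinations(D_VALUES, 2)}

print(len(blocks))                 # 7
print(set(pair_counts.values()))   # {1}
```

Seven blocks of size three on seven points, with every pair of points in exactly one block, is precisely a (7, 3, 1) design.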
(7, 3, 1) and normed algebras
We now explore a fascinating connection (7, 3, 1) has with a number system that superficially resembles the complex numbers, and to which mathematicians were led by asking
questions about sums of squares.
Squares and their sums have fascinated the mathematical world for millennia, beginning
with the Pythagorean theorem. Euclid gives a proof of the Pythagorean theorem in Book I,
Proposition 47 of The Elements. Book X, Proposition 29, Lemma 1 gives a general formula
for triples (x, y, z) of integers such that $x^2 + y^2 = z^2$. In modern notation, if a and b are
relatively prime integers of opposite parity, set $x = a^2 - b^2$, $y = 2ab$, and $z = a^2 + b^2$; then
$x^2 + y^2 = z^2$.
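As a small illustration of Euclid's formula, the sketch below lists the triples produced by coprime pairs a > b of opposite parity. The function name and the bound `limit` are ours.

```python
from math import gcd


def euclid_triples(limit):
    """Yield (x, y, z) = (a^2 - b^2, 2ab, a^2 + b^2) for coprime a > b of opposite parity."""
    for a in range(2, limit + 1):
        for b in range(1, a):
            if gcd(a, b) == 1 and (a - b) % 2 == 1:
                yield a * a - b * b, 2 * a * b, a * a + b * b


triples = list(euclid_triples(4))
print(triples)  # [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25)]
```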
Several hundred years later, Diophantus (ca. 250 CE) made an observation in the
solution to Problem III.22 of his Arithmetica, an observation that implicitly contains the
identity
\[ (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2, \]
which gives the product of two sums of two squares as a sum of two squares. Diophantus does not supply a proof, but almost a millennium later, Leonardo of Pisa (1175–1240) includes this two-squares identity, with proof, in his Liber quadratorum (The Book
of Squares).
In 1748, Euler proved the Four Square Identity, namely that the product of two sums
of four squares is again a sum of four squares, showing that if $a_1, \ldots, a_4$ and $b_1, \ldots, b_4$ are
numbers, then
\begin{align*}
(a_1^2 + a_2^2 + a_3^2 + a_4^2)(b_1^2 + b_2^2 + b_3^2 + b_4^2)
&= (a_1 b_1 - a_2 b_2 - a_3 b_3 - a_4 b_4)^2 + (a_1 b_2 + a_2 b_1 + a_3 b_4 - a_4 b_3)^2 \\
&\quad + (a_1 b_3 - a_2 b_4 + a_3 b_1 + a_4 b_2)^2 + (a_1 b_4 + a_2 b_3 - a_3 b_2 + a_4 b_1)^2.
\end{align*}
Lagrange used this identity in his 1770 proof that every positive integer can be written
as a sum of four squares of integers. The identities of Diophantus and Euler raised the
question: are there other identities like this?
One such identity for sums of eight squares was first found by the Danish mathematician
Ferdinand Degen in 1818. The eight-squares identity states that if $a_0, \ldots, a_7$ and $b_0, \ldots, b_7$
are numbers, then
\begin{align*}
&(a_0^2 + a_1^2 + a_2^2 + a_3^2 + a_4^2 + a_5^2 + a_6^2 + a_7^2)(b_0^2 + b_1^2 + b_2^2 + b_3^2 + b_4^2 + b_5^2 + b_6^2 + b_7^2) \\
&\quad = (a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3 - a_4 b_4 - a_5 b_5 - a_6 b_6 - a_7 b_7)^2 \\
&\qquad + (a_0 b_1 + a_1 b_0 + a_2 b_3 - a_3 b_2 + a_4 b_5 - a_5 b_4 - a_6 b_7 + a_7 b_6)^2 \\
&\qquad + (a_0 b_2 - a_1 b_3 + a_2 b_0 + a_3 b_1 + a_4 b_6 + a_5 b_7 - a_6 b_4 - a_7 b_5)^2 \\
&\qquad + (a_0 b_3 + a_1 b_2 - a_2 b_1 + a_3 b_0 + a_4 b_7 - a_5 b_6 + a_6 b_5 - a_7 b_4)^2 \\
&\qquad + (a_0 b_4 - a_1 b_5 - a_2 b_6 - a_3 b_7 + a_4 b_0 + a_5 b_1 + a_6 b_2 + a_7 b_3)^2 \\
&\qquad + (a_0 b_5 + a_1 b_4 - a_2 b_7 + a_3 b_6 - a_4 b_1 + a_5 b_0 - a_6 b_3 + a_7 b_2)^2 \\
&\qquad + (a_0 b_6 + a_1 b_7 + a_2 b_4 - a_3 b_5 - a_4 b_2 + a_5 b_3 + a_6 b_0 - a_7 b_1)^2 \\
&\qquad + (a_0 b_7 - a_1 b_6 + a_2 b_5 + a_3 b_4 - a_4 b_3 - a_5 b_2 + a_6 b_1 + a_7 b_0)^2.
\end{align*}
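An identity with sixty-four signed products invites a numerical sanity check. In the sketch below, `TERMS[k]` lists the signed products $a_i b_j$ making up the k-th square; the encoding and the names are ours. Note that the product $a_2 b_5$ in the last square carries a plus sign here, and the check fails if that sign is flipped. Exact integer arithmetic on random inputs is evidence for the identity, not a proof.

```python
import random

# TERMS[k] lists (sign, i, j) for each product a_i * b_j appearing in the k-th square.
TERMS = [
    [(1, 0, 0), (-1, 1, 1), (-1, 2, 2), (-1, 3, 3), (-1, 4, 4), (-1, 5, 5), (-1, 6, 6), (-1, 7, 7)],
    [(1, 0, 1), (1, 1, 0), (1, 2, 3), (-1, 3, 2), (1, 4, 5), (-1, 5, 4), (-1, 6, 7), (1, 7, 6)],
    [(1, 0, 2), (-1, 1, 3), (1, 2, 0), (1, 3, 1), (1, 4, 6), (1, 5, 7), (-1, 6, 4), (-1, 7, 5)],
    [(1, 0, 3), (1, 1, 2), (-1, 2, 1), (1, 3, 0), (1, 4, 7), (-1, 5, 6), (1, 6, 5), (-1, 7, 4)],
    [(1, 0, 4), (-1, 1, 5), (-1, 2, 6), (-1, 3, 7), (1, 4, 0), (1, 5, 1), (1, 6, 2), (1, 7, 3)],
    [(1, 0, 5), (1, 1, 4), (-1, 2, 7), (1, 3, 6), (-1, 4, 1), (1, 5, 0), (-1, 6, 3), (1, 7, 2)],
    [(1, 0, 6), (1, 1, 7), (1, 2, 4), (-1, 3, 5), (-1, 4, 2), (1, 5, 3), (1, 6, 0), (-1, 7, 1)],
    [(1, 0, 7), (-1, 1, 6), (1, 2, 5), (1, 3, 4), (-1, 4, 3), (-1, 5, 2), (1, 6, 1), (1, 7, 0)],
]


def z_values(a, b):
    """The eight bilinear expressions whose squares appear on the right-hand side."""
    return [sum(sign * a[i] * b[j] for sign, i, j in row) for row in TERMS]


def identity_holds(a, b):
    lhs = sum(x * x for x in a) * sum(y * y for y in b)
    rhs = sum(z * z for z in z_values(a, b))
    return lhs == rhs


random.seed(0)
trials = [([random.randint(-9, 9) for _ in range(8)],
           [random.randint(-9, 9) for _ in range(8)]) for _ in range(1000)]
print(all(identity_holds(a, b) for a, b in trials))  # True
```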
At this point, mathematicians were quite hopeful that other, perhaps infinitely many,
sums-of-squares identities exist. Let’s rephrase the question “Are there other identities like
this?” as follows. For which positive integers n does there exist an identity of the form
\[ (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2) = z_1^2 + \cdots + z_n^2, \quad \text{where } z_k = \sum_{i,j=1}^{n} A_{ijk}\, x_i y_j, \]
and the $A_{ijk}$ are constants independent of the values of the $x_i$ and the $y_j$?
The question was answered in 1898 by Adolf Hurwitz, who proved that such an identity
exists for n = 1, 2, 4, 8, and for no other positive integers. He showed that each sums-of-squares identity leads to an n-dimensional normed algebra. Now a normed algebra A is an n-dimensional vector space over the real numbers R that has two special features, namely (1) a
vector multiplication that distributes over vector addition, and (2) a mapping N : A → R
such that N (uv) = N (u)N (v) for all u, v ∈ A. These algebras are the real numbers R
(n = 1), the complex numbers C (n = 2), Hamilton’s quaternions H (n = 4), and the
octonions O (n = 8). The latter is a beautiful algebraic system with a multiplication table
that reveals itself as another aspect of (7, 3, 1). We will explore the octonions below, and
then we will construct the analogous 16-dimensional algebra known as the sedenions and
see just why it is not a normed algebra.
One square is easy: because multiplication of real numbers is commutative and associative, we see that $a^2 b^2 = (ab)^2$ for all real numbers a and b. As for two squares, Diophantus
(ca. 250 CE) had an answer. Problem III.22 of his Arithmetica implicitly contains the
identity
\[ (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2, \]
which gives the product of two sums of two squares as a sum of two squares. As mentioned
above, the normed algebras associated with the one-square and two-square identities are
the real numbers R and the complex numbers C, respectively. In fact, multiplication of
complex numbers reflects the two-squares identity as follows. Let z = a + bi and define
$N(z) = a^2 + b^2$; if w = c + di, then $N(w) = c^2 + d^2$ and we see that zw = ac − bd + (ad + bc)i.
Finally, we see that
\[ N(zw) = (ac - bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2) = N(z)N(w), \]
by the two-squares identity.
For n = 4, the four-squares identity dates from 1748, when Euler proved that
\begin{align*}
(a_1^2 + a_2^2 + a_3^2 + a_4^2)(b_1^2 + b_2^2 + b_3^2 + b_4^2)
&= (a_1 b_1 - a_2 b_2 - a_3 b_3 - a_4 b_4)^2 + (a_1 b_2 + a_2 b_1 + a_3 b_4 - a_4 b_3)^2 \\
&\quad + (a_1 b_3 - a_2 b_4 + a_3 b_1 + a_4 b_2)^2 + (a_1 b_4 + a_2 b_3 - a_3 b_2 + a_4 b_1)^2,
\end{align*}
which Lagrange used in his 1770 proof that every positive integer can be written as a sum
of four squares of integers. As for the associated normed algebra, that is one of the great
stories in mathematics, and it came about in the following way.
During the early 1840s, William R. Hamilton was searching for a way to multiply ordered
triples of real numbers, analogous to multiplication of complex numbers viewed as ordered
pairs. He searched a long time and failed to find such a multiplication, but working through
these unsuccessful attempts led him to one of the famous “aha!” moments in the history of
mathematics. On the morning of October 16, 1843, that moment came to Hamilton while
he was taking a walk. He realized in a flash of insight that the solution he sought was a
multiplication of quadruples, not triples, and then, as he described in an 1865 letter to his
son Archibald [74], “Nor could I resist the impulse – unphilosophical as it may have been
– to cut with a knife on a stone of Brougham Bridge, as we passed it, the fundamental
formula with the symbols, i, j, k; namely,
\[ i^2 = j^2 = k^2 = ijk = -1, \]
which contains the Solution of the Problem, but of course, as an inscription, has long since
mouldered away.”
Hamilton gave the name quaternions to the resulting algebra H generated by 1, i, j and
k; the multiplication table for the units 1, i, j and k is as follows:
∗     1     i     j     k
1     1     i     j     k
i     i    −1     k    −j
j     j    −k    −1     i
k     k     j    −i    −1
A quaternion is an expression of the form $x_1 + x_2 i + x_3 j + x_4 k$, where the $x_n$ are real
numbers. It is easy to see how to add these expressions term-by-term, and Hamilton’s
new multiplication table shows us how to multiply them. One multiplies two quaternions
by using the distributive law, Hamilton’s table, and the fact that xi = ix, xj = jx, and
xk = kx for all real numbers x. Hamilton showed that this multiplication is associative;
however, the table shows that ij = k = −ji and so multiplication is not commutative. We
can define a norm on H by $N(x_1 + x_2 i + x_3 j + x_4 k) = x_1^2 + x_2^2 + x_3^2 + x_4^2$, and because of
the four-square identity, it follows that N (x)N (y) = N (xy) for all x, y ∈ H. Therefore, H
is a four-dimensional normed algebra – that is, $R^4$ equipped with a multiplication – and
because of that, we can show that H is a division ring, which means that every nonzero
element of H has a multiplicative inverse. Here’s how:
We first define the conjugate $\bar{x}$ of a quaternion x by $\overline{x_1 + x_2 i + x_3 j + x_4 k} = x_1 - x_2 i - x_3 j - x_4 k$. Another routine calculation shows that
\[ x\bar{x} = (x_1 + x_2 i + x_3 j + x_4 k)(x_1 - x_2 i - x_3 j - x_4 k) = x_1^2 + x_2^2 + x_3^2 + x_4^2 = N(x). \]
Now if x ≠ 0, then N (x) is a positive real number, and it follows that
\[ x \cdot \frac{\bar{x}}{N(x)} = \frac{N(x)}{N(x)} = 1. \]
Hence x has a multiplicative inverse, and so H is a division algebra. Since at that time,
the only known division rings were fields, H was the first example of a noncommutative
division ring. This unique status of H would last only a couple of months.
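The quaternion rules are easy to experiment with. The sketch below represents $x_1 + x_2 i + x_3 j + x_4 k$ as a 4-tuple, multiplies according to Hamilton's table, and computes inverses as conjugate divided by norm; the function names are ours, and `Fraction` is used only so that the inverse is exact.

```python
from fractions import Fraction


def qmul(x, y):
    """Product of quaternions given as 4-tuples (real, i, j, k coefficients)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conj(x):
    return (x[0], -x[1], -x[2], -x[3])


def norm(x):
    return sum(t * t for t in x)


def inverse(x):
    """Conjugate divided by norm; x must be nonzero."""
    n = norm(x)
    return tuple(Fraction(t, n) for t in conj(x))


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
x, y = (1, 2, 3, 4), (5, -6, 7, -8)

print(qmul(i, j), qmul(j, i))                   # k and -k
print(norm(qmul(x, y)) == norm(x) * norm(y))    # True
print(qmul(x, inverse(x)) == (1, 0, 0, 0))      # True
```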
What happened next was that the very next day, Hamilton mailed the good news about
the quaternions to his friend and fellow mathematician John T. Graves. Two months
later, Graves sent him a letter in which he described a multiplication on $R^8$; we now
call this algebra the octonions O. Hamilton’s quaternion multiplication uses three units
{i, j, k}, each of whose squares is equal to −1. Graves’ multiplication on O uses seven units
$\{o_1, \ldots, o_7\}$ whose products come from the following multiplication table:

∗     1     o1    o2    o3    o4    o5    o6    o7
1     1     o1    o2    o3    o4    o5    o6    o7
o1    o1    −1    o4    o7    −o2   o6    −o5   −o3
o2    o2    −o4   −1    o5    o1    −o3   o7    −o6
o3    o3    −o7   −o5   −1    o6    o2    −o4   o1
o4    o4    o2    −o1   −o6   −1    o7    o3    −o5
o5    o5    −o6   o3    −o2   −o7   −1    o1    o4
o6    o6    o5    −o7   o4    −o3   −o1   −1    o2
o7    o7    o3    o6    −o1   o5    −o4   −o2   −1
Better yet, this multiplication came equipped with a norm, namely
\[ N(a_0 + a_1 o_1 + \cdots + a_7 o_7) = a_0^2 + a_1^2 + \cdots + a_7^2. \]
This norm satisfies N (ab) = N (a)N (b) because of Graves’ other bit of news, namely his
rediscovery of the eight-squares identity
\begin{align*}
&(a_0^2 + a_1^2 + a_2^2 + a_3^2 + a_4^2 + a_5^2 + a_6^2 + a_7^2)(b_0^2 + b_1^2 + b_2^2 + b_3^2 + b_4^2 + b_5^2 + b_6^2 + b_7^2) \\
&\quad = (a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3 - a_4 b_4 - a_5 b_5 - a_6 b_6 - a_7 b_7)^2 \\
&\qquad + (a_0 b_1 + a_1 b_0 + a_2 b_3 - a_3 b_2 + a_4 b_5 - a_5 b_4 - a_6 b_7 + a_7 b_6)^2 \\
&\qquad + (a_0 b_2 - a_1 b_3 + a_2 b_0 + a_3 b_1 + a_4 b_6 + a_5 b_7 - a_6 b_4 - a_7 b_5)^2 \\
&\qquad + (a_0 b_3 + a_1 b_2 - a_2 b_1 + a_3 b_0 + a_4 b_7 - a_5 b_6 + a_6 b_5 - a_7 b_4)^2 \\
&\qquad + (a_0 b_4 - a_1 b_5 - a_2 b_6 - a_3 b_7 + a_4 b_0 + a_5 b_1 + a_6 b_2 + a_7 b_3)^2 \\
&\qquad + (a_0 b_5 + a_1 b_4 - a_2 b_7 + a_3 b_6 - a_4 b_1 + a_5 b_0 - a_6 b_3 + a_7 b_2)^2 \\
&\qquad + (a_0 b_6 + a_1 b_7 + a_2 b_4 - a_3 b_5 - a_4 b_2 + a_5 b_3 + a_6 b_0 - a_7 b_1)^2 \\
&\qquad + (a_0 b_7 - a_1 b_6 + a_2 b_5 + a_3 b_4 - a_4 b_3 - a_5 b_2 + a_6 b_1 + a_7 b_0)^2,
\end{align*}
due – as we have seen – to Ferdinand Degen. However, there is no evidence that Degen
constructed the associated multiplication on $R^8$. Arthur Cayley independently rediscovered
that identity when he constructed the eight-dimensional normed algebra O in 1845, and
both he and Graves used the same method to produce their versions of O. Their method
was to mimic the constructions of C and H as two-dimensional vector spaces over R and
C, respectively, with multiplication described by a formula similar to multiplication of
complex numbers.
This method should bear the names of both Cayley and Graves. Unfortunately, Cayley’s work was published first, and his method was later generalized by the American
mathematician L. E. Dickson in such papers as [47]. As a result, we call this method the
Cayley-Dickson construction.
Because O is a normed algebra, by previous reasoning we see that every nonzero element of O
has a multiplicative inverse, so O is also a division algebra, and the table tells us that multiplication in O is noncommutative. But O is also
nonassociative, for $o_1(o_2 o_3) = o_1 o_5 = o_6$, whereas $(o_1 o_2)o_3 = o_4 o_3 = -o_6$.
The construction of this multiplication table seems quite mysterious; however, if we
look more closely, we notice that
\begin{align*}
o_1 o_2 &= o_4 = -o_2 o_1, \\
o_2 o_3 &= o_5 = -o_3 o_2, \\
o_3 o_4 &= o_6 = -o_4 o_3, \\
o_4 o_5 &= o_7 = -o_5 o_4, \\
o_5 o_6 &= o_1 = -o_6 o_5, \\
o_6 o_7 &= o_2 = -o_7 o_6, \text{ and} \\
o_7 o_1 &= o_3 = -o_1 o_7.
\end{align*}
And now we see it. For distinct a, b ∈ {1, . . . , 7}, $o_a o_b = \pm o_c$, where {a, b, c} is one of
the seven blocks $D_i$ in the mod 7 (7, 3, 1) block design. The sign is determined by cyclically ordering the blocks as follows: (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), and
(7, 1, 3). Then $o_a o_b = o_c$ or $o_a o_b = -o_c$ according as a does or does not directly precede b
in the unique ordered block containing a and b, where each block is read cyclically, so that its
last entry directly precedes its first. Thus, 6 precedes 1 in the block (5, 6, 1),
so $o_6 o_1 = o_5$; 6 does not directly precede 4 in (3, 4, 6), so $o_6 o_4 = -o_3$. (We note that these
designated orderings on the blocks of (7, 3, 1) arise as a direct result of Graves’ method
of constructing O.) And that is why “the multiplication rule for the octonion units” is
another name of (7, 3, 1).
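The block rule translates directly into code. The following sketch derives every product of units from the seven cyclically ordered blocks (index 0 stands for the unit 1), extends it bilinearly to 8-tuples, and then checks the nonassociativity example and the multiplicativity of the norm on random integer inputs. All function names are ours.

```python
import random

BLOCKS = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def unit_product(a, b):
    """Return (sign, c) with o_a * o_b = sign * o_c, where o_0 denotes 1."""
    if a == 0 or b == 0:
        return 1, a + b
    if a == b:
        return -1, 0
    for block in BLOCKS:
        if a in block and b in block:
            (c,) = set(block) - {a, b}
            follows = block[(block.index(a) + 1) % 3] == b  # cyclic successor of a
            return (1 if follows else -1), c
    raise ValueError("indices must lie in 0..7")


def omul(x, y):
    """Bilinear extension of the unit products to 8-tuples of coefficients."""
    z = [0] * 8
    for a, xa in enumerate(x):
        for b, yb in enumerate(y):
            sign, c = unit_product(a, b)
            z[c] += sign * xa * yb
    return z


def unit(n, coefficient=1):
    return [coefficient if m == n else 0 for m in range(8)]


def norm(x):
    return sum(t * t for t in x)


print(omul(unit(1), omul(unit(2), unit(3))))   # o6
print(omul(omul(unit(1), unit(2)), unit(3)))   # -o6

random.seed(1)
samples = [([random.randint(-5, 5) for _ in range(8)],
            [random.randint(-5, 5) for _ in range(8)]) for _ in range(200)]
print(all(norm(omul(x, y)) == norm(x) * norm(y) for x, y in samples))  # True
```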
But there is more: the octonion algebra has the following structural features:
1. The octonion algebra O contains seven complex subalgebras $C_n = R\langle o_n \rangle$ and seven
quaternion subalgebras $H_n = R\langle o_t, o_u, o_v \rangle$, where {t, u, v} is a block in (7, 3, 1).
2. Each $H_n$ contains three of the $C_k$ and each $C_k$ is contained in three of the $H_n$.
3. Each pair $\{C_k, C_m\}$ is contained in a unique $H_n$.
In short, O contains a (7, 3, 1) block design, with the seven quaternion subalgebras as blocks
and the seven complex subalgebras as points: another name of (7, 3, 1).
Well, can we do this again and get a 16-squares identity? We applied the Cayley-Dickson
construction to the complex numbers to get the quaternions, and the resulting algebra
was no longer commutative. We applied Cayley-Dickson to the quaternions to get the
octonions and there was a connection with (7, 3, 1), but the resulting algebra was no longer
associative. It is natural, therefore, to ask what happens when we apply Cayley-Dickson
to the octonions? The answer is that we can do this, and the result is a 16-dimensional
real algebra S called the sedenions. The multiplication on S uses fifteen units {s1 , . . . , s15 },
whose products come from the multiplication table in Figure 58, in which the entry in row x
and column y is the product xy:

∗     1     s1    s2    s3    s4    s5    s6    s7    s8    s9    s10   s11   s12   s13   s14   s15
1     1     s1    s2    s3    s4    s5    s6    s7    s8    s9    s10   s11   s12   s13   s14   s15
s1    s1    −1    s3    −s2   s5    −s4   −s7   s6    s9    −s8   −s11  s10   −s13  s12   s15   −s14
s2    s2    −s3   −1    s1    s6    s7    −s4   −s5   s10   s11   −s8   −s9   −s14  −s15  s12   s13
s3    s3    s2    −s1   −1    s7    −s6   s5    −s4   s11   −s10  s9    −s8   −s15  s14   −s13  s12
s4    s4    −s5   −s6   −s7   −1    s1    s2    s3    s12   s13   s14   s15   −s8   −s9   −s10  −s11
s5    s5    s4    −s7   s6    −s1   −1    −s3   s2    s13   −s12  s15   −s14  s9    −s8   s11   −s10
s6    s6    s7    s4    −s5   −s2   s3    −1    −s1   s14   −s15  −s12  s13   s10   −s11  −s8   s9
s7    s7    −s6   s5    s4    −s3   −s2   s1    −1    s15   s14   −s13  −s12  s11   s10   −s9   −s8
s8    s8    −s9   −s10  −s11  −s12  −s13  −s14  −s15  −1    s1    s2    s3    s4    s5    s6    s7
s9    s9    s8    −s11  s10   −s13  s12   s15   −s14  −s1   −1    −s3   s2    −s5   s4    s7    −s6
s10   s10   s11   s8    −s9   −s14  −s15  s12   s13   −s2   s3    −1    −s1   −s6   −s7   s4    s5
s11   s11   −s10  s9    s8    −s15  s14   −s13  s12   −s3   −s2   s1    −1    −s7   s6    −s5   s4
s12   s12   s13   s14   s15   s8    −s9   −s10  −s11  −s4   s5    s6    s7    −1    −s1   −s2   −s3
s13   s13   −s12  s15   −s14  s9    s8    s11   −s10  −s5   −s4   s7    −s6   s1    −1    s3    −s2
s14   s14   −s15  −s12  s13   s10   −s11  s8    s9    −s6   −s7   −s4   s5    s2    −s3   −1    s1
s15   s15   s14   −s13  −s12  s11   s10   −s9   s8    −s7   s6    −s5   −s4   s3    s2    −s1   −1

Figure 58: The sedenions
It happens that there are 15 8-dimensional subalgebras of S, each isomorphic to the
octonions, and for each of these, the multiplication tables are generated by 15 isomorphic
copies of (7, 3, 1). One obtains the overall multiplication by adjusting the tables of the 15
octonions to achieve consistency of the products from one octonion subalgebra to the next.
There are also 35 4-dimensional subalgebras of S, each isomorphic to the quaternions, and
15 2-dimensional subalgebras of S, each isomorphic to the complex numbers. And there is
another design hidden within this set of subalgebras. Namely, the 15 complex subalgebras
(points) and the 35 quaternionic subalgebras (blocks) form a (15, 35, 7, 3, 1) block design.
However, the string of normed algebras – that is, algebras with sums-of-squares identities
– stops with O. The reason is that S contains pairs of nonzero elements whose product
equals zero, and this prevents S from being a normed algebra. Indeed, suppose there were
a norm N on S. From the table we see that
(s5 + s9 )(s7 − s11 ) = s5 s7 + s9 s7 − s5 s11 − s9 s11 = 0.
Thus, 0 = N (0) = N ((s5 + s9 )(s7 − s11 )) = N (s5 + s9 )N (s7 − s11 ), so one of N (s5 +
s9 ), N (s7 − s11 ) must be 0. But this implies that either s5 = −s9 or s7 = s11 , neither of
which holds. Hence the sedenions are not a normed algebra. Finally, the Cayley-Dickson
operation on S won’t produce a normed algebra, as the resulting 32-dimensional algebra
would contain 31 copies of S. Thus, there are no more real normed algebras to be produced
by the Cayley-Dickson construction, and so – according to L. E. Dickson’s modification of
Hurwitz’ original proof [47] – there are no real normed algebras beyond the octonions.
We have now revealed connections between (7, 3, 1) with Latin squares, round-robin
tournaments, Hadamard matrices, algebraic number fields, and the octonions, as well as
all the others we have previously mentioned. Are these all of them?
Not quite. In the next section we see how another name of (7, 3, 1) is associated with
127
s15
s15
−s14
s13
s12
−s11
−s10
s9
−s8
s7
−s6
s5
s4
−s3
−s2
s1
−1
a mathematical structure that connects combinatorics, graph theory, geometry, topology,
linear algebra, and greedy algorithms. These structures are called matroids, so let’s find
out about them now.
14 (7, 3, 1) and Matroids
This section is in the file called matroids.tex.
We explore another name of (7, 3, 1) – namely the F7 matroid – and see how the idea
of a matroid reveals connections between combinatorics, graph theory, geometry, topology,
linear algebra, and greedy algorithms.
Have you noticed the hidden connections between seemingly independent mathematical
ideas? Strange that finding roots of polynomials can tell us important things about how
to solve certain ordinary differential equations, or that computing a determinant would
have anything to do with finding solutions to a system of equations. But this is one of the
charming elements of mathematics – that disparate objects share similar traits. Properties
like independence appear in many contexts. Do you find independence everywhere you
look? In 1933, three Harvard junior-fellows tied together this recurring theme in mathematics by defining new mathematical objects, which they called matroids. Matroids are
everywhere, if only we knew how to look.
Matroids arise from shared behaviors of vector spaces and graphs. In the discussion
to follow, we explore this natural motivation for the matroid through two examples and
consider how properties of independence surface. After this, we’ll explore the connection
between matroids and projective planes – in particular, with (7, 3, 1). But matroids do not
reside merely in the halls of pure mathematics; they play an essential role in combinatorial
optimization, and we consider their role in two contexts, the construction of minimum-weight spanning trees and the creation of optimal schedules.
For example, suppose you move your company into a new office building and your 25
employees need to connect their 25 computers to each other in a network. The cable needed
to do this is expensive, so you want to connect them with the least cable possible; this
will form a minimum-weight spanning tree, where by weight we mean the length of cable
needed to connect the computers, by spanning we mean that we reach each computer, and
by tree we mean we have no redundancy in the network. How do we find this minimum
length?
An exhaustive search is prohibitively expensive, but we are in luck. Not only are certain
kinds of matroids associated with this problem, but it turns out that they are the very
characterizations of such problems. That is, (a) recognizing that a problem involves a
matroid tells us whether or not certain algorithms will return an optimal solution and (b)
knowing that an algorithm effects a solution tells us whether we have a matroid.
Why Matroids?
One of the main themes of this book is that in combinatorics, disparate objects share
similar traits. In particular, independence, dependence, generating sets, and bases appear
in many contexts, and their commonalities are embodied in the ubiquity of these objects
called matroids.
Figure 59: A cycle sketch
We find matroids in arrangements of hyperplanes, configurations of points, and finite
projective planes. In particular, F7 is the matroid associated with the Fano plane, and so
with our old friend the (7, 3, 1) block design.
While tying together similar structures is important and enlightening, matroids do not
reside merely in the halls of pure mathematics; they play an essential role in combinatorial
optimization, and we’ll see how.
Declaration of (In)dependence
In everyday life, what do we mean by the terms dependence and independence? In life,
we feel dependent if there is something (or someone) upon which (or whom) we must rely.
On the other hand, independence is the state of self-sufficiency, and being reliant upon
nothing else. Further, we consider something independent if it somehow extends beyond
the rest, making new territory accessible, whether that territory is physical, intellectual,
or otherwise. In such a case that independent entity is necessary for access to this new
territory.
But we use these terms more technically in mathematics, so let us connect the colloquial
to the technical by considering two examples in which we find independence.
Linear Independence of Vectors The first and most familiar context in which we
encounter independence is linear algebra, when we define the linear independence of vectors
within a particular vector space. Consider the following finite collection of vectors from
the vector space R^3 (or C^3 or (F3)^3):
v1 = (1, 0, 0)^T, v2 = (0, 1, 0)^T, v3 = (0, 0, 1)^T, v4 = (1, 0, 1)^T,
v5 = (0, 1, 1)^T, v6 = (2, 0, 0)^T, v7 = (0, 0, 0)^T.
It is not difficult to determine which subsets of this set are linearly independent sets of
vectors in R^3: subsets in which it is impossible to represent the zero vector as a nontrivial
linear combination of the vectors of the subset. That is, no vector within the subset
relies upon any of the others. If some vector were a linear combination of the others, we
would call the set of vectors linearly dependent. Since v7 is the zero vector, it must be excluded
from any subset aspiring to linear independence. Likewise, v1 and v6 cannot appear
together, since v6 = 2v1.
Let us identify the maximal independent sets. By maximal we mean that the set in
question is not properly contained within any other independent set of vectors. We know
that since the vector space has dimension 3, the size of such a maximal set can be no larger
than 3; in fact, we can produce a set of size 3 immediately, since {v1 , v2 , v3 } forms the
standard basis. It takes little time to find B, the complete set of maximal independent
sets, which the reader should verify:
B = {{v1 , v2 , v3 }, {v1 , v2 , v4 }, {v1 , v2 , v5 }, {v1 , v3 , v5 }, {v1 , v4 , v5 },
{v2 , v3 , v4 }, {v2 , v3 , v6 }, {v2 , v4 , v5 }, {v2 , v4 , v6 },
{v2 , v5 , v6 }, {v3 , v4 , v5 }, {v3 , v5 , v6 }, {v4 , v5 , v6 }}.
Note that each set contains exactly three elements. This will turn out to be a robust
characteristic when we expand the scope of our exploration of independence. For any set
of vectors, there exists at least one maximal independent set.
Two other properties of B that will prove to be important:
• No maximal independent set can be properly contained in another maximal independent set.
• Given any pair of elements, B1 , B2 ∈ B, we may take any v from B1 and there is
some element w ∈ B2 such that (B1 − v) ∪ w is in B.
You might want to check the second property in a couple of cases, but don’t overdo it, as
there are C(13, 2) = 78 pairs of maximal sets. In general, given any vector space, we could
select some finite set of vectors and then find the maximal linearly independent subsets of
that set of vectors. These maximal sets necessarily have size no larger than the dimension
of the vector space, but whatever their size, they will always satisfy the two properties
listed above. The same phenomenon occurs in graph theory: let’s see how.
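The list B and the exchange property can also be checked mechanically. The following is a minimal sketch, assuming the seven vectors are read as integer triples over R; the helper names (`det3`, `maximal_independent_sets`, `exchange_property_holds`) are ours and purely illustrative.

```python
from itertools import combinations

# The seven vectors v1..v7 from the text, keyed by index.
VECTORS = {
    1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1), 4: (1, 0, 1),
    5: (0, 1, 1), 6: (2, 0, 0), 7: (0, 0, 0),
}

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def maximal_independent_sets(vectors):
    """Triples with nonzero determinant. These are exactly the maximal
    independent sets here, because the vectors span a 3-dimensional space."""
    return [frozenset(t) for t in combinations(sorted(vectors), 3)
            if det3(*(vectors[i] for i in t)) != 0]

def exchange_property_holds(bases):
    """For every B1, B2 and every v in B1, some w in B2 gives (B1 - v) + w in B."""
    family = set(bases)
    return all(any((b1 - {v}) | {w} in family for w in b2)
               for b1 in family for b2 in family for v in b1)

BASES = maximal_independent_sets(VECTORS)
print(len(BASES), exchange_property_holds(BASES))
```

Under these assumptions the sketch finds the same thirteen triples listed in B and confirms the exchange property for every pair, which saves checking the cases by hand.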
Graph Theory and Independence We restrict our attention to connected graphs.
There are two common ways to define independence in a graph, on the vertices or on the
edges. We focus on the edges. What might it mean for a set of edges to be independent?
Revisiting the idea of independence being tied to necessity, and the accessibility of new
territory, when would edges be necessary in a connected graph? Edges exist to connect
vertices. Put another way, edges are how we move from vertex to vertex in a graph. So
some set of edges should be considered independent if, for each edge, the removal of that
edge makes some vertex inaccessible to a previously accessible vertex.
Figure 60: Connected graph G
Consider the graph in Figure 60 with edge set E = {e1 , e2 , . . . , e7 }.
Now, consider the subset of edges S = {e1 , e3 , e4 , e5 }. Is this an independent set of
edges? No, because the same set of vertices are connected to one another even if, for
example, edge e3 were removed from S. Note that the set S contains a cycle. (A cycle is
a closed path.) Any time some set of edges contains a cycle, it cannot be an independent
set of edges. This also means {e7 } is not an independent set, since it is itself a cycle; it
doesn’t get us anywhere new.
In any connected graph, a set of edges that forms a tree or forest is independent. This
makes sense two different ways: first, a tree or forest never contains a cycle; second, the
removal of any edge from a tree or forest disconnects some vertices from one another,
decreasing accessibility, and so every edge is necessary. A maximal such set is a set of
edges containing no cycles, which also makes all vertices accessible to one another. This
is called a spanning tree. There must be at least one spanning tree for a connected graph.
Here is the set, T , of all spanning trees for G:
T = {{e1 , e2 , e3 }, {e1 , e2 , e4 }, {e1 , e2 , e5 }, {e1 , e3 , e5 }, {e1 , e4 , e5 },
{e2 , e3 , e4 }, {e2 , e3 , e6 }, {e2 , e4 , e5 }, {e2 , e4 , e6 },
{e2 , e5 , e6 }, {e3 , e4 , e5 }, {e3 , e5 , e6 }, {e4 , e5 , e6 }}.
Here again we see that all maximal independent sets must have the same size. (How
many edges are there in a spanning tree of a connected graph on n vertices?)
Spanning trees also have two other important traits.
• No spanning tree properly contains another spanning tree.
• Given two spanning trees, T1 and T2 , and an edge e from T1 , we can always find some
edge f from T2 such that (T1 − e) ∪ f will also be a spanning tree.
To demonstrate the second condition, consider the spanning trees T1 and T2 shown as
bolded edges of the graph G in Figure 61.
Figure 61: Three spanning trees of G
Suppose we wanted to build a third spanning tree using the edges from T1 except e1 .
Then we must be able to find some edge of T2 that we can include with the leftover edges
from T1 to form the new spanning tree T3 . We can, indeed, include edge e3 to produce
spanning tree T3 , also shown in Figure 61. This exchange property would hold for any edge
of T1 , and this property is proved in every beginning course in graph theory.
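Readers who like to experiment can enumerate spanning trees by machine. The sketch below is only an illustration: the edge list is a hypothetical graph on four vertices a, b, c, d that we chose to be consistent with the set T listed above (e6 parallel to e1, e7 a loop); the actual drawing of G in Figure 60 may differ. The names `EDGES`, `is_spanning_tree`, and `T_FROM_TEXT` are ours.

```python
from itertools import combinations

# ASSUMPTION: a hypothetical edge list consistent with the spanning trees
# listed in the text; the graph in Figure 60 may be drawn differently.
EDGES = {
    1: ("a", "c"), 2: ("a", "d"), 3: ("a", "b"), 4: ("b", "c"),
    5: ("b", "d"), 6: ("a", "c"), 7: ("d", "d"),
}
VERTICES = {"a", "b", "c", "d"}

def is_spanning_tree(edge_ids):
    """True if the chosen edges contain no cycle and number |V| - 1."""
    parent = {v: v for v in VERTICES}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for e in edge_ids:
        ru, rv = find(EDGES[e][0]), find(EDGES[e][1])
        if ru == rv:  # this edge would close a cycle (or is a loop)
            return False
        parent[ru] = rv
    return len(edge_ids) == len(VERTICES) - 1

SPANNING_TREES = {frozenset(t) for t in combinations(sorted(EDGES), 3)
                  if is_spanning_tree(t)}

# The set T from the text, written as sets of edge indices.
T_FROM_TEXT = {frozenset(t) for t in [
    (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 5), (1, 4, 5), (2, 3, 4),
    (2, 3, 6), (2, 4, 5), (2, 4, 6), (2, 5, 6), (3, 4, 5), (3, 5, 6),
    (4, 5, 6)]}

print(SPANNING_TREES == T_FROM_TEXT)
```

For this assumed graph the enumeration reproduces T exactly, so it can stand in for G when trying out the exchange property on other pairs of trees.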
Motivated by our two examples, now is the proper time for some new terminology and
definitions to formally abstract these behaviors.
Thus, Matroids
As you notice these similarities between the spanning trees of a graph and the maximal
independent sets of a collection of vectors, we should point out that you are not alone. In the
1930s, Hassler Whitney, Garrett Birkhoff, and Saunders Mac Lane at Harvard and B. L. van
der Waerden in Germany were observing these same traits. They noticed these properties
of independence that appeared in a graph or a collection of vectors, and wondered if other
mathematical objects shared this behavior. Thus, a matroid is defined on any collection
of elements that share these traits. We define here a matroid in terms of its maximal
independent sets, or bases.
The bases A matroid M is an ordered pair, (E, B), of a finite set E (elements) and a
non-empty collection B (bases) of subsets of E satisfying the following conditions, usually
called the basis axioms.
• No basis properly contains another basis.
• If B1 and B2 are in B and e ∈ B1 , then there is an element f ∈ B2 such that
(B1 − e) ∪ f ∈ B.
The bases of the matroid are its maximal independent sets. By repeatedly applying the
second property above, we can show that all bases have the same size.
Returning to our examples, we can define a matroid on a graph. A matroid can be
defined on any graph, but we will restrict our attention to connected graphs. If G is
a graph with edge set E, the cycle matroid of G, denoted M (G), is the matroid whose
element set, E, is the set of edges of the graph and whose set of bases, B, is the set of
spanning trees of G. We can list the bases of the cycle matroid of G by listing all of the
spanning trees of the graph.
For the graph in Figure 60, the elements of M (G) are the edges {e1 , e2 , e3 , e4 , e5 , e6 , e7 }.
We have already listed all of the spanning trees of the graph above, so we already have a
list of the bases of this matroid.
We can also define a matroid on a finite set of vectors. The vectors are the elements, or
ground set, of the matroid, and B is the set of maximal linearly independent sets of vectors.
These maximal independent sets, of course, form bases for the vector space spanned by
these vectors. And we recall that all bases of a vector space have the same size.
You might be starting to see where some of the terminology comes from. The bases of
the vector matroid are bases of a vector space. What about the word matroid? We can
view the vectors comprising our example as the column vectors of a matrix, which is why
Whitney [177] called these matroids.
        v1 v2 v3 v4 v5 v6 v7
      [  1  0  0  1  0  2  0 ]
      [  0  1  0  0  1  0  0 ]
      [  0  0  1  1  1  0  0 ]
These (column) vectors {v1 , v2 , v3 , v4 , v5 , v6 , v7 } are the elements of this matroid. The
bases are the maximal independent sets listed in the previous section.
Now for a quick example not (necessarily) from a matrix or graph. We said that any
pair (E, B) that satisfies the two conditions is a matroid. Suppose we take, for example, a
set of four elements and let the bases be every subset of two elements. This is a matroid
called a uniform matroid, and in general it is not related to either a graph or a collection
of vectors.
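The two basis axioms are easy to test by brute force on small examples. Here is a minimal sketch, with the illustrative helper name `is_basis_family` (ours), that checks them for the uniform matroid on four elements described above and for a family that fails the exchange axiom.

```python
from itertools import combinations

def is_basis_family(bases):
    """Check the two basis axioms for a family of subsets of a finite set."""
    family = {frozenset(b) for b in bases}
    if not family:
        return False
    # Axiom 1: no basis properly contains another basis.
    no_containment = not any(b1 < b2 for b1 in family for b2 in family)
    # Axiom 2: for B1, B2 and e in B1 there is f in B2 with (B1 - e) + f a basis.
    exchange = all(any((b1 - {e}) | {f} in family for f in b2)
                   for b1 in family for b2 in family for e in b1)
    return no_containment and exchange

GROUND_SET = [1, 2, 3, 4]
UNIFORM_BASES = list(combinations(GROUND_SET, 2))  # every 2-element subset
NOT_A_MATROID = [{1, 2}, {3, 4}]                   # exchange fails for e = 1

print(is_basis_family(UNIFORM_BASES), is_basis_family(NOT_A_MATROID))
```

The second family shows that the exchange axiom has real content: removing 1 from {1, 2} leaves {2}, and neither {2, 3} nor {2, 4} belongs to the family.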
Beyond the bases You might notice something now that we’ve looked at our two examples again. The bases of the cycle matroid and the bases of the vector matroid are
the same, if we relabel vi as ei . Are they the same matroid? Yes. Once we know the
elements of the matroid and the bases, the matroid is fully determined, so these matroids
are isomorphic. An isomorphism is a structure-preserving correspondence. Thus, two matroids are isomorphic if there is a one-to-one correspondence between their elements that
preserves the set of bases.
Knowing the elements and the bases tells us exactly what the matroid is, but can we
delve further into the structure of this matroid? What else might we like to know about
a matroid? Well, what else do we know about a collection of vectors? We know what it
means for a set of vectors to be linearly dependent, for instance. In a graph, we often look
at the cycles of the graph. If we had focused on the linearly dependent sets and cycles in
our examples, we would have uncovered similar properties they share.
Recall also that, if we take a subset of a linearly independent set of vectors, that subset
is linearly independent. Why? If a vector could not be written as a linear combination of
the others, it cannot be written as a linear combination of a smaller set. Also, if we take
a subset of the edges of a tree in a graph, that subset is still independent; if a set of edges
contains no cycle, it would be impossible for a subset of those edges to contain a cycle. So
any subset of an independent set is independent, and this is true for matroids in general
as well.
We can translate some of these familiar traits from linear algebra and graph theory
to define some more features of a matroid. Any set of elements of the matroid that is
contained in a basis is an independent set of the matroid. Further, any independent set
can be extended to a basis. The rank of a matroid is the size of a maximum independent
set, i.e. the size of a basis. On a related note, anytime we have two independent sets of
different sizes, say |I1| < |I2|, then we can always find some element of the larger set to
include with the smaller so that the enlarged set is also independent: there exists some e ∈ I2 such that
I1 ∪ e is independent. A subset of E that is not independent is called dependent. A minimal
dependent set is a circuit in the matroid; by minimal we mean that any proper subset of
this set is not dependent.
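Independent sets, rank, and circuits can all be computed from the list of bases alone. The following sketch does so for the thirteen bases of our running example, writing each element by its index 1, ..., 7; the function names are illustrative, not standard.

```python
from itertools import combinations

GROUND = range(1, 8)
# The thirteen bases of the running example, as sets of indices.
BASES = {frozenset(b) for b in [
    (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 5), (1, 4, 5), (2, 3, 4),
    (2, 3, 6), (2, 4, 5), (2, 4, 6), (2, 5, 6), (3, 4, 5), (3, 5, 6),
    (4, 5, 6)]}

def is_independent(subset):
    """A set is independent when it is contained in some basis."""
    s = frozenset(subset)
    return any(s <= b for b in BASES)

def circuits():
    """Minimal dependent sets: dependent, yet independent after removing
    any single element (enough, since subsets of independent sets are
    independent)."""
    found = []
    for size in range(1, 8):
        for subset in combinations(GROUND, size):
            s = frozenset(subset)
            if not is_independent(s) and all(is_independent(s - {x}) for x in s):
                found.append(s)
    return found

RANK = max(len(b) for b in BASES)
CIRCUITS = circuits()
print(RANK, sorted(sorted(c) for c in CIRCUITS))
```

For this example the rank is 3 and there are seven circuits, among them the loop {7}, the parallel pair {1, 6}, and the triangle {2, 3, 5}.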
What is an independent set of the cycle matroid? A set of edges is independent in the
matroid if it contains no cycle in the graph because a subset of a spanning tree cannot
contain a cycle. Thus, a set of edges is dependent in the matroid if it contains a cycle in
the graph. A circuit in this matroid is a cycle in the graph.
Figure 62: Connected graph G
Get out your pencils! In the graph in Figure 62, {e2 , e4 } is an independent set, but not
a basis because it is not maximal. The subset {e7 } is not independent because it is a cycle;
it is a dependent set, and, since it is a minimal dependent set, it is a circuit. (A single-edge circuit is called a loop in a matroid.) In fact, any set containing {e7} is dependent
because it contains a cycle in the graph, or circuit in the matroid. Another dependent set
is {e2 , e3 , e4 , e5 }, but it is not a circuit; {e2 , e3 , e5 } is a circuit.
In the vector matroid, a set of elements is independent in the matroid if that collection
of vectors is linearly independent; for instance, {v2 , v4 } is an independent set. A dependent
set in the matroid is a set of linearly dependent vectors, for example {v2 , v3 , v4 , v5 }. And
a circuit is a dependent set, all of whose proper subsets are independent. {v2 , v3 , v5 } is a
circuit, as is {v7 }. We noted earlier that any set containing {v7 } is a linearly dependent
set; we now see that any such set contains a circuit in the vector matroid.
Harvesting a Geometric Example from a New Field We just saw how a collection
of vectors can be a representation for a particular matroid over one field but not over
another field. The ground set of the matroid (the vectors) is the same in each case, but
the independent sets are different. Thus, they are not the same matroid. Let’s further
explore the role the field can play in determining the structure of a vector matroid. We
now consider an example over the field of two elements, F2 . As above, the ground set of
our matroid will be the set of column vectors, and a subset is independent if the vectors
form a linearly independent set when considered within the vector space (F2)^3.
        a b c d e f g
      [ 1 0 0 1 1 0 1 ]
      [ 0 1 0 1 0 1 1 ]
      [ 0 0 1 0 1 1 1 ]
Consider the set {d, e, f}. Accustomed as we are to vectors in R^3, our initial inclination
is that this is a linearly independent set of vectors. But recall that 1 + 1 = 0 over F2. This
means that each vector in {d, e, f} is the sum of the other two vectors. This is a linearly
dependent set in this vector space, and thus a dependent set in the matroid, and not a
basis. In fact, {d, e, f} is a minimal dependent set, a circuit, in the matroid, since all of its
proper subsets are independent.
The matroid generated by this matrix has a number of interesting characteristics, which
you should take a few moments to explore.
1. Given any two distinct elements, there will be a unique third element which completes
a 3-element circuit. (That is, any two elements determine a 3-element circuit.)
2. Any two 3-element circuits will intersect in a single element.
3. There is a set of four elements no three of which form a circuit. (This might be a
little harder to find, as there are C(7, 4) = 35 cases to check.)
Look familiar? Certainly: these are the axioms for projective planes with elements as
points of the geometry, and 3-element circuits as lines. Our example has seven points, and
this particular projective plane is called the Fano plane, denoted F7 . The Fano plane is
shown in Figure 63, with each point labeled by its associated vector over F2 . Viewed as a
matroid, any three points on a line (straight or curved) form a circuit.
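The three properties listed above can be confirmed by a short computation over F2. The sketch below uses the columns a, ..., g of the matrix; the names `COLUMNS`, `LINES`, and `QUADRANGLES` are ours. It relies on the fact that three distinct nonzero vectors over F2 are dependent exactly when they sum to zero.

```python
from itertools import combinations

# Columns of the matrix over F2, as given in the text.
COLUMNS = {
    "a": (1, 0, 0), "b": (0, 1, 0), "c": (0, 0, 1), "d": (1, 1, 0),
    "e": (1, 0, 1), "f": (0, 1, 1), "g": (1, 1, 1),
}

def sums_to_zero_mod2(names):
    """True if the named columns add up to the zero vector over F2."""
    return all(sum(COLUMNS[n][i] for n in names) % 2 == 0 for i in range(3))

# The 3-element circuits, which play the role of lines.
LINES = [frozenset(t) for t in combinations(COLUMNS, 3) if sums_to_zero_mod2(t)]

# Property 1: every pair of elements lies in exactly one 3-element circuit.
PROPERTY_1 = all(sum(1 for line in LINES if set(pair) <= line) == 1
                 for pair in combinations(COLUMNS, 2))
# Property 2: any two 3-element circuits meet in exactly one element.
PROPERTY_2 = all(len(l1 & l2) == 1 for l1, l2 in combinations(LINES, 2))
# Property 3: 4-element sets containing no 3-element circuit.
QUADRANGLES = [q for q in combinations(COLUMNS, 4)
               if not any(line <= set(q) for line in LINES)]

print(len(LINES), PROPERTY_1, PROPERTY_2, len(QUADRANGLES))
```

The computation finds seven lines, and seven of the 35 four-element subsets contain no line, so the search in the third property is less daunting than it looks.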
The Fano plane exemplifies the interesting fact that any projective geometry is also a
matroid, though the specific definition of that matroid becomes more complicated once the
dimension of the finite geometry grows beyond two. (Although the Fano plane has rank 3
as a matroid it has dimension 2 as a finite geometry, which is, incidentally, why it is called
a plane. For further information see [117].)
Figure 63: The Fano plane, F7
Matroids and greed
Now that we have seen several different types of matroids, we consider their applications.
Beyond unifying distinct areas of discrete mathematics, matroids are essential in combinatorial optimization. The greedy algorithm, a powerful optimization technique, can be
recognized as a matroid optimization technique. In fact, the greedy algorithm guarantees
an optimal solution only if the fundamental structure is a matroid.
Here is a graph algorithm that illustrates this point. Suppose each edge of a graph has
been assigned a weight. How would you find a spanning tree of minimum total weight? You
could start with an edge of minimal weight, then continue to add the next smallest weight
edge available, unless that edge would introduce a cycle. Does this simple and intuitive
idea work? Yes, but only because the operative structure is a matroid.
An algorithm that, at each stage, chooses the best option (cheapest, shortest, highest
profit) is called greedy. The greedy algorithm allows us to construct a minimum-weight
spanning tree. (This particular incarnation of the greedy algorithm is called Kruskal’s
algorithm.)
In graph G with weight function w on the edges, initialize our set B, B = ∅.
1. Among the edges not yet considered, choose an edge ei of minimal weight. In case of ties, choose any of the tied edges.
2. If B ∪ {ei} contains no cycle, then set B := B ∪ {ei}. In either case, remove ei from consideration and repeat the previous step until no edges remain.
The greedy algorithm concludes, returning a minimum-weight spanning tree B.
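The steps of Kruskal's algorithm translate almost line for line into code. What follows is a minimal sketch, not a tuned implementation: cycles are detected with a simple union-find structure, and the example graph and its weights are hypothetical, chosen only for illustration.

```python
def kruskal(vertices, weighted_edges):
    """Return (tree_edges, total_weight) for a connected weighted graph.

    weighted_edges is a list of (weight, u, v) triples.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree, total = [], 0
    for weight, u, v in sorted(weighted_edges):  # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:            # adding the edge creates no cycle
            parent[ru] = rv
            tree.append((u, v))
            total += weight
    return tree, total

# Hypothetical example: four vertices, six weighted edges.
EXAMPLE_EDGES = [(4, "a", "c"), (2, "a", "d"), (1, "a", "b"),
                 (3, "b", "c"), (5, "b", "d"), (6, "c", "d")]
TREE, TOTAL = kruskal("abcd", EXAMPLE_EDGES)
print(TREE, TOTAL)
```

On this example the algorithm accepts the edges of weight 1, 2, and 3 and rejects the rest, giving a spanning tree of total weight 6.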
Perhaps surprisingly, this approach will always construct a minimum-weight spanning
tree. The surprise is that a sequence of locally best choices results in a globally optimal
solution. Other greedy graph algorithms include Prim’s algorithm for minimum-weight
spanning trees, Dijkstra’s minimal connector algorithm for finding the minimum-weight
path from a given vertex to every other vertex, Hall’s maximal matching algorithm for
bipartite graphs, the Hungarian algorithm for the Assignment problem, and the Ford-Fulkerson labeling algorithm for finding a flow of maximum value in a capacitated network
with integer-valued edge weights. See West’s book [173] for these and other topics in graph
theory.
The greedy algorithm guarantees an optimal solution. ⇐⇒ The underlying structure is actually a matroid.
Figure 64: A stunning truth.
This is not the case with every graph algorithm. It turns out that the greedy algorithm
will not usually lead you to an optimal solution for the Travelling Salesperson Problem,
namely: given a graph with weighted edges, find a cycle that (a) goes through every vertex
and returns to its starting point, and that (b) has minimum total weight. Right now, no
efficient algorithm is known that guarantees an optimal solution; the brute-force approach
is to check all possible routes. For only 10 cities this is 9! = 362,880 possible routes. But
for the minimum-weight spanning tree problem, the greedy algorithm guarantees success.
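A tiny example makes the failure of greed concrete. The sketch below assumes one natural greedy rule, "always travel to the nearest unvisited city" (the text does not single out a particular rule), and compares it with a brute-force search on four cities. The distances are hypothetical and were chosen so that the greedy tour is forced to finish along an expensive edge.

```python
from itertools import permutations
from math import factorial

# Hypothetical symmetric distances between four cities.
DIST = {
    frozenset("AB"): 1, frozenset("BC"): 1, frozenset("CD"): 1,
    frozenset("AD"): 10, frozenset("AC"): 2, frozenset("BD"): 2,
}
CITIES = "ABCD"

def d(x, y):
    return DIST[frozenset((x, y))]

def tour_length(order):
    """Length of the closed tour visiting the cities in the given order."""
    return sum(d(order[i], order[(i + 1) % len(order)])
               for i in range(len(order)))

def nearest_neighbour_tour(start):
    """Greedy rule: always move to the closest city not yet visited."""
    tour, remaining = [start], set(CITIES) - {start}
    while remaining:
        nxt = min(sorted(remaining), key=lambda c: d(tour[-1], c))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

def optimal_tour_length(start):
    """Brute force over all orderings of the remaining cities."""
    others = [c for c in CITIES if c != start]
    return min(tour_length([start, *p]) for p in permutations(others))

GREEDY = tour_length(nearest_neighbour_tour("A"))
BEST = optimal_tour_length("A")
ROUTES_FOR_10_CITIES = factorial(9)
print(GREEDY, BEST, ROUTES_FOR_10_CITIES)
```

Here the greedy tour A, B, C, D has length 13, while the tour A, B, D, C has length 6. The brute-force search is feasible only because the instance is tiny.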
What does this have to do with matroids? The greedy algorithm constructs a minimum-weight spanning tree, and we know what role a spanning tree plays in a graph’s associated
cycle matroid. Thus, the greedy algorithm finds a minimum-weight basis of the cycle
matroid. (Once weights have been assigned to the edges of G, they have also been assigned
to the elements of M (G).) Further, for any matroid, graphic or otherwise, the greedy
algorithm finds a minimum-weight basis.
What is fascinating and quite stunning is that one may go further and define matroids
using the greedy algorithm. That is, it turns out that any time the greedy algorithm, in
any of its guises, guarantees an optimal solution for all weight functions, we may be sure
that the operative mathematical structure must be a matroid. Stated another way, only
when the structure is a matroid is the greedy algorithm guaranteed to return an optimal
solution. (See [117] or [102].) We may, however, have to dig deep to find out what that
particular matroid might be.
Greedy algorithms in the linear algebra of finite-dimensional vector spaces include,
among others, constructing a basis, enlarging a linearly independent set to a basis, the
Gram-Schmidt orthogonalization process, and Gaussian elimination.
Finally, one other observation on the nature of matroids is in order. Given a matroid
M on ground set E, with set of bases B, we may always construct what is called the dual
matroid with the same ground set and the set of bases {B′ ⊆ E | B′ = E − B, B ∈ B}.
Stated more plainly, the set of all complements of bases for one matroid yields a set of
bases for another matroid on the same ground set, called the dual matroid. This surprising
fact has kept many matroid theorists employed for many years. In our current context, the
reason this is particularly interesting is that any time the greedy algorithm is used to find a
minimum-weight basis for a matroid, it has simultaneously found a maximum-weight basis
for the dual matroid (namely, the complement of the basis just found). Pause for a moment
to grasp, and then savor, that fact. (In fact, the
greedy algorithm is sometimes presented first as a method of finding a maximum-weight
basis, in which case the adjective “greedy” makes a little more sense.)
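Both claims about the dual can be tried out on the running example. The sketch below takes the thirteen bases found earlier (elements written as indices 1, ..., 7), forms their complements, checks the exchange axiom for the complements, and runs the greedy algorithm with a hypothetical weight assignment. The weights and function names are ours, for illustration only.

```python
GROUND = frozenset(range(1, 8))
# The thirteen bases of the running example, as sets of indices.
BASES = {frozenset(b) for b in [
    (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 5), (1, 4, 5), (2, 3, 4),
    (2, 3, 6), (2, 4, 5), (2, 4, 6), (2, 5, 6), (3, 4, 5), (3, 5, 6),
    (4, 5, 6)]}
DUAL_BASES = {GROUND - b for b in BASES}

# Hypothetical weights on the seven elements.
WEIGHT = {1: 5, 2: 3, 3: 8, 4: 1, 5: 7, 6: 2, 7: 4}

def weight(elements):
    return sum(WEIGHT[x] for x in elements)

def satisfies_exchange(bases):
    """Second basis axiom, checked by brute force."""
    return all(any((b1 - {e}) | {f} in bases for f in b2)
               for b1 in bases for b2 in bases for e in b1)

def greedy_min_basis(bases):
    """Scan elements from lightest to heaviest, keeping each one that
    leaves the chosen set inside some basis (that is, independent)."""
    chosen = frozenset()
    for x in sorted(WEIGHT, key=WEIGHT.get):
        if any((chosen | {x}) <= b for b in bases):
            chosen = chosen | {x}
    return chosen

MIN_BASIS = greedy_min_basis(BASES)
MAX_DUAL_BASIS = GROUND - MIN_BASIS
print(sorted(MIN_BASIS), sorted(MAX_DUAL_BASIS))
```

With these weights the greedy algorithm returns {2, 4, 6} of weight 6, and its complement {1, 3, 5, 7} has weight 24, the largest weight of any basis of the dual. The reason is simple: the two weights always add up to the total weight of the ground set.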
With a little practice, you will now see matroids everywhere you look!
15 Coin Turning Games and Mock Turtles
This section is in the file mockturtles.tex.
We introduced combinatorial games in Section 2, where we learned about P-positions,
N-positions, the Grundy function, and the special place that the game of Nim holds in the world of
combinatorial games. In particular, the Sprague-Grundy theory states that every position
X in a finite combinatorial impartial two-player game is equivalent to a set of heaps of
beans in the game of Nim. Furthermore, the Grundy number of a sum X1 ⊕ X2 ⊕ · · · ⊕ Xn
of such games is equal to the nim-sum of the Grundy numbers of the individual games.
Now, it happens that Nim-multiplication also plays a part in evaluating positions in a wide
range of games known as product games. But let us begin with some games devised by H.
W. Lenstra [106] called coin-turning games.
One of the simplest coin-turning games is called Turning Turtles, in which the layout is
a row of coins, some showing heads and the rest showing tails. (An apocryphal story has
it that the original game was played with turtles, but it was considered more humane to
use coins. Besides, a right-side-up turtle had the unfortunate tendency to walk away, thus
spoiling the layout. And how can you calculate the Grundy function of a moving turtle?)
Play consists of turning over some coins from heads to tails or from tails to heads. You
move by turning over one of the coins from heads to tails and, if desired, turning over one
other coin to the left of it. The latter coin may go either from heads to tails or from tails
to heads. The game is over when all coins show tails, and the last player wins. Here is an
example with a sequence of n = 11 coins:
T  T  H  H  T  H  T  H  T  H   T
1  2  3  4  5  6  7  8  9  10  11
You are suspicious that this is a disguised version of Nim. Only coins 3, 4, 6, 8, and
10 are heads, and 3 ⊕ 4 ⊕ 6 = 1. If you turn coin 10 to a tail and coin 9 to a head, the
nim-sum of the new heads is (3 ⊕ 4 ⊕ 6) ⊕ (8 ⊕ 9) = 1 ⊕ 1 = 0. This move converts the
opening position into a P-position, and your suspicions have been confirmed. Thus, we
view a Turning Turtles position as Nim with heap sizes equal to the heads positions, and
this converts a game of Turning Turtles into a game of Nim.
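The nim-sum bookkeeping for Turning Turtles is easy to automate. The sketch below encodes the 11-coin layout shown above as a string and lists every move that leaves nim-sum 0; the function names are ours. A move is reported as a pair (n, k): turn coin n from heads to tails and also turn over coin k to its left, with k = None when only coin n is turned.

```python
from functools import reduce
from operator import xor

LAYOUT = "TTHHTHTHTHT"  # coins 1..11 of the example, left to right

def heads(layout):
    """Positions (numbered from 1) of the coins showing heads."""
    return [i for i, c in enumerate(layout, start=1) if c == "H"]

def nim_sum(values):
    return reduce(xor, values, 0)

def winning_moves(layout):
    """Moves (n, k) that leave a position with nim-sum 0."""
    total = nim_sum(heads(layout))
    moves = []
    for n in heads(layout):
        k = n ^ total          # the size heap n must shrink to
        if k < n:
            moves.append((n, k if k > 0 else None))
    return moves

print(nim_sum(heads(LAYOUT)), winning_moves(LAYOUT))
```

For this layout the nim-sum is 3, and the winning moves include the one described above (turn coin 10 to tails and coin 9 to heads), together with two alternatives: turning coin 3 alone, or turning coins 6 and 5.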
More generally, Nim moves become coin turns as follows. We reduce a heap of size n to a size
k not already present by turning the coin at position n from heads to tails and turning
the coin at position k from tails to heads, where k < n. If a heap of size k is already
present, turning both coins to tails has the same effect, since two equal heaps cancel. We
remove a heap by merely turning the appropriate head to a tail. If you number the coins
1, 2, 3, . . . from the left, then
for the nth coin, g(n) = n or 0 according as the nth coin is a head or a tail. The Grundy
number of a given position equals the nim-sum of the Grundy numbers of the heads. For
example, for an 11-coin layout with heads in positions 3, 4, 6, 7, and 10, we see that
g = g(3) ⊕ g(4) ⊕ g(6) ⊕ g(7) ⊕ g(10) = 3 ⊕ 4 ⊕ 6 ⊕ 7 ⊕ 10 = 12.
The correct move is to turn coins 6 and 10 from heads to tails, leaving a head in positions
3, 4, and 7. Since 3 ⊕ 4 ⊕ 7 = 0, this is a losing position – a P-position – in nim, and hence
a P-position in this game of Turning Turtles.
A variation of Turning Turtles is Mock Turtles, in which you are allowed to turn one,
two, or three coins over, provided the rightmost one turned goes from heads to tails. The
winner is the one who leaves all tails. For technical reasons, the coins are numbered from
left to right beginning with 0, not 1 – see Lenstra or Winning Ways for details. The Grundy
values of the turtles that are rightside up are as follows:
n     0  1  2  3  4  5   6   7   8   9   10  11  12  13  14  15  16  · · ·
g(n)  1  2  4  7  8  11  13  14  16  19  21  22  25  26  28  31  32  · · ·
These numbers are the odious numbers, i.e., those whose binary representations contain
an odd number of ones. In other words, write n in binary and append a check digit 0 or 1
to make the digit sum odd.
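The check-digit recipe is a one-liner in code. The following sketch (the function name `mock_turtles_grundy` is ours) reproduces the table of values above for n = 0, ..., 16.

```python
def mock_turtles_grundy(n):
    """Write n in binary and append the check digit that makes the
    total number of ones odd."""
    parity = bin(n).count("1") % 2
    check_digit = 1 - parity
    return 2 * n + check_digit

TABLE = [mock_turtles_grundy(n) for n in range(17)]
print(TABLE)
```

Appending a binary digit doubles the number and adds the digit, which is why the values are 2n or 2n + 1.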
Turning Corners is a two-dimensional version of Turning Turtles, played on a large
rectangular array of coins, each of which is either a head or a tail. Players alternate by
turning over four distinct coins that are at the corners of a rectangle, provided the coin
(x, y) that is farthest from (0, 0) is a head. It turns out that the Grundy number of the
position (x, y) is the nim product t(x) ⊗ t(y), where t(x) is the Grundy number of the position
x in Mock Turtles. This game was devised by H. W. Lenstra as an example of a game
whose Grundy numbers are calculated using the nim product. Here is a sample game, with
numbering done as for Wythoff’s game in Section 2, Figure 4:
    0  1  2  3  4  5  6  7  8  9
0   T  T  T  T  T  T  T  T  T  T
1   T  T  T  H  T  T  H  T  T  T
2   H  T  T  T  T  T  T  T  H  T
3   T  T  T  T  T  T  T  T  H  T
4   T  T  T  T  H  T  T  T  T  T
5   T  T  H  T  T  T  T  H  T  T
6   T  T  T  T  T  T  T  T  T  H
Figure 65: Turning Corners, a game with a nim product Grundy function
From only the information at hand, determine whether the position in Figure 65 is a
P-position (i.e., a Previous-player-winning position) or an N-position (i.e., a Next-player-winning position).
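To evaluate positions like this one you need nim products. A minimal sketch follows, using the standard recursive definition of nim multiplication: a ⊗ b is the least non-negative integer not of the form (a′ ⊗ b) ⊕ (a ⊗ b′) ⊕ (a′ ⊗ b′) with a′ < a and b′ < b. It is meant for small arguments only and does not settle the exercise, which also depends on how the rows and columns of the figure are numbered.

```python
from functools import lru_cache

def mex(values):
    """Least non-negative integer not in the given set."""
    m = 0
    while m in values:
        m += 1
    return m

@lru_cache(maxsize=None)
def nim_product(a, b):
    """Nim product computed directly from the recursive (mex) definition."""
    options = {nim_product(x, b) ^ nim_product(a, y) ^ nim_product(x, y)
               for x in range(a) for y in range(b)}
    return mex(options)

print([[nim_product(a, b) for b in range(4)] for a in range(4)])
```

The Grundy number of a whole position is then the nim-sum of the nim products belonging to the coins that show heads.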
Mock Turtles may be played with any number of coins. However, for the eight-coin
game, the codewords of the Hamming code H8 will give you the P-positions.
If each player is allowed to turn up to five coins at each move, then we have the game
of Moebius, named because when played on a row of eighteen coins, the P-positions can
be obtained from each other by Moebius transformations modulo 17:
x → (ax + b)/(cx + d) with ad − bc ≡ 1 (mod 17)
acting on the following labels for the 18 coins:
∞ 1 4 0 −4 −1 5 6 −8 2 −3 −5 8 3 −7 7 −6 −2.
Correspondingly, if the move is to turn up to seven coins, we have the game of Mogul,
whose P-positions, when played on 24 coins, manifest the extended Golay code, obtained
by appending a check digit to the quadratic residue code (mod 23).
Investigating Moebius transformations opens up a connection between Mock Turtles and
projective geometry, namely the symmetries of the projective line Fq ∪ {∞} = {0, 1, . . . , q − 1, ∞}
over the finite field Fq.
There is also the following connection between this game, error-correcting codes, block
designs, and difference sets.
It is a fact that if p = 4n + 3 is a prime, then the nonzero squares (the quadratic
residues) modulo p form a (4n + 3, 2n + 1, n) difference set. The incidence matrix of the
corresponding symmetric block design is a 0–1 matrix whose rows span an error-correcting code called a quadratic residue code. It happens that the P-positions for the
game of Moebius form the words of a quadratic residue code. Finally, there is a connection
between all of these things and the (11, 5, 2) block design. These remarkable relationships
between combinatorial games and codes inspired Conway and Sloane to discover lexicodes.
Finally, the names of these games come from the fact that the third letters in moCk
turtles, moEbius and moGul respectively are the 3rd, 5th, and 7th letters of the alphabet.
The automorphism group of the extended Golay code is the Mathieu group M24, and the code is intimately related to the
24-dimensional Leech lattice. But that is another story, and we will tell that story as
part of our exploration of S(5, 8, 24), the Miracle Octad Generator, and other wonders!
Figure 66: A Fascinating Picture.
16 The Fabulous (11, 5, 2) Biplane
“How Do You Make Math Exciting For Students?”
After a workshop for new graduate assistants on innovations in teaching, a new sociology
grad student wandered into EB’s office and asked the question, “Tell me . . . how do you
make math exciting for students?” By chance, the office’s computer screen contained
a picture that exhibits some of the symmetries of one of the most intriguing objects in
mathematics: the (11, 5, 2) biplane. The student learned that a similar picture graces the
cover of an excellent book [86] on combinatorial designs.
The picture was lovely, but it wasn’t labeled.
As we saw in Section 8, one of the most pleasing aspects of the (7, 3, 1) block design is
its many geometric representations that reveal its symmetries. It was fun finding a labeling
compatible with symmetries of the biplane. It was more fun digging out generators for the
symmetry group of the biplane, which turns out to be PSL(2, 11). The best part, however,
was learning about the exact connection between the biplane and six pairs of mathematical
twins.
We find these six sets of mathematical twins just outside the boundaries of many traditional
courses, where a bit of exploration can lead the curious to all manner of interesting
mathematics. A good course in coding theory will mention two pairs of perfect error-correcting codes, namely the Golay codes {G11, G12} and {G23, G24}, but sometimes
only in passing. Look past the usual topics in combinatorics into the world of combinatorial designs and you will meet two pairs of Steiner systems, namely {S(4, 5, 11), S(5, 6, 12)}
and {S(4, 7, 23), S(5, 8, 24)}. Beyond the first course in group theory lie two pairs of finite
simple groups, namely the Mathieu groups {M11 , M12 } and {M23 , M24 }.
Finding out about both the origin of the biplane’s name and just how these twins
connect with the biplane and with each other was quite a revelation. The student found
the story intriguing – and maybe you will, too.
Difference Sets, Block Designs, and Biplanes
The (11, 5, 2) biplane is a collection of eleven 5-element subsets of {1, 2, 3, 4, 5, 6, 7, 8, 9, X, 0}
as follows (we think of X as 10, and we have written abcde for the set {a, b, c, d, e}):
B1 = 13459
B2 = 2456X
B3 = 35670
B4 = 46781
B5 = 57892
B6 = 689X3
B7 = 79X04
B8 = 8X015
B9 = 90126
BX = X1237
B0 = 02348
The (11, 5, 2) Biplane.
As we have seen in previous sections, this is an example of a block design, which is
an arrangement of v objects called varieties into b sets called blocks. Furthermore, each
variety appears in r blocks and each block contains k varieties. Finally, each pair of varieties
appears together in λ blocks. From the above, we see that b = v = 11 and k = 5. It is a
bit less obvious that r = 5 and still less obvious that λ = 2 — for example, 1 appears in
blocks B1 , B4 , B8 , B9 and BX , and 7 and 0 appear together in blocks B3 and B7 .
Block designs first appeared in the 1930’s in connection with the design of certain agricultural experiments, although they are implicit in two hard-to-find papers by Woolhouse
[178] and Kirkman [89] as early as 1844 and 1847, respectively. The parameters b, v, r, k
and λ are not independent: it happens that bk = vr and r(k − 1) = λ(v − 1) (prove it!).
Thus, if b = v, then r = k (prove it!) and we speak of a (v, k, λ) symmetric design. Hence,
the (11, 5, 2) biplane is an (11, 5, 2) symmetric design, which explains the origin of the first
part of its name. (We’ll get to the “biplane” part later.)
A closer look reveals that we may construct the entire (11, 5, 2) biplane from B1 by
adding a particular integer mod 11 to each element; for example, if we add 5 to each
element of B1 and reduce the results mod 11, we find that
{1 + 5, 3 + 5, 4 + 5, 5 + 5, 9 + 5} ≡ {6, 8, 9, X, 3} ≡ B6 mod 11.
Now, B1 is an example of a difference set; that is, every nonzero integer mod 11 appears
exactly twice among the 20 differences i − j mod 11 for i and j distinct elements of B1 (in
the following, a ≡ b is short for a ≡ b mod 11):
  1 ≡ 4 − 3 ≡ 5 − 4
  2 ≡ 3 − 1 ≡ 5 − 3
  3 ≡ 4 − 1 ≡ 1 − 9
  4 ≡ 5 − 1 ≡ 9 − 5
  5 ≡ 9 − 4 ≡ 3 − 9
  6 ≡ 9 − 3 ≡ 4 − 9
  7 ≡ 1 − 5 ≡ 5 − 9
  8 ≡ 9 − 1 ≡ 1 − 4
  9 ≡ 1 − 3 ≡ 3 − 5
 10 ≡ 3 − 4 ≡ 4 − 5.
More generally, a (v, k, λ) difference set is a k−element subset S of V = {0, 1, . . . , v − 1}
such that every nonzero integer mod v can be written in exactly λ ways as a difference of
elements of S. Thus, {1, 3, 4, 5, 9} is an (11, 5, 2) difference set. Notice that 1, 3, 4, 5, and 9
are the five nonzero perfect squares mod 11.
It is not true that for every v, k and λ there is a (v, k, λ) difference set, but difference
sets are not that hard to come by. In fact, for every prime p ≡ 3 mod 4, the set Qp of
nonzero perfect squares mod p is a (p, (p − 1)/2, (p − 3)/4) difference set. For example, you
can check that Q23 = {1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18} is a (23, 11, 5) difference set. (Exercise:
Find the five different ways to write 7 as a difference of elements of Q23 .)
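The claims about Q11 and Q23 are easy to test by brute force. The sketch below (helper names are ours) computes the nonzero squares mod p, counts how often each nonzero residue occurs as a difference, and collects the representations of 7 asked for in the exercise.

```python
from collections import Counter


def quadratic_residues(p):
    """Nonzero perfect squares mod p, sorted."""
    return sorted({(x * x) % p for x in range(1, p)})


def difference_counts(S, v):
    """How many ordered pairs (i, j) of distinct elements of S give each i - j mod v."""
    return Counter((i - j) % v for i in S for j in S if i != j)


def is_difference_set(S, v, lam):
    """True if every nonzero residue mod v arises exactly lam times as a difference."""
    counts = difference_counts(S, v)
    return all(counts[d] == lam for d in range(1, v))


Q11 = quadratic_residues(11)
Q23 = quadratic_residues(23)
print(Q11, is_difference_set(Q11, 11, 2))
print(Q23, is_difference_set(Q23, 23, 5))

# Ordered pairs (i, j) in Q23 with i - j = 7 mod 23 (the exercise in the text).
ways_to_get_7 = [(i, j) for i in Q23 for j in Q23 if (i - j) % 23 == 7]
```

Trying a prime p ≡ 1 mod 4, such as 13, shows why the congruence condition matters: the squares mod 13 do not form a difference set for any λ.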
What is interesting here is that every difference set gives rise to a symmetric design in
the following way.
Theorem. Let D = {x1 , x2 , . . . , xk } be a (v, k, λ) difference set. Let Di := {x1 +
i, . . . , xk + i} where addition is mod v. Then the v sets D0 , . . . , Dv−1 are the blocks of a
(v, k, λ) symmetric design.
(For a proof, see [73], Theorem 11.1.1.) Thus, the (11, 5, 2) difference set gives rise to
the (11, 5, 2) symmetric design.
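The construction in the theorem can be sketched in a few lines: translate the difference set through all residues mod v, then count block sizes, replications and pair coverage. The names develop and design_parameters are illustrative only.

```python
from itertools import combinations


def develop(D, v):
    """Blocks D + i (mod v) for i = 0, ..., v - 1."""
    return [frozenset((x + i) % v for x in D) for i in range(v)]


def design_parameters(blocks, v):
    """Return b together with the sets of values taken by k, r and lambda."""
    b = len(blocks)
    ks = {len(B) for B in blocks}
    rs = {sum(x in B for B in blocks) for x in range(v)}
    lams = {sum(x in B and y in B for B in blocks)
            for x, y in combinations(range(v), 2)}
    return b, ks, rs, lams


blocks = develop({1, 3, 4, 5, 9}, 11)   # blocks[0] is B1, blocks[5] is B6, ...
b, ks, rs, lams = design_parameters(blocks, 11)
print(b, ks, rs, lams)
```

Each of ks, rs and lams should come out as a one-element set, which is what it means for the translates to form a design; here they give k = 5, r = 5 and λ = 2, consistent with bk = vr and r(k − 1) = λ(v − 1).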
Symmetric designs with λ = 1 have the property that every pair of varieties determines
a unique block and every pair of blocks intersects in a unique variety. Reading “line” for
“block” and “point” for “variety” gives us the first two axioms of projective geometry; for
this reason, (v, k, 1) designs are called finite projective planes, or planes for short. Now
for a (v, k, 2) design, every pair of varieties determines exactly two blocks and every pair
of blocks intersects in exactly two varieties. For this reason, the blocks and varieties of
a (v, k, 2) design are called lines and points, respectively, and the designs themselves are
called biplanes—and that explains the second part of the (11, 5, 2) biplane’s name.
As stated earlier, part of my fascination with the (11, 5, 2) biplane lies both in its symmetries and in the challenge of drawing a picture that will reveal some of its symmetries.
(We introduced automorphisms of block designs in Section 8, but it doesn’t hurt to refamiliarize ourselves with the topic.) By a symmetry of a design, we mean a permutation of
the varieties that simultaneously permutes the blocks. For any design, the set of all such
permutations is a group called the automorphism group of the design. So, first we’ll talk
about automorphism groups, and then we’ll draw some pictures.
The Automorphism Group of the Biplane
A permutation on a set Y is a mapping of the set to itself that is one-to-one and onto.
The cycle notation is a standard way to describe permutations on finite sets; here is an
example to show how it works. Writing f = (1 3 6)(4 5) means that f (1) = 3, f (3) =
6, f (6) = 1, f (4) = 5, f (5) = 4, and f (x) = x for all x ∉ {1, 3, 4, 5, 6}. In this example, we
say that f is a product of two disjoint cycles. Similarly, g = (1 2) means that g switches 1
and 2 and leaves everything else fixed. Since permutations are functions, they compose like
functions—i.e., from right to left. If we denote composition by ◦ — i.e., F ◦G(x) = F (G(x))
— then f ◦ g = (1 3 6)(4 5)(1 2). This maps 1 to 2, 2 to 3 (since g(2) = 1 and f (1) = 3),
3 to 6, 4 to 5, 5 to 4, and 6 to 1. We see that f ◦ g = (1 2 3 6)(4 5) as a product of disjoint
cycles.
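The right-to-left convention is easy to get wrong, so here is a small sketch that composes the two permutations from the example and converts the result back to disjoint cycles. Permutations are stored as dictionaries; the helper names are ours.

```python
def perm_from_cycles(cycles, points):
    """Dictionary x -> image of x for a product of disjoint cycles."""
    p = {x: x for x in points}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def compose(F, G):
    """Right-to-left composition: (F o G)(x) = F(G(x))."""
    return {x: F[G[x]] for x in G}


def to_cycles(p):
    """Disjoint cycles of p, omitting fixed points, each starting at its least element."""
    seen, cycles = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = p[x]
        if len(cyc) > 1:
            cycles.append(tuple(cyc))
    return cycles


points = range(1, 7)
f = perm_from_cycles([(1, 3, 6), (4, 5)], points)
g = perm_from_cycles([(1, 2)], points)
print(to_cycles(compose(f, g)))   # f o g
print(to_cycles(compose(g, f)))   # g o f, for comparison
```

Comparing the two printed results shows that the order of composition matters: f ◦ g and g ◦ f are different permutations.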
Let D be a block design. An automorphism of D is a permutation f of the set V of
varieties that is simultaneously a permutation of the set B of blocks. (We say that f induces
a permutation on B.) For example, the permutation τ = (1 2 3 4 5 6 7 8 9 X 0) of the
set V = {1, 2, 3, 4, 5, 6, 7, 8, 9, X, 0} of varieties induces the permutation
τ ′ = (B1 B2 B3 B4 B5 B6 B7 B8 B9 BX B0 )
of the corresponding set of blocks. The set of all such automorphisms is a group under
composition, called the automorphism group Aut(D) of the design D.
It turns out that there are 660 automorphisms of the (11, 5, 2) biplane. How do we find
them all?
In some sense, the automorphism τ is an obvious choice, for the blocks of the biplane were created by repeatedly adding 1 (mod 11) to each member of the difference set
B1 = {1, 3, 4, 5, 9}. Less obviously, the permutation µ = (1 3 9 5 4)(2 6 7 X 8), which multiplies each variety by 3 (mod 11), also induces a permutation of the blocks, namely µ′ = (B2 B4 BX B6 B5)(B0 B9 B3 B7 B8).
So, how do we find the rest of the automorphisms?
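Whether a given permutation of the varieties is an automorphism can be checked mechanically: apply it to every block and see whether the image is again a block. The sketch below does this for τ and µ and reports the induced map on block labels; the helper names and the label list are our own bookkeeping, with X stored as 10.

```python
X = 10
B1 = frozenset({1, 3, 4, 5, 9})
LABELS = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "X", "0"]
# Block with label LABELS[c] is B1 + c (mod 11): B1, B2, ..., B9, BX, B0.
BLOCKS = {LABELS[c]: frozenset((x + c) % 11 for x in B1) for c in range(11)}
NAME_OF = {pts: name for name, pts in BLOCKS.items()}


def perm_from_cycles(cycles, n=11):
    """List p with p[x] = image of x."""
    p = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def induced(p):
    """Map block label -> label of its image, or None if p is not an automorphism."""
    out = {}
    for name, pts in BLOCKS.items():
        img = frozenset(p[x] for x in pts)
        if img not in NAME_OF:
            return None
        out[name] = NAME_OF[img]
    return out


tau = perm_from_cycles([(1, 2, 3, 4, 5, 6, 7, 8, 9, X, 0)])
mu = perm_from_cycles([(1, 3, 9, 5, 4), (2, 6, 7, X, 8)])
swap = perm_from_cycles([(1, 2)])   # a permutation that is NOT an automorphism

print(induced(tau))
print(induced(mu))
print(induced(swap))
```

The transposition (1 2) is included only as a contrast: it sends B1 to a 5-set that is not a block, so induced returns None for it.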
What we do involves a four-fold application of one of the most useful and elegant
theorems in all of group theory, namely the Orbit-Stabilizer Theorem. But first, we need
a couple of definitions. Suppose that G is a group of permutations on the set S, and let
x ∈ S. Then the stabilizer StabG (x) of x in G is the set of all permutations in G that leave
x fixed, and the orbit OrbG (x) of x is the set of all y ∈ S for which y = g(x) for some
permutation g ∈ G. Let |X| be the order of X (if X is a group) or the cardinality of X (if
X is a set). Here is the result:
Theorem. (The Orbit-Stabilizer Theorem) Let G be a finite group of permutations
of a set S and let x ∈ S. Then (a) StabG (x) is a subgroup of G, and (b) |G| = |StabG (x)| ·
|OrbG (x)|.
Proof: (a) Since the identity permutation fixes x, StabG(x) is nonempty. If α, β ∈
StabG(x), then α⁻¹(x) = x, and so α⁻¹ ◦ β(x) = α⁻¹(β(x)) = α⁻¹(x) = x. Thus,
α⁻¹ ◦ β ∈ StabG(x), and it follows that StabG(x) is a subgroup of G.
As for (b), we can show that |OrbG(x)| = |G/StabG(x)|, the number of cosets of
StabG(x) in G. For, elements γ and δ of G are in the same coset if and only if γ⁻¹(δ(x)) = x,
or equivalently, if and only if γ(x) = δ(x). Thus, the number of cosets of StabG(x) in G
is equal to the number of distinct images γ(x), for γ ∈ G. But the latter is just the size
of the orbit of x, and we conclude that |OrbG(x)| = |G/StabG(x)|. Finally, by the proof of
Lagrange’s Theorem, |G| = |StabG(x)| · |G/StabG(x)| = |StabG(x)| · |OrbG(x)|.
We now define the groups G, H, K and L as follows:
G = Aut((11, 5, 2));
H = StabG(B1) = {automorphisms in G that leave B1 set-wise fixed};
K = StabH(1) = {automorphisms in H that leave 1 fixed};                  (19)
L = StabK(3) = {automorphisms in K that leave 3 fixed}.
By the Orbit-Stabilizer Theorem, L, K and H are subgroups of K, H and G, respectively,
and since 4 ∈ B1, we see that
|G| = |H| · |OrbG(B1)| = |K| · |OrbH(1)| · |OrbG(B1)|
    = |L| · |OrbK(3)| · |OrbH(1)| · |OrbG(B1)|                            (20)
    = |StabL(4)| · |OrbL(4)| · |OrbK(3)| · |OrbH(1)| · |OrbG(B1)|.
If we can show that |StabL(4)| = 1, |OrbL(4)| = 3, |OrbK(3)| = 4, |OrbH(1)| = 5, and
|OrbG(B1)| = 11, it will follow that |G| = 1 · 3 · 4 · 5 · 11 = 660. Let’s call it a theorem.
Theorem. Let G, H, K and L be as defined above. (a) If σ ∈ H and σ fixes 1, 3 and
4, then σ = I, the identity map, and |StabL(4)| = 1. (b) |OrbL(4)| = 3, |OrbK(3)| = 4,
|OrbH (1)| = 5, and |OrbG (B1 )| = 11. (c) |G| = 660.
Proof: (a) Since σ ∈ H, σ fixes B1 setwise. Since B4 = 46781, BX = X1237 and B0 =
02348 are the only other blocks containing the pairs {1, 4}, {1, 3} and {3, 4} respectively, it
follows that σ fixes B4 , BX and B0 . Thus, σ fixes the subsets {6, 7, 8}, {X, 2, 7} and {0, 2, 8}
of B4 , BX and B0 respectively. The only way this can happen is if σ fixes the elements 2, 7
and 8. As a consequence, σ also fixes 6, X and 0, and hence σ fixes B3 = 35670. It follows
that σ fixes 5. Finally, since σ fixes B1 , it must also fix 9, and we conclude that σ = I,
and so |StabL (4)| = 1.
(b) Now, L = StabK(3) contains I, α = (4 5 9)(2 7 X)(0 6 8) and α⁻¹, so that
OrbL(4) = {4, 5, 9}. It follows that |OrbL(4)| = 3. Since K = StabH(1) contains I,
β = (3 4)(5 9)(2 8)(6 X), γ = (3 5)(4 9)(2 8)(7 0), and β ◦ γ, it follows that OrbK(3) =
{3, 4, 5, 9} and so |OrbK(3)| = 4. Next, H = StabG(B1) contains the powers of µ =
(1 3 9 5 4)(2 6 7 X 8). It follows that OrbH(1) = {1, 3, 4, 5, 9}, and so |OrbH(1)| = 5.
Finally, G contains the powers of τ = (1 2 3 4 5 6 7 8 9 X 0); the kth power of the
induced permutation τ′ sends B1 to the block B1 + k for each k. Hence, OrbG(B1) contains all eleven
blocks, and we conclude that |OrbG (B1 )| = 11.
(c) We now put the pieces together. By the Orbit-Stabilizer Theorem and Equation
(20), we see that
|G| = |StabL(4)| · |OrbL(4)| · |OrbK(3)| · |OrbH(1)| · |OrbG(B1)|
    = 1 · 3 · 4 · 5 · 11 = 660,
as claimed.
Finally, we know that the group G contains the automorphisms α and τ, so that P =
⟨α, τ⟩ is a subgroup of G. A short calculation with a nearby computer algebra system
reveals that the subgroup P also has order 660. Therefore, G = Aut((11, 5, 2)) = ⟨α, τ⟩,
and so G is generated by the two automorphisms α = (4 5 9)(2 7 X)(0 6 8) and τ =
(1 2 3 4 5 6 7 8 9 X 0). It turns out that G is commonly known as PSL(2, 11), the
group of 2 × 2 unimodular matrices over an 11-element field, with I and −I identified.
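The "short calculation" does not need a computer algebra system; a brute-force closure is enough for a group this small. The sketch below (our own helper names, with X stored as 10) generates ⟨α, τ⟩ by repeated composition, checks that every element maps blocks to blocks, and then walks down the stabilizer chain G ⊇ H ⊇ K ⊇ L used in the proof. It works inside ⟨α, τ⟩, so it relies on the text's statement that this subgroup is all of G.

```python
X = 10
B1 = frozenset({1, 3, 4, 5, 9})
BLOCKS = {frozenset((x + c) % 11 for x in B1) for c in range(11)}


def perm_from_cycles(cycles, n=11):
    """Tuple p with p[x] = image of x."""
    p = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return tuple(p)


def compose(f, g):
    """Right-to-left composition: (f o g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(len(g)))


def generate(gens):
    """All products of the generators, found breadth-first."""
    identity = tuple(range(len(gens[0])))
    seen, frontier = {identity}, [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(g, p)
                if q not in seen:
                    seen.add(q)
                    new.append(q)
        frontier = new
    return seen


def image(p, block):
    return frozenset(p[x] for x in block)


alpha = perm_from_cycles([(4, 5, 9), (2, 7, X), (0, 6, 8)])
tau = perm_from_cycles([(1, 2, 3, 4, 5, 6, 7, 8, 9, X, 0)])

P = generate([alpha, tau])
all_automorphisms = all({image(p, B) for B in BLOCKS} == BLOCKS for p in P)

orbit_B1 = {image(p, B1) for p in P}          # orbit of B1 under G
H = [p for p in P if image(p, B1) == B1]      # stabilizer of B1
orbit_H_1 = {p[1] for p in H}
K = [p for p in H if p[1] == 1]               # ... that also fix 1
orbit_K_3 = {p[3] for p in K}
L = [p for p in K if p[3] == 3]               # ... that also fix 3
orbit_L_4 = {p[4] for p in L}
stab_L_4 = [p for p in L if p[4] == 4]

print(len(P), len(orbit_B1), len(orbit_H_1), len(orbit_K_3),
      len(orbit_L_4), len(stab_L_4))
```

The product of the last five printed numbers should equal the first, which is the Orbit-Stabilizer Theorem applied four times.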
With so much symmetry, there ought to be a picture that tells us something about the
(11, 5, 2) biplane — and Figure 66 is where we came in. So let’s look at some pictures.
Symmetries of the Biplane as Revealed in Pictures
“Draw a figure.” So said that master problem-solver and teacher, George Pólya, in his
classic “How To Solve It” [123]. We learn so much from figures, so we follow Pólya’s lead
and return to the picture in Figure 66. As we mentioned earlier, the context suggested
that it was a picture of the (11, 5, 2) biplane. It is clear that Figure 66 is a dressed-up regular pentagon. As such, it is set-wise fixed by both a 1/5-turn about the center
and reflections about lines through the center. The challenge was to label the figure so
that these geometric motions corresponded to symmetries of the (11, 5, 2) biplane, and my
efforts were eventually rewarded. In Figure 67, the clockwise 1/5-turn about the point 0
and the reflection about the line through 0 and 7 correspond to the automorphisms µ and
ρ, respectively, where
µ = (1 3 9 5 4)(2 6 7 X 8) and ρ = (2 8)(3 4)(5 9)(6 X).
Let us now see just how the figure depicts these automorphisms.
First, consider µ = (1 3 9 5 4)(2 6 7 X 8). As mentioned above, µ induces the permutation µ′ = (B2 B4 BX B6 B5 )(B0 B9 B3 B7 B8 ) on the blocks of the biplane.
Now, look at Figure 67. The exterior pentagon joins the five points labeled 1, 3, 9, 4
and 5. This is the block B1 , which is mapped into itself by a 1/5-turn about 0. Next,
the dotted lines connect the five points labeled 4, 8, 0, 2 and 3. This is just the block
B0 = 02348, and if we rotate the figure about 0 by a 1/5-turn, we see that B0 is mapped
into B9 = {1, 2, 0, 6, 9}, B9 into B3 = {3, 6, 0, 7, 5}, B3 into B7 = {9, 7, 0, X, 4}, B7 into
B8 = {5, X, 0, 8, 1}, and B8 into B0 . Finally, the bold lines connect the five points labeled
2, 9, 7, 5 and 8. This is the block B5 = 57892, and if we rotate the figure about 0 by a
1/5-turn, we see that B5 is mapped into B2 = {6, 5, X, 4, 2}, B2 into B4 = {7, 4, 8, 1, 6},
B4 into BX = {X, 1, 2, 3, 7}, BX into B6 = {8, 3, 6, 9, X}, and B6 into B5 .
We have shown that the 1/5-turn about 0 induces the permutation (B2 B4 BX B6 B5 )
(B0 B9 B3 B7 B8 ) on the blocks of the biplane. We omit B1 because it maps into itself.
But this permutation is exactly µ′ !
Figure 67: The Fabulous (11, 5, 2) Biplane
Are there ways to draw the (11, 5, 2) biplane that exhibit symmetries other than
µ and others of order 5? How about α = (4 5 9)(2 7 X)(0 6 8), which has order 3? Find
one!
We are almost ready to talk about the mathematical twins connected to the (11, 5, 2)
biplane. The most direct path to the twins leads through a certain matrix associated with
the (11, 5, 2) biplane, so let’s talk about incidence matrices.
Incidence Matrices
One way to describe a block design is by its incidence matrix, a b × v matrix whose (i, j)th
entry is 1 or 0 according as the ith block does or does not contain the jth variety. Here
is the incidence matrix M for the (11, 5, 2) symmetric design. The rows correspond to
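the blocks in the above order, and the columns correspond to the varieties in the order
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, X:
\[
M = \left(\begin{array}{ccccccccccc}
0 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0
\end{array}\right).
\]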
As we shall soon see, the above matrix M is instrumental in constructing the Golay
code twins {G11 , G12 } and {G23 , G24 }.
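The incidence matrix can be rebuilt from the difference set in a couple of lines, which also gives a quick check on the design: every row should contain five 1s, and any two distinct rows should share exactly two 1s. A minimal sketch (variable names are ours, rows ordered B1, B2, ..., B9, BX, B0 and columns 0, 1, ..., X):

```python
B1 = [1, 3, 4, 5, 9]
blocks = [{(x + c) % 11 for x in B1} for c in range(11)]   # B1, B2, ..., BX, B0

# M[i][j] = 1 exactly when the i-th block contains variety j.
M = [[1 if j in B else 0 for j in range(11)] for B in blocks]

# Gram matrix M * M^T: entry (i, j) counts the varieties common to blocks i and j.
MMt = [[sum(M[i][k] * M[j][k] for k in range(11)) for j in range(11)]
       for i in range(11)]

for row in M:
    print(" ".join(map(str, row)))
```

In matrix terms the two checks say that M Mᵀ has 5 on the diagonal and 2 everywhere else.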
Error–Correcting Codes
We gave an introduction to the theory of error–correcting codes in Section 10, and we
recall some of the ideas in that section. The minimum distance of a code is the smallest
distance between its code words; this minimum distance determines the code’s error detection and correction features. (Exercise: Show that a code with minimum distance 5 will
detect up to 4 errors and correct up to 2. You can then show that a code with minimum
distance d will detect up to d − 1 errors and correct up to ⌊(d − 1)/2⌋ errors.) For an (n, k)
code to be efficient, the ratio k/n should be as large as possible, consistent with its error
detection and correction capabilities. Maximum efficiency in an (n, k) m-error-correcting
code occurs when it can correct up to m errors, and no others. Such a code is called perfect.
Here is a very nice necessary condition—which we can verify—for the existence of a perfect
code.
Theorem. If there exists a q-ary (n, k) perfect m-error-correcting code, then
\[ 1 + (q-1)n + (q-1)^2\binom{n}{2} + \cdots + (q-1)^m\binom{n}{m} = q^r \]
for some positive integer r, and k = n − r.
Proof. A code word of length n can have a single error occur in n positions, two errors
in \(\binom{n}{2}\) pairs of positions, and in general m errors in \(\binom{n}{m}\) ways. For a q-ary code, there are
q − 1 ways for a single error to occur at a given position, (q − 1)^2 ways for two errors to
occur at two given positions, and in general (q − 1)^m ways for m errors to happen at m
given positions. Thus, the total number of ways in which no more than m errors can occur
relative to a given code word is equal to
\[ 1 + (q-1)n + (q-1)^2\binom{n}{2} + \cdots + (q-1)^m\binom{n}{m}. \tag{21} \]
The set of all n-long q-ary strings differing from a given code word W in at most m
positions is called the sphere of radius m about W.
If a code is perfect, then every n-string lies in a sphere of radius m about some code
word, and the spheres do not overlap. That is, the spheres partition the
entire space of n-tuples. Since the latter is of size q^n, it follows that
(number of m-spheres) · (size of each m-sphere) = q^n.
Thus, the size of an m-sphere must be a power of q, say q^r, and the equation of
the theorem is satisfied. Finally, every m-sphere is centered about a code word, so the
number of code words is q^n / q^r = q^(n−r). Since an (n, k) code has q^k code words,
it follows that k = n − r, and we are done.
Now, 11 happens to be the smallest prime number p for which 2^p − 1 is not a prime. For
p = 2, 3, 5 and 7, we obtain the primes 2^p − 1 = 3, 7, 31 and 127, and 2^11 − 1 = 2047 = 23 · 89
is composite. But there is ample recompense for the failure of 2^11 − 1 to be prime; let’s take a
closer look:
\[
2^{11} = 1 + 23\cdot 89 = 1 + 23\,(1 + 11 + 11\cdot 7)
       = 1 + 23 + 23\cdot 11 + 23\cdot 11\cdot 7
       = 1 + \binom{23}{1} + \binom{23}{2} + \binom{23}{3}.
\]
In 1949, Golay noted that this is precisely the case q = 2, n = 23, m = 3, r = 11 of the
theorem above. That is, the necessary condition for the existence of a binary (23, 23 − 11) perfect
3-error-correcting code is satisfied.
In the same year, he also noticed that
\[ 1 + 2\binom{11}{1} + 2^2\binom{11}{2} = 1 + 22 + 220 = 243 = 3^5, \]
so that the necessary condition for the existence of a ternary (11, 11 − 5) perfect 2-error-correcting code is satisfied.
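The sphere-size condition is simple enough to explore numerically. The following sketch (function names are ours) evaluates the sum in the theorem and reports the exponent r when the sum is an exact power of q.

```python
from math import comb


def sphere_size(q, n, m):
    """Number of q-ary strings of length n within Hamming distance m of a fixed word."""
    return sum((q - 1) ** i * comb(n, i) for i in range(m + 1))


def perfect_code_exponent(q, n, m):
    """Return r if sphere_size(q, n, m) == q**r, and None otherwise."""
    size, power, r = sphere_size(q, n, m), 1, 0
    while power < size:
        power *= q
        r += 1
    return r if power == size else None


print(perfect_code_exponent(2, 23, 3))   # Golay's binary case
print(perfect_code_exponent(3, 11, 2))   # Golay's ternary case
print(perfect_code_exponent(2, 7, 1))    # the (7, 4) Hamming code
print(perfect_code_exponent(2, 10, 2))   # a case where the condition fails
print(perfect_code_exponent(2, 90, 2))   # condition holds, but see the note below
```

The last case is a standard cautionary example: 1 + 90 + 4005 = 4096 = 2^12, so the arithmetic condition is met, yet it is known that no binary perfect 2-error-correcting code of length 90 exists.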
Of course, necessary conditions are not always sufficient, but in 1949, Golay constructed
two linear codes with the above parameters and two slightly larger linear codes. The binary
codes are G23 and G24 , the (23, 12) Golay code and the (24, 12) extended Golay code; the
ternary codes are G11 and G12 , the (11, 6) Golay code and the (12, 6) extended Golay code.
Now we can describe an (n, n − r) q-ary linear code as the row space of a matrix of
n columns and rank n − r over Zq, the so-called generating matrix of the code. Let A be the
following 12 × 24 binary matrix:
151










A=









1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
0
1
0
1
0
0
0
1
1
1
0
1
1
1
1
0
1
0
0
0
1
1
1
0
1
0
1
1
0
1
0
0
0
1
1
1
1
1
0
1
1
0
1
0
0
0
1
1
1
1
1
0
1
1
0
1
0
0
0
1
1
1
1
1
0
1
1
0
1
0
0
0
1
0
1
1
1
0
1
1
0
1
0
0
1
0
0
1
1
1
0
1
1
0
1
0
1
0
0
0
1
1
1
0
1
1
0
1
1
1
0
0
0
1
1
1
0
1
1
0
1
0
1
0
0
0
1
1
1
0
1
1
1










.









A is a generating matrix for G24 ; deleting its last (boldface) column gives a generating
matrix for G23 .
The Golay code G12 is the row space of the following 12 × 12 ternary matrix B, and
the Golay code G11 is the row space of B′, obtained from B by deleting the last column.
\[
B = \left(\begin{array}{rrrrrrrrrrrr}
-1 &  1 &  1 & -1 &  1 &  1 &  1 & -1 & -1 & -1 &  1 & -1 \\
-1 & -1 &  1 &  1 & -1 &  1 &  1 &  1 & -1 & -1 & -1 &  1 \\
-1 &  1 & -1 &  1 &  1 & -1 &  1 &  1 &  1 & -1 & -1 & -1 \\
-1 & -1 &  1 & -1 &  1 &  1 & -1 &  1 &  1 &  1 & -1 & -1 \\
-1 & -1 & -1 &  1 & -1 &  1 &  1 & -1 &  1 &  1 &  1 & -1 \\
-1 & -1 & -1 & -1 &  1 & -1 &  1 &  1 & -1 &  1 &  1 &  1 \\
-1 &  1 & -1 & -1 & -1 &  1 & -1 &  1 &  1 & -1 &  1 &  1 \\
-1 &  1 &  1 & -1 & -1 & -1 &  1 & -1 &  1 &  1 & -1 &  1 \\
-1 &  1 &  1 &  1 & -1 & -1 & -1 &  1 & -1 &  1 &  1 & -1 \\
-1 & -1 &  1 &  1 &  1 & -1 & -1 & -1 &  1 & -1 &  1 &  1 \\
-1 &  1 & -1 &  1 &  1 &  1 & -1 & -1 & -1 &  1 & -1 &  1 \\
-1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1
\end{array}\right).
\]
Note that these codes are 6-dimensional subspaces of Z_3^12 and Z_3^11 respectively, since B
and B′ have rank six. (Arithmetic in Z3 is the same as arithmetic mod 3 with the symbols
−1, 0 and 1.)
We are now ready to connect the (11, 5, 2) biplane with the Golay code twins. Let U
and V be the upper rightmost 11 × 11 submatrices of A and B, respectively.
Take U and change all the 1’s on the main diagonal to 0’s, and what do you get? You
get M, the incidence matrix for the (11, 5, 2) biplane.
Take V and change all the −1’s to 0’s. Then, change all the 1’s on the main diagonal
to 0’s, and what do you get? Again, you get M.
Thus, from the (11, 5, 2) biplane we are able to construct the binary Golay code twins
G23 and G24 and the ternary Golay code twins G11 and G12 .
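The binary half of this construction can be replayed by machine. The sketch below starts from the difference set B1, builds the incidence matrix M, puts 1s on its diagonal to get U, borders it as in the matrix A (identity block, then a column of eleven 1s over a 0, then U over a row of eleven 1s), and tallies the weights of all 2^12 codewords. The variable names are ours, and the row and column ordering is the one read off from A above; treat it as an illustration rather than the authors' own computation.

```python
from collections import Counter

B1 = [1, 3, 4, 5, 9]
# Incidence matrix of the biplane: row c is the block B1 + c, columns are varieties 0..10.
M = [[1 if (j - c) % 11 in B1 else 0 for j in range(11)] for c in range(11)]
# U is M with the 0s on the main diagonal changed to 1s.
U = [[1 if i == j else M[i][j] for j in range(11)] for i in range(11)]

rows = []
for i in range(11):
    rows.append([int(i == j) for j in range(12)] + [1] + U[i])
rows.append([int(j == 11) for j in range(12)] + [0] + [1] * 11)


def to_int(bits):
    """Pack a list of 0/1 entries into an integer so rows can be added with XOR."""
    return int("".join(map(str, bits)), 2)


gens = [to_int(r) for r in rows]

weights = Counter()
for mask in range(1 << 12):          # every subset of the 12 generator rows
    word = 0
    for i in range(12):
        if (mask >> i) & 1:
            word ^= gens[i]
    weights[bin(word).count("1")] += 1

print(sorted(weights.items()))
```

The tally should show no nonzero word of weight less than 8, which is the minimum distance needed for the extended code to correct three errors, and the count of weight-8 words is the number that reappears in the discussion of S(5, 8, 24).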
Nice connections, to be sure — and there are even more connections with Steiner systems, so let’s find out about them.
Steiner Systems
The (11, 5, 2) biplane and the Golay codes G23 and G24 are instrumental in constructing
our next set of twins, namely the twin Steiner systems known as {S(4, 5, 11), S(5, 6, 12)}
and {S(4, 7, 23), S(5, 8, 24)}.
A Steiner system S(p, q, r) is a collection S of q−subsets of an r−set Ω, such that
every p−set in Ω is contained in exactly one of the q−sets in S. Thus, an S(1, q, r) is a
partition of an r−set into q−sets, so that these exist if and only if r is a multiple of q.
It is known that S(2, 3, r)’s exist if and only if r ≡ 1 or 3 mod 6 and S(3, 4, r)’s exist if
and only if r ≡ 2 or 4 mod 6. Steiner systems S(2, 3, r) are also block designs known as
Steiner triple systems; here is S(2, 3, 7), namely the (7, 3, 1) symmetric design with blocks
A, B, C, D, E, F, and G:
A = 124, B = 235, C = 346, D = 450, E = 561, F = 602, G = 013.
(22)
(You can construct an S(3, 4, 8) from the S(2, 3, 7): adjoin ∞ to each of the blocks of the
S(2, 3, 7), and include the complement in {0, 1, 2, 3, 4, 5, 6} of each block in the S(2, 3, 7).)
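The parenthetical construction is small enough to verify exhaustively. This sketch (names are ours; the string "inf" stands for ∞) builds the fourteen 4-sets and checks the defining property of a Steiner system by counting, for each p-subset, the blocks that contain it.

```python
from itertools import combinations

INF = "inf"
fano = [frozenset(B) for B in
        ({1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 0}, {5, 6, 1}, {6, 0, 2}, {0, 1, 3})]
points7 = frozenset(range(7))

# Adjoin infinity to each block, and add the complement of each block.
quads = [B | {INF} for B in fano] + [points7 - B for B in fano]
points8 = list(range(7)) + [INF]


def is_steiner_system(p, blocks, points):
    """True if every p-subset of `points` lies in exactly one block."""
    return all(sum(set(T) <= B for B in blocks) == 1
               for T in combinations(points, p))


print(is_steiner_system(2, fano, range(7)))
print(is_steiner_system(3, quads, points8))
```

The same is_steiner_system helper can be pointed at any candidate list of blocks, which makes it a convenient check when constructing the larger systems by hand.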
For p ≥ 4, the story is different: very few of these are known, and one reason is that
there are restrictions on the parameters p, q and r, namely the
Restriction Theorem. If S is an S(p, q, r) defined on the r-set Ω, then S contains
\(\binom{r}{p} / \binom{q}{p}\) q-sets, and for 0 ≤ j < p:
(a) \(\binom{r-j}{p-j} / \binom{q-j}{p-j}\) is an integer;
(b) every j-subset of Ω belongs to exactly \(\binom{r-j}{p-j} / \binom{q-j}{p-j}\) q-sets of S;
(c) there exists an S(p − j, q − j, r − j) on an (r − j)-subset of Ω.
Proof: Each of the \(\binom{r}{p}\) p-sets in Ω belongs to a unique q-set in S and each such q-set
contains \(\binom{q}{p}\) p-sets; hence, S contains \(\binom{r}{p} / \binom{q}{p}\) q-sets. It follows that \(\binom{r}{p} / \binom{q}{p}\) is an integer,
which establishes (a) for j = 0. Now fix x ∈ Ω. Let Sx = {Y | Y is a q-set in S containing x}.
Since each p-set containing x belongs to a unique q-set Y ∈ Sx, it follows that Sx′ =
{Y − {x} | Y ∈ Sx} is a collection of (q − 1)-subsets of Ω − {x}, such that each (p − 1)-subset
of Ω − {x} belongs to a unique (q − 1)-set in Sx′. In short, Sx′ is an S(p − 1, q − 1, r − 1) on
the set Ω − {x}; by the above, it follows that Sx contains exactly \(\binom{r-1}{p-1} / \binom{q-1}{p-1}\) q-sets of S.
This establishes (b) and (c) for j = 1. Continuing inductively, we see that if an S(p, q, r)
exists, then so does an S(p − j, q − j, r − j) for 0 ≤ j ≤ p − 1; from this, we may deduce
(b) and (c) for 0 ≤ j < p.
The S(p − j, q − j, r − j) systems obtained in this way from an S(p, q, r) are said to
be derived from the S(p, q, r). Every known Steiner system S(4, q, r) is derived from an
S(5, q + 1, r + 1); in this sense, they are all twin systems. The known Steiner systems
S(p, q, r) with p > 3 are S(5, 6, r) with r = 12, 24, 48, 72, 84, 108, 132, and 244, S(5, 7, 28),
S(5, 8, 24), and their derived designs.
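The divisibility conditions of the Restriction Theorem make a handy first filter for candidate parameters. The sketch below (the function name is ours) returns the list of counts from part (b), one for each j from 0 to p − 1, or None as soon as one of the quotients fails to be an integer.

```python
from math import comb


def block_counts(p, q, r):
    """Counts C(r-j, p-j) / C(q-j, p-j) for j = 0, ..., p-1, or None if any is not an integer.

    Entry j is the number of blocks through a fixed j-subset; entry 0 is the total
    number of blocks. Passing this test is necessary, not sufficient, for existence.
    """
    counts = []
    for j in range(p):
        num, den = comb(r - j, p - j), comb(q - j, p - j)
        if num % den:
            return None
        counts.append(num // den)
    return counts


print(block_counts(5, 8, 24))
print(block_counts(5, 6, 12))
print(block_counts(4, 5, 11))
print(block_counts(2, 3, 8))    # r = 8 is not 1 or 3 mod 6
```

Note how the list for S(4, 5, 11) is the list for S(5, 6, 12) with its first entry removed, which reflects the fact that the smaller system is derived from the larger one.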
We are now ready to construct the twin systems {S(4, 5, 11), S(5, 6, 12)} and the twin
systems {S(4, 7, 23), S(5, 8, 24)}. We construct S(5, 6, 12) and the systems derived from
it (in particular, S(4, 5, 11)) by means of a unified approach, beginning with the
(11, 5, 2) biplane. Let B1 = {1, 3, 4, 5, 9}, the first block in the (11, 5, 2) biplane, and
let B := B1 ∪ {∞}. Denote the set {0, 1, . . . , 10} by [0..10]. In what follows, addition
and subtraction are all mod 11, except that ∞ ± x = ∞ for all x. If X is a set of
numbers and m is a number, then we define X + m := {x + m|x ∈ X}. For example,
B + 6 = {1 + 6, 3 + 6, 4 + 6, 5 + 6, 9 + 6, ∞ + 6} = {7, 9, 10, 0, 4, ∞}.
Define the mappings σ and s to be permutations on [0..10] and [0..10] ∪ {∞}, respectively,
by
σ = (1 10)(2 5)(3 7)(4 8)(6 9) and s = (0 ∞) ◦ σ.
Now, if f is a permutation and X is a set, then define f(X) to be {f(x) : x ∈ X}. For
example, since B + 6 = {7, 9, 10, 0, 4, ∞}, we see that
s(B + 6) = s({7, 9, 10, 0, 4, ∞}) = {s(7), s(9), s(10), s(0), s(4), s(∞)}
= {3, 6, 1, ∞, 8, 0}, and so
s(B + 6) + 3 = {6, 9, 4, ∞, 0, 3}.
We now construct the Steiner systems as follows:
S(5, 6, 12) = {B + k | k ∈ [0..10]} ∪ {s(B + k) + j | j, k ∈ [0..10]};
S(4, 5, 11) = {B1 + k | k ∈ [0..10]} ∪ {σ(B1 − n) + k : n ∈ B1, k ∈ [0..10]};
S(3, 4, 10) = blocks of S(4, 5, 11) containing 10, with 10 deleted; and
S(2, 3, 9) = blocks of S(3, 4, 10) containing 0, with 0 deleted.
Figures 68 and 69 list the blocks for S(4, 5, 11) and S(5, 6, 12), respectively; don’t peek
until you’ve tried your hand at constructing them yourself. Notice that the blocks of the
(11, 5, 2) biplane appear in S(4, 5, 11) as its first column.
There are many ways to construct that most extraordinary design known as S(5, 8, 24)
(as, indeed, there are to construct S(5, 6, 12)), and one way is to use the Golay code G24 .
Here is a very brief description; we give a more detailed account of some of the constructions
in Section 18.
By the Restriction Theorem, if S(5, 8, 24) exists, then it contains \(\binom{24}{5} / \binom{8}{5} = 759\) 8-sets. This
just happens to be the exact number of codewords of Hamming weight 8 in G24. For
example, all rows but the last in the generating matrix A given earlier are codewords
of weight 8. Let us number the columns of A with the customary numbering scheme
1, 2, . . . , 22, 0, ∞. If c = c1 c2 · · · c∞ is a weight-8 codeword, then Oc = {i | ci = 1} is an
8-subset of {1, 2, . . . , 22, 0, ∞}. S(5, 8, 24) consists of these 759 so-called octads, and we
construct the derived systems as follows:
S(5, 8, 24) = codewords of weight 8 in G24;
S(4, 7, 23) = octads of S(5, 8, 24) containing ∞, with ∞ deleted;
S(3, 6, 22) = blocks of S(4, 7, 23) containing 0, with 0 deleted; and
S(2, 5, 21) = blocks of S(3, 6, 22) containing 22, with 22 deleted.
One of the notable aspects of S(5, 8, 24) is something it shares with the other Steiner
systems, namely, a high degree of symmetry. Studying this symmetry leads us to the next
collection of twins: the Mathieu groups.
Automorphisms, Transitivity, Simplicity and the Mathieu Groups
An automorphism of a Steiner system S is a permutation of the underlying r−set that also
permutes the q−sets of S among themselves. For example, the permutation a = (2 4)(5 6)
on the set {0, 1, 2, 3, 4, 5, 6} is an automorphism of S(2, 3, 7) — see Equation (22). Using the
labeling convention from that equation, you can check that a switches B and C, switches
D and F , and leaves A, E and G fixed. That is, viewed as a permutation on S(2, 3, 7),
a = (B C)(D F ).
You may recall that the automorphisms of the (11, 5, 2) biplane form a group under
composition — and the same is true for a Steiner system. As before, we write Aut(S)
for the automorphism group of the Steiner system S. In general, Steiner systems have a
large number of automorphisms. For example, S(2, 3, 7) consists of seven triples, and yet
Aut(S(2, 3, 7)) is isomorphic to PSL(2, 7), the group of order 168 generated by the permutations a and b = (0 1 2 3 4 5 6). For example, you can show that, as a permutation
on S(2, 3, 7), ab² = (A B F)(C E G).
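These small claims about S(2, 3, 7) can all be checked by direct computation. The sketch below (helper names are ours) computes the permutation that a point permutation induces on the labelled blocks A, ..., G, and then generates the group ⟨a, b⟩ by brute-force closure, composing from right to left as before.

```python
BLOCKS = {"A": frozenset({1, 2, 4}), "B": frozenset({2, 3, 5}),
          "C": frozenset({3, 4, 6}), "D": frozenset({4, 5, 0}),
          "E": frozenset({5, 6, 1}), "F": frozenset({6, 0, 2}),
          "G": frozenset({0, 1, 3})}
NAME_OF = {pts: name for name, pts in BLOCKS.items()}


def perm_from_cycles(cycles, n=7):
    """Tuple p with p[x] = image of x."""
    p = list(range(n))
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            p[x] = y
    return tuple(p)


def compose(f, g):
    """Right-to-left composition: (f o g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(len(g)))


def induced(p):
    """Block label -> label of its image under p, or None if p is not an automorphism."""
    out = {}
    for name, pts in BLOCKS.items():
        img = frozenset(p[x] for x in pts)
        if img not in NAME_OF:
            return None
        out[name] = NAME_OF[img]
    return out


def generate(gens):
    """All products of the generators, found breadth-first."""
    identity = tuple(range(len(gens[0])))
    seen, frontier = {identity}, [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(g, p)
                if q not in seen:
                    seen.add(q)
                    new.append(q)
        frontier = new
    return seen


a = perm_from_cycles([(2, 4), (5, 6)])
b = perm_from_cycles([(0, 1, 2, 3, 4, 5, 6)])
ab2 = compose(a, compose(b, b))      # apply b twice, then a

group = generate([a, b])
print(induced(a))
print(induced(ab2))
print(len(group))
```

Reading the dictionaries as cycles recovers the block permutations quoted in the text, and the size of the generated group can be compared with the stated order of PSL(2, 7).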
The automorphism groups of S(4, 5, 11), S(5, 6, 12), S(4, 7, 23) and S(5, 8, 24) are known
as the Mathieu groups M11 , M12 , M23 and M24 , respectively. First, we’ll learn about their
origin and why they are important, and then we’ll describe them.
Émile Mathieu (1835–1890) first constructed the groups bearing his name in two papers
summarizing work from his doctoral thesis. The Mathieu groups are special in two ways:
first, they are multiply transitive, and second, they are simple—the first of the so–called
sporadic finite simple groups ever described. Let us see what these terms mean.
A group of permutations G on a set A is called k−transitive if for every pair of ordered k−tuples (a1 , . . . , ak ) and (b1 , . . . , bk ) of elements of A, there exists g ∈ G such that
g(ai ) = bi for 1 ≤ i ≤ k. G is called transitive (respectively, multiply transitive) if it
is 1−transitive (respectively, k−transitive for some k > 1). A k−transitive group is also
(k − 1)−transitive. For example, the group A3 = {(1), (1 2 3), (1 3 2)} is transitive, but
not multiply transitive. PSL(2, 7) is 2−transitive, but not 3−transitive. Sn , consisting of
all permutations on {1, 2, . . . , n}, is n−transitive. Now, k−transitive groups for k ≥ 4 are rare
indeed, and one special feature of the Mathieu groups is that they are highly transitive.
The following theorem tells the story; for a proof, see [27] or [138].
Multiple Transitivity Theorem. (a) If G is 4−transitive, then G is isomorphic to
(i) a symmetric group Sn for some n ≥ 4, (ii) an alternating group An for some n ≥ 6, or
(iii) Mn for n = 11, 12, 23 or 24.
(b) If G is 5−transitive, then G is isomorphic to (i) a symmetric group Sn for some
n ≥ 5, (ii) an alternating group An for some n ≥ 7, (iii) M23 or (iv) M24 .
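The definition of k-transitivity translates directly into a small computation: a group is k-transitive exactly when the orbit of one ordered k-tuple of distinct points contains all such k-tuples. The following is a minimal sketch of that test; the helper names are illustrative, and the generators used for PSL(2, 7) are the permutations a and b given earlier.

```python
def from_cycles(cycles, points):
    """Permutation as a dict, built from disjoint cycles on `points`."""
    perm = {x: x for x in points}
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            perm[x] = y
    return perm

def tuple_orbit(gens, start):
    """All tuples reachable from `start` by applying generators coordinatewise."""
    seen, frontier = {start}, [start]
    while frontier:
        t = frontier.pop()
        for g in gens:
            u = tuple(g[x] for x in t)
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return seen

def is_k_transitive(gens, points, k):
    """True if the group generated by `gens` is k-transitive on `points`."""
    points = list(points)
    wanted = 1
    for i in range(k):
        wanted *= len(points) - i   # number of ordered k-tuples of distinct points
    return len(tuple_orbit(gens, tuple(points[:k]))) == wanted

A3 = [from_cycles([(1, 2, 3)], [1, 2, 3])]
S4 = [from_cycles([(1, 2)], [1, 2, 3, 4]), from_cycles([(1, 2, 3, 4)], [1, 2, 3, 4])]
PSL27 = [from_cycles([(2, 4), (5, 6)], range(7)),
         from_cycles([tuple(range(7))], range(7))]

print(is_k_transitive(A3, [1, 2, 3], 1), is_k_transitive(A3, [1, 2, 3], 2))
print(is_k_transitive(PSL27, range(7), 2), is_k_transitive(PSL27, range(7), 3))
print(is_k_transitive(S4, [1, 2, 3, 4], 4))
```

Because the groups are finite, applying the generators alone (without their inverses) already reaches the whole orbit.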
The Mathieu groups are also simple, and to understand what that means, we need to
bring back an idea from matrix algebra. Recall that two matrices A and B are similar if
there exists an invertible matrix Q such that B = Q−1 AQ. We can carry this idea over
into groups: two group elements a and b are conjugate if there exists a group element g
such that b = g−1 ag. (Remember, all elements of a group are invertible.) For example, if
a = (1 2), b = (1 3) and g = (1 2 3), then you can show that b = g −1 ag.
A special property of some subgroups is that of normality: a subgroup H of a group
G is normal if for all h ∈ H and for all g ∈ G, H contains g−1 hg. For example, let
S = {(1), (1 2 3), (1 3 2), (1 2), (1 3), (2 3)}, the group of all permutations of {1, 2, 3};
let A = {(1), (1 2 3), (1 3 2)} and C = {(1), (1 2)}. You can check that A and C are
both subgroups of S, that A is normal, and that C is not normal. If G is an abelian
(commutative) group, then all subgroups are normal—for, if h ∈ H and g ∈ G, then
g−1 hg = g −1 gh = h ∈ H by commutativity.
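The conjugacy and normality examples above are small enough to verify by brute force. In this sketch the points 1, 2, 3 are stored as 0, 1, 2, and products are read with the right-hand factor acting first (an assumption about the convention, under which the stated identity b = g⁻¹ag holds).

```python
from itertools import permutations

def compose(p, q):
    """Product pq with the right-hand factor acting first."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def conjugate(a, g):
    """Return g^{-1} a g."""
    return compose(inverse(g), compose(a, g))

def is_normal(H, G):
    """H is normal in G if it contains g^{-1} h g for all h in H, g in G."""
    return all(conjugate(h, g) in H for h in H for g in G)

identity = (0, 1, 2)
a = (1, 0, 2)      # (1 2)
b = (2, 1, 0)      # (1 3)
g = (1, 2, 0)      # (1 2 3)

S = set(permutations(range(3)))        # all six permutations of {1, 2, 3}
A = {identity, (1, 2, 0), (2, 0, 1)}   # {(1), (1 2 3), (1 3 2)}
C = {identity, a}                      # {(1), (1 2)}

print(conjugate(a, g) == b)
print(is_normal(A, S), is_normal(C, S))
```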
A group containing no normal subgroups except itself and the identity subgroup is
called simple. Just as prime numbers are the (multiplicative) building blocks by which
we construct all the integers, so simple groups are the building blocks for constructing
all finite groups. A major achievement of twentieth-century mathematics, featuring such
luminaries as Chevalley, Feit, Thompson, Conway, Fischer, Gorenstein, and many others,
was the complete classification of finite simple groups. The upshot of this effort, spanning
some 15,000 journal pages (!), is that all finite simple groups belong to a few well-studied
infinite families—except for twenty-six so-called sporadic groups. And the Mathieu groups
were the very first sporadic groups ever described. Speaking of which:
There are many ways to describe the Mathieu groups; here is one way. Let s, t and u
be the permutations defined by
s = (0 ∞)(1 10)(2 5)(3 7)(4 8)(6 9),
t = (0 1 2 3 4 5 6 7 8 9 10), and
u = (3 9 4 5)(2 6 10 7).
Then M11 is the group generated by t and u, and M12 is the group generated by s, t and
u. And yes, s is the same permutation that we used to construct S(5, 6, 12).
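One way to get a feel for these generators is to follow ordered tuples of points under them. The sketch below is only a consistency check, not a proof: it relies on the known facts that M11 is sharply 4-transitive on 11 points (order 11 · 10 · 9 · 8 = 7920) and M12 is sharply 5-transitive on 12 points (order 12 · 11 · 10 · 9 · 8 = 95040), so the orbit of a single 4-tuple (respectively 5-tuple) should have exactly that many elements. The string "inf" stands for ∞.

```python
INF = "inf"
POINTS = list(range(11)) + [INF]

def from_cycles(cycles, points):
    """Permutation as a dict, built from disjoint cycles on `points`."""
    perm = {x: x for x in points}
    for cyc in cycles:
        for x, y in zip(cyc, cyc[1:] + cyc[:1]):
            perm[x] = y
    return perm

def tuple_orbit(gens, start):
    """All tuples reachable from `start` by applying generators coordinatewise."""
    seen, frontier = {start}, [start]
    while frontier:
        t = frontier.pop()
        for g in gens:
            u = tuple(g[x] for x in t)
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return seen

s = from_cycles([(0, INF), (1, 10), (2, 5), (3, 7), (4, 8), (6, 9)], POINTS)
t = from_cycles([tuple(range(11))], POINTS)
u = from_cycles([(3, 9, 4, 5), (2, 6, 10, 7)], POINTS)

orbit_m11 = tuple_orbit([t, u], (0, 1, 2, 3))        # t and u both fix ∞
orbit_m12 = tuple_orbit([s, t, u], (0, 1, 2, 3, 4))

print(len(orbit_m11), 11 * 10 * 9 * 8)
print(len(orbit_m12), 12 * 11 * 10 * 9 * 8)
```

If the two numbers on each line agree, every ordered 4-tuple (5-tuple) of distinct points is reached, which is the transitivity asserted by the theorem.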
Let α, β and γ be the permutations defined on {0, 1, . . . , 22, ∞} by
α = (2 16 9 6 8)(4 3 12 13 18)(10 11 22 7 17)(20 15 14 19 21),
β = (0 1 . . . 21 22), and
γ = (0 ∞)(1 22)(2 11)(3 15)(4 17)(5 9)(6 19)(7 13)(8 20)(10 16)(12 21)(14 18).
Then M23 is the group generated by α and β, and M24 is the group generated by α, β and
γ.
We can now see how M11 and M12 are twins, and the same is true of M23 and M24 ; to
cement their connection with the (11, 5, 2) biplane further, it turns out that the number of
elements in each of these groups is divisible by 11.
Our whirlwind tour of the (11, 5, 2) biplane, its symmetries, and its connections with
six pairs of twins has been quite a tale. But there is one Steiner system, namely the Steiner
system S(5, 8, 24), whose description we gave was just too brief, and in the last two sections
of this book, we’ll explore this extraordinary combinatorial design with the detail that it
deserves.
And now, we take a good look at Rick’s Tricky Six Puzzle, a board game from the
world of recreational mathematics whose deceptive simplicity gives rise to a large number
of combinatorial connections.
13459  07293  03618  0412X  06X59  05784
2456X  183X4  14729  15230  1706X  16895
35670  29405  2583X  26341  28170  279X6
46781  3X516  36940  37452  39281  38X07
57892  40627  47X51  48563  4X392  49018
689X3  51738  58062  59674  504X3  5X129
79X04  62849  69173  6X785  61504  6023X
8X015  7395X  7X284  70896  72615  71340
90126  84X60  80395  819X7  83726  82451
X1237  95071  914X6  92X08  94837  93562
02348  X6182  X2507  X3019  X5948  X4673
Figure 68: The Steiner system S(4, 5, 11) and the (11, 5, 2) design
∞13459  0X7826  058291  07293∞  08934X  023465  094617  0361∞8  041∞X2  06∞X59  01X573  0∞5784
∞2456X  108937  1693X2  183X4∞  19X450  134576  1X5728  1472∞9  152∞03  17∞06X  120684  1∞6895
∞35670  219X48  27X403  29405∞  2X0561  245687  206839  2583∞X  263∞14  28∞170  231795  2∞79X6
∞46781  32X059  380514  3X516∞  301672  356798  31794X  3694∞0  374∞25  39∞281  3428X6  3∞8X07
∞57892  43016X  491625  40627∞  412783  4678X9  428X50  47X5∞1  485∞36  4X∞392  453907  4∞9018
∞689X3  541270  5X2736  51738∞  523894  57890X  539061  5806∞2  596∞47  50∞4X3  564X18  5∞X129
∞79X04  652381  603847  62849∞  6349X5  689X10  64X172  6917∞3  6X7∞58  61∞504  675029  6∞023X
∞8X015  763492  714958  7395X∞  745X06  79X021  750283  7X28∞4  708∞69  72∞615  78613X  7∞1340
∞90126  8745X3  825X69  84X60∞  856017  8X0132  861394  8039∞5  819∞7X  83∞726  897240  8∞2451
∞X1237  985604  93607X  95071∞  967128  901243  9724X5  914X∞6  92X∞80  94∞837  9X8351  9∞3562
∞02348  X96715  X47180  X6182∞  X78239  X12354  X83506  X250∞7  X30∞91  X5∞948  X09462  X∞4673
Figure 69: The Steiner system S(5, 6, 12)
17 Rick’s Tricky Six Puzzle: S5 sits specially in S6
Many of you will be familiar with the Fifteen Puzzle (Figure 70, left). Singmaster [145,
§5A, pp. 77–84] gives nearly a hundred references to it. It is often associated with the name
of Sam Loyd, but Sam continues to be a controversial figure [17;9, Chapter 2, pp. 18–30].
In the unlikely event that you’ve never seen the Fifteen Puzzle, you can read about it in
the review quoted in the next section.
Sliding block puzzles may be represented by graphs in which the vertices represent
possible positions of the blocks and the edges represent the permissible moves of a block
from one position to another. For example, the Fifteen Puzzle may be thought of as being
played on the sixteen vertices of the graph in Figure 70. In this graph, don’t think of the
numbers as labels for the vertices, but as labelled blocks that can be slid from a vertex to
an empty vertex. For example, in the figure, either block 12 or block 15 may be slid onto
the empty vertex, the one that holds no block.
Figure 70: The Fifteen Puzzle and its bipartite graph.
The notoriety of the puzzle derives from the impossibility of being able to swap the
positions of 14 and 15 in the bottom row, while keeping all the other numbers fixed. This
parity property was noted as early as 1879 [19;18, Chapter 1].
How many people know Rick Wilson’s general theorem on sliding block puzzles? We
retain Rick’s first name to avoid confusion with the well known theorem of Sir John Wilson,
first proved by Lagrange, that if p is a prime then (p − 1)! + 1 is divisible by p.
The set of attainable positions in a sliding block puzzle of n pieces sliding on the edges
of a graph with n + 1 vertices forms a group. Rick Wilson’s theorem [174] states that,
apart from simple polygons, and the graph that is the subject of this article, the group of
permutations of attainable positions is either Sn , the full symmetric group, if the graph
contains an odd circuit, or An , the alternating group of even permutations, if the graph
contains only even circuits. In the latter case the graph is bipartite, the vertices separate
into two sets and there are no edges between members of the same set — the Fifteen Puzzle
is the classical example.
We mention that Rick Wilson’s theorem applies only to nonseparable graphs, that is,
graphs that are 2-connected, or without cut-points, so that there are always at least two
paths between any pair of vertices that have no intermediate vertex in common.
What is the exception?
Math Reviews 48 #10882 offers a review by Derek Smith of Wilson’s paper [174], quoted
here with permission from the AMS.
The 15-puzzle consists of fifteen small movable square tiles numbered 1, 2, · · · , 15
and one empty square, arranged in a 4 × 4 array. One is permitted to interchange the empty square with a tile next to it as often as desired. The challenge
is to move by a sequence of such interchanges from one position of the tiles to
another specified position. The author generalises this problem to an arbitrary
simple graph and proves that for a finite simple nonseparable graph, with one
exception, any position can be reached from any other position unless the graph
is bipartite. In the bipartite case, the set of positions splits into two sets, with
no position in one set reachable from a position of the other set.
This might be misconstrued to read as though the exception is the set of bipartite
graphs. In fact the exception is shown in Figure 71. It is a graph on 7 points with 8 edges.
It contains two 5-circuits and a 6-circuit, so that we might expect to be able to obtain all
6! = 720 permutations of the six counters, labelled with the symbols 0,1,2,3,4,∞. Why
do we use ∞ instead of 5 ? Our labels represent the field F5 with ∞ adjoined; this will
make the connexion with the automorphism group of the puzzle clearer.
Figure 71: Rick’s Tricky Six Puzzle.
A little experimentation reveals that there are many arrangements that cannot be attained. The 6! possible arrangements separate into six equivalence classes, with 5! positions
in each class. We shall see that
∞01234,
∞01243,
∞01324,
∞01342,
∞01423,
∞01432
are representatives, one from each equivalence class. Note that we always read a position
clockwise, starting from twelve o’clock. It is not possible to get from any one of these six
positions to any other by sliding the disks along the eight edges of the graph.
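The splitting into six classes can be confirmed by exhaustive search. The sketch below assumes that the graph of Figure 71 is a hexagon with a seventh vertex joined to two opposite corners; this is the only 2-connected graph on 7 points with 8 edges whose circuits have lengths 5, 5 and 6, but the vertex numbering used here is purely illustrative. States record which counter (or None, for the empty vertex) sits on each vertex, and a breadth-first search collects everything reachable by sliding.

```python
from collections import deque
from itertools import permutations

# Assumption: hexagon 0-1-2-3-4-5-0 plus a vertex 6 joined to the opposite corners 0 and 3.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (6, 0), (6, 3)]
ADJ = {v: [] for v in range(7)}
for x, y in EDGES:
    ADJ[x].append(y)
    ADJ[y].append(x)

COUNTERS = ["∞", "0", "1", "2", "3", "4"]

def moves(state):
    """Yield every state obtained by sliding one counter into the empty vertex."""
    hole = state.index(None)
    for v in ADJ[hole]:
        nxt = list(state)
        nxt[hole], nxt[v] = nxt[v], None
        yield tuple(nxt)

def component(start):
    """All states reachable from `start` (breadth-first search)."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in moves(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def all_components():
    remaining = set(permutations(COUNTERS + [None]))   # 7! = 5040 states
    found = []
    while remaining:
        comp = component(next(iter(remaining)))
        found.append(comp)
        remaining -= comp
    return found

components = all_components()
sizes = sorted(len(c) for c in components)
# Within each class, count the states whose empty vertex is vertex 6.
home = sorted(sum(1 for st in c if st[6] is None) for c in components)
print(len(components), sizes, home)
```

Counting all seven possible places for the empty vertex, each class holds 7 × 5! states; restricting to one fixed empty vertex leaves the 5! positions per class described in the text.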
Not much of a puzzle?
John Conway tells us that he once made a copy of the Tricky Six Puzzle, and we made
one that Art Benjamin helped us demonstrate at the 2006 MathFest, but we doubt if it
will ever catch on commercially. However, it does have considerable mathematical interest.
We shall see that it is related to the projective plane of order 4, to the Hoffman-Singleton
graph, to the Steiner system S(5, 6, 12), to a binary (12,132,4) code, to the ternary Golay
code C12 , and to shuffling a deck of cards [141, 46]. It is also related to the invariant theory
of six points, to “mystic pentagons” and the two-colorings of the three-subsets of a six-element set [85], and to the tetracode, the Minimog, and the Rubicon [37, pp. 320–330], and
to many other things that we don’t have room for here.
Many mathematicians are interested in word play, so we asked our favorite anagrammatist, Andrew Bremner, to supply a set of six letters which had many anagrams. He
suggested A, C, E, N, R, T. Among the 720 possibilities we found the following twenty
words, names and acronyms.
RECANT  ARCNET  CARTEN  CENTRA  CARNET  TANCER
-       CANTER  CRANET  CRETAN  CANTRE  TRANCE
-       CERANT  -       NECTAR  CREANT  -
-       ENCART  -       TARNEC  NETCAR  -
-       TERCAN  -       TRACEN  TANREC  -
Table 2: Six equivalence classes of anagrams.
If you encode these anagrams with R=∞, E=0, C=1, A=2, N=3, and T=4, you will
find that it’s possible to get from one word to any other in the same column of Table 2, but
not to any word in a different column. For example, from RECANT, you can’t CANTER
to any of the other words. We list below four things you CAN do (have we always found
the shortest sequence of moves?). If you want to follow along, and to avoid what Conway
calls the “alias-alibi problem” (is it the counter? or the position it’s in?), then you should
label six counters or slips of paper with the symbols ∞, 0, 1, 2, 3, 4 and the letters R, E,
C, A, N, T and slide them about on an improvised board. When we write a permutation
(ABC. . . Z) this means that A ends up where B started, B ends up where C started,
and so on, cyclically, with Z arriving where A started. By the usual convention, when
we string together several such permutations it is the one on the right that acts first: they
don’t act in the order in which you would normally read them. Compare the out-shuffle
with the in-shuffle in the second example below.
1. Cut the deck: swap the first three symbols ∞, 0, 1, with the last three, 2, 3, 4
respectively. The moves 210∞4310∞4310∞432 take RECANT into ANTREC. This
is the permutation (∞2)(03)(14). [In anticipation of the next section we will also
write this as x → (x + 2)/(3x + 4) mod 5. Such a mapping is called a Möbius
transformation.]
2. Perform an out-shuffle, or an in-shuffle: cut the deck RECANT into REC and ANT
and interleave letters alternately from each half. In an out-shuffle the top card remains on top: RAENCT = (0132) [x → 2x + 1]. This can be achieved by the moves
234∞23102∞413. An in-shuffle results in ARNETC = (∞02)(431) [x → 2/(2x + 1)]
and results from the moves ∞012∞012∞3412∞30. Note that shuffling one way
then unshuffling the other performs a cut: (∞20)(134)(0132) =(∞2)(03)(14). On
the other hand, unshuffling then shuffling swaps alternate cards: (0132)(∞20)(134)
=(∞0)(12)(34) [x → 2/x].
These manipulations of cards don’t generate the whole group of the puzzle; they only
yield 4! of the 5! possible states, those in which the pairs of cards ∞4, 03, 12, that are
equidistant from the centre of the deck, remain so. Some experimentation reveals sequences
of moves that break up these pairs and generate the whole group.
3. Cycle the first four symbols. The moves 210∞2 followed by 10∞21 and 0∞210
and ∞210∞ take RECANT → ARECNT → CARENT → ECARNT and back into
RECANT. These are the transformations (∞012) [x → 1/(2x + 1)], (∞012)^2 =
(∞1)(02) [x → (2x + 1)/(2x + 3)], (∞012)^3 = (∞210) [x → (2x + 3)/x], and (∞012)^4
= the identity [x → x].
4. Fix the first symbol and cycle the other five. The moves ∞432104∞ send RECANT
to RTECAN, ∞01234 to ∞40123, the permutation (01234) [x → x + 1]. In fact,
combined with the out-shuffle (0132) [x → 2x + 1], this cycle allows us to apply any
invertible linear polynomial mod 5 to the finite symbols 0,1,2,3,4, yielding positions
such as (0412) [x → 3x + 4], and its inverse (0214) [x → 2x + 2]. These are illustrated
in the first of the six diagrams of Figure 73 below as all ways to travel round the
pentagon or the pentagram.
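The bracketed formulas in the four items above can be checked by evaluating each Möbius map on the six symbols and reading off its cycles. This is a small sketch; the function names are illustrative, and the usual conventions are assumed (a vanishing denominator gives ∞, and ∞ maps to p/r).

```python
INF = "∞"
ELEMENTS = [INF, 0, 1, 2, 3, 4]

def mobius(p, q, r, s, x, m=5):
    """Image of x under x -> (px + q)/(rx + s) over F_m with ∞ adjoined."""
    if x == INF:
        num, den = p % m, r % m
    else:
        num, den = (p * x + q) % m, (r * x + s) % m
    if den == 0:
        return INF
    return num * pow(den, -1, m) % m   # modular inverse (Python 3.8+)

def cycles(p, q, r, s):
    """Nontrivial cycles of the map, each started at its earliest symbol in ELEMENTS."""
    seen, out = set(), []
    for start in ELEMENTS:
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = mobius(p, q, r, s, x)
        if len(cyc) > 1:
            out.append(tuple(cyc))
    return out

print(cycles(1, 2, 3, 4))   # the cut, x -> (x + 2)/(3x + 4)
print(cycles(2, 1, 0, 1))   # the out-shuffle, x -> 2x + 1
print(cycles(0, 2, 2, 1))   # the in-shuffle, x -> 2/(2x + 1)
print(cycles(0, 1, 2, 1))   # cycling four symbols, x -> 1/(2x + 1)
print(cycles(1, 1, 0, 1))   # cycling five symbols, x -> x + 1
```

Each printed list should match the cycle notation given beside the corresponding formula, up to the choice of starting point within a cycle.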
What is the automorphism group of the Tricky Six Puzzle?
As you may have guessed from the brackets in the last section, it is the group PGL(2, F5)
of Möbius transformations over the field F5 , namely
\[ \left\{ \text{mappings } x \to \frac{px + q}{rx + s} \;:\; ps - qr \neq 0 \right\}. \]
This F5 is the first of several finite fields we will encounter. In fact for each prime power
q there is a unique field with q elements, which we will denote by Fq . So working in F5
means working modulo 5 — but only because 5 is prime.
There are 5^2 − 1 = 24 possible nonzero vectors (p, q) for the top row of the matrix
\[ \begin{pmatrix} p & q \\ r & s \end{pmatrix}, \]
and then 5^2 − 5 = 20 vectors (r, s) which are independent of the first row, as
possibilities for the second row; a total of 24 × 20 = 480 nonsingular matrices. But the
matrices M , 2M , 3M , 4M , for example
\[ \begin{pmatrix} 1 & 0 \\ 4 & 4 \end{pmatrix}, \quad \begin{pmatrix} 2 & 0 \\ 3 & 3 \end{pmatrix}, \quad \begin{pmatrix} 3 & 0 \\ 2 & 2 \end{pmatrix}, \quad \begin{pmatrix} 4 & 0 \\ 1 & 1 \end{pmatrix}, \]
all give the same transformation, (0)(3)(∞4)(1 2), taking ∞ 0 1 2 3 4 into 4 0 2 1 3 ∞, or RECANT into TEACNR, so that the number of different transformations is only 480/4 = 120.
To the surprise of at least one of the authors, this group is isomorphic to S5 , the group of
permutations of five objects. We will show that the isomorphism establishing this extends
naturally to an automorphism of S6 , under which the group of the puzzle maps to an S5
subgroup of S6 given by fixing a point. It’s in this context that the isomorphism is most
illuminatingly presented.
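The count 480/4 = 120, and the grouping of the anagrams in Table 2, can both be reproduced by enumerating the group. The sketch below rests on one assumption about conventions: the six places of a word are labelled ∞, 0, 1, 2, 3, 4 in reading order, and two words are taken to be equivalent when the permutation of places carrying one to the other is a Möbius transformation. The index 5 stands for ∞.

```python
from itertools import product

INF = 5                         # index 5 stands for ∞; 0..4 are the elements of F5
LABEL = [INF, 0, 1, 2, 3, 4]    # assumed label of each place of a six-letter word

def as_perm(p, q, r, s):
    """Permutation of (0, 1, 2, 3, 4, ∞) induced by x -> (px + q)/(rx + s)."""
    image = []
    for x in range(6):
        if x == INF:
            num, den = p % 5, r % 5
        else:
            num, den = (p * x + q) % 5, (r * x + s) % 5
        image.append(INF if den == 0 else num * pow(den, -1, 5) % 5)
    return tuple(image)

matrices = [m for m in product(range(5), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % 5 != 0]
group = {as_perm(*m) for m in matrices}
# Images of (∞, 0, 1): all distinct if the group is sharply 3-transitive.
triples = {(g[INF], g[0], g[1]) for g in group}

def same_class(w1, w2):
    """True if the place permutation taking w1 to w2 lies in the group."""
    pi = [None] * 6
    for i, letter in enumerate(w2):
        pi[LABEL[i]] = LABEL[w1.index(letter)]
    return tuple(pi) in group

WORDS = ("RECANT ARCNET CARTEN CENTRA CARNET TANCER CANTER CRANET CRETAN CANTRE "
         "TRANCE CERANT NECTAR CREANT ENCART TARNEC NETCAR TERCAN TRACEN TANREC").split()
classes = []
for w in WORDS:
    for cls in classes:
        if same_class(cls[0], w):
            cls.append(w)
            break
    else:
        classes.append([w])

print(len(matrices), len(group), len(triples))
for cls in classes:
    print(cls)
```

The four scalar multiples of a matrix collapse to one element of `group`, which is why the set has a quarter as many members as `matrices`.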
Two different group actions
An inner automorphism of a group is one given by conjugation, that is, each element x ↦
a⁻¹xa for some fixed element a. The automorphisms of a group themselves form a group,
of which the inner automorphisms form a normal subgroup [12, pp. 140–141]. The outer
automorphisms are those automorphisms this doesn’t account for: by one definition any
non-inner automorphism is outer; by another the outer automorphism group is the quotient
of the automorphism group by the inner automorphism group. The symmetric group S6 is
the only finite symmetric group that supports a (nontrivial) outer automorphism [11; 14,
Theorem 7.3].
Here’s why. Let α be an automorphism of Sn , and let p be a permutation in Sn . Then
p and α(p) have the same order. In addition, if α is a conjugacy, then p and α(p) have
the same cycle structure. Therefore, if there is to be an automorphism of Sn that is not a
conjugacy, then there must be two distinct conjugacy classes of the same size that are mapped
to each other by the non-conjugacy α. This only happens in S6 , where two conjugacy
classes of elements of order two have the same size, namely the 15 transpositions and the
15 products of three disjoint transpositions. And that’s why.
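The coincidence of class sizes is easy to tabulate. A permutation whose cycle shape has c_l cycles of length l lies in a conjugacy class of size n! / ∏ (l^{c_l} · c_l!). The sketch below uses this standard formula to look, for each n, for a class of involutions other than the transpositions that has the same size as the class of transpositions.

```python
from collections import Counter
from math import comb, factorial

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, largest), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def class_size(shape):
    """Number of permutations in S_n with the given cycle shape."""
    denom = 1
    for length, count in Counter(shape).items():
        denom *= length ** count * factorial(count)
    return factorial(sum(shape)) // denom

def rivals_of_transpositions(n):
    """Shapes of order 2, other than a single transposition, with C(n, 2) elements."""
    return [shape for shape in partitions(n)
            if max(shape) == 2 and shape.count(2) > 1
            and class_size(shape) == comb(n, 2)]

for n in range(2, 13):
    print(n, rivals_of_transpositions(n))
```

In the range tested, only n = 6 produces a rival class, the products of three disjoint transpositions.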
Suppose an abstract group acts on a finite set T (that is, each element of the group
permutes T , and permuting by two group elements in succession is the same as permuting
by their product). If we were to relabel the elements of T by a permutation a, then an
element which acts via the permutation x after the relabelling would have acted by a⁻¹xa
before it. Now suppose our abstract group was the symmetric group S_T all along. Then a
is in S_T , so x ↦ a⁻¹xa is an inner automorphism of S_T .
So the existence of an outer automorphism of S6 means that it can act on sets of size 6 in
a fundamentally different way than the obvious one. We’ll realize the outer automorphism
by constructing such an action, following Sylvester [157, 158, 159, 160, 161]. Consider the
complete graph on the six points a, b, c, d, e, f. Sylvester calls the six points monads, and
its $\binom{6}{2} = 15$ edges duads. These duads form 15 = 5 × 3 matchings, or triads of independent
edges, that Sylvester called synthemes, and graph theorists know as one-factors. Note that
there are 5 choices for a’s partner and 3 ways to pair the remaining four.
The graph supports six partitions, or synthematic totals, into five synthemes, shown in
Table 3 and labelled with their associated Tricky Six blocks, ∞, 0, 1, 2, 3, 4.
color   ∞          0          1          2          3          4
r       ab cf de   ab de cf   ab fd ce   ab dc fe   ab fe dc   ab ec df
o       ac db ef   ac fd eb   ac ef db   ac be df   ac ed bf   ac bf ed
y       ad ec fb   ad cb fe   ad be fc   ad fb ec   ad cf eb   ad fe cb
i       ae fd bc   ae bf dc   ae dc bf   ae cf bd   ae bc fd   ae db fc
v       af be cd   af ec bd   af cb ed   af ed cb   af db ce   af cd be
Table 3: The six totals: the edge-colorings of K6 with five colors.
The complete graph K6 underlying this construction shouldn’t be confused with Figure 71, the graph of the puzzle itself. As an example, the coloring associated with the label
2, with ab dc fe colored red, ac be df colored orange, etc., is illustrated in Figure 72.
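Sylvester's counts can be recovered by direct enumeration: a syntheme is a set of three duads covering all six monads, and a total is a set of five synthemes covering all fifteen duads. The following minimal sketch builds both families and checks how any two totals overlap.

```python
from itertools import combinations

MONADS = "abcdef"
DUADS = [frozenset(d) for d in combinations(MONADS, 2)]

# A syntheme: three duads that together use all six monads.
SYNTHEMES = [frozenset(trio) for trio in combinations(DUADS, 3)
             if len(frozenset().union(*trio)) == 6]

# A total: five synthemes that together use all fifteen duads.
TOTALS = [frozenset(five) for five in combinations(SYNTHEMES, 5)
          if len(frozenset().union(*five)) == 15]

shared = {len(s & t) for s, t in combinations(TOTALS, 2)}

print(len(DUADS), len(SYNTHEMES), len(TOTALS))
print(shared)   # sizes of the overlaps between pairs of distinct totals
```

Since every pair of totals shares exactly one syntheme, the 15 synthemes correspond to the 15 pairs of totals, which is the duality between monads and totals used later in the section.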
Figure 72: The edge-coloring 2 of K6 , the complete graph on six points.
If we fix the monad a and operate on the six totals with the 5! = 120 permutations
of the other five monads, we generate the set of possible arrangements of the Tricky Six
symbols.
Consider the action of our outer automorphism on conjugacy classes. Within a symmetric group such as S6 conjugacy classes are just cycle shapes, which we write as partitions
of 6. The cycle shapes on the totals attainable in the puzzle are those that arise from
permutations of the monads which fix a, and these have a fixed point in their cycle shape.
For example, if we fix a and three other vertices, we obtain $\binom{5}{2} = 10$ odd permutations
of order 2. These are involutions; each is its own inverse. They appear as the first ten
entries in Table 4:
swap    permutation of ∞01234    pqrs
(de)    (∞0)(12)(34)             0210
(ef)    (∞1)(23)(40)             1114
(fb)    (∞2)(34)(01)             4121
(bc)    (∞3)(40)(12)             1124
(cd)    (∞4)(01)(23)             4111
(cf)    (∞0)(13)(24)             0310
(db)    (∞1)(24)(30)             1214
(ec)    (∞2)(30)(41)             4321
(fd)    (∞3)(41)(02)             1324
(be)    (∞4)(02)(13)             4211
(ab)    (∞0)(14)(23)
(ac)    (∞1)(20)(34)
(ad)    (∞2)(31)(40)
(ae)    (∞3)(42)(01)
(af)    (∞4)(03)(12)
Table 4: Swapping two vertices of K6 .
together with the permutations of ∞ 0 1 2 3 4 that they realize, and the entries pqrs of
the corresponding Möbius transformation.
For later reference we include as well the five transpositions which move the monad a;
these don’t realize Möbius transformations.
We thus find that permutations of abcdef of the shapes in the first row of the following
table map respectively to permutations of ∞01234 of the shapes in the second row. When a
is fixed, the numbers of these that are attainable are given in the third row.
shape on abcdef    1^6   2·1^4   2^2·1^2   2^3     3·1^3   3^2     4·1^2   4·2   5·1   3·2·1   6
shape on ∞01234    1^6   2^3     2^2·1^2   2·1^4   3^2     3·1^3   4·1^2   4·2   5·1   6       3·2·1
attainable         1     10      15        0       20      0       30      0     24    20      0
For example, at the entry 4·1^2 we fix a, and one other letter (5
ways) and cycle the remaining four (4!/4 = 6 ways), contributing 5 × 6 = 30 to the total
of 120. As another example, if we fix a and two other vertices and cycle the rest, 3·1^3 , we
obtain $\binom{5}{3} \times 2 = 20$ even permutations of order 3. They are displayed in Table 5.
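The correspondence between cycle shapes can be computed rather than looked up. The sketch below rebuilds the six totals, lets each of the 720 permutations of the monads act on them, and records which cycle shape on the monads goes to which cycle shape on the totals; it also counts the shapes of the 120 permutations that fix a. The helper names are illustrative.

```python
from collections import Counter
from itertools import combinations, permutations

MONADS = "abcdef"
DUADS = [frozenset(d) for d in combinations(MONADS, 2)]
SYNTHEMES = [frozenset(t) for t in combinations(DUADS, 3)
             if len(frozenset().union(*t)) == 6]
TOTALS = [frozenset(f) for f in combinations(SYNTHEMES, 5)
          if len(frozenset().union(*f)) == 15]

def cycle_shape(perm):
    """Cycle lengths of a permutation given as a dict, in decreasing order."""
    seen, lengths = set(), []
    for x in perm:
        if x in seen:
            continue
        n = 0
        while x not in seen:
            seen.add(x)
            x = perm[x]
            n += 1
        lengths.append(n)
    return tuple(sorted(lengths, reverse=True))

def induced_on_totals(perm):
    """The permutation of the six totals induced by a permutation of the monads."""
    def move(total):
        return frozenset(frozenset(frozenset(perm[m] for m in duad) for duad in syn)
                         for syn in total)
    return {t: move(t) for t in TOTALS}

shape_map = {}            # shape on monads -> set of shapes seen on totals
attainable = Counter()    # shapes of the permutations that fix the monad a
for image in permutations(MONADS):
    perm = dict(zip(MONADS, image))
    shape = cycle_shape(perm)
    shape_map.setdefault(shape, set()).add(cycle_shape(induced_on_totals(perm)))
    if perm["a"] == "a":
        attainable[shape] += 1

for shape in sorted(shape_map, reverse=True):
    print(shape, "->", shape_map[shape], " attainable:", attainable[shape])
```

Each monad shape maps to a single shape on the totals, and the attainable counts sum to 120.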
It will be found that any of the 6 × 5 × 4 = 120 possible arrangements of the first three,
or indeed of any three, symbols in a Tricky Six position is attainable, the order of the
remaining three then being determined.
(fbc)  (∞41)(032)  1341      (deb)  (∞32)(014)  1121
(bcd)  (∞02)(143)  0113      (efc)  (∞43)(120)  2131
(cde)  (∞13)(204)  1312      (fbd)  (∞04)(231)  0411
(def)  (∞24)(310)  2311      (bce)  (∞10)(342)  1410
(efb)  (∞30)(421)  3110      (cdf)  (∞21)(403)  1132
(fcb)  (∞14)(023)  1211      (dbe)  (∞23)(041)  1431
(bdc)  (∞20)(134)  1330      (ecf)  (∞34)(102)  3211
(ced)  (∞31)(240)  1123      (fdb)  (∞40)(213)  1140
(dfe)  (∞42)(301)  3121      (bec)  (∞01)(324)  0141
(ebf)  (∞03)(412)  0112      (cfd)  (∞12)(430)  1213
Table 5: Cycling three of five vertices of K6 .
All 120 positions are conveniently displayed as the set of six diagrams of Figure 73. The
first symbol is in the middle of the appropriate diagram. The next two symbols determine
a directed edge of a pentagon or pentagram. The final three symbols are then found by
continuing to cycle round the pentagon or pentagram in the sense defined by the edge. For
example, the position 241xyz is found in the diagram having 2 in the middle, where the
edge 41 defines the counterclockwise pentagram 41∞30, so that xyz = ∞30.
Figure 73: The set of 120 Tricky Six positions at a glance.
The six diagrams of Figure 73 are also conveniently viewed as the six pentagonal pyramids that may be sliced from the icosahedron of Figure 74, whose opposite vertices are
identified. Each pyramid comprises four cycles. For example,
∞(01234)^1 = ∞(01234),    ∞(01234)^2 = ∞(02413),
∞(01234)^3 = ∞(03142),    ∞(01234)^4 = ∞(04321),
where the superscripts denote powers, that is the lengths of the steps round the pentagon.
As you can see from Table 3, there is a unique synthematic total that is invariant under
any five-cycle (jklmn) on the monads a, b, c, d, e, f. Conway, who introduced us to
Sylvester’s notation, denotes it by i(jklmn). The total i(jklmn) contains the syntheme
ij kn lm and its images under powers of (jklmn). Figure 75 shows an example. This is,
of course, an arrangement for 6 players in a round robin tournament.
Figure 74: Another good way to see them all.
Figure 75: a(bcdef), the unique synthematic total, also known as ∞, invariant under
(bcdef).
The identities
\[ i(\overleftrightarrow{jklmn}) = i(\overleftrightarrow{jklmn})^{\text{power}} = j(\overleftrightarrow{i\,l\,k\,n\,m}) \]
let us bring any of the 6 monads into the initial position, and write the remainder as any of
5 presentations of any of 4 powers of the five-cycle left over, giving 6 × 5 × 4 = 120 names
for each total.
For instance a(bcdef) = a(bcdef)^k for any exponent k not divisible by 5, and its other
names are b(adcfe)^k = c(aedbf)^k = d(afecb)^k = e(abfdc)^k = f(acbed)^k . Each
group of names can be thought of as associated with a pentagram labelled with letters,
with the first letter in the centre, like those in Figure 73. Such pentagrams are fixed by
one of the six subgroups of S6 of order 20 that fixes ∞ = a(bcdef).
Indeed, observe that there is a duality of our construction exchanging monads with
totals and duads with synthemes, realizable as ∞01234 ↔ abcdef. Under this exchange
the names of the total ∞ become just the attainable Tricky Six permutations. Our situation
can be schematized as in Figure 76, the symmetry of which makes the duality obvious.
   monads               duads               synthemes              totals
     6     --5 : 2--     15     --3 : 3--      15      --2 : 5--     6
     i                   ij                  ij.kl.mn              i(m k j l n)
Figure 76: Schematic view of Sylvester’s construction.
We saw that the first three symbols determine the whole position, and how to read it
from Figure 73. In fact any three symbols determine the position. For example, to find
which of ∞, 0, 2 should be assigned to x in x31y4z, look for the edge 31 in the ∞, 0 and
2 diagrams of Figure 73. It respectively defines the pentagram (∞)31420, the pentagram
(0)31∞42, and the pentagon (2)310∞4, of which the second has 4 in the required
position, 031∞42. For another example we may complete x3y1z4, by looking in the same
three diagrams for the edge 43 (why 43? Think of x as fixed, and notice that 4 and 3
are adjacent in the remaining cycle 3y1z43). This determines the pentagons (∞)43210,
(0)43∞21, (2)4310∞, of which the first has 1 in the required position, ∞32104.
It is through this automorphism that Rick’s Tricky Six puzzle is related to the other
objects named at the start of the “Not much of a puzzle” section.
Here’s a first brief example. Implicit in the way we’ve written Table 3 is another set
of six objects paired with the totals, the mystic pentagons which begin the interesting
paper [85]. The ten duads that don’t contain a form two sets of five: the second and
third columns of each total. Each monad appears twice in each column. If we forget the
synthemes and remember only the column divisions, we get a mystic pentagon, that is, a
partition of the edges of the complete graph on vertices bcdef into two five-cycles. There are
in fact only six mystic pentagons, and we get each of them once (Figure 77). Therefore the
permutations of the mystic pentagons which can be attained by permuting bcdef exactly
form the Tricky Six group.
Figure 77: The six mystic pentagons
The remainder of this section is devoted to a more leisurely examination of several other
examples.
The projective plane of order 4
The projective plane of order four, PG(2,F4 ), is often defined by means of a cyclic difference
set, for example {3,6,12,7,14} modulo 21, whose five members generate the $\binom{5}{2}$ differences
±1, ±2, . . . , ±10. Note that the first three elements generate the multiples of 3, and the
last two generate the multiples of 7. Think of the difference set as a complete pentagon
which cycles round a complete regular 21-gon as in Figure 78. Among its 10 edges there
is exactly one of every possible length, so that every pair of the 21 points belongs to just
one pentagon. Dually, any two pentagons have just one vertex in common.
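The difference-set property and its consequences can be verified directly. The sketch below uses the naming of lines introduced in the text that follows (line n consists of the points d − n modulo 21 for d in the difference set) and checks the projective-plane axioms by brute force.

```python
from itertools import combinations

N = 21
D = (3, 6, 12, 7, 14)

# Every nonzero residue mod 21 should arise exactly once as a difference.
diffs = sorted((x - y) % N for x in D for y in D if x != y)

# Line n is the translate of the difference set by -n.
LINES = {n: frozenset((d - n) % N for d in D) for n in range(N)}

points_ok = all(sum(1 for line in LINES.values() if p in line and q in line) == 1
                for p, q in combinations(range(N), 2))
lines_ok = all(len(LINES[m] & LINES[n]) == 1 for m, n in combinations(range(N), 2))
self_dual = all((j in LINES[i]) == (i in LINES[j])
                for i in range(N) for j in range(N))

print(diffs == list(range(1, N)))
print(points_ok, lines_ok, self_dual)
```

The last check expresses the self-duality used later: point j lies on line i exactly when point i lies on line j, because both say that i + j belongs to the difference set.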
Call the pentagon {3,6,12,7,14} the line 0. Subtract 3, 4, 9, 11 modulo 21 to give the
respective lines
3: {0, ♠, 9, 4, 11},
4: {20, 2, 8, ♥, 10},
9: {15, 18, ♦, 19, 5},
11: {13, 16, 1, 17, ♣}.
Figure 78: A difference set generates the projective plane of order 4.
These four lines each pass through the point 3 which is denoted differently in each of
them, by ♠, ♥, ♦ and ♣ in turn, and is circled in Figure 78. The other four points on
each of these lines are represented in the figure by the corresponding suit symbols. They
exactly cover the 16 points which are not on line 0. (Compare this figure with Figures 32
and 33 from Section 6, which were also generated using difference sets.)
In general, we give the line {3 − n, 6 − n, 12 − n, 7 − n, 14 − n} modulo 21 the name n,
0 ≤ n ≤ 20, as in Table 6, which displays a configuration of 21 points and 21 lines with
5 points on each line, 5 lines through each point, every pair of lines intersecting in a point
and every pair of points determining a line. Bold numbers refer to lines, ordinary numbers
to points (or vice versa, since the configuration is self-dual). The line i passes through the
point j if and only if the point i lies on the line j.
Twenty-one is not a prime power, so the numbers 0, 1, . . . , 20 do not form a field.
However, they do form an additive cyclic group, and the twelve numbers which are not
multiples of 3 or 7 form a multiplicative group, of which the powers of 2 are a subgroup.
Let’s find two different actions of S6 in this projective plane. As the two sets of size six
let us take the points 1 2 4 8 16 11 (the powers of two, 2^0 , 2^1 , 2^2 , 2^3 , 2^4 , 2^5 , mod 21) and
the lines 0 18 15 9 14 7 (zero and the negatives of the original difference set).
lines    0 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
points   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20  0  1  2
points   6  7  8  9 10 11 12 13 14 15 16 17 18 19 20  0  1  2  3  4  5
points  12 13 14 15 16 17 18 19 20  0  1  2  3  4  5  6  7  8  9 10 11
points   7  8  9 10 11 12 13 14 15 16 17 18 19 20  0  1  2  3  4  5  6
points  14 15 16 17 18 19 20  0  1  2  3  4  5  6  7  8  9 10 11 12 13
Table 6: Incidences in the projective plane of order 4.
We can begin to rewrite Table 3 for the projective plane by replacing the labels a b
c d e f of the vertices of K6 with the respective point numbers 1 2 4 8 16 11. We also
relabel the totals, ∞ 0 1 2 3 4 with the respective line numbers 18 15 14 0 7 9.
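Each cell lists the three line numbers of a syntheme, followed in parentheses by the point in which those lines concur.
       18              15              14              0               7               9
r    5  3 19  (9)    5 19  3  (9)    5 16  8 (19)    5 20 17  (7)    5 17 20  (7)    5  8 16 (19)
o    2  4 17 (10)    2 16 12 (12)    2 17  4 (10)    2 12 16 (12)    2 19  1  (5)    2  1 19  (5)
y    6  8  1  (6)    6 10 17 (18)    6 12  3  (0)    6  1  8  (6)    6  3 12  (0)    6 17 10 (18)
i   11 16 10 (17)   11  1 20 (13)   11 20  1 (13)   11  3  4  (3)   11 10 16 (17)   11  4  3  (3)
v   13 12 20 (15)   13  8  4 (20)   13 10 19 (14)   13 19 10 (14)   13  4  8 (20)   13 20 12 (15)
Table 7: An assignment of numbers to Table 3.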
Then, with Table 6 as our guide, we label the edges ab, cf, de which join the points
1 & 2, 4 & 11, 8 & 16, with the line-numbers 5 3 19 and similarly for all the fifteen
synthemes. The lines 5 3 19 concur in the point 9 and each syntheme corresponds to
a point. The labels of these fifteen points are just those numbers that are not powers of
two, and Table 3 turns into Table 7. You can check that this is the same configuration,
with the same labelling, as before.
The points 1, 2, 4, 8, 16, of which no three are collinear, form a conic, that is, the
solution set of a homogeneous quadratic over the field of order four. The tangents to the
conic are the lines that meet the conic in just one point (indicated by a hat):
13 {11 14 20 15 1̂},
16 {8̂ 11 17 12 19},
17 {7 10 1̂6̂ 11 18},
1 {2̂ 5 11 6 13},
3 {0 3 9 4̂ 11}.
These are the five lines through the point 11. This point combines with the conic to form
a hyperconic, six points no three of which are collinear. These six points are the monads,
and determine $\binom{6}{2} = 15$ lines, the duads, which meet in threes at the other fifteen points;
these correspond to the synthemes. The remaining six lines (0, 7, 14, 9, 18, 15) that don’t
meet the hyperconic correspond to the totals; no three of them concur and they form a set
of lines dual to the set of six points.
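The claims about the conic and the hyperconic can be checked mechanically. The sketch below assumes that line n is the translate of the difference set {3, 6, 12, 7, 14} by −n modulo 21, which agrees with the lines quoted above (for instance line 3 = {0, 3, 9, 4, 11}); the variable names are ours and purely illustrative.

```python
from itertools import combinations

# Assumption: line n consists of the points d - n (mod 21), d in the difference set.
D = (3, 6, 12, 7, 14)
lines = {n: frozenset((d - n) % 21 for d in D) for n in range(21)}

# Projective-plane axiom: every pair of points lies on exactly one line.
for p, q in combinations(range(21), 2):
    assert sum(1 for pts in lines.values() if p in pts and q in pts) == 1

conic = {1, 2, 4, 8, 16}
hyperconic = conic | {11}

# No line contains three points of the hyperconic.
assert all(len(pts & hyperconic) <= 2 for pts in lines.values())

# Tangents meet the conic once; chords meet the hyperconic twice;
# external lines miss the hyperconic altogether.
tangents = sorted(n for n, pts in lines.items() if len(pts & conic) == 1)
chords = sorted(n for n, pts in lines.items() if len(pts & hyperconic) == 2)
external = sorted(n for n, pts in lines.items() if not pts & hyperconic)

print(tangents)                               # [1, 3, 13, 16, 17]
print(all(11 in lines[n] for n in tangents))  # True
print(len(chords), external)                  # 15 [0, 7, 9, 14, 15, 18]
```

The five tangents are exactly the lines through 11, the fifteen chords are the duads, and the six external lines are the totals.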
We repeat Figure 76 as Figure 79, annotating the nodes further to make clear the
interpretation of the figure as the projective plane.
[Figure 79 labels: monads (6), POINTS on the hyperconic; duads (15), LINES: chords of the hyperconic; synthemes (15), POINTS not on the hyperconic; totals (6), LINES not meeting the hyperconic. Incidence labels: monads 5, 2 duads; duads 3, 3 synthemes; synthemes 2, 5 totals.]
Figure 79: Schematic view of the projective plane of order 4.
Our two nonisomorphic S6-actions show up here as the action that permutes the points
of any six-point hyperconic, like 1 2 4 8 16 11, and the action induced on the lines
not meeting it, in this case 0 7 14 15 18 9. Our numbering makes it easy to check that
doubling all the vertex labels modulo 21 is an automorphism that fixes the hyperconic under
which line labels are also doubled, so the cycle (1 2 4 8 16 11) induces the permutation
(0)(7 14)(9 18 15) of the six lines. If we swap 1 and 2 and fix the other four points, (1
2)(4)(8)(16)(11), this induces (0 7)(15 18)(9 14) on the lines and these two automorphisms
are enough to generate the whole group.
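The two induced permutations can be recomputed from the incidences alone. In the sketch below (our own helper names, and again assuming line n = {d − n mod 21}), a permutation of the six hyperconic points is extended to the other fifteen points by moving the three chords through each point and intersecting their images; the external lines then follow their points.

```python
D = (3, 6, 12, 7, 14)
lines = {n: frozenset((d - n) % 21 for d in D) for n in range(21)}
H = frozenset({1, 2, 4, 8, 16, 11})          # the hyperconic
external = [n for n, pts in lines.items() if not pts & H]

def line_through(p, q):
    """Number of the unique line through two distinct points."""
    return next(n for n, pts in lines.items() if p in pts and q in pts)

def induced_on_external(perm):
    """Extend a permutation of H (a dict) and return its action on external lines."""
    def move_point(x):
        # The three chords through x, recorded by their pairs of hyperconic points.
        pairs = [tuple(lines[n] & H) for n in lines if x in lines[n] and lines[n] & H]
        images = [lines[line_through(perm[a], perm[b])] for a, b in pairs]
        (y,) = frozenset.intersection(*images)   # the image chords are concurrent
        return y
    result = {}
    for n in external:
        image = frozenset(move_point(x) for x in lines[n])
        result[n] = next(m for m in external if lines[m] == image)
    return result

doubling = {h: (2 * h) % 21 for h in H}
swap_1_2 = {h: h for h in H}
swap_1_2[1], swap_1_2[2] = 2, 1

print(induced_on_external(doubling))  # cycles (0)(7 14)(9 18 15)
print(induced_on_external(swap_1_2))  # cycles (0 7)(15 18)(9 14)
```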
We can’t draw the plane with straight lines, so, in Figure 80, although the twenty-one
points 0, 1, 2, . . . , 20 are clear, the lines are less so. The line 9 is the incircle of the
pentagon and the lines 0, 14, 15, 18, 7 look like petals. The lines 3, 16, 17, 13, 1
are the diameters through the point 11. The lines 4, 8, 6, 12, 2 are pentagram edges,
which need to be bent round to pass through the respective points 3, 19, 18, 15, 5; and
the lines 11, 5, 10, 20, 19 are pentagon edges, both ends of which should be bent round
to pass through the respective pairs of points 17&13, 7&9, 14&17, 13&7, 9&14.
The points 3, 6, 12, 7, 14 of line 0 thus lie on the respective lines 3, 6, 12, 7, 14
and, of course, lie just one on each of the remaining fifteen lines. The other four points on
such a line comprise two pairs which form triples with the line number, each member of a
triple being the number of the line containing the other two points. For example, line 18
contains the point 6 and the four points 9, 15, 10, 17 whose joins to the point 18 are
the respective lines 15, 9, 17, 10 which form the triples {18, 15, 9}, and {18, 17,
10}. There are ten such triples and they exhibit the ten differences 1 ≤ d ≤ 10 exactly
three times each. For example, the difference 5 occurs in the triples {8, 13, 4}, {11,
16, 7}, and {15, 20, 13}. These ten triples correspond to the sets of edges of pairs of
opposite faces of an icosahedron, half of which is shown in Figure 81.
Buy one; get several free!
We noticed that the difference set {3,6,12,7,14} comprised two difference sets: {7,14}
generates the multiples of 7 and {3,6,12} generates the multiples of 3. So the projective
Figure 80: The projective plane of order 4.
plane of order four contains the not very exciting projective plane of order one: the triangle
{0,7,14} and 1119 other copies of it, and the much more interesting projective plane of order
two, the so-called Fano configuration (although it was known more than 40 years earlier
to the Rev. T. P. Kirkman [91]). Besides the obvious example, whose point-numbers are
congruent to 0 modulo 3, which is self-dual in the sense that it has the same line-numbers,
and is shown in Figure 82, there are 359 others: including the dual pair whose point- and
line-numbers are respectively congruent to 1 and 2 (or to 2 and 1) modulo 3. The figure
also shows a dual pair whose point-numbers differ by 3 from the line-numbers.
More surprising is the fact [11] that if we throw away a hyperconic we are left with fifteen
points which form a projective geometry of order two in three dimensions! For example,
throw away 1,2,4,8,16,11. The remaining points are those of the line {0,5,7,17,20}, and its
double {0,10,14,13,19}, together with the multiples of 3. Figure 83 shows this geometry
as a tetrahedron, as Polster would draw it [122]. Its fifteen points are the vertices, 5 7 17
20, the midpoints of the edges (multiples of 3), the centroids of the faces, 10 14 13 19,
and the centroid, 0. Fifteen of the thirty-five lines, those which meet the hyperconic, are
inherited from the plane: they are the twelve medians of the faces and the three joins of
midpoints of opposite edges. The other twenty are the vertex sets of triangles formed by
Figure 81: Ten triples form half an icosahedron.
Figure 82: Kirkman-Fano configurations.
three of the six lines 0, 7, 14, 9, 18, 15 which avoid the hyperconic. They appear as the
six edges of the tetrahedron, the four joins of the vertices to the centroids of the opposite
faces, and ten lines which cannot be drawn in Euclidean space: the four incircles of the
faces and six similar curves circumscribing the “medial triangles”:
{3,14,19}  {6,10,14}  {9,10,13}  {12,13,14}  {15,10,19}  {18,13,19}
formed by the triples of lines
14,9,0  14,0,18  14,15,18  14,0,15  14,9,18  14,9,15.
Figure 83: The projective geometry PG(3, F2).
A different and quite revealing labelling of the $15 = 2^4 - 1 = 4 + 6 + 4 + 1$ points is to
assign 1, 2, 4, 8 to the vertices, sums of pairs of these to the midpoints of the edges, sums
of three to the centroids of the faces, and the sum of all four, 15, to the centroid.
old numbers   5  7  3 17 15  6 13 20 18 12 10  9 14 19  0
new numbers   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
The thirty-five lines are then those triples whose nim-sums (XOR, binary addition
without carry) are zero: the ten “noneuclidean” lines correspond to those nim-sums which
are not ordinary sums, for example, 3⊕5 = 6 and 5⊕11 = 14. The $15 = 2^4 - 1 = 4+6+4+1$
planes of the geometry are Kirkman-Fano configurations: the four faces of the tetrahedron,
the six “medial planes” joining the midpoint of an edge to the opposite edge, the four
“cones” joining a vertex to the incircle of the opposite face, and the “sphere” of midpoints
of edges together with its centre, 15.
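The passage from the plane to PG(3, F2) and the nim-sum description of its lines can be checked together. The sketch below makes the same assumption as before about how lines of the plane are numbered (line n = {d − n mod 21} for the difference set {3, 6, 12, 7, 14}); it collects the fifteen inherited lines and the twenty triangles of external lines, relabels them with the new numbers, and compares the result with the triples of nim-sum zero. The names are ours.

```python
from itertools import combinations

D = (3, 6, 12, 7, 14)
plane = [frozenset((d - n) % 21 for d in D) for n in range(21)]
hyper = frozenset({1, 2, 4, 8, 16, 11})

# Fifteen inherited lines: chords with their two hyperconic points removed.
inherited = [pts - hyper for pts in plane if pts & hyper]
# Twenty further lines: vertex sets of triangles of external lines.
external = [pts for pts in plane if not pts & hyper]

def meet(a, b):
    (p,) = a & b
    return p

triangles = [frozenset({meet(a, b), meet(a, c), meet(b, c)})
             for a, b, c in combinations(external, 3)]

old = [5, 7, 3, 17, 15, 6, 13, 20, 18, 12, 10, 9, 14, 19, 0]
old_to_new = dict(zip(old, range(1, 16)))
relabelled = {frozenset(old_to_new[p] for p in t) for t in inherited + triangles}

nim_lines = {frozenset(t) for t in combinations(range(1, 16), 3)
             if t[0] ^ t[1] ^ t[2] == 0}

# "Noneuclidean" lines: the largest label is not the ordinary sum of the other two.
noneuclidean = [t for t in nim_lines if max(t) != sum(t) - max(t)]

print(relabelled == nim_lines, len(nim_lines), len(noneuclidean))  # True 35 10
```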
Does this figure look familiar? It should, for this labeling is precisely the one given in
Figure 52 of Section 10, right before the discussion about Kirkman’s schoolgirls.
Remarkably, the thirty-five lines can be partitioned, in 240 different ways, into seven
sets of five lines, with no two of the five intersecting, each set exactly covering the fifteen
points. That is, the thirty-five lines can be arranged as rows in a Kirkman (15,3,1)-design;
they provide solutions to the famous Kirkman schoolgirls problem, with which readers of
[24] will already be familiar. An example is shown in Table 8.
Sun   1 2 3     5 8 13   4 11 15   7 9 14    6 10 12
Mon   1 4 5     3 9 10   2 12 14   7 8 15    6 11 13
Tue   1 6 7     3 8 11   2 13 15   5 9 12    4 10 14
Wed   1 8 9     2 4 6    3 12 15   5 11 14   7 10 13
Thu   1 10 11   2 5 7    3 13 14   4 8 12    6 9 15
Fri   1 12 13   3 4 7    2 9 11    5 10 15   6 8 14
Sat   1 14 15   3 5 6    2 8 10    4 9 13    7 11 12
Table 8: The thirty-five lines of PG(3, F2) form a Kirkman (15,3,1)-design.
The fifteen Kirkman-Fano planes each appear as seven triples, one from each day of the
week. For example, the “cone” 1 6 7 10 11 12 13 is represented by 6 10 12;
6 11 13; 1 6 7; 7 10 13; 1 10 11; 1 12 13; and 7 11 12.
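A short check of Table 8 may be reassuring. The sketch below reads the second Sunday triple as 5 8 13, the triple on those points with nim-sum zero (this reading is needed for Sunday to cover all fifteen points); the dictionary layout is our own.

```python
from itertools import combinations

week = {
    "Sun": [(1, 2, 3), (5, 8, 13), (4, 11, 15), (7, 9, 14), (6, 10, 12)],
    "Mon": [(1, 4, 5), (3, 9, 10), (2, 12, 14), (7, 8, 15), (6, 11, 13)],
    "Tue": [(1, 6, 7), (3, 8, 11), (2, 13, 15), (5, 9, 12), (4, 10, 14)],
    "Wed": [(1, 8, 9), (2, 4, 6), (3, 12, 15), (5, 11, 14), (7, 10, 13)],
    "Thu": [(1, 10, 11), (2, 5, 7), (3, 13, 14), (4, 8, 12), (6, 9, 15)],
    "Fri": [(1, 12, 13), (3, 4, 7), (2, 9, 11), (5, 10, 15), (6, 8, 14)],
    "Sat": [(1, 14, 15), (3, 5, 6), (2, 8, 10), (4, 9, 13), (7, 11, 12)],
}
all_triples = [t for day in week.values() for t in day]

# Every triple is a line of PG(3, F2): its nim-sum is zero.
lines_ok = all(a ^ b ^ c == 0 for a, b, c in all_triples)

# Every day is a parallel class: the five triples cover 1..15 exactly once.
days_ok = all(sorted(p for t in day for p in t) == list(range(1, 16))
              for day in week.values())

# Every pair of points walks together exactly once in the week.
pairs = [frozenset(pr) for t in all_triples for pr in combinations(t, 2)]
pairs_ok = len(pairs) == len(set(pairs)) == 105

# The "cone" 1 6 7 10 11 12 13 contributes one triple to each day.
cone = {1, 6, 7, 10, 11, 12, 13}
cone_rows = {d: [t for t in day if set(t) <= cone] for d, day in week.items()}

print(lines_ok, days_ok, pairs_ok)  # True True True
print(cone_rows["Sun"], cone_rows["Sat"])  # [(6, 10, 12)] [(7, 11, 12)]
```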
Finally, you may recall that we explored a somewhat surprising connexion between
PG(3,F2 ) and the Lehmers’ method of factoring integers by means of quadratic forms in
Section 10.
The Hoffman-Singleton graph
A Moore graph of type v, k is a regular graph of valence v and diameter k with the maximum
possible number, $(v(v-1)^k - 2)/(v-2)$, of vertices. This formula doesn’t make sense if
v = 2, but it tends to the limit 2k + 1 as v approaches 2, and this is the number of
vertices in the valence 2 case. Hoffman & Singleton [83] showed that for diameter 2 there
are at most four such. Their valences are 2 (the pentagon), 3 (the Petersen graph), 7
(the Hoffman-Singleton graph) and possibly 57 (though the existence of this last remains
an unsolved problem). The Hoffman-Singleton graph has 50 vertices and 175 edges, and
like every Moore graph of diameter 2 its shortest cycles are pentagons so that its girth
is 5. Its automorphism group has order $252000 = 2^5 \cdot 3^2 \cdot 5^3 \cdot 7$. It is arc-transitive, that is, it
has an automorphism sending a particular edge to any of its 175 edges with either of 2
orientations. The stabilizer of an oriented edge thus has order 252000/(175 · 2) = 720, and
indeed is isomorphic to S6, as reflected in the following construction of the graph from our
versatile Table 3.
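The arithmetic in this paragraph is easy to reproduce. The function name below is ours; the formula is the vertex count quoted above, with the valence 2 case handled by its limit.

```python
from math import factorial

def moore_vertices(v, k):
    """Vertex count (v(v-1)^k - 2)/(v-2) of a Moore graph; 2k + 1 when v = 2."""
    if v == 2:
        return 2 * k + 1
    return (v * (v - 1) ** k - 2) // (v - 2)

# Diameter 2: pentagon, Petersen, Hoffman-Singleton, and the open case 57.
print([moore_vertices(v, 2) for v in (2, 3, 7, 57)])  # [5, 10, 50, 3250]

vertices, valence = moore_vertices(7, 2), 7
edges = vertices * valence // 2
group_order = 2**5 * 3**2 * 5**3 * 7
edge_stabilizer = group_order // (edges * 2)

print(edges, group_order, edge_stabilizer, factorial(6))  # 175 252000 720 720
```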
To draw the Hoffman-Singleton graph, start with an edge joining vertices which we label
⋆ and g. Label the six other vertices adjacent to ⋆ with the letters a b c d e f and the
other six adjacent to g with the symbols ∞ 0 1 2 3 4 as in Figure 84. The other 36 vertices
Figure 84: How to construct the Hoffman-Singleton graph.
are {xn}, where x runs through the letters a b c d e f and n runs through the symbols
∞ 0 1 2 3 4, and there are the implied adjacencies, for example vertex c2 is adjacent to
vertices c and 2. It remains to insert the other 175 – (1+12+36+36) = 90 edges. Again,
they correspond to our edge-colorings of K6 .
Recall the fifteen swaps of Table 4. They each provide six adjacencies, for example
(ce) (∞2)(30)(41)
provides the six adjacencies
c∞—e2, c2—e∞, c3—e0, c0—e3, c4—e1, c1—e4.
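The construction can be tried out without Table 4 to hand. The sketch below is an assumption-laden stand-in: it recomputes the six totals directly as sets of five disjoint synthemes and numbers them 0 to 5 in the order found, so its labels need not match ∞ 0 1 2 3 4. A swap (xy) is paired with the three pairs of totals whose common syntheme contains the duad xy, which is how we read the fifteen swaps.

```python
from itertools import combinations

letters = "abcdef"
duads = [frozenset(d) for d in combinations(letters, 2)]
synthemes = [frozenset(s) for s in combinations(duads, 3)
             if len(frozenset().union(*s)) == 6]
totals = [frozenset(t) for t in combinations(synthemes, 5)
          if len(frozenset().union(*t)) == 15]

adj = {}
def join(u, v):
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

join("*", "g")
for x in letters:
    join("*", x)
for i in range(6):
    join("g", i)                       # totals are the vertices 0..5
for x in letters:
    for i in range(6):
        join((x, i), x)                # implied adjacencies of vertex xn
        join((x, i), i)

# The remaining 90 edges: two totals share one syntheme; each of its duads xy
# gives the adjacencies x i -- y j and x j -- y i.
for i, j in combinations(range(6), 2):
    (shared,) = totals[i] & totals[j]
    for duad in shared:
        x, y = sorted(duad)
        join((x, i), (y, j))
        join((x, j), (y, i))

n_vertices = len(adj)
n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
regular = all(len(nbrs) == 7 for nbrs in adj.values())
# Moore property: adjacent vertices share no neighbor, nonadjacent share exactly one.
moore = all(len(adj[u] & adj[v]) == (0 if v in adj[u] else 1)
            for u, v in combinations(list(adj), 2))

print(len(synthemes), len(totals), n_vertices, n_edges, regular, moore)
```

If the sketch is faithful to the construction, the last line reports 15 synthemes, 6 totals, 50 vertices, 175 edges, valence 7 throughout, and girth 5 with diameter 2.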
We can also succinctly describe the 6! automorphisms of the graph fixing the edge ⋆—g:
they permute the vertices a b c d e f arbitrarily and the vertices ∞ 0 1 2 3 4 as dictated
by construction.
Other constructions for the Hoffman-Singleton graph are given in [19, §13.1]. Conway
showed us his perspective, which begins with a distinguished vertex rather than an edge.
We’ll choose ⋆ in Figure 84 as this vertex. Its neighbors are the six monads abcdef
and g, and the other neighbors of g are the totals. This suggests that to place all seven
neighbors of ⋆ on an equal footing we should recognize g as a seventh monad and interpret
the other neighbors of an original monad i as the totals on the set of the six other monads,
so that what we before called xn is reinterpreted as the total n with x replaced by g.
Therefore the vertices adjacent to a numbered total n on abcdef are just the totals xn on
abcdefg which differ from it only by a single-letter substitution. In fact this turns out to
be true of any pair of totals, determining all remaining edges of the graph. The resulting
picture of the Hoffman-Singleton graph is Figure 85.
[Figure 85 labels: ⋆ is the “absolute” point; its neighbors are the 7 monads g, f, e, d, c, b, a; each monad joins the six totals on the remaining six monads; two totals obtained by swapping their “misses” are joined, for example g(edcba), which misses f, and f(edcba), which misses g, under the swap f,g.]
Figure 85: Conway’s description of the Hoffman-Singleton graph.
The Steiner system S(5, 6, 12)
The Steiner system S(5, 6, 12) is a set of blocks of 6 elements, hexads, chosen from a set of
12 so that each pentad, or choice of 5 elements from the 12, occurs exactly once in a block.
Hence the number of blocks is $\binom{12}{5} / \binom{6}{5} = 792/6 = 132$.
We use a b c d e f ∞ 0 1 2 3 4 for our 12 elements: in fact abcdef and ∞01234
will be two of the blocks. We get 15 × 6 = 90 blocks that contain four letters and two
numbers, or two letters and four numbers, from the fifteen swaps of Table 4.
For example, the swap
(fb) (∞2)(34)(01)
yields the six blocks
a2cde∞, a4cde3, a1cde0, f01b34, ∞012fb, ∞fb234,
where the pairs of numbers ∞2, 34, 01 have been substituted for the pair of letters fb in
abcdef and, conversely, the letters fb have been substituted for the pairs of numbers in
∞01234.
The other 40 blocks have three letters and three numbers and may be generated in pairs
from the $\binom{6}{3} = 20$ three-cycles of Table 5, by substitutions exchanging three letters and
three digits. That table omits the three-cycles moving the monad a, but all we need here
is the partition of the totals into the two three-cycles that these induce, and this partition
is the same one that arises from the cycles on the other three monads. So for instance the
cycles (bde) and (bed) correspond to the permutations (∞32)(014) and (∞23)(041) while
(acf) and (afc) correspond to (∞23)(014) and (∞32)(041).
For example, the cycle (bef) associated with the permutation (∞30)(214) gives rise
to the four blocks
a∞cd30, a1cd42, ∞0bf3e, bf12e4.
How do we know that each pentad occurs exactly once? If a pentad consists
of 5 letters, or 5 numbers, then the hexad is abcdef or ∞01234. If it consists of 4 letters
and a number n the hexad will contain a second number. This is found in Table 4 which
displays all $\binom{6}{2} = 15$ swaps of two vertices. Select the swap of the two letters which are not
in the pentad and take the number paired with n. For example, given the pentad acef3,
look at the entry (db) (∞1)(24)(30) where 3 is paired with 0, so that the pentad belongs
to the unique hexad a3c0ef. If the pentad contains 4 numbers and a letter, for example,
∞024b, find the entries of Table 4 that contain the missing numbers 13, namely (ad),
(cf), (be). Here b is paired with e, so the hexad is ∞0b2e4. If the pentad contains
3 letters and 2 numbers, or 3 numbers and 2 letters, we use Table 5. For example, for
bcf23 we find (fbc) (∞41)(203) so that the hexad is completed with 0. But if the pentad
were bcf24, with 2 and 4 in different triples, the hexad must be completed with a letter.
In Table 4 the pair (24) occurs in the swaps (ae), (bd) and (fc), so the missing letter is
d: 2bcd4f.
If the pentad were bcf02, then, in Table 4, (02) occurs in (fd), (be), and (ac),
none of which are in bcf, so we need a number rather than a letter to complete the hexad.
Table 5 gives (fbc) (∞14)(023) so that the missing number is 3.
A (12,132,4) binary code and the ternary Golay code C12
In a binary code, the letters of the codewords are zeroes and ones. The number of letters
in a codeword is its length and the number of ones is its weight. The 132 hexads of the
Steiner system S(5, 6, 12) form a basis for a binary code with words of length 12 and weight
6.
The blocks of the Steiner system indicate which letters of the 12-letter codewords are
occupied by six ones or by six zeroes. In anticipation of the construction of the ternary
Golay code C12 we will put the letters in the order
a 01234 ∞ bcdef
and, for ease of reading, we will leave space round the 1st and 7th letters.
For example, our initial blocks abcdef and ∞01234 correspond to the codewords 1
00000 0 11111 and 0 11111 1 00000; the blocks
a2cde∞, a4cde3, a1cde0, and their complements f01b34, ∞012fb, ∞fb234
correspond respectively to the codewords
1 00100 1 01110, 1 00011 0 01110, 1 11000 0 01110,
and their complements
0 11011 0 10001, 0 11100 1 10001, 0 00111 1 10001,
while the blocks
∞b30ef, 1b24ef, ad12c4, ∞0ac3d
correspond to
0 10010 1 10011, 0 01101 0 10011, 1 01101 0 01100, 1 10010 1 01100.
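The translation from blocks to codewords is mechanical, and a few lines suffice to reproduce the examples. The function name and the spacing convention in the output string are our own choices; the coordinate order a 01234 ∞ bcdef is the one fixed above.

```python
ORDER = "a01234∞bcdef"   # coordinate order used for the codewords

def codeword(block):
    """Binary word of length 12 with ones at the symbols of the block."""
    bits = "".join("1" if symbol in block else "0" for symbol in ORDER)
    # Leave space round the 1st and 7th letters, as in the text.
    return f"{bits[0]} {bits[1:6]} {bits[6]} {bits[7:]}"

def distance(u, v):
    """Hamming distance between two words written in the spaced format."""
    return sum(x != y for x, y in zip(u, v))

for block in ("abcdef", "∞01234", "a2cde∞", "f01b34", "∞b30ef", "ad12c4"):
    print(block, codeword(block))

print(distance(codeword("a2cde∞"), codeword("a4cde3")))  # 4
```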
Each codeword differs from every other in at least four places, that is, the Hamming
distance between any two words is at least 4.
Suppose that you received a codeword 0 01101 1 10101. This contains seven ones, so
there is an error. Assume that the zeroes are correct. They correspond to the pentad
a03ce. Table 5 has the (complementary to a c e) entry (fbd) (∞04)(231); 0 and 3 are
in different triples, so the missing element is a letter. In Table 4 the pair (30) occurs in
(db), (ec), and (af), so that the missing letter is f and the final 1 in the erroneous
codeword should have been 0, making it 0 01101 1 10100.
We can pass from this binary code to a ternary code, which we now present in outline.
To incorporate the words of our binary code into a ternary code we will leave the zeroes
as they are and endow the ones with signs. With a correct choice of signs the resulting
132 words of length 12 can be made to generate by addition a linear code of dimension
6, that is a 6-dimensional subspace of the ambient vector space $\mathbb{F}_3^{12}$ over the finite field
$\mathbb{F}_3 = \{-1, 0, +1\}$. Our code will thus contain $3^6 = 729$ codewords.
Aside from the zero word 0 00000 0 00000, the words will come in pairs of opposite
sign. In fact, we will obtain no nonzero codewords with more zeroes than the signed
manifestations, two apiece, of our 132 words from the binary code. So the minimal distance
of our code will increase to 6. The resulting code is known as the ternary Golay code and
denoted as C12 .
From [37, p.85] we learn that C12 may be obtained by appending a zero-sum check digit
to C11, the quadratic residue code of length 11 over F3; that a generator matrix is
\[
\begin{array}{rrrrrr|rrrrrr}
a & 0 & 1 & 2 & 3 & 4 & \infty & b & c & d & e & f \\ \hline
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & -1 & -1 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 1 & 0 & 1 & -1 & -1 \\
0 & 0 & 0 & 1 & 0 & 0 & -1 & -1 & 1 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & -1 & -1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 1 & -1 & -1 & 1 & 0
\end{array}
\]
that it has weight enumerator
\[
x^{12} + 264x^6y^6 + 440x^3y^9 + 24y^{12},
\]
that is, it contains 1, 264, 440 and 24 words with respectively 0, 6, 9 and 12 nonzero letters,
and that its automorphism group is 2.M12, that is, it has a central subgroup of order 2
(generated by negation) whose quotient is the Mathieu group M12.
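The weight enumerator can be confirmed by brute force, since the code has only 729 words. The sketch below takes the generator matrix as the identity followed by the six rows 0 1 1 1 1 1, −1 0 1 −1 −1 1, and so on (our reading of the matrix), and also checks that the supports of the words of weight 6 form a Steiner system S(5, 6, 12). Coordinates are numbered 0 to 11 here rather than labelled a 0 1 2 3 4 ∞ b c d e f.

```python
from collections import Counter
from itertools import combinations, product

B = [
    [0, 1, 1, 1, 1, 1],
    [-1, 0, 1, -1, -1, 1],
    [-1, 1, 0, 1, -1, -1],
    [-1, -1, 1, 0, 1, -1],
    [-1, -1, -1, 1, 0, 1],
    [-1, 1, -1, -1, 1, 0],
]
G = [[int(i == j) for j in range(6)] + B[i] for i in range(6)]

# All F3-linear combinations of the six rows (entries reduced to 0, 1, 2).
words = {tuple(sum(c * row[k] for c, row in zip(coeffs, G)) % 3 for k in range(12))
         for coeffs in product((0, 1, 2), repeat=6)}

def weight(word):
    return sum(1 for x in word if x)

weights = Counter(weight(w) for w in words)
hexads = {frozenset(k for k, x in enumerate(w) if x) for w in words if weight(w) == 6}

# Steiner property: every 5-subset of the 12 coordinates lies in exactly one hexad.
steiner = all(sum(1 for h in hexads if set(p) <= h) == 1
              for p in combinations(range(12), 5))

print(len(words), sorted(weights.items()))  # 729 [(0, 1), (6, 264), (9, 440), (12, 24)]
print(len(hexads), steiner)                 # 132 True
```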
But by now we’ve roved far enough from the Tricky Six puzzle, so we pursue codes no
further and turn to the
Conclusions
Our favorite for an actual puzzle changes C into W and T into D, turning RECANT into
REWAND. Manoeuvre #1 of the “Not much of a puzzle” section then gives Figure 86, a
version of which appeared on the cover of the April 2009 issue of Mathematics Magazine.
It should be read clockwise, starting from twelve o’clock. The solution: move the letters R
E W A R E W A R E and read clockwise from noon again.
Finally, due to the many connexions between the Tricky Six puzzle and graph theory,
combinatorial designs, finite geometries, error-correcting codes, and finite groups, we feel
confident in changing the status of the Tricky Six puzzle from “Not much of a puzzle” to
“Very much of a puzzle!”
Figure 86: Solution to all our problems?
18 S(5, 8, 24)
This section is in the file called miracle.tex.
In the last part of Section 16, we introduced the notion of a multiply transitive
permutation group and talked about Émile Mathieu’s study of such groups, which he
published in [109] and [110]. It was there that he described the five remarkable groups
we know as M11, M12, M22, M23, and M24. He showed that M11 and M23 are quadruply transitive and that M12 and M24 are quintuply transitive; we saw that the only other groups
of those degrees of transitivity are the symmetric groups and the alternating groups. Ernst
Witt’s 1938 paper [175] was the first to contain a description of that most singular of
combinatorial objects, the Steiner design S(5, 8, 24), whose automorphism group is M24 .
In Section 16, we gave a brief description of how to construct the Steiner system
S(5, 8, 24) as the set of codewords of Hamming weight 8 in the Extended Golay Code G24 .
We continue our journey discovering the unity of combinatorics with a detailed exploration
of the inner structure of S(5, 8, 24).
Elementary properties of S(5, 8, 24).
Suppose someone gives you a general description of some structure. However, you are
not provided with an example, and you are not certain that one exists. How do you
proceed? In this subsection, we go into considerable detail about the inner structure
of S(5, 8, 24), knowing only the definition. Our discussion follows the one found in Ian
Anderson’s excellent little book on combinatorics [2, pp. 105ff.].
In Section 16, we gave a very brief construction of S(5, 8, 24) in terms of the extended
Golay code G24 and proved a general theorem about restrictions placed on the parameters
p, q, and r. For the special case p = 5, q = 8, and r = 24, let Ω be the set of 24 varieties on
which S(5, 8, 24) is formed. Then we have the S(5, 8, 24) Restriction Theorem.
1. Ω contains $\binom{24}{5} / \binom{8}{5} = 759$ 8-element subsets, which we call octads.
2. Every element of Ω belongs to $\binom{23}{4} / \binom{7}{4} = 253$ octads.
3. Every pair of elements of Ω belongs to $\binom{22}{3} / \binom{6}{3} = 77$ octads.
4. Every triple of elements of Ω belongs to $\binom{21}{2} / \binom{5}{2} = 21$ octads.
5. Every tetrad (four-element set) of elements of Ω belongs to $\binom{20}{1} / \binom{4}{1} = 5$ octads.
6. Every quintuple of elements of Ω belongs to $\binom{19}{0} / \binom{3}{0} = 1$ octad.
The last of these is not at all surprising – it is part of the definition of S(5, 8, 24) – but it is
comforting to know that the Restriction Theorem tells us this!
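The six numbers follow one pattern: an i-element set lies in $\binom{24-i}{5-i} / \binom{8-i}{5-i}$ octads. A short sketch (the function name and default parameters are ours) reproduces the list:

```python
from math import comb

def blocks_through(i, t=5, k=8, v=24):
    """Number of blocks of an S(t, k, v) containing a given i-element set, i <= t."""
    return comb(v - i, t - i) // comb(k - i, t - i)

counts = [blocks_through(i) for i in range(6)]
print(counts)  # [759, 253, 77, 21, 5, 1]

# The same function gives the 132 hexads of S(5, 6, 12).
print(blocks_through(0, t=5, k=6, v=12))  # 132
```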
The next step is to understand how octads behave in terms of intersection and complement. Part of our work is to introduce a certain group G in which intersection and
complementation play a large role. Specifically, let G = P(Ω) be the power set of the
24-element set of varieties Ω. If X, Y ∈ G, let X ⊕ Y denote the symmetric difference of
X and Y, i.e. (X ∪ Y) − (X ∩ Y), and let X′ denote the complement Ω − X of X in Ω.
Then it is easy to see that ⊕ is both associative and commutative, that G is closed under
both ⊕ and complementation, that ∅ is the identity element of G, and that X ⊕ X = ∅. In
short, (G, ⊕) is an abelian group of order $2^{24}$. We will construct a certain subgroup G0 of
G of order $2^{12}$ containing the 759 octads of the S(5, 8, 24).
Now, the size of the intersection of two 8-element subsets of a fixed set can certainly
be any number from zero through eight. Octads, however, are not ordinary 8-element sets,
and they have the following non-obvious properties.
1. Two distinct octads can only intersect in zero, two, or four elements.
2. If two octads are disjoint, then the complement of their union is an octad.
3. If two octads intersect in four elements, then their symmetric difference is an octad.
To see why, let E and F be distinct octads. First, if |E ∩ F | ≥ 5, then there would be
a quintuple of elements that belongs to more than one octad, contrary to the definition of
S(5, 8, 24). Hence, distinct octads cannot intersect in five or more elements.
Next, we show that |E ∩ F| ≠ 3. Let E = {x1, x2, x3, x4, x5, x6, x7, x8} and consider the set {x1, x2, x3}. There are exactly five tetrads of E containing {x1, x2, x3},
so by the S(5, 8, 24) Restriction, each of these tetrads is contained in exactly four octads
other than E. Thus, there are 20 octads other than E which contain {x1, x2, x3, xj} for
some xj ∈ E − {x1, x2, x3}.
We claim that these 20 octads are all distinct. Suppose not; then there is an
octad G ≠ E containing both {x1, x2, x3, xj} and {x1, x2, x3, xk} with xj ≠ xk. Thus G contains
{x1, x2, x3, xj, xk} – but so does E, contrary to the restriction that each quintuple lies in a
unique octad. Hence those 20 octads are distinct and so there are 21 octads – including E –
that contain {x1, x2, x3} and some other element of E. Since {x1, x2, x3} lies in exactly 21
octads altogether, it follows that if F is an octad that
contains {x1, x2, x3}, then it must also contain some other element of E. Hence distinct
octads cannot intersect in exactly three elements.
We now show that |E ∩ F| ≠ 1 for distinct octads E and F. For a fixed element
x1 ∈ E, there are $\binom{7}{3} = 35$ tetrads of E which contain x1. Furthermore, by the S(5, 8, 24)
Restriction, each of these tetrads is contained in exactly four octads besides E, making
35 · 4 = 140 octads other than E which contain {x1, xi, xj, xk} for some xi, xj, xk ∈ E − {x1}.
By an argument similar to the above, we deduce that these 140 octads must be distinct.
Now pick x2 ∈ E, different from x1. Again by the Restriction, {x1, x2} is contained in 76
octads other than E. But note that $\binom{6}{2} \cdot 4 = 60$ of these octads contain a tetrad of the
form {x1, x2, xj, xk} for some xj, xk ∈ E − {x1, x2}. As before, these 60 octads are distinct
and are numbered among the 140 octads just mentioned.
Now, two octads cannot intersect in exactly three elements, so the remaining 76 − 60 = 16
octads containing {x1, x2} cannot contain any other elements of E. It follows that there
are 16 distinct octads that intersect with E in the set {x1, x2}, and for each i > 2, there
are 16 distinct octads that intersect with E in the set {x1, xi}. This gives us 7 · 16 = 112
distinct octads. Thus, we have found 140 + 112 + 1 = 253 octads that contain x1 and
another element of E. By the Restriction, no other octad can contain x1, and so a pair
of octads must either be disjoint or have at least two elements in common. Thus, distinct
octads cannot intersect in exactly one element.
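The bookkeeping in this argument can be replayed numerically. The first part of the sketch below follows the text step by step; the last three lines go slightly further than the text and, using the same counts, tally how many octads meet a fixed octad E in exactly four, two, or no elements. The variable names are ours.

```python
from math import comb

# lam[i] = number of octads containing a given i-element set.
lam = [comb(24 - i, 5 - i) // comb(8 - i, 5 - i) for i in range(6)]

others_per_tetrad = lam[4] - 1                     # 4 octads besides E
with_three_more = comb(7, 3) * others_per_tetrad   # contain x1 and 3 more points of E
through_pair = lam[2] - 1                          # octads besides E through {x1, x2}
pair_plus_two = comb(6, 2) * others_per_tetrad     # of those, meet E in four points
exactly_pair = through_pair - pair_plus_two        # meet E in {x1, x2} only
total_through_x1 = with_three_more + 7 * exactly_pair + 1

print(with_three_more, through_pair, pair_plus_two, exactly_pair)  # 140 76 60 16
print(total_through_x1 == lam[1])                                  # True

# Extension (not in the text): octads meeting E in exactly 4, 2 or 0 elements.
meet_four = comb(8, 4) * others_per_tetrad
meet_two = comb(8, 2) * exactly_pair
disjoint = lam[0] - 1 - meet_four - meet_two
print(meet_four, meet_two, disjoint)                               # 280 448 30
```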
We notice that if E and F are disjoint octads, then E ⊕ F = E ∪ F and (E ⊕ F)′ is an
8-element set. Does that make (E ⊕ F )′ an octad? Similarly, if |E ∩ F | = 4 then E ⊕ F is
an 8-element set; does that make E ⊕ F an octad? We are in luck: the relevant 8-element
set is an octad in both cases.
For, suppose E and F are disjoint, and suppose (E ⊕ F)′ = {y1, y2, y3, y4, y5, y6, y7, y8}
is not an octad. We know that {y1, y2, y3, y4, y5} is contained in a unique octad, say, D.
We see that D ∩ (E ⊕ F) = (D ∩ E) ∪ (D ∩ F), and as |D ∩ E| and |D ∩ F| are both even,
so is |D ∩ (E ⊕ F)|. As (E ⊕ F) and (E ⊕ F)′ are disjoint with union Ω and |D| = 8, it follows that |D ∩ (E ⊕ F)′| is
also even. The only possibilities for the size of this intersection are 6 and 8. If it is 8, then
the two sets are equal, contrary to the assumption that (E ⊕ F)′ is not an octad. Thus,
|D ∩ (E ⊕ F)′| = 6, and we may assume that D contains y6 but that y7, y8 ∉ D.
By the definition of S(5, 8, 24), {y1, y2, y3, y4, y7} is contained in a unique octad, say,
H. Then |H ∩ (E ⊕ F)′| = 6. If y5 ∈ H or y6 ∈ H, then D = H by previous results. Hence, H contains y8 and neither y5 nor y6; we have {y1, y2, y3, y4, y5, y6} ⊆ D and
{y1, y2, y3, y4, y7, y8} ⊆ H. Once again, by definition {y1, y2, y3, y5, y7} belongs to a unique
octad K, whose intersection with the complement of E ⊕ F contains exactly six elements.
But by another element-by-element argument, this cannot happen, and we conclude that
(E ⊕ F)′ is an octad.
The proof that if |E ∩ F| = 4, then E ⊕ F is an octad is similar, and we omit it.
A dodecad is a 12-element subset of Ω that can be written as the symmetric difference
of two octads that intersect in exactly two elements. It turns out that the complement of a
dodecad is also a dodecad, no dodecad contains an octad, and it can be shown that there are
exactly 2576 dodecads in the group G0 generated by the octads under symmetric difference.
Including the base set Ω, the empty set, the 759 octads, the 759 octad complements, and
the 2576 dodecads, we do the addition and find that |G0| = 1 + 1 + 759 + 759 + 2576 = 4096 = $2^{12}$. This squares with
our previous description of S(5, 8, 24) as the row space of the 24-dimensional Extended
Golay Code G24 .
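These counts can be observed in a concrete model. The sketch below does not use the generator matrix of Section 16; as a stand-in it builds an extended Golay code from the cyclic code of length 23 generated by x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1 (a standard generator polynomial, chosen by us) and appends an overall parity bit. Subsets of Ω are stored as 24-bit integers, so ⊕ is the bitwise XOR.

```python
from collections import Counter
from itertools import combinations

G_POLY = 0b110001110101   # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1 over GF(2)

def clmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks (carry-less multiply)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def weight(word):
    return bin(word).count("1")

def extend(word):
    """Append an overall parity bit in position 23."""
    return word | ((weight(word) & 1) << 23)

# All multiples m(x) g(x) with deg m < 12, extended to length 24.
G0 = {extend(clmul(m, G_POLY)) for m in range(1 << 12)}

weights = Counter(weight(w) for w in G0)
octads = [w for w in G0 if weight(w) == 8]
intersection_sizes = {weight(a & b) for a, b in combinations(octads, 2)}

print(len(G0), sorted(weights.items()))
# 4096 [(0, 1), (8, 759), (12, 2576), (16, 759), (24, 1)]
print(sorted(intersection_sizes))   # [0, 2, 4]
```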
We now describe the famous 24-dimensional Leech lattice. An n-dimensional lattice is
a set of vectors in real n-dimensional vector space $\mathbb{R}^n$ that forms an abelian group under
vector addition. The Leech lattice Λ is a 24-dimensional lattice constructed from the row
space of the 12 × 24 matrix of G24 – that is, from the Golay code – as follows. The Leech
lattice is the set of integer vectors $(x_1, \ldots, x_{24}) \in \mathbb{R}^{24}$ that, for some integer m, satisfy
the following three properties:
• $x_i \equiv m \pmod 2$ for every i;
• $((x_1 - m)/2 \bmod 2, \ldots, (x_{24} - m)/2 \bmod 2)$ is a vector in the Golay code G24; and
• $\sum_{i=1}^{24} x_i \equiv 4m \pmod 8$.
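The three conditions translate directly into a membership test. The sketch below is illustrative only: as a stand-in for G24 it uses the extended cyclic code generated by x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1 (our choice of model, with coordinate i stored in bit i), and the function name is ours.

```python
G_POLY = 0b110001110101   # assumed generator polynomial of the length-23 Golay code

def clmul(a, b):
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def extend(word):
    return word | ((bin(word).count("1") & 1) << 23)

GOLAY = {extend(clmul(m, G_POLY)) for m in range(1 << 12)}

def in_leech(x):
    """Test the three defining conditions for an integer vector x of length 24."""
    if len(x) != 24:
        return False
    m = x[0] % 2
    if any(xi % 2 != m for xi in x):                 # all coordinates share parity m
        return False
    word = sum((((xi - m) // 2) % 2) << i for i, xi in enumerate(x))
    return word in GOLAY and (sum(x) - 4 * m) % 8 == 0

octad = next(w for w in sorted(GOLAY) if bin(w).count("1") == 8)
on_octad = [2 if (octad >> i) & 1 else 0 for i in range(24)]

print(in_leech([4, 4] + [0] * 22))   # True
print(in_leech(on_octad))            # True
print(in_leech([-3] + [1] * 23))     # True
print(in_leech([2] + [0] * 23))      # False: a single 1 is not a Golay codeword
print(in_leech([1] * 24))            # False: the sum is 24, not 4 modulo 8
```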
A recent article in the AMS Notices [32] describes the conceptual breakthroughs made
by Maryna Viazovska on sphere packing problems. She has proved that the root lattice of
the exceptional Lie algebra E8 gives the densest possible packing of spheres in dimension
8. In addition, she and Cohn and others have extended her work, proving that the Leech
lattice Λ gives the densest possible packing of spheres in dimension 24.
We are now ready to describe an object constructed from an S(5, 8, 24) that (a) looks
like a patchwork quilt, (b) takes any 5-subset Q of the 24-element set as input, (c) produces
the unique octad that contains Q as output, and (d) combines both program and data in
a beautiful picture. Let’s meet the Miracle Octad Generator.
19 The Miracle Octad Generator
This section is in the file MOGchapter.tex. It is a modification by Rob Curtis of his paper
“A new combinatorial approach to M24 ”, Math. Proc. Camb. Phil. Soc. 79 (1976), 25-42.
We are grateful for his kind permission to use this work as a fitting end to our combinatorial
journey.
In the previous section, we described something about the inner structure of a Steiner
system S(5, 8, 24), knowing only its definition – and miraculously, such a system does exist
and it is unique up to relabelling its points. The 759 8-element subsets in such a system
are called special octads or simply octads. Moreover it is a remarkably symmetrical object.
Indeed, if S denotes a collection of 759 octads of the 24-element set Ω that form a Steiner
system S(5, 8, 24) then there are 244,823,040 rearrangements of the 24 points of Ω which
preserve S. These rearrangements form the elements of the famous Mathieu group M24
which is of great interest in its own right but which is also involved in a key way in many of
the most exceptional finite simple groups, the so-called sporadic simple groups. It is clear
from the definition of S(5, 8, 24) that every octad is uniquely determined by its 5-element
subsets. But it goes further than that. Suppose you pick a 5-element subset T at random
from Ω. Then the octad O containing T is uniquely determined – but is there a way to find
O if all you know is T? The answer is yes, and the rest of this section is devoted to both
the description of an object that can do this and to the procedure for using this object to
find O, given T. The object is called the Miracle Octad Generator, or MOG for short, and
it is with this miracle that we conclude our combinatorial journey.
[Figure 87: of the 36 pictures only the standard labelling of the 24 points survives, as the 4 × 6 array]
∞  0   11  1   22  2
3 19    4 20   18 10
6 15   16 14    8 17
9  5   13 21   12  7
Figure 87: The Miracle Octad Generator
The Miracle Octad Generator (MOG)
The MOG may be regarded as a picture of the Steiner system S(5, 8, 24). More explicitly
it is an arrangement of the 24 points of Ω into a 4 × 6 array in which the 759 octads are
immediately recognisable. Indeed, given any subset of 5 points of Ω it is relatively easy, as
we shall see, to complete them to the unique octad containing them. Because of this we
can readily write down permutations of the group M24 preserving S and can thus calculate
within M24 and other groups in which it is contained.
Suppose that O is an octad of the Steiner system S and let Ω \ O = U denote the
complementary subset of size 16, the complementary 16-ad; then O may be partitioned
into two 4s in $\binom{8}{4}/2 = 35$ ways. Each of these partitions, O = O1 ∪ O2 say, will determine
a partition of U into four 4s, U = U1 ∪ U2 ∪ U3 ∪ U4 say, such that Oi ∪ Uj is an octad of
the system for i = 1 or 2 and j = 1, 2, 3 or 4. Underlying the MOG is the fact that the
35 × 4 subsets of U produced in this way form the tetrads of a Steiner system S(3, 4, 16).
[Note that $\binom{16}{3} / \binom{4}{3} = 35 \times 4 = 140$.] So the MOG is essentially a correspondence between
the partitions of a set of size 8 into two 4s and the tetrads of a Steiner system S(3, 4, 16)
grouped into sets of size 4.
The MOG is displayed in Figure 87; the figure in the first column and second row gives
the standard labelling of the 24 points of Ω, shown more clearly in Figure 91, and the
remaining 6 × 6 − 1 = 35 pictures display a partition of the lefthand 8-element subset or
brick into two 4s, together with the corresponding partition of the 16-ad into four special
tetrads.
19.1 An elementary approach
Suppose we wish to place four apples into 4 buckets so that either (i) every bucket has an
even number of apples in it or (ii) every bucket has an odd number of apples in it. That is,
the numbers of apples in each bucket have the same parity. Then we must have 4 apples
in one bucket and none in the other three (4.03 ) or 22 .02 or 14 . So if 3 apples have already
been placed in the buckets and we wish to place the 4th so that parity is preserved, then
it is clear where the 4th apple should be placed: if the 3 apples are all in one bucket then
the 4th must also be placed in that bucket; if there are 2 apples in one bucket, 1 in another
bucket and none in the other two then the 4th should be placed in the bucket containing
one apple; and if 3 of the buckets have one apple each in them, then the 4th apple should
be placed in the bucket which was empty. Thus $3.0^3 \to 4.0^3$; $2.1.0^2 \to 2^2.0^2$; $1^3.0 \to 1^4$.
Imagine now a 2-dimensional version of this process and suppose we have a 4 × 4 array in
which 3 crosses have been inserted, and we wish to insert a 4th cross so that the parity in
the columns and the parity in the rows is preserved. In the first example of Figure 88 we
see that the three crosses hit the columns as 1.1.0.1 and so the 4th cross should go in the
3rd column. They hit the rows as 2.1.0.0 and so the 4th cross should go in the 2nd row.
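The parity rule for apples, and its 2-dimensional version for crosses, can be sketched as follows. This is an illustrative Python sketch: the function names are hypothetical, and the three cross positions in the usage line are assumed, chosen only so that they hit the columns as 1.1.0.1 and the rows as 2.1.0.0 as in the first example (the actual positions in Figure 88 are not reproduced here).

```python
def parity_slot(counts):
    """Return the unique index at which one more item gives all counts the same parity."""
    slots = [
        i for i in range(len(counts))
        if len({(c + (j == i)) % 2 for j, c in enumerate(counts)}) == 1
    ]
    if len(slots) != 1:
        raise ValueError("no unique parity-preserving slot")
    return slots[0]


def fourth_cross(crosses):
    """Given three (row, column) cells of a 4 x 4 array, return the cell of the 4th cross."""
    rows = [sum(1 for r, _ in crosses if r == i) for i in range(4)]
    cols = [sum(1 for _, c in crosses if c == i) for i in range(4)]
    return parity_slot(rows), parity_slot(cols)


# Assumed positions (0-indexed): rows hit as 2.1.0.0, columns as 1.1.0.1.
print(fourth_cross([(0, 0), (0, 1), (1, 3)]))  # (1, 2): 2nd row, 3rd column
```

The three bucket cases from the text correspond to parity_slot([3, 0, 0, 0]), parity_slot([2, 1, 0, 0]) and parity_slot([1, 1, 1, 0]).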
Figure 88: Tetrads of the S(3, 4, 16) and the corresponding octads.
Similarly in the second example we see that the missing cross must go in the 4th column
and the 2nd row. If we look in the MOG diagram Figure 87 we see that the first special
tetrad appears as the black dots in the picture in the 2nd column and the 4th row; thus
the 8-element subset indicated by crosses in the diagram below it in Figure 88 is an octad. Similarly
the second special tetrad found in Figure 88 appears as the black squares in the picture in
the 2nd row and 3rd column of the MOG. These two pictures are reproduced in Figure 89,
where the partition of the 16-ad is shown by black squares, white squares, black dots and
white dots. The set of four special tetrads defined by a given special tetrad is obtained by
applying permutations which either (i) interchange the rows in two pairs while fixing the
columns, or (ii) interchange the columns in two pairs while fixing the rows. For instance,
interchanging the 1st and 4th, and 2nd and 3rd rows while keeping the columns fixed takes
the white dots of the first picture in Figure 89 to the white dots. Any one of these special
tetrads together with either the black squares or the white squares in the left hand brick
is an octad.
However, the additional symmetry which makes the MOG such a useful device is the
following. The MOG pictures should not be viewed as a brick on the left and the complementary 16-ad on the right, but as 3 bricks Λ1 , Λ2 and Λ3 laid side by side, as shown in
Figure refoctads2. The brick which contains 4 points can be any one of the three, when
the complementary 16-ad is simply the union of the other two bricks. Permutations which
bodily rearrange the 3 bricks preserve the Steiner system. In Figure 89 we show a number
of octads which correspond to the same MOG picture. Note that the vast majority of the
759 octads will have 4 points in one of the 3 bricks (we call this the heavy brick), and 2
points in each of the other 2 bricks. Some easily recognised octads will have 4 points
in each of two bricks, but these can still be viewed as a special tetrad in one of the 3
16-ads together with one of the tetrads in the complementary brick. Of course the 3 bricks are
themselves octads of the system.
Figure 89: Octads corresponding to a particular MOG picture.
We are now in a position, given any 5 points of Ω, to find
the unique octad containing them.
• If the 5 points all lie in one of the bricks, then the required octad is that brick.
• If 4 of the points lie in a brick then the required octad consists of those 4 points
together with the corresponding special tetrad in the 16-ad containing the 5th point.
• If there are 3 points in one of the bricks (and so 2 in the complementary 16-ad) then
these 2 points must be completed to a special tetrad in the 16-ad which corresponds
to a partition of the octad in which the 3 points have the same colour (are all white
or all black in the MOG).
• Finally, the 5 points cut the three bricks 2.2.1. If the two 2s form a special tetrad
then we simply complete the 5th point to the corresponding tetrad. Otherwise there
are two possibilities: we complete each of the pairs plus the fixed point to a special
tetrad in the 16-ad in which they lie, and then see whether the other pair of points
have the same colour in the complementary brick. In one case they will and in the
other they won’t.
Figure 90: Completing 5 points to an octad (the three bricks are labelled Λ1, Λ2, Λ3).
Figure 90 contains five examples of finding an octad from 5 of its points. In the first
example the 5 points cut across the 3 bricks as 1.2.2 and the two 2’s clearly do not form a
special tetrad in the 16-ad Λ2 ∪ Λ3. Thus the heavy brick is either Λ2 or Λ3. We complete
the 3 points in Λ1 ∪ Λ2 to a special tetrad, which we locate in the fourth row of the first
column. We observe that the 2 points in Λ3 do indeed have the same colour in the brick
of this picture, and so the required octad is as shown.
In the second example the heavy brick is clearly Λ2 . It is perhaps easier to consider the
7 ways in which the pair of points in Λ1 can be extended to a special tetrad in Λ1 ∪ Λ3 ;
these are seen in the bottom four pictures of the first column and the bottom four pictures
in the fourth column. The unique picture in which the 3 points in Λ2 have the same colour
in the brick is second from the bottom in the first column.
∞ 0 11 1 22 2
3 19 4 20 18 10
6 15 16 14 8 17
9 5 13 21 12 7
Figure 91: The standard labelling with the points of the projective line P1 (23).
(0 0 0 0)  (1 0 0 0)  (0 1 0 0)  (1 1 0 0)
(0 0 1 0)  (1 0 1 0)  (0 1 1 0)  (1 1 1 0)
(0 0 0 1)  (1 0 0 1)  (0 1 0 1)  (1 1 0 1)
(0 0 1 1)  (1 0 1 1)  (0 1 1 1)  (1 1 1 1)
Figure 92: The 16 vectors of V.
19.2 A more mathematical approach
19.2.1 The exceptional isomorphism A8 ≅ L4(2)
Let F ≅ Z2, the integers modulo 2, be a field with two elements 0 and 1, where 1 + 1 =
0, and let V be a 4-dimensional vector space over F. If u, v, w are three distinct
vectors of V, then the set {u, v, w, u + v + w} is a coset of a 2-dimensional subspace of
V, namely W + u where W = ⟨u + v, u + w⟩ = {0, u + v, u + w, v + w}. There are
$(2^4 - 1)(2^4 - 2)/((2^2 - 1)(2^2 - 2)) \times 4 = 35 \times 4 = 140$ such sets of 4 vectors, which form a Steiner
system S = S(3, 4, 16). Indeed
S = {{u, v, w, t} | u, v, w, t distinct, u + v + w + t = 0}.
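The description of S as the set of zero-sum 4-subsets can be verified by brute force. In the sketch below (illustrative Python, with an assumed encoding of the vectors of V as the integers 0 to 15 so that vector addition over F becomes bitwise XOR), we enumerate the blocks and confirm that every 3-subset lies in exactly one of them.

```python
from itertools import combinations

vectors = range(16)  # assumed encoding: integer bits are the coordinates over F

# Blocks: sets of 4 distinct vectors with u + v + w + t = 0.
blocks = [b for b in combinations(vectors, 4) if b[0] ^ b[1] ^ b[2] ^ b[3] == 0]

# Steiner property: count how many blocks contain each 3-subset.
cover = {}
for b in blocks:
    for triple in combinations(b, 3):
        cover[triple] = cover.get(triple, 0) + 1

print(len(blocks))                      # 140
print(len(cover), set(cover.values()))  # 560 {1}
```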
Let U, W be distinct 2-dimensional subspaces of V , and let
V /W = {W + v | v ∈ V } = {W, W + v1 , W + v2 , W + v3 }
for suitable v1, v2, v3 ∈ V. If U ∩ W = {0} then U has one vector in each of the 4 cosets in
V/W; we say U cuts V/W as $1^4$. In this case the coset U + t also cuts V/W as $1^4$ (note
that V/W = {W + v | v ∈ V} = {W + v + t | v ∈ V}). If U ∩ W = {0, z}, z ≠ 0, then
U ∩ (W + u) = {u, u + z}, for u ∈ U \ W, and so U has 2 vectors in each of two cosets of
W and 0 in the other two; i.e. U cuts V/W as $2^2.0^2$. As above, the coset U + t will also
cut V/W as $2^2.0^2$.
Thus if we arrange the 16 vectors of V as a 4 × 4 array, as in Figure 92, such that
the columns are the cosets of one 2-dimensional subspace and the rows are the cosets of
another, then every coset of every 2-dimensional subspace of V cuts every row with the
same parity, and cuts every column with the same parity. We see that the 35 partitions of
the 16 vectors into cosets of a 2-dimensional subspace correspond to the 35 pictures in the
MOG.
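The parity claim can be checked exhaustively. The sketch below is illustrative Python under an assumed layout: vectors are encoded as integers 0 to 15, the two high bits select the row and the two low bits select the column, so that rows and columns are the cosets of two complementary 2-dimensional subspaces. The helper names are hypothetical.

```python
from itertools import combinations

def row(v):
    return v >> 2   # assumed: rows are the cosets of {0, 1, 2, 3}

def col(v):
    return v & 3    # assumed: columns are the cosets of {0, 4, 8, 12}

# All 2-dimensional subspaces {0, a, b, a + b} of V.
subspaces = {frozenset({0, a, b, a ^ b}) for a, b in combinations(range(1, 16), 2)}

def cut(coset, key):
    """How many vectors of the coset fall in each of the 4 rows (or columns)."""
    counts = [0, 0, 0, 0]
    for v in coset:
        counts[key(v)] += 1
    return tuple(sorted(counts, reverse=True))

# Collect every cut pattern over all cosets of all 2-dimensional subspaces.
patterns = {
    cut({w ^ t for w in W}, key)
    for W in subspaces for t in range(16) for key in (row, col)
}

print(len(subspaces))    # 35
print(sorted(patterns))  # [(1, 1, 1, 1), (2, 2, 0, 0), (4, 0, 0, 0)]
```

Only the patterns $1^4$, $2^2.0^2$ and $4.0^3$ occur, each of which has all four entries of the same parity.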
Now an 8-element set may be partitioned into two fours in $\binom{8}{4}/2 = 35$ ways, and these
too are displayed in the 35 pictures of the MOG. One of the remarkable isomorphisms of
finite simple groups is that of the alternating group A8 , consisting of all even permutations
of 8 letters, with the linear group L4 (2), consisting of all non-singular 4 × 4 matrices over
Z2. The MOG simply displays a correspondence between these two sets of 35 objects
which is preserved by the simultaneous actions of A8 ≅ L4(2) on the 8 points and on the
15 non-zero vectors.
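As a consistency check on this isomorphism (a standard order computation, added here for illustration and not taken from the original text), the two groups have the same order, and the number of 2-dimensional subspaces of V matches the number of partitions of 8 points into two fours:

```latex
|A_8| = \frac{8!}{2} = 20160, \qquad
|L_4(2)| = (2^4-1)(2^4-2)(2^4-2^2)(2^4-2^3) = 15 \cdot 14 \cdot 12 \cdot 8 = 20160,
\qquad
\frac{(2^4-1)(2^4-2)}{(2^2-1)(2^2-2)} = 35 = \frac{1}{2}\binom{8}{4}.
```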
19.2.2 The binary Golay code C
The power set of Ω, P (Ω), is the set of all subsets of Ω; it contains $2^{24}$ members. If we
define the sum of two subsets of Ω to be their symmetric difference, thus
X + Y = (X \ Y ) ∪ (Y \ X), for X, Y ∈ P (Ω),
then P (Ω) becomes a vector space of dimension 24 in which the 1-element subsets form a
natural basis. The 759 octads, regarded as elements of this vector space, span a subspace
of dimension 12, which is known as the binary Golay code, denoted C. Now let $V = F^{24}$
be a vector space whose coordinate positions correspond to the 24 elements of Ω. Then a
subset X ∈ P (Ω) corresponds to a vector of length 24 with 1 in the positions of X and 0
in the other positions. Such a vector is called a codeword and its weight is the number of
non-zero entries it possesses; so the octads correspond to codewords of weight 8. It turns
out that C consists of
• the zero vector of weight 0;
• 759 codewords of weight 8, corresponding to the octads;
• 2576 codewords of weight 12, known as dodecads;
• 759 codewords of weight 16, the complements of octads, known as 16-ads;
• the all-ones vector, of weight 24.
Note that $1 + 759 + 2576 + 759 + 1 = 4096 = 2^{12}$.
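The weight distribution above, and the fact that the weight-8 codewords form a Steiner system S(5, 8, 24), can be confirmed computationally. The sketch below is illustrative Python and does not use the MOG labelling: as an assumption, it builds a copy of the code from the cyclic [23, 12] Golay code with generator polynomial x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1, extended by an overall parity bit. The resulting coordinates are therefore a relabelling of Ω, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed generator polynomial, stored as a bit mask (bit i = coefficient of x^i).
G = 0b110001110101

def extend(word23):
    """Append an overall parity bit in coordinate 23."""
    return word23 | ((bin(word23).count("1") & 1) << 23)

# The 12 shifts x^i * g(x), i = 0..11, span the cyclic code; extend each one.
basis = [extend(G << i) for i in range(12)]

# Enumerate all 2^12 codewords as XOR-combinations of the basis.
code = [0]
for b in basis:
    code += [c ^ b for c in code]

weights = Counter(bin(c).count("1") for c in code)
print(sorted(weights.items()))

# Index every 5-subset by the octad (weight-8 codeword) that contains it.
octad_of = {}
for word in code:
    if bin(word).count("1") == 8:
        support = tuple(i for i in range(24) if word >> i & 1)
        for five in combinations(support, 5):
            assert five not in octad_of  # at most one octad per 5-subset
            octad_of[five] = support

def complete_to_octad(points):
    """Return the unique octad containing the 5 given coordinates."""
    return octad_of[tuple(sorted(points))]
```

Since octad_of ends up with one entry for each of the $\binom{24}{5} = 42504$ five-element subsets, every 5 points lie in exactly one octad. This lookup is a brute-force stand-in for the MOG procedure, not a reproduction of it.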
19.2.3 The hexacode
The hexacode H, see Conway [35], is a 3-dimensional quaternary code of length six whose
codewords give an algebraic notation for the binary codewords of C as given in the MOG.
Explicitly, if {0, 1, ω, ω̄} = K ≅ GF(4), then
H = ⟨(1, 1, 1, 1, 0, 0), (0, 0, 1, 1, 1, 1), (ω̄, ω, ω̄, ω, ω̄, ω)⟩
= {(0, 0, 0, 0, 0, 0), (0, 0, 1, 1, 1, 1) (9 such), (ω̄, ω, ω̄, ω, ω̄, ω) (12 such),
(ω̄, ω, 0, 1, 0, 1) (36 such), (1, 1, ω, ω, ω̄, ω̄) (6 such)},
where multiplication by powers of ω is of course allowed, and an S4 of permutations of
the columns, S4 ≅ ⟨(1 3 5)(2 4 6), (1 2)(3 4), (1 3)(2 4)⟩, preserves the
code. Each hexacodeword has an odd and an even interpretation and each interpretation
corresponds to $2^5$ binary codewords in C, giving the $2^6 \times 2 \times 2^5 = 2^{12}$ binary codewords
of C. The rows of the MOG are labelled in descending order with the elements of K as
shown in the diagram, thus the top row is labelled 0. Let h = (h1, h2, . . . , h6) ∈ H. Then
in the odd interpretation if hi = λ ∈ K we place 1 in the λth position in the ith column
and zeros in the other three positions, or we may complement this and place 0 in the λth
position and 1s in the other three positions. We do this for each of the 6 values of i and
may complement freely so long as the number of 1s in the top row is odd. So there are $2^5$
choices.
In the even interpretation if hi = λ ≠ 0 we place 1 in the 0th and λth positions and
zeros in the other two, or as before we may complement. If hi = 0 then we place 0 in all
four positions or 1 in all four positions. This time we may complement freely so long as
the number of 1s in the top row is even.
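The two interpretations can be turned directly into a construction of the code. The following Python sketch is illustrative only: it assumes the encoding 0, 1, 2, 3 for 0, 1, ω, ω̄ (so that addition in K is XOR), and the helper names mul and column are hypothetical. It spans the hexacode from the three generators given above, applies the odd and even interpretations with every admissible choice of complementations, and tallies the weights of the resulting 4 × 6 binary arrays.

```python
from collections import Counter
from itertools import product

def mul(a, b):
    """Multiply in GF(4); nonzero elements 1, 2, 3 stand for w^0, w^1, w^2."""
    if a == 0 or b == 0:
        return 0
    return 1 + (a - 1 + b - 1) % 3

GENS = [(1, 1, 1, 1, 0, 0), (0, 0, 1, 1, 1, 1), (3, 2, 3, 2, 3, 2)]

hexacode = {
    tuple(mul(a, GENS[0][i]) ^ mul(b, GENS[1][i]) ^ mul(c, GENS[2][i]) for i in range(6))
    for a, b, c in product(range(4), repeat=3)
}

def column(h, odd, comp):
    """Bits of one column, rows ordered 0, 1, w, w-bar; comp = 1 complements it."""
    if odd:
        bits = [int(r == h) for r in range(4)]
    elif h:
        bits = [int(r == 0 or r == h) for r in range(4)]
    else:
        bits = [0, 0, 0, 0]
    return [b ^ comp for b in bits]

golay = set()
for h in hexacode:
    for odd in (True, False):
        for comps in product((0, 1), repeat=6):
            cols = [column(h[i], odd, comps[i]) for i in range(6)]
            top = sum(c[0] for c in cols)
            if top % 2 == odd:  # top row odd (odd case) or even (even case)
                golay.add(tuple(c[r] for r in range(4) for c in cols))

print(len(hexacode), len(golay))
print(sorted(Counter(sum(w) for w in golay).items()))
```

Each binary word is stored row by row, so position 6r + i is row r, column i of the array. The tallies agree with the weight distribution listed in Section 19.2.2.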
         0 ∼           1 ∼           ω ∼           ω̄ ∼
0      ×    .        .    ×        .    ×        .    ×
1      . or ×   ,    × or .   ,    . or ×   ,    . or ×
ω      .    ×        .    ×        ×    .        .    ×
ω̄      .    ×        .    ×        .    ×        ×    .
the odd interpretation

         0 ∼           1 ∼           ω ∼           ω̄ ∼
0      .    ×        ×    .        ×    .        ×    .
1      . or ×   ,    × or .   ,    . or ×   ,    . or ×
ω      .    ×        .    ×        ×    .        .    ×
ω̄      .    ×        .    ×        .    ×        ×    .
the even interpretation
Thus for instance

                        × . × . × .          . × × × . ×
(0, 1, ω̄, ω, 0, 1) ∼    . × × . . ×    or    . × . . . ×
                        . . × × . .          . . . × . .
                        . . . . . .          . . × . . .

,

∞  0 11  1 22  2
3 19  4 20 18 10
6 15 16 14  8 17
9  5 13 21 12  7

in the odd and even interpretations respectively, where evenly many complementations are
allowed in each case. The last figure shows the standard labelling of the 24 points of Ω
with the projective line P1 (23) such that all permutations of L2 (23) are in M24 .
Once one has committed the vectors in the hexacode to memory, which is surprisingly
easy to do, one can work with the Golay code, the Mathieu groups, the Leech lattice and
so on without needing to refer to the actual MOG diagram.
∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗∗
Our journey to tell the story of the unity of combinatorics began with a kid playing with
blocks on a living-room floor and a father who noticed a pattern, and has ended with the
Miracle Octad Generator that belongs to the highest level of modern mathematics. We hope
you will agree that combinatorics is more than a bag of tricks, and that it stitches together
many branches of mathematics into – that’s right – a beautifully patterned tapestry.
References
[1] V. E. Alekseev, On the Skolem method of constructing Steiner triple systems (Russian), Mat. Zametki 2 (1967), 145–156; MR 35, #5341. [x + y = z]
[2] Ian Anderson, A first course in combinatorial mathematics, Clarendon Press, Oxford,
1974.
[3] E. F. Assmus and C. Salwach, The (16, 6, 2) designs, Int. J. Math. and Math. Sci. 2
(1979), 261–281.
[4] J. C. Baez, The octonions, Bull. Amer. Math. Soc. 39 (2002), 145–205.
[5] W. W. Rouse Ball and H. S. M. Coxeter, Mathematical Recreations and Essays (12th
ed.), University of Toronto, Toronto, 1974. [36–40, Nim and Wythoff’s Game; 115–
116, squaring the square; 149–152, sphere packing; 189–192, latin squares; 193–221,
magic squares; 222–242, map coloring; 271–311, combinatorial designs, an excellent
survey written by J. J. Seidel]
[6] Thøger Bang, On the sequence [nα], n = 1, 2, . . ., supplementary note to the previous
paper by Th. Skolem, Math. Scand. 5 (1957) 69–76; MR 19, 1159h. [The Skolem
problem]
[7] Maria Beane, S(5, 8, 24), Masters Thesis, Virginia Tech, 2011.
[8] S. Beatty, Problem 3173, Amer. Math. Monthly 33 (1926), 159; 34 (1927), 159.
[Beatty sequences]
[9] Claude Berge, The Theory of Graphs and its Applications, Methuen, London, 1962,
p. 38. [Isaacs’s Game]
[10] E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning Ways for Your Mathematical Plays, 2nd ed., A K Peters, 2000–2004. [sum of games, p. 31; nim addition,
pp. 58–59; Wyt Queens, p. 59–60; nim-values, pp. 82–94; P-positions, p. 83; Turning
Turtles, p. 461]
[11] Albrecht Beutelspacher, 21 - 6 = 15: a connection between two distinguished geometries, Amer. Math. Monthly, 93(1986) 29–41; MR 87g:51010.
[12] Garrett Birkhoff and Saunders Mac Lane, A Survey of Modern Algebra, 3rd ed.,
Macmillan, New York, 1965.
[13] T. Beth, D. Jungnickel and H. Lenz, Design Theory, Cambridge Univ. Press, 1986.
[a comprehensive reference for Steiner systems and difference methods]
[14] N. L. Biggs, T. P. Kirkman, mathematician, Bull. London Math. Soc. 13 (1981),
97–120.
[15] R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics
9 (1939), 353–399.
[16] R. C. Bose, S. Shrikhande and E. T. Parker, Further results on the construction
of sets of mutually orthogonal Latin squares and the falsity of Euler’s conjecture,
Canad. J. Math. 12 (1960), 189–203.
[17] Charles L. Bouton, Nim, a game with a complete mathematical theory, Ann. of
Math., Princeton,(2) 3 (1901-02), 35–39.
[18] R. L. Brooks, C. A. B. Smith, A. H. Stone, and W. T. Tutte, The dissection of
rectangles into squares, Duke Math. J., 7 (1940), 312–340; MR 2, 153.
[19] A. E. Brouwer, A. M. Cohen and A. Neumaier, Distance-regular Graphs, Ergebnisse
der Mathematik und ihrer Grenzgebiete (3), 18, Springer-Verlag, Berlin, 1989.
[20] Ezra Brown, The many names of (7, 3, 1), Math. Magazine 75 (2002), 83–94.
[21] Ezra Brown, The fabulous (11,5,2) biplane, Math. Magazine 77 (2004), 87–100.
[22] Ezra Brown, Many more names of (7, 3, 1), Math. Magazine 88 (2015), 103–120.
[23] Ezra Brown and Nicholas Loehr, Why is PSL(2, 7) ≅ GL(3, 2)?, Amer. Math.
Monthly 116 (2009), 727–731.
[24] Ezra Brown and Keith A. Mellinger, Kirkman’s schoolgirls wearing hats and walking
through fields of numbers, Math. Magazine, 82 (2009), 3–15.
[25] R. H. Bruck and H. J. Ryser, The non-existence of certain finite projective planes,
Canad. J. Math., 1 (1949) 88–93; MR 10, 319.
[26] Tom Brylawski, A partially-anecdotal history of matroids, talk given at “Matroids
in Montana” workshop, November, 2006.
[27] Robert D. Carmichael, Introduction to the Theory of Groups of Finite Order, Dover
Publications, New York, 1956.
[28] A. Cayley, On Jacobi’s elliptic functions, in reply to the Rev. B. Bronwin, and on
quaternions, Collected Mathematical Papers 1, 127.
[29] A. Cayley, Note on a system of imaginaries, Collected Mathematical Papers 1, 301.
[30] A. Cayley, On the triadic arrangements of seven and fifteen things, Philos. Mag. 37
(3) (1850) pp. 50–53.
[31] A. Cayley, On the partitions of a polygon, Proc. London Math. Soc. 22 (1891), 237–262 = Coll. Math. Papers 13 (1897), 93–113.
[32] H. Cohn, A conceptual breakthrough in sphere packing, Notices Amer. Math. Soc.
64 (2017), 102–115.
[33] F. N. Cole, Kirkman parades, Bull. Amer. Math. Soc. 28 (1922) pp. 435–437.
[34] J. H. Conway, On Numbers and Games, Academic Press, London, 1976. [Nim arithmetic, chap. 6; impartial games and Nim, chap. 11]
[35] J. H. Conway, Hexacode and tetracode - MOG and MINIMOG, in Computational
Group Theory (Ed. Michael D. Atkinson), Academic Press, London, Orlando and
New York (1984), 359–365.
[36] J. H. Conway and R. K. Guy, The Book of Numbers, Copernicus Books, 2007.
[37] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, Springer
Grundlehren der mathematischen Wissenschaften 290, 1988. [Hamming code p. 80;
Golay code, p. 84; Cayley numbers, p. 122; lexicodes, p. 327]
[38] J. H. Conway and N. J. A. Sloane, Lexicographic codes: error-correcting codes from
game theory, IEEE Trans. Info. Theory, IT-32 (1986) 337–348; MR 87f:94049.
[39] J. H. Conway and D. Smith, On Quaternions and Octonions, A K Peters/CRC Press,
2003.
[40] H. S. M. Coxeter, The golden section, phyllotaxis and Wythoff’s game, Scripta Math.,
19 (1953) 135–143; MR15, 246.
[41] Hallard T. Croft, Kenneth J. Falconer, and Richard K. Guy, Unsolved Problems in
Geometry, Springer, 1991. [C2 squaring the square; D10 packing spheres; E9-E13
lattice points]
[42] R. T. Curtis, A new combinatorial approach to M24, Math. Proc. Camb. Phil. Soc.
79 (1976), 25-42.
[43] R. O. Davies, On Langford’s problem, II, Math. Gaz., 43 (1959) 253–255; MR 22
#5581.
[44] J. Dénes and A. D. Keedwell, Latin Squares and Their Applications, Academic Press,
1974.
[45] R. H. F. Denniston, Sylvester’s problem of the 15 schoolgirls, Discrete Math. 9 (1974)
pp. 229–233.
[46] P. Diaconis, R. L. Graham and W. M. Kantor, The mathematics of perfect shuffles,
Adv. in App. Math., 4(1983) 175–193; MR 84j:20040.
[47] L. E. Dickson, Linear Groups, B. G. Teubner, Leipzig, 1901.
[48] J. Doyen, Recent developments in the theory of Steiner systems, Atti dei Conv.
Lincei, 17 (1976) Tomo I, 277–285; MR 55 #10286. [Excellent bibliography with
many early items, previously overlooked]
[49] A. J. W. Duijvestijn, Simple perfect squared square of lowest order, J. Combinatorial
Theory Ser. B 25 (1978), 240–243; MR 80a:05051. [The smallest perfect squared
square.]
[50] A. J. W. Duijvestijn, Simple perfect squared squares and 2 × 1 squared rectangles of
order 25, Math. Comp. 62 (1994), 325–332: MR 94c:05023 (and see 05017). [Squares
with the same elements differently arranged; squares whose largest element is not on
the boundary.]
[51] D. S. Dummit and R. M. Foote, Abstract Algebra, 3rd ed., John Wiley, Hoboken, NJ,
2004.
[52] T. Ebert, Applications of recursive operators to randomness and complexity. Ph.D.
Thesis, University of California at Santa Barbara, 1998.
[53] N. Elkies, Handout for Math 155 (1998), available at
http://www.math.harvard.edu/ elkies/M155.98/h4.ps.
[54] G. Fano, Sui postulati fondamentali della geometria proiettiva, Giorn. Mat. 30
(1892), 114-124. [The “Fano” configuration was anticipated by Kirkman in 1850.]
[55] T. S. Ferguson, Game Theory: Impartial Combinatorial Games. www.math.
ucla.edu/tom/Game Theory/comb.pdf, accessed 7 January 2016.
[56] Alex Fink and Richard K. Guy, Rick’s Tricky Six Puzzle: S5 sits specially in S6 ,
Math. Magazine 82, 2009, 84–102.
[57] R. A. Fisher, The Design of Experiments, Oliver and Boyd, Edinburgh, 1935.
[58] R. A. Fisher, An examination of the different possible solutions of a problem in
incomplete blocks, Ann. Eugenics 10 (1940), 52–57.
[59] M. Gardner, Penrose Tiles to Trapdoor Ciphers (revised ed.), Math. Assoc. of Amer.,
Washington DC, 1997. [Chapters 1 and 2, especially p. 6, Conway worms and Penrose
aperiodic tiles]
[60] J. T. Graves, Letter to W. R. Hamilton, January 22, 1844, in The Mathematical
Papers of Sir William Rowan Hamilton 3, 649
[61] Branko Grünbaum and G. C. Shephard, Tilings and Patterns, W. H. Freeman, New
York, 1987. [76–81 squaring the square, 531–583 Penrose aperiodic tiles]
[62] P. M. Grundy, Mathematics and games, Eureka, 2 (1939) 6–8; reprinted 27 (1964)
9–11. [Sprague-Grundy theory of impartial games]
[63] Richard K. Guy, A many-facetted problem of Zarankiewicz, in The Many Facets of
Graph Theory, Springer, New York, 1969, 129–148; MR 41 #91. [Adjacency matrices,
incidence matrices, packing and covering, Turán and Ramsey problems, tournaments,
Steiner triples, affine and projective planes, difference sets]
[64] Richard K. Guy, Packing [1, n] with solutions of ax + by = cz - the unity of combinatorics, Atti dei Conv. Lincei, 17 (1976) Tomo II, 173-179; MR 57 #9565. [x + y = z,
Langford-Skolem, Wythoff, Isaacs, Steiner, Hanani, Ringel, Zarankiewicz]
[65] Richard K. Guy, The Penrose pieces, Bull. London Math. Soc., 8 (1976) 9–10.
[66] Richard K. Guy, The unity of combinatorics, Proc. 25th Iran. Math. Conf., Tehran
(1994), Math. Appl., 329 129–159, Kluwer Acad. Publ., Dordrecht, 1995; MR
96k:05001.
[67] Richard K. Guy, Catwalks, Sandsteps and Pascal Pyramids, Journal of Integer
Sequences 3 (2000), Article 00.1.6.
[68] Richard K. Guy and Mark M. Paulhus, Unique rook circuits, Math. Magazine 75
(2002), 380–387.
[69] Richard K. Guy and John L. Selfridge, What drives an aliquot sequence?, Math.
Comp 29 101–107.
[70] Richard K. Guy and Cedric A. B. Smith, The G-values for various games, Proc.
Cambridge Philos. Soc., 52 (1956), 514–526; MR 18, 546.
[71] Richard K. Guy, Christian Krattenthaler, and Bruce E. Sagan, Lattice paths, reflections, and dimension-changing bijections Ars Combinatoria, 34 (1992), 3–15.
[72] J. Hadamard, Résolution d’une question relative aux déterminants, Bull. Sci. Math.
(2) 17 (1893), 240–248. [Hadamard matrices]
[73] Marshall Hall, Jr., Combinatorial Theory, 9th Edition, Blaisdell Publishing Company,
Waltham, MA, 1967.
[74] William R. Hamilton, Letter from Sir W. R. Hamilton to Rev. Archibald H. Hamilton,
August 5, 1865.
[75] R. W. Hamming, Error correcting and error detecting codes, Bell Sys. Tech. J., 29
(1950) 147–160; MR 12, 35c.
[76] H. Hanani, A note on Steiner triple systems, Math. Scand. 8 (1960), 154–156; MR
23 #A2330.
[77] H. Hanani, On quadruple systems, Canad. J. Math., 12 (1960) 145–157; MR 22
#2558.
[78] H. Hanani and J. Schönheim, On Steiner systems, Israel J. Math., 2 (1964) 139–142;
MR 31 #73.
[79] F. Harary, Topological concepts in graph theory, in Harary & Beineke, A Seminar
on Graph Theory, Holt, Rinehart & Winston, New York, 1967, pp. 13–17.
[80] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 4th
Edition, Oxford at the Clarendon Press, London, 1960.
[81] Percy J. Heawood, Map colour theorem, Quart. J. Math., Oxford Ser. 24 (1890),
332–338.
[82] J. W. P. Hirschfeld, Projective Geometries over Finite Fields, Oxford University Press
(1998).
[83] A. J. Hoffman and R. R. Singleton, On Moore graphs with diameters 2 and 3, IBM
J. Res. Develop., 4(1960) 497–504; MR 25 #3857.
[84] L. E. Hordern, Sliding Piece Puzzles, Oxford Univ. Press, 1986.
[85] Ben Howard, John Millson, Andrew Snowden and Ravi Vakil, A description of the
outer automorphism of S6 , and the invariants of six points in projective space, J.
Combin. Theory A, 115(2008) 1296–1303.
[86] D. R. Hughes and F. C. Piper, Design Theory, Cambridge University Press, New
York, 1985.
[87] Gerald Janusz and Joseph Rotman, Outer automorphisms of S6 , Amer. Math.
Monthly, 89(1982) 407–410; MR 83g:20002.
[88] R. H. Jeurissen, A proof by graphs that PSL(2, 7) ≅ GL(3, 2), Discrete Math. 70
(1988) 315–317.
[89] T. P. Kirkman, On a problem in combinations, Cambridge and Dublin Math. J. 2,
(1847), 191–204.
[90] T. P. Kirkman, Query 6, Lady’s and Gentlemen’s Diary, 1850.
[91] T. P. Kirkman, Note on an unanswered prize question, Cambridge and Dublin Math.
J. 5 (1850), 255–262.
[92] T. P. Kirkman, On triads made with fifteen things, London, Edinburgh and Dublin
Phil. Mag., 37 (1850), 169–171.
[93] T. P. Kirkman, On the perfect r-partitions of $N = r^2 - r + 1$, Trans. Historic Soc.
Lancs. and Cheshire 9 (1856–57), 127-142.
[94] T. P. Kirkman, On the k-partitions of the r-gon and r-ace, Phil. Trans. Royal Soc.
London 147 (1857), p. 225.
[95] Thomas Kövári, Vera Sós, and Paul Turán, On a problem of K. Zarankiewicz, Colloq.
Math. 3 (1954), 50-57; MR 16, 456.
[96] C. Krattenthaler, Counting lattice paths with a linear boundary. I. Osterreich. Akad.
Wiss. Math.-Natur. Kl. Sitzungsber. II 198 (1989), 87–107.
[97] Sidney Kravitz, Complete the Circuit, J. Recreational Math., 28(1996-97), No. 2,
p.143.
[98] M. Kuchinski, Catalan Structures and Correspondences, M.Sc. Thesis, West Virginia
University, 1977.
[99] R. Kumanduri and C. Romero, Number Theory with Computer Applications.
Prentice–Hall, Upper Saddle River, NJ, 1998.
[100] Clement W. H. Lam, The search for a finite projective plane of order 10, Amer. Math.
Monthly 98 (1991), 305–318.
[101] C. Dudley Langford, Problem, Math. Gazette 42 (1958), 228. [Langford sequences]
[102] Eugene Lawler, Combinatorial Optimization: Networks and Matroids, Dover Publications, Inc., Mineola, NY, 2001.
[103] John Leech, Some sphere packings in higher space, Canad. Math. J. 5 (1964), 657–682; MR 29 #5166; 19 (1967) 251–267; MR 35 #878.
[104] D. H. Lehmer and Emma Lehmer, A new factorization technique using quadratic
forms, Math. Comp. 28 (1974), 625–635.
[105] Emma Lehmer, On residue difference sets, Canad. J. Math. 5 (1953), 425–432; MR
15, 10.
[106] H. W. Lenstra, Nim multiplication, Seminaire de Théorie des Nombres de Bordeaux
7 (1977-78), 1–24.
[107] F. Jessie MacWilliams and Neil J. A. Sloane, The Theory of Error–Correcting Codes,
2nd Reprint, North–Holland Mathematical Library 16, North–Holland, New York,
1983.
[108] Al. A. Markov, On a certain combinatorial problem (Russian), Problemy Kibernetiki
15 (1965), 263–266; MR 35 #1497. [x + y = z]
[109] É. Mathieu, Mémoire sur L’Étude des Fonctions de Plusieurs Quantités, sur la
Manière de les Former et sur les Substitutions Qui les Laissent Invariables, J. Math.
Pures Appl., Ser. 2 6, (1861), 241–274.
[110] É. Mathieu, Sur la Fonction Cinq Fois Transitive de 24 Quantités, J. Math. Pures
Appl., Ser. 2 18, (1873), 25–46.
[111] Darcy Meeker, Squareorama 4, Math Horizons 22 (November 2014), 2.
[112] J. W. Moon, Topics on Tournaments, Holt, Rinehart and Winston, New York, 1968.
[113] David L. Neel and Nancy Ann Neudauer, Matroids you have known, Math. Magazine
82 (2009), 26–41.
[114] Gabriel Nivasch, More on the Sprague-Grundy function for Wythoff’s Game, in
Games of No Chance 3, MSRI Publications (M. Albert and R. Nowakowski, eds.),
Cambridge University Press 2009, 377–410.
[115] R. J. Nowakowski, Generalizations of the Langford-Skolem problem, M.Sc. thesis,
The Univ. of Calgary, 1975.
[116] R. J. Nowakowski, Zarankiewicz’s problem, Ph.D. thesis, The Univ. of Calgary, 1978.
[117] James G. Oxley, Matroid Theory, Oxford University Press, Oxford, UK, 1992.
[118] R. E. A. C. Paley, On orthogonal matrices, J. Math. Phys. 12 (1933), 311-320.
[119] R. E. O’Connor and Gordon Pall, The construction of integral quadratic forms of
determinant 1, Duke Math. J. 11 (1944), 319–331. [Mock turtles, the Leech lattice]
[120] A. Papoulis, A new method of inversion of the Laplace transform, Q. Appl. Math. 14
(1956), 405–414.
[121] Vera Pless, Introduction to the Theory of Error–Correcting Codes, 2nd Edition, Wiley,
New York, 1989.
[122] B. Polster, Pretty pictures of geometries, Finite geometry and combinatorics (Deinze
1997), Bull. Belg. Math. Soc. Simon Stevin, 5(1998) 417–425; MR 99f:51022.
[123] George Pólya, How To Solve It, 2nd Edition, Doubleday, Garden City NY, 1957.
[124] C. J. Priday, On Langford’s problem, I, Math. Gaz. 43 (1959), 250–253; MR 22
#5580.
[125] D. K. Ray-Chaudhuri and R. M. Wilson, Solution of Kirkman’s schoolgirl problem,
Proc. Symp. Pure Math. 19 , Amer. Math. Soc. (1971), 187–203; MR 47 #3195.
[126] K. B. Reid and E. Brown, Doubly Regular Tournaments Are Equivalent to Skew–
Hadamard Matrices, J. Combinatorial Theory, Series A 12 (1972), 332–338.
[127] D. S. Richeson, Euler’s Gem: The Polyhedron Formula and the Birth of Topology,
Princeton University Press, 2009.
[128] G. Ringel, Färbungsprobleme auf Flächen und Graphen, VEB Deutscher Verlag der
Wissenschaften, Berlin, 1959.
[129] G. Ringel, Die toroidale Dicke des vollständigen Graphen, Math. Zeit. 87 (1965),
19–26; MR 30 #2489. [toroidal thickness]
[130] G. Ringel, Map Color Theorem, Grundlehren der math. Wissenschaften 209,
Springer, 1974.
[131] G. Ringel and J. W. T. Youngs, Solution of the Heawood map-coloring problem,
Proc. Nat. Acad. Sci. U.S.A. 60 (1968), 438-445; MR 37 #3959.
[132] G. Ringel and J. W. T. Youngs, Solution of the Heawood map-coloring problem: case
11, J. Combinatorial Theory 7 (1969), 71–93; MR 39 #1360; case 2, 342–352; case 8,
353–363; MR 41 #6723.
[133] Fred S. Roberts, Applied Combinatorics, Prentice–Hall, 1984.
[134] S. Robinson, “Why Mathematicians Now Care About Their Hat Color”, The New
York Times (10 April 2001).
[135] A. Rosa, A note on Steiner triple systems (Slovak), Mat.-Fyz. Časopis Sloven. Akad.
Vied 16 (1966), 285–290; MR 35 #2759. [x + y = z]
[136] D. P. Roselle and T. C. Thomasson, Jr., On generalized Langford Sequences, J.
Combinatorial Theory 11 (1971), 196–199.
[137] J. J. Rotman, The Theory of Groups, An Introduction, 2nd edition, Allyn & Bacon,
Boston MA, 1973.
[138] J. J. Rotman, An Introduction to the Theory of Groups, 4th ed., Springer-Verlag,
New York, 1995.
[139] H. J. Ryser, Combinatorial Mathematics, Carus Math. Monograph 14, Math. Assoc.
Amer., 1963.
[140] W. Sands, Problem 1517*, Crux Mathematicorum 16 # (Feb. 1990), 44.
[141] Daniel Scully, Perfect shuffles through dynamical systems, Math. Magazine, 77(2004)
101–117; MR 2005g:05008.
[142] J. Sedláček, On a set system, Ann. New York Acad. Sci. 175 (1970), 329–330; MR
42 #117. [x + y = z]
[143] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27
(1948), 379–423, 623–656. [The first Hamming Code]
[144] J. Singer, A theorem in finite projective geometry and some applications to number
theory, Trans. Amer. Math. Soc. 43 (1938), 377–385.
[145] David Singmaster, Sources in Recreational Mathematics, 6th edition, Nov. 1993.
[146] Th. A. Skolem, On certain distributions of integers in pairs with given differences,
Math. Scand. 5 (1957), 57–68; MR 19 , 1159i. [Skolem’s problem]
[147] Th. A. Skolem, Some remarks on the triple systems of Steiner, Math. Scand., 6 (1958)
273-280; MR 21 #5582.
[148] Th. A. Skolem, Über einige Eigenschaften der Zahlenmengen [αn+β] bei irrationalem
mit einleitenden Bemerkungen über einige kombinatorische Probleme, Norske Vid.
Selsk. Forh.Trondheim 30 (1957), 42–49; MR 19, 1159i.
[149] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, http://oeis.org
[150] Jerry Slocum and Dic Sonneveld, The 15 Puzzle: How it Drove the World Crazy. The
Puzzle that Started the Craze of 1880. How America’s Greatest Puzzle Designer, Sam
Loyd, Fooled Everyone for 115 Years, Beverly Hills, CA; Slocum Puzzle Foundation,
2006.
[151] R. P. Sprague, Über mathematische Kampfspiele, Tôhoku Math. J. 41 (1935–36),
438–444; Zbl. 13, 290. [Sprague-Grundy theory of impartial games]
[152] R. P. Sprague, Beispiel einer Zerlegung des Quadrats in lauter verschiedene Quadrate,
Math. Z. 45 (1939), 607–608. [squaring the square]
[153] Richard P. Stanley, Catalan Numbers, Cambridge University Press, New York, 2015.
[154] S. K. Stein, Mathematics: The Man-Made Universe, 2nd edition, Freeman, San
Francisco, 1969.
[155] Thomas Storer, Cyclotomy and Difference Sets, Markham, Chicago, 1967.
[156] W. E. Story, Notes on the ‘15’ puzzle, II, Amer. J. Math. 2 (1879), 399–404.
[157] J. J. Sylvester, Elementary researches in the analysis of combinatorial aggregation,
Philosophical Mag. 24 (1844), 285–296; Collected Math. Papers, Vol. I, 91–102.
[158] J. J. Sylvester, Note on the historical origin of the unsymmetrical six-valued function
of six letters, Philosophical Mag. 21 (1861), 369–377; Collected Math. Papers, Vol. II,
264–271.
[159] J. J. Sylvester, On a problem in tactic which serves to disclose the existence of a
four-valued function of three sets of three letters each, Philosophical Mag. 21 (1861),
515–520; Collected Math. Papers, Vol. II, 272–276.
[160] J. J. Sylvester, Concluding paper on tactic, Philosophical Mag. 22 (1861), 45–54;
Collected Math. Papers, Vol. II, 277–285.
[161] J. J. Sylvester, Remark on the tactic of nine elements, Philosophical Mag. 22 (1861),
144–147; Collected Math. Papers, Vol. II, 286–289.
[162] J. Tanton, Mathematics Galore!, MAA, Washington DC (2015).
[163] G. Tarry, Le Problème de 36 officiers, C. R. Assoc. Fr. Avance. Sci. Nat 1 (1900),
122–123; 2 (1901), 170–203.
[164] H. M. Taylor and R. C. Rowe, Note on a geometric theorem, Proc. London Math.
Soc. 13 (1882), 102–106.
[165] C. M. Terry, L. R. Welch, and J. W. T. Youngs, Solution of the Heawood map-coloring
problem: case 4, J. Combinatorial Theory 8 (1970), 170–174; MR 41 #3321.
[166] A. Tietäväinen, On the non-existence of perfect codes over finite fields, SIAM J.
Appl. Math. 24 (1973), 88–96; MR 48 #3609.
[167] Thomas M. Thompson, From Error-correcting Codes through Sphere Packings to
Simple Groups, Carus Math. Monograph 21, Math. Assoc. Amer., 1983.
[168] J. A. Todd, A combinatorial problem, J. Math. and Phys. 12 (1933), 321–323.
[Hadamard matrices]
[169] Alan Tucker, Applied Combinatorics (6th edition), John Wiley & Sons, Hoboken,
2012.
[170] W. T. Tutte, A census of Hamiltonian polygons, Canad. J. Math. 14 (1962),
402–417.
[171] Paul Vaderlind, Richard Guy & Loren Larson, The Inquisitive Problem Solver, Math.
Assoc. of Amer. Problem Books, 2002, Problem 88.
[172] W. D. Wallis, Introduction to Combinatorial Designs (2nd edition), Chapman and
Hall/CRC, Boca Raton FL, 2007.
[173] Douglas B. West, Introduction to Graph Theory (2nd edition), Prentice Hall, Upper
Saddle River, NJ, 2001.
[174] Richard M. Wilson, Graph puzzles, homotopy, and the alternating group, J. Combin.
Theory Ser. B 16 (1974), 86–96; MR 48 #10882.
[175] Ernst Witt, Die 5-fach transitiven Gruppen von Mathieu, Abh. Math. Sem. Hamburg
12 (1938), 256–264.
[176] A. L. Whiteman, A family of difference sets, Illinois J. Math. 6 (1962), 107–121.
[177] Hassler Whitney, On the abstract properties of linear dependence, Amer. J. Math.,
57 (1935), 509–533.
[178] W. S. B. Woolhouse, Prize question 1733, Lady’s and Gentlemen’s Diary, 1844. [block
designs]
[179] W. A. Wythoff, A modification of the game of Nim, Nieuw Arch. voor Wisk. (2) 7
(1905–07), 199–202.
[180] J. W. T. Youngs, Solution of the Heawood map-coloring problem: cases 3, 5, 6 and
9, J. Combinatorial Theory 8 (1970), 175–219; cases 1, 7, and 10, 220–231; MR 41
#3322–3.
[181] K. Zarankiewicz, Problem P101, Colloq. Math. 2 (1951), 301.
[182] Š. Znám, On a combinatorical problem of K. Zarankiewicz, Colloq. Math. 11 (1963),
81–84; MR 29 #37; 13 (1965), 255–258; MR 32 #7434.