First-Server Advantage in Tennis Matches

Iain MacPhee and Jonathan Rougier∗
Department of Mathematical Sciences
University of Durham, U.K.
Abstract
We show that the advantage that can accrue to the server in
tennis does not necessarily imply that serving first increases the
probability of winning the match. We demonstrate that the
outcome of tie-breaks, sets and matches can be independent of
who serves first. These are corollaries of a more general result
that we prove by considering invariances across certain permutations of the order in which the players serve. Our proof is
non-algebraic and self-contained.
Keywords: Tennis, Tie-break, n-point win-by-k Games
Primary classification: 91A60 (Probabilistic games; gambling),
91A05 (2-person games). Secondary classification: 60J20 (Applications of discrete Markov processes)
1 Introduction
In professional lawn tennis it is usually better to serve than to return. It
is tempting to infer from this that it is therefore advantageous to serve
first, i.e. winning the toss and electing to serve improves the probability
of winning the first set, and the match. In this paper we show that this
inference does not necessarily follow. We show that in a standard class
of models where serving confers an advantage the probability that a
given player wins is independent of who serves first. We first show that
this is true for a tie-break, and then we generalise our result to sets and
to the match.
∗ Corresponding author: Department of Mathematical Sciences, University of
Durham, Science Site, South Road, Durham DH1 3LE, U.K.; tel +44(0)191 334
3111; fax +44(0)191 334 3051; e-mail [email protected]
The analysis of tennis matches using simple probabilistic models
is well known (e.g., Kemeny and Snell, 1960). An interesting history
of scoring systems in tennis, and some combinatorial calculations on
outcomes, is given in Riddle (1988). Our tie-break results are not new,
having been given in Pollard (1983), and, implicitly, in Haigh (1996).
These proofs use standard probabilistic methods (e.g., hitting times
on binomial trees, combinatorial analysis) relying on algebraic equalities.
By contrast, our approach provides a simple and completely self-contained
proof without any algebra at all, based on a new, stronger
result concerning invariances across certain permutations of the order
in which the players serve.
2 The tie-break
The tennis tie-break is won by the first player to seven points, or, if
the score reaches six-all, the first player to go two points ahead. It is
an example of an n-point win-by-k game, with n = 7 and k = 2, with
additional structure provided by the pattern of serves. On entering a
tie-break the player to serve first is pre-determined by the initial coin
toss and the total number of games played so far in the match. This
player serves once, and then the players alternate serving two points
each.
We will make the following assumption, which defines the class of
models we study.
Assumption 1. The points of the tie-break comprise independent Bernoulli trials with fixed probability of success depending only on who is
serving.
We label our players A and B. We will prove that under Assumption 1 the probability of player A winning the tie-break is independent
of who serves first. This is a simple corollary to a more general result,
for which we require the following definition.
Definition 1. A pairwise service ordering (pso) is a concatenation of
the tuples AB and BA repeatedly according to some rule.
Then the theorem is as follows.
Theorem 1. Suppose we are in a tie-break with a generalised serving
pattern that conforms to a pso fixed by rule R. Then under Assumption 1, the probability of player A winning the tie-break is independent
of the choice for R.
For an actual tennis tie-break, the two possible service patterns are
ABBAABB · · · A serves first,
BAABBAA · · · B serves first.
Both of these are psos, and so it follows from Theorem 1 that the
probability of player A winning is invariant to who serves first.
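The invariance claimed here can be checked numerically. The following is a minimal sketch, not the authors' method: it computes the tie-break win probability exactly by dynamic programming over the first 12 points. The names `tiebreak_win_prob`, `a_first`, `b_first` and `alternating`, the symbols a and b, and the illustrative probabilities are all assumptions introduced for this example. The step from [6, 6] uses the fact that under any pso each subsequent pair of points contains one serve by each player.

```python
from fractions import Fraction

def tiebreak_win_prob(a, b, server_at, n=7):
    """P(A wins an n-point win-by-2 tie-break) under Assumption 1.

    a = P(A wins a point | A serves), b = P(A wins a point | B serves).
    server_at(t) returns 'A' or 'B' for point t = 0, 1, 2, ...
    The serving order is assumed to be a pso.
    """
    live = {(0, 0): Fraction(1)}   # probabilities of unfinished scores
    win = Fraction(0)              # probability A has already won
    for t in range(2 * (n - 1)):
        p = a if server_at(t) == 'A' else b
        nxt = {}
        for (i, j), pr in live.items():
            for (i2, j2), w in (((i + 1, j), p), ((i, j + 1), 1 - p)):
                if i2 == n:
                    win += pr * w          # A reaches n first
                elif j2 < n:
                    nxt[(i2, j2)] = nxt.get((i2, j2), 0) + pr * w
                # j2 == n: B has won, nothing to add
        live = nxt
    # Only the tied score [n-1, n-1] is still live. From a tie, A wins the
    # next pair with probability a*b, B with (1-a)*(1-b), else tied again.
    tied = live.get((n - 1, n - 1), Fraction(0))
    return win + tied * (a * b) / (a * b + (1 - a) * (1 - b))

def a_first(t):       # ABBAABB...
    return 'A' if ((t + 1) // 2) % 2 == 0 else 'B'

def b_first(t):       # BAABBAA...
    return 'B' if a_first(t) == 'A' else 'A'

def alternating(t):   # ABABAB..., another pso
    return 'A' if t % 2 == 0 else 'B'

# Illustrative values only: A wins 70% of points on serve, 40% on return.
a, b = Fraction(7, 10), Fraction(2, 5)
results = [tiebreak_win_prob(a, b, rule)
           for rule in (a_first, b_first, alternating)]
print(results)
```

With exact rational arithmetic the three values can be compared for equality directly, rather than up to floating-point tolerance.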
To prove Theorem 1 we make repeated use of the following lemma,
which summarises the invariance structure of the probability of attaining certain scores with respect to the service ordering.
Lemma 2. Under the same conditions as Theorem 1, the probability
of any score [i, j] with i + j even is invariant to R, excepting those in
the set {[7, j], [i, 7] : i, j ∈ {1, 3, 5}}.
Proof. The tie-break may be represented on a binomial tree. We write
any given path through the tree with any given pso rule R in the
following way (ignoring, briefly, the restrictions on the score):
    1 0 0 1 1 1 ...
    A B B A A B ...
(say). Each vertical pair represents a single point, with the first line
showing the indicator variable of A winning and the second line showing
the server; in this example, the score goes with service for the first five
points, before A wins a point on B’s serve. Clearly the role of the rule
R is to assign a probability to the chosen path.
The first point to note is that from a given pso we can reach every
other pso by interchanging A and B in positions 2i + 1 and 2i + 2 for
i ∈ N one-pair-at-a-time, two-pairs-at-a-time, and so on. The second
point is that if we also interchange the indicator variables then we alter
neither the final score of the path, nor its probability.
If we want to know the probability of a score [i, j] under a pso
with rule R, then we sum the probabilities over all possible paths to
[i, j]. If we take any other pso with rule R′ the paths to [i, j] will be
different, but the sum of their probabilities will be the same. This is
because each path to [i, j] with rule R can be bijectively associated with
a path to [i, j] which has the same probability under rule R′, using the
interchanges described above. Thus the probability of [i, j] is invariant
to the choice of R or R′, and since R and R′ are completely general, it
is invariant to whatever rule is chosen for the pso.
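The interchange argument can be made concrete with a short sketch. The function names `path_prob` and `interchange`, the symbols a and b, and the numerical values are hypothetical, chosen only for illustration. A path is stored as a list of indicators (1 if A wins the point) alongside a string of servers. Wherever the two psos disagree on the order within a pair, the two indicators in that pair are swapped.

```python
from fractions import Fraction
from itertools import product

def path_prob(outcomes, servers, a, b):
    """Probability of a path; outcomes[t] is 1 if A wins point t.

    a = P(A wins a point | A serves), b = P(A wins a point | B serves).
    """
    prob = Fraction(1)
    for won, srv in zip(outcomes, servers):
        p = a if srv == 'A' else b
        prob *= p if won else 1 - p
    return prob

def interchange(outcomes, servers, new_servers):
    """Map a path under one pso to its partner path under another pso.

    For each pair of points (2i, 2i+1), if the two psos order the servers
    differently, the two indicator variables are swapped as well.
    """
    out = list(outcomes)
    for k in range(0, len(out) - 1, 2):
        if servers[k] != new_servers[k]:
            out[k], out[k + 1] = out[k + 1], out[k]
    return out

a, b = Fraction(7, 10), Fraction(2, 5)      # illustrative values only
rule_r, rule_r2 = "ABBAAB", "BAABBA"        # two psos, six points each

# The example path from the text: score goes with serve for five points,
# then A wins a point on B's serve.
path = [1, 0, 0, 1, 1, 1]
partner = interchange(path, rule_r, rule_r2)
print(partner, sum(partner))

# The map is a bijection on the 2**6 paths, preserving score and probability.
images = {tuple(interchange(p, rule_r, rule_r2))
          for p in product((0, 1), repeat=6)}
print(len(images))
```

Because the partner path has the same number of points won by A and the same probability, summing over all paths to [i, j] gives the same total under either rule whenever i + j is even.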
The reason that Lemma 2 only works for scores with i + j even is that we
need to know what happens on both of the points that we interchange
in each tuple to associate bijectively between R and R′. The reason it
does not work for scores of the form [7, j] or [i, 7] for i, j ∈ {1, 3, 5} is
that in this case the winning player must win the final point. Unless he
or she also wins the penultimate point, we end up interchanging into a
different terminating score.
To prove Theorem 1 it is sufficient to show that the probability
of player A winning the tie-break is invariant to the service ordering.
Lemma 2 shows that the probability of reaching the score [6, 6] is invariant to the pso rule R. Now we condition on whether the score
passes through [6, 6], and consider the two cases separately.
If the score does pass through [6, 6], then the terminal scores (for A
winning) will be of the form [6 + i + 2, 6 + i] for i ∈ N. Scores of this
form satisfy Lemma 2, noting that A must have won both of the final
two points, and so we conclude that the probability of A winning after
passing through [6, 6] is invariant to the pso rule R.
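Although the proof needs no algebra, a short worked equation may help to see why this case is invariant. The notation is an assumption of this note, not of the original argument: a is the probability that A wins a point on A's serve, b the probability that A wins a point on B's serve, and w the probability that A wins from a tied score reached after an even number of points. Under any pso the next two points contain one serve by each player, so A wins both with probability ab, B wins both with probability (1 − a)(1 − b), and otherwise the score is tied again. Assuming ab + (1 − a)(1 − b) > 0:

```latex
w = ab + \bigl[a(1-b) + (1-a)b\bigr]\,w
\quad\Longrightarrow\quad
w = \frac{ab}{ab + (1-a)(1-b)}.
```

The expression is symmetric in the order of the two serves within each pair, so it does not depend on the rule R.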
What if the score does not pass through [6, 6]? The terminating
scores in this case comprise the set S1 = {[7, j] : j = 0, 1, . . . , 5}.
Several of the scores in this set do not satisfy Lemma 2. We can finesse
this by considering an extension of the tie-break to 12 points, regardless
of whether A has 7 points or not. Thus we define a second set,
S2 = {[i, j] : i > j, i + j = 12}. Note that S1 ∩ S2 = {[7, 5]}, and that
[7, 5] is the only element of S2 that does not satisfy Lemma 2. However,
in the extended game to 12 points the probability of the score [7, 5] is
invariant to the pso rule, because the game can end either . . . , 0, 1 or
. . . , 1, 0. That is, in the 12-point game player B can legitimately win
the last point, while in the tie-break he or she cannot. Therefore the
probability of every score in S2 is invariant to the pso rule R.
Now consider the relationship between S1 and S2 on the binomial
tree. Every path to S2 must pass through S1 , and every path through
S1 must terminate in S2 . Therefore for any given pso rule R the probability on S1 is the same as that on S2 . As the probability on S2 is
invariant to R, it follows immediately that the probability on S1 is too.
That is, the probability that A wins without passing through [6, 6] is
invariant to the pso rule R.
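The relationship between S1 and S2 can be illustrated by brute force. This is a sketch under stated assumptions: the function name `s1_s2_distributions`, the symbols a and b, and the numerical values are introduced here for illustration. It enumerates all 2^12 outcomes of the extended 12-point game, records the final score (an element of S2 when A is ahead) and the score at which A first reached seven points (an element of S1).

```python
from fractions import Fraction
from itertools import product

def s1_s2_distributions(a, b, servers):
    """Probabilities on S1 and S2 in the extended 12-point game.

    a = P(A wins a point | A serves), b = P(A wins a point | B serves).
    servers is a string of 12 characters, each 'A' or 'B'.
    """
    s1, s2 = {}, {}
    for outcomes in product((0, 1), repeat=12):
        prob = Fraction(1)
        i = j = 0
        first_seven = None          # score when A first reaches 7 points
        for won, srv in zip(outcomes, servers):
            p = a if srv == 'A' else b
            prob *= p if won else 1 - p
            i += won
            j += 1 - won
            if i == 7 and first_seven is None:
                first_seven = (i, j)
        if i > j:                   # final score lies in S2
            s2[(i, j)] = s2.get((i, j), 0) + prob
            s1[first_seven] = s1.get(first_seven, 0) + prob
    return s1, s2

a, b = Fraction(7, 10), Fraction(2, 5)   # illustrative values only
s1_a, s2_a = s1_s2_distributions(a, b, "ABBAABBAABBA")   # A serves first
s1_b, s2_b = s1_s2_distributions(a, b, "BAABBAABBAAB")   # B serves first

print(s2_a == s2_b)                                  # score by score on S2
print(sum(s1_a.values()) == sum(s1_b.values()))      # total mass on S1
print(s1_a[(7, 1)] == s1_b[(7, 1)])                  # an individual S1 score
```

In this sketch the individual score [7, 1] has different probabilities under the two orderings, while the total probability on S1 agrees, which is the distinction the argument relies on.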
This completes the proof of Theorem 1.
3 Tennis generalisations
Lemma 2 clearly generalises to all n ≥ 2, with appropriate modification
to S1 and S2 . Therefore we can extend our result from tie-breaks to
sets without tie-breaks, noting the following:
1. Assumption 1 is sufficient to ensure that the games themselves
comprise independent Bernoulli trials with fixed probability of
success depending only on who is serving.
2. The scoring for a set without a tie-break is exactly the same as
for the tie-break itself, except n = 6 rather than 7.
3. The two possible serving arrangements are ABAB · · · and BABA · · · , both of which are psos.
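The three points above can be sketched in code. This is an illustration under stated assumptions, not part of the original argument: `game_hold_prob` uses the standard closed form for a game played to four points, win by two, and `set_win_prob`, the symbols ga and gb, and the point probabilities are hypothetical names and values chosen for the example.

```python
from fractions import Fraction
from functools import lru_cache

def game_hold_prob(p):
    """P(server wins a game) when the server wins each point with prob. p."""
    q = 1 - p
    # Win to 0, 15 or 30, or reach deuce (3-3) and then win by two.
    return (p**4 * (1 + 4 * q + 10 * q**2)
            + 20 * p**3 * q**3 * p**2 / (p**2 + q**2))

def set_win_prob(ga, gb, a_serves_first, n=6):
    """P(A wins a set without a tie-break), serving ABAB... or BABA...

    ga = P(A wins a game | A serves), gb = P(A wins a game | B serves).
    """
    @lru_cache(maxsize=None)
    def win(i, j):
        if i >= n and i - j >= 2:
            return 1
        if j >= n and j - i >= 2:
            return 0
        if i == j and i >= n - 1:
            # Tied at [n-1, n-1]: each later pair has one serve apiece.
            return ga * gb / (ga * gb + (1 - ga) * (1 - gb))
        a_serving = ((i + j) % 2 == 0) == a_serves_first
        g = ga if a_serving else gb
        return g * win(i + 1, j) + (1 - g) * win(i, j + 1)
    return win(0, 0)

# Illustrative values only: each player's chance of winning a point on serve.
p_a, p_b = Fraction(13, 20), Fraction(3, 5)
ga = game_hold_prob(p_a)        # A holds serve
gb = 1 - game_hold_prob(p_b)    # A breaks B's serve

set_a_first = set_win_prob(ga, gb, True)
set_b_first = set_win_prob(ga, gb, False)
print(set_a_first == set_b_first)
```

Working with exact fractions means the two set probabilities can be compared for equality directly.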
For sets with tie-breaks we can condition our argument on the score
[6, 6] in games. Clearly this satisfies Lemma 2 when n = 6: players
A and B must win one apiece of the final two games, and so these
two games are interchangeable as described in Lemma 2. If the score
in games gets to [6, 6] then a tie-break is played, and we have already
shown that the outcome of the tie-break is invariant to who serves first.
But if the score in games does not get to [6, 6] we can apply exactly the
same reasoning as before, with S1 = {[6, j] : j ∈ {0, 1, . . . , 4}} ∪ {[7, 5]}
and S2 = {[i, j] : i > j, i + j = 12}.
So we have shown that under Assumption 1 the probability of
player A winning a set is independent of who serves first, regardless of
whether or not the set is terminated with a tie-break or with two-ahead.
And then it follows that under the same Assumption, the probability
of player A winning the match is independent of who serves first, i.e.,
independent of who wins the toss.
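The step from sets to the match can be written out as follows. The notation is an assumption of this note: s denotes the probability that A wins a set, taken here to be the same for every set (that is, all sets use the same format), and the match is best of 2m − 1 sets. Because s does not depend on who serves first in a set, it does not depend on the history of earlier sets either, so the sets behave as independent Bernoulli trials with success probability s:

```latex
P(\text{A wins the match})
  = \sum_{k=m}^{2m-1} \binom{2m-1}{k}\, s^{k} (1-s)^{2m-1-k},
\qquad\text{e.g. } m = 2:\quad s^{2}(3 - 2s).
```

The right-hand side is a function of s alone, so it inherits the independence from the first server.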
Finally, we note some recent evidence (Klaassen and Magnus, 2001)
suggesting that the simple Markov structure implied by our Assumption 1 is in fact a reasonable model. Klaassen and Magnus find that
with their sample of 86,298 points from Wimbledon 1992–1995 they
reject the hypothesis that points are identically and independently distributed (iid), but they note that the deviations from iid are small
and “. . . the assumption of iid in specific applications (such as forecasting) could be relatively harmless” (p. 506). This finding, which
suggests that our Assumption may be taken as ‘approximately true’,
implies that in practice the impact of serving first on the outcome of a
tie-break, a set, or a match, will be small and could be negligible.
References
Haigh, J. (1996), More on n-point, win-by-k games, Journal of Applied Probability, 33, 382–387.
Kemeny, J.G. and Snell, J.L. (1960), Finite Markov Chains, Princeton N.J., Van Nostrand Reinhold.
Klaassen, F.J.G.M. and Magnus, J.R. (2001), Are points in tennis independently and identically distributed? Evidence from a dynamic binary panel data model, Journal of the American Statistical Association, 96, 500–509.
Pollard, G.H. (1983), An analysis of classical and tie-breaker tennis, Australian Journal of Statistics, 25, 496–505.
Riddle, L.H. (1988), Probability models for tennis scoring, Applied Statistics, 37, 63–75.