The Uniformity Assumption in the Birthday Problem

Author(s): Geoffrey C. Berresford
Source: Mathematics Magazine, Vol. 53, No. 5 (Nov., 1980), pp. 286-288
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/2689391
GEOFFREY C. BERRESFORD
Long Island University
Greenvale, NY 11548
The birthday problem, to find the probability that in a group of $n$ people some two will share a common birthday, has occurred frequently in the literature since having been proposed in 1939 by von Mises. It is easily solved under the assumptions that each person's birthday is determined independently, and that the 365 possible birthdays (ignoring leap years) are equally likely. Under these independence and uniformity assumptions it is easy to show that the probability of a shared birthday reaches $\tfrac12$ as soon as the size of the group reaches 23.
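The uniform-case claim can be checked with a few lines of Python. This is only a sketch of the standard product formula under the independence and uniformity assumptions; the function names are illustrative and do not come from the article.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming independent, uniformly distributed birthdays."""
    all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        all_different *= (days - k) / days
    return 1.0 - all_different


def smallest_group(threshold=0.5, days=365):
    """Smallest n for which the shared-birthday probability reaches threshold."""
    n = 1
    while shared_birthday_probability(n, days) < threshold:
        n += 1
    return n


if __name__ == "__main__":
    print(smallest_group())                           # 23
    print(round(shared_birthday_probability(22), 4))  # 0.4757
    print(round(shared_birthday_probability(23), 4))  # 0.5073
```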
The reason for the uniformity assumption is interesting. Depending on the population, it may or may not be a reasonable approximation to reality, but in any case it is enormously convenient. To see this, let us consider the problem without assuming uniformity. To take full account of the 365 probabilities of being born on different days of the year, we let $p_i$ be the probability of being born on day $i$, $i = 1, \ldots, 365$, and obtain the (complementary) probability of $n$ independently chosen people all having different birthdays as

$$P(n) = n! \sum_{i_1 < \cdots < i_n} p_{i_1} \cdots p_{i_n}, \tag{1}$$

the sum being over all $n$-subsets of $\{1, 2, \ldots, 365\}$ such that $i_1 < i_2 < \cdots < i_n$. The difficulty is that the sum has $\binom{365}{n}$ terms, and for group size $n = 23$ this is $\binom{365}{23} \approx 10^{36}$ terms, which even the fastest computer would need $10^{2?}$ centuries to calculate.
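To make the size of the sum in (1) concrete, the following sketch counts its terms for $n = 23$ and evaluates (1) literally on a tiny example. The six-day "year" and its probabilities are invented purely for illustration, and the function name is an assumption of this sketch.

```python
from itertools import combinations
from math import comb, factorial, prod


def p_all_different_direct(p, n):
    """Evaluate formula (1) literally: n! times the sum, over all n-subsets
    of days, of the product of the chosen days' probabilities."""
    total = sum(prod(subset) for subset in combinations(p, n))
    return factorial(n) * total


if __name__ == "__main__":
    # Number of terms in (1) for the real problem: on the order of 10**36.
    print(f"{comb(365, 23):.3e}")

    # Hypothetical toy "year" of 6 days with unequal probabilities.
    toy = [0.10, 0.15, 0.15, 0.20, 0.20, 0.20]
    print(p_all_different_direct(toy, 3))
```

Direct enumeration like this is practical only for very small cases, which is the difficulty the text describes.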
Nevertheless, two observations can be made, one theoretical, one empirical. In a group of $n$ people, the probability of a shared birthday is least for the uniform distribution. Therefore, regardless of the actual distribution of birthdays, a group of size 23 is sufficient to make a shared birthday more probable than not. There are proofs of this in the literature, but the following version of Munford's 1977 proof [3] is perhaps the simplest.
We will show that the (complementary) probability $P(n)$ of $n$ independently chosen people having different birthdays is greatest for the uniform distribution. We assume, of course, that $n \ge 2$ and that at least $n$ of the $p_i$'s are nonzero, so that $n$ different birthdays are possible. We will show how $P(n)$ changes if we alter the values of two unequal probabilities, say $p_1$ and $p_2$, by replacing them with their common mean, $\tfrac12(p_1 + p_2)$. Let us partition the sum (1) into three parts: those terms containing both $p_1$ and $p_2$, those terms containing just one of $p_1$ and $p_2$, and those terms containing neither $p_1$ nor $p_2$. Let us also factor out $p_1$ and $p_2$ whenever they occur. We may then express $P(n)$ as
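$$P(n) = n! \left[ p_1 p_2 \sum_{2 < i_1 < \cdots < i_{n-2}} p_{i_1} \cdots p_{i_{n-2}} + (p_1 + p_2) \sum_{2 < i_1 < \cdots < i_{n-1}} p_{i_1} \cdots p_{i_{n-1}} + \sum_{2 < i_1 < \cdots < i_n} p_{i_1} \cdots p_{i_n} \right]. \tag{2}$$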
If we now replace both $p_1$ and $p_2$ by their common mean, $\tfrac12(p_1 + p_2)$, thus leaving their sum unchanged, only the first term in (2) changes, in which we must replace $p_1 p_2$ by $\left(\tfrac12(p_1 + p_2)\right)^2$. But since $p_1 p_2 < \left(\tfrac12(p_1 + p_2)\right)^2$ (taking the square root of each side, this is merely the statement that the geometric mean of two unequal numbers is less than their arithmetic mean), this replacement will only increase the value of $P(n)$. Thus since the probability $P(n)$ of different birthdays can be increased by this operation whenever two $p_i$'s are unequal, it must be greatest when all the $p_i$'s are equal, that is, for the uniform distribution.
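The averaging step in the proof can be tried numerically on a small case. The sketch below evaluates (1) by direct enumeration for an invented five-day distribution, replaces two unequal probabilities by their mean, and compares the results with the uniform case. The distribution and the helper names are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial, prod


def p_all_different(p, n):
    """Formula (1) by direct enumeration (only practical for tiny cases)."""
    return factorial(n) * sum(prod(c) for c in combinations(p, n))


def average_two(p, i, j):
    """Replace p[i] and p[j] by their common mean, leaving their sum unchanged."""
    q = list(p)
    q[i] = q[j] = (p[i] + p[j]) / 2
    return q


if __name__ == "__main__":
    n = 3
    p = [0.05, 0.35, 0.10, 0.20, 0.30]   # hypothetical 5-day "year"
    print(p_all_different(p, n))
    q = average_two(p, 0, 1)
    print(p_all_different(q, n))          # larger than for p
    uniform = [1 / len(p)] * len(p)
    print(p_all_different(uniform, n))    # largest of the three
```

Repeating the averaging on any remaining unequal pair keeps raising the value, in line with the argument that the uniform distribution gives the maximum.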
The second observation involves comparing the uniformity assumption with actual data. FIGURE 1 is a graph of the empirical probabilities of a birth occurring on any day of the year 1977 for the 239,762 live births in New York State (source: New York State Health Department). I leave it to the reader to surmise reasons for the obvious weekly cyclical component. The empirical probabilities vary from a low of .002135 (on Sunday, December 11th) to a high of .003478 (on Wednesday, July 6th), a variation of almost 27% from the mean of 1/365. Thus for a population born in a given year (the type of population from which most school classes are drawn) the assumption of uniformity is not valid. However, because a given birthday will fall on different days of the week in different years (since 365 is relatively prime to 7) in a population of mixed ages the weekly cycle will be averaged out. For such a population uniformity will be a reasonable assumption, as is shown by the graph in FIGURE 2, in which the data is as in FIGURE 1 except that each daily probability has been averaged with the six following it to remove the weekly cycle. The variation here is only about 10% above and below the mean of 1/365.
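The smoothing described for FIGURE 2, averaging each day with the six that follow it, can be sketched as below. The input is synthetic and is not the New York data, and dropping the last six days of the year is an assumption, since the article does not say how the end of the year was handled.

```python
def forward_weekly_average(daily, window=7):
    """Average each value with the (window - 1) values that follow it.

    Days too close to the end of the list to have a full window are dropped;
    how the article treated the end of the year is not stated, so this
    truncation is an assumption.
    """
    return [
        sum(daily[i:i + window]) / window
        for i in range(len(daily) - window + 1)
    ]


if __name__ == "__main__":
    # Synthetic data, NOT the New York figures: a flat rate of 1/365 with a
    # made-up weekly pattern (weekday highs, weekend lows) that sums to zero
    # over any 7 consecutive days.
    base = 1 / 365
    weekly = [0.0002, 0.0003, 0.0003, 0.0002, 0.0001, -0.0005, -0.0006]
    daily = [base + weekly[d % 7] for d in range(365)]

    smoothed = forward_weekly_average(daily)
    print(max(daily) - min(daily))        # spread with the weekly cycle
    print(max(smoothed) - min(smoothed))  # essentially zero after averaging
```

Because every window of seven consecutive days contains each weekday exactly once, a purely weekly component cancels, leaving only slower seasonal variation in real data.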
FIGURE 1. [Graph of the empirical daily birth probabilities for 1977; horizontal axis JAN through DEC, vertical axis from .0020 to .0034.]
FIGURE 2. [Graph of the same data after seven-day averaging; horizontal axis JAN through DEC, vertical axis from .0020 to .0034.]
Although the birthday problem with 365 different probabilities is hopelessly intractable, it becomes manageable if one allows a smaller number, $m$, of different probabilities, $p_1, \ldots, p_m$, and rounds each of the actual empirical probabilities to the nearest of these. Then if $d_i$ is the number of days being assigned probability $p_i$, subject to the obvious consistency relations $\sum d_i = 365$, $\sum d_i p_i = 1$, the probability of $n$ people all having different birthdays is

$$n! \sum_{n_1 + \cdots + n_m = n} \; \prod_{i=1}^{m} \binom{d_i}{n_i} p_i^{\,n_i},$$

where the sum is over the $\binom{m+n-1}{n}$ different $m$-tuples $(n_1, \ldots, n_m)$ of nonnegative integers satisfying $n_1 + \cdots + n_m = n$ (see Feller [2], p. 38).
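The grouped formula above lends itself to a short program. The sketch below accumulates the sum one probability class at a time, which yields the same total as enumerating the $m$-tuples directly. It is not the Univac program used for the article, and the two-class weekday/weekend split in the usage example is a made-up illustration, not the article's $m = 10$ classes.

```python
from math import comb, factorial


def p_all_different_grouped(d, p, n):
    """n! * sum over (n_1..n_m), n_1+...+n_m = n, of prod C(d_i, n_i) * p_i**n_i.

    The sum is accumulated one probability class at a time: coeff[k] holds
    the partial sum over the classes seen so far with n_1 + ... = k.
    """
    coeff = [1.0] + [0.0] * n
    for d_i, p_i in zip(d, p):
        new = [0.0] * (n + 1)
        for k, c in enumerate(coeff):
            if c == 0.0:
                continue
            # n_i people fall in this class; at most d_i distinct days exist.
            for n_i in range(min(d_i, n - k) + 1):
                new[k + n_i] += c * comb(d_i, n_i) * p_i ** n_i
        coeff = new
    return factorial(n) * coeff[n]


if __name__ == "__main__":
    # One class of 365 equally likely days reproduces the uniform case.
    print(1 - p_all_different_grouped([365], [1 / 365], 23))

    # Hypothetical two-class split: 260 "weekday" dates slightly more likely
    # than 105 "weekend" dates, scaled so that sum(d_i * p_i) = 1.
    p_weekday = 1.05 / 365
    p_weekend = (1 - 260 * p_weekday) / 105
    print(1 - p_all_different_grouped([260, 105], [p_weekday, p_weekend], 23))
```

With $m$ classes the work grows roughly like $m n^2$ rather than with the number of $m$-tuples, which is why this arrangement of the sum was chosen for the sketch.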
                  Probability of a Shared Birthday
Size (n)
of Group          Uniform Case      Non-Uniform Case
   12                .1670               .1683
   15                .2529               .2537
   18                .3469               .3491
   21                .4437               .4463
   22                .4757               .4783
   23                .5073               .5101

TABLE 1.
A computer calculation (performed on a Univac 1100/82 with 18 digit precision) using $m = 10$ different probabilities based on the empirical probabilities from the 1977 New York State data graphed in FIGURE 1 shows that the probability of a shared birthday is surprisingly robust: a group size of 23 is still required to raise the probability above one-half (see TABLE 1).
I would like to thank the C. W. Post Computer Center and the C. W. Post Research Committee for their assistance.
References
[1] D. M. Bloom, A birthday problem, Amer. Math. Monthly, 80 (1973) 1141-1142.
[2] W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1, 3rd ed., John Wiley, New York, 1968.
[3] A. G. Munford, A note on the uniformity assumption in the birthday problem, Amer. Statist., 31, no. 3 (1977) 119.