JOURNAL AND PROCEEDINGS
of
YOUNG ARCHIMEDES
Volume 1, Number 1
2015
“Providing a forum to exchange mathematical ideas,
activities, and/or sharing and interpreting high school
research.”
Contents
Osman, F., Miyadera, R.: Editorial Board
Osman, F.: Solutions of Higher Order Dispersion Terms in the Nonlinear Schrödinger Equation
Ballador, W.L., Gallaza, A.L., Lazaro, L.: Pattern for Centre of Twin Primes
Millican, C.: Blu-Ray vs. DVD vs. CD: A Mathematical Exploration: Why Archimedean Spirals lie at the Centre of the differences between them?
Zheng, Y.: Bayes’ Theorem and Its Applications in Law
Gereis, J., Wu, V.: Effect of surface area and volume on rate of cooling
Kitagawa, T., Nakamura, S.: New Chocolate Games
The Editor, Journal and Proceedings of Young Archimedes
Trinity Grammar School,
119 Prospect Road,
SUMMER HILL NSW AUSTRALIA 2130
Published: January 2015
ISSN online 2204-6534
Journal and Proceedings of Young Archimedes, vol. 1, no. 1, 2015
page 2
EDITORIAL BOARD
Dr Frederick Osman and Dr Ryohei Miyadera
BACKGROUND: On the 24th of September 2013, the Trinity Grammar School Music and Mathematics
tour departed from Sydney International Airport and began a journey to Kwansei Gakuin Senior High
School, Nishinomiya, Japan.
The main purpose of this tour was to establish a relationship with Kwansei Gakuin to develop an
international programme that would go beyond the current Rugby connection, to promote cross-cultural awareness through extensive exchange programmes that challenge the mind, body and spirit
for both students and staff from both schools.
JOURNAL NAME: Archimedes was a Greek mathematician, physicist, engineer, inventor, and
astronomer. Archimedes is generally considered to be the greatest mathematician of antiquity and
one of the greatest of all time.
AIMS:
1. The Journal and Proceedings of Young Archimedes publishes academic online papers of
secondary students in the fields of Mathematics Applications.
2. To provide a forum to exchange mathematical ideas, activities, and/or sharing and interpreting
high school research.
3. To pioneer a new field of educational endeavour to be the first Mathematics International
Journal publication for High Schoolers.
4. To increase the relationship and strengthen the academic links between Trinity and Kwansei
Gakuin.
5. To promote cross-cultural understanding between Australia and Japan and affirm our academic
relationship as brother Schools.
6. To have students in all departments completing HSC and/or International Baccalaureate
essays or projects with relevance to the fields of Mathematics Applications submit a paper for
refereeing within the Journal and Proceedings of Young Archimedes.
OUTCOMES:
1. Issues are scheduled to be published in June and December of each Year.
2. Maximum of six long papers (max 6 pages) or twelve short papers (max 3 pages) for each
issue.
3. An electronic online version of each issue is to be posted to the Trinity Grammar School
Mathematics Club web site publication.
Journal and Proceedings of Young Archimedes
The Journal and Proceedings of Young Archimedes publishes academic online papers of secondary
students in the fields of Mathematics Applications and provides a forum to exchange mathematical
ideas, activities, and/or sharing and interpreting high school research.
Manuscripts will be reviewed by the Editor, in consultation with the Associate Editors, to decide whether
the paper will be considered for publication in the Journal. Issues are scheduled to be published in
June and December. An electronic version of each issue is posted to the Trinity Grammar School
Mathematics Club web site http://bit.ly/young_archimedes as a formal publication. Enquiries relating
to copyright or reproduction of an article should be directed to the author.
Information for Authors
Manuscripts are only accepted in digital format and should be e-mailed:
>> In Australia to Dr Frederick Osman on [email protected] or
>> In Japan to Dr Ryohei Miyadera on [email protected]
The template below should be used as a guide for preparing manuscripts.
If the file-size is too large to email, it should be placed on a CD-ROM or other digital media and posted to:
The Editor, Journal and Proceedings of Young Archimedes
Trinity Grammar School,
119 Prospect Road,
SUMMER HILL NSW AUSTRALIA 2130
Editorial Board
Dr Frederick Osman has more than 20 years of academic and industry experience
in innovative teaching and research in Physics and Mathematics education. His
research background and achievements have been attained in laser plasma interaction for inertial
confinement fusion including work on several plasma effects. He is currently the Director of Vocational
Education and the Master in Charge of the Mathematics Club at Trinity Grammar School.
Dr Ryohei Miyadera received a Ph.D. in Mathematics at Osaka City University and received a second
Ph.D. in mathematics education at Kobe University. He has two fields of research: probability theory of
functions with values in an abstract space and applications of Mathematica to discrete mathematics.
He and his high school students have been doing research in discrete mathematics for more than 15
years.
ASSOCIATE EDITORS
Professor Robert Cowen is Mathematics Emeritus at Queens College, CUNY. He uses Mathematica
in his own research and has written a textbook with John Kennedy called Discovering Mathematics
with Mathematica.
Professor Heinrich Hora is known for his work on the theory for fusion energy with lasers. He has
published more than 450 papers on laser-plasma interaction and inertial nuclear fusion, ponderomotive
and relativistic self-focusing, laser acceleration of particles, correspondence principle of electromagnetic
interaction and accuracy principle of nonlinearity.
Professor Yaichi Shinohara is Mathematics Emeritus of Kwansei Gakuin University. His research
background and achievements have been in Topology and Knot Theory. He is a respected
mathematician in the fields of algebra and geometry.
Professor Tadashi Takahashi is a Mathematics Professor at Konan University. He is the President
of Japan Society for Symbolic and Algebraic Computation and is also the President of the Game
Amusement Society. His research background and achievements have been in Computer Algebra,
Singularity Theory and Efficient Use of Computer in Mathematics Education.
Dr Jonny Bernas Pornel is an Assistant Professor of Mathematics in the University of the Philippines
Visayas. He promotes the use of Lesson Study among teachers of Science and Mathematics and the
enhancement of mathematical creativity among students.
Dr Nethal K. Jajo is Modelling and Projection Analyst at the University of Sydney, Australia. He has
a PhD degree in Mathematical Statistics and Probability Theory with professional training in Discrete
Event Simulation and System Dynamics Modelling. His industrial and academic research activities
include: Data analysis, data mining, partial least squares path modelling, regression analysis and
dynamic simulation.
Dr Katsuyuki Yoshikawa is an International expert in the area of Knot Theory. He has attained the
Takebe-Award for his research on the four dimensional topology from the Mathematical Society of
Japan.
Edward Habkouk is an experienced teacher of NSW Mathematics courses to HSC level and he is an
HSC Mathematics Extension 1 marker (since 2000) and IB Mathematics Examiner (particularly SL,
paper 1) since 1999. He is currently the Dean of Mathematics at Trinity Grammar School.
Stephen McAndrew is an experienced Physics Teacher at Trinity Grammar School, having taught
in Australia and the UK. His research background is in Applied Mathematics, in particular the areas
of classical mechanics, fluid mechanics and electromagnetism. He is currently involved in a PhD
research in magnetohydrodynamic shock waves.
Katsuya Mori is a teacher at Takarazuka Higashi High School who is doing research in Mathematics with
his students. His students' papers have been published in the Rose-Hulman Undergraduate Mathematics
Journal. He holds an M.Sc. from Kyoto University with a major in algebraic geometry.
Shane Scott is an experienced teacher of NSW Mathematics courses to HSC level at Trinity Grammar
School. He is an executive member of the Mathematical Association of New South Wales. He has won
the NSW Premier’s Scholarship for Mathematical Teaching and has travelled to Germany, UK and the
US to attend and present at International Schools and Mathematics conferences.
Yuko Matsuda is known as an experienced computer scientist with many years’ experience in artificial
intelligence, data science, language design and super-computing based on symbolic computation.
Sample template format for Authors to the Journal and Proceedings of
Young Archimedes (16 point font size)
First name Last name
Department, Institution, City, Country
E-mail: name@domain-name
Abstract
This is the layout and template for a paper to be submitted to The Journal and Proceedings of Young Archimedes.
Introduction (12 point font size)
The Journal and Proceedings of Young Archimedes publishes academic online papers of secondary students in the fields of Mathematics Applications and provides a forum to exchange mathematical ideas, activities, and/or sharing and interpreting high school research. Papers may be submitted electronically only to the editors Dr Frederick Osman on [email protected] from Trinity Grammar School Australia and Dr Ryohei Miyadera on [email protected] from Kwansei Gakuin High School, Nishinomiya City Japan.
Acknowledgement of receipt of the submission will be sent to the corresponding author’s e-mail address. It is the author’s responsibility to submit an accurate manuscript; any errors in spelling, grammar, or scientific content may be reproduced as typed by the author. Manuscripts will be reviewed by the Editor, in consultation with the Associate Editors, to decide whether the paper will be considered for publication in the Journal. Accepted papers will be published electronically on the Trinity Grammar School Mathematics Club web-site.
Layout and style
A Times New Roman font is used for the main text. The font size is 11 points; main section headings should use a font size of 12 points. It is important that when the final PDF file is created, all fonts used are embedded. Two columns are used except for the title and abstract section and possibly for large figures, tables or photographs that need a full-page width. If you have any questions regarding paper submission, please contact the editors, Dr Frederick Osman from Trinity Grammar School, Sydney AUSTRALIA and Dr Ryohei Miyadera from Kwansei Gakuin High School, Nishinomiya City JAPAN.
Equations (11 point font size)
Equations should be placed on separate lines and numbered. An example of an equation is given below:
$$ f = -\nabla p + f_N, \qquad (1) $$
where $f_N$ is the nonlinear force (Osman, 2000).
Figures
All figures must be centered on the column (or page, if the figure spans both columns).
Figure 1: Generation of blocks of deuterium plasma moving against the neodymium glass laser light (Osman, 2005).
References (11 point font size)
Cang, Y., Osman, F., Hora, H., Zhang, J., Badziak, J., Wolowski, J., Jungwirth, K., Rohlena, J., Ullschmied, J. (2005) Computations for Nonlinear Force driven plasma blocks by picosecond laser pulses for fusion, Journal of Plasma Physics, 71, 35-51.
Osman, F., Castillo, R., and Hora, H. (2000) Focusing and Defocusing of the Nonlinear Paraxial Equation at Laser-Plasma Interaction, Laser and Particle Beams, 18, 73-79.
Osman, F. (2005) Guest editor’s preface: Workshop on
fast high-density plasma blocks driven by picoseconds
and terawatt lasers. Laser and Particle Beams, 23, 399-40
Solutions of Higher Order Dispersion Terms in the
Nonlinear Schrödinger Equation
Dr Frederick Osman
Trinity Grammar School, Mathematics Department, Summer Hill NSW, Australia
E-mail: [email protected]
Abstract
This paper presents the nonlinearity and dispersion effects involved in the propagation of optical solitons, which can be understood by using a numerical routine to solve the Nonlinear Schrödinger Equation (NLSE). A sequence of code has been developed in Mathematica to explore in depth several features of the optical soliton’s formation and propagation, and the results give a very clear idea of this interesting and important practical phenomenon. The resonant radiation of solitons due to higher order dispersive effects will be seen here to cause increasing turbulence which will ultimately lead to severe damping of the soliton, resulting in its diminished usefulness in telecommunications and other fields, including self-focusing. It is believed that these results will be of considerable use in any work or research that uses the self-focusing properties of the soliton [9], and this paper sets out to explain why the higher order optical soliton should be considered in such research.
Introduction
The field of nonlinear optics has developed in recent years as nonlinear materials have become available and widespread applications have become apparent. This is particularly true for optical solitons and other types of nonlinear pulse transmission in optical fibres and laser plasma interaction [6, 9]. Subsequently, this form of light propagation can be utilised in the future for very high capacity dispersion free communications.
The purpose of this paper is to describe the
use of a very powerful tool to solve the
generalized NLSE that has stable solutions
called optical solitons [2]. The solitary wave
(or soliton) is a wave that consists of a
single symmetrical hump that propagates at
uniform velocity without changing its form.
The physical origin of solitons is the Kerr
effect, which relies on a nonlinear dielectric
constant that can balance the group
dispersion in the optical propagation
medium. The resulting effect of this balance
is the propagation of solitons, which has the
form of a hyperbolic secant [13].
Nonlinear Schrödinger Equation
The Nonlinear Schrödinger Equation (NLSE) used in this paper is generalised as:
$$ i\frac{\partial u}{\partial \xi} + \frac{1}{2}\frac{\partial^2 u}{\partial \tau^2} + |u|^2 u = i\Gamma u + |i|^{n}\,\beta_n \frac{\partial^n u}{\partial \tau^n} \qquad (1) $$
where $n$ is the order of dispersion of the Schrödinger Equation being used and $\beta_n$ is an arbitrary constant. The electric field is
considered as a monochromatic wave propagating along the x-axis with the wave number $k$ and angular frequency $\omega$, that is, the field $E$ is assumed to be in the expansion form:
$$ E(r, \theta, x, t) = \sum_{l=-\infty}^{\infty} E_l(r, \theta, \xi, \tau; \varepsilon)\, \exp[i(k_l x - \omega_l t)] \qquad (2) $$
with $E_{-l} = E_l^*$ (complex conjugate), where $k_l = lk$, $\omega_l = l\omega$ and the summation is taken over all harmonics generated by the nonlinearity due to the Kerr effect, and $E_l(r, \theta, \xi, \tau; \varepsilon)$ is the envelope of the lth
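harmonic changing slowly in x and t. The slow variables $\xi$ and $\tau$ are defined by:
$$ \xi = \varepsilon^2 z \quad \text{and} \quad \tau = \varepsilon\left(t - \frac{z}{V_g}\right) \qquad (3) $$
From Eq. (2) and Eq. (3), the displacement is found by:
$$ D = \epsilon * E = \sum_{l} D_l \exp[i(k_l z - \omega_l t)] \qquad (4) $$
It has been shown by [3] that $E_l(r, \theta, \xi, \tau; \varepsilon)$ can be expanded in terms of $\varepsilon$:
$$ E_l(r, \theta, \xi, \tau; \varepsilon) = \sum_{n=1}^{\infty} \varepsilon^n E_l^{(n)}(r, \theta, \xi, \tau) \qquad (5) $$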
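From which the generalised NLSE for $u_1^{(1)}(\xi, \tau)$ is obtained [6]:
$$ i\frac{\partial u}{\partial \xi} + \frac{1}{2}\frac{\partial^2 u}{\partial \tau^2} + |u|^2 u = i\Gamma u + i\beta_3 \frac{\partial^3 u}{\partial \tau^3} \qquad (6) $$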
Now the new variables and constants are
introduced:
  z :  t 
u

0
2k 0
3
z
vg
(7a)
1
u0 0 : Vg =
k
= k  (7b)

6 u  k 
u0
0 0
: 
3 u 
6k  2k 0 k 
0 0
 3k
 2k
k  
: k  
3
 2

(7c)
(7d)
The importance of Eq. (6) is that it can be solved in normalised reference coordinates, which gives a clear view of the evolution of the envelope along the normalised propagation path. This will allow us to study the different cases, such as the
classical situation, where   0 which
results in the standard Nonlinear
Schrödinger Equation [9].
Initial Conditions and Programming
for Higher Orders
The Nonlinear Schrödinger Equation can be solved exactly by the inverse scattering method. A planar stationary light beam in a medium with a nonlinear refractive index can be described in dimensionless form [5]:
$$ i\frac{\partial u}{\partial t} + \frac{\partial^2 u}{\partial x^2} + k|u|^2 u = 0 \qquad (8) $$
The exact inverse scattering method is applicable to equations of the type:
$$ \frac{\partial u}{\partial t} = \hat{S}[u] \qquad (9) $$
where $\hat{S}$ is a nonlinear operator differential in $x$, which can be represented in the form
$$ \frac{\partial \hat{L}}{\partial t} = i[\hat{L}, \hat{A}] \qquad (10) $$
Here $\hat{L}$ and $\hat{A}$ are linear differential operators containing the sought function $u(x, t)$ in the form of a coefficient. The result in Eq. (8) can be verified in Eq. (10) with the operators $\hat{L}$ and $\hat{A}$ taking the form of the Nonlinear Schrödinger Equation:
$$ u(x, t) = 2\eta\, \operatorname{sech}[2\eta(x - x_0) + 8\eta\xi t]\, \exp[i(-2\xi x - 4(\xi^2 - \eta^2)t + \varphi)] \qquad (11) $$
where $\eta, \xi, \varphi, t, x_0$ are scaling parameters. This form of solution is known as a soliton and has a stable form.
The soliton of Eq. (11) is the simplest representative of an extensive family of exact solutions of Eq. (9). In the general case such a solution can also be called an N-soliton solution, which depends on 4N arbitrary constants, $\eta_j, \xi_j, \varphi_j, t_j, x_{0j}$. However, for non-coinciding $\xi_j$ this solution breaks into individual solitons as $t \to \pm\infty$. Using this solution and beginning at the origin $x = 0$, the initial waveform can be written as [9]:
$$ u(0, t) = \eta\, \operatorname{sech}[t - t_0] \qquad (12) $$
Using this programming method, with the wide range of iteration models available in Mathematica, we get a three dimensional representation of the wave as given below in Figure 1. To achieve this result we have used the new “NDSolve” command in Mathematica Version 5 to solve Eq. (6) for a given value of n, with an initial condition of the form of Eq. (12) and with the resultant wave given an arbitrary wavelength; in this case $x = 40$. This is then plotted in three dimensions using the “Plot3D” command.
Figure 1: Basic waveform for higher order
dispersion
This figure is obtained when all the higher order coefficients are zero. Here only the plain soliton is visible; no radiations are present. This is a classic example of a soliton wave: it is a single hump, clearly seen at time zero, it tails off in two directions, viz. time, and it travels along the time line for very great distances without change or distortion. This is why it is important to find an order of solitons that do not decay or radiate to form wave packets, unless of course a packet is specified as being best for a stipulated use. For the graphs reproduced in this paper the resolution is set at PlotPoints → 1024 and ImageSize → 600. The “NDSolve” command in Mathematica allows the user to choose from a wide range of mathematical iteration techniques in finding numerical solutions for differential equations.
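The kind of numerical experiment described above can be sketched outside Mathematica. The following is a minimal Python illustration, not the NDSolve routine used in this paper: it swaps in a split-step Fourier scheme and assumes the model i u_ξ + (1/2) u_ττ + |u|² u = i β₃ u_τττ with no loss term. The sign convention of the β₃ term, the grid size, the step count and all function names are assumptions made for the example.

```python
import cmath
import math

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    sign = 1.0 if inverse else -1.0
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def ifft(values):
    """Inverse FFT including the 1/n normalisation."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]

def split_step_nlse(u0, dtau, xi_max, steps, beta3=0.0):
    """Propagate u(0, tau) to u(xi_max, tau) with Strang splitting.

    Assumed model: i u_xi + (1/2) u_tautau + |u|^2 u = i * beta3 * u_tautautau.
    """
    n = len(u0)
    dxi = xi_max / steps
    # Angular frequencies in FFT ordering (non-negative first, then negative).
    omega = [2.0 * math.pi * (k if k < n // 2 else k - n) / (n * dtau) for k in range(n)]
    # Linear part in Fourier space: d(u_hat)/d(xi) = -i (omega^2 / 2 + beta3 * omega^3) u_hat.
    half_linear = [cmath.exp(-1j * (0.5 * w**2 + beta3 * w**3) * dxi / 2.0) for w in omega]
    u = [complex(v) for v in u0]
    for _ in range(steps):
        u = ifft([h * c for h, c in zip(half_linear, fft(u))])   # half linear step
        u = [v * cmath.exp(1j * abs(v)**2 * dxi) for v in u]     # full nonlinear step
        u = ifft([h * c for h, c in zip(half_linear, fft(u))])   # half linear step
    return u

n_points, half_width = 128, 16.0
dtau = 2.0 * half_width / n_points
tau = [-half_width + i * dtau for i in range(n_points)]
u0 = [1.0 / math.cosh(t) for t in tau]  # sech-shaped initial pulse

plain = split_step_nlse(u0, dtau, xi_max=2.0, steps=200)
perturbed = split_step_nlse(u0, dtau, xi_max=2.0, steps=200, beta3=0.05)
print("peak |u| without third-order term:", max(abs(v) for v in plain))
print("peak |u| with beta3 = 0.05:", max(abs(v) for v in perturbed))
```

With the third-order coefficient set to zero the sech pulse should keep its height, which is the behaviour described for the plain soliton. Comparing the two runs gives a rough way to see the effect of the extra dispersion term.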
Numerical Results
We have found from the above method that
for the dispersion orders from 3 to 6
inclusively there is a point along the
evolution of the coefficient for each order of
dispersion where radiation becomes visibly
evident.
Figure 2: High order wave (in this case 5th order at coefficient 0.019) when radiation first becomes noticeable.
This figure shows a small packet of radiations coming from the outer extremities of the wave. The soliton is still clearly defined; however, as the wave travels further away from time zero in either direction, the radiations become more evident and complex.
Figure 3: Wave for higher order dispersion (third order coefficient 1.0) showing well developed radiation.
This figure clearly shows the packet of radiations, running parallel to the wave and travelling outwards from the parent wave. There is still an area of calm along the zero time line and the soliton hump is still clearly visible.
Karpman [7, 8] has defined the third and fourth order dispersion equations, and these are used throughout this work as the basis for programming the Schrödinger equation. This level of dispersion is covered by the work of Karpman and Shagalov [7, 8]. We now carry on towards the next orders of dispersion. Using this technique and extending it to waves of the fifth, sixth and seventh orders, we find a distinctive trend in the area along the evolution of the coefficients of the higher orders of dispersion, where the phenomenon of radiating solitons occurs. This is set out in Table 1, below.
Figure 4: Higher order wave with heavy
radiation. This is 4th order at coefficient
0.6549, which is at the threshold of
breaking.
In this figure we can see that the radiations
extending outwards from the waves have
merged to form an area between the waves
of general turbulence, where radiations from
various directions have merged. The
shrinking area of relative calm around the
zero time line is still visible. The parent
soliton is still visible; though it is difficult to
see where such a packet would be useful,
given the level of radiating interference.
To make the coefficient any larger than this
causes computation overflow, or breakdown
of the wave, caused by the radiation
turbulence taking over.
Table 1: Coefficient Values for each Order of Dispersion.
Order of Dispersion   Computation Starts   Radiation Effect   Computation Overflow
3                     0.0225               0.2                260.647
4                     0.015                0.05               0.655
5                     0.0006               0.01               0.0292
6                     0.00011              0.001              0.00134
7                     0.0000081            -                  0.000102
8                     0.000000002                             0.0000000442
From the results above it is evident that as
the coefficient increases, a point is reached
where the soliton begins to radiate,
changing it from a solitary wave to a wave
packet. From observing the time taken for
compilation by the computer, however, it is
evident that some activity is present even
though it cannot be discerned from the
simulations. For the basic wave, that is
without a dispersion term of 3 or greater
order, the compilation time is slightly over 2
seconds. We have included a column here
to indicate the coefficient required to raise
the compilation time more than one second
above ground zero. At higher levels
computation time can run into hours. The
upper limit expressed here is the point at which self-focusing comes into effect, thus emphasising the reference made above to its usefulness in such research [6, 9].
Figure 5: Coefficient as a function of
dispersion order
From this figure it will be noted that the point at which the order of dispersion being studied starts to make its presence felt on the computation can be related to the point at which breaking or computation overflow occurs. The point at which
radiation becomes a visible phenomenon
converges onto the point of breaking and
seems to be in a position to cross over, thus
implying that radiation would still occur at a
higher coefficient had the wave not broken.
This is difficult to verify, however, since we
have no graphic output beyond computation
overflow.
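The convergence described above can be made concrete with a short calculation on the Table 1 values. The sketch below assumes that the four numeric Radiation Effect entries belong to orders 3 to 6, in the order they are listed; the dictionary names are illustrative only.

```python
# Threshold coefficients read from Table 1 for the orders where both values are listed.
# Assumption: the Radiation Effect entries 0.2, 0.05, 0.01, 0.001 belong to orders 3 to 6.
radiation_onset = {3: 0.2, 4: 0.05, 5: 0.01, 6: 0.001}
overflow = {3: 260.647, 4: 0.655, 5: 0.0292, 6: 0.00134}

# A ratio close to 1 means radiation only appears just before the wave breaks.
ratios = {order: radiation_onset[order] / overflow[order] for order in radiation_onset}
for order, ratio in ratios.items():
    print(f"order {order}: radiation onset / overflow = {ratio:.4f}")
```

Under that reading of the table, the ratio grows with each order and stays below 1 up to order 6, which is one way of expressing the statement that the radiation point converges onto the breaking point.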
Error Tolerance for Iteration Method
Since these numerical results are obtained by the computer performing iteration for a given set of parameters, there will be an expected margin of error. The question of interest is whether or not this error margin is within the acceptable boundaries.
Mathematica can be programmed to give a
specified accuracy goal as a specification of
the number of digits required to be
absolutely correct. Mathematica will then
either produce a result with no error
message, in which case the required
accuracy goal has been met, or it will give
an error message stating where the accuracy
requirement was breached. In the case of
this poster we tested each order of
dispersion to find at what number of digits
the error message first appeared. This gave
us the minimum error tolerance for each
order. These results are set out in Table 2.
Table 2: Percentage Error Expected at each Level
Order of Dispersion   Computation Starts   Accuracy of Digits   Maximum % of Error
3                     0.0225               6                    0.002
4                     0.015                5                    0.03
5                     0.0006               8                    0.0008
6                     0.00011              10000+
7                     0.0000081            10                   0.06
8                     0.000000002          8                    2500
From this table, it is clearly indicated that in
terms of error due to iteration, the sixth
order is vastly superior to any other; since
this is the lowest order where the breaking
point of the wave is reached before visible
radiation is evident and the required order of
accuracy is never violated, even when taken
to ludicrous extremes (in this case 10000
decimal places!). Above order 7 the
accuracy quickly degenerates and order 8
and above are outside the boundaries of
practical usefulness.
Conclusion
From the results shown in this paper, it is
abundantly clear that the best possible
results are to be obtained from using the
sixth order (of dispersion) Schrödinger
Equation. For this order there is no visible
radiation, up to the point where increasing
the sixth order coefficient causes
computation overflow and/or breaking of
the wave, thus rendering it free from the
effects of turbulence and eliminating the
associated damping of the soliton. This in
turn cuts down any negative effects on self-focusing normally caused by wave
turbulence. As an added bonus the expected
error in the computed iteration is negligible.
Consequently we can say that the sixth
order dispersion will give all the advantages
of a higher order soliton, most important
being a higher self-focusing ratio, without
all the setbacks encountered with higher
orders, the most disruptive of these being
soliton damping.
References
[1] AKHMEDIEV, N. N. and ANKIEWICZ A. (1997) Solitons,
Nonlinear Pulses and Beams, Canberra: Chapman & Hall.
[2] DRAZIN, G., JOHNSON, R.S. (1990) Solitons: An
Introduction, Cambridge University Press.
[3] HASEGAWA, A. (1989) Optical Solitons in Fibres, Springer-Verlag, Berlin.
[4] HAUS, H. A. (1981) Optical Fiber Solitons, their Properties
and Uses, Proceedings of the IEEE. Vol. 81. No. 7.
[5] HAUS, H. A. (1993) Molding Light into Solitons, IEEE
Spectrum.
[6] HORA, H., OSMAN, F., HÖPFL, R., BADZIAK, J., PARYS,
J., WOLOWSKI, E., WORYNA, E., BOODY, F.,
JUNGWIRTH, K., KRALIKOVA, J., KRAZA, L., LASKA, M.,
PFEIFER, M., ROHLENA, R., SKALA, J., ULLSCHMIED, J.
(2002) Skin Depth Theory explaining Anomalous Picosecond
Laser Plasma Interaction. Czechoslovak Journal of Physics, 52,
Suppl. D.
[7] KARPMAN, V. I. (1998) Evolution of Solitons described by
Higher-Order Nonlinear Schrödinger Equation. Phys. Lett. A 244
397-400.
[8] KARPMAN, V. I., SHAGALOV, A. G. (1999) Evolution of
Solitons described by Higher-Order Nonlinear Schrödinger
Equation. II. Numerical Investigation. Phys. Lett. A 254 (1999)
319-324.
[9] OSMAN, F., CASTILLO, R., HORA, H. (2000) Focusing and
Defocusing of the Nonlinear Paraxial Equation at Laser Plasma
Interaction, Laser & Particle Beams 18, 73.
[10] SMITH, G.D. (1987) Numerical Solution of Partial
Differential Equations: Finite Difference Methods, Oxford
Applied Mathematics and Computing Science Series, 3rd edition.
[11] WHITHAM, G.B. (1974) Linear and Nonlinear Waves, New
York: Wiley.
[12] WOLFRAM, S. (1991) MATHEMATICA: A System for
Doing Mathematics by Computer, 2nd ed. Addison-Wesley.
[13] ZAKHAROV, V.E., SHABAT, A.B. (1972) Exact Theory of
Two-Dimensional Self-focusing and One-Dimensional Self-Modulation of Waves in Nonlinear Media, Soviet Physics JETP,
34, 62.
Pattern for Centre of Twin Primes
Wayne Lester Ballador, Andrea Louise Gallaza, and Ignacio Lazaro III
Advisers: Prof. Jonny Pornel, Prof. Raphael Belleza, Ms. Early Sol Gadong
UP High School in Iloilo, University of the Philippines Visayas, Iloilo City Philippines
Abstract
This paper presented the concept of the centre of twin primes and determined its properties. It proved
that every centre of twin primes greater than 7 is: (a) divisible by 6; and (b) congruent to 0, 2 or 8
modulo 10. These properties would enable researchers of twin primes to test a smaller set of
numbers for primality without sacrificing accuracy.
Key words: Twin primes, centre of twin primes
Introduction
In 1900, German mathematician David Hilbert presented 23 baffling mathematical problems at the 2nd International Congress of Mathematicians. One of these problems is the Twin Prime Conjecture, which states that “There are infinitely many twin primes.” The Twin Prime Conjecture is one of the classic problems in mathematics. It is still unsettled whether it is true or not (Burton, 1980; Rosen, 1984; Schumer, 2004). Although many believe it is true and many illustrious mathematicians have tried to prove it, no one has yet succeeded. In fact, the Worldwide Computer Services Inc. offers $25,000 to anyone who can successfully prove the Conjecture.
Twin primes are pairs of primes that differ
by two (Rosen, 1984); that is, only one
number separates them. The primes 3 and 5
are twin primes since they differ by 2.
Generally, if $P$ and $P + 2$ are primes, then
they are twin primes.
The conjecture has attracted the attention of countless professional and amateur mathematicians because it is deceptively simple. To start working on it, one needs only to know the concept of primes. The attraction of a profound problem needing only mathematical tools available to high school students is indeed irresistible.
One of the reasons for the difficulty in proving the conjecture is the randomness in the occurrence of prime numbers. Finding twin primes is thus twice as difficult. Twin primes are also notoriously rare. Koshy (2007, p. 118) has stated that “discovering twin primes involves essentially finding two primes; therefore, the largest known twin primes are substantially smaller than the largest known primes.” There are only eight pairs of twin primes less than 100 and 35 pairs of twin primes less than 1000. The twin primes less than 100 are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), and (71, 73).
It is cumbersome to always write both primes of a pair. The usual way of representing a pair of twin primes is by its first prime. Since they differ by 2, anyone may readily derive the other prime given the first one. Thus, if $P$ is the first prime, then $P + 2$ is the other prime.
Another way of representing a pair of twin primes is by using their centre, that is, the number that separates them. To represent (3, 5), some mathematicians use $4 \pm 1$. In this paper, the concept of centre of twin primes will be used. If $P$ and $P + 2$ are primes, then the integer $C = P + 1$ is the centre of twin primes. From this point forward, a pair of twin primes will simply be represented by their centre to simplify the notation. The centres of twin primes less than 100 are 4, 6, 12, 18, 30, 42, 60, and 72.
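The list of centres above can be reproduced with a few lines of Python. This is only an illustrative sketch using trial division; the function names `is_prime` and `twin_prime_centres` are chosen for the example and are not part of the original investigation.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

def twin_prime_centres(limit):
    """Return every C such that C - 1 and C + 1 are both prime and C + 1 < limit."""
    return [c for c in range(2, limit - 1) if is_prime(c - 1) and is_prime(c + 1)]

centres_below_100 = twin_prime_centres(100)
print(centres_below_100)
print(len(twin_prime_centres(1000)))
```

The first call lists the centres of the twin primes below 100, and the second counts the pairs below 1000, so both figures quoted in the introduction can be checked directly.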
Problem
This mathematical investigation aims to determine the pattern behind the centres of twin primes, and, consequently, the pattern behind twin primes themselves. Specifically, this paper aims to answer the following questions:
1) What are the common factors of the centre of twin primes?
2) What other patterns can be discerned from the sequence of centre of twin primes?
Conjectures Formulated
To come up with a conjecture, the
researchers identified the first few twin
primes. Next, their centres were determined
and observed.
Table 1: Centre of Twin Primes Greater than 4
Centre of twin primes (C) | 4 | 6 | 12 | 18 | 30
Divisible by 6? | n | y | y | y | y
Last digit | 4 | 6 | 2 | 8 | 0
Note: n – no, y – yes
The first 15 centres of twin primes are
shown in Table 1. All these centres are
even. Further, except for 4, all the centres of
twin primes are divisible by 6. Thus,
conjecture 1 logically follows.
Conjecture 1:
If $C > 4$ and $C \pm 1$ are primes, then
$C = 6k$ where $k \in \mathbb{N}$
Also, the longer list of centres of twin primes
shows that, except for 4 and 6, every centre of
twin primes ends with 0, 2, or 8. Thus,
conjecture 2 was advanced.
Conjecture 2:
If $C > 7$ and $C \pm 1$ are primes, then
$C \equiv d \pmod{10}$ where $d \in \{0, 2, 8\}$
Verifying Conjectures
The centres of twin primes between 100 and
500 are shown in Table 2. All these centres
of twin primes are divisible by 6, which
supports Conjecture 1. Also, every centre of
twin primes between 100 and 500 ends with
0, 2, or 8, which supports the second
conjecture.
Table 2: Centre of Twin Primes Greater than 100 but Less than 500
Centre of Twin Primes | Divisible by 6 | Last Digit
102 | Y | 2
108 | Y | 8
138 | Y | 8
150 | Y | 0
180 | Y | 0
192 | Y | 2
198 | Y | 8
228 | Y | 8
240 | Y | 0
270 | Y | 0
282 | Y | 2
312 | Y | 2
348 | Y | 8
420 | Y | 0
432 | Y | 2
462 | Y | 2
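The check summarised in Table 2 can be repeated mechanically. The sketch below is an illustration only: the function names are hypothetical, and trial division is assumed to be sufficient for numbers below 500.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for numbers below 500."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def centres_between(low: int, high: int) -> list[int]:
    """Centres C with low < C < high such that C - 1 and C + 1 are both prime."""
    return [c for c in range(low + 1, high)
            if is_prime(c - 1) and is_prime(c + 1)]


centres = centres_between(100, 500)
conjecture_1_holds = all(c % 6 == 0 for c in centres)          # multiples of 6
conjecture_2_holds = all(c % 10 in (0, 2, 8) for c in centres)  # last digit 0, 2 or 8
print(centres, conjecture_1_holds, conjecture_2_holds)
```

A finite check like this only supports the conjectures on the tested range; it does not replace the proofs given later.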
To test the conjectures in an extreme case,
take one of the largest known pairs of twin
primes, $33218925 \cdot 2^{169690} \pm 1$.
Their centre is $C = 33218925 \cdot 2^{169690}$.
To verify Conjecture 1, it must be shown
that $C$ is divisible by 6.
$C = 3(11072975) \cdot 2(2^{169689})$
$C = 6(11072975 \cdot 2^{169689})$
So $C$ is divisible by 6.
To verify Conjecture 2, the last digit of $C$
must be shown to be 0, 2 or 8. This can be
done by determining the last digit of the
product of the last digits of 33218925 and
of $2^{169690}$.
To determine the last digit of $2^{169690}$, it must
be noted that the last digit of $2^k$ follows a
repeating cycle of 2, 4, 8, and 6 as $k$ takes
the values 1, 2, 3, 4, … Since
$169690 = 4(42422) + 2$,
then $2^{169690} = (2^4)^{42422} \cdot 2^2$. Thus
$2^{169690} \equiv (2^4)^{42422} \cdot 2^2 \pmod{10}$
$\equiv 6^{42422} \cdot 4 \pmod{10}$
$\equiv 6 \cdot 4 \pmod{10}$
$\equiv 4 \pmod{10}$
Also,
$33218925 \equiv 5 \pmod{10}$
So
$C \equiv 33218925 \cdot 2^{169690} \equiv 5 \cdot 4 \equiv 0 \pmod{10}$
Thus, Conjecture 2 is verified. A better way
to verify Conjecture 2 is to note that the last
digit of 33,218,925 is 5 and that $2^{169690}$ is
even. Since the product of 5 and an even
number ends in 0, the last digit of $C$ is 0, as
shown earlier.
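The modular arithmetic above can be reproduced without ever writing out the full number. The following is a small sketch that relies on Python's built-in three-argument `pow` for modular exponentiation; the constant names `COEFFICIENT` and `EXPONENT` are illustrative only.

```python
COEFFICIENT = 33218925
EXPONENT = 169690

# Last digit of 2**169690, using modular exponentiation.
last_digit_of_power = pow(2, EXPONENT, 10)

# Residues of the centre C = 33218925 * 2**169690 modulo 6 and modulo 10.
centre_mod_6 = (COEFFICIENT % 6) * pow(2, EXPONENT, 6) % 6
centre_mod_10 = (COEFFICIENT % 10) * last_digit_of_power % 10

print(last_digit_of_power, centre_mod_6, centre_mod_10)
```

A residue of 0 modulo 6 corresponds to Conjecture 1, and a residue of 0 modulo 10 corresponds to a last digit of 0, which is one of the endings allowed by Conjecture 2.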
Justifications
Conjecture 1:
If $C > 4$ and $C \pm 1$ are primes, then
$C = 6k$ where $k \in \mathbb{N}$
Proof:
To prove that a number is divisible by 6, it
must be shown that it is divisible by 2 and 3.
Since the only prime that is even is 2, and it
is not a twin prime, then all twin primes are
odd primes. Thus, for any twin primes
$C - 1$ and $C + 1$, their centre $C$ is even and
is divisible by 2. To prove that centres
greater than 4 are divisible by 3, let $C$ be
an integer greater than 4 and a centre of
twin primes $C - 1$ and $C + 1$. By definition,
each of the primes $C \pm 1$ is not divisible by any
positive integer except itself and 1. Now,
any integer is congruent to exactly one of
the following: $1 \pmod{3}$, $2 \pmod{3}$ and
$0 \pmod{3}$.
Verifying the possibility of these three cases
shows the following result.
Case 1: $C \equiv 1 \pmod{3}$
$C \equiv 1 \pmod{3}$
$\Rightarrow C - 1 \equiv 0 \pmod{3}$
$\Rightarrow C - 1 = 3k$
$\Rightarrow C - 1$ is divisible by 3
A contradiction, since $C - 1$ is a prime greater than 3.
Thus, $C$ cannot be $1 \pmod{3}$.
Case 2: $C \equiv 2 \pmod{3}$
$C \equiv 2 \pmod{3}$
$\Rightarrow C + 1 \equiv 0 \pmod{3}$
$\Rightarrow C + 1 = 3k$
$\Rightarrow C + 1$ is divisible by 3
A contradiction, since $C + 1$ is a prime greater than 3.
Thus, $C$ cannot be $2 \pmod{3}$.
Case 3: $C \equiv 0 \pmod{3}$
Since the first two cases are not possible,
then case 3 must be true and $C$ is
divisible by 3. Therefore, $C$ is divisible by
6, since it is divisible by 2 and 3. QED
Conjecture 2:
If $C > 7$ and $C \pm 1$ are primes, then
$C \equiv d \pmod{10}$ where $d \in \{0, 2, 8\}$
Proof:
By the previously proven conjecture, a centre
of twin primes is divisible by 6. This
implies that a centre of twin primes $C$ satisfies
$C = 6k \Rightarrow C \equiv r \pmod{10}$, where the
possible values of $r$ are 0, 2, 4, 6, and 8.
However, $C$ cannot be congruent to 4
modulo 10, because
$C \equiv 4 \pmod{10}$
$C + 1 \equiv 4 + 1 \pmod{10}$
$C + 1 \equiv 5 \pmod{10}$
This implies $C + 1$ is divisible by 5; a
contradiction when the centre of twin
primes is greater than 7. By the same line
of argument, $C$ cannot be congruent to
6 modulo 10, because
$C \equiv 6 \pmod{10}$
$C - 1 \equiv 6 - 1 \pmod{10}$
$C - 1 \equiv 5 \pmod{10}$
This implies $C - 1$ is divisible by 5; another
contradiction when the primes are greater
than 7. Therefore, a centre of twin primes
greater than 7 can only be 0, 2 or 8
(mod 10). QED
Summary
In summary, this mathematical investigation
shows that the centres of twin primes
greater than 7 are multiples of 6 that end
with 0, 2 or 8. This shows an implied
pattern for twin primes, since the centre for
a pair of twin primes defines them. This
result will simplify the search for twin
primes. Let $m$ be a natural number, and let
$N_m$ be the set of natural numbers that are
smaller than $m$. Suppose that we search for
twin primes in $N_m$. Let $S$ be the set of
multiples of 6 in $N_m$. Now, of the multiples
of 6, only those that end with 0, 2, and 8 are
possible centres of twin primes. Let the set
of these particular multiples of 6 be
$S_c$. Since the possible endings of any
multiples of 6 are 0, 2, 4, 6 and 8, then the
ratio $|S_c| / |S| \approx 3/5$. Also,
$|S| / |N_m| \approx 1/6$.
Thus, using the properties of twin primes, a
researcher need only consider
$$\frac{|S_c|}{|N_m|} = \frac{|S_c|}{|S|} \cdot \frac{|S|}{|N_m|} \approx \left(\frac{3}{5}\right)\left(\frac{1}{6}\right) = \frac{1}{10}$$
of $N_m$ as possible candidates for the centre of twin
primes.
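The one-in-ten estimate for candidate centres can be illustrated by counting directly. This is a sketch under the assumption that $N_m$ means the integers 1 to $m - 1$; the function names are hypothetical. Note that the small centres 4 and 6 fall outside this filter, since Conjecture 2 only covers centres greater than 7.

```python
def candidate_centres(m: int) -> list[int]:
    """Multiples of 6 below m whose last digit is 0, 2 or 8."""
    return [n for n in range(6, m, 6) if n % 10 in (0, 2, 8)]


def candidate_fraction(m: int) -> float:
    """Share of the natural numbers below m that remain as candidates."""
    return len(candidate_centres(m)) / (m - 1)


print(candidate_centres(61))      # candidates among the numbers below 61
print(candidate_fraction(301))    # close to 1/10
```

Only the candidates that survive this filter would then need a primality test on their two neighbours.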
Possible Extensions
Since all the centres of twin primes greater
than 4 are multiples of 6, that is, a centre $C$ is of the
form $C = 6k$, a natural extension for this
investigation is to identify which values of $k$
will ensure that $6k$ is a centre of twin primes. A
better restatement of this problem would be:
for $n = 0, 1, 2, 3, \ldots$, what values of $k$ taken
from $\{0, 1, 2, 3, 4, \ldots, 9\}$ would make
$6(10n + k)$ a centre of twin primes?
References
Burton, D. M. (1980). Elementary Number Theory.
Boston: Allyn and Bacon
Koshy, T. (2007). Elementary number theory with
applications, 2nd ed. MA: Academic Press
Rosen, K. H. (1984). Elementary number theory and
its Applications. MA: Addison Wesley
Schumer, P.D. (2004). Mathematical Journeys. NJ:
John Wiley and sons.
Blu-Ray vs. DVD vs. CD: A Mathematical Exploration:
Why Archimedean Spirals lie at the Centre of the
differences between them?
Chris Millican
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]
Abstract
As an avid musician and a fan of digital entertainment, my house is full of CD, DVD and Blu-ray
discs. The questions addressed in this exploration are extremely relevant for me as a teenager who
benefits daily from this technology and yet has little understanding about how it all actually works. I
feel I have taken this technology for granted without having an understanding for what I am using.
By the end of the exploration, my aim is to have quantitatively demonstrated the differences between
CD, DVD and Blu-ray discs and therefore to have shown why Blu-Ray discs are arguably the best
choice of the three. My exploration ultimately delved into various areas of mathematics including
measurement, differential and integral calculus and trigonometric functions. The implications of my
work would help me to realise the possible innovations that may further improve the capabilities of
digital data storage in the future.
Introduction
CD, DVD and Blu-ray discs are classified
as digital optical disc data storage formats
and they all effectively function the same
way. However it is commonly recognised
that Blu-ray discs are the best of the three
because they have the greatest storage
capacity and the ability to store the highest
quality video and audio. I realised prior to
this exploration that I did not know the
differences between the three discs. Why
exactly do Blu-ray discs reproduce the
highest quality entertainment? Is it just a
matter of popular opinion or is there a
mathematical basis for why Blu-ray discs
are considered to be the best disc format out
of the three?
Archimedean Spirals: the Background
to my Exploration
As shown to the right, the data on CD, DVD
and Blu-ray discs is arranged in a spiral
starting from the inside edge of the disc and
spiralling outwards. This spiral is called the
data spiral. The data spiral is a type of
Archimedean spiral which “is a spiral
named after the 3rd century BC Greek
mathematician Archimedes”.
An Archimedean spiral is defined as “the
locus of points corresponding to the
locations over time of a point moving away
from a fixed point with a constant speed
along a line which rotates with constant
angular velocity.” This means that the
distance from the centre of the spiral
increases arithmetically rather than
exponentially. The primary difference
between Blu-Ray, DVD and CD discs is the
type of laser used to read the disc. Blu-Ray
technology uses a more precise blue/violet
coloured laser; thereby allowing for the data
to be encoded onto the disc in a more tightly
wound spiral. This difference in the colour
of the laser is where the name “Blu-ray”
comes from! My aim for the exploration is
to quantitatively prove that the spiral of data
on a Blu-ray disc is longer than the other
two discs. If I can prove this, then I am in
essence proving why Blu-ray discs are able
to store higher quality entertainment
because a longer data spiral corresponds to a
higher storage capacity and a higher data
storage capacity allows for higher quality
recording.
Exploration Aim: To calculate the
length of the data spiral on CD, DVD
and Blu-Ray Discs
My aim is to firstly estimate the length of
the data spiral for each disc and then
secondly use a method based on polar
coordinates and the equation of a spiral in
order to hopefully calculate more precise
answers.
Calculating the Length of the Data
Spiral: Method 1 (Estimation Method)
My intention was to approximate the length
of the spiral on each of the discs by using
the equation for the circumference of a
circle. In order to make this approximation,
I would need to make an assumption (the
picture to the right demonstrates this
assumption).
The red spiral represents the actual data
spiral whereas the black circles are evenly
spaced concentric circles which share the
same central point, have the same spacing
and begin and finish the same distance away
from the centre as the spiral. In relation to
the diagram, the assumption made is that the
length of the red spiral is equal to the sum of
the circumferences of the black concentric
circles. This is a common assumption made
when estimating the length of spirals. This
assumption is not necessarily true and is the
primary weakness of this estimation
method. However, this method should
provide a reasonable starting point for my
exploration. In order to complete my
calculations, I need to measure the
dimensions of the disc as well as find the
spacing between each spiral arm.
For discs, this spacing is known by the term
‘Track Pitch’ (T) and I will be referring to
the Track Pitch throughout my exploration.
Because the Track Pitch is too small to
measure, I used the internet to find values
for the Track Pitch for each disc.
Table 1: Track Pitch for CD, DVD
and Blu-Ray discs
Disc | Track Pitch
CD | $1.6 \times 10^{-6}$ m
DVD | $7.4 \times 10^{-7}$ m
Blu-Ray | $3.2 \times 10^{-7}$ m
To put these distances in perspective, 100
µm is the average width of a human hair! So
on a Blu-ray disc, because the distance
between each line of data is 0.32 µm:
Number of lines of data able to fit in a
human hair = 100 ÷ 0.32 = 312.5. This
means roughly 312 lines of Blu-ray data take up the
same width as a human hair! I found this
phenomenal and it is evidence for the
extraordinary capabilities of information
technology.
Finding the number of turnings of the
data spiral:
Now that I have the measurements of the
discs, I need to calculate how many turnings
the data spiral makes by finding the straight
line distance from the inner radius of the
disc to the outer radius of the disc and
dividing it by the Track Pitch. A turning is
one full 360° revolution made by a spiral.
As such, the spiral shown to the right has
three successive turnings because it
undergoes 1080° (3 × 360°) of revolution.
The Track Pitch will be represented by the
letter T.
$$\text{No. of turnings} = \frac{\text{outer disc radius} - \text{inner disc radius}}{T}$$
This equation is derived from the fact that,
on a straight line with equidistant
points A, B, C and D, the distance between
(A and D) = the distances between (A and
B) + (B and C) + (C and D).
CD:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{1.6 \times 10^{-6}}$$
Number of turnings = 23125
DVD:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{0.74 \times 10^{-6}}$$
Number of turnings = 50000
Blu-Ray:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{0.32 \times 10^{-6}}$$
Number of turnings = 115625
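The three divisions above can be reproduced in a few lines. This is a minimal sketch using the radii and Track Pitch values quoted in the text; the constant and function names are illustrative only.

```python
OUTER_RADIUS_M = 0.06
INNER_RADIUS_M = 0.023

# Track Pitch values from the text, in metres.
TRACK_PITCH_M = {"CD": 1.6e-6, "DVD": 0.74e-6, "Blu-ray": 0.32e-6}


def number_of_turnings(track_pitch_m: float) -> float:
    """Radial width of the data area divided by the spacing between spiral arms."""
    return (OUTER_RADIUS_M - INNER_RADIUS_M) / track_pitch_m


for disc, pitch in TRACK_PITCH_M.items():
    print(disc, round(number_of_turnings(pitch)))
```

The results are rounded because floating-point division can land a hair below the whole number.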
These results show that the Blu-Ray disc
has the greatest number of spiral turnings.
This is the first evidence that the data spiral
on a Blu-Ray disc is longer than the other
two discs. My first problem arose when I
realised that it would be impractical to find
the sum of the circumferences of up to 115625
different circles. I decided that the best
method would be to find the mean
circumference for all of the circles and then
multiply this by the number of turnings. In
this case, due to the consistent spacing of
these concentric circles, the mean
circumference is equal to the median
circumference.
C = 2πr
where C = circumference and r = radius of
the circle.
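The averaging idea can be sketched as follows, together with a direct sum over the concentric circles that a computer handles easily. Two assumptions are made here and are not from the original: each circle is placed at the mid-radius of its track, and the unrounded mean circumference is used. Because the text rounds the mean circumference to 0.261 m before multiplying, its length figures come out slightly larger than the ones this sketch prints.

```python
import math

OUTER_RADIUS_M = 0.06
INNER_RADIUS_M = 0.023


def mean_circumference() -> float:
    """Circumference of the circle halfway between the inner and outer radius."""
    return 2 * math.pi * (OUTER_RADIUS_M + INNER_RADIUS_M) / 2


def estimated_length(track_pitch_m: float) -> float:
    """Mean circumference multiplied by the number of turnings."""
    turnings = (OUTER_RADIUS_M - INNER_RADIUS_M) / track_pitch_m
    return mean_circumference() * turnings


def summed_circumferences(track_pitch_m: float) -> float:
    """Add up one circle per turning (assumed to sit at the middle of each track)."""
    turnings = round((OUTER_RADIUS_M - INNER_RADIUS_M) / track_pitch_m)
    return math.fsum(
        2 * math.pi * (INNER_RADIUS_M + (k + 0.5) * track_pitch_m)
        for k in range(turnings)
    )


print(mean_circumference())
print(estimated_length(1.6e-6), summed_circumferences(1.6e-6))
```

Under these assumptions the shortcut and the full sum agree, which is why the mean circumference can stand in for the whole set of circles.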
Median circumference = 2π × Median Radius
$$\text{Mean Circumference} = \frac{\text{outer disc radius} + \text{inner disc radius}}{2} \times 2\pi$$
$$\text{Mean Circumference} = \frac{0.06 + 0.023}{2} \times 2\pi$$
Mean Circumference = 0.261 m
The mean circumference is the same for all
of the discs because they all have the same
dimensions. Now that I have the mean
circumference and the number of turnings
for each of the three data spirals, I can
estimate the length of each data spiral
through the following equation:
Length of data spiral = Average
Circumference × Number of turnings of the
spiral
CD:
Length = 0.261 × 23125 = 6035.625 m
DVD:
Length = 0.261 × 50000 = 13050 m
Blu-Ray:
Length = 0.261 × 115625 = 30178.125 m
I found it astounding that there could be a
30 km long spiral on the surface of a Blu-ray
disc. These estimations clearly show
that the Blu-Ray disc has a much longer
data spiral than DVD and CD discs.
However I was not satisfied with this
estimation method. I knew that I had only
made approximations.
Therefore I endeavoured to take this
exploration further and look into the area of
polar coordinates and spirals. I soon realized
that to make the proper calculations, I
would need to use my current knowledge of
calculus as well as research further into the
area of spirals. I was definitely curious to
see how much discrepancy I would get
between the two methods. In hindsight, the
estimation method certainly turned out to be
the simpler method, but would it be
accurate enough?
Calculating the Length of the Data
Spiral: Method 2 (Equation for a
Spiral)
I found that the general equation for a spiral
is:
$r(\theta) = a + b\theta$
Table 2: Explanation of the Equation
for a Spiral
Coefficient | Meaning
$b$ | a constant which determines the distance between each successive turning of the spiral
$r$ | distance from the centre of the spiral
$\theta$ | the number of radians of revolution undergone by the spiral
$a$ | a constant which determines the starting point where the spiral begins to turn
It can be assumed that a = 0 because on a
physical disc, it does not matter where the
starting point of the spiral is. Changing the
constant ‘a’ rotates the spiral (such as in the
two diagrams to the right). However on a
physical three dimensional disc, it does not
matter which way the spiral is rotated. A
disc is hand held and thus not constrained
by this two dimensional mathematical
aspect for the equation of a spiral. It is
interesting that in this case, applying
mathematics to the real world actually
enabled me to simplify the mathematical
theory. I didn’t anticipate this prior to my
exploration because I assumed that real life
mathematical problems were always more
complicated than the mathematical theory
suggests. However, I proved myself wrong.
Therefore, for the data spiral on a disc, we
were able to simplify the equation to:
$r(\theta) = b\theta$. The distance between the arms
of a spiral is called the Track Pitch.
Therefore for a spiral, after one revolution
($\theta = 2\pi$) has been made, the distance from
the centre is 1 Track Pitch.
Therefore: $T = r(\theta)$ where $\theta = 2\pi$ rad
$\therefore T = r(2\pi)$
Therefore, due to the equation for an
Archimedean Spiral of $r(\theta) = b\theta$:
$\therefore r(2\pi) = 2\pi b$
$\therefore T = 2\pi b$
$\therefore b = \frac{T}{2\pi}$
I then substituted the value for $b$ back into
the equation for a spiral, $r(\theta) = a + b\theta$ (with $a = 0$):
$\therefore r(\theta) = \frac{T}{2\pi} \times \theta$
Finding the Length of the Data
Spiral:
Now that I have found an equation for the
data spiral, I need to think of a way to find
the length of the data spiral. The spiral is a
form of polar curve and therefore, finding
the length of the spiral can also be
expressed as finding the arc length of a
polar curve. After some research into polar
curves, I found that the integral for the arc
length of any polar curve is:
$$s = \int_{\alpha}^{\beta} \sqrt{[r(\theta)]^2 + [r'(\theta)]^2}\, d\theta$$
Table 3: Explanation of the Integral
for the Arc length of a Polar Curve
Coefficient | Meaning
$s$ | Arc length of the spiral
$r(\theta)$ | Equation for the spiral
$r'(\theta)$ | Derivative of the equation for the spiral
$\alpha$ | Lower Bound
$\beta$ | Upper Bound
In order to use this integral, we need to find
$r'(\theta)$ using basic rules of differentiation:
$y = kx$
$y' = k$
In applying this same rule for a spiral, it
follows that when:
$r(\theta) = \frac{T}{2\pi} \times \theta$
$\therefore r'(\theta) = \frac{T}{2\pi}$
Therefore, through the process of
substitution, I found that the integral for the
length of the spiral on any of the three disc
formats is:
$$s = \int_{\alpha}^{\beta} \sqrt{\left(\frac{T \times \theta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}\, d\theta$$
Finally I had produced an integral for the
length of an Archimedean spiral where the
distance between spiral arms (T) is known.
Integration:
Solving this final integral would require
knowledge of inverse hyperbolic functions,
so I used mathematical software to
make the final calculations. By substituting
the letter ‘a’ for the constant $\left(\frac{T}{2\pi}\right)^2$ and
substituting $x$ for $\theta$, the integral is
simplified to its basic form:
$$s = \int_{\alpha}^{\beta} \sqrt{a x^2 + a}\, dx$$
where $a$ is a constant. The use of an
integral calculator gave the following
equation.
$$\int_{\alpha}^{\beta} f(x)\, dx = \left[\frac{\sqrt{a} \times \operatorname{arsinh}(x) + x\sqrt{a x^2 + a}}{2}\right]_{\alpha}^{\beta}$$
Note: The term $\operatorname{arsinh}(x)$ represents an
inverse hyperbolic function and is
completely different from the constant ‘a’.
By substituting $\left(\frac{T}{2\pi}\right)^2$ in for the constant
‘a’ as well as substituting $\theta$ back in for $x$,
the equation became:
$$s = \left[\frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\theta) + \theta\sqrt{\left(\frac{T \times \theta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}\right]_{\alpha}^{\beta}$$
I also realised that the length of the spiral
would be found by subtracting the length of
spiral between the centre and the inside
edge of the disc from the length of spiral
between the centre and the outside edge of
the disc. This is simply due to the fact that
discs have holes in the centre (as seen to the
right) and thus, the data spiral does not
actually start at the centre, but rather from
the inside edge of the disc. Therefore:
$$S_{\text{Data Spiral}} = \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\beta) + \beta\sqrt{\left(\frac{T \times \beta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2} - \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\alpha) + \alpha\sqrt{\left(\frac{T \times \alpha}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}$$
Finally, I had found the equation for the
length of a data spiral. I had all of the
necessary information in order to complete
the final step except for the values of
$\alpha$, $\beta$, $\operatorname{arsinh}(\alpha)$ and $\operatorname{arsinh}(\beta)$. I would
need to find $\alpha$ and $\beta$ before I could
find $\operatorname{arsinh}(\alpha)$ and $\operatorname{arsinh}(\beta)$.
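As a sanity check on the antiderivative quoted from the integral calculator, the closed form can be compared with a simple numerical integration. This sketch assumes the form with $\sqrt{a}$ multiplying the arsinh term; the midpoint rule, the test values a = 2 and the interval 0 to 3 are arbitrary illustrative choices, not part of the original exploration.

```python
import math


def integrand(x: float, a: float) -> float:
    """The simplified arc length integrand sqrt(a*x^2 + a)."""
    return math.sqrt(a * x * x + a)


def antiderivative(x: float, a: float) -> float:
    """Closed form: (sqrt(a)*arsinh(x) + x*sqrt(a*x^2 + a)) / 2."""
    return (math.sqrt(a) * math.asinh(x) + x * math.sqrt(a * x * x + a)) / 2


def midpoint_rule(a: float, lower: float, upper: float, steps: int = 100_000) -> float:
    """Numerical estimate of the integral using midpoints of equal slices."""
    width = (upper - lower) / steps
    return width * math.fsum(
        integrand(lower + (i + 0.5) * width, a) for i in range(steps)
    )


closed_form = antiderivative(3.0, 2.0) - antiderivative(0.0, 2.0)
numerical = midpoint_rule(2.0, 0.0, 3.0)
print(closed_form, numerical)
```

If the two printed values agree closely, the closed form is behaving as an antiderivative of the integrand on that interval.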
Finding α and β :
The goal of integral calculus is to find the
area under a curve. The upper and lower
bounds define between what values of θ
that the area under the curve will be found.
The lower and upper bounds α and β
would be determined by the highest and
lowest possible values of θ for each data
spiral. This is because the answer to the
integral will give the length of the spiral.
Therefore, in terms of the length of the
spiral, the point at which the length will be
the longest is at the outermost point of the
disc which is also where the value for θ is
greatest. Therefore by finding the highest
and lowest possible values for θ , we will
be able to find the upper and lower bounds.
The lower bound α and the upper bound
β can be found by going back to the
equation which was created for the data
spiral:
$r(\theta) = \frac{T}{2\pi} \times \theta$
The outermost and innermost radii of the
disc were measured to be 0.06 m and
0.023 m respectively. Therefore to find the
upper limit $\beta$:
$\beta = \frac{r(\theta)}{T / 2\pi}$
$\therefore \beta = \frac{r(\theta) \times 2\pi}{T}$
Because $r(\theta) = 0.06$ m (the disc radius):
$\therefore \beta = \frac{0.06 \times 2\pi}{T}$
The same process can be used to find the
lower limit $\alpha$ by replacing the
outermost radius of the disc (0.06 m) with
the innermost radius of the disc (0.023 m):
$\therefore \alpha = \frac{0.023 \times 2\pi}{T}$
For example, in calculating the CD Upper
Bound:
$\therefore \beta = \frac{0.06 \times 2\pi}{T}$
$\therefore \beta = \frac{0.06 \times 2\pi}{1.6 \times 10^{-6}}$
$\therefore \beta = 235619.449$
Through the two equations for finding
α and β , and the values for the Track
Pitch of each disc, I was able to find the
upper and lower bounds for the integral for
the length of the data spiral on CD, DVD
and Blu-ray discs:
Table 4: The Calculated upper and
lower bounds for each disc
Bound | Value (radians)
CD Upper Bound - $\beta$ | 235619.449
CD Lower Bound - $\alpha$ | 90320.789
DVD Upper Bound - $\beta$ | 509447.457
DVD Lower Bound - $\alpha$ | 195288.192
Blu-Ray Upper Bound - $\beta$ | 1178097.245
Blu-Ray Lower Bound - $\alpha$ | 451603.944
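The bounds in Table 4 follow from rearranging the spiral equation, which can be sketched in code. The function name `theta_at_radius` is illustrative only; the radii and Track Pitch values are those quoted in the text.

```python
import math

OUTER_RADIUS_M = 0.06
INNER_RADIUS_M = 0.023
TRACK_PITCH_M = {"CD": 1.6e-6, "DVD": 0.74e-6, "Blu-ray": 0.32e-6}


def theta_at_radius(radius_m: float, track_pitch_m: float) -> float:
    """Angle (in radians) at which r(theta) = T/(2*pi) * theta reaches the given radius."""
    return radius_m * 2 * math.pi / track_pitch_m


for disc, pitch in TRACK_PITCH_M.items():
    alpha = theta_at_radius(INNER_RADIUS_M, pitch)  # lower bound
    beta = theta_at_radius(OUTER_RADIUS_M, pitch)   # upper bound
    print(f"{disc}: alpha = {alpha:.3f}, beta = {beta:.3f}")
```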
Finding ar sinh (β ) and ar sinh(α ):
In order to find the values for each inverse
hyperbolic function, the inverse hyperbolic
function calculator was used. Through the
use of this technology and previous
calculations for α and β , the following
values are determined:
Table 5: The Calculated value of
arsinh(θ) for each previously calculated
upper and lower bound
Bound | Value
CD Upper Bound - $\beta$ | arsinh(235619.449) = 13.063
CD Lower Bound - $\alpha$ | arsinh(90320.789) = 12.104
DVD Upper Bound - $\beta$ | arsinh(509447.457) = 13.834
DVD Lower Bound - $\alpha$ | arsinh(195288.192) = 12.875
Blu-Ray Upper Bound - $\beta$ | arsinh(1178097.245) = 14.673
Blu-Ray Lower Bound - $\alpha$ | arsinh(451603.944) = 13.714
Using the following equation, the length of
the data spiral for each disc was calculated.
$$S_{\text{Data Spiral}} = \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\beta) + \beta\sqrt{\left(\frac{T \times \beta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2} - \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\alpha) + \alpha\sqrt{\left(\frac{T \times \alpha}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}$$
Disc | Length of data spiral
CD Data Spiral | 6029.894 m
DVD Data Spiral | 13037.610 m
Blu-Ray Data Spiral | 30149.472 m
Pits and Lands:
The data spiral of a disc is actually a series
of bumps which are called pits and lands.
The laser is able to reflect light off these
bumps and subsequently read the data on
the disc. This series of bumps is how digital
data is encoded on discs. The image to the
right shows that not only does the new
technology of Blu-Ray discs allow for more
closely spaced spiral arms, but also smaller
minimum pit and land lengths. By dividing
the recently calculated length of the data
spiral by the minimum pit length, we can
get a figure for the maximum number of
pits that could physically fit onto each of the
three discs. This is vitally important when
comparing the three disc formats as it is a
good indicator for the differences between
CD, DVD and Blu-Ray discs.
$$\frac{\text{Length of Spiral}}{\text{Minimum pit length}} = \text{Max no. pits able to fit on the disc}$$
CD: $\frac{6029.894}{800 \times 10^{-9}} = 7.537 \times 10^{9}$ pits
DVD: $\frac{13037.610}{400 \times 10^{-9}} = 3.259 \times 10^{10}$ pits
BLU-RAY: $\frac{30149.472}{150 \times 10^{-9}} = 2.010 \times 10^{11}$ pits
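The spiral length formula and the pit counts can be evaluated together in a short script. This is a sketch only: it uses Python's `math.asinh` in place of the online calculators mentioned in the text, the function names are hypothetical, and the integrand is factored as $b\sqrt{\theta^2 + 1}$ with $b = T / 2\pi$, which is algebraically the same expression.

```python
import math

INNER_RADIUS_M = 0.023
OUTER_RADIUS_M = 0.06


def spiral_length(track_pitch_m: float) -> float:
    """Arc length of r(theta) = b*theta between the inner and outer radius."""
    b = track_pitch_m / (2 * math.pi)

    def antiderivative(theta: float) -> float:
        # (b*arsinh(theta) + theta*sqrt((b*theta)^2 + b^2)) / 2, with b factored out
        return b * (math.asinh(theta) + theta * math.sqrt(theta**2 + 1)) / 2

    alpha = INNER_RADIUS_M / b  # lower bound
    beta = OUTER_RADIUS_M / b   # upper bound
    return antiderivative(beta) - antiderivative(alpha)


def max_pits(track_pitch_m: float, min_pit_length_m: float) -> float:
    """Spiral length divided by the minimum pit length."""
    return spiral_length(track_pitch_m) / min_pit_length_m


# (track pitch, minimum pit length) in metres, as quoted in the text.
DISCS = {"CD": (1.6e-6, 800e-9), "DVD": (0.74e-6, 400e-9), "Blu-ray": (0.32e-6, 150e-9)}

for disc, (pitch, pit_length) in DISCS.items():
    print(f"{disc}: {spiral_length(pitch):.3f} m, {max_pits(pitch, pit_length):.3e} pits")
```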
By using this data, we can compare the
three formats to see just how much more
data can fit on a Blu-Ray disc than a CD
disc. Furthermore, we can compare these
results with the accepted differences in data
storage capacity in order to firstly see how
accurate my calculations were and secondly
to evaluate the importance of the length of
the data spiral as an indicator for the quality
of an optical disc. Therefore:
$$\frac{\text{Max no. pits able to fit on disc } x}{\text{Max no. pits able to fit on disc } y} = \text{Ratio for the data capacity of disc } x \text{ in relation to disc } y$$
BLU-RAY / DVD: $\frac{2.010 \times 10^{11}}{3.259 \times 10^{10}} = 6.17 \approx 6$
BLU-RAY / CD: $\frac{2.010 \times 10^{11}}{7.537 \times 10^{9}} = 26.7$
DVD / CD: $\frac{3.259 \times 10^{10}}{7.537 \times 10^{9}} = 4.3$
These results have no units because they are
ratios. These calculations show that
according to the results of my exploration, a
Blu-Ray disc holds 6 times more than a
DVD disc, a DVD disc holds 4.3 times
more data than a CD disc and a Blu-Ray
disc holds 26.7 times more data than a CD
disc.
What should we make of these
results?
In order to assess the accuracy of my
calculations and evaluate the importance of
the length of the data spiral as an indicator
for the quality of an optical disc, I need to
compare these values to the actual ratios for
how much data each disc can hold in
relation to the other discs. The real values
for the data storage capacity for CD, DVD,
and Blu-Ray discs are as follows.
Table 6: The accepted values for the
data storage capacity of each disc
Disc Format | Data Storage Capacity (GB)
CD | 0.7
DVD | 4.7
BLU-RAY | 25
$$\frac{\text{Data Storage Capacity of disc } x}{\text{Data Storage Capacity of disc } y} = \text{Ratio for the accepted data capacity of disc } x \text{ in relation to disc } y$$
BLU-RAY / DVD: $\frac{25}{4.7} = 5.3$
BLU-RAY / CD: $\frac{25}{0.7} = 35.7$
DVD / CD: $\frac{4.7}{0.7} = 6.7$
Percentage Error:
Therefore, in order to compare my results
with the accepted values, the percentage
error formula has been utilised:
$$\text{Error (\%)} = \frac{|\text{Result} - \text{Accepted Value}|}{\text{Accepted Value}} \times 100$$
BLU-RAY / DVD: $\frac{|6 - 5.3|}{5.3} \times 100 = 13.2\%$ error
BLU-RAY / CD: $\frac{|26.7 - 35.7|}{35.7} \times 100 = 25.2\%$ error
DVD / CD: $\frac{|4.3 - 6.7|}{6.7} \times 100 = 35.8\%$ error
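The percentage error formula can be written as a small helper. This is a sketch that takes the rounded ratios stated in the text as inputs and uses the absolute difference, so that every error is reported as a positive number; the function name is illustrative.

```python
def percentage_error(result: float, accepted: float) -> float:
    """Absolute difference from the accepted value, as a percentage of it."""
    return abs(result - accepted) / accepted * 100


# (calculated ratio, accepted ratio) pairs as stated in the text.
RATIOS = {
    "Blu-ray / DVD": (6, 5.3),
    "Blu-ray / CD": (26.7, 35.7),
    "DVD / CD": (4.3, 6.7),
}

for label, (result, accepted) in RATIOS.items():
    print(f"{label}: {percentage_error(result, accepted):.1f}% error")
```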
Comparison of Methods:
I was actually quite surprised by how accurate
the estimations were. The percentage error
for the length of the spiral on each disc
was determined in the following
calculations:
$$\text{Error (\%)} = \frac{\text{Result} - \text{Accepted Value}}{\text{Accepted Value}} \times 100$$
CD (% error) $= \frac{6035.625 - 6029.894}{6029.894} \times 100 = 0.09504\%$ error
DVD (% error) $= \frac{13050 - 13037.610}{13037.610} \times 100 = 0.09503\%$ error
BLU-RAY (% error) $= \frac{30178.125 - 30149.472}{30149.472} \times 100 = 0.09505\%$ error
In all three cases, the percentage error is
extremely small. These low percentage
errors show that in most cases, the
estimation method would suffice in giving
an indication of the length of a spiral.
However, in cases where absolute accuracy
is necessary, the second method requiring
integral calculus is important to recognise.
Conclusion
So why exactly is all of this important? My
exploration has shown that Blu-ray discs
can hold more data than DVDs and much
more data than CDs. However, how do
these calculations prove that Blu-Rays are
of better quality than DVDs and CDs?
Generally, higher quality recording requires
more data space. The longer data spiral of
Blu-ray discs allows for this extra space. In
order to illustrate the importance of data
storage capacity, I utilized Microsoft Excel
graph drawing software to create the four
graphs to the left. It is important to note that
these graphs are an estimation of reality,
and do not use any equations from my
exploration. However they provide a great
visual representation of the differences
between the three discs in attempting to
record a sound wave. It is clear that because
the CD only has a limited amount of data
storage space, the resemblance of the digital
recording to the actual sound is quite poor.
On the other hand, the DVD and Blu-Ray
digital recordings bear much more
resemblance to the original sound wave.
Therefore, these recordings are more
accurate digital recordings and thus
arguably of higher quality.
It is interesting that although CD, DVD and
Blu-Ray discs all look similar, they have
fundamental differences due to progress in
laser technology. By refining the spot size
of the laser by a mere micrometre, the data
spiral can be increased from 6 km to over
30 km in length. It seems that for
technological advancement to occur, a small
improvement of a micrometre may lead to a
drastic overall benefit. In addition to this,
my exploration has shown me the
potential for innovation in the future. It
seems logical that because refining the laser
by one micrometre allowed for a data spiral
about five times longer, then perhaps the
same can be done again to create even
higher quality digital entertainment.
Perhaps, if we used a laser of even shorter
electromagnetic wavelength, this will
become a reality. In any case, my
mathematical exploration has really helped
me to understand the world of digital
technology much more than before. I
believe that my exploration into the
mathematics behind technology has
implications for everyone endeavouring to
benefit society through technological
advancement.
References
[1] "Arc Length Integral for Polar Coordinates." HubPages. N.p., n.d. Web. 16 Feb. 2014. <http://calculus-geometry.hubpages.com/hub/ArcLength-Integral-for-Polar-Coordinates>.
[2] "Experimental Feature." Wolfram|Alpha: Computational Knowledge Engine. N.p., n.d. Web. 10 Dec. 2013. <http://www.wolframalpha.com>.
[3] "File:Comparison CD DVD HDDVD BD.svg." Wikimedia Commons. N.p., n.d. Web. 17 Feb. 2014. <http://commons.wikimedia.org/wiki/File:Comparison_CD_DVD_HDDVD_BD.svg>.
[4] "How Your Brain Understands What Your Ear Hears." How Small Is a Hair Cell?. N.p., n.d. Web. 17 Feb. 2014. <http://science.education.nih.gov/supplements/nih3/hearing/activities/hair-cell.htm>.
[5] "How to Find the Arc Length of an Archimedean Spiral: Calculus Integration Tutorial." HubPages. N.p., n.d. Web. 10 Feb. 2014. <http://calculus-geometry.hubpages.com/hub/How-toFind-the-Arc-Length-of-an-Archimedean-SpiralCalculus-Integration-Tutorial>.
[6] "Integral (Antiderivative) Calculator." <http://symbolab.com/solver/integral-calculator/>.
[7] "Integral Calculator." Online Find Integrals and Antiderivatives!. N.p., n.d. Web. 21 Dec. 2013. <http://www.integral-calculator.com/>.
[8] "Inverse Hyperbolic Functions." Online calculator. N.p., n.d. Web. 17 Feb. 2014. <http://planetcalc.com/1118/>.
[9] "What are Blu-ray discs?." What are Blu-ray discs? An explanation of how Blu-ray discs work and their specifications. N.p., n.d. Web. 21 Nov. 2013. <http://www.wizbit.net/cddvd_production_faqs_what_are_blu-ray_discs.htm>.
[10] Length of an Archimedean Spiral. N.p., n.d. Web. 17 Feb. 2014. <http://www.intmath.com/blog/length-of-an-archimedianspiral/6595>.
[11] "Archimedean spiral." Princeton University. <http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Archimedean_spiral.html>.
[12] Austen, Ian. "Dueling Visions of a High-Definition DVD." The New York Times. The New York Times, 28 Apr. 2004. Web. 25 Apr. 2014. <http://www.nytimes.com/2004/04/29/technology/how-it-works-dueling-visions-of-a-high-definitiondvd.html>.
[13] "Blu-ray vs DVD." Blu-ray vs DVD. <http://www.computerhope.com/issues/ch001395.htm>.
[14] "Chip's CD Media Resource Center: CD-DA (Digital Audio) 5." Chip's CD Media Resource Center: CD-DA (Digital Audio). <http://www.chipchapin.com/CDMedia/cdda5.php3>.
[15] "Compact Disc. How it Works?." ElectroSchematicscom RSS. N.p., n.d. Web. 1 Apr. 2014. <http://www.electroschematics.com/4997/compactdisc-how-it-works/>.
[16] "Optical disc." Princeton University. N.p., n.d. Web. 25 Apr. 2014. <http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Optical_disc.html>.
[17] "Optical disc drive." Wikipedia. Wikimedia Foundation, 18 Apr. 2014. Web. 16 Jan. 2014. <http://en.wikipedia.org/wiki/Optical_disc_drive>.
[18] "Reference Guide for Blue Laser Media." Datarius. N.p., n.d. Web. 11 Jan. 2014. <http://www.datarius.com/news/whitepapers/wp_Blue_Laser_datarius-memorex.pdf>.
[19] "Roll length calculator." Roll length calculator. N.p., n.d. Web. 25 Apr. 2014. <http://www.giangrandi.ch/soft/spiral/spiral.shtml>.
[20] "Springville company introduces new DVD to protect data for a thousand years or more." Daily Herald. N.p., 17 July 2009. Web. 25 Apr. 2014. <http://www.heraldextra.com/news/local/springvillecompany-introduces-new-dvd-to-protect-data-fora/article_b25c9a30-7242-11de-9feb001cc4c03286.html>.
[21] "Why Blu-ray's Better Than DVD - IGN." IGN. N.p., n.d. Web. 1 Apr. 2014. <http://au.ign.com/articles/2009/03/25/why-blu-raysbetter-than-dvd>.
[22] "Optical disc." Wikipedia. Wikimedia Foundation, 18 Apr. 2014. Web. 16 Jan. 2014. <http://en.wikipedia.org/wiki/Optical_disc>.
Bayes’ Theorem and Its Applications in Law
Yuzhe Zheng
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]
Abstract
The aim of this paper is to explain clearly how Bayes’ Theorem can provide the correct
interpretation of statistical evidence. This is achieved by first introducing Bayes’
Theorem in two different forms, and then examining its use in two real-world cases,
Regina v Sally Clark [1] and Regina v Denis John Adams [2]. In the first case, I explain
how Bayes’ Theorem accounts for one piece of evidence, and in the second case, I explain
how Bayes’ Theorem can also be used for multiple pieces of evidence. Through this process,
I also expose the logical fallacies committed in these cases and explore how the theorem
avoids them. Acknowledging that it is difficult for the layman to understand Bayes’
Theorem, I also use diagrams to represent it for clarification.
Introduction
An introductory example: my conversation with Ishaan
I found that one of my best attempts at explaining Bayes’ Theorem was in a conversation
with my friend Ishaan, a transcript of which is provided below:
Yuzhe: Here are two coins, both of which
are showing heads at the moment. One of
them is double-headed, while the other is a
normal coin. Now pick one up without
looking at the other side.
Ishaan picks it up
Yuzhe: Okay. What is the probability of
the coin in your hand being biased? And
explain why.
Ishaan: 50%, because the biased coin is
one of the two coins which could have
been chosen.
Yuzhe: Now flip the coin three times and
tell me which side is facing up each time.
Ishaan flips the coin three times. By the
end of the third flip, he looks quite
shocked.
Ishaan: Heads, heads and heads.
Yuzhe: Okay, how sure are you now that
the coin that you’ve just flipped is the
biased coin?
Ishaan: Quite sure.
Yuzhe: Can you give me a probability?
Ishaan: Well now I’m around 80% sure it’s the biased coin, and I definitely know
it’s more than the figure of 50% which I gave earlier.
Reading this conversation, it is likely that you, the reader, have followed a similar
pattern of thought to Ishaan’s. Ishaan has intuitively updated his belief that the coin
is biased from 50% to 80% after observing that it produced three consecutive heads.
This process of updating our beliefs in light of new evidence is the basis of Bayes’
Theorem, which carries it out in an objective, controlled manner using the mathematics
of probability.
Bayes’ Theorem therefore has a wide range of applications in any field that requires
the confirmation of a hypothesis through evidence. It was, however, the legal function
of Bayes’ Theorem that surprised and interested me most, as it uses mathematics and
probability to account for evidence and help determine a defendant’s guilt or innocence.
Rationale
I chose this topic because I was intrigued that mathematics could be used to help ensure
that justice is appropriately carried out. TV crime and legal shows such as The Good Wife
had led me to believe that courtroom law was incredibly subjective, especially because of
the unreliability of evidence provided by humans, such as witnesses. Moreover, having
originally believed that mathematics and law were two mutually exclusive areas, I found
that my exploration of Bayes’ Theorem highlighted the usefulness of mathematics in many
more areas than I had previously considered.
Explaining Bayes’ Theorem
Bayes’ Theorem is named after Reverend Thomas Bayes, who suggested the process of
updating beliefs through probability in his essay An Essay towards Solving a Problem in
the Doctrine of Chances [3], published posthumously in 1763.
The simplest form of Bayes’ Theorem is given below:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Proof
I attempted to prove Bayes’ Theorem myself by showing how it can be derived from the
general definition [5] of $P(A \mid B)$.

| Step | Explanation | Equation |
| --- | --- | --- |
| [1] | General definition | $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ |
| [2] | General definition | $P(B \mid A) = \frac{P(B \cap A)}{P(A)}$ |
| [3] | [2] × $P(A)$ | $P(B \mid A)\,P(A) = P(B \cap A)$ |
| [4] | Substitute [3] into [1], as $P(B \cap A) = P(A \cap B)$ | $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$ |
This assumes that $0 < P(A) \le 1$ and $0 < P(B) \le 1$, so that neither division is by zero.

Table 1: The meaning of each term of the simple form of Bayes’ Theorem

| Term | Name | Meaning |
| --- | --- | --- |
| $P(A)$ | The prior [6] | The degree of belief that event A has occurred. |
| $P(A \mid B)$ | The posterior | The degree of belief that A has occurred, given that event B has occurred. |
| $\frac{P(B \mid A)}{P(B)}$ | The likelihood | $P(B \mid A)$ is the probability of B occurring, given A has occurred. $P(B)$ is the sum of the probabilities of all possible ways to get B. This includes the probability of B given A, $P(B \mid A)$, weighted by $P(A)$, and the probability of B given not A, $P(B \mid A')$, weighted by $P(A')$, so that $P(B) = P(B \mid A)\,P(A) + P(B \mid A')\,P(A')$. Dividing $P(B \mid A)$ by $P(B)$ therefore gives us the probability of B occurring given A in comparison to all ways of B occurring. The likelihood is therefore the support of B for A. |
Substituting the names of the terms, Bayes’ Rule is simply:

posterior = prior × likelihood
Returning to the introductory example
Returning to our coin flip example, the prior was Ishaan’s initial degree of belief that
the flipped coin was biased, which we will denote as $P(A)$.
The posterior, $P(A \mid HHH)$, is Ishaan’s updated belief that the coin is biased, after
it produced three consecutive heads, HHH.
Using Bayes’ Theorem, we can calculate the posterior:

$$P(A \mid HHH) = \frac{P(HHH \mid A)\,P(A)}{P(HHH)}$$

There are two coins, the fair and the biased, which can produce HHH. Therefore, the
probability of producing three consecutive heads, $P(HHH)$, includes the probability of
picking the biased coin and flipping HHH, as well as the probability of picking the fair
coin and flipping HHH.
Hence, with $A'$ representing the fair coin,

$$P(HHH) = P(HHH \mid A)\,P(A) + P(HHH \mid A')\,P(A')$$

Substituting this equation into Bayes’ Theorem gives its extended form:

$$P(A \mid HHH) = \frac{P(HHH \mid A)\,P(A)}{P(HHH \mid A)\,P(A) + P(HHH \mid A')\,P(A')}$$
Table 2: The calculations of all the terms on the right side of Bayes’ Theorem for
the coin flip example

| Calculation | Explanation |
| --- | --- |
| $P(A) = \frac{1}{2}$ | As Ishaan has explained, the biased coin is 1 of the 2 coins that could be chosen. |
| $P(A') = \frac{1}{2}$ | A and A′ are complementary events. |
| $P(HHH \mid A) = 1$ | Given that the coin is double-headed, it is certain that it will produce HHH. |
| $P(HHH \mid A') = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$ | The probability of getting a head on the first flip doesn’t affect the probability of getting a head on the second or third flips. These events are therefore independent, so their probabilities are multiplied together to find the probability of all of them occurring together [5]. |
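These calculations are substituted into the extended form of Bayes’ Theorem:

$$P(A \mid HHH) = \frac{1 \times \frac{1}{2}}{1 \times \frac{1}{2} + \frac{1}{8} \times \frac{1}{2}} = \frac{8}{9} \approx 0.89$$

Thus, Ishaan’s prior degree of belief in the selected coin being biased, which was 0.5,
is updated to approximately 0.89 after accounting for HHH, somewhat higher than his
intuitive estimate of 80%.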
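The coin calculation can be checked with a minimal Python sketch. The function name `posterior` and the variable names are illustrative choices, not part of the original paper; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def posterior(prior, likelihood_h, likelihood_not_h):
    """Extended form of Bayes' Theorem for a hypothesis A and its complement A'.

    prior            -- P(A)
    likelihood_h     -- P(evidence | A)
    likelihood_not_h -- P(evidence | A')
    """
    # P(evidence) = P(evidence | A) P(A) + P(evidence | A') P(A')
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence


p_biased = Fraction(1, 2)              # P(A): one of the two coins is double-headed
p_hhh_given_biased = Fraction(1)       # P(HHH | A): a double-headed coin always shows heads
p_hhh_given_fair = Fraction(1, 2) ** 3  # P(HHH | A'): three independent fair flips

result = posterior(p_biased, p_hhh_given_biased, p_hhh_given_fair)
print(result, float(result))  # 8/9, roughly 0.89
```

Using `Fraction` rather than floats keeps the result as the exact ratio 8/9, which makes it easy to compare with a hand calculation.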
Reflection: The Usefulness of Bayes’ Theorem
It can therefore be seen that Bayes’ Theorem is useful because it allows us to update
our prior belief using new information. It was at this point, however, that I noticed
that the same result for the biased coin example could be calculated by using the basic
definition of $P(A \mid B)$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

When I was comparing the two equations, I realised that $P(A \cap B) = P(B \mid A)\,P(A)$,
which made me question the usefulness of Bayes’ Theorem at all. Finally, I realised that
Bayes’ Theorem differs from the general definition of $P(A \mid B)$ in that it provides
the relation between $P(A \mid B)$ and $P(B \mid A)$. This is particularly useful for
cases where $P(B \mid A)$ is known but $P(A \mid B)$ is not.
For example, the jury wants to know the probability of a defendant’s innocence given a
particular piece of evidence, $P(\text{innocence} \mid \text{evidence})$. Yet the jury
only knows the probability of that piece of evidence occurring when one is innocent,
denoted as $P(\text{evidence} \mid \text{innocence})$.
Bayes’ Theorem allows the jury to derive $P(\text{innocence} \mid \text{evidence})$ from
$P(\text{evidence} \mid \text{innocence})$ and thus update their degree of belief in a
defendant’s guilt in light of the evidence. Another example is DNA testing, in which
there is always a possibility that the match produced is someone who is innocent and
completely unrelated to the crime in question. This is known as the random match
probability [2], which is the probability of getting a match given one’s innocence,
$P(\text{match} \mid \text{innocence})$. Thus, with Bayes’ Theorem, we can determine the
probability of one’s innocence given the DNA match,
$P(\text{innocence} \mid \text{match})$, from the random match probability,
$P(\text{match} \mid \text{innocence})$.
Odds form
The simple form of Bayes’ Theorem, however, provides the probability of only one
hypothesis, whereas in the courtroom we want to compare the viability of two competing
hypotheses: a defendant’s guilt and innocence.
I found that the odds form [2] of Bayes’ Theorem allows for this, as it compares the
probabilities of two competing hypotheses in a ratio, shown below:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E \mid G)}{P(E \mid G')}$$

In trials, the two competing hypotheses are the defendant’s guilt and innocence of a
charge, in which $P(G)$ is the prior probability of guilt and $P(G')$ is the prior
probability of innocence. E represents the evidence, which is used to update the prior
probabilities of guilt and innocence.
Table 3: An explanation of the terms of the odds form of Bayes’ Theorem

| Term | Name | Explanation |
| --- | --- | --- |
| $\frac{P(G)}{P(G')}$ | Prior odds | The ratio of the prior probability of guilt to the prior probability of innocence. |
| $\frac{P(E \mid G)}{P(E \mid G')}$ | Likelihood ratio | The probability of the evidence occurring given that the defendant is guilty, divided by the probability of the evidence occurring given that the defendant is innocent. This ratio therefore shows how much the evidence agrees with G in comparison to G′. Therefore, the higher the likelihood ratio, the greater weight the prosecution’s case has. |
| $\frac{P(G \mid E)}{P(G' \mid E)}$ | Posterior odds | The relative probabilities of guilt to innocence, after the evidence has been taken into account. |
Using the technical names of the terms in the odds form of the theorem, we arrive at:

Posterior Odds = Prior Odds × Likelihood Ratio

For example, if we derive posterior odds of 1/3 from the formula above, this means that
the defendant is three times more likely to be innocent than guilty, after accounting
for the evidence.
The events of a defendant’s guilt and innocence are complementary events, meaning,
“Exactly one of the events must occur” [5]. Therefore $P(G) + P(G') = 1$.
At this point, I realised I could convert the posterior odds into individual
probabilities of guilt and innocence. For example, suppose the ratio of guilt to
innocence is 1:3. As guilt and innocence are complementary events, the probability of
guilt can be given out of 1 + 3 = 4. Hence, the probability of guilt is
$\frac{1}{4} = 0.25$ and the probability of innocence is $\frac{3}{4} = 0.75$.
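The conversion between odds and probabilities described above can be sketched in a few lines of Python. The helper names `odds_to_probability` and `probability_to_odds` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def odds_to_probability(odds):
    """Convert odds in favour of an event (e.g. 1/3 for 1:3) to its probability."""
    return odds / (1 + odds)


def probability_to_odds(probability):
    """Convert a probability to odds in favour (requires probability < 1)."""
    return probability / (1 - probability)


posterior_odds = Fraction(1, 3)                # guilt : innocence = 1 : 3
p_guilt = odds_to_probability(posterior_odds)  # 1 / (1 + 3)
p_innocence = 1 - p_guilt                      # complementary event

print(p_guilt, p_innocence)  # 1/4 3/4
```

The two helpers are inverses of each other, so converting a probability to odds and back returns the original value.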
In cases involving juries, Bayes’ Theorem can be called upon to aid juries in
deliberating a verdict. The prior odds can be derived in 3 ways:
• A juror’s own subjective guess at the probability of the defendant’s guilt.
• The result from using Bayes’ Theorem with a previous piece of evidence.
• A confirmed statistic. For example, if the defendant were initially one of 8
suspects, the prior probability of guilt would be 1/8, giving prior odds of 1 to 7.
This statistic is then compounded with other evidence such as DNA, which is
represented by the likelihood ratio.
Personally, I see a problem with the first option, as it is unfair and uncomfortable for
the jurors to have to assign a subjective probability to a defendant’s guilt before even
seeing any evidence. The odds form of Bayes’ Theorem eases this problem, because the
likelihood ratio alone determines the significance of the evidence for the case. For
example, if the likelihood ratio from a DNA test were 100 to 1, it would make the odds
of guilt, and hence the prosecution’s case, 100 times greater.
Bayes’ Theorem can also be used to incorporate all pieces of evidence, with their
varying likelihood ratios, in one equation, or it can be applied successively, meaning
that the posterior probability after accounting for one piece of evidence (e.g. a licence
plate match) becomes the prior probability before accounting for another piece of
evidence (e.g. DNA testing). I realised that the former could be achieved by slightly
modifying the odds form of Bayes’ Theorem, which I have presented in detail in the
examination of Regina v Denis John Adams.
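The equivalence of the two approaches can be illustrated with a short Python sketch. All numbers here are hypothetical: prior odds of 1 to 7 (one of 8 suspects), an invented likelihood ratio of 5 for a licence plate match, and the 100 to 1 DNA ratio from the earlier example. The sketch also assumes that the pieces of evidence are independent of one another given guilt or innocence, which is what allows the ratios to be multiplied.

```python
from fractions import Fraction
from math import prod  # available from Python 3.8

prior_odds = Fraction(1, 7)  # hypothetical: defendant is one of 8 suspects
likelihood_ratios = [
    Fraction(5),    # hypothetical ratio for a licence plate match
    Fraction(100),  # hypothetical ratio for a DNA match
]

# Successive updating: each posterior becomes the prior for the next piece of evidence.
sequential_odds = prior_odds
for ratio in likelihood_ratios:
    sequential_odds = sequential_odds * ratio

# Single equation: multiply the prior odds by the product of all likelihood ratios.
combined_odds = prior_odds * prod(likelihood_ratios)

print(sequential_odds, combined_odds)  # both 500/7
```

Because multiplication is associative, the order in which the evidence is processed does not change the final odds under this independence assumption.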
Regina v Sally Clark
We will now see how Bayes’ Theorem can be used to interpret one piece of evidence in
terms of its impact on the case for guilt or innocence.
The Case
Sally Clark was wrongly convicted of murdering her two infant sons, both of whom had
died of Sudden Infant Death Syndrome (SIDS), also known as cot death. Her first son died
at the age of three months, and his death was initially considered a case of SIDS.
Following the second son’s death at the age of 2 months in similar circumstances, Sally
was tried and convicted of murder in 1999. As the prosecution lacked substantial medical
evidence, Professor Meadow, testifying for the prosecution, stated that the probability
of both babies dying of SIDS was approximately 1 in 73 million. This was calculated by
squaring the probability of a single death from SIDS in a family such as Sally Clark’s,
1 in 8500, to account for two SIDS deaths. With such a statistic, it seemed to the jury
and the media that it was almost impossible that Sally Clark’s children died from SIDS
and thus highly improbable that she was innocent. Only during a second appeal in 2003
was the conviction overturned, due to the court’s recognition that many pieces of
evidence, including Meadow’s statistic, had misrepresented the case.
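As a rough check of the squaring step described above, the arithmetic can be written out as follows. The event names $S_1$ and $S_2$ (first and second child dying of SIDS) are notation introduced here for illustration, and the first equality holds only under the independence assumption that the paper goes on to criticise.

```latex
\[
P(S_1 \cap S_2) = P(S_1) \times P(S_2)
= \left(\frac{1}{8500}\right)^{2}
= \frac{1}{72\,250\,000}
\]
```

This is close to the quoted figure of approximately 1 in 73 million; the small difference suggests that the rounded value of 1 in 8500 was not the exact input used.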
Proof
The odds form of Bayes’ Theorem can be derived from its simple form as follows:

| Step | Explanation | Equation |
| --- | --- | --- |
| [1] | Bayes’ Theorem, simple form | $P(G \mid E) = \frac{P(E \mid G)\,P(G)}{P(E)}$ |
| [2] | Bayes’ Theorem, simple form | $P(G' \mid E) = \frac{P(E \mid G')\,P(G')}{P(E)}$ |
| [3] | [1] × $P(E)$ | $P(E)\,P(G \mid E) = P(E \mid G)\,P(G)$ |
| [4] | [3] ÷ $P(G \mid E)$ | $P(E) = \frac{P(E \mid G)\,P(G)}{P(G \mid E)}$ |
| [5] | [2] × $P(E)$ | $P(E)\,P(G' \mid E) = P(E \mid G')\,P(G')$ |
| [6] | [5] ÷ $P(G' \mid E)$ | $P(E) = \frac{P(E \mid G')\,P(G')}{P(G' \mid E)}$ |
| [7] | [6] = [4] | $\frac{P(E \mid G')\,P(G')}{P(G' \mid E)} = \frac{P(E \mid G)\,P(G)}{P(G \mid E)}$ |
| [8] | [7] × $P(G \mid E)$ | $\frac{P(G \mid E)\,P(E \mid G')\,P(G')}{P(G' \mid E)} = P(E \mid G)\,P(G)$ |
| [9] | [8] ÷ $\left(P(E \mid G') \times P(G')\right)$ | $\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E \mid G)}{P(E \mid G')}$ |
The Logical Fallacies
I realised that Professor Meadow made a grave error in the calculation of this
statistic. The probability of one event can only be multiplied by the probability of
another if they are independent, meaning that the occurrence of one does not affect the
probability that the other occurs. $P(A \cap B)$ means the probability that A and B
occur together, and assuming that A and B are independent events,

$$P(A \cap B) = P(A) \times P(B)$$

The occurrence of the first SIDS death, however, does influence the probability of the
second SIDS death. Meadow’s incorrect assumption of independence therefore leads to a
grossly underestimated probability of two SIDS deaths, which incorrectly pointed to
Clark’s guilt. The President of the Royal Statistical Society, in a letter criticizing
the court’s misuse of probability, notes that “There may well be unknown genetic or
environmental factors that predispose families to SIDS, so that a second case within the
family becomes much more likely than would be a case in another, apparently similar,
family” [4]. This makes them dependent events, and therefore it is inappropriate simply
to square the probability of a single SIDS death.
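The effect of dependence can be made explicit with the general multiplication rule. In this sketch, $S_1$ and $S_2$ denote the first and second SIDS deaths; this notation is an assumption introduced for illustration and no numerical value for $P(S_2 \mid S_1)$ is implied.

```latex
\[
P(S_1 \cap S_2) = P(S_1)\,P(S_2 \mid S_1),
\qquad
P(S_2 \mid S_1) > P(S_2)
\;\Longrightarrow\;
P(S_1 \cap S_2) > P(S_1)\,P(S_2)
\]
```

In words, if a first SIDS death makes a second more likely, then the true probability of two SIDS deaths is larger than the value obtained by simple squaring.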
Whilst assigning mathematical notation to the case’s events, I also realised that it
was wrong for the media and Meadow to assume that the low probability of the children’s
cot deaths translates directly into a low probability of her innocence. Having made
this logical error myself, I discovered that it is quite common in the interpretation
of statistics in the courtroom, where it is known as the Prosecutor’s Fallacy [2].
Suppose that G is the event of Sally Clark’s guilt, G′ the event of her innocence, and E
the evidence of the two deaths. Then $P(E \mid G') = \frac{1}{73\ \text{million}}$, which
represents the probability of the two deaths happening, given her innocence. For
simplicity, this assumes that there are only two possible explanations of the deaths:
double murder or two cases of SIDS. However, the probability of Sally Clark’s innocence
given the deaths is $P(G' \mid E)$. According to the terms of the simple form of Bayes’
Theorem, therefore, to equate $P(E \mid G')$ to the posterior, $P(G' \mid E)$, is wrong
unless the prior, $P(G')$, happens to equal $P(E)$.
Using Bayes’ Theorem
Using the odds form of Bayes’ Theorem, we can correctly calculate the posterior odds of
Sally Clark’s guilt:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E \mid G)}{P(E \mid G')}$$

• $P(E \mid G) = 1$, as it is certain that if Sally Clark were guilty of double
infanticide, the two deaths would have occurred.
• $P(E \mid G') = \frac{1}{73\,000\,000}$, as established by Professor Meadow earlier.
At this point I realised I did not have any figures for $P(G)$ or its complement,
$P(G')$. Luckily, I found the probability of an infanticide within the infant’s first
year of life to be $1.1 \times 10^{-5}$, from the UK Office for National Statistics data
for 1997 [2], which was the year in which Sally’s second child died. Following Professor
Meadow’s (admittedly ill-informed) assumption of independence, the probability of double
murder is

$$\left(1.1 \times 10^{-5}\right)^{2} \approx \frac{1}{8.4\ \text{billion}}$$

Hence, $P(G) = \frac{1}{8.4\ \text{billion}}$ and
$P(G') = 1 - \frac{1}{8.4\ \text{billion}}$. Substituting these values into Bayes’
Theorem, the posterior odds on guilt are:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{\frac{1}{8.4\ \text{billion}}}{1 - \frac{1}{8.4\ \text{billion}}} \times \frac{1}{\frac{1}{73\ \text{million}}} \approx 0.009 \text{ or } \frac{9}{1000}$$

Knowing this, we can conclude that it is over 100 times more likely that Sally Clark is
innocent than guilty, and that her children died of SIDS rather than murder.
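The posterior odds for this case can be reproduced with a minimal Python sketch that uses the figures quoted in the text (1 in 8.4 billion and 1 in 73 million). The variable names are illustrative, and the inputs carry all the caveats about independence discussed in the paper.

```python
# Figures quoted in the text; both rest on the questionable independence assumption.
p_guilt = 1 / 8.4e9            # P(G): prior probability of a double infanticide
p_innocent = 1 - p_guilt       # P(G'): complementary event
p_e_given_guilt = 1.0          # P(E | G): two deaths are certain given double murder
p_e_given_innocent = 1 / 73e6  # P(E | G'): Meadow's figure for two SIDS deaths

prior_odds = p_guilt / p_innocent
likelihood_ratio = p_e_given_guilt / p_e_given_innocent
posterior_odds = prior_odds * likelihood_ratio

print(round(posterior_odds, 3))    # about 0.009
print(round(1 / posterior_odds))   # odds in favour of innocence, a little over 100 to 1
```

The likelihood ratio strongly favours guilt, but the prior odds are so small that the posterior odds still favour innocence.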
Diagrammatic representation
Following this calculation, I was still uncertain about exactly how I had come to this
result, so I produced the following tree diagram of all possibilities.

[Figure: tree diagram. The red ‘guilty’ branch leads to G with probability 1/8.4 billion and then to E with probability 1. The blue ‘innocent’ branch leads to G′ with probability 1 − 1/8.4 billion and then to E with probability 1/73 million.]

With the aid of the diagram, I fully understood the use of Bayes’ Theorem. The odds form
is simply a comparison between the red and blue branches containing E, because we know
that the two deaths have already happened. Multiplying the probabilities on the red
‘guilty’ branch, I noticed I was actually just calculating the numerator of the
right-hand side of the odds form of Bayes’ Theorem. Hence, doing the same with the blue
‘innocent’ branch containing E, I divided the former by the latter and arrived at 0.009,
the same answer as earlier.
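The branch-by-branch reading of the tree diagram can be sketched in Python as below. The names `guilty_branch` and `innocent_branch` are hypothetical labels for the two paths that end in E, and the probabilities are those quoted in the text.

```python
from fractions import Fraction

p_g = Fraction(1, 8_400_000_000)           # P(G), first fork of the tree
p_e_given_g = Fraction(1)                  # P(E | G), second fork on the guilty side
p_e_given_not_g = Fraction(1, 73_000_000)  # P(E | G'), second fork on the innocent side

# Multiply along each path that ends in E (the two deaths having occurred).
guilty_branch = p_g * p_e_given_g
innocent_branch = (1 - p_g) * p_e_given_not_g

# Odds form: compare the two branches directly.
posterior_odds = guilty_branch / innocent_branch

# Simple form: the guilty branch as a share of all paths leading to E.
p_guilt_given_e = guilty_branch / (guilty_branch + innocent_branch)

print(float(posterior_odds))   # about 0.009
print(float(p_guilt_given_e))  # slightly smaller than the odds, as expected
```

The second quantity shows that the same two branches also give the posterior probability of guilt, not only the odds.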
Reflection
The benefit of using Bayes’ Theorem here is that it reminds us to compare the
probability of Sally Clark’s guilt to that of her innocence. Although two deaths by SIDS
were highly improbable, at 1 in 73 million, this is over a hundred times more probable
than a double infanticide, which the jury, the media and Professor Meadow himself failed
to recognise.
Another benefit is that Bayes’ Theorem outlines which terms must be known in order to
process evidence rationally and hence determine a defendant’s guilt or innocence. For
example, my use of Bayes’ Theorem in the Sally Clark case made me recognise that I still
needed the prior odds in order to use it.
This case has also revealed to me the difficulty of expressing mathematical concepts
clearly in language. An example is the Prosecutor’s Fallacy, in which it was extremely
easy for the media to represent the probability of Clark’s innocence as equal to the
probability of two consecutive cot deaths. The misinterpretation of statistics in court
is therefore extremely easy, and unfortunately has a profound effect on the outcomes of
cases.
I chose this particular case because of these profound effects. Although Clark’s
conviction was later overturned, her ordeal lasted more than 5 years and left her with
issues such as serious alcohol dependency, which led to her death in 2007 from alcohol
poisoning. These horrid consequences can be attributed to logical fallacies during her
case, such as the Prosecutor’s Fallacy, incorrect assumptions of independence, and the
failure to compare probabilities of guilt to innocence. It struck me that it was so easy
to make such fallacies, which would then lead to disastrous consequences for the lives of
others. On the other hand, I found that using mathematical notation and the tree diagram
made the reasoning much clearer to me than word-based explanations.
A key limitation in my analysis of Regina v Sally Clark was that I found $P(G)$ by
squaring the probability of infanticide within the first year of life. In this
calculation, I have, like Meadow, also assumed that the murder of one baby was
independent of the murder of the other. These events are actually dependent because,
hypothetically, if Clark did murder her first baby, it is much more likely that she
murdered her second as well. This may be due to factors surrounding the first
infanticide which may have also led her to commit the second one. I purposely based my
analysis on the same fallacious assumptions as Meadow to emphasise that, even when
following his logic, the posterior odds of guilt are still very low. In conclusion, my
examination of Regina v Sally Clark has highlighted to me the need for objective methods
such as Bayes’ Theorem to properly assess evidence and update probabilities of guilt.
Regina v Denis John Adams
We will now see how Bayes’ Theorem can be used to account for different pieces of
evidence, as opposed to one single piece of evidence.
This case was of particular interest to me because I had always thought of DNA as the
most conclusive evidence, proving an offender’s guilt beyond reasonable doubt. My
initial naïve view of DNA, influenced by sensationalized crime shows such as NCIS, was
challenged by its misuse in Regina v Denis John Adams.
The Case
In 1993, Denis John Adams was arrested on a charge of rape [5] committed in 1991. His
DNA was entered into the police database in 1993 following a different arrest, and when
a check was run on his DNA, it matched a sample from the scene of the 1991 rape. As a
result, he was convicted of rape at trial.
The prosecution’s case relied solely on the DNA evidence, arguing that its random match
probability was one in 200 million. As the random match probability was so low, it
seemed likely that Adams was guilty. The defence, however, proved that the random match
probability for that specific DNA test was one in 2 million, rather than one in 200
million.
The Evidence
1. The victim did not identify Adams
as her rapist at the identification
parade, and stated that he did not
resemble her attacker.
2. Adams’ girlfriend stated that
Adams spent the night with her, on
the night of the offence.
3. In the local area of the offence,
there were 150 000 males between
18-60 years of age. When
accounting for others who may
have entered the area that night,
this increases the potential suspect
pool to 200 000.
4. The random match probability is
one in two million.
The Prosecutor’s Fallacy
Immediately, I noticed that the prosecution had committed the Prosecutor’s Fallacy.
$P(\text{Match} \mid G')$, the random match probability, is conditional on the suspect
being innocent. However, from this statistic the prosecution argued for a low
probability of Adams’ innocence, which is actually $P(G' \mid \text{Match})$, the
probability of innocence given the match. The relation between them is determined by
the simple form of Bayes’ Theorem, which requires that $P(\text{Match} \mid G')$ be
multiplied by the prior, $P(G')$, and divided by $P(\text{Match})$.
Using Bayes’ Theorem
We can overcome the risk of committing the Prosecutor’s Fallacy by using Bayes’ Theorem.
Unlike in Regina v Sally Clark, not all the evidence is statistical, and therefore, as
pretend jury members of Adams’ trial, we will assign our own estimates of likelihood
ratios to each piece of non-statistical evidence, to reach conclusions on Adams’ guilt
or innocence.
As there is more than one piece of evidence, the total likelihood ratio can be found by
multiplying the individual likelihood ratios of each piece of evidence, provided the
pieces of evidence are independent of one another given guilt or innocence. Hence, the
odds form of Bayes’ Theorem is transformed into:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \left[ \frac{P(E_1 \mid G)}{P(E_1 \mid G')} \times \dots \times \frac{P(E_n \mid G)}{P(E_n \mid G')} \right]$$

Values are assigned to each term below:
It is reasonable to use the most readily available information as the prior odds, which
in this case is the potential suspect pool. Hence, the prior odds are approximately
$\frac{1}{200\,000}$.
We may denote $E_1$ as the event of the victim not identifying Adams as the offender.
If Adams were indeed guilty, we may estimate the probability of the victim not
identifying him, $P(E_1 \mid G)$, to be 0.1. Likewise, we would estimate that $E_1$ is
more likely if Adams were not guilty, say, $P(E_1 \mid G') = 0.9$. The likelihood ratio
for $E_1$ is therefore $\frac{1}{9}$.
We can also denote $E_2$ as the event of an alibi. We estimate that an alibi would exist
25% of the time given one’s guilt, and 75% of the time given one’s innocence. Hence, the
likelihood ratio for $E_2$ is $\frac{1}{3}$.
Finally, we denote $E_3$ as the event of a match in DNA testing. We know that
$P(E_3 \mid G) = 1$, as, if Adams were indeed guilty, he would obviously be a match for
the DNA. $P(E_3 \mid G') = \frac{1}{2\ \text{million}}$, as already given by the defence
earlier. Dividing the former by the latter, we arrive at the likelihood ratio of
$\frac{2\ \text{million}}{1}$.
Substituting these values into the above form of Bayes’ Theorem, we arrive at:

$$\frac{1}{200\,000} \times \left[ \frac{1}{9} \times \frac{1}{3} \times \frac{2\,000\,000}{1} \right] = \frac{10}{27}$$

Hence, in light of these pieces of evidence, the probability ratio of guilt to innocence
is 10:27. When we convert this ratio to a posterior probability of guilt we find it to
be:

$$\frac{10}{27 + 10} = \frac{10}{37} \approx 0.27$$

This posterior probability of guilt is wildly inadequate for a verdict of guilt beyond
reasonable doubt, which is the verdict the actual jury reached. Granted, this
calculation is based on my own personal assignments of likelihoods for two pieces of
evidence. This is, however, better than any purely intuition-based decision-making
exhibited by the jury, as my values were rationally generated through Bayes’ Theorem.
Implications and Reflection
After conviction, the case went to appeal, in which Fenton & Neil detail that the Appeal
Court claimed that “the introduction of Bayes’ Theorem, or any similar method, into a
criminal trial plunges the jury into inappropriate and unnecessary realms of theory and
complexity” [3]. They argued that juries were meant to “evaluate evidence and reach a
conclusion not by means of a formula, mathematical or otherwise, but by the joint
application of their individual common sense and knowledge of the world to the evidence
before them”.
The privileging of “individual common sense” over mathematical methods of accounting for
evidence highlights how people, even judges, are more comfortable relying on intuition
than on formulas, which may be confusing at times. Upon my encounter with these two
cases, I realised that the Appeal Court failed to recognise that one’s “common sense and
knowledge” gathered from everyday experiences do not equip one with the necessary skills
to understand and manipulate probabilities. So far I have demonstrated that it is
extremely easy, and even intuitive at times, to commit the Prosecutor’s Fallacy, to
incorrectly assume independence of events and to fail to compare the probability of
guilt to that of innocence.
Despite the Appeal Court’s comments on the apparent insignificance of Bayes’ Theorem, my
illustration of its use in this specific case, combined with my reasoning above, shows
the contrary.
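The posterior odds for Regina v Denis John Adams worked out earlier can be reproduced with a short Python sketch. The likelihood estimates for the identification and the alibi are the paper's own subjective assignments, the dictionary keys are illustrative labels, and multiplying the ratios assumes the pieces of evidence are independent given guilt or innocence.

```python
from fractions import Fraction
from math import prod  # available from Python 3.8

prior_odds = Fraction(1, 200_000)  # suspect pool of 200 000, treated as odds

# Likelihood ratio for each piece of evidence: P(E_i | G) / P(E_i | G')
likelihood_ratios = {
    "E1: victim did not identify Adams": Fraction(1, 10) / Fraction(9, 10),
    "E2: alibi from girlfriend": Fraction(1, 4) / Fraction(3, 4),
    "E3: DNA match": Fraction(1) / Fraction(1, 2_000_000),
}

posterior_odds = prior_odds * prod(likelihood_ratios.values())
p_guilt = posterior_odds / (1 + posterior_odds)

print(posterior_odds)            # 10/27
print(p_guilt, float(p_guilt))   # 10/37, roughly 0.27
```

Changing any of the subjective estimates in `likelihood_ratios` shows how sensitive the final probability is to the jurors' own assignments.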
Conclusion
This exploration has found Bayes’ Theorem to be extremely useful in the courtroom for
updating prior beliefs on guilt in light of new evidence. It has examined two real-world
cases, Regina v Sally Clark and Regina v Denis John Adams, through which I have also
revealed common logical fallacies that occur in the courtroom, such as the Prosecutor’s
Fallacy, and shown how Bayes’ Theorem prevents people from making such fallacies. In
conclusion, through bringing together the worlds of law and mathematics, I developed a
greater appreciation of the contribution mathematics can make to seemingly unrelated
aspects of life.
References
[1] Dawid, Alexander. "Bayes' Theorem and Weighing Evidence by Juries." In Swinburne, R. (ed.), Bayes’s Theorem. London: Oxford University Press, 2005, pp. 71-90. http://www.math.nmsu.edu/~jlakey/m210/dawid-paper.pdf
[2] Fenton, Norman, and Martin Neil. "Avoiding probabilistic reasoning fallacies in legal practice using Bayesian networks." Austl. J. Leg. Phil. 36 (2011): 114. http://www.eecs.qmul.ac.uk/~norman/papers/fenton_neil_prob_fallacies_3_July_09.pdf
[3] Green, P. "Letter from the President to the Lord Chancellor regarding the use of statistical evidence in court cases, 23 January 2002." The Royal Statistical Society, 2002. http://www.ucl.ac.uk/lapt/doubt/rss-2002.pdf
[4] Haese, Robert, Haese, Sandra, Haese, Michael, Mäenpää, Marjut, and Humphries, Mark. Mathematics for the international student: Mathematics SL, 3rd Ed. Adelaide: Haese Mathematics, 2012.
[5] Motivate. "The test is positive: What are the odds it’s wrong?" Motivate. 4 January 2014. <https://motivate.maths.org/content/sites/motivate.maths.org/files/PositiveTest_RvDenisJohnAdams.pdf>.
[6] Westbury, Chris. "Bayes’ For Beginners." Department of Psychology, P220 Biological Sciences Bldg., University of Alberta, Edmonton, AB, T6G 2E9, Canada. http://www.ualberta.ca/~chrisw/BayesForBeginners.pdf
Effect of surface area and volume on rate of cooling
Joshua Gereis and Victor Wu
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]
Abstract
Newton’s Law of Cooling describes the rate of cooling of an object, given its temperature, the
ambient temperature, and a constant characterising how quickly the object cools (represented by ‘k’
in Newton’s Law). This experiment aims to find the relationship between an object’s surface area to
volume ratio and its value of k. We find a linear relationship between these two values.
Hypothesis
Our hypothesis was that objects with a
greater surface area to volume ratio would
cool more rapidly than those with a lesser
ratio. We also expected there to be a linear
relationship between this ratio and the rate
of cooling ( k ) in Newton's Law of
Cooling.
Aim
Our aim was to determine the relationship
between an object's surface area to volume
ratio and its rate of cooling.
Method
We used four 2.5 × 2.5 × 2.5 cm and four
1 × 1 × 1 cm iron cubes, and four 5 g, four 25 g
and two 50 g brass masses as the objects to test.
We weighed each of the objects with an
electronic scale, using known densities of
the objects to calculate their volume.
Measurements were also taken to estimate
the surface area.
We then heated the objects up in an
electric frypan with sand, and took them
out in groups. Using an infrared
thermometer, we measured the initial
temperature of the objects (immediately
after taking them out), then subsequently
at each minute afterwards for 9 minutes
(to obtain 10 data points in total). To do
this, we tied each object with string, and
upon taking them out of the pan, tied them
onto a bosshead on a retort stand. We used
separate stopwatches for different objects
when they were cooling at the same time
to measure the one minute intervals. To
calculate the value of k for each object, we
used the formula derived below from
Newton’s Law of Cooling [1], applying it to each
of the 45 possible pairs of the 10 data points for
each object as the two temperatures in the
derived formula. The value of k was taken as
the mean of these 45 estimates, and the
error as their standard deviation.
\[ \frac{dT}{dt} = -k\,(T - T_a) \]
\[ T(t) = (T_0 - T_a)\,e^{-kt} + T_a \]
\[ \frac{T(t_1) - T_a}{T(t_2) - T_a} = e^{-k(t_1 - t_2)} \]
\[ \ln\!\left(\frac{T(t_1) - T_a}{T(t_2) - T_a}\right) = -k\,(t_1 - t_2) \]
\[ \therefore\; k = \frac{\ln\!\left(\dfrac{T(t_1) - T_a}{T(t_2) - T_a}\right)}{t_2 - t_1} \]
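The pairwise calculation described in the Method can be sketched as follows. This is only an illustration under stated assumptions: the function name estimate_k, the synthetic temperature readings and the use of the sample standard deviation are hypothetical choices, not the authors' actual data or procedure.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def estimate_k(times, temps, ambient):
    """Estimate k from every pair of readings using
    k = ln((T(t1) - Ta) / (T(t2) - Ta)) / (t2 - t1)."""
    estimates = []
    for (t1, temp1), (t2, temp2) in combinations(zip(times, temps), 2):
        ratio = (temp1 - ambient) / (temp2 - ambient)
        estimates.append(math.log(ratio) / (t2 - t1))
    # Mean of the pairwise estimates, their sample standard deviation
    # (an assumed choice of error measure), and the number of pairs used.
    return mean(estimates), stdev(estimates), len(estimates)


# Hypothetical, noise-free readings: one per minute for 10 data points.
ambient = 25.0
true_k = 0.12
times = list(range(10))
temps = [ambient + 75.0 * math.exp(-true_k * t) for t in times]

k_mean, k_err, n_pairs = estimate_k(times, temps, ambient)
print(k_mean, k_err, n_pairs)
```

With 10 readings there are 45 pairs, matching the count quoted in the Method. Real readings contain noise, so the spread of the 45 estimates would be much larger than for this synthetic example.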
Results
We found that there was a positive
correlation between the surface area to
volume ratio and the value of k for the
objects, which seemed to be a linear
relationship.
Discussion
Initially, the iron cubes were planned to be
used as a control for mass, so that any
effect could be considered when analyzing
the data from the brass masses. However,
we overlooked the fact that though the
cubes were roughly the same shape, they
had vastly different surface area to volume
ratios. If mass does have an impact on the
rate of cooling, then this data would need
to be reanalyzed. However, we believe
that the surface area to volume ratio is the
main factor in this, as the volume stores
the heat energy, and the surface area
provides a way for the energy to escape.
The line of best fit for both graphs has a
y-intercept very close to zero. This is
intuitive, because in the theoretical case
that there is no surface area, there would
be no way for the heat to escape, and
therefore the rate of cooling would be
zero. This supports the presence of a linear
relationship between the two variables.
The large error ranges also allow for other
relationships, for example a polynomial
relationship that also passes through the
origin; however, the linear relationship
best fits the data points.
One interesting result we found from the
data was that sometimes the objects would
actually increase in temperature
significantly, and this increase could not
be accounted for by the errors in
measurement. We hypothesise that this is
because the surface cooled rapidly, while
the “core” of the objects cooled much
more slowly, and the energy from the core
would excite the atoms on the surface,
causing the observed fluctuations.
Conclusion
We conclude that the rate of cooling of an
object (made of brass or iron), as measured
by its value of k, is proportional to
its surface area to volume ratio. However,
to obtain an accurate and general result,
more experimentation and analysis are
needed.
Acknowledgments
We would like to acknowledge Mr. Stephen
McAndrew for his guidance and supervision
of the experiment, and Mr. Rocco Appio for
supplying us with equipment.
References
[1] Khamsi, M. A., Newton’s Law of Cooling,
http://www.sosmath.com/diffeq/first/application/
newton/newton.html, accessed 31 October 2014
New Chocolate Games
Takeru Kitagawa and Shunsuke Nakamura
Kwansei Gakuin High School
Uegahara-1-1-155
Nishinomiya City
Japan
Abstract
The authors studied a Chocolate Game that is a variant of the well-known combinatorial game of Nim,
and discovered new formulas for P-positions of the game. The chocolate studied in this paper can be cut
in three directions, and each position has coordinates (x, y, z), where x, y, z are the maximum numbers of times
we can cut the chocolate in each direction respectively.
Rectangular chocolate games are mathematically equivalent to the game of Nim with three piles, but the
coordinates (x, y, z) of the Chocolate Game created by the authors satisfy inequalities, and this fact makes
the mathematical structure of these chocolate games different from that of the game of Nim.
The research presented in this paper was carried out by a group of high school students using the computer algebra
system Mathematica.
1 Introduction
Chocolate Games are a variant of the well-known
combinatorial game of Nim, and the first chocolate
in Fig. 1.1 was presented in [1]. The second and
the third chocolates in Fig. 1.1 were introduced and
studied by the authors in [6] and in [9] respectively.
Figure 1.1.
Figure 1.2.
Definition 1.1. We are given a piece of chocolate in which
the light gray parts are sweet and the dark gray part
is very bitter. This game is played by two players in
turn. Each player breaks the chocolate (in a straight
line along the grooves) and eats the piece he breaks
off. The player who leaves his opponent with the single
bitter part (so that his opponent has no move) is the
winner.
For examples of the Chocolate Game see Fig. 1.1 and
Fig. 1.2; the main topic of this paper is the
chocolate in Fig. 1.2.
The structure of the Chocolate Game will be very
clear if you download the authors’ paper [10]. You
can play the Chocolate Game with the free Mathematica player.
In Sections 1, 2 and 3 the authors present their research on the Chocolate Game in Fig. 1.2. In Section
4 the authors present examples of chocolates that do
not have simple formulas for P-positions.
Chocolate Games look like the game of Chomp,
but these games are mathematically different from Chomp. As to Chomp see [3]. The winning strategy for Chomp has not been discovered, but winning strategies for many chocolates that were created
by our mathematics group have been discovered.
Here we define two important kinds of positions of chocolates.
Definition 1.2. (a) N-positions, from which we can
force a win, as long as we play correctly at every
stage.
(b) P-positions, from which we will lose however well
we play, but we may end up winning if our opponent
makes a mistake.
We define coordinates for each position of the Chocolate Game. Let Z≥0 be the set of non-negative integers.
Definition 1.3. We represent the chocolate with coordinates (x, y, z), where x, y, z stand for the maximum numbers of times that we can cut the chocolate in each direction.
Example 1.1. In the chocolate of Fig. 1.2 we can
cut 6 times at most vertically on the left side of the
dark gray (bitter) block, we can cut 6 times at most
horizontally above the dark gray (bitter) block and
we can cut 12 times at most vertically on the right side
of the dark gray block. Therefore x = 6, y = 6 and
z = 12, and we represent this chocolate with
the coordinates (6, 6, 12).
For examples of coordinates see Fig. 1.3. It is
clear that the coordinates of these positions satisfy
the inequality 2y ≤ z, and this is equivalent to the
inequality
\[ y \le \left\lfloor \frac{z}{2} \right\rfloor, \tag{1.1} \]
where ⌊ ⌋ is the floor function.
The Chocolate Games made by the authors are new, since
these games must satisfy inequalities.
Figure 1.3.
Figure 1.4.
Our aim is to find the formula for P-positions of
the chocolate of Fig. 1.2, and the first step is to study
the right part of it, which is presented in Fig. 1.4.
Our tool is the Grundy number.
For the detailed theory of Grundy Numbers see [2].
First we define move for this chocolate.
Definition 1.4. We define move for the chocolate
in Fig. 1.4.
For y, z ∈ Z≥0 with 2y ≤ z we define
move((y, z)) = {(v, z) : 0 ≤ v < y} ∪ {(min(y, ⌊w/2⌋), w) : 0 ≤ w < z},
where v, w ∈ Z≥0.
move((y, z)) is the set of all positions that can
be reached from (y, z) directly.
Remark 1.1. By Definition 1.4
move((y, z)) = {(v, z) : 0 ≤ v < y} ∪ {(min(y, ⌊w/2⌋), w) : 0 ≤ w < z}
= {(v, z) : 0 ≤ v < y} ∪ {(⌊w/2⌋, w) : ⌊w/2⌋ < y and 0 ≤ w < z} ∪ {(y, w) : ⌊w/2⌋ ≥ y and 0 ≤ w < z}.
Example 1.2. move((1, 3)) = {(0, 3), (1, 2), (0, 1), (0, 0)}.
We define the function Mex(A) for a set A of
non-negative integers.
Definition 1.5. Let Mex(A) be the least non-negative integer not in the set A.
Example 1.3. Mex({0, 1, 4, 5, 6}) = 2, Mex({1, 4, 5, 6}) = 0.
We define the Grundy Number G for the chocolate in Fig. 1.4.
Definition 1.6. Let G((0, 0)) = 0. For a position
(y, z) we define the Grundy Number recursively by
G((y, z)) = Mex({G((v, w)) : (v, w) ∈ move((y, z))}).
2 Some Lemmas for Nim-sum
Definition 2.1. Let x, y be non-negative integers,
and write them in base 2, so $x = \sum_{i=0}^{n} x_i 2^i$ and
$y = \sum_{i=0}^{n} y_i 2^i$ with $x_i, y_i \in \{0, 1\}$. We define the
nim-sum x ⊕ y by
\[ x \oplus y = \sum_{i=0}^{n} w_i 2^i, \tag{2.1} \]
where $w_i = x_i + y_i \pmod 2$.
Lemma 2.1. We suppose that
\[ x \oplus y > z. \tag{2.2} \]
Then we have x > y ⊕ z or y > z ⊕ x.
Proof. We write x, y, z in base 2, so $x = \sum_{i=0}^{n} x_i 2^i$,
$y = \sum_{i=0}^{n} y_i 2^i$ and $z = \sum_{i=0}^{n} z_i 2^i$ with $x_i, y_i, z_i \in \{0, 1\}$.
Let m be the largest index at which the binary digits of x ⊕ y and z differ;
such an index exists because x ⊕ y ≠ z by (2.2). Thus for i = m + 1, m + 2, ..., n
\[ x_i + y_i + z_i = 0 \pmod 2 \tag{2.3} \]
and
\[ x_m + y_m + z_m \neq 0 \pmod 2. \tag{2.4} \]
Then by (2.3) we have for i = m + 1, m + 2, ..., n
\[ x_i + y_i = z_i \pmod 2, \tag{2.5} \]
\[ y_i + z_i = x_i \pmod 2 \tag{2.6} \]
and
\[ z_i + x_i = y_i \pmod 2. \tag{2.7} \]
By (2.2), (2.4) and (2.5) we have $x_m + y_m = 1 > 0 = z_m$.
If $x_m = 1$ and $y_m = 0$, then by (2.6) we have
x > y ⊕ z.
If $x_m = 0$ and $y_m = 1$, then by (2.7) we have
y > z ⊕ x.
Lemma 2.2. For y, z ∈ Z≥0
\[ y \oplus z = \mathrm{Mex}(\{v \oplus z : v = 0, 1, 2, \dots, y-1\} \cup \{y \oplus w : w = 0, 1, 2, \dots, z-1\}). \tag{2.8} \]
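Definitions 1.4, 1.5 and 1.6 translate almost directly into code. The following is a minimal sketch, assuming Python in place of the Mathematica used by the authors; the function names mex, move and grundy are hypothetical and simply mirror the notation Mex, move and G above.

```python
from functools import lru_cache


def mex(values):
    """Definition 1.5: least non-negative integer not in the given set."""
    present = set(values)
    n = 0
    while n in present:
        n += 1
    return n


def move(y, z):
    """Definition 1.4: positions reachable from (y, z), where 2y <= z."""
    assert 2 * y <= z, "coordinates must satisfy the inequality 2y <= z"
    reduce_y = {(v, z) for v in range(y)}
    # Reducing z to w may force y down to floor(w / 2).
    reduce_z = {(min(y, w // 2), w) for w in range(z)}
    return reduce_y | reduce_z


@lru_cache(maxsize=None)
def grundy(y, z):
    """Definition 1.6: recursive Grundy Number of the position (y, z)."""
    return mex(grundy(v, w) for (v, w) in move(y, z))


print(sorted(move(1, 3)))   # the four positions listed in Example 1.2
print(mex({0, 1, 4, 5, 6}), mex({1, 4, 5, 6}))
print(grundy(6, 12))
```

Because move((0, 0)) is empty, the recursion gives grundy(0, 0) = 0 without a separate base case, in agreement with Definition 1.6.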
Proof. Clearly y ⊕ z does not belong to {v ⊕ z : v =
0, 1, 2, ..., y − 1} ∪ {y ⊕ w : w = 0, 1, 2, ..., z − 1}.
Let k be a non-negative integer such that y ⊕ z > k;
then by Lemma 2.1 we have z > y ⊕ k or y > k ⊕ z.
If z > y ⊕ k, then k = (y ⊕ y) ⊕ k = y ⊕ (y ⊕ k)
∈ {y ⊕ w : w = 0, 1, 2, ..., z − 1},
since y ⊕ k = w for some 0 ≤ w < z.
If y > k ⊕ z, then k = k ⊕ (z ⊕ z) = (k ⊕ z) ⊕ z
∈ {v ⊕ z : v = 0, 1, 2, ..., y − 1}.
Therefore any non-negative integer smaller than y ⊕ z
belongs to {v ⊕ z : v = 0, 1, 2, ..., y − 1} ∪ {y ⊕ w : w =
0, 1, 2, ..., z − 1}. By the definition of Mex (Definition
1.5) this proves the lemma.
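Lemma 2.1 and Lemma 2.2 can also be checked numerically for small values. The sketch below is an illustration only and not part of the proof; it assumes Python, where the nim-sum ⊕ is the bitwise XOR operator ^, and the names LIMIT, lemma_2_1_holds and lemma_2_2_holds are hypothetical.

```python
from itertools import product


def mex(values):
    """Least non-negative integer not in the given set (Definition 1.5)."""
    present = set(values)
    n = 0
    while n in present:
        n += 1
    return n


def lemma_2_1_holds(x, y, z):
    """If x XOR y > z, then x > y XOR z or y > z XOR x."""
    if (x ^ y) <= z:
        return True  # the hypothesis (2.2) is not satisfied
    return x > (y ^ z) or y > (z ^ x)


def lemma_2_2_holds(y, z):
    """Equation (2.8): y XOR z is the Mex of the two listed sets."""
    reachable = {v ^ z for v in range(y)} | {y ^ w for w in range(z)}
    return (y ^ z) == mex(reachable)


LIMIT = 32  # exhaustive check over 0 <= x, y, z < LIMIT
check_2_1 = all(lemma_2_1_holds(x, y, z)
                for x, y, z in product(range(LIMIT), repeat=3))
check_2_2 = all(lemma_2_2_holds(y, z)
                for y, z in product(range(LIMIT), repeat=2))
print(check_2_1, check_2_2)
```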
Lemma 2.3. For any odd number m there exist a
non-negative integer a and a natural number t such
that
\[ m = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i. \tag{2.9} \]
Proof. Let m be an odd number, and write it in
base 2, so $m = \sum_{i=0}^{n} m_i 2^i$. If $m_i = 1$ for each
i = 0, 1, 2, ..., n, then let t = n + 1 and a = 0, and we
have (2.9).
If there exists i such that $m_i = 0$, then let
$t = \min\{i : m_i = 0\}$. Since m is odd, t > 0.
Let $a = \sum_{i=t+1}^{n} m_i 2^{i-t-1}$. Then we have (2.9).
Lemma 2.4. Let x be an arbitrary odd number, and
$x = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i$ for some non-negative integer a and natural number t.
For a non-negative integer c let
\[ A(a, t, c) = \{x \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, \dots, \textstyle\sum_{i=0}^{t} 2^i\} \]
and
\[ B(a, t, c) = \{(x + 1) \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, \dots, \textstyle\sum_{i=0}^{t} 2^i\}. \]
Then the set A(a, t, c) is the same as the set
B(a, t, c).
Proof. An arbitrary element of A(a, t, c) can be expressed as
$x \oplus (z + 2^{t+1} \times c)$ for some integer z
such that $0 \le z \le \sum_{i=0}^{t} 2^i$, and we express z by
$z = \sum_{i=0}^{t} z_i 2^i$. Then
\begin{align*}
x \oplus (z + 2^{t+1} \times c)
&= \Big(2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i\Big) \oplus \Big(\sum_{i=0}^{t} z_i 2^i + 2^{t+1} \times c\Big) \\
&= (2^{t+1} \times a) \oplus \Big(z_t 2^t + \sum_{i=0}^{t-1} (1 - z_i) 2^i + 2^{t+1} \times c\Big) \\
&= (2^{t+1} \times a + 2^t) \oplus \Big(\sum_{i=0}^{t} (1 - z_i) 2^i + 2^{t+1} \times c\Big) \\
&= (x + 1) \oplus \Big(\sum_{i=0}^{t} (1 - z_i) 2^i + 2^{t+1} \times c\Big). \tag{2.10}
\end{align*}
Since $0 \le \sum_{i=0}^{t} (1 - z_i) 2^i \le \sum_{i=0}^{t} 2^i$, by Equation
(2.10) we have
$x \oplus (z + 2^{t+1} \times c) \in \{(x + 1) \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, \dots, \sum_{i=0}^{t} 2^i\} = B(a, t, c)$.
The function $z \mapsto \sum_{i=0}^{t} (1 - z_i) 2^i$ is a one-to-one mapping of
$\{0, 1, 2, \dots, \sum_{i=0}^{t} 2^i\}$ onto $\{0, 1, 2, \dots, \sum_{i=0}^{t} 2^i\}$.
Therefore we have the conclusion of this lemma.
Lemma 2.5. Let m be a natural number. Then the
set {m ⊕ z : z = 0, 1, 2, ..., 2m + 1} is the same as the
set {(m + 1) ⊕ z : z = 0, 1, 2, ..., 2m + 1}.
Proof. Let m be an arbitrary odd number; then by
Lemma 2.3, $m = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i$ for a non-negative
integer a and a natural number t.
Then
\[ 2m + 1 = 2^{t+1} \times (2a) + 2\Big(\sum_{i=0}^{t-1} 2^i\Big) + 1 = 2^{t+1} \times (2a) + \sum_{i=0}^{t} 2^i. \]
Then by using Lemma 2.4 for c = 0, 1, ..., 2a we have
\begin{align*}
\{m \oplus z : z = 0, 1, 2, \dots, 2m + 1\}
&= \bigcup_{c=0}^{2a} \Big\{m \oplus (z + c \times 2^{t+1}) : z = 0, 1, 2, \dots, \sum_{i=0}^{t} 2^i\Big\} \\
&= A(a, t, 0) \cup A(a, t, 1) \cup A(a, t, 2) \cup \dots \cup A(a, t, 2a) \\
&= B(a, t, 0) \cup B(a, t, 1) \cup B(a, t, 2) \cup \dots \cup B(a, t, 2a) \\
&= \bigcup_{c=0}^{2a} \Big\{(m + 1) \oplus (z + c \times 2^{t+1}) : z = 0, 1, 2, \dots, \sum_{i=0}^{t} 2^i\Big\} \\
&= \{(m + 1) \oplus z : z = 0, 1, 2, \dots, 2m + 1\}. \tag{2.11}
\end{align*}
Next we assume that m is even, and let m = 2n.
Then (2n) ⊕ (2i) = (2n + 1) ⊕ (2i + 1) and (2n) ⊕
(2i + 1) = (2n + 1) ⊕ (2i) for i = 0, 1, 2, ..., m. Therefore the set {m⊕z : z = 0, 1, 2, ..., 2m+1} is the same
as the set {(m + 1) ⊕ z : z = 0, 1, 2, ..., 2m + 1}.
Lemma 2.6. For any natural number m we have
{⌊z/2⌋ ⊕ z : z = 0, 1, 2, 3, ..., 2m + 1}
= {(m + 1) ⊕ z : z = 0, 1, 2, 3, ..., 2m + 1}. (2.12)
Proof. We prove by mathematical induction. For m = 1 both sides
of Equation (2.12) are equal to {0, 1, 2, 3}. Suppose that Equation (2.12) is true for m = k. We
prove Equation (2.12) for m = k + 1. By the hypothesis of mathematical induction and the fact that
⌊z/2⌋ = k + 1 for z = 2k + 2, 2k + 3 we have
{⌊z/2⌋ ⊕ z : z = 0, 1, 2, 3, ..., 2(k + 1) + 1}
= {⌊z/2⌋ ⊕ z : z = 0, 1, 2, 3, ..., 2k + 1} ∪ {⌊z/2⌋ ⊕ z : z = 2k + 2, 2k + 3}
= {(k + 1) ⊕ z : z = 0, 1, 2, 3, ..., 2k + 1} ∪ {(k + 1) ⊕ (2k + 2), (k + 1) ⊕ (2k + 3)}
= {(k + 1) ⊕ z : z = 0, 1, 2, 3, ..., 2k + 1, 2k + 2, 2k + 3},
which is equal to the
set {(k + 2) ⊕ z : z = 0, 1, 2, 3, ..., 2k + 1, 2k + 2, 2k + 3}
by Lemma 2.5. Therefore Equation (2.12) is true for
m = k + 1, and the proof is finished by mathematical
induction.
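The decomposition of Lemma 2.3 and the set identities of Lemma 2.5 and Lemma 2.6 can be spot-checked by computer. The following is a rough sketch, assuming Python with ^ as the nim-sum; the function names are hypothetical, and passing such a check for small m is an illustration, not a substitute for the proofs.

```python
def decompose_odd(m):
    """Lemma 2.3: return (a, t) with m = 2^(t+1) * a + (2^t - 1) for odd m.

    t is the index of the lowest zero bit of m; the sum of 2^i for
    i = 0, ..., t-1 equals 2^t - 1.
    """
    assert m % 2 == 1, "m must be odd"
    t = 0
    while (m >> t) & 1:
        t += 1
    a = m >> (t + 1)
    return a, t


def lemma_2_5_holds(m):
    """{m XOR z} equals {(m+1) XOR z} for z = 0, ..., 2m+1."""
    zs = range(2 * m + 2)
    return {m ^ z for z in zs} == {(m + 1) ^ z for z in zs}


def lemma_2_6_holds(m):
    """{floor(z/2) XOR z} equals {(m+1) XOR z} for z = 0, ..., 2m+1."""
    zs = range(2 * m + 2)
    return {(z // 2) ^ z for z in zs} == {(m + 1) ^ z for z in zs}


a, t = decompose_odd(5)
print(a, t, 2 ** (t + 1) * a + (2 ** t - 1))
print(all(lemma_2_5_holds(m) for m in range(1, 200)))
print(all(lemma_2_6_holds(m) for m in range(1, 200)))
```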
3 P-positions of Chocolate Game
Lemma 3.1. G((y, z)) = y ⊕ z for any y, z ∈ Z≥0
with 2y ≤ z.
Proof. By mathematical induction on a natural
number n we prove
G((y, z)) = y ⊕ z for y, z ∈ Z≥0 with 2y ≤ z. (3.1)
We suppose (3.1) for y, z ∈ Z≥0 with y + z < n
and prove (3.1) for y, z ∈ Z≥0 with y + z = n. Let
y, z ∈ Z≥0 satisfy y + z = n. By the definition
of move, Grundy Number G, Remark 1.1 and the
hypothesis of mathematical induction
G((y, z)) = Mex({G((v, w)) : (v, w) ∈ move((y, z))})
= Mex({G((v, z)) : v = 0, 1, ..., y − 1}
∪ {G((y, w)) : ⌊w/2⌋ ≥ y and 0 ≤ w < z}
∪ {G((⌊w/2⌋, w)) : ⌊w/2⌋ < y and 0 ≤ w < z})
= Mex({v ⊕ z : v = 0, 1, ..., y − 1}
∪ {y ⊕ w : ⌊w/2⌋ ≥ y and 0 ≤ w < z}
∪ {⌊w/2⌋ ⊕ w : ⌊w/2⌋ < y and 0 ≤ w < z})
= Mex({v ⊕ z : v = 0, 1, ..., y − 1}
∪ {y ⊕ w : w = 2y, 2y + 1, ..., z − 1}
∪ {⌊w/2⌋ ⊕ w : w = 0, 1, 2, ..., 2y − 1}). (3.2)
Note that we use the fact that 2y ≤ z to show that
⌊w/2⌋ < y and w < z if and only if w = 0, 1, 2, ..., 2y − 1
in the last equation.
By Lemma 2.6 (used with m = y − 1 when y ≥ 1)
{y ⊕ w : w = 2y, 2y + 1, ..., z − 1} ∪ {⌊w/2⌋ ⊕ w : w = 0, 1, 2, ..., 2y − 1}
= {y ⊕ w : w = 2y, 2y + 1, ..., z − 1} ∪ {y ⊕ w : w = 0, 1, 2, ..., 2y − 1}
= {y ⊕ w : w = 0, 1, 2, ..., z − 1}. (3.3)
By Lemma 2.2, equations (3.2) and (3.3), G((y, z)) =
y ⊕ z.
By using Lemma 3.1 we make a theorem for P-positions of the Chocolate Game that satisfies the inequality y ≤ ⌊z/2⌋. We need some theorems to do that.
Theorem 3.1. Let GX be the Grundy Number of an
arbitrary combinatorial game X with a position x.
Then a position x is a P-position of the game if and
only if GX(x) = 0.
This is a well known theorem of combinatorial
game theory. For a proof see Theorem 7.12 (p.138)
of [2].
We need a theorem on the Grundy Number of the
sum of two games.
Definition 3.1. Let X and Y be two arbitrary combinatorial games. The sum of these games X and
Y is a game where each player may choose to play
either in X or Y at any point in the game, and a
player wins when his opponent has no move in either game. We denote by X ⊕ Y the sum of X and
Y, and by {x, y} the sum of the position x in the
game X and the position y in the game Y.
Theorem 3.2. Let GX(x), GY(y) and
GX⊕Y({x, y}) be the Grundy Numbers of the Chocolate
Games X, Y and X ⊕ Y. Then GX⊕Y({x, y})
= GX(x) ⊕ GY(y).
This is a well known theorem of combinatorial
game theory. For a proof see Theorem 7.24 (p.142)
of [2].
Figure 3.1.
We define the Grundy number for the chocolate in
Fig. 3.1.
First we define move1 for it.
Definition 3.2. For x ∈ Z≥0 we define
move1((x)) = {(u) : 0 ≤ u < x}, where u ∈ Z≥0.
We define the Grundy number G1 for the chocolate
in Fig. 3.1.
Definition 3.3. Let G1((0)) = 0. For a position (x)
we define the Grundy Number recursively by
G1((x)) = Mex({G1((u)) : (u) ∈ move1((x))}).
Lemma 3.2. G1((x)) = x for any x ∈ Z≥0.
Proof. Since move1((x)) = {(x − 1), (x − 2), ..., (0)},
this is clear from Definition 3.3.
Definition 3.4. Let A2 = {(x, y, z) : x, y, z ∈
Z≥0, y ≤ ⌊z/2⌋ and x ⊕ y ⊕ z = 0}, B2 = {(x, y, z) :
x, y, z ∈ Z≥0, y ≤ ⌊z/2⌋ and x ⊕ y ⊕ z ≠ 0}.
Theorem 3.3. Let G2 be the Grundy number of the
chocolate in Fig. 1.2. Then G2((x, y, z)) = x ⊕ y ⊕ z.
Proof. By Theorem 3.2 G2 = G1 ⊕ G, where G and
G1 are the Grundy numbers of the chocolates in Fig. 1.4
and Fig. 3.1, and hence by Lemma 3.1 and Lemma
3.2 we have G2((x, y, z)) = x ⊕ y ⊕ z.
Theorem 3.4. Let A2 and B2 be the sets defined in
Definition 3.4. A2 is the set of P-positions, and B2
is the set of N-positions of the Chocolate Game that
satisfies the inequality y ≤ ⌊z/2⌋.
Proof. By Theorem 3.3 G2((x, y, z)) = x ⊕ y ⊕ z,
and hence by Theorem 3.1 (x, y, z) is a P-position if
and only if x ⊕ y ⊕ z = 0. Consequently (x, y, z) is
an N-position if and only if x ⊕ y ⊕ z ≠ 0.
Remark 3.1. Theorem 3.4 can be generalized for
the case of the chocolate that satisfies the inequality
y ≤ ⌊z/k⌋ for an arbitrary even number k.
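Theorem 3.3 and Theorem 3.4 can be checked by brute force for small chocolates. The sketch below is a hedged illustration in Python rather than the authors' Mathematica code. It assumes, as in the proof of Theorem 3.3, that the chocolate of Fig. 1.2 is the sum of the one-coordinate game of Fig. 3.1 and the two-coordinate game of Fig. 1.4; the names move2, grundy2, is_p_position, MAX_X and MAX_Z are hypothetical.

```python
from functools import lru_cache


def mex(values):
    """Least non-negative integer not in the given set (Definition 1.5)."""
    present = set(values)
    n = 0
    while n in present:
        n += 1
    return n


def move2(x, y, z):
    """Assumed moves from (x, y, z): play in the left part (reduce x)
    or in the right part (reduce y, or reduce z with y <= floor(z/2))."""
    reduce_x = {(u, y, z) for u in range(x)}
    reduce_y = {(x, v, z) for v in range(y)}
    reduce_z = {(x, min(y, w // 2), w) for w in range(z)}
    return reduce_x | reduce_y | reduce_z


@lru_cache(maxsize=None)
def grundy2(x, y, z):
    """Grundy Number computed directly from the recursive definition."""
    return mex(grundy2(*position) for position in move2(x, y, z))


def is_p_position(x, y, z):
    """Theorem 3.1: a position is a P-position iff its Grundy Number is 0."""
    return grundy2(x, y, z) == 0


MAX_X, MAX_Z = 8, 16  # small search bounds chosen for illustration
theorem_3_3_holds = all(
    grundy2(x, y, z) == x ^ y ^ z
    for x in range(MAX_X)
    for z in range(MAX_Z)
    for y in range(z // 2 + 1)
)
print(theorem_3_3_holds)
print(is_p_position(6, 6, 12))
```

For the chocolate (6, 6, 12) of Example 1.1 the nim-sum is 6 ⊕ 6 ⊕ 12 = 12, which is not zero, so by Theorem 3.4 it is an N-position.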
4 Chocolate Games without Simple Formula for P-positions
By Theorem 3.4 the chocolate game that we studied
in Sections 1, 2 and 3 has a very simple formula for P-positions, and one might think that all
chocolate games have a simple formula for P-positions. In
this section the authors present examples of chocolates without a simple formula for P-positions.
For the chocolate in Fig. 4.1 no formula for P-positions is known. As to the calculation by
computer see [11].
Some chocolates have formulas for P-positions, but
they are not simple. One of these chocolates is the
first chocolate in Fig. 4.2. The authors studied the right part of this chocolate, which is the second
chocolate in Fig. 4.2, and made the table of Grundy Numbers in Fig. 4.3.
There seem to be some kinds
of patterns in these numbers, but the patterns are not
simple and there is no relation between these numbers and nim-sum.
As to the detailed study of this chocolate see [9].
This chocolate satisfies the inequality y ≤ z.
Figure 4.1.
Figure 4.2.
Z\Y   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  0   0
  1   1   2
  2   2   1   3
  3   3   4   1   5
  4   4   3   5   1   6
  5   5   6   4   7   1   8
  6   6   5   7   4   8   1   9
  7   7   8   6   9   4  10   1  11
  8   8   7   9   6  10   4  11   1  12
  9   9  10   8  11   7  12   4  13   1  14
 10  10   9  11   8  12   7  13   4  14   1  15
 11  11  12  10  13   9  14   7  15   4  16   1  17
 12  12  11  13  10  14   9  15   7  16   4  17   1  18
 13  13  14  12  15  11  16  10  17   7  18   4  19   1  20
 14  14  13  15  12  16  11  17  10  18   7  19   4  20   1  21
 15  15  16  14  17  13  18  12  19  10  20   7  21   4  22   1  23
Figure 4.3.
References
[1] A.C.Robin, A poisoned chocolate problem,
Problem corner, The Mathematical Gazette
Vol. 73, No. 466 (Dec., 1989), pp. 341-343 and
Vol. 74, No. 468 (Jun., 1990), pp. 171-173
[2] Michael H. Albert, Richard J. Nowakowski and
David Wolfe, Lessons In Play, A K Peters.
[3] Weisstein, Eric W. "Chomp." From
MathWorld–A Wolfram Web Resource.
http://mathworld.wolfram.com/Chomp.html
[4] Weisstein, Eric W. "Nim-Value." From
MathWorld–A Wolfram Web Resource.
http://mathworld.wolfram.com/NimValue.html
[5] Weisstein, Eric W. "Nim." From
MathWorld–A Wolfram Web Resource.
http://mathworld.wolfram.com/Nim.html
[6] M.Naito, T.Inoue, R.Miyadera, Discrete
Mathematics and Computer Algebra System, The Joint Conference of ASCM
2009 and MACIS 2009, COE Lecture
Note Vol.22, Kyushu University. A PDF
file of the paper is available at http://gcoemi.jp/english/publish list/pub inner/id:2/cid:10
[7] R.Miyadera, T.Inoue, W.Ogasa and
S.Nakamura, Chocolate Games that are
variants of the Game of Nim, Journal of Information Processing, Information Processing
Society of Japan, 53(6) pp. 1582-1591, 2012 (in
Japanese).
[8] M. Naito, D. Minematsu, R. Miyadera et al.,
Combinatorial Games and Beautiful Graphs
Produced by them, Visual Mathematics, Volume 11, No. 3, 2009
http://www.mi.sanu.ac.rs/vismath/miyaderasept2009/index.html
[9] S. Nakamura, D. Minematsu, T. Kitagawa, R.
Miyadera et al., Chocolate games that are
variants of nim and interesting graphs made by
these games, Visual Mathematics, Volume 14,
No. 2, 2012
http://www.mi.sanu.ac.rs/vismath/miyaderasept2012/index.html
[10] R. Miyadera, S. Nakamura, Y. Okada, R. Hanafusa, and T. Ishikawa, Chocolate Games -How
High School Students Discovered New Formulas Using Mathematica-, Mathematica Journal,
Volume 15, 2013.
[11] MathPuzzle.com. (Submitted by R.Miyadera)
"The Bitter Chocolate Problem." Material added 8 Jan 06 (Happy New Year).
www.mathpuzzle.com/26Feb2006.html.