JOURNAL AND PROCEEDINGS of YOUNG ARCHIMEDES
Volume 1, Number 1, 2015

“Providing a forum to exchange mathematical ideas, activities, and/or sharing and interpreting high school research.”

Contents
Osman, F., Miyadera, R. Editorial Board
Osman, F. Solutions of Higher Order Dispersion Terms in the Nonlinear Schrödinger Equation
Ballador, W.L., Gallaza, A.L., Lazaro, L. Pattern for Centre of Twin Primes
Millican, C. Blu-Ray vs. DVD vs. CD: A Mathematical Exploration: Why Archimedean Spirals lie at the Centre of the differences between them?
Zheng, Y. Bayes’ Theorem and Its Applications in Law
Gereis, J., Wu, V. Effect of surface area and volume on rate of cooling
Kitagawa, T., Nakamura, S. New Chocolate Games

The Editor, Journal and Proceedings of Young Archimedes
Trinity Grammar School, 119 Prospect Road, SUMMER HILL NSW AUSTRALIA 2130
Published: January 2015
ISSN online 2204-6534

EDITORIAL BOARD
Dr Frederick Osman and Dr Ryohei Miyadera

BACKGROUND: On the 24th of September 2013, the Trinity Grammar School Music and Mathematics tour departed from Sydney International Airport and began a journey to Kwansei Gakuin Senior High School, Nishinomiya, Japan. The main purpose of this tour was to establish a relationship with Kwansei Gakuin and to develop an international programme that would go beyond the current Rugby connection, promoting cross-cultural awareness through extensive exchange programmes that challenge the mind, body and spirit of students and staff from both schools.

JOURNAL NAME: Archimedes was a Greek mathematician, physicist, engineer, inventor, and astronomer. He is generally considered to be the greatest mathematician of antiquity and one of the greatest of all time.

AIMS:
1. 
The Journal and Proceedings of Young Archimedes publishes academic online papers of secondary students in the fields of Mathematics Applications.
2. To provide a forum to exchange mathematical ideas, activities, and/or sharing and interpreting high school research.
3. To pioneer a new field of educational endeavour to be the first Mathematics International Journal publication for High Schoolers.
4. To increase the relationship and strengthen the academic links between Trinity and Kwansei Gakuin.
5. To promote cross-cultural understanding between Australia and Japan and affirm our academic relationship as brother Schools.
6. To have students in all departments completing HSC and/or International Baccalaureate essays or projects with relevance to the fields of Mathematics Applications submit a paper for refereeing within the Journal and Proceedings of Young Archimedes.

OUTCOMES:
1. Issues are scheduled to be published in June and December of each year.
2. A maximum of six long papers (max 6 pages) or twelve short papers (max 3 pages) for each issue.
3. An electronic online version of each issue is to be posted to the Trinity Grammar School Mathematics Club web site.

Journal and Proceedings of Young Archimedes

The Journal and Proceedings of Young Archimedes publishes academic online papers of secondary students in the fields of Mathematics Applications and provides a forum to exchange mathematical ideas, activities, and/or sharing and interpreting high school research. Manuscripts will be reviewed by the Editor, in consultation with the Associate Editors, to decide whether the paper will be considered for publication in the Journal. Issues are scheduled to be published in June and December. An electronic version of each issue is posted to the Trinity Grammar School Mathematics Club web site http://bit.ly/young_archimedes as a formal publication. 
Enquiries relating to copyright or reproduction of an article should be directed to the author.

Information for Authors
Manuscripts are only accepted in digital format and should be e-mailed:
>> In Australia to Dr Frederick Osman on [email protected] or
>> In Japan to Dr Ryohei Miyadera on [email protected]
The template below should be used as a guide for preparing manuscripts. If the file size is too large to email, the manuscript should be placed on a CD-ROM or other digital media and posted to:
The Editor, Journal and Proceedings of Young Archimedes
Trinity Grammar School, 119 Prospect Road, SUMMER HILL NSW AUSTRALIA 2130

Editorial Board

Dr Frederick Osman has more than 20 years of academic and industry experience in innovative teaching and research in Physics and Mathematics education. His research background and achievements are in laser plasma interaction for inertial confinement fusion, including work on several plasma effects. He is currently the Director of Vocational Education and the Master in Charge of the Mathematics Club at Trinity Grammar School.

Dr Ryohei Miyadera received a Ph.D. in Mathematics at Osaka City University and a second Ph.D. in mathematics education at Kobe University. He has two fields of research: probability theory of functions with values in an abstract space, and applications of Mathematica to discrete mathematics. He and his high school students have been doing research in discrete mathematics for more than 15 years.

ASSOCIATE EDITORS

Professor Robert Cowen is Mathematics Emeritus at Queens College, CUNY. He uses Mathematica in his own research and has written a textbook with John Kennedy called Discovering Mathematics with Mathematica.

Professor Heinrich Hora is known for his work on the theory for fusion energy with lasers. 
He has published more than 450 papers on laser-plasma interaction and inertial nuclear fusion, ponderomotive and relativistic self-focusing, laser acceleration of particles, the correspondence principle of electromagnetic interaction and the accuracy principle of nonlinearity.

Professor Yaichi Shinohara is Mathematics Emeritus of Kwansei Gakuin University. His research background and achievements are in Topology and Knot Theory. He is a respected mathematician in the fields of algebra and geometry.

Professor Tadashi Takahashi is a Mathematics Professor at Konan University. He is the President of the Japan Society for Symbolic and Algebraic Computation and is also the President of the Game Amusement Society. His research background and achievements are in Computer Algebra, Singularity Theory and Efficient Use of Computers in Mathematics Education.

Dr Jonny Bernas Pornel is an Assistant Professor of Mathematics at the University of the Philippines Visayas. He promotes the use of Lesson Study among teachers of Science and Mathematics and the enhancement of mathematical creativity among students.

Dr Nethal K. Jajo is a Modelling and Projection Analyst at the University of Sydney, Australia. He has a PhD degree in Mathematical Statistics and Probability Theory, with professional training in Discrete Event Simulation and System Dynamics Modelling. His industrial and academic research activities include data analysis, data mining, partial least squares path modelling, regression analysis and dynamic simulation.

Dr Katsuyuki Yoshikawa is an international expert in the area of Knot Theory. He received the Takebe Award from the Mathematical Society of Japan for his research on four-dimensional topology.

Edward Habkouk is an experienced teacher of NSW Mathematics courses to HSC level. He has been an HSC Mathematics Extension 1 marker since 2000 and an IB Mathematics Examiner (particularly SL, paper 1) since 1999. 
He is currently the Dean of Mathematics at Trinity Grammar School.

Stephen McAndrew is an experienced Physics Teacher at Trinity Grammar School, having taught in Australia and the UK. His research background is in Applied Mathematics, in particular the areas of classical mechanics, fluid mechanics and electromagnetism. He is currently undertaking PhD research on magnetohydrodynamic shock waves.

Katsuya Mori is a teacher at Takarazuka Higashi High School who is doing research in Mathematics with his students. His students' papers have been published in The Rose-Hulman Undergraduate Mathematics Journal. He received an M.Sc. from Kyoto University with a major in algebraic geometry.

Shane Scott is an experienced teacher of NSW Mathematics courses to HSC level at Trinity Grammar School. He is an executive member of the Mathematical Association of New South Wales. He has won the NSW Premier’s Scholarship for Mathematical Teaching and has travelled to Germany, the UK and the US to attend and present at International Schools and Mathematics conferences.

Yuko Matsuda is an experienced computer scientist with many years’ experience in artificial intelligence, data science, language design and super-computing based on symbolic computation.

Sample template format for Authors to the Journal and Proceedings of Young Archimedes (16 point font size)

First name Last name
Department, Institution, City, Country
E-mail: name@domain-name

Abstract
This is the layout and template for a paper to be submitted to The Journal and Proceedings of Young Archimedes.

Introduction (12 point font size)
The Journal and Proceedings of Young Archimedes publishes academic online papers of secondary students in the fields of Mathematics Applications and provides a forum to exchange mathematical ideas, activities, and/or sharing and interpreting high school research. Papers may be submitted electronically only, to the editors Dr Frederick Osman on [email protected] from Trinity Grammar School, Australia and Dr Ryohei Miyadera on [email protected] from Kwansei Gakuin High School, Nishinomiya City, Japan. Acknowledgement of receipt of the submission will be sent to the corresponding author’s e-mail address. It is the author’s responsibility to submit an accurate manuscript: any errors in spelling, grammar, or scientific content may be reproduced as typed by the author. Manuscripts will be reviewed by the Editor, in consultation with the Associate Editors, to decide whether the paper will be considered for publication in the Journal. Accepted papers will be published electronically on the Trinity Grammar School Mathematics Club web site.

Layout and style
A Times New Roman font is used for the main text. The font size is 11 points; main section headings should use a font size of 12 points. When the final PDF file is created, all fonts used must be embedded. Two columns are used except for the title and abstract section and possibly for large figures, tables or photographs that need a full-page width. If you have any questions regarding paper submission, please contact the editors, Dr Frederick Osman from Trinity Grammar School, Sydney AUSTRALIA and Dr Ryohei Miyadera from Kwansei Gakuin High School, Nishinomiya City JAPAN.

Equations (11 point font size)
Equations should be placed on separate lines and numbered. An example of an equation is given below:

$$\mathbf{f} = -\nabla p + \mathbf{f}_N, \quad (1)$$

where $\mathbf{f}_N$ is the nonlinear force (Osman, 2000).

Figures
All figures must be centered on the column (or page, if the figure spans both columns).

Figure 1: Generation of blocks of deuterium plasma moving against the neodymium glass laser light (Osman, 2005). 
References (11 point font size)
Cang, Y., Osman, F., Hora, H., Zhang, J., Badziak, J., Wolowski, J., Jungwirth, K., Rohlena, J., Ullschmied, J. (2005) Computations for Nonlinear Force driven plasma blocks by picosecond laser pulses for fusion, Journal of Plasma Physics, 71, 35-51.
Osman, F., Castillo, R., and Hora, H. (2000) Focusing and Defocusing of the Nonlinear Paraxial Equation at Laser-Plasma Interaction, Laser and Particle Beams, 18, 73-79.
Osman, F. (2005) Guest editor’s preface: Workshop on fast high-density plasma blocks driven by picoseconds and terawatt lasers. Laser and Particle Beams, 23, 399-40

Solutions of Higher Order Dispersion Terms in the Nonlinear Schrödinger Equation

Dr Frederick Osman
Trinity Grammar School, Mathematics Department, Summer Hill NSW, Australia
E-mail: [email protected]

Abstract
This paper presents the nonlinearity and dispersion effects involved in the propagation of optical solitons, which can be understood by using a numerical routine to solve the Nonlinear Schrödinger Equation (NLSE). A sequence of code has been developed in Mathematica to explore in depth several features of the optical soliton’s formation and propagation. These numerical routines were implemented in Mathematica, and the results give a very clear idea of this interesting and practically important phenomenon. The resonant radiation of solitons due to higher order dispersive effects will be seen here to cause increasing turbulence, which will ultimately lead to severe damping of the soliton, resulting in its diminished usefulness in telecommunications and other fields, including self-focusing. 
It is believed that these results will be of considerable use in any work or research that uses the self-focusing properties of the soliton [9], and this paper sets out to explain why the higher order optical soliton should be considered in such research.

Introduction
The field of nonlinear optics has developed in recent years as nonlinear materials have become available and widespread applications have become apparent. This is particularly true for optical solitons and other types of nonlinear pulse transmission in optical fibres and laser plasma interaction [6, 9]. Consequently, this form of light propagation could be used in the future for very high capacity, dispersion-free communications. The purpose of this paper is to describe the use of a very powerful tool to solve the generalised NLSE, which has stable solutions called optical solitons [2]. The solitary wave (or soliton) is a wave that consists of a single symmetrical hump that propagates at uniform velocity without changing its form. The physical origin of solitons is the Kerr effect, which relies on a nonlinear dielectric constant that can balance the group dispersion in the optical propagation medium. The result of this balance is the propagation of solitons, which have the form of a hyperbolic secant [13].

Nonlinear Schrödinger Equation
The Nonlinear Schrödinger Equation (NLSE) used in this paper is generalised as:

$$i\frac{\partial u}{\partial \xi} + \frac{1}{2}\frac{\partial^2 u}{\partial \tau^2} + |u|^2 u = -i\Gamma u + i\beta_n \frac{\partial^n u}{\partial \tau^n} \quad (1)$$

where $n$ is the order of dispersion of the Schrödinger Equation being used and $\beta_n$ is an arbitrary constant. 
The electric field is considered as a monochromatic wave propagating along the x-axis with wave number $k$ and angular frequency $\omega$; that is, the field $E$ is assumed to be in the expansion form:

$$E(r, \theta, x, t) = \sum_{l=-\infty}^{\infty} E_l(r, \theta, \xi, \tau; \varepsilon) \exp[i(k_l x - \omega_l t)] \quad (2)$$

with $E_{-l} = E_l^*$ (complex conjugate), where $k_l = lk$, $\omega_l = l\omega$, the summation is taken over all harmonics generated by the nonlinearity due to the Kerr effect, and $E_l(r, \theta, \xi, \tau; \varepsilon)$ is the envelope of the $l$th harmonic changing slowly in $x$ and $t$. The slow variables $\xi$ and $\tau$ are defined by:

$$\xi = \varepsilon^2 z \quad \text{and} \quad \tau = \varepsilon\left(t - \frac{z}{V_g}\right) \quad (3)$$

From Eq. (2) and Eq. (3), the displacement is found by:

$$D = \epsilon \ast E = \sum_{l} D_l \exp[i(k_l z - \omega_l t)] \quad (4)$$

It has been shown by [3] that $E_l(r, \theta, \xi, \tau; \varepsilon)$ can be expanded in terms of $\varepsilon$:

$$E_l(r, \theta, \xi, \tau; \varepsilon) = \sum_{n=1}^{\infty} \varepsilon^n E_l^{(n)}(r, \theta, \xi, \tau) \quad (5)$$

from which the generalised NLSE for $u_1^{(1)}$ is obtained [6]:

$$i\frac{\partial u}{\partial \xi} + \frac{1}{2}\frac{\partial^2 u}{\partial \tau^2} + |u|^2 u = -i\Gamma u + i\beta \frac{\partial^3 u}{\partial \tau^3} \quad (6)$$

Now the new variables and constants are introduced:

[Eqs. (7a) to (7d): definitions of the normalised variables $\xi$ and $\tau$, the group velocity $V_g$, and the coefficient constants; the expressions are illegible in the source.]

The importance of Eq. (6) is that it can be solved in normalised reference coordinates, which gives a clear view of the evolution of the envelope along the normalised propagation path. This allows us to study the different cases, such as the classical situation in which the higher order terms vanish, which results in the standard Nonlinear Schrödinger Equation [9].

Initial Conditions and Programming for Higher Orders
The Nonlinear Schrödinger Equation can be solved exactly by the inverse scattering method. 
A planar stationary light beam in a medium with a nonlinear refractive index can be described in dimensionless form [5]:

$$i\frac{\partial u}{\partial t} + \frac{\partial^2 u}{\partial x^2} + \kappa |u|^2 u = 0 \quad (8)$$

The exact inverse scattering method is applicable to equations of the type:

$$\frac{\partial u}{\partial t} = \hat{S}[u] \quad (9)$$

where $\hat{S}$ is a nonlinear operator differential in $x$, which can be represented in the form

$$\frac{\partial \hat{L}}{\partial t} = i[\hat{L}, \hat{A}] \quad (10)$$

Here $\hat{L}$ and $\hat{A}$ are linear differential operators containing the sought function $u(x, t)$ in the form of a coefficient. The result in Eq. (8) can be verified in Eq. (10) with the operators $\hat{L}$ and $\hat{A}$ taking the form of the Nonlinear Schrödinger Equation:

$$u(x, t) = 2\eta\sqrt{2/\kappa}\;\mathrm{sech}[2\eta(x - x_0) + 8\eta\xi t]\,\exp[-i(2\xi x + 4(\xi^2 - \eta^2)t + \varphi)] \quad (11)$$

where $\eta$, $\xi$, $\varphi$, $t$, $x_0$ are scaling parameters. This form of the solution is known as a soliton, which has a stable formation. The soliton of Eq. (11) is the simplest representative of an extensive family of exact solutions of Eq. (9). In the general case such a solution is called an N-soliton solution, which depends on 4N arbitrary constants, $\eta_j$, $\xi_j$, $\varphi_j$, $t_j$, $x_{0j}$. However, for non-coinciding $\xi_j$ this solution breaks into individual solitons as $t \to \pm\infty$. Using this solution and beginning at the origin $x = 0$, a wave formation can be described by [9]:

$$u(0, t) = \mathrm{sech}[t - t_0] \quad (12)$$

Using this programming method, with the wide range of iteration models available from Mathematica, we get a three dimensional representation of the wave as given below in Figure 1. To achieve this result we have used the new “NDSolve” command in Mathematica Version 5 to solve Eq. (6) for a given value of $n$, given a solution of the format of Eq. (12) and given that the resultant wave has an arbitrary wavelength; in this case $x = 40$. This is then plotted in three dimensions using the “Plot3D” command. 
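The claim that a sech-shaped pulse travels without change of form can be checked independently of Mathematica. The following is a minimal Python sketch, not the author's code. It assumes the normalised cubic equation $i u_\xi + \tfrac{1}{2} u_{\tau\tau} + |u|^2 u = 0$ (every higher order coefficient set to zero), takes $t_0 = 0$ in Eq. (12), and uses the hypothetical helper names `soliton` and `nlse_residual`. It estimates the left-hand side by central finite differences instead of calling `NDSolve`.

```python
import cmath
import math


def soliton(xi, tau):
    """Fundamental soliton u = sech(tau) * exp(i*xi/2) of the assumed NLSE."""
    return cmath.exp(0.5j * xi) / math.cosh(tau)


def nlse_residual(u, xi, tau, h=1e-3):
    """Finite-difference estimate of i*u_xi + 0.5*u_tautau + |u|^2 * u."""
    centre = u(xi, tau)
    du_dxi = (u(xi + h, tau) - u(xi - h, tau)) / (2 * h)
    d2u_dtau2 = (u(xi, tau + h) - 2 * centre + u(xi, tau - h)) / h ** 2
    return 1j * du_dxi + 0.5 * d2u_dtau2 + abs(centre) ** 2 * centre


# Sample points along the propagation path (xi) and across the pulse (tau).
grid = [(xi, tau) for xi in (0.0, 5.0, 20.0) for tau in (-3.0, -1.0, 0.0, 0.5, 2.0)]
worst = max(abs(nlse_residual(soliton, xi, tau)) for xi, tau in grid)
print(f"largest residual on the grid: {worst:.2e}")

# The envelope |u| keeps the same sech profile at every propagation distance.
for xi in (0.0, 5.0, 20.0):
    print(xi, round(abs(soliton(xi, 1.0)), 6))
```

A residual at the level of the discretisation error, together with an unchanged $|u|$, is what "propagates without changing its form" means numerically. Under the same assumptions, a non-zero higher order coefficient would add one more finite-difference term to `nlse_residual`.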
Figure 1: Basic waveform for higher order dispersion

This figure is obtained when all the higher order coefficients are zero. Here only the plain soliton is visible; no radiations are present. This is a classic example of a soliton wave: it is a single hump, clearly seen at time zero, it tails off in two directions, viz. time, and it travels along the time line for very great distances without change or distortion. This is why it is important to find an order of solitons that do not decay or radiate to form wave packets, unless of course a packet is specified as being best for a stipulated use. For the graphs reproduced in this paper the resolution is set at PlotPoints -> 1024 and ImageSize -> 600. The ‘NDSolve’ command in Mathematica allows the user to choose from a wide range of mathematical iteration techniques in finding numerical solutions for differential equations.

Numerical Results
We have found from the above method that, for the dispersion orders from 3 to 6 inclusive, there is a point along the evolution of the coefficient for each order of dispersion where radiation becomes visibly evident.

Figure 2: High order wave (in this case 5th order at coefficient 0.019) when radiation first becomes noticeable.

This figure shows a small packet of radiations coming from the outer extremities of the wave. The soliton is still clearly defined; however, as the wave travels further away from time zero in either direction, the radiations become more evident and complex.

Figure 3: Wave for higher order dispersion (third order coefficient 1.0) showing well developed radiation.

This figure clearly shows the packet of radiations, running parallel to the wave and travelling outwards from the parent wave. There is still an area of calm along the zero time line and the soliton hump is still clearly visible. 
Karpman [7, 8] has defined the third and fourth order dispersion equation, and this will be used throughout this work as the basis for programming the Schrödinger equation. This level of dispersion is covered by the work of Karpman and Shagalov [7, 8]. We will now carry on towards the next order of dispersion. Using this technique and extending it to waves of the fifth, sixth and seventh orders, we find a distinctive trend in the area along the evolution of the coefficients of the higher orders of dispersion where the phenomenon of radiating solitons occurs. This is set out in Table 1, below.

Figure 4: Higher order wave with heavy radiation. This is 4th order at coefficient 0.6549, which is at the threshold of breaking.

In this figure we can see that the radiations extending outwards from the waves have merged to form an area of general turbulence between the waves, where radiations from various directions meet. The shrinking area of relative calm around the zero time line is still visible. The parent soliton is still visible, though it is difficult to see where such a packet would be useful, given the level of radiating interference. Making the coefficient any larger than this causes computation overflow, or breakdown of the wave, caused by the radiation turbulence taking over.

Table 1: Coefficient Values for each Order of Dispersion.

Order of Dispersion   Computation Starts   Radiation Effect   Computation Overflow
3                     0.0225               0.2                260.647
4                     0.015                0.05               0.655
5                     0.0006               0.01               0.0292
6                     0.00011              0.001              0.00134
7                     0.0000081            -                  0.000102
8                     0.000000002                             0.0000000442

From the results above it is evident that as the coefficient increases, a point is reached where the soliton begins to radiate, changing it from a solitary wave to a wave packet. 
From observing the time taken for computation, however, it is evident that some activity is present even though it cannot be discerned from the simulations. For the basic wave, that is, without a dispersion term of order 3 or greater, the computation time is slightly over 2 seconds. We have included a column in Table 1 (Computation Starts) to indicate the coefficient required to raise the computation time more than one second above this baseline. At higher levels computation time can run into hours. The upper limit expressed here is the point at which self-focusing comes into effect, thus emphasising the reference made above to its usefulness in such research [6, 9].

Figure 5: Coefficient as a function of dispersion order

From this figure it will be noted that the point at which the order of dispersion being studied starts to make its presence felt on the computation can be related to the point at which breaking or computation overflow occurs. The point at which radiation becomes a visible phenomenon converges onto the point of breaking and seems to be in a position to cross over, thus implying that radiation would still occur at a higher coefficient had the wave not broken. This is difficult to verify, however, since we have no graphic output beyond computation overflow.

Error Tolerance for Iteration Method
Since these numerical results are obtained by the computer performing iteration for a given set of parameters, there will be an expected margin of error; the question of interest is whether or not this error margin is within acceptable boundaries. Mathematica can be programmed with a specified accuracy goal, given as the number of digits required to be absolutely correct. Mathematica will then either produce a result with no error message, in which case the required accuracy goal has been met, or it will give an error message stating where the accuracy requirement was breached. In this paper we tested each order of dispersion to find at what number of digits the error message first appeared. This gave us the minimum error tolerance for each order. These results are set out in Table 2.

Table 2: Percentage Error Expected at each Level

Order of Dispersion   Computation Starts   Accuracy of Digits   Maximum % of Error
3                     0.0225               6                    0.002
4                     0.015                5                    0.03
5                     0.0006               8                    0.0008
6                     0.00011              10000+
7                     0.0000081            10                   0.06
8                     0.000000002          8                    2500

From this table, it is clear that in terms of error due to iteration, the sixth order is vastly superior to any other, since this is the lowest order where the breaking point of the wave is reached before visible radiation is evident, and the required order of accuracy is never violated, even when taken to ludicrous extremes (in this case 10000 decimal places!). Above order 7 the accuracy quickly degenerates, and orders 8 and above are outside the boundaries of practical usefulness.

Conclusion
From the results shown in this paper, it is clear that the best results are obtained from using the sixth order (of dispersion) Schrödinger Equation. For this order there is no visible radiation up to the point where increasing the sixth order coefficient causes computation overflow and/or breaking of the wave, thus rendering it free from the effects of turbulence and eliminating the associated damping of the soliton. This in turn cuts down any negative effects on self-focusing normally caused by wave turbulence. As an added bonus, the expected error in the computed iteration is negligible. Consequently we can say that the sixth order dispersion will give all the advantages of a higher order soliton, the most important being a higher self-focusing ratio, without the setbacks encountered with higher orders, the most disruptive of these being soliton damping. 
References
[1] AKHMEDIEV, N. N. and ANKIEWICZ, A. (1997) Solitons, Nonlinear Pulses and Beams, Canberra: Chapman & Hall.
[2] DRAZIN, G., JOHNSON, R.S. (1990) Solitons: An Introduction, Cambridge University Press.
[3] HASEGAWA, A. (1989) Optical Solitons in Fibres, Springer-Verlag, Berlin.
[4] HAUS, H. A. (1981) Optical Fiber Solitons, their Properties and Uses, Proceedings of the IEEE, Vol. 81, No. 7.
[5] HAUS, H. A. (1993) Molding Light into Solitons, IEEE Spectrum.
[6] HORA, H., OSMAN, F., HÖPFL, R., BADZIAK, J., PARYS, J., WOLOWSKI, E., WORYNA, E., BOODY, F., JUNGWIRTH, K., KRALIKOVA, J., KRAZA, L., LASKA, M., PFEIFER, M., ROHLENA, R., SKALA, J., ULLSCHMIED, J. (2002) Skin Depth Theory explaining Anomalous Picosecond Laser Plasma Interaction. Czechoslovak Journal of Physics, 52, Suppl. D.
[7] KARPMAN, V. I. (1998) Evolution of Solitons described by Higher-Order Nonlinear Schrödinger Equation. Phys. Lett. A 244, 397-400.
[8] KARPMAN, V. I., SHAGALOV, A. G. (1999) Evolution of Solitons described by Higher-Order Nonlinear Schrödinger Equation. II. Numerical Investigation. Phys. Lett. A 254, 319-324.
[9] OSMAN, F., CASTILLO, R., HORA, H. (2000) Focusing and Defocusing of the Nonlinear Paraxial Equation at Laser Plasma Interaction, Laser & Particle Beams 18, 73.
[10] SMITH, G.D. (1987) Numerical Solution of Partial Differential Equations: Finite Difference Methods, Oxford Applied Mathematics and Computing Science Series, 3rd edition.
[11] WHITHAM, G.B. (1974) Linear and Nonlinear Waves, New York: Wiley.
[12] WOLFRAM, S. (1991) MATHEMATICA: A System for Doing Mathematics by Computer, 2nd ed. Addison-Wesley.
[13] ZAKHAROV, V.E., SHABAT, A.B. (1972) Exact Theory of Two-Dimensional Self-focusing and One-Dimensional Self-Modulation of Waves in Nonlinear Media, Soviet Physics JETP, 34, 62.

Pattern for Centre of Twin Primes

Wayne Lester Ballador, Andrea Louise Gallaza, and Ignacio Lazaro III
Advisers: Prof. Jonny Pornel, Prof. Raphael Belleza, Ms. Early Sol Gadong
UP High School in Iloilo, University of the Philippines Visayas, Iloilo City, Philippines

Abstract
This paper presents the concept of the centre of twin primes and determines its properties. It proves that every centre of twin primes greater than 7 is: (a) divisible by 6; and (b) congruent to 0, 2 or 8 modulo 10. These properties enable researchers of twin primes to test a smaller set of numbers for primality without sacrificing accuracy.

Key words: Twin primes, centre of twin primes

Introduction
In 1900, German mathematician David Hilbert suggested 23 baffling mathematical problems at the 2nd International Congress of Mathematicians. One of these problems is the Twin Prime Conjecture, which states that “There are infinitely many twin primes.” The Twin Prime Conjecture is one of the classic problems in mathematics. It is still unsettled whether it is true or not (Burton, 1980; Rosen, 1984; Schumer, 2004). Although many believe it is true and many illustrious mathematicians have tried proving it, no one has yet succeeded. In fact, Worldwide Computer Services Inc. offers $25,000 to anyone who can successfully prove the Conjecture.

Twin primes are pairs of primes that differ by two (Rosen, 1984); that is, only one number separates them. The primes 3 and 5 are twin primes since they differ by 2. Generally, if $P$ and $P + 2$ are primes, then they are twin primes.

The conjecture has attracted the attention of countless professional and amateur mathematicians because it is deceptively simple. To start working on it, one needs only to know the concept of primes. The attraction of a profound problem needing only mathematical tools available to high school students is indeed irresistible. 
One of the reasons for the difficulty in proving the conjecture is the randomness in the occurrence of prime numbers. Finding twin primes is thus twice as difficult. Twin primes are also notoriously rare. Koshy (2007) has stated that “discovering twin primes involves essentially finding two primes; therefore, the largest known twin primes are substantially smaller than the largest known primes” (p. 118). There are only eight pairs of twin primes less than 100 and 35 pairs of twin primes less than 1000. The twin primes less than 100 are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), and (71, 73).

It is cumbersome to always write both primes of a pair. The usual way of representing a pair of twin primes is by its first prime. Since the two differ by 2, anyone may readily derive the other prime given the first one. Thus, if $P$ is the first prime, then $P + 2$ is the other prime. Another way of representing a pair of twin primes is by using their centre, that is, the number that separates them. To represent (3, 5), some mathematicians use $4 \pm 1$.

In this paper, the concept of the centre of twin primes will be used. If $P$ and $P + 2$ are primes, then the integer $C = P + 1$ is the centre of twin primes. From this point forward, a pair of twin primes will simply be represented by their centre to simplify the notation. The centres for twin primes less than 100 are 4, 6, 12, 18, 30, 42, 60, and 72.

Problem
This mathematical investigation aims to determine the pattern behind the centres of twin primes and, consequently, the pattern behind twin primes themselves. Specifically, this paper aims to answer the following questions:
1) What are the common factors of the centres of twin primes?
2) What other patterns can be discerned from the sequence of centres of twin primes? 
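The counts and the list of centres quoted above are easy to reproduce. The following Python sketch is an illustration only; the function names `prime_flags` and `twin_prime_centres` are hypothetical and do not come from the paper. It sieves the primes below a limit and then lists every $C$ for which both $C - 1$ and $C + 1$ are prime.

```python
def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime (n < limit)."""
    flags = [True] * max(limit, 2)
    flags[0] = flags[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if flags[n]:
            for multiple in range(n * n, limit, n):
                flags[multiple] = False
    return flags


def twin_prime_centres(limit):
    """Centres C such that C - 1 and C + 1 are both prime and both below limit."""
    flags = prime_flags(limit)
    return [c for c in range(2, limit - 1) if flags[c - 1] and flags[c + 1]]


print(twin_prime_centres(100))        # centres of the twin primes below 100
print(len(twin_prime_centres(1000)))  # number of twin prime pairs below 1000
```

Listing centres rather than pairs keeps the output compact, and each pair can be recovered as $(C - 1, C + 1)$.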
Conjectures Formulated
To come up with a conjecture, the researchers identified the first few twin primes. Next, their centres were determined and observed.

Table 1: Centre of Twin Primes Greater than 4

Centre of Twin Primes (C)   Divisible by 6?   Last digit
4                           n                 4
6                           y                 6
12                          y                 2
18                          y                 8
30                          y                 0

Note: n = no, y = yes

The first centres of twin primes are shown in Table 1. All these centres are even. Further, except for 4, all the centres of twin primes are divisible by 6. Thus, Conjecture 1 logically follows.

Conjecture 1: If $C > 4$ and $C \pm 1$ are primes, then $C = 6k$ where $k \in \mathbb{N}$.

Also, the longer list of centres of twin primes shows that, except for 4 and 6, the centre of twin primes ends with 0, 2, or 8. Thus, Conjecture 2 was advanced.

Conjecture 2: If $C > 7$ and $C \pm 1$ are primes, then $C \equiv d \pmod{10}$ where $d \in \{0, 2, 8\}$.

Verifying Conjectures
The centres of twin primes between 100 and 500 are shown in Table 2. All these centres of twin primes are divisible by 6, which is consistent with Conjecture 1. Also, all centres of twin primes between 100 and 500 end with 0, 2, or 8, which is consistent with Conjecture 2.

Table 2: Centre of Twin Primes Greater than 100 but Less than 500

Centre of Twin Primes   Divisible by 6   Last Digit
102                     Y                2
108                     Y                8
138                     Y                8
150                     Y                0
180                     Y                0
192                     Y                2
198                     Y                8
228                     Y                8
240                     Y                0
270                     Y                0
282                     Y                2
312                     Y                2
348                     Y                8
420                     Y                0
432                     Y                2
462                     Y                2

To test the conjectures in an extreme case, take one of the largest known pairs of twin primes, $33218925 \cdot 2^{169690} \pm 1$. Their centre is $C = 33218925 \cdot 2^{169690}$. To verify Conjecture 1, it must be shown that $C$ is divisible by 6:

$$C = (3 \cdot 11072975)(2 \cdot 2^{169689}) = 6 \cdot 11072975 \cdot 2^{169689}$$

So $C$ is divisible by 6. To verify Conjecture 2, the last digit of $C$ must be shown to be 0, 2 or 8. This can be done by determining the last digit of the product of the last digit of 33218925 and the last digit of $2^{169690}$. 
To determine the last digit of $2^{169690}$, note that the last digit of $2^k$ follows the repeating cycle 2, 4, 8, 6 as $k$ takes the values 1, 2, 3, 4, … Since $169690 = 4(42422) + 2$, then $2^{169690} = (2^4)^{42422} \cdot 2^2$. Thus

$$2^{169690} \bmod 10 = \left[(2^4)^{42422} \cdot 2^2\right] \bmod 10 = \left[(2^4 \bmod 10)^{42422} \cdot (4 \bmod 10)\right] \bmod 10 = \left(6^{42422} \cdot 4\right) \bmod 10 = (6 \cdot 4) \bmod 10 = 4$$

since every power of 6 ends in 6. Also, $33218925 \bmod 10 = 5$. So

$$C \bmod 10 = \left(33218925 \cdot 2^{169690}\right) \bmod 10 = (5 \cdot 4) \bmod 10 = 0$$

Thus, Conjecture 2 is verified for this pair. A quicker way to verify Conjecture 2 is to note that the last digit of 33,218,925 is 5 and that $2^{169690}$ is even. Since the product of a number ending in 5 and an even number is a multiple of 10, the last digit is 0, as shown earlier.

Justifications

Conjecture 1: If $C > 4$ and $C \pm 1$ are primes, then $C = 6k$ where $k \in \mathbb{N}$.

Proof: To prove that a number is divisible by 6, it must be shown that it is divisible by 2 and by 3. Since the only even prime is 2, and it is not a twin prime, all twin primes are odd primes. Thus, for any twin primes $C - 1$ and $C + 1$, their centre $C$ is even and so is divisible by 2.

To prove that centres greater than 4 are divisible by 3, let $C$ be an integer greater than 4 and the centre of twin primes $C - 1$ and $C + 1$. By definition, the primes $C \pm 1$ are not divisible by any positive integer except themselves and 1. Now, any integer is congruent to exactly one of $1 \pmod 3$, $2 \pmod 3$ and $0 \pmod 3$. Checking these three cases gives the following result.

Case 1: $C \equiv 1 \pmod 3$

$$C \equiv 1 \pmod 3 \implies C - 1 \equiv 0 \pmod 3 \implies C - 1 = 3k$$

So $C - 1$ is divisible by 3. Since $C > 4$, $C - 1$ is greater than 3, so it is a multiple of 3 other than 3 itself. This is a contradiction, since $C - 1$ is a prime. Thus, $C$ cannot be $1 \pmod 3$.

Case 2: $C \equiv 2 \pmod 3$

$$C \equiv 2 \pmod 3 \implies C + 1 \equiv 0 \pmod 3 \implies C + 1 = 3k$$

So $C + 1$ is divisible by 3 and greater than 3. This is a contradiction, since $C + 1$ is prime. Thus, $C$ cannot be $2 \pmod 3$.

Case 3: $C \equiv 0 \pmod 3$

Summary: Since the first two cases are not possible, Case 3 must be true and $C$ is divisible by 3. Therefore, $C$ is divisible by 6, since it is divisible by 2 and 3. QED

Conjecture 2: If $C > 7$ and $C \pm 1$ are primes, then $C \equiv d \pmod{10}$ where $d \in \{0, 2, 8\}$.

Proof: By the previously proven conjecture, a centre of twin primes is divisible by 6. This implies that a centre of twin primes $C$ satisfies

$$C = 6k \implies C \equiv r \pmod{10}$$

where the possible values of $r$ are 0, 2, 4, 6, and 8. However, $C$ cannot be congruent to 4 modulo 10, because

$$C \equiv 4 \pmod{10} \implies C + 1 \equiv 4 + 1 \pmod{10} \implies C + 1 \equiv 5 \pmod{10}$$

This implies $C + 1$ is divisible by 5, a contradiction when the centre of the twin primes is greater than 7, since $C + 1$ would then be a multiple of 5 larger than 5. By the same line of argument, $C$ cannot be congruent to 6 modulo 10:

$$C \equiv 6 \pmod{10} \implies C - 1 \equiv 6 - 1 \pmod{10} \implies C - 1 \equiv 5 \pmod{10}$$

This implies $C - 1$ is divisible by 5, another contradiction when the primes are greater than 7. Therefore, a centre of twin primes greater than 7 can only be 0, 2 or 8 $\pmod{10}$. QED

In summary, this mathematical investigation shows that the centres of twin primes greater than 7 are multiples of 6 that end with 0, 2 or 8. This shows an implied pattern for twin primes, since the centre of a pair of twin primes defines them.

This result will simplify the search for twin primes. Let $m$ be a natural number, and let $N_m$ be the set of natural numbers that are smaller than $m$. Suppose that we search for twin primes in $N_m$. Let $S$ be the set of multiples of 6 in $N_m$. Now, of the multiples of 6, only those that end with 0, 2, and 8 are possible centres of twin primes. Let the set of these particular multiples of 6 be $S_c$. Since the possible endings of any multiple of 6 are 0, 2, 4, 6 and 8, the ratio

$$\frac{|S_c|}{|S|} = \frac{3}{5}. \quad \text{Also,} \quad \frac{|S|}{|N_m|} = \frac{1}{6}.$$

Thus, using the properties of twin primes, a researcher need only consider

$$\frac{|S_c|}{|N_m|} = \frac{|S_c|}{|S|} \cdot \frac{|S|}{|N_m|} = \frac{3}{5} \cdot \frac{1}{6} = \frac{1}{10}$$

of $N_m$ as possible candidates for the centre of twin primes. (These ratios are exact when $m - 1$ is a multiple of 30 and approximate otherwise.)

Possible Extensions

Since all the centres of twin primes greater than 4 are multiples of 6, that is, a centre $C$ is of the form $C = 6k$, a natural extension of this investigation is to identify which values of $k$ ensure that $6k$ is a centre of twin primes.
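As a rough illustration of the reduced search described above, the sketch below keeps only the multiples of 6 that end in 0, 2 or 8 and then tests each candidate. The function names are hypothetical, trial division is assumed to be sufficient for small bounds, and the centres 4 and 6 are deliberately left out because Conjecture 2 only applies to centres greater than 7.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def candidate_centres(m: int) -> list[int]:
    """Multiples of 6 below m whose last digit is 0, 2 or 8."""
    return [c for c in range(6, m, 6) if c % 10 in (0, 2, 8)]


def twin_prime_centres(m: int) -> list[int]:
    """Candidates below m for which both neighbours are prime."""
    return [c for c in candidate_centres(m)
            if is_prime(c - 1) and is_prime(c + 1)]


m = 3001
candidates = candidate_centres(m)
share = len(candidates) / (m - 1)           # fraction of N_m that must be tested
k_values = [c // 6 for c in twin_prime_centres(100)]
print(len(candidates), share)
print(k_values)
```

Only the surviving candidates need the two primality tests, which is where the saving comes from.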
A better restatement of this problem would be: for $n = 0, 1, 2, 3, \ldots$, what values of $k$ taken from $\{0, 1, 2, 3, 4, \ldots, 9\}$ would make $6(10n + k)$ a centre of twin primes?

References

Burton, D. M. (1980). Elementary Number Theory. Boston: Allyn and Bacon.
Koshy, T. (2007). Elementary Number Theory with Applications, 2nd ed. MA: Academic Press.
Rosen, K. H. (1984). Elementary Number Theory and Its Applications. MA: Addison Wesley.
Schumer, P. D. (2004). Mathematical Journeys. NJ: John Wiley and Sons.

Blu-Ray vs. DVD vs. CD: A Mathematical Exploration: Why Archimedean Spirals lie at the Centre of the differences between them?

Chris Millican
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]

Abstract

As an avid musician and a fan of digital entertainment, I have a house full of CD, DVD and Blu-ray discs. The questions addressed in this exploration are extremely relevant for me as a teenager who benefits daily from this technology and yet has little understanding of how it all actually works. I feel I have taken this technology for granted without understanding what I am using. By the end of the exploration, my aim is to have quantitatively demonstrated the differences between CD, DVD and Blu-ray discs and therefore to have shown why Blu-ray discs are arguably the best choice of the three. My exploration ultimately delved into various areas of mathematics including measurement, differential and integral calculus and trigonometric functions. The implications of my work would help me to realise the possible innovations that may further improve the capabilities of digital data storage in the future.

Introduction

CD, DVD and Blu-ray discs are classified as digital optical disc data storage formats and they all effectively function the same way.
However, it is commonly recognised that Blu-ray discs are the best of the three because they have the greatest storage capacity and the ability to store the highest quality video and audio. I realised prior to this exploration that I did not know the differences between the three discs. Why exactly do Blu-ray discs reproduce the highest quality entertainment? Is it just a matter of popular opinion or is there a mathematical basis for why Blu-ray discs are considered to be the best disc format of the three?

Archimedean Spirals: the Background to my Exploration

As shown to the right, the data on CD, DVD and Blu-ray discs is arranged in a spiral starting from the inside edge of the disc and spiralling outwards. This spiral is called the data spiral. The data spiral is a type of Archimedean spiral, which “is a spiral named after the 3rd century BC Greek mathematician Archimedes”. An Archimedean spiral is defined as “the locus of points corresponding to the locations over time of a point moving away from a fixed point with a constant speed along a line which rotates with constant angular velocity.” This means that the distance from the centre of the spiral increases arithmetically, by the same amount with every turn, rather than exponentially.

The primary difference between Blu-ray, DVD and CD discs is the type of laser used to read the disc. Blu-ray technology uses a more precise blue/violet coloured laser, thereby allowing the data to be encoded onto the disc in a more tightly wound spiral. This difference in the colour of the laser is where the name “Blu-ray” comes from! My aim for the exploration is to show quantitatively that the spiral of data on a Blu-ray disc is longer than on the other two discs. If I can show this, then I am in essence showing why Blu-ray discs are able to store higher quality entertainment, because a longer data spiral corresponds to a higher storage capacity and a higher data storage capacity allows for higher quality recording.

Exploration

Aim: To calculate the length of the data spiral on CD, DVD and Blu-Ray Discs

My aim is firstly to estimate the length of the data spiral for each disc and then secondly to use a method based on polar coordinates and the equation of a spiral in order to calculate more precise answers.

Calculating the Length of the Data Spiral

Method 1 (Estimation Method)

My intention was to approximate the length of the spiral on each of the discs by using the equation for the circumference of a circle. In order to make this approximation, I would need to make an assumption (the picture to the right demonstrates this assumption). The red spiral represents the actual data spiral, whereas the black circles are evenly spaced concentric circles which share the same central point, have the same spacing and begin and finish the same distance away from the centre as the spiral. In relation to the diagram, the assumption made is that the length of the red spiral is equal to the sum of the circumferences of the black concentric circles. This is a common assumption made when estimating the length of spirals. This assumption is not necessarily true and is the primary weakness of this estimation method. However, this method should provide a reasonable starting point for my exploration.

In order to complete my calculations, I need to measure the dimensions of the disc as well as find the spacing between each spiral arm. For discs, this spacing is known by the term ‘Track Pitch’ (T) and I will be referring to the Track Pitch throughout my exploration. Because the Track Pitch is too small to measure, I used the internet to find values for the Track Pitch for each disc.
Table 1: Track Pitch for CD, DVD and Blu-Ray discs

| Disc | CD | DVD | Blu-Ray |
|---|---|---|---|
| Track Pitch | $1.6 \times 10^{-6}$ m | $7.4 \times 10^{-7}$ m | $3.2 \times 10^{-7}$ m |

To put these distances in perspective, 100 µm is the average width of a human hair! So on a Blu-ray disc, because the distance between each line of data is 0.32 µm:

Number of lines of data able to fit in a human hair = 100 ÷ 0.32 = 312.5

This means that about 312 lines of Blu-ray data take up the same width as a human hair! I found this phenomenal and it is evidence for the extraordinary capabilities of information technology.

Finding the number of turnings of the data spiral:

Now that I have the measurements of the discs, I need to calculate how many turnings the data spiral makes by finding the straight line distance from the inner radius of the disc to the outer radius of the disc and dividing it by the Track Pitch. A turning is one full 360° revolution made by a spiral. As such, the spiral shown to the right has three successive turnings because it undergoes 1080° (3 × 360°) of revolution. The Track Pitch will be represented by the letter T.

$$\text{Number of turnings} = \frac{\text{outer disc radius} - \text{inner disc radius}}{T}$$

This equation is derived from the proof where, on a straight line with equidistant points A, B, C and D, the distance between (A and D) = the distances between (A and B) + (B and C) + (C and D).

CD:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{1.6 \times 10^{-6}} = 23125$$

DVD:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{0.74 \times 10^{-6}} = 50000$$

Blu-Ray:
$$\text{Number of turnings} = \frac{0.06 - 0.023}{0.32 \times 10^{-6}} = 115625$$

These results show that the Blu-Ray disc has the greatest number of spiral turnings. This is the first evidence that the data spiral on a Blu-Ray disc is longer than on the other two discs.

My first problem arose when I realised that it would be impractical to find the sum of the circumferences of 115625 different circles.
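The turning counts above can be reproduced with a few lines of code. This is a minimal sketch using the radii and Track Pitch values quoted in this exploration; the constant and function names are hypothetical.

```python
OUTER_RADIUS_M = 0.06    # outermost radius of the data area, in metres
INNER_RADIUS_M = 0.023   # innermost radius of the data area, in metres
TRACK_PITCH_M = {"CD": 1.6e-6, "DVD": 0.74e-6, "Blu-ray": 0.32e-6}


def turnings(track_pitch_m: float) -> float:
    """Radial width of the data area divided by the spacing between arms."""
    return (OUTER_RADIUS_M - INNER_RADIUS_M) / track_pitch_m


# Round to the nearest whole turning to absorb floating-point noise.
turning_counts = {disc: round(turnings(pitch))
                  for disc, pitch in TRACK_PITCH_M.items()}
print(turning_counts)
```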
I decided that the best method would be to find the mean circumference of all of the circles and then multiply this by the number of turnings. In this case, due to the consistent spacing of these concentric circles, the mean circumference is equal to the median circumference.

$$C = 2\pi r$$

where $C$ is the circumference and $r$ is the radius of the circle.

$$\text{Median circumference} = 2\pi \times \text{median radius}$$

$$\text{Mean circumference} = \frac{\text{outer disc radius} + \text{inner disc radius}}{2} \times 2\pi = \frac{0.06 + 0.023}{2} \times 2\pi = 0.261 \text{ m}$$

The mean circumference is the same for all of the discs because they all have the same dimensions. Now that I have the mean circumference and the number of turnings for each of the three data spirals, I can estimate the length of each data spiral through the following equation:

Length of data spiral = Average Circumference × Number of turnings of the spiral

CD: Length = 0.261 × 23125 = 6035.625 m
DVD: Length = 0.261 × 50000 = 13050 m
Blu-Ray: Length = 0.261 × 115625 = 30178.125 m

I found it astounding that there could be a 30 km long spiral on the surface of a Blu-ray disc. These estimations clearly show that the Blu-Ray disc has a much longer data spiral than DVD and CD discs.

However, I was not satisfied with this estimation method. I knew that I had only made approximations. Therefore I endeavoured to take this exploration further and look into the area of polar coordinates and spirals. I soon realised that to make the proper calculations, I would need to use my current knowledge of calculus as well as research further into the area of spirals. I was definitely curious to see how much discrepancy I would get between the two methods. In hindsight, the estimation method certainly turned out to be the simpler method, but would its results be accurate enough?

Calculating the Length of the Data Spiral

Method 2 (Equation for a Spiral)

I found that the general equation for a spiral is:

$$r(\theta) = a + b\theta$$

Table 2: Explanation of the Equation for a Spiral

| Coefficient | Meaning |
|---|---|
| $b$ | A constant which determines the distance between each successive turning of the spiral |
| $r$ | Distance from the centre of the spiral |
| $\theta$ | The number of radians of revolution undergone by the spiral |
| $a$ | A constant which determines the starting point where the spiral begins to turn |

It can be assumed that $a = 0$ because on a physical disc, it does not matter where the starting point of the spiral is. Changing the constant ‘a’ rotates the spiral (such as in the two diagrams to the right). However, on a physical three dimensional disc, it does not matter which way the spiral is rotated. A disc is hand held and thus not constrained by this two dimensional mathematical aspect of the equation of a spiral. It is interesting that in this case, applying mathematics to the real world actually enabled me to simplify the mathematical theory. I didn’t anticipate this prior to my exploration because I assumed that real life mathematical problems were always more complicated than the mathematical theory suggests. However, I proved myself wrong.

Therefore, for the data spiral on a disc, we were able to simplify the equation to $r(\theta) = b\theta$. The distance between the arms of a spiral is called the Track Pitch. Therefore, for a spiral, after one revolution ($2\pi$) has been made, the distance from the centre is 1 Track Pitch. Therefore:

$$T = r(\theta) \text{ where } \theta = 2\pi \text{ rad} \quad \therefore T = r(2\pi)$$

Therefore, due to the equation for an Archimedean spiral of $r(\theta) = b\theta$:

$$r(2\pi) = 2\pi b \quad \therefore T = 2\pi b \quad \therefore b = \frac{T}{2\pi}$$

I then substituted the value for $b$ back into the equation for a spiral, $r(\theta) = a + b\theta$ with $a = 0$:

$$\therefore r(\theta) = \frac{T}{2\pi} \times \theta$$

Finding the Length of the Data Spiral:

Now that I have found an equation for the data spiral, I need to think of a way to find the length of the data spiral. The spiral is a form of polar curve and therefore, finding the length of the spiral can also be expressed as finding the arc length of a polar curve. After some research into polar curves, I found that the integral for the arc length of any polar curve is:

$$s = \int_{\alpha}^{\beta} \sqrt{[r(\theta)]^2 + [r'(\theta)]^2}\, d\theta$$

Table 3: Explanation of the Integral for the Arc Length of a Polar Curve

| Coefficient | Meaning |
|---|---|
| $s$ | Arc length of the spiral |
| $r(\theta)$ | Equation for the spiral |
| $r'(\theta)$ | Derivative of the equation for the spiral |
| $\alpha$ | Lower bound |
| $\beta$ | Upper bound |

In order to use this integral, we need to find $r'(\theta)$ using the basic rule of differentiation: if $y = kx$, then $y' = k$. In applying this same rule for a spiral, it follows that when

$$r(\theta) = \frac{T}{2\pi} \times \theta \quad \therefore r'(\theta) = \frac{T}{2\pi}$$

Therefore, through the process of substitution, I found that the integral for the length of the spiral on any of the three disc formats is:

$$s = \int_{\alpha}^{\beta} \sqrt{\left(\frac{T \times \theta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}\, d\theta$$

Finally I had produced an integral for the length of an Archimedean spiral where the distance between spiral arms (T) is known.

Integration:

Solving this final integral requires knowledge of inverse hyperbolic functions, so I used mathematical software to make the final calculations. By substituting the letter ‘a’ for the constant $\left(\frac{T}{2\pi}\right)^2$ and substituting $x$ for $\theta$, the integral is simplified to its basic form:

$$s = \int_{\alpha}^{\beta} \sqrt{ax^2 + a}\, dx$$

where $a$ is a constant. The use of an integral calculator gave the following equation:

$$\int \sqrt{ax^2 + a}\, dx = \frac{\sqrt{a} \times \operatorname{arsinh}(x) + x\sqrt{ax^2 + a}}{2}$$

$$\int \sqrt{\left(\frac{T \times \theta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}\, d\theta = \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\theta) + \theta\sqrt{\left(\frac{T \times \theta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}$$

Note: The term $\operatorname{arsinh}(x)$ represents an inverse hyperbolic function and is completely different from the constant ‘a’.

I also realised that the length of the spiral would be found by subtracting the length of spiral between the centre and the inside edge of the disc from the length of spiral between the centre and the outside edge of the disc. This is simply due to the fact that discs have holes in the centre (as seen to the right) and thus, the data spiral does not actually start at the centre, but rather from the inside edge of the disc. Therefore, by substituting $\left(\frac{T}{2\pi}\right)^2$ back in for the constant ‘a’ as well as substituting $\theta$ back in for $x$, the equation became:

$$S_{\text{Data Spiral}} = \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\beta) + \beta\sqrt{\left(\frac{T \times \beta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2} - \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\alpha) + \alpha\sqrt{\left(\frac{T \times \alpha}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}$$

Finally, I had found the equation for the length of a data spiral. I had all of the necessary information in order to complete the final step except for the values of $\alpha$ and $\beta$, which I would need to find before I could find $S_{\text{Data Spiral}}$.

Finding α and β:

A definite integral accumulates a quantity between a lower and an upper bound. Here the bounds define the range of values of $\theta$ over which the length of the curve is accumulated. The lower and upper bounds $\alpha$ and $\beta$ are determined by the lowest and highest possible values of $\theta$ for each data spiral. This is because the answer to the integral will give the length of the spiral. In terms of the length of the spiral, the point at which the length will be the longest is at the outermost point of the disc, which is also where the value of $\theta$ is greatest. Therefore, by finding the highest and lowest possible values of $\theta$, we will be able to find the upper and lower bounds.

The lower bound $\alpha$ and the upper bound $\beta$ can be found by going back to the equation which was created for the data spiral:

$$r(\theta) = \frac{T}{2\pi} \times \theta$$

The outermost and innermost radii of the disc were measured to be 0.06 m and 0.023 m respectively.
Therefore, to find the upper limit $\beta$:

$$\beta = \frac{r(\theta)}{\frac{T}{2\pi}} \quad \therefore \beta = \frac{r(\theta) \times 2\pi}{T}$$

Because $r(\theta) = 0.06$ m (the disc radius):

$$\therefore \beta = \frac{0.06 \times 2\pi}{T}$$

The same process can be used to find the lower limit $\alpha$ by replacing the outermost radius of the disc (0.06 m) with the innermost radius of the disc (0.023 m):

$$\therefore \alpha = \frac{0.023 \times 2\pi}{T}$$

For example, in calculating the CD upper bound:

$$\beta = \frac{0.06 \times 2\pi}{T} = \frac{0.06 \times 2\pi}{1.6 \times 10^{-6}} = 235619.449$$

Through the two equations for finding $\alpha$ and $\beta$, and the values for the Track Pitch of each disc, I was able to find the upper and lower bounds of the integral for the length of the data spiral on CD, DVD and Blu-ray discs:

Table 4: The calculated upper and lower bounds for each disc

| Bound | Value |
|---|---|
| CD Upper Bound ($\beta$) | 235619.449 |
| CD Lower Bound ($\alpha$) | 90320.789 |
| DVD Upper Bound ($\beta$) | 509447.457 |
| DVD Lower Bound ($\alpha$) | 195288.192 |
| Blu-Ray Upper Bound ($\beta$) | 1178097.245 |
| Blu-Ray Lower Bound ($\alpha$) | 451603.944 |

Finding arsinh(β) and arsinh(α):

In order to find the values of each inverse hyperbolic function, an inverse hyperbolic function calculator was used. Through the use of this technology and the previous calculations for $\alpha$ and $\beta$, the following values were determined:

Table 5: The calculated value of arsinh(θ) for each previously calculated upper and lower bound

| Bound | Value |
|---|---|
| CD Upper Bound ($\beta$) | arsinh(235619.449) = 13.063 |
| CD Lower Bound ($\alpha$) | arsinh(90320.789) = 12.104 |
| DVD Upper Bound ($\beta$) | arsinh(509447.457) = 13.834 |
| DVD Lower Bound ($\alpha$) | arsinh(195288.192) = 12.875 |
| Blu-Ray Upper Bound ($\beta$) | arsinh(1178097.245) = 14.673 |
| Blu-Ray Lower Bound ($\alpha$) | arsinh(451603.944) = 13.714 |

Using the following equation, the length of the data spiral for each disc was calculated.

$$S_{\text{Data Spiral}} = \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\beta) + \beta\sqrt{\left(\frac{T \times \beta}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2} - \frac{\frac{T}{2\pi} \times \operatorname{arsinh}(\alpha) + \alpha\sqrt{\left(\frac{T \times \alpha}{2\pi}\right)^2 + \left(\frac{T}{2\pi}\right)^2}}{2}$$

| Disc | Length of data spiral |
|---|---|
| CD | 6029.894 m |
| DVD | 13037.610 m |
| Blu-Ray | 30149.472 m |

Pits and Lands:

The data spiral of a disc is actually a series of bumps which are called pits and lands. The laser is able to reflect light off these bumps and subsequently read the data on the disc. This series of bumps is how digital data is encoded on discs. The image to the right shows that the new technology of Blu-Ray discs allows not only for more closely spaced spiral arms, but also for smaller minimum pit and land lengths. By dividing the calculated length of the data spiral by the minimum pit length, we can get a figure for the maximum number of pits that could physically fit onto each of the three discs. This is vitally important when comparing the three disc formats as it is a good indicator of the differences between CD, DVD and Blu-Ray discs.

$$\frac{\text{Length of spiral}}{\text{Minimum pit length}} = \text{Max no. of pits able to fit on the disc}$$

$$\text{CD}: \frac{6029.894}{800 \times 10^{-9}} = 7.537 \times 10^{9} \text{ pits}$$

$$\text{DVD}: \frac{13037.610}{400 \times 10^{-9}} = 3.259 \times 10^{10} \text{ pits}$$

$$\text{Blu-Ray}: \frac{30149.472}{150 \times 10^{-9}} = 2.010 \times 10^{11} \text{ pits}$$

By using this data, we can compare the three formats to see just how much more data can fit on a Blu-Ray disc than on a CD. Furthermore, we can compare these results with the accepted differences in data storage capacity in order firstly to see how accurate my calculations were and secondly to evaluate the importance of the length of the data spiral as an indicator of the quality of an optical disc. Therefore:

$$\frac{\text{Max no. of pits able to fit on disc } x}{\text{Max no. of pits able to fit on disc } y} = \text{Ratio for the data capacity of disc } x \text{ in relation to disc } y$$

$$\text{Blu-Ray / DVD}: \frac{2.010 \times 10^{11}}{3.259 \times 10^{10}} = 6.2$$

$$\text{Blu-Ray / CD}: \frac{2.010 \times 10^{11}}{7.537 \times 10^{9}} = 26.7$$

$$\text{DVD / CD}: \frac{3.259 \times 10^{10}}{7.537 \times 10^{9}} = 4.3$$

These results have no units because they are ratios.
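The bounds, spiral lengths and pit counts above can be reproduced with a short sketch. It is an illustration only: it assumes the radii, Track Pitch values and minimum pit lengths quoted in this exploration, it uses `math.asinh` from the Python standard library in place of the online calculators, and all function and constant names are hypothetical.

```python
import math

OUTER_RADIUS_M = 0.06
INNER_RADIUS_M = 0.023
TRACK_PITCH_M = {"CD": 1.6e-6, "DVD": 0.74e-6, "Blu-ray": 0.32e-6}
MIN_PIT_LENGTH_M = {"CD": 800e-9, "DVD": 400e-9, "Blu-ray": 150e-9}


def antiderivative(theta: float, track_pitch_m: float) -> float:
    """Antiderivative of sqrt(r^2 + r'^2) for r = (T / 2*pi) * theta."""
    b = track_pitch_m / (2 * math.pi)
    return b * (math.asinh(theta) + theta * math.sqrt(theta ** 2 + 1)) / 2


def spiral_length(track_pitch_m: float) -> float:
    """Arc length between the inner and outer edge of the data area."""
    alpha = INNER_RADIUS_M * 2 * math.pi / track_pitch_m  # lower bound
    beta = OUTER_RADIUS_M * 2 * math.pi / track_pitch_m   # upper bound
    return (antiderivative(beta, track_pitch_m)
            - antiderivative(alpha, track_pitch_m))


lengths_m = {disc: spiral_length(t) for disc, t in TRACK_PITCH_M.items()}
max_pits = {disc: lengths_m[disc] / MIN_PIT_LENGTH_M[disc] for disc in lengths_m}

for disc in lengths_m:
    print(f"{disc}: {lengths_m[disc]:.3f} m, {max_pits[disc]:.3e} pits")
```

Factoring the constant $T / 2\pi$ out of the square root keeps the code close to the simplified form $\sqrt{ax^2 + a}$ used in the integration step.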
These calculations show that, according to the results of my exploration, a Blu-Ray disc holds about 6.2 times as much data as a DVD, a DVD holds 4.3 times as much data as a CD and a Blu-Ray disc holds 26.7 times as much data as a CD.

What should we make of these results? In order to assess the accuracy of my calculations and evaluate the importance of the length of the data spiral as an indicator of the quality of an optical disc, I need to compare these values to the actual ratios for how much data each disc can hold in relation to the other discs. The real values for the data storage capacity of CD, DVD, and Blu-Ray discs are as follows.

Table 6: The accepted values for the data storage capacity of each disc

| Disc Format | CD | DVD | Blu-Ray |
|---|---|---|---|
| Data Storage Capacity (GB) | 0.7 | 4.7 | 25 |

$$\frac{\text{Data storage capacity of disc } x}{\text{Data storage capacity of disc } y} = \text{Ratio for the accepted data capacity of disc } x \text{ in relation to disc } y$$

$$\text{Blu-Ray / DVD}: \frac{25}{4.7} = 5.3 \qquad \text{Blu-Ray / CD}: \frac{25}{0.7} = 35.7 \qquad \text{DVD / CD}: \frac{4.7}{0.7} = 6.7$$

Percentage Error:

Therefore, in order to compare my results with the accepted values, the percentage error formula has been utilised:

$$\text{Error}(\%) = \frac{|\text{Result} - \text{Accepted Value}|}{\text{Accepted Value}} \times 100$$

$$\text{Blu-Ray / DVD}: \frac{|6.2 - 5.3|}{5.3} \times 100 = 17.0\% \text{ error}$$

$$\text{Blu-Ray / CD}: \frac{|26.7 - 35.7|}{35.7} \times 100 = 25.2\% \text{ error}$$

$$\text{DVD / CD}: \frac{|4.3 - 6.7|}{6.7} \times 100 = 35.8\% \text{ error}$$

Comparison of Methods:

I was actually quite surprised by how accurate the estimations were. The percentage error of the estimated length of the spiral on each disc was determined in the following calculations:

$$\text{Error}(\%) = \frac{|\text{Result} - \text{Accepted Value}|}{\text{Accepted Value}} \times 100$$

$$\text{CD}: \frac{6035.625 - 6029.894}{6029.894} \times 100 = 0.09504\% \text{ error}$$

$$\text{DVD}: \frac{13050 - 13037.610}{13037.610} \times 100 = 0.09503\% \text{ error}$$

$$\text{Blu-Ray}: \frac{30178.125 - 30149.472}{30149.472} \times 100 = 0.09505\% \text{ error}$$

In all three cases, the percentage error is extremely small. Most of this small gap comes from rounding the mean circumference to 0.261 m: with the unrounded value $0.083\pi \approx 0.26075$ m, the estimate becomes $\pi(0.06^2 - 0.023^2)/T$, which matches the integral to well within a millimetre. These low percentage errors show that in most cases, the estimation method would suffice in giving an indication of the length of a spiral. However, in cases where absolute accuracy is necessary, the second method requiring integral calculus is important to recognise.

Conclusion

So why exactly is all of this important? My exploration has shown that Blu-ray discs can hold more data than DVDs and much more data than CDs. However, how do these calculations show that Blu-Rays are of better quality than DVDs and CDs? Generally, higher quality recording requires more data space. The longer data spiral of Blu-ray discs allows for this extra space. In order to illustrate the importance of data storage capacity, I utilised Microsoft Excel graph drawing software to create the four graphs to the left. It is important to note that these graphs are an estimation of reality, and do not use any equations from my exploration. However, they provide a great visual representation of the differences between the three discs in attempting to record a sound wave. It is clear that because the CD only has a limited amount of data storage space, the resemblance of the digital recording to the actual sound is quite poor. On the other hand, the DVD and Blu-Ray digital recordings bear much more resemblance to the original sound wave. Therefore, these recordings are more accurate digital recordings and thus arguably of higher quality.

It is interesting that although CD, DVD and Blu-Ray discs all look similar, they have fundamental differences due to progress in laser technology. By refining the spot size of the laser by a mere micrometre, the data spiral can be increased from 6 km to over 30 km in length. It seems that for technological advancement to occur, a small improvement of a micrometre may lead to a drastic overall benefit. In addition to this, my exploration has shown me the potential for innovation in the future. It seems logical that because refining the laser by one micrometre allowed for a data spiral about five times longer, then perhaps the same can be done again to create even higher quality digital entertainment. Perhaps, if we used a laser of even shorter electromagnetic wavelength, this will become a reality. In any case, my mathematical exploration has really helped me to understand the world of digital technology much more than before. I believe that my exploration into the mathematics behind technology has implications for everyone endeavouring to benefit society through technological advancement.

Bayes’ Theorem and Its Applications in Law

Yuzhe Zheng
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]

Abstract

The aim of this paper is to clearly explain how Bayes’ Theorem can provide the correct interpretation of statistical evidence. This will be achieved by first introducing Bayes’ Theorem in two different forms, and then examining its use in two real-world cases, Regina v Sally Clark [1] and Regina v Denis John Adams [2]. In the first case, I will explain how Bayes’ Theorem accounts for one piece of evidence, and in the second case, I will explain how Bayes’ Theorem can also be used for multiple pieces of evidence. Through this process, I will also expose the logical fallacies of these cases and explore how this theorem avoids making such fallacies. Acknowledging that it is difficult for the layman to understand Bayes’ Theorem, I will also be using diagrams to represent Bayes’ Theorem for clarification.

Introduction

An introductory example: my conversation with Ishaan.

I found that one of my best attempts at explaining Bayes’ Theorem was in a conversation with my friend Ishaan, of which a transcript is provided below:

Yuzhe: Here are two coins, both of which are showing heads at the moment. One of them is double-headed, while the other is a normal coin.
Now pick one up without looking at the other side.

Ishaan picks it up.

Yuzhe: Okay. What is the probability of the coin in your hand being biased? And explain why.

Ishaan: 50%, because the biased coin is one of the two coins which could have been chosen.

Yuzhe: Now flip the coin three times and tell me which side is facing up each time.

Ishaan flips the coin three times. By the end of the third flip, he looks quite shocked.

Ishaan: Heads, heads and heads.

Yuzhe: Okay, how sure are you now that the coin that you’ve just flipped is the biased coin?

Ishaan: Quite sure.

Yuzhe: Can you give me a probability?

Ishaan: Well, now I’m around 80% sure it’s the biased coin, and I definitely know it’s more than the figure of 50% which I gave earlier.

Reading this conversation, it is likely that you, the reader, have followed a similar pattern of thought to Ishaan’s. Ishaan has intuitively updated his belief in the biased coin from 50% to 80% after observing that the coin produced three consecutive heads. This process of updating our beliefs in light of new evidence is the basis of Bayes’ Theorem, which carries it out in an objective, controlled manner using the mathematics of probability. Bayes’ Theorem therefore has a wide range of applications in any field that requires the confirmation of a hypothesis through evidence. It was, however, the legal function of Bayes’ Theorem that surprised and interested me most, as it uses mathematics and probability to account for evidence and help determine a defendant’s guilt or innocence.

Rationale

I chose this topic because I was intrigued that mathematics could be used to ensure that justice is appropriately carried out. I was led to believe by TV crime and legal shows such as The Good Wife that courtroom law was incredibly subjective, especially due to the unreliability of evidence provided by humans, such as witnesses. Moreover, having originally believed that mathematics and law were two mutually exclusive areas, I found that my exploration of Bayes’ Theorem highlighted the usefulness of mathematics in many more areas than I had previously considered.

Explaining Bayes’ Theorem

Bayes’ Theorem was named after Reverend Thomas Bayes, who suggested the process of updating beliefs through probability in his essay An Essay towards Solving a Problem in the Doctrine of Chances [3], published in 1763. The simplest form of Bayes’ Theorem is given below:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Proof

I attempted to prove Bayes’ Theorem myself by showing how it can be derived from the general definition [5] of $P(A \mid B)$.

| Step | Equation | Explanation |
|---|---|---|
| [1] | $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ | General definition |
| [2] | $P(B \mid A) = \dfrac{P(B \cap A)}{P(A)}$ | General definition |
| [3] | $P(B \mid A)\,P(A) = P(B \cap A)$ | [2] × $P(A)$ |
| [4] | $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$ | Sub [3] in [1], as $P(B \cap A) = P(A \cap B)$ |

This assumes that $P(A) > 0$ and $P(B) > 0$, so that both conditional probabilities are defined.

Table 1: The meaning of each term of the simple form of Bayes’ Theorem

| Term | Name | Meaning |
|---|---|---|
| $P(A)$ | The prior [6] | The degree of belief that event A has occurred. |
| $P(A \mid B)$ | The posterior | The degree of belief that A has occurred, given that event B has occurred. |
| $P(B \mid A)$ | The likelihood | The probability of B occurring, given that A has occurred. |
| $P(B)$ | (not named) | The sum of the probabilities of all possible ways to get B. This includes the probability of B together with A and the probability of B together with not A: $P(B) = P(B \mid A)\,P(A) + P(B \mid A')\,P(A')$. Dividing $P(B \mid A)$ by $P(B)$ therefore gives the probability of B given A in comparison to all ways of B occurring. The likelihood is therefore the support of B for A. |

Substituting the names for the terms, Bayes’ Rule is simply:

posterior = (prior × likelihood) / P(B)

Returning to the introductory example

Returning to our coin flip example, the prior was Ishaan’s initial degree of belief in the flipped coin being biased, which we will denote as $P(A)$. The posterior, $P(A \mid HHH)$, is Ishaan’s updated belief in the coin being biased after it produced three consecutive heads, HHH. Using Bayes’ Theorem, we can calculate the posterior:

$$P(A \mid HHH) = \frac{P(HHH \mid A)\,P(A)}{P(HHH)}$$

There are two coins, the fair and the biased, which can produce HHH. Therefore, the probability of producing three consecutive heads, $P(HHH)$, includes the probability of picking the biased coin and flipping HHH, as well as the probability of picking the fair coin and flipping HHH. Hence, with $A'$ representing the fair coin,

$$P(HHH) = P(HHH \mid A)\,P(A) + P(HHH \mid A')\,P(A')$$

Substituting this equation into Bayes’ Theorem gives its extended form:

$$P(A \mid HHH) = \frac{P(HHH \mid A)\,P(A)}{P(HHH \mid A)\,P(A) + P(HHH \mid A')\,P(A')}$$

Table 2: The calculations of all the terms on the right side of Bayes’ Theorem for the coin flip example

| Calculation | Explanation |
|---|---|
| $P(A) = \frac{1}{2}$ | As Ishaan has explained, the biased coin is 1 of the 2 coins that could have been chosen. |
| $P(A') = \frac{1}{2}$ | $A$ and $A'$ are complementary events. |
| $P(HHH \mid A) = 1$ | Given that the coin is double-headed, it is certain that it will produce HHH. |
| $P(HHH \mid A') = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$ | The probability of getting a head on the first flip doesn’t affect the probability of getting a head on the second or third flips. These events are therefore independent, so their probabilities are multiplied when we wish to find the probability of all of them occurring together [5]. |
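The quantities in Table 2 can be combined numerically. The following is a minimal Python sketch of the extended form of Bayes’ Theorem for the coin example; the function name `posterior` and its parameter names are illustrative choices, not part of the original paper, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_given_complement):
    """Extended form of Bayes' Theorem for a hypothesis A and its complement A'.

    prior                        -- P(A)
    likelihood                   -- P(evidence | A)
    likelihood_given_complement  -- P(evidence | A')
    """
    numerator = likelihood * prior
    # P(evidence) = P(evidence | A) P(A) + P(evidence | A') P(A')
    evidence = numerator + likelihood_given_complement * (1 - prior)
    return numerator / evidence


# Values taken from Table 2 of the coin example.
p_biased = Fraction(1, 2)                # P(A)
p_hhh_given_biased = Fraction(1)         # P(HHH | A)
p_hhh_given_fair = Fraction(1, 2) ** 3   # P(HHH | A') = 1/8

p_biased_given_hhh = posterior(p_biased, p_hhh_given_biased, p_hhh_given_fair)
print(p_biased_given_hhh, float(p_biased_given_hhh))
```

Passing the complement’s likelihood separately keeps the denominator explicit, which mirrors how $P(HHH)$ is built up from the two coins in the text.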
These calculations are substituted into the extended form of Bayes’ Theorem:

$$P(A \mid HHH) = \frac{1 \times \frac{1}{2}}{1 \times \frac{1}{2} + \frac{1}{8} \times \frac{1}{2}} = \frac{8}{9} \approx 0.89$$

Thus, Ishaan’s prior degree of belief in the selected coin being biased, which was 0.5, is updated to about 0.89 after accounting for HHH, slightly higher than his intuitive estimate of 80%.

Reflection: The Usefulness of Bayes’ Theorem

It can therefore be seen that Bayes’ Theorem is useful because it allows us to update our prior belief in light of new information. It was at this point, however, that I noticed that the same result for the biased coin example could be calculated by using the basic definition of $P(A \mid B)$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

When I was comparing the two equations, I realised that $P(A \cap B) = P(B \mid A)\,P(A)$, which made me question the usefulness of Bayes’ Theorem at all. Finally, I realised that Bayes’ Theorem differs from the general definition of $P(A \mid B)$ in that it provides the relation between $P(A \mid B)$ and $P(B \mid A)$. This is particularly useful for cases where $P(B \mid A)$ is known but $P(A \mid B)$ is not.

For example, the jury wants to know the probability of a defendant’s innocence, given the occurrence of a particular piece of evidence, $P(\text{innocence} \mid \text{evidence})$. Yet the jury only knows the probability of that piece of evidence occurring when one is innocent, denoted as $P(\text{evidence} \mid \text{innocence})$. Bayes’ Theorem allows the jury to derive $P(\text{innocence} \mid \text{evidence})$ from $P(\text{evidence} \mid \text{innocence})$ and thus update their degree of belief in a defendant’s guilt in light of the evidence.

Another example is DNA testing, in which there is always a possibility that the match produced belongs to someone who is innocent and completely unrelated to the crime in question. This is known as the random match probability [2], which is the probability of getting a match given one’s innocence, $P(\text{match} \mid \text{innocence})$.
Thus, with Bayes’ Theorem, we can determine the probability of one’s innocence given the DNA match, $P(\text{innocence} \mid \text{match})$, from the random match probability, $P(\text{match} \mid \text{innocence})$.

Odds form

The simple form of Bayes’ Theorem, however, can only provide the probability of one hypothesis, whereas in the courtroom we want to compare the viability of two competing hypotheses: a defendant’s guilt and innocence. I found that the odds form [2] of Bayes’ Theorem allows for this, as it compares the probabilities of two competing hypotheses in a ratio, shown below:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E \mid G)}{P(E \mid G')}$$

In trials, the two competing hypotheses are the defendant’s guilt and innocence of a charge, in which $P(G)$ is the prior probability of guilt and $P(G')$ is the prior probability of innocence. $E$ represents the evidence, which is used to update the prior probabilities of innocence and guilt.

Table 3: An explanation of the terms of the odds form of Bayes’ Theorem

| Term | Name | Explanation |
|---|---|---|
| $\dfrac{P(G)}{P(G')}$ | Prior odds | The ratio of the prior probability of guilt to the prior probability of innocence. |
| $\dfrac{P(E \mid G)}{P(E \mid G')}$ | Likelihood ratio | The probability of the evidence occurring given that the defendant is guilty, divided by the probability of the evidence occurring given that the defendant is innocent. This ratio therefore shows how much the evidence agrees with $G$ in comparison to $G'$. Therefore, the higher the likelihood ratio, the greater the weight of the prosecution’s case. |
| $\dfrac{P(G \mid E)}{P(G' \mid E)}$ | Posterior odds | The relative probabilities of guilt to innocence, after the evidence has been taken into account. |
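The odds form summarised in Table 3 can be sketched in a few lines of Python. The function names and the numbers below are hypothetical, chosen only for illustration (prior odds of 1 to 300 and a likelihood ratio of 100); they do not come from either court case. The second function converts odds into a probability, assuming guilt and innocence are complementary.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' Theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds of G against G' into P(G), assuming P(G) + P(G') = 1."""
    return odds / (1 + odds)


# Hypothetical illustration only: prior odds 1:300, likelihood ratio 100.
example_prior_odds = Fraction(1, 300)
example_likelihood_ratio = Fraction(100)

example_posterior_odds = posterior_odds(example_prior_odds, example_likelihood_ratio)
example_p_guilt = odds_to_probability(example_posterior_odds)

print(example_posterior_odds)  # odds of guilt to innocence
print(example_p_guilt)         # corresponding probability of guilt
```

Keeping the update and the conversion as separate steps reflects the structure of the odds form: evidence acts on the odds, and a probability is only recovered at the end.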
Using the technical names of the terms in the odds form of the theorem, we arrive at:

Posterior Odds = Prior Odds × Likelihood Ratio

For example, if we derive posterior odds of 1/3 from the formula above, this means that the defendant is three times more likely to be innocent than guilty after accounting for the evidence. The events of a defendant’s guilt and innocence are complementary events, meaning “Exactly one of the events must occur” [5]. Therefore $P(G) + P(G') = 1$. At this point, I realised I could convert the posterior odds into individual probabilities of guilt and innocence. For example, suppose the ratio of guilt to innocence is 1:3. As guilt and innocence are complementary events, the probability of guilt can be given out of 1 + 3 = 4. Hence, the probability of guilt is $\frac{1}{4} = 0.25$ and the probability of innocence is $\frac{3}{4} = 0.75$.

In cases involving juries, Bayes’ Theorem is often called upon to aid juries in deliberating a verdict. The prior odds can be derived in 3 ways:

• A juror’s own subjective guess at the probability of the defendant’s guilt.
• The result from using Bayes’ Theorem with a previous piece of evidence.
• A confirmed statistic. For example, if the defendant were initially one of 8 suspects, the prior probability of guilt would be 1/8, which corresponds to prior odds of 1 to 7. This statistic is then compounded with other evidence such as DNA, which is represented by the likelihood ratio.

Personally, I see a problem with the first option, as it is unfair and uncomfortable for the jurors to have to assign a subjective probability to a defendant’s guilt before even seeing any evidence. This problem is eased by the odds form of Bayes’ Theorem, which requires only the likelihood ratio to determine the significance of the evidence for the case. For example, if the likelihood ratio from a DNA test were 100 to 1, it would make the odds of guilt, and so the prosecution’s case, 100 times greater.
Bayes’ Theorem can also be used to incorporate all pieces of evidence of varying likelihood ratios in one equation, or it can be applied successively, meaning that the posterior probability after accounting for one piece of evidence (e.g. a licence plate match) becomes the prior probability before accounting for another piece of evidence (e.g. DNA testing). I realised that the former could be achieved by slightly modifying the odds form of Bayes’ Theorem, which I have presented in detail in the examination of Regina v Denis John Adams.

Regina v Sally Clark

We will now see how Bayes’ Theorem can be used to interpret one piece of evidence in terms of its impact on the case for guilt or innocence.

The Case

Sally Clark was wrongly convicted of murdering her two infant sons, who had both died of Sudden Infant Death Syndrome (SIDS), also known as cot death. Her first son died at the age of three months, and his death was initially considered a case of SIDS. Following the second son’s death at the age of 2 months in similar circumstances, Sally was tried and convicted of murder in 1999. Lacking substantial medical evidence, the prosecution relied on Professor Meadow, who stated that the probability of both babies dying of SIDS was approximately 1 in 73 million. This was calculated by squaring the probability of a single death from SIDS in a family such as Sally Clark’s, 1 in 8500, to account for two SIDS deaths. With such a statistic, it seemed to the jury and the media that it was almost impossible that Sally Clark’s children had died from SIDS and thus highly improbable that she was innocent. Only during a second appeal in 2003 was the conviction overturned, due to the court’s recognition that many pieces of evidence, including Meadow’s statistic, had misrepresented the case.
Proof (odds form of Bayes’ Theorem)

| Step | Explanation | Equation |
|---|---|---|
| [1] | Bayes’ Theorem, simple form | $P(G \mid E) = \dfrac{P(E \mid G)\,P(G)}{P(E)}$ |
| [2] | Bayes’ Theorem, simple form | $P(G' \mid E) = \dfrac{P(E \mid G')\,P(G')}{P(E)}$ |
| [3] | [1] × $P(E)$ | $P(E)\,P(G \mid E) = P(E \mid G)\,P(G)$ |
| [4] | [3] ÷ $P(G \mid E)$ | $P(E) = \dfrac{P(E \mid G)\,P(G)}{P(G \mid E)}$ |
| [5] | [2] × $P(E)$ | $P(E)\,P(G' \mid E) = P(E \mid G')\,P(G')$ |
| [6] | [5] ÷ $P(G' \mid E)$ | $P(E) = \dfrac{P(E \mid G')\,P(G')}{P(G' \mid E)}$ |
| [7] | [6] = [4] | $\dfrac{P(E \mid G')\,P(G')}{P(G' \mid E)} = \dfrac{P(E \mid G)\,P(G)}{P(G \mid E)}$ |
| [8] | [7] × $P(G \mid E)$ | $\dfrac{P(G \mid E)\,P(E \mid G')\,P(G')}{P(G' \mid E)} = P(E \mid G)\,P(G)$ |
| [9] | [8] ÷ $\big(P(E \mid G') \times P(G')\big)$ | $\dfrac{P(G \mid E)}{P(G' \mid E)} = \dfrac{P(G)}{P(G')} \times \dfrac{P(E \mid G)}{P(E \mid G')}$ |

The Logical Fallacies

I realised that Professor Meadow made a grave error in the calculation of this statistic. The probability of one event can only be multiplied by the probability of another if they are independent, meaning that the occurrence of one does not affect the probability that the other occurs. $P(A \cap B)$ means the probability that A and B occur together, and assuming that A and B are independent events,

$$P(A \cap B) = P(A) \times P(B)$$

The occurrence of the first SIDS death, however, does influence the probability of the second SIDS death. Meadow’s incorrect assumption of independence therefore leads to a figure that grossly understates the probability of two SIDS deaths, and which had incorrectly pointed to Clark’s guilt. The President of the Royal Statistical Society, in a letter criticizing the court’s abuse of probability, notes that “There may well be unknown genetic or environmental factors that predispose families to SIDS, so that a second case within the family becomes much more likely than would be a case in another, apparently similar, family” [4]. This makes them dependent events, and therefore it is inappropriate simply to square the probability of a single SIDS death.
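The effect of the independence assumption can be illustrated with a short Python sketch. The single-death figure of 1 in 8500 is the one quoted in the case; the factor of 10 by which a first SIDS death is assumed to raise the chance of a second is purely hypothetical, used only to show how sensitive the final figure is to dependence, and is not an estimate from the paper or from the Royal Statistical Society.

```python
from fractions import Fraction

# Probability of a single SIDS death in a family such as Sally Clark's,
# as quoted in the case.
p_single = Fraction(1, 8500)

# Meadow's approach: treat the two deaths as independent and multiply.
p_both_if_independent = p_single * p_single

# Hypothetical dependence: suppose a first SIDS death makes a second one
# 10 times more likely (an assumed factor, for illustration only).
assumed_dependence_factor = 10
p_second_given_first = assumed_dependence_factor * p_single
p_both_if_dependent = p_single * p_second_given_first

print(p_both_if_independent)  # roughly 1 in 72 million
print(p_both_if_dependent)    # ten times larger under the assumed factor
```

Under dependence the general rule $P(A \cap B) = P(A) \times P(B \mid A)$ applies, and squaring is only the special case in which $P(B \mid A) = P(B)$.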
Whilst assigning mathematical notation to the case’s events, I also realised that it was wrong for the media and Meadow to assume that the low probability of the children’s cot deaths translates directly into a low probability of her innocence. Having made this logical error myself, I discovered that it is quite common in the interpretation of statistics in the courtroom, where it is known as The Prosecutor’s Fallacy [2].

If we let $G$ be the event of Sally Clark’s guilt, $G'$ the event of her innocence, and $E$ the evidence of the two deaths, then $P(E \mid G') = \frac{1}{73 \text{ million}}$. This represents the probability of the two deaths happening, given her innocence. For simplicity, this assumes that there are only two possible explanations of the deaths: double murder or two cases of SIDS. However, the probability of Sally Clark’s innocence given the deaths is $P(G' \mid E)$. According to the terms of the simple form of Bayes’ Theorem, therefore, to equate the likelihood, $P(E \mid G')$, to the posterior, $P(G' \mid E)$, is wrong: the two coincide only in the special case where $P(G') = P(E)$.

Using Bayes’ Theorem

Using the odds form of Bayes’ Theorem, we can correctly calculate the posterior odds of Sally Clark’s guilt:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E \mid G)}{P(E \mid G')}$$

• $P(E \mid G) = 1$, as it is certain that if Sally Clark were guilty of double infanticide, the two deaths would have occurred.
• $P(E \mid G') = \frac{1}{73\,000\,000}$, established by Professor Meadow earlier.

At this point I realised I did not have any figures for $P(G)$ or its complement, $P(G')$. Luckily, I found the probability of an infanticide within the infant’s first year of birth to be $1.1 \times 10^{-5}$, from the UK Office of National Statistics data for 1997 [2], which was the year in which Sally’s second child died.
Assuming Professor Meadow’s (admittedly ill-informed) logic of independence, the probability of double murder is

$$(1.1 \times 10^{-5})^2 \approx 1.2 \times 10^{-10} \approx \frac{1}{8.3 \text{ billion}}$$

Hence, $P(G) = \frac{1}{8.3 \text{ billion}}$ and $P(G') = 1 - \frac{1}{8.3 \text{ billion}}$. Substituting these values into Bayes’ Theorem, the posterior odds on guilt are:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{\frac{1}{8.3 \text{ billion}}}{1 - \frac{1}{8.3 \text{ billion}}} \times \frac{1}{\frac{1}{73 \text{ million}}} \approx 0.009, \text{ or } \frac{9}{1000}$$

Knowing this, we can conclude that it is over 100 times more likely that Sally Clark is innocent than guilty, and that her children died of SIDS rather than murder.

Diagrammatic representation

Following this calculation, I was still uncertain about exactly how I had come to this result, so the following tree diagram was produced for all possibilities.

[Figure: tree diagram. The red ‘guilty’ branch carries the prior 1/8.3 billion followed by $P(E \mid G) = 1$; the blue ‘innocent’ branch carries the prior 1 − 1/8.3 billion followed by $P(E \mid G') = 1/73$ million.]

With the aid of the diagram, I fully understood the use of Bayes’ Theorem. The odds form was simply a comparison between the red and blue branches containing $E$, because we know that the two deaths have already happened. Multiplying the probabilities on the red ‘guilty’ branch, I noticed I was actually just calculating the numerator of the right-hand side of the odds form of Bayes’ Theorem. Hence, doing the same with the blue ‘innocent’ branch containing $E$, I divided the former by the latter and arrived at 0.009, the same answer as earlier.

Reflection

The benefit of using Bayes’ Theorem here is that it reminds us to compare the probability of Sally Clark’s guilt to that of her innocence. Although two deaths by SIDS were highly improbable, at 1 in 73 million, this is over a hundred times more probable than a double infanticide, which the jury, the media and Professor Meadow himself failed to recognise. Another benefit is that Bayes’ Theorem outlines which terms must be known in order to rationally process evidence and hence determine a defendant’s guilt or innocence.
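The posterior odds for the Sally Clark case can be checked with a short Python sketch. It assumes, as the calculation in this section does, that the prior probability of double murder is the square of the single-infanticide rate and that Meadow’s figure of 1 in 73 million is taken at face value; the variable names are illustrative.

```python
# Inputs quoted in the text.
infanticide_rate = 1.1e-5             # single infanticide in the first year
p_e_given_guilt = 1.0                 # P(E | G): two deaths are certain if guilty
p_e_given_innocent = 1 / 73_000_000   # P(E | G'): Meadow's double-SIDS figure

# Prior for double murder, squaring the single rate (independence assumed,
# mirroring Meadow's own reasoning rather than endorsing it).
p_guilt = infanticide_rate ** 2
p_innocent = 1 - p_guilt

# Odds form of Bayes' Theorem: prior odds x likelihood ratio.
prior_odds = p_guilt / p_innocent
likelihood_ratio = p_e_given_guilt / p_e_given_innocent
clark_posterior_odds = prior_odds * likelihood_ratio

print(round(clark_posterior_odds, 3))  # posterior odds of guilt to innocence
print(round(1 / clark_posterior_odds)) # how many times more likely innocence is
```

The likelihood ratio here is very large (73 million), yet the posterior odds remain small because the prior odds of double murder are smaller still.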
For example, my use of Bayes’ Theorem in the Sally Clark case made me recognise that I still needed the prior odds in order to use Bayes’ Theorem.

This case has also revealed to me the difficulty of expressing mathematical concepts clearly in language. An example is the Prosecutor’s Fallacy, in which it was extremely easy for the media to represent the probability of Clark’s innocence as equal to the probability of two consecutive cot deaths. The misinterpretation of statistics in court is therefore extremely easy, and unfortunately has a profound effect on the outcomes of cases.

I chose this particular case because of these profound effects. Although Clark’s conviction was later overturned, her ordeal lasted more than 5 years and left her with issues such as serious alcohol dependency, which led to her death in 2007 from alcohol poisoning. These horrid consequences can be attributed to logical fallacies during her case, such as the Prosecutor’s Fallacy, incorrect assumptions of independence, and the failure to compare the probabilities of guilt and innocence. It struck me that it was so easy to commit such fallacies, which then lead to disastrous consequences for the lives of others. On the other hand, I found that using mathematical notation and the tree diagram made the reasoning much clearer to me than word-based explanations.

A key limitation in my analysis of Regina v Sally Clark was that I found $P(G)$ by squaring the probability of infanticide within the first year of birth. In this calculation, I have, like Meadow, also assumed that the murder of one baby was independent of the murder of the other. These events are actually dependent because, hypothetically, if Clark did murder her first baby, it is much more likely that she murdered her second as well. This may be due to factors surrounding the first infanticide which may also have led her to commit the second one.
I purposely based my analysis on the same fallacious assumptions as Meadow to emphasise that, even when following his logic, the posterior odds are still very low. In conclusion, my examination of Regina v Sally Clark has highlighted to me the need for objective methods such as Bayes’ Theorem to properly assess evidence and update probabilities of guilt.

Regina v Denis John Adams

We will now see how Bayes’ Theorem can be used to account for several different pieces of evidence, as opposed to one single piece of evidence. This case was of particular interest to me because I had always thought of DNA as the most conclusive evidence, proving an offender’s guilt beyond reasonable doubt. Influenced by sensationalized crime shows such as NCIS, my initial naïve view of DNA was challenged by its misuse in Regina v Denis John Adams.

The Case

In 1993, Denis John Adams was arrested on a charge of rape [5] committed in 1991. His DNA was entered into the police database in 1993 for a different arrest, and whilst running a check on his DNA, a match was found with the scene of the rape in 1991. As a result, he was convicted of rape at trial.

The prosecution’s case relied solely on the DNA evidence, arguing that its random match probability was one in 200 million. As the random match probability was so low, it seemed likely that Adams was guilty. The defence actually proved that the random match probability for that specific DNA test was one in 2 million, rather than one in 200 million.

The Evidence

1. The victim did not identify Adams as her rapist at the identification parade, and stated that he did not resemble her attacker.
2. Adams’ girlfriend stated that Adams spent the night of the offence with her.
3. In the local area of the offence, there were 150 000 males between 18 and 60 years of age. When accounting for others who may have entered the area that night, this increases the potential suspect pool to 200 000.
4. The random match probability is one in two million.

The Prosecutor’s Fallacy

Immediately, I noticed that the prosecution had committed the prosecutor’s fallacy. The random match probability, $P(\text{Match} \mid G')$, is conditional on the suspect being innocent. However, from this statistic the prosecution argued the low probability of Adams’ innocence, which is actually $P(G' \mid \text{Match})$, the probability of innocence given the match. The relation between them is determined by the simple form of Bayes’ Theorem, which requires that the likelihood, $P(\text{Match} \mid G')$, be multiplied by the prior, $P(G')$, and divided by $P(\text{Match})$.

Using Bayes’ Theorem

We can overcome the risk of committing The Prosecutor’s Fallacy by using Bayes’ Theorem. Unlike in Regina v Sally Clark, not all the evidence is statistical, and therefore, as pretend jury members of Adams’ trial, we will assign our own estimates of likelihood ratios to each piece of non-statistical evidence, to reach conclusions on Adams’ guilt or innocence.

As there is more than one piece of evidence, the total likelihood ratio can be found by multiplying the individual likelihood ratios of each piece of evidence, assuming that the pieces of evidence are independent of one another given guilt and given innocence. Hence, the odds form of Bayes’ Theorem is transformed into:

$$\frac{P(G \mid E)}{P(G' \mid E)} = \frac{P(G)}{P(G')} \times \frac{P(E_1 \mid G)}{P(E_1 \mid G')} \times \cdots \times \frac{P(E_n \mid G)}{P(E_n \mid G')}$$

Values are assigned to each term below.

It is reasonable to use the most readily available information as the prior odds, which in this case is the potential suspect pool. Hence, the prior odds are approximately $\frac{1}{200\,000}$.

We may denote $E_1$ as the event of the victim not identifying Adams as the offender. Thus, if Adams were indeed guilty, we may estimate the probability of the victim not identifying him, $P(E_1 \mid G)$, to be 0.1. Likewise, we would estimate that $E_1$ is more likely if Adams were not guilty, say, $P(E_1 \mid G') = 0.9$.
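The product form of Bayes’ Theorem above can be sketched as a small Python function that multiplies the prior odds by any number of likelihood ratios. It is applied here only to the two quantities estimated so far, the prior odds from the suspect pool and the estimates for $E_1$; the function name is an illustrative choice, and further likelihood ratios can be appended to the list in the same way.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio in turn.

    This is equivalent to updating successively, with each posterior
    becoming the prior for the next piece of evidence. It assumes the
    pieces of evidence are conditionally independent.
    """
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds


# Prior odds from a suspect pool of 200 000 (as approximated in the text).
adams_prior_odds = Fraction(1, 200_000)

# E1: the victim did not identify Adams; the paper's own estimates.
p_e1_given_guilt = Fraction(1, 10)      # P(E1 | G)  = 0.1
p_e1_given_innocent = Fraction(9, 10)   # P(E1 | G') = 0.9
lr_e1 = p_e1_given_guilt / p_e1_given_innocent

odds_after_e1 = posterior_odds(adams_prior_odds, [lr_e1])
print(lr_e1, odds_after_e1)
```

Because multiplication is commutative, the order in which the pieces of evidence are supplied does not change the final odds.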
The likelihood ratio for $E_1$ is therefore $\frac{1}{9}$.

We can also denote $E_2$ as the event of an alibi. We estimate that an alibi would exist 25% of the time given one’s guilt, and 75% of the time given one’s innocence. Hence, the likelihood ratio for $E_2$ is $\frac{1}{3}$.

Finally, we denote $E_3$ as the event of a match in DNA testing. We know that $P(E_3 \mid G) = 1$, as, if Adams were indeed guilty, he would obviously be a match for the DNA. $P(E_3 \mid G') = \frac{1}{2 \text{ million}}$, as already given by the defence earlier. Dividing the former by the latter, we arrive at the likelihood ratio of $\frac{2 \text{ million}}{1}$.

Substituting these values into the above form of Bayes’ Theorem, we arrive at:

$$\frac{1}{200\,000} \times \frac{1}{9} \times \frac{1}{3} \times \frac{2\,000\,000}{1} = \frac{10}{27}$$

Hence, in light of these pieces of evidence, the ratio of the probability of guilt to that of innocence is 10:27. When we convert this ratio to a posterior probability of guilt, we find it to be:

$$\frac{10}{27 + 10} = \frac{10}{37} \approx 0.27$$

This posterior probability of guilt is wildly inadequate for a verdict of guilt beyond reasonable doubt, which is the verdict the actual jury reached. Granted, this calculation is based on my own personal assignments of likelihoods for two pieces of evidence. This is, however, better than the purely intuition-based decision-making exhibited by the jury, as my values were combined rationally through Bayes’ Theorem.

Implications and Reflection

After conviction, the case went to appeal, in which Fenton & Neil detail that the Appeal Court claimed that “the introduction of Bayes’ Theorem, or any similar method, into a criminal trial plunges the jury into inappropriate and unnecessary realms of theory and complexity” [3]. They argued that juries were meant to “evaluate evidence and reach a conclusion not by means of a formula, mathematical or otherwise, but by the joint application of their individual common sense and knowledge of the world to the evidence before them”.

The privileging of “individual common sense” over mathematical methods to account for evidence highlights how people, even judges, are more comfortable relying on intuition than on formulas, which may be confusing at times. Upon my encounter with these two cases, I realised that the Appeal Court failed to recognise that one’s “common sense and knowledge” gathered from everyday experiences doesn’t equip one with the necessary skills to understand and manipulate probabilities. So far I have demonstrated that it is extremely easy, and even intuitive at times, to commit the prosecutor’s fallacy, to incorrectly assume independence of events and to fail to compare the probability of guilt to that of innocence. Despite the Appeal Court’s comments on the apparent insignificance of Bayes’ Theorem, my illustration of its use in this specific case, combined with my reasoning above, demonstrates the contrary.

Conclusion

This exploration has found Bayes’ Theorem to be extremely useful in the courtroom in updating prior beliefs on guilt in light of new evidence. It has examined two real-world cases, Regina v Sally Clark and Regina v Denis John Adams, in which I have also revealed common logical fallacies that occur in the courtroom, such as the Prosecutor’s Fallacy, and shown how Bayes’ Theorem prevents people from making such fallacies. In conclusion, through bringing together the worlds of law and mathematics, I developed a greater appreciation of the power of mathematics to contribute to seemingly unrelated aspects of life.

References

[1] Dawid, Alexander. ‘Bayes’ Theorem and Weighing Evidence by Juries.’ In Swinburne, R. (ed.), Bayes’s Theorem, London, Oxford University Press, 2005, pp. 71-90. http://www.math.nmsu.edu/~jlakey/m210/dawid-paper.pdf
[2] Fenton, Norman, and Martin Neil. "Avoiding probabilistic reasoning fallacies in legal practice using Bayesian networks." Austl. J. Leg. Phil. 36 (2011): 114. http://www.eecs.qmul.ac.uk/~norman/papers/fenton_neil_prob_fallacies_3_July_09.pdf
[3] Green, P. "Letter from the President to the Lord Chancellor regarding the use of statistical evidence in court cases, 23 January 2002." The Royal Statistical Society. 2002. http://www.ucl.ac.uk/lapt/doubt/rss-2002.pdf
[4] Haese, Robert, Haese, Sandra, Haese, Michael, Mäenpää, Marjut, Humphries, Mark. Mathematics for the international student: Mathematics SL, 3rd Ed. (Adelaide: Haese Mathematics, 2012)
[5] Motivate. "The test is positive: What are the odds it’s wrong?" Motivate. 4 January 2014 <https://motivate.maths.org/content/sites/motivate.maths.org/files/PositiveTest_RvDenisJohnAdams.pdf>.
[6] Westbury, Chris. "Bayes’ For Beginners." Department of Psychology, P220 Biological Sciences Bldg., University of Alberta, Edmonton, AB, T6G 2E9, Canada. http://www.ualberta.ca/~chrisw/BayesForBeginners.pdf

Effect of surface area and volume on rate of cooling

Joshua Gereis and Victor Wu
Trinity Grammar School, Summer Hill NSW, Australia
E-mail: [email protected]

Abstract

Newton’s Law of Cooling describes the rate of cooling of an object, given its temperature, the ambient temperature, and a constant specific to the object (represented by $k$ in Newton’s Law), which we refer to as its rate of cooling. This experiment aims to find the relationship between an object’s surface area to volume ratio and its rate of cooling. We find a linear relationship between these two values.

Hypothesis

Our hypothesis was that objects with a greater surface area to volume ratio would cool more rapidly than those with a lesser ratio. We also expected there to be a linear relationship between this ratio and the rate of cooling ($k$) in Newton's Law of Cooling.
Aim
Our aim was to determine the relationship between an object's surface area to volume ratio and its rate of cooling.

Method
We used four 2.5 cm × 2.5 cm × 2.5 cm and four 1 cm × 1 cm × 1 cm iron cubes, and four 5 g, four 25 g and two 50 g brass masses, as the objects to test. We weighed each object with an electronic scale and used the known densities of the materials to calculate the volumes. Measurements were also taken to estimate the surface areas. We then heated the objects in an electric frypan filled with sand, and took them out in groups. Using an infrared thermometer, we measured the temperature of each object immediately after taking it out, and then once a minute for the next 9 minutes (10 data points in total). To do this, we tied each object with string and, upon taking it out of the pan, tied it onto a bosshead on a retort stand. When several objects were cooling at the same time, we used a separate stopwatch for each object to time the one-minute intervals.

To calculate the value of k for each object, we used the formula derived below from Newton’s Law of Cooling [1], taking each of the 45 pairs of data points for the object as the two temperatures in the derived formula. The value of k was taken to be the mean of these 45 results, and the error was their standard deviation.

$$\frac{dT}{dt} = -k\,(T - T_a)$$

$$T(t) = (T_0 - T_a)\,e^{-kt} + T_a$$

$$\frac{T(t_1) - T_a}{T(t_2) - T_a} = e^{-k(t_1 - t_2)}$$

$$\ln\frac{T(t_1) - T_a}{T(t_2) - T_a} = -k\,(t_1 - t_2)$$

$$\therefore\; k = \frac{\ln\dfrac{T(t_1) - T_a}{T(t_2) - T_a}}{t_2 - t_1}$$

Here $T_a$ is the ambient temperature and $T_0$ is the initial temperature of the object.

Results
We found a positive correlation between the surface area to volume ratio and the value of k for the objects, and the relationship appeared to be linear.

Discussion
Initially, the iron cubes were planned to be used as a control for mass, so that any effect of mass could be considered when analysing the data from the brass masses. However, we overlooked the fact that although the cubes were roughly the same shape, they had vastly different surface area to volume ratios. If mass does have an impact on the rate of cooling, then this data would need to be reanalysed. However, we believe that the surface area to volume ratio is the main factor, as the volume stores the heat energy and the surface area provides a way for the energy to escape.

The line of best fit for both graphs has a y-intercept very close to zero. This is intuitive, because in the theoretical case that there is no surface area, there would be no way for the heat to escape, and therefore the rate of cooling would be zero. This supports the presence of a linear relationship between the two variables. The large error ranges also allow for other relationships, for example a polynomial relationship that also passes through the origin; however, the linear relationship best fits the data points.

One interesting result we found from the data was that sometimes the temperature of an object would actually increase significantly, and this increase could not be accounted for by the errors in measurement. We hypothesise that this is because the surface cooled rapidly while the “core” of the object cooled much more slowly, and the energy from the core would excite the atoms on the surface, causing the observed fluctuations.

Conclusion
We conclude that the rate of cooling k of an object (made of brass or iron) is proportional to its surface area to volume ratio. However, to obtain an accurate and general result, more experimentation and analysis is needed.

Acknowledgments
We would like to acknowledge Mr. Stephen McAndrew for his guidance and supervision of the experiment, and Mr. Rocco Appio for supplying us with equipment.

References
[1] Khamsi, M.
A., Newton’s Law of Cooling, http://www.sosmath.com/diffeq/first/application/newton/newton.html, accessed 31 October 2014

New Chocolate Games

Takeru Kitagawa and Shunsuke Nakamura
Kwansei Gakuin High School, Uegahara-1-1-155, Nishinomiya City, Japan

Abstract
The authors studied a Chocolate Game that is a variant of the well-known combinatorial game of Nim, and discovered new formulas for the P-positions of the game. The chocolate studied in this paper can be cut in three directions, and each position has coordinates $(x, y, z)$, where $x, y, z$ are the maximum numbers of times we can cut the chocolate in each direction. Rectangular chocolate games are mathematically equivalent to the game of Nim with three piles, but the coordinates $(x, y, z)$ of the Chocolate Game created by the authors satisfy inequalities, and this fact makes the mathematical structure of these chocolate games different from that of Nim. The results presented in this paper were obtained by a group of high school students using the computer algebra system Mathematica.

1 Introduction
Chocolate Games are a variant of the well-known combinatorial game of Nim, and the first chocolate in Fig. 1.1 was presented in [1]. The second and the third chocolates in Fig. 1.1 were introduced and studied by the authors in [6] and in [9] respectively. In Sections 1, 2 and 3 the authors present their research on the Chocolate Game in Fig. 1.2. In Section 4 the authors present examples of chocolates that do not have simple formulas for P-positions. Chocolate Games look like the game of Chomp, but these games are mathematically different from Chomp. As to Chomp see [3]. The winning strategy for Chomp has not been discovered, but the winning strategies for many chocolates created by our mathematics group have been discovered.

Figure 1.1.

Figure 1.2.

Definition 1.1. We are given a piece of chocolate in which the light gray parts are sweet and the dark gray part is very bitter. The game is played by two players in turn. Each player breaks the chocolate (in a straight line along the grooves) and eats the piece he breaks off. The player who leaves his opponent with the single bitter part (so that his opponent has no move) is the winner.

For examples of the Chocolate Game see Fig. 1.1 and Fig. 1.2; the main topic of this paper is the chocolate in Fig. 1.2. The structure of the Chocolate Game will be very clear if you download the authors’ paper [10]. You can play the Chocolate Game with the free Mathematica player.

Here we define two important kinds of positions of chocolates.

Definition 1.2. (a) N-positions are positions from which we can force a win, as long as we play correctly at every stage. (b) P-positions are positions from which we will lose however well we play, although we may end up winning if our opponent makes a mistake.

We define coordinates for each position of the Chocolate Game. Let $\mathbb{Z}_{\ge 0}$ be the set of non-negative integers.

Definition 1.3. We represent the chocolate with coordinates $(x, y, z)$, where $x, y, z$ stand for the maximum numbers of times that we can cut the chocolate in each direction.

Example 1.1. In the chocolate of Fig. 1.2 we can cut at most 6 times vertically on the left side of the dark gray (bitter) block, at most 6 times horizontally above the dark gray block, and at most 12 times vertically on the right side of the dark gray block. Therefore $x = 6$, $y = 6$ and $z = 12$, and we represent this chocolate with the coordinates $(6, 6, 12)$.

For examples of coordinates see Fig. 1.3. It is clear that the coordinates of these positions satisfy the inequality $2y \le z$, and this is equivalent to the inequality

$$y \le \left\lfloor \frac{z}{2} \right\rfloor \qquad (1.1)$$

where $\lfloor\ \rfloor$ is the floor function. The Chocolate Games made by the authors are new, since these games must satisfy inequalities.

Figure 1.3.

Figure 1.4.

Our aim is to find the formula for the P-positions of the chocolate of Fig. 1.2, and the first step is to study its right part, which is presented in Fig. 1.4. Our tool is the Grundy number. For the detailed theory of the Grundy number see [2]. First we define move for this chocolate.

Definition 1.4. We define move for the chocolate in Fig. 1.4. For $y, z \in \mathbb{Z}_{\ge 0}$ with $2y \le z$ we define

$$move((y, z)) = \{(v, z) : 0 \le v < y\} \cup \{(\min(y, \lfloor w/2 \rfloor), w) : 0 \le w < z\},$$

where $v, w \in \mathbb{Z}_{\ge 0}$. $move((y, z))$ is the set of all positions that can be reached from $(y, z)$ directly.

Remark 1.1. By Definition 1.4

$$move((y, z)) = \{(v, z) : 0 \le v < y\} \cup \{(\lfloor w/2 \rfloor, w) : \lfloor w/2 \rfloor < y \text{ and } 0 \le w < z\} \cup \{(y, w) : \lfloor w/2 \rfloor \ge y \text{ and } 0 \le w < z\}.$$

Example 1.2. $move((1, 3)) = \{(0, 3), (1, 2), (0, 1), (0, 0)\}$.

We define the function $Mex(A)$ for a set $A$ of non-negative integers.

Definition 1.5. Let $Mex(A)$ be the least non-negative integer not in the set $A$.

Example 1.3. $Mex(\{0, 1, 4, 5, 6\}) = 2$ and $Mex(\{1, 4, 5, 6\}) = 0$.

We define the Grundy Number $G$ for the chocolate in Fig. 1.4.

Definition 1.6. Let $G((0, 0)) = 0$. For a position $(y, z)$ we define the Grundy Number recursively by

$$G((y, z)) = Mex(\{G((v, w)) : (v, w) \in move((y, z))\}).$$

2 Some Lemmas for Nim-sum

Definition 2.1. Let $x, y$ be non-negative integers, and write them in base 2, so $x = \sum_{i=0}^{n} x_i 2^i$ and $y = \sum_{i=0}^{n} y_i 2^i$ with $x_i, y_i \in \{0, 1\}$. We define the nim-sum $x \oplus y$ by

$$x \oplus y = \sum_{i=0}^{n} w_i 2^i, \qquad (2.1)$$

where $w_i = x_i + y_i \pmod 2$.

Lemma 2.1. We suppose that

$$x \oplus y > z. \qquad (2.2)$$

Then we have $x > y \oplus z$ or $y > z \oplus x$.

Proof. We write $x, y, z$ in base 2, so $x = \sum_{i=0}^{n} x_i 2^i$, $y = \sum_{i=0}^{n} y_i 2^i$ and $z = \sum_{i=0}^{n} z_i 2^i$ with $x_i, y_i, z_i \in \{0, 1\}$. Suppose that for $i = m + 1, m + 2, ..., n$

$$x_i + y_i + z_i = 0 \pmod 2 \qquad (2.3)$$

and

$$x_m + y_m + z_m \ne 0 \pmod 2. \qquad (2.4)$$

Then by (2.3) we have for $i = m + 1, m + 2, ..., n$

$$x_i + y_i = z_i \pmod 2, \qquad (2.5)$$

$$y_i + z_i = x_i \pmod 2 \qquad (2.6)$$

and

$$z_i + x_i = y_i \pmod 2. \qquad (2.7)$$

By (2.2), (2.4) and (2.5) we have $x_m + y_m = 1 > 0 = z_m$. If $x_m = 1$ and $y_m = 0$, then by (2.6) we have $x > y \oplus z$. If $x_m = 0$ and $y_m = 1$, then by (2.7) we have $y > z \oplus x$.

Lemma 2.2. For $y, z \in \mathbb{Z}_{\ge 0}$

$$y \oplus z = Mex(\{v \oplus z : v = 0, 1, 2, ..., y-1\} \cup \{y \oplus w : w = 0, 1, 2, ..., z-1\}). \qquad (2.8)$$

Proof. Clearly $y \oplus z$ does not belong to $\{v \oplus z : v = 0, 1, 2, ..., y-1\} \cup \{y \oplus w : w = 0, 1, 2, ..., z-1\}$. Let $k$ be a non-negative integer such that $y \oplus z > k$; then by Lemma 2.1 we have $z > y \oplus k$ or $y > k \oplus z$. If $z > y \oplus k$, then $k = (y \oplus y) \oplus k = y \oplus (y \oplus k) \in \{y \oplus w : w = 0, 1, 2, ..., z-1\}$. Note that $y \oplus k = w$ for some $0 \le w < z$. If $y > k \oplus z$, then $k = k \oplus (z \oplus z) = (k \oplus z) \oplus z \in \{v \oplus z : v = 0, 1, 2, ..., y-1\}$. Therefore any non-negative integer smaller than $y \oplus z$ belongs to $\{v \oplus z : v = 0, 1, 2, ..., y-1\} \cup \{y \oplus w : w = 0, 1, 2, ..., z-1\}$. By the definition of $Mex$ (Definition 1.5) the lemma follows.

Lemma 2.3. For any odd number $m$ there exist a non-negative integer $a$ and a natural number $t$ such that

$$m = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i. \qquad (2.9)$$

Proof. Let $m$ be an odd number, and write it in base 2, so $m = \sum_{i=0}^{n} m_i 2^i$. If $m_i \ne 0$ for each $i = 0, 1, 2, ..., n$, then let $t = n + 1$ and $a = 0$, and we have (2.9). If there exists $i$ such that $m_i = 0$, then let $t = \min\{i : m_i = 0\}$. Since $m$ is odd, $t > 0$. Let $a = \sum_{i=t+1}^{n} m_i 2^{i-t-1}$. Then we have (2.9).

Lemma 2.4. Let $x$ be an arbitrary odd number, and $x = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i$ for some natural numbers $a, t$. For a natural number $c$ let

$$A(a, t, c) = \{x \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\}$$

and

$$B(a, t, c) = \{(x + 1) \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\}.$$

Then the set $A(a, t, c)$ is the same as the set $B(a, t, c)$.

Proof.
An arbitrary element of $A(a, t, c)$ can be expressed as $x \oplus (z + 2^{t+1} \times c)$ for some integer $z$ such that $0 \le z \le \sum_{i=0}^{t} 2^i$, and we express $z$ by $z = \sum_{i=0}^{t} z_i 2^i$. Then

$$\begin{aligned}
x \oplus (z + 2^{t+1} \times c)
&= \Big(2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i\Big) \oplus \Big(\sum_{i=0}^{t} z_i 2^i + 2^{t+1} \times c\Big) \\
&= (2^{t+1} \times a) \oplus \Big(z_t 2^t + \sum_{i=0}^{t-1} (1 - z_i) 2^i + 2^{t+1} \times c\Big) \\
&= (2^{t+1} \times a + 2^t) \oplus \Big(\sum_{i=0}^{t} (1 - z_i) 2^i + 2^{t+1} \times c\Big) \\
&= (x + 1) \oplus \Big(\sum_{i=0}^{t} (1 - z_i) 2^i + 2^{t+1} \times c\Big). \qquad (2.10)
\end{aligned}$$

Since $0 \le \sum_{i=0}^{t} (1 - z_i) 2^i \le \sum_{i=0}^{t} 2^i$, by Equation (2.10) we have $x \oplus (z + 2^{t+1} \times c) \in \{(x + 1) \oplus (z + 2^{t+1} \times c) : z = 0, 1, 2, ..., \sum_{i=0}^{t} 2^i\} = B(a, t, c)$. The function $z \mapsto \sum_{i=0}^{t} (1 - z_i) 2^i$ is a one-to-one mapping of $\{0, 1, 2, ..., \sum_{i=0}^{t} 2^i\}$ onto $\{0, 1, 2, ..., \sum_{i=0}^{t} 2^i\}$. Therefore we have the conclusion of this lemma.

Lemma 2.5. Let $m$ be a natural number. Then the set $\{m \oplus z : z = 0, 1, 2, ..., 2m + 1\}$ is the same as the set $\{(m + 1) \oplus z : z = 0, 1, 2, ..., 2m + 1\}$.

Proof. Let $m$ be an arbitrary odd number; then by Lemma 2.3 $m = 2^{t+1} \times a + \sum_{i=0}^{t-1} 2^i$ for a non-negative integer $a$ and a natural number $t$. Then $2m + 1 = 2^{t+1} \times (2a) + 2\big(\sum_{i=0}^{t-1} 2^i\big) + 1 = 2^{t+1} \times (2a) + \sum_{i=0}^{t} 2^i$. Then by using Lemma 2.4 for $c = 0, 1, ..., 2a$ we have

$$\begin{aligned}
&\{m \oplus z : z = 0, 1, 2, ..., 2m + 1\} \\
&= \Big\{m \oplus z : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i,\ 2^{t+1}, ..., 2^{t+1} + \sum_{i=0}^{t} 2^i,\ 2 \times 2^{t+1}, ..., 2 \times 2^{t+1} + \sum_{i=0}^{t} 2^i,\ ...,\ (2a) \times 2^{t+1}, ..., (2a) \times 2^{t+1} + \sum_{i=0}^{t} 2^i\Big\} \\
&= \Big\{m \oplus z : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \cup \Big\{m \oplus (z + 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \\
&\qquad \cup \Big\{m \oplus (z + 2 \times 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \cup ... \cup \Big\{m \oplus (z + 2a \times 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \\
&= A(a, t, 0) \cup A(a, t, 1) \cup A(a, t, 2) \cup ... \cup A(a, t, 2a) \\
&= B(a, t, 0) \cup B(a, t, 1) \cup B(a, t, 2) \cup ... \cup B(a, t, 2a) \\
&= \Big\{(m + 1) \oplus z : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \cup \Big\{(m + 1) \oplus (z + 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \\
&\qquad \cup \Big\{(m + 1) \oplus (z + 2 \times 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \cup ... \cup \Big\{(m + 1) \oplus (z + 2a \times 2^{t+1}) : z = 0, 1, 2, ..., \textstyle\sum_{i=0}^{t} 2^i\Big\} \\
&= \{(m + 1) \oplus z : z = 0, 1, 2, ..., 2m + 1\}.
\end{aligned}$$
(2.11)

Next we assume that $m$ is even, and let $m = 2n$. Then $(2n) \oplus (2i) = (2n + 1) \oplus (2i + 1)$ and $(2n) \oplus (2i + 1) = (2n + 1) \oplus (2i)$ for $i = 0, 1, 2, ..., m$. Therefore the set $\{m \oplus z : z = 0, 1, 2, ..., 2m + 1\}$ is the same as the set $\{(m + 1) \oplus z : z = 0, 1, 2, ..., 2m + 1\}$.

Lemma 2.6. For any natural number $m$ we have

$$\{\lfloor z/2 \rfloor \oplus z : z = 0, 1, 2, 3, ..., 2m + 1\} = \{(m + 1) \oplus z : z = 0, 1, 2, 3, ..., 2m + 1\}. \qquad (2.12)$$

Proof. We prove by mathematical induction. For $m = 1$ both sides of Equation (2.12) are equal to $\{0, 1, 2, 3\}$. Suppose that Equation (2.12) is true for $m = k$. We prove Equation (2.12) for $m = k + 1$. By the hypothesis of mathematical induction and the fact that $\lfloor z/2 \rfloor = k + 1$ for $z = 2k + 2, 2k + 3$ we have

$$\begin{aligned}
&\{\lfloor z/2 \rfloor \oplus z : z = 0, 1, 2, 3, ..., 2(k + 1) + 1\} \\
&= \{\lfloor z/2 \rfloor \oplus z : z = 0, 1, 2, 3, ..., 2k + 1\} \cup \{\lfloor z/2 \rfloor \oplus z : z = 2k + 2, 2k + 3\} \\
&= \{(k + 1) \oplus z : z = 0, 1, 2, 3, ..., 2k + 1\} \cup \{(k + 1) \oplus (2k + 2), (k + 1) \oplus (2k + 3)\} \\
&= \{(k + 1) \oplus z : z = 0, 1, 2, 3, ..., 2k + 1, 2k + 2, 2k + 3\},
\end{aligned}$$

which is equal to the set $\{(k + 2) \oplus z : z = 0, 1, 2, 3, ..., 2k + 1, 2k + 2, 2k + 3\}$ by Lemma 2.5. Therefore Equation (2.12) is true for $m = k + 1$, and the proof is finished by mathematical induction.

3 P-positions of Chocolate Game

Lemma 3.1. $G((y, z)) = y \oplus z$ for any $y, z \in \mathbb{Z}_{\ge 0}$ with $2y \le z$.

Proof. By mathematical induction on a natural number $n$ we prove

$$G((y, z)) = y \oplus z \qquad (3.1)$$

for $y, z \in \mathbb{Z}_{\ge 0}$. We suppose (3.1) for $y, z \in \mathbb{Z}_{\ge 0}$ with $y + z < n$ and prove (3.1) for $y, z \in \mathbb{Z}_{\ge 0}$ with $y + z = n$. Let $y, z \in \mathbb{Z}_{\ge 0}$ satisfy $y + z = n$. By the definition of move, the definition of the Grundy Number $G$, Remark 1.1 and the hypothesis of mathematical induction

$$\begin{aligned}
G((y, z)) &= Mex(\{G((v, w)) : (v, w) \in move((y, z))\}) \\
&= Mex(\{G((v, z)) : v = 0, 1, ..., y - 1\} \cup \{G((y, w)) : \lfloor w/2 \rfloor \ge y \text{ and } 0 \le w < z\} \\
&\qquad \cup \{G((\lfloor w/2 \rfloor, w)) : \lfloor w/2 \rfloor < y \text{ and } 0 \le w < z\}) \\
&= Mex(\{v \oplus z : v = 0, 1, ..., y - 1\} \cup \{y \oplus w : \lfloor w/2 \rfloor \ge y \text{ and } 0 \le w < z\} \\
&\qquad \cup \{\lfloor w/2 \rfloor \oplus w : \lfloor w/2 \rfloor < y \text{ and } 0 \le w < z\}) \\
&= Mex(\{v \oplus z : v = 0, 1, ..., y - 1\} \cup \{y \oplus w : w = 2y, 2y + 1, ..., z - 1\} \\
&\qquad \cup \{\lfloor w/2 \rfloor \oplus w : w = 0, 1, 2, ..., 2y - 1\}). \qquad (3.2)
\end{aligned}$$

Note that in the last equation we use the fact that $2y \le z$ to show that $\lfloor w/2 \rfloor < y$ and $w < z$ if and only if $w = 0, 1, 2, ..., 2y - 1$. By Lemma 2.6

$$\begin{aligned}
&\{y \oplus w : w = 2y, 2y + 1, ..., z - 1\} \cup \{\lfloor w/2 \rfloor \oplus w : w = 0, 1, 2, ..., 2y - 1\} \\
&= \{y \oplus w : w = 2y, 2y + 1, ..., z - 1\} \cup \{y \oplus w : w = 0, 1, 2, ..., 2y - 1\} \\
&= \{y \oplus w : w = 0, 1, 2, ..., z - 1\}. \qquad (3.3)
\end{aligned}$$

By Lemma 2.2 and equations (3.2) and (3.3), $G((y, z)) = y \oplus z$.

By using Lemma 3.1 we make a theorem for the P-positions of the Chocolate Game that satisfies the inequality $y \le \lfloor z/2 \rfloor$. We need some theorems to do that.

Theorem 3.1. Let $G_X$ be the Grundy Number of an arbitrary combinatorial game $X$ with a position $x$. Then the position $x$ is a P-position of the game if and only if $G_X(x) = 0$.

This is a well-known theorem of combinatorial game theory. For a proof see Theorem 7.12 (p.138) of [2].

We need a theorem on the Grundy Number of the sum of two games.

Definition 3.1. Let $X$ and $Y$ be two arbitrary combinatorial games. The sum of the games $X$ and $Y$ is a game where each player may choose to play either in $X$ or in $Y$ at any point in the game, and a player wins when his opponent has no move in either game. We denote by $X \oplus Y$ the sum of $X$ and $Y$, and by $\{x, y\}$ the sum of the position $x$ in the game $X$ and the position $y$ in the game $Y$.

Theorem 3.2. Let $G_X(x)$, $G_Y(y)$ and $G_{X \oplus Y}(\{x, y\})$ be the Grundy Numbers of the games $X$, $Y$ and $X \oplus Y$. Then $G_{X \oplus Y}(\{x, y\}) = G_X(x) \oplus G_Y(y)$.

This is a well-known theorem of combinatorial game theory. For a proof see Theorem 7.24 (p.142) of [2].

Figure 3.1.

We define the Grundy number for the chocolate in Fig. 3.1. First we define move1 for it.

Definition 3.2. For $x \in \mathbb{Z}_{\ge 0}$ we define $move1((x)) = \{(u) : 0 \le u < x\}$, where $u \in \mathbb{Z}_{\ge 0}$.

Definition 3.3. Let $G1((0)) = 0$. For a position $(x)$ we define the Grundy Number $G1$ recursively by $G1((x)) = Mex(\{G1((u)) : (u) \in move1((x))\})$.

Lemma 3.2. $G1((x)) = x$ for any $x \in \mathbb{Z}_{\ge 0}$.

Proof. Since $move1((x)) = \{(x - 1), (x - 2), ..., (0)\}$, this is clear from Definition 3.3.

Theorem 3.3. Let $G2$ be the Grundy number of the chocolate in Fig. 1.2. Then $G2((x, y, z)) = x \oplus y \oplus z$.

Proof. By Theorem 3.2, $G2((x, y, z)) = G1((x)) \oplus G((y, z))$, where $G$ and $G1$ are the Grundy numbers of the chocolates in Fig. 1.4 and Fig. 3.1, and hence by Lemma 3.1 and Lemma 3.2 we have $G2((x, y, z)) = x \oplus y \oplus z$.

Definition 3.4. Let

$$A2 = \{(x, y, z) : x, y, z \in \mathbb{Z}_{\ge 0},\ y \le \lfloor z/2 \rfloor \text{ and } x \oplus y \oplus z = 0\},$$

$$B2 = \{(x, y, z) : x, y, z \in \mathbb{Z}_{\ge 0},\ y \le \lfloor z/2 \rfloor \text{ and } x \oplus y \oplus z \ne 0\}.$$

Theorem 3.4. Let $A2$ and $B2$ be the sets defined in Definition 3.4. $A2$ is the set of P-positions, and $B2$ is the set of N-positions, of the Chocolate Game that satisfies the inequality $y \le \lfloor z/2 \rfloor$.

Proof. By Theorem 3.3 $G2((x, y, z)) = x \oplus y \oplus z$, and hence by Theorem 3.1 $(x, y, z)$ is a P-position if and only if $x \oplus y \oplus z = 0$. Consequently $(x, y, z)$ is an N-position if and only if $x \oplus y \oplus z \ne 0$.

Remark 3.1. Theorem 3.4 can be generalized to the case of the chocolate that satisfies the inequality $y \le \lfloor z/k \rfloor$ for an arbitrary even number $k$.

4 Chocolate Games without Simple Formula for P-positions

By Theorem 3.4 the chocolate game that we study in Sections 1, 2 and 3 has a very simple formula for P-positions, and someone may think that all chocolate games have a simple formula for P-positions. In this section the authors present examples of chocolates without a simple formula for P-positions.

For the chocolate in Fig. 4.1 no formula for P-positions is known. As to the calculation by computer see [11].

Some chocolates have formulas for P-positions, but they are not simple. One of these is the first chocolate in Fig. 4.2. The authors studied the right part of this chocolate, which is the second chocolate in Fig. 4.2, and made the table of Grundy Numbers in Fig. 4.3. There seem to be some kinds of patterns in these numbers, but the patterns are not simple and there is no relation between these numbers and nim-sum. As to the detailed study of this chocolate see [9]. This chocolate satisfies the inequality $y \le z$.

Figure 4.1.

Figure 4.2.

 y \ z |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 ------+------------------------------------------------
   0   |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   1   |     2  1  4  3  6  5  8  7 10  9 12 11 14 13 16
   2   |        3  1  5  4  7  6  9  8 11 10 13 12 15 14
   3   |           5  1  7  4  9  6 11  8 13 10 15 12 17
   4   |              6  1  8  4 10  7 12  9 14 11 16 13
   5   |                 8  1 10  4 12  7 14  9 16 11 18
   6   |                    9  1 11  4 13  7 15 10 17 12
   7   |                      11  1 13  4 15  7 17 10 19
   8   |                         12  1 14  4 16  7 18 10
   9   |                            14  1 16  4 18  7 20
  10   |                               15  1 17  4 19  7
  11   |                                  17  1 19  4 21
  12   |                                     18  1 20  4
  13   |                                        20  1 22
  14   |                                           21  1
  15   |                                              23

Figure 4.3.

References

[1] A. C. Robin, A poisoned chocolate problem, Problem corner, The Mathematical Gazette, Vol. 73, No. 466 (Dec., 1989), pp. 341-343 and Vol. 74, No. 468 (Jun., 1990), pp. 171-173.

[2] Michael H. Albert, Richard J. Nowakowski and David Wolfe, Lessons In Play, A K Peters.

[3] Weisstein, Eric W. "Chomp." From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/Chomp.html

[4] Weisstein, Eric W. "Nim-Value." From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NimValue.html

[5] Weisstein, Eric W. "Nim." From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/Nim.html

[6] M. Naito, T. Inoue, R. Miyadera, Discrete Mathematics and Computer Algebra System, The Joint Conference of ASCM 2009 and MACIS 2009, COE Lecture Note Vol. 22, Kyushu University. A PDF file of the paper is available at http://gcoemi.jp/english/publish list/pub inner/id:2/cid:10

[7] R. Miyadera, T. Inoue, W. Ogasa and S. Nakamura, Chocolate Games that are variants of the Game of Nim, Journal of Information Processing, Information Processing Society of Japan, 53(6), pp. 1582-1591, 2012 (in Japanese).

[8] M. Naito, D. Minematsu, R. Miyadera et al., Combinatorial Games and Beautiful Graphs Produced by them, Visual Mathematics, Volume 11, No. 3, 2009. http://www.mi.sanu.ac.rs/vismath/miyaderasept2009/index.html

[9] S. Nakamura, D. Minematsu, T. Kitagawa, R. Miyadera et al., Chocolate games that are variants of nim and interesting graphs made by these games, Visual Mathematics, Volume 14, No. 2, 2012. http://www.mi.sanu.ac.rs/vismath/miyaderasept2012/index.html

[10] R. Miyadera, S. Nakamura, Y. Okada, R. Hanafusa and T. Ishikawa, Chocolate Games -How High School Students Discovered New Formulas Using Mathematica-, Mathematica Journal, Volume 15, 2013.

[11] MathPuzzle.com. (Submitted by R. Miyadera) "The Bitter Chocolate Problem." Material added 8 Jan 06 (Happy New Year). www.mathpuzzle.com/26Feb2006.html
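The recursive definitions used in this paper (Mex in Definition 1.5, move in Definition 1.4 and the Grundy Number in Definition 1.6) can be checked by brute force. The following is a minimal Python sketch, not the authors' Mathematica code. The function names and the parameter `k` are illustrative assumptions: `k = 2` is exactly Definition 1.4, while other values of `k` assume, without support from the text above, that the chocolate with $y \le \lfloor z/k \rfloor$ has the analogous move rule. Lemma 3.1 is only tested on a finite range here, which is a sanity check and not a proof.

```python
from functools import lru_cache


def mex(values):
    """Least non-negative integer not in `values` (Definition 1.5)."""
    seen = set(values)
    n = 0
    while n in seen:
        n += 1
    return n


def moves(y, z, k=2):
    """Positions reachable from (y, z), where y <= z // k.

    k = 2 is Definition 1.4; any other k is an assumed analogue.
    """
    reduce_y = [(v, z) for v in range(y)]
    reduce_z = [(min(y, w // k), w) for w in range(z)]
    return reduce_y + reduce_z


@lru_cache(maxsize=None)
def grundy(y, z, k):
    """Grundy Number by the recursion of Definition 1.6."""
    return mex(grundy(v, w, k) for (v, w) in moves(y, z, k))


# Lemma 3.1 on a finite range: G((y, z)) = y XOR z whenever 2y <= z.
lemma_3_1_holds = all(
    grundy(y, z, 2) == y ^ z for z in range(40) for y in range(z // 2 + 1)
)

# P-positions (x, y, z) of Theorem 3.4 with z <= 12: x XOR y XOR z == 0,
# so x is determined by y and z.
p_positions = [(y ^ z, y, z) for z in range(13) for y in range(z // 2 + 1)]
```

Python's `^` operator is the bitwise exclusive or, which coincides with the nim-sum of Definition 2.1. Under the assumed rule for `k = 1`, `grundy(1, 1, 1)` is 2 and `grundy(3, 3, 1)` is 5, which agrees with the corresponding entries of the table in Fig. 4.3.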