2017 Contribution of Prof. R.A. Fisher to Statistics
VIDYA SAGAR (ROLL NO.: 1875)
DEPARTMENT OF STATISTICS
5/5/2017
While writing this dissertation, a few people became an inseparable part of it. All the people I thank below were my strength during the writing. I thank Prof. Dr. Rameshwar Nath Mishra (Head of the Department of Statistics, Patna University) and Prof. Dr. Srikant Singh (Department of Statistics, Patna University), who gave me the golden opportunity to do this wonderful dissertation on the topic 'Some Contributions of Prof. Ronald Aylmer Fisher to Statistics', which also helped me do a great deal of research and come to know many new things. I thank Prof. Amendra Mishra, Prof. Anchala Kumari and the other professors who have helped and guided me from the first day of my college life and made me able to write this dissertation. I thank the founders of Google, Larry Page and Sergey Brin, whose search engine provided me with so much information and so many photographs of R.A. Fisher. Finally, I thank my parents, who have supported me from birth, and my friends, who stood by me in all the phases I went through while thinking about, writing and improving this dissertation, and who helped me a great deal in finalizing it within the limited time frame.
-----------------------------------------------------------------------------------------------------
Signature
CONTENTS
1. Introduction
2. Beginnings
3. Academic Career
4. R.A. Fisher's Timeline and Scientific Achievements
5. Contributions
6. Tobacco, Lung Cancer and Fisher's Biggest Mistake
7. Books
8. Some Personal Details and the End
Introduction
Lived 1890 – 1962. Sir Ronald Fisher F.R.S. (1890–1962) was one of the leading scientists of the 20th century, and he laid the foundations of modern statistics. As a statistician working at the Rothamsted Agricultural Experiment Station, the oldest agricultural research institute in the United Kingdom, he also made major contributions to evolutionary biology and genetics.
The concepts of randomization and the analysis of variance procedures that he introduced are now used throughout the world. In 1922 he gave a new definition of statistics. In addition to being probably the greatest statistician ever, he also invented experimental design and was one of the principal founders of population genetics. He unified the disconnected concepts of natural selection and Mendel's rules of inheritance. In quantitative biology the importance of his book Statistical Methods for Research Workers has been likened to that of Isaac Newton's Principia in physics. According to geneticist and author Richard Dawkins, Fisher was the greatest biologist since Charles Darwin.
Beginnings
Ronald Aylmer Fisher was born into a wealthy family in London, England, on February 17, 1890. He was the second-born of twins; his elder twin was stillborn. His father, George Fisher, was an enormously successful fine-arts dealer who ran an auction company ranking in importance with Sotheby's or Christie's. His mother, Katie Heath, was a lawyer's daughter. Ronald's parents could afford the best private schooling for him, but his life of abundance was temporary. His mother died of peritonitis when he was 14, and, when he was 15, his father's business folded. The family moved from a luxurious mansion in one of the richest parts of London – Hampstead – to a small house in one of the poorer parts – Streatham. Ronald continued to be educated at Harrow School; not because his father could afford the very high fees, but because Ronald was a brilliant student and was awarded scholarships. One of his masters later commented that, of all the students he had taught, Ronald was uniquely brilliant. In addition to his family's misfortunes, Ronald was hampered by a personal disability – his appalling short-sightedness. His eyesight was so bad that he was not allowed to read under electric light because it strained his eyes too much.
This particular cloud, however, seems to have had a silver lining, shifting his perspective on mathematics. He learned to visualize problems in his mind's eye and solve them in his head rather than on paper.
Academic Career of Prof. R.A. Fisher
Fisher viewed himself as a scientist, especially interested in biology. Despite this, he did not enjoy learning the intricacies and names of biological structures. He decided to study mathematics, believing that it was through mathematics that he could make the greatest contributions to biology. In 1909, at the age of 19, he won a scholarship to the University of Cambridge. Three years later he graduated with first-class honours in mathematics. Although he was clearly a brilliant mathematician, his tutors were dubious about his future. They were worried that in mathematics he tended to 'see' the correct answer and write it down, rather than go through the usual processes of calculation and proof. After graduating, Fisher spent a further year at Cambridge studying postgraduate-level physics, including the theory of errors, a topic which heightened his interest in statistics. In 1911, while still an undergraduate, Fisher formed Cambridge University's Eugenics Society, which attracted a number of prominent members. Charles Darwin's son Leonard lectured to the society in 1912, and he and Fisher became firm, lifelong friends. It was Fisher's interest in eugenics that first prompted him to look at the genetics of a population, leading him to found – along with J. B. S. Haldane and Sewall Wright – the new science of population genetics. In 1919 he was offered a position at the Galton Laboratory in University College London, led by Karl Pearson, but instead accepted a temporary job at Rothamsted Research in Harpenden to investigate the possibility of analysing the vast amount of data accumulated since 1842 from the "Classical Field Experiments". There he analysed the data recorded over many years and published Studies in Crop Variation in 1921.
Ronald Fisher's Timeline and Main Scientific Achievements
1912 – Published his first paper, in which he created the method of maximum likelihood. He continued refining this method for 10 years.
1912 – Established the principle that the sample mean differs from the population mean.
1913 – Worked as a statistician for an insurance company and trained in Britain's Territorial Army.
1914 – Volunteered for the British Army at the start of World War 1 and was rejected because of his poor eyesight. This may have saved him from a similar fate to Henry Moseley, who volunteered successfully.
1914 – Became a high school mathematics and physics teacher. He soon found he did not enjoy teaching, but had to stick with it in the absence of any other way of earning a living.
1918 – With financial help from Leonard Darwin he published a landmark paper that founded quantitative genetics: The Correlation Between Relatives on the Supposition of Mendelian Inheritance. In it Fisher also introduced the concept of variance for the first time. The paper had been delayed since 1916 because referees had difficulty understanding it. Geneticist James Crow likened this paper, written when Fisher was a high school teacher, to Albert Einstein's great papers published when he was working in a patent office.
Fisher's new ideas and his mathematical approach to biological questions were often met with incomprehension and sometimes downright resistance. Fisher had a quick temper and became involved in some rather bitter feuds. His ability to 'see' the answers to complex mathematics problems was both a blessing – he made outstanding progress – and a curse – people could often not follow the logic of his arguments. The books he would later write were landmarks in biology and statistics, but often had to be explained by more 'user-friendly' scientists before they became widely understood.
1919 – Became a statistician at Rothamsted Experimental Station in Harpenden, England, working in agricultural research. Here he had access to a huge amount of biological data collected since 1842. He applied his mathematical genius to the data, enabling him to invent the tools of modern experimental design.
1921 – Created the statistical method of analysis of variance (ANOVA) and introduced the concept of likelihood.
1924 – Created the F distribution.
1925 – Released his book Statistical Methods for Research Workers. At the time of its publication it received no positive reviews, yet it was soon to revolutionize statistics and biology. Geneticist and mathematician Alan Owen wrote that this book occupies a position in quantitative biology similar to Isaac Newton's Principia in physics.
1929 – Elected to the Royal Society, joining the United Kingdom's scientific elite.
1930 – Released his book The Genetical Theory of Natural Selection, unifying the theory of natural selection with Mendel's laws of inheritance, defining the new field of population genetics and revitalizing the concept of sexual selection. Fisher introduced a large number of vital new concepts in this book, including: the inverse relationship between the magnitude of a mutation and the likelihood of the mutation increasing an organism's fitness; parental investment; Fisher's principle; the sexy son hypothesis; and the heterozygote advantage. Geneticist James Crow described this book as "the natural successor to The Origin of Species."
1933 – Appointed Professor of Eugenics at University College London. While working in this role he studied the genetics of human blood groups, explained the Rhesus system and established the Fisher–Race notation, still used today, for Rhesus phenotypes and genotypes. Fisher's reputation as a lecturer was reminiscent of Willard Gibbs's: his lecture courses usually lost most of the students who started them.
Only a small band of the smartest students could stick with his brilliant but challenging ideas.
1935 – Released his book The Design of Experiments, introducing the concept of a null hypothesis.
1936 – Although his work was decisive in unifying natural selection with Mendel's laws of inheritance, Fisher's careful statistical analysis of Mendel's data suggested all was not well. Mendel's results showed too few random errors to have come from real experiments. Nearly all of Mendel's data showed an unnatural bias. Fisher wrote: "Although no explanation can be expected to be satisfactory, it remains a possibility among others that Mendel was deceived by some assistant who knew too well what was expected. This possibility is supported by independent evidence that the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel's expectations." (Has Mendel's work been rediscovered? Annals of Science 1: 115 – 117, 1936.) Fisher's analysis said there was only a 1-in-2000 chance that Mendel's results were the fully reported results of real experiments. Mendel's results and conclusions, however, were correct.
1939 – University College London closed down the Eugenics Department and Fisher returned to the Rothamsted Experimental Station.
1943 – Appointed to the Balfour Chair of Genetics at the University of Cambridge.
1952 – Knighted by Queen Elizabeth, becoming Sir Ronald Aylmer Fisher.
1955 – Awarded the Royal Society's Copley Medal, one of the greatest prizes in science. Previous recipients included Benjamin Franklin, Alessandro Volta, Michael Faraday, Robert Bunsen, Charles Darwin, Willard Gibbs, Dmitri Mendeleev, Alfred Russel Wallace, J. J. Thomson, Ernest Rutherford, Albert Einstein, and James Chadwick.
1957 – Retired from his chair at Cambridge, but continued working there for two years.
1959 – Moved to Adelaide, Australia, to do research work with E. A. Cornish at CSIRO. Now 69 years old, one of his main reasons for moving was that he enjoyed the warm, sunny climate of South Australia. By the end of his career, Fisher had written 7 books and almost 400 academic papers devoted to statistics.
Some Contributions of R.A. Fisher to Statistics
The contributions of Sir Ronald Aylmer Fisher to the discipline of statistics are multifarious, profound and long-lasting. In fact, he can be regarded as having laid the foundations of statistics as a science, and he is often dubbed the 'father of statistics'. He contributed both to the mathematical theory of statistics and to its applications, especially to agriculture and the design of experiments therein. His contributions to statistics are so many that it is not possible to mention them all in this short account; we therefore confine our attention to discussing what we regard as the more important among them. Fisher provided a unified and general theory for the analysis of data and laid the logical foundations for inductive inference. Fisher regarded statistical methods from the point of view of applications. Since he was always involved in solving biological problems which needed statistical methods, he himself developed a large body of methods, many of which have become standard tools in a statistician's repertoire. Some of these developments required rather deep mathematical work, and it was characteristic of Fisher to use elegant geometrical arguments in the derivation of his results. An excellent example of this type is his derivation of the sampling distribution of the correlation coefficient. Fisher had poor eyesight even at a young age, which prevented private reading; he relied largely on being read to, which in turn involved doing mathematics without pencil, paper and other such visual aids.
It is believed that this situation helped him develop a keen geometrical sense. Fisher's contributions to statistics have also given rise to a number of bitter controversies, due to the nature of the ideas and to his personality and idiosyncrasies. Some of the controversies only go to show that it is not possible to build the whole gamut of statistical theory and methodology on a single paradigm and that no single system is quite solid, as Fisher himself realised.
Sampling Distributions
Fisher derived mathematically the sampling distribution of Student's t statistic, which Gosset (pen name: Student) had derived earlier by 'simulation'. Fisher also derived mathematically the sampling distributions of the F statistic, the correlation coefficient, the multiple correlation coefficient, and the sampling distributions associated with the general linear model. Fisher's derivation of the sampling distribution of the correlation coefficient from a bivariate normal distribution was the starting point of the modern theory of exact sampling distributions. Another useful and important contribution was the tanh⁻¹ (inverse hyperbolic tangent) transformation he found for the correlation coefficient, which makes its sampling distribution close to the normal distribution, so that tables of the standard normal distribution can be used in testing the significance of the correlation coefficient. Fisher also made a modification to the degrees of freedom of Pearson's χ² statistic for the case when parameters are to be estimated.
Maximum Likelihood Estimation
Fisher's very first paper, published in 1912 (at the age of 22), was on the method of maximum likelihood (although he did not call it that at the time). He developed this in view of his dissatisfaction with the method of moments and least squares estimators. At that time the term 'likelihood', as opposed to probability or inverse probability, caused some controversies.
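As a concrete illustration of the tanh⁻¹ (Fisher z) transformation mentioned above, the sketch below converts a sample correlation r into z = arctanh(r), whose sampling distribution is approximately normal with standard error 1/√(n − 3), and builds an approximate 95% confidence interval on the correlation scale. The values r = 0.6 and n = 28 are made up purely for illustration.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient."""
    return math.atanh(r)          # z = (1/2) * ln((1 + r) / (1 - r))

def z_confidence_interval(r, n, z_crit=1.96):
    """Approximate 95% CI for a population correlation via the z scale."""
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)   # standard error on the z scale
    lo, hi = z - z_crit * se, z + z_crit * se
    # Back-transform the endpoints to the correlation scale.
    return math.tanh(lo), math.tanh(hi)

r, n = 0.6, 28                    # illustrative sample correlation and sample size
ci = z_confidence_interval(r, n)
```

Because the transformation is monotone, the interval transforms back cleanly and always stays inside (−1, 1), which a naive normal interval on r itself would not guarantee.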
Although the basic idea of likelihood dates back to Lambert and Bernoulli, and the method of estimation can be found in the works of Gauss, Laplace and Edgeworth, it is Fisher to whom the idea is credited, since he developed it and advocated its use. Fisher studied maximum likelihood estimation in some detail, establishing its efficiency. Fisher's mathematics was not always rigorous, certainly not by modern-day standards, but even so his mathematical work, as in the case of maximum likelihood estimation, provides a great deal of insight. Some of his claims on the properties of maximum likelihood estimators were proved to be false in their generality by Bahadur, Basu and Savage. Subsequent authors developed strong theories based on the likelihood function. Fisher advocated maximum likelihood estimation as a standard procedure, and since then it has become the foremost estimation method and has been developed for innumerable problems in many different sciences and contexts. It has also seen enormous ramifications and plays a central role in statistical theory, methodology and applications. Following his work on the likelihood, Fisher did a great deal of work on the theory of estimation and developed the notions of sufficiency, information, consistency, efficiency and ancillary statistics, and integrated them into a well-knit theory of estimation. His pioneering work on this is contained in two papers he wrote in 1922 and 1925.
Analysis of Variance
In 1919, Fisher joined the Rothamsted Experimental Station, where one of his tasks was to analyse data from current field trials. It is in this context that he formulated and developed the technique of analysis of variance. The analysis of variance is really a convenient way of organising the computation for analysing data in certain situations. Fisher developed the analysis of variance initially for orthogonal designs such as randomised block designs and Latin square designs.
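Returning to the method of maximum likelihood discussed above, a minimal numerical sketch: for independent Poisson counts the log-likelihood is maximized at the sample mean (a standard textbook fact, obtained by setting the derivative of the log-likelihood to zero). The counts below are made up for illustration, and a crude grid search stands in for the calculus.

```python
import math

def poisson_loglik(lam, data):
    """Log-likelihood of rate lam for i.i.d. Poisson counts."""
    # log L(lam) = sum_i [ x_i * log(lam) - lam - log(x_i!) ]
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in data)

data = [2, 4, 3, 5, 1, 3, 2, 4]          # made-up counts
mle_closed_form = sum(data) / len(data)  # d/dlam log L = 0  =>  lam_hat = sample mean

# Crude numerical maximization over a grid, mimicking the likelihood principle:
# choose the parameter value that makes the observed data most probable.
grid = [k / 100 for k in range(1, 1001)]
mle_numeric = max(grid, key=lambda lam: poisson_loglik(lam, data))
```

The grid search and the closed-form answer agree, which is exactly what Fisher's method prescribes: both locate the value of the parameter under which the observed data have the highest probability.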
Later, Frank Yates extended the technique to non-orthogonal designs such as balanced incomplete block designs, designs with a factorial structure of treatments, etc. The technique of analysis of variance developed rapidly and has come to be used in a wide variety of problems formulated in the set-up of the linear model. Although initially developed as a convenient means of testing hypotheses, it also throws light on sources of experimental error and helps set up confidence intervals for means, contrasts, etc.
Design of Experiments
Fisher's studies on the analysis of variance brought to light certain inadequacies in the schemes then being used for experiments, especially agricultural experiments. It is in an attempt to sort out these inadequacies that Fisher evolved the design of experiments as a science and enunciated clearly and carefully the basic principles of experimentation: randomisation, replication and local control (blocking, confounding, etc.). The theory of design of experiments he formulated was intended to provide adequate techniques for collecting primary data, for drawing valid inferences from them, and for efficiently extracting the maximum amount of information from the data collected. Randomisation guarantees the validity of estimates and their unbiasedness. Replication helps provide an estimate of error, which can be used to compare treatments and other effects, test hypotheses and set up confidence limits. Local control helps to reduce sampling variation in the comparisons by eliminating some sources of such variation. Fisher formulated randomised block designs, Latin square designs, factorial arrangements of treatments and other efficient designs, and worked out the analysis of variance structures for them.
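A minimal sketch of the analysis of variance for a randomised block design, as described above: the total variation is split into treatment, block and error components, and the treatment F ratio compares the treatment mean square to the error mean square. The tiny data set (2 blocks × 3 treatments) is invented for illustration.

```python
# y[i][j] = response in block i under treatment j (made-up data)
y = [[10.0, 12.0, 14.0],
     [11.0, 15.0, 16.0]]
b, t = len(y), len(y[0])                       # number of blocks, treatments

grand = sum(sum(row) for row in y) / (b * t)
treat_means = [sum(y[i][j] for i in range(b)) / b for j in range(t)]
block_means = [sum(row) / t for row in y]

# Decomposition: SS_total = SS_treatments + SS_blocks + SS_error
ss_total = sum((y[i][j] - grand) ** 2 for i in range(b) for j in range(t))
ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
ss_block = t * sum((m - grand) ** 2 for m in block_means)
ss_error = ss_total - ss_treat - ss_block      # residual variation

df_treat, df_error = t - 1, (b - 1) * (t - 1)
f_treat = (ss_treat / df_treat) / (ss_error / df_error)
# For this data: ss_treat = 21, ss_block = 6, ss_error = 1, f_treat = 21
```

Blocking removes the block-to-block variation (here SS = 6) from the error term, which is precisely the "local control" principle: a smaller error mean square makes the treatment comparison more sensitive.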
The subject of design of experiments then developed rapidly: in the direction of the formulation and use of efficient designs, especially in agricultural experiments; in the direction of statistical theory, formulating useful and efficient designs and working out their analyses; and in the direction of interesting and difficult combinatorial mathematics, investigating the existence of designs of certain types and their construction. Surely, the formulation of the basics of experimental design should be regarded as Fisher's most important contribution to statistics and science.
Discriminant Analysis
From the time Fisher derived the sampling distributions of the correlation coefficient and the multiple correlation coefficient, he was interested in the study of relationships between different measurements on the same individual and in the use of multiple measurements for the purposes of classification and other problems. Fisher formulated the problem of discriminant analysis (what might be called a statistical pattern recognition problem today) in statistical terms and arrived at what is called the linear discriminant function for classifying an object into one of two classes on the basis of measurements on multiple variables. He derived the linear discriminant function as the linear combination of the variables that maximises the ratio of between-group to within-group squared distance. Since then the same function has been derived from other considerations, such as a Bayes decision rule, and has been applied in many fields, such as biological taxonomy, medical diagnosis, engineering pattern recognition and other classification problems. Statistical and other pattern recognition methods and image processing techniques have made considerable progress in the last two or three decades, in theory and in applications, but Fisher's linear discriminant function still has a place in the pattern recognition repertoire.
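Fisher's linear discriminant, as described above, can be sketched directly: compute the two class means, pool the within-class scatter matrix S_w, and take the weight vector w proportional to S_w⁻¹(m1 − m2). The 2-D points below are made up for illustration, and the 2×2 matrix inverse is written out by hand to keep the sketch dependency-free.

```python
def mean(pts):
    n = len(pts)
    return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

def scatter(pts, m):
    """Within-class scatter: sum of outer products of deviations (2x2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in pts:
        dx, dy = x - m[0], y - m[1]
        s[0][0] += dx * dx; s[0][1] += dx * dy
        s[1][0] += dy * dx; s[1][1] += dy * dy
    return s

class1 = [(1, 2), (2, 3), (3, 3)]   # made-up training points, class 1
class2 = [(6, 5), (7, 8), (8, 7)]   # made-up training points, class 2
m1, m2 = mean(class1), mean(class2)
s1, s2 = scatter(class1, m1), scatter(class2, m2)
sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]  # pooled scatter

# w = Sw^{-1} (m1 - m2), with the 2x2 inverse written explicitly
det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
d = [m1[0] - m2[0], m1[1] - m2[1]]
w = [( sw[1][1] * d[0] - sw[0][1] * d[1]) / det,
     (-sw[1][0] * d[0] + sw[0][0] * d[1]) / det]

def project(p):
    """Fisher score: projection of a point onto the discriminant direction."""
    return w[0] * p[0] + w[1] * p[1]

threshold = (project(m1) + project(m2)) / 2   # midpoint rule for classification
```

A new point is assigned to whichever class mean its projected score is closer to; on the training points above the two classes fall cleanly on opposite sides of the threshold.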
Fisher's Exact Test
Fisher's exact test is a statistical significance test used in the analysis of contingency tables. Although in practice it is employed when sample sizes are small, it is valid for all sample sizes. It is named after its inventor, Ronald Fisher, and is one of a class of exact tests, so called because the significance of the deviation from a null hypothesis (e.g., the P-value) can be calculated exactly, rather than relying on an approximation that becomes exact only in the limit as the sample size grows to infinity, as with many statistical tests. The test is useful for categorical data that result from classifying objects in two different ways; it is used to examine the significance of the association (contingency) between the two kinds of classification. With large samples, a chi-squared test can be used in this situation.
Sufficient Statistics
In statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter". In particular, a statistic is sufficient for a family of probability distributions if the sample from which it is calculated gives no more information than the statistic itself does as to which of those probability distributions is that of the population from which the sample was taken. More generally, the "unknown parameter" may represent a vector of unknown quantities or may represent everything about the model that is unknown or not fully specified. In such a case, the sufficient statistic may be a set of functions, called a jointly sufficient statistic. The concept, due to Ronald Fisher, is equivalent to the statement that, conditional on the value of a sufficient statistic for a parameter, the joint probability distribution of the data does not depend on that parameter. Both the statistic and the underlying parameter can be vectors.
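Fisher's exact test, described earlier in this section, reduces to summing hypergeometric probabilities for 2×2 tables with all margins held fixed. A minimal sketch of the one-sided test follows; the table entries are made up purely to exercise the arithmetic.

```python
from math import comb

def hypergeom_pmf(k, r1, r2, c1):
    """P(top-left cell = k) for a 2x2 table with row totals r1, r2 and first-column total c1."""
    return comb(r1, k) * comb(r2, c1 - k) / comb(r1 + r2, c1)

def fisher_exact_greater(a, b, c, d):
    """One-sided exact P-value: tables with a top-left cell at least as large as observed."""
    r1, r2, c1 = a + b, c + d, a + c
    k_max = min(r1, c1)
    return sum(hypergeom_pmf(k, r1, r2, c1) for k in range(a, k_max + 1))

# Made-up 2x2 contingency table: rows are groups, columns are outcomes.
#        outcome+  outcome-
# group1    8         2
# group2    1         5
p = fisher_exact_greater(8, 2, 1, 5)   # exact one-sided P-value
```

Because every term is a ratio of binomial coefficients, the P-value is exact for any sample size, which is the point of the test; the chi-squared approximation is only trustworthy when expected cell counts are large.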
Fisher–Neyman Factorization Theorem
Fisher's factorization theorem, or factorization criterion, provides a convenient characterization of a sufficient statistic. If the probability density function is f_θ(x), then T is sufficient for θ if and only if nonnegative functions g and h can be found such that f_θ(x) = h(x) g_θ(T(x)), i.e. the density f can be factored into a product such that one factor, h, does not depend on θ, and the other factor, which does depend on θ, depends on x only through T(x). For example, for n independent Bernoulli trials with success probability θ, the joint probability θ^(Σx_i) (1 − θ)^(n − Σx_i) is already of this form with h(x) = 1 and T(x) = Σx_i, so the sample total is sufficient for θ. It is easy to see that if F(t) is a one-to-one function and T is a sufficient statistic, then F(T) is a sufficient statistic; in particular, we can multiply a sufficient statistic by a nonzero constant and get another sufficient statistic. This concept of Fisher's finds use in the application of the Rao–Blackwell theorem and in the theory of the exponential family.
Tobacco, Lung Cancer, and Fisher's Biggest Mistake
In 1954 Richard Doll and Bradford Hill published evidence in the British Medical Journal showing a strong link between smoking and lung cancer. They published further evidence in 1956. Fisher was a paid tobacco-industry consultant and a devoted pipe smoker. He did not think the statistical evidence for a link was convincing. He accepted that smoking seemed to be correlated with lung cancer, but declared that 'correlation is not causation.' He said a good case had been made for further research, but not for suggesting to people that they should stop smoking. Reviewing Fisher's arguments today is interesting. He made many valid scientific points against the research that linked lung cancer to smoking. History and further research, however, proved him wrong.
Books by R.A. Fisher
Fisher wrote several books on statistics, many of them containing his original ideas. The most important among them is Statistical Methods for Research Workers (Edinburgh: Oliver & Boyd), first published in 1925, which has seen several editions.
This unusual book is full of original ideas and is written from the point of view of applications: each technique is explained starting with an actual scientific problem and a live data set collected to answer certain questions, followed by the enunciation of an appropriate statistical method, illustrated on that data. The Design of Experiments, first published in 1935, has also seen several editions. Besides these, with F. Yates, he compiled and published Statistical Tables for Biological, Agricultural and Medical Research in 1938 (with several subsequent editions), also with Oliver & Boyd, Edinburgh. These tables, together with those by Pearson and Hartley, were essential tools of a statistician's trade in the days when a statistical laboratory consisted of manually or electrically operated calculating machines, and even in the days of electronic desk calculators. Students may not find Fisher's books quite readable, and until one has mastered the material from some other source or with the help of a good teacher, his books may not help. However, they make very useful and enjoyable reading for an expert and for a teacher!
Suggested Reading
• June 1964 issue of Biometrics, Vol. 20, No. 2, in memoriam Ronald Aylmer Fisher. Dedicated to the memory of Fisher soon after his death; contains many articles on his life and work.
• Box J. R A Fisher: The Life of a Scientist. John Wiley & Sons, New York, 1978.
• Savage L J. On Rereading R A Fisher. In The Writings of Leonard Jimmie Savage: A Memorial Selection. American Statistical Association and the Institute of Mathematical Statistics, Hayward, Calif., pp. 678–720, 1981.
• Fisher-Box J. Fisher, Ronald Aylmer. In Kotz S, Johnson N L and Read C B (Eds.), Encyclopedia of Statistical Sciences. Wiley Interscience, New York, Vol. 3, pp. 103–111, 1988.
• December 1990 issue of Biometrics, Vol. 46, No. 4, published in the year of Fisher's birth centenary; contains a few articles on his life and work.
• Rao C R. R A Fisher: The founder of modern statistics. Statistical Science, Vol. 7, pp. 34–48, 1992.
Some Personal Details and the End
Fisher married Ruth Guinness, a physician's daughter, in 1917. He was 27; she was 17. Together they had seven daughters and two sons. Their eldest son, George, was killed in action flying his fighter plane in 1943, during World War 2. Fisher's marriage then fell apart. Fisher was known to have a quick temper: he got involved in scientific feuds and could behave rudely to people he had strong disagreements with. On the other hand, his many friends reported that he was warm, likable and friendly, had a sharp and appealing sense of humour, was engagingly eccentric at times, and was an intellectually stimulating companion. He was generous with his ideas: many people who talked to him were able to publish as their own work in which Fisher's informal, uncredited contributions had been vital. Ronald Fisher died aged 72 on July 29, 1962, in Adelaide, Australia, following an operation for colon cancer. With bitter irony, we now know that the likelihood of getting this disease increases in smokers. Ronald Fisher was cremated and his ashes interred in St. Peter's Cathedral, Adelaide.
"I am genuinely sorry for scientists of the younger generation who never knew Fisher personally. So long as you avoided a handful of subjects like inverse probability that would turn Fisher in the briefest possible moment from extreme urbanity into a boiling cauldron of wrath, you got by with little worse than a thick head from the port which he, like the Cambridge mathematician J. E. Littlewood, loved to drink in the evening. And on the credit side you gained a cherished memory of English spoken in a Shakespearean style and delivered in the manner of a Spanish grandee."