WDS'10 Proceedings of Contributed Papers, Part I, 116–120, 2010. ISBN 978-80-7378-139-2 © MATFYZPRESS

Selected Problems of Teaching Probability and Statistics

V. Línek
Charles University Prague, Faculty of Mathematics and Physics, Prague, Czech Republic.

Abstract. Although courses in statistics and probability are taught at many universities and colleges in the Czech Republic, the results of this education are often regarded as insufficient. In this article, we focus on particular aspects of this problem: we discuss examples such as diagnostic tests in medicine and probabilistic thinking in criminology, as well as ways of cultivating statistical thinking.

Introduction

The results of education in statistics and probability, and the level of statistical thinking among the non-mathematical public, appear to be very poor both in the Czech Republic [Špinková, 2007] and abroad [Gigerenzer, 2002]. Apart from the overall lack of mathematical literacy, other reasons for this situation in our country can be pointed out. First of all, very little statistics and probability is taught at primary and secondary schools. At primary schools, the basic terms of descriptive statistics should be explained, but this rarely happens because teachers lack interest and time. At secondary schools, the basics of probability theory are studied, but they are treated as a combinatorial exercise rather than as something connected with the “real world”. As a result, statistics courses at colleges and universities are too difficult for students, who end up learning nothing but formal algorithms. Thus, even if laymen know how to use particular statistical methods, they are often unaware of the main purpose of inductive statistics: to analyze statistical data objectively [Zvárová, 2007]. Instead, statistics is regarded as a kind of mystery that can produce any result we want and has nothing to do with common sense.
This attitude leads not only to the incorrect use of statistics, but also to distrust of its results on the one hand and overestimation of them on the other. As this state of affairs has lasted for many decades, no quick and simple solution can be expected; the attention of the mathematical and pedagogical public should be drawn to this issue, which is the modest aim of this article.

Examples from scientific practice

As we occasionally help physicians and surgeons with the statistical analysis of their data, we can present some examples of flaws in their statistical thinking at various levels. Our intention is neither to ridicule them nor to claim that physicians are unable to use statistics correctly; we just want to show how statistical thinking can be distorted and what the possible sources of the distortion are.

Case A

A student of psychology brought 180 questionnaires filled in by participants in some kind of therapy. She wished to prove from these questionnaires that the participants were “happier than other people”. She was surprised to be told that questionnaires from those “other people” were needed for this purpose. As these could not be obtained, we analyzed the data by means of descriptive statistics only. Three months after the work had been finished (and paid for), she discovered results from another researcher who had analyzed the same questionnaires, including the formula used for the analysis. All the work previously performed was thus found useless and was done (and paid for) again.

In this case, the data were collected without any idea of how statistics is used. A single two-sample t-test performed by the student during her statistics course could have saved her money and time.

Case B

A physician needed to find a difference between two groups of patients, divided by the depth of a tumor. The boundary depth between the groups was set at 5 mm.
After we had handed in the results, “new” data were sent for analysis with the boundary depth set at 4 mm “so that the results were better”.

This is a typical example of statistical dishonesty: statistics is used not as an objective way to analyze the data, but as a way to get the results we want. Unfortunately, this use of statistics is much more common than the correct one: prospective authors of articles are more interested in results they can publish than in being objective. This is not surprising and can hardly be changed. However, the sinners are rarely aware that they sin, and teachers of statistics might be able to improve the situation at least partly by warning strongly against this attitude.

Case C

A surgeon needed to show that the duration of a particular kind of surgery depends on the obesity of the patient. He had divided the patients into two groups, obese and thin, and brought the average duration of the operation for each group. Unfortunately, he did not have the primary data, just the averages. We were asked to simulate data that had the same averages and showed, via a two-sample t-test, the result he wanted. We were assured that such a deception was quite harmless, as he had been performing the surgery for many years and “knows” that he is right.

This situation shows more than simple dishonesty. Thinking it over carefully, we find two paradoxes here. Firstly, using statistics could be regarded as needless even if the primary data were available, as the validity of the hypothesis is taken to be obvious. Secondly, the surgeon uses his authority and experience to persuade the statisticians that he is right so that they can persuade other surgeons. Why such a waste of energy? His common sense and experience are convincing for him; why does he think they are not enough for the other surgeons?
It seems that statistics has completely lost its original purpose here and is used as a kind of certificate that must be purchased whenever a piece of knowledge is to be published or taken seriously.

Case D

A postgraduate student of medicine described her despair about cooperating with a statistician during her research in these words: “I don’t understand the statistics at all. I don’t even know the form I should give him the data in!”

This statement shows a substantial failure of statistical education. Indeed, it is quite easy to explain to students what to do with their data if the data are clearly arranged; if they are not, a serious problem arises. There is much work for teachers at grammar schools in this area.

Diagnostic tests

Diagnostic tests are methods used in medicine to determine the state of patients, e.g., whether they suffer from a particular disease or are in some other, as yet unknown, condition. Since nothing is certain in this world, the results of tests are not absolutely reliable, and interesting statistical and didactical problems arise. Two basic quantities characterize the reliability of a test: its sensitivity (the probability that the result of the test is positive given that the patient has the disease) and its specificity (the probability that the result of the test is negative given that the patient does not have the disease). We present these characteristics for several tests in Table 1, together with the prevalence of the tested condition (the proportion of individuals with the tested condition in a particular population; in other words, the probability that a randomly chosen patient has the condition). Obviously, the prevalence depends on the selected population.

The trouble is that these numbers are of no direct importance to patients.
They are much more interested in the false positive rate (the probability that the patient does not have the disease given that he or she has tested positive) and the false negative rate (the probability that the patient has the disease given that he or she has tested negative). The standard method of calculation uses the Bayes rule, which leads to the formulas

$$\text{false positive rate} = P(H \mid +) = \frac{P(+ \mid H) \cdot P(H)}{P(+ \mid H) \cdot P(H) + P(+ \mid D) \cdot P(D)},$$

$$\text{false negative rate} = P(D \mid -) = \frac{P(- \mid D) \cdot P(D)}{P(- \mid D) \cdot P(D) + P(- \mid H) \cdot P(H)},$$

where the following abbreviations are used:
H ... the patient is healthy,
D ... the patient has the disease,
+ ... the result of the test is positive,
– ... the result of the test is negative.

Table 1. The sensitivity and the specificity of selected diagnostic tests and the prevalence of the tested diseases (or other conditions) in a relevant population.¹

condition                      sensitivity  specificity  prevalence
HIV                            99.9%        99.99%       0.01%
breast cancer (mammogram)      90%          93%          0.8%
colorectal cancer (FOBT test)  50%          97%          0.3%
pregnancy tests                82%          64%          5.5%

Unfortunately, the idea represented by the Bayes rule seems to be too difficult and quite inaccessible for doctors, who are rarely interested in mathematics. Thus they usually know very little about the uncertainties connected with the results of these tests. This can occasionally lead to harmful misunderstandings or even to tragedy [Gigerenzer, 2002]. Diagnostic tests are analyzed in more depth e.g. by Anděl [2007] and Hacking [2001].

Didactics of diagnostic tests

Gigerenzer [2002] claims that conditional probabilities are the source of these difficulties, as our minds are not adapted to them. Instead, he suggests formulating the problem in so-called “natural frequencies”, i.e. absolute numbers of particular outcomes in an idealized population of some round size, e.g. 10,000 people.
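As a point of comparison for the natural-frequency formulation, the Bayes-rule formulas for the false positive and false negative rates can be evaluated directly. The following is a minimal Python sketch, not a clinical tool: the function name `false_rates` and its parameter names are assumptions made for illustration, and the inputs are the mammogram values from Table 1 (sensitivity 90%, specificity 93%, prevalence 0.8%). The two rates follow the definitions used in this article, P(H | +) and P(D | −).

```python
def false_rates(sensitivity, specificity, prevalence):
    """Return (false positive rate, false negative rate) as defined in the text.

    false positive rate = P(H | +), false negative rate = P(D | -).
    All arguments are probabilities between 0 and 1.
    """
    p_d = prevalence               # P(D): patient has the disease
    p_h = 1.0 - prevalence         # P(H): patient is healthy
    p_pos_d = sensitivity          # P(+ | D)
    p_neg_d = 1.0 - sensitivity    # P(- | D)
    p_neg_h = specificity          # P(- | H)
    p_pos_h = 1.0 - specificity    # P(+ | H)

    # Bayes rule, term by term as in the two formulas of the article.
    fpr = p_pos_h * p_h / (p_pos_h * p_h + p_pos_d * p_d)
    fnr = p_neg_d * p_d / (p_neg_d * p_d + p_neg_h * p_h)
    return fpr, fnr


# Illustrative input: mammogram screening values from Table 1.
fpr, fnr = false_rates(sensitivity=0.90, specificity=0.93, prevalence=0.008)
print(f"P(H | +) = {fpr:.3f}")   # about 0.906
print(f"P(D | -) = {fnr:.5f}")
```

Under these inputs, roughly nine out of ten positive mammograms come from healthy women, which is the kind of result that the conditional-probability wording tends to hide.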
To test the legitimacy of this approach, he asked 48 physicians to estimate the chance of breast cancer for a woman aged 40 to 50 given a positive mammogram in routine screening. The input data were given in one of two versions:

Version A (conditional probabilities)
“The probability that one of these women has breast cancer is 0.8%. If a woman has breast cancer, the probability is 90% that she will have a positive mammogram. If a woman does not have breast cancer, the probability is 7% that she will still have a positive mammogram. Imagine a woman who has a positive mammogram. What is the probability that she actually has breast cancer?”

Version B (natural frequencies)
“Eight out of every 1,000 women have breast cancer. Of these 8 women with breast cancer, 7 will have a positive mammogram. Of the remaining 992 women who don’t have breast cancer, some 70 will still have a positive mammogram. Imagine a sample of women who have positive mammograms in screening. How many of these women actually have breast cancer?”

Half of the physicians were given Version A, while the other half received Version B. We present the results of this experiment in Table 2.

¹ Data from [Gigerenzer, 2002], [Katz, 2009].

Table 2. Results of Gigerenzer’s experiment.

                   Version A    Version B
Incorrect answers  22 (92 %)    13 (54 %)
Correct answers     2 (8 %)     11 (46 %)
Total              24 (100 %)   24 (100 %)

The results are convincing enough. The question remains whether physicians would be able to “translate” the language of conditional probabilities into that of natural frequencies themselves. In the experiment the translation was done by someone else, but this cannot be arranged every time a new test is used. A fairly easy way to do it is to distribute an idealized population into natural frequencies in a 2 by 2 contingency table according to the sensitivity, the specificity and the prevalence.
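This allocation can be sketched in a few lines of Python. It is only an illustration under stated assumptions: the function name `natural_frequency_table`, the dictionary keys, the default population of 1,000 and the rounding to whole persons are choices made for this example, not part of the cited method. The inputs (prevalence 10 %, sensitivity 90 %, specificity 80 %) are the values used in Fig. 1.

```python
def natural_frequency_table(sensitivity, specificity, prevalence, population=1000):
    """Split an idealized population into the four cells of a 2 by 2 table.

    Counts are rounded to whole persons (an assumption of this sketch), so
    small populations combined with rare conditions give coarse results.
    """
    diseased = round(population * prevalence)
    healthy = population - diseased
    true_pos = round(diseased * sensitivity)   # diseased, test positive
    false_neg = diseased - true_pos            # diseased, test negative
    true_neg = round(healthy * specificity)    # healthy, test negative
    false_pos = healthy - true_neg             # healthy, test positive
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
    }


table = natural_frequency_table(sensitivity=0.90, specificity=0.80, prevalence=0.10)

# False rates as defined in the text, read directly off the table rows.
fp_rate = table["false_pos"] / (table["true_pos"] + table["false_pos"])
fn_rate = table["false_neg"] / (table["false_neg"] + table["true_neg"])

print(table)
print(f"false positive rate: {fp_rate:.0%}")
print(f"false negative rate: {fn_rate:.1%}")
```

Reading the rates off the rows of the table (positives, then negatives) replaces the Bayes rule with two simple divisions, which is the point of the natural-frequency approach.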
The data are clearly arranged in such a table, and the characteristics of the test can even be displayed graphically (Fig. 1). In fact, such tables are precisely how the specificity and the sensitivity are explained to students of medicine. Unfortunately, the effect of the prevalence on the false rates can easily be forgotten here, as the data entered usually come from experiments in which the proportion of patients with the disease differs from that in the normal population [Linn, 2004]. The effect is essential: the lower the prevalence, the higher the proportion of false positives among all positive results, that is, the higher the false positive rate. Therefore, the ability that students of medicine should gain during their education is to fill in the table from the prevalence, the sensitivity and the specificity, in the way suggested in Version B. It seems that this didactic approach would be more effective than memorizing the Bayes rule, which is usually forgotten anyway.

            INFECTED   HEALTHY   Σ
POSITIVE    90         180       270
NEGATIVE    10         720       730
Σ           100        900       1000

FALSE POSITIVE RATE 67 %, FALSE NEGATIVE RATE 1.4 %, SENSITIVITY 90 %, SPECIFICITY 80 %

Figure 1. Natural frequencies in a 2 by 2 table and their graphical display. The prevalence is 10 %. The grey fields represent false results (the 180 healthy patients who tested positive and the 10 infected patients who tested negative).

Probability and criminology

Examples of the use of probability in criminology are interesting for teachers, as they show in an engaging way how muddled probabilistic thinking can lead to incorrect conclusions. We present here a case described in [Gigerenzer, 2002]. It again concerns conditional probabilities.

Case E

A woman was murdered and her husband was accused of the murder. The strongest argument against him was the fact that he had battered her repeatedly in the past; it could therefore be regarded as probable that he had killed her as well. The defense, after consulting a renowned law professor, objected that out of 100,000 women battered by their partners, only 40 are killed by those partners annually. In other words, he stated that the conditional probability of a man murdering his wife, given that he had battered her, was too small to be useful at trial:

$$P(\text{a man will kill his wife} \mid \text{he batters her}) = 40 : 100{,}000 = 0.0004.$$

In the end, the jury decided that the defendant was innocent.² It is not known whether the jurors were influenced by the probability argument. In any case, the argument is not correct. Why? Out of 100,000 women battered by their partners, 40 are killed by their partners annually. But only 5 of the remaining women are killed each year by someone other than their partners (Fig. 2). Thus the probability relevant in this situation is

$$P(\text{a man killed his wife} \mid \text{he battered her and she was killed}) = 40 : 45 \approx 0.89.$$

In other words, 40 out of every 45 battered women who are murdered are killed by their batterers. Probability theory thus gives a fairly strong case against the defendant, not for him.

[Figure 2 diagram: of 100,000 women battered by their partner, 40 are killed by their partner and 5 are killed by someone other than their partner.]

Figure 2. A schema of the probability arguments discussed in the presented murder case.

Conclusion

We have presented several examples of the practical use and misuse of statistics and probability, which show why these parts of mathematical education deserve attention. Two notable sources of problems were identified: a misconception about the purpose of statistics, resulting in statistical dishonesty, and conditional probabilities, which are tricky and whose incorrect use can have serious consequences. Both issues deserve to be addressed carefully. Looking for further practical examples and better teaching methods is certainly a suitable way to deal with them.

Acknowledgments. The author would like to thank RNDr. Magdalena Hykšová, PhD.
for her patient and kind help with the work on this article.

References

Anděl, J., Matematika náhody, Matfyzpress, Praha, 2007.
Gigerenzer, G., Calculated Risks, Simon & Schuster, New York, 2002.
Hacking, I., An Introduction to Probability and Inductive Logic, Cambridge University Press, Cambridge, 2001.
Katz, D., False Positives & False Negatives. Lecture Notes for the Course Introductory Statistics, Preston Ridge Campus, Frisco, 2009.
Linn, S., A New Conceptual Approach to Teaching the Interpretation of Clinical Tests, Journal of Statistics Education, Volume 12 (2004), Number 3.
Špinková, M., Statistical Notions and Learning of Statistics, in: WDS'07 Proceedings of Contributed Papers, Part I, Matfyzpress, Praha, 2007.
Zvárová, J., Základy statistiky pro biomedicínské obory I., Karolinum, Praha, 2007.

² The case happened in 1995 in the USA. The accused man was black, the murdered woman was white, and eight of the twelve jurors were black women. The case was therefore closely watched by the public because of its gender and racial connotations.