Lecture 3: Bayesian Statistics
Julia Collins
9th October 2013

The lottery

What are the chances of winning the lottery jackpot? Suppose you've chosen your 6 numbers and you're watching the draw. There are 49 balls in the machine and 6 of them would be a match if chosen, so we have a $\frac{6}{49}$ chance of the first ball drawn matching. Now there are 48 balls in the machine, and 5 of them would be a match with the remaining numbers, so this time there is a $\frac{5}{48}$ chance of matching another number. There is a $\frac{4}{47}$ chance of matching the next ball, with the pattern continuing until there is a $\frac{1}{44}$ chance of the final 6th ball matching. All of these events have to happen in order for you to win the jackpot, so we multiply the probabilities together to get the final jackpot probability:

\[ P(\text{winning the jackpot}) = \frac{6}{49} \times \frac{5}{48} \times \frac{4}{47} \times \frac{3}{46} \times \frac{2}{45} \times \frac{1}{44} = \frac{1}{13{,}983{,}816} \]

So the chance of winning the UK lottery jackpot is about 1 in 14 million. If you bought a lottery ticket every other second for a year you would expect to win only about once. In fact, if you buy the ticket earlier than Friday for the Saturday draw, then your chances of dying before the draw are higher than your chance of winning!

The chance of getting the minimum prize of £10 for getting 3 balls correct is still a measly $\frac{1}{57}$. On average you would have to spend £57 to get a £10 return.

The Birthday Problem

How many people do you need in a room before the chance of two of them having the same birthday is more than 50%? Most people will go for a figure of about 180, since there are 365 days in a year and you need about half of that. The true figure is an astoundingly low 23. How could this possibly be true?

The way to think about it is not how many individuals there are in the room, but how many possible pairs of people there are. With 23 people there are $\frac{23 \times 22}{2} = 253$ possible pairings, which already makes the answer sound more believable. How do we calculate the probabilities involved?
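Before turning to that question, the figures quoted so far can be checked numerically. The following is a minimal Python sketch and not part of the original notes: the variable names are illustrative, and the three-ball figure assumes the minimum prize is paid for matching exactly 3 of the 6 drawn numbers.

```python
from fractions import Fraction
from math import comb

# Jackpot: multiply the six conditional probabilities 6/49, 5/48, ..., 1/44.
jackpot = Fraction(1)
for k in range(6):
    jackpot *= Fraction(6 - k, 49 - k)

# Equivalent counting argument: one winning set out of C(49, 6) possible sets.
assert jackpot == Fraction(1, comb(49, 6))

# Assumption: the minimum prize is for matching exactly 3 of the 6 numbers,
# i.e. 3 of your 6 numbers are drawn together with 3 of the other 43.
three_match = Fraction(comb(6, 3) * comb(43, 3), comb(49, 6))

# Number of distinct pairs that can be formed from 23 people.
pairs = comb(23, 2)

print(jackpot)             # 1/13983816
print(float(three_match))  # close to 1/57
print(pairs)               # 253
```

Exact fractions are used here so that the jackpot probability can be compared with the counting argument without any rounding.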
It turns out to be easier to calculate the chance that no two people share a birthday. Let's take it one person at a time. Suppose that one person is in a room and let's ignore leap years.

• A second person walks into the room. What is the chance that they do not share a birthday with the first person? There are 364 other birthdays they could have out of a possible 365, so the probability is simply $\frac{364}{365}$.
• A third person walks into the room. What is the chance that they do not share a birthday with either of the other two? There are now 363 birthdays left to choose from (assuming the first two are different) out of 365, so the probability is $\frac{363}{365}$.
• Continue similarly, so the fourth person has a $\frac{362}{365}$ chance of not sharing a birthday with any of the first three people.

As with the lottery, the total probability of no two people sharing a birthday is the product of all these probabilities:

\[ P(n \text{ people having no birthdays in common}) = \frac{364}{365} \times \frac{363}{365} \times \cdots \times \frac{365 - n + 1}{365} \]

Finally, to get the probability that there do exist two people with a common birthday:

\[ P(\text{2 people having a common birthday}) = 1 - P(\text{no two people have a common birthday}). \]

When we put in some different values for n we get this table:

Number of people    Probability that two people share a birthday
10                  11.7%
20                  41.1%
23                  50.7%
30                  70.6%
50                  97%
57                  99%
100                 99.99997%
200                 99.9999999999999999999999999998%
366                 100%

So already with 50 people in a room you can be virtually certain that there will be some pair of people with the same birthday.

The reason why this result seems so counterintuitive is that we very easily confuse the probability that some two people share a birthday with the probability that someone shares a birthday with a specific person. Of course, the probability that, in a room of 50 people, someone shares a birthday with you is fairly low (about 13%).

Example of Bayes' Theorem in action

We saw in the lecture that in 1828 a Frenchman called A.
Taillandier found that 67% of prisoners were illiterate. He claimed that “ignorance is the mother of all vices”. However, in order to justify his statement, he would need to know not the proportion of criminals that were illiterate, but the proportion of illiterate people who were criminals.

Bayes' Theorem tells us how to work out a conditional probability given its inverse. The formula is

\[ P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \]

where P(A|B) means “the probability of A happening, given that we know B”. So, to work out the proportion of illiterate people who are criminals, we need to work out

\[ P(\text{criminal}|\text{illiterate}) = \frac{P(\text{illiterate}|\text{criminal})\,P(\text{criminal})}{P(\text{illiterate})}. \]

From the research Taillandier did, we know that P(illiterate|criminal) = 67%. We can roughly guess that the proportion of illiterate people in the French population at that time was about 40%. Another stab in the dark, as it were, gives us an estimate of criminality at about 5%. Putting these numbers into the equation gives us

\[ P(\text{criminal}|\text{illiterate}) = \frac{0.67 \times 0.05}{0.4} \approx 8.4\%. \]

A figure of about 8% is much less than the headline-grabbing figure of 67%! However, it is still noteworthy: only 5% of the general public are assumed to be criminals, but about 8% of illiterate people are criminals.

Homework

I have a friend who has two children. At least one of these children is a boy who was born on a Tuesday. What is the chance that the other child is a boy?

Homework solution

First we have to write down all the combinations of two children (gender and day-of-week) in which at least one of the children is a boy born on a Tuesday. We have:

• The older child is a boy born on a Tuesday and the younger child is a girl. The girl could be born on any of the days of the week so there are 7 possibilities here.
• The younger child is a boy born on a Tuesday and the older child is a girl. The girl could be born on any of the days of the week so there are 7 possibilities here.
• The older child is a boy born on a Tuesday and the younger child is a boy. The second boy could also be born on any day of the week so there are 7 possibilities here.
• The younger child is a boy born on a Tuesday and the older child is a boy. The older boy could be born on any day of the week EXCEPT Tuesday, because the case where the older child is a boy born on a Tuesday was already counted. Therefore there are 6 possibilities here.

In total, then, there are 7 + 7 + 7 + 6 = 27 different gender/day-of-week combinations where at least one child is a boy born on a Tuesday. Out of these 27, there are 7 + 6 = 13 cases where both children are boys. Therefore the answer to the homework question is $\frac{13}{27} \approx 0.48$.

Interestingly, this answer is much closer to the intuitive value of $\frac{1}{2}$ than the answer to the problem in lectures where the day of the week didn't feature.

This Wikipedia article provides a good discussion of the difficulties associated with the general boy/girl paradox: http://en.wikipedia.org/wiki/Boy_or_Girl_paradox

And here is a nice concise description of the different scenarios in which someone might tell you they have two children, one of whom is a boy born on a Tuesday, and how the probabilities associated with these scenarios change: http://blog.tanyakhovanova.com/?p=221.
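The count of 27 cases in the homework solution can be confirmed by brute-force enumeration. The following is a minimal Python sketch and not part of the original notes. It assumes that each child is equally likely to be a boy or a girl, equally likely to be born on any day of the week, and that the two children are independent, so that all 196 families are equally likely. The names `SEXES`, `TUESDAY` and `is_tuesday_boy` are illustrative.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = range(7)   # days of the week as 0..6
TUESDAY = 1       # which index stands for Tuesday is arbitrary

# A child is a (sex, day) pair; a family is an (older, younger) pair.
children = list(product(SEXES, DAYS))
families = list(product(children, children))  # 14 * 14 = 196 families

def is_tuesday_boy(child):
    """True if the child is a boy born on a Tuesday."""
    return child == ("boy", TUESDAY)

# Keep only the families with at least one boy born on a Tuesday.
matching = [f for f in families if any(is_tuesday_boy(c) for c in f)]

# Among those, count the families where both children are boys.
both_boys = [f for f in matching if all(sex == "boy" for sex, _ in f)]

answer = Fraction(len(both_boys), len(matching))
print(len(matching), len(both_boys), answer)  # 27 13 13/27
```

Because the family with two Tuesday boys appears only once in the list of families, the enumeration avoids the double counting that the solution handles by excluding Tuesday in the last case.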