Chapter 4
Probability
One of the original motivations behind counting was the beginning of the taming of
uncertainty that occurred in the 16th and the early part of the 17th century. Why it took so
long to develop even to the extent it did in those early centuries is indeed interesting, but
not for us to speculate on (gambling is very old indeed). What is relevant to us is that by
1600 it was reasonably clear in many people's minds what some aspects of probability
were about. But let's proceed by example, a historical one. Galileo himself was posed
this question, and as usual he analyzed it correctly.

Example 1. Suppose we are going to play the following game (those early years were
mostly concerned with gambling questions; as mentioned above, gambling is old,
definitely thousands of years old):

    We roll 3 dice; if a 9 shows up, I pay you $1; if a 10 shows up, you
    pay me $1; if anything else shows up, we roll again.
Naturally you are mistrustful, since I am proposing the game. But how do you know that I am
not the one being foolish by offering it to you, or better, that it may be a fair game and you are just
missing the opportunity to have fun? Of course, if you are just going to play one hand, no
calculation is really necessary, and you are just going to make your decision based on
your mood, who makes the offer, etc. But suppose you intend to do this for three hours
every Saturday for the next three years (we live in the age of individual preference). At
first thought it seems like a reasonable game: one can obtain a 9 by rolling

    1-2-6, 1-3-5, 1-4-4, 2-2-5, 2-3-4 or 3-3-3,

while a 10 can be obtained by rolling

    1-3-6, 1-4-5, 2-2-6, 2-3-5, 2-4-4 or 3-3-4.

At first thought it seems like a fair and reasonable game. Both numbers can be rolled in
six different ways, as the two lists of possibilities indicate. But what is the logic behind
this attempt? It has something to do with the number of ways of doing something:
if one thing has more ways of occurring than another, then it is more likely to
occur. After all, nobody would play the previous game if the competition were between
rolling a 3 and rolling a 10, since intuitively one feels that a 3 is much rarer than a 10.
Although there is common sense behind this, it is not quite correct. It needs to be
improved upon. The basic principle that we are going to use for our probability
calculations is:

    Suppose an activity or experiment is to be performed, and we have
    equally feasible outcomes; then the probability for a given event to
    occur is the number of outcomes that give the desired event divided
    by the total number of outcomes.[1]

But extra emphasis needs to be placed on the premise of the principle: one must first
reduce the outcomes of the activity to equally feasible outcomes.[2] Then you can start
looking at the probability of the event that you are interested in. Going back to the game
in question: what is the activity in this example? Rolling 3 dice. What are the outcomes?
It seems acceptable to say that the outcomes are, in addition to the two lists above, the
(unordered) rolls adding up to 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17 and 18, and thus
there would be a total of 56 outcomes. So the probability of a 9 would be 6/56, and a 10
would have the same probability, so the game is seemingly fair.
[1] This is the first enunciated principle in the theory of probability, and the simplest one. As simple as it is, it
was not stated clearly until the 16th century, by the inimitable Cardano, great scholar and scoundrel.
[2] What counts as equally feasible outcomes can in itself be a polemic. How do you know a coin is fair? But we
will be naive about the subtleties of statistical analysis, and only insist that, from what we know, we can
honestly claim that the outcomes we are taking are equally feasible.
However, if we apply the same reasoning, then the probability of a 3 is 1/56, and a 4 has
the same probability. So if we keep rolling the three dice for a long time, the number of
3's occurring should be roughly the same as the number of 4's. It does not take much
experimentation to start doubting our premise, and maybe we should question
why we labeled those outcomes as equally feasible. So let's rethink a bit. Is a
1-1-1 as equally feasible as a 1-1-2?
Suppose we had a yellow die, a white die and a blue die. Then, to roll a 3, we would
have to show a 1 on each die; but to get a 4, we can do it by 2-1-1, 1-2-1 or 1-1-2:
there are three ways, since any of the three dice could show the 2 while the other two
show 1's. It seems like we have some more choices in the latter situation. Three times
as many, actually.

A way out of the quagmire is to take for our outcomes the 216 different ways there are to
roll three dice if one of them is yellow, another one white and the third one blue. We get the
216 from 6 × 6 × 6. Nobody can argue about the equal feasibility of these 216 outcomes.
So we start now from there. How many ways can we roll a 3? As before, only one way,
so the probability is 1/216, not 1/56. But how many ways can we roll a 4? Three ways, as we
saw above, so the probability of a 4 is 3/216. So 4's should occur about three times more
often than 3's.
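The claim that a 4 is three times as likely as a 3 can be checked mechanically. Here is a minimal Python sketch (standard library only) that enumerates the 216 ordered rolls and counts:

```python
from itertools import product

# All 216 ordered rolls of three distinguishable dice.
rolls = list(product(range(1, 7), repeat=3))

ways_3 = sum(1 for r in rolls if sum(r) == 3)  # only 1-1-1
ways_4 = sum(1 for r in rolls if sum(r) == 4)  # 2-1-1, 1-2-1, 1-1-2
print(len(rolls), ways_3, ways_4)  # 216 1 3
```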
Let's go back to the 9 and the 10 of our game. Of the 216 ways, how many ways can
we roll a 1-2-6? Easily, we have three decisions: which die shows the 1, which the 2
and which the 6, so there are 3 × 2 × 1 = 6 ways. By identical reasoning, there are 6 ways
to roll a 1-3-5. But what about a 1-4-4? Here our tree of options has only two stages,
since once we have decided which die shows the 1, the other two dice must show a 4
and we have nothing left to decide (or equivalently, we only have one option once we
have placed the 1), so there are only 3 ways to roll a 1-4-4.

Identically, there are 3 ways to roll a 2-2-5, and 6 ways to roll a 2-3-4. Finally, there is
only 1 way to roll a 3-3-3 (all 3 dice have to show 3's). So how many ways can we roll
a 9? Totaling our options we obtain 6 + 6 + 3 + 3 + 6 + 1 = 25 ways to roll a 9. Putting
it in a table, together with the similar calculations for 10:
Roll of 9    # of Ways        Roll of 10    # of Ways
1-2-6        6                1-3-6         6
1-3-5        6                1-4-5         6
1-4-4        3                2-2-6         3
2-2-5        3                2-3-5         6
2-3-4        6                2-4-4         3
3-3-3        1                3-3-4         3
Total        25               Total         27

Hence the probability of a 9 is 25/216, while the probability of a 10 is 27/216. So on the
average, after 216 rolls of the dice, you would have won 25 times, I would have won 27
times, and the rest would have been draws. So your net outcome would be a loss of $2.
Now whether you play or not is your decision: after all, you could consider $2 cheap
entertainment (millions of people go to Vegas).
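The counts 25 and 27 can likewise be confirmed by brute-force enumeration; a short Python sketch:

```python
from itertools import product

# Enumerate the 216 ordered rolls and count the sums 9 and 10.
rolls = list(product(range(1, 7), repeat=3))
ways_9 = sum(1 for r in rolls if sum(r) == 9)
ways_10 = sum(1 for r in rolls if sum(r) == 10)
print(ways_9, ways_10)  # 25 27
```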
One of the common errors made in the past by mathematicians (including some of the
best, like Leibniz, D'Alembert and others) is that of presuming equally feasible
outcomes to an experiment without further analysis. In this course we will not have the
chance to get too deeply into the subtleties of this subject, but remember always to be
careful to set up the outcomes of the experiment before you start asking about the event
you are interested in, and try to analyze your outcomes so that they seem, as best as you
can tell, equally feasible.
Example 2. Suppose that a happily married couple has two children. How likely is it that
they will have one of each sex? D'Alembert incorrectly analyzed this by saying there
were three outcomes to the experiment: Two Boys, Two Girls and One-Of-Each. So the
probability of One-Of-Each is 1/3. Actually, if we assume that a boy being born is as
likely as a girl being born in any given birth[3], then there are four equally feasible
outcomes: BB, BG, GB and GG. Of those, 2 give us children of both sexes, so the
probability is 2/4 = 1/2. Again this estimate conforms to reality much better.
But to more relevant matters:
Example 3. Suppose you come to take a test totally unprepared. The test consists of 10
True-False questions, each of which you will answer at random (but you will answer
them all, since there is no penalty for guessing). How likely is it that you will achieve a
passing score of 70% or better? First, what is the experiment? Answering the exam. How
many ways can you do this? By the second counting principle, 2^10 = 1,024 (10 decisions,
2 choices for each). What is the event we are pursuing? To obtain at least 7 correct
answers. Let's partition this event into: exactly 7 correct, exactly 8 correct, exactly 9 correct
and exactly 10 correct. In how many ways can you answer the exam so that you have
exactly 10 correct? 1 way. How about 9 correct? Build your tree: the first stage is to
decide which question you are going to answer incorrectly, and for this stage you have 10
choices. After you have done that, there are no options left, since the question you are to
answer incorrectly has to be answered that way while all the others have to be answered
correctly. So there is a total of C(10,1) = 10 ways of getting exactly 9 correct. What about 8?
There we have the decision of which two questions out of the 10 we are to answer
incorrectly; after that we have no options left, so the answer is C(10,2) = 45. Finally, by
similar reasoning, we get C(10,3) = 120 ways of getting exactly 7 correct. So the number of
ways of getting at least 7 is 1 + 10 + 45 + 120 = 176. Hence the probability of getting a
passing score on the exam is 176/1024, which is approximately 17%. Not bad for total
ignorance!
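The arithmetic can be double-checked with Python's math.comb:

```python
from math import comb

total = 2 ** 10  # 1,024 ways to fill in the exam
# At least 7 correct = at most 3 wrong: choose which 0, 1, 2 or 3
# questions are answered incorrectly.
passing = sum(comb(10, wrong) for wrong in range(4))  # 1 + 10 + 45 + 120
prob = passing / total
print(passing, round(prob, 4))  # 176 0.1719
```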
Observe that
The probability of an event is always a number between 0 and 1, inclusive.
Example 4. This is a slight but very important variation of Example 3. Suppose you
come to take a test (totally unprepared, as usual), but the test is Multiple Choice, with
ten questions, each question having three choices, only one of them correct. What is
the probability that you score at least 70% on the test? How many ways can we answer
the exam? Easy: 3^10 = 59,049. How many ways can we get all correct? 1 way. Nine
correct? C(10,9) × 1^9 × 2^1 = 20. The surprising ingredient here might be the 2, which
comes from the 2 ways we can answer a question incorrectly. Reviewing the three
factors, we get C(10,9) as the number of ways of choosing the questions that we are going
to answer correctly, 1^9 as the number of ways of answering those questions correctly, and
2^1 as the number of ways of answering the remaining question incorrectly. How many ways
can we get an 80%? C(10,8) × 1^8 × 2^2 = 180. Finally, the number of ways of getting
exactly 70% is C(10,7) × 1^7 × 2^3 = 960, so the probability of passing the exam is

    (1 + 20 + 180 + 960)/59,049 = 1,161/59,049 ≈ 0.0197,

much smaller than in the True-False exam.

[3] As it happens, this is not quite correct. One realization came very early, as soon as statistical tables of
birth were gathered in the 1660's: more boys are born than girls, approximately in an 18-to-17 ratio (girls
are more likely to survive, so not so many need to be produced). The other complication is that a given
couple, because of the chemistry, has a certain small factor of repeating the sex of previous children (though
this factor is small).
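The same kind of check works here; the general term C(10, k) × 2^(10−k) counts the ways of getting exactly k questions correct:

```python
from math import comb

total = 3 ** 10  # 59,049 ways to fill in the exam
# Exactly k correct: choose the k correct questions, then 2 wrong
# choices for each of the remaining 10 - k questions.
ways = sum(comb(10, k) * 2 ** (10 - k) for k in range(7, 11))
print(ways, round(ways / total, 4))  # 1161 0.0197
```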
Keep the model of the multiple-choice exam in mind. Many situations can be modeled
using the same kind of thinking. There is no reason, for example, why there can be only
one way to answer the question correctly. Hence our counting becomes a little bit more
complicated, but totally manageable.
Let's next play poker.
Example 5. A typical deck of cards consists of 52 cards in 4
suits (spades, hearts, diamonds and clubs) and 13 denominations
(Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King). A poker hand consists of 5 cards out of the 52.
There are special hands, combinations of either denominations or suits or both, that are
ranked higher than others. The rankings from best to worst are as follows:
Name of the Hand    Description of the hand
Royal Flush         10, J, Q, K, A of the same suit.
Straight Flush      5 cards in sequence in the same suit, not royal.
4-of-a-Kind         All four cards of the same denomination.
Full House          3 cards in the same denomination, 2 in another.
Flush               5 cards in the same suit, but not in sequence.
Straight            5 cards in sequence, but not in the same suit.
3-of-a-Kind         3 cards in the same denomination, the other 2 different.
Two Pairs           2 cards in one denomination, 2 cards in another denomination, the fifth card in yet another.
Pair                2 cards in one denomination, nothing else.
Bust                None of the above.
What is the probability of a Royal Flush? There are at least two ways to view our
experiment: one way is that I am dealt 5 cards out of a deck of 52 (as in 5-card draw); another
is that I am dealt one card at a time until I have 5 (as in 5-card stud). Should the probabilities
be different? Of course not. But what may happen is that it is easier to do a problem one
way than the other. With the Royal Flush there is no hassle. With the first approach,
there are C(52,5) = 2,598,960 equally feasible outcomes; of those, 4 give a royal
flush (the only option we have is which suit the flush is to be in), so the probability is
4/2,598,960 = 1/649,740 ≈ 0.000001539, not very likely indeed. In the other approach we have
52 × 51 × 50 × 49 × 48 = 311,875,200 equally feasible outcomes. How many of them give
us a royal flush? The first card can be any of the 20 possible cards (any 10 or J or Q or
K or A); the second card has to be in the same suit, so we have only 4 choices; for the
third card we have 3 choices, for the fourth one, 2 choices, and for the fifth one, 1 choice.
So by the second counting principle, we have a total of 20 × 4 × 3 × 2 × 1 = 480 ways, and
the probability is 480/311,875,200 = 1/649,740 ≈ 0.000001539, which is as expected.
Let's try to count Straight Flushes. In the first approach, how many decisions do we
have to make? The suit of the flush (4 choices for this decision) and the type of straight (9
choices for this decision; just count them on your fingers by deciding which
denomination is the lowest), so in total we have 36 options, and the probability is
36/2,598,960 ≈ 0.000013852 (you should not hold your breath until you get one of these either).

What about the second approach? The first card can be anything, but what about the
second card? We are in trouble. The number of options for the second level of the tree
depends on which branch of the first level you are in (for example, if the first card is a
king, then the second one can only be a 9, 10, J, Q or A: 5 options, while if the first one is an
8, then the second one can be a 4, 5, 6, 7, 9, 10, J or Q: 8 options). We certainly don't want to
start drawing trees. As it turns out, this tree has 4,320 terminal nodes! (We might let a
machine do this, but certainly not by hand.) This is an important lesson: if you are in
trouble counting the outcomes for an event, then by moving laterally and changing
the setup maybe the trouble can be avoided. In reality, the second approach only
worsens as we go down the list of hands. So we will stay with the first approach.

For 4-of-a-Kind: we have to decide the denomination (13 ways) and then the odd card
(48 ways), so we have 13 × 48 = 624 options all together, giving us a probability of
about 0.00024.
For a Full House: in order to build a full house we need to decide which 3-of-a-kind we
are going to have (13 options), which suits those 3 cards are going to have (C(4,3) = 4
options), which pair (12 options) and the suits for the pair (C(4,2) = 6 options), so the
total is 13 × 4 × 12 × 6 = 3,744, and the probability is 0.001440576.
For a Flush: we have to decide the suit (4 options) and which 5 cards out of the 13 in the
suit, which gives C(13,5) = 1,287 options, so we have 4 × 1,287 = 5,148 ways. But these
include the 40 hands that are straights (36 straight flushes + 4 royal), hence the number is
5,108, and the probability is 0.0019654. Note that if we had missed the subtlety of the 40
hands we had counted before, the answer wouldn't be that much different: 0.0019807,
a difference of 0.000015.
For a Straight: we have to decide the type of straight (10 options) and then decide the
suit for each of the cards (4 options for each); as before, we don't worry about the straight
flushes, we will just subtract them. So in total we have 10 × 4 × 4 × 4 × 4 × 4 = 10,240 ways,
from which we subtract the 40 straight flushes to give 10,200, so the probability is 0.0039246.
For 3-of-a-Kind: choose the denomination (13 ways), the suits for these 3 cards (C(4,3) = 4
ways), the two other denominations (C(12,2) = 66 ways), and the arbitrary suits for the two
new cards: 4 × 4 = 16. Total: 13 × 4 × 66 × 16 = 54,912. The probability is, thus, 0.021128
(things are getting better).
Now we will do one of the subtlest ones: Two Pair. One trap that is commonly fallen
into is as follows: choose the denomination for the first pair (we have 13 ways of doing
this), then choose the suits for this pair: C(4,2) = 6 ways. Then we have 12 ways of choosing
the second pair, and again 6 ways of choosing its suits. Now all we have to do is choose
the remaining card out of a possible 48, giving us a grand total of
13 × 6 × 12 × 6 × 48 = 269,568. There are two most foul errors in this discussion. The
latter is the easiest one to catch. The 48 is wrong: you are not controlling for full houses!
Hence it should be one card out of the remaining 44, giving an adjusted count of 247,104.
But one error remains that is most subtle and very common (and tempting; can you detect
it?). Remember the distinction between a committee and an executive board. Can we tell
the difference between the first pair and the second pair? We certainly have counted them
as if we could, and that is definitely wrong. Instead we have counted every hand twice,
and the real count should be 123,552. Just to make you totally comfortable with this, let's
count them another way. Let's start by choosing the two denominations for the two pairs:
C(13,2) = 78 ways. Then we have 6 ways of choosing the suits for one of the pairs and
another 6 of choosing them for the other one, and then we have, as before, 44 ways of
choosing the extra card: 78 × 6 × 6 × 44 = 123,552.
For a simple Pair, we first have to decide the denomination (13 ways), then the suits for
the pair: C(4,2) = 6. Then we must have three other denominations, C(12,3) = 220, and
then the arbitrary suits for those denominations, 4 × 4 × 4 = 64, which gives us the total of
13 × 6 × 220 × 64 = 1,098,240.

Finally, for a Bust, we must have 5 different denominations, C(13,5) = 1,287, and a suit for
each, 4^5 = 1,024. But we have included the Royal Flushes, the Straight Flushes, the Flushes
and the Straights, so we have 1,287 × 1,024 − 4 − 36 − 5,108 − 10,200 = 1,302,540. But wait,
you say, I was stupid to have done this calculation, since we could have found the number of
bust hands by subtracting all the previous hands from the total number of hands.
However, as we will see in a later chapter, it is important to have redundancy, especially
in a sophisticated calculation like this one. And the fact that, as the table below shows us,
our numbers add up to the correct total assures us that we cannot possibly have made
only one error.

Name of the hand    Number of ways    Probability
Royal Flush         4                 0.000002
Straight Flush      36                0.000014
4-of-a-Kind         624               0.000240
Full House          3,744             0.001441
Flush               5,108             0.001965
Straight            10,200            0.003925
3-of-a-Kind         54,912            0.021128
Two Pair            123,552           0.047539
Pair                1,098,240         0.422569
Bust                1,302,540         0.501177
TOTAL               2,598,960         1.000000
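As a redundancy check of the kind just advocated, a few lines of Python can recompute every hand count and confirm that they add up to C(52,5):

```python
from math import comb

total = comb(52, 5)                                  # 2,598,960 hands
royal = 4
straight_flush = 4 * 9
four_kind = 13 * 48
full_house = 13 * comb(4, 3) * 12 * comb(4, 2)
flush = 4 * comb(13, 5) - 40                         # drop the 40 straight/royal flushes
straight = 10 * 4 ** 5 - 40
three_kind = 13 * comb(4, 3) * comb(12, 2) * 4 ** 2
two_pair = comb(13, 2) * comb(4, 2) ** 2 * 44
pair = 13 * comb(4, 2) * comb(12, 3) * 4 ** 3
bust = comb(13, 5) * 4 ** 5 - royal - straight_flush - flush - straight
hands = [royal, straight_flush, four_kind, full_house, flush,
         straight, three_kind, two_pair, pair, bust]
print(sum(hands), total)  # 2598960 2598960
```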
Example 6. Dominoes. In one version of the game of dominoes there
are two players, each of whom draws seven tiles. For those unaware
of the tiles: each tile consists of two entries (not necessarily distinct)
from the numbers 0 to 6 (inclusive). A tile with both numbers
being the same is called a double. So there are 7 tiles with the high mark being a 6, 6 with
the high mark being a 5, etcetera. So there are T7 = 28 tiles. The player with the highest
double starts the game by laying down that domino. If neither player has a double, the
tiles are reshuffled and the game starts again. We will pose some questions about the
game of dominoes.

• What is the probability the game will start with the double 6?

Since 14 tiles out of the 28 are chosen, the probability that the double 6 is chosen is simply

    C(27,13)/C(28,14) = (27!/(13!·14!)) / (28!/(14!·14!)) = 14/28 = 1/2.

• What is the probability that Tony (one of the two players) leads the double 6?

Clearly, it is half of the previous answer, so it is 1/4.
• What is the probability the game will start by leading the double 4?

To start with a double 4, it must have happened that neither the double 6 nor the
double 5 was chosen among the 14 tiles, and that the double 4 was indeed chosen.
So the probability is

    C(25,13)/C(28,14) = (14·14·13)/(28·27·26) = 7/54 ≈ 12.96%.

• What is the probability that the tiles will have to be reshuffled?

By similar reasoning to the previous, we have the probability to be given by

    C(21,14)/C(28,14) = (21!·14!)/(7!·28!) ≈ 0.29%.
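The three domino probabilities can be recomputed directly with math.comb:

```python
from math import comb

total = comb(28, 14)               # 14-tile subsets drawn by the two players
p_d6_drawn = comb(27, 13) / total  # double 6 among the 14; the other 13 are free
p_d4_leads = comb(25, 13) / total  # double 4 drawn, doubles 5 and 6 left behind
p_reshuffle = comb(21, 14) / total # all 14 tiles drawn from the 21 non-doubles
print(p_d6_drawn, round(p_d4_leads, 4), round(p_reshuffle, 4))
```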
Example 7. We next look at another historical example. It involves two of the best
mathematicians of the 17th century, both French: Fermat and Pascal (both of whom we
have encountered before). Pascal was posed the following problem:
Two parties, of equal ability, will play a game until one of them has
won six hands. Each of them has placed 32 coins in the pot to be
collected by the winner. For some unexpected reason they have to
stop when one of them has won 5 games and the other 3. How
should the 64 coins be divided?
You may think of this problem as that of flipping a coin
until you get a total of 6 heads or 6 tails. The problem had
been around for a long time, and several proposed answers
had been given, including 2:1 and 5:3. Pascal
corresponded with Fermat on it, and they both solved it
correctly, but in very different ways.
First we look at Fermat's solution. Let's call the two players A and B. Then if they were
to play 3 more hands (or flip the coin three more times), they would have decided the
winner for sure, for by then either A would have won the 1 hand he needs or B would
have won the needed 3. The possible, equally feasible, outcomes of these 3 future
games are:

Game 1   Game 2   Game 3   Winner
A        A        A        A
A        A        B        A
A        B        A        A
A        B        B        A
B        A        A        A
B        A        B        A
B        B        A        A
B        B        B        B

Of the 8 outcomes, 7 of them make A the winner, while one of them makes B the winner,
so the probability of A winning is 7/8, and the stakes should be divided 7-to-1, or 56 coins
for A and 8 for B. Note that Fermat was very careful to make sure he had equally feasible
outcomes for his claim.
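Fermat's count can be confirmed by listing the 8 equally feasible outcomes:

```python
from itertools import product

# The three forced future hands, each won by A or B.
outcomes = list(product("AB", repeat=3))
# A, who needs only 1 more win, wins unless B takes all three hands.
a_wins = sum(1 for o in outcomes if "A" in o)
print(a_wins, len(outcomes))  # 7 8
```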
As a matter of fact, a contemporary of theirs, Roberval, complained that the outcomes of
the future games would be: A, BA, BBA or BBB. So the probability of A winning is 3/4, and
the stakes should be divided 3-to-1. The problem, of course, is that Roberval had no reason to
presume his outcomes to be equally feasible. And indeed they are not: A has probability
1/2, BA has probability 1/4, and both BBA and BBB have probability 1/8, and the answer is
correctly 7/8.
A word one encounters often in the outside world is odds. In general, the odds for an
event are the probability of the event occurring divided by the probability of it not
occurring. Equivalently, it is how the stakes should be divided. So in the previous
situation, the odds for A to win are 7-to-1. The odds for B to win are 1-to-7. If you want
to bet on B, for every $1 you contribute to the pot, your opponent should contribute $7.
On to Pascal's solution. He solved the problem recursively!
He argued as follows. If the players had each won 5 games,
nobody would disagree with splitting the stakes 32-32. What
would happen if A had won 5 and B had won only 4? Well,
if they played one more hand they would be at either 6-4 or 5-5.
In one situation, A would receive all 64 coins, while in the
other, A would receive 32. Thus, we should average the two, and
give A (64 + 32)/2 = 48 coins, while B would receive 16. So we
understand 5-4. How about 5-3? With one more hand, they
would be at either 6-3 (where A gets 64) or 5-4 (where, as we just
saw, A gets 48); averaging, we get 56. The same answer as Fermat!
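Pascal's recursive argument translates almost directly into code. The sketch below (the function name share_of_A and the hard-coded 64-coin pot are my own labels) computes A's fair share when A needs 1 more win and B needs 3:

```python
from fractions import Fraction

def share_of_A(a_needs: int, b_needs: int) -> Fraction:
    """A's fair share of the 64-coin pot when A needs a_needs more
    wins and B needs b_needs more, each hand being an even chance."""
    if a_needs == 0:
        return Fraction(64)  # A has already won the match
    if b_needs == 0:
        return Fraction(0)   # B has already won the match
    # Pascal's step: average the two positions one hand later.
    return (share_of_A(a_needs - 1, b_needs)
            + share_of_A(a_needs, b_needs - 1)) / 2

print(share_of_A(1, 3))  # 56
```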
Example 8. The next idea we introduce in this section is fundamental: averages. Let's
look at the following game (once more, not atypical of the 17th century):

    Since one has one chance in six of rolling a 1 with a die, one has an
    even chance to roll at least one 1 when one rolls 3 dice. Hence I
    propose the following game to you. You will roll 3 dice. If three 1's
    show up, you win a wonderful $5; if only two 1's, you win $2; while if
    only one 1 is rolled, you will still win $1. If, unfortunately, on the other
    hand, no 1's show up, you pay me only $1.

Naturally, you are suspicious of my proposition, but it is much better to pinpoint the
reasons for your suspicions. What we need is to compute your expected value when you
play this game; this is equivalent to what your average performance is going to be. Of
course, if you are going to play this game just once, then it does not matter what you opt
to do, but as a long-range strategist, you need to compute.
Outcome    Probability
 $5        1/216
 $2        15/216
 $1        75/216
−$1        125/216

The computation is just common sense. You are basically asking: suppose I played the
game so many times, what would happen? In any one roll, you can win either 5, 2 or 1
dollars, or you can lose 1. Let's say you played 216 times (why this number?). Then on
the average, three 1's would show up once, while two 1's would show up 15 times
(15 = 3 × 5; the 3 is the number of options for which two dice are going to show the two
1's, while the 5 is what the other die is going to show). How many times does exactly one
1 show up? Choose which die shows the 1 (3 options), and then choose what the other
two dice show (5 × 5), for a total of 75 times. Finally, no 1's will show 125 times (which
is 5 × 5 × 5 = 216 − 1 − 15 − 75). So what are your winnings?

    5 × 1 + 2 × 15 + 1 × 75 − 1 × 125 = −15,

or equivalently, your expectation is

    5 × (1/216) + 2 × (15/216) + 1 × (75/216) − 1 × (125/216) = −15/216.

So on the average you will lose $15 in every 216 rolls, or approximately 7¢ a roll. Naturally,
you would rather not play a game when your expectation is negative, unless you will
have so much fun you are willing to pay the fee.
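The expectation can be recomputed exactly with Python's Fraction:

```python
from fractions import Fraction

# Probabilities of each payoff when rolling three dice for 1's.
win5 = Fraction(1, 216)     # three 1's
win2 = Fraction(15, 216)    # exactly two 1's
win1 = Fraction(75, 216)    # exactly one 1
lose1 = Fraction(125, 216)  # no 1's
expectation = 5 * win5 + 2 * win2 + 1 * win1 - 1 * lose1
print(expectation)  # -5/72, i.e. -15/216
```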
Example 9. Dominoes Revisited. A tile in Dominoes has the value of the sum of the
two numbers on it. Thus the highest-valued tile is the double 6, with the value of 12. What
is the average value of a tile? There are 28 tiles, and since each of the numbers 0 through 6
appears on 8 tile-halves, the sum of the values of all the tiles is
8 × (6 + 5 + 4 + 3 + 2 + 1 + 0) = 8 × 21 = 168, so the average value of a domino tile is
168/28 = 6. By a famous theorem, which states simply that the average of a sum is the sum
of the averages, the average total of points a domino hand of 7 tiles should have is 7 × 6 = 42.
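A quick enumeration of the 28 tiles confirms the average:

```python
# The 28 tiles are the unordered pairs (i, j) with 0 <= i <= j <= 6.
tiles = [(i, j) for i in range(7) for j in range(i, 7)]
average = sum(i + j for i, j in tiles) / len(tiles)
print(len(tiles), average)  # 28 6.0
```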
We end the section with a brief discussion of conditional probability. The next example
points out the subtleties inherent in probability; but just because it is subtle you
shouldn't mistrust it. Instead you should take the opportunity to fine-tune your brain. This
is a famous problem that has amused, confused and bedazzled many people.
Example 10. We are going to play a game where I am going to give you a choice of
three doors. Behind one is an extra point for this class; behind the other two, nothing.
You pick a door, and all-knowing I, before showing you what is behind your chosen
door, open another door which has nothing behind it. I then give you a choice of either
retaining your door or switching to the only other unopened door. Suppose you were
doing this every day there is class. What should your standard operating procedure be?
Before we discuss the problem as stated, let us solve a simpler problem. Suppose I had
just given you a choice of a door out of the 3 doors. Then nobody would argue that you
have a 1/3 chance of guessing the correct door. Is that agreed on?

Now let's convolute the problem by my showing you a door without a prize. One easily
arrived at, yet wrong, conclusion is that it really does not matter whether you have a standard
procedure. This wrongful reasoning goes as follows. Originally each door had a 1/3 chance
of having the one point behind it. One of them has been eliminated, so now each remaining
door has a 1/2 chance of having the point behind it; hence it is really the same whether you
switch or don't. Isn't this absolutely reasonable?
I will try to convince you that it is not. But first let me get a little philosophical and point
out that what probability tries to accomplish is to measure uncertainty, and that the only
uncertainty in this problem is strictly from your point of view. (After all, I know everything.)
Hence you have to try to use every bit of information available to you (sort of squeezing
blood out of rocks). What piece of information have you not weighed in the argument of the
previous paragraph? The fact that although under no circumstances would I show you
what is behind your door, I perhaps had a choice of which door to show you, and that I
indeed chose the door I chose. More directly put, it is correct to say that at the beginning
every door has a 1/3 chance of being the rich one. But what is crucial is that the new
information could not have affected the probability behind your door; instead it has
affected the probabilities of the other two, one going to 0 and the other to 2/3. And so it
indeed behooves you to switch doors all the time!
To further convince you of this we will run an experiment (the ultimate test of any
discussion).
We are going to play the game 50 times in three different ways. In one strategy you never
change doors, in another you always change doors, and in yet another you will change or
not change at random. I will in return also show you a door at random every time I have a
choice.
In order to be absolutely fair about this experiment, I let the computer choose several
random vectors of size 50. One for where the point is going to be placed, another for
which door you pick, a third for which door I show you, and finally for the last strategy,
for whether you change or not. I will label the 3 doors 1,2 and 3.
Here is the data, summarized by the totals of each strategy's 50 games (the full
game-by-game table, listing for each game the door with the point, the door you choose,
the door I show you, and the outcome under each strategy, is omitted):

Strategy                      Wins out of 50
Always keep your door         14
Always switch                 36
Switch or keep at random      19
This clearly indicates that switching is a much better strategy. We also understand why it
is a much better strategy: in the always-keep strategy, you win only when you were
correct to start with, that is, when you were correct in your original choice before I
showed you another door, which is about one third of the time, as we all know, and just
as the data points out.
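The classroom experiment used 50 pregenerated games; as a sketch, the simulation below plays many more (the function name play and the fixed seed are my own choices). The host, like the all-knowing instructor, always opens a no-prize door other than yours:

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """One round; returns True if the final pick wins the point."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither your pick nor the prize.
    shown = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != shown)
    return pick == prize

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000
keep_wins = sum(play(False, rng) for _ in range(n))
switch_wins = sum(play(True, rng) for _ in range(n))
print(keep_wins / n, switch_wins / n)  # close to 1/3 and 2/3
```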
The last example dealt with the fundamental notion of conditional probability. If A and
B are two events, one defines the probability of A given B, in symbols P(A | B), by

    P(A | B) = P(A ∩ B) / P(B).

Namely, we restrict our set of possibilities to those in which B has occurred, and that is
our denominator, while in the numerator we put the situations in which both events occur.
Example 11. The situation is simple. You are to visit a potential customer who is known
to have two children. You are speculating whether the customer has two boys. Knowing
nothing else, you know that the likelihood that you are right is 1/4 (as worked out in
Example 2 above). You arrive at the house and you see a boy playing in the backyard.
You ask the customer who the boy is, and the customer replies with one of the following:

    He is my oldest child.
    He is one of my children.

The subtleties in measuring information are reflected in the difference between the two
statements. One would hardly think there is a measurable difference between them. But
let us see what we can conclude from each.

Let A be the event that both children are boys, B the event that the oldest child is a
boy, and C the event that one of the children is a boy. Then, if B is given, the likelihood
that A will occur is the same as the likelihood that the second child is a boy, which is
simply 1/2. But given C, out of the four possibilities for two children only one is ruled
out, that of two girls, so our denominator is 3, and clearly the numerator is only 1. So in
summary we get the surprising fact that P(A) = 1/4, P(A | B) = 1/2, but P(A | C) = 1/3.
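The three answers can be checked by enumerating the four equally feasible families (ordered oldest-to-youngest):

```python
from itertools import product
from fractions import Fraction

families = list(product("BG", repeat=2))      # (oldest, youngest), 4 equally likely
A = [f for f in families if f == ("B", "B")]  # two boys
B = [f for f in families if f[0] == "B"]      # oldest is a boy
C = [f for f in families if "B" in f]         # at least one boy

p_A = Fraction(len(A), len(families))
p_A_given_B = Fraction(len(set(A) & set(B)), len(B))
p_A_given_C = Fraction(len(set(A) & set(C)), len(C))
print(p_A, p_A_given_B, p_A_given_C)  # 1/4 1/2 1/3
```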