CHAPTER SEVEN
AN INTRODUCTION TO PROBABILITY CALCULATIONS
FOR SMALL SYSTEMS
Introduction
In the last chapter we discussed how the different states of a system are specified, both quantum
mechanically and classically. Once the different states are uniquely specified, we typically examine the number of
unique microstates which all have the same macroscopic characteristics of interest (say, the same pressure P). If
we assume that all of the microstates of the system are equally likely, then we would expect the most likely
configuration of the system to be that macroscopic state which corresponds to the largest number of microstates.
In order to determine which of the macroscopic states of a system are indeed the most likely, and to examine the
fluctuations in the macroscopic measurements of these states, we must examine the concepts of probability and
statistics.
The Definition of Probability
In order to define the concept of probability, we examine the results of tossing a coin. The probability,
P(h), of obtaining heads in this experiment is given by

P(h) = Ω(h)/Ω                (7.1)

where Ω is the total number of possible outcomes of the experiment, and where Ω(h) is the number of different
ways of obtaining the desired result. Thus the probability of obtaining heads in a single coin toss is 1/2 (one
possible way of getting heads out of two possible outcomes, heads or tails). Of course, this result depends upon
the a priori assumption that each of the possible outcomes is equally likely! (This would not be true, for example,
for a weighted coin or a rounded die.)
Although the probability of getting heads in a single coin toss is 1/2, if I take a single coin out of my
pocket and flip it four (4) times I may actually get a sequence such as THTT (where H stands for heads, and T stands for
tails). I find that I get heads only 1/4 of the time rather than 1/2. Trying this same experiment again, but flipping the
coin eight (8) times, I might get a sequence such as HHTHHHTH. I find that now I get heads 3/4 of the time. I don't get 1/2 in
either case! However, if I repeat this experiment (tossing the coin) enough times, I find that the fraction of times
I get heads begins to converge to 1/2 (although I may need to flip the coin a very large number of times to see this).
Thus, when we speak of probabilities, we really think of repeating the same experiment a large number of times
(N → ∞). This can be accomplished, in principle, by performing the same identical experiment many times over
and over again with the same system (coins, dice, boxes of atoms, etc.) or it can be accomplished by performing
the same experiment in a large number of identical systems all at the same time. We call the large number of
identical experiments an “ensemble" of experiments. The probability, then, is just the relative number of times
that a given result occurs in the ensemble. A single experiment is just one member of a larger ensemble. If
we repeat the same experiment several times we are just sampling the ensemble. Thus, although we expect
50% of an ensemble of tossed coins to exhibit heads and 50% to exhibit tails, any random selection from the
ensemble will give different results. But, if we examine a large number of the members of the ensemble, we will
begin to get results close to 50%.
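The convergence described above is easy to see numerically. The following sketch (our own illustration, using Python's random module with a fixed seed so a run is repeatable) estimates the fraction of heads for longer and longer runs:

```python
import random

def fraction_heads(n_flips, seed=1):
    """Toss a fair coin n_flips times; return the fraction that come up heads."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The fraction wanders for short runs but settles toward 1/2 as N grows
for n in (4, 8, 100, 10_000, 1_000_000):
    print(n, fraction_heads(n))
```

For small n the fraction can sit well away from 1/2, just as in the four- and eight-flip trials above, while the longest runs land very close to 0.5.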
An additional point to consider is the case where there is some constraint which limits the number of
accessible states. As an example, let's look at the case of the semi-classical, one-dimensional simple harmonic
oscillator with constant energy. We pointed out in the last chapter that the region of phase space accessible to
this system was the area between the ellipse corresponding to the energy E and the ellipse corresponding to
energy E + δE. [We designate this system as one which has energy E with an uncertainty δE.] The “volume" of
phase space lying between E and E + δE contains Ω(E) different cells in phase space (each with a volume h_0).
The probability of finding this system with constant energy E in a particular region x_k is given by

P(x_k) = Ω(E; x_k)/Ω(E)                (7.2)

where we explicitly indicate a subset, Ω(E; x_k), of all the states in which the system might be found with constant
energy E and also within the region x_k.
Example 1: Consider throwing a die. The probability of the die landing with one dot up is
given by

P(1) = (number of ways to get "1")/(number of possible outcomes) = 1/6                (7.3)
Example 2: Consider drawing a single card from a full deck of 52 cards. The probability of
drawing an ace (of any kind) is given by

P1(ace) = (number of aces in deck)/(number of different cards in deck) = 4/52                (7.4)

The probability of drawing a particular ace, say the ace of clubs, would be

P2(ace of clubs) = (number of aces of clubs in the deck)/(number of different cards in deck) = 1/52                (7.5)

Now, if, after drawing the first card, we had obtained the ace of clubs, the probability of
obtaining a second ace (of any kind) from this same deck of cards would be

P3(ace) = (number of aces remaining in deck)/(number of cards remaining in deck) = 3/51                (7.6)

The probability of drawing the ace of diamonds from this deck on the second draw would be

P4(ace of diamonds) = (number of aces of diamonds)/(number of cards in deck) = 1/51                (7.7)

Notice that the probability of drawing an ace (of any kind) and then drawing a particular ace
(say, the ace of clubs) depends upon whether or not you drew the ace of clubs on the first
round. Thus these two probabilities are not independent.
The Probability Sample Space. Another way of defining probability is based upon what we will call a
“sample space”. A sample space is just a list of all possible, mutually exclusive, outcomes of an experiment and
their associated probabilities. We call each individual outcome a point in the sample space. A uniform sample
space is one in which all the points are equally likely and mutually exclusive. An example of a uniform sample
space is the case where a single coin is flipped. Each of the two outcomes is equally likely. Most sample spaces,
however, are not uniform. As an example of a non-uniform sample space, consider the case where two dice are
rolled. The possible outcomes of this toss are listed below:

1,1   1,2   1,3   1,4   1,5   1,6
2,1   2,2   2,3   2,4   2,5   2,6
3,1   3,2   3,3   3,4   3,5   3,6
4,1   4,2   4,3   4,4   4,5   4,6
5,1   5,2   5,3   5,4   5,5   5,6
6,1   6,2   6,3   6,4   6,5   6,6
If we assume that each outcome is equally likely, then we can determine the probability of obtaining
various results simply by adding the number of ways each result can be obtained. For example, the number of
ways of obtaining a sum of 5 is the number of sample points which add to five (4,1; 3,2; 2,3; 1,4). The probability
of getting a sum of 5 is, therefore,

P(5) = 4/36 = 1/9

Likewise, the probability of obtaining a sum of 10 is

P(10) = 3/36 = 1/12
If we are interested in the sum of the two dice we can generate a different sample space, in which the
possible outcomes of this experiment correspond to the sample points and the relative number of different ways
that each result can be obtained would complete this particular sample space:

Sample Space for the Sum of Two Dice

Sample Point:            2     3     4     5     6     7     8     9     10    11    12
Associated Probability: 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
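The sample space above can be generated by brute-force enumeration. This short sketch counts, for each possible sum, the fraction of the 36 equally likely outcomes that produce it, using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two distinguishable dice
outcomes = list(product(range(1, 7), repeat=2))

def prob_sum(s):
    """Probability that the two dice add up to s."""
    favorable = sum(1 for a, b in outcomes if a + b == s)
    return Fraction(favorable, len(outcomes))

print(prob_sum(5))   # 1/9
print(prob_sum(10))  # 1/12
print(prob_sum(7))   # 1/6, the most probable sum
```

Summing prob_sum(s) over all sample points 2 through 12 gives exactly 1, as a complete sample space must.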
Probabilities of Multiple Events
We now consider multiple events and their associated probabilities. We will designate the probability of
one event (call it E) by c ( A ), and another event (say F) by c ( B ). We can visualize the possible probabilities by
using a representation of sample space. Consider the grid of points below.
A
AB
B
The grid represents points in the sample space. We can enclose those points in which event E occurs, and
those points in which event F occurs. If there is no overlap, the events are independent. If, however, the two
regions overlap then we can talk about the probability that both E and F occur, c (A and B) œ c (EF)Þ Several
other possibilities also arise, such as the probability that either A or B occur, c (A or B); the probability that either
A or B occur or both, c (A B); the probability that F has occurred if we know that E has already occurred
q
q
cA (B); the probability that neither A nor B occur, c (A and B ); etc.
There are a few important relationships among these probabilities which we will now introduce without
proof. For further explanations you might refer to Boas. Many of these can be easily reasoned out based upon
the representative sample space above. In the equations below, R represents the number of sample points in the
sample space, N(A) the number of sample points in which event E occurs, etc.
P(A) = N(A)/N                (7.8)

P(AB) = N(AB)/N                (7.9)

P_A(B) = N(AB)/N(A)                (7.10)

P(AB) = 0   if A and B are mutually exclusive                (7.11)

P(A + B) = P(A) + P(B) - P(AB)                (7.12)

P(A) × P_A(B) = P(AB) = P(B) × P_B(A) = P(BA)                (7.13)

P_A(B) = P(B)   if A and B are independent                (7.14)
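These relations can be checked by direct counting on a finite sample space. As an illustrative sketch (the particular events A and B are our own choice, not taken from the text), take A = "the first die shows 6" and B = "the dice sum to 7" on the two-dice sample space:

```python
from fractions import Fraction
from itertools import product

space = set(product(range(1, 7), repeat=2))   # uniform sample space, N = 36

def P(event):
    """Probability of an event, represented as a subset of the sample space."""
    return Fraction(len(event), len(space))

A = {pt for pt in space if pt[0] == 6}        # event A: first die shows 6
B = {pt for pt in space if sum(pt) == 7}      # event B: the dice sum to 7

# P(A + B) = P(A) + P(B) - P(AB): counting the union removes the double count
assert P(A | B) == P(A) + P(B) - P(A & B)

# P(A) x P_A(B) = P(AB), where P_A(B) = N(AB)/N(A)
P_A_of_B = Fraction(len(A & B), len(A))
assert P(A) * P_A_of_B == P(A & B)

# These two particular events happen to be independent: P_A(B) = P(B)
assert P_A_of_B == P(B)
```

Swapping in other events (say, "the sum is even") lets you test the mutually exclusive case, where P(AB) = 0.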
Statistically Independent Events
Of particular interest are those cases where the occurrence of one outcome (A or B) does not influence the
occurrence of the other. These are said to be independent occurrences. For this case, the probability of obtaining
A and then B is given by

P(A and B) = P(AB) = P(A) × P(B)                (7.15)

Example: Consider throwing two dice. The probability of obtaining a six on the first roll is 1/6, and
the probability of obtaining a six on any subsequent roll (or on the roll of another die) is also 1/6.
The two rolls (or dice) are independent of one another. The probability of obtaining two sixes on
two subsequent rolls, since these rolls are independent, is therefore given by

P(6 and 6) = P(6) × P(6) = 1/6 × 1/6 = 1/36.

Let's show that this is indeed correct. By definition the probability of an event is the number of
times the event can occur divided by the total number of all possible events for the specific
experiment. How many possible combinations of numbers are there with two dice? For each face
upward on die A there are 6 possible orientations of die B. Thus, there are 6 × 6 = 36 different
possible combinations of the dice. How many different combinations will give us 12 (6 on both
dice)? There is only one! To see this, examine the table of outcomes given earlier. Notice that there is only
one way to obtain “snake eyes" and “box cars", but that there are several ways to obtain other total
values, for example 7 or 11! For example, there are two ways to obtain 11: with a 6 and a 5, and a 5
and a 6, where we differentiate between die A and die B. Thus, the most probable number to arise
in throwing two dice is 7! Notice the difference between specifying the system in this way and by
using the sample space designation used earlier.
Sometimes the probability that an event occurs depends upon whether or not another event has occurred.
We designate by P_A(B) the probability that event B occurs given the fact that event A has already occurred.
Example 1: What is the probability of picking two aces from a 52-card deck?
The probability of picking one ace is given by

P(first ace) = (number of aces)/(number of cards in deck) = 4/52                (7.16)

The probability of picking a second ace, after the first one is removed, is given by

P(second ace) = (number of remaining aces)/(number of remaining cards in deck) = 3/51                (7.17)

From our earlier arguments, the probability of picking two aces in sequence must, therefore, be

P(two aces) = P(first ace) × P(second ace) = (4 × 3)/(52 × 51) = 1/221                (7.18)
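The arithmetic of the two-ace calculation can be verified with exact rational arithmetic:

```python
from fractions import Fraction

p_first = Fraction(4, 52)         # four aces among 52 cards
p_second = Fraction(3, 51)        # three aces left among the remaining 51
p_two_aces = p_first * p_second   # sequential draws multiply
print(p_two_aces)  # 1/221
```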
Example 2: A meeting is held in which 40 doctors and 10 psychologists attend. A committee of 3
is selected at random. What is the likelihood that at least 1 psychologist will be on the committee?
We attack this problem much like the one of aces drawn from a deck. We choose one member of
the committee at a time. Let's consider the likelihood that no psychologist is chosen for the
committee. If the probability that no psychologist is on the committee is q, then the probability
that at least one on the committee is a psychologist is p = 1 - q. For the first member, the
probability that no psychologist is chosen will be 40/50. For the second selection, if no
psychologist was selected for the first position, the probability will be 39/49. For the last position,
we get 38/48. So the probability of selecting all doctors after three random selections is

P(all doctors) = (40 × 39 × 38)/(50 × 49 × 48) = 0.504                (7.19)

The probability that at least one of the committee members will be a psychologist is, therefore,
0.496!
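The same committee calculation, again with exact fractions:

```python
from fractions import Fraction

# Probability that three random picks are all doctors (no psychologist)
q = Fraction(40, 50) * Fraction(39, 49) * Fraction(38, 48)
p = 1 - q   # probability of at least one psychologist

print(float(q))  # about 0.504
print(float(p))  # about 0.496
```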
Events with Two Possible Outcomes. When we toss a coin we assume that the likelihood of obtaining
heads is equal to the likelihood of obtaining tails (i.e., 1/2). This would be true provided the coin is balanced
properly. But what if the coin is bent? In this case the probabilities of obtaining the two possible results may not
be equal. If we designate the probability of obtaining heads by P(h), then we can designate the probability of not
getting heads by the expression P(h̄), where we let h̄ stand for not-h. It should be obvious that we will obtain
either heads or not-heads for any given experiment, so the probability of getting heads plus the probability of not
getting heads is a certainty, or

P(h) + P(h̄) = 1                (7.20)

or

P(h̄) = 1 - P(h)                (7.21)

This idea can be applied to any situation where we are interested only in the probability of one type of event as
compared to any other event, no matter how many different outcomes may actually be possible. To simplify
notation, we will designate the probability of obtaining the desired outcome by p and the probability of any other
outcome by q. This means that the equations above would be written

p + q = 1                (7.22)

q = 1 - p                (7.23)
Example 1: The probability of obtaining two dots up when a die is thrown is 1/6, while the
probability of obtaining any other result is 5/6.
Example 2: For a spin-1/2 particle, the probability of its z-component being +1/2 is 1/2, and the
probability of it being -1/2 is also 1/2.
Permutations and Combinations
Assume that we have a “true” die so that the probability of getting any one face up is 1/6. The probability
of getting an ace (one up) is just 1/6 while the probability of getting anything else is 5/6. Now assume that we have
five such dice and want to know the probability of throwing 2 aces. When we roll the first die, the probability of
throwing an ace is 1/6 and the probability of anything else is 5/6. Since the probabilities of the individual dice are
independent (one does not influence the other) the probability that the second die will be an ace is also 1/6. Thus,
the probability of getting two aces on the first two throws, but not getting an ace on the next three, must be

P(AANNN) = (1/6) · (1/6) · (5/6) · (5/6) · (5/6) = 0.0161

Likewise, the probability of getting an ace on the first throw and on the third throw would be

P(ANANN) = (1/6) · (5/6) · (1/6) · (5/6) · (5/6) = 0.0161

and is the same result. Obviously, there are a number of different ways that we can obtain just two aces. We
typically want to know the probability of obtaining two aces out of five throws independent of the order in which
the aces occur, or of the identity of an individual die. But just how many ways is that? To answer this we need to
talk about permutations and combinations.
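That any specific ordering of 2 aces and 3 non-aces has the same probability can be confirmed numerically:

```python
from fractions import Fraction

p_ace, p_other = Fraction(1, 6), Fraction(5, 6)

# Two specific orderings of 2 aces (A) and 3 non-aces (N) in five throws
p_AANNN = p_ace * p_ace * p_other * p_other * p_other
p_ANANN = p_ace * p_other * p_ace * p_other * p_other

assert p_AANNN == p_ANANN   # multiplication does not care about the order
print(float(p_AANNN))  # about 0.0161
```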
To understand the concept of permutations we will consider the example of a “State Dinner” for which
there are N persons to be seated at the table (where there are N chairs; we don't want to leave anyone out). We
might wonder how many different ways we could arrange the nametags for the individual plates (i.e., how many
permutations P(N|N) there are of N persons among N chairs). Let's assume that all the nametags were placed
in a large box and that we would draw them out one at a time and place them by the plates. When we draw the
first name, there are N nametags to choose from; thus there are N possibilities. For the second plate we have
N - 1 possible choices. Thus, in selecting the nametags for two different plates we would have N(N - 1)
possible combinations. For the next plate there are N - 2 possible choices, etc., which means that by the time we
have chosen N nametags and placed them by N plates, there are

P(N|N) = N(N - 1)(N - 2)⋯(1) = N!                (7.24)

different ways that the nametags could have been selected. This means that there are N! different seating
arrangements if we do not restrict the seating order (for example, if we do not care who sits at the head table).
In our example of throwing five dice, we might consider numbering each individual die from one to five
so that they are distinguishable from one another (just like each die having its own name). The dice are then
placed into a sack and are pulled out of the sack one at a time and rolled. According to what we have stated
above, there are 5! different orders in which we can reach into our bag, pull out a die, and roll it, and on each
roll we have a 1/6 probability of getting an ace.
Now suppose we have only n (n < N) plates for our state dinner. (We will ignore the fact that having too
few chairs might create a very difficult eating atmosphere.) We might ask how many different ways, P(N|n),
there are to choose the nametags and place them by these n plates. As before, there are N choices for the first
plate, N - 1 for the second, etc., until we reach the nth plate. For this plate there are N - (n - 1) choices left.
Thus, the number of possible permutations is given by

P(N|n) = N(N - 1)(N - 2)⋯(N - [n - 1])                (7.25)

or

P(N|n) = N(N - 1)(N - 2)⋯(N - n + 1)                (7.26)

which can be written

P(N|n) = [N(N - 1)(N - 2)⋯(N - n + 1) × (N - n)!]/(N - n)! = N!/(N - n)!                (7.27)

or

P(N|n) = P(N|N)/(N - n)!                (7.28)

This is the number of different possible arrangements of the name tags on the tables.
Now suppose we are not really interested in the number of different ways the name tags can be placed on
the tables, but are really just interested in the number of different ways we can select (at random) n different
people, who would be allowed to eat at the state dinner, from the N people who had asked to be allowed to eat at the
dinner. Would we care which person's name was placed by which plate? No! Therefore, after choosing the n
names from the box of those who will be allowed to sit at the table, we could rearrange the nametags n! different ways.
When we determine the number of permutations, P(N|n), we are counting the number of unique ways in which
the people can be arranged. If we are only interested in the number of ways to choose n people, we have
overcounted by the number of permutations, P(n|n), of the n people which are chosen. We call the number of
different ways to choose n different people from a pool of N people, without counting the permutations of the n
people, a combination, C(N|n), and it is given by

C(N|n) = P(N|n)/P(n|n) = N!/[(N - n)! n!]                (7.29)
Example 1: Suppose you just received a beautiful ivory box for your birthday. The box has three
cubical sections which are just the right size to hold three of your prize marbles. But you have 5
prize marbles. You place the three best marbles in the box and then gaze delightedly at your
treasure. However, as you look at the ivory box with your three marbles you realize that one of the
other marbles would actually look better, because of its unique coloring. As you change out the
marbles, it becomes more and more difficult to decide just which marbles you should place in the
box. How many different possibilities are there for you to consider as you try to make up your
mind?
Since the marbles are distinguishable (i.e., they are not all identical) you are asking how many
unique ways you can place 5 marbles into 3 spaces. This is the same problem as the N people
being seated at n different plates. There are P(5|3) = 5!/(5 - 3)! = 5!/2! = 60 unique permutations of the
marbles among the three available spaces. However, of these 60 permutations, some have the
same three marbles in the available spaces, but simply arranged in different orders within the ivory
box. If you are not concerned with just how the three marbles in the ivory box are arranged, then
the actual number of different ways of putting 5 marbles in 3 spaces is 60/3! = 10 combinations.
[Note: You also get the same answer if you consider the two marbles which are left outside the
ivory box, since C(5|3) = 5!/[(5 - 3)! 3!] = 5!/[2! 3!] = 5!/[(5 - 2)! 2!] = C(5|2).]
Example 2: Suppose you have a club with 50 members and you are about to elect four (4) officers:
a president, vice-president, secretary, and treasurer. How many different possible outcomes are
there to this selection process?
In selecting the president, you have 50 choices. But once he is selected there are now only 49
choices left for vice-president. After choosing him, there are only 48 ways to choose the secretary,
and then only 47 ways to choose the treasurer. Thus, there are

P(50|4) = 50!/(50 - 4)! = 5,527,200

different possible outcomes if the selection process were purely random.
Example 3: Now suppose that this same club of 50 members wishes to choose a four-member
committee to consider the initiation rites of the club. How many different combinations are there
for selecting this committee?
The difference between this situation and the one in the last example is that all the members of the
committee are assumed to be equal (which would be true provided a chairman were not selected in
advance). This means that the order in which the four were selected would not be important.
This is an example of a combination:

C(50|4) = 50!/[(50 - 4)! 4!] = 5,527,200/24 = 230,300
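Both counting formulas can be written directly in terms of factorials. This sketch (the function names P and C are ours, mirroring the text's notation) reproduces the numbers from the three examples:

```python
from math import factorial

def P(N, n):
    """P(N|n): number of ordered arrangements of n objects chosen from N."""
    return factorial(N) // factorial(N - n)

def C(N, n):
    """C(N|n): permutations divided by the n! orderings of the chosen objects."""
    return P(N, n) // factorial(n)

print(P(5, 3))   # 60 ordered ways to place 3 of 5 marbles
print(C(5, 3))   # 10 combinations; note C(5, 3) == C(5, 2)
print(P(50, 4))  # 5527200 possible slates of officers
print(C(50, 4))  # 230300 possible four-member committees
```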
The Binomial Expansion and Probability
In cases where the outcome of some process can be categorized as either successful or unsuccessful, we
can make use of the binomial expansion to determine the probability of success or failure of that process. To
illustrate this, we will first consider tossing two coins. We know that the probability of getting heads for a single
toss is 1/2 and the probability of getting both heads is 1/2 × 1/2 = 1/4. But what about the other possible outcomes?
The probability of getting both tails is likewise 1/4, so that the probability of getting one head and one tail is 1/2 when
tossing two coins. A table of all the possible outcomes is shown in Table 7.1.

TABLE 7.1
Tabulation of the outcomes of tossing two coins simultaneously.

System State    Coin 1    Coin 2    # Heads
     1             H         H         2
     2             H         T         1
     3             T         H         1
     4             T         T         0

Now for each coin the probability of getting either heads or tails is unity. Mathematically, we write this as
(p_i + q_i) = 1, where p_i is the probability of obtaining heads, q_i is the probability of obtaining tails, and where the
subscript i denotes which coin. If we toss two coins, then, the probability that the experiment has an outcome of
any sort must be given by

(p_1 + q_1)(p_2 + q_2) = 1                (7.30)

which gives, upon expansion,

p_1 p_2 + p_1 q_2 + q_1 p_2 + q_1 q_2 = 1                (7.31)

Notice that we obtain four different terms, one for each of the possible outcomes: one with two heads, one with
two tails, and two with one head and one tail! Therefore, it appears that this binomial expansion can be used to
describe our coin-tossing experiment. Notice that this expansion designates which coin is heads and which is tails!
This is a result of the fact that we numbered each p and q, which implies that the different coins are
distinguishable (i.e., one may be older and therefore less shiny). If we are not interested, however, in which coin
is heads and which is tails, we might simply designate this problem by the equation

(p + q)(p + q) = (1) p² + (2) pq + (1) q²                (7.32)

where the number appearing in parentheses on the right-hand side of the equation indicates the number of ways of
obtaining the desired result. Thus, there are two distinct ways of obtaining one head and one tail, but only one
way to obtain two heads or two tails.
This same process can be extended to the case of tossing three coins. If we want to distinguish each coin
we could write

(p_1 + q_1)(p_2 + q_2)(p_3 + q_3) = 1

or

p_1 p_2 p_3 + p_1 p_2 q_3 + p_1 q_2 p_3 + q_1 p_2 p_3 + p_1 q_2 q_3 + q_1 p_2 q_3 + q_1 q_2 p_3 + q_1 q_2 q_3 = 1

This way of expressing the probabilities specifically indicates which coin is a head and which coin is a tail. [Note:
This listing, however, does not include all possible permutations, as we defined them earlier. The permutations
would be equivalent to specifying the ordering of the coins. For example, p_1 p_2 q_3 would be a different
arrangement from p_1 q_3 p_2.] If we are not interested in distinguishing the three coins (i.e., if we don't care which
coin is heads and which tails), then we have

(p + q)(p + q)(p + q) = (p + q)³ = p³ + 3p²q + 3pq² + q³                (7.33)

Here we see that there are three ways to obtain two heads and one tail, three ways to obtain one head and two
tails, and only one way to obtain all heads or all tails! The binomial expansion, then, is a useful method of
determining the probability of multiple events when each individual event is independent and can be expressed in
terms of only one of two possibilities with constant probability, and when we are considering indistinguishable
particles.
The general expression for the binomial expansion is

(p + q)^N = p^N + N p^(N-1) q + [N(N - 1)/2!] p^(N-2) q² + [N(N - 1)(N - 2)/3!] p^(N-3) q³ + ⋯ + q^N                (7.34)

Notice that the coefficients of the binomial expansion have the form of a combination C(N|n) which we
introduced earlier, since

N(N - 1)(N - 2)/3! = [N(N - 1)(N - 2)(N - 3)!]/[(N - 3)! 3!] = N!/[(N - 3)! 3!]                (7.35)

Using this last equation, we can write the binomial expansion as

(p + q)^N = Σ_{n=0}^{N} N!/[(N - n)! n!] p^n q^(N-n) = Σ_{n=0}^{N} C(N|n) p^n q^(N-n) = 1                (7.36)

Each term of the form

P_{N|n} = C(N|n) p^n q^(N-n)

represents the probability of obtaining n desired results in N attempts. The number C(N|n) is the number of
different ways of obtaining the same result, p^n is the probability of n successes, and q^(N-n) is the probability of
N - n failures.
Example 1. Consider tossing 4 dice. What is the probability of obtaining 3 aces (ones)?
The probability of success (an ace) is 1/6, while the probability of failure is 5/6. The
probability of obtaining 3 aces is then (1/6)³, but there are C(4|3) = 4 different ways of
obtaining 3 aces in four throws (AAAN, AANA, ANAA, NAAA), so we have

P_{4|3} = C(4|3) p³ q^(4-3) = 4 · (1/6)³ · (5/6) = 20/1296 = 5/324

Example 2. What is the probability of getting 3 heads in four tosses of a balanced coin?
This is expressed as

P_{4|3} = C(4|3) · (1/2)³ · (1/2) = C(4|3) · (1/2)⁴ = 4/16 = 1/4
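Each such term can be computed directly. The sketch below uses Python's math.comb for C(N|n) and reproduces both examples above:

```python
from fractions import Fraction
from math import comb

def binomial_prob(N, n, p):
    """P(N|n) = C(N|n) p^n q^(N-n): exactly n successes in N independent trials."""
    p = Fraction(p)
    q = 1 - p
    return comb(N, n) * p**n * q**(N - n)

print(binomial_prob(4, 3, Fraction(1, 6)))  # 5/324: three aces with four dice
print(binomial_prob(4, 3, Fraction(1, 2)))  # 1/4: three heads in four coin tosses
```

Summing binomial_prob(N, n, p) over n from 0 to N gives 1 for any p, which is just Eq. (7.36).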
You might think that the binomial distribution is a very special case which would be of little real interest in
physics. However, in the case of objects having spin 1/2, these particles can exhibit either spin “up” or spin
“down”, and this type of reasoning is just what we need! Similarly, we can often express a desired outcome as one
of two choices.
Using the binomial expansion to determine the probability of n successes in N attempts is useful when
dealing with a small number of events, but when N becomes large this process becomes very cumbersome. In
fact, when N becomes large we may not really be so interested in the probability of getting a particular outcome,
but rather in the most probable outcome and the size of the fluctuations about that most probable outcome. This is
what we will begin to examine in the next chapter.