Introduction to Probability
Ariel Yadin
Lecture 2
1. Discrete Probability Spaces
Discrete probability spaces are those for which the sample space is countable. We have already
seen that in this case we can take all subsets to be events, so F = 2^Ω. We have also implicitly
seen that due to additivity on disjoint sets, the probability measure P is completely determined
by its value on singletons.
That is, let (Ω, 2^Ω, P) be a probability space where Ω is countable. If we denote p(ω) = P({ω})
for all ω ∈ Ω, then for any event A ⊂ Ω we have that A = ⋃_{ω∈A} {ω}, and this is a countable
disjoint union. Thus,
\[ P(A) = \sum_{\omega \in A} P(\{\omega\}) = \sum_{\omega \in A} p(\omega). \]
Exercise 2.1. Let Ω = {1, 2, . . .} and define probability measures P and P′ on (Ω, 2^Ω) by
(1) P({ω}) = 1/((e − 1) · ω!).
(2) P′({ω}) = (1/3) · (3/4)^ω.
(Extend using additivity on disjoint unions.) Show that both uniquely define probability measures.
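Solution. We need to show that P(Ω) = 1, and that P is additive on disjoint unions.
Indeed, since Σ_{ω≥1} 1/ω! = e − 1 and Σ_{ω≥1} (3/4)^ω = 3,
\[ P(\Omega) = \sum_{\omega=1}^{\infty} P(\{\omega\}) = \sum_{\omega=1}^{\infty} \frac{1}{(e-1)\,\omega!} = 1, \]
\[ P'(\Omega) = \frac{1}{3} \cdot \sum_{\omega=1}^{\infty} (3/4)^{\omega} = 1. \]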
Now let (A_n) be a sequence of mutually disjoint events and let A = ⋃_n A_n. Using the fact that
any subset is the disjoint union of the singletons composing it, we have that ω ∈ A if and only if
there exists a unique n(ω) such that ω ∈ A_{n(ω)}. Thus,
\[ P(A) = \sum_{\omega \in A} P(\{\omega\}) = \sum_{n} \sum_{\omega \in A_n} P(\{\omega\}) = \sum_{n} P(A_n). \]
The same for P′. □
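As a numerical sanity check of the two normalizations in Exercise 2.1, here is a minimal Python sketch that adds up the first terms of each density. The function names `p1`, `p2`, `partial_sum` and the truncation point `N` are illustrative choices, not part of the notes, and a truncated sum only approximates the full series.

```python
import math

def p1(w: int) -> float:
    """Density of P: 1 / ((e - 1) * w!) for w = 1, 2, ..."""
    return 1.0 / ((math.e - 1.0) * math.factorial(w))

def p2(w: int) -> float:
    """Density of P': (1/3) * (3/4)**w for w = 1, 2, ..."""
    return (1.0 / 3.0) * (0.75 ** w)

def partial_sum(density, N: int) -> float:
    """Sum of density(w) over w = 1, ..., N (a truncation of the full series)."""
    return sum(density(w) for w in range(1, N + 1))

N = 100  # truncation point, an arbitrary choice kept small enough for floats
total1 = partial_sum(p1, N)
total2 = partial_sum(p2, N)
print(total1, total2)  # both should be very close to 1
```

The geometric tail left out of the second sum is (3/4)^N, so the truncation error there is easy to bound by hand.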
The next proposition generalizes the above examples, and characterizes all discrete probability
spaces.
Proposition 2.2. Let Ω be a countable set. Let p : Ω → [0, 1] be such that Σ_{ω∈Ω} p(ω) = 1. Then
there exists a unique probability measure P on (Ω, 2^Ω) such that P({ω}) = p(ω) for all ω ∈ Ω.
(p as above is sometimes called the density of P.)
Proof. Let A ⊂ Ω. Define
\[ P(A) = \sum_{\omega \in A} p(\omega). \]
We have by the assumption on p that P(Ω) = Σ_{ω∈Ω} p(ω) = 1. Also, if (A_n) is a sequence of
events that are mutually disjoint, and A = ⋃_n A_n is their union, then for any ω ∈ A there exists
n such that ω ∈ A_n. Moreover, this n is unique, since A_n ∩ A_m = ∅ for all m ≠ n, so ω ∉ A_m
for all m ≠ n. So
\[ P(A) = \sum_{\omega \in A} p(\omega) = \sum_{n} \sum_{\omega \in A_n} p(\omega) = \sum_{n} P(A_n). \]
This shows that P is a probability measure on (Ω, 2^Ω).
For uniqueness, let P̃ : 2^Ω → [0, 1] be a probability measure such that P̃({ω}) = p(ω) for all
ω ∈ Ω. Then, for any A ⊂ Ω,
\[ \tilde{P}(A) = \sum_{\omega \in A} \tilde{P}(\{\omega\}) = \sum_{\omega \in A} p(\omega) = P(A). \]
□
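The construction in Proposition 2.2 can be sketched in Python for a finite sample space. The helper `measure_from_density` and the three-point density below are hypothetical illustrations; exact rational arithmetic via `fractions.Fraction` is used so that the checks are not blurred by rounding.

```python
from fractions import Fraction

def measure_from_density(p):
    """Return the set function A -> sum of p[w] over w in A."""
    if sum(p.values()) != 1:
        raise ValueError("density must sum to 1")

    def P(A):
        return sum((p[w] for w in A), Fraction(0))

    return P

# Hypothetical density on a three-point sample space.
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
P = measure_from_density(p)

whole = P(p.keys())               # P(Omega)
union = P({"a", "c"})             # P of a disjoint union
parts = P({"a"}) + P({"c"})       # sum of the probabilities of the parts
print(whole, union, parts)
```

Only finite additivity is checked here; the countable case in the proof relies on rearranging a series of non-negative terms.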
*** Nov. 8 ***
Example 2.3.
(1) A simple but important example is for finite sample spaces. Let Ω be a finite set.
Suppose we assume that all outcomes are equally likely. That is, P({ω}) = 1/|Ω| for all
ω ∈ Ω, and so, for any event A ⊂ Ω we have that P(A) = |A|/|Ω|.
It is simple to verify that this is a probability measure. This is known as the uniform
measure on Ω.
(2) We throw two dice. What is the probability the sum is 7? What is the probability the
sum is 6?
Solution. The sample space here is Ω = {1, 2, . . . , 6}^2 = {(i, j) : 1 ≤ i, j ≤ 6}. Since
we assume the dice are fair, all outcomes are equally likely, and so the probability measure
is the uniform measure on Ω.
Now, the event that the sum is 7 is
A = {(i, j) : 1 ≤ i, j ≤ 6 , i + j = 7} = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} .
So P(A) = |A|/|Ω| = 6/36 = 1/6.
The event that the sum is 6 is
B = {(i, j) : 1 ≤ i, j ≤ 6 , i + j = 6} = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} .
So P(B) = 5/36.
(3) There are 15 balls in a jar, 7 black balls and 8 white balls. When removing a ball from
the jar, any ball is equally likely to be removed. Shir puts her hand in the jar and takes
out two balls, one after the other. What is the probability Shir removes one black ball
and one white ball?
Solution. First, we can think of the different balls being represented by the numbers
1, 2, . . . , 15. So the sample space is Ω = {(i, j) : 1 ≤ i, j ≤ 15, i ≠ j}. Note that the size
of Ω is |Ω| = 15 · 14. Since it is equally likely to remove any ball, we have that the
probability measure is the uniform measure on Ω.
Let A be the event that one ball is black and one is white. How can we compute
P(A) = |A|/|Ω|?
We can split A into two disjoint events, and use additivity of probability measures:
Let B be the event that the first ball is white and the second ball is black. Let B′ be
the event that the first ball is black and the second ball is white. It is immediate that B
and B′ are disjoint. Also, A = B ∪ B′. Thus, P(A) = P(B) + P(B′), and we only need
to compute the sizes of B, B′.
Note that the number of pairs (i, j) such that ball i is white and ball j is black is 8 · 7.
So |B| = 8 · 7. Similarly, |B′| = 7 · 8. All in all,
\[ P(A) = P(B) + P(B') = \frac{8 \cdot 7}{15 \cdot 14} + \frac{7 \cdot 8}{15 \cdot 14} = \frac{8}{15}. \]
(4) A deck of 52 cards is shuffled, so that any order is equally likely. The top 10 cards are
distributed among 2 players, 5 to each one. What is the probability that at least one
player has a royal flush (10-J-Q-K-A of the same suit)?
Solution. Each player receives a subset of the cards of size 5, so the sample space is
Ω = {(S_1, S_2) : S_1, S_2 ⊂ C, |S_1| = |S_2| = 5, S_1 ∩ S_2 = ∅},
where C is the set of all cards (i.e. C = {A♠, 2♠, . . . , A♦, 2♦, . . .}). There are $\binom{52}{5}$
ways to choose S_1 and $\binom{47}{5}$ ways to choose S_2, so |Ω| = $\binom{52}{5} \cdot \binom{47}{5}$. Also, every choice is
equally likely, so P is the uniform measure.
Let A_i be the event that player i has a royal flush (i = 1, 2). For s ∈ {♠, ♦, ♥, ♣},
let B(i, s) be the event that player i has a royal flush of the suit s. So A_i = ⨄_s B(i, s).
B(i, s) is the event that S_i is a specific set of 5 cards, so |B(i, s)| = $\binom{47}{5}$ for any choice
of i, s. Thus,
\[ P(A_i) = \sum_{s} \frac{|B(i,s)|}{|\Omega|} = \frac{4 \binom{47}{5}}{\binom{52}{5} \binom{47}{5}} = \frac{4}{\binom{52}{5}}. \]
Now we use the inclusion-exclusion principle:
P(A) = P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2).
The event A_1 ∩ A_2 is the event that both players have a royal flush, so |A_1 ∩ A_2| = 4 · 3
since there are 4 options for the first player's suit, and then 3 for the second. Altogether,
\[ P(A) = \frac{4}{\binom{52}{5}} + \frac{4}{\binom{52}{5}} - \frac{4 \cdot 3}{\binom{52}{5} \binom{47}{5}} = \frac{8}{\binom{52}{5}} \cdot \left( 1 - \frac{3}{2 \binom{47}{5}} \right). \]
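The counting arguments in items (2) to (4) of Example 2.3 can be cross-checked with a short Python sketch. The dice and the balls are enumerated directly; the card example is too large to enumerate, so it is recomputed from the set sizes given in the text. The labelling of balls 1 to 7 as black is an arbitrary assumption made for the illustration.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb

# (2) Two fair dice: uniform measure on {1,...,6}^2.
dice = list(product(range(1, 7), repeat=2))
p_sum7 = Fraction(sum(1 for i, j in dice if i + j == 7), len(dice))
p_sum6 = Fraction(sum(1 for i, j in dice if i + j == 6), len(dice))

# (3) Balls 1..7 are taken to be black and 8..15 white (an arbitrary labelling).
black = set(range(1, 8))
draws = list(permutations(range(1, 16), 2))   # ordered pairs (i, j) with i != j
mixed = [(i, j) for i, j in draws if (i in black) != (j in black)]
p_mixed = Fraction(len(mixed), len(draws))

# (4) Royal flush: work with the set sizes instead of enumerating Omega.
omega = comb(52, 5) * comb(47, 5)             # |Omega|
one_player = 4 * comb(47, 5)                  # |A_1| = |A_2|, four suits each
both = 4 * 3                                  # |A_1 ∩ A_2|
p_royal = Fraction(2 * one_player - both, omega)

print(p_sum7, p_sum6, p_mixed, float(p_royal))
```

Working with `Fraction` keeps every probability exact, which makes it easy to compare the results with the closed forms derived by hand.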
Exercise 2.4. Prove that there is no uniform probability measure on N; that is, there is no
probability measure on N such that P({i}) = P({j}) for all i, j.
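Solution. Assume such a probability measure exists, and let p = P({j}) be the common value on
singletons. By the defining properties of probability measures,
\[ 1 = P(\mathbb{N}) = \sum_{j=0}^{\infty} P(\{j\}) = \sum_{j=0}^{\infty} p, \]
and the right-hand side is 0 if p = 0 and ∞ if p > 0. A contradiction. □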
Exercise 2.5. A deck of 52 cards is ordered randomly, all orders equally likely. What is the
probability that the 14th card is an ace? What is the probability that the first ace is at the 14th
card?
Solution. The sample space is all the possible orderings of the set of cards C = {A♠, 2♠, . . . , A♦, 2♦, . . . , }.
So Ω is the set of all 1-1 functions f : {1, . . . , 52} → C, where f (1) is the first card, f (2) the
second, and so on. So |Ω| = 52!. The measure is the uniform one.
Let A be the event that the 14th card is an ace. Let B be the event that the first ace is at
the 14th card.
A is the event that f(14) is an ace, and there are 4 choices for this ace, so if 𝒜 is the set of
aces, A = {f ∈ Ω : f(14) ∈ 𝒜}, so |A| = 4 · 51!. Thus, P(A) = 4/52 = 1/13.
B is the event that f(14) is an ace, but also f(j) is not an ace for all j < 14. Thus,
B = {f ∈ Ω : f(14) ∈ 𝒜, ∀ j < 14 f(j) ∉ 𝒜}. So |B| = 4 · 48 · 47 · · · 36 · 38! = 4 · 48! · 38 · 37 · 36,
and
\[ P(B) = \frac{4 \cdot 38 \cdot 37 \cdot 36}{52 \cdot 51 \cdot 50 \cdot 49} \approx 0.031160772. \]
□
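The two answers of Exercise 2.5 can be verified exactly in Python. The second computation of P(B) below uses a different counting route (choose the ace-free set of the first 13 cards, then an ace in position 14); it is added here only as a cross-check and is not the argument used in the notes.

```python
from fractions import Fraction
from math import comb

# P(14th card is an ace): 4 * 51! orderings out of 52!.
p_A = Fraction(4, 52)

# P(first ace is at position 14), from |B| = 4 * 48! * 38 * 37 * 36 and |Omega| = 52!.
p_B = Fraction(4 * 38 * 37 * 36, 52 * 51 * 50 * 49)

# Cross-check by a different count: the first 13 cards form an ace-free
# 13-subset, then the 14th card is one of the 4 aces among the remaining 39.
p_B_alt = Fraction(comb(48, 13), comb(52, 13)) * Fraction(4, 39)

print(p_A, p_B, float(p_B))
```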
2. Some set theory
Recall the inclusion-exclusion principle:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
This can be demonstrated by the Venn diagram in Figure 1.
Figure 1. The inclusion-exclusion principle. (Venn diagram showing A, B, A ∪ B and A ∩ B.)
This diagram also illustrates the intuitive meaning of A ∪ B; namely, A ∪ B is the event that
one of A or B occurs. Similarly, A ∩ B is the event that both A and B occur.
Let (A_n)_{n=1}^∞ be a sequence of events in some probability space. We have
\[ \bigcup_{n \ge k} A_n = \{\omega \in \Omega : \omega \in A_n \text{ for at least one } n \ge k\} = \{\omega \in \Omega : \text{there exists } n \ge k \text{ such that } \omega \in A_n\}. \]
So ⋂_{k=1}^∞ ⋃_{n≥k} A_n is the set of ω ∈ Ω such that for any k, there exists n ≥ k such that ω ∈ A_n;
i.e.
\[ \bigcap_{k=1}^{\infty} \bigcup_{n \ge k} A_n = \{\omega \in \Omega : \omega \in A_n \text{ for infinitely many } n\}. \]
Definition 2.6. Define the set lim sup_n A_n as
\[ \limsup_{n} A_n := \bigcap_{k=1}^{\infty} \bigcup_{n \ge k} A_n. \]
The intuitive meaning of lim sup_n A_n is that infinitely many of the A_n's occur.
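Similarly, we have that
\[ \bigcup_{k=1}^{\infty} \bigcap_{n \ge k} A_n = \{\omega \in \Omega : \text{there exists } n_0 \text{ such that for all } n > n_0,\ \omega \in A_n\} = \{\omega \in \Omega : \omega \in A_n \text{ for all large enough } n\}. \]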
Definition 2.7. Define
\[ \liminf_{n} A_n := \bigcup_{k=1}^{\infty} \bigcap_{n \ge k} A_n. \]
lim inf_n A_n means that A_n will occur from some large enough n onward; that is, eventually
all A_n occur.
It is easy to see that
\[ \liminf_{n} A_n \subseteq \limsup_{n} A_n. \]
This also fits the intuition, as if all A_n occur eventually, then they occur infinitely many times.
Definition 2.8. If (A_n)_{n=1}^∞ is a sequence of events such that lim inf_n A_n = lim sup_n A_n (as sets)
then we define
\[ \lim_{n} A_n := \liminf_{n} A_n = \limsup_{n} A_n. \]
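For a periodic sequence of sets these definitions can be evaluated exactly, because every tail union equals the union over one period and every tail intersection equals the intersection over one period. The following Python sketch uses that observation; the helper names and the alternating example are hypothetical, and the shortcut does not apply to non-periodic sequences.

```python
from functools import reduce

def limsup_periodic(period):
    """lim sup of the sequence obtained by repeating `period` forever.

    For a periodic sequence every tail union equals the union over one period.
    """
    return set().union(*period)

def liminf_periodic(period):
    """lim inf of the same sequence: every tail intersection equals the
    intersection over one period."""
    return reduce(lambda a, b: a & b, period)

# Hypothetical example: A_n alternates between {1, 2} and {2, 3}.
period = [{1, 2}, {2, 3}]
upper = limsup_periodic(period)   # points lying in infinitely many A_n
lower = liminf_periodic(period)   # points lying in all but finitely many A_n
print(lower, upper, lower <= upper)
```

In this example the two sets differ, so the sequence has no limit in the sense of Definition 2.8.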
Example 2.9. Consider Ω = N, and the sequence
\[ A_n = \{ n^j : j = 0, 1, 2, \ldots \}. \]
If m < n and m ∈ A_n, then it must be that m = n^0 = 1. Thus,
\[ \bigcap_{n \ge k} A_n = \{1\} \quad \text{and} \quad \liminf_{n} A_n = \bigcup_{k=1}^{\infty} \bigcap_{n \ge k} A_n = \{1\}. \]
Also, if m < k ≤ n, and m ∈ A_n, then again m = 1, so ⋃_{n≥k} A_n does not contain any m with 1 < m < k.
Hence,
\[ \limsup_{n} A_n = \bigcap_{k=1}^{\infty} \bigcup_{n \ge k} A_n = \{1\}. \]
Thus, the limit exists and lim_n A_n = {1}.
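The key observation of Example 2.9 can be explored numerically by restricting every A_n to a finite window {0, ..., M}. The sketch below is only an illustration over a finite range of n; the window size `M` and the helper `powers_up_to` are arbitrary, hypothetical choices.

```python
def powers_up_to(n: int, M: int) -> set:
    """A_n ∩ {0, ..., M}, where A_n = {n**j : j = 0, 1, 2, ...}."""
    if n == 1:
        # 1**j is always 1, so avoid looping forever.
        return {1} if M >= 1 else set()
    out, x = set(), 1
    while x <= M:
        out.add(x)
        x *= n
    return out

M = 50  # size of the window, an arbitrary choice
restricted = {n: powers_up_to(n, M) for n in range(1, 2 * M + 1)}

# For every n > M the restricted set is {1}, so inside the window the
# sequence is eventually constant, matching lim inf = lim sup = {1}.
tail_is_one = all(restricted[n] == {1} for n in range(M + 1, 2 * M + 1))
print(restricted[2], restricted[7], tail_is_one)
```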
Definition 2.10. A sequence of events is called increasing if A_n ⊂ A_{n+1} for all n, and decreasing
if A_{n+1} ⊂ A_n for all n.
Proposition 2.11. Let (A_n) be a sequence of events. Then,
\[ (\liminf_{n} A_n)^c = \limsup_{n} A_n^c. \]
Moreover, if (A_n) is increasing (resp. decreasing) then lim_n A_n = ⋃_n A_n (resp. lim_n A_n = ⋂_n A_n).
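Proof. The first assertion is de-Morgan.
For the second assertion, note that if (A_n) is increasing, then
\[ \bigcup_{n \ge k} A_n = \bigcup_{n \ge 1} A_n \quad \text{and} \quad \bigcap_{n \ge k} A_n = A_k. \]
So
\[ \limsup_{n} A_n = \bigcap_{k} \bigcup_{n \ge k} A_n = \bigcap_{k} \bigcup_{n} A_n = \bigcup_{n} A_n, \]
\[ \liminf_{n} A_n = \bigcup_{k} \bigcap_{n \ge k} A_n = \bigcup_{k} A_k. \]
If (A_n) is decreasing then (A_n^c) is increasing. So
\[ \limsup_{n} A_n = (\liminf_{n} A_n^c)^c = \Big( \bigcup_{n} A_n^c \Big)^c = (\limsup_{n} A_n^c)^c = \liminf_{n} A_n, \]
and
\[ \lim_{n} A_n = \Big( \bigcup_{n} A_n^c \Big)^c = \bigcap_{n} A_n. \]
□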
Exercise 2.12. Show that if F is a σ-algebra, and (A_n) is a sequence of events in F, then
lim inf_n A_n ∈ F and lim sup_n A_n ∈ F.
3. Continuity of Probability Measures
The goal of this section is to prove
Theorem 2.13 (Continuity of Probability). Let (Ω, F, P) be a probability space. Let (A_n) be a
sequence of events such that lim_n A_n exists. Then,
\[ P(\lim_{n} A_n) = \lim_{n} P(A_n). \]
We start with a restricted version:
Proposition 2.14. Let (Ω, F, P) be a probability space. Let (A_n) be a sequence of increasing
(resp. decreasing) events. Then,
\[ P(\lim_{n} A_n) = \lim_{n} P(A_n). \]
*** Nov. 10 ***
Proof. We start with the increasing case. Let A = ⋃_n A_n = lim_n A_n. Define A_0 = ∅, and
B_k = A_k \ A_{k−1} for all k ≥ 1. So (B_k) are mutually disjoint and
\[ \biguplus_{k=1}^{n} B_k = \bigcup_{k=1}^{n} A_k = A_n, \]
so ⨄_n B_n = A. Thus,
\[ P(\lim_{n} A_n) = P\Big( \biguplus_{n} B_n \Big) = \sum_{n} P(B_n) = \lim_{n} \sum_{k=1}^{n} P(B_k) = \lim_{n} P\Big( \biguplus_{k=1}^{n} B_k \Big) = \lim_{n} P(A_n). \]
The decreasing case follows from noting that if (A_n) is decreasing, then (A_n^c) is increasing, so
\[ P(\lim_{n} A_n) = P\Big( \bigcap_{n} A_n \Big) = 1 - P\Big( \bigcup_{n} A_n^c \Big) = 1 - \lim_{n} P(A_n^c) = \lim_{n} P(A_n). \]
□
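Proposition 2.14 can be illustrated on the measure with density (1/3)(3/4)^ω from Exercise 2.1, using the increasing events A_n = {1, ..., n}, whose union is all of Ω. The Python sketch below is a finite computation only; the choice of events and the sample values of n are assumptions made for the illustration.

```python
from fractions import Fraction

def p(w: int) -> Fraction:
    """Density (1/3) * (3/4)**w on {1, 2, ...}, taken from Exercise 2.1."""
    return Fraction(1, 3) * Fraction(3, 4) ** w

def prob_A(n: int) -> Fraction:
    """P(A_n) for the increasing events A_n = {1, ..., n}."""
    return sum((p(w) for w in range(1, n + 1)), Fraction(0))

values = [prob_A(n) for n in (1, 5, 20, 80)]
print([float(v) for v in values])  # increases towards P(Omega) = 1
```

Here P(A_n) = 1 − (3/4)^n in closed form, so the convergence to P(⋃_n A_n) = 1 can also be read off directly.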
Lemma 2.15 (Fatou's Lemma). Let (Ω, F, P) be a probability space. Let (A_n) be a sequence of
events (that may not have a limit). Then,
\[ P(\liminf_{n} A_n) \le \liminf_{n} P(A_n) \quad \text{and} \quad \limsup_{n} P(A_n) \le P(\limsup_{n} A_n). \]
Proof. For all k let B_k = ⋂_{n≥k} A_n. So lim inf_n A_n = ⋃_k B_k. Note that (B_k) is an increasing
sequence of events (B_k = B_{k+1} ∩ A_k). Also, for any n ≥ k, B_k ⊂ A_n, so P(B_k) ≤ inf_{n≥k} P(A_n).
Thus,
\[ P(\liminf_{n} A_n) = P\Big( \bigcup_{n} B_n \Big) = \lim_{n} P(B_n) \le \lim_{n} \inf_{k \ge n} P(A_k) = \liminf_{n} P(A_n). \]
For the lim sup,
\[ \limsup_{n} P(A_n) = \limsup_{n} (1 - P(A_n^c)) = 1 - \liminf_{n} P(A_n^c) \le 1 - P(\liminf_{n} A_n^c) = 1 - P((\limsup_{n} A_n)^c) = P(\limsup_{n} A_n). \]
□
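Both inequalities in Fatou's Lemma can be strict. A minimal Python sketch, assuming a hypothetical fair coin toss and events that alternate between heads and tails: for such a period-2 sequence, lim inf and lim sup of the sets are the intersection and the union over one period, and lim inf and lim sup of the probabilities are the minimum and maximum over one period.

```python
from fractions import Fraction

# Hypothetical example: one fair coin toss, Omega = {"H", "T"}.
P = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

def prob(event) -> Fraction:
    """Probability of an event under the uniform measure on the coin toss."""
    return sum((P[w] for w in event), Fraction(0))

# A_n = {"H"} for even n and {"T"} for odd n; the sequence has period 2,
# so lim inf is the intersection and lim sup the union over one period.
period = [{"H"}, {"T"}]
liminf_A = period[0] & period[1]          # empty set
limsup_A = period[0] | period[1]          # all of Omega
probs = [prob(A) for A in period]         # P(A_n) takes these values forever

# P(liminf A_n) <= liminf P(A_n) <= limsup P(A_n) <= P(limsup A_n)
chain = [prob(liminf_A), min(probs), max(probs), prob(limsup_A)]
print(chain)
```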
Fatou’s Lemma immediately proves Theorem 2.13.
Proof of Theorem 2.13. Just note that
\[ \limsup_{n} P(A_n) \le P(\limsup_{n} A_n) = P(\lim_{n} A_n) = P(\liminf_{n} A_n) \le \liminf_{n} P(A_n), \]
so equality holds, and the limit exists. □
As a consequence we get
Lemma 2.16 (First Borel-Cantelli Lemma). Let (Ω, F, P) be a probability space. Let (A_n) be a
sequence of events. If Σ_n P(A_n) < ∞, then
\[ P(A_n \text{ occurs for infinitely many } n) = P(\limsup_{n} A_n) = 0. \]
Proof. Let B_k = ⋃_{n≥k} A_n. So (B_k) is decreasing, and so the decreasing sequence P(B_k) converges
to P(lim sup_n A_n). Thus, for all k,
\[ P(\limsup_{n} A_n) \le P(B_k) \le \sum_{n \ge k} P(A_n). \]
Since the right hand side converges to 0 as k → ∞ by the assumption that the series is convergent,
we get P(lim sup_n A_n) ≤ 0, and hence P(lim sup_n A_n) = 0. □
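The tail bound driving the proof can be made concrete for a hypothetical sequence with P(A_n) = 1/n^2. The Python sketch below computes truncated tail sums and compares them with 1/(k−1), an upper bound for the full tail that follows from 1/n^2 ≤ 1/(n(n−1)) and telescoping; the truncation point `N` is an arbitrary choice.

```python
def tail_sum(k: int, N: int) -> float:
    """Truncated tail sum of 1/n**2 over k <= n <= N."""
    return sum(1.0 / (n * n) for n in range(k, N + 1))

N = 100_000  # truncation point, an arbitrary choice
bounds = {k: tail_sum(k, N) for k in (2, 10, 100, 1000)}
for k, b in bounds.items():
    # The full tail is at most 1/(k-1), since 1/n**2 <= 1/(n-1) - 1/n.
    print(k, b, 1.0 / (k - 1))
```

As k grows the bound on P(B_k) shrinks to 0, which is the only property of the series that the proof uses.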
Example 2.17. We have a bunch of bacteria in a petri dish. Every second, the bacteria give
off offspring randomly, and then the parents die out. Suppose that for any n, the probability
that there are no bacteria left by time n is 1 − exp(−f(n)).
What is the probability that the bacteria eventually die out if:
• f(n) = log n.
• f(n) = (n^2 − 7)/(2n^2 + 3n + 5).
Let A_n be the event that the bacteria die out by time n. So P(A_n) = 1 − e^{−f(n)}, for n ≥ 1.
Note that the event that the bacteria eventually die out is the event that there exists n
such that A_n occurs; i.e. the event ⋃_n A_n. Since (A_n)_n is an increasing sequence, we have that
P(⋃_n A_n) = lim_n P(A_n).
In the first case this is lim_n (1 − 1/n) = 1. In the second case this is
\[ \lim_{n \to \infty} \left( 1 - \exp\left( - \frac{n^2 - 7}{2n^2 + 3n + 5} \right) \right) = 1 - e^{-1/2}. \]
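The two limits in Example 2.17 can be checked numerically by evaluating P(A_n) at a large n. This Python sketch is only a plausibility check, not a proof of convergence; the helper `p_dead_by` and the value n = 10^6 are illustrative choices.

```python
import math

def p_dead_by(n: int, f) -> float:
    """P(A_n) = 1 - exp(-f(n))."""
    return 1.0 - math.exp(-f(n))

def f_log(n: int) -> float:
    return math.log(n)

def f_rat(n: int) -> float:
    return (n * n - 7) / (2 * n * n + 3 * n + 5)

n = 10**6
first = p_dead_by(n, f_log)    # equals 1 - 1/n up to rounding
second = p_dead_by(n, f_rat)
print(first, second, 1 - math.exp(-0.5))
```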
Example 2.18. What if we take the previous example, and the information is that the probability that the bacteria at generation n have offspring (that is, do not die out without offspring) is at most exp(−2 log n)?
Then, if A_n is the event that the n-th generation has offspring, we have that P(A_n) ≤ n^{−2}.
Since Σ_n P(A_n) < ∞, Borel-Cantelli tells us that
\[ P(A_n \text{ occurs for infinitely many } n) = P(\limsup_{n} A_n) = 0. \]
That is,
\[ P(\exists\, k : \forall\, n \ge k : A_n^c) = P(\liminf_{n} A_n^c) = 1. \]
So a.s. there exists k such that all generations n ≥ k do not have offspring, implying that the
bacteria die out with probability 1.