Almost Sure Events

Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466
Topics in
Probability Theory and Stochastic Processes
Rating
Mathematicians Only: prolonged scenes of intense rigor.
Section Starter Question
Consider an infinite sequence of coin flips. What probability would you assign
to the sequence that has a Head on its initial flip and then any sequence of
Heads and Tails on all the subsequent flips? What probability would you
assign to the sequence of flips that is alternately Heads and Tails, continuing
indefinitely?
Key Concepts
1. A subset $A \subseteq \Omega$ is of finite type if there exists an $n = n(A) \ge 1$ and a subset $A' \subseteq \Omega_n$ so that $A = \{\omega \in \Omega : \omega^{(n)} \in A'\}$, where $\omega^{(n)} = (\omega_1, \dots, \omega_n)$.
2. $N \subseteq \Omega$ is a negligible event (that is, an event of probability measure 0) if for every $\epsilon > 0$ there exists a countable set $\{A_k : k \ge 1\}$ of finite type events so that $N \subseteq \bigcup_{k \ge 1} A_k$ and $\sum_{k \ge 1} P[A_k] < \epsilon$.
3. Every countable union of negligible sets is negligible.
4. If $p \ne 0, 1$, every countable subset of $\Omega$ is negligible.
5. The events $\{A_i\}_{i \in I}$ are independent if
\[ P\left[ \bigcap_{k=1}^{n} A_{i_k} \right] = \prod_{k=1}^{n} P[A_{i_k}] \]
for every finite set of distinct indexes $i_1, \dots, i_n \in I$.
6. Events determined by coordinates with disjoint sets of indexes are independent.
7. The set of sequences that are periodic after a certain index is negligible.
8. A set $E$ is called a tail event if, no matter how large $n$ is, whether or not $\omega \in E$ does not depend on the first $n$ coordinates of $\omega$.
9. Let $B_n$ be a sequence of sets in $\Omega$. Then
\[ \{B_n \text{ i.o.}\} = \lim_{m \to \infty} \bigcup_{n=m}^{\infty} B_n . \]
(i.o. stands for “infinitely often.”)
10. The Kolmogorov 0-1 Law says that if B1 , B2 , . . . are independent, and
if E = {Bn i.o. }, then P [E] = 0 or P [E] = 1.
Vocabulary
1. A subset $A \subseteq \Omega$ is of finite type or is a finite type event if there exists an $n = n(A) \ge 1$ and a subset $A' \subseteq \Omega_n$ so that $A = \{\omega \in \Omega : \omega^{(n)} \in A'\}$, where $\omega^{(n)} = (\omega_1, \dots, \omega_n)$.
2. $N \subseteq \Omega$ is a negligible event (that is, an event of probability measure 0) if for every $\epsilon > 0$ there exists a countable set $\{A_k : k \ge 1\}$ of finite type events so that $N \subseteq \bigcup_{k \ge 1} A_k$ and $\sum_{k \ge 1} P[A_k] < \epsilon$.
3. An event $A \subseteq \Omega$ is an almost sure event if $A^c = \Omega \setminus A$ is negligible.
4. The events $\{A_i\}_{i \in I}$ are independent if
\[ P\left[ \bigcap_{k=1}^{n} A_{i_k} \right] = \prod_{k=1}^{n} P[A_{i_k}] \]
for every finite set of distinct indexes $i_1, \dots, i_n \in I$.
5. A set $E$ is called a tail event if, no matter how large $n$ is, whether or not $\omega \in E$ does not depend on the first $n$ coordinates of $\omega$.
Mathematical Ideas
The Strong Law of Large Numbers says “it is certain” that Sn /n approaches
the probability of success p as the number of tosses n approaches infinity.
To be rigorous, we need to make the statement “it is certain” precise. The
problem is that there exist infinite sequences of heads and tails such that the
proportion of heads does not converge to p, such as the sequence consisting of
all tails. In fact, there exist sequences of heads and tails where the proportion
of heads does not converge at all. However, we can exclude such events by
defining the concept of an almost sure event. The Strong Law of Large
Numbers then tells us that the sequence Sn /n converges almost surely to p.
This fundamental idea is due to E. Borel in 1909, [1].
Finite Type Events
First we consider the subset of elements in Ω defined by a condition depending
on only finitely many coordinates. The realization or non-realization of such
an event is determined after a fixed and finite number of elementary trials.
Recall (Binomial Distribution) that a composite experiment consists
of repeating an elementary Bernoulli trial n times. The sample space,
denoted $\Omega_n$, is the set of all possible sequences of $n$ 0’s and 1’s representing
all possible outcomes of the composite experiment. We denote an element of
$\Omega_n$ as $\omega = (\omega_1, \dots, \omega_n)$, where each $\omega_k = 0$ or $1$. That is, $\Omega_n = \{0, 1\}^n$. We
assign a probability measure $P_n[\cdot]$ on $\Omega_n$ by multiplying the probabilities of
each Bernoulli trial in the composite experiment according to the principle
of independence. Thus, for $k = 1, \dots, n$,
\[ P_1[\omega_k = 0] = q \quad \text{and} \quad P_1[\omega_k = 1] = p, \]
and inductively for each $(e_1, e_2, \dots, e_n) \in \{0, 1\}^n$,
\[ P_{n+1}[\omega_{n+1} = 1 \text{ and } (\omega_1, \dots, \omega_n) = (e_1, \dots, e_n)] = P_1[\omega_{n+1} = 1] \times P_n[(\omega_1, \dots, \omega_n) = (e_1, \dots, e_n)], \]
and similarly with $\omega_{n+1} = 0$.
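The inductive construction of $P_n$ can be sketched in a few lines of Python. This is only an illustration under the assumption $q = 1 - p$; the function name `prob_of_outcome` is hypothetical and not part of the original notes.

```python
from itertools import product

def prob_of_outcome(outcome, p):
    """P_n of a single outcome (e_1, ..., e_n), built up one trial at a time."""
    q = 1.0 - p
    prob = 1.0                      # probability of the empty outcome
    for e in outcome:               # inductive step: multiply by P_1[omega_k = e]
        prob *= p if e == 1 else q
    return prob

p = 0.3
n = 4
# Summing over all of Omega_n = {0,1}^n should give total mass 1 (up to rounding).
total = sum(prob_of_outcome(w, p) for w in product((0, 1), repeat=n))
print(prob_of_outcome((1, 0, 0, 1), p))   # equals p^2 q^2
print(total)
```

Enumerating all of $\Omega_n$ is only feasible for small $n$, but it makes the product structure of $P_n$ visible.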
Definition. Consider the space
\[ \Omega = \{\omega = (\omega_n)_{n=1}^{\infty} : \omega_n = 0, 1 \text{ for all } n\}. \]
Define $S_n := \omega_1 + \cdots + \omega_n$. A subset $A \subseteq \Omega$ is of finite type or is a finite
type event if there exists an $n = n(A) \ge 1$ and a subset $A' \subseteq \Omega_n$ so that
$A = \{\omega \in \Omega : \omega^{(n)} \in A'\}$, where $\omega^{(n)} = (\omega_1, \dots, \omega_n)$ is the projection of the
infinite sequence $\omega$ onto the finite head of length $n$.
If $A$ is of finite type, then we define the probability to be
\[ P[A] = P_{n(A')}[A'] = \sum_{\omega^{(n)} \in A'} p^{S_n(\omega)} q^{n - S_n(\omega)}. \]
This probability is well-defined; see the Problems.
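As a small numerical illustration of this definition, the following Python sketch computes $P[A]$ from a base set $A'$ and checks one instance of well-definedness: describing the same event with one extra coordinate does not change the value. The names `finite_type_prob` and `extend` are hypothetical helpers introduced here, and $q = 1 - p$ is assumed.

```python
def finite_type_prob(a_prime, n, p):
    """P[A] for A = {omega : (omega_1, ..., omega_n) in a_prime}.

    a_prime is a set of n-tuples of 0's and 1's."""
    q = 1.0 - p
    return sum(p ** sum(w) * q ** (n - sum(w)) for w in a_prime)

def extend(a_prime):
    """Describe the same event A using one more coordinate (n + 1 instead of n)."""
    return {w + (e,) for w in a_prime for e in (0, 1)}

p = 0.3
a_prime = {(1,)}                                  # Head on the first flip, n = 1
print(finite_type_prob(a_prime, 1, p))            # sum over A' in Omega_1
print(finite_type_prob(extend(a_prime), 2, p))    # same event described in Omega_2
```

Both calls describe the event "Head on the initial flip", so both values should agree with $p$ up to rounding.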
Remark. Note that $P[\cdot]$ has the property of invariance under shifting; that is, if $A$ is an event and $k$ is a positive integer, then the event
$\{\omega \in \Omega : (\omega_k, \omega_{k+1}, \omega_{k+2}, \dots) \in A\}$ has the same probability as the event
$A$.
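For finite type events, invariance under shifting can be checked by direct enumeration. The sketch below is an illustration only: `shift` is a hypothetical helper that builds the base set of the shifted event by leaving the first $k - 1$ coordinates free, and $q = 1 - p$ is assumed.

```python
from itertools import product

def finite_type_prob(a_prime, n, p):
    """P[A] for the finite type event with base set a_prime in Omega_n."""
    q = 1.0 - p
    return sum(p ** sum(w) * q ** (n - sum(w)) for w in a_prime)

def shift(a_prime, k):
    """Base set, in Omega_{n+k-1}, of {omega : (omega_k, omega_{k+1}, ...) in A}."""
    prefixes = product((0, 1), repeat=k - 1)      # first k-1 coordinates are free
    return {pre + w for pre in prefixes for w in a_prime}

p, n, k = 0.3, 3, 4
a_prime = {(1, 0, 1), (0, 0, 0)}
original = finite_type_prob(a_prime, n, p)
shifted = finite_type_prob(shift(a_prime, k), n + k - 1, p)
print(original, shifted)    # the two probabilities agree up to rounding
```

The agreement reflects that the free prefix coordinates contribute total probability 1.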
Example. The set $A$ consisting of the sequences with a Head on the initial
flip and then any sequence of Heads and Tails on all the subsequent flips is a
finite type event. Taking a 1 to represent a Head and a 0 to represent a Tail,
the integer $n(A) = 1$ and the subset $A' = \{1\}$, so $A' \subseteq \Omega_1 = \{0, 1\}$. Then
\[ A = \{\omega \in \Omega : \omega^{(1)} \in A' = \{1\}\}. \]
Definition. A set of subsets of some universal set is a field if it is closed
under finite unions, finite intersections, and complements.
Define E to be the set of finite type events. Note that ∅ ∈ E, Ω ∈ E. The
set E is closed under taking complements and also closed under finite unions
and finite intersections. Thus, we see that E is a field. This follows by direct
verification, see the Problems.
Note that P : E −→ [0, 1] and P [Ω] = 1 and P [∅] = 0 and P [A ∪ B] =
P [A] + P [B] if A ∩ B = ∅.
Definition. A set of subsets of some universal set is a Borel field if it is
closed under complements, and countable intersections and unions. For any
set C of subsets of Ω, denote by F(C) the smallest Borel field containing C.
Negligible and Almost Sure Events
Definition. We say that $N \subseteq \Omega$ is a negligible event, that is, an event
of probability measure 0, if for every $\epsilon > 0$ there exists a countable set
$\{A_k : k \ge 1\}$ of finite type events so that $N \subseteq \bigcup_{k \ge 1} A_k$ and $\sum_{k \ge 1} P[A_k] < \epsilon$.
An event $A \subseteq \Omega$ is an almost sure event if $A^c = \Omega \setminus A$ is negligible, and we then write that $A$ holds a.s.
Proposition 1.
1. Every subset of $\Omega$ that is contained in a negligible event is negligible.
2. Every countable union of negligible sets is negligible.
3. If $p \ne 0, 1$, every countable subset of $\Omega$ is negligible.
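Proof.
1. Suppose that $A$ is negligible and $B \subseteq A$. Fix $\epsilon > 0$. Then there exists a countable set $\{A_k : k \ge 1\}$ of finite type events so that $A \subseteq \bigcup_{k \ge 1} A_k$ and $\sum_{k \ge 1} P[A_k] < \epsilon$. Then we see that $B \subseteq A \subseteq \bigcup_{k \ge 1} A_k$ with the same bound on the sum. Thus, $B$ is negligible.
2. Let $\{N_n\}_{n=1}^{\infty}$ be a countable set of negligible sets. Fix $\epsilon > 0$ and find (by definition), for each $n$, a countable set of finite type events $\{A_{n,k}\}_{k=1}^{\infty}$ so that $N_n \subseteq \bigcup_{k \ge 1} A_{n,k}$ and $\sum_{k \ge 1} P[A_{n,k}] < \epsilon / 2^n$. Then
\[ N = \bigcup_{n \ge 1} N_n \subseteq \bigcup_{n \ge 1} \bigcup_{k \ge 1} A_{n,k} \]
and
\[ \sum_{n \ge 1} \sum_{k \ge 1} P[A_{n,k}] < \sum_{n \ge 1} \frac{\epsilon}{2^n} = \epsilon. \]
Since $\{A_{n,k} : n, k \ge 1\}$ is a countable set of finite type events, $N$ is negligible.
3. Suppose that $p \ne 0, 1$. First prove that the singleton set $\{\omega\}$ is negligible. Note that for every $n$, $\{\omega\} \subseteq \{\omega' \in \Omega : \omega'^{(n)} = \omega^{(n)}\} = A_n$ and
\[ P[A_n] \le \max\{p^n, (1 - p)^n\}. \]
Given $\epsilon > 0$, choose $n$ large enough that $p^n < \epsilon$ and $(1 - p)^n < \epsilon$; then $\{\omega\} \subseteq A_n$ and $P[A_n] < \epsilon$. Thus we see that $\{\omega\}$ is negligible and by part (2), we see that a countable union of singletons is negligible.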
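Part 3 of the proof is constructive: for a given $\epsilon$ it is enough to fix the first $n$ coordinates, where $n$ is large enough that $\max\{p, 1-p\}^n < \epsilon$. A minimal Python sketch of that choice follows; the function name `cover_length` is hypothetical, and the sketch assumes $0 < p < 1$ and $0 < \epsilon < 1$.

```python
def cover_length(p, eps):
    """Smallest n with max(p, 1 - p)**n < eps, assuming 0 < p < 1 and 0 < eps < 1."""
    r = max(p, 1.0 - p)
    n = 1
    while r ** n >= eps:
        n += 1
    return n

# Number of leading coordinates to fix so that the finite type event
# A_n containing a given sequence has probability below eps.
for eps in (0.1, 0.01, 0.001):
    print(eps, cover_length(0.5, eps))
```

The required $n$ grows only logarithmically in $1/\epsilon$, which is why a single finite type event suffices to cover a singleton.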
Definition. The events $\{A_i\}_{i \in I}$ are independent if
\[ P\left[ \bigcap_{k=1}^{n} A_{i_k} \right] = \prod_{k=1}^{n} P[A_{i_k}] \]
for every finite set of distinct indexes $i_1, \dots, i_n \in I$.
Remark. The sets $A_i$ are independent if and only if the sets $A_i^c$ are independent. For two events, this follows because
\begin{align*}
P[A_1^c \cap A_2^c] &= P[(A_1 \cup A_2)^c] \\
&= 1 - P[A_1 \cup A_2] \\
&= 1 - P[A_1] - P[A_2] + P[A_1 \cap A_2] \\
&= 1 - P[A_1] - P[A_2] + P[A_1] P[A_2] \\
&= (1 - P[A_1])(1 - P[A_2]).
\end{align*}
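Proposition 2. Let $\{A_i\}_{i \in I}$ be a family of events. Suppose that for each $i \in I$ there
exists a finite subset $E_i \subseteq \mathbb{N}$ and a subset $A_i' \subseteq \{0, 1\}^{E_i}$ such that $E_i \cap E_j = \emptyset$
if $i \ne j$ and $A_i = \{\omega \in \Omega : (\omega_n)_{n \in E_i} \in A_i'\}$. Then the events $\{A_i\}_{i \in I}$ are
independent.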
Remark. This proposition means that events that are determined by coordinates with disjoint sets of indexes are independent.
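Example. Let $A_i$ be the event that the $i$th coin flip is heads. Let $E_i = \{i\}$
and $A_i'$ be the set so that $\omega_i = 1$. Certainly, $E_i \cap E_j = \emptyset$ for $i \ne j$. Also,
\[ A_i = \{\omega \in \Omega : (\omega_n)_{n \in E_i} \in A_i'\} = \{\omega \in \Omega : \omega_i = 1\}. \]
Note $P[A_i] = p$ and $P\left[\bigcap_{k=1}^{n} A_k\right] = p^n$. Thus, we see that $P\left[\bigcap_{k=1}^{n} A_k\right] = \prod_{k=1}^{n} P[A_k]$.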
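Proposition 2 can be checked numerically for small cases by brute-force enumeration of $\Omega_n$. The following Python sketch is an illustration, not a proof: the events `A` and `B` are hypothetical examples chosen to depend on disjoint coordinate sets, and $q = 1 - p$ is assumed.

```python
from itertools import product

def prob(event, n, p):
    """P_n of {w in Omega_n : event(w)}, by brute-force enumeration."""
    q = 1.0 - p
    return sum(p ** sum(w) * q ** (n - sum(w))
               for w in product((0, 1), repeat=n) if event(w))

p, n = 0.3, 5
A = lambda w: w[0] == 1 and w[1] == 0        # depends on coordinates {1, 2}
B = lambda w: w[2] + w[3] + w[4] >= 2        # depends on coordinates {3, 4, 5}

lhs = prob(lambda w: A(w) and B(w), n, p)    # P[A and B]
rhs = prob(A, n, p) * prob(B, n, p)          # P[A] * P[B]
print(lhs, rhs)                              # equal up to rounding
```

Replacing `B` by an event that shares a coordinate with `A` would generally break the equality, which is the point of the disjointness hypothesis.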
Proposition 3. Let b be a word from the alphabet {0, 1}; i.e., b is a finite
sequence of 0’s and 1’s. The set
A = {ω ∈ Ω : b is not found in ω}
is negligible.
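Proof.
1. Let $b$ be a word of length $j > 0$. For all $m \ge 0$, let $A_m$ be the set of all $\omega \in \Omega$ so that $(\omega_{mj+1}, \dots, \omega_{(m+1)j}) \ne b$. In other words, we are dividing $\omega$ into consecutive non-overlapping blocks of length $j$, and $A_m$ is the event that block $m$ is not $b$.
2. Note that $P[A_0] < 1$ (assuming $p \ne 0, 1$, so that the word $b$ has positive probability) and the property of invariance under shifting tells us that $P[A_m] = P[A_0] < 1$.
3. The $A_m$ are independent by Proposition 2, since they are determined by disjoint sets of coordinates, and so
\[ P\left[ \bigcap_{k \le m} A_k \right] = (P[A_0])^{m+1}. \]
4. This value can be made arbitrarily small by choosing a large enough $m$.
5. Let $B_n = \{\omega \in \Omega : (\omega_{n+1}, \dots, \omega_{n+j}) \ne b\}$, so that $A_k = B_{kj}$ and $A = \bigcap_{n \ge 0} B_n \subseteq \bigcap_{k \le m} A_k$ for every $m$.
6. Thus, we see that $A$ is contained in the finite type event $\bigcap_{k \le m} A_k$, whose probability can be made arbitrarily small.
7. Given $\epsilon > 0$, choose $m$ so that $(P[A_0])^{m+1} < \epsilon$; the single finite type event $\bigcap_{k \le m} A_k$ then covers $A$, and so $A$ is negligible.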
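The geometric decay in the proof can be made concrete. The Python sketch below computes $P[A_0]$ for a word $b$ and the number of blocks needed to push $(P[A_0])^{m+1}$ below a given $\epsilon$. The function names `miss_prob` and `blocks_needed` are hypothetical, and the sketch assumes $0 < p < 1$ and $q = 1 - p$.

```python
def miss_prob(b, p):
    """P[A_0]: probability that one block of length len(b) differs from the word b."""
    q = 1.0 - p
    ones = sum(b)
    return 1.0 - p ** ones * q ** (len(b) - ones)

def blocks_needed(b, p, eps):
    """Smallest number m + 1 of blocks with miss_prob(b, p)**(m + 1) < eps."""
    r = miss_prob(b, p)
    count = 1
    while r ** count >= eps:
        count += 1
    return count

b = (1, 0)                      # the word "Head, Tail"
print(miss_prob(b, 0.5))        # one block misses b with probability 3/4
print(blocks_needed(b, 0.5, 0.1))
```

Longer words need many more blocks, since $P[A_0]$ approaches 1 as the word length grows, but the bound still tends to 0 for every fixed word.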
Corollary 1. All possible words almost surely appear in the sequence ω.
Proof. (of Corollary 1)
The set of all possible words is countable, so let $b_i$ be an enumeration of
the set of possible words. By Proposition 3 the set
\[ A_{b_i} = \{\omega \in \Omega : b_i \text{ is not found in } \omega\} \]
is negligible. Then consider $A = \bigcup_{i} A_{b_i}$. By Proposition 1, the set $A$ is
negligible. Finally consider $\omega \in A^c = \Omega \setminus A$ and an arbitrary word $b$. The
word $b$ must appear in $\omega$, and $A^c$ is an almost sure event.
Corollary 2. The set of sequences that are periodic after a certain index is
negligible.
Proof. (of Corollary 2)
1. Let $P_{i,j}$ be the set of all $\omega$ that begin periodicity at index $i$ with a period of length $j$.
2. First, show that the set $P_{1,j}$ of purely periodic sequences is negligible. Let $b_j = \vec{0}_j$ be the word of length $j$ with all zeros. There is only one sequence $\omega_{b_j}$ in $P_{1,j}$ which contains the word $b_j$, namely the sequence of all zeros, and so we see that
\[ P_{1,j} \subseteq \{\omega_{b_j}\} \cup \{\omega : \omega \text{ does not contain } b_j\}. \]
Note that the first set is negligible since it is a singleton and the second set is negligible by Proposition 3. So $P_{1,j}$ is negligible by Proposition 1.
3. Next, for $i \ge 2$ the set $P_{i,j}$ is negligible. For $n = 0, \dots, 2^{i-1} - 1$ let $b_n$ be the binary representation with length $i - 1$ of the number $n$.
4. For $n = 0, \dots, 2^{i-1} - 1$ let $P_{b_n,j}$ be the set of sequences which start with $b_n$ and are periodic with period $j$ starting at index $i$. By the property of invariance under shifting, any cover of $P_{1,j}$ by finite type events shifts to a cover of $P_{b_n,j}$ with the same total probability, and so $P_{b_n,j}$ is negligible. Finally,
\[ P_{i,j} = \bigcup_{n=0}^{2^{i-1}-1} P_{b_n,j} \]
and so $P_{i,j}$ is negligible.
5. The set of sequences that are periodic after a certain index is the countable union $\bigcup_{i,j \ge 1} P_{i,j}$, which is negligible by Proposition 1.
Tail Events
Definition. A set $E$ is called a tail event if whether or not $\omega \in E$ does not
depend on the first $n$ coordinates of $\omega$, no matter how large $n$ is. In particular,
a tail event other than $\emptyset$ or $\Omega$ is not a finite type event. Some authors refer to a tail event as a remote event.
Example. Consider the set $E$ of coin-tossing sequences such that $S_n/n$ does not
converge to $p$. This set cannot depend on the first $n$ coordinates of $\omega$ no
matter how large $n$ is, hence it is a tail event.
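Remark. Formally, $E \in \mathcal{F}(\Omega)$ is a tail event if $E \in \mathcal{F}(X_n, X_{n+1}, \dots)$ for all $n$. Here
$X_n$ denotes the $n$th coordinate $\omega_n$, $\mathcal{F}(\Omega)$ is the Borel field of $\Omega$, and $\mathcal{F}(X_n, X_{n+1}, \dots)$ is the smallest Borel field
containing all events determined by the sequence beginning at the $n$th index.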
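Remark. This definition may seem quite abstract, but it captures in a formal
way the sense in which certain events do not depend on any finite number of
their coordinates. For example, consider the set $E$ of coin-tossing sequences
such that $S_n/n$ does not converge to $p$. That is,
\[ E = \left\{ \omega : \frac{X_1 + \cdots + X_n}{n} \not\to p \right\}. \]
Then for any $k \ge 1$,
\[ E = \left\{ \omega : \frac{X_k + \cdots + X_n}{n} \not\to p \right\}, \]
because dropping the first $k - 1$ terms changes the quotient by at most $(k-1)/n$, which tends to 0. Thus $E \in \mathcal{F}(X_k, X_{k+1}, \dots)$ for all $k \ge 1$, and $E$ is a tail event.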
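The insensitivity of $S_n/n$ to any finite number of coordinates can be seen on a finite prefix of a simulated sequence. The sketch below is illustrative only: it flips the first $k$ coordinates of a pseudo-random sequence of length $N$ and compares the two averages, which can differ by at most $k/N$. The seed, the values of `N` and `k`, and the helper name `running_mean` are arbitrary choices for the illustration.

```python
import random

def running_mean(flips):
    """S_N / N for a finite list of 0/1 coin flips."""
    return sum(flips) / len(flips)

random.seed(0)
N, k, p = 100_000, 50, 0.5
omega = [1 if random.random() <= p else 0 for _ in range(N)]
altered = [1 - e for e in omega[:k]] + omega[k:]    # change the first k coordinates

gap = abs(running_mean(omega) - running_mean(altered))
print(gap, k / N)    # the gap is bounded by k/N, which vanishes as N grows
```

Since the bound $k/N$ tends to 0 for fixed $k$, convergence or non-convergence of $S_n/n$ to $p$ cannot be decided by the first $k$ coordinates.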
Definition. Let $B_n$ be a sequence of sets in $\Omega$. Then
\[ \{B_n \text{ i.o.}\} = \lim_{m \to \infty} \bigcup_{n=m}^{\infty} B_n . \]
(i.o. stands for “infinitely often.”)
So $\omega \in \{B_n \text{ i.o.}\}$ if and only if $\omega \in B_n$ for infinitely many $n$.
Proposition 4. Given $\{a_n\}_{n=1}^{\infty}$ with $\lim_{n \to \infty} a_n = +\infty$, then
\[ \limsup_{n \to \infty} a_n \sqrt{n} \left| \frac{S_n}{n} - p \right| = +\infty \]
almost surely.
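Remark. If $a_n \equiv a$, then
\begin{align*}
\lim_{n \to \infty} P_n\left[ a \sqrt{n} \left| \frac{S_n}{n} - p \right| < c \right]
&= \lim_{n \to \infty} P_n\left[ a \sqrt{n} \left| \frac{S_n - pn}{n} \right| < c \right] \\
&= \lim_{n \to \infty} P_n\left[ a \sqrt{p(1-p)} \left| \frac{S_n - pn}{\sqrt{p(1-p)n}} \right| < c \right] \\
&= \lim_{n \to \infty} P_n\left[ \left| \frac{S_n - pn}{\sqrt{p(1-p)n}} \right| < \frac{c}{a \sqrt{p(1-p)}} \right] \\
&= \int_{-c/(a\sqrt{p(1-p)})}^{c/(a\sqrt{p(1-p)})} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx
\end{align*}
by the Central Limit Theorem, and for $c > 0$ this is a value strictly between 0 and 1.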
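The limiting integral can be evaluated with the error function, using the standard identity $\int_{-z}^{z} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = \operatorname{erf}(z/\sqrt{2})$. The Python sketch below is a small illustration; the function name `clt_limit` and the sample parameter values are assumptions made for the example.

```python
import math

def clt_limit(a, c, p):
    """Integral of the standard normal density over (-z, z), z = c / (a sqrt(p(1-p)))."""
    z = c / (a * math.sqrt(p * (1.0 - p)))
    return math.erf(z / math.sqrt(2.0))

p, c = 0.5, 0.98
for a in (1.0, 2.0, 10.0, 100.0):
    print(a, clt_limit(a, c, p))   # the limit shrinks toward 0 as a grows
```

For fixed $c$ the limit decreases to 0 as the constant $a$ increases, which is consistent with the divergence asserted in Proposition 4 when $a_n \to \infty$.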
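On the other hand, the Moderate Deviations Theorem says that if
1. $(a_n)$ is a sequence of real numbers,
2. $a_n \to \infty$ as $n \to \infty$, and
3. $\lim_{n \to \infty} \frac{a_n}{n^{1/6}} = 0$,
then
\[ P_n\left[ \frac{S_n}{n} - p \ge \sqrt{p(1-p)} \, \frac{a_n}{\sqrt{n}} \right] = P_n\left[ \frac{\sqrt{n}}{a_n} \left( \frac{S_n}{n} - p \right) \ge \sqrt{p(1-p)} \right] \sim \frac{1}{a_n \sqrt{2\pi}} e^{-a_n^2/2}. \]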
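The right-hand side of the Moderate Deviations Theorem is the leading term of the standard normal upper tail. The sketch below compares the two for a few values of $a$; it checks only this normal-tail relationship, not the binomial probability itself, and the function names are hypothetical.

```python
import math

def moderate_deviation_rhs(a):
    """The asymptotic expression exp(-a^2/2) / (a sqrt(2 pi))."""
    return math.exp(-a * a / 2.0) / (a * math.sqrt(2.0 * math.pi))

def normal_upper_tail(a):
    """P[Z >= a] for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

for a in (1.0, 3.0, 5.0):
    ratio = normal_upper_tail(a) / moderate_deviation_rhs(a)
    print(a, ratio)    # the ratio approaches 1 from below as a grows
```

The classical bounds $\frac{a^2}{1+a^2} < \text{ratio} < 1$ explain why the ratio tends to 1 as $a \to \infty$.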
This pair of previous results, together with Proposition 4, shows how
finely balanced the scaled deviations $\sqrt{n} \left( \frac{S_n}{n} - p \right)$ of $S_n/n$ around $p$ are.
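Proof.
1. Set $A_m = \left\{ \omega : \limsup_{n \to \infty} a_n \sqrt{n} \left| \frac{S_n}{n} - p \right| < m \right\}$ for all $m > 0$. For any $\omega \in A_m$ there is an $N$ (possibly depending on $\omega$) such that for every $n > N$ we have $a_n \sqrt{n} \left| \frac{S_n}{n} - p \right| < m$.
2. Let $A_{k,m} = \left\{ \omega : a_k \sqrt{k} \left| \frac{S_k(\omega)}{k} - p \right| < m \right\}$. For $\omega \in A_m$ we have $\omega \in A_{k,m}$ for every $k > N$.
3. Note that $A_m \subseteq \bigcup_k A_{k,m}$.
4. By the Central Limit Theorem,
\[ P[A_{k,m}] \approx \int_{-m/(a_k \sqrt{p(1-p)})}^{m/(a_k \sqrt{p(1-p)})} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx. \]
5. Given $\epsilon / 2^j$, because $a_k \to \infty$ there exists $k_j$ so that $P[A_{k_j,m}] < \epsilon / 2^j$, and we can choose the $k_j$ as an increasing sequence. Thus,
\[ \sum_{j=1}^{\infty} P[A_{k_j,m}] < \epsilon \]
and, since every $\omega \in A_m$ lies in $A_{k_j,m}$ as soon as $k_j > N$, $A_m \subseteq \bigcup_{j=1}^{\infty} A_{k_j,m}$. Thus, $A_m$ is negligible. This is true for any value $m$, so the countable union of the $A_m$ over positive integers $m$ is negligible by Proposition 1, and so
\[ \limsup_{n \to \infty} a_n \sqrt{n} \left| \frac{S_n}{n} - p \right| = \infty \quad \text{a.s.} \]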
Sources
This section is adapted from: Heads or Tails, by Emmanuel Lesigne, Student
Mathematical Library Volume 28, American Mathematical Society, Providence, 2005, Chapter 11.1, pages 78-82 [3]. Some of the remarks and examples are also adapted from Probability by Leo Breiman, Addison-Wesley,
Reading MA, 1968, Chapters 1 and 3 [2].
Algorithms, Scripts, Simulations
Algorithm
AlmostSureEvents-Simulation
Comment Post: Observation that any coin flip sequence
selected by a pseudo-random-number-generator
has scaled deviations which exceed a sequence of integers,
suggesting the almost sure result of Proposition 4.
Comment Post: Indexes and values of a million flip coin-flip
sequence where scaled deviations from p exceed
the counting sequence.
Set probability of success $p$
Set length of coin-flip sequence $n$
Initialize and fill length $n$ array of coin flips
Use vectorization to sum to get the cumulative sequence $S_n$ of heads
Use vectorization to compute the scaling factor array $a_n$
Use vectorization to compute the array of scaled deviations $a_n \sqrt{n} \, |S_n/n - p|$
Initialize the subsequence index $k$ to 1
while any scaled deviations exceed $k$
    Set $N_k$ as the first index where the scaled deviations exceed $k$
    Print $k$, $S_{N_k}$ and scaled deviation $a_{N_k} \sqrt{N_k} \, |S_{N_k}/N_k - p|$
    Increment $k$ by 1
Scripts
R: R script for Almost Sure Events.
    p <- 0.5
    n <- 1e+6
    coinFlips <- (runif(n) <= p)
    S <- cumsum(coinFlips)
    an <- (1:n)^(1/6)
    deviations <- an * sqrt(1:n) * (abs(S / (1:n) - p))
    k <- 1
    while (length(which(deviations > k)) > 0) {
        Nk <- min(which(deviations > k))
        cat("Index: ", Nk, "Sum: ", S[Nk], "Scaled Deviation: ", deviations[Nk], "\n")
        k <- k + 1
    }
Octave: Octave script for Almost Sure Events.
    p = 0.5;
    n = 1e+6;
    coinFlips = rand(1, n) <= p;
    S = cumsum(coinFlips);
    an = (1:n).^(1/6);
    deviations = an .* sqrt(1:n) .* (abs(S ./ (1:n) - p * ones(1, n)));
    k = 1;
    while (any(deviations > k))
        Nk = find(deviations > k)(1);
        printf("Index: %i, Sum: %i, Scaled deviation: %f \n",
               Nk, S(Nk), deviations(Nk))
        k = k + 1;
    endwhile
Perl: Perl PDL script for Almost Sure Events.
    use PDL;    # load the Perl Data Language

    $p         = 0.5;
    $n         = 1e+6;
    $coinFlips = random($n) <= $p;
    $S         = cumusumover($coinFlips);

    $narray     = zeros($n)->xlinvals(1, $n);
    $an         = $narray**(1 / 6);
    $deviations = $an * sqrt($narray) * (abs($S / ($narray) - $p));
    $k = 1;
    while (any $deviations > $k) {
        $Nk = ((which $deviations > $k)->range(0));
        print "Index: ", $Nk, "Sum: ", $S->range([$Nk]), "Scaled Deviation: ",
            $deviations->range([$Nk]), "\n";
        $k = $k + 1;
    }
SciPy: Scientific Python script for Almost Sure Events.
    import numpy as np  # NumPy provides the array functions used here

    p = 0.5
    n = 10**6  # integer length, as required for array sizes
    coinFlips = np.random.random(n) <= p
    # Note Booleans True for Heads and False for Tails
    S = np.cumsum(coinFlips, axis=0)
    # Note how Booleans act as 0 (False) and 1 (True)
    narray = np.arange(1, n + 1, dtype=float)
    an = narray ** (1.0 / 6.0)
    deviations = an * np.sqrt(narray) * np.absolute(S / narray - p)
    k = 1
    while np.any(deviations > k):
        # Nk is a 0-based array index, so the flip number is Nk + 1
        Nk = np.nonzero(deviations > k)[0][0]
        print("Index: ", Nk, "Sum: ", S[Nk], "Scaled Deviation: ", deviations[Nk])
        k = k + 1
Problems to Work for Understanding
1. Show that the definition of the probability of a finite type event $A$,
\[ P[A] = P_{n(A')}[A'] = \sum_{\omega^{(n)} \in A'} p^{S_n(\omega)} q^{n - S_n(\omega)}, \]
is well-defined. That is, if $A' \subseteq \Omega_{n(A')}$ and $A'' \subseteq \Omega_{n(A'')}$ are two events
such that $A = \{\omega \in \Omega : \omega^{(n(A'))} \in A'\}$ and $A = \{\omega \in \Omega : \omega^{(n(A''))} \in A''\}$,
then
\[ P[A] = P_{n(A')}[A'] = P_{n(A'')}[A'']. \]
2. Show that:
(a) ∅ ∈ E.
(b) Ω ∈ E.
(c) The set E is closed under taking complements.
(d) The set E is closed under finite unions and finite intersections.
3. Show that P [·] has the property of invariance under shifting.
4. Show that P [·] is monotone, that is, if A ⊆ B then P [A] ≤ P [B].
Reading Suggestion:
References
[1] E. Borel. Sur les probabilités dénombrables et leurs applications
arithmétiques. Rendiconti del Circolo Matematico di Palermo, 26:247–271, 1909.
[2] Leo Breiman. Probability. Addison-Wesley, 1968.
[3] Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theorems
in Probability, volume 28 of Student Mathematical Library. American
Mathematical Society, 2005.
Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1
Email to Steve Dunbar, sdunbar1 at unl dot edu
Last modified: Processed from LATEX source on June 12, 2015