FROM ELEMENTS OF PROBABILITY THEORY TO RANDOM WALKS
CHRISTIAN BICK
Abstract. This is an informal summary of results from basic probability
theory. The treatment is not complete and there is probably more than one
mistake in there. For a more comprehensive treatment, see any textbook on
measure or probability theory, for example [Dur91].
1. Probability spaces and measures
Definition 1.1. A collection F of subsets of a set Ω is called a σ-algebra (or
σ-field) over Ω if
(1) Ω ∈ F,
(2) A ∈ F =⇒ Ac := Ω\A ∈ F,
(3) A1, A2, . . . a finite or countable collection in F =⇒ ∪_n An ∈ F.
Example 1.2.
(1) For an arbitrary set Ω, F1 = {∅, Ω} and F2 = {A ⊂ Ω}, i.e., the collection
of all subsets of Ω, are σ-algebras.
(2) Let A be a collection of subsets of Ω. Then the σ-algebra generated by A is
the smallest σ-algebra that contains A. For Ω = R the σ-algebra generated
by sets of the form (−∞, a) for a ∈ R is called the Borel σ-algebra B(R).
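For concreteness, here is a small Python sketch (not part of the original notes) that checks the three axioms of Definition 1.1 for collections of subsets of a finite set; the function name is_sigma_algebra and the example collections are my own choices.

```python
# A minimal sketch: checking the three axioms of Definition 1.1 for a
# collection of subsets of a *finite* set Omega.  On a finite set, closure
# under countable unions reduces to closure under pairwise unions.
from itertools import combinations


def is_sigma_algebra(omega, collection):
    """Check Definition 1.1 for subsets of a finite set `omega`."""
    omega = frozenset(omega)
    sets = {frozenset(a) for a in collection}
    # (1) Omega itself belongs to the collection.
    if omega not in sets:
        return False
    # (2) closure under complements.
    if any(omega - a not in sets for a in sets):
        return False
    # (3) closure under unions (pairwise suffices here).
    return all(a | b in sets for a in sets for b in sets)


omega = {1, 2, 3, 4}
F1 = [set(), omega]                                              # trivial sigma-algebra
F2 = [set(s) for r in range(5) for s in combinations(omega, r)]  # power set
print(is_sigma_algebra(omega, F1), is_sigma_algebra(omega, F2))  # True True
print(is_sigma_algebra(omega, [set(), {1}, omega]))              # False: {2,3,4} missing
```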
Definition 1.3. Let Ω be a set and F a σ-algebra over Ω. Then the tuple (Ω, F)
is called a measurable space.
A measure is a function µ : F → [0, ∞] such that
(1) µ(∅) = 0,
(2) if A1, A2, . . . is a finite or countable sequence of pairwise disjoint sets in F,
then
µ(∪_n An) = Σ_n µ(An).
Furthermore, µ is called finite if µ(Ω) < ∞ and a probability measure if µ(Ω) = 1.
The triple (Ω, F, µ) is then called a measure space (a probability space if µ is a probability measure).
Remark 1.4. Every finite measure can be rescaled to a probability measure.
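As a quick illustration of Definition 1.3 and Remark 1.4 (my own example; names such as mu_weights are hypothetical), the following sketch builds a finite measure from point masses on a three-element set, checks additivity on disjoint sets, and rescales it to a probability measure.

```python
# A small illustration: a measure on a finite measurable space given by point
# masses, and the rescaling of Remark 1.4.

mu_weights = {"a": 2.0, "b": 3.0, "c": 5.0}    # point masses on Omega = {a, b, c}


def measure(subset, weights=mu_weights):
    """mu(A) = sum of the point masses of the elements of A."""
    return sum(weights[x] for x in subset)


omega = set(mu_weights)
total = measure(omega)                          # mu(Omega) = 10.0, a finite measure
prob_weights = {x: w / total for x, w in mu_weights.items()}   # rescaled so that P(Omega) = 1

# additivity on disjoint sets: mu({a} u {b,c}) = mu({a}) + mu({b,c})
assert measure({"a", "b", "c"}) == measure({"a"}) + measure({"b", "c"})
print(measure(omega, prob_weights))             # 1.0 (up to floating point): a probability measure
```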
What do we need σ-algebras for? At first glance the construction might seem a
bit artificial and unnecessary, since one would like to have a measure that is defined
on every subset. The fact is, though, that there is no consistent way to define
a measure on all subsets of, say, the unit interval I = [0, 1] that is compatible
with the usual notion of length. Counterexamples are provided, for example, by the
Banach–Tarski paradox or by the construction of Vitali sets.
Hence, instead of trying to define a measure on the whole collection of subsets, one
leaves out the ‘bad’ sets, restricting oneself to defining the measure on a smaller
collection of subsets.
In fact, there is another approach to measure theory (see for example [Rud76]),
which starts not with a σ-algebra but with an outer measure, a function defined on
all subsets. The subsets then split into measurable and non-measurable ones, and
the measurable sets carry the structure of a σ-algebra.
2. Measurable functions and random variables
Definition 2.1. If (S1, 𝒮1) and (S2, 𝒮2) are measurable spaces and f : S1 → S2,
then f is called measurable if f⁻¹(B) ∈ 𝒮1 for all B ∈ 𝒮2. Here, f⁻¹ denotes the
preimage of a set¹.
A measurable function between a probability space (Ω, F, P) and (R, B(R)) is called
a random variable.
This definition is analogous to that of continuity, where preimages of open sets
have to be open; measurability is the corresponding notion for the additional
structure of a σ-algebra.
Example 2.2. Recall the definitions and notation from Example 1.2.
(1) id : Ω → Ω, id(ω) = ω, is a measurable function from (Ω, F2 ) → (Ω, F1 ).
(2) c : Ω → Ω, c(ω) = ω0 for ω0 ∈ Ω fixed, is a measurable function from
(Ω, F1 ) → (Ω, F2 ).
Remark 2.3. The identity id : (S, 𝒮1) → (S, 𝒮2) is measurable iff² 𝒮2 ⊂ 𝒮1.
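The following sketch (mine, not from the text) makes Definition 2.1 and Remark 2.3 concrete for finite measurable spaces; is_measurable and the example σ-algebras sigma1, sigma2 are illustrative names.

```python
# Checking measurability as in Definition 2.1 for maps between finite
# measurable spaces, and verifying Remark 2.3 for the identity map.

def preimage(f, domain, B):
    return frozenset(x for x in domain if f(x) in B)


def is_measurable(f, domain, sigma_dom, sigma_cod):
    """f is measurable iff f^{-1}(B) lies in sigma_dom for every B in sigma_cod."""
    sigma_dom = {frozenset(A) for A in sigma_dom}
    return all(preimage(f, domain, B) in sigma_dom for B in sigma_cod)


S = {1, 2, 3}
sigma1 = [set(), {1}, {2, 3}, S]    # a sigma-algebra on S
sigma2 = [set(), S]                 # the trivial sigma-algebra on S

# id is measurable from (S, sigma1) to (S, sigma2) since sigma2 is contained in
# sigma1, but not the other way around: the preimage {1} of {1} is not in sigma2.
print(is_measurable(lambda x: x, S, sigma1, sigma2))   # True
print(is_measurable(lambda x: x, S, sigma2, sigma1))   # False
```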
Definition 2.4. Suppose (Ω, F, P ) is a probability space and X : Ω → R a random
variable. Then
µ(B) := P(X⁻¹(B))
with B ∈ B(R) defines a probability measure on (R, B(R)) called the distribution of
X.
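As an illustration of Definition 2.4 (my own example), the sketch below computes the distribution of a random variable on a finite probability space as the pushforward of P under X, here for the sum of two fair dice.

```python
# The distribution of Definition 2.4 on a finite probability space:
# mu(B) = P(X^{-1}(B)), computed here on singletons.
from collections import defaultdict
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # outcomes of two fair dice
P = {w: 1 / 36 for w in omega}                 # uniform probability measure


def X(w):
    """The random variable 'sum of the two dice'."""
    return w[0] + w[1]


mu = defaultdict(float)
for w, p in P.items():
    mu[X(w)] += p

print(mu[7])              # 6/36: the most likely value of the sum
print(sum(mu.values()))   # 1 (up to rounding): mu is again a probability measure
```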
Now the question arises which measures on (R, B(R)) can occur as distributions. The
distributions we are most familiar with are continuous and discrete distributions.
It turns out that any distribution µ can be written as µ = α1 µ1 + α2 µ2 + α3 µ3
where αi ≥ 0 with α1 + α2 + α3 = 1, µ1 is absolutely continuous, µ2 is discrete,
and µ3 is singular continuous; the uniform distribution on a Cantor set is the
standard example of the singular continuous kind.
3. Measure integrals and product spaces
The notion of a measure defined above allows the definition of a measure integral.
Definition 3.1. Suppose A ⊂ Ω and let IA denote the indicator function of A, i.e.,
IA(ω) = 1 if ω ∈ A and IA(ω) = 0 otherwise. A measurable function f : (Ω, F) →
(R, B(R)) is called simple if there are finitely many pairwise disjoint Ai ∈ F and
ci ∈ R such that
f = Σ_i ci IAi.
We now define the measure integral for simple functions as
∫ f dµ := Σ_i ci µ(Ai).
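A minimal sketch of this formula (my own notation), with a measure given by point masses on a finite set:

```python
# The measure integral of a simple function f = sum_i c_i I_{A_i} with respect
# to a measure mu on a finite set Omega = {0, 1, 2, 3}.

mu_weights = {0: 0.5, 1: 1.5, 2: 2.0, 3: 1.0}


def mu(A):
    return sum(mu_weights[x] for x in A)


# simple function: value c_i on the (pairwise disjoint) set A_i
simple_f = [(2.0, {0, 1}), (-1.0, {3})]        # f = 2*I_{{0,1}} - 1*I_{{3}}


def integral(simple, mu):
    """Integral of a simple function: sum_i c_i * mu(A_i)."""
    return sum(c * mu(A) for c, A in simple)


print(integral(simple_f, mu))   # 2*(0.5 + 1.5) - 1*1.0 = 3.0
```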
This definition can be extended to positive measurable functions by approximating
them by simple functions. Using the integral for positive functions and the
decomposition f = f+ − f− into positive and negative parts, the measure integral
can then be defined for arbitrary measurable functions. See a textbook ([Dur91];
other options are [Bil86, FG97, Res99]) for a more elaborate definition.
¹If f : X → Y and B ⊂ Y, then f⁻¹(B) := {x ∈ X | f(x) ∈ B} is a subset of X.
²if and only if
Definition 3.2. For a probability space (Ω, F, P ) and a random variable X : Ω → R
define
E[X] := ∫ X dP,
the expectation of the random variable X.
Remark 3.3. The integral and the expectation defined above satisfy all the nice
properties that we expect them to, such as linearity and monotonicity. This can be
proved by showing it first for simple functions, then for positive measurable
functions, and then for general measurable functions. Note that these properties
are not automatic, and although the arguments are straightforward, they amount to
quite a bit of work.
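The following sketch (a hypothetical example of mine) evaluates the expectation of Definition 3.2 on a finite probability space as the sum of Z(ω)P({ω}) over ω, and checks the linearity mentioned in Remark 3.3 numerically.

```python
# Expectation on a finite probability space, plus a numerical linearity check.

P = {"heads": 0.5, "tails": 0.5}                 # fair coin

X = {"heads": 1.0, "tails": 0.0}                 # indicator of heads
Y = {"heads": 3.0, "tails": -1.0}                # some other random variable


def E(Z, P=P):
    """E[Z] = integral of Z with respect to P = sum over omega of Z(omega) * P({omega})."""
    return sum(Z[w] * p for w, p in P.items())


lhs = E({w: 2 * X[w] + Y[w] for w in P})         # E[2X + Y]
rhs = 2 * E(X) + E(Y)                            # 2 E[X] + E[Y]
print(lhs, rhs)                                  # both 2.0: linearity
```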
Proposition 3.4. Let (S1, 𝒮1, µ1), . . . , (Sn, 𝒮n, µn) be σ-finite³ measure spaces.
Then there is a unique measure µ on the product space (S, 𝒮), where S = S1 × · · · × Sn
and 𝒮 is the σ-algebra generated by the sets A1 × · · · × An with Ai ∈ 𝒮i for all i,
such that
µ(A1 × · · · × An) = µ1(A1) · · · µn(An).
Example 3.5. Applied to n copies of (R, B(R)), this construction gives the
n-dimensional Borel σ-algebra on Rn, denoted by B(Rn).
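For finite measure spaces the product measure of Proposition 3.4 can be written down directly; the sketch below (my own example) checks the defining identity on a rectangle A1 × A2.

```python
# The product measure of two finite measure spaces, checked on a rectangle.
from itertools import product

mu1 = {"a": 0.2, "b": 0.8}                   # measure on S1 (point masses)
mu2 = {1: 0.5, 2: 0.3, 3: 0.2}               # measure on S2

# product measure on S1 x S2, defined on points and extended by additivity
mu = {(x, y): mu1[x] * mu2[y] for x, y in product(mu1, mu2)}

A1, A2 = {"a"}, {1, 3}
lhs = sum(mu[(x, y)] for x, y in product(A1, A2))        # mu(A1 x A2)
rhs = sum(mu1[x] for x in A1) * sum(mu2[y] for y in A2)  # mu1(A1) * mu2(A2)
print(abs(lhs - rhs) < 1e-12)                            # True
```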
Theorem 3.6 (Fubini). Let (S, 𝒮, µ) = (S1, 𝒮1, µ1) × (S2, 𝒮2, µ2) in the sense above
with σ-finite µ1 and µ2. Suppose f : S → R is a measurable function. If f is
nonnegative or ∫ |f| dµ < ∞, then
∫_S f dµ = ∫_{S1} ( ∫_{S2} f(x, y) µ2(dy) ) µ1(dx) = ∫_{S2} ( ∫_{S1} f(x, y) µ1(dx) ) µ2(dy).
Fubini’s theorem says that on a product space, under some conditions, it does not
matter over which ‘dimension’ you integrate first. A consequence of Fubini’s theorem
is that the expectation is multiplicative for independent⁴ random variables, i.e.,
E[XY] = E[X]E[Y].
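The following sketch (my own example, with made-up distributions P1 and P2) illustrates both points on a finite product space: summing in either order gives the same value, and for f(x, y) = xy this value is E[X]E[Y].

```python
# Fubini on a finite product space: the two iterated sums agree, and for
# independent X, Y the integral of f(x, y) = x*y equals E[X] * E[Y].

P1 = {1: 0.25, 2: 0.75}          # distribution of X
P2 = {1: 0.5, 3: 0.5}            # distribution of Y, independent of X


def f(x, y):
    return x * y


# integrate over y first, then x ...
order1 = sum(P1[x] * sum(f(x, y) * P2[y] for y in P2) for x in P1)
# ... or over x first, then y
order2 = sum(P2[y] * sum(f(x, y) * P1[x] for x in P1) for y in P2)

EX = sum(x * p for x, p in P1.items())
EY = sum(y * p for y, p in P2.items())
print(order1, order2, EX * EY)   # all equal: 3.5
```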
4. Random Walks
Definition 4.1. Let X1, X2, . . . be independent and identically distributed
random variables with values in Rd. Let S0 = 0 and Sn = X1 + · · · + Xn for n ≥ 1.
Then (Sn)_{n=0}^∞ is called a random walk.
If P(Xi = x) = 1/(2d) whenever exactly one coordinate of x ∈ Zd equals ±1 and all
other coordinates equal 0, then the random walk (Sn)_{n=0}^∞ is called a simple
random walk in Zd.
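A simple random walk is easy to simulate; the following Python sketch (not part of the original notes, with illustrative names) draws, in each step, a uniformly random coordinate and moves ±1 in that coordinate.

```python
# Simulating a simple random walk on Z^d as in Definition 4.1.
import random


def simple_random_walk(d, n_steps, seed=0):
    """Return the path S_0, S_1, ..., S_n of a simple random walk on Z^d."""
    rng = random.Random(seed)
    position = [0] * d
    path = [tuple(position)]
    for _ in range(n_steps):
        i = rng.randrange(d)                 # coordinate to move in
        position[i] += rng.choice((-1, 1))   # step +1 or -1, each with probability 1/2
        path.append(tuple(position))
    return path


path = simple_random_walk(d=2, n_steps=10)
print(path[:4])   # the first few positions, starting at the origin (0, 0)
```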
Definition 4.2. A simple random walk is called recurrent if
P(Sn = 0 for some n ≥ 1) = 1
and transient otherwise.
Proposition 4.3. The random walk (Sn)_{n=0}^∞ is recurrent iff Σ_{n=1}^∞ P(Sn = 0) = ∞.
Proof. Let N = Σ_{n=1}^∞ I_{{Sn = 0}} be the number of visits to the origin. Then
E[N] = Σ_{n=1}^∞ E[I_{{Sn = 0}}] = Σ_{n=1}^∞ P(Sn = 0)
by Fubini’s theorem.
³This requires a more elaborate definition.
⁴This requires a more elaborate definition.
If (Sn ) is recurrent, then N = ∞ with probability one and thus E[N ] = ∞.
Conversely, if (Sn) is transient, then define
q := P(Sn = 0 for some n ≥ 1) < 1.
We have P(N ≥ k) = q^k (after each return to the origin the walk starts afresh), so
E[N] = ∫_0^∞ P(N ≥ x) dx = Σ_{k=1}^∞ ∫_{k−1}^k P(N ≥ x) dx = Σ_{k=1}^∞ P(N ≥ k) = Σ_{k=1}^∞ q^k < ∞.
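The dichotomy can also be seen numerically. The sketch below (my own code) estimates, by Monte Carlo with a finite time horizon, the probability of returning to the origin within the first n_steps steps; this only approximates q = P(Sn = 0 for some n ≥ 1), but the qualitative difference between d = 1, 2 and d = 3 is already visible.

```python
# Monte Carlo estimate of the (finite-horizon) return probability of the
# simple random walk on Z^d.
import random


def returns_to_origin(d, n_steps, rng):
    position = [0] * d
    for _ in range(n_steps):
        i = rng.randrange(d)
        position[i] += rng.choice((-1, 1))
        if all(x == 0 for x in position):
            return True
    return False


def estimate_q(d, n_steps=1000, trials=1000, seed=0):
    rng = random.Random(seed)
    return sum(returns_to_origin(d, n_steps, rng) for _ in range(trials)) / trials


for d in (1, 2, 3):
    # d = 1: close to 1; d = 2: approaches 1 only slowly as the horizon grows;
    # d = 3: roughly 1/3, reflecting transience.
    print(d, estimate_q(d))
```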
We can now state the main theorem of this section.
Theorem 4.4. A simple random walk is recurrent for d = 1, 2 and transient for
d ≥ 3.
Sketch of the proof. The basic idea of the proof is to make use of Stirling’s formula
n! ∼ √(2π) n^{n+1/2} exp(−n),
where an ∼ bn means lim_{n→∞} an/bn = 1. The goal is to decide whether the series in
Proposition 4.3 diverges. We have P(Sn = 0) = 0 for n odd. Using Stirling’s formula
we obtain P(S_{2n} = 0) ∼ 1/√(πn) for d = 1, P(S_{2n} = 0) ∼ 1/(πn) for d = 2, and
P(S_{2n} = 0) ≤ C_d n^{−d/2} for d ≥ 3. The claim now follows since Σ_n n^{−1/2} and
Σ_n n^{−1} diverge, while Σ_n n^{−d/2} converges for d ≥ 3.
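These asymptotics can be checked numerically using the exact formulas P(S_{2n} = 0) = C(2n, n)/4^n for d = 1 and its square for d = 2 (standard formulas, not derived in the text); the partial sums then illustrate Proposition 4.3.

```python
# Exact return probabilities for d = 1, 2 against the Stirling asymptotics,
# and the behaviour of the partial sums of P(S_{2n} = 0).
from math import comb, pi, sqrt

for n in (10, 100, 1000):
    p1 = comb(2 * n, n) / 4**n                               # exact, d = 1
    print(p1 / (1 / sqrt(pi * n)), p1**2 / (1 / (pi * n)))   # both ratios tend to 1

N = 100_000
p, s1, s2, s3 = 1.0, 0.0, 0.0, 0.0
for n in range(1, N + 1):
    p *= (2 * n - 1) / (2 * n)    # recursion for C(2n, n)/4^n
    s1 += p                       # d = 1: grows like 2*sqrt(N/pi), diverges
    s2 += p * p                   # d = 2: grows like log(N)/pi, diverges slowly
    s3 += n ** -1.5               # stays bounded (about 2.6): convergence for d >= 3
print(s1, s2, s3)
```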
In other words, recurrence is a property that depends on the dimension of the space
in which the random walk takes place. If the dimension is small enough, i.e., d = 1, 2,
the ‘random walker’ will always come back to the origin (and, in fact, visits every
point of Zd). If the dimension of the space is ‘too large’, i.e., d ≥ 3, the walker
will eventually wander off for good. In fact, the boundary between transience and
recurrence is sharp at d = 2 when allowing spaces to have fractional dimension.
References
[Bil86] Patrick Billingsley, Probability and measure, second ed., Wiley Series in Probability
and Mathematical Statistics, John Wiley & Sons Inc., New York, 1986.
[Dur91] Richard Durrett, Probability: theory and examples, The Wadsworth & Brooks/Cole
Statistics/Probability Series, Wadsworth & Brooks/Cole Advanced Books & Software,
Pacific Grove, CA, 1991.
[FG97] Bert Fristedt and Lawrence Gray, A modern approach to probability theory, Probability
and its Applications, Birkhäuser Boston Inc., Boston, MA, 1997.
[Res99] Sidney I. Resnick, A probability path, Birkhäuser Boston Inc., Boston, MA, 1999.
[Rud76] Walter Rudin, Principles of mathematical analysis, third ed., International Series in
Pure and Applied Mathematics, McGraw-Hill Book Co., New York, 1976.
E-mail address: [email protected]