242
14 Conditional Expectation
Proof. Suppose that $X$ is independent of $\mathcal{G}$, $f : \mathbb{X} \to \mathbb{R}$ is a measurable function such that $f(X) \in L^1(\Omega, \mathcal{B}, P)$, $\mu := E[f(X)]$, and $A \in \mathcal{G}$. Then, by independence,
\[
E[f(X) : A] = E[f(X) 1_A] = E[f(X)]\, E[1_A] = E[\mu 1_A] = E[\mu : A].
\]
Therefore $E_{\mathcal{G}}[f(X)] = \mu = E[f(X)]$ a.s.

Conversely, if $E_{\mathcal{G}}[f(X)] = E[f(X)] = \mu$ and $A \in \mathcal{G}$, then
\[
E[f(X) 1_A] = E[f(X) : A] = E[\mu : A] = \mu E[1_A] = E[f(X)]\, E[1_A].
\]
Since this last equation is assumed to hold for all $A \in \mathcal{G}$ and all bounded measurable functions $f : \mathbb{X} \to \mathbb{R}$, $X$ is independent of $\mathcal{G}$.

The following remark is often useful in computing conditional expectations.

Remark 14.12 (Note well.). According to Lemma 14.1, $E(f|X) = \tilde{f}(X)$ a.s. for some measurable function $\tilde{f} : \mathbb{X} \to \mathbb{R}$. So computing $E(f|X) = \tilde{f}(X)$ is equivalent to finding a function $\tilde{f} : \mathbb{X} \to \mathbb{R}$ such that
\[
E[f \cdot h(X)] = E\big[\tilde{f}(X)\, h(X)\big] \tag{14.12}
\]
for all bounded and measurable functions $h : \mathbb{X} \to \mathbb{R}$. "The" function $\tilde{f} : \mathbb{X} \to \mathbb{R}$ is often denoted by writing $\tilde{f}(x) = E(f|X = x)$. If $P(X = x) > 0$, then $E(f|X = x) = E(f : X = x)/P(X = x)$, consistent with our previous definitions; compare with Example 14.10. If $P(X = x) = 0$, $E(f|X = x)$ is not given a value but is just a convenient notation for a function $\tilde{f} : \mathbb{X} \to \mathbb{R}$ such that Eq. (14.12) holds. (Roughly speaking, you should think that $E(f|X = x) = E[f \cdot \delta_x(X)]/E[\delta_x(X)]$ where $\delta_x$ is the "Dirac delta function" at $x$. If this last comment is confusing to you, please ignore it!)

Example 14.13. Suppose that $X$ is a random variable, $t \in \mathbb{R}$ and $f : \mathbb{R} \to \mathbb{R}$ is a measurable function such that $f(X) \in L^1(P)$. We wish to compute $E[f(X)|X \wedge t] = h(X \wedge t)$. So we are looking for a function $h : (-\infty, t] \to \mathbb{R}$ such that
\[
E[f(X)\, u(X \wedge t)] = E[h(X \wedge t)\, u(X \wedge t)] \tag{14.13}
\]
for all bounded measurable functions $u : (-\infty, t] \to \mathbb{R}$. Taking $u = 1_{\{t\}}$ in Eq. (14.13) implies
\[
E[f(X) : X \ge t] = h(t)\, P(X \ge t),
\]
and therefore we should take
\[
h(t) = E[f(X)|X \ge t],
\]
which by convention we set to be (say) zero if $P(X \ge t) = 0$. Now suppose that $u(t) = 0$; then Eq. (14.13) becomes
\[
E[f(X)\, u(X) : X < t] = E[h(X)\, u(X) : X < t],
\]
from which it follows that $f(X) 1_{X<t} = h(X) 1_{X<t}$ a.s. Thus we can take
\[
h(x) :=
\begin{cases}
f(x) & \text{if } x < t \\
E[f(X)|X \ge t] & \text{if } x = t
\end{cases}
\]
and we have shown
\[
\begin{aligned}
E[f(X)|X \wedge t] &= 1_{X<t}\, f(X) + 1_{X \ge t}\, E[f(X)|X \ge t] \\
&= 1_{X \wedge t < t}\, f(X) + 1_{X \wedge t = t}\, E[f(X)|X \ge t].
\end{aligned}
\]
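The formula of Example 14.13 can be sanity-checked numerically. The sketch below is only an illustration under assumptions not made in the text: $X$ is taken to be exponential with rate 1, $t = 1$, $f(x) = x^2$, and $u = \cos$ is the test function; the helper name `conditional_expectation_given_min` is hypothetical. The value $h(t)$ is estimated by the sample mean of $f(X)$ over $\{X \ge t\}$, so the empirical version of Eq. (14.13) holds up to rounding, while the comparison of $h(t)$ with its exact value is only a Monte Carlo approximation.

```python
import math
import random


def conditional_expectation_given_min(xs, f, t):
    """Build h with h(s) = f(s) for s < t and h(t) = sample mean of f(X) on {X >= t}."""
    tail = [f(x) for x in xs if x >= t]
    # Convention from the example: set h(t) to zero if {X >= t} is (empirically) empty.
    h_t = sum(tail) / len(tail) if tail else 0.0

    def h(s):
        return f(s) if s < t else h_t

    return h


def f(x):
    # Illustrative integrand (an assumption, not from the text).
    return x * x


rng = random.Random(0)          # fixed seed so the run is reproducible
t = 1.0                         # illustrative truncation level
xs = [rng.expovariate(1.0) for _ in range(200_000)]  # X ~ Exp(1), an assumption
u = math.cos                    # illustrative bounded test function

h = conditional_expectation_given_min(xs, f, t)
n = len(xs)

# Both sides of Eq. (14.13) under the empirical measure of the sample.
lhs = sum(f(x) * u(min(x, t)) for x in xs) / n
rhs = sum(h(min(x, t)) * u(min(x, t)) for x in xs) / n

print("h(t) estimate:", h(t))
print("E[f(X) u(X ^ t)]     :", lhs)
print("E[h(X ^ t) u(X ^ t)] :", rhs)
```

For this particular choice the memoryless property gives $E[f(X)|X \ge 1] = E[(1+Z)^2] = 5$ with $Z$ exponential of rate 1, so the printed estimate of $h(t)$ should be close to 5.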
The following exercise should help you gain some more intuition about conditional expectations.

Exercise 14.3. Let $(\mathbb{X}, \mathcal{M})$ and $(\mathbb{Y}, \mathcal{N})$ be measurable spaces, $(\Omega, \mathcal{F}, P)$ a probability space, and $X : \Omega \to \mathbb{X}$ and $Y : \Omega \to \mathbb{Y}$ be measurable functions. Further assume that $\mathcal{G} \subset \mathcal{F}$ is a σ-algebra such that $X$ is $\mathcal{G}/\mathcal{M}$-measurable and $Y$ is independent of $\mathcal{G}$. Then for any bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{R}})$-measurable function $f : \mathbb{X} \times \mathbb{Y} \to \mathbb{R}$ we have
\[
E[f(X, Y)|\mathcal{G}] = h_f(X) = E[f(x, Y)]\big|_{x=X} \quad \text{a.s.} \tag{14.14}
\]
where, if $\mu := \mathrm{Law}_P(Y)$,
\[
h_f(x) := E[f(x, Y)] = \int_{\mathbb{Y}} f(x, y)\, d\mu(y). \tag{14.15}
\]

Solution to Exercise (14.3). The proof is an exercise in using the multiplicative systems Theorem 8.2. In more detail, let
\[
\mathbb{H} := \left\{ f \in [\mathcal{M} \otimes \mathcal{N}]_b \;:\; h_f \text{ is } \mathcal{M}\text{-measurable and } E[f(X, Y)|\mathcal{G}] = h_f(X) \text{ a.s.} \right\}
\]
and let
\[
\mathbb{M} := \left\{ f \in [\mathcal{M} \otimes \mathcal{N}]_b \;:\; f(x, y) = u(x)\, v(y) \text{ where } u \in [\mathcal{M}]_b \text{ and } v \in [\mathcal{N}]_b \right\}.
\]
For $f(x, y) = u(x)\, v(y)$ in $\mathbb{M}$,
\[
h_f(\cdot) = u(\cdot)\, E[v(Y)] \in [\mathcal{M}]_b
\]
and, by the pull-out property (Theorem 14.5) of conditional expectation and Lemma 14.11,
\[
\begin{aligned}
E[f(X, Y)|\mathcal{G}] &= E[u(X)\, v(Y)|\mathcal{G}] = u(X)\, E[v(Y)|\mathcal{G}] \\
&= u(X)\, E[v(Y)] = u(X)\, \mu(v) = h_f(X) \quad \text{a.s.}
\end{aligned}
\]
Thus we have shown $\mathbb{M} \subset \mathbb{H}$. Since $\sigma(\mathbb{M}) = \mathcal{M} \otimes \mathcal{N}$, the multiplicative system theorem implies $\mathbb{H}$ consists of all bounded measurable functions on $\mathbb{X} \times \mathbb{Y}$, provided we show $\mathbb{H}$ is a linear subspace which is closed under bounded convergence. The fact that $\mathbb{H}$ is a subspace follows easily from the linearity of the expectation and conditional expectation operators. So it remains to check that $\mathbb{H}$ is closed under bounded convergence, which we now do.

Suppose that $\{f_n\}_{n=1}^{\infty} \subset \mathbb{H}$ and $f_n \to f$ boundedly. By DCT it follows that $h_{f_n}(x) \to h_f(x)$ as $n \to \infty$ for all $x \in \mathbb{X}$ and therefore $h_f$ is still $\mathcal{M}$-measurable. Moreover, if $C$ is a finite constant and $|f_n| \le C$ for all $n$, then
\[
|h_{f_n}(x)| = |E f_n(x, Y)| \le E|f_n(x, Y)| \le C
\]
and therefore $h_{f_n} \to h_f$ boundedly. Finally, for any $h \in L^{\infty}(\Omega, \mathcal{G}, P)$,
\[
\begin{aligned}
E[f(X, Y)\, h] &= E\Big[\lim_{n \to \infty} f_n(X, Y)\, h\Big] \overset{\text{DCT}}{=} \lim_{n \to \infty} E[f_n(X, Y)\, h] \\
&\overset{f_n \in \mathbb{H}}{=} \lim_{n \to \infty} E[h_{f_n}(X)\, h] \overset{\text{DCT}}{=} E\Big[\lim_{n \to \infty} h_{f_n}(X)\, h\Big] = E[h_f(X)\, h]
\end{aligned}
\]
and so we may conclude $E[f(X, Y)|\mathcal{G}] = h_f(X)$ a.s., i.e. $f \in \mathbb{H}$.
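Equation (14.14) can be verified exactly on a small finite example. The following sketch is an illustration under assumptions that are not in the text: $X$ and $Y$ are independent and take finitely many values with the (made-up) probabilities below, $\mathcal{G} = \sigma(X)$, and $f(x, y) = x y^2 + (x + 1) y$. Since $\mathcal{G}$ is generated by the events $\{X = x_0\}$, it suffices to check $E[f(X, Y) 1_{X = x_0}] = E[h_f(X) 1_{X = x_0}]$ for each $x_0$. The names `p_x`, `p_y`, `h_f`, and `expect` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Illustrative laws of X and Y (assumptions); X and Y are independent by construction.
p_x = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
p_y = {-1: Fraction(1, 4), 0: Fraction(1, 4), 3: Fraction(1, 2)}


def f(x, y):
    # Illustrative bounded function that is not of product form u(x) v(y).
    return x * y * y + (x + 1) * y


def h_f(x):
    """h_f(x) = E[f(x, Y)], the integral of f(x, .) against the law of Y (Eq. 14.15)."""
    return sum(f(x, y) * q for y, q in p_y.items())


def expect(phi):
    """E[phi(X, Y)] under the product law of (X, Y)."""
    return sum(
        phi(x, y) * p * q
        for (x, p), (y, q) in product(p_x.items(), p_y.items())
    )


def indicator(condition):
    return 1 if condition else 0


# Compare both sides of the defining identity on each atom {X = x0} of G = sigma(X).
checks = []
for x0 in p_x:
    lhs = expect(lambda x, y: f(x, y) * indicator(x == x0))
    rhs = expect(lambda x, y: h_f(x) * indicator(x == x0))
    checks.append((x0, lhs, rhs))

for x0, lhs, rhs in checks:
    print(f"x0={x0}: E[f(X,Y); X=x0] = {lhs}, E[h_f(X); X=x0] = {rhs}")
```

Exact rational arithmetic is used so that equality of the two sides can be tested without a tolerance.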
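Proposition 14.14. Suppose that $(\Omega, \mathcal{B}, P)$ is a probability space, $(\mathbb{X}, \mathcal{M}, \mu)$ and $(\mathbb{Y}, \mathcal{N}, \nu)$ are two σ-finite measure spaces, $X : \Omega \to \mathbb{X}$ and $Y : \Omega \to \mathbb{Y}$ are measurable functions, and there exists $0 \le \rho \in L^1(\mathbb{X} \times \mathbb{Y}, \mathcal{M} \otimes \mathcal{N}, \mu \otimes \nu)$ such that $P((X, Y) \in U) = \int_U \rho(x, y)\, d\mu(x)\, d\nu(y)$ for all $U \in \mathcal{M} \otimes \mathcal{N}$. Let
\[
\bar{\rho}(x) := \int_{\mathbb{Y}} \rho(x, y)\, d\nu(y) \tag{14.16}
\]
and, for $x \in \mathbb{X}$ and $B \in \mathcal{N}$, let
\[
Q(x, B) :=
\begin{cases}
\dfrac{1}{\bar{\rho}(x)} \displaystyle\int_B \rho(x, y)\, d\nu(y) & \text{if } \bar{\rho}(x) \in (0, \infty) \\[2ex]
\delta_{y_0}(B) & \text{if } \bar{\rho}(x) \in \{0, \infty\}
\end{cases}
\tag{14.17}
\]
where $y_0$ is some arbitrary but fixed point in $\mathbb{Y}$. Then for any bounded (or non-negative) measurable function $f : \mathbb{X} \times \mathbb{Y} \to \mathbb{R}$, we have
\[
E[f(X, Y)|X] = Q(X, f(X, \cdot)) =: \int_{\mathbb{Y}} f(X, y)\, Q(X, dy) = g(X) \quad \text{a.s.} \tag{14.18}
\]
where
\[
g(x) := \int_{\mathbb{Y}} f(x, y)\, Q(x, dy) = Q(x, f(x, \cdot)).
\]
As usual we use the notation
\[
Q(x, v) := \int_{\mathbb{Y}} v(y)\, Q(x, dy) =
\begin{cases}
\dfrac{1}{\bar{\rho}(x)} \displaystyle\int_{\mathbb{Y}} v(y)\, \rho(x, y)\, d\nu(y) & \text{if } \bar{\rho}(x) \in (0, \infty) \\[2ex]
\delta_{y_0}(v) = v(y_0) & \text{if } \bar{\rho}(x) \in \{0, \infty\}
\end{cases}
\]
for all bounded measurable functions $v : \mathbb{Y} \to \mathbb{R}$.

Proof. Our goal is to compute $E[f(X, Y)|X]$. According to Remark 14.12, we are searching for a bounded measurable function $g : \mathbb{X} \to \mathbb{R}$ such that
\[
E[f(X, Y)\, h(X)] = E[g(X)\, h(X)] \quad \text{for all } h \in \mathcal{M}_b. \tag{14.19}
\]
(Throughout this argument we are going to repeatedly use the Tonelli-Fubini theorems.) We now explicitly write out both sides of Eq. (14.19):
\[
\begin{aligned}
E[f(X, Y)\, h(X)] &= \int_{\mathbb{X} \times \mathbb{Y}} h(x)\, f(x, y)\, \rho(x, y)\, d\mu(x)\, d\nu(y) \\
&= \int_{\mathbb{X}} h(x) \left[ \int_{\mathbb{Y}} f(x, y)\, \rho(x, y)\, d\nu(y) \right] d\mu(x)
\end{aligned}
\tag{14.20}
\]
\[
\begin{aligned}
E[g(X)\, h(X)] &= \int_{\mathbb{X} \times \mathbb{Y}} h(x)\, g(x)\, \rho(x, y)\, d\mu(x)\, d\nu(y) \\
&= \int_{\mathbb{X}} h(x)\, g(x)\, \bar{\rho}(x)\, d\mu(x).
\end{aligned}
\tag{14.21}
\]
Since the right sides of Eqs. (14.20) and (14.21) must be equal for all $h \in \mathcal{M}_b$, we must demand (see Lemma 7.23 and 7.24) that
\[
\int_{\mathbb{Y}} f(x, y)\, \rho(x, y)\, d\nu(y) = g(x)\, \bar{\rho}(x) \quad \text{for } \mu\text{-a.e. } x. \tag{14.22}
\]
There are two possible problems in solving this equation for $g(x)$ at a particular point $x$; the first is when $\bar{\rho}(x) = 0$ and the second is when $\bar{\rho}(x) = \infty$. Since
\[
\int_{\mathbb{X}} \bar{\rho}(x)\, d\mu(x) = \int_{\mathbb{X}} \int_{\mathbb{Y}} \rho(x, y)\, d\nu(y)\, d\mu(x) = 1,
\]
we know that $\bar{\rho}(x) < \infty$ for $\mu$-a.e. $x$ and therefore it does not matter how $g$ is defined on $\{\bar{\rho} = \infty\}$ as long as it is measurable. If
\[
0 = \bar{\rho}(x) = \int_{\mathbb{Y}} \rho(x, y)\, d\nu(y),
\]
then $\rho(x, y) = 0$ for $\nu$-a.e. $y$ and therefore,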
Page: 243 job: prob macro: svmonob.cls date/time: 14-Feb-2014/12:27