APPENDIX
We start by recalling and proving the last two lemmas seen during the
last lecture.
Lemma A.20. Let G be the σ-algebra generated by either the event B or the random variable Z. Then a random variable Y is a version of the conditional expectation with respect to G, i.e. Y = E[X|G] P-almost surely, if and only if it satisfies

(i) Y is G-measurable;

(ii) ∫_A X dP = ∫_A Y dP for all A ∈ G.
Proof. (⇒) The measurability with respect to G is a direct consequence of the definitions in (A.3)-(A.4). Regarding the property (ii), we distinguish the two cases.
If G = σ(B) = {∅, B, B^c, Ω}, then

    ∫_A E[X|G] dP = ∫_A (E[X|B] 1_B(ω) + E[X|B^c] 1_{B^c}(ω)) P(dω)

                  = 0                                                   if A = ∅,
                  = ∫_B E[X|B] P(dω)                                    if A = B,
                  = ∫_{B^c} E[X|B^c] P(dω)                              if A = B^c,
                  = ∫_B E[X|B] P(dω) + ∫_{B^c} E[X|B^c] P(dω)           if A = Ω,

and

    ∫_B E[X|B] P(dω) = E[X|B] P(B) = ∫_B X(ω) P(dω)

(and analogously for B^c).
If G = σ(Z) = σ({Z = z_j}, j = 1, . . . , m), then, for all j = 1, . . . , m,

    ∫_{Z=z_j} E[X|G] dP = ∫_{Z=z_j} E[X|{Z = z_j}] P(dω)
                        = E[X|{Z = z_j}] P({Z = z_j})
                        = ∫_{Z=z_j} X dP.
(⇐) Since both Y and E[X|G] are G-measurable, Y - E[X|G] is also a G-measurable random variable. Thus the event {Y - E[X|G] > 0} = (Y - E[X|G])^{-1}((0, +∞)) belongs to G, so that we can apply property (ii) to it:

    ∫_{Y-E[X|G]>0} (Y - E[X|G]) dP = ∫_{Y-E[X|G]>0} (X - X) dP = 0.

This implies that P(Y - E[X|G] > 0) = 0. Through an analogous procedure we prove P(Y - E[X|G] < 0) = 0. So eventually we have P(Y ≠ E[X|G]) = 0.
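As a concrete illustration of the characterization (i)-(ii) for G = σ(B), here is a minimal sketch in Python. It assumes an illustrative finite sample space (a fair six-sided die, with B the event "the outcome is even"), builds the candidate Y = E[X|B] 1_B + E[X|B^c] 1_{B^c}, and checks that ∫_A X dP = ∫_A Y dP for every A ∈ σ(B).

# Sketch of Lemma A.20 for G = sigma(B); the fair-die space below is an illustrative assumption.
Omega = [1, 2, 3, 4, 5, 6]
P = {w: 1/6 for w in Omega}               # uniform (fair die) probability measure
X = {w: w for w in Omega}                 # X = outcome of the die
B = [2, 4, 6]                             # the event "even outcome"
Bc = [w for w in Omega if w not in B]

def prob(event):
    return sum(P[w] for w in event)

def integral(Y, event):
    """Integral of Y over the event with respect to P."""
    return sum(Y[w] * P[w] for w in event)

# Elementary conditional expectations E[X|B] and E[X|B^c]
E_X_B = integral(X, B) / prob(B)          # = 4
E_X_Bc = integral(X, Bc) / prob(Bc)       # = 3

# Candidate version of E[X|sigma(B)]: constant on B and on B^c, hence sigma(B)-measurable
Y = {w: (E_X_B if w in B else E_X_Bc) for w in Omega}

# Property (ii): the integrals of X and Y agree on every A in sigma(B) = {emptyset, B, B^c, Omega}
for A in ([], B, Bc, Omega):
    assert abs(integral(X, A) - integral(Y, A)) < 1e-12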
Lemma A.21. Let X, Z be two simple random variables on (Ω, F, P). Then

    E[X|Z] = Σ_{i=1}^{n} x_i P({X = x_i}|Z),

where X(Ω) = {x_1, . . . , x_n} and we denote P(F|Z) := E[1_F|Z] for any F ∈ F.
Proof. Suppose Z(Ω) = {z_1, . . . , z_m}. For all j = 1, . . . , m and for any ω ∈ {Z = z_j}, we have

    E[X|Z](ω) = E[X|{Z = z_j}]
              = (1/P({Z = z_j})) ∫_{Z=z_j} X dP
              = (1/P({Z = z_j})) Σ_{i=1}^{n} x_i P({X = x_i} ∩ {Z = z_j})
              = Σ_{i=1}^{n} x_i P({X = x_i}|{Z = z_j}).
Note that, given a simple random variable Z on (Ω, F, P), the conditional expectation of X with respect to Z can be written as:

    E[X|Z](ω) = E[X|Z = Z(ω)],    ω ∈ Ω.    (A.5)
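To make the formula concrete, the following sketch (again Python, on a small illustrative sample space with the uniform measure) computes E[X|Z] by conditioning on the level sets of Z, as in (A.5), and checks that it agrees with the sum in Lemma A.21. The particular choices of Ω, X and Z are assumptions made up for the example.

# Sketch of Lemma A.21 on a finite sample space (illustrative data).
Omega = range(6)
P = {w: 1/6 for w in Omega}          # uniform probability measure (assumption)
X = {w: w % 3 for w in Omega}        # a simple random variable X
Z = {w: w % 2 for w in Omega}        # a simple random variable Z

def prob(event):
    return sum(P[w] for w in event)

def cond_exp_given_Z(w):
    """E[X|Z](w) = E[X | {Z = Z(w)}], as in (A.5)."""
    level_set = [u for u in Omega if Z[u] == Z[w]]
    return sum(X[u] * P[u] for u in level_set) / prob(level_set)

def lemma_A21_formula(w):
    """Sum over i of x_i * P({X = x_i} | {Z = Z(w)})."""
    level_set = [u for u in Omega if Z[u] == Z[w]]
    return sum(x * prob([u for u in level_set if X[u] == x]) / prob(level_set)
               for x in set(X.values()))

for w in Omega:
    assert abs(cond_exp_given_Z(w) - lemma_A21_formula(w)) < 1e-12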
The following is another equivalent condition for independence of simple
random variables using the conditional expectation.
Lemma A.22. Let X, Z be two simple random variables on (Ω, F, P) with X(Ω) = {x_1, . . . , x_m}. Then X, Z are independent if and only if the random variable P({X = x_i}|Z) is constant on (Ω, F, P) for every i = 1, . . . , m.
Proof. By (A.5) and (A.2), for any ω ∈ Ω we have

    P({X = x_i}|Z)(ω) = E[1_{X=x_i}|Z = Z(ω)]
                      = (1/P(Z = Z(ω))) ∫_{Z=Z(ω)} 1_{X=x_i} dP
                      = P({X = x_i} ∩ {Z = Z(ω)}) / P(Z = Z(ω))
                      = P({X = x_i}|{Z = Z(ω)}).

If X, Z are independent, this is clearly a constant, by Remark A.14. Conversely, suppose that Z(Ω) = {z_1, . . . , z_k}. If, for all j = 1, . . . , k and for all ω ∈ Z^{-1}({z_j}),

    P({X = x_i}|Z)(ω) = P({X = x_i}|{Z = z_j}) = p_i,

then

    P({X = x_i} ∩ {Z = z_j}) = p_i P(Z = z_j),

and summing over j = 1, . . . , k we obtain p_i = P(X = x_i). Substituting into the previous equation, we see that the events {X = x_i} and {Z = z_j} are independent. Since this holds for all i = 1, . . . , m and j = 1, . . . , k, X and Z are independent.
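As a sanity check of this criterion, the sketch below (Python, reusing the same kind of illustrative finite sample space) tests whether P({X = x_i}|Z) is constant across the level sets of Z, which by Lemma A.22 is equivalent to the independence of X and Z. All concrete choices are assumptions for the example.

# Sketch of the independence criterion of Lemma A.22 on a finite sample space (illustrative data).
Omega = range(6)
P = {w: 1/6 for w in Omega}
X = {w: w % 3 for w in Omega}   # X takes the values 0, 1, 2
Z = {w: w % 2 for w in Omega}   # Z takes the values 0, 1

def prob(event):
    return sum(P[w] for w in event)

def independent_by_lemma_A22(X, Z):
    """X, Z independent iff P({X = x}|Z) does not depend on the observed value of Z."""
    for x in set(X.values()):
        cond_probs = set()
        for z in set(Z.values()):
            level = [w for w in Omega if Z[w] == z]
            cond_probs.add(round(prob([w for w in level if X[w] == x]) / prob(level), 12))
        if len(cond_probs) > 1:      # P({X = x}|Z) is not constant
            return False
    return True

print(independent_by_lemma_A22(X, Z))   # True here: w % 3 and w % 2 are independent under the uniform measure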
A.0.3 Conditional Expectation: General Definition and Properties

The properties (i)-(ii) shown in Lemma A.20 for the conditional expectation with respect to particular kinds of σ-algebra, G = σ({B}) or G = σ(Z), can be extended to the definition of the conditional expectation with respect to a generic σ-algebra.
Theorem A.23. Let X be an integrable random variable on (Ω, F, P) and let G be any σ-algebra contained in F, G ⊆ F. There exists a random variable Y on (Ω, F, P) satisfying

(i) Y is integrable, i.e. ∫_Ω |Y| dP < ∞, and G-measurable;

(ii) ∫_A X dP = ∫_A Y dP, or equivalently E[X 1_A] = E[Y 1_A], for all A ∈ G.

Moreover, Y is unique up to a negligible event, that is: if there exists another random variable Z on (Ω, F, P) satisfying (i)-(ii), then Y = Z P-almost surely.
Definition A.24. The conditional expectation of X with respect to G is any member of the equivalence class of random variables on (Ω, F, P) satisfying (i)-(ii) in Theorem A.23. It is denoted by E[X|G].
Definition A.24 is given under the most general assumptions on the probability space and random variables considered. In the case of simple random
variables or finite sample spaces, we do not require the integrability of Y in
(i), because it is always satisfied.
Note that the conditional expectation E[X|G] is G-measurable even when
X is not. It represents the best estimate of X based on the information
contained in G.
Proof (Theorem A.23: Almost sure uniqueness). We proceed as in the proof of the implication (⇐) in Lemma A.20. Assume that there exist two random variables Y, Z on (Ω, F, P) satisfying (i)-(ii). Then {Y > Z} = (Y - Z)^{-1}((0, ∞)) ∈ G by (i), and by (ii) we have

    ∫_{Y>Z} (Y - Z) dP = ∫_{Y>Z} (X - X) dP = 0.

Thus P(Y > Z) = 0. In a symmetric way we obtain P(Y < Z) = 0, and so Y = Z P-almost surely.
In order to prove the existence, we have to resort to a classical result in Probability. First, given two measures P, Q on (Ω, F), we say that Q is
P-absolutely continuous if, for all A ∈ F such that P(A) = 0, we also have Q(A) = 0. In this case we write Q ≪_F P, or simply Q ≪ P when there is no ambiguity. If Q is P-absolutely continuous and P is Q-absolutely continuous, we say that P and Q are equivalent and we write Q ∼ P. Note that the notion of absolute continuity is related to the σ-algebra considered.
Lemma A.25 (Radon-Nikodym Theorem). Let P, Q be finite measures on (Ω, F) such that Q ≪ P. Then, there exists a map L : Ω → [0, ∞) such that:

1. L is F-measurable (i.e. L is a random variable),

2. L is P-integrable,

3. Q(A) = ∫_A L dP for all A ∈ F.

Moreover, L is unique up to a negligible event. L is called the density, or the Radon-Nikodym derivative, of Q with respect to P on F, and the notation

    L = dQ/dP |_F ≡ dQ/dP

is used.
For instance, any distribution P defined as in Proposition A.8 is absolutely continuous with respect to the Lebesgue measure m, that is, P ≪_B m. Actually, the converse also holds: every measure which is m-absolutely continuous can be written in the form (A.1).
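On a finite sample space the Radon-Nikodym density can be exhibited directly: if P({ω}) > 0 for every ω, one may take L(ω) = Q({ω})/P({ω}). The sketch below (Python, with two illustrative measures chosen only for the example) verifies property 3 of Lemma A.25 on every subset of Ω.

from itertools import combinations

# Sketch of the Radon-Nikodym density on a finite sample space; P and Q are illustrative.
Omega = [0, 1, 2, 3]
P = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}        # reference measure, P({w}) > 0 for all w
Q = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}    # Q << P holds since P charges every point

# Density L = dQ/dP, defined pointwise on a finite space
L = {w: Q[w] / P[w] for w in Omega}

def measure(mu, event):
    return sum(mu[w] for w in event)

# Check Q(A) = integral of L over A with respect to P, for every subset A of Omega
for r in range(len(Omega) + 1):
    for A in combinations(Omega, r):
        assert abs(measure(Q, A) - sum(L[w] * P[w] for w in A)) < 1e-12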
We are now able to prove the existence of the conditional expectation.
Proof (Theorem A.23: Existence). Assume first that X ≥ 0. Define a measure Q on (Ω, G) by Q(A) = ∫_A X dP for all A ∈ G. Then Q is finite, because X is P-integrable. Moreover, we have Q ≪_G P; thus, by Lemma A.25, there exists a G-measurable and P-integrable random variable Y such that Q(A) = ∫_A Y dP for all A ∈ G. Such a random variable Y satisfies the properties (i)-(ii). For a general integrable X, it suffices to apply this argument to the positive and negative parts X^+ and X^- and set Y := E[X^+|G] - E[X^-|G].
Remark A.26. Property (ii) in Definition A.24 is equivalent to the following:

(ii bis) E[XV] = E[Y V] for all bounded and G-measurable random variables V on (Ω, F, P).
Proof. We only prove it in the case where both X and V are simple random variables.

(ii)⇒(ii bis). This follows from the fact that V is G-measurable and from the linearity of the expectation. Indeed, by assumption,
    V = Σ_{j=1}^{n} v_j 1_{B_j},    where B_j ∈ G ⊆ F for all j = 1, . . . , n.
Then

    E[XV] = E[X Σ_{i=1}^{n} v_i 1_{B_i}] = Σ_{i=1}^{n} v_i E[X 1_{B_i}] = Σ_{i=1}^{n} v_i E[Y 1_{B_i}] = E[Y V]

by (ii).
(ii bis)⇒(ii). It is enough to consider all random variables of the form V = 1_A, A ∈ G, to get (ii).
Proposition A.27 (Properties of the conditional expectation). Let X, Y be two integrable random variables on (Ω, F, P) and let H, G ⊆ F be two sub-σ-algebras of F. Then:

1. If X is G-measurable, then X = E[X|G].

2. If X is independent of G, i.e. σ(X), G are independent, then E[X|G] = E[X].

3. E[E[X|G]] = E[X].

4. [Linearity in the argument] If a, b ∈ R, then E[aX + bY|G] = aE[X|G] + bE[Y|G].

5. [Linearity w.r.t. convex combinations of measures] Let λ ∈ [0, 1] and let P, Q be two probability measures on (Ω, F). Then E^{λP+(1-λ)Q}[X|G] = λ E^P[X|G] + (1-λ) E^Q[X|G].

6. [Monotonicity] If X ≤ Y, then E[X|G] ≤ E[Y|G].

7. If Y is G-measurable and bounded, then E[Y X|G] = Y E[X|G].
8. If Y is independent of σ(X, G), then E[Y X|G] = E[Y]E[X|G].

9. If H ⊆ G, then E[E[X|G]|H] = E[E[X|H]|G] = E[X|H].

10. [Jensen inequality] Let φ : R → R be a convex function such that φ(X) is P-integrable, then φ(E[X|G]) ≤ E[φ(X)|G].
Proof.
1. Trivial, by Definition A.24.
2. E[X] is a constant, thus σ(E[X]) = {∅, Ω} ⊆ G, i.e. E[X] is G-measurable. Then, for every bounded and G-measurable random variable V, the variables X and V are independent and

    E[XV] = E[X]E[V] = E[E[X]V].
3. By (ii bis) with V = 1.
4. By Definition A.24: the set of G-measurable random variables is a
vector space, and the integral is linear.
5. Again by linearity of the integral, but with respect to the measure.
6. By monotonicity of the integral.
7. Consider the random variable Z := Y E[X|G]. We want to prove that
Z satisfies the properties (i)-(ii bis). Since both Y and E[X|G] are
G-measurable, so is Z; for every bounded and G-measurable random
variable V ,
E[ZV] = E[Y E[X|G] V] = E[Y V X],
by (ii bis) for E[X|G]. Thus Z = E[XY |G].
8. As before, let us consider the random variable Z := E[Y ]E[X|G] and
prove (i)-(ii bis). Z is G-measurable as the product of a constant and a
G-measurable variable. Moreover, for every bounded and G-measurable
random variable V ,
E[ZV] = E[E[Y] E[X|G] V] = E[E[Y] X V] = E[Y] E[X V] = E[Y X V],
by (ii bis) for E[X|G] and the independence of Y, XV . Thus Z =
E[XY |G].
9. Consider Z := E[E[X|G]|H]. It is H-measurable by definition; for every bounded and H-measurable random variable V,

    E[ZV] = E[E[E[X|G]|H] V]
          = E[E[V E[X|G]|H]]    (by 7.)
          = E[V E[X|G]]         (by 3.)
          = E[E[V X|G]]         (by 7., since V is also G-measurable)
          = E[V X].             (by 3.)

Thus Z = E[X|H]. The remaining equality E[E[X|H]|G] = E[X|H] follows from property 1, since E[X|H] is H-measurable and hence G-measurable.
10. We recall a property of convexity: any convex function φ is the supremum of all affine functions dominated by it, i.e. for all x ∈ R

    φ(x) = sup_{l∈L} l(x),    where L := {l : R → R | l(x) = ax + b, a, b ∈ R, l ≤ φ}.

Then

    E[φ(X)|G] = E[sup_{l∈L} l(X) | G]
              ≥ sup_{l∈L} E[l(X)|G]      (by 6.)
              = sup_{l∈L} l(E[X|G])      (by 4.)
              = φ(E[X|G]).
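To close the section, here is a small numerical sketch (Python, on an illustrative finite probability space) that checks the tower property (property 9.) and Jensen's inequality (property 10.) for σ-algebras generated by two nested partitions. The partitions, the measure and the random variable are assumptions made up for the example.

# Sketch checking properties 9. and 10. on a finite sample space (illustrative data).
Omega = list(range(8))
P = {w: 1/8 for w in Omega}
X = {w: (w - 3) ** 2 for w in Omega}          # an integrable (finite) random variable

# G is generated by a partition into pairs, H by a coarser partition into quadruples, so H ⊆ G
partition_G = [[0, 1], [2, 3], [4, 5], [6, 7]]
partition_H = [[0, 1, 2, 3], [4, 5, 6, 7]]

def cond_exp(Y, partition):
    """E[Y|sigma(partition)] as a function on Omega: the P-average of Y on each cell."""
    out = {}
    for cell in partition:
        p_cell = sum(P[w] for w in cell)
        avg = sum(Y[w] * P[w] for w in cell) / p_cell
        for w in cell:
            out[w] = avg
    return out

E_X_G = cond_exp(X, partition_G)
E_X_H = cond_exp(X, partition_H)

# Property 9. (tower property): E[E[X|G]|H] = E[X|H]
tower = cond_exp(E_X_G, partition_H)
assert all(abs(tower[w] - E_X_H[w]) < 1e-12 for w in Omega)

# Property 10. (Jensen) with the convex function phi(x) = x**2
phi = lambda x: x ** 2
E_phiX_G = cond_exp({w: phi(X[w]) for w in Omega}, partition_G)
assert all(phi(E_X_G[w]) <= E_phiX_G[w] + 1e-12 for w in Omega)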