Conditional Expectation
Recall that

If X and Y are continuous random variables, then the conditional density function of Y given X = x is given by
$$f_{Y/X}(y/x) = \frac{f_{X,Y}(x,y)}{f_X(x)}$$

If X and Y are discrete random variables, then the conditional probability mass function of Y given X = x is given by
$$p_{Y/X}(y/x) = \frac{p_{X,Y}(x,y)}{p_X(x)}$$
The conditional expectation of Y given X = x is defined by
$$\mu_{Y/X=x} = E(Y/X=x) = \begin{cases} \displaystyle\int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\,dy & \text{if } X \text{ and } Y \text{ are continuous} \\[2ex] \displaystyle\sum_{y \in R_Y} y\, p_{Y/X}(y/x) & \text{if } X \text{ and } Y \text{ are discrete} \end{cases}$$
The conditional expectation of Y given X = x is also called the conditional mean of Y given X = x. Clearly, $\mu_{Y/X=x}$ denotes the centre of mass of the conditional pdf or the conditional pmf.
Remark

We can similarly define the conditional expectation of X given Y = y, denoted by $E(X/Y=y)$.

Higher-order conditional moments can be defined in a similar manner.

In particular, the conditional variance of Y given X = x is given by
$$\sigma^2_{Y/X=x} = E[(Y - \mu_{Y/X=x})^2 / X = x]$$
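These definitions translate directly into a numeric recipe. Below is a minimal sketch (not part of the original notes): it uses an assumed joint pdf, f(x, y) = x + y on the unit square, and approximates the conditional pdf, the conditional mean and the conditional variance by Riemann sums; the pdf and grid size are illustrative choices only.

```python
import numpy as np

# Minimal sketch (not from the notes): approximate f_{Y/X}(y/x), E(Y/X=x) and
# var(Y/X=x) by Riemann sums for an assumed joint pdf f(x, y) = x + y on [0,1]^2.
y_grid = np.linspace(0.0, 1.0, 2001)
dy = y_grid[1] - y_grid[0]

def joint_pdf(x, y):
    return x + y                          # a valid pdf on the unit square

def conditional_mean_var(x):
    fxy = joint_pdf(x, y_grid)            # slice of the joint pdf at X = x
    fx = np.sum(fxy) * dy                 # marginal f_X(x) = integral of f_{X,Y}(x,y) dy
    f_cond = fxy / fx                     # conditional pdf f_{Y/X}(y/x)
    mean = np.sum(y_grid * f_cond) * dy               # E(Y/X=x)
    var = np.sum((y_grid - mean) ** 2 * f_cond) * dy  # var(Y/X=x)
    return mean, var

m, v = conditional_mean_var(0.5)
# For this pdf, E(Y/X=x) = (3x + 2)/(3(2x + 1)); at x = 0.5 this is 7/12 ~ 0.583,
# and var(Y/X=0.5) = 11/144 ~ 0.0764.
print(m, v)
```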
Example:
Consider the discrete random variables X and Y discussed in an earlier example. The joint probability mass function of the random variables is tabulated below. Find the conditional expectation E(Y/X = 2).

    y            0       1      p_X(x)
    x = 0       0.25    0.14     0.39
    x = 1       0.10    0.35     0.45
    x = 2       0.15    0.01     0.16
    p_Y(y)      0.50    0.50
The conditional probability mass function is given by
$$p_{Y/X}(y/2) = p_{X,Y}(2,y)/p_X(2)$$
so that
$$p_{Y/X}(0/2) = \frac{p_{X,Y}(2,0)}{p_X(2)} = \frac{0.15}{0.16} = 15/16$$
and
$$p_{Y/X}(1/2) = \frac{p_{X,Y}(2,1)}{p_X(2)} = \frac{0.01}{0.16} = 1/16$$
Therefore
$$E(Y/X=2) = 0 \times p_{Y/X}(0/2) + 1 \times p_{Y/X}(1/2) = 1/16$$
Similarly, we can show that
$$p_{Y/X}(0/1) = 2/9 \quad\text{and}\quad p_{Y/X}(1/1) = 7/9, \quad\text{so that}\quad E(Y/X=1) = 7/9$$
$$p_{Y/X}(0/0) = 25/39 \quad\text{and}\quad p_{Y/X}(1/0) = 14/39, \quad\text{so that}\quad E(Y/X=0) = 14/39$$
We also note that
$$EX = 0 \times p_X(0) + 1 \times p_X(1) + 2 \times p_X(2) = 0.77$$
and
$$EY = 0 \times p_Y(0) + 1 \times p_Y(1) = 0.5$$
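These numbers are easy to verify numerically. The following sketch, assuming NumPy is available, recomputes the conditional pmfs and conditional means directly from the tabulated joint pmf.

```python
import numpy as np

# Verification of the worked example: the joint pmf p_{X,Y}(x, y) is taken from
# the table; rows index x = 0, 1, 2 and columns index y = 0, 1.
p_xy = np.array([[0.25, 0.14],
                 [0.10, 0.35],
                 [0.15, 0.01]])

p_x = p_xy.sum(axis=1)                    # marginal pmf of X: [0.39, 0.45, 0.16]
p_y = p_xy.sum(axis=0)                    # marginal pmf of Y: [0.50, 0.50]
x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])

for x in x_vals:
    p_cond = p_xy[x] / p_x[x]             # conditional pmf p_{Y/X}(y/x)
    print(x, p_cond, (y_vals * p_cond).sum())   # E(Y/X=x): 14/39, 7/9, 1/16

print((x_vals * p_x).sum())               # EX = 0.77
print((y_vals * p_y).sum())               # EY = 0.5
```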
Example
Suppose X and Y are jointly uniform random variables with the joint probability density function given by
$$f_{X,Y}(x,y) = \begin{cases} \dfrac{1}{2} & x \ge 0,\ y \ge 0,\ x + y \le 2 \\[1ex] 0 & \text{otherwise} \end{cases}$$
Find E(Y/X = x).

The density equals 1/2 on the triangular region bounded by x = 0, y = 0 and the line x + y = 2. We have
$$f_X(x) = \int_0^{2-x} f_{X,Y}(x,y)\,dy = \int_0^{2-x} \frac{1}{2}\,dy = \frac{1}{2}(2-x), \qquad 0 \le x \le 2$$
so that
$$f_{Y/X}(y/x) = f_{X,Y}(x,y)/f_X(x) = \frac{1}{2-x}, \qquad 0 \le y \le 2-x$$
Therefore
$$E(Y/X=x) = \int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\,dy = \int_0^{2-x} \frac{y}{2-x}\,dy = \frac{2-x}{2}$$
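As a quick check of this result, one can sample the uniform density on the triangle by rejection and estimate the conditional mean empirically. The sketch below (an illustration, not from the notes) conditions on X lying in a thin slice around x = 0.5 and compares the slice average of Y with (2 − x)/2 = 0.75.

```python
import numpy as np

# Monte Carlo check: sample uniformly on the triangle x >= 0, y >= 0, x + y <= 2
# by rejection, then estimate E(Y/X=x) near x = 0.5 and compare with (2 - x)/2.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 2.0, size=(2_000_000, 2))
pts = pts[pts.sum(axis=1) <= 2.0]             # keep points inside the triangle

x0, h = 0.5, 0.01                             # condition on X in a thin slice around x0
y_slice = pts[np.abs(pts[:, 0] - x0) < h, 1]
print(y_slice.mean())                         # ~ (2 - 0.5)/2 = 0.75
```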
Example
Suppose X and Y are jointly Gaussian random variables with the joint probability density function given by
$$f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho_{XY}^2}}\, e^{-\frac{1}{2(1-\rho_{XY}^2)}\left[\frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho_{XY}(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]}$$
Find E(Y/X = x).

We have
$$f_{Y/X}(y/x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{\dfrac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho_{XY}^2}}\, e^{-\frac{1}{2(1-\rho_{XY}^2)}\left[\frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho_{XY}(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]}}{\dfrac{1}{\sqrt{2\pi}\,\sigma_X}\, e^{-\frac{(x-\mu_X)^2}{2\sigma_X^2}}}$$
$$= \frac{1}{\sqrt{2\pi}\,\sigma_Y\sqrt{1-\rho_{XY}^2}}\, e^{-\frac{1}{2\sigma_Y^2(1-\rho_{XY}^2)}\left[y - \mu_Y - \frac{\rho_{XY}\sigma_Y}{\sigma_X}(x-\mu_X)\right]^2}$$
which is a Gaussian density in y. Therefore,
$$E(Y/X=x) = \int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\,dy = \mu_Y + \frac{\rho_{XY}\,\sigma_Y}{\sigma_X}(x - \mu_X)$$
and
$$\operatorname{var}(Y/X=x) = \sigma_Y^2(1 - \rho_{XY}^2)$$
We can similarly show that
$$E(X/Y=y) = \mu_X + \frac{\rho_{XY}\,\sigma_X}{\sigma_Y}(y - \mu_Y)$$
and
$$\operatorname{var}(X/Y=y) = \sigma_X^2(1 - \rho_{XY}^2)$$
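These closed-form conditional moments can be checked by simulation. The sketch below uses assumed parameter values (not from the text) and compares the empirical mean and variance of Y, given that X falls in a thin slice around x₀, with $\mu_Y + \rho_{XY}\sigma_Y/\sigma_X(x_0-\mu_X)$ and $\sigma_Y^2(1-\rho_{XY}^2)$.

```python
import numpy as np

# Monte Carlo check of the conditional mean and variance formulas.
# Parameter values below are assumed for the demonstration only.
rng = np.random.default_rng(1)
mu_x, mu_y, sig_x, sig_y, rho = 1.0, -2.0, 2.0, 0.5, 0.7
cov = [[sig_x**2,            rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y**2]]
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=2_000_000)

x0, h = 2.0, 0.02                            # condition on X in a thin slice around x0
y_slice = xy[np.abs(xy[:, 0] - x0) < h, 1]
print(y_slice.mean())   # ~ mu_y + rho*sig_y/sig_x*(x0 - mu_x) = -1.825
print(y_slice.var())    # ~ sig_y**2 * (1 - rho**2) = 0.1275
```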
Conditional Expectation as a random variable
Note that E(Y/X = x) is a function of x. Using this function, we may define a random variable $\phi(X) = E(Y/X)$. Thus we may consider E(Y/X) as a function of the random variable X, and E(Y/X = x) as the value of E(Y/X) at X = x. In other words, E(Y/X) is a random variable, and E(Y/X = x) is the value taken by E(Y/X) when X = x.
Total expectation theorem
We establish the following results:
$$E[E(Y/X)] = EY \quad\text{and}\quad E[E(X/Y)] = EX$$
Proof:
$$E[E(Y/X)] = \int_{-\infty}^{\infty} E(Y/X=x)\, f_X(x)\,dx$$
$$= \int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\,dy\right) f_X(x)\,dx$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\, f_X(x)\, f_{Y/X}(y/x)\,dy\,dx$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\, f_{X,Y}(x,y)\,dy\,dx$$
$$= \int_{-\infty}^{\infty} y\left(\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\right)dy$$
$$= \int_{-\infty}^{\infty} y\, f_Y(y)\,dy$$
$$= EY$$
Thus $E[E(Y/X)] = EY$, and similarly $E[E(X/Y)] = EX$.
The above results simplify the calculation of the unconditional expectations EX and EY. We can also show that
$$E[E(g(Y)/X)] = E\,g(Y) \quad\text{and}\quad E[E(g(X)/Y)] = E\,g(X)$$
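The theorem is easy to illustrate numerically. Reusing the uniform-triangle example from earlier (purely as an illustration), E(Y/X = x) = (2 − x)/2, so averaging (2 − X)/2 over samples of X should reproduce EY.

```python
import numpy as np

# Numerical illustration of the total expectation theorem using the uniform
# triangle of the earlier example: E(Y/X=x) = (2 - x)/2, so E[E(Y/X)] should equal EY.
rng = np.random.default_rng(2)
pts = rng.uniform(0.0, 2.0, size=(2_000_000, 2))
pts = pts[pts.sum(axis=1) <= 2.0]             # uniform samples on the triangle
x, y = pts[:, 0], pts[:, 1]

print(((2.0 - x) / 2.0).mean())               # E[E(Y/X)] ~ 2/3
print(y.mean())                               # EY        ~ 2/3
```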
Example
For the discrete random variables of the earlier example:

    x             0        1        2
    E(Y/X=x)    14/39     7/9     1/16
    p_X(x)       0.39     0.45     0.16

$$E[E(Y/X)] = p_X(0)E(Y/X=0) + p_X(1)E(Y/X=1) + p_X(2)E(Y/X=2) = 0.14 + 0.35 + 0.01 = 0.5 = EY$$
Bayesian estimation theory and conditional expectation
Consider two random variables X and Y with joint pdf $f_{X,Y}(x,y)$. Suppose Y is observable and some a priori information about X is available, in the sense that some values of X are more likely than others. We can represent this prior information in the form of a prior density function $f_X(x)$. We have to estimate X for a given value Y = y in some optimal sense.
[Block diagram: a random variable X with density $f_X(x)$ produces the observation Y = y through the conditional density $f_{Y/X}(y/x)$.]
The conditional density function $f_{Y/X}(y/x)$ is called the likelihood function in estimation terminology, and
$$f_{X,Y}(x,y) = f_X(x)\, f_{Y/X}(y/x)$$
We also have the Bayes rule
$$f_{X/Y}(x/y) = \frac{f_X(x)\, f_{Y/X}(y/x)}{f_Y(y)}$$
where $f_{X/Y}(x/y)$ is the a posteriori density function.
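On a computer, the Bayes rule above is often applied on a discretised grid. The sketch below assumes a Gaussian prior $f_X(x)$ and a Gaussian likelihood corresponding to Y = X + noise; both are illustrative assumptions, not part of the notes.

```python
import numpy as np

# Bayes rule on a discretised grid. Assumptions: prior f_X(x) is N(0, 1) and the
# likelihood corresponds to Y = X + N(0, 0.25) noise; y_obs is a hypothetical observation.
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]

prior = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)          # f_X(x)
y_obs, sigma_n = 1.5, 0.5
likelihood = np.exp(-0.5 * ((y_obs - x) / sigma_n) ** 2)    # f_{Y/X}(y_obs/x), up to a constant

evidence = np.sum(prior * likelihood) * dx                  # f_Y(y_obs), with the same constant
posterior = prior * likelihood / evidence                   # a posteriori density f_{X/Y}(x/y_obs)

print(np.sum(x * posterior) * dx)    # posterior mean E(X/Y=y_obs) = 1.2 for these numbers
```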
Suppose the optimum estimator $\hat{X}(Y)$ is a function of the random variable Y such that it minimizes the mean-square estimation error $E(\hat{X}(Y) - X)^2$. Such an estimator is known as the minimum mean-square error (MMSE) estimator.
The estimation problem is

Minimize $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (\hat{X}(y) - x)^2\, f_{X,Y}(x,y)\,dx\,dy$ with respect to $\hat{X}(y)$.

This is equivalent to minimizing
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (\hat{X}(y) - x)^2\, f_Y(y)\, f_{X/Y}(x/y)\,dx\,dy = \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} (\hat{X}(y) - x)^2\, f_{X/Y}(x/y)\,dx\right] f_Y(y)\,dy$$
Since $f_Y(y)$ is always positive, the above integral will be minimum if the inner integral is minimum for each y. This results in the problem:

Minimize $\displaystyle\int_{-\infty}^{\infty} (\hat{X}(y) - x)^2\, f_{X/Y}(x/y)\,dx$ with respect to $\hat{X}(y)$.

The minimum is given by
$$\frac{\partial}{\partial \hat{X}(y)}\int_{-\infty}^{\infty} (\hat{X}(y) - x)^2\, f_{X/Y}(x/y)\,dx = 0$$
$$\Rightarrow\ 2\int_{-\infty}^{\infty} (\hat{X}(y) - x)\, f_{X/Y}(x/y)\,dx = 0$$
$$\Rightarrow\ \hat{X}(y)\int_{-\infty}^{\infty} f_{X/Y}(x/y)\,dx = \int_{-\infty}^{\infty} x\, f_{X/Y}(x/y)\,dx = E(X/Y=y)$$
Since $\int_{-\infty}^{\infty} f_{X/Y}(x/y)\,dx = 1$, we get
$$\hat{X}(y) = E(X/Y=y)$$
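This result can be illustrated by simulation: among several candidate estimators, the conditional mean attains the smallest mean-square error. The model below (X standard Gaussian, Y = X plus independent Gaussian noise) is an assumption chosen so that E(X/Y = y) = 0.8 y in closed form.

```python
import numpy as np

# Sketch: the conditional mean has the smallest mean-square error.
# Assumed model: X ~ N(0, 1), Y = X + N(0, 0.25), so E(X/Y=y) = 0.8 * y.
rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, size=1_000_000)
y = x + rng.normal(0.0, 0.5, size=x.size)

def mse(x_hat):
    return np.mean((x_hat - x) ** 2)

print(mse(0.8 * y))              # MMSE estimator E(X/Y): ~ 0.20
print(mse(y))                    # using the raw observation: ~ 0.25
print(mse(np.zeros_like(y)))     # ignoring the data: ~ var(X) = 1.0
```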
Example
Suppose X and Y are the jointly Gaussian random variables considered in the earlier example. We have to estimate X from a single observation Y = y. The MMSE estimator $\hat{X}(y)$ is given by
$$\hat{X}(y) = E(X/Y=y) = \mu_X + \frac{\rho_{XY}\,\sigma_X}{\sigma_Y}(y - \mu_Y)$$
If $\mu_X = 0$ and $\mu_Y = 0$, then
$$\hat{X}(y) = \frac{\rho_{XY}\,\sigma_X}{\sigma_Y}\, y$$
Thus the MMSE estimator $\hat{X}(y)$ for two zero-mean jointly Gaussian random variables is linearly related to the data y. This result plays an important role in the optimal filtering of random signals.
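A short sketch of this linear estimator in action, with assumed parameter values: the coefficient $\rho_{XY}\sigma_X/\sigma_Y$ equals Cov(X, Y)/Var(Y), which can also be recovered from simulated data, and the resulting mean-square error is $\sigma_X^2(1-\rho_{XY}^2)$.

```python
import numpy as np

# Zero-mean jointly Gaussian case with assumed parameters: the MMSE estimator is
# x_hat = (rho * sig_x / sig_y) * y, and its coefficient equals Cov(X, Y) / Var(Y).
rng = np.random.default_rng(4)
sig_x, sig_y, rho = 1.5, 2.0, -0.6
cov = [[sig_x**2,            rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y**2]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)
x, y = xy[:, 0], xy[:, 1]

a_theory = rho * sig_x / sig_y                  # = -0.45
a_data = np.cov(x, y)[0, 1] / np.var(y)         # empirical Cov(X, Y) / Var(Y)
print(a_theory, a_data)
print(np.mean((a_theory * y - x) ** 2))         # MSE = sig_x**2 * (1 - rho**2) = 1.44
```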