Conditional Expectation

Recall that if $X$ and $Y$ are continuous random variables, the conditional density function of $Y$ given $X = x$ is given by

$$ f_{Y/X}(y/x) = \frac{f_{X,Y}(x,y)}{f_X(x)}. $$

If $X$ and $Y$ are discrete random variables, the conditional probability mass function of $Y$ given $X = x$ is given by

$$ p_{Y/X}(y/x) = \frac{p_{X,Y}(x,y)}{p_X(x)}. $$

The conditional expectation of $Y$ given $X = x$ is defined by

$$ \mu_{Y/X=x} = E(Y/X=x) =
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\, dy & \text{if } X \text{ and } Y \text{ are continuous,} \\[2mm]
\displaystyle\sum_{y \in R_Y} y\, p_{Y/X}(y/x) & \text{if } X \text{ and } Y \text{ are discrete.}
\end{cases} $$

The conditional expectation of $Y$ given $X = x$ is also called the conditional mean of $Y$ given $X = x$. Clearly, $\mu_{Y/X=x}$ is the centre of mass of the conditional pdf or the conditional pmf.

Remark

We can similarly define the conditional expectation of $X$ given $Y = y$, denoted by $E(X/Y=y)$. Higher-order conditional moments can be defined in a similar manner. In particular, the conditional variance of $Y$ given $X = x$ is given by

$$ \sigma^2_{Y/X=x} = E[(Y - \mu_{Y/X=x})^2 / X = x]. $$

Example

Consider the discrete random variables $X$ and $Y$ discussed in an earlier example. Their joint probability mass function is tabulated below. Find $E(Y/X=2)$.

         |  y = 0  |  y = 1  |  p_X(x)
  x = 0  |  0.25   |  0.14   |  0.39
  x = 1  |  0.10   |  0.35   |  0.45
  x = 2  |  0.15   |  0.01   |  0.16
  p_Y(y) |  0.50   |  0.50   |

The conditional probability mass function is given by

$$ p_{Y/X}(y/2) = \frac{p_{X,Y}(2,y)}{p_X(2)}, $$

so that

$$ p_{Y/X}(0/2) = \frac{p_{X,Y}(2,0)}{p_X(2)} = \frac{0.15}{0.16} = \frac{15}{16}
\qquad \text{and} \qquad
p_{Y/X}(1/2) = \frac{p_{X,Y}(2,1)}{p_X(2)} = \frac{0.01}{0.16} = \frac{1}{16}. $$

Therefore

$$ E(Y/X=2) = 0 \times p_{Y/X}(0/2) + 1 \times p_{Y/X}(1/2) = \frac{1}{16}. $$

Similarly, we can show that $p_{Y/X}(0/1) = 2/9$ and $p_{Y/X}(1/1) = 7/9$, so that $E(Y/X=1) = 7/9$; and $p_{Y/X}(0/0) = 25/39$ and $p_{Y/X}(1/0) = 14/39$, so that $E(Y/X=0) = 14/39$.

We also note that

$$ EX = 0 \times p_X(0) + 1 \times p_X(1) + 2 \times p_X(2) = 0.77
\qquad \text{and} \qquad
EY = 0 \times p_Y(0) + 1 \times p_Y(1) = 0.5. $$

Example

Suppose $X$ and $Y$ are jointly uniform random variables with the joint probability density function

$$ f_{X,Y}(x,y) =
\begin{cases}
\dfrac{1}{2} & x \ge 0,\; y \ge 0,\; x + y \le 2, \\
0 & \text{otherwise.}
\end{cases} $$

Find $E(Y/X=x)$.

The support of $f_{X,Y}(x,y)$ is the triangle bounded by $x = 0$, $y = 0$ and $x + y = 2$, on which $f_{X,Y}(x,y) = \tfrac{1}{2}$. We have

$$ f_X(x) = \int_0^{2-x} f_{X,Y}(x,y)\, dy = \int_0^{2-x} \frac{1}{2}\, dy = \frac{2-x}{2}, \qquad 0 \le x \le 2, $$

so that

$$ f_{Y/X}(y/x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{1}{2-x}, \qquad 0 \le y \le 2-x. $$

Therefore

$$ E(Y/X=x) = \int_0^{2-x} y\, f_{Y/X}(y/x)\, dy = \int_0^{2-x} \frac{y}{2-x}\, dy = \frac{2-x}{2}. $$

Example

Suppose $X$ and $Y$ are jointly Gaussian random variables with the joint probability density function

$$ f_{X,Y}(x,y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho_{XY}^2}}
\exp\left\{ -\frac{1}{2(1-\rho_{XY}^2)}
\left[ \frac{(x-\mu_X)^2}{\sigma_X^2}
- \frac{2\rho_{XY}(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y}
+ \frac{(y-\mu_Y)^2}{\sigma_Y^2} \right] \right\}. $$

Find $E(Y/X=x)$.

We have

$$ f_{Y/X}(y/x) = \frac{f_{X,Y}(x,y)}{f_X(x)},
\qquad \text{where} \qquad
f_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}}\, e^{-\frac{(x-\mu_X)^2}{2\sigma_X^2}}. $$

Dividing and completing the square in the exponent gives

$$ f_{Y/X}(y/x) = \frac{1}{\sqrt{2\pi \sigma_Y^2 (1-\rho_{XY}^2)}}
\exp\left\{ -\frac{\left[ y - \mu_Y - \rho_{XY}\dfrac{\sigma_Y}{\sigma_X}(x-\mu_X) \right]^2}{2\sigma_Y^2 (1-\rho_{XY}^2)} \right\}, $$

which is a Gaussian density. Therefore

$$ E(Y/X=x) = \mu_Y + \rho_{XY}\frac{\sigma_Y}{\sigma_X}(x-\mu_X)
\qquad \text{and} \qquad
\operatorname{var}(Y/X=x) = \sigma_Y^2 (1-\rho_{XY}^2). $$

We can similarly show that

$$ E(X/Y=y) = \mu_X + \rho_{XY}\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)
\qquad \text{and} \qquad
\operatorname{var}(X/Y=y) = \sigma_X^2 (1-\rho_{XY}^2). $$
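As a quick numerical check of the discrete example above, the following sketch recomputes the conditional pmfs and conditional means directly from the joint pmf table. This is a minimal sketch assuming NumPy is available; the variable names (p_xy, cond_mean, and so on) are illustrative and not part of the original example.

```python
import numpy as np

# Joint pmf p_{X,Y}(x, y) from the discrete example:
# rows index x = 0, 1, 2 and columns index y = 0, 1.
p_xy = np.array([[0.25, 0.14],
                 [0.10, 0.35],
                 [0.15, 0.01]])

p_x = p_xy.sum(axis=1)          # marginal pmf of X: [0.39, 0.45, 0.16]
p_y = p_xy.sum(axis=0)          # marginal pmf of Y: [0.5, 0.5]

x_vals = np.array([0, 1, 2])
y_vals = np.array([0, 1])

# Conditional pmf p_{Y/X}(y/x) for each x, and conditional mean E(Y/X = x)
p_y_given_x = p_xy / p_x[:, None]
cond_mean = p_y_given_x @ y_vals

print(cond_mean)                    # [14/39, 7/9, 1/16] ~ [0.359, 0.778, 0.0625]
print(x_vals @ p_x, y_vals @ p_y)   # EX = 0.77, EY = 0.5
```

The printed conditional means agree with the values $14/39$, $7/9$ and $1/16$ computed above.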
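The jointly uniform example can be checked in the same spirit by approximating the integrals numerically. The sketch below, again assuming NumPy, uses a simple Riemann-sum approximation; the helper name cond_mean_y_given_x and the grid size are illustrative choices, not part of the text.

```python
import numpy as np

# Numerical check of E(Y/X = x) = (2 - x)/2 for the jointly uniform example,
# where f_{X,Y}(x, y) = 1/2 on the triangle x >= 0, y >= 0, x + y <= 2.
def cond_mean_y_given_x(x, n=200_000):
    y = np.linspace(0.0, 2.0 - x, n)              # support of Y given X = x
    f_xy = np.full_like(y, 0.5)                   # joint density on the triangle
    f_x = f_xy.mean() * (2.0 - x)                 # f_X(x) = (2 - x)/2
    f_y_given_x = f_xy / f_x                      # f_{Y/X}(y/x) = 1/(2 - x)
    return (y * f_y_given_x).mean() * (2.0 - x)   # E(Y/X = x)

for x in [0.0, 0.5, 1.0, 1.5]:
    print(x, cond_mean_y_given_x(x), (2.0 - x) / 2.0)
```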
Conditional Expectation as a Random Variable

Note that $E(Y/X=x)$ is a function of $x$. Using this function, we may define a random variable $\phi(X) = E(Y/X)$. Thus we may consider $E(Y/X)$ as a function of the random variable $X$, and $E(Y/X=x)$ as the value of $E(Y/X)$ at $X = x$. In other words, $E(Y/X)$ is a random variable, and $E(Y/X=x)$ is the value of $E(Y/X)$ at $X = x$.

Total Expectation Theorem

We establish the following results:

$$ E[E(Y/X)] = EY \qquad \text{and} \qquad E[E(X/Y)] = EX. $$

Proof:

$$
\begin{aligned}
E[E(Y/X)] &= \int_{-\infty}^{\infty} E(Y/X=x)\, f_X(x)\, dx \\
&= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} y\, f_{Y/X}(y/x)\, dy \right) f_X(x)\, dx \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f_X(x)\, f_{Y/X}(y/x)\, dy\, dx \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f_{X,Y}(x,y)\, dy\, dx \\
&= \int_{-\infty}^{\infty} y \left( \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx \right) dy \\
&= \int_{-\infty}^{\infty} y\, f_Y(y)\, dy = EY.
\end{aligned}
$$

Thus $E[E(Y/X)] = EY$, and similarly $E[E(X/Y)] = EX$.

The above results simplify the calculation of the unconditional expectations $EX$ and $EY$. We can also show that $E[E(g(Y)/X)] = E\,g(Y)$ and $E[E(g(X)/Y)] = E\,g(X)$.

Example

For the discrete random variables of the earlier example,

      x      |   0    |   1   |   2
  E(Y/X = x) | 14/39  |  7/9  | 1/16
  p_X(x)     |  0.39  | 0.45  | 0.16

so that

$$ E[E(Y/X)] = p_X(0)\,E(Y/X=0) + p_X(1)\,E(Y/X=1) + p_X(2)\,E(Y/X=2)
= 0.39 \times \frac{14}{39} + 0.45 \times \frac{7}{9} + 0.16 \times \frac{1}{16} = 0.5 = EY. $$

Bayesian Estimation Theory and Conditional Expectation

Consider two random variables $X$ and $Y$ with joint pdf $f_{X,Y}(x,y)$. Suppose $Y$ is observable and some a priori information about $X$ is available, in the sense that some values of $X$ are more likely than others. We can represent this prior information in the form of a prior density function $f_X(x)$. We have to estimate $X$ for a given observation $Y = y$ in some optimal sense.

[Block diagram: the random variable $X$ with density $f_X(x)$ produces the observation $Y = y$ through the conditional density $f_{Y/X}(y/x)$.]

The conditional density function $f_{Y/X}(y/x)$ is called the likelihood function in estimation terminology, and

$$ f_{X,Y}(x,y) = f_X(x)\, f_{Y/X}(y/x). $$

Also, we have the Bayes rule

$$ f_{X/Y}(x/y) = \frac{f_X(x)\, f_{Y/X}(y/x)}{f_Y(y)}, $$

where $f_{X/Y}(x/y)$ is the a posteriori density function.

Suppose the optimum estimator $\hat{X}(Y)$ is a function of the random variable $Y$ that minimizes the mean-square estimation error $E(\hat{X}(Y) - X)^2$. Such an estimator is known as the minimum mean-square error (MMSE) estimator. The estimation problem is:

Minimize $\displaystyle \int\!\!\int (\hat{X}(y) - x)^2 f_{X,Y}(x,y)\, dx\, dy$ with respect to $\hat{X}(y)$.

This is equivalent to minimizing

$$ \int\!\!\int (\hat{X}(y) - x)^2 f_Y(y)\, f_{X/Y}(x/y)\, dx\, dy
= \int \left( \int (\hat{X}(y) - x)^2 f_{X/Y}(x/y)\, dx \right) f_Y(y)\, dy. $$

Since $f_Y(y)$ is always positive, the above integral will be minimum if the inner integral is minimum for each $y$. This results in the problem:

Minimize $\displaystyle \int (\hat{X}(y) - x)^2 f_{X/Y}(x/y)\, dx$ with respect to $\hat{X}(y)$.

The minimum is given by

$$
\begin{aligned}
\frac{\partial}{\partial \hat{X}(y)} \int (\hat{X}(y) - x)^2 f_{X/Y}(x/y)\, dx &= 0 \\
\Rightarrow \quad 2 \int (\hat{X}(y) - x)\, f_{X/Y}(x/y)\, dx &= 0 \\
\Rightarrow \quad \hat{X}(y) \int f_{X/Y}(x/y)\, dx &= \int x\, f_{X/Y}(x/y)\, dx = E(X/Y=y).
\end{aligned}
$$

Since $\int f_{X/Y}(x/y)\, dx = 1$, we obtain

$$ \hat{X}(y) = E(X/Y=y). $$

Example

Suppose $X$ and $Y$ are the jointly Gaussian random variables considered in the earlier example. We have to estimate $X$ from a single observation $Y = y$. The MMSE estimator $\hat{X}(y)$ is given by

$$ \hat{X}(y) = E(X/Y=y) = \mu_X + \rho_{XY}\frac{\sigma_X}{\sigma_Y}(y - \mu_Y). $$

If $\mu_X = 0$ and $\mu_Y = 0$, then

$$ \hat{X}(y) = \rho_{XY}\frac{\sigma_X}{\sigma_Y}\, y. $$

Thus the MMSE estimator $\hat{X}(y)$ for two zero-mean jointly Gaussian random variables is linearly related to the data $y$. This result plays an important role in the optimal filtering of random signals.
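The linear form of the MMSE estimator for zero-mean jointly Gaussian random variables can be illustrated with a small Monte Carlo experiment. The sketch below assumes NumPy; the values of sigma_x, sigma_y and rho are arbitrary illustrative choices, not values taken from the text. It checks that the rule $\hat{X}(y) = \rho_{XY}(\sigma_X/\sigma_Y)\,y$ attains mean-square error $\sigma_X^2(1-\rho_{XY}^2)$ and that other linear rules do no better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-mean jointly Gaussian X, Y (illustrative parameter values)
sigma_x, sigma_y, rho = 1.0, 2.0, 0.6
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T

# MMSE estimate of X from Y: xhat(y) = rho * (sigma_x / sigma_y) * y
xhat = rho * (sigma_x / sigma_y) * y

# Its mean-square error should approach sigma_x^2 * (1 - rho^2)
print(np.mean((x - xhat) ** 2), sigma_x**2 * (1 - rho**2))

# Compare against a few other linear rules a*y: none should do better
for a in [0.0, 0.15, rho * sigma_x / sigma_y, 0.45]:
    print(a, np.mean((x - a * y) ** 2))
```

With these parameters the optimal slope is $\rho_{XY}\sigma_X/\sigma_Y = 0.3$, and the empirical mean-square error is close to $\sigma_X^2(1-\rho_{XY}^2) = 0.64$.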