Expected Values
Arthur White∗
2nd November 2015

∗ Based extensively on material previously taught by Eamonn Mullins.
Consider a random variable X which can take values x_1, ..., x_I. The range of possible
values can be infinite. Associated with each value is a probability p_1, ..., p_I, where
p_i = P(X = x_i) for all i. The expected value of X, E[X], is defined to be:
\[
E[X] = \sum_{i=1}^{I} x_i p_i.
\]
If g(X) is a function of X, then the expected value of g(X) is given by:
\[
E[g(X)] = \sum_{i=1}^{I} g(x_i) p_i.
\]
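As a quick illustration, both definitions can be evaluated directly for a small discrete distribution. The following Python sketch assumes a fair six-sided die and g(x) = x^2 purely as an example:

values = [1, 2, 3, 4, 5, 6]   # possible values x_i of X (a fair die, assumed example)
probs = [1/6] * 6             # probabilities p_i

# E[X] = sum_i x_i * p_i
e_x = sum(x * p for x, p in zip(values, probs))

# E[g(X)] = sum_i g(x_i) * p_i, here with g(x) = x**2
e_g = sum((x ** 2) * p for x, p in zip(values, probs))

print(e_x, e_g)  # 3.5 and approximately 15.17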
If a and b are constants, then E[aX + b] = aE[X] + b. This is straightforward to show:
\begin{align*}
E[aX + b] &= \sum_{i=1}^{I} (a x_i + b) p_i \\
&= \sum_{i=1}^{I} a x_i p_i + \sum_{i=1}^{I} b p_i \\
&= a \sum_{i=1}^{I} x_i p_i + b \sum_{i=1}^{I} p_i \\
&= a E[X] + b,
\end{align*}
since \sum_{i=1}^{I} p_i = 1.
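A short numerical check of this identity (a Python sketch; the die distribution and the constants a = 2, b = 3 are illustrative assumptions):

values = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6
a, b = 2.0, 3.0

e_x = sum(x * p for x, p in zip(values, probs))

# Left-hand side: E[aX + b] computed directly from the definition.
lhs = sum((a * x + b) * p for x, p in zip(values, probs))
# Right-hand side: aE[X] + b.
rhs = a * e_x + b

print(lhs, rhs)  # both equal 10.0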
Joint Probability Distributions
Let X and Y denote random variables taking values over the ranges x_1, ..., x_I and y_1, ..., y_J
respectively. The joint probability that X and Y will take particular values x_i and y_j is
denoted by P(X = x_i, Y = y_j) = p_{ij}. Note that \sum_{i=1}^{I}\sum_{j=1}^{J} p_{ij} = 1. The probability that
X takes a particular value, regardless of the value of Y, is referred to as the marginal
probability of X. The set of such probabilities is the marginal distribution. Formally, the
marginal distribution of X is defined to be
\[
p_{i\cdot} = \sum_{j=1}^{J} p_{ij}, \quad \text{for } i = 1, \ldots, I.
\]
Similarly,
\[
p_{\cdot j} = \sum_{i=1}^{I} p_{ij}, \quad \text{for } j = 1, \ldots, J.
\]
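In practice, the marginals are obtained by summing the joint probabilities across rows and columns. The Python sketch below uses a small 2 × 3 joint table chosen purely for illustration:

# Joint probabilities p_ij: rows indexed by i (values of X), columns by j (values of Y).
p = [[0.10, 0.20, 0.10],
     [0.30, 0.20, 0.10]]

# Marginal of X: p_i. = sum over j of p_ij (row sums).
p_x = [sum(row) for row in p]          # [0.4, 0.6]
# Marginal of Y: p_.j = sum over i of p_ij (column sums).
p_y = [sum(col) for col in zip(*p)]    # [0.4, 0.4, 0.2]

print(p_x, p_y, sum(p_x), sum(p_y))    # each marginal sums to 1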
Suppose that g(X, Y) = X + Y. What then is the value of E[g(X, Y)]?
\begin{align*}
E[X + Y] &= \sum_{i=1}^{I}\sum_{j=1}^{J} (x_i + y_j) p_{ij} \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J} x_i p_{ij} + \sum_{i=1}^{I}\sum_{j=1}^{J} y_j p_{ij} \\
&= \sum_{i=1}^{I} x_i p_{i\cdot} + \sum_{j=1}^{J} y_j p_{\cdot j} \\
&= E[X] + E[Y].
\end{align*}
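Note that this additivity does not require independence. A quick check (a Python sketch reusing the assumed joint table from the marginal-distribution example, which is not independent):

x_vals, y_vals = [0, 1], [1, 2, 3]
p = [[0.10, 0.20, 0.10],
     [0.30, 0.20, 0.10]]

p_x = [sum(row) for row in p]
p_y = [sum(col) for col in zip(*p)]

e_x = sum(x * px for x, px in zip(x_vals, p_x))
e_y = sum(y * py for y, py in zip(y_vals, p_y))

# E[X + Y] computed directly from the joint distribution.
e_sum = sum((x_vals[i] + y_vals[j]) * p[i][j]
            for i in range(2) for j in range(3))

print(e_sum, e_x + e_y)  # both 2.4, even though X and Y are dependent here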
Further, suppose that X and Y are independent, and let g(X, Y) = XY. What now is the value of E[g(X, Y)]?
\begin{align*}
E[XY] &= \sum_{i=1}^{I}\sum_{j=1}^{J} x_i y_j p_{ij} \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J} x_i y_j \, p_{i\cdot} p_{\cdot j} \quad \text{(by independence of $X$, $Y$)} \\
&= \left( \sum_{i=1}^{I} x_i p_{i\cdot} \right) \left( \sum_{j=1}^{J} y_j p_{\cdot j} \right) \\
&= \left( \sum_{i=1}^{I} x_i p_{i\cdot} \right) E[Y] \\
&= E[Y] \sum_{i=1}^{I} x_i p_{i\cdot} \\
&= E[X] E[Y],
\end{align*}
when X and Y are independent.
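The product rule relies on independence. As an illustration (a Python sketch; the marginals below are assumed, and the joint table is built in product form so that X and Y are independent by construction):

x_vals, p_x = [0, 1], [0.4, 0.6]
y_vals, p_y = [1, 2, 3], [0.5, 0.3, 0.2]

# Independent joint distribution: p_ij = p_i. * p_.j.
p = [[pi * pj for pj in p_y] for pi in p_x]

e_x = sum(x * px for x, px in zip(x_vals, p_x))
e_y = sum(y * py for y, py in zip(y_vals, p_y))
e_xy = sum(x_vals[i] * y_vals[j] * p[i][j]
           for i in range(len(x_vals)) for j in range(len(y_vals)))

print(e_xy, e_x * e_y)  # both 1.02, because X and Y are independent here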
Variance
Letting E[X] = µ, we define the variance of X, Var[X], to be:
\[
\mathrm{Var}[X] = E[(X - \mu)^2] = \sum_{i=1}^{I} (x_i - \mu)^2 p_i.
\]
An alternative form, which often proves useful, is Var[X] = E[X^2] − E[X]^2. This follows
from the previous definition:
\begin{align*}
\mathrm{Var}[X] &= E[(X - \mu)^2] \\
&= E[(X - \mu)(X - \mu)] \\
&= E[X^2 + \mu^2 - 2X\mu] \\
&= E[X^2] + \mu^2 - 2\mu E[X] \\
&= E[X^2] - \mu^2 \\
&= E[X^2] - E[X]^2.
\end{align*}
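The two forms are easy to compare numerically (a Python sketch, again using the assumed fair-die distribution):

values = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6

mu = sum(x * p for x, p in zip(values, probs))

# Definition: Var[X] = E[(X - mu)^2].
var_def = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
# Alternative form: Var[X] = E[X^2] - E[X]^2.
var_alt = sum(x ** 2 * p for x, p in zip(values, probs)) - mu ** 2

print(var_def, var_alt)  # both approximately 2.9167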
Exercise
Show that Var[aX + b] = a^2 Var[X].
Covariance
If X and Y are two random variables, then their covariance is defined to be:
\[
\mathrm{Cov}[X, Y] = E[(X - E[X])(Y - E[Y])].
\]
Note the relationship to variance; if X = Y, then:
\begin{align*}
\mathrm{Cov}[X, Y] &= E[(X - E[X])(Y - E[Y])] \\
&= E[(X - E[X])(X - E[X])] \\
&= E[(X - E[X])^2] \\
&= \mathrm{Var}[X].
\end{align*}
If X and Y are independent, then Cov[X, Y] = 0. Letting \mu_x = E[X] and \mu_y = E[Y],
\begin{align*}
\mathrm{Cov}[X, Y] &= E[(X - E[X])(Y - E[Y])] \\
&= E[(X - \mu_x)(Y - \mu_y)] \\
&= E[XY - X\mu_y - Y\mu_x + \mu_x\mu_y] \\
&= E[XY] - E[X]\mu_y - E[Y]\mu_x + \mu_x\mu_y \\
&= E[X]E[Y] - E[X]\mu_y - E[Y]\mu_x + \mu_x\mu_y \quad \text{(by independence of $X$, $Y$)} \\
&= \mu_x\mu_y - \mu_x\mu_y - \mu_y\mu_x + \mu_x\mu_y \\
&= 0.
\end{align*}
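This can be checked numerically by computing the covariance directly from its definition (a Python sketch; both joint tables below are assumed examples, one built as a product of its marginals and one dependent):

def covariance(x_vals, y_vals, p):
    # Cov[X, Y] computed directly from the definition E[(X - E[X])(Y - E[Y])].
    p_x = [sum(row) for row in p]
    p_y = [sum(col) for col in zip(*p)]
    mu_x = sum(x * px for x, px in zip(x_vals, p_x))
    mu_y = sum(y * py for y, py in zip(y_vals, p_y))
    return sum((x_vals[i] - mu_x) * (y_vals[j] - mu_y) * p[i][j]
               for i in range(len(x_vals)) for j in range(len(y_vals)))

x_vals, y_vals = [0, 1], [1, 2, 3]

# Independent table: p_ij = p_i. * p_.j, so the covariance should vanish.
indep = [[0.4 * q for q in (0.5, 0.3, 0.2)],
         [0.6 * q for q in (0.5, 0.3, 0.2)]]
# Dependent table reused from earlier; its covariance need not be zero.
dep = [[0.10, 0.20, 0.10],
       [0.30, 0.20, 0.10]]

print(covariance(x_vals, y_vals, indep))  # 0.0 (up to rounding)
print(covariance(x_vals, y_vals, dep))    # -0.08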
Exercise
Show that Cov[X, Y ] = E[XY ] − E[X]E[Y ].
Properties of Variance
If X and Y are random variables, then
\[
\mathrm{Var}[X \pm Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] \pm 2\,\mathrm{Cov}[X, Y].
\]
If X and Y are independent, then this further simplifies to give:
\[
\mathrm{Var}[X \pm Y] = \mathrm{Var}[X] + \mathrm{Var}[Y].
\]
To see why, expand Var[X + Y] using the alternative form of the variance:
\begin{align*}
\mathrm{Var}[X + Y] &= E[(X + Y)^2] - E[X + Y]^2 \\
&= E[X^2 + Y^2 + 2XY] - (E[X] + E[Y])^2 \\
&= E[X^2] + E[Y^2] + 2E[XY] - (E[X]^2 + E[Y]^2 + 2E[X]E[Y]) \\
&= E[X^2] - E[X]^2 + E[Y^2] - E[Y]^2 + 2(E[XY] - E[X]E[Y]) \\
&= \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}[X, Y].
\end{align*}
If X and Y are independent, then Cov[X, Y ] = 0, with the result that:
\[
\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y].
\]
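As a final check, the general identity can be verified directly from a joint distribution (a Python sketch, reusing the assumed dependent joint table from the covariance example):

x_vals, y_vals = [0, 1], [1, 2, 3]
p = [[0.10, 0.20, 0.10],
     [0.30, 0.20, 0.10]]

p_x = [sum(row) for row in p]
p_y = [sum(col) for col in zip(*p)]

e_x = sum(x * px for x, px in zip(x_vals, p_x))
e_y = sum(y * py for y, py in zip(y_vals, p_y))
e_xy = sum(x_vals[i] * y_vals[j] * p[i][j] for i in range(2) for j in range(3))

var_x = sum(x ** 2 * px for x, px in zip(x_vals, p_x)) - e_x ** 2
var_y = sum(y ** 2 * py for y, py in zip(y_vals, p_y)) - e_y ** 2
cov_xy = e_xy - e_x * e_y

# Var[X + Y] computed directly from the joint distribution.
e_sum = sum((x_vals[i] + y_vals[j]) * p[i][j] for i in range(2) for j in range(3))
e_sum_sq = sum((x_vals[i] + y_vals[j]) ** 2 * p[i][j] for i in range(2) for j in range(3))
var_sum = e_sum_sq - e_sum ** 2

print(var_sum, var_x + var_y + 2 * cov_xy)  # both 0.64; the 2*Cov term drops when X and Y are independent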