Lecture 6: Expected Value and Moments
Sta 111 (Colin Rundel)
May 21, 2014

Expected Value

The expected value of a random variable is defined as follows.

Discrete random variable:

    E[X] = \sum_{\text{all } x} x \, P(X = x)

Continuous random variable (with density f):

    E[X] = \int_{\text{all } x} x \, f(x) \, dx
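As a quick numerical check of the discrete definition (an illustrative addition, not from the original slides; the fair-die example is my own), the sum can be evaluated exactly with rational arithmetic:

```python
from fractions import Fraction

def expected_value(pmf):
    """E[X] = sum over all x of x * P(X = x), for a pmf given as {x: P(X = x)}."""
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: P(X = x) = 1/6 for x = 1, ..., 6
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expected_value(die))  # 7/2
```

Using `Fraction` keeps every probability exact, so the answer is the familiar 3.5 with no floating-point noise.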
Expected Value of a Function

The expected value of a function of a random variable is defined as follows.

Discrete random variable:

    E[f(X)] = \sum_{\text{all } x} f(x) \, P(X = x)

Continuous random variable (with density g):

    E[f(X)] = \int_{\text{all } x} f(x) \, g(x) \, dx

Properties of Expected Value

Constants - E(c) = c if c is a constant
Indicators - E(I_A) = P(A) where I_A is an indicator function
Constant factors - E(cX) = c E(X)
Addition - E(X + Y) = E(X) + E(Y)
Multiplication - E(XY) = E(X) E(Y) if X and Y are independent
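The addition and multiplication properties can be verified exactly by enumerating a small joint distribution (an illustrative sketch of my own, assuming two independent fair dice):

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: the joint pmf factors as P(X = x) P(Y = y)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
E = lambda f: sum(f(x, y) * px * py
                  for (x, px), (y, py) in product(pmf.items(), pmf.items()))
EX = sum(x * p for x, p in pmf.items())

print(E(lambda x, y: x + y) == 2 * EX)   # True: E(X + Y) = E(X) + E(Y)
print(E(lambda x, y: x * y) == EX * EX)  # True: E(XY) = E(X)E(Y) under independence
```

Note that the addition property holds for any X and Y; only the product rule needs independence.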
Variance

Another common property of random variables we are interested in is the variance, which measures the expected squared deviation from the mean.

    Var(X) = E[(X - E(X))^2] = E[(X - \mu)^2]

One common simplification:

    Var(X) = E[(X - \mu)^2]
           = E(X^2 - 2\mu X + \mu^2)
           = E(X^2) - 2\mu E(X) + \mu^2
           = E(X^2) - \mu^2

Standard deviation:

    SD(X) = \sqrt{Var(X)}

Properties of Variance

What is Var(aX + b) when a and b are constants? Working from the definition gives us:

    Var(aX) = a^2 Var(X)
    Var(X + c) = Var(X)
    Var(c) = 0
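The shortcut Var(X) = E(X^2) - mu^2 can be checked against the definition exactly (again with the fair die as an illustrative example of my own):

```python
from fractions import Fraction

# Fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())

var_def   = sum((x - mu)**2 * p for x, p in pmf.items())    # E[(X - mu)^2]
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2   # E(X^2) - mu^2
print(var_def, var_short)  # 35/12 35/12
```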
Properties of Variance, cont.

What about Var(X + Y)?

Covariance

What about when X and Y are not independent?

    E(XY) \ne E(X)E(Y)  \Rightarrow  E(XY) - \mu_x \mu_y \ne 0

This quantity is known as covariance, and is, roughly speaking, a generalization of variance to two variables.

    Cov(X, Y) = E[(X - E(X))(Y - E(Y))]
              = E[(X - \mu_x)(Y - \mu_y)]
              = E[XY - X\mu_y - Y\mu_x + \mu_x \mu_y]
              = E(XY) - \mu_x \mu_y
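Both forms of the covariance can be compared on a small dependent pair (an illustrative example of my own: two draws without replacement from {1, 2, 3}, which makes X and Y negatively related):

```python
from fractions import Fraction
from itertools import permutations

# Two draws without replacement from {1, 2, 3}: each ordered pair has prob 1/6
joint = {(x, y): Fraction(1, 6) for x, y in permutations([1, 2, 3], 2)}
E = lambda f: sum(f(x, y) * p for (x, y), p in joint.items())

mx, my = E(lambda x, y: x), E(lambda x, y: y)
cov_def   = E(lambda x, y: (x - mx) * (y - my))  # E[(X - mu_x)(Y - mu_y)]
cov_short = E(lambda x, y: x * y) - mx * my      # E(XY) - mu_x mu_y
print(cov_def, cov_short)  # -1/3 -1/3
```

The negative sign matches intuition: drawing a large value first leaves smaller values for the second draw.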
Properties of Covariance

    Cov(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - \mu_x \mu_y

    Cov(X, Y) = 0 if X and Y are independent
    Cov(X, c) = 0
    Cov(X, X) = Var(X)
    Cov(aX, bY) = ab \, Cov(X, Y)
    Cov(X + a, Y + b) = Cov(X, Y)

Properties of Variance, cont.

A general formula for the variance of a linear combination of two random variables:

    Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab \, Cov(X, Y)

From which we can see that

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
    Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y)
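These two identities can be verified exactly on the same dependent pair used above (two draws without replacement from {1, 2, 3}; the example is my own, not from the slides):

```python
from fractions import Fraction
from itertools import permutations

# Dependent pair: two draws without replacement from {1, 2, 3}
joint = {(x, y): Fraction(1, 6) for x, y in permutations([1, 2, 3], 2)}
E = lambda f: sum(f(x, y) * p for (x, y), p in joint.items())
var = lambda f: E(lambda x, y: f(x, y)**2) - E(f)**2  # Var via E(Z^2) - E(Z)^2

vx, vy = var(lambda x, y: x), var(lambda x, y: y)
cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

print(var(lambda x, y: x + y) == vx + vy + 2 * cov)  # True
print(var(lambda x, y: x - y) == vx + vy - 2 * cov)  # True
```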
Properties of Variance, cont.

A completely general formula for the variance of a linear combination of n random variables:

    Var\left(\sum_{i=1}^n c_i X_i\right) = \sum_{i=1}^n \sum_{j=1}^n Cov(c_i X_i, c_j X_j)
                                         = \sum_{i=1}^n c_i^2 Var(X_i) + \sum_{i=1}^n \sum_{\substack{j=1 \\ j \ne i}}^n c_i c_j Cov(X_i, X_j)

Bernoulli Random Variable

Let X ~ Bern(p); what are E(X) and Var(X)?
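The standard answers are E(X) = p and Var(X) = p(1 - p); a direct check from the two-point pmf (an illustrative sketch of my own, with p = 3/10 chosen arbitrarily):

```python
from fractions import Fraction

# Bern(p): P(X = 1) = p, P(X = 0) = 1 - p
p = Fraction(3, 10)
pmf = {1: p, 0: 1 - p}

EX   = sum(x * q for x, q in pmf.items())            # expect p
VarX = sum(x**2 * q for x, q in pmf.items()) - EX**2  # expect p(1 - p)
print(EX, VarX)  # 3/10 21/100
```

Since X^2 = X for an indicator, E(X^2) = E(X) = p, which is why the variance collapses to p - p^2.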
Binomial Random Variable

Let X ~ Binom(n, p); what are E(X) and Var(X)?

We can redefine X = \sum_{i=1}^n Y_i where Y_1, ..., Y_n ~ Bern(p), and since we are sampling with replacement all Y_i and Y_j are independent.

Hypergeometric Random Variable - E(X)

Let's consider a simple case where we have an urn with m black marbles and N - m white marbles. Let B_i be an indicator variable for the ith draw being black:

    B_i = 1 if the ith draw is black, 0 otherwise

In the case where N = 2 and m = 1, what is P(B_i = 1) for all i?

    \Omega = {BW, WB}
    P(B_1 = 1) = 1/2,  P(B_2 = 1) = 1/2
    P(W_1) = 1/2,  P(W_2) = 1/2
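The sum-of-Bernoullis decomposition predicts E(X) = np and Var(X) = np(1 - p); this can be checked exactly against the binomial pmf (an illustrative check of my own, with n = 7, p = 2/5 chosen arbitrarily):

```python
from fractions import Fraction
from math import comb

# Binom(n, p) pmf: P(X = k) = C(n, k) p^k (1 - p)^(n - k)
n, p = 7, Fraction(2, 5)
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

EX   = sum(k * q for k, q in pmf.items())
VarX = sum(k**2 * q for k, q in pmf.items()) - EX**2
print(EX == n * p, VarX == n * p * (1 - p))  # True True
```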
Hypergeometric Random Variable - E(X) - cont.

What about when N = 3 and m = 1?

    \Omega = {BW_1W_2, BW_2W_1, W_1BW_2, W_2BW_1, W_1W_2B, W_2W_1B}
    P(B_1 = 1) = 1/3,  P(B_2 = 1) = 1/3,  P(B_3 = 1) = 1/3
    P(W_1) = 2/3,  P(W_2) = 2/3,  P(W_3) = 2/3

Proposition: P(B_i = 1) = m/N for all i.

Let X ~ Hypergeo(N, m, n); then X = B_1 + B_2 + ... + B_n.
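The proposition can be verified by brute force for a small urn (an illustrative check of my own, with N = 5 and m = 2 chosen arbitrarily): enumerate every equally likely ordering of the marbles and count how often position i holds a black one.

```python
from fractions import Fraction
from itertools import permutations

# Urn with m black (coded 1) and N - m white (coded 0) marbles
N, m = 5, 2
urn = [1] * m + [0] * (N - m)

orders = list(permutations(range(N)))  # all equally likely draw orders
probs = [Fraction(sum(urn[o[i]] for o in orders), len(orders)) for i in range(N)]
print(probs)  # every entry equals Fraction(2, 5), i.e. m/N, regardless of i
```

Symmetry is doing the work here: every marble is equally likely to land in every position, so the draw index i never matters.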
Hypergeometric Random Variable - E(X) - 2nd way

Let X ~ Hypergeo(N, m, n); what is E(X)?

Hypergeometric Random Variable - Var(X)

Let X ~ Hypergeo(N, m, n); what is Var(X)?

    Var(X) = Var\left(\sum_{i=1}^n B_i\right) = \sum_{i=1}^n Var(B_i) + \sum_{i=1}^n \sum_{\substack{j=1 \\ j \ne i}}^n Cov(B_i, B_j)

Each B_i is Bernoulli with success probability m/N, so

    \sum_{i=1}^n Var(B_i) = \sum_{i=1}^n \frac{m}{N}\left(1 - \frac{m}{N}\right) = \frac{nm(N - m)}{N^2}

Hypergeometric Random Variable - Variance, cont.

    Cov(B_i, B_j) = E(B_i B_j) - E(B_i)E(B_j)
                  = P(B_i = 1 \cap B_j = 1) - P(B_i = 1)P(B_j = 1)
                  = P(B_i = 1)P(B_j = 1 | B_i = 1) - \frac{m}{N} \frac{m}{N}
                  = \frac{m}{N} \frac{m - 1}{N - 1} - \frac{m}{N} \frac{m}{N}
                  = \frac{Nm(m - 1) - m^2(N - 1)}{N^2(N - 1)}
                  = -\frac{m(N - m)}{N^2(N - 1)}

so that

    \sum_{i=1}^n \sum_{\substack{j=1 \\ j \ne i}}^n Cov(B_i, B_j) = -n(n - 1)\frac{m(N - m)}{N^2(N - 1)}

Hypergeometric Random Variable - Variance, cont.

    Var(X) = \frac{nm(N - m)}{N^2} - n(n - 1)\frac{m(N - m)}{N^2(N - 1)}
           = \frac{nm(N - m)(N - 1) - nm(n - 1)(N - m)}{N^2(N - 1)}
           = \frac{nm(N - m)}{N^2} \cdot \frac{N - n}{N - 1}
           = n \frac{m}{N}\left(1 - \frac{m}{N}\right)\frac{N - n}{N - 1}
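Both closed forms, E(X) = n m/N and the variance with its finite-population correction (N - n)/(N - 1), can be checked exactly against the hypergeometric pmf (an illustrative check of my own; N = 12, m = 5, n = 4 are arbitrary):

```python
from fractions import Fraction
from math import comb

# Hypergeo(N, m, n): P(X = k) = C(m, k) C(N - m, n - k) / C(N, n)
N, m, n = 12, 5, 4
pmf = {k: Fraction(comb(m, k) * comb(N - m, n - k), comb(N, n))
       for k in range(max(0, n - (N - m)), min(m, n) + 1)}

EX   = sum(k * q for k, q in pmf.items())
VarX = sum(k**2 * q for k, q in pmf.items()) - EX**2
mN = Fraction(m, N)
print(EX == n * mN)                                        # True
print(VarX == n * mN * (1 - mN) * Fraction(N - n, N - 1))  # True
```

Setting n = 1 makes the correction factor 1, recovering the plain Bernoulli variance, while n = N forces Var(X) = 0, as it must when the whole urn is drawn.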
Poisson Random Variable - E(X)

Let X ~ Poisson(\lambda); what is E(X)?

Poisson Random Variable - Var(X)

Let X ~ Poisson(\lambda); what is Var(X)?
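The answers E(X) = lambda and Var(X) = lambda can be checked numerically by summing the pmf far enough into the tail that the truncation error is negligible (an illustrative check of my own; lambda = 3 is arbitrary):

```python
from math import exp, factorial

# Pois(lam) pmf: P(X = k) = e^{-lam} lam^k / k!; truncate where the tail is negligible
lam = 3.0
terms = [(k, exp(-lam) * lam**k / factorial(k)) for k in range(60)]

EX   = sum(k * p for k, p in terms)
VarX = sum(k**2 * p for k, p in terms) - EX**2
print(round(EX, 10), round(VarX, 10))  # 3.0 3.0
```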
Moments

St. Petersburg Lottery

We start with $1 on the table and a coin. At each step: toss the coin; if it shows heads, you take the money. If it shows tails, I double the money on the table.

Let X be the amount you win; what is E(X)?

Moments

Some definitions:

    Raw moment:      \mu'_n = E(X^n)
    Central moment:  \mu_n = E[(X - \mu)^n]
    Normalized / standardized moment:  \mu_n / \sigma^n
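The lottery pays 2^k dollars with probability 2^-(k+1) (heads on toss k + 1 after k tails), so every term of the expectation sum contributes exactly 1/2 and the partial sums grow without bound. A small exact computation makes the divergence concrete (an illustrative sketch of my own):

```python
from fractions import Fraction

# Payoff 2^k with probability 2^-(k+1): each term of E(X) is exactly 1/2
def partial_expectation(n_terms):
    """Sum of x * P(X = x) over the first n_terms outcomes of the lottery."""
    return sum(Fraction(2**k, 2**(k + 1)) for k in range(n_terms))

for n in (10, 100, 1000):
    print(n, partial_expectation(n))  # 10 5, 100 50, 1000 500
```

Since the partial sums are n/2, E(X) is infinite, which is exactly the St. Petersburg paradox: no finite entry fee makes the game fair.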
Common Moments of Interest

    Zeroth moment:  \mu'_0 = \mu_0 = 1
    First moment:   \mu'_1 = E(X) = \mu;   \mu_1 = E(X - \mu) = 0
    Second moment:  \mu_2 = E[(X - \mu)^2] = Var(X);   \mu'_2 - (\mu'_1)^2 = Var(X)
    Third moment:   Skewness(X) = \mu_3 / \sigma^3
    Fourth moment:  Kurtosis(X) = \mu_4 / \sigma^4;   Ex. Kurtosis(X) = \mu_4 / \sigma^4 - 3

Note that some moments do not exist, which is the case when E(X^n) does not converge.

Third and Fourth Central Moments

    \mu_3 = E[(X - \mu)^3] = E(X^3 - 3X^2\mu + 3X\mu^2 - \mu^3)
          = E(X^3) - 3\mu E(X^2) + 3\mu^3 - \mu^3
          = E(X^3) - 3\mu\sigma^2 - \mu^3
          = \mu'_3 - 3\mu\sigma^2 - \mu^3

    \mu_4 = E[(X - \mu)^4] = E(X^4 - 4X^3\mu + 6X^2\mu^2 - 4X\mu^3 + \mu^4)
          = E(X^4) - 4\mu E(X^3) + 6\mu^2 E(X^2) - 4\mu^4 + \mu^4
          = E(X^4) - 4\mu E(X^3) + 6\mu^2(\sigma^2 + \mu^2) - 3\mu^4
          = \mu'_4 - 4\mu\mu'_3 + 6\mu^2\sigma^2 + 3\mu^4
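These definitions are easy to compute exactly for a discrete pmf (an illustrative example of my own, again using the fair die; the symmetric pmf forces the skewness to zero, and the flat shape gives a negative excess kurtosis):

```python
from fractions import Fraction

# Fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
central = lambda n: sum((x - mu)**n * p for x, p in pmf.items())  # mu_n

var = central(2)
skew_sq = central(3)**2 / var**3   # squared skewness (avoids the square root)
ex_kurt = central(4) / var**2 - 3  # mu_4 / sigma^4 - 3
print(skew_sq, ex_kurt)  # 0 -222/175
```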
Moment Generating Function

The moment generating function of a discrete random variable X is defined for all real values of t by

    M_X(t) = E[e^{tX}] = \sum_x e^{tx} P(X = x)

This is called the moment generating function because we can obtain the moments of X by successively differentiating M_X(t) with respect to t and then evaluating at t = 0.

    M_X(0) = E[e^0] = 1 = \mu'_0

    M'_X(t) = \frac{d}{dt} E[e^{tX}] = E\left[\frac{d}{dt} e^{tX}\right] = E[X e^{tX}]
    M'_X(0) = E[X e^0] = E[X] = \mu'_1

    M''_X(t) = \frac{d}{dt} M'_X(t) = \frac{d}{dt} E[X e^{tX}] = E\left[\frac{d}{dt}\left(X e^{tX}\right)\right] = E[X^2 e^{tX}]
    M''_X(0) = E[X^2 e^0] = E[X^2] = \mu'_2

Moment Generating Function - Poisson

Let X ~ Pois(\lambda); then

    M_X(t) = \sum_{k=0}^\infty e^{tk} \frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda} \sum_{k=0}^\infty \frac{(\lambda e^t)^k}{k!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}
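The closed form can be checked against the defining sum E[e^{tX}] numerically (an illustrative check of my own; lambda = 2 and t = 0.3 are arbitrary, and the series is truncated where the tail is negligible):

```python
from math import exp, factorial

# Compare M_X(t) = exp(lam (e^t - 1)) with the truncated series E[e^{tX}]
lam, t = 2.0, 0.3
series = sum(exp(t * k) * exp(-lam) * lam**k / factorial(k) for k in range(60))
closed = exp(lam * (exp(t) - 1))
print(abs(series - closed) < 1e-12)  # True
```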
Moment Generating Function - Poisson Skewness

Let X ~ Pois(\lambda); using M_X(t), what is Skewness(X)?

Moment Generating Function - Poisson Kurtosis

Let X ~ Pois(\lambda); using M_X(t), what is Kurtosis(X)?
Moment Generating Function - Binomial

Let X ~ Binom(n, p); then

    M_X(t) = E[e^{tX}] = \sum_{k=0}^n e^{tk} \binom{n}{k} p^k (1 - p)^{n-k} = \sum_{k=0}^n \binom{n}{k} (p e^t)^k (1 - p)^{n-k}
           = [p e^t + (1 - p)]^n

    M'_X(t) = n p e^t (p e^t + 1 - p)^{n-1}
    M'_X(0) = \mu'_1 = np = E(X)

    M''_X(t) = n(n - 1)(p e^t)^2 (p e^t + 1 - p)^{n-2} + n p e^t (p e^t + 1 - p)^{n-1}
    M''_X(0) = \mu'_2 = n(n - 1)p^2 + np = E(X^2)

    Var(X) = \mu'_2 - (\mu'_1)^2 = n(n - 1)p^2 + np - n^2 p^2
           = np((n - 1)p + 1 - np) = np(1 - p)

Moment Generating Function - Normal

Let Z ~ N(0, 1) and X ~ N(\mu, \sigma^2) where X = \mu + \sigma Z; then

    M_Z(t) = E[e^{tZ}] = \int_{-\infty}^{\infty} e^{tx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx
           = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{x^2 - 2tx}{2}} \, dx
           = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x - t)^2}{2} + \frac{t^2}{2}} \, dx
           = e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{(x - t)^2}{2}} \, dx
           = e^{t^2/2}

    M_X(t) = E[e^{tX}] = E[e^{t(\mu + \sigma Z)}] = E[e^{t\mu} e^{t\sigma Z}]
           = e^{t\mu} E[e^{t\sigma Z}] = e^{t\mu} M_Z(t\sigma)
           = e^{t\mu} e^{t^2\sigma^2/2} = \exp\left(t\mu + \frac{t^2\sigma^2}{2}\right)
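The moment-extraction machinery can be imitated numerically: finite differences of the binomial MGF at t = 0 recover E(X) and E(X^2) (an illustrative sketch of my own; n = 10, p = 0.3, and the step size h are arbitrary choices):

```python
from math import exp

# Binomial MGF M(t) = (p e^t + 1 - p)^n; central differences approximate M'(0), M''(0)
n, p, h = 10, 0.3, 1e-4
M = lambda t: (p * exp(t) + 1 - p)**n

d1 = (M(h) - M(-h)) / (2 * h)          # ~ M'(0)  = E(X)   = np
d2 = (M(h) - 2 * M(0) + M(-h)) / h**2  # ~ M''(0) = E(X^2) = n(n-1)p^2 + np
print(abs(d1 - n * p) < 1e-4)                    # True
print(abs(d2 - d1**2 - n * p * (1 - p)) < 1e-3)  # True
```

The loose tolerances reflect the truncation and round-off error inherent in numerical differentiation; the algebraic derivatives above are, of course, exact.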
Moment Generating Function - Normal, cont.

    M'_X(t) = \exp\left(\mu t + \frac{t^2\sigma^2}{2}\right)(\mu + t\sigma^2)
    M'_X(0) = \mu'_1 = \mu

    M''_X(t) = \exp\left(\mu t + \frac{t^2\sigma^2}{2}\right)\sigma^2 + \exp\left(\mu t + \frac{t^2\sigma^2}{2}\right)(\mu + t\sigma^2)^2
    M''_X(0) = \mu'_2 = \mu^2 + \sigma^2

Moment Generating Function - Normal, cont.

Differentiating twice more and evaluating at t = 0 gives

    \mu'_3 = M'''_X(0) = \mu^3 + 3\mu\sigma^2
    \mu'_4 = M''''_X(0) = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4

so that

    Skewness(X) = \frac{\mu_3}{\sigma^3} = \frac{\mu'_3 - 3\mu\sigma^2 - \mu^3}{\sigma^3} = \frac{\mu^3 + 3\mu\sigma^2 - 3\mu\sigma^2 - \mu^3}{\sigma^3} = 0

    Kurtosis(X) = \frac{\mu_4}{\sigma^4} = \frac{\mu'_4 - 4\mu\mu'_3 + 6\mu^2\sigma^2 + 3\mu^4}{\sigma^4}
                = \frac{(\mu^4 + 6\mu^2\sigma^2 + 3\sigma^4) - 4\mu(\mu^3 + 3\mu\sigma^2) + 6\mu^2\sigma^2 + 3\mu^4}{\sigma^4}
                = \frac{3\sigma^4}{\sigma^4} = 3

    Ex. Kurtosis(X) = 0
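The raw-moment formulas mu'_3 = mu^3 + 3 mu sigma^2 and mu'_4 = mu^4 + 6 mu^2 sigma^2 + 3 sigma^4 can be confirmed by integrating against the normal density on a wide, fine grid (an illustrative numerical check of my own; mu = 1.5 and sigma = 2 are arbitrary):

```python
from math import exp, pi, sqrt

# N(mu, sigma^2) density, integrated by Riemann sum over [mu - 10s, mu + 10s]
mu, s = 1.5, 2.0
f = lambda x: exp(-(x - mu)**2 / (2 * s**2)) / (s * sqrt(2 * pi))

step = 0.001
xs = [mu - 10 * s + i * step for i in range(int(20 * s / step) + 1)]
raw = lambda n: sum(x**n * f(x) for x in xs) * step  # ~ E(X^n)

print(abs(raw(3) - (mu**3 + 3 * mu * s**2)) < 1e-6)                 # True
print(abs(raw(4) - (mu**4 + 6 * mu**2 * s**2 + 3 * s**4)) < 1e-6)   # True
```

The grid extends 10 standard deviations on each side, so the neglected tails are far below the stated tolerance.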