
Topics in Mathematics 201-BNJ-05
Vincent Carrier
Joint Probability Distributions
The joint probability distribution p(x, y) of two discrete random variables X and Y
is defined as
p(x, y) = P(X = x, Y = y)
for x ∈ DX, y ∈ DY.
Being a probability distribution, it must satisfy
1) 0 ≤ p(x, y) ≤ 1 for x ∈ DX, y ∈ DY,
2) Σ_{x∈DX} Σ_{y∈DY} p(x, y) = 1.
Example: An individual is picked at random in some city.
X : number of university diplomas,
Y : number of cars owned.
p(x, y)   y = 0   y = 1   y = 2   Total
x = 0      0.10    0.26    0.19    0.55
x = 1      0.04    0.21    0.10    0.35
x = 2      0.06    0.03    0.01    0.10
Total      0.20    0.50    0.30    1
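As a quick numerical check of properties 1) and 2), the table can be stored as an array; a minimal sketch in Python (the array P, with rows indexed by x and columns by y, is our own encoding of the table, not part of the course material):

    import numpy as np

    # Joint distribution p(x, y): rows are x = 0, 1, 2; columns are y = 0, 1, 2.
    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])

    # Property 1): every entry lies in [0, 1].
    assert np.all((0 <= P) & (P <= 1))
    # Property 2): the entries sum to 1.
    assert np.isclose(P.sum(), 1.0)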
The marginal probability distributions of X and Y are defined as
pX(x) = P(X = x) = Σ_{y∈DY} p(x, y)
and
pY(y) = P(Y = y) = Σ_{x∈DX} p(x, y).
Example: Find pX(x), pY(y), µX, and µY in the example above.
X:
x        0     1     2   Total
pX(x)  0.55  0.35  0.10    1
µX = 0(0.55) + 1(0.35) + 2(0.1) = 0.55
Y:
y        0    1    2   Total
pY(y)  0.2  0.5  0.3     1
µY = 0(0.2) + 1(0.5) + 2(0.3) = 1.1
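Both marginals and both means fall out of the joint table as row and column sums; a minimal sketch, reusing the P array from the snippet above:

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    x_vals = np.array([0, 1, 2])
    y_vals = np.array([0, 1, 2])

    pX = P.sum(axis=1)            # row sums: [0.55, 0.35, 0.10]
    pY = P.sum(axis=0)            # column sums: [0.20, 0.50, 0.30]
    mu_X = (x_vals * pX).sum()    # 0.55
    mu_Y = (y_vals * pY).sum()    # 1.1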
Two random variables X and Y are independent if
p(x, y) = pX(x) pY(y)
for all x ∈ DX, y ∈ DY.
Example: In the example above, X and Y are not independent since
p(2, 0) = 0.06 ≠ 0.02 = pX(2) pY(0).
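Independence would require the whole joint table to equal the outer product of its marginals, which fails here; a sketch of the check, again with our P array:

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    pX = P.sum(axis=1)
    pY = P.sum(axis=0)

    # X and Y are independent iff p(x, y) = pX(x) pY(y) in every cell.
    print(np.allclose(P, np.outer(pX, pY)))   # False
    print(P[2, 0], pX[2] * pY[0])             # 0.06 versus 0.02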
The conditional probability distributions of X and Y are defined as
pX(x|y) = p(x, y) / pY(y)   for x ∈ DX,
pY(y|x) = p(x, y) / pX(x)   for y ∈ DY.
Example:
x              0     1     2   Total
pX(x|y = 1)  0.52  0.42  0.06    1

y              0    1    2   Total
pY(y|x = 2)  0.6  0.3  0.1     1
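Each conditional row is a column or row of the joint table rescaled by the corresponding marginal; a minimal sketch:

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    pX = P.sum(axis=1)
    pY = P.sum(axis=0)

    pX_given_y1 = P[:, 1] / pY[1]   # column y = 1 rescaled: [0.52, 0.42, 0.06]
    pY_given_x2 = P[2, :] / pX[2]   # row x = 2 rescaled: [0.6, 0.3, 0.1]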
When X and Y are not independent random variables, their dependence can be measured
by the covariance between them:
cov(X, Y) = E((X − µX)(Y − µY)).
A positive covariance indicates that X and Y vary in the same direction, a negative one
that they vary in opposite directions.
This formula can be simplified as follows.
E((X − µX)(Y − µY))
  = Σ_{x∈DX} Σ_{y∈DY} (x − µX)(y − µY) p(x, y)
  = Σ_{x∈DX} Σ_{y∈DY} (xy − xµY − yµX + µXµY) p(x, y)
  = Σ_{x∈DX} Σ_{y∈DY} xy p(x, y) − µY Σ_{x∈DX} Σ_{y∈DY} x p(x, y)
      − µX Σ_{x∈DX} Σ_{y∈DY} y p(x, y) + µXµY Σ_{x∈DX} Σ_{y∈DY} p(x, y)
  = Σ_{x∈DX} Σ_{y∈DY} xy p(x, y) − µY Σ_{x∈DX} x Σ_{y∈DY} p(x, y)
      − µX Σ_{y∈DY} y Σ_{x∈DX} p(x, y) + µXµY
  = Σ_{x∈DX} Σ_{y∈DY} xy p(x, y) − µY Σ_{x∈DX} x pX(x) − µX Σ_{y∈DY} y pY(y) + µXµY
  = E(XY) − µYµX − µXµY + µXµY
  = E(XY) − µXµY.
Example:
E(XY) = 1(0.21) + 2(0.03) + 2(0.1) + 4(0.01) = 0.51
cov(X, Y) = E(XY) − µXµY = 0.51 − (0.55)(1.1) = −0.095.
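The shortcut formula makes this a one-line computation over the table; a minimal sketch (note that only cells with x ≥ 1 and y ≥ 1 contribute to E(XY)):

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    x_vals = np.array([0, 1, 2])
    y_vals = np.array([0, 1, 2])
    mu_X = (x_vals * P.sum(axis=1)).sum()          # 0.55
    mu_Y = (y_vals * P.sum(axis=0)).sum()          # 1.1

    # E(XY): weight each cell's product xy by p(x, y).
    E_XY = (np.outer(x_vals, y_vals) * P).sum()    # 0.51
    cov = E_XY - mu_X * mu_Y                       # -0.095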
If X and Y are independent, then
E(XY) = Σ_{x∈DX} Σ_{y∈DY} xy p(x, y)
      = Σ_{x∈DX} Σ_{y∈DY} xy pX(x) pY(y)
      = (Σ_{x∈DX} x pX(x)) (Σ_{y∈DY} y pY(y))
      = µXµY.
Result: If X and Y are independent random variables, then
cov(X, Y ) = 0.
The correlation coefficient between X and Y is defined as
ρ = cov(X, Y) / (σX σY).
It can be shown that
−1 ≤ ρ ≤ 1.
Example: Find ρ.
E(X²) = 1²(0.35) + 2²(0.1) = 0.75
E(Y²) = 1²(0.5) + 2²(0.3) = 1.7
σX² = E(X²) − µX² = 0.75 − (0.55)² = 0.4475
σY² = E(Y²) − µY² = 1.7 − (1.1)² = 0.49
ρ = cov(X, Y) / (σX σY) = −0.095 / (√0.4475 · √0.49) ≈ −0.203.
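All of these quantities can be computed directly from the joint table; a minimal sketch:

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    x_vals = np.array([0, 1, 2])
    y_vals = np.array([0, 1, 2])
    pX, pY = P.sum(axis=1), P.sum(axis=0)
    mu_X, mu_Y = (x_vals * pX).sum(), (y_vals * pY).sum()

    var_X = (x_vals**2 * pX).sum() - mu_X**2                    # 0.4475
    var_Y = (y_vals**2 * pY).sum() - mu_Y**2                    # 0.49
    cov = (np.outer(x_vals, y_vals) * P).sum() - mu_X * mu_Y    # -0.095
    rho = cov / np.sqrt(var_X * var_Y)                          # about -0.203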
Result: Let X and Y be two random variables. Then
a) E(aX + bY ) = aE(X) + bE(Y )
b) If X and Y are independent, then Var(X + Y ) = Var(X) + Var(Y ).
Proof:
a)
E(aX + bY) = Σ_{x∈DX} Σ_{y∈DY} (ax + by) p(x, y)
           = a Σ_{x∈DX} Σ_{y∈DY} x p(x, y) + b Σ_{x∈DX} Σ_{y∈DY} y p(x, y)
           = a Σ_{x∈DX} x Σ_{y∈DY} p(x, y) + b Σ_{y∈DY} y Σ_{x∈DX} p(x, y)
           = a Σ_{x∈DX} x pX(x) + b Σ_{y∈DY} y pY(y)
           = aE(X) + bE(Y).
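Part a) can be checked numerically on the city example for any choice of constants; a minimal sketch (the values a = 2, b = 3 are arbitrary choices of ours):

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    x_vals = np.array([0, 1, 2])
    y_vals = np.array([0, 1, 2])
    mu_X = (x_vals * P.sum(axis=1)).sum()
    mu_Y = (y_vals * P.sum(axis=0)).sum()

    a, b = 2.0, 3.0   # arbitrary constants for the check
    # E(aX + bY) computed cell by cell from the joint distribution.
    lhs = ((a * x_vals[:, None] + b * y_vals[None, :]) * P).sum()
    assert np.isclose(lhs, a * mu_X + b * mu_Y)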
b)
Var(X + Y) = E((X + Y)²) − [E(X + Y)]²
           = E(X² + 2XY + Y²) − (µX + µY)²
           = E(X²) + 2E(XY) + E(Y²) − µX² − 2µXµY − µY²
           = E(X²) − µX² + E(Y²) − µY² + 2[E(XY) − µXµY]
           = Var(X) + Var(Y) + 2 cov(X, Y)
           = Var(X) + Var(Y), since cov(X, Y) = 0 when X and Y are independent.
Remark: For independent X and Y, result b) can be generalized to
Var(aX + bY) = a² Var(X) + b² Var(Y).
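The intermediate identity in the proof of b), Var(X + Y) = Var(X) + Var(Y) + 2 cov(X, Y), holds whether or not X and Y are independent, so it can be verified on the city example, where the covariance term is nonzero; a minimal sketch:

    import numpy as np

    P = np.array([[0.10, 0.26, 0.19],
                  [0.04, 0.21, 0.10],
                  [0.06, 0.03, 0.01]])
    x_vals = np.array([0, 1, 2])
    y_vals = np.array([0, 1, 2])
    pX, pY = P.sum(axis=1), P.sum(axis=0)
    mu_X, mu_Y = (x_vals * pX).sum(), (y_vals * pY).sum()
    var_X = (x_vals**2 * pX).sum() - mu_X**2
    var_Y = (y_vals**2 * pY).sum() - mu_Y**2
    cov = (np.outer(x_vals, y_vals) * P).sum() - mu_X * mu_Y

    # Var(X + Y) straight from the joint distribution: S holds x + y per cell.
    S = x_vals[:, None] + y_vals[None, :]
    var_S = (S**2 * P).sum() - ((S * P).sum())**2
    # The cross term 2*cov is nonzero here because X and Y are dependent.
    assert np.isclose(var_S, var_X + var_Y + 2 * cov)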