Probability
Chapter 1
Question 13.
Basic Information
Round 1
Win if 7: (1,6)(2,5)(3,4)(4,3)(5,2)(6,1)
or 11: (5,6)(6,5)
Probability of winning in Round 1 = 8/36
Lose if 2: (1,1)
or 3: (1,2)(2,1)
or 12: (6,6)
Probability of losing in Round 1 = 4/36
Continue to next round if 4: (1,3)(2,2)(3,1)=[3/36]
if 5: (1,4)(2,3)(3,2)(4,1)=[4/36]
if 6: (1,5)(2,4)(3,3)(4,2)(5,1)=[5/36]
if 8: (2,6)(3,5)(4,4)(5,3)(6,2)=[5/36]
if 9: (3,6)(4,5)(5,4)(6,3)=[4/36]
if 10: (4,6)(5,5)(6,4)=[3/36]
Probability of Continuing = 24/36
Round 2, 3, . . . n
If rolled 4 or 10 in Round 1: Win if 4 or 10 = [3/36]
Lose if 7 = [6/36]
Continue otherwise = [27/36]
If rolled 5 or 9 in Round 1: Win if 5 or 9 = [4/36]
Lose if 7 = [6/36]
Continue otherwise = [26/36]
If rolled 6 or 8 in Round 1: Win if 6 or 8 = [5/36]
Lose if 7 = [6/36]
Continue otherwise = [25/36]
How to solve?
Need to calculate the probability of winning if the game continues after the
first round for each value that was rolled in Round 1.
Example: Rolling a 4 in Round 1.
The probability of winning in Round 2 is (3/36)^2, i.e. the probability of rolling a 4 in Round 1 and a 4 in Round 2. The probability of winning in Round 3 is (3/36)^2 · (27/36), i.e. the probability of rolling a 4 in Round 1, continuing in Round 2, and rolling a 4 in Round 3. The probability of winning in Round 4 is (3/36)^2 · (27/36)^2, i.e. the probability of rolling a 4 in Round 1, continuing for two rounds, and rolling a 4 in Round 4. It is easy to see that there is a sequence here.
(3/36)^2 + (3/36)^2 · (27/36) + (3/36)^2 · (27/36)^2 + (3/36)^2 · (27/36)^3 + . . .     (1)
There are two ways of summing a series such as this.

1. Treat the 27/36 as a discount factor δ. We know from our experience with repeated games that a sequence of 1s discounted by δ sums to 1/(1 − δ). In our example, we have (3/36)^2 discounted by 27/36. Thus, the probability of winning if one rolls a 4 in Round 1 is

(3/36)^2 / (1 − 27/36).
2. If one looks at equation (1), it is easy to see that one can factor out 27/36 and the sequence starts all over again, i.e.

(3/36)^2 + (27/36) · [(3/36)^2 + (3/36)^2 · (27/36) + (3/36)^2 · (27/36)^2 + (3/36)^2 · (27/36)^3 + . . .]

In other words,

(3/36)^2 + (27/36) · [equation (1)]

Thus, the probability of winning if one rolls a 4 in Round 1 can be written as P(4) = (3/36)^2 + (27/36) · P(4) and solved for P(4).
You can use either method and you will find that the probability of winning if one rolls a 4 in Round 1 is 1/36. This is also the probability of winning if one rolls a 10 in Round 1. The results that follow are based on using the first method outlined above.
P(5 or 9) = (4/36)^2 · 1/(1 − 26/36) = 2/45

P(6 or 8) = (5/36)^2 · 1/(1 − 25/36) = 25/396
Now we can calculate the probability of winning if one continues to Round 2:

P = 2 · (1/36) + 2 · (2/45) + 2 · (25/396) = 0.2707

Since the probability of winning in Round 1 is 8/36, we know that the probability of winning in the dice game of craps is:

P(winning) = 0.2707 + 8/36 = 0.493
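As a numerical check, here is a short Monte Carlo sketch of the game described above (my own code, not part of the original solution); it estimates the overall win probability, which should land near 0.493.

```python
import random

def play_craps():
    """Play one game of craps; return True if the shooter wins."""
    roll = random.randint(1, 6) + random.randint(1, 6)
    if roll in (7, 11):        # win in Round 1
        return True
    if roll in (2, 3, 12):     # lose in Round 1
        return False
    point = roll               # 4, 5, 6, 8, 9, or 10: keep rolling
    while True:
        roll = random.randint(1, 6) + random.randint(1, 6)
        if roll == point:      # roll the point again: win
            return True
        if roll == 7:          # roll a 7 first: lose
            return False

trials = 200_000
wins = sum(play_craps() for _ in range(trials))
print(wins / trials)           # should be close to 0.493
```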
Question 19a
E = event that die 1 is a 6
F = event that die 2 is a 6
Thus, the probability that at least one of the dice is a 6 is:
P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F ).
This is:
1/6 + 1/6 − 1/36 = 11/36.
Question 19b
A = event that the two faces are different [30/36]
B = event that at least one 6 is thrown [11/36]
P(A ∩ B) = 10/36 i.e. we need to exclude the case (6,6)
Thus, the probability that at least one 6 is thrown given that the faces on the dice are different is:

P(B|A) = P(A ∩ B)/P(A) = (10/36)/(30/36) = 1/3.
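A brute-force enumeration of the 36 equally likely outcomes confirms both answers (a small sketch of my own; the variable names are made up):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # all 36 ordered rolls

at_least_one_six = [o for o in outcomes if 6 in o]
different_faces = [o for o in outcomes if o[0] != o[1]]
both = [o for o in different_faces if 6 in o]

print(Fraction(len(at_least_one_six), len(outcomes)))   # 11/36
print(Fraction(len(both), len(different_faces)))        # 1/3
```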
Question 21.
M=Man
W=Woman
B=Color blind
P(M) = 1/2
P(W) = 1/2
P(B|M) = 1/20
P(B|W) = 1/400
P(B) = 1/2 · 1/20 + 1/2 · 1/400 = 21/800
A color-blind person is chosen at random. The probability that this person
is male is:
P(M|B) = P(M ∩ B)/P(B) = P(B|M)P(M)/P(B) = (1/20 · 1/2)/(21/800) = 20/21
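The same calculation in exact arithmetic (a sketch using Python's fractions module; variable names are mine):

```python
from fractions import Fraction

p_m, p_w = Fraction(1, 2), Fraction(1, 2)      # prior: person is male / female
p_b_given_m = Fraction(1, 20)                  # P(color blind | male)
p_b_given_w = Fraction(1, 400)                 # P(color blind | female)

p_b = p_b_given_m * p_m + p_b_given_w * p_w    # total probability of color blindness
p_m_given_b = p_b_given_m * p_m / p_b          # Bayes' rule

print(p_b)            # 21/800
print(p_m_given_b)    # 20/21
```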
Question 25.
Two cards are randomly selected from a deck of 52.
P=pair
D=different suits
(a) P(P) = 3/51 i.e. 3 cards are left with the same denomination as the first card in a pack of 51 remaining cards.
P(D) = 39/51
P(D|P) = 1 i.e. a pair can only be composed of different suits

(b) P(P|D) = P(D|P)P(P)/P(D) = (3/51)/(39/51) = 1/13
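A quick simulation sketch (my own construction) that estimates P(pair | different suits) by repeatedly drawing two cards without replacement:

```python
import random

ranks = range(13)
suits = range(4)
deck = [(r, s) for r in ranks for s in suits]

pairs = diff_suits = 0
trials = 200_000
for _ in range(trials):
    (r1, s1), (r2, s2) = random.sample(deck, 2)
    if s1 != s2:                 # condition on different suits
        diff_suits += 1
        if r1 == r2:             # same denomination: a pair
            pairs += 1

print(pairs / diff_suits)        # should be close to 1/13 = 0.0769...
```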
Question 30.
B=Bill hits target
G= George hits target
T0 =both miss
T1 =one hits
T2 =both hit
P(B) = 7/10
P(G) = 4/10
P(T0) = 3/10 · 6/10 = 18/100
P(T2) = 28/100
P(T1) = 54/100
P(T1|G) = 3/10 i.e. the probability that Bill misses

(a) P(G|T1) = P(G ∩ T1)/P(T1) = P(T1|G)P(G)/P(T1) = (3/10 · 4/10)/(54/100) = 2/9
T0^c = complement of T0, i.e. one hit or two hits
P(T0^c|G) = 1 since if George hits, the target must be hit either once or twice
(b) P(G|T1 or T2) = P(G|T0^c) = P(T0^c|G)P(G)/P(T0^c) = (4/10)/(82/100) = 20/41
Question 43
P(B in R1) = b/(b+r)
P(R in R1) = r/(b+r)
P(R in R2 | B in R1) = r/(b+r+c)
P(R in R2 | R in R1) = (r+c)/(b+r+c)

P(R in R2) = P(R in R2 | B in R1) · P(B in R1) + P(R in R2 | R in R1) · P(R in R1) = r(b+r+c)/[(b+r+c)(b+r)]

P(B in R1 | R in R2) = [P(R in R2 | B in R1) · P(B in R1)] / P(R in R2) = [r/(b+r+c) · b/(b+r)] / [r(b+r+c)/((b+r+c)(b+r))] = b/(b+r+c)
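A symbolic check of the last identity (a sketch assuming sympy is available; not part of the original solution):

```python
import sympy as sp

b, r, c = sp.symbols('b r c', positive=True)

p_b_r1 = b / (b + r)                      # P(B in R1)
p_r_r1 = r / (b + r)                      # P(R in R1)
p_r2_given_b = r / (b + r + c)            # P(R in R2 | B in R1)
p_r2_given_r = (r + c) / (b + r + c)      # P(R in R2 | R in R1)

p_r2 = p_r2_given_b * p_b_r1 + p_r2_given_r * p_r_r1
posterior = sp.simplify(p_r2_given_b * p_b_r1 / p_r2)

print(sp.simplify(p_r2))     # should simplify to r/(b + r)
print(posterior)             # should simplify to b/(b + r + c)
```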
Chapter 2
Question 9
Probability Mass Function of X:
P(0) = 1/2
P(1) = 3/5 − 1/2 = 1/10
P(2) = 4/5 − 3/5 = 2/10
P(3) = 9/10 − 8/10 = 1/10
P(3.5) = 1/10
P(0) + P(1) + P(2) + P(3) + P(3.5) = 1
Question 16
This is a binomial random variable. Thus,
p(i) = \binom{n}{i} p^i (1 − p)^{n−i}, where \binom{n}{i} = \frac{n!}{(n−i)! \, i!}

P(show up) = p = 0.95

There will not be a seat available for every passenger when 52 people show up or when 51 people show up.

P(52) = \binom{52}{52} (0.95)^{52} · (0.05)^0

P(51) = \binom{52}{51} (0.95)^{51} · (0.05)^1

Thus, the probability that there is a seat for everyone is:

1 − \binom{52}{52} (0.95)^{52} · (0.05)^0 − \binom{52}{51} (0.95)^{51} · (0.05)^1

This is the same as:

1 − (0.95)^{52} − 52 (0.95)^{51} · (0.05) = 0.74
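A one-line check of the final number (a sketch using only the standard library):

```python
from math import comb

p = 0.95
# probability that there is NOT a seat for everyone: 52 or 51 passengers show up
p_no_seat = comb(52, 52) * p**52 * 0.05**0 + comb(52, 51) * p**51 * 0.05**1
print(1 - p_no_seat)   # approximately 0.74
```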
Question 23
X=number of flips to get r heads
n-1 flips, only r-1 heads
coin has probability p of being heads
Remember that the probability of r successes in n flips is:
p(r) = \binom{n}{r} p^r (1 − p)^{n−r}

This is essentially what we use to answer this question.
The probability of r − 1 successes in n − 1 trials is:

P(r − 1) = \binom{n−1}{r−1} p^{r−1} (1 − p)^{(n−1)−(r−1)}

This is:

P(r − 1) = \binom{n−1}{r−1} p^{r−1} (1 − p)^{n−r}

The probability of getting r heads in exactly n flips requires us to multiply this by the probability of getting a head on the nth flip. Since the flips are independent, this is:

P(X = n) = \binom{n−1}{r−1} p^{r−1} (1 − p)^{n−r} · p

This is the same as:

P(X = n) = \binom{n−1}{r−1} p^r (1 − p)^{n−r}
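To sanity-check the formula, one can simulate the number of flips needed for the r-th head and compare with \binom{n−1}{r−1} p^r (1 − p)^{n−r} (a sketch; the particular values of r, p, and n are arbitrary):

```python
import random
from math import comb

r, p, n, trials = 3, 0.4, 8, 200_000

def flips_until_r_heads(r, p):
    """Return the flip number on which the r-th head appears."""
    heads = flips = 0
    while heads < r:
        flips += 1
        if random.random() < p:
            heads += 1
    return flips

hits = sum(flips_until_r_heads(r, p) == n for _ in range(trials))
print(hits / trials)                                  # simulated P(X = n)
print(comb(n - 1, r - 1) * p**r * (1 - p)**(n - r))   # formula from above, about 0.1045
```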
Question 32a
We know that the area under the probability density function must equal
1. Thus, to calculate c, we take the integral of c(1 − x2 ) between -1 and 1
and set it equal to 1:
\int_{-1}^{1} c(1 − x^2) dx = 1

c \int_{-1}^{1} 1 dx − (c/3) \int_{-1}^{1} 3x^2 dx = 1

c([x]_{-1}^{1} − (1/3)[x^3]_{-1}^{1}) = 1

c(2 − (1/3 + 1/3)) = 1

(4/3)c = 1

c = 3/4
Question 32b
F(x) = \int_{-1}^{x} (3/4)(1 − t^2) dt

     = (3/4)[t]_{-1}^{x} − (3/4)(1/3)[t^3]_{-1}^{x}

     = (3/4)x + 3/4 − (1/4)x^3 − 1/4

     = (3/4)x + 1/2 − (1/4)x^3
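A numerical spot check of both parts (a sketch assuming scipy is available; the test point x = 0.5 is arbitrary):

```python
from scipy.integrate import quad

c = 3 / 4
pdf = lambda t: c * (1 - t**2)

total, _ = quad(pdf, -1, 1)
print(total)                                   # should be 1.0

x = 0.5
numeric_cdf, _ = quad(pdf, -1, x)              # F(x) by direct integration
formula_cdf = 0.75 * x + 0.5 - 0.25 * x**3     # F(x) from the formula above
print(numeric_cdf, formula_cdf)                # both about 0.84375
```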
Question 36a
We know that the area under the probability density function must equal 1.
Thus, to calculate c, we take the integral of ce^{−2x} between 0 and ∞ and set it equal to 1:

\int_{0}^{∞} ce^{−2x} dx = 1

−(c/2) \int_{0}^{∞} −2e^{−2x} dx = 1

−(c/2)[e^{−2x}]_{0}^{∞} = 1

0 − (−(c/2)e^{0}) = 1

(1/2)c = 1

c = 2
Question 36b
P{X > 2} = \int_{2}^{∞} 2e^{−2x} dx

         = [−e^{−2x}]_{2}^{∞}

         = 0 − (−e^{−4})

         = e^{−4}
Question 46
Remember that V ar[X] = E[(X − E[X])2 ]. Thus,
Var[cX] = E[(cX − E[cX])^2]
        = E[(cX − cE[X])^2]
        = E[(c(X − E[X]))^2]
        = c^2 E[(X − E[X])^2]
        = c^2 Var[X]
Question 53
There are three ways to find the variance of a binomial random variable.
The first two methods start out the same way:
We know that V ar[X] = E[(X − E[X])2 ]. Thus,
= E[X^2 − 2XE[X] + (E[X])^2]
= E[X^2] − E[2XE[X]] + E[(E[X])^2]
= E[X^2] − 2E[X]E[X] + (E[X])^2
= E[X^2] − (E[X])^2
Now we need to calculate E[X] and E[X 2 ]. First, E[X]:
E[X] = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 − p)^{n−x}    (2)

     = \sum_{x=0}^{n} \frac{x \, n!}{(n − x)! \, x!} p^x (1 − p)^{n−x}    (3)

The x = 0 term of equation (3) equals 0. Thus, we can rewrite equation (3) as:

E[X] = \sum_{x=1}^{n} \frac{x \, n!}{(n − x)! \, x!} p^x (1 − p)^{n−x}    (4)

x! can be rewritten as x · (x − 1) · (x − 2) · . . . · 1. Thus, we can factor out an x at the bottom and use it to cancel out the x on top, so equation (4) can be rewritten as:

E[X] = \sum_{x=1}^{n} \frac{n!}{(n − x)! \, (x − 1)!} p^x (1 − p)^{n−x}    (5)

We can also pull an n out of n! to rewrite equation (5) as:

E[X] = n \sum_{x=1}^{n} \frac{(n − 1)!}{(n − x)! \, (x − 1)!} p^x (1 − p)^{n−x}    (6)

We can also pull out one of the p's to rewrite equation (6) as:

E[X] = np \sum_{x=1}^{n} \frac{(n − 1)!}{(n − x)! \, (x − 1)!} p^{x−1} (1 − p)^{n−x}    (7)

Now let k = x − 1. Thus, we can rewrite equation (7) as:

E[X] = np \sum_{k=0}^{n−1} \binom{n−1}{k} p^k (1 − p)^{n−1−k}    (8)

We now need to use the binomial theorem, which states that:

\sum_{x=0}^{n} \binom{n}{x} p^x (1 − p)^{n−x} = (p + (1 − p))^n

We can use this to rewrite equation (8) as:

E[X] = np [p + (1 − p)]^{n−1} = np
Now we need to calculate E[X^2]. This is where Methods 1 and 2 diverge.
Method 1:
E[X^2] = \sum_{x=0}^{n} x^2 \binom{n}{x} p^x (1 − p)^{n−x}    (9)

       = \sum_{x=0}^{n} x^2 \frac{n!}{(n − x)! \, x!} p^x (1 − p)^{n−x}

We can pull out an x as before in order to rewrite equation (9) as:

E[X^2] = \sum_{x=0}^{n} x \frac{n!}{(n − x)! \, (x − 1)!} p^x (1 − p)^{n−x}

We can pull an n out of n! and a p out from p^x to get:

E[X^2] = np \sum_{x=0}^{n} x \frac{(n − 1)!}{(n − x)! \, (x − 1)!} p^{x−1} (1 − p)^{n−x}    (10)

The x = 0 term of equation (10) is 0, so rewrite it as:

E[X^2] = np \sum_{x=1}^{n} x \frac{(n − 1)!}{(n − x)! \, (x − 1)!} p^{x−1} (1 − p)^{n−x}

Let g = x − 1 and substitute in:

E[X^2] = np \sum_{g=0}^{n−1} (g + 1) \frac{(n − 1)!}{g! \, (n − 1 − g)!} p^g (1 − p)^{n−1−g}

This can be rewritten as:

E[X^2] = np \sum_{g=0}^{n−1} (g + 1) \binom{n−1}{g} p^g (1 − p)^{n−1−g}

       = np \sum_{g=0}^{n−1} g \binom{n−1}{g} p^g (1 − p)^{n−1−g} + np \sum_{g=0}^{n−1} \binom{n−1}{g} p^g (1 − p)^{n−1−g}    (11)

Since \sum_{g=0}^{n−1} \binom{n−1}{g} p^g (1 − p)^{n−1−g} = 1 (Ross p. 29, eighth edition, i.e. just the binomial theorem), equation (11) can be rewritten as:

E[X^2] = np \left[ \sum_{g=0}^{n−1} g \binom{n−1}{g} p^g (1 − p)^{n−1−g} + 1 \right]    (12)

\sum_{g=0}^{n−1} g \binom{n−1}{g} p^g (1 − p)^{n−1−g} is essentially the same sum we used to calculate E[X], except that it runs from 0 to n − 1. Thus, it equals (n − 1)p and not np. As a result, equation (12) can be rewritten as:

E[X^2] = np[(n − 1)p + 1]
Method 2:
Now for an alternative way of calculating E[X 2 ]. First, a trick. E[X 2 ]
can be rewritten as: E[X(X − 1)] + E[X]. Thus, we have:
V ar[X] = E[X(X − 1)] + E[X] − (E[X])2
Since we know E[X], we need only calculate E[X(X − 1)].
E[X(X − 1)] = \sum_{x=0}^{n} x(x − 1) \binom{n}{x} p^x (1 − p)^{n−x}

            = \sum_{x=0}^{n} x(x − 1) \frac{n!}{x! \, (n − x)!} p^x (1 − p)^{n−x}
This equals 0 when x=1 or x=0. Thus, it can be rewritten as:
E[X(X − 1)] = \sum_{x=2}^{n} x(x − 1) \frac{n!}{x! \, (n − x)!} p^x (1 − p)^{n−x}
x! can be rewritten as x · (x − 1) · (x − 2) · . . . · (x − (x − 1)). Thus, we can
factor out x(x-1) so that the previous equation can be rewritten as:
E[X(X − 1)] = \sum_{x=2}^{n} \frac{n!}{(x − 2)! \, (n − x)!} p^x (1 − p)^{n−x}
n! can be rewritten as n · (n − 1) · (n − 2) · . . . · (n − (n − 1)). Thus, the
previous equation can be rewritten as:
E[X(X − 1)] = n(n − 1) \sum_{x=2}^{n} \frac{(n − 2)!}{(x − 2)! \, (n − x)!} p^x (1 − p)^{n−x}
This, in turn, can be rewritten as:
E[X(X − 1)] = n(n − 1)p^2 \sum_{x=2}^{n} \frac{(n − 2)!}{(x − 2)! \, (n − x)!} p^{x−2} (1 − p)^{n−x}
Now, we need to remember the binomial theorem that states (just another
way of writing what we used earlier):
(a + b)^m = \sum_{y=0}^{m} \frac{m!}{y! \, (m − y)!} a^y b^{m−y}
Let y=x-2 and m=n-2. Thus, we can rewrite E[X(X − 1)] such that:
E[X(X − 1)] = n(n − 1)p^2 \sum_{y=0}^{m} \frac{m!}{y! \, (m − y)!} p^y (1 − p)^{m−y}

            = n(n − 1)p^2 (p + (1 − p))^m

            = n(n − 1)p^2
Now that we have calculated E[X(X − 1)], we can calculate E[X 2 ]. Remember that E[X 2 ] = E[X(X − 1)] + E[X]. Thus,
E[X 2 ] = np((n − 1)p + 1)
which is exactly what we calculated before.
Now that we have E[X 2 ], we can calculate the variance of the binomial random variable:
V ar[X] = E[X 2 ] − E[X]2
= np((n − 1)p + 1) − (np)^2
= np((n − 1)p + 1 − np)
= np(1 − p)
Finally, we have an answer using Methods 1 and 2. Method 3 is much much
simpler.
Method 3: Use the fact that the variance of a binomial random variable is the sum of the variances of the individual trials.
A binomial random variable with parameters n and p represents the number of successes in n independent trials, so we can write X = X_1 + X_2 + . . . + X_n, where X_i equals 1 if trial i is a success and 0 otherwise. First, show that the variance of this sum is the sum of the variances of its terms:
Var(\sum_{i=1}^{n} X_i) = Cov(\sum_{i=1}^{n} X_i, \sum_{j=1}^{n} X_j)

                        = \sum_{i=1}^{n} \sum_{j=1}^{n} Cov(X_i, X_j)

                        = \sum_{i=1}^{n} Cov(X_i, X_i) + \sum_{i=1}^{n} \sum_{j \neq i} Cov(X_i, X_j)

                        = \sum_{i=1}^{n} Var(X_i) + 2 \sum_{i=1}^{n} \sum_{j < i} Cov(X_i, X_j)
Since the trials of the binomial random variable are independent, all the covariance terms are 0, so:

Var(\sum_{i=1}^{n} X_i) = \sum_{i=1}^{n} Var(X_i)
Thus, we know that: V ar(X) = V ar(X1 ) + V ar(X2 ) + . . . + V ar(Xn ). Now
calculate V ar[Xi ] for a binomial random variable and multiply by the number of observations.
Var[X_i] = E[(X_i − E[X_i])^2]
         = E[X_i^2] − (E[X_i])^2
         = E[X_i] − (E[X_i])^2     (since X_i^2 = X_i)
         = p − p^2

Thus,

Var(X) = np(1 − p)
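As a numerical sanity check of E[X] = np and Var[X] = np(1 − p), one can compute both moments directly from the binomial PMF (a sketch with arbitrary n and p):

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))
second_moment = sum(x**2 * pmf[x] for x in range(n + 1))
variance = second_moment - mean**2

print(mean, n * p)                  # both approximately 3.0
print(variance, n * p * (1 - p))    # both approximately 2.1
```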
2 Maximum Likelihood

2.1 Part A
Let Yi ∼ N (µ, σ 2 )
The PDF is given by:
P(y_i | µ, σ^2) = f_N(y_i | µ, σ^2) = \frac{1}{σ\sqrt{2π}} e^{−(y_i − µ)^2 / (2σ^2)}

which entails a likelihood function of:

LF(µ, σ^2 | y) = P(y_1, y_2, . . . , y_N) = \prod_{i=1}^{N} f_N(y_i | µ, σ^2) = \frac{1}{σ^N (\sqrt{2π})^N} e^{−\frac{1}{2} \sum_{i=1}^{N} (y_i − µ)^2 / σ^2}

The log-likelihood function is:

ln LF = −\frac{N}{2} ln(2π) − \frac{N}{2} ln σ^2 − \frac{1}{2σ^2} \sum_{i=1}^{N} (y_i − µ)^2
First, find extrema by taking the first derivative with respect to σ 2 :
\frac{∂ ln LF}{∂σ^2} = −\frac{N}{2} · \frac{1}{σ^2} + \frac{1}{2} \sum_{i=1}^{N} (y_i − µ)^2 · \frac{1}{(σ^2)^2}

Set equal to 0:

−\frac{N}{2σ^2} + \frac{1}{2σ^4} \sum_{i=1}^{N} (y_i − µ)^2 = 0

−\frac{N}{2σ^2} = −\frac{1}{2σ^4} \sum_{i=1}^{N} (y_i − µ)^2

σ^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i − µ)^2

Since y_i − µ = û_i, we can rewrite the equation above as:

σ̂^2 = \frac{1}{N} \sum_{i=1}^{N} (û_i)^2
We need to check that this is a maximum by taking the derivative again
with respect to σ 2 :
\frac{∂^2 ln LF}{∂(σ^2)^2} = −\frac{N}{2} · \left(−\frac{1}{σ^4}\right) − \sum_{i=1}^{N} (y_i − µ)^2 · \frac{1}{σ^6}

                           = \frac{N}{2σ^4} − \frac{\sum_{i=1}^{N} (y_i − µ)^2}{σ^6}

                           = \frac{σ^2 N − 2\sum_{i=1}^{N} (y_i − µ)^2}{2σ^6}

We want this to be a maximum. As a result, this expression should be negative at our estimate. We have already calculated σ̂^2 as \frac{1}{N}\sum_{i=1}^{N}(û_i)^2, and we also know that \sum_{i=1}^{N}(y_i − µ)^2 is \sum_{i=1}^{N}(û_i)^2. We can substitute these into the expression above to get:

\frac{N · \frac{1}{N}\sum_{i=1}^{N}(û_i)^2 − 2\sum_{i=1}^{N}(û_i)^2}{2σ^6}

This is the same as:

\frac{\sum_{i=1}^{N}(û_i)^2 − 2\sum_{i=1}^{N}(û_i)^2}{2σ^6}
It is easy to see that this must be negative. Thus, our estimate is a maximum as desired.
How does σ^2_MLE compare to σ^2_OLS?

The MLE estimate of σ^2 is \frac{1}{N}\sum_{i=1}^{N}(û_i)^2.

The OLS estimate of σ^2 is \frac{1}{N−K}\sum_{i=1}^{N}(û_i)^2 (see Gujarati p. 103, fourth edition), where K is the number of independent variables. This was shown to be unbiased (same page).
Is the MLE estimate unbiased?
E[σ̂^2] = \frac{1}{N} E\left[\sum_{i=1}^{N}(û_i)^2\right]

Gujarati (p. 103) shows that E[\sum_{i=1}^{N}(û_i)^2] = (N − 2)σ^2. Thus,

E[σ̂^2] = \frac{1}{N}(N − 2)σ^2 = \left(\frac{N − 2}{N}\right)σ^2 = σ^2 − \frac{2}{N}σ^2

This shows that the MLE estimate of σ^2 is biased downwards in small samples. But as the sample size N increases, the bias factor tends to 0. Thus, the MLE estimate of σ^2 is asymptotically unbiased, i.e. lim_{N→∞} E[σ̂^2_N] = σ^2. The MLE estimate is also consistent, i.e. lim_{N→∞} P{|σ̂^2 − σ^2| < ε} = 1 for any ε > 0.
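A small simulation sketch of this downward bias (my own illustration; it uses the simplest case of an i.i.d. normal sample with the mean estimated by ȳ, so the unbiased divisor is N − 1 rather than the N − K regression divisor discussed above):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, reps = 4.0, 20_000          # true variance and number of replications

for N in (5, 20, 100):
    y = rng.normal(0.0, np.sqrt(sigma2), size=(reps, N))
    resid = y - y.mean(axis=1, keepdims=True)        # û_i = y_i - ȳ
    mle = (resid**2).sum(axis=1) / N                  # divide by N (MLE)
    unbiased = (resid**2).sum(axis=1) / (N - 1)       # divide by N - 1
    # the MLE mean sits below 4.0, with the gap shrinking as N grows
    print(N, mle.mean(), unbiased.mean())
```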
2.2 Part B
The PDF of the logistic (a,b) random variable is:
\frac{e^{(y_i − a)/b}}{b[1 + e^{(y_i − a)/b}]^2}

This can be rewritten as:

e^{(y_i − a)/b} · [b(1 + e^{(y_i − a)/b})^2]^{−1}

The log-likelihood function of this is:

\sum_{i=1}^{n} \left[ \frac{y_i − a}{b} ln(e) − ln(b) − ln[(1 + e^{(y_i − a)/b})^2] \right]

Since ln(e) = 1, this can be rewritten as:

\sum_{i=1}^{n} \left[ \frac{y_i − a}{b} − ln(b) − ln[(1 + e^{(y_i − a)/b})^2] \right]
In the code, let’s say that I set the true values of a and b to be -6.85
and 4.37 respectively. Thus, our grid search essentially involves plugging
different values of a and b into equation (20) to calculate the associated
log-likelihood. We then need to find the highest log-likelihood. This loglikelihood will obviously be when a=-6.85 and b=4.37.
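A minimal sketch of such a grid search (my own construction; the sample size, grid ranges, and use of numpy are assumptions, with the data drawn from a logistic(a, b) distribution at the true values mentioned above):

```python
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = -6.85, 4.37
y = rng.logistic(loc=a_true, scale=b_true, size=5_000)   # simulated sample

def loglik(a, b, y):
    """Logistic(a, b) log-likelihood from the expression above."""
    z = (y - a) / b
    return np.sum(z - np.log(b) - 2.0 * np.log1p(np.exp(z)))

a_grid = np.linspace(-10, -4, 121)
b_grid = np.linspace(1, 8, 141)
best = max((loglik(a, b, y), a, b) for a in a_grid for b in b_grid)
print(best[1], best[2])    # should land close to -6.85 and 4.37
```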