UC Berkeley
Department of Electrical Engineering and Computer Science
EE 126: Probability and Random Processes
Practice Problems for Midterm: SOLUTION #1
Fall 2007
Issued: Thurs, September 27, 2007
Solutions: Posted on Tues, October 2, 2007
Reading: Bertsekas & Tsitsiklis, Chapters 1 and 2
Problems for §2.6 and §2.7
Problem 3.1
Let X and Y be independent random variables that take values in the set {1, 2, 3}. Let
V = 2X + 2Y , and W = X − Y .
(a) Assume that P(X = k) and P(Y = k) are positive for any k ∈ {1, 2, 3}. Can V and W
be independent? Explain. (No calculations needed.)
For the remaining parts of this problem, assume that X and Y are uniformly distributed
on {1, 2, 3}.
(b) Find and plot pV (v), and compute E[V ] and var(V ).
(c) Find and show in a diagram pV,W (v, w).
(d) Find E[V | W > 0].
(e) Find the conditional variance of W given the event V = 8.
Solution:
(a) V and W cannot be independent. Knowledge of one random variable gives information
about the other. For instance, if V = 12 we know that W = 0.
(b) We begin by drawing the joint PMF of X and Y.
[Figure: the nine grid points (x, y) with x, y ∈ {1, 2, 3}, each with probability 1/9; diagonal lines mark where V = 2(X + Y) is constant, labeled v = 4, 6, 8, 10, 12.]
X and Y are uniformly distributed so each of the nine grid points has probability 1/9.
The lines on the graph represent areas of the sample space in which V is constant. This
constant value of V is indicated on each line. The PMF of V is calculated by adding
the probability associated with each grid point on the appropriate line.
This gives

pV(4) = 1/9,  pV(6) = 2/9,  pV(8) = 3/9,  pV(10) = 2/9,  pV(12) = 1/9,

and pV(v) = 0 for all other v.
By symmetry, E[V] = 8. The variance is

var(V) = (4 − 8)²·(1/9) + (6 − 8)²·(2/9) + 0 + (10 − 8)²·(2/9) + (12 − 8)²·(1/9) = 16/3.
Alternatively, note that V is twice the sum of two independent random variables, V = 2(X + Y), and hence

var(V) = var(2(X + Y)) = 2² var(X + Y) = 4 (var(X) + var(Y)) = 4 · 2 var(X) = 8 var(X)

(note the use of independence in the third equality; in the fourth we use the fact that X and Y are identically distributed and therefore have the same variance). From the distribution of X we easily calculate

var(X) = (1 − 2)²·(1/3) + (2 − 2)²·(1/3) + (3 − 2)²·(1/3) = 2/3,

so in total var(V) = 8 · (2/3) = 16/3, as before.
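These numbers are easy to sanity-check by brute-force enumeration of the nine equally likely outcomes. Below is a minimal Python sketch (all names are illustrative; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction
from itertools import product

# The nine equally likely outcomes (x, y), x, y in {1, 2, 3}.
p = Fraction(1, 9)
pV = {}
for x, y in product(range(1, 4), repeat=2):
    v = 2 * (x + y)                 # V = 2(X + Y)
    pV[v] = pV.get(v, 0) + p

mean_V = sum(v * pr for v, pr in pV.items())
var_V = sum((v - mean_V) ** 2 * pr for v, pr in pV.items())
print(pV)                           # masses 1/9, 2/9, 1/3, 2/9, 1/9 at v = 4, 6, 8, 10, 12
print(mean_V, var_V)                # 8 and 16/3
```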
(c) We start by adding lines corresponding to constant values of W to our first graph from part (b):
[Figure: the same grid of nine points, showing both the lines of constant v = 4, 6, 8, 10, 12 and the lines of constant w = x − y, labeled w = −2, −1, 0, 1, 2.]
Again, each grid point has probability 1/9. Reading the values of (V, W) off the graph gives pV,W(v, w):

            v = 4    v = 6    v = 8    v = 10    v = 12
  w =  2                       1/9
  w =  1              1/9                1/9
  w =  0     1/9               1/9                 1/9
  w = −1              1/9                1/9
  w = −2                       1/9

with pV,W(v, w) = 0 for every other pair (v, w).
(d) The event W > 0 consists of the grid points (v, w) = (6, 1), (10, 1), and (8, 2) of the joint PMF above, each with probability 1/9.
Conditioned on W > 0, V takes the values 6, 8, and 10 with equal probability, so by symmetry E[V | W > 0] = 8.
(e) The event V = 8 is the column v = 8 of the joint PMF above.
When V = 8, W can take on values in the set {−2, 0, 2} with equal probability. By
symmetry, E[W | V = 8] = 0. The variance is:
var(W | V = 8) = (−2 − 0)²·(1/3) + (0 − 0)²·(1/3) + (2 − 0)²·(1/3) = 8/3.
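The conditional answers in parts (d) and (e) can be checked by the same enumeration; a short sketch (again with exact arithmetic, names illustrative):

```python
from fractions import Fraction
from itertools import product

points = [(2 * (x + y), x - y) for x, y in product(range(1, 4), repeat=2)]

# E[V | W > 0]: all conditioning outcomes are equally likely, so average V over them.
vs = [v for v, w in points if w > 0]
print(Fraction(sum(vs), len(vs)))                     # 8

# var(W | V = 8): restrict W to outcomes with v = 8.
ws = [Fraction(w) for v, w in points if v == 8]
mean_w = sum(ws) / len(ws)
print(sum((w - mean_w) ** 2 for w in ws) / len(ws))   # 8/3
```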
Problem 3.2
Joe Lucky plays the lottery on any given week with probability p, independently of whether
he played on any other week. Each time he plays, he has a probability q of winning, again
independently of everything else. During a fixed time period of n weeks, let X be the number
of weeks that he played the lottery and Y the number of weeks that he won.
(a) What is the probability that he played the lottery any particular week, given that he
did not win anything that week?
(b) Find the conditional PMF pY |X (y | x).
(c) Find the joint PMF pX,Y (x, y).
(d) Find the marginal PMF pY (y). Hint: One possibility is to start with the answer in (c),
but the algebra can be messy. But if you think intuitively about the procedure that
generates Y , you may be able to guess the answer.
(e) Find the conditional PMF pX|Y (x | y).
In all parts of this problem, make sure to indicate the range of values to which your PMF
formula applies.
Solution:
(a) Let Li be the event that Joe played the lottery on week i, and let Wi be the event that he won on week i. We are asked to find

P(Li | Wi^c) = P(Wi^c | Li)P(Li) / [P(Wi^c | Li)P(Li) + P(Wi^c | Li^c)P(Li^c)]
= (1 − q)p / [(1 − q)p + 1 · (1 − p)]
= (p − pq)/(1 − pq).
(b) Conditioned on X = x, Y is binomial with parameters x and q:

pY|X(y | x) = C(x, y) q^y (1 − q)^(x−y), for 0 ≤ y ≤ x, and 0 otherwise,

where C(a, b) denotes the binomial coefficient “a choose b.”
(c) Since X has a binomial PMF, we have

pX,Y(x, y) = pY|X(y | x) pX(x) = C(n, x) C(x, y) q^y (1 − q)^(x−y) p^x (1 − p)^(n−x), for 0 ≤ y ≤ x ≤ n,

and pX,Y(x, y) = 0 otherwise.
(d) Using the result from (c), we could compute

pY(y) = Σ_{x=y}^{n} pX,Y(x, y),

but the algebra is messy. An easier method is to realize that Y is just the sum of n independent Bernoulli random variables, each equal to 1 with probability pq (the week is both played and won). Therefore Y has a binomial PMF:

pY(y) = C(n, y) (pq)^y (1 − pq)^(n−y), for 0 ≤ y ≤ n,

and pY(y) = 0 otherwise.
(e) Dividing the joint PMF by the marginal PMF of Y,

pX|Y(x | y) = pX,Y(x, y) / pY(y)
= [C(n, x) C(x, y) q^y (1 − q)^(x−y) p^x (1 − p)^(n−x)] / [C(n, y) (pq)^y (1 − pq)^(n−y)], for 0 ≤ y ≤ x ≤ n,

and pX|Y(x | y) = 0 otherwise.
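As a numerical cross-check of parts (c)–(e), the sketch below (with arbitrary illustrative values of n, p, q) verifies that summing the joint PMF over x reproduces the Binomial(n, pq) marginal, so the conditional PMF above is properly normalized:

```python
from math import comb, isclose

n, p, q = 10, 0.3, 0.6   # illustrative values

def p_joint(x, y):
    """p_X,Y(x, y) = C(n, x) C(x, y) q^y (1 - q)^(x - y) p^x (1 - p)^(n - x)."""
    if not 0 <= y <= x <= n:
        return 0.0
    return comb(n, x) * comb(x, y) * q**y * (1 - q)**(x - y) * p**x * (1 - p)**(n - x)

for y in range(n + 1):
    marginal = sum(p_joint(x, y) for x in range(y, n + 1))
    assert isclose(marginal, comb(n, y) * (p * q)**y * (1 - p * q)**(n - y))
print("marginal of Y matches Binomial(n, pq)")
```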
Problem 3.3
Suppose you and your friend play a game where each of you throws a 6-sided die, each
of your throws being independent. Each time you play the game, if the larger of the two
values you rolled is greater than 4, then you win 1 dollar; otherwise, you
lose 1 dollar. Suppose that you play the game n ≥ 3 times, each game being independent of
the others.
(a) What is the amount of money you expect to win on the first and last game combined?
(b) How much do you expect to win in your last game given that you lost in the first game?
(c) How much do you expect to have won in your last game given that you won the first
game and you won a total of m dollars at the end?
(d) What is the probability that you won both the first and last game given that you won
a total of m dollars at the end?
Solution:
(a) We can define the collection of random variables {Xi }i∈I for I = {1, 2, . . . , n} with
each Xi representing the amount that is won in game i. For this question, we want to
calculate E [X1 + Xn ] = E [X1 ] + E [Xn ] = 2 · E [X1 ], where we have used linearity of
expectation and the fact that {Xi }i∈I are identically distributed.
To compute E[Xi], we note that Xi takes the value 1 whenever max{roll #1, roll #2} > 4 and the value −1 otherwise. If we name the two dice rolls A and B and consider the joint PMF pA,B(a, b) = pA(a) · pB(b) = 1/36, we can evaluate the expectation directly:

E[Xi] = Σ_{max{a,b}>4} 1 · pA,B(a, b) + Σ_{max{a,b}≤4} (−1) · pA,B(a, b)
= 20 · (1/36) + 16 · (−1/36)
= 1/9,

since at least one die shows a 5 or 6 in 20 of the 36 outcomes, and neither does in the remaining 16. So the solution to this question is $2/9.
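A quick enumeration of the 36 equally likely pairs of rolls confirms the value 1/9 per game (a sketch, names illustrative):

```python
from fractions import Fraction
from itertools import product

# X_i = +1 if the max of the two rolls exceeds 4, else -1.
payoffs = [1 if max(a, b) > 4 else -1 for a, b in product(range(1, 7), repeat=2)]
e_game = Fraction(sum(payoffs), len(payoffs))
print(e_game)        # 1/9
print(2 * e_game)    # 2/9: expected winnings on the first and last game combined
```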
(b) By independence, we can ignore the information about losing the first game and simply calculate the expected value of the last game as above, obtaining $1/9.
(c) Easier Solution:
The given information implies that Σ_{i=2}^{n} Xi = m − 1, so E[Σ_{i=2}^{n} Xi | C] = m − 1. By linearity of expectation and the fact that X2, …, Xn are exchangeable under this conditioning, we then have (n − 1)E[Xn | C] = m − 1 ⇒ E[Xn | C] = (m − 1)/(n − 1).
Longer Solution:
We want to compute E[Xn | C], where our conditioning event C is defined as C = {X1 = 1} ∩ D and D = {“won m dollars total”}. The expectation is given by:

E[Xn | C] = Σ_{xn ∈ {−1,1}} xn · pXn|C(xn)
= 1 · P({Xn = 1} ∩ {X1 = 1} ∩ D)/P({X1 = 1} ∩ D) + (−1) · P({Xn = −1} ∩ {X1 = 1} ∩ D)/P({X1 = 1} ∩ D)
= 2 · P({Xn = 1} ∩ {X1 = 1} ∩ D)/P({X1 = 1} ∩ D) − 1.
To carry out the computation, we need to translate our event statements into familiar forms. The event {Xn = 1} ∩ {X1 = 1} ∩ D can be restated as “we win the first game AND the last game AND m − 2 dollars in the other n − 2 games.” Having split the event across distinct games, the sub-events are independent, and the last one is binomial: to win m − 2 dollars in n − 2 games, we must win (m − 2) + (n − m)/2 games and lose (n − m)/2 games. Therefore:

P({Xn = 1} ∩ {X1 = 1} ∩ D) = (20/36) · (20/36) · C(n − 2, (n − m)/2) · (20/36)^((n−2)−(n−m)/2) · (16/36)^((n−m)/2),

where the binomial coefficient chooses which of the other n − 2 games are lost.
The probability of the event {X1 = 1} ∩ D is computed similarly: translate it into “won the first game AND won m − 1 dollars in the other n − 1 games,” then carry out the computation using independence and a binomial count.

Note that both probability expressions are valid only when n − m is even. When n − m is odd, the probability is zero, because it is then impossible to win m dollars in n games (the number of wins, (n + m)/2, must be an integer).
Our final expression, neglecting the trivial case of n − m odd and after performing some algebra and cancellations, is:

E[Xn | C] = 2 · C(n − 2, (n − m)/2) / C(n − 1, (n − m)/2) − 1 = (n + m − 2)/(n − 1) − 1 = (m − 1)/(n − 1).
(d) Now we want to compute P({X1 = 1} ∩ {Xn = 1} | D). As before, we expand using the definition of conditional probability and carry out the computation as above:

P({X1 = 1} ∩ {Xn = 1} | D) = P({X1 = 1} ∩ {Xn = 1} ∩ D) / P(D)
= (n − (n − m)/2)(n − 1 − (n − m)/2) / (n(n − 1)).
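Both conditional answers can be spot-checked by Monte Carlo simulation; a rough sketch under illustrative values of n and m (each game is won with probability 20/36):

```python
import random

def check(n=7, m=3, trials=200_000, seed=1):
    rng = random.Random(seed)
    sum_last = n_c = n_both = n_d = 0
    for _ in range(trials):
        games = [1 if rng.random() < 20 / 36 else -1 for _ in range(n)]
        if sum(games) != m:          # condition on D = {won m dollars total}
            continue
        n_d += 1
        n_both += games[0] == 1 and games[-1] == 1
        if games[0] == 1:            # condition on C = {X1 = 1} and D
            n_c += 1
            sum_last += games[-1]
    print(sum_last / n_c, (m - 1) / (n - 1))                    # part (c)
    j = (n - m) / 2
    print(n_both / n_d, (n - j) * (n - 1 - j) / (n * (n - 1)))  # part (d)

check()
```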
Sample midterm problems
Problem 3.4
Since there is no direct flight from San Diego (S) to New York (N), every time Alice wants to
go to New York, she has to stop in either Chicago (C) or Denver (D). Due to bad weather
conditions, both the flights from S to C and the flights from C to N have independently a
delay of 1 hour with probability p. Similarly, at Denver airport, both incoming and outgoing
flights are independently subject to a 2 hour delay with probability q. On any given occasion,
Alice chooses randomly between the Chicago or Denver routes with equal probability.
(a) What is the average total delay (across both legs of the overall trip) that she experiences
in going from S to N?
(b) Suppose Alice arrives at N with a delay of two hours. What is the probability that she
flew through C?
(c) Suppose that Alice wants to maximize the probability that she arrives in New York
with a total delay < 2 hours. Under what conditions on p and q is going via Chicago
a better choice than going via Denver?
(d) Suppose now that Alice always flies through C. On average, how many trips does she
make before experiencing a 2 hour delay?
(e) Suppose now that the flight between S and D is known to be delayed, but Alice still
randomly flies either via C or D with equal probability. With what delay should she
expect to arrive at N?
Solution: The problem can be modeled as a network with four nodes (S, C, D, N), where the legs SC, CN, SD, and DN are the links. We define four independent random variables giving the delay on each link:
• XSC is 0 wp 1 − p and 1 wp p.
• XSD is 0 wp 1 − q and 2 wp q.
• XCN is 0 wp 1 − p and 1 wp p.
• XDN is 0 wp 1 − q and 2 wp q.
Also, let us define D as the event that Alice flies through Denver, and C as the event that
Alice flies through Chicago.
(a) There are two possible ways to go to N from S, so using the Total Probability law we have

E(delay) = E(delay | C)P(C) + E(delay | D)P(D)
= E(XSC + XCN)P(C) + E(XSD + XDN)P(D)
= (1/2)(E(XSC) + E(XCN) + E(XSD) + E(XDN))
= p + 2q.
(b) Using the Total Probability law and Bayes’ rule, and since P(D) = P(C) = 1/2, we have

P(C | delay = 2) = P(delay = 2 | C)P(C) / [P(delay = 2 | D)P(D) + P(delay = 2 | C)P(C)]
= P(XSC + XCN = 2) / [P(XSD + XDN = 2) + P(XSC + XCN = 2)]
= p² / [2q(1 − q) + p²].
(c) Flying via Denver,

P(delay < 2 | D) = P(XSD + XDN < 2) = (1 − q)²,

and flying via Chicago,

P(delay < 2 | C) = P(XSC + XCN < 2) = 1 − P(XSC + XCN = 2) = 1 − p².

Alice should therefore fly via Chicago when (1 − q)² < 1 − p². This is the case when q > 1 − √(1 − p²) or, equivalently, p < √(2q − q²).
(d) A delay of two hours happens with probability p² on each trip, so we are asked for the mean of a geometric random variable with parameter p². Thus the average number of trips is 1/p².
(e) From the independence of the four random variables,

E(delay | XSD = 2) = E(delay | XSD = 2, D)P(D | XSD = 2) + E(delay | XSD = 2, C)P(C | XSD = 2)
= (1/2)(2 + E(XDN)) + (1/2)(E(XSC) + E(XCN))
= (1/2)(2 + 2q + 2p)
= 1 + q + p.
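Parts (a) and (b) are easy to confirm by simulation; a hedged sketch with illustrative values of p and q:

```python
import random

def trip(p, q, rng):
    """One S-to-N trip: route chosen uniformly; each leg delayed independently."""
    if rng.random() < 0.5:   # via Chicago: each leg adds 1 hour w.p. p
        return (rng.random() < p) + (rng.random() < p), "C"
    else:                    # via Denver: each leg adds 2 hours w.p. q
        return 2 * (rng.random() < q) + 2 * (rng.random() < q), "D"

p, q, rng = 0.2, 0.3, random.Random(0)
trips = [trip(p, q, rng) for _ in range(200_000)]

print(sum(d for d, _ in trips) / len(trips), p + 2 * q)                  # part (a)
routes = [r for d, r in trips if d == 2]
print(routes.count("C") / len(routes), p**2 / (2 * q * (1 - q) + p**2))  # part (b)
```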
Problem 3.5
We transmit a bit of information which is 0 with probability 1 − p and 1 with p. Because of
noise on the channel, each transmitted bit is received correctly with probability 1 − ǫ.
(a) Suppose we observe a “1” at the output. Find the conditional probability p1 that the
transmitted bit is a “1”.
(b) Suppose that we transmit the same information bit n times over the channel. Calculate
the probability that the information bit is a “1” given that you have observed n “1”s
at the output. What happens when n grows?
(c) For this part of the problem, we suppose that we transmit the symbol “1” a total of n
times over the channel. At the output of the channel, we observe the symbol “1” three
times in the n received bits, with one of these “1”s at the n-th transmission. Given
these observations, what is the probability that the k-th received bit is a “1”?
(d) Going back to the situation in part (a): some unknown bit is transmitted over the
channel, and the received bit is a “1”. Suppose in addition that the same information
bit is transmitted a second time, and you again receive another “1”. We want to
find a recursive formula to update p1 to get p2 , the conditional probability that the
transmitted bit is a “1” given that we have observed two “1”s at the output of the
channel. Show that the update can be written as
p2 = (1 − ǫ)p1 / [(1 − ǫ)p1 + ǫ(1 − p1)].
Solution:
(a) Let A be the event that a 1 is transmitted, Ac the event that a 0 is transmitted, and Bn the event that the n-th received bit is a 1. Using the Total Probability law and Bayes’ rule:

p1 = P(A | B1) = P(B1 | A)P(A) / [P(B1 | A)P(A) + P(B1 | Ac)P(Ac)] = (1 − ǫ)p / [(1 − ǫ)p + ǫ(1 − p)].
(b) As in the previous part, and assuming that the uses of the channel are independent, we have

P(A | B1, …, Bn) = P(B1, …, Bn | A)P(A) / [P(B1, …, Bn | A)P(A) + P(B1, …, Bn | Ac)P(Ac)]
= (1 − ǫ)^n p / [(1 − ǫ)^n p + ǫ^n (1 − p)]
= p / [p + (ǫ/(1 − ǫ))^n (1 − p)].

As n → ∞, this converges to

p∞ = 1 if ǫ < 1/2;  p∞ = p if ǫ = 1/2;  p∞ = 0 if ǫ > 1/2.
These results match intuition. When ǫ < 1/2, the observations are positively correlated with the transmitted bit, so the observations Bn, n = 1, 2, …, drive the conditional probability of A upward until it reaches 1. When ǫ = 1/2, the observations are independent of the transmitted bit, so they leave the conditional probability of A unchanged at the prior p. When ǫ > 1/2, the observations are negatively correlated with the transmitted bit, so they drive the conditional probability of A down to 0.
(c) For j < n, note that given A and Bn = 1, the event Σ_{i=1}^{n} Bi = 3 means that exactly two of the first n − 1 received bits are 1. Then

P(Bj = 1 | Σ_{i=1}^{n} Bi = 3, Bn = 1, A)
= P(Σ_{i=1}^{n} Bi = 3 | Bj = 1, Bn = 1, A) · P(Bj = 1 | Bn = 1, A) / P(Σ_{i=1}^{n} Bi = 3 | Bn = 1, A)
= [C(n − 2, 1)(1 − ǫ)ǫ^(n−3) · (1 − ǫ)] / [C(n − 1, 2)(1 − ǫ)² ǫ^(n−3)]
= 2/(n − 1),

while if j = n, P(Bn = 1 | Σ_{i=1}^{n} Bi = 3, Bn = 1, A) = 1.
(d) Assuming the uses of the channel are independent, we have

p2 = P(A | B1, B2) = P(B2 | A, B1) P(A | B1) / P(B2 | B1),

and since B2 is independent of B1 given A, P(B2 | A, B1) = P(B2 | A) = 1 − ǫ, so that

p2 = (1 − ǫ) p1 / P(B2 | B1).

For the denominator,

P(B2 | B1) = P(B1, B2) / P(B1)
= [P(B1, B2 | A)P(A) + P(B1, B2 | Ac)P(Ac)] / [P(B1 | A)P(A) + P(B1 | Ac)P(Ac)]
= [(1 − ǫ)² p + ǫ² (1 − p)] / [(1 − ǫ)p + ǫ(1 − p)]
= (1 − ǫ) · [(1 − ǫ)p / ((1 − ǫ)p + ǫ(1 − p))] + ǫ · [ǫ(1 − p) / ((1 − ǫ)p + ǫ(1 − p))]
= (1 − ǫ)p1 + ǫ(1 − p1).

We therefore have

p2 = (1 − ǫ)p1 / [(1 − ǫ)p1 + ǫ(1 − p1)].
One might notice that, if we let p0 = p, the function used to update p0 to p1 (part (a)) and the function used to update p1 to p2 are the same. Intuitively, for the first observation p0 is the prior probability of A and p1 is the posterior probability after the observation; similarly, for the second observation p1 is the “prior” and p2 the “posterior.” The way a prior is updated to a posterior is therefore the same in both steps.
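A small sketch confirms this interpretation numerically: iterating the one-step update n times from p0 = p reproduces the closed-form posterior of part (b) (all values illustrative):

```python
from math import isclose

def update(prior, eps):
    """One Bayes update after observing a received '1'."""
    return (1 - eps) * prior / ((1 - eps) * prior + eps * (1 - prior))

p, eps, n = 0.4, 0.1, 8
post = p
for _ in range(n):
    post = update(post, eps)

closed_form = (1 - eps)**n * p / ((1 - eps)**n * p + eps**n * (1 - p))
assert isclose(post, closed_form)
print(post)
```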
Problem 3.6
You play the lottery by choosing a set of 6 numbers from {1, 2, . . . , 49} without replacement.
Let X be a random variable representing the number of matches between your set and the
winning set. (The order of numbers in your set and the winning set does not matter.) You
win the grand prize if all 6 numbers match (i.e., if X = 6).
(a) What is the probability of winning the grand prize? Compute the PMF pX of X.
(b) Suppose that before playing the lottery, you (illegally) wiretap the phone of the lottery,
and learn that 2 of the winning numbers are between 1 and 20; another 2 are between
21 and 40, and the remaining 2 are between 41 and 49. If you use this information
wisely in choosing your six numbers, how does your probability of winning the grand
prize improve?
(c) Now suppose instead that you determine by illegal wiretapping that the maximum
number in the winning sequence is some fixed number R (note that R must be 6 or
larger). If you use this information wisely in choosing your 6 numbers, how does your
probability of winning the grand prize improve?
(d) Use a counting argument to establish the identity
Σ_{r=k}^{n} C(r − 1, k − 1) = C(n, k).
Solution:
(a) Among the 49 numbers used at the lottery, only 6 correspond to the winning sequence.
If the number of matches between our set and the winning set is k, then we must
have selected (without replacement and without ordering) exactly k elements from the
winning set of size 6 and 6 − k elements from the remaining set of 49 − 6 available
numbers. Then, we have that
P(X = k) = C(6, k) · C(43, 6 − k) / C(49, 6), for k = 0, 1, …, 6,

so the probability of winning the grand prize is

P(X = 6) = C(6, 6) · C(43, 0) / C(49, 6) = 1 / C(49, 6).
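These hypergeometric probabilities are straightforward to check with math.comb (a sketch):

```python
from math import comb

pmf = [comb(6, k) * comb(43, 6 - k) / comb(49, 6) for k in range(7)]
print(sum(pmf))      # 1.0 (up to round-off), by the Vandermonde identity
print(pmf[6])        # 1 / C(49, 6), roughly 7.15e-8
```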
(b) To use the information wisely, we select (without replacement and without ordering) two numbers from the set {1, …, 20}, two numbers from the set {21, …, 40}, and two numbers from the set {41, …, 49}. Among the C(20, 2) · C(20, 2) · C(9, 2) ways of selecting the six numbers, only one corresponds to the winning sequence. So the probability of winning is

1 / [C(20, 2) · C(20, 2) · C(9, 2)] = 1/1,299,600,

which is an improvement of roughly a factor of 10 over the case when no information is available.
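A one-line check of the improvement factor (a sketch):

```python
from math import comb

print(comb(49, 6) / (comb(20, 2) ** 2 * comb(9, 2)))   # about 10.76
```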
(c) Since we know that the maximum number in the winning sequence is R, we include R in our set of numbers. Next, using the information that R is the maximum, we select the remaining 5 numbers from the set {1, …, R − 1}. The probability of winning the grand prize becomes

1 / C(R − 1, 5).
(d) Imagine numbering in increasing order the n elements from which we sample k, and let r denote the maximum number selected. For a fixed r, the reasoning of part (c) tells us in how many ways the remaining k − 1 elements can be chosen from {1, …, r − 1}: namely C(r − 1, k − 1). Summing over all possible values r = k, …, n counts each k-element subset exactly once, which gives the result.
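The identity (sometimes called the hockey-stick identity) is also easy to verify numerically (a sketch):

```python
from math import comb

for n in range(1, 13):
    for k in range(1, n + 1):
        assert sum(comb(r - 1, k - 1) for r in range(k, n + 1)) == comb(n, k)
print("identity verified for all tested n and k")
```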