Spring 2017
CS537 – Probability in Computing
Homework 1
Homework 1, due Feb 1
You must prove your answer to every question by a convincing explanation
(in complete sentences). Just writing a formula or “yes/no” is not sufficient.
I encourage you to type up your problem sets. Else, please write (and draw)
them legibly and neatly, then scan them in. Typing up your work allows you
to revise it more easily, which means you are more likely to write clear and
sound solutions. I specifically encourage you to use the typesetting system
LaTeX: the initial effort of learning to use it, even in a basic manner, will
repay itself amply later.
The page http://latex-project.org/ftp.html provides distributions
for the most common operating systems. I recommend a good GUI front
end like TeXShop (on the Mac) or TeXworks (also on Linux and Windows)
with a built-in viewer that syncs between source and output (typically Ctrl-Click or Cmd-Click takes you back and forth). There are many short tutorials
available for LaTeX; see, for example, http://latex-tutorial.com.
Problem 1 (10 points) A box of 30 oranges contains 5 rotten ones. We draw 4
oranges randomly from the box. What is the probability that we encounter a
rotten one?
Solution. The probability that we don’t get a rotten one is
\[ \frac{25}{30} \cdot \frac{24}{29} \cdot \frac{23}{28} \cdot \frac{22}{27} \approx 0.46. \]
Indeed, if $E_i$ is the event that we don’t get a rotten orange at draw $i$, then
\[ P(E_1 \cap E_2 \cap E_3 \cap E_4) = P(E_1)\, P(E_2 \mid E_1)\, P(E_3 \mid E_1 \cap E_2)\, P(E_4 \mid E_1 \cap E_2 \cap E_3). \]
So the probability that we do get a rotten one is ≈ 0.54.
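As a quick check of the arithmetic, the following sketch computes the probability of drawing no rotten orange in two ways: by multiplying the conditional probabilities draw by draw, and by counting subsets of good oranges. The function names are illustrative only and are not part of the assignment.

```python
from fractions import Fraction
from math import comb


def prob_no_rotten_sequential(total, rotten, draws):
    """Multiply the conditional probabilities of drawing a good orange each time."""
    good = total - rotten
    p = Fraction(1)
    for i in range(draws):
        # After i good draws, good - i good oranges remain among total - i.
        p *= Fraction(good - i, total - i)
    return p


def prob_no_rotten_counting(total, rotten, draws):
    """Count the draws-element subsets that contain only good oranges."""
    good = total - rotten
    return Fraction(comb(good, draws), comb(total, draws))


p_none = prob_no_rotten_sequential(30, 5, 4)
p_at_least_one = 1 - p_none
print(float(p_none), float(p_at_least_one))
```

Both approaches give the same exact fraction, so the choice between them is a matter of taste.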
Problem 2 (10 points) You have taken a test for a certain genetic disorder.
The false negative rate of the test is small: if you have the disorder, the
probability that the test returns a positive result is 0.99. The false positive
rate is also small: if you do not have the disorder, the probability that the
test returns a positive result is only 0.015. Assume that 2% of the population
has the disorder. You don’t have any extra information about whether you
are disposed to this disorder, so you can assume that you have been selected
uniformly and randomly from the population. Under the condition that the
test came out positive for you, what is the probability that you have the
disorder?
Solution. We have
P{sick} = 0.02,
P{positive | sick} = 0.99,
P{healthy} = 0.98,
P{positive | healthy} = 0.015.
Hence
P{sick and positive} = P{sick} · P{positive | sick} = 0.02 · 0.99 = 0.0198,
P{healthy and positive} = P{healthy} · P{positive | healthy} = 0.01485,
P{positive} = 0.0198 + 0.01485 = 0.03465.
Finally,
\[ P\{\text{sick} \mid \text{positive}\} = \frac{P\{\text{sick}\} \cdot P\{\text{positive} \mid \text{sick}\}}{P\{\text{positive}\}} = \frac{0.0198}{0.03465} \approx 0.57. \]
Notice that the probability that the person is healthy is still quite large
(about 0.43). This is because the false positives, although a small percentage,
come from a very large healthy population, so they are not such a small
percentage of all the positives.
Problem 3 (10 points) A monkey types on a 26-letter keyboard that has lowercase letters only. Each letter is chosen independently and uniformly at random
from the alphabet. If the monkey types a million letters, what is the expected
number of times that the sequence “proof” appears?
Solution. Let $n = 1{,}000{,}000$. Let $X_i = 1$ if the sequence “proof” appears
starting at position $i$ and $0$ otherwise. Then we are interested in
$E \sum_{i=1}^{n} X_i = \sum_{i} E X_i$.
Now $E X_i$ is the probability that “proof” appears at position $i$. This is $26^{-5}$
for $i = 1, \dots, n - 4$ (and $0$ for the last four positions, where the word does
not fit), so the expected value is $(n - 4) \cdot 26^{-5} \approx 0.084$.
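The sketch below evaluates (n − 4) · 26^(−5) numerically; it comes out to roughly 0.084. It also checks the linearity-of-expectation argument by brute force on a tiny case that is not in the problem: the alphabet "ab", strings of length 6, and the word "aba", chosen because its occurrences can overlap. The helper names are illustrative.

```python
from itertools import product


def expected_occurrences(n, alphabet_size, word_len):
    """Linearity of expectation: (n - word_len + 1) start positions, each with
    probability alphabet_size ** -word_len."""
    return (n - word_len + 1) / alphabet_size ** word_len


def brute_force_expectation(n, alphabet, word):
    """Average the number of occurrences of word over all strings of length n."""
    total_occurrences = 0
    num_strings = 0
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        total_occurrences += sum(
            text.startswith(word, i) for i in range(n - len(word) + 1)
        )
        num_strings += 1
    return total_occurrences / num_strings


monkey_expectation = expected_occurrences(10**6, 26, 5)
small_exact = brute_force_expectation(6, "ab", "aba")
small_formula = expected_occurrences(6, 2, 3)
print(monkey_expectation, small_exact, small_formula)
```

The brute-force average agrees with the formula even though the indicator variables are not independent, which is exactly what linearity of expectation promises.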
Problem 4 A certain Italian candy manufacturer produces a kind of candy
called Baci. Each piece has an individual wrapping that includes a quotation
(about kisses), one out of 100 possible quotations. We assume that every time
you open a wrapping you encounter a quotation chosen uniformly at random from those 100.
(a) (15 points) What is the expected number of candies you will buy if you
keep buying them until you get all 100 quotations? [Hint: for each k,
compute the expected number of candies you will buy to get one more
quotation provided you already have k.]
Solution. Let $n = 100$. For each $k < n$, suppose that we already obtained
$k$ quotations, and let us compute the expected number of steps needed
to get to $k + 1$. Every candy opening gets us a new quotation with
probability $p = \frac{n-k}{n}$. In the probability course you must have learned
that this implies that the expected number of candy openings to get from
$k$ to $k + 1$ is $\frac{1}{p} = \frac{n}{n-k}$. So the expected number of candies to buy is
\[ \frac{n}{n} + \frac{n}{n-1} + \cdots + \frac{n}{1} = n \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \right). \]
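For a concrete number, the sum n/n + n/(n−1) + ... + n/1 can be evaluated exactly. The following sketch does this with rational arithmetic; the function name `expected_purchases` is illustrative.

```python
from fractions import Fraction


def expected_purchases(n):
    """Sum over k = 0, ..., n-1 of n / (n - k): the expected number of openings
    needed to go from k to k + 1 distinct quotations."""
    return sum(Fraction(n, n - k) for k in range(n))


e100 = expected_purchases(100)
print(float(e100))
```

For n = 100 this is a little under 519 candies.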
(b) (10 points) Give a simple approximate asymptotic expression for the
above number when the number of quotations is n in place of 100.
Solution. From calculus it is known that the sum in the parentheses is
within an additive constant of ln n, so the approximate answer is n ln n +
O(n).
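To see how good the approximation is, the sketch below compares the exact value n(1 + 1/2 + ... + 1/n) with n ln n for a few values of n. The per-quotation gap settles near 0.577 (the Euler-Mascheroni constant), which is the additive constant mentioned above. The function names are illustrative.

```python
import math


def harmonic(n):
    """Return 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))


def gap_per_quotation(n):
    """Return (n * H_n - n * ln n) / n, which equals H_n - ln n."""
    return harmonic(n) - math.log(n)


for n in (10, 100, 1000, 10000):
    exact = n * harmonic(n)
    approx = n * math.log(n)
    print(n, round(exact, 1), round(approx, 1), round(gap_per_quotation(n), 4))
```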
(c) (15 points) With n quotations instead of 100, estimate how many candies
you need to buy in order that with probability > 2/3, you get every quotation. [Hint: For a given k, estimate for each quotation the probability
that you don’t get it even after buying k candies. Then use the union
bound to estimate the probability that there is a quotation you still miss.
Then choose k to make this bound < 1/3.]
Solution. As before, write $n$ for the number of quotations (so $n = 100$ in the
original problem). For a certain quotation, every time you open a candy the
probability that you don’t find it is $1 - 1/n$. The probability that you don’t
find it after opening $k$ candies is $(1 - 1/n)^k$. We have seen that this is
$< e^{-k/n}$. There are $n$ quotations, so by the union bound the probability
that after $k$ trials at least one of them is not found is upper-bounded
by $n e^{-k/n}$. So we get the needed probability estimate if
\[ n e^{-k/n} < 1/3, \]
\[ e^{k/n} > 3n, \]
\[ k > n \ln(3n) = n \ln n + n \ln 3. \]
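For n = 100 the condition k > n ln 3n gives k = 571. The sketch below computes this threshold and the corresponding union bound. It also evaluates the exact probability of having every quotation after k candies by inclusion-exclusion; that formula is not part of the solution above and is included only as a sanity check on the bound. All function names are illustrative.

```python
import math


def union_bound(n, k):
    """Upper bound n * exp(-k/n) on the probability that some quotation is missing."""
    return n * math.exp(-k / n)


def purchases_needed(n):
    """Smallest integer k with k > n * ln(3n), so that the union bound is < 1/3."""
    return math.floor(n * math.log(3 * n)) + 1


def prob_all_collected(n, k):
    """Exact probability that k candies show all n quotations (inclusion-exclusion)."""
    return sum(
        (-1) ** j * math.comb(n, j) * (1 - j / n) ** k for j in range(n + 1)
    )


k100 = purchases_needed(100)
print(k100, union_bound(100, k100), prob_all_collected(100, k100))
```

The exact success probability at this k is somewhat above 2/3, as expected, since the union bound overestimates the failure probability.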