Second Assignment: Solutions
1. Let $I_n := \bigcap_{i \ge n} B_i$ and $D_n := \bigcup_{i \ge n} B_i$. Then $I_n \uparrow B$ and $D_n \downarrow B$. Since $I_n \subset B_n \subset D_n$, we have $P(I_n) \le P(B_n) \le P(D_n)$. If we show that $P(I_n) \to P(B)$ and $P(D_n) \to P(B)$, as $n \to \infty$, we are done. Since the sequence $I_n$ is increasing, $B = I_1 \cup \bigcup_{j=2}^{\infty} (I_j \setminus I_{j-1})$, and since the sets $I_1, I_2 \setminus I_1, I_3 \setminus I_2, \ldots$ are disjoint, we have
$$P(B) = P(I_1) + \sum_{j=2}^{\infty} P(I_j \setminus I_{j-1}) = P(I_1) + \sum_{j=2}^{\infty} [P(I_j) - P(I_{j-1})] = P(I_1) + \lim_{n \to \infty} [P(I_n) - P(I_1)] = \lim_{n \to \infty} P(I_n).$$
Since $D_n^c \uparrow B^c$, we have $P(D_n^c) \to P(B^c)$, and so $P(D_n) \to P(B)$. Done.
2. (i) Let $S_n$ be the number of heads in $n$ coin tosses. Then, by Stirling,
$$P(S_n = n/2) \sim \sqrt{2/\pi n}, \quad \text{as } n \to \infty.$$
With $n = 60000$, we get $P(S_{60000} = 30000) \approx 0.0033$. This is a remarkably high probability!
(ii) If $k(n)/n \to \alpha$, as $n \to \infty$, then
$$P(S_n = k(n)) \sim \frac{1}{\sqrt{2\pi \alpha(1-\alpha)n}}\, e^{-n\left(\log 2 - \alpha \log\frac{1}{\alpha} - (1-\alpha)\log\frac{1}{1-\alpha}\right)}.$$
With $\alpha := 31000/60000$, we find $P(S_{60000} = 31000) \approx 1.1 \times 10^{-17}$. This is a remarkably small probability!
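Both asymptotics are easy to check numerically. Here is a quick check in Python (the only code in these solutions is the Maple script of Problem 7; this sketch and all names in it are mine), computing the exact binomial probabilities through log-factorials to avoid overflow:

from math import lgamma, log, exp, pi, sqrt

def log_pmf(n, k):
    # log P(S_n = k) for a fair coin: log C(n, k) - n log 2
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) - n * log(2)

n = 60000
print(exp(log_pmf(n, 30000)), sqrt(2 / (pi * n)))   # (i): both ~ 0.0033
a = 31000 / n
stirling = exp(-n * (log(2) - a * log(1 / a) - (1 - a) * log(1 / (1 - a)))) \
           / sqrt(2 * pi * a * (1 - a) * n)
print(exp(log_pmf(n, 31000)), stirling)             # (ii): both ~ 1.1e-17

In each printed pair, the exact value and the Stirling approximation agree to several digits.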
(iii) We want to compute $P(|S_n - (n/2)| > \varepsilon n)$, where $\varepsilon = 0.005$. One easy bound is by Chebyshev's inequality:
$$P(|S_n - (n/2)| > \varepsilon n) \le \frac{\mathrm{var}(S_n)}{\varepsilon^2 n^2} = \frac{n/4}{\varepsilon^2 n^2} = \frac{1}{4\varepsilon^2 n} = 1/6.$$
On the other hand, the CLT tells us that
$$P(|S_n - (n/2)| > \varepsilon n) = P\Big(\frac{|S_n - (n/2)|}{\sqrt{n/4}} > 2\varepsilon\sqrt{n}\Big) \approx P(|Z| > 2\varepsilon\sqrt{n}) = P(|Z| > 2.45) = 2\int_{2.45}^{\infty} (2\pi)^{-1/2} e^{-x^2/2}\, dx \approx 0.014,$$
which is much smaller than 1/6, and in fact very close to the actual probability: the exact binomial computation also gives about 0.014.
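For comparison, a Python sketch computing all three numbers (the Chebyshev bound, the CLT approximation, and the exact binomial tail):

from math import lgamma, log, exp, erf, sqrt

n, eps = 60000, 0.005
def pmf(k):
    return exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) - n * log(2))

exact = 1 - sum(pmf(k) for k in range(29700, 30301))  # P(|S_n - 30000| > 300)
chebyshev = 1 / (4 * eps**2 * n)
z = 2 * eps * sqrt(n)
clt = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))          # P(|Z| > z), z ~ 2.45
print(chebyshev, clt, exact)                          # ~ 0.167, ~ 0.014, ~ 0.014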
(iv) The correct answer is (c). To see this, compute the ratios
$$\rho(k) := \frac{P(S_{60000} = 59500 + k)}{P(S_{60000} = 59500)} = \frac{\binom{60000}{59500+k}}{\binom{60000}{59500}}, \quad k = 1, 2, 3.$$
We find, respectively, about $8 \times 10^{-3}$, $7 \times 10^{-5}$, $6 \times 10^{-7}$. Given that the unlikely event that you threw at least 59500 heads happened, chances are that the number of heads is exactly 59500.
In fact, $P(S_{60000} = 59500 \mid S_{60000} \ge 59500) \approx 0.991$, but $P(S_{60000} = 59500) \approx 2 \times 10^{-16808}$.
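A Python check of these ratios and of the conditional probability (a sketch, reusing the log-pmf idea above):

from math import lgamma, log, exp

n, m = 60000, 59500
def log_pmf(k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) - n * log(2)

rho = [exp(log_pmf(m + k) - log_pmf(m)) for k in range(0, n - m + 1)]
print(rho[1], rho[2], rho[3])   # ~ 8.4e-3, 7.0e-5, 5.9e-7
print(1 / sum(rho))             # P(S = 59500 | S >= 59500) ~ 0.9917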
3. (ii) If $f$ is continuous then, by uniform continuity, the sum on the left side converges to the Riemann integral of $f$ when $\mu$ is the Lebesgue measure.
(i) The sum on the left side is the integral of a simple function $\varphi_n$ against $\mu$. By uniform continuity, $\varphi_n \to f$, as $n \to \infty$, pointwise on $(0,1]$. By, for example, the dominated convergence theorem, the integral of $\varphi_n$ converges to $\int_{(0,1]} f \, d\mu$.
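For the Lebesgue case, here is a small Python illustration (a sketch only; I am assuming the sum in question is the right-endpoint sum $\sum_{k=1}^n f(k/n)\,\mu\big((\tfrac{k-1}{n}, \tfrac{k}{n}]\big)$, which the problem statement makes precise):

from math import sin, pi

def simple_sum(f, n):
    # sum_k f(k/n) * mu((k-1)/n, k/n], with mu = Lebesgue measure on (0, 1]
    return sum(f(k / n) for k in range(1, n + 1)) / n

f = lambda x: sin(pi * x)
for n in (10, 100, 1000, 10000):
    print(n, simple_sum(f, n))   # converges to the integral 2/pi ~ 0.6366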
4. Let $X, Y$ be i.i.d. uniform random variables in $[0,1]$. Then
$$(f(X) - f(Y))(g(X) - g(Y)) \ge 0, \quad \text{a.s.},$$
because the functions are increasing. Hence
$$E[(f(X) - f(Y))(g(X) - g(Y))] \ge 0,$$
and, by expansion of the product (the left side equals $2E[f(X)g(X)] - 2E[f(X)]\,E[g(X)]$, since $X$ and $Y$ are i.i.d.), this proves the required inequality.¹

¹ Of course, I could have rephrased this by starting with $(f(x) - f(y))(g(x) - g(y)) \ge 0$ for all $x, y$, then integrating with respect to the Lebesgue measure on $[0,1]^2$ and then using Fubini's theorem. The result is the same, the logic is the same, the language is different.
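A quick Monte Carlo sanity check in Python (a sketch: I am assuming the required inequality is $E[f(X)g(X)] \ge E[f(X)]\,E[g(X)]$, i.e., $\int_0^1 fg \ge \int_0^1 f \int_0^1 g$; the two increasing functions below are arbitrary choices of mine):

import random

random.seed(0)
f = lambda x: x**3             # an increasing function
g = lambda x: x + 0.5 * x**2   # another increasing function

N = 10**6
xs = [random.random() for _ in range(N)]
mean_fg = sum(f(x) * g(x) for x in xs) / N
mean_f = sum(f(x) for x in xs) / N
mean_g = sum(g(x) for x in xs) / N
print(mean_fg, mean_f * mean_g)   # ~ 0.283 >= ~ 0.167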
5. (i) We have to assume $P(A_n) \to 0$ here. Here is a way a probabilist would prove this. Let $X_n := 1_{A_n}$.² We are being told that $P(X_n = 1) \to 0$ and that $\sum_n P(X_n = 1, X_{n+1} = 0) < \infty$. The first requirement tells us that $P(X_n = 1 \text{ eventually}) = 0$, because $P(X_n = 1 \text{ eventually}) = \lim_n P(\forall k \ge n,\ X_k = 1) \le \lim_n P(X_n = 1) = 0$. In other words, the sequence $(X_n)$ cannot be eventually equal to $111111111\cdots$, almost surely. Therefore, almost surely, there are infinitely many 0's:
$$P(X_n = 0 \text{ for infinitely many } n) = 1. \qquad \text{(*)}$$
The other requirement tells us that $E\sum_n 1(X_n = 1, X_{n+1} = 0) < \infty$. In particular, $\sum_n 1(X_n = 1, X_{n+1} = 0) < \infty$, a.s. That is,
$$\text{the number of transitions from 1 to 0 is finite, almost surely.} \qquad \text{(**)}$$
Now, if (*) and (**) are simultaneously true, that is, if a sequence of 0's and 1's has the properties that it can't be eventually equal to 1 and that it can't have an infinite number of transitions from 1 to 0, the only possibility left is that the number of 1's is finite, i.e.,
$$P(X_n = 1 \text{ for infinitely many } n) = 0,$$
and this, translated in terms of the $A_n$'s, means exactly what we want to prove.³
(ii) We need to show that $Y_n := -n \log M_n / \log n \to 0$, a.s. Since $Y_n \ge 0$, for all $n$, a.s., this amounts to showing that, for all $\varepsilon > 0$, we have $P(Y_n > \varepsilon \text{ for infinitely many } n) = 0$. A sufficient condition for this is given by (i) above:
$$\sum_n P(Y_n > \varepsilon,\ Y_{n+1} \le \varepsilon) < \infty.$$
(The other hypothesis of (i) holds too: $P(Y_n > \varepsilon) = P(M_n < n^{-\varepsilon/n}) = n^{-\varepsilon} \to 0$.) To show that this is true, let us compute the $n$-th term:
$$P(Y_n > \varepsilon,\ Y_{n+1} \le \varepsilon) = P\big(M_n < n^{-\varepsilon/n},\ M_{n+1} > (n+1)^{-\varepsilon/(n+1)}\big).$$
Since $c(n) := n^{-\varepsilon/n}$ is eventually increasing, we have, for $n$ large enough, that this is further equal to
$$P\big(X_1, \ldots, X_n < c(n),\ X_{n+1} > c(n+1)\big) = c(n)^n (1 - c(n+1)).$$
² Every statement about the $A_n$'s can be translated into a statement about the $X_n$'s, right?
³ The proof I gave is a proof, but if you want to express it in a different language, here it is: Let $n < N$ and write
$$\bigcup_{k=n}^{N} A_k = A_N \cup \bigcup_{k=n}^{N-1} \big(A_k \cap A_{k+1}^c \cap \cdots \cap A_N^c\big),$$
where the union on the right is a union of disjoint sets. Therefore,
$$P\Big(\bigcup_{k=n}^{N} A_k\Big) = P(A_N) + \sum_{k=n}^{N-1} P\big(A_k \cap A_{k+1}^c \cap \cdots \cap A_N^c\big) \le P(A_N) + \sum_{k=n}^{N-1} P\big(A_k \cap A_{k+1}^c\big).$$
Taking the limit as $N \to \infty$, we obtain
$$P\Big(\bigcup_{k=n}^{\infty} A_k\Big) \le \lim_{N} P(A_N) + \sum_{k=n}^{\infty} P\big(A_k \cap A_{k+1}^c\big) = \sum_{k=n}^{\infty} P\big(A_k \cap A_{k+1}^c\big).$$
Taking the limit as $n \to \infty$, and using the assumption that $\sum_{k=1}^{\infty} P(A_k \cap A_{k+1}^c) < \infty$, we obtain $\lim_{n\to\infty} P\big(\bigcup_{k=n}^{\infty} A_k\big) = 0$, as required.
So, all we have to check is that $\sum_n c(n)^n (1 - c(n+1)) < \infty$, or (because $1 - c(n+1) \le 1 - c(n)$ eventually) that
$$\sum_n c(n)^n (1 - c(n)) < \infty.$$
We have
$$c(n)^n (1 - c(n)) = n^{-\varepsilon} - n^{-\varepsilon(1 + \frac{1}{n})} = \frac{n(1 - n^{-\varepsilon/n})}{n^{1+\varepsilon}} \le \frac{\varepsilon \log n}{n^{1+\varepsilon}},$$
where we used the inequality $1 - e^{-x} \le x$. Clearly, $(\log n)/n^{1+\varepsilon}$ is summable, and so our job is done.
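A one-path simulation sketch in Python (illustration only; here $M_n$ is the running maximum of i.i.d. uniforms, as in the problem):

import random
from math import log

random.seed(1)
M = 0.0
for n in range(1, 10**6 + 1):
    M = max(M, random.random())          # M_n = max(X_1, ..., X_n)
    if n in (10**3, 10**4, 10**5, 10**6):
        print(n, -n * log(M) / log(n))   # Y_n; tends to 0 (slowly)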
6. Since (ii) implies (i), we prove (ii). We have
$$B_n f(x) = E[f(G_n(x))], \quad \text{where } G_n(x) = \frac{1}{n} \sum_{i=1}^{n} 1\{X_i \le x\},$$
and where $X_1, \ldots, X_n$ are i.i.d. random variables, uniformly distributed in the interval $[0,1]$ (so that $E[G_n(x)] = x$). We have
$$|B_n f(x) - f(x)| = |E[f(G_n(x)) - f(x)]| \le E|f(G_n(x)) - f(x)|$$
$$= E\big[|f(G_n(x)) - f(x)|\, 1(|G_n(x) - x| \le \delta)\big] + E\big[|f(G_n(x)) - f(x)|\, 1(|G_n(x) - x| > \delta)\big].$$
The first term is smaller than $m(\delta)$, the modulus of continuity of $f$ at $\delta$. The last term is smaller than $2\|f\|_\infty$ times
$$P(|G_n(x) - x| > \delta) = P(|G_n(x) - x|^2 > \delta^2) \le \frac{1}{\delta^2} E|G_n(x) - x|^2 = \frac{1}{\delta^2 n^2} \sum_{i=1}^{n} E(1\{X_i \le x\} - x)^2$$
$$= \frac{1}{\delta^2 n^2}\, n E(1\{X_1 \le x\} - x)^2 = \frac{1}{\delta^2 n} \big(x^2(1-x) + (1-x)^2 x\big) \le \frac{1}{\delta^2 n} \cdot \frac{1}{4}.$$
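Since $nG_n(x)$ is a Binomial$(n, x)$ random variable, $B_n f(x) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1-x)^{n-k}$, the $n$-th Bernstein polynomial of $f$. A small Python sketch of the uniform convergence (the test function below is an arbitrary choice of mine):

from math import comb

def bernstein(f, n, x):
    # B_n f(x) = E[f(G_n(x))], where n*G_n(x) ~ Binomial(n, x)
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)   # continuous, but not differentiable at 1/2
grid = [i / 200 for i in range(201)]
for n in (10, 100, 1000):
    print(n, max(abs(bernstein(f, n, x) - f(x)) for x in grid))  # sup-error decreases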
7. Let $N$ be any positive random variable, independent of $(X, Y)$. I adopt the following strategy: if $Y \le N$, I pass; otherwise, I play. On the event
$$W := \{Y > N,\ Y > X\},$$
I gain 1 unit. On the event
$$L := \{Y > N,\ Y \le X\},$$
I lose 1 unit. On the event $(W \cup L)^c = \{Y \le N\}$, I do not win or lose anything, because I pass. The expected net winnings per game are
$$P(W) - P(L).$$
Note that
$$P(L) = P(Y > N,\ Y \le X) = P(X > N,\ X \le Y) =: P(L^*),$$
because $(N, X, Y) \overset{d}{=} (N, Y, X)$. But
$$\{X > N,\ X \le Y\} \subset \{Y > N,\ X \le Y\},$$
so
$$P(L) = P(L^*) \le P(Y > N,\ X \le Y) = P(Y > N,\ X < Y) = P(W),$$
the last equality because $P(X = Y) = 0$ ($F$ being continuous). The inequality is actually strict. Indeed, $L^* \subset W$ up to the null event $\{X = Y\}$, and
$$W \setminus L^* = \{Y > N \ge X\}.$$
So
$$P(W) - P(L) = P(W) - P(L^*) = P(W \setminus L^*) = P(X < N < Y) > 0,$$
because $F$ is strictly increasing. If we therefore let $\xi_n$ be the net winnings on game $n$ and $M_n := \xi_1 + \cdots + \xi_n$ the total winnings after $n$ games, then $\xi_1, \xi_2, \ldots$ are i.i.d. random variables with $E[\xi_n] = P(W) - P(L) > 0$. By the SLLN, $P(M_n/n \to P(W) - P(L)) = 1$, and so $P(M_n \to \infty) = 1$.
The following Maple code shows that choosing the distribution of N carefully may speed up the
rate at which I make money.
with(Statistics): with(plots):
# the casino's unknown distribution F: exponential with mean 10
casino := -10*log(RandomVariable(Uniform(0, 1))):
# my threshold N: almost deterministic, concentrated near 4
decision := RandomVariable(Normal(4, 0.001)):
S[0] := 0:
for n to 1000 do
  Y := Sample(casino, 1); N := Sample(decision, 1); X := Sample(casino, 1);
  if Y[1] < N[1] then S[n] := S[n-1]       # pass: no gain, no loss
  elif Y[1] > X[1] then S[n] := S[n-1]+1   # play and win
  else S[n] := S[n-1]-1                    # play and lose
  end if
end do:
plotS := seq([k, S[k]], k = 0 .. 1000): pointplot({plotS}, style = line);
Here, the unknown $F$ is taken to be exponential with mean 10. I choose $N$ to be almost deterministic with mean 4. There is an optimal value of this mean: both a small mean and a large mean reduce the rate at which $M_n$ increases.
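As a cross-check, the per-game edge $P(W) - P(L) = P(X < N < Y)$ can be estimated by simulation. A Python sketch (taking $N \equiv 4$ deterministic, close to the almost-deterministic choice above):

import random

random.seed(2)
games, mean, thresh = 10**6, 10.0, 4.0
net = 0
for _ in range(games):
    y = random.expovariate(1 / mean)   # my card
    x = random.expovariate(1 / mean)   # the opposing card
    if y > thresh:                     # otherwise I pass
        net += 1 if y > x else -1
print(net / games)   # ~ F(4)(1 - F(4)) = (1 - e^(-0.4)) e^(-0.4) ~ 0.221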
8. The increments $X_j$ of the random walk are i.i.d. random variables with
$$P(X_j = 2^k - 1) = p_k = \frac{1}{k(k+1)2^k}, \quad k \ge 1, \qquad P(X_j = -1) = p_0 := 1 - \sum_{k=1}^{\infty} p_k,$$
and we have $E[X_1] = 0$: indeed, $E[X_1] = \sum_{k \ge 1} (2^k - 1) p_k - p_0 = \sum_{k \ge 1} 2^k p_k - 1 = \sum_{k \ge 1} \frac{1}{k(k+1)} - 1 = 0$. Let us separate the positive and negative parts of $X_j$. Define $Y_j := X_j + 1$, so that $Y_j$ is a nonnegative random variable with
$$P(Y_j = 2^k) = p_k, \quad k \ge 1, \qquad P(Y_j = 0) = p_0.$$
Then $E[Y_1] = 1$. Let
$$T_n := \sum_{j=1}^{n} Y_j = \sum_{j=1}^{n} X_j + n = S_n + n.$$
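A quick numerical sanity check of these moments in Python (a sketch; the series is truncated at $k = 1000$):

p = {k: 1.0 / (k * (k + 1) * 2**k) for k in range(1, 1001)}
p0 = 1 - sum(p.values())
EX = sum((2**k - 1) * pk for k, pk in p.items()) - p0
EY = sum(2**k * pk for k, pk in p.items())
print(p0, EX, EY)   # ~ 0.693, ~ 0, ~ 1 (up to ~1e-3 truncation error)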
We wish to prove that, for $0 < \alpha < 1 < \beta$,
$$P(-\beta c_n \le T_n - n \le -\alpha c_n) \to 1.$$
Equivalently, that
$$P(T_n - n > -\alpha c_n) \to 0, \qquad \text{(up)}$$
$$P(T_n - n < -\beta c_n) \to 0. \qquad \text{(down)}$$
We prove (up) first. If $T_n$ had finite variance, we would be able to use Chebyshev's inequality and hope for the best. But $T_n$ has infinite variance. Let us then define
$$\widetilde{T}_n := \sum_{j=1}^{n} Y_j 1(Y_j \le \tau_n),$$
which surely has finite variance, for an appropriate sequence $\tau_n \uparrow \infty$, and hope that we can prove that
$$P(\widetilde{T}_n - n > -\alpha c_n) \to 0.$$
Then, since
$$P(T_n - n > -\alpha c_n) \le P(\widetilde{T}_n - n > -\alpha c_n) + P(\widetilde{T}_n \ne T_n),$$
we will be able to conclude if $\tau_n$ is chosen so that
$$P(\widetilde{T}_n \ne T_n) \to 0. \qquad \text{(c)}$$
So let us try to choose $\tau_n$ so that this happens. Notice that
$$P(\widetilde{T}_n \ne T_n) \le \sum_{j=1}^{n} P(Y_j > \tau_n) = n P(Y_1 > \tau_n),$$
so it suffices to choose $\tau_n$ so that the last term goes to 0. But⁴
$$P(Y_1 > \tau) = o\Big(\frac{1}{\tau \log_2 \tau}\Big), \qquad \text{(tail)}$$
so we pick $\tau_n$ in order that $n P(Y_1 > \tau_n) \to 0$. But
$$n P(Y_1 > \tau_n) = o\Big(\frac{n}{\tau_n \log_2 \tau_n}\Big) = o(1),$$
provided that, e.g.,
$$\tau_n := \frac{n}{\log_2 n} = c_n. \qquad \text{(tau)}$$

⁴ To see this, we estimate $P(Y > 2^m) = \sum_{k=m}^{\infty} \frac{2^{-k}}{k(k+1)}$. In analogy to the integral $\int_t^{\infty} e^{-x} x^{-2}\, dx$ which, by integration by parts, is of lower order than $e^{-t}/t$, we make the guess that $P(Y > 2^m) = o(2^{-m}/m)$. To prove this, write
$$m 2^m P(Y > 2^m) = \sum_{k=m}^{\infty} \frac{m}{k(k+1)} 2^{m-k} = \sum_{k=0}^{\infty} \frac{m}{(m+k)(m+k+1)} 2^{-k} =: \sum_{k=0}^{\infty} f_m(k) 2^{-k}.$$
Since $f_m(k) \le 1$, and $\lim_{m\to\infty} f_m(k) = 0$, by the dominated convergence theorem we can take the limit outside the summation and verify that our guess was correct. So (tail) holds.
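The (tail) estimate is also easy to observe numerically, using the rescaled sum from footnote 4 (Python sketch):

def scaled_tail(m, terms=2000):
    # m * 2^m * P(Y > 2^m) = sum_{k>=0} m / ((m+k)(m+k+1)) * 2^(-k)
    return sum(m / ((m + k) * (m + k + 1)) * 2.0**(-k) for k in range(terms))

for m in (10, 100, 1000, 10000):
    print(m, scaled_tail(m))   # tends to 0, at rate roughly 1/m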
With this choice, (c) holds, and we see that⁵
$$E[\widetilde{T}_n] - n \sim -c_n, \quad \text{as } n \to \infty.$$
Then, for $0 < \varepsilon < 1 - \alpha$, we have that, for all large $n$,
$$P(\widetilde{T}_n - n > -\alpha c_n) = P\big(\widetilde{T}_n - E[\widetilde{T}_n] > (n - E[\widetilde{T}_n]) - \alpha c_n\big) \le P\big(\widetilde{T}_n - E[\widetilde{T}_n] > \varepsilon c_n\big) \le \frac{E\big(\widetilde{T}_n - E[\widetilde{T}_n]\big)^2}{\varepsilon^2 c_n^2}.$$
If we are lucky, we should find that the last term tends to 0 as $n \to \infty$.⁶ But
$$E\big(\widetilde{T}_n - E[\widetilde{T}_n]\big)^2 \le n E\big[Y_1^2 1(Y_1 \le c_n)\big],$$
and we should check whether this is $o(c_n^2)$, or, equivalently, whether
$$n\, E\Big[Y_1^2 1\Big(Y_1 \le \frac{n}{\log_2 n}\Big)\Big] = o\Big(\frac{n^2}{(\log_2 n)^2}\Big).$$
It is enough to show (why?) that
$$E\big[Y_1^2 1(Y_1 \le 2^m)\big] = o\Big(\frac{2^m}{m}\Big), \quad \text{as } m \to \infty,$$
or that
$$\frac{m}{2^m} \sum_{k=1}^{m} \frac{2^{2k}}{k(k+1)2^k} \to 0.$$
We can either show this directly (by computation of the sum, using integration by parts), or by observing that the quantity on the left equals
$$\sum_{k=1}^{m} \frac{m}{k(k+1)} 2^{k-m} = \sum_{\ell=0}^{\infty} \frac{m\, 1(\ell \le m-1)}{(m-\ell)(m-\ell+1)} 2^{-\ell} =: \sum_{\ell=0}^{\infty} f_m(\ell) 2^{-\ell},$$
and that (i) $\lim_{m\to\infty} f_m(\ell) = 0$ and (ii) $\lim_{K\to\infty} \sup_m \sum_{\ell=0}^{\infty} f_m(\ell) 1(f_m(\ell) \ge K) 2^{-\ell} = 0$, which allows us to pass the limit as $m \to \infty$ inside the summation.
The convergence (down) is proved in the same manner.
No, the result does not violate the strong law of large numbers (SLLN). The SLLN says that $T_n/n \to 1$, a.s., or that $(T_n - n)/n \to 0$, a.s. We proved that $(T_n - n)/c_n = \log_2 n \cdot \frac{T_n - n}{n}$ approaches $-1$ in the weak sense: the probability that the sequence is outside an open neighborhood of $-1$ tends to 0.
If we think of $T_n$ as the fortune we make in a game of chance which asks us to pay 1 unit per unit of time to take part in, then the game is "fair" because $E[T_n] = n$ and $T_n/n \to 1$, as $n \to \infty$. But with probability approaching 1, $T_n$ will drop below its "fair value" $n$ by a quantity $\alpha c_n$ (which goes to $\infty$). Nothing wrong with that: we proved it. What is wrong, perhaps, is one's definition of "fairness".
⁵ This is easy to see, since $E[Y 1(Y \ge 2^m)] = \sum_{k=m}^{\infty} \frac{1}{k(k+1)} = \frac{1}{m}$. Hence $m E[Y 1(Y \ge 2^m)] = 1$ for all $m$, and so $(\log_2 c_n) E[Y 1(Y \ge c_n)] \to 1$, as $n \to \infty$, implying that $(\log_2 n) E[Y 1(Y \ge c_n)] \to 1$, as $n \to \infty$, and so $(\log_2 n)(1 - E[\widetilde{T}_n]/n) \to 1$.
⁶ And if we are not lucky, we should go back and examine the places where energy was lost due to inequalities.