Varadhan`s lemma for large deviations

Varadhan’s lemma for large deviations
Jordan Bell
[email protected]
Department of Mathematics, University of Toronto
June 2, 2015
1
Large deviation principles
Let (X , d) be a Polish space. A function f : X → [−∞, ∞] is called lower
semicontinuous if c ∈ R implies that f −1 (c, ∞] is an open set in X ; equivalently, c ∈ R implies that f −1 [−∞, c] is a closed set in X ; equivalently, for any
convergent sequence xn in X ,1
f ( lim xn ) ≤ lim inf f (xn ).
n→∞
n
If K is a nonempty compact subset of X and f : X → [−∞, ∞] is lower semicontinuous, there is some x0 ∈ K such that f (x0 ) = inf x∈K f (x).2
A rate function is a function I : X → [0, ∞] such that (i) there is some
x ∈ X with I(x) < ∞ and (ii) for any c ∈ [0, ∞], f −1 ([0, c]) is compact (namely,
the sublevel sets of I are compact). Condition (ii) implies that a rate function
is lower semicontinuous.
For S ∈ BX , the Borel σ-algebra of X , we define
I(S) = inf I(x).
x∈S
Denote by P(X ) the collection of Borel probability measures on X and let I
be a rate function. A sequence Pn ∈ P(X ) is said to satisfy a large deviation
principle with rate n and with rate function I if for any closed subset C
of X ,
1
lim sup log Pn (C) ≤ −I(C)
n→∞ n
and for any open subset U of X ,
lim inf
n→∞
1
log Pn (U ) ≥ −I(U ).
n
1 http://individual.utoronto.ca/jordanbell/notes/semicontinuous.pdf, p. 2, Theorem 3.
2 http://individual.utoronto.ca/jordanbell/notes/semicontinuous.pdf, p. 5, Theorem 8.
1
Unless we say otherwise, when we speak about a large deviation principle we
shall use the rate n. Because Pn (X ) = 1 for each n, it follows that I(X ) = 0
and thus that inf x∈X I(x) = 0.
We first establish that if a sequence of Borel probability measures satisfies a
large deviation principle then its rate function is unique.3
Lemma 1. If Pn satisfy a large deviation principle with rate functions I and
J, then I = J.
Proof. Let x ∈ X and let BN = B1/N (x), the open ball with center x and radius
1/N . Then, because I(BN +1 ) ≤ I(cl(BN +1 )) and because cl(BN +1 ) ⊂ BN and
so I(cl(BN +1 )) ≤ I(BN ),
−I(x) ≤ −I(BN +1 )
1
≤ lim inf log Pn (BN +1 )
n→∞ n
≤ −J(cl(BN +1 ))
≤ −J(BN ).
Because J is lower semicontinuous, as N → ∞,
J(x) ≤ lim inf J(BN ),
N →∞
hence
−I(x) ≤ −J(x),
i.e. J(x) ≤ I(x). Likewise we obtain I(x) ≤ J(x), so I(x) = J(x), for any
x ∈ X.
2
Varadhan’s lemma
For sequences αn and βn of positive real numbers, we write
αn ' βn
if
lim
n→∞
1
(log αn − log βn ) = 0.
n
Lemma 2. αn + βn ' α ∨ βn .
Lemma 3. If A is a set and f, g : A → [−∞, ∞] are functions, then
inf f − inf g ≤ sup(f − g).
A
3 Frank
A
A
den Hollander, Large Deviations, p. 30,Theorem III.8.
2
Proof.
inf g = inf (g − f + f ) ≥ inf (g − f ) + inf f,
A
A
A
A
hence
inf f − inf g ≤ − inf (g − f ) = sup(f − g).
A
A
A
A
We now prove Varadhan’s lemma.4
Theorem 4 (Varadhan’s lemma). Suppose that Pn satisfies a large deviation
principle with rate function I. If F : X → R is continuous and bounded above,
then
Z
1
log
enF (x) dPn (x) = sup (F (x) − I(x)).
lim
n→∞ n
x∈X
X
Proof. Let
b = sup F (x),
a = sup (F (x) − I(x)).
x∈X
x∈X
Because F is bounded above, −∞ < b < ∞. Because I is not identically ∞,
−∞ < a ≤ b. For n ≥ 1 and S ∈ BX , define
Z
Jn (S) =
enF (x) dPn (x).
S
Let C = F −1 ([a, b]). For N ≥ 1 and 0 ≤ j ≤ N , define
cN,j = a +
for which
[a, b] =
N
[
j
(b − a),
N
[cN,j−1 , cN,j ].
j=1
For 1 ≤ j ≤ N , define
CN,j = F −1 ([cN,j−1 , cN,j ]),
for which
C = F −1 ([a, b]) =
N
[
CN,j .
j=1
Because F is continuous, each CN,j is closed, and hence applying the large
deviation principle,
lim sup
n→∞
4 Frank
1
log Pn (CN,j ) ≤ −I(CN,j ).
n
den Hollander, Large Deviations, p. 32, Theorem III.13.
3
For x ∈ CN,j , F (x) ≤ cN,j , and hence, for each N ,
Z
Jn (C) =
e
nF (x)
dPn (x) ≤
C
N Z
X
j=1
enF (x) dPn (x) ≤
CN,j
N
X
encN,j Pn (CN,j ).
j=1
Applying Lemma 2,
 



N
n
X
_
1 
log
lim
encN,j Pn (CN,j ) − log 
encN,j Pn (CN,j ) = 0,
n→∞ n
j=1
j=1
i.e.
 


N
N
_
1  X ncN,j
lim
log
e
Pn (CN,j ) −
log(encN,j Pn (CN,j )) = 0.
n→∞ n
j=1
j=1
Then applying the large deviation principle,
lim sup
n→∞
N
1
1 _
log Jn (C) ≤ lim sup
(ncN,j + log Pn (CN,j ))
n
n→∞ n
j=1
≤
N _
cN,j
j=1
≤
N
_
1
+ lim sup log Pn (CN,j )
n→∞ n
(cN,j − I(CN,j )) .
j=1
For x ∈ CN,j , F (x) ≥ cN,j−1 , and as cN,j = cN,j−1 +
cN,j ≤
inf F (x) +
x∈CN,j
1
N (b
− a),
1
(b − a),
N
and using this and Lemma 3 we get
lim sup
n→∞
N _
1
1
log Jn (C) ≤
inf F (x) + (b − a) − I(CN,j )
x∈CN,j
n
N
j=1
=
N _
1
(b − a) +
inf F (x) − inf I(x)
x∈CN,j
x∈CN,j
N
j=1
≤
N
_
1
(b − a) +
sup (F (x) − I(x))
N
j=1 x∈CN,j
1
(b − a) + sup (F (x) − I(x))
N
x∈C
1
≤ (b − a) + a.
N
=
4
Because this is true for all N ,
1
log Jn (C) ≤ a.
n
lim sup
n→∞
On the other hand, for x ∈ X \ C we have F (x) < a and hence
Z
Jn (X \ C) =
enF (x) dPn (x) ≤ ena ,
X \C
so
1
log Jn (X \ C) ≤ a.
n→∞ n
Lemma 2 tells us that Jn (C) + Jn (X \ C) ' Jn (C) ∨ Jn (X \ C):
lim sup
lim
n→∞
1
(log(Jn (C) + Jn (X \ C)) − log(Jn (C) ∨ Jn (X \ C))),
n
whence
lim sup
n→∞
1
1
log Jn (X ) = lim sup log(Jn (C) ∨ Jn (X \ C)) ≤ a,
n
n→∞ n
i.e.
lim sup
n→∞
1
log
n
Z
enF (x) dPn (x) ≤ sup (F (x) − I(x)).
x∈X
X
Let x ∈ X and let > 0, and define
Ux, = {y ∈ X : F (y) > F (x) − } = F −1 (F (x) − , ∞),
which is an open subset of X because F is continuous. Applying the large
deviation principle,
lim inf
n→∞
1
log Pn (Ux, ) ≥ −I(Ux, ).
n
Because x ∈ Ux, , I(Ux, ) ≤ I(x),
lim inf
n→∞
Together with
Z
Jn (Ux, ) =
e
Ux,
nF (y)
1
log Pn (Ux, ) ≥ −I(x).
n
Z
dPn (y) ≥
en(F (x)−) dPn (y) = en(F (x)−) Pn (Ux, ),
Ux,
this yields
lim inf
n→∞
1
1
log Jn (X ) ≥ lim inf log Jn (Ux, )
n→∞
n
n
1
≥ lim inf log en(F (x)−) Pn (Ux, )
n→∞ n
1
= F (x) − + lim inf log Pn (Ux, )
n→∞ n
≥ F (x) − − I(x).
5
Because this is true for all > 0,
lim inf
n→∞
1
log Jn (X ) ≥ F (x) − I(x).
n
Because this is true for all x ∈ X ,
lim inf
n→∞
1
log Jn (X ) ≥ sup (F (x) − I(x)),
n
x∈X
i.e.,
lim inf
n→∞
1
log
n
Z
enF (x) dPn (x) ≥ sup (F (x) − I(x)),
x∈X
X
which completes the proof.
The following theorem states that if a sequence of Borel probability measures
satisfies a large deviation principle then the tilted sequence of Borel probability
measures satisfies a large deviation principle with a tilted rate function.5
Theorem 5. Suppose that Pn ∈ P(X ) satisfy a large deviation principle with
rate function I, and let F : X → R be continuous and bounded above. Define
for n ≥ 1 and S ∈ BX ,
Z
Jn (S) =
enF (x) dPn (x).
S
Then the sequence PnF ∈ P(X ) defined by
PnF (S) =
Jn (S)
,
Jn (X )
S ∈ BX ,
satisfies a large deviation principle with rate function
I F (x) = sup (F (y) − I(y)) − (F (x) − I(x)).
y∈X
5 Frank
den Hollander, Large Deviations, p. 34,Theorem III.17.
6