Stein's lemma, Malliavin calculus, and tail bounds, with application to polymer fluctuation exponent
Frederi G. Viens
Dept. Statistics and Dept. Mathematics
Purdue University
150 N. University St.
West Lafayette, IN 47907-2067, USA;
[email protected]
July 3, 2009
Abstract
We consider a random variable $X$ satisfying almost-sure conditions involving $G := \langle DX, -DL^{-1}X \rangle_H$, where $DX$ is $X$'s Malliavin derivative and $L^{-1}$ is the pseudo-inverse of the generator of the Ornstein-Uhlenbeck semigroup. A lower- (resp. upper-) bound condition on $G$ is proved to imply a Gaussian-type lower (resp. upper) bound on the tail $P[X > z]$. Bounds of other natures are also given. A key ingredient is the use of Stein's lemma, including the explicit form of the solution of Stein's equation relative to the function $\mathbf{1}_{x > z}$, and its relation to $G$. Another set of comparable results is established, without the use of Stein's lemma, using instead a formula for the density of a random variable based on $G$, recently devised by the author and Ivan Nourdin. As an application, via a Mehler-type formula for $G$, we show that the Brownian polymer in a Gaussian environment which is white-noise in time and positively correlated in space has deviations of Gaussian type and a fluctuation exponent $\chi = 1/2$. We also show this exponent remains $1/2$ after a non-linear transformation of the polymer's Hamiltonian.
Key words and phrases: Malliavin calculus, Wiener chaos, sub-Gaussian, Stein's lemma, polymer, Anderson model, random media, fluctuation exponent.
AMS 2000 MSC codes: primary 60H07; secondary 60G15, 60K37, 82D60
1 Introduction

1.1 Background and context
Ivan Nourdin and Giovanni Peccati have recently made a long-awaited connection between Stein's lemma and the Malliavin calculus: see [9], and also [10]. Our article uses crucial basic elements from their work to investigate the behavior of square-integrable random variables whose Wiener chaos expansions are not finite. Specifically, we devise conditions under which the tail of a random variable is bounded below by Gaussian tails, by using Stein's lemma and the Malliavin calculus. Our article also derives similar lower bounds by way of a new formula for the density of a random variable, established in [12], which uses Malliavin calculus but not Stein's lemma. Tail upper bounds are also derived, using both methods.
Author's research partially supported by NSF grant 0606615.
Stein's lemma has been used in the past for Gaussian upper bounds, e.g. in [3] in the context of exchangeable pairs. Malliavin derivatives have been invoked for similar upper bounds in [22]. In the current paper, the combination of these two tools yields a novel criterion for a Gaussian tail lower bound. We borrow a main idea from Nourdin and Peccati [9], and also from [12]: to understand a random variable $Z$ which is measurable with respect to a Gaussian field $W$, it is fruitful to consider the random variable
$$G := \langle DZ, -DL^{-1}Z \rangle_H,$$
where $D$ is the Malliavin derivative relative to $W$, $\langle \cdot, \cdot \rangle_H$ is the inner product in the canonical Hilbert space $H$ of $W$, and $L$ is the generator of the Ornstein-Uhlenbeck semigroup. Details on $D$, $H$, $L$, and $G$ will be given below.
The function $g(z) = E[G \mid Z = z]$ has already been used to good effect in the density formula discovered in [12]; this formula implied new lower bounds on the densities of some Gaussian processes' suprema. These results are made possible by fully using the Gaussian property, and in particular by exploiting both upper and lower bounds on the process's covariance. The authors of [12] noted that, if $Z$ has a density and an upper bound is assumed on $G$, in the absence of any other assumption on how $Z$ is related to the underlying Gaussian process $W$, then $Z$'s tail is sub-Gaussian. On the other hand, the authors of [12] tried to discard any upper bound assumption, and assume instead that $G$ was bounded below, to see if they could derive a Gaussian lower bound on $Z$'s tail; they succeeded in this task, but only partially, as they had to impose some additional conditions on $Z$'s function $g$, which are of upper-bound type, and which may not be easy to verify in practice.
The techniques used in [12] are well adapted to studying densities of random variables under simultaneous lower and upper bound assumptions, but less so under single-sided assumptions. The point of the current paper is to show that, while the quantitative study of densities via the Malliavin calculus seems to require two-sided assumptions as in [12], single-sided assumptions on $G$ are in essence sufficient to obtain single-sided bounds on tails of random variables, and there are two strategies to this end: Nourdin and Peccati's connection between Malliavin calculus and Stein's lemma, and exploiting the Malliavin-calculus-based density formula in [12].
A key new component in our work, relative to the first strategy, may be characterized by saying that, in addition to a systematic exploitation of the Stein-lemma and Malliavin-calculus connection (via Lemma 3.5 below), we carefully analyze the behavior of solutions of the so-called Stein equation, and use them profitably, rather than simply use the fact that there exist bounded solutions with bounded derivatives. We were inspired to work this way by the similar innovative use of the solution in the context of Berry-Esséen theorems in [10]. The novelty in our second strategy is simply to note that the difficulties inherent to using the density formula of [12] with only one-sided assumptions tend to disappear when one passes to tail formulas.
Our work follows in the footsteps of Nourdin and Peccati's. One major difference in focus between our work and theirs, and indeed between ours and the main use of Stein's method since its inception in [19] to the most recent results (see [3], [4], [17], and references therein), is that Stein's method is typically concerned with convergence to the normal distribution, while we are only interested in rough bounds of Gaussian or other types for single random variables (not sequences), without imposing conditions which would lead to normal or any other convergence. While the focus in [9] is on convergence theorems, its authors were already aware of the ability of Stein's lemma and the Malliavin calculus to yield bounds for fixed r.v.'s, not sequences: their work implies that a bound on the deviation of a single $G$ from the value 1 has clear implications for the distance from $Z$'s distribution to the normal law. In fact the main technical tool therein ([9, Theorem 3.1]) is stated for fixed random variables, yielding bounds on various distances between the distributions of such r.v.'s and the normal law, based on expectation calculations using $G - 1$ explicitly; see also [9, Remark 3.6]. Nourdin and Peccati's motivations only required them to make use of [9, Theorem 3.1] as applied to convergences of sequences.
One other difference between our motivations and theirs is that we do not consider the case of a single Wiener chaos. Their work does in principle apply to random variables with arbitrary infinite chaos expansions (see e.g. again [9, Theorem 3.1], and also [9, Remark 3.8]), but their motivation is largely to apply the general theorem to r.v.'s in a fixed Wiener chaos. That we systematically consider random variables with infinitely many non-zero Wiener chaos components comes from the application which we also consider in this article, to the so-called fluctuation exponent of a polymer in a random environment. Details on this application, where we show that $\chi = 1/2$ for a certain class of environments, are in Section 5. There is a more fundamental obstacle to seeking upper or lower Gaussian tail bounds on an r.v. in a single Wiener chaos: unlike convergence results for sequences of r.v.'s, such as [15], a single $q$th chaos r.v. has a tail of order $\exp\big(-(x/c)^{2/q}\big)$ (see [2]); it never has a Gaussian behavior. Our lower-bound results below (e.g. Theorem 1.3 Point 3) do apply to such an r.v., but the result cannot be sharp.
Since submitting the first version of this article, there have been rapid developments in the use of Malliavin calculus, Stein's lemma, and the random variable $G$, which apply to situations not restricted to a single Wiener chaos: see in particular [11], where the authors prove a second-order Poincaré inequality to again assess the distance between a single r.v.'s law and the Gaussian law, and [16], where the authors use the $G$-based density formula of [12] to find Gaussian upper and lower bounds for solutions of some stochastic heat equations.
1.2 Summary of results
We now describe our main theoretical results. All stochastic analytic concepts used in this introduction are described in Section 2. Let $W$ be an isonormal Gaussian process relative to a Hilbert space $H = L^2(T, \mathcal{B}, \mu)$ (for instance, if $W$ is the Wiener process on $[0,1]$, then $T = [0,1]$ and $\mu$ is the Lebesgue measure). The norm and inner product in $H$ are denoted by $\|\cdot\|_H$ and $\langle \cdot, \cdot \rangle_H$. Let $L^2(\Omega)$ be the set of all random variables which are square-integrable and measurable with respect to $W$. Let $D$ be the Malliavin derivative with respect to $W$ (see Paul Malliavin's or David Nualart's texts [8], [13]). Thus $DX$ is a random element of $L^2(\Omega)$ with values in the Hilbert space $H$. The set of all $X \in L^2(\Omega)$ such that $\|DX\|_H \in L^2(\Omega)$ is called $\mathbb{D}^{1,2}$. Let $\Phi$ be the tail of the standard normal distribution:
$$\Phi(u) := \int_u^{\infty} e^{-x^2/2}\, dx \,/\, \sqrt{2\pi}.$$
The following result, described in [22] as an elementary consequence of a classical stochastic analytic inequality found for instance in Üstünel's textbook [21, Theorem 9.1.1], makes use of a condition based solely on the Malliavin derivative of a given r.v. to guarantee that its tail is bounded above by a Gaussian tail.
Proposition 1.1 For any $X \in \mathbb{D}^{1,2}$, if $\|DX\|_H$ is bounded almost surely by 1, then $X$ is a standard sub-Gaussian random variable, in the sense that $P[|X - E[X]| > u] \le 2 e^{-u^2/2}$.
Remark 1.2 The value 1 in this proposition, and indeed in many places in this paper, plays the role of a dispersion coefficient. Since the Malliavin derivative $D$ is linear, the above proposition implies that for any $X \in \mathbb{D}^{1,2}$ such that $\|DX\|_H^2 \le \sigma^2$ almost surely, we have $P[|X - E[X]| > u] \le 2 e^{-u^2/(2\sigma^2)}$. This trivial normalization argument can be used throughout this paper, because our hypotheses are always based on linear operators such as $D$. We use this argument in our application in Section 5.
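To illustrate the scaling in Remark 1.2, the following short Monte Carlo sketch (an illustration we add; the choice $X = W(h)$ with $\|h\|_H = \sigma$, so that $\|DX\|_H = \sigma$ exactly, is an assumption made only for this example) checks the normalized sub-Gaussian bound numerically on a first-chaos random variable.

```python
import numpy as np

# Sketch: for X = W(h) with ||h||_H = sigma, Remark 1.2 predicts
#   P[|X - E[X]| > u] <= 2 exp(-u^2 / (2 sigma^2)).
rng = np.random.default_rng(0)
sigma = 1.5
X = sigma * rng.standard_normal(1_000_000)   # W(h) ~ N(0, sigma^2)

for u in [1.0, 2.0, 3.0]:
    empirical = np.mean(np.abs(X) > u)
    bound = 2 * np.exp(-u**2 / (2 * sigma**2))
    print(f"u={u}: empirical tail {empirical:.4f} <= bound {bound:.4f}")
```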
The question of whether a lower bound on $\|DX\|_H^2$ gives rise to an inequality in the opposite direction to the one in the above proposition arises naturally. However, we were unable to find any proof of such a result. Instead, after reading Eulalia Nualart's article [14], where she finds a class of lower bounds by considering exponential moments of the divergence (Skorohod integral) of a covering vector field of $X$, we were inspired to look for other Malliavin calculus operations on $X$ which would yield a Gaussian lower bound on $X$'s tail. We turned to the quantity $G := \langle DX, -DL^{-1}X \rangle_H$, identified in [9] and used profitably in [12]. Here $L^{-1}$, the pseudo-inverse of the so-called generator of the Ornstein-Uhlenbeck semigroup, is defined in Section 3.2. This article's first theoretical result is that a lower (resp. upper) bound on $G$ can yield a lower (resp. upper) tail bound analogous to the upper bound in Proposition 1.1. For instance, summarizing the combination of some consequences of our results and Proposition 1.1, we have the following.
Theorem 1.3 Let $X$ be a random variable in $\mathbb{D}^{1,2}$. Let $G := \langle DX, -DL^{-1}X \rangle_H$.

1. If $G \ge 1$ almost surely, then
$$\mathrm{Var}[X] = E[G] \ \ge\ 1.$$

2. If $G \ge 1$ almost surely, and if for some $c > 2$, $E[X^c] < \infty$, then
$$\limsup_{z \to \infty}\ P[X - E[X] > z]\,/\,\Phi(z) \ \ge\ \frac{c-2}{c}.$$

3. If $G \ge 1$ almost surely, and if there exist $c' \in (0, 1/2)$ and $z_0 > 0$ such that $G \le c' X^2$ almost surely when $X \ge z_0$, then for $z > z_0$,
$$P[X - E[X] > z] \ \ge\ (1 - 2c')\, \Phi(z). \qquad (1)$$

4. If $G \le 1$ almost surely, and $X$ has a density, then for every $z > 0$,
$$P[X - E[X] > z] \ \le\ \left(1 + \frac{1}{z^2}\right) \Phi(z). \qquad (2)$$

5. If $\|DX\|_H^2 \le 1$ almost surely, then $\mathrm{Var}[X] \le (\pi/2)^2$ and, for $z > 0$,
$$P[X - E[X] > z] \ \le\ e^{-z^2/2}. \qquad (3)$$
Remark 1.4 Item 1 in this theorem is Corollary 4.2 Point 1. Item 2 here comes from Corollary 4.2 Point
3. Item 3 here follows from Corollary 4.5 Point 1. Item 4 is from Theorem 4.1. Inequality (3) in Item 5
here is equivalent to Proposition 1.1. The variance upper bound in Item 5 here follows from [21, Theorem
9.2.3 part (iii)]. Other, non-Gaussian comparisons are also obtained in this article: see Corollary 4.5.
The results in Theorem 1.3 point to basic properties of the Malliavin derivative and generator of the Ornstein-Uhlenbeck semigroup when investigating tail behavior of random variables. The importance of the relation of $G$ to the value 1 was already noticed in [9, Theorem 3.1], where its $L^2$-convergence to 1 for a sequence of r.v.'s was a basic building block for convergence to the standard normal distribution. Here we show what can still be asserted when the condition is significantly relaxed. An attempt was made to prove a version of the theorem above in [12, Section 4]; here we significantly improve that work by: (i) removing the unwieldy upper bound conditions made in [12, Theorem 4.2] to prove lower bound results therein; and (ii) improving the upper bound in [12, Theorem 4.1] while using a weaker hypothesis.
Our results should have applications in any area of pure or applied probability where Malliavin derivatives are readily expressed. In fact, Nourdin and Peccati [9, Remark 1.4, point 4] already hint that $G$ is not always as intractable as one may fear. We present such an application in this article, in which the deviations of a random polymer in some random media are estimated, and its fluctuation exponent is calculated to be $\chi = 1/2$, a result which we prove to be robust to non-linear changes in the polymer's Hamiltonian.
The structure of this article is as follows. Section 2 presents all necessary background information from the theory of Wiener chaos and the Malliavin calculus needed to understand our statements and proofs. Section 3 recalls Stein's lemma and equation, presents the way it will be used in this article, and recalls the density representation results from [12]. Section 4 states and proves our main lower and upper bound results. Section 5 gives a construction of continuous random polymers in Gaussian environments, and states and proves the estimates on their deviations and their fluctuation exponent under Gaussian and non-Gaussian Hamiltonians, when the Gaussian environment has infinite-range correlations. Several interesting open questions are described in this section as well. Section 6, the Appendix, contains the proofs of some lemmas.
Acknowledgements We wish to thank Ivan Nourdin and Giovanni Peccati for discussing their work on Stein's method with us, Eulalia Nualart for encouraging us to study the question of lower bounds on tails of random variables via the Malliavin calculus, and Samy Tindel for help with the concept of polymer fluctuation exponents. We are grateful to two anonymous referees whose very careful reading led to spotting a number of small errors in an earlier version.
2 Preliminaries: Wiener chaos and Malliavin calculus
For a complete treatment of this topic, we refer the reader to David Nualart’s textbook [13].
We are in the framework of an isonormal Gaussian process $W$ on a suitable probability space $(\Omega, \mathcal{F}, \mathbf{P})$: it is defined as a Gaussian field $W$ on a Hilbert space $H = L^2(T, \mathcal{B}, \mu)$, where $\mu$ is a $\sigma$-finite measure that is either discrete or without atoms, and the covariance of $W$ coincides with the inner product in $H$. This forces $W$ to be linear on $H$; consequently, it can be interpreted as an abstract Wiener integral. For instance, if $T = [0,1]$ and $\mu$ is the Lebesgue measure, then $W(f)$ represents the usual Wiener stochastic integral $\int_0^1 f(s)\, dW(s)$ of a square-integrable non-random function $f$ with respect to a Wiener process also denoted by $W$; i.e. we confuse the notation $W(t)$ and $W(\mathbf{1}_{[0,t]})$. In general, for $f_1, \dots, f_n \in H$, $(W(f_i) : i = 1, \dots, n)$ is a centered Gaussian vector with covariance matrix given by $\langle f_i, f_j \rangle_H$. The set $\mathcal{H}_1$ of all Wiener integrals $W(f)$, as $f$ ranges over all of $H$, is called the first Wiener chaos of $W$.
To construct higher-order chaoses, one may for example use iterated Itô integration in the case of standard Brownian motion, where $H = L^2[0,1]$. If we denote $I_0(f) = f$ for any non-random constant $f$, then for any integer $n \ge 1$ and any symmetric function $f \in H^{\otimes n}$, we let
$$I_n(f) := n! \int_0^1 \int_0^{s_1} \cdots \int_0^{s_{n-1}} f(s_1, s_2, \dots, s_n)\, dW(s_n) \cdots dW(s_2)\, dW(s_1).$$
This is the $n$th iterated Wiener integral of $f$ w.r.t. $W$.
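As a concrete sanity check of this definition (a numerical sketch we add, not part of the original development), one can verify by simulation that for $f = \mathbf{1}_{[0,1]}^{\otimes 2}$ the second iterated integral $I_2(f) = 2\int_0^1 W(s)\, dW(s)$ coincides with the second Hermite polynomial of $W(1)$, namely $W(1)^2 - 1$.

```python
import numpy as np

# Sketch: approximate I_2(1_{[0,1]} (x) 1_{[0,1]}) = 2 * int_0^1 W(s) dW(s) by a forward (Ito) Euler sum
# and compare with the closed form W(1)^2 - 1.
rng = np.random.default_rng(1)
n_paths, n_steps = 20_000, 2_000
dW = rng.standard_normal((n_paths, n_steps)) / np.sqrt(n_steps)   # Brownian increments on [0,1]
W = np.cumsum(dW, axis=1)

# Ito sum uses the left endpoint W(s_{i-1}); prepend W(0) = 0
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])
I2 = 2 * np.sum(W_left * dW, axis=1)

print("mean |I2 - (W_1^2 - 1)| =", np.mean(np.abs(I2 - (W[:, -1] ** 2 - 1))))  # small, O(n_steps^{-1/2})
```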
Definition 2.1 The set $\mathcal{H}_n := \{ I_n(f) : f \in H^{\otimes n} \}$ is the $n$th Wiener chaos of $W$.

We refer to [13, Section 1.2] for the general definition of $I_n$ and $\mathcal{H}_n$ when $W$ is a more general isonormal Gaussian process.
Proposition 2.2 $L^2(\Omega)$ is the direct sum, with respect to the inner product defined by expectations of products of r.v.'s, of all the Wiener chaoses. Specifically, for any $X \in L^2(\Omega)$, there exists a sequence of non-random symmetric functions $f_n \in H^{\otimes n}$ with $\sum_{n=0}^{\infty} \|f_n\|_{H^{\otimes n}}^2 < \infty$ such that $X = \sum_{n=0}^{\infty} I_n(f_n)$. Moreover $E[X] = f_0 = I_0(f_0)$, $E[I_n(f_n)] = 0$ for all $n \ge 1$, and $E[I_n(f_n) I_m(g_m)] = \delta_{m,n}\, n!\, \langle f_n, g_n \rangle_{H^{\otimes n}}$, where $\delta_{m,n}$ equals 0 if $m \ne n$ and 1 if $m = n$. In particular $E[X^2] = \sum_{n=0}^{\infty} n!\, \|f_n\|_{H^{\otimes n}}^2$.
The Malliavin derivative operator is usually constructed via an extension starting from so-called simple random variables, which are differentiable functions of finite-dimensional vectors from the Gaussian space $\mathcal{H}_1$. The reader can consult Nualart's textbook [13]. We recall the properties which are of use to us herein.
1. The Malliavin derivative operator $D$ is defined from $\mathcal{H}_1$ into $H$ by the formula: for all $r \in T$,
$$D_r W(f) = f(r).$$
The Malliavin derivative of a non-random constant is zero. For any $m$-dimensional Gaussian vector $G = (G_i)_{i=1}^m = (I_1(g_i))_{i=1}^m \in (\mathcal{H}_1)^m$, and for any $F \in C^1(\mathbb{R}^m)$ such that $X = F(G) \in L^2(\Omega)$, we have
$$D_r X = \sum_{i=1}^m \frac{\partial F}{\partial x_i}(G)\, g_i(r).$$
2. The Malliavin derivative of an $n$th Wiener chaos r.v. is particularly simple. Let $X_n \in \mathcal{H}_n$, i.e. let $f_n$ be a symmetric function in $H^{\otimes n}$ and $X_n = I_n(f_n)$. Then
$$D_r X_n = D_r I_n(f_n) = n\, I_{n-1}\big(f_n(r, \cdot)\big). \qquad (4)$$
The Malliavin derivative being linear, this extends immediately to any random variable $X$ in $L^2(\Omega)$ by writing $X$ as its Wiener chaos expansion $\sum_{n=0}^{\infty} I_n(f_n)$, which means that, using the covariance formulas in Proposition 2.2, $DX \in L^2(\Omega \times T)$ if and only if
$$E\big[\|DX\|_H^2\big] := \sum_{n=1}^{\infty} n\, n!\, \|f_n\|_{H^{\otimes n}}^2 < \infty. \qquad (5)$$
The set of all $X \in L^2(\Omega)$ such that $DX \in L^2(\Omega \times T)$ is denoted by $\mathbb{D}^{1,2}$.
Remark 2.3 The chain rule of point 1 above generalizes to $D[h(X)] = h'(X)\, DX$ for any $X \in \mathbb{D}^{1,2}$ such that $X$ has a density, and any function $h$ which is continuous and piecewise differentiable with a bounded derivative. This is an immediate consequence of [13, Proposition 1.2.3].
3 Tools: using Stein's lemma and Malliavin derivatives

3.1 Stein's lemma and equation
The version of Stein's lemma which we use can be found in [9]. Let $Z$ be a standard normal random variable and $\Phi(z) = P[Z > z]$ its tail. Let $h$ be a measurable function of one real variable. Stein's equation poses the following problem: find a continuous and piecewise differentiable function $f$ with bounded derivative such that, for all $x \in \mathbb{R}$ where $f'(x)$ exists,
$$h(x) - E[h(Z)] = f'(x) - x f(x). \qquad (6)$$
The precise form of the solution to this differential equation for $h = \mathbf{1}_{(-\infty, z]}$, given in the next lemma, was derived in Stein's original work [19]; a recent usage is found in equalities (1.5), (2.20), and (2.21) in [10].
Lemma 3.1 Fix $z \in \mathbb{R}$. Let $h = \mathbf{1}_{(-\infty, z]}$. Then Stein's equation (6) has a unique solution $f$ satisfying $\|f'\|_{\infty} := \sup_{x \in \mathbb{R}} |f'(x)| \le 1$. It is the following:
for $x \le z$, $\quad f(x) = \sqrt{2\pi}\, e^{x^2/2}\, \big(1 - \Phi(x)\big)\, \Phi(z)$;
for $x > z$, $\quad f(x) = \sqrt{2\pi}\, e^{x^2/2}\, \big(1 - \Phi(z)\big)\, \Phi(x)$.
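As a quick numerical sanity check (a sketch we add here, using scipy as an assumed tool), one can verify that the function $f$ of Lemma 3.1 satisfies Stein's equation (6) with $h = \mathbf{1}_{(-\infty, z]}$, i.e. $f'(x) - x f(x) = \mathbf{1}_{x \le z} - P[Z \le z]$, away from the kink at $x = z$.

```python
import numpy as np
from scipy.stats import norm

# Sketch: check that f from Lemma 3.1 solves f'(x) - x f(x) = 1_{x<=z} - P[Z<=z]
# away from the kink at x = z, with Phi(u) = P[Z > u] = norm.sf(u).
z = 0.7

def f(x):
    x = np.asarray(x, dtype=float)
    low = np.sqrt(2 * np.pi) * np.exp(x**2 / 2) * norm.cdf(x) * norm.sf(z)   # branch x <= z
    high = np.sqrt(2 * np.pi) * np.exp(x**2 / 2) * norm.cdf(z) * norm.sf(x)  # branch x > z
    return np.where(x <= z, low, high)

xs = np.linspace(-4, 4, 2001)
xs = xs[np.abs(xs - z) > 1e-2]                       # avoid the kink at x = z
eps = 1e-6
f_prime = (f(xs + eps) - f(xs - eps)) / (2 * eps)    # central finite difference
residual = (f_prime - xs * f(xs)) - ((xs <= z).astype(float) - norm.cdf(z))

print("max |Stein residual| =", np.max(np.abs(residual)))   # should be of order 1e-7 or smaller
```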
Corollary 3.2 Let $X \in L^2(\Omega)$. Setting $x = X$ in Stein's equation (6) and taking expectations we get
$$P[X > z] = \Phi(z) - E[f'(X)] + E[X f(X)].$$

The next section gives tools which will allow us to combine this corollary with estimates of the random variable $G := \langle DX, -DL^{-1}X \rangle_H$ in order to get tail bounds. It also shows how $G$ can be used, as in [12], to express the density of $X$ without using Stein's lemma.
3.2 Malliavin derivative tools
Definition 3.3 The generator of the Ornstein-Uhlenbeck semigroup, $L$, is defined as follows. Let $X = \sum_{n=1}^{\infty} I_n(f_n)$ be a centered r.v. in $L^2(\Omega)$. If $\sum_{n=1}^{\infty} n^2\, n!\, \|f_n\|_{H^{\otimes n}}^2 < \infty$, then we define a new random variable $LX$ in $L^2(\Omega)$ by $LX = -\sum_{n=1}^{\infty} n\, I_n(f_n)$. The pseudo-inverse of $L$ operating on centered r.v.'s in $L^2(\Omega)$ is defined by the formula $L^{-1}X = -\sum_{n=1}^{\infty} \frac{1}{n} I_n(f_n)$. If $X$ is not centered, we define its image by $L$ and $L^{-1}$ by applying them to $X - E[X]$.
Definition 3.4 For $X \in \mathbb{D}^{1,2}$, we let $G := \langle DX, -DL^{-1}X \rangle_H$.
The following formula will play an important role in our proofs where we use Stein's lemma. It was originally noted in [9]. We provide a self-contained proof of this result in the Appendix, which does not use the concept of divergence operator (Skorohod integral).

Lemma 3.5 For any centered $X \in \mathbb{D}^{1,2}$ with a density, with $G$ from Definition 3.4, and for any deterministic continuous and piecewise differentiable function $h$ such that $h'$ is bounded,
$$E[X h(X)] = E[h'(X)\, G]. \qquad (7)$$
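To fix ideas, here is a short worked example (added for illustration, using only the definitions above). First chaos: take $X = W(h)$ with $\|h\|_H = 1$. Then $DX = h$, $L^{-1}X = -X$, so $-DL^{-1}X = h$ and
$$G = \langle DX, -DL^{-1}X \rangle_H = \|h\|_H^2 = 1,$$
consistent with $\mathrm{Var}[X] = E[G] = 1$, and (7) with $h(x) = x$ reduces to $E[X^2] = E[G] = 1$. Second chaos: take $X = I_2(h \otimes h) = W(h)^2 - 1$ (still $\|h\|_H = 1$). Then $D_r X = 2 W(h)\, h(r)$, $L^{-1}X = -X/2$, so $-D_r L^{-1}X = W(h)\, h(r)$ and $G = 2 W(h)^2$; taking $h(x) = x$ in (7) gives $E[X^2] = E[2 W(h)^2] = 2$, which is indeed $\mathrm{Var}[W(h)^2 - 1]$.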
On the other hand, the next result and its proof (see [12]) make no reference to Stein's lemma.

Definition 3.6 With $X \in \mathbb{D}^{1,2}$ and $G$ as in Definition 3.4, let the function $g$ be defined almost everywhere on the support of $X$ as the conditional expectation of $G$ given $X$:
$$g(z) := E[G \mid X = z]. \qquad (8)$$
Proposition 3.7 Let $X \in \mathbb{D}^{1,2}$ be centered with a density $\rho$ which is supported on a set $I$. Then $I$ is an interval $[a, b]$ and, with $g$ as above, we have for almost all $z \in (a, b)$,
$$\rho(z) = \frac{E|X|}{2\, g(z)} \exp\left( -\int_0^z \frac{y\, dy}{g(y)} \right).$$
Strictly speaking, the proof of this proposition is not contained in [12], since the authors there use the additional assumption that $g(x) \ge 1$ everywhere, which implies that $\rho$ exists and that $I = \mathbb{R}$. However, the modification of their arguments to yield the proposition above is straightforward, and we omit it: for instance, that $I$ is an interval follows from $X \in \mathbb{D}^{1,2}$, as seen in [13, Proposition 2.1.7].
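As a quick sanity check of the formula, if $g \equiv 1$ (the prototypical case of a standard normal $X$, for which $G \equiv 1$), it returns $\rho(z) = \frac{E|X|}{2} e^{-z^2/2} = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, since $E|X| = \sqrt{2/\pi}$ for a standard normal; the representation is thus exact in the Gaussian case.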
As one can see from this proposition, and the statement of Theorem 1.3, it is important to have a technique to be able to calculate $DL^{-1}X$. We will use a device which can be found, in a different form, in the proof of Lemma 1.5.1 in [13], and is at the core of the so-called Mehler formula, also found in [13]. It requires a special operator which introduces a coupling with an independent Wiener space. This operator $R_\theta$ replaces $W$ by the linear combination $W \cos\theta + W' \sin\theta$, where $W'$ is an independent copy of $W$. For instance, if $W$ is Brownian motion and one writes the random variable $X$ as $X = F(W)$, where $F$ is a deterministic Borel-measurable functional on the space of continuous functions, then
$$R_\theta X := F(W \cos\theta + W' \sin\theta). \qquad (9)$$
We have the following formula (akin to the Mehler formula, and proved in the Appendix), where $\mathrm{sgn}(\theta) = \theta/|\theta|$ with $\mathrm{sgn}(0) = 1$ by convention, where $E'$ represents the expectation w.r.t. the randomness in $W'$ only, i.e. conditional on $W$, and where $D'$ is the Malliavin derivative w.r.t. $W'$ only.
Lemma 3.8 For any $X \in \mathbb{D}^{1,2}$ and all $s \in T$,
$$-D_s L^{-1} X = \frac{1}{2} \int_{-\pi/2}^{\pi/2} E'\big[ D'_s (R_\theta X) \big]\, \mathrm{sgn}(\theta)\, d\theta = \frac{1}{2} \int_{-\pi/2}^{\pi/2} E\big[ D'_s (R_\theta X) \,\big|\, W \big]\, \mathrm{sgn}(\theta)\, d\theta.$$
To be specific, note that $D'_s(R_\theta X)$ can typically be expressed explicitly. For instance, for an elementary random variable $X = f\big( (W(h_i))_{i=1,\dots,n} \big)$ with $f \in C_b^1(\mathbb{R}^n)$ and $h_i \in H$ for all $i$,
$$D'_s(R_\theta X) = D'_s\Big[ f\big( (W(h_i)\cos\theta + W'(h_i)\sin\theta)_i \big) \Big] = \sin\theta \sum_{i=1}^n h_i(s)\, \frac{\partial f}{\partial x_i}\big( (W(h_i)\cos\theta + W'(h_i)\sin\theta)_i \big).$$
We also note that if we rewrite this example abstractly by setting $DX = \Psi(W)$, where $\Psi$ is a measurable function from $\Omega$ into $H$, then the above calculation shows that $D'(R_\theta X) = (\sin\theta)\, \Psi(R_\theta W)$, and the result of the lemma above can be rewritten in a form which may be more familiar to users of Mehler's formula:
$$-D_s L^{-1} X = \frac{1}{2} \int_{-\pi/2}^{\pi/2} E'\big[ \Psi(R_\theta W)(s) \big]\, |\sin\theta|\, d\theta.$$
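As a worked illustration of this rewriting (an example we add for concreteness), take $X = W(h)^2 - \|h\|_H^2$, so that $DX = 2 W(h)\, h$, i.e. $\Psi(W) = 2 W(h)\, h$. Then $E'[\Psi(R_\theta W)(s)] = 2 W(h) \cos\theta\, h(s)$, and since $\int_{-\pi/2}^{\pi/2} \cos\theta\, |\sin\theta|\, d\theta = 1$, the formula above gives
$$-D_s L^{-1} X = \frac{1}{2} \int_{-\pi/2}^{\pi/2} 2 W(h) \cos\theta\, h(s)\, |\sin\theta|\, d\theta = W(h)\, h(s),$$
which agrees with the direct computation $L^{-1}X = -X/2$ and $D_s L^{-1}X = -W(h)\, h(s)$.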
4 Main results
All results in this section are stated and discussed in the first two subsections, the first one dealing with consequences of Stein's lemma, the second with the function $g$. All proofs are in the third subsection.

4.1 Results using Stein's lemma
Our first result is tailored to Gaussian comparisons.

Theorem 4.1 Let $X \in \mathbb{D}^{1,2}$ be centered. Assume that, almost surely,
$$G := \langle DX, -DL^{-1}X \rangle_H \ \ge\ 1. \qquad (10)$$
Then for every $z > 0$,
$$P[X > z] \ \ge\ \Phi(z) - \frac{1}{1 + z^2} \int_z^{\infty} (2x - z)\, P[X > x]\, dx.$$
Assume instead that $X$ has a density and $G \le 1$ almost surely; then for every $z > 0$,
$$P[X > z] \ \le\ \left(1 + \frac{1}{z^2}\right) \Phi(z).$$
Before proving this theorem, we record some consequences of its lower bound result in the next corollary. In order to obtain a more precise lower bound result on the tail $S(z) := P[X > z]$, it appears to be necessary to make some regularity and integrability assumptions on $S$. This is the aim of the second point in the next corollary. The first and third points show what can be obtained by using only an integrability condition, with no regularity assumption: we may either find a universal lower bound on such quantities as $X$'s variance, or an asymptotic statement on $S$ itself.
Corollary 4.2 Let $X \in \mathbb{D}^{1,2}$ be centered. Let $S(z) := P[X > z]$. Assume that condition (10) holds.

1. We have
$$\mathrm{Var}[X] = E[G] \ \ge\ 1.$$

2. Assume there exists a constant $c > 2$ such that $|S'(z)|/S(z) \ge c/z$ holds for large $z$. Then for large $z$,
$$P[X > z] \ \ge\ \frac{(c-2)(1 + z^2)}{c - 2 + c z^2}\, \Phi(z) \ \simeq\ \frac{c-2}{c}\, \Phi(z).$$

3. Assume there exists a constant $c > 2$ such that $S(z) < z^{-c}$ holds for large $z$. Then, for large $z$,
$$\sup_{x > z}\, x^c\, P[X > x] \ \ge\ \frac{c-2}{c}\, z^c\, \Phi(z).$$
Consequently,
$$\limsup_{z \to \infty} \frac{P[X > z]}{\Phi(z)} \ \ge\ \frac{c-2}{c}.$$
Let us discuss the assumptions and results in the corollary from a quantitative standpoint. The assumption of Point 2, $|S'(z)|/S(z) \ge c/z$, when integrated, yields $S(z) \le S(1)\, z^{-c}$, and thus implies no more than the existence of a moment of order larger than 2; it does, however, represent an additional monotonicity-type condition since it refers to $S'$. The assumption of Point 3, which is weaker because it does not require any such condition, also implies the same moment condition. This moment condition is little more than the integrability required from $X$ belonging to $\mathbb{D}^{1,2}$. If $c$ can be made arbitrarily large (for instance in Point 3, this occurs when $X$ is assumed to have moments of all orders), asymptotically $(c-2)/c$ can be replaced by 1, yielding the sharpest possible comparison to the normal tail. If indeed $S$ is close to the normal tail, it is morally not a restriction to assume that $c$ can be taken arbitrarily large: it is typically easy to check this via a priori estimates.
4.2 Results using the function g

We now present results which do not use Stein's lemma, but refer only to the random variable $G := \langle DX, -DL^{-1}X \rangle_H$ and the resulting function $g(z) := E[G \mid X = z]$ introduced in (8). We will prove the theorem below using the results in [12] on the representation of densities. Its corollary shows how to obtain quantitatively explicit upper and lower bounds on the tail of a random variable, which are as sharp as the upper and lower bounds one might establish on $g$. A description of the advantages and disadvantages of using $g$ over Stein's lemma follows the statements of the next theorem and its corollary.
Theorem 4.3 Let $X \in \mathbb{D}^{1,2}$ be centered. Let $G := \langle DX, -DL^{-1}X \rangle_H$ and $g(z) := E[G \mid X = z]$. Assume that $X$ has a density which is positive on the interior of its support $(a, +\infty)$, where $a \in [-\infty, +\infty)$. For $x \ge 0$, let
$$A(x) := \exp\left( -\int_0^x \frac{y\, dy}{g(y)} \right).$$
Then for all $x > 0$,
$$P[X > x] = \frac{E|X|}{2} \left( \frac{A(x)}{x} - \int_x^{\infty} \frac{A(y)}{y^2}\, dy \right). \qquad (11)$$
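As a quick consistency check of (11), if $g \equiv 1$ then $A(x) = e^{-x^2/2}$, and since $E|X| = \sqrt{2/\pi}$ for a standard normal, the right-hand side equals $\frac{1}{\sqrt{2\pi}}\big( e^{-x^2/2}/x - \int_x^{\infty} e^{-y^2/2} y^{-2}\, dy \big) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy = \Phi(x)$, by integration by parts; the tail formula is thus exact in the Gaussian case.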
Remark 4.4 The density formula in Proposition 3.7 shows that $g$ must be non-negative (in fact, this was already known: $E[G \mid X] \ge 0$ almost surely, even if $X$ is not known to have a density, assuming only $X \in \mathbb{D}^{1,2}$; see [9, Proposition 3.9]). Assuming our centered $X \in \mathbb{D}^{1,2}$ has a density $\rho$, we have already noted that $\rho$ must be positive on $(a, b)$ and zero outside. To ensure that $b = +\infty$, as is needed in the above theorem, it is sufficient to assume that $g$ is bounded below on $[0, b)$ by a positive constant. If in addition we can assume, as in (10), that this lower-boundedness of $g$ holds everywhere, then $X$ has a density, and its support is $\mathbb{R}$.
Corollary 4.5 Assume that for some $c' \in (0, 1)$ and some $z_0 > 1$, we have $g(x) \le c' x^2$ for all $x > z_0$. Then, with $K := \frac{E|X|}{2}\, (c')^{c'} (1 + c')^{-(1+c')}$, for $x > z_0$,
$$P[X > x] \ \ge\ K\, \frac{A(x)}{x}. \qquad (12)$$
1. Under the additional assumption (10), $g(x) \ge 1$ everywhere, and we have
$$P[X > z] \ \ge\ K\, \frac{1}{z}\, \exp\left( -\frac{z^2}{2} \right) \ \simeq\ \sqrt{2\pi}\, K\, \Phi(z).$$

2. If we have rather the stronger lower bound $g(x) \ge c'' x^2$ for some $c'' \in (0, c']$ and all $x > z_0$, then for $x > z_0$, and with some constant $K'$ depending on $g$, $c''$ and $z_0$,
$$P[X > x] \ \ge\ K'\, x^{-1 - 1/c''}.$$

3. If we have instead that $g(x) \ge c_1 x^p$ for some $c_1 > 0$, $p < 2$, and for all $x > z_0$, then for $x > z_0$, and with some constant $K''$ depending on $g$, $c_1$, $p$, and $z_0$,
$$P[X > x] \ \ge\ K'' \exp\left( -\frac{x^{2-p}}{(2-p)\, c_1} \right).$$

4. In the last two points, if the inequalities on $g$ in the hypotheses are reversed, the conclusions are also reversed, without changing any of the constants; i.e.
(a) if $\exists\, c'' \ge c'$, $\exists\, z_0 > 0$ such that $\forall x > z_0$, $g(x) \le c'' x^2$, then $x > z_0 \Rightarrow P[X > x] \le K'\, x^{-1 - 1/c''}$;
(b) if $\exists\, c_1 > 0$, $\exists\, p < 2$, $\exists\, z_0 > 0$ such that $\forall x > z_0$, $g(x) \le c_1 x^p$, then $x > z_0 \Rightarrow P[X > x] \le K'' \exp\left( -\frac{x^{2-p}}{(2-p)\, c_1} \right)$.
The tail formula (11) in Theorem 4.3 readily implies asymptotic estimates on $S$ of non-Gaussian type if one is able to compare $g$ to a power function. Methods using Stein's lemma, at least in the form described in Section 3.1, only work efficiently for comparing $S$ to the Gaussian tail. Arguments found in Nourdin and Peccati's articles (e.g. [9]) indicate that Stein's method may be of use in some specific non-Gaussian cases, which one could use to compare tails to the Gamma tail, and perhaps to other tails in the Pearson family, which would correspond to polynomial $g$ with degree at most 2. The flexibility of our method of working directly with $g$ rather than Stein's lemma is that it seems to allow any type of tail. Stein's method has one important advantage, however: it is not restricted to having a good control on $g$; Theorem 4.1 establishes Gaussian lower bounds on tails by only assuming (10) and mild conditions on the tail itself. This is to be compared to the lower bound [12, Theorem 4.2] proved via the function $g$ alone, which required growth conditions on $g$ that may not be that easy to check.
There is one intriguing, albeit perhaps technical, fact regarding the use of Stein's method: in Point 1 of the above Corollary 4.5, since the comparison is made with a Gaussian tail, one may wonder what the usage of Stein's lemma via Theorem 4.1 may produce when assuming, as in Point 1 of Corollary 4.5, that $g(x) \ge 1$ and $g$ grows slower than $x^2$. As it turns out, Stein's method is not systematically superior to Corollary 4.5, as we now see.
Corollary 4.6 (Consequence of Theorem 4.1) Assume that $g(x) \ge 1$ and, for some $c' \in (0, 1/2)$ and all large $x > z_0$, $g(x) \le c' x^2$. Then for $z > z_0$,
$$P[X > z] \ \ge\ \frac{1 + z^2}{1 + (1 - 2c')^{-1} z^2}\, \Phi(z) \ \simeq\ (1 - 2c')\, \Phi(z).$$
When this corollary and Point 1 in Corollary 4.5 are used in an efficient situation, this means that $X$ is presumably "subgaussian" as well as being "supergaussian" as a consequence of assumption (10). For illustrative purposes, we can translate this roughly as meaning that for some $\varepsilon > 0$, $g(x)$ is in the interval $[1, 1+\varepsilon]$ for all $x$. This implies that we can take $c' \to 0$ in both Corollaries 4.5 and 4.6; as a consequence, the first corollary yields $P[X > z] \gtrsim \Phi(z)$, while the second gives $P[X > z] \gtrsim (\sqrt{2\pi}\, E|X|/2)\, \Phi(z)$. The superiority of one method over the other then depends on how $\sqrt{2\pi}\, E|X|/2$ compares to 1. It is elementary to check that, in "very sharp" situations, which means that $\varepsilon$ is quite small, $\sqrt{2\pi}\, E|X|/2$ will be close to 1, from which one can only conclude that both methods appear to be equally efficient.
4.3 Proofs

We now turn to the proofs of the above results.
Proof of Theorem 4.1.
Step 1: exploiting the negativity of $f'$. From Lemma 3.1, we are able to calculate the derivative of the solution $f$ to Stein's equation:
for $x \le z$, $\quad f'(x) = \Phi(z)\left[ 1 + \sqrt{2\pi}\, \big(1 - \Phi(x)\big)\, x e^{x^2/2} \right]$;
for $x > z$, $\quad f'(x) = \big(1 - \Phi(z)\big)\left[ -1 + \sqrt{2\pi}\, \Phi(x)\, x e^{x^2/2} \right]$.
We now use the standard estimate, valid for all $x > 0$ (see [6, Problem 2.9.22, page 112]):
$$\frac{x}{(x^2 + 1)\sqrt{2\pi}}\, e^{-x^2/2} \ \le\ \Phi(x) \ \le\ \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2}. \qquad (13)$$
In the case $x > z$, since $z > 0$, the upper estimate yields $f'(x) \le (1 - \Phi(z))(-1 + 1) = 0$. Now, by the expression for $P[X > z]$ in Corollary 3.2, the negativity of $f'$ on $\{x > z\}$ implies, for all $z > 0$,
$$P[X > z] = \Phi(z) - E[\mathbf{1}_{X \le z} f'(X)] - E[\mathbf{1}_{X > z} f'(X)] + E[X f(X)] \ \ge\ \Phi(z) - E[\mathbf{1}_{X \le z} f'(X)] + E[X f(X)].$$
Step 2: exploiting the positivities and the smallness of $f'$. Using Step 1, we have
$$P[X > z] \ \ge\ \Phi(z) - E[\mathbf{1}_{X \le z} f'(X)] + E[\mathbf{1}_{X \le z} X f(X)] + E[\mathbf{1}_{X > z} X f(X)].$$
We apply Lemma 3.5 to the function $h(x) = (f(x) - f(z))\, \mathbf{1}_{x \le z}$; $h$ is continuous everywhere; it is differentiable everywhere, with a bounded derivative equal to $f'(x)\, \mathbf{1}_{x \le z}$, except at $x = z$. Applying this lemma is legitimized by the fact, proved in [12], that $G \ge 1$ a.s. implies that $X$ has a density (see the explanation given after the statement of Proposition 3.7). Thus we get
$$P[X > z] \ \ge\ \Phi(z) - E[\mathbf{1}_{X \le z} f'(X)] + E[X h(X)] + E[\mathbf{1}_{X \le z} X]\, f(z) + E[\mathbf{1}_{X > z} X f(X)]$$
$$= \Phi(z) + E\big[\mathbf{1}_{X \le z} f'(X)\, (-1 + G)\big] + E[\mathbf{1}_{X \le z} X]\, f(z) + E[\mathbf{1}_{X > z} X f(X)]. \qquad (14)$$
When $x \le z$, we can use the formula in Step 1 to prove that $f'(x) \ge 0$. Indeed this is trivial when $x \ge 0$, while when $x < 0$, it is proved as follows: for $x = -y < 0$, and using the upper bound in (13),
$$f'(x) = \Phi(z)\left[ 1 + \sqrt{2\pi}\, \big(1 - \Phi(x)\big)\, x e^{x^2/2} \right] = \Phi(z)\left[ 1 - \sqrt{2\pi}\, \Phi(y)\, y e^{y^2/2} \right] \ \ge\ 0.$$
By the lower bound hypothesis (10), we also have positivity of $-1 + G$. Thus the second term on the right-hand side of (14) is non-negative. In other words we have
$$P[X > z] \ \ge\ \Phi(z) + E[\mathbf{1}_{X \le z} X]\, f(z) + E[\mathbf{1}_{X > z} X f(X)] \qquad (15)$$
$$=: \Phi(z) + A. \qquad (16)$$
The sum of the last two terms on the right-hand side of (15), which we call $A$, can be rewritten as follows, using the fact that $E[X] = 0$:
$$A := E[\mathbf{1}_{X \le z} X]\, f(z) + E[\mathbf{1}_{X > z} X f(X)] = E[\mathbf{1}_{X \le z} X]\, f(z) + E\big[\mathbf{1}_{X > z} X (f(X) - f(z))\big] + f(z)\, E[\mathbf{1}_{X > z} X] = E\big[\mathbf{1}_{X > z} X (f(X) - f(z))\big].$$
This quantity $A$ is slightly problematic since, $f$ being decreasing on $[z, +\infty)$, we have $A < 0$. However, we can write $f(X) - f(z) = f'(\xi)(X - z)$ for some $\xi > z$. Note that this $\xi$ is random and depends on $z$; in fact, on the event $X > z$, we have $\xi \in [z, X]$, but we will only need to use the lower bound on $\xi$: we use the lower bound in (13) to get that, for all $\xi > z$,
$$|f'(\xi)| = -f'(\xi) = \big(1 - \Phi(z)\big)\left[ 1 - \sqrt{2\pi}\, \Phi(\xi)\, \xi e^{\xi^2/2} \right] \ \le\ \big(1 - \Phi(z)\big)\, \frac{1}{1 + \xi^2} \ \le\ \frac{1}{1 + \xi^2}. \qquad (17)$$
This upper bound can obviously be further bounded above uniformly by $\frac{1}{1 + z^2}$, which means that
$$|A| \ \le\ E\big[\mathbf{1}_{X > z} X (X - z)\big]\, \frac{1}{1 + z^2}.$$
By using this estimate in (16) we finally get
$$P[X > z] \ \ge\ \Phi(z) - E\big[\mathbf{1}_{X > z} X (X - z)\big]\, \frac{1}{1 + z^2}. \qquad (18)$$
Step 3: integrating by parts. For notational compactness, let $S(z) := P[X > z]$. We integrate the last term in (18) by parts with respect to the positive measure $-dS(x)$. We have, for any $z > 0$,
$$E\big[\mathbf{1}_{X > z} X (X - z)\big] = -\int_z^{\infty} x (x - z)\, dS(x) = z(z - z) S(z) - \lim_{x \to +\infty} x (x - z) S(x) + \int_z^{\infty} (2x - z) S(x)\, dx \ \le\ \int_z^{\infty} (2x - z) S(x)\, dx.$$
The conclusion (18) from the previous step now implies
$$S(z) \ \ge\ \Phi(z) - \frac{1}{1 + z^2} \int_z^{\infty} (2x - z) S(x)\, dx,$$
which finishes the proof of the theorem's lower bound.
Step 4: upper bound. The proof of the upper bound is similar to, not symmetric with, and less delicate than, the proof of the lower bound. Indeed, we can take advantage of a projective positivity result on the inner product of $DX$ and $-DL^{-1}X$, namely [9, Proposition 3.9], which says that $E[G \mid X] \ge 0$. This allows us to avoid the need for any additional moment assumptions. Since $X$ is assumed to have a density, we may use Lemma 3.5 directly with the function $h = f$, which is continuous, and differentiable everywhere except at $x = z$: we have
$$P[X > z] = \Phi(z) - E[f'(X)] + E[f'(X)\, G] = \Phi(z) + E\big[\mathbf{1}_{X \le z} f'(X) (-1 + G)\big] + E\big[\mathbf{1}_{X > z} f'(X) (-1 + G)\big] \ \le\ \Phi(z) + E\big[\mathbf{1}_{X > z} f'(X) (-1 + G)\big], \qquad (19)$$
where the last inequality simply comes from the facts that, by hypothesis, $-1 + G$ is negative, and when $x \le z$, $f'(x) \ge 0$ (see the previous step for the proof of this positivity). It remains to control the term in (19): since $E[G \mid X] \ge 0$, and using the negativity of $f'$ on $x > z$,
$$E\big[\mathbf{1}_{X > z} f'(X) (-1 + G)\big] = E\big[\mathbf{1}_{X > z} f'(X)\, E[(-1 + G) \mid X]\big] \ \le\ E[-\mathbf{1}_{X > z} f'(X)] = E[\mathbf{1}_{X > z} |f'(X)|].$$
This last inequality, together with the bound on $f'$ obtained in (17), implies
$$E\big[\mathbf{1}_{X > z} f'(X) (-1 + G)\big] \ \le\ P[X > z]\, \frac{1}{1 + z^2}.$$
Thus we have proved that
$$P[X > z] \ \le\ \Phi(z) + P[X > z]\, \frac{1}{1 + z^2},$$
which implies the upper bound of the theorem, finishing its proof.
Proof of Corollary 4.2. To simplify the expressions of the constants in this corollary, we have ignored the term $-z$ in the factor $(2x - z)$ in the lower bound of Theorem 4.1, which of course yields a stronger lower bound statement. One also notes that, by a result in [12], condition (10) implies that $X$ has a density.
Proof of Point 1. Since $X$ has a density, we can apply Lemma 3.5, from which it trivially follows that $\mathrm{Var}[X] = E[X \cdot X] = E[G] \ge 1$.
Proof of Point 2. Since $X$ has a density, $S'$ is defined, and we get
$$F(z) := \int_z^{\infty} x\, P[X > x]\, dx = \int_z^{\infty} x S(x)\, dx \ \le\ \frac{1}{c} \int_z^{\infty} x^2 |S'(x)|\, dx = \frac{1}{c}\left( z^2 S(z) - \lim_{x \to \infty} x^2 S(x) + \int_z^{\infty} 2x S(x)\, dx \right) \ \le\ \frac{1}{c}\left( z^2 S(z) + 2 F(z) \right),$$
which implies
$$F(z) \ \le\ \frac{1}{c - 2}\, z^2 S(z).$$
With the lower bound conclusion of Theorem 4.1, we obtain
$$S(z) \ \ge\ \Phi(z) - \frac{2 z^2}{1 + z^2}\, \frac{1}{c - 2}\, S(z),$$
which is equivalent to the statement of Point 2.
Proof of Point 3. From Theorem 4.1, we have for large $z$,
$$S(z) \ \ge\ \Phi(z) - \frac{1}{1 + z^2} \int_z^{\infty} 2 x^{1-c}\, x^c S(x)\, dx \ \ge\ \Phi(z) - \frac{1}{1 + z^2}\, \sup_{x > z} \big[x^c S(x)\big] \int_z^{\infty} 2 x^{1-c}\, dx = \Phi(z) - \frac{z^{2-c}}{1 + z^2}\, \frac{2}{c - 2}\, \sup_{x > z} \big[x^c S(x)\big],$$
which implies
$$\sup_{x > z} \big[x^c S(x)\big] \ \ge\ \left( \frac{2}{c - 2} + 1 \right)^{-1} z^c\, \Phi(z),$$
which is equivalent to the first part of the statement of Point 3, the second part following from the fact that $z^c \Phi(z)$ is decreasing for large $z$.
Proof of Theorem 4.3. By Proposition 3.7, with $L := E|X|/2$, for $x \in (a, +\infty)$,
$$\rho(x) = L\, A(x)/g(x).$$
By definition we also get $A'(x) = -x A(x)/g(x) = -x\, \rho(x)/L$, and thus
$$P[X > x] =: S(x) = L \int_x^{+\infty} \frac{-A'(y)}{y}\, dy = L\left( \frac{A(x)}{x} - \lim_{y \to \infty} \frac{A(y)}{y} - \int_x^{+\infty} \frac{A(y)}{y^2}\, dy \right).$$
Since $g$ is non-negative, $A$ is bounded, and the term $\lim_{y \to \infty} A(y)/y$ is thus zero. Equality (11) follows immediately, proving the theorem.
Proof of Corollary 4.5.
Proof of inequality (12). From Theorem 4.3, with $L = E|X|/2$ and $k > 1$, and using the fact that $A$ is decreasing, we can write
$$S(x) =: P[X > x] = L\left( \frac{A(x)}{x} - \int_x^{kx} \frac{A(y)}{y^2}\, dy - \int_{kx}^{+\infty} \frac{A(y)}{y^2}\, dy \right) \ \ge\ L\left( \frac{A(x)}{x} - A(x)\left( \frac{1}{x} - \frac{1}{kx} \right) - \frac{A(kx)}{kx} \right) = L\, \frac{A(x)}{x}\, \frac{1}{k}\left( 1 - \frac{A(kx)}{A(x)} \right).$$
It is now just a matter of using the assumption $g(x) \le c' x^2$ to control $A(kx)/A(x)$. We have, for large $x$,
$$\frac{A(kx)}{A(x)} = \exp\left( -\int_x^{kx} \frac{y\, dy}{g(y)} \right) \ \le\ \exp\left( -\frac{1}{c'} \log k \right) = k^{-1/c'}.$$
This proves
$$S(x) \ \ge\ L\, \frac{A(x)}{x}\, \frac{1}{k}\left( 1 - k^{-1/c'} \right).$$
The proof is completed simply by optimizing this over the values of $k > 1$: the function $k \mapsto \big(1 - k^{-1/c'}\big)/k$ reaches its maximum of $(c')^{c'} (1 + c')^{-(1+c')}$ at $k = (1 + 1/c')^{c'}$.
Proof of Points 1, 2, 3, and 4. Point 1 is immediate since $g(x) \ge 1$ implies $A(x) \ge \exp(-x^2/2)$. Similarly, for Point 2, we have
$$A(x) \ \ge\ \exp\left( -\int_0^{z_0} \frac{y\, dy}{g(y)} \right) \exp\left( -\frac{1}{c''} \int_{z_0}^x \frac{dy}{y} \right) = \mathrm{cst} \cdot x^{-1/c''},$$
and Point 3 follows in the same fashion. Point 4 is shown identically by reversing all inequalities, concluding the proof of the corollary.
Proof of Corollary 4.6. This is in fact a corollary of the proof of Theorem 4.1. At the end of Step 2 therein, in (18), we proved that (10), the lower bound assumption $G \ge 1$, implies
$$S(z) \ \ge\ \Phi(z) - E\big[\mathbf{1}_{X > z} X (X - z)\big]\, \frac{1}{1 + z^2}. \qquad (20)$$
Let us investigate the term $B := E\big[\mathbf{1}_{X > z} X (X - z)\big]$. Using Lemma 3.5 with the function $h(x) = (x - z)\, \mathbf{1}_{x > z}$, we have
$$B = E[\mathbf{1}_{X > z}\, G] = E[\mathbf{1}_{X > z}\, g(X)].$$
Now use the upper bound assumption on $g$: we get, for all $z \ge z_0$,
$$B \ \le\ c' E[\mathbf{1}_{X > z} X^2] = c' E\big[\mathbf{1}_{X > z} X (X - z)\big] + c' z\, E[\mathbf{1}_{X > z} X] = c' B + c' z\left( z S(z) + \int_z^{\infty} S(x)\, dx \right), \qquad (21)$$
where we used integration by parts for the last step. Again using integration by parts, but directly on the definition $B := \int_z^{\infty} (x^2 - z x)\, \rho(x)\, dx$, yields
$$B = 2 \int_z^{\infty} x S(x)\, dx - z \int_z^{\infty} S(x)\, dx.$$
Introducing the additional notation $D := z \int_z^{\infty} S(x)\, dx$ and $E := 2 \int_z^{\infty} x S(x)\, dx$, we see that $B = E - D$ and also that $E \ge 2D$. Moreover, in (21), we also recognize the appearance of $D$. Therefore we have
$$(E - D)(1 - c') \ \le\ c' D + c' z^2 S(z) \ \le\ (c'/2)\, E + c' z^2 S(z).$$
With $E - D \ge E/2$, we now get $E(1 - c') \le c' E + 2 c' z^2 S(z)$, i.e.
$$B \ \le\ E \ \le\ \frac{2 c'}{1 - 2c'}\, z^2 S(z).$$
From (20), we now get
$$S(z) \ \ge\ \Phi(z) - \frac{2 c' (1 - 2c')^{-1} z^2}{1 + z^2}\, S(z),$$
from which we obtain, for $z \ge z_0$,
$$S(z) \ \ge\ \frac{1 + z^2}{1 + \big( 2 c' (1 - 2c')^{-1} + 1 \big) z^2}\, \Phi(z),$$
finishing the proof of the corollary.
5 Fluctuation exponent and deviations for polymers in Gaussian environments
Lemma 3.8 provides a way to calculate $G := \langle DX, -DL^{-1}X \rangle_H$ in order to check, for instance, whether it is bounded below by a positive constant $c^2$. If $c^2 \ne 1$, because of the bilinearity of condition (10), one only needs to consider $X/c$ instead of $X$ in order to apply Theorem 4.1, say. To show that such a tool can be applied with ease in a non-trivial situation, we have chosen the issue of fluctuation exponents for polymers in random environments.
We can consider various polymer models in random environments constructed by analogy with the so-called stochastic Anderson models (see [18] and [5]). A polymer's state space $R$ can be either $\mathbb{R}^d$ or the $d$-dimensional torus $\mathbb{S}^d$, or also $\mathbb{Z}^d$ or $\mathbb{Z}/p\mathbb{Z}$; we could also use any Lie group for $R$. We equip $R$ with a Markov process $b$ on $[0, \infty)$ whose infinitesimal generator, under the probability measure $\mathbf{P}_b$, is the Laplace (Beltrami) operator or the discrete Laplacian. Thus, for instance, $b$ is Brownian motion when $R = \mathbb{R}^d$, or is the simple symmetric random walk when $R = \mathbb{Z}^d$; it is the image of Brownian motion by the imaginary exponential map when $R = \mathbb{S}^1$. To simplify our exposition, we can and will typically assume, unless explicitly stated otherwise, that $R = \mathbb{R}$, but our constructions and proofs can be adapted to any of the above choices.
5.1 The random environment
Let $W$ be a Gaussian field on $\mathbb{R}_+ \times \mathbb{R}$ which is homogeneous in space and is Brownian in time for fixed space parameter: the covariance of $W$ is thus
$$E[W(t, x)\, W(s, y)] = \min(s, t)\, Q(x - y),$$
for some homogeneous covariance function $Q$ on $\mathbb{R}$. We assume that $Q$ is continuous and that its Fourier transform is a measure with a density denoted by $\hat{Q}$. Note that $\hat{Q}$ is a positive function, and $|Q|$ is bounded by $Q(0)$. The field $W$ can be represented using a very specific isonormal Gaussian process: there exists a white noise measure $M$ on $\mathbb{R}_+ \times \mathbb{R}$ such that
$$W(t, x) = \int_0^t \int_{\mathbb{R}} M(ds, d\lambda)\, \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda x},$$
where the above integral is the Wiener integral of $(s, \lambda) \mapsto \mathbf{1}_{[0,t]}(s)\, \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda x}$ with respect to $M$. This $M$ is the Gaussian white noise measure generated by an isonormal Gaussian process whose Hilbert space is $H = L^2(\mathbb{R}_+ \times \mathbb{R}, dr\, d\lambda)$, i.e. the control measure of $M$ is the Lebesgue measure on $\mathbb{R}_+ \times \mathbb{R}$. Malliavin derivatives relative to $M$ will take their parameters $(s, \lambda)$ in $\mathbb{R}_+ \times \mathbb{R}$, and inner products and norms are understood in $H$. There is a slight possibility of notational confusion since now the underlying isonormal Gaussian process is called $M$, with the letter $W$, the traditional name of the polymer potential field, being a linear transformation of $M$.
The relation between $D$ and $W$ is thus that $D_{s,\lambda} W(t, x) = e^{i\lambda x}\, \sqrt{\hat{Q}(\lambda)}\, \mathbf{1}_{[0,t]}(s)$. We will make use of the following similarly important formulas: for any measurable function $f$,
$$D_{s,\lambda} \int_{\mathbb{R}} \int_0^t M(dr, d\mu)\, \sqrt{\hat{Q}(\mu)}\, e^{i\mu f(r)} = \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda f(s)}\, \mathbf{1}_{[0,t]}(s), \qquad (22)$$
$$\int_0^t ds \int_{\mathbb{R}} \hat{Q}(\lambda)\, e^{i\lambda f(s)}\, d\lambda = \int_0^t Q(f(s))\, ds. \qquad (23)$$
The last equality is obtained by Fubini's theorem and the definition of the function $\hat{Q}$ as the Fourier transform of the univariate function $Q$. Quantitatively, formula (23) will be particularly useful as a key to easy upper bounds, by noting the fact that $\max_{x \in \mathbb{R}} Q(x) = Q(0)$ is positive and finite. On the other hand, if $Q$ is positive and non-degenerate, lower bounds will easily follow.
In order to use the full strength of our estimates in Section 4, we will also allow $Q$ to be inhomogeneous, and in particular, unbounded. This is easily modeled by specifying that
$$W(t, x) = \int_0^t \int_{\mathbb{R}} M(ds, d\lambda)\, q(\lambda, x),$$
where $\int_{\mathbb{R}} q(\lambda, x)\, q(\lambda, y)\, d\lambda = Q(x, y)$. Calculations similar to (22) and (23) then ensue.
We may also devise polymer models in non-Gaussian environments by considering $W$ as a mixture of Gaussian fields. This means that we consider $Q$ to be random itself, with respect to some separate probability space. We will place only weak restrictions on this randomness: under a probability measure $\mathbb{P}$, we assume $\hat{Q}$ is a non-negative random field on $\mathbb{R}$, integrable on $\mathbb{R}$, with $Q(0) = \int_{\mathbb{R}} \hat{Q}(\lambda)\, d\lambda$ integrable with respect to $\mathbb{P}$.
5.2 The polymer and its fluctuation exponent
Let the Hamiltonian of a path $b$ in $R$ under the random environment $W$ be defined, up to time $t$, as
$$H_t^W(b) := \int_0^t W(ds, b_s) = \int_{\mathbb{R}} \int_0^t M(ds, d\lambda)\, \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda b_s}.$$
Since $W$ is a symmetric field, we have omitted the traditional negative sign in front of the definition of $H_t^W$. For a fixed path $b$, this Hamiltonian $H_t^W(b)$ is a Gaussian random variable w.r.t. $W$.
The polymer $\tilde{\mathbf{P}}_b$ based on $b$ in the random Hamiltonian $H^W$ is defined as the law whose Radon-Nikodym derivative with respect to $\mathbf{P}_b$ is $Z_t(b)/\mathbf{E}_b[Z_t(b)]$, where
$$Z_t(b) := \exp\big( H_t^W(b) \big).$$
We use the notation $u$ for the partition function (normalizing constant) of this measure:
$$u(t) := \mathbf{E}_b[Z_t(b)].$$
The process $u(t)$ is of special importance: its behavior helps understand the behavior of the whole measure $\tilde{\mathbf{P}}_b$. When $b_0 = x$ instead of $0$, the resulting $u(t, x)$ is the solution of a stochastic heat equation with multiplicative noise potential $W$, and the logarithm of this solution solves a so-called stochastic Burgers equation.
It is known that $t^{-1} \log u(t)$ typically converges almost surely to a non-random constant called the almost sure Lyapunov exponent of $u$ (see [18] and references therein for instance; the case of random $Q$ is treated in [7]; the case of inhomogeneous $Q$ on a compact space is discussed in [5]). The speed of concentration of $\log u(t)$ around its mean has been the subject of some debate recently. One may consult [1] for a discussion of the issue and its relation to the so-called wandering exponent in non-compact space. The question is to evaluate the asymptotics of $\log u(t) - E[\log u(t)]$ for large $t$, or to show that it is roughly equivalent to $t^{\chi}$, where $\chi$ is called the fluctuation exponent. The most widely used measure of this behavior is the asymptotics of $\mathrm{Var}[\log u(t)]$. Here we show that if the space is compact with positive correlations, or if $W$ has infinite spatial correlation range, then $\mathrm{Var}[\log u(t)]$ behaves as $t$, i.e. the fluctuation exponent is $\chi = 1/2$. This result is highly robust to the actual distribution of $W$, since it does not depend on the law of $Q$ under $\mathbb{P}$ beyond its first moment. We also provide a class of examples in which $H^W$ is replaced by a non-linear functional of $W$, and yet the fluctuation exponent, as measured by the power behavior of $\sqrt{\mathrm{Var}[\log u(t)]}$, is still $1/2$.
We hope that our method will stimulate the study of this problem for other correlation structures not covered by the theorem below, in particular in infinite space when the correlation range of $W$ is finite or decaying at a certain speed at infinity, or in the case of space-time white noise in discrete space, i.e. when the Brownian motions $\{ W(\cdot, x) : x \in \mathbb{Z}^d \}$ form an IID family. We conjecture that $\chi$ will depend on the decorrelation speed of $W$. It is at least believed by some that in the case of space-time white noise, $\chi < 1/2$.
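Before turning to the Malliavin-calculus analysis, the following rough simulation sketch (our own illustration, with ad hoc choices that are not part of the paper's construction: discrete time steps, a finite cyclic state space, a nearest-neighbour random walk in place of the continuous-time walk, and a toy covariance $Q(x,y) = 1/2 + (1/2)\mathbf{1}_{x=y}$, so that $Q_m = 1/2$ and $Q(0) = 1$) can be used to observe numerically that $\mathrm{Var}[\log u(t)]$ grows roughly linearly in $t$, as stated in Theorem 5.4 below.

```python
import numpy as np

# Rough Monte Carlo sketch (illustrative discretization only): environment white in time,
# spatial covariance Q(x,y) = 0.5 + 0.5*1_{x=y}, so Qm = 0.5 and Q(0) = 1.
rng = np.random.default_rng(0)
n_sites, n_steps, dt = 20, 200, 0.05            # state space Z/20Z, horizon t = 10
n_env, n_paths = 200, 2000

Q = 0.5 * np.ones((n_sites, n_sites)) + 0.5 * np.eye(n_sites)
chol = np.linalg.cholesky(Q)

log_u = np.empty(n_env)
for j in range(n_env):
    # dW(t_i, x): centered Gaussian, covariance Q(x,y)*dt across sites, independent over time
    dW = (chol @ rng.standard_normal((n_sites, n_steps))) * np.sqrt(dt)
    # nearest-neighbour walk paths b (a stand-in for the Markov process b of the text)
    steps = 2 * rng.integers(0, 2, size=(n_paths, n_steps)) - 1
    paths = np.mod(np.cumsum(steps, axis=1), n_sites)
    # discretized Hamiltonian H_t(b) = sum_i dW(t_i, b_{t_i}); partition function u(t) = E_b[exp(H_t(b))]
    H = dW[paths, np.arange(n_steps)].sum(axis=1)
    log_u[j] = np.log(np.mean(np.exp(H)))       # finite n_paths makes u(t) itself a noisy estimate

t = n_steps * dt
print("Var[log u(t)] / t =", np.var(log_u) / t)  # Theorem 5.4 predicts a value between Qm and (pi/2)^2 * Q(0)
```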
The starting point for studying $\mathrm{Var}[\log u(t)]$ is the estimation of the function $g$ relative to the random variable $\log u(t) = \log \mathbf{E}_b\big[ \exp H_t^W(b) \big]$. Here, because the integral $H_t^W(b) = \int_0^t W(ds, b_s)$ has to be understood as $\int_0^t \int_{\mathbb{R}} M(ds, d\lambda)\, \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda b_s}$, we must calculate the Malliavin derivative with parameters $r$ and $\lambda$. We will use the consequence of Mehler's formula described in Lemma 3.8 of Section 3.2. More specifically, we have the following.
Lemma 5.1 Assume $Q$ is homogeneous. Let
$$X := \frac{\log u(t) - E[\log u(t)]}{\sqrt{t}}.$$
Then
$$D_{s,\lambda} X = \frac{1}{\sqrt{t}}\, \frac{1}{u(t)}\, \mathbf{E}_b\left[ \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda b_s}\, e^{H_t^W(b)} \right] \mathbf{1}_{[0,t]}(s),$$
and
$$G := \langle DX, -DL^{-1}X \rangle_H = \frac{1}{2t} \int_{-\pi/2}^{\pi/2} |\sin\theta|\, d\theta\; E'\, \mathbf{E}_{b,\tilde{b}}\left[ \int_0^t ds\, Q\big(b_s - \tilde{b}_s\big)\, \frac{\exp\big( H_t^W(b) \big)}{u(t)}\, \frac{\exp\big( H_t^{R_\theta W}(\tilde{b}) \big)}{R_\theta u(t)} \right], \qquad (24)$$
where $\mathbf{E}_{b,\tilde{b}}$ is the expectation w.r.t. two independent copies $b$ and $\tilde{b}$ of Brownian motion, and $R_\theta W$ was defined in (9). When $Q$ is inhomogeneous, the above formula still holds, with $Q\big(b_s - \tilde{b}_s\big)$ replaced by $Q\big(b_s, \tilde{b}_s\big)$.
Proof. By formula (22) and the chain rule for Malliavin derivatives, we have, for fixed $b$,
$$D_{s,\lambda}\, e^{H_t^W(b)} = \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda b_s}\, e^{H_t^W(b)}\, \mathbf{1}_{[0,t]}(s),$$
and therefore, by linearity of the expectation $\mathbf{E}_b$ and the chain rule again, the first statement of the lemma follows immediately.
Now we investigate $DL^{-1}X$. To use Lemma 3.8 relative to $W$, we note that the expression for $R_\theta X$ is straightforward, since $X$ is defined as a non-random non-linear functional of an expression involving $b$ and $W$, with the latter appearing linearly via $H_t^W(b)$; in other words, $R_\theta X$ is obtained by replacing $H_t^W(b)$ by $H_t^{R_\theta W}(b)$, so we simply have
$$R_\theta X = \frac{\log \mathbf{E}_b\left[ \exp\big( H_t^W(b)\cos\theta + H_t^{W'}(b)\sin\theta \big) \right] - E[\log u(t)]}{\sqrt{t}}.$$
Thus, by Lemma 3.8,
$$-D_{s,\lambda} L^{-1} X = \int_{-\pi/2}^{\pi/2} \frac{\mathrm{sgn}(\theta)}{2\sqrt{t}}\, \mathbf{E}_b\, E'\left[ \sqrt{\hat{Q}(\lambda)}\, e^{-i\lambda b_s}\, \frac{\exp\big( H_t^{R_\theta W}(b) \big)}{R_\theta u(t)}\, \sin\theta \right] d\theta.$$
We may thus calculate explicitly the inner product $G := \langle DX, -DL^{-1}X \rangle_H$, using equation (23), obtaining the second announced result (24). The proof of the first statement is identical in structure to the above arguments. The last statement is obtained again using identical arguments.
It is worth noting that an expression similar to the one for $G := \langle DX, -DL^{-1}X \rangle_H$ can be obtained for $\|DX\|_H^2$. Using the same calculation technique as in the above proof, we have
$$\|DX\|_H^2 = \|DX\|_{L^2([0,t] \times \mathbb{R})}^2 = \frac{1}{t}\, \mathbf{E}_{b,\tilde{b}}\left[ \frac{e^{H_t^W(b)}\, e^{H_t^W(\tilde{b})}}{u^2(t)} \int_0^t ds\, Q\big(b_s, \tilde{b}_s\big) \right] = \frac{1}{t}\, \tilde{\mathbf{E}}_{b,\tilde{b}}\left[ \int_0^t ds\, Q\big(b_s, \tilde{b}_s\big) \right], \qquad (25)$$
where the last expression involves the expectation w.r.t. the polymer measure $\tilde{\mathbf{P}}$ itself, or rather w.r.t. the product measure $d\tilde{\mathbf{P}}_{b,\tilde{b}} = e^{H_t^W(b)}\, e^{H_t^W(\tilde{b})}\, u^{-2}(t)\, d\mathbf{P}_b\, d\mathbf{P}_{\tilde{b}}$ of two independent polymers $b, \tilde{b}$ in the same random environment $W$. This measure is called the two-replica polymer measure, and the quantity $\tilde{\mathbf{E}}_{b,\tilde{b}}\big[ \int_0^t ds\, Q(b_s, \tilde{b}_s) \big]$ is the so-called replica overlap for this polymer. This notion should be familiar to those studying spin glasses such as the Sherrington-Kirkpatrick model (see [20]). The strategy developed in this article suggests that the expression $G := \langle DX, -DL^{-1}X \rangle_H$ may be better suited than the rescaled overlap $\|DX\|_H^2$ in seeking lower bounds on $\log u$'s concentration.
Notation 5.2 In order to simplify the notation in the next theorem, when $Q$ is not homogeneous, we denote $Q(0) := \max_{x \in \mathbb{R}} Q(x, x)$. We then have, in all cases, $Q(0) \ge |Q(x, y)|$ for all $x, y \in \mathbb{R}$. Similarly, we denote $Q_m := \min_{x, y \in \mathbb{R}} Q(x, y)$. In the homogeneous case, $Q_m$ thus coincides with $\min_{x \in \mathbb{R}} Q(x)$. When $Q$ is random, assumptions about $Q$ below are to be understood as being required $\mathbb{P}$-almost surely.
Definition 5.3 To make precise statements about the fluctuation exponent, it is convenient to use the following definition:
$$\chi := \lim_{t \to \infty} \frac{\log \mathrm{Var}[\log u(t)]}{2 \log t}.$$
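For instance, if $\mathrm{Var}[\log u(t)]$ behaves like $c\,t$ for some constant $c > 0$, then $\chi = \lim_{t \to \infty} \log(c t)/(2 \log t) = 1/2$, which is exactly the situation established in the results below.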
Theorem 5.4
1. Assume $Q(0)$ is finite. We have, for all $a, t > 0$,
$$P\left[ \big|\log u(t) - E[\log u(t)]\big| > a \sqrt{t} \right] \ \le\ \left( 1 \wedge \frac{2\, Q^{1/2}(0)}{a \sqrt{2\pi}} \right) \exp\left( -\frac{a^2}{2 Q(0)} \right). \qquad (26)$$
If $Q$ is random, one only needs to take an expectation $\mathbf{E}^{\mathbb{P}}$ of the above right-hand side.
2. Assume $Q(0)$ is finite. Then for all $t$,
$$\mathrm{Var}[\log u(t)] \ \le\ \left( \frac{\pi}{2} \right)^2 \mathbf{E}^{\mathbb{P}}[Q(0)]\; t. \qquad (27)$$
3. Assume $Q_m$ is positive. Then for all $t$,
$$\mathrm{Var}[\log u(t)] \ \ge\ \mathbf{E}^{\mathbb{P}}[Q_m]\; t. \qquad (28)$$
4. Assume $Q_m$ is positive and $Q(0)$ is finite. Then, in addition to (26), we have for any $K \in (0, 1)$ and all $a$ large,
$$P\left[ \big|\log u(t) - E[\log u(t)]\big| > a \sqrt{t} \right] \ \ge\ K\, \frac{Q_m^{1/2}}{a}\, \exp\left( -\frac{a^2}{2 Q_m} \right). \qquad (29)$$
Moreover, in this case, the conclusions (27) and (28) hold simultaneously, so that the fluctuation exponent is $\chi = 1/2$ as soon as $Q(0) \in L^1(\mathbb{P})$.
The hypotheses in Points 3 and 4 of this theorem are satisfied if the state space $\mathbb{R}$ is replaced by a compact set such as $\mathbb{S}^1$, or a finite set, and $Q$ is positive everywhere: then indeed $Q_m > 0$. Although the hypothesis of uniform positivity of $Q$ can be considered as restrictive for non-compact state space, one notes that there is no restriction on how small $Q_m$ can be compared to $Q(0)$; in this sense, the slightest persistent correlation of the random environment at distinct sites results in a fluctuation exponent $\chi = 1/2$. In sharp contrast is the case of space-time white noise in discrete space, which is not covered by our theorem, since then $Q(x) = 0$ except if $x = 0$; the main open problem in discrete space is to prove that $\chi < 1/2$ in this white-noise case.
In relation to the overlap $\|DX\|_H^2$, we see that under the assumptions of Point 4 above, $\|D \log u(t)\|_H = \sqrt{t}\, \|DX\|_H$ is also bounded above and below by non-random multiples of $t^{1/2}$. Hence, while our proofs cannot use $\|DX\|_H^2$ directly to prove $\chi = 1/2$, the situation in which we can prove $\chi = 1/2$ coincides with a case where the overlap has the same rough large-time behavior as $\mathrm{Var}[\log u(t)]$. We believe this is in accordance with common intuition about related spin glass models.
More generally, we consider it an important open problem to understand the precise deviations of $\log u(t)$. The combination of the sub-Gaussian and super-Gaussian estimates (26) and (29) is close to a central limit theorem statement, except for the fact that the rate is not sharply pinpointed. Finding a sharper rate is an arduous task which will require a finer analysis of the expression (24), and should depend heavily and non-trivially on the correlations of the covariance function, just as obtaining a $\chi < 1/2$ should depend on having correlations that decay sufficiently fast at infinity. There, we believe that a fine analysis will reveal differences between $G$ and the overlap $\|DX\|_H^2$, so that precise quantitative asymptotics of $\log u(t)$ can only be understood by analyzing $G$, not merely $\|DX\|_H^2$. For instance, it is trivial to prove that $E[G] \le E\big[ \|DX\|_H^2 \big]$, and we conjecture that this inequality is asymptotically strict for large $t$, while the deviations of $G$ and $\|DX\|_H^2$ themselves from their respective means are quite small, so that their means' behavior is determinant.
Answering these questions is beyond this article's scope; we plan to pursue them actively in the future.
Proof of Theorem 5.4. Proof of Point 1. Since $Q(x, y) \le Q(0)$ for all $x, y$, from Lemma 5.1 we have
$$G \ \le\ \frac{Q(0)}{2t} \int_{-\pi/2}^{\pi/2} |\sin\theta|\, d\theta\; t\; E'\, \mathbf{E}_{b,\tilde{b}}\left[ \frac{\exp\big( H_t^W(b) \big)}{u(t)}\, \frac{\exp\big( H_t^{R_\theta W}(\tilde{b}) \big)}{R_\theta u(t)} \right] = Q(0),$$
where we used the trivial facts that $\mathbf{E}_b\big[ \exp H_t^W(b) \big] = u(t)$ and $\mathbf{E}_{\tilde{b}}\big[ \exp H_t^{R_\theta W}(\tilde{b}) \big] = R_\theta u(t)$. The upper bound result in Theorem 4.1, applied to the random variable $\tilde{X} = X / Q^{1/2}(0)$, now yields
$$P[X > z] = P\big[ \tilde{X} > z\, Q^{-1/2}(0) \big] \ \le\ \left( 1 + \frac{Q(0)}{z^2} \right) \Phi\left( \frac{z}{Q^{1/2}(0)} \right),$$
and the upper bound statement (26) follows.
Proof of Points 2 and 3. Now we note that, since all terms in the integrals in Lemma 5.1 are positive, our hypothesis that $Q(x, y) \ge Q_m > 0$ for all $x, y$ implies
$$G \ \ge\ \frac{Q_m}{2t} \int_{-\pi/2}^{\pi/2} |\sin\theta|\, d\theta\; t\; E'\, \mathbf{E}_{b,\tilde{b}}\left[ \frac{\exp\big( H_t^W(b) \big)}{u(t)}\, \frac{\exp\big( H_t^{R_\theta W}(\tilde{b}) \big)}{R_\theta u(t)} \right] = \frac{Q_m}{2} \int_{-\pi/2}^{\pi/2} |\sin\theta|\, d\theta\; E'\left[ \frac{\mathbf{E}_b\big[ \exp H_t^W(b) \big]\; \mathbf{E}_{\tilde{b}}\big[ \exp H_t^{R_\theta W}(\tilde{b}) \big]}{u(t)\; R_\theta u(t)} \right] = Q_m.$$
Applying Point 1 in Corollary 4.2 to the random variable $\tilde{X} = X / \sqrt{Q_m}$, the lower bound (28) of Point 3 follows. The upper bound (27) of Point 2 can be proved using the result (26) of Point 1, although one obtains a slightly larger constant than the one announced. The constant $(\pi/2)^2$ is obtained by using the bound $\|DX\|_H^2 \le Q(0)$, which follows trivially from (25), and then applying the classical result $\mathrm{Var}[X] \le (\pi/2)^2\, E\big[ \|DX\|_H^2 \big]$, found for instance in [21, Theorem 9.2.3].
Proof of Point 4. Since $Q(0)$ is finite and $Q_m$ is positive, using $\tilde{X} = X / \sqrt{Q_m}$ in Corollary 4.6, we have that $g(x) \ge 1$ and $g(x) \le Q(0)/Q_m$, so that we may use any value $c' > 0$ in the assumption of that corollary, with thus $K = 1 - 2c'$ arbitrarily close to 1; the corollary's conclusion is the statement of Point 4. This finishes the proof of the theorem.
5.3 Robustness of the fluctuation exponent: a non-Gaussian Hamiltonian
The statements of Point 4 of Theorem 5.4 show that if the random environment's spatial covariance is bounded above and below by positive constants, then the partition function's logarithm $\log u(t)$ is both sub-Gaussian and super-Gaussian, in terms of its tail behavior (tail bounded respectively above and below by Gaussian tails). We now provide an example of a polymer subject to a non-Gaussian Hamiltonian, based still on the same random environment, whose logarithmic partition function may not be sub-Gaussian, yet still has a fluctuation exponent equal to $1/2$. It is legitimate to qualify the persistence of this value $1/2$ in a non-Gaussian example as a type of robustness.
Let
$$X_t^W(b) := \int_0^t W(ds, b_s).$$
With $F(t, x) = x + x|x|/(2t)$, we define our new Hamiltonian as
$$H_t^W(b) := F\big( t, X_t^W(b) \big). \qquad (30)$$
Similarly to Lemma 5.1, the chain rule for Malliavin derivatives proves that
$$D_{s,\lambda} X = \frac{1}{\sqrt{t}}\, \frac{1}{u(t)}\, \mathbf{E}_b\left[ \sqrt{\hat{Q}(\lambda)}\, e^{i\lambda b_s}\, e^{H_t^W(b)} \left( 1 + \frac{|X_t^W(b)|}{t} \right) \right] \mathbf{1}_{[0,t]}(s), \qquad (31)$$
and
$$G := \langle DX, -DL^{-1}X \rangle_H = \frac{1}{2t} \int_{-\pi/2}^{\pi/2} |\sin\theta|\, d\theta\; E'\, \mathbf{E}_{b,\tilde{b}}\left[ \int_0^t ds\, Q\big(b_s, \tilde{b}_s\big) \left( 1 + \frac{|X_t^W(b)|}{t} \right) \left( 1 + \frac{|X_t^{R_\theta W}(\tilde{b})|}{t} \right) \frac{\exp\big( H_t^W(b) \big)}{u(t)}\, \frac{\exp\big( H_t^{R_\theta W}(\tilde{b}) \big)}{R_\theta u(t)} \right]. \qquad (32)$$
Theorem 5.5 Consider $u(t) = E_b\bigl[\exp H^W_t(b)\bigr]$ where the new Hamiltonian $H^W_t$ is given in (30). The random environment $W$ is as it was defined in Section 5.1, and $Q(0)$ and $Q_m$ are given in Notation 5.2, and are non-random.
1. Assume $Q(0) < 1/4$. Then $\mathrm{Var}[\log u(t)] \le 64\,(\pi/2)^2\, Q^3(0)\, t + O(t^{-1})$.
2. Assume $Q_m$ is positive. Then $\mathrm{Var}[\log u(t)] \ge Q_m\, t$.
If both assumptions of Points 1 and 2 hold, the fluctuation exponent of Definition 5.3 equals $1/2$, and the conclusion of Point 4 in Theorem 5.4 holds.
The theorem above also works when $Q(0)$ and $Q_m$ are random. We leave it to the reader to check that the conclusions of Points 1 and 2 above hold with expectations $E^P$ on the right-hand sides, and with $O(t^{-1}) = t^{-1}\, E^P\!\left[Q(0)\bigl(1 + \log^2(1 - 4Q(0))\bigr)\right]$.
We suspect that the logarithmic partition function $\log u(t)$ given by the non-Gaussian Hamiltonian in (30) is eminently non-Gaussian itself; in fact, the form of its derivative in (31), with the additional factors of the form $\bigl(1 + |X^W_t(b)|/t\bigr)$, can presumably be compared with $X$. We conjecture, although we are unable to prove it, that the corresponding $g(y)$ grows linearly in $y$. This would show, via Corollary 4.5 Point 3, that $\log u(t)$ has exponential tails. Other examples of non-Gaussian Hamiltonians can be given, using the formulation (30) with other functions $F$, such as $F(t,x) = x + x|x|^p/t^{(1+p)/2}$ for $p > 0$. It should be noted, however, that in our Gaussian environment, any value $p > 1$ results in a partition function $u(t)$ with infinite first moment, in which case the arguments we have given above for proving that the fluctuation exponent equals $1/2$ will not work. This does not mean that the logarithmic partition function cannot be analyzed using finer arguments; it can presumably be proved to be non-Gaussian with heavier-than-exponential tails when $p > 1$.
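To see why $p > 1$ makes the first moment of $u(t)$ infinite, note that for fixed $b$ the random variable $X^W_t(b)$ is centered Gaussian; writing $\sigma^2$ for its variance, which is positive in our setting, a sketch of the computation is
$$
E\bigl[u(t)\bigr] = E_b\, E\!\left[e^{F(t,\, X^W_t(b))}\right]
\;\ge\; E_b\, E\!\left[e^{|X^W_t(b)|^{1+p}/t^{(1+p)/2}}\,\mathbf 1_{X^W_t(b) > 0}\right] = +\infty
\qquad\text{when } 1+p > 2,
$$
since the Gaussian density decays only like $e^{-x^2/(2\sigma^2)}$, which cannot compensate $e^{x^{1+p}/t^{(1+p)/2}}$ for large $x$.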
Proof of Theorem 5.5. Since the additional terms in (32), compared to Lemma 5.1, are factors greater than 1, the conclusion of Point 2 follows immediately using the proof of Points 2 and 3 of Theorem 5.4.
To prove that Point 1 holds, we will use again the classical fact $\mathrm{Var}[X] \le (\pi/2)^2\, E\bigl[\|DX\|_H^2\bigr]$. Here, from (31), note first that
$$
\int_{\mathbf R} |D_{s,\xi}X|^2\, d\xi
= \frac{1}{t\,u^2(t)}\, E_b E_{b'}\!\left[\int_{\mathbf R} \hat Q(\xi)\, e^{i\xi(b_s-b'_s)}\, d\xi\;\; e^{H^W_t(b)}\, e^{H^W_t(b')}\left(1+\frac{|X^W_t(b)|}{t}\right)\left(1+\frac{|X^W_t(b')|}{t}\right)\right]
$$
$$
= \frac{1}{t\,u^2(t)}\, E_b E_{b'}\!\left[Q(b_s-b'_s)\, e^{H^W_t(b)}\, e^{H^W_t(b')}\left(1+\frac{|X^W_t(b)|}{t}\right)\left(1+\frac{|X^W_t(b')|}{t}\right)\right]
$$
$$
\le \frac{Q(0)}{t\,u^2(t)}\, E_b E_{b'}\!\left[e^{H^W_t(b)}\, e^{H^W_t(b')}\left(1+\frac{|X^W_t(b)|}{t}\right)\left(1+\frac{|X^W_t(b')|}{t}\right)\right]
= \frac{Q(0)}{t}\left(1 + E_b\!\left[\frac{e^{H^W_t(b)}}{u(t)}\,\frac{|X^W_t(b)|}{t}\right]\right)^2 .
$$
Integrating over $s \in [0,t]$, we therefore have immediately
$$
\|DX\|_H^2 \;\le\; Q(0)\left(1 + E_b\!\left[\frac{e^{H^W_t(b)}}{u(t)}\,\frac{|X^W_t(b)|}{t}\right]\right)^2 .
$$
Therefore, to get an upper bound on the variance of $X$ uniformly in $t$ we only need to show that the quantity
$$
B := E\!\left[\left(E_b\!\left[\frac{e^{H^W_t(b)}}{u(t)}\,\frac{|X^W_t(b)|}{t}\right]\right)^2\right]
$$
is bounded in $t$. We see that, using Jensen's inequality w.r.t. the polymer measure $d\tilde P_b = e^{H^W_t(b)}\, u(t)^{-1}\, dP_b$, and then w.r.t. the random medium's expectation,
$$
B = \frac{1}{t^2}\, E\!\left[\left(E_b\!\left[\frac{e^{H^W_t(b)}}{u(t)}\,\log e^{|X^W_t(b)|}\right]\right)^2\right]
\le \frac{1}{t^2}\, E\!\left[\left(\log E_b\!\left[\frac{e^{H^W_t(b)+|X^W_t(b)|}}{u(t)}\right]\right)^2\right]
\le \frac{1}{t^2}\,\log^2\!\left(e - 1 + E\, E_b\!\left[\frac{e^{H^W_t(b)+|X^W_t(b)|}}{u(t)}\right]\right).
$$
Here we used the fact that $E_b\bigl[e^{H^W_t(b)+|X^W_t(b)|}\bigr]/u(t) \ge E_b\bigl[e^{H^W_t(b)}\bigr]/u(t) = 1$, and that $x \mapsto \log^2(x + e - 1)$ is concave for $x \ge 1$.
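The concavity claim just used can be checked directly; a sketch: for $x \ge 1$ we have $x + e - 1 \ge e$, hence
$$
\frac{d^2}{dx^2}\,\log^2(x+e-1) = \frac{2\bigl(1 - \log(x+e-1)\bigr)}{(x+e-1)^2} \;\le\; 0,
$$
so $x \mapsto \log^2(x+e-1)$ is indeed concave on $[1,\infty)$; moreover $\log^2 y \le \log^2(y+e-1)$ for $y \ge 1$, which is how the harmless constant $e-1$ enters before Jensen's inequality is applied to the medium expectation.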
Now we evaluate
$$
E\, E_b\!\left[\frac{e^{H^W_t(b)+|X^W_t(b)|}}{u(t)}\right]
= E_b\, E\!\left[\frac{e^{X^W_t(b) + X^W_t(b)|X^W_t(b)|/(2t) + |X^W_t(b)|}}{u(t)}\right]
\le E_b\!\left[E^{1/2}\!\left[u(t)^{-2}\right]\, E^{1/2}\!\left[e^{4|X^W_t(b)| + |X^W_t(b)|^2/t}\right]\right]
\le E^{1/2}\!\left[E_b\!\left[e^{-2H^W_t(b)}\right]\right]\, E_b\!\left[E^{1/2}\!\left[e^{4|X^W_t(b)| + |X^W_t(b)|^2/t}\right]\right]. \qquad (33)
$$
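Two small points in this chain deserve a word. The passage from $u(t)^{-2}$ to $E_b\bigl[e^{-2H^W_t(b)}\bigr]$ is Jensen's inequality followed by Cauchy-Schwarz (a sketch):
$$
u(t)^{-1} = \left(E_b\!\left[e^{H^W_t(b)}\right]\right)^{-1} \le E_b\!\left[e^{-H^W_t(b)}\right],
\qquad\text{hence}\qquad
u(t)^{-2} \le \left(E_b\!\left[e^{-H^W_t(b)}\right]\right)^2 \le E_b\!\left[e^{-2H^W_t(b)}\right];
$$
and in the middle factor we simply bounded the exponent via $2X^W_t(b) + X^W_t(b)|X^W_t(b)|/t + 2|X^W_t(b)| \le 4|X^W_t(b)| + |X^W_t(b)|^2/t$.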
The first term in the above product is actually less than the second. For the second, we note that, for any fixed $b$, the random variable $X^W_t(b)$ is Gaussian and centered, with a variance bounded above by $Q(0)\,t$. Therefore we have that $E\bigl[e^{2|X^W_t(b)|^2/t}\bigr]$ is bounded by the constant $E\bigl[e^{2Q(0)Z^2}\bigr] = (1 - 4Q(0))^{-1/2}$ (here $Z$ denotes a standard normal); that expectation is finite because $Q(0) < 1/4$ by assumption. Similarly, for fixed $b$, $E\bigl[e^{8|X^W_t(b)|}\bigr] \le E\bigl[e^{8\sqrt{Q(0)t}\,|Z|}\bigr] \le 2e^{32Q(0)t}$. We now apply these two estimates and Schwarz's inequality to the last factor in (33), to get:
$$
E^{1/2}\!\left[e^{4|X^W_t(b)| + |X^W_t(b)|^2/t}\right]
\le E^{1/4}\!\left[e^{8|X^W_t(b)|}\right]\, E^{1/4}\!\left[e^{2|X^W_t(b)|^2/t}\right]
\le 2^{1/4}\, e^{8Q(0)t}\, (1 - 4Q(0))^{-1/8} .
$$
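The two Gaussian facts invoked above are standard; for a standard normal $Z$, and using $\mathrm{Var}[X^W_t(b)] \le Q(0)t$ as stated, a sketch of the computation is
$$
E\!\left[e^{aZ^2}\right] = (1-2a)^{-1/2}\ \ (a < 1/2), \quad\text{so}\quad E\!\left[e^{2|X^W_t(b)|^2/t}\right] \le E\!\left[e^{2Q(0)Z^2}\right] = (1-4Q(0))^{-1/2};
$$
$$
E\!\left[e^{\lambda |Z|}\right] \le E\!\left[e^{\lambda Z}\right] + E\!\left[e^{-\lambda Z}\right] = 2e^{\lambda^2/2}, \quad\text{so}\quad E\!\left[e^{8|X^W_t(b)|}\right] \le 2\, e^{32 Q(0) t} \ \ \text{with } \lambda = 8\sqrt{Q(0)t}.
$$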
Combining this with the inequality in (33) now yields
$$
B \le \frac{1}{t^2}\,\log^2\!\left(e - 1 + 2^{1/4}\, e^{8Q(0)t}\,(1-4Q(0))^{-1/8}\right)
\le \frac{1}{t^2}\left(8Q(0)\,t + 1 + 4^{-1}\log 2 - 8^{-1}\log(1-4Q(0))\right)^2
= 64\,Q^2(0) + O\!\left(t^{-1}\right),
$$
where the term $O(t^{-1})$ is non-random (it depends only on $Q(0)$), proving Point 1, and the theorem.
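For completeness, the elementary facts behind the last two relations above are $\log(e-1+y) \le 1 + \log y$ for $y \ge 1$, and the expansion
$$
\frac{1}{t^2}\bigl(8Q(0)\,t + C\bigr)^2 = 64\,Q^2(0) + \frac{16\,Q(0)\,C}{t} + \frac{C^2}{t^2},
\qquad C := 1 + 4^{-1}\log 2 - 8^{-1}\log(1-4Q(0)),
$$
whose last two terms make up the non-random $O(t^{-1})$ just announced.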
6 Appendix
To prove Lemma 3.5, we begin with an intermediate result in the $n$th Wiener chaos.
Lemma 6.1 Let $n \in \mathbf N$ and let $f_n \in H^{\otimes n}$ be a symmetric function. Let $Y \in \mathbf D^{1,2}$. Then
$$
E\bigl[I_n(f_n)\, Y\bigr] \;=\; \frac{1}{n}\, E\!\left[\bigl\langle D\bigl(I_n(f_n)\bigr),\, DY\bigr\rangle_H\right] \;=\; E\!\left[\bigl\langle I_{n-1}(f_n(?,\cdot)),\, DY\bigr\rangle_H\right],
$$
where we used the notation $I_{n-1}(f_n(?,\cdot))$ to denote the function $r \mapsto I_{n-1}(f_n(?,r))$, where $I_{n-1}$ operates on the $n-1$ variables "?" of $f_n(?,r)$.
Proof. This is an immediate consequence of formula (4) and the famous relation $\delta D = -L$ (where $\delta$ is the divergence operator (Skorohod integral), adjoint of $D$; see [13, Proposition 1.4.3]).
Here, however, we present a direct proof. Note that, because of the Wiener chaos expansion of $Y$ in Proposition 2.2, and the fact that all chaos terms of different orders are orthogonal, without loss of generality we can assume $Y = I_n(g_n)$ for some symmetric $g_n \in H^{\otimes n}$; then, using the formula for the covariance of two $n$th-chaos r.v.'s in Proposition 2.2, we have
$$
E\!\left[\langle I_{n-1}(f_n(?,\cdot)),\, DY\rangle_H\right]
= E\!\left[\langle I_{n-1}(f_n(?,\cdot)),\, n\, I_{n-1}(g_n(?,\cdot))\rangle_H\right]
= n\int_T E\!\left[I_{n-1}(f_n(?,r))\, I_{n-1}(g_n(?,r))\right]\mu(dr)
$$
$$
= n\int_T (n-1)!\,\langle f_n(?,r),\, g_n(?,r)\rangle_{L^2(T^{n-1},\,\mu^{n-1})}\,\mu(dr)
= n!\,\langle f_n, g_n\rangle_{L^2(T^n,\,\mu^n)}
= E\bigl[I_n(f_n)\, Y\bigr],
$$
which, together with formula (4), proves the lemma.
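The two chaos identities used in this string of equalities are the standard ones (see e.g. [13]); in the form needed here they read
$$
D_r I_n(g_n) = n\, I_{n-1}(g_n(?,r))
\qquad\text{and}\qquad
E\bigl[I_{n-1}(f)\, I_{n-1}(g)\bigr] = (n-1)!\,\langle f, g\rangle_{L^2(T^{n-1},\,\mu^{n-1})} .
$$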
Proof of Lemma 3.5. Since $X \in \mathbf D^{1,2}$ and is centered, it has a Wiener chaos expansion $X = \sum_{n=1}^{\infty} I_n(f_n)$. We calculate $E[X h(X)]$ via this expansion and the Malliavin calculus, invoking Remark 2.3 and using Lemma 6.1:
$$
E[X h(X)] = \sum_{n=1}^{\infty} E\bigl[I_n(f_n)\, h(X)\bigr]
= \sum_{n=1}^{\infty} \frac{1}{n}\, E\!\left[\int_T D_r I_n(f_n)\; D_r h(X)\;\mu(dr)\right]
= E\!\left[h'(X)\int_T D_r\!\left(\sum_{n=1}^{\infty}\frac{1}{n}\, I_n(f_n)\right) D_r X\;\mu(dr)\right],
$$
which, by the definition of $L^{-1}$, is precisely the statement (7).
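The last step uses the action of the pseudo-inverse $L^{-1}$ on the chaos expansion; explicitly, since $L\, I_n(f_n) = -n\, I_n(f_n)$,
$$
-L^{-1}X = \sum_{n=1}^{\infty} \frac{1}{n}\, I_n(f_n),
\qquad\text{so that}\qquad
E[X h(X)] = E\!\left[h'(X)\,\bigl\langle DX,\,-DL^{-1}X\bigr\rangle_H\right],
$$
which, assuming as the surrounding text indicates that (7) is this identity written with $G$, is the claimed statement.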
Proof of Lemma 3.8. The proof goes exactly as that of Lemma 1.5.1 in [13], with only computational changes. We give it here for completeness. It is sufficient to assume that $X = p\bigl((W(h_i))_{i=1}^n\bigr)$ where $p$ is, for instance, a polynomial in $n$ variables. Thus $R_\theta X = p\bigl((W(h_i)\cos\theta + W'(h_i)\sin\theta)_{i=1}^n\bigr)$, so that $D'_s(R_\theta X) = (\sin\theta)\, R_\theta(D_s X)$. Using the Mehler formula (formula (1.54) in [13]) with $t > 0$ such that $\cos\theta = e^{-t}$, and $T_t = e^{tL}$, we get $E^0[D'_s(R_\theta X)] = (\sin\theta)\, T_t(D_s X)$, which we can rewrite as
$$
E^0\bigl[D'_s(R_\theta X)\bigr] = \sum_{n=0}^{\infty} \sin\theta\,\cos^n\theta\; J_n D_s X .
$$
Integrating this expression over $\theta \in [-\pi/2, \pi/2]$ yields
$$
\frac{1}{2}\int_{-\pi/2}^{\pi/2} \mathrm{sgn}(\theta)\, E^0\bigl[D'_s(R_\theta X)\bigr]\, d\theta
= \sum_{n=0}^{\infty}\left(\frac{1}{2}\int_{-\pi/2}^{\pi/2} |\sin\theta|\,\cos^n\theta\, d\theta\right) J_n D_s X
= \sum_{n=0}^{\infty}\left(\int_0^{\pi/2} \sin\theta\,\cos^n\theta\, d\theta\right) J_n D_s X
= \sum_{n=0}^{\infty}\frac{1}{n+1}\, J_n D_s X .
$$
It is now an elementary property of multiplication operators to check that the last expression above equals $-D_s L^{-1} X$ (see the commutativity relationship (1.63) in [13]), finishing the proof of the lemma.
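The "elementary property of multiplication operators" can also be checked directly on a chaos expansion $X = \sum_{m\ge 1} I_m(f_m)$, which suffices by density (a sketch, with $J_n$ the projection onto the $n$th chaos as above):
$$
J_n D_s X = (n+1)\, I_n\bigl(f_{n+1}(\cdot, s)\bigr),
\qquad
-D_s L^{-1}X = \sum_{m\ge 1} \frac{1}{m}\, m\, I_{m-1}\bigl(f_m(\cdot,s)\bigr) = \sum_{n\ge 0} \frac{1}{n+1}\, J_n D_s X .
$$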
For completeness, we finish with a short proof of the upper bound in Theorem 1.3, which is equivalent to Proposition 1.1.
Proof of Proposition 1.1. Assume $X \in \mathbf D^{1,2}$ is centered and $W$ is the standard Wiener space. By the Clark-Ocone representation formula
$$
X = \int_0^1 E\bigl[D_s X \mid \mathcal F_s\bigr]\, dW(s)
$$
(see [13, Proposition 1.3.5]), we can define a continuous square-integrable martingale $M$ with $M(1) = X$, via the formula $M(t) := \int_0^t E[D_s X \mid \mathcal F_s]\, dW(s)$. The quadratic variation of $M$ is equal to $[M]_t = \int_0^t |E[D_s X \mid \mathcal F_s]|^2\, ds$; therefore, by hypothesis, $[M]_t \le t$. Using the Doléans-Dade exponential martingale $\mathcal E(\lambda M)$ based on $\lambda M$, defined by $\mathcal E(\lambda M)_t = \exp\bigl(\lambda M_t - \frac{\lambda^2}{2}[M]_t\bigr)$, we now have
$$
E\bigl[\exp \lambda X\bigr] = E\!\left[\mathcal E(\lambda M)_1\,\exp\!\left(\frac{\lambda^2}{2}[M]_1\right)\right] \le E\bigl[\mathcal E(\lambda M)_1\bigr]\, e^{\lambda^2/2} = e^{\lambda^2/2} .
$$
The proposition follows using a standard optimization calculation and Chebyshev's inequality. [21, Theorem 9.1.1] can be invoked to prove the same estimate for a general isonormal Gaussian process $W$, as done in [22].
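The "standard optimization calculation" alluded to is the usual Chernoff-type argument; a sketch:
$$
P[X > z] \le e^{-\lambda z}\, E\bigl[e^{\lambda X}\bigr] \le e^{\lambda^2/2 - \lambda z}
\qquad\text{for all } \lambda > 0,
$$
and choosing $\lambda = z$ gives $P[X > z] \le e^{-z^2/2}$, which is the kind of Gaussian upper bound stated in Proposition 1.1 (up to the normalization used there).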
References
[1] Bézerra, S.; Tindel, S.; Viens, F. (2008). Superdiffusivity for a Brownian polymer in a continuous Gaussian environment. Annals of Probability, 36 (5), 1642-1672.
[2] Borell, C. (1978). Tail probabilities in Gauss space. In Vector Space Measures and Applications, Dublin 1977. Lecture Notes in Math. 644, 71-82. Springer-Verlag.
[3] Chatterjee, S. (2007). Stein's method for concentration inequalities. Prob. Th. Rel. Fields 138, 305-321.
[4] Chen, L.; Shao, Q.-M. (2005). Stein's method for normal approximation. In: An introduction to Stein's method, 1-59. Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singapore 4, Singapore U.P.
[5] Florescu, I.; Viens, F. (2006). Sharp estimation for the almost-sure Lyapunov exponent of the Anderson model in continuous space. Probab. Theory and Related Fields, 135 (4), 603-644.
[6] Karatzas, I.; Shreve, S. (1991). Brownian Motion and Stochastic Calculus, 2nd ed. Springer-Verlag.
[7] Kim, H.-Y.; Viens, F.; Vizcarra, A. (2008). Lyapunov exponents for stochastic Anderson models with non-Gaussian noise. Stochastics and Dynamics, 8 (3), 451-473.
[8] Malliavin, P. (2002). Stochastic Analysis. Springer-Verlag.
[9] Nourdin, I.; Peccati, G. (2007). Stein's method on Wiener chaos. To appear in Probability Theory and Related Fields. 32 pages.
[10] Nourdin, I.; Peccati, G. (2008). Stein's method and exact Berry-Esséen asymptotics for functionals of Gaussian fields. To appear in Annals of Probability.
[11] Nourdin, I.; Peccati, G.; Reinert, G. (2008). Second order Poincaré inequalities and CLTs on Wiener space. To appear in J. Functional Analysis.
[12] Nourdin, I.; Viens, F. (2008). Density estimates and concentration inequalities with Malliavin calculus. Preprint: http://arxiv.org/PS_cache/arxiv/pdf/0808/0808.2088v2.pdf.
[13] Nualart, D. (2006). The Malliavin calculus and related topics, 2nd ed. Springer-Verlag.
[14] Nualart, E. (2004). Exponential divergence estimates and heat kernel tail. C. R. Math. Acad. Sci. Paris 338 (1), 77-80.
[15] Nualart, D.; Ortiz-Latorre, S. (2008). Central limit theorems for multiple stochastic integrals. Stochastic Processes and their Applications, 118 (4), 614-628.
[16] Nualart, D.; Quer-Sardanyons, Ll. (2009). Gaussian density estimates for solutions to quasi-linear stochastic partial differential equations. Preprint: http://arxiv.org/abs/0902.1849.
[17] Rinott, Y.; Rotar, V. (2000). Normal Approximation by Stein's method. Decisions in Economics and Finance 23, 15-29.
[18] Rovira, C.; Tindel, S. (2005). On the Brownian directed polymer in a Gaussian random environment. Journal of Functional Analysis, 222, 178-201.
[19] Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Vol. II: Probability Theory, 583-602. Univ. Cal. Press.
[20] Talagrand, M. (2003). Spin glasses: a challenge for Mathematicians. Springer-Verlag.
[21] Üstünel, A.-S. (1995). An Introduction to Analysis on Wiener Space. Lecture Notes in Math. 1610. Springer-Verlag.
[22] Viens, F.; Vizcarra, A. (2008). Supremum Concentration Inequality and Modulus of Continuity for Sub-nth Chaos Processes. Journal of Functional Analysis, 248, 1-26.