A note on some concentration inequalities under
a non-standard assumption∗
Christophe Chesneau, Jan Bulla & André Sesboüé†
21 September 2009
Abstract
We determine bounds on the tail probability of a sum of n independent random variables. Our assumption on these variables is non-standard: we only suppose that they have a finite moment of order δ for some δ ∈ (1, 2). Some numerical examples illustrate the theoretical results.
1 Introduction
Let (Y_i)_{i∈N∗} be a sequence of independent random variables. For any n ∈ N∗ and t > 0, we wish to determine the smallest p_n(t) satisfying

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ p_n(t).    (1)
Numerous inequalities exist to reach this aim under appropriate assumptions, such
as Markov’s inequality, Tchebychev’s inequality, Chernoff’s inequality, Bernstein’s inequality, Fuk-Nagaev’s inequality, . . . (see, e.g., [1, 2, 3, 4] and the references therein).
In this note, we investigate p_n(t) in a non-standard case, as we merely suppose that there exists δ ∈ (1, 2) such that E|Y_1|^{δ} exists. That is, we have no information on the existence of the variance, and thus most of the common inequalities cannot be applied. We determine three bounds: the first two are consequences of Markov's inequality, and the third, which is more technical and original, offers a suitable alternative. We compare the quality of these bounds via a numerical study.
The note is organized as follows. Section 2 presents the result and the proof, and
Section 3 provides a comparative numerical study of the three bounds.
∗ Mathematics Subject Classifications: 60E15.
† Laboratoire de Mathématiques Nicolas Oresme, Université de Caen Basse-Normandie, Campus II, Science 3, 14032 Caen, France, [email protected], [email protected], [email protected].
2 Tail bounds
2.1 Assumptions
Let n ∈ N∗ and (Y_i)_{i∈N∗} be a sequence of independent random variables. We suppose that
− for any i ∈ {1, . . . , n}, without loss of generality, E(Y_i) = 0,
− there exists a real number δ ∈ (1, 2) such that, for any i ∈ {1, . . . , n}, E|Y_i|^{δ} exists and is known. We assume either that no a priori information on the existence of a moment of order 2 is available, or that this moment does not exist.
For example, let (Y_i)_{i∈N∗} be i.i.d. random variables having the Pareto distribution with parameter s, i.e., Y_1 has the probability density function

    f(x) = ((s − 1)/2) |x|^{−s} if |x| ≥ 1,   and   f(x) = 0 otherwise.

If s ∈ (1 + δ, 3) with δ ∈ (1, 2), the (Y_i)_{i∈N∗} satisfy the previous assumptions, i.e., E(Y_1) = 0, E|Y_1|^{δ} = (s − 1)/(s − δ − 1) and E(Y_1^{2}) does not exist.
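As a quick sanity check of these values, the normalization and the δ-moment can be evaluated numerically. The following R lines are ours (they are not part of the paper's material) and use illustrative parameter values.

  # Numerical check of the Pareto example (illustrative values: s = 2.5, delta = 1.2).
  s <- 2.5; delta <- 1.2                                    # requires s in (1 + delta, 3)
  f <- function(x) ((s - 1) / 2) * abs(x)^(-s)              # density for |x| >= 1
  2 * integrate(f, 1, Inf)$value                            # total mass, approximately 1
  2 * integrate(function(x) x^delta * f(x), 1, Inf)$value   # E|Y_1|^delta, approximately 5
  (s - 1) / (s - delta - 1)                                 # closed form, here 5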
2.2 Results
We aim to bound P( ∑_{i=1}^{n} Y_i ≥ t ), t > 0, as sharply as possible, for given values of n, t, δ, and E|Y_i|^{δ} for all i ∈ {1, . . . , n}.
THEOREM 1. Consider the framework of Section 2.1. For any t > 0 and any n ∈ N∗, we have the following three bounds.

Bound 1:

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ t^{−δ/2} ∑_{i=1}^{n} ( E|Y_i|^{δ} )^{1/2}.

Bound 2:

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ t^{−δ} n^{δ−1} ∑_{i=1}^{n} E|Y_i|^{δ}.

Bound 3:

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ min_{y ∈ [0,∞)} g_n(t, y),

where

    g_n(t, y) = exp( −t^{2} / ( 2 ( y^{2−δ} ∑_{i=1}^{n} E|Y_i|^{δ} + ty/3 ) ) ) + 1 − ∏_{i=1}^{n} ( 1 − y^{−δ} E|Y_i|^{δ} ).
The proofs of Bounds 1 and 2 use elementary tools (Markov's inequality, l_p-inequalities, . . . ); the proof of Bound 3 is more technical (truncation techniques, Bernstein's inequality, . . . ).
PROOF. Let n ∈ N∗. We prove Bounds 1-3 in turn.
Proof of Bound 1. Using Markov's inequality, the inequality | ∑_{i=1}^{n} a_i |^{u} ≤ ∑_{i=1}^{n} |a_i|^{u} with (a_i)_{i∈{1,...,n}} ∈ R^n and u ∈ (0, 1), the fact that δ/2 ∈ (0, 1), and Cauchy-Schwarz's inequality, we obtain

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ t^{−δ/2} E( | ∑_{i=1}^{n} Y_i |^{δ/2} ) ≤ t^{−δ/2} E( ∑_{i=1}^{n} |Y_i|^{δ/2} )
        = t^{−δ/2} ∑_{i=1}^{n} E( |Y_i|^{δ/2} ) ≤ t^{−δ/2} ∑_{i=1}^{n} ( E|Y_i|^{δ} )^{1/2}

for any t > 0, and Bound 1 is proved.
Proof of Bound 2. Using Markov's inequality, the inequality | ∑_{i=1}^{n} a_i |^{v} ≤ n^{v−1} ∑_{i=1}^{n} |a_i|^{v} with (a_i)_{i∈{1,...,n}} ∈ R^n and v ∈ (1, ∞), and the fact that δ ∈ (1, ∞), we obtain

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ t^{−δ} E( | ∑_{i=1}^{n} Y_i |^{δ} ) ≤ t^{−δ} n^{δ−1} E( ∑_{i=1}^{n} |Y_i|^{δ} ) = t^{−δ} n^{δ−1} ∑_{i=1}^{n} E|Y_i|^{δ}

for any t > 0, and Bound 2 is proved.
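The two elementary inequalities invoked in these proofs are easy to check numerically; the R lines below (ours, purely illustrative) verify them on an arbitrary vector.

  # Check of |sum(a)|^u <= sum(|a|^u) for u in (0, 1) and
  # |sum(a)|^v <= n^(v - 1) * sum(|a|^v) for v in (1, Inf).
  set.seed(1)
  n <- 10; a <- rnorm(n)
  u <- 0.7; v <- 1.6
  abs(sum(a))^u <= sum(abs(a)^u)                 # TRUE
  abs(sum(a))^v <= n^(v - 1) * sum(abs(a)^v)     # TRUE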
Proof of Bound 3. For any t > 0 and any y > 0, we have

    P( ∑_{i=1}^{n} Y_i ≥ t ) = P( { ∑_{i=1}^{n} Y_i ≥ t } ∩ { max_{i∈{1,...,n}} |Y_i| < y } )
        + P( { ∑_{i=1}^{n} Y_i ≥ t } ∩ { max_{i∈{1,...,n}} |Y_i| ≥ y } ) ≤ T_1(t, y) + T_2(y),    (2)

where

    T_1(t, y) = P( ∑_{i=1}^{n} Y_i ≥ t / max_{i∈{1,...,n}} |Y_i| < y )   and   T_2(y) = P( max_{i∈{1,...,n}} |Y_i| ≥ y ).
To bound T_1(t, y), we need Bernstein's inequality, which is presented in the lemma below.

LEMMA 1. (Bernstein's inequality, see [3]) Let (X_i)_{i∈N∗} be a sequence of independent random variables such that, for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(X_i) = 0 and |X_i| ≤ M < ∞. Then, for any λ > 0 and any n ∈ N∗, we have

    P( ∑_{i=1}^{n} X_i ≥ λ ) ≤ exp( −λ^{2} / ( 2 ( d^{2} + λM/3 ) ) ),

where d^{2} = ∑_{i=1}^{n} E(X_i^{2}).
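To illustrate the lemma, a small simulation can be compared with the bound. The sketch below is ours and assumes Uniform(−1, 1) variables, so that M = 1 and d² = n/3.

  # Empirical check of Bernstein's inequality for X_i ~ Uniform(-1, 1).
  set.seed(1)
  n <- 100; lambda <- 10; M <- 1
  d2 <- n / 3                                     # sum of Var(X_i) = n * 1/3
  x <- matrix(runif(n * 1e4, -1, 1), nrow = 1e4)  # 10^4 replications of (X_1, ..., X_n)
  mean(rowSums(x) >= lambda)                      # empirical tail, about 0.04
  exp(-lambda^2 / (2 * (d2 + lambda * M / 3)))    # Bernstein bound, about 0.26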
Since E(Y_i) = 0 for any i ∈ {1, . . . , n}, and |Y_i| ≤ y when the event max_{i∈{1,...,n}} |Y_i| < y is realized, Bernstein's inequality applied to the independent random variables (Y_i)_{i∈N∗} gives

    T_1(t, y) ≤ exp( −t^{2} / ( 2 ( ∑_{i=1}^{n} E( Y_i^{2} / max_{i∈{1,...,n}} |Y_i| < y ) + ty/3 ) ) ).

Since ∑_{i=1}^{n} E( Y_i^{2} / max_{i∈{1,...,n}} |Y_i| < y ) ≤ y^{2−δ} ∑_{i=1}^{n} E|Y_i|^{δ}, it follows that

    T_1(t, y) ≤ exp( −t^{2} / ( 2 ( y^{2−δ} ∑_{i=1}^{n} E|Y_i|^{δ} + ty/3 ) ) ).    (3)
To bound T_2(y), we use the independence of the random variables (Y_i)_{i∈N∗} as well as Markov's inequality, and we obtain

    T_2(y) = 1 − P( max_{i∈{1,...,n}} |Y_i| < y ) = 1 − ∏_{i=1}^{n} P( |Y_i| < y )
        = 1 − ∏_{i=1}^{n} ( 1 − P( |Y_i| ≥ y ) ) ≤ 1 − ∏_{i=1}^{n} ( 1 − y^{−δ} E|Y_i|^{δ} ).    (4)
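For the Pareto example of Section 2.1 the tail of |Y_1| is available in closed form, P(|Y_1| ≥ y) = y^{−(s−1)} for y ≥ 1, so (4) can be checked directly. The R lines below are ours and use illustrative values.

  # Check of (4) for the Pareto example.
  s <- 2.5; delta <- 1.2; n <- 50; y <- 100
  m <- (s - 1) / (s - delta - 1)       # E|Y_1|^delta
  1 - (1 - y^(-(s - 1)))^n             # exact P(max_i |Y_i| >= y), about 0.049
  1 - (1 - y^(-delta) * m)^n           # upper bound (4), about 0.63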
Combining (2), (3), and (4), we obtain

    P( ∑_{i=1}^{n} Y_i ≥ t ) ≤ min_{y ∈ [0,∞)} g_n(t, y),

where

    g_n(t, y) = exp( −t^{2} / ( 2 ( y^{2−δ} ∑_{i=1}^{n} E|Y_i|^{δ} + ty/3 ) ) ) + 1 − ∏_{i=1}^{n} ( 1 − y^{−δ} E|Y_i|^{δ} ),

and Bound 3 is proved. This ends the proof of Theorem 1.
REMARK. When t is small enough, Bounds 1 and 2 are not interesting since they are greater than 1. This is not the case for Bound 3: for any t > 0, we have

    min_{y ∈ [0,∞)} g_n(t, y) ≤ lim_{y → ∞} g_n(t, y) = 1.

More precisely, when t < min( n^{(δ−1)/δ} ( ∑_{i=1}^{n} E|Y_i|^{δ} )^{1/δ} , ( ∑_{i=1}^{n} ( E|Y_i|^{δ} )^{1/2} )^{2/δ} ), we have

    min_{y ∈ [0,∞)} g_n(t, y) ≤ 1 < min( t^{−δ/2} ∑_{i=1}^{n} ( E|Y_i|^{δ} )^{1/2} , t^{−δ} n^{δ−1} ∑_{i=1}^{n} E|Y_i|^{δ} ),

and thus Bound 3 is lower than Bound 1 and Bound 2.
3 A numerical study
In what follows, we present some numerical results for the three bounds from Theorem 1 by means of two examples. The first example treats the case of a large value of n, the second example deals with a smaller value of n. Without loss of generality, we assume that E|Y_i|^{δ} = 1, i ∈ {1, . . . , n}, for all calculations. Following the philosophy of reproducible research, the programs are freely available for download at the address http://www.chesneau-stat.com/concentration.r. Running them requires R (see http://www.r-project.org/). These programs contain the scripts to reproduce Figures 1 and 2.
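The linked script is the authors'; for orientation only, a minimal R sketch of the corresponding computations in the i.i.d. case E|Y_i|^{δ} = m could look as follows. The function names and the search interval for y are our own choices and are not taken from concentration.r.

  # The three bounds of Theorem 1 for i.i.d. variables with E|Y_i|^delta = m.
  bound1 <- function(t, n, delta, m = 1) t^(-delta / 2) * n * sqrt(m)
  bound2 <- function(t, n, delta, m = 1) t^(-delta) * n^(delta - 1) * n * m
  gn <- function(y, t, n, delta, m = 1) {
    exp(-t^2 / (2 * (y^(2 - delta) * n * m + t * y / 3))) +
      1 - (1 - y^(-delta) * m)^n
  }
  bound3 <- function(t, n, delta, m = 1) {
    # Numerical minimisation of g_n(t, .); the search is restricted to y with
    # y^(-delta) * m <= 1, where the product term in g_n is well behaved.
    optimize(gn, interval = c(m^(1 / delta), 1e6),
             t = t, n = n, delta = delta, m = m)$objective
  }

For instance, with n = 500 and δ = 1.8, the calls bound1(600, 500, 1.8), bound2(600, 500, 1.8), and bound3(600, 500, 1.8) reproduce the ordering visible in the upper panel of Figure 1: Bound 3 is far below the other two, and Bound 1 still exceeds 1.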
Figure 1 displays the first case, for which n takes the value 500. The three panels display the evolution of Bounds 1, 2, and 3 for different values of δ (more precisely, 1.8, 1.5, and 1.2 in the upper, middle, and lower panel, respectively). The figure shows that Bound 3 is clearly lower than Bounds 1 and 2, in particular for small values of t. Note that the differences between Bounds 2 and 3 decrease for large t and small δ.
Figure 1: Empirical boundary values for large n
This figure displays the values of Bound 1, 2, and 3, respectively, for varying values of t and δ. For all three panels, n = 500. The horizontal gray line represents a bound value of 1.
[Three panels: delta = 1.8 (top), delta = 1.5 (middle), delta = 1.2 (bottom); each plots the bound value p against t (200 to 1000, log-scaled p-axis) for Bound 1, Bound 2, and Bound 3.]
The following Figure 2 deals with the case of small n, more precisely n = 50. The results correspond to those displayed in the previous figure. It is visible that, for smaller values of δ and larger t, Bound 2 falls below Bound 3 and should thus be selected. However, for practical purposes, this case may only be of limited interest.
Figure 2: Empirical boundary values for small n
This figure displays the values of Bound 1, 2, and 3, respectively, for varying values of t and δ. For all three panels, n = 50. The horizontal gray line represents a bound value of 1; the vertical gray line indicates the value of t beyond which Bound 2 is preferable to Bound 3.
[Three panels: delta = 1.8 (top), delta = 1.5 (middle), delta = 1.2 (bottom); each plots the bound value p against t (100 to 500, log-scaled p-axis) for Bound 1, Bound 2, and Bound 3.]
References
[1] F. Chung and L. Lu, Concentration inequalities and martingale inequalities: a survey, Internet Math., 3 (2006-2007), 79–127.
[2] D.H. Fuk and S.V. Nagaev, Probability inequalities for sums of independent random variables, Theor. Probab. Appl., 16 (1971), 643–660.
[3] V.V. Petrov, Limit Theorems of Probability Theory, Clarendon Press, Oxford, 1995.
[4] D. Pollard, Convergence of Stochastic Processes, Springer, New York, 1984.