A tail bound for sums of independent random variables: application to the symmetric Pareto distribution
Christophe Chesneau
Laboratoire de Mathématiques Nicolas Oresme,
Université de Caen Basse-Normandie,
Campus II, Science 3,
14032 Caen, France.
http://www.chesneau-stat.com
[email protected]
Abstract: In this note we prove a bound of the tail probability for a sum of n independent random variables. It can be applied under mild assumptions; the variables are not assumed to be almost surely absolutely bounded, or to admit finite moments of all orders. Moreover, in some cases, it is significantly better than the bound obtained via the standard Markov inequality. To illustrate this result, we investigate the bound of the tail probability for a sum of n weighted i.i.d. random variables having the symmetric Pareto distribution.
AMS 2000 subject classifications: 60E15.
Keywords and phrases: Tail bound, symmetric Pareto distribution.
1. MOTIVATION
Let (Yi )i∈N∗ be independent random variables. For any n ∈ N∗ , we wish to
determine the smallest sequence of functions pn (t) such that
$$P\left(\sum_{i=1}^{n} Y_i \ge t\right) \le p_n(t), \qquad t \in [0, \infty).$$
This problem is well known and numerous results exist. The most famous of them is the Markov inequality. Under mild assumptions on the moments of the Yi's, it gives a polynomial bound pn(t). In many cases, this bound can be improved. For instance, if the Yi's are almost surely absolutely bounded, or admit finite moments of all orders (and these moments satisfy some inequalities), the Bernstein inequalities provide better results. The obtained bounds pn(t) are exponential. See Petrov (1995) and Pollard (1984) for further details and a complete bibliography.
In this note, we present a new inequality which provides a bound pn(t) of the form pn(t) = vn(t) + wn(t), where vn(t) is polynomial and wn(t) is exponential. It can be applied under mild assumptions on the Yi's; as for the Markov inequality, only knowledge of the order of a finite moment is required. The main interest of our inequality is that it can be applied when the 'Bernstein conditions' are
not satisfied, and that it can give better results than the Markov inequality. In order to illustrate this, we investigate the bound of the tail probability for a sum of n weighted i.i.d. random variables having the symmetric Pareto distribution. This case is particularly interesting because the exact expression of the distribution of such a sum is very difficult to identify; see, for instance, Ramsay (2006). Moreover, such sums have applications in economics, actuarial science, survival analysis and queuing networks.
The note is organized as follows. Section 2 presents the main result. In Section 3 we illustrate the use of this result by considering the symmetric Pareto
distribution. The technical proofs are postponed to Section 4.
2. MAIN RESULT
Theorem 2.1 below presents a bound of the tail probability for a sum of n
independent random variables. As mentioned in Section 1, it requires knowledge
only of the order of a finite moment.
Theorem 2.1. Let (Yi )i∈N∗ be independent random variables. We suppose that
• for any n ∈ N∗ , and any i ∈ {1, ..., n}, we have, w.l.o.g., E(Yi ) = 0,
• there exists a real number p ≥ 2 such that, for any n ∈ N∗ , and any
i ∈ {1, ..., n}, we have E(|Yi |p ) < ∞.
Then, for any t > 0, and any n ∈ N∗ , we have
$$P\left(\sum_{i=1}^{n} Y_i \ge t\right) \le C_p\, t^{-p} \max\left(r_{n,p}(t),\ \big(r_{n,2}(t)\big)^{p/2}\right) + \exp\left(-\frac{t^2}{16\, b_n}\right), \qquad (2.1)$$
where, for any $u \in \{2, p\}$, $r_{n,u}(t) = \sum_{i=1}^{n} E\big(|Y_i|^u\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big)$, $b_n = \sum_{i=1}^{n} E\big(Y_i^2\big)$ and $C_p = 2^{2p+1} \max\big(p^p,\ p^{p/2+1} e^p \int_0^{\infty} x^{p/2-1} (1-x)^{-p}\, dx\big)$.
The proof of Theorem 2.1 uses truncation techniques, the Rosenthal inequality and one of the Bernstein inequalities. See Rosenthal (1970) and Petrov (1995).
Clearly, Theorem 2.1 can be applied to a wide class of random variables. However, if the variables are almost surely absolutely bounded, or have finite moments of all orders, the Bernstein inequalities can give sharper results than (2.1). When these conditions are not satisfied, however, Theorem 2.1 becomes of interest. This fact is illustrated in Section 3 below for the symmetric Pareto distribution. Other examples can be studied in a similar fashion.
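To make the quantities in (2.1) concrete, here is a minimal numerical sketch (in Python, not part of the original note) that evaluates the right-hand side of the bound; the function names and the convention that the constant Cp is supplied by the caller are assumptions of this illustration.

```python
import math

def tail_bound(t, p, C_p, second_moments, trunc_moment):
    """Evaluate the right-hand side of (2.1) at a given t > 0.

    second_moments : list of E(Y_i^2), i = 1, ..., n
    trunc_moment   : callable (i, u, c) -> E(|Y_i|^u 1{|Y_i| >= c})
    C_p            : the constant of Theorem 2.1, supplied by the caller
    """
    n = len(second_moments)
    b_n = sum(second_moments)                 # b_n = sum_i E(Y_i^2)
    cutoff = 3.0 * b_n / t                    # truncation level 3 b_n / t
    r_np = sum(trunc_moment(i, p, cutoff) for i in range(n))   # r_{n,p}(t)
    r_n2 = sum(trunc_moment(i, 2, cutoff) for i in range(n))   # r_{n,2}(t)
    polynomial_part = C_p * t ** (-p) * max(r_np, r_n2 ** (p / 2.0))
    exponential_part = math.exp(-(t * t) / (16.0 * b_n))
    return polynomial_part + exponential_part
```

Only second moments and truncated moments of order p enter the computation, in line with the mild moment requirement emphasised above.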
3. APPLICATION: SYMMETRIC PARETO DISTRIBUTION
Proposition 3.1 below investigates the bound of the tail probability for a sum
of n weighted i.i.d. random variables having the symmetric Pareto distribution.
Proposition 3.1. Let s > 2 and (Xi)i∈N∗ be i.i.d. random variables with the probability density function $f(x) = 2^{-1} s |x|^{-s-1}\, 1_{\{|x| \ge 1\}}$. Let (ai)i∈N∗ be a sequence of nonzero real numbers such that $\sum_{i=1}^{n} |a_i|^s < \infty$. Then, for any n ∈ N∗, any $t \in \left(0, \frac{3 b_n}{\rho_n}\right)$, where $\rho_n = \left(\sum_{i=1}^{n} |a_i|^s\right)^{1/s}$, and any p ∈ (2, s), we have
$$P\left(\sum_{i=1}^{n} a_i X_i \ge t\right) \le K_p\, t^{-2p+s}\, b_n^{\,p-s} \sum_{i=1}^{n} |a_i|^s + \exp\left(-\frac{t^2}{16\, b_n}\right), \qquad (3.1)$$
where $b_n = \left(\frac{s}{s-2}\right) \sum_{i=1}^{n} a_i^2$, $K_p = 3^{p-s} \max\left(\frac{s}{s-p},\ \left(\frac{s}{s-2}\right)^{p/2}\right) C_p$, and $C_p = 2^{2p+1} \max\big(p^p,\ p^{p/2+1} e^p \int_0^{\infty} x^{p/2-1} (1-x)^{-p}\, dx\big)$.
Notice that, since the distribution of the variables is symmetric, the constant Cp (associated with the Rosenthal inequality) can be improved. For its optimal form, we refer to Ibragimov and Sharakhmetov (1997).
In the literature, there exist several results on the approximation of the tail probability of a sum of n i.i.d. random variables having the symmetric Pareto distribution. To our knowledge, however, and contrary to Proposition 3.1, these results are asymptotic (i.e. as t → ∞). See, for instance, Goovaerts, Kaas, Laeven, Tang, and Vernic (2005).
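Before turning to the comparison with the Markov inequality, here is a hedged simulation sketch (again in Python, not part of the original note) that samples from the stated density via the inverse-CDF representation |X| = U^{-1/s} with a symmetric sign, and compares the empirical tail frequency of a weighted sum with the right-hand side of (3.1); the parameter values and the constant Kp are arbitrary choices made only for this illustration.

```python
import math
import random

def sample_sym_pareto(s, rng=random):
    """One draw from f(x) = (s/2) |x|^{-s-1} 1{|x| >= 1}: the magnitude is
    Pareto(s) on [1, infinity), the sign is symmetric."""
    magnitude = (1.0 - rng.random()) ** (-1.0 / s)   # inverse CDF of P(|X| > x) = x^{-s}
    return magnitude if rng.random() < 0.5 else -magnitude

def bound_3_1(t, s, p, K_p, a):
    """Right-hand side of (3.1) for weights a = (a_1, ..., a_n); K_p is supplied by the user."""
    b_n = (s / (s - 2.0)) * sum(ai * ai for ai in a)
    poly = K_p * t ** (s - 2.0 * p) * b_n ** (p - s) * sum(abs(ai) ** s for ai in a)
    return poly + math.exp(-(t * t) / (16.0 * b_n))

# Illustrative run with arbitrary parameter choices (not taken from the note).
s, p, K_p = 4.0, 2.5, 10.0
a = [1.0] * 50                                        # weights a_i = 1, n = 50
rho_n = sum(abs(ai) ** s for ai in a) ** (1.0 / s)
b_n = (s / (s - 2.0)) * sum(ai * ai for ai in a)
t = 0.5 * 3.0 * b_n / rho_n                           # a point inside (0, 3 b_n / rho_n)
trials = 20000
hits = sum(sum(ai * sample_sym_pareto(s) for ai in a) >= t for _ in range(trials))
print("empirical tail:", hits / trials, "bound (3.1):", bound_3_1(t, s, p, K_p, a))
```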
Illustration. Here, we consider a simple example to compare the precision of (3.1) with that of the bound obtained via the Markov inequality.
Let s > 2 and (Xi)i∈N∗ be i.i.d. random variables with the probability density function $f(x) = 2^{-1} s |x|^{-s-1}\, 1_{\{|x| \ge 1\}}$. For any integer n such that $n^{1/2 - 1/s} (\log n)^{-1/2} > \frac{2^{3/2}}{3}\, s^{1/2}$, and any $p \in \left(\max\left(\frac{s}{2}, 2\right), s\right)$, if we take $t = t_n = 2^{3/2} (s\, n \log n)^{1/2}$, then we can balance the two terms of the bound in (3.1); there exist two constants, Q1 > 0 and Q2 > 0, such that
$$P\left(\sum_{i=1}^{n} X_i \ge t_n\right) \le Q_1\, n^{1-s/2} (\log n)^{-p+s/2} + n^{1-s/2} \le Q_2\, n^{1-s/2}. \qquad (3.2)$$
Under the same framework, for any p < s, the Markov inequality combined with the Rosenthal inequality (see Lemma 4.1 below) implies the existence of two constants, R1 > 0 and R2 > 0, such that
$$P\left(\sum_{i=1}^{n} X_i \ge t_n\right) \le t_n^{-p}\, E\left(\left|\sum_{i=1}^{n} X_i\right|^p\right) \le R_1\, t_n^{-p}\, n^{p/2} \le R_2\, (\log n)^{-p/2}. \qquad (3.3)$$
Therefore, for n large enough, the rate of convergence in (3.2) is much faster than the one in (3.3). In this case, (3.1) gives a better result than the Markov inequality.
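The gap between the two rates is easy to visualise numerically; the following short sketch (with an arbitrary admissible choice of s and p, which is an assumption of this illustration) prints the orders n^{1-s/2} of (3.2) and (log n)^{-p/2} of (3.3) for increasing n.

```python
import math

s, p = 4.0, 2.5    # any s > 2 and p in (max(s/2, 2), s)
for n in (10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
    rate_new = n ** (1.0 - s / 2.0)             # order of the bound in (3.2)
    rate_markov = math.log(n) ** (-p / 2.0)     # order of the bound in (3.3)
    print(n, rate_new, rate_markov)
```

The polynomial rate quickly dominates the logarithmic one, which is the point of the comparison.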
4. PROOFS
Proof of Theorem 2.1. Let n ∈ N∗. For any t > 0, we have
$$P\left(\sum_{i=1}^{n} Y_i \ge t\right) = P\left(\sum_{i=1}^{n} \big(Y_i - E(Y_i)\big) \ge t\right) \le U + V,$$
where
$$U = P\left(\sum_{i=1}^{n} \left(Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}} - E\big(Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big)\right) \ge \frac{t}{2}\right)$$
and
$$V = P\left(\sum_{i=1}^{n} \left(Y_i\, 1_{\{|Y_i| < 3 b_n / t\}} - E\big(Y_i\, 1_{\{|Y_i| < 3 b_n / t\}}\big)\right) \ge \frac{t}{2}\right).$$
Let us bound U and V, in turn.
The upper bound for U. The Markov inequality yields
$$U \le 2^p\, t^{-p}\, E\left(\left|\sum_{i=1}^{n} \left(Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}} - E\big(Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big)\right)\right|^p\right). \qquad (4.1)$$
Now, let us introduce the Rosenthal inequality. See Rosenthal (1970).
Lemma 4.1 (Rosenthal’s inequality). Let p ≥ 2 and (Xi )i∈N∗ be independent
random variables such that, for any n ∈ N∗ , and any i ∈ {1, ..., n}, we have
E(Xi) = 0 and E(|Xi|^p) < ∞. Then we have
$$E\left(\left|\sum_{i=1}^{n} X_i\right|^p\right) \le c_p \max\left(\sum_{i=1}^{n} E\big(|X_i|^p\big),\ \left(\sum_{i=1}^{n} E\big(X_i^2\big)\right)^{p/2}\right),$$
where $c_p = 2 \max\big(p^p,\ p^{p/2+1} e^p \int_0^{\infty} x^{p/2-1} (1-x)^{-p}\, dx\big)$.
For any i ∈ {1, ..., n}, set $Z_i = Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}} - E\big(Y_i\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big)$. Since $E(Z_i) = 0$ and $E(|Z_i|^p) \le 2^p\, E\big(|Y_i|^p\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big) \le 2^p\, E(|Y_i|^p) < \infty$, Lemma 4.1 applied to the independent variables (Zi)i∈N∗ gives
$$E\left(\left|\sum_{i=1}^{n} Z_i\right|^p\right) \le c_p \max\left(\sum_{i=1}^{n} E\big(|Z_i|^p\big),\ \left(\sum_{i=1}^{n} E\big(Z_i^2\big)\right)^{p/2}\right), \qquad (4.2)$$
where $c_p = 2 \max\big(p^p,\ p^{p/2+1} e^p \int_0^{\infty} x^{p/2-1} (1-x)^{-p}\, dx\big)$.
It follows from (4.1) and (4.2) that
$$U \le 2^p\, t^{-p}\, c_p \max\left(\sum_{i=1}^{n} E\big(|Z_i|^p\big),\ \left(\sum_{i=1}^{n} E\big(Z_i^2\big)\right)^{p/2}\right)
\le 2^{2p}\, t^{-p}\, c_p \max\left(\sum_{i=1}^{n} E\big(|Y_i|^p\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big),\ \left(\sum_{i=1}^{n} E\big(Y_i^2\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big)\right)^{p/2}\right)
= C_p\, t^{-p} \max\left(r_{n,p}(t),\ \big(r_{n,2}(t)\big)^{p/2}\right), \qquad (4.3)$$
where $C_p = 2^{2p} c_p$.
The upper bound for V. Let us present one of the Bernstein inequalities. See, for instance, Petrov (1995).
Lemma 4.2 (Bernstein’s inequality). Let (Xi )i∈N∗ be independent random variables such that, for any n ∈ N∗ and any i ∈ {1, ..., n}, we have E(Xi ) = 0 and
|Xi | ≤ M < ∞. Then, for any λ > 0, and any n ∈ N∗ , we have
$$P\left(\sum_{i=1}^{n} X_i \ge \lambda\right) \le \exp\left(-\frac{\lambda^2}{2\left(d_n^2 + \frac{\lambda M}{3}\right)}\right),$$
where $d_n^2 = \sum_{i=1}^{n} E\big(X_i^2\big)$.
For any i ∈ {1, ..., n}, set $Z_i = Y_i\, 1_{\{|Y_i| < 3 b_n / t\}} - E\big(Y_i\, 1_{\{|Y_i| < 3 b_n / t\}}\big)$. Since $E(Z_i) = 0$ and $|Z_i| \le |Y_i|\, 1_{\{|Y_i| < 3 b_n / t\}} + E\big(|Y_i|\, 1_{\{|Y_i| < 3 b_n / t\}}\big) \le \frac{6 b_n}{t}$, Lemma 4.2 applied with the independent variables (Zi)i∈N∗ and the parameters $\lambda = \frac{t}{2}$ and $M = \frac{6 b_n}{t}$ gives
$$V \le \exp\left(-\frac{t^2}{8\left(\sum_{i=1}^{n} \operatorname{Var}\big(Y_i\, 1_{\{|Y_i| < 3 b_n / t\}}\big) + \frac{t}{6} \cdot \frac{6 b_n}{t}\right)}\right).$$
Since $\sum_{i=1}^{n} \operatorname{Var}\big(Y_i\, 1_{\{|Y_i| < 3 b_n / t\}}\big) \le \sum_{i=1}^{n} E\big(Y_i^2\big) = b_n$, it follows that
$$V \le \exp\left(-\frac{t^2}{16\, b_n}\right). \qquad (4.4)$$
Putting (4.3) and (4.4) together, we obtain the inequality
$$P\left(\sum_{i=1}^{n} Y_i \ge t\right) \le U + V \le C_p\, t^{-p} \max\left(r_{n,p}(t),\ \big(r_{n,2}(t)\big)^{p/2}\right) + \exp\left(-\frac{t^2}{16\, b_n}\right).$$
Theorem 2.1 is proved.
Proof of Proposition 3.1. Let n ∈ N∗. Set, for any i ∈ {1, ..., n}, $Y_i = a_i X_i$. Clearly, (Yi)i∈N∗ are independent random variables such that $E(Y_i) = a_i E(X_i) = 0$ and $\sum_{i=1}^{n} E\big(Y_i^2\big) = \left(\frac{s}{s-2}\right) \sum_{i=1}^{n} a_i^2 < \infty$. In order to apply Theorem 2.1, let us bound the term
$$r_{n,u}(t) = \sum_{i=1}^{n} E\big(|Y_i|^u\, 1_{\{|Y_i| \ge 3 b_n / t\}}\big) = \sum_{i=1}^{n} |a_i|^u\, E\big(|X_i|^u\, 1_{\{|X_i| \ge 3 b_n / (|a_i| t)\}}\big)$$
for any $u \in \{2, p\}$, and any $p \in \left(\max\left(\frac{s}{2}, 2\right), s\right)$.
Recall that $\rho_n = \left(\sum_{i=1}^{n} |a_i|^s\right)^{1/s}$. Since $t \in \left(0, \frac{3 b_n}{\rho_n}\right) \subseteq \left(0, \frac{3 b_n}{\sigma_n}\right)$, where $\sigma_n = \sup_{i=1,...,n} |a_i|$, we have
$$E\big(|X_i|^u\, 1_{\{|X_i| \ge 3 b_n / (|a_i| t)\}}\big) = s \int_{3 b_n / (|a_i| t)}^{\infty} x^{u-s-1}\, dx = \left(\frac{s}{s-u}\right) \left(\frac{3 b_n}{|a_i| t}\right)^{u-s}.$$
Hence,
$$r_{n,u}(t) = \left(\frac{s}{s-u}\right) \left(\frac{3 b_n}{t}\right)^{u-s} \sum_{i=1}^{n} |a_i|^s.$$
Therefore,
$$\max\left(r_{n,p}(t),\ \big(r_{n,2}(t)\big)^{p/2}\right) \le R_p \left(\frac{3 b_n}{t}\right)^{p} \max\left(\left(\frac{3 b_n}{t \rho_n}\right)^{-s},\ \left(\left(\frac{3 b_n}{t \rho_n}\right)^{-s}\right)^{p/2}\right),$$
where $R_p = \max\left(\frac{s}{s-p},\ \left(\frac{s}{s-2}\right)^{p/2}\right)$. Since $t \in \left(0, \frac{3 b_n}{\rho_n}\right)$ and p > 2, we have
$$\max\left(\left(\frac{3 b_n}{t \rho_n}\right)^{-s},\ \left(\left(\frac{3 b_n}{t \rho_n}\right)^{-s}\right)^{p/2}\right) = \left(\frac{3 b_n}{t \rho_n}\right)^{-s}.$$
Hence,
$$\max\left(r_{n,p}(t),\ \big(r_{n,2}(t)\big)^{p/2}\right) \le R_p \left(\frac{3 b_n}{t}\right)^{p-s} \sum_{i=1}^{n} |a_i|^s. \qquad (4.5)$$
Putting (4.5) in Theorem 2.1, we obtain
$$P\left(\sum_{i=1}^{n} a_i X_i \ge t\right) \le K_p\, t^{-2p+s}\, b_n^{\,p-s} \sum_{i=1}^{n} |a_i|^s + \exp\left(-\frac{t^2}{16\, b_n}\right),$$
where $b_n = \left(\frac{s}{s-2}\right) \sum_{i=1}^{n} a_i^2$, $K_p = 3^{p-s} \max\left(\frac{s}{s-p},\ \left(\frac{s}{s-2}\right)^{p/2}\right) C_p$, and $C_p = 2^{2p+1} \max\big(p^p,\ p^{p/2+1} e^p \int_0^{\infty} x^{p/2-1} (1-x)^{-p}\, dx\big)$. Proposition 3.1 is proved.
References
Goovaerts, M., Kaas, R., Laeven, R., Tang, Q. and Vernic, R. (2005). The tail probability of discounted sums of Pareto-like losses in insurance. Scandinavian Actuarial Journal, Issue 6, pp. 446-461.
Ibragimov, R. and Sharakhmetov, Sh. (1997). On an exact constant for the Rosenthal inequality. Theory Probab. Appl., 42, pp. 294-302.
Petrov, V. V. (1995). Limit Theorems of Probability Theory. Clarendon Press, Oxford.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Ramsay, C. M. (2006). The distribution of sums of certain i.i.d. Pareto variates. Commun. Stat., Theory Methods, 35(1-3), pp. 395-405.
Rosenthal, H. P. (1970). On the subspaces of L^p (p ≥ 2) spanned by sequences of independent random variables. Israel Journal of Mathematics, 8, pp. 273-303.