A New Generalization of Chebyshev Inequality for Random Vectors∗
Xinjia Chen
June 2007
Abstract
In this article, we derive a new generalization of the Chebyshev inequality for random vectors. We demonstrate that the new generalization is much less conservative than the classical one.
1  Classical Generalization of the Chebyshev Inequality
The Chebyshev inequality reveals a fundamental relationship between the mean and variance of a random variable. Extensive research has been devoted to its generalization to random vectors; see, for example, Marshall and Olkin (1960), Godwin (1955), Mallows (1956) and the references therein. A natural generalization of the Chebyshev inequality is as follows.
For a random vector $X \in \mathbb{R}^n$ with cumulative distribution function $F(\cdot)$,
$$
\Pr\{\|X - \mathrm{E}[X]\| \geq \varepsilon\} \leq \frac{\operatorname{Var}(X)}{\varepsilon^2}, \qquad \forall \varepsilon > 0, \tag{1}
$$
where $\|\cdot\|$ denotes the Euclidean norm of a vector and
$$
\operatorname{Var}(X) \overset{\mathrm{def}}{=} \int_{V \in \mathbb{R}^n} \|V - \mathrm{E}[X]\|^2 \, dF(V).
$$
This classical generalization can be found in a number of textbooks on probability theory and statistics (see, e.g., pp. 446–451 of Laha and Rohatgi (1979)).
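As a quick illustration of inequality (1), the following sketch estimates the left-hand side by Monte Carlo simulation and compares it with the bound $\operatorname{Var}(X)/\varepsilon^2$. It assumes NumPy is available; the Gaussian distribution, the covariance matrix, and the thresholds are illustrative choices, not taken from this article.

```python
# A Monte Carlo sanity check of inequality (1), assuming NumPy; the Gaussian
# distribution, covariance matrix, and thresholds are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200_000
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
mean = np.zeros(n)
X = rng.multivariate_normal(mean, Sigma, size=N)

var_X = np.trace(Sigma)        # Var(X) = E||X - E[X]||^2 = tr(Sigma)
for eps in (2.0, 3.0, 5.0):
    lhs = np.mean(np.linalg.norm(X - mean, axis=1) >= eps)
    print(f"eps={eps}: empirical Pr ~ {lhs:.4f} <= bound {var_X / eps**2:.4f}")
```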
∗ The author is with the Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803; Email: [email protected].
2  New Generalization of the Chebyshev Inequality
The classical generalization (1) closely resembles its counterpart for scalar random variables. However, it may be too conservative. To reduce this conservatism, we derive a new multivariate Chebyshev inequality as follows.
Theorem 1  For any random vector $X \in \mathbb{R}^n$ with covariance matrix $\Sigma$,
$$
\Pr\left\{ (X - \mathrm{E}[X])^\top \Sigma^{-1} (X - \mathrm{E}[X]) \geq \varepsilon \right\} \leq \frac{n}{\varepsilon}, \qquad \forall \varepsilon > 0, \tag{2}
$$
where the superscript "$\top$" denotes the transpose of a matrix.
Proof.  Let $D_\varepsilon = \left\{ V \in \mathbb{R}^n : (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \geq \varepsilon \right\}$. By the definition of $D_\varepsilon$, we have
$$
\frac{1}{\varepsilon} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \geq 1, \qquad \forall V \in D_\varepsilon.
$$
Hence,
$$
\Pr\{X \in D_\varepsilon\} \leq \frac{1}{\varepsilon} \int_{V \in D_\varepsilon} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \, dF(V)
\leq \frac{1}{\varepsilon} \int_{V \in \mathbb{R}^n} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \, dF(V).
$$
For $i = 1, \cdots, n$, let $u_i$ denote the $i$-th element of $V - \mathrm{E}[X]$. For $i = 1, \cdots, n$ and $j = 1, \cdots, n$, let $\sigma_{ij}$ denote the element of $\Sigma$ in the $i$-th row and $j$-th column. Similarly, let $\rho_{ij}$ denote the element of $\Sigma^{-1}$ in the $i$-th row and $j$-th column. Then,
$$
(V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) = \sum_{i=1}^{n} u_i \left( \sum_{k=1}^{n} \rho_{ik} u_k \right) = \sum_{i=1}^{n} \sum_{k=1}^{n} \rho_{ik} u_i u_k.
$$
It follows that
$$
\int_{V \in \mathbb{R}^n} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \, dF(V)
= \int_{V \in \mathbb{R}^n} \left( \sum_{i=1}^{n} \sum_{k=1}^{n} \rho_{ik} u_i u_k \right) dF(V)
= \sum_{i=1}^{n} \sum_{k=1}^{n} \rho_{ik} \int_{V \in \mathbb{R}^n} u_i u_k \, dF(V).
$$
By the definition of the covariance matrix $\Sigma$ and its symmetry, we have
$$
\int_{V \in \mathbb{R}^n} u_i u_k \, dF(V) = \sigma_{ik} = \sigma_{ki}
$$
for $i = 1, \cdots, n$ and $k = 1, \cdots, n$. Hence,
$$
\int_{V \in \mathbb{R}^n} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \, dF(V)
= \sum_{i=1}^{n} \sum_{k=1}^{n} \rho_{ik} \sigma_{ki} = \operatorname{tr}(\Sigma^{-1} \Sigma) = n,
$$
where $\operatorname{tr}(\cdot)$ denotes the trace of a matrix. Therefore,
$$
\Pr\{X \in D_\varepsilon\} \leq \frac{1}{\varepsilon} \int_{V \in \mathbb{R}^n} (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \, dF(V) = \frac{n}{\varepsilon}.
$$
The proof is thus completed. ✷
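The bound in Theorem 1 can be checked numerically in the same spirit. The sketch below assumes NumPy; the Gaussian model and the randomly generated positive definite covariance matrix are illustrative assumptions. It estimates $\Pr\{(X - \mathrm{E}[X])^\top \Sigma^{-1} (X - \mathrm{E}[X]) \geq \varepsilon\}$ and compares it with $n/\varepsilon$.

```python
# A Monte Carlo check of inequality (2), assuming NumPy; the Gaussian model and
# the randomly generated positive definite covariance matrix are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, N = 4, 200_000
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)      # a positive definite covariance matrix
mean = rng.standard_normal(n)
X = rng.multivariate_normal(mean, Sigma, size=N)

d = X - mean
# Row-wise quadratic form (X - E[X])^T Sigma^{-1} (X - E[X])
quad = np.einsum("ij,jk,ik->i", d, np.linalg.inv(Sigma), d)

for eps in (5.0, 10.0, 20.0):
    print(f"eps={eps}: empirical Pr = {np.mean(quad >= eps):.4f}, bound n/eps = {n / eps:.4f}")
```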
Remark 1  Theorem 1 indicates a fundamental relationship between the mean and covariance of a random vector and describes how a random vector deviates from its expectation. In particular, for $n = 1$ we have $\Sigma = \operatorname{Var}(X)$, and by Theorem 1, for any $\epsilon > 0$,
$$
\Pr\left\{ (X - \mathrm{E}[X])^\top \Sigma^{-1} (X - \mathrm{E}[X]) > \epsilon \right\}
= \Pr\left\{ \|X - \mathrm{E}[X]\| > \sqrt{\epsilon \operatorname{Var}(X)} \right\} \leq \frac{1}{\epsilon},
$$
from which we deduce
$$
\Pr\{\|X - \mathrm{E}[X]\| > \varepsilon\} \leq \frac{\operatorname{Var}(X)}{\varepsilon^2}
$$
by letting $\varepsilon = \sqrt{\epsilon \operatorname{Var}(X)}$. This shows that Theorem 1 includes the well-known Chebyshev inequality as a special case.
Remark 2  In fact, we established Theorem 1 in [1, pp. 8–9] in 1997. Applications of this result to control engineering can be found in [1, 2]. Recently, Theorem 1 has been extended to random elements taking values in a separable Hilbert space by Rao [7] and to random elements taking values in a separable Banach space by Zhou and Hu [8].
3  Comparison with the Classical Generalization
In this section, we shall show that the inequality in Theorem 1 can be much less conservative
than the classical generalized Chebyshev inequality (1).
Let $\delta \in (0, 1)$. Based on inequality (1), the sphere
$$
B_\delta \overset{\mathrm{def}}{=} \left\{ V \in \mathbb{R}^n : \|V - \mathrm{E}[X]\|^2 \leq \frac{\operatorname{tr}(\Sigma)}{\delta} \right\}
$$
is the smallest set that can be constructed from (1) to ensure $\Pr\{X \in B_\delta\} \geq 1 - \delta$. On the other hand, by applying Theorem 1 we can construct an ellipsoid
$$
E_\delta \overset{\mathrm{def}}{=} \left\{ V \in \mathbb{R}^n : (V - \mathrm{E}[X])^\top \Sigma^{-1} (V - \mathrm{E}[X]) \leq \frac{n}{\delta} \right\},
$$
which guarantees $\Pr\{X \in E_\delta\} \geq 1 - \delta$.
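As a sanity check on the two confidence regions, the following sketch tests membership in $B_\delta$ and $E_\delta$ and verifies empirically that both regions cover at least a $1 - \delta$ fraction of the samples. It assumes NumPy; the bivariate Gaussian model is an illustrative assumption, and the covariance matrix matches the $\sigma = 1$, $k = 25$ example given later in this section.

```python
# A minimal coverage check of B_delta and E_delta, assuming NumPy; the bivariate
# Gaussian model is an illustrative assumption (Sigma matches the sigma = 1,
# k = 25 example given later in this section).
import numpy as np

rng = np.random.default_rng(2)
n, N, delta = 2, 100_000, 0.1
Sigma = np.array([[1.0, 1.0],
                  [1.0, 26.0]])
mean = np.zeros(n)
X = rng.multivariate_normal(mean, Sigma, size=N)
d = X - mean

in_ball = np.sum(d**2, axis=1) <= np.trace(Sigma) / delta        # X in B_delta
quad = np.einsum("ij,jk,ik->i", d, np.linalg.inv(Sigma), d)
in_ellipsoid = quad <= n / delta                                 # X in E_delta

print("coverage of B_delta:", np.mean(in_ball))        # at least 1 - delta
print("coverage of E_delta:", np.mean(in_ellipsoid))   # at least 1 - delta
```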
For a comparison of the conservativeness of the generalized Chebyshev inequalities (1) and (2), it is natural to consider the ratio $\frac{\operatorname{vol}(B_\delta)}{\operatorname{vol}(E_\delta)}$, where $\operatorname{vol}(\cdot)$ is the volume function defined by $\operatorname{vol}(S) = \int_{v \in S} dv$ for any $S \subset \mathbb{R}^n$. Interestingly, we have the following result.
Theorem 2  For any random vector $X \in \mathbb{R}^n$,
$$
\frac{\operatorname{vol}(B_\delta)}{\operatorname{vol}(E_\delta)} = \frac{\sqrt{\left( \frac{\operatorname{tr}(\Sigma)}{n} \right)^n}}{\sqrt{\det(\Sigma)}} \geq 1,
$$
where $\det(\Sigma)$ is the determinant of $\Sigma$.
Proof.  By the definitions of variance and covariance, we have $\operatorname{Var}(X) = \operatorname{tr}(\Sigma)$. It follows that
$$
\operatorname{vol}(B_\delta) = K \left( \sqrt{\frac{\operatorname{tr}(\Sigma)}{\delta}} \right)^n,
$$
where $K > 0$ is a constant. Applying the linear transformation $u = \Sigma^{-\frac{1}{2}} (v - \mathrm{E}[X])$ to the integral $\operatorname{vol}(E_\delta) = \int_{v \in E_\delta} dv$, we have
$$
\operatorname{vol}(E_\delta) = \det\left( \Sigma^{\frac{1}{2}} \right) \int_{\|u\|^2 \leq \frac{n}{\delta}} du = \sqrt{\det(\Sigma)} \, K \left( \sqrt{\frac{n}{\delta}} \right)^n,
$$
and thus
$$
\frac{\operatorname{vol}(B_\delta)}{\operatorname{vol}(E_\delta)} = \frac{\sqrt{\left( \frac{\operatorname{tr}(\Sigma)}{n} \right)^n}}{\sqrt{\det(\Sigma)}}.
$$
To show $\frac{\operatorname{vol}(B_\delta)}{\operatorname{vol}(E_\delta)} \geq 1$, it is equivalent to show
$$
\frac{\operatorname{tr}(\Sigma)}{n} \geq \left[ \det(\Sigma) \right]^{\frac{1}{n}}.
$$
Recall that the geometric mean is no greater than the arithmetic mean,
$$
\left( \prod_{i=1}^{n} \sigma_{ii} \right)^{\frac{1}{n}} \leq \frac{\sum_{i=1}^{n} \sigma_{ii}}{n} = \frac{\operatorname{tr}(\Sigma)}{n}, \tag{3}
$$
where $\sigma_{ii}$, $i = 1, \cdots, n$, are the diagonal elements of $\Sigma$. Note that the covariance matrix $\Sigma$ is positive definite; hence, by Hadamard's inequality,
$$
\det(\Sigma) \leq \prod_{i=1}^{n} \sigma_{ii}. \tag{4}
$$
It follows from (3) and (4) that $\frac{\operatorname{tr}(\Sigma)}{n} \geq \left[ \det(\Sigma) \right]^{\frac{1}{n}}$. The proof is thus completed. ✷
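The ratio formula of Theorem 2 can also be verified directly against the exact volumes of the ball and the ellipsoid, using the standard formula $\pi^{n/2} r^n / \Gamma(n/2 + 1)$ for the volume of an $n$-ball of radius $r$. The sketch below assumes NumPy; the randomly generated positive definite $\Sigma$ and the choices of $n$ and $\delta$ are illustrative.

```python
# A direct check of the ratio formula in Theorem 2 against exact volumes,
# assuming NumPy; the randomly generated positive definite Sigma and the
# choices of n and delta are illustrative.
import numpy as np
from math import gamma, pi, sqrt

def ball_volume(n, radius):
    """Volume of an n-dimensional Euclidean ball of the given radius."""
    return pi ** (n / 2) / gamma(n / 2 + 1) * radius ** n

rng = np.random.default_rng(3)
n, delta = 3, 0.1
A = rng.standard_normal((n, n))
Sigma = A @ A.T + np.eye(n)

# B_delta is a ball of radius sqrt(tr(Sigma)/delta); E_delta is the image of a
# ball of radius sqrt(n/delta) under Sigma^{1/2}, so its volume picks up sqrt(det).
vol_B = ball_volume(n, sqrt(np.trace(Sigma) / delta))
vol_E = sqrt(np.linalg.det(Sigma)) * ball_volume(n, sqrt(n / delta))

ratio_direct = vol_B / vol_E
ratio_formula = sqrt((np.trace(Sigma) / n) ** n / np.linalg.det(Sigma))
print(ratio_direct, ratio_formula)   # the two values agree and are >= 1
```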
As an illustrative example, consider the two-dimensional random vector
$$
X = \begin{bmatrix} y \\ y + z \end{bmatrix},
$$
where $y$ and $z$ are independent Gaussian random variables with zero means and variances $\sigma^2$ and $k \sigma^2$, respectively. Straightforward computation gives
$$
\Sigma = \begin{bmatrix} \sigma^2 & \sigma^2 \\ \sigma^2 & (k + 1) \sigma^2 \end{bmatrix}
$$
and
$$
\frac{\operatorname{vol}(B_\delta)}{\operatorname{vol}(E_\delta)} = \frac{k + 2}{2 \sqrt{k}} \geq \sqrt{2}.
$$
As $k$ increases from 2 to $\infty$ or decreases from 2 to 0, the ratio of volumes increases monotonically and tends to $\infty$.
In Figure 1, the ellipsoid $E_\delta$ and the sphere $B_\delta$ are constructed for $\sigma = 1$, $k = 25$ and $\delta = 0.1$. Moreover, 1000 i.i.d. samples of $X$ are generated to show the coverage of the ellipsoid and the sphere. It can be seen that most samples are included in the ellipsoid. This indicates that Theorem 1 is much less conservative than the classical generalized Chebyshev inequality in describing how a random vector deviates from its expectation.
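A figure of this kind can be reproduced along the following lines. The sketch assumes NumPy and Matplotlib, uses $\sigma = 1$, $k = 25$, $\delta = 0.1$ and 1000 samples as in the text, and the plotting details are illustrative rather than taken from the original code.

```python
# A sketch of the two-dimensional example, assuming NumPy and Matplotlib;
# sigma = 1, k = 25, delta = 0.1 and 1000 samples follow the text, while the
# plotting details are illustrative.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
sigma2, k, delta, N = 1.0, 25.0, 0.1, 1000
Sigma = np.array([[sigma2, sigma2],
                  [sigma2, (k + 1) * sigma2]])
print("volume ratio:", (k + 2) / (2 * np.sqrt(k)))   # equals sqrt((tr/2)^2 / det)

# Draw X = (y, y + z) with y ~ N(0, sigma^2) and z ~ N(0, k sigma^2).
y = rng.normal(0.0, np.sqrt(sigma2), N)
z = rng.normal(0.0, np.sqrt(k * sigma2), N)
X = np.column_stack([y, y + z])

quad = np.einsum("ij,jk,ik->i", X, np.linalg.inv(Sigma), X)
print("ellipsoid coverage:", np.mean(quad <= 2 / delta))
print("sphere coverage:   ", np.mean(np.sum(X**2, axis=1) <= np.trace(Sigma) / delta))

# Boundaries: the sphere has radius sqrt(tr(Sigma)/delta); the ellipsoid is the
# image of the disk ||u||^2 <= n/delta under the Cholesky factor of Sigma.
theta = np.linspace(0.0, 2.0 * np.pi, 400)
unit = np.stack([np.cos(theta), np.sin(theta)])
circle = np.sqrt(np.trace(Sigma) / delta) * unit
ellipse = np.linalg.cholesky(Sigma) @ (np.sqrt(2 / delta) * unit)
plt.plot(circle[0], circle[1], label="Sphere")
plt.plot(ellipse[0], ellipse[1], label="Ellipsoid")
plt.scatter(X[:, 0], X[:, 1], s=4)
plt.legend(); plt.axis("equal"); plt.show()
```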
[Figure 1: Comparison of Generalized Chebyshev Inequalities. The plot shows the ellipsoid $E_\delta$ and the sphere $B_\delta$ together with the 1000 i.i.d. samples of $X$.]

References

[1] X. Chen, On the Probabilistic Characterization of Model Uncertainty and Robustness, Master's thesis, Louisiana State University, pp. 8–9, 1997.

[2] X. Chen and K. Zhou, "On the Probabilistic Characterization of Model Uncertainty and Robustness," Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, pp. 3616–3621, December 1997.

[3] H. J. Godwin, "On Generalizations of Chebyshev Inequality," Journal of the American Statistical Association, Vol. 50, No. 271, pp. 923–945, 1955.

[4] R. G. Laha and V. K. Rohatgi, Probability Theory, John Wiley and Sons, pp. 446–451, 1979.

[5] C. L. Mallows, "Generalizations of Chebyshev Inequalities," Journal of the Royal Statistical Society, Series B, Vol. 18, No. 2, pp. 139–171, 1956.

[6] A. W. Marshall and I. Olkin, "Multivariate Chebyshev Inequalities," The Annals of Mathematical Statistics, Vol. 31, No. 4, pp. 1001–1014, 1960.

[7] B. L. S. P. Rao, "Chebyshev's inequality for Hilbert-space-valued random elements," Statistics and Probability Letters, Vol. 80, pp. 1039–1042, 2010.

[8] L. Zhou and Z. C. Hu, "Chebyshev's inequality for Banach-space-valued random elements," arXiv:1106.0955v1 [math.PR], June 2011.