Elliptically contoured distributions

Probab. Th. Rel. Fields 76, 429-438 (1987)
© Springer-Verlag 1987
Yehoram Gordon
Department of Mathematics, Technion, Israel Institute of Technology, Haifa 32000, Israel
Summary. Given two covariance matrices $R$ and $S$ for a given elliptically contoured distribution, we show how simple inequalities between the matrix elements imply that $E_R(f) \le E_S(f)$, e.g., when $x = (x_{i_1, i_2, \dots, i_n})$ is a multiindex vector and
$$f(x) = \min_{i_1} \max_{i_2} \min_{i_3} \max_{i_4} \cdots\, x_{i_1, \dots, i_n},$$
or $f(x)$ is the indicator function of sets such as
$$\bigcap_{i_1} \bigcup_{i_2} \bigcap_{i_3} \bigcup_{i_4} \cdots\, [x_{i_1, \dots, i_n} \le \lambda_{i_1, \dots, i_n}],$$
of which the well known Slepian's inequality ($n = 1$) is a special case.
1. Introduction
A density of the form $p_\Sigma(x) = |\Sigma|^{-1/2} p(x\Sigma^{-1}x')$, where $\Sigma = (\sigma_{ij})$ is an $n \times n$ symmetric positive definite matrix, is commonly called elliptically contoured. The transformation $y = x\Sigma^{-1/2}$ yields a density which depends only on the Euclidean norm $|y|$. A special case is the normal density $p(t) = (2\pi)^{-n/2} e^{-t/2}$.
Let $x = (x_1, \dots, x_n)$ be an $n$-dimensional random vector. Writing $y = x\Sigma^{-1/2}$, we get
$$E_\Sigma(x_i x_j) = \int x_i x_j\, p_\Sigma(x)\, dx = c_n \sigma_{ij}, \quad \text{i.e. } E_\Sigma(x'x) = c_n \Sigma, \tag{1.1}$$
where
$$c_n = \int y_1^2\, p(|y|^2)\, dy = n^{-1} \int |y|^2 p(|y|^2)\, dy = n^{-1} \int_0^\infty r^{n+1} p(r^2)\, dr \Big/ \int_0^\infty r^{n-1} p(r^2)\, dr, \tag{1.2}$$
* Supported in part by the Fund for the Promotion of Research at the Technion # 100-621, and the
K. & M. Bank Mathematics Research Fund # 100-609
since $1 = \int p(|x|^2)\, dx = a_n \int_0^\infty p(r^2)\, r^{n-1}\, dr$, where $a_n$ is the $(n-1)$-dimensional surface area of the unit sphere $|x| = 1$.
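As a numerical illustration of (1.2), the following minimal sketch evaluates $c_n$ by quadrature. The helper names (`radial_moment`, `c_n`, `normal_profile`) and the second, non-normal profile $p(t) \propto e^{-\sqrt{t}}$ are assumptions introduced here for illustration only; they do not appear in the paper. For the normal density the sketch should return $c_n = 1$, so that (1.1) reduces to $E_\Sigma(x'x) = \Sigma$.

```python
import math

def radial_moment(profile, power, upper=60.0, steps=60_000):
    """Composite Simpson approximation of the integral of
    r**power * profile(r*r) over [0, upper]; steps must be even."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * (r ** power) * profile(r * r)
    return total * h / 3.0

def c_n(profile, n):
    """Constant c_n of (1.2): ratio of two radial moments, divided by n.
    Any normalising constant of the profile cancels in the ratio."""
    return radial_moment(profile, n + 1) / (n * radial_moment(profile, n - 1))

def normal_profile(n):
    """Normal density profile p(t) = (2*pi)**(-n/2) * exp(-t/2)."""
    return lambda t: (2 * math.pi) ** (-n / 2) * math.exp(-t / 2)

# Normal case in dimension 3: c_n is 1 up to quadrature error.
c_normal = c_n(normal_profile(3), 3)

# Hypothetical non-normal profile, p(t) proportional to exp(-sqrt(t)), n = 2.
# The radial integrals are Gamma functions, giving c_n = n + 1 = 3.
c_exp = c_n(lambda t: math.exp(-math.sqrt(t)), 2)
```

The truncation at `upper=60.0` is harmless for these two profiles because their tails are negligible there; a heavier-tailed profile would need a larger cutoff.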
Some of the well known inequalities which were initially proved for normal
distributions, such as Slepian's inequality, were proved also for the more general
class of elliptically contoured distributions (see, e.g., [2, 6] and references therein).
We shall extend to this class and generalize in other directions some results of
Fernique, Kahane, and Gordon. The key is an elementary partial differential
equation satisfied by such densities (see Prop. 1). The results due to Fernique and
Gordon will be given new and simpler proofs.
Derivatives will be understood in the sense of distributions whenever the
functions are not differentiable.
2. Basic Inequalities
The implication (2.1) $\Rightarrow$ (2.2) of the next Proposition is essentially contained in [2] and [6]. We include its proof for the sake of completeness.
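Proposition 1. The following two statements are equivalent for functions $p(t)$ and $q(t)$ which vanish at $\infty$:
$$q(t) = \frac{1}{2} \int_t^\infty p(r)\, dr \tag{2.1}$$
$$\frac{\partial p_\Sigma(x)}{\partial \sigma_{ij}} = \Big(1 - \frac{\delta_{ij}}{2}\Big) \frac{\partial^2 q_\Sigma(x)}{\partial x_i\, \partial x_j}. \tag{2.2}$$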
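Proof. Assume (2.1) holds. Let us denote $\Sigma^{-1} = (\sigma^{ij})$, and let $\hat p_\Sigma(\lambda)$ be the Fourier transform of $p_\Sigma(x)$. Substituting $y = x\Sigma^{-1/2}$ it follows immediately that $\hat p_\Sigma(\lambda) = f(\lambda \Sigma \lambda')$, for some function $f$ defined on $[0, \infty)$. Hence, if $i \ne j$,
$$\frac{\partial \hat p_\Sigma}{\partial \sigma_{ij}}(\lambda) = 2 \lambda_i \lambda_j f'(\lambda \Sigma \lambda'),$$
and
$$\frac{\partial \hat p_\Sigma(\lambda)}{\partial \lambda_k} = 2 \Big( \sum_{l=1}^n \lambda_l \sigma_{kl} \Big) f'(\lambda \Sigma \lambda'),$$
therefore
$$\sum_{k=1}^n \sigma^{ik} \frac{\partial \hat p_\Sigma(\lambda)}{\partial \lambda_k} = 2 \sum_{k,l} \sigma^{ik} \sigma_{kl} \lambda_l f'(\lambda \Sigma \lambda') = 2 \lambda_i f'(\lambda \Sigma \lambda'), \quad \text{so that} \quad \frac{\partial \hat p_\Sigma(\lambda)}{\partial \sigma_{ij}} = \lambda_j \sum_{k=1}^n \sigma^{ik} \frac{\partial \hat p_\Sigma(\lambda)}{\partial \lambda_k}.$$
Integration by parts now yields
$$\lambda_j \frac{\partial \hat p_\Sigma(\lambda)}{\partial \lambda_k} = \lambda_j \int (-i x_k) e^{-i x \lambda'} p_\Sigma(x)\, dx = \int \frac{\partial}{\partial x_j} \big( e^{-i x \lambda'} \big) x_k\, p_\Sigma(x)\, dx = - \int e^{-i x \lambda'} \frac{\partial}{\partial x_j} \big( x_k\, p_\Sigma(x) \big)\, dx.$$
Taking the inverse Fourier transform we obtain
$$\frac{\partial p_\Sigma(x)}{\partial \sigma_{ij}} = - \frac{\partial}{\partial x_j} \Big( \sum_{k=1}^n \sigma^{ik} x_k\, p_\Sigma(x) \Big) = - \frac{1}{2} |\Sigma|^{-1/2} \frac{\partial}{\partial x_j} \Big( p(x \Sigma^{-1} x') \frac{\partial}{\partial x_i} (x \Sigma^{-1} x') \Big) = \frac{\partial^2 q_\Sigma(x)}{\partial x_i\, \partial x_j}.$$
The case $i = j$ is proved similarly.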
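To prove (2.2) $\Rightarrow$ (2.1) we use the following identities:
$$\frac{\partial}{\partial \sigma_{ij}} |\Sigma|^{-1/2} = - \Big(1 - \frac{\delta_{ij}}{2}\Big) |\Sigma|^{-1/2} \sigma^{ij}$$
and
$$\frac{\partial}{\partial \sigma_{ij}} (x \Sigma^{-1} x') = -2 \Big(1 - \frac{\delta_{ij}}{2}\Big) (\Sigma^{-1} x' x \Sigma^{-1})_{ij},$$
hence
$$\frac{\partial p_\Sigma(x)}{\partial \sigma_{ij}} = - \Big(1 - \frac{\delta_{ij}}{2}\Big) |\Sigma|^{-1/2} \big( \sigma^{ij} p + 2 (\Sigma^{-1} x' x \Sigma^{-1})_{ij}\, p' \big),$$
where $p = p(x \Sigma^{-1} x')$ and $p' = p'(x \Sigma^{-1} x')$.
On the other hand,
$$\frac{\partial^2 q_\Sigma(x)}{\partial x_i\, \partial x_j} = 2 |\Sigma|^{-1/2} \big( \sigma^{ij} q' + 2 (\Sigma^{-1} x' x \Sigma^{-1})_{ij}\, q'' \big).$$
(2.2) implies the equality
$$\Sigma^{-1} (p + 2q') + 2\, \Sigma^{-1} x' x \Sigma^{-1} (p' + 2q'') = 0.$$
Now, given any $t > 0$, we can find two vectors $x$ for which $x \Sigma^{-1} x' = t$ for both vectors, yet $\Sigma^{-1} x' x \Sigma^{-1}$ is not the same. This implies that $p(t) + 2q'(t) = 0$, and (2.1) follows since $q$ vanishes at $\infty$. □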
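For the normal density, relation (2.1) gives $q = p$, so (2.2) reduces to the familiar heat-type identity for Gaussian densities. The sketch below checks $q(t) = p(t)$ numerically at one point; the helper `q_from_p`, the truncation of the integral, and the sample point `t0` are illustrative assumptions, not part of the paper.

```python
import math

def q_from_p(p, t, upper=200.0, steps=20_000):
    """Approximate q(t) = 0.5 * integral of p(r) over [t, t + upper]
    by composite Simpson (steps must be even). The infinite upper limit
    of (2.1) is truncated, which is harmless for fast-decaying p."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * p(t + k * h)
    return 0.5 * total * h / 3.0

n = 2  # dimension used only to fix the normalising constant

def p_normal(t):
    """Normal profile p(t) = (2*pi)**(-n/2) * exp(-t/2)."""
    return (2 * math.pi) ** (-n / 2) * math.exp(-t / 2)

t0 = 1.5                       # arbitrary evaluation point
q_t0 = q_from_p(p_normal, t0)  # should agree with p_normal(t0)
```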
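Let $f(x) = f(x_1, \dots, x_n)$ be a function defined on $R^n$. We set
$$E_\Sigma(f) = \int f(x)\, p_\Sigma(x)\, dx. \tag{2.3}$$
Given two positive definite symmetric matrices $R = (r_{ij})$ and $S = (s_{ij})$, let $\Sigma_\theta = \theta R + (1 - \theta) S$ $(0 \le \theta \le 1)$, and
$$F(\theta) = E_{\Sigma_\theta}(f).$$
Corollary 2. If
$$\sum_{i,j=1}^n (r_{ij} - s_{ij}) \frac{\partial^2 f(x)}{\partial x_i\, \partial x_j} \ge 0 \quad (x \in R^n),$$
then $F(1) \ge F(0)$.
Proof. The chain rule and integration by parts yield
$$F'(\theta) = \frac{1}{2} \int \Big( \sum_{i,j=1}^n (r_{ij} - s_{ij}) \frac{\partial^2 f(x)}{\partial x_i\, \partial x_j} \Big) q_{\Sigma_\theta}(x)\, dx \ge 0,$$
hence $F(1) \ge F(0)$. □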
Remark. This happens, for example, when $R - S$ is non-negative definite and $f(x)$ is convex, since the sum is simply $\operatorname{tr}\Big( (R - S) \Big( \frac{\partial^2 f}{\partial x_i\, \partial x_j} \Big) \Big) \ge 0$.
Of course if the sum is non-positive then $F(1) \le F(0)$. A special case of this is the following corollary.
Corollary 3. Assume that $R$, $S$ and $f(x)$ satisfy
$$\big( E_R(x_i x_j) - E_S(x_i x_j) \big) f''_{x_i x_j} \le 0 \quad \text{for all } i, j.$$
Then $E_R(f) \le E_S(f)$.
Proof. Obvious by Corollary 2, and (1.1). □
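The Remark following Corollary 2 can be illustrated with a quadratic function, whose Hessian is constant. In the sketch below, the matrices `R`, `S`, `H` and both helper functions are hypothetical choices made for illustration; the expectation formula for the quadratic follows from (1.1), and `c=1.0` corresponds to the normal density.

```python
def weighted_hessian_sum(r, s, hessian):
    """Sum over i, j of (r_ij - s_ij) * hessian_ij, i.e. the quantity
    whose sign is assumed in Corollary 2."""
    n = len(r)
    return sum((r[i][j] - s[i][j]) * hessian[i][j]
               for i in range(n) for j in range(n))

def expected_quadratic(sigma, hessian, c=1.0):
    """E_Sigma(f) for f(x) = x H x' / 2, using E_Sigma(x_i x_j) = c * sigma_ij
    from (1.1)."""
    n = len(sigma)
    return 0.5 * c * sum(hessian[i][j] * sigma[i][j]
                         for i in range(n) for j in range(n))

R = [[2.0, 0.5], [0.5, 1.0]]
S = [[1.0, 0.5], [0.5, 1.0]]   # R - S = diag(1, 0) is non-negative definite
H = [[1.0, 0.3], [0.3, 2.0]]   # positive definite, so f(x) = x H x' / 2 is convex

total = weighted_hessian_sum(R, S, H)                        # tr((R - S) H)
gap = expected_quadratic(R, H) - expected_quadratic(S, H)    # equals total / 2
```

Here the sum is non-negative and, consistently with Corollary 2, $E_R(f) \ge E_S(f)$.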
Remark 4. Let $A = \{ (x_i) \mid x_i \le \lambda_i,\ i = 1, \dots, n \}$, and $1_A$ be the indicator function of the set $A$. It is easy to see that $\frac{\partial^2 1_A}{\partial x_i\, \partial x_j} \ge 0$ if $i \ne j$. Hence if $E_R(x_i^2) = E_S(x_i^2)$ for all $i$, and $E_R(x_i x_j) \ge E_S(x_i x_j)$ for $i \ne j$, then for $f(x) = 1_{A^c} = 1 - 1_A$ we have
$$E_R(f) \le E_S(f), \quad \text{i.e. } P_R(A^c) \le P_S(A^c). \tag{2.4}$$
This is the analogue of Slepian's inequality (cf. [8]) which we shall generalize in Theorem 9. It was observed in [7] that Corollary 3 above, in the context of normal density, proves Theorem 1.1 of [3].
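A two-dimensional normal example makes Remark 4 concrete. The sketch below is an illustration under stated assumptions: it takes unit variances and thresholds $\lambda_1 = \lambda_2 = 0$, and it relies on the classical orthant formula $P(x_1 \le 0, x_2 \le 0) = \frac14 + \frac{\arcsin \rho}{2\pi}$ for a centred bivariate normal with correlation $\rho$, which is not taken from the paper. The function name is hypothetical.

```python
import math

def prob_some_positive(rho):
    """P(x1 > 0 or x2 > 0) for a centred bivariate normal with unit
    variances and correlation rho (complement of the lower orthant)."""
    both_nonpositive = 0.25 + math.asin(rho) / (2 * math.pi)
    return 1.0 - both_nonpositive

# R has the larger off-diagonal covariance, equal diagonals.
p_R = prob_some_positive(0.8)
p_S = prob_some_positive(0.2)
```

With these choices `p_R` does not exceed `p_S`, matching the direction of Slepian's inequality stated in Remark 4.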
Taking $\lambda_i = \lambda$ for all $i$, and integrating Slepian's inequality over the interval $-\infty < \lambda < \infty$, we obtain
$$E_R\big( \max_i x_i \big) \le E_S\big( \max_i x_i \big). \tag{2.5}$$
The fact that (2.5) holds under weaker assumptions on the matrix elements of $R$ and $S$ [see Remark 7(ii)] is Fernique's inequality [1, 5], which was proved for normal distributions, and is a consequence of the following general result:
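Theorem 5. Assume that $R$, $S$ and $f(x)$ $(x \in R^n)$ satisfy the following:
$$\big( E_R |x_i - x_j|^2 - E_S |x_i - x_j|^2 \big) f''_{x_i x_j} \ge 0 \quad \text{for all } i, j \tag{2.6}$$
and
$$\sum_{j=1}^n f''_{x_i x_j} = 0 \quad \text{for all } i. \tag{2.7}$$
Then
$$E_R(f) \le E_S(f). \tag{2.8}$$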
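Proof. We estimate the sum
$$\sum_{i,j=1}^n (r_{ij} - s_{ij}) f''_{x_i x_j} = \sum_{i=1}^n (r_{ii} - s_{ii}) f''_{x_i x_i} + \sum_{i \ne j} (r_{ij} - s_{ij}) f''_{x_i x_j} = -\frac{1}{2} \sum_{i \ne j} \big( r_{ii} - s_{ii} + r_{jj} - s_{jj} - 2 (r_{ij} - s_{ij}) \big) f''_{x_i x_j},$$
where the second equality uses (2.7), and using (1.1), our assumptions, and Corollary 2 we have
$$F'(\theta) = -\frac{1}{2} c_n^{-1} \cdot \frac{1}{2} \int \sum_{i,j} \big( E_R |x_i - x_j|^2 - E_S |x_i - x_j|^2 \big) f''_{x_i x_j}\, q_{\Sigma_\theta}(x)\, dx \le 0,$$
hence
$$E_R(f) = F(1) \le F(0) = E_S(f). \quad \square$$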
Remarks. (2.7) is equivalent to the following geometrical condition:
$$f(x + te) = f(x) + ct \quad \text{for all } x \in R^n,\ t \in R^1, \tag{2.9}$$
where $c$ is a constant, and $e = (1, 1, \dots, 1)$.
Indeed (2.7) implies $\sum_{i=1}^n \frac{\partial f}{\partial x_i} = c$, or $\frac{\partial}{\partial t} f(x + te) = c$, implying (2.9). That (2.9) $\Rightarrow$ (2.7) is also obvious.
A class of functions which satisfy (2.9), hence (2.7), are functions of the form $f(x) = h(x_{i_1} - x_{j_1}, \dots, x_{i_k} - x_{j_k})$, where $h(y_1, \dots, y_k)$ is defined on $R^k$.
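Condition (2.9) is easy to test numerically. The following sketch checks it for three sample functions; the helper names, the sample point `x`, the shift `t`, and the particular function of differences `f_diff` are all assumptions chosen for illustration. The max and min-max functions satisfy (2.9) with $c = 1$, while a function of differences satisfies it with $c = 0$.

```python
def shifted(x, t):
    """Return x + t*e, where e = (1, 1, ..., 1)."""
    return [xi + t for xi in x]

def increment(f, x, t):
    """f(x + t*e) - f(x); under (2.9) this equals c * t."""
    return f(shifted(x, t)) - f(x)

def f_max(x):
    return max(x)

def f_minmax(x):
    # (x1 v x2) ^ (x3 v x4), a min-max function of the kind studied in Sect. 3
    return min(max(x[0], x[1]), max(x[2], x[3]))

def f_diff(x):
    # hypothetical function of coordinate differences only
    return abs(x[0] - x[1]) + (x[2] - x[3]) ** 2

x = [0.5, -1.25, 2.0, 0.75]   # dyadic values keep the arithmetic exact
t = 3.5

inc_max = increment(f_max, x, t)        # c = 1
inc_minmax = increment(f_minmax, x, t)  # c = 1
inc_diff = increment(f_diff, x, t)      # c = 0
```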
3. Applications to Special Functions
Besides the function $\max_i x_i$, the function $f(x) = \min_{1 \le i \le n} \max_{1 \le j \le m} x_{ij}$, which was considered in [3] and [4], also proved to be interesting in that the computations made on this and the max function in the context of normal density enabled us to get quantitatively exact estimates of Dvoretzky's theorem, which is a fundamental result in Banach spaces. We shall now generalize and simplify the proofs of various results proved in [3].
We introduce for convenience the following notation: $a \wedge b = \min\{a, b\}$ and $a \vee b = \max\{a, b\}$. We shall consider the function
$$f(\underline{x}) = \bigwedge_{i_1} \bigvee_{i_2} \bigwedge_{i_3} \bigvee_{i_4} \cdots\, x_{i_1, i_2, i_3, \dots, i_k} \tag{2.10}$$
i.e., $f(\underline{x}) = \min_{i_1} \max_{i_2} \min_{i_3} \cdots\, x_{i_1, i_2, \dots, i_k}$,
where for each $l = 1, 2, \dots, k$, the index $i_l$ ranges over some finite non-empty set of integers $C(i_1, i_2, \dots, i_{l-1})$ (which as indicated shows that this set may depend on the previous choices of $i_1, i_2, \dots, i_{l-1}$). The simplest case is when $C(i_1, i_2, \dots, i_{l-1}) = \{1, 2, \dots, n\}$ for all $l = 1, 2, \dots, k$. An example of such a function is
$$(x_1 \vee x_2) \wedge (x_3 \vee x_4) = \bigwedge_{1 \le i_1 \le 2}\ \bigvee_{1 \le i_2 \le 2} x_{i_1, i_2}$$
where we identify: $x_1 = x_{11}$, $x_2 = x_{12}$, $x_3 = x_{21}$, $x_4 = x_{22}$. A triple indexed example is the function
$$[(x_1 \wedge x_2) \vee (x_3 \wedge x_4)] \wedge [x_5 \vee (x_6 \wedge x_7)] = \bigwedge_{i_1} \bigvee_{i_2} \bigwedge_{i_3} x_{i_1, i_2, i_3}$$
where we identify $x_1 = x_{111}$, $x_2 = x_{112}$, $x_3 = x_{121}$, $x_4 = x_{122}$, $x_5 = x_{211}$, $x_6 = x_{221}$, $x_7 = x_{222}$. That is, $1 \mapsto 111$, $2 \mapsto 112$, etc.
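The alternating min-max structure of (2.10), including index sets that depend on earlier indices, can be sketched as a recursion over a nested list. The nested-list encoding and the function name `min_max` are assumptions made for this illustration: each level of nesting corresponds to one index $i_l$, and the leaves hold the values $x_{i_1, \dots, i_k}$.

```python
def min_max(tree, depth=1):
    """Evaluate a function of the form (2.10) on a nested list:
    min at odd depths (i_1, i_3, ...), max at even depths (i_2, i_4, ...)."""
    if not isinstance(tree, list):
        return tree  # a leaf holds the value x_{i_1,...,i_k}
    values = [min_max(child, depth + 1) for child in tree]
    return min(values) if depth % 2 == 1 else max(values)

x1, x2, x3, x4, x5, x6, x7 = 1, 2, 3, 4, 5, 6, 7

# (x1 v x2) ^ (x3 v x4): x1 = x_11, x2 = x_12, x3 = x_21, x4 = x_22
double_indexed = min_max([[x1, x2], [x3, x4]])

# [(x1 ^ x2) v (x3 ^ x4)] ^ [x5 v (x6 ^ x7)]; the innermost index set
# has one element for (i_1, i_2) = (2, 1) and two elements otherwise.
triple_indexed = min_max([[[x1, x2], [x3, x4]], [[x5], [x6, x7]]])
```

With the sample values above, the first expression is $2 \wedge 4 = 2$ and the second is $(1 \vee 3) \wedge (5 \vee 6) = 3$.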
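Thus $f(\underline{x})$ can be written as a singly indexed function under the proper identification and vice versa. This identification can be carried over to matrices $R = (r_{\underline{i}, \underline{j}})$, and therefore $E_R(g)$ is well defined for functions $g(\underline{x})$, where $\underline{x} = (x_{\underline{i}})$.
Theorem 6. Let $f(\underline{x})$ be the function defined by (2.10), let $\underline{i} = (i_1, \dots, i_k)$ and $\underline{j} = (j_1, \dots, j_k)$ be distinct vector indices, and denote by $l$, $1 \le l \le k$, the first index such that $i_l \ne j_l$. If $R$, $S$ are matrices such that for every choice of $\underline{i} \ne \underline{j}$
$$E_R |x_{\underline{i}} - x_{\underline{j}}|^2 \le E_S |x_{\underline{i}} - x_{\underline{j}}|^2 \quad \text{if } l \text{ is even}$$
and
$$E_R |x_{\underline{i}} - x_{\underline{j}}|^2 \ge E_S |x_{\underline{i}} - x_{\underline{j}}|^2 \quad \text{if } l \text{ is odd},$$
then
$$E_R(f) \le E_S(f).$$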
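Proof. Let $\mathcal{A}$ be the set of all functions of the form
$$g(t) = \begin{cases} b; & t > b \\ t; & a \le t \le b \\ a; & t < a \end{cases}$$
we allow in $\mathcal{A}$ constant functions (by taking $a = b$) and also cases where $a = -\infty$ and/or $b = \infty$. Notice that $g(t) \wedge c$ and $g(t) \vee c$ belong to $\mathcal{A}$ if $c$ is a constant, thus if we denote the variable $x_{\underline{i}}$ by $t$ and fix all other variables $x_{\underline{i}'}$, for $\underline{i}' \ne \underline{i}$, then $f(\underline{x})$ is clearly a function of the form $g(t)$.
Similarly, by setting $s = x_{\underline{j}}$ and fixing all other variables in $f(\underline{x})$ we get that $f(\underline{x})$ is a function of the form $h(s) \in \mathcal{A}$. It is now easy to verify that if we set $t = x_{\underline{i}}$ and $s = x_{\underline{j}}$ and fix all other variables $x_{\underline{i}'}$, for $\underline{i}' \ne \underline{i}, \underline{j}$, in $f(\underline{x})$ we get that $f(\underline{x})$ is a function of two variables $s$, $t$ of the form:
$$f(\underline{x}) = \alpha(t) \wedge \beta(s) \quad \text{if } l \text{ is odd},$$
and
$$f(\underline{x}) = \alpha(t) \vee \beta(s) \quad \text{if } l \text{ is even},$$
where $\alpha(t), \beta(s) \in \mathcal{A}$.
We'll now show that the distributional derivatives satisfy the following inequalities for all $-\infty < s, t < \infty$:
(1) $\frac{\partial}{\partial t} (\alpha(t) \wedge \beta(s)) \ge 0$ and $\frac{\partial}{\partial t} (\alpha(t) \vee \beta(s)) \ge 0$
(2) $\frac{\partial^2}{\partial s\, \partial t} (\alpha(t) \vee \beta(s)) \le 0$
(3) $\frac{\partial^2}{\partial s\, \partial t} (\alpha(t) \wedge \beta(s)) \ge 0$.
(1) follows from the fact that $\alpha(t) \wedge \beta(s)$ and $\alpha(t) \vee \beta(s)$ are non-decreasing in $t$.
To prove (2) assume $\alpha(t)$ has the form of $g(t)$ above. Writing $\alpha(t) \vee \beta(s) = \frac{1}{2} (\alpha(t) + \beta(s) + |\alpha(t) - \beta(s)|)$ we get that, since $\frac{\partial}{\partial t} |\alpha(t)| = \alpha'(t)\, \mathrm{sign}(\alpha(t))$, we have
$$\frac{\partial}{\partial t} (\alpha(t) \vee \beta(s)) = \frac{1}{2} \big( \alpha'(t) + \alpha'(t)\, \mathrm{sign}(\alpha(t) - \beta(s)) \big) = \begin{cases} 0; & t \notin [a, b] \\ \frac{1}{2} + \frac{1}{2}\, \mathrm{sign}(t - \beta(s)); & t \in (a, b) \end{cases}$$
and since the last function is non-increasing in $s$ it follows that $\frac{\partial^2}{\partial s\, \partial t} (\alpha(t) \vee \beta(s)) \le 0$.
Similarly, writing $\alpha(t) \wedge \beta(s) = \frac{1}{2} (\alpha(t) + \beta(s) - |\alpha(t) - \beta(s)|)$ we obtain (3). Hence we proved that
$$\frac{\partial f}{\partial x_{\underline{i}}} (\underline{x}) \ge 0 \quad \text{for all } \underline{i} \tag{2.11}$$
$$\frac{\partial^2 f(\underline{x})}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} \ge 0 \quad \text{if } l \text{ is odd} \tag{2.12}$$
$$\frac{\partial^2 f(\underline{x})}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} \le 0 \quad \text{if } l \text{ is even}. \tag{2.13}$$
Applying inequalities (2.11) to (2.13) in Theorem 5, and noting that $f(\underline{x})$ satisfies (2.9) and hence (2.7), we get that $E_R(f) \le E_S(f)$. □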
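Remark 7. (i) Direct computation shows that if $\alpha(t)$ is of the form $g(t)$ above, and $\beta(s)$ also has the form $g(s)$ (with $a$, $b$ replaced by $c$, $d$), then for all test functions $\varphi \in C^\infty$ we have the identities:
$$\iint_{R^2} \frac{\partial^2}{\partial s\, \partial t} (\alpha(t) \wedge \beta(s))\, \varphi(s, t)\, ds\, dt = - \iint_{R^2} \frac{\partial^2}{\partial s\, \partial t} (\alpha(t) \vee \beta(s))\, \varphi(s, t)\, ds\, dt = \int_I \varphi(t, t)\, dt$$
where $I = [a, b] \cap [c, d]$. This also implies (2) and (3) above.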
(ii) If two $n \times n$ matrices $R$, $S$ satisfy $E_R |x_i - x_j|^2 \le E_S |x_i - x_j|^2$ for all $i, j = 1, 2, \dots, n$, then $E_R \big( \max_{1 \le i \le n} x_i \big) \le E_S \big( \max_{1 \le i \le n} x_i \big)$. This is Fernique's inequality ([1], [5]), which was proved for normal densities, and follows by taking $k = 1$ in Theorem 6 and noticing that $E_R \big( \max_{1 \le i \le n} x_i \big) = - E_R \big( \min_{1 \le i \le n} x_i \big)$, similarly for $S$ replacing $R$. Another special case with $k = 2$, i.e. $f(\underline{x}) = \min_{i_1} \max_{i_2} x_{i_1, i_2}$, was proved for normal densities in [3] using a different, more complicated method.
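For $n = 2$ and a centred normal density, Fernique's inequality can be checked in closed form. The sketch below assumes the standard identities $\max(x_1, x_2) = \frac12 (x_1 + x_2) + \frac12 |x_1 - x_2|$ and $E|z| = \sigma \sqrt{2/\pi}$ for a centred normal $z$ with variance $\sigma^2$, which give $E \max(x_1, x_2) = \sqrt{E|x_1 - x_2|^2 / (2\pi)}$. The function name and the two covariance matrices are hypothetical.

```python
import math

def expected_max_of_two(cov):
    """E max(x1, x2) for a centred bivariate normal with covariance matrix cov."""
    squared_increment = cov[0][0] + cov[1][1] - 2.0 * cov[0][1]  # E|x1 - x2|^2
    return math.sqrt(squared_increment / (2.0 * math.pi))

R = [[1.0, 0.6], [0.6, 1.0]]   # E|x1 - x2|^2 = 0.8
S = [[2.0, 0.0], [0.0, 1.0]]   # E|x1 - x2|^2 = 3.0

e_max_R = expected_max_of_two(R)
e_max_S = expected_max_of_two(S)
```

Note that the diagonals of `R` and `S` differ, so the Slepian-type hypothesis of Remark 4 does not apply, while the increment condition of Remark 7(ii) does; in this example `e_max_R` does not exceed `e_max_S`.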
Under additional assumptions on the diagonals of the matrices R and S we can
get an inequality between the probability contents of sets which are formed by
intersections of unions of intersections, etc., of mutually orthogonal half spaces.
This generalizes Slepian's inequality, as well as Theorem 1.2 of [3]. The proof is
simpler than the previous proof of [3].
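Let $A$ be the set $\bigcap_{i_1} \bigcup_{i_2} \bigcap_{i_3} \bigcup_{i_4} \cdots\, [x_{\underline{i}} \ge \lambda_{\underline{i}}]$, where $\underline{i}$ denotes the $k$-fold index vector $(i_1, i_2, \dots, i_k)$, $\lambda_{\underline{i}}$ are arbitrary fixed scalars, and let $1_A$ be the indicator function of $A$. As in definition (2.10) of $f(\underline{x})$, for each $l$ $(l = 1, 2, \dots, k)$, the index $i_l$ ranges over some finite non-empty set $C(i_1, i_2, \dots, i_{l-1})$ which may depend on the previous choice of the indices $i_1, i_2, \dots, i_{l-1}$.
Lemma 8. Let $A$ be the set $\bigcap_{i_1} \bigcup_{i_2} \bigcap_{i_3} \bigcup_{i_4} \cdots\, [x_{\underline{i}} \ge \lambda_{\underline{i}}]$. Given $\underline{i} \ne \underline{j}$, let $l$, $1 \le l \le k$, be the first index such that $i_l \ne j_l$. Then
(a) $\frac{\partial 1_A}{\partial x_{\underline{i}}} \ge 0$ for all $\underline{i}$
(b) $\frac{\partial^2 1_A}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} \ge 0$ if $l$ is odd
(c) $\frac{\partial^2 1_A}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} \le 0$ if $l$ is even.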
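Proof. If $B$ is any set which is formed by intersections and unions of various "atom" sets $[x_{\underline{i}} \ge \lambda_{\underline{i}}]$, we shall say that $\underline{i}_0$ appears in $B$ if the event $[x_{\underline{i}_0} \ge \lambda_{\underline{i}_0}]$ or its complement appears in $B$. Note that $\underline{i}_0$ appears in $B$ iff $\underline{i}_0$ appears in the complement $B^c$ of $B$. Since $1_A$ is non-decreasing in each variable $x_{\underline{i}}$ it follows that $\frac{\partial}{\partial x_{\underline{i}}} (1_A) \ge 0$ for each $\underline{i}$. Another way to verify this is the following:
Assume that $\underline{i}$ appears in a set $B$ but not in $C$; then writing $1_{B \cup C} = 1_B + 1_C - 1_C 1_B$, we have that $\frac{\partial}{\partial x_{\underline{i}}} 1_{B \cup C} = 1_{C^c} \frac{\partial 1_B}{\partial x_{\underline{i}}}$, and similarly $1_{B \cap C} = 1_B 1_C$, hence $\frac{\partial}{\partial x_{\underline{i}}} (1_{B \cap C}) = 1_C \frac{\partial}{\partial x_{\underline{i}}} 1_B$. Therefore in computing the sign of $\frac{\partial 1_A}{\partial x_{\underline{i}}}$ one may carry the derivative past all intersections and unions directly to the derivative of the indicator function of the atom $[x_{\underline{i}} \ge \lambda_{\underline{i}}]$, which is clearly non-negative. This also shows that $\frac{\partial 1_A}{\partial x_{\underline{i}}} \ge 0$.
Assume that $\underline{i} = (i_1, \dots, i_k)$, $\underline{j} = (j_1, \dots, j_k)$ and $l$ is the first index such that $i_l \ne j_l$. As above, if the indices $\underline{i}$, $\underline{j}$ appear in $B$, and $\underline{i}$, $\underline{j}$ do not appear in $C$, then
$$\frac{\partial^2}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} (1_{B \cup C}) = 1_{C^c} \frac{\partial^2 1_B}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}}$$
and
$$\frac{\partial^2}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} (1_{B \cap C}) = 1_C \frac{\partial^2 1_B}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}}.$$
This shows that if $l$ is even (resp., odd) then the sign of $\frac{\partial^2 1_A}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}}$ would be the same as the sign of $\frac{\partial^2 1_{B \cup C}}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}}$ $\Big($resp., $\frac{\partial^2 1_{B \cap C}}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}}\Big)$, where
$$B = \bigcap_{i'_{l+1}} \bigcup_{i'_{l+2}} \cdots\, [x_{\underline{i}'} \ge \lambda_{\underline{i}'}] \quad \Big( \text{resp. } B = \bigcup_{i'_{l+1}} \bigcap_{i'_{l+2}} \cdots\, [x_{\underline{i}'} \ge \lambda_{\underline{i}'}] \Big)$$
$$C = \bigcap_{j'_{l+1}} \bigcup_{j'_{l+2}} \cdots\, [x_{\underline{j}'} \ge \lambda_{\underline{j}'}] \quad \Big( \text{resp. } C = \bigcup_{j'_{l+1}} \bigcap_{j'_{l+2}} \cdots\, [x_{\underline{j}'} \ge \lambda_{\underline{j}'}] \Big)$$
where $\underline{i}' = (i_1, \dots, i_l, i'_{l+1}, \dots, i'_k)$, $\underline{j}' = (j_1, \dots, j_l, j'_{l+1}, \dots, j'_k)$. (Recall that $i_t = j_t$ if $1 \le t < l$, and $i_l \ne j_l$.) But it is easy to see that since $\underline{i}$ appears in $B$ and not in $C$, and $\underline{j}$ appears in $C$ and not in $B$, then
$$\frac{\partial^2 1_{B \cup C}}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} = - \frac{\partial 1_B}{\partial x_{\underline{i}}} \frac{\partial 1_C}{\partial x_{\underline{j}}} \le 0 \quad \Big( \text{resp., } \frac{\partial^2 1_{B \cap C}}{\partial x_{\underline{i}}\, \partial x_{\underline{j}}} = \frac{\partial 1_B}{\partial x_{\underline{i}}} \frac{\partial 1_C}{\partial x_{\underline{j}}} \ge 0 \Big).$$
This proves (b) and (c). □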
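Applying Corollary 3 to the indicator function $1_A$ and using Lemma 8 we obtain:
Theorem 9. In the notation of Lemma 8, let
$$\mathbf{A} = \{ (\underline{i}, \underline{j}) \mid \underline{i} \ne \underline{j},\ l \text{ is even} \},$$
$$\mathbf{B} = \{ (\underline{i}, \underline{j}) \mid \underline{i} \ne \underline{j},\ l \text{ is odd} \}.$$
Assume $R$, $S$ satisfy:
$$E_R(x_{\underline{i}}^2) = E_S(x_{\underline{i}}^2) \quad \text{for all } \underline{i},$$
$$E_R(x_{\underline{i}} x_{\underline{j}}) \ge E_S(x_{\underline{i}} x_{\underline{j}}) \quad \text{if } (\underline{i}, \underline{j}) \in \mathbf{A},$$
$$E_R(x_{\underline{i}} x_{\underline{j}}) \le E_S(x_{\underline{i}} x_{\underline{j}}) \quad \text{if } (\underline{i}, \underline{j}) \in \mathbf{B}.$$
Then $P_R(A) \le P_S(A)$.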
Remark. The case $k = 1$ is the well known Slepian's lemma ([8]), and the case $k = 2$ was proved in [3] for the normal density; that proof was later simplified in [7].
Acknowledgement. Thanks are due to Yoav Benyamini for some valuable discussions.
References
1. Fernique, X.: Des résultats nouveaux sur les processus Gaussiens. C.R. Acad. Sci., Paris, Sér. A-B 278, A363-A365 (1974)
2. Das Gupta, S., Eaton, M.L., Olkin, I., Perlman, M., Savage, L.J., Sobel, M.: Inequalities on the probability content of convex regions for elliptically contoured distributions. Proc. Sixth Berkeley Symp. Math. Statist. Probab. 2, 241-264 (1972). Univ. of California Press
3. Gordon, Y.: Some inequalities for Gaussian processes and applications. Israel J. Math. 50, 265-289 (1985)
4. Gordon, Y.: Gaussian processes and almost spherical sections of convex bodies. Ann. Probab. 16 (1987)
5. Jain, N.C., Marcus, M.B.: Continuity of subgaussian processes. Adv. Probab. Rel. Topics 4, 81-196 (1978)
6. Joag-Dev, K., Perlman, M.D., Pitt, L.D.: Association of normal random variables and Slepian's inequality. Ann. Probab. 11, 451-455 (1983)
7. Kahane, J.P.: Une inégalité du type de Slepian et Gordon sur les processus Gaussiens. Israel J. Math. 55, 109-110 (1986)
8. Slepian, D.: The one sided barrier problem for Gaussian noise. Bell Syst. Tech. J. 41, 463-501 (1962)
Received June 23, 1986