Kotz, S. and J. W. Adams (1963). "On a distribution of sum of identically distributed correlated gamma variables."

UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.

ON A DISTRIBUTION OF SUM OF IDENTICALLY
DISTRIBUTED CORRELATED GAMMA VARIABLES

by

S. Kotz and John W. Adams

June 1963

Grant No. AF-AFOSR-62-169
A distribution of a sum of identically distributed Gamma variables correlated according to an "exponential" autocorrelation law ρ_{kj} = ρ^{|k−j|} (k, j = 1, …, n), where ρ_{ij} is the correlation coefficient between the i-th and j-th random variables and 0 < ρ < 1 is a given number, is derived. This is performed by determining the characteristic roots of the appropriate variance-covariance matrix, using a special method, and by applying Robbins' and Pitman's result on mixtures of distributions to the case of Gamma variables. An "approximate" distribution of the sum of these variables, under the assumption that the sum itself is a Gamma variable, is given. A comparison between the "exact" and the "approximate" distributions for certain values of the correlation coefficient, the number of variables in the sum, and the values of the parameters of the initial distributions is presented.
This research was supported by the Mathematics Division of the
Air Force Office of Scientific Research.
Institute of Statistics
Mimeo. Series No. 367
ON A DISTRIBUTION OF SUM OF IDENTICALLY DISTRIBUTED
CORRELATED GAMMA VARIABLES

by

Samuel Kotz and John W. Adams
University of North Carolina
1. Introduction and general remarks.

The distribution of a sum of correlated Gamma variables has significant interest and many applications in engineering, insurance and other areas. Besides the applications mentioned in [1] and [2], we should like to note the usefulness of this distribution in problems connected with the representation of precipitation amounts.* Weekly or monthly, etc., precipitation amounts are regarded as sums of daily amounts, the latter being well fitted by a Gamma distribution [2]. A particular solution for the case of a constant correlation between each pair of variables in the sum is presented in [1]. In this note we shall extend this result to the case of an "exponential autocorrelation scheme" between the variables, where each one of the variables has the marginal density function given by

*This research was supported by the Mathematics Division of the Air Force Office of Scientific Research.
    f(x) = x^{r−1} e^{−x/Q} / (Γ(r) Q^r),   x > 0;   f(x) = 0,   x ≤ 0.   (1.1)

In meteorological applications it is sometimes assumed that the sum random variable is itself a Gamma variable, and an "approximate" Gamma distribution, whose first two moments are identical with those of the "exact" distribution of the sum, is used [2] (see also, for example, [4] for an earlier application in another field). In the case of identically distributed r.v.'s and of a stationary exponential correlation model (namely, when the correlation coefficient between the i-th and the j-th r.v. in the sum is given by ρ^{|i−j|} for all i and j, ρ being a given positive constant) it is easy to verify that the appropriate parameters r_n and Q_n of the "approximate" Gamma variable, representing the sum of the n r.v.'s, are given by:
    r_n = n² r / [ n + (2ρ/(1−ρ)) ( n − (1−ρⁿ)/(1−ρ) ) ]   (1.2)

and

    Q_n = (Q/n) [ n + (2ρ/(1−ρ)) ( n − (1−ρⁿ)/(1−ρ) ) ],   (1.3)

where n is the number of variables in the sum and r and Q are the parameters of the initial r.v.'s distributed according to (1.1).
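The moment matching behind (1.2) and (1.3) is easy to check numerically. The sketch below (function names and sample values are ours, NumPy assumed) compares the first two moments of the moment-matched Gamma variable with the mean and variance of the correlated sum computed directly from the covariances rQ²ρ^{|i−j|}:

```python
import numpy as np

def approx_gamma_params(r, Q, n, rho):
    """Moment-matched parameters r_n, Q_n of (1.2)-(1.3) for the sum of n
    Gamma(r, Q) variables with corr(X_i, X_j) = rho**|i - j|."""
    # bracketed factor common to (1.2) and (1.3)
    c = n + 2.0 * rho / (1.0 - rho) * (n - (1.0 - rho**n) / (1.0 - rho))
    return n**2 * r / c, Q * c / n

def direct_sum_moments(r, Q, n, rho):
    """Mean and variance of the sum from the covariances r Q^2 rho^|i-j|."""
    cov = r * Q**2 * rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return n * r * Q, cov.sum()

r_n, Q_n = approx_gamma_params(r=2.0, Q=0.5, n=10, rho=0.3)
mean, var = direct_sum_moments(r=2.0, Q=0.5, n=10, rho=0.3)
assert np.isclose(r_n * Q_n, mean)        # first moments agree
assert np.isclose(r_n * Q_n**2, var)      # second moments agree
```

The two assertions confirm that (1.2)–(1.3) reproduce both moments exactly, for any n and 0 < ρ < 1.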
Since the first two moments completely determine a Gamma distribution, this "approximate" distribution is unique. The purpose of this note is to compare this "approximate" distribution (which is quite easily computable and applied) with an "exact" distribution of the sum (which is much more difficult to determine). We also note that, similarly to the case treated in [1], the "natural" exact distribution obtained may not be unique, since the correlation scheme, in general, does not determine uniquely a multivariate Gamma distribution, as will be noted in some more detail in the following section.
2. A Joint Characteristic Function of Exponentially Correlated Gamma Variables.

Consider the characteristic function

    φ(t₁, t₂, …, tₙ) = |I − iQVT|^{−r},   (2.1)

where Q and r are positive constants, I is the n×n identity matrix, T is the n×n diagonal matrix with the elements t_{jj} = t_j, and V is an arbitrary n×n positive definite matrix. This characteristic function leads to a joint probability density function whose marginals are given by (1.1) and whose matrix of second moments is some positive definite matrix V*, say.
In general, a joint density function with marginals given by (1.1) and a given covariance matrix V* is not uniquely determined (for example, as is known, a multivariate Gamma distribution with constant correlation between the components is not unique [5]). An explicit expression for a joint distribution with a general matrix is rather complicated, but can be obtained in terms of a series of Laguerre polynomials by applying the derivations given in [6]. We will not present these expressions here, since the main concern of this paper is the validity and applicability of approximations to the distribution of the sum of the component random variables. Following Gurland [5] we shall regard (2.1) as the characteristic function of a natural multivariate analogue of the Gamma distribution.
4
In particular, if the elements of V are given by V
ij
i, j
= 1, •••
n (0
= p~i-jJ.
< P < 1 a given constant) it is easy to verify by differentia-
tion of (2.1) that the corresponding elements of V* will be given by
V1'j
=
rQ
2 2ri-jf
.' i,j =1, ... n.
p ..
The characteristic function of the distribution of the sum of the random variables whose joint distribution has the characteristic function given in (2.1) is

    φ(t) = |I − iQtV|^{−r},   (2.3)

and (2.3) may be expressed in the form

    φ(t) = ∏_{j=1}^{n} (1 − iQλ_j t)^{−r},   (2.4)

where the λ_j are the characteristic roots of the matrix V. In Section 4 we shall describe a procedure for an explicit determination of these characteristic roots.
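The passage from (2.3) to (2.4) can be verified numerically for the matrix V_{ij} = ρ^{|i−j|} of interest here. The sketch below (illustrative values, NumPy assumed) evaluates both sides at a fixed real t, keeping Qt small enough that the principal branch of the determinant power coincides with the product of principal-branch factors:

```python
import numpy as np

n, r, Q, rho, t = 6, 1.7, 0.5, 0.4, 0.3
idx = np.arange(n)
V = rho ** np.abs(np.subtract.outer(idx, idx))     # V_ij = rho^|i-j|

# (2.3): phi(t) = |I - iQtV|^{-r}
phi_det = np.linalg.det(np.eye(n) - 1j * Q * t * V) ** (-r)

# (2.4): phi(t) = prod_j (1 - iQ lambda_j t)^{-r} over the roots of V
lam = np.linalg.eigvalsh(V)
phi_prod = np.prod((1 - 1j * Q * lam * t) ** (-r))

assert np.allclose(phi_det, phi_prod)
```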
3. The Distribution of the Random Variable Whose Characteristic Function is φ(t).

Robbins proved in [7] that if {c_k} is a sequence of constants such that c_k ≥ 0 (k = 0, 1, …) and Σ_k c_k = 1, and if {F_k(·)} is a sequence of distribution functions with the corresponding sequence of characteristic functions {φ_k(·)}, then (i) F(·) = Σ_k c_k F_k(·) is a distribution function, (ii) the characteristic function φ(·) of F(·) is given by φ(·) = Σ_k c_k φ_k(·), and (iii) for any Borel measurable function g(·) the following equality holds: ∫_X g(x) dF(x) = Σ_k c_k ∫_X g(x) dF_k(x), where X is the appropriate space. These results are valid for a general class of spaces, and in particular for any finite-dimensional Euclidean space. The distribution function of the Gamma variable with positive parameters r and Q, denoted here by F_r(·), is given by:
    F_r(x) = (1 / (Q^r Γ(r))) ∫₀^x u^{r−1} e^{−u/Q} du,   x > 0;   F_r(x) = 0,   x ≤ 0.   (3.1)

The characteristic function of this distribution is

    φ*(t) = (1 − iQt)^{−r}.   (3.2)

Let Y be the random variable whose characteristic function is given by (2.4). Then Y has the same distribution as the random variable

    X = Σ_{j=1}^{n} λ_j X_j,   (3.3)

where the X_j, j = 1, 2, …, n, are independent identically distributed random variables, each following a Gamma distribution with parameters r and Q as given in (3.1). Let λ* = min_j λ_j.
Then

    Pr(Y < y) = Pr(X/λ* < y/λ*),   (3.4)

and the characteristic function of the random variable X/λ* is

    φ̃(t) = ∏_{j=1}^{n} (1 − iQ(λ_j/λ*) t)^{−r}.   (3.5)
Using the method developed by Pitman and Robbins in [8], we express the right-hand side of (3.5) in the form

    ∏_{j=1}^{n} (1 − iQ(λ_j/λ*) t)^{−r} = (1 − iQt)^{−nr} ∏_{j=1}^{n} (λ*/λ_j)^r [1 − (1 − λ*/λ_j)(1 − iQt)^{−1}]^{−r}.   (3.6)

Since

    |(1 − λ*/λ_j)(1 − iQt)^{−1}| < 1   (3.7)

for all real Q and t, the binomial theorem may be applied to expand the r.h.s. of (3.6). We thus obtain

    ∏_{j=1}^{n} (1 − iQ(λ_j/λ*) t)^{−r} = (1 − iQt)^{−nr} Σ_{k=0}^{∞} c_k (1 − iQt)^{−k},   (3.8)

where the coefficients c_k are determined by the identity

    ∏_{j=1}^{n} (λ*/λ_j)^r [1 − (1 − λ*/λ_j) z]^{−r} = Σ_{k=0}^{∞} c_k z^k.   (3.9)

It follows from (3.9) that all c_k > 0, and putting (1 − iQt)^{−1} = 1 in (3.6) and (3.8) shows that Σ c_k = 1.
Thus the results of Robbins [7] apply for the c_k determined by (3.9), and hence the r.h.s. of (3.8) is the characteristic function of the distribution F(·) given by

    F(x) = Σ_{k=0}^{∞} c_k F_{nr+k}(x),   (3.10)

where the F_{nr+k}(·) (k = 0, 1, …) are defined by (3.1).
Hence

    Pr(Y < y) = Σ_{k=0}^{∞} c_k F_{nr+k}(y/λ*) = Σ_{k=0}^{∞} (c_k / (Q^{nr+k} Γ(nr+k))) ∫₀^{y/λ*} u^{nr+k−1} e^{−u/Q} du,   (3.11)

and the corresponding probability density function is given by:

    f(y) = (1/(λ*Q)) Σ_{k=0}^{∞} (c_k / Γ(nr+k)) (y/(λ*Q))^{nr+k−1} e^{−y/(λ*Q)},   y > 0;   f(y) = 0,   y ≤ 0.   (3.12)

(The term-by-term differentiation is justified by the uniform convergence of the series.)
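The coefficients c_k of (3.9) are obtained by multiplying n binomial series (1 − b_j z)^{−r}, b_j = 1 − λ*/λ_j, and the series (3.11) can then be summed term by term. The sketch below (our naming; NumPy and SciPy assumed) builds a truncated sequence c_0, …, c_K and checks that Σ c_k = 1 and that the mixture mean λ*Q(nr + Σ k c_k) equals the direct mean rQ Σ λ_j:

```python
import numpy as np
from scipy.special import gammainc  # regularized lower incomplete gamma

def mixture_coeffs(lam, r, K):
    """Coefficients c_0..c_K of (3.9), truncated at z^K."""
    lam = np.asarray(lam, dtype=float)
    b = 1.0 - lam.min() / lam                  # one b_j per factor, 0 <= b_j < 1
    c = np.zeros(K + 1)
    c[0] = 1.0
    for bj in b:
        g = np.empty(K + 1)                    # binomial series of (1 - bj z)^{-r}
        g[0] = 1.0
        for k in range(1, K + 1):
            g[k] = g[k - 1] * (r + k - 1) / k * bj
        c = np.convolve(c, g)[:K + 1]
    return np.prod((lam.min() / lam) ** r) * c

def exact_cdf(y, lam, r, Q, K=400):
    """Pr(Y < y) from the mixture representation (3.11)."""
    lam_star, n = min(lam), len(lam)
    c = mixture_coeffs(lam, r, K)
    k = np.arange(K + 1)
    return np.sum(c * gammainc(n * r + k, y / (lam_star * Q)))

# illustrative values: the roots of an exponential-correlation matrix
n, r, Q, rho = 5, 2.0, 0.5, 0.3
idx = np.arange(n)
lam = np.linalg.eigvalsh(rho ** np.abs(np.subtract.outer(idx, idx)))

c = mixture_coeffs(lam, r, K=400)
assert abs(c.sum() - 1.0) < 1e-8                       # Sigma c_k = 1
mix_mean = lam.min() * Q * (n * r + (np.arange(401) * c).sum())
assert abs(mix_mean - r * Q * lam.sum()) < 1e-8        # means agree
```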
The upper bound on the error of truncation is given by the following formula: for any integers 0 ≤ p₁ ≤ p₂ and for any y,

    0 ≤ Pr(Y < y) − Σ_{k=p₁}^{p₂} c_k F_{nr+k}(y/λ*) ≤ Σ_{k=0}^{p₁−1} c_k + sup_{k>p₂} F_{nr+k}(y/λ*) (1 − Σ_{k=0}^{p₂} c_k).   (3.13)

We should like to point out that the expansions (3.8) and (3.9) are closely related to the general results on the expansion of hypergeometric functions of matrix variables into series of so-called "zonal polynomials" [10, 10a].
4. The Characteristic Roots of the Matrix V.

The inverse of the matrix V, whose elements are v_{ij} = ρ^{|i−j|}, i, j = 1, …, n, is given by

    V^{−1} = [(1 + ρ²)I − ρ²A − ρB] (1 − ρ²)^{−1},   (4.1)

where I is the n×n identity matrix, A is the n×n matrix with the elements a₁₁ = a_{nn} = 1 and all other elements 0, and B is the n×n matrix with b_{ij} = 1 for |i − j| = 1 and all other elements 0. Hence the characteristic roots of V are (1 − ρ²)γ_j^{−1}, j = 1, 2, …, n, where the γ_j are those values of γ which satisfy the equation

    det[(1 + ρ² − γ)I − ρ²A − ρB] = 0.   (4.2)
Let Δ_n(γ, ρ) be the l.h.s. of (4.2). Expanding Δ_n(γ, ρ) by its minors we obtain

    Δ_n(γ, ρ) = (1 − γ)² D_{n−2}(γ, ρ) − 2ρ²(1 − γ) D_{n−3}(γ, ρ) + ρ⁴ D_{n−4}(γ, ρ),   (4.3)

where D_n(γ, ρ) is the determinant of the n×n matrix U,

    U = (1 + ρ² − γ)I − ρB.   (4.4)
Expanding D_n(γ, ρ) by its minors we get the following difference equation

    D_n(γ, ρ) = (1 + ρ² − γ) D_{n−1}(γ, ρ) − ρ² D_{n−2}(γ, ρ),   (4.5)

and, as is well known (see e.g. [9]), this difference equation has the solution

    D_n(γ, ρ) = ρⁿ sin((n+1)θ) / sin θ,   (4.6)

where 1 + ρ² − γ = 2ρ cos θ.
Substituting (4.6) in (4.3) and equating Δ_n(γ, ρ) to zero we obtain the equation

    (1 − γ)² ρ^{n−2} sin((n−1)θ)/sin θ − 2(1 − γ) ρ^{n−1} sin((n−2)θ)/sin θ + ρⁿ sin((n−3)θ)/sin θ = 0.   (4.7)
Using the complex representation of the sine function, we get, after multiplying (4.7) by sin θ (excluding the values of θ for which sin θ = 0),

    e^{i(n−3)θ} [(1 − γ)e^{iθ} − ρ]² = e^{−i(n−3)θ} [(1 − γ)e^{−iθ} − ρ]²,   (4.8)

and after a few algebraic manipulations the equation (4.8) reduces to
    sin((n+1)θ/2) = ρ sin((n−1)θ/2)   (4.9)

and

    cos((n+1)θ/2) = ρ cos((n−1)θ/2).   (4.10)

If θ_j, j = 1, 2, …, n, are the values of θ which satisfy one or the other of the equations (4.9) and (4.10), then the characteristic roots of the matrix V^{−1} are given by (1 − 2ρ cos θ_j + ρ²)(1 − ρ²)^{−1}, and the characteristic roots of V are λ_j = (1 − ρ²)(1 − 2ρ cos θ_j + ρ²)^{−1}, j = 1, 2, …, n. Apparently it is not possible to express the solutions of (4.9) and (4.10) as explicit functions of n and ρ.
However, for 0 ≤ ρ ≤ 1, it can be verified that the solutions of (4.9) are located one in each of the intervals ((2k−1)π/(n+1), 2kπ/(n+1)), k = 1, 2, …, n/2 or (n−1)/2, depending on whether n is even or odd; and that the solutions of (4.10) are located one in each of the intervals ((2k−2)π/(n+1), (2k−1)π/(n+1)), k = 1, 2, …, n/2 or (n+1)/2, depending on whether n is even or odd. Using this observation we can obtain the solutions of these equations.
Consider the iterative equations:

    θ_{j,k} = (2/(n+1)) [kπ − sin^{−1}(ρ sin(((n−1)/2) θ_{j−1,k}))]   for k odd,
    θ_{j,k} = (2/(n+1)) [kπ + sin^{−1}(ρ sin(((n−1)/2) θ_{j−1,k}))]   for k even,   (4.11)

where θ_{0,k} ∈ ((2k−1)π/(n+1), 2kπ/(n+1)), −π/2 ≤ sin^{−1}(ρ sin(((n−1)/2) θ_{j−1,k})) ≤ π/2, j = 0, 1, …, and k = 1, 2, …, n/2 ((n−1)/2) for n even (odd); and

    θ_{j,k} = (2/(n+1)) [(k−1)π + cos^{−1}(ρ cos(((n−1)/2) θ_{j−1,k}))]   for k odd,
    θ_{j,k} = (2/(n+1)) [kπ − cos^{−1}(ρ cos(((n−1)/2) θ_{j−1,k}))]   for k even,   (4.12)

where θ_{0,k} ∈ ((2k−2)π/(n+1), (2k−1)π/(n+1)), 0 ≤ cos^{−1}(ρ cos(((n−1)/2) θ_{j−1,k})) ≤ π, j = 0, 1, …, and k = 1, 2, …, n/2 ((n+1)/2) for n even (odd).
The series of solutions obtained using this iterative procedure actually converges to the solutions of equations (4.9) and (4.10). This can easily be seen by means of the following theorem of differential calculus (see e.g. [11]): if θ₀ ∈ R₀, where R₀ is an interval containing a root of the equation θ = g(θ), and if g(θ) is a differentiable function of θ throughout R₀ and, in addition, |g′(θ)| ≤ M < 1 for all θ ∈ R₀, then the sequence of iterative solutions of the equation θ_j = g(θ_{j−1}), starting with the point θ₀, converges to a root of the equation. In the case of the equations (4.11) and (4.12) the conditions of this theorem are clearly satisfied. In fact, θ_{0,k} belongs, in each case, to an interval containing exactly one solution of (4.11) or (4.12) respectively. Moreover, the absolute values of the derivatives of (4.11) and (4.12) are:
    ((n−1)/(n+1)) ρ cos(((n−1)/2)θ) / [1 − ρ² sin²(((n−1)/2)θ)]^{1/2}   (4.13)

and

    ((n−1)/(n+1)) ρ sin(((n−1)/2)θ) / [1 − ρ² cos²(((n−1)/2)θ)]^{1/2},   (4.14)

which, for all |ρ| < 1 and for all real θ, are, in each case, less than 1.
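The procedure of this section can be condensed into a short computation: iterate (4.11) and (4.12) from points inside the bracketing intervals, convert the resulting angles θ_j into roots λ_j of V, and compare with a direct eigenvalue routine. The sketch below follows our reading of the parity conventions in (4.11)-(4.12) (NumPy assumed):

```python
import numpy as np

def theta_roots(n, rho, iters=100):
    """Angles theta_j solving (4.9)-(4.10) via the iterations (4.11)-(4.12)."""
    m_sin = n // 2                 # number of roots of (4.9)
    m_cos = n - m_sin              # number of roots of (4.10)
    roots = []
    for k in range(1, m_sin + 1):  # (4.11): start inside ((2k-1)pi/(n+1), 2k pi/(n+1))
        t = (2 * k - 0.5) * np.pi / (n + 1)
        for _ in range(iters):
            s = np.arcsin(rho * np.sin((n - 1) / 2 * t))
            t = 2.0 / (n + 1) * (k * np.pi - s if k % 2 else k * np.pi + s)
        roots.append(t)
    for k in range(1, m_cos + 1):  # (4.12): start inside ((2k-2)pi/(n+1), (2k-1)pi/(n+1))
        t = (2 * k - 1.5) * np.pi / (n + 1)
        for _ in range(iters):
            a = np.arccos(rho * np.cos((n - 1) / 2 * t))
            t = 2.0 / (n + 1) * ((k - 1) * np.pi + a if k % 2 else k * np.pi - a)
        roots.append(t)
    return np.array(roots)

n, rho = 7, 0.4
theta = theta_roots(n, rho)
lam = (1 - rho**2) / (1 - 2 * rho * np.cos(theta) + rho**2)

idx = np.arange(n)
V = rho ** np.abs(np.subtract.outer(idx, idx))
assert np.allclose(np.sort(lam), np.linalg.eigvalsh(V))   # matches direct roots
```

The contraction factor is bounded by ((n−1)/(n+1))ρ/(1−ρ²)^{1/2}, so a fixed iteration count suffices here.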
5. Comparison between the "approximate" and "exact" distributions of sums of identically distributed exponentially correlated Gamma variables.

In order to obtain the distribution of a sum of identically distributed Gamma variables correlated according to the stationary exponential law (r(X_i, X_j) = ρ^{|i−j|}, i, j = 1, …, n), we replace in (2.1) the matrix V = {v_{ij}} = {ρ^{|i−j|}} by the matrix W = {w_{ij}} = {ρ^{|i−j|/2}}, so that the matrix which corresponds to the matrix V* will have its elements {v*_{ij}} = {rQ² ρ^{|i−j|}}, where rQ² is the common variance of each of the Gamma variables X_i, i = 1, …, n.
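The effect of this replacement can be illustrated by simulation when 2r is an integer (an assumption the construction (2.1) itself does not require): take each X_i to be Q/2 times a sum of 2r squared standard normals whose columns follow an AR(1) law with parameter ρ^{1/2}; the squaring turns the normal correlations ρ^{|i−j|/2} into component correlations ρ^{|i−j|}. A sketch (our construction, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(42)
n, r, Q, rho = 4, 2, 0.5, 0.6
a = np.sqrt(rho)                          # AR(1) parameter of the normals: rho^{1/2}
reps, layers = 100_000, 2 * r             # 2r independent normal layers

Z = np.empty((reps, layers, n))
Z[:, :, 0] = rng.standard_normal((reps, layers))
for j in range(1, n):                     # corr(Z_i, Z_j) = a^|i-j| = rho^{|i-j|/2}
    Z[:, :, j] = a * Z[:, :, j - 1] + np.sqrt(1 - a**2) * rng.standard_normal((reps, layers))

X = (Q / 2) * (Z**2).sum(axis=1)          # each column is Gamma(r, Q)
emp = np.corrcoef(X, rowvar=False)

assert abs(X.mean() - r * Q) < 0.01       # marginal mean rQ
assert abs(emp[0, 1] - rho) < 0.02        # correlation at lag 1 is rho
assert abs(emp[0, 2] - rho**2) < 0.02     # correlation at lag 2 is rho^2
```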
Using the UNIVAC 1105 of the Computation Center of the University of North Carolina, we computed the parameters {c_k} and λ* of the distribution given by (3.11) for several different values of n, ρ, Q, and compared several percentiles of this distribution with the corresponding percentiles of the "approximate" Gamma distribution with the parameters given by (1.2) and (1.3). The results of the computations are presented in Table I. We recall that the first two moments of these two distributions are identical (this fact has been used for checking the correctness of the numerical calculations). Owing to computational difficulties and complications, especially in computing the sequences {c_k}, we cannot claim higher accuracy than the second significant figure.
Comparing these two types of distributions, we observe that the functional dependence of the parameters on n follows the same pattern in both cases. The parameter r is asymptotically linear in n, while Q is independent of n. The discrepancies between the corresponding distributions are relatively small and may be considered insignificant for most practical purposes in the cases of small values of ρ (ρ ≤ 0.3), even for values of n as small as 5. The computations also indicate that for higher values of ρ (ρ = 0.5 and greater) the relatively large deviations at the lower tail decline sharply with the increase of the values of n.
The number of coefficients c_k necessary to obtain a cumulative sum equal to 1 within the working accuracy depends strongly on n and ρ. The number of iterative steps necessary to obtain the values of the characteristic roots λ_j with accuracy up to the 6-th significant digit also depends strongly on n and ρ, and varies, for the cases presented below, between 5 and 18.
TABLE I

Comparison between the "approximate" and "exact" distributions of sums of
identically distributed exponentially correlated Gamma variables

Percentiles of the "exact" distribution corresponding to the
following percentiles of the "approximate" distribution:

  n     ρ      r      Q   |    5      10      25      75      95      99
  5    .2    1.92    .5   |  .62    4.14   24.81   75.73   94.80   98.75
 10    .2    2.04    .5   |  .70    4.39   25.03   75.61   94.87   98.87
  5    .3    1.94    .5   |  .47    3.75   24.67   75.93   94.68   98.63
 10    .3    2.00    .5   |  .56    4.06   25.00   75.81   94.75   98.75
  5    .5    2.04    .5   |  .28    3.18   24.60   76.24   94.77   98.72
 10    .5    2.00    .5   |  .35    3.50   24.81   76.20    …     98.76
 15    .5    2.00    .5   |  .42    3.74   25.10   76.0     …     98.8
  5    .75   3.00    .5   |  .18    3.01   23.8    76.6     …     98.6
 15    .75   3.00    .5   |  .34    3.66   24.6    76.2     …     98.7
  …*    …    .151  28.1   |  .10    1.40   20.1     …       …     98.70

*This case corresponds to a distribution of precipitation amounts at a certain station in …
ACKNOWLEDGEMENT. The authors wish to express their thanks to Professor W. Hoeffding and especially to Professor N. L. Johnson for their valuable comments. We are also much indebted to Mr. P. J. Brown who carried out the programming of all the computations on the UNIVAC computer.
REFERENCES

[1] …, Studien, 3 (1961), pp. 83-92.

[2] S. Kotz and J. Neumann, "On distribution of precipitation amounts for the periods of increasing lengths," to be published in J. Geoph. Research (1963).

[4] B. L. Welch, "The significance of the difference between two means when the population variances are unequal," Biometrika, Vol. 29 (1938), pp. 350-362.

[5] J. Gurland, Correction to "Distribution of the Maximum of the Arithmetic Mean of Correlated Random Variables," Ann. Math. Stat., Vol. 30 (1959), pp. 1265-1266.

[6] A. S. Krishnamoorthy and M. Parthasarathy, "A Multivariate Gamma-Type Distribution," Ann. Math. Stat., Vol. 22 (1951), pp. 549-557.

[7] H. Robbins, "Mixture of Distributions," Ann. Math. Stat., Vol. 19 (1948), pp. 360-369.

[8] H. Robbins and E. J. G. Pitman, "Application of the method of mixtures to quadratic forms in normal variates," Ann. Math. Stat., Vol. 20 (1949), p. 552.

[9] F. B. Hildebrand, Introduction to Numerical Analysis, McGraw-Hill, 1956.

[10] A. T. James, "The distribution of the latent roots of the covariance matrix," Ann. Math. Stat., Vol. 31 (1960), pp. 151-158.

[10a] A. T. James, "A comparison of Univariate and Multivariate Distributions," special invited paper presented at the 94th Eastern Regional Meeting of the IMS (May 6, 1963).

[11] L. Brand, Advanced Calculus, Wiley and Son, 1955.