Chakravorti, S. R. (1973): On asymptotic properties of the maximum likelihood estimates of the general growth curve model.

ON ASYMPTOTIC PROPERTIES OF THE MAXIMUM LIKELIHOOD
ESTIMATES OF THE GENERAL GROWTH CURVE MODEL
By
S. R. Chakravorti
Department of Biostatistics
University of North Carolina at Chapel Hill
and
University of Calcutta, India
Institute of Statistics Mimeo Series No. 896
OCTOBER 1973
ON ASYMPTOTIC PROPERTIES OF THE
MAXIMUM LIKELIHOOD ESTIMATES OF
THE GENERAL GROWTH CURVE MODEL*
S. R. CHAKRAVORTI+
Department of Biostatistics
University of North Carolina at Chapel Hill
1. Introduction
The problem of estimation of the parameters of the growth curve model under the Behrens-Fisher situation has been discussed in Chakravorti [3]. Here we consider the most general situation by relaxing the assumption of normality of the underlying parent distribution.
Let us consider the observation vector $\mathbf{Z}_\alpha^{(t)}(1\times q) = (\mathbf{y}_\alpha^{(t)}, \mathbf{x}_\alpha^{(t)})$, the $\alpha$-th observation in the $t$-th population ($\alpha=1,\ldots,n_t$, $t=1,\ldots,m$), where $\mathbf{y}_\alpha^{(t)}$ is a $1\times p$ vector and $\mathbf{x}_\alpha^{(t)}$ is $1\times s$, $s = q-p > 0$. Then, considering the growth curve model as a MANOCOVA model (Rao [10]), we have

(1.1)  $\mathbf{y}_\alpha^{(t)} = \boldsymbol\eta^{(t)} + \mathbf{x}_\alpha^{(t)}\boldsymbol\beta + \boldsymbol\varepsilon_\alpha^{(t)}$,

where $\boldsymbol\eta^{(t)}(1\times p)$ and $\boldsymbol\beta(s\times p)$ are the parameters involved in the model, $\boldsymbol\varepsilon_\alpha^{(t)}$ is the random error component, distributed with continuous distribution function $F_p(\boldsymbol\varepsilon_\alpha^{(t)})$, and $\mathbf{x}_\alpha^{(t)}(1\times s)$ is the concomitant vector variable distributed as $F_s(\mathbf{x}_\alpha^{(t)})$.
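Model (1.1) is easy to picture by simulation. The sketch below generates one population $t$; the dimensions, parameter values and the heavy-tailed error law are all hypothetical, chosen only because the model requires just a continuous $F_p$ rather than normality:

```python
import numpy as np

rng = np.random.default_rng(0)

p, s, n_t = 3, 2, 500                      # illustrative dimensions and sample size
eta_t = np.array([1.0, 2.0, 3.0])          # eta^(t): 1 x p location parameter
beta = np.array([[0.5, -0.2, 0.1],
                 [0.3,  0.4, -0.6]])       # beta: s x p regression parameter

X = rng.normal(size=(n_t, s))              # concomitant vectors x_alpha^(t), E(x) = 0
eps = rng.standard_t(df=5, size=(n_t, p))  # continuous but non-normal error law F_p

# Model (1.1): y_alpha^(t) = eta^(t) + x_alpha^(t) beta + eps_alpha^(t)
Y = eta_t + X @ beta + eps

print(Y.shape)                             # one population: n_t rows, p columns
```

Each row of `Y` is one $1\times p$ observation $\mathbf{y}_\alpha^{(t)}$; the $m$ populations would differ only in $\boldsymbol\eta^{(t)}$ and sample size $n_t$, sharing the common $\boldsymbol\beta$.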
Our object here is to study the asymptotic properties of the maximum likelihood estimates (m.l.e.) of the parameters, if they exist. The asymptotic efficiencies of the estimates are compared with those of the m.l.e.'s obtained under the normality assumptions.

*Work sponsored by the Aerospace Research Laboratories, U. S. Air Force Systems Command, Contract F33615-71-C-1927. Reproduction in whole or in part permitted for any purpose of the U. S. Government.

+On leave of absence from the University of Calcutta, India.
Inagaki [6] discussed these properties for independent, not necessarily identically distributed (i.n.i.d.) random variables with parameters $\boldsymbol\theta(k\times 1)$, in the line of Huber [4]. In the model (1.1), we consider the observation vectors $\mathbf{y}_\alpha^{(t)}$ as i.i.d. with respect to $\alpha=1,\ldots,n_t$ and i.n.i.d. with respect to $t=1,\ldots,m$. Then, in the line of Inagaki, we have established the consistency and asymptotic normality of the estimates of the parameters of the model (1.1) under less restrictive assumptions (that is, without assumption (vii) of Inagaki [6]).
2. Notations and assumptions
Let us define $\boldsymbol\theta$, the parameter matrix of order $(m+s)\times p$, where

(2.1)  $\boldsymbol\theta' = (\boldsymbol\eta^{(1)'}, \ldots, \boldsymbol\eta^{(m)'}, \boldsymbol\beta')$.

The likelihood function is given by

(2.2)  $L_n(\boldsymbol\theta) = \prod_{t=1}^{m}\prod_{\alpha=1}^{n_t} f(\mathbf{y}_\alpha^{(t)} - \boldsymbol\eta^{(t)} - \mathbf{x}_\alpha^{(t)}\boldsymbol\beta)$.
General notations

$(\Omega, \mathcal{A}, P)$: probability space,

$\Theta$: a parameter space which is a subset of the $k(=(m+s)p)$-dimensional Euclidean space $R^k$ such that for any $M>0$, $\Theta \cap \{\|\boldsymbol\theta\| \le M\}$ is closed,

$\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)}) = \partial \log f(\mathbf{y}_\alpha^{(t)} - \boldsymbol\eta^{(t)} - \mathbf{x}_\alpha^{(t)}\boldsymbol\beta)/\partial\boldsymbol\eta^{(t)}$, $\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta) = \partial \log f(\mathbf{y}_\alpha^{(t)} - \boldsymbol\eta^{(t)} - \mathbf{x}_\alpha^{(t)}\boldsymbol\beta)/\partial\boldsymbol\beta$ ($t=1,\ldots,m$, $\alpha=1,\ldots,n_t$), functions in $\boldsymbol\theta$,

$\|\cdot\|$: the maximum norm of the matrix,

$\hat{\boldsymbol\theta}_n$: maximum likelihood estimate of $\boldsymbol\theta$,

$\mathbf{T}_n$: any other estimator for $\boldsymbol\theta$ based on $n$ observations,

$\mathcal{L}(Y)$, $E(Y)$, $\mathrm{Cov}(Y)$: the distribution, mean and variance-covariance matrix under probability measure $P$, respectively,

$\mathcal{L}(Y;P)$: distribution of $Y$ under the probability measure $P$ which is specified.
We shall make the following assumptions, similar to those of Inagaki [6].
Assumptions

(i) $\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)})$ and $\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta)$ are $\mathcal{A}$-measurable, where $\mathcal{A}$ is the $\sigma$-field of Borel subsets of $\Omega$, and separable when considered as processes in $\boldsymbol\theta$.
(ii) With $\boldsymbol\lambda^{(t)}(\boldsymbol\eta^{(t)}) = E_{\boldsymbol\theta_0}\,\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)})$ and $\boldsymbol\lambda^{(t)}(\boldsymbol\beta) = E_{\boldsymbol\theta_0}\,\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta)$ for any fixed $\boldsymbol\theta_0$,

(2.3)  $\boldsymbol\lambda^{(t)}(\boldsymbol\eta_0^{(t)}) = \mathbf{0}$, $\boldsymbol\lambda^{(t)}(\boldsymbol\beta_0) = \mathbf{0}$,

(2.4)  $\boldsymbol\lambda_n'(\boldsymbol\theta) = \Bigl(\boldsymbol\lambda^{(1)}(\boldsymbol\eta^{(1)}), \ldots, \boldsymbol\lambda^{(m)}(\boldsymbol\eta^{(m)}), \sum_{t=1}^{m} r_t\,\boldsymbol\lambda^{(t)}(\boldsymbol\beta)\Bigr) \ne \mathbf{0}$ for $\boldsymbol\theta \ne \boldsymbol\theta_0$,

where $n$ and $n_t$ are large so that $r_t = n_t/n$ is finite and bounded away from 0 and 1.
(iii) There are two positive constants $A_\infty$ and $H_\infty > 0$ and positive functions $b^{(t)}(\boldsymbol\theta) > 0$, $t=1,\ldots,m$, such that conditions (2.5), (2.6), (2.8) and (2.9), bounding $\max_t b^{(t)}(\boldsymbol\theta)$ and the growth of $\boldsymbol\lambda_n(\boldsymbol\theta)$ as in Inagaki [6], hold, together with

(2.7)  $\lim_{n\to\infty} \liminf_{\|\boldsymbol\theta\|\to\infty} \|\boldsymbol\lambda_n(\boldsymbol\theta)\| \ge A_\infty > 0$,

the last two convergences, (2.8) and (2.9), being uniform for $t=1,\ldots,m$, $\alpha=1,\ldots,n_t$.
(iv) For all $n_t$ and $n$, $E\|\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)}) - \boldsymbol\lambda^{(t)}(\boldsymbol\eta^{(t)})\|^2$ and $E\|\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta) - \boldsymbol\lambda^{(t)}(\boldsymbol\beta)\|^2$ exist, and

$\dfrac{1}{n^2}\sum_{t=1}^{m}\sum_{\alpha=1}^{n_t} E\|\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta) - \boldsymbol\lambda^{(t)}(\boldsymbol\beta)\|^2 \to 0$ as $n\to\infty$.

(v) Let $u_\alpha^{(t)} = \sup_{\|\mathbf{T}-\boldsymbol\theta\| \le d} \|\boldsymbol\phi_\alpha^{(t)}(\mathbf{T}) - \boldsymbol\phi_\alpha^{(t)}(\boldsymbol\theta)\|$, $t=1,\ldots,m$. Then for some constants $H_1$, $H_2$ and $d_0 > 0$, and for $0 < d < d_0$,

$E(u_\alpha^{(t)}) \le H_1 d$, $E(u_\alpha^{(t)})^2 \le H_2 d$,

uniformly in the neighbourhood of $\boldsymbol\theta_0$.

(vi) Let $\Gamma(\boldsymbol\theta_0)$ be the information matrix corresponding to the parameters $\boldsymbol\eta_0^{(1)}, \ldots, \boldsymbol\eta_0^{(m)}, \boldsymbol\beta_0$; it is finite and non-singular.
It may be noted that the derivatives of $\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)})$ w.r.t. $\boldsymbol\eta^{(t')}$ for $t \ne t'$ are null matrices, so that the submatrices in the information matrix corresponding to $\boldsymbol\eta^{(t)}$, $\boldsymbol\eta^{(t')}$ are all null matrices.
(vii) Since the derivatives of the log likelihood function w.r.t. $\boldsymbol\eta^{(t)}$ and $\boldsymbol\beta$ involve the random vector $\mathbf{x}_\alpha^{(t)}$ implicitly, in general we consider all the expectations as unconditional, taking into consideration the following facts:

$E(\mathbf{x}_\alpha^{(t)}) = \mathbf{0}$, $\alpha=1,\ldots,n_t$, $t=1,\ldots,m$,

$E(\mathbf{x}_\alpha^{(t)'}\mathbf{x}_\alpha^{(t)}) = \boldsymbol\Sigma_t^{*}$, finite and positive definite.
3. Consistency and asymptotic normality of the maximum likelihood estimates

Under the set-up in Sec. 2 we consider the estimating function as follows. Let, for $t=1,\ldots,m$,

(3.1)  $\boldsymbol\Delta_{n_t}^{(t)}(\boldsymbol\eta^{(t)}) = \dfrac{1}{\sqrt{n_t}}\sum_{\alpha=1}^{n_t} \boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta^{(t)})$,

(3.2)  $\boldsymbol\Delta_n(\boldsymbol\beta) = \dfrac{1}{\sqrt{n}}\sum_{t=1}^{m}\sum_{\alpha=1}^{n_t} \boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta)$.

Then the estimating function $\boldsymbol\Delta_n(\boldsymbol\theta)$ is given by

(3.3)  $\boldsymbol\Delta_n'(\boldsymbol\theta) = (\boldsymbol\Delta_{n_1}^{(1)'}, \ldots, \boldsymbol\Delta_{n_m}^{(m)'}, \boldsymbol\Delta_n')$.
m
A
A(l)'
A(m)' A,
Let 8' = (n
, •.. ,n
,B ) be the maximum likelihood estimates of
'
~_
-n
= (n
(1) ,
-
(3.4)
,· .. ,n
(m) ,
,
,6).
--
Then we must have
P{lim [~n(~n)]
= o} = 1
n~
.~
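Relation (3.4) says that the m.l.e. annihilates the estimating function. In the special case of a normal $f$ and a single population, the likelihood equations $\boldsymbol\Delta_n(\hat{\boldsymbol\theta}_n) = \mathbf{0}$ reduce to the least-squares normal equations; the sketch below (dimensions and parameter values hypothetical) verifies that the score vanishes at the fitted $\hat{\boldsymbol\theta}_n$:

```python
import numpy as np

rng = np.random.default_rng(1)

p, s, n = 2, 3, 300                        # illustrative dimensions
eta0 = np.array([0.5, -1.0])
beta0 = rng.normal(size=(s, p))
X = rng.normal(size=(n, s))
Y = eta0 + X @ beta0 + rng.normal(size=(n, p))

# With standard normal errors the score in (eta, beta) is linear in the
# residuals, so Delta_n(theta_hat) = 0 is exactly the normal-equations
# condition for the regression of Y on the design D = [1, X].
D = np.hstack([np.ones((n, 1)), X])
theta_hat = np.linalg.lstsq(D, Y, rcond=None)[0]   # first row eta_hat, rest beta_hat

resid = Y - D @ theta_hat
score_norm = np.abs(D.T @ resid).max()             # largest score entry at theta_hat

print(score_norm < 1e-8)                           # the m.l.e. solves Delta_n = 0
```

For a non-normal $f$ the equations are no longer linear, but (3.4) plays the same role: $\hat{\boldsymbol\theta}_n$ is characterized as a root of $\boldsymbol\Delta_n$.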
LEMMA 3.1. Under the assumptions (i), (ii), (iv), (vi) and (vii), $\boldsymbol\Delta_n(\boldsymbol\theta)$ satisfies

(a) $\dfrac{1}{\sqrt{n}}\,\boldsymbol\Delta_n(\boldsymbol\theta_0) \to \mathbf{0}$ in probability,

(b) $\mathcal{L}[\boldsymbol\Delta_n(\boldsymbol\theta_0)] \to N(\mathbf{0}, \Gamma(\boldsymbol\theta_0))$ in law.

PROOF. (a) From the assumptions (i), (ii), (iv) and (vii), each of $\dfrac{1}{n_t}\sum_{\alpha=1}^{n_t}\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\eta_0^{(t)})$, $t=1,\ldots,m$, converges to zero in probability, and since $m$ is finite and $r_t$ bounded, $\dfrac{1}{n}\sum_{t=1}^{m}\sum_{\alpha=1}^{n_t}\boldsymbol\phi_\alpha^{(t)}(\boldsymbol\beta_0)$ converges to zero; this implies that $\dfrac{1}{\sqrt{n}}\,\boldsymbol\Delta_n(\boldsymbol\theta_0)$ converges to zero in probability by the W.L.L.N. (Loève [9], page 274). Hence the result follows immediately.
(b) We consider a matrix $\boldsymbol\ell$ of order $(m+s)\times p$ (so that $\boldsymbol\ell \in R^k$, $k=(m+s)p$) with $\boldsymbol\ell' = (\boldsymbol\ell_1', \ldots, \boldsymbol\ell_m', \boldsymbol\ell_{m+1}')$, where $\boldsymbol\ell_t$ is of order $1\times p$, $t=1,\ldots,m$, and $\boldsymbol\ell_{m+1}$ is of order $s\times p$. Then let us consider the linear function

$T_n = \operatorname{tr} \boldsymbol\ell'\,\boldsymbol\Delta_n(\boldsymbol\theta_0) = \sum_{t=1}^{m} \dfrac{1}{\sqrt{n_t}} \sum_{\alpha=1}^{n_t} u_\alpha^{(t)}$,

where $u_\alpha^{(t)}$ collects the contribution of the $\alpha$-th observation in the $t$-th population. Now from assumptions (ii) and (vi) we have $E(u_\alpha^{(t)}) = 0$ and $\mathrm{Var}(u_\alpha^{(t)})$ (say) is finite. Hence $\{u_\alpha^{(t)}\}$ satisfies the Lindeberg-Lévy condition for the C.L.T., so that $T_n$ is asymptotically distributed as $N(0, G)$, where $G$ is the variance of $T_n$, given by $\operatorname{tr}\boldsymbol\ell'\Gamma(\boldsymbol\theta_0)\boldsymbol\ell$, $\Gamma(\boldsymbol\theta_0)$ being the variance-covariance matrix of $\boldsymbol\Delta_n(\boldsymbol\theta_0)$. Since this result holds for any $\boldsymbol\ell \in R^k$, we have the result (b). Hence the lemma is completely proved.
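The content of Lemma 3.1 can be seen numerically in a toy case. With standard normal errors the $\boldsymbol\eta$-score is simply the residual, so $\boldsymbol\Delta_n(\boldsymbol\eta_0) = n^{-1/2}\sum_\alpha \boldsymbol\varepsilon_\alpha$; over Monte Carlo replications (all settings hypothetical) its mean is near $\mathbf{0}$, consistent with (2.3), and its covariance is near the information matrix, here the identity:

```python
import numpy as np

rng = np.random.default_rng(2)

p, n, R = 2, 200, 4000    # dimension, sample size, replications (illustrative)

# With standard normal errors, phi_alpha(eta_0) = eps_alpha, so the
# estimating function is Delta_n(eta_0) = n^{-1/2} * sum_alpha eps_alpha.
deltas = np.empty((R, p))
for r in range(R):
    eps = rng.normal(size=(n, p))
    deltas[r] = eps.sum(axis=0) / np.sqrt(n)

mean_hat = deltas.mean(axis=0)   # near 0, as E Delta_n(theta_0) = 0
cov_hat = np.cov(deltas.T)       # near I_p = Gamma(theta_0): Lemma 3.1(b)

print(np.round(mean_hat, 2), np.round(cov_hat, 1))
```

Dividing by a further $\sqrt{n}$ instead would drive the statistic to $\mathbf{0}$, which is the scaling in part (a).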
Consistency of the m.l.e. $\hat{\boldsymbol\theta}_n$

$\hat{\boldsymbol\theta}_n$ being the maximum likelihood estimate for $\boldsymbol\theta$, it follows from (3.4) that

$\dfrac{1}{\sqrt{n}}\,\boldsymbol\Delta_n(\hat{\boldsymbol\theta}_n) \to \mathbf{0}$ in $P$.

Hence, in the line of Huber [4] and under the assumptions (i)-(v) and (vii), $\{\hat{\boldsymbol\theta}_n\}$ converges to $\boldsymbol\theta_0$ in $P$.
Asymptotic normality

To obtain the asymptotic distribution of the m.l.e.'s $\hat{\boldsymbol\theta}_n$ in the line of Inagaki [6], let us define the concept of relative compactness (see LeCam [8]) in the following sense.

Definition. In order that $\{\mathcal{L}(\mathbf{Y}_n)\}$ be relatively compact it is necessary and sufficient that for any $\epsilon > 0$ there exists a positive number $M>0$ such that

(3.5)  $P\{\|\mathbf{Y}_n\| > M\} < \epsilon$, for all $n$.

Now $\hat{\boldsymbol\theta}_n$ being the m.l.e. of $\boldsymbol\theta_0$, the relation (3.4) implies that $\{\mathcal{L}(\boldsymbol\Delta_n(\hat{\boldsymbol\theta}_n))\}$ is relatively compact. Hence, from Theorem 3.2 of Inagaki [6], $\{\mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)]\}$ is relatively compact. Hence we have the following.
THEOREM 3.1. Under the assumptions (i)-(iii), if $\{\mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)]\}$ is relatively compact then

(3.6)  $\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)\,\Gamma(\boldsymbol\theta_0) - \boldsymbol\Delta_n(\boldsymbol\theta_0) \to \mathbf{0}$ in $P$,

where

(3.7)  $\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0) = \begin{pmatrix} \sqrt{n_1}\,(\hat{\boldsymbol\eta}^{(1)} - \boldsymbol\eta_0^{(1)}) \\ \vdots \\ \sqrt{n_m}\,(\hat{\boldsymbol\eta}^{(m)} - \boldsymbol\eta_0^{(m)}) \\ \sqrt{n}\,(\hat{\boldsymbol\beta} - \boldsymbol\beta_0) \end{pmatrix}$.

PROOF. Following Lemma 3.2 of Inagaki [6], page 7, it can easily be shown that, under the assumptions (i), (ii), (v) and (vi), for any $M>0$ and large $n$ (putting $\boldsymbol\tau = \sqrt{n}(\mathbf{T} - \boldsymbol\theta_0)$),

(3.8)  $\sup_{\|\boldsymbol\tau\| \le M} \Bigl\| \boldsymbol\Delta_n\Bigl(\boldsymbol\theta_0 + \dfrac{\boldsymbol\tau}{\sqrt{n}}\Bigr) - \boldsymbol\Delta_n(\boldsymbol\theta_0) - \boldsymbol\tau\,\Gamma(\boldsymbol\theta_0) \Bigr\| \to 0$ in $P$.

This shows that $\boldsymbol\Delta_n(\boldsymbol\theta)$ is "weakly asymptotically differentiable" (in the sense of Inagaki [6]). Now since $\{\mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)]\}$ is relatively compact, the result (3.6) follows from (3.8) in the line of Theorem 3.1 of Inagaki [6]. Thus we have the theorem.
THEOREM 3.2. Under the assumptions (i)-(vii),

$\mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)] \to N(\mathbf{0}, \Gamma^{-1}(\boldsymbol\theta_0))$ in law.

PROOF. Since (3.4) holds for the m.l.e.'s $\hat{\boldsymbol\theta}_n$ of $\boldsymbol\theta_0$, we have from Theorem 3.1 that

$\mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)\,\Gamma(\boldsymbol\theta_0)] \to \mathcal{L}[\boldsymbol\Delta_n(\boldsymbol\theta_0)]$.

Hence from Lemma 3.1(b) we have the desired result and the theorem is completely proved.
4. Asymptotic efficiency of $\hat{\boldsymbol\theta}_n$

$\hat{\boldsymbol\theta}_n$, being the m.l.e. of the parameter matrix $\boldsymbol\theta_0$, is complete sufficient for $P_{\boldsymbol\theta_0 + \mathbf{h}/\sqrt{n}}$, where

(4.1)  $\mathbf{h}' = (\mathbf{h}_1', \ldots, \mathbf{h}_m', \mathbf{h}_{m+1}')$,

so that

(4.2)  $\boldsymbol\eta^{(t)} = \boldsymbol\eta_0^{(t)} + \dfrac{\mathbf{h}_t}{\sqrt{n_t}}$, $t=1,\ldots,m$, $\quad \boldsymbol\beta = \boldsymbol\beta_0 + \dfrac{\mathbf{h}_{m+1}}{\sqrt{n}}$.

Then any other estimator $\mathbf{T}_n$ which is location invariant is independent of $\hat{\boldsymbol\theta}_n$ (see Basu [1,2]).
Now $\mathbf{T}_n$ will be said to be asymptotically location invariant at $\boldsymbol\theta_0$ if, for any $\mathbf{h} \in R^k$,

(4.3)  $\mathcal{L}\Bigl[\mathbf{T}_n - \Bigl(\boldsymbol\theta_0 + \dfrac{\mathbf{h}}{\sqrt{n}}\Bigr);\; P_{\boldsymbol\theta_0 + \mathbf{h}/\sqrt{n}}\Bigr] \to L$ in law,

where $L$ is independent of $\mathbf{h}$. The necessary and sufficient condition for this is that $\boldsymbol\Delta_n(\mathbf{T}_n)$ be location invariant at $\boldsymbol\theta_0$; that is,

(4.4)  $\mathcal{L}[\boldsymbol\Delta_n(\mathbf{T}_n);\; P_{\boldsymbol\theta_0 + \mathbf{h}/\sqrt{n}}] \to G$ in law,

where $G$ is independent of $\mathbf{h}$.
Under this set-up, following Theorem 5.1 of Inagaki [6], it can be shown that the limiting distribution $L$ can be expressed as a convolution of the limiting distributions of $\boldsymbol\Delta_n(\boldsymbol\theta_0)$ and $-\boldsymbol\Delta_n(\mathbf{T}_n)$ under $P_{\boldsymbol\theta_0}$, namely $N(\mathbf{0}, \Gamma(\boldsymbol\theta_0))$ and $G^{*}(\mathbf{Z}) = 1 - G(-\mathbf{Z})$; that is,

(4.5)  $L = G^{*} * N(\mathbf{0}, \Gamma(\boldsymbol\theta_0))$.

Hence it follows from Corollary 5.1 of Inagaki [6] that

(4.6)  $\lim \mathcal{L}[\sqrt{n}(\mathbf{T}_n - \boldsymbol\theta_0)] = \lim \mathcal{L}[\sqrt{n}(\mathbf{T}_n - \hat{\boldsymbol\theta}_n)] * \lim \mathcal{L}[\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)]$.
Since $\hat{\boldsymbol\theta}_n$ is a sufficient statistic for $\boldsymbol\theta_0$, it follows from Kaufman [7] that $(\mathbf{T}_n - \hat{\boldsymbol\theta}_n)$ and $\hat{\boldsymbol\theta}_n$ are independently distributed. Hence, if $V^{(1)}$ and $V^{(2)}$ are the dispersion matrices corresponding to $\mathbf{T}_n$ and $\hat{\boldsymbol\theta}_n$, then it follows from (4.6) that $V^{(1)} - V^{(2)}$ must be at least positive semidefinite. This proves that $\hat{\boldsymbol\theta}_n$ is asymptotically efficient as compared with any estimator which is asymptotically location invariant.
5. Relative efficiency of $\hat{\boldsymbol\theta}_n$ compared with the m.l.e. when the parent distribution is multinormal

The m.l. estimators of the parameters of the growth curve model under the Behrens-Fisher situation have been considered in [3], so that when the underlying distribution is multinormal the asymptotic distribution of $\sqrt{n}(\hat{\boldsymbol\theta}_n - \boldsymbol\theta_0)$, defined by (3.7), has been shown to be $N(\mathbf{0}, \Gamma_N^{-1})$, where $\Gamma_N$ is the information matrix with off-diagonal submatrices zero.

To compare the relative efficiency of the estimates $\hat{\boldsymbol\theta}_n$ with that under normality we have only to compare the corresponding dispersion matrices, so that the relative efficiency is given by

(5.1)  $e = \dfrac{|V^{(2)}|}{|V^{(1)}|}$.

This $e \le 1$ provided $V^{(1)} - V^{(2)}$ is at least positive semidefinite.
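Numerically, the comparison in (5.1) reduces to a generalized-variance ratio $e = |V^{(2)}|/|V^{(1)}|$ (the form assumed here, consistent with $e \le 1$ whenever $V^{(1)} - V^{(2)}$ is positive semidefinite). A sketch with hypothetical dispersion matrices:

```python
import numpy as np

V2 = np.array([[2.0, 0.5],
               [0.5, 1.0]])          # dispersion matrix of the m.l.e. (illustrative)
V1 = V2 + np.array([[0.5, 0.0],
                    [0.0, 0.3]])     # competitor: V1 - V2 is positive semidefinite

# V1 - V2 psd implies det(V2) <= det(V1), hence e <= 1.
assert np.all(np.linalg.eigvalsh(V1 - V2) >= 0)
e = np.linalg.det(V2) / np.linalg.det(V1)

print(round(e, 3))                   # -> 0.583: the competitor is less efficient
```

The closer $e$ is to 1, the smaller the loss from using the m.l.e. derived under the weaker distributional assumptions.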
Acknowledgement
The author is grateful to Professor P. K. Sen for his help and guidance
throughout this investigation.
References

[1] Basu, D. (1955): On statistics independent of a complete sufficient statistic, Sankhya, 15, 377-380.

[2] Basu, D. (1958): On statistics independent of sufficient statistics, Sankhya, 18, 223-226.

[3] Chakravorti, S. R. (1973): Maximum likelihood estimation of growth curve model with unequal dispersion matrices, Institute of Statistics Mimeo Series No. 893, University of North Carolina at Chapel Hill.

[4] Huber, P. J. (1967): The behavior of maximum likelihood estimates under nonstandard conditions, Proc. Fifth Berkeley Symp. Math. Statist. Prob., 1, 221-233.

[5] Inagaki, N. (1970): On the limiting distribution of a sequence of estimators with uniformity property, Ann. Inst. Statist. Math., 22, 1-13.

[6] Inagaki, N. (1973): Asymptotic relation between the likelihood estimating function and the maximum likelihood estimator, Ann. Inst. Statist. Math., 25, 1-25.

[7] Kaufman, S. (1966): Asymptotic efficiency of the maximum likelihood estimator, Ann. Inst. Statist. Math., 18, 155-178.

[8] LeCam, L. (1960): Locally asymptotically normal families of distributions, Univ. California Publ. Statist., 3, 37-98.

[9] Loève, M. (1963): Probability Theory, Van Nostrand, Princeton.

[10] Rao, C. R. (1965): The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves, Biometrika, 52, 447-458.