Non-linear regression with multidimensional indices

Statistics & Probability Letters 45 (1999) 175–186
www.elsevier.nl/locate/stapro

Naveen K. Bansal (a), G.G. Hamedani (a), Hao Zhang (b)

(a) Department of Mathematics, Statistics, and Computer Science, Marquette University, Milwaukee, WI 53233, USA
(b) Program in Statistics, Washington State University, Pullman, WA 99164, USA

Received August 1998; received in revised form January 1999
Abstract

We consider a non-linear regression model when the index variable is multidimensional. Sufficient conditions on the non-linear function are given under which the least-squares estimators are strongly consistent and asymptotically normally distributed. These sufficient conditions are satisfied by harmonic type functions, which are also of interest in the one-dimensional index case, where Wu's (Asymptotic theory of non-linear least-squares estimation, Ann. Statist. 9 (1981) 501–513) and Jennrich's (Asymptotic properties of non-linear least-squares estimators, Ann. Math. Statist. 40 (1969) 633–643) sufficient conditions are not applicable. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Consistency; Asymptotic normality; Least-squares method; Harmonic type functions
1. Introduction
Statistical modeling with multidimensional indices is an important problem in spatial or temporo-spatial processes, in signal processing, and in texture modeling. For example, suppose the problem is to model the growth of vegetation on a particular farm over a period of time. This requires modeling with three-dimensional indices. For examples in signal processing, see Rao et al. (1994) and McClellan (1982). For examples in texture modeling, see Francos et al. (1993), Yuan and Subba Rao (1993), and Mandrekar and Zhang (1996). Most of the models in these areas are non-linear regression models. There has been extensive work in the literature on non-linear regression models with a one-dimensional index (Wu, 1981; Jennrich, 1969; Gallant, 1987). A non-linear regression model with a one-dimensional index can be defined as follows:

$$y_t = f(x_t; \theta) + \varepsilon_t, \quad t = 1, 2, \ldots, n, \qquad (1)$$

where the observed data are $\{y_t, t = 1, \ldots, n\}$; $x_t$, $t = 1, \ldots, n$, are some known constants; $\theta \in \Theta \subseteq \mathbb{R}^p$ is an unknown parameter vector; $\varepsilon_t$, $t = 1, \ldots, n$, are the random errors; and $f$ is a known function. The problem of estimating $\theta$ has been investigated quite extensively in the literature.
Wu (1981) and Jennrich (1969) gave some sufficient conditions based on the function $f$ and the design to establish certain asymptotic properties of the least-squares estimators. These conditions, however, are not satisfied if the function $f$ is of harmonic type (Kundu, 1993).
In the present work, we consider an extension of Eq. (1) to multidimensional indices,

$$y_t = f(x_t; \theta) + \varepsilon_t, \quad t \le n, \qquad (2)$$

where $t = (t_1, t_2, \ldots, t_k)$ and $n = (n_1, n_2, \ldots, n_k) \in \mathbb{N}^k$ (the set of $k$-dimensional non-negative integer vectors), $\le$ denotes the partial ordering, i.e., for $m = (m_1, m_2, \ldots, m_k) \in \mathbb{N}^k$ and $n = (n_1, n_2, \ldots, n_k) \in \mathbb{N}^k$, $m \le n$ if $m_i \le n_i$ for $i = 1, \ldots, k$; $\{\varepsilon_t, t \in \mathbb{N}^k\}$ is an independent field of random variables such that

$$E(\varepsilon_t) = 0 \quad \text{and} \quad \mathrm{Var}(\varepsilon_t) = \sigma^2 \quad \forall\, t \in \mathbb{N}^k, \qquad (3)$$

$\theta \in \Theta \subseteq \mathbb{R}^p$ is a parameter vector, $\{x_t, t \in \mathbb{N}^k\}$ is a set of known constant vectors, and $f$ is a known non-linear function.
Remark. For signal processing models (see Rao et al. (1994) for an example), $y_t$, $f(\cdot;\cdot)$ and $\varepsilon_t$ are complex-valued. Here, for notational convenience, we assume them to be real-valued.
A natural way of estimating $\theta$ is by the method of least squares, i.e., by minimizing

$$Q_n(\theta) = \sum_{t \le n} |y_t - f(x_t; \theta)|^2. \qquad (4)$$
Here we will not deal with numerical methods of estimating $\theta$. Our aim is to find sufficient conditions on the function $f(\cdot;\cdot)$ under which the least-squares estimators are strongly consistent and asymptotically normally distributed as $|n| = \prod_{i=1}^k n_i \to \infty$. Note that consistency and asymptotic normality as $|n| \to \infty$ are much stronger results than consistency and asymptotic normality as $\min(n_1, n_2, \ldots, n_k) \to \infty$, as assumed, e.g., in Rao et al. (1994). Our results are also of interest in the one-dimensional case, since they can be applied to harmonic type functions, which do not satisfy Wu's and Jennrich's sufficient conditions (Kundu, 1993).
To illustrate, we will consider the following example, which can be used to model textures (Yuan and Subba Rao, 1993; Mandrekar and Zhang, 1996):

$$y_t = \sum_{k=1}^{m} \mu_k \cos(t_1 \lambda_{1k} + t_2 \lambda_{2k}) + \varepsilon_t, \qquad (5)$$

where $t = (t_1, t_2)$, the $\mu_k$'s are real unknown parameters, and $\lambda_{1k}, \lambda_{2k}$ are unknown parameters in $[0, \pi]$. For the special case $m = 1$, we will show the strong consistency of the least-squares estimator and also establish its asymptotic normality.
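As a concrete illustration of the least-squares problem (4) for model (5) with $m = 1$, the following sketch simulates a noisy two-dimensional cosine field and minimizes $Q_n$ numerically. It is only illustrative: the parameter values, the Gaussian errors, the coarse frequency grid, and the use of scipy's least_squares refinement are our assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import least_squares

# Simulate y_t = mu * cos(t1*lam1 + t2*lam2) + eps_t on a 40 x 40 index grid.
rng = np.random.default_rng(0)
n1, n2 = 40, 40
t1, t2 = np.meshgrid(np.arange(1, n1 + 1), np.arange(1, n2 + 1), indexing="ij")
mu0, lam10, lam20 = 2.0, 1.1, 0.7          # true parameters (illustrative)
y = mu0 * np.cos(t1 * lam10 + t2 * lam20) + rng.normal(0.0, 0.5, (n1, n2))

def profiled_rss(lam1, lam2):
    # For fixed frequencies, the amplitude minimizing Q_n is linear least squares.
    c = np.cos(t1 * lam1 + t2 * lam2)
    mu_hat = np.sum(y * c) / np.sum(c * c)
    return np.sum((y - mu_hat * c) ** 2), mu_hat

# Coarse grid search over the frequencies, then local refinement of Q_n.
grid = np.linspace(0.05, np.pi - 0.05, 60)
best = min(((l1, l2) for l1 in grid for l2 in grid),
           key=lambda p: profiled_rss(*p)[0])
mu_start = profiled_rss(*best)[1]

def residuals(theta):
    mu, lam1, lam2 = theta
    return (y - mu * np.cos(t1 * lam1 + t2 * lam2)).ravel()

fit = least_squares(residuals, x0=(mu_start, *best))
print("estimate (mu, lam1, lam2):", fit.x)   # should be near (2.0, 1.1, 0.7)
```

The grid-then-refine strategy reflects the fact that $Q_n(\theta)$ is highly oscillatory in the frequency parameters, so a purely local method started far from $(\lambda_{10}, \lambda_{20})$ can converge to a spurious minimum.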
In Section 2, we present sufficient conditions on the function $f$ for the strong consistency of $\hat\theta_n$. In Section 3, sufficient conditions for the asymptotic normality of $\hat\theta_n$ are given and its asymptotic distribution is obtained.
2. Strong consistency
Let $\theta_0$ be the true parameter vector. Our objective is to prove the strong consistency of $\hat\theta_n$ in the sense that $\hat\theta_n \to \theta_0$ a.s. as $|n| \to \infty$. Note that if $\{y_t, 1 \le t \le n\}$, where $1 = (1, 1, \ldots, 1) \in \mathbb{N}^k$, is the observed data, then the total number of observations is $|n|$. To prove the strong consistency, we need the following lemma, which is similar to Lemma 1 of Wu (1981).
Lemma 2.1. Let $\{Q_n(\theta)\}$ be a field of measurable functions such that

$$\liminf_{|n| \to \infty}\ \inf_{|\theta - \theta_0| \ge \delta} \frac{1}{|n|}\,(Q_n(\theta) - Q_n(\theta_0)) > 0 \quad \text{a.s.} \qquad (6)$$

for every $\delta > 0$. Then $\hat\theta_n$, which minimizes $Q_n(\theta)$, converges to $\theta_0$ a.s. as $|n| \to \infty$.
Proof. Suppose $\hat\theta_n \nrightarrow \theta_0$ a.s. as $|n| \to \infty$. Then there exist a subsequence $\{n_s, s \ge 1\}$ and $\delta > 0$ such that $|\hat\theta_{n_s} - \theta_0| \ge \delta$ for all $s \ge 1$ with positive probability. Thus, from Eq. (6),

$$Q_{n_s}(\hat\theta_{n_s}) - Q_{n_s}(\theta_0) > 0$$

with positive probability for all $s \ge M$, for some $M > 0$. This contradicts the fact that $\hat\theta_{n_s}$ is a least-squares estimator.
We now make the following assumptions:

Assumption 1. The parameter space $\Theta$ is compact.

Assumption 2. The function $f(x_t; \cdot)$ satisfies
(i) $|f(x_t; \theta_1) - f(x_t; \theta_2)| \le a_t |\theta_1 - \theta_2|$ for all $\theta_1 \ne \theta_2$, where $a_t > 0$, $t \in \mathbb{N}^k$, are some constants such that $\sum_{t \le n} a_t = o(|n|^s)$ for some non-negative integer $s$;
(ii) $\sup_{t \in \mathbb{N}^k,\, \theta \in \Theta} |f(x_t; \theta)| \le M_0$ for some $M_0 > 0$.
Assumption 3. For every $\delta > 0$,

$$\liminf_{|n| \to \infty}\ \inf_{|\theta - \theta_0| \ge \delta} \frac{1}{|n|} \sum_{t \le n} [f(x_t; \theta) - f(x_t; \theta_0)]^2 > 0.$$

Let $\hat\theta_n$ denote the least-squares estimator which minimizes (4).
Theorem 2.2. Under Assumptions 1–3 and Eq. (3), $\hat\theta_n \to \theta_0$ a.s. as $|n| \to \infty$.
Proof. Observe that

$$\frac{1}{|n|}(Q_n(\theta) - Q_n(\theta_0)) = \frac{1}{|n|} \sum_{t \le n} [f(x_t; \theta) - f(x_t; \theta_0)]^2 - \frac{2}{|n|} \sum_{t \le n} [f(x_t; \theta) - f(x_t; \theta_0)]\,\varepsilon_t.$$

Thus, from Assumption 3 and Lemma 2.1, the consistency of $\hat\theta_n$ follows if we prove the following:

$$\sup_{|\theta - \theta_0| \ge \delta} \frac{1}{|n|} \left| \sum_{t \le n} [f(x_t; \theta) - f(x_t; \theta_0)]\,\varepsilon_t \right| \to 0 \quad \text{a.s. as } |n| \to \infty. \qquad (7)$$
Now, since in view of Assumption 2 the function $d(x_t; \theta) = f(x_t; \theta) - f(x_t; \theta_0)$ is bounded, i.e.,

$$\sup_{t \in \mathbb{N}^k,\, \theta \in \Theta} |d(x_t; \theta)| < \infty,$$

and since it satisfies condition (8) below, (7) follows from the following lemma.
Lemma 2.3. Let $g(x_t; \cdot)$ be such that
(i) $|g(x_t; \theta_1) - g(x_t; \theta_2)| \le a_t |\theta_1 - \theta_2|$ for all $\theta_1 \ne \theta_2$, where $\sum_{t \le n} a_t = o(|n|^{s+1})$ for some non-negative integer $s$, \qquad (8)
(ii) $\sup_{t \in \mathbb{N}^k,\, \theta \in \Theta} |g(x_t; \theta)| = B < \infty$, \qquad (9)
and let $\{\varepsilon_t, t \in \mathbb{N}^k\}$ be an i.i.d. real-valued random field with mean 0 and variance $\sigma^2$; then

$$\sup_{\theta \in \Theta} \frac{1}{|n|} \left| \sum_{t \le n} g(x_t; \theta)\,\varepsilon_t \right| \to 0 \quad \text{a.s. as } |n| \to \infty.$$
Proof. To prove this, we use the following result of Mikosch and Norvaisa (1987) on Banach space valued random fields.

Let $\{X_n, n \in \mathbb{N}^k\}$ be a field of Banach space valued random variables with norm $\|\cdot\|$. Let $\{a_n, n \in \mathbb{N}^k\}$ be a field of positive numbers such that $\lim_{|n| \to \infty} a_n = \infty$, it satisfies conditions A–D of Mikosch and Norvaisa (1987), and

$$\{a_n\} \in F_2 = \left\{ \{a_n\} : \sum_{j=q}^{\infty} j^{-3}|A_j| = O(q^{-2}|A_q|),\ q \to \infty \right\}, \qquad (10)$$

where $A_j = \{n : a_n \le j\}$ and $|A_j|$ is the cardinality of $A_j$.

If there exist a constant $c > 0$ and a positive random variable $X$ such that

$$\sup_{n \in \mathbb{N}^k} P(\|X_n\| > x) \le c\,P(X > x) \quad \forall\, x > 0, \qquad (11)$$

$$\sum_{n \in \mathbb{N}^k} P(X > a_n) < \infty, \qquad (12)$$

then $\lim_{|n| \to \infty} \|(1/a_n)\sum_{j \le n} X_j\| = 0$ a.s. is equivalent to

$$\lim_{|n| \to \infty} \left\| \frac{1}{a_n} \sum_{j \le n} X_j \right\| = 0 \quad \text{in probability.} \qquad (13)$$
We define $X_t = g(x_t; \cdot)\,\varepsilon_t$. Then $\{X_t, t \in \mathbb{N}^k\}$ is a field of $C(\Theta)$-valued random variables, where $C(\Theta)$ is the space of continuous functions on the compact metric space $\Theta$ with the sup norm. Define $a_n = |n|$. It is easy to see that $\{a_n, n \in \mathbb{N}^k\}$ satisfies conditions A–D of Mikosch and Norvaisa (1987).

Now, from the above-mentioned result of Mikosch and Norvaisa, it suffices to show that $\{a_n, n \in \mathbb{N}^k\}$ and $\{X_n, n \in \mathbb{N}^k\}$ satisfy (10)–(13).
To show Eq. (10), we need to prove

$$\sum_{j=q}^{\infty} j^{-3}|A_j| = O(q^{-2}|A_q|) \quad \text{as } q \to \infty. \qquad (14)$$

Using mathematical induction, it can be shown that

$$\int \cdots \int_{1 \le y_1 y_2 \cdots y_k \le j} dy_1 \cdots dy_k = j\left[ \frac{(\ln j)^{k-1}}{(k-1)!} - \frac{(\ln j)^{k-2}}{(k-2)!} + \cdots \right].$$

Therefore

$$|A_j| \sim j(\ln j)^{k-1} \quad \text{as } j \to \infty. \qquad (15)$$
Thus

$$\sum_{j=q}^{\infty} j^{-3}|A_j| \sim \sum_{j=q}^{\infty} j^{-3} j(\ln j)^{k-1} \sim \int_q^{\infty} y^{-2}(\ln y)^{k-1}\,dy. \qquad (16)$$

(The notation $b_j \sim c_j$ means $b_j/c_j = O(1)$ and $c_j/b_j = O(1)$, i.e., as $j \to \infty$, $M_1 c_j \le b_j \le M_2 c_j$ for some $M_1, M_2 > 0$.)
Now, since

$$0 < y^{-2}[(\ln y)^{k-1} - (\ln K)^{k-1}] < y^{-1.5} \quad \text{for all } y > K,$$

where $K$ is sufficiently large, we have

$$0 < \int_K^{\infty} y^{-2}(\ln y)^{k-1}\,dy - (\ln K)^{k-1}\int_K^{\infty} y^{-2}\,dy < \infty.$$

We now conclude, from Eqs. (15) and (16), that

$$\sum_{j=q}^{\infty} j^{-3}|A_j| \sim \int_q^{\infty} y^{-2}(\ln y)^{k-1}\,dy \sim q^{-1}(\ln q)^{k-1} \sim O(q^{-2}|A_q|) \quad \text{as } q \to \infty.$$

This proves Eq. (14).
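The growth rate (15) used above is easy to check numerically. The sketch below counts, for $k = 2$, the indices with positive coordinates satisfying $n_1 n_2 \le j$ and compares the count with $j(\ln j)^{k-1}$; the specific values of $j$ are arbitrary.

```python
import numpy as np

# Numerical check of (15) for k = 2: |A_j| = #{(n1, n2) >= 1 : n1 * n2 <= j}
# should grow like j * (ln j)^(k-1); the ratio below stabilizes near 1.
for j in [10**2, 10**3, 10**4, 10**5]:
    n1 = np.arange(1, j + 1)
    count = np.sum(j // n1)          # sum over n1 of #{n2 : n2 <= j / n1}
    print(j, int(count), count / (j * np.log(j)))
```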
To prove Eqs. (11) and (12), note that, since $g(x_t; \theta)$ is bounded, for any $x > 0$,

$$P(\|X_t\| > x) = P\left( \sup_{\theta \in \Theta} |g(x_t; \theta)|\,|\varepsilon_t| > x \right) \le P(B|\varepsilon_1| > x), \qquad (17)$$

where $B$ is the upper bound of $|g(\cdot;\cdot)|$. Now, substituting $X = B|\varepsilon_1|$, we get, by Chebyshev's inequality,

$$\sum_{n \in \mathbb{N}^k} P(X > |n|) \le \sum_{n \in \mathbb{N}^k} \frac{B^2\sigma^2}{|n|^2} < \infty. \qquad (18)$$

The inequalities (17) and (18) prove (11) and (12), respectively.
To prove Eq. (13), we first note that, since $\Theta$ is compact, there exists a constant $R > 0$ such that $\Theta$ is covered by the $p$-dimensional cube $[-R, R]^p$. We now form a grid of cubes with sides of length $R/|n|^s$ over $[-R, R]^p$, given by

$$C^{(n)}_{i_1,\ldots,i_p} = \left[ \frac{Ri_1}{|n|^s}, \frac{R(i_1+1)}{|n|^s} \right) \times \cdots \times \left[ \frac{Ri_p}{|n|^s}, \frac{R(i_p+1)}{|n|^s} \right),$$

where $(i_1, \ldots, i_p) \in I_n$, with each $i_j \in \{0, \pm 1, \ldots, \pm(|n|^s - 1), -|n|^s\}$. Let $\theta^{(n)}_{i_1,\ldots,i_p}$ be the center of the cube $C^{(n)}_{i_1,\ldots,i_p}$.
Now, for any $\eta > 0$, $h > 0$, and for any $\theta \in \Theta$,

$$P\left( \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t > \eta \right) \le \frac{E[\exp\{h\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t\}]}{\exp\{h|n|\eta\}} = \prod_{t \le n} \exp\{-[h\eta - \kappa(g(x_t; \theta)h)]\}, \qquad (19)$$

where $\kappa(\cdot)$ is the cumulant generating function of $\varepsilon_t$.

Since the mean of $\varepsilon_t$ is zero and the second moment of $\varepsilon_t$ exists,

$$\kappa(g(x_t; \theta)h) = \tfrac{1}{2}\sigma^2 g^2(x_t; \theta)h^2 + o(g^2(x_t; \theta)h^2). \qquad (20)$$

Therefore, since $g(\cdot;\cdot)$ is bounded, there exist $h_0^{(1)} > 0$ and $\delta_0^{(1)} > 0$ such that

$$\eta - \left( \tfrac{1}{2}\sigma^2 g^2(x_t; \theta)h + o(g^2(x_t; \theta)h) \right) \ge \delta_0^{(1)} \quad \text{for all } 0 < h \le h_0^{(1)}. \qquad (21)$$

Thus, from (19)–(21), we have

$$P\left( \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t > \eta \right) \le \mathrm{e}^{-h|n|\delta_0^{(1)}} \quad \text{for all } 0 < h \le h_0^{(1)}. \qquad (22)$$
Similarly, it can be seen that there exist $h_0^{(2)} > 0$ and $\delta_0^{(2)} > 0$ such that

$$P\left( \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t < -\eta \right) \le \frac{E[\exp\{-h\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t\}]}{\exp\{h|n|\eta\}} \le \mathrm{e}^{-h|n|\delta_0^{(2)}} \quad \text{for all } 0 < h \le h_0^{(2)}. \qquad (23)$$
Now, we choose $\delta_0 = \min(\delta_0^{(1)}, \delta_0^{(2)})$ and $h_0 = \min(h_0^{(1)}, h_0^{(2)})$. Then, from Eqs. (22) and (23), for any $\theta \in \Theta$,

$$P\left( \frac{1}{|n|}\left| \sum_{t \le n} g(x_t; \theta)\,\varepsilon_t \right| > \eta \right) \le \exp\{-h_0|n|\delta_0\}. \qquad (24)$$
For every $\theta \in \Theta$, there exists a cube $C^{(n)}_{i_1,\ldots,i_p}$ such that $\theta \in C^{(n)}_{i_1,\ldots,i_p}$. Thus, since $|\theta - \theta^{(n)}_{i_1,\ldots,i_p}| \le \sqrt{p}\,R/(2|n|^s)$ for this cube, using assumption (8) we get

$$\left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t \right| \le \frac{1}{|n|}\sum_{t \le n} |g(x_t; \theta) - g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})|\,|\varepsilon_t| + \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})\,\varepsilon_t \right|$$
$$\le \frac{\sqrt{p}\,R}{2|n|^{s+1}}\sum_{t \le n} a_t|\varepsilon_t| + \sup_{(i_1,\ldots,i_p)\in I_n} \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})\,\varepsilon_t \right|$$

for all $\theta \in \Theta$. Thus

$$\sup_{\theta \in \Theta} \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta)\,\varepsilon_t \right| \le \frac{\sqrt{p}\,R}{2|n|^{s+1}}\sum_{t \le n} a_t|\varepsilon_t| + \sup_{(i_1,\ldots,i_p)\in I_n} \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})\,\varepsilon_t \right|. \qquad (25)$$

Now, using the condition on $a_t$ and Markov's inequality, it can be seen that

$$\frac{1}{|n|^{s+1}}\sum_{t \le n} a_t|\varepsilon_t| \to 0 \quad \text{in probability as } |n| \to \infty. \qquad (26)$$

Also, from Eq. (24),

$$P\left( \sup_{(i_1,\ldots,i_p)\in I_n} \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})\,\varepsilon_t \right| > \eta \right) \le \sum_{(i_1,\ldots,i_p)\in I_n} P\left( \left| \frac{1}{|n|}\sum_{t \le n} g(x_t; \theta^{(n)}_{i_1,\ldots,i_p})\,\varepsilon_t \right| > \eta \right)$$
$$\le 2^p|n|^{sp}\,\mathrm{e}^{-h_0|n|\delta_0} \to 0 \quad \text{as } |n| \to \infty. \qquad (27)$$

Thus, Eq. (13) now follows from Eqs. (25)–(27).
This completes the proof of Lemma 2.3 and thus the proof of Theorem 2.2.
We now show that the least-squares estimators for model (5) are strongly consistent under the error condition (3). For notational convenience, we assume that $m = 1$ and deal only with

$$y_t = \mu\cos(\lambda_1 t_1 + \lambda_2 t_2) + \varepsilon_t, \quad 1 \le t \le n, \qquad (28)$$

where $\lambda_1 \in [-\pi, \pi]$ and $\lambda_2 \in [0, \pi]$. Further, we assume that $m \le |\mu| \le M < \infty$ for some $M > m > 0$. This is a reasonable assumption since $\mu$ represents the amplitude of the waves.

To prove the strong consistency, via Theorem 2.2, it suffices to show that Assumptions 1–3 are satisfied for this model.
The parameter space is $\Theta = \{(\mu, \lambda_1, \lambda_2) : m \le |\mu| \le M,\ \lambda_1 \in [-\pi, \pi],\ \lambda_2 \in [0, \pi]\}$. Clearly Assumption 1 is satisfied. To verify Assumption 2, we let $\theta_1 = (\mu_1, \lambda_{11}, \lambda_{21})^T$ and $\theta_2 = (\mu_2, \lambda_{12}, \lambda_{22})^T$ be any two points in $\Theta$. Then

$$|\mu_1\cos(\lambda_{11}t_1 + \lambda_{21}t_2) - \mu_2\cos(\lambda_{12}t_1 + \lambda_{22}t_2)|$$
$$= |(\mu_1 - \mu_2)\cos(\lambda_{11}t_1 + \lambda_{21}t_2) + \mu_2(\cos(\lambda_{11}t_1 + \lambda_{21}t_2) - \cos(\lambda_{12}t_1 + \lambda_{22}t_2))|$$
$$\le |\mu_1 - \mu_2| + |\mu_2|\,|\cos(\lambda_{11}t_1 + \lambda_{21}t_2) - \cos(\lambda_{12}t_1 + \lambda_{22}t_2)|$$
$$\le |\mu_1 - \mu_2| + |\mu_2|\,|\lambda_{11} - \lambda_{12}|\,t_1 + |\mu_2|\,|\lambda_{21} - \lambda_{22}|\,t_2$$
$$\le |t|\,(|\mu_1 - \mu_2| + M|\lambda_{11} - \lambda_{12}| + M|\lambda_{21} - \lambda_{22}|)$$
$$\le |t|\,|\theta_1 - \theta_2|\,(1 + 2M^2)^{1/2}.$$

Thus Assumption 2(i) follows with $a_t = (1 + 2M^2)^{1/2}|t|$ and $s = 3$. Assumption 2(ii) is clearly satisfied.
To verify Assumption 3, we let $\theta = (\mu, \lambda_1, \lambda_2)^T$ and $\theta_0 = (\mu_0, \lambda_{10}, \lambda_{20})^T \in \Theta$. Then

$$\frac{1}{|n|}\sum_{t \le n} [\mu\cos(\lambda_1 t_1 + \lambda_2 t_2) - \mu_0\cos(\lambda_{10}t_1 + \lambda_{20}t_2)]^2$$
$$= \frac{1}{|n|}\sum_{t \le n}\left[ \mu\,\frac{\mathrm{e}^{i(\lambda_1 t_1 + \lambda_2 t_2)} + \mathrm{e}^{-i(\lambda_1 t_1 + \lambda_2 t_2)}}{2} - \mu_0\,\frac{\mathrm{e}^{i(\lambda_{10}t_1 + \lambda_{20}t_2)} + \mathrm{e}^{-i(\lambda_{10}t_1 + \lambda_{20}t_2)}}{2} \right]^2$$
$$= \frac{\mu^2}{4}\left[ 2 + \frac{1}{|n|}\sum_{t \le n}\mathrm{e}^{2i(\lambda_1 t_1 + \lambda_2 t_2)} + \frac{1}{|n|}\sum_{t \le n}\mathrm{e}^{-2i(\lambda_1 t_1 + \lambda_2 t_2)} \right]$$
$$\quad + \frac{\mu_0^2}{4}\left[ 2 + \frac{1}{|n|}\sum_{t \le n}\mathrm{e}^{2i(\lambda_{10}t_1 + \lambda_{20}t_2)} + \frac{1}{|n|}\sum_{t \le n}\mathrm{e}^{-2i(\lambda_{10}t_1 + \lambda_{20}t_2)} \right]$$
$$\quad - \frac{\mu\mu_0}{2|n|}\sum_{t \le n}\left[ \mathrm{e}^{i((\lambda_1+\lambda_{10})t_1 + (\lambda_2+\lambda_{20})t_2)} + \mathrm{e}^{-i((\lambda_1+\lambda_{10})t_1 + (\lambda_2+\lambda_{20})t_2)} + \mathrm{e}^{i((\lambda_1-\lambda_{10})t_1 + (\lambda_2-\lambda_{20})t_2)} + \mathrm{e}^{-i((\lambda_1-\lambda_{10})t_1 + (\lambda_2-\lambda_{20})t_2)} \right]. \qquad (29)$$

Now, since, for all $\omega_1, \omega_2 \in (0, 2\pi)$,

$$\sum_{t \le n}\mathrm{e}^{i(\omega_1 t_1 + \omega_2 t_2)} = \mathrm{e}^{i(\omega_1+\omega_2)}\,\frac{(1 - \mathrm{e}^{in_1\omega_1})(1 - \mathrm{e}^{in_2\omega_2})}{(1 - \mathrm{e}^{i\omega_1})(1 - \mathrm{e}^{i\omega_2})},$$

Assumption 3 follows from Eq. (29).
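Assumption 3 for model (28) can also be checked numerically. The following sketch (with illustrative parameter values and grids of our choosing) evaluates $(1/|n|)\sum_{t\le n}[f(x_t;\theta) - f(x_t;\theta_0)]^2$ over a grid of $\theta$ with $|\theta - \theta_0| \ge \delta$ and confirms that the minimum stays away from zero.

```python
import numpy as np

# Spot-check of Assumption 3 for model (28); parameter values are illustrative.
n1, n2 = 200, 200
t1, t2 = np.meshgrid(np.arange(1, n1 + 1), np.arange(1, n2 + 1), indexing="ij")

def mean_surface(mu, lam1, lam2):
    return mu * np.cos(lam1 * t1 + lam2 * t2)

theta0 = np.array([2.0, 1.1, 0.7])
f0 = mean_surface(*theta0)
delta, vals = 0.3, []
for mu in np.linspace(0.5, 3.0, 6):
    for lam1 in np.linspace(0.1, np.pi - 0.1, 8):
        for lam2 in np.linspace(0.1, np.pi - 0.1, 8):
            if np.linalg.norm([mu, lam1, lam2] - theta0) >= delta:
                vals.append(np.mean((mean_surface(mu, lam1, lam2) - f0) ** 2))
print("min over grid:", min(vals))   # strictly positive, as Assumption 3 requires
```

When the frequencies differ, the cross terms in (29) average out and the limit is $\mu^2/2 + \mu_0^2/2$; when only the amplitude differs, it is $(\mu - \mu_0)^2/2$; both are bounded away from zero on $|\theta - \theta_0| \ge \delta$.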
3. Asymptotic normality
Observe that the least-squares estimator $\hat\theta_n$ is a solution of $Q_n'(\theta) = 0$. (Primes are used to denote derivatives with respect to $\theta$.) Thus, expanding $Q_n'(\theta)$ about $\theta_0$ (the true parameter value) and evaluating it at $\hat\theta_n$, we get

$$0 = Q_n'(\theta_0) + Q_n''(\hat\theta_n^*)(\hat\theta_n - \theta_0), \qquad (30)$$

where $\hat\theta_n^* = c\theta_0 + (1-c)\hat\theta_n$ for some $0 \le c \le 1$. Note that

$$Q_n'(\theta) = -2\sum_{t \le n}(y_t - f(x_t; \theta))\,f'(x_t; \theta), \qquad (31)$$
$$Q_n''(\theta) = -2\sum_{t \le n}(y_t - f(x_t; \theta))\,f''(x_t; \theta) + 2\sum_{t \le n}[f'(x_t; \theta)][f'(x_t; \theta)]^T$$
$$= -2\sum_{t \le n}\varepsilon_t f''(x_t; \theta) + 2\sum_{t \le n}[f(x_t; \theta) - f(x_t; \theta_0)]\,f''(x_t; \theta) + 2\sum_{t \le n}[f'(x_t; \theta)][f'(x_t; \theta)]^T. \qquad (32)$$
Now, we impose the following assumptions on the function f(·; ·) in addition to Assumptions 1–3.
Assumption 4. $f(x_t; \cdot)$ is twice continuously differentiable in $\Theta$.

Assumption 5. Let $\{D_n, n \in \mathbb{N}^k\}$ be a field of $p \times p$ non-singular matrices such that
(i) $(1/|n|)\,D_n^T\sum_{t \le n}[f'(x_t; \theta)][f'(x_t; \theta)]^T D_n$ converges to a positive definite matrix $\Sigma(\theta_0)$ uniformly as $|n| \to \infty$ and $|\theta - \theta_0| \to 0$;
(ii) $(1/|n|)\,D_n^T\sum_{t \le n}[f(x_t; \theta) - f(x_t; \theta_0)]\,f''(x_t; \theta)\,D_n \to 0$ uniformly as $|n| \to \infty$ and $|\theta - \theta_0| \to 0$;
(iii) $\|D_n^T f''(x_t; \theta)\,D_n\|_E \le M_{00}$ for all $\theta \in \Theta$, for some $M_{00} > 0$, where $\|\cdot\|_E$ is the Euclidean norm on matrices;
(iv) $\|D_n^T(f''(x_t; \theta_1) - f''(x_t; \theta_2))D_n\|_E \le b_{t,n}|\theta_1 - \theta_2|$ for all $\theta_1 \ne \theta_2$, where $b_{t,n} > 0$, $t \in \mathbb{N}^k$, are some constants such that $\sum_{t \le n} b_{t,n} = o(|n|^{r+1})$ for some non-negative integer $r$;
(v) $\max_{1 \le t \le n}(1/\sqrt{|n|})\,\|D_n^T f'(x_t; \theta_0)\|_E \to 0$ as $|n| \to \infty$.
Theorem 3.1. Under Assumptions 1–5 and (3),

$$\sqrt{|n|}\,D_n^{-1}(\hat\theta_n - \theta_0) \xrightarrow{L} N_p(0, \sigma^2\Sigma^{-1}(\theta_0)) \quad \text{as } |n| \to \infty.$$

Proof. From Eq. (30),

$$\sqrt{|n|}\,D_n^{-1}(\hat\theta_n - \theta_0) = -\left[ \frac{1}{|n|}D_n^T Q_n''(\hat\theta_n^*)\,D_n \right]^{-1}\frac{1}{\sqrt{|n|}}\,D_n^T Q_n'(\theta_0). \qquad (33)$$
Now, from Eq. (32),

$$\frac{1}{|n|}D_n^T Q_n''(\theta)\,D_n = -\frac{2}{|n|}\sum_{t \le n}\varepsilon_t\,[D_n^T f''(x_t; \theta)\,D_n] + \frac{2}{|n|}D_n^T\sum_{t \le n}[f(x_t; \theta) - f(x_t; \theta_0)]\,f''(x_t; \theta)\,D_n + \frac{2}{|n|}D_n^T\sum_{t \le n}[f'(x_t; \theta)][f'(x_t; \theta)]^T D_n. \qquad (34)$$

Using Assumption 5 parts (iii) and (iv), it can be shown, as in the proof of Lemma 2.3, that

$$\sup_{\theta \in \Theta}\left\| \frac{1}{|n|}\sum_{t \le n}\varepsilon_t\,D_n^T f''(x_t; \theta)\,D_n \right\|_E \to 0 \quad \text{in probability as } |n| \to \infty. \qquad (35)$$
Since $\hat\theta_n \to \theta_0$ a.s., $\hat\theta_n^* \to \theta_0$ a.s. as $|n| \to \infty$; thus, from Eqs. (34) and (35) and Assumption 5 parts (i) and (ii), we obtain

$$\frac{1}{|n|}D_n^T Q_n''(\hat\theta_n^*)\,D_n \to 2\Sigma(\theta_0) \quad \text{in probability as } |n| \to \infty. \qquad (36)$$
From Eq. (31),

$$\frac{1}{\sqrt{|n|}}D_n^T Q_n'(\theta_0) = -\frac{2}{\sqrt{|n|}}\sum_{t \le n}\varepsilon_t\,D_n^T f'(x_t; \theta_0). \qquad (37)$$
To establish the asymptotic normality of the above, we first state the multidimensional-indices version of the Hajek–Sidak theorem (Sen and Singer, 1993).

Lemma 3.2. Let $\{X_n, n \in \mathbb{N}^k\}$ be a field of independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$, and let $\{c_{m,n}, m \le n \in \mathbb{N}^k\}$ be a field of real numbers such that

$$\max_{1 \le m \le n}\frac{c_{m,n}^2}{\sum_{m \le n}c_{m,n}^2} \to 0 \quad \text{as } |n| \to \infty.$$

Then

$$\frac{\sum_{m \le n}c_{m,n}(X_m - \mu)}{\sqrt{\sum_{m \le n}c_{m,n}^2}} \xrightarrow{L} N(0, \sigma^2) \quad \text{as } |n| \to \infty.$$

The proof of this follows along the standard lines of the Hajek–Sidak theorem.
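As a quick sanity check of Lemma 3.2 (with weights, error distribution, and grid size of our choosing), the following sketch simulates a 2-D i.i.d. field, forms the normalized weighted sum, and verifies that it behaves like $N(0, \sigma^2)$.

```python
import numpy as np

# Monte Carlo illustration of Lemma 3.2 on a 2-D index set; the weights
# c_{m,n} below satisfy max c^2 / sum c^2 -> 0 as the grid grows.
rng = np.random.default_rng(1)
n1, n2, reps = 30, 30, 2000
m1, m2 = np.meshgrid(np.arange(1, n1 + 1), np.arange(1, n2 + 1), indexing="ij")
c = np.cos(1.3 * m1 + 0.4 * m2)          # illustrative deterministic weights
z = np.empty(reps)
for r in range(reps):
    x = rng.exponential(scale=1.0, size=(n1, n2))   # mean 1, variance 1
    z[r] = np.sum(c * (x - 1.0)) / np.sqrt(np.sum(c ** 2))
print("sample mean %.3f, sample variance %.3f (theory: 0, 1)" % (z.mean(), z.var()))
```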
Using Lemma 3.2 and Assumption 5 parts (i) and (v), we have, for any $\lambda \in \mathbb{R}^p$,

$$\frac{\lambda^T\sum_{t \le n}\varepsilon_t\,D_n^T f'(x_t; \theta_0)}{\sqrt{\lambda^T\sum_{t \le n}D_n^T[f'(x_t; \theta_0)][f'(x_t; \theta_0)]^T D_n\,\lambda}} \xrightarrow{L} N(0, \sigma^2).$$

From Assumption 5(i), we obtain

$$\frac{1}{\sqrt{|n|}}\sum_{t \le n}\varepsilon_t\,\lambda^T D_n^T f'(x_t; \theta_0) \xrightarrow{L} N(0, \sigma^2\lambda^T\Sigma(\theta_0)\lambda) \quad \text{for any } \lambda \in \mathbb{R}^p,$$

and, in view of Eq. (37),

$$\frac{1}{\sqrt{|n|}}D_n^T Q_n'(\theta_0) \xrightarrow{L} N(0, 4\sigma^2\Sigma(\theta_0)).$$

Now, the proof of the theorem follows from Eqs. (33) and (36).
Remark. If Assumption 5 holds only for a subsequence of $\{n, n \in \mathbb{N}^k\}$ such that $|n| \to \infty$, then the result of Theorem 3.1 holds for that subsequence. This will be seen in the example below.

We now consider the asymptotic normality of the least-squares estimators for model (5). We will show that the parameters of the asymptotic normal distribution depend on subsequences of $n$. Asymptotic normality can be obtained for two types of subsequences: (i) a subsequence for which $\min(n_1, n_2) \to \infty$, and (ii) a subsequence for which one of $n_1$ and $n_2$ tends to $\infty$ while the other is kept constant. For notational convenience, as in Section 2, we assume that $m = 1$ and deal only with model (28). Let $\hat\theta_n = (\hat\mu_n, \hat\lambda_{1n}, \hat\lambda_{2n})^T$ be the least-squares estimator, and let $\theta_0 = (\mu_0, \lambda_{10}, \lambda_{20})^T$ be the true parameter vector. Let

$$D_n = \mathrm{Diag}(1, n_1^{-1}, n_2^{-1}).$$
In order to apply Theorem 3.1, we need to verify Assumptions 4 and 5. Assumption 4 is clearly satisfied. To verify Assumption 5, observe that

$$f'(x_t; \theta) = \begin{pmatrix} \cos(t_1\lambda_1 + t_2\lambda_2) \\ -\mu t_1\sin(t_1\lambda_1 + t_2\lambda_2) \\ -\mu t_2\sin(t_1\lambda_1 + t_2\lambda_2) \end{pmatrix}. \qquad (38)$$

Thus, writing $\phi_t = t_1\lambda_1 + t_2\lambda_2$,

$$\sum_{t \le n} f'(x_t; \theta)[f'(x_t; \theta)]^T = \begin{pmatrix} \sum_{t \le n}\cos^2\phi_t & -\frac{\mu}{2}\sum_{t \le n}t_1\sin 2\phi_t & -\frac{\mu}{2}\sum_{t \le n}t_2\sin 2\phi_t \\ -\frac{\mu}{2}\sum_{t \le n}t_1\sin 2\phi_t & \mu^2\sum_{t \le n}t_1^2\sin^2\phi_t & \mu^2\sum_{t \le n}t_1 t_2\sin^2\phi_t \\ -\frac{\mu}{2}\sum_{t \le n}t_2\sin 2\phi_t & \mu^2\sum_{t \le n}t_1 t_2\sin^2\phi_t & \mu^2\sum_{t \le n}t_2^2\sin^2\phi_t \end{pmatrix}. \qquad (39)$$

Note that

$$\sum_{t \le n}\cos^2(t_1\lambda_1 + t_2\lambda_2) = \frac{|n|}{2} + \frac{1}{2}\sum_{t \le n}\cos 2(t_1\lambda_1 + t_2\lambda_2),$$

and

$$\sum_{t \le n}\mathrm{e}^{2i(t_1\lambda_1 + t_2\lambda_2)} = \sum_{1 \le t_1 \le n_1}\mathrm{e}^{2i\lambda_1 t_1}\sum_{1 \le t_2 \le n_2}\mathrm{e}^{2i\lambda_2 t_2} = \mathrm{e}^{2i(\lambda_1+\lambda_2)}\,\frac{(1 - \mathrm{e}^{2i\lambda_1 n_1})(1 - \mathrm{e}^{2i\lambda_2 n_2})}{(1 - \mathrm{e}^{2i\lambda_1})(1 - \mathrm{e}^{2i\lambda_2})}.$$

This term is of order $O(1)$. This implies that

$$\frac{1}{|n|}\sum_{t \le n}\cos^2(t_1\lambda_1 + t_2\lambda_2) = \frac{1}{2} + \frac{1}{2|n|}\sum_{t \le n}\cos 2(t_1\lambda_1 + t_2\lambda_2) \to \frac{1}{2} \quad \text{as } |n| \to \infty. \qquad (40)$$

By differentiating Eq. (40) with respect to $\lambda_1$, we get

$$\sum_{t \le n}t_1\sin 2(t_1\lambda_1 + t_2\lambda_2) = O(n_1).$$
Thus

$$\frac{1}{|n|}\,\frac{1}{n_1}\sum_{t \le n}t_1\sin 2(t_1\lambda_1 + t_2\lambda_2) \to 0 \quad \text{as } |n| \to \infty.$$

Similarly, it can be seen that

$$\frac{1}{|n|}\,\frac{1}{n_2}\sum_{t \le n}t_2\sin 2(t_1\lambda_1 + t_2\lambda_2) \to 0 \quad \text{as } |n| \to \infty.$$

By differentiating (40) twice with respect to $\lambda_1$, we get

$$\sum_{t \le n}t_1^2\cos 2(t_1\lambda_1 + t_2\lambda_2) = O(n_1^2).$$

Thus

$$\frac{1}{|n|}\,\frac{1}{n_1^2}\sum_{t \le n}t_1^2\sin^2(t_1\lambda_1 + t_2\lambda_2) = \frac{1}{2|n|n_1^2}\sum_{t \le n}t_1^2 - \frac{1}{2|n|n_1^2}\sum_{t \le n}t_1^2\cos 2(t_1\lambda_1 + t_2\lambda_2)$$
$$= \frac{(n_1+1)(2n_1+1)}{12n_1^2} - \frac{1}{2|n|n_1^2}\sum_{t \le n}t_1^2\cos 2(t_1\lambda_1 + t_2\lambda_2),$$

which converges to $1/6$ if $\min(n_1, n_2) \to \infty$ or if $n_1 \to \infty$, and converges to $(n_1+1)(2n_1+1)/12n_1^2$ if $n_1$ is fixed but $n_2 \to \infty$.

Similarly, it can be seen that $(1/|n|)(1/(n_1 n_2))\sum_{t \le n}t_1 t_2\sin^2(t_1\lambda_1 + t_2\lambda_2)$ converges to $1/8$ if $\min(n_1, n_2) \to \infty$, converges to $(n_1+1)/8n_1$ if $n_1$ is fixed but $n_2 \to \infty$, and converges to $(n_2+1)/8n_2$ if $n_2$ is fixed but $n_1 \to \infty$. Also, $(1/|n|)(1/n_2^2)\sum_{t \le n}t_2^2\sin^2(t_1\lambda_1 + t_2\lambda_2)$ converges to $1/6$ if $\min(n_1, n_2) \to \infty$ or if $n_2 \to \infty$, and converges to $(n_2+1)(2n_2+1)/12n_2^2$ if $n_2$ is fixed but $n_1 \to \infty$.
Thus, we have that Assumption 5(i) is satisfied with

$$\Sigma(\theta_0) = \begin{pmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & \mu_0^2/6 & \mu_0^2/8 \\ 0 & \mu_0^2/8 & \mu_0^2/6 \end{pmatrix} \quad \text{if } \min(n_1, n_2) \to \infty, \qquad (41)$$

with

$$\Sigma(\theta_0) = \begin{pmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & \mu_0^2\,\dfrac{(n_1+1)(2n_1+1)}{12n_1^2} & \mu_0^2\,\dfrac{n_1+1}{8n_1} \\ 0 & \mu_0^2\,\dfrac{n_1+1}{8n_1} & \mu_0^2/6 \end{pmatrix} \quad \text{if } n_1 \text{ is fixed but } n_2 \to \infty, \qquad (42)$$

and with

$$\Sigma(\theta_0) = \begin{pmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & \mu_0^2/6 & \mu_0^2\,\dfrac{n_2+1}{8n_2} \\ 0 & \mu_0^2\,\dfrac{n_2+1}{8n_2} & \mu_0^2\,\dfrac{(n_2+1)(2n_2+1)}{12n_2^2} \end{pmatrix} \quad \text{if } n_2 \text{ is fixed but } n_1 \to \infty. \qquad (43)$$
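As a numerical sanity check (with illustrative true values chosen by us), one can compute $(1/|n|)\,D_n^T\sum_{t\le n}[f'(x_t;\theta_0)][f'(x_t;\theta_0)]^T D_n$ directly from (38) for a large grid and compare it with $\Sigma(\theta_0)$ in (41):

```python
import numpy as np

# Compare the normalized information matrix with Sigma(theta_0) in (41).
mu0, lam1, lam2 = 2.0, 1.1, 0.7                     # illustrative true values
n1, n2 = 500, 500
t1, t2 = np.meshgrid(np.arange(1, n1 + 1), np.arange(1, n2 + 1), indexing="ij")
phase = t1 * lam1 + t2 * lam2
# Rows of fp are the three components of f'(x_t; theta_0) from (38).
fp = np.stack([np.cos(phase),
               -mu0 * t1 * np.sin(phase),
               -mu0 * t2 * np.sin(phase)]).reshape(3, -1)
D = np.diag([1.0, 1.0 / n1, 1.0 / n2])
S = D.T @ (fp @ fp.T) @ D / (n1 * n2)
Sigma41 = np.array([[0.5, 0.0, 0.0],
                    [0.0, mu0**2 / 6, mu0**2 / 8],
                    [0.0, mu0**2 / 8, mu0**2 / 6]])
print(np.round(S, 3))        # close to Sigma41 for large n1, n2
print(np.round(Sigma41, 3))
```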
It is easy to see that Assumption 5 parts (ii), (iii), (iv) and (v) are satisfied, by substituting

$$f''(x_t; \theta) = \begin{pmatrix} 0 & -t_1\sin(t_1\lambda_1 + t_2\lambda_2) & -t_2\sin(t_1\lambda_1 + t_2\lambda_2) \\ -t_1\sin(t_1\lambda_1 + t_2\lambda_2) & -\mu t_1^2\cos(t_1\lambda_1 + t_2\lambda_2) & -\mu t_1 t_2\cos(t_1\lambda_1 + t_2\lambda_2) \\ -t_2\sin(t_1\lambda_1 + t_2\lambda_2) & -\mu t_1 t_2\cos(t_1\lambda_1 + t_2\lambda_2) & -\mu t_2^2\cos(t_1\lambda_1 + t_2\lambda_2) \end{pmatrix}.$$

Thus, from the remark following the proof of Theorem 3.1, we conclude that

$$\sqrt{|n|}\left( (\hat\mu - \mu_0),\ n_1(\hat\lambda_1 - \lambda_{10}),\ n_2(\hat\lambda_2 - \lambda_{20}) \right)^T \xrightarrow{L} N_3(0, \sigma^2\Sigma^{-1}(\theta_0))$$

as either $\min(n_1, n_2) \to \infty$, or $n_2 \to \infty$ while $n_1$ is held fixed, or $n_1 \to \infty$ while $n_2$ is held fixed, where $\Sigma(\theta_0)$ is given by Eqs. (41)–(43) for the appropriate subsequences.
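The scaling above can be checked by simulation. The sketch below (our illustrative setup: Gaussian errors and a local optimizer started near the truth, to avoid the spurious local minima of $Q_n$) verifies that the empirical variance of $\sqrt{|n|}\,n_1(\hat\lambda_1 - \lambda_{10})$ is close to $\sigma^2[\Sigma^{-1}(\theta_0)]_{22}$ with $\Sigma(\theta_0)$ from (41).

```python
import numpy as np
from scipy.optimize import least_squares

# Monte Carlo check of the asymptotic variance of sqrt(|n|) * n1 * (lam1_hat - lam10).
rng = np.random.default_rng(2)
mu0, lam10, lam20, sigma = 2.0, 1.1, 0.7, 0.5
n1, n2, reps = 60, 60, 300
t1, t2 = np.meshgrid(np.arange(1, n1 + 1), np.arange(1, n2 + 1), indexing="ij")

def residuals(theta, y):
    mu, l1, l2 = theta
    return (y - mu * np.cos(l1 * t1 + l2 * t2)).ravel()

stats = []
for _ in range(reps):
    y = mu0 * np.cos(lam10 * t1 + lam20 * t2) + rng.normal(0.0, sigma, (n1, n2))
    est = least_squares(residuals, x0=(mu0, lam10, lam20), args=(y,)).x
    stats.append(np.sqrt(n1 * n2) * n1 * (est[1] - lam10))
Sigma = np.array([[0.5, 0.0, 0.0],
                  [0.0, mu0**2 / 6, mu0**2 / 8],
                  [0.0, mu0**2 / 8, mu0**2 / 6]])
theory = sigma**2 * np.linalg.inv(Sigma)[1, 1]
print("empirical variance %.3f vs theoretical %.3f" % (np.var(stats), theory))
```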
Acknowledgements
The authors are thankful to the referee and Professor Hira Koul for their careful reading and many valuable suggestions, which led to improvements in the paper. We are also grateful to Professor Jim Kuelbs for his helpful comments and suggestions.
References
Francos, J.M., Meiri, A.Z., Porat, B., 1993. A unified texture model based on a 2-D Wold-like decomposition. IEEE Trans. Signal Process. 41, 2665–2678.
Gallant, A.R., 1987. Nonlinear Statistical Models. Wiley, New York.
Jennrich, R.I., 1969. Asymptotic properties of non-linear least squares estimators. Ann. Math. Statist. 40, 633–643.
Kundu, D., 1993. Asymptotic theory of least squares estimator of a particular nonlinear regression model. Statist. Probab. Lett. 18, 13–17.
Mandrekar, V., Zhang, H., 1996. On parameter estimation in a texture model, submitted for publication.
McClellan, J.H., 1982. Multidimensional spectral estimation. Proc. IEEE 70, 1029–1039.
Mikosch, T., Norvaisa, R., 1987. Strong laws of large numbers for fields of Banach space valued random variables. Probab. Theory Rel. Fields 74, 241–253.
Rao, C.R., Zhao, L., Zhou, B., 1994. Maximum likelihood estimation of 2-D superimposed exponential signals. IEEE Trans. Signal Process. 42, 1795–1802.
Sen, P. K., Singer, J.M., 1993. Large Sample Methods in Statistics: An Introduction with Applications. Chapman & Hall, London.
Wu, C.F., 1981. Asymptotic theory of non-linear least squares estimation. Ann. Statist. 9, 501–513.
Yuan, T., Subba Rao, T., 1993. Spectrum estimation for random fields with applications to Markov modelling and texture classification. In: Chellappa, R., Jain, A.K. (Eds.), Markov Random Fields: Theory and Applications. Academic Press, New York.