
ON THE ASYMPTOTIC POWER OF THE
LACK OF FIT TEST IN NONLINEAR REGRESSION
by
A. Ronald Gallant
Institute of Statistics
Mimeo Series No. 1083
Raleigh - August 1976
This technical note serves as an appendix to [4]. Ambiguities as to objectives, definitions, notation, etc. may be resolved by reference to [4].

Consider testing

$$H\colon\quad y_t = g(x_t, \gamma) + e_t$$

against

$$A^\#\colon\quad y_t = g(x_t, \gamma) + z_t'\delta + e_t$$

using the likelihood ratio test, assuming normal errors with variance unknown. The true model is, however,

$$A\colon\quad y_t = g(x_t, \gamma^*) + \tau^* h(x_t, \omega^*) + e_t .$$

An asymptotic approximation of the power of the test against $A$, not $A^\#$, is required.

In the above expressions, $y_t$ is univariate, $x_t$ is $k$-dimensional, $z_t$ is $w$-dimensional, and the index $t = 1, 2, \ldots, n$. The parameter $\gamma$ is $u$-dimensional, $\tau$ is univariate, $\omega$ is $v$-dimensional, and $\delta$ is $w$-dimensional. The asterisk is used in connection with parameters ($\gamma^*$, $\tau^*$, and $\omega^*$) to emphasize that it is the true but unknown value which is meant; the omission of the asterisk does not necessarily denote the contrary. The parameter $\gamma$ is contained in a compact set $\Gamma$.
For convenience, $\delta$ is constrained to lie in a compact set $\Delta$, but this assumption may be eliminated if desired; see [1]. The test statistic, itself, is

$$T^\# = \hat\sigma^2 / (\hat\sigma^2)^\#$$

where: $\hat\gamma$ minimizes $\sum_{t=1}^n [y_t - g(x_t, \gamma)]^2$ and $\hat\sigma^2 = (1/n) \sum_{t=1}^n [y_t - g(x_t, \hat\gamma)]^2$; $(\gamma^\#, \delta^\#)$ minimizes $\sum_{t=1}^n [y_t - g(x_t, \gamma) - z_t'\delta]^2$ and $(\hat\sigma^2)^\# = (1/n) \sum_{t=1}^n [y_t - g(x_t, \gamma^\#) - z_t'\delta^\#]^2$. One rejects when $T^\#$ is larger than

$$c^* = 1 + w F_\alpha/(n - u - w)$$

where $F_\alpha$ denotes the upper $\alpha \cdot 100$ percentage point of an $F$ random variable with $w$ numerator degrees freedom and $n - u - w$ denominator degrees freedom.
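To fix ideas, the rejection rule just described is easy to carry out numerically. The sketch below is illustrative only: the mean function $g(x, \gamma) = \gamma_1 e^{\gamma_2 x}$, the regressors $z_t = (\sin x_t, x_t^2)'$, and the omitted term $h(x, \omega^*) = \cos 2x$ are hypothetical choices, not taken from this paper.

```python
# A minimal sketch of the lack of fit test T#; g, Z, and h below are
# illustrative assumptions, not specified by the paper.
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import f as f_dist

rng = np.random.default_rng(0)
n, u, w = 100, 2, 2                      # sample size, dim(gamma), dim(delta)
x = rng.uniform(0.0, 2.0, size=n)        # inputs x_t
Z = np.column_stack([np.sin(x), x**2])   # n by w matrix with rows z_t'

def g(gamma, x):
    return gamma[0] * np.exp(gamma[1] * x)

# Data from the true model A: y_t = g(x_t, gamma*) + tau* h(x_t, omega*) + e_t
tau_star, sigma = 0.3, 0.5
y = g([1.0, 0.5], x) + tau_star * np.cos(2.0 * x) + sigma * rng.standard_normal(n)

# Fit H: gamma_hat minimizes sum_t [y_t - g(x_t, gamma)]^2
fit_H = least_squares(lambda gam: y - g(gam, x), x0=np.array([1.0, 0.1]))
sig2_hat = np.sum(fit_H.fun**2) / n                  # sigma_hat^2

# Fit A#: (gamma#, delta#) minimizes sum_t [y_t - g(x_t, gamma) - z_t' delta]^2
fit_A = least_squares(lambda p: y - g(p[:u], x) - Z @ p[u:],
                      x0=np.array([1.0, 0.1, 0.0, 0.0]))
sig2_sharp = np.sum(fit_A.fun**2) / n                # (sigma_hat^2)#

T_sharp = sig2_hat / sig2_sharp
alpha = 0.05
c_star = 1.0 + w * f_dist.ppf(1.0 - alpha, w, n - u - w) / (n - u - w)
print(f"T# = {T_sharp:.4f}  c* = {c_star:.4f}  reject H: {T_sharp > c_star}")
```

With $\tau^* \neq 0$, the frequency with which this rule rejects is the power quantity approximated by the doubly non-central $F$ distribution derived below.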
Some notation which is used throughout is set forth here for convenient reference.

Notation:

$y = (y_1, y_2, \ldots, y_n)'$ $(n \times 1)$,
$e = (e_1, e_2, \ldots, e_n)'$ $(n \times 1)$,
$g(\gamma) = [g(x_1, \gamma), g(x_2, \gamma), \ldots, g(x_n, \gamma)]'$ $(n \times 1)$,
$h(\omega) = [h(x_1, \omega), h(x_2, \omega), \ldots, h(x_n, \omega)]'$ $(n \times 1)$,
$G(\gamma) =$ the $n$ by $u$ matrix with typical element $(\partial/\partial\gamma_j)\, g(x_t, \gamma)$, where $t$ is the row index,
$Z =$ the $n$ by $w$ matrix with rows $z_t'$,
$g = g(\gamma^*)$, $\quad h = h(\omega^*)$, $\quad G = G(\gamma^*)$,
$P_{GZ} = [G:Z]([G:Z]'[G:Z])^{-1}[G:Z]'$ $(n \times n)$, $\quad Q_{GZ} = I - P_{GZ}$ $(n \times n)$,
$P_G = G(G'G)^{-1}G'$ $(n \times n)$, $\quad Q_G = I - P_G$ $(n \times n)$.

The following regularity conditions govern throughout.

Assumptions: The sequence of inputs $\{(x_t, z_t)\}_{t=1}^n$ is chosen from $\mathcal{X} \times \mathcal{Z}$, where $\mathcal{X}$ and $\mathcal{Z}$ are compact, such that the measure $\mu_n$ defined on the Borel subsets of $\mathcal{X} \times \mathcal{Z}$ by $\mu_n(A) =$ the proportion of $(x_t, z_t)$ in $A$ for $t \le n$ converges weakly to a measure $\mu$ defined on the Borel subsets of $\mathcal{X} \times \mathcal{Z}$; see [2]. The sets $\Gamma$ and $\Delta$ are compact. The functions $g(x, \gamma)$, $(\partial/\partial\gamma_i)\, g(x, \gamma)$, and $(\partial^2/\partial\gamma_i\,\partial\gamma_j)\, g(x, \gamma)$ are continuous in $(x, \gamma)$ on $\mathcal{X} \times \Gamma$. The function $h(x, \omega^*)$ is continuous on $\mathcal{X}$. The true value $\gamma^*$ is contained in an open set which is, in turn, contained in $\Gamma$; $\Delta$ contains an open neighborhood of the zero vector. It is assumed that $g(x, \gamma) = g(x, \gamma^*)$ except on a set of $\mu$ measure zero implies $\gamma = \gamma^*$, and that $g(x, \gamma) + z'\delta = g(x, \gamma^*)$ a.e. $(\mu)$ implies $\gamma = \gamma^*$ and $\delta = 0$. The matrix $\lim_{n \to \infty} (1/n)[G:Z]'[G:Z]$ is non-singular. (The limit exists by Lemma 1 of [3].) The errors $\{e_t\}$ are independently and normally distributed, each with mean zero and unknown variance $\sigma^2$. As $n$ increases, $\tau^*$ tends to zero in such a manner that $\sqrt{n}\, \tau^*$ tends to a finite limit.

Lemma 1: The random variables $\hat\gamma$ and $\gamma^\#$ converge almost surely to $\gamma^*$; the random variable $\delta^\#$ converges almost surely to the zero vector. The random variables $\hat\sigma^2$ and $(\hat\sigma^2)^\#$ converge almost surely to $\sigma^2$.

Proof: Denote $\tau^*$ by $\tau_n^*$ to emphasize the assumed variation with $n$. Consider the sequence of random variables
$$Q_n(\gamma, \tau) = e'e/n + 2 e'[g + \tau h - g(\gamma)]/n + [g + \tau h - g(\gamma)]'[g + \tau h - g(\gamma)]/n .$$
Note that $\hat\gamma$ minimizes $Q_n(\gamma, \tau_n^*)$ for each realization of $e$. $Q_n(\gamma, \tau)$ converges almost surely to

$$\bar Q(\gamma, \tau) = \sigma^2 + \int [g(x, \gamma^*) + \tau h(x, \omega^*) - g(x, \gamma)]^2 \, d\mu(x, z)$$

uniformly in $(\gamma, \tau)$ over $\Gamma \times [-\bar\tau, \bar\tau]$ by the Strong Law of Large Numbers and Lemma 1 of [3], Parts 1 and 2. Consider a realization of the errors $\{e_t\}_{t=1}^\infty$ which does not belong to the exceptional set. Let $\{\hat\gamma_n\}$ be the sequence of points minimizing $Q_n(\gamma, \tau_n^*)$ corresponding to this realization. Since $\Gamma$ is compact, there is at least one limit point $\bar\gamma$ and at least one subsequence $\{\hat\gamma_{n_m}\}$ such that $\lim_m \hat\gamma_{n_m} = \bar\gamma$. As a direct consequence of the uniform convergence of the continuous functions $Q_n(\gamma, \tau)$ to $\bar Q(\gamma, \tau)$, we have
$$\bar Q(\bar\gamma, 0) = \lim_m Q_{n_m}(\hat\gamma_{n_m}, \tau_{n_m}^*) \le \lim_m Q_{n_m}(\gamma^*, \tau_{n_m}^*) = \bar Q(\gamma^*, 0) = \sigma^2 .$$

This implies

$$\int [g(x, \gamma^*) - g(x, \bar\gamma)]^2 \, d\mu(x, z) = 0$$

which implies $\bar\gamma = \gamma^*$ by assumption. Thus, the sequence $\{\hat\gamma_n\}$ has only one limit point $\gamma^*$; moreover, this implies

$$\lim_{n \to \infty} \hat\sigma_n^2 = \lim_{n \to \infty} Q_n(\hat\gamma_n, \tau_n^*) = \bar Q(\gamma^*, 0) = \sigma^2 .$$
Next, consider the sequence of random variables

$$Q_n(\gamma, \delta, \tau) = (1/n) \, \| e + g + \tau h - g(\gamma) - Z\delta \|^2 .$$

Note that $(\gamma^\#, \delta^\#)$ minimizes $Q_n(\gamma, \delta, \tau_n^*)$ for each realization of $e$. $Q_n(\gamma, \delta, \tau)$ converges almost surely to

$$\bar Q(\gamma, \delta, \tau) = \sigma^2 + \int [g(x, \gamma^*) + \tau h(x, \omega^*) - g(x, \gamma) - z'\delta]^2 \, d\mu(x, z)$$

uniformly in $(\gamma, \delta, \tau)$ over $\Gamma \times \Delta \times [-\bar\tau, \bar\tau]$ as above. The remainder of the proof is entirely analogous to the above, with $(\gamma^\#, \delta^\#)$ replacing $\hat\gamma$ throughout. $\square$
Lemma 2: The random vector $(1/\sqrt{n})[G:Z]'(e + \tau^* h)$ converges in distribution to a $u + w$-variate normal.

Proof: By Lemma 3.5 of [1], $(1/\sqrt{n})[G:Z]'e$ converges in distribution to a $u + w$-variate normal. By Part 1 of Lemma 1 of [3],

$$\lim_{n \to \infty} (1/\sqrt{n})[G:Z]'(\tau^* h) = \Bigl(\lim_{n \to \infty} \sqrt{n}\, \tau^*\Bigr) \Bigl[\lim_{n \to \infty} (1/n)[G:Z]'h\Bigr] . \quad \square$$
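Lemma 2 does not display the limiting covariance matrix, but for the error part of the vector it is the standard one, $\sigma^2 \lim_{n\to\infty} (1/n)[G:Z]'[G:Z]$; the short simulation below, with an arbitrary illustrative $[G:Z]$, checks the empirical covariance of $(1/\sqrt{n})[G:Z]'e$ against it.

```python
# A simulation consistent with Lemma 2 (error part of the vector): the
# covariance of (1/sqrt(n)) [G:Z]' e across replications should be close
# to sigma^2 (1/n) [G:Z]' [G:Z].  The columns of GZ are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, sigma, reps = 400, 0.5, 5000
t = np.linspace(0.0, 2.0, n)
GZ = np.column_stack([np.ones(n), t, np.sin(t)])     # stands in for [G:Z]

draws = np.empty((reps, GZ.shape[1]))
for r in range(reps):
    e = sigma * rng.standard_normal(n)               # errors e_t
    draws[r] = GZ.T @ e / np.sqrt(n)                 # (1/sqrt(n)) [G:Z]' e

print(np.cov(draws, rowvar=False))                   # empirical covariance
print(sigma**2 * GZ.T @ GZ / n)                      # sigma^2 (1/n)[G:Z]'[G:Z]
```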
Theorem 1. The random variable $(\hat\sigma^2)^\#$ may be characterized as

$$(\hat\sigma^2)^\# = (e + \tau^* h)' Q_{GZ} (e + \tau^* h)/n + a_n$$

where $n a_n$ converges in probability to zero. The random variable $\hat\sigma^2$ may be characterized as

$$\hat\sigma^2 = (e + \tau^* h)' Q_G (e + \tau^* h)/n + b_n$$

where $n b_n$ converges in probability to zero.
Proof: Recall that $\tau^*$ varies with $n$. By Lemma 1, $(\gamma^\#, \delta^\#)$ will almost surely be contained in an open subset of $\Gamma \times \Delta$ containing $(\gamma^*, 0)$, allowing the use of Taylor's expansions in the proof and causing $(\gamma^\#, \delta^\#)$ to eventually become a stationary point of $Q_n(\gamma, \delta, \tau_n^*)$. We will now obtain intermediate results, based on Taylor's expansions, which will be used later in the proof.

By Taylor's theorem,

$$g(\gamma^\#) = g + G(\gamma^\# - \gamma^*) + D(\gamma^\# - \gamma^*)$$

where $D$ is the $n$ by $u$ matrix with typical row $\tfrac{1}{2} (\gamma^\# - \gamma^*)' \nabla_\gamma^2 g(x_t, \bar\gamma)$ and $\bar\gamma$ is on the line segment joining $\gamma^\#$ to $\gamma^*$. Using Lemma 1 and Lemma 1 of [3], one can show that $(1/n) G'(\gamma^\#) D$, $(1/n) G'D$, $(1/n) Z'D$, and $(1/n) D'D$ converge almost surely to the zero matrix. Again by Taylor's theorem,

$$[G(\gamma^\#) - G]'(e + \tau^* h) = E(\gamma^\# - \gamma^*)$$

where $E$ is the $u$ by $u$ matrix with typical element

$$e_{ij} = \sum_{t=1}^n (\partial^2/\partial\gamma_i\,\partial\gamma_j)\, g(x_t, \bar\gamma) \, [e_t + \tau^* h(x_t, \omega^*)] .$$

Using Lemma 1 of [3] and the assumed convergence of $\tau^*$ to zero, one can show that $(1/n) E$ converges almost surely to the zero matrix.
We will obtain the probability order of $\gamma^\#$ and $\delta^\#$. As mentioned earlier, for almost every realization of the errors $\{e_t\}$, $(\gamma^\#, \delta^\#)$ is eventually a stationary point of $Q_n(\gamma, \delta, \tau_n^*)$, so that the random vector

$$(-\sqrt{n}/2) \, \nabla_{\gamma\delta} Q_n(\gamma^\#, \delta^\#, \tau_n^*) = (1/\sqrt{n}) [G(\gamma^\#) : Z]'(y - g(\gamma^\#) - Z\delta^\#)$$

converges almost surely to the zero vector. Substituting the expansions of the previous paragraph, we have that

$$(1/\sqrt{n})[G:Z]'(e + \tau^* h) = \bigl\{ (1/n)[G:Z]'[G:Z] + o(1) \bigr\} \begin{bmatrix} \sqrt{n}\,(\gamma^\# - \gamma^*) \\ \sqrt{n}\, \delta^\# \end{bmatrix} + o_p(1) ;$$

Lemma 1, Lemma 1 of [3], and the results above imply that the matrix in braces converges almost surely to a non-singular matrix. Lemma 2 implies that $(1/\sqrt{n})[G:Z]'(e + \tau^* h)$ converges in distribution to a $u + w$-variate normal. These facts allow the conclusion that

$$u_n = \sqrt{n} \left\{ ([G:Z]'[G:Z])^{-1}[G:Z]'(e + \tau^* h) - \begin{bmatrix} \gamma^\# - \gamma^* \\ \delta^\# \end{bmatrix} \right\}$$

converges in probability to zero and that $w_n = \sqrt{n}\,(\gamma^\# - \gamma^*)$ and $\sqrt{n}\, \delta^\#$ are bounded in probability.
The sum of squares may be written

$$\begin{aligned}
\| y - g(\gamma^\#) - Z\delta^\# \|^2
&= \| e + g + \tau^* h - g(\gamma^\#) - Z\delta^\# \|^2 \\
&= \| Q_{GZ}(e + \tau^* h) \|^2
 + \| P_{GZ}(e + \tau^* h) - G(\gamma^\# - \gamma^*) - Z\delta^\# - D(\gamma^\# - \gamma^*) \|^2 \\
&\qquad - 2 (e + \tau^* h)' Q_{GZ} D(\gamma^\# - \gamma^*) .
\end{aligned}$$
The cross product term may be written as

$$\begin{aligned}
(e + \tau^* h)' Q_{GZ} D(\gamma^\# - \gamma^*)
&= (1/n) \sum_{t=1}^n [e_t + \tau^* h(x_t, \omega^*)] \, \tfrac{1}{2} w_n' \nabla_\gamma^2 g(x_t, \bar\gamma) \, w_n \\
&\qquad - \bigl[(1/\sqrt{n})(e + \tau^* h)'[G:Z]\bigr] \bigl\{(1/n)[G:Z]'[G:Z]\bigr\}^{-1} \bigl\{(1/n)[G:Z]'D\bigr\} w_n .
\end{aligned}$$

Both terms converge almost surely to zero by Lemmas 1 and 2, Lemma 1 of [3], and our previous results. (Some care must be taken with the argument concerning the first term; see [1, pp. 14a-14b] for details.)
By the triangle inequality,

$$\begin{aligned}
\| P_{GZ}(e + \tau^* h) - G(\gamma^\# - \gamma^*) - Z\delta^\# - D(\gamma^\# - \gamma^*) \|
&\le \left\| [G:Z] \left\{ ([G:Z]'[G:Z])^{-1}[G:Z]'(e + \tau^* h) - \begin{bmatrix} \gamma^\# - \gamma^* \\ \delta^\# \end{bmatrix} \right\} \right\|
 + \| D(\gamma^\# - \gamma^*) \| \\
&= \bigl( u_n' \bigl\{ (1/n)[G:Z]'[G:Z] \bigr\} u_n \bigr)^{1/2}
 + \bigl( w_n' \bigl\{ (1/n) D'D \bigr\} w_n \bigr)^{1/2} .
\end{aligned}$$

The two terms on the right converge in probability to zero by our previous results. The proof for $\hat\sigma^2$ is analogous and, therefore, omitted. $\square$
Theorem 2. The statistic $T^\#$ may be characterized as $T^\# = X + c_n$ where

$$X = (e + \tau^* h)' Q_G (e + \tau^* h) \,\big/\, (e + \tau^* h)' Q_{GZ} (e + \tau^* h)$$

and $n c_n$ converges in probability to zero. The probability $P(X > c^*)$ is given by the doubly non-central $F$ distribution as defined in [5, p. 75] with: numerator degrees freedom $w$ and non-centrality parameter

$$\lambda_1 = (\tau^*)^2 \, h'(P_{GZ} - P_G) h / (2\sigma^2) ,$$

and denominator degrees freedom $n - u - w$ and non-centrality parameter

$$\lambda_2 = (\tau^*)^2 \, h' Q_{GZ} h / (2\sigma^2) .$$
Proof: The proof of Lemma 1 of [2] may be used almost word for word to prove that

$$1/(\hat\sigma^2)^\# = n \,\big/\, (e + \tau^* h)' Q_{GZ} (e + \tau^* h) + d_n$$

where $n d_n$ converges in probability to zero; the term $b_n$ is as defined in Theorem 1. Thus, $T^\# = X + c_n$ where

$$c_n = n b_n \,\big/\, (e + \tau^* h)' Q_{GZ} (e + \tau^* h) + \hat\sigma^2 d_n .$$

By Lemma 1 and Theorem 1, each term of $n c_n$ converges in probability to zero.

Set $z = (1/\sigma) e$, $y = (1/\sigma) \tau^* h$, and $R = P_{GZ} - P_G$. The random variables $z_1, z_2, \ldots, z_n$ are independent normal random variables, each with mean zero and variance one. Thus, the random variable $(z + y)' R (z + y)$ is a noncentral chi-squared with $w$ degrees freedom and noncentrality parameter $\lambda_1$. Similarly, $(z + y)' Q_{GZ} (z + y)$ is a noncentral chi-squared random variable with $n - u - w$ degrees freedom and noncentrality parameter $\lambda_2$. These two random variables are independent because $R \, Q_{GZ} = 0$ (see [5, pp. 79 ff.]). Now,

$$\begin{aligned}
P(X > c^*)
&= P\bigl[ (e + \tau^* h)' Q_G (e + \tau^* h) \big/ (e + \tau^* h)' Q_{GZ} (e + \tau^* h) > 1 + w F_\alpha/(n - u - w) \bigr] \\
&= P\bigl[ (z + y)'(Q_G - Q_{GZ})(z + y) \big/ (z + y)' Q_{GZ} (z + y) > w F_\alpha/(n - u - w) \bigr] \\
&= P\bigl\{ [(z + y)' R (z + y)/w] \big/ [(z + y)' Q_{GZ} (z + y)/(n - u - w)] > F_\alpha \bigr\} . \quad \square
\end{aligned}$$
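The doubly non-central $F$ probability above has no convenient closed form (SciPy, for instance, implements only the singly non-central $F$), but it can be approximated by simulating the two independent non-central chi-squared variables of the proof. A sketch under illustrative values of $w$, $n - u - w$, $\lambda_1$, and $\lambda_2$ follows; note that the `nc` parameter of `scipy.stats.ncx2` is the full sum of squared means, i.e. $2\lambda$ in the convention of [5] used here.

```python
# Monte Carlo approximation of the power P(X > c*) in Theorem 2 via the
# ratio of independent noncentral chi-squared variables from the proof.
# w, df2, lam1, lam2 below are illustrative values, not from the paper.
import numpy as np
from scipy.stats import ncx2, f as f_dist

def power_doubly_noncentral_F(w, df2, lam1, lam2, alpha=0.05,
                              reps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # scipy's noncentrality nc equals the sum of squared means, which is
    # 2*lambda in the convention lambda = (sum of squared means)/2.
    num = ncx2.rvs(w, 2.0 * lam1, size=reps, random_state=rng) / w
    den = ncx2.rvs(df2, 2.0 * lam2, size=reps, random_state=rng) / df2
    return float(np.mean(num / den > f_dist.ppf(1.0 - alpha, w, df2)))

# Example: w = 2, n - u - w = 96, lambda_1 = 3.0, lambda_2 = 0.4
print(power_doubly_noncentral_F(2, 96, 3.0, 0.4))
```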
Alternative expressions for the noncentrality parameters are useful in the context of an attempt to maximize $\lambda_1$ and minimize $\lambda_2$ via choice of $Z$:

$$\lambda_1 = (\tau^*)^2 \, h' Q_G Z (Z' Q_G Z)^{-1} Z' Q_G h / (2\sigma^2) ,$$
$$\lambda_2 = (\tau^*)^2 \, h' Q_G h / (2\sigma^2) - \lambda_1 .$$

These expressions may be verified as follows. The vectors

$$a = (G'G)^{-1} G'(h - Zb) , \qquad b = (Z' Q_G Z)^{-1} Z' Q_G h$$

solve the equations

$$\begin{bmatrix} G'G & G'Z \\ Z'G & Z'Z \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} G'h \\ Z'h \end{bmatrix}$$

as may be verified by substitution. Thus,

$$\begin{aligned}
h' P_{GZ} h
&= [\, h'G \;\; h'Z \,] \begin{bmatrix} G'G & G'Z \\ Z'G & Z'Z \end{bmatrix}^{-1} \begin{bmatrix} G'h \\ Z'h \end{bmatrix} \\
&= [\, h'G \;\; h'Z \,] \begin{bmatrix} (G'G)^{-1} G'(h - Zb) \\ (Z' Q_G Z)^{-1} Z' Q_G h \end{bmatrix} \\
&= h' P_G (h - Zb) + h' Z (Z' Q_G Z)^{-1} Z' Q_G h \\
&= h' P_G h - h' P_G Z (Z' Q_G Z)^{-1} Z' Q_G h + h' Z (Z' Q_G Z)^{-1} Z' Q_G h \\
&= h' P_G h + h' Q_G Z (Z' Q_G Z)^{-1} Z' Q_G h .
\end{aligned}$$

The formulas for $\lambda_1$ and $\lambda_2$ are obtained by substitution of this expression for $h' P_{GZ} h$ in the formulas given in Theorem 2.
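The derivation above is easily confirmed numerically. The sketch below checks the identity $h'P_{GZ}h = h'P_G h + h'Q_G Z (Z'Q_G Z)^{-1} Z'Q_G h$ for arbitrary illustrative $G$, $Z$, and $h$, and evaluates $\lambda_1$ and $\lambda_2$ both in the Theorem 2 form and in the alternative form.

```python
# Numerical check of h'P_GZ h = h'P_G h + h'Q_G Z (Z'Q_G Z)^{-1} Z'Q_G h,
# the identity behind the alternative noncentrality expressions.  G, Z, h,
# tau_star, and sigma2 are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(2)
n, u, w = 50, 3, 2
G = rng.standard_normal((n, u))
Z = rng.standard_normal((n, w))
h = rng.standard_normal(n)

def proj(M):
    """Orthogonal projection onto the column space of M."""
    return M @ np.linalg.solve(M.T @ M, M.T)

P_G, P_GZ = proj(G), proj(np.hstack([G, Z]))
Q_G, Q_GZ = np.eye(n) - P_G, np.eye(n) - P_GZ

extra = h @ Q_G @ Z @ np.linalg.solve(Z.T @ Q_G @ Z, Z.T @ Q_G @ h)
assert np.isclose(h @ P_GZ @ h, h @ P_G @ h + extra)

tau_star, sigma2 = 0.3, 0.25
lam1 = tau_star**2 * (h @ (P_GZ - P_G) @ h) / (2 * sigma2)    # Theorem 2 form
lam2 = tau_star**2 * (h @ Q_GZ @ h) / (2 * sigma2)
assert np.isclose(lam1, tau_star**2 * extra / (2 * sigma2))   # alternative form
assert np.isclose(lam2, tau_star**2 * (h @ Q_G @ h) / (2 * sigma2) - lam1)
print(lam1, lam2)
```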
As mentioned in [4], one may use the statistic $S^\#$ instead of $T^\#$ to test $H$ against $A$. For convenience, the definition of $S^\#$ is repeated here. Let

$$(s^2)^\# = \sum_{t=1}^n [y_t - g(x_t, \gamma^\#) - z_t' \delta^\#]^2 / (n - u - w) .$$

Evaluate the matrix $G(\gamma)$ at $\gamma = \gamma^\#$ and put

$$C^\# = \bigl( [G(\gamma^\#):Z]'[G(\gamma^\#):Z] \bigr)^{-1} .$$

Let $C_{22}^\#$ be the matrix formed by deleting the first $u$ rows and columns of $C^\#$; then

$$S^\# = (\delta^\#)'(C_{22}^\#)^{-1} \delta^\# \big/ [w (s^2)^\#] .$$

$H$ is rejected when $S^\#$ exceeds the upper $\alpha \cdot 100$ percentage point $F_\alpha$ of an $F$ random variable with $w$ numerator degrees freedom and $n - u - w$ denominator degrees freedom.
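A self-contained sketch of this rejection rule follows, reusing the hypothetical $g$, $Z$, and data-generating choices of the earlier $T^\#$ sketch; the two derivative columns of $G(\gamma^\#)$ are coded by hand for that particular $g$.

```python
# A sketch of the statistic S#, assuming the same hypothetical model
# g(x, gamma) = gamma_1 * exp(gamma_2 * x) as the earlier T# sketch.
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import f as f_dist

rng = np.random.default_rng(0)
n, u, w = 100, 2, 2
x = rng.uniform(0.0, 2.0, size=n)
Z = np.column_stack([np.sin(x), x**2])
y = 1.0 * np.exp(0.5 * x) + 0.3 * np.cos(2.0 * x) + 0.5 * rng.standard_normal(n)

def g(gamma, x):
    return gamma[0] * np.exp(gamma[1] * x)

# Fit A#: (gamma#, delta#) minimizes sum_t [y_t - g(x_t, gamma) - z_t' delta]^2
fit = least_squares(lambda p: y - g(p[:u], x) - Z @ p[u:],
                    x0=np.array([1.0, 0.1, 0.0, 0.0]))
gam, delta = fit.x[:u], fit.x[u:]

# G(gamma#): n by u matrix of derivatives of g with respect to gamma at gamma#
G = np.column_stack([np.exp(gam[1] * x),                 # dg/dgamma_1
                     gam[0] * x * np.exp(gam[1] * x)])   # dg/dgamma_2

GZ = np.hstack([G, Z])
C = np.linalg.inv(GZ.T @ GZ)                 # C# = ([G:Z]'[G:Z])^{-1}
C22 = C[u:, u:]                              # drop first u rows and columns
s2 = np.sum(fit.fun**2) / (n - u - w)        # (s^2)#
S = delta @ np.linalg.solve(C22, delta) / (w * s2)
print("S# =", S, " reject H:", S > f_dist.ppf(0.95, w, n - u - w))
```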
The tests based on $S^\#$ and $T^\#$ are asymptotically equivalent in the sense of the following theorem.

Theorem 3. The statistic $S^\#$ may be characterized as $S^\# = Y + d_n$ where

$$Y = \frac{(e + \tau^* h)'(P_{GZ} - P_G)(e + \tau^* h)/w}{(e + \tau^* h)' Q_{GZ} (e + \tau^* h)/(n - u - w)}$$

and $d_n$ converges in probability to zero. The probabilities $P(Y > F_\alpha)$ and $P(X > c^*)$ are equal, where $X$ is as in Theorem 2.
Proof: As an intermediate step in the proof of Theorem 1, a characterization of $\sqrt{n}\, [(\gamma^\# - \gamma^*)', (\delta^\#)']'$ was obtained. Rearranging terms, we have

$$\sqrt{n}\, \delta^\# = W_2 + \sqrt{n}\, z_n , \qquad W_2 = (C_{21} G' + C_{22} Z')(e + \tau^* h)/\sqrt{n} ,$$

where

$$C = \bigl\{ \lim_{n \to \infty} (1/n)[G:Z]'[G:Z] \bigr\}^{-1}$$

has been partitioned as

$$C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}$$

and $\sqrt{n}\, z_n$ converges in probability to zero. Also, recall that $\sqrt{n}\, \delta^\#$ is bounded in probability, as is $W_2$.
As an intermediate step in the proof of Theorem 2, a characterization of $(\hat\sigma^2)^\#$ was obtained. From this, it follows that

$$1/(s^2)^\# = (n - u - w) \,\big/\, (e + \tau^* h)' Q_{GZ} (e + \tau^* h) + r_n$$

where $n r_n$ converges in probability to zero.

The random variable $S^\#$ may be written

$$S^\# = (\sqrt{n}\, \delta^\#)' C_{22}^{-1} (\sqrt{n}\, \delta^\#) \big/ [w (s^2)^\#]
 + (\sqrt{n}\, \delta^\#)' \bigl[ (1/n)(C_{22}^\#)^{-1} - C_{22}^{-1} \bigr] (\sqrt{n}\, \delta^\#) \big/ [w (s^2)^\#] .$$

The second term converges in probability to zero because: $\sqrt{n}\, \delta^\#$ is bounded in probability, $(s^2)^\#$ converges almost surely to $\sigma^2$, and $[(1/n)(C_{22}^\#)^{-1} - C_{22}^{-1}]$ converges almost surely to the zero matrix by Part 3 of Lemma 1 of [3]. Denote this second term by $u_n$. Now

$$(\sqrt{n}\, \delta^\#)' C_{22}^{-1} (\sqrt{n}\, \delta^\#)
 = W_2' C_{22}^{-1} W_2 + 2 W_2' C_{22}^{-1} (\sqrt{n}\, z_n) + (\sqrt{n}\, z_n)' C_{22}^{-1} (\sqrt{n}\, z_n)$$

where the latter two terms converge in probability to zero. Denote these latter terms by $v_n$.
We have, now, that

$$\begin{aligned}
S^\# &= W_2' C_{22}^{-1} W_2 \big/ [w (s^2)^\#] + u_n + v_n \big/ [w (s^2)^\#] \\
&= W_2' C_{22}^{-1} W_2 \,\big/\, \bigl[ w (e + \tau^* h)' Q_{GZ} (e + \tau^* h)/(n - u - w) \bigr]
 + (r_n/w) \, W_2' C_{22}^{-1} W_2 + u_n + v_n \big/ [w (s^2)^\#] .
\end{aligned}$$

The remainder term $d_n$ of the theorem to be proved equals the last three terms of this expression and converges in probability to zero. It remains to show that $Y$ is the first term of this expression.
Now,

$$W_2' C_{22}^{-1} W_2 = (1/n)(e + \tau^* h)'[G:Z] \begin{bmatrix} C_{12} \\ C_{22} \end{bmatrix} C_{22}^{-1} [\, C_{21} \;\; C_{22} \,] [G:Z]'(e + \tau^* h)$$

and, using the fact that $(G'G)^{-1} = C_{11} - C_{12} C_{22}^{-1} C_{21}$, it follows that $W_2' C_{22}^{-1} W_2$ differs from $(e + \tau^* h)'(P_{GZ} - P_G)(e + \tau^* h)$ by a term which converges in probability to zero, so that $Y$ is indeed the first term, as required. $\square$
References

[1] Gallant, A. Ronald, "Inference for Nonlinear Models," Institute of Statistics Mimeograph Series, No. 875 (revised), Raleigh: North Carolina State University, 1975.

[2] _______, "The Power of the Likelihood Ratio Test of Location in Nonlinear Regression Models," Journal of the American Statistical Association, 70 (March 1975), 198-203.

[3] _______, "Testing a Subset of the Parameters of a Nonlinear Regression Model," Journal of the American Statistical Association, 70 (December 1975), 927-932.

[4] _______, "Testing a Nonlinear Regression Specification: A Nonregular Case," Institute of Statistics Mimeograph Series, No. 1017, Raleigh: North Carolina State University, 1975.

[5] Graybill, Franklin A., An Introduction to Linear Statistical Models, Vol. I, New York: McGraw-Hill Book Co., 1961.