Gallant, A.R. (1985). Nonlinear Statistical Models. Ch. 4: Univariate Nonlinear Regression: Asymptotic Theory.

Nonlinear Statistical Models
by
A. Ronald Gallant
Chapter 4.
Univariate Nonlinear Regression:
Asymptotic Theory
Institute of Statistics Mimeograph Series No. Ib74
October 1985
NORTH CAROLINA STATE UNIVERSITY
Raleigh, North Carolina
NONLINEAR STATISTICAL MODELS
by
A. Ronald Gallant

CHAPTER 4.
Univariate Nonlinear Regression: Asymptotic Theory

Copyright © 1983 by A. Ronald Gallant. All rights reserved.
NOT FOR GENERAL DISTRIBUTION

This copy is reproduced by special permission of the author and publisher from:

A. Ronald Gallant. Nonlinear Statistical Models. New York: John Wiley and Sons, Inc., forthcoming.

and is being circulated privately for discussion purposes only. Reproduction from this copy is expressly prohibited.
Please send comments and report errors to the following address:
A. Ronald Gallant
Institute of Statistics
North Carolina State University
Post Office Box 8203
Raleigh, NC 27695-8203
Phone 1-919-737-2531
NONLINEAR STATISTICAL MODELS

Table of Contents
                                                             Anticipated
                                                             Completion Date

1. Univariate Nonlinear Regression                           Completed
   1.0 Preface
   1.1 Introduction
   1.2 Taylor's Theorem and Matters of Notation
   1.3 Statistical Properties of Least Squares Estimators
   1.4 Methods of Computing Least Squares Estimators
   1.5 Hypothesis Testing
   1.6 Confidence Intervals
   1.7 References
   1.8 Index

2. Univariate Nonlinear Regression: Special Situations       December 1985

3. A Unified Asymptotic Theory of Nonlinear
   Statistical Models                                        Completed
   3.0 Preface
   3.1 Introduction
   3.2 The Data Generating Model and Limits of Cesaro Sums
   3.3 Least Mean Distance Estimators
   3.4 Method of Moments Estimators
   3.5 Tests of Hypotheses
   3.6 Alternative Representations of a Hypothesis
   3.7 Independently and Identically Distributed Regressors
   3.8 Constrained Estimation
   3.9 References
   3.10 Index

4. Univariate Nonlinear Regression: Asymptotic Theory        Completed
   4.0 Preface
   4.1 Introduction
   4.2 Regularity Conditions
   4.3 Characterizations of Least Squares Estimators and Test Statistics
   4.4 References
   4.5 Index

5. Multivariate Linear Models: Review                        Completed

6. Multivariate Nonlinear Models                             Completed
   6.0 Preface
   6.1 Introduction
   6.2 Least Squares Estimators and Matters of Notation
   6.3 Asymptotic Theory
   6.4 Hypothesis Testing
   6.5 Confidence Intervals
   6.6 Maximum Likelihood Estimators
   6.7 An Illustration of the Bias in Inference Caused by Misspecification
   6.8 References
   6.9 Index

7. Linear Simultaneous Equations Models: Review              Deletion likely

8. Nonlinear Simultaneous Equations Models                   December 1985
   8.0 Preface
   8.1 Introduction
   8.2 Three-Stage Least Squares
   8.3 The Dynamic Case: Generalized Method of Moments
   8.4 Hypothesis Testing
   8.5 Maximum Likelihood Estimation
   8.6 Choice of Instruments
   8.7 References
   8.8 Index

9. Dynamic Nonlinear Models                                  Completed
   9.0 Preface
   9.1 Introduction
   9.2 A Uniform Strong Law and a Central Limit Theorem for
       Dependent, Nonstationary Random Variables
   9.3 Data Generating Process
   9.4 Least Mean Distance Estimators
   9.5 Method of Moments Estimators
   9.6 Hypothesis Testing
   9.7 References
   9.8 Index
4-0-1
CHAPTER 4.
Univariate Nonlinear Regression:
Asymptotic Theory
In this chapter, the results of the previous chapter are specialized to the case of a correctly specified univariate nonlinear regression model estimated by least squares. Specialization is simply a matter of restating Assumptions 1 through 7 of Chapter 3 in context. This done, the asymptotic theory follows immediately. The characterizations used in Chapter 1 are established using probability bounds that follow from the asymptotic theory.
4-1-1
1. INTRODUCTION

Let us review some notation. The univariate nonlinear model is written as

    y_t = f(x_t, θ⁰) + e_t,    t = 1, 2, ..., n

with θ⁰ known to lie in some compact set Θ*. The functional form of f(x, θ) is known, x is k-dimensional, θ is p-dimensional, and the model is assumed to be correctly specified. Following the conventions of Chapter 1, the model can be written in a vector notation as

    y = f(θ⁰) + e

with the Jacobian of f(θ) written as F(θ) = (∂/∂θ′) f(θ). The parameter θ⁰ is estimated by the θ̂ that minimizes

    s_n(θ) = (1/n) ‖y − f(θ)‖².

We are interested in testing the hypothesis

    H: h(θ⁰) = 0    against    A: h(θ⁰) ≠ 0

which we assume can be given the equivalent representation

    H: θ⁰ = g(ρ⁰) for some ρ⁰    against    A: θ⁰ ≠ g(ρ) for any ρ

where h: ℝᵖ → ℝ^q, g: ℝʳ → ℝᵖ, and p = r + q. The correspondence with the notation of Chapter 3 is as follows.
4-1-2

NOTATION 1

General (Chapter 3):   y = Y(e, x, γ)
Specific (Chapter 4):  y = f(x, θ) + e

General:   e_t = q(y_t, x_t, γ_n^0)
Specific:  e_t = y_t − f(x_t, θ_n^0)

General:   γ ∈ Γ*;  λ ∈ Λ*
Specific:  θ ∈ Θ* in both instances

General:   s_n(λ) = (1/n) Σ_{t=1}^n s(y_t, x_t, τ̂_n, λ)
Specific:  s_n(θ) = (1/n) Σ_{t=1}^n [y_t − f(x_t, θ)]²

General:   s_n^0(λ) = (1/n) Σ_{t=1}^n ∫ s[Y(e, x_t, γ_t^0), x_t, τ_n^0, λ] dP(e)
Specific:  s_n^0(θ) = σ² + (1/n) Σ_{t=1}^n [f(x_t, θ_n^0) − f(x_t, θ)]²

General:   s*(λ) = ∫∫ s[Y(e, x, γ*), x, τ*, λ] dP(e) dμ(x)
Specific:  s*(θ) = σ² + ∫ [f(x, θ*) − f(x, θ)]² dμ(x)

General:   λ̂_n minimizes s_n(λ)
Specific:  θ̂_n minimizes s_n(θ)

General:   λ̃_n minimizes s_n(λ) subject to h(λ) = 0
Specific:  θ̃_n minimizes s_n(θ) subject to h(θ) = 0

General:   λ_n^0 minimizes s_n^0(λ)
Specific:  θ_n^0 minimizes s_n^0(θ)

General:   λ_n* minimizes s_n^0(λ) subject to h(λ) = 0
Specific:  θ_n* = g(ρ_n^0) minimizes s_n^0(θ) subject to h(θ) = 0

General:   λ* minimizes s*(λ)
Specific:  θ* minimizes s*(θ)
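As a concrete companion to Notation 1, the sketch below computes the least squares estimator θ̂_n by minimizing s_n(θ) = (1/n)‖y − f(θ)‖² with a plain Gauss-Newton iteration. The exponential response f(x, θ) = θ₁ exp(θ₂ x), the sample size, the error scale, and the starting value are all invented for illustration; this is not an example from the text.

```python
import numpy as np

def f(x, theta):
    # hypothetical response function f(x, theta) = theta_1 * exp(theta_2 * x)
    return theta[0] * np.exp(theta[1] * x)

def F(x, theta):
    # Jacobian F(theta) = (d/d theta') f(theta); one row per observation
    return np.column_stack([np.exp(theta[1] * x),
                            theta[0] * x * np.exp(theta[1] * x)])

def gauss_newton(x, y, theta, steps=50):
    # minimize s_n(theta) = (1/n) ||y - f(theta)||^2
    for _ in range(steps):
        J = F(x, theta)
        r = y - f(x, theta)
        theta = theta + np.linalg.solve(J.T @ J, J.T @ r)  # one GN step
    return theta

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 2.0, n)
theta0 = np.array([2.0, -0.5])                  # plays the role of theta^0
y = f(x, theta0) + rng.normal(0.0, 0.1, n)      # y = f(theta^0) + e

theta_hat = gauss_newton(x, y, np.array([1.0, -1.0]))
s_n_hat = np.mean((y - f(x, theta_hat)) ** 2)   # s_n(theta_hat)
print(theta_hat, s_n_hat)
```

With these invented constants the estimate lands close to θ⁰ = (2, −0.5) and s_n(θ̂_n) close to σ² = 0.01, as the consistency results of Section 2 lead one to expect.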
4-2-1
2. REGULARITY CONDITIONS

Application of the general theory to a correctly specified univariate nonlinear regression is just a matter of restating Assumptions 1 through 7 of Chapter 3 in terms of the notation above. As the data are presumed to be generated according to

    y_t = f(x_t, θ_n^0) + e_t,    t = 1, 2, ..., n

Assumptions 1 through 5 of Chapter 3 read as follows.
ASSUMPTION 1'. The errors are independently and identically distributed with common distribution P(e). □

ASSUMPTION 2'. f(x, θ) is continuous on 𝒳 × Θ*, and Θ* is compact. □

ASSUMPTION 3'. (Gallant and Holly, 1980) Almost every realization of {v_t} with v_t = (e_t, x_t) is a Cesaro sum generator with respect to the product measure

    ν(A) = ∫_𝒳 ∫_ℰ I_A(e, x) dP(e) dμ(x)

and dominating function b(e, x). The sequence {x_t} is a Cesaro sum generator with respect to μ and b(x) = ∫_ℰ b(e, x) dP(e). For each x ∈ 𝒳 there is a neighborhood N_x such that ∫_ℰ sup_{N_x} b(e, x) dP(e) < ∞. □

ASSUMPTION 4'. (Identification) The parameter θ⁰ is indexed by n, and the sequence {θ_n^0} converges to θ*. Moreover,

    s*(θ) = σ² + ∫_𝒳 [f(x, θ*) − f(x, θ)]² dμ(x)

has a unique minimum over Θ* at θ*. □

ASSUMPTION 5'. Θ* is compact; [e + f(x, θ⁰) − f(x, θ)]² is dominated by b(e, x); b(e, x) is that of Assumption 3'. □
4-2-2
The sample objective function is

    s_n(θ) = (1/n) ‖y − f(θ)‖²

with expectation

    s_n^0(θ) = σ² + (1/n) Σ_{t=1}^n [f(x_t, θ_n^0) − f(x_t, θ)]².

By Lemma 1 of Chapter 3, both s_n(θ) and s_n^0(θ) have uniform, almost sure limit

    s*(θ) = σ² + ∫ [f(x, θ*) − f(x, θ)]² dμ(x).

Note that the true value θ_n^0 of the unknown parameter is also a minimizer of s_n^0(θ), so that our use of θ_n^0 to denote them both is not ambiguous. We may apply Theorem 3 of Chapter 3 and conclude that

    lim_{n→∞} θ̂_n = θ*    and    lim_{n→∞} θ̃_n = θ*

almost surely.
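The uniform strong law behind this conclusion can be watched numerically. The sketch below uses a hypothetical linear response f(x, θ) = θx with regressors drawn from μ = U(0, 1) and σ = 0.5, for which s*(θ) = σ² + (θ* − θ)² E x² = σ² + (θ* − θ)²/3 is available in closed form; every constant is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, theta_star = 0.5, 2.0
grid = np.linspace(0.0, 4.0, 81)          # a grid over the compact set Theta*

def s_n(n):
    # sample objective s_n(theta) = (1/n) sum [y_t - f(x_t, theta)]^2 on the grid
    x = rng.uniform(0.0, 1.0, n)          # regressors drawn from mu = U(0, 1)
    e = rng.normal(0.0, sigma, n)
    y = theta_star * x + e
    return np.array([np.mean((y - t * x) ** 2) for t in grid])

# s*(theta) = sigma^2 + (theta* - theta)^2 * E x^2, with E x^2 = 1/3 under U(0, 1)
s_star = sigma ** 2 + (theta_star - grid) ** 2 / 3.0

gap_small = np.max(np.abs(s_n(100) - s_star))
gap_large = np.max(np.abs(s_n(100000) - s_star))
print(gap_small, gap_large)               # the sup-norm gap shrinks as n grows
```

The printed sup-norm gaps illustrate the uniform, almost sure convergence of s_n(θ) to s*(θ) over the parameter grid.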
Assumption 6 of Chapter 3 may be restated as follows.
ASSUMPTION 6'. Θ* contains a closed ball Θ centered at θ* with finite, nonzero radius such that

    (∂/∂θ_i) s[Y(e, x, θ⁰), x, θ] = −2[e + f(x, θ⁰) − f(x, θ)] (∂/∂θ_i) f(x, θ)

    (∂²/∂θ_i ∂θ_j) s[Y(e, x, θ⁰), x, θ] = 2[(∂/∂θ_i) f(x, θ)][(∂/∂θ_j) f(x, θ)]
        − 2[e + f(x, θ⁰) − f(x, θ)] (∂²/∂θ_i ∂θ_j) f(x, θ)

    {(∂/∂θ_i) s[Y(e, x, θ⁰), x, θ]}{(∂/∂θ_j) s[Y(e, x, θ⁰), x, θ]}
        = 4[e + f(x, θ⁰) − f(x, θ)]² [(∂/∂θ_i) f(x, θ)][(∂/∂θ_j) f(x, θ)]

are continuous and dominated by b(e, x) on ℰ × 𝒳 × Θ* × Θ for i, j = 1, 2, ..., p. Moreover,

    𝒥* = 2 ∫ [(∂/∂θ) f(x, θ*)][(∂/∂θ′) f(x, θ*)] dμ(x)

is nonsingular. □
4-2-3
NOTATION 2

    Q = ∫ [(∂/∂θ) f(x, θ*)][(∂/∂θ) f(x, θ*)]′ dμ(x)

    Q_n^0 = (1/n) F′(θ_n^0) F(θ_n^0)

    Q_n* = (1/n) F′(θ_n*) F(θ_n*)

Direct computation according to Notations 2 and 3 of Chapter 3 yields (Problem 1)

    𝒥* = 2Q                𝓘* = 4σ²Q

    𝒥_n^0 = 2Q_n^0         𝓘_n^0 = 4σ²Q_n^0

    𝒥_n* = 2Q_n* − (2/n) Σ_{t=1}^n [f(x_t, θ_n^0) − f(x_t, θ_n*)] (∂²/∂θ∂θ′) f(x_t, θ_n*)

    𝓘_n* = 4σ²Q_n* + (4/n) Σ_{t=1}^n [f(x_t, θ_n^0) − f(x_t, θ_n*)]² [(∂/∂θ) f(x_t, θ_n*)][(∂/∂θ) f(x_t, θ_n*)]′
4-2-4
Noting that

    (∂/∂θ) s_n(θ_n^0) = (−2/n) F′(θ_n^0) e

we have from Theorem 4 of Chapter 3 that

    √n (∂/∂θ) s_n(θ_n^0)  converges in distribution to  N(0, 4σ²Q)

and from Theorem 5 that

    √n (θ̂_n − θ_n^0)  converges in distribution to  N(0, σ²Q⁻¹).
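The covariance matrix in the Theorem 5 conclusion is the Chapter 3 sandwich 𝒥*⁻¹𝓘*𝒥*⁻¹; with 𝒥* = 2Q and 𝓘* = 4σ²Q from the direct computations above, it collapses to the familiar least squares form:

```latex
\mathcal{J}^{*-1}\,\mathcal{I}^{*}\,\mathcal{J}^{*-1}
  = (2Q)^{-1}\,(4\sigma^{2}Q)\,(2Q)^{-1}
  = \sigma^{2}Q^{-1},
\qquad\text{so}\qquad
\sqrt{n}\,(\hat{\theta}_{n}-\theta_{n}^{0})
  \xrightarrow{\;\mathcal{L}\;} N\!\left(0,\ \sigma^{2}Q^{-1}\right).
```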
The Pitman drift assumption is restated as follows.

ASSUMPTION 7'. (Pitman drift) The sequence {θ_n^0} is chosen such that

    lim_{n→∞} √n (θ_n^0 − θ*) = Δ.

Moreover, h(θ*) = 0. □
Noting that

    (∂/∂θ) s_n^0(θ) = (−2/n) F′(θ) [f(θ_n^0) − f(θ)]

we have from Theorem 6 that

    lim_{n→∞} θ̃_n = θ*  almost surely,

    lim_{n→∞} θ_n* = θ*,

    lim_{n→∞} Q_n* = Q,

    (1/√n) F′(θ_n*) e  converges in distribution to  N(0, σ²Q),

    lim_{n→∞} (1/√n) F′(θ_n*) [f(θ_n^0) − f(θ_n*)] = QΔ.
Assumption 13 of Chapter 3 is restated as follows.

ASSUMPTION 13'. The function h(θ) is a once continuously differentiable mapping of Θ* into ℝ^q. Its Jacobian H(θ) = (∂/∂θ′) h(θ) has full rank (= q) at θ = θ*. □
4-2-5
PROBLEMS

1. Use the derivatives given in Assumption 6' to compute 𝒥(θ) and 𝓘(θ) as defined in Notations 2 and 3 of Chapter 3.
4-3-1
3. CHARACTERIZATIONS OF LEAST SQUARES ESTIMATORS AND TEST STATISTICS

The first of the characterizations appearing in Chapter 1 is

    θ̂_n = θ_n^0 + [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1/√n).

It is derived using the same sort of arguments as used in the proof of Theorem 5 of Chapter 3, so we shall be brief here; one can look at Theorem 5 for details. By Lemma 2 of Chapter 3 we may assume without loss of generality that θ̂_n and θ_n^0 are in Θ and that (∂/∂θ) s_n(θ̂_n) = o_s(1/√n), whence Q̂_n = Q + o_s(1) and 𝒥̂_n = 𝒥* + o_s(1). Recall that √n (∂/∂θ) s_n(θ_n^0) = O_p(1).
By Taylor's theorem

    √n (∂/∂θ) s_n(θ̂_n) = √n (∂/∂θ) s_n(θ_n^0) + 𝒥̄ √n (θ̂_n − θ_n^0)

where 𝒥̄ = 𝒥* + o_s(1). Then

    o_s(1) = √n (∂/∂θ) s_n(θ_n^0) + [𝒥* + o_s(1)] √n (θ̂_n − θ_n^0)

which can be rewritten as

    [𝒥* + o_s(1)] √n (θ̂_n − θ_n^0) = −√n (∂/∂θ) s_n(θ_n^0) + o_s(1).

Now √n (∂/∂θ) s_n(θ_n^0) converges in distribution to N(0, 4σ²Q), whence √n (θ̂_n − θ_n^0) = O_p(1) and

    √n (θ̂_n − θ_n^0) = −𝒥*⁻¹ √n (∂/∂θ) s_n(θ_n^0) + o_p(1).

There is an N such that for n > N the inverse of 𝒥_n^0 exists, whence for such n

4-3-2

    √n (θ̂_n − θ_n^0) = −(𝒥_n^0)⁻¹ √n (∂/∂θ) s_n(θ_n^0) + o_p(1).

Finally,

    −(𝒥_n^0)⁻¹ (∂/∂θ) s_n(θ_n^0) = [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e

which completes the argument.
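A quick Monte Carlo check of this characterization (hypothetical exponential model and constants, plain Gauss-Newton; none of this is data from the text): the deviation θ̂_n − θ_n^0 should match the linear-in-the-errors term [F′(θ_n^0)F(θ_n^0)]⁻¹F′(θ_n^0)e up to a higher-order remainder.

```python
import numpy as np

# Check numerically that theta_hat - theta0 ≈ [F'(theta0)F(theta0)]^{-1} F'(theta0) e
# for a hypothetical exponential model f(x, theta) = theta_1 * exp(theta_2 * x).

def f(x, th):
    return th[0] * np.exp(th[1] * x)

def F(x, th):
    return np.column_stack([np.exp(th[1] * x), th[0] * x * np.exp(th[1] * x)])

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0.0, 2.0, n)
theta0 = np.array([2.0, -0.5])
e = rng.normal(0.0, 0.1, n)
y = f(x, theta0) + e

# least squares by Gauss-Newton
th = theta0.copy()
for _ in range(50):
    J = F(x, th)
    th = th + np.linalg.solve(J.T @ J, J.T @ (y - f(x, th)))

J0 = F(x, theta0)
linear_approx = np.linalg.solve(J0.T @ J0, J0.T @ e)   # (F'F)^{-1} F' e
err = np.max(np.abs((th - theta0) - linear_approx))
print(th - theta0, linear_approx, err)                 # err is higher order
```

The remainder `err` is an order of magnitude smaller than the deviation itself, as the o_p(1/√n) term predicts.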
The next characterization that needs justification is

    n [s_n(θ_n^0) − s_n(θ̂_n)] = e′ P_F e + o_p(1).

The derivation is similar to the arguments used in the proof of Theorem 15 of Chapter 3; again we shall be brief, and one can look at the proof of Theorem 15 for details. By Taylor's theorem

    n [s_n(θ_n^0) − s_n(θ̂_n)]
      = n [(∂/∂θ) s_n(θ̂_n)]′ (θ̂_n − θ_n^0)
        + (n/2) (θ̂_n − θ_n^0)′ [(∂²/∂θ∂θ′) s_n(θ̄_n)] (θ̂_n − θ_n^0)
      = (n/2) (θ̂_n − θ_n^0)′ [𝒥* + o_s(1)] (θ̂_n − θ_n^0) + o_p(1)

since (∂/∂θ) s_n(θ̂_n) = o_s(1/√n) and √n (θ̂_n − θ_n^0) = O_p(1). From the preceding result we have

    √n (θ̂_n − θ_n^0) = √n [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1)

whence

    n [s_n(θ_n^0) − s_n(θ̂_n)]
      = (1/2) {√n [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e}′ [2Q_n^0 + o_p(1)] {√n [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e} + o_p(1).

This equation reduces to

    n [s_n(θ_n^0) − s_n(θ̂_n)] = e′ F(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1) = e′ P_F e + o_p(1)

which completes the argument.
4-3-3
Next we show that

    h(θ̂_n) = h(θ_n^0) + H(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1/√n).

A straightforward argument using Taylor's theorem yields

    √n h(θ̂_n) = √n h(θ_n^0) + H̄ √n (θ̂_n − θ_n^0)

where H̄ has rows (∂/∂θ′) h_i(θ̄_i) with θ̄_i = λ_i θ̂_n + (1 − λ_i) θ_n^0 for some λ_i with 0 ≤ λ_i ≤ 1, whence H̄ = H(θ_n^0) + o_s(1). Since √n (θ̂_n − θ_n^0) is bounded in probability we have

    √n h(θ̂_n) = √n h(θ_n^0) + H(θ_n^0) √n (θ̂_n − θ_n^0) + o_p(1)
               = √n h(θ_n^0) + √n H(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1)

which completes the argument.
We next show that

    1/s² = (n − p)/e′(I − P_F)e + o_p(1/n)

where

    P_F = F(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0).

Fix a realization of the errors {e_t} for which

    lim_{n→∞} e′(I − P_F)e/(n − p) = σ²    and    lim_{n→∞} s² = σ²;

almost every realization is such (Problem 2). Choose N so that if n > N then s² > 0 and e′(I − P_F)e > 0. Using the characterization of n [s_n(θ_n^0) − s_n(θ̂_n)] above and n s_n(θ_n^0) = e′e, we have

    (n − p) s² = n s_n(θ̂_n) = e′(I − P_F)e + o_s(1).

Applying Taylor's theorem to the function x ↦ 1/x,

    1/s² = (n − p)/e′(I − P_F)e − [(n − p)/e′(I − P_F)e]² [s² − e′(I − P_F)e/(n − p)] + o_s(1/n).

4-3-4

The term [(n − p)/e′(I − P_F)e]² is bounded for n > N because

    lim_{n→∞} (n − p)/e′(I − P_F)e = 1/σ².

One concludes that 1/s² = (n − p)/e′(I − P_F)e + o_p(1/n), which completes the argument.
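Numerically, s² and the quadratic form e′(I − P_F)e/(n − p) are indeed close; the following sketch checks this for a hypothetical exponential model (all constants invented for illustration).

```python
import numpy as np

# Check that s^2 = SSE/(n-p) is close to e'(I - P_F)e/(n - p),
# where P_F projects onto the columns of F(theta0).  Hypothetical model.

def f(x, th):
    return th[0] * np.exp(th[1] * x)

def F(x, th):
    return np.column_stack([np.exp(th[1] * x), th[0] * x * np.exp(th[1] * x)])

rng = np.random.default_rng(3)
n, p = 400, 2
x = rng.uniform(0.0, 2.0, n)
theta0 = np.array([2.0, -0.5])
e = rng.normal(0.0, 0.1, n)
y = f(x, theta0) + e

th = theta0.copy()
for _ in range(50):                        # Gauss-Newton least squares
    J = F(x, th)
    th = th + np.linalg.solve(J.T @ J, J.T @ (y - f(x, th)))

s2 = np.sum((y - f(x, th)) ** 2) / (n - p)  # s^2 = SSE_full/(n - p)

J0 = F(x, theta0)
P_F = J0 @ np.linalg.solve(J0.T @ J0, J0.T)         # projection matrix
quad = e @ (np.eye(n) - P_F) @ e / (n - p)          # e'(I - P_F)e/(n - p)
print(s2, quad, abs(s2 - quad))
```

The two quantities agree to several decimal places, while each is close to σ² = 0.01.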
The next task is to show that if the errors are normally distributed then

    W = Y + o_p(1)    where    Y ~ F′(q, n − p, λ).

Now

    √n h(θ̂_n) = √n h(θ_n^0) + √n H(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e + o_p(1)

and as notation write

    √n h(θ̂_n) = μ + U + o_p(1)

with μ = √n h(θ_n^0) and U = √n H(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0) e. Also

    {H(θ̂_n) [(1/n) F′(θ̂_n) F(θ̂_n)]⁻¹ H′(θ̂_n)}⁻¹
      = {H(θ_n^0) [(1/n) F′(θ_n^0) F(θ_n^0)]⁻¹ H′(θ_n^0)}⁻¹ + o_p(1)
      = A⁻¹ + o_p(1)

whence

    W = [μ + U + o_p(1)]′ [A⁻¹ + o_p(1)] [μ + U + o_p(1)] [(n − p)/e′(I − P_F)e + o_p(1)]/q
      = {(μ + U)′ A⁻¹ (μ + U)/q} / {e′(I − P_F)e/(n − p)} + o_p(1)
      = Y + o_p(1).

Assuming normal errors,

    U ~ N_q(0, σ²A)

4-3-5

which implies that (Appendix 1)

    (μ + U)′ A⁻¹ (μ + U)/σ² ~ χ′²(q, λ)    with    λ = μ′A⁻¹μ/(2σ²).

Since H(θ_n^0) [F′(θ_n^0) F(θ_n^0)]⁻¹ F′(θ_n^0)(I − P_F) = 0, U and (I − P_F)e are independently distributed, whence (μ + U)′A⁻¹(μ + U) and e′(I − P_F)e = e′(I − P_F)′(I − P_F)e are independently distributed. This implies that Y ~ F′(q, n − p, λ), which completes the argument.
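Under normal errors and a true restriction, the Wald statistic should therefore behave like a central F(q, n − p) variate. The Monte Carlo sketch below (hypothetical exponential model, invented constants; the 0.95 critical value of F(1, 98), about 3.94, is hard-coded) checks that the rejection rate is near the nominal 5%.

```python
import numpy as np

# Monte Carlo sketch: under H: h(theta) = theta_2 + 0.5 = 0 and normal errors,
# the Wald statistic W = h' {H (F'F)^{-1} H'}^{-1} h / (q s^2)
# should be approximately F(q, n-p) distributed.  All numbers hypothetical.

def f(x, th):
    return th[0] * np.exp(th[1] * x)

def F(x, th):
    return np.column_stack([np.exp(th[1] * x), th[0] * x * np.exp(th[1] * x)])

rng = np.random.default_rng(4)
n, p, q = 100, 2, 1
theta0 = np.array([2.0, -0.5])            # H: theta_2 = -0.5 holds
x = rng.uniform(0.0, 2.0, n)
H = np.array([[0.0, 1.0]])                # Jacobian of h(theta) = theta_2 + 0.5

reps, reject = 300, 0
for _ in range(reps):
    e = rng.normal(0.0, 0.1, n)
    y = f(x, theta0) + e
    th = theta0.copy()
    for _ in range(30):                   # Gauss-Newton least squares
        J = F(x, th)
        th = th + np.linalg.solve(J.T @ J, J.T @ (y - f(x, th)))
    s2 = np.sum((y - f(x, th)) ** 2) / (n - p)
    J = F(x, th)
    A = H @ np.linalg.solve(J.T @ J, H.T)     # H (F'F)^{-1} H'
    h = np.array([th[1] + 0.5])
    W = (h @ np.linalg.solve(A, h)) / (q * s2)
    if W > 3.94:                          # ~ 0.95 quantile of F(1, 98)
        reject += 1

rate = reject / reps
print(rate)                               # should be near the nominal 0.05
```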
Simply by rescaling s² in the foregoing we have that

    (1/n) SSE_full = e′(I − P_F)e/n + o_p(1/n)

    n/SSE_full = n/e′(I − P_F)e + o_p(1/n)

where recall that

    SSE_full = ‖y − f(θ̂_n)‖²

    SSE_reduced = ‖y − f(θ̃_n)‖² = ‖y − f[g(ρ̂_n)]‖².

The claim that

    SSE_reduced = (e + δ)′(I − P_FG)(e + δ) + o_p(1)

with

    δ = f(θ_n^0) − f(θ_n*) = f(θ_n^0) − f[g(ρ_n^0)]

    I − P_FG = I − F(θ_n^0) G(ρ_n^0) [G′(ρ_n^0) F′(θ_n^0) F(θ_n^0) G(ρ_n^0)]⁻¹ G′(ρ_n^0) F′(θ_n^0)

comes fairly close to being a restatement of a few lines of the proof of
4-3-6
Theorem 13 of Chapter 3. In that proof we find the equations

    √n (θ̃_n − θ_n*) = O_p(1)

    √n (θ̃_n − θ_n*) = 𝒥̄⁻¹ √n (∂/∂θ) s_n(θ̃_n) − 𝒥̄⁻¹ √n (∂/∂θ) s_n(θ_n*) + o_s(1)

which, using arguments that have become repetitive at this point, can be rewritten as

    √n (θ̃_n − θ_n*) = 𝒥⁻¹ √n (∂/∂θ) s_n(θ̃_n) − 𝒥⁻¹ √n (∂/∂θ) s_n(θ_n*) + o_p(1)

with 𝒥 = 𝒥* and H = H(θ*). Using the conclusion of Theorem 13 of Chapter 3 one can substitute for √n (∂/∂θ) s_n(θ̃_n) to obtain

    √n (θ̃_n − θ_n*) = −[𝒥⁻¹ − 𝒥⁻¹ H′(H 𝒥⁻¹ H′)⁻¹ H 𝒥⁻¹] √n (∂/∂θ) s_n(θ_n*) + o_p(1).

Then using Taylor's theorem

    n [s_n(θ̃_n) − s_n(θ_n*)]
      = n [(∂/∂θ) s_n(θ̃_n)]′ (θ̃_n − θ_n*)
        − (n/2) (θ̃_n − θ_n*)′ [𝒥* + o_s(1)] (θ̃_n − θ_n*) + o_p(1).

4-3-7

Using the identity obtained in Section 6 of Chapter 3 we have

    n s_n(θ̃_n) = n s_n(θ_n*) − (n/2) [(∂/∂θ) s_n(θ_n*)]′ G(G′ 𝒥 G)⁻¹ G′ [(∂/∂θ) s_n(θ_n*)] + o_p(1).

Using Taylor's theorem, the Uniform Strong Law, and the Pitman drift assumption we have

    (∂/∂θ) s_n(θ_n*) = (−2/n) F′(θ_n^0)(e + δ) + o_p(1/√n).

Substitution and algebraic reduction yields (Problem 3)

    n s_n(θ̃_n) = (e + δ)′(e + δ) − (e + δ)′ P_FG (e + δ) + o_p(1)

which proves the claim.
The following are the characterizations used in Chapter 1 that have not yet been verified:

4-3-8

    D′(F′F)⁻¹ D/n = (e + δ)′(P_F − P_FG)(e + δ)/n + o_p(1/n)

    [(SSE_reduced − SSE_full)/q] / [SSE_full/(n − p)]
      = [(e + δ)′(P_F − P_FG)(e + δ)/q] / [e′(I − P_F)e/(n − p)] + o_p(1)

    n D′(F′F)⁻¹ D / SSE(θ̃_n)
      = n (e + δ)′(P_F − P_FG)(e + δ) / (e + δ)′(I − P_FG)(e + δ) + o_p(1)

Except for the first, these are obvious at sight. Let us sketch the verification of the first characterization. With D = F′(θ̃_n)[y − f(θ̃_n)] we have

    D′(F′F)⁻¹ D
      = (n/4) [(∂/∂θ) s_n(θ̃_n)]′ [(1/n) F′(θ̃_n) F(θ̃_n)]⁻¹ [(∂/∂θ) s_n(θ̃_n)]
      = (n/2) [(∂/∂θ) s_n(θ̃_n)]′ [𝒥 + o_s(1)]⁻¹ [(∂/∂θ) s_n(θ̃_n)]
      = (n/2) [(∂/∂θ) s_n(θ_n*)]′ 𝒥⁻¹ H′(H 𝒥⁻¹ H′)⁻¹ H 𝒥⁻¹ [(∂/∂θ) s_n(θ_n*)] + o_p(1)
      = (n/2) [(∂/∂θ) s_n(θ_n*)]′ [𝒥⁻¹ − G(G′ 𝒥 G)⁻¹ G′] [(∂/∂θ) s_n(θ_n*)] + o_p(1)
      = (1/n) (e + δ)′ F(θ_n^0) [(Q_n^0)⁻¹ − G(G′ Q_n^0 G)⁻¹ G′] F′(θ_n^0)(e + δ) + o_p(1)
      = (e + δ)′(P_F − P_FG)(e + δ) + o_p(1).
4-3-9
PROBLEMS

1. Give a detailed derivation of the characterizations listed in the preceding paragraph.

2. Cite the theorem which permits one to claim that lim_{n→∞} e′(I − P_F)e/(n − p) = σ² almost surely, and prove directly that lim_{n→∞} s² = σ² almost surely.

3. Show in detail that

    (∂/∂θ) s_n(θ_n*) = (−2/n) F′(θ_n^0)(e + δ) + o_p(1/√n)

suffices to reduce (n/2) [(∂/∂θ) s_n(θ_n*)]′ G(G′ 𝒥 G)⁻¹ G′ [(∂/∂θ) s_n(θ_n*)] to (e + δ)′ P_FG (e + δ) + o_p(1).
4-4-1
4. REFERENCES

Gallant, A. Ronald and Alberto Holly (1980), "Statistical Inference in an Implicit, Nonlinear, Simultaneous Equation Model in the Context of Maximum Likelihood Estimation," Econometrica 48, 697-720.
4-5-1
5. INDEX TO CHAPTER 4

Assumption 1', 4-2-1
Assumption 2', 4-2-1
Assumption 3', 4-2-1
Assumption 4', 4-2-1
Assumption 5', 4-2-1
Assumption 6', 4-2-2
Assumption 7', 4-2-4
Assumption 13', 4-2-4

Characterizations
    estimator of variance, 4-3-2
    Lagrange multiplier test statistic, 4-3-8
    least squares estimator, 4-3-1
    likelihood ratio test statistic, 4-3-6
    nonlinear function of the least squares estimator, 4-3-3
    reciprocal of the estimator of variance, 4-3-3
    restricted estimator of variance, 4-3-5
    Wald test statistic, 4-3-4

Least squares estimators for univariate models
    asymptotic normality, 4-2-4
    characterization as a linear function of the errors, 4-3-1
    consistency, 4-2-2
    data generating model, 4-2-1
    defined, 4-1-1
    identification, 4-2-1
    Pitman drift, 4-2-4
    sample objective function, 4-1-1, 4-2-2
    score function, 4-2-4

Notation 1, 4-1-2
Notation 2, 4-2-3

Univariate nonlinear regression model
    defined, 4-1-1
    identification, 4-2-1
    Pitman drift, 4-2-4