ESTIMATION OF THE PARAMETERS IN AN IMPLICIT MODEL
BY MINIMIZING THE SUM OF ABSOLUTE VALUES OF ORDER P
by
Sonia Balet-Lawrence
Institute of Statistics
Mimeograph Series No. 1073
Raleigh - 1976
ESTIMATION OF THE PARAMETERS IN AN IMPLICIT MODEL BY
MINIMIZING THE SUM OF ABSOLUTE VALUES OF ORDER P

SONIA BALET-LAWRENCE

A thesis submitted to the Graduate Faculty of
North Carolina State University at Raleigh
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy

DEPARTMENT OF STATISTICS

RALEIGH

1975
APPROVED BY:
Co-Chairman of Advisory Committee
ABSTRACT

LAWRENCE, SONIA BALET. Estimation of the Parameters in an Implicit Model by Minimizing the Sum of Absolute Values of Order P. (Under the direction of BIBHUTI B. BHATTACHARYYA and A. RONALD GALLANT.)
Consider the structural relation given by

   Q(x_t, y_t, θ*) = ε_t ,   t = 1, 2, ..., n ,

where {(x_t, y_t)}_{t=1}^n is the observed sample, θ* is the unknown vector of parameters, and ε_t is a random error due to the combined effect of omitted variables. The statistical problem is to estimate θ*. If an analytic solution for the endogenous variable y_t (or a function of it) does not exist in terms of exogenous variables and parameters plus an error term, the well-known techniques for estimating θ* do not apply directly. An estimator θ̂_n that considers this formulation of the model is proposed. The regression case is considered as a special type of implicit model.

The estimator θ̂_n is defined as that value in Ω ⊂ R^k such that

   Σ_{t=1}^n |Q(x_t, y_t, θ̂_n)|^p = min_{θ∈Ω} Σ_{t=1}^n |Q(x_t, y_t, θ)|^p

for some fixed p ≥ 1. A computing algorithm is derived and the statistical properties of θ̂_n are investigated. In particular, it is shown that under certain conditions, for any p ≥ 1, θ̂_n → θ^0 with probability one as n → ∞, where θ^0 is the point in the interior of Ω at which the expectation E_θ* |Q(x, y, θ)|^p attains its unique minimum over Ω. If more than one value in Ω minimizes the above expectation, then for n sufficiently large θ̂_n remains in the neighborhood of any one of these points with probability one. For a choice of p > 1 the estimator θ̂_n is shown to be asymptotically normal with mean θ^0.

When the regression model is considered, then under more specific conditions set on the class of distribution functions for the error term it is shown that θ^0 = θ*; hence the estimator is strongly consistent for θ*, and for p > 1 the asymptotic distribution is centered about the true parameter value θ*. A predictor ŷ for the endogenous variable y is defined for the case in which an analytic or numerical solution for y exists for a given value of x.
BIOGRAPHY

The author was born on November 4, 1947, in San Juan, Puerto Rico. After receiving a Bachelor's degree in Business Administration from the University of Puerto Rico, she was granted financial support from this university to pursue studies in statistics towards a Master's degree at North Carolina State University. Studies during 1970-71 were done with financial support from the National Science Foundation.

After completion of the Master's degree the author worked as an instructor at the University of Puerto Rico. In 1972, she returned to North Carolina State University to fulfill the requirements for a doctoral degree.
ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to the Co-Chairmen of my graduate committee, Drs. B. B. Bhattacharyya and A. R. Gallant, for their guidance in the preparation of this thesis and their valuable assistance in the organization and writing of the final draft. Thanks also go to Drs. F. Giesbrecht and R. Schrimper for serving on this committee and for their cooperation given on short notice during the last stages of this thesis.

Special thanks go to Dr. Thomas D. Wallace for his assistance throughout my five years of graduate work and for suggesting what eventually became part of this research.

To Linda Granger, I owe thanks for her prompt and careful typing of this work.

I am greatly indebted to the University of Puerto Rico for financial support during the first year of my graduate work and above all for being a main source of motivation in the completion of my degree.

To my parents I wish to express my gratitude for their constant encouragement.

To my husband, John, I dedicate this thesis.
TABLE OF CONTENTS

1. OVERVIEW OF THE PROBLEM
   1.1 Introduction
   1.2 Background
   1.3 Review of the Literature
2. THE MODEL AND GENERAL ASSUMPTIONS
   2.1 Introduction
   2.2 Structure and Model Specifications
   2.3 The General Assumptions
3. THE ESTIMATOR
   3.1 Definition of The Estimator θ̂_n
   3.2 Computation of θ̂_n
   3.3 Convergence of The Algorithm
4. GENERAL RESULTS
   4.1 Introduction
   4.2 Convergence With Probability One
   4.3 Asymptotic Normality of θ̂_n
   4.4 The Predictor
5. THE REGRESSION MODEL
   5.1 Strong Consistency of θ̂_n
   5.2 Asymptotic Normality
6. SUMMARY
7. NUMERICAL ILLUSTRATIONS
8. LITERATURE CITED
1. OVERVIEW OF THE PROBLEM

1.1 Introduction

Conventional regression models usually are of the form

   y_t = f(x_t, θ*) + ε_t ,

where y_t is the observed response variable, f is the hypothesized response function in terms of exogenous variables x_t and parameters, θ* is the unknown vector of parameters to be estimated, and ε_t is a random error uncorrelated with x_t. This equation attempts to satisfy the scientist's need to express a hypothetical relationship in manageable terms. Most models are derived in this form, and the well-known techniques for estimating θ* are oriented to this specific algebraic formulation; that is, a response variable on the left hand side of the equation, and a function of exogenous variables and parameters plus an error term on the right hand side, is the eventual form of the model prior to the estimation stage.

It is possible, however, that theoretical considerations give rise to a structural relation

   Q(x_t, y_t, θ*) = ε_t ,

which is an implicit function of the variable y_t. One problem with this formulation is that an analytic solution for y_t may not exist, and the well-established approaches for estimating θ* do not apply directly. A lesser problem is that when known numerical methods are able to generate a solution for y_t, given trial values for θ* and known values of x_t (and equating ε_t to zero), the process of estimating θ* under standard techniques may be cumbersome and costly.

Furthermore, in the case where an analytic solution for y_t does exist, as in the regression model, various alternatives to the least squares approach have been explored, mostly by simulation studies. However, the statistical properties of the generated estimators have not been formally derived given a general response function f(x, θ), not necessarily linear in the unknown parameters.

It is to these problems that this dissertation is addressed. More specifically, we investigate the properties of a class of estimators generated under no requirement that an analytic or a unique numerical solution exists for the endogenous variable y. The regression model is considered as a special case, i.e., a conceptual implicit model having an analytic solution for the response y which is expressed as a function of parameters, and variables uncorrelated with the additive error term.
1.2 Background

Data available for analysis usually consists of a set of observations {(x_t, y_t)}_{t=1}^n, where {y_t}_{t=1}^n is the sequence of responses on the variable y, obtained for different values of the exogenous variables x. The functional form of the response may arise from theoretical considerations, but in many situations it is approximated by a simpler function like a polynomial of fixed degree in x.

A model is said to be in reduced form if it is expressed by a system of equations (one or more) with one endogenous variable on the left hand side and a function of variables uncorrelated with the error term on the right hand side of each equation. Alternatively, a structural or implicit representation of the model is of the form

   Q(x_t, y_t, θ*) = ε_t .

In many applications the interest lies in constructing an equation that will predict and hopefully explain the behavior of the response variable; the model is therefore formulated accordingly. If the derived model is in implicit form, the approach is to obtain the reduced form prior to the estimation stage.

Given the data and the reduced form, the least squares technique is the most frequently used method for estimating the parameter θ*, but other criteria have been considered. In particular, Huber [9], [11] considered the model

   y_t = Σ_{i=1}^k x_{ti} θ_i + ε_t ,

and his class of estimators included θ̂_n generated by minimizing

   Σ_{t=1}^n |y_t − Σ_{i=1}^k x_{ti} θ_i|^p

for p ≥ 1. Under general conditions he derived the consistency and asymptotic normality of θ̂_n. Smith and Hall [18] and Forsythe [7] conducted simulation studies that suggested the following: under the model above, if the probability distribution for the error is in the double exponential family or in the family of contaminated normals with density function

   f(ε) = (1 − G)(2π)^{-1/2} exp(−ε²/2) + G(2πR²)^{-1/2} exp(−(ε − S)²/(2R²))

for 0 ≤ G ≤ 1 and R > 0, then θ̂_n satisfying

   Σ_{t=1}^n |y_t − Σ_{i=1}^k x_{ti} θ̂_i|^p = min_{θ∈Ω} Σ_{t=1}^n |y_t − Σ_{i=1}^k x_{ti} θ_i|^p

for 1 < p < 2 in general has smaller sample mean square error than the estimator generated by the least squares approach. Note that the above density is symmetric when S = 0, and skewed when R = 1 and S ≠ 0. In both cases the alternative estimator showed substantial increase in efficiency as the number of observations and the degree of contamination get larger.
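To make the comparison concrete, the following is a minimal sketch (not code from the thesis; the function names and the choice p = 1.5 are assumptions for illustration) that fits a linear model by minimizing the sum of absolute deviations of order p and compares it with least squares under contaminated normal errors.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 200, 1.5                              # sample size and order p
X = np.column_stack([np.ones(n), rng.uniform(-2, 2, n)])
theta_true = np.array([1.0, -0.5])
# contaminated normal errors: fraction G drawn from a wider N(S, R^2)
G, S, R = 0.2, 0.0, 5.0
mix = rng.uniform(size=n) < G
eps = np.where(mix, rng.normal(S, R, n), rng.normal(0.0, 1.0, n))
y = X @ theta_true + eps

def lp_loss(theta):
    # sum of absolute residuals of order p: sum_t |y_t - x_t' theta|^p
    return np.sum(np.abs(y - X @ theta) ** p)

theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]          # least squares
theta_lp = minimize(lp_loss, theta_ls, method="Nelder-Mead").x
print("least squares:", theta_ls)
print("L_p (p = 1.5):", theta_lp)

Repeating such a simulation over many samples and comparing mean square errors reproduces the kind of evidence reported by Smith and Hall [18] and Forsythe [7].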
In this same context, the class of symmetric distributions with density given by

   f(ε) = w σ^{-1} exp{ −(1/2) |ε/σ|^{2/(1+β)} } ,   −∞ < ε < ∞ ,  0 < σ < ∞ ,  −1 < β < 1 ,

where

   w^{-1} = Γ(1 + (1+β)/2) 2^{1 + (1+β)/2} ,

has been studied by Box and Tiao [2] as an alternative to the usual assumption of a normal distribution for the error term ε_t. In particular note that if β = 0 we have the normal density, the double exponential corresponds to β = 1, and as β → −1 the distribution tends to the uniform. We remark that for this class of distributions, if p = 2/(1+β) and σ are known, then for the explicit model, maximizing the likelihood corresponds to the estimator θ̂_n generated by minimizing

   Σ_{t=1}^n |y_t − f(x_t, θ)|^p .

Hence, at least for n sufficiently large, minimizing the sum of p-th absolute deviations may give more efficient estimates than those generated by the least squares approach.
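The correspondence between maximum likelihood under this density and minimization of p-th power deviations can be made explicit; the display below is a sketch of the standard argument, not taken verbatim from the thesis:

   \log L(\theta) = n \log\frac{w}{\sigma} - \frac{1}{2\sigma^{p}} \sum_{t=1}^{n} \bigl| y_t - f(\mathbf{x}_t, \theta) \bigr|^{p} , \qquad p = \frac{2}{1+\beta} ,

so that with σ known, maximizing log L(θ) over θ is equivalent to minimizing Σ_{t=1}^n |y_t − f(x_t, θ)|^p.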
The implicit model may arise from considerations of the theoretical relationships underlying an observed process. Frequently, a system of differential equations, or the set of necessary conditions that must be satisfied when the process is in some optimal state, constitute the derived model. It is possible that these equations are algebraically implicit in the dependent variable. A specific model appears in Wallace and Ihnen [20]. Situations may also arise in practice in which all variables observed may be viewed as being generated from a joint probability distribution. In this case it is meaningless to distinguish between dependent and independent variables. If the variables are correlated with the error term, then we might say that the usual regression model is actually in implicit form.

Usually, these models are originally stated in deterministic form. The experimenter derives a relation of the form

   Q(x_t, y_t, θ*) = 0 .

Yet, because of measurement error, the use of alternative measures for a given physical quantity, or due to neglected effects, an error component gets introduced into the model. We have chosen to view the data as measured without error, but to consider the combined effect of the neglected variables as the error term ε_t. The solution to the problem of estimation then relies on the prior information concerning the error term ε_t and the parameter space Ω that the experimenter is willing to add to the structure.
Given a probability distribution for the error ε_t, a maximum likelihood approach may be used, and under general conditions the estimator of θ* is consistent and asymptotically normal. In the case of the model

   Q(x_t, y_t, θ*) = ε_t ,

assuming E(ε_t | x_t = x) = 0 for all x also implies that

   E_θ*(Q²(x, y, θ) | x) ≥ E_θ*(Q²(x, y, θ*) | x)

for any θ in Ω. Under additional restrictions on the sequence {x_t}_{t=1}^∞, the above relation becomes a strict inequality when the conditional expectation is averaged over all x_t for n sufficiently large. Minimizing

   Σ_{t=1}^n Q²(x_t, y_t, θ)

under general conditions then leads to a consistent and asymptotically normal estimator of θ*.

When the model is approached in its implicit form, it is necessary that the prior information on the model be sufficient to distinguish the true parameter value θ* from all others. However, in many real situations knowledge of the distribution function of ε_t is at most highly uncertain.
As Fisher [5] pointed out, even in the case of a relation of the form

   β^T x_t + γ^T y_t = ε_t ,

assuming a normal distribution with constant variance and zero expectation for ε_t is not sufficient to distinguish between (β^T, γ^T) and (cβ^T, cγ^T) for c a scalar different from zero. Usually, a normalization rule is imposed on the original parameters in order to define a unique value for (β^T, γ^T) that is in accordance with the observed data and the model specifications.

In considering a specific model, knowledge on the behavior of the errors associated with the process may impose additional restrictions on θ* which will define the true vector. One such restriction could relate to some measure of dispersion for the random variable Q(x, y, θ), such that at θ = θ* the variability of Q around the value zero, as measured by the function ρ(Q(x, y, θ)), satisfies

   ρ(Q(x, y, θ*)) < ρ(Q(x, y, θ))

for any θ ≠ θ*.
With these comments in mind, we state the purpose of this thesis as twofold:

1. To consider the model in its implicit form

      Q(x_t, y_t, θ*) = ε_t ,

   where E(ε_t) = 0 and no specific distribution function is assumed for ε_t. The standard regression model is simply a special case.

2. To explore the statistical properties of the estimator θ̂_n satisfying

      Σ_{t=1}^n |Q(x_t, y_t, θ̂_n)|^p = min_{θ∈Ω} Σ_{t=1}^n |Q(x_t, y_t, θ)|^p

   for a fixed p ≥ 1.

In Chapter 2 we define the implicit model in more detail and present the major assumptions underlying the results derived in later chapters. Chapter 3 develops an iterative computing scheme for finding θ̂_n when p > 1 and n ≥ k, where k is the number of unknown parameters. Under certain conditions, the sequence generated by the algorithm is shown to converge to a stationary point of the objective function

   Ψ_n(θ) = Σ_{t=1}^n |Q(x_t, y_t, θ)|^p .

The general properties of the estimator θ̂_n are derived in Chapter 4. The sequence of estimators {θ̂_n}_{n=1}^∞ is shown to be asymptotically normal and to converge with probability one to a point θ^0 in Ω. In Section 4.4 of that chapter the predictor ŷ is defined, under the assumption that the equation Q(x, ŷ, θ̂_n) = 0 has a solution in ŷ for given values of x and θ̂_n.

The general results of Chapter 4 are applied to the regression model in Chapter 5. For the explicit regression model the estimator θ̂_n is strongly consistent and asymptotically normal, with the asymptotic distribution centered around the true parameter value θ*.

Some numerical illustrations on the performance of the algorithm are included in Chapter 7. In some cases, the results are compared to those generated by other computing schemes.
1.3 Review of the Literature

In the case of one dependent variable y_t and an m-vector of independent variables x_t measured without error, the usual approach to the estimation of θ* assumes a model of the form

   Q(x_t, y_t^0, θ*) = 0 ,

where y_t^0 is related to the observed value y_t through the equation

   y_t = y_t^0 + ε_t .

That is, y_t^0 is the value of the random variable y at the time of observation. The measurement error associated with the t-th observation is ε_t. If for each x_t and different values of θ there exists a solution y(x_t, θ) of the model equation, then the estimate of θ* is that value of θ for which

   Σ_{t=1}^n (y_t − y(x_t, θ))²

attains its minimum.

We note that under this interpretation of the implicit model the method can be extended to consider the estimator θ̂_n generated by minimizing

   Σ_{t=1}^n |y_t − y(x_t, θ)|^p

for p ≥ 1. Then, under the conditions specified in Chapter 5, the derived results apply to the estimator of θ*.
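To make this interpretation concrete, the following is a minimal sketch (illustrative code under assumed forms, not from the thesis): an example structure Q(x, y, θ) = y + θ₁ exp(θ₂ y) − x, which has no closed-form solution for y, is solved numerically for y(x, θ) inside an L_p loss; the structure, the bounds on θ, and p = 1.5 are assumptions of this example.

import numpy as np
from scipy.optimize import brentq, minimize

def Q(x, y, th):
    # example implicit structure: no closed-form solution for y
    return y + th[0] * np.exp(th[1] * y) - x

def y_of(x, th):
    # numerical solution y(x, theta) of Q(x, y, theta) = 0
    return brentq(lambda y: Q(x, y, th), -20.0, 20.0)

rng = np.random.default_rng(1)
th_true = np.array([0.5, 0.3])
xs = rng.uniform(1.0, 5.0, 100)
ys = np.array([y_of(x, th_true) for x in xs]) + rng.normal(0, 0.1, 100)

p = 1.5
def loss(th):
    if th[0] <= 0.0 or th[1] <= 0.0 or th[1] > 1.0:
        return 1e12            # keep the search where the root is bracketed
    yhat = np.array([y_of(x, th) for x in xs])
    return np.sum(np.abs(ys - yhat) ** p)

th_hat = minimize(loss, np.array([0.4, 0.25]), method="Nelder-Mead").x
print("estimate:", th_hat)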
Britt and Luecke [3] considered this approach. Moreover, in order to extend the results to the case where both dependent and independent variables are subject to measurement error, the authors present the model as

   g(y_t^0, θ*) = 0 ,   t = 1, 2, ..., k ,

where y_m = y^0 + ε is a q-vector of measurements on the variables subject to error, with q ≥ k. The unknown parameters are the n × 1 vector θ* and the vector y^0, defined by the k model equations, with k > n. The error vector is assumed to be normally distributed with zero mean and a known positive definite variance-covariance matrix R.

Taking a maximum likelihood approach, the relevant likelihood function is

   L(y^0, θ) = (2π)^{-q/2} |R|^{-1/2} exp{ −(1/2)(y_m − y^0)^T R^{-1} (y_m − y^0) } ,

where y^0 and θ are restrained to satisfy g(y_t^0, θ) = 0 for t = 1, 2, ..., k. The objective function with the added restriction becomes a constrained (Lagrangian) form. Thus, the problem becomes one of estimating (q + n) unknown parameters, θ* and y^0, subject to side constraints, given the distribution of the error terms associated with the model. An algorithm is derived to solve the system of necessary conditions satisfied by the stationary points (ŷ^0, θ̂).
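The constrained objective can be written out explicitly; the display below is a sketch of the standard Lagrangian formulation (the multiplier vector λ is notation introduced here, not the thesis's):

   \mathcal{L}(y^{0}, \theta, \lambda) = \tfrac{1}{2}\,(y_{m} - y^{0})^{T} R^{-1} (y_{m} - y^{0}) + \lambda^{T} g(y^{0}, \theta) ,

and the stationary points (ŷ^0, θ̂) solve ∂L/∂y^0 = 0, ∂L/∂θ = 0, and g(ŷ^0, θ̂) = 0 simultaneously.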
Bard [1] takes a similar approach and also considers another interpretation of the model. Similar to Britt and Luecke, if the data are subject to measurement error the model equations may be assumed to apply exactly to the true unknown values of the variables and parameters,

   g(y_t^0, θ*) = 0 ,   t = 1, 2, ..., n ,

where g is an n × 1 vector and y_t^0 depends on t. An error component enters the model through

   y_t = y_t^0 + ε_t ,

where the vector y_t includes all variables subject to error and the m-vector x_t all those measured exactly. The measurement errors are assumed to have a joint probability density function p((y_t − y_t^0) | ψ) indexed by the parameters ψ.

Alternatively, the model may be viewed as applying only approximately even to the true values y_t^0. The model itself is inexact,

   g(x_t, y_t^0, θ*) = ε_t   or   g(x_t, y_t, θ*) = ε_t ,

depending on whether the model applies to the true unknown values y_t^0 or the measured values y_t. The random variable ε_t is usually assumed to have zero expectation, a symmetric distribution, and to represent the effects of non-specified variables.

If y_t is measured with error, the joint probability function for the t-th observation is p(y_t | ψ_1), where p(y_t | ψ_1) is the joint density function of the error vector (y_t − y_t^0) indexed by the parameters ψ_1, and the errors are assumed to be statistically independent of x_t for each t = 1, 2, ..., n. Under no measurement error the relevant density is simply p(ε | ψ_1).

Bard also takes a maximum likelihood approach to estimation. If the model is exact and describes the relationship among x_t, y_t, and θ*, the likelihood is derived solely from the distribution of the measurement errors, restricted to those points (y_t^0, θ) satisfying the model equations. Alternatively, since

   Π_{t=1}^n p(ε_t | ψ) = Π_{t=1}^n p(g(x_t, y_t, θ) | ψ) ,

if the model applies to the measured values y_t and these are observed without error, the likelihood expressed in terms of the derived density for the observed data is

   L(θ, ψ) = Π_{t=1}^n p(g(x_t, y_t, θ) | ψ) .

Usually the distribution function assumed for the errors in the model is the multivariate normal.
A second approach to the implicit model situation is presented by Salmond [17]. The problem arose in which a mathematical model proposed did not supply enough prior information to define uniquely the parameters indexing the process. The model postulated is a single equation model given by

   1 = X_j^T β + ε_j ,

where X_j is a multivariate observation associated with sample unit j, β is the unknown vector of weighting coefficients to be estimated, X_j^T β is the index associated with sample unit j, and ε_j is a random error.

The vector X_j was considered a multivariate random variable, correlated with the error term, and additional information was not sufficient to postulate a probability distribution for ε_j. In this case the vector of parameters is not defined by the model; additional restrictions on β were added to accomplish this. For each unit j it was sensibly implied from the process that at the desired parameter vector β, X_j^T β was close to 1 in some sense. Hence, β̂ = (b_1, b_2, ..., b_n) was defined as the value which minimizes the dispersion of (1 − X^T b) subject to

   E(1 − X^T b) = 0 .

Under the assumption that X had an n-multivariate normal distribution with mean u and positive definite matrix of dispersion Γ, the constrained minimization problem led to

   β = Γ^{-1} u / (u^T Γ^{-1} u) .

Hence, β̂ = (b_1, b_2, ..., b_n) was defined by the above expression. The derived maximum likelihood estimates of u and Γ were

   û_ℓ = (1/N) Σ_{j=1}^N X_{ℓj} ,   1 ≤ ℓ ≤ n ,

and

   Γ̂_{kℓ} = (1/N) Σ_{j=1}^N (X_{kj} − û_k)(X_{ℓj} − û_ℓ) ,   1 ≤ k, ℓ ≤ n .

Hence,

   β̂ = Γ̂^{-1} û / (û^T Γ̂^{-1} û) .

Under more general conditions for X, that is, a non-specified continuous distribution, and assuming that the first four cumulants of the distribution existed, the author concluded that the estimators of u and Γ correspond to those derived with the multivariate normal distribution assumption. In either case, the estimator β̂ was shown to be asymptotically unbiased, given a degree of approximation of β̂, and an estimator of the asymptotic variance-covariance matrix of β̂ was derived.
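A small numerical sketch of Salmond's estimator follows (illustrative code with simulated data and names of my own choosing; the mean vector and dispersion matrix below are assumptions of the example):

import numpy as np

rng = np.random.default_rng(2)
N = 500                                             # sample units
X = rng.multivariate_normal(mean=[2.0, 1.0, 0.5],
                            cov=np.array([[1.0, 0.2, 0.0],
                                          [0.2, 1.0, 0.1],
                                          [0.0, 0.1, 1.0]]),
                            size=N)

u_hat = X.mean(axis=0)                              # MLE of the mean u
Gamma_hat = np.cov(X, rowvar=False, bias=True)      # MLE of the dispersion
Gi_u = np.linalg.solve(Gamma_hat, u_hat)            # Gamma^{-1} u
beta_hat = Gi_u / (u_hat @ Gi_u)                    # Gamma^{-1}u / u'Gamma^{-1}u
print("beta_hat:", beta_hat)
print("u'beta (equals 1 by construction):", u_hat @ beta_hat)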
2. THE MODEL AND GENERAL ASSUMPTIONS

2.1 Introduction

The implicit model was introduced in Chapter 1 as a mathematical representation of a process, given by

   Q(x_t, y_t, θ*) = ε_t .

Given the model formulation and a sample of points {(x_t, y_t)}_{t=1}^n, the statistical problem is to estimate θ* and to derive the properties of the estimator. To this end, specific restrictions are imposed on the function Q, its arguments, and the error term ε_t. In the present discussion the implicit model will consist of the structural relation Q and the conditions imposed on it and on the error term ε_t.

The following sections introduce the model formally, establish the notation, and present the major assumptions.
2.2 Structure and Model Specifications

The structural model is one equation of the form

   Q(x_t, y_t, θ*) = ε_t ,

where ε_t is an unobservable random variable satisfying E(ε_t) = 0 for t = 1, 2, .... Formally, ε_t = ε_t(λ) is a random variable on a probability space (Λ, G, P) into R. The vector {ε_1(λ), ε_2(λ), ...} denotes a realization of independent and identically distributed error terms, where each ε_t has probability function P_ε. The unknown parameter to be estimated is θ*, a k × 1 vector in Ω ⊂ R^k.

As in the conventional linear regression model, the choice must be made whether to view {x_t}_{t=1}^∞ as a sequence of known constants or as a realization of random variables. The theoretical results derived in later chapters do not depend on which of the two interpretations one takes, as long as limits of functions of the form

   (1/n) Σ_{t=1}^n g(x_t, y_t, θ)   and   (1/√n) Σ_{t=1}^n (g(x_t, y_t, θ) − E_θ* g(x, y, θ))

behave properly (when θ is a point in Ω). Assumptions 14 and 15 in Section 2.3 of this chapter explain what we mean by the above comment. Hence, we will assume that {x_t}_{t=1}^∞ is generated in one of two ways:

1. The vectors x_1, x_2, ... are a sequence of random variables mapping (Λ, G) into R^m.

2. The m-dimensional vectors x_1, x_2, ... are constants specified by the experimenter.

Often, there exists a real valued function h such that y_t is a random variable on (Λ, G) satisfying

   y_t = h(x_t, θ*, ε_t) .
However, by specifying the model in its implicit form we include the case when h cannot be formulated explicitly or does not exist uniquely. Standard regression methods require data on the response y_t and a predictor for each t = 1, 2, ..., n, for different values of x and θ. Hence, the derivation of the model is most often approached by formulating a response function in terms of independent variables and parameters. For the purpose of estimation, the function h will not be assumed to exist analytically or numerically. When considering the prediction of y, it will be required that if the true value θ* were known, then, for any value of x in X ⊂ R^m, a solution for y can be found using standard methods of numerical analysis.

Note that if h is known and is of the form

   y_t = f(x_t, θ*) + ε_t ,

where f: X × Ω → R is a linear or nonlinear function of the parameters in θ*, and the x_t vector is uncorrelated with ε_t, then we have the regression case.
Furthermore, as indicated in Chapter 1, it is assumed

(a) that the structural model Q(x_t, y_t, θ*) = ε_t is satisfied by the measured values {(x_t, y_t)}_{t=1}^n, and

(b) that the error term reflects the influence of unknown variables not individually identified by the structure.

Moreover, Assumptions 2, 4, and 5 in Section 2.3 of this chapter impose regularity conditions on the function Q mapping S × Ω into R, where S is a Borel subset of R^{m+1}. As noted previously, the inputs {x_t}_{t=1}^∞ are either stochastic or a sequence of constants. For each case, additional requirements are as follows:
as follows:
The vectors
are random variables and
~l' ~2' ."
{(~l'Yl)' (~2'Y2)' .,.}
denotes independent and
(~t'Yt)
identically distributed random vectors
e*
P
with probability function
Ie
on
each
(S,Ie) , where
is the collection of Borel subsets of
S.
As
we will see, in this case, the Strong Law of Large
Numbers and the Multivariate Central Limit Theorem
hol.d and imply Assumption 14 and 15,
We denote the
condi tional distribution function of
given
-X t
= -x by Fy, e*/x for
t
conditional expectation of a
function
g
on
S x 0
by
= 1,
2, ,.,
a..rld the
Ie measurable
21
Expectations of functions on
to the joint measure
E (g/x)
2.
P
e*.
S x
(2
will be taken with respect
A formal definition of the function
appears in Loeve [13, p. 341J .
2. For each x_t in the sequence of constants {x_t}_{t=1}^∞, y_t is a random variable on (Λ, G) and there may exist a real valued function h such that

      y_t = h(x_t, θ*, ε_t) .

   If h exists, its explicit form may or may not be known. The distribution function of y_t for x_t = x is F_{y(x,θ*)} for t = 1, 2, ..., n. Furthermore, the sequence {x_t}_{t=1}^∞ is generated such that for each n there is a probability measure u_n(A) defined on (X, G_x), where X is a Borel subset of R^m, A ∈ G_x, and G_x is the collection of Borel subsets of X. In addition, the measure u_n(A) converges weakly to a probability measure μ(A) for all A in G_x.
Definition 2.1

Let {u_n(A)}_{n=1}^∞ be a sequence of probability measures on (X, G_x) and let μ(A) be a probability measure on (X, G_x); then u_n converges weakly to μ if for every bounded continuous function f,

   ∫ f du_n → ∫ f dμ   as n → ∞ .

Let the probability measure on (S, B) be

   P_θ*(E) = ∫_X F_{y(x,θ*)}(E_x) dμ(x) ,

where E_x = {y ∈ R | (x, y) ∈ E} for E ∈ B a measurable set, and let g(x, y, θ) be a B measurable function on S × Ω. Expectations of g are then taken whenever the expectation exists with respect to P_θ*.

For both interpretations of the sequence {x_t}_{t=1}^∞ we will use the following notation throughout:

   E_θ* g(x, y, θ) = ∫_S g(x, y, θ) dP_θ*

and, for x_t non-stochastic,

   E(g|x) = ∫_R g(x, y, θ) dF_{y(x,θ*)} .
2.3 The General Assumptions

For convenience, all assumptions are listed in this section. First, the basic notation used throughout is established. This is:

Q_t(θ) and ρ_t(θ) denote the t-th term in the vectors Q_n^T(θ) and ρ_n^T(θ) respectively, with ρ_t(θ) = |Q_t(θ)|^p.

D_n(θ) := an n × k matrix with t-th row given by (∂Q_t(θ)/∂θ_1, ∂Q_t(θ)/∂θ_2, ..., ∂Q_t(θ)/∂θ_k). The rank of D_n(θ) is k.

W_n(θ) := an n × n diagonal matrix with elements |Q_t(θ)|^{p−2}.

G_n(θ) := a k × k matrix with (i, j) element Σ_{t=1}^n ∂²ρ_t(θ)/∂θ_i∂θ_j.

G(θ) := a k × k matrix with (i, j) element E_θ*(∂²ρ(x, y, θ)/∂θ_i∂θ_j).

A(θ) := a k × k matrix with i-th row given by

   ( E_θ*(∂ρ(x,y,θ)/∂θ_i · ∂ρ(x,y,θ)/∂θ_1), E_θ*(∂ρ(x,y,θ)/∂θ_i · ∂ρ(x,y,θ)/∂θ_2), ..., E_θ*(∂ρ(x,y,θ)/∂θ_i · ∂ρ(x,y,θ)/∂θ_k) ) .

(D'WQ)_n(θ) := the product D_n^T(θ) W_n(θ) Q_n(θ).

A(θ, δ) := a spherical neighborhood with radius δ > 0 and centered at θ.

∎ is the termination of a lemma, proposition or theorem.

The Assumptions on which this thesis is based appear below. Included in their description are explanatory comments to identify the major results associated with a particular assumption or a group of them. These are:
1. Ω is a compact subset in R^k.

2. Q(x, y, θ) is a real valued function on S × Ω, continuous in θ in Ω for each (x, y) in S, and B measurable for each θ in Ω.

3. ρ(x, y, θ) = |Q(x, y, θ)|^p, with Q as defined in Assumption 2, is B measurable for each θ in Ω and continuous in θ in Ω.

4. ∂Q(x, y, θ)/∂θ_i exists on S × Ω and is real valued, continuous in θ in Ω for each (x, y) in S, and B measurable for each θ in Ω, for all i = 1, 2, ..., k.

5. ∂²Q(x, y, θ)/∂θ_i∂θ_j exists on S × Ω and is real valued, continuous in θ in Ω for each (x, y) in S, and B measurable for each θ in Ω, for all i, j = 1, 2, ..., k.

6. There exists a closed set V in the interior of Ω such that Ψ_n(θ) has at most one stationary point in V; that is, if the gradient of Ψ_n vanishes at θ and at θ_1 in V, then ‖θ_1 − θ‖ = 0.

6(a). There exists a θ in V of Assumption 6 such that Ψ_n(θ) < Ψ_n(θ_1) for θ_1 any boundary point in V.

6(b). For each natural number n, the value θ̂_n in Ω minimizing Ψ_n(θ) is in the interior of V.

Assumptions 1, 2, 4, 6, and 6(a) are fundamental in proving convergence of the computing algorithm generating the estimator θ̂_n. The derivation of this algorithm is based on Assumptions 4 and 6(b).

7. There exists a unique value θ^0 in int(Ω) such that

      E_θ* ρ(x, y, θ^0) < E_θ* ρ(x, y, θ)

   for all θ ≠ θ^0 in Ω.

8. E_θ* ρ(x, y, θ) < ∞ for θ in Ω.

Assumptions 1, 2, 7, 8, and 14 are sufficient for showing that the estimator θ̂_n converges to θ^0 with probability one as n → ∞.

9. E_θ*(∂ρ(x, y, θ)/∂θ_i) is well defined and |E_θ*(∂ρ(x, y, θ)/∂θ_i)| < ∞ for i = 1, 2, ..., k and θ in Ω, with θ^0 as in Assumption 7.

10. |E_θ*(∂ρ(x, y, θ)/∂θ_i · ∂ρ(x, y, θ)/∂θ_j)| < ∞ for i, j = 1, 2, ..., k and θ in Ω.

11. |E_θ*(∂²ρ(x, y, θ)/∂θ_j∂θ_i)|_{θ=θ^0}| < ∞ for i, j = 1, 2, ..., k, and E_θ*(∂ρ(x, y, θ^0)/∂θ_i)² > 0.
12. Det(G(θ^0)) ≠ 0, where G(θ^0) is the k × k matrix with (i, j) element E_θ*(∂²ρ(x, y, θ^0)/∂θ_i∂θ_j).

13. The matrix (D'WD)_n(θ) is positive definite.

14. Let g(x, y, θ) be a real valued B measurable function for each θ in Ω such that E_θ* g(x, y, θ) exists; then

      (1/n) Σ_{t=1}^n g(x_t, y_t, θ) → E_θ* g(x, y, θ)

   with probability one as n → ∞.

15. Let g(x, y, θ) be a k × 1 vector such that g_i(x, y, θ) is real valued and B measurable for each θ in Ω and each i = 1, 2, ..., k; then

      (1/√n) Σ_{t=1}^n (g(x_t, y_t, θ) − E_θ* g(x, y, θ)) →_L N_k(0, Σ) ,

   where Σ is the (k × k) matrix with (i, j) element E_θ*(g_i g_j) − E_θ*(g_i) E_θ*(g_j).

The derivation of Theorems 4.3 and 4.4 concerning the asymptotic normality of the sequence of estimators θ̂_n is based on Assumptions 2, 4, 5, 7, 10, 11, 12, 14, 15 and relations 4.3 and 4.4 respectively. The latter are presented with the theorems.
The following assumptions relate specifically to the special case when

   Q(x_t, y_t, θ*) = y_t − f(x_t, θ*) .

16. For each t = 1, 2, ..., the distribution of ε_t for a given vector x_t = x is symmetric about zero with a unique median.

17. Ω is a compact connected subset of R^k and θ* is an interior point of Ω.

18. Let C(θ) = {x | f(x, θ) − f(x, θ*) ≠ 0}; then for θ ≠ θ*, μ(C(θ)) > 0.

19. The error ε_t has a bounded density function with respect to the Lebesgue measure.

20. There exists a μ_X integrable function r_{i,j}(x) for each i, j = 1, 2, ..., k such that for every θ in Ω

      |∂²f(x, θ)/∂θ_i∂θ_j| ≤ r_{i,j}(x) .

20(a). There exists a constant T such that

      sup_X sup_{A(θ*,δ)} |f(x, θ) − f(x, θ*)| ≤ T

   for all 0 < δ ≤ δ_0.

20(b). There exists a μ_X integrable function S(x) such that

      |∂f(x, θ)/∂θ_i · ∂f(x, θ)/∂θ_j| ≤ S(x) ,   i, j = 1, 2, ..., k .

20(c). There exists a μ_X integrable function r_{i,j,r}(x) such that for every θ in Ω

      |∂f(x, θ)/∂θ_i · ∂f(x, θ)/∂θ_j · ∂f(x, θ)/∂θ_r| ≤ r_{i,j,r}(x) ,   i, j, r = 1, 2, ..., k .

Note that if the set X in R^m is also compact, Assumptions 20, 20(a), 20(b), and 20(c) are satisfied.
3. THE ESTIMATOR

3.1 Definition of The Estimator θ̂_n

This chapter introduces the estimator θ̂_n and develops an iterative computing scheme to generate θ̂_n given the observed sample {(x_t, y_t)}_{t=1}^n for each n ≥ k, where k is the number of unknown parameters.

Consider the implicit model

   Q(x_t, y_t, θ*) = ε_t ,

where the structure Q, the random error ε_t, and the arguments (x_t, y_t, θ*) follow the model specifications presented in Chapter 2. Given a choice of p, θ̂_n will be defined as a measurable function of {(x_t, y_t)}_{t=1}^n into Ω such that

   Ψ_n(θ̂_n) = min_{θ∈Ω} Σ_{t=1}^n |Q(x_t, y_t, θ)|^p = min_{θ∈Ω} Ψ_n(θ) .

Since Ψ_n(θ) is bounded below by zero and Ω is compact, there exists at least one value θ' in Ω such that Ψ_n(θ') = min_{θ∈Ω} Ψ_n(θ). Moreover, we will show that there exists a measurable version θ̂_n of θ' satisfying the same relation.
Lemma 3.1

Let Q be a real valued function on Θ × Y, where Θ is a compact subset of a Euclidean space and (Y, A) is a measurable space. For each θ in Θ, let Q(θ, y) be a measurable function of y in Y, and for each y in Y a continuous function of θ. Then, there exists a measurable function θ̂ from Y into Θ such that for all y in Y,

   Q(θ̂(y), y) = inf_{θ∈Θ} Q(θ, y) .

Proof

Lemma 3.1 and its proof appear in Jennrich [12]. ∎

For each n = 1, 2, ..., let S^n = S × S × ... × S, where S ⊂ R^{m+1} is the domain of Q, and let B_n be the collection of Borel subsets of S^n; then the space (S^n, B_n) is a measurable space. The following theorem follows immediately from Lemma 3.1.

Theorem 3.1

Let Assumptions 1 and 2 hold, and let the implicit model be Q(x_t, y_t, θ*) = ε_t; then, for any n = 1, 2, ..., there exists a B_n measurable function θ̂_n such that

   Ψ_n(θ̂_n) = min_{θ∈Ω} Σ_{t=1}^n |Q(x_t, y_t, θ)|^p .

Proof

Ω is a compact subset of R^k by Assumption 1. Assumption 2 implies Ψ_n(θ) is continuous in θ for each {(x_t, y_t)}_{t=1}^n and B_n measurable for each θ in Ω. Hence, by Lemma 3.1 the conclusion of the theorem follows. ∎

Given the sample values {(x_t, y_t)}_{t=1}^n in S^n from the implicit model, the estimator θ̂_n of θ* is defined as a B_n measurable function into Ω such that

   Ψ_n(θ̂_n) = min_{θ∈Ω} Σ_{t=1}^n |Q(x_t, y_t, θ)|^p .
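The definition translates directly into computation; the following minimal sketch (illustrative code; the cubic structure Q, the box used for Ω, and p = 1.5 are assumptions of this example) evaluates the objective Ψ_n(θ) and minimizes it over a compact box by a bounded optimizer.

import numpy as np
from scipy.optimize import minimize

p = 1.5
def Q(x, y, th):
    # assumed example structure, implicit in y: y^3 + th1*y - th2*x
    return y ** 3 + th[0] * y - th[1] * x

def psi_n(th, xs, ys):
    # objective Psi_n(theta) = sum_t |Q(x_t, y_t, theta)|^p
    return np.sum(np.abs(Q(xs, ys, th)) ** p)

rng = np.random.default_rng(3)
th_true = np.array([1.0, 2.0])
xs = rng.uniform(0.5, 3.0, 150)
def solve_y(x, eps):
    # real root of y^3 + th1*y - (th2*x + eps) = 0 (unique: derivative > 0)
    r = np.roots([1.0, 0.0, th_true[0], -(th_true[1] * x + eps)])
    return r[np.isreal(r)].real[0]
ys = np.array([solve_y(x, e) for x, e in zip(xs, rng.normal(0, 0.1, 150))])

bounds = [(-5.0, 5.0), (-5.0, 5.0)]               # compact Omega as a box
fit = minimize(psi_n, x0=np.array([0.5, 0.5]), args=(xs, ys),
               method="L-BFGS-B", bounds=bounds)
print("theta_hat:", fit.x, " Psi_n:", fit.fun)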
3.2 Computation of θ̂_n

This section proposes an iterative method which generates a sequence {θ_r}, for a given initial guess of the true parameter value θ* and a given observed sample {(x_t, y_t)}_{t=1}^n, such that under general conditions {θ_r} converges to θ', where θ' is a stationary point of Ψ_n(θ). If Ψ_n(θ) has a unique minimum, then the θ' generated by the algorithm corresponds to the measurable version θ̂_n. This is not necessarily so if there is more than one value in Ω at which Ψ_n(θ) attains its minimum. For practical purposes the estimate is that value θ' given by the computing method. However,

   (∂Ψ_n(θ')/∂θ) = 0 ,

where (∂Ψ_n(θ)/∂θ) is the k × 1 vector of partial derivatives with respect to θ,

   ∂Ψ_n(θ)/∂θ_i = Σ_{t=1}^n ∂|Q_t(θ)|^p/∂θ_i ,   i = 1, 2, ..., k .

For a choice of p > 1,

   ∂|Q_t(θ)|^p/∂θ_i = p |Q_t(θ)|^{p−1} sgn(Q_t(θ)) ∂Q_t(θ)/∂θ_i   for t = 1, 2, ..., n,  i = 1, 2, ..., k .

The above is equivalent to

   p |Q_t(θ)|^{p−2} Q_t(θ) ∂Q_t(θ)/∂θ_i   for t = 1, 2, ..., n,  i = 1, 2, ..., k .

The estimate θ' must be a solution to

   Σ_{t=1}^n |Q_t(θ)|^{p−2} Q_t(θ) ∂Q_t(θ)/∂θ_i = 0

for i = 1, 2, ..., k.
Underlying the derivation of the algorithm is the existence of the first partials of Ψ_n(θ) with respect to θ. For this reason, the resulting system of equations is not appropriate when p = 1, since in this case the first partials of |Q(x, y, θ)| may not exist at those points for which Q(x, y, θ) = 0. However, note that the estimator is defined for any finite p ≥ 1, and any of the known methods for minimizing the sum of absolute deviations can be utilized in generating θ'. Parenthetically, it should be noted that we applied the algorithm to several examples for which p = 1 and the value of θ' generated coincided with known values obtained from other computing techniques. See [19].
In matrix notation, the system of equations generating θ' is

   (∂Ψ_n(θ)/∂θ) = p (D'WQ)_n(θ) = 0 ,

where, as specified in Section 2.3, the i-th element of the (k × 1) vector (D'WQ)_n(θ) is

   ((D'WQ)_n(θ))_i = Σ_{t=1}^n (∂Q_t(θ)/∂θ_i) |Q_t(θ)|^{p−2} Q_t(θ) .

Evaluating the product in the above expression at the current value of θ, θ_r, and substituting the last factor Q_t(θ) by a first order Taylor's approximation around θ_r, the expression for ((D'WQ)_n(θ))_i becomes

   Σ_{t=1}^n (∂Q_t(θ_r)/∂θ_i) |Q_t(θ_r)|^{p−2} [ Q_t(θ_r) + Σ_{j=1}^k (∂Q_t(θ_r)/∂θ_j)(θ_j − θ_{rj}) ]

for i = 1, 2, ..., k. In matrix notation,

   (D'WQ)_n(θ) ≈ (D'WQ)_n(θ_r) + (D'WD)_n(θ_r)(θ − θ_r) .

The matrix (D'WD)_n(θ) has (i, j) element

   [(D'WD)_n(θ)]_{i,j} = Σ_{t=1}^n (∂Q_t(θ)/∂θ_i) |Q_t(θ)|^{p−2} (∂Q_t(θ)/∂θ_j)

for i, j = 1, 2, ..., k. The matrices D_n and W_n and the vector Q_n are as described in Section 2.3. Thus, the value of θ at the (r+1)st iteration is θ_{r+1}, where

   θ_{r+1} = θ_r − (D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r) .

Equivalently,

   (θ_{r+1} − θ_r) = −(D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r) .

In the language of iterative techniques, the algorithm is in the class of gradient methods with a positive definite direction matrix (D'WD)_n^{-1} and direction of search given by −(D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r).
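The update can also be read as an iteratively reweighted Gauss-Newton step; the display below is a sketch of that reading in notation of my own, not the thesis's. Setting the linearized gradient to zero is the normal-equation condition of a weighted least squares problem,

   \theta_{r+1} = \arg\min_{\theta} \sum_{t=1}^{n} w_t(\theta_r) \Bigl[ Q_t(\theta_r) + \sum_{j=1}^{k} \frac{\partial Q_t(\theta_r)}{\partial \theta_j} (\theta_j - \theta_{rj}) \Bigr]^{2} , \qquad w_t(\theta_r) = |Q_t(\theta_r)|^{p-2} ,

whose solution is exactly θ_{r+1} = θ_r − (D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r).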
However, to insure that the value of the objective function Ψ_n(θ) is improved at each step, we will choose θ_{r+1} to satisfy

   θ_{r+1} := θ_r − λ_r (D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r) ,

where λ_r is in [0, 1+ε] such that Ψ_n(θ_{r+1}) < Ψ_n(θ_r). Under the conditions of Lemma 3.2 such a λ_r exists for each r = 1, 2, ... whenever (D'WQ)_n(θ_r) ≠ 0 and the direction matrix (D'WD)_n(θ_r) is positive definite. Lemma 3.2 concludes that for some λ_r > 0, θ_{r+1} as defined above exists in the int(Ω) such that Ψ_n(θ_{r+1}) < Ψ_n(θ_r).
Lemma 3.2

Let θ̄ be an interior point in Ω ⊂ R^k, let Assumptions 4 and 13 be satisfied at θ = θ̄, and let (D'WQ)_n(θ̄) ≠ 0; then there exists a value λ_0 such that

   Ψ_n(θ̄ − λ P(θ̄)) < Ψ_n(θ̄)

for all λ in (0, λ_0).

Proof:

By Assumption 4, Ψ_n(θ) is differentiable for all θ in the interior of Ω, with derivative at θ = θ̄ given by p(D'WQ)_n(θ̄). From the definition of the derivative of a function on Ω ⊂ R^k into R, and by Assumption 13, there exists a value λ_0(e) for any e > 0 such that

   Ψ_n(θ̄ − λ P(θ̄)) − Ψ_n(θ̄) ≤ −λ p (D'WQ)_n'(θ̄)(D'WD)_n^{-1}(θ̄)(D'WQ)_n(θ̄) + λ e ‖P(θ̄)‖

for all λ ∈ (0, λ_0(e)), where ‖·‖ denotes the Euclidean norm. Note that the matrix (D'WD)_n(θ) is at least positive semi-definite, since W_n(θ) is a diagonal matrix with elements w_t(θ) = |Q_t(θ)|^{p−2} for t = 1, 2, ..., n. Let

   C(θ̄) = p (D'WQ)_n'(θ̄)(D'WD)_n^{-1}(θ̄)(D'WQ)_n(θ̄) / ‖P(θ̄)‖ > 0

by Assumption 13, and let e < C(θ̄); then

   Ψ_n(θ̄ − λ P(θ̄)) < Ψ_n(θ̄) ,

where θ̄ − λ P(θ̄) is in the interior of Ω for all λ in (0, λ_0(e)). ∎
Given the underlying assumptions up to this point, any choice of p > 1 is valid. We also investigated a method utilizing the knowledge of the second partials of Ψ_n(θ), like the Newton-Raphson, bearing in mind that the existence and finiteness of these partials are required, and hence the method will not always apply for a choice of p in (1, 2). Under this approach the approximation to Ψ_n(θ) around θ_r is a second order Taylor expansion. Taking partial derivatives with respect to θ, we obtain

   θ_{r+1} = θ_r − [Ψ_n''(θ_r)]^{-1} (∂Ψ_n(θ_r)/∂θ) ,

where Ψ_n''(θ_r) is a k × k matrix evaluated at θ = θ_r, with (i, j) element ∂²Ψ_n(θ)/∂θ_i∂θ_j for i, j = 1, 2, ..., k. The (i, j) element is

   ∂²Ψ_n(θ)/∂θ_i∂θ_j = p Σ_{t=1}^n [ (p−1) |Q_t(θ)|^{p−2} (∂Q_t(θ)/∂θ_i)(∂Q_t(θ)/∂θ_j) + |Q_t(θ)|^{p−2} Q_t(θ) ∂²Q_t(θ)/∂θ_i∂θ_j ] .

Note that each of these terms becomes infinite when Q_t(θ) = 0 and p < 2, for i, j = 1, 2, ..., k. Equivalently, the matrix Ψ_n''(θ) is the sum of two matrices,

   Ψ_n''(θ) = p(p−1)(D'WD)_n(θ) + F_n(θ) ,

where F_n(θ) is the k × k matrix with (i, j) element p Σ_{t=1}^n |Q_t(θ)|^{p−2} Q_t(θ) ∂²Q_t(θ)/∂θ_i∂θ_j.

Second derivative methods have been shown to provide an answer in situations where first derivative methods fail. However, in many practical cases the additional computational burden is not justified; moreover, given our class of estimators, the wider applicability to all values of p > 1 of the first approach is preferred. For a specific problem a Newton-Raphson method can be implemented whenever p ≥ 2 and the matrix Ψ_n''(θ) of second partials is non-singular.
Using the conventional terminology, we refer to the vector

   P(θ_r) = (D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r)

as the direction of search and to λ_r as the step length. The choice of λ_r will guarantee that at each iterative step the next value θ_{r+1} = θ_r − λ_r P(θ_r) satisfies

1. Ψ_n(θ_{r+1}) < Ψ_n(θ_r), and

2. θ_{r+1} is in the interior of Ω.

The topic of choosing λ_r has been widely discussed in the literature; specifically see Ortega and Rheinboldt [15]. These authors discuss a variety of methods to insure that conditions 1 and 2 above are satisfied for a given direction vector P(θ_r) ≠ 0. Their exposition emphasizes the considerable independence between the choice of step length and that of the direction of search. What is important is that at each iterative step the value of the objective function Ψ_n(θ) is reduced (recall that Lemma 3.2 outlines the conditions under which this reduction is made possible by a choice of λ_r). If one is interested in theoretical convergence, then the choice of λ_r must decrease Ψ_n(θ_r) by a sufficient amount. This comment will become clear when we define the algorithm formally and prove convergence of the generated sequence {θ_r}_{r=1}^∞.
We start off with an initial guess θ_0 in the interior of Ω such that

1. θ_0 is in V ⊂ int(Ω), and

2. Ψ_n(θ_0) < Ψ_n(θ) for θ a boundary point in V.

Then at the r-th iteration define the next value of θ, denoted by θ_{r+1}, as

   θ_{r+1} = θ_r − λ_r (D'WD)_n^{-1}(θ_r)(D'WQ)_n(θ_r) ,

where λ_r satisfies

   Ψ_n(θ_{r+1}) = min{ Ψ_n(θ_r − λ P(θ_r)) | λ ∈ [0, 1+ε], θ_r − λ P(θ_r) ∈ V }

and V is a closed set. When we refer to the sequence {θ_r}_{r=1}^∞, we mean that each θ_r was generated as described above for r = 1, 2, ..., and that θ_0 satisfies the conditions specified for the initial guess.
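The scheme above can be written out compactly; the following is a minimal sketch (illustrative code, with an assumed linear-in-parameters structure so that the derivative matrix D is easy to form, and a coarse grid standing in for the exact minimization over the step length λ_r).

import numpy as np

def lp_algorithm(Qfun, Dfun, theta0, p=1.5, max_iter=100, tol=1e-10):
    """Gradient-type iteration theta_{r+1} = theta_r - lambda_r (D'WD)^{-1} D'WQ."""
    theta = theta0.copy()
    for _ in range(max_iter):
        Qv = Qfun(theta)                       # n-vector Q_t(theta)
        D = Dfun(theta)                        # n x k matrix dQ_t/dtheta_j
        w = np.abs(Qv) ** (p - 2.0)            # diagonal of W (assumes Q_t != 0)
        DWQ = D.T @ (w * Qv)                   # (D'WQ)_n(theta)
        if np.linalg.norm(DWQ) < tol:
            break
        DWD = D.T @ (D * w[:, None])           # (D'WD)_n(theta)
        step = np.linalg.solve(DWD, DWQ)       # direction of search P(theta)
        # coarse stand-in for the minimization over lambda in [0, 1+eps]
        lams = np.linspace(0.0, 1.25, 26)[1:]
        psis = [np.sum(np.abs(Qfun(theta - lam * step)) ** p) for lam in lams]
        theta = theta - lams[int(np.argmin(psis))] * step
    return theta

# example: Q(x_t, y_t, theta) = y_t - theta_1 - theta_2 x_t (regression case)
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 100)
y = 2.0 + 0.7 * x + rng.standard_t(df=3, size=100)
Qfun = lambda th: y - th[0] - th[1] * x
Dfun = lambda th: np.column_stack([-np.ones_like(x), -x])
print(lp_algorithm(Qfun, Dfun, np.array([0.0, 0.0])))

The grid over λ deliberately includes values greater than 1, in the spirit of the extension of Gallant's step-length rule discussed in Section 3.3.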
3.3 Convergence of The Algorithm

The literature is not lacking in proofs of convergence of iterative techniques that belong to the class of gradient methods with a positive definite direction matrix. For a specified direction of search and step length choice, and in the context of minimizing a sum of squares, Ortega and Rheinboldt [15] and Gallant [8] give proofs of convergence without requiring the existence of the second partial derivatives of Ψ_n(θ) (in their case p is 2). Gallant [8] chooses λ_r such that

   Ψ_n(θ_{r+1}) = min{ Ψ_n(θ) | θ = θ_r − λ P(θ_r) for λ ∈ [0, 1] }

and θ_{r+1} is in a closed convex set V. The initial guess θ_0 is in V.

On various examples we tried to reduce the number of iterations necessary to locate the minimum of Ψ_n(θ) by considering values of λ greater than 1. We extended the set V to include these values. Our proof of convergence is similar to Gallant's [8]. This proof makes no use of the existence of the second partials of Ψ_n(θ). Recall that for 1 < p < 2, if Q(x_t, y_t, θ) = 0 for some t = 1, 2, ..., n, the second partials of Ψ_n(θ) may become infinite.
Theorem 3.2

Let Assumptions 2, 4, 6, 6a, and 13 be satisfied and let {θ_r}_{r=1}^∞ be the sequence generated by the algorithm; then

1. θ_r is an interior point of Ω and is in V for r = 1, 2, ...;

2. the sequence {θ_r}_{r=1}^∞ converges to θ̄; and

3. θ̄ is in the interior of V and (∂Ψ_n(θ̄)/∂θ) = 0.

Proof

Since θ_0 is an interior point in V ⊂ Ω, Assumptions 4 and 13 and the fact that P_0(θ_0) ≠ 0 imply, through Lemma 3.2, that there is a λ' > 0 such that Ψ_n(θ) < Ψ_n(θ_0) for all θ in the segment [θ_0, θ_0 − λ'P_0(θ_0)] and such that this segment is in V. Recall that V is a closed set. Hence, the set V_0 = [θ_0, θ_0 − λ''P_0(θ_0)], for some λ'' < min[λ', 1+ε], is compact, and Ψ_n(θ) is real valued continuous in θ. Since V_0 is compact, there exists a θ' in V_0 such that

   Ψ_n(θ') = min_{θ∈V_0} Ψ_n(θ) .

By construction θ' ∈ V. Note that θ' = θ_0 − λ_0 P_0(θ_0) for some λ_0, and Ψ_n(θ') < Ψ_n(θ_0). Thus, let θ_1 = θ'. By Assumption 6a, Ψ_n(θ_1) < Ψ_n(θ̃), where θ̃ is any boundary point in V. Hence θ_1 is an interior point in V. Note that in the case where P_0(θ_0) = 0, θ_r = θ_0 for all r = 1, 2, ....

It was just shown that θ_1 is in the interior of V. Substitute θ_1 for θ_0 in the preceding paragraph and similarly generate θ_2, θ_3, ..., where each θ_r is an interior point in V. Hence, conclusion 1 follows.

The sequence {Ψ_n(θ_r)}_{r=1}^∞ is monotone non-increasing with non-negative terms. Hence, Ψ_n(θ_r) → Ψ_n* as r → ∞. Since {θ_r} is in a closed, bounded set, it contains a convergent subsequence {θ_s}, θ_s → θ̄ as s → ∞. By continuity of Ψ_n(θ) we conclude Ψ_n(θ_s) → Ψ_n(θ̄) as s → ∞, and hence Ψ_n(θ̄) = Ψ_n*; θ̄ is in the interior of V.

The function

   P(θ) = (D'WD)_n^{-1}(θ)(D'WQ)_n(θ)

is continuous in θ by Assumptions 2 and 4. Since (D'WD)_n is positive definite for θ in V by Assumption 13, then

   lim_{s→∞} P(θ_s) = lim_{s→∞} [(D'WD)_n^{-1}(θ_s)(D'WQ)_n(θ_s)] = (D'WD)_n^{-1}(θ̄)(D'WQ)_n(θ̄) = P(θ̄) .

Since θ̄ is in the interior of Ω and Assumptions 4 and 13 hold, by the conclusion of Lemma 3.2 we can find values λ̄ and η*, 0 < η* < λ̄ < λ'', depending on the sequence {(x_t, y_t)}_{t=1}^n, such that

   Ψ_n(θ̄ − η* P(θ̄)) < Ψ_n(θ̄) − γ ,   γ > 0 ,

if P(θ̄) ≠ 0. Let s be sufficiently large; by continuity of P(θ) and convergence of θ_s to θ̄ as s → ∞, there exists an e, 0 < e < γ, such that

   |Ψ_n(θ_s − η* P(θ_s)) − Ψ_n(θ̄ − η* P(θ̄))| < e < γ .

Recall that Ψ_n(θ_{s+1}) = min_λ Ψ_n(θ_s − λ P(θ_s)). Thus,

   Ψ_n(θ_{s+1}) ≤ Ψ_n(θ_s − η* P(θ_s)) < Ψ_n(θ̄) − (γ − e) < Ψ_n(θ̄)

for all s sufficiently large. This conclusion contradicts the fact previously derived that Ψ_n(θ_s) → Ψ_n(θ̄) as s → ∞. Thus P(θ̄) = 0, that is, (D'WD)_n^{-1}(θ̄)(D'WQ)_n(θ̄) = 0, which implies

   (D'WQ)_n(θ̄) = 0 ,   i.e.,   (∂Ψ_n(θ̄)/∂θ) = 0 .

Thus we have that lim_{r→∞} Ψ_n(θ_r) = Ψ_n(θ̄), where θ̄ is a stationary point of Ψ_n(θ). We will now show that the sequence {θ_r}_{r=1}^∞ also converges to this same θ̄. Consider any convergent subsequence of {θ_r}_{r=1}^∞; let it be {θ_{s'}}. Then θ_{s'} → θ' in Ω, and by continuity of Ψ_n(θ), Ψ_n(θ_{s'}) → Ψ_n(θ'). By convergence of {Ψ_n(θ_r)}, we obtain that Ψ_n(θ') = Ψ_n* = Ψ_n(θ̄) for θ' the limit of any convergent subsequence. In the same manner as was concluded for θ̄, we must conclude that θ' is a stationary point of Ψ_n(θ). If Assumption 6 holds this cannot occur unless θ' = θ̄, and thus {θ_r}_{r=1}^∞ → θ̄. ∎
Chapter 7 contains numerical illustrations of the performance of the algorithm under different models and different choices of p. In some cases the value θ' generated is compared to the minimizing value obtained from other computing techniques. In the actual implementation of the algorithm an approximating process was used instead of the minimization required for each choice of λ_r.
4. GENERAL RESULTS

4.1 Introduction

In this chapter it will be shown that the sequence of estimators {θ̂_n}_{n=1}^∞ converges with probability one to a point θ^0 in Ω, where θ^0 is defined in Assumption 7. That is, θ^0 is the value in Ω satisfying

   E_θ* ρ(x, y, θ^0) < E_θ* ρ(x, y, θ)

for any θ ≠ θ^0 in Ω. Under two different sets of sufficient conditions, the asymptotic distribution of √n(θ̂_n − θ^0) is derived. This is the k-variate normal with mean 0 and variance-covariance matrix

   G^{-1}(θ^0) A(θ^0) G^{-1}(θ^0) .

When p ≥ 2, asymptotic normality follows from a set of sufficient conditions which include the existence and finiteness of the second partials of ρ(x, y, θ) at all points (x, y, θ) in S × Ω. This turns out to be a stronger condition than is necessary for normality to follow. Thus, for the choice of 1 < p < 2, when this latter assumption on the second partials of ρ(x, y, θ) may not be satisfied, it is possible to formulate a set of weaker assumptions. Of course, this weaker set applies for all values of p > 1; yet both approaches are presented, since for a particular model and a choice of p ≥ 2 the stronger set of conditions is much easier to verify.

The predictor for the endogenous variable y and its asymptotic properties are presented in Section 4.4.
Recall that two ways of viewing the sequence of inputs {x_t}_{t=1}^∞ were presented in Chapter 2; both were required to guarantee that Assumptions 14 and 15 be satisfied. Under the first approach, {x_t}_{t=1}^∞ is a realization of a sequence of random variables and {(x_t, y_t)}_{t=1}^∞ denotes independent and identically distributed vectors (x_t, y_t). As was previously noted, in this case Assumption 14 is guaranteed by the conclusion of the Strong Law of Large Numbers, and the Multivariate Central Limit Theorem implies Assumption 15. Alternatively, {x_t}_{t=1}^∞ was considered as a sequence of constants. Propositions 4.1 and 4.2 show that such a sequence can be generated so that Assumptions 14 and 15 hold and so that there exists a measure u_n(A) on (X, G_x) converging weakly to a probability measure μ_X(A) on (X, G_x).

For convenience we repeat Assumptions 14 and 15.

Assumption 14

Let g(x, y, θ) be a real valued B measurable function for each θ in Ω, such that E_θ* g(x, y, θ) exists; then

   (1/n) Σ_{t=1}^n g(x_t, y_t, θ) → E_θ* g(x, y, θ)

with probability one as n → ∞.
Assumption 15

Let g(x, y, θ) be a k × 1 vector such that g_i(x, y, θ) is a real valued B measurable function for all θ in Ω and all i = 1, 2, ..., k; then

   (1/√n) Σ_{t=1}^n (g(x_t, y_t, θ) − E_θ* g(x, y, θ)) →_L N_k(0, Σ) ,

where Σ is the (k × k) matrix with (i, j) element

   σ_{ij} = E_θ*(g_i g_j) − E_θ*(g_i) E_θ*(g_j) .
Proposition 4.1

Let the sequence {x_t}_{t=1}^∞ be generated as follows:

   x_{(i−1)r+h} = x_h   for i = 1, 2, ..., m and h = 1, 2, ..., r ,

with n = mr, m = 1, 2, .... In addition, let X be a Borel subset of R^m and let G_x be the collection of m-dimensional Borel subsets of X. Let

   u_n(A) = (1/n) Σ_{t=1}^n I_A(x_t)

for any A in G_x and x_t in {x_t}_{t=1}^∞; then

   u_n(A) = μ(A) ,

where μ is a probability measure concentrated on the points {x_1, ..., x_r}.

Proof

Consider u_n(A) = (1/n) Σ_{t=1}^n I_A(x_t) for any A in G_x. Then u_n(A) is a probability measure on (X, G_x), and

   (1/n) Σ_{t=1}^n I_A(x_t) = (1/r) Σ_{h=1}^r (1/m) Σ_{i=1}^m I_A(x_{(i−1)r+h}) = (1/r) Σ_{h=1}^r I_A(x_h) .

Hence, u_n(A) = μ(A) = (1/r) Σ_{h=1}^r I_A(x_h) for each n = mr and A in G_x, and the conclusion follows. ∎
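A small numerical check of this construction follows (illustrative code; the particular design points and the test set A are my own):

import numpy as np

design = np.array([0.0, 0.5, 1.0, 2.0])           # x_1, ..., x_r with r = 4
m = 25
xs = np.tile(design, m)                           # x_{(i-1)r+h} = x_h, n = mr

A = (0.25, 1.5)                                   # an interval A in G_x
u_n = np.mean((xs > A[0]) & (xs < A[1]))          # empirical measure u_n(A)
mu = np.mean((design > A[0]) & (design < A[1]))   # mu(A) = (1/r) sum_h I_A(x_h)
print(u_n, mu)                                    # identical for every n = mr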
Proposition 4.2

Let {x_t}_{t=1}^∞ be generated as described in Proposition 4.1. For each x_t let y_t be a random variable with distribution function F_{y(x_t,θ*)} for all t = 1, 2, ..., and let the y_t be independent. Let g be a (k × 1) real valued B measurable function such that E(g_i | x) exists and E(g_i² | x) < ∞ for i = 1, 2, ..., k. Then Assumptions 14 and 15 are satisfied.

Proof

For each n = mr consider the sum

   (1/n) Σ_{t=1}^n g(x_t, y_t, θ) = (1/r) Σ_{h=1}^r (1/m) Σ_{i=1}^m g(x_{(i−1)r+h}, y_{(i−1)r+h}, θ) .

Since for each h = 1, 2, ..., r the {y_{(i−1)r+h}}_{i=1}^m are independent and identically distributed, the Strong Law of Large Numbers implies

   (1/m) Σ_{i=1}^m g(x_h, y_{(i−1)r+h}, θ) → E(g | x_h)

with probability one as m → ∞. Hence, with probability one,

   (1/n) Σ_{t=1}^n g(x_t, y_t, θ) → (1/r) Σ_{h=1}^r E(g | x_h) = E_θ* g(x, y, θ)

as n → ∞. Similarly, by the Multivariate Central Limit Theorem, in Rao [16, p. 128], for each h,

   Z_h^m = (1/√m) Σ_{i=1}^m (g(x_h, y_{(i−1)r+h}, θ) − E(g | x_h)) →_L Z_h ~ N_k(0, Σ_h) ,

where Σ_h is the (k × k) matrix with (i, j) term given by the conditional covariance of g_i and g_j at x_h. The characteristic function of (1/√r) Σ_{h=1}^r Z_h^m is

   C_m(u) = E exp{ i u^T (1/√r) Σ_{h=1}^r Z_h^m } = Π_{h=1}^r E exp{ i u^T Z_h^m / √r } ,

since the {Z_h^m}_{h=1}^r are independent. Hence,

   lim_{m→∞} C_m(u) = Π_{h=1}^r lim_{m→∞} E exp{ i u^T Z_h^m / √r } = Π_{h=1}^r exp{ −u^T (1/r) Σ_h u / 2 } ,

which is the characteristic function of a k-variate normal with mean 0 and variance-covariance matrix

   Σ = (1/r) Σ_{h=1}^r Σ_h ,

where the (i, j) element of Σ is (1/r) Σ_{h=1}^r [Σ_h]_{ij}. Hence Assumption 15 is satisfied, and the expectation is taken with respect to P_θ*. ∎
4.2 Convergence With Probability One

Lemma 4.3

Let Assumption 2 be satisfied; then for each θ in Ω,

   lim_{δ→0} E_θ* inf_{‖θ'−θ‖<δ} ρ(x, y, θ') = E_θ* ρ(x, y, θ) .

Proof

Let {δ_{n_k}} be any convergent subsequence such that δ_{n_k} → 0 as k → ∞, and let

   z_{n_k} = inf_{‖θ'−θ‖<δ_{n_k}} ρ(x, y, θ') .

Then z_{n_k} increases monotonically as δ_{n_k} decreases. By continuity of ρ in θ (Assumption 2), and by the Monotone Convergence Theorem,

   lim_{k→∞} E_θ* z_{n_k} = E_θ* ρ(x, y, θ) .

Since this is true for any convergent subsequence,

   lim_{δ→0} E_θ* inf_{‖θ'−θ‖<δ} ρ(x, y, θ') = E_θ* ρ(x, y, θ)

for each θ in Ω. ∎

The conclusion of Lemma 4.3 is used in the following proof, which follows the ideas in Theorem 1 of Huber [10].
Theorem 4.1

Let the conclusion of Lemma 4.3 be satisfied, and let Assumptions 1, 2, 7, 8, and 14 hold; then with probability one θ̂_n → θ^0 as n → ∞, where {θ̂_n}_{n=1}^∞ is the sequence of estimators and θ^0 is defined in Assumption 7.

Proof

Let A(θ^0, δ_0) ⊂ int(Ω). By compactness of Ω, the set Ω\A(θ^0, δ_0) is a compact subset of R^k. Assumptions 2 and 8 imply continuity over Ω of E_θ* ρ(x, y, θ), and by the conclusion of Lemma 4.3, for any e > 0 there exists a δ(θ) > 0 such that for θ in Ω\A(θ^0, δ_0),

   | E_θ* inf_{‖θ'−θ‖<δ(θ)} ρ(x, y, θ') − E_θ* ρ(x, y, θ) | < e .   (4.1)

Let A(θ, δ(θ)) be {θ' | ‖θ' − θ‖ < δ(θ)}; then the A(θ, δ(θ)) form a covering of Ω\A(θ^0, δ_0) and there exists a finite set {θ_1, θ_2, ..., θ_R} such that ∪_{i=1}^R A(θ_i, δ(θ_i)) is a finite covering of Ω\A(θ^0, δ_0). Given any θ in Ω\A(θ^0, δ_0), then θ is in A(θ_i, δ(θ_i)) for some i = 1, 2, ..., R, and by relation (4.1),

   E_θ* inf_{θ'∈A(θ_i,δ(θ_i))} ρ(x, y, θ') > E_θ* ρ(x, y, θ_i) − e

for all i. Let e > 0 be chosen so that

   E_θ* ρ(x, y, θ_i) − E_θ* ρ(x, y, θ^0) > 3e   for all i = 1, 2, ..., R ,

where θ^0 is the value in int(Ω) such that E_θ* ρ(x, y, θ^0) < E_θ* ρ(x, y, θ) for any θ ≠ θ^0 in Ω. Then it follows that

   E_θ* inf_{θ'∈A(θ_i,δ(θ_i))} ρ(x, y, θ') > E_θ* ρ(x, y, θ^0) + 2e

for all i = 1, 2, ..., R.

By the conclusion of Lemma 4.3, E_θ* inf_{θ'∈A(θ,δ)} ρ(x, y, θ') → E_θ* ρ(x, y, θ) as δ → 0 for θ in Ω, and by Assumptions 8 and 14 there exists an n_0^1, depending on the sequence {(x_t, y_t)}_{t=1}^∞, such that with probability one for all n ≥ n_0^1 and all i = 1, 2, ..., R,

   (1/n) Σ_{t=1}^n inf_{θ'∈A(θ_i,δ(θ_i))} ρ(x_t, y_t, θ') ≥ E_θ* inf_{θ'∈A(θ_i,δ(θ_i))} ρ(x, y, θ') − e .

Hence, with probability one, for all n ≥ n_0^1, any θ in Ω\A(θ^0, δ_0), and some i,

   (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ) ≥ (1/n) Σ_{t=1}^n inf_{θ'∈A(θ_i,δ(θ_i))} ρ(x_t, y_t, θ') > E_θ* ρ(x, y, θ^0) + e .   (4.2)

Similarly, with probability one for all n ≥ n_0^2,

   (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ^0) ≤ E_θ* ρ(x, y, θ^0) + e .

Hence, with probability one, for n ≥ n_0 = max(n_0^1, n_0^2) and by relation (4.2),

   (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ^0) < (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ)

for every θ in Ω\A(θ^0, δ_0). Thus, since θ̂_n satisfies Ψ_n(θ̂_n) = min_{θ∈Ω} Ψ_n(θ), θ̂_n is a value in A(θ^0, δ_0) with probability one for all n ≥ n_0. ∎

The previous theorem relies on the existence of a unique value θ^0 in the interior of Ω such that

   E_θ* ρ(x, y, θ^0) < E_θ* ρ(x, y, θ)
for θ in Ω, θ ≠ θ^0. Corollary 4.1 extends this result to the case when there is more than one value in Ω for which E_θ* ρ(x, y, θ) attains its minimum.

Corollary 4.1

Let the conditions of Theorem 4.1 be satisfied, where Assumption 7 is substituted by the following:

   Γ = { θ ∈ Ω | E_θ* ρ(x, y, θ) = inf_{θ'∈Ω} E_θ* ρ(x, y, θ') } ,

and θ in Γ implies θ is in int(Ω). Then, with probability one, for all n ≥ n_0, where n_0 depends on the sequence {(x_t, y_t)}_{t=1}^∞, the estimator θ̂_n is in any neighborhood of Γ.

Proof

Γ is a compact subset of Ω. Consider the spheres

   A(θ, δ) = { θ' | ‖θ' − θ‖ < δ }

for some δ > 0 and θ in Γ. Then there exists a finite union of them which covers the set Γ, say

   Γ ⊂ ∪_{i=1}^R A(θ_i, δ) .

The set Ω \ ∪_{i=1}^R A(θ_i, δ) is also compact. Similarly to the proof of Theorem 4.1, with probability one for all n ≥ n_0,

   inf_{θ ∈ Ω\∪A(θ_i,δ)} (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ) ≥ E_θ* ρ(x, y, θ') + 2e

and

   (1/n) Σ_{t=1}^n ρ(x_t, y_t, θ') ≤ E_θ* ρ(x, y, θ') + e

for all θ' in Γ. Hence, with probability one for all n ≥ n_0 and all θ' in Γ,

   Ψ_n(θ̂_n) ≤ Ψ_n(θ') ,

and therefore θ̂_n is in ∪_{i=1}^R A(θ_i, δ). ∎

If θ̂_n is in the interior of Ω for each n = 1, 2, ..., then

   (∂Ψ_n(θ)/∂θ) |_{θ=θ̂_n} = 0 .

Huber [10] gives sufficient conditions such that any sequence {θ_n'}_{n=1}^∞ satisfying the above equation converges with probability one to a point θ^0 in Ω. Instead of Assumption 7, an equivalent assumption is for θ^0 to be the unique value in Ω satisfying

   E_θ* (∂ρ(x, y, θ^0)/∂θ) = 0 .

This result is presented in Theorem 4.2.
Theorem 4.2

Let Assumptions 1, 2, 4, 9, and 14 be satisfied, and let θ^0 in Ω be the unique solution to

   E_θ* (∂ρ(x, y, θ)/∂θ) = 0 ,

where (∂ρ(x, y, θ)/∂θ) is integrable for all θ in Ω; then any sequence of measurable functions {θ_n'}_{n=1}^∞, where

   (∂Ψ_n(θ_n')/∂θ) = 0 ,

converges to θ^0 with probability one as n → ∞.

Proof

It can be shown that this is a special case of Theorem 3 in Huber [10] when Assumption 14 substitutes for the result implied by the Strong Law of Large Numbers. ∎
4.3 Asymptotic Normality of θ̂_n

Throughout this section θ^0 denotes the parameter value in the interior of Ω such that θ̂_n → θ^0 with probability one as n → ∞. Theorem 4.3 gives sufficient conditions for the asymptotic normality of √n(θ̂_n − θ^0). The same conclusion follows from Theorem 4.4 under a set of stronger assumptions.
Theorem 4.3

Let Assumptions 2, 3, 4, 5, 7, 10, 11, 12, 14, and 15 be satisfied, and let

   | ∂ρ(x,y,θ)/∂θ_i − ∂ρ(x,y,θ^0)/∂θ_i − Σ_{j=1}^k (∂²ρ(x,y,θ^0)/∂θ_j∂θ_i)(θ_j − θ_j^0) | ≤ m^i(x, y, δ) ‖θ − θ^0‖   (4.3)

for all i = 1, 2, ..., k, all θ ∈ A(θ^0, δ), and each 0 < δ < e_i, where m^i(x, y, δ) is a B measurable function for each 0 < δ < e_i such that as n → ∞

   (1/n) Σ_{t=1}^n m^i(x_t, y_t, δ) → M(δ)

for almost all realizations {(x_t, y_t)}_{t=1}^∞, and M(δ) → 0 as δ → 0. Then

   √n (θ̂_n − θ^0) →_L N_k(0, G^{-1}(θ^0) A(θ^0) G^{-1}(θ^0)) .

Proof

Assumptions 2 and 4 insure the existence and measurability of the first partials of ρ(x, y, θ) in a neighborhood of θ^0. Note that Assumptions 3 and 5 imply the existence of the second partials of ρ(x, y, θ) at θ = θ^0, except on a set of measure zero. Since θ^0 is an interior point of Ω, A(θ^0, δ) ⊂ int(Ω) for all 0 < δ < e_1 and some e_1 > 0. Convergence of θ̂_n to θ^0 with probability one implies that for all n sufficiently large θ̂_n is in a neighborhood of θ^0, say in A(θ^0, δ), 0 < δ < e_1, almost surely. Hence, at θ = θ̂_n the first partials of Ψ_n vanish, and, expanding them about θ^0 and using relation (4.3), it follows that, in matrix notation,

   0 = (1/√n) Σ_{t=1}^n (∂ρ(x_t, y_t, θ^0)/∂θ) + [ (1/n) G_n(θ^0) + (1/n) M_n(δ) ] √n (θ̂_n − θ^0) ,

where (1/n) M_n(δ) is a k × k remainder matrix with (i, j) element satisfying

   | (1/n) [M_n(δ)]_{i,j} | ≤ (1/n) Σ_{t=1}^n m^i(x_t, y_t, δ) ,

since ‖θ̂_n − θ^0‖ bounds each component |θ̂_n^j − θ_j^0|. Consider this remainder term. By relation (4.3), for any γ > 0 let 0 < δ < e' be such that M(δ) < γ/2. Then for all n sufficiently large θ̂_n is in A(θ^0, δ) with probability one, and for almost all realizations {(x_t, y_t)}_{t=1}^∞,

   | (1/n) [M_n(δ)]_{i,j} | ≤ (1/n) Σ_{t=1}^n m^i(x_t, y_t, δ) < M(δ) + γ/2 < γ

for i, j = 1, 2, ..., k.

Assumptions 7, 11, and 15 imply

   (1/√n) Σ_{t=1}^n (∂ρ(x_t, y_t, θ^0)/∂θ) →_L N_k(0, A(θ^0)) ,

where A(θ^0) has (i, j) element

   E_θ*( ∂ρ(x, y, θ^0)/∂θ_i · ∂ρ(x, y, θ^0)/∂θ_j )

for i, j = 1, 2, ..., k. By Assumptions 10 and 14,

   (1/n) G_n(θ^0) → G(θ^0)

with probability one as n → ∞, where (1/n) G_n(θ^0) has (i, j) element (1/n) Σ_{t=1}^n ∂²ρ(x_t, y_t, θ^0)/∂θ_i∂θ_j and G(θ^0) has (i, j) element E_θ*(∂²ρ(x, y, θ^0)/∂θ_i∂θ_j). Hence, by Slutsky's Theorem,

   ( (1/n) M_n(δ) + (1/n) G_n(θ^0) ) →_P G(θ^0) .

Similarly, the inverse of a matrix is a continuous function of its elements, and hence, since det(G(θ^0)) ≠ 0 by Assumption 12,

   ( (1/n) M_n(δ) + (1/n) G_n(θ^0) )^{-1} →_P G^{-1}(θ^0) .

It follows that

   √n (θ̂_n − θ^0) = −( (1/n) M_n(δ) + (1/n) G_n(θ^0) )^{-1} (1/√n) Σ_{t=1}^n (∂ρ(x_t, y_t, θ^0)/∂θ) + Z_n(δ) ,

where Z_n(δ) →_P 0. These results imply the conclusion of the theorem, and

   √n (θ̂_n − θ^0) →_L N_k(0, G^{-1}(θ^0) A(θ^0) G^{-1}(θ^0)) . ∎
In the following theorem, we assume a stronger set of assumptions for deriving the asymptotic normality of $\sqrt{n}(\hat{\theta}_n - \theta^0)$.

Theorem 4.4

Let Assumption 10 hold in a neighborhood of $\theta^0$, let Assumptions 2, 4, 5, 7, 11, 12, 14, and 15 be satisfied, and let

$$\sup_{\theta \in A(\theta^0, \delta)} \left| \frac{\partial^2 \rho(x, y, \theta)}{\partial \theta_i \partial \theta_j} - \frac{\partial^2 \rho(x, y, \theta^0)}{\partial \theta_i \partial \theta_j} \right| \leq h_{i,j}(x, y, \delta), \tag{4.4}$$

where $h_{i,j}(x, y, \delta)$ is a $B$ measurable function for each $0 < \delta < \varepsilon$, such that for all $i, j$

$$E_{\theta^*}\, h_{i,j}(x, y, \delta) \to 0 \quad \text{as } \delta \to 0 .$$

Then the conclusion of Theorem 4.3 follows.
Proof

Assumptions 2, 4, and 5 guarantee continuity of the second partials of $\rho(x, y, \theta)$ with respect to $\theta$. Relation (4.4) implies that $\partial^2 \rho(x, y, \theta) / \partial \theta_i \partial \theta_j$ is finite valued, except perhaps on a set of measure zero not depending on $\theta$. Note that this is the condition violated by the case when $1 < p < 2$.

Representing the partials of $\rho$ by a first order Taylor's approximation around $\theta^0$, then for $\theta$ in $A(\theta^0, \delta)$ and each $i = 1, 2, \ldots, k$, we obtain

$$\frac{\partial \rho(x, y, \theta)}{\partial \theta_i} = \frac{\partial \rho(x, y, \theta^0)}{\partial \theta_i} + \sum_{j=1}^{k} \frac{\partial^2 \rho(x, y, \bar{\theta})}{\partial \theta_j \partial \theta_i} (\theta_j - \theta_j^0),$$

where $\bar{\theta} = \theta^0 + \lambda(\theta - \theta^0)$ for a $\lambda \in [0, 1]$ and $\bar{\theta} \in A(\theta^0, \delta)$. Jennrich [12] has shown that there is a $B$ measurable function of $(x, y)$ corresponding to $\bar{\theta}$. Hence,

$$\left| \frac{\partial \rho(x, y, \theta)}{\partial \theta_i} - \frac{\partial \rho(x, y, \theta^0)}{\partial \theta_i} - \sum_{j=1}^{k} \frac{\partial^2 \rho(x, y, \theta^0)}{\partial \theta_j \partial \theta_i} (\theta_j - \theta_j^0) \right| \leq \sum_{j=1}^{k} \left| \frac{\partial^2 \rho(x, y, \bar{\theta})}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 \rho(x, y, \theta^0)}{\partial \theta_j \partial \theta_i} \right| \, |\theta_j - \theta_j^0| .$$

Let

$$m^i(x, y, \delta) = \sum_{j=1}^{k} h_{i,j}(x, y, \delta) ;$$

then $m^i(x, y, \delta)$ is $B$ measurable for each $0 < \delta < \varepsilon$. By the above relation and relation (4.4),

$$E_{\theta^*}\, m^i(x, y, \delta) \to 0 \quad \text{as } \delta \to 0,$$

and

$$\frac{1}{n} \sum_{t=1}^{n} m^i(x_t, y_t, \delta) \to E_{\theta^*}\, m^i(x, y, \delta)$$

with probability one as $n \to \infty$, by Assumption 14. Hence, relation (4.3) in Theorem 4.3 is satisfied, and together with the other assumptions the conclusion of the theorem follows. $\Box$
The asymptotic distribution derived is in terms of the parameter value $\theta^0$ and expectations of measurable functions of $(x, y)$. We now investigate the asymptotic distribution when (a) $G(\theta)$ is evaluated at $\theta = \hat{\theta}_n$, and (b) an estimate is substituted for the variance-covariance matrix of the asymptotic distribution. The properties of this estimator are derived from the conclusion of Lemma 4.2.
Lemma 4.1

Let $h(y, \theta)$ be a measurable function of the real variable $y$ for each value of $\theta$ in $\Omega$, a compact subset of $R^k$; let $y_1, y_2, \ldots$ be a sequence of independent and identically distributed random variables; and let $\{u_n\}$ be a sequence of functions with range in $\Omega$ such that $u_n$ is measurable on the space of $y_1, y_2, \ldots, y_n$ for $n = 1, 2, \ldots$. If

1. $E^*(h(y, \theta))$ is continuous over $\Omega$, and

2. for each $\theta$ in a countable base of $\Omega$,

$$\lim_{\delta \to 0} \left( E^* \sup_{\|\theta - \theta'\| < \delta} |h(y, \theta') - h(y, \theta)| \right) = 0,$$

then with probability one, for any $\varepsilon > 0$ there is an $n_0$, depending upon the sequence $\{y_t\}_{t=1}^{\infty}$, such that for all $n \geq n_0$,

$$\left| \frac{1}{n} \sum_{t=1}^{n} h(y_t, u_n) - E^*\, h(y, u_n) \right| < \varepsilon .$$

Proof

The proof is in Mickey [14]. $\Box$
As a consequence of the above lemma, we formulate Lemma 4.2.

Lemma 4.2

Let $h(x, y, \theta)$ be a $B$ measurable function of $(x, y)$ for each $\theta$ in $\Omega$, let $\hat{\theta}_n$ be as defined in Chapter 3, let Assumptions 1 and 14 hold, and let

1. $E^*\, h(x, y, \theta)$ be continuous over $\Omega$ for each $\theta$ in $\Omega$, and

2. $\displaystyle \lim_{\delta \to 0} \left( E^* \sup_{\|\theta - \theta'\| < \delta} |h(x, y, \theta) - h(x, y, \theta')| \right) = 0$;

then, with probability one, for any $\varepsilon > 0$ there is an $n_0$, depending on the sequence $\{(x_t, y_t)\}_{t=1}^{\infty}$, such that for all $n \geq n_0$,

$$\left| \frac{1}{n} \sum_{t=1}^{n} h(x_t, y_t, \hat{\theta}_n) - E^*\, h(x, y, \hat{\theta}_n) \right| < \varepsilon .$$

Proof

By Assumption 1, $\Omega$ is a compact subset of $R^k$. The estimator $\hat{\theta}_n$ is a $B_n$ measurable function with range in $\Omega$ for each $n = 1, 2, \ldots$. The proof then is equivalent to that of Mickey [14] for Lemma 4.1, with Assumption 14 guaranteeing the conclusion of the strong Law of Large Numbers. $\Box$

Corollary 4.1
Let the conditions of Theorem 4.3 or Theorem 4.4 be satisfied, and let $G(\theta)$ be continuous at $\theta = \theta^0$; then

1. $G(\hat{\theta}_n) \to G(\theta^0)$ with probability one (or in probability), and

2. if the conclusion of Theorem 4.3 or Theorem 4.4 is satisfied, then the limiting distribution of $\sqrt{n}(\hat{\theta}_n - \theta^0)$ is unchanged when $G(\hat{\theta}_n)$ is substituted for $G(\theta^0)$.

Proof

By Theorem 4.3 or 4.4, $\sqrt{n}(\hat{\theta}_n - \theta^0)$ converges in distribution. Since $\hat{\theta}_n \to \theta^0$ with probability one and $G(\theta)$ is finite valued and continuous at $\theta^0$, by Slutsky's Theorem $G(\hat{\theta}_n) \to G(\theta^0)$ with probability one, and conclusion 1 follows. Conclusion 2 follows similarly, since the substitution does not alter the limit.

Note that the conditions of Lemma 4.2 guarantee the existence of a consistent estimator of the dispersion matrix. What is required is that, for all $n$ sufficiently large, $\hat{\theta}_n$ remains in any neighborhood of $\theta^0$ with probability one, and hence $\frac{1}{n} G_n(\hat{\theta}_n)$ is a strongly consistent estimator of $G(\theta^0)$. However, for the case of $1 < p < 2$, Condition 2 of this lemma is not satisfied for

$$\frac{\partial^2 \rho(x, y, \theta)}{\partial \theta_i \partial \theta_j} .$$

Note that the stronger conditions of Theorem 4.4 imply those of Lemma 4.2.
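As a computational aside, the strongly consistent estimator of the dispersion matrix discussed above can be assembled directly from the sample. The sketch below is our illustration, not part of the thesis: it assumes the regression form $Q(x, y, \theta) = y - f(x, \theta)$, a user-supplied `grad_f` returning the $n \times k$ matrix of partials $\partial f(x_t, \theta) / \partial \theta$, and a fixed $p > 1$, and it plugs $\hat{\theta}_n$ into sample analogues of $A(\theta^*)$ and $G(\theta^*)$.

```python
import numpy as np

def sandwich_cov(theta_hat, x, y, f, grad_f, p):
    """Sample analogue of G^{-1} A G^{-1} for the L_p estimator.

    A minimal sketch under assumed inputs (not the thesis's code):
    A is estimated from outer products of the per-observation
    gradients of |y_t - f(x_t, theta)|^p, and G from the leading
    term of the second-derivative formula (exact when f is linear
    in theta; the term in the second partials of f is omitted).
    """
    n = len(y)
    r = y - f(x, theta_hat)                     # residuals at theta_hat
    gf = grad_f(x, theta_hat)                   # n x k matrix of df/dtheta
    # Gradient of |r_t|^p w.r.t. theta: -p |r|^{p-1} sgn(r) df/dtheta.
    score = -p * (np.abs(r) ** (p - 1) * np.sign(r))[:, None] * gf
    A_hat = score.T @ score / n
    # Leading Hessian term: p(p-1) |r|^{p-2} (df/dtheta)(df/dtheta)'.
    w = p * (p - 1) * np.abs(r) ** (p - 2)
    G_hat = (gf * w[:, None]).T @ gf / n
    G_inv = np.linalg.inv(G_hat)
    return G_inv @ A_hat @ G_inv / n            # approximate cov of theta_hat
```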
4.4 The Predictor

The random variable $y$ is usually considered as the endogenous variable determined by the model, and the vector $x$ as the non-stochastic exogenous variables. Generating predictions for $y$ requires that we can derive a function $y = f(x, \theta)$, analytically or numerically, such that the predictor $\hat{y}$ is defined as

$$\hat{y} = f(x, \hat{\theta}_n) .$$

For the general implicit model, the predictor $\hat{y}$ is that value satisfying

$$Q(x, \hat{y}, \hat{\theta}_n) = 0 .$$

The conditions under which a solution $\hat{y}$ exists are given in the Implicit Function Theorem, which we present for convenience. The proof is in [4].
Theorem 4.6 (Implicit Function Theorem)

Let $(a, b)^T$, $(a \in R^m,\ b \in R^n)$, be an interior point of a set $A$, and suppose that the function $f: A \to R^n$ satisfies the following conditions:

1. $f(a, b) = 0$,

2. $f$ is continuously differentiable in an open set $G$ containing $(a, b)^T$,

3. $\mathrm{Det}\, J[f(a, b)] \neq 0$, where $J[f(a, b)]$ has $(i, j)$th element $\partial f_i(a, b) / \partial b_j$;

then there exist intervals $M$ and $N$ and a continuous function $\phi: M \to N$ such that $y = \phi(x)$ is the only solution lying in $M \times N$ of the equation $f(x, y) = 0$. Moreover, $\phi$ is continuously differentiable in $\mathrm{int}(M)$, and

$$D\phi(x) = -\left[ D_2 f(x, \phi(x)) \right]^{-1} D_1 f(x, \phi(x)),$$

where $D\phi(x)$ is the matrix with $(i, j)$th element $\partial \phi_i(x) / \partial x_j$, $D_2 f(x, \phi(x))$ has $(i, j)$th element $\partial f_i / \partial y_j$, and $D_1 f(x, \phi(x))$ has $(i, j)$th element $\partial f_i / \partial x_j$, evaluated at $y = \phi(x)$.

In our case, if we consider $x$ as fixed, Condition 1 requires that $Q(y_0, \theta^0) = 0$ for some $y_0(x)$ in $R$, and Condition 3 requires $\partial Q / \partial y \neq 0$ at that point. Then we can solve for $y = \phi_x(\theta)$ in a neighborhood of $\theta^0$, and

$$\left( \frac{\partial \phi_x(\theta)}{\partial \theta} \right)^T = -\left[ \frac{\partial Q(\theta, \phi_x(\theta))}{\partial y} \right]^{-1} \left( \frac{\partial Q(\theta, \phi_x(\theta))}{\partial \theta} \right).$$

Hence, if the conditions of the Implicit Function Theorem hold for $Q$ given a fixed $x$, and if $n$ is sufficiently large, $\hat{y}$ exists with probability one for $\hat{\theta}_n$ in a neighborhood of $\theta^0$.
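Numerically, this is exactly how a prediction is produced for the implicit model: fix $x$, substitute $\hat{\theta}_n$, and solve $Q(x, y, \hat{\theta}_n) = 0$ for $y$. A minimal sketch (ours, with a hypothetical bracket `[y_lo, y_hi]` assumed to contain the root whose existence the theorem guarantees):

```python
from scipy.optimize import brentq

def predict(Q, x, theta_hat, y_lo, y_hi):
    """Predictor for the implicit model: the root y of Q(x, y, theta) = 0.

    Assumes Q(x, y_lo, theta_hat) and Q(x, y_hi, theta_hat) differ in
    sign, so the bracket isolates the solution that the Implicit
    Function Theorem provides near theta_0.
    """
    return brentq(lambda y: Q(x, y, theta_hat), y_lo, y_hi)
```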
Corollary 4.2

Let the conditions of Theorem 4.6 hold for the function $Q(y, \theta)$ with domain $A \subset R \times R^k$, let $y_0$ be the point in $R$ such that $Q(y_0, \theta^0) = 0$, let $\hat{\theta}_n \to \theta^0$ with probability one as $n \to \infty$, and let the conclusion of Theorem 4.3 or Theorem 4.4 be satisfied; then (a) $\hat{y}_n \to \phi_x(\theta^0)$ in probability, and (b) $\sqrt{n}(\hat{y}_n - \phi_x(\theta^0))$ is asymptotically normal with mean zero.

Proof

By Theorem 4.6 there is a neighborhood of $\theta^0$ such that $\phi_x(\theta)$ is defined, is continuously differentiable in $\theta$, and $\partial \phi_x(\theta) / \partial \theta$ exists. By the almost sure convergence of $\hat{\theta}_n$ to $\theta^0$ we can find an $n_0$ sufficiently large such that $\hat{\theta}_n$ is in this neighborhood, and the solution $\hat{y}_n$ exists, for all $n \geq n_0$ with probability one.

The function $\phi_x(\theta)$ is real valued continuous in $\theta$; hence by Slutsky's Theorem conclusion (a) follows, and $\hat{y}_n \to \phi_x(\theta^0)$ in probability as $n \to \infty$.
Existence and continuity of the partials of $\phi_x(\theta)$ allow a first order Taylor approximation of $\phi_x(\theta)$ in a neighborhood of $\theta^0$. At $\theta = \hat{\theta}_n$, then,

$$\hat{y}_n = \phi_x(\theta^0) + \left( \frac{\partial \phi_x(\bar{\theta})}{\partial \theta} \right)^T (\hat{\theta}_n - \theta^0),$$

where $\bar{\theta} = \theta^0 + \bar{\lambda}(\hat{\theta}_n - \theta^0)$ for $\bar{\lambda} \in [0, 1]$. By continuity of $\partial \phi_x(\theta) / \partial \theta$,

$$\frac{\partial \phi_x(\bar{\theta})}{\partial \theta} \to \frac{\partial \phi_x(\theta^0)}{\partial \theta}$$

in probability as $n \to \infty$. Since $\partial \phi_x(\theta) / \partial \theta$ is measurable, by the Multivariate Central Limit Theorem and the conclusion of Theorem 4.3 or Theorem 4.4 it follows that

$$\sqrt{n}\,(\hat{y}_n - \phi_x(\theta^0)) = \left( \frac{\partial \phi_x(\theta^0)}{\partial \theta} \right)^T \sqrt{n}\,(\hat{\theta}_n - \theta^0) + o_p(1),$$

and hence conclusion (b) follows. $\Box$

Note that when $Q(x, y, \theta^*) = y - \phi(x, y, \theta^*)$ reduces to the regression form, then $\hat{y}_n = f(x, \hat{\theta}_n)$; $\hat{y}_n$ is consistent if $\hat{\theta}_n \to \theta^*$, the true parameter value, and $\sqrt{n}\,(\hat{y}_n - f(x, \theta^*))$ is asymptotically normal. In the next chapter we consider this specific model in more detail.
5. THE REGRESSION MODEL

In the single equation regression model, the defining relationship among variables and parameters is given by

$$y_t - f(x_t, \theta^*) - e_t = 0 .$$

Alternatively, the model in reduced form is

$$y_t = f(x_t, \theta^*) + e_t .$$

The arguments $x_t$ and the random error $e_t$ satisfy the model specifications presented in Chapter 2, and the function $f: (\mathcal{X} \times \Omega) \to R$ is known explicitly. The unknown parameter $\theta^*$ is in the interior of $\Omega \subset R^k$. For this specific case the estimator $\hat{\theta}_n$ satisfies

$$\sum_{t=1}^{n} |y_t - f(x_t, \hat{\theta}_n)|^p = \min_{\theta \in \Omega} \sum_{t=1}^{n} |y_t - f(x_t, \theta)|^p$$

for a choice of $p \geq 1$. The almost sure convergence of $\{\hat{\theta}_n\}_{n=1}^{\infty}$ to a point $\theta^0$ in $\Omega$ and the asymptotic distribution of $\sqrt{n}(\hat{\theta}_n - \theta^0)$ were derived in Chapter 4. Under conditions imposed on the distribution function of $e_t$, it will be shown that some of the major assumptions are satisfied by the regression model. Furthermore, the estimator $\hat{\theta}_n$ is strongly consistent, and $\sqrt{n}(\hat{\theta}_n - \theta^*)$ is asymptotically normal with mean zero and variance covariance matrix $G^{-1}(\theta^*) A(\theta^*) G^{-1}(\theta^*)$.
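For concreteness, this criterion can be minimized directly with a general-purpose optimizer. The sketch below is ours and is not the thesis's algorithm of Chapter 3 (a descent method with the step-length rule described in Chapter 7); it simply hands the objective $\sum_t |y_t - f(x_t, \theta)|^p$ to a derivative-free routine, which is convenient because the objective is not differentiable everywhere when $p \leq 2$.

```python
import numpy as np
from scipy.optimize import minimize

def lp_fit(f, x, y, theta0, p=1.5):
    """Minimize the sum of absolute values of residuals of order p.

    f(x, theta) is the explicit regression function and theta0 a
    starting value; returns the estimate and the attained objective.
    """
    obj = lambda theta: np.sum(np.abs(y - f(x, theta)) ** p)
    res = minimize(obj, theta0, method="Nelder-Mead")
    return res.x, res.fun
```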
5.1 Strong Consistency of $\hat{\theta}_n$

Lemma 5.1

Consider the model

$$y_t = f(x_t, \theta^*) + e_t,$$

and let Assumptions 1, 2, 4, 8, 16 and 17 be satisfied; then

1. For a given choice of $p \geq 1$, there exists a value $\theta^0$ in $\Omega$ such that

$$E_{\theta^*} |Q(x, y, \theta^0)|^p \leq E_{\theta^*} |Q(x, y, \theta)|^p \quad \text{for } \theta \text{ in } \Omega .$$

2. In particular, for $p = 1$, then $\theta^0 = \theta^*$ and

$$E_{\theta^*} |y - f(x, \theta^*)| < E_{\theta^*} |y - f(x, \theta)| \quad \text{for } \theta \neq \theta^* \text{ in } \Omega .$$

3. If, in addition, Assumption 18 is satisfied, then for all $p > 1$, $\theta^0 = \theta^*$ and the strict inequality above holds for $\theta \neq \theta^*$ in $\Omega$.
Proof

Firstly consider the case for $p = 1$. The model is given by

$$y = f(x, \theta^*) + e .$$

By Assumptions 2 and 8, $E_{\theta^*} |Q(x, y, \theta)|$ is a finite valued continuous function of $\theta$ over $\Omega$. Since $\Omega$ is compact, by Assumption 1, conclusion 1 follows; that is, there is a value $\theta^0$ in $\Omega$ (depending on $p$) such that

$$E_{\theta^*} |Q(x, y, \theta^0)| \leq E_{\theta^*} |Q(x, y, \theta)| \quad \text{for all } \theta \text{ in } \Omega . \tag{5.1}$$

We know that

$$\min_{\theta \in \Omega} E_{\theta^*} \big| y - f(x, \theta^*) - (f(x, \theta) - f(x, \theta^*)) \big| \geq E \min_{\theta \in \Omega} E\big( \big| y - f(x, \theta^*) - (f(x, \theta) - f(x, \theta^*)) \big| \,\big|\, x \big), \tag{5.2}$$

and, by Assumption 16, the conditional expectation on the right attains its minimum at $\theta = \theta^*$ for all $x$ in $\mathcal{X}$. Hence, taking $\theta = \theta^*$ in particular, relations (5.1) and (5.2) imply

$$E_{\theta^*} |Q(x, y, \theta^0)| = E_{\theta^*} |y - f(x, \theta^*)|,$$

and therefore

$$\min_{\theta \in \Omega} E_{\theta^*} \big| y - f(x, \theta^*) - (f(x, \theta) - f(x, \theta^*)) \big|$$

is attained at $\theta = \theta^*$. Suppose $\theta^0 \neq \theta^*$. By Assumption 17, $\mu(C(\theta^0)) > 0$ implies a strict increase in the expectation above, a contradiction. Hence $\theta^0 = \theta^*$ is the unique value in $\Omega$ such that

$$\min_{\theta \in \Omega} E_{\theta^*} |Q(x, y, \theta)| = E_{\theta^*} |y - f(x, \theta^*)|,$$

and conclusion 2 follows.
The case of $p > 1$. For each $x$, let

$$\Delta_x(\theta) = f(x, \theta) - f(x, \theta^*) .$$

Then $\Delta_x(\theta): \Omega \to D$, where $D \subset R$. By Assumptions 2 and 18, $D$ is a compact connected subset of $R$. For each $x$, consider

$$E_x(\Delta) = E\big( |y - f(x, \theta^*) - \Delta|^p \,\big|\, x \big)$$

as a function of $y$ and $\Delta$. Assumptions 4 and 8 guarantee that we can interchange the order of integration and differentiation, and hence, differentiating with respect to $\Delta$,

$$\frac{\partial E\big( |y - f(x, \theta^*) - \Delta|^p \,\big|\, x \big)}{\partial \Delta} = E\left( \frac{\partial |y - f(x, \theta^*) - \Delta|^p}{\partial \Delta} \,\Bigg|\, x \right),$$

where

$$\frac{\partial |y - f(x, \theta^*) - \Delta|^p}{\partial \Delta} = \begin{cases} -p\,\big(y - f(x, \theta^*) - \Delta\big)^{p-1}, & y - f(x, \theta^*) > \Delta \\ 0, & y - f(x, \theta^*) = \Delta \\ p\,\big(\Delta - (y - f(x, \theta^*))\big)^{p-1}, & y - f(x, \theta^*) < \Delta . \end{cases}$$

Note that $|y - f(x, \theta^*) - \Delta|^p$ is a convex function of $\Delta$ for each $y$; hence $E_x(\Delta)$ is a convex function of $\Delta$, $\partial E_x(\Delta)/\partial \Delta$ is continuous in $\Delta$, and, by compactness and connectedness of $D$, $E_x(\Delta)$ attains its minimum at some value of $\Delta$ in $D$ (depending on $p > 1$). This and Assumption 16 imply

$$\frac{\partial E_x(\Delta)}{\partial \Delta} \;\; \begin{cases} < 0, & \Delta < 0 \\ = 0, & \Delta = 0 \\ > 0, & \Delta > 0 . \end{cases}$$
We know that the minimum over $\theta$ is attained at $\theta = \theta^*$, that is, at $\Delta = 0$, for all $x$; hence $\Delta = 0$ is in the interior of $D$ and

$$E_x(0) < E_x(\Delta)$$

for any $\Delta \neq 0$ in $\mathrm{int}(D)$. The function $E_x(\Delta)$ is continuous in $\Delta$. Consider any convergent sequence $\{\Delta_n\}$ in $D$ with $\Delta_n \neq 0$, and let $\Delta_1' < 0$ and $\Delta_2' > 0$ denote the boundary points of $D$. For any convergent subsequence $\Delta_{n_k}$, either $\Delta_{n_k} \to \Delta_1'$ or $\Delta_{n_k} \to \Delta_2'$, and thus

$$E_x(0) < E_x(\Delta_{n_k}) \to E_x(\Delta_1') \qquad \text{and} \qquad E_x(0) < E_x(\Delta_{n_k}) \to E_x(\Delta_2') .$$

Hence $\Delta = 0$ is the unique value of $\Delta$ in $D$ minimizing $E_x(\Delta)$ for $p > 1$; that is, $E_x(\Delta)$ attains its minimum at $\Delta = 0$. Note that the value $\Delta = 0$ is not a function of $x$ nor of the choice of $p$.

We will now show that $\theta^0 = \theta^*$ is the only value in $\Omega$ minimizing

$$E_{\theta^*} \big| y - f(x, \theta^*) - \Delta_x(\theta) \big|^p$$

over $\Omega$. Similar to the case of $p = 1$, suppose $\theta^0 \neq \theta^*$ and

$$E_{\theta^*} \big| y - f(x, \theta^*) - \Delta_x(\theta^0) \big|^p = E_{\theta^*} |y - f(x, \theta^*)|^p .$$

Then, by Assumption 17, $\mu_x(C(\theta^0)) > 0$, which is a contradiction unless $\Delta_x(\theta^0) = 0$, and hence $\theta^0 = \theta^*$. Thus, conclusion 3 follows. $\Box$
Theorem 5.1

Let $\hat{\theta}_n$ be the estimator of $\theta^*$, where

$$\sum_{t=1}^{n} |y_t - f(x_t, \hat{\theta}_n)|^p = \min_{\theta \in \Omega} \sum_{t=1}^{n} |y_t - f(x_t, \theta)|^p$$

for all sample vectors $\{(x_t, y_t)\}_{t=1}^{n}$ and $n = 1, 2, \ldots$. Let the conditions of Lemma 5.1 and Assumption 14 be satisfied; then, with probability one,

$$\hat{\theta}_n \to \theta^* \quad \text{as } n \to \infty .$$

Proof

The conclusion of Lemma 5.1 says that at $\theta^*$ in the interior of $\Omega$,

$$E_{\theta^*} |Q(x, y, \theta^*)|^p < E_{\theta^*} |Q(x, y, \theta)|^p$$

for all $\theta \neq \theta^*$ in $\Omega$. Hence Assumption 7 of Theorem 4.1 is satisfied. The rest of the assumptions of Theorem 4.1 are satisfied by the conditions of Lemma 5.1 and Assumption 14. Hence, the conclusion of Theorem 4.1 follows. $\Box$
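Theorem 5.1 can be watched numerically: simulate the regression model with a symmetric error having a unique median at zero, refit for growing $n$, and observe $\hat{\theta}_n$ settling on $\theta^*$. A small simulation sketch (ours), using the `lp_fit` routine sketched at the start of this chapter, with an assumed $f$ and assumed true values $\theta^* = (1.0,\ 0.1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_star = np.array([1.0, 0.1])
f = lambda x, th: th[0] * np.exp(th[1] * x)     # an assumed f, for illustration

for n in (50, 500, 5000):
    x = rng.uniform(0.0, 10.0, n)
    # Uniform errors: symmetric about a unique median at zero.
    y = f(x, theta_star) + rng.uniform(-0.5, 0.5, n)
    theta_hat, _ = lp_fit(f, x, y, theta0=[0.5, 0.5], p=1.5)
    print(n, theta_hat)                         # drifts toward theta_star
```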
5.2 Asymptotic Normality

In this section we shall be concerned with showing that the conditions of Theorem 4.3 are satisfied whenever $1 < p \leq 2$ and specific conditions are imposed on the distribution function of $e_t$. In the previous section it was required for this distribution to be in the family of symmetric distributions with a unique median, and hence the strong consistency of $\hat{\theta}_n$ was derived. Assumption 19 will add an additional constraint on this family.

Some of the expressions derived in Chapter 4 are presented below for the specific regression model. Consider the partial derivatives of $\rho(x, y, \theta)$ with respect to $\theta$. Then,

$$\frac{\partial |Q_t(\theta)|^p}{\partial \theta_i} = \begin{cases} -p\,\big(y_t - f_t(\theta)\big)^{p-1}\, \dfrac{\partial f_t(\theta)}{\partial \theta_i}, & y_t - f_t(\theta) > 0 \\[1ex] 0, & y_t - f_t(\theta) = 0 \\[1ex] p\,\big(f_t(\theta) - y_t\big)^{p-1}\, \dfrac{\partial f_t(\theta)}{\partial \theta_i}, & y_t - f_t(\theta) < 0 . \end{cases}$$

Equivalently,

$$\frac{\partial |Q_t(\theta)|^p}{\partial \theta_i} = -p\, |y_t - f_t(\theta)|^{p-2}\, \big(y_t - f_t(\theta)\big)\, \frac{\partial f_t(\theta)}{\partial \theta_i}$$

for $i = 1, 2, \ldots, k$ and $t = 1, 2, \ldots, n$. The general expression for the second partials derived in Chapter 3 becomes:
$$\frac{\partial^2 |Q_t(\theta)|^p}{\partial \theta_i \partial \theta_j} = \begin{cases} p(p-1)\big(y_t - f_t(\theta)\big)^{p-2}\, \dfrac{\partial f_t(\theta)}{\partial \theta_i} \dfrac{\partial f_t(\theta)}{\partial \theta_j} - p\,\big(y_t - f_t(\theta)\big)^{p-1}\, \dfrac{\partial^2 f_t(\theta)}{\partial \theta_i \partial \theta_j}, & y_t - f_t(\theta) > 0 \\[1.5ex] p(p-1)\big(f_t(\theta) - y_t\big)^{p-2}\, \dfrac{\partial f_t(\theta)}{\partial \theta_i} \dfrac{\partial f_t(\theta)}{\partial \theta_j} + p\,\big(f_t(\theta) - y_t\big)^{p-1}\, \dfrac{\partial^2 f_t(\theta)}{\partial \theta_i \partial \theta_j}, & y_t - f_t(\theta) < 0 \\[1.5ex] 0, & y_t - f_t(\theta) = 0, \;\; p \geq 2 \\[1ex] \infty, & y_t - f_t(\theta) = 0, \;\; 1 < p < 2 . \end{cases}$$

Equivalently,

$$\frac{\partial^2 |Q_t(\theta)|^p}{\partial \theta_i \partial \theta_j} = \begin{cases} p(p-1)\, |y_t - f_t(\theta)|^{p-2}\, \dfrac{\partial f_t(\theta)}{\partial \theta_i} \dfrac{\partial f_t(\theta)}{\partial \theta_j} - p\, |y_t - f_t(\theta)|^{p-2} \big(y_t - f_t(\theta)\big)\, \dfrac{\partial^2 f_t(\theta)}{\partial \theta_i \partial \theta_j}, & y_t - f_t(\theta) \neq 0 \\[1.5ex] 0, & y_t - f_t(\theta) = 0, \;\; p \geq 2 \\[1ex] \infty, & y_t - f_t(\theta) = 0, \;\; 1 < p < 2, \end{cases}$$

for $i, j = 1, 2, \ldots, k$ and $t = 1, 2, \ldots, n$.
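Away from the kink at $y_t = f_t(\theta)$, these piecewise formulas are easy to sanity-check against a central finite difference. A quick check of the first-partial expression (our snippet; the linear $f_t$, the value of $p$, and the evaluation point are arbitrary choices):

```python
import numpy as np

p = 1.5
grad = np.array([1.0, 2.0])                 # df/dtheta for f = th0 + 2 th1
f = lambda th: th[0] + 2.0 * th[1]          # f_t(theta) at a fixed x_t = 2
y_t, th = 3.7, np.array([1.0, 0.8])

q = y_t - f(th)
# Formula above: -p |q|^{p-2} q df/dtheta = -p |q|^{p-1} sgn(q) df/dtheta.
analytic = -p * abs(q) ** (p - 1) * np.sign(q) * grad

eps = 1e-6
numeric = np.array([
    (abs(y_t - f(th + eps * e)) ** p - abs(y_t - f(th - eps * e)) ** p) / (2 * eps)
    for e in np.eye(2)
])
print(analytic, numeric)    # agree closely away from y_t = f_t(theta)
```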
Note that the following results hold for the regression model:

1. Assumptions 8 and 20 imply Assumption 9 for all $p > 1$.

2. Assumptions 8 and 20 imply Assumption 20(a) for $1 < p \leq 2$.

3. Assumptions 16, 19, and 20 imply Assumption 10 for all $p > 1$.

4. Assumptions 1, 2, 4, 8, 16, 17, and 18 imply Assumption 7 for all $p \geq 1$.

The next result will be presented in Lemma 5.2. Under the conditions specified by this lemma, the regression model satisfies the conditions of Theorem 4.3, and asymptotic normality of $\sqrt{n}(\hat{\theta}_n - \theta^*)$ for $1 < p < 2$ follows.
Lemma 5.2

Let Assumptions 1, 2, 3, 4, 5, 8, 10, 14, 16, 17, 19, 20, 20(a), and 20(b) be satisfied. Then, for each $\varepsilon > 0$ and each $0 < \delta < \varepsilon$, there exists a measurable function $m^i(x, y, \delta)$ of $(x, y)$, $i = 1, 2, \ldots, k$, such that, except on a set of $(x, y)$ of measure zero,

1. $\displaystyle \left| \frac{\partial \rho(x_t, y_t, \theta')}{\partial \theta_i} - \frac{\partial \rho(x_t, y_t, \theta^*)}{\partial \theta_i} - \sum_{j=1}^{k} \frac{\partial^2 \rho(x_t, y_t, \theta^*)}{\partial \theta_j \partial \theta_i} (\theta_j' - \theta_j^*) \right| \leq m^i(x_t, y_t, \delta)\, \|\theta' - \theta^*\|$

for $i, j = 1, 2, \ldots, k$ and all $\theta'$ in $A(\theta^*, \delta)$, and

2. $\displaystyle \frac{1}{n} \sum_{t=1}^{n} m^i(x_t, y_t, \delta) \to 0$ with probability one as $n \to \infty$ and $\delta \to 0$.
Proof

Consider the difference

$$\frac{\partial \rho(x_t, y_t, \theta)}{\partial \theta_i} - \frac{\partial \rho(x_t, y_t, \theta^*)}{\partial \theta_i}$$

for $i = 1, 2, \ldots, k$; $t = 1, 2, \ldots, n$; and all $\theta$ in $A(\theta^*, \delta)$. In order to eliminate the absolute value signs we will consider firstly two general cases:

Case I: $y_t - f(x_t, \theta^*) > 0$, and

Case II: $y_t - f(x_t, \theta^*) < 0$,

for $\theta$ in $\Omega$ and all possible values of $f(x_t, \theta)$. The goal is to define a bounding function of this difference, for almost all $(x_t, y_t)$, all $\theta \in A(\theta^*, \delta)$, and each $0 < \delta < \varepsilon$. In all cases we consider only the values $0 < (p-1) \leq 1$ and $(p-2) \leq 0$, that is, $1 < p \leq 2$. In the following derivations we will not use the subscript $t$ to distinguish the observations, and $f(\theta)$ and $f(\theta^*)$ denote the same value of the function at $x$.
Case 1: $y > f(x, \theta) \geq f(x, \theta^*)$.

(a) $\big(y - f(x, \theta)\big)^{p-1}\, \dfrac{\partial f(\theta)}{\partial \theta_i} \geq \big(y - f(x, \theta^*)\big)^{p-1}\, \dfrac{\partial f(\theta^*)}{\partial \theta_i}$, where $\dfrac{\partial f(\theta)}{\partial \theta_i} > 0$ and $\dfrac{\partial f(\theta^*)}{\partial \theta_i} \geq 0$. Since $1 < p \leq 2$ for Case 1,

$$\left| \frac{\partial \rho(\theta)}{\partial \theta_i} - \frac{\partial \rho(\theta^*)}{\partial \theta_i} \right| \leq p\, |y - f(\theta^*)|^{p-2} \left[ \big(y - f(\theta^*)\big) \left| \frac{\partial f(\theta)}{\partial \theta_i} - \frac{\partial f(\theta^*)}{\partial \theta_i} \right| + \left| \frac{\partial f(\theta)}{\partial \theta_i} \right| \big| f(\theta^*) - f(\theta) \big| \right].$$

(b), (c) The reversed inequalities, with $\partial f(\theta)/\partial \theta_i \leq 0$ and $\partial f(\theta^*)/\partial \theta_i \leq 0$. Each is handled by interchanging $\theta^*$ and $\theta$ in the preceding bound; these cases do not occur whenever $\partial f(x, \theta)/\partial \theta$ is a constant function of $\theta$.

(d) $\big(y - f(x, \theta)\big)^{p-1}\, \dfrac{\partial f(\theta)}{\partial \theta_i} \leq \big(y - f(x, \theta^*)\big)^{p-1}\, \dfrac{\partial f(\theta^*)}{\partial \theta_i}$, where $\dfrac{\partial f(\theta^*)}{\partial \theta_i} > 0$. Cases (a) and (d) are equivalent.

Case 2: $y > f(x, \theta^*) > f(x, \theta)$. The above relations imply

$$0 < \big(y - f(x, \theta^*)\big)^{p-1} \leq \big(y - f(x, \theta)\big)^{p-1} \qquad \text{and} \qquad 0 < \big(y - f(x, \theta)\big)^{p-2} \leq \big(y - f(x, \theta^*)\big)^{p-2},$$

and the subcases (a)-(d), classified as before by the signs of $\partial f(\theta)/\partial \theta_i$ and $\partial f(\theta^*)/\partial \theta_i$, correspond to Cases 1(a)-1(d) with $\theta^*$ and $\theta$ interchanged in the terms $\big( \partial f(\theta^*)/\partial \theta_i - \partial f(\theta)/\partial \theta_i \big)$ and $\big( f(\theta^*) - f(\theta) \big)$. Cases 2(a) and 2(d) do not occur when $\partial f(\theta)/\partial \theta$ is a constant function of $\theta$.

Only the cases where $\partial f(\theta)/\partial \theta_i$ and $\partial f(\theta^*)/\partial \theta_i$ have the same sign (if both are different from zero) have been considered. Under Cases 1 and 2 above, it is possible that for some values of $x$ these derivatives have opposite signs. The only applicable subcases then are

(a) $\big(y - f(x, \theta)\big)^{p-1}\, \dfrac{\partial f(\theta)}{\partial \theta_i} \geq \big(y - f(x, \theta^*)\big)^{p-1}\, \dfrac{\partial f(\theta^*)}{\partial \theta_i}$, for which the difference is bounded by

$$p\, \big(y - f(\theta^*)\big)^{p-1} \left( \frac{\partial f(\theta)}{\partial \theta_i} - \frac{\partial f(\theta^*)}{\partial \theta_i} \right),$$

and (b) the reversed inequality, which is equivalent to the case above.

Case 3: $f(x, \theta) < y < f(x, \theta^*)$, or the mirror ordering. Combining the bounds above,

$$\left| \frac{\partial \rho(\theta)}{\partial \theta_i} - \frac{\partial \rho(\theta^*)}{\partial \theta_i} \right| < p\, |y - f(\theta^*)|^{p-2} \left[ \big(y - f(\theta^*)\big) \left| \frac{\partial f(\theta)}{\partial \theta_i} - \frac{\partial f(\theta^*)}{\partial \theta_i} \right| + |f(\theta) - f(\theta^*)| \left( \left| \frac{\partial f(\theta)}{\partial \theta_i} \right| + \left| \frac{\partial f(\theta^*)}{\partial \theta_i} \right| \right) \right] + p\, |f(\theta) - f(\theta^*)|^{p-1} \left| \frac{\partial f(\theta^*)}{\partial \theta_i} \right| .$$

Hence, for each $(x_t, \theta)$ in $\mathcal{X} \times \Omega$ we can define a function $h^i(x_t, y_t, \delta)$ which bounds $\big| \partial \rho(\theta)/\partial \theta_i - \partial \rho(\theta^*)/\partial \theta_i \big|$ for $i = 1, 2, \ldots, k$ and almost all $(x, y)$. Similarly, we can derive the equivalent cases for $y < f(x, \theta^*)$; that is,

Case 1: $y < f(x, \theta) \leq f(x, \theta^*)$,

Case 2: $y < f(x, \theta^*) < f(x, \theta)$,

Case 3: $f(x, \theta) < y < f(x, \theta^*)$,

for $i = 1, 2, \ldots, k$.
Consider the set $A(\theta^*, \varepsilon) = \{\theta \,:\, \|\theta - \theta^*\| \leq \varepsilon\}$ in the interior of $\Omega$, and let $0 < \beta < \varepsilon$. Then, by the Mean Value Theorem (through Assumptions 4 and 5), for all $\theta$ in $A(\theta^*, \beta)$ it follows that

$$\left| \frac{\partial f(\theta)}{\partial \theta_i} - \frac{\partial f(\theta^*)}{\partial \theta_i} \right| \leq \left( \sup_{A(\theta^*, \beta)} \left\| \frac{\partial^2 f(\theta)}{\partial \theta\, \partial \theta_i} \right\| \right) \|\theta - \theta^*\|$$

and

$$|f(\theta) - f(\theta^*)| \leq \left( \sup_{A(\theta^*, \beta)} \left\| \frac{\partial f(\theta)}{\partial \theta} \right\| \right) \|\theta - \theta^*\| .$$

Hence, for each $0 < \beta < \varepsilon$ and all $\theta$ in $A(\theta^*, \beta)$, with probability one by Assumption 2, each term in the bound for $\big| \partial \rho(\theta)/\partial \theta_i - \partial \rho(\theta^*)/\partial \theta_i \big|$ is dominated by a multiple of $\|\theta - \theta^*\|$; in particular, the cross-product term satisfies

$$2\, |y - f(\theta^*)|^{p-2} \left| \frac{\partial f(\theta)}{\partial \theta_i} \right| \left| \frac{\partial f(\theta^*)}{\partial \theta_i} \right| \|\theta - \theta^*\| .$$

Using this bounding function we can construct the function $m^i(x, y, \delta)$ of the lemma, such that, with probability one,

$$\frac{1}{n} \sum_{t=1}^{n} m^i(x_t, y_t, \delta) \to 0$$

as $\delta \to 0$ and $n \to \infty$.
For each $\delta > 0$, consider the set

$$S(\delta) = \left\{ (x, y) \;:\; |y - f(\theta^*)| - 2 \max\left( \sup_{A(\theta^*, \delta)} |f(\theta) - f(\theta^*)|, \; \delta \right) \leq 0 \right\}.$$

On $S^c(\delta)$,

$$|y - f(x, \theta^*)| > 2 \max\left( \sup_{A(\theta^*, \delta)} |f(\theta) - f(\theta^*)|, \; \delta \right) = 2\Delta(x, \delta),$$

and hence

$$|y - f(\theta)| = \big| y - f(\theta^*) - (f(\theta) - f(\theta^*)) \big| > \Delta(x, \delta)$$

for all $\theta$ in $A(\theta^*, \delta)$.
By Assumption 4, the second partials of $\rho(x, y, \theta)$ with respect to $\theta$ exist on $S^c(\delta)$ for all $\theta$ in $A(\theta^*, \delta)$, and are real valued continuous functions of $\theta$. Using a second order Taylor's approximation around $\theta^*$, let $\theta$ be in $A(\theta^*, \delta)$ and let $(x_t, y_t)$ be in $S^c(\delta)$; then

$$\left| \frac{\partial \rho_t(\theta)}{\partial \theta_i} - \frac{\partial \rho_t(\theta^*)}{\partial \theta_i} - \sum_{j=1}^{k} \frac{\partial^2 \rho_t(\theta^*)}{\partial \theta_j \partial \theta_i} (\theta_j - \theta_j^*) \right| \leq \sum_{j=1}^{k} \left| \frac{\partial^2 \rho_t(\bar{\theta})}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 \rho_t(\theta^*)}{\partial \theta_j \partial \theta_i} \right| \, \|\theta - \theta^*\|,$$

where $\bar{\theta}$ is in $A(\theta^*, \delta)$, and

$$\left| \frac{\partial^2 \rho_t(\bar{\theta})}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 \rho_t(\theta^*)}{\partial \theta_j \partial \theta_i} \right| \leq p(p-1) \left|\, |y_t - f_t(\bar{\theta})|^{p-2}\, \frac{\partial f_t(\bar{\theta})}{\partial \theta_i} \frac{\partial f_t(\bar{\theta})}{\partial \theta_j} - |y_t - f_t(\theta^*)|^{p-2}\, \frac{\partial f_t(\theta^*)}{\partial \theta_i} \frac{\partial f_t(\theta^*)}{\partial \theta_j} \right| + 2 \left|\, |y_t - f_t(\theta^*)|^{p-2} \big(y_t - f_t(\theta^*)\big)\, \frac{\partial^2 f_t(\theta^*)}{\partial \theta_i \partial \theta_j} - |y_t - f_t(\bar{\theta})|^{p-2} \big(y_t - f_t(\bar{\theta})\big)\, \frac{\partial^2 f_t(\bar{\theta})}{\partial \theta_i \partial \theta_j} \right|, \tag{5.3}$$
where $q = (y - f(\theta^*))$. For each $0 < \delta < \varepsilon$ and each $x$, consider the sets

$$S_1(x, \delta) = \left\{ q = y - f(x, \theta^*) \;:\; (x, y) \in S^c(\delta), \;\; |y - f(\theta^*)| > T' + \Delta(x, \delta) \right\},$$

where, by Assumption 20(a),

$$\sup_{\mathcal{X}} \left( \sup_{A(\theta^*, \varepsilon)} |f(x, \theta) - f(x, \theta^*)| \right) \leq T \qquad \text{and} \qquad T' = \max(1, 2T) .$$

Consider the first term in relation (5.3). Then, denoting the distribution function of $q$ by $F_q$, the corresponding integral can be examined over $S_1(x, \delta)$ by Assumption 19.
For convenience, let $\Delta(x, \delta)$ be denoted as $\Delta$, and denote by $S_1^+(x, \delta)$ the subset of $S_1(x, \delta)$ on which $q > 0$. By Assumption 20, for $\theta$ in $A(\theta^*, \delta)$ and on $S_1^+(x, \delta)$, the exponent $(p - 2)$ is negative; and since

$$q - \big(f(\theta) - f(\theta^*)\big) > 0$$

on $S_1^+(x, \delta)$ for any $\delta > 0$, a first order Taylor's expansion around $\theta^*$ shows, for $\theta$ in $A(\theta^*, \delta)$, that the integrand of the first term of (5.3) is dominated by an integrable function of $q$. Hence, by the Dominated Convergence Theorem, the corresponding integral $I_1$ satisfies

$$I_1 \to 0 \quad \text{as } \delta \to 0 .$$

The case for $S_1^-(x, \delta)$ would be equivalent by Assumption 16.
108
Now consider the integral of
Call it
1
2
•
Al(q,~,e)
over the set
S2(~,8)
Then, since
ly - f(e)1 > 1
on
S2(~'&)
, the second term in
12
is bounded for all
x
by an
integrable function, and by the D.C.T., the integral of this term
goes to zero as
&~ o. That is,
I Rm Is 2 (x- ' o)ly
- f(e*) - (f(e)
~
0
as
°
~
0 •
By Assumption 20, the first term in
and since for all
integrable function
e
in A(e*,o)
2r. . (x)
J.,
J -
and
1
2
is
the integrand is bounded by the
.
109
SRm r.
·(20
~,J
as
a
as
a~
~
0
Is 2 (
by the D.C.T.
1:)
~'v
Ily -
Hence
II
p-z
f(ei~)
I
0
as
~
a
~
0
I2
and
~
0
0 •
Consider the second term $A_2(q, x, \theta)$ in relation (5.3):

$$A_2(q, x, \theta) = \int \left|\, |y - f(\theta)|^{p-2} \big(y - f(\theta^*)\big) \left( \frac{\partial^2 f(\theta^*)}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 f(\theta)}{\partial \theta_j \partial \theta_i} \right) - \frac{\partial^2 f(\theta)}{\partial \theta_j \partial \theta_i} \left( |y - f(\theta)|^{p-2} \big(y - f(\theta)\big) - |y - f(\theta)|^{p-2} \big(y - f(\theta^*)\big) \right) \right| dF_q \, d\mu_X .$$

Note that, by Assumptions 5, 8, 20(b), and the D.C.T., the integral of the first term is

$$\leq \int_{R^m} \int_{(-\infty, \infty)} |y - f(\theta^*)|^{p-1} \sup_{A(\theta^*, \delta)} \left| \frac{\partial^2 f(\theta^*)}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 f(\theta)}{\partial \theta_j \partial \theta_i} \right| dF_q \, d\mu_X \to 0$$

as $\delta \to 0$. The second term is bounded by

$$\int_{R^m} \left( \sup_{A(\theta^*, \delta)} \left| \frac{\partial^2 f(\theta)}{\partial \theta_j \partial \theta_i} \right| \right) \int_{(S_1 \cup S_2)} \left|\, |y - f(\theta)|^{p-2} \big(y - f(\theta)\big) - |y - f(\theta)|^{p-2} \big(y - f(\theta^*)\big) \right| dF_q \, d\mu_X ;$$

similarly, this integral $\to 0$ as $\delta \to 0$.
Define $m^i(x, y, \delta)$ as follows:

$$m^i(x, y, \delta) = \begin{cases} h^i(x, y, \delta), & \text{if } |y - f(\theta^*)| \leq 2 \max\left( \displaystyle \sup_{A(\theta^*, \delta)} |f(x, \theta) - f(x, \theta^*)|, \; \delta \right) \\[2ex] \displaystyle \sum_{j=1}^{k} h_{i,j}(x, y, \delta), & \text{if } |y - f(\theta^*)| > 2 \max\left( \displaystyle \sup_{A(\theta^*, \delta)} |f(x, \theta) - f(x, \theta^*)|, \; \delta \right). \end{cases}$$

Then,

$$E_{\theta^*}\, m^i(x, y, \delta) \to 0 \quad \text{as } \delta \to 0,$$

and, by Assumption 14,

$$\frac{1}{n} \sum_{t=1}^{n} m^i(x_t, y_t, \delta) \to 0$$

with probability one as $n \to \infty$ and $\delta \to 0$. $\Box$
Theorem 5.2

Let $\hat{\theta}_n$ be a consistent estimator of $\theta^*$, with $1 < p \leq 2$; let the conditions of Lemma 5.2 be satisfied, and let the conclusion of Lemma 5.1 and Assumptions 12 and 15 hold. Then $\sqrt{n}(\hat{\theta}_n - \theta^*)$ is asymptotically normal with mean zero and variance covariance matrix $G^{-1}(\theta^*) A(\theta^*) G^{-1}(\theta^*)$.

Proof

The conclusion follows from Theorem 4.3; it is easy to verify that the conditions of this theorem are all satisfied.
For $p > 2$, the condition to verify is relation (4.4); that is,

$$\sup_{\|\theta - \theta^*\| \leq \delta} \left| \frac{\partial^2 \rho(x, y, \theta)}{\partial \theta_j \partial \theta_i} - \frac{\partial^2 \rho(x, y, \theta^*)}{\partial \theta_j \partial \theta_i} \right| \leq h_{i,j}(x, y, \delta),$$

where

$$E_{\theta^*}\, h_{i,j}(x, y, \delta) \to 0 \quad \text{as } \delta \to 0 .$$

Note that in this case $0 < p - 1 < p$ and $0 < p - 2 < p$. Assumption 8 implies that

$$E_{\theta^*} |y - f(x, \theta)|^{p-1} < \infty \qquad \text{and} \qquad E_{\theta^*} |y - f(x, \theta)|^{p-2} < \infty$$

for all $\theta$ in $\Omega$. This, and Assumptions 5, 20, and 20(b), allow the conclusion of the D.C.T. to apply, and relation (4.4) is satisfied. $\Box$
6. SUMMARY

The statistical properties of the estimator $\hat{\theta}_n$ were investigated, where $\hat{\theta}_n$ satisfies

$$\sum_{t=1}^{n} |Q(x_t, y_t, \hat{\theta}_n)|^p = \min_{\theta \in \Omega} \sum_{t=1}^{n} |Q(x_t, y_t, \theta)|^p$$

for a choice of $p \geq 1$, where $Q$ is the hypothesized implicit model. An algorithm to compute $\hat{\theta}_n$ was derived, and under general conditions it was shown to converge to a stationary point of the objective function.

The implicit model was more specifically defined by placing restrictions on the function $Q$ and the parameter space $\Omega$, the arguments $(x_t, y_t)$, $t = 1, 2, \ldots$, and the behavior of the random error. Given the complete model specification and a set of assumptions to be fulfilled by the model, it was shown that for any $p \geq 1$, $\hat{\theta}_n$ converges with probability one to a point $\theta^0$ in the interior of $\Omega$, and that for $p > 1$, $\sqrt{n}(\hat{\theta}_n - \theta^0)$ is asymptotically $k$-variate normal. The point $\theta^0$ is defined as that value in $\Omega$ such that

$$E_{\theta^*} |Q(x, y, \theta^0)|^p \leq E_{\theta^*} |Q(x, y, \theta)|^p$$

for any $\theta$ in $\Omega$. Moreover, when more than one point in $\Omega$ minimizes the above expectation, the estimate $\hat{\theta}_n$ ultimately remains in any neighborhood of these points.
When the model can be solved for the endogenous variable $y$ through analytical or numerical procedures, equating $\varepsilon_t$ to zero, a predictor $\hat{y}_n = \phi_x(\hat{\theta}_n)$ is defined for a fixed value of $x$. Under rather general conditions, $\phi_x(\theta)$ exists as a continuous function of $\theta$; if $\hat{\theta}_n$ converges to $\theta^0$ with probability one and has an asymptotically normal distribution, then it follows that $\hat{y}_n$ converges in probability to $\phi_x(\theta^0)$, and that $\sqrt{n}(\hat{y}_n - \phi_x(\theta^0))$ converges in distribution to the univariate normal with mean zero.

Finally, the regression model was considered as a special case of the general implicit model. The model assumptions, together with additional conditions on a class of distribution functions for the error and on the parameter space, were sufficient to show that in this case $\theta^0$ corresponds to the true parameter value $\theta^*$; hence, the findings presented in the previous paragraphs follow for $\theta^0 = \theta^*$.
7. NUMERICAL ILLUSTRATIONS

The examples in this chapter were solved using the computing technique developed in Chapter 3. The purpose was to assess the performance of the algorithm and, when possible, to compare the results with those generated by other computing methods.

The following example was used by Schlossmacher [18] in evaluating a proposed technique for minimizing the sum of absolute values of residuals from a linear regression model. Given the data

    x_t :  12     18     24     30     36     42     48
    y_t :  5.27   5.68   6.25   7.21   8.02   8.71   8.42

the optimal fit using the minimum absolute deviations criterion was obtained. However, as commented in [18], this example has a non-unique optimal region. In particular, lines with slope between .1078 and .1250 passing through the point (30, 7.21) are optimal. The minimum value for the objective function was reported as 1.65. The results we obtained were as presented in Table 1.
Table 1. A two parameter linear regression model

    p       Initial values    a_n     b_n      SRHO    DERV
    1.0     8.0, -1.0         3.92    .1096    1.65    -.196401D-03
    1.0     4.0,  2.0         3.93    .1099    1.65    -.619430D-03
    1.25    5.0,  1.0         3.93    .1080    1.30    -.354370D-07
    1.50    5.0,  1.0         3.95    .1052    1.01    -.115185D-05

For $p = 1$, the residuals corresponding to the observation (30, 7.21) were .2284D-06 and .2220D-15 for the two fits, respectively.
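The minimum absolute deviations fit for these data is easy to reproduce. The sketch below (ours, not the original computation) minimizes the $p = 1$ objective over $(a, b)$ from one of the starting values used above; any solution should pass through (30, 7.21) with a slope inside the optimal interval [.1078, .1250] and an objective value near 1.65.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([12, 18, 24, 30, 36, 42, 48], dtype=float)
y = np.array([5.27, 5.68, 6.25, 7.21, 8.02, 8.71, 8.42])

sad = lambda ab: np.sum(np.abs(y - ab[0] - ab[1] * x))   # sum of |residuals|
res = minimize(sad, x0=[5.0, 1.0], method="Nelder-Mead")
a, b = res.x
print(a, b, res.fun)        # objective close to the reported minimum 1.65
print(a + 30.0 * b)         # close to 7.21: the line passes near (30, 7.21)
```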
The nonlinear regression model

$$y_t = \exp\big( -\theta_1 x_{1t} \exp(-\theta_2 x_{2t}) \big) + \varepsilon_t$$

was fitted using various choices of $p$. Since there were no other available results for values of $p$ different from two, several initial values for the parameters were considered. Table 2 presents these results.
Table 2. A two parameter nonlinear regression model

    p       Initial values                                  theta_1    theta_2     SRHO
    1.0     200,1000; 300,800; 750,1200; 1000,1000          525.6997   872.1269    .57177
    1.25    100,300; 750,1200                                          900.2983    .29875
    1.50    100,300; 1200,1300; 200,1000                    706.8473   931.0162    .15128
    2.0     750,1200; 1000,1000; 1200,1300; 200,1000                   961.0026    .039806

    DERV values for the individual runs: -.103530D-03, .184522D-03, .129813D-09,
    -.472908D-10, -.740393D-11, .237167D-10, .681658D-11, -.186779D-11.

Bard [1] reported $\hat{\theta}_1 = 813.4583$, $\hat{\theta}_2 = 960.9063$, and SRHO = .039806 for $p$ equal to 2. The author approached the problem using the algorithm by Marquardt as well as an approximation to a second derivative method combined with different choices for the step length. The direction vector derived by Bard was equivalent to the one in our algorithm when $p = 2$.
A third type of implicit model was considered. In this case, the function is algebraically implicit in the variable $y$, yet for specific values of $x_1, x_2, \theta_1, \theta_2$ there exists a solution for $y$ in terms of $x_1, x_2, \theta_1, \theta_2$. To generate values for the $y_t$, it was assumed that the true parameter values were $\theta_1 = 1.0$, $\theta_2 = -8.0$, and that the $x_1$ and $x_2$ were known constants. The solution for the $y_t$ was obtained by solving the equation with $\varepsilon_t = 0$. To the generated values, an error term from the uniform distribution $(-1/64, 1/64)$ was added so that the model would not apply exactly to the observed sequence $(x_t, y_t)$. The results were equivalent across the different choices of $p = \{1.1, 1.5, 2.5\}$, probably due to the very good fit of the model. With a starting value of $\theta_1 = 4.0$, $\theta_2 = -11.0$, the solution $(.9999, -8.000)$ was generated for each choice of $p$.
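The data-generation step is reproducible in outline. The implicit relation below is a hypothetical stand-in (the algebraic form of the thesis's third model is not recoverable from the text); the sketch solves $Q = 0$ at the true values $\theta = (1.0, -8.0)$ and then perturbs with uniform $(-1/64, 1/64)$ noise, as described.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
theta = (1.0, -8.0)                        # true parameter values from the text

# Hypothetical implicit relation, strictly increasing in y for x1 > 0,
# so Q = 0 has a unique root (our choice, not the thesis's model).
Q = lambda x1, x2, y, th: y**3 + th[0] * x1 * y + th[1] * x2

x1 = rng.uniform(1.0, 2.0, 25)             # known constants in the experiment
x2 = rng.uniform(1.0, 2.0, 25)
y = np.array([brentq(lambda yy: Q(a, b, yy, theta), -10.0, 10.0)
              for a, b in zip(x1, x2)])    # solve with the error set to zero
y += rng.uniform(-1/64, 1/64, y.size)      # so the model does not hold exactly
```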
In carrying out the computations for these three types of implicit models, an approximation was used in the choice of step length $\lambda_r$ at each iteration. The value of $\theta$ at the $(r+1)$st iteration was chosen to be

$$\theta_{r+1} = \theta_r + \lambda_r d_r,$$

where

$$\lambda_r = \max_j \, (C^j) \quad \text{for } j = 1, 2, \ldots, j_0$$

such that

$$\bar{\rho}_n(\theta_{r+1}) < \bar{\rho}_n(\theta_r) .$$

The same value of $C$ was used in all cases.
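The step-length rule translates into a few lines of code: try $\lambda = C, C^2, \ldots$ and take the first (largest) power that strictly decreases the objective. A sketch (ours), with an assumed $C = 0.5$ since the constant actually used is not recoverable from the text:

```python
import numpy as np

def choose_step(rho_bar, theta_r, d_r, C=0.5, j_max=30):
    """Backtracking choice of the step length lambda_r = C**j.

    rho_bar is the objective and d_r the current direction vector; the
    largest power of C that strictly decreases rho_bar is returned,
    mirroring lambda_r = max(C**j) with rho_bar(theta_{r+1}) < rho_bar(theta_r).
    C = 0.5 is an assumed value.
    """
    base = rho_bar(theta_r)
    for j in range(1, j_max + 1):
        lam = C ** j
        if rho_bar(theta_r + lam * np.asarray(d_r)) < base:
            return lam
    return 0.0    # no acceptable step within j_max trials
```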
8. LITERATURE CITED

1. Bard, Y. 1974. Nonlinear Parameter Estimation. Academic Press, New York, New York.

2. Box, G. E. P. and G. C. Tiao. 1962. A further look at robustness via Bayes Theorem. Biometrika 49:419-432.

3. Britt, H. I. and R. H. Luecke. 1973. The estimation of parameters in nonlinear implicit models. Technometrics 15:233-247.

4. Burkill, J. C. and H. Burkill. 1970. A Second Course in Mathematical Analysis. Cambridge University Press, Cambridge.

5. Fisher, F. M. 1966. The Identification Problem in Econometrics. McGraw-Hill Book Company, Inc., New York, New York.

6. Flanagan, P. D., P. A. Vitale, and J. Mendelsohn. 1969. A numerical investigation of several one-dimensional search procedures in nonlinear regression problems. Technometrics 11:245-254.

7. Forsythe, A. B. 1972. Robust estimation of straight line regression coefficients by minimizing p-th power deviations. Technometrics 14:159-166.

8. Gallant, A. R. 1971. Statistical inference for nonlinear regression models. Unpublished Ph.D. Thesis. Department of Statistics, Iowa State University, Ames, Iowa. University Microfilms, Ann Arbor, Michigan.

9. Huber, P. J. 1964. Robust estimation of a location parameter. Mathematical Statistics Annals 35:73-101.

10. Huber, P. J. 1967. The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium, pp. 221-233, Berkeley, California.

11. Huber, P. J. 1973. Robust regression: Asymptotics, conjectures and Monte Carlo. Annals of Statistics 1:799-821.

12. Jennrich, R. I. 1969. Asymptotic properties of non-linear least squares estimators. Mathematical Statistics Annals 40:633-643.

13. Loeve, M. 1963. Probability Theory. D. Van Nostrand Company, Inc., Princeton, New Jersey.

14. Mickey, M. R., P. M. Mundle, D. N. Walker, and A. M. Glinski. 1963. Test criteria for Pearson Type III distributions. Aeronautical Research Laboratories, Office of Aerospace Research, Wright Patterson Air Force Base, ARL 63-09, Dayton, Ohio.

15. Ortega, J. M. and W. C. Rheinboldt. 1970. Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York, New York.

16. Rao, C. R. 1973. Linear Statistical Inference and Its Applications. John Wiley and Sons, Inc., New York, New York.

17. Salmond, C. E. 1971. The estimation of parameters of a model which are implicitly defined from their properties. The Australian Journal of Statistics 13:57-63.

18. Schlossmacher, E. J. 1973. An iterative technique for absolute deviation curve fitting. Journal of the American Statistical Association 68:857-859.

19. Smith, V. K. and T. Hall. 1972. A comparison of maximum likelihood versus BLUE estimators. The Review of Economics and Statistics 54:186-190.

20. Wallace, T. D. and L. A. Ihnen. 1975. Full time schooling in life cycle models of human capital accumulation. Journal of Political Economy 83:137-155.