
ON THE MAXIMUM LIKELIHOOD ESTIMATE FOR LOGISTIC ERRORS-IN-VARIABLES REGRESSION MODELS

by

R. J. Carroll*

University of North Carolina at Chapel Hill

*Research supported by the Air Force Office of Scientific Research Grant AFOSR-F49620-82-C-0009.
ABSTRACT

Maximum likelihood estimates for errors-in-variables models are not always root-N consistent. We provide an example of this for logistic regression.

SOME KEY WORDS: Binary regression, Measurement error, Logistic regression, Maximum likelihood, Functional models.
I. INTRODUCTION
Logistic regression is a popular device for estimating the probability of an event, such as the development of heart disease, from a set of predictors, e.g., systolic blood pressure. The simplest form of this model is logistic regression through the origin with a single predictor:

(1)    $\Pr\{Y_i = 1 \mid c_i\} = G(\alpha_0 c_i) = \{1 + \exp(-\alpha_0 c_i)\}^{-1}$,

where $\alpha_0$ and $\{c_i\}$ are scalars $(i = 1, \ldots, N)$. In many applications, the predictors are measured with substantial error; a good example of this is systolic blood pressure, see Carroll et al. (1982).
Thus, we observe

(2)    $C_i = c_i + v_i$,

where the errors $\{v_i\}$ are assumed here to be normally distributed with mean zero and variance $\sigma^2$.
The functional errors-in-variables logistic regression model is the case where (1) and (2) hold and the true values $\{c_i\}$ are unknown constants. The parameters are $\alpha_0$ and $\{c_i\}$; there are $(N+1)$ parameters with $N$ observations, so classical maximum likelihood theory does not apply. Up to a constant, the log-likelihood is

$$-N \log_e \sigma - (2\sigma^2)^{-1} \sum_{i=1}^{N} (C_i - c_i)^2 + \sum_{i=1}^{N} \{Y_i \log_e G(\alpha c_i) + (1 - Y_i) \log_e (1 - G(\alpha c_i))\}.$$
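For concreteness, the log-likelihood can be evaluated directly. Here is a minimal Python sketch (numpy only; the function name and signature are my own, not the paper's):

```python
import numpy as np

def log_lik(alpha, c, C, Y, sigma):
    """Log-likelihood (up to a constant) of the functional model (1)-(2).
    alpha: scalar slope; c: candidate true values; C: observed values;
    Y: binary responses; sigma: measurement-error standard deviation."""
    G = 1.0 / (1.0 + np.exp(-alpha * c))            # G(alpha * c_i)
    normal_part = -len(C) * np.log(sigma) - np.sum((C - c) ** 2) / (2 * sigma**2)
    logistic_part = np.sum(Y * np.log(G) + (1 - Y) * np.log(1 - G))
    return normal_part + logistic_part
```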
The linear functional errors-in-variables model (Kendall and Stuart) takes a form similar to (1) and (2), although of course (1) is replaced by the usual linear regression model with error variance $\sigma_\varepsilon^2$. If $\sigma^2$, $\sigma_\varepsilon^2$, or $\sigma^2/\sigma_\varepsilon^2$ is known, then the linear functional maximum likelihood estimate exists and is both consistent and asymptotically normally distributed.
In this note, we show that for the functional logistic errors-in-variables model (1) and (2), even if $\sigma^2$ is known, the maximum likelihood estimate cannot be consistent and asymptotically normally distributed about $\alpha_0$. The result can be extended to multiple logistic regression, and it is true even if we replicate (2) a finite number $M$ of times. If the number of replicates $M \to \infty$ as the sample size $N \to \infty$, then the functional maximum likelihood estimate can be shown to be consistent whether $\sigma^2$ is known or not.

II. THE THEOREM
In model (1) with the $\{c_i\}$ known, the ordinary maximum likelihood estimate $\hat\alpha_1$ for $\alpha_0$ satisfies

$$0 = \sum_{i=1}^{N} c_i (Y_i - G(\hat\alpha_1 c_i)).$$
In the presence of measurement error, the naive estimator $\hat\alpha_2$ would solve

(3)    $0 = \sum_{i=1}^{N} C_i (Y_i - G(\hat\alpha_2 C_i))$.

However, because of correlations, it turns out that

(4)    $\lim_{N\to\infty} N^{-1} E \sum_{i=1}^{N} C_i (Y_i - G(\alpha_0 C_i)) \ne 0$.
Condition (4) says that the defining equation (3) for $\hat\alpha_2$ is not even consistent at the true value $\alpha_0$. Under these circumstances, it is well known from the theory of M-estimators that the usual naive estimator $\hat\alpha_2$ converges not to $\alpha_0$ but to the value $\alpha^*$ satisfying

$$\lim_{N\to\infty} N^{-1} E \sum_{i=1}^{N} C_i (Y_i - G(\alpha^* C_i)) = 0,$$

assuming such a value $\alpha^*$ exists and is unique.
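This inconsistency is easy to see numerically. The following Monte-Carlo sketch is my illustration, not from the paper; the design $c_i \sim N(0, 4)$, $\alpha_0 = 1$, $\sigma = 1$ is an arbitrary choice. It solves the logistic score with the true and with the mismeasured predictors:

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha0, sigma = 100_000, 1.0, 1.0
c = rng.normal(0.0, 2.0, N)                        # true predictors
C = c + rng.normal(0.0, sigma, N)                  # observed values, model (2)
Y = (rng.random(N) < 1.0 / (1.0 + np.exp(-alpha0 * c))).astype(float)

def solve_score(x, lo=0.0, hi=5.0, tol=1e-8):
    """Bisection root of the (strictly decreasing) score
    a -> sum_i x_i (Y_i - G(a x_i))."""
    score = lambda a: np.sum(x * (Y - 1.0 / (1.0 + np.exp(-a * x))))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

print(solve_score(c))   # ordinary MLE with the true c_i: close to alpha0 = 1
print(solve_score(C))   # naive estimate from (3): attenuated toward zero
```

The second root is noticeably attenuated, which is the content of (4).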
Assuming it exists and is unique, the functional MLE $\hat\alpha_0$ satisfies an equation analogous to (3):

(5)    $0 = N^{-1} \sum_{i=1}^{N} \hat c_i(\hat\alpha_0) (Y_i - G(\hat\alpha_0 \hat c_i(\hat\alpha_0)))$,

where

(6)    $\hat c_i(\alpha) = C_i + \alpha \sigma^2 (Y_i - G(\alpha \hat c_i(\alpha)))$.
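Equation (6) defines $\hat c_i(\alpha)$ only implicitly, but since $\sup G' = 1/4$, the map $u \mapsto C_i + \alpha\sigma^2(Y_i - G(\alpha u))$ is a contraction whenever $\alpha^2 \sigma^2 < 4$, so $\hat c_i(\alpha)$ can be computed by fixed-point iteration. A minimal sketch (vectorized over $i$; the function name and iteration count are my own choices):

```python
import numpy as np

def c_hat(alpha, C, Y, sigma2, iters=50):
    """Solve (6) for the profiled true values c_hat_i(alpha) by iterating
    u <- C + alpha * sigma2 * (Y - G(alpha * u)); this is a contraction
    when alpha^2 * sigma2 < 4 because the logistic derivative is at most 1/4."""
    u = C.copy()                                   # start from the observed values
    for _ in range(iters):
        u = C + alpha * sigma2 * (Y - 1.0 / (1.0 + np.exp(-alpha * u)))
    return u
```

The functional MLE $\hat\alpha_0$ can then be found as the root in $\alpha$ of the profile score (5), i.e., of $\sum_i \hat c_i(\alpha)(Y_i - G(\alpha \hat c_i(\alpha)))$.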
It is easy to construct examples for which an analogue to (4) holds:

(7)    $\lim_{N\to\infty} N^{-1} E \sum_{i=1}^{N} \hat c_i(\alpha_0) (Y_i - G(\alpha_0 \hat c_i(\alpha_0))) \ne 0$.

One example of (7) is the extraordinarily easy problem $\sigma^2 = 1$ and $c_i = (-1)^i$. The only question is whether (7) is enough to guarantee that the functional MLE $\hat\alpha_0$ cannot be asymptotically normally distributed about the true value $\alpha_0$. This is the case.
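Before stating the theorem, here is a quick Monte-Carlo check of the $c_i = (-1)^i$ example (my illustration; $\alpha_0 = 1$ is an arbitrary choice, and the fixed-point iteration is the same as in the sketch above). Per (7), the sample analogue of the expectation should settle near a nonzero value:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha0, sigma2, N = 1.0, 1.0, 200_000
G = lambda t: 1.0 / (1.0 + np.exp(-t))

c = np.where(np.arange(1, N + 1) % 2 == 0, 1.0, -1.0)   # c_i = (-1)^i
Y = (rng.random(N) < G(alpha0 * c)).astype(float)
C = c + rng.normal(0.0, np.sqrt(sigma2), N)              # sigma^2 = 1

u = C.copy()                                             # fixed-point solve of (6)
for _ in range(50):
    u = C + alpha0 * sigma2 * (Y - G(alpha0 * u))

# Sample analogue of the limit in (7); per the example, it stays away from zero.
print(np.mean(u * (Y - G(alpha0 * u))))
```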
Theorem. Suppose that $\sigma^2$ is known and that

(A.1)  the maximum likelihood estimate $\hat\alpha_0$ exists;

(A.2)  $N^{-1} \sum_{i=1}^{N} c_i \to A$;

(A.3)  $N^{-1} \sum_{i=1}^{N} c_i^2 \to B$  $(0 < B < \infty)$.

Then, if

(A.4)  $N^{1/2} (\hat\alpha_0 - \alpha_0) = O_p(1)$,

we must have that (7) fails, i.e.,

(8)    $\lim_{N\to\infty} N^{-1} E \sum_{i=1}^{N} \hat c_i(\alpha_0) (Y_i - G(\alpha_0 \hat c_i(\alpha_0))) = 0$.
The theorem as stated does not readily follow from the theory of M-estimators unless one assumes the existence of a unique $\alpha^*$ which satisfies (8), along with other regularity conditions. The proof we give avoids these complications because it exploits the form of the logistic function $G$.
III. PROOF OF THE THEOREM

It is most transparent to take $\sigma^2 = 1$. By formal differentiation, $\hat\alpha_0$ simultaneously satisfies (6) and

(9)    $N^{-1} \sum_{i=1}^{N} \hat c_i(\hat\alpha_0) \{G(\hat\alpha_0 \hat c_i(\hat\alpha_0)) - Y_i\} = 0$.
Assumptions (A.2) and (A.3) imply that

(10)    $\max\{c_i^2/N : 1 \le i \le N\} \to 0$.

From (2) and (6), it follows that

(11)    $\lim_{\varepsilon \to 0}\ \max_{1 \le i \le N}\ \sup_{|\alpha - \alpha_0| < \varepsilon} |\hat c_i(\alpha) - v_i| / (1 + |\alpha_0| + |c_i|) = O_p(1)$.

Further, since the $\{v_i\}$ are normally distributed,

(12)    $\max\{|v_i| N^{-1/2} : 1 \le i \le N\} \xrightarrow{p} 0$.
Lemma. If (A.1)-(A.4) hold, then

(13)    $\max_{1 \le i \le N} |\hat c_i(\hat\alpha_0) - \hat c_i(\alpha_0)| \xrightarrow{p} 0$.

Proof of the Lemma. Define

$$H_i(u, \alpha) = u - c_i - v_i + \alpha \{G(\alpha u) - Y_i\},$$

so that $H_i(\hat c_i(\alpha), \alpha) = 0$ by (6), recalling $\sigma^2 = 1$.
The partial derivatives of $H_i$ are

$$\frac{\partial}{\partial u} H_i(u, \alpha) = 1 + \alpha^2 G(\alpha u)\{1 - G(\alpha u)\},$$

$$\frac{\partial}{\partial \alpha} H_i(u, \alpha) = \{G(\alpha u) - Y_i\} + \alpha u\, G(\alpha u)\{1 - G(\alpha u)\}.$$

By the chain rule,

(14)    $\displaystyle \frac{\partial}{\partial \alpha} \hat c_i(\alpha) = - \frac{\{G(\alpha \hat c_i(\alpha)) - Y_i\} + \alpha \hat c_i(\alpha)\, G(\alpha \hat c_i(\alpha))\{1 - G(\alpha \hat c_i(\alpha))\}}{1 + \alpha^2 G(\alpha \hat c_i(\alpha))\{1 - G(\alpha \hat c_i(\alpha))\}}$.
From (10)-(12) and (14) it follows that for every $M > 0$,

$$\max_{1 \le i \le N}\ \sup_{|\alpha - \alpha_0| < M/N^{1/2}} N^{-1/2} \left|\frac{\partial}{\partial \alpha} \hat c_i(\alpha)\right| = O_p\Big\{\max_{1 \le i \le N}\ \sup_{|\alpha - \alpha_0| < M/N^{1/2}} |\hat c_i(\alpha)| / N^{1/2}\Big\} \xrightarrow{p} 0.$$

This means that for every $M > 0$,

$$\max_{1 \le i \le N}\ \sup_{|\alpha - \alpha_0| < M/N^{1/2}} |\hat c_i(\alpha) - \hat c_i(\alpha_0)| \xrightarrow{p} 0,$$

which by (A.4) completes the proof of the Lemma.
We must prove that (A.1)-(A.4) imply (8). We are first going to show that

(15)    $N^{-1} \sum_{i=1}^{N} \hat c_i(\alpha_0) (Y_i - G(\alpha_0 \hat c_i(\alpha_0))) \xrightarrow{p} 0$.

The term in (15) can be written as $A_{1N} + A_{2N} + A_{3N}$, where

$$A_{1N} = N^{-1} \sum_{i=1}^{N} \{\hat c_i(\alpha_0) - \hat c_i(\hat\alpha_0)\} (Y_i - G(\hat\alpha_0 \hat c_i(\hat\alpha_0))),$$

$$A_{2N} = N^{-1} \sum_{i=1}^{N} \hat c_i(\alpha_0) \big[G\{\hat\alpha_0 \hat c_i(\hat\alpha_0)\} - G\{\alpha_0 \hat c_i(\alpha_0)\}\big],$$

$$A_{3N} = N^{-1} \sum_{i=1}^{N} \hat c_i(\hat\alpha_0) (Y_i - G(\hat\alpha_0 \hat c_i(\hat\alpha_0))).$$

By (9), $A_{3N} = 0$ and, since $G$ is bounded, the Lemma and (A.4) give $A_{1N} \xrightarrow{p} 0$. Because $G$ and its derivative are bounded, the Lemma says that $A_{2N} \xrightarrow{p} 0$ as long as

$$N^{-1} \sum_{i=1}^{N} \{\hat c_i(\alpha_0)\}^2 = O_p(1) \quad \text{and} \quad N^{-1} \sum_{i=1}^{N} \{\hat c_i(\hat\alpha_0)\}^2 = O_p(1),$$
which follow from (A.3), (10) and (11). Since (15) holds, to prove (8) we merely need to show that

$$N^{-1} \sum_{i=1}^{N} \hat c_i(\alpha_0) (Y_i - G(\alpha_0 \hat c_i(\alpha_0))) - E\Big\{N^{-1} \sum_{i=1}^{N} \hat c_i(\alpha_0) (Y_i - G(\alpha_0 \hat c_i(\alpha_0)))\Big\} \xrightarrow{p} 0.$$

This follows from Chebychev's inequality and (A.3), completing the proof. □

The Theorem does not follow from ordinary likelihood calculations because
the number of parameters increases with the sample size.
IV. A SIMULATION STUDY

To give some idea of the effect of measurement error, we conducted a small Monte-Carlo study of the logistic regression model

$$\Pr\{Y_i = 1\} = G(c_i/2 - 1), \qquad i = 1, \ldots, N.$$

Here the values $\{c_i\}$ were randomly generated as normal random variables with mean zero and variance $\sigma_c^2 = 3$, while the measurement errors were normally distributed with mean zero and variance $\sigma_v^2 = 2$, with each $c_i$ being replicated twice. We chose the two sample sizes $N = 200, 400$ and took 100 simulations for each sample size.
In Table 1 we report the Monte-Carlo efficiencies of the usual naive estimator and the functional MLE with respect to the logistic regression based on the correct values $\{c_i\}$. If the replicates of $c_i$ are $C_{i1}, C_{i2}$, we used $\bar C_i = (C_{i1} + C_{i2})/2$ and estimated the variance of $\bar C_i - c_i$ by the sample variance of $(C_{i1} - C_{i2})/2$.
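For readers wishing to reproduce the design, here is a sketch of the data generation and the naive fit (a reconstruction from the description above, not the original code; the Newton-Raphson fitter and helper names are mine, and the functional MLE step, which extends (5) and (6) to include an intercept, is omitted):

```python
import numpy as np

rng = np.random.default_rng(2)
G = lambda t: 1.0 / (1.0 + np.exp(-t))

def fit_logistic(x, Y, iters=25):
    """Fit Pr{Y=1} = G(a + b x) by Newton-Raphson; returns (a_hat, b_hat)."""
    X = np.column_stack([np.ones_like(x), x])
    theta = np.zeros(2)
    for _ in range(iters):
        p = G(X @ theta)
        W = p * (1.0 - p)
        theta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (Y - p))
    return theta

def one_replication(N):
    c = rng.normal(0.0, np.sqrt(3.0), N)            # sigma_c^2 = 3
    Y = (rng.random(N) < G(c / 2.0 - 1.0)).astype(float)
    C1 = c + rng.normal(0.0, np.sqrt(2.0), N)       # first replicate, sigma_v^2 = 2
    C2 = c + rng.normal(0.0, np.sqrt(2.0), N)       # second replicate
    Cbar = (C1 + C2) / 2.0
    # Estimate of var(Cbar_i - c_i), needed by the (omitted) functional MLE:
    sig2_hat = np.var((C1 - C2) / 2.0, ddof=1)
    return fit_logistic(c, Y), fit_logistic(Cbar, Y)

truth = np.array([-1.0, 0.5])                       # (alpha, beta)
fits = [one_replication(200) for _ in range(100)]   # 100 simulations, N = 200
mse = lambda k: np.mean((np.array([f[k] for f in fits]) - truth) ** 2, axis=0)
print("MSE efficiency of the naive fit (alpha, beta):", mse(0) / mse(1))
```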
The results make it clear that neither the usual naive method nor the functional MLE is acceptable. Further work is clearly needed to identify good methods.
TABLE 1

Monte-Carlo Mean Squared Error Efficiencies Relative to Logistic Regression Based on the True Predictors

$\Pr\{Y_i = 1 \mid c_i\} = G(\alpha + \beta c_i)$, $\beta = 1/2$, $\alpha = -1.0$, $i = 1, \ldots, N$

                              USUAL LOGISTIC    FUNCTIONAL MLE

$\alpha$            N=200          0.74              0.25
                    N=400          0.46              0.32

$\beta$             N=200          0.27              0.15
                    N=400          0.09              0.24

$\alpha+\beta$      N=200          1.13              0.59
                    N=400          0.60              0.53

$\alpha+2\beta$     N=200          0.38              0.28
                    N=400          0.13              0.43
REFERENCES
Carroll, R.J., Spiegelman, C.H., Lan, K.K.G., Bailey, K.T. and Abbott, R.D.
(1982). On errors-in-variables for binary regression models. Manuscript.
Kendall, M. and Stuart, A. The Advanced Theory of Statistics, Volume 2,
pp. 399-443. Macmillan Publishing Co., New York.