Carroll, R.J.;Robust Estimation in the Heteroscedastic Linear Model When There Are Many Parameters."

ROBUST ESTIMATION IN THE HETEROSCEDASTIC LINEAR M)DEL
WlIT:N TIIERE ARE MANY PARAMETERS
by
R.J. Carroll
University of North Carolina at Chapel Iii11
Abstract
We study estiuultion of regression parameters in heteroscedastic
linear models when the ntunber of parameters is large.
generalize work of
and
(~rro11
l~ber
The results
(1973), Yohai and Maronna (1979), and Ruppert
(1979).
Key Words and Phrases: Iieteroscedasticity, Linear MOdels, Regression,
Weighted Least Squares, Robustness.
AMS 1970 Subject Classifications:
Primary 62J05; Secondary 62G35.
Research supported by the U.S. Air Force Office of Scientific
Research under Contract AFOSR-80-0080.
2
1.
Introduction
Consider the linear model given by
Y.1 = T.
1
(1.1)
= X.1 S
T.
1
Iierc
{E.}
1
+
a.c. for i
1 1
= 1, ... ,N ,
.
arc independently and identically distributed with symmetric
distrihution F, {x.} are (lxp) design constants, S (pxl) is the
1
regression vector, and {a } expresses the possible heteroscedasticity in
i
the model. Anscombe (1961) and Bickel (1978) have considered tests for
the hypothesis of homoscedasticity, Le.,
(1. 2)
In the case that (1.2) is accepted, we are in a standard linear
model for which a vast theory applies.
Huber (1977) and others have
argued (for this homoscedastic case) that least squares methods should
not be used blindly because the resulting estimates are sensitive to
outliers and departures from the assumption of nonnal errors.
Huber
(1973) and many others, for example Krasker (1978), have proposed robust
regression estimates.
Huber (1973) and Yohai and Maronna (1979) study
the asymptotic properties of these methods as N ~
simultaneously.
00
and p "*
00
They argue that such "many parameter" asymptotics are
importmlt, both because they often occur in practice and because they
show how large a s<lTlqlle size is needed for a given model.
their work formalizes the rule of thumb:
In effect,
"one should take about 10
observations for every parameter."
Tn contrast to the homoscedastic case, there has not been much work
on estimating
e in
the heteroscedastic case that (1.2) is rejected, and
3
to our knowledge there has been no work at all on the important problem
of N -+
00
and p
-+
00
simultaneously.
Fuller and Rao (1978) consider the
case where the Y.] occur in groups for which cr.1 is constant, while Box
and Hill (1974) and Jobson and Fuller (1980) assume that
cr. = [f('r. ,8 )]
1
1
0
(1. 3)
-1
.
All three papers consider only Gaussi.an errors.
Box and Hill suggest
generalized weighted least squares (WLS) wherein the weights (1.3) are
estimated and WLS applied.
Jobson and }uller consider maximum likelihood
techniques which are particularly sensitive to non-normal errors as well
as small misspecifications in (1.3).
For the case p fixed and N
-+ 00,
Ruppert and Carroll (1979)
developed a class of regression estimates for the heteroscedastic linear
model, which are both robust and do not depend on the normal error
assumption.
Their estimates are a natural generalization of Huber's
Proposal 2 (1977) and are defined as follows.
First, assume
f(T ,0) = exp(8h(T)) , 8(l x r)
(1.4)
This class includes the important special cases
Let
be an odd function and X be an even function (X(-oo)<O, X(+oo»O).
11)
A
Let S
p
A
be a preliminary estimate of S and define e as any solution to
N
(1.6)
= N- 1 L\' X((Y.-t.)f(t.,8))h(t.)
111
1
i=l
= 0 ,
4
where t.
A
=
x. (3
IIp
•
If an exact solution to (1.6) does not exist, define
A
IHN(0) I.
to minimize
Finally, define S as the solution to
(1. 7)
We make the conventions
(1. 8)
with the parameter 00 chosen so that
(1. 9)
EX((Y.-T.)f(T.,e
l
1
1
o))
= EX(c)
1
=
0 .
'lhe conventions (1.8) - (1.9) ensure consistency of the solutions.
In the
case of homoscedasticity, the solution to (1.6)-(1.7) is trivially
A
Huber I s Proposal 2 if the preliminary estimate B
p
is also.
Ruppert and
Carroll (1979) show that, in general, the solution to (1.6) and (1.7) is
asymptot j cally equivalent to the "optimal" robustified WLS obtainable
when the
{a.}
1
are
actu~lly
known, Le.,
L(N!i CB - S)) -+ Np(O , S-1 El/J2 CC1 )/(El/JI C(1))2) ,
.
N-+oo
-1
S = lun N
(1.10)
~LX. x. f 2 (T., e) = lim SN '
I
i=l
1
1
1
N+<x>
Np(O,V) = p-variate normal, mean 0, covariance V .
M:mte-Carlo results show that the asymptotics are surprisingly accurate
for small samples.
A
e
5
The purpose of this paper is to generalize the results of Ruppert
and Carroll to thc case
N~, p~,
under conditions similar to those
needed by Bickel (1978) and Yohai and Maronna (1979).
In the next
section we consider the estimatcs of B when an appropriate estimate of 8
is available, and in Section 3 we consider the estimation of 8.
2.
'The Limit Distribution of the Regression Estimate
A
Assuming that 0 is of the correct order ((AlD) below) and h(·) is
sufficiently smooth ((A3) below), one can follow the steps of Yohai and
A
We first establish the order of B.
Maronna (1979).
Theorem 1.
Under (Al)-(Al1)
beZow~
TIle main result is as follows.
Theorem 2.
A[wume (Al) - (A12) beZow.
Let V
N
be any syrronetric square
root of SN ((1.10) and (All)) and "let {aN} he a sequence of (PNX1)
vectors with
I~I
= l.
7~en
Here arc thc assumptions.
(AI)
N- 1
N
L
i=l
(A2)
x~
1
x.1 = I .
6
(A3)
The function h(x)
~md
Ix I
bOilllded on the set
fCx,O)
:S
its first two derivatives are continuous and
~
M+
E
for some
M] on this set illliformly for
E
> O.
-1
Further, M1 <
10 - 00 I :s e:: •
(AS)
~
is non-decreasing and bOilllded.
(A6)
~
is Lipschitz and constant outside an interval.
(A7)
If D(u,z) = (Hu+z) - l/J(U))/Z, then there exists b, c, and d such
that P(le::11<CMi2) > 0 and D'(u,z) ~ d if lul:sc, Iz/::;b.
(AS)
~
and its first three derivatives are continuous and bOilllded.
(A9)
2
1\
(AIO)
Nle - 81 /PN = 0 (1).
P
(All)
o<
(A12)
TN P)- () where TN = Cap - 8)
:S lim sup A
(SN) < 00, where A . (.) and
mln
max
mIn
AmaxC') are the minimum and maximum eigenvalues respectively.
Wo
_
-
lim inf A . (SN)
-!i
N
W C6-8) ,
o
\'N £2 (
)., -1 "
l.l.
l"OO
,1NVN x.x.
1
I. 1
Remarks on Asstnnptions:
u~ed
I
by Imber (1973)
WId
d
X.1 -:r.:(1"[
f
(ll·,8 0)·
Assumption (Al) is a reparameterization device
Bickel (1978).
,~sswnption
CAZ), which is also
used by Bickel, can be deleted if one strengthens (A3) by making M = 00,
a set of circunstances holding for the homoscedastic case.
As stnnption
7
lA3) is stronger than rC4ui recl but is quite convenIent [fill is tTIle [or
the first two special cases in (1.5).
Asslunptions (A4)-(A8) arc as in
Yohai and Marorma (1979) except for the second part of (A6) , which holds
for those
\~
cOJTDTIonly used and which can be relaxed at the cost of some
technical complications.
p~/2 in
Part (A8), in particular, can be relaxed if
is strengthened to p~; the proof becomes quite messy.
(1\4)
Part
(1\9) holds, [or example, in cases of homoscedasticity or, in general, for
least squares if
EE:i
<
00,
and (AlD) will be verified in the next section.
Part (All), in effect, fonnalizes (1. 8) and follows from (Al) in the
important special case of homoscedasticity.
iL<;sumption (A12) is somewhat awkward looking and is the only
obstacle to our Theorem 2 being a real generalization of previous work.
1I0wever, (Al2) holds in important special cases, including
hOllloscedasticity, as the following shows.
PrQEos j hon 1.
Assumption (Al2) holds if any of the following obtain:
(2.1)
The vapiances ape homoscedastic.
l2.2)
,2
I·~£l
<
(2.3)
~.
P
<;0,
p
7.-S
t h e .,I;east Bquapes estimate, and
. -1 \'N ,
lim sup AmtX
a (N
L} x. x. f(T.,8
11
1
Proof of Theorem 1.
Maronna.
Let Yj
)) <
00
'111e proof parallels that of Theorem 2.2 of Yohai and
- 't j
:::
zi = C/f(T ,0 ) and define
i
_1~
d. = x. N
1
0
l
-2
•
0
8
Since t/J is monotone t
Now
where
A
N2 (y) = L
~
L
i';l
1\~2
1\
D(z. f(t't e) t -L(l. yf(t.tO)PNHd. y)
1
1
1
1
221\
1
f (t't e)
1
From (A3) and (A4) it follows that
From (A3) we finu that
N
AN2 Cy) ~ (Ld/M1 ) i~l Cdiy)
2
~ (W/~)P(I£1IScMi2) +
the last step following from Y-M and (A?).
ANI Y
= 0p (1) •
Rewrite ANI y as,
2
ICI£il~~)
0pCl)
t
We thus need only prove
9
e
EIBNI 2 = 0(1)
By (AS) ;:md (All),
so that
By (AZ), (A3), (A6) and (AB), for some M > 0,
Z
I(ANl-B.·hl
-N
1
N
(NpN)-':2.r
Z
s; M
Ix. rl (It .• T·1
1.=1
1.
1.
1
A
+ le-e/)
By (Al), (A9) and (AlO), this is 0 (1) tmifonnly for
p
Ir I = 1,
completing
0
the proof.
Proof· of 111eorem Z.
and Maronna.
1he proof is similar to that of Theorem 3.2 of Yohai
By a Taylor expansion,
,N
o = L1
.
A
-1
A
I
tHCY.1- 1
x. S)fCt.1. ,8))a.N'.V x.] 1
q.
N
where
q.
1
A
= f(t.,e),
r. = f(T.,6 )
1
1.
1. 0
,N
wI = N-~ £1
W
z=
1 1.
1:1
(E~J')N
-1 x.
~Cz.q.)q.a.~VN
1.
N
I
1.
aN VN (S-f3)
h
\,N (~ I ( z.q. ) q.2 - (E'"If' I ) r.2) a.'.'V-Nl x.I x.
w I -_ N-~ L]
3
.
1. 1. 1.
1. N
1. 1.
,N
W -- N-~ L1
4
""I(
If'
l
)
I
I
z.q.
aN'V-N x.x.
x.q.3
1 1.
1. 1. 1. 1.
10
Note that from (A2) , (A3) , and (AlO),
(2.4)
,q.1 - r.I
I =I
0 ( It.T. I +
I
Then (2.4),
(AB),
Ie-e/) = 0 ( It.1 -T.1 I + (PN/N) !o<2)
•
and Theorem 1 show that
Thus, as in (3.5) and (3.6) of Yohai and Maronna, we obtain
(2.7)
(2.8)
-~
R = N
N2
By
\N
2
2
-1,
1\
[,1(1/1' (z.q.)q. - 1/J' (e:.)r.)a.',V x. x. (13-13)
111
l I N N 11
Taylor expansions of 1/J' and f, we obtain
where TN is given in (Al2).
Because of (Al2) , we see that the proof is
completed by showing
(2.9)
wI.
= N-~
\N
-1 ,
[.1 1/J(£.)r.a.'.V
x. + aP (1) •
lIN N 1
By using the synmetry of F, the fact that 1/J is odd, and as in (3.5) and
(3.6) of Yohai and Maronna, (2.9) follows.
Proof of Proposition 1.
Under (2.1),
o
11
verifying (Al2).
A similar easy proof follows from (2.2).
When (2.3)
holds,
ITN 12
l(PN/N))~
= 0p
,
_ tN ,-1
2
A
2
DN - £1 (ar:.FN xi) (xi (l3p - B)) ,
E~ ~
N-
1
maxlxil2 L~(aNV~IXi)2
= O(maxlxi,2) ,
o
completing the proof by (A4).
3.
The Order of
A
e
From Theorems I and 2, we see that we must exhibit a sequence of
estimates
A
e satisfying
assumption (AlO).
We will show that any solution
to (1.6) will work under the following assumptions.
(BI)
The even function X satisfies the same assumptions as does
(B2)
'The statistic
C~,
= 80
\X>
=
N
(B3)
-1
N
N
I
i=l
~.
~
h(T.)(t.-T.) = 0 ((PN/N)2) •
1
1
1
P
lim inf Amin (QN) ~ lim sup Amax(QN) < 00,
1
Q = (Ee: X' (£I))N- l~ h(Ti)T h(T )·
l
i
N
0 <
A
We note that, when r is fixed, (132) holds when Bp is the least
squares estimate, while Huber's (1973) proof can be used for classical
12
M-estimates if (2.2) holds.
Also, (B2) follows from assumption (AS) if
there is homoscedasticity; for in this case,
6
0
=
(log a 0 ••• 0)
h(T)
=
(1 h2 (T) ... hr (T))
N
~=(logO')N-l
Theorem 3.
L
i=l
(t.-T.).
]. ].
Assume (Al) - (A9)" (All) - (Al2)" and (Bl) - (B3).
Then (AlO)
holds.
We need the following lenuna.
Lemma 1.
Let {v.} be i.i.d. mean-aero bounded random variables; let
1
{s.} be uniformZy bounded constants.
Then
1
A__
.~
(3.1)
Proof.
= N- l
i=l ].
1
1
1
= 0P (PNIN)
•
From (A9),
2
AN
s IN
-1 N
2
l
v.s .x·1 0 (PN/N)
i=l 1 1 1
P
Proof of 'Theorem 3.
(A4), and (A9),
(3.2)
N
\' v.s. (.-T.
t)
L
=
n-o (PN/N)
-N P
Define HN(·) as in (1.6) and note that by (Al),
13
e
From (A4) , (B1) and Lenuna 1 ,we thus obtain
~(aO)
(3.3)
1 N
= N- i~l
X(£i)h(l'i) + (''N E£l Xl (£1) + 0p(PN/N)
= 0p ((PN/N)~)
from (Bl) , (B2) and the fact that EX(£l)
= o.
By (A4), (Bl)-(B3), (3.2)
and a Taylor expansion, we find that for any M > 0,
2
Now fix
£
> 0, L > O.
O1oose y ,NO' DO such that N
P{~I~I ~ DOP~}
~
NO implies
< £/3
P{N-~llNl X(£.)h(l'.)1
~
1
1
y} < £/3 •
Define M by
Z
O1oose N
l
~
NO so that N
~
N
l
implies
and
P{left side of (3.4)
~
y/Z}
< £/3 .
1hen with probability at least 1-£, N ~ N and
l
Illi
1
= MZP~
imply
14
(3.5)
Also note that
if
IAI
M~(eo+Sl:lN-!:z) is increasing as a function of s; so that
I
~ M2P~, then as in Jureckova's (1977) proof, using (3.5),
(3.6)
From (3.6), we have with probability at least l-E,
inf{PN~lf~(e)l: NJj\e
~
i.nf{
~
L.
- 80 1 ~
M2P~}
IllI\r(80+M(~) 1/ Illlp~:
Illi
~ M2P~}
o
Since L is arbitrary, the proof is complete.
Remark:
h(T).
Throughout we have taken r fixed; r is the dimension of 8 and
0
1his seems a perfectly reasonable assumption, as one is unlikely
to construct a complicated. ftmction for
suffice for most applications.
when r
0.,
1
and in fact, r
~
3 will
However, the proofs can be forced through
= r N = 0 (p~ ) . One must aSSl.D11e that the bounds in (A3) are
uniform in the components of h(·)
asslDllption (B2).
When
gp
t
and the only sttunbling block is the
is the least squares estimate and
turns out that (B2) holds whenever
EEi
<
00, it
15
References
Anscornbe, F.J. (1961). Estimation of residuals. ~oa. Fourth BerkeZey
Symp. Math. Statist. Frob. (J. Neyman, Editor), pp. 1-36.
University of California Press, Berkeley.
Bickel, I'.J. (1978). Using residuals robustly I: Tests for
heteroscedasticity, nonlinearity. Ann. Statist. 6, pp. 266-291.
Box, G.E.P. and Hill, W.J. (1974). Gorrecting inhomogeneity of variances
with power transformation weighting. Teahnomet1'ias 16, pp. 385-389.
Fuller, W.A. and Rao, J.N.K. (1978). Estimation for a linear regression
model with tmknown diagonal covariance matrix. Ann. statist. 6,
pp. 1149-1158.
P.J. (1973). Robust regression: Asymptotics, conjectures and
MOnte-Carlo. Ann. Statist. 1, pp. 799-821.
~Iuber,
~Iuber,
P.J. (1977).
Robust StatistiaaZ Froaedures.
SIAM, Philadelphia.
Jobson, ,J.D. and Fuller, W.A. (1980). Least squares estimation when the
covariance matrix and parameter vector are functionally related.
J. Am. statist. Assoa. 75, 176-181.
Jureckova, J. (1977). Asymptotic relations of M-estimates and
R-estimates in linear regression models. Ann. Statist. 5, 464-472.
Krasker, W.S. (1978). Estimation in linear regression models with
disparate data points. To appear in Eaonometriaa.
Ruppert, D. and Carroll, R.J. (1979). M-estimates for the heteroscedastic
linear model. M,f,meo Senes '#1243, lhliversity of North Carolina.
Yohai, V.J. and Marorum, R.A. (1979). Asymptotic behavior of M-estimates
for the linear model. Ann. statist. 7, 258-268.
--...------
SECURITY CL.ASSIFICATION OF THIS PAGE (IVIINI
/'''1#' 1'",..".",
READ INSTRUCTIONS
BI-:FORI<: COMPLETING FORM
REPORT DOCUMENTArION PAGE
I. REPORT NUMBER
.______._ _ _ •
4.
7,
~~
t:~T
____
.
Robust Estimation in the Heteroscedastic Linear
Model When There Are Many Parameters
TI TL.E (.nd Subtlll .. )
.
AUTHORrs)
__
.
RECIPIENT'S CATALOG NUMBER
5.
TYPf; OF REPORT & PERIOO COVERED
TEOiNICAL
~.
6.
PE.RFORMING
o'G.
REPORT NUMBER
Mimeo Series No • 1321
.
6.
R.J. Carroll
9.
l.
ACCH&ION NO.
CONTRACT OR GRANT NUMBER(s)
,
AFOSR Contract AFOSR-80-0080
10.
PERFORMING Of<GANIZATION NAME AND ADORES';
PROGRAM ELEMENT. PROJECT. TASK
AREA & WORK UNIT NUMBERS
_.
I 1.
14,
1?
CONTROL.L.ING OFFICE NAME ANO ADDRESS
Air Force Office of Scientific Research
Bolling Air Force Base
Washington, DC 20332
REPORT DATE
December 1980
13
NUMBER OF PAGES
I!>.
SECURITY CLASS. (01 thla ,eport)
15
.-
MONITORING AGENCY NAME:: IJ, AOORF.SS(II <1;11"'011' "om Conl,olll"" UIII<-,,)
UNCLASSIFIED
'15;;.
._ _---------_._._
... _., ..
16. DISTI'lIBUTION STA"~MENT rof till a N ..,''''t)
-
..
,.~_.-
DEC"L-ASSiFiCAi'Tci'N700WNGRADING
SCHEDULE
---_._._--------
Approved for Public: Release - - Distribution Unlimited
".
._.'R_·"_·__ +_
DISTRIBUTION STA-TEMEN T (.,( til ...h.t".d ,·"t",,·,( /n nlo,," 2IJ. /I,ll1f", ..nl
-'-,-----
""111 U"pn,t)
18. SUPPLEMENTARY NOTES
19.
KEy WOROS (Conllnue on reve,.o ~lrlo' II " .... ",...",. and Id'mt/ly by Mod "uml...,)
_.
Heteroscedasticity, Linear MJde1s, Regression, Weighted Least Squares,
Robustness
20.
ABSTRACT (Con tIn"" on rpv",." sid.. II t.-,·I'''''''I' lind /d"l"lIy by b/"ck ""mh",)
We study estimation of regression parameters in heteroscedastic linear
models when the number of parameters is large. The results generalize work
of lfuber (1973) , Yohai and Maronna (1979) , and Ruppert and Carroll (1979).
DO
FORM
1 JAN 73
1473
EOITION OF 1 NOV 611 IS caSOL.ETE
SECURITY
,
.
--l
UNCLASSIFIED
CLA5~IF1CATION
OF THIS PAGE (K11e"
e
I
D.,. E n l = - - J
.
-------_.._---_ _--- ---..
SECUfIIlTY CLASSIFICATIC.).. OF Tu,r PAGE(~"," 0.'. Z;nrerad)
•
, J
I