Ruppert, D. and Carroll, R.J.On the Asymptotic Normality of Robust Regression Estimates."

ON THE ASYMPTOTIC NOID.tA1ITY OF ROBUST REGRESSION ESTIMATES
by
Raymond J. Carroll* and David Ruppert**
University of North Carolina at Chapel.Hill
Abstract
Huber's (1973) proof of asymptotic normality of robust regression
estimates is modified to include the estimates used in practice, which
have unknown scale and only piecewise smooth defining functions 1jJ.
Key Words and Phrases:
Robustness, M-estimates, regression, linear model,
asymptotic theory.
*This research was supported by the Air Force Scientific Office of Scientific
Research under contract AFOSR-75-2796.
**This research was supported by the National Science Foundation Grant
NSF HCS78-0l240.
1.
Introduction
We consider the general linear model
'Yo1
=1
T· + Z./OO
1
(1.1)
= l, ... ,n),
(i
where
=
T.
(1. 2)
1
!
j =1
B~O)
c ..
J
1J
Here the BjO) are unknown parameters, the c ij are known constants, the zi are
independent and identically distributed with common distribution function F
(symmetric about zero), and 00 is a constant to be chosen later.
Huber (1973)
proposed estimating B by solving the system of equations
I-1
(1. 3)
e
1
1
1J
2
I-1
{l/J (0 (y. - t. (B) ) ) 11
(1. 4)
(j = 1, ... ,p)
l/J (0 (y. - t. (B))) c .. = 0
=
~}
o,
where
t. (B) =
1
~
!
j=l
c .. B·
1J
= E¢ l/J 2 (z)
J
,
(the expectation being taken under the standard normal distribution) and l/J is a
monotone nondecreasing function. Let C = {c .. }, f = C(CT C)-l CT be the
1J
projection matrix, and E be the maximum diagonal element of f.
Huber (1973) considers
solve (1.3).
0
fixed and known
(0
lIe proves· that if Ep2 ~ 0 as n ~
00
= l}
so that one only need
(which implies p3/n ~ 0)
and if l/J is bounded with two continuous bounded derivatives, then all estimates
of the form
a = L:lJ? =1 a.J S.J
2
(L:. a.
J
J
= 1) are asymptotoically normal.
It is
2
routine to extend his results to the system (1.3), (1.4) (see below).
The
smoothness conditions on t/J are not satisfied for three of the most corrnnon1y
used functions, namely
Huber's function
Ixl
Ixl
t/J(x) = x
= c sign (x)
c
> c
<
Hampel's function
t/J(x)
=
-t/J(-x)
=
x
o<x
=
a
a < x < b
=
aC c - x)
c-:D"
< a
b < x < c
x >
= 0
C
Andrew's function
t/J(x)
=
-t/J(-x)
=
o<x
sine (x/c)
= 0
<
TIC
x >
TIC
In this note we weaken Huber's conditions to include the connnon t/J-functions,
at the cost of slightly strengthened conditions on the rate of growth of p,
the dimension of the problem.
The results have been applied by Carroll and
Ruppert (1979) to the probe1m of testing for heteroscedasticity (Bickel (1978)).
As in Huber's proof, we assume C'C
problem, we can take sjO)
notation.
=
0 (j
=
= I.
Because of the invariance of the
l, ... ,p) and 00
=
1, primarily to simplify
3
2.
Notation, Assumptions and Main Results
1) vector for which l: a~ = Iia 11 2 = 1. Define
J
2
2
Note that lit (8) 11 = 11(311 . Define for j = 1, •.. ,p,
Let a be an arbitrary (p
si = L}=l c ij a j .
x
n
~.(B,o)
J
I
= -
i=l
~(O(Y'
1
- t. (B))) c .. /E~'(Y1)
1
1J
n
If' .(B,o) = B· J
J
I
i=l
~(y.)
c .. /E~' (Y1)'
1. 1J
while
~p+l (B,o) = -n
If'p+l (B,o) =
_1/
'2
-n-~
n
ir
l
2
{~ (o(Yi - t i (B))) -
((0-1)
+
iII
{~2(Yi)
0 /2 EY1
- t:J/2 EYI
~(Y1)~' (Y1)
~(Yl)~' (Y1))
•
Our estimates are solutions to ~.(S,a) = 0 (j = 1, ... ,p+1), and we hope to
J
approximate (~,a) by (S,o) which solves If'j(S,O) =
a
(j = l, ... ,p+l).
We make the following assumptions.
is odd, bounded, and constant outside a finite interval.
(2.1)
~
(2.2)
lp is Lipschitz of order one and has two continuous bounded derivatives
except at a finite number of points, which we take without loss as ±c.
(2.3)
F is symmetric about zero.
(2.4)
F is Lipschitz in neighborhoods of ±c.
Theorem 1.
such that
If (2.1) - (2.5) hold and, in addition, there is a sequence a
n
+
a
4
(2.6)
(1. 4) such that
then there is a sequence of solutions to (1.3)
1
II (8,
(2.7)
(np)~ (0 - 1))
II = 0p(p).
If in addition
(2.8)
then
II (8,
(2.9)
(np)12 (0 - 1)) -
(13, (np)~
(8 - 1))
II R 0
•
Thus, (2.1) - (2.5) and (2.8) imply that all esimtates of the form
&= L}=l aj Bj (I lal 12 = 1) are asymptotically normal.
Remark:
~
The result (2.7) is the starting point in constructing robust tests
for heteroscedasticity (Bickel (1978), Carroll and Ruppert (1979)); it implies
the crucial assumption T of Bickel's Theorem 3.1.
For balanced designs
(E = p/n) , assumption (2.6) is satisfied by choosing a
n
Y > 0, which then requires p4+2 Y/ n
when
3.
Whas
Proofs
+
= p-(l+Y) for small
0, as compared to Huber's condition p2 jn
+
two derivatives.
i
Proposition 1.
Suppose that (2.1) - (2.5) hold, that
uous derivatives, and that Ep
II (6,
the following hold:
+
0.
Then on the set
l~
(pn)'z (0 - 1))
2
II . : . Kp,
~
has two bounded contin-
°
5
(3.1)
II<HB,a) - '¥(B,a)
I I<P(B,a)
(3.Z)
- (B,
Proof of Proposition 1.
(3.3)
I
,¥.
J
J
Zk
= OplCEP ) 2) ,
(a - 1))11 =
Op(P~
+
(EpZ)~)
By a Taylor series expansion,
a . (<p ' (B, a) -
j=l
(np)~
II
J
(B, a)) + (a - 1)
¥ s 1.y.1 1jJ' (y.1) /E 1jJ' (Y1)
i=l
n
-a
=
L
. 1
1=
s. t. (S)(1jJ' (Y1·) - E 1jJ' (Y1))/E 1jJ' (Y1)
1
1
IAn I
n
< la - 11
Ii=l
L
s. t. (B)Y· 1jJ"((1 + nZ·)y·)/E 1jJ' (Y1) I
1 1
1
1 1
n
I L s. t?(B) 1jJ"(ay. + n1-)/E 1jJ'(Y1)I .
i=l 1 1
1
1
,
e
On the set IIBII
Z
~ Kp,
la - 11 ~~,
term on the r .h. s. of (3.3) is
O(E~
Iiall = 1, Huber shows that the second
II BliZ) .
Since 1jJ" = 0 outside a finite
set and InZi l > ~, the first term on the r.h.s. of (3.3) is bounded by
Thus,
We next
(3.4)
consider~, in particular its last two terms. Since
n
(a - 1) if1 si Yi 1jJ'(Yi)/E 1~'(Y1) = OpCla - 11)·
rluber shows that
l: si = 1 and
6
n
L
(3.5)
i=l
Thus, uniformly on the set II BII
2'
-k
.:: Kp, Ia-II ::. Kn 2, 1'1 a I I
=
1 we have
from (3.3) - (3.5)
I! a (<p. (s, a) ~
(3.6)
o
j=l J J
'¥J'
(s, a) I =
°P
(£\
+
pn- !2)
•
Another Taylor series expansion shows that
IBnl =
(3.7)
l<pp+1 (s,a) -
~p+1(s,a) + Cn1 + Cn21
-k
-k
2
.:: M{ Ia-II II S II + pn 2 +. n 2 Ia-II } ,
where
and
{2 E Y1 1jJ(Y1) lP' (Y1)}C 2
n
Since Cn1
= ope Ia - I
2
3
k
i) and Cn2
k
n
-k
L
=
2a n 2
=
0p ((pin) 2) , we obtain
-k
°
1
1=
t. (S) 1jJ(y.) 1jJ' (y.) .
1
1
1
k
Since (£p)2 2. (p /n)2 2.pn 2, we thus obtain (3.1).
Equation (3.2) follows
from (3.20) of Huber (1973).
Proof of Theorem 1.
The proof of Proposition 1 makes it clear that we need to
obtain bounds for IAnl and IBn' uniformly on the set I lsi
Ia-II
-k
::. Kn 2 and
II a II = 1.
Rewrite
I .:: Kp,
7
n
IAn I = I l H(i, 13,0)
i=l
E ljJ' (Y1)
I
n
L
i=l
I
k
Let I be the indicator ftmction and let an
IAn I =
n
.L\
1=1
H(i,B,o)
I(-c
+
-+
0, n 2 an
-+ 00.
Then
an -< y.1 <- c - an)
+ I (y. > c + a ) + I (y. < - c - a )
n
1-
.
I(-c - a
+
< y. <
n
n
1-
1
-c
+
a )
n
+
I(c -
a <yo<c+aJ
n
1
For notational purposes, define di = o(Yi - tieS)). Then
n
A1 =
n
L
. 1
H(i, B, 0) I (- c
+
a
1=
x
< y. < c - an)
n
1
Z
{I(-c < d.1 < c) + I(ld·1
> c)} = A(ll) + A(l ) •
1
n
n
By Proposition 1, IA~i) I = O(£~).
Since
ljJ
is Lipschitz and constant outside
a finite interval,
However, since
10 - 11
<
-
Kn
-k
2
k
and n 2 an
-+
0,
Now
n
(3.8)
l
. 1
1=
It I t. (B) I > a /Z} < 4p/a Z
1
n
-
n
n
8
n
L
(3.9)
Iti(S)I I{lti(S)I > a n IZ} -< 4p/an ,
i=l
so that (3.8) and (3.9) imply
I
IA~i) I ~
However,
10 -
E~{lo
4M
11
-
pian Z
+
pia}
n
Z -!<
lila < K(na ) 2 so that
nn
!<
Z -!<
IAn 11 = O(E 2 pian (1 + (nan) 2)
(3.10)
The same bOlmd (3.10) holds for IAnZI and IAn31.
•
Noting once again that l/J is
Lipschitz and constant outside a finite interval,
n
.
IAnSI ~ i~l Isil {Io - 11 +. Iti(S) I} I{IYi - cl < an}
L
~
ME~
.
1
{aI-l l
1
Gn + p ~ Gn ~} '
where
n
(3.11)
Gn
= . L1
1=
I {IY· - c I < a } .
1
n
From Lerrnna 1 of Carroll (1978), provided
nan
2. log n,
Gn = 0p (nan ) .
This gives
!<
IAn S I = 0p (( E npan ) 2) •
(3.1Z)
The bound (3.1Z) also holds for IAn41.
(3.13)
IAn I = 0p (E~
The same bound holds for IB
n
pian (1
I.
Thus,
+ (nan Z)-~) + ( E npa )~) .
n
Thus ,
9
II<PCS,o)
(3.14)
- '¥(S,o)
II
~
=
0p(E 2 p(l
=
0p(r(p,n)) .
+
2
-~
(nan) 2)/an
+
Equation (3.14) is the generalization of HUber's (3.18).
2
sufficiently large K, on the set I lsi 1 ~ Kp, 1o -
II<PCS,o)
(3.15)
~
- (s, (np) 2 (0 - 1))
II
k
lE npan ) 2)
As he shows, for
11 ~ Kn- Yz ,
< r(p,n) +
k
Yz(Kp) 2 .
'Thus, if
k
r(p,n)/p 2 +
(3.16)
a,
Brouwer's fixed point theorem enables us to conclude that
the ball
true.
II (S,
(np)
~
2
(0 - 1))
II
< Kp.
~
has a zero inside
Equation (3.16) is true if (2.6) is
The rest of the proof parallels that of HUber.
0
References
Bickel, P.J. (1978).
non-linearity.
Using residuals robustly I: tests for heteroscedasticity,
Ann. Statist.
Carroll, R.J. (1978).
6~
266-291.
On almost sure expansions for
~1-estimates.
Ann. Statist.
6, 314-318.
Huber, P.J. (1973).
Ann. Statist.
1~
Huber, P.J. (1977).
Robust regression: asymptotics, conjectures and Monte-Carlo.
799-821.
Robust Statistical Procedures.
SIAM, Philadelphia.
Unclassified
•• I!nt
SECURITY CLASSIFICATION 0 F T HI S PAGE (Wh en D t
-.
r d)
READ INSTRUCTIONS
BEFORE COMPLETING FORM
REPORT DOCUMENTAJION PAGE
2. GOVT ACCESSION NO. 3.
1. REPORT NUMBER
••
RECIPIENT'S CATALOG NUMBER
S.
TITLE (and SubtWIt)
TYPE OF REPORT" PERIOD COVERED
On the Asymptotic' Normality of Robust Regression
Technical
Estimates
6. PERFORMING
O~G.
REPORT NUMBER
Mimeo Series #1242
1.
8.
AUTHOR(_)
Raymond J. Carroll and David Ruppert
9.
CONTRACT OR GRANT NUMBER(s)
APOSR-75-2796
10. PROGRAM ELEMENT. PROJECT, TASK
AREA" WORK UNIT NUMBERS
PERFORMING ORGANiZATION NAME' AND ADDRESS
Department of Statistics
UNC Chapel Hill, NC 27514
12.
11. CONTROLLING OFFICE NAME AND ADDRESS
REPORT DATE
July 1979
Air Force Office of Scientific Research
Bolling APB, OC
13. NUMBER OF PAGES
10
14. MONITORING AGENCY NAME" ADDRESS(II dlU.rltnt 'rom COllfrolltn, Ott/Cit)
15.
SECURITY CLASS. (of thl. report)
Unclassified
15•.
16.
DECL ASSI FICATION/ DOWNGRADING
SCHEDULE
DISTRIBUTION STATEMENT (of this Roport)
Distribution unlimited
- approved for public release
.
i
11.
DIST I'll BUnON ST A TEMEN T (of the .b.tr.. ct "nt.r.d In Block 20, If dlff.rent from R.port)
18. SUPPLEMENTARY NOTES
19.
KEY WORDS (Contlnult on r.v.r." .Id" If n.c ••• ary . .d Id.ntlfy by bfock numb.r)
Robustness, M-estimates, regression, linear model, asymptotic theory
20.
ABSTRACT (Continu. on rev.r • • • Id. If n.c .....ry . .d Id.ntlfy by block numb.r)
Huber's (1973) proof of asymptotic normality of robust regression estimates
is modified to include the estimates used in practice, which have unknown scale
and only piecewise smooth defining functions ljJ.
DO
FORM
1 JAN 13
1473
EDITION OF 1 NOV 65 IS OBSOLETE
Unclassified
SECURITV CLAS51"'CATION OF THIS PAGE (When D ..t .. Entered)
SE·CURITY CLASSIFICATION OF THIS PAGE(Wh..
D.'. Ifn'.red)
SECUR1TV CLASSler,CAT'OIi Of"
T,-U'" PAGE(Whan Dat. Entfi'