Carroll, R.J. and Ruppert, D. "Almost Sure Properties of Robust Regression Estimates."

Almost Sure Properties of Robust Regression Estimates
by
Raymond J. Carroll*
and
David Ruppert**
We consider Huber's Proposal 2 for robust regression estimates in the
general linear model.
The estimates are first shown to be strongly consistent.
We then develop an almost sure expansion of these estimates, approximating them (to order $o(n^{-1/2})$) by a weighted sum of bounded random variables.
The approximation is sufficiently strong to permit construction of sequential fixed-width confidence regions for the regression parameter.
AMS 1970 Subject Classifications: Primary 62G35; Secondary 62E20, 62J05
Key Words and Phrases: Regression, robustness, linear model, almost sure properties, consistency, M-estimates, fixed-width confidence intervals.
*This work was supported by the Air Force Office of Scientific Research under Contract AFOSR-75-2796.
**This work was supported by the National Science Foundation Grant
NSF MCS78-01240.
1. Introduction
Consider the general linear model
(1.1)    $Y_i = c_i \beta_0 + \varepsilon_i$,
where $\{c_i\}$ is a sequence of $(1 \times p)$ vectors of fixed constants, $\beta_0$ $(p \times 1)$ is the regression parameter, and $\varepsilon_1, \varepsilon_2, \ldots$ are independent and identically distributed (i.i.d.) errors.
Least squares estimation of $\beta_0$ is known to be sensitive to outliers (Andrews (1974) and Carroll (1979) give empirical demonstrations of this fact) and inefficient if the error distribution is heavier-tailed than the normal distribution (Huber (1973), (1977)).
For this reason Huber (1973), (1977) has proposed a class of competitors to least squares called M-estimates.
For given functions $\psi$, $\chi$, Huber's Proposal 2 involves solving the simultaneous equations
(1.2)    $n^{-1} \sum_{i=1}^{n} \psi((Y_i - c_i \beta)/\sigma)\, c_i = 0$
(1.3)    $n^{-1} \sum_{i=1}^{n} \chi((Y_i - c_i \beta)/\sigma) = \xi = E_\Phi\, \chi(Z)$,
where the expectation is taken under the standard normal distribution function $\Phi$.
The original Proposal 2 (Huber (1973)) chooses
(1.4)    $\chi(u) = \psi^2(u)$,
a choice that, for convenience, we make throughout.
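As an illustration only, the system (1.2)-(1.3) with the choice (1.4) can be solved numerically. The sketch below assumes the clipped-linear $\psi$ of (1.5) with tuning constant k = 1.5 (an assumed value) and uses an alternating fixed-point scheme: a rescaling of $\sigma$ followed by a least squares fit to modified residuals. The scheme, the names `huber_psi`, `xi_normal`, `proposal2`, and the simulated data are assumptions of this sketch and are not taken from the paper.

```python
import math
import numpy as np

def huber_psi(u, k=1.5):
    """Clipped-linear psi of (1.5): max(-k, min(u, k))."""
    return np.clip(u, -k, k)

def xi_normal(k=1.5):
    """xi = E[psi^2(Z)] for Z ~ N(0,1), closed form for the clipped-linear psi."""
    Phi = 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))         # normal cdf at k
    phi = math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)  # normal pdf at k
    return (2.0 * Phi - 1.0) - 2.0 * k * phi + 2.0 * k * k * (1.0 - Phi)

def proposal2(C, y, k=1.5, tol=1e-12, max_iter=2000):
    """Illustrative fixed-point iteration for (1.2)-(1.3) with chi = psi^2."""
    xi = xi_normal(k)
    beta = np.linalg.lstsq(C, y, rcond=None)[0]   # least squares starting value
    sigma = np.std(y - C @ beta)
    for _ in range(max_iter):
        r = y - C @ beta
        # scale step: rescale sigma so that mean psi^2 moves toward xi
        sigma_new = sigma * math.sqrt(np.mean(huber_psi(r / sigma, k) ** 2) / xi)
        # location step: least squares fit to the modified residuals
        step = np.linalg.lstsq(C, sigma_new * huber_psi(r / sigma_new, k), rcond=None)[0]
        beta = beta + step
        converged = np.max(np.abs(step)) < tol and abs(sigma_new - sigma) < tol
        sigma = sigma_new
        if converged:
            break
    return beta, sigma

# Simulated example (hypothetical data): intercept plus one covariate,
# with 10% of the errors inflated to mimic heavy tails.
rng = np.random.default_rng(0)
n = 400
C = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0 = np.array([1.0, 2.0])
eps = rng.normal(size=n)
eps[rng.random(n) < 0.1] *= 10.0
y = C @ beta0 + eps

beta_hat, sigma_hat = proposal2(C, y)
u = (y - C @ beta_hat) / sigma_hat
eq12 = C.T @ huber_psi(u) / n                     # left side of (1.2)
eq13 = np.mean(huber_psi(u) ** 2) - xi_normal()   # left side of (1.3) minus xi
print(beta_hat, sigma_hat, eq12, eq13)
```

At the returned values both estimating equations should be satisfied up to numerical tolerance, which is what `eq12` and `eq13` report.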
There are two classes of functions $\psi$ commonly used (Andrews, et al. (1972)).
The first are bounded and monotone non-decreasing, the prototype of which is Huber's
(1.5)    $\psi(u) = \max(-k, \min(u, k))$.
Often, better robustness properties are obtained by using the solution to (1.2)-(1.3) with $\psi$ given by (1.5) as an initial estimate, and then performing one step of Newton's algorithm using a function $\psi$ which redescends to zero, such as Hampel's
(1.6)    $\psi(u) = -\psi(-u) = \begin{cases} u, & 0 \le u \le a \\ a, & a < u \le b \\ a(c-u)/(c-b), & b < u \le c \\ 0, & u > c. \end{cases}$
The asymptotic properties of such one-step estimates can be established (Bickel (1975), Carroll (1977)) if one knows the asymptotic properties of the estimates based on (1.5).
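The two families of $\psi$ functions can be written down directly. The following sketch implements the clipped-linear function (1.5) and a three-part redescending function of Hampel type as in (1.6), assuming the usual form in which the third branch descends linearly from a at u = b to 0 at u = c. The tuning constants (k = 1.5 and a, b, c = 2, 4, 8) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def huber_psi(u, k=1.5):
    """Monotone, bounded psi of (1.5): max(-k, min(u, k))."""
    return np.clip(u, -k, k)

def hampel_psi(u, a=2.0, b=4.0, c=8.0):
    """Odd, three-part redescending psi; zero for |u| > c."""
    u = np.asarray(u, dtype=float)
    au = np.abs(u)
    magnitude = np.where(
        au <= a, au,                                   # linear part
        np.where(au <= b, a,                           # constant part
                 np.where(au <= c, a * (c - au) / (c - b), 0.0)))  # descent, then zero
    return np.sign(u) * magnitude                      # psi(u) = -psi(-u)

grid = np.array([-10.0, -6.0, -3.0, -1.0, 0.0, 1.0, 3.0, 6.0, 10.0])
print(huber_psi(grid))
print(hampel_psi(grid))
```

Note that `hampel_psi` is bounded and constant (zero) outside a finite interval but not monotone, while `huber_psi` is monotone.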
Recent Monte-Carlo (Huber (1973), Gross (1977)) and empirical (Andrews (1974), Carroll (1979)) studies have established the superiority of M-estimates to least squares estimates.
However, only limited theoretical work is available.
Huber's (1973) methods may be used to show that there is a sequence of solutions to (1.2)-(1.3) which is asymptotically normally distributed when $\psi$ has two bounded continuous derivatives (not satisfied for (1.5)), but while his results are truly remarkable in letting $p \to \infty$, he does not show that all solutions must be asymptotically normal.
Maronna and Yohai (1978) consider monotone functions $\psi$ and treat the design $\{c_i\}$ as an i.i.d. sequence of random variables, which enables them to use Glivenko-Cantelli results.
They show that all solutions are strongly consistent and asymptotically normal.
We consider the almost sure properties of M-estimates in the usual case that the design $\{c_i\}$ is a sequence of constants.
When $\psi$ is monotone, we show in Section 2 that all solutions to (1.2)-(1.3) are strongly consistent.
In Section 3 we present a result which is much stronger than asymptotic normality and yields insight into the asymptotic behavior of M-estimates.
The specific result generalizes work of Carroll (1978a) to show that, if a sequence of solutions is strongly consistent, even when $\psi$ is not monotone (as in (1.6)), the solutions can be approximated by a weighted sum of bounded i.i.d. random variables, with remainder term of order $o(n^{-1/2})$ almost surely.
This approximation can be motivated as follows.
If $F$ is symmetric, if $n^{1/2-\delta}\|\hat\beta_n - \beta_0\| \to 0$ (a.s.) and $n^{1/2-\delta}|\hat\sigma_n - \sigma_0| \to 0$ (a.s.) for all $\delta > 0$, and if $\psi$ has two continuous bounded derivatives (not true for (1.5)-(1.6)), then Taylor expansions can be used to show that if $\Sigma_n = n^{-1}\sum_{i=1}^{n} c_i' c_i$, then
(1.7)
It is the purpose of Section 3 to verify (1.7) under reasonable conditions.
One can grasp the importance of (1.7) by noting that it implies that, except for a negligible remainder term, the normalized M-estimate is a normalized least squares estimate based on observations $c_i\beta_0 + \psi(\varepsilon_i/\sigma_0)$; the boundedness of the "errors" $\psi(\varepsilon_i/\sigma_0)$ is the essential reason for the robustness of M-estimates to outliers in the responses or heavier-tailed distributions.
Note also that (1.7) shows quite clearly that the present version of M-estimates is not robust against outliers in the design; Maronna and Yohai (1978) have suggested weighting (1.2), replacing $c_i$ by $c_i\, w(c_i)$, but here much work remains to be done.
Finally, the expansion (1.7) and its relationship with least squares yield as simple consequences two classes of results:
(i) The one-step estimates mentioned above are strongly consistent, asymptotically normal, and representable as least squares estimates with bounded "errors" to order $o(n^{-1/2})$.
(ii) Robust sequential fixed-width confidence bounds for the regression parameter $\beta_0$ can be constructed and analyzed exactly as in Gleser (1965); only a change of notation is necessary.
We assume in Section 2 that $\psi$ is monotone and bounded.
We assume throughout that there exist $\eta_0$, $\sigma_0$ with

If $\psi$ is given by (1.5), then this assumption is met if for all $\eta$, $P(\varepsilon_1 = \eta) < 1 - \xi/k^2$; this can be seen by a simple modification of an argument by Huber (1964, p. 97).
By including an intercept parameter in the problem, we can reparameterize so that $\beta_0 = 0$, $\sigma_0 = 1$, $E\,\psi(Y_1) = 0$ and $E\,\psi^2(Y_1) = \xi$, which we do throughout the paper.
Thus, in proving consistency for example, we will attempt to show that all solutions $(\hat\beta_n, \hat\sigma_n)$ converge almost surely to $(0, 1)$.
2. Strong Consistency
Throughout this section we will make the following assumptions.
(2.1) The function $\psi$ is continuous, odd, nondecreasing, and Lipschitz of order one. (For notational convenience, we take the Lipschitz constant equal to one.)
(2.2) For some $a, K_1, K_2 > 0$,
    $|\psi(x)| \ge a|x|$ for all $|x| \le K_2$,
    $|\psi(x)| = K_1$ if $|x| \ge K_2$.
(2.3) If $\lambda_n$ is the minimum eigenvalue of
    $n^{-1}\sum_{i=1}^{n} c_i' c_i\,(1 + \|c_i\|)^{-1}$,
then $\liminf \lambda_n = \lambda_\infty > 0$.
(2.4) If $\lambda_n^*$ is the maximum eigenvalue of
    $\Sigma_n = n^{-1}\sum_{i=1}^{n} c_i' c_i$,
then $\limsup \lambda_n^* = \lambda_\infty^* < \infty$.
(2.6) Let $G_n(\beta,\sigma) = (G_{n1}(\beta,\sigma), G_{n2}(\beta,\sigma))$, where
    $G_{n1}(\beta,\sigma) = n^{-1}\sum_{i=1}^{n} E\,\psi((Y_1 - c_i\beta)/\sigma)\,(c_i\beta/\sigma)$
    $G_{n2}(\beta,\sigma) = n^{-1}\sum_{i=1}^{n} E\,\psi^2((Y_1 - c_i\beta)/\sigma) - \xi$.
Then for any compact subset $C$ of $\{(\beta,\sigma): \sigma > 0, \sigma \ne 1, \beta \ne 0\}$,
    $\liminf_{n\to\infty} \inf_{C} \|G_n(\beta,\sigma)\| > 0$.
The assumptions on the design ((2.3)-(2.4)) are considerably stronger than what is needed for strong consistency of least squares estimates (see Lai, Robbins and Wei (1978)), but the latter requires $E|Y_1|^2 < \infty$, which is of course stronger than our (2.5).
Limit theorems for regression often require that $n^{-1}\sum_{i=1}^{n} c_i' c_i \to \Sigma$ (positive definite); under this assumption (2.3) is implied by rather weak conditions (cf. Proposition 2 below).
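For a given design, the quantities appearing in (2.3) and (2.4) are easy to compute, which gives a quick empirical look at whether they stay bounded away from zero and infinity as n grows. The sketch below is only a finite-sample illustration under an assumed design (an intercept plus a uniform covariate); the function name `design_eigen_summaries` is hypothetical, and a finite-n computation cannot by itself verify a lim inf or lim sup condition.

```python
import numpy as np

def design_eigen_summaries(C):
    """Return (lambda_n, lambda_n_star) for an (n x p) design matrix C.

    lambda_n      : minimum eigenvalue of n^{-1} sum c_i' c_i (1 + ||c_i||)^{-1}, as in (2.3)
    lambda_n_star : maximum eigenvalue of Sigma_n = n^{-1} sum c_i' c_i, as in (2.4)
    """
    n = C.shape[0]
    weights = 1.0 / (1.0 + np.linalg.norm(C, axis=1))   # (1 + ||c_i||)^{-1}
    weighted = (C * weights[:, None]).T @ C / n
    sigma_n = C.T @ C / n
    lam_min = np.linalg.eigvalsh(weighted)[0]    # eigvalsh returns ascending order
    lam_max = np.linalg.eigvalsh(sigma_n)[-1]
    return lam_min, lam_max

rng = np.random.default_rng(1)
for n in (100, 1000, 10000):
    C = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, size=n)])
    lam_min, lam_max = design_eigen_summaries(C)
    print(n, lam_min, lam_max)
```

For this bounded design both summaries should stabilize as n increases, in line with the remark that (2.3) follows from rather weak conditions when $\Sigma_n$ converges to a positive definite limit.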
Condition (2.6) is used to ensure eventual uniqueness of the solutions and corresponds to Huber's (1967) condition (B-3) (in fact, in the location-scale problem it is exactly his (B-3)).
If we know that $F$ is symmetric then it is easy to write various reasonable conditions which imply (2.6).
For example, consider
(2.7) $F$ is symmetric and, for each fixed $t > 0$, $F(s,t) = E\,\psi((Y_1 - s)/t)\,(s/t)$ has a unique zero at $s = 0$.
Proposition 1. (2.7) implies (2.6).
Consider the expressions
(2.8)    $n^{-1}\sum_{i=1}^{n} \psi((Y_i - c_i\beta)/\sigma)\, c_i\beta/\sigma = 0$
(2.9)    $n^{-1}\sum_{i=1}^{n} \psi^2((Y_i - c_i\beta)/\sigma) = \xi$.
Note that Huber's Proposal 2 M-estimates satisfy (2.8) and (2.9).
Lemma 1. If (2.1)-(2.5) hold then, for some $M > 0$,
    Pr{there exists $N$ such that $n > N$, (2.8) and (2.9) imply $\|\beta\| \le M$, $\sigma \le M$} = 1.
Theorem 1. If (2.1)-(2.6) hold then for all $\varepsilon > 0$,
    Pr{there exists $N$ such that $n \ge N$, (2.8) and (2.9) imply $\|\beta\| \le \varepsilon$, $|\sigma - \sigma_0| \le \varepsilon$} = 1,
i.e., M-estimates of regression are strongly consistent.
Proofs are contained in Appendix A.
3. An Almost Sure Approximation
In this section we do not assume that $\psi$ is monotone, but we do insist that the following assumption holds:
(3.1) The sequence $\{\hat\beta_n, \hat\sigma_n\}$ of solutions to (1.2) and (1.3) is strongly consistent, converging to $(0,1)$ (without loss of generality).
Additionally, we assume
(3.2) The function $\psi$ is odd, bounded, continuous, and constant outside a finite interval. Further, $\psi$ is twice boundedly and continuously differentiable except possibly at a finite number of points $a_1, \ldots, a_k$ (note that $\psi$ need not be monotone).
(3.3) $Y_1, Y_2, \ldots$ are i.i.d. ($F$).
(3.4) $F$ is Lipschitz in neighborhoods of $a_1, \ldots, a_k$.
(3.6) For all $\delta > 0$, $n^{-(1/2-\delta)} \max_{1 \le i \le n} \|c_i\| \to 0$.
(3.7) There is a $\delta^* > 0$ for which $\limsup_{n\to\infty} n^{-1}\sum_{i=1}^{n} \|c_i\|^{2+\delta^*} < \infty$.
(3.8) $\Sigma_n = n^{-1}\sum_{i=1}^{n} c_i' c_i \to \Sigma$ (positive definite).
(3.9) If $c_n^* = n^{-1}\sum_{i=1}^{n} c_i$, then the following matrix is non-singular for sufficiently large $n$:
c*' E Y
-n
1
-2 E Y1
Note that (3.9) holds if $F$ is symmetric (since $E\,Y_1\psi'(Y_1) = 0$).
If the design is centered and has an intercept term so that $c_n^* \to (1\ 0\ \ldots\ 0)$, one shows that $A_n$ is eventually nonsingular if

this condition is also required by Maronna and Yohai (1978) in proving asymptotic normality for their situation ($c_i$ random).
Theorem 2. If (3.1)-(3.9) hold, then
(3.10)    $(\Sigma_n E\,\psi'(Y_1))\,\hat\beta_n = n^{-1}\sum_{i=1}^{n} c_i'\,\big(\psi(Y_i) + (\hat\sigma_n^{-1} - 1)\,E\,Y_1\psi'(Y_1)\big) + H_n$
(3.11)
where $n^{1/2} H_n \to 0$ (a.s.), $n^{1/2} G_n \to 0$ (a.s.).
Corollary 2. If (3.1)-(3.9) hold and $F$ is symmetric, then
(3.12)    $(\Sigma_n E\,\psi'(Y_1))\,\hat\beta_n = n^{-1}\sum_{i=1}^{n} c_i'\,\psi(Y_i) + H_n$
(3.13)    $-2\,(E\,Y_1\psi(Y_1)\psi'(Y_1))\,(\hat\sigma_n^{-1} - 1) = n^{-1}\sum_{i=1}^{n} (\psi^2(Y_i) - \xi) + G_n$,
where $n^{1/2} H_n \to 0$ (a.s.), $n^{1/2} G_n \to 0$ (a.s.).
All proofs are given in Appendix B.
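The symmetric-error expansion (3.12) can be examined numerically in the standardized setting $\beta_0 = 0$, $\sigma_0 = 1$. The sketch below assumes standard normal errors, the clipped-linear $\psi$ of (1.5) with k = 1.5, and a centered design with an intercept; it computes the Proposal 2 estimate by a simple alternating iteration and then forms the remainder $H_n$ as the difference between the two sides of (3.12). The solver, the constants, and all names are assumptions of this illustration; one simulated sample can only suggest, not establish, that $n^{1/2} H_n$ is small.

```python
import math
import numpy as np

K = 1.5
PHI_K = 0.5 * (1.0 + math.erf(K / math.sqrt(2.0)))         # normal cdf at K
PDF_K = math.exp(-0.5 * K * K) / math.sqrt(2.0 * math.pi)  # normal pdf at K
XI = (2 * PHI_K - 1) - 2 * K * PDF_K + 2 * K * K * (1 - PHI_K)  # E psi^2(Z)
E_DPSI = 2 * PHI_K - 1                                      # E psi'(Z) = P(|Z| <= K)

def psi(u):
    """Clipped-linear psi of (1.5)."""
    return np.clip(u, -K, K)

def solve(C, y, iters=500):
    """Alternating scale / modified-residual iteration for (1.2)-(1.3), chi = psi^2."""
    beta = np.linalg.lstsq(C, y, rcond=None)[0]
    sigma = np.std(y - C @ beta)
    for _ in range(iters):
        r = y - C @ beta
        sigma *= math.sqrt(np.mean(psi(r / sigma) ** 2) / XI)
        beta = beta + np.linalg.lstsq(C, sigma * psi(r / sigma), rcond=None)[0]
    return beta, sigma

rng = np.random.default_rng(2)
n = 5000
x = rng.uniform(-1.0, 1.0, size=n)
C = np.column_stack([np.ones(n), x - x.mean()])   # centered design with intercept
Y = rng.normal(size=n)                            # beta_0 = 0, sigma_0 = 1, F symmetric

beta_hat, sigma_hat = solve(C, Y)
Sigma_n = C.T @ C / n
lhs = (Sigma_n @ beta_hat) * E_DPSI               # left side of (3.12)
rhs = C.T @ psi(Y) / n                            # n^{-1} sum c_i' psi(Y_i)
H_n = lhs - rhs
scaled = math.sqrt(n) * np.linalg.norm(H_n)
print(beta_hat, sigma_hat, scaled)
```

The printed value `scaled` is $n^{1/2}\|H_n\|$ for this sample; the theorem asserts that this quantity tends to zero almost surely as n grows.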
Remark. In terms of the original model $Y_i = c_i\beta_0 + \varepsilon_i$, $E\,\psi(\varepsilon_1/\sigma_0) = 0$, $E\,\psi^2(\varepsilon_1/\sigma_0) = \xi$, Theorem 2 and its Corollary may be written by substituting $\varepsilon_i/\sigma_0$ for $Y_i$, $(\hat\beta_n - \beta_0)/\sigma_0$ for $\hat\beta_n$ and $(\hat\sigma_n - \sigma_0)/\sigma_0$ for $\hat\sigma_n - 1$.
Remark. If the design includes an intercept and is centered so that $n^{-1}\sum_{i=1}^{n} c_i \to (1\ 0\ \ldots\ 0)$, one can show that the common estimate of the variance-covariance matrix of $\hat\beta_n$
-
{n-
n
l
~'((Y. -
L
. 1
1.=
C.
-1.
c.
1.-1.
is a consistent estimate of variance of all the terms of $\hat\beta_n$ if $F$ is symmetric, while it inconsistently estimates only the variance of the intercept term if $F$ is asymmetric and $E\,Y_1\psi'(Y_1) \ne 0$ (see Carroll (1978b) for empirical demonstrations of this result).
4. Relationship between the assumptions
It is clear that (3.7) is guaranteed if the design is bounded, i.e., $\sup_i \|c_i\| < \infty$.
In addition, the following proposition shows that (2.3) is not a very strong condition.
Proposition 2. (3.7)-(3.8) imply (2.3).
Proof: Choose $M > 1$ so large that
where lim sup n
-1
n
L
By (3.7)-(3.8), for sufficiently large n
i=l
Am1.n
• 0=)
n > A0012
n
-1
n
2 0
L
II.~.·
11
+
. 1
1.
1.=
where A . (A) is the minimum eigenvalue of A.
m1.n
n
-1
-1
= n
n
(c. x) (1
-1. -
L
(c.
-1. -x) (lIM
i=l
n
i=l
>
2
L
+
2
- n
-1
II~i In+
(1
For fixed x
II~i I j)-
L (~i
RP,
1
n
i=l
E
l
.
+
< 2K O '
x) 2 1(1 +
11M)
II c i II
> M) 1M
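$\ge \|x\|^2 \lambda_\infty/(2M) - n^{-1}\sum_{i=1}^{n} \{(c_i x)^2/M\}\{\|c_i\|^{\delta}/(M-1)^{\delta}\}$
$\ge (\|x\|^2/M)(\lambda_\infty/2 - 2K/(M-1)^{\delta}) = \|x\|^2 \varepsilon_M/M$, say, with $\varepsilon_M > 0$.
Thus, since $x$ was arbitrary,
$\lambda_{\min}\big(n^{-1}\sum_{i=1}^{n} c_i' c_i\,(1 + \|c_i\|)^{-1}\big) \ge \varepsilon_M/M$. □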
Appendix A
Proposition A1. Assumptions (2.1) and (2.2) imply that for some $K_3, K_4 > 0$ and for all $z$,
sup
(A. 1)
-co<~<oo
Proof:
Suppose that Izi
I~(z-~)~ - ~(-~)~I ~ K (K + Izl) .
3 4
< I~I
-
K2 (and hence I~I
::
K2).
Then
(2.2) implies
Since I~(x) I ~ KI , if Izi > I~I - K2 '
Proposition A2.
There exists Al > 0, A2 > 0 such that for almost all w, there
exists N(w) such that
(A.2)
Proof:
(A.3)
There exists NI(w) such that n:: NI(w) implies
n
-1
n
l
i=l
11
From (A.l) and (A.3)
(A.4)
From (2.2) and (2.3), there exists NO such that n > NO implies
1
(A.5)
n
I
i=1
n-
-1
~(-c.
Bla)(c.8Ia)
-1. -
n
~ - a n i ~ 1 (c i BI a)
- K1
< -min (a,K )
1
-1
n
n-
1
n
.I
1.=1
-1.-
2
I{
n
i~1 I~i Blal
(~i Bla)
2
Ic i
Hlci
(1 +
81 a I ~ K2}
Sial..::.
Ilci 11)-1
K2 }
(1 +
[email protected] I
Thus from (A.4) and (A.5), there exists K5 ,K and K7 for which n > max(NO,NI(w))
6
implies
Now, (2.8), a ~ A2 [email protected]
a contradiction.
Since
~
and the choice of sufficiently large Al would imply
0
is Lipschitz and
~2(x) ~ Kg I xl.
0_
L
I;
Define
~(x) =
K, for
IXI ..::.
K2 we can choose K such that
8
where M is chosen so that C2 > O.
O
Further fix 0 so that if
t
0' = (2 A*)~
oleA00 min(a,K I ))
00
then
Choose MI (by dominated convergence) and N3 (w) so that if n
n
.L
{min(2K l , IYi I/M l )}2 .2. 2 E{min(2K l , IY I
1.=1
n-1
n ~ N (w)
I
4
(2.9)
a > MO
If n
(A.7)
n
-1
~
N (w),
3
I/ MI)}2 2.
0
2
.
There exists N4 (W) for which
Proposition A3.
Proof:
~
I~
II~II a > C2 .
max(N l (w), N3 (w))
2
1 n
l/J ((Y i - ci~/a) '::'K S n- ): (IYil
1.=1
1.=1
n
):
.::. (K
s/a)(c l
+
(n
-1
1.=1
Thus, by (A.7), a > MO' (2.9), n
which gives IIBII/a > C2 ·
~
.I.
~
I~i
+
l.!.i~ll/a
2 !..
~I ) 2) .::. (K s/a)(c 1 + (2 \:)~ II~II ) .
t
N4 (w) imply
0
Proposition A4.
~>
II~II/a < C2 •
~
Proof:
By (2.1), (2.2), the Schwarz inequality and the definition of M ,
I
(A.8)
Also note that (2.8) implies
n
o = I n -1 I
(A.9)
i=I
n
= I n -1 I
i=I
-
I
n
~(-c.
Sla)c.
SIal
-1-
-1 -
n
-1
I
i=I
Again note that, as in (A.5),
...
(A.IO)
In-
I ~(-~i S/a)c
i=I
I
i
SIal'::' min(a,K l )
Aoo
IISIal1
2
(1 +
11..~ll/a)-I
.
Hence (A.9), (A.IO) imply
. 1
o = In-
(A.H)
II SI a II
This implies
Proof of Lemma 1.
I
i=I
~((Yi - ~i
< 0' (1- 0') -1 <
£.
mla)
C
i
SIal
0
Propositions A3 and A4 show that $\sigma$ must eventually be bounded (with probability one).
Proposition A2 shows that this implies $\|\beta\|$ must be bounded. □
The proof of Theorem 1 is based on the consistency proof of Huber (1967, Part B).
CA.12)
Pr
there exists N such that n > N implies
sup
CB,O)EC
sup
n
CBl,ol)EUC~,o)
-1
n
I
i=l
1/JCCY i - Ci B)/o)(c i 13)/0)
-1/JCCY i - E.i~ )/01) CCd. lil)/Ol)
A similar result holds for 1/J2 CCY 1.. - -c.1 .B)/o)
.
Proof:
First note that since 1/J is Lipschitz and bounded and C is compact,
n
-1
< n
I
i=l
+ n
n
-1
I
i=l
+ n
n
-1
I
i=l
+ n
n
-1
I
i=l
n
< M
I
i=l
~i (~1 - .§.)/01 + £i ~ 11/0 - 1/ 01 1
if the U(f,o) are chosen so that
sup
(.§.,ohC
This completes the proof.
sup
1/0 <
1
(B1' 01hU(~ 0)
0
00
•
= O.
I
~
The functions {Gn}~=l are equicontinuous on compact subsets of
Proposition A6.
s.
Proof:
Similar to that of Proposition AS.
Proof of Theorem 1. From Lemma 1, all solutions to (2.8), (2.9) are eventually confined to a compact set $C$.
Let $U$ be an open neighborhood of $(0,1)$.
Then by (2.6), one can choose $\varepsilon > 0$ (depending on $U$) so that
    $\liminf_{n\to\infty} \inf_{C\setminus U} \|G_n(\beta,\sigma)\| \ge 5\varepsilon$.
We discuss only the case
    $\liminf_{n\to\infty} \inf_{C\setminus U} |G_{n1}(\beta,\sigma)| > 5\varepsilon$,
as the other case is quite similar.
Now for every $(\beta,\sigma) \in C\setminus U$, let $U(\beta,\sigma)$ be a neighborhood of $(\beta,\sigma)$ for which
and the following hold:
IGnl (~' , a')
sup
(.§.' , a' hU(.§., a)
nl (!?, a)
- G
I .::.
£
•
Select a finite subcover Uses = l, ... ,k) with associated points
(s = l, ... ,k).
~s,as)
Then
n
L
sup
(A.14)
i=l
(.§., ahC/U
<
sup
l:o;s:o;K
+
2£ .
Suppose we show that for any (8,a),
(A. IS)
n
-1
n
L
i=l
{1jJ((Y. 1
c. 8)/a) -1
c. -f3/a - Gn(S,a)}
-
-1 -
-+-
0
(a.s.) .
(A.14)
Then with probability one we can choose n sufficiently large that the left hand
Since IGn(~.a) I ~ S£. this implies
side of (A.14) is bounded by 4£.
inf
In-
l
.
n
L
i=l
(~.ahCIU
and since U was arbitrary. the proof would be complete.
Proposition 81 below with a nK
Proof of Proposition 1.
= ~
Now (A. IS) follows from
S/na.
Suppose that
lim inf inf IGn I(S.a)
e
n-+«>
I =0
•
Then. since C is compact, Proposition A6 shows that there exists
lim inf
n-+«>
Now.
~
IGnl(~.a) I
=0
(~.a)
in C with
•
is odd so that (2.7) implies
so it suffices to show that there exists £ > 0 for which
n
lim inf n -1
(A.19)
L
i=l
n-+«>
Let
E*
=
I{lc.
sl > £} >
-1 -
o.
lim inf n- l [.1_
t~_l Ic.sl.
-1E*
> lim inf n
Since (2.4) implies that
-1
>
o.
~
n-1
e
n
I Ie. BI
i=l -1 - -
< E +
<
EO
+
n
n-'1
I
Ie.
BI > d
-1 BII{ Ie.
-1-
i=l
n
(n-1
I
612)~
Ie.
-1
i=l
n
1
<
if
EO
<
E*,
EO
+
-1
IIBII (A~)~ (n
I
i=l
~
. 1
(n- 1
1=
I{ Ie.
61 >
-1 -
I{lc.
61 > d)
-1 -
1
d)~
k
Z
,
then (A.19) holds.
Appendix B
Proposition B1. Let $x_1, x_2, \ldots$ be i.i.d. bounded mean zero random variables.
Let $a_{nK}$ $(K = 1, \ldots, n)$ be a triangular array of constants with
    $|a_{nK}| \le n^{-\alpha}$ for some $0 < \alpha < 1$,
    $\sum_{K=1}^{n} a_{nK}^2 \le n^{-\delta}$ for some $0 < \delta$.
Then
    $T_n = \sum_{K=1}^{n} a_{nK} x_K \to 0$ (a.s.).
Proof: Theorem 4.1.3 of Stout (1974).
1
Fix E > 0 and define A(E-,E) = {211E-11 ~ (M!E)~}, where
lim sup
4n-
1
n
n
1.llc.11
-1
i=l
2
<M.
-
Then
(B.l)
lim sup n
n
Choose d = d(E)
(B.2)
-1
n
I
i=l
I(A(c.,E))
< 4 n
-1
-
= (M! E) !.:2. Since Ed =
E(maxla. I
J
+
Ed)
+
I,
(ME)~
(1
+
-1
n
I
i=l
I Ie.
II
-1
2
E!M -< E.
, for sufficiently small E,
E) Ed!2 < Ed,
where al, •.. ,a
K
are as in (3.2).
Now define
K
=
B(e:)
Proposition B2.
The conditions
U [a. - e:d, a. + e:d] .
J
J
j=l
1 I~ I 1
.::. e:,
I0
-1 I < e: ,
imply that for some j,
(B.3)
a. 1 < (Y. - c. 8)/0 < a.
J1
-1 J
a. 1 < Y. < a.
J-
Proof:
J
1
Note
10-
1
-1 -S) -
(Y. - c.
1
< e:(max Ia.
-
J
I
1I
Y.
+ e:d) + (l +e:) e: 11'£1' II
"
< e:(maxla.
-
< e:d
J
I
+ e:d) + (l+e:)e:
0
(from B.2)
d/2
We say that $X_n \le c$ almost surely as $n \to \infty$ when
    Pr{there exists $N$ such that $n > N$ implies $X_n \le c$} = 1.
Throughout this section, all inequalities are taken in the above sense.
Define gee:)
Proposition B3.
= e: o*/4(1+0*)
There exist positive numbers C. e:0' 0** such that for any
0<0 < 0** and any sequence e: l .e: 2•... in [O,e: O] the following holds: defining
-!z+o
an = n
• for almost all w. there exists N(w) such that n ~ N(w).
1
> log n imply
II_sll_< En' 10- - 11 -< E.
n nd n En-
{~((Y.
(8.4)
~
- -~
c. -8)/0) -
~(Y.)
~
- (0
-1
-
l)E Y1
~'(Y1)}
and
{~(O-l(y. - c. 8)) - (0- 1
(8.5)
~
II
+ ~ E~'(Y1)
~
C{a n
+
-~
(g(c n )
+
- l)E Y
1
n -0 **)(10-1 - 1 I
~'(Y1)}
I I~I I) .
+
Similar bounds hold for
•
and
(8.7)
respectively.
Proof of Proposition 83.
(8.8)
I Ic!{~((Y.
-~
~
Consider the expression
- c. 8)/0) -~
-
~(Y.)
~
- (0 -1 - l)Y.
~
~'(Y.) +
~
c. 8
-~
-
~'(Y.)}
~
II .
8y Proposition B2, ifY.1 I B(cn ) and -~
c. I A(c.,c
), then (B.8) can be expanded
-1
n
in a Taylor series.
Writing dn = d(cn ) and noting that
(B.8) (in this case) by
I Ic. I I -<
-~
dn /2, we bound
If Y.
1
E:
B(E ) (in which case IY.I < maxla.1 + Ed) or c. £A(C.,E ), then since
n
1 J
n n
-1
1 n
~ is Lipschitz we can bound (B.8) by
(B.10)
II_BI
j}
II_c1·11 (I(Y.
1
+ Iw(o(y. - c.
B)) - w(y·)
-1 1
1
I
E:
B(E )) + I(A(c.,E )))
n
1
n
Ilc·11 I(A(c.,E)) •
-1
-1 n
Thus, (B.9)-(B.10) show that
(B .11)
n
-1 c ! {W((Y. - c. B)/o) - ~(Y.) - (0 -1 - l)Y. ~I(y.)
n
I
.
-1
1
-1 . 1
1
1
1=1
L
I
+ c.
S Wl(Y·)}1
1
-1 -
I < C3 (A 1 + A 2 + A 3 + A 4) ,
n
n
n
n
where
AnI = n
-1
A . = n -1
n2
A
= n
n3
-1
L
Ilc.
-1 II CIa-II +
II~i II II~II)
I (Yi
L
(10-11 +
Ilc·11
-1
II~i II II~II)
I (A(c. , E ))
-1 n
i=l
n
i=l
-1
An4 = n
n
n
.L
1=1
E:
B(En))
.
1~(0(Y1' - ~1' §.)) - ~(Y1') I Ilc·11 I(A(c.,E)) •
-1
By (3.7), since 10-11 < E,
n
IIBII <
-
£
n
-1
;
By Lemma 1 of Carroll (1978), and (3.4),
(B.12)
lim sup n
n
for some constant
M*.
-1
Hence
n
L
i=l
I (Y.
1
E:
B(E )) < M* (d E)
n
n n
n
AI
..
But, by Holder's inequality,
so that almost surely as n
+
00,
Also,
•
.
But, by Schwarz's and Holder's inequalities,
n
-1
n
L
i=l
n
-1
II~i
II
I(A(~i' En))
k
2. CS(En ) 2 2. Cs g(En )
n
L II~i112 I (A(~i '
i=l
En)) 2. Cg g(E n ) .
Finally,
A 4 < n
n
-1
-
n
L
IIjJ(a-1 (Y. - c. B)) - ljJ(a -1 Y.) I
1
i=l
+ n
-1 -
1
II -1
c.1 I
I(A(c.,
-1.
E ))
n
-1
Now, since IjJ is Lipschitz, for some M > 0,
l
n
A~~) 2. M n- i~l l~i.!?1 II~ill I(A(~i'
En)) 2.
I I.!? II
g(E n ) .
Further, since $ is constant outside an interval and
la-II
constant KO for which $(0 Vi) - $(Y i ) = 0 if IYil > KO.
< £ , there is a
-
n
Hence, using this and
the fact that $ is Lipschitz,
Ilc·11
I(A(c.,
-1
-1
£))
n
-<
la-II
g(£)
n
We have thus obtained for (B.11) the bound we wish to obtain for (B.4).
Bounding the difference in the two terms requires two steps, the first considering
n
= n -1 L .£i.£i .§.($' (Y i ) - E $' (Y1)) .
i=l
AnS
Now there is a 0**
>
0, depending only upon 0* (in 3.7) for which if
a nK = c~j/n1-0**, there exists aI' a 2 > 0 for which
Ia nK I .::.
n
L
K=l
n
-a1
-a2
2
a nK -< n
(from (3.6))
•
(from (3.6) and (3.7))
.
•
From Proposition B1, this means that almost surely as n
+
00,
IIAns II
< n -ou
I I.§.I I·
By similar arguments (noting that $' vanishes outside an interval) we obtain that
almost surely as n
+
00,
la-II
The results (B.6) and (B.7) follow similarly.
(proving (B.4))
0
References
Andrews, D. (1974). A robust method for multiple linear regression. Technometrics 16, 523-532.
Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H., and Tukey, J.W. (1972). Robust Estimates of Location: Survey and Advances. Princeton: Princeton University Press.
Bickel, P.J. (1975). One-step Huber estimates in the linear model. J. Amer. Statist. Assoc. 70, 428-434.
Carroll, R.J. (1977). On the asymptotic normality of stopping times based on robust estimators. Sankhya, Series A 39, 355-377.
Carroll, R.J. (1978a). On almost sure expansions for M-estimates. Ann. Statist. 6, 314-318.
Carroll, R.J. (1978b). On estimating variances of robust estimators when the errors are asymmetric. To appear in J. Am. Statist. Assoc.
Carroll, R.J. (1979). Robust methods for factorial experiments with outliers. Unpublished manuscript.
Gleser, L.J. (1965). On the asymptotic theory of fixed-size sequential confidence bounds for linear regression parameters. Ann. Math. Statist. 36, 463-467 (correction 37, 1053-1055).
Gross, A.M. (1977). Confidence intervals for bisquare regression estimates. J. Am. Statist. Assoc. 72, 341-354.
Huber, P.J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35, 73-101.
Huber, P.J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1, 221-233. Univ. of California Press.
Huber, P.J. (1973). Robust regression: asymptotics, conjectures and Monte-Carlo. Ann. Statist. 1, 799-821.
Huber, P.J. (1977). Robust Statistical Procedures, SIAM, Philadelphia.
Lai, T.L., Robbins, H. and Wei, C.Z. (1978). Strong consistency of least squares estimates in multiple regression. Proc. Natl. Acad. Sci. 75, 3034-3036.
Maronna, R.M. and Yohai, V.J. (1978). Robust M-estimators for regression with contaminated independent variables. Unpublished manuscript.
Stout, W.F. (1974). Almost Sure Convergence. Academic Press, New York.
REPORT DOCUMENTATION PAGE
Security classification: Unclassified
Title: Almost Sure Properties of Robust Regression Estimates
Type of report: Technical
Authors: Raymond J. Carroll and David Ruppert
Performing organization report number: Mimeo Series #1240
Contract or grant number: AFOSR 75 2796
Performing organization: Dept. of Statistics, University of North Carolina at Chapel Hill
Controlling office: Directorate of Mathematics & Statistics, AFOSR, Bolling AFB, DC
Report date: July 1979
Distribution statement: Distribution unlimited - approved for public release