Carroll, R.J. and Ruppert, D.On Robust Test for Heteroscedasticity."

On Robust Tests for Heteroscedasticity
by
Raymond J. Carroll* and David Ruppert**
University of North Carolina at Chapel Hill
We extend Bickel's (1978) tests for heteroscedasticity to include wider
classes of test statistics and fitting methods.
The test statistics include
those based on Huber's function, while the fitting techniques include Huber's
Proposal 2 (1977) for robust regression.
AMS 1970 Subject Classifications:
Key Words and Phrases:
Primary 62G3S; Secondary 62JOS, 62J10.
Heteroscedasticity, robustness, linear model,
M-estimates, regression, asymptotic tests.
*This work was supported by the Air Force Office of Scientific Research under
Contract AFOSR-7S-2796.
**This work was supported by the National Science Foundation Grant NSF MCS78-0l240.
1.
Introduction
We consider the general linear model
where
T.1 = -c.1 -Bvn
Y. = T. + a(T.,8)E:.,
(1.1)
1
B
n
-v
1
1
1
(i = l, ... ,n),
is an unknown (p x 1) vector, the (p x 1) vectors -1
c! are known, the
error terms E. are independent and identically distributed (i.i.d.) with common
1
distribution function F, and 0(T.,8) expresses the possible heteroscedasticity
1
in the model, with
a(T,8) = 1
(1. 2)
+
e
aCT)
+
0(8)
as 8
+
0 .
Bickel (1978), generalizing work of Anscombe (1961), defines robust tests for
heteroscedasticity,which in the present context are tests of H : 8 = 0; the idea
O
is to replace aspects of the usual informal examination of residuals by formal
~
statistical inference about the probability structure of the data.
If {til are
the fitted values (from least squares or possibly a robust regression method
(Huber (1973)(1977))) and b is an even function, Bickel's robust test statistic is
n
L
(1. 3)
i=l
(a(t.) - a (t))b(r·)/cr.b '
1
.
1
where
r. = Y. - t. = residual,
(1.4)
I I I
n
(1. 5)
L
i=l
(a(t.)
1
a. (t))
2
(n-p)
-1
n
L
i=l
(b(r.) - b.(r))
and for any function g,
=n
-1
n
L
i=l
g(x.) .
1
1
2
,
2
Bickel makes the following assumption:
b is bounded and has two continuous, bounded derivatives.
(1. 6)
Under (1.6) and other assumptions (see Theorem 1 below), Bickel obtains the
e=
asymptotic distribution of '\ under H :
O
results are obtained for the case p2/ n
+
0 and contiguous alternatives;
0
One of the most attractive choices of b (well motivated in Bickel's Section
3) is Huber's function squared:
b(x)
(1. 7)
= x
2
Ixl
<
Ixl
> k
k
This choice of b does not satisfy (1.6) so that Bickel's Theorem 3.1 does not
apply.
He states that the strong smoothness condition (1.6) is "unsatisfactory"
and obtains results for (1.7) only when p is bounded and fitting is by least
squares.
In this note we show by a simple modification of Bickel's proofs (using
techniques of Carroll (1978)), that results for,\ can be obtained for b given
by (1.7) even when p 4 /n
0 and fitting is by robust estimates or least squares.
+
This result is given in Section 2.
In Section 3 we note extensions which obtain
scale invariance by robust estimation with scale estimated by Huber's Proposal 2.
2.
Main Results
Where possible we adopt Bickel's notation.
assume n
-1
~
c! c.
-1 -1
=
I.
To provide a frame of reference we state:
Theorem 1 (Bickel (1978)).
(2.1)
Without loss of generality we
Suppose the following hold:
max
IT.1 I
< M ,
e-
3
(2.2)
(2.3)
F is symmetric about zero,
(2.4)
M- 1 ~ J (F) < M where for F'
2
(2.5)
J (F) =
2
(2.6)
f'(x)/f(x)
Var(b(£l))
The function a is twice
(2.7)
(2.8)
e
J (x
If d.
1
(2.9)
= f(abso1ute1y
+
~
1)2
-1
M
bounded~y
continuous)
f(x)dx,
> 0 ,
and continuously differentiable,
= t.1
b(x)
= be-x),
b is bounded and satisfies (1.6),
(2.10)
2
(2.11)
P
In
-+
0 •
Then
(2.12)
where
(2.13)
Our generalization of Theorem 1 to incorporate such functions as (1.7) is
Theorem 2.
(2.14)
Suppose (2.1)-(2.9) and the following hold:
b is bounded, Lipschitz of order one, and has two bounded continuous
derivatives except possibly at a finite number of points, which we
take as ±c.
4
4
P /n
(2.15)
-+
0 .
Then (2.12) holds. (Assumption (2.8) is discussed in the next section.)
Proof of Theorem 2.
The key results in Bickel's proof are (A34)-(A37) with
w.• = 1 - l/n
(i = j)
1J
= -l/n
(i
f j)
Because b is bounded and Lipschitz of order one, (A34)-(A36) follow exactly
as given by Bickel.
He uses (A37) to prove
(2.16)
(a(t.)
- a . (t))b(r.)
1
1
::: n -~
n
I
i=l
(a(T.) - a (T))b(E.)
1
.
1
n
I
(a(T.) - a (T))d.
i=l
where d. = t. - T..
111
1
•
1
+
aP (1) ,
Instead of proving (A37) we will prove (2.16) directly.
As seen in Bickel's (A41)-(A47), (2.16) is verified by proving either (A48) (as
Bickel has done) or
(2.17)
n
-~
A = n
n
n
-1-<
p
n
I
I
w.. a(T.)(b(r.) - b(E.)
i=l j=l 1J
1
J
J
2
+
d.b'(E.))
J
J .
-+
0 .
We will prove (2.17) .. Note that in Bickel's proofs of (A41)-(A47) the assumption
(1.6) is not needed; the weaker assumption (2.14) suffices.
Since r. = E.
J
J
wi th I being the indicator function, rewrite
(2.18)
A ::: L: . L:.
n
1 J
x
{I(-c
+
I(-c - a
+
a
w..
1J
a (T. ) (b ( E. - d .) - b ( E.)
1
J J
J
< E. <
n-J
C -
< E. < -c
n- J
+
+
d. b' (E.))
J
J
a )
n
+
I(c - a
< E. <
n-J-
a )
n
+
I(E. >
J
C +
a )
n
+
C +
a )
n
I(E. < -c - a )}
J
n
d. ,
J
5
_e
where a
n
0 will be specified later.
+
We also write A
n
= ~1'
~.
J
K.. (n).
1J
We can
further write
(2.19)
I(-C < £. - d. < c }
~. K.. (n)
J
J
I(-c+a <£.<c-a)
{ + I ( 1£. -d. I > c)
1 J 1J
n J
n
J J
=L
A1
n
= A(l)
n1
A(2)
+ n2
As in Bickel's (A48), since b is differentiable on (-c,c),
IA (1)
n1
Note that if -c + a
< £. <
n
J
since b is Lipschitz and w..
C -
=
1J
I = 0 (~
P
2
d.)
I
and 1£. - d.1 > c then Id.1 > a.
n
J
J
J
n
0.. - lin,
n
I Id.J I
J
i=l
Thus IAn 11
J
I { Id.
= 0p (pia).
n
I
Then, by (2.1),
1.J
Ib(£.-d.) - b(e::.) ..: d. b' (£.)
JJ
.
a
n
j =1
= 0P (p)
1
J
J
I
I{-c + a
n
< £. <
J
n
2 1
n
> a } < M (\' d.) ~ (\' I { I.d.
2 j~l J
- n j~l
J
Similarly, IAn4 1
= 0p (pia),
IAn 51
n
I
C -
> a })
=
n
!.:
2
a , 1e::.-d.1 > c}
JJ
n
= 0 (pI a
p
n
)
0p (pia)
n .
Further, by (2.1) and since b is Lipschitz,
(2.20)
A similar bound holds for IAn31 .
Since by (2.5) F is Lipschitz in neighborhoods
of ±c, Lemma 1 of Carroll (1978) shows
n
(2.21)
I
j=l
Provided na
I{c - a
>
n-
< £. <
n - J -
C
+ a }
n
=0
P
(max(log n, n(F(c + a ) - F(c - a))) .
n
n
log n, (2.20) and (2.21) give
oP ((n
This yields
n-~ An
(2.22)
If we take a
n
= n-!.i ,
=
0 ((pa)~ + n-~ (plan))
p
n
(2.15) and (2.22) yield
!.:
pa ) 2)
n
6
0
completing the proof.
3.
Extensions
A.
Scale invariance
The test statistic
~
is not scale invariant.
To obtain such invariance,
one would rewrite the model (1.1)-(1.2) so that a(T,a)
~
(1
+
where 00 is a scale parameter consistently estimated (when 8
estimate
a (provided
a(T)8 + 0 (8))/0 ,
0
= 0)
by a scale
by least squares or Huber's Proposal 2 for robust regression).
To obtain scale invariance, Bickel suggests replacing b(r.) by b(r./a).
·
1
1
The
statements and proofs of Theorems 1 and 2 must be modified for this new test
statistic which we denote ~(cr).
Theorem 3.
(3.1)
An analogue of Theorem 2 is
Suppose the conditions of Theorem 2 hold and, in addition,
n
!.,;
2
(8 - 0 ) = 0p (1) ,
(3.2)
0
E{b ' (E 1) E 1 }
2
<
00
,
(3.3)
Then (2.12) holds for ~(a).
Remark.
Assumption (3.1) is the subject of part B of this section.
Assumptions
(3.2) and (3.3) hold ifb is constant outside an interval (as is (1.7)).
Sketch of the proof of Theorem 3.
We need to verify substitutes for Bickel's
(A35) and (A37) when b(r.)
is replaced by b(r./a),
b(E.)
is replaced by b(E;/a )
1
1
1
O
!.,;
and the remainder terms are (respectively) 0p((np) 2) and 0p(p). To prove the
.L
7
substitute for (A35) one must show that
(3.4)
l:: w.. b(r./a) b(r./a)
1J
1
J
E w.. b(E./a) b(E./a)
(3.5)
l:: w.. b(E./a) b(E./a) - E w.. b(E./O ) b(E./O ) = 0 ((np) 2)
1J
1
J
1J
1 O
J O
P
1J
1
J
k
Using the special form of w.. , (3.4) follows from (2.8) and the fact that b is
1J
bounded and Lipschitz; (3.5) is a consequence of (3.1) and (3.2).
is more complex.
Statement (A37)
The analogue of (A42)-(A45) is to show
L
. .
1,J
w.. (a (T .) - a (t. )) b (E. /
1J
1
1
J
a)
= Op (p)
,
for which (using Bickel's proof) it suffices to show
L
(3.6)
.
e
.
1,j
w.. a' (T.) d. (b(E./a) - b(E./oO))
1J
1
1
J
J
= 0p(p)
.
We rewrite (3.6) as
(3.7)
L
w.. a'(T.)d.
1 1
L
w.. a'(T.)d.
1 1
i,j
+
i,j
=
BIn
1J
1J
+
B
2n
That B = 0 (p) follows from using the Schwarz inequality, the boundedness of
ln
p
a, and then applying (3.l)and (3.2).
That B = 0 (p) is complicated notationally
2n
p
but is a consequence of a weakened version of Lemma 2 of Carroll (1978).
verifies (3.6).
The analogue of (A48) is to show that
..
(3.8)
L
. .
1,J
w.. a(T.)(b(r./a) - b(E./O )) =
1J
1
J
J
O
This
8
First note that (3.1) and the proof of Theorem 2 can be used to show that
L
(3.9)
. .
1,J
w.. a(T.)(b(r./a) - b(e./a)
1J
1
J
=
(d./a)b'(e./cr))
+
J
1
J
°
P
(p) .
To verify (3.8) we must show that the difference between (3.8) and (3.9) is
0p(p); this is a consequence of
L
(3.10)
a (b' (E J. / a) - b' (e. J. / °0 ))
I'
L
w.. a (T .) d . /
1J
1
J
L
w.. a(T.) d. b'(e./oo)(l/cr - 1/°0)
i,j
(3.12)
= 0p(p)
w.. a(T.)(b(e./cr) - b(e./oo))
1J
1
J
J
. .
1,J
(3.11)
the following:
i,j
1J
1
J
=
J
=
°
P
0P (p)
(p) .
Equations (3.11) and (3.12) follow by applying the Schwarz inequality, (2.8),
(2.14), (3.1) and (3.3).
(3.13)
L
i,j
We can rewrite (3.10) as
the last step following since E w.. a(T.)
1J
the proof of Theorem 2, while B;2
inequality.
B.
1
= 0p(p)
=
O.
That B*l
n
=
°
p
(p) follows as in
follows from (3.1) and the Chebychev
0
On assumption (2.8).
Huber's Proposal 2 for robust regression is to solve
n
L
(3.12)
i=l
n
(3.13)
e .-
wij a(Ti)(b(E/cr) - bee/Go) - (1/0 - l/OO)E j b'(e/Go ))
L
i=l
~
2
~((Y.
1
-c.
B)/G)c.
-1 -1
((Y.1 - -c.1 8)/G)
=
-
.
~
=0
=E
~
2
(2) ,
the last expectation taken under the standard normal distribution function.
(1973) shows (2.8) under the following conditions:
Huber
..
9
(3.14)
~
is odd and non-decreasing,
~
has two bounded continuous derivatives,
a = 1 and only (3.12) is solved,
yp
+
0, where y is the maximum diagonal element of the projection
-1
.
2
matrix C(C'C)
C' = CC' = r (since y;:=p/n, p In + 0
is necessary.)
The above conditions are very restrictive in assuming a known, and Huber's function
(~(x) =
max(-k, min(k,x))) does not satisfy the smoothness condition.
Carroll and Ruppert (1979) have generalized Huber's result by means of the
techniques used in the proof of Theorem 2.
(3.12)-(3.13) and assume only that
functions used in practice.
~
They utilize the full system
satisfies (2.14), which is true for all
The price paid is a stronger condition on the growth
rate of p; they require that for some sequence an
nya
£
C.
n
>
+
O.
= pin),
When the design is balanced ( y
.
0 suffices, but that requlres p
Smoothness of F.
4+2£
In
+
+
2
0, both yp/a
. n
then a n
+
= p - (1+£)
0 and
for any
O.
Condition (2.5) is rather strong.
Ruppert and Carroll (1979)
show by entirely different methods that when p is fixed and b satisfies (2.14), (2.5)
can be relaxed by requiring only that F is Lipschitz of order one in neighborhoods
of ±c.
10
References
Anscombe, F.J. (1961).
Examination of residuals.
Proa. Fourth Berkeley Symp.
Math. Statist. Prob. 1, 1-36.
Bickel, P.J. (1978).
non-linearity.
Carroll, R.J. (1978).
Using residuals robustly I: tests for heteroscedasticity,
Ann. Statist. 6, 266-291.
On almost sure expansions for M-estimates.
Ann. Statist.
6, 314-318.
Carroll, R.J. and Ruppert, D. (1979).
Regression Estimates.
Huber, P.J. (1973).
On the Asymptotic Normality of Robust
Unpublished Manuscript.
Robust regression: asymptotics, conjectures and Monte-Carlo.
Ann. Statist. 1, 799-821.
Huber, P.J. (1977).
Robust Statistical Procedures.
Ruppert, D. and Carroll, R.J. (1979).
Manuscript in preparation.
SIAM, Philadelphia.
On Bickel's Tests for Heteroscedasticity.
Unclassified
n ere d)
SECURITY CLASslFICA T ION 0 F THIS PAGE (Wh en D eI . IE,
READ INSTRUCTIONS
BEFORE COMPLETING FORM
REPORT DOCUMEHTA,TION PAGE
t.
REPORT NUMBER
4.
TITLE (etId Sublllle)
2. GOVT ACCESSION N~. 3.
RECIPIENT'S CATALOG NUMBER
5. TYPE OF REPORT
On Robust Tests for Hcteroscedasticity
..
a.
PERIOD COVERED
Tedmical
6 . PERmRMING §~G •• REPO/T ~UMBER
1meo erles
7.
B. CONTRACT OR GRANT NUMBER(s)
AUTHOR(.)
Raymond J. Carroll
9.
and David Ruppert
AFOSR 75-2796
10. PROGRAM ELEMENT. PROJECT. TASK
PERFORMING ORGANIZATION NAME AND ADDRESS
AREA
Department of Statistics
University of North Carolina at Chapel Hill
11.
12 1
12.
CONTROLLING OFFICE NAME AND ADDRESS
Air Force Office of Scientific Research
Bolling AFB, DC
14. MONITORING AGENCY NAME l\ ADDREsS(1I dlfferenl Irom COlllrolllnli Office)
a.
WORK UNIT NUMBERS
REPORT DATE
July 1979
13. NUMBER OF PAGES
10
IS.
SECURITY CLASS. (of Ihl. reporl)
Unclassified
15•.
16.
DECLASSIFICATION/DOWNGRADING
SCHEDUIeE
DISTRIBUTION STATEMENT (of thl. Reporl)
Distribution unlimited - approved for public release
.
.'
17.
DISTRIBUTION STATEMENT (of Ihe eb.'rllC' "nler"d In Bloel< 20, II dlflerenl from Reporl)
18.
sUPPL EMENTARY NOTES
19.
KEY WORDS (Conllnu .. on r .. ve,." .Id.. If n"e .....ry . .d Idenilly by bloel< number)
Heteroscedasticity, robustness, linear model, M-estimates, regression,
asymptotic tests.
20.
ABSTRACT (Conllnue on reve,.e .Ide II n.e" •• ery etId id..nllly by block number)
We extend Bickel's (1978) tests for heteroscedasticity to include wider classes
of test statistics and fitting methods. The test statistics include those
based on Huber's function, while the fitting techniques include Huber's Proposal
2 (1977) for robust regression.
DO
FORM
1 JAN 73
1473
EDITION OF I NOV 65 IS OBSOL.ETE
Unclassified
SECURITV CLASSII"CAT;ON OF THIS PAGE (It'h.n Dcte Enterftd)
SECURITY CLASSIFICATION OF THIS PAGE(1f'h... De'. en'.t.d)
..
SECURITY CLASSIFICATlON OF
TL'''
""'GE(w".n DIt'. Ent.