Carroll, R.J. "On Estimating Variances of Robust Estimators when the Errors are Asymmetric."

BIOMATHEMATICS TRAINING PROGRAM

ON ESTIMATING VARIANCES OF ROBUST ESTIMATORS WHEN THE ERRORS ARE ASYMMETRIC

by

Raymond J. Carroll*

University of North Carolina at Chapel Hill
ABSTRACT

We investigate the effects of asymmetry on estimates of variance of robust estimators in location and regression problems, showing that heavy skewness of errors can seriously bias the common variance estimates for location and intercept, a problem that can be corrected by jackknifing for location but is more intractable for the intercept in regression.
The scale parameters in
regression seem not to be as seriously subject to this bias if the sample size
is large compared to the number of parameters.
* Research supported by the Air Force Office of Scientific Research under Grant AFOSR-75-2796.
Introduction

The theoretical results and Monte-Carlo studies in the area of robustness have in the main focused on symmetric distributions (Andrews, et al. (1972)) or procedures which are not scale equivariant (which effectively eliminates most problems due to asymmetry when the number of dimensions in the problem is fixed). Recently, Huber (1973) and Bickel (1978) have examined situations in which the asymmetry of errors can lead to quite complicated results. In this paper we study the effects of asymmetric errors in two very simple situations: (one and two dimensional) location problems and simple linear regression. We have focused our attention on estimating the variability of robust point estimators in these problems.
A major difficulty with considering asymmetric errors has been that location (intercept) is not uniquely defined. However, asymmetric data do occur and there are situations where data transformations to achieve symmetry either make no sense or are not possible.
In regression, it might be conjectured that asymmetry has different effects on intercept and slope (see Section 3); if so, there will be situations where one might invest much effort in data transformations when the parameters of interest are not influenced by the asymmetry.
This alone seems a good reason for studying asymmetry, but as another example,
in ranking and selection problems, one is often interested in the stochastically
largest population; robust estimators have been proposed in place of the sample
mean for this problem, and a data transformation which achieves symmetry for all
populations may not be possible.
In Section 2, we study the effects of asymmetry of errors for the one-dimensional location problem, using M-estimates, trimmed means, and an adaptive trimmed mean due to Hogg (1974). In Section 3, we study M-estimates of regression. Our major qualitative conclusions are as follows:
(1) The published estimates of variance for M-estimates of location will be consistently small if the errors are heavily skewed. This problem appears amenable to solution if a variance estimate suggested by jackknifing is used.

(2) In linear regression, a similar conclusion holds true for the intercept. Theoretical and Monte-Carlo results show that slope parameters are influenced only negligibly by skewness. Jackknifing for the intercept does not work as well here, with the variance estimates now being slightly too large.

(3) In the two-sample problem with equal scales, a robust test statistic can be constructed with a consistent variance estimate, even if the errors are asymmetric. This statistic, however, has unknown sensitivity to heteroscedasticity of variances.
Location Estimates

In this section we define the location estimates used in the study, state two Lemmas, and give a discussion of expected results. An M-estimator is defined as a solution to the equation
(2.1)    Σ_{i=1}^n ψ((X_i - T_n)/s_n) = 0,

where s_n is an estimate of scale. In the Monte-Carlo study, we used two functions ψ defined by

(2.2)    ψ(x) = -ψ(-x) = max(-2, min(x, 2)),

(2.3)    ψ(x) = -ψ(-x) = x,                      0 ≤ x ≤ 2.25
                       = 2.25,                   2.25 < x ≤ 3.75
                       = 2.25(15 - x)/11.25,     3.75 < x ≤ 15
                       = 0,                      x > 15.

The function (2.2) is referred to as Huber's ψ, while (2.3) is Hampel's ψ.
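For concreteness, (2.2) and (2.3) can be coded directly; the sketch below (ours, not part of the original study) uses the constants 2, 2.25, 3.75, and 15 exactly as displayed above.

```python
def psi_huber(x):
    # Huber's psi (2.2): the identity clipped at +/- 2
    return max(-2.0, min(x, 2.0))

def psi_hampel(x):
    # Hampel's redescending psi (2.3), given for x >= 0 and extended
    # by odd symmetry psi(-x) = -psi(x)
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    if x <= 2.25:
        value = x
    elif x <= 3.75:
        value = 2.25
    elif x <= 15.0:
        value = 2.25 * (15.0 - x) / 11.25   # descends linearly to 0 at x = 15
    else:
        value = 0.0
    return sign * value
```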
The solutions to (2.1) were generated iteratively (Gross (1977)), starting with the sample median and with two estimates of scale:

(2.4)    s_{1n} = (median absolute residual)/.6745,

(2.5)    s²_{2n}(new) = {n⁻¹ Σ_{i=1}^n ψ²((X_i - T_n)/s_{2n}(old))} s²_{2n}(old)/a,

where a = E ψ²(Z), the expectation taken under the standard normal distribution function. Two estimates of the variance of n^{1/2} T_n were used, namely
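The iterative scheme of Gross (1977) is not reproduced here, but the following sketch (hypothetical code, using Huber's ψ of (2.2) and the fixed scale s_{1n} of (2.4)) shows one standard way to solve (2.1): rewrite it as a weighted mean with weights ψ(u)/u and iterate from the sample median.

```python
import statistics

def huber_psi(u):
    return max(-2.0, min(u, 2.0))

def m_estimate_location(x, psi=huber_psi, tol=1e-8, max_iter=200):
    # Solve sum_i psi((x_i - T)/s) = 0, as in (2.1), for T.
    T = statistics.median(x)
    # s_1n of (2.4): median absolute residual / .6745, held fixed here
    s = statistics.median([abs(xi - T) for xi in x]) / 0.6745
    if s == 0.0:
        return T                      # degenerate sample
    for _ in range(max_iter):
        # weights w(u) = psi(u)/u turn (2.1) into a weighted mean
        w = []
        for xi in x:
            u = (xi - T) / s
            w.append(1.0 if abs(u) < 1e-12 else psi(u) / u)
        T_new = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
        if abs(T_new - T) < tol:
            break
        T = T_new
    return T
```

For clean data all weights are one and the iteration returns the sample mean; a gross outlier receives weight roughly 2s/|residual| and is largely ignored.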
2
2
(i) = H s. (n-l)
In
1n
D
b.
1
= n -1
n
L
j=l
-1
n
L
j=l
~
2
2
((X.-T )/s. )/b. , where
J .n
1n
1
~'((X.-T )/s. J, H = 1
J n
1n
+
n
n
= 1.08(n-l)-1
(2.7)
w.. =
1J
~((X.-T
J
n
-1
(I-b. )/b.
1
1
2
L
w.. (Y .-T J , where
j=l 1J J n
)/s. )/((X.-T )/s. )
1n
J n
1n
We used the factor 1.08 in D_{2n}(i) so that the two estimates have approximately the same value for normal samples in our Monte-Carlo experiment, and we used the factor H² in D_{1n}(i) as suggested by Huber (1973).

The variance estimates D_{1n}(i) and D_{2n}(i) are essentially as suggested by Gross (1977), the first motivated by the asymptotic variance formula for n^{1/2} T_n,

    ∫ ψ²(x) dF(x) / {∫ ψ'(x) dF(x)}²,

appropriate when F is symmetric, the second suggested by the weighted least squares nature of the iterative estimation procedure.
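A sketch of the computation of D_{1n}(i) in (2.6) follows; it codes the variance formula H² s² Σψ²/((n-1)b²) with Huber's correction factor H, and the caller supplies ψ and its derivative (here Huber's ψ, whose derivative is 1 on [-2, 2] and 0 outside). The function names are ours.

```python
def d1n(x, T, s, psi, dpsi):
    # Variance estimate (2.6) for n^{1/2} T_n:
    #   H^2 * s^2 * (n-1)^{-1} * sum_j psi^2(r_j) / b^2,
    # with r_j = (x_j - T)/s, b = n^{-1} sum_j psi'(r_j),
    # and the small-sample factor H = 1 + n^{-1}(1 - b)/b.
    n = len(x)
    r = [(xj - T) / s for xj in x]
    b = sum(dpsi(rj) for rj in r) / n
    H = 1.0 + (1.0 - b) / (b * n)
    return H * H * s * s * sum(psi(rj) ** 2 for rj in r) / ((n - 1) * b * b)

huber_psi = lambda u: max(-2.0, min(u, 2.0))
huber_dpsi = lambda u: 1.0 if abs(u) <= 2.0 else 0.0
```

When no standardized residual exceeds 2 in absolute value, ψ(r) = r and ψ' = 1, so b = 1, H = 1, and (2.6) reduces to the usual normalized sum of squares s² Σ r²/(n-1).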
In the general case with possible asymmetry, the behavior of M-estimates T_n can be summarized by the following result, for which exact conditions can be generated from the method of proof of Theorem 1 of Carroll (1978a).

Lemma 1. Suppose that for some constants T_i(F), σ_i(F) we have that T_n →_p T_i(F), n^{1/2}(s_{in} - σ_i(F)) = O_p(1), and E ψ((X_1 - T_i(F))/σ_i(F)) = 0. Taking T_i(F) = 0, σ_i(F) = 1 (without loss of generality), for ψ sufficiently smooth,

(2.8)    (E_F ψ'(X_1)) T_n = n⁻¹ Σ_{i=1}^n ψ(X_i) - (E_F X_1 ψ'(X_1))(s_{in} - 1) + o_p(n^{-1/2}).    □
Discussion of Lemma 1

The following qualitative considerations emerge from Lemma 1 and are verified in the Monte-Carlo experiment.

(1) If the distribution function F is not symmetric, then, in general, E_F X_1 ψ'(X_1) ≠ 0, so that D_{1n}(i) (which is asymptotically correct in the symmetric case) will typically underestimate the true variance. The same should hold for D_{2n}(i).

(2) The bias in D_{1n}(i) and D_{2n}(i) should be small for distributions which are nearly symmetric because T_n is a smooth function of the data. The bias will become large as the degree of asymmetry increases.

(3) When s_{2n} is used as scale, T_n should be a particularly smooth function of F. From Jaeckel (1972), this means that jackknifing will probably be effective in estimating the variance of T_n. As pointed out by Huber (1977, p.26), such a variance estimate will be more appropriate for T_n than for the jackknifed version of T_n.
We also consider trimmed means and two adaptive versions. Define U_n(α) (L_n(α)) as the mean of the largest (smallest) nα order statistics. As a measure of tail length, when nα is an integer (true in the situations considered here), Hogg (1974) proposed the ratio

    Q_n = (U_n(.2) - L_n(.2))/(U_n(.5) - L_n(.5)).
Then, if m% refers to an m% symmetrically trimmed mean, the Hogg adaptive estimate is defined by

    HG1 = 5%     if Q_n < 1.81
        = 10%    if 1.81 ≤ Q_n ≤ 1.87
        = 25%    if Q_n > 1.87.
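The selection rule can be sketched as follows; the tail-length value Q_n is passed in as a number, so the code does not commit to a particular form of Q_n, and the cutoffs 1.81 and 1.87 are those in the display above.

```python
def mean_largest(x, k):
    # U_n(alpha) with k = n*alpha: mean of the k largest order statistics
    return sum(sorted(x)[-k:]) / k

def mean_smallest(x, k):
    # L_n(alpha) with k = n*alpha: mean of the k smallest order statistics
    return sum(sorted(x)[:k]) / k

def trimmed_mean(x, frac):
    # symmetrically trimmed mean: drop frac*n observations from each tail
    n = len(x)
    g = int(frac * n)
    kept = sorted(x)[g:n - g]
    return sum(kept) / len(kept)

def hg1(x, q_n):
    # Hogg's adaptive estimate: trimming fraction chosen by tail length q_n
    if q_n < 1.81:
        return trimmed_mean(x, 0.05)
    elif q_n <= 1.87:
        return trimmed_mean(x, 0.10)
    else:
        return trimmed_mean(x, 0.25)
```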
The estimate of the variance of an m% trimmed mean is given by Shorack (1974); Huber (1977, eq.(10.4)) has shown that this estimate essentially arises from jackknifing the m% trimmed mean. Calling this variance σ²(m%), we define

    D(Hogg) = σ²(m%)    if HG1 = m%.

We also consider an estimate suggested by Switzer (1970), defined by

    Switzer = m% trimmed mean    if σ²(m%) is minimum among σ²(5%), σ²(10%) and σ²(25%).
If one assumes that n^{1/2}(Q_n - Q(F)) has a limit distribution, it is easy to obtain the following disturbing result, precise conditions for which could be given but are omitted.

Lemma 2. Suppose Q(F) = 1.81. If F is symmetric about θ(F), then n^{1/2}(HG1 - θ(F)) has a non-normal limit distribution. If F is not symmetric, there typically exists no finite θ for which n^{1/2}(HG1 - θ) has a limit distribution.
Sketch of proof. Let T_n(m) be the m% symmetrically trimmed mean with n^{1/2}(T_n(m) - θ(m)) having a normal limit distribution. If F is symmetric about θ(F), since Q_n is even and T_n(m) is odd, n^{1/2}(T_n(m) - θ(F)) is asymptotically independent of n^{1/2}(Q_n - Q(F)). Define

    A_{1n} = {Q_n < 1.81},    A_{2n} = {1.81 ≤ Q_n ≤ 1.87},    B_1 = {n^{1/2}(HG1 - θ(F)) ≤ z}.

As n → ∞,

    P(B_1) ≥ P(A_{1n} ∩ B_1) + P(A_{2n} ∩ B_1)
          = P(A_{1n} and n^{1/2}(T_n(5) - θ(F)) ≤ z) + P(A_{2n} and n^{1/2}(T_n(10) - θ(F)) ≤ z),

and by the asymptotic independence the limit of the right hand side is a mixture of normal distribution functions with different scales, verifying the first part of Lemma 2. If F is asymmetric and θ(5) ≠ θ(10) (the usual case), for any θ_0,

(2.9)    P(n^{1/2}(HG1 - θ_0) ≤ z) = P(A_{1n} and n^{1/2}(T_n(5) - θ_0) ≤ z)
                                     + P(A_{2n} and n^{1/2}(T_n(10) - θ_0) ≤ z)
                                     + P(Q_n > 1.87 and n^{1/2}(T_n(25) - θ_0) ≤ z).

Clearly, no θ_0 can be chosen so that the right hand side of (2.9) converges to a probability distribution.    □
Because of Lemma 2 we expect HG1 to do very poorly in terms of efficiency and estimating variance if F is asymmetric and Q(F) ≈ 1.81. For the negative exponential distribution, Q(F) ≈ 1.805, so particular problems might be expected here.
The location estimates used in the study we report here are given in Table 1. These represent a portion of the results in a larger study which leads to the same conclusions.

The Monte-Carlo study used a shuffled congruential random number generator to obtain the uniform random deviates. The Box-Muller algorithm was used to obtain the standard normal deviates. Due to time and financial considerations, various sample sizes (500 ≤ N ≤ 2000) were used, so that we also report standard errors.
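The Box-Muller transform mentioned above maps two independent uniforms into two independent standard normals; the skewed sampling situations of the study are then simple transformations of Z. A sketch follows (the shuffled congruential generator itself is not reproduced; Python's default generator stands in for it):

```python
import math
import random

def box_muller(rng=random):
    # two independent U(0,1) draws -> two independent N(0,1) draws
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(-2.0 * math.log(1.0 - u1))   # 1 - u1 avoids log(0)
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def draw_point_50N2():
    # one draw from the sampling situation Z + .50 Z^2
    z, _ = box_muller()
    return z + 0.50 * z * z
```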
TABLE 1

A description of the location estimates used in the study

    Code       Estimate
    M          Sample mean
    10%        10% symmetrically trimmed mean
    HG1        See text
    Switzer    See text
    Hub 11     Huber with s_{1n} and D_{1n}(1)
    Hub 22     Huber with s_{2n} and D_{2n}(2)
    Hamp 12    Hampel with s_{1n} and D_{2n}(1)
    Hamp 21    Hampel with s_{2n} and D_{1n}(2)
    Hub 1J     Huber jackknife variance estimate with s_{1n}
    Hamp 1J    Hampel jackknife variance estimate with s_{1n}
    Hub 2J     Huber jackknife variance estimate with s_{2n}
Let Z be a standard normal random variable. While a wide range of sampling distributions were investigated (both symmetric and asymmetric), the five presented here are representative.

    Distribution                        Code
    Z                                   N(0,1)
    Z + .10Z²                           .10N²
    Z + .50Z²                           .50N²
    Negative exponential, mean 1.25     NE
    .50 Exp(Z)                          EXP(Z)

The second (.10N²) is only slightly skewed and was chosen from a larger set (Z + .05Z², Z + .25Z², EXP(.10Z), EXP(.25Z)) as a reasonable representative of the class of distributions close to, but not exactly, symmetric. Such data might arise for example from data transformations which only achieve approximate symmetry. The sample size is n = 20.
If N is the number of Monte-Carlo experiments and Y_1, Y_2, ..., Y_N the realized values of a location estimator, the (standardized) Monte-Carlo variance is

    σ²_n = (n/N) Σ_{i=1}^N (Y_i - Ȳ)².

The average value of a variance estimate of n^{1/2} T_n is denoted by σ̂²_n. Table 2 presents the values of σ²_n and σ̂²_n and their standard errors.
TABLE 2

The first row for each estimator gives the Monte-Carlo variance σ²_n of the location estimate and the Monte-Carlo average σ̂²_n of the appropriate variance estimate. The second row consists of standard errors. (Entries illegible in the source are marked *.)

               N(0,1)          .10N²           .50N²           NE              EXP(Z)
    Code       σ²_n   σ̂²_n    σ²_n   σ̂²_n    σ²_n   σ̂²_n    σ²_n   σ̂²_n    σ²_n   σ̂²_n
    M          1.03    .99       *      *       *      *        *      *       *      *
               (.03)  (.01)
    10%        1.09     *        *      *       *      *        *      *       *      *
               (.03)
    HG1        1.09     *        *      *       *      *        *      *       *      *
               (.03)
    Switzer    1.07     *        *      *       *      *        *      *       *      *
               (.03)
    Hub 11     1.05   1.01     1.06    .99    1.49   1.08     1.50   1.20     .68    .46
               (.03)  (.01)    (.03)  (.01)   (.05)  (.02)    (.05)  (.02)   (.03)  (.01)
    Hub 22     1.05   1.02     1.07   1.02    1.54   1.21     1.52   1.31     .68    .56
               (.03)  (.01)    (.03)  (.01)   (.05)  (.02)    (.05)  (.02)   (.03)  (.01)
    Hamp 12    1.04   1.06     1.06   1.07    1.64   1.24     1.54   1.38     .77    .53
               (.03)  (.01)    (.03)  (.01)   (.05)  (.02)    (.05)  (.02)   (.03)  (.01)
    Hamp 21    1.05   1.01     1.07   1.00    1.60   1.14     1.54   1.28     .74    .45
               (.03)  (.01)    (.03)  (.01)   (.05)  (.02)    (.05)  (.02)   (.03)  (.01)
    Hub 1J     1.05   1.01     1.06   1.02    1.49   1.49     1.50   1.62     .68    .70
               (.03)  (.01)    (.04)  (.01)   (.05)  (.03)    (.05)  (.03)   (.03)  (.02)
    Hamp 1J    1.04   1.01     1.06   1.02    1.64   1.53     1.54   1.57     .77    .81
               (.03)  (.01)    (.03)  (.01)   (.05)  (.02)    (.05)  (.02)   (.03)  (.02)
    Hub 2J     1.05   1.03     1.07   1.05    1.54   1.61     1.52   1.70     .68    .75
               (.03)  (.01)    (.03)  (.01)   (.05)  (.05)    (.05)  (.05)   (.03)  (.03)
Discussion of the results for location estimates

The tabled results confirm our earlier speculations and might be summarized as follows:

(1) It is difficult to estimate the variance of either HG1 or Switzer. Jackknifing will probably not help here as the estimates are not smooth functionals of the data.

(2) The two commonly used scales (s_{1n}, s_{2n}) lead to similar results.

(3) The two variance estimates D_{1n}(i) and D_{2n}(i) do quite well for N(0,1) and .10N², which are symmetric or nearly so, but are not to be trusted for heavily skewed data. Either a jackknife or transformations appear necessary.

(4) The jackknifed variance estimates are a dramatic improvement on D_{1n}(i) and D_{2n}(i). There is no simple pattern to these figures, but generally the jackknifed variance estimates appear to become more conservative as we pass from N(0,1) to EXP(Z).
The optimistic interpretation given here to the use of the jackknife for variance estimation in robustness of location contrasts with that of Braun (1975), who concluded that jackknifing and robust estimation do not get along very well, and that variance estimates obtained from jackknifing are not reliable. The closest he came to considering our M-estimates was a one-step M-estimate starting at the median and the median absolute deviation from the median (MAD, essentially our s_{1n}). We believe that our results are more favorable to the jackknife because the statistics we have considered are much smoother functionals of the data than are one-step M-estimates starting at (median, MAD) (a conjecture of this type has been made by Gross (1976)), because instead of comparing σ²_n and σ̂²_n as we do here, he compared (in the main) log σ̂²_n to log σ²_n, which is more unstable when σ²_n ≈ 1, and because asymmetry is unfavorable to (2.6) and (2.7). We used the one-step M-estimate with ψ given by (2.2) and starting at (median, MAD/.6745) and obtained the table
                           Z       .10Z²    .50Z²    NE       EXP(Z)
    σ²_n                   1.06    1.08     1.64     1.55     .58
    σ̂²_n from (2.6)        .99     1.06     .97      .85      .33
    σ̂²_n from jackknife    1.04    1.09     1.90     1.76     .65

Thus, for one-step M-estimates, the jackknife is more unstable than it is with smoother M-estimates.
Regression

We next investigate regression to see if phenomena similar to those found in the previous section continue. Our intuition says that, under asymmetry, the common estimates of variance will be inconsistent for intercept but acceptable for slope. In this section we sketch a proof confirming this conjecture and illustrate small sample results with a Monte-Carlo experiment.
The model we consider is (the use of σ₀ will become clear later)

(3.1)    y_i = x_i β₀ + σ₀ ε_i,

where the ε_i are i.i.d. random variables with E ψ(ε₁) = 0 and x_i = (1, x_{i1}, ..., x_{ip}). We consider a version of Huber's Proposal 2 (Huber (1977), p.37), which involves solving the equations

(3.2)    n⁻¹ Σ_{i=1}^n x_i ψ((y_i - x_i β̂_n)/s_n) = 0,

(3.3)    (n - p)⁻¹ Σ_{i=1}^n ψ²((y_i - x_i β̂_n)/s_n) = a.
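A minimal sketch of how (3.2) and (3.3) can be solved for simple linear regression (p = 2): alternate a weighted least squares step for β̂ with the scale update implied by (3.3). This is an illustrative toy implementation, not the algorithm of the original study; the constant a is computed as E ψ²(Z) for the Huber ψ with corner c.

```python
import math

def mad_scale(res):
    # median absolute residual / .6745, as in (2.4)
    srt = sorted(abs(r) for r in res)
    n = len(srt)
    med = srt[n // 2] if n % 2 else 0.5 * (srt[n // 2 - 1] + srt[n // 2])
    return med / 0.6745

def huber_proposal2(x, y, c=2.0, iters=100):
    # Alternate WLS for (intercept, slope) with the scale step of (3.3).
    n, p = len(x), 2
    phi = lambda t: math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    # a = E psi^2(Z), Z ~ N(0,1), for psi(u) = max(-c, min(u, c))
    a = (2.0 * Phi(c) - 1.0) - 2.0 * c * phi(c) + 2.0 * c * c * (1.0 - Phi(c))
    psi = lambda u: max(-c, min(u, c))

    def wls(w):
        # weighted least squares for intercept and slope
        sw = sum(w)
        xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
        yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxy = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y))
        sxx = sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
        b1 = sxy / sxx
        return yb - b1 * xb, b1

    b0, b1 = wls([1.0] * n)                 # least squares start
    res = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    s = mad_scale(res)
    if s < 1e-12:
        return b0, b1, 0.0                  # exact fit; nothing to iterate
    for _ in range(iters):
        # scale step: one fixed-point update of (3.3)
        s = s * math.sqrt(sum(psi(r / s) ** 2 for r in res) / ((n - p) * a))
        # beta step: (3.2) as WLS with weights psi(u)/u
        w = [1.0 if abs(r / s) < 1e-12 else psi(r / s) / (r / s) for r in res]
        b0, b1 = wls(w)
        res = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, s
```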
While we will assume the x_i are constants, the first two conditions of Lemma 3 (to follow) are reasonable and may be justified by quoting results of Maronna and Yohai (1978). In a subset of their paper, they assume (y₁, x₁), (y₂, x₂), ... is a sample from a distribution function F, with β₀, σ₀ solving

    E ψ((y - x β₀)/σ₀) = 0,    E ψ²((y - x β₀)/σ₀) = a.

Defining ε_i = (y_i - x_i β₀)/σ₀, they show essentially that if the ε_i are independent of x_i and if β̂_n, s_n satisfy (3.2) and (3.3) (the latter with (n - p) replaced by n), then n^{1/2}(β̂_n - β₀) and n^{1/2}(s_n - σ₀) are asymptotically normally distributed.
The proof of our result involves Taylor expansions along the lines of Carroll (1978a, 1978b) and is omitted to avoid cluttering up the paper with messy calculations.

Lemma 3. Recall that our x_i are non-stochastic. Suppose that

    n^{1/2}(β̂_n - β₀) = O_p(1),    n^{1/2}(s_n - σ₀) = O_p(1),
    n⁻¹ X'X → V (positive definite),    n⁻¹ Σ_{i=1}^n x_i → (1, 0, 0, ..., 0) = w.

Then, for ψ sufficiently smooth,

(3.4)    (E ψ'(ε₁)) V n^{1/2}(β̂_n - β₀) = σ₀ n^{-1/2} Σ_{i=1}^n x_i' ψ(ε_i) - D n^{1/2}(s_n - σ₀) w + o_p(1),

where D = E ε₁ ψ'(ε₁).    □

Corollary. In simple linear regression with β₀ = (β_int, β_slope) and β̂_n = (β̂_int, β̂_slope), under the conditions of Lemma 3,

(3.5)    (E ψ'(ε₁))(β̂_int - β_int) = σ₀ n⁻¹ Σ_{i=1}^n ψ(ε_i) - D(s_n - σ₀) + o_p(n^{-1/2}),

(3.6)    (E ψ'(ε₁))(β̂_slope - β_slope) = σ₀ (Σ_{i=1}^n x_{i1}²)⁻¹ Σ_{i=1}^n x_{i1} ψ(ε_i) + o_p(n^{-1/2}).    □

Similar results hold for the general regression problem.
Similar results hold for the general regression problem.
Discussion of Lemma 3.
The following conclusions emerge from Lemma 3 (for
large samples) and are partially confirmed for small samples in the Monte-Carlo
experiment to follow.
(1) It can be shown that the representations (2.8) and (3,5) are equivalent when scale is estimated by s2n'
(2) The representation (3.6) implies that in large samples, even with
asymmetry
(3.7)
14
which is precisely the result obtained by Huber (1973) for known scale and p
fixed.
~
.,
(3) Thus, for fixed p as n becomes large, the asymptotic variance formulas for β̂ obtained in the symmetric case with known scale are correct in general except when applied to the intercept. We expect to experience the same difficulty in estimating the variance of the intercept that we found in the previous section.

(4) For estimating the variance of the intercept, the representation (3.5) suggests jackknifing, and equation (3.6) shows that β̂_slope is sufficiently smooth for jackknifing.
We constructed a Monte-Carlo experiment for n = 20 in the simple linear regression model

    y_i = 1 + x_i/2 + ε_i,

where the values of x are -.95, -.90, ..., .90, .95 and the error distributions were as in the previous section. There were N = 2000 iterations for least squares and the Huber estimate, and N = 1200 iterations for the jackknife. In Table 2 we present the values of σ̂²_n/σ² for β̂_slope and β̂_int using least squares, the Huber Proposal 2 with ψ(x) = max(-2, min(x, 2)), and the jackknifed variance estimate for the Huber Proposal 2. The estimate of variance used is essentially (2.6): if r_i = (y_i - x_i β̂_n)/s_{2n},

    D_{1n}(2) = H² s²_{2n} (n - p)⁻¹ Σ_{j=1}^n ψ²(r_j)/b², where

    b = n⁻¹ Σ_{j=1}^n ψ'(r_j),    H = 1 + p(1 - b)/(bn).
In Table 3 we present the average values of β̂_slope and β̂_int for both least squares and Proposal 2. We conclude:
(1) The slope estimate β̂_slope is hardly influenced by even large dosages of asymmetry. It appears relatively unbiased and its variance can be assessed relatively accurately. In separate Monte-Carlo experiments, we have found that this phenomenon extends to a quadratic regression with uniform design (n = 30) and to the poison-treatment 3×4 design of Box and Cox (1964).

(2) The intercept estimate β̂_int is influenced by asymmetry, and standard variance formulas will tend to underestimate the true variance (in asymmetric cases) even more than will happen in least squares.

(3) The jackknife appears to be conservative, more so than in the location case. Hinkley (1976) shows that the jackknife as employed here (the so-called balanced jackknife) leads in general to a biased variance estimate even in least squares, unless the design is balanced. In his example, the design is much less balanced than ours and the behavior of the balanced jackknife much worse. Further study of his "weighted" jackknife appears necessary, although for this particular design, our unreported simulation results for the weighted jackknife are virtually identical to those given here for the usual jackknife.
(4) Consider a two-sample location problem

    Y_ij = μ_j + ε_ij    (j = 1, 2;  i = 1, ..., n_j).

The results of the previous section show that a statistic formed by analogy with the t-statistic (X̄ replaced by T_n(X_1, ..., X_n), etc.) will tend to have a higher type I error than advertised. However, define

    (n₁ + n₂)θ₀ = n₁μ₁ + n₂μ₂,    (n₁ + n₂)θ₁ = n₂(μ₁ - μ₂),

and

    z_i = Y_{i1},        x_i = 1          (i = 1, ..., n₁)
    z_i = Y_{i-n₁,2},    x_i = -n₁/n₂     (i = n₁ + 1, ..., n₁ + n₂).

Then we have the linear model z_i = θ₀ + θ₁ x_i + ε_i and, by estimating scale simultaneously through (3.2) and (3.3), Lemma 3 and its Corollary tell us that we can consistently estimate the variance of the estimate of μ₁ - μ₂ and hence obtain proper tests. This of course leads to the usual t-statistic if ψ(x) = x.
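The algebra of this reparameterization is easy to verify numerically (a check with arbitrary illustrative values of n₁, n₂, μ₁, μ₂, not data from the study): θ₀ + θ₁x_i reproduces μ₁ on the first sample and μ₂ on the second, and the x column sums to zero, so μ₁ - μ₂ behaves like a slope in the sense of Lemma 3.

```python
n1, n2 = 7, 13            # illustrative group sizes
mu1, mu2 = 3.0, 5.0       # illustrative group means
theta0 = (n1 * mu1 + n2 * mu2) / (n1 + n2)
theta1 = n2 * (mu1 - mu2) / (n1 + n2)
x = [1.0] * n1 + [-n1 / n2] * n2
fitted = [theta0 + theta1 * xi for xi in x]

assert abs(sum(x)) < 1e-12                             # centered design column
assert all(abs(f - mu1) < 1e-12 for f in fitted[:n1])  # group 1 mean recovered
assert all(abs(f - mu2) < 1e-12 for f in fitted[n1:])  # group 2 mean recovered
```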
TABLE 2

The entries are the ratio σ̂²_n/σ² for the appropriate estimates and sampling situations. (Entries illegible in the source are marked *.)

                          N(0,1)           .10N²            .50N²            NE               EXP(Z)
    Estimation Method     β̂_int  β̂_slope  β̂_int  β̂_slope  β̂_int  β̂_slope  β̂_int  β̂_slope  β̂_int  β̂_slope
    Least Squares         .98    .98      .96    .98      .98    .99      .99    1.00     1.00   1.00
    Huber                 .95    .98      .93    .98      .69    .94      .75    .96      .83    .99
    Jackknifed Huber      1.02   1.06      *      *        *      *       1.07   1.11     1.11   1.14
TABLE 3

The entries are the average values of the parameters over the sampling situations. The true values are β_int = 1.00, β_slope = .50.

                               N(0,1)    .10N²    .50N²    NE      EXP(Z)
    β̂_int, Least Squares       1.00      1.05     1.00     1.00    1.00
    β̂_slope, Least Squares     .49       .49      .48      .48     .48
    β̂_int, Huber               1.00      .99      .92      .94     .90
    β̂_slope, Huber             .49       .49      .49      .48     .50
References

Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H., and Tukey, J.W. (1972). Robust Estimates of Location: Survey and Advances. Princeton University Press.

Bickel, P.J. (1978). Using residuals robustly I: tests for heteroscedasticity, nonlinearity. Ann. Statist. 6, 266-291.

Box, G.E.P., and Cox, D.R. (1964). An analysis of transformations. J. R. Statist. Soc. B 26, 211-252.

Braun, H. (1975). Robustness and the jackknife. Technical Report 94, Series 2, Department of Statistics, Princeton University.

Carroll, R.J. (1978a). On almost sure expansions for M-estimates. Ann. Statist. 6, 314-318.

Carroll, R.J. (1978b). On almost sure expansions for multivariate M-estimates. To appear in J. Mult. Analysis.

Gross, A.M. (1976). Confidence interval robustness with long-tailed symmetric distributions. J. Am. Statist. Assoc. 71, 409-416.

Gross, A.M. (1977). Confidence intervals for bisquare regression estimates. J. Am. Statist. Assoc. 72, 341-354.

Hinkley, D.V. (1976). On jackknifing in unbalanced situations. Technical Report No. 22, Division of Biostatistics, Stanford University.

Hogg, R.V. (1974). Adaptive robust procedures: a partial review and suggestions for future applications and theory. J. Amer. Statist. Assoc. 69, 909-923.

Huber, P.J. (1973). Robust regression: asymptotics, conjectures, and Monte Carlo. Ann. Statist. 1, 799-821.

Huber, P.J. (1977). Robust Statistical Procedures. SIAM, Philadelphia.

Jaeckel, L.A. (1972). The infinitesimal jackknife. Bell Laboratories Memorandum MM-72-1215-11.

Maronna, R.A., and Yohai, V.J. (1978). Robust M-estimators for regression with contaminated independent variables. Unpublished manuscript.

Shorack, G.R. (1974). Random means. Ann. Statist. 2, 661-675.

Switzer, P. (1970). Efficiency robustness of estimators. Proc. Sixth Berkeley Symp. Math. Statist. Prob. 1, 283-291.
REPORT DOCUMENTATION

Title: On Estimating Variances of Robust Estimators When the Errors are Asymmetric
Author: Raymond J. Carroll
Report type: Technical; Mimeo Series No. 1177
Report date: August 1978
Performing organization: Department of Statistics, University of North Carolina, Chapel Hill, North Carolina 27514
Controlling office: Air Force Office of Scientific Research, Bolling AFB, Washington, D.C.
Grant number: AFOSR-75-2796
Number of pages: 19
Security classification: UNCLASSIFIED
Distribution statement: Approved for public release; distribution unlimited
Key words: Regression, Robustness, Asymmetry, Jackknife, M-estimates