Carroll, Raymond J.; (1975)Studentizing Robuts Estimates."

* This
research was supported by the Air Force Office of Scientific Research
under Grant AFOSR-75-2796.
Studentizing Robust Estimates
by
Raymond J. Carroll*
.
Department of Statietias
UniveX's-tty of NoX'th CaroUna at Chapel. HiZ'"
Institute of Statistics Mimeo Series No. 1085
August~
1975
:0
STUDEIHIZWG ROBUST ESTmATES
by
Raymond J. Carroll *
University of North Carolina
Summary
The consistency of connnonly used variance estimates for studentizing
robust estimates of location is studied ",Then the tmderlying distributions
are asymmetric.
Surprisingly\' the popular methods for studentizing M-
estimates\, jackknifed M-estimates and adaptive trimmed means tmderestimate
the true variance by a factor approaching 50% tmder the negative exponential
distribution. Possible reasons for the failures and suggested modifications
are presented.
Key Words and Phrases: RobustnessI' t-tests\, asymmetryl' lVbnte-Carlo\,
trimmed means, 11-estimators\, adaptive estimators
Al1S (1970) Subject Classifications: Primary 62G35; Secondary 62E25
* TIlis
research was supported by the Air Force Office of Scientific Research
under Grant AFOSR-75-2796.
2
Introduction
This project started with the problen of sequential fixed-width
interval estimation of the lirl1iting functional of robust estimators.
We first compared a number of esti.r.lators in many symmetric situations
with predictable results (concerning robustness of efficiency) which
will not be reported here (see
(1975) ).
Andrews~
et al (1972) ~ Carroll and Wegman
When the sampling distribution was the negative
exponential~
the situation was surprisingly degenerate; the modern robust estimators
all gave much lower coverage probabilities than advertised. After some
study ~ we discovered that the corrmonly used variance estimates underestimated the true variance by nearly
were being taken.
50%~
so that too few observations
In the context of the two-sample location
this means the Type I error is much higher than desired.
application of robust two-sample tests can be
problem~
Before routine
recommended~
it is important
to discover where the studentization fails and to suggest appropriate
modifications to existing procedures.
This paper describes the results of a Monte-Carlo
experiment~
comparing
true and estimated variances in asymmetric situations. We show that in
asymmetric cases such as the negative exponential
distribution~
hard
adaption (Hogg (1974) ) ~ (common) M-estimators and jackknifed Iv1-esfi.matorS ~
al:i fail. .::Our major iiiitiill'success was obiained 'wl:th,.t'rir:Uned means.""'fhe
reasons·. for the 9ccurrence of the' obs~rved phenotwna are· inve:stigated'~ and
reconmendations for future study are made.
'e
3
The Estimators
Define U(a) ,(L(a,)) -as the
rfum~e1ts
of the largest (smallest)
[a(n~l)]
As a measure of tail length~ Hogg proposed
order statistics.
Q
!.~le.a:,:
= (U(.20)
- L(.20))/(U(.50) - L(.50)).
scale is defined by
sn
= median{IXi
with the property that
sn
-+-
1
- sample medianl}/.6754~
in the nomal case.
the estimators of location used in this study.
In Table 1 we list
The variance estimate
for the sample mean M is the usual sample variance ~ while for the tri'llffioo
means
5% - 38%
it is the jackknifed variance estimate (Shorack (1974)).
For the one-step Harnpels and I-Iubers
(1970)) ~
12A-25A, HUB)
n
n
(HGI~ HG2~ 1.81D~ 1.90D~ 2.00A~
estm.ate of the chosen location estimate is used.
(1)
L8lDL the variance
For the jacldmifed
Huber (JHUBL the jackknifed variance estilnate is used.
0-e
it is (HUber
.i Z x-Tn
. . ns ft/J -s- dFn(x)
.
For the adaptors
(DlO-D20~
4
TABLE 1
----
A description of the location estimates used in the study.
Code
M
Description
Sample mea.'1
5%
10%
25%
38%
HG1
HG2
DI0
DIS
D20
1.81D
1.90
hUB
JHUB
l2A
2lA
25A
2.00A
1.81A
°e
a% means e;n a% synunetrica1ly trimmed
mean.
38% has a = .375 .
{
5%
QSI.81
10% if 1.8l<QSl.B7
{ 25%
Q>1.87
5%
QSl.80
(.05+(Q-l.80)(4/3))lOO% if 1.80<Qsl.95
{
25%
Q>1.95
One-step Huber estimates~ with start being the
sanple median$ scale sn (Andrews~ et a1 (1972) ).
{ TIle :knot in the W fl..1Jlction for DK is K./lO.
D20
QSI.81
DIS
1.8l<Q~I.87
{ DIO
Q>I.87
D20
QSl.90
DIS if 1.90<Q~2.0S
{ DlO
Q>2.05
Solution (via method_.Of bisection with 20
iterations) to EW(snl(Xi-t)) = 0$ where
{
W(x) = max ( -1. 5» min (x$ 1. 5)).
A jackknifed version of HUB» with n = 50.
One-step Hampel estirnates~ with knots
given in Andrews~ et al (1972). Start is
{ the sample median~ scale is sn'
21A if Q<2.00
{ 12A if Q~2.00
2SA
QSl.81
21A if 1.81<Qsl.87
{ 12A
Q>1.87
5
RandoM Numbers
29
A congruential random number generator (period> 2 ) with a slmffling
feature 't'ms used to obtain the unifonn random variables ~ while standard
nomal observations were obtained by the Box-Muller algorithm.
The shuffler
first generated 300 uniform deviates and then chose one of them at
random; this effectively overcomes the undesirable dependence features
of the usual congroential generator.
No IVbnte-Carlo Sl'.rindle was
used~
so reported results will be accurate to about one decimal place ~ more
then enough to make the conclusions given here.
SamolinQ
....
.... Distributions
Ten sampling distributions were studied.
non~l
If
Q)
is the standard
distribution function, the first six had distributions as £0110\l1s:
'"'; ·Code
~(x)
N(O,l)
.95Q)(x)+.05~(x-l)
N(O,l)+.OSN(l~l)
.90~(x)+.lO~(x-1)
N(O,l)+.lON(l,l)
.95<p(x)+.OS~(x-3)
N(O~1)+.OSN(3~1)
. 90~(x)+.lO<p(x-3)
Negative exponential with mean 1.'25
N(0,1)+.lON(3.l)
NE
Letting X be a standard nomal random variable, the last four "Jere
generated by
"e
12.Sexp(.10X)
4. 88exp (. 25X)
2. 46exp (. SOX)
. 4gexp (X)
Exp(.lOX)
Exp(.2SX)
Exp(. SOX)
Exp(X)
6
Samples of size n = 75 were taken for each iteration.
There were
500 iterations on N(011)1 1000 iterations on NE and Exp(X)1 and
250 iterations on the others.
f~ajor
Resul ts
This paper focuses on the effects of asymmetry on the consistency
of variance estimates of robust estimators.
size.
N is the number of
experiments~
If n = 75
is the sample
and Yp
'YN the realized
values of a particular estimator> the (standardized) Monte-Carlo variance
of the estimator is
2
(1
n \"n
v
= 1'1 l.l (Yi - !N)
2
•
The average value of a variance estimate for a particular location estimator
is denoted by (1~.
distributions.
Table 2 presents values of cr~/(12 for all ten sampling
Table 3 presents values of (12 and (1; for l\J(O.lL
NE 1
and Exp eX) > the first acting as a standard> "Iith the latter two representing
Hworst casan asyrrrnetric distributions.
.
7
TABLE ?
---~-----
2 2
Values of the ratio crn/,cr, • A consistent variance estimate should have
.
,..
,
,
,
": t~
.
'
n-fi .
0
'--'
t<")
t<")
'--'
'--'
~
@
0
Lt'.l
r-l
0
U"l
0
+
+
,......,
+
,-...
+
,-...
r-l
I-!
r-{
r-l
.
~
...
.
Z
.
'-'
...
..
'--'
:..,
:2:;
1.1
.97
.99
.94
1.11
5%
.95
.94
.97
.93
1.14
.95
.92
1.15
.96
.93
1.10
1.04
LOO
1.13
.92
.90
1.06
.94
.97
.98
1.03
.94
1.09
1.02
.95
1.12
1.02
1.04
1.02
.95
.95
.95
1.13
1.11
1.12
.99
.84
.93
10%
25%
38%
HGI
HG2
D10
D15
D20
l.81D
1.90D
HUB
JI-rJB
12A
21A
25A
2.0DA
I.8lA
-·e
.......
LJ')
'"
'-'
:z;
8
e
...
'"
0
~
.93
.92
.90
.97
.90
.95
.97
.91
.97
1.07
1.00
.97
1.01
1.01
1.01
1.00
0
.95
.96
.97
.96
.94
0
0
'"
'--'
z
",
\
'~;
.
.-I
.
"
r--..
r-l
~
r-l
r-l"
r-l
1""'\
~
'--'
...".
,....."
-thi-s v'alue: neaf''"f.Ou. '
f
Z
.
0 "
'--'
:z.
LOO
1.10
1.06
.99
1.02
.90
.97
.95
.97
.98
.99
.96
.93
.92
1.12
.98
.94
.93
.94
.92
1.12
1.01
1.11
1.03
1.11
1.08
g
.
~
~
.
R
0
.
Jf
Lt'.l
r-l
N
'--'
'--'
'-'
'-'
1.01
.96
.97
.93
1.00
.96
.90
1.02
.97
1.05
.97
.88
1.04
.97
1.04
1.00
1.02
.86
.88
1.05
1.02
.94
.94
.99
.53
.84
.97
.70
.98
.62
.60
.87
.79
.62
.54
1.08
.89
.82
.52
.59
.92
.75
.53
.68
.95
.91
.88
.88
.91
.90
.89
.81
.78
.59
.82
.74
.54
ill
~
.59
.59
.60
.61
.49
.62
~~
.94
.88
.90
.91
.93
~'
.67
.84
.90
.75
.81
.86
.65
.99
.70
.61
.91
.93
.90
.90
.42
.91
.84
.80
R
I
.42
.52
.53
.44
.50
.55
.35
.34
8
.
TABLE 3
Values of
cin
and a2 for three sampling distributions.
r
N
'0
I
~
H
5%
10%
25%
38%
HG1
HG2
D10
DIS
D20
1.81D
1.90D
HUB
---'"
r-i
0
'-'
z
1.03
1.09
1.13
1.31
1.46
1.13
1.04
1.21
1.10
1.05
1. 21
1.05
12A
2lA
25A
.98
1.01
1.10
1.00
1.01
2.00A
1.8lA
1.01
1.00
JHUB
NI::l
'0
N
I
---...
r-l
0
'--'
z
1.00
1.04
1. 07
1.22
1.35
1.02
1.01
1.09
1.04
1.02
1.10
1.02
1.04
1.01
1.07
1.02
1.01
1.02
1.01
'0
N'Or:.
R
'--'
---
I
N
'0
I
~
1.54
1.45
1.35
1.31
1.38
2.30
2.07
1.37
1.49
1.53
1.30
1. 71
1.52
1.43
1.57
1.66
1.65
1.71
2.38
NI:l
'0
I
~
1.55
1.38
1.31
1.30
1.40
1.22
1.24
.74
.88
1.04
.78
1.00
.91
.87
.77
1.05
1.15
1.04
1.01
I
><
'--'
~
B
1.10
.58
.45
.35
.36
.56
1.10
.61
.47
.35
.35
.35
.36
.19
.55
.37
.45
.51
.43
.60
.49
.47
.44
.55
.59
.72
.62
.24
.30
.23
.25
.25
.'25
.19
.28
.32
.25
.21
9
I.
(Trimmed I'feans)
The syrn.etrically trmed means H (or 0%),
5% - 38% vJere the
only estimators from Table 1 \-IDose variance estimates consistently estimated
the true variance over the whole range of distributions.
Hence, if the
user has serious concern about asymmetry, trimmed means would be recommended
for their robustness of validity.
The first Hogg adaptor (HGl) uses a discrete adaptive approach which
chooses among a finite set of possible trimmed means.
choice used here fails at NE ~
Exp (. SOX), and
The particular
Exp (X) .
This is most
lmusual, especially in view of the success of individual trimmed. means.
(1~ is in line with individual
Note that in Table 3, while the value for
trimmed means, the value for (12
is considerably larger in magnitude.
The following arguments should indicate in part that this explosion of
(12 can be expected to occur over the whole range of discrete adaptors.
Denote Q = Q(n,F)
to emphasize the dependence on the srorple size n
and the tmderlying (continuous) distribution function
n~(Q(n,F)
- Q(F))
~
N(O, c,.2(Q)), where
the a-trimmed. mean by S(n,a)
symmetric about
e
the adaptor is Q(F)
1.87).
Q(F)
F.
Typically
is some constant.
lnth limit value S(a).
Denote
Suppose F is
and that one of the ilknotsVl or nchange pointsi1 in
(for example, in HGI these Imots are 1. 81 and
Then asymptotically, the adaptor is
T*(n,F)
= S(n,al)
if Q(n,F) s Q(F)
= S(n,a2)
if Q(n,F)
>
Q(F).
(2)
10
Since Q(n~F)
~
nZ(S(n~a) •
is odd~ ~(Q(nsF) • Q(F))
is even lll"1d S(n,a)
Sea)) are asymptotically independent
9
Sea) =
e
and
and
r
Pr{n~(T*(n~F)-e)~Z} ~ ~(Pr{J:1(S(n~al)·e)::;;z}
~(T*(n 9 F) • e)
n"
Hence~
S(az)
¢
pr{~(S(n,a2)-e)::;;z}).
(3)
does not quite have a limiting normal distribution 9
although the variance will not have
Seal)
+
blo~ln
up.
However, if F is asymmetric,
and if the approximation (3) holds, we see that for no e
1
does n~(T*(n,F) •
e)
have a proper limit distribution (in fact, mass
lidll be placed at +00 or
-00) •
These very heuristic arguments shol'l
that the studentization of hard adaptors is likely to fail whenever one
is ull.forttmate enough to have chosen Q(F)
and Q(Exp (. SOX))
= 1. 8Z
as a knot.
Since Q(NE)
= 1.804
(the latter obtained by Iibnte -Carlo salTlpling
lIJith 250 trials), one sees the probable explanation for the failure
of HGI
Q(Exp(X))
in these cases.
= 1.96;
In Hmte-Carlo sampling \"e obtained
i.~tuitively, in this heavily asynmetric situation the
choices 10% and 25% are so different as to provide a boost·
to the variaTlce.
..":.
However, an adequate expla'lation is still needed.
It would seem then that a source of difficulty is that T*(n,F)
is not continuous in Q(n,F)
for all n.
HG2 provides a..'1 example which
ShOliVS that the continuity merely of Ii,!!. T*(n,F)
n
As a first constraint
then~
continuity property.
Although
= T*(F)
is not sufficient.
any adaptive estimate should have the above
l'1e
experimented with linear ftmctions ·of
order statistics with weight functions depending smoothly on a
. as piecewise linear and quadratic) ~ we had very little success.
(such
The
problem of constructing smoothly adaptive estimates "Jith reasonably
··e
11 '
consistent variance estiTflates in snaIl sa.:IPles
is an onen a.."1d worthwhile
~
problem.
'.
II.
(M-estimators)
The Huber estinate H£JB (generated by ljJ(x)
-1'1 '"': kz
::= 1. g
) has
gOO9.
robust:iless
(1975a) has shown that if F is
non-zero constants
al~
ffi1d
= mai\c(kp
min(x~k2)) ~ .
breaJ:..do\tID properties but Carroll
asyrmetric~
sn
~ ~~
then there exist
a 2 with
Tnis means that the variance estimate (1) (suggested by Huber (1970)
and used by Gross (1976) ) is not consistent for asyr:netric distributions.
111is is clearly reflected in Tables 2 and 3.
by (1)
\'lOu~d
~~e
studentization suggested
hold if a 2 = 0 ~ \-lhicn is equivalent to
(5)
One possibility i'1Ou1d be to choose kl ~ k 2 to force (5) to hold~ at
least asymptotically. Since Huber estL"'Jates have such nice !,roperties ~
some future "lOTk should be devoted to finding
2,
satisfactory solution.
That the j acldmifed I-ruber estirrate failed \<las both surprisii'1g and
disappointing.
Some insight into the cause of this phenomena. is available.
Jaeckel in an. unpublished paper indicates that Sho\<ling sonething like
(6)
is the Fassing step which needs to be verified for his arg.. .mlents to hold;
12
(4) shows that (6) fails.
Further, it is well-Imown that the jackknife
does not provide a consistent estimate of the variance of the sample
median; si.T1ce the scale sn is very close to a sample median, it 1/JOuld
"
seem to be the cause of some instability.
If this latter point is the
root difficulty (a poLTlt which needs to be settled), perhaps using a
scale with somewhat smoother influence function :.is mandated.
One possi-
bility would be to take the average of the smallest half of
{IXl
- median I,
... ,I ~ - median I}.
We have done some partial work
in this direction with little success.
III.
(One-step estimates)
Theone-step M-estimates and their adaptors were included for illus-
trative purposes although they would not be used in obviously asymmetric
situations.
Carroll (1975b) has shown that an analogue to (4) exists
in this case as well, so the failures at NE and Exp(X)
are not unexpected.
One interesting point to note is the very poor perfonnance of 2lA and
25A at the negative exponential; we have no good explanations for this.
Gross (1976), in surveying the disappointing performance (in syri1metric
situations) of jac1dmifed versions of these estimates, recommends replacing
the median by a smooth starting value.
The arguments of the previous
section indicate an additional need for a smooth scale.
13
Conclusions
In this paper we have shown that wide classes of robust estimates
will not perforn adequately in the two-sample location problem in the
"
presence of asymmetry. While tr:i.Jmned means work quite well ~ discrete
adaptive trimmed means have been shown to degenerate whenever one of
the Imots is near the limiting value of the tail length ftm.ctional
Q(n~F);
evidently only smoothly adaptive trimmed means have aTlY hope of success.
M-estlll1atE:S and their one-step versions have been shown analytically
to lead to inconsistent variance estimates if the
~-function
syrttnetric; this suggests the possibility of allowing the
also depend on the data.
is skew-
~-function
to
It would appear that IvI-esti.r.lates and their
jackknifed versions share the same source of
difficulty~
but this point
needs to be clarified.
References
ANDREHS, D. F., BICKEL, P. J., HAMPEL, F. R. ~ HUBER~ P. J. ~ ROGERS, W. H.,
and lUKEY, J. W. (1972). Robust Estimates of Loaation: Survey and
Advances. Princeton University Press.
CARROLL, R. J. (1975a). An asymptotic representation for M-estimators
and linear functions of order statistics. Institute of Statistics
f'ilim.eo Series #1006, University of North. Carolina at Chapel Hill.
CARROLL, R. J.
(197Sb) • On the asymptotic normality of stopping times
based on robust estinmtors. Institute of Statistics ~limeo Series
#1027, University of North CarolL"la at Chapel Hill.
CARROLL, R. J. and WEG!lAN, E. J.
(1975). A Monte-Carlo study of robust
estinators of location. Institute of Statistics ~limeo Series #1040,
University of North Carolina at Chapel Hill.
14
GHOSS, A. M.
(1976) • Confidence interval robustness with long-tailed
symmetric distributions. J. An~p. Statist. Assoc. 71, 409-416.
'.
fDGG, R. V. (1974). Adaptive robust procedures: a partial review and
suggestions for future applications and theory. J. Amep. Statist.
Assoc. 69, 909-923.
(1970). Studentizing robust estimates. NonpaPametric
Techniques in Statistical- Inference, ed. H. L. PurL Cambridge
University Press, 453-463.
HUBER, P. J.
SHOPACK, G. R.
(1974).
Random means.
Ann. Statist. 2, 661-675.