L-Moment M-estimates in Linear and Non-Linear Regression*

C.S. Withers
Applied Mathematics Group
Institute for Industrial Research and Development
Box 31-310, Lower Hutt
New Zealand†

July 29, 1994
Abstract

We introduce a new class of estimates for linear and nonlinear regression models that combines the robustness features of both L-moments and M-estimates. We sketch a proof of their asymptotic normality and so derive confidence regions for the regression parameters.
*Research partially supported by ONR Grants N00014-91-J-1087 and N00014-93-1-0043.
†Written while visiting Stanford University and the University of North Carolina.
1991 AMS Classification: Primary 62F20, Secondary 62F35.
Key Words and Phrases: regression, L-moments, M-estimates, confidence regions.
ONR Grant TR No. 12.
1 Introduction and Summary
This note combines the notions of M-estimates and L-moments to obtain a new class of
estimates for regression parameters.
Suppose we observe
$$Y_N = Z_N(\varphi) + \varepsilon_N, \qquad 1 \le N \le n, \eqno(1.1)$$
where $\varphi$ in $R^p$ is an unknown parameter, $Z_N(\varphi)$ is a given regression function with finite derivatives, and the residuals $\{\varepsilon_1, \ldots, \varepsilon_n\}$ are independent and identically distributed (i.i.d.) with some unspecified distribution $F$ on $R$.
We define the LM-estimate (or L-moment M-estimate) as $\hat\varphi$ minimising
$$A_n = A_n(\varphi) = n^{-1} \sum_{N=1}^n \rho(\varepsilon_N(\varphi), F_n(\varepsilon_N(\varphi))), \eqno(1.2)$$
where $\varepsilon_N(\varphi) = Y_N - Z_N(\varphi)$, $F_n$ is the empirical distribution of $\{\varepsilon_N(\varphi)\}$, and $\rho(\varepsilon, y)$ is a given function with sufficiently many finite derivatives.
If $\rho(\varepsilon, y)$ does not depend on $y$, then $\hat\varphi$ is called an M-estimate and is known to be asymptotically normal. See, for example, Withers (1994a).
A more useful way of writing (1.2) is
$$A_n = n^{-1} \sum_{N=1}^n \rho(\varepsilon_{(N)}, N/n), \eqno(1.3)$$
where $\varepsilon_{(1)} \le \cdots \le \varepsilon_{(n)}$ are the ordered values of $\varepsilon_1, \ldots, \varepsilon_n$. The $r$th L-moment of $F$ is defined to be
$$\lambda_r = \int_0^1 F^{-1}(t)\, P_{r-1}^*(t)\, dt, \eqno(1.4)$$
where $P_{r-1}^*$ is the shifted Legendre polynomial of degree $r - 1$ (Hosking, 1990).
Estimates based on L-moments are more robust than those based on moments and may even be more efficient than maximum likelihood estimates: see Hosking (1990) and Hosking, Wallis and Wood (1985). They are a special case of probability weighted moments. For the asymptotic normality and bias reduction of their estimates, see Withers and Pearson (1994).
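To fix ideas, here is a minimal numerical sketch of the estimate defined by (1.3), minimising $A_n$ directly. The sketch is ours rather than the paper's; the linear regression function, the choice $\rho(\varepsilon, y) = \varepsilon^2 (y - y^2)/2$ (anticipating Example 3.3) and the use of scipy are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    def A_n(phi, Y, Z, rho):
        # A_n(phi) of (1.3): average of rho over ordered residuals paired with N/n.
        eps = np.sort(Y - Z(phi))                       # eps_(1) <= ... <= eps_(n)
        ranks = np.arange(1, len(eps) + 1) / len(eps)   # N/n
        return np.mean(rho(eps, ranks))

    rho = lambda e, y: 0.5 * e**2 * (y - y**2)          # L-LSE rho of Example 3.3

    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=200)
    Z = lambda phi: phi[0] + phi[1] * x                 # an illustrative linear Z_N(phi)
    Y = Z(np.array([1.0, 2.0])) + rng.standard_cauchy(200)  # heavy-tailed errors

    fit = minimize(A_n, x0=np.zeros(2), args=(Y, Z, rho), method="Nelder-Mead")
    print(fit.x)   # should land near (1, 2)

Nelder–Mead is used because the sorting step makes $A_n$ only piecewise smooth in $\varphi$.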
For $r, s \ge 0$ write $\rho_{rs}(\varepsilon, y) = \partial^{r+s} \rho(\varepsilon, y)/\partial\varepsilon^r \partial y^s$, and set
$$\rho_{rs} = \rho_{rs}(F) = \int \rho_{rs}(\varepsilon, F(\varepsilon))\, dF(\varepsilon),$$
$$\rho_{rs,rs} = \rho_{rs,rs}(F) = \int \rho_{rs}(\varepsilon, F(\varepsilon))^2\, dF(\varepsilon)$$
and
$$\rho_{10:11} = \iint_{x \le y} \rho_{10}(x, F(x))\, \rho_{11}(y, F(y))\, dF(x)\, dF(y).$$
In Section 2 we prove our main result:
Theorem 1.1 If $\rho_{10} = 0$ then
$$n^{1/2}(\hat\varphi - \varphi) \to_{\cal L} N_p(0, V(\varphi)) \quad\hbox{as } n \to \infty, \eqno(1.5)$$
where
$$V(\varphi) = \rho_{20}^{-2} \left( \rho_{10,10}\, V_z^{-1} + \theta\, V_z^{-1} \bar z\, \bar z'\, V_z^{-1} \right),$$
$$\bar z = n^{-1} \sum_{N=1}^n \dot Z_N(\varphi), \qquad \dot Z_N = \partial Z_N/\partial\varphi,$$
$$V_z = n^{-1} \sum_{N=1}^n \{\partial Z_N(\varphi)/\partial\varphi\} \{\partial Z_N(\varphi)/\partial\varphi\}'$$
and
$$\theta = \rho_{11,11} - \rho_{11}^2 + 2\rho_{10:11}.$$
In fact, using Withers (1994b) one may develop a formal Edgeworth expansion for the distribution of $n^{1/2}(\hat\varphi - \varphi)$.
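The covariance in (1.5) is easy to assemble once the scalar constants are known. A minimal sketch (ours), assuming the constants $\rho_{20}$, $\rho_{10,10}$ and $\theta$ have already been computed, e.g. by the numerical integration sketched after Example 3.1:

    import numpy as np

    def V_phi(Zdot, rho20, rho1010, theta):
        # Zdot: n x p matrix with rows dZ_N(phi)/dphi.
        n = Zdot.shape[0]
        zbar = Zdot.mean(axis=0)                  # zbar of Theorem 1.1
        Vz = Zdot.T @ Zdot / n                    # V_z of Theorem 1.1
        Vz_inv = np.linalg.inv(Vz)
        middle = rho1010 * Vz + theta * np.outer(zbar, zbar)
        return rho20**-2 * Vz_inv @ middle @ Vz_inv   # V(phi) of (1.5)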
Note 1.1 Let $\tilde F$ be the distribution of $\{\tilde\varepsilon_N = Z_N(\varphi) - \tilde z + \varepsilon_N,\ 1 \le N \le n\}$, where $\tilde z = n^{-1} \sum_{N=1}^n Z_N(\varphi)$, so that $Y_N = \tilde z + \tilde\varepsilon_N$ and $Z_N(\varphi) - \tilde z$ has mean 0. So with the centering condition $\rho_{10}(\tilde F) = 0$, the asymptotic variance simplifies to
$$V(\varphi) = \rho_{10,10}\, \rho_{20}^{-2}\, V_z^{-1}. \eqno(1.6)$$
This formula is the same as for M-estimates! $\square$
Note 1.2 The centering condition $\rho_{10} = 0$ can be viewed as determining the location parameter. Suppose $\varphi = (\alpha, \beta')'$ with $\alpha$ scalar and $Z_N(\varphi) = \alpha + x_N(\beta)$. Let $F^*$ be the distribution of $\{e_N = \varepsilon_N + \alpha\}$. Then $Y_N = x_N(\beta) + e_N$, so
$$0 = \rho_{10} = \int (e - \alpha)\, a(F^*(e))\, dF^*(e), \quad\hbox{giving}\quad \alpha = \int e\, a(F^*(e))\, dF^*(e) \Big/ \int_0^1 a(t)\, dt$$
for $F$ continuous. In this way the dimensionality of the problem is decreased by 1, since generally $\alpha$ is only a nuisance parameter. $\square$
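For instance, the location $\alpha$ determined by $\rho_{10} = 0$ can be evaluated numerically. A sketch (ours) under the illustrative assumptions that $F^*$ is standard normal and $a(t) = t - t^2$:

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    a = lambda t: t - t**2
    num, _ = quad(lambda e: e * a(norm.cdf(e)) * norm.pdf(e), -np.inf, np.inf)
    den, _ = quad(a, 0.0, 1.0)
    alpha = num / den   # = 0 here, since both F* and a are symmetric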
Note 1.3 $V(\varphi)$ is determined by $\varphi$ and $F$, say $V(\varphi) = V(\varphi, F)$. Let $\hat F$ be the empirical distribution of $\{\varepsilon_N(\hat\varphi)\}$. Then for smooth $\rho(\cdot, \cdot)$, under weak conditions $\hat V = V(\hat\varphi, \hat F)$ is a consistent estimate of $V(\varphi)$, so that an asymptotically $1 - \alpha$ level confidence region for a smooth function $g(\varphi)$ in $R^q$ with $q \le p$ is
$$n\, (g(\varphi) - g(\hat\varphi))'\, \hat G^{-1}\, (g(\varphi) - g(\hat\varphi)) < \chi^2_{q, 1-\alpha},$$
where $\hat G = T(\hat\varphi)\, \hat V\, T(\hat\varphi)'$ and $T(\varphi) = \partial g(\varphi)/\partial\varphi'$. $\square$
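A minimal sketch (ours) of this region as a membership test, assuming $\hat\varphi$ and $\hat V$ have already been computed and that $g$ returns a vector in $R^q$:

    import numpy as np
    from scipy.stats import chi2

    def in_region(phi0, phi_hat, V_hat, g, T, n, level=0.95):
        # True iff g(phi0) lies inside the asymptotic level-'level' region of Note 1.3.
        d = np.atleast_1d(g(phi0) - g(phi_hat))
        G = T(phi_hat) @ V_hat @ T(phi_hat).T     # G = T(phi_hat) V_hat T(phi_hat)'
        return n * d @ np.linalg.solve(G, d) < chi2.ppf(level, df=len(d))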
Examples are given in Section 3. In particular L-estimates are shown to be a special
case.
2 Proof
Here we sketch the proof of Theorem 1.1.
Write the ordered values of $\{\varepsilon_N(\varphi)\}$ as $\{\varepsilon_{(N)} = \varepsilon_{(N)}(\varphi) = Y_{(N)} - Z_{(N)}(\varphi)\}$. Let a subscript $.ij\cdots$ denote $\partial_i \partial_j \cdots$, where $\partial_i = \partial/\partial\varphi_i$. We generally suppress the argument $\varphi$. Assume that $\hat\varphi$ minimising $A_n$ of (1.2) satisfies, for $1 \le i \le p$,
$$0 = \hat A_{n.i} = \partial A_n(\hat\varphi)/\partial\hat\varphi_i = n^{-1} \sum_{N=1}^n d\rho(\varepsilon_{(N)}(\hat\varphi), N/n)/d\hat\varphi_i$$
$$= -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_{(N)}(\hat\varphi), N/n)\, Z_{(N).i}(\hat\varphi)$$
$$= -n^{-1} \sum_{N=1}^n \{\rho_{10}(\varepsilon_{(N)}, N/n) - \rho_{20}(\varepsilon_{(N)}, N/n)\, Z_{(N).j}\, \delta_j + \cdots\} \{Z_{(N).i} + Z_{(N).ik}\, \delta_k + \cdots\},$$
expanding about $\varphi$, where $\delta = \hat\varphi - \varphi$ and summation of repeated pairs of suffixes $j, k, \ldots$ over $1, \ldots, p$ is implicit. For example, $Z_{(N).j}\, \delta_j = \sum_{j=1}^p Z_{(N).j}\, \delta_j$. Assume that $\delta$ is $O_p(n^{-1/2})$ as $n \to \infty$, so
$$0 = -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_{(N)}, N/n)\, Z_{(N).i} + O_p(n^{-1/2}) = -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_N, F_n(\varepsilon_N))\, Z_{N.i} + O_p(n^{-1/2}).$$
Since $\bar z \not\to 0$ as $n \to \infty$ in general, taking means gives the centering condition
$$\rho_{10} = 0. \eqno(2.1)$$
Conversely this condition implies that $\delta = O_p(n^{-1/2})$. Now
$$0 = A_{ni} + (B_{nij} + C_{nij})\, \delta_j + O_p(n^{-1}), \eqno(2.2)$$
where
$$A_{ni} = -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_{(N)}, N/n)\, Z_{(N).i},$$
$$B_{nij} = n^{-1} \sum_{N=1}^n \rho_{20}(\varepsilon_{(N)}, N/n)\, Z_{(N).i} Z_{(N).j} = n^{-1} \sum_{N=1}^n \rho_{20}(\varepsilon_N, F_n(\varepsilon_N))\, Z_{N.i} Z_{N.j} + O_p(n^{-1/2}),$$
so that $B_{nij} = B^*_{nij} + O_p(n^{-1/2})$, where
$$B^*_{nij} = n^{-1} \sum_{N=1}^n \rho_{20}(\varepsilon_N, F(\varepsilon_N))\, Z_{N.i} Z_{N.j}, \qquad E B^*_{nij} = \rho_{20}\, V_{z.ij}, \qquad V_{z.ij} = n^{-1} \sum_{N=1}^n Z_{N.i} Z_{N.j},$$
and
$$C_{nij} = -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_{(N)}, N/n)\, Z_{(N).ij} = -n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_N, F(\varepsilon_N))\, Z_{N.ij} + O_p(n^{-1/2}) = O_p(n^{-1/2})$$
by (2.1).
So by (2.2),
$$0 = A_{ni} + \rho_{20}\, V_{z.ij}\, \delta_j + O_p(n^{-1}),$$
giving
$$\delta_j = -\rho_{20}^{-1}\, (V_z^{-1})_{ji}\, A_{ni} + O_p(n^{-1}). \eqno(2.3)$$
Now
$$-A_{ni} = n^{-1} \sum_{N=1}^n \rho_{10}(\varepsilon_N, F_n(\varepsilon_N))\, Z_{N.i} = \bar z_i\, T_i(F_n, F_{ni}), \eqno(2.4)$$
where
$$T_i(G, H) = \int \rho_{10}(x, G(x))\, dH(x), \qquad F_{ni}(x) = (n \bar z_i)^{-1} \sum_{N=1}^n Z_{N.i}\, I(\varepsilon_N \le x).$$
(The assumption $\bar z_i \ne 0$ can be removed by a limiting argument later.) By Section 5 of Withers (1994b), $\{T_i(F_n, F_{ni}),\ 1 \le i \le p\}$ is asymptotically $N_p(\mu, v n^{-1})$, where $\mu_i = T_i(F, F_i) = 0$ by (2.1) and $F_i(x) = E F_{ni}(x) = F(x)$. The covariance $v_{ij}$ is a sum of contributions from the two arguments of $T_i$. Writing $F(\varepsilon)_x = I(x \le \varepsilon) - F(\varepsilon)$, the first derivative of $S_i(F) = T_i(F, F_i)$ is $\rho_{10}(x, F(x)) - T_i(F, F_i) = \rho_{10}(x, F(x))$; the contributions from the first argument evaluate to $\rho_{11,11} - \rho_{11}^2$, the cross term
$$\iint \rho_{11}(\varepsilon, F(\varepsilon))\, F(\varepsilon)_x\, \rho_{10}(x, F(x))\, dF(\varepsilon)\, dF(x) = \rho_{10:11}$$
is counted twice, and the weighted empirical $F_{ni}$ contributes $\rho_{10,10}\, V_{z.ij}/(\bar z_i \bar z_j)$. So
$$v_{ij} = \theta + \rho_{10,10}\, V_{z.ij}/(\bar z_i \bar z_j).$$
So by (2.3) and (2.4), $n^{1/2}(\hat\varphi - \varphi) \to_{\cal L} N_p(0, V(\varphi))$, where
$$V_{\alpha\beta}(\varphi) = \rho_{20}^{-2}\, (V_z^{-1})_{\alpha i}\, (\theta\, \bar z_i \bar z_j + \rho_{10,10}\, V_{z.ij})\, (V_z^{-1})_{j\beta}.$$
So Theorem 1.1 holds.
3 Examples
Example 3.1. Suppose $\rho(\varepsilon, F) = \rho(\varepsilon)\, a(F)$. Then integrating w.r.t. $F$ and denoting a first derivative by a subscript 1,
$$0 = \rho_{10} = \int \rho_1\, a(F)\, dF, \qquad \rho_{10,10} = \int \rho_1^2\, a(F)^2\, dF,$$
$$\rho_{11} = \int \rho_1\, a_1(F)\, dF, \qquad \rho_{11,11} = \int \rho_1^2\, a_1(F)^2\, dF,$$
and
$$\rho_{10:11} = \iint_{x \le y} \rho_1(x)\, a(F(x))\, \rho_1(y)\, a_1(F(y))\, dF(x)\, dF(y). \qquad \square$$
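Since these constants are one-dimensional (and one two-dimensional) integrals, they are straightforward to evaluate numerically. A sketch (ours) under the illustrative assumptions $\rho(\varepsilon) = \varepsilon^2/2$ (so $\rho_1(\varepsilon) = \varepsilon$), $a(t) = t - t^2$ and $F$ standard normal:

    import numpy as np
    from scipy.integrate import quad, dblquad
    from scipy.stats import norm

    rho1 = lambda e: e              # rho(e) = e^2/2
    a    = lambda t: t - t**2
    a1   = lambda t: 1.0 - 2.0*t    # a_1 = da/dt
    F, f = norm.cdf, norm.pdf

    rho10,   _ = quad(lambda e: rho1(e) * a(F(e)) * f(e), -np.inf, np.inf)   # = 0
    rho1010, _ = quad(lambda e: rho1(e)**2 * a(F(e))**2 * f(e), -np.inf, np.inf)
    rho11,   _ = quad(lambda e: rho1(e) * a1(F(e)) * f(e), -np.inf, np.inf)
    rho1111, _ = quad(lambda e: rho1(e)**2 * a1(F(e))**2 * f(e), -np.inf, np.inf)
    # rho_{10:11} integrates over {x <= y}; tails beyond |10| are negligible here.
    rho10_11, _ = dblquad(
        lambda y, x: rho1(x)*a(F(x)) * rho1(y)*a1(F(y)) * f(x)*f(y),
        -10.0, 10.0, lambda x: x, lambda x: 10.0)
    theta = rho1111 - rho11**2 + 2.0*rho10_11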
Example 3.2. For an M-estimate, $\rho(\varepsilon, F) = \rho(\varepsilon)$, so $\rho_{10} = \int \rho_1\, dF$, $\rho_{10,10} = \int \rho_1^2\, dF$, and $\rho_{11} = \rho_{11,11} = \rho_{10:11} = 0$, giving $\theta = 0$. $\square$
Example 3.3. For the "L-LSE estimate", $\rho(\varepsilon, F) = \varepsilon^2 a(F)/2$, so
$$\rho_{10} = \int \varepsilon\, a(F(\varepsilon))\, dF(\varepsilon), \qquad \rho_{10,10} = \int \varepsilon^2\, a(F(\varepsilon))^2\, dF(\varepsilon),$$
$$\rho_{11} = \int \varepsilon\, a_1(F(\varepsilon))\, dF(\varepsilon), \qquad \rho_{11,11} = \int \varepsilon^2\, a_1(F(\varepsilon))^2\, dF(\varepsilon).$$
(a) For $a(F) = F - F^2$, $\rho_{11,11} = \int \varepsilon^2 (1 - 2F(\varepsilon))^2\, dF(\varepsilon)$, so essentially as for the LSE one needs $\int \varepsilon^2\, dF(\varepsilon) < \infty$.
(b) However, when the $\theta$ term drops out (as in Note 1.1) the asymptotic variance involves only
$$\rho_{10,10} = \int \varepsilon^2 F(\varepsilon)^2 (1 - F(\varepsilon))^2\, dF(\varepsilon),$$
so the requirement of finite variance weakens to $\int \varepsilon^2 F(\varepsilon)^2 (1 - F(\varepsilon))^2\, dF(\varepsilon) < \infty$, which is a far weaker requirement. For example, if
$$F(x) \approx c\, (-x)^{-\alpha}\ \hbox{as}\ x \to -\infty \quad\hbox{and}\quad 1 - F(x) \approx d\, x^{-\beta}\ \hbox{as}\ x \to \infty, \eqno(3.1)$$
then it is sufficient that $\min(\alpha, \beta) > 2/3$. So this covers distributions like the Cauchy, for which $\alpha = \beta = 1$ and the mean does not exist.
(c) Similarly, if $a(F) = (F - F^2)^\gamma$ where $\gamma > 1$, then $\rho_{11,11} < \infty$ if (3.1) holds and $\min(\alpha, \beta) > 2/(2\gamma - 1)$. So by choosing $\gamma$ large enough one can deal with distributions with power tails decreasing arbitrarily slowly. $\square$
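A quick numerical check of (b), assuming scipy: for the standard Cauchy ($\alpha = \beta = 1 > 2/3$) the integral $\rho_{10,10}$ is finite even though the variance is not.

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import cauchy

    F, f = cauchy.cdf, cauchy.pdf
    val, err = quad(lambda e: e**2 * F(e)**2 * (1 - F(e))**2 * f(e),
                    -np.inf, np.inf)
    print(val)   # finite, while the variance integral of e^2 f(e) diverges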
Example 3.3.1. Suppose in Example 3.3 that $p = 1$ and $Z_N(\varphi) = \varphi$. Then
$$\hat\varphi = \sum_{N=1}^n Y_{(N)}\, a(N/n) \Big/ \sum_{N=1}^n a(N/n) = \int_0^1 F_{nY}^{-1}(t)\, a(t)\, dt \Big/ \left\{ \int_0^1 a(t)\, dt + O(n^{-1}) \right\}$$
by the Euler–Maclaurin expansion, where $F_{nY}$ is the empirical distribution of $\{Y_1, \ldots, Y_n\}$. So $\hat\varphi$ is an L-estimate. So by p. 276 of Serfling (1980), $n^{1/2}(\hat\varphi - \varphi) \to_{\cal L} N(0, v)$, where
$$\varphi = \int_0^1 F_Y^{-1}(t)\, a(t)\, dt \Big/ \int_0^1 a(t)\, dt \quad\hbox{since}\ \rho_{10} = 0, \qquad F_Y(y) = P(Y_N \le y) = F(y - \varphi),$$
and
$$v = \iint a(F_Y(x))\, a(F_Y(y))\, \{F_Y(\min(x, y)) - F_Y(x) F_Y(y)\}\, dx\, dy \Big/ \left\{ \int_0^1 a(t)\, dt \right\}^2.$$
This gives an alternative formula for the asymptotic variance. $\square$
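A minimal sketch (ours) of this special case; the true location, the choice $a(t) = t - t^2$ and the sample size are illustrative assumptions.

    import numpy as np

    def l_estimate(Y, a):
        # phi_hat = sum_N Y_(N) a(N/n) / sum_N a(N/n), as in Example 3.3.1.
        n = len(Y)
        w = a(np.arange(1, n + 1) / n)
        return np.sort(Y) @ w / w.sum()

    rng = np.random.default_rng(2)
    Y = 1.0 + rng.standard_cauchy(5000)             # true location 1
    print(l_estimate(Y, lambda t: t - t**2))        # close to 1 despite no mean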
References

[1] Hosking, J.R.M. (1990) L-moments: analysis and estimation of distributions using linear combinations of order statistics. J. R. Statist. Soc. B, 52, 105-124.
[2] Hosking, J.R.M., Wallis, J.R. and Wood, E.F. (1985) Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 27, 251-261.
[3] Serfling, R.J. (1980) Approximation Theorems of Mathematical Statistics. Wiley, New York.
[4] Withers, C.S. (1994a) The bias and skewness of M-estimates in regression. Submitted.
[5] Withers, C.S. (1994b) Edgeworth expansions for functions of weighted empirical distributions. Submitted.
[6] Withers, C.S. and Pearson, C.P. (1994) Bias-reduced estimates of probability weighted moments with application to L-skewness and L-kurtosis. Submitted.