ON PITMAN EMPIRICAL DISTRIBUTION AND STATISTICAL ESTIMATION
PRANAB KUMAR SEN
University of North Carolina at Chapel Hill
The posterior distribution of a parameter (with respect to a uniform weight function), termed the Pitman empirical distribution, provides the Pitman estimator as well as a posterior Pitman closest estimator. A systematic account of various properties of this empirical d.f. is provided, with due emphasis on the related asymptotics.

AMS Classifications: 62C10, 62B99, 62F10.

Keywords and Phrases: Ancillarity; asymptotic Gaussian; equivariance; MLE; Pitman closeness; Pitman estimator; posterior Pitman closest estimator; robustness; shrinkage estimation.

Short Title: Pitman empirical distribution.
1. Introduction. Let $X_1, \ldots, X_n$ be independent and identically distributed random vectors (i.i.d.r.v.), each with density $f(x, \theta)$ with respect to a sigma-finite measure $\mu$, where $\theta \in \Theta \subset \mathbb{R}^d$, for some $d \ge 1$. Then the joint density (i.e., the likelihood) function of $X_1, \ldots, X_n$ is given by

(1.1)  $\ell_n(\theta) = \prod_{i=1}^{n} f(x_i, \theta), \qquad \theta \in \Theta.$
The classical Pitman estimator (PE) of $\theta$ is defined as

(1.2)  $\hat\theta_{P,n} = \int_{\Theta} \theta\, \ell_n(\theta)\, d\theta \Big/ \int_{\Theta} \ell_n(\theta)\, d\theta$

[see Pitman (1939)].
In this context, we define the posterior density of $\theta$ (with respect to the uniform weight function) by

(1.3)  $g_{P,n}(\theta) = \ell_n(\theta) \Big/ \int_{\Theta} \ell_n(y)\, dy, \qquad \theta \in \Theta,$

and the corresponding distribution function (d.f.)

(1.4)  $G_{P,n}(\theta) = \int_{y \le \theta} g_{P,n}(y)\, dy, \qquad \theta \in \Theta,$

is termed the Pitman empirical d.f. (PEDF) of $\theta$. Note that, by (1.2)-(1.4),

(1.5)  $\hat\theta_{P,n} = E_U(\theta \mid X_1, \ldots, X_n) = \int_{\Theta} \theta\, dG_{P,n}(\theta),$

where $U$ stands for the (improper) prior distribution of $\theta$ generated by the uniform weight function on $\Theta$. Thus, $\hat\theta_{P,n}$ is a Bayes estimator, and various properties of $\hat\theta_{P,n}$ for specific models (such as location-scale and exponential families) have been studied in detail in the literature; Lehmann (1983) and Berger (1985) are excellent sources.
In the case of a real parameter $\theta$, if we let

(1.6)  $\tilde\theta_{P,n} = G_{P,n}^{-1}(\tfrac12) = \inf\{a : G_{P,n}(a) \ge \tfrac12\} = \text{median of the PEDF } G_{P,n},$

then $\tilde\theta_{P,n}$ is a posterior Pitman closest (PPC) estimator of $\theta$ [viz., Ghosh and Sen (1991)]; the result holds in the multi-parameter case under additional regularity conditions [viz., Bose (1991)]. $\tilde\theta_{P,n}$ is unique whenever $G_{P,n}$ has a unique median. Further, whereas the PE $\hat\theta_{P,n}$ is adapted to squared error loss,
$\tilde\theta_{P,n}$ is adapted to absolute error loss, and under the Pitman closeness criterion (PCC), $\tilde\theta_{P,n}$ may dominate $\hat\theta_{P,n}$ [viz., Sen and Saleh (1991) and Kubokawa (1991)]. It is quite clear that both $\hat\theta_{P,n}$ and $\tilde\theta_{P,n}$ are functionals of the PEDF $G_{P,n}$, and there is ample room for other functionals of $G_{P,n}$ as competing estimators of $\theta$ (under other loss criteria). For this reason, it is of interest to study the basic properties of the PEDF $G_{P,n}$ itself and to incorporate these in optimal (Pitman-type) estimation of $\theta$. This task constitutes the primary objective of the current study.
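The computations implied by (1.2)-(1.6) are elementary once the likelihood can be evaluated on a grid. The following sketch is not part of the original paper: the Cauchy location model, the grid limits and all function names are illustrative assumptions, and any density $f(x, \theta)$ could be substituted. It simply tabulates the PEDF and reads off the PE (posterior mean) and the PPC estimator (posterior median).

```python
# Minimal numerical sketch (assumed example): the PEDF G_{P,n} of (1.4) on a grid,
# the Pitman estimator (1.2)/(1.5) as its mean, and the PPC estimator (1.6) as its median.
import numpy as np

def pitman_empirical_df(x, log_f, grid):
    """Return (g_{P,n}, G_{P,n}) on a uniform grid of theta values."""
    dtheta = grid[1] - grid[0]
    loglik = np.array([log_f(x, th).sum() for th in grid])   # log l_n(theta), (1.1)
    g = np.exp(loglik - loglik.max())                         # unnormalized posterior (1.3)
    g /= g.sum() * dtheta                                     # uniform (improper) prior
    G = np.cumsum(g) * dtheta                                 # PEDF (1.4)
    return g, G

def log_cauchy(x, theta):                                     # f(x, theta): Cauchy(theta, 1)
    return -np.log(np.pi) - np.log1p((x - theta) ** 2)

rng = np.random.default_rng(0)
x = rng.standard_cauchy(25) + 2.0                             # sample with theta = 2
grid = np.linspace(-10.0, 15.0, 4001)
g, G = pitman_empirical_df(x, log_cauchy, grid)

theta_PE = (grid * g).sum() * (grid[1] - grid[0])             # posterior mean, (1.2)/(1.5)
theta_PPC = grid[np.searchsorted(G, 0.5)]                     # posterior median, (1.6)
print(theta_PE, theta_PPC)
```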
In a (multivariate) location model, we set $f(x; \theta) = f(x - \theta)$, $x \in \mathbb{R}^p$, $\theta \in \mathbb{R}^p$, for some $p \ge 1$, where $f(\cdot)$ does not depend on $\theta$. In this setup, $\hat\theta_{P,n}$ is translation-equivariant and unbiased for $\theta$. For $p = 1$, $\tilde\theta_{P,n}$ is also translation-equivariant, but may not be universally unbiased. Also, within the class of translation-equivariant estimators, with respect to a quadratic loss, $\hat\theta_{P,n}$ is minimax for the location family, and it is admissible under additional regularity conditions. On the other hand, in a general estimation problem, with respect to an absolute error loss, $\tilde\theta_{P,n}$ may dominate $\hat\theta_{P,n}$; $\tilde\theta_{P,n}$ may also be "posterior Pitman closer" than $\hat\theta_{P,n}$. The parameters in a Poisson or gamma distribution are noteworthy examples in this context. Thus, the relative picture depends on the loss function or other criteria adopted, and most of the issues arising in this context can be better resolved by reference to the basic properties of the PEDF $G_{P,n}$. This provides further incentives for our contemplated study, so that the role of a conventional quadratic risk may be properly examined.
Pitman estimators are known to possess affinity to maximum likelihood estimators (MLE), especially in the asymptotic case, and to sufficient statistics whenever they exist. A similar picture holds for $\tilde\theta_{P,n}$ as well. This affinity remains intact for the PEDF $G_{P,n}$, which provides both the estimators $\hat\theta_{P,n}$ and $\tilde\theta_{P,n}$. These results are presented in Sections 2 and 3. The MLE are known to be dominated by their shrinkage or Stein-rule versions in the light of quadratic risk and the Pitman closeness measure [viz., Sen, Kubokawa and Saleh (1989)]. As such, the dominance of Pitman-type estimators by their shrinkage versions is also briefly considered (in the concluding section). Often, the MLE are not so robust (for small departures from the assumed model), and this drawback is shared by the PE $\hat\theta_{P,n}$ as well. For this reason, some functional Pitman estimators are considered in the concluding section (with due emphasis on the robustness aspects). The results in the concluding section are mostly asymptotic in nature.
2. PEDF and the exponential family. Suppose that the parameter $\theta \in \Theta \subset \mathbb{R}^d$, for some $d \ge 1$, and the joint density $\ell_n(\theta)$ (of $X_1, \ldots, X_n$) admits a sufficient statistic $T_n$, so that $\ell_n(\theta)$ can be factorized as

(2.1)  $\ell_n(\theta) = h_n(T_n, \theta)\, \ell_n^*(X_1, \ldots, X_n; T_n), \qquad \theta \in \Theta,$

where $\ell_n^*(\cdot)$ does not depend on $\theta$. As for the sufficient statistic, we (assume a full rank model and) choose a suitable version, such that $T_n$ is itself an estimator of $\theta$ in a meaningful sense (viz., sample mean vs. sum in a normal or exponential model). In this setup, $h_n(T_n, \theta)$ can be taken as the pdf of $T_n$ (with respect to a suitable dominating measure $\nu$), so that

(2.2)  $\int h_n(t, \theta)\, d\nu(t) = 1, \qquad \forall\, \theta \in \Theta.$

From (1.3) and (2.1), we have

(2.3)  $g_{P,n}(\theta) = h_n(T_n, \theta) \Big/ \int_{\Theta} h_n(T_n, y)\, dy, \qquad \theta \in \Theta,$

so that

(2.4)  $G_{P,n}(\theta) = \int_{y \le \theta} h_n(T_n, y)\, dy \Big/ \int_{\Theta} h_n(T_n, y)\, dy, \qquad \theta \in \Theta.$

Thus, whenever a sufficient statistic (estimator) $T_n$ exists, the PEDF $G_{P,n}(\cdot)$ depends solely on $T_n$ through its density $h_n(T_n; \cdot)$. In fact, $G_{P,n}(\cdot)$ is a random d.f. (defined on $\Theta$) and is itself a sufficient statistic (process) whenever a sufficient statistic exists.
Let us consider a general exponential family of densities (in a minimal canonical form), where we set

(2.5)  $f(x, \theta) = a(\theta)\, b(x)\, \exp\{\alpha(\theta)'\, t(x)\}, \qquad \theta \in \Theta,$

where $a(\theta) > 0$, $b(x) \ge 0$, $\alpha(\theta) \in \mathbb{R}^d$ and $t(x) \in \mathbb{R}^d$, for some $d \ge 1$; without any loss of generality, we assume that the elements of $t(x)$ are affinely independent [viz., Barndorff-Nielsen (1978)]. Then, with $T_n = n^{-1}\sum_{i=1}^n t(X_i)$, we have

(2.6)  $\ell_n(\theta) = \Big( [a(\theta)]^n \prod_{i=1}^{n} b(X_i) \Big) \exp\{n\, \alpha(\theta)'\, T_n\},$

(2.7)  $g_{P,n}(\theta) = [a(\theta)]^n \exp\{n\, \alpha(\theta)'\, T_n\} \Big/ \int_{\Theta} [a(y)]^n \exp\{n\, \alpha(y)'\, T_n\}\, dy, \qquad \theta \in \Theta,$

which belongs to the conjugate family (of densities on $\Theta$) of the form

(2.8)  $d_n(\alpha, T_n)\, [a(\theta)]^n \exp\{n\, \alpha(\theta)'\, T_n\}, \qquad \theta \in \Theta,$

where $d_n(\alpha, T_n)$ depends on $T_n$ (given) and $\alpha(\cdot)$, but not on $\theta$. Thus, the PEDF $G_{P,n}$ can be characterized by the conjugate family in (2.8), wherein the sufficient statistic $T_n$ is regarded as fixed while $\theta$ varies over $\Theta$. This conjugate density may often suggest an appropriate loss function (or other criteria) and provide the desired estimators.
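To make the conjugate-family characterization concrete, the following hedged sketch (the Poisson model and all numerical choices are assumptions for illustration, not a worked example from the paper) writes the Poisson density in the form (2.5) and uses the fact that the resulting conjugate form (2.8), under the uniform weight function on $(0, \infty)$, is a gamma density with shape $nT_n + 1$ and rate $n$; the PE and the posterior median then follow at once. This is also one of the Poisson/gamma situations alluded to in Section 1 where the two estimators differ.

```python
# Hedged illustration (assumed Poisson example): f(x, theta) = exp(-theta) theta^x / x!
# fits (2.5) with a(theta) = exp(-theta), b(x) = 1/x!, alpha(theta) = log(theta), t(x) = x.
# The conjugate form (2.8) under the uniform weight function is Gamma(n*T_n + 1, rate n).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
theta_true = 3.0
x = rng.poisson(theta_true, size=40)
n, T_n = x.size, x.mean()                          # sufficient statistic T_n = sample mean

pedf = stats.gamma(a=n * T_n + 1.0, scale=1.0 / n)  # conjugate form of G_{P,n}
theta_PE = pedf.mean()                              # Pitman estimator: equals T_n + 1/n
theta_PPC = pedf.median()                           # posterior Pitman closest estimator
print(theta_PE, T_n + 1.0 / n, theta_PPC)
```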
Suppose that $\{gX,\ g \in \mathcal{G}\}$ is a group of transformations (of the sample space onto itself) which leave the model invariant. The transformation $gX$ induces a transformation (on $\Theta$) $\theta \to \bar g\theta = \bar\theta$, where the $\bar g$ also form a group, denoted by $\bar{\mathcal{G}}$. Further, if $T_n$ is equivariant (with respect to $g \in \mathcal{G}$), then

(2.9)  $T_n(gx_1, \ldots, gx_n) = g^*\, T_n(x_1, \ldots, x_n), \qquad g \in \mathcal{G},$

where the transformations $g^*$ form a group, denoted by $\mathcal{G}^*$. Recall that the Jacobian of the transformation $T_n \to T_n^* = g^* T_n$ does not involve $\theta$, and hence, the inherent ($\mathcal{G}$-)equivariance of the model implies that $h_n(T_n, \theta)$, expressed in its new coordinate system $(T_n^*, \bar\theta)$ as $h_n^*(T_n^*, \bar\theta)$, satisfies the following:

(2.10)  $h_n^*(T_n^*, \bar\theta) = d(T_n, g^*)\, h_n(T_n, \theta),$

for every $T_n^* = g^* T_n$ and $\bar\theta = \bar g\theta$, $g^* \in \mathcal{G}^*$ and $g \in \mathcal{G}$, where the (Jacobian) factor $d(T_n, g^*)$ does not depend on $\theta$. We denote the PEDF for the transformed model by $G_{P,n}^*$, so that, by (2.3), (2.4) and (2.10) (the factor $d(T_n, g^*)$ cancelling from the numerator and the denominator), we obtain that

(2.11)  $G_{P,n}^*(\bar g(\theta + y)) = G_{P,n}(\theta + y), \qquad g \in \mathcal{G},\ (\theta + y) \in \Theta,$

where the transitiveness of $\bar g$ is tacitly assumed to justify (2.10). Thus, whenever a group of transformations $\mathcal{G}$ leaves the model invariant, the PEDF $G_{P,n}$ is also $\mathcal{G}$-equivariant.
The picture simplifies a bit more for the classical location model, where $f(x, \theta) = f(x - \theta)$, $x \in \mathbb{R}^p$, $\theta \in \mathbb{R}^p$, so that $h_n(T_n, \theta) = h_n(T_n - \theta)$ and $h_n(u)$ is independent of $\theta$ (but the form may depend on $n$). If we let

(2.12)  $H_n(u) = \int_{y \ge u} h_n(y)\, dy, \qquad u \in \mathbb{R}^p,$

then we have, by (2.4) and (2.12),

(2.13)  $G_{P,n}(x) = H_n(T_n - x), \qquad x \in \mathbb{R}^p,$

so that, for every $x \in \mathbb{R}^p$,

(2.14)  $G_{P,n}(\theta + x) = G_{P,n}^{(0)}(x)$, say,

where $G_{P,n}^{(0)}(x)$, the PEDF of $T_n$ under $\theta = 0$, is independent of $\theta$. Thus, if we consider the group $\mathcal{G}$ of affine transformations

(2.15)  $X \to X^* = a + BX$, $a$ a real $p$-vector, $B$ nonsingular,

then, on letting $\bar\theta = a + B\theta$ and $T_n^* = a + BT_n$, we have

(2.16)  $G_{P,n}^*(\bar\theta + y^*) = G_{P,n}^{(0)}(y) = G_{P,n}(\theta + y)$, whenever $y^* = By$, $y \in \mathbb{R}^p$.

Note that if $H_n(u)$, $u \in \mathbb{R}^p$, is (diagonally) symmetric about $0$, then $H_n(u) = H_n((-1)u)$, $\forall\, u \in \mathbb{R}^p$, and hence, for every $x$,

(2.17)  $H_n(T_n - x) = H_n((T_n - \theta) - (x - \theta)) = H_n(x - T_n), \qquad \forall\, T_n \in \mathbb{R}^p,$

so that in (2.13) we may as well replace $H_n(T_n - x)$ by $H_n(x - T_n)$. In any case, (2.16) insures that the PEDF $G_{P,n}$ is $\mathcal{G}$-equivariant for the location model. Conclusions about the uniqueness of the median of $G_{P,n}$ (for $p = 1$) and of the multivariate median (for $p \ge 2$) can be drawn as in Ghosh and Sen (1991) and Bose (1991), and the posterior Pitman closest property of $T_n$ can be established thereof.
It may be quite appropriate to consider another example. Consider a multinormal population with null mean vector and dispersion matrix $\Sigma$ (positive definite (p.d.) but arbitrary). Then $T_n$ is the sample covariance matrix, so that (2.1) holds with $h_n(\cdot)$ related to the classical Wishart density with $n$ degrees of freedom (DF), and the PEDF $G_{P,n}$ reduces to the conjugate Wishart distribution. However, $G_{P,n}$ fails to satisfy the requirement of multivariate median unbiasedness (and it is not diagonally symmetric). The group $\mathcal{G}$ of all nonsingular matrices $B$ (i.e., $X \to Y = BX$) leaves the model invariant and $T_n$ is $\mathcal{G}$-equivariant:

(2.18)  $T_n^* = B\, T_n\, B'$ and $\Sigma^* = B\, \Sigma\, B'.$
In this context, the following two loss functions are generally used:

(2.19)  $L_q(\Sigma, T_n) = \mathrm{Tr}(\Sigma^{-1} T_n - I)^2,$

(2.20)  $L_2(\Sigma, T_n) = \mathrm{Tr}(\Sigma^{-1} T_n) - \log|\Sigma^{-1} T_n| - p.$

Recall that $A_n = n\, T_n$ has the Wishart distribution $W_p(\Sigma, n)$. The quadratic risk of an estimator $T_n(a) = a\, A_n$, $a > 0$, is minimized at $a = (n + p + 1)^{-1}$, and the likelihood risk of $T_n(a)$ is minimized at $a = n^{-1}$. Although the minimum risk is constant (for its respective loss), the estimator $T_n(a)$ is not minimax. In the (generalized) Pitman closeness (GPC) sense, within the class

(2.21)  $\mathcal{C}_1 = \{T_n : T_n = a\, A_n,\ a > 0\},$

the closest estimator of $\Sigma$ is

(2.22)  $T_n^* = A_n\, \{p / \mathrm{med}(\chi^2_{np})\}.$

Note that an unbiased estimator (sufficient statistic) of $\Sigma$ is

(2.23)  $T_n = n^{-1} A_n,$

and that $\mathcal{C}_1$ is the class of all equivariant estimators of $\Sigma$ under the group of nonsingular transformations (too big to achieve a best estimator w.r.t. $L_q$ or $L_2$). For this purpose, we may note that there exists a lower triangular $U_n$, such that $A_n = U_n U_n'$. We consider the group of lower triangular transformations, i.e., $A_n \to H A_n H'$ and $\Sigma \to H \Sigma H'$, for lower triangular $H$. The corresponding class of equivariant estimators is

(2.24)  $\mathcal{C}_2 = \{T_n : T_n = U_n\, D\, U_n',\ D\ \text{diagonal and p.d.}\}.$

The minimum risk (w.r.t. $L_2$) estimator in $\mathcal{C}_2$ is

(2.25)  $T_n^0 = U_n\, D^0\, U_n', \qquad D^0 = \mathrm{Diag}(d_1^0, \ldots, d_p^0),\ \ d_j^0 = (n + p + 1 - 2j)^{-1},\ 1 \le j \le p$

[viz., James and Stein (1961)]. However, in the GPC sense (with respect to $L_2(\cdot)$), a best (closest) equivariant estimator in $\mathcal{C}_2$ does not exist (for $p \ge 2$) [viz., Sen, Nayak and Khattree (1991)]. A similar picture holds for $L_q(\cdot)$. These results show that the classical Pitman estimator of $\Sigma$ (based on the conjugate Wishart distribution) does not have the "bestness" property, and there is a need to probe further into the $G_{P,n}$ to characterize alternative estimators which perform better.
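As a small numerical companion to (2.19)-(2.25), the sketch below is an assumption-laden illustration (the values of $n$ and $p$ are arbitrary, and nothing here is taken from the paper's own computations): it evaluates the three scaling constants for estimators of the form $a\,A_n$ and the James-Stein type diagonal weights of the form discussed around (2.25).

```python
# Hedged numerical sketch: scaling constants for estimators a*A_n of Sigma, and the
# James-Stein type diagonal weights for the lower-triangular-equivariant class (2.24).
import numpy as np
from scipy import stats

n, p = 30, 4
a_quad = 1.0 / (n + p + 1)                 # minimizes the quadratic risk (2.19) of a*A_n
a_lik = 1.0 / n                            # minimizes the likelihood (entropy) risk (2.20)
a_gpc = p / stats.chi2(n * p).median()     # GPC-closest multiple, as in (2.22)
print(a_quad, a_lik, a_gpc)

# Diagonal weights of the James-Stein type estimator discussed around (2.25)
d_js = 1.0 / (n + p + 1 - 2 * np.arange(1, p + 1))
print(d_js)
```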
The density in (2.6), involving $(T_n, \theta)$, may not have naturally independent (or uncorrelated) coordinate variables. Suppose that $\theta = (\theta^{(1)}, \theta^{(2)})$ and consider a transformation $\theta \to \tau = (\tau^{(1)}, \tau^{(2)})$, where $\tau^{(2)} = \theta^{(2)}$ and $\tau^{(1)}$ is a function of $\theta^{(1)}$ and $\theta^{(2)}$, such that $\tau^{(1)}$ and $\tau^{(2)}$ (or $\theta^{(2)}$) are L-independent. Then, by Theorem 9.12 of Barndorff-Nielsen (1978), we conclude that $\tau^{(1)}$ and $\theta^{(2)}$, considered as r.v.'s on $\Theta$, are stochastically independent under the conjugate family, so that, under the transformation $\theta \to \tau$, the PEDF $G_{P,n}(\tau)$ can be expressed as the product of $G_{P,n}^{(1)}(\tau^{(1)})$ and $G_{P,n}^{(2)}(\tau^{(2)})$, where the component PEDFs can be obtained from the respective reduced models in (2.8). Example 9.7 (on p. 149) in Barndorff-Nielsen (1978) provides a nice application of this orthogonalization of PEDFs in the context of estimation of a minor of the covariance matrix $\Sigma$. Motivated by this, we may consider this conditional approach in a quasi-independence case as follows.
We consider a partition of $\theta$ and $T_n$ as $(\theta^{(1)}, \theta^{(2)})$ and $(T_n^{(1)}, T_n^{(2)})$, respectively, such that, in (2.1),

(2.26)  $h_n(T_n, \theta) = h_{n1}(T_n^{(1)}, \theta)\, h_{n2}(T_n^{(2)}, \theta^{(2)}), \qquad \theta \in \Theta,$

so that $T_n^{(1)}, T_n^{(2)}$ are independent, the density of $T_n^{(2)}$ depends only on $\theta^{(2)}$, but the density of $T_n^{(1)}$ may depend on both $\theta^{(1)}, \theta^{(2)}$. Suppose further that

(2.27)  $\int h_{n1}(T_n^{(1)}, \theta)\, d\theta^{(1)} = 1$ (w.l.o.g.), $\qquad \forall\, \theta^{(2)} \in \Theta^{(2)}.$

Then, under (2.26) and (2.27), (2.3) reduces to

(2.28)  $g_{P,n}(\theta^{(1)}, \theta^{(2)}) = h_{n1}(T_n^{(1)}, \theta)\, \dfrac{h_{n2}(T_n^{(2)}, \theta^{(2)})}{\int_{\Theta^{(2)}} h_{n2}(T_n^{(2)}, y^{(2)})\, dy^{(2)}} = g_{P,n}^{(1)}(\theta^{(1)} \mid \theta^{(2)})\; g_{P,n}^{(2)}(\theta^{(2)}), \qquad \theta \in \Theta,$

where $g_{P,n}^{(2)}(\theta^{(2)})$ is the marginal Pitman density of $\theta^{(2)}$ and $g_{P,n}^{(1)}(y^{(1)} \mid y^{(2)})$ is the conditional density of $\theta^{(1)}$, given $\theta^{(2)} = y^{(2)}$. The PEDF corresponding to $g_{P,n}^{(1)}$, denoted by $G_{P,n}^{(1)}(\cdot \mid y^{(2)})$, is termed the conditional PEDF of $\theta^{(1)}$ given $\theta^{(2)} = y^{(2)}$.
In an estimation problem, if $\theta^{(1)}$ is the parameter of interest and $\theta^{(2)}$ is a nuisance parameter, (2.28) provides a convenient way of deriving (conditional) Pitman estimators and other related ones. For example, if $G_{P,n}^{(1)}(\cdot \mid \theta^{(2)})$ has location parameter (mean/median) $T_n^{(1)}$ for every $\theta^{(2)}$, then $T_n^{(1)}$ is a convenient (Pitman-type) estimator of $\theta^{(1)}$. As an illustrative example, we consider the case of a multinormal density with mean vector $\theta^{(1)}$ and dispersion matrix $\theta^{(2)}$. Then $G_{P,n}^{(1)}(\theta^{(1)} \mid \theta^{(2)})$ is $N_p(\bar X_n, n^{-1}\theta^{(2)})$, which has the natural location parameter $\bar X_n$ (the PE of $\theta^{(1)}$ when $\theta^{(2)}$ is known), and which can as well be adopted in an unconditional setup. In the same vein, let $\theta^{(1)} = (\theta_1^{(1)}, \theta_2^{(1)})$ and $\theta^{(2)} = ((\theta_{jl}^{(2)}))_{j,l=1,2}$, with $\bar X_n = (\bar X_n^{(1)}, \bar X_n^{(2)})$ partitioned accordingly. Then $g_{P,n}^{(2)}(\theta^{(2)})$ has the conjugate Wishart form, while $g_{P,n}^{(11)}(\theta_1^{(1)} \mid \theta_2^{(1)}, \theta^{(2)})$ is normal with mean vector $\bar X_n^{(1)} + \theta_{12}^{(2)}(\theta_{22}^{(2)})^{-1}(\theta_2^{(1)} - \bar X_n^{(2)})$ and dispersion matrix $n^{-1}(\theta_{11}^{(2)} - \theta_{12}^{(2)}(\theta_{22}^{(2)})^{-1}\theta_{21}^{(2)}) = n^{-1}\theta_{11.2}^{(2)}$. The PE of $\theta_2^{(1)}$ is $\bar X_n^{(2)}$, so that $G_{P,n}^{(11)}(\theta_1^{(1)} \mid \bar X_n^{(2)}, \theta^{(2)})$ provides the usual Pitman estimator $\bar X_n^{(1)}$ of $\theta_1^{(1)}$. Such conclusions may also be derived for some non-regular models. For example, consider the uniform $(\theta_1 - \tfrac12\theta_2,\ \theta_1 + \tfrac12\theta_2)$, $\theta_1$ real, $\theta_2 > 0$, density. The sample extreme values are jointly sufficient, and we may take $T_n^{(1)}$ to be the sample midrange and $T_n^{(2)}$ the sample range. The d.f. of $T_n^{(2)}$ does not depend on $\theta_1$, while $T_n^{(1)}$ has a density depending on both $\theta_1$ and $\theta_2$. However, the conditional density of $T_n^{(1)}$, given $T_n^{(2)}$, is symmetric about $\theta_1$, so that a similar characterization of the conditional Pitman estimator of $\theta_1$ can be made.
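The uniform example can be made concrete with a short sketch (the sample size and parameter values are assumptions, not taken from the paper): for any admissible value of the nuisance scale, the conditional posterior of $\theta_1$ under the uniform weight function is symmetric about the sample midrange, which therefore serves as the conditional Pitman (and PPC) estimator of $\theta_1$.

```python
# Hedged illustration of (2.26)-(2.28) for the uniform(theta1 - theta2/2, theta1 + theta2/2)
# model: with theta2 held fixed (theta2 >= sample range), the conditional posterior of
# theta1 is uniform on [x_max - theta2/2, x_min + theta2/2], so its mean and median both
# equal the sample midrange.  The numbers below are an assumed example only.
import numpy as np

rng = np.random.default_rng(5)
theta1, theta2 = 4.0, 2.0
x = rng.uniform(theta1 - theta2 / 2, theta1 + theta2 / 2, size=30)
x_min, x_max = x.min(), x.max()

for t2 in (theta2, 2.5, 3.0):                    # any nuisance value >= x_max - x_min
    lo, hi = x_max - t2 / 2, x_min + t2 / 2      # support of the conditional PEDF of theta1
    cond_mean = cond_median = 0.5 * (lo + hi)    # both equal the midrange, whatever t2 is
    print(t2, cond_mean, 0.5 * (x_min + x_max))
```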
The results presented here are all of an exact nature. As the sample size $n$ becomes large, under fairly general regularity conditions, $T_n \to \theta$, in probability, and hence, asymptotically, $G_{P,n}(y)$ becomes degenerate at the point $\theta$. This calls for suitable (Pitman) neighborhoods of $\theta$ on which the PEDF behaves properly, and this will be considered in the next section, in a more general setup where sufficient statistics may not exist.
3. PEDF and MLE. Pitman estimators are known to have affinity to MLEs. This affinity extends directly to the PEDF through a local asymptotic normality (LAN) pattern, and this will be considered here. In this context, exponential family, sufficiency and equivariance do not play any basic role.
We assume that the density $f(x, \theta)$, $x \in \mathbb{R}^p$, $\theta \in \Theta \subset \mathbb{R}^d$, for some $p \ge 1$, $d \ge 1$, satisfies the following (essentially, Cramér-type) regularity conditions:

(i) $(\partial/\partial\theta) f(x, \theta)$ and $(\partial^2/\partial\theta\,\partial\theta') f(x, \theta)$ exist almost everywhere and are dominated by some integrable functions (w.r.t. $x$).

(ii) $(\partial/\partial\theta) \log f(x, \theta)$ and $(\partial^2/\partial\theta\,\partial\theta') \log f(x, \theta)$ exist almost everywhere, and are such that

(a) the Fisher information (matrix) $I_\theta$, defined by

(3.1)  $I_\theta = E_\theta\big\{ [(\partial/\partial\theta) \log f(x, \theta)]\, [(\partial/\partial\theta') \log f(x, \theta)] \big\},$

is finite and p.d., and

(b) for every $\delta > 0$, there exists a $\gamma_\delta\ (> 0)$, such that

(3.2)  $E_\theta\Big\{ \sup_{u:\, \|u\| \le \delta} \big\| (\partial^2/\partial\theta\,\partial\theta') \log f(x; \theta + u) - (\partial^2/\partial\theta\,\partial\theta') \log f(x; \theta) \big\| \Big\} = \gamma_\delta$

exists, and $\gamma_\delta \downarrow 0$ as $\delta \downarrow 0$.
(iii) If we define $Z(u) = -\log\{f(x; \theta + u)/f(x, \theta)\}$, $u \in \mathbb{R}^d$, and define the Kullback-Leibler information by

(3.3)  $I(u; \theta) = E_\theta Z(u) = \int_{\mathbb{R}^p} \big\{ -\log[f(x; \theta + u)/f(x, \theta)] \big\}\, f(x, \theta)\, d\mu,$

for $u \in \mathbb{R}^d$, then

(3.4)  $I(u; \theta) > 0, \qquad \forall\, u \ne 0,$

and either $\Theta$ is a compact subset of $\mathbb{R}^d$, or, letting

(3.5)  $Z_n(u) = \int Z(u)\, dF_n = n^{-1}\sum_{i=1}^{n} \big\{ -\log[f(X_i; \theta + u)/f(X_i; \theta)] \big\}$

(and noting that $E_\theta Z_n(u) = I(u; \theta)$), we have:

(a) there exists a positive number $k$, such that

(3.6)  $\int_{u \in \mathbb{R}^d} \exp\{-k\, I(u; \theta)\}\, du < \infty,$

and (b) there exists a compact $\Theta^*$ (containing $\theta$, i.e., $u = 0$, as an inner point), such that, on letting $\bar\Theta^* = \Theta \setminus \Theta^*$,

(3.7)  $\sup_{u:\, \theta + u \in \bar\Theta^*} \big| Z_n(u)/I(u; \theta) - 1 \big| \to 0$, in probability, as $n \to \infty$.

Note that (3.6) insures that, as $n \to \infty$, $\int_{\mathbb{R}^d} \exp\{-n\, I(u; \theta)\}\, du \to 0$ (at an exponential rate), and moreover, by (3.7),

(3.8)  $\int_{u:\, \theta + u \in \bar\Theta^*} \exp\{-n\, Z_n(u)\}\, du \to 0$, in probability, as $n \to \infty$.

Further, if for every $c > 0$ we denote by $\Theta_c^* = \{\theta' \in \Theta^* : \|\theta' - \theta\| > c\}$, then, by (3.4), (3.5) and some standard steps, as $n \to \infty$,

(3.9)  $\int_{\Theta_c^*} \exp\{-n\, Z_n(\theta' - \theta)\}\, d\theta' \to 0$, in probability,
where again the rate of convergence is exponential in $n$. For the time being, let us assume that $\Theta = \mathbb{R}^d$, for some $d \ge 1$, and write $\Theta = \bar\Theta^* \cup \Theta^*$, $\Theta^* = \Theta_{\epsilon_0}^* \cup \{\theta' : \|\theta' - \theta\| < \epsilon_0\}$. If $\Theta$ is compact, we may drop $\bar\Theta^*$ in this decomposition. Let then

(3.10)  $U_n = n^{-1/2}\, (\partial/\partial\theta) \log \ell_n(\theta),$

(3.11)  $V_n = -n^{-1}\, (\partial^2/\partial\theta\,\partial\theta') \log \ell_n(\theta) = n^{-1}\sum_{i=1}^{n} \{-(\partial^2/\partial\theta\,\partial\theta') \log f(X_i, \theta)\}\big|_\theta.$

Then, under the stated regularity conditions, as $n \to \infty$,

(3.12)  $U_n \to N_d(0, I_\theta)$, in distribution,

(3.13)  $V_n \to I_\theta$, in probability.

We also write

(3.14)  $W_n = V_n^{-1} U_n \quad (\to N_d(0, I_\theta^{-1})$, in distribution, as $n \to \infty$).

[Recall that $W_n$ is a form of Studentized score statistic at $\theta$.]
Next, we write, for every $u \in \mathbb{R}^d$,

(3.15)  $g_{P,n}(\theta + u) = \prod_{i=1}^{n} f(x_i, \theta + u) \Big/ \int_{\Theta} \prod_{i=1}^{n} f(x_i, y)\, dy = \exp\{-n\, Z_n(u)\} \Big/ \int_{\Theta} \exp\{-n\, Z_n(y - \theta)\}\, dy.$

First, consider the domain

(3.16)  $D_n = \{u : u = n^{-1/2} t,\ \|t\| < K\},$

where $K\ (< \infty)$ is arbitrarily large (but fixed). Then, under Assumptions (i), (ii)(a) and (ii)(b),

(3.17)  $\sup_{\|t\| \le K} \big| n\, Z_n(n^{-1/2} t) + t' U_n - \tfrac12\, t' V_n t \big| \to 0$, in probability, as $n \to \infty$.

Next, over the domain $\Theta^* \setminus D_n$, we note that, for $\epsilon_0$ small,

(3.18)  $\sup_{u \in \Theta^* \setminus D_n} \big| n\, Z_n(u) + n^{1/2}\, u' U_n - \tfrac12\, n\, u' V_n u + o(n)\|u\|^2 \big| \to 0$, in probability, as $n \to \infty$.

Over the complementary part, we make use of (3.8) and (3.9). Hence, multiplying both the numerator and the denominator of (3.15) by $\exp\{-\tfrac12\, U_n' V_n^{-1} U_n\}$ (and noting that, by (3.14), $U_n' V_n^{-1} U_n = O_p(1)$), we obtain from (3.15) through (3.18) that, for every $\|t\| < K$,

(3.19)  $n^{-d/2} \exp\{-n\, Z_n(n^{-1/2} t)\} \Big/ \int_{D_n} \exp\{-n\, Z_n(u)\}\, du = (2\pi)^{-d/2}\, |V_n|^{1/2} \exp\{-\tfrac12\, (U_n - V_n t)' V_n^{-1} (U_n - V_n t)\} + o_p(1),$

where, in the denominator, the domain $D_n$ may as well be replaced by $\Theta$.
We rewrite the exponent in (3.19) as

(3.20)  $\exp\{-\tfrac12\, (t - W_n)'\, V_n\, (t - W_n)\},$

so that, from (3.19) and (3.20), we conclude that, as $n \to \infty$,

(3.21)  $\big| G_{P,n}(\theta + n^{-1/2} t) - \Phi_d(t;\, W_n,\, V_n^{-1}) \big| \to 0$, in probability,

uniformly in $t\ (\in \mathbb{R}^d)$, where $\Phi_d(t; \mu, \Sigma)$ stands for a $d$-variate normal d.f. with mean vector $\mu$ and dispersion matrix $\Sigma$. Let us now denote the MLE of $\theta$ by $\hat\theta_n$; actually, we may take any BAN estimator of $\theta$ and denote it by $\hat\theta_n$. Then, by virtue of (3.17), we have
(3.22)  $n^{1/2}(\hat\theta_n - \theta) = V_n^{-1} U_n + o_p(1) = W_n + o_p(1).$

Consequently, from (3.21) and (3.22), we conclude that

(3.23)  $G_{P,n}(\hat\theta_n + n^{-1/2} t) = G_{P,n}\big(\theta + n^{-1/2} t + (\hat\theta_n - \theta)\big) = \Phi_d(t;\, 0,\, V_n^{-1}) + o_p(1),$

uniformly in $t$. Thus, asymptotically, in a Pitman (i.e., $O(n^{-1/2})$) neighborhood of the MLE (BAN) $\hat\theta_n$ (and hence, around $\theta$), the PEDF $G_{P,n}$ is Gaussian with mean $\hat\theta_n$ and dispersion matrix $n^{-1} V_n^{-1}$.
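The approximation (3.23) can be checked numerically. The sketch below is an assumption-laden illustration (the exponential-mean model, the sample size and the grid are not from the paper): it tabulates the exact PEDF for $f(x, \theta) = \theta^{-1}\exp(-x/\theta)$ and compares it with $\Phi_1(t; 0, V_n^{-1})$ centered at the MLE, where $V_n$ evaluated at the MLE equals $1/\bar x^2$.

```python
# Hedged numerical check of (3.23) for an assumed exponential-mean model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, theta_true = 200, 2.0
x = rng.exponential(theta_true, size=n)
xbar = x.mean()                                    # MLE (and BAN estimator) of theta

grid = np.linspace(1e-3, 10.0, 20001)
dtheta = grid[1] - grid[0]
loglik = -n * np.log(grid) - n * xbar / grid       # log l_n(theta)
g = np.exp(loglik - loglik.max())
g /= g.sum() * dtheta
G = np.cumsum(g) * dtheta                          # PEDF G_{P,n}

V_n = 1.0 / xbar ** 2                              # -n^{-1} d^2 log l_n / d theta^2 at the MLE
for t in (-1.5, 0.0, 1.5):
    exact = np.interp(xbar + t / np.sqrt(n), grid, G)
    approx = stats.norm.cdf(t * np.sqrt(V_n))      # Phi_d(t; 0, V_n^{-1}) with d = 1
    print(t, exact, approx)
```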
This representation is of prime importance in studying the asymptotic properties of Pitman-type estimators of $\theta$ (which are based on the $G_{P,n}$). In this characterization, because of the diagonal symmetry of $\Phi_d$ (around $0$), we conclude that, asymptotically, $G_{P,n}$ attains diagonal symmetry in a Pitman neighborhood of $\hat\theta_n$ (or $\theta$). Hence, if we use the posterior Pitman closest (PPC) characterization of Ghosh and Sen (1991) [and Bose (1991)], we can claim that the classical Pitman estimator $\hat\theta_{P,n}$ is asymptotically equivalent to the MLE (BAN) $\hat\theta_n$ and is a PPC estimator of $\theta$. This representation also provides a natural justification for using a quadratic loss in an asymptotic setup. At the same time, it raises some other issues, which will be discussed in the next section.
4. Some general remarks. We have observed that (3.23) provides a justification for the adaptability of the PPC criterion (in an asymptotic setup) on $G_{P,n}$, and this entails that the classical Pitman estimator $\hat\theta_{P,n}$ is asymptotically a PPC estimator of $\theta$ [in the sense of Ghosh and Sen (1991)].
The underlying (multivariate) median unbiasedness property of the (posterior) Pitman empirical d.f. (in an asymptotic setup) insures that this asymptotic PPC property holds for a general class of (location) measures of the PEDF $G_{P,n}$, which include the PE $\hat\theta_{P,n}$ as a special member. For example, for the multivariate normal dispersion matrix ($\theta = \Sigma$) estimation problem, treated in detail in Section 2 [(2.18) through (2.25)], the $G_{P,n}$ is not strictly Gaussian (for finite $n$), and (3.23) holds only in an asymptotic setup. In such a case, the influence of the tails of $G_{P,n}$ in a measure of its location may be quite perceptible for Pitman-type estimators, but less so for some alternative ones. As for example, we may consider the alternative estimator in (2.25), or even modify it by taking $D$ as a diagonal matrix with the elements $\{\mathrm{med}(\chi^2_{n+1+p-2j})\}^{-1}$, $1 \le j \le p$. [Incidentally, in this context, we have used the entropy loss function instead of the conventional quadratic one.] Keeping this in mind, we may, for example, consider the marginal PEDFs $G_{P,n}^{(j)}(\theta_j)$, $1 \le j \le d$, denote the respective medians by $\tilde\theta_{P,n}^{(j)}$, $1 \le j \le d$, and let $\tilde\theta_{P,n} = (\tilde\theta_{P,n}^{(1)}, \ldots, \tilde\theta_{P,n}^{(d)})'$. Then $\tilde\theta_{P,n}$ is also a PPC estimator of $\theta$.
The main advantage of prescribing $\tilde\theta_{P,n}$ as an alternative to $\hat\theta_{P,n}$ is that $\hat\theta_{P,n}$ is much more sensitive to the tail-behavior of $G_{P,n}$ than $\tilde\theta_{P,n}$, so that $\tilde\theta_{P,n}$ is likely to be more robust than $\hat\theta_{P,n}$. Of course, it should be kept in mind that the PEDF $G_{P,n}$ is a conditional d.f., given $X_1, \ldots, X_n$, and it is highly influenced by the MLE $\hat\theta_n$ (or any other BAN estimator). Quite often, the MLE (or BAN) $\hat\theta_n$ is attacked on the ground of plausible lack of robustness. Thus, if $\hat\theta_n$ is not so robust, (3.23), in turn, would imply that the Pitman-type estimators (based on the $G_{P,n}$) may share the same drawback to a certain extent. On the other hand, (3.23) ensures that the Gaussian approximation holds well in an $O(n^{-1/2})$-neighborhood of $\hat\theta_n$ (and hence $\theta$), and the 'closeness' of this approximation may not be that fine in the tails to ensure that $\hat\theta_{P,n}$ is so robust.
This may suggest that either $\tilde\theta_{P,n}$ or some other measure (of location of $G_{P,n}$) which is quite insensitive to the tail-behavior of $G_{P,n}$ should be preferable. In the case where $G_{P,n}$ is itself strictly Gaussian (viz., $f(x, \theta) = f(x - \theta)$, $f$ a (multi-)normal density), this point of distinction may not be very pertinent. However, in the negation of the exact Gaussian form for $G_{P,n}$, for finite sample sizes, such a robustness consideration merits attention. Basically, one then prescribes a measure of location of the PEDF $G_{P,n}$ which is less sensitive to the tails. A trimmed mean, Winsorized mean, median or, in general, an L-functional with practically no weight attached to the tails would be a desirable solution.
This leads us to the following robustification of the PE. Let $T_n = T(G_{P,n})$ be a functional of the PEDF $G_{P,n}$, such that $T_n$ is robust and translation-equivariant. Then $T_n = \hat\theta_{P,n}(T)$ is termed a functional Pitman estimator (FPE) of $\theta$. The main point in this construction is that, by virtue of (3.23), $G_{P,n}$ is non-degenerate only in an $O(n^{-1/2})$-neighborhood of $\hat\theta_n$ (and hence, $\theta$), so that the tails of $G_{P,n}$ are all adapted to this shrinking ball around $\hat\theta_n$. As long as $T_n$ admits a first-order representation (where the leading term is a linear functional), we may use (3.23) to claim that $n^{1/2}\|\hat\theta_n - \hat\theta_{P,n}(T)\| \to 0$, in probability, as $n \to \infty$, while, depending on the particular form of $T(\cdot)$, we may be able to achieve more robustness. In addition to this, in (1.1) (and elsewhere), the likelihood $\ell_n(\theta)$ may be modified in a way to induce more robustness, and with that modification one may consider suitable Pitman-type estimators. While this works out well for the location model, there are some technicalities for a general model, and we shall not enter into these problems here.
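As one concrete reading of the FPE construction, the sketch below is illustrative only (the logistic location model, the 10% trimming level and all names are assumptions rather than the paper's prescription): it computes an $\alpha$-trimmed-mean L-functional of a numerically tabulated PEDF and compares it with the PE.

```python
# Hedged sketch of a functional Pitman estimator: a trimmed-mean L-functional of G_{P,n}.
import numpy as np

def trimmed_mean_functional(grid, G, alpha=0.1):
    """T(G) = average of the quantiles G^{-1}(u) over u in [alpha, 1 - alpha]."""
    u = np.linspace(alpha, 1.0 - alpha, 2001)
    quantiles = np.interp(u, G, grid)            # G^{-1}(u) by inverse interpolation
    return quantiles.mean()

def log_logistic(x, theta):                      # logistic location log-density
    z = x - theta
    return -z - 2.0 * np.log1p(np.exp(-z))

rng = np.random.default_rng(3)
x = rng.logistic(loc=1.0, size=50)
grid = np.linspace(-5.0, 7.0, 6001)
dtheta = grid[1] - grid[0]
loglik = np.array([log_logistic(x, th).sum() for th in grid])
g = np.exp(loglik - loglik.max())
g /= g.sum() * dtheta
G = np.cumsum(g) * dtheta                        # PEDF on the grid

theta_FPE = trimmed_mean_functional(grid, G, alpha=0.1)   # robust functional of G_{P,n}
theta_PE = (grid * g).sum() * dtheta                       # classical PE for comparison
print(theta_FPE, theta_PE)
```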
Another important feature of multiparameter estimation is that, in the case of $d \ge 3$ (with respect to a quadratic loss), the MLEs are generally dominated (at least asymptotically) by their shrinkage or Stein-rule versions. For the multinormal location model, this was the Stein phenomenon introduced more than 35 years ago by Stein (1956), and since then a considerable amount of work has been done in this area. In the light of a generalized Pitman closeness measure, a similar dominance result has been established by Sen, Kubokawa and Saleh (1989), where $d$ may be as low as 2. As such, the Gaussian approximation in (3.23) suggests that similar dominance results (under quadratic loss or the Pitman closeness measure) should hold for the Pitman-type estimators (based on $G_{P,n}$). Indeed this is the case (in an asymptotic setup): the PE $\hat\theta_{P,n}$ or, in general, an FPE $\hat\theta_{P,n}(T)$ is dominated by suitable shrinkage or Stein-rule versions. For example, with a pivot $\theta_0\ (= 0$, WLOG), we may set

(4.1)  $\hat\theta_{P,n}^{S} = \{1 - (d - 2)\, \mathcal{L}_n^{-1}\}\, \hat\theta_{P,n},$

where $\mathcal{L}_n$ is the likelihood ratio test statistic for testing $H_0: \theta = 0$ vs. $H_1: \theta \ne 0$, and $d \ge 3$; $\hat\theta_{P,n}^{S}$ is a James-Stein version of $\hat\theta_{P,n}$. More complex versions (including the positive-rule ones) can also be considered in the same vein. Moreover, with respect to the PCC, the shrinkage factor $(d - 2)$ may also be replaced by $b$: $0 < b < (d - 1)(3d + 1)/2d$, $d \ge 2$. As such, the results in Sen (1986) and Sen, Kubokawa and Saleh (1989) can directly be adapted, along with (3.23), to conclude that the Stein-rule versions of the Pitman-type estimators are asymptotically better than the original ones (in either mode).
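A minimal sketch of (4.1) follows, under an assumed $d$-variate normal mean model with identity dispersion (so the PE coincides with the sample mean and the likelihood ratio statistic for $H_0: \theta = 0$ is $n\|\bar X\|^2$); all numerical choices are illustrative, and the positive-rule variant mentioned above is included for completeness.

```python
# Hedged sketch of the Stein-rule version (4.1) of a Pitman-type estimator, for an
# assumed d-variate normal mean with identity dispersion.
import numpy as np

rng = np.random.default_rng(4)
d, n = 5, 60
theta = np.full(d, 0.5)
X = rng.normal(loc=theta, scale=1.0, size=(n, d))

theta_PE = X.mean(axis=0)                        # Pitman estimator (= MLE in this model)
L_n = n * np.dot(theta_PE, theta_PE)             # likelihood ratio statistic for theta = 0
theta_shrunk = (1.0 - (d - 2) / L_n) * theta_PE  # James-Stein version, as in (4.1)
theta_plus = max(0.0, 1.0 - (d - 2) / L_n) * theta_PE   # positive-rule variant
print(theta_PE, theta_shrunk, theta_plus)
```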
REFERENCES

BARNDORFF-NIELSEN, O. (1978). Information and Exponential Families in Statistical Theory. John Wiley, New York.

BERGER, J.O. (1985). Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, New York.

BOSE, S. (1991). Some properties of posterior Pitman closeness. Commun. Statist. Theory Meth. 20, in press.

GHOSH, M. and SEN, P.K. (1989). Median unbiasedness and Pitman closeness. J. Amer. Statist. Assoc. 84, 1089-1091.

GHOSH, M. and SEN, P.K. (1991). Bayesian Pitman closeness. Commun. Statist. Theory Meth. 20, in press.

KUBOKAWA, T. (1991). Equivariant estimation under the Pitman closeness criterion. Commun. Statist. Theory Meth. 20, in press.

LEHMANN, E.L. (1983). Theory of Point Estimation. John Wiley, New York.

PITMAN, E.J.G. (1939). The estimation of location and scale parameters of a continuous population of any given form. Biometrika 30, 391-421.

SEN, P.K. (1986). On the asymptotic distributional risk of shrinkage and preliminary test versions of maximum likelihood estimators. Sankhya, Ser. A 48, 354-371.

SEN, P.K., KUBOKAWA, T. and SALEH, A.K.M.E. (1989). The Stein paradox in the sense of the Pitman measure of closeness. Ann. Statist. 17, 1375-1386.

SEN, P.K. and SALEH, A.K.M.E. (1991). On Pitman closeness of Pitman estimators. Gujarat Statist. Rev., in press.

STEIN, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proc. Third Berkeley Symp. Math. Statist. Probab. 1, 197-206.