No. 2000
ASYMPTOTIC NORMALITY FOR DECONVOLVING
KERNEL DENSITY ESTIMATORS
Jianqing Fan
Department of Statistics
University of North Carolina at Chapel Hill
Chapel Hill, NC 27514
ABSTRACT
Suppose that we have $n$ observations from the convolution model $Y = X + \varepsilon$, where $X$ and $\varepsilon$ are independent unobservable random variables, and $\varepsilon$ is measurement error with a known distribution. We will discuss the asymptotic normality of deconvolving kernel density estimators of the unknown density $f_X$ of $X$, assuming either that the tail of the characteristic function of $\varepsilon$ behaves as $|t|^{\beta_0}\exp(-|t|^{\beta}/\gamma)$ as $|t| \to \infty$ (which is called supersmooth error), or that the tail of the characteristic function is of order $O(t^{-\beta})$ (called ordinary smooth error). Asymptotic normality of estimating the functional $T(f) = \sum_j a_j f^{(j)}(x_0)$ is also addressed.
KEY WORDS: Asymptotic normality, deconvolution, nonparametric kernel density estimate, Fourier
transformation, smoothness of error.
Abbreviated title: Asymptotic normality for deconvolution
AMS(1980) subject classification: Primary 62G05; secondary 60F05.
1. INTRODUCTION
Suppose that we have $n$ observations $Y_1, \ldots, Y_n$ i.i.d. from the convolution model
$$Y = X + \varepsilon; \qquad (1.1)$$
we want to estimate the unknown density $f_X(x)$ of $X$ nonparametrically, where $\varepsilon$ is the measurement error with a known distribution.
Such a model with contaminated observations exists in many different fields and has been widely studied. The model arises from microfluorimetry, electrophoresis, biostatistics, and some other fields, where the measurements cannot be observed directly. For example, in an AIDS study, $Y$ may be the time from some starting point to the time that symptoms appear, $\varepsilon$ might be the time from the starting point to the time that infection occurs, and $X$ is the incubation period (the time from the occurrence of infection to the time of symptoms). Obviously, there are many examples of this kind in biological studies. Additional practical problems of deconvolution can be found in Medgyessy (1977).
Applying the model in a Bayesian setting, the deconvolution problem is precisely the same as the empirical Bayes estimation of a prior (Berger (1986)). Another possible field of application is the generalized linear measurement-error model (Anderson (1984), Bickel and Ritov (1987), Schafer (1987), Stefanski and Carroll (1987)), where, in the structural model, deconvolution provides a nonparametric way of estimating the unknown density of the predictor; hence the conditional first two moments in Schafer's EM algorithm and the unknown density in the efficiency score defined by Stefanski and Carroll (1987) can be estimated nonparametrically. Another interesting application is using deconvolution techniques to estimate a monotone density (Donoho and Liu (1987, 1988), Grenander (1956), Groeneboom (1985), etc.), where the nonparametric maximum likelihood method and kernel density estimators are typically used. Note that a random variable $Y$ has a monotone density iff $Y \stackrel{d}{=} X\varepsilon$, where $X$ and $\varepsilon$ are independent and $\varepsilon$ is distributed uniformly on $[0, 1]$ (hence $-\log\varepsilon$ has an exponential distribution, so that the density of $X$ can be explicitly deconvolved). The approach appears to be new in the literature.
Related works on the deconvolution problem include Carroll and Hall (1988), Fan (1988a, b), Kuelbs (1976), Liu and Taylor (1987a, b), Mendelsohn and Rice (1982), Nadaraja (1965), Schwartz (1969), Stefanski and Carroll (1987), Stefanski (1988), Wise et al. (1977), and Zhang (1989). Some optimal rate (local and global) results and uniform consistency results can be found in the related works cited above. The procedure for estimating the unknown density is the kernel density estimator, which achieves the optimal rate of convergence for an appropriate kernel and bandwidth. As in ordinary kernel density estimation, it appears necessary to find the asymptotic distributions of kernel density estimators so that we can construct confidence intervals and carry out other statistical inference on the unknown $f_X$.
The kernel density estimator of the unknown density of $X$ is defined by using Fourier inversion, i.e.,
$$\hat f_n(x) = \frac{1}{2\pi}\int \exp(-itx)\,\phi_K(t h_n)\,\frac{\hat\phi_n(t)}{\phi_\varepsilon(t)}\,dt, \qquad (1.2)$$
where $\phi_K$ is a known function with $\phi_K(0) = 1$, which can be thought of as the characteristic function of a kernel $K(\cdot)$, and $\hat\phi_n(t)$ is the empirical characteristic function of the observations, defined by
$$\hat\phi_n(t) = \frac{1}{n}\sum_{j=1}^{n}\exp(itY_j). \qquad (1.3)$$
To see why the estimator is a kernel density estimator, note that $\hat f_n(x)$ can be rewritten as
$$\hat f_n(x) = \frac{1}{n}\sum_{j=1}^{n}\frac{1}{h_n}\,g_n\!\left(\frac{x - Y_j}{h_n}\right), \qquad (1.4)$$
where
$$g_n(y) = \frac{1}{2\pi}\int \exp(-ity)\,\frac{\phi_K(t)}{\phi_\varepsilon(t/h_n)}\,dt. \qquad (1.5)$$
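To make the construction concrete, the following sketch evaluates (1.2)-(1.5) by direct numerical integration, assuming a normal measurement-error distribution and taking $\phi_K(t) = (1-t^2)^3$ on $[-1, 1]$ as an illustrative kernel characteristic function; the function name, the error model, the bandwidth, and the grid sizes are assumptions of this sketch, not choices made in the paper.

```python
import numpy as np

def deconv_kde(x_grid, Y, h, sigma_eps, n_t=2001):
    """Deconvolving kernel density estimate via Fourier inversion, cf. eq. (1.2).

    Assumes the error is N(0, sigma_eps^2); phi_K(t) = (1 - t^2)^3 on [-1, 1]
    is an illustrative kernel characteristic function (phi_K(0) = 1).
    """
    t = np.linspace(-1.0, 1.0, n_t)                     # phi_K vanishes outside [-1, 1]
    dt = t[1] - t[0]
    phi_K = (1.0 - t**2) ** 3
    phi_eps = np.exp(-0.5 * (sigma_eps * t / h) ** 2)   # error char. fn evaluated at t / h
    # empirical characteristic function of the data, eq. (1.3), evaluated at t / h
    ecf = np.exp(1j * np.outer(t / h, Y)).mean(axis=1)
    # hat f_n(x) = (1 / (2 pi h)) * int e^{-itx/h} phi_K(t) ecf(t/h) / phi_eps(t/h) dt
    integrand = phi_K * ecf / phi_eps
    fx = np.exp(-1j * np.outer(np.asarray(x_grid) / h, t)) @ integrand * dt / (2.0 * np.pi * h)
    return fx.real

# usage: X ~ Exp(1) observed with N(0, 0.25^2) measurement error
rng = np.random.default_rng(0)
X = rng.exponential(1.0, 500)
Y = X + rng.normal(0.0, 0.25, size=500)
grid = np.linspace(-1.0, 4.0, 200)
f_hat = deconv_kde(grid, Y, h=0.4, sigma_eps=0.25)
```

The change of variable $t \mapsto t/h_n$ is what turns (1.2) into the kernel form (1.4)-(1.5); the bandwidth $h = 0.4$ above is purely illustrative, and Section 2 describes how $h_n$ must behave for the asymptotics.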
More generally, we want to find the limit distribution when estimating the $l$-th derivative, $f_X^{(l)}(x)$, of the unknown density $f_X(x)$ under certain conditions. A natural estimator is the $l$-th derivative of (1.4), defined by
$$\hat f_n^{(l)}(x) = \frac{1}{n}\sum_{j=1}^{n}\frac{1}{h_n^{1+l}}\,g_n^{(l)}\!\left(\frac{x - Y_j}{h_n}\right), \qquad (1.6)$$
where
$$g_n^{(l)}(y) = \frac{1}{2\pi}\int \exp(-ity)\,(-it)^{l}\,\frac{\phi_K(t)}{\phi_\varepsilon(t/h_n)}\,dt. \qquad (1.7)$$
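Under the same illustrative assumptions as in the sketch above (normal error, $\phi_K(t) = (1-t^2)^3$), the derivative estimator (1.6)-(1.7) only changes the integrand by the factor $(-it)^l$; a minimal sketch:

```python
import numpy as np

def g_n_l(y, l, h, sigma_eps, n_t=2001):
    """Deconvolution kernel derivative g_n^{(l)}(y) of eq. (1.7); N(0, sigma_eps^2) error assumed."""
    t = np.linspace(-1.0, 1.0, n_t)
    dt = t[1] - t[0]
    phi_K = (1.0 - t**2) ** 3                            # illustrative kernel char. fn
    phi_eps = np.exp(-0.5 * (sigma_eps * t / h) ** 2)
    integrand = (-1j * t) ** l * phi_K / phi_eps         # the extra (-it)^l factor of (1.7)
    vals = np.exp(-1j * np.outer(np.atleast_1d(y), t)) @ integrand * dt / (2.0 * np.pi)
    return vals.real

def fhat_deriv(x, Y, l, h, sigma_eps):
    """hat f_n^{(l)}(x) = (1 / (n h^{1+l})) sum_j g_n^{(l)}((x - Y_j) / h), eq. (1.6)."""
    return g_n_l((x - Y) / h, l, h, sigma_eps).mean() / h ** (1 + l)
```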
Similar to the results given by Carroll and Hall (1988) and Fan (1988a, b), the limiting behavior of the estimator (1.6) will depend heavily on the tail of $\phi_\varepsilon$. We will separate the problem into two cases, ordinary smooth and supersmooth. Roughly speaking, we will assume that the tail of $\phi_\varepsilon$ behaves as either
i) $\phi_\varepsilon(t) = O(t^{-\beta})$ as $t \to +\infty$ for some $\beta \ge 0$ (ordinary smooth case), or
ii) $|\phi_\varepsilon(t)| = O\!\left(t^{\beta_0}\exp(-|t|^{\beta}/\gamma)\right)$ as $t \to +\infty$ for some $\beta > 0$, $\gamma > 0$ and real number $\beta_0$ (supersmooth case).
The typical examples of supersmooth error distributions are normal, Cauchy, and their mixtures. Examples of ordinary smooth error distributions include gamma, symmetric gamma, double exponential, and their mixtures.
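For reference, the characteristic functions behind these examples are standard facts; in the notation of i) and ii) above,
$$\begin{aligned}
&\text{Normal}(0,\sigma^2): && \phi_\varepsilon(t)=e^{-\sigma^2 t^2/2} && (\text{supersmooth: } \beta=2,\ \gamma=2/\sigma^2,\ \beta_0=0),\\
&\text{Cauchy}: && \phi_\varepsilon(t)=e^{-|t|} && (\text{supersmooth: } \beta=1,\ \gamma=1,\ \beta_0=0),\\
&\text{double exponential}: && \phi_\varepsilon(t)=(1+t^2)^{-1} && (\text{ordinary smooth: } \beta=2),\\
&\text{Gamma}(\alpha,\lambda): && |\phi_\varepsilon(t)|=(1+t^2/\lambda^2)^{-\alpha/2} && (\text{ordinary smooth: } \beta=\alpha).
\end{aligned}$$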
2. ASYMPTOTIC NORMALITY
To discuss asymptotic normality of the estimator (1.6), note that (1.6) is the sum of an i.i.d. sequence. Thus, we need only check that the usual triangular-array CLT conditions hold. A sufficient condition for asymptotic normality,
$$\frac{\hat f_n^{(l)}(x) - E\hat f_n^{(l)}(x)}{\sqrt{\operatorname{var}\!\left(\hat f_n^{(l)}(x)\right)}} \xrightarrow{\;\mathcal L\;} N(0, 1), \qquad (2.1)$$
is that Lyapounov's condition holds, i.e., for some $\delta > 0$,
$$\frac{E\,|Z_{n1} - EZ_{n1}|^{2+\delta}}{n^{\delta/2}\left(\operatorname{var}(Z_{n1})\right)^{1+\delta/2}} \to 0, \qquad (2.2)$$
where
$$Z_{nj} = \frac{1}{h_n^{1+l}}\,g_n^{(l)}\!\left(\frac{x - Y_j}{h_n}\right). \qquad (2.3)$$
To check (2.2), we have to give a lower bound for $\operatorname{var}(Z_{n1})$. For the ordinary smooth case, we need the following lemma, which generalizes Theorem A1 of Parzen (1962).
Lemma 2.1: Suppose that $K_n(\cdot)$ is a sequence of Borel functions satisfying $K_n(y) \to K(y)$ and $\sup_n |K_n(y)| \le K^*(y)$, where $K^*(y)$ satisfies
$$\int K^*(y)\,dy < \infty \quad\text{and}\quad \lim_{|y|\to\infty}|yK^*(y)| = 0.$$
If $x$ is a continuity point of a density $f(\cdot)$, then for any sequence $h_n \to 0$, we have
$$\lim_{n\to\infty}\frac{1}{h_n}\int K_n\!\left(\frac{x - y}{h_n}\right)f(y)\,dy = f(x)\int_{-\infty}^{\infty}K(y)\,dy.$$
Case I (ordinary smooth case): For this case, we will need the following assumptions on the tail of the characteristic function $\phi_\varepsilon$ and the kernel function $\phi_K$, called Assumption $A_{m,l}$.
Assumption $A_{m,l}$:
i) $\phi_\varepsilon(t)\,t^{\beta} \to c$ and $\phi_\varepsilon'(t)\,t^{\beta+1} \to -\beta c$ (as $t \to +\infty$) for some constant $c \ne 0$ and $\beta \ge 0$. Moreover, $\phi_\varepsilon(t) \ne 0$ for all $t$.
ii) $\phi_K(t)$ is a symmetric function, having $m + 2$ bounded integrable derivatives, and $\phi_K(0) = 1$.
iii) $\phi_K(t) = 1 + O(|t|^{m})$, as $t \to 0$.
iv) The $m$-th derivative of the unknown $f_X(\cdot)$ exists and is continuous.
Note that Assumption $A_{m,l}$ is stronger than Assumption $A_{k,l}$ if $m > k$. Note also that Assumption $A_{m,l}$ ii) implies that the kernel function
$$K(y) = \frac{1}{2\pi}\int \exp(-ity)\,\phi_K(t)\,dt \qquad (2.4)$$
is a real-valued function integrating to unity, and satisfying
$$(2.5)$$
for some positive constant $D$, and
$$\int y^{j}K(y)\,dy = 0, \quad \text{for } 1 \le j \le m - 1, \qquad (2.6)$$
i.e., $K(\cdot)$ satisfies the ordinary kernel conditions (Prakasa Rao (1983), page 46). Under these assumptions, it can be proved (Lemma 3.1) that if $h_n \to 0$, then
$$(2.7)$$
for some constant $C_1$, which is independent of $n$ and $x$.
Theorem 2.1: Under Assumption $A_{m,l}$, if $h_n \to 0$ and $nh_n \to \infty$, then
$$\frac{\hat f_n^{(l)}(x) - E\hat f_n^{(l)}(x)}{\sqrt{\operatorname{var}\!\left(\hat f_n^{(l)}(x)\right)}} \xrightarrow{\;\mathcal L\;} N(0, 1).$$
Remark 1: The assumption that $\phi_K(t) = 1 + O(|t|^{m})$ as $t \to 0$ is not used in the proofs of Theorems 2.1 and 2.2. But it will be used in Corollaries 2.1 and 2.2 to ensure that the bias of the estimator goes to 0 sufficiently fast, so that we can replace $E\hat f_n^{(l)}(x)$ by $f_X^{(l)}(x)$.
Even though we obtain asymptotic normality for the kernel density estimator, its rescaling still involves the unknown quantity $\operatorname{var}(\hat f_n^{(l)}(x))$. We have to estimate the scale from the data. Note that, from the proof of Theorem 2.1, the unknown quantity is
$$\operatorname{var}\!\left(\hat f_n^{(l)}(x)\right) = \frac{1}{n}\,h_n^{-2(l+1+\beta)+1}\,\frac{f_Y(x)}{2\pi|c|^{2}}\int |t|^{2(l+\beta)}\,|\phi_K(t)|^{2}\,dt\,(1 + o(1)).$$
Here, the unknown $f_Y$ can be replaced by any of its consistent estimators. For example, it can be replaced by $\hat f_Y = \hat f_n * F_\varepsilon$, where $\hat f_n$ is the kernel density estimator given by (1.2) and $F_\varepsilon$ denotes the cdf of $\varepsilon$. But this involves some extra computation for the convolution and the integration. Naturally, we would rather use the sample variance to estimate it. In what follows, we need to check that the sample variance converges in probability to $\operatorname{var}(Z_{n1})$, where $Z_{nj}$ is defined by (2.3). As $E(Z_{n1}^{2})$ goes to $\infty$ and $EZ_{n1}$ is bounded, it is not important to know whether the sample mean converges to $EZ_{n1}$ or not. In fact, this is not generally true unless $h_n$ tends to 0 sufficiently slowly.
Lemma 2.2: Under the assumptions of Theorem 2.1,
$$\frac{\frac{1}{n}\sum_{1}^{n} Z_{nj}^{2}}{EZ_{n1}^{2}} \xrightarrow{\;P\;} 1, \qquad (2.8)$$
and if moreover $nh_n^{2(l+\beta)+1} \to \infty$, then
$$\frac{1}{n}\sum_{1}^{n} Z_{nj} - EZ_{n1} \xrightarrow{\;P\;} 0, \qquad (2.9)$$
where $Z_{nj}$ is defined by (2.3).
Corollary 2.1: Under Assumption $A_{m,l}$ ($m > l$), if $h_n = o\!\left(n^{-\frac{1}{2m+2\beta+1}}\right)$ and $f_X^{(m)}(\cdot)$ is bounded, then
$$\sqrt{n}\;\frac{\hat f_n^{(l)}(x) - f_X^{(l)}(x)}{s_n} \xrightarrow{\;\mathcal L\;} N(0, 1),$$
where $s_n$ is given either by $s_n^{2} = \frac{1}{n}\sum_{1}^{n} Z_{nj}^{2}$, or by the sample variance $s_n^{2} = \frac{1}{n}\sum_{1}^{n}\left(Z_{nj} - \frac{1}{n}\sum_{1}^{n} Z_{nj}\right)^{2}$ provided that $nh_n^{2(l+\beta)+1} \to \infty$, where $Z_{nj}$ is given by (2.3).
Remark 2: By the results of Fan (1988a), to achieve the optimal (local) rates of convergence under quadratic loss, the bandwidth $h_n$ must be of order $n^{-\frac{1}{2m+2\beta+1}}$. To assure that the bias is a negligible quantity in comparison with the variance of the estimator, we have to let the bandwidth go to 0 faster than the optimal one.
Case II (supersmooth case): In this case, we will assume that $\phi_\varepsilon(t)$ and $\phi_K(t)$ satisfy the following assumptions, called Assumption $B_{m,l}$.
Assumption $B_{m,l}$:
i) $C_2 \ge |\phi_\varepsilon(t)|\,|t|^{-\beta_0}\exp(|t|^{\beta}/\gamma) \ge C_1$ as $t \to +\infty$, with $\beta,\ \gamma,\ C_1,\ C_2 > 0$ and some real number $\beta_0$, and $\phi_\varepsilon(t) \ne 0$ for all $t$. Write $\phi_\varepsilon(t) = R_\varepsilon(t) + i\,I_\varepsilon(t)$, where $R_\varepsilon(t)$ and $I_\varepsilon(t)$ are the real part and the imaginary part of the characteristic function of $\varepsilon$. Assume furthermore that, in the tail, either $I_\varepsilon(t) = o(R_\varepsilon(t))$ or $R_\varepsilon(t) = o(I_\varepsilon(t))$.
ii) $\phi_K(t)$ is symmetric and supported on $[-1, 1]$, having the first $m + 2$ continuous derivatives. Moreover, $\phi_K(t) \ge C_3(1 - t)^{m+3}$ for $t \in [1 - \delta, 1)$, for some $\delta > 0$ and $C_3 > 0$.
iii) $\phi_K(0) = 1$ and $\phi_K(t) = 1 + O(|t|^{m})$, as $t \to 0$.
iv) The $m$-th derivative of the unknown $f_X(\cdot)$ exists and is continuous.
Note that for this case it is hard (maybe impossible) to find a simple expression and the exact order for the function $g_n^{(l)}(y)$ defined by (1.7). Consequently, the essential difficulty is in finding a lower bound for $EZ_{n1}^{2}$. This is why we impose so many assumptions. Fortunately, these assumptions do not exclude the commonly used error distributions, such as the normal, the Cauchy, mixtures of normals, etc. Assumption i) essentially says that the error distribution is supersmooth and that, in the tail, its characteristic function is essentially purely real or purely imaginary. Assumption ii) is imposed to make (1.7) converge. The smoothness condition on $\phi_K(\cdot)$ is imposed to make (2.5) valid. Because of the smoothness conditions on $\phi_K(\cdot)$, the tail condition that $\phi_K(t) \ge C_3(1 - t)^{m+3}$ as $t \to 1$ is the simplest one we can impose. Assumption iii) is made so that the kernel function satisfies (2.6). It will not be used in proving Theorem 2.2.
Lemma 2.3: Under Assumption $B_{m,l}$, as $n \to \infty$,
$$|g_n^{(l)}(y)| \ge C_4\,q_1(y)\,\exp\!\left(\frac{(1 - b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\,b_n^{m+4}, \qquad (2.10)$$
uniformly over $y \in [0, \pi/2]$, where $b_n = h_n^{\beta/(2(m+5))}$ and $C_4$ is a positive constant.
Now, we are ready to state and prove the asymptotic normality of the kernel density estimator.
Theorem 2.2: Under Assumption $B_{m,l}$, we have
$$\frac{\hat f_n^{(l)}(x) - E\hat f_n^{(l)}(x)}{\sqrt{\operatorname{var}\!\left(\hat f_n^{(l)}(x)\right)}} \xrightarrow{\;\mathcal L\;} N(0, 1),$$
provided that $h_n \sim c\,(\log n)^{-1/\beta}$ for some $c > 0$.
To show the legitimacy of replacing $\operatorname{var}(Z_{n1})$ by the sample variance, we need the following lemma.
Lemma 2.4: Under Assumption $B_{m,l}$, if $h_n \sim c\,(\log n)^{-1/\beta}$ for some $c > 0$, then
$$\frac{\sum_{1}^{n} Z_{nj}^{2}}{n\,EZ_{n1}^{2}} \xrightarrow{\;P\;} 1, \qquad (2.11)$$
and if moreover $h_n \sim \left(\tfrac{a\gamma\log n}{2}\right)^{-1/\beta}$ for some $0 < a < 1$, then
$$\frac{1}{n}\sum_{1}^{n} Z_{nj} - EZ_{n1} \xrightarrow{\;P\;} 0, \qquad (2.12)$$
where $Z_{nj}$ is defined by (2.3).
Corollary 2.2: Under Assumption $B_{m,l}$ ($m > l$), if $h_n \sim \left(\tfrac{a\gamma\log n}{2}\right)^{-1/\beta}$ for some $a > 1$ and $f_X^{(m)}(\cdot)$ is bounded, then
$$\sqrt{n}\;\frac{\hat f_n^{(l)}(x) - f_X^{(l)}(x)}{s_n} \xrightarrow{\;\mathcal L\;} N(0, 1), \qquad (2.13)$$
where $s_n^{2} = \frac{1}{n}\sum_{1}^{n} Z_{nj}^{2}$ is defined similarly to Corollary 2.1.
Remark 3: The optimal bandwidth in the supersmooth case is $h_n \sim d\,(\log n)^{-1/\beta}$ for some large $d$ ($> (\gamma/2)^{-1/\beta}$). The constant $d$ is important for the order of magnitude of the variance term, and the variance can even go to infinity if $d < (\gamma/2)^{-1/\beta}$ (see Lemma 2.3 and (3.7)). By choosing a sufficiently small constant, the bias is a negligible term in comparison with the variance. However, the situation changes for a sufficiently large constant: for a large constant $d$, the variance will be a small-order term. Thus, the mean squared error of the kernel estimator and the result of Corollary 2.2 are quite sensitive to the choice of a suitable constant. As the optimal rate for the problem is only $O\!\left((\log n)^{-\frac{m-l}{\beta}}\right)$ (Fan (1988a, b)), we cannot expect that (2.13) converges reasonably fast for moderate sample sizes. The final remark is that the sample mean $\frac{1}{n}\sum_{1}^{n} Z_{nj}$ may not converge to $EZ_{n1}$ for the bandwidth given by Corollary 2.2. This is why we do not use the sample variance to estimate $\operatorname{var}(Z_{n1})$ in Corollary 2.2.
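A worked instance of Remark 3, using only the standard normal characteristic function (the identification of $\beta$ and $\gamma$ here is a routine fact, not a statement from the paper): for $N(0, \sigma^2)$ error, $\beta = 2$ and $\gamma = 2/\sigma^2$, so
$$(\gamma/2)^{-1/\beta} = \left(\sigma^{-2}\right)^{-1/2} = \sigma,$$
i.e. the bandwidth should be $h_n = d\,(\log n)^{-1/2}$ with $d > \sigma$, and the variance blows up if $d < \sigma$.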
3. PROOFS
Proof of Lemma 2.1:
The proof is quite standard (Parzen (1962)). Let $\delta > 0$, and split the region of integration into two parts: $|y| \le \delta$ and $|y| > \delta$. Then
$$\left|\frac{1}{h_n}\int_{-\infty}^{\infty} K_n\!\left(\frac{x - y}{h_n}\right)f(y)\,dy - f(x)\int_{-\infty}^{\infty} K(y)\,dy\right|$$
$$\le \left|\int_{-\infty}^{\infty}\left[f(x - h_n y) - f(x)\right]K_n(y)\,dy\right| + f(x)\left|\int_{-\infty}^{\infty}\left(K_n(y) - K(y)\right)dy\right|$$
$$\le \max_{|y|\le\delta}\left|f(x - y) - f(x)\right|\int K^*(y)\,dy + \frac{1}{\delta}\sup_{|z|\ge\delta/h_n}\left|zK^*(z)\right| + f(x)\int_{|z|\ge\delta/h_n}K^*(z)\,dz + f(x)\left|\int\left(K_n(y) - K(y)\right)dy\right|.$$
By Lebesgue's dominated convergence theorem and the assumptions, the last three terms tend to 0 as $n \to \infty$. Letting $\delta \to 0$, we conclude the result.
We need the following lemma to prove Theorem 2.1.
Lemma 3.1: Under Assumption $A_{m,l}$, if $h_n \to 0$, then (2.7) holds for some constant $C_1$.
Proof of Lemma 3.1:
By integration by parts,
$$g_n^{(l)}(x) = \frac{1}{2\pi i x}\int_{-\infty}^{\infty}\exp(-itx)\left[(-it)^{l}\,\frac{\phi_K(t)}{\phi_\varepsilon(t/h_n)}\right]'\,dt.$$
Thus, by the usual argument (see (3.1)), the desired bound holds for some constant $C$. Hence, the assertion follows.
Proof of Theorem 2.1:
Note that Assumption $A_{m,l}$ i) implies that
$$\left|\frac{h_n^{\beta}\,\phi_K(t)}{\phi_\varepsilon(t/h_n)}\right| \le g_0(t), \qquad (3.1)$$
where $g_0$ is a positive integrable function such that $\int |t|^{l}\,g_0(t)\,dt < \infty$ (here $1_A$ denotes the indicator function of a set $A$), and $M$ is large enough (but fixed) that
$$|t^{\beta}\,\phi_\varepsilon(t)| \ge \frac{|c|}{2}, \quad \text{when } |t| \ge M.$$
By Lebesgue's dominated convergence theorem (see (1.7) and (3.1)), we have
$$h_n^{\beta}\,g_n^{(l)}(y) \to \frac{1}{2\pi}\int \exp(-ity)\,(-it)^{l}\,\frac{t^{\beta}\,\phi_K(t)}{c}\,dt,$$
where $g_n^{(l)}(\cdot)$ is the "deconvolution kernel" function defined by (1.7). A consequence of (3.1) is that $\sup_y |g_n^{(l)}(y)| = O(h_n^{-\beta})$. Combining this with (2.7) (see Lemma 3.1), we have
$$(3.2)$$
Let $f_Y(\cdot)$ be the density of $Y = X + \varepsilon$. Then, by Assumption $A_{m,l}$ iv), $f_Y(\cdot)$ is continuous. By Lemma 2.1 and Parseval's identity, we have
$$EZ_{n1}^{2} = h_n^{-2(l+1)}\int\left[g_n^{(l)}\!\left(\frac{x - y}{h_n}\right)\right]^{2} f_Y(y)\,dy$$
$$= h_n^{-2(l+1+\beta)+1}\,f_Y(x)\int\left[\frac{1}{2\pi c}\int \exp(-ity)\,(-it)^{l}\,t^{\beta}\,\phi_K(t)\,dt\right]^{2} dy\,(1 + o(1))$$
$$= \frac{f_Y(x)}{2\pi|c|^{2}}\int |t|^{2(l+\beta)}\,|\phi_K(t)|^{2}\,dt\;h_n^{-2(l+1+\beta)+1}\,(1 + o(1)), \qquad (3.3)$$
and by Fubini's theorem and Lemma 2.1,
$$EZ_{n1} = \int f_X^{(l)}(x - y)\,\frac{1}{h_n}\,K\!\left(\frac{y}{h_n}\right)dy \to f_X^{(l)}(x),$$
where $K$ is given by (2.4). Similarly, by (3.2) and Lemma 2.1, we conclude that
$$E\,|Z_{n1}|^{2+\delta} = O\!\left(h_n^{-(2+\delta)(l+\beta+1)+1}\right).$$
Thus, Lyapounov's condition (2.2) holds and the conclusion follows from Lyapounov's theorem.
Proof of Lemma 2.2:
According to the weak law of large numbers (Chow and Teicher (1978), p. 328), (2.8) holds if, for each $\varepsilon > 0$,
$$(3.4)$$
By (2.3), (3.2) and (3.3), when $n$ is large enough (using the fact that $nh_n \to \infty$), we have
$$|Z_{n1}|^{2} \le \left(\frac{C}{h_n^{1+l+\beta}}\right)^{2} < n\,\varepsilon\,EZ_{n1}^{2} \quad (\text{a.s.}).$$
Consequently, (3.4) holds. Now, by (3.3), if $nh_n^{2(l+\beta)+1} \to \infty$, then
$$\operatorname{var}\!\left(\frac{1}{n}\sum_{1}^{n} Z_{nj}\right) = \frac{\operatorname{var}(Z_{n1})}{n} \le \frac{EZ_{n1}^{2}}{n} \to 0,$$
which implies (2.9).
Proof of Corollary 2.1:
By (2.4) and (2.6), the kernel function $K(\cdot)$ satisfies the classical kernel conditions (Prakasa Rao (1983), pp. 46-47). Thus
$$E\hat f_n^{(l)}(x) - f_X^{(l)}(x) = O\!\left(h_n^{m-l}\right).$$
Consequently, by Theorem 2.1, (3.3) and the assumption on $h_n$, $E\hat f_n^{(l)}(x)$ can be replaced by $f_X^{(l)}(x)$ in (2.1). Note that $\operatorname{var}(Z_{n1})/EZ_{n1}^{2} \to 1$, since $EZ_{n1}$ is bounded and $EZ_{n1}^{2} \to \infty$. By Lemma 2.2, $s_n^{2}/\operatorname{var}(Z_{n1}) \xrightarrow{P} 1$. Hence, the result follows.
Proof of Lemma 2.3:
Split the region of the integral defining $g_n^{(l)}(y)$ in (1.7) into three parts: $I_1 = \{t : |t| \le 1 - a_n\}$, $I_2 = \{t : 1 - a_n < |t| \le 1 - b_n\}$, and $I_3 = \{t : 1 \ge |t| > 1 - b_n\}$, where $a_n = b_n + b_n^{m+5}$. Let the results of the integral over $I_1$, $I_2$, and $I_3$ be $J_1$, $J_2$, and $J_3$, respectively. We will show that $J_3$ is the dominant term, i.e. $J_1 = o(J_3)$ and $J_2 = o(J_3)$, as we expect intuitively.
According to our assumption on the tail of $|\phi_\varepsilon(t)|$, there exists an $M > 0$ such that when $|t| \ge M$,
$$C_1\,|t|^{\beta_0}\exp\!\left(-|t|^{\beta}/\gamma\right) \le |\phi_\varepsilon(t)| \le C_2\,|t|^{\beta_0}\exp\!\left(-|t|^{\beta}/\gamma\right). \qquad (3.5)$$
Hence,
$$|J_1| = \frac{1}{2\pi}\left|\int_{|t|\le 1-a_n}\exp(-ity)\,(-it)^{l}\,\frac{\phi_K(t)}{\phi_\varepsilon(t/h_n)}\,dt\right| = O\!\left(\exp\!\left(\frac{(1-a_n)^{\beta}}{\gamma h_n^{\beta}}\right)e_n\right),$$
where
$$e_n = \begin{cases} 1, & \text{if } \beta_0 \ge 0,\\ h_n^{\beta_0}, & \text{if } \beta_0 < 0.\end{cases}$$
As $b_n \to 0$, when $n$ is large enough we have $(1 - a_n)^{\beta} \le (1 - b_n)^{\beta} - \tfrac{1}{2}\beta\,b_n^{m+5}$. Thus,
$$|J_1| \le O\!\left(\exp\!\left(\frac{(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)\exp\!\left(-\frac{\beta\,b_n^{m+5}}{2\gamma h_n^{\beta}}\right)e_n\right) = o\!\left(\exp\!\left(\frac{(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\,b_n^{m+5}\right).$$
Similarly, by the assumption on the tail of $\phi_\varepsilon$, we have
$$|J_2| \le \frac{1}{2\pi}\left|\int_{1-a_n < |t| \le 1-b_n}\exp(-ity)\,(-it)^{l}\,\frac{\phi_K(t)}{\phi_\varepsilon(t/h_n)}\,dt\right| = O\!\left(\exp\!\left(\frac{(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\,b_n^{m+5}\right),$$
as we integrate over intervals of total length $2(a_n - b_n) = 2\,b_n^{m+5}$. Finally, we have to evaluate $J_3$.
Note that, by the fact that $\phi_\varepsilon(t)$ is a characteristic function and by symmetry, we can rewrite
$$J_3 = \frac{1}{\pi}\int_{1-b_n < t \le 1} t^{l}\,\phi_K(t)\left[\cos(ty)\,\frac{R_\varepsilon(t/h_n)}{|\phi_\varepsilon(t/h_n)|^{2}} + \sin(ty)\,\frac{I_\varepsilon(t/h_n)}{|\phi_\varepsilon(t/h_n)|^{2}}\right]dt. \qquad (3.6)$$
For simplicity, assume $l = 0$ and $I_\varepsilon(t/h_n) = o\!\left(R_\varepsilon(t/h_n)\right)$ (for the other cases, the proof is essentially the same).
Note that when $1 - b_n < t < 1$ and $n$ is large,
$$\cos(ty) = \cos y + o(1)$$
uniformly in $y \in [0, \pi/2]$. By Assumption $B_{m,l}$ i), $R_\varepsilon(t/h_n)$ cannot change its sign over $t \in [\tfrac{1}{2}, 1]$ when $n$ is sufficiently large (otherwise, $\phi_\varepsilon(\cdot)$ would have a zero). Thus, by (3.5) and (3.6), when $n$ is sufficiently large,
$$|J_3| \ge \frac{1}{2\pi}\int_{1-b_n}^{1}\phi_K(t)\,\cos(ty)\,\frac{1}{2C_2}\exp\!\left(\frac{t^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\,dt
\ge \frac{C_3\cos y}{4\pi C_2}\,\exp\!\left(\frac{(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\int_{1-b_n}^{1}(1-t)^{m+3}\,dt \ge c_4\cos y\,\exp\!\left(\frac{(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{\beta_0}\,b_n^{m+4}$$
for some constant $c_4$. The conclusion follows.
Proof of Theorem 2.2:
By Lemma 2.3, when $n$ is sufficiently large,
$$\int\left[g_n^{(l)}(y)\right]^{2} f_Y(x - h_n y)\,dy \ge c_5\,\exp\!\left(\frac{2(1-b_n)^{\beta}}{\gamma h_n^{\beta}}\right)h_n^{2\beta_0}\,b_n^{c_6} \qquad (3.7)$$
for some constants $c_5 > 0$ and $c_6$. Note that $EZ_{n1}$ is bounded, as shown in Theorem 2.1. Consequently, (3.7) is also a lower bound for $\operatorname{var}(Z_{n1})$. By the definition of $g_n^{(l)}$ and (3.5), it is easy to check that
$$\sup_y\,|g_n^{(l)}(y)| = O\!\left(\exp\!\left(\frac{1}{\gamma h_n^{\beta}}\right)e_n\right), \qquad (3.8)$$
where $e_n$ is defined in the proof of Lemma 2.3. By (1.7), (2.3), and (3.8),
$$E\,|Z_{n1}|^{2+\delta} = O\!\left(h_n^{-(2+\delta)(1+l)}\exp\!\left(\frac{2+\delta}{\gamma h_n^{\beta}}\right)e_n^{2+\delta}\right). \qquad (3.9)$$
Consequently, Lyapounov's condition holds for any $\delta > 0$ by choosing $h_n \sim c\,(\log n)^{-1/\beta}$. The assertion follows.
Proof of Lemma 2.4:
Similar to the proof of Lemma 2.2, we need only check (3.4) and that $\frac{1}{n}\operatorname{var}(Z_{n1}) \to 0$. By (3.7) and (3.8), it is easy to see that the event $\{Z_{n1}^{2} \ge n\,\varepsilon\,EZ_{n1}^{2}\}$ is an empty set when $n$ is sufficiently large, and hence (3.4) holds. By (3.9) with $\delta = 0$, we conclude the second result.
Acknowledgements. The author gratefully acknowledges Prof. P.J. Bickel for his fruitful discussions and encouragement. Thanks are also due to the referees for their constructive comments on a previous draft that improved the readability of the paper.
REFERENCES
1. Anderson, T.W. (1984): Estimating linear statistical relationships. Ann. Statist., 12, 1-45.
2. Berger, J.O. (1986): Statistical Decision Theory. Springer-Verlag, New York.
3. Bickel, P.J. and Ritov, Y. (1987): Efficient estimation in the errors in variables model. Ann. Statist., 15, 513-540.
4. Bickel, P.J. and Rosenblatt, M. (1973): Some global measures on the deviations of density function estimates. Ann. Statist., 1, 1071-1095.
5. Carroll, R.J. and Hall, P. (1988): Optimal rates of convergence for deconvoluting a density. J. Amer. Statist. Assoc., 83, 1184-1186.
6. Chow, Y.S. and Teicher, H. (1978): Probability Theory. Springer-Verlag, New York.
7. Crump, J.G. and Seinfeld, J.H. (1982): A new algorithm for inversion of aerosol size distribution data. Aerosol Science and Technology, 1, 15-34.
8. Donoho, D.L. and Liu, R.C. (1987): Geometrizing rates of convergence I. Tech. Report 137a, Dept. of Statist., University of California, Berkeley.
9. Donoho, D.L. and Liu, R.C. (1988): Geometrizing rates of convergence III. Tech. Report 138, Dept. of Statist., University of California, Berkeley.
10. Fan, J. (1988a): On the optimal rates of convergence for non-parametric deconvolution problem. Tech. Report 157, Dept. of Statist., University of California, Berkeley.
11. Fan, J. (1988b): Optimal global rates of convergence for non-parametric deconvolution problem. Tech. Report 162, Dept. of Statist., University of California, Berkeley.
12. Farrell, R.H. (1972): On the best obtainable asymptotic rates of convergence in estimation of a density function at a point. Ann. Math. Statist., 43, 170-180.
13. Grenander, U. (1956): On the theory of mortality measurement, Part II. Skand. Aktuarietidskr., 39, 125-153.
14. Groeneboom, P. (1985): Estimating a monotone density. Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (L.M. Le Cam and R.A. Olshen, eds.), 539-555.
15. Groeneboom, P. and Pyke, R. (1983): Asymptotic normality of statistics based on the convex minorants of empirical distribution functions. Ann. Probab., 11, 328-345.
16. Kuelbs, J.D. (1976): Estimation of the multidimensional probability density function. MRC Tech. Report No. 1646, University of Wisconsin, Madison.
17. Le Cam, L. (1985): Asymptotic Methods in Statistical Decision Theory. Springer-Verlag, New York-Berlin-Heidelberg.
18. Liu, M.C. and Taylor, R.L. (1987a): A consistent nonparametric density estimator for the deconvolution problem. Tech. Report 73, Dept. of Statist., Univ. of Georgia.
19. Liu, M.C. and Taylor, R.L. (1987b): Simulations and computations of nonparametric density estimates for the deconvolution problem. Tech. Report 74, Dept. of Statist., Univ. of Georgia.
20. Medgyessy, P. (1977): Decomposition of Superpositions of Density Functions and Discrete Distributions. John Wiley, New York.
21. Mendelsohn, J. and Rice, J. (1982): Deconvolution of microfluorometric histograms with B-splines. J. Amer. Statist. Assoc., 77, 748-753.
22. Nadaraja, E. (1965): On nonparametric estimation of density function and regression. Theory Probab. Appl., 10, 186-190.
23. Prakasa Rao, B.L.S. (1983): Nonparametric Functional Estimation. Academic Press, New York.
24. Rosenblatt, M. (1976): On the maximum deviation of k-dimensional density estimates. Ann. Statist., 4, 1009-1015.
25. Schafer, D.W. (1987): Covariate measurement error in generalized linear models. Biometrika, 74, 385-391.
26. Schwartz, S.C. (1969): On the estimation of a Gaussian convolution probability density. SIAM J. Appl. Math., 17, 447-453.
27. Stefanski, L.A. and Carroll, R.J. (1987): Conditional scores and optimal scores for generalized linear measurement-error models. Biometrika, 74, 703-716.
28. Stefanski, L.A. and Carroll, R.J. (1987): Deconvoluting kernel density estimators. University of North Carolina Mimeo Series #1623.
29. Stefanski, L.A. (1988): Rates of convergence of some estimators in a class of deconvolution problems. Manuscript.
30. Stone, C. (1980): Optimal rates of convergence for nonparametric estimators. Ann. Statist., 8, 1348-1360.
31. Wise, G.L., Traganitis, A.P., and Thomas, J.B. (1977): The estimation of a probability density from measurements corrupted by Poisson noise. IEEE Trans. Inform. Theory, 23, 764-766.