Deconvolution with Supersmooth Distributions

Jianqing Fan
Department of Statistics
University of North Carolina
Chapel Hill, N.C. 27514

July 15, 1990

Abstract

The desire to recover the unknown density when data are contaminated with errors leads to nonparametric deconvolution problems. Optimal global rates of convergence are found under the weighted $L_p$-loss ($1 \le p \le \infty$). It appears that the optimal rates of convergence are extremely slow for supersmooth error distributions. To overcome the difficulty, we examine how large the noise level can be for deconvolution to be feasible, and for the deconvolution estimate to be as good as the ordinary density estimate. It is shown that if the noise level is not too large, nonparametric Gaussian deconvolution can still be practical. Several simulation studies are also presented.

Abbreviated title. Supersmooth Deconvolution.
AMS 1980 subject classification. Primary 62G20. Secondary 62G05.
Key words and phrases. Deconvolution, Fourier transforms, kernel density estimates, $L_p$-norm, global rates of convergence, minimax risks.

Section 4 examines how the theory works for moderate sample sizes via simulation studies. Further remarks are given in Section 5. Proofs are deferred to Section 6.

2. Optimal Global Rates

We first give a global lower bound on rates for supersmooth error distributions. Assume that the second half of inequality (1.4) holds:

$$|\phi_\varepsilon(t)| \le d_1 |t|^{\beta_1} \exp\big(-|t|^{\beta}/\gamma\big) \qquad (\text{as } t \to \infty), \eqno(2.1)$$

for some constants $\beta, \gamma > 0$, $d_1 > 0$ and $\beta_1$, and that the error distribution satisfies a tail condition of the form (2.2) as $x \to \pm\infty$, for some $0 < \alpha_0 < 1$ and $a > 1 + \alpha_0$.

Theorem 1. Suppose that the distribution of the error variable $\varepsilon$ satisfies (2.1) and (2.2) and that $f \in \mathcal{C}_{m,B}$. Then, no estimator can estimate $f^{(l)}(x)$ faster than the rate $O\big((\log n)^{-(m-l)/\beta}\big)$, in the sense that for any estimator $\hat{T}_n(x)$,

$$\sup_{f \in \mathcal{C}_{m,B}} E_f \int \big|\hat{T}_n(x) - f^{(l)}(x)\big|^p\, w(x)\,dx \;\ge\; C_{p,l}\,(\log n)^{-(m-l)p/\beta} \eqno(2.3)$$

for all $1 \le p \le \infty$, provided that the weight function $w(\cdot)$ is positive and continuous on some interval, where $C_{p,l}$ is a positive constant independent of the estimator.

In terms of the technical argument for Theorem 1, we will use the technique of adaptively local one-dimensional subproblems developed by Fan (1989), and then reduce the global density estimation problem to a pointwise estimation problem so that the existing lower bound (Fan (1990)) on pointwise rates of convergence can be used. To our knowledge, this technical argument appears to be new.

We now show that the rate above is indeed attained by the deconvolution kernel estimator (1.3), and hence is optimal. Some assumptions on the kernel function $K(\cdot)$ are collected in

Condition 1:
• $K(\cdot)$ is bounded and continuous, and $\int_{-\infty}^{+\infty} |y|^m |K(y)|\,dy < \infty$.
• The Fourier transform $\phi_K$ of $K$ has bounded support $[-M_0, M_0]$.

Theorem 2. Assume that $\phi_\varepsilon(t) \ne 0$ for any $t$, and that $\phi_K(t)$ vanishes for $|t| > M_0$. Moreover, assume that $\phi_K(t) = 1 + O(|t|^m)$ (as $t \to 0$) and that

$$|\phi_\varepsilon(t)| \ge d_2 |t|^{\beta_2} \exp\big(-|t|^{\beta}/\gamma\big) \qquad (\text{as } t \to \infty), \eqno(2.4)$$

for some positive constants $\beta, \gamma, d_2$ and a constant $\beta_2$. If the kernel function $K$ satisfies Condition 1, then for $h_n = c M_0 (2/\gamma)^{1/\beta}(\log n)^{-1/\beta}$ with $c > 1$,

$$\sup_{f \in \mathcal{C}_{m,B}} E_f \int \big|\hat{f}_n^{(l)}(x) - f^{(l)}(x)\big|^p\, w(x)\,dx \;=\; O\big((\log n)^{-(m-l)p/\beta}\big) \eqno(2.5)$$

for all $1 \le p \le \infty$, provided that the weight function is integrable.

In light of the bandwidth given by Theorem 2, there is not much room for bandwidth selection. If $c > 1$, then the variance converges to 0 much faster than the bias does, while if $c < 1$, the variance goes to infinity.
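To make the estimator (1.3) and the logarithmic bandwidth of Theorem 2 concrete, the following is a minimal numerical sketch, not taken from the paper: it assumes a Gaussian error $N(0, \sigma_0^2)$, for which $\beta = 2$ and $\gamma = 2/\sigma_0^2$, and an illustrative kernel with Fourier transform $\phi_K(t) = (1 - t^2)^3$ on $[-1, 1]$ (so $M_0 = 1$); the function names, the simulated data, and the choice $c = 1.1$ are assumptions of the sketch.

```python
import numpy as np

def deconv_kde(x_grid, Y, h, sigma0, M0=1.0, t_pts=2001):
    """Deconvolution kernel density estimate, in the spirit of (1.3).

    Sketch assumptions: Gaussian error N(0, sigma0^2), so phi_eps(t) = exp(-sigma0^2 t^2 / 2),
    and a kernel whose Fourier transform is phi_K(t) = (1 - t^2)^3 on [-M0, M0].
    """
    # Frequency grid covering the support of phi_K(t * h), i.e. |t| <= M0 / h.
    t = np.linspace(-M0 / h, M0 / h, t_pts)
    phi_K = (1.0 - (t * h / M0) ** 2) ** 3              # kernel Fourier transform at t*h
    phi_eps = np.exp(-0.5 * (sigma0 * t) ** 2)          # error characteristic function
    ecf = np.mean(np.exp(1j * np.outer(t, Y)), axis=1)  # empirical c.f. of the noisy data
    integrand = phi_K * ecf / phi_eps                   # dividing by phi_eps is the deconvolution step
    # Fourier inversion f_hat(x) = (2*pi)^(-1) * int exp(-i*t*x) * integrand dt, by Riemann sum.
    dt = t[1] - t[0]
    return np.real(np.exp(-1j * np.outer(x_grid, t)) @ integrand) * dt / (2.0 * np.pi)

# Illustrative use: X ~ N(0,1) observed with N(0, sigma0^2) noise.  The bandwidth has the
# logarithmic form h_n = c * M0 * (2/gamma)^(1/beta) * (log n)^(-1/beta) of Theorem 2, which
# for Gaussian error (beta = 2, gamma = 2/sigma0^2) reduces to c * sigma0 / sqrt(log n).
rng = np.random.default_rng(0)
n, sigma0 = 1000, 0.4
Y = rng.normal(0.0, 1.0, n) + rng.normal(0.0, sigma0, n)
h = 1.1 * sigma0 / np.sqrt(np.log(n))                   # c = 1.1 > 1, as Theorem 2 requires
x_grid = np.linspace(-4.0, 4.0, 201)
f_hat = deconv_kde(x_grid, Y, h, sigma0)
```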
Thus, practical selection of bandwidth would choose a constant $c$ close to 1 in Theorem 2.

The distributions satisfying conditions (2.1), (2.2), and (2.4) include normal, mixture normal, and Cauchy distributions. For these supersmooth error distributions, nonparametric deconvolution is extremely hard: the optimal rate of convergence is only of order $(\log n)^{-m/\beta}$. One way of resolving this difficulty will be discussed in the next section. Some special global results (basically $p = m = 2$, $l = 0$, $\varepsilon$ normal or Cauchy) were obtained independently by Zhang (1990) under a different formulation. The results in Theorems 1 and 2 provide better insight: they show that both the lower and upper bounds depend on the error distribution only through the tail of $\phi_\varepsilon$, and the dependence is explicitly addressed.

Remark 1. In an early version of the proof of Theorem 1 (see Fan (1988), on which the results in this section are based), a one-dimensional subproblem is already hard enough to capture the difficulty of the full global deconvolution problem. In contrast with ordinary density estimation (Stone (1982)), in order to construct an attainable lower bound under the global …

Let $F_\varepsilon$ be the distribution of $\varepsilon$, and let $\chi^2(f, g) = \int (f - g)^2/g\,dx$ be the $\chi^2$-distance. By Theorem 1 of Fan (1989), if (6.3) holds, then

$$\inf_{\hat{T}_n}\ \sup_{f \in \mathcal{C}_{m,B}} E_f \int_0^1 \big|\hat{T}_n(x) - f^{(l)}(x)\big|^p w(x)\,dx \;\ge\; \big(1 - \Phi(-c_1)\big)\int_0^1 w(x)\,dx \int_0^1 \big|H^{(l)}(x)\big|^p dx\; m_n^{-(m-l)p}. \eqno(6.4)$$

Thus, $m_n^{-(m-l)}$ is the global lower rate. We now determine $m_n$ from (6.3). Note that there exists a positive constant $c_2$ such that $f_0(x) \ge c_2 f_0(x + j/m_n)$ ($1 \le j \le m_n$). By (6.1) and (6.2) with a change of variables, we obtain (6.5). To construct a pointwise minimax lower bound, one also has to select $m_n$ such that (6.5) is of order $O(1/n)$, which is determined by Fan (1990) to be $m_n = c_3(\log n)^{1/\beta}$, for some constant $c_3 > 0$. Consequently, the global rate is of order $m_n^{-(m-l)} = c_3^{-(m-l)}(\log n)^{-(m-l)/\beta}$. The conclusion follows.

6.2. Proof of Theorem 2

We need only to prove the result for $p = \infty$; the other results follow from it by assuming that $\int_{-\infty}^{+\infty} w(x)\,dx = 1$. Note that

$$E\hat{f}_n^{(l)}(x) = \int_{-\infty}^{+\infty} f^{(l)}(x - h_n y)K(y)\,dy,$$

which is independent of the error distribution $F_\varepsilon$. Thus, by results for ordinary density estimation, or by Taylor expansion,

$$\sup_{f \in \mathcal{C}_{m,B}} \big\|E\hat{f}_n^{(l)}(\cdot) - f^{(l)}(\cdot)\big\|_\infty \;\le\; \frac{B}{(m-l)!}\int_{-\infty}^{+\infty}|y|^{m-l}|K(y)|\,dy\; h_n^{m-l}. \eqno(6.6)$$

Thus, we need only to verify that the stochastic term $\sup_x \big|\hat{f}_n^{(l)}(x) - E\hat{f}_n^{(l)}(x)\big|$ is negligible. Note that by (1.3), this term can be bounded as in (6.7), using the fact that $\phi_K$ has support $[-M_0, M_0]$. By (2.4), there exists a constant $t_0$ such that the lower bound on $|\phi_\varepsilon(t)|$ in (2.4) holds for all $|t| \ge t_0$. Consequently, by (6.7) and the fact that $\min_{|t| \le t_0}|\phi_\varepsilon(t)| > 0$, we obtain a bound on the stochastic term; with the bandwidth given by Theorem 2, this bound is of order $o\big((\log n)^{-d}\big)$ for any positive constant $d$. This completes the proof.

6.5. Proof of Theorem 5

First, integrating by parts twice, it is easy to see that $K_n$ defined by (3.5) is bounded by

$$|K_n(x)| \le \frac{C}{1 + x^2} \qquad (\text{for some constant } C),$$

i.e., $K_n(x)$ is bounded and decays at the rate $|x|^{-2}$ as $|x| \to \infty$, and $\hat{F}_n$ can be expressed as

$$\hat{F}_n(x) = \frac{1}{n}\sum_{i=1}^{n} K^*\!\Big(\frac{x - Y_i}{h_n}\Big), \qquad \text{with } K^*(x) = \int_{-\infty}^{x} K_n(y)\,dy.$$
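As a companion to the density sketch above, here is a minimal numerical sketch of the distribution-function estimate $\hat{F}_n$ just displayed, under the same assumed Gaussian error and illustrative kernel; since (3.5) is not reproduced above, the kernel $K_n$ is taken in the standard deconvolution form $K_n(v) = (2\pi)^{-1}\int e^{-itv}\phi_K(t)/\phi_\varepsilon(t/h)\,dt$, which is an assumption of this sketch rather than a quotation of (3.5).

```python
import numpy as np

def deconv_kernel_Kn(v_grid, h, sigma0, t_pts=1001):
    """Assumed deconvolution kernel K_n(v) = (2*pi)^(-1) * int exp(-i*t*v) phi_K(t) / phi_eps(t/h) dt,
    with Gaussian error N(0, sigma0^2) and phi_K(t) = (1 - t^2)^3 on [-1, 1]."""
    t = np.linspace(-1.0, 1.0, t_pts)
    ratio = (1.0 - t ** 2) ** 3 / np.exp(-0.5 * (sigma0 * t / h) ** 2)
    dt = t[1] - t[0]
    return np.real(np.exp(-1j * np.outer(v_grid, t)) @ ratio) * dt / (2.0 * np.pi)

def deconv_cdf(x_grid, Y, h, sigma0, v_max=40.0, v_pts=2001):
    """F_hat_n(x) = n^(-1) * sum_i K_star((x - Y_i)/h), with K_star(u) = int_{-inf}^{u} K_n."""
    v = np.linspace(-v_max, v_max, v_pts)
    K_star = np.cumsum(deconv_kernel_Kn(v, h, sigma0)) * (v[1] - v[0])  # numerical antiderivative
    u = (x_grid[:, None] - Y[None, :]) / h                              # arguments (x - Y_i)/h
    # Beyond [-v_max, v_max], K_star is taken as 0 on the left and its limit (about 1) on the right.
    return np.interp(u, v, K_star, left=0.0, right=K_star[-1]).mean(axis=1)

# Illustrative use, with Y, h, sigma0 and x_grid as in the density sketch above:
# F_hat = deconv_cdf(x_grid, Y, h, sigma0)
```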
Thus, $\hat{F}_n$ is well defined. Note that

$$\sup_x\big|E\hat{F}_n(x) - F(x)\big| = \sup_x\Big|\int_{-\infty}^{+\infty} F(x - h_n y)K(y)\,dy - F(x)\Big| = O(h_n^{m}) = O(n^{-1/2}).$$

Thus, we need to prove that the remaining stochastic term is of the same order, which follows, for some constant $D$, from the Marcinkiewicz–Zygmund inequality (Chow and Teicher (1988), p. 356) or from a direct expansion by taking $p = 2j$ as in Theorem 4, as was to be shown.

References

[1] Carroll, R. J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc., 83, 1184-1186.
[2] Chow, Y. S. and Teicher, H. (1988). Probability Theory, 2nd edition. Springer-Verlag, New York.
[3] Devroye, L. (1989). Consistent deconvolution in density estimation. Canadian J. Statist., 17, 235-239.
[4] Fan, J. (1988). Optimal global rates of convergence for nonparametric deconvolution problems. Tech. Report 162, Dept. of Statistics, Univ. of California, Berkeley.
[5] Fan, J. (1989). Adaptively local 1-dimensional subproblems. Institute of Statistics Mimeo Series #2010, Univ. of North Carolina, Chapel Hill.
[6] Fan, J. (1990). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist., to appear.
[7] Fan, J. and Truong, Y. (1990). Nonparametric regression with errors-in-variables. Institute of Statistics Mimeo Series #2023, Univ. of North Carolina, Chapel Hill.
[8] Fuller, W. A. (1987). Measurement Error Models. Wiley, New York.
[9] Liu, M. C. and Taylor, R. L. (1989). A consistent nonparametric density estimator for the deconvolution problem. Canadian J. Statist., 17, 427-438.
[10] Masry, E. and Rice, J. A. (1990). Gaussian deconvolution via differentiation. Manuscript.
[11] Mendelsohn, J. and Rice, J. A. (1982). Deconvolution of microfluorometric histograms with B splines. J. Amer. Statist. Assoc., 77, 748-753.
[12] Stefanski, L. A. (1990). Rates of convergence of some estimators in a class of deconvolution problems. Statist. Probab. Letters, 9, 229-235.
[13] Stefanski, L. A. and Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics, to appear.
[14] Stone, C. (1982). Optimal global rates of convergence for nonparametric regression. Ann. Statist., 10, 1040-1053.
[15] Wise, G. L., Traganitis, A. D., and Thomas, J. B. (1977). The estimation of a probability density from measurements corrupted by Poisson noise. IEEE Trans. Inform. Theory, 23, 764-766.