Empirical Bayes Estimation of Finite Population Means from Complex Surveys

Vipin ARORA, P. LAHIRI, and Kanchan MUKHERJEE

Source: Journal of the American Statistical Association, Vol. 92, No. 440 (Dec., 1997), pp. 1555-1562. Published by: American Statistical Association. Stable URL: http://www.jstor.org/stable/2965426

Estimation of finite population means is considered when samples are collected using a stratified sampling design. Finite populations for different strata are assumed to be realizations from different superpopulations. The true means of the observations lie on a regression surface with random intercepts for the different strata. The true sampling variances are also different and random for the different strata. The strata are connected through two common prior distributions, one for the intercepts and another for the sampling variances of all the strata. The model is appropriate in two important survey situations. First, it can be applied to repeated surveys in which the physical characteristics of the sampling units change slowly over time. Second, the model is appropriate in small-area estimation problems where very few samples are available for any particular area. Empirical Bayes estimators of the finite population means are shown to be asymptotically optimal in the sense of Robbins. The proposed empirical Bayes estimators are also compared to the classical regression estimators in terms of the relative savings loss due to Efron and Morris. A measure of variability of the proposed empirical Bayes estimator is considered based on bootstrap samples. This measure of variability incorporates all sources of variation due to the estimation of the various model parameters. A numerical study is conducted to evaluate the performance of the proposed empirical Bayes estimator compared to rival estimators.

KEY WORDS: Asymptotic optimality; Bayes risk; Repeated survey; Small area estimation.

Vipin Arora is Biostatistician, Smith Hanley Consulting Group, Wayne, PA 19087. P. Lahiri is Associate Professor, Division of Statistics, Department of Mathematics and Statistics, University of Nebraska-Lincoln, NE 68588. Kanchan Mukherjee is Lecturer, Department of Mathematics, National University of Singapore, Singapore 119260. The research of the first two authors has been supported in part by National Science Foundation grants SES-9206326 and SES-9511202 to P. Lahiri. The authors thank two referees and an associate editor for their helpful comments. The authors also acknowledge the computational support of Ferry Butar, graduate student at University of Nebraska-Lincoln.

© 1997 American Statistical Association. Journal of the American Statistical Association, December 1997, Vol. 92, No. 440, Theory and Methods.

1. INTRODUCTION

In many repeated surveys, samples are collected routinely (e.g., monthly or quarterly) from a finite population, and the physical characteristics of the sampling units change slowly over time. Many of these surveys also provide information on auxiliary variables. Thus it is possible to improve on the current direct survey estimators of the finite population parameters by using the data from the earlier surveys, in conjunction with the auxiliary variables. For example, consider the Consumer Expenditure Survey, conducted by the U.S. Census Bureau. To obtain an estimate of expenditure on an item (e.g., fresh whole milk) for the current quarter, one can use data from the past quarters and also certain related auxiliary variables (e.g., family size and income) to improve on the direct survey estimators. We refer to Nandram and Sedransk (1993; hereinafter denoted by NS) for another nice example from the National Health Interview Survey conducted by the National Center for Health Statistics. A similar situation may be cited in small-area estimation problems, which have received considerable attention in recent years. Estimates of a small-area (finite population) characteristic may be improved by utilizing data from related areas and the auxiliary variables. For example, to estimate the unemployment rate of a state, one can use the data from other states as well as information on certain related auxiliary variables (e.g., the amount of welfare dollars distributed). Reliable small-area statistics are required by various federal, state, and local government agencies for policymaking and allocation of resources.

To formulate the problem, let Y_ij denote the value of the characteristic of interest for the jth unit of the ith finite population (i = 1, ..., m; j = 1, ..., N_i). The problem is to estimate γ_m = N_m^{-1} Σ_{j=1}^{N_m} Y_mj, the mean of the mth finite population. We assume that the N_i's (i = 1, ..., m) are known constants. In repeated surveys, the index i indicates the relevant time point, with m the current time point. In small-area estimation problems, the index i refers to the ith area, with m the index for the area of interest. A fixed sample of size n_i is available from the ith population (i = 1, ..., m). Without loss of generality, let Y_i = (Y_i1, ..., Y_i n_i)' denote the sample from the ith population (i = 1, ..., m). When n_m is small, the traditional direct survey estimators of γ_m (e.g., the sample mean Ȳ_m = n_m^{-1} Σ_{j=1}^{n_m} Y_mj) are not reliable. Thus it is necessary to improve on the direct survey estimators. This problem has been addressed by many authors who used either an implicit or an explicit model that connects the m finite populations.

The empirical Bayes (EB) method has been found to be suitable for estimating γ_m when n_m is small. Ghosh and Meeden (1986; hereinafter denoted by GM) proposed an EB estimator of γ_m using a one-way analysis of variance (ANOVA) model. Subsequently, Ghosh and Lahiri (1987) carried out a robust EB analysis, replacing the normality assumption by a weaker assumption on certain posterior linearity of the strata means in the sample observations. The EB estimator of GM can also be motivated from the best linear unbiased prediction (BLUP) approach (see Prasad and Rao 1990, hereinafter PR). Recently, NS generalized the results of GM to the case of unequal and unknown sampling variances. Whereas GM and NS considered estimation of γ_m, Ghosh and Lahiri (1987) considered simultaneous estimation of γ = (γ_1, ..., γ_m)'. We consider the following model.
Model 1.
(a) Y_ij | θ_i, v_i ~ind N(x_ij'β + θ_i, v_i), (i = 1, ..., m; j = 1, ..., N_i);
(b) θ_i | τ ~iid N(0, τ), (i = 1, ..., m);
(c) v_i ~iid G(ξ, δ), (i = 1, ..., m), where G(ξ, δ) represents a gamma distribution with mean ξ and variance δ.

In this model we assume that x_ij is a p × 1 vector of known and fixed auxiliary variables. Our approach differs from that of NS in two different ways. First, unlike in NS, the prior variance of θ_i (i.e., τ) is not a function of the sampling variance (i.e., v_i). This assumption makes the EB analysis difficult, because the Bayes estimator itself does not have a nice closed-form expression. Second, our model can incorporate many auxiliary variables available in most large-scale complex surveys. The introduction of auxiliary variables makes the analysis even harder. Several matrix results are developed to prove the asymptotic results given in the article. The mathematical tools developed here advance research on asymptotics applicable to similar situations.

Section 2 provides the Bayes estimator of γ_m under model 1 and the squared error loss function. Section 3 gives consistent estimators of the prior parameters, as well as a new EB estimator of γ_m. Section 4 provides results (without proofs) related to the asymptotic optimality due to Robbins (1955) and the relative savings loss (RSL) due to Efron and Morris (1973). In our asymptotics, we assume that the n_i's (i = 1, ..., m) are uniformly bounded and m tends to infinity. Section 5 provides a measure of variability of our EB estimator, obtained by extending the type III bootstrap method of Laird and Louis (1987) to our finite population problem; this method incorporates all sources of variation due to estimation of the hyperparameters. A numerical study is conducted to demonstrate the behavior of our proposed EB estimator for moderate m; Section 6 presents the results. Finally, the Appendix gives proofs of some of the theorems and lemmas.

2. THE BAYES ESTIMATION OF γ_m

Under model 1 and squared error loss, the Bayes estimator of γ_m is given by

e_B^(m) = e_B^(m)(Y_m, η) = E(γ_m | Y_m, η)
        = N_m^{-1} [ Σ_{j=1}^{n_m} Y_mj + Σ_{j=n_m+1}^{N_m} E{ E(Y_mj | Y_m, θ_m, v_m) | Y_m, η } ]
        = (1 − f_m) Ȳ_m + f_m [ (1 − w_m)(Ȳ_m − x̄_m'β) + x̄_m*'β ]
        = (1 − f_m w_m) Ȳ_m + f_m w_m x̄_m'β + f_m (x̄_m* − x̄_m)'β,                      (1)

where f_m = (N_m − n_m)/N_m is the finite population correction factor, x̄_m = n_m^{-1} Σ_{j=1}^{n_m} x_mj, x̄_m* = (N_m − n_m)^{-1} Σ_{j=n_m+1}^{N_m} x_mj, and

w_m = w_m(η, Y_m) = ∫_0^∞ [ z/(z + n_m τ) ] f(z | η, Y_m) dz,

with f(z | η, Y_m) ∝ g_m(z | η, Y_m) and

g_m(z | η, Y_m) = exp[ −ξz/δ − Σ_{j=1}^{n_m} (Y_mj − x_mj'β)² / {2(τ + z)} ] z^{ξ²/δ − 1/2} (τ + z)^{−n_m/2}.   (2)

Here 1 − w_m = E[ n_m τ/(n_m τ + v_m) | Y_m, η ], so that E(θ_m | Y_m, η) = (1 − w_m)(Ȳ_m − x̄_m'β). Clearly, the integral in the definition of w_m is uniformly bounded, and thus the Bayes estimator e_B^(m) exists.

Remark 2.1. Note that we do not need information on the auxiliary variables for all of the unobserved units. It is enough to know the mean of the auxiliary variables for the unobserved units; that is, x̄_m*. If the values of the auxiliary variables are the same for all the units of the finite population (i.e., if auxiliary information is available only at the stratum level, as in Ghosh and Lahiri 1987), then the third term in (1) vanishes.

Remark 2.2. Note that unlike in GM and NS, here w_m is a function of Y_m and does not have a closed-form expression. When x_ij = 1 for all i = 1, ..., m; j = 1, ..., N_i, and δ = 0 (which implies that v_i = ξ for i = 1, ..., m), e_B^(m) is identified with the Bayes estimator of γ_m proposed by GM.
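Because w_m in (2) is a one-dimensional integral over the sampling-variance scale, the Bayes estimator (1) can be evaluated numerically once η = (β', τ, ξ, δ)' is specified. The following Python sketch (ours, not part of the original article) illustrates this computation under the form of g_m reconstructed in (2); the function names and the use of scipy quadrature are our own choices, not the authors'.

```python
import numpy as np
from scipy import integrate

def g_m(z, y_m, x_m, beta, tau, xi, delta):
    """Unnormalized posterior kernel of the sampling variance v_m (form of eq. 2 as reconstructed)."""
    resid_ss = np.sum((y_m - x_m @ beta) ** 2)
    return (np.exp(-xi * z / delta - resid_ss / (2.0 * (tau + z)))
            * z ** (xi ** 2 / delta - 0.5)
            * (tau + z) ** (-0.5 * len(y_m)))

def bayes_estimator(y_m, x_m, xbar_star, N_m, beta, tau, xi, delta):
    """Bayes estimator e_B^(m) of the finite population mean gamma_m (eq. 1)."""
    n_m = len(y_m)
    f_m = (N_m - n_m) / N_m                       # finite population correction factor
    ybar, xbar = y_m.mean(), x_m.mean(axis=0)
    # w_m = E[v_m / (v_m + n_m * tau) | Y_m]: ratio of two one-dimensional integrals
    num = integrate.quad(lambda z: z / (z + n_m * tau)
                         * g_m(z, y_m, x_m, beta, tau, xi, delta), 0, np.inf)[0]
    den = integrate.quad(lambda z: g_m(z, y_m, x_m, beta, tau, xi, delta), 0, np.inf)[0]
    w_m = num / den
    return ((1 - f_m * w_m) * ybar + f_m * w_m * xbar @ beta
            + f_m * (xbar_star - xbar) @ beta)
```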
3. EMPIRICAL BAYES ESTIMATION OF γ_m

The Bayes estimator e_B^(m) is a function of several unknown hyperparameters, η = (β', τ, ξ, δ)', and thus we need to estimate them from the available data. First, assume that τ, ξ, and δ are known but β is unknown and must be estimated from the data. Write n = Σ_{i=1}^m n_i, Y = col_{1≤i≤m} Y_i, X_i = col_{1≤j≤n_i} x_ij', X = col_{1≤i≤m} X_i, and let I_i denote an identity matrix of order n_i (i = 1, ..., m). Then, marginally, E(Y_i) = X_i β and var(Y_i) = ξ I_i + τ 1_i 1_i', where 1_i is an n_i × 1 column vector of 1s. Hence the best linear unbiased estimator (BLUE) of β is obtained by minimizing Σ_{i=1}^m (Y_i − X_i β)'(ξ I_i + τ 1_i 1_i')^{-1}(Y_i − X_i β) with respect to β. The resulting estimator of β is given by

β̃ = [ Σ_{i=1}^m X_i'(I_i − n_i^{-1} A_i 1_i 1_i') X_i ]^{-1} Σ_{i=1}^m X_i'(I_i − n_i^{-1} A_i 1_i 1_i') Y_i,

where A_i = n_i τ/(ξ + n_i τ), (i = 1, ..., m). Next, consider the more realistic case when η is completely unknown. Write Y_ic = (I_i − n_i^{-1} 1_i 1_i') Y_i, X_ic = (I_i − n_i^{-1} 1_i 1_i') X_i, Y_c = col_{1≤i≤m} Y_ic, X_c = col_{1≤i≤m} X_ic, and û_ic = Y_ic − X_ic β̂_c, where β̂_c = (X_c'X_c)^{-1} X_c'Y_c and û = [I − X(X'X)^{-1}X'] Y. Note that under model 1, the marginal distribution of Y_ij satisfies the conditions of the nested error regression model as in PR, with v_i = θ_i, σ_v² = τ, and σ_e² = ξ. Thus, following PR, consistent unbiased quadratic estimators of ξ and τ are given by

ξ̂ = (n − m − p + r)^{-1} Σ_{i=1}^m û_ic' û_ic   and   τ̂* = n*^{-1} [ û'û − (n − p) ξ̂ ],

where n* = n − tr[ (X'X)^{-1} Σ_{i=1}^m n_i x̄_i x̄_i' ], and r = 0 if there is no intercept term in model 1 and r = 1 otherwise. In a real situation, τ̂* could be negative. Thus we estimate τ by τ̂ = max(0, τ̂*), which is a consistent estimator of τ as m → ∞, under certain mild regularity conditions (see PR). An approximately unbiased quadratic estimator δ̂* of δ, involving [ Σ_{i=1}^m (n_i − 1) ]^{-1} times second and fourth moments of the centered residuals û_ijc, is also available. In practice, we estimate δ by δ̂ = max(0, δ̂*), because δ̂* could yield a negative value. Theorem A.1 shows that δ̂ is a consistent estimator of δ under some mild regularity conditions.

When τ and ξ are unknown, β is estimated by

β̂ = [ Σ_{i=1}^m X_i'(I_i − n_i^{-1} Â_i 1_i 1_i') X_i ]^{-1} Σ_{i=1}^m X_i'(I_i − n_i^{-1} Â_i 1_i 1_i') Y_i,

where Â_i = n_i τ̂/(ξ̂ + n_i τ̂). Having estimated η by η̂ = (β̂', τ̂, ξ̂, δ̂)', we estimate w_m by ŵ_m = w_m(η̂, Y_m). Theorem A.2 shows that ŵ_m − w_m = o_p(1) as m → ∞, under certain mild regularity conditions. An EB estimator of γ_m is now obtained from e_B^(m) by replacing η with η̂, and is given by

e_EB^(m) = (1 − f_m ŵ_m) Ȳ_m + f_m ŵ_m x̄_m'β̂ + f_m (x̄_m* − x̄_m)'β̂.

Note that when δ = 0, e_EB^(m) is identical to the estimated BLUP (EBLUP) of γ_m (see PR).
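To make the estimation steps of this section concrete, the following Python sketch (ours, not from the article) computes ξ̂, τ̂ = max(0, τ̂*), and the plug-in weighted estimator β̂ described above; the δ̂ step is omitted because its exact quadratic form is not reproduced here, and all function and variable names are our own.

```python
import numpy as np

def fit_hyperparameters(y_list, x_list, intercept=True):
    """Moment estimators of (xi, tau) and the plug-in weighted estimator of beta
    for Y_ij = x_ij' beta + theta_i + e_ij (a sketch; the delta-hat step is omitted)."""
    m = len(y_list)
    n_i = np.array([len(y) for y in y_list])
    n, p = int(n_i.sum()), x_list[0].shape[1]
    X = np.vstack(x_list)
    Y = np.concatenate(y_list)

    # Within-stratum (centered) regression gives xi-hat
    Xc = np.vstack([x - x.mean(axis=0) for x in x_list])
    Yc = np.concatenate([y - y.mean() for y in y_list])
    beta_c, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    resid_c = Yc - Xc @ beta_c
    r = 1 if intercept else 0
    xi_hat = resid_c @ resid_c / (n - m - p + r)

    # Overall OLS residuals give the Prasad-Rao-type moment estimator tau-hat
    beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta_ols
    xbar_i = np.array([x.mean(axis=0) for x in x_list])                      # m x p
    n_star = n - np.trace(np.linalg.solve(X.T @ X, (xbar_i * n_i[:, None]).T @ xbar_i))
    tau_hat = max(0.0, (resid @ resid - (n - p) * xi_hat) / n_star)

    # Weighted (BLUE-type) estimator of beta with estimated variance components
    A_i = n_i * tau_hat / (xi_hat + n_i * tau_hat)
    lhs, rhs = np.zeros((p, p)), np.zeros(p)
    for a, x, y, k in zip(A_i, x_list, y_list, n_i):
        P = np.eye(k) - (a / k) * np.ones((k, k))                            # I - (A_i/n_i) 1 1'
        lhs += x.T @ P @ x
        rhs += x.T @ P @ y
    beta_hat = np.linalg.solve(lhs, rhs)
    return xi_hat, tau_hat, beta_hat
```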
4. ASYMPTOTIC OPTIMALITY OF e_EB^(m)

In this section we establish the asymptotic optimality property (see Robbins 1955) of e_EB^(m). The Bayes risk of an estimator e^(m) of γ_m, under squared error loss, is defined as r(e^(m)) = E(e^(m) − γ_m)², where E is the expectation under model 1. The following theorem shows that e_EB^(m) is asymptotically optimal in the sense of Robbins (1955).

Theorem 4.1. Under model 1 and the conditions (a) 2 ≤ inf_{i≥1} n_i ≤ sup_{i≥1} n_i ≤ k (< ∞) and (b) sup_{1≤i≤m; 1≤j≤n_i} h_ij = O(m^{-1}), where h_ij = x_ij'(X'X)^{-1} x_ij, we have r(e_EB^(m)) − r(e_B^(m)) → 0 as m → ∞.

From now on, the regularity conditions (a) and (b) are referred to as RC. Next, we compare e_EB^(m) to the classical regression estimator e_R^(m) = (1 − f_m) Ȳ_m + f_m x̄_m*'β̂_OLS, where β̂_OLS = (X'X)^{-1}X'Y, in terms of the relative savings loss (RSL) introduced by Efron and Morris (1973). The RSL of e_EB^(m) relative to e_R^(m), denoted by RSL(e_R^(m), e_EB^(m)), is defined as

RSL(e_R^(m), e_EB^(m)) = [ r(e_EB^(m)) − r(e_B^(m)) ] / [ r(e_R^(m)) − r(e_B^(m)) ].

The RSL is the proportion of the possible Bayes risk improvement over e_R^(m) that is sacrificed by the use of e_EB^(m) instead of the ideal e_B^(m) under model 1. The smaller the value of RSL(e_R^(m), e_EB^(m)), the better the estimator e_EB^(m) compared to e_R^(m). The following theorem shows that e_R^(m) is not asymptotically optimal.

Theorem 4.2. Under model 1 and RC, inf_{m≥1} [ r(e_R^(m)) − r(e_B^(m)) ] > 0.

Theorems 4.1 and 4.2 yield the following theorem, which demonstrates the asymptotic behavior of RSL(e_R^(m), e_EB^(m)).

Theorem 4.3. Under model 1 and RC, RSL(e_R^(m), e_EB^(m)) → 0 as m → ∞.

5. MEASURE OF VARIABILITY OF e_B^(m) AND e_EB^(m)

A natural measure of variability of e_B^(m) is given by

V^(m) = V^(m)(Y_m, η) = var(γ_m | Y_m, η)
      = N_m^{-2} var( Σ_{j=n_m+1}^{N_m} Y_mj | Y_m, η )
      = N_m^{-1} f_m E(v_m | Y_m, η) + f_m² var(θ_m | Y_m, η)
      = N_m^{-1} f_m E(v_m | Y_m, η) + f_m² { var[ E(θ_m | Y_m, v_m, η) | Y_m, η ] + E[ var(θ_m | Y_m, v_m, η) | Y_m, η ] }
      = N_m^{-1} f_m E(v_m | Y_m, η) + f_m² (Ȳ_m − x̄_m'β)² var[ B_m(v_m) | Y_m, η ] + f_m² τ [ 1 − E( B_m(v_m) | Y_m, η ) ],   (3)

where B_m(v_m) = n_m τ/(v_m + n_m τ), and the expectations and the variance in the last equation of (3) are with respect to f(v_m | Y_m, η).

A naive measure of variability of the EB estimator e_EB^(m) is obtained from V^(m)(Y_m, η) when η is replaced by its estimator η̂. But this measure underestimates the true variability, because it does not incorporate the variability due to the estimation of the hyperparameter vector η. The literature on measures of variability of EB estimators in finite-population sampling is not very rich. Recently, NS proposed certain measures of variability of their EB estimators. But their proposed methods underestimate the true variabilities, because they do not incorporate the variabilities due to estimation of the different variance components of their model. To include the additional variabilities due to estimation of the different hyperparameters of our model, we extend the type III bootstrap method considered by Laird and Louis (1987). (Readers are referred to Efron and Tibshirani 1993 and Shao and Tu 1995 for detailed accounts of this bootstrap method and its applications.)

Following the Laird-Louis method, we generate the bootstrap samples as follows. For bootstrap replications r = 1, ..., R, we draw θ_i*, i = 1, ..., m, iid from N(0, τ̂) and v_i*, i = 1, ..., m, iid from G(ξ̂, δ̂). Then we draw the bootstrap samples Y_ij^{r*} (i = 1, ..., m; j = 1, ..., n_i) independently from N(x_ij'β̂ + θ_i*, v_i*). Following equation (10) of Laird and Louis (1987), we propose the following measure of the variability of e_EB^(m):

R^{-1} Σ_{r=1}^R V^(m)(Y_m, η̂_r*) + (R − 1)^{-1} Σ_{r=1}^R [ e_B^(m)(Y_m, η̂_r*) − ē^(m)(Y_m) ]²,

where ē^(m)(Y_m) = R^{-1} Σ_{r=1}^R e_B^(m)(Y_m, η̂_r*) and η̂_r* is the estimate of η based on the rth bootstrap sample.
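A Python sketch of this bootstrap measure follows (ours, not from the article). The helpers refit, bayes_estimator, and posterior_variance are assumed to be available (for example, along the lines of the earlier sketches) and their names and signatures are our own hypothetical choices.

```python
import numpy as np

def bootstrap_variability(y_list, x_list, y_m, x_m, xbar_star, N_m, eta,
                          refit, bayes_estimator, posterior_variance, R=500, rng=None):
    """Type III bootstrap measure of variability of e_EB^(m) (a sketch).
    eta = (beta, tau, xi, delta) are the hyperparameter estimates from the observed data.
    `refit` is assumed to return (beta, tau, xi, delta) from a data set; `bayes_estimator`
    and `posterior_variance` are assumed to return e_B^(m)(Y_m, eta) and V^(m)(Y_m, eta)."""
    rng = np.random.default_rng(rng)
    beta, tau, xi, delta = eta
    m = len(y_list)
    n_i = [len(y) for y in y_list]

    e_star, v_star = [], []
    for _ in range(R):
        # theta_i* ~ N(0, tau-hat); v_i* ~ gamma with mean xi-hat and variance delta-hat
        theta = rng.normal(0.0, np.sqrt(tau), size=m)
        v = rng.gamma(shape=xi ** 2 / delta, scale=delta / xi, size=m)
        # bootstrap responses Y_ij(r*) ~ N(x_ij' beta-hat + theta_i*, v_i*)
        y_boot = [rng.normal(x @ beta + th, np.sqrt(vi), size=k)
                  for x, th, vi, k in zip(x_list, theta, v, n_i)]
        eta_r = refit(y_boot, x_list)       # eta-hat_r*, re-estimated from the rth bootstrap sample
        # re-evaluate e_B^(m) and V^(m) at eta-hat_r*, holding the observed Y_m fixed
        e_star.append(bayes_estimator(y_m, x_m, xbar_star, N_m, *eta_r))
        v_star.append(posterior_variance(y_m, x_m, N_m, *eta_r))

    e_star, v_star = np.asarray(e_star), np.asarray(v_star)
    # average conditional variance + between-replicate variance of the Bayes estimates
    return v_star.mean() + e_star.var(ddof=1)
```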
6. A NUMERICAL EXAMPLE

In this section we study the performance of various estimators of finite-population means for small m and for real data. We did not consider simulation for this purpose, because it is model dependent. We consider the data set analyzed by Ghosh and Lahiri (1992); hereinafter denoted by GL. The data concern responses from students of Queen's University in Canada to the following question: "How many trips home do you estimate you have taken by the end of the academic year?" The students were from different disciplines and from 15 municipalities in Canada. A reduced version of this data set was earlier considered by Stroud (1987). As in GL, here municipalities are treated as the different finite populations. To demonstrate the performance of different estimators, samples of different sizes are taken from these finite populations, as in GL.

The covariate X is taken to be a function of the road distance between the municipality and Kingston (where Queen's University is located). As suggested by Stroud (1987), a reasonable X variable is the -1/2 power of the road distance of the municipality from Kingston. Observe that x_ij = x̄_i for all j (i = 1, ..., 15), with m = 15, because the value of the covariate remains the same within each municipality.

The estimates of ξ, τ, δ, and β are ξ̂ = 11.71, τ̂ = .25, δ̂ = 279.94, and β̂ = 72.76. The large value of the estimate δ̂ = 279.94 suggests that it is not reasonable to assume v_i = ξ (i = 1, ..., m), an assumption made in GL. The square root of the average squared deviation of the proposed estimator from the true mean is 2.01, compared to 2.62 for the sample mean, 2.52 for the GM estimator, 2.55 for the NS estimator, and 2.41 for the GL estimator. Note that the GL estimator is a particular case of the estimator proposed by Datta and Ghosh (1991). Thus the proposed EB estimator improves on the sample mean by 23%, on GM by 20%, on NS by 21%, and on GL by 16%. The various estimates and the true means for the 15 municipalities are reported in Table 1. The naive and the proposed bootstrap measures of variability of e_EB^(m) are also given in this table.

Table 1. Different Estimates and the Measures of Variability (Naive and Bootstrap) of the Proposed EB Estimates for the Data Set Given by Ghosh and Lahiri (1992)

Municipality        N   n    X      True mean  Sample mean   GM     NS     GL    Proposed  Naive var.  Bootstrap var.
Belleville          3   2   .1125    12.33       3.50        3.55   3.53   3.99    5.03       2.11        2.35
Brampton            3   3   .0594     5.00       5.00        5.00   5.00   5.00    5.00       0           0
Brockville          3   2   .1118    10.67      15.00       14.26  14.54  14.28   12.70       3.02        3.33
Calgary             5   3   .0171     1.20       1.00        1.19   1.11   1.04    1.03        .67         .75
London              3   2   .0476     4.00       3.00        3.08   3.05   3.05    3.08        .93        1.03
Mississauga         4   2   .0606     3.75       4.00        4.02   4.02   4.06    4.10       1.20        1.34
Montreal            3   2   .0579     4.00       4.00        4.02   4.01   4.02    4.05       1.11        1.21
Oakville            3   2   .0587     4.33       4.00        4.02   4.01   4.03    4.06        .99        1.09
Oshawa              3   2   .0712     5.33       5.00        4.95   4.97   5.02    5.04        .97        1.09
Ottawa             25   4   .0774     6.00       4.25        4.25   4.25   4.49    5.26       1.09        1.51
Pembroke            3   2   .0634     3.00       2.00        2.15   2.10   2.27    2.83       1.33        1.46
Sault Ste. Marie    4   2   .0337     3.00       2.50        2.68   2.61   2.49    2.46       1.51        1.61
Sudbury             3   3   .0413     2.67       2.67        2.67   2.67   2.67    2.67       0           0
Toronto            31   5   .0624     5.90       5.80        5.68   5.73   5.31    4.81       1.06        1.36
Vancouver           5   3   .0149     1.60       2.00        2.13   2.08   1.86    1.68        .72         .81
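The root average squared deviations quoted above can be checked directly from the "True mean" and estimate columns of Table 1; a brief sketch (ours) using the tabulated values:

```python
import numpy as np

# "True mean" column of Table 1, in the row order shown above
true = np.array([12.33, 5.00, 10.67, 1.20, 4.00, 3.75, 4.00, 4.33, 5.33,
                 6.00, 3.00, 3.00, 2.67, 5.90, 1.60])
estimates = {
    "Sample mean": [3.50, 5.00, 15.00, 1.00, 3.00, 4.00, 4.00, 4.00, 5.00,
                    4.25, 2.00, 2.50, 2.67, 5.80, 2.00],
    "Proposed EB": [5.03, 5.00, 12.70, 1.03, 3.08, 4.10, 4.05, 4.06, 5.04,
                    5.26, 2.83, 2.46, 2.67, 4.81, 1.68],
}
for name, est in estimates.items():
    rmse = np.sqrt(np.mean((true - np.array(est)) ** 2))
    print(f"{name}: root average squared deviation = {rmse:.2f}")
# Yields roughly 2.61-2.62 for the sample mean and 2.01 for the proposed estimator;
# the small gap for the sample mean reflects rounding of the tabulated entries.
```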
7. CONCLUSIONS

We have considered EB estimation of finite-population means under a fairly general Bayesian regression model. We have established the relevant asymptotic properties of the proposed EB estimator. We used the ANOVA method to estimate the hyperparameters; it is not known whether the asymptotic results would hold if the maximum likelihood or residual likelihood method were used instead. We have also developed a measure of variability of the proposed EB estimator. This measure incorporates all sources of variation due to the estimation of the various model parameters. The estimation of other finite-population characteristics (e.g., the finite-population variance) is currently under investigation.

APPENDIX: PROOFS

Throughout the Appendix, c is a generic finite constant independent of m. Define

G = Σ_{i=1}^m X_i'(I_i − A_i n_i^{-1} 1_i 1_i') X_i   and   k_ilmj = x_mj' G^{-1} x_il,

and write k_imj = (k_i1mj, ..., k_i n_i mj)' (i = 1, ..., m; j = 1, ..., n_m; l = 1, ..., n_i). Let Ĝ and k̂_ilmj denote the corresponding quantities with A_i replaced by Â_i. In the following, A > (≥) B means that A − B is positive (positive semi-) definite. The following two lemmas are used to prove the results given in Sections 3 and 4.

Lemma A.1. Under model 1 and RC, we have, for A_u = kτ/(ξ + kτ) and Â_u = kτ̂/(ξ̂ + kτ̂):

a. |k_ilmj| ≤ c m^{-1}(1 − A_u)^{-1}, i = 1, ..., m; j = 1, ..., n_m; l = 1, ..., n_i;
b. |k̂_ilmj| ≤ c m^{-1}(1 − Â_u)^{-1} a.s., i = 1, ..., m; j = 1, ..., n_m; l = 1, ..., n_i;
c. max_{1≤i≤m} |Â_i − A_i| = o_p(1);
d. m^{-1} Σ_{i=1}^m Σ_{l=1}^{n_i} (Y_il − x_il'β)² is uniformly integrable in m;
e. sup_{m≥1} E(Ȳ_m − x̄_m'β)^4 < ∞.

Proof. The proofs of parts (b) and (e) are similar to those of parts (a) and (d). The proof of part (c) is similar to that of lemma 2 of GM. Using algebra and RC, it can be shown that

G = Σ_{i=1}^m [ A_i Σ_{l=1}^{n_i} (x_il − x̄_i)(x_il − x̄_i)' + (1 − A_i) X_i'X_i ] ≥ (1 − A_u) X'X.   (A.1)

Using (A.1) and a familiar matrix result (see, e.g., Rao 1973, p. 70), we get

G^{-1} ≤ (1 − A_u)^{-1} (X'X)^{-1}.   (A.2)

Using (A.2), RC, and the Cauchy-Schwarz inequality, we get part (a). Using the fact that under model 1, (Y_il − x_il'β) | v_i ~ N(0, τ + v_i) (i = 1, ..., m), and RC, we get E{ [ Σ_{l=1}^{n_i} (Y_il − x_il'β)² ]² | v_i } ≤ c(τ + v_i)², so that E[ Σ_{l=1}^{n_i} (Y_il − x_il'β)² ]² ≤ c(τ² + ξ² + δ + 2τξ) < ∞. This completes the proof of part (d).

Lemma A.2. Under model 1 and RC,

a. [x_mj'(β̃ − β)]² is uniformly integrable in m, and
b. x_mj'(β̂ − β) = o_p(1) as m → ∞.

Proof. (a) Using matrix algebra, it can be shown that

x_mj'(β̃ − β) = Σ_{i=1}^m Σ_{l=1}^{n_i} k_ilmj (Y_il − x_il'β) − Σ_{i=1}^m A_i n_i^{-1} Σ_{l=1}^{n_i} Σ_{t=1}^{n_i} k_ilmj (Y_it − x_it'β).   (A.3)

Repeated applications of the Cauchy-Schwarz inequality to the sums over l and t in (A.3), the inequality (x + y)² ≤ 2(x² + y²), RC, and Lemma A.1(a) yield [x_mj'(β̃ − β)]² ≤ c(1 + k²τ²/ξ²) m^{-1} Σ_{i=1}^m Σ_{l=1}^{n_i} (Y_il − x_il'β)². Thus it is sufficient to prove that m^{-1} Σ_{i=1}^m Σ_{l=1}^{n_i} (Y_il − x_il'β)² is uniformly integrable in m, which is Lemma A.1(d). This completes the proof of part (a).
(A.7) i=l factor in (A.5)converges UsingLemmaA.1(c)in(A.7),thefirst is because Iz2 op(1), ElYij3- X1jI3 < c, forall i,j, j'. Now, to 0 in probability.Thus m IZ31= K Zn1-(A nf - A) max lA AiIZ -- )/g(zI i=l E kij'mj(Yij - X/j3) m fl nili ZEk%JImJ t j'=l Z(YijI-Xlj/!1) ~~j'=l1 i> 00 - gm(zIYnYm) gm(z, 77, Ym) - lIzl/2(b-1) /2Inmdz, az) (r +z)- Y ) = + z )1/2nm (A.8) 1/2(b-b) ~~~~~nm _ 2exp (a-a)z+ + Xmj (3 ;b m - g(z I^Y Ai)(A~i - Ai')(Zni1 kmjil) (En1 Thus the inequality(A.6) njj,k%Iil')}I. m j YM) gmr(z, 71,71, n11kilil/)}I ? + op(1) as wherea = 2t/b b = 22 /6 and zEm1 + cm2(1 J x exp (-2 The firstterm in (A.6) becomes IEi= n-' (Ai - A = I m ni-1( (iG-lXl1i)(Xj,G-lvXli)'I Ai)(En,l Xmj = n7(Ai X/,G-1Xij)'I G-1Xij)(Znl -Ai)(Eni kmjii)(l >i k j'%l)I.The second termon the rightside of (A.6) n 2{( can be shown to be less than or equal to ZI E=l i Ai)2(Enz1 kmjil)(En k,1%j,)(Znl kiljil ?+2 I-TI x k1j'mj lmjEmXij' - - )-lQ/G- = 00. Proof. Using considerablealgebra,we have nm(Am - Am)} using a familiarmatrixresult(see Rao 1973, p. 33), G-l -z Qr-T)2( 1T?Z + ) - m (y 3J=1 j3) 'j(13 X+ j 2(T + z) ] Now note that [ gg (z, Ym) dz]l < exp(Um/2-) [E{l(z)}]-1, where Um = Z j__(Y3 -X, j,3)2,l(z) = (T + Z)-1/2nm and E is expectationwithrespectto G((, 6). Because 1(z) is a convex functionin z, by Jensen'sinequality, [E{l(z)}1-1 < E(r + z)1/2Ko, where Ko = k if r + z > 1 and Ko = 2 otherwise.Thus [f0 gm(z|r7Ym)dz]-l < cexp(Um/2T). Because exp(Um/2--) = Op(l), we have so in view of (A.8) and = and Op(l), [f0`'gm(z1mYnYm)]'- T = op(1), we are leftto show j t00 gm(z 7q, Ym) - x exp (- llz1/2(b-1) az) (T+z /2nidz = op(1). (A.9) To prove (A.9), suppose that {mi } is a given subsequence. Note that by Lemma A.l(e), Ej=Z1 (Ym1j - X'm1j3)2 and j-Xm1i (i + /mj)] are both bounded in probaEj=Z[2Y1 bility,because the expectationsare uniformlybounded. Also, = Tm1 -T op(1) along the subsequence {mi }. Therefore, = op(1) along the subse- X'1lI3)2 (~rnl -T)ZEj]l(Ymnij quence {ml }. Thus thereexists a subsequence {m2 } of {mi } suchthat(Ti2 - T) Ejl (Ym23 - Xm2jfl)2 a~ 0. Usinga sim- This content downloaded from 193.0.66.11 on Fri, 25 Jul 2014 12:35:48 PM All use subject to JSTOR Terms and Conditions Arora,Lahiri,and Mukherjee:Estimationof FinitePopulationMeans 1561 ilar argument, E(1 - wm)2(Ym -X 3) E.=71XM3 ( - YM - 33(13+ d)} O 0 along a subsequence {m3} of {m2}. Proceedingin this way, - 2E[Xm * (OLS we finallyget a grand subsequence,mgrand (whichforthe con1-) (I 1 Wm) (m-)]m Xm venience of writingis denoted by m, if there is no fear of As in theproofofEA' = o(1), we can showthat confusion),along which a - a, b - b,(= - r) Z72m (Y3 and E,=ml X3 (3 - 13)[2Ymj - X$(d + )] tendto 0 E[Xm*COoLs almost surely.Therefore,we must show that along this subseWe also showthat quence {m}, Xm33)2J, y liminfE[(1 -Wm)2 m-moo gm(z77/7/v Ym) - 1zl/2(b-1)e-1/2az(w 0. + z)-1/2nm dz a (As10) -3)]2 (A. 12) (A.13) = o(1). 3)2] > 0. (Ym-X (A.14) Becausethesecondtermon therightside of (A.12) is of order o(1), by an application of theCauchy-Schwarz inequality, (A.13) and (A.14),Theorem4.1 will thenbe proved.To prove (A.14),notethatunlikeinGM,herewmis a random variable, and thuswe cannottake(1- wM)2 outsidetheexpectations. Foreach c > 0, notethat Now pick an w outsidethenullset.Treata(w), b(w), and so on as fixedsequencesof real numbersbecause w is fixed.(They are no longerrandomfor the subsequentdiscussion.)To prove (A.10), break the integralinto two pieces: one over the interval(0, 1 -r) and the otherover (1 - w,oc). 
If 1 - -r < 0, thenbreaking _W)2 (Y_ -X/n)2 is not necessary.For the integralover (0, 1 - -r), the integralis E(I finiteand the integrandtendsto 0, as can be seen easily.For the > E(1 - Xm)2 I(Tm < c), (A. 15) Y second integral,z + -r > 1, and hence it is sufficient to prove + am) Y - lz1/2(b-1) exp(-1/2az) dz -* 0 because whereTm= E(am IYm). Usingthefactthatnm-r/(nmT f00gm(z ,mq,YM) is increasing in 1nm, Jensen's inequality, and the RC we get (z + r)-1/2nm < 1. Also, d = f00zl/2(bz-1) exp(-1/2az) dz < + am)jYm]> 2T/[2T+ E(amjYm)]. Thusthe Wm> E[2w/(2w o0, so we can treat h(z) = d-1zl/2(b-l) exp(-1/2az), 0 < z < o0, as a probability 77,?R, densityand provethat Ym)ggm(z, f0 7/v Ym)IIh(z) dz - 0. Now foreach fixedz C (O,oo), gm(Z'7/v -* 0. Using the factthat(r + z)/(# + z) < 2, (because r), b-b < 2,-(a-a) < a/4, (=)Z m(Yn m-X 13)2< 1, and 1 rightsideof (A.15) > (J2 + ) E[(Ym -Xm/3)2I(Tm < c)] i){2Y -m -Xm3(13 + 3)} < 1, forlarge m and E>=1 =(J2+ )2 {E(Ym-Xm3)E(Ym - Xm)2I(Tm > c)} some algebra,we can show thatsupm>l foJ0g (z,m2 ,Yi)h(z) dz < o0. This proves that {gm(z,77<it,Ym)} - 1 is uniformly > > )} ) { E(Ym - X/ )2 (A.16) integrablewithrespectto the probabilitymeasure h(z), so that the limitcan be takeninside the integralwithrespectto z and The last inequalityin (A.16) is obtainedfromthe factthat thustheintegralconvergesto 0 (foreach w). E(Ym- Xm3)2 = r + n1 > -. Notethat(Ym- X'3)2 is uni- ( in m and{Tm} is boundedinprobability. formly integrable Thus A.1 Proofof Theorem4.1 First,notethat r(e(mB) - r(e(m)) - E(e(m) - e(m))2 < 3[EA 2 + EA2 + EA3] (A.ll) where A1 = fm-(wm-m)(m m3), A2 = fmmiXm( - 13), and A3 = fm(Xm- Xm)/( - ). Using TheoremA.2, the fact that fm< 1 and supm>I E(Ym - X )2 < oc, which followsfromLemma A.l(e), one gets A1 = op(l).A2 = op() < 1 and Xm(/3 - 13) = op(l) (see Lemma A.2). because fmwbm Also, A3 = op(1), whichcan be provedin a similarway as in the Lemma A.2. Thus e (m)- e(m) = op(1) as m -* oc. Now, note - X thatA2 <(Y< integrablein m 3)2, which is uniformly thereexists 5* > 0 such thatsupm>I E(Ym - X /3)2I(Tm > c) < w/2 wheneverP(Tm > c) < *. Choose c(< oc) sufficiently large so that P(Tm > c) < 6* for all m > 1. Therefore,for sufficientlylarge c, SUPm>1 E(Y and hence E(Ym - -Xm)2I(Tm X/mn3)2(1 -WM)2 > > c) < T/2 [2r/(2w+ c)]2w/2 for all large m and c. Thus lim inf m, E(Ym - Xm)2(1)2 > 0. This completestheproofof Theorem4.2. Wm [ReceivedApril1994. RevisedOctober1996.] REFERENCES Arora,V. (1994), "Empiricaland HierarchicalBayes Estimationof Small Area Characteristics" unpublishedPh.D. thesis,University of Nebraska(see Lemma A. 1). Also, A2 is uniformly integrablein m, by using Lincoln,Dept. of Mathematicsand Statistics. LemmaA.2(a). The proofof uniform of A3 is similar integrability to thatof X/(-) and is omittedto save space. The theorem Datta, G., and Ghosh,M. (1991), "Bayesian Predictionin Linear Models: Applicationto Small Area Estimation,"The Annals of Statistics,19, now followsfrom(A.l 1). 1748-1770. Efron,B., and Morris,C. (1973), "Stein'sEstimationRule and Its CompetiA.2 Proofof Theorem4.2 tors:An EmpiricalBayes Approach,"JournaloftheAmericanStatistical Association,68, 117-130. First,notethat Efron,B., and Tibshirani,R. J. (1993), An Introduction to theBootstrap, - r(e im)) r(e(m) New York: Chapmanand Hall. Ghosh,M., and Lahiri,P. (1987), "RobustEmpiricalBayes Estimationof (oLs -)]2} Means From StratifiedSamples," Journalof the AmericanStatistical =f {E[Xm Association,82, 1153-1162. 
Ghosh, M., and Lahiri, P. (1992), "A Hierarchical Bayes Approach to Small Area Estimation With Auxiliary Information" (with discussion), in Bayesian Analysis in Statistics and Econometrics, Berlin: Springer, pp. 107-125.
Ghosh, M., and Meeden, G. (1986), "Empirical Bayes Estimation in Finite Population Sampling," Journal of the American Statistical Association, 81, 1058-1062.
Laird, N. M., and Louis, T. A. (1987), "Empirical Bayesian Confidence Intervals Based on Bootstrap Sampling," Journal of the American Statistical Association, 82, 739-750.
Nandram, B., and Sedransk, J. (1993), "Empirical Bayes Estimation for the Finite Population Mean on the Current Occasion," Journal of the American Statistical Association, 88, 994-1000.
Prasad, N. G. N., and Rao, J. N. K. (1990), "The Estimation of Mean Squared Errors of Small-Area Estimators," Journal of the American Statistical Association, 85, 163-171.
Rao, C. R. (1973), Linear Statistical Inference and Its Applications, New York: Wiley.
Robbins, H. (1955), "An Empirical Bayes Approach to Statistics," in Proceedings of the 3rd Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, Berkeley, CA: University of California Press, pp. 157-163.
Shao, J., and Tu, D. (1995), The Jackknife and the Bootstrap, New York: Springer.
Stroud, T. W. F. (1987), "Bayes and Empirical Bayes Approaches to Small Area Estimation," in Small Area Statistics: An International Symposium, eds. R. Platek, J. N. K. Rao, C. E. Särndal, and M. P. Singh, New York: Wiley, pp. 124-137.