ASYMPTOTIC MULTIVARIATE NORMALITY FOR THE SUBSERIES VALUES OF A GENERAL STATISTIC FROM A STATIONARY SEQUENCE -- WITH APPLICATIONS TO NONPARAMETRIC CONFIDENCE INTERVALS

E. Carlstein
University of North Carolina, Chapel Hill

Summary

Let {Z_i : -∞ < i < +∞} be a strictly stationary α-mixing sequence with unknown marginal distributions and unknown dependence structure. Suppose that, given the data Z^i_m = (Z_{i+1}, Z_{i+2}, ..., Z_{i+m}), the statistic s^i_m = s_m(Z^i_m) is a point estimator of the unknown parameter θ. If a sample Z^0_n of the series is available, then the subseries values s^i_m (0 ≤ i < i+m ≤ n) may be used to construct a nonparametric confidence interval on θ, via either Student's distribution or the Typical Value principle. The asymptotic justification for both methods rests upon a more general result regarding asymptotic multivariate normality of subseries values.

Supported by NSF Grant DMS-8400602.

Running Heading: Subseries-based Confidence Intervals
AMS 1980 Subject Classification: Primary 62G15; Secondary 62E20.
Keywords and Phrases: Confidence intervals, subseries, dependence, typical values, Student's t, jackknife, nonparametric, stationary, multivariate normality.

1. Introduction.

The jackknife (Tukey, 1958) and the typical-value principle (Hartigan, 1969) both employ subsample values of a general statistic as the building blocks of confidence intervals on an unknown parameter. The idea behind the jackknife is that the "pseudovalues" (which are based upon subsamples of the data) are approximately iid normal random variables; hence, they can be used to construct approximate confidence intervals based on Student's distribution. Hartigan (1975) gave sufficient conditions for the asymptotic multivariate normality of these pseudovalues, thus providing a theoretical justification for the jackknife procedure.
The idea behind the typical-value principle is that certain sets of subsample values of a statistic are (approximately) "typical values" for the unknown parameter; i.e., each of the intervals between the ordered subsample values will include the unknown parameter with (approximately) equal probability. Hartigan (1975) gave sufficient conditions for a set of subsample values to behave asymptotically like a set of typical values, thus providing a theoretical justification for the concomitant confidence intervals.

All of the work discussed above deals with subsample values of a general statistic computed from iid data. If there is dependence in the data, then there is an even greater need for confidence interval techniques that free the user from parametric modeling and theoretical analysis. Indeed, the dependence structure would provide one more unknown in the model and one more complication in the theory.

The present paper provides confidence interval procedures for the dependent case. The procedures are analogous to the jackknife and typical-value methods both in physical form and in spirit: "subseries" of the data are employed rather than "subsamples" (the former referring only to subsamples composed of successive observations). On the one hand, equi-lengthed non-overlapping subseries values of a general statistic are shown to be asymptotically distributed as iid normal random variables. This justifies the use of confidence intervals based on Student's distribution, analogously to the jackknife procedure. On the other hand, certain sets of linear combinations of subseries values behave asymptotically like a set of typical values. This justifies the use of these entities in constructing confidence intervals via the typical-value principle.
The theoretical justifications that Hartigan (1975) provided for the jackknife and typical-value methods (in the case of iid data) both rested on a more general result: his Theorem 2 gave conditions under which (possibly overlapping) subsample values of a general statistic are asymptotically multivariate normal (with possibly nonzero covariances). Our theoretical justifications for confidence intervals based on Student's distribution and on the typical-value principle (under dependence) are likewise based upon a more general result: Theorem 1 (below) establishes conditions under which (possibly overlapping) subseries values of a general statistic are asymptotically multivariate normal (with possibly nonzero covariances). Quite surprisingly, the conditions in Theorem 1 are virtually identical to those in Hartigan's (1975) Theorem 2, even though the former permits dependence in the data while the latter forbids it.

The results presented below assume no knowledge of the marginal distributions of the data beyond stationarity. The only assumption about the dependence is that it be α-mixing (a relatively weak assumption in the mixing hierarchy; see Ibragimov and Linnik, 1971). The literature contains no other analogies to the jackknife or the typical-value methods for dependent data. Freedman (1984) has considered applying the bootstrap to a linear model with autoregressive component; but, as he emphasizes, the bootstrap calculations assume that the user has correctly specified the form of the underlying autoregressive model.

The next section presents basic notation and definitions. Section 3 contains the fundamental asymptotic multivariate normality result for subseries values of a general statistic from an α-mixing sequence (Theorem 1). Corollaries 1 and 2 (in Section 4) provide the justifications for confidence intervals based on Student's distribution and the typical-value principle. The proof of Theorem 1 is deferred to Section 5.
2. Definitions and Notation.

Let {Z_i(ω) : -∞ < i < +∞} be a strictly stationary sequence of real-valued random variables (r.v.s) defined on the probability space (Ω, F, P). Let F^+_p (respectively F^-_q) be the σ-field generated by {Z_p(ω), Z_{p+1}(ω), ...} (respectively {..., Z_{q-1}(ω), Z_q(ω)}). For N ≥ 1 denote:

α(N) = sup{ |P{A ∩ B} − P{A}·P{B}| : A ∈ F^+_N, B ∈ F^-_0 },

and define α-mixing to mean lim_{N→∞} α(N) = 0.

For each m ≥ 1 let t_m(z_1, ..., z_m) be a measurable function from R^m to R^1, and write t^i_m = t_m(Z_{i+1}, ..., Z_{i+m}), suppressing the argument ω of Z_i(ω). E, V, and C denote expectation, variance, and covariance respectively. Indicator functions are denoted by I{·}. For B ≥ 0 denote the truncation ^B X = X·I{|X| < B} and the remainder _B X = X − ^B X.

Let {a_n} be a sequence of real vectors, and let A be a set of conditions to be satisfied by the a_n's as n → ∞ (e.g. |a_n| → ∞). Then the notation lim_A x_{a_n} = x means that, for a single finite constant x, lim_{n→∞} x_{a_n} = x for all sequences {a_n} satisfying A.

3. Main Result.

For each n ≥ 1 the data Z^0_n from {Z_i} is available. Consider an arbitrary but fixed number k ≥ 1 of subseries values {T_in : 1 ≤ i ≤ k}, where T_in = t^{m_in}_{r_in}. Of course we must have 0 ≤ m_in < m_in + r_in ≤ n ∀ i, n, so that each subseries is in fact contained in Z^0_n. Assume r_in/n → p_i² > 0 as n → ∞ for each i, so that none of the subseries are asymptotically negligible. Also assume m_in/n → u_i as n → ∞ for each i, so that the proportion of overlap between subseries is asymptotically constant. In fact the asymptotic covariances between the T_in's depend precisely upon the limiting proportions of overlap. Therefore it is convenient to pool the 2k integers {m_in, m_in + r_in : 1 ≤ i ≤ k} for fixed n, and to order and relabel them as C_n = {c_ℓn : 1 ≤ ℓ ≤ 2k}. Thus 0 ≤ c_1n ≤ c_2n ≤ ... ≤ c_{2k,n} ≤ n, where c_1n = min_{1≤i≤k} {m_in} and c_{2k,n} = max_{1≤i≤k} {m_in + r_in}.
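To fix ideas, the indexing convention Z^i_m = (Z_{i+1}, ..., Z_{i+m}) can be sketched in Python as follows. This is a minimal illustration only: the sample-mean choice of statistic and the simulated data are assumptions for the demo, not part of the framework above.

```python
import numpy as np

def subseries_value(z, i, m, stat):
    """Compute a subseries value stat(Z^i_m), where Z^i_m = (Z_{i+1}, ..., Z_{i+m}).

    With 0-based arrays, the paper's subseries Z^i_m is the slice z[i : i + m];
    the constraint 0 <= i < i + m <= n keeps the subseries inside Z^0_n.
    """
    assert 0 <= i and i + m <= len(z), "need 0 <= i < i + m <= n"
    return stat(z[i:i + m])

# Illustration on simulated data, with the sample mean playing the role of s_m.
rng = np.random.default_rng(0)
z = rng.standard_normal(100)   # stands in for the observed series Z^0_n
val = subseries_value(z, i=10, m=20, stat=np.mean)
print(val)
```

Overlapping subseries (as allowed in Section 3) are obtained simply by choosing starting points i whose windows [i, i+m) intersect.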
To avoid trivial complications, we assume that the ranks of {m_in, m_in + r_in : 1 ≤ i ≤ k} remain the same for all n. That is, if we define I_n(i) to be the rank of m_in in C_n (i.e. c_{I_n(i),n} = m_in), we may write I(i) ≡ I_n(i). Similarly the rank of m_in + r_in in C_n may be denoted J(i). Finally we define γ_ℓ = lim_{n→∞} (c_{ℓ+1,n} − c_{ℓn})/n for 1 ≤ ℓ ≤ 2k−1; in particular Σ_{ℓ=I(i)}^{J(i)−1} γ_ℓ = p_i².

The statistics defined by {t_m(·) : m ≥ 1} will be called central with parameter σ² if:

(I) lim_{A→∞} lim_{n→∞} P{|t^0_n| ≥ A} = 0; and

(II) lim_{A→∞} lim_{n→∞} |E{^A t^0_n}| = 0; and

(III) lim_{A→∞} lim_A E{^A t^0_{w_n} · ^A t^{v_n}_{u_n}} = σ²·p, ∀ p² ∈ [0,1], where A is the set of conditions u_n/w_n → p², w_n ≥ v_n + u_n ≥ u_n → ∞.

These conditions are virtually identical to those set forth by Hartigan (1975) in his definition of centrality for the independent case. Condition (I) controls the tails of t^0_n's distribution; (II) centers the statistic; (III) requires the statistic to have covariance behavior analogous to the sample mean of iid r.v.s: the squared correlation between the statistic and its subseries value should be approximately equal to the proportion of overlap. Similarly to Hartigan's (1975) result for the independent case, centrality (as defined above) implies asymptotic multivariate normality of T_n = (T_1n, T_2n, ..., T_kn) in the α-mixing case.

Theorem 1: Let {Z_i} be α-mixing. If {t_m(·)} are central with parameter σ² > 0, then T_n →_D N_k(0, Σ) as n → ∞. The entries of Σ are:

σ_ij = σ² · Σ_{ℓ=max{I(i),I(j)}}^{min{J(i),J(j)}−1} γ_ℓ / (p_i · p_j),

where empty sums are zero.

Note that centrality does not even assume the existence of t^0_m's moments. Carlstein (1985) establishes centrality for sample means, sample fractiles, and smooth functions of central statistics, all under α-mixing.
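The simplest use of Theorem 1 is with equal-length, non-overlapping subseries (the configuration exploited for the Student-t intervals of Section 4), where Σ reduces to σ²·I_k. The construction can be sketched numerically as follows; the AR(1) model, the sample-mean statistic, and the hard-coded t critical value are illustrative assumptions only, not prescriptions from the paper.

```python
import numpy as np

def subseries_t_interval(z, k, stat, t_crit):
    """Confidence interval for theta from k equal-length, non-overlapping
    subseries values, treating the studentized pivot as Student-t on k-1 d.f.

    stat estimates theta from a stretch of the series; t_crit is the
    user-chosen Student-t critical value with k-1 degrees of freedom.
    """
    n = len(z)
    r = n // k                                # subseries length r_n
    vals = np.array([stat(z[i * r:(i + 1) * r]) for i in range(k)])
    center = vals.mean()                      # s-bar_n
    se = vals.std(ddof=1) / np.sqrt(k)        # (sum (s_i - s-bar)^2/(k-1))^(1/2) / k^(1/2)
    return center - t_crit * se, center + t_crit * se

# Demo on a stationary AR(1) series (an alpha-mixing example); theta = E[Z] = 0.
rng = np.random.default_rng(1)
eps = rng.standard_normal(2000)
z = np.empty(2000)
z[0] = eps[0]
for t in range(1, 2000):
    z[t] = 0.5 * z[t - 1] + eps[t]

lo, hi = subseries_t_interval(z, k=8, stat=np.mean, t_crit=2.365)  # t_{0.975, 7}
print(lo, hi)
```

The point of the subseries construction is visible here: the k = 8 blockwise means inherit the dependence of the series, yet their spread automatically calibrates the interval without any model for that dependence.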
As a practical matter, the following alternative conditions (which are analogous but more restrictive) are known to imply centrality under α-mixing (see Carlstein (1985)):

(I') {t^0_n} are uniformly square-integrable; and

(II') lim_{n→∞} E{t^0_n} = 0; and

(III') lim_A (w_n/u_n)^{1/2} · C{t^0_{w_n}, t^{v_n}_{u_n}} = σ², where A is the set of conditions w_n ≥ v_n + u_n ≥ u_n → ∞.

4. Applications to Confidence Intervals.

Throughout this section assume the following set-up: The statistic s^i_m = s_m(Z^i_m) is wholly computable from the data Z^i_m, and does not depend upon any unknown parameters. Furthermore, s^i_m estimates the unknown parameter θ. Finally, put t^i_m = (s^i_m − θ)·m^{1/2}. The notation of Section 3 remains unchanged.

Corollary 1: Let r_in = r_n and m_in = (i−1)·r_n ∀ i ∈ {1, ..., k} and ∀ n, with: r_n/n → p² > 0 and 1 ≤ r_n ≤ k·r_n ≤ n ∀ n. If {Z_i} is α-mixing and {t_m(·)} are central with parameter σ² > 0, then T_n →_D N_k(0, σ²·I_k) as n → ∞.

Proof: Immediate from Theorem 1. □

Analogously to Hartigan's (1975) justification for the jackknife in the independent case, Corollary 1 provides the asymptotic justification in the α-mixing case for treating

S_n = (s̄_n − θ)·k^{1/2} / ( Σ_{i=0}^{k−1} (s^{i·r_n}_{r_n} − s̄_n)² / (k−1) )^{1/2}

as a r.v. with Student's distribution on k−1 degrees of freedom. (Here we denote s̄_n = Σ_{i=0}^{k−1} s^{i·r_n}_{r_n} / k.) Observe that S_n is free of the nuisance parameter σ², and hence may be used as a pivot for constructing a confidence interval on θ.

Corollary 2 shows that, even under α-mixing, the subseries values s^i_m can be used to construct a set of statistics which are (asymptotically) typical values for θ.

Corollary 2: Let m_1n = 0 and m_{i+1,n} = m_in + r_in ∀ i ∈ {1, ..., k−1} and ∀ n, with: r_in/n → p_i² > 0 ∀ i ∈ {1, ..., k}, and 1 ≤ r_jn ≤ Σ_{i=1}^{k} r_in ≤ n ∀ j ∈ {1, ..., k} and ∀ n. Let ℓ ∈ {1, ..., k} be arbitrary but fixed, and denote K = {i ∈ {1, ..., k} : i ≠ ℓ}. Define, for each i ∈ K and ∀ n:

V_in = ( r_in^{1/2}·s^{m_in}_{r_in} + r_ℓn^{1/2}·s^{m_ℓn}_{r_ℓn} ) / ( r_in^{1/2} + r_ℓn^{1/2} ).
If {Z_i} is α-mixing and {t_m(·)} are central with parameter σ² > 0, then:

lim_{n→∞} P{ Σ_{i∈K} I{V_in < θ} = N } = 1/k   for each N ∈ {0, 1, ..., k−1}.

In particular: P{ min_{i∈K} V_in < θ < max_{i∈K} V_in } → 1 − 2/k.

Proof: By Theorem 1, T_n →_D N_k(0, σ²·I_k) as n → ∞. Without loss of generality, put σ² = 1. For each i ∈ K, denote T̃_in = T_in + T_ℓn; also denote T̃_n = (T̃_in : i ∈ K). Then T̃_n →_D N_{k−1}(0, Σ̃) as n → ∞, where Σ̃ = I_{k−1} + [1]_{(k−1)×(k−1)}. Using the logic in the second paragraph of Hartigan's (1975) proof of his Theorem 6, we may conclude that the event A_Nn = {exactly N of the T̃_in's, i ∈ K, are less than 0} has asymptotic probability 1/k (for each N ∈ {0, 1, ..., k−1}). But since T̃_in = (V_in − θ)·(r_in^{1/2} + r_ℓn^{1/2}), the event A_Nn is equivalent to the event {Σ_{i∈K} I{V_in < θ} = N}. □

5. Proof of Theorem 1.

Without loss of generality, put σ² = 1. For each ℓ ∈ {2, 3, ..., 2k} define {d_ℓn : n ≥ 1} and {w_ℓn : n ≥ 1} as follows: d_ℓn = c_ℓn − c_{ℓ−1,n}. If γ_{ℓ−1} = 0 then w_ℓn = 0 ∀ n; otherwise we can choose w_ℓn s.t. 0 ≤ w_ℓn ≤ d_ℓn ∀ n, w_ℓn → ∞, and w_ℓn/d_ℓn → 0 as n → ∞.

For each ℓ ∈ {1, 2, ..., 2k−1} and each n, denote t_ℓn = t^{c_ℓn + w_{ℓ+1,n}}_{d_{ℓ+1,n} − w_{ℓ+1,n}}, the statistic computed on the stretch of Z^0_n from index c_ℓn + w_{ℓ+1,n} up to index c_{ℓ+1,n}.

We shall begin by showing that:

B_in := | T_in − Σ_{ℓ=I(i)}^{J(i)−1} γ_ℓ^{1/2}·t_ℓn / p_i | →_P 0  as n → ∞, for each i ∈ {1, ..., k}.

Observe that, for A > 0,

B_in ≤ |_A T_in| + Σ_{ℓ=I(i)}^{J(i)−1} γ_ℓ^{1/2}·|_A t_ℓn| / p_i + | ^A T_in − Σ_{ℓ=I(i)}^{J(i)−1} γ_ℓ^{1/2}·^A t_ℓn / p_i |.   (1)

Let ε > 0 be given. By (I), lim_{A→∞} lim_{n→∞} P{|_A T_in| > ε} = 0 and lim_{A→∞} lim_{n→∞} P{γ_ℓ^{1/2}·|_A t_ℓn| > ε} = 0 for each ℓ. Let B_in(A) denote the term within the last modulus of equation (1). We have:

P{|B_in(A)| > ε} ≤ ε^{−2}·E{B_in(A)²}
  ≤ ε^{−2}·[ |E{(^A T_in)²} − 1|
    + Σ_{ℓ=I(i)}^{J(i)−1} ( 2·γ_ℓ^{1/2}·|E{^A T_in · ^A t_ℓn} − γ_ℓ^{1/2}/p_i| / p_i + γ_ℓ·|E{(^A t_ℓn)²} − 1| / p_i² )
    + Σ_{ℓ≠ℓ'} γ_ℓ^{1/2}·γ_{ℓ'}^{1/2}·|E{^A t_ℓn · ^A t_{ℓ'n}}| / p_i² ].   (2)

Now take lim_{A→∞} lim_{n→∞} of both sides in equation (2). Except for the terms within the last summation, each term on the r.h.s. of (2) vanishes by (III).
Those remaining summands with γ_ℓ·γ_{ℓ'} ≠ 0 are each dominated by |C{^A t_ℓn, ^A t_{ℓ'n}}| + |E{^A t_ℓn}·E{^A t_{ℓ'n}}|. Now taking lim_{A→∞} lim_{n→∞} we obtain zero, since the covariance term is bounded by 4A²·α(c_{ℓ'n} + w_{ℓ'+1,n} − c_{ℓ+1,n}) for ℓ < ℓ' (see Ibragimov and Linnik, p. 306), and since (II) applies to the expectation terms. Combining these results establishes B_in →_P 0 for each i.

Let λ = (λ_1, λ_2, ..., λ_k) ∈ R^k be fixed. It will suffice to consider the asymptotic distribution of G_n := Σ_{i=1}^{k} Σ_{ℓ=I(i)}^{J(i)−1} λ_i·γ_ℓ^{1/2}·t_ℓn / p_i in place of that of λ·T_n, because |λ·T_n − G_n| ≤ Σ_{i=1}^{k} |λ_i|·B_in →_P 0.

Write G_n = Σ_{ℓ=1}^{2k−1} g_ℓ·t_ℓn, where g_ℓ = γ_ℓ^{1/2}·Σ_{i=1}^{k} λ_i·I{I(i) ≤ ℓ ≤ J(i)−1} / p_i.

Define r.v.s {t'_ℓn : 1 ≤ ℓ ≤ 2k−1, n ≥ 1} to have the same marginal distributions as {t_ℓn : 1 ≤ ℓ ≤ 2k−1, n ≥ 1}, but with {t'_ℓn : 1 ≤ ℓ ≤ 2k−1} independent for fixed n ≥ 1. Let G'_n = Σ_{ℓ=1}^{2k−1} g_ℓ·t'_ℓn. Then for every u ∈ R:

|E{exp{iuG_n}} − E{exp{iuG'_n}}| ≤ 16·Σ_{ℓ: γ_{ℓ−1} ≠ 0} α(w_ℓn) → 0 as n → ∞,

by the argument of Ibragimov and Linnik (p. 338). Hence we may simply consider the asymptotic distribution of G'_n in place of that of λ·T_n.

By Theorem 6 of Carlstein (1985), each t'_ℓn is marginally asymptotically N(0,1) (provided γ_ℓ ≠ 0). And since {t'_ℓn : 1 ≤ ℓ ≤ 2k−1} are independent, we may conclude that G'_n →_D N(0, Σ_{ℓ=1}^{2k−1} g_ℓ²). Observe that Σ_{ℓ=1}^{2k−1} g_ℓ² = Σ_{i=1}^{k} Σ_{j=1}^{k} λ_i·λ_j·σ_ij for σ_ij as defined in Theorem 1. Since this argument holds for each λ ∈ R^k, the proof is completed. □

References

Carlstein, E. (1985). Asymptotic Normality for a General Statistic from a Stationary Sequence. To appear in Ann. Prob.

Freedman, D. (1984). On Bootstrapping Two-Stage Least-Squares Estimates in Stationary Linear Models. Ann. Stat., 12, 827-842.

Hartigan, J. (1969). Using Subsample Values as Typical Values. JASA, 64, 1303-1317.

Hartigan, J. (1975). Necessary and Sufficient Conditions for Asymptotic Joint Normality of a Statistic and its Subsample Values. Ann. Stat., 3, 573-580.

Ibragimov, I. and Linnik, Y. (1971). Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff, The Netherlands.

Tukey, J. (1958). Bias and Confidence in Not-quite Large Samples. Ann. Math. Stat., 29, 614.