Work supported by the National Institute of Health, Public Health Service, Grant GM-I03~7 and the United States Air Force Grant AFOSR-68-1415. A SEQUENTIAL FIXED-WIDTH CONFIDENCE INTERVAL FOR THE MEAN OF AU-STATISTIC by RAYMOND NELSON SPROULE Department of Statistics University of North Carolina at Chapel Hill Institute of Statistics Mimeo Series No. 636 AUGUST 1969 RAYMOND NELSON SPROULE. A sequential fixed-width confidence interval for the mean of a U-statistic. w. (Under the direction of J. HALL.) Preliminary properties of aU-statistic U , introduced by n Hoeffding (Ann. Math. Statist. 21 (1948) 293-325), are developed along with an investigation of several related statistics. The problem of estimating n Var(U } is considered in some detail, n with emphasis placed on the asymptotic nature of the estimates. Two prime candidates emerge each possessing good asymptotic properties. Hoeffding (Univ. of North Carolina, Inst. of Statist., Mimeo Series No. 302 (1961» showed that a U-statistic may be expressed as an average of independent and identically distributed random variables plus a remainder term. A Ko1mogorov-1ike inequality for this remainder term is developed and its (a.s.) convergence properties are examined. These properties are then related to the U-statistic. In addition, using a result of Anscombe (Proc. Cambridge Philos. Soc. ~ (1952) 600-607), the asymptotic normality of UN' where N is a positive integer-valued random variable, is established under certain conditions. Equipped with the preceding results, a sequential fixed-width confidence interval for the mean of a U-statistic, having coverage probability approximately equal to some preassigned a, is developed. It is also shown that the sequential procedure (or confidence interval) is asymptotically efficient, in the sense of Chow and Robbins (Ann. Math. Statist. 36 (1965) 457-462). """ A SEQUENTIAL FIXED-WIDTH CONFIDENCE INTERVAL FOR THE MEAN OF AU-STATISTIC by Raymond Nelson Sproule A thesis submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Statistics Chapel Hill 1969 Approved by: Adviser ii TABLE OF CONTENTS CHAPTER I II III PAGE INTRODUCTION 1 PRELIMINARY CONCEPTS AND RESULTS 3 2.1 Introduction 3 2.2 Functionals 3 2.3 U-statistics 4 2.4 The H-decompoaition 9 2.5 The W-statistic 15 2.6 The S-decomposition 17 2.7 The Z-statistic 18 ESTIMATION OF THE VARIANCE OF AU-STATISTIC 29 3.1 Introduction 29 3.2 The U-statistic estimate of P1 30 3.3 Estimation ofp 33 3.4 2 Sen's estimate of cr = 3.5 Estimate of cr 3.6 Example 2 c = r 2 P1 based on the Z-statistic 35 38 49 iii PAGE CHAPTER IV CONTRIBUTIONS TO Tlili ASYMPTOTIC THEORY OF U-STATISTICS 53 4.1 Introduction 53 4.2 Kolmogorov inequalities 53 4.3 Strong convergence results 55 4.4 The asymptotic normality of UN 58 V SEQUENTIAL FIXED-WIDTH CONFIDENCE INTERVALS FOR REGULAR FUNCTIONALS 5.1 5.2 5.3 Introduction 2 The sequential procedure using swn 2 The sequential procedure using s zn 5.4 Examples BIBLIOGRAPHY 62 62 63 70 77 81 iv ACKNOWLEDGMENT I wish to express my sincere appreciation to Professor W. J. Hall for his significant guidance, criticism and understanding during the course of this work. I would also like to thank Professors W. Hoeffding, P. K. Sen and G. Simons for their helpful suggestions regarding the original manuscript and the Public Health Service for financing a Research Assistantship through grant GM-10397. A very special thank you is due my wife Bonnie and Electra Sproule for their constant encouragement. CHAPTER I INTRODUCTION Our primary goal is the development of a sequential confidence interval for the mean of a U-statistic, having fixed-width equal to 2d and coverage probability approximately equal to some preassigned a, where 0 < a < 1. The problem was solved by Chow and Robbins [3] for the special U-statistic, the sample mean. Starr [16] evaluated the Chow and Robbins [3] sequential procedure, assuming that the underlying distribution is normal, and concluded that the procedure is reasonably consistent and efficient for all values of the variance of the underlying distribution. Generally speaking a U-statistic is a generalization of a sample mean. This is clearly revealed by a de- composition due to Hoeffding [9] and which we refer to as the Hdecomposition. By means of this decomposition aU-statistic U can be . n expressed as the sum of a sample mean and a remainder term R • n In Chapter II, notation, concepts and preliminary results used throughout this work are developed. In addition to an examination of the H-decomposition, two important statistics are introduced, the W-statistic and the Z-statistic. In Chapter III we consider the problem of estimating the variance of a U-statistic. Two estimates, s2 and s2 , of n Var(U } emerge as wn zn n prime candidates - based on the W- and Z-statistics. It turns out that , s l'19h t 1Y superlor 'h teoret i ca 11y, b ut t h at s 2 can be calculated s 2 lS wn zn much more easily, in general. Therefore, both estimates are used in 2 Chapter V to define sequential procedures. In Chapter IV a Kolmogorov-like inequality for U-statistics is established. Also, it is shown that the remainder term R converges n to 0 in a strong sense. However, the main purpose of Chapter IV is to develop the asymptotic normality of a U-statistic based on a random number of observations, in the fashion of Anscombe [1]. This result leads to the asymptotic consistency of the sequential procedures offered in Chapter V. In Chapter V, equipped with the results of the previous chapters, we present a sequential confidence interval for the mean of a Ustatistic. The confidence interval has fixed-width and is asymptoti- cally consistent. Making use of the fact that U-statistics are reverse martingales, the asymptotic efficiency of our sequential procedure is established, in much the same manner as Simons [15]. The theory is illustrated explicitly by obtaining sequential (non-parametric) fixedwidth confidence intervals for (1) the population variance and (2) the probability of a concordant pair of observations in sampling from a bivariate population. As a secondary goal, it is hoped that the techniques and results developed here can be used to extend other sequential procedures already available for the sample mean (where the variance is unknown) to the case of aU-statistic. CHAPTER II PRELIMINARY CONCEPTS AND RESULTS 2.1 Introduction. ~ U-statistics, as well as some related statistics, are defined and some relevant properties are presented. A particular emphasis is placed on a decomposition of aU-statistic due to Hoeffding [9], and referred to as the H-decomposition. 2.2 Functiona1s. We now introduce some basic terminology ~ following closely the lines of Hoeffding [7]. Let ~ be a subset of the set of all c.d.f.'s defined on a finite-dimensional Euclidean space. Suppose that the random variable X has c.d.f. F that "e(F) is a functional of F defined on number e(F) is assigned. ~" E~. We say i f for each F E ~ a real A functional e(F) is "regular over £I" if there exists a positive integer n and a function for each F E~. In such a case unbiased estimate of e(F) over ~(x1'··· ,x n ~". ~(x1,···,xn) such that ) is said to be "an Let r be the smallest number of arguments required for a function to be an unbiased estimate of e(F) over £I. Then r is said to be "the degree of e(F) over function ~(x1,···,xr) functional e(F)". ~" and the is referred to as a "kernel of the regular Clearly, for any regular functional e(F), we can always find a kernel which is a symmetric function of its r arguments, namely, 4 (r:)-l~ ~(x Q" 1 ••• , x Q' ) r where the summation is over the r: permutations (Q'l,···,Q'r) of the integers [1,2,".,r}. For example, suppose ~ is the set of all c.d.f. 's defined on the real line and having finite variance. random variable X having c.d.f. F E~. Let e(F) be the variance of the Then 2 is a kernel of e(F) and (x -x ) /2 is the 1 2 corresponding symmetric kernel. Hoeffding [7] shows that a polynomial in regular functiona1s is itself a regular functional. This useful result has a very straight- forward proof. 2.3 U-statistics. U-statistics were introduced by Hoeffding [7]. ~ Let X ,X ,···,X be independent and identically distributed random n 1 2 variables (henceforth referred to as I.I.D. random variables) having c.d.f. F. Let f(x ,···,x ) be a function of r arguments. 1 r Then a U-statistic is defined by = [n(n-1)··· (n-r+1) ] -1 ~ f (xQ' , .•• ,xQ' ) 1 r where the summation is over all permutations (Q'l,···,Q'r) formed from the integers [1,2, ••• ,n}. the U-statistic". We refer to f(xl'''.'x ) as a "kernel of r Notice that Un is sYmmetric in x ,x ,···,x • n 1 2 i f f(x ,···,x ) is a kernel of a regular functional e(F) over 1 r ~ Also, then 5 Un is an unbiased estimate of e(F) over~. Letr f (x 1 ""'x ) be the o symmetric function corresponding to f(x 1 ,··· ,x r ). n)-l L; (n , r) f (x n = ( r 0 Q" U • •• x 1 , We can then write Q' ) r where L;(n,r) represents here, and in the sequel, the summation over all the combinations (Q'l,···,Q'r) formed from the integers {l,Z,···,n}. refer to f (x ,···,:x ) as the "symmetric kernel" of the U-statistic. r o 1 Assume from this point on, without loss of generality, that f(x 1 ,···,xr ) is symmetric in x ,x Z,···,xr ' 1 We now introduce regular functiona1s denoted by p which playa central role in what follows. Assume that the symmetric function f(x , ••• ,x ) has existing expectation for F r 1 E~. Write Define for c = 1,2,···,r. Note that f r (x ,···,x r ) = f(x ,···,x r ). 1 1 We interpret e{f(x1,.·.,xC'Xc+l"",Xr)} as the expected value of f(X ,···,Xr ) given that X "",Xc are fixed at the values x ,"',xc ' 1 1 1 respectively. Notice that e = e{f c (X ,···,Xc )} for c = 1,2,···,r. 1 Define pc = Var{fc (X 1 ,"',Xc )} for c= 1,2,···,r. In particular f 1 (x 1) = e{£(x 1 ,X Z""'Xr )} and P1 = var{f 1 (X 1)}· Now Pc = pc(F) is a polynomial in regular func- tiona1s of F, and so, is itself a regular functional of F. We 6 If e(f(Xl,···,X )} r 2 < 00, then Hoeffding [7] shows that the variance of U is given by n Var(U}= (n)-lL;rlc r \) (n-r) n r c= c r-c Pc (2.1) = n -1 r 2 PI + for n r. ~ O(n -2 ) This result is generalized in Theorem 2.2 below. In Chapter III we consider the problem of estimation of the regular 2 2 functional cr = r Pl. LEMMA 2.1. (i) (ii) (iii) (iv) The following lemma appears in Hoeffding [7]. Assume e(f(Xl,···,X )} r 2 < 00. Then 0 ~ pc/c ~ pd/d for 1 ~ c < d ~ r, 2 r p /n ~ Var(un } ~ rpr/n, l n Var(D } is a non-increasing function _of n, and n -- - Var(U } = P and lim r r -- n .... oo 2 n Var(U } = r p • n l We now introduce notation which is used to represent the covariance between two U-statistics. Let X ,X ,···,X be I.I.D. random n l 2 variables, and let f(xl,···,x ) and g(xl,···,x s ) be two symmetric r functions with rand s arguments, respectively. f c (x l' Define ••• x ) , c and Sc = Cov(f c (Xl,···,Xc ),gc (Xl,···,X)}, c = 1,2,···,min(r,s). c Define two U-statistics by D n = n) -1 L; (n , r) f (x • •• x ) ( r et' 'et 1 r 7 and V = (. m ) -1 ~ (m , s) g(x A ,···,xA ) s m ~1 ~s where n 2 rand n THEOREM 2.2. 2 m 2 s. Assume e[f(X ,· .. ,x )} 2 < 1 = n PROOF. -1 rS~l r 00 and e[g(x ,.·.,X) }2 < 00. 1 -- s -2 + O(n ). First Cov(u ,V } = en) -l(m) -l~(n,r)~(m,s)Cov(f(X ••• X ) (X" .... X-..)}. n mrs Q" s ~ ,S Q'" Q' ' g j j ' , jj r 1 s 1 Now, for each combination (~l'···'~s) formed from (1,2, •• ·,m}, the total number of combinations (Q'l' ••• , Q') r formed from (1,2,··· ,n} and having exactly c suffixes in common with (~1'··· ,~s) is (~) (~=~) where c = O,l,···,min(r,s). Notice also that Cov(f(XQ' ,···,XQ')' 1 r g(X~ , .•. ,x~ )} is zero if c = 0, that is, if there are no suffixes 1 s in common. Cov(u Thus V } = (n) -1 (m) -1 {em) ~min(r,s) e S) (n-s) S } n' m r s . s c=l c r-c c which completes the proof. Cov(u ,V } is free of m. n m Notice that the final expression for , 8 and Correlation[U ,U } n We ne~t m = q (c) n m introduce notation used to represent the covariance between the squares of U and U. n m (2.2) [Var[U }/Var[U }]1/2 For c = O,l,···,r define (xl' ••• ,x 2 r-c ) where the summation ~(c) is over all combinations (al,· •• ,a ) and r (~l'···'~r) each formed from [1,2,·.· ,2r-c} and such that there are exactly c integers in common. Pc + e2 for c = O,l, .. ·,r. Put Po = 0. Then e(q (c) (X· •• X )} = l' , 2r-c Now, for c = O,l,· .. ,r and t = 1,2, ••• ,2r-c define qlC)(Xl,···,X;) = e(q(C)(Xl,···,Xt,XU1'···'X2r_c)} and c = O,l,···,r (2.3) define a U-statistic by = n) -1,,(n,2r-c) (c) ( x , · · · ,x i.J q ). ( 2 r-c a a 1 2r-c Put t = min(2r-c,2r-d). define Finally, for t = 1,2,"',t and c,d = O,l, .. ·,r 9 Notice that for L = 1,2,···,2r-c and c = d = 0,1,··· ,r, SlC,C) = (c) PL so that, in particular, LEMMA 2.3. Assume 2 2 -1 en) -1 r r er)(r)(n-r)(m-r) e n ) -1 Cov[U n ,Um} = ( m) r r ~c=o~d=o c d r-c r-d 2r-c ~t b:1 (2r-d) cn-2r+d) S(c ,d) L 2r-c-L L = n -1 4r 2e2 P1 + O(m -1 -1 n ). First, from (2.2) and (2.3), PROOF. ( n) -2 ~ (n , r) ~ (n , r) f(x r 0" s ~ , sO" •.• x 1 '0' )f(x r ~' ••. x ) 1 , ~ r = en) -l~r (r) (n-r) U(c). r c=o c r-c n Then cov[u 2 U2 } n'm = (m) r -I( n) -l~r ~r (r)( r)· (n-r) (m-r) r c=o d=o c d r-c r-d Cov{ U(c) U(d) }. . n ' m But from Theorem 2.2 Cov[u(c) U(d)} n ' m for c,d = O,l,···,r. 2.4 = (n ).-l~t (2r-d) (n-2r+d) S(c,d) 2r-c L=l L, 2r-c-,e ,e This completes the proof. ~. In Hoeffding [9] a means for decom- posing U is developed, one which has great value in establishing n properties of- U. n decomposition". We refer to this decomposition as the "HContinuing in our development of notation we define 10 gh(xl'·"'~) = fh(xl'·"'~) - e for h = 1,2,· .. ,r. Also, let g(l)(X ) = gl(x ) and l l (2.4) = ~(xl.' ••• '~) - ~~-l~(h,j)g(j)(x ••. x ) ~n n J=l Q'l" Q'j for h = 2,3,···,r. For example, if h = 2, From (2.4) it follows that = e/g(h)(x ••• x X · •• X.)} gc(h)(x l' ••• ,x) c l 1" c' c+l' '--h for h = 1,2,···,r and c = 1,2,···,h-l. For n > rand h = 1,2,···,r define n £.J g (h)( x,· . . ,x ) . V(h) __ (h ) -l,,(n,h) n Q'l 'b. (2.5) (1) -1 n (1) -1 n Note that Vn = n ~i=lg (xi) = n ~i=lfl(xi) - e. Strictly speaking V(h) is not a U-statistic, as it may depend upon unknown n functionals. Nevertheless, it does have most of the attributes of a U-statistic. The proof of the following lemma is simple and appears in Hoeffding [9]. LEMMA 2.4. If e[ If(X ' ••• ,Xr ) I} < 00, then erg (h) (Xl' ••• ,~)} = o and l .•. ,x) = 0 for h = 1,2,···,r and c = 1,2, ••• ,h-l. synunetric. g(h) (x c l' For h = 1,2,···,r the function g(h)(xl'.··'~) is c 11 LEMMA 2.5. Assume that e(f(X 1 ,··· ,Xr )} 2 0h = var(g(h)(X1'···'~)} for h = 1,2,··· ,r. < 00 and let Then (i) for h = 1,2,···,r the ~ of v~h) is 0 and the variance is given EY O(n -h ). Also for r < m < n we have (ii) --- - Cov(v(h) v(I-)} = n ' m PROOF. Var (Von(h) } { First, since e(f(X ,···,X )} 1 r h = 1,2,···,r. h = I- = 1,2,···,r h ~ I- = 1,2,···,r. 2 < 00, then 0h < 00 for Part (i) follows from Lemma 2.4 and the fact that Part (ii) follows from Theorem 2.2. We now introduce the H-decomposition by means of the following theorem given in Hoeffding [9]. THEOREM 2.6. ~ U-statistic may be decomposed into ~ linear combination of U-statistics, specifically, (2.6) U = n = e +>,.r (r) v(h) = e + rV(l) + R n=l e + rn h -1 n ~. ~= n n n 1(f1 (x.)-e) + Rn ~ where R = >,.r (r) V(h) and Correlation (v(l),R } = O. n in=2 h. n n n n ) Vn(h) satisfies the martingale property, that is, (h Further, 12 for r.$ m < n and h REMARK 1. 1,2,·",r. = Theorem 2.6 is extremely useful in establishing properties of U-statistics. In fact, it states that U is a linear n combination of U-statistics, mutually uncorre1ated (by Lemma 2.5) and each successive term having a variance of smaller order. It shows that a U-statistic is essentially the sum of an average of I.I.D. random variables V(l) and a zero-mean remainder term R , and that the n n two are uncorre1ated. zero. Of course, if r =1 From Lemma 2.5 we see that Var(R } n the remainder term R = O(n- 2). show that under the assumption that e(f(X , ••• 1 converges to zero almost surely as n This implies that nYR n Y < 1. -> ~ is n In Chapter IV we ,xr )}2 <~, nYv(h) n for Y < h/2 and h == 1,2,"·,r. converges to zero almost surely as n -> ~ for Hoeffding [9] uses the H-decomposition to show that, under the assumption that e(lf(X ,···,X ) r 1 mean almost surely as n ->~. I} <~, a U-statistic converges to its Sen [13] proves a somewhat weaker result in that he assumes that e(lf(X ,···,X ) 1 r 1 e 1 + } < ~ for some e> 1 - r- l Berk [2] contains a rather simple proof of the almost sure convergence of a U-statistic by recognizing that U-statistics are reverse martingales. More will be said about U-statistics as reverse martingales in Chapter V. REMARK 2. 2 Hoe ffd,ing [7] prove s that i f e( f (Xl' ... ,X ) } < r ~ and 2 PI > 0 then In(Un-e) has an asymptotic normal distribution N(O,cr ) 2 2 where cr = r Pl' The result follows directly from the H-decomposition 2 by noticing that r/n v(l) is asymptotically N(O,cr ) by the Lindbergn 13 Levy central limit theorem and that lim REMARK 3. n-+ oo e[/n R }2 = n o. The H-decomposition along with Lemma 2.5 yields a second expression for the variance of a U-statistic, namely, (2.7) Compare (2.7) with (2.1). Equating these two expressions for the variance of a U-statistic enables us to obtain explicitly the re1ationship between the p's and the o's. This leads us to the following lemma, also proved by Hoeffding [7], but in an entirely different manner. LEMMA 2.7. The p's and the o's (2.8) 0. '-n = ~ related for h = 1,2,···,r ~ L;hc=l (h) 0c c and (2.9) PROOF. To prove (2.8) we proceed by induction. and equating (2.7) with (2.1) we obtain Pr -holds for h = r. some j > 1. Putting n = r c=l (r) c 0c· Thus (2.8) L;r Now assume that (2.8) is true for r - j <h ~ r for Put n = r + j and equate (2.7) with (2.1) so as to obtain )'.r . (r+j ) -1 ( r) ( j ) = )'.r (r+j) -1 (r) 2 0 • ~=r-J ~ r h r-h Ph ~=1 \ h h h . Mu1t1p1y through b y( .r ) -1 (r+ r j ) and obtain J Pr - j + )'.r ( r) -1 ( r) ( j ) ~=r- j+1 j h r-h Thus, from (2.8), we have . r (r+j ) % = L;h=1 h -1 (r ) -1 (r ) 2 (r+j) 0h. j h r 14 (2.10) The double sum on the left side uf (2.10) with i = r - h becomes ° r i t:~-l(~) -l(:)(~ )t: - (r-i) 1=0 J 1 1 c=l c c = t:~-l (~ ) -1 (~ ) t:r ( r ) (r:c) 1=0 J 1 c=l c 1 °c ° = t:r ( ~ ) -1 ( r) t:~-l (~ ) ( r:c) c=l J c c 1=0 1 \ 1 Thus (2.10) becomes Pr - j = = ~=l ( ~ ) -1 ~:{ (r~j) which is (2.8) with h = r - j. (~) (rjh) 0h 0h This completes the proof of (2.8). order to prove (2.9) we make use of (2.8) as follows: = t:~ t:h . (-1) h-c ( h) ( : ) 0.. J=l c=J c J J But In 15 h . (-1) h-c (h) (~ ) C=J c J 2:: = = h, which equals 1 whenever j (~) (l_l)h-j and 0 whenever j < h. This completes the proof of (2.9). 2.5 The W-statistic. ~ For each i = 1,2,···,n define aU-statistic based on xl,···,x.~- l'x,~+l'···'xn by U. (~) n = ( n-l)-L(n-l r) f (x L: ' • •. , x 0: ) r 0:' 1 r where the summation is over all combinations (O:l,···,O:r) formed from {l, ••• ,i-l,i+l, ••• ,n}. (2.11) Define the W-statistics by Win = nUn - (n-r)u(i)n for i = 1,2,···,n and notice that they are identically distributed. Furthermore, since Un = n Wn -1 =n n 2::. lU(,) , ~= ~ n -1 n 2::.~= lW,~n = rUn • The W-statistics can be conveniently decomposed. To do so, for each i = 1,2,···,n and h = 1,2,···,r, define V(h) (i)n where the summation is over all combinations the integers {l,···,i-l,i+l,.··,n}. Let (O:l'···'~) formed from 16 (h) nVn(h) - (n-r)V(i)n (2.12) for i = 1,2,···,n and h = 1,2,···,r. Then r (r) (h) h Win (2.13) Win = r8 + ~l for i = 1,2,···,n. The following lemma gives us additional insight into the decomposition (2.13). LEMMA 2.8. Assume e[f(Xl,···,Xr )} and i,j = 1,2,···,n. 2 < ~ and suppose tha~ n > r Then, for h = 1,2, .•• ,r we-have that e[w~h)} = 0 ---- - - -- ~n and Also, for h ~ t = 1 2 •.• r we have that Cov[w~h) w~t)} = 0, whereas " , - -- -~n ' In for h = t = 1,2,···, r , (h) (h) } (2.15) Cov[W.~n = ( n-l) h __ -1 [(r 2 -h)-(n-l) -1 h(r-l) 2 ]oh ' W.In = PROOF. O(n -h ). Lemma 2.4 and (2.12) imply that e[w~h)} = O. ~n follows from Lemma 2.5 and (2.12). Cov[w~h) ,w~t)} = 0 by Lemma 2.4. ~n In If h ~ Also, (2.14) t = 1,2,···,r, then If h = t -= 1,2,··· ,r, then by Lemma 2.5 and (2.12), (2.15) holds. LEMMA 2.9. Assume e[f(X1 ,···,Xr )} and i, j = 1,2, •• ·,n • 2 < ~ and suppose that n Then e[w. } = r8, ~n r (n-l) -1 ( hr) 2 [r 2 + h(n-2r)]oh = 0(1) h Var [Win } = Lh=l >r 17 and O(n The proof follows directly from (2.13) and Lemma 2.8. -1 ). The W-statistic is closely related to a statistic introduced by Sen [13], as we shall see in the next sectio~, and plays an important role in Chapter III. 2.6 The S-decomposition. ~ V in = Sen [13] defined a U-statistic by n-1) -1",(n-1,r-1)f( '"' x.,x , ••• ,x ) ( r- 1 1 ~2 ~r for i = 1,2,···,n where the summation is over all combinations (~2'···'~r) formed from the integers [1, ••• ,i-1,i+1, ••• ,n}. The S-decomposition is Un = n -1 n 2::. 1V, • 1= 1n Notice that W. = rV. for i = 1,2,···,n so that Lemma 2.9 may be used 1n 1n to determine variance and covariance expressions for the V. 'so An 1n 2 2 estimate of a = r P1 can be constructed from the W. 's (or the V. IS) 1n 1n and such an estimate is examined in Chapter III. V converges in probability to f (x ) as n in 1 i integer i. ~ 00 Sen [13] proves that for any positive This also follows from Lemma 2.9. Both the H-decomposition and the S-decomposition indicate that for large n a U-statistic behaves like a sample mean. However, the H- decomposition has the advantage that the remainder term R of n Theorem 2.6 has a fairly explicit representation. There is a great bulk of theory in the literature developed for the sample mean. It 18 appears then that this theory might, in certain cases, be extended to the case of a U-statistic by showing that, at least asymptotically, the remainder term is in some sense negligible. A specific result due to Chow and Robbins [3) is extended in Chapter v. 2.7 The Z-statistic. For n > r define the Let Zr = rU. r ~ Z-statistic by Z = nU - (n-1)U • n-1 n n (2.16) Note that U = n n -1 n ~. Z •• 1.=r 1. Now, define Z(h) = nv(h) - (n-1)v(h) n n n-1 (2.17) for n > rand h = 1,2,···,r. By (2.16) and (2.17), along with Theorem 2.6, we have Z = n (2.18) . of An est1.mate for n > r. e + L;r (r) h=1 0 2 = r 2 PI h Z (h) n can be constructed from the Zn's and such an estimate is examined in Chapter III. The following lemma, an immediate consequence of (2.17) and Lemma 2.5, gives us some insight into the decomposition (2.18). LEMMA 2.10. Assume e{f(X ,···,X )} r 1 var{z(h)} n ~ and suppose that = ( n -1) -1 [(n-2)h+1)Oh = O(n -h+1 ). h Also, for h ~ ~ = 1,2,···,r ~ for h = 1,2, •• ·, r , ~ < Then, for h = 1,2,···,r ~ have that e{z~h)} = 0 and r < m < n. (2.19) 2 = have that cov{Z~h) ,Z~~)} = 0, whereas 19 Cov[Z(h) Z(h)} = _(n-1)-1(h_1)O (2.20) n LEMMA 2.11. h ' m Assume e[f(X ,···,X )}2 < 1 r h = O(n- h ). 00. Then e[z } = re, -- r e[z } = e for n > rand n - - (2.21) Var[Z } = n { r r2 p r (n-1) -1 r 2 2-L h.. (h) [(n-2)h+1]~= r P1 +O(n) n = r lb=l \ n> r. Also, for m < n, (2.22) Cov[Z ,Z } = n m PROOF. (2.16). 1 -rL;~=2 (n~l) -1 (~) 2(h_1) ~ _L;r h=2 2 = 0(n- ) (n-1) -1 ( r) 2 (h-1) 0 _ O( -2) h h h n m = r m > r. The expectations follow from the definition of Z and r The variance expression (2.21) follows from (2.18), (2.19) and (2.20). Suppose r < m < n. Then, by (2.18), which, upon applying (2.20), reduces to (2.22). m = r < n. Next, suppose Then, using (2.16), the corollary to Theorem 2.2, and (2.7) gives us Cov[Z n ,Z r } = rn Cov[Un ,U r } - r(n-1)Cov[U n- l'U} r = rn Var[Un } - r(n-1)Var[Un- 1} 20 which reduces to (2.22). REMARKS. This completes the proof. If r = 1, notice that W. and Z. each reduce to f(x.). 1n 1 1 In general rU n is the average of WI ,·.·,W , whereas U is a near n nn n . average of Z , ••• ,Z , that 1S, U = r n n -1 n Z.• 1=r 1 n~. In Chapter III we are concerned with the problem of estimating 02 = r2pl' One estimate is the sample variance of WI n ,···,Wnn , while a second estimate somewhat resembles the sample variance of Z ,···,Z. r n From the computational point of view, the Z-statistic is a little more suited to a sequential setting than is the W-statistic. The next two lemmas are used in Chapter III to establish Theorem 3.3. LEMMA 2.12. Assume e[f(X , ••• ,X )}4 < r l Let ~l = var[[g(1)(X )]2}. l 00 and suppose that n> r. Then (2.23) PROOF. and set P n (2.24) For convenience write z "r L..h= 2 ( hr ) Z(h) n • = Z(l) = g(l)(x ) (see (2.17)) n n Th en, b y (2 • 18) , n 22 (Z _8)2 = r 2 z + 2rz P + P n n n n n and so 242 2 (2.25) Var[(Z n -8) } = r ~l. + 4r Var[z n Pn } + Var[pn } 3 2 222 2 + 4r Cov[z n ,z n Pn } + 2r Cov[zn ,Pn } + 4rCov[zn Pn ,Pn }. Our major task is to show that (2.26) 2 -2 Var[Pn } = O(n ). 21 Once this has been accomplished, it is a simple matter to show that each of the remaining variance and covariance terms in (2.25) is of order n- l To prove (2.26), first notice that (2.27) Now, using (2.17) and (2.15), we can write W*(h) _ (h-l)V(h) nn n-l (2.28) where ok (h) (2.29) W nn h ehn-I) -1. -l,,(n-l,h-l) g (h) ( x ~ Q/' S for h = 2,3,···,r and n > r. x ..• x ) n' Q/2' , ~ (Notice that W*(h) is related to V(h) in nn n much the same way as W , defined in (2.11), is related to U , and nn n that W*(h) differs slightly from w(h), defined in (2.12).) nn nn Therefore, from (2.28) zn(h) 2 = W~.(nn(h) 2 (2.30) for h (h) (h) 2 (h) 2 - 2(h-l)Wnn Vn-l + (h-l) Vn-l ft oJ. = 2,3,···,r. We now are equipped to prove that (2.31) for h = 2,3, .•• ,r. e[v(h)} n-l = 0, From Lemma 2.3 (with m = n) and the fact that we have (2.32) for h = 2,3,···,r. (h)2 Var [ V _ } n l = O(n -2 ) Although W*(h) is not a U-statistic, the proof of nn Lemma 2.3 can be adapted to show that 22 r ,\, (h) 2} Var1. Wnn (2.33) for h h = = 2,3,···,r. 2,3, ••• ,r. Notice that, by Lemma 2.4 , o e[w*(h)v(h)} nn n-1 for Then, by the Schwarz inequality = 2,3,···,r. Thus, (2.32), (2.33) and (2.34), along with for h Lemmas 2.5 and 2.8, imply that (2.35) for h = 2,3,···,r. Also, (2.32), (2.33), (2.35) and the Schwarz inequality imply that the three covariances involving W*(h)2 V(h)2 nn ' n-1 2 and w:~h)v~~i are each of order n- for h = 2,3, •.. ,r. This proves (2.31). An argument similar to that in (2.34), along with (2.31) gives us (2.36) for h 1 ~ h 2 = 2,3, ..• ,r. Again, by the Schwarz inequality, the covariances between the terms in (2.27) are each of order n -2 . This proves (2.26). We now tackle the remaining terms in (2.25). k erz p } = 0 for k = 1,3 so that 1. nn (2.37) cov[z2,z p } = cov[z3,p } = O. n n n n n From Lemma 2.4, 23 By Lemma 2.10, Var[p } n = O(n- l ); also var[zk} n = 0(1) for k = 1,2. Hence, by an argument similar to that in (2.34) we obtain Var[z P } (2.38) n n = O(n- l ). An application of (2.26) and the Schwarz inequality proves that 2 l 2 cov[z2,p } and Cov[z P ,p } are each of order n- . n n n n n This, along with (2.26), (2.37), (2.38) and (2.25), completes the proof of (2.23). LEMMA 2.13. r < m (2.39) < Assume e[f(X ,··· ,X )} r l 4 < 00 and suppose that Then n. 2 2 Cov[ (2 -8) , (2 -8) } n r = m 4 [(r-l)~2 + (r-l) 2 ~3]n -1 -1 -1 + O(n m ) where ~2 = e[g(1)2(X )g(1) (X )g(2)(X ,X )} and 2 l 2 l ~3 = e[g(l) (Xl)g(l) (X2 )g(2) (X PROOF. l ,X )g(2)(X ,X )}. 2 3 3 From (2.24), writing zn for g(l)(X ), n 3 2 2 2 + 2r Cov[znnm P ,z } + 4r Cov[z P ,z P } + 2rCov[z P ,P } nnmm nnm 22 2 2 2 + r2 Cov[p ,z } + 2rCov[P ,z P } + Cov[p ,P }. n m n m n mm We treat the seven terms in (2.40) one at a time. z2 is independent of (2 _8)2, so that n m (2.41) From (2.26) and the Schwarz inequality Since m < n, 24 2 2 -1 -1 Cov [ p ,P } = O(n m ). n m (2.42) 2 m We now consider the term Cov[z P ,z } in (2.40). n n (2.29) and the fact that z n (2.43) 2} Cov [ zn P,z n m = r ~h From (2.28), is independent of z2 v (h)1 for m < n, m n- (r) ,,[ 2Z (h)} = =2 h U zn zm n L:hr =2 (r) U"[z z2W'1(h)} h n m nn = ~r (r)(n-1)-1~(n-1,h-1)e[z z2 g (h)(X X ... X)}. h=2 h h-1 a's n m n' a ' , a 2 For convenience we shall write g combination Cap." ,~). a1"·~ = g(h)(x a1 ,···,x ~ h ) for any Now, for h = 3,4, ... ,r, by Lemma 2.4, o (2.44) for any combination (a2'···'~) chosen from [1,2, •.. ,n-1}. Also, for h = 2 2g . } (n-1) -1 ~.n-1 1e[ zn zm ~= n~ (2.45) = (n-1) -1 ~2' Putting (2.44) and (2.45) into (2.43) gives us (2.46) Next, we consider the term Cov[z n Pn ,zmPm } in (2.40). e[zn Pn } = 0 (2.47) Cov[z P ,z P } n n m m Since 25 (hI) (h 2) From (2.28) and the independence of z from z VIZ n m nm (h ) (h ) e[z n zmZn (2.48) 1 Z m 2} = i( (h ) i( 1 W e[z n zmWnn we have (h ) mm 2 } *(h ) (h 2) l - (h 2 -l)e[zn zmWnn VI} m- *(2)} e[ zn zmW*(2) nn Wmm where e[z n zmg = (n-l) -1 (m-l) -1 n-l m-l [ ~. l~' Ie z z g .g .} ~= J= n m n~ mJ .g .} = 0 except possibly when i = j = 1,2,···,m-l. mJ n~ Thus (2.49) e[z z w*(2)w*(2)} = (n-l)-l(m-l)-l~~-lle[z z g .g .} mm n m nn J= = Higher values of hI and h (n-l) 2 -1 n m nJ mJ 13 , 3 lead to terms of order n -2 or higher, that is (2.50) *(h l ) *(h 2) W } mm e[zn zmWnn O(n -2 ). A close examination yields (2.51) *(h l ) (h 2) VI} m- e[zn zmWnn 4 = O(n- ) ~'( (More specifically, for hI = 2,3 and h 2 = 2,3, .• ·,r.) (h ) (h ) e[zn zmWnn 1 V 2 } = 0 m- l Putting (2.49), (2.50) and (2.51) into (2.48), and then, (2.48) into (2.47) yields (2.52) 2 We now consider the term Cov[z P ,p } in (2.40). n n m First, 26 from (2.28) (2.53) since z n is independent of v(h)lP2 for m < n. n- m For h = 2,3,"',r (2.55) But ~'(2) *(2)2} e[ znWnn W rom = (n-l) where e[z g .g n g n } n n~ m~l m~2 ,R,l -1 (m-l) -2 n-l m-l m-l [ } L:._lL: n _lL: n _Ie z g .g n g n ~~l~2n n~ m~l m~2 = 0 except possibly when i = m and = 1-2 = 1,2"" ,m-l and when i = 1-1 = /;2 = 1,2,'" ,m-l, (2.56) Also, it is not difficult to show that (2.57) and Thus 27 (2.58) Putting (2.56), (2.57) and (2.58) into (2.55) gives us (2.59) We now show that (2.60) For h = h1 = 2 and h 2 =3 e( z nW*(2) W*(2) W*(3)} nun nun nn where e(z g .g ~g ~ ~ } = 0 except possibly when (i,t) equals n n~ m"", m""l ""'2 -1 -1 (2.61) O(n m ). It is a simple matter to show that (2.62) e(z W*(2)W*(2)V(3)} n nn nun m-1 = 0 ' (2.63) and (2.64) Then (2.60) follows from (2.61), (2.62), (2.63) and (2.64). Higher 28 values of h, h l and h 2 yield terms of order n -1 -2 m or higher. There- fore, from (2.53), (2.54), (2.59), (2.60) and the extensions of (2.59) and (2.60) to higher values of h, h l and h , we obtain 2 (2.65) Further application of the techniques already used gives us (2.66) Cov [ p 2 ,z 2} = 0(n-2) n m and (2.67) Cov [ p 2 ,z P } = O(n -2 ). n mm We have now completed our treatment of each of the seven terms in (2.40). Combining (2.41), (2.42), (2.46), (2.52), (2.65), (2.66) and (2.67) yields (2.39). This completes the proof of the lemma. CHAPTER III ESTIMATION OF THE VARIANCE OF AU-STATISTIC 3.1 Introduction. ~ Our chief purpose in this chapter is to obtain an estimate for the variance of a U-statistic having certain desirable properties. Assume e[f(X ,···,X )} 1 r 2 < 00 and that P1 > O. Then, from (2.1), recall that var[u n } = n -1 2 -2 r P + O(n ) 1 so that we may confine our attention to estimation of since any good estimate of 0 2 ~ 00. = 2 r P1' n More specifically, we are concerned with obtaining an estimate for n 2 will be a good estimate of n Var[U }, if second order terms are negligible. basic properties. 0 (1) It converges to 0 0 2 2 = r 2 P1 = r 2 P1 that has several almost surely as (2) For each n, it is positive almost surely. variance of the estimate may be evaluated for large n. (3) The (4) The nature of the estimate is such that, for it, we can establish the asymptotic efficiency of the sequential procedure appearing in Chapter V. (5) In addition, we would hope that the sequential calculation of the estimate is not too tedious. three suitable candidates are examined. In this chapter The two leading candidates are compared when U is the unbiased estimate of the population n variance. 30 3.2 P1 The U-statistic estimate of Pl' 2 2 = var[f 1 (X 1 )} = e[f 1 (X 1 )} - e. and so, has a U-statistic estimate. Recall that Now P is a regular functional, 1 From definition (2.2) in section 2.3, notice that P1 has a kernel (3.1) 2r) -1 ~ (0) f(x ••• x )f(x ••. x ) ( r a"a~"~ 1 r 1 r - where the summation ~(1) is over all combinations (a ,···,a ) and r 1 (~l""'~r) each formed from [1,2,"',2r-1} and such that there is exactly one integer in common, and the summation ~(O) is over all combinations (a1 ,···,ar ) and (~l""'~r) each formed from [1,2,"',2r} and such that there are no integers in common. From (2.3), the U-statistic estimate of P is given by 1 (3.2) = u(l) n _ u(O) n n ) -1,,(n,2r-1) Lq (1) ( X ( 2r-1 a' 1 - • •• n ) -1,,(n,2r) Lq (0) ( X ( 2r a ' ••• , x a . 1 ) x , a 2r-1 ) 2r . We next evaluate the variance of this estimate of Pl' (2.1) and Theorem 2.2, = £1 n 1 -1 + O(n -2 ) From 31 where and the functionals siO,l) and pi l ) are defined in section 2.3. Then, from (3.4), (3.5) 6 1 = 4r 2e 2var[g(1)(X l )} - 4r(2r-l)Cov[qiO)(X ),qi l )(X )} l l + (2r-l)2var [qi l ) (Xl)} We would now like to express (2r-l)qi l )(X ) - 2rqiO)(x ) in l l terms of the g(h) functions introduced in section 2.4. (0) ql (xl) = ef l (xl) = eg fo(x l ) = (1) e[fl(X2)f2(xl,X2)}. (3.6 ) From Lenuna 2.4 2 (xl) + e. . Def1ne Then, from section 2.3, First 32 (3.7) fo(x ) l = e[(g(1)(X 2 )+Q)(g(2)(x l ,X 2 ) = gl(x l ) + Qg (1) + Q + g(l)(x ) + g(1)(X ))} l 2 2 (xl) + Pl + Q Thus, putting (3.7) into (3.6) yields and so Before we put (3.8) into (3.5), notice that (3.9) and 33 = I{Ig(l) (X )g(2) (X ,X )dF(X )JfIg(1) (X )g(2) (x ,x )dF(x 2~dF(Xy l l l 3 2 2 3 = Ig~(Xl)dF(Xl) Then, putting (3.8) into (3.5) gives us (3.11) 61 = ~l + r(r-l)~2 + 4(r-l) 2 ~3· The variance of Q is then given by (3.3) with 6 given by (3.11). ln 1 Qln inherits all the good properties of aU-statistic. an unbiased estimate of Pl. e[f(Xl,···,Xr )} 2 It is Under the assumption that <~, Q ln converges almost surely to PI as n ~~. One drawback is that Q may possibly take on negative values. ln Also, it appears that Q would require much time for computation. ln Estimation of p. 3.3 Pc for c ~ c = 1,2,· •• ,r. In section 2.3 we defined the functionals Now Pc is a regular functional of degree 2r and has a kernel given by q(C)(X ,···,x _ ) - q(O)(X ,···,x ) for 2r c 2r l l c = 1,2,···,r. The U-statistic estimate of P is given by c _ U(O) f or c = 1 , 2 , ••• , r. Qcn = U(c) n n Then, from (2.1) and Theorem 2.2, Var[Q cn } = var[u(c)} n = 6 n c -1 - 2 cov[U(C) u(O)} + var[u(O)} n ' n n + O(n -2 ) 34 where for c = 1,2,···,r. It is also possible to write for c = 1,2,···,r. Clearly, for c = 1 we have the situation con- sidered in section 3.2. Suppose we define n) -1 r (r)(n-r) Qn = r ~c=l c r-c Qcn· e Then Q is an unbiased estimate of Var(U}. n n Also var(Q } = (n) -2~r (r) 2 (n-r) 2va r(Q } n r c=l c r-c cn r r (r)(r)(n-r)(n-r) + ( n)-2 r ~c=l~d=l c d r-c r-d Cov ( Qcn,Qdn } C:;fd = where 6 1 4 r 6 n 1 -3 -4 + O(n ) is given by (3.11). For small values ofn, where the higher order terms of Var[U } n may not be negligible, Q might be a satisfactory estimate of n Var(U}. n However, it is a very tedious estimate to compute, and we 2 -1 are concerned mainly with large values of n, so that r n Q is to 1n .e be preferred, so far, as an estimate of Var(U }. n 35 3.4 Sen's estimate of cr 2 In section 2.5 we introduced = ~ the W-statistic. Now define (3.12) s 2 (n-l) wn -1 n 2 - L:, l(W. -W ) • ~= ~n n Sen [13] introduced (see section 2.6) Recall that W, as an estimate of Pl. W = rU. n ·d s~ ~n It then follows that s n ere d as an . est~mate 0 2 ~ converges to cr2 ~n for i = 1,2,···,n and 222 r s , so that s can be convn ~ = f 0 2 = r 2 Pl. asymptotically unbiased estimate of = rV. is an Sen [13] showed that s2 wn 0 2 2 = r = r2 Pl .~n pro b a b'l' ~ ~ty as n 2 PI' and further, that s wn ~ ~. In the next two theorems we derive first order expressions for the bias and the variance of s to cr 2 2 wn , as well as prove that s 2 = r Pl as n 2 wn converges almost surely ~ ~. THEOREM 3. 1 • (i) 2 2 -1-2 (3.13) e(s2 } = r Pl + r (r-l) [(r-l)o2-2P ]n + O(n ). ~ l (ii) (3.14) 2 Var(s} wn where 6 is given 1 ~ PROOF. = r 4 6l n -1 + O(n -2 ) (3.11). To prove (i) notice that s 2 ~ = (n-l) -1 n 2 L:, lW, ~= ~n (n-l) -1 -2 nWn • 36 Then (n-1) -1 n [}2 L:i =l e Win = (n-1)-ln [var[w 1n - (n-1) -1 - 2 ne[Wn } } - Var[W }] n which, using Lemma 2.9 and (2.7), reduces to (3.13). To prove (ii) we express s2 wn U-statistics. as a linear combination of From the proof of Lemma 2.3 r (r)(n-r)u(c). ( n)-lL: r c=o c r-c n (3.15) In a similar fashion we may write (3.16) (n-1) -1 n 2 L:.1= 1W.1n = (n-1) -1 (n-1) -l r 2n L: r (r-1) (n-r) U(c) • r-1 c=l c-1 r-c n Then, from (3.15) and (3.16), after some rearranging, (3.17) For c = O,l,"',r let (n) (r)(n-r) 2 • an (c) = (n-1) -1 n r-1c r-c [cn-r] In particular, a (0) = -r n 2 -1 + O(n ) and a n (l) = c = 2,3,"',r notice that a (c) = O(n -1 n (3.18) s ). For Thus 2 wn 1 where a (c) = 0(n- ) for c O,l,···,r. n an explicit expression for a (c). n Note that we do not require For convenience let r a (c)u(c) and recall from section 3.2 that Q T = L: n c=o n n' 1n 37 Then 242 Var[s wn } = r Var(Ql n } + 2r cov[Ql n ,Tn } + Var(T n }. (3.19) From (2.1), Theorem 2.2 and the fact that ~ (c) n = O(n- l ) for -2 3 c = O,l,···,r, we have that Var[T } = O(n- ) and cov[Ql ,T } = O(n ). n n n Therefore, from (3.19) and (3.3) 2 Var(s wn } where ~l = ~l + 4(r-l)~2 THEOREM 3.2. surely .!2 (J 2 PROOF. = r 2 PI = r 4 ~ln -1 + + 4(r-l) 2 ~ n surely to its expectation PI as n = O,l,···,r, < co, 2 then s converges almost - - wn - First, by Hoeffding [9] = U(l) - U(O) converges almost n n -+ co. Secondly, since n n = ~rc=o~n (C)U(C) n converges almost surely to 0 This completes the proof. 2 REMARKS. . est~mates = O(n-1 ) ~n(c) and U(c) is a U-statistic with finite expectation, c = O,l,···,r, then T -+ co. 2 The theorem follows almost immediately from expression (or Berk [2]), the U-statistic Q ln as n ) -+ co. (3.18) in the proof of Theorem 3.1. for c -2 This completes the proof. ~3' If e(f(Xl,···,X )} r O(n 2 There is not much difference between swn and r Q as ln 2 f 0 (J 2 = r 2 Pl' From expression (3.18), s2 wn = r Ql n + Tn • The difference T converges to 0 almost surely as n n expectation of order n- l and variance of order n- 3 -+ co, has Both estimates have good properties, including the same first order variance. of course, converge to (J2 = r 2 PI almost surely as n differ on the question of unbiasedness. 2 wn They only 2 The estimate r Qln is unbiased whereas s2 is asymptotically unbiased. wn view of small sample theory s -+ co. Both, From the point of has the good property that it is 38 2 always non-negative, whereas it is possible for r Q to take on ln negative values. For the problem of finding a fixed-width sequential e, confidence interval for discussed in Chapter V, the estimate s 2 wn 2 serves our purposes better than does r Qln. 3.5 ~ 0 2 2 = r PI based on the Z-statistic. section 2.7 for an introduction to the Z-statistic. Refer to For n > r define (3.20) Notice that if r = 1 then variance. o 2 = 0 2 = var(f(X )} and s2 reduces to a sample l zn We now consider the merits of s 2 zn as an estimate of 2 r Pl. THEOREM 3.3. If e(f(X ,···,X )}2 < r l (i) (3.21) 2} e( szn 00, then 2 2 2 -1 -1 = r PI + r (r-l) 02 n logn + An + o(n -1 ) where A is defined ~ with Al = -1, A2 = 2 (y - r-2. -1 ~i=l~ ), ~ = h(h_2)-1 for h = 3,4,··· ,r and Y = 0.5772 ••• (Euler's constant). (ii) Var ( s 2} = r 4b.n -1 + O(n -2 (log n) 2 ) zn (3.22) where b. = ~l PROOF. + 2(r-l)~2 + 2(r-l) To prove (i) let 2 ~3. 39 s ~'(2 zn = (n-1) -1 n 2:. ~=r+1 (Z.-U ) ~ n so that s2 = s*2 + r(n-1)-1(U -U )2. zn zn r n (3.23) }2 -1 r r(n-1) -1 e[u-u = rn Lh=l r n 2 Now (rh ) ~ + O(n -2 ). Next = (n-1) -1 n n [2:.~=r+lvar[z.} - 22:.~=r+lcov[Z.,u} ~ ~ n + (n-r)Var[U n }]. From (2.16) and the corollary to Theorem 2.2, Cov[Z.,U } = Var[Un } ~ n for i = r+1, •.• ,n, so that using Lemma 2.11 *2 = (n-1) -1 2:i=r+1Var n [Zi } - (n-1) -1 (n-r)Var [Un } (3.24) e [ szn} r (r) 2 -1 2 -2 = 2:h=l h ~Kn(h,r) - n r P1 + O(n ) where, for n > rand h = 1,2,··· ,r, Kn(h,r) = (n-1) -1 2:in=r+1 (i-1)-1 h [(i-2)h+1]. Note that Kn (l,r) = 1 - n -1 (r-1) + O(n -2 ) and that K (2,r) = 4,{'n-1) -12:~ +1(i-2) -1 - 2(n-1) -12:~ 1(i-1) -1(i_2) -1. n ~=r ~=r+ Now, let yn = n .-1 2:;... __ 1~ - log n for n > 2; then limn-+oo yn = y, where y = 0.5772 •.. is Euler's constant. Also, notice that (r-1) Then Kn (2,r) becomes -1 - (n-1) -1 . 40 (3.25) K (2,r) 4n n -1 r-2 -1 -1 -1 [y+10g n-L:i=li ] - 2n (r-1) + O(n 4n -1 -2 10gn) + O(n 10gn+n -1 -1 E:) n r-2 -1 -1 -1 [4y-4L:i=li -2(r-1) ] +o(n ). From the theory of infinite series L:~ i- 1 (i+1) -1 •.• (i+k) -1 = k-1(k~) -1 (m+k) -1 ~=m+1 k for k = 1,2, ••. and m = 0,1,···. Notice that the above infinite -k series is of order m for k = 1,2,···. Then, making use of this series result, for h = 3,4,···,r, we have that Kn(h,r) = n -1 n h(h~)L:i=r+1 (i-2) -n = n -1 -1 n <Xl 1 hi ~=r+ - -1 .•. (i-h) (h-1)(h~)L:. (i-l) ~=r+1 h (h : ) L:. -n -1 -1 ( i +1) <Xl (h-1) (h:)L:. -1 -1 -1 ···(i-h) • . . (i+h - 2) 1hi ~=r+ -. -1 (i+1) -1 -1 -2 +O(n) -1 ···(i+h-1) -1-2 +O(n) and so = n r (r) h -1 rL: =3 h \ (h-2) -1 [(r-2) h+2] + 0 (n -2 ). Combining (3.20), (3.23), (3.24), (3.25) and (3.26) gives us (3.21). To prove (ii) first set e = 0, without loss of generality. Then 41 *2 -1 n 2 s zn = (n-1) 2:.~=r+1Z.~ = An + (n-1) -1 (n+r)U 2 -1 + (n-1) 2rU U n r n B n where we have set (3.27) A n (n-1) -1 n 2 2:.~=r+lZ,~ and (3.28) Therefore (3.29) var[s*2} = Var[A } + 2 Cov[A ,B } + Var[B }. zn n n n n We now divide the proof of (ii) into two parts. In Part (a) we show that Var[A } is given by the expression in (3.22). In Part (b) n we show that Var[B } and Cov[A ,B } are each of order n n n n -2 log n or higher. PART (a). (3.30) From (3.27) Var[A } = (n-1) -2 2:.n IVar [ Z.2} ~=r+ ~ n + 2(n-1) ~2 n 2:. i-I [2 2} 12:. lCoV Z. ,Z .• J=r+ ~ J ~=r+ By Lemma 2.12 (3.31) (n-1) -2 2:.n ~=r+ -2 n 4 -1 1Var [ Z.2} = (n-l) 2: =r+l[r ~1 + O(i )] ~ i = r 4 ~ln -1 + O(n -2 logn). 42 By Lennna 2.13 (3.32) (n-1) -2 n ~. ~=r (n-1) i-I CoV [2 Z2} z.~ t J. J=r+l +l~' -2 n [4 [(r-1)~2+(r-1) 2 ~3]~.-1 + 0(4-1J.-1)} i-I ~i=r+1~j=r+1 r 4 = r [(r-1) ~2+(r-1) 2 ~3]n L -1 + O(n -2 2 (log n) ). Putting (3.31) and (3.32) into (3.30) yields 4 (3.33) Var [A } = r t:.n n PART (b). 2 that var[u } n -1 + 0 (n -2 2 (log n) ). From Lennna 2.3 (with m = n)t since O(n -2 ). (3.34) In order to prove Cov[AntBn } = O(n -2 log n) it is sufficient t by (3.28)t to show that (3.36) Cov[A tU U } = 0(n- 1 ) n r n and (3.37) = 0t we have It is then a simple matter to show from (3.28) that (3.35) e Cov [ An tUn2} = O(n -2 logn). We now prove (3.36). First 43 (3.38) 2 2 = cov[U ,U } + Var[U }Var[U } - [Cov[U ,U }]2 r n r n r n = O(n-1) by Lemma 2.3 (since e = 0), (2.1) and the corollary to Theorem 2.2. l Then (3.36) follows from (3.38), the fact that Var[A } = O(n- ) and n the Schwarz inequality. We now prove (3.37). From (3.27) and (2.24) 2 n (3.39) Cov[A ,U 2 } = (n-l) -1 r~. lCoV [2 z.,U 2} n n ~=r+ ~ n + (n-l) where Pi = -1 ~=2 (~) n 2 2r~.~=r+lCov[z.P.,u} ~ ~ n + (n-l) zih ) for i = r+l,r+2,'" ,no -1 n [2 2} ~.~=r +lCoV P.,U ~ n Now, using symmetry and (3.15), n +lCoV [2 2} (3.40) (n-l) -1 ~.~=r z.,U ~ n (n-l) -1 (n-r) Cov [2 zl' Un2} = (n-l)-l(n-r) (n) -l~r (r)(n-r) Cov[n-l~? z7 U(c)}. r c=o c r-c ~=l ~' n Notice that n -1 n ~. ~= 2 lZ' is a U-statistic, so that, applying Theorem 2.2, ~ we obtain cov[n-l~~ lz7,u(0)} ~= ~ n = O(n -2 ). Theorem 2.2 equals zero because q(O)(x) 1 1 (In this case the Sl of = 0.) For c = 1,2, ... ,r it is clear, by the Schwarz inequality and the fact that the variance 44 of a U-statistic is of order n -1 2 (c)} = O(n -1 ). that Cov [ n -1 ~.n 1z.,U ~= ~ n Thus (n-1) -1 ~.n (3.41) 1CoV [2 z,U 2} = O(n -2 ). i n ~=r+ In order to prove that n +lCoV [2 (n-1) -1 ~.~=r P.~ ,Un2} = O(n -2 log n) (3.42) 2 recall from (2.26) that var[p7} = O(i- ). ~ The Schwarz inequality 2 2 -1 -1 then implies that Cov[P.,U } = O(i n ) and (3.42» follows. n ~ To prove (3.43) n [ 2} (n-1) -1 ~.~=r +lCovz.P.,U ~ ~ n O(n -2 10gn) notice from (3.15) that n 2} (n-1) -1 ~.~=r +lCov[z.P.,U ~ ~ n = (n) -l~r (r)(n-r) (n_1)-1~~ Cov(z.P. U(c)}. r c=o c r-c ~=r+1 ~ ~' n We can therefore establish (3.43) by showing that (3.44) Cov[(n-1) -1 n ~. ~=r (1) } -1 +lz.P.,U = O(n ) ~ ~ n and (3.45) n (O)} (n-1) -1 ~.~=r +lCoV ( z.P.,U ~ ~ n O(n -2 log n). Now (3.46) n } } Var ( (n-1) -1 ~.~=r +lz.P. = (n-1) -2 ~.n~=r+lVar [ z.P. ~ ~ ~ ~ + (n-1) -2 2~.n i-1 1CoV [ z.P.,z.P .}. J=r+ ~ ~ J J 1~' ~=r+ 45 Then, from (2.38) and (2.52) in the proof of Letmna 2.13 Cov[z.P.,z.P.} ~ for r < j .:S i. (3.47) J J ~ = 0(i- 1 ) Thus, from (3.46), Var[(n-1) -1 ~. n +lz.P. ~ ~ } ~=r = O(n -1 ). Therefore (3.44) follows from (3.47), the fact that var[u(l)} n = 0(n- 1 ) and the Schwarz inequality. To prove (3.45) it is sufficient to prove (by (2.28» (h) (3.48) (O)} Cov[z.V. U ~ ~-1' n that = O(i -1 n -1 ) and *(h) (0) Cov[z.W.. } ~ ~~ , Un (3.49) for h = 2,3,"',r = O(n -2 ) = r+1,r+2, ••• ,n. and i Since z. is independent of ~ vi~i, Letmna 2.5 implies that for h ~ Also, var[U(O)} 2,3,"',r and i = r+1,r+2, ••. ,n. n = O(n -2 ), so that (3.48) follows from the Schwarz inequality. To prove (3.49) notice that for h *(2) (O)} -_ Cov [ z.W..,U ~ ~~ n • for i ~ J= ~ = r+1,r+2, ••• ,n. e[z.g(2)(X.,X.)f(X ~ ._ 1) -l(n)-l(n-r)-l r r (~ i-11~ (0) e[ z.g (2) (X.,X.)f(X ~. ~ J a1 =2 J a1 ,"',X ar )f(X Q ""'XQ ~1 ~r ) } But ,"',X ar )f(XQ ,"',X Q ~1 ~r )} =0 except possibly 46 when i appears among (a ,···,ar ) and j appears among 1 or vice versa. (~l""'~r)' The number of possibly non-zero terms is • 2 ( n-2) r-1 (n-r-1) r-1 Thus, (3.49) holds for h = 2. h = 3,4, .•. ,r follow in analogous fashion. The cases where This completes the proof of (3.49). 2 4 We have therefore shown that var[s*2} = r t.n- 1 + 0(n- (10g n)2). zn It is a simple matter to see that var[(n-1)-1(U -u )2} and r n -2 cov[s*2,(n-1)~1(U -u )2} are each of order n , and so, (3.22) is zn r n finally verified. Assume e[f(X ,··· ,X )} 1 r THEOREM 3.4. to (J 2 2 = r P1 almost surely PROOF. Assume that e n .... ~ 2 < 00. Then s 2 converges zn 00. = 0, without loss of genera1it~and recall that (3.50) s 2 zn where *2 -1 n 2 -1 2 -1 (3.51) szn = (n-1) ~i=r+1Zi - (n-1) (n+r)U + (n-1) 2rU U • r n n Clearly r(n-1) -1 (U -U) r n 2 converges to 0 almost surely as n .... 00. By Hoeffding [9] (or Berk [2]), the second and third terms in (3.51) each converge to 0 almost surely as n .... 00. From (3.27) and (2.24) (3.52) = (n-1) "'1 2 n r~. 2 -1 n +l z ~. + 2r(n-1) ~.~=r+lz.P. ~ ~ ~=r + (n-1) -1 n ~. 2 +l P ~.. ~=r 47 Now e[Z:} = ~ e[g(1)2(X.)} ~ = PI' so that, by the strong law of large 2 numbers, the first term in (3.52) converges to cr surely as n almost Thus, to complete the proof we need only show that ~~. . (3.53) 2 = r PI -1 n l~m (n-l)~. n"'~ 2 o +lP,~ ~=r (a. s.) and (3.54) (a. s.) . From (2.28) (r)w~~h) _ ~r (r) (h-l)V~h) 1h=2 h ~~ h=2 h ~-l P; = )'.r (3.55) ~ = r+l,r+2,···,n where W~~h) is given by (2.29). for i From ~~ Hoeffding [9], V~h) converges almost surely to zero as i ... ~ for ~ h = 2,3,···,r. Also, it can be shown, in a proof almost identical to that on pages 108-110 of Wilks [17] (due to Feller [5]), that w~~h) converges almost surely to zero as i ... ~ for h ~~ = 2,3,···,r. 2 Then, (3.55) implies that P., and therefore P., converges almost ~ ~ surely to zero and hence in Cesaro-mean a.s. - that is, (3.53) holds. Next, by the Schwarz inequality, I(n-l) -1 ~.n (3.56) so that (3.54) holds. If r REMARK 1. I -1 n ~. ~= This completes the proof. = 1, then W. ~n 2 2 Also, both sand s sample mean. (n-l) +lz.P. ~ ~ ~=r wn 2 - l(x.-x). ~ n = x., wn ~ Z. = x. and U = ~ ~ n xn , the reduce to the sample variance (For notational convenience, when r = 1, we 48 assume that f(x) = x.) 2 A few comments about the relative merits of sand wn . f a2=2 . or der. Both estimates are 0 r P are 1n s 2 as est1mates zn l 2 asymptotically unbiased. However, the bias in the case of s is of zn 2 lower order than that of swn' that is REMARK 2. BIAS(S2 ) zn -1 = r 2 (r-l) 2 02n -1 logn + An - 1 + o(n ) whereas the bias of s 2 is of order n wn .22 overest1mate a = r Pl. ~l Notice that s 2 tends to zn The variances of the estimates each have leading terms of order n possible to compare -1 -1 and See (3.14) and (3.22). ~, as ~2 + (r-l)~3 It is not may be either negative or non-negative depending upon f(xl, ... ,x ) and the c.d.£. F. The r 2 second order term of var[s2 } is of order n- (log n)2, which compares zn 2 with n- , the order of the second order term of var[s2}. The one wn advantage, and it is an important one, that s 2 zn has over s it is by nature more suited for sequential calculation. 2 wn In s is that 2 wn , each of the Win terms depend upon x ,x 2 ,·.·,x and must therefore n l (in general) be calculated at each stage of the sequential procedure, whereas, in s2 , Z. depends only on the first i observations and need zn 1 only be calculated once. More will be said about the computatio~ of 2 2 2 and s at the end of Chapter v. Theoretically, s appears a wn zn wn 2 2 2 as an estimate of a = r Pl· However, in little superior to s zn s 2 many cases, szn can be calculated with relative ease, and in such cases, might be preferred over s 2 wn Thus, the final choice of estimate depends upon the function f(xl,···,x ). r In Chapter V the sequential procedure is examined with respect to both s2 and s2 wn zn 49 ~. Define ~ 3.6 j = 2,3,'" f(x ,x ) 1 2 = (when eXistent). Assume that (X -X )2/2, so that 1 2 = e{x 1} and ~j e = e{(x1-~) j} for = ~4 < ~ e{(x 1-x 2)2/2} and ~2 ~2' = > O. Let The corres- ponding U-statistic is (3.57) Un where x n n 2 = (n-1) -1 L:.1= l(x.-x) = n(n-1) -1 (m2 -m 12 ) 1 n is the sample mean and m. J n j = n -1 L:.1= 1x. 1 . for J = 1,2, ••.. 2 2 Next, f 1 (xl) = e{(x 1 -x2 ) /2} = (x1-~) /2 + ~zl2 and 222 2 P1 = 1 = (~4-~2)/4, so that 0 = r P1 = ~4 - ~2' ° From (2.4) we obtain g(1)(x ) = (X1-~)2/2 - ~2/2 and 1 g (2) (x 1 ,x 2 ) = -(x1-~)(x2-~)' so that °2 = 2 ~2' Therefore, it follows from (2.7) that var{U } (3.58) n as is well known. We now present s~n and s;n' the estimates of 0 2 = ~4 - 2 ~2' From (2.11) W.1n for i = 1,2,'" ,n, and so, after some manipulation, (3.59) 3 The factor n (n-1) properties of s (3.60) 2 wn -3 in (3.59) may be omitted without affecting the to any appreciable extent. From (3.20) 50 where Z. for i 1. = 3,4,··· is given by Z.1. = iU.1. - (i-1)U.1.- 1 (3.61) and U. is given by (3.57). 1. In this example s 2 wn is just as easy to 2 2 calculate sequentially as szn' and so, is to be preferred over s zn . f 2 2 as an est1.mate 0 a = ~4 - ~2. From (3.13) of Theorem 3.1 (3.62) which indicates that the first order bias of s 2 wn From (3.21) of Theorem 3.3 (3.63) -1 + o(n ) where Y = 0.5772 •... The dominant term in the bias of s2 is zn 2 -1 4~2n log n, which is non-negative, and so, as we have already . . d , s 2 ten d s to overest1.mate a not1.ce zn n 2 = ~4 - 2 ~2. The term of order -1 . 1.n (3.63) may be negative or non-negative depending upon the c.d.f. F, and therefore, may either help to decrease or increase the bias. . . of s2 an d s 2 assume t h at To d eterm1.ne t h e var1.ances wn zn (as well as ~2 > 0). From the statement of Lemma 2.12 ~8 < = 51 Next, from section 3.2 = e[g(l)(X )g(2)(x X)} = go (x 1) 2 l' 2 Therefore, by (3.9) and, by (3.10) The variance of s 2 wn is then (3.64) The variance of s Var[s (3.65) where ~ = ~1 + 2~2 2 zn is given by 2 -1 -2 2 } = 16~n + O(n (log n) ) zn + 2~3· be either negative or non-negative depending upon the c.d.f. F, and so, it is 'not possible to compare the first order terms of (3.64) and (3.65). For this particular example we. might consider s 2 2 -1 -1 ::: n (n-2) (n-3) [S4 2 (n -3) (n-l) -2 2 s2] 52 as an estimate of n Var{U } where s. = 2:~ l(x.-x )j for j = 2,3,···. J n ~= ~ n The estimate s2 is motivated by the classical theory of sampling distributions of sample moments. Kendall and Stuart [10J. See, for example, Chapter 12 of The chief merit of s2 is that it is an unbiased estimate of n Var {U}. n As is evident from (3.59), which may be written as 2 3 Swn = n (n-1) 2 2 sand s wn with s2 wn -3 2 [s4-s2J, differ very little (especially for large values of n) being slightly easier to compute. concerned with large values of n, we favor s Since we are mainly 2 wn over s 2 as an estimate For further examples of U-statistics see Hoeffding [7J, [8J and Fraser [6J. CHAPTER IV CONTRIBUTIONS TO THE ASYMPTOTIC THEORY OF U-STATISTICS 4.1 Introduction. ~ The results of this chapter were developed with the idea of solving the fixed-width sequential confidence interval problem of Chapter V. We utilize the H-decomposition intro- duced in Theorem 2.6, namely U n = 8 + L;r h=l (r) V (h) h n ' to establish a Kolmogorov-like inequality for U-statistics (Theorem 4.2), and then, to show that, for y < 1/2, nY(U -8) converges n almost surely to 0 as n ~ ~ (Theorem 4.4). In addition, if for each positive integer s, N is a positive integer-valued random variable, s then Theorem 4.5 states that the U-statistic UN based on s x ,x ,···,x is asymptotically normal as s l 2 N s ditions. ~~, under certain con- Theorem 4.5 is used in Chapter V to establish the asymptotic consistency of the sequential procedure (Theorem 5.2). 4.2 Kolmo orov inequalities. h = 1,2,.··,r, s~h) = Theorem 2.6 states that for each (~)v~h) forms a martingale sequence. This fact is used to prove LEMMA 4.1. Assume that 0 < 6 < h ~ for ~ the following Kolmogorov-like inequality holds: h = 1,2, .• · ,r. for A > 0 Then 54 (4.1) PROOF. By Lemma 2.5, e[s~h)2}:::;: (~) 0h' Thus, by the Kolmogorov inequality for martingales (see page 399 of Loeve (11]), > 0, for any € Putting €:::;: AO l/2(n)" h h REMARKS. 1/2 completes the proof of (4.1). The Kolmogorov-like inequality (4.1) can be applied directly to show that for any y < h/2, n Yv~h) converges almost surely to 0 as n ~ See pages 109-110 of Wilks (17]. 00. However, a different approach leads to stronger convergence results, as we shall see in Theorem 4.3. In a somewhat classical approach a proof similar to that on pages 107-108 of Wilks (17], utilizing Lemmas 2.4 and 2.5, can be used to establish (4.1). We now use Lemma 4.1 to derive a Kolmogorov-like inequality for a U-statistic. From Theorem 2.6 where we have set S n THEOREM 4.2. °= r for A > O. r n - Assume e[f(Xl,···,X r )} 1/2 L: h=l (r) h 0h = ( n ) U for n > r. . Then 2 < 00 and °1 > 0, and let 55 PROOF. First, note that 0h < ~ for h = 1,2,···,r as a conse- quence of our assumption, Lemma 2.1(i), Lemma 2.7 and the Schwarz inequality. Define the events and for h = 1,2,···,r. If each of E ,E ,'" 1 2 ,Er occurs, then for a ~ r ISa- ( ~) e 1.$ (~) ~~=1 ( ~) (~ ) -11 S;h) I < (~)~~=1(~)(~) -lAO~/2(~) 1/2 r This implies that E E- lJh=lEh' so that, by Lemma 4.1 which completes the proof. 4.3 ~~~. THEOREM 4.3. The main theorem is Let (bnJ; be .~ positive increasing sequence of real numbers with limn-too bn =~. If, for ~ h = 1,2,'" ,r, 0 < 0h < and (4.3) then b~ls~h) converges almost surely to 0 as n ~ ~. ~ 56 PROOF. From Lemma 4.1 for any e > 0 (4.4) Then (4.3), (4.4) and the Borel-Cantelli lemma imply that (4.5) (a.s.). Next, define for j = 1,2,··· and Y n = s(~) 2J+n - s(~) for n = 1,2,···. Then [Yn}~ 2J is a martingale sequence, so that, by the Kolmogorov inequality for martingales (page 399 of Loeve [11]) (4.6) P[ T. > eb . } < e-2 b -2.e[ Y. }2 • J 2J 2J 2J (4.7) 2 j +l 2j A little computation shows that ( h ) - (h) constant 0 < K < 00. h' < K2 J for some Thus (4.3), (4.6), (4.7) and the Borel-Cantelli lemma imply that (4.8) -1 b.T. J"'oo 2J J lim. =0 (a.s.). Now, for each n, let j be the positive integer such that 57 Then, since [bnJ; is positive increasing, (4.9) for n = h,h+1,···. Combining (4.5), (4.8) and (4.9) completes the proof of the theorem. COROLLARY. If e (i) to 0 as n ... <\ > lIZ, then n < GO for -h/Z ~ (log n) h = 1,Z,···,r. -e (h) converges almost surely Sn GO. If Y < h/Z, then nYV(h) converges almost surely to 0 as (ii) n ... Assume 0 < n GO. If Y < 1, then nYR (iii) n converges almost surely to 0 as n ... 00, where R is defined bv (Z.6). n---";;'';;;''';;;=~~ PROOF. To prove (i) let bn = nh/Z (log n) e so that, since Ze > 1, (4.3) becomes To prove (ii) let b n = nh- y • Then, since h - Zy > 0, (4.3) becomes GO ~. J= 1Z -j(h-ZY) < GO. Thus ny-hS(h) converges almost surely to 0 as n ... GO, which is equivan lent to (ii). Part (iii) follows directly from (ii). REMARK. result. Theorem 4.3 may be easily generalized to the following Let [bnJ; be a positive increasing sequence of real numbers with lim b n"'GO n = GO. If, for n > r, S forms a zero-mean martingale n 58 sequence such that els IS} n = O(nh ) for some s > 1 and some h> 0 and then b- 1 S converges almost surely to 0 as n n THEOREM 4.4. Assume e(f(Xl,···,X )} r 2 < 00 and that °1 > Y < 1/2, then nY(Un-Q) converges almost surely ~ 0 ~ n ... PROOF. O. If 00. The theorem follows directly from the H-decomposition (2.6) and Corollary (ii) of Theorem 4.3. 4.4 The asymptotic normality of UN' In this section we extend to U-statistics Anscombe's theorem on the asymptotic normality of averages of a random number of I.I.D. random variables. From Theorem 2.6 recall that the H-decomposition is U = Q + rv(l) + R n where Rn (h) = ~hr= 2 (r) h Vn THEOREM 4.5. n (1) and Vn n = n -1~.nL= l(f 1 (x.)-Q). L Assume e(f(X ,· .. ,X )} 1 r 2 < 00 and P1> O. Denote the standard normal c.d.f. ~ ~(x).~ (n } be 1!!! increasing sequence of s .£!: !! 00 ~ s -+ proper random variables taking ~ positive integer values positive integers tending J:.£ -1 00, p - lims-+oo n s Ns ~, with cr2 and (N } s sequence of ~ ~ = 1. = lim p(U -Q) < n- 1/ 2xcr} = s-+oo - s N ~(x). s PROOF. Anscombe [1] introduced the following situation. Let 59 [Y } be a sequence of random variables. n Assume that there exists a real number Q, a sequence of positive numbers [w } and a c.d.f. F(x) n such that: Cl. For any x such that F(x) is continuous lim n-+QO C2. pryn -Q -< xwn } = F(x). Given e > 0 and Tl> 0 there exists a large \) c > 0 such that for any n > \) p[lyn ,-yn I < ew n e, e, Tl and a small Tl for all n' such that In'-nl < cn} > 1 - Tl. Theorem 1 of Anscombe [1] states that if [Y } satisfies Cl and C2, n then lim S-texl p[Y -9 < xw } = F(x) N n s at all continuity points of F(x). s Let C3 be the condition that [w } n is decreasing, tending to 0 as n -+ QO and limn-+QO w-+llw n n = 1. Theorem 3 of Anscombe [1] states that C2 is satisfied if Y is the average of n n IoIoDo random variables, if Cl and C3 hold, and if F(x) is continuous. We now apply these results to our situation. so that C3 is satisfied. Cl with F(x) = ~(x). n = n -1/2 cr Hoeffding [7] has shown that U satisfies n (See Remark 2 following Theorem 2.6.) show that U satisfies C2. n of Anscombe [1]. Here w We now First, rv(l) satisfies C2 by Theorem 3 n Thus, from the H-decomposition, given e > 0 and Tl > 0 there exists a \) e, Tl and a c > 0 such that p[lun ,-Un -Rn ,+Rn I < ecrn- 1/2 for all n' such that In'-hl < cn} > 1 - Tl ·60 for all n > v e, and hence ~, 'I pf IUn ,-Un I- IRn ,-Rn I < (4.10) ecrn -1/2 for all n' such that /n'-nl < cn} > 1 - 11 for all n> v lim n"' oo ~. By Corollary (iii) of Theorem 4.3, e, 'I 1/2 n R = 0 (a.s.). Thus given e > 0 and 11 > 0 there exists n an N' ~ such that e, 'I pflRn ,-Rn I < for all n> N' e, ~ ~crn-1/2 for n' = n,n+1 ,···,n+k} > 1 - ~'I v and for k 'I = 0,1,···. n > N implies that n(l-c) > N' ~. e,11 e"1 (4.11) pf IRn ,-Rn I < ecru Let N ~ e"1 Therefore = (l_c)-lN, ~. e"1 Then -1/2 for all n' such that In'-nl < cn} > 1 - 11 for all n > N ~. e, 'I = max(v Let v A = flun ,-un I-IRn ,-Rn I < B = fiR n ,-Rn I < ecrn- 1/2 ecrn- 1/2 ~,N e"1 ~). e"1 Define the events for all n' such that In'-nl < cn}, for all n' such that In'-nl < cn} and C = flU n ,-un I < Therefore A 2e cr n- nB~ 1/2 for all n' such that In'-nl < cn}. C, and so, from (4.10) and (4.11) P(C) ~ P(A n B) = P(A) - P(A n B) > 1 - 211 for all n > v. Thus U satisfies C2, and the asymptotic normality of n 61 UN follows from Theorem 1 of Anscombe [1]. The following corollary is a consequence of Theorems 3.2 and 4.5. Under the assumptions of Theorem 4.5 COROLLARY. . f -1N (UN -8) _< x } =@(x) 11ms"'oo P n s1/2 s w (4.12) s where swN is given ~ s (3.12). s REMARKS. As a result of Theorem 3.4, swN in (4.12) may be s replaced by szN ' which is given by (3.20). As a special case of s (4.12) it follows that n 1/2 -1 s (U -8) is asymptotically normal with wn n mean 0 and variance 1 as n ... s. for s .) zn wn 00. (We may, of course, again substitute CHAPTER V SEQUENTIAL FIXED-WIDTH CONFIDENCE INTERVALS FOR REGULAR FUNCTIONALS 5.1 variables. ~~. . Assume that Xl ,X 2 ,··· are I.I.D. random Let f(xl,·.·,x r ) be the symmetric kernel of aU-statistic U whose expectation is 8. n The problem is to find a sequential confi- dence interval for 8 of fixed-width 2d, where d > 0, and such that the coverage probability either equals, or approaches in some way, a specified a, where ° < a < 1. The problem was solved by Chow and Robbins [3] for a special U-statistic, the sample mean. To adapt their procedure to deal with a general U-statistic is the "raison d'etre" of Chapter V. Chow and Robbins [3] use n -1 2 2 n n s , where s is the sample variance, to estimate the unknown variance of the sample mean. In section 5.2, n -1 s 2 is used as an estimate of the unknown variance wn . sect10n . 5 • 3 ,n -1 s 2 .1S use d • o f U ,and 1n n zn The sequential procedure may be simply described as follows: at each stage of sampling the U-statistic U and an estimate of its n variance are calculated, and sampling is terminated as soon as the approximate coverage probability for the interval [Un -d,Un +d], based on a normal approximation, is at least a. It is shown that the coverage probability is, in a certain sense, asymptotically a; that is, the sequential procedures are consistent (Theorem 5.2). It is also shown that the expected sample size of the procedures is asymptotically equal to the sample size of the corresponding non-sequential scheme 63 used when the variance of the U-statistic is known (Theorems 5.3 and 5.8); that is, the sequential procedures are efficient. In section 5.4 the procedures are illustrated with the estimation of (1) the variance of Xl' and (2) the probability of concordance for bivariate Xl' 5.2 "a" The se uentia1 procedure using s 2 wn • For 0 < a < 1 define (> 0) by (2TI) -1/2 J+a 2 exp(-u /2)du = a. -a Let fa n } be a sequence of positive real numbers such that limn-KlOa n = a. For d > 0 define the stopping variable (5.1) 2 N(d) = smallest integer k > r such that swk Define a closed confidence interval Notice that N and ~ ~ ~ = [UN-d,UN+d] 2 -2 kd a . k of width 2d. have properties similar to those stated in the theorem appearing in Chow and Robbins [3]. LEMMA 5.L (i) (ii) (iii) (iv) Assume e.ff(X 1 ,···,X )} r N(d) is well-defined and is 1im N(d) d"'0 = CIO 2 < CIO and P1> O. ~ Then non-increasing function of d, (a.s.), -- limd e.fN(d)} = CIO,and "'0 --2 -2 -2 1imd a cr d N(d) = 1 (a.s.). "'0 2 Recall from Theorem 3.2 that limn-.oo s2 wn = cr (a.s.). Let -2 2 -2 2 -2 2 2 Yn = cr swn' fen) = an na and t = d a cr. Then all parts of the PROOF. lemma follow from Lemma 1 of Chow and Robbins [3]. THEOREM 5.2. Then 64 Hence, i2!: sufficiently short intervals, the coverage probability is approximately equal to PROOF. replaced by t ct. = d-2a 2a 2 Let t -1 2 2 a a. and let Nt be defined by (5.1) with d (Note that Nt and identify Nt with N and t with s p[Q E ~} = p[IUN -QI t < d} = N(d).) n • s 2 Refer to Theorem 4.5 Then = p[IUN -QI l 2 < t- / aa} t and so the theorem follows. THEOREM 5.3. d2 a -2a -2 e[ N(d)} limd "'0 (5.2) = 1. Before we tackle the proof of Theorem 5.3 we establish a series of four lemmas, which are required in the proof. Let x l ,X ,··· be Ll.D. random variables and Yn .! 2 LEMMA 5.4. function of Xl ,X2 ,··· ,Xn • For each t > 0 let Nt be.! positive integer-valued random variable depending event [Nt = n} ~ (X ,X ,···) such l 2 PROOF. is in 8 , the a-field generated bv [x ,x ,···,X } for l 2 n -~ n -- = 00 (a.s.), then - limt ... oo YN =Q If lim Y =Q Yn = Q} n.-.co n (a.s.) . t Define the events = [(X l ,x2 ,···)llimn-+ and the -- n = 1,2,··· (i.e., Nt is.! stopping variable). and limt .-.co Nt ~ OO (a. s •) 65 Then P(A) = P(B) = 1, which implies that P(A n B) easily be shown that A nBc C. LEMMA 5.5. Thus P(C) If e[lf(X ,···,x ) I} < 1 r 00, = 1. But, it can = 1. then [U }oo is a reverse -nr-- martingale. PROOF. The proof appears in Berk [2], but, because of its simple nature and the fact that it is referred to several times in the ensuing pages, is repeated here. Let n> m> rand (a ,···,a ) be 1 r any r-combination from [1,2,··.,n}. Then e[f(Xa , ••• ,Xa )IU,U +1'···} n n 1 r = e[f(X ,···,X )IU,U 1 r n n+1 , ••• } = q (say). Sum over all (~) combinations and obtain :E(n,r)e[f(Xa ,· •• ,Xa )IU,U +1' ••• } n n 1 r so that e[un Iu n ,U n+1'···} = q. That is Un = =q (n) r q (a.s.). Next, sum over all (:) combinations from [1,2'''.,m} and obtain e[UmIUn,Un+1'···} = q. Thus, e[UmIUn'Un+1'···} = Un (a.s.), and [U }oo is a reverse martingale. n r LEMMA 5.6. (5.3) PROOF. This method of proof by truncation is similar to that in Hoeffding [9] and Sen [13]. Also see Siegmund [14]. Define 66 f I (x 0" ••• x , 1 and f"(x 0'1 0' ) r ,···,x ) f(X = 0'1 { ,···,x ) O'r O'r o = f(x 0'1 otherwise , ••• ,x ) O'r S = ~(n,r)f(x nO" f'(x 0'1 , ... ,x ). O'r ••• x ) , 0' ' 1 r and S"n = ~(n,r)f"(x O" (a) 1 ••• , x0' ) • r To prove e(sup n-(r+e)ls'll < n 00, n note that < sup n -(r+e)~(n,r)( ~ max.O'. )e/2 J J n < sup ~(n,r)( n ~ max.O'. )-r-e/2 J J n (j-l) < SUPn~j=r r-l 00 .-I-el2 < ~.J=r J (b) < 00. To prove e( sup n- (r+e) Is" 11 < n n .-r-e/2 J 00, note that Then set 67 e(sup n-(r+€)ls"l} < e(sup n-(r+€)E(n,r)lf"(X ••. X )I} n n nO:'" 0:' 1 r =e(su Pn n-(r+€)E~ E(j-1,r-1)\f"(X.X .. ·X )I} J=r O:"s J' 0:'2' , O:'r < e(E~ J.-(r+€)E(j-1,r-1) If"(X. X J=r 0:" s J' 0:" - 2 < Ea: - = J=r ••• X , ar ) I} J.-(r+€)E(j-1,r-1) e( If"(X. X ••• X ) \} 0:" S J' 0:" , 0:' 2 r E~=rj-(r+€) (;=i) I \f(xl"" ,x r ) 1I1~=ldF(Xi) [If( ••• ) l>jE:!2] < - b E~ J,-1-€ r J:::r < co where we have set bj = I If(x 1 ,··· ,x r ) I I1~=ldF(Xi) [!f(x , ... ,x ) \>j€/2] r 1 for j = r,r+1,"', so that b. > 0 and b. > b. 1.' J+ J J (c) Finally, we have S = S' + S" and n n n e(sup n-(r+€) Is I} < e(sup n-(r+€) Is' n n n n IJ + e(sup n-(r+€) Is"l} n n which, along with (a) and (b), proves the lemma. A positive integer-valued random variable M depending on (X ,X ,· .. ) such that, for n = 1,2,"', the event (M = n} is in :a~, 1 2 the a-field generated by variable". (xn ,xn + 1 ,···}, is called a "reverse stopping The following lemma appears in Simons [15] and follows 68 from Theorem 2.2 on page 302 of Doob [4]. !:.2! Z_m , ••• , Z-m be.! martingale where LEMMA 5.7. 2 < co and ~ M be .! reverse stopping variable ~ e[z_M} = e[Z_m -co 1 !!!!h < ml < m2 P[m l .:s M.:s m2 } = 1. }. 1 PROOF OF THEOREM 5.3. (a) As in Simons [15] define a reverse stopping variable for d> 0 by last integer n > n - 0 such that s2 > nd 2a- 2 wn n if there is such an n (5.4) M = if s2 no - 1 < nd 2a-n 2 for all n >- n0 wn- if s2 > nd 2a- 2 infinitely often wn n 00 where n 0 > r + 1. - Let I represent the indicator function and define t and Nt as in the proof of Theorem 5.2. < d- 2 2s 2 + n - ~ wM I 0 Then for every t > 0 [M> n -1] - 0 2 2 < ta-2 (J-2 a..S M + n0 • MW - Thus, for every t > 0 -1 [ } -2 -2 [ 22} -1 teNt .:s a (J e ~swM + t no. (5.5) (b) = 2 (J • From expression (3.18) in 69 the proof of Theorem 3.1 (5.6) Define Z(c) -n z(c) _00 = lim = U(c) n n~oo Po = 0.) and Z(c) = lim z(c) for c = O,l,···,r. Then -00 n~oo -n 2 U(c) = Pc + 9 (a.s.) for c = O,l,···,r. (Recall that n Then, by Lemma 5.5, {z(c)' is a martingale. _00 •• "z(c)} -r fore, from Lemma 5.7 with m1 = no - 1 and m 2 = 00, There- we obtain t{u(c)} = t{u(c) } = M n -1 (5.7) o = O,l,···,r. for c In particular, t{U~l)} = P1 + 9 2 and t{u~O)} = 9 2 . From (5.1) and (5.4) note that, for every t > 0, Nt as a consequence of Lemma 5.1(ii), 1im t Lemma 5.4 implies that Now an(c) c c = O(n-1 ), = O,l,···,r. = O,l,···,r. 1imt~00 u~c) 1imt~00 so that = Pc ~oo ~ M + 1, so that, M = 00 (a.s.). + 9 2 (a.s.) for c = O,l,···,r. (c) aM(c)UM = ° (a.s.) for Furthermore, by Lemma 5.6, t{sup a (c)lu(c)l} < 00 for n n n We then use the Lebesgue dominated convergence theorem to obtain . (5.8) r l~mt~oo ~c=ot { (c)} aM(c)UM = 0. Finally, from (5.6), (5.7) and (5.8) we conclude that lim (5.9) (c) Also, t ~oo t{s2 } = (J 2 • wM Guided by Pratt [12], we next show that (5.10) From Lemma 5.4 it follows that 1imt~00 s;M = (J2 (a.s.) and 70 2 .2222 2 11m 8 S M= a a (a.s.). Now, let A = inf a and B = sup a . t~ nn n n Mw 222 2 for every t > 0, As wM -< a.s M < Bs wM' Thus MW - Then, and, by Fatou's lemma, (5.11) Also o< 222 BCJ - a CJ· and, by invoking Fatou's lemma once more, (5.12) = BCJ2 - lim sup t-+oo 2 2 e(aM s }. wM Then (5.10) follows from (5.11) and (5.12). (d) We conclude from (5.5) and (5.10) that lim SUPt-+oot-1e(Nt};; 1. lim inf -+ oo tt 5.3 1e(N } ~ t 1. The sequential However, Fatou's lemma implies that This completes the proof of Theorem 5.3. rocedure usin s 2 The results of zn section 5.2 also hold if s2 is used as an estimate of CJ2 zn Throughout this section N(d) is defined by (5.1) with s;k substituted 2 for swk' Results analogous to Lemma 5.1 and Theorem 5.2 follow immediately. We now consider the analog of Theorem 5.3. THEOREM 5.8. Assume e(f(X 1 ,··· ,Xr )} 2 < 00 and P1 > O. Then 71 (S .13) PROOF. (a) Examine the proof of Theorem S.3. analogy to (S.S), for every t> (S .14) -1 ( } teNt -2 2 2 where t = d a cr. It is clear that, in ° -2 -2 ( 2 2 } + t-ln ~ a c r e ~szM 0 In order to establish (S.13), it is sufficient to prove 2 (S.lS) cr • For, assume for the moment that (S.lS) is true. Then it is easy to see that Part (c) of the proof of Theorem S.3 can be applied, as it stands, to prove that = a 2a 2 • (S.16) As a result of (S.14), (S.16) and Part (d) of the proof of Theorem S.3, it follows that (S.13) is true. We therefore begin the proof of (S.lS). without loss of generality. 2 szn = s *2 zn First, set Q = 0, Then from the proof of Theorem 3.3 + r(n-l) -1 (Ur -Un ) 2 and *2 s zn = (n-l) -1 ~.n1=r+12.12 (n-l) -1 2 n (n+r)U + (n-l) -1 2rU U • r n We establish (S.lS) by proving each of the following four statements: 72 (S.17) lim 2 t ..... CD = a , e(M-l) -ll!: 1.=r+lZ:} 1. (S.18) and (S.19) (S.20) (b) Proof of (S.17). Z2. = r 2 z.2 + 2rz.P. + P.2 1. 1. 1. 1. 1. (S.2l) where Pi = From the proof of Lemma 2.12 recall that ~=2 ( ~ ) z~h) for i = r+l,···,n. and Now write 2} -1 M 2} ( M 2} (S.23) e(M-l) -LM L;. 1.=r+lz.1. = e(M-r) L:.1.=r+lz.1. + e b(M)L:.1.=r+lz.1. where b(M) = OeM -2 ). . 1e, Cl ear 1y, ( n-r ) -l"n ~. lZ.2.1.S a reverse mart1.nga 1.=r+ 1. so that, by Lemma S.7 2 (S.24) Recall that lim t .....CD M = a • CD (a.s.). Therefore, from Lemmas S.4 and S.6 and the Lebesgue dominated convergence theorem, we obtain 73 1im t "'co e[b(M)~~=r+lz2} = i (5.25) o. Putting (5.24) and (5.25) into (5.23) yields (5.26 ) Because of (5.21) and (5.26), in order to prove (5.17), we need only prove (5.27) and . (5.28) l~mt ...co M } e [ (M-1) -1 2:.~=r+ 1z.Pi = ~ o. But, by the Schwarz inequality (for both the sununation and the expectation), Ie[(M-l) -1 2:.M ~=r (5.29) I e [ «M-l) -1 2:.M +l z 2.) 1/2 +lz.P.} ~ ~ -< ~=r [ -< [e (M-1) -1 M ~ 2} 2:.~=r +l z ~. [ «M-1) e (M-l) 2 V 2} -1 M -1 M 2:.~=r +lP.) ~ 2 2:.~=r +lP.}] ~ 1/2 • From (5.29) and (5.26), notice that (5.27) implies (5.28). We now prove (5.27). (5.30) From the proof of Lenuna 2.12 recall that (h) *(h) _ (h-1)V(.,h) 2i = Wii ~-l so that (5.31) for h = 2,3,···,r and i = r+1,···,n. Then, because of (5.22), (5.31) 1"'lIII 74 and an argument similar to (5.29), in order to prove (5.27) it is sufficient to prove . l~mt"'00 e[(M-1) (5.32) -1~ *(h)2 ~.~=r+lW"~~ } = 0 and . l~mt...00 (5.33) [ e (M-1) -1 M (h)2} ~.~=r~+lV. 1 = 0 for h = 2,3,···,r. To prove (5.32), for h = 2,3,···,r; c = O,1,···,h-1 and i = r+1, ••• ,n, define [ (i-1) (h-1) ( i-h ) ]-1 h-1 c h-1-c (5.34) • (c) (h) g ~ (x.,x J a2 , ••• ,x)g (h) (x.,xj:l , ••• ,xj:l ) ah J P2 Ph where the summation ~(c) is over all combinations (a ,·.·,a ) and h 2 (13 2 "",13 h ) each chosen from [1,2, ••• ,i-1} and such that there are exactly c integers in common. defines the u(c)'s.) n (Compare (5.34) with (2.3) which Then, in analogy to (3.15), 1c(h)2 = (i-1) -1~h-1 (h-1) ( W~~ .. (5.35) h-1 c=o for h = 2,3,· .. ,r and i = r+1,···,n. c i-h ) V~h,c) h-1-c ~ Because of (5.35), (5.32) becomes (5.36) lim t~ + lim e[ (M_1)-1~~ ~=r+1 (i-1) -1 (i-h) V~h,O)} h-1 h-1 ~ ~h-1 (h-1) e[(M_1)-1~~ (i-1) -1 ( i-h ) v~h,c)} = 0 c ~=r+1 h-1 h-1-c ~ t ...oo c=l 75 for h = 2,3,···,r, which is to be proved. We now examine the first term of (5.36), which may be written as lim (5.37) t ....oo e[(M-l)-l~~ V~h,O)} 1.=r+l 1. + lim t ....00 e[(M-l) -l~Mi +la(h,i)V~h,O)} =r 1. l where a(h,i) = O(i- ) for h = 2,3,···,r. Notice that the first term of (5.37) is equal to lim (5.38) t .... oo e[(M-r) + lim t .... oo -l~~ V~h,O)} 1.=r+l 1. e[b (M) ~~ V~h ,0) } 1.=r+l 1. for h = 2,3, ••• ,r and where b(M) = O(M -2 ). By Lemma 2.4, e[v~h,O)} = 0 for h = 2,3,.··,r and i = r+l, ••• ,n. 1. (n_r)-l~~ Also, v~h,O) is a reverse martingale (the proof is similar to 1.=r+l 1. that of Lemma 5.5) so that, by Lemma 5.7, (5.39) for h = 2,3,··· ,r. Next, Theorem 4.3 can be adapted to show that lim b(n)~~ lv~h,O) = 0 (a.s.), and also, Lemma 5.6 can be 1.=r+ 1. n.... oo adapted to show that e[sup b(n)~~ +llv~h,O) I} < n 1.=r 1. 00, for h = 2,3, ••. ,r. Thus, from the Lebesgue dominated convergence theorem (5.40) lim for h = 2,3,···, r • t .... 00 e[b(M)~1.=r+lv~h,O)} = 0 1. From (5.38), (5.39) and (5.40) we find that the first term of (5.37) equals zero. In a similar fashion it is possible to shoN that the second term of (5.37), and hence, the first term of (5.36), equals zero. 76 We now examine the second term of (5.36). Again, Theorem 4.3 can be adapted to show that (5.41) (a.s.) and a proof similar to that of Lemma 5.6 demonstrates that n (h c)l} e[supn (n-l) -1 ~.1=r +li -11 V.' < ~ 1 (5.42) for h = 2,3,"',r and c = O,l,···,h-l. Then (5.41), (5.42) and the Lebesgue dominated convergence theorem combine to show that the second term of (5.36) equals zero. This completes the proof of (5.32). The proof of (5.33) is similar to the proof of (5.32) and is therefore omitted. (c) We have thus established (5.27), and hence, (5.17). Proof of (5.18), (5.19) and (5.20). From the proof of Theorem 3.1 recall that u 2 = (n) -1 ~ r ( r ) (n - r) U(c) r-c n n r c=o c (5.43) with U(c) given by (2.3). n for c = O,l, ••• ,r. Notice that (n ) -1 ( r ) (nr-_cr) c r For c = 0, by Lemma 5.7, e[u~O)} = 0. = O(n -c ) For c = 1,2,"',r, by Lemma 5.6, (5.44) Also, for c = 1,2, ••• ,r (5.45) (a.s.) • Thus, by (5.43), (5.44), (5.45) and the Lebesgue dominated convergence theorem, (5.18) holds. Both (5.19) and (5.20) can be easily proved using the Schwarz inequality. This completes the proof of (5.15) and 77 the theorem. 5.4 ~. EXAMPLE 1. We continue our discussion of the example of section 3.6 in which Q = var{X }. 1 To be specific let a k = a for k = 2,3,··· although any positive sequence {a } such that 1i~~ak= a k would do since we are only investigating asymptotic behavior. From (5.1) define for d > 0 2 2 N(d) = smallest integer k> 2 such that s2 k < kd aw - (5.46) 2 where swk is given by (3.59). Then IN = [UN-d,UN+d] is a sequential confidence interval for Q = having width equal to 2d and coverage ~2 probability approximately equal to a, for small values of d. The sequential procedure is asymptotically efficient in the sense that (5.2) holds. In this case s2 is not difficult to calculate sequenwn tia11y as it depends only on the first four sample moments. We could also define N(d) by (5.46) with s;k replaced by s~k 2 where szk is computed using (3.57), (3.61) and (3.60), in that order. However, for this example, the procedure using s;k is to be preferred. Note, incidentally, that the sequential procedure is invariant under a location shift. (1) (2) x(2» ), ••• ,xn = (x(l) n ' n is a bivariate random sample of a random variable X = (x(l) ,x(2» with EXAMPLE 2. Suppose that xl = (xl ,xl continuous marginal distribution functions. s(u) -1 u < 0 0 u =0 ! = +1 u> 0 Let 78 and f(x ,x ) = s(x(1)_x(1»s(x(2)_x(2» 1 2 12· 1 2 The corresponding U-statistic is (5.47) and is referred to as the difference sign covariance of the sample. See Hoeffding [7]. Two points xl and x are said to be concordant 2 = +1 and are discordant if if s(x(1)_x(1»s(x(2)_x(2» 1 2 1 2 s(Xi1)-x~1»s(xi2)-x~2» = -1. Let (5.48) TI = P[X 1 and X are concordant} = p[(X~1)_x~1»(X~2)_x~2» 2 > OJ. Then the expectation of the U-statistic is Q = e[s(x(1)_x(1»s(x(2)_x(2»} = 2TI - 1. 1 2 1 2 Now, let C equal the n number of concordant pairs among x ,x ,· •• ,x ' n 1 2 (5.49) Un = 4n -1 (n-1) -1 Then (5.47) becomes Cn - 1. Next, for i = 1,2,"',n, let T. equal the number of points among J.n n Then~. J.= 1T.J.n = 2Cn n and Cn = ~i=2Tii' so that Cn+ 1 = Cn + Tn+1,n+1' ·that W. = 4(n-1) -1T. - 2 and so To determine s 2 notice wn J.n J.n W.J.n - Wn = 4(n-1) -1 T.J.n for i (5.50) Define = 1,2,···,n. 8n -1 (n-1) -1 Thus, after some arrangement 2 -3 n 2 s wn = 16(n-1) [~.J.= 1T.J.n C n 79 if xi and xn+ 1 are concordant 1 = 'a. (n+1) ~ for i = 1,2,···,n. to otherwise Then T.~,n+1 = T. ~n + a.(n+1) for i ~ = 1,2,···,n. Notice that the T. 's may be arranged in a triangle as follows: ~n T12 T 22 T 13 T 23 1n T 2n ·T·· T 33 T nn Suppose that the observations x ,x ,··.,x have been taken and that n 1 2 C ,T ,···,T have been determined numerically. n 1n nn Now, if a further observation xn+1 is taken, then Tn+ 1 ,n+1 can be determined either by plotting x + 1 and comparing with x ,x 2 ,···,xn ' or otherwise. n 1 = Cn compute Cn+1 from Cn+1 for i = 1,2,···,n, + Tn+ 1 ,n+1. Then The Ti ,n+1's are determined, from the last row of the above triangle, and 2 finally, s given by (5.50) • w,n+1 is 2 The estimate s is given by zn (5.51) 2 -1 n 2 s zn = (n-1) [L 3Z . ~= ~ nu~ + 2U~] where, for n > 2, (5.52) Z = nU Suppose that Cn' Un and n n n 2 ~i=3Zi observation xn+ 1 is taken. (n-1)Un- 1. are known numerically and a further Determine Tn+ 1 ,n+1 and Cn+ 1 as before. 80 Compute Un+ 1 from (5.49) and Zn+1 from (5.52). Then, finally s 2 z,n+1 can be computed from (5.51). Define N(d) using either s IN = 2 wn or s 2 zn as an estimate of cr [UN-d,UN+d] is a sequential confidence interval for Q 2 = 2n Then - 1 having fixed-width equal to 2d and coverage probability approximately equal to ~, for small values of d. 81 BIBLIOGRAPHY [1] Anscombe, F. J. (1952). Large-sample theory of sequential estimation. ~. Roy. Statist. Soc. Sere B. 48 600-607. [2] Berk, R. H. (1966). Limiting behavior of posterior distributions when the model is incorrect. Ann. Math. Statist. 37 51-58. -- - [3] Chow, Y. S. and Robbins, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. ~. Statist. 36 457-462. [4] Doob, J. L. (1953). Stochastic Processes. Wiley, New York. [5] Feller, W. (1957). An Introduction to Probability Theory and its Applications, 1. Second edition. Wiley, New York. "" [6] Fraser, D. A. s. (1957). Nonparametric Methods in Statistics. Wiley, New York. [7] Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19 293-325. -- --- ""-"V [8] Hoeffding, W. (1960). An upper bound for the variance of Kendall's tau and of related statistics. In Contributions to Probability and Statistics, edited by I. 01kin and others. Stanford University Press, Stanford. [9] Hoeffding, W. (1961). The strong law of large numbers for U-statistics. Institute of Statistics Mimeo Series No. 302, University of North Carolina, Chapel Hill. [10] Kendall, M. G. and Stuart, A. (1958). The Advanced Theory of Statistics, 1. Hafner, New York. [11] Loeve, M. (1960). Probability Theory. Second edition. Van Nostrand, New York. [12] Pratt, J. W. (1960). On interchanging limits and integrals. Ann. Math. Statist. 31 74-77. [13] Sen, P. K. (1960). On some convergence properties of Ustatistics. Calcutta Statist. Assn. Bull. 10 1-18. "" ---""-"V 82 [ 14] Siegmund, D. (1969). On moments of the maximum of normed partial sums. Ann. Math. Statist. 40 527-531. -- --- [15] Simons, G. (1968). On the cost of not knowing the variance when making a fixed-width confidence interval for the mean. Ann. Math. Statist. 39 1946-1952. -- --[16] Starr, N. (1966). The performance of a sequential procedure for the fixed-width interval estimation of the mean. ~. Math. Statist. 37 36-50. [17] Wilks, S. S. (1962). Mathematical Statistics. Wiley, New York.
© Copyright 2025 Paperzz