On the Uniformity of Sequential Ranking Procedures*

by Raymond J. Carroll
Department of Statistics
University of North Carolina at Chapel Hill

May 9 1976

Institute of Statistics Mimeo Series No. 1067

* This research was supported in part by the Office of Naval Research Contract N00014-67-A-00014 and in part by the Air Force Office of Scientific Research under Grant No. AFOSR-75-2796.

SUMMARY

In the class of ranking procedures based on sequential confidence intervals of fixed length, the probability statements are not uniform in the scale parameter. Further, the available generalizations to rankings based on stochastic orderings are not uniform over the parameter space $\Omega$. Two methods are proposed for solving such problems; the first is based on the theory of weak convergence, while the second is more direct but only solves the problem when $\Omega$ is compact.

American Mathematical Society 1970 subject classifications: Primary 62L99; Secondary 62F07.

Keywords and phrases: Ranking and Selection, Sequential Confidence Intervals, Uniformity in Weak Convergence.

1. Introduction

In a certain class of sequential ranking and selection procedures based on fixed-length confidence intervals (Geertsema (1972), Chow and Robbins (1965)), the standard work assumes an unknown d.f. $F_{\mu,\sigma}(x) = F(\sigma^{-1}(x-\mu))$ and provides, for each fixed $F$ and fixed k-vector $\boldsymbol{\sigma}$, a statement

$$\lim_{d\to 0}\ \inf_{\Omega(d)} \Pr\{CS \mid \boldsymbol{\mu},\boldsymbol{\sigma}\} = p^*,$$

where $d$ is the length of the preference zone $\Omega(d)$, the infimum is taken over $\Omega(d)$,
and CS denotes a correct selection. However, the main historical concern in ranking theory has been lower bounds for $\Pr\{CS \mid \boldsymbol{\mu},\boldsymbol{\sigma}\}$ uniform in both $\Omega(d)$ and certain sets of nuisance parameters, so that in fact one requires

$$\lim_{d\to 0}\ \inf_{\boldsymbol{\mu},\boldsymbol{\sigma}} \Pr\{CS \mid \boldsymbol{\mu},\boldsymbol{\sigma}\} = p^*. \tag{1}$$

In the general problem, the distributions take the form $F_{\mu,\sigma}(x) = F_\sigma(x-\mu)$, where $\sigma$ ranges over an indexing set $\mathcal{F}$ (which is taken here as a subset of $[0,\infty)$); the ranking might be with respect to $EX$. Taking $\mu = 0$, this example includes the case of stochastic orderings.

Although important applications of our results are in ranking theory, it turns out that if one applies the sequential rule defined below (with $b$ chosen so that $\int \Phi^{k-1}(x+b)\,d\Phi(x) = p^*$) to each of the $k$ independent populations and chooses the population with the largest observed value of the ranking statistic $T_{n\theta}$, then to show (1) it is sufficient to show (4) below, i.e., to show the uniformity of the corresponding sequential fixed-width confidence interval. The problem of uniform bounds for the vast number of such confidence intervals is itself an important and neglected area, as the usual results show only that under $F_\sigma(x-\mu)$,

$$\inf_{\sigma}\ \lim_{d\to 0} \Pr\{\text{correct coverage}\} \ge p^*. \tag{2}$$

The object of this work then is to find sequential selection and confidence interval procedures and conditions under which (1) and (4) hold for all compact indexing sets $\mathcal{F}$. That compactness alone is not sufficient to guarantee (1) and (4) is shown in the counter-example of Section 2; the problem is that a certain stochastic process in $D[0,1]$ (Billingsley (1968)) fails to be tight. This leads to the presentation in Section 3 (see Theorem 1) of a weak convergence theory which guarantees (1) and (4) in the sample means case. In Section 4, two examples are given which illustrate the use of Theorem 1. Finally, in Section 5, a generalization of Theorem 1 is presented.
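The constant $b$ in the sequential rule is only defined implicitly, through $\int \Phi^{k-1}(x+b)\,d\Phi(x) = p^*$. As a rough numerical illustration (not part of the original paper), the sketch below solves this equation by bisection, with the integral approximated by a composite Simpson rule. The function names, the integration range $[-10, 10]$, the step count, and the bisection bracket $[0, 20]$ are all assumptions made for this example.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def selection_integral(b: float, k: int, lo: float = -10.0,
                       hi: float = 10.0, steps: int = 4000) -> float:
    """Composite Simpson approximation of the integral of
    Phi(x + b)^(k-1) * phi(x) over [lo, hi] (steps must be even).
    The truncation to [lo, hi] is an assumption of this sketch."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        x = lo + j * h
        if j == 0 or j == steps:
            weight = 1.0
        elif j % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * std_normal_cdf(x + b) ** (k - 1) * std_normal_pdf(x)
    return total * h / 3.0


def solve_b(p_star: float, k: int, tol: float = 1e-10) -> float:
    """Bisection for b >= 0 with selection_integral(b, k) = p_star.
    At b = 0 the integral equals 1/k, so 1/k < p_star < 1 is required."""
    if k < 2 or not (1.0 / k < p_star < 1.0):
        raise ValueError("need k >= 2 and 1/k < p_star < 1")
    lo, hi = 0.0, 20.0  # hypothetical bracket; the integral increases in b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if selection_integral(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $k = 2$ the integral reduces to $\Phi(b/\sqrt{2})$, which gives a simple closed-form check on the numerical solution; for larger $k$ the required $b$ increases, since $\Phi^{k-1}$ decreases in $k$.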
Tong (1971) considered a problem similar to (1) in the fixed sample size case, making use of results on the uniformity of convergence (in $\sigma$) in the Central Limit Theorem (Parzen (1954)). These results are not applicable in our case since the number of observations is random; hence, this paper might also be considered a slight extension of Parzen's work to the sequential case.

To set the notation for our solution to the confidence interval and ranking problems, we assume there are i.i.d. observations $X_1, X_2, \ldots$ from a d.f. $F_\theta(x)$. One defines for each $n$ a location-scale equivariant statistic $T_{n\theta}$ for which, under $F_\theta$, $n^{1/2}(T_{n\theta} - \mu(\theta))/h(\theta) \to N(0,1)$, and the ranking is with respect to $\mu(\theta)$. By translation invariance, one may assume $\mu(\theta) \equiv 0$. Hence, $\theta$ may be considered a scalar and $h(\theta)$ will satisfy a Lipschitz condition. For the rest of this paper, assume for convenience that $h(\theta) \equiv \theta$. One also defines a sequence $\hat\sigma_{n\theta}$ of location invariant, scale equivariant statistics for which $\hat\sigma^2_{n\theta} \to \theta^2$ almost surely. Finally, define for $a^* > 0$,

$$\sigma(\theta) = \max\{a^*, \theta\}, \qquad s^2_{n\theta} = \max\{a^{*2}, \hat\sigma^2_{n\theta}\},$$

$$N_d(\theta) = \inf\{n \ge 5 : n \ge (b\,s_{n\theta}/d)^2\}. \tag{3}$$

Although $N_d(\theta)$ is slightly different from Chow and Robbins' (1965) rule, choosing $a^*$ very small results in no great change. The problem is to show for all compact $\mathcal{F}$ in $[0,\infty)$ that

$$\lim_{d\to 0}\ \inf_{\theta\in\mathcal{F}} \Pr\{|T_{N_d(\theta),\theta}| \le d\} = 2\Phi(b) - 1, \tag{4}$$

where $\Phi$ is the standard normal d.f. Note that $\theta$ need not be a scale parameter, and $F_\theta$ need not be known.

2. An Example

The following example (which can easily be extended to the ranking problem) shows that even if $T_{n\theta}$ is the sample mean and $\hat\sigma^2_{n\theta}$ the sample variance, it is possible for $\mathcal{F} = [0,1]$ that (4) fails but (2) does not. Let $U_1, U_2, \ldots$ be i.i.d. uniform (0,1) random variables. Define for $0 \le \theta \le 1$, $a(\theta) = (\tfrac12)^{(1-\theta)/\theta}$, $I_A$ the indicator of the event $A$, and

$$X_i(\theta) = (\theta/(1-\theta))^{1/2}\,(1-a(\theta))^{-1}\, I(U_i > a(\theta)).$$

Thus for $\theta < 1$,

$$\mu(\theta) = EX_1(\theta) = (\theta/(1-\theta))^{1/2}, \qquad \sigma^2(\theta) = \mathrm{Var}\,X_1(\theta) = (\theta/(1-\theta))\,(2^{(1-\theta)/\theta} - 1)^{-1},$$

while for $\theta = 1$, $\mu(\theta) = \sigma^2(\theta) = 0$.
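As a quick sanity check on the construction above (not part of the original paper), the following sketch compares the stated mean and variance of $X_1(\theta)$ with the moments computed directly from its two-point distribution, for $0 < \theta < 1$. It assumes the reading $a(\theta) = (1/2)^{(1-\theta)/\theta}$ of the definition; all function names are hypothetical.

```python
import math


def a(theta: float) -> float:
    """Threshold a(theta) = (1/2)^((1 - theta)/theta), for 0 < theta <= 1."""
    return 0.5 ** ((1.0 - theta) / theta)


def x_value(u: float, theta: float) -> float:
    """X(theta) as a function of a uniform(0,1) draw u, for 0 < theta < 1."""
    scale = math.sqrt(theta / (1.0 - theta)) / (1.0 - a(theta))
    return scale if u > a(theta) else 0.0


def two_point_moments(theta: float) -> tuple:
    """Mean and variance computed directly from the two-point law:
    X = v with probability 1 - a(theta), and X = 0 with probability a(theta)."""
    p_pos = 1.0 - a(theta)
    v = math.sqrt(theta / (1.0 - theta)) / p_pos
    mean = v * p_pos
    var = v * v * p_pos - mean * mean
    return mean, var


def mu(theta: float) -> float:
    """Closed form for the mean, as stated in the example."""
    return math.sqrt(theta / (1.0 - theta))


def sigma2(theta: float) -> float:
    """Closed form for the variance, as stated in the example."""
    return (theta / (1.0 - theta)) / (2.0 ** ((1.0 - theta) / theta) - 1.0)
```

The agreement of `two_point_moments` with `mu` and `sigma2` reflects the identity $a/(1-a) = (2^{(1-\theta)/\theta} - 1)^{-1}$. Note also that $\Pr\{X_1(\theta) = 0\} = a(\theta)$, which is the probability that drives the counter-example.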
Then $N_d(\theta)^{-1}\sum_{i=1}^{N_d(\theta)} (X_i(\theta) - \mu(\theta))$ is a stochastic process in $D[0,1]$, but choosing $\theta(d) = d^{-4}/(1+d^{-4})$ we obtain

$$\limsup_{d\to 0}\ \sup_{0\le\theta\le 1} \Pr\Big\{\Big|N_d(\theta)^{-1}\sum_{i=1}^{N_d(\theta)} (X_i(\theta) - \mu(\theta))\Big| \ge d\Big\} \ \ge\ \lim_{d\to 0} \Pr\{X_k(\theta(d)) = 0,\ k = 1,\ldots,(ba^*/d)^2\} = \lim_{d\to 0} (2^{-d^4})^{(ba^*/d)^2} = 1,$$

the inequality following because $X_k(\theta(d)) = 0$ for $k = 1,\ldots,(ba^*/d)^2$ means $N_d(\theta(d)) = (ba^*/d)^2$ and $\mu(\theta(d)) = d^{-2} \ge d$. However, from Anscombe (1952), it is clear that (2) still holds in this example.

3. Weak Convergence

Suppose the distributions $F_\theta$ have inverses $G_\theta$ which are continuous in $\theta$. The crucial idea is that if $X_1, X_2, \ldots$ are i.i.d. uniform (0,1) random variables, the statistics $T_{n\theta} = T_n(G_\theta(X_1),\ldots,G_\theta(X_n))$ form a stochastic process in $\theta$. To exploit this idea, the first example considered is that of the sample mean $T_{n\theta} = n^{-1}\sum_{1}^{n} G_\theta(X_i)$; $\hat\sigma^2_{n\theta}$ will be the sample variance. Consider the following assumptions:

(A1) On each finite interval, $\int G_\theta^4(x)\,dx$ is bounded and there exist positive constants $M_0$, $\delta_0$ such that

$$\int (G_{\theta_1}(x) - G_{\theta_2}(x))^4\,dx \le M_0|\theta_2 - \theta_1|^{1+\delta_0}.$$

(A2) For arbitrary positive constants $\varepsilon$, $M$, [display missing].

Note that (A1) and (A2) hold in the important special case of scale orderings ($F_{\mu,\theta}(x) = F((x-\mu)/\theta)$) as long as $\int x^4\,dF(x) < \infty$. The key is the following lemma.

Lemma 1. Suppose $V_n \Rightarrow V$ in $D[0,\infty)$, where $V(t)$ is normally distributed with mean zero and variance at most $M_{\mathcal{F}}$, a finite constant. Suppose $\Pr\{V \in C[0,\infty)\} = 1$. Then for any compact set $\mathcal{F}$,

$$\lim_{n\to\infty}\ \sup_{t\in\mathcal{F}} \big|\Pr\{|V_n(t)| \le b\} - \Pr\{|V(t)| \le b\}\big| = 0.$$

Proof: Let $A_t = \{x \in D[0,\infty) : |x(t)| \le b\}$ and $\mathcal{A} = \{A_t : t \in \mathcal{F}\}$. By Theorem 3 of Topsøe (1967), we have to show that if $\delta_n \to 0$ and $t_n \in \mathcal{F}$,

$$P_V\Big(\bigcap_{n=1}^{\infty} (\partial A_{t_n})^{\delta_n}\Big) = 0. \tag{5}$$

Assume $\mathcal{A}$ is a $V$-continuity class. Then since $\{t_n\}$ has a limit point, (5) is shown to hold by following the method of proof of Topsøe's Theorem 8. To verify that $\mathcal{A}$ is a $V$-continuity class, one must show that $P_V(\partial A_t) = 0$ for each $t$. But $P_V(\partial A_t) = P_V(\partial(A_t \cap C)) = P_V\{x \in C : |x(t)| = b\} = 0$.

Theorem 1.
Under (A1) and (A2), if $\mathcal{F}$ is compact, (4) and hence (1) hold.

Proof: We freely use the notation and results of Bickel and Wichura (1971), said paper hereafter denoted by B-W. Fix $M^* > 0$ and define on intervals $[0,a_{1n}), [a_{1n},a_{2n}), \ldots, [a_{kn},M^*]$, each of length at most $\exp(-n^2)$,

$$s_{n2}(\theta) = \sup\{s_{n\alpha} : a_{jn} \le \alpha \le a_{j+1,n}\}, \qquad s_{n1}(\theta) = \inf\{s_{n\alpha} : a_{jn} \le \alpha \le a_{j+1,n}\} \qquad \text{if } a_{jn} \le \theta < a_{j+1,n}.$$

Define

$$N_d^{(1)}(\theta) = \inf\{n \ge 5 : n \ge (b\,s_{n1}(\theta)/d)^2\}, \qquad N_d^{(2)}(\theta) = \inf\{n \ge 5 : n \ge (b\,s_{n2}(\theta)/d)^2\}.$$

Clearly $N_d^{(1)}(\theta) \le N_d(\theta) \le N_d^{(2)}(\theta)$ and the above rules are members of $D[0,\infty)$. Define $n_d^{(1)}(\theta)$ and $n_d^{(2)}(\theta)$ in a manner similar to the above, and for $0 \le t \le 2$, $0 \le \theta \le M^*$, let

[display defining $V_d(t,\theta)$ missing]

where $[\cdot]$ is the greatest integer function. $V_d$ is an element of $D_2(M^*) = D([0,2] \times [0,M^*])$ and for fixed $t$, $V_d(t,\cdot)$ is an element of $C[0,M^*]$. Let $B = (s,t] \times (\theta_1,\theta_2]$ be any block (B-W, page 1658), where $ns$ and $nt$ are integers. Then, by (A1), the fourth moment of the increment of $V_d$ around $B$ is bounded by a constant multiple of $|\theta_2-\theta_1|^{1+\delta_0}\{|t-s|/n + |t-s|^2\}$, and hence by a constant multiple of $|\theta_2-\theta_1|^{1+\delta_0}|t-s|^{1+\delta_0}$, since $|t-s| \ge 1/n$. Hence for any pair of blocks $B$, $C$, the Schwarz inequality shows

[display missing]

proving by the Corollary to Theorem 3 of B-W that $\{V_d\}$ is tight. By Theorems 2 and 4 of B-W, $V_d$ converges weakly to a Gaussian process $V$ in $D_2(M^*)$.

Now define

[display defining $V_d^{(1)}$ missing]

The $\{V_d^{(1)}\}$ are also in $D_2(M^*)$ and the sequence is tight since if $(t_i,\theta_i)$ $(i=1,2)$ are in a rectangle $R$ of diameter $\delta$, then $(t_i\,n_d^{(2)}(\theta_i)/n_d^{(2)}(M^*), \theta_i)$ are in a rectangle with the same center as $R$ but of diameter $c_0\delta$, for a fixed constant $c_0 > 0$ and $d$ sufficiently small. Here, we have used the fact that $n_d^{(2)}(\theta)/n_d^{(2)}(M^*) \ge (a^*/2M^*)^2$. Note that $n_d^{(2)}(\theta)$ was defined precisely to keep $V_d^{(1)}$ in $D_2$. Next define

$$V_d^{(2)}(t,\theta) = \left[\frac{n_d^{(2)}(M^*)\,\sigma^2(M^*)}{n_d^{(2)}(\theta)\,\sigma^2(\theta)}\right]^{1/2} V_d^{(1)}(t,\theta).$$
Then $V_d^{(2)}$ is tight in $D_2(M^*)$ and its finite dimensional distributions converge to those of a Gaussian process $V$ with covariance function when $t = 1$ and $\theta_1 < \theta_2$ given by

$$\int G_{\theta_1}(x)G_{\theta_2}(x)\,dx \,\Big/\, \sigma^2(\theta_2).$$

Hence $V_d^{(2)}(1,\theta_2) - V_d^{(2)}(1,\theta_1)$ converges to a Gaussian random variable with mean zero and variance

$$\int \left\{ \frac{G^2_{\theta_2}(x)}{\sigma^2(\theta_2)} + \frac{G^2_{\theta_1}(x)}{\sigma^2(\theta_1)} - \frac{2G_{\theta_1}(x)G_{\theta_2}(x)}{\sigma^2(\theta_2)} \right\} dx \le M|\theta_2 - \theta_1|^{\varepsilon}$$

for some $\varepsilon > 0$. The inequality here follows from (A1) and the fact that $|\sigma^2(\theta_2) - \sigma^2(\theta_1)| \le M_4|\theta_2-\theta_1|$. Hence, by Billingsley (1968, page 97), for fixed $t$, $V_d^{(2)}(t,\cdot)$ converges weakly to a Gaussian process on $C[0,M^*]$.

Now define $V_d^{(3)}(t,\theta) = V_d^{(2)}(t\,N_d^{(2)}(\theta)/n_d^{(2)}(\theta),\ \theta)$. Note that condition (A2) shows that for $\varepsilon > 0$,

[display (6) missing]

By (6) and the Kolmogorov inequality (see Anscombe (1952)), one shows that for fixed $t$ and $\theta$, $V_d^{(2)}(t,\theta) - V_d^{(3)}(t,\theta) \to 0$ (in probability). $V_d^{(3)}$ (and hence $V_d^{(2)} - V_d^{(3)}$) is shown to be tight by (6) and the same kind of argument used to prove the tightness of $V_d^{(1)}$; hence for fixed $t$, $V_d^{(3)}(t,\cdot)$ converges weakly to a Gaussian element of $C[0,M^*]$. We need the following result.

Proposition 1. Define

[display defining $W_d$ missing]

Then $W_d \to 0$ in probability.

Proof of Proposition 1: First note that (A2) implies

[display (7) missing]

Let $\eta > 0$ be arbitrary. Then

$$|V_d^{(3)}(t,\theta) - V_d^{(3)}(1,\theta)| \le |V_d^{(3)}(1,\theta) - V_d^{(3)}(1-\eta,\theta)| + \min\{|V_d^{(3)}(t,\theta) - V_d^{(3)}(1,\theta)|,\ |V_d^{(3)}(1-\eta,\theta) - V_d^{(3)}(t,\theta)|\},$$

so that by (7), with probability approaching one,

$$W_d \le \sup_\theta |V_d^{(3)}(1,\theta) - V_d^{(3)}(1-\eta,\theta)| + \sup_{1-\eta\le t\le 1} \min\Big\{\sup_\theta |V_d^{(3)}(t,\theta) - V_d^{(3)}(1,\theta)|,\ \sup_\theta |V_d^{(3)}(1-\eta,\theta) - V_d^{(3)}(t,\theta)|\Big\}. \tag{8}$$

The first term on the right hand side of (8) converges in probability to zero as $\eta, d \to 0$ by Chebychev's inequality, while the second is bounded by the modulus of continuity $w''$ of B-W and hence converges to zero in probability as $\eta \to 0$.

Returning to the proof of Theorem 1, we see that the conclusion of Proposition 1 also holds for the process

$$V_d^{(4)}(t,\theta) = b\,(d\,N_d^{(1)}(\theta))^{-1} \sum_{i=1}^{[tN_d^{(2)}(\theta)]} G_\theta(X_i),$$

since $b^2\sigma^2(\theta)\,(d^2 N_d^{(1)}(\theta))^{-1} \to 1$ uniformly on $0 \le \theta \le M^*$.
Thus, for all sufficiently small $\varepsilon > 0$ and $d$,

$$\Pr\Big\{\Big|N_d(\theta)^{-1}\sum_{i=1}^{N_d(\theta)} G_\theta(X_i)\Big| \le d\Big\} = \Pr\Big\{\frac{b}{d\,N_d(\theta)}\Big|\sum_{i=1}^{N_d(\theta)} G_\theta(X_i)\Big| \le b\Big\} \ge \Pr\{|V_d^{(4)}(1,\theta)| \le b - \varepsilon\} - \varepsilon. \tag{9}$$

Since $V_d^{(4)}(1,\cdot)$ converges weakly on $D[0,M^*]$ to a Gaussian process (call it $V$) on $C[0,M^*]$ and $M^*$ is arbitrary, the convergence is on $D[0,\infty)$ to a Gaussian process $V$ which is in $C[0,\infty)$ with probability one. Thus, Lemma 1 (with $b$ replaced by $b - \varepsilon$) says that as $d \to 0$, the last term in (9) is bounded uniformly in $\theta \in \mathcal{F}$. Letting $\varepsilon \to 0$ completes the proof.

In the sample means case, (A1) and (A2) hold if $\int G_\theta^4(x)\,dx$ is bounded on each finite interval and there exists a function $H$ with finite fourth moment such that

$$|G_{\theta_2}(x) - G_{\theta_1}(x)| \le |\theta_2 - \theta_1|\,H(x).$$

4. Applications

Theorem 1 holds for many types of estimators, two of which (the sample median and a smooth M-estimate) we illustrate here. The key idea is that Theorem 1 will hold when $T_{n\theta}$ can be expanded as a sample mean plus a uniformly small order term.

Definition. $\{T_{n\theta}\}$ converges to zero almost surely uniformly (denoted $T_{n\theta} \to 0$ (a.s.u.)) if for all $C$, $\sup\{|T_{n\theta}| : 0 \le \theta \le C\} \to 0$ (a.s.).

Example 4.1. Here we define $X_1, X_2, \ldots$ as i.i.d. uniform (0,1) random variables and $T_{n\theta}$ by $\sum_{1}^{n} \psi(G_\theta(X_i) - T_{n\theta}) = 0$. Here $\psi$ is bounded and nondecreasing with two bounded continuous derivatives. Further, $\psi'(x) > 0$ in a neighborhood of zero and $\psi'(x) = 0$ outside an interval $[-k,k]$. These $\psi$ functions include smoothed versions of the Huber M-estimate (Huber (1964)). $\hat\sigma_{n\theta}$ is defined by

$$\hat\sigma^2_{n\theta} = n^{-1}\sum_{1}^{n} \psi^2(G_\theta(X_i) - T_{n\theta})\ \Big\{n^{-1}\sum_{1}^{n} \psi'(G_\theta(X_i) - T_{n\theta})\Big\}^{-2}.$$

We again have the $\{T_{n\theta}\}$ with "mean" zero; more precisely, $\int \psi(x)\,dF_\theta(x) = 0$. We make the following assumptions for every $C > 0$:
(B1) There exists $C^*$ such that [display missing].

(B2) For every $\varepsilon > 0$ there exists $\eta > 0$ such that [display missing].

(B3) There exist positive constants $c_1$, $c_2$ such that [display missing].

Condition (B1) is weaker than the sufficient condition for the sample means given at the end of Section 3, while (B2) and (B3) are quite reasonable.

Lemma 2. Under (B1)-(B3), the conclusion to Theorem 1 holds.

Proof: We first show $|T_{n\theta}| \to 0$ (a.s.u.). By a Taylor expansion and the fact that $\psi'$ vanishes outside $[-k,k]$, for any $\varepsilon$ there exists $\eta_1 > 0$ such that

$$\Big|n^{-1}\sum_{1}^{n} \psi(G_\theta(X_i) - \varepsilon) - n^{-1}\sum_{1}^{n} \psi(G_{\theta_0}(X_i) - \varepsilon)\Big| \le \sup_x \psi'(x)\ n^{-1}\sum_{1}^{n} |G_\theta(X_i) - G_{\theta_0}(X_i)|\, I\{\eta_1 \le X_i \le 1-\eta_1\} \le M_1|\theta - \theta_0|, \tag{10}$$

where $M_1$ depends only on $C$ and $\eta_1$, $0 \le \theta \le C$. Now choose a set of such parameters $\theta_0$ by $\{\theta_{in} = i/n : 0 \le i \le Cn\}$. By Theorem 1 of Hoeffding (1963) and the Borel-Cantelli Lemma,

$$\sup\Big\{\Big|n^{-1}\sum_{1}^{n} \psi(G_\theta(X_i) - \varepsilon) - \int \psi(x - \varepsilon)\,dF_\theta(x)\Big| : \theta \in \{\theta_{in}\}\Big\} \to 0 \quad (a.s.). \tag{11}$$

Since $\int \psi(x-\varepsilon)\,dF_\theta(x)$ [display missing], (10) and (11) give $|T_{n\theta}| \to 0$ (a.s.u.). Again by a Taylor expansion,

[display missing]

where $\xi_{\theta n}(X_i)$ is between $G_\theta(X_i)$ and $G_\theta(X_i) - T_{n\theta}$. Since $\psi'(x) > 0$ in a neighborhood of zero and (B3) holds, there is a positive constant $\eta_2$ such that

[display missing]

almost surely as $n \to \infty$. One now shows

$$n^{1/4}|T_{n\theta}| \to 0 \quad (a.s.u.) \tag{12}$$

by showing $n^{-3/4}\sum_{1}^{n} \psi(G_\theta(X_i)) \to 0$ (a.s.u.); this is accomplished by following the steps in (10) and (11). Now using (12) and the fact that $\psi''$ is bounded, one more Taylor expansion shows

$$n^{1/2}\Big|T_{n\theta} - n^{-1}\sum_{1}^{n} \psi(G_\theta(X_i)) \Big/ \int \psi'(x)\,dF_\theta(x)\Big| \to 0 \quad (a.s.u.). \tag{13}$$

Now define $p_\theta(x) = a_1(\theta)\psi^2(x) + a_2(\theta)\psi'(x) + a_3(\theta)\psi(x)$, where

$$a_1(\theta) = (E\psi'(X))^{-2}, \qquad a_2(\theta) = -2E\psi^2(X)\,(E\psi'(X))^{-3}, \qquad a_3(\theta) = -a_2 E\psi''(X)\,(E\psi'(X))^{-1} - a_1 E\psi(X)\psi'(X),$$

and the expectations are under $F_\theta$. Using (13), Taylor's theorem, and the same type of arguments as in (10) and (11), one shows

$$\Big|\hat\sigma^2_{n\theta} - h(\theta) - n^{-1}\sum_{1}^{n} p_\theta(G_\theta(X_i)) + \int p_\theta(x)\,dF_\theta(x)\Big| \to 0 \quad (a.s.u.), \tag{14}$$

where $h(\theta) = \int \psi^2(x)\,dF_\theta(x)\,\{\int \psi'(x)\,dF_\theta(x)\}^{-2}$ is clearly Lipschitz in $\theta$ on each subinterval $0 \le \theta \le M$.
Now reconsider the proof of Theorem 1. One can redefine all the processes there in terms of $T_{n\theta}$. For example,

[display (15) missing]

Because of (B1), (A1) holds for the sample means generated by $\psi(G_\theta(X))/(E\psi')$, while (14) shows that (A2) holds for $\hat\sigma_{n\theta}$. Because of (13), all the weak convergence arguments in Theorem 1 apply to processes such as (15), and the proof is complete.

It is clear that this type of result can be extended to a more general class of smooth $\psi$ functions and to the Huber $\psi$, but we will stay with Lemma 2 for the sake of brevity. Further, the scale equivariant version given by

[display missing]

could also be considered. Here $q_{n\theta} = q_n(G_\theta(X_1),\ldots,G_\theta(X_n))$ is location invariant and scale equivariant; an example would be the interquartile range. The proof of Lemma 2 will still be applicable if we can write $q_{n\theta}$ as a sum of i.i.d. variates plus a remainder term as in (13), say

$$q_{n\theta} = n^{-1}\sum_{1}^{n} H(G_\theta(X_i)) + o(n^{-1/2}) \quad (a.s.u.).$$

The next example includes implicit conditions under which the interquartile range admits such an expansion.

Example 4.2. Let $X_{1n} < X_{2n} < \cdots < X_{nn}$ be the order statistics from the uniform sample $X_1, X_2, \ldots, X_n$. Let $k_n = np + o(n^{1/2}\log n)$, where $0 < p < 1$, and define

[display defining $R_n$ missing]

Bahadur (1966) has shown that $n^{1/2}R_n \to 0$ (a.s.). Following Kiefer (1967, page 1324), it is possible to prove the following.

Proposition 2. Define $\xi_\theta = G_\theta(p)$, and suppose that uniformly on $0 \le \theta \le C$, for some $\varepsilon > 0$,

[display missing]

Then

[display missing]

This proposition gives an (a.s.u.) expansion for the interquartile range, the median, and the variance estimate for the median suggested by Geertsema (1972). As in the previous example, it suffices to verify Theorem 1 when the random variables are given by

[display missing]

and $\hat\sigma_{n\theta}$ is given by the expansion for Geertsema's variance estimate. Under the conditions of Proposition 2, (A1) and (A2) reduce to requiring that $F_\theta(\xi_\theta)$ is Lipschitz in $\theta$ of order $\alpha > \tfrac12$.

5. A General Result

The Theorem we actually gave in Section 3 is stronger than stated.
We place it here in its fullest generality as it should be of independent interest in the theory of weak convergence of partial sum processes with random indices.

Theorem 2. Suppose (A1) holds and there exist elements $N_d^{(1)}(\theta)$, $N_d^{(2)}(\theta)$ and $n_d(\theta)$ in $D[0,\infty)$ with $\{n_d(\theta)\}$ non-random and $N_d^{(1)}(\theta) \le N_d(\theta) \le N_d^{(2)}(\theta)$. Suppose further that

$$N_d^{(1)}(\theta)/n_d(\theta) \Rightarrow 1, \qquad N_d^{(2)}(\theta)/n_d(\theta) \Rightarrow 1 \quad \text{on } D[0,\infty), \qquad d^2 n_d(\theta)/\sigma^2(\theta) \to b^2,$$

where "$\Rightarrow$" is weak convergence. Then (4) and (1) hold. Further, if we define

[display defining $V_d$ missing]

then $V_d$ converges weakly to a Gaussian process $V$ on $D[0,\infty)$ whose sample paths are in $C[0,\infty)$ with probability one.

ACKNOWLEDGEMENT

The author wishes to thank Professors S. Gupta and H. Rootzen for their encouragement and advice.

REFERENCES

ANSCOMBE, F.J. (1952). Large sample theory of sequential estimation. Proc. Camb. Phil. Soc. 48, 600-617.

BAHADUR, R.R. (1966). A note on quantiles in large samples. Ann. Math. Statist. 37, 577-580.

BICKEL, P.J. and WICHURA, M.J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Ann. Math. Statist. 42, 1656-1670.

BILLINGSLEY, P. (1968). Convergence of Probability Measures. New York: John Wiley & Sons, Inc.

CHOW, Y.S. and ROBBINS, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 463-467.

GEERTSEMA, J.C. (1972). Nonparametric sequential procedures for selecting the best of k populations. J. Am. Statist. Assoc. 67, 614-616.

HOEFFDING, W. (1963). Probability inequalities for sums of bounded random variables. J. Am. Statist. Assoc. 58, 13-30.

HUBER, P.J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35, 73-101.

KIEFER, J. (1967). On Bahadur's representation of sample quantiles. Ann. Math. Statist. 38, 1323-1342.

LINDVALL, T. (1973). Weak convergence of probability measures and random functions in the function space D[0,∞). J. Appl. Prob. 10, 109-121.

PARZEN, E. (1954).
On uniform convergence of families of sequences of random variables. Univ. of Calif. Publ. in Statist. 2, 23-54.

TONG, Y.L. (1971). On the consistency of single-stage ranking procedures. Ann. Institute Statist. Math. 22, 271-284.

TOPSØE, F. (1967). On the connection between P-continuity and P-uniformity in weak convergence. Theor. Prob. Appl. 12, 241-250.