Pacific Journal of Mathematics SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION TO THE THEORETICAL DISTRIBUTION S. C. TANG Vol. 12, No. 3 March 1962 SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION TO THE THEORETICAL DISTRIBUTION S. C. TANG 1. Introduction. Let Xlt X2, " ,Xn be mutually independent random variables with the common cumulative distribution function F(x). Let X*,X*, "',X* be the same set of variables rearranged in increasing order of magnitude. In statistical language Xu X,, , Xn form a sample of n drawn from the distribution with distribution function F(x). The empirical distribution of the sample Xu •• ,Xn is the step function Fn{%) denned by 0 (1) F,(x) = — it 1 for x ^ X* for Xξ < x 5ϊ Xk*+1 f or x > Xί . A. Kolmogorov developed a well-known limit distribution law for the difference between the empirical distribution and the corresponding theoretical distribution, assuming F(x) continuous: (0 , P Vn FM ^ { -^?J for z ^ 0 , F ~ ^< 4 = {ti-iye— , for * > 0 . Equally interesting is Smirnov's theorem: 0, lim p\VH sup [Fn(x) - F(x)] < z\ = ]° ' __2β2 f or z ^ 0 , f or z > 0 . In this paper we shall study the ratio of the empirical distribution to the theoretical distribution, and evaluate the distribution functions of the upper and lower bounds of the ratio. We shall prove the following four theorems. THEOREM 1. If F is everywhere continuous then (0 for z%l — JL for z > 1 . z Received July 28, 1961. I am deeply indebted to the referee for his corrections of my English writing. Without his aid, the present paper would not be read clearly, although the responsibility of errors rests with me. Π07 1108 S. C. TANG THEOREM 2. Let c be a positive number no larger than n and suppose that F(x) is continuous in the range 0 ^ F(x) g c/n. Then k £? (n\ k -\nz - &)—* 7 ^ (nz) ~~ 2-*\k) Λ=I \κ/ ίg /w - fey ca - k γ~*Λ _ eg - fe y m«•—u ίj. i F(X) —K ~~~ iC // \\ /yi/y /t,^ ^ _ \\ι If // /v \\ 'V?<y / (ί/& — i* // /v for 0 < z S — e for z ^ 0 for z> — c \ THEOREM 3. Under the assumption of Theorem 1, for z < — n 1- \ inf lβ<#.(.)si Ή* nz 1 fe (fe - l ) * - ^ - fc)"" F(X) /or 1- ^ ^ < n for z > 1 . then THEOREM 4. Under the assumptions of Theorem 2, if 1 <Ξ c : n /or ^ < 0 Λ _ 1 V nzJ P^ JL (fe - l ) * - 1 ^ - k)n~* (nz)n for — ^ n inf v \ (fe n forz>j~. 2. An elementary lemma. We shall use the following lemma. — n SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION Letfcbe a positive integer and a an arbitrary real num- LEMMA. ber. 1109 Then (2) sp r [a fϋ. r! r) a (fc-r)! (fc - 1)! ' ^ (r - I ) - 1 (a - r)«-' _ a* ί=i r! (fc-r)l fc! ' (a - 1)* k\ If a = k (a special ease) then 1 } h ,,, κ r! (fc - r)! 1 ^ (r - I ) - (k - r)*-r _ k« #=i r! (fc-r)I fc! ' (k - I)" M ' (fc - 1)! fc! ' Proof. Let /(«) be a polynomial whose degree is less than &. Then r i) /(* ) = (-D"[^/(*)].=. = o r i.e. Σ(ί)(-l) /(r)=-/(0). JSΓow we can directly obtain (2) and (3): z^ - r\ ( f c - r)\ r\ s=o s! (fc — r — -f \ 1. i.^ Λ g λ. -1 Oί s^l S! r=i r!(fc- - ( - .l)*-«-«α*-i ( 1)- (fc - 1)! (α-D k fc! β)! g r —•s)l a"-1 (fc - 1)! ' 1 _L. 1 r\ J (a -\y fc! (a -D* fc! (a _1)» fc! (at o s! (fc - 4- y- («- (fc - r ) ! ^ ( r - 1 ) ^ ^ (α -r=i 7*' s=0 -r+ l)s(,s ! ( f c - r — v ( - 1 ' )*-=(« _ i ) . s=0 s! 1 ^ fA. \y-r-* β)! L)'(r r! ( f c - r — s ) ! (-• - + ^(-1) *-( α _ i) (_ s! (fc - s)! — s )! fc! 3. Proof of Theorem l First we consider the case z ^ 1. Let be the largest root of the equation 1110 S. C. TANG (i ^ k ^ n) . F(x) = A nz Since F is continuous, yk is well defined. ability of the inequality (6) sup Now we evaluate prob- ΣΆ F(X) 0^F()^ that is, the probability that there exists an x such that F (7) ^ > z. F(x) ~ If (7) is true, then, since F{x) is nondecreasing and lima._0o (Fn(x)IF(x)) — 1, there exists an xQ such that F(xQ) By the definition of Fn, Fn(xQ) = kjn for some k, so that F(xQ) — kfnz. Therefore we can take x0 as one of yk. In other words, for one value of yk we have FM) = n and the inequality (8) Xt<y*^Xί+i is true. Now let Ak be the event that this inequality holds. Clearly (6) occurs if, and only if, at least one among the events A A A . .. A occurs. Generally the Afc are not mutually exclusive events so that the additive law is of no use. We may, with the help of an associated set of mutually exclusive events, deal with this situation. Put uk = AXA2 {k = 1, 2, Ak^Ak where A denotes the complementary event of A. (9) , n) . We have P{ sup 1M- ^z} = Pl±Ak\ = Pi±uλ = Σ P W . U<F{χ)^i Fyx) ) U=i J U=i J fc=i If we employ the following conditional probability formula, we can evaluate P(uk). (10) P(Ak) = r=l SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION 1111 Now Ak occurs if k of the n observations fall to the left of yk, and n — k to the right; hence nz / \ nz And P(Ak I Ar) is the probability that of (n — r) observations, known to lie to the right of yr, (k — r) lie to the left of yk and n — k to the right. The probability of occurrence of an event, whose value is greater than or equal to yr and is on the interval yr^x ^ yk is kjnz — r\nz _ k — r 1 — rjnz nz — r Therefore k k - r \*-V1 _ k- r Y~ z —r / V nz — r If we employ the notation pk(z) = kl we get Equation (10) can be reduced to kJz) _ Λ p( v pk-r(l)pn-k(z) = Σ PK) Hence P,(l) = fcfc/^! and P,_r(l) = (k - r)*-r/(fc - r)!. lemma we know that P(Mr) g P»( ) p(^) Z!li r! ' r ΪHL P ( ) r! pn(«) 1n (nz) And (9) and the (2) of our lemma tell us that P{ sup ^ L < z) = 1 - P{ sup -§M. ^ {nzf έΊ fc! (n - k)\ - 1)! By (3) of our 1 1 fo ) " rl (n — r)\ r 1112 S. C. TANG This completes the proof of Theorem 1. Proof of Theorem 2. Since the ratio is nonnegative, the probability of the inequality is zero if z ^ 0. Under the condition z > njc, we know that the event {supo<r{x)^eιnFn(x)IF(x) Ξ> z} is still equal to Σt=i A by the result of Theorem 1. Suppose 0 < z ^njc. Then Pi QITΠ lθ<F(x)^ n\fi) > F(X) ? I — ) pj sf* A J_ A * I — U=l J 'S? PΓ?/ ^ 4- T^di^Λ fc=l where c1 , {a?: F(^) = —^ nJ 1/* - 4 J .,, J 4* Since P{Uk) has been computed, we need only evaluate P(C/c*). have P(A*) - Σ P(uk)P{A* | A,} + P(u*)P{A* \ A*} , P(u*) = P(A*) - Σ P{A* | Ak} . From this we obtain gup pi ^ ί ί L < z\ = 1 - Σ P(^fc) - P(%*) lo^ί'(χ)^ F(x) > k=i _. v P/4* I A W pίηi \\Λ ίczl JL \ wkJ-L χΛ~\.C2 £±-k\ Since r=0 we have γΊ _ ^ - f e y - We SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION 1113 which completes the proof. The distribution of Theorem 1 is continuous; the distribution of Theorem 2 is not continuous on the interval 0 ^ z gΞ n/c. Theorem 3 is a special case of Theorem 4 under the condition c = n. In fact, by (4) of our lemma, setting z = 1, we deduce Theorem 3 as follows: inf ^ < i U ( i - λ)+ { in F(x) ~ ) \ ± (I) n L nn *=iW n) lέi \ n fc! (n - fc)! J Thus we need only prove Theorem 4. The distribution function of Theorem 3 has a discontinuity corresponding to 3 = 1; the distribution function of Theorem 4 is continuous on the interval 1 ^ c ^ n. 5. Proof of Theorem 4 On the internal Fn(x) > 0 the maximum of F(x) is 1; the minimum of Fn(x) is 1/n. From these results it follows that the lower bound of the ratio is no smaller than 1/n. Therefore w*i F(x) if z < ljn. Let zk be the least root of the equation nz \ n J Define events J30. A x ^ Z1 , If 1/n ^ z ^ c/w, the event is the union of the events Bo> Bi> B2f , Bίnzl . For the purpose of evaluating the probability of ^kBk we put Vo - Bo, Vk= SoB 1 ... S t _ A (fc = 1, 2, . . . , [nz]) 1114 S. C. TANG We have (11) P(Bk) = Σ P ( Vr)P(Bk (fc = 0,1, \Br), , [nz]) „ Here F{Bk \jbo\ — Now we transform (11) into the following form: klpn(z) r r=i pn_r(z) By (4) of our lemma we obtain P(Fr) = ( r ~ 1 > r ' 1 p*-*iz) = (n)(r-lY-K™-'> r! pn(z) \r/ '--sn ( 1 <r < ~ ~ It follows that if 1/n g z ^ c/w, i \ _1_ "S? I nz / If z {> c/w, inf P{U<F ^{x)^~ F(X) ^^ U<Fnn{x)^~ F(X) \ k=i\K'/ N — / \ιvZ — K>) n (nz) } > = Λ_ M Theorem 4 is thus proved. 6 Conclusions, Theorems 1 and 3 can be applied to test whether statistical data correspond with the theoretical distribution or not. In the process of testing, zs and z{ are separately representative of the ratio's upper and lower bounds. If S(zs) is small and In{z^) large, we conclude that the empirical data in two extreme tails correspond with the theoretical distribution; conversely, if S(zs) is very large and In{z^ very small, the statistical data do not correspond with the theoretical distribution. Our test is more sensitive in the two tails, but less sensitive in the central part, than Smirnov's test. PACIFIC JOURNAL OF MATHEMATICS EDITORS RALPH S. PHILLIPS A. Stanford University Stanford, California M. L. WHITEMAN University of Southern California Los Angeles 7, California G. ARSOVE LOWELL J. University of Washington Seattle 5, Washington PAIGE University of California Los Angeles 24, California ASSOCIATE EDITORS E. F. BECKENBACH T. M. CHERRY D. DERRY M. OHTSUKA H. L. ROYDEN E. SPANIER E. G. STRAUS F. WOLF SUPPORTING INSTITUTIONS UNIVERSITY OF BRITISH COLUMBIA CALIFORNIA INSTITUTE OF TECHNOLOGY UNIVERSITY OF CALIFORNIA MONTANA STATE UNIVERSITY UNIVERSITY OF NEVADA NEW MEXICO STATE UNIVERSITY OREGON STATE UNIVERSITY UNIVERSITY OF OREGON OSAKA UNIVERSITY UNIVERSITY OF SOUTHERN CALIFORNIA STANFORD UNIVERSITY UNIVERSITY OF TOKYO UNIVERSITY OF UTAH WASHINGTON STATE UNIVERSITY UNIVERSITY OF WASHINGTON * * * AMERICAN MATHEMATICAL SOCIETY CALIFORNIA RESEARCH CORPORATION SPACE TECHNOLOGY LABORATORIES NAVAL ORDNANCE TEST STATION Mathematical papers intended for publication in the Pacific Journal of Mathematics should be typewritten (double spaced), and the author should keep a complete copy. Manuscripts may be sent to any one of the four editors. All other communications to the editors should be addressed to the managing editor, L. J. Paige at the University of California, Los Angeles 24, California. 50 reprints per author of each article are furnished free of charge; additional copies may be obtained at cost in multiples of 50. The Pacific Journal of Mathematics is published quarterly, in March, June, September, and December. Effective with Volume 13 the price per volume (4 numbers) is $18.00; single issues, $5.00. Special price for current issues to individual faculty members of supporting institutions and to individual members of the American Mathematical Society: $8.00 per volume; single issues $2.50. Back numbers are available. Subscriptions, orders for back numbers, and changes of address should be sent to Pacific Journal of Mathematics, 103 Highland Boulevard, Berkeley 8, California. Printed at Kokusai Bunken Insatsusha (International Academic Printing Co., Ltd.), No. 6, 2-chome, Fujimi-cho, Chiyoda-ku, Tokyo, Japan. PUBLISHED BY PACIFIC JOURNAL OF MATHEMATICS, A NON-PROFIT CORPORATION The Supporting Institutions listed above contribute to the cost of publication of this Journal, but they are not owners or publishers and have no responsibility for its content or policies. Pacific Journal of Mathematics Vol. 12, No. 3 March, 1962 Alfred Aeppli, Some exact sequences in cohomology theory for Kähler manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paul Richard Beesack, On the Green’s function of an N -point boundary value problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James Robert Boen, On p-automorphic p-groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James Robert Boen, Oscar S. Rothaus and John Griggs Thompson, Further results on p-automorphic p-groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James Henry Bramble and Lawrence Edward Payne, Bounds in the Neumann problem for second order uniformly elliptic operators . . . . . . . . . . . . . . . . . . . . . . . Chen Chung Chang and H. Jerome (Howard) Keisler, Applications of ultraproducts of pairs of cardinals to the theory of models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stephen Urban Chase, On direct sums and products of modules . . . . . . . . . . . . . . . . . . . Paul Civin, Annihilators in the second conjugate algebra of a group algebra . . . . . . . J. H. Curtiss, Polynomial interpolation in points equidistributed on the unit circle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marion K. Fort, Jr., Homogeneity of infinite products of manifolds with boundary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James G. Glimm, Families of induced representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . Daniel E. Gorenstein, Reuben Sandler and William H. Mills, On almost-commuting permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vincent C. Harris and M. V. Subba Rao, Congruence properties of σr (N ) . . . . . . . . . . Harry Hochstadt, Fourier series with linearly dependent coefficients . . . . . . . . . . . . . . . Kenneth Myron Hoffman and John Wermer, A characterization of C(X ) . . . . . . . . . . . Robert Weldon Hunt, The behavior of solutions of ordinary, self-adjoint differential equations of arbitrary even order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Edward Takashi Kobayashi, A remark on the Nijenhuis tensor . . . . . . . . . . . . . . . . . . . . David London, On the zeros of the solutions of w00 (z) + p(z)w(z) = 0 . . . . . . . . . . . . . Gerald R. Mac Lane and Frank Beall Ryan, On the radial limits of Blaschke products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. M. MacRobert, Evaluation of an E-function when three of its upper parameters differ by integral values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Robert W. McKelvey, The spectra of minimal self-adjoint extensions of a symmetric operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Adegoke Olubummo, Operators of finite rank in a reflexive Banach space. . . . . . . . . . David Alexander Pope, On the approximation of function spaces in the calculus of variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bernard W. Roos and Ward C. Sangren, Three spectral theorems for a pair of singular first-order differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Arthur Argyle Sagle, Simple Malcev algebras over fields of characteristic zero . . . . . Leo Sario, Meromorphic functions and conformal metrics on Riemann surfaces . . . . Richard Gordon Swan, Factorization of polynomials over finite fields . . . . . . . . . . . . . . S. C. Tang, Some theorems on the ratio of empirical distribution to the theoretical distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Robert Charles Thompson, Normal matrices and the normal basis in abelian number fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Howard Gregory Tucker, Absolute continuity of infinitely divisible distributions . . . . Elliot Carl Weinberg, Completely distributed lattice-ordered groups . . . . . . . . . . . . . . . James Howard Wells, A note on the primes in a Banach algebra of measures . . . . . . . Horace C. Wiser, Decomposition and homogeneity of continua on a 2-manifold . . . . 791 801 813 817 823 835 847 855 863 879 885 913 925 929 941 945 963 979 993 999 1003 1023 1029 1047 1057 1079 1099 1107 1115 1125 1131 1139 1145
© Copyright 2024 Paperzz