Some theorems on the ratio of empirical distribution to the

Pacific Journal of
Mathematics
SOME THEOREMS ON THE RATIO OF EMPIRICAL
DISTRIBUTION TO THE THEORETICAL DISTRIBUTION
S. C. TANG
Vol. 12, No. 3
March 1962
SOME THEOREMS ON THE RATIO OF EMPIRICAL
DISTRIBUTION TO THE THEORETICAL
DISTRIBUTION
S. C. TANG
1. Introduction. Let Xlt X2, " ,Xn be mutually independent random variables with the common cumulative distribution function F(x).
Let X*,X*, "',X*
be the same set of variables rearranged in increasing order of magnitude. In statistical language Xu X,,
, Xn
form a sample of n drawn from the distribution with distribution function F(x). The empirical distribution of the sample Xu •• ,Xn is the
step function Fn{%) denned by
0
(1)
F,(x) = —
it
1
for x ^ X*
for Xξ < x 5ϊ Xk*+1
f or x > Xί .
A. Kolmogorov developed a well-known limit distribution law for
the difference between the empirical distribution and the corresponding
theoretical distribution, assuming F(x) continuous:
(0 ,
P Vn
FM
^ { -^?J
for z ^ 0 ,
F
~ ^< 4 = {ti-iye—
, for * > 0 .
Equally interesting is Smirnov's theorem:
0,
lim p\VH sup [Fn(x) - F(x)] < z\ = ]° ' __2β2
f or z ^ 0 ,
f or z > 0 .
In this paper we shall study the ratio of the empirical distribution
to the theoretical distribution, and evaluate the distribution functions
of the upper and lower bounds of the ratio. We shall prove the following four theorems.
THEOREM
1. If F is everywhere continuous then
(0
for
z%l
— JL for z > 1 .
z
Received July 28, 1961. I am deeply indebted to the referee for his corrections of my
English writing. Without his aid, the present paper would not be read clearly, although
the responsibility of errors rests with me.
Π07
1108
S. C. TANG
THEOREM 2. Let c be a positive number no larger than n and
suppose that F(x) is continuous in the range 0 ^ F(x) g c/n. Then
k
£? (n\ k -\nz - &)—*
7 ^
(nz)
~~ 2-*\k)
Λ=I \κ/
ίg /w - fey ca - k γ~*Λ _ eg - fe y
m«•—u
ίj.
i
F(X)
—K
~~~ iC // \\ /yi/y
/t,^ ^ _
\\ι
If //
/v
\\
'V?<y
/ (ί/& —
i* //
/v
for 0 < z S —
e
for z ^ 0
for z> —
c
\
THEOREM
3.
Under the assumption of Theorem 1,
for z < —
n
1-
\
inf
lβ<#.(.)si
Ή*
nz
1
fe
(fe - l ) * - ^ - fc)""
F(X)
/or 1- ^ ^ <
n
for z > 1 .
then
THEOREM
4.
Under the assumptions of Theorem 2, if 1 <Ξ c : n
/or ^ <
0
Λ _ 1
V
nzJ
P^
JL
(fe - l ) * - 1 ^ - k)n~*
(nz)n
for — ^
n
inf
v
\ (fe
n
forz>j~.
2. An elementary lemma. We shall use the following lemma.
—
n
SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION
Letfcbe a positive integer and a an arbitrary real num-
LEMMA.
ber.
1109
Then
(2)
sp r
[a
fϋ. r!
r)
a
(fc-r)!
(fc - 1)! '
^ (r - I ) - 1 (a - r)«-' _ a*
ί=i r!
(fc-r)l fc! '
(a - 1)*
k\
If a = k (a special ease) then
1
}
h
,,,
κ
r!
(fc - r)!
1
^ (r - I ) - (k - r)*-r _ k«
#=i r!
(fc-r)I fc! '
(k - I)"
M
'
(fc - 1)! fc! '
Proof. Let /(«) be a polynomial whose degree is less than &. Then
r
i) /(* ) = (-D"[^/(*)].=. = o
r
i.e.
Σ(ί)(-l) /(r)=-/(0).
JSΓow we can directly obtain (2) and (3):
z^ - r\
( f c - r)\
r\
s=o s! (fc — r —
-f \ 1.
i.^
Λ
g
λ.
-1
Oί s^l
S!
r=i
r!(fc- -
( - .l)*-«-«α*-i
( 1)-
(fc - 1)!
(α-D k
fc!
β)!
g
r —•s)l
a"-1
(fc - 1)! '
1
_L.
1
r\
J
(a
-\y
fc!
(a
-D*
fc!
(a
_1)»
fc!
(at o s! (fc
- 4-
y- («-
(fc - r ) !
^ ( r - 1 ) ^ ^ (α -r=i
7*'
s=0
-r+
l)s(,s ! ( f c - r —
v ( - 1 ' )*-=(« _ i ) .
s=0
s!
1
^
fA.
\y-r-*
β)!
L)'(r
r! ( f c - r — s ) !
(-•
- + ^(-1) *-( α _ i) (_
s! (fc - s)!
— s )!
fc!
3. Proof of Theorem l First we consider the case z ^ 1. Let
be the largest root of the equation
1110
S. C. TANG
(i ^ k ^ n) .
F(x) = A
nz
Since F is continuous, yk is well defined.
ability of the inequality
(6)
sup
Now we evaluate prob-
ΣΆ
F(X)
0^F()^
that is, the probability that there exists an x such that
F
(7)
^ > z.
F(x) ~
If (7) is true, then, since F{x) is nondecreasing and lima._0o (Fn(x)IF(x)) —
1, there exists an xQ such that
F(xQ)
By the definition of Fn, Fn(xQ) = kjn for some k, so that F(xQ) —
kfnz. Therefore we can take x0 as one of yk. In other words, for one
value of yk we have
FM) = n
and the inequality
(8)
Xt<y*^Xί+i
is true. Now let Ak be the event that this inequality holds.
Clearly (6) occurs if, and only if, at least one among the events
A
A
A
. ..
A
occurs. Generally the Afc are not mutually exclusive events so that the
additive law is of no use. We may, with the help of an associated set
of mutually exclusive events, deal with this situation. Put
uk = AXA2
{k = 1, 2,
Ak^Ak
where A denotes the complementary event of A.
(9)
, n) .
We have
P{ sup 1M- ^z} = Pl±Ak\ = Pi±uλ = Σ P W .
U<F{χ)^i Fyx)
)
U=i
J
U=i
J
fc=i
If we employ the following conditional probability formula, we can
evaluate P(uk).
(10)
P(Ak) =
r=l
SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION
1111
Now Ak occurs if k of the n observations fall to the left of yk,
and n — k to the right; hence
nz / \
nz
And P(Ak I Ar) is the probability that of (n — r) observations, known to
lie to the right of yr, (k — r) lie to the left of yk and n — k to the
right. The probability of occurrence of an event, whose value is greater
than or equal to yr and is on the interval yr^x ^ yk is
kjnz — r\nz _ k — r
1 — rjnz
nz — r
Therefore
k
k - r \*-V1 _ k- r Y~
z —r /
V
nz — r
If we employ the notation
pk(z)
=
kl
we get
Equation (10) can be reduced to
kJz) _ Λ
p(
v pk-r(l)pn-k(z)
= Σ PK)
Hence P,(l) = fcfc/^! and P,_r(l) = (k - r)*-r/(fc - r)!.
lemma we know that
P(Mr)
g
P»( )
p(^)
Z!li
r!
'
r
ΪHL P ( )
r!
pn(«)
1n
(nz)
And (9) and the (2) of our lemma tell us that
P{ sup ^ L < z) = 1 - P{ sup -§M. ^
{nzf έΊ fc! (n - k)\
- 1)!
By (3) of our
1 1 fo ) "
rl
(n — r)\
r
1112
S. C. TANG
This completes the proof of Theorem 1.
Proof of Theorem 2. Since the ratio is nonnegative, the probability of the inequality is zero if z ^ 0. Under the condition z > njc,
we know that the event {supo<r{x)^eιnFn(x)IF(x)
Ξ> z} is still equal to
Σt=i A by the result of Theorem 1. Suppose 0 < z ^njc.
Then
Pi
QITΠ
lθ<F(x)^
n\fi)
>
F(X)
? I —
)
pj
sf*
A
J_
A * I —
U=l
J
'S? PΓ?/
^ 4-
T^di^Λ
fc=l
where
c1
,
{a?: F(^) = —^
nJ
1/* -
4 J
.,, J
4*
Since P{Uk) has been computed, we need only evaluate P(C/c*).
have
P(A*) - Σ P(uk)P{A* | A,} + P(u*)P{A* \ A*} ,
P(u*) = P(A*) - Σ P{A* | Ak} .
From this we obtain
gup
pi
^ ί ί L < z\ = 1 - Σ P(^fc) - P(%*)
lo^ί'(χ)^ F(x)
>
k=i
_. v
P/4* I A W
pίηi \\Λ
ίczl
JL \ wkJ-L
χΛ~\.C2
£±-k\
Since
r=0
we have
γΊ _ ^ - f e y -
We
SOME THEOREMS ON THE RATIO OF EMPIRICAL DISTRIBUTION
1113
which completes the proof. The distribution of Theorem 1 is continuous;
the distribution of Theorem 2 is not continuous on the interval 0 ^
z gΞ n/c.
Theorem 3 is a special case of Theorem 4 under the condition c = n.
In fact, by (4) of our lemma, setting z = 1, we deduce Theorem 3 as
follows:
inf ^ < i U ( i - λ)+
{ in
F(x)
~
)
\
± (I)
n L
nn
*=iW
n)
lέi
\
n
fc!
(n - fc)! J
Thus we need only prove Theorem 4. The distribution function of
Theorem 3 has a discontinuity corresponding to 3 = 1; the distribution
function of Theorem 4 is continuous on the interval 1 ^ c ^ n.
5. Proof of Theorem 4 On the internal Fn(x) > 0 the maximum
of F(x) is 1; the minimum of Fn(x) is 1/n. From these results it follows that the lower bound of the ratio is no smaller than 1/n. Therefore
w*i F(x)
if z < ljn. Let zk be the least root of the equation
nz
\
n
J
Define events
J30.
A x ^ Z1 ,
If 1/n ^ z ^ c/w, the event
is the union of the events
Bo> Bi> B2f
, Bίnzl
.
For the purpose of evaluating the probability of ^kBk we put
Vo - Bo,
Vk= SoB 1 ... S t _ A
(fc = 1, 2, . . . , [nz])
1114
S. C. TANG
We have
(11)
P(Bk) = Σ P ( Vr)P(Bk
(fc = 0,1,
\Br),
, [nz]) „
Here
F{Bk
\jbo\ —
Now we transform (11) into the following form:
klpn(z)
r
r=i
pn_r(z)
By (4) of our lemma we obtain
P(Fr) = ( r ~ 1 > r ' 1 p*-*iz) = (n)(r-lY-K™-'>
r!
pn(z)
\r/
'--sn
( 1
<r <
~ ~
It follows that if 1/n g z ^ c/w,
i
\ _1_ "S? I
nz /
If z {> c/w,
inf
P{U<F ^{x)^~ F(X)
^^
U<Fnn{x)^~ F(X)
\
k=i\K'/
N
—
/
\ιvZ
— K>)
n
(nz)
}
>
=
Λ_ M
Theorem 4 is thus proved.
6 Conclusions, Theorems 1 and 3 can be applied to test whether
statistical data correspond with the theoretical distribution or not. In
the process of testing, zs and z{ are separately representative of the
ratio's upper and lower bounds. If S(zs) is small and In{z^) large, we
conclude that the empirical data in two extreme tails correspond with
the theoretical distribution; conversely, if S(zs) is very large and In{z^
very small, the statistical data do not correspond with the theoretical
distribution. Our test is more sensitive in the two tails, but less sensitive in the central part, than Smirnov's test.
PACIFIC JOURNAL OF MATHEMATICS
EDITORS
RALPH S. PHILLIPS
A.
Stanford University
Stanford, California
M.
L.
WHITEMAN
University of Southern California
Los Angeles 7, California
G. ARSOVE
LOWELL J.
University of Washington
Seattle 5, Washington
PAIGE
University of California
Los Angeles 24, California
ASSOCIATE EDITORS
E. F. BECKENBACH
T. M. CHERRY
D. DERRY
M. OHTSUKA
H. L. ROYDEN
E. SPANIER
E. G. STRAUS
F. WOLF
SUPPORTING INSTITUTIONS
UNIVERSITY OF BRITISH COLUMBIA
CALIFORNIA INSTITUTE OF TECHNOLOGY
UNIVERSITY OF CALIFORNIA
MONTANA STATE UNIVERSITY
UNIVERSITY OF NEVADA
NEW MEXICO STATE UNIVERSITY
OREGON STATE UNIVERSITY
UNIVERSITY OF OREGON
OSAKA UNIVERSITY
UNIVERSITY OF SOUTHERN CALIFORNIA
STANFORD UNIVERSITY
UNIVERSITY OF TOKYO
UNIVERSITY OF UTAH
WASHINGTON STATE UNIVERSITY
UNIVERSITY OF WASHINGTON
*
*
*
AMERICAN MATHEMATICAL SOCIETY
CALIFORNIA RESEARCH CORPORATION
SPACE TECHNOLOGY LABORATORIES
NAVAL ORDNANCE TEST STATION
Mathematical papers intended for publication in the Pacific Journal of Mathematics should
be typewritten (double spaced), and the author should keep a complete copy. Manuscripts may
be sent to any one of the four editors. All other communications to the editors should be addressed
to the managing editor, L. J. Paige at the University of California, Los Angeles 24, California.
50 reprints per author of each article are furnished free of charge; additional copies may be
obtained at cost in multiples of 50.
The Pacific Journal of Mathematics is published quarterly, in March, June, September, and
December. Effective with Volume 13 the price per volume (4 numbers) is $18.00; single issues, $5.00.
Special price for current issues to individual faculty members of supporting institutions and to
individual members of the American Mathematical Society: $8.00 per volume; single issues
$2.50. Back numbers are available.
Subscriptions, orders for back numbers, and changes of address should be sent to Pacific
Journal of Mathematics, 103 Highland Boulevard, Berkeley 8, California.
Printed at Kokusai Bunken Insatsusha (International Academic Printing Co., Ltd.), No. 6,
2-chome, Fujimi-cho, Chiyoda-ku, Tokyo, Japan.
PUBLISHED BY PACIFIC JOURNAL OF MATHEMATICS, A NON-PROFIT CORPORATION
The Supporting Institutions listed above contribute to the cost of publication of this Journal,
but they are not owners or publishers and have no responsibility for its content or policies.
Pacific Journal of Mathematics
Vol. 12, No. 3
March, 1962
Alfred Aeppli, Some exact sequences in cohomology theory for Kähler
manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Paul Richard Beesack, On the Green’s function of an N -point boundary value
problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
James Robert Boen, On p-automorphic p-groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
James Robert Boen, Oscar S. Rothaus and John Griggs Thompson, Further results
on p-automorphic p-groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
James Henry Bramble and Lawrence Edward Payne, Bounds in the Neumann
problem for second order uniformly elliptic operators . . . . . . . . . . . . . . . . . . . . . . .
Chen Chung Chang and H. Jerome (Howard) Keisler, Applications of ultraproducts
of pairs of cardinals to the theory of models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Stephen Urban Chase, On direct sums and products of modules . . . . . . . . . . . . . . . . . . .
Paul Civin, Annihilators in the second conjugate algebra of a group algebra . . . . . . .
J. H. Curtiss, Polynomial interpolation in points equidistributed on the unit
circle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Marion K. Fort, Jr., Homogeneity of infinite products of manifolds with
boundary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
James G. Glimm, Families of induced representations . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Daniel E. Gorenstein, Reuben Sandler and William H. Mills, On almost-commuting
permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Vincent C. Harris and M. V. Subba Rao, Congruence properties of σr (N ) . . . . . . . . . .
Harry Hochstadt, Fourier series with linearly dependent coefficients . . . . . . . . . . . . . . .
Kenneth Myron Hoffman and John Wermer, A characterization of C(X ) . . . . . . . . . . .
Robert Weldon Hunt, The behavior of solutions of ordinary, self-adjoint differential
equations of arbitrary even order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Edward Takashi Kobayashi, A remark on the Nijenhuis tensor . . . . . . . . . . . . . . . . . . . .
David London, On the zeros of the solutions of w00 (z) + p(z)w(z) = 0 . . . . . . . . . . . . .
Gerald R. Mac Lane and Frank Beall Ryan, On the radial limits of Blaschke
products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
T. M. MacRobert, Evaluation of an E-function when three of its upper parameters
differ by integral values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Robert W. McKelvey, The spectra of minimal self-adjoint extensions of a symmetric
operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Adegoke Olubummo, Operators of finite rank in a reflexive Banach space. . . . . . . . . .
David Alexander Pope, On the approximation of function spaces in the calculus of
variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Bernard W. Roos and Ward C. Sangren, Three spectral theorems for a pair of
singular first-order differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Arthur Argyle Sagle, Simple Malcev algebras over fields of characteristic zero . . . . .
Leo Sario, Meromorphic functions and conformal metrics on Riemann surfaces . . . .
Richard Gordon Swan, Factorization of polynomials over finite fields . . . . . . . . . . . . . .
S. C. Tang, Some theorems on the ratio of empirical distribution to the theoretical
distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Robert Charles Thompson, Normal matrices and the normal basis in abelian
number fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Howard Gregory Tucker, Absolute continuity of infinitely divisible distributions . . . .
Elliot Carl Weinberg, Completely distributed lattice-ordered groups . . . . . . . . . . . . . . .
James Howard Wells, A note on the primes in a Banach algebra of measures . . . . . . .
Horace C. Wiser, Decomposition and homogeneity of continua on a 2-manifold . . . .
791
801
813
817
823
835
847
855
863
879
885
913
925
929
941
945
963
979
993
999
1003
1023
1029
1047
1057
1079
1099
1107
1115
1125
1131
1139
1145