ON THE THEORY OF ORDER STATISTICS
By
ALFRÉD RÉNYI (Budapest), corresponding member of the Academy
Dedicated to A. N. KOLMOGOROV
on the occasion of his 50th birthday
Introduction
Since the beginning of the century many authors, e.g. K. PEARSON [1], L. v. BORTKIEWICZ [2], E. L. DODD [3], L. H. C. TIPPETT [4], and M. FRÉCHET [5], have dealt with particular problems which may be classified as belonging to the theory of order statistics. A. N. KOLMOGOROV [6], V. I. GLIVENKO [7], N. V. SMIRNOV [8], B. V. GNEDENKO [9], and other mathematicians, having recognized the great theoretical and practical importance of this set of problems, developed this subject into a systematic theory.
In the last three years a particularly great number of papers dealt with such problems; of these we mention those of B. V. GNEDENKO and V. S. KOROLJUK [10], B. V. GNEDENKO and E. L. RVAČEVA [11], B. V. GNEDENKO and V. S. MIHALEVIČ [12], V. S. MIHALEVIČ [13], J. D. KVIT [14], G. M. MANIA [15], I. I. GIHMAN [16], W. FELLER [17], J. L. DOOB [18], F. J. MASSEY [19], M. D. DONSKER [20], and T. W. ANDERSON and D. A. DARLING [21]. A bibliography up to 1947 is to be found in the paper of S. S. WILKS [30], enumerating 90 papers.
The purpose of the present paper is to give a new method by means of which many important results of the theory of order statistics can be obtained with surprising simplicity; the method also enables us to prove several new theorems. The essential novelty of this method is that it reduces the problems connected with order statistics to the study of sums of mutually independent random variables. § 1 contains the review of the method, § 2 is devoted to the proof of some known theorems by means of this method, and § 3 contains the formulation of some new results obtained by this method, concerning the comparison of the sample distribution function with that of the population. These results are connected with the fundamental results of A. N. KOLMOGOROV and N. V. SMIRNOV.
Let F_n(x) denote the distribution function of a sample of size n drawn from a population having the continuous distribution function F(x); in other words, F_n(x) denotes the frequency ratio of sample values not exceeding x. KOLMOGOROV determined the limiting distribution of the supremum of |F_n(x) − F(x)|, and SMIRNOV did the same for F_n(x) − F(x); in § 4 we shall determine the limiting distribution of the supremum* of the relative deviations

(F_n(x) − F(x)) / F(x)   and   |F_n(x) − F(x)| / F(x),

respectively. To do this, besides our method,
the lemmas in § 4, generalizing some results of P. ERDŐS and M. KAC [22], are needed. These lemmas are of certain interest in themselves. § 5 contains the proof of the results formulated in § 3, while § 6 contains some remarks on the numerical computation of the values of the limiting distribution functions occurring in our theorems; tables for one of these are also given. I have found the method formulated in § 1 by analysing a theorem of S. MALMQUIST [23]; § 1 contains, among others, a new and simple proof of this theorem of MALMQUIST. Together with G. HAJÓS, we have found another simple proof of this theorem, which will be published in a joint paper of ours [24]. All these investigations have their origin in the discussions in a seminar of the Departments of Probability Theory and of Mathematical Statistics of the Institute for Applied Mathematics of the Hungarian Academy of Sciences. I lectured on part of the results contained in this paper in January 1953 at the congress of the Humboldt University in Berlin¹ and in September 1953 at the VIIIth Polish Mathematical Congress in Warsaw. On this last occasion A. N. KOLMOGOROV made certain valuable remarks, for which I express to him my most sincere thanks. Further, I express my thanks to T. LIPTÁK, who participated in the preparation of this paper by the elaboration of some particular calculations, as well as to Miss I. PALÁSTI and Mrs. P. VÁRNAI for the numerical computations.
§ 1. A new method in the theory of order statistics
Let us start with the following special case: let a sample of size n be given concerning the value of a random variable ξ of exponential distribution, i.e. the results of n independent observations of its value, denoted by ξ₁, ξ₂, ..., ξ_n; in other words, let ξ₁, ξ₂, ..., ξ_n be mutually independent random variables with the same distribution function of exponential type. We need the following well-known property of the exponential distribution: if ξ is an exponentially distributed random variable, then

(1.1)    P(ξ < x + y | ξ ≥ y) = P(ξ < x),
* (in an interval 0 < a ≤ F(x) ≤ b ≤ 1).
¹ This lecture will be published in the communications of the Congress under the following title: "Eine neue Methode in der Theorie der geordneten Stichproben".
if x > 0 and y ≥ 0.² This property characterizes uniquely the exponential distribution. Indeed, let F(x) be the distribution function of ξ; then

P(ξ < x + y | ξ ≥ y) = (F(x + y) − F(y)) / (1 − F(y)),

and it follows that (1.1) is equivalent to the following relation:

(1.2)    Φ(x + y) = Φ(x) Φ(y),

where Φ(x) = 1 − F(x). It is known, however, that of all functions satisfying the condition 0 ≤ Φ(x) ≤ 1, except for the trivial cases Φ(x) ≡ 0 and Φ(x) ≡ 1, the functions Φ(x) = e^{−λx} (λ > 0), and only these, satisfy the functional equation (1.2).
The meaning of (1.1) becomes especially clear if the random variable ξ is interpreted as the duration of a happening of random length. In this interpretation the proposition (1.1) can be formulated as follows: in case of a happening of exponentially distributed random duration, being in progress at the moment y, the further duration of the happening does not depend on y, i.e. on its duration until the given moment.
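The characterization (1.1) is easy to check by simulation. The following sketch (the sample size, the seed and the values of x and y are our arbitrary choices, not taken from the paper) estimates both sides of (1.1) for an exponential sample:

```python
import math
import random

random.seed(1)
x, y = 0.7, 1.3

# A large exponential sample with rate lambda = 1.
sample = [random.expovariate(1.0) for _ in range(200_000)]

# Left side of (1.1): P(xi < x + y | xi >= y), estimated on the
# observations that survived past the moment y.
survivors = [v for v in sample if v >= y]
p_cond = sum(1 for v in survivors if v < x + y) / len(survivors)

# Right side of (1.1): P(xi < x); the exact value is 1 - e^{-x}.
p_uncond = sum(1 for v in sample if v < x) / len(sample)

print(p_cond, p_uncond, 1 - math.exp(-x))   # all three should be close
```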
Let us arrange the numbers ξ₁, ξ₂, ..., ξ_n in order of magnitude and use the notation

(1.3)    ξ*_k = R_k(ξ₁, ξ₂, ..., ξ_n)    (k = 1, 2, ..., n),

where the function R_k(x₁, x₂, ..., x_n) of the n variables x₁, x₂, ..., x_n denotes the k-th of the values x₁, x₂, ..., x_n in order of magnitude (k = 1, 2, ..., n); thus e.g. ξ*₁ = min_{1≤k≤n} ξ_k and ξ*_n = max_{1≤k≤n} ξ_k. Then the individual and joint distributions of the values of the order statistics ξ*₁ ≤ ξ*₂ ≤ ⋯ ≤ ξ*_n can be most easily determined. For that purpose we interpret the variables ξ_k as random durations of mutually independent happenings; then ξ*_k denotes the duration of the happening finished as k-th of the n happenings. Let us determine, first of all, the distributions of the differences ξ*_{k+1} − ξ*_k. If ξ*_k = y, then

(1.4)    P(ξ*_{k+1} − ξ*_k > x | ξ*_k = y) = P(ξ*_{k+1} > x + y | ξ*_k = y),

where on the right side there stands the probability of the event that none of the n − k happenings, being in progress at the moment y, finishes until the moment x + y. By virtue of (1.1), the value of this probability is

(P(ξ > x))^{n−k} = e^{−(n−k)λx},

and thus the conditional distribution function of ξ*_{k+1} − ξ*_k with respect to the condition ξ*_k = y is

(1.5)    P(ξ*_{k+1} − ξ*_k < x | ξ*_k = y) = 1 − e^{−(n−k)λx}.
² P(A) denotes the probability of the event A, and P(A|B) denotes the conditional probability of the event A with respect to the event B.
As the conditional distribution function (1.5) does not depend on y, (1.5) gives also the non-conditional distribution function of ξ*_{k+1} − ξ*_k; indeed, by virtue of the theorem on total probability,

(1.6)    P(ξ*_{k+1} − ξ*_k < x) = ∫₀^∞ P(ξ*_{k+1} − ξ*_k < x | ξ*_k = y) dP(ξ*_k < y) = 1 − e^{−(n−k)λx}.

Therefore the differences ξ*_{k+1} − ξ*_k are themselves exponentially distributed with the mean value

(1.7)    1/((n−k)λ),

and thus the variables

δ_{k+1} = (n − k)(ξ*_{k+1} − ξ*_k)    (k = 0, 1, ..., n−1)

are also exponentially distributed with the mean value 1/λ. (In the above relation, by definition, ξ*₀ = 0.)
It follows from the abovesaid that the variables δ₁, δ₂, ..., δ_n are mutually independent random variables. It is namely easy to see that the probability

(1.8)    P(ξ*_{k+1} − ξ*_k < x | ξ*₁ = y₁, ξ*₂ − ξ*₁ = y₂, ..., ξ*_k − ξ*_{k−1} = y_k)

does not depend on the variables y₁, y₂, ..., y_k; this is evident, as the above conditions mean that ξ*₁ = y₁, ξ*₂ = y₁ + y₂, ..., ξ*_k = y₁ + y₂ + ⋯ + y_k, i.e. they give the moments of the finishing of the k happenings which finish first of the n happenings which started simultaneously at the moment t = 0. These conditions imply that at the moment t = y₁ + y₂ + ⋯ + y_k there are still n − k happenings in progress, and the probability of the finishing of at least one of them before the moment t + x is equal to 1 − e^{−(n−k)λx}. Thus the probability on the left hand side of (1.8) equals 1 − e^{−(n−k)λx}, i.e. it does not depend on the variables y₁, y₂, ..., y_k, and this is equivalent to the fact that the variables ξ*_{k+1} − ξ*_k (and also the variables δ_k) are mutually independent.

Thus the variables ξ*_k can be expressed in the form

(1.9)    ξ*_k = δ₁/n + δ₂/(n−1) + ⋯ + δ_k/(n−k+1)    (k = 1, 2, ..., n),

i.e. as linear forms of mutually independent random variables having the same distribution. (1.9) can also be expressed by saying that the variables ξ*_k form an (additive) Markov chain. By virtue of (1.9), the distribution of any ξ*_k, and further the joint distribution of any number of the variables ξ*_k, can be determined in explicit form.
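The representation (1.9) lends itself to a direct numerical check: the mean of the k-th order statistic of an exponential sample must agree with the mean of the linear form. A Monte Carlo sketch (sample size, trial count and seed are arbitrary choices):

```python
import random

random.seed(2)
n, trials = 10, 50_000

# Simulated mean of each order statistic xi*_k of n exponential variables.
mean_direct = [0.0] * n
for _ in range(trials):
    ordered = sorted(random.expovariate(1.0) for _ in range(n))
    for k in range(n):
        mean_direct[k] += ordered[k] / trials

# Mean predicted by (1.9) with lambda = 1:
# E(xi*_k) = 1/n + 1/(n-1) + ... + 1/(n-k+1), since E(delta_j) = 1.
mean_predicted, acc = [], 0.0
for j in range(1, n + 1):
    acc += 1.0 / (n - j + 1)
    mean_predicted.append(acc)

for k in range(n):
    print(k + 1, round(mean_direct[k], 3), round(mean_predicted[k], 3))
```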
Consider now how the abovesaid can be applied in general to the study of order statistics. Let ξ be any random variable having a continuous and steadily increasing³ distribution function F(x), and let (ξ₁, ξ₂, ..., ξ_n) be a

³ By saying that F(x) is steadily increasing, we mean that F(x) is a strictly increasing monotone function in the least interval (a, b) for which F(a) = 0 and F(b) = 1; it may happen that a = −∞ or b = +∞.
sample of size n consisting of n independent observations of the value of ξ; that is to say, let ξ₁, ξ₂, ..., ξ_n be mutually independent random variables with the same (continuous) distribution function F(x). Let us arrange the sample values ξ_k in order of magnitude, that is to say, let us form the new variables ξ*_k = R_k(ξ₁, ξ₂, ..., ξ_n).

The main problem of order statistics is to study the variables ξ*_k; this, however, can be reduced to the special case when the variables ξ_k are exponentially distributed (and therefore, by virtue of (1.9), to the study of sums of mutually independent random variables), as follows: let us put

(1.10)    η_k = F(ξ_k)  and  ζ_k = log(1/η_k)    (k = 1, 2, ..., n),

and let us denote by η*_k = F(ξ*_k) the k-th of the variables η₁, η₂, ..., η_n in order of magnitude, i.e. let us put η*_k = R_k(η₁, η₂, ..., η_n); further, let us put

(1.11)    ζ*_k = log(1/η*_{n+1−k})    (k = 1, 2, ..., n).

As log(1/x) is a steadily decreasing function, we obtain

(1.12)    ζ*_k = R_k(ζ₁, ζ₂, ..., ζ_n)    (k = 1, 2, ..., n),

whence ζ*_k is the k-th of the variables ζ₁, ζ₂, ..., ζ_n in order of magnitude. As we have assumed the variables ξ_k to be mutually independent, it follows that the variables ζ_k are also mutually independent.
Let us investigate now the distribution of the single variable ζ_k. F(x) being a strictly increasing function, the inverse function of x = F(y), denoted by y = F⁻¹(x), is uniquely defined in the interval 0 ≤ x ≤ 1, and thus

P(ζ_k < x) = P(log(1/F(ξ_k)) < x) = P(ξ_k > F⁻¹(e^{−x})) = 1 − F(F⁻¹(e^{−x})) = 1 − e^{−x}

if x > 0. Therefore the variables ζ₁, ζ₂, ..., ζ_n are mutually independent and of exponential distribution with the mean value 1. In this way, the random variables ξ*₁, ..., ξ*_n themselves can be expressed in the form

(1.13)    ξ*_k = F⁻¹(e^{−ζ*_{n+1−k}}) = F⁻¹(exp(−(δ₁/n + δ₂/(n−1) + ⋯ + δ_{n+1−k}/k)))    (k = 1, 2, ..., n),

where the variables δ₁, δ₂, ..., δ_n are mutually independent and of exponential distribution with the same distribution function 1 − e^{−x} (x > 0). It also follows from this result that the quotients

(1.14)    (η*_k / η*_{k+1})^k = e^{−δ_{n+1−k}}    (k = 1, 2, ..., n)

are mutually independent random variables (here, by definition, η*_{n+1} = 1), since the variables δ_{n+1−k} are, as we have seen, mutually independent.
Another consequence of (1.13) is that the variables ξ*_n, ξ*_{n−1}, ..., ξ*₁ form a Markov chain (and thus the variables ξ*₁, ξ*₂, ..., ξ*_n also form a Markov chain); this follows from

(1.15)    ξ*_{n−k} = F⁻¹(F(ξ*_{n+1−k}) e^{−δ_{k+1}/(n−k)})    (k = 1, 2, ..., n−1),

which is obtained from (1.13) and from the fact that the variables ξ*_{n+1−k} and δ_{k+1} are independent, since

ξ*_{n+1−k} = F⁻¹(exp(−(δ₁/n + δ₂/(n−1) + ⋯ + δ_k/(n−k+1)))).
A. N. KOLMOGOROV [6a] was the first who remarked that the variables ξ*₁, ξ*₂, ..., ξ*_n, i.e. the sequence of order statistics, form a Markov chain. The new method contained in the present paper starts from this fundamental observation, but the possibilities implied by it could be developed only after having transformed the Markov chain {ξ*_k} into an additive Markov chain by means of the transformation ζ*_k = log(1/F(ξ*_{n−k+1})). In this connection it is interesting to consider the following general problem: for which Markov processes {ξ_t} can such a family of functions G_t(x) be found that the variables ζ_t = G_t(ξ_t) form an additive Markov process? A necessary condition of this is that the distribution function F(x, s; y, t) = P(ξ_t < x | ξ_s = y) of the Markov process should satisfy the following differential equation:

(∂F/∂x)(∂F/∂y)((∂F/∂y)(∂³F/∂x²∂y) − (∂F/∂x)(∂³F/∂x∂y²)) = (∂²F/∂x∂y)((∂²F/∂x²)(∂F/∂y)² − (∂²F/∂y²)(∂F/∂x)²).

We hope to return to the discussion of this problem on another occasion.
The variables η_k are obviously uniformly distributed in the interval (0, 1), because, if 0 < x < 1, then

(1.16)    P(η_k < x) = P(ξ_k < F⁻¹(x)) = F(F⁻¹(x)) = x,

and therefore the variables η*_k form an ordered sample of size n drawn from a population of uniform distribution in the interval (0, 1).

It follows from (1.14) that (η*_k / η*_{k+1})^k = e^{−δ_{n+1−k}}, and as

P(e^{−δ_{n+1−k}} < x) = P(δ_{n+1−k} > log(1/x)) = e^{−log(1/x)} = x,

the variables (η*_k / η*_{k+1})^k are mutually independent and have the same distribution, namely, they are uniformly distributed in the interval (0, 1). This is the theorem of S. MALMQUIST mentioned in the introduction.
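MALMQUIST's theorem is easy to illustrate by simulation; in the sketch below (the sample size n = 5, the trial count and the seed are arbitrary choices) each transformed ratio behaves like a uniform variable on (0, 1):

```python
import random

random.seed(3)
n, trials = 5, 40_000

ratios = {k: [] for k in range(1, n + 1)}
for _ in range(trials):
    eta = sorted(random.random() for _ in range(n))
    eta.append(1.0)                        # eta*_{n+1} = 1 by definition
    for k in range(1, n + 1):
        ratios[k].append((eta[k - 1] / eta[k]) ** k)

# A uniform (0, 1) variable has mean 1/2 and P(value < 1/2) = 1/2.
for k in range(1, n + 1):
    mean_k = sum(ratios[k]) / trials
    below_half = sum(1 for v in ratios[k] if v < 0.5) / trials
    print(k, round(mean_k, 3), round(below_half, 3))
```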
§ 2. The theory of order statistics built up by means of the method of § 1

On the basis of what has been said above it is easy to obtain the results concerning the limiting distribution of order statistics. In order to show this, we shall prove the following theorems.⁴

THEOREM 1. If k ≥ 1 is a fixed positive integer, then

lim_{n→∞} P(n ζ*_k < x) = ∫₀^x (t^{k−1} e^{−t} / (k−1)!) dt    (x > 0),

i.e. n ζ*_k has in the limit a Γ-distribution of order k.
PROOF. As we have seen,

ζ*_k = δ₁/n + δ₂/(n−1) + ⋯ + δ_k/(n−k+1),

where δ₁, δ₂, ..., δ_k are mutually independent and exponentially distributed random variables with the common distribution function 1 − e^{−x} (x > 0). Therefore the variable δ_j/(n+1−j) has the probability density function (n+1−j) e^{−(n+1−j)x} (x > 0), and thus it follows by simple calculation that the probability density function of ζ*_k is

g_n(t) = k (n choose k) e^{−nt} (e^t − 1)^{k−1};

hence n ζ*_k has the density function

(2.1)    (1/n) g_n(t/n) = (n(n−1)⋯(n−k+1) / ((k−1)! n^k)) [n(e^{t/n} − 1)]^{k−1} e^{−t}.

As lim_{n→∞} n(e^{t/n} − 1) = t, the density function of n ζ*_k converges to t^{k−1} e^{−t} / (k−1)! as n → ∞, i.e. to the density function of the Γ-distribution of order k.
This result might have been expected by the following consideration. Obviously

(2.2)    n ζ*_k = δ₁ + δ₂ + ⋯ + δ_k + Σ_{j=2}^k ((j−1) δ_j / (n+1−j));

the density function of δ₁ + δ₂ + ⋯ + δ_k is, however, t^{k−1} e^{−t} / (k−1)!; on the other hand, the variable Σ_{j=2}^k ((j−1) δ_j / (n+1−j)) tends stochastically to 0 as n → ∞.

⁴ We use the notations introduced in § 1.
To develop this consideration into a precise proof, we need the following lemma, due to H. CRAMÉR ([25], p. 254).

LEMMA 1. Let us put γ_n = α_n + β_n, where α_n and β_n are random variables, and let F_n(x) denote the distribution function of α_n; further, let us assume that⁵

lim_{n→∞} M β_n = 0  and  lim_{n→∞} D β_n = 0.

Furthermore, let us suppose that there exists the limit F(x) of the distribution functions F_n(x) as n → ∞, i.e.

lim_{n→∞} F_n(x) = F(x)

holds for all points x of continuity of the distribution function F(x). Further, let us denote the distribution function of γ_n by G_n(x). Then

lim_{n→∞} G_n(x) = F(x).
PROOF.⁶ Without loss of generality we may suppose that M β_n = 0. Then, by virtue of Tchebyshev's inequality, we have

P(|β_n| > ε) ≤ D²β_n / ε²;

therefore, given any arbitrarily small ε > 0 and δ > 0, there exists a positive integer n₀(δ) such that

P(|β_n| > ε) < δ  if  n > n₀(δ).

But then

(2.3)    G_n(x) = P(γ_n < x) ≤ P(α_n < x + ε) + P(β_n < −ε).

In fact, if α_n + β_n < x, then either α_n < x + ε or α_n ≥ x + ε, but in the latter case at the same time β_n < −ε holds, and we obtain (2.3) by means of the theorem on total probability. Similarly,

(2.4)    G_n(x) = P(γ_n < x) ≥ P(α_n < x − ε) − P(β_n > ε);

in fact, if α_n < x − ε, then either α_n + β_n < x, or α_n + β_n ≥ x, but in the latter case at the same time β_n > ε. Consequently, we have

F_n(x − ε) − δ ≤ G_n(x) ≤ F_n(x + ε) + δ.

Passing to the limit n → ∞ and considering that ε and δ can be chosen arbitrarily small, it follows that

F(x − 0) ≤ lim inf G_n(x) ≤ lim sup G_n(x) ≤ F(x + 0),

⁵ Here and in what follows we shall denote the mean value of the random variable ξ by Mξ and its standard deviation by Dξ.
⁶ We give here the proof of this lemma of CRAMÉR because a similar method of proof is needed in the proof of Lemma 2.
i.e., that in all points of continuity of F(x),

lim_{n→∞} G_n(x) = F(x).

This proves Lemma 1.

Theorem 1 follows immediately from this lemma, as

M(Σ_{j=2}^k ((j−1) δ_j / (n+1−j))) = Σ_{j=2}^k ((j−1) / (n+1−j)) → 0

and

D²(Σ_{j=2}^k ((j−1) δ_j / (n+1−j))) = Σ_{j=2}^k ((j−1)² / (n+1−j)²) → 0,

and thus the conditions of Lemma 1 are satisfied.
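Theorem 1 can be checked numerically: for moderate n the distribution of n ζ*_k is already close to the Γ-distribution of order k. A sketch for k = 2 (the values of n, the trial count, the seed and the test point x are arbitrary choices):

```python
import math
import random

random.seed(4)
n, k, trials, x = 200, 2, 20_000, 2.0

hits = 0
for _ in range(trials):
    # zeta*_k: the k-th smallest of n independent exponential variables.
    zeta_k = sorted(random.expovariate(1.0) for _ in range(n))[k - 1]
    if n * zeta_k < x:
        hits += 1
empirical = hits / trials

# Gamma-distribution of order 2 at x: 1 - e^{-x}(1 + x).
limit = 1 - math.exp(-x) * (1 + x)
print(round(empirical, 3), round(limit, 3))
```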
By means of Theorem 1 we can easily determine also the limiting distribution of η*_{n+1−k}. We have η*_{n+1−k} = e^{−ζ*_k} and therefore

P(n(1 − η*_{n+1−k}) < x) = P(ζ*_k < log(1/(1 − x/n)));

since

lim_{n→∞} n log(1/(1 − x/n)) = x

and the Γ-distribution is continuous, it follows that n(1 − η*_{n+1−k}) has in the limit also a Γ-distribution of order k, with the density function t^{k−1} e^{−t} / (k−1)! (t > 0). Now, the random variables η_k are mutually independent and uniformly distributed in the interval (0, 1). Because of the symmetry of the uniform distribution, the same holds also for the variables 1 − η_k (k = 1, 2, ..., n), and thus the variables

η*_k = R_k(η₁, η₂, ..., η_n)  and  1 − η*_{n+1−k} = R_k(1 − η₁, 1 − η₂, ..., 1 − η_n)

have the same distribution. Hence follows the following
THEOREM 2. The distribution of the variables n η*_k and n(1 − η*_{n+1−k}), in case of any fixed k ≥ 1, tends to the Γ-distribution of order k with the density function t^{k−1} e^{−t} / (k−1)! (t > 0).⁷

By means of Theorem 2 we can determine also the limiting distributions of ξ*_k and ξ*_{n+1−k}; these, however, contrary to the limiting distributions of the variables η*_k and ζ*_k, will depend on the distribution function F(x) (see [8]).

⁷ The distribution of the variables η*_k can also be determined exactly for finite n and, after this, passing to the limit, Theorem 2 can be proved also in this manner by means of some simple calculations (see H. CRAMÉR [25]). We proved this theorem here by means of our method to show its application first in a simple case.
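For k = 1 the statement of Theorem 2 about n(1 − η*_n) can even be checked by an exact computation, since the maximum of n uniform variables satisfies P(η*_n < u) = u^n; no simulation is needed (the test point x is an arbitrary choice):

```python
import math

x = 1.5

# P(n(1 - eta*_n) < x) = P(eta*_n > 1 - x/n) = 1 - (1 - x/n)^n.
for n in (10, 100, 1000, 10_000):
    exact = 1 - (1 - x / n) ** n
    print(n, round(exact, 5))

# Limit for k = 1: the exponential distribution function 1 - e^{-x}.
print("limit", round(1 - math.exp(-x), 5))
```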
Now we shall prove the following

THEOREM 3.⁸ The variables η*_k and η*_{n+1−j} are independent in the limit if n → ∞ while k and j are fixed; namely,

lim_{n→∞} P(n η*_k < x, n(1 − η*_{n+1−j}) < y) = ∫₀^x ∫₀^y (u^{k−1} v^{j−1} e^{−u−v} / ((k−1)!(j−1)!)) du dv    (x > 0, y > 0).
PROOF. First of all we prove a lemma:

LEMMA 2. Let us put γ_n = α_n + β_n, where α_n and β_n are random variables, and let a random variable χ_n be given which is independent of α_n. Let us denote the distribution functions of α_n and χ_n by F_n(x) and H_n(x), respectively. Let us assume that the limit functions F(x) = lim F_n(x) and H(x) = lim H_n(x) exist and, further, that

lim_{n→∞} M β_n = 0  and  lim_{n→∞} D β_n = 0.

In case all these conditions are satisfied, we have

lim_{n→∞} P(γ_n < x, χ_n < y) = F(x) H(y),

i.e., the variables γ_n and χ_n become independent in the limit.

PROOF. Let us choose (as in the proof of Lemma 1), to given ε > 0 and δ > 0, the positive integer n₀ so large that

P(|β_n| > ε) < δ  if  n > n₀.

Similarly to the arguments applied in the proof of Lemma 1 it may be proved that, if n > n₀, where n₀ depends on the choice of the positive numbers ε and δ, then

(2.5)    P(α_n < x − ε, χ_n < y) − δ ≤ P(γ_n < x, χ_n < y) ≤ P(α_n < x + ε, χ_n < y) + δ,

and as, by our assumption, α_n and χ_n are independent, therefore

P(α_n < x ± ε, χ_n < y) = F_n(x ± ε) H_n(y),

and similarly to the proof of Lemma 1, we obtain that in all points of continuity of F(x) and H(y)

lim_{n→∞} P(γ_n < x, χ_n < y) = F(x) H(y)

holds. This completes the proof of Lemma 2.
Now

ζ*_{n+1−k} − log n = (δ_{j+1}/(n−j) + δ_{j+2}/(n−j−1) + ⋯ + δ_{n+1−k}/k − log n) + β_n^{(j)},

where

β_n^{(j)} = δ₁/n + δ₂/(n−1) + ⋯ + δ_j/(n+1−j),

and therefore

lim_{n→∞} M β_n^{(j)} = lim_{n→∞} D β_n^{(j)} = 0.

As the sum δ_{j+1}/(n−j) + ⋯ + δ_{n+1−k}/k − log n is independent of ζ*_j, and

lim_{n→∞} n log(1/(1 − y/n)) = y,

and further we know that

lim_{n→∞} P(n ζ*_j < y) = ∫₀^y (t^{j−1} e^{−t} / (j−1)!) dt,

we have

(2.6)    lim_{n→∞} P(n(1 − η*_{n+1−j}) < y) = lim_{n→∞} P(ζ*_j < log(1/(1 − y/n))) = F_j(y),

where

F_j(y) = ∫₀^y (t^{j−1} e^{−t} / (j−1)!) dt.

On the other hand, by Lemma 1,

(2.7)    lim_{n→∞} P(n η*_k < x) = lim_{n→∞} P(ζ*_{n+1−k} − log n > log(1/x)),

and by virtue of Theorem 2

(2.8)    lim_{n→∞} P(n η*_k < x) = ∫₀^x (t^{k−1} e^{−t} / (k−1)!) dt.

From the relations (2.6), (2.7) and (2.8), applying Lemma 2 with γ_n = ζ*_{n+1−k} − log n and χ_n = n ζ*_j, Theorem 3 follows.

⁸ See CRAMÉR [25], p. 371. We discuss this well-known theorem here because our method throws more light on the real ground of the fact expressed in the theorem than its known proof.
By means of Theorem 3 we can determine the limiting distribution of the difference η*_n − η*₁. This is important because η*_n − η*₁, the range of the sample (η₁, η₂, ..., η_n), can be used to estimate the standard deviation of the population. As, for large n, the variable η*_n is near to 1 and η*₁ is near to 0 with a probability near to 1, we obviously have to consider the variable n[1 − (η*_n − η*₁)], and as, by virtue of Theorem 3, n η*₁ and n(1 − η*_n) are independent in the limit, the limiting distribution of their sum equals the composition of their limiting distributions. As e^{−x} (x > 0) is the density function of the limiting distribution of both n η*₁ and n(1 − η*_n), the density function of the limiting distribution of n[1 − (η*_n − η*₁)] is

∫₀^x e^{−(x−y)} e^{−y} dy = x e^{−x}    (x > 0);

therefore n[1 − (η*_n − η*₁)] is in the limit a random variable having a Γ-distribution of order 2.
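The Γ-distribution of order 2 for n[1 − (η*_n − η*₁)] shows up clearly in simulation (the sample size, trial count, seed and test point are arbitrary choices):

```python
import math
import random

random.seed(6)
n, trials, x = 400, 10_000, 3.0

hits = 0
for _ in range(trials):
    u = [random.random() for _ in range(n)]
    sample_range = max(u) - min(u)        # eta*_n - eta*_1
    if n * (1 - sample_range) < x:
        hits += 1
empirical = hits / trials

# Gamma-distribution of order 2 at x: 1 - e^{-x}(1 + x).
limit = 1 - math.exp(-x) * (1 + x)
print(round(empirical, 3), round(limit, 3))
```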
By means of the limiting distributions of the random variables η*_k, the limiting distributions of the random variables ξ*_k can also be determined.

Hitherto we have considered the limiting distributions of the order statistics ξ*_k (resp. those of their transformed values η*_k and ζ*_k) under the condition that the index k (resp. j = n+1−k) is fixed while n → ∞; this set of problems is called the study of the "extreme values" of the sample. We now turn to the study of the limiting distributions of the variables ξ*_k (resp. of η*_k and ζ*_k) under the condition that together with n, k tends also to infinity, namely so that |k − nq| = o(√n), where q is a constant (0 < q < 1). The variable ξ*_{[nq]+1} satisfies this condition;⁹ this variable is called the q-quantile of the sample. In the special case n = 2m + 1, where m is an integer, the variable ξ*_{m+1} is the median of the sample; obviously, the q-quantile of the sample is nothing else but the q-quantile of the sample distribution function, and thus the median of the sample is nothing else but the median of the sample distribution function. Consequently, if n is an even integer, i.e. n = 2m, then ½(ξ*_m + ξ*_{m+1}) is called the median of the sample.
We shall now prove the following theorem, containing the proposition that the q-quantiles of the sample are in the limit normally distributed if the distribution function F(x) of the population satisfies certain simple conditions.

THEOREM 4.¹⁰ Let us suppose that the density function f(x) = F′(x) of the common distribution function F(x) of the mutually independent random variables ξ₁, ξ₂, ..., ξ_n exists, and that f(x) is continuous and positive in the interval a < x < b; then, if 0 < F(a) < q < F(b) < 1 and, further, if |k_n − nq| = o(√n) (and thus, a fortiori, lim k_n/n = q), then ξ*_{k_n} is, in the limit, normally distributed with the mean value Q = F⁻¹(q), which is the q-quantile of the distribution function F(x), i.e.

lim_{n→∞} P((ξ*_{k_n} − Q) f(Q) √(n/(q(1−q))) < x) = (1/√(2π)) ∫_{−∞}^x e^{−t²/2} dt.

⁹ [x] is the largest integer for which [x] ≤ x.
¹⁰ This theorem is contained in a general theorem of N. V. SMIRNOV [8g].
PROOF. First consider the limiting distribution of

(2.9)    ζ*_{n+1−k_n} = Σ_{j=1}^{n+1−k_n} (δ_j / (n+1−j)),

where the variables δ_j are mutually independent and exponentially distributed with the distribution function 1 − e^{−x} (x > 0); that is to say, M δ_j = D δ_j = 1 and

M(|δ_j − 1|³) = ∫₀^∞ |x−1|³ e^{−x} dx < ∫₀^1 e^{−x} dx + e^{−1} ∫₀^∞ x³ e^{−x} dx ≤ 5.

Then, however,

(2.10)    S_n² = D²(ζ*_{n+1−k_n}) = Σ_{j=1}^{n+1−k_n} (1/(n+1−j)²),    K_n³ = Σ_{j=1}^{n+1−k_n} (M(|δ_j − 1|³)/(n+1−j)³) ≤ 5 Σ_{j=1}^{n+1−k_n} (1/(n+1−j)³),

and thus

(2.11)    K_n³ / S_n³ ≤ (5/S_n³) Σ_{j=1}^{n+1−k_n} (1/(n+1−j)³).

Therefore, if n → ∞, then K_n/S_n → 0; this means that the central limit theorem in LIAPOUNOV's form can be applied to the sequence of sums (2.9), and thus

(2.12)    lim_{n→∞} P((ζ*_{n+1−k_n} − M_n)/S_n < x) = (1/√(2π)) ∫_{−∞}^x e^{−t²/2} dt,

where M_n = M(ζ*_{n+1−k_n}) = Σ_{j=1}^{n+1−k_n} (1/(n+1−j)).
Now, it is known that

Σ_{k=1}^m (1/k) = log m + C + Δ_m,

where C is the Euler constant, and a constant A > 0 can be found such that |Δ_m| < A/m. By means of some simple calculations it can be verified that

(2.13)    Σ_{k=h}^H (1/k²) = 1/h − 1/H + θ/(h(h−1))    (0 < θ < 1).

Thus we have

(2.14)    M_n = log(n/k_n) + ε_n′,

(2.15)    S_n² = 1/k_n − 1/n + ε_n″,

where |ε_n′| < A/k_n and |ε_n″| < B/k_n², A and B being constants not depending on n.
Therefore

(2.16)    (ζ*_{n+1−k_n} − M_n)/S_n = (ζ*_{n+1−k_n} − log(n/k_n)) / √((n−k_n)/(n k_n)) + ε_n‴,

where

|ε_n‴| < L n/(k_n(n−k_n)),

and L is a constant not depending on n. But if n → ∞, then k_n/n → q and thus ε_n‴ → 0. Hence

(2.17)    lim_{n→∞} P((ζ*_{n+1−k_n} − log(n/k_n)) / √((n−k_n)/(n k_n)) < x) = (1/√(2π)) ∫_{−∞}^x e^{−t²/2} dt.
1
7~
n k++
-~'+
For brevity, let us introduce the notation q+-- k,+, then, by virtue of (2.17)
n
and taking into account that owing to #,.--nql = o (Vn), we have
log q~--log-ql- : o ( ~ ) ,
it follows that
lira P
(2. 18)
(
/~-q
q~ < x - -
e - ~ dt.
/
nq
1
As ~+*++1,:,~---log F(~*+,) ' if follows from (2.18) that
(2. 19)
,+lim++P ~,* > F -+~ q e-~[/W
(
=-~_
e-Vdt.
' 2 "
But, in view of the mean value theorem of differential calculus, we have

(2.20)    F⁻¹(q e^{−x√((1−q)/(nq))}) = Q − (x θ_n / f(Q)) √(q(1−q)/n),

where lim_{n→∞} θ_n = 1. It follows that

(2.21)    lim_{n→∞} P((ξ*_{k_n} − Q) f(Q) √(n/(q(1−q))) < x) = (1/√(2π)) ∫_{−∞}^x e^{−t²/2} dt,

which was to be proved.
The statement of our theorem can also be expressed by saying that the q-quantile of a sample of size n is, in case of large n, approximately normally distributed around Q, i.e. around the q-quantile of the population, with the standard deviation

(1/f(Q)) √(q(1−q)/n).

In this way, by means of the sample median, an interval containing the median of the population with probability arbitrarily near to 1 can be given. In the special case of a symmetrical, for example normal, distribution, the median of the distribution coincides with its mean, and in this way we can estimate the mean value of the population.
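Theorem 4 can be illustrated with the median of exponential samples, where Q = log 2 and f(Q) = 1/2, so that the predicted standard deviation (1/f(Q))√(q(1−q)/n) equals 1/√n. A sketch with odd n (sample size, trial count and seed are arbitrary choices):

```python
import math
import random

random.seed(7)
n, trials = 801, 5_000
q, Q, fQ = 0.5, math.log(2), 0.5       # median of Exp(1); f(Q) = e^{-Q} = 1/2

medians = []
for _ in range(trials):
    s = sorted(random.expovariate(1.0) for _ in range(n))
    medians.append(s[n // 2])           # the (m+1)-th value, n = 2m + 1

mean = sum(medians) / trials
sd = math.sqrt(sum((m - mean) ** 2 for m in medians) / trials)
predicted_sd = math.sqrt(q * (1 - q) / n) / fQ
print(round(mean, 4), round(Q, 4), round(sd, 4), round(predicted_sd, 4))
```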
The theory of order statistics has a widespread applicability in the statistical quality control of mass production,¹¹ which is an important field of application of probability theory. Let us assume that a certain measurement of some engine parts produced on an automatic machine displays small random fluctuations from specimen to specimen; therefore its value can be considered as a random variable. Let us suppose that, under standard manufacturing circumstances, the distribution function of this measurement is the (continuous) function F(x); to control the process of production, at regular time intervals we draw a sample of size n, e.g. of size 5. We take the considered measurement of the values of the sample and mark them with dots on a perpendicular straight line drawn across the abscissa corresponding to the point of time of sampling on the "control chart"; the values of the sample will be placed automatically in order of magnitude. In order to detect any irregularity in the process of production (e.g. the displacement of the adjustment of the automatic machine, or the attrition of certain parts of the producing machine, etc.), we draw 5 bands determined by parallel straight lines, giving intervals containing the least, the second, the third, the fourth, and the largest value, respectively, of the sample of size 5, each with a given probability, e.g. 95%, under standard manufacturing circumstances. The determination of these intervals is very easy by what has been said above. In fact, if ξ*_k denotes the k-th sample value in order of magnitude (k = 1, 2, 3, 4, 5), then, as we have seen, we can exactly determine the individual and joint distributions of the variables

ζ*_k = log(1/F(ξ*_{6−k}))    (k = 1, 2, 3, 4, 5).

The practical application of this method in quality control is dealt with by the Department of Mathematical Statistics of the Institute for Applied Mathematics of the Hungarian Academy of Sciences; tables needed for the use of the method are also prepared.

¹¹ Cf. the work of L. I. BRAGINSKY [26]; by means of the theory of order statistics, the calculations of BRAGINSKY, which are not quite exact, can be put in a precise form; in practical application it is suitable to construct the control charts on the basis of these precise calculations.
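For the sample of size 5 such bands can be obtained from the exact distribution of the uniform order statistics, P(η*_k < u) = Σ_{j=k}^5 C(5, j) u^j (1−u)^{5−j}; the band for ξ*_k then follows by applying F⁻¹ to the endpoints. A sketch of the computation (the 95% equal-tail convention is our illustrative choice; the tables mentioned above may use other conventions):

```python
from math import comb

def order_stat_cdf(u, k, n=5):
    """P(eta*_k < u): the k-th of n ordered uniforms is below u."""
    return sum(comb(n, j) * u ** j * (1 - u) ** (n - j) for j in range(k, n + 1))

def quantile(p, k, n=5):
    """Invert order_stat_cdf in u by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if order_stat_cdf(mid, k, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% equal-tail band for F(xi*_k), k = 1, ..., 5; the control limits on
# the original measurement scale are F^{-1} of these endpoints.
for k in range(1, 6):
    print(k, round(quantile(0.025, k), 4), round(quantile(0.975, k), 4))
```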
We shall not continue the enumeration of theorems obtainable by means of this method; we only emphasize that our method consists in deducing all these theorems, by means of (1.9), from the theory of limiting distributions of sums of mutually independent random variables.
§ 3. Formulation of some new theorems concerning the comparison of the distribution function of a population and that of a sample drawn from it
Hitherto we have only shown how our method makes it possible to prove certain well-known results of order statistics. Now we shall show the new results which can be obtained by means of the same method.

A. N. KOLMOGOROV [6b] proved a fundamental theorem giving a test for the hypothesis that a sample has been drawn from a population having a given distribution. By means of this test we can infer the unknown distribution of the population from the distribution of the sample values.¹² Let us define

(3.1)    F_n(x) = 0 if x ≤ ξ*₁;  F_n(x) = k/n if ξ*_k < x ≤ ξ*_{k+1};  F_n(x) = 1 if ξ*_n < x;

i.e. F_n(x) is the distribution function of the sample, in other words, the frequency ratio of the values less than x in the sample.
KOLMOGOROV'S theorem is as follows:
0.2)
lirn P ( V n
sup
lFn(x)--F(x)l<y)=
~r
-~<x<+~
I 2
~.:_~(--1)~ e-2;~'~if Y > 0
0
if y ~ 0 .
KOLMOGOROV'S theorem therefore gives the limiting distribution of the supremum of absolute value of the difference between the distribution function
of the sample and that of the population. This limiting distribution does not
~depend on the distribution function F(x) of the population which is assumed,
for the validity of theorem, to be continuous. KOLMOQOROV'S theorem considers the difference [F,(x)--F(x)] with the same weight, regardless to the
value of F ( x ) ; so e.g. the difference [F,,~(x)--F(x)]=-O.Ol has the same
weight in a point x with F ( x ) = 0 . 5 (where this difference is 2% of the
value of F(x)) as a point x with F(x) = 0.01 (where this difference is 100%!
of the value of F(x)). We can avoid this by considering the quotient
IF~(x)--F(x)[ instead of ]F,~(x)--F(x)], that is to say, by considering the
F(x)
x~ I. e. we can give confidence limits for the unknown distribution function.
ON THE THEORY OF ORDER STATISTICS
207
relative error of F_n(x). In this way the idea arises naturally to consider the limiting distribution of the supremum of the quotient |F_n(x)-F(x)|/F(x), which characterizes the relative deviation of the distribution function of the population from that of the sample.
A theorem similar to that of KOLMOGOROV's was proved by N. V. SMIRNOV concerning the one-sided deviation of the sample and population distribution functions. SMIRNOV's theorem is as follows:

(3.3)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{-\infty<x<+\infty} (F_n(x)-F(x)) < y\Big) = \begin{cases} 1 - e^{-2y^2} & \text{if } y > 0, \\ 0 & \text{if } y \le 0. \end{cases}
We shall also consider the analogous problem for relative deviations. All these problems can be successfully solved by means of the above method. In the course of solving these problems a natural limitation is to be adopted: as F(x) takes arbitrarily small values, it is not suitable to consider the supremum of

\frac{F_n(x)-F(x)}{F(x)} \quad \text{or, respectively,} \quad \frac{|F_n(x)-F(x)|}{F(x)}

over the whole interval -\infty < x < +\infty, but to restrict ourselves to an interval x_a \le x < +\infty, where the abscissa x_a is defined by the relation F(x_a) = a > 0; the value of a, however, can be an arbitrarily small positive number. In § 5 we shall prove the following results:
THEOREM 5.

(3.4)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)} \frac{F_n(x)-F(x)}{F(x)} < y\Big) = \begin{cases} \sqrt{\dfrac{2}{\pi}} \displaystyle\int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{t^2}{2}}\,dt & \text{if } y > 0, \\ 0 & \text{if } y \le 0. \end{cases}
THEOREM 6.

(3.5)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)} \frac{|F_n(x)-F(x)|}{F(x)} < y\Big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2(1-a)}{8ay^2}} & \text{if } y > 0, \\ 0 & \text{if } y \le 0. \end{cases}

We may consider the limiting distribution of the supremum of

\frac{F_n(x)-F(x)}{F(x)}

and of its absolute value taken in the interval x_a \le x \le x_b, where the abscissae x_a and x_b are defined by the relations F(x_a) = a > 0 and F(x_b) = b < 1 (0 < a < b < 1). We then arrive at the following theorems.
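The series in Theorem 6 is the function L of formula (6.1) below, evaluated at z = y\sqrt{a/(1-a)}. A sketch of the truncated series (illustrative only; the names are ours):

```python
import math

def L(z, terms=200):
    """L(z) = (4/pi) * sum_{k>=0} (-1)^k/(2k+1) * exp(-(2k+1)^2 pi^2 / (8 z^2))."""
    if z <= 0:
        return 0.0
    s = 0.0
    for k in range(terms):
        m = 2 * k + 1
        s += (-1.0) ** k / m * math.exp(-m * m * math.pi ** 2 / (8.0 * z * z))
    return 4.0 * s / math.pi

def renyi_twosided_limit(y, a):
    """Limiting value of P(sqrt(n) sup_{a <= F(x)} |F_n - F|/F < y) = L(y*sqrt(a/(1-a)))."""
    return L(y * math.sqrt(a / (1.0 - a)))
```

As z grows, L(z) increases towards 1, as a distribution function must.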
THEOREM 7.

(3.6)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)\le b} \frac{F_n(x)-F(x)}{F(x)} < y\Big) = \sqrt{\frac{b}{2\pi(1-b)}} \int_{-\infty}^{y} e^{-\frac{bu^2}{2(1-b)}} \left( \sqrt{\frac{2}{\pi}} \int_0^{(y-u)\sqrt{\frac{ab}{b-a}}} e^{-\frac{t^2}{2}}\,dt \right) du \qquad (-\infty < y < +\infty).
THEOREM 8.

(3.7)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)\le b} \frac{|F_n(x)-F(x)|}{F(x)} < y\Big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2(1-a)}{8ay^2}}\, E_k & \text{if } y > 0, \\ 0 & \text{if } y \le 0; \end{cases}

where

E_k = \frac{1}{\sqrt{2\pi}} \int_{-y\sqrt{\frac{b}{1-b}}}^{+y\sqrt{\frac{b}{1-b}}} e^{-\frac{u^2}{2}}\,du + \theta_k

and

\theta_k = \frac{2 e^{-\frac{by^2}{2(1-b)}}}{y\sqrt{2\pi}} \sqrt{\frac{1-b}{b}} \int_0^{\frac{(2k+1)\pi}{2}} e^{\frac{(1-b)u^2}{2by^2}} \sin u \, du.
These theorems provide tests for verifying the hypothesis that the sample (\xi_1, \xi_2, \ldots, \xi_n) has been drawn from a population with distribution function F(x). The character of these tests consists in that they give a band around F(x) in which, if the hypothesis is true, the sample distribution function F_n(x) has to lie with a certain probability, the width of this band at every point x being proportional to F(x). This band, however, is not symmetrical. To overcome this difficulty, we apply the test twice: first to the sample (\xi_1, \xi_2, \ldots, \xi_n) having the population distribution function F(x), and then to the sample (-\xi_1, -\xi_2, \ldots, -\xi_n) having the population distribution function G(x) = 1 - F(-x).
In order to illustrate this, let us denote the distribution function of the sample (-\xi_1, -\xi_2, \ldots, -\xi_n) by G_n(x) and let A be the event that

\sqrt{n} \sup_{a\le F(x)} \frac{|F_n(x)-F(x)|}{F(x)} < y,

and B the event that

\sqrt{n} \sup_{a\le G(x)} \frac{|G_n(x)-G(x)|}{G(x)} < y, \quad \text{i.e.} \quad \sqrt{n} \sup_{F(x')\le 1-a} \frac{|F_n(x')-F(x')|}{1-F(x')} < y;
finally, let us denote the simultaneous occurrence of A and B by C. Taking into account that

P(C) = P(AB) = P(A) + P(B) - P(A+B),

and that in case of the occurrence of A+B we have obviously at the same time

\sqrt{n} \sup_{-\infty<x<+\infty} |F_n(x)-F(x)| < y,

and this last event, by KOLMOGOROV's theorem, has in the limit the probability

(3.8)  K(y) = \sum_{k=-\infty}^{+\infty} (-1)^k e^{-2k^2y^2},

therefore P(A+B) \le K(y) in the limit, and in the same case

P(C) \ge P(A) + P(B) - K(y).
The probabilities P(A) and P(B) are equal and their common value is given by (3.5). Thus the probability of the event that the sample distribution function F_n(x) lies in the intersection of the bands defined by the above two conditions, corresponding to the samples (\xi_1, \xi_2, \ldots, \xi_n) and (-\xi_1, -\xi_2, \ldots, -\xi_n) respectively, is not less than 2L\big(y\sqrt{\tfrac{a}{1-a}}\big) - K(y) in the limit, where

L(z) = \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2}{8z^2}} \qquad (z > 0)

and K(y) is the function defined by (3.8).
Let us point out a most surprising corollary of Theorem 7. From the theorem (3.3) of SMIRNOV we get

(3.9)  \lim_{n\to\infty} P\Big(\sup_{-\infty<x<+\infty} (F_n(x)-F(x)) < 0\Big) = 0,

i.e. the probability of the event that the sample distribution function does not exceed the population distribution function all along the interval -\infty < x < +\infty tends to 0 as n \to \infty. From Theorem 5 it follows that

(3.10)  \lim_{n\to\infty} P\Big(\sup_{x_a\le x<+\infty} (F_n(x)-F(x)) < 0\Big) = 0,

i.e. the same is true for the interval x_a \le x < +\infty.
On the other hand, by Theorem 7,

(3.11a)  \lim_{n\to\infty} P\Big(\sup_{x_a\le x\le x_b} (F_n(x)-F(x)) < 0\Big) = \frac{1}{\pi}\sqrt{\frac{b}{1-b}} \int_{-\infty}^{0} e^{-\frac{bu^2}{2(1-b)}} \int_0^{-u\sqrt{\frac{ab}{b-a}}} e^{-\frac{t^2}{2}}\,dt\,du > 0,

i.e. the probability of the event that the sample distribution function does not exceed the population distribution function all along the interval in which the value of F(x) lies between the arbitrarily fixed values a and b (0 < a < b < 1) remains positive also in the limit. This result, obviously, is important also from the point of view of statistical practice.
The result that the limit on the left side of (3.11a) is positive was also proved by GIHMAN [16]; moreover, he obtained that

(3.11b)  \lim_{n\to\infty} P\Big(\sup_{x_a\le x\le x_b} (F_n(x)-F(x)) < 0\Big) = \frac{1}{\pi} \arcsin \sqrt{\frac{a(1-b)}{b(1-a)}}.

GIHMAN mentioned further that the result (3.11b) had already been known to GNEDENKO. The terms on the right sides of (3.11a) and (3.11b) are, of course, identical. This follows from the following consideration: the right side of (3.11a) is nothing else than two times the probability of the event that a random point, normally distributed in the plane (x, y) with the probability density function \frac{1}{2\pi} e^{-\frac{1}{2}(x^2+y^2)}, lies in the infinite sector 0 < x < +\infty, 0 < y < x\sqrt{\frac{a(1-b)}{b-a}}, and this probability is equal to

(3.12)  \frac{2}{2\pi} \Big[\arctan u\Big]_{u=0}^{\sqrt{\frac{a(1-b)}{b-a}}} = \frac{1}{\pi} \arcsin \sqrt{\frac{a(1-b)}{b(1-a)}}.

Indeed, because of the circular symmetry of the normal distribution having the density function \frac{1}{2\pi} e^{-\frac{1}{2}(x^2+y^2)}, the probability corresponding to an infinite sector of angle \varphi is \varphi/2\pi.
Theorems 5-8 will be proved in § 5. First, in § 4, we shall prove some auxiliary theorems which are of interest in themselves too.
§ 4. Some new limiting distribution theorems
Let there be given a sequence consisting of the sets of random variables

\xi_{n,1}, \xi_{n,2}, \ldots, \xi_{n,N_n} \qquad (n = 1, 2, \ldots).

Let us assume that the random variables \xi_{n,k} have expectation 0 and finite variance; further, that the random variables having the same first index n (n = 1, 2, \ldots) are mutually independent and satisfy LINDEBERG's condition, that is to say, introducing the notations

F_{n,k}(x) = P(\xi_{n,k} < x), \qquad S_{n,k} = \sum_{j=1}^{k} \xi_{n,j}, \qquad B_n^2 = D^2 S_{n,N_n} = \sum_{k=1}^{N_n} D^2 \xi_{n,k},

we suppose

\int_{-\infty}^{+\infty} x\,dF_{n,k}(x) = 0 \qquad (k = 1, 2, \ldots, N_n)

and

(4.1)  \lim_{n\to\infty} \frac{1}{B_n^2} \sum_{k=1}^{N_n} \int_{|x|>\varepsilon B_n} x^2\,dF_{n,k}(x) = 0 \qquad \text{if } \varepsilon > 0.
Concerning the sequences satisfying the above conditions we shall prove the following theorems.

THEOREM 9.

(4.2)  \lim_{n\to\infty} P\Big(\max_{1\le k\le N_n} S_{n,k} < xB_n\Big) = \begin{cases} \sqrt{\dfrac{2}{\pi}} \displaystyle\int_0^x e^{-\frac{t^2}{2}}\,dt & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}
THEOREM 10.

(4.3)  \lim_{n\to\infty} P\Big(\max_{1\le k\le N_n} |S_{n,k}| < xB_n\Big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2}{8x^2}} & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}
THEOREM 11.

(4.4)  \lim_{n\to\infty} P\Big(-yB_n < \min_{1\le k\le N_n} S_{n,k} \le \max_{1\le k\le N_n} S_{n,k} < xB_n\Big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{1}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2}{2(x+y)^2}} \sin\dfrac{(2k+1)\pi x}{x+y} & \text{if } x > 0 \text{ and } y > 0, \\ 0 & \text{if either } x \le 0 \text{ or } y \le 0. \end{cases}

REMARK. In case y = x, Theorem 11 reduces to Theorem 10.

THEOREM 12. Let A_n = D S_{n,M_n} with 1 \le M_n < N_n, and suppose that \lim_{n\to\infty} A_n/B_n = \lambda with 0 < \lambda < 1. Then

(4.5)  \lim_{n\to\infty} P\Big(\max_{M_n<k\le N_n} |S_{n,k}| < yB_n\Big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2}{8y^2}}\, \varepsilon_k & \text{if } y > 0, \\ 0 & \text{if } y \le 0, \end{cases}

where

\varepsilon_k = \frac{1}{\sqrt{2\pi}} \int_{-y/\lambda}^{+y/\lambda} e^{-\frac{t^2}{2}}\,dt + \frac{2\lambda\, e^{-\frac{y^2}{2\lambda^2}}}{y\sqrt{2\pi}} \int_0^{\frac{(2k+1)\pi}{2}} e^{\frac{\lambda^2 u^2}{2y^2}} \sin u\,du.
REMARK. In the special case of M_n = 1 (i.e. for \lambda = 0), Theorem 12 is identical with Theorem 10.
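For the special case of the symmetric \pm 1 random walk, the limit laws of Theorems 9 and 10 can be checked by direct simulation. The following sketch is ours and purely illustrative; the sample sizes, tolerances and seed are arbitrary choices, and the series/erf forms repeat (4.2) and (4.3):

```python
import math
import random

def simulate_max_laws(n=900, reps=3000, x=1.0, seed=1):
    """Empirical frequencies of {max_k S_k < x*sqrt(n)} and {max_k |S_k| < x*sqrt(n)}
    for the symmetric +-1 walk, to be compared with Theorems 9 and 10."""
    rng = random.Random(seed)
    bound = x * math.sqrt(n)
    hit_one_sided = hit_two_sided = 0
    for _ in range(reps):
        s = 0
        m = 0      # running maximum of S_k (and of 0)
        mabs = 0   # running maximum of |S_k|
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            if s > m:
                m = s
            if abs(s) > mabs:
                mabs = abs(s)
        hit_one_sided += m < bound
        hit_two_sided += mabs < bound
    return hit_one_sided / reps, hit_two_sided / reps

def theorem9_limit(x):
    """sqrt(2/pi) * integral_0^x exp(-t^2/2) dt = erf(x/sqrt(2))."""
    return math.erf(x / math.sqrt(2.0)) if x > 0 else 0.0

def theorem10_limit(x, terms=100):
    """(4/pi) * sum (-1)^k/(2k+1) * exp(-(2k+1)^2 pi^2 / (8 x^2))."""
    if x <= 0:
        return 0.0
    return 4.0 / math.pi * sum((-1.0) ** k / (2 * k + 1)
                               * math.exp(-(2 * k + 1) ** 2 * math.pi ** 2 / (8.0 * x * x))
                               for k in range(terms))
```

The simulated frequencies approach the two limits only up to discretization effects of order 1/\sqrt{n}, so the agreement is rough for moderate n.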
For the special case in which all the considered random variables \xi_{n,k} have the same distribution, Theorems 9 and 10 were proved by P. ERDŐS and M. KAC [22].13 In the proofs of the above more general theorems we modify their proofs inasmuch as we apply an ordinary (one-dimensional) limiting distribution theorem instead of the multi-dimensional limit theorem used by them; this enables us to generalize their results. We shall use Theorems 9-12 in § 5 to determine the limiting distribution of the random variables
\sup_{a\le F(x)\le b} \frac{F_n(x)-F(x)}{F(x)} \qquad \text{and} \qquad \sup_{a\le F(x)\le b} \frac{|F_n(x)-F(x)|}{F(x)},

where F_n(x) denotes again the distribution function of a sample of n mutually independent observations concerning the random variable \xi having a continuous distribution function F(x), and further 0 < a < b < 1.
Let us turn to the proof of Theorem 9. Let us put

P_n(x) = P\Big(\max_{1\le k\le N_n} S_{n,k} < xB_n\Big).

Let \gamma_1, \gamma_2, \ldots, \gamma_j, \ldots be mutually independent random variables, normally distributed with mean 0 and variance 1, and let us introduce the random variables

\zeta_k = \gamma_1 + \gamma_2 + \cdots + \gamma_k \qquad (k = 1, 2, \ldots).
First of all we shall prove that for any \varepsilon > 0 and for any positive integer k we have

(4.6)  \liminf_{n\to\infty} P_n(x) \ge P\big(\max(\zeta_1, \zeta_2, \ldots, \zeta_k) < (x-\varepsilon)\sqrt{k}\big) - \frac{1}{\varepsilon^2 k}

and

(4.7)  \limsup_{n\to\infty} P_n(x) \le P\big(\max(\zeta_1, \zeta_2, \ldots, \zeta_k) < x\sqrt{k}\big).

For, let m_j be the least positive integer satisfying D^2 S_{n,m_j} \ge \frac{j}{k} B_n^2 (j = 1, 2, \ldots, k). Obviously 1 \le m_1 \le m_2 \le \cdots \le m_k = N_n. Let us define now the following variables:

(4.8)  \Delta_{n,1} = S_{n,m_1}, \qquad \Delta_{n,j} = S_{n,m_j} - S_{n,m_{j-1}} \qquad (j = 2, 3, \ldots, k).

We can see easily that for any fixed j (j = 1, 2, \ldots, k) the Lindeberg condition holds for the sequence

(4.9)  \xi_{n,m_{j-1}+1}, \xi_{n,m_{j-1}+2}, \ldots, \xi_{n,m_j} \qquad (n = 1, 2, 3, \ldots).
13 Their method was generalized by M. D. DONSKER [20b]. See further the papers by A. WALD [27], [28], and K. L. CHUNG [29]. They consider the limiting distributions of the supremum of the first n partial sums under the conditions that the variables have the same distribution and that the variables have finite third moments, respectively. CHUNG gives for this latter special case an estimate for the remainder term also. ERDŐS and KAC remarked that their theorems can be proved under more general conditions.
Indeed, introducing the notation B'^2_{n,j} = D^2 \Delta_{n,j} and using the relation

\lim_{n\to\infty} \frac{1}{B_n^2} \sup_{1\le k\le N_n} D^2 \xi_{n,k} = 0,

which trivially follows from (4.1), we obtain that, for any \delta > 0,

(4.10)  \frac{1-\delta}{k} B_n^2 \le B'^2_{n,j} \le \frac{1+\delta}{k} B_n^2 \qquad \text{if } n > n_0(\delta)

holds. Thus for any \varepsilon > 0

\frac{1}{B'^2_{n,j}} \sum_{m_{j-1}<i\le m_j} \int_{|x|>\varepsilon B'_{n,j}} x^2\,dF_{n,i}(x) \le \frac{k}{1-\delta} \cdot \frac{1}{B_n^2} \sum_{i=1}^{N_n} \int_{|x|>\varepsilon' B_n} x^2\,dF_{n,i}(x)

if n > n_0(\delta), where \varepsilon' = \varepsilon\sqrt{\frac{1-\delta}{k}}. The Lindeberg condition is therefore actually satisfied by the sequence (4.9). Therefore, by the central limit theorem,

(4.11)  \lim_{n\to\infty} P(\Delta_{n,j} < x B'_{n,j}) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}}\,dt \qquad (j = 1, 2, \ldots, k).
But the random variables \Delta_{n,1}, \Delta_{n,2}, \ldots, \Delta_{n,k} are mutually independent and

\lim_{n\to\infty} \frac{B'_{n,j}}{B_n} = \frac{1}{\sqrt{k}} \qquad (j = 1, 2, \ldots, k);

hence

(4.12)  \lim_{n\to\infty} P\Big(\Delta_{n,j} < \frac{x_j B_n}{\sqrt{k}};\ j = 1, 2, \ldots, k\Big) = \frac{1}{(\sqrt{2\pi})^k} \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_k} e^{-\frac{1}{2}\sum_{j=1}^{k} t_j^2}\,dt_1 \cdots dt_k,

i.e.

(4.13)  \lim_{n\to\infty} P\Big(\Delta_{n,1} + \Delta_{n,2} + \cdots + \Delta_{n,j} < xB_n;\ j = 1, 2, \ldots, k\Big) = \frac{1}{(\sqrt{2\pi})^k} \int \cdots \int_{T_x} e^{-\frac{1}{2}\sum_{j=1}^{k} t_j^2}\,dt_1 \cdots dt_k,

where the integration is to be extended over the domain T_x defined by

T_x: \{\, -\infty < t_1 + t_2 + \cdots + t_j < x\sqrt{k};\ j = 1, 2, \ldots, k \,\}.

Hence we obtain

(4.14)  \lim_{n\to\infty} P\Big(\max_{1\le j\le k} S_{n,m_j} < xB_n\Big) = P\Big(\max_{1\le j\le k} \zeta_j < x\sqrt{k}\Big).
Let us put

Q_{n,k}(x) = P\Big(\max_{1\le j\le k} S_{n,m_j} < xB_n\Big),

and let \Pi_{n,r}(x) denote the probability of the event that S_{n,r} is the first of the sums S_{n,j} (j = 1, 2, \ldots, r) which is \ge xB_n, i.e. let

\Pi_{n,r}(x) = P\Big(S_{n,r} \ge xB_n;\ \max_{1\le j\le r-1} S_{n,j} < xB_n\Big).

Obviously,

\sum_{r=1}^{N_n} \Pi_{n,r}(x) = 1 - P_n(x) \le 1.

Let us suppose m_{j-1} < r \le m_j; introducing the notations

\Pi^{(1)}_{n,r}(x) = P\Big(S_{n,r} \ge xB_n;\ \max_{1\le j\le r-1} S_{n,j} < xB_n;\ |S_{n,m_j} - S_{n,r}| \ge \varepsilon B_n\Big)

and

\Pi^{(2)}_{n,r}(x) = P\Big(S_{n,r} \ge xB_n;\ \max_{1\le j\le r-1} S_{n,j} < xB_n;\ |S_{n,m_j} - S_{n,r}| < \varepsilon B_n\Big),

we get evidently

\Pi_{n,r}(x) = \Pi^{(1)}_{n,r}(x) + \Pi^{(2)}_{n,r}(x).
Let us apply CHEBYSHEV's inequality: by the independence of S_{n,m_j} - S_{n,r} of the event figuring in \Pi_{n,r}(x),

\Pi^{(1)}_{n,r}(x) = \Pi_{n,r}(x) \cdot P(|S_{n,m_j} - S_{n,r}| \ge \varepsilon B_n) \le \Pi_{n,r}(x)\, \frac{D^2(S_{n,m_j} - S_{n,r})}{\varepsilon^2 B_n^2},

and consider the relation (4.10); thus we obtain that

\Pi^{(1)}_{n,r}(x) \le \Pi_{n,r}(x)\, \frac{1+\delta}{\varepsilon^2 k},

therefore

(4.15)  1 - P_n(x) = \sum_{r=1}^{N_n} \Pi_{n,r}(x) \le \frac{1+\delta}{\varepsilon^2 k} + \sum_{r=1}^{N_n} \Pi^{(2)}_{n,r}(x).
On the other hand,

(4.16)  \sum_{r=1}^{N_n} \Pi^{(2)}_{n,r}(x) \le 1 - Q_{n,k}(x-\varepsilon),

as from the relations

S_{n,r} \ge xB_n \qquad \text{and} \qquad |S_{n,m_j} - S_{n,r}| < \varepsilon B_n

it follows that

S_{n,m_j} > (x-\varepsilon)B_n.

Thus we have

1 - P_n(x) \le \frac{1+\delta}{\varepsilon^2 k} + 1 - Q_{n,k}(x-\varepsilon).

Further, on account of the trivial inequalities P_n(x) \le Q_{n,k}(x) (k = 1, 2, \ldots), we obtain

(4.17)  Q_{n,k}(x-\varepsilon) - \frac{1+\delta}{\varepsilon^2 k} \le P_n(x) \le Q_{n,k}(x).
Comparing the above relation (4.17) with (4.14), we obtain precisely the desired inequalities (4.6) and (4.7).
Let us consider now the special case in which the variables \xi_{n,k} assume only the values +1 and -1, and

P(\xi_{n,k} = +1) = P(\xi_{n,k} = -1) = \frac{1}{2} \qquad (k = 1, 2, \ldots, N_n = n;\ n = 1, 2, \ldots).

Then, by the reflection principle,

(4.18)  P_n(x) = 1 - P(S_{n,n} \ge m_x) - P(S_{n,n} > m_x), \qquad m_x = [x\sqrt{n}] + 1

(except if x\sqrt{n} is an integer), and thus from the Moivre-Laplace theorem we conclude that

(4.19)  \lim_{n\to\infty} P_n(x) = \sqrt{\frac{2}{\pi}} \int_0^x e^{-\frac{t^2}{2}}\,dt \qquad (x > 0).
Therefore, as it follows from (4.6) that

(4.20)  \lim_{k\to\infty} P\big(\max(\zeta_1, \zeta_2, \ldots, \zeta_k) < x\sqrt{k}\big) \le \lim_{n\to\infty} P_n(x+\varepsilon),

and from (4.7) that

(4.21)  \lim_{k\to\infty} P\big(\max(\zeta_1, \zeta_2, \ldots, \zeta_k) < x\sqrt{k}\big) \ge \lim_{n\to\infty} P_n(x),

we have

(4.22)  \lim_{k\to\infty} P\big(\max(\zeta_1, \zeta_2, \ldots, \zeta_k) < x\sqrt{k}\big) = \sqrt{\frac{2}{\pi}} \int_0^x e^{-\frac{t^2}{2}}\,dt \qquad (x > 0),

and, applying again the relations (4.6) and (4.7), we obtain (4.2).
The basic idea of this proof can be summed up as follows: we have pointed out that in the case of a special choice of the variables \xi_{n,k}, (4.2) holds; from this, by (4.6) and (4.7), we have concluded that (4.2) is true also in the case \xi_{n,k} = \gamma_k, where the variables \gamma_k are normally distributed; hence, again by (4.6) and (4.7), it followed that (4.2) holds also for any variables \xi_{n,k} satisfying the conditions of the theorem.

The proof of Theorems 10 and 11 is based on the same idea and, with suitable modifications, agrees step by step with the proof of Theorem 9. It is sufficient to prove only Theorem 11 because, as we have seen, it includes Theorem 10 as a special case. It is unnecessary to detail the first part of the proof, therefore we shall deal only with the second.
Let us have again

P(\xi_{n,k} = +1) = P(\xi_{n,k} = -1) = \frac{1}{2} \qquad (k = 1, 2, \ldots, N_n = n;\ n = 1, 2, \ldots)

and let us suppose that the variables \xi_{n,k} (k = 1, 2, \ldots, N_n = n) are mutually independent. Then it follows by simple arguments, well known in the theory of the random walk in the plane, that if A = [x\sqrt{n}] + 1 and B = [y\sqrt{n}] + 1, then

(4.23)  P(-y\sqrt{n} < S_{n,k} < x\sqrt{n};\ k = 1, 2, \ldots, n) = \sum_{-B<j<A} \sum_{v=-\infty}^{+\infty} \big( v_{j+2v(A+B)} - v_{2A-j+2v(A+B)} \big),

where

v_j = \frac{1}{2^n}\binom{n}{\frac{n+j}{2}} \quad \text{if } j \equiv n \pmod 2, \qquad v_j = 0 \quad \text{in all other cases}.
As, by the Moivre-Laplace theorem,

\lim_{n\to\infty} \sum_{-b\sqrt{n}<j<a\sqrt{n}} v_j = \frac{1}{\sqrt{2\pi}} \int_{-b}^{a} e^{-\frac{t^2}{2}}\,dt,

by simple calculation we obtain

(4.24)  \lim_{n\to\infty} P(-y\sqrt{n} < S_{n,k} < x\sqrt{n};\ k = 1, 2, \ldots, n) = \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{1}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2}{2(x+y)^2}} \sin\frac{(2k+1)\pi x}{x+y} \qquad (x > 0,\ y > 0).
Similarly to the case of Theorem 9, it follows that the limit of the probability on the left side of (4. 4) is the same also in the general case.
Theorem 12 can be derived from Theorem 11 as follows. Owing to the independence of S_{n,M_n} and S_{n,k} - S_{n,M_n} (k > M_n), by the relation

P\Big(\max_{M_n<k\le N_n} |S_{n,k}| < yB_n\Big) = P\big(-yB_n < S_{n,M_n} + (S_{n,k} - S_{n,M_n}) < yB_n;\ M_n < k \le N_n\big),

we obtain, by virtue of the theorem of total probability, that

(4.25)  P\Big(\max_{M_n<k\le N_n} |S_{n,k}| < yB_n\Big) = \int_{-yB_n}^{+yB_n} P\big(-yB_n - x < S_{n,k} - S_{n,M_n} < yB_n - x;\ M_n < k \le N_n\big)\,dP(S_{n,M_n} < x).
As, in accordance with our restrictions, Lindeberg's condition is satisfied by the sequence \xi_{n,1}, \xi_{n,2}, \ldots, \xi_{n,M_n}, and A_n/B_n \to \lambda as n \to \infty, we have (uniformly in x in all finite intervals)

(4.26)  \lim_{n\to\infty} P\Big(\frac{S_{n,M_n}}{B_n} < x\Big) = \frac{1}{\lambda\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{u^2}{2\lambda^2}}\,du = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x/\lambda} e^{-\frac{t^2}{2}}\,dt.
Furthermore, by Theorem 11, considering the relation

D(S_{n,N_n} - S_{n,M_n}) = \sqrt{B_n^2 - A_n^2},

it follows that

(4.27)  \lim_{n\to\infty} P\big(-(x+y)B_n < S_{n,k} - S_{n,M_n} < (y-x)B_n;\ M_n < k \le N_n\big) = \begin{cases} \dfrac{4}{\pi}\displaystyle\sum_{k=0}^{\infty} \dfrac{1}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2(1-\lambda^2)}{8y^2}} \sin\dfrac{(2k+1)\pi(y-x)}{2y} & \text{if } y > 0 \text{ and } |x| \le y, \\ 0 & \text{if } y \le 0, \text{ or } y > 0 \text{ but } |x| > y. \end{cases}
Therefore, finally, we obtain

(4.28)  \lim_{n\to\infty} P\Big(\max_{M_n<k\le N_n} |S_{n,k}| < yB_n\Big) = \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{1}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2(1-\lambda^2)}{8y^2}} \int_{-y}^{+y} \frac{1}{\lambda\sqrt{2\pi}}\, e^{-\frac{x^2}{2\lambda^2}} \sin\frac{(2k+1)\pi(y-x)}{2y}\,dx.
Hence, by simple calculations, we obtain Theorem 12. In fact, substituting t = x/\lambda and using \sin\big(\tfrac{(2k+1)\pi}{2}(1 - \tfrac{\lambda t}{y})\big) = (-1)^k \cos\tfrac{(2k+1)\pi\lambda t}{2y}, we get

\int_{-y}^{+y} \frac{1}{\lambda\sqrt{2\pi}}\, e^{-\frac{x^2}{2\lambda^2}} \sin\frac{(2k+1)\pi(y-x)}{2y}\,dx = \frac{(-1)^k}{\sqrt{2\pi}}\, e^{-\frac{(2k+1)^2\pi^2\lambda^2}{8y^2}} \operatorname{Re} \int_{-y/\lambda}^{+y/\lambda} e^{-\frac{1}{2}\big(t - i\frac{(2k+1)\pi\lambda}{2y}\big)^2}\,dt.
Now, since the integral of e^{-\frac{z^2}{2}} along any closed curve vanishes, if a > 0 and b is real, then, integrating around the rectangle with vertices -a, +a, a-ib, -a-ib and taking real parts,

\operatorname{Re} \int_{-a}^{+a} e^{-\frac{(t-ib)^2}{2}}\,dt = \int_{-a}^{+a} e^{-\frac{t^2}{2}}\,dt + 2e^{-\frac{a^2}{2}} \int_0^{b} e^{\frac{v^2}{2}} \sin(av)\,dv.
Consequently,

\int_{-y}^{+y} \frac{1}{\lambda\sqrt{2\pi}}\, e^{-\frac{x^2}{2\lambda^2}} \sin\frac{(2k+1)\pi(y-x)}{2y}\,dx = (-1)^k e^{-\frac{(2k+1)^2\pi^2\lambda^2}{8y^2}} \left( \frac{1}{\sqrt{2\pi}} \int_{-y/\lambda}^{+y/\lambda} e^{-\frac{t^2}{2}}\,dt + \frac{2\lambda\, e^{-\frac{y^2}{2\lambda^2}}}{y\sqrt{2\pi}} \int_0^{\frac{(2k+1)\pi}{2}} e^{\frac{\lambda^2 u^2}{2y^2}} \sin u\,du \right).

This completes the proof of Theorem 12.
§ 5. Proof of the tests analogous to those of Kolmogorov and Smirnov
Let \xi be a random variable having the continuous distribution function F(x) and let \xi_1, \xi_2, \ldots, \xi_n denote the results of n independent observations of the value of \xi, i.e. let \xi_1, \xi_2, \ldots, \xi_n be mutually independent random variables having the same continuous distribution function F(x). Let us denote the distribution function of this sample by F_n(x).

We shall prove the theorems formulated in § 3 by means of the method expounded in § 1, using the theorems of § 4.
Let us put \eta_k = F(\xi_k) and \eta_k^* = F(\xi_k^*) (k = 1, 2, \ldots, n), and further

\zeta_k = \log\frac{1}{\eta^*_{n+1-k}}.

In this case the variables \eta_k are uniformly distributed in the interval (0, 1), and their sample distribution function is

(5.1)  G_n(x) = F_n(F^{-1}(x)),

where y = F^{-1}(x) is the inverse function of x = F(y). But it is easily seen that

\sup_{a\le F(x)} \frac{F_n(x)-F(x)}{F(x)} = \sup_{a\le x\le 1} \frac{G_n(x)-x}{x};

therefore, instead of the variable on the left side of (3.4), we may consider the variable \sup_{a\le x\le 1} \frac{G_n(x)-x}{x}, identical with it. The variables \eta_k^*, as we have seen, form a Markov chain. Further, we have seen that the variables

\delta_{k+1} = (n-k)(\zeta_{k+1} - \zeta_k)

are mutually independent and exponentially distributed with mean 1, i.e.

P(\delta_k < x) = 1 - e^{-x} \qquad (x > 0).

We have also seen how the variables \zeta_k may be decomposed into sums of mutually independent random variables by means of the \delta_j's:

\zeta_k = \sum_{j=1}^{k} \frac{\delta_j}{n+1-j}.
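This decomposition can be sketched numerically: building the \zeta_k from independent mean-1 exponentials and inverting must reproduce, in distribution, the order statistics of a uniform sample. The following code is an illustrative check of ours (names and sample sizes are arbitrary), using the fact that the m-th uniform order statistic has expectation m/(n+1):

```python
import math
import random

def uniform_order_stats_via_exponentials(n, rng):
    """Generate eta*_1 <= ... <= eta*_n, distributed as the order statistics of
    n i.i.d. uniforms, from i.i.d. mean-1 exponentials via
    zeta_k = sum_{j<=k} delta_j/(n+1-j) and eta*_{n+1-k} = exp(-zeta_k)."""
    zeta = 0.0
    eta = [0.0] * n
    for k in range(1, n + 1):
        delta = rng.expovariate(1.0)
        zeta += delta / (n + 1 - k)
        eta[n - k] = math.exp(-zeta)
    return eta

rng = random.Random(7)
n, reps = 20, 3000
means = [0.0] * n
for _ in range(reps):
    eta = uniform_order_stats_via_exponentials(n, rng)
    assert all(eta[i] <= eta[i + 1] for i in range(n - 1))  # always sorted
    for i in range(n):
        means[i] += eta[i] / reps
# For uniform order statistics, E[eta*_m] = m/(n+1).
```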
Let us turn to the proof of Theorem 5. First of all, it is easy to see that instead of the relation (3.4) it is enough to prove

(5.2)  \lim_{n\to\infty} P\Big(\sqrt{n} \sup_{a\le G_n(x)} \frac{G_n(x)-x}{x} < y\Big) = \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{t^2}{2}}\,dt \qquad (y > 0).
For, if |G_n(x) - x| \le \varepsilon, then from G_n(x) \ge a + \varepsilon it follows that x \ge G_n(x) - \varepsilon \ge a, and thus

\sup_{a\le x} \frac{G_n(x)-x}{x} \ge \sup_{G_n(x)\ge a+\varepsilon} \frac{G_n(x)-x}{x},

i.e. from

\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y \qquad \text{it follows that} \qquad \sqrt{n} \sup_{G_n(x)\ge a+\varepsilon} \frac{G_n(x)-x}{x} < y.
But if A, A' and B are any events and A\bar{B} \subseteq A'\bar{B}, then

P(A) = P(AB) + P(A\bar{B}) \le P(B) + P(A'\bar{B}) \le P(B) + P(A').

Applying this inequality to the case when A is the event

\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y,

B is the event \sup_x |G_n(x)-x| > \varepsilon, and finally A' denotes the event

\sqrt{n} \sup_{G_n(x)\ge a+\varepsilon} \frac{G_n(x)-x}{x} < y,

we obtain

P\Big(\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y\Big) \le P\big(\sup_x |G_n(x)-x| > \varepsilon\big) + P\Big(\sqrt{n} \sup_{G_n(x)\ge a+\varepsilon} \frac{G_n(x)-x}{x} < y\Big).
It can be similarly shown that

P\Big(\sqrt{n} \sup_{a\le G_n(x)} \frac{G_n(x)-x}{x} < y\Big) \le P\big(\sup_x |G_n(x)-x| > \varepsilon\big) + P\Big(\sqrt{n} \sup_{a-\varepsilon\le x} \frac{G_n(x)-x}{x} < y\Big).

As

\lim_{n\to\infty} P\big(\sup_x |G_n(x)-x| > \varepsilon\big) = 0 \qquad (\varepsilon > 0),
it follows that if (5.2) is satisfied, then

\limsup_{n\to\infty} P\Big(\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y\Big) \le \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a+\varepsilon}{1-a-\varepsilon}}} e^{-\frac{t^2}{2}}\,dt

and

\liminf_{n\to\infty} P\Big(\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y\Big) \ge \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a-\varepsilon}{1-a+\varepsilon}}} e^{-\frac{t^2}{2}}\,dt.

Since \varepsilon can be chosen arbitrarily small and the integral is a continuous function of its upper limit, it follows that

\lim_{n\to\infty} P\Big(\sqrt{n} \sup_{a\le x} \frac{G_n(x)-x}{x} < y\Big) = \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{t^2}{2}}\,dt.

Therefore (3.4) actually follows from (5.2), and thus, to prove Theorem 5, it is enough to show that (5.2) holds.
Further, we shall need the following relation:

(5.3)  \sqrt{n} \sup_{a\le G_n(x)} \frac{G_n(x)-x}{x} = \sqrt{n} \max_{an\le k\le n} \Big(\frac{k}{n\eta_k^*} - 1\Big).

This follows from the fact that G_n(x) is constant in any interval \eta_k^* < x \le \eta_{k+1}^*, and so in such an interval the supremum of \frac{G_n(x)-x}{x} is attained as x \to \eta_k^* + 0 and equals \frac{k}{n\eta_k^*} - 1.
Now let us apply Theorem 9 to the sequence consisting of the random variables

\frac{\delta_j - 1}{n+1-j} \qquad (j = 1, 2, \ldots, [n(1-a)]+1);

this sequence satisfies LINDEBERG's condition (and, moreover, even LIAPUNOV's condition) if 0 < a < 1. Then, by (1.11), for any z > 0 we obtain

(5.4)  \lim_{n\to\infty} P\Big(\max_{1\le r\le [n(1-a)]+1} \sum_{j=1}^{r} \frac{\delta_j-1}{n+1-j} < z\sqrt{\frac{1-a}{an}}\Big) = \sqrt{\frac{2}{\pi}} \int_0^z e^{-\frac{t^2}{2}}\,dt.

As, in case of k \ge an and 0 < a < 1,

\log\frac{k}{n\eta_k^*} = \sum_{j=1}^{n+1-k} \frac{\delta_j-1}{n+1-j} + O\Big(\frac{1}{an}\Big),

from (5.4) one concludes

(5.5)  \lim_{n\to\infty} P\Big(\sqrt{\frac{an}{1-a}} \max_{an\le k\le n} \log\frac{k}{n\eta_k^*} < z\Big) = \sqrt{\frac{2}{\pi}} \int_0^z e^{-\frac{t^2}{2}}\,dt;

therefore, finally, introducing the notation y = z\sqrt{\frac{1-a}{a}}, we have

(5.6)  \lim_{n\to\infty} P\Big(\sqrt{n} \max_{an\le k\le n} \log\frac{k}{n\eta_k^*} < y\Big) = \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{t^2}{2}}\,dt.

In view of (5.2) and (5.3), Theorem 5 follows from (5.6).
Let us now turn to the proof of Theorem 7. We may write the random variable

(5.7)  \tau = \sqrt{n} \max_{an\le k\le bn} \log\frac{k}{n\eta_k^*} = \sqrt{n} \max_{n+1-bn\le r\le n+1-an} \sum_{j=1}^{r} \frac{\delta_j-1}{n+1-j}

(apart from a term tending to 0 in probability) as the sum of two independent random variables \tau_1 and \tau_2, where

(5.8)  \tau_1 = \sqrt{n} \sum_{1\le j\le n+1-bn} \frac{\delta_j-1}{n+1-j}

and

(5.9)  \tau_2 = \sqrt{n} \max_{n+1-bn\le r\le n+1-an} \sum_{n+1-bn<j\le r} \frac{\delta_j-1}{n+1-j}.

It is evident that in the limit \tau_1 is a normally distributed random variable with standard deviation \sqrt{\frac{1-b}{b}}; further, from the proof of Theorem 5, we can see that

(5.10)  \lim_{n\to\infty} P(\tau_2 < z) = \sqrt{\frac{2}{\pi}} \int_0^{z\sqrt{\frac{ab}{b-a}}} e^{-\frac{t^2}{2}}\,dt \qquad (z > 0).

Considering further that \tau_1 and \tau_2 are independent, it follows from (5.8) and (5.10) that

(5.11)  \lim_{n\to\infty} P(\tau < y) = \sqrt{\frac{b}{2\pi(1-b)}} \int_{-\infty}^{y} e^{-\frac{bu^2}{2(1-b)}} \left( \sqrt{\frac{2}{\pi}} \int_0^{(y-u)\sqrt{\frac{ab}{b-a}}} e^{-\frac{t^2}{2}}\,dt \right) du.
This completes the proof of Theorem 7.
In the same way we can also prove Theorem 6. Here the relation (5.3) is replaced by

(5.12)  \sqrt{n} \sup_{a\le G_n(x)} \frac{|G_n(x)-x|}{x} = \sqrt{n} \max_{an\le k\le n} \max\Big(\frac{k}{n\eta_k^*} - 1,\; 1 - \frac{k}{n\eta_{k+1}^*}\Big) \qquad (\eta_{n+1}^* = 1),

from which it follows that

(5.13)  \Big| \sqrt{n} \sup_{a\le G_n(x)} \frac{|G_n(x)-x|}{x} - \sqrt{n} \max_{an\le k\le n} \Big|\frac{k}{n\eta_k^*} - 1\Big| \Big| \le \frac{1}{a\sqrt{n}}.

For,

\Big|1 - \frac{k}{n\eta_{k+1}^*}\Big| \le \Big|1 - \frac{k+1}{n\eta_{k+1}^*}\Big| + \frac{1}{n\eta_{k+1}^*},

and for k \ge an either \eta_{k+1}^* \ge a, in which case \frac{1}{n\eta_{k+1}^*} \le \frac{1}{an}, or \eta_{k+1}^* < a \le \frac{k}{n}, in which case 1 - \frac{k}{n\eta_{k+1}^*} < 0 and the corresponding term may be discarded; since, moreover, \big|\frac{k+1}{n\eta_{k+1}^*} - 1\big| is itself one of the quantities \big|\frac{k'}{n\eta_{k'}^*} - 1\big|, the relation (5.13) follows. The limiting distribution of the variable \sqrt{n} \max_{an\le k\le n} \big|\frac{k}{n\eta_k^*} - 1\big| occurring here is identical with that of the variable

\sqrt{n} \max_{an\le k\le n} \Big|\log\frac{k}{n\eta_k^*}\Big| = \sqrt{n} \max_{1\le r\le [n(1-a)]+1} \Big|\sum_{j=1}^{r} \frac{\delta_j-1}{n+1-j}\Big|,

which can be determined by means of Theorem 10.

To prove Theorem 8, the steps which have been used in the proofs of Theorems 6 and 7 have to be applied simultaneously; in this proof we shall use Theorem 12 instead of Theorem 10. The basic idea of the proof is as follows. The limiting distribution of the variable

\sqrt{n} \sup_{a\le F(x)\le b} \frac{|F_n(x)-F(x)|}{F(x)}

is identical with that of the variable

\tau' = \sqrt{n} \max_{an\le k\le bn} \Big|\frac{k}{n\eta_k^*} - 1\Big|,

and therefore it is identical also with that of the variable

\tau'' = \sqrt{n} \max_{an\le k\le bn} \Big|\log\frac{k}{n\eta_k^*}\Big| = \sqrt{n} \max_{n+1-bn\le r\le n+1-an} \Big|\sum_{j=1}^{r} \frac{\delta_j-1}{n+1-j}\Big|.

Thus Theorem 12 is applicable; namely, the values of the constants A_n and B_n occurring in it are

A_n = \sqrt{\frac{1-b}{bn}}\,(1+o(1)) \qquad \text{and} \qquad B_n = \sqrt{\frac{1-a}{an}}\,(1+o(1)),

therefore

\lim_{n\to\infty} \frac{A_n}{B_n} = \lambda = \sqrt{\frac{a(1-b)}{b(1-a)}},
and thus, introducing the notation S_{n,r} = \sum_{j=1}^{r} \frac{\delta_j-1}{n+1-j}, we have

\lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)\le b} \frac{|F_n(x)-F(x)|}{F(x)} < y\Big) = \lim_{n\to\infty} P\Big(\max_{n+1-bn\le r\le n+1-an} |S_{n,r}| < y\sqrt{\frac{a}{1-a}}\,B_n\Big) = \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}\, e^{-\frac{(2k+1)^2\pi^2(1-a)}{8ay^2}}\, E_k,

where

E_k = \frac{1}{\sqrt{2\pi}} \int_{-y\sqrt{\frac{b}{1-b}}}^{+y\sqrt{\frac{b}{1-b}}} e^{-\frac{u^2}{2}}\,du + \theta_k

and

\theta_k = \frac{2 e^{-\frac{by^2}{2(1-b)}}}{y\sqrt{2\pi}} \sqrt{\frac{1-b}{b}} \int_0^{\frac{(2k+1)\pi}{2}} e^{\frac{(1-b)u^2}{2by^2}} \sin u\,du.
This completes the proof of Theorem 8.
§ 6. Remarks on the limiting distribution functions occurring in Theorems 5-8
The values of the limiting distribution function occurring in Theorem 5 may be read from the tables of the normal distribution function. The values of the limiting distribution function occurring in Theorem 6 can be computed by substituting the value z = y\sqrt{\frac{a}{1-a}} into the function

(6.1)  L(z) = \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} \exp\Big(-\frac{(2k+1)^2\pi^2}{8z^2}\Big) \qquad (z > 0).

At the end of this paper we give the table of the function L\big(y\sqrt{\frac{a}{1-a}}\big) for certain values of a. The curves of the distribution function L\big(y\sqrt{\frac{a}{1-a}}\big) for certain values of a can be seen in Fig. 1.
The values of the limiting distribution function occurring in Theorem 7 can be approximately computed in the following manner:

(6.2)  F(y,a,b) = \sqrt{\frac{b}{2\pi(1-b)}} \int_{-\infty}^{y} e^{-\frac{bu^2}{2(1-b)}} \left( \sqrt{\frac{2}{\pi}} \int_0^{(y-u)\sqrt{\frac{ab}{b-a}}} e^{-\frac{v^2}{2}}\,dv \right) du = \frac{1}{\pi} \iint_T e^{-\frac{u^2+v^2}{2}}\,du\,dv,

where T is an infinite triangular domain in the plane (u, v) defined by the following inequalities:

(6.3)  T:\quad -\infty < u < y\sqrt{\frac{b}{1-b}}; \qquad 0 < v < \Big(y\sqrt{\frac{b}{1-b}} - u\Big)\sqrt{\frac{a(1-b)}{b-a}}.

Introducing the polar coordinates r = \sqrt{u^2+v^2} and the polar angle \varphi, we obtain

(6.4)  F(y,a,b) = \frac{\alpha}{\pi} + \frac{1}{\pi} \int_0^{\pi-\alpha} \Big(1 - e^{-\frac{ay^2}{2(1-a)\sin^2(\varphi+\alpha)}}\Big)\,d\varphi,
[Fig. 1. The curves of the distribution function L(y\sqrt{a/(1-a)}) for certain values of a; the plotted values are not recoverable from the scan.]

where \tan\alpha = \sqrt{\frac{a(1-b)}{b-a}} and 0 < \alpha < \frac{\pi}{2}; therefore
(6.5)  F(y,a,b) = 1 - \frac{1}{\pi} \int_{\alpha}^{\pi} e^{-\frac{ay^2}{2(1-a)\sin^2\beta}}\,d\beta.

As

(6.6)  \frac{1}{\pi} \int_0^{\pi} e^{-\frac{ay^2}{2(1-a)\sin^2\beta}}\,d\beta = 1 - \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{u^2}{2}}\,du,
we have finally the following approximative expression:

(6.7)  F(y,a,b) = \sqrt{\frac{2}{\pi}} \int_0^{y\sqrt{\frac{a}{1-a}}} e^{-\frac{u^2}{2}}\,du + \frac{R}{\pi},
where

(6.8)  R = \int_0^{\alpha} \exp\Big(-\frac{ay^2}{2(1-a)\sin^2\beta}\Big)\,d\beta \qquad \text{and} \qquad \alpha = \arctan\sqrt{\frac{a(1-b)}{b-a}}.

If 1-b is small, then R is in most cases negligible, except for extremely small values of y, since

(6.9)  R \le \alpha \exp\Big(-\frac{ay^2}{2(1-a)\sin^2\alpha}\Big) = \alpha \exp\Big(-\frac{by^2}{2(1-b)}\Big).
The values of the limiting distribution function occurring in Theorem 8 can be approximately computed in the following manner: it follows from the second mean value theorem of the integral calculus that

(6.10)  |\theta_k| = \frac{2e^{-\frac{by^2}{2(1-b)}}}{y\sqrt{2\pi}} \sqrt{\frac{1-b}{b}} \left| \int_0^{\frac{(2k+1)\pi}{2}} e^{\frac{(1-b)u^2}{2by^2}} \sin u\,du \right| \le \frac{4e^{-\frac{by^2}{2(1-b)}}}{y\sqrt{2\pi}} \sqrt{\frac{1-b}{b}} \exp\Big(\frac{(2k+1)^2\pi^2(1-b)}{8by^2}\Big).

In this way, using the notation (6.1), the limiting distribution function occurring in Theorem 8 can be expressed in the following form:

(6.11)  \lim_{n\to\infty} P\Big(\sqrt{n}\sup_{a\le F(x)\le b} \frac{|F_n(x)-F(x)|}{F(x)} < y\Big) = L\Big(y\sqrt{\frac{a}{1-a}}\Big) \cdot \frac{1}{\sqrt{2\pi}} \int_{-y\sqrt{\frac{b}{1-b}}}^{+y\sqrt{\frac{b}{1-b}}} e^{-\frac{u^2}{2}}\,du + \Delta,

where, as is seen by way of a simple calculation,

(6.12)  |\Delta| \le \frac{4}{\pi} \cdot \frac{2e^{-\frac{by^2}{2(1-b)}}}{y\sqrt{2\pi}} \sqrt{\frac{1-b}{b}} \, \log\frac{1 + \exp\big(-\frac{\pi^2(b-a)}{8aby^2}\big)}{1 - \exp\big(-\frac{\pi^2(b-a)}{8aby^2}\big)},

whence it is readily seen that if b is very near to 1, \Delta is negligible. Observe that the first factor of the main term depends only on a, the second only on b; this fact simplifies the computation to a great extent, because we can obtain the first factor from the table of L(z) and the second from the table of the normal distribution function.
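The quality of the approximation (6.7) can be examined by comparing it with a direct numerical evaluation of (6.5). The following sketch is ours and purely illustrative (crude midpoint quadrature; step counts are arbitrary):

```python
import math

def F_exact(y, a, b, steps=20000):
    """F(y,a,b) by numerical integration of (6.5):
    1 - (1/pi) * integral_alpha^pi exp(-a y^2 / (2 (1-a) sin^2 beta)) d beta."""
    alpha = math.atan(math.sqrt(a * (1.0 - b) / (b - a)))
    c = a * y * y / (2.0 * (1.0 - a))
    h = (math.pi - alpha) / steps
    # Midpoint rule; midpoints avoid the zero of sin at beta = pi.
    integral = sum(math.exp(-c / math.sin(alpha + (i + 0.5) * h) ** 2) * h
                   for i in range(steps))
    return 1.0 - integral / math.pi

def F_approx(y, a):
    """Main term of (6.7): sqrt(2/pi) * integral_0^{y sqrt(a/(1-a))} e^{-u^2/2} du."""
    return math.erf(y * math.sqrt(a / (2.0 * (1.0 - a))))
```

For b close to 1 the two values almost coincide, in accordance with (6.9); as y tends to 0, F_exact approaches \alpha/\pi, i.e. the value (3.11b).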
INSTITUTE FOR APPLIED MATHEMATICS
OF THE HUNGARIAN ACADEMY OF SCIENCES
(Received 30 October 1953)
Bibliography
[1] K. PEARSON, Note on Francis Galton's problem, Biometrika, 1 (1902), pp. 390-399.
[2] L. v. BORTKIEWICZ, Variationsbreite und mittlerer Fehler, Sitzungsber. Berl. Math. Ges., 21 (1922), pp. 3-11.
[3] E. L. DODD, The greatest and the least variate under general laws of error, Trans. Amer. Math. Soc., 25 (1923), pp. 525-539.
[4] L. H. C. TIPPETT, On the extreme individuals and the range of samples taken from a normal population, Biometrika, 17 (1925), pp. 364-387.
[5] M. FRÉCHET, Sur la loi de probabilité de l'écart maximum, Ann. Soc. Polon. Math., 6 (1928), pp. 92-116.
[6] a) A. N. KOLMOGOROV, Sulla determinazione empirica di una legge di distribuzione, Giornale dell'Istituto Italiano degli Attuari, 4 (1933), pp. 83-91.
    b) A. N. KOLMOGOROV, Confidence limits for an unknown distribution function, Ann. Math. Statistics, 12 (1941), pp. 461-463.
[7] V. I. GLIVENKO, Sulla determinazione empirica delle leggi di probabilità, Giorn. Ist. Ital. Attuari, 4 (1933), pp. 92-99.
[8] a) N. V. SMIRNOV, Über die Verteilung des allgemeinen Gliedes in der Variationsreihe, Metron, 12 (1935), pp. 59-81.
    b) N. V. SMIRNOV, Sur la distribution de ω², Comptes Rendus Acad. Sci. Paris, 202 (1936), pp. 449-452.
    c) N. V. SMIRNOV, [On the distribution of the ω²-criterion of von Mises], Matem. Sbornik, 2 (44) (1937), pp. 973-994 (in Russian).
    d) N. V. SMIRNOV, [On the deviations of the empirical distribution curve], Matem. Sbornik, 6 (48) (1939), pp. 3-26 (in Russian).
    e) N. V. SMIRNOV, [Estimate of the discrepancy between empirical distribution curves in two independent samples], Bull. Moskov. Gos. Univ., 2 (1939), pp. 3-14 (in Russian).
    f) N. V. SMIRNOV, [Approximation of the distribution laws of random variables by empirical data], Uspehi Matem. Nauk, 10 (1944), pp. 179-206 (in Russian).
    g) N. V. SMIRNOV, [Limit distribution laws for the terms of a variational series], Trudy Mat. Inst. Steklova, 25 (1949), pp. 1-60 (in Russian).
[9] B. V. GNEDENKO, Limit theorems for sums of independent random variables, Amer. Math. Soc. Translation, no. 45 (1951).
[10] B. V. GNEDENKO and V. S. KOROLJUK, [On the maximal discrepancy between two empirical distributions], Dokl. Akad. Nauk SSSR, 80 (1951), pp. 525-528 (in Russian).
[11] B. V. GNEDENKO and E. L. RVAČEVA, [On a problem of the comparison of two empirical distributions], Dokl. Akad. Nauk SSSR, 82 (1952), pp. 513-516 (in Russian).
[12] B. V. GNEDENKO and V. S. MIHALEVIČ, [Two theorems on the behaviour of empirical distribution functions], Dokl. Akad. Nauk SSSR, 82 (1952), pp. 841-843 and pp. 25-27 (in Russian).
[13] V. S. MIHALEVIČ, [On the mutual disposition of two empirical distribution functions], Dokl. Akad. Nauk SSSR, 85 (1952), pp. 485-488 (in Russian).
[14] I. D. KVIT, [On the theorem of N. V. Smirnov concerning the comparison of two samples], Dokl. Akad. Nauk SSSR, 71 (1950), pp. 229-231 (in Russian).
[15] G. M. MANIA, [Generalization of the criterion of A. N. Kolmogorov for estimating the distribution law from empirical data], Dokl. Akad. Nauk SSSR, 69 (1949), pp. 495-497 (in Russian).
[16] I. I. GIHMAN, [On the empirical distribution function in the case of grouped data], Dokl. Akad. Nauk SSSR, 82 (1952), pp. 837-840 (in Russian).
[17] W. FELLER, On the Kolmogorov-Smirnov limit theorems for empirical distributions, Ann. Math. Statistics, 19 (1948), pp. 177-189.
[18] J. L. DOOB, Heuristic approach to the Kolmogorov-Smirnov theorems, Ann. Math. Statistics, 20 (1949), pp. 393-403.
[19] a) F. J. MASSEY, A note on the estimation of a distribution function by confidence limits, Ann. Math. Statistics, 21 (1950), pp. 116-119.
    b) F. J. MASSEY, A note on the power of a non-parametric test, Ann. Math. Statistics, 21 (1950), pp. 440-443.
    c) F. J. MASSEY, Distribution table for the deviation between two sample cumulatives, Ann. Math. Statistics, 23 (1952), pp. 435-441.
[20] a) M. D. DONSKER, Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems, Ann. Math. Statistics, 23 (1952), pp. 277-281.
    b) M. D. DONSKER, An invariance principle for certain probability limit theorems, Memoirs Amer. Math. Soc., 6 (1951), pp. 1-12.
[21] T. W. ANDERSON and D. A. DARLING, Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes, Ann. Math. Statistics, 23 (1952), pp. 193-212.
[22] P. ERDŐS and M. KAC, On certain limit theorems of the theory of probability, Bull. Amer. Math. Soc., 52 (1946), pp. 292-302.
[23] S. MALMQUIST, On a property of order statistics from a rectangular distribution, Skand. Aktuarietidskrift, 33 (1950), pp. 214-222.
[24] G. HAJÓS and A. RÉNYI, Elementary proofs of some basic facts concerning order statistics, Acta Math. Acad. Sci. Hung., 5 (1954) (in press).
[25] H. CRAMÉR, Mathematical methods of statistics (Princeton, 1946).
[26] L. N. BRAGINSKIJ, [Operative statistical quality control in machine building] (Moscow, 1951), Mašgiz (in Russian).
[27] A. WALD, Limit distribution of the maximum and minimum of successive cumulative sums of random variables, Bull. Amer. Math. Soc., 53 (1947), pp. 142-153.
[28] A. WALD, On the distribution of the maximum of successive cumulative sums of independently but not identically distributed chance variables, Bull. Amer. Math. Soc., 54 (1948), pp. 422-430.
[29] K. L. CHUNG, Asymptotic distribution of the maximum cumulative sum of independent random variables, Bull. Amer. Math. Soc., 54 (1948), pp. 1162-1170.
[30] S. S. WILKS, Order statistics, Bull. Amer. Math. Soc., 54 (1948), pp. 6-50.
[Table of the function L(y\sqrt{a/(1-a)}) for certain values of a; the tabulated values are not recoverable from the scan.]
ON THE THEORY OF ORDER STATISTICS

A. RÉNYI (Budapest)

(Summary)

The aim of the present paper is to give a new method by means of which the theory of order statistics can be built up simply and systematically and a number of new theorems can be proved. The essence of the method is that the investigation of the limiting distributions of quantities depending on the terms of an ordered sample is reduced to the investigation of the distributions of functions of sums of independent random variables. The method starts from the fact, first noticed by A. N. KOLMOGOROV in his paper [6a], that the terms of an ordered sample form a Markov chain. Moreover, as S. MALMQUIST proved, if $\xi_k^*$ ($k = 1, 2, \ldots, n$) denote the elements $\xi_k$ ($k = 1, 2, \ldots, n$) of a sample of size $n$, drawn from a population with continuous distribution function $F(x)$ and arranged in increasing order, and if $\eta_k^* = F(\xi_k^*)$ ($k = 1, 2, \ldots, n$), then the quantities $(\eta_k^*/\eta_{k+1}^*)^k$ ($k = 1, 2, \ldots, n$, with $\eta_{n+1}^* = 1$) are completely independent random variables, uniformly distributed in the interval $(0, 1)$; therefore the quantities $\zeta_k = \log (1/\eta_k^*)$ form an additive Markov chain. A simple proof of this fact is given in § 1. § 2 contains the application of this fact to simple proofs of some known theorems of the theory of order statistics. In § 3 the following new results, obtained by means of the new method, are formulated.
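Malmquist's fact quoted above is easy to check numerically. The sketch below (an illustration added to this transcription, not part of the paper) draws uniform samples, so that the $\eta_k^*$ are simply the order statistics themselves, and verifies that the ratios $(\eta_k^*/\eta_{k+1}^*)^k$ behave like independent uniform variables; all identifiers are ours.

```python
import random
import statistics

random.seed(1)
n, trials = 5, 100_000
cols = [[] for _ in range(n)]  # k-th column collects (eta_k*/eta_{k+1}*)^k

for _ in range(trials):
    # Sorted uniform sample with the convention eta_{n+1}* = 1.
    eta = sorted(random.random() for _ in range(n)) + [1.0]
    for k in range(1, n + 1):
        cols[k - 1].append((eta[k - 1] / eta[k]) ** k)

# Malmquist's property: each column is Uniform(0,1), so its mean should be
# close to 1/2 and its variance close to 1/12.
means = [statistics.fmean(c) for c in cols]
variances = [statistics.pvariance(c) for c in cols]
print(means)
print(variances)
```

A quick check of pairwise covariances (they should be near zero) supports the claimed independence as well.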
Let $F_n(x)$ denote the empirical distribution function of the sample, i.e. put
$$F_n(x) = \frac{k}{n} \quad \text{for } \xi_k^* \le x < \xi_{k+1}^* \ (k = 1, 2, \ldots, n-1),$$
$$F_n(x) = 0 \ \text{for } x < \xi_1^*, \qquad F_n(x) = 1 \ \text{for } \xi_n^* \le x.$$
Then we have
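The step-function definition of $F_n$ above can be realized directly; the following minimal sketch (added illustration, identifiers ours) counts how many sample points do not exceed $x$:

```python
import bisect

def empirical_cdf(sample):
    """Return F_n for the sample: F_n(x) = k/n for xi_k* <= x < xi_{k+1}*,
    0 below the smallest and 1 at and above the largest order statistic."""
    xs = sorted(sample)
    n = len(xs)

    def F_n(x):
        # Number of order statistics <= x, divided by n.
        return bisect.bisect_right(xs, x) / n

    return F_n

F = empirical_cdf([0.2, 0.8, 0.5, 0.5])
print(F(0.1), F(0.5), F(0.9))  # 0.0 0.75 1.0
```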
Theorem 5. For $0 < a < 1$,
$$\lim_{n\to\infty}\mathbf P\left(\sqrt n\,\sup_{a\le F(x)\le 1}\frac{F_n(x)-F(x)}{F(x)}<y\right)=\begin{cases}\sqrt{\dfrac2\pi}\displaystyle\int_0^{y\sqrt{a/(1-a)}}e^{-u^2/2}\,du&\text{for }y>0,\\[1ex]0&\text{for }y\le 0.\end{cases}$$

Theorem 6. For $0 < a < 1$,
$$\lim_{n\to\infty}\mathbf P\left(\sqrt n\,\sup_{a\le F(x)\le 1}\frac{|F_n(x)-F(x)|}{F(x)}<y\right)=\begin{cases}\dfrac4\pi\displaystyle\sum_{k=0}^\infty\frac{(-1)^k}{2k+1}\,e^{-\frac{(2k+1)^2\pi^2(1-a)}{8ay^2}}&\text{for }y>0,\\[1ex]0&\text{for }y\le 0.\end{cases}$$
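Theorem 5 lends itself to a Monte Carlo sanity check (an illustration added to this transcription; the function name and parameters are ours). For uniform samples $F(x)=x$, and the limit on the right-hand side equals $\operatorname{erf}\bigl(y\sqrt{a/(2(1-a))}\bigr)$; the sketch compares this with the simulated frequency:

```python
import bisect
import math
import random

random.seed(2)
a, y = 0.25, 1.0
n, trials = 1000, 2000

def renyi_stat(sample, a):
    # sqrt(n) * sup over {x : F(x) >= a} of (F_n(x) - F(x)) / F(x) for a
    # uniform sample, where F(x) = x.  On each step interval F_n(x)/x - 1
    # is decreasing, so the supremum is attained either at x = a or at an
    # order statistic >= a.
    xs = sorted(sample)
    pts = [a] + [x for x in xs if x >= a]
    sup = max(bisect.bisect_right(xs, t) / len(xs) / t - 1.0 for t in pts)
    return math.sqrt(len(xs)) * sup

hits = sum(renyi_stat([random.random() for _ in range(n)], a) < y
           for _ in range(trials))
empirical = hits / trials
# Theorem 5 limit: sqrt(2/pi) * integral_0^{y sqrt(a/(1-a))} e^{-u^2/2} du,
# which simplifies to erf(y * sqrt(a / (2 * (1 - a)))).
limit = math.erf(y * math.sqrt(a / (2 * (1 - a))))
print(empirical, limit)
```

With $n = 1000$ the two numbers should agree to within a few percent; the agreement improves as $n$ grows.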
Theorem 7. For $0 < a < b < 1$ and $y > 0$,
$$\lim_{n\to\infty}\mathbf P\left(\sqrt n\,\sup_{a\le F(x)\le b}\frac{F_n(x)-F(x)}{F(x)}<y\right)=\frac1{\sqrt{2\pi}}\int_{-\infty}^{y\sqrt{\frac b{1-b}}}e^{-u^2/2}\left(2\Phi\!\left(y\sqrt{\frac{ab}{b-a}}-u\sqrt{\frac{a(1-b)}{b-a}}\right)-1\right)du,$$
where $\Phi$ denotes the standard normal distribution function; the limit is $0$ for $y\le 0$.
Theorem 8. For $0 < a < b < 1$ and $y > 0$,
$$\lim_{n\to\infty}\mathbf P\left(\sqrt n\,\sup_{a\le F(x)\le b}\frac{|F_n(x)-F(x)|}{F(x)}<y\right)=\frac4\pi\sum_{k=0}^\infty\frac{(-1)^k}{2k+1}\,e^{-\frac{(2k+1)^2\pi^2(b-a)}{8ab\,y^2}}\int_{-y}^{y}\sqrt{\frac b{2\pi(1-b)}}\;e^{-\frac{b\,v^2}{2(1-b)}}\cos\frac{(2k+1)\pi v}{2y}\,dv,$$
and the limit is $0$ for $y\le 0$.
These theorems, which are analogous to the well-known theorems of N. V. SMIRNOV and A. N. KOLMOGOROV, furnish tests of hypotheses concerning $F(x)$ and yield confidence limits for the unknown distribution function $F(x)$. The proofs of these theorems are contained in § 5 and rest, besides the method mentioned above, on some new limit theorems, given in § 4, concerning the maxima of the partial sums of sequences of independent random variables. § 6 contains some remarks on the computation of the limiting distribution functions occurring in Theorems 5--8. At the end of the paper a table of the values of
$$L(z)=\frac4\pi\sum_{k=0}^\infty\frac{(-1)^k}{2k+1}\,e^{-\frac{(2k+1)^2\pi^2}{8z^2}}$$
is given for various values of $y$ and $a$, where $z=y\sqrt{\dfrac a{1-a}}$.
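The function $L(z)$ tabulated at the end of the paper is easy to evaluate from its series, since the terms decay extremely fast. A small sketch (an illustration added to this transcription; the function name is ours):

```python
import math

def L(z, terms=200):
    """Limiting distribution L(z) = (4/pi) * sum_{k>=0} (-1)^k / (2k+1)
    * exp(-(2k+1)^2 * pi^2 / (8 z^2)); by convention L(z) = 0 for z <= 0."""
    if z <= 0:
        return 0.0
    s = sum((-1) ** k / (2 * k + 1)
            * math.exp(-(2 * k + 1) ** 2 * math.pi ** 2 / (8 * z ** 2))
            for k in range(terms))
    return 4 / math.pi * s

# z = y * sqrt(a / (1 - a)) converts the (y, a) of Theorem 6 to the
# argument of L; here we simply print L along a grid of z values.
for z in (0.5, 1.0, 2.0, 3.0):
    print(z, round(L(z), 4))
```

In practice a handful of terms already suffices: for $z \le 3$ the $k = 5$ term is below $10^{-8}$.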