Chu, John T. and H. Hotelling (1954). "Moments of the sample median." (Navy Research)

THE MOMENTS OF THE SAMPLE MEDIAN

by

J. T. Chu and Harold Hotelling

Special report to the Office of Naval Research of work at Chapel Hill under Contract NR 042 031, Project N7-onr-284(02), for research in Probability and Statistics.

Institute of Statistics
Mimeograph Series No. 116
August, 1954
THE MOMENTS OF THE SAMPLE MEDIAN^{1,2}

by

John T. Chu and Harold Hotelling
Institute of Statistics, University of North Carolina
1. Summary.

It is shown that under certain regularity conditions, the moments about its mean of the sample median tend, as the sample size increases indefinitely, to the corresponding ones of the asymptotic distribution (which is normal). A method of approximation, using the inverse function of the cumulative distribution function, is obtained for the moments of the sample median of a certain type of parent distribution. An advantage of this method is that the error can be made as small as is required. Applications to normal, Laplace, and Cauchy distributions are discussed. Upper and lower bounds are obtained, by a different method, for the variance of the sample median of normal and Laplace parent distributions. They are simple in form, and of practical use if the sample size is not too small.
2. Introduction.

Let a population be given with cdf (cumulative distribution function) $F(x)$ and pdf (probability density function) $f(x)$, and median $\xi$, which we assume to exist uniquely. Let $\tilde{x}$ denote the sample median of a sample of size $2n + 1$. Then the pdf $g(x)$ of $\tilde{x}$ and the pdf $h(x)$ of the asymptotic distribution of $\tilde{x}$ are respectively

(1)  $g(x) = C_n\,[F(x)]^n\,[1 - F(x)]^n f(x)$,

where $C_n = (2n + 1)!/(n!\,n!)$, and

(2)  $h(x) = (2\pi\bar{\mu}_2)^{-1/2} \exp\{-(x - \xi)^2/2\bar{\mu}_2\}$,  where  $\bar{\mu}_2 = 1/\{4(2n + 1)\,f^2(\xi)\}$.

1. This paper was presented at the summer meeting of the Institute of Mathematical Statistics at Montreal, P.Q., Canada, on September 11, 1954.
2. Sponsored by the Office of Naval Research under Contract NR 042 031 for research in Probability and Statistics.
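Formulas (1) and (2) can be checked numerically. The following sketch (an editorial illustration in modern Python, not part of the original report) evaluates $g(x)$ for a standard normal parent and verifies that it integrates to 1 and that its variance is close to $\bar{\mu}_2$ of (2) for a moderate sample size.

```python
# Editorial sketch (not in the original report): evaluate the pdf (1) of the
# sample median for a standard normal parent and check that it integrates to 1
# and that its variance is close to the asymptotic variance in (2).
from math import comb, pi, sqrt
from statistics import NormalDist

def g_density(x, n):
    """Pdf (1) of the median of a sample of size 2n + 1 from N(0, 1)."""
    N = NormalDist()
    Cn = (2 * n + 1) * comb(2 * n, n)        # C_n = (2n+1)!/(n! n!)
    F = N.cdf(x)
    return Cn * (F * (1 - F)) ** n * N.pdf(x)

def bar_mu2(n):
    """Asymptotic variance in (2): 1/(4(2n+1) f(xi)^2) = pi/(2(2n+1))."""
    f_xi = 1 / sqrt(2 * pi)                  # normal density at the median xi = 0
    return 1 / (4 * (2 * n + 1) * f_xi ** 2)

def mass_and_variance(n, lo=-2.0, hi=2.0, steps=20_000):
    """Midpoint-rule integrals of g and x^2 g over [lo, hi]."""
    h = (hi - lo) / steps
    mass = var = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        w = g_density(x, n) * h
        mass += w
        var += x * x * w
    return mass, var

mass, var = mass_and_variance(50)            # sample size 101
```

The integration range $[-2, 2]$ is many standard deviations wide for a sample of size 101, so the truncation error is negligible.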
A question that follows naturally is: can the moments of the asymptotic distribution of $\tilde{x}$ be used as approximations to the corresponding moments of $\tilde{x}$, and if not, how can better approximations be found? When the parent distribution is normal, this question has been answered by various authors, e.g., T. Hojo [6], K. Pearson [8, 9], and, more recently, J. H. Cadwell [3]. It has been stated, e.g., in [3], that experiments showed that the distribution of $\tilde{x}$ tends rapidly to normality, but the variance of $\tilde{x}$ (as of quantiles in general) tends only slowly to the variance of the asymptotic distribution. For this reason of slow convergence, approximations were derived for the variance of $\tilde{x}$ when the sample size is small. While different methods were used by different authors, their results agree fairly well with each other. In fact, the problem should be considered as completely solved but for the unknown error committed in using such approximations.

One of us [4] recently proved that the distribution of $\tilde{x}$, for a normal parent distribution, does tend to normality rather "rapidly". In § 6 we shall confirm another experimental result, that the variance of $\tilde{x}$ tends "slowly" to the variance of the asymptotic distribution (actually not "very slowly"). Upper and lower bounds are obtained (§ 6, Theorem 4) for the variance of $\tilde{x}$. A slightly better lower bound is obtained, by a different method, in § 5, formula (49), or § 6, formula (74). It seems that even for sample sizes around 10 to 20, the asymptotic variance is not a bad approximation to the true variance of $\tilde{x}$. It becomes a very good approximation if the sample size is large. However, for large samples, an even better approximation is obtained in § 5, formula (56).
Before further discussion, the following notations will be introduced. If $f(x)$ and $g(x)$ are functions of $x$, then $E_f(g)$ denotes the expectation of $g(x)$ with respect to $f(x)$, i.e., $\int_{-\infty}^{\infty} g(x) f(x)\,dx$. We use, where $f$, $g$, and $h$ are given by (1) and (2),

(3)  $\tilde{\mu}_1 = E_g(\tilde{x})$,  $\bar{\mu}_1 = E_h(\tilde{x})$,

and for any integer $k \ge 2$,

(4)  $\tilde{\mu}_k = E_g(\tilde{x} - \tilde{\mu}_1)^k$,  $\bar{\mu}_k = E_h(\tilde{x} - \bar{\mu}_1)^k$.

It should be pointed out that, although the pdf $g(x)$ of $\tilde{x}$ tends to $h(x)$, the moments $\tilde{\mu}_k$ of $\tilde{x}$ in general do not necessarily tend to $\bar{\mu}_k$. In fact $\tilde{\mu}_k$ may never exist [2]. Nevertheless, if the parent pdf satisfies certain conditions, then it can be shown that $\tilde{\mu}_k$ tends to $\bar{\mu}_k$ as the sample size tends to infinity (§ 3, Theorem 1). Therefore, under such circumstances it is justifiable, at least for large samples, to use $\bar{\mu}_k$ as an approximation to $\tilde{\mu}_k$.
If the parent distribution satisfies certain conditions, a general method is obtained in § 4 for computing $\tilde{\mu}_k$, $k = 1, 2, \ldots$. The method is based on the Taylor expansion of $x(F)$, the inverse function of $F(x)$. For example, if $x(F) = \sum_{m=1}^{\infty} a_m (F - \frac{1}{2})^m$ converges for $0 < F < 1$, $a_m = O(2^m m^k)$ where $k \ge 0$, and $f(x)$ is symmetric with respect to $x = \xi$, then when $n > 2k + 3$,

(5)  $\tilde{\mu}_2 = \lim_{m \to \infty} \int_0^1 S_m^2\, C_n F^n (1 - F)^n\, dF$,

where $C_n$ is given by (1) and $S_m = \sum_{r=1}^{m} a_r (F - \frac{1}{2})^r$ (§ 4, Theorem 3). The error in such approximation can be computed, and it tends to 0 as $m$ tends to $\infty$. If the parent pdf is not symmetric, similar approximations can be obtained (§ 4, formula (26)). Applications are given to the variances of the sample medians of Laplace and Cauchy parent distributions (§ 4, Examples 1 and 2).

Finally, upper and lower bounds are derived in § 7 for the variance of $\tilde{x}$ of a Laplace parent distribution. It then can be seen that for estimating the mean of a Laplace distribution, the sample median is a "better" estimate than the sample mean, not only for large samples, but for small samples as well.
3. Large Sample Moments.

Lemma 1. If $0 \le c \le \frac{1}{2}$, then for $m, n = 1, 2, \ldots$ and $C_n = (2n + 1)!/n!\,n!$, we have for fixed $m$,

(6)  $\int_{\frac{1}{2}-c}^{\frac{1}{2}+c} |u - \tfrac{1}{2}|^m\, u^n (1 - u)^n\, du = \big(\tfrac{1}{2}\big)^{m+2n+1} \int_0^{4c^2} t^{(m-1)/2}\,(1 - t)^n\, dt$.

In particular, if $c = \frac{1}{2}$,

(7)  $\int_0^1 C_n\, |u - \tfrac{1}{2}|^m\, u^n (1 - u)^n\, du = O(n^{-m/2})$,

(8)  $\int_0^1 C_n\, (u - \tfrac{1}{2})^{2m}\, u^n (1 - u)^n\, du = \big(\tfrac{1}{2}\big)^{2m}\, \frac{1 \cdot 3 \cdots (2m - 1)}{(2n+3)(2n+5)\cdots(2n+2m+1)}$.

These formulae are easily proved using transformations $v = 2(u - \tfrac{1}{2})$, etc.
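Identity (8) can be verified exactly. The sketch below (an editorial addition, not part of the original report) expands the integrand polynomially and integrates term by term in rational arithmetic.

```python
# Editorial check (added): identity (8) verified exactly in rational
# arithmetic by expanding the integrand and integrating term by term.
from fractions import Fraction
from math import comb, factorial

def lhs(m, n):
    """int_0^1 C_n (u - 1/2)^{2m} u^n (1-u)^n du, computed exactly."""
    Cn = Fraction(factorial(2 * n + 1), factorial(n) ** 2)
    total = Fraction(0)
    for j in range(2 * m + 1):              # expand (u - 1/2)^{2m}
        for i in range(n + 1):              # expand (1 - u)^n
            coef = (comb(2 * m, j) * Fraction((-1) ** (2 * m - j), 2 ** (2 * m - j))
                    * comb(n, i) * (-1) ** i)
            total += coef * Fraction(1, j + n + i + 1)   # int_0^1 u^{j+n+i} du
    return Cn * total

def rhs(m, n):
    """(1/2)^{2m} * 1*3*...*(2m-1) / ((2n+3)(2n+5)...(2n+2m+1))."""
    num, den = Fraction(1), Fraction(1)
    for i in range(1, m + 1):
        num *= 2 * i - 1
        den *= 2 * n + 2 * i + 1
    return Fraction(1, 2 ** (2 * m)) * num / den
```

For example, $m = n = 1$ gives $1/20$ on both sides.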
Theorem 1. Let a population be given with cdf $F(x)$ and pdf $f(x)$. Suppose that the median $\xi$ of the given population exists uniquely, $f(\xi) \ne 0$, and $f'(x)$ exists and is bounded in some neighborhood of $x = \xi$. If $\tilde{x}$ is the sample median of a sample of size $2n + 1$, and $\tilde{\mu}_k$ and $\bar{\mu}_k$, as defined by (4), are respectively the $k$th moment of $\tilde{x}$ about its mean and the corresponding one of its asymptotic distribution, then

(10)  $\lim_{n \to \infty} \tilde{\mu}_{2k-1} = \bar{\mu}_{2k-1}$,

(11)  $\lim_{n \to \infty} \tilde{\mu}_{2k}/\bar{\mu}_{2k} = 1$,  $k = 1, 2, \ldots$,

provided that in each case $\tilde{\mu}_{2k-1}$ and $\tilde{\mu}_{2k}$ are finite for at least one $n$.
Proof. We will prove (11) as an illustration of the method we use; (10) can be shown in the same way. Obviously

(12)  $\tilde{\mu}_{2k} = \mu'_{2k} + \sum_{j=0}^{2k-1} \binom{2k}{j} (-1)^{2k-j}\, {\mu'_1}^{2k-j} \mu'_j$,

where $\mu'_j$ denotes the $j$th moment of $\tilde{x}$ about $\xi$ and $\binom{2k}{j} = (2k)!/\{j!\,(2k - j)!\}$. We say that if $\tilde{\mu}_{2k}$ is finite for a certain $n = n_0$, then $\tilde{\mu}_{2k}$ is finite for all $n \ge n_0$, and

(13)  $\mu'_{2m-1} = O(n^{-m})$,

(14)  $\mu'_{2m} = \big(\tfrac{a_1}{2}\big)^{2m}\, \frac{1 \cdot 3 \cdots (2m - 1)}{(2n+3)(2n+5)\cdots(2n+2m+1)} + O(n^{-m-1/2})$,  for $m = 1, 2, \ldots, k$,

where $a_1 = 1/f(\xi)$. On combining (12), (13), and (14), it follows that

(15)  $\tilde{\mu}_{2k} = \big(\tfrac{a_1}{2}\big)^{2k}\, \frac{1 \cdot 3 \cdots (2k - 1)}{(2n+3)(2n+5)\cdots(2n+2k+1)} + O(n^{-k-1/2})$.

Since $\bar{\mu}_{2k} = 1 \cdot 3 \cdots (2k - 1)\, \bar{\mu}_2^{\,k}$, where $\bar{\mu}_2$ is defined by (2), we have (11).
To complete the proof, it remains to establish (13) and (14). Now, for example,

(16)  $\mu'_{2m} = \int_{-\infty}^{\infty} (x - \xi)^{2m}\, C_n [F(x)]^n [1 - F(x)]^n f(x)\, dx = \int_{-\infty}^{a} + \int_a^b + \int_b^{\infty} = I_1 + I_2 + I_3$, say,

where $a < \xi$ and $b > \xi$ will be chosen later. For $0 \le F \le 1$, the function $F(1 - F)$ is non-negative, reaches its maximum $\frac{1}{4}$ at $F = \frac{1}{2}$, and is increasing for $0 \le F \le \frac{1}{2}$ and decreasing for $\frac{1}{2} \le F \le 1$. Let

(17)  $r = 4 \max\{F(a)[1 - F(a)],\ F(b)[1 - F(b)]\}$;

then $0 < r < 1$. Since $C_n = O(2^{2n} n^{1/2})$, it follows that

(18)  $I_1 + I_3 = O(n^{1/2} r^n)$.

On the other hand, if $a$ and $b$ are so chosen that, e.g., $F(b) - \frac{1}{2} = \frac{1}{2} - F(a) = c$ is small, then for $\frac{1}{2} - c \le F \le \frac{1}{2} + c$, $x(F)$, the inverse function of $F(x)$, is uniquely defined and may be expanded, by Taylor's method, into

(19)  $x(F) - \xi = a_1 (F - \tfrac{1}{2}) + R_2 (F - \tfrac{1}{2})^2$,

where $a_1 = 1/f(\xi)$ and $R_2$ is the remainder. Substituting (19) for $x - \xi$ in $I_2$ of (16), it can be shown, using Lemma 1, that $I_2$ is equal to the RHS (right hand side) of (14). Combining this fact with (16) and (18), we obtain (14).
Regarding the above theorem, we make

Remark 1. A sufficient condition for $\tilde{\mu}_k$ to be finite for some $n = n_0$ (hence for all $n \ge n_0$) is that the $k$th moment of the parent distribution be finite. This condition, however, is not necessary. For example, the variance of the sample median of a Cauchy parent distribution is finite if the sample size $2n + 1 \ge 5$, though the variance of the parent distribution is infinite (§ 4, Example 2).

Remark 2. Theorem 1 states only some sufficient conditions under which (10) and (11) are true. For a Laplace parent distribution, $f'(\xi)$ does not exist, yet (10) and (11) hold (§ 4, Example 1).

The above theorem provides a justification, at least for large samples, for using $\bar{\mu}_k$ as an approximation to $\tilde{\mu}_k$. In the next section we will proceed to show that if the parent pdf satisfies some additional conditions, then satisfactory approximations can be obtained for $\tilde{\mu}_k$ for samples of smaller sizes as well.
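The convergence asserted in (11) can be illustrated numerically. The sketch below (an editorial addition, not in the original) computes $\tilde{\mu}_4$ for a standard normal parent by quadrature in the variable $F$ and compares it with $\bar{\mu}_4 = 3\bar{\mu}_2^2$; the relative error shrinks as $n$ grows.

```python
# Editorial illustration of (11) (added; not in the original): for a standard
# normal parent, the ratio mu~_4 / mu-_4 tends to 1 as n grows. The moments
# are computed by midpoint-rule quadrature in the variable F.
from math import exp, lgamma, log, pi
from statistics import NormalDist

def tilde_moment(p, n, steps=200_000):
    """mu'_p = int_0^1 x(F)^p C_n F^n (1-F)^n dF (here the mean is xi = 0)."""
    N = NormalDist()
    log_cn = lgamma(2 * n + 2) - 2 * lgamma(n + 1)   # log C_n
    total, h = 0.0, 1.0 / steps
    for i in range(steps):
        F = (i + 0.5) * h
        w = exp(log_cn + n * (log(F) + log(1 - F)))
        total += N.inv_cdf(F) ** p * w * h
    return total

def bar_moment(p, n):
    """mu-_{2k} = 1*3*...*(2k-1) mu-_2^k, with mu-_2 = pi/(2(2n+1))."""
    k, mu2, dfact = p // 2, pi / (2 * (2 * n + 1)), 1
    for i in range(1, p // 2 + 1):
        dfact *= 2 * i - 1
    return dfact * mu2 ** k

errors = [abs(tilde_moment(4, n) / bar_moment(4, n) - 1) for n in (10, 50, 200)]
```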
4. Approximations.

Lemma 2. If $k$ is real, then the following series is convergent for every positive integer $n > k$:

(20)  $\sum_{m=1}^{\infty} m^k\, \frac{1 \cdot 3 \cdots (2m - 1)}{(2n+3)(2n+5)\cdots(2n+2m+1)}$.

Proof. Use Lemma 1 and the fact [1, p. 33] that if $a_m \ge 0$ and $m(a_m/a_{m+1} - 1)$ approaches $r > 1$, then $\sum_{m=1}^{\infty} a_m$ is convergent; or apply Stirling's approximation, with $m$ large, to the gamma functions obtained by putting $c = \frac{1}{2}$ in (6).
...
Let F(x) be the cdf of a given distribution and ~ and x be
'lbeorem 2.
respectively the median and the sample median of a sample of size
?n + 1.
Suppose
that x(F), the inverse function of F(x), is for 0 < F < 1 uniquely defined and equal
to a convergent series of powers of F
x(F) - ~
(21)
1
-?;
let
1 m
00
=
am(F - '2)
E
m=l
•
v.Trite
1 r
00
(22)
and
If there exists a sequence
00
(23)
E
m=l
R = E
m r=rn+l
ar(F -
2)
{b ) such that
m
(a /b) 2 <
m m
00 ,
for 0 < F < 1,
and
9
1
2
00
(25)
Z
m=l
b
m
1 2m
C pn(l - F)n dF <
2
n
(F - -)
f
o
for some positive integer value nO of n, and
-
~2'
00 ,
-
tho variance of x, is finite for
n = nO' then for all integers n > nO'
(26)  $\tilde{\mu}_2 = \lim_{m \to \infty}\Big[\int_0^1 S_m^2\, C_n F^n (1 - F)^n\, dF - \Big(\int_0^1 S_m\, C_n F^n (1 - F)^n\, dF\Big)^2\Big]$.

Further, if $f(x)$ is symmetric with respect to $\xi$, then the second term in the bracket should be omitted.
Proof. For simplicity we assume that $f(x)$ is symmetric with respect to $x = \xi$. In this case $\tilde{\mu}_1 = \xi$ and

(27)  $\tilde{\mu}_2 = \int_{-\infty}^{\infty} (x - \xi)^2\, C_n [F(x)]^n [1 - F(x)]^n f(x)\, dx = \int_{-\infty}^{a} + \int_a^b + \int_b^{\infty}$,

where $a < \xi$ and $b > \xi$ will be chosen later. It can be shown that

(28)  $\int_{-\infty}^{a} + \int_b^{\infty} = O(n^{1/2} r^n)$,

where $r$ is defined by (17). Choose $a$, $b$, and $c$ such that $0 < c < \frac{1}{2}$ and $\frac{1}{2} - F(a) = F(b) - \frac{1}{2} = c$. Using (21) and (22), we have

(29)  $\Big|\int_a^b (x - \xi)^2\, C_n [F]^n [1 - F]^n f\, dx - \int_0^1 S_m^2\, C_n F^n (1 - F)^n\, dF\Big| \le J_1 + J_2 + J_3 + J_4$,

where

(30)  $J_1 = \int_{\frac{1}{2}+c}^{1} S_m^2\, C_n F^n (1 - F)^n\, dF$,

(31)  $J_2 = \int_0^{\frac{1}{2}-c} S_m^2\, C_n F^n (1 - F)^n\, dF$,

(32)  $J_3 = \int_{\frac{1}{2}-c}^{\frac{1}{2}+c} R_m^2\, C_n F^n (1 - F)^n\, dF$,

(33)  $J_4 = 2 \int_{\frac{1}{2}-c}^{\frac{1}{2}+c} |S_m R_m|\, C_n F^n (1 - F)^n\, dF$.
By Schwarz's inequality, we get

(34)  $J_1 + J_2 \le 6\big(\tfrac{1}{4} - c^2\big) \sum_{m=1}^{\infty}\Big(\frac{a_m}{b_m}\Big)^2 \int_0^1 \sum_{m=1}^{\infty} b_m^2 (F - \tfrac{1}{2})^{2m}\, C_{n-1} F^{n-1} (1 - F)^{n-1}\, dF$.

By (23) and (25), the two series on the RHS are convergent for $n > n_0$. Hence if $n > n_0$, $J_1 + J_2$ tends to 0 as $c$ tends to $\frac{1}{2}$. Further, from (32),

(35)  $J_3 \le \sum_{r=m+1}^{\infty}\Big(\frac{a_r}{b_r}\Big)^2 \int_0^1 \sum_{r=m+1}^{\infty} b_r^2 (F - \tfrac{1}{2})^{2r}\, C_n F^n (1 - F)^n\, dF$,

(36)  $J_4 \le 2\Big[\sum_{r=1}^{\infty}\Big(\frac{a_r}{b_r}\Big)^2 \cdot \sum_{r=m+1}^{\infty}\Big(\frac{a_r}{b_r}\Big)^2\Big]^{1/2} \int_0^1 \sum_{r=1}^{\infty} b_r^2 (F - \tfrac{1}{2})^{2r}\, C_n F^n (1 - F)^n\, dF$.

As $m$ tends to infinity, $J_3 + J_4$ tends to 0. Consequently, for any fixed $n > n_0$,

(37)  $\tilde{\mu}_2 = \lim_{m \to \infty} \int_0^1 S_m^2\, C_n F^n (1 - F)^n\, dF$.
An immediate consequence of Lemma 2 and Theorem 2 is

Theorem 3. If, in the preceding theorem, $a_m = O(2^m m^k)$ for some integer $k$, then (26) holds for every $n > 2k + 3$.

Proof. Choose $b_m = 2^m m^{k+1}$.

Thus we have found an approximation for $\tilde{\mu}_2$,

(38)  $\int_0^1 S_m^2\, C_n F^n (1 - F)^n\, dF$,

for all integers $n$ which are not too small. The integral on the RHS of (38) can be evaluated by formulas given in Lemma 1. An upper bound for the error committed in such approximation is given by the sum of the RHS of (35) and (36). Finally we note that the same method can be used to obtain the moments of $\tilde{x}$ in general.

Example 1. Laplace distribution. Let $f(x) = \frac{1}{2} e^{-|x|}$; then $F(x) = \frac{1}{2} e^{x}$ if $x \le 0$, and $F(x) = 1 - \frac{1}{2} e^{-x}$ if $x \ge 0$. Hence
(39)  $\tilde{\mu}_2 = 2 \int_{\frac{1}{2}}^{1} x^2\, C_n F^n (1 - F)^n\, dF$.

If $\frac{1}{2} \le F < 1$, then $x = -\log 2(1 - F) = \sum_{m=1}^{\infty} m^{-1}\,[2(F - \tfrac{1}{2})]^m$, so that $a_m = 2^m m^{-1}$. It follows that for $n > 1$,

(40)  $\tilde{\mu}_2 = \lim_{m \to \infty} 2 \int_{\frac{1}{2}}^{1} S_m^2\, C_n F^n (1 - F)^n\, dF$

(41)  $\phantom{\tilde{\mu}_2} = \sum_{m=1}^{\infty} w_m \int_0^1 |2(F - \tfrac{1}{2})|^{m+1}\, C_n F^n (1 - F)^n\, dF$,

where

(42)  $w_m = \sum_{r=1}^{m} r^{-1}(m - r + 1)^{-1} = 2(m + 1)^{-1} \sum_{r=1}^{m} r^{-1}$.

If we use

(43)  $\tilde{\mu}_2 \approx \sum_{m=1}^{2k-1} w_m \int_0^1 |2(F - \tfrac{1}{2})|^{m+1}\, C_n F^n (1 - F)^n\, dF$,

then the error committed is bounded by

(43a)  $2\, w_{2k}\, (\pi n)^{-1/2}\big(1 + \tfrac{1}{2n}\big)\, \frac{2 \cdot 4 \cdots 2k}{(2n+2)(2n+4)\cdots(2n+2k)}$.

In deriving (43a), we used the facts that $w_m$ is a monotonically decreasing sequence of $m$ [1, p. 85] and that the Wallis product [7, p. 385]

$\frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)}\, n^{1/2}$

is a monotonically increasing sequence of $n$. Similarly, if

(44)  $\tilde{\mu}_2 \approx \sum_{m=1}^{2k} w_m \int_0^1 |2(F - \tfrac{1}{2})|^{m+1}\, C_n F^n (1 - F)^n\, dF$,

then the error is bounded by the corresponding expression (44a), obtained from (43a) on replacing $w_{2k}$ by $w_{2k+1}$.
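The agreement between the series (41) and the integral (39) can be checked directly. The sketch below (an editorial addition; truncation length and step counts are choices made here, not from the report) evaluates both for $n = 5$.

```python
# Editorial check of Example 1 (added): the series (41) and the direct
# integral (39) agree for the Laplace parent; n = 5 here (sample size 11).
from math import comb, log

def cn(n):
    return (2 * n + 1) * comb(2 * n, n)     # C_n = (2n+1)!/(n! n!)

def direct_mu2(n, steps=200_000):
    """(39): 2 * int_{1/2}^1 (log 2(1-F))^2 C_n F^n (1-F)^n dF."""
    total, h = 0.0, 0.5 / steps
    for i in range(steps):
        F = 0.5 + (i + 0.5) * h
        x = -log(2 * (1 - F))
        total += x * x * cn(n) * (F * (1 - F)) ** n * h
    return 2 * total

def abs_moment(p, n, steps=50_000):
    """int_0^1 |2(F - 1/2)|^p C_n F^n (1-F)^n dF, midpoint rule."""
    total, h = 0.0, 1.0 / steps
    for i in range(steps):
        F = (i + 0.5) * h
        total += abs(2 * (F - 0.5)) ** p * cn(n) * (F * (1 - F)) ** n * h
    return total

def series_mu2(n, terms=40):
    """(41) truncated, with the coefficients w_m of (42)."""
    total = 0.0
    for m in range(1, terms + 1):
        w_m = 2 / (m + 1) * sum(1 / r for r in range(1, m + 1))
        total += w_m * abs_moment(m + 1, n)
    return total
```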
Example 2. Cauchy distribution. Let $f(x) = 1/\pi(1 + x^2)$; then $F(x) = \pi^{-1}[\tan^{-1} x + \frac{\pi}{2}]$ for $-\infty < x < \infty$, so $x(F) = \tan \pi(F - \frac{1}{2})$ for $0 < F < 1$. It can be shown that the variance of the sample median of a sample of size $2n + 1 \ge 5$ is finite:

(45)  $\tilde{\mu}_2 = \int_0^1 \tan^2 \pi(F - \tfrac{1}{2})\, C_n F^n (1 - F)^n\, dF$.

It is known [7, pp. 204, 237] that

(46)  $\tan x = \sum_{m=1}^{\infty} (-1)^{m-1}\, \frac{2^{2m}(2^{2m} - 1)}{(2m)!}\, B_{2m}\, x^{2m-1}$,  for $|x| < \frac{\pi}{2}$,

where the $B_{2m}$ are the Bernoulli numbers. We see that $a_m = O(2^m)$; hence, by Theorem 3, (37) holds if $n > 3$.
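The finiteness claim of Example 2 can be seen numerically: although the Cauchy parent has no variance, the integrand of (45) stays bounded near $F = 0$ and $F = 1$ once $n \ge 2$. The sketch below (an editorial addition) evaluates (45) for $n = 2$.

```python
# Editorial check (added): the integral (45) is finite for n = 2 (sample size
# 5), even though the Cauchy parent itself has no finite variance.
from math import comb, pi, tan

def cauchy_median_var(n, steps=400_000):
    Cn = (2 * n + 1) * comb(2 * n, n)       # C_n = (2n+1)!/(n! n!)
    total, h = 0.0, 1.0 / steps
    for i in range(steps):
        F = (i + 0.5) * h
        # tan^2 ~ 1/(pi^2 (1-F)^2) near F = 1 is cancelled by (1-F)^{2n}
        total += tan(pi * (F - 0.5)) ** 2 * Cn * (F * (1 - F)) ** n * h
    return total

v5 = cauchy_median_var(2)
```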
5. Normal distribution.

Throughout this section, $f(x) = (2\pi)^{-1/2} e^{-x^2/2}$, $F(x) = \int_{-\infty}^{x} f(t)\, dt$, and $x(F)$ is the inverse function of $F(x)$. No simple general form of the derivatives of $x(F)$ at $F = \frac{1}{2}$ is known, but the first few derivatives of $x(F)$ can be obtained by direct differentiation, e.g.,

$\frac{dx}{dF} = \frac{1}{f(x)}, \qquad \frac{d^2x}{dF^2} = \frac{1}{f(x)}\,\frac{d}{dx}\Big(\frac{1}{f(x)}\Big), \ \ldots$

For finite $m$ and $0 < F < 1$, let

(47)  $x(F) = a_1(F - \tfrac{1}{2}) + a_2(F - \tfrac{1}{2})^2 + \cdots + a_m(F - \tfrac{1}{2})^m + R_{m+1}(F - \tfrac{1}{2})^{m+1}$;

then

(48)  $a_1 = (2\pi)^{1/2}, \quad a_2 = 0, \quad a_3 = (2\pi)^{3/2}/6, \ \ldots$
A. A Lower Bound. Take the integral (39); let the range of integration be divided into two: $\frac{1}{2}$ to $\frac{1}{2} + c$, and $\frac{1}{2} + c$ to 1, where $0 < c < \frac{1}{2}$. If we neglect the last integral; in the first integral, replace $x(F)$ by its expansion (47) with $m = 6$, and then neglect all terms containing the remainder term $R_7$ (which is non-negative); and finally let $c$ approach $\frac{1}{2}$ and use Lemma 1, we are able to obtain a lower bound for the variance $\tilde{\mu}_2$ of $\tilde{x}$ of a normal parent distribution with unit variance, i.e.,

(49)  $\tilde{\mu}_2 > \frac{\pi}{2(2n+3)} + \frac{\pi^2}{4(2n+3)(2n+5)} + \frac{K}{(2n+3)(2n+5)(2n+7)}$,

where

(50)  $K = \frac{13\pi^3}{48}$.

Incidentally, for $n \ge 3$, we may expand the RHS of (49) into a power series in $(2n + 1)^{-1}$ and obtain an approximation for $\tilde{\mu}_2$:

(51)  $\tilde{\mu}_2 \approx \bar{\mu}_2\Big[1 + \frac{\pi/2 - 2}{2n+1} + \frac{4 - 3\pi + 13\pi^2/24}{(2n+1)^2}\Big]$,

where $\bar{\mu}_2$ is given by (2). In terms of standard deviations, and with the numerical values of the coefficients computed, (51) is equivalent to

(52)  $\tilde{\mu}_2^{1/2} \approx \Big(\frac{\pi}{2(2n+1)}\Big)^{1/2}\Big[1 - \frac{0.2146}{2n+1} - \frac{0.0624}{(2n+1)^2}\Big]$.

This agrees with a formula obtained by K. Pearson for the same purpose [8, p. 363].
B. Approximations for large samples. We have

(53)  $\tilde{\mu}_2 = 2\int_0^{\infty} x^2\, C_n [F(x)]^n [1 - F(x)]^n f(x)\, dx = 2\int_0^a + 2\int_a^{\infty} = I_1 + I_2$, say, with $a > 0$.

Since $F(x)[1 - F(x)] \le F(a)[1 - F(a)]$ if $x > a > 0$, and $C_n < (2\pi)^{-1/2}[1 + (2n)^{-1}](2n+1)^{1/2}\, 2^{2n+1}$, it follows that

(54)  $I_2/\bar{\mu}_2 \le (2/\pi)^{3/2}\big(1 + \tfrac{3}{2n}\big)(2n+1)^{3/2}\,[4F(a)(1 - F(a))]^n \int_a^{\infty} x^2 f(x)\, dx$.

In $I_1$, use $F$ as the independent variable, and replace $x(F)$ by $a_1(F - \tfrac{1}{2}) + a_3(F - \tfrac{1}{2})^3 + M_5(F - \tfrac{1}{2})^5$, where

$M_5 = \max_{0 \le x \le a} R_5 = (2\pi)^{5/2} A/5!$  and  $A = (7 + 46a^2 + 24a^4)\, e^{5a^2/2}$;

then

(55)  $I_1/\bar{\mu}_2 \le \frac{2n+1}{2n+3}\Big[1 + \frac{\pi}{2(2n+5)}\Big] + \Big(\frac{\pi}{2}\Big)^2\Big(\frac{1}{3!\,3!} + \frac{2A}{5!}\Big)\frac{3 \cdot 5}{(2n+5)(2n+7)} + \Big(\frac{\pi}{2}\Big)^3 \frac{2A}{3!\,5!}\,\frac{3 \cdot 5 \cdot 7}{(2n+5)(2n+7)(2n+9)} + \Big(\frac{\pi}{2}\Big)^4\Big(\frac{A}{5!}\Big)^2 \frac{3 \cdot 5 \cdot 7 \cdot 9}{(2n+5)\cdots(2n+11)}$.

Combining (49), (53), (54), and (55), we conclude:

First Approximation:

(56)  $\tilde{\mu}_2 \approx \frac{\pi}{2(2n+3)}$.

Second Approximation:

$\tilde{\mu}_2 \approx \frac{\pi}{2(2n+3)}\Big[1 + \frac{\pi}{2(2n+5)}\Big]$.

If the second approximation is used, an upper bound for the proportional error (defined to be |(True value/Approximation) $-$ 1|) is given by the sum of the RHS of (54) and the last three terms of that of (55). If the first approximation is used, then there is an additional error $\pi/2(2n+5)$, the second term in the bracket of the second approximation.
The following table is given for illustration. We choose successively for $a$: .35, .50, .65, .75. It is to be noted that the RHS of (54) is a decreasing function of $n$ for, e.g., $n \ge 25$, $a = .75$. The RHS of (55) is obviously also a decreasing function of $n$. Therefore what Table 1 means is, e.g., that for sample sizes $\ge 51$ (not just $= 51$), if the first approximation is used, then the proportional error is $< .092$, or explicitly: $1 \le$ (true value)/(approximation) $\le 1.092$.

TABLE 1. Proportional Error

Sample Size    First Approximation    Second Approximation
501            3.2 x 10^-3            6.8 x 10^-5
201            8.5 x 10^-3            7.8 x 10^-4
101            2.2 x 10^-2            6.9 x 10^-3
51             9.2 x 10^-2            6.3 x 10^-2
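As a quick editorial cross-check of Table 1 (added; not part of the original report), the true $\tilde{\mu}_2$ for sample size 51 can be computed by quadrature in the variable $F$ and compared with the two approximations; the tabulated entries are upper bounds, so the recomputed proportional errors should fall below them.

```python
# Editorial sketch: recompute the proportional errors behind Table 1 for
# sample size 51 (n = 25), using numerical quadrature for the true mu~_2.
from math import exp, lgamma, log, pi
from statistics import NormalDist

def true_mu2(n, steps=200_000):
    N = NormalDist()
    log_cn = lgamma(2 * n + 2) - 2 * lgamma(n + 1)   # log C_n
    total, h = 0.0, 1.0 / steps
    for i in range(steps):
        F = (i + 0.5) * h
        w = exp(log_cn + n * (log(F) + log(1 - F)))
        total += N.inv_cdf(F) ** 2 * w * h
    return total

n = 25
first = pi / (2 * (2 * n + 3))                 # (56)
second = first * (1 + pi / (2 * (2 * n + 5)))  # second approximation
mu2 = true_mu2(n)
err_first = abs(mu2 / first - 1)
err_second = abs(mu2 / second - 1)
```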
6. Normal distribution -- a different approach.

In this section a different method is used to derive upper and lower bounds for the variance of $\tilde{x}$ of a normal parent distribution with unit variance. We state

Theorem 4. Let $\tilde{x}$ and $\tilde{\mu}_2$ be respectively the sample median and its variance for a sample of size $2n + 1$ drawn from a normal distribution with unit variance. If $\bar{\mu}_2 = \pi/\{2(2n+1)\}$ is the variance of the asymptotic distribution of $\tilde{x}$, then

(57)  $B_n\big(1 - \tfrac{1}{2n+2}\big)^{3/2} < \tilde{\mu}_2/\bar{\mu}_2 < B_n\big(1 + \tfrac{1}{2n}\big)^{3/2}$,
n
all practical purpose
and n
~
4,
1
7n+3
1 + ~n 2
on
24n (2n+l)
(58)
<
1
en1 + 16n(8n-i)
Bn <
1 +
...
1
1 + 8n •
,
or
Bn
Proof. By using the following transformations consecutively,

(60)  $u = F(y)$,  $v = u - \tfrac{1}{2}$,

where $F(y) = \int_{-\infty}^{y} (2\pi)^{-1/2} e^{-x^2/2}\, dx$, we obtain

(61)  $\tilde{\mu}_2 = 2 C_n \big(\tfrac{1}{2}\big)^{2n} \int_0^{1/2} y^2 (1 - 4v^2)^n\, dv$.

Let

(62)  $v = \tfrac{1}{2}\big(1 - e^{-t^2/(2n+1)}\big)^{1/2}$;

then

(63)  $\tilde{\mu}_2 = 2 B_n \int_0^{\infty} (2\pi)^{-1/2}\, y^2\, e^{-(n+1)t^2/(2n+1)}\, h_1\big(t/(2n+1)^{1/2}\big)\, dt$,

where $h_1(t) = t\,(1 - e^{-t^2})^{-1/2} \ge 1$ for all $t \ge 0$ and $B_n = 2^{-2n} C_n\,[\pi/(2(2n+1))]^{1/2}$. Further, it is known [10] that

(64)  $\int_0^y (2\pi)^{-1/2} e^{-x^2/2}\, dx \le \tfrac{1}{2}\big(1 - e^{-2y^2/\pi}\big)^{1/2}$.

Using (64), it can be shown that $y^2 \ge \bar{\mu}_2 t^2$. Therefore it follows from (63) that

(65)  $\tilde{\mu}_2/\bar{\mu}_2 > B_n\big(1 - \tfrac{1}{2n+2}\big)^{3/2}$.

On the other hand, we have from (63),

(66)  $\tilde{\mu}_2 = 2 B_n \bar{\mu}_2 \int_0^{\infty} (2\pi)^{-1/2}\, t^2\, e^{-nt^2/(2n+1)}\, \tfrac{2}{\pi}\, y^2 h_2\big(t/(2n+1)^{1/2}\big)\, dt$,

where $h_2(t) = e^{-t^2}/\{t\,(1 - e^{-t^2})^{1/2}\}$. If we can show that $(2/\pi)\, y^2 h_2(t/(2n+1)^{1/2}) \le 1$ for all $t \ge 0$, then the upper bound in (57) follows.
To this end put

(68)  $g_0(y) = y^2 (1 - 4v^2)/(4v^2)$.

Since $e^{-\tau^2} = 1 - 4v^2$ and $\tau \ge (1 - e^{-\tau^2})^{1/2} = 2v$ for $\tau = t/(2n+1)^{1/2}$, we have $y^2 h_2(\tau) \le g_0(y)$. It can be seen that $\lim_{y \to 0} g_0(y) = \pi/2$. Hence it suffices for our purpose to show that $g_0(y)$ is decreasing. Let $'$ denote differentiation with respect to $y$. Then $g_0'(y)$ may be reduced, through formulas (69) and (70), to the study of auxiliary functions $g_1(y)$ and $g_2(y)$, the sign of $g_2'(y)$ being that of

(71)  $g_3(y) = \tfrac{\pi}{6}\, y\, e^{y^2} - e^{y^2/2} \int_0^y e^{-x^2/2}\, dx$.

It is known [10] that

(72)  $e^{y^2/2} \int_0^y e^{-x^2/2}\, dx = \sum_{n=0}^{\infty} \frac{y^{2n+1}}{1 \cdot 3 \cdots (2n+1)}$.

Hence

$g_3(y) = \sum_{n=0}^{\infty} \Big\{\frac{\pi}{6\, n!} - \frac{1}{1 \cdot 3 \cdots (2n+1)}\Big\}\, y^{2n+1}$.

It can be shown, by an argument similar to one used in [10] for a similar purpose, that

(73)  $g_3(y) = \sum_{i=0}^{\infty} a_i\, y^{2i+1}$,  where $a_0 < 0$ and $a_i > 0$, $i = 1, 2, \ldots$.

Hence there exists a $y_0 > 0$ such that $g_3(y) \le 0$ if $0 \le y \le y_0$ and $g_3(y) \ge 0$ if $y \ge y_0$. So, as $y$ increases from 0 to $\infty$, $g_2(y)$ decreases steadily from 0 to a minimum and then increases steadily. Consequently $g_1(y)$ first decreases steadily and then increases steadily. As $\lim_{y \to 0} g_1(y) = 0$ and $\lim_{y \to \infty} g_1(y) = 0$, it becomes clear that $g_1(y) \le 0$ for all $y \ge 0$. Therefore $g_0(y)$ is a decreasing function of $y$. This completes the proof.
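The bounds of Theorem 4 can be checked numerically. The sketch below (an editorial addition) takes $B_n = 1 + 1/(8n)$ as in (59) -- an approximation, so the comparison has some slack built into the bounds themselves -- and verifies (57) for several $n$.

```python
# Editorial check of Theorem 4 (added), taking B_n = 1 + 1/(8n) as in (59);
# the true mu~_2 is computed by quadrature as in section 5.
from math import exp, lgamma, log, pi
from statistics import NormalDist

def ratio_mu2(n, steps=200_000):
    """mu~_2 / mu-_2 for a standard normal parent, by midpoint quadrature."""
    N = NormalDist()
    log_cn = lgamma(2 * n + 2) - 2 * lgamma(n + 1)
    total, h = 0.0, 1.0 / steps
    for i in range(steps):
        F = (i + 0.5) * h
        total += N.inv_cdf(F) ** 2 * exp(log_cn + n * (log(F) + log(1 - F))) * h
    return total / (pi / (2 * (2 * n + 1)))

checks = []
for n in (4, 10, 25):
    b_n = 1 + 1 / (8 * n)                        # (59)
    lower = b_n * (1 - 1 / (2 * n + 2)) ** 1.5   # (57), lower bound
    upper = b_n * (1 + 1 / (2 * n)) ** 1.5       # (57), upper bound
    checks.append(lower < ratio_mu2(n) < upper)
```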
Finally we note that (58) is obtained by using

$(2\pi)^{1/2}\, n^{n+1/2}\, e^{-n + (12n+1)^{-1}} < n! < (2\pi)^{1/2}\, n^{n+1/2}\, e^{-n + (12n)^{-1}}$.

Remark 1. The lower bound for $\tilde{\mu}_2$ given by (49) is better than the one given by (57) if we use (59) for $B_n$. This is so even if the last term at the RHS of (49) is omitted. For

(74)  $\frac{\pi}{2(2n+3)} + \frac{\pi^2}{4(2n+3)(2n+5)} = \frac{\pi}{2(2n+1)}\Big[1 - \frac{(8 - 2\pi)n + 20 - \pi}{2(2n+3)(2n+5)}\Big]$.

Now if $n \ge 2$, the last term in the bracket of (74) is smaller in absolute value than $(2n + 2)^{-1}$, and $(1 + \frac{1}{8n})\big[1 - \frac{1}{2n+2}\big]^{1/2} \le 1$. Therefore the quantity in the bracket of (74) is greater than $(1 + \frac{1}{8n})(1 - \frac{1}{2n+2})^{3/2}$. For $n = 1$, direct comparison shows also that the LHS of (74) is greater than that of (57).

Remark 2. Since both lower bounds for $\tilde{\mu}_2$, (49) and (57), are smaller than $\bar{\mu}_2$, while the upper bound is greater than $\bar{\mu}_2(1 + 1/2n)$, we cannot be sure, just by comparing these bounds, that in using $\bar{\mu}_2$ as an approximation to $\tilde{\mu}_2$ the proportional error is less than $1/2n$. Therefore the second approximation given by (56) is better than $\bar{\mu}_2$ for large samples (Table 1).
7. Laplace distribution.

We shall now employ the same technique, used in § 6, to derive upper and lower bounds for the variance $\tilde{\mu}_2$ of the sample median $\tilde{x}$ of a sample of size $2n + 1$ drawn from a Laplace distribution with pdf

(75)  $f(x) = \tfrac{1}{2}\, e^{-|x|}$.

Clearly, the variance in this case of the asymptotic distribution of $\tilde{x}$ is

(76)  $\bar{\mu}_2 = \frac{1}{2n + 1}$.

We state

Theorem 5. If $\tilde{\mu}_2$ and $\bar{\mu}_2$ are as defined above, then

(77)  $B_n\big(1 - \tfrac{1}{2n+2}\big)^{3/2} < \tilde{\mu}_2/\bar{\mu}_2 < 1.51\, B_n\big(1 + \tfrac{1}{2n}\big)^{3/2}$,

where $B_n$ is given by (57) and (59).

Proof. It can be seen easily that $\tilde{\mu}_2$ is equal to the RHS of (63) if $v$ and $t$ satisfy (62) and
(78)  $v = \int_0^y \tfrac{1}{2}\, e^{-x}\, dx$.

We proved [4] that for all $y \ge 0$,

(79)  $v \le \tfrac{1}{2}\big(1 - e^{-y^2}\big)^{1/2}$.

From (62) and (79), we have $y^2 \ge \bar{\mu}_2 t^2$. Hence it follows from (63) that $\tilde{\mu}_2/\bar{\mu}_2$ has a lower bound given by (77). Further, from (63),

(80)  $\tilde{\mu}_2 = 2 B_n \bar{\mu}_2 \int_0^{\infty} (2\pi)^{-1/2}\, t^2\, e^{-nt^2/(2n+1)}\, y^2 h_2\big(t/(2n+1)^{1/2}\big)\, dt$,

where $h_2(t)$ is given by (66). We say that

(81)  $y^2 h_2\big(t/(2n+1)^{1/2}\big) \le 1.51$.

If this is true, then the RHS of (77) is an upper bound of $\tilde{\mu}_2/\bar{\mu}_2$, and the proof is completed.
To establish (81), we introduce, as in (68),

(82)  $g_0(y) = y^2 (1 - 4v^2)/(4v^2)$,

where $y$ and $v$ satisfy (78). For all $y \ge 0$, $g_0(y)$ is not smaller than the LHS of (81), and

(83)  $g_0(y) \to 1$ as $y \to 0$,  $g_0(y) \to 0$ as $y \to \infty$.

Let $'$ denote differentiation with respect to $y$. Then $g_0'(y)$ may be reduced, through formulas (84)-(88), to the study of auxiliary functions $g_1(y)$, $g_2(y)$, and $g_3(y)$.

If $f(x)$ is a function of $x$, and if, as $x$ increases from 0 to $\infty$, $f(x)$ varies from, e.g., positive to negative, and then back to positive, we will write, for simplicity: as $x\colon 0 \to \infty$, $f(x)\colon +, -, +$.
Now $g_3(y) \gtrless 0$ according as $y \gtrless \log 2$. We say that as $y\colon 0 \to \infty$, $g_2(y)\colon +, -, +$. Otherwise $g_2(y) \ge 0$ for all $y \ge 0$ (note $g_2(0) = 0$), so that $g_1(y) \ge 0$ for all $y \ge 0$ and $g_0(y)$ would be steadily increasing; this, however, contradicts (83). Now $g_1(0) = g_1(\infty) = 0$; hence, as $y\colon 0 \to \infty$, $g_1(y)\colon +, -$. Therefore we conclude that as $y\colon 0 \to \infty$, $g_0(y)$ increases steadily from 1 to a maximum, and then decreases steadily to 0. To find the maximum of $g_0(y)$, we first solve $g_1(y) = 0$, which is equivalent to

$2v(1 + 2v) - y = 0$.

Using the table [12], we obtain an approximate solution $y = 1.15$. The maximum of $g_0(y)$ is then found to be 1.51.
Remark. The variance of the sample mean (of a sample of size $2n + 1$) drawn from a Laplace distribution with pdf given by (75) is $2/(2n + 1)$. It follows from Theorem 5 that the sample median has smaller variance than the sample mean for sample sizes $2n + 1 \ge 7$. In a recent paper, A. E. Sarhan [11] found that for sample sizes equal to 2, 3, 4, and 5, the variance of the sample median is also smaller than that of the sample mean.
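The closing remark is easy to confirm by simulation. The sketch below (an editorial Monte Carlo illustration; replication count and seed are choices made here) estimates the variances of the sample median and the sample mean for Laplace samples of sizes 5 and 7.

```python
# Editorial Monte Carlo sketch (added) of the closing remark: for a Laplace
# parent the sample median has smaller variance than the sample mean even for
# small samples. Both estimators are centered at the true median/mean 0.
import random
from statistics import fmean, median

def laplace_draw(rng):
    """|X| ~ Exp(1) with a random sign gives the Laplace law (75)."""
    x = rng.expovariate(1.0)
    return x if rng.random() < 0.5 else -x

def estimator_vars(size, reps=100_000, seed=1954):
    rng = random.Random(seed)
    med2 = mean2 = 0.0
    for _ in range(reps):
        xs = [laplace_draw(rng) for _ in range(size)]
        med2 += median(xs) ** 2
        mean2 += fmean(xs) ** 2
    return med2 / reps, mean2 / reps

var_med5, var_mean5 = estimator_vars(5)
var_med7, var_mean7 = estimator_vars(7)
```

The sample-mean variance should come out near the exact value $2/(2n+1)$, e.g. $2/7$ for size 7.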
References

[1] T. J. I'A. Bromwich, An Introduction to the Theory of Infinite Series, Macmillan and Co., London, 1908.
[2] G. W. Brown and J. W. Tukey, "Some distributions of sample means", Ann. Math. Stat., Vol. 17 (1946), pp. 1-12.
[3] J. H. Cadwell, "The distribution of quantiles of small samples", Biometrika, Vol. 39 (1952), pp. 207-211.
[4] J. T. Chu, "On the distribution of the sample median", to be published.
[5] W. Feller, "On the normal approximation to the binomial distribution", Ann. Math. Stat., Vol. 16 (1945), pp. 319-329.
[6] T. Hojo, "Distribution of the median, etc.", Biometrika, Vol. 23 (1931), pp. 315-360.
[7] K. Knopp, Theory and Application of Infinite Series, Hafner Publishing Company, New York.
[8] K. Pearson, "On the standard error of the median, etc.", Biometrika, Vol. 23 (1931), pp. 361-363.
[9] K. Pearson, "On the mean character and variance of a ranked individual, etc.", Biometrika, Vol. 23 (1931), pp. 364-397.
[10] G. Pólya, "Remarks on computing the probability integral in one and two dimensions", Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1949, pp. 63-78.
[11] A. E. Sarhan, "Estimation of the mean and standard deviation by order statistics", Ann. Math. Stat., Vol. 25 (1954), pp. 317-328.
[12] Tables of the Exponential Function e^x, Applied Mathematics Series 14, National Bureau of Standards, Washington, D. C., 1951.