. ON THE DISTRIBUTION OF TH:C: SAl-1PLE MEDIAN by John T. Chu Special report to the Office of Naval Research of work at Chapel Hill under Contract N1-onr-28402 for research in probability and statistics. Institute of Statistics Mimeograph Series No. 105 May 1954 On the Distribution of the Sample Median l by John T. Chu Institute of Statistics, University of North Carolina 1. Summary. Upper and lower bounds are obtained for the cumulative distribu- tion function of the sample median of a sample of size 2n+l drawn from a continuous population. It is shown that if the parent population is normal, then the distribution of the sample median tends "rapidly" to normality. Other kinds of parent populations are also discussed. 2. Introduction. Let a continuous population be given with cdf F(x) (cumula- tive distribution function) and median £ (assume it exists uniquely). sample of size 2n+l, let x denote the sample median. xis, p. 369_7 that the distribution of normal with mean £ and variance 2 0n Ll_7, It is well-Y~own £"2, under certain conditions, as~~ptotically = 1/4 the pdf (probability density function). £"4_7 and J. H. Cadwell For a 72 (2n+l) r-f(~) L _ where f(x) = F'(x) is Several authors, among them T. Rojo claimed that numerical investigations showed that, i f the parent population is normal, "the convergence (of the distribution of x) to normality is surprisingly fast." It seems, however, no mathematical proof or disproof has ever been given to this experimental result. It is the purpose of this paper to show mathematically that their findings are correct. ·Upper and. lower bounds are obtained for P r -x < (x - £) / - 0 n < y - 7 (Equations ( 23) and( 24)), and for large samples, these bounds are reduced to a simpler form, (25) and (26). By examining these bounds, it then becomes evident that the distribution of (x - £)/0n tends "rapidly" to normality. Rectangular and Laplace parent popUlations are briefly discussed. seems that in these cases the distribution of It (x - ~)/on tends to normality at a "much slower speed." 1 Special report to the Office of Naval Research of work at Chapel Hill under Contract N7-onr-28402 for research in probability and statistics. 2 3. Upper and Lower Bounds. Let F(x) and f(x) be respectively the cdf and pdf of a certain popula.tion whose median is~. If g(x) is the pdf of the sample median x of a. semple of size 2n+l, then (1) where C = (2n+l) !jn~n~. It is known 1-2, P. 369_7 that :l.f f( ~ ) n f' (x) is continuous in some neighborhood. of x normal distribution with mean~ =~ , :f - then x has a,n a.symptotica.lly and variance (2) For finite n, let the cdf of (i-~ ~ ! (3) 1 H(y) - "2 = 1 I I +ya )jan be H(x). g(t)dt = en j J ~ Then for any y > 0, , F(~ +yo n ) n 1 '2 ° and u n (l-u )n du 3 Applying to (3) the transformation v ( 4) = 1 L-l 2 1 - e- t 21(2n+l)72 ' we obtain without difficulty ,tl 1 ,( H(y) - -> a. B 2_ n I ___1_ e o (6) H(y) _12 _<bBn t2 2n+l f2$. I I _ -E:!:! t J o t 1 2n+l a h ( t ) dt, 2 /2n+l j2;c (8) =t dt, for 0.:: 8 .:: 1, _~ t 2 where h (t ) 2 ) /2n+l 2 1 h ( a -t 2 2 1 I( I-a -t)2 , for b > 1, 4 (10) (11) t 2 2 = L-(2n +1) log ( 1 _(4/b ) LF(~ 1 ~_72 t7 2 + yOn) - It can be shown (by differentiations) that hl(t) and h (t) are respectively 2 monotonically increasing and decreasing functions of t, when t Lim t -> 0 hl(t) = Lim t -:;> 0 h (t) 2 = 1. (12) H ( y) 1/ f - '2 ~ a Bn 1 - 2ii+2 Lr ¢ 1 .-- /2n+2) (tl,j 2ii+I - 1 '2 _ 7, where t (14) 11' (t) 1 ) -f2i = -00 In a similar way we can show that for arbitrary x (15) H(y) - H( -x) >a - B nv /;"'.~ 1 2n+2 0, and that Hence we obtain, from (5) and (6), :----_.1 ~ >0 and y > 0, 5 where t 3 y by -x. and t 4 are obtained from t l and t 2 in (10) and (11) by replacing It can be seen from (10) and (11) that if x and yare fixed, = t 4 = x + 0(1) for large n. In the following 3 sections we will show, for various kinds of parent distributions, that if t l =t2 =Y+ 0(1) and t a and b are properly chosen, then (15) and (16) remain valid if t , t ; and l 2 t , t are respectively replaced by y and x. 4 3 If n is large, upper and lower bounds for B can be obtained by using n W. Feller ["3_7 showed that for n ~ 4, Stirling's formula. where n~ = f21t IQ I ~ 1/6. n n+.l2 e ·1 (l+Q) -n+--l2n 360n 3 1+Q- If the last term, - ----- , is omitted, then it can be 360n 3 shown that (18) 1 + 1 trri 7n+3 <B<1+ 1 + 2 n En 24n (2n+l) 1 16n(8n-l) , or 4. Normal Parent Population Suppose that a sample of size 2n+l is drawn from a normal population - 2 with mean S and variance a • The distribution of x is then asymptotically normal with mean s and variance ~a2/2(2n+l). It has been shown that if for 6 x> 0, II> (X) _ <l?( -X) (20) = a(x) 2 2 Ll _ e- it X 1 7"2 , then a(x), a function of x, never exceeds 1 and is very close to 1 for all values of x > O. J. D. Williams L6_7 proved that a(x) .$ 1 and tabulated l/a(x) -1 for a number of values of x ranging from .1 to 2.0. G. P61ya L5_7 gave several proofs for the same inequality and remarked that if 2 2 - - (1 - e X 1 :n:)2 is used as an approximation to lI>(x) - II>(-x), "then the error committed is less than one per cent (even less than of the quantity approximated." (21) For arbitrary x (22) , a(x) > .9929 >0 and y x /2ii+I H(y) - H(-x) > Min > O. let , Applying (20) to (15) and ( 23) In other words, for all x > 0, .71 per cent) (16), it follows that r;. - 2ii+2 ;,. ( a(x ), a(y )} B n n n.J Ii( j2n+2) - <IJ(_x/2n+?, )7 2n+l ~2n+l - L .... Y.,.. where Bn , <l?(x), and a(x) are defined by (7), (14), and (20). we suggest to use For large n 7 /-- (26) 5. ~ H{y) - H{ -x) (I + §n) j 1 + ~ L4>(y) - 4>( -x)_7. Other Parent Populations. A. Rectangular Distribution. then s = !(c+d), Ir x > 0 and y and 0n2 > 0, Let rex) = (c-d)2/4(2n+l). = l/{d-c), where c < x < d, Let H(x) be the cdr or (x-s)/on . then lower bounds rorH(y) - H(-x) are the RHS (right- hand sides) or (23) and (25), without the factors Min (a(x ), a(y )J and n n .9929 and upper bounds for H(y) - H(-x) are the RHS of (24) and (26) with an additiona.l ractor Max (b(x ), bey ) J where b(x) is derined by n n b{x) =0 l-e -x , > 0, x and , (28) We note that b(x) is close to 1 only if x is close to 0, e. g., b(.l) = 1.02 and b(.2} = 1.05. This means: unless x and yare small, upper and lower bounds for H(y) - H{-x) are not very close to each other except for large n. B. where -00 Laplace Distribution. < x < 00, then s is Let rex) = 2~ the median and ~ _ .k.lJ e A , = A2/{2n+l). Define c{x) by 8 1 _ e- x We say that c{x) it. ~ = c{x) 2 1 (l _ e-x )2 1 : if x ~ , X > O. 1, this is obvious; if x ~ 1, use (20) to prove It then becomes clear that upper bounds for H{y) - H(-x) are the same as those corresponding to a normal parent population, i.e., (24) and (26). Lower bounds for H(y) - H(-x) are the RHS of (23) and (25) with Min (a{xn ), a(Yn») and .9929 replaced by Min (c(Xn ), c{Yn) J,:.wher$.c(x) is defined by (29) and x n = x//2n+l • Again we remark that c(x) is close to 1 if x is small or large, e. g., c(.l) = .99, c(.2) = .92, and c(3) while c(l) = .79 and c(2) = .97, = .87. Finally we note that since c(x) tends to 1 as x tends to 0, hey) - H(-x) tends to ¢(y) - ¢(-x) as n tends to infinity. (x-~)/a has an asymptotically normal distribution. n Therefore In the general theorem which showed that (x-~)/an has an asymptotically normal distribution (see ~2, p. 369_7 , also the beginning of § be continuous. 3), it is required that For a Laplace distribution, however, ft(~) ft(~) does not exist. 9 References ["1_7 J. H. Cadwell, "The distribution of quantiles of small samples," Biometrika, Vol. 39 (1952), pp. 207-211. ~2_7 H. Cramer, Mathematical Methods of Statistics, Princeton University Press, 1946. ["3J W. Feller, "On the normal approximation to the binomial distribution," Ann. Math. Stat., Vol. 16 (191~5), pp. 319-329. ~4_7 T. Hojo, "Distribution of the median, quartiles, and interquartile distance in s8.Illples from a normal population," Biometrika Vol. 23 (1931), pp. 315-360. "Remarks on computing the probability integral in one and two dimensions," Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1949, pp. 63-78. 1:6_7 J. D. Hil1iams, "An approximation to the probability integral," Ann. Math. Stat., Vol. 17 (1946), pp. 363-365.
© Copyright 2025 Paperzz