The Distribution of the Concentration Ratio for Samples from a

Applied Mathematics, 2015, 6, 57-70
Published Online January 2015 in SciRes. http://www.scirp.org/journal/am
http://dx.doi.org/10.4236/am.2015.61007
The Distribution of the Concentration Ratio
for Samples from a Uniform Population
Giovanni Girone, Antonella Nannavecchia
Faculty of Economics, University of Bari, Bari, Italy
Email: [email protected], [email protected]
Received 24 October 2014; revised 20 November 2014; accepted 16 December 2014
Copyright © 2015 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/
Abstract
In the present paper we derived, with direct method, the exact expressions for the sampling probability density function of the Gini concentration ratio for samples from a uniform population of
size n = 6, 7, 8, 9 and 10. Moreover, we found some regularities of such distributions valid for any
sample size.
Keywords
Gini Concentration Ratio, Uniform Distribution, Order Statistics, Probability Density Function
1. Introduction
In 1914 Corrado Gini [1] introduced the concentration ratio R for the measure of inequality among values of a
frequency distribution. The Gini index is widely used in fields as diverse as sociology, health science, engineering, and in particular, economics to measure the inequality of income distribution.
Various aspects of the Gini index have been taken into account. One of the most interesting topics regards the
estimation of the concentration ratio (Hoeffding, 1948 [2]; Glasser, 1962 [3]; Cucconi, 1965 [4]; Dall’Aglio,
1965 [5]). More recently, Deltas (2003) [6] discussed the sources of bias of the Gini coefficient for small samples. This has implications for the comparison of inequality among subsamples, some of which may be small,
and the use of the Gini index in measuring firm size inequality in markets with a small number of firms. Barret
and Donald (2009) [7] considered statistical inference for consistent estimators of generalized Gini indices. The
empirical indices are shown to be asymptotically normally distributed using functional limit theory. Moreover,
asymptotic variance expressions are obtained using influence functions. Davidson (2009) [8] derived an approximation for the estimator of the Gini index by which it is expressed as a sum of IID random variables. This
approximation allows developing a reliable standard error that is simple to compute. Fakoor, Ghalibaf and
Azarnoosh (2011) [9] considered nonparametric estimators of the Gini index based on a sample from length-biHow to cite this paper: Girone, G. and Nannavecchia, A. (2015) The Distribution of the Concentration Ratio for Samples
from a Uniform Population. Applied Mathematics, 6, 57-70. http://dx.doi.org/10.4236/am.2015.61007
G. Girone, A. Nannavecchia
ased distributions. They showed that these estimators are strongly consistent for the Gini index. Also, they obtained an asymptotic normality for the corresponding Gini index.
Girone (1968) [10] focused on the study of the sampling distribution of the Gini index and in 1971 [11] derived the exact expression for samples drawn from an exponential population. In 1971 Girone [12] obtained,
with direct method, the sampling distribution function of the Gini ratio for samples of size n ≤ 5 drawn from a
uniform population.
In the present note (Section 2), we calculate the joint probability density function (p.d.f.) of the random sample of size n and, then, the joint p.d.f. of the n order statistics. Hence, we transform one of the order statistics in
their average and the remaining n ‒ 1 order statistics are divided by the same average. We calculate the joint
p.d.f. of the new n variables and integrating with respect to the average we obtain the joint p.d.f. of the other n ‒
1 variables. One of these variables is transformed in the concentration ratio. We calculate the joint p.d.f. of the
concentration ratio and of the other n ‒ 2 variables and at last we integrate this p.d.f. with respect to the n – 2 variables obtaining the marginal p.d.f. of the concentration ratio. The main difficulty of this procedure consists in
the identification of the region of integration of the n ‒ 2 variables, for two reasons: firstly the need to decompose this region into subregions which allow identifying directly the limits of integration and secondly the
growing number of such subregions that makes the derivation heavy.
In Sections 3-7, using the software Mathematica, we derive the exact distributions of the concentration ratio
for samples from a uniform distribution of size n = 6, 7, 8, 9 and 10. Moreover (Section 8), we find some regularities of such distributions valid for any sample size.
2. The Procedure to Derive the Distribution of the Concentration Ratio
Let random variables X 1 , X 2 , , X n from a uniform population have p.d.f.
1, 0 < x < 1,
f ( x) = 
0, elsewhere.
(1)
The joint p.d.f. of the variables is
1,
h ( x1 , x2 , , xn ) = 
0,
0 < xi < 1, for i =
1, 2, , n,
(2)
elsewhere.
The joint p.d.f. of the order statistics X (1) , X ( 2 ) , , X ( n ) is

n !,
h x(1) , x( 2 ) , , x( n ) = 

0,
(
)
0 < x(1) < x( 2 ) <  < x( n ) < 1,
elsewhere.
(3)
By transforming the variables
S = X (1) + X ( 2 ) +  + X ( n ) ,
=
D( i )
X (i )
S
,=
for i 1, 2, , n − 1,
whose Jacobian is
J = S n −1 ,
we obtain the joint p.d.f. of the variables S and D(1) , D( 2 ) , , D( n −1) that can be written as
(
) ( n − 1)!s ,
) < s (1 − d ( ) − d ( ) −  − d ( ) ) < 1.
n −1
g s, d(1) , d( 2 ) , , d( n −1)=
for 0 < sd(1) < sd( 2 ) <  < sd( n −1
1
2
(4)
n −1
We integrate expression [4] with respect to the variable S and obtain the joint p.d.f. of the variables
D(1) , D( 2 ) , , D( n −1) that can be written as
58
(
)
f d(1) , d( 2 ) , , d( n −1) =
G. Girone, A. Nannavecchia
( n − 1)!
(
1 − d(1) − d( 2 ) −  − d( n −1)
)
n
(5)
,
for 0 < d(1) < d( 2 ) <  < d( n −1) < 1 − d(1) − d( 2 ) −  − d( n −1) .
By transforming the variable D( n −1) in the variable R i.e. the concentration ratio
2 n −1
R=
1−
∑ ( n − i ) D(i ) ,
n − 1 i =1
from which we get
=
D( n −1)
( n − 1)(1 − R )
2
n−2
− ∑ ( n − i ) D( i ) ,
i =1
the Jacobian of the transformation is
n −1
2
J=
and the joint p.d.f. of the variable R and D(1) , D( 2 ) , , D( n − 2 ) is
(
)
h d(1) , d( 2 ) , , d( n − 2 ) , R =
2
n
( n − 1)!
n−2


 2 − ( n − 1)(1 − R ) + 2 ∑ ( n − i − 1) d( i ) 
i =1


(6)
,
n
for
0 < d(1) < d( 2 ) <  < d( n − 2 ) <
( n − 1)(1 − R )
n−2
− ∑ ( n − i ) d(i ) < 1 −
2
( n − 1)(1 − R )
2
n−2
+ ∑ ( n − i − 1) d( i ) .
(7)
=i 1 =i 1
By integrating expression [6] with respect to the variables D(1) , D( 2 ) , , D( n − 2 ) over the regions determined
by inequalities [7], we get the marginal p.d.f. of the concentration ratio R.
3. The Distribution of the Concentration Ratio for n = 6
The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 1) of the concentration ratio
R for random samples of size n = 6:
g ( R) =
27
1

50  + R 
5

g ( R) = −
g ( R) =
125
+
144 R 2
2
−
2

25  + R 
5

7853
1

75  + R 
5


32
 1

75  − + R 
 5

g ( R) = −
1728
2
−
−
11131
+
240 R 2
27
 2

400  − + R 
 5

2
2
+
2
+
19683
3

25  + R 
5

449523
2

400  + R 
5


33773
1

75  + R 
5

437
 1

75  − + R 
 5

2
−
2
2
2
+
−
−
55296
4

25  + R 
5

677147
3

225  + R 
5


449523
2

400  + R 
5

11131
+
240 R 2
59
2
2
2
+
−
+
2 (1 + R )
1
, for 0 < R < ,
5
2
55296
4

25  + R 
5


2
19683
3

25  + R 
5

7853
1

75  + R 
5

3375
2
−
2
, for
, for
1728
2

25  + R 
5

2
1
2
<R< ,
5
5
2
3
<R< ,
5
5
, for
3
4
<R< ,
5
5
G. Girone, A. Nannavecchia
Figure 1. Probability density function of the concentration ratio R for random
samples of size n = 6 from a uniform population.
g ( R) =
1
 3

450  − + R 
 5

2
−
27
 2

400  − + R 
 5

2
+
32
 1

75  − + R 
 5

2
−
125
+
144 R 2
27
1

50  + R 
5

2
, for
4
< R < 1.
5
Characteristic values of the distribution are:
11696 log 2 243log 3 1250 log 5
mean E ( R ) =−3 +
+
−
=0.35222,
15
2
3
69056 log 2 24219 log 3 18125log 5
second moment E ( R 2 ) =
−
+
=
16 −
0.13716,
25
50
12
1368 678688log 2 156006 log 3 12625log 5
third moment E ( R 3 ) =
0.05785,
−
+
+
−
=
25
125
125
4
89237 25434496 log 2 7691436 log 3 15290 log 5
fourth moment E ( R 4 ) = −
−
+
=
0.02606,
625
3125
3125
3
standard deviation σ ( R ) = 0.11444,
index of skewness γ 1 ( R ) = 0.20793,
index of kurtosis γ 2 ( R ) = −0.16767.
The distribution of the concentration ratio R for samples of size n = 6 from a uniform population shows a
slight positive skewness and platykurtosis.
4. The Distribution of the Concentration Ratio for n = 7
The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 2) of the concentration ratio R
for random samples of size n = 7:
g ( R) = −
117649
1

518400  + R 
6

367653125
2
+
117649
2
−
1058841
2

3

1620  + R  640  + R 
6

6

1058841
1
, for 0 < R < ,
−
+
2
2
6
5
 100 (1 + R )
20736  + R 
6

60
2
+
3764768
4

405  + R 
6

2
G. Girone, A. Nannavecchia
Figure 2. Probability density function of the concentration ratio R for random samples
of size n = 7 from a uniform population.
g ( R) =
9
−
20 R 2
14356253
2
2450309
+
2
2099205
−
1

2

3

103680  + R  810  + R  128  + R 
6

6

6

49230947
367653125
1
2
+
−
, for < R < ,
2
2
6
6
4

5

1620  + R  20736  + R 
6

6

g ( R) = −
3125
2
+
7853
−
90 R 2
46303907
2
 1

1

10368  − + R 
25920  + R 
6
6




2099205
3764768
2
, for < R <
−
+
2
2
6
3

4

128  + R  405  + R 
6
6




g ( R) =
32
2
1064201
−
2
+
+
33773
−
90 R 2
2
7450309
2

810  + R 
6


2
3
,
6
46303907
 2

 1

1

405  − + R  51840  − + R 
25920  + R 
6
6
6






2450309
1058841
3
4
+
−
,
for < R < ,
2
2
6
6
2

3

810  + R  640  + R 
6

6

g ( R) = −
9
 3

1280  − + R 
 6

14356253
2
+
463
 2

324  − + R 
 6

117649
2
−
1064201
 1

51840  − + R 
 6

4
5
−
+
, for < R < ,
2
2
6
6
1

2

103680  + R  1620  + R 
6

6

61
2
+
2
7853
90 R 2
G. Girone, A. Nannavecchia
g ( R) =
1
2
9
−
2
32
+
 4

 3

 2

8100  − + R  1280  − + R 
405  − + R 
 6

 6

 6

9
117649
5
, for < R < 1.
+
−
2
6
20 R 2
1

518400  + R 
6

2
−
3125
 1

10368  − + R 
 6

2
Characteristic values of the distribution are:
7 35072 log 2 21797 log 3 359375log 5 823543log 7
mean E ( R ) =− +
−
+
+
=0.34951,
2
15
160
288
1440
763 3806128log 2 451251log 3 359375log 5 15647317 log 7
second moment E ( R 2 ) =
0.13291,
+
+
−
−
=
36
405
80
72
6480
third moment
18179 2771144 log 2 4266351log 3 18546875log 5 5764801log 7
E ( R3 ) =
−
−
−
+
+
=
0.05417,
216
135
320
1728
960
fourth moment
165193 40601588log 2 7638867 log 3 263234375log 5 895191241log 7
E ( R4 ) = +
+
+
−
=
0.02342,
648
1215
320
15552
77760
standard deviation σ ( R ) = 0.10367,
index of skewness γ 1 ( R ) = 0.18545,
index of kurtosis γ 2 ( R ) = −0.14535.
The distribution of the concentration ratio R for samples of size n = 7 from a uniform population shows slight
positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6.
5. The Distribution of the Concentration Ratio for n = 8
The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 3) of the concentration ratio R
for random samples of size n = 8:
g ( R) =
8192
2
−
2097152
2
+
663552
2
1

2

3

99225  + R  33075  + R 
245  + R 
7

7

7

128000000
169869312
137682944
+
−
+
,
2
2
2
2025 (1 + R )
5

6

1323  + R  1225  + R 
7
7


g ( R) = −
16807
+
86400 R 2
g ( R) =
81
9613463
2
−
2548367503
2
−
+
536870912
4

19845  + R 
7

1
for 0 < R < ,
7
2
386863168
2
1

2

3

66150  + R 
423360  + R 
6615  + R 
7

7

7

173031518287
19007619787
169869312
1
2
−
+
−
, for < R < ,
2
2
2
7
7
4

5

6

846720  + R 
66150  + R  1225  + R 
7

7

7

2
−
14356253
+
120960 R 2
31094051
2
−
18601017487
2
 1

1

2

490  − + R 
6615  + R  423360  + R 
7
7
7






1969741961
173031518287
128000000
2
3
, for < R < ,
+
−
+
2
2
2
7
7
3

4

5

13230  + R  846720  + R  1323  + R 
7

7

7

62
G. Girone, A. Nannavecchia
Figure 3. Probability density function of the concentration ratio R for random samples
of size n = 8 from a uniform population.
g ( R) = −
15625
2
274531
+
−
2
46303907
+
30240 R 2
268282153
2
 2

 1

1

254016  − + R  6615  − + R 
19845  + R 
 7

 7

7

18601017487
386863168
536870912
3
4
, for < R < ,
−
+
−
2
2
2
7
7
2

3

4

423360  + R  6615  + R  19845  + R 
7

7

7

g ( R) =
64
2
2472719
−
2
1254307
+
2
−
46303907
30240 R 2
 3

 2

 1

6615  − + R  423360  − + R  6615  − + R 
 7

 7

 7

31094051
2548367503
663552
4
5
, for < R < ,
+
−
+
2
2
2
7
7
1

2

3

6615  + R  423360  + R 
245  + R 
7

7

7

g ( R) = −
81
2
+
15949
2
−
2472719
2
+
 4

 3

 2

156800  − + R  66150  − + R  423360  − + R 
 7

 7

 7

14356253
9613463
2097152
5
−
+
−
, for < R <
2
2
7
120960 R 2
1

2

66150  + R  33075  + R 
7

7

g ( R) =
1
2
−
81
2
+
64
2
−
274531
 1

6615  − + R 
 7

6
,
7
15625
 5

 4

 3

 2

198450  − + R  156800  − + R  6615  − + R  254016  − + R 
 7

 7

 7

 7

81
16807
8192
6
, for < R < 1.
+
−
+
2
2
2
7
86400
R
 1

1

490  − + R 
99225  + R 
 7

7

Characteristic values of the distribution are:
63
2
2
G. Girone, A. Nannavecchia
3475456 log 2 2775303log 3 2421875log 5 3411821log 7
+
−
−
=0.34747,
315
560
1008
720
190 3184576 log 2 8776431log 3 4140625log 5 7882483log 7
−
−
+
+
=
0.12985,
second moment E R 2 =
7
63
392
392
360
third moment
6016 658405376 log 2 1492752159 log 3 129921875log 5 13596863log 7
E R3 =
−
+
+
−
−
=
0.05160,
49
5145
27440
5488
240
fourth moment
145475 3729880384 log 2 2307400911log 3 533984375log 5 9851303log 7
E R4 = −
−
+
+
=
0.02162,
343
15435
24010
14406
90
mean E ( R ) =−4 +
( )
( )
( )
standard deviation σ ( R ) = 0.09544,
index of skewness γ 1 ( R ) = 0.16867,
index of kurtosis γ 2 ( R ) = −0.12824.
The distribution of the concentration ratio R for samples of size n = 8 from a uniform population shows
slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6 and 7.
6. The Distribution of the Concentration Ratio for n = 9
The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 4) of the concentration ratio R
for random samples of size n = 9:
g ( R) = −
531441
2
531441
+
2
−
10460353203
2
+
2125764
2
1

2

3

4

20070400  + R  11200  + R  2867200  + R  35  + R 
8

8

8

8

1
41518828125
10460353203
437664515463
544195584
, for 0 < R < ,
−
+
−
+
2
2
2
2
8
5

6

7
 1225 (1 + R )
114688  + R  11200  + R  409600  + R 
8

8

8

g ( R) =
1024
−
14175 R 2
9819117949
2
+
15242823
2
−
1441266427931
2
+
857395628
1

2

3

4

77414400  + R  1600  + R  9289728  + R  945  + R 
8

8

8

8

6604248318741
2358496602787
437664515463
1
2
−
+
−
, for < R < ,
2
2
2
8
8
5

6

7

2867200  + R  907200  + R  409600  + R 
8

8

8

g ( R) = −
823543
2
+
9613463
−
75600 R 2
715845826021
2
+
1262942489
2
 1

1

2

11059200  − + R 
77414400  + R  8640  + R 
8
8
8






2582172507227 156844988749
6604248318741
10460353203
−
+
−
+
,
2
2
2
2
3

4

5

6

3096576  + R 
75600  + R  2867200  + R  11200  + R 
8

8

8

8

g ( R) =
81
2
 1

2240  − + R 
8


2971536239
+
2
2

8640  + R 
8

for
2
3
<R< ,
8
8
31094051 193012968877
−
2
7560 R 2
 2

1

3096576  − + R 
3096576  + R 
 8

8

3
4
2582172507227
857395628
41518828125
, for < R < ,
−
+
−
2
2
2
8
8
5

3

4

3096576  + R  945  + R  114688  + R 
8

8

8

−
183689449
2
2
+
64
G. Girone, A. Nannavecchia
Figure 4. Probability density function of the concentration ratio R for random samples
of size n = 9 from a uniform population.
g ( R) = −
78125
2
+
783437
−
2
2575720699
2
+
268282153
22680 R 2
 3

 2

 1

9289728  − + R  60480  − + R  3096576  − + R 
 8

 8

 8

4
5
1262942489
1441266427931
2125764
193012968877
, for < R < ,
−
+
−
+
2
2
2
2
8
8
4

1

2

3

3096576  + R  8640  + R  9289728  + R  35  + R 
8
8
8
8








g ( R) =
4
2
90129527
−
19593601
+
2
2
2575720699
−
2
 4

 3

 2

 1

4725  − + R  77414400  − + R  302400  − + R  3096576  − + R 
8
8
8
8








31094051
715845826021
15242823
10460353203
5
6
+
−
+
−
, for < R < ,
2
2
2
2
8
8
7560 R
1

2

3

77414400  + R  1600  + R  2867200  + R 
8

8

8

g ( R) = −
81
+
2251
2
−
90129527
2
+
783437
2
 4

 3

 2

75600  − + R  77414400  − + R  60480  − + R 
 8

 8

 8

6
7
9613463
9819117949
531441
−
+
−
+
, for < R < ,
2
2
2
2
8
8
 1
 75600 R
1

2

3096576  − + R 
77414400  + R  11200  + R 
 8

8

8

g ( R) =
 5

2867200  − + R 
 8

183689449
2
1
2
−
81
2
+
4
2
−
78125
2
 6

 5

 4

 3

6350400  − + R  2867200  − + R  4725  − + R  9289728  − + R 
 8

 8

 8

 8

81
823543
1024
531441
7
, for < R < 1.
+
−
+
−
2
2
2
2
8
14175
R
 2

 1

1

2240  − + R  11059200  − + R 
20070400  + R 
 8

 8

8

65
G. Girone, A. Nannavecchia
Characteristic values of the distribution are:
9 18425272 log 2 1948617 log 3 1953125log 5 109531219 log 7
− −
−
+
+
=
0.34589,
mean E ( R ) =
2
315
1120
576
5760
second moment
1083 31176874 log 2 209048769 log 3 150390625log 5 1101076991log 7
E R2 =
+
+
−
−
=
0.12754,
32
105
35840
9216
11520
third moment
43983 69161579 log 2 394860663log 3 947265625log 5 31654522291log 7
E R3 =
−
−
+
+
+
=
0.04969,
256
84
57344
24576
122880
fourth moment
2732815 58613649 log 2 36660941253log 3 36018359375log 5 739122430613log 7
E R4 =
+
−
−
−
=
0.02032,
4096
35
458752
589824
1474560
standard deviation σ ( R ) = 0.08889,
( )
( )
( )
index of skewness γ 1 ( R ) = 0.15559,
index of kurtosis γ 2 ( R ) = −0.11467.
The distribution of the concentration ratio R for samples of size n = 9 from a uniform population shows slight
positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6, 7 and 8.
7. The Distribution of the Concentration Ratio for n = 10
The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 5) of the concentration ratio R
for random samples of size n = 10:
g ( R) =
78125
1

10287648  + R 
9

30000000
−
6

7 + R
9

g ( R) = −
+
g ( R) =
2
+
−
2
7

52488  + R 
9

59049
+
2508800 R 2
2
−
5

9185400  + R 
9


2
34956443674391
2

1161216  + R 
9

2
−
+
239423780652283
5

9185400  + R 
9

2
234375
3

56  + R 
9

2
−
2
+
2
−
5120000000
4

45927  + R 
9

4613203125
1568 (1 + R )
29486161727807
2

2351462400  + R 
9

2
+
2
,
2
+
762939453125
5

734832  + R 
9

91871139
3

280  + R 
9

2
9819117949
133551832049
+
2
87091200 R 2
1

9185400  + R 
9

227124686777
3

68040  + R 
9

−
2
−
8684520419229
6

358400  + R 
9

2
4488452581986787
4

335923200  + R 
9

+
450375078125
7

52488  + R 
9

66
2
2
, for
2
1
for 0 < R < ,
9
−
282890791328125
4

94058496  + R 
9

78160683773061 1465749080718869 2621440000000
+
−
,
2
2
2
6

7

8

321489  + R 
64297800  + R 
793800  + R 
9

9

9

−
2
+
8

321489  + R 
9

1

64297800  + R 
9

112482309211387
2
2621440000000
6102360983
 1

1148175  − + R 
 9

+
2

321489  + R 
9

450375078125
32768
−
10000000
2
3
<R< ,
9
9
for
2
1
2
<R< ,
9
9
G. Girone, A. Nannavecchia
Figure 5. Probability density function of the concentration ratio R for random
samples of size n = 10 from a uniform population.
5764801
g ( R) = −
2
+
605566249
2
−
715845826021
87091200 R 2
 2

 1

335923200  − + R  9185400  − + R 
 9

 9

373815952381
167366716643141 2371170173011
+
−
+
2
2
2
1

2

3

1837080  + R  94058496  + R  340200  + R 
9

9

9

4488452581986787 112482309211387 30000000
3
4
−
+
−
, for < R < ,
2
2
2
9
9
4

5

6

335923200  + R 
9185400  + R 
7 + R
9
9
9






g ( R) =
3
2
−
1859135071
2
4305254531
+
2
 3

 2

 1

560  − + R  94058496  − + R 
1837080  − + R 
 9

 9

 9

193012968877 429101140253 167366716643141
−
+
−
2
2
3483648 R 2
1

2

918540  + R  94058496  + R 
9

9

762939453125
4
5
227124686777 282890791328125
+
, for < R < ,
+
−
2
2
2
9
9
4

5

3

68040  + R  94058496  + R  734832  + R 
9

9

9

g ( R) = −
78125
2
+
195677
2
−
29538353821
2
 4

 3

 2

94058496  − + R  68040  − + R  94058496  − + R 
 9

 9

 9

12705254531
193012968877
373815952381
+
−
+
2
2
3483648 R 2
 1

1

1837080  − + R 
1837080  + R 
 9

9

34956443674391
91871139
5120000000
5
6
−
+
−
, for < R < ,
2
2
2
9
9
2

3

4

94058496  + R 
280  + R  45927  + R 
9

9

9

67
G. Girone, A. Nannavecchia
g ( R) =
64
 5

1148175  − + R 
 9

 2

94058496  − + R 
 9

133551832049
+
1

9185400  + R 
9

g ( R) = −
−
g ( R) =
404606309
−
 4

2351462400  − + R 
 9

29538353821
−
+
2
2
 6

2508800  − + R 
 9

 3

68040  − + R 
 9

9819117949
+
87091200 R 2
2
2
−
 1

1837080  − + R 
 9

+
6102360983
1

64297800  + R 
9


 4

94058496  − + R 
 9

32768
 1

1148175  − + R 
9


2
2
−
2
+
−
+
2
 5

64297800  − + R 
 9

 2

94058496  − + R 
 9

78125
−
2
180731
1859135071
 7

257191200  − + R 
 9

+
4305254531
2

2351462400  + R 
9

1
−
+
29486161727807
−
3
195677
2
2
−
2
2
+
779623
 3

48600  − + R 
 9

234375
3

56  + R 
9

−
 6

2508800  − + R 
 9

3
 3

560  − + R 
 9

2
−
2
,
for
6
7
<R< ,
9
9
404606309
 4

2351462400  − + R 
 9

2
605566249
 1

9185400  − + R 
 9

2

321489  + R 
9


2
2
715845826021
87091200 R 2
10000000
3
59049
+
2508800 R 2
+
2
+
2
,
for
2
7
8
<R< ,
9
9
64
 5

1148175  − + R 
 9

2
5764801
 2

335923200  − + R 
 9

78125
1

10287648  + R 
9


2
,
2
for
8
< R < 1.
9
Characteristic values of the distribution are:
686419424 log 2 88683579 log 3 60546875log 5 40353607 log 7
mean E ( R ) =−5 +
−
+
−
=0.34462,
2835
1120
6048
810
second moment
4977475650505 5506009449897293log 2 3870789417061723log 3
E R2 =
−
+
−
1666598976
57868020
115736040
23520914453125log 5 155338805614063log 7
+
−
=
0.12574,
5143824
8266860
third moment
56500 302665066912 log 2 1609375311log 3
E R3 =
−
+
−
243
76545
1120
99482421875log 5 133852914419 log 7
0.04823,
+
−
=
489888
174960
fourth moment
( )
( )
68
G. Girone, A. Nannavecchia
6581554 10398297681152 log 2 1892901777 log 3 10382748828125log 5
E R4 = −
+
−
6561
1240029
560
19840464
1189583980753log 7
0.01933,
+
=
787320
standard deviation σ ( R ) = 0.08352,
index of skewness γ 1 ( R ) = 0.14505,
index of kurtosis γ 2 ( R ) = −0.10366.
The distribution of the concentration ratio R for samples of size n = 10 from a uniform population shows
slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6, 7,8 and 9 .
( )
8. Some Regularities of the Distributions
The analysis of the p.d.f. for n = 2,3, ,10 shows some regularities:
● The p.d.f. of the concentration ratio R, for 0 < R < 1 n and for samples of size n, can be expressed by
( −1)
1+ i
n −1
g ( R) = ∑
i =1
 n ( n − 1) 
 n−i

( n − 1)!  n − 1 + R 


2
( n − 1)
● Furthermore, the p.d.f. of the concentration ratio R, for
expressed by
n −1
g ( R) = ∑
( −1)
i =1
 n − 2


 i −1 ;
n −1
n +1− i
(n +1− i)
n −1
2
n < R < 1 and for samples of size n, can be
 n − 2


 i −1 ;
2  2−i

+ R
( n − 1)! 
 n −1

2
● The density of the concentration ratio R, for 0 < R < 1 n and for samples of size n, is given by
1
∫0n g ( R ) dR =
● The density of the concentration ratio R, for
( n − 1)
nn−2
( n − 1)!
2
;
n < R < 1 and for samples of size n, is given by
1
1
∫nn−1 g ( R ) dR =
( n − 1)!
2
;
● The jth term of the density of the concentration ratio R, denoted as ai , j , verifies the following symmetry
ai , j = a j ,i .
The coefficients of the ai ,i terms of the p.d.f. of the concentration ratio R for samples of size n − 1 multiplied by ( n − 1) n become the coefficients of the ai +1,i +1 terms of the same p.d.f. for sample of size n.
These results are valid for every sample size and may allow reducing the heavy calculation to determine the
p.d.f. of the concentration ratio R.
9. Concluding Remarks
In the present paper we obtain the distributions of the Gini concentration ratio R for samples of size
n = 6, 7,8,9 and 10 drawn from a uniform population. We use the same method used by Girone [12] to derive
the same distributions for samples of size n ≤ 5 . We obtain the p.d.f. of the concentration ratio R calculating a
for k 1, 2, , n − 1 .
multiple integral in n − 1 dimensions for each region from ( k − 1) ( n − 1) to k ( n − 1) =
The limits of integration are defined by solving the inequalities of the order statistics divided by the sample
69
G. Girone, A. Nannavecchia
mean and expressed in terms of the concentration ratio R for the values assumed in each of such regions. The
calculation of the limits of integration is particularly heavy and requires a very long processing time.
The obtained results show that the p.d.f. of the concentration ratio R is given by hyperbolic splines with defor k 1, 2, , n − 1 . Such distributions are unimodal with mean tending to
gree 2 and with nodes in k ( n − 1) =
1 3 , which is the value of the concentration ratio R for the population, and have decreasing standard deviation.
Moreover, the distributions show a slight positive skewness and platykurtosis that tend to decrease as n increases.
Beyond the possibility to obtain similar results for samples of larger size, open problems are the derivation of
the exact expression for the mean and the other features of the distribution of the concentration ratio R for random samples of size n drawn from a uniform population.
References
[1]
Gini, C. (1914) L’ammontare e la composizionedellaricchezzadellenazioni. Bocca, Torino.
[2]
Hoeffding, W. (1948) A Class of Statistics with Asymptotically Normal Distribution. Annals of Mathematical Statistics,
19, 293-325.
[3]
Glasser, G.J. (1962) Variance Formulas for the Mean Difference and the Coefficient of Concentration. Journal of the
American Statistical Association, 57, 648-654. http://dx.doi.org/10.1080/01621459.1962.10500553
[4]
Cucconi, O. (1965) Sulla distribuzionecampionaria del rapporto R di concentrazione. Statistica, 25, 119.
[5]
Dall’Aglio, G. (1965) Comportamentoasintoticodellestimedelladifferenza media e del rapporto di concentrazione. Metron, 24, 379-414.
[6]
Deltas, G. (2003) The Small-Sample Bias of the Gini Coefficient: Results and Implications for Empirical Research.
Review of Economics and Statistics, 85, 226-234. http://dx.doi.org/10.1162/rest.2003.85.1.226
[7]
Barrett, G.F. and Donald, S.G. (2009) Statistical Inference with Generalized Gini Indices of Inequality, Poverty, and
Welfare. Journal of Business & Economic Statistics, 27, 1-17. http://dx.doi.org/10.1198/jbes.2009.0001
[8]
Davidson, R. (2009) Reliable Inference for the Gini Index. Journal of Econometrics, 150, 30-40.
http://dx.doi.org/10.1016/j.jeconom.2008.11.004
[9]
Fakoor, V., Ghalibaf, M.B. and Azarnoosh, H.A. (2011) Asymptotic Behaviors of the Lorenz Curve and Gini Index in
Sampling from a Length-Biased Distribution. Statistics and Probability Letters, 81, 1425-1435.
http://dx.doi.org/10.1016/j.spl.2011.04.013
[10] Girone, G. (1968) Sulcomportamentocampionariosimulato del rapporto di concentrazione. Annalidella Facoltà di
Economia e Commerciodell’Universitàdegli Studi di Bari, 23, 5-11.
[11] Girone, G. (1971) La distribuzione del rapporto di concentrazione per campionicasuali di variabiliesponenziali. Studi di
Probabilità, Statistica e Ricercaoperativa in onore di Giuseppe Pompilj, Oderisi, Gubbio.
[12] Girone, G. (1971) La distribuzione del rapporto di concentrazione per piccolicampioniestratti da unapopolazioneuniforme. Annalidell’Istituto di Statisticadell’Universitàdegli Studi di Bari, 36, 31-52.
70

Download Report

The Distribution of the Concentration Ratio for Samples from a

Paperzz.com

Your Paperzz