Chapter 8. The Concentration Index
Concentration curves can be used to identify whether socioeconomic inequality in
some health sector variable exists and whether it is more pronounced at one point
in time than another or in one country than another. But a concentration curve does
not give a measure of the magnitude of inequality that can be compared conveniently across many time periods, countries, regions, or whatever may be chosen
for comparison. The concentration index (Kakwani 1977, 1980), which is directly
related to the concentration curve, does quantify the degree of socioeconomic-related inequality in a health variable (Kakwani, Wagstaff, and van Doorslaer 1997;
Wagstaff, van Doorslaer, and Paci 1989). It has been used, for example, to measure
and to compare the degree of socioeconomic-related inequality in child mortality (Wagstaff 2000), child immunization (Gwatkin et al. 2003), child malnutrition
(Wagstaff, van Doorslaer, and Watanabe 2003), adult health (van Doorslaer et al.
1997), health subsidies (O’Donnell et al. 2007), and health care utilization (van
Doorslaer et al. 2006). Many other applications are possible.
In this chapter we define the concentration index, comment on its properties,
and identify the required measurement properties of health sector variables to
which it can be applied. We also describe how to compute the concentration index
and how to obtain a standard error for it, both for grouped data and for microdata.
Definition and properties
Definition
The concentration index is defined with reference to the concentration curve, introduced in chapter 7. The concentration index is defined as twice the area between
the concentration curve and the line of equality (the 45-degree line). So, in the case
in which there is no socioeconomic-related inequality, the concentration index is
zero. The convention is that the index takes a negative value when the curve lies
above the line of equality, indicating disproportionate concentration of the health
variable among the poor, and a positive value when it lies below the line of equality.
If the health variable is a “bad” such as ill health, a negative value of the concentration index means ill health is higher among the poor.
Formally, the concentration index is defined as
$$C = 1 - 2\int_0^1 L_h(p)\,dp. \tag{8.1}$$
The index is bounded between –1 and 1. For a discrete living standards variable, it
can be written as
$$C = \frac{2}{N\mu}\sum_{i=1}^{N} h_i r_i - 1 - \frac{1}{N}, \tag{8.2}$$
where $h_i$ is the health sector variable, $\mu$ is its mean, and $r_i = i/N$ is the fractional
rank of individual i in the living standards distribution, with i = 1 for the poorest
and i = N for the richest (see note 1). For computation, a more convenient formula for the concentration index defines it in terms of the covariance between the health variable
and the fractional rank in the living standards distribution (Jenkins 1988; Kakwani
1980; Lerman and Yitzhaki 1989),
$$C = \frac{2}{\mu}\operatorname{cov}(h, r). \tag{8.3}$$
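As a quick check that equations 8.2 and 8.3 agree, consider a hypothetical sample of N = 5 individuals ranked from poorest to richest, with health values $h_i = i$ (toy numbers, not taken from any survey). The sketch assumes that the covariance in equation 8.3 is the population covariance (divisor N); with a sample covariance (divisor N − 1) the two results would differ by the factor N/(N − 1).

```latex
\begin{align*}
\mu &= 3, \qquad r_i = \frac{i}{5}, \qquad
\sum_{i=1}^{5} h_i r_i = \frac{1+4+9+16+25}{5} = 11,\\
\text{(8.2):}\quad C &= \frac{2}{5\cdot 3}\cdot 11 - 1 - \frac{1}{5} = \frac{4}{15} \approx 0.267,\\
\operatorname{cov}(h,r) &= \frac{1}{5}\sum_{i=1}^{5}(h_i-3)\left(r_i-\tfrac{3}{5}\right)
  = \frac{1}{25}\,(4+1+0+1+4) = \frac{2}{5},\\
\text{(8.3):}\quad C &= \frac{2}{3}\cdot\frac{2}{5} = \frac{4}{15} \approx 0.267.
\end{align*}
```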
Note that the concentration index depends only on the relationship between
the health variable and the rank of the living standards variable and not on the
variation in the living standards variable itself. A change in the degree of income
inequality need not affect the concentration index measure of income-related health
inequality.
The concentration index summarizes information from the concentration curve
and can do so only through the imposition of value judgments about the weight
given to inequality at different points in the distribution. Alternative weighting
schemes implying different judgments about attitudes to inequality are considered
in the next chapter. Inevitably, the concentration index loses some of the information
that is contained in the concentration curve. The index can be zero either because
the concentration curve lies everywhere on top of the 45-degree line or because it
crosses the line and the (weighted) areas above and below the line cancel out. It is
obviously important to distinguish between such cases, and so the summary index
should be examined in conjunction with the concentration curve.
The sign of the concentration index indicates the direction of any relationship
between the health variable and position in the living standards distribution, and
its magnitude reflects both the strength of the relationship and the degree of variability in the health variable. Although this is valuable information, one may also
wish to place an intuitive interpretation on the value of the index. Koolman and
van Doorslaer (2004) have shown that multiplying the value of the concentration
index by 75 gives the percentage of the health variable that would need to be (linearly) redistributed from the richer half to the poorer half of the population (in the
case that health inequality favors the rich) to arrive at a distribution with an index
value of zero.
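As an illustration of this rule of thumb, take a hypothetical pro-rich concentration index of 0.10 (an assumed value chosen only for the arithmetic):

```latex
75 \times C = 75 \times 0.10 = 7.5
```

Under the linear redistribution scheme described above, roughly 7.5 percent of the health variable would have to be moved from the richer half to the poorer half of the population to bring the index to zero.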
Note 1: For large N, the final term in equation 8.2 approaches zero and it is often omitted.
Properties
The properties of the concentration index depend on the measurement characteristics of the variable of interest. Strictly, the concentration index is an appropriate measure of socioeconomic-related health (care) inequality when health (care)
is measured on a ratio scale with nonnegative values. The concentration index is
invariant to multiplication of the health sector variable of interest by any scalar
(Kakwani 1980). So, for example, if we are measuring inequality in payments for
health care, it does not matter whether payments are measured in local currency
or in dollars; the concentration index will be the same. Similarly, it does not matter whether health care is analyzed in terms of utilization per month or if monthly
data are multiplied by 12 to give yearly figures. However, the concentration index
is not invariant to every linear transformation of the variable of interest: adding a
constant to the variable will change the value of the concentration index. In many
applications this does not matter because there is no reason to make an additive
transformation of the variable of interest. There is one important application in
which this does represent a limitation, however. We are often interested in inequality in a health variable that is not measured on a ratio scale. A ratio scale has a true
zero, allowing statements such as “A has twice as much X as B.” That makes sense
for dollars or height. But many aspects of health cannot be measured in this way.
Measurement of health inequality often relies on self-reported indicators of health,
such as those considered in chapter 5. A concentration index cannot be computed
directly from such categorical data. Although the ordinal data can be transformed
into some cardinal measure and a concentration index computed for this (van
Doorslaer and Jones 2003; Wagstaff and van Doorslaer 1994), the value of the index
will depend on the transformation chosen (Erreygers 2005; see note 2). In cross-country comparisons, even if all countries adopt the same transformation, their ranking by the
concentration index could be sensitive to differences in the means of health that are
used in the transformation.
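The two invariance statements above follow directly from the covariance formula (equation 8.3). The short derivation below is a sketch in which a > 0 is an arbitrary scale factor and b an arbitrary additive constant (both symbols are introduced here for illustration), with µ + b assumed to be nonzero.

```latex
\begin{align*}
C(ah) &= \frac{2}{a\mu}\operatorname{cov}(ah, r) = \frac{2}{\mu}\operatorname{cov}(h, r) = C(h),\\
C(h+b) &= \frac{2}{\mu+b}\operatorname{cov}(h+b, r) = \frac{2}{\mu+b}\operatorname{cov}(h, r)
        = \frac{\mu}{\mu+b}\,C(h).
\end{align*}
```

Rescaling leaves the index unchanged, whereas adding a positive constant shrinks it toward zero by the factor µ/(µ + b). This is why the choice of cardinalization or origin matters for variables that lack a true zero.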
A partial solution to this problem would be to dichotomize the categorical
health measure. For example, one could examine how the proportion of individuals
reporting poor health varies with living standards. Unfortunately, this introduces
another problem. Wagstaff (2005) has demonstrated that the bounds of the concentration index for a dichotomous variable are not –1 and 1 but depend on the mean
of the variable. For large samples, the lower bound is µ − 1 and the upper bound is
1 − µ . So the feasible interval of the index shrinks as the mean rises. One should be
cautious, therefore, in using the concentration index to compare inequality in, for
example, child mortality and immunization rates across countries with substantial
differences in the means of these variables. An obvious response is to normalize
the concentration index by dividing through by 1 minus the mean (Wagstaff 2005).
If the health variable of interest takes negative as well as positive values, then its
concentration index is not bounded within the range of (–1,1). In the extreme, if the
mean of the variable were 0, the concentration index would not be defined.
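To see how the bounds for a dichotomous variable affect comparisons, consider two hypothetical populations with the same concentration index, C = 0.10, but different means (the values 0.2 and 0.8 are assumed purely for illustration). The normalized index $C_n$ is the one obtained by dividing through by 1 minus the mean, as suggested above.

```latex
\begin{align*}
\mu = 0.2:&\quad -0.8 \le C \le 0.8, \qquad C_n = \frac{C}{1-\mu} = \frac{0.10}{0.8} = 0.125,\\
\mu = 0.8:&\quad -0.2 \le C \le 0.2, \qquad C_n = \frac{C}{1-\mu} = \frac{0.10}{0.2} = 0.5.
\end{align*}
```

The same raw index uses up a much larger share of the feasible range when the mean is high, which the normalized index makes visible.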
Bleichrodt and van Doorslaer (2006) have derived the conditions that must
hold for the concentration index (and related measures) to be a measure of socioeconomic-related health inequality consistent with a social welfare function. They
argue that one condition—the principle of income-related health transfers—is
rather restrictive. Erreygers (2006) has derived an alternative measure of socioeconomic-related health inequality that is consistent with this condition and three others argued to be desirable.
Note 2: Erreygers (2005) suggests a couple of alternatives to the concentration index to deal with this problem.
Estimation and inference for grouped data
Point estimate of the concentration index
The concentration index for t=1,…,T groups is easily computed in a spreadsheet
program using the following formula (Fuller and Lury 1977):
$$C = (p_1 L_2 - p_2 L_1) + (p_2 L_3 - p_3 L_2) + \dots + (p_{T-1} L_T - p_T L_{T-1}), \tag{8.4}$$
where pt is the cumulative percentage of the sample ranked by economic status in
group t, and Lt is the corresponding concentration curve ordinate. To illustrate, consider the distribution of under-five mortality by wealth quintiles in India, 1982–92.
We drew the concentration curve for these data in chapter 7. Table 8.1 reproduces
table 7.1 with the terms in brackets in the formula above added to the final column.
The sum of these terms is –0.1694, which is the concentration index. The negative
concentration index reflects the higher mortality rates among poorer children.
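The spreadsheet calculation can also be sketched in Stata. The block below is a minimal illustration, not the authors' own code: it types in the births and deaths by wealth quintile from table 8.1, builds the cumulative shares, and sums the bracketed terms of equation 8.4. The variable names (p, Lh, term) and the scalar C are choices made here for the example.

```stata
* Grouped-data concentration index (equation 8.4) for the India data in table 8.1
clear
input str8 group births deaths
"Poorest" 29939 4632
"2nd"     28776 4400
"Middle"  26528 3170
"4th"     24689 2145
"Richest" 19739 1072
end

* Cumulative share of births (p) and of deaths (Lh), poorest group first
quietly summarize births
generate double p = sum(births)/r(sum)
quietly summarize deaths
generate double Lh = sum(deaths)/r(sum)

* Bracketed term for group t: p_t*L_{t+1} - p_{t+1}*L_t (missing for the last group)
generate double term = p*Lh[_n+1] - p[_n+1]*Lh

* The concentration index is the sum of the terms
quietly summarize term
scalar C = r(sum)
display "Concentration index = " %7.4f C
```

Run on these counts, the sum of the terms should reproduce the value of –0.1694 reported in the final column of table 8.1.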
Standard error
A standard error of the estimator of C in the grouped data case can be computed using
a formula given in Kakwani, Wagstaff, and van Doorslaer (1997). Let $f_t$ be the proportion of the sample in the tth group, and define the fractional rank of group t by
$$R_t = \sum_{k=1}^{t-1} f_k + \tfrac{1}{2} f_t, \tag{8.5}$$
which is the cumulative proportion of the population up to the midpoint of each
group interval. The variance of the estimator of C is given by
$$\operatorname{var}(\hat{C}) = \frac{1}{n}\left[\sum_{t=1}^{T} f_t a_t^2 - (1 + C)^2\right] + \frac{1}{n\mu^2}\sum_{t=1}^{T} f_t \sigma_t^2 \left(2R_t - 1 - C\right)^2, \tag{8.6}$$
where n is the sample size, $\sigma_t^2$ is the variance of the health variable in the tth group,
$\mu_t$ is its mean (and $\mu$ is the overall mean),
$$a_t = \frac{\mu_t}{\mu}\left(2R_t - 1 - C\right) + 2 - q_{t-1} - q_t, \quad\text{and}\quad q_t = \frac{1}{\mu}\sum_{k=1}^{t} \mu_k f_k,$$
which is the ordinate of $L_h(p)$, $q_0 = 0$, and $p_t = \sum_{k=1}^{t} f_k R_k$ (Kakwani, Wagstaff, and van
Doorslaer 1997).
Table 8.1 Under-Five Deaths in India, 1982–92

Wealth group   No. of births  Rel % births  Cumul % births  U5MR per 1,000  No. of deaths  Rel % deaths  Cumul % deaths  Conc. index
Poorest               29,939            23              23           154.7          4,632            30              30      –0.0008
2nd                   28,776            22              45           152.9          4,400            29              59      –0.0267
Middle                26,528            20              66           119.5          3,170            21              79      –0.0592
4th                   24,689            19              85            86.9          2,145            14              93      –0.0827
Richest               19,739            15             100            54.3          1,072             7             100       0.0000
Total/average        129,671                                         118.8         15,419                                    –0.1694

Source: Authors.
Case in which within-group variances are unknown. In many applications, the within-group variances will be unknown. For example, the data might
have been obtained from published tabulations by income quintile. In such cases,
it must be assumed that there is no within-group variance and the second term in
equation 8.6 is set to zero. However, in addition, n needs to be replaced by T in the
denominator of the first term because there are in effect only T observations, not n.
Table 8.2 gives an example using data on under-five mortality (rates per birth,
not rates per 1,000 births) from the 1998 Vietnam Living Standards Survey (VLSS).
The data were computed directly from the survey, with children being grouped
into household per capita consumption quintiles. The assumption made in table 8.2
is that the within-group variances in mortality are not known and are set to zero.
Below, we relax this assumption. The table, which is extracted from an Excel file,
shows the values for each quintile of R, q, a, and f·a² computed by substituting estimates for the parameters in the formula above. Also shown is the sum of f·a² across
the five quintiles. Substituting Σ f·a² = 0.680, C = –0.1841, and T = 5 into equation 8.6
gives 0.0029 for the variance of the estimate of C and hence a standard error equal
to 0.0537. The t-statistic for C is therefore –3.43.
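Written out, the substitution uses only the first term of equation 8.6 with T in place of n. The arithmetic below uses the rounded values quoted in the text, so the last digits can differ slightly from those obtained with unrounded spreadsheet values.

```latex
\begin{align*}
\operatorname{var}(\hat{C}) &\approx \frac{1}{T}\left[\sum_{t=1}^{T} f_t a_t^2 - (1+C)^2\right]
 = \frac{0.680 - (1 - 0.1841)^2}{5} = \frac{0.680 - 0.6657}{5} \approx 0.0029,\\
\operatorname{se}(\hat{C}) &\approx \sqrt{0.0029} \approx 0.054, \qquad
t = \frac{-0.1841}{0.0537} \approx -3.43.
\end{align*}
```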
Case in which within-group variances are known. In some cases, the
within-group variances will be known, and this provides us with more information. In effect, we move from having information only on the T group means to
having information on the full sample—albeit with the variation within the groups
being picked up only by the group standard deviations. One such scenario is the
case in which we are working with mortality data—the rates are defined at the
group level only, but the within-group standard deviations are reported (see note 3).
In such cases, n is used (rather than T) in the denominator of the first term in
equation 8.6, and the second term needs to be computed as well. Table 8.3 shows the
standard errors for each quintile’s under-five mortality rate from the Vietnam data.
The final column shows the value for each quintile of the term in the summation
operator in the second term of equation 8.6, as well as the sum of these across the
five quintiles. Dividing this sum through by nµ² gives 1.511e-6, which is the second
term of equation 8.6. Subtracting (1 + C)² from Σ f·a² and dividing through by n (= 5,315) gives 2.717e-6, which is
the first term in equation 8.6. The sum of the two terms is the variance, equal in this
case to 4.228e-6, giving a standard error of the estimate of C equal to 0.0021. This,
unsurprisingly, is substantially smaller than the standard error obtained assuming
no within-group variance.

Note 3: Or the standard errors of the group means are reported, from which estimates of the variances can be recovered provided group sizes are known.

Table 8.2 Under-Five Deaths in Vietnam, 1989–98 (within-group variance unknown)

Consumption group  No. of births  Cumul % births      R   U5MR  Cumul % deaths      CI      q      a   f·a²
Poorest                    1,002              19  0.094  0.060              31  –0.024  0.312  0.648  0.079
2nd                          949              37  0.278  0.034              48  –0.013  0.482  0.959  0.164
Middle                     1,002              56  0.461  0.041              69  –0.053  0.695  0.944  0.168
4th                        1,082              76  0.657  0.028              85  –0.095  0.854  0.842  0.144
Richest                    1,280             100  0.880  0.022             100   0.000  1.000  0.719  0.124
Total/average              5,315                         0.036                  –0.184                0.680

Source: Authors.

Table 8.3 Under-Five Deaths in Vietnam, 1989–98 (within-group variance known)

Consumption group  No. of births      R   U5MR      CI      q      a   f·a²  Std error  f·σ²(2R–1–C)²
Poorest                    1,002  0.094  0.060  –0.024  0.312  0.648  0.079      0.008      4.631E-06
2nd                          949  0.278  0.034  –0.013  0.482  0.959  0.164      0.006      4.354E-07
Middle                     1,002  0.461  0.041  –0.053  0.695  0.944  0.168      0.007      9.085E-08
4th                        1,082  0.657  0.028  –0.095  0.854  0.842  0.144      0.005      1.423E-06
Richest                    1,280  0.880  0.022   0.000  1.000  0.719  0.124      0.004      3.780E-06
Total/average              5,315         0.036  –0.184                0.680                 1.036E-05

Source: Authors.
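The grouped-data calculations in tables 8.2 and 8.3 can be sketched in Stata as follows. This is an illustrative reconstruction rather than the authors' Excel file: it starts from the rounded quintile values shown in the tables, so the results should agree with the reported figures only to about two or three significant digits. Following the final column of table 8.3, the reported standard error of each quintile's mortality rate is used in place of σ. All variable and scalar names are chosen here for the example.

```stata
* Grouped-data concentration index and standard error (equations 8.4-8.6)
clear
input births u5mr se
1002 0.060 0.008
 949 0.034 0.006
1002 0.041 0.007
1082 0.028 0.005
1280 0.022 0.004
end

quietly summarize births
scalar n = r(sum)
generate double f = births/n               // group share f_t
generate double R = sum(f) - 0.5*f         // fractional rank, equation 8.5
generate double fm = f*u5mr
quietly summarize fm
scalar mu = r(sum)                         // overall mean
generate double q = sum(fm)/mu             // concentration curve ordinate q_t
generate double p = sum(f)                 // cumulative population share

* Concentration index from equation 8.4
generate double term = p*q[_n+1] - p[_n+1]*q
quietly summarize term
scalar C = r(sum)

* First term of equation 8.6
generate double qlag = cond(_n==1, 0, q[_n-1])
generate double a = (u5mr/mu)*(2*R - 1 - C) + 2 - qlag - q
generate double fa2 = f*a^2
quietly summarize fa2
scalar first = (r(sum) - (1 + C)^2)/n
scalar se_T = sqrt((r(sum) - (1 + C)^2)/_N) // variances unknown: T replaces n

* Second term of equation 8.6
generate double t2 = f*se^2*(2*R - 1 - C)^2
quietly summarize t2
scalar second = r(sum)/(n*mu^2)
scalar se_C = sqrt(first + second)

display "C = " %7.4f C "  se (variances unknown) = " %7.4f se_T ///
    "  se (variances known) = " %7.4f se_C
```

With these rounded inputs the index should come out close to –0.184, and the two standard errors close to the 0.0537 and 0.0021 reported in the text.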
Estimation and inference for microdata
Point estimate of the concentration index
The concentration index (C) can be computed very easily from microdata by
using the “convenient covariance” formula (equation 8.3). If the sample is not self-weighted, weights should be applied in computation of the covariance, the mean of
the health variable, and the fractional rank. Given the relationship between covariance and ordinary least squares (OLS) regression, an equivalent estimate of the
concentration index can be obtained from a “convenient regression” of a transformation of the health variable of interest on the fractional rank in the living standards distribution (Kakwani, Wagstaff, and van Doorslaer 1997). Specifically,
$$2\sigma_r^2\left(\frac{h_i}{\mu}\right) = \alpha + \beta r_i + \varepsilon_i, \tag{8.7}$$
where $\sigma_r^2$ is the variance of the fractional rank. The OLS estimate of β is an estimate of the concentration index equivalent to that obtained from equation 8.3. This
method gives rise to an alternative interpretation of the concentration index as the
slope of a line passing through the heads of a parade of people, ranked by their living standards, with each individual’s height proportional to the value of his or her
health variable, expressed as a fraction of the mean.
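The equivalence between the convenient regression and the covariance formula can be verified in one line, using the standard result that the OLS slope in a bivariate regression is the covariance of the dependent variable with the regressor divided by the variance of the regressor:

```latex
\hat{\beta}
 = \frac{\operatorname{cov}\!\left(2\sigma_r^2\,\dfrac{h}{\mu},\; r\right)}{\sigma_r^2}
 = \frac{2\sigma_r^2}{\mu}\cdot\frac{\operatorname{cov}(h, r)}{\sigma_r^2}
 = \frac{2}{\mu}\operatorname{cov}(h, r) = C .
```

The equality is exact provided the same normalization (and the same weights) is used for the covariance and for the variance of the rank.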
Computation of the concentration index
To illustrate computation, we estimate the concentration index for the public subsidy to hospital outpatient care (subsidy) in Vietnam using data from the 1998
Vietnam Living Standards Survey (see chapter 14 and O’Donnell et al. [2007]). The
living standards variable is household consumption per equivalent adult (eqcons),
and the sample must be weighted (by variable weight).
By using equation 8.3 or 8.7, an estimate of a concentration index can be computed easily with any statistical package. The only slight complication involves computation of the fractional rank variable in the case that the data must be weighted.
The weighted fractional rank is defined as follows:
$$r_i = \sum_{j=0}^{i-1} w_j + \frac{w_i}{2}, \tag{8.8}$$
where $w_i$ is the sample weight scaled to sum to 1, observations are sorted in ascending order of living standards, and $w_0 = 0$. In Stata, this can be computed as follows:
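* Rank observations by living standards, breaking ties arbitrarily
egen raw_rank=rank(eqcons), unique
sort raw_rank
* Scale the weights to sum to 1
quietly sum weight
gen wi=weight/r(sum)
* Cumulative weight of all poorer observations (0 for the poorest)
gen cusum=sum(wi)
gen wj=cusum[_n-1]
replace wj=0 if wj==.
* Weighted fractional rank (equation 8.8)
gen rank=wj+0.5*wi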
where eqcons is the measure of living standards, weight is the original sample
weight, wi is a scaled version of this that sums to 1, wj is the first term in equation
8.8, and rank is $r_i$ in equation 8.8. Alternatively, the weighted fractional rank can be
generated using glcurve (see note 4):
glcurve eqcons [aw=weight], pvar(rank) nograph
Because weights are applied, the generated variable rank is the weighted fractional
rank. The rank variables produced by the two procedures will be perfectly correlated. The (weighted) mean of the rank produced from the first procedure will
always be exactly 0.5, and the mean of the rank produced by glcurve will differ
from 0.5 only at the 4th–5th decimal place.
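A small worked case shows why the weighted mean of the rank from equation 8.8 is exactly 0.5. Take three hypothetical observations, sorted from poorest to richest, with raw sample weights 1, 2, and 1 (values assumed only for illustration), so that the scaled weights are 0.25, 0.5, and 0.25:

```latex
\begin{align*}
r_1 &= 0 + \tfrac{0.25}{2} = 0.125, \qquad
r_2 = 0.25 + \tfrac{0.5}{2} = 0.5, \qquad
r_3 = 0.75 + \tfrac{0.25}{2} = 0.875,\\
\sum_{i=1}^{3} w_i r_i &= 0.25(0.125) + 0.5(0.5) + 0.25(0.875) = 0.5 .
\end{align*}
```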
By using equation 8.3, the concentration index can then be computed as follows (see note 5):
qui sum subsidy [fw=weight]
scalar mean=r(mean)
cor subsidy rank [fw=weight], c
sca c=(2/mean)*r(cov_12)
sca list c
By using equation 8.7, the index is computed as follows:
qui sum rank [fw=weight]
sca var_rank=r(Var)
gen lhs=2*var_rank*(subsidy/mean)
regr lhs rank [pw=weight]
sca c=_b[rank]
sca list c
Note 4: See chapter 7 for an explanation of glcurve.
Note 5: For the corr and sum commands, frequency weights (fw) should be used to get the correct variance of the weighted rank and its covariance with the health variable of interest. The weight variable must be an integer for the frequency weight to be accepted by Stata. If the weight variable is a noninteger, analysts will first need to multiply the weight by 10^k, where k is the largest number of decimal places in any value of the weight variable. A new integer weight variable will then have to be created using gen new_weight = int(weight). The alternative is to use analytic weights and accept some imprecision.
Both procedures give an estimate of the concentration index of 0.16700, indicating that the better-off receive more of the public subsidy to hospital outpatient care
in Vietnam.
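Readers without access to the survey data can confirm that the two routes coincide by using simulated data. The sketch below is an assumption-laden toy example: the data-generating process, the seed, and the sample size are arbitrary, no weights are used, and the variable names simply mirror those in the text.

```stata
* Toy check that equations 8.3 and 8.7 give the same estimate (unweighted data)
clear
set seed 12345
set obs 200
generate double eqcons  = exp(rnormal())            // simulated living standards
generate double subsidy = sqrt(eqcons)*runiform()   // simulated nonnegative health variable

* Unweighted fractional rank: midpoint of each observation's rank interval
sort eqcons
generate double rank = (_n - 0.5)/_N

* Equation 8.3: convenient covariance
quietly summarize subsidy
scalar mu = r(mean)
quietly correlate subsidy rank, covariance
scalar c_cov = (2/mu)*r(cov_12)

* Equation 8.7: convenient regression
quietly summarize rank
scalar var_rank = r(Var)
generate double lhs = 2*var_rank*(subsidy/mu)
quietly regress lhs rank
scalar c_reg = _b[rank]

display "covariance formula: " %8.5f c_cov "   regression: " %8.5f c_reg
```

The two displayed values should be identical up to floating-point precision, because the same divisor is used for the covariance and for the variance of the rank.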
Standard error
Kakwani, Wagstaff, and van Doorslaer (1997) derived the standard error of a concentration index estimated from microdata. They did this by noting that the concentration
index can be written as a nonlinear function of totals, and so the delta method (Rao
1965) can be applied to obtain the standard error. The resulting formula is essentially a simplified version of equation 8.6 without the second term because at the
individual level there is no within-group variation. Specifically,
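$$\operatorname{var}(\hat{C}) = \frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n} a_i^2 - (1 + C)^2\right], \tag{8.9}$$
where
$$a_i = \frac{h_i}{\mu}\left(2r_i - 1 - C\right) + 2 - q_{i-1} - q_i,$$
and
$$q_i = \frac{1}{\mu n}\sum_{j=1}^{i} h_j$$
is the ordinate of the concentration curve $L_h(p)$, and $q_0 = 0$.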
Unfortunately, equation 8.9 does not take into account sample weights and other
sample design features, such as cluster sampling (see chapters 2 and 9), although in
principle it could be adapted to do so. The formula can be computed easily in Stata.
We demonstrate this for the Vietnam subsidy example. First, we must recompute
the concentration index without the application of weights, as follows:
glcurve subsidy, sortvar(eqcons) pvar(ranku) glvar(ccurve) ///
    lorenz nograph
qui sum ranku
sca var_ranku=r(Var)
qui sum subsidy
sca meanu=r(mean)
gen lhsu=2*var_ranku*(subsidy/meanu)
regr lhsu ranku
sca conindu=_b[ranku]
The estimate of the concentration index from the unweighted data is 0.16606.
Standard errors are then computed using equation 8.9 as follows:
sort ranku
gen cclag = ccurve[_n-1]
replace cclag=0 if cclag==.
gen asqr=((subsidy/meanu)*(2*ranku-1-conindu)+2-cclag-ccurve)^2
qui sum asqr
sca var=(r(mean)-(1+conindu)^2)/r(N)
sca se=sqrt(var)
sca list conindu se
That gives a standard error of 0.033976, and so a t-ratio of 4.89.
The limitation of equation 8.9 is that it cannot be applied directly to data that are
weighted and/or do not have a simple random sample design. To take such sample
features into account, one option is simply to use the standard error of the coefficient on the rank variable in the convenient regression. Because this coefficient
is an estimate of the concentration index, one might expect its standard error to be
that of the concentration index. This is not quite correct because it takes no account
of the sampling variability of the estimate of the mean of the health variable that
enters the transformation giving the left-hand side of the convenient regression.
Note that the variance of the fractional rank, which is also used in the transformation, depends only on the sample size and so has no sampling variability (see note 6). It
can be treated as a constant. One computationally simple way of taking account of
the sampling variability of the mean is to run the convenient regression without
transforming the left-hand-side variable but (equivalently) transforming the rank
coefficient instead. A delta method standard error can then be computed for the
transformed coefficient that takes account of the sampling variability of all terms
used in the transformation. From the regression
$$h_i = \alpha_1 + \beta_1 r_i + u_i \tag{8.10}$$
the estimate of the concentration index is given by
$$\hat{\beta} = \left(\frac{2\sigma_r^2}{\hat{\mu}}\right)\hat{\beta}_1. \tag{8.11}$$
By using the facts that the least squares predicted value has the same mean
as the dependent variable and that the mean of the fractional rank variable is 0.5,
equation 8.11 can be written as
$$\hat{\beta} = \left(\frac{2\sigma_r^2}{\hat{\alpha}_1 + \dfrac{\hat{\beta}_1}{2}}\right)\hat{\beta}_1. \tag{8.12}$$
Because the estimate is now written as a function of the regression coefficients, a
standard error can be obtained by applying the delta method. In Stata, this procedure can be implemented very easily using nlcom.
regr subsidy rank [pw=weight]
nlcom ((2*var_rank)/(_b[_cons]+0.5*_b[rank]))*_b[rank]
For the Vietnam outpatient subsidy example, that gives a standard error of 0.034016
for the estimate of the (weighted) concentration index of 0.16700 reported above.
The standard error of the rank coefficient from equation 8.7 is 0.034945, and so it
appears that taking account of the sampling variability of the mean makes very
little difference. Experimentation suggests that this is generally the case, and so
standard errors from the convenient regression equation 8.7 can be used without
too much concern for inaccuracy.
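For readers who want to see what the delta method does with equation 8.12, the sketch below writes out the gradient of the transformation. Here g denotes the function in equation 8.12 and $\hat{V}$ the estimated variance-covariance matrix of $(\hat{\alpha}_1, \hat{\beta}_1)$ from the regression; both symbols are introduced here for illustration, and $\sigma_r^2$ is treated as a constant, as argued above.

```latex
\begin{align*}
g(\alpha_1, \beta_1) &= \frac{2\sigma_r^2\,\beta_1}{\alpha_1 + \beta_1/2},\\
\frac{\partial g}{\partial \alpha_1} &= -\frac{2\sigma_r^2\,\beta_1}{(\alpha_1 + \beta_1/2)^2},
\qquad
\frac{\partial g}{\partial \beta_1} = \frac{2\sigma_r^2\,\alpha_1}{(\alpha_1 + \beta_1/2)^2},\\
\widehat{\operatorname{var}}(\hat{\beta}) &\approx \nabla g^{\top}\,\hat{V}\,\nabla g ,
\end{align*}
```

with the gradient evaluated at the estimated coefficients.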
Note 6: This is due to the nature of the fractional rank variable. Its weighted mean is always 0.5, and its variance approaches 1/12 as n goes to infinity. For given n, the variance of the fractional rank is always the same.
In Stata, if weights are applied in the regression, then the standard error returned
will be robust to heteroskedasticity. If there are no weights, heteroskedasticity-robust
standard errors can be obtained by adding the option robust to the regression. If
this is done, then the delta method standard errors computed by a nlcom command
following the regression will also be robust. If the survey has a cluster sampling
design, then the standard errors should be corrected for within-cluster correlation.
This is achieved by adding the option cluster,
regr subsidy rank [pw=weight], cluster(commune)
nlcom ((2*var_rank)/(_b[_cons]+0.5*_b[rank]))*_b[rank]
where commune is the variable denoting the primary sampling unit—communes
in the VLSS. Allowing for within-cluster correlation raises the standard error in the
Vietnam subsidy example from 0.034016 to 0.041988.
Correcting for across-cluster correlation may or may not be necessary, depending on the sample design, but a form of serial correlation is always likely to be present owing to the rank nature of the regressor (Kakwani, Wagstaff, and van Doorslaer
1997). To correct the standard errors for this, one can use the Newey-West (Newey
and West 1994) variance-covariance matrix, which corrects for autocorrelation, as
well as heteroskedasticity. In Stata, the command newey produces OLS regression
coefficients with Newey-West standard errors. To use this, the data must be set to
a time series format with the time variable being, in this case, the living standards
rank. This must be an integer valued variable, and so the fractional rank created
above cannot be used. Below, we create the appropriate rank variable (ranki)
before running the newey command:
egen ranki=rank(eqcons), unique
tsset ranki
newey subsidy rank [aw=weight], lag(1)
nlcom ((2*var_rank)/(_b[_cons]+0.5*_b[rank]))*_b[rank]
Note that the (weighted) fractional rank and not the integer valued rank is still used
in the regression. Weights (analytical) are allowed, and the lag(#) option must be
included to specify the maximum number of lags to be considered in the autocorrelation structure. For our example, this estimator gives a standard error of 0.034568,
slightly larger than if we allow for heteroskedasticity only (0.034016), but smaller
than if we allow for within-cluster correlation (0.041988).
Demographic standardization of the concentration index
As discussed in chapter 5, we are often interested in measuring socioeconomic-related inequality in a health variable after controlling for the confounding effect
of demographics. In chapter 5 we explained how this can be done using both direct
and indirect methods of standardization. To estimate a standardized concentration index, one could use either method of standardization to generate a predicted
health variable purged of the influence of demographics across socioeconomic
groups, as explained in chapter 5, then compute the concentration index for this
standardized variable.
In the case that one wishes to standardize for the full correlation with confounders, and so there are no control (z) variables (see chapter 5), a shortcut method
of obtaining an indirectly standardized concentration index is simply to include
the standardizing variables directly in the convenient regression. This is precisely
what is being done in the literature that makes use of the relative index of inequality (e.g., Mackenbach et al. [1997]). From the regression
$$2\sigma_r^2\left(\frac{h_i}{\mu}\right) = \alpha_2 + \beta_2 r_i + \sum_j \delta_j x_{ji} + \upsilon_i, \tag{8.13}$$
where xj are the confounding variables, for example, age, sex, and so on, the OLS
estimate β̂2 is an estimate of the indirectly standardized concentration index. Computation requires simply adding the confounding variables to the regression commands discussed above.
Sensitivity of the concentration index to the living standards measure
In chapter 6 we described alternative measures of living standards—consumption,
expenditure, wealth index—and noted that it is not always possible to establish a
clear advantage of one measure over others. It is therefore important to consider
whether the chosen measure of living standards influences the measured degree
of socioeconomic-related inequality in the health variable of interest. When the
concentration index is used as a summary measure of inequality, the question is
whether it is sensitive to the living standards measure.
As noted above, the concentration index reflects the relationship between the
health variable and living standards rank. It is not influenced by the variance of the
living standards measure. In some circumstances, this may be considered a disadvantage. For example, it means that, for a given relationship between income and
health, the concentration index cannot discriminate the degree of income-related
health inequality in one country in which income is distributed very unevenly
from that in another country in which the income distribution is very equal. On
the other hand, when one is interested in inequality at a certain place and time, it is
reassuring that the differing variances of alternative measures of living standards
will not influence the concentration index. However, the concentration index may
differ if the ranking of individuals is inconsistent across alternative measures.
Wagstaff and Watanabe (2003) demonstrate that the concentration index will
differ across alternative living standards measures if the health variable is correlated with changes in an individual’s rank on moving from one measure to another.
The difference between two concentration indices C1 and C2, where the respective
concentration index is calculated on the basis of a given ranking (r1i and r2i)—for
example, consumption and a wealth index—can be computed by means of the
regression
$$2\sigma_{\Delta r}^2\left(\frac{h_i}{\mu}\right) = \alpha + \gamma\,\Delta r_i + \varepsilon_i, \tag{8.14}$$
where $\Delta r_i = r_{1i} - r_{2i}$ is the reranking that results from changing the measure of socioeconomic status, and $\sigma_{\Delta r}^2$ is its variance. The OLS estimate of γ provides an estimate
of the difference (C1 – C2). Significance of the difference between indices can be
tested by using the standard error of γ (see note 7).
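That the slope in equation 8.14 equals the difference between the two indices follows from the covariance formula (equation 8.3) applied to each ranking, together with the linearity of the covariance. A short sketch of the argument:

```latex
\begin{align*}
C_1 - C_2 &= \frac{2}{\mu}\operatorname{cov}(h, r_1) - \frac{2}{\mu}\operatorname{cov}(h, r_2)
           = \frac{2}{\mu}\operatorname{cov}(h, \Delta r),\\
\hat{\gamma} &= \frac{\operatorname{cov}\!\left(2\sigma_{\Delta r}^2\,\dfrac{h}{\mu},\; \Delta r\right)}{\sigma_{\Delta r}^2}
           = \frac{2}{\mu}\operatorname{cov}(h, \Delta r) = C_1 - C_2 .
\end{align*}
```

The difference is therefore zero whenever the health variable is uncorrelated with the change in rank, whatever the amount of reranking.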
Note 7: This ignores the sampling variability of the left-hand-side estimates.
For 19 countries, Wagstaff and Watanabe (2003) test the sensitivity of the concentration index for child malnutrition to the use of household consumption and a
wealth index as the living standards ranking variable. Malnutrition is measured by
a binary indicator of underweight and another for stunting (see chapter 4). For each
of underweight and stunting, the difference between the concentration indices is
significant (10%) for 6 of 19 comparisons. This suggests that in the majority of countries, child nutritional status is not strongly correlated with inconsistencies in the
ranking of households by consumption and wealth.

Table 8.4 Concentration Indices for Health Service Utilization with Household Ranked by Consumption and an Assets Index, Mozambique 1996/97

                         Consumption        Asset index       Difference   t-value for
                           CI   t-value       CI   t-value   CI_C – CI_AI   difference
Hospital visits         0.166      8.72    0.231     12.94         –0.065        –3.35
Health center visits    0.066      3.85   –0.136     –8.49          0.202         9.99
Complete immunizations  0.059      8.35    0.194     34.69         –0.135        –19.1
Delivery control        0.063     11.86    0.154     35.01         –0.091       –15.27
Institutional delivery  0.089     11.31    0.266     43.26         –0.176       –20.06

Source: Lindelow 2006.
But there is some evidence that concentration indices for health service utilization are more sensitive to the living standards measure. Table 8.4, reproduced from
Lindelow (2006), shows substantial and significant differences between the concentration indices (CI) for a variety of health services in Mozambique using consumption and an asset index as the living standards measure. In the case of consumption, the concentration index indicates statistically significant inequality in favor of
richer households for all services. With households ranked by the asset index rather
than consumption, the inequality is greater for all services except health center visits, for which the concentration index indicates inequality in utilization in favor of
poorer households.
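As a rough illustration of the kind of comparison reported in table 8.4, the sketch below computes a concentration index for one health variable twice, ranking households first by consumption and then by an asset index. It assumes the covariance form of the index, C = (2/mean(h)) cov(h, r), where r is the fractional rank in the living standards distribution. The function names and the eight-household data set are hypothetical and are not taken from Lindelow (2006); sample weights and standard errors are ignored.

```python
def fractional_rank(values):
    """Fractional rank r_i = (i - 0.5) / n, with i = 1 for the poorest unit.

    Ties are broken by input order (an assumption made for simplicity).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def concentration_index(health, living_standard):
    """Concentration index of `health` with units ranked by `living_standard`.

    Uses C = (2 / mean(h)) * cov(h, r) with the population (1/n) covariance.
    """
    n = len(health)
    r = fractional_rank(living_standard)
    mean_h = sum(health) / n
    mean_r = sum(r) / n
    cov_hr = sum((h - mean_h) * (ri - mean_r) for h, ri in zip(health, r)) / n
    return 2.0 * cov_hr / mean_h


# Hypothetical data: visits per household and two living standards measures.
visits = [0, 1, 1, 2, 2, 3, 3, 4]
consumption = [120, 150, 90, 300, 210, 260, 400, 520]
asset_index = [-1.2, 0.4, -0.8, -0.3, 1.1, 0.2, 0.9, 1.6]

ci_consumption = concentration_index(visits, consumption)
ci_assets = concentration_index(visits, asset_index)

print("CI, ranked by consumption:", round(ci_consumption, 3))
print("CI, ranked by asset index:", round(ci_assets, 3))
print("Difference:", round(ci_consumption - ci_assets, 3))
```

With real survey data, the two indices would be computed with sample weights, and the difference would be tested with an appropriate standard error rather than read off directly.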
It appears that the choice of welfare indicator can have a large and significant
impact on measured socioeconomic inequalities in a health variable, but it depends
on the variable examined. Differences in measured inequality reflect the fact that
consumption and the asset index measure different things, or at least are different
proxies for the same underlying variable of interest. But only in cases in which the
difference in rankings between the measures is also correlated with the health variable of interest will the choice of indicator have an important impact on the findings. In cases in which both asset and consumption data are available, analysts are
in a position to qualify any analysis of these issues by reference to parallel analysis
based on alternative measures. However, data on both consumption and assets are
often not available. In these cases, the potential sensitivity of the findings should be
explicitly recognized.
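The point that the choice of indicator matters only when reranking is correlated with the health variable can be checked numerically. Assuming the covariance form of the index, C = (2/mean(h)) cov(h, r), the gap between two indices equals (2/mean(h)) cov(h, Δr), where Δr is the change in each household's fractional rank between the two measures. The following minimal sketch uses hypothetical function names and made-up four-household data, without sample weights.

```python
def fractional_rank(values):
    """Fractional rank (i - 0.5) / n, poorest first; ties broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Population (1/n) covariance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def concentration_index(health, living_standard):
    """C = (2 / mean(h)) * cov(h, r)."""
    return 2.0 * covariance(health, fractional_rank(living_standard)) / mean(health)


def reranking_effect(health, standard_1, standard_2):
    """Difference C1 - C2, written as (2 / mean(h)) * cov(h, r1 - r2)."""
    r1 = fractional_rank(standard_1)
    r2 = fractional_rank(standard_2)
    delta_r = [a - b for a, b in zip(r1, r2)]
    return 2.0 * covariance(health, delta_r) / mean(health)


health = [1, 1, 2, 2]
measure_a = [1, 2, 3, 4]

# Households swap places within health groups: reranking is uncorrelated
# with health, so both measures give the same index.
measure_b = [2, 1, 4, 3]
gap_uncorrelated = reranking_effect(health, measure_a, measure_b)

# Healthier households drop to the bottom of the second ranking: reranking
# is correlated with health, so the two indices differ.
measure_c = [3, 4, 1, 2]
gap_correlated = reranking_effect(health, measure_a, measure_c)

print("Gap when reranking is unrelated to health:", gap_uncorrelated)
print("Gap when reranking is related to health:", gap_correlated)
```

In the first comparison every household changes rank, yet the index is unchanged; in the second, the same amount of reranking shifts the index because the households that move are systematically those with higher values of the health variable.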
References
Bleichrodt, H., and E. van Doorslaer. 2006. “A Welfare Economics Foundation for Health
Inequality Measurement.” Journal of Health Economics 25(5): 945–57.
Erreygers, G. 2005. Beyond the Health Concentration Index: An Atkinson Alternative to the Measurement of Socioeconomic Inequality of Health. University of Antwerp. Antwerp, Belgium.
Erreygers, G. 2006. Correcting the Concentration Index. University of Antwerp. Antwerp,
Belgium.
Fuller, M., and D. Lury. 1977. Statistics Workbook for Social Science Students. Oxford, United
Kingdom: Phillip Allan.
Gwatkin, D. R., S. Rutstein, K. Johnson, R. Pande, and A. Wagstaff. 2003. Initial Country-Level
Information about Socio-Economic Differentials in Health, Nutrition and Population, Volumes I
and II. Washington, DC: World Bank Health, Population and Nutrition.
Jenkins, S. 1988. “Calculating Income Distribution Indices from Microdata.” National Tax
Journal 41: 139–42.
Kakwani, N. C. 1977. “Measurement of Tax Progressivity: An International Comparison.”
Economic Journal 87(345): 71–80.
Kakwani, N. C. 1980. Income Inequality and Poverty: Methods of Estimation and Policy Applications. New York: Oxford University Press.
Kakwani, N. C., A. Wagstaff, and E. van Doorslaer. 1997. “Socioeconomic Inequalities in
Health: Measurement, Computation and Statistical Inference.” Journal of Econometrics
77(1): 87–104.
Koolman, X., and E. van Doorslaer. 2004. “On the Interpretation of a Concentration Index of
Inequality.” Health Economics 13: 649–56.
Lerman, R. I., and S. Yitzhaki. 1989. “Improving the Accuracy of Estimates of Gini Coefficients.” Journal of Econometrics 42(1): 43–47.
Lindelow, M. 2006. “Sometimes More Equal Than Others: How Health Inequalities Depend
upon the Choice of Welfare Indicator.” Health Economics 15(3): 263–80.
Mackenbach, J. P., A. E. Kunst, A. E. J. M. Cavelaars, F. Groenhof, and J. J. M. Geurts. 1997.
“Socioeconomic Inequalities in Morbidity and Mortality in Western Europe.” Lancet 349:
1655–59.
Newey, W. K., and K. D. West. 1994. “Automatic Lag Selection in Covariance Matrix Estimation.” Review of Economic Studies 61(4): 631–53.
O’Donnell, O., E. van Doorslaer, R. P. Rannan-Eliya, A. Somanathan, S. R. Adhikari, D. Harbianto, C. G. Garg, P. Hanvoravongchai, M. N. Huq, A. Karan, G. M. Leung, C-W Ng,
B. R. Pande, K. Tin, L. Trisnantoro, C. Vasavid, Y. Zhang, and Y. Zhao. Forthcoming. “The
Incidence of Public Spending on Health Care: Comparative Evidence from Asia.” World
Bank Economic Review.
Rao, C. R. 1965. Linear Statistical Inference and its Applications. New York: Wiley.
van Doorslaer, E., and A. M. Jones. 2003. “Inequalities in Self-Reported Health: Validation of
a New Approach to Measurement.” Journal of Health Economics 22.
van Doorslaer, E., C. Masseria, X. Koolman, and the OECD Health Equity Research Group.
2006. “Inequalities in Access to Medical Care by Income in Developed Countries.” Canadian Medical Association Journal 174: 177–83.
van Doorslaer, E., A. Wagstaff, H. Bleichrodt, S. Calonge, U. G. Gerdtham, M. Gerfin, J.
Geurts, L. Gross, U. Hakkinen, R. E. Leu, O. O’Donnell, C. Propper, F. Puffer, M. Rodriguez, G. Sundberg, and O. Winkelhake. 1997. “Income-Related Inequalities in Health:
Some International Comparisons.” Journal of Health Economics 16(1): 93–112.
Wagstaff, A. 2000. “Socioeconomic Inequalities in Child Mortality: Comparisons across
Nine Developing Countries.” Bulletin of the World Health Organization 78(1): 19–29.
Wagstaff, A. 2005. “The Bounds of the Concentration Index When the Variable of Interest Is
Binary, with an Application to Immunization Inequality.” Health Economics 14(4): 429–32.
Wagstaff, A., and E. van Doorslaer. 1994. “Measuring Inequalities in Health in the Presence
of Multiple-Category Morbidity Indicators.” Health Economics 3: 281–91.
Wagstaff, A., E. van Doorslaer, and P. Paci. 1989. “Equity in the Finance and Delivery of
Health Care: Some Tentative Cross-Country Comparisons.” Oxford Review of Economic
Policy 5(1): 89–112.
Wagstaff, A., E. van Doorslaer, and N. Watanabe. 2003. “On Decomposing the Causes of
Health Sector Inequalities, with an Application to Malnutrition Inequalities in Vietnam.”
Journal of Econometrics 112(1): 219–27.
Wagstaff, A., and N. Watanabe. 2003. “What Difference Does the Choice of SES Make in
Health Inequality Measurement?” Health Economics 12(10): 885–90.