Intensive and Extensive Poverty: A Multidimensional Formulation

Ximing Wu∗
November, 2006
Abstract
A fundamental difficulty for multidimensional poverty analysis is the identification
of the poor. This paper proposes two multidimensional poverty indices: the intensive and extensive poverty, which correspond respectively to the most exclusive and
inclusive definition of poverty defined according to multiple attributes. Bounds on the
intensive and extensive poverty are provided. Drawing analogy from choice under uncertainty, we relate these indices to the minimin and minimax poverty criterion of a
social planner. We also show that they can be used to establish sufficient conditions for
the first order stochastic dominance of multivariate distributions. Since these conditions are rather demanding, we further propose an aggregate poverty index constructed
as a convex combination of the intensive and extensive poverty, which explicitly accounts for the dimensionality effect. Because it effectively separates individual risk
aversion and social poverty aversion, it can be used for socially optimal poverty policy
that considers both individual welfare and poverty externality. An illustrative example
on U.S. income and education distribution is presented.
1 Introduction
Since the seminal work of Sen (1976), there has been a renewed interest in the measurement
of poverty. During the past three decades, the literature on poverty measurement has progressed considerably. For example, see Foster et al. (1984), Foster and Shorrocks (1988),
Atkinson (1987, 1989), Chakravarty (1990) and Zheng (1997). At the same time, it has been
∗ Department of Agricultural Economics, Texas A&M University. Email: [email protected].
increasingly recognized that poverty measurement based on a single attribute such as income or wealth is inadequate. An alternative basic needs approach contends that individual
wellbeing and social welfare depend on the distribution of not only income or wealth, but
also other attributes such as health, longevity and literacy. However, markets for all basic
needs do not always exist. Hence an “equivalent income” does not necessarily reflect the
true level of individual wellbeing. In contrast, multidimensional analyses can often provide
a more complete view on the overall social welfare. See Maasoumi (1999) for an overview
of multidimensional welfare analysis. Since direct comparison of multivariate distributions
is inherently difficult due to the curse of dimensionality, poverty index of multidimensional
distribution has often been used instead. Duclos et al. (2001), Bourguignon and Chakravarty
(2002), Tsui (2002) and Maasoumi and Lugo (2006) discuss various approaches to construct
multivariate poverty indices. Deutsch and Silber (2005) investigate this issue empirically.
General reviews can be found in Atkinson (2003) and Bibi (2005).
The measurement of poverty typically involves two steps: the identification of the poor
and the aggregation of the data on poverty into an overall index. Regarding a single attribute,
the former determines the poverty line or minimum basic needs; the latter specifies the actual
construction of poverty index given the poverty line.1 Given an attribute distribution F
defined on R+ and a poverty line 0 < z < ∞, a general poverty index can be constructed as
$$p(z;\phi) = \int_0^z \phi(x;z)\,dF(x), \tag{1}$$
where φ (x; z) is a general function of disutility or deprivation. This poverty index measures
the integrated shortfalls below the poverty line. Most commonly used poverty indices can
be constructed this way. For example, if φ (x; z) = 1, p (z; φ) is the headcount index; if
φ (x; z) = 1 − x/z, p (z; φ) is the normalized deficit index; and more generally, if φ (x; z) =
(1 − x/z)^ε with ε ≥ 0, p (z; φ) is the general decomposable poverty index proposed by Foster
et al. (1984).
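The special cases listed above can be illustrated with a minimal sketch of the sample analogue of (1). It assumes the data are a plain list of nonnegative attribute values with equal weights; the function name `fgt_index` and the numbers are hypothetical and serve only as an illustration.

```python
def fgt_index(values, z, eps=0.0):
    """Sample analogue of (1) with phi(x; z) = (1 - x/z)**eps for x <= z."""
    if z <= 0:
        raise ValueError("the poverty line z must be positive")
    total = 0.0
    for x in values:
        if x <= z:  # only observations at or below the line contribute
            total += (1.0 - x / z) ** eps
    return total / len(values)


# Hypothetical incomes and poverty line, for illustration only.
incomes = [2.0, 4.0, 6.0, 10.0, 20.0]
headcount = fgt_index(incomes, z=8.0, eps=0)  # phi = 1
deficit = fgt_index(incomes, z=8.0, eps=1)    # phi = 1 - x/z
```

Setting `eps=0` counts the share of observations at or below the line, while larger values of `eps` put more weight on larger shortfalls.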
The construction of a poverty index in a multidimensional context is more difficult. Given a
poverty line $z = \{z_k\}_{k=1}^K$, K ≥ 2, the poor are generally not uniquely identified: an individual
is obviously poor if all her attributes are below the poverty line; on the other hand, it is not
clear whether she shall be counted as poor if only a strict subset of her attributes are below
the poverty line. A simple solution is to compare each attribute separately. Alternatively, one
can aggregate multiple unidimensional indices into a single index. However, the associations
among different attributes are not taken into account when these methods are used. Consider
1 In this text, we use the term “poverty line” in a broad sense for the standard of minimum basic needs.
Therefore, the poverty line is not necessarily defined in terms of income or wealth.
a sample of two where one person is rich but ailing and the other poor but healthy. Most
people will agree that its poverty condition is better than a second sample where one person
is rich and healthy and the other poor and ailing. Nonetheless, without accounting for the
association between wealth and health in the sample, the above methods will conclude that
the two distributions share the same level of poverty.
To undertake a genuine multivariate analysis, one can in principle construct a multidimensional poverty index in a similar fashion as its unidimensional counterpart (1). However, the
unique identification of the poor given a poverty line is lost in a multidimensional context.2
Facing this inherent complication in multidimensional poverty comparison, we propose two
general indices of multidimensional poverty: intensive poverty and extensive poverty. Given
a poverty line z and disutility function φ, the intensive poverty includes only those with
all attributes below the poverty line. On the other hand, the extensive poverty includes all
those with at least one attribute below the poverty line; i.e., it excludes only those with all
attributes above the poverty line.
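The two-person wealth and health example can be revisited with these definitions. The sketch below is illustrative only: it sets φ = 1, treats an attribute at or below its line as deprived, and uses hypothetical scores and a hypothetical function name `headcount_rates`. It shows that the marginal poverty rates coincide across the two samples while the intensive and extensive rates do not.

```python
def headcount_rates(sample, z):
    """Return (marginal rates, intensive rate, extensive rate) with phi = 1."""
    n = len(sample)
    flags = [[xk <= zk for xk, zk in zip(x, z)] for x in sample]
    marginals = [sum(f[k] for f in flags) / n for k in range(len(z))]
    intensive = sum(all(f) for f in flags) / n  # all attributes below the line
    extensive = sum(any(f) for f in flags) / n  # at least one attribute below
    return marginals, intensive, extensive


# Hypothetical (wealth, health) scores with a poverty line of 1.0 on each.
z = (1.0, 1.0)
sample_a = [(5.0, 0.5), (0.5, 5.0)]  # rich but ailing, poor but healthy
sample_b = [(5.0, 5.0), (0.5, 0.5)]  # rich and healthy, poor and ailing

rates_a = headcount_rates(sample_a, z)
rates_b = headcount_rates(sample_b, z)
```

Because the two indices use the joint distribution, they register the different association between wealth and health that attribute-by-attribute comparisons miss.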
In Section 2, the construction of the intensive and extensive poverty indices is presented, and
their properties explored. Noting the similarity between multidimensional poverty and multiperiod poverty, we compare and contrast these two concepts. We then relate the intensive
and extensive poverty to the minimin and minimax poverty objective of a social planner. We
also present bounds on them as functions of the unidimensional poverty index of marginal
distributions.
In section 3, we demonstrate that the proposed poverty concepts can be used to establish
the first order stochastic dominance of social welfare given certain conditions of individual
utility function. The conditions for the first order stochastic dominance are rather demanding for bivariate distributions, and become increasingly more so with the dimension of the
distribution or order of stochastic dominance. Hence we propose a simple aggregate poverty
index as a convex combination of the intensive and extensive poverty. This aggregate index
is parameterized by a coefficient that reflects the society’s poverty aversion. This construction accommodates separate individual risk aversion and social poverty aversion. Since the
gap between the intensive and extensive poverty increases progressively with the dimension
of multivariate distributions, we further introduce a correction to reduce the dimensionality
effect.
In Section 4, we apply the proposed poverty indices to the income and education distribution of U.S. adults from 1992 through 2006. We show that both the intensive and extensive
poverty declined during the sample period except that they increased slightly for the last
2 We do not have this problem when generalizing a unidimensional inequality index to a multidimensional
one as the aggregation is always taken over the entire support of the distribution.
three years. The aggregate poverty is decomposed into poverty of the male and female. Some
concluding remarks are offered in Section 5.
2 Intensive and Extensive Poverty
2.1 Formulation
In this section, we present the construction of the intensive and extensive poverty. Suppose
each individual is endowed with K attributes, such as income, health, literacy and so on,
where K > 1 is a positive integer. Let x be a K-dimensional random variable representing
these attributes with a distribution function F defined on $R_+^K$. Given a poverty line
$z = \{z_k\}_{k=1}^K$, the poor in terms of the kth attribute are identified by $A(z_k) = \{x \mid x_k \le z_k\}$ for
k = 1, . . . , K. Denote $\underline{A}(z) = \cap_{k=1}^K A(z_k)$ and $\bar{A}(z) = \cup_{k=1}^K A(z_k)$. For an arbitrary A (z) such
that $\underline{A}(z) \subseteq A(z) \subseteq \bar{A}(z)$, one can define a general poverty index for the multidimensional
distribution F
$$p(A(z);\phi) = \int_{A(z)} \phi(x;z)\,dF(x). \tag{2}$$
In this study, we abstract from the choice of disutility function φ. Following the welfaretheoretic formulation of poverty index, we assume that φ (x; z) is derived from individual
utility function. In particular, we assume that φ (x; z) ≥ 0 for all x and z, and is a “proper”
disutility function such that p (A (z) ; φ) satisfies the usual properties of a poverty index,
such as focus, symmetry, monotonicity, continuity, principle of population, scale invariance
and subgroup decomposability.3 We also assume that φ (x; z) is properly normalized such
that 0 ≤ p (A (z) ; φ) ≤ 1.
The construction of p (A (z) ; φ) has some drawbacks. It is not unique: a continuum of
poverty indices is defined by the set $\mathcal{A} = \{A(z) : \underline{A}(z) \subseteq A(z) \subseteq \bar{A}(z)\}$. Given the poverty
line z and disutility function φ, p (A (z) ; φ) depends on the choice of A (z). Moreover,
the construction is generally not invertible: p (A1 (z) ; φ) = p (A2 (z) ; φ) does not imply
A1 (z) = A2 (z) . Therefore, welfare comparison based on a single p (A (z) ; φ) can sometimes
be arbitrary.
In this study, we propose instead to work with the two boundary cases $p(\underline{A}(z);\phi)$ and
$p(\bar{A}(z);\phi)$, defined over the intersection $\underline{A}(z)$ and the union $\bar{A}(z)$ of the sets $A(z_k)$.
Formally, define the intensive poverty of a distribution F
$$p(z;\phi) = \int_{\underline{A}(z)} \phi(x;z)\,dF(x), \tag{3}$$
and the extensive poverty
$$\bar{p}(z;\phi) = \int_{\bar{A}(z)} \phi(x;z)\,dF(x). \tag{4}$$
3 See Bourguignon and Chakravarty (2002a) and Tsui (2002) for detailed discussions of desirable properties
of a multidimensional poverty index.
The intensive poverty is the most exclusive in the sense that only those with all attributes
below the poverty line are included. On the other extreme, the extensive poverty is the most
inclusive as only those with all attributes above the poverty line are excluded.4
Since φ (x; z) ≥ 0 for all x and z, it follows immediately that p(z; φ) ≤ p (A (z) ; φ) ≤
p̄ (z; φ) for any A (z) that contains the intersection set and is contained in the union set Ā (z) .
Moreover, because p (A (z) ; φ) lies between these two bounds, there exists 0 ≤ α ≤ 1 such that
p (A (z) ; φ) = αp(z; φ) + (1 − α) p̄ (z; φ) .
Therefore, an arbitrary p (A (z) ; φ) can be expressed as a convex combination of p(z; φ) and
p̄ (z; φ) .
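A small numerical sketch may make the convex-combination statement concrete. It assumes φ = 1 by default and uses a hypothetical intermediate identification rule (deprived in at least two of three attributes); the function names and data are illustrative and not part of the paper.

```python
def poverty_index(sample, z, is_poor, phi=lambda x, z: 1.0):
    """Sample analogue of (2): average of phi over those identified as poor."""
    return sum(phi(x, z) for x in sample if is_poor(x, z)) / len(sample)


def deprivations(x, z):
    """Number of attributes at or below their poverty lines."""
    return sum(xk <= zk for xk, zk in zip(x, z))


def intersection_rule(x, z):  # identifies the intensive poor
    return deprivations(x, z) == len(z)


def union_rule(x, z):  # identifies the extensive poor
    return deprivations(x, z) >= 1


def at_least_two(x, z):  # a hypothetical intermediate rule
    return deprivations(x, z) >= 2


def convex_weight(p_low, p_mid, p_high):
    """Solve p_mid = a * p_low + (1 - a) * p_high for a."""
    if p_high == p_low:
        return 0.0  # bounds coincide, so any weight works; 0 by convention
    return (p_high - p_mid) / (p_high - p_low)


# Hypothetical three-attribute sample with a common poverty line of 0.5.
sample = [(0.2, 0.3, 0.1), (0.2, 0.9, 0.4), (0.9, 0.8, 0.3), (0.9, 0.9, 0.9)]
z = (0.5, 0.5, 0.5)
p_int = poverty_index(sample, z, intersection_rule)
p_mid = poverty_index(sample, z, at_least_two)
p_ext = poverty_index(sample, z, union_rule)
alpha = convex_weight(p_int, p_mid, p_ext)
```

The weight `alpha` is the coefficient on the intensive poverty in the convex combination; it depends on the chosen rule and on the data.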
Another poverty index that is multidimensional in nature is the multiperiod poverty
index. We note that there is a close connection between multidimensional poverty and
multiperiod poverty. One can consider a single attribute x in multiple periods as a multidimensional attribute and apply the concepts of intensive and extensive poverty to the
multiperiod case. The intensive poverty is equivalent to the permanent poverty, while the
extensive poverty is loosely related to the temporary poverty.5 Borooah and Creedy (1998)
and Gräb and Grimm (2006) propose a multiperiod poverty index in a two-period context.
For t = 1, 2, $p^T_t$ and $p^P_t$ denote respectively the temporary (or transient) and permanent
poverty measures for period t, where $p^T_t$ includes those who are poor only in period t and $p^P_t$
includes those who are poor in both periods. Denote by $A_t$ the set of individuals who are poor
in period t. The permanent poverty is the “intersection” of the two sets, and the temporary
poverty corresponds to the “symmetric difference” of the two sets.6 Although simple and intuitive,
this construction suffers from the curse of dimensionality: $p^T_t$ is not well-defined when there
are more than two periods. Consider an income distribution in three periods. It is not clear
whether one should count those with income below the poverty line for two periods as temporarily
poor or permanently poor. To avoid this confusion, one can in principle define a K-element set of
poverty indices for a K-period distribution, of which the kth index includes those who are
poor for k out of K periods. However, the construction becomes progressively complicated
with K.7
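The K-element set of indices described above can be sketched as follows, assuming a headcount measure and a panel stored as one tuple of period values per individual. The function name `poverty_spell_shares` and the data are hypothetical.

```python
from collections import Counter


def poverty_spell_shares(panel, z):
    """Share of individuals poor in exactly k of K periods, for k = 1..K."""
    K = len(z)
    counts = Counter(
        sum(x_t <= z_t for x_t, z_t in zip(person, z)) for person in panel
    )
    n = len(panel)
    return {k: counts.get(k, 0) / n for k in range(1, K + 1)}


# Hypothetical three-period incomes with a poverty line of 5 in each period.
panel = [(3, 3, 3), (3, 8, 3), (9, 9, 4), (9, 9, 9), (2, 9, 9)]
z = (5, 5, 5)
shares = poverty_spell_shares(panel, z)
intensive = shares[len(z)]        # poor in every period
extensive = sum(shares.values())  # poor in at least one period
```

In this sketch the intensive and extensive headcounts are recovered as the last element and the sum of the K shares, respectively.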
4 The construction of extensive and intensive poverty corresponds to the union and intersection approach
discussed in Bourguignon and Chakravarty (2002a).
5 The multiperiod poverty problem has a natural (temporal) order of the elements of X, which is lacking in
the multidimensional poverty case.
6 The symmetric difference of two sets A1 and A2 is defined as A1 ∆A2 = (A1 − A2 ) ∪ (A2 − A1 ) .
7 This complication reflects the difficulty associated with symmetric difference for more than two sets.
The construction is rather cumbersome. For k < K, one can define the “kth-order” symmetric difference of
K sets as ∆Mk = {a ∈ ∪M | # {A ∈ M | a ∈ A} = k} .
In contrast, the proposed intensive and extensive poverty indices involve only the two
boundary cases, whose complexity is not affected by the dimension of x. All the intermediate
cases can be obtained as a linear combination of these two extreme cases. Furthermore,
in the next section we present an aggregate index which explicitly accounts for the impact of
dimensionality.
2.2 Properties
In this subsection, we present some properties of the proposed poverty indices. For expositional ease, we focus on the case where φ = 1 such that the poverty index is the headcount
index. Generalization to more general φ is straightforward.
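Given the poverty line z, we define
$$p(z) = \int_{\underline{A}(z)} dF(x) = F(z),$$
and
$$\bar{p}(z) = \int_{\bar{A}(z)} dF(x) = 1 - Q(z),$$
where $\underline{A}(z)$ and $\bar{A}(z)$ are the intersection and union of the sets $A(z_k)$, and
$Q(z) = \int_{z_1}^{\infty} \cdots \int_{z_K}^{\infty} dF(x)$ is the survival function. We can relate p and p̄ to the
poverty target of a social planner with either a minimax or minimin criterion.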
Proposition 1 Let X be a random vector from a distribution F defined on $R^K$. The random
variables max (X) and min (X) are the order statistics of X. Define p (t) = p (tι) , where ι
is a vector of K ones; p̄ (t) is similarly defined. The distribution functions of max (X) and
min (X) are given by
P [max (X) ≤ t] = p (t) ,
and
P [min (X) ≤ t] = p̄ (t) .
[All proofs are given in Appendix.]
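Proposition 1 can be checked on simulated data. The sketch below is illustrative only: it draws a hypothetical sample of three independent uniform attributes and compares the intensive and extensive headcounts at a common line t with the empirical distribution functions of the row-wise maximum and minimum. The function names are assumptions of this sketch.

```python
import random


def intensive_at(sample, t):
    """Share of observations with every attribute at or below t."""
    return sum(all(xk <= t for xk in x) for x in sample) / len(sample)


def extensive_at(sample, t):
    """Share of observations with at least one attribute at or below t."""
    return sum(any(xk <= t for xk in x) for x in sample) / len(sample)


def ecdf(values, t):
    """Empirical distribution function of a list of scalars, evaluated at t."""
    return sum(v <= t for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is reproducible
sample = [tuple(rng.random() for _ in range(3)) for _ in range(500)]
row_max = [max(x) for x in sample]
row_min = [min(x) for x in sample]

checks = [
    (intensive_at(sample, t) == ecdf(row_max, t),
     extensive_at(sample, t) == ecdf(row_min, t))
    for t in (0.1, 0.25, 0.5, 0.75, 0.9)
]
```

Each pair in `checks` compares the two sides of the corresponding identity in Proposition 1 at one value of t.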
There is a close analogy between Proposition 1 and theorems on choice under uncertainty.
Under expected utility maximization, the maximin standard is the most conservative choice
while the maximax strategy is the most aggressive one. In contrast, when minimizing
a “bad” such as poverty, the minimax criterion associated with the intensive poverty is
conservative in the sense that only those severely poor are targeted. On the other hand,
the minimin criterion associated with the extensive poverty is the most ambitious one as all
those with at least one attribute below the poverty line are considered.
Next we present bounds on the intensive and extensive poverty in terms of the poverty
level of marginal distributions.
Proposition 2 Given the unidimensional poverty index $p_k(z_k) = P[x_k \le z_k]$ for k = 1, . . . , K,
$$\max\left(\sum_{k=1}^{K} p_k(z_k) - K + 1,\; 0\right) \le p(z) \le \min\left(p_1(z_1), \ldots, p_K(z_K)\right), \tag{5}$$
and
$$\max\left(p_1(z_1), \ldots, p_K(z_K)\right) \le \bar{p}(z) \le \min\left(\sum_{k=1}^{K} p_k(z_k),\; 1\right). \tag{6}$$
When the joint distribution of multiple attributes is not observable, or when no individual data
but only poverty rates for each individual dimension are available, Proposition 2 provides a
useful tool to bound the intensive and extensive poverty.
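As a minimal sketch of how the bounds in (5) and (6) might be computed from marginal headcount rates alone (the function names and the rates below are hypothetical):

```python
def intensive_bounds(rates):
    """Lower and upper bounds on the intensive headcount, as in (5)."""
    K = len(rates)
    return max(sum(rates) - K + 1, 0.0), min(rates)


def extensive_bounds(rates):
    """Lower and upper bounds on the extensive headcount, as in (6)."""
    return max(rates), min(sum(rates), 1.0)


# Hypothetical marginal poverty rates for two attributes.
rates = [0.6, 0.7]
int_low, int_high = intensive_bounds(rates)
ext_low, ext_high = extensive_bounds(rates)
```

When the marginal rates are small, the lower bound in (5) is zero and the upper bound in (6) is the sum of the marginal rates, so the bounds are most informative when at least one marginal rate is large.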
3 Welfare Comparison Based on Multidimensional Distributions
3.1 Stochastic dominance
In this section, we discuss welfare comparison of multidimensional distributions according to
their poverty levels. It is known that the stochastic dominance allows us to make a partial
ranking of distributions without knowledge of the precise form of social welfare function.
Below we show that the proposed poverty indices can be used to establish the first order
stochastic dominance of social welfare for multidimensional distributions as discussed in
Atkinson and Bourguignon (1982) and Bourguignon and Chakravarty (2002b). Consider
two distributions F and F ∗ , and an individual utility function U . We assume that the social
welfare is well-defined by the expected utility:
$$W = \int U(x)\,dF(x).$$
We first consider the two-dimensional case. The social welfare comparison is based on
$$\Delta W = \int_0^\infty\!\!\int_0^\infty U(x_1,x_2)\,dF^*(x_1,x_2) - \int_0^\infty\!\!\int_0^\infty U(x_1,x_2)\,dF(x_1,x_2).$$
One distribution, F∗, is said to stochastically dominate the other, F, for a specified class of
utility functions $U \in \mathcal{U}$ when ∆W is non-negative for all U and is strictly positive for some
U. Let $F^* \succeq_U F$ denote that F∗ first order stochastically dominates F for utility function U.
Under certain conditions on the utility function, dominance in terms of p or p̄ is sufficient to
establish the first order stochastic dominance between distributions.
Proposition 3 Let p and p̄ be the intensive and extensive poverty associated with distribution F ; p∗ and p̄∗ are similarly defined. Define ∆p̄ = p̄∗ − p̄ and ∆p = p∗ − p. Suppose
U1 ≥ 0, U2 ≥ 0 for all x1 and x2, where $U_i = \partial U(x_1,x_2)/\partial x_i$, i = 1, 2, and $U_{12} = \partial^2 U(x_1,x_2)/\partial x_1 \partial x_2$.
(a) Given U12 ≤ 0, if ∆p (x1, x2) ≤ 0 for all x1 and x2, then $F^* \succeq_U F$.
(b) Given U12 ≥ 0, if ∆p̄ (x1, x2) ≤ 0 for all x1 and x2, then $F^* \succeq_U F$.
(c) If ∆p (x1, x2) ≤ 0 and ∆p̄ (x1, x2) ≤ 0 for all x1 and x2, then $F^* \succeq_U F$.
Proposition 3 offers some interesting insights into welfare comparison based on multiple
attributes. A key difference from the unidimensional case is that derivative conditions higher
than order one are required to establish the first order stochastic dominance. For univariate
distributions, the first order stochastic dominance only requires non-negative marginal utility,
while for bivariate distributions, cross-derivatives of the utility function are also restricted.
When two attributes are substitutes, dominance in terms of the intensive poverty p warrants
the first order stochastic dominance. Recall from Proposition 1 that p is associated with
the distribution function of the maximum of multiple attributes. Therefore, part (a) of
Proposition 3 states that if the distribution function of the maximum of two substitutes from
a joint distribution F∗ is everywhere below that from distribution F, then F∗ is preferred to F.
On the other hand, part (b) of Proposition 3 suggests that if the distribution function of the minimum
of two complements from a joint distribution F∗ is everywhere below that from F, then F∗
is preferred. This is consistent with the intuition that utility derived from two complements
is determined by the one whose amount is smaller. One can also relate the result to choice
under uncertainty: maximizing expected utility from two complements is equivalent to
maximizing the minimum of the two. Put another way, one should use the maximin criterion.
Part (c) suggests that only under dominance in both p and p̄, non-negative marginal utility
establishes the first order stochastic dominance.
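The distributional conditions in Proposition 3 can be checked on two bivariate samples. The sketch below is a hedged illustration: it works with empirical distributions, evaluates ∆p and ∆p̄ at all pooled observed values plus a sentinel below the smallest value, and says nothing about the utility conditions. The function names and data are hypothetical.

```python
def joint_cdf(sample, x1, x2):
    """Intensive headcount p(x1, x2): share with both attributes at or below."""
    return sum(a <= x1 and b <= x2 for a, b in sample) / len(sample)


def union_rate(sample, x1, x2):
    """Extensive headcount: share with at least one attribute at or below."""
    return sum(a <= x1 or b <= x2 for a, b in sample) / len(sample)


def dominance_conditions(sample_star, sample):
    """Return (Delta p <= 0 everywhere, Delta p-bar <= 0 everywhere) on the grid."""
    pooled = sample_star + sample
    low = float("-inf")  # sentinel covering the region below all observations
    grid1 = [low] + sorted({a for a, _ in pooled})
    grid2 = [low] + sorted({b for _, b in pooled})
    points = [(a, b) for a in grid1 for b in grid2]
    dp_ok = all(
        joint_cdf(sample_star, a, b) <= joint_cdf(sample, a, b) for a, b in points
    )
    dpbar_ok = all(
        union_rate(sample_star, a, b) <= union_rate(sample, a, b) for a, b in points
    )
    return dp_ok, dpbar_ok


# Hypothetical samples: the second is the first shifted up by one unit.
base = [(1, 1), (2, 2), (3, 3)]
shifted = [(2, 2), (3, 3), (4, 4)]
both_hold = dominance_conditions(shifted, base)
neither_holds = dominance_conditions(base, shifted)
# Same marginals, different association between the two attributes.
mixed = dominance_conditions([(1, 1), (3, 3)], [(1, 3), (3, 1)])
```

The last pair has identical marginal distributions, so any difference in the two flags comes from the association between the attributes alone.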
Although the first order stochastic dominance offers unambiguous welfare comparison,
its conditions are rather demanding. The condition ∆p(x1 , x2 ) ≤ 0 for all x1 and x2
requires that the distribution function F (x1 , x2 ) be everywhere above F ∗ (x1 , x2 ) . On
the other hand, ∆p̄ (x1 , x2 ) ≤ 0 for all x1 and x2 requires that the survival function
Q (x1 , x2 ) be everywhere below Q∗ (x1 , x2 ). Furthermore, the conditions become progressively
more demanding with the dimension of x due to the curse of dimensionality. For the sake of
completeness, below we present sufficient conditions, in terms of p and p̄, of the first order
stochastic dominance for the general K dimensional case.8
Proposition 4 Let x be a K dimensional attribute, K ≥ 3. Suppose $U_k \ge 0$ for all $\{x_k\}_{k=1}^K$.
(a) Given $U_{ij} \le 0$ for all $1 \le i \ne j \le K$, $U_{ijk} \ge 0$ for all $1 \le i \ne j \ne k \le K$, . . . ,
$(-1)^{K+1} U_{12 \ldots K} \ge 0$, if $\Delta p(x_1, \ldots, x_K) \le 0$ for all $\{x_k\}_{k=1}^K$, then $F^* \succeq_U F$.
(b) Given $U_{ij} \ge 0$ for all $1 \le i \ne j \le K$, $U_{ijk} \le 0$ for all $1 \le i \ne j \ne k \le K$, . . . ,
$(-1)^{K} U_{12 \ldots K} \ge 0$, if $\Delta \bar{p}(x_1, \ldots, x_K) \le 0$ for all $\{x_k\}_{k=1}^K$, then $F^* \succeq_U F$.
The conditions in Proposition 4 require that the distribution function F ∗ be everywhere below F,
or that the survival function Q be everywhere below Q∗ , which is even more demanding in the K dimensional case.
Facing this difficulty, one might consider using higher order stochastic dominance, which can
be established under weaker conditions on distributions. However, the gains from weaker
conditions on distributions are probably offset by more restrictive conditions on the utility
function. Consider the two-dimensional case discussed in Atkinson and Bourguignon (1982).
The second order stochastic dominance requires assumptions on derivatives of the utility
function up to order four. Define $H(x_1,x_2) = \int_0^{x_1}\!\int_0^{x_2} F(s,t)\,dt\,ds$, $H_1(x_1) = \int_0^{x_1} F_1(s)\,ds$
and $H_2(x_2) = \int_0^{x_2} F_2(t)\,dt$. Under the conditions that U1 ≥ 0, U2 ≥ 0, U11 ≤ 0, U22 ≤
0, U12 ≤ 0, U112 ≥ 0, U122 ≥ 0 and U1122 ≤ 0, the second order stochastic dominance can be
established if ∆H1 (x1 ) ≤ 0, ∆H2 (x2 ) ≤ 0 and ∆H (x1 , x2 ) ≤ 0 for all x1 and x2 .
3.2 Aggregate poverty index
The foregoing discussion suggests that although the intensive and extensive poverty can
be used to establish stochastic dominance, the conditions are demanding even for the twodimensional case. To circumvent the curse of dimensionality, a more realistic approach is to
use a poverty index associated with a given poverty line or a range of poverty lines to
obtain an operational poverty ordering. Recall that p and p̄ are respectively the minimum
and maximum of the family of poverty indices defined by (2) given the poverty line z and
disutility function φ. It follows that a distribution F is preferred to F ∗ in terms of the poverty
index defined by (2) if the upper bound poverty of F is smaller than the lower bound poverty
of F ∗ . However, this condition is no less demanding, especially for the higher dimensional case.
Often the two intervals [p, p̄] and [p∗ , p̄∗ ] overlap and a poverty order is not obtained.
8 The counterpart of part (c) in Proposition 3 is rather tedious and hence not derived here.
To resolve this inadequacy, we propose an aggregate poverty index
$$p_\alpha = p + \alpha\,(\bar{p} - p),$$
where 0 ≤ α ≤ 1. Although rather flexible, pα also suffers from the curse of dimensionality.
Given z and φ, we note that the intensive poverty generally decreases with K, while the
extensive poverty increases with K. Consider a simple example where φ = 1 and the marginal
distributions of all attributes are independent and follow the uniform distribution on [0, 1] .
Suppose the poverty line is set at 0.5 for each attribute. When K = 2, Ep = 0.25 and
E p̄ = 0.75; when K = 3, Ep = 0.125 and E p̄ = 0.875; and when K = 4, Ep = 0.0625
and E p̄ = 0.9375. Clearly, the gap p̄ − p increases progressively with K. Therefore as K
increases, pα is increasingly dominated by p̄ for a fixed α. Hence, we further propose a
dimension-adjusted aggregate index:
$$p_\alpha = p + \alpha^K (\bar{p} - p). \tag{7}$$
Raising α ∈ [0, 1] to the power K effectively mitigates the impact of dimensionality. At the same time, since pα remains a linear combination of p and p̄, it inherits all the
desirable properties of p and p̄, especially the crucial property of subgroup consistency or
decomposability. Moreover, pα is also normalized since 0 ≤ p ≤ pα ≤ p̄ ≤ 1.
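The uniform example and the effect of the dimension adjustment can be reproduced with a short sketch. It assumes φ = 1 and K independent attributes that share a common marginal poverty rate q; the value α = 0.8 and the function names are hypothetical choices made only for illustration.

```python
def independent_headcounts(q, K):
    """Intensive and extensive headcounts for K independent attributes,
    each with marginal poverty rate q."""
    return q ** K, 1.0 - (1.0 - q) ** K


def aggregate_index(p_int, p_ext, alpha, K=1):
    """p_int + alpha**K * (p_ext - p_int); K=1 gives the unadjusted index."""
    return p_int + alpha ** K * (p_ext - p_int)


alpha = 0.8  # hypothetical degree of social poverty aversion
rows = []
for K in (2, 3, 4):
    p_int, p_ext = independent_headcounts(0.5, K)
    unadjusted = aggregate_index(p_int, p_ext, alpha)
    adjusted = aggregate_index(p_int, p_ext, alpha, K=K)
    rows.append((K, p_int, p_ext, unadjusted, adjusted))
```

Each row holds K, the intensive and extensive headcounts, and the aggregate index without and with the exponent K, so the two versions can be compared as the dimension grows.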
The coefficient α reflects social poverty aversion and the choice of α is independent of
poverty line z, disutility function φ and the dimension of x. The larger α is, the higher is
social poverty aversion. When α = 1, p1 = p̄, the social planner considers all those with at
least one attribute below poverty line as poor; on the other extreme, when α = 0, p0 = p,
only those with all attributes below the poverty line are counted as poor.
Conventionally, the welfare-theoretic construction of inequality or poverty index implicitly assumes that individual risk aversion coincides with society’s inequality or poverty aversion. For example, Atkinson (1970) assumes a CRRA individual utility function $x^{\varepsilon}/(1-\varepsilon)$,
where x is a scalar, ε > 0. Let the equally distributed equivalent income, $x_E$, be defined such
that
$$x_E^{\varepsilon}/(1-\varepsilon) = \int_0^\infty x^{\varepsilon}/(1-\varepsilon)\,dF(x).$$
The Atkinson index is given by
$$I_\varepsilon = 1 - x_E/\mu_x,$$
where $\mu_x = \int_0^\infty x\,dF(x)$. The social inequality aversion inherits individual risk aversion in
this framework. Similarly given a poverty line z, one can construct a poverty index as in
Foster et al. (1984). Letting $\phi(x;z) = (1-x/z)^{\varepsilon}/(1-\varepsilon)$, we obtain
$$p_\varepsilon(z) = \int_0^z \frac{(1-x/z)^{\varepsilon}}{1-\varepsilon}\,dF(x).$$
Thus individual risk aversion dictates social poverty aversion: the larger ε is, the lower is
the society’s poverty aversion because ∂pε (z) /∂ε < 0.
However, poverty as both a private and public “bad” is known to generate long-lasting
negative externality in addition to its adverse impacts on the poor.9 Consequently the
socially acceptable poverty level is not necessarily consistent with that dictated by individual
utility maximization. The proposed aggregate poverty index effectively separates the social
poverty aversion from individual risk aversion. In particular, specification of the disutility
function φ is based on individual utility function, while α is chosen according to social
planner’s poverty aversion. Thus the proposed index offers the flexibility that accounts for
both individual utility maximization and social optimum.
4 Empirical example
In this section, we provide an illustrative example of the proposed poverty indices. Our data
come from the Current Population Survey (CPS) March Supplements 1992 through 2006.
Because the CPS underwent a major re-design in 1992, we start our sample in 1992 to ensure
data comparability. We focus on the poverty level in terms of income and education of the
prime-aged adults (25 to 60 years old). The CPS records each interviewee’s poverty status
according to a poverty line based on the family equivalence scale defined by the U.S. Bureau
of Labor Statistics. Regarding education, we consider an individual poor if he or she does
not have a high school degree. For simplicity, we set φ (x; z) = 1.
Various poverty estimates are reported in Figure 1. The top panel reports the poverty
rate of the marginal distributions. During the sample period, the marginal poverty rate
in terms of income and education generally decreased from 1992 through 2002 and then
increased slightly thereafter. The second panel reports the intensive and extensive poverty
rate for the joint distribution of income and education. The general pattern is similar to the
poverty rates of the marginal distributions: gradual decline from 1992 through 2002 followed
by a slow increase. The ratio between intensive and extensive poverty ranges between 17%
and 21% for the sample period, with an average of 19%.
9 For example, Wane (2001) discusses the design of optimal income tax, treating poverty as a public “bad”.
For each pair of adjacent years, conditions for the first order stochastic dominance for
the joint distribution are not satisfied. Also, since the extensive poverty is larger than the
intensive poverty throughout the sample period, unambiguous poverty ordering cannot be
established based on the rule that one distribution’s extensive poverty is lower than the
other’s intensive poverty as discussed above. Hence we resort to the aggregate poverty index
(7) for poverty ordering. Setting α = 1/2, we calculate the aggregate poverty as
$$p_{1/2} = p + (1/2)^2\,(\bar{p} - p).$$
The results are reported in the lower left plot. Because the intensive and extensive poverty
follow a similar pattern, the aggregate index mirrors their trends during the sample period.
Since the aggregate index is a linear combination of the intensive and extensive poverty,
it also satisfies the decomposability condition. Therefore, we can calculate the contribution
of each subgroup to the overall poverty. In this example, we explore the relative contribution
of the male and female to the overall poverty. Since the ratio of prime-aged male to female
remains rather stable in the range of [91.3%, 94.6%] during the sample period, their contributions are reflected in their respective level of aggregate poverty. The estimates of male
and female aggregate poverty index with α = 1/2 are shown in the lower right plot. During
the entire period, the female aggregate poverty rate is consistently higher than that of the
male, with an average ratio 119%. Moreover, the increase after year 2002 is more significant
for the female.
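The subgroup decomposition used here can be sketched as follows. The data, the poverty lines (income at or below 10, schooling at or below 11 years as a stand-in for lacking a high school degree) and the function names are hypothetical, not the CPS figures; the sketch only shows that the population-share-weighted subgroup indices add up to the overall aggregate index when φ = 1.

```python
def aggregate_poverty(sample, z, alpha):
    """Dimension-adjusted aggregate headcount index (7) with phi = 1."""
    K = len(z)
    n = len(sample)
    flags = [[xk <= zk for xk, zk in zip(x, z)] for x in sample]
    p_int = sum(all(f) for f in flags) / n
    p_ext = sum(any(f) for f in flags) / n
    return p_int + alpha ** K * (p_ext - p_int)


def decompose(groups, z, alpha):
    """Map each subgroup to (population share, subgroup aggregate index)."""
    n = sum(len(g) for g in groups.values())
    return {
        name: (len(g) / n, aggregate_poverty(g, z, alpha))
        for name, g in groups.items()
    }


# Hypothetical (income, years of schooling) records by subgroup.
groups = {
    "male": [(8, 10), (20, 16), (9, 14), (30, 12)],
    "female": [(7, 9), (6, 16), (25, 10), (40, 18), (12, 11), (9, 8)],
}
z = (10.0, 11.0)
parts = decompose(groups, z, alpha=0.5)
overall = aggregate_poverty([x for g in groups.values() for x in g], z, alpha=0.5)
recombined = sum(share * index for share, index in parts.values())
```

Each subgroup's contribution to the overall index is its population share times its own aggregate index, which is why stable population shares let the subgroup levels be compared directly.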
5 Conclusion
This paper presents the concepts of intensive and extensive poverty in a multidimensional
context, which correspond respectively to the strictest and broadest definition of poverty.
We relate these concepts to the minimax and minimin risk criterion. We show that these
concepts can be used to establish sufficient conditions of the first order stochastic dominance
for welfare comparison, but the conditions are rather demanding. We then propose an
aggregate poverty index as a convex combination of the intensive and extensive poverty.
This aggregate index is further adjusted to mitigate dimensionality effect. Moreover, this
aggregate index separates individual risk aversion and social poverty aversion. Therefore
it can be used by policy makers for poverty targeting which reflects both individual utility
maximization and social optimum.
[Figure 1: U.S. poverty rate in terms of income and education. Panels: income poverty; education poverty; intensive poverty; extensive poverty; aggregate poverty; male (triangle) and female (diamond) aggregate poverty. Horizontal axis: year, 1992 to 2006.]
References
Atkinson, A. B. “Measurement of Inequality.” Journal of Economic Theory, 2, no. 3(1970):
244-263.
Atkinson, A. B. “On the Measurement of Poverty.” Econometrica, 55, no. 4(1987): 749-764.
Atkinson, A. B. Poverty and Social Security. New York: Harvester Wheatsheaf, 1989.
Atkinson, A. B. “Multidimensional Deprivation: Contrasting Social Welfare and Counting
Approaches.” Journal of Economic Inequality, 1(2003): 51-65.
Bibi, S. “Measuring Poverty in a Multidimensional Perspective: A Review of Literature.”
Working Paper, 2005.
Borooah, V. K., and J. Creedy. “Income Mobility, Temporary and Permanent Poverty.”
Australian Economic Papers, 37, no. 1(1998): 36-44.
Bourguignon, F., and S. R. Chakravarty. “The Measurement of Multidimensional Poverty.”
Working Paper, 2002a.
Bourguignon, F., and S. R. Chakravarty. “Multi-dimensional Poverty Orderings.” Working
Paper, 2002b.
Bourguignon, F., and C. Morrisson. “Inequality among World Citizens: 1820-1992.” American Economic Review, 92, no. 4(2002): 727-744.
Chakravarty, S. R. Ethical Social Index Numbers. Berlin Heidelberg New York: Springer,
1990.
Deutsch, J., and J. Silber. “Measuring Multidimensional Poverty: An Empirical Comparison
of Various Approaches.” Review of Income and Wealth, no. 51(2005): 145-174.
Duclos, J.-Y., D. Sahn, and S. D. Younger. “Robust Multidimensional Poverty Comparisons.” Working Paper, 2001.
Foster, J., J. Greer, and E. Thorbecke. “A Class of Decomposable Poverty Measures.”
Econometrica, 52, no. 3(1984): 761-766.
Foster, J. E., and A. F. Shorrocks. “Poverty Orderings.” Econometrica, 56, no. 1(1988):
173-177.
Gräb, J., and M. Grimm. “Robust Multiperiod Poverty Comparisons.” Working paper,
2006.
Maasoumi, E. “The Measurement and Decomposition of Multidimensional Inequality.” Econometrica, 54, no. 4(1986): 991-997.
Maasoumi, E. “Multidimensional Approaches to Welfare Analysis”, ed. J. Silber. Dordrecht
and Boston, Kluwer Academic Publishers, 1999.
Maasoumi, E., and M. A. Lugo. “The Information Basis of Multivariate Poverty Assessments.” Working Paper, 2006.
Sen, A. “Poverty - Ordinal Approach to Measurement.” Econometrica, 44, no. 2(1976):
219-231.
Tsui, K. Y. “Multidimensional Poverty Indices.” Social Choice and Welfare, 19, no. 1(2002):
69-93.
Wane, W. “The Optimal Income Tax when Poverty is a Public ‘Bad’.” Journal of Public
Economics, 82, no. 2(2001): 271-299.
Zheng, B. “Aggregate Poverty Measures.” Journal of Economic Surveys, 11(1997): 123-162.
Appendix
Proof of Proposition 1. $P[\max(X) \le t] = P\left[\cap_{k=1}^K \{X_k \le t\}\right] = F(t\iota) = p(t)$. On the
other hand, $P[\min(X) \le t] = P\left[\cup_{k=1}^K \{X_k \le t\}\right] = 1 - P\left[\cap_{k=1}^K \{X_k > t\}\right] = 1 - Q(t\iota) = \bar{p}(t)$.
Proof of Proposition 2. We first prove the result with K = 2. Note that
p (z1 , z2 ) = P [x1 ≤ z1 , x2 ≤ z2 ]
= P [x1 ≤ z1 ] − P [x1 ≤ z1 , x2 > z2 ]
≤ P [x1 ≤ z1 ] = p1 (z1 ) .
Similarly, one can show that p (z1 , z2 ) ≤ p2 (z2 ) . Therefore, p (z1 , z2 ) ≤ min (p1 (z1 ) , p2 (z2 )) .
On the other hand,
p (z1 , z2 ) = P [x1 ≤ z1 , x2 ≤ z2 ]
= P [x1 ≤ z1 ] − P [x1 ≤ z1 , x2 > z2 ]
≥ P [x1 ≤ z1 ] − P [x2 > z2 ]
= P [x1 ≤ z1 ] − (1 − P [x2 ≤ z2 ])
= p1 (z1 ) + p2 (z2 ) − 1.
Therefore, p (z1 , z2 ) ≥ max (p1 (z1 ) + p2 (z2 ) − 1, 0) .
For the second inequality, note that p̄ (z1 , z2 ) = p1 (z1 )+p2 (z2 )−p (z1 , z2 ) . Using the first
inequality of (5) , it follows that p̄ (z1 , z2 ) ≤ p1 (z1 ) + p2 (z2 ) − max (p1 (z1 ) + p2 (z2 ) − 1, 0) =
min (p1 (z1 ) + p2 (z2 ) , 1) . On the other hand, using the second inequality of (5) , p̄ (z1 , z2 ) =
p1 (z1 ) + p2 (z2 ) − p (z1 , z2 ) ≥ p1 (z1 ) + p2 (z2 ) − min (p1 (z1 ) , p2 (z2 )) = max (p1 (z1 ) , p2 (z2 )) .
One can then use induction to prove the inequality for K = 3, 4, . . . .
Proof of Proposition 3. (a) This proof follows closely Atkinson and Bourguignon (1982).
Let f and f ∗ be the density function of F and F ∗ respectively. Rewrite
$$\Delta W = \int_0^\infty\!\!\int_0^\infty U(x_1,x_2)\,\Delta f(x_1,x_2)\,dx_2\,dx_1,$$
where ∆f = f ∗ − f. Taking the first integral by parts with respect to x2 yields:
$$\Delta W = \int_0^\infty \left[ U(x_1,\infty) \int_0^\infty \Delta f(x_1,x_2)\,dx_2 - \int_0^\infty U_2(x_1,x_2) \int_0^{x_2} \Delta f(x_1,t)\,dt\,dx_2 \right] dx_1. \tag{8}$$
Note that the marginal distribution F1 (x1 ) can be defined as
$$F_1(x_1) = \int_0^{x_1}\!\!\int_0^\infty f(s,x_2)\,dx_2\,ds.$$
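F2 (x2 ) is similarly defined. Next we integrate the right-hand side of (8) by parts with respect
to x1 :
$$\Delta W = U(\infty,\infty) \int_0^\infty\!\!\int_0^\infty \Delta f(x_1,x_2)\,dx_2\,dx_1 - \int_0^\infty U_1(x_1,\infty)\,\Delta F_1(x_1)\,dx_1 - \int_0^\infty U_2(\infty,x_2)\,\Delta F_2(x_2)\,dx_2 + \int_0^\infty\!\!\int_0^\infty U_{12}(x_1,x_2)\,\Delta F(x_1,x_2)\,dx_2\,dx_1, \tag{9}$$
where the first term of (9) equals zero because f and f ∗ both integrate to one.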
Suppose ∆p (x1 , x2 ) = ∆F (x1 , x2 ) ≤ 0 for all x1 and x2 . It follows immediately that
∆F1 (x1 ) = ∆F (x1 , ∞) ≤ 0 for all x1 . Similarly, ∆F2 (x2 ) ≤ 0 for all x2 . Therefore, given
U1 ≥ 0, U2 ≥ 0 and U12 ≤ 0 for all x1 and x2 , we have ∆W ≥ 0.
(b) Note that the integral in the second term of (9) can be rewritten as:
$$\int_0^\infty U_1(x_1,0)\,\Delta F_1(x_1)\,dx_1 + \int_0^\infty\!\!\int_0^\infty U_{12}(x_1,x_2)\,\Delta F_1(x_1)\,dx_2\,dx_1$$
with a similar expression for the third term of (9) . It then follows that
$$\Delta W = \int_0^\infty\!\!\int_0^\infty U_{12}(x_1,x_2)\left(\Delta F(x_1,x_2) - \Delta F_1(x_1) - \Delta F_2(x_2)\right) dx_2\,dx_1 - \int_0^\infty U_1(x_1,0)\,\Delta F_1(x_1)\,dx_1 - \int_0^\infty U_2(0,x_2)\,\Delta F_2(x_2)\,dx_2.$$
Suppose ∆p̄ (x1 , x2 ) = ∆F1 (x1 ) + ∆F2 (x2 ) − ∆F (x1 , x2 ) ≤ 0 for all x1 and x2 . It follows
that ∆p̄ (x1 , 0) = ∆F1 (x1 ) ≤ 0 for all x1 . Similarly, ∆F2 (x2 ) ≤ 0 for all x2 . Therefore, given
U1 ≥ 0, U2 ≥ 0 and U12 ≥ 0 for all x1 and x2 , we have ∆W ≥ 0.
(c) The result is immediate from (a) and (b).
Proof of Proposition 4. The proof, although tedious, is a straightforward extension of
the proof of part (a) and (b) of Proposition 3 to higher dimensional case. The results can
be obtained by repeated integration by parts with respect to x1 , . . . , xK .