
A Guide to Using Prices in Poverty Analysis
John Gibson
Department of Economics
University of Waikato
The goal of this document is to provide practical guidance to those poverty analysts who need to
use price data in their analysis. The relevant issues and choices depend somewhat on the stage at
which the analyst has become involved in the project and on the prior information available
about poverty in the country. Therefore, after an introductory section that should be read by all
users and which outlines the particular poverty analysis tasks that prices can be useful for, the
second part of the guide is structured in the following way:
Users of the guide should therefore combine Section 1 with one of Sections 2-5, depending on
when they enter the project and the extent of previous information. The major division is
between those projects where the survey has already finished and where, potentially, the analyst
has little connection with the survey agency (Sections 4 and 5) and those where there is a closer
integration between survey work and poverty analysis (Sections 2 and 3). The guide is not
designed to be read in its entirety because some points are duplicated between sections.
Section 1: Which Poverty Analytical Tasks Require Price Data?
Most obviously prices are needed to place a monetary value on the food basket for a Cost of
Basic Needs (CBN) poverty line. But even methods for constructing a poverty line that seem to
rule out the need for prices, such as the Food Energy Intake (FEI) method, prove on further
examination to require information on prices.1 Some sort of price index is also needed to
calculate the change over time in the cost of reaching a poverty line standard of living.
Summarizing across all stages of poverty measurement (including the calculation and cross-checking of household total consumption, which may have been done before the poverty analyst
obtains the data), local prices are needed for some or all of the following four tasks (the question
of what is “local” is discussed in Section 2.2):
1. pricing the food basket for the Cost of Basic Needs (CBN) poverty line,
2. forming spatial deflators, so that any ranking of household consumption expenditures
is in real rather than nominal terms,
3. imputing values either when the survey only collects quantities or when checking the
sensitivity of the consumption estimates to the use of respondent-reported values, and
4. calculating the change over time in the cost of reaching the poverty line.
In addition, once poverty estimates have been made there may be an interest in modeling the
effect on poverty of price changes for specific items. Examples include changes in the price of
commodities that are both key staples and major income sources (Ravallion, 1992), relative price
shifts during an economic crisis (Friedman and Levinsohn, 2002) and more general evolution of
relative prices over time (Son and Kakwani, 2006).
1.1 Cost of Basic Needs Poverty Lines
As the name suggests, a Cost of Basic Needs (CBN) poverty line attempts to estimate the cost of
reaching some basic standard of living. Because norms about food energy needs are more widely
agreed upon than norms about other needs, and because food is the largest item in the consumption
budgets of the poor, these CBN poverty lines are anchored by a food poverty line. Thus, the first
task is to calculate the cost of meeting food energy requirements from a diet consisting of the foods
that are actually eaten by poor people in the country. The foods to include in this basket, and their
relative importance, can be set by looking at the food budgets of a group of poor households.
Ideally, this group should not include households ultimately found to be above the poverty line, so
that it is the dietary patterns of the poor but no others that count in forming the basket.2 The
identification of this group may thus rely on the use of a spatial price deflator (see Section 1.2
below).
Once the list of foods and their relative importance is determined the size of the basket can be scaled
up or down (holding calorie budget shares constant within the basket) until it exactly achieves the
food energy target (say 2000 calories per person per day). The cost of buying this (scaled) food
basket can then be estimated separately for each region and sector, giving a set of food poverty
lines. If the survey collects information on food quantities directly, and these are deemed reliable,
the food poverty line basket can be formed in one step from the average consumption quantities
for people in the target group. But if quantities are not available they may be derived by dividing
recorded consumption expenditure on each food by the local price.3 The prices are then used
again when the basket of goods is priced in each region and sector.
1
The FEI method relies on a regression of calorie intakes on a welfare indicator like per capita expenditures. Once a
calorie target is set (say, 2000 calories per person per day) the regression is inverted to solve for the required
expenditure to meet the calorie target. However, there will be a measurement error in this regression if it is carried
out in terms of nominal expenditures when there are large price differences between regions. This error will tend to
reduce the magnitude of the regression coefficient, causing an overstatement in the level of expenditures required to
reach the calorie threshold and hence an overstatement in the value of the poverty line. This error could be reduced
if price data were available to calculate real expenditures that reflect regional differences in the cost of living.
2
This may require an iterative approach since an analyst does not know who the poor are in advance. One example
of such an approach is Pradhan, Suryahadi, Sumarto and Pritchett (2001).
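The scaling and pricing steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed implementation: the food items, calorie contents, quantities and regional prices are all hypothetical, and the function names `scale_basket` and `food_poverty_line` are introduced here only for the example.

```python
# Hypothetical reference-group basket: average daily quantities (kg per person)
# and calorie content (kcal per kg). All numbers are illustrative only.
basket_kg = {"rice": 0.30, "cassava": 0.40, "fish": 0.05}
kcal_per_kg = {"rice": 3600.0, "cassava": 1500.0, "fish": 1000.0}

# Hypothetical local prices (currency units per kg) in two regions.
prices = {
    "coast": {"rice": 2.0, "cassava": 0.8, "fish": 3.0},
    "interior": {"rice": 2.4, "cassava": 0.6, "fish": 5.0},
}


def scale_basket(quantities, kcal, target_kcal=2000.0):
    """Scale every quantity by one common factor so the basket meets the
    calorie target; the calorie shares of the foods are left unchanged."""
    total_kcal = sum(quantities[f] * kcal[f] for f in quantities)
    factor = target_kcal / total_kcal
    return {f: q * factor for f, q in quantities.items()}


def food_poverty_line(scaled_quantities, regional_prices):
    """Daily cost per person of the scaled basket at one region's prices."""
    return sum(scaled_quantities[f] * regional_prices[f] for f in scaled_quantities)


scaled = scale_basket(basket_kg, kcal_per_kg)
z_food = {region: food_poverty_line(scaled, p) for region, p in prices.items()}
```

With these made-up numbers the same scaled basket costs more in the interior than on the coast, which is the regional variation in the food poverty line that the text refers to.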
In the next step, the food poverty line, $z^{F}$, is scaled up to the total CBN poverty line by
adding to it the typical value of non-food spending by households whose total expenditure just
equals $z^{F}$. This is a somewhat austere non-food allowance: because these households do not
spend their whole budget on food, their non-food spending displaces some required food consumption
(Ravallion, 1992). If the food budget share of households whose total expenditure just equals $z^{F}$ is
$w^{L}$, the CBN poverty line is calculated as $z^{CBN} = z^{F} + z^{F}(1 - w^{L})$. This budget share can be found from
the following Engel curve:
$$w = \alpha + \beta \ln\left(\frac{x}{n \cdot z^{F}}\right) + \sum_{k=1}^{K} \gamma_{k} n_{k} + \varepsilon \qquad (1)$$
where $w$ is the food budget share, $x$ is total expenditure, $n$ is the number of persons, $z^{F}$ is the food
poverty line, and $n_{k}$ is the number of people in the kth demographic category. If total expenditure
equals the cost of the food poverty line, $\ln\left(x / (n \cdot z^{F})\right) = 0$, so $w^{L} = \hat{\alpha} + \sum_{k=1}^{K} \hat{\gamma}_{k} \bar{n}_{k}$, where $\bar{n}_{k}$ is the
mean of the demographic variables for the reference household used to form the poverty line
basket of foods.
An upper poverty line is also calculated in many analyses, using a non-food allowance that is
calculated from the food budget share, $w^{U}$, of those households whose food spending (rather than total
spending as in the CBN poverty line) exactly meets the food poverty line. Ravallion (1992)
shows how $w^{U}$ can be estimated by putting the estimated parameters from equation (1) into an
iterative solution. The upper poverty line is then estimated as: $z^{U} = z^{F} / w^{U}$.
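The two poverty lines can be illustrated with a short Python sketch. It rests on several assumptions that are not in the text: the parameter values are hypothetical; `a_hat` stands for the Engel curve intercept evaluated at the reference household's demographics; and the fixed-point rule is just one way to carry out an iterative solution, not necessarily the procedure in Ravallion (1992). The rule comes from equation (1): when food spending equals the food poverty line, total expenditure relative to the food line is 1/w, so the food share solves w = a_hat - beta_hat * ln(w).

```python
import math


def lower_line(z_food, a_hat):
    """Lower CBN line: z_F + z_F * (1 - w_L), where w_L = a_hat."""
    return z_food + z_food * (1.0 - a_hat)


def upper_food_share(a_hat, beta_hat, tol=1e-12, max_iter=200):
    """Solve w = a_hat - beta_hat * ln(w) by fixed-point iteration.

    This is an illustrative solver; it converges when |beta_hat / w| < 1
    near the solution, which holds for the values used below.
    """
    w = a_hat  # start from the lower-line food share
    for _ in range(max_iter):
        w_next = a_hat - beta_hat * math.log(w)
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical values: food poverty line of 100, intercept 0.65, slope -0.10.
z_food, a_hat, beta_hat = 100.0, 0.65, -0.10

w_lower = a_hat
w_upper = upper_food_share(a_hat, beta_hat)
z_lower = lower_line(z_food, a_hat)
z_upper = z_food / w_upper
```

Because the Engel curve slopes downward, the food share at the upper line is smaller than at the lower line, so the upper poverty line exceeds the lower one.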
1.2 Spatial price deflators
Spatial price deflators are needed because price differences between regions may make between-household comparisons of nominal consumption expenditures misleading.4 For example, in the
CBN method of setting poverty lines it is typical to base the poverty line basket of foods on the
actual consumption pattern of a group of poor households.5 But in order to identify this group of
poor households, some ranking must be used and this needs to control for spatial price variation.
Otherwise poor households from regions where prices are high are less likely to be included in
the reference group than are poor households in regions where prices are low because those from
the higher priced region will have higher nominal expenditures.
3
This of course requires a good match between the items in the price survey and the commodity detail in the food
consumption questionnaire. Surprisingly, this basic point is missed by many surveys.
4
Temporal price deflators may also be needed. It is typically assumed that prices do not vary over time within a
cross-section but in inflationary environments even a few months between the time of the first and last household
being surveyed could cause a difference between nominal and real expenditures.
5
Exactly how many households should be in this group depends on prior notions of the poverty rate. For example, if
it was believed that the poverty rate was 0.25 it would be likely that an analyst would use the food consumption
patterns of the poorest quarter of households for obtaining the poverty line basket of foods. If this prior estimate of
the poverty rate turns out to be quite different than the subsequently calculated one, it may be necessary to revise the
calculations, using a different definition of the starting group (Pradhan, Suryahadi, Sumarto and Pritchett, 2001).
The ideal way to control for spatial differences in the prices facing households is to calculate a
“true cost-of-living index”. This true cost-of-living index is based on the expenditure function,
$c = c(u, p)$, which gives the minimum cost, $c$, for a household to reach utility level $u$ when
facing the set of prices represented by the vector $p$. For two otherwise identical households, one
living in the base region and facing prices $p^{0}$, and the other living in another region facing prices
$p^{1}$, the true cost-of-living index is:
$$\text{True cost-of-living index} = \frac{c(u, p^{1})}{c(u, p^{0})}$$
which can be interpreted as the relative price in each region of a fixed level of utility. Although
this is the ideal spatial price index, it is not commonly calculated, even in developed countries.
Instead the usual approach to controlling for spatial price differences is to use a price index
formula that approximates the true cost-of-living index. A common choice is the Laspeyres
index, which calculates the relative cost in each region of buying the base region’s basket of
goods:
$$L = \frac{\sum_{j=1}^{J} Q_{kj} P_{ij}}{\sum_{j=1}^{J} Q_{kj} P_{kj}}, \qquad (2)$$
where k is the base region, i indexes every other region, j indexes each item in the consumption
basket, and Q and P are quantities and prices.
The Laspeyres index overstates the cost of living in high-price regions because it does not allow for
households making economising substitutions away from items that are more expensive in their
home region than they are in the base region. For example, ocean fish are usually more
expensive in the interior of a country than on the coast, so the quantity of fish consumed would
typically be lower in the interior than on the coast. But if a coastal region is the base region, the
Laspeyres index calculates the cost of purchasing the coastal level of fish consumption at the
high prices prevailing in the interior. In contrast, a true cost-of-living index would calculate the cost
of obtaining the coastal level of utility when facing the high prices for fish that prevail in the
interior, letting the household rearrange its consumption bundle to minimise cost.
Another commonly used price index, the Paasche index, understates the cost of living in high-price
regions because it evaluates relative prices using a basket of goods that varies for each of
the i regions:
$$P = \frac{\sum_{j=1}^{J} Q_{ij} P_{ij}}{\sum_{j=1}^{J} Q_{ij} P_{kj}}. \qquad (3)$$
In other words, the Paasche index takes a weighted average of relative prices, where the weights
reflect prior economising substitutions by households. Continuing the above example, the
Paasche index weights the high price of fish in the interior with the (low) quantity of fish
consumed by interior households. This understates the cost of living disadvantage in the interior
compared with the coast because it puts a smaller weight on the items with the highest prices
relative to other regions.
A geometric average of the Laspeyres and Paasche indexes gives a Fisher index: $F = (L \times P)^{1/2}$.
This is a superlative price index which will closely approximate a true cost-of-living index.
Another superlative price index that is sometimes used is the Törnqvist index:
$$T = \exp\left[\sum_{j=1}^{J} \left(\frac{w_{kj} + w_{ij}}{2}\right) \ln\left(\frac{P_{ij}}{P_{kj}}\right)\right] \qquad (4)$$
where wij is the average share that item j has in the consumption basket in region i, and region k
is the base region.
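The four index formulae in equations (2) to (4) and the Fisher index can be compared with a small Python sketch built around the fish example. The prices and quantities are hypothetical, and the budget shares for the Törnqvist index are computed from the aggregate regional baskets, which is a simplifying assumption (the text refers to average shares).

```python
import math


def laspeyres(q_base, p_base, p_other):
    """Cost of the base region's basket at the other region's prices,
    relative to its cost at base-region prices (equation 2)."""
    num = sum(q_base[j] * p_other[j] for j in q_base)
    den = sum(q_base[j] * p_base[j] for j in q_base)
    return num / den


def paasche(q_other, p_base, p_other):
    """Same comparison, but using the other region's own basket (equation 3)."""
    num = sum(q_other[j] * p_other[j] for j in q_other)
    den = sum(q_other[j] * p_base[j] for j in q_other)
    return num / den


def budget_shares(q, p):
    total = sum(q[j] * p[j] for j in q)
    return {j: q[j] * p[j] / total for j in q}


def tornqvist(q_base, q_other, p_base, p_other):
    """Share-weighted geometric mean of price relatives (equation 4)."""
    w_k = budget_shares(q_base, p_base)
    w_i = budget_shares(q_other, p_other)
    log_index = sum(
        0.5 * (w_k[j] + w_i[j]) * math.log(p_other[j] / p_base[j]) for j in p_base
    )
    return math.exp(log_index)


# Hypothetical data: the coast is the base region k, the interior is region i.
p_coast = {"fish": 2.0, "rice": 1.0}
q_coast = {"fish": 10.0, "rice": 20.0}
p_interior = {"fish": 4.0, "rice": 1.0}
q_interior = {"fish": 4.0, "rice": 24.0}  # less fish where it is dear

laspeyres_index = laspeyres(q_coast, p_coast, p_interior)
paasche_index = paasche(q_interior, p_coast, p_interior)
fisher_index = math.sqrt(laspeyres_index * paasche_index)
tornqvist_index = tornqvist(q_coast, q_interior, p_coast, p_interior)
```

With these numbers the Laspeyres index is 1.5 and the Paasche index is 1.25, while the Fisher and Törnqvist indexes both fall between the two, in line with the direction of the biases described above.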
One practical difficulty with all of these price index formulae is that they require a full set of
prices for all items in the consumption basket. Household surveys are typically not able to collect
prices for all consumption items (for example, prices for services are hard to measure) so
assumptions are needed about the regional pattern of prices for the items that are not observed.
One solution to this problem is to derive the spatial price index from the regional poverty lines
because poverty lines can be calculated when there are missing non-food prices (see the
discussion surrounding equation (1)). A further advantage of deriving the spatial price index
from the CBN poverty line is that this ensures consistency between what should be two
equivalent methods of calculating poverty statistics (head count, poverty gap, etc.):
(i) comparing nominal consumption expenditures with poverty lines that vary by
region and sector
(ii) using the spatial price index to deflate nominal consumption to either national
average prices or to the prices in a base region and then comparing these spatially
real consumption expenditures with a poverty line that takes a single value.
If the spatial price index estimates regional variations in the cost of living that differ from those
implied by the CBN poverty line, these two equivalent methods will not give consistent results.
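The consistency point can be checked with a toy Python example. All household data, poverty lines and the alternative index value are hypothetical; the sketch only shows that the two methods agree when the deflator is derived from the regional poverty lines and can disagree when it is not.

```python
# Hypothetical data: (region, nominal per capita consumption) for six households.
households = [
    ("coast", 90.0), ("coast", 130.0), ("coast", 210.0),
    ("interior", 100.0), ("interior", 125.0), ("interior", 260.0),
]

# Hypothetical regional CBN poverty lines; the coast is the base region.
z = {"coast": 100.0, "interior": 120.0}
base = "coast"

# Spatial price index implied by the poverty lines.
line_index = {region: z[region] / z[base] for region in z}


def headcount_regional_lines(data, lines):
    """Method (i): nominal consumption against region-specific lines."""
    poor = sum(1 for region, x in data if x < lines[region])
    return poor / len(data)


def headcount_deflated(data, price_index, single_line):
    """Method (ii): deflated consumption against a single line."""
    poor = sum(1 for region, x in data if x / price_index[region] < single_line)
    return poor / len(data)


h_lines = headcount_regional_lines(households, z)
h_deflated = headcount_deflated(households, line_index, z[base])

# A deflator from some other source (hypothetical value) that implies a
# different regional cost-of-living gap than the poverty lines do.
other_index = {"coast": 1.0, "interior": 1.3}
h_other = headcount_deflated(households, other_index, z[base])
```

Here `h_lines` and `h_deflated` coincide, while `h_other` classifies one more interior household as poor.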
1.2.1 Spatial price indexes from regional CPIs
In cases where information on price levels across regions is lacking for current periods, analysts
may be tempted to estimate these regional price levels at a given point in time by applying a
local consumer price index (CPI) to some base period when cross-sectional price levels were
known (or else were assumed to be equal). For instance, a baseline household survey may enable
poverty lines and other deflators to be estimated for each region while subsequent surveys lack a
price collection module (or lack quantity information to derive price movements from unit
values). But if there is a CPI available for each region (or for key cities within or near to each
region) this might be used by a poverty analyst to estimate current price levels across regions.6
6
This approach is also common outside of poverty analysis and even in rich countries with extensive data. For
example, Hamilton (2001) estimates food Engel curves for the US over several years and in lieu of a variable
measuring regional price levels he uses the CPI for each of 25 cities. Cross-country tests of Purchasing Power Parity
also rely on approximating local price levels by local CPIs.
The available evidence suggests that such a procedure is biased. It fails to take into account the
inconsistency between price levels involved in comparisons across space and time. The most
extensive empirical evidence on the bias involved in this procedure is from Gluschenko (2006)
for the case of Russia. Gluschenko uses data from 1997 and 1998 to consider two methods of
measuring the relative price level in location r with respect to location s at time t:
(i) a direct spatial price index calculated for period t using the local prices for the
same period
(ii) an indirect spatial price index for period t that is extrapolated from a direct spatial
price index for period t0 with local CPIs used to characterize price changes from
t0 to t
In the case of Russia (where biases that may take years to show up elsewhere show up more
quickly because of the rapid inflation) the indirect spatial price index had considerable bias. This
indirect index implied that regional price levels in 1998 varied from 81 percent to 153 percent of
the national price level (a ratio of highest to lowest of 1.9:1). In contrast, the direct spatial price
index only varied from 92 percent to 136 percent of the national price level (a ratio of 1.5:1).
Gluschenko concludes that the CPI-proxied (indirect) price levels cannot be used adequately to
proxy the cross-spatial price levels. The indirect spatial price index is substantially biased and
distorts the cross-spatial comparisons in the sense that it tends to overstate cross-spatial
differences in price levels. In the context of Russia, this implies that except for occasional
periods when a direct spatial price index is able to be calculated (such as in 1997 and 1998) there
is no means to get a precise estimate of real incomes (and hence, of real poverty) across regions
of the country. Since the methods of construction of the Russian CPI are similar to those in other
countries, these pessimistic conclusions may hold more widely.7
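A stylised Python sketch shows how the direct and indirect indexes can come apart. Everything in it is an assumption made for illustration: two goods, two regions, hypothetical prices and baskets, local CPIs computed as Laspeyres indexes on each region's own basket, and the direct spatial index computed on region s's basket. It does not reproduce Gluschenko's data or the direction of the bias he reports; it shows only that the two calculations need not agree.

```python
def basket_cost(quantities, prices):
    return sum(quantities[j] * prices[j] for j in quantities)


def laspeyres(quantities, p_base, p_other):
    """Cost of a fixed basket at p_other relative to its cost at p_base."""
    return basket_cost(quantities, p_other) / basket_cost(quantities, p_base)


# Hypothetical prices by (region, period).
p = {
    ("s", "t0"): {"food": 1.0, "fuel": 1.0},
    ("s", "t"): {"food": 1.2, "fuel": 2.0},
    ("r", "t0"): {"food": 1.1, "fuel": 1.5},
    ("r", "t"): {"food": 1.32, "fuel": 3.0},
}

# Hypothetical base-period baskets; region r buys less of the dearer fuel.
q = {"s": {"food": 8.0, "fuel": 2.0}, "r": {"food": 9.0, "fuel": 1.0}}

# Local CPIs: each region's own basket priced at t relative to t0.
cpi = {reg: laspeyres(q[reg], p[(reg, "t0")], p[(reg, "t")]) for reg in q}

# Direct spatial indexes: price level in r relative to s, within each period.
direct_t0 = laspeyres(q["s"], p[("s", "t0")], p[("r", "t0")])
direct_t = laspeyres(q["s"], p[("s", "t")], p[("r", "t")])

# Indirect index: carry the t0 spatial index forward with the local CPIs.
indirect_t = direct_t0 * cpi["r"] / cpi["s"]
```

In this example the direct index for period t is about 1.22 while the CPI-extrapolated index is about 1.13, even though every individual price relative between the two regions is unchanged between t0 and t.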
The bias that Gluschenko (2006) reports is likely to become more important in the future, as the
demand to combine regional and inter-temporal price indexes rises with the growing availability
of nominal data on living standards across time and space. However, this task is much more
complicated than it appears. The conceptual problems are discussed below but examples from
two developed countries, New Zealand and the UK, may help to reinforce the message. The
Consumer Price Index in both countries is constructed according to best practice, especially
because it is the indicator used for an explicit inflation target of the central banks. In New
Zealand it is calculated and reported for 15 regions. A recent public review of the NZ CPI
highlighted a user demand for the statistics agency to produce spatial comparisons of the cost of
living in different cities and regions. The statistics office emphasized that this could not be
produced from the current CPI and that additional resources would be needed to produce these
additional cost of living measures.8 In the UK, a similar demand for data on spatial price levels
led the Office for National Statistics to field a new survey of 380 goods across 65 towns in
2000 (Ball and Fenwick, 2004), since it was not possible to estimate these spatial price levels
from the existing CPI. If technically sophisticated statistics offices in developed countries that
place great public policy weight on the CPI cannot extract regional price levels from a CPI, it is
rather optimistic to expect a poverty analyst working on developing country data to do so.
7
Indeed, the Russian CPI could be considered ‘best practice’ in the sense that the expenditure weights are updated
for each of 89 regions every year (based on the results of the previous year’s Household Budget Survey that surveys
49,000 households every quarter) and prices are gathered each month from 30,000 outlets for 400 representative
goods and services in 350 towns and cities.
8
Statistics New Zealand (2005) Report of the Consumers Price Index Revision Advisory Committee, 2004.
In terms of the conceptual problems, Hill (2004) suggests that it may in general be impossible to
construct panel price indexes that are unbiased across both space and time. Bilateral formulae,
such as those presented in equations (2)-(4), are unlikely to give transitive results when extended
to a multilateral situation. For example, consider a country where a price index is calculated for
three regions: the capital city PCC, other urban areas, POA, and rural areas, PR with base weights
that differ in each region. A direct comparison between the rural price level in period t2 and the
capital city price level in period t0 (say, the base period for the poverty line) will not give the
same result as constructing an indirect comparison via the third region in an intermediate time
period, t1:
$$P_{R2,CC0} \neq P_{R2,OA1} \times P_{OA1,CC0}$$
This lack of transitivity is partly due to different consumption patterns causing the weights
attached to each commodity to vary across regions. In contrast, a multilateral index is transitive
by construction and can be expressed as:
$$P_{R2,CC0} = \frac{P_{CC0}}{P_{R2}}$$
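The failure of transitivity, and one way of restoring it, can be illustrated in Python. The prices and baskets for the three region-period combinations are hypothetical. The bilateral index used is the Laspeyres formula of equation (2), and the `eks` function is a sketch of an EKS-type calculation (a geometric mean of indirect Fisher comparisons through every unit), which is one of the multilateral approaches listed in the text that follows; the variant chosen here is an assumption.

```python
import math


def cost(q, p):
    return sum(q[j] * p[j] for j in q)


def laspeyres(base, other):
    """Price level of `other` relative to `base`, using the base basket."""
    return cost(base["q"], other["p"]) / cost(base["q"], base["p"])


def fisher(base, other):
    paasche = cost(other["q"], other["p"]) / cost(other["q"], base["p"])
    return math.sqrt(laspeyres(base, other) * paasche)


def eks(base, other, all_units):
    """EKS-type index of `other` relative to `base`: the geometric mean of
    the indirect Fisher comparisons made through every unit in turn."""
    product = 1.0
    for link in all_units:
        product *= fisher(link, other) / fisher(link, base)
    return product ** (1.0 / len(all_units))


# Hypothetical prices (p) and baskets (q) for each region-period.
cc0 = {"p": {"a": 1.0, "b": 1.0}, "q": {"a": 5.0, "b": 5.0}}  # capital, t0
oa1 = {"p": {"a": 2.0, "b": 1.0}, "q": {"a": 2.0, "b": 8.0}}  # other urban, t1
r2 = {"p": {"a": 2.0, "b": 3.0}, "q": {"a": 6.0, "b": 2.0}}   # rural, t2
units = [cc0, oa1, r2]

direct = laspeyres(cc0, r2)
chained = laspeyres(oa1, r2) * laspeyres(cc0, oa1)

eks_direct = eks(cc0, r2, units)
eks_chained = eks(oa1, r2, units) * eks(cc0, oa1, units)
```

With these numbers the direct Laspeyres comparison gives 2.5 but the chained comparison gives 3.5, whereas the EKS-type index gives the same answer either way.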
The most common of the multilateral index methods are (i) average price methods, such as the
Geary-Khamis (GK) method that underlies the Penn World Table, (ii) EKS (Eltetö, Köves and
Szulc) type methods, (iii) spanning-tree methods and (iv) the Weighted Country-Product
Dummy (WCPD) method.9 The basic idea behind WCPD, as used in the cross-country literature,
is that the observed price of commodity n in country k and period t, $p_{kt}^{n}$, is assumed to be the product
of the PPP price index for the country, $P_{kt}$, the price level of commodity n, $p_{t}^{n}$, which is a
country-invariant component, and an error term $\upsilon_{kt}^{n}$. In log form this can be expressed as:
$$\ln p_{kt}^{n} = \ln P_{kt} + \ln p_{t}^{n} + \ln \upsilon_{kt}^{n} = \pi_{kt} + \theta_{t}^{n} + \varepsilon_{kt}^{n} \qquad (5)$$
If observations are weighted by the expenditure share for each commodity in each country and
the parameters of the following regression are estimated by weighted least squares:
$$\ln p_{kt}^{n} = \sum_{j=1}^{K} \pi_{jt} C_{jt} + \sum_{i=1}^{N} \theta_{t}^{i} G_{t}^{i} + \varepsilon_{kt}^{n} \qquad (6)$$
where $C_{jt}$ and $G_{t}^{i}$ are the country and commodity dummy variables respectively, then the price
indexes are obtained by exponentiating the parameter estimates on the country dummies.
Advantages of this method are that since it is based on a regression there are standard errors (at
least of the logarithms of the price indexes) and it can also be used when there are gaps in the
data.
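A weighted country-product dummy regression of this kind can be sketched in plain Python for a single period. Several things here are assumptions made for the illustration: the data are hypothetical and constructed without error (so the regression recovers the true price levels exactly, which real data would not); the base region's coefficient is normalised to zero to avoid collinearity among the dummies; and standard errors are not computed.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wcpd(observations, base_region):
    """observations: (region, commodity, price, expenditure_share) tuples.
    Returns price indexes relative to base_region from weighted least squares
    of log price on region and commodity dummies."""
    regions = sorted({o[0] for o in observations if o[0] != base_region})
    goods = sorted({o[1] for o in observations})
    names = [("region", r) for r in regions] + [("good", g) for g in goods]
    k = len(names)
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for region, good, price, weight in observations:
        row = [1.0 if nm in (("region", region), ("good", good)) else 0.0
               for nm in names]
        y = math.log(price)
        for i in range(k):
            xtwy[i] += weight * row[i] * y
            for j in range(k):
                xtwx[i][j] += weight * row[i] * row[j]
    coef = solve(xtwx, xtwy)
    index = {base_region: 1.0}
    for (kind, label), b in zip(names, coef):
        if kind == "region":
            index[label] = math.exp(b)
    return index


# Hypothetical data: true regional price levels times commodity price levels.
true_level = {"A": 1.0, "B": 1.2, "C": 0.9}
good_price = {"rice": 2.0, "fish": 5.0, "oil": 3.0}
shares = {"rice": 0.5, "fish": 0.3, "oil": 0.2}
observations = [
    (r, g, true_level[r] * good_price[g], shares[g])
    for r in true_level for g in good_price
    if (r, g) != ("C", "oil")  # a gap: oil is not priced in region C
]

index = wcpd(observations, base_region="A")
```

The missing oil price in region C does not prevent estimation, which illustrates the point about gaps in the data.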
9
Average price methods compare each country (or region) with an artificially constructed average country (or
region). They mostly use the Paasche price index formula (including the Geary-Khamis) to make each of these
bilateral comparisons, with the artificial country as the base, and tend to suffer from substitution bias because the
price vector of the base artificial country is not equally representative of the prices faced by all of the countries in
the comparison. EKS methods impose transitivity in the following way: first, make bilateral comparisons between
all possible pairs of countries, then take the nth root of the product of all possible Fisher indices between n countries.
A spanning tree is a connected graph that does not contain any cycle (i.e. any pair of vertices in the graph are
connected by one and only one path of edges) in order to ensure that price indexes are internally consistent. Thus, a
multilateral comparison among K countries can be made by chaining together K-1 bilateral comparisons as long as
the underlying graph is a spanning tree. The Weighted Country Product Dummy method is explained in detail in the
text below.
These multilateral index methods are widely used in the cross-country literature for calculating
Purchasing Power Parity (PPP) exchange rates but very rarely used for multilateral regional and
temporal comparisons within countries. One of the few published examples of applying these
methods to household survey data to estimate multiregional consumer price index numbers is
Coondoo, Majumder and Ray (2004), who adapt the Country-Product Dummy method to use
unit value data from the National Sample Survey in India.
But even in the cross-country literature where these multilateral indexes are widely used there is
controversy about their interpretation and robustness. Hill (2006) shows how it is possible to use
Penn World Table data to support either convergence or divergence, depending on which
multilateral index is used to compute the per capita income benchmarks. Ackland, Dowrick and
Freyens (2006) find that using the EKS method to calculate real income in the Penn World Table
raises the global count of people below the PPP$1 per day poverty line by nearly 60 percent,
compared with using the standard PWT data that rely on the GK method.
Moreover, despite these multilateral indices satisfying transitivity there are other desirable
criteria that they fail to meet. These criteria include: temporal fixity – the results for an existing
time series should be unaffected by the inclusion of a new time period, spatial fixity – the results
for an existing set of countries (or regions) are unaffected by the inclusion of other countries,
temporal consistency – temporal results for each country (region) do not depend on the other
countries (regions) in the comparison, and spatial consistency – spatial results do not depend on
the other years in the comparison. In general it is not possible to maintain both temporal and
spatial consistency and achieve transitivity. Consequently analysts have to weigh up which
violations are least problematic in the particular application they have in mind. Hill (2004)
suggests that in many settings (but he does not explicitly consider poverty measurement) most
importance will attach to maintaining temporal fixity and consistency.
Thus, to conclude this section, poverty analysts should treat the calculation of spatial price levels
or indexes from regional CPIs with considerable skepticism. They should also be aware that
multilateral indexes have weaknesses of their own. The sensitivity of cross-country poverty estimates to
the particular multilateral index used suggests there may be a similar sensitivity of regional
poverty estimates if multilateral indexes are used to calculate regional PPPs. Amongst the class
of multilateral indexes, those based on the Weighted Country-Product Dummy approach show the
greatest scope for adapting to household- and regional-level data.
1.3 Using prices to impute the value of consumption
Self-produced items, and especially food, are a major component of consumption in rural areas
of many developing countries. The monetary values placed on these self-produced items in
surveys are often the values that respondents themselves suggest. It is difficult to know how
reliable these respondent-reported values are. Many households who produce a food do not buy
that same food, so they may not be well informed about prices when they assign a value to their
own food production. Moreover, the items available for sale in markets may be of a different
quality than their own production so even if they are aware of prices in the market they may not
be able to accurately impute a value for their own production. These problems can be particularly
acute if a comprehensive measure of consumption is used that attempts to value some of the
services provided by the environment (e.g., firewood and other bush materials are often gathered
but rarely sold in rural areas, so valuing these products can be particularly difficult).
There are two concerns about relying on respondent-reported values for self-production. First,
they introduce an additional, and extraneous, source of inequality into measured consumption
because they will vary across respondents who are in the same location and face the same prices.
If the poverty line is below the mode of the welfare indicator, this increase in measured
inequality will raise the measured poverty rate (see Ravallion, 1988 for a formal treatment).
Intuitively, a household might fall below the poverty line just by being too pessimistic when
valuing their own food production because they think prices are lower than they truly are.
Second, the values applied to self-produced food items could differ, systematically, from market
prices. Such discrepancies could drive a wedge between the market prices used to form a CBN
food poverty line and the respondent-reported values used to form estimates of consumption. If
respondents tend to report values for their self-produced foods that are lower than market prices,
estimates of poverty could be inflated, especially in rural areas where subsistence food
production is important.
There are two alternatives to respondent-reported values, as measures of the value of self-produced food items. The first is to value self-produced foods with the average of the implicit
unit values used by other households living in the same cluster (aka Primary Sampling Unit) as
the respondent. These implicit unit values are the ratio of value to quantity reported by each
respondent, and are similar to a price except that they may reflect quality variation and also
measurement error. Replacing respondent-reported values with a cluster average (medians may
be preferred to means, to reduce the effect of measurement error) removes the within-cluster
variability in valuations. However, it does not address any discrepancy between these average
unit values and market prices which may drive a wedge between the prices used for the poverty
line and the implicit prices used when valuing consumption.
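The first alternative can be sketched in Python as follows. The records are hypothetical, all quantities are assumed to be positive, and the median is used as the cluster average, as the text suggests may be preferable.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records of self-produced food:
# (household, cluster, food, quantity in kg, respondent-reported value)
records = [
    ("h1", "c1", "yam", 10.0, 20.0),
    ("h2", "c1", "yam", 5.0, 5.0),   # a pessimistic valuation
    ("h3", "c1", "yam", 8.0, 16.8),
    ("h4", "c2", "yam", 6.0, 18.0),
    ("h5", "c2", "yam", 4.0, 10.0),
]


def cluster_median_unit_values(rows):
    """Median of value/quantity within each (cluster, food) cell."""
    by_cell = defaultdict(list)
    for _, cluster, food, qty, value in rows:
        by_cell[(cluster, food)].append(value / qty)
    return {cell: median(unit_values) for cell, unit_values in by_cell.items()}


def revalue(rows):
    """Value each household's quantities at the cluster median unit value."""
    medians = cluster_median_unit_values(rows)
    totals = defaultdict(float)
    for household, cluster, food, qty, _ in rows:
        totals[household] += qty * medians[(cluster, food)]
    return dict(totals)


revalued = revalue(records)
```

Household h2, which reported a value of 5.0 for 5 kg, is revalued at 10.0 because the median unit value in its cluster is 2.0. The within-cluster variation in valuations disappears, but any gap between cluster unit values and market prices remains.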
The second alternative is to value self-produced foods with the average price that was observed
during the survey in the market closest to the respondent. In the absence of a market price
survey, unit values from the market purchase part of the questionnaire could be used although
these may be subject to quality differences between items that are purchased and items that are
consumed from own-production.
It is notable that both of these alternative ways of valuing self-produced foods switch the
cornerstone of consumption measurements from the respondent reports of values to the survey
estimates of food production quantities. Poverty analysts may be reluctant to place a lot of faith
in quantity measurements depending on the nature of the key food staples (grains are easier to
measure than root crops) and their opinions about the thoroughness of the consumption-from-own-production section of the survey (e.g., did the survey agency attempt to weigh items or else
use validated conversion factors from traditional units? See Capéau and Dercon (2006)). But
unless data on prices in local markets are available it is impossible to know how sensitive the
estimates of consumption and poverty are to the various assumptions made when valuing self-produced items.
The ‘quality elasticity’ is one tool that may be useful for poverty analysts facing these issues.
This can be estimated from a double-log regression of unit values, $v_{i}$, on household total
expenditure $x_{i}$, various demographic controls $z_{i}$, and cluster-level dummy variables, $\delta_{c}$:
$$\ln v_{i} = \alpha + \beta \ln x_{i} + \gamma \cdot z_{i} + \delta_{c} + u_{i} \qquad (7)$$
The estimated β-coefficient shows how unit values change with respect to household total
expenditure, where this change is typically due to an upgrading of quality as households get
richer. While equation (7) is typically used with unit values from market purchases, as part of the
procedures suggested by Deaton (1989; 1997) for stripping quality effects out of unit values
when they are used as proxies for market prices, it could also be applied to the unit values that
are implied by reported values and quantities of self-production.
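A stripped-down version of equation (7) can be estimated in a few lines of Python. This sketch makes two simplifying assumptions that are not in the text: the demographic controls are omitted, and the cluster dummies are handled by demeaning the logged variables within each cluster, which gives the same slope as including the dummies explicitly. The data are hypothetical and constructed so that the true elasticity is 0.1.

```python
import math
from collections import defaultdict


def quality_elasticity(rows):
    """rows: (cluster, unit_value, total_expenditure) tuples.
    Returns the within-cluster OLS slope of ln(unit value) on ln(expenditure)."""
    groups = defaultdict(list)
    for cluster, unit_value, expenditure in rows:
        groups[cluster].append((math.log(expenditure), math.log(unit_value)))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(lx for lx, _ in obs) / len(obs)
        mean_v = sum(lv for _, lv in obs) / len(obs)
        for lx, lv in obs:
            sxy += (lx - mean_x) * (lv - mean_v)
            sxx += (lx - mean_x) ** 2
    return sxy / sxx


# Hypothetical data: the cluster price level differs (2.0 versus 3.0) and
# unit values rise with expenditure with an elasticity of 0.1.
rows = [
    ("c1", 2.0 * 100.0 ** 0.1, 100.0),
    ("c1", 2.0 * 200.0 ** 0.1, 200.0),
    ("c1", 2.0 * 400.0 ** 0.1, 400.0),
    ("c2", 3.0 * 150.0 ** 0.1, 150.0),
    ("c2", 3.0 * 300.0 ** 0.1, 300.0),
]

beta_hat = quality_elasticity(rows)
```

Because the cluster effects absorb differences in local price levels, the estimated slope reflects only how unit values vary with household expenditure among households facing the same prices.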
If estimated quality elasticities are large, it points to either an inherent variability in the
commodity (e.g. tubers are typically less uniform than grains) or else it may reflect the broadness
of the commodity category which allows a lot of within-category substitution as households get
richer. For example, in Indonesia the quality elasticity for the broad category of meat (from
market purchases rather than own-production) is 0.12 but when a finer disaggregation is used the
quality elasticity for beef is only 0.05 and for chicken 0.04 (Olivia and Gibson, 2005). For
commodities that have a high degree of quality variability, the variation in respondent-reported
values may reflect the underlying quality differences rather than measurement error and so there
would be a loss of information if respondent-reported values were replaced with some form of
cluster average.
1.4 Using prices to update the cost of the poverty lines
The cost of the poverty line needs to be recalculated for each year that poverty is being
measured, in order that it refers to the same real standard of living. It is impossible to carry out
this calculation without some price data, but even with data there are a number of issues that
warrant attention.
1.4.1 Using general purpose deflators
The typical approach is to use a general purpose index like the CPI to update a poverty line that
has been estimated for a base period using the procedures outlined in Section 1.1 above. Even
absent the problems of getting spatial and temporal consistency that are discussed in Section
1.2.1 there are other problems with these general purpose deflators. An important practical
concern with this procedure is that the change in the cost of living for the group of households
below the poverty line could be quite different to the change shown by a general purpose price
index. For example, the CPI places greatest weight on the expenditure patterns of households
who are in the upper parts of the income distribution. As a result the measured inflation rate from
a CPI may be different than the inflation rate facing the poor. There are three sources of this
possible difference:
1. the prices for the CPI in many developing countries are collected only from urban areas
and the trend in these may be different than the trend in rural prices, especially if the
price of transport and other marketing services changes rapidly. Moreover, the base
weights for the CPI are also often only for urban households. While using only these
households is an (internally) consistent choice, from the point of view of measuring urban
inflation, it makes the CPI even less relevant as a poverty line deflator when the majority
of the poor are in rural areas. For example, even Indonesia, with one of the most
comprehensive statistical systems in developing countries and a nation-wide consumption
survey fielded every year, carries out its Cost of Living Survey (Survei Biaya Hidup)
which provides the base weights for the CPI only in provincial capitals and other large
cities.10 This may have contributed to the discrepancy in estimates of the poverty increase
in Indonesia during the Asian economic crisis. The change in the poverty line using the
price surveys from the Indonesian Family Life Survey was quite different to the change
calculated from applying the official (urban) inflation rates (Beegle, Frankenberg and
Thomas, 1999).
2. the price trend for the basic necessities consumed by the poor may not be the same as the
trend for items consumed in the upper parts of the income distribution, even if prices
were gathered in the same locations, and
3. within a given category of consumption (say, rice) the particular brands, grades, varieties
and outlets where rich and poor purchase may differ and may have different price trends.
One tool for assessing whether these differences in price trends are likely to be important is the
so-called plutocratic gap (Izquierdo, Ley, and Ruiz-Castillo, 2003). The plutocratic gap is the
difference between inflation measured using the official CPI and inflation measured using an
alternative group index in which all households are weighted equally. To understand this method
it is helpful to recall that official CPI calculations weight each commodity by adding up
expenditure on that particular item across all households, and calculating the ratio of the total
expenditure on the item to the total expenditure on all items. This gives more weight to the rich,
who have more total spending, and hence can be considered a “plutocratic price index” (Prais,
1958). In contrast, another method of calculating the weight for a commodity in the index would
be to first calculate budget shares for each household and then average these budget shares
across all households. This average of shares approach gives every household the same weight
(except for any variation due to household size and sampling weights). Thus it can be considered
a “democratic price index” because a rich household has no more impact on the finally
calculated index than does a poor household. This democratic method is more consistent with the
approach used with CBN poverty lines.
A hypothetical example showing the difference between these two types of averages is presented
in Table 1. There are two households, with one having three times the total spending of the other.
Only two commodities are available to consume: cassava, which is a necessity and beef, which is
a luxury. If the average importance of each commodity is calculated in terms of the shares of
total expenditure (the plutocratic method), the resulting price index would put 25 percent of the
weight on the price of cassava and 75 percent on the price of beef. This is much closer to the
consumption pattern of the rich household than the poor household. But if the democratic
average of shares approach was used the weights would be 30 percent on cassava and 70 percent
on beef which is halfway between the consumption patterns of the two households.
10
It is sometimes (wrongly) asserted that the base weights for the Indonesian CPI come from the national socioeconomic survey (SUSENAS). For example, see Quinn (2004). The SUSENAS has an abbreviated consumption
module every year and a comprehensive consumption module every three years (see Pradhan, 2001 for details) but
neither of these is used in the calculation of the CPI.
Table 1: Example of Two Different Weighting Methods for a Price Index

                     Cassava    Beef    Total Spending    Cassava Share    Beef Share
Poor household       $40        $60     $100              0.40             0.60
Rich household       $60        $240    $300              0.20             0.80
Total                $100       $300
Share of total                                            0.25             0.75
Average of shares                                         0.30             0.70

Source: Author’s example.
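The two weighting methods in Table 1 can be reproduced in a few lines. The sketch below uses the spending figures from the table; the function names and the price changes used to illustrate a plutocratic gap (cassava up 10 percent, beef up 2 percent) are hypothetical and are not taken from any source. Household size and sampling weights are ignored.

```python
def plutocratic_weights(spending):
    """Share of aggregate expenditure: households count in proportion to their spending."""
    items = spending[0].keys()
    grand_total = sum(sum(h.values()) for h in spending)
    return {i: sum(h[i] for h in spending) / grand_total for i in items}

def democratic_weights(spending):
    """Average of household budget shares: every household counts equally."""
    items = spending[0].keys()
    n = len(spending)
    return {i: sum(h[i] / sum(h.values()) for h in spending) / n for i in items}

def index_inflation(weights, price_change):
    """Weighted average of item-level proportional price changes."""
    return sum(weights[i] * price_change[i] for i in weights)

# Spending by the poor and the rich household in Table 1.
households = [{"cassava": 40, "beef": 60}, {"cassava": 60, "beef": 240}]
pluto = plutocratic_weights(households)  # cassava 0.25, beef 0.75
demo = democratic_weights(households)    # cassava 0.30, beef 0.70 (up to rounding)

# Hypothetical price changes, chosen only to illustrate the gap.
price_change = {"cassava": 0.10, "beef": 0.02}
gap = index_inflation(pluto, price_change) - index_inflation(demo, price_change)
print(pluto, demo, round(gap, 4))
```

With these assumed price changes the plutocratic index shows 4.0 percent inflation and the democratic index 4.4 percent, so the gap is negative: the necessity has risen in price faster than the luxury, and the plutocratic weights understate inflation for the typical household.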
Consistent with this hypothetical example, in real-world price indexes the consumer whose
budget corresponds to the weights in a plutocratic CPI is located well into the upper part of the
income distribution. According to calculations by Deaton (1998), in the United States in 1990
this “average consumer” was at the 75th percentile of the distribution of household expenditures.
Deaton suggests that rising inequality would have raised this position still further since then.
Having the “representative consumer” located so far up the income distribution may not have
mattered in the United States in the immediate period prior to 1990 because Deaton suggests that
price movements at the 75th percentile were much the same as for those faced by a median or
poor household. However, this may not be the same in other countries, particularly in poor
countries where relative price shifts can be expected during the structural changes that occur
during development.
The sparse international evidence on the size of the plutocratic gap has recently been
summarized by Ley (2005). The only developing country with estimates is Argentina, where the
plutocratic gap varied in sign over time, ranging from -0.48 to +0.65 between 1993-98 (a period
when the official annual inflation rate was between 1.2 and 3.3 percentage points). The fact that
the sign of the gap varies over time does not mean that this issue can be ignored when choosing a
deflator for updating poverty lines under the assumption that the effects cancel over time. For
example, in Spain the plutocratic gap averaged 0.06 percentage points during 1992-97 but the
average absolute gap was 0.09 percentage points, so the sign reversals only removed a small
amount of the effect.
One thing that may be helpful for poverty analysts to consider is the characteristics of settings
where the plutocratic gap is likely to be larger because these will be where the CPI would be an
especially poor deflator for updating poverty lines. Ley (2005) shows that the plutocratic gap
will be larger, the greater the expenditure inequality in the country, the more different are the
consumption patterns across income groups and the larger the variation in inflation rates for
particular consumption items. Hence it is expected to be particularly significant in regions such
as Latin America where inequality is high and where high inflation rates may have allowed more
differentiated price dynamics across commodities.
In addition to the plutocratic gap, a recently developed tool for comparing price changes for the
poor with those indicated by a general purpose deflator like the CPI is the “Price Index for the
Poor” (PIP) developed by Son and Kakwani (2006). This index is based on the following thought
experiment: the actual change in the price vector over time produces a poverty change with both
an income component (if all prices rise by 10% it is equivalent to a 10% fall in real income) and
a relative price component. The relative price component reflects the fact that some prices move
more than others and that some price changes are relatively more important to the poor than are
others. The PIP is designed to measure what the percentage change in overall prices would have
to be, in order to get the same poverty change that actually occurred (which depends on both the
income effect of the change in price and the distribution effect of the price change). The PIP, $\lambda$,
is defined as:
\[ \lambda = \sum_{i=1}^{m} \frac{p_i^*}{p_i} \left( \frac{\eta_{\theta i}}{\eta_\theta} \right) \qquad (8) \]
where $p_i$ is the price of the ith (amongst m total) commodity in the initial period, $p_i^*$ is the price
in the subsequent period, $\eta_{\theta i}$ is the elasticity of the poverty measure $\theta$ (which is any member of
the additive separable class, including the Foster-Greer-Thorbecke measures) with respect to the
price of the ith commodity and $\eta_\theta$ is the elasticity of the poverty measure if all prices change by
one percent, which Son and Kakwani call the “total poverty elasticity”:
\[ \eta_\theta = -\frac{1}{\theta} \int_0^z \frac{\partial P}{\partial x} \, x \, f(x) \, dx \qquad (9) \]
where, for the headcount index $H$, $\eta_H = z f(z) / H$, with $z$ the poverty line and $f(x)$ the
density of income.
However, $\eta_\theta$ is simply the negative of the elasticity of the poverty measure with respect to mean
income (aka the growth elasticity of poverty) since a one percent rise in all prices is
equivalent to nominal income falling by one percent. For the change in the price of a single item,
the income effect of that price change on poverty is just $w_i \eta_\theta$, where $w_i$ is the mean budget
share for the ith commodity. In contrast, it is the share of the ith commodity at the poverty line,
$w_i(z)$, that matters for the overall elasticity of the poverty measure with respect to the price of
the ith commodity. For example, for the headcount index,
\[ \eta_{H i} = \frac{\partial H}{\partial p_i} \frac{p_i}{H} = \frac{z f(z) w_i(z)}{H} \qquad (10) \]
Thus, the PIP is essentially a more elaborate way of contrasting price changes based on the
importance of each item at the poverty line and at the mean, and transforming this into an
interpretable magnitude – what is the equivalent equally distributed price change that would
produce the poverty change that actually occurred. In the case of Brazil, prices rose by
59.9 percent between 1999 and 2005 according to a Laspeyres Index with average budget shares
as the weights. But the Price Index for the Poor, based on equation (8), household survey data
and a set of prices for almost 500 items gathered in 12 regions of Brazil, rose by between
63.8 and 64.4 percent, depending on whether the Headcount, Poverty Gap or Poverty Severity
Index is used. Therefore price changes in Brazil appear to have favoured the non-poor over the
poor during this period.11
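For the headcount index the PIP takes a particularly simple form. Dividing equation (10) by the headcount version of the total poverty elasticity, z f(z)/H, leaves just the budget share at the poverty line, so equation (8) becomes a weighted average of price relatives with the shares at the poverty line as weights. The sketch below computes this and compares it with a Laspeyres index that uses mean budget shares. All numbers and names are hypothetical and serve only to show the mechanics; the shares at the poverty line are assumed to sum to one.

```python
def pip_headcount(price_relatives, shares_at_line):
    """PIP for the headcount index: sum over items of w_i(z) * (p_i* / p_i)."""
    if abs(sum(shares_at_line.values()) - 1.0) > 1e-9:
        raise ValueError("budget shares at the poverty line must sum to one")
    return sum(shares_at_line[i] * price_relatives[i] for i in shares_at_line)

def laspeyres(price_relatives, mean_shares):
    """Laspeyres index with mean budget shares as the weights."""
    return sum(mean_shares[i] * price_relatives[i] for i in mean_shares)

# Hypothetical inputs: cassava prices rise 20 percent, beef prices 5 percent.
price_relatives = {"cassava": 1.20, "beef": 1.05}
shares_at_line = {"cassava": 0.40, "beef": 0.60}  # assumed shares for households at z
mean_shares = {"cassava": 0.30, "beef": 0.70}     # assumed average shares

pip = pip_headcount(price_relatives, shares_at_line)  # about 1.110
general = laspeyres(price_relatives, mean_shares)     # about 1.095
print(round(pip, 3), round(general, 3))
```

In this made-up case the PIP rises by 11.0 percent against 9.5 percent for the general index, the same direction of difference as in the Brazilian example. For the poverty gap and poverty severity measures the weights involve the shares of all households below the line rather than only those at it, so this shortcut applies to the headcount index only.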
11
Another factor to note is that the response of poverty to price changes in this framework of Son and Kakwani is
based only on first order effects, without allowing for consumers to rearrange their budgets as relative prices change.
Welfare estimates based on first-order effects proved to be almost two times as large as those that allowed for
substitution responses in the Indonesian crisis (Friedman and Levinsohn, 2002). Analysts wishing to incorporate
In light of the above discussion of several different ways of highlighting weaknesses in the CPI
as a measure of price changes for the poor, poverty analysts should where possible use price
indexes calculated specifically for lower income groups. Examples include the CPI for
agricultural laborers in India. If such indexes are not already calculated by statistics agencies, this
provides a further reason for local prices to be collected during poverty-focused household
surveys, so that poverty analysts may calculate these deflators themselves.
An additional reason for considerable caution in using a published CPI for updating poverty lines
is the issue of “CPI bias”. It is well known that the CPI is a biased measure of changes in the cost
of living due not only to the substitution bias discussed above in the spatial context (Section 1.2)
but also due to outlet bias (shoppers responding to lower prices by switching outlets while price
surveyors do not), and an inability to deal properly with quality change and new goods. Recently,
a practical method based on the estimation of food Engel curves has been developed for
measuring and correcting this CPI bias and has been applied in several developed and developing
countries. This method just requires repeated cross-sections of a household survey with total
consumption and food consumption measured consistently over time. In the United States, this
method estimates a CPI-bias of roughly one percentage point per year over the 1980s (Hamilton,
2001), which is an estimate very close to that derived by a completely different, and more
laborious, method used by the Boskin Commission. In Canada the bias was just over one
percentage point per year for 1978-2000 (Beatty and Larsen, 2005). In Brazil the CPI bias is
estimated as three percentage points per year over 1987/8 to 2002/3 (Filho and Chamon, 2006)
while in Russia it is estimated as one percentage point per month from 1994-2001 (Gibson,
Stillman and Le, 2004). Moreover, in the Brazilian case, the CPI bias appears to be larger for the
poor, making the CPI a particularly unreliable index for updating poverty lines that attempt to
hold an absolute standard of living constant.
1.4.2 Recalculating CBN poverty lines each year
One seemingly attractive alternative to relying on general purpose deflators is to simply repeat
the calculations of a CBN poverty line for each year that survey data are available and poverty
estimates required. However, there is a conceptual problem with this approach. While it is
possible to re-price the same basket of foods that was identified in the baseline period, there is no
easy approach to updating the non-food allowance. Recall from equation (1) that an Engel curve
is estimated to calculate the non-food allowance because of two problems:
1. it is hard to get agreement on what to include in the basket of non-foods, compared with
using nutritional norms to anchor the food basket. The Engel curve approach gets around
this problem by letting the revealed choices of poor households determine the amount
(but not the composition) of the non-food allowance, and
2. prices for non-food items are less easily available than they are for foods. Only food
prices are needed to calculate the Engel curve in equation (1).
However, the calculated non-food allowance has both price and quantity components and because
these are jointly estimated it is not possible to hold the quantities constant when repeating this
calculation in subsequent years. Thus, repeating equation (1) does not hold real living standards
constant because we cannot rule out quantity changes, which denote changes in real living
standards.
11 (continued)
these second order effects need either a matrix of demand responses or they can estimate a utility consistent demand
model to estimate an ‘equivalent income’ concept (Ravallion, 1992).
An example of the approach of recalculating the CBN poverty line in each year is provided by
Meng, Gregory and Wang (2005) who calculate poverty lines for urban China for each year from
1986-2000. These authors argue that such recalculation is required because of the rapid changes
in the availability of goods, changes in the provision of subsidized services and divergences in
the prices of key commodities consumed by the rich and poor. However, these problems could
be dealt with by using a more appropriate deflator to adjust the baseline estimate of the non-food
allowance for price changes only, keeping the implicit quantity of non-foods the same over time.
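The alternative suggested here, re-pricing the fixed food basket and adjusting the baseline non-food allowance for price changes only, amounts to simple arithmetic. The following is a minimal sketch under stated assumptions: the basket quantities, prices, baseline allowance and the non-food price index are all invented for illustration, and the non-food index is assumed to be a ratio of current to base-period prices for the non-foods that matter to the poor.

```python
def updated_poverty_line(food_quantities, food_prices_now,
                         nonfood_allowance_base, nonfood_price_index):
    """Re-price a fixed food basket and deflate the baseline non-food allowance.

    Quantities are held at their baseline values, so only prices change.
    """
    food_line = sum(q * food_prices_now[item] for item, q in food_quantities.items())
    nonfood_allowance = nonfood_allowance_base * nonfood_price_index
    return food_line + nonfood_allowance

# Hypothetical baseline basket (kg per person per day) and current prices per kg.
food_quantities = {"rice": 0.4, "cassava": 0.3}
food_prices_now = {"rice": 2.5, "cassava": 1.0}

line_now = updated_poverty_line(
    food_quantities,
    food_prices_now,
    nonfood_allowance_base=0.5,  # baseline non-food allowance, base-period prices
    nonfood_price_index=1.2,     # assumed 20 percent rise in non-food prices
)
print(round(line_now, 2))  # food 1.30 + non-food 0.60
```

Because neither the food quantities nor the implicit non-food quantities change, any movement in the line reflects prices alone, which is the property that re-estimating the Engel curve each year cannot guarantee.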
The one situation where it is appropriate to recalculate an Engel curve for updating the non-food
allowance is when measured consumption changes its composition and coverage between
surveys. For example, one survey may have “rice” as an item, but this is broken down in a
subsequent survey into “basmati rice” and “plain rice”. This greater detail would be expected to
raise measured consumption because it prompts respondents to remember some expenditure that
they would otherwise forget. In cases such as this, Lanjouw and Lanjouw (2001) show that the
bundle of foods in the poverty line should be recalculated, restricting attention just to the subset
of items that are common to both surveys, yielding an abbreviated food poverty line zF,subset. This
line, which is abbreviated because it excludes items whose definition changed between surveys,
is then scaled up to provide a total poverty line. The appropriate non-food allowance for the
scaling up is based on the “upper poverty line” $z^U = z^{F,subset} / w^U$ rather than having the allowance
calculated directly from equation (1). Moreover, in these circumstances, only the headcount
measure of poverty maintains its comparability across the two surveys.
Section 2: No Previous Poverty Lines and Survey Fieldwork Not Yet Complete
This section is for those poverty analysts who are in the (increasingly rare) position of working
in a country with no previous poverty line and where there has been sufficient forethought to
involve the poverty analyst before the survey fieldwork is complete. Obviously there are
difficulties if the survey interviewing has already begun but even in that case some amendments
can possibly be made in the field.12 The disruption that this causes could still be worthwhile in
order to get good price data.
There are three key questions that the poverty analyst and the survey agency should consider:
1. How many prices to collect, in terms of the number of items and the number of
individual price observations per item,
2. Where to collect prices, and at what geographical scale to calculate and report any
resulting price aggregations such as a spatial price index or food poverty line, and
3. How to collect the price information, in terms of the following four choices:
a. Price surveys in community markets, such as those typically done by LSMS
surveys,
b. Unit values (that is, the ratio of expenditure to quantity) coming typically from a
consumption recall but potentially also from individual transaction records in
expenditure diaries,
c. Surveys of opinions about prices from either sampled households or community
leaders, and
d. Existing price collection efforts, as might already be occurring for a Consumer
Price Index or some sort of rural index like a Farm Cost Index.
2.1 How many prices to collect?
The number of items whose prices are needed depends partly on the nature of the consumption
module of the survey. If it is an LSMS style survey the consumption recall is likely to have less
than 50 categories of food and less than 100 consumption categories in total. In this case, if there
is a separate price survey it is sensible to try to obtain the price for at least one item per food
category.13 This matching is especially needed if quantity data are not collected in the
consumption recall; otherwise there is no way to derive the required consumption quantities from
those food expenditure categories with no matching price. For key foods such as rice and other
dominant staples, price surveys often include several specifications (such as high and low
quality) although it is not clear how this can help with the calculation of the food poverty line
when the expenditure or quantity information is only available for a broader aggregate.
12
For example, midway through the fieldwork in the 1996 Papua New Guinea Household Survey it became
apparent that the market price surveys did not cover all items in the food consumption bundle and hence part of the
food poverty line basket would have been unpriced. Moreover, some interview teams were less diligent at gathering
market prices when produce markets and tradestores were some distance from the selected village (because of the
time needed to walk to the markets for the price survey). So additional staff were employed to gather prices for the
unpriced commodities and from villages with missing data. While this was more expensive than having the original
survey teams gather these data, it was felt to be worthwhile because of the complete absence of other information
that could serve as a proxy for these missing prices.
13
If unit values are used there will automatically be a matching between the commodities with expenditure and
quantity information and those with “prices”.
If the survey uses more consumption recall categories, as would typically occur with a
Household Income and Expenditure Survey (HIES) or a Household Budget Survey (HBS), then
prices should only be collected for foods that are going to make a ‘significant’ contribution to the
food poverty line. A similar recommendation holds for surveys that use expenditure diaries,
because the amount of commodity detail that these allow is almost limitless (e.g., such surveys
typically use a 4-digit coding scheme, so could have several hundred codes for food items). In
these cases it is decisions about which prices to collect which ultimately shape the degree of
detail in the poverty line basket of foods.
One useful tool in this regard is the concentration curve. If previous survey information on food
consumption is available, this curve could be constructed for the foods that could potentially be
included in the poverty line basket. After ranking foods according to their importance the
concentration curve plots the cumulative contribution to either the total cost or the total calorie
content of the poverty line basket. Figure 1 presents an example from Cambodia, where the
initial poverty line was calculated from a 1993/94 survey that had 155 separate food items. This
detailed food basket was never fully priced in subsequent surveys, which only gathered data on
the prices of about 30 foods. In fact this more abbreviated level of price detail was about the
appropriate level of detail for the poverty line food basket. According to Figure 1, a basket with
just the 20 most important foods would give 73 percent of the total cost and 85 percent of
the total calories in the 155-item food poverty line. A basket with 35 items would give 86
percent of the total cost and 94 percent of the calories of the 155-item basket. But because the
initial food poverty line in Cambodia had been too detailed, all subsequent updates of that food
poverty line relied partly on assumptions about price trends for the items that the new surveys had
not collected information on. This was an unnecessary source of ambiguity.
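Figure 1: Concentration curves for poverty line food basket
[Figure: cumulative percentage (0 to 100) of total cost and of total calories, plotted against the
number of food items (0 to 150), with separate curves for cost and for calories.]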
While it would not be possible to exactly replicate Figure 1 if there is no previous poverty line,
there is likely to be either nutritional or budget information on the importance of various foods.
Thus an approximate concentration curve could be constructed to guide the specification of the
food price collection effort.
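A concentration curve of this kind is straightforward to compute once each food's contribution to the basket is known. The sketch below ranks foods by their contribution and accumulates the percentages; the five foods and their cost contributions are invented for the example, and the same function could be applied to calorie contributions instead.

```python
def concentration_curve(contributions):
    """Rank items by contribution and return (item, cumulative percent) pairs."""
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    curve, running = [], 0.0
    for item, value in ranked:
        running += value
        curve.append((item, 100.0 * running / total))
    return curve

def items_needed(curve, target_pct):
    """Smallest number of top-ranked items whose cumulative share reaches the target."""
    for k, (_, cumulative) in enumerate(curve, start=1):
        if cumulative >= target_pct:
            return k
    return len(curve)

# Hypothetical cost contributions of each food to a poverty line basket.
cost = {"rice": 50, "fish": 20, "vegetables": 15, "oil": 10, "salt": 5}
curve = concentration_curve(cost)
print(curve)
print(items_needed(curve, 80))  # foods needed to cover 80 percent of basket cost
```

Reading off how many foods are needed to reach a chosen share of cost or calories gives a defensible cut-off for the price collection list before fieldwork begins.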
In addition to food prices, the prices of key non-foods should also be collected. Even though
these are not needed when the CBN method is used to scale the food poverty line up to the total
poverty line (see equation (1)) they are useful for at least two other purposes. First, some
countries have traditionally based the non-food allowance in the poverty line on the prices of a
select group of non-foods that experts have identified as constituting basic needs. This method
was especially common in the former Soviet Union. Collecting prices for these items enables
some sensitivity analysis by testing that style of poverty line against the CBN line (and may also
help in discussions of the social acceptability of the CBN line). Second, the prices of these non-foods can be used for analytical studies that either look at causes rather than measurement of
poverty or else that consider the incidence of social spending. For example, fuel subsidies are
important in many countries such as Indonesia so it is necessary to have good estimates of price
elasticities of demand to assess their efficiency impacts. In countries with considerable spatial
price variation (because of poor infrastructure, difficult topography etc) these elasticities can be
estimated cross-sectionally, if the survey has collected the required price data.
How many price observations per item
If prices are obtained from a market price survey, there is a choice of how many observations to
make on the price of each item. The standard in most LSMS surveys is three observations per
village (that is, per cluster). It is not clear if a fixed number of observations per item is the best
approach, although it does have the advantage of simplicity. A CBN food poverty line is a
statistic (essentially a weighted average of a set of average prices) although it is rare to see
standard errors reported for poverty lines. This statistic would be more precisely estimated if the
prices for the items contributing the most weight (e.g., rice) were based on larger samples than
the samples used to measure the price for minor items.
The variability across time and space should also be considered when deciding how many
observations to take on the price of each item. Some items may be subject to price controls (for
example, fuels) so the same price might be observed over all outlets and across short time spans.
Other items, and particularly informally marketed foods, may have prices that vary from day to
day and from seller to seller, so more observations are required to precisely measure the prices
for such items. Some surveys have visited fresh produce markets on two separate days to capture
this effect.
Some consideration of the ‘lumpiness’ of the product may also help to inform decisions about
the optimal number of observations on market prices. Root crops are lumpier than grains and
hence the prices observed in a market are likely to be more variable, especially when they are
sold in piles or bundles and where there is no splitting of individual tubers. The greater
variability in root crop prices suggests that more observations should be taken of their market
prices than for grains, in order to get an equally reliable measure of the mean price.
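The point that heavily weighted and highly variable items deserve more price observations can be made concrete. If the food poverty line is treated as a sum of basket quantities times mean prices, and price observations are assumed independent, its sampling variance is the sum over items of (quantity × price standard deviation)² divided by the number of observations. For a fixed total number of observations this is minimised by allocating observations in proportion to quantity × standard deviation. The sketch below compares that allocation with a fixed three observations per item; the quantities and standard deviations are invented, and the resulting allocations are real-valued and would need rounding in practice.

```python
def line_variance(quantities, price_sd, n_obs):
    """Sampling variance of sum_i q_i * mean_price_i with independent observations."""
    return sum((quantities[i] * price_sd[i]) ** 2 / n_obs[i] for i in quantities)

def proportional_allocation(quantities, price_sd, total_obs):
    """Allocate observations in proportion to q_i * sd_i (variance-minimising)."""
    score = {i: quantities[i] * price_sd[i] for i in quantities}
    total_score = sum(score.values())
    return {i: total_obs * score[i] / total_score for i in score}

# Hypothetical basket quantities and standard deviations of observed prices.
quantities = {"rice": 0.4, "sweet potato": 0.3, "salt": 0.01}
price_sd = {"rice": 0.2, "sweet potato": 0.5, "salt": 0.05}

equal = {item: 3 for item in quantities}  # fixed three observations per item
better = proportional_allocation(quantities, price_sd, total_obs=9)

var_equal = line_variance(quantities, price_sd, equal)
var_better = line_variance(quantities, price_sd, better)
print(better)
print(var_equal, var_better)
```

In this example almost no observations are needed for the minor item, while the variable root crop receives more than the staple grain even though its basket weight is lower.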
Some evidence for this effect comes from market price surveys in Papua New Guinea which
looked at intra-seller price variation. Specifically, enumerators selected the seller in each market
with the largest number of piles on display and then weighed all piles that were offered at the
most common price (e.g., 10-cent piles, one-dollar piles etc). This is a setting where haggling is
not the norm (Gibson and Rozelle, 2005) so the posted prices should measure the effective prices
paid by consumers. On average, the coefficient of variation for piles offered by the same seller
(and at the same listed price) was 0.20 for taro, 0.18 for sweet potato and 0.14 for cassava. At
least some of this variability is due to the lumpiness of these foods, because the piles typically
have only a few tubers so it is difficult for a seller to exactly equalize the weight of each pile if
no tuber is to be split. This implicitly makes it difficult to equalize price across the piles offered
by the seller because prices are posted at only certain values (typically 10, 20, 50, 100, and 200).
The only product in these markets that approximates a grain is sago, which is a starchy food
made from the pith of a palm tree. Sago is sold in bundles of various weights, which can be
adjusted, unlike the size of an individual root crop tuber. The average coefficient of variation for
the sago bundles was considerably less than for the root crops, at only 0.09. It would be useful to
have evidence on the intra-market and intra-seller variation in prices from other settings to help
assess the likely reliability of mean prices calculated from only a few observations in each
market.
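One way to read these coefficients of variation is in terms of the precision of a mean price. If observations are treated as independent draws, the relative standard error of a mean based on n observations is the coefficient of variation divided by the square root of n. The sketch below applies this to the figures quoted above. This is an assumption-laden reading: the reported values describe variation among piles from a single seller, so they probably understate the variation across sellers and days that a price survey would face.

```python
import math

def relative_se(cv, n_obs):
    """Relative standard error of a mean from n independent observations."""
    return cv / math.sqrt(n_obs)

def obs_needed(cv, target_rse):
    """Smallest n for which the relative standard error is at or below the target."""
    # The small tolerance guards against floating-point error at exact integers.
    return math.ceil((cv / target_rse) ** 2 - 1e-9)

# Coefficients of variation reported in the text for Papua New Guinea markets.
cv = {"taro": 0.20, "sweet potato": 0.18, "cassava": 0.14, "sago": 0.09}

for food, value in cv.items():
    print(food, round(relative_se(value, 3), 3), obs_needed(value, 0.05))
```

With the usual three observations the relative standard error for taro is about 11.5 percent against roughly 5 percent for sago, and reaching a 5 percent target would take 16 observations for taro but only 4 for sago.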
2.2 Where to collect prices
In terms of where to collect prices, the aim should be to observe prices in the markets actually
used by the households in the sample. Thus it is worthwhile asking respondents in the
consumption questionnaire where they actually buy their items. Otherwise an approach of just
visiting the nearest markets and asking vendors the price of particular goods (as was done by the
LSMS surveys) can be subject to the criticism that this is possibly the wrong market. Other
criticisms of the approach are that prices could be collected for the wrong specification of goods
and that the prices quoted may not be the prices actually paid by local residents because of
bargaining (Deaton and Grosh, 2000).
It is also possible that some prices will need to be collected from larger, more regional, markets
because specialized items may not be available in local markets. For example, a 1999 survey in
Cambodia tried to obtain prices for 50 food items in 600 villages but data were obtained on less
than half of the price-village combinations because of items missing from markets (Gibson,
2000). There are three options for dealing with these missing local prices:
• Apply the price from a neighboring market (essentially a form of ‘hot deck’ technique
that survey software often applies to missing data)
• Apply prices that are obtained in larger markets to a whole region, and
• Use regression to predict the price of missing items, based on the price of some other
item more widely available.
The logic of the regression approach is that spatial price differences may reflect transport costs,
so if goods are coming from a common source (say a port) and moving into the hinterland, prices
may tend to move proportionally.14 Of course if there are more complicated commodity flows,
with missing prices reflecting seasonality, environmental constraints (e.g., altitudinal limits on
coconut) etc, then none of these imputation approaches will be very reliable.
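The regression option can be sketched as follows, assuming a log-log relationship between the price of the item with gaps and the price of a more widely available reference item. The function name and the market prices are hypothetical; in the example the target price is exactly twice the reference price, so the imputed value can be checked by eye.

```python
import numpy as np

def impute_missing_prices(target_price, reference_price):
    """Fill missing target prices from a log-log regression on a reference price."""
    target = np.asarray(target_price, dtype=float)
    ref = np.asarray(reference_price, dtype=float)
    seen = ~np.isnan(target)
    # Fit ln(target) = intercept + slope * ln(reference) where both are observed.
    slope, intercept = np.polyfit(np.log(ref[seen]), np.log(target[seen]), 1)
    filled = target.copy()
    filled[~seen] = np.exp(intercept + slope * np.log(ref[~seen]))
    return filled

# Hypothetical prices in five markets; the third market has no price for the target item.
reference = [1.0, 1.1, 1.2, 1.3, 1.4]
target = [2.0, 2.2, np.nan, 2.6, 2.8]

filled = impute_missing_prices(target, reference)
print(filled)  # the gap is filled with a value close to 2.4
```

A slope close to one corresponds to the proportional price movements described above. With noisy data the exponentiated prediction estimates the median rather than the mean price, so some retransformation adjustment may be wanted.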
14
Glewwe (1991) used the same logic when taking the price of a can of tomato paste as a proxy for non-food prices
in an early LSMS survey in Côte d’Ivoire because the non-food prices that were collected were poorly measured.
In terms of the geographical scale at which to calculate average prices (as an input to the food
poverty lines), most surveys, and the subsequent poverty analyses, report these for only a few
major regions despite prices being collected from a far larger number of communities. There are
at least three reasons for this aggregation:15
• concern about missing prices at the local level (see above)
• measurement error because the prices observed in a single village market on a given day
are only a snapshot taken with a very small sample. By averaging over prices collected
in surrounding markets within the region, the share of the variance due to random
measurement error will be reduced, and
• introduction of temporal variation such that the prices obtained in a village on a given
day do not reflect the ‘usual’ prices facing the households in that community. Regional
prices may be more representative because surveys that stagger fieldwork over several
months or a year will have price samples within a region that are collected over the
entire duration of the fieldwork (unless the survey works entirely in one region and then
moves to the next region). But prices in a single village are likely to be collected only
once, and so will reflect both spatial and temporal/seasonal variation and it will not be
possible for the poverty analyst to identify the purely spatial part, which is needed for
setting the regional poverty lines.16
On the other hand, there are some costs of using regional average prices rather than local prices.
Regional prices will overstate the cost of buying the poverty line basket of foods in low-price
communities within each region, while understating it for others. Measured poverty will be too
high in the low-price communities because these same (high) prices are not used for valuing food
consumption. Hence, some households will be above the poverty line if that line is priced using
local (i.e., cluster-level) prices, but below the poverty line if regional average prices are used.
Bias in the opposite direction (measured poverty too low) will occur in clusters where regional
average prices understate the local cost of the poverty line basket of foods.
At first glance it would seem that there is no net effect of using regional average prices because
the overstatement of poverty in some communities within the region is cancelled out by the
understatement in others. This would only be true if the distribution of food prices within each
region is symmetric. There is surprisingly little evidence on the distribution of staple food prices
within regions to know if this is a reasonable assumption. Some evidence is reported by Gibson
and Rozelle (1998) for Papua New Guinea. They find, for example, that for sweet potato (the
dominant staple supplying 30 percent of the calories in the food poverty line basket) the
hypothesis that the distribution of surveyed prices across clusters in the largest region comes
from a Normal distribution is rejected (p<0.01), while the hypothesis of log-normality is not
rejected (p<0.50). A similar pattern holds for three-quarters of the combinations of other regions
and other foods.
15
Additionally, there may also be concerns about estimating the non-food allowance separately for every cluster in the
sample, which will introduce a large number of intercepts into equation (1).
16
Surveys with a within-year longitudinal component are an exception. Muller (2002) reports on an example of such
a survey from Rwanda, where the same households and villages were revisited four times throughout the year.
Consequently, in Papua New Guinea, there were fewer communities with food prices above the
regional average than those with prices below the regional average and hence more communities
where poverty was overstated than understated when regional prices were used to calculate the
food poverty line. The headcount index at the food poverty line was 17 percent when using
regional average prices and only 14 percent when using cluster-level prices.
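The asymmetry argument can be illustrated with a small simulation. This is a minimal sketch using hypothetical cluster prices, not the Papua New Guinea data: the distribution parameters, the sample size and the function name `share_below_mean` are all assumptions made for illustration.

```python
import random
import statistics

def share_below_mean(prices):
    """Fraction of clusters whose price is below the regional mean price."""
    regional_mean = statistics.fmean(prices)
    return sum(p < regional_mean for p in prices) / len(prices)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical cluster prices: right-skewed (log-normal) versus symmetric (Normal).
lognormal_prices = [rng.lognormvariate(0.0, 0.8) for _ in range(5000)]
normal_prices = [rng.gauss(1.0, 0.2) for _ in range(5000)]

lognormal_share = share_below_mean(lognormal_prices)
normal_share = share_below_mean(normal_prices)
print(lognormal_share, normal_share)
```

With right-skewed prices, clearly more than half of the simulated clusters fall below the regional mean, so overstatement and understatement of poverty would not be expected to cancel. With symmetric prices the share is close to one half.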
2.3 How to collect price information?
There are four different methods available for obtaining information on the local prices faced by
households: community price surveys, unit values, price opinions and using prices already
collected for on-going surveys like the CPI. According to Frankenberg (2000), little is known
about how to collect data on community-level prices and there have been many problems in past
LSMS studies, so she recommends that more than one method be used (specifically price
opinions and community price surveys). This duplication would enable a poverty analyst to take
the average of what are, potentially, error-ridden measures of prices, although this averaging may
be useful only if the measurement errors are random.17
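The condition under which averaging helps can be made explicit with a stylized calculation. The setup below is an assumption introduced for illustration and is not taken from the studies cited: method 1 is unbiased with error variance \sigma_1^2, method 2 has a systematic bias b and error variance \sigma_2^2, and the two random errors are uncorrelated.

```latex
\[
\hat{p}_1 = p + e_1, \qquad \hat{p}_2 = p + b + e_2, \qquad
\bar{p} = \tfrac{1}{2}\left(\hat{p}_1 + \hat{p}_2\right)
\]
\[
\mathrm{MSE}(\bar{p}) = \frac{b^2}{4} + \frac{\sigma_1^2 + \sigma_2^2}{4},
\qquad
\mathrm{MSE}(\hat{p}_1) = \sigma_1^2
\]
\[
\mathrm{MSE}(\bar{p}) < \mathrm{MSE}(\hat{p}_1)
\iff b^2 + \sigma_2^2 < 3\,\sigma_1^2
\]
```

Under these assumptions, averaging with a second method improves on the more reliable method alone only when the second method's bias and noise are small enough relative to the noise in the first.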
2.3.1 Community market price surveys
Community market price surveys of the sort used by the LSMS are described fully by
Frankenberg (2000). Also, the points made above about how many items to collect prices for,
how many observations to make per item, and where to collect these prices all apply especially
to community price surveys. Therefore only two further points are made about this method of
gathering price information:
• it is a surprisingly rare method. With the exception of the LSMS surveys, it has not been
common for household surveys to include a community price survey. For example, state
statistical bureaus in countries such as China, Indonesia and Pakistan do not collect
market price data that can be matched to their rural household income and expenditure
surveys. Even research-driven surveys like the Indonesia Family Life Survey gather only
incomplete price data (e.g., IFLS2 used a consumption recall with 37 food items, but
market price surveys were carried out for only nine foods).
• Empirical results reported below for the performance of unit values and price opinions
use community market price surveys as the benchmark. This is not an uncontroversial
choice. Even well informed users of household surveys like Deaton and Grosh (2000)
express doubts about community market price surveys, which in their view may be
unreliable due to being gathered from the wrong market, for the wrong specification of
goods, or for prices that are not actually paid by local residents due to bargaining and
other interactions between buyer and seller.18 However, it is the opinion of the author of
this guide that the prices for well-defined items collected from market surveys using
certain sampling rules are the appropriate standard for comparing other methods against.
2.3.2 Unit values
Many consumption surveys also collect food quantities (and sometimes quantities of other items
17
For example, if one method of gathering data systematically understates prices, averaging over this method and a
more reliable method will create more measurement error rather than less.
18
Examples of quality problems in the LSMS community market price surveys presumably include Tajikistan,
where the data were never released due to quality issues, and Côte d’Ivoire where Glewwe (1991) had to use the
price of a can of tomato paste to proxy for the non-food prices, which had measurement problems.
like fuels), so unit values can be calculated from the ratio of expenditures to quantity. These unit
values are used as proxies for market prices in some poverty studies especially where there are
no alternatives, due to the lack of community market price surveys. However, in LSMS surveys
that have both community market price surveys and unit values, the poverty lines are usually
calculated from the community market price survey data.
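The calculation of unit values can be sketched as follows. This is a minimal illustration that assumes household records for a single food arrive as (cluster, expenditure, quantity) tuples; the function name and the choice of the cluster median as the summary are hypothetical, not taken from any particular survey.

```python
import statistics
from collections import defaultdict

def cluster_median_unit_values(records):
    """records: iterable of (cluster_id, expenditure, quantity) for one food.

    Returns {cluster_id: median unit value}. Households with no purchase
    are skipped because a unit value is undefined for them.
    """
    by_cluster = defaultdict(list)
    for cluster_id, expenditure, quantity in records:
        if expenditure > 0 and quantity > 0:
            by_cluster[cluster_id].append(expenditure / quantity)
    return {c: statistics.median(v) for c, v in by_cluster.items()}

# Hypothetical records: the third household in cluster A made no purchase.
sample = [("A", 12.0, 4.0), ("A", 10.0, 2.0), ("A", 0.0, 0.0), ("B", 9.0, 3.0)]
unit_values = cluster_median_unit_values(sample)
print(unit_values)
```

A cluster in which no household purchases the item simply does not appear in the result, which is the missing-cluster problem discussed later in this section.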
In addition to their availability, there are two other potential advantages of using unit values as a
proxy for market prices. First, because they are collected along with household variables it is
possible to create price indexes where both ‘prices’ and the weights in any price index or poverty
line are tailored to specific groups in the population. This may be helpful if markets are
segmented, so that different population groups face different prices, although other attributes
such as quality may also vary along with these price differences. Additionally, unit values can be
a rich source of data because there are typically far more observations (potentially millions in the
case of surveys like India’s National Sample Survey (NSS) and Indonesia’s SUSENAS with
large samples of households and a large number of commodities in the consumption recall) than
are available from traditional price surveys.
Good examples of the uses of unit values are provided by Deaton, Friedman and Alatas (2004)
and Deaton and Tarozzi (2005). In both cases these studies use the quantities and expenditures
from the NSS in India, with the first study also using data from the SUSENAS survey in
Indonesia. These studies calculate price indexes for urban and rural sectors, and major states in
India, and also PPP exchange rates between Indonesia and India. The patterns in these indexes
and their movements over time are contrasted with the price indexes that are implicit in the
official poverty lines (and in the Penn World Tables for the case of the PPP exchange rates), and
a number of key differences are highlighted in terms of trends in poverty and living standards
over both time and space.
Offsetting the potential advantages of unit values are three key features which prevent them from
being used directly as a proxy for price:19
1. Unlike prices, unit values are available only for purchasers.20 This is particularly a
problem where no households within a survey cluster make a purchase, because then
there is no proxy for the market price in that community. A sample selection problem
may result because the communities where purchases are recorded by the survey may
differ from those where no purchases and unit values are observed (especially because
non-purchase may reflect either that households in that community are self-sufficient in
the good, or conversely that they never consume it). Gibson and Rozelle (2005) give an
19
Despite this claim, there are many examples in the applied demand literature of unit values being used naively as
perfect substitutes for prices.
20
Unit values may also be available for own-producers, and for gift givers and receivers because surveys often ask
for quantities and values in the modules of the survey dealing with these means of obtaining consumption goods.
However, these unit values are typically not used as proxies for unavailable price data because they do not refer to
transactions taking place through the market. Gibson and Rozelle (2005) show that there is little agreement amongst
the different types of unit values: for those households in Papua New Guinea who both purchased and produced
either sweet potato, banana or betelnut (three key commodities, comprising over 20% of the average household
budget), the average correlation between the two types of unit values is only 0.26. For those who both purchased and
received gifts, the average correlation is 0.43.
example of the extent of this problem: in Papua New Guinea only three-quarters of
survey clusters had a unit value for banana (a secondary staple with an average budget
share of six percent) and only one-half had a unit value for beer (an item with an average
budget share of two percent). The situation was even worse in rural areas, where only one
quarter of clusters had a unit value for beer. Thus relying on purchase behaviour to obtain
unit values and using these as proxies for local prices may cause a poverty analyst to miss
the full range of spatial price variation in a sample.
2. Unit values are subject to quality effects. As Prais and Houthakker (1955, p.110) first
pointed out: “An item of expenditure in a family-budget schedule is to be regarded as the
sum of a number of varieties of the commodity each of different quality and sold at a
different price.” Consequently, as the mix of varieties purchased changes across
households, the unit value will change, even if underlying prices are the same. The mix of
varieties is likely to change with changes in household income, household size, and price
changes, all of which affect the real living standard of household members. These
responses may be captured in the ‘quality elasticity’ discussed in equation (7) above and
repeated here:
\[ \ln v_i = \alpha + \beta \ln x_i + \gamma \cdot z + \delta c + u_i \tag{7} \]
noting that the response of quality to price is typically unknown because unit values tend
to be used only when prices are either unavailable or poorly measured.21
The joint response of both quantity and quality to price changes may be particularly
concerning when using unit values to estimate poverty lines. In markets where prices are
high, consumers may react by choosing lower quality, and where prices are low they may
choose higher quality (Deaton, 1997). This type of correlated demand response is not
captured by equation (5) because it affects all households in a community equally
(assuming they face the same prices) and so cannot be identified by the within-cluster
variation in unit values that is due to household characteristics. Consequently, because
unit values reflect both price and quality, they will tend to vary by less than prices and
poverty lines calculated from them may understate spatial and temporal differences in the
cost of living even after removing the effects of quality variation identified from
idiosyncratic household characteristics.
3. Unit values will reflect measurement errors in quantities, expenditures, or both. Even if
all households consumed the same varieties of a particular good and paid the same price,
the reported unit values could show considerable variation. Household surveys often
require respondents to recall the value and quantities of their expenditures (or
consumption) for the previous week, fortnight, month or even year. This difficult task
cannot be done with perfect accuracy, so reporting errors induce a variation in unit values
that might be mistaken for genuine variation in prices. It is possible that these errors
cancel out in large enough samples and when only a measure of central tendency like the
mean is required, although even for this requirement there is contrary evidence (see
below). Moreover, in regression contexts, such as demand studies, even random
21
In equation (5) the cluster level dummy variables allow the other coefficients to be consistently estimated even in
the absence of the missing price data.
measurement errors that cancel are problematic because they may induce a spurious
correlation between the unit value on the right-hand side and the dependent variable
which is either a quantity or a budget share (Deaton, 1997).
A number of procedures are used to mitigate each of the three problems affecting unit values,
and some may deal with more than one problem at once. For example, a regression can be used to
predict unit values for those clusters that have none, while also stripping household-specific
quality effects out of the unit values for those clusters with data. Nevertheless, the mitigation
procedures will be discussed separately for each problem because some solutions that involve
survey design issues face a tradeoff between reducing the extent of one problem and
exacerbating another.
Procedures for Clusters with No Unit Values
The most common ex post procedures for dealing with clusters that have no unit value available
are to either insert some regional average unit value in their place or to use a regression to predict
a (conditional) mean unit value for those clusters that have none.22 The same comments made in
Section 2.2 about imputing prices for communities where items are missing from markets apply
to both of these procedures. In each case there is an assumption that the unit values are missing
at random, so that averages calculated from unit values in other communities serve as a reliable
proxy.
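The first of these procedures can be sketched in a few lines. This is an illustrative sketch only: the dictionary layout and function name are hypothetical, and the regional median is used as the "regional average", which is one possible choice rather than the rule used in any study cited here.

```python
import statistics

def fill_missing_with_regional_median(cluster_values, cluster_region):
    """cluster_values: {cluster: unit value or None}
    cluster_region: {cluster: region}

    Returns a new dict in which each None is replaced by the median of the
    observed unit values in the same region.
    """
    observed = {}
    for cluster, value in cluster_values.items():
        if value is not None:
            observed.setdefault(cluster_region[cluster], []).append(value)

    filled = {}
    for cluster, value in cluster_values.items():
        if value is None:
            region_values = observed.get(cluster_region[cluster])
            # Leave the gap if the whole region has no observed unit value.
            value = statistics.median(region_values) if region_values else None
        filled[cluster] = value
    return filled

values = {"c1": 2.0, "c2": None, "c3": 4.0, "c4": None}
regions = {"c1": "north", "c2": "north", "c3": "north", "c4": "south"}
filled = fill_missing_with_regional_median(values, regions)
print(filled)
```

The imputed value is only as good as the missing-at-random assumption: if clusters without purchases differ systematically from those with purchases, the regional median will be a poor proxy.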
There are also two ex ante choices about survey design that may help to reduce the problems
caused by clusters without unit values. The first is to use broader consumption categories, so that
there is more chance that each category will have at least one purchase recorded in a cluster. For
example, 21 percent of the clusters in the PNG survey used by Gibson and Rozelle (2005) did
not have a unit value for flour, but for the broader category of “cereals” (which included rice,
bread, biscuits, and cakes) there were purchasers (and hence unit values) in all clusters.
However, quality effects become more important the broader the category because households
may consume quite distinct items within the category.
The second choice is to extend the length of the reference period, giving households more chance
to record a purchase within the survey window. While there are concerns about understatement
of consumption when longer recall periods are used (Scott and Amenuvegbe, 1990), at least some
LSMS surveys adopt strategies intended to deal with this problem. For example, respondents in
the 1997-98 VLSS were asked about the number of months they purchased each item over the
past year, the number of times per month they purchased the item, the usual quantity of each
purchase, and the value of this quantity. It was hoped that combining the information from these
questions would give annual estimates of spending and quantity purchased, and hence also
an annual estimate of the unit value. However, it is not clear that this procedure worked because
there is a temporal pattern in these supposedly ‘annual’ unit values that is quite similar to the
pattern observed from the community market price surveys (Figure 2).23 The community market
22
Senauer, Sahn and Alderman (1986) are an early example of replacement with a regional average, in a regression
context rather than when estimating poverty lines.
23
The horizontal bar in the box plot shows the median rice price per month and the ends of the box extend from the
25th percentile to the 75th percentile of prices. The lines emerging from the box show the dispersion in the remainder
of the data outside this inter-quartile range.
price surveys reflect current seasonal conditions because they were carried out only once in each
cluster but the unit values should not because they are meant to refer to purchases made over the
last 12 months.
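The annualisation described above can be sketched as simple arithmetic. The function and argument names are hypothetical, and the way the answers are combined is an assumed reading of the questionnaire rather than the survey's documented procedure.

```python
def annual_unit_value(months_purchased, times_per_month, usual_quantity, usual_value):
    """Combine the usual-purchase answers into annual figures.

    Returns (annual spending, annual quantity, annual unit value).
    """
    purchases_per_year = months_purchased * times_per_month
    annual_spending = purchases_per_year * usual_value
    annual_quantity = purchases_per_year * usual_quantity
    return annual_spending, annual_quantity, annual_spending / annual_quantity

# Hypothetical household: buys rice in 10 months, 4 times a month,
# usually 5 kg at a cost of 17.5 (thousand dong) per purchase.
spending, quantity, unit_value = annual_unit_value(10, 4, 5.0, 17.5)
print(spending, quantity, unit_value)
```

Under this reading the purchase frequencies cancel, so the 'annual' unit value equals the value of the usual purchase divided by its quantity. If respondents report their usual purchase with recent prices in mind, the result would track current rather than annual conditions.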
Figure 2: Distribution of Rice Market Prices and Rice Purchase Unit Values by Month in the Vietnam Living Standards Survey
[Box plots of unit values and market prices for each month from December to November; vertical axis in 1000 dong per kg, scaled from 1.5 to 5.5.]
Procedures for dealing with quality effects
Visual inspection and regression techniques can help to detect quality effects in unit values.
Deaton, Friedman and Alatas (2004) search for multi-modality, which may indicate that unit
values come from survey categories containing several distinct goods, each with different prices.
For example, “other milk products” in the Indian NSS appeared to have multiple modes. In
contrast, the category “rice” was better defined, with almost 30 percent of rural households
reporting buying it at exactly 10 rupees per kilogram, and 25 percent of urban households buying
it at exactly 12 rupees per kilogram. This within-category variability can also be picked up by
regressions of unit values on household expenditure and other characteristics (see equation (5)).
Similarly, an ANOVA should detect cluster, district and seasonal effects if unit values are
picking up spatial and temporal price effects rather than just quality variations across households.
Household-specific quality effects can be removed from unit values using the coefficients from a
regression of the unit value on a vector of household characteristics. Appendix 1 provides an
example of code to do this that was used by Gibson and Rozelle (2005) when purging unit values
of quality effects, prior to the calculation of a food poverty line in Papua New Guinea. There are
several variants of this type of procedure; Cox and Wohlgenant (1986) regress deviations of unit
values from regional/seasonal means on household variables, while the various studies by
Deaton and co-authors (summarized in Deaton, 1997) typically use double log regressions like
equation (7). The strength of these household-specific quality effects is likely to vary with the
degree of heterogeneity of the commodity (informally produced and marketed root crops may be
more variable than price-controlled sugar). They will also reflect the broadness of the commodity
category. For example, in the SUSENAS survey from Indonesia the quality elasticity for the
broad category of meat is 0.12 but when a finer disaggregation is used the quality elasticity for
beef is only 0.05 and for chicken 0.04 (Olivia and Gibson, 2005).
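The purging step can be sketched with a within-cluster regression. This is not the Appendix 1 code. It is a simplified illustration that assumes a single regressor (log total expenditure), uses cluster demeaning in place of explicit cluster dummy variables, and evaluates every household at the sample mean of log expenditure. The function name and reference point are assumptions.

```python
import math
from collections import defaultdict

def purge_quality(records):
    """records: list of (cluster, unit_value, total_expenditure).

    Returns (beta, purged), where beta is the within-cluster slope of
    ln(unit value) on ln(expenditure) and purged holds each unit value
    re-evaluated at the sample mean of ln(expenditure).
    """
    ln_v = [math.log(v) for _, v, _ in records]
    ln_x = [math.log(x) for _, _, x in records]

    # Cluster means play the role of the cluster dummy variables.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (c, _, _), lv, lx in zip(records, ln_v, ln_x):
        sums[c][0] += lv
        sums[c][1] += lx
        sums[c][2] += 1

    num = den = 0.0
    for (c, _, _), lv, lx in zip(records, ln_v, ln_x):
        mean_v = sums[c][0] / sums[c][2]
        mean_x = sums[c][1] / sums[c][2]
        num += (lx - mean_x) * (lv - mean_v)
        den += (lx - mean_x) ** 2
    beta = num / den  # needs some within-cluster variation in expenditure

    ref = sum(ln_x) / len(ln_x)  # reference expenditure level (an assumption)
    purged = [math.exp(lv - beta * (lx - ref)) for lv, lx in zip(ln_v, ln_x)]
    return beta, purged

# Hypothetical data: true prices are 2 (cluster A) and 3 (cluster B),
# with a quality elasticity of 0.1.
data = [("A", 2 * 100 ** 0.1, 100), ("A", 2 * 400 ** 0.1, 400),
        ("B", 3 * 100 ** 0.1, 100), ("B", 3 * 200 ** 0.1, 200)]
beta, purged = purge_quality(data)
print(beta, purged)
```

In this constructed example the purged unit values are identical within each cluster and preserve the 3:2 price ratio between clusters. With real data, residual measurement error would remain after purging.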
While household-specific quality effects can be purged from unit values with a regression there
may still be an uncorrected response of quality to price in the resulting purged unit values. The
reason is that the cluster dummy variables in equation (7) are picking up two sources of variation
between clusters: genuine price variation and possible responses of cluster average quality to
price differences. It may be reasonable to assume that if the estimated response of quality to
income variations within clusters is small (as shown by low quality elasticities in equation (7)),
so too should the response of quality to price differences between clusters be small (seeing as
price differences can be treated as equivalent to income effects).24
Procedures for dealing with measurement error
Measurement errors in unit values reflect errors in survey estimates of food quantities, food
expenditures or both. There is likely to be greater interest in these estimates than there is in unit
values, so survey procedures should anyway be attempting to deal with these errors and there is a
large literature on the choices available (see Deaton and Grosh, 2000 for a brief review). One
source of potentially large errors in unit values is discrepancies between the measuring units that
are reported by respondents and entered into the survey database. This discrepancy may arise
when farmers and consumers use traditional units rather than the metric ones needed by poverty
analysts. An econometric procedure for dealing with this problem is suggested by Capéau and
Dercon (2005) who apply it to poverty measurement in Ethiopia.
The main ex post procedure for dealing with measurement error in unit values is to trim them
before they are used to calculate average prices. For example, Deaton, Friedman and Alatas
(2004) trim the top and bottom one percent of unit values, Deaton and Tarozzi (2005) trim log
unit values that are above or below 2.5 standard deviations from the mean of the log unit value,
while Gibson and Rozelle (2005) follow Cox and Wohlgenant (1986) and trim unit values more
than five standard deviations above or below the mean. A useful tool for detecting outliers,
especially due to problems with the units of quantity measurement that are entered into the
survey database (e.g. grams entered as kilograms) is to plot unit values for each household
against the average unit value from other households in the same cluster.
Even after trimming possible outliers, the mean is less robust than either the median or the mode,
and good arguments can be made for using these measures when calculating average unit values
by cluster or by region and season.
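The trimming rules mentioned above can be sketched as two small helpers. These are illustrative implementations, not the code used in the cited studies: the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
import math
import statistics

def trim_percentiles(values, share=0.01):
    """Drop the lowest and highest `share` of observations."""
    ordered = sorted(values)
    k = int(len(ordered) * share)
    return ordered[k:len(ordered) - k] if k else ordered

def trim_sd(values, n_sd, logs=False):
    """Drop values more than n_sd standard deviations from the mean,
    optionally measuring the distance on the log scale."""
    scaled = [math.log(v) for v in values] if logs else list(values)
    mean = statistics.fmean(scaled)
    sd = statistics.pstdev(scaled)
    return [v for v, s in zip(values, scaled) if abs(s - mean) <= n_sd * sd]

# Hypothetical unit values with one unit-of-measurement error (2000 instead of 2).
unit_values = [2.0] * 20 + [2.2] * 20 + [2000.0]
trimmed = trim_sd(unit_values, 2.5, logs=True)
print(len(trimmed), statistics.fmean(unit_values), statistics.median(unit_values))
```

In this example the single outlier pulls the untrimmed mean far above any plausible price, while the median is unaffected. That is the robustness argument for preferring the median or mode.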
Evidence on the performance of unit values
There is mixed evidence on how reliable unit values are, even after procedures have been used
24
See Deaton (1997) for a discussion of this separability theory of quality. While the discussion is in the context of
demand estimation rather than poverty measurement the same issues apply.
for dealing with clusters with missing values, quality effects and measurement errors. Gibson
and Rozelle (2005) use unit values for nine major foods that contribute half of the poverty line
food basket in Papua New Guinea. After trimming and stripping household-specific quality
effects, the purged unit values are averaged by cluster and then by region, and then used to
calculate the food poverty line. The resulting food poverty line is overstated by 11 percent,
compared with its value when prices from a community market survey are used (Table 2).25 In
this setting it is argued that market surveys provide a good benchmark because there is no
haggling, local markets are well defined and geographically separated, and there is not much
quality variation amongst goods across the various markets. The overstatement would have been
even larger if the unit values were not purged of household-specific quality effects. This
overstatement of the food poverty line makes a significant difference to the estimated poverty
rates.
Table 2: Poverty measures with unit values and community price surveys

Cost of poverty line food    Poverty line       Headcount    Poverty gap    Poverty
basket calculated from:      (Kina per year)    index        index          severity index
Community price survey       K334               22.0         5.9            2.4
Unit values                  K385               30.0**       8.9**          3.8**
Purged unit values (a)       K370               28.0**       8.0**          3.4**

Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey, at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression like equation (5).
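The percentage overstatements quoted in the text follow directly from the figures in Table 2. The short check below recomputes them; the function name is hypothetical.

```python
def percent_overstatement(estimate, benchmark):
    """Percentage by which an estimate exceeds its benchmark."""
    return 100.0 * (estimate / benchmark - 1.0)

# Purged unit values versus the community price survey (Table 2).
line_gap = percent_overstatement(370, 334)
headcount_gap = percent_overstatement(28.0, 22.0)

# Unpurged unit values versus the community price survey.
raw_line_gap = percent_overstatement(385, 334)
print(round(line_gap), round(headcount_gap), round(raw_line_gap))
```

The poverty line gap for purged unit values rounds to 11 percent and the headcount gap to 27 percent. The gap for unpurged unit values is larger still.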
Similar evidence of an overstatement in poverty estimates when using unit values is found by
Capéau and Dercon (2005) for Ethiopia when unit value results are compared with those using community
surveys. The degree of overstatement is not quite as large as reported in Table 2 for PNG
(17 percent rather than 27 percent for the headcount index) but it is still disquietingly large.
Moreover, in the Ethiopian example, unit values also resulted in larger poverty fluctuations over
time than those coming from the market price surveys.
In contrast to these negative results, the poverty lines estimated from unit values by Deaton and
co-workers in India appear plausible and consistent with what limited information is available on
the spatial and temporal distribution of prices. Thus, until more is known about the performance
of unit values, poverty analysts should be cautious when using them to calculate poverty lines,
and where possible should seek additional information on prices. This additional information
may validate the unit values or it may prove to be a more reliable proxy for local market prices.
Temporal Price Indexes from Unit Values
Poverty analysts may be tempted to use unit values to measure changes in price over time,
especially when they are needed to compare poverty in two periods. Existing evidence from
developed countries suggests that unit values are not a desirable source of information for
measuring price changes over time. Specifically, when unit values are used to create a price
25
For some regions the overstatement ranges from 16 to 20 percent. The values in column 2 of Table 2 are population-weighted averages of the regional poverty lines.
index for a group of goods that are neither pure complements nor perfect substitutes the
calculated index will be a biased measure of the true change in prices (Bradley, 2005). This bias
occurs because in the presence of cross-sectional dispersion, which is inherent in unit values, it is
not possible to exactly aggregate price changes across individual households to get an unbiased
aggregate price index. Empirical evidence suggests that the unit value indexes often fall outside
the Laspeyres-Paasche bounds, indicating that the bias in these can be larger than the more
widely studied bias from using either the Laspeyres or the Paasche formula when aggregating
individual price changes into an aggregate index (Silver and Webb, 2000; Bradley, 2005).
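A stylized example shows how a unit value index can fall outside the Laspeyres-Paasche bounds. The two-good data below are hypothetical and deliberately extreme: neither price changes, but purchases shift toward the more expensive good.

```python
def laspeyres(p0, p1, q0):
    """Price index using base-period quantities as weights."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(b * q for b, q in zip(p0, q0))

def paasche(p0, p1, q1):
    """Price index using current-period quantities as weights."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(b * q for b, q in zip(p0, q1))

def unit_value_index(p0, p1, q0, q1):
    """Ratio of period-1 to period-0 unit values for the whole group."""
    uv0 = sum(p * q for p, q in zip(p0, q0)) / sum(q0)
    uv1 = sum(p * q for p, q in zip(p1, q1)) / sum(q1)
    return uv1 / uv0

p0 = p1 = [1.0, 10.0]             # hypothetical prices, unchanged between periods
q0, q1 = [9.0, 1.0], [5.0, 5.0]   # the purchase mix shifts to the dearer good

print(laspeyres(p0, p1, q0), paasche(p0, p1, q1), unit_value_index(p0, p1, q0, q1))
```

Both conventional indexes show no price change, while the unit value index rises sharply because it records the change in the mix of goods as if it were a change in price.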
2.3.3 Individual Transaction Records in Expenditure Diaries
In surveys where consumption is measured using diaries rather than recall, there is one overlooked
source of price data that is somewhat analogous to a unit value but with potentially fewer
problems. Expenditure diaries differ widely in format, especially in terms of whether they are kept
by individuals or the whole household, and whether they are largely unstructured or structured
according to particular groups of related consumption items or days of the week or means of
acquisition (purchases, gifts, own-production, takings from own-business etc). However, for
expenditure diaries used in at least some surveys it is typical for respondents to be asked to record
not only the expenditure and quantity of each item purchased but also the brand and other details
on the specification such as the unit size. These details are useful for statistical agencies in
selecting the most widely purchased specification when designing the regimen of prices to collect
for the CPI. For example, the representative price for the Soft Drinks category might be a “340ml
can of Coca Cola”.
Figure 3 provides an example of the sort of information that may be available in an expenditure
diary, where in this case the data come from the Urban Household Survey carried out in Papua
New Guinea in 1985-87.26 After being completed by the respondents the details on each
transaction in the diaries were assigned to a four digit commodity code. The extract shown here
relates to Group 126 “canned meat”. The most prevalent specification within this group is “canned
corned meat” (code 1261) where the dominant brand is “Ox and Palm” which is sold
predominantly in 340 gram cans. If attention is restricted to this dominant brand and specification,
there is no quality variation of the sort that potentially interferes with the use of unit values as a
proxy for market price. Because these reports of the prices paid for each transaction are coming
from volunteer households rather than trained price surveyors from the CPI branch of the statistics
agency, it is likely that there will be measurement error that causes some outliers. So these records
can be trimmed to remove the effect of any outliers, and either the median or mode will likely be
better measures of average price than is the mean.
26
It is notable that information on branded goods is more likely to be available in urban settings, where formally
marketed food is more important in consumption than in the rural sector where informal markets predominate.
Figure 3: Example of Pricing Information Available From Expenditure Diaries

Commodity   Expenditure   Number   Brand      Size   Unit
1261        276           2        OX&PALM    340    G
1265        135           1        T'DUCK     397    G
1261        264           2        OX&PALM    340    G
1261        135           1        OX&PALM    340    G
1261        140           1        OX&PALM    340    G
1265        98            1        MALING     14     G
1261        330           2        OX&PALM    340    G
1263        85            1        TULIP      340    G
1261        135           1        OX&PALM    340    G
1261        136           1        OX&PALM    340    G
1263        85            1        TULIP      340    G
1261        145           1        OX&PALM    340    G
1261        140
1261        718           6        OX&PALM    340    G
1261        135           1        OX&PALM    340    G
1262        80            1        CBEEF      200    G
1262        135           1        GLOBE      340    G
1262        150           1        GLOBE      340    G
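The use of transaction records can be sketched with the rows shown in Figure 3. This is an illustration only: the rows are transcribed from the figure on the assumption that its columns align as read, the record with no brand or size is omitted, the monetary unit of expenditure is not stated in the figure, and the function name is hypothetical.

```python
import statistics

# (commodity, expenditure, number, brand, size, unit), transcribed from Figure 3.
diary = [
    (1261, 276, 2, "OX&PALM", 340, "G"), (1265, 135, 1, "T'DUCK", 397, "G"),
    (1261, 264, 2, "OX&PALM", 340, "G"), (1261, 135, 1, "OX&PALM", 340, "G"),
    (1261, 140, 1, "OX&PALM", 340, "G"), (1265, 98, 1, "MALING", 14, "G"),
    (1261, 330, 2, "OX&PALM", 340, "G"), (1263, 85, 1, "TULIP", 340, "G"),
    (1261, 135, 1, "OX&PALM", 340, "G"), (1261, 136, 1, "OX&PALM", 340, "G"),
    (1263, 85, 1, "TULIP", 340, "G"), (1261, 145, 1, "OX&PALM", 340, "G"),
    (1261, 718, 6, "OX&PALM", 340, "G"), (1261, 135, 1, "OX&PALM", 340, "G"),
    (1262, 80, 1, "CBEEF", 200, "G"), (1262, 135, 1, "GLOBE", 340, "G"),
    (1262, 150, 1, "GLOBE", 340, "G"),
]

def prices_for_specification(rows, commodity, brand, size, unit):
    """Price paid per item for one exactly specified product."""
    return [exp / number for code, exp, number, b, s, u in rows
            if (code, b, s, u) == (commodity, brand, size, unit)]

prices = prices_for_specification(diary, 1261, "OX&PALM", 340, "G")
print(len(prices), statistics.median(prices), statistics.mode(prices))
```

Restricting attention to one brand and can size leaves ten price observations. The median and mode are close together even though individual records range widely, which is consistent with the case for robust measures of the average price.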
Despite this potentially rich source of information, it is rare for poverty analysts and other
economists to work with the individual transaction records in diaries. Instead, the total spending
and quantity purchased by each household for each commodity category is the usual level at
which the data are made available. This aggregation loses valuable information and will tend to
introduce apparent quality effects that are not present in the original data.
The transaction information in expenditure diaries could be particularly useful for addressing the
question of “do the poor pay more?” (Rao, 2000; Mueller, 2002; Attanasio and Frayne, 2006).
This is a difficult question to answer with unit values because there are two offsetting effects and
only one item of information to try to identify the effects. On the one hand, poorer households
are likely to have lower unit values because they purchase lower quality items (as shown by the
‘quality elasticity’ described in equation (5)). On the other hand, if poorer households are
liquidity constrained and cannot realize bulk discounts, the prices they pay and the resulting unit
values will be higher. Information on each transaction, and a means of removing quality effects
by restricting attention to the same brand and specification would provide more robust evidence
on whether the poor pay more.
2.3.4 Price opinions
In addition to community market surveys, unit values and the transaction records from expenditure
diaries, a further source of information about local market prices is for surveys to solicit opinions
from key informants. This is quite an old idea, dating back to the early days of the LSMS surveys
when Saunders and Grootaert (1980) suggested interviewing groups of housewives to obtain price
data. This strategy was never implemented in the LSMS because of concerns that the reported
prices could be biased by differences in bargaining skill, by uncertainty about the reference
period (which matters in inflationary environments), and by the lack of a representative sample
(Wood and Knight, 1985). Nevertheless, variants of the idea have been used in all three waves
of the Indonesia Family Life Survey (IFLS) where community informants are asked about the
local prices of several food and non-food items.27
A further development of the idea of soliciting price opinions was used by Gibson and Rozelle
(2005) in Papua New Guinea. In this survey, respondents in the sampled households were shown
photographs of a variety of different items and asked their opinion about the current price of the
same items in local markets. By using the full group of sampled households, this approach was
able to overcome concerns about the price opinions coming from an unrepresentative group.
Moreover, by getting price opinions from every sampled household in a cluster it was possible to
treat the opinions analogously to unit values and apply the equation (5) framework for estimating
the determinants of the within-cluster variation. The results suggested that the quality elasticities
for price opinions were only one-quarter of the size of those for unit values and were all
statistically insignificant. The price opinions also averaged only one-quarter of the measurement
error variance of the unit values (that is, variability about the cluster mean which was not explained
by household characteristics) and the covariance between these measurement errors and the actual
demands averaged only one-tenth of that for the unit values.
A final advantage of the price opinions in this study was that because it was easy to show the
photographs to respondents and because responses were not tied to actual purchasing behavior,
there were far fewer missing observations than with either unit values or community market price
surveys. The module on price opinions added about 10 minutes to the time taken to complete the
household questionnaire, making a total time cost per cluster of about two hours (the sample
drew 12 households per cluster). This was somewhat shorter than the time taken to gather the
prices from local stores and markets so relying on informed opinions about prices may be an
economical and reasonably accurate way of obtaining local prices. However, this time cost of the
market surveys may reflect the poor infrastructure and relatively low population density in Papua
New Guinea, and in other countries market surveys may be cheaper to carry out. Finally, another
unusual feature of Papua New Guinea is that a considerable amount of food is imported, and these
foods are all sold in branded, pre-packaged quantities, which makes photographs of such products
more informative than they may be for bulk products sold through informal markets.
In terms of poverty measurement, the results using the price opinions in the PNG survey were
closer to those from the benchmark (the community market price survey) than were the results
from using unit values. The population-weighted average of the regional food poverty lines was
only overstated by 3.6 percent using the price opinions, compared with an 11 percent
overstatement when unit values were used (Table 3). The headcount poverty rate was only
overstated by eight percent and the difference was not statistically significant, whereas when unit
values were used it was overstated by more than one-quarter. While the difference from the
benchmark for the poverty gap and poverty severity measures was statistically significant, the
degree of overstatement was less than one-half of that when using the unit values.
27
In wave 3 of the IFLS, fielded in 2000, prices were obtained on 32 foods and 7 non-foods from the volunteers
who staff the local health post. In previous waves, questions about the prices of fewer items were administered to the
staff of the health post and to the head of the Village Women’s Group (Ibu PKK) and one or more of her staff
members in a group interview.
Table 3: Poverty measures with price opinions, unit values and community price surveys

Cost of poverty line food    Poverty line       Headcount    Poverty gap    Poverty
basket calculated from:      (Kina per year)    index        index          severity index
Community price survey       K334               22.0         5.9            2.4
Unit values(a)               K370               28.0**       8.0**          3.4**
Price opinions               K345               23.8         6.8**          2.8**

Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey, at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression like equation (5).
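The degree of overstatement discussed in the text can be approximated directly from the entries in Table 3. The following is a minimal sketch, assuming only the rounded values printed in the table; the function and variable names are illustrative. Because the table entries are rounded, the ratios computed this way need not reproduce exactly the percentages quoted in the text.

```python
# Rounded entries from Table 3; the community price survey is the benchmark.
TABLE_3 = {
    "poverty_line":     {"benchmark": 334.0, "unit_values": 370.0, "price_opinions": 345.0},
    "headcount":        {"benchmark": 22.0,  "unit_values": 28.0,  "price_opinions": 23.8},
    "poverty_gap":      {"benchmark": 5.9,   "unit_values": 8.0,   "price_opinions": 6.8},
    "poverty_severity": {"benchmark": 2.4,   "unit_values": 3.4,   "price_opinions": 2.8},
}


def percent_overstatement(estimate, benchmark):
    """Percentage by which an estimate exceeds the benchmark value."""
    return 100.0 * (estimate - benchmark) / benchmark


for measure, row in TABLE_3.items():
    uv = percent_overstatement(row["unit_values"], row["benchmark"])
    po = percent_overstatement(row["price_opinions"], row["benchmark"])
    print(f"{measure:17s} unit values: {uv:5.1f}%   price opinions: {po:5.1f}%")
```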
The price opinions obtained in the IFLS have not been as closely studied as those from the PNG
survey but a preliminary analysis suggests that the price opinions are at least as good a proxy for
market prices as are unit values. The evidence is limited because there are only six commodities
with price opinions, unit values and community market prices available (rice, beef, sugar, cooking
oil, kangkung and kerosene).28 However, these six items contribute almost one-quarter of the
average consumption budget. The average values of the community market prices, unit values
and price opinions for each of these six commodities are reported in columns 2-4 of Table 4,
where the averages are calculated after removing outliers that are more than five standard
deviations from the mean. The proportionate deviation from the market prices is in Columns 5-6
and with the exception of kangkung (water spinach) the averages are within 20 percent of the
average market price. Aggregating across the commodities, the mean unit value deviates from
the mean market price by 10 percent, while the mean price opinion deviates from the mean
market price by only six percent.
The correlation between price opinions and market prices is also substantially higher than that
between unit values and market prices, especially for rice.29 Whether these correlations are high
enough for either unit values or price opinions to be an adequate proxy depends both on whether
community market prices are treated as a defensible benchmark and on the particular purpose
that the prices are being used for. But regardless of that issue, if an analyst needs either
alternatives to or supplements for community market prices, the current evidence suggests that
price opinions would do at least as well as unit values, even when collected from only one group
per community and without the aid of pictures. Moreover, price opinions have the potential
advantage of being more widely available; across the six commodities in Table 4 the minimum
number of communities with price opinions was n=304, compared with n=278 for the unit values
and n=147 for the market prices.
28
The survey recorded information about household total expenditures on 37 food items, but quantity consumed,
and hence the unit value, is available only for seven commodities: rice, beef, chicken, kangkung (water spinach),
cooking oil, granulated sugar and kerosene. The market price survey obtained the prices of 12 foods (rice, noodles,
beef, salted fish, sugar, salt, cooking oil, sweetened condensed milk, banana, kangkung, tofu, and milk powder) and
kerosene. The opinions about prices were obtained for 32 foods and 7 non-foods.
29
This claim also holds if the price data are aggregated into province-level price indexes. Specifically, Tornqvist
indexes were calculated from data on budget shares (from the 1999 SUSENAS consumption module) in province
pairs, multiplied by the log ratio of prices, where we use Jakarta province as the reference point. Comparing these
price indexes across the 13 provinces covered by IFLS gave a correlation between the market price and unit value
indexes of 0.66, compared with 0.77 between the indexes from the price opinions and market prices.
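The province-level Tornqvist index mentioned in footnote 29 can be illustrated with a short sketch. This uses the standard bilateral Tornqvist formula (the average of the two provinces' budget shares multiplied by the log price ratio, summed over items and then exponentiated); it is an assumption that this matches the exact calculation behind the footnote. The function name and the numbers in the usage example are hypothetical.

```python
import math


def tornqvist_index(shares_province, shares_reference, prices_province, prices_reference):
    """Bilateral Tornqvist price index of a province relative to a reference province.

    All arguments are dicts keyed by commodity name. Budget shares are
    assumed to sum to one within each province.
    """
    log_index = 0.0
    for item in prices_province:
        average_share = 0.5 * (shares_province[item] + shares_reference[item])
        log_index += average_share * math.log(prices_province[item] / prices_reference[item])
    return math.exp(log_index)


# Hypothetical two-commodity example (not IFLS or SUSENAS data).
reference_prices = {"rice": 2000.0, "kerosene": 500.0}
province_prices = {"rice": 2200.0, "kerosene": 550.0}
reference_shares = {"rice": 0.8, "kerosene": 0.2}
province_shares = {"rice": 0.7, "kerosene": 0.3}

index = tornqvist_index(province_shares, reference_shares, province_prices, reference_prices)
print(round(index, 3))
```

When every price in a province is the same multiple of the reference price, the index equals that multiple regardless of the budget shares, which is a convenient check on any implementation.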
Table 4: Market prices, unit values and price opinions in the Indonesia Family Life Survey

                 Mean        Mean        Mean          Deviation from          Correlation with
                 market      unit        price         market price(b)         market prices
                 price(a)    value(a)    opinion(a)    Unit       Price        Unit       Price
                                                       values     opinions     values     opinions
Rice             2102        2190        2034          0.04       0.03         0.236      0.605
Beef             25925       22296       25975         0.14       0.00         0.236      0.372
Sugar            3416        3379        3364          0.01       0.02         0.284      0.310
Cooking oil      4309        3645        3509          0.15       0.19         0.263      0.139
Kangkung         284         622         329           1.19       0.16         0.074      0.450
Kerosene         544         587         563           0.08       0.03         0.365      0.444
Average(c)                                             0.10       0.06         0.250      0.483

Source: Author’s calculation from IFLS data.
(a) Rupiah per kilogram (with the exception of cooking oil, which is in Rupiah per litre), as calculated from cluster-level
averages after outliers more than 5 standard deviations from the mean have been removed. The averages are based
on a common sample that has all three price measures available.
(b) Absolute value, as a proportion of the mean market price.
(c) Weighted by each commodity’s share in total consumption, in 1999 SUSENAS results.
2.3.5 Existing price collections
The final choice of where to collect prices is to rely on existing price collection efforts. However,
this is unlikely to work in many developing country settings. The discussion in Section 1 above
notes that the Consumer Price Index in many countries (including developed countries) relies
mostly on urban prices, so these would not be applicable for calculating either poverty lines or
spatial deflators, or for imputing the value of consumption for rural households. Moreover, the
commodity weighting in a CPI leans much more towards the consumption pattern of richer
households, so the index values are unlikely to be relevant to poverty-related analysis.
Section 3: Previous Poverty Lines and Survey Fieldwork Not Yet Complete
This section is for those poverty analysts who are working in a country with a previous poverty
line and who are involved before the survey fieldwork is complete. The first requirement in this
case is to replicate the methods used in the previous household survey and poverty line
calculation – even when these methods were faulty. Considerable policy attention will be paid to
backward looking comparisons of poverty rates and these will almost certainly be compromised
if price (and other) information is either gathered or used in a different manner than for the
previous poverty calculations. Key questions to check on include:
• Do all foods that were priced in the previous survey and used in the poverty line
calculation have prices being gathered in the current survey, and are these prices being
gathered in the same way and at the same geographic scale?
• Are the same methods used as in the previous survey for imputing missing values, and
calculating spatial price indexes and the cost of the poverty line bundle of foods (if a Cost
of Basic Needs poverty line was used)?
If either methods or data collection have changed since the previous survey and poverty line
calculation, it would be worthwhile to select at least a random sample of clusters in the current
survey for supplemental data collection, to capture and use price data in the same way as
previously. The estimates from this sub-sample could provide adjustment factors needed to
restore comparability to any backwards looking poverty comparisons.
However, forward-looking action is also needed since otherwise poverty analysts would always
be stuck using outdated and possibly faulty methods from the past. This means that two sets of
poverty lines (and possibly consumption aggregates) and poverty estimates may need to be
produced, one that is methodologically consistent with the previous estimates and one that can
stand as a foundation for future estimates. Looking forward, there are three key questions that the
poverty analyst and the survey agency should consider:
1. What is the optimal set of items to collect prices for,
2. Where to collect prices and on what geographical scale to report any resulting price
aggregations such as a spatial price index or food poverty line, and
3. What is the best way to collect the price information, in terms of price surveys in
community markets, unit values, surveys of opinions about prices from either sampled
households or community leaders, and existing price collection efforts such as for a CPI.
3.1 Optimal set of items to collect prices for?
The optimal number of items whose prices are needed depends partly on the nature of the
consumption module of the survey. If it is an LSMS style survey the consumption recall is likely
to have fewer than 50 categories of food and fewer than 100 consumption categories in total. In this
case, if there is a separate price survey it is sensible to try to obtain the price for at least one item
per food category.30 This matching is especially needed if quantity data are not collected in the
consumption recall; otherwise there is no way to derive the required consumption quantities from
those food expenditure categories with no matching price. If the survey uses more consumption
recall categories, as would typically occur with either a diary-keeping or recall-based Household
Income and Expenditure Survey (HIES) or a Household Budget Survey (HBS), then prices
should only be collected for foods that are going to make a ‘significant’ contribution to the food
poverty line.
30
If unit values are used there will automatically be a matching between the commodities with expenditure and
quantity information and those with “prices”.
One useful tool in this regard is the concentration curve, which ranks foods according to either
their calorie contribution or cost contribution to the poverty line basket. Figure 3.1 presents an
example from Cambodia, where the initial poverty line was calculated from a 1993/94 survey
that had 155 separate food items. This detailed food basket was never fully priced in subsequent
surveys, which only gathered data on the prices of about 30 foods. In fact this more abbreviated
price list was about the appropriate level of detail for the poverty line food basket.
According to Figure 3.1, a basket with 35 items would give 86 percent of the total cost and
94 percent of the calories of the 155-item basket.
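The concentration curve described above is simple to compute once the contribution of each food to the basket is known. The sketch below ranks items from largest to smallest contribution and accumulates their percentage share; the same function applies to either cost or calorie contributions. The function names and the five-item example are hypothetical and are not taken from the Cambodian data.

```python
def concentration_curve(contributions):
    """Cumulative percentage of the basket total, with items ranked from
    the largest contribution to the smallest."""
    ranked = sorted(contributions, reverse=True)
    total = sum(ranked)
    curve = []
    running = 0.0
    for value in ranked:
        running += value
        curve.append(100.0 * running / total)
    return curve


def items_needed(contributions, target_percent):
    """Smallest number of top-ranked items whose cumulative share reaches
    the target percentage of the basket total."""
    for n_items, cumulative in enumerate(concentration_curve(contributions), start=1):
        if cumulative >= target_percent:
            return n_items
    return len(contributions)


# Hypothetical cost contributions of five foods to a poverty line basket.
costs = [15, 50, 5, 20, 10]
print(concentration_curve(costs))
print(items_needed(costs, 80))
```

Running the same calculation on both cost and calorie contributions shows how many items need to be priced before the marginal item adds little to either total.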
Figure 3.1: Concentration curves for poverty line food basket
[Figure: cumulative percentage of basket cost and of calories (vertical axis, 0 to 100) plotted against the number of food items (horizontal axis, 0 to 150).]
How many price observations per item
If prices are obtained from a market price survey, there is a choice of how many observations to
make on the price of each item. The standard in most LSMS surveys is three observations per
village (that is, per cluster). It is not clear if a fixed number of observations per item is the best
approach, although it does have the advantage of simplicity. A CBN food poverty line is a
statistic (essentially a weighted average of a set of average prices) although it is rare to see
standard errors reported for poverty lines. This statistic would be more precisely estimated if the
prices for the items contributing the most weight (e.g., rice) were based on larger samples than
the samples used to measure the price for minor items.
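The gain in precision from concentrating price observations on heavily weighted items can be sketched as follows. The sketch treats the food poverty line as a sum of fixed basket quantities multiplied by sample mean prices, and assumes that price quotes are independent draws with a known standard deviation for each item. These assumptions, the function name and all numbers are illustrative only.

```python
import math


def food_line_standard_error(quantities, price_sds, n_obs):
    """Standard error of sum_j q_j * mean(p_j) when each mean price is based
    on n_j independent quotes with standard deviation s_j."""
    variance = sum((q * s) ** 2 / n for q, s, n in zip(quantities, price_sds, n_obs))
    return math.sqrt(variance)


# Hypothetical basket: a staple with a large quantity weight and two minor items.
quantities = [150.0, 10.0, 5.0]   # basket quantities per year
price_sds = [0.2, 0.5, 1.0]       # standard deviation of price quotes

equal_allocation = [3, 3, 3]      # three quotes per item
skewed_allocation = [7, 1, 1]     # same nine quotes, concentrated on the staple

se_equal = food_line_standard_error(quantities, price_sds, equal_allocation)
se_skewed = food_line_standard_error(quantities, price_sds, skewed_allocation)
print(round(se_equal, 2), round(se_skewed, 2))
```

With the same total number of quotes, shifting observations towards the item that dominates the basket lowers the standard error in this example, although the simplicity of a fixed number per item is lost.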
3.2 Where to collect prices and the appropriate geographical scale for reporting
The aim should be to collect prices in the markets actually used by the households in the sample,
noting the need to also keep comparability with what was done in the previous survey. Thus it is
worthwhile asking respondents in the consumption questionnaire where they actually buy their
items. Otherwise an approach of just visiting the nearest markets and asking vendors the price of
particular goods (as was done by the LSMS surveys) can be subject to the criticism that this is
possibly the wrong market. Other criticisms of the approach are that prices could be collected for
the wrong specification of goods and that the prices quoted may not be the prices actually paid
by local residents because of bargaining (Deaton and Grosh, 2000).
It is also possible that some prices will need to be collected from larger, more regional, markets
because specialized items may not be available in local markets. There are three options for
dealing with these missing local prices:
• Apply the price from a neighboring market (essentially a form of ‘hot deck’ technique
that survey software often applies to missing data)
• Apply prices that are obtained in larger markets to a whole region, and
• Use regression to predict the price of missing items, based on the price of some other
item more widely available.
The logic of the regression approach is that spatial price differences may reflect transport costs,
so if goods are coming from a common source (say a port) and moving into the hinterland, prices
may tend to move proportionally.31 Of course if there are more complicated commodity flows,
with missing prices reflecting seasonality, environmental constraints (e.g., altitudinal limits), etc.,
then none of these imputation approaches will be very reliable.
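The regression option can be sketched as follows. The sketch fits a log-linear relationship between the price of the item that is sometimes missing and the price of a more widely available proxy item, using the markets where both are observed, and then predicts the missing price elsewhere. The log-log functional form, the function names and the numbers are assumptions made for illustration.

```python
import math


def fit_log_price_relation(proxy_prices, target_prices):
    """Ordinary least squares of ln(target price) on ln(proxy price),
    using markets where both prices are observed."""
    x = [math.log(p) for p in proxy_prices]
    y = [math.log(p) for p in target_prices]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_price(proxy_price, intercept, slope):
    """Predicted price of the missing item in a market where only the
    proxy item was priced."""
    return math.exp(intercept + slope * math.log(proxy_price))


# Hypothetical markets where both items were priced.
proxy_observed = [1.0, 2.0, 4.0]
target_observed = [3.0, 6.0, 12.0]

intercept, slope = fit_log_price_relation(proxy_observed, target_observed)
print(round(predict_price(8.0, intercept, slope), 2))
```

A slope close to one corresponds to the case described above, where prices of the two items move proportionally as goods travel inland from a common source.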
In terms of the geographical scale at which to calculate average prices (as an input to the food
poverty lines), most surveys, and the subsequent poverty analyses, report these for only a few
major regions despite prices being collected from a far larger number of communities. There are
at least three reasons for this aggregation:32
• concern about missing prices at the local level (see above)
• measurement error because the prices observed in a single village market on a given day
are only a snapshot taken with a very small sample. By averaging over prices collected
in surrounding markets within the region, the share of the variance due to random
measurement error will be reduced, and
• introduction of temporal variation such that the prices obtained in a village on a given
day do not reflect the ‘usual’ prices facing the households in that community. Regional
prices may be more representative because surveys that stagger fieldwork over several
months or a year will have price samples within a region that are collected over the
entire duration of the fieldwork (unless the survey works entirely in one region and then
moves to the next region). But prices in a single village are likely to be collected only
once, and so will reflect both spatial and temporal/seasonal variation and it will not be
possible for the poverty analyst to identify the purely spatial part, which is needed for
setting the regional poverty lines.33
31
Glewwe (1991) used the same logic when taking the price of a can of tomato paste as a proxy for non-food prices
in an early LSMS survey in Côte d’Ivoire because the non-food prices that were collected were poorly measured.
32
Additionally, there may also be concerns about estimating the non-food allowance separately for every cluster in the
sample, which will introduce a large number of intercepts into equation (1).
On the other hand, there are some costs of using regional average prices rather than local prices.
Regional prices will overstate the cost of buying the poverty line basket of foods in low-price
communities within each region, while understating it for others. Measured poverty will be too
high in the low-price communities because these same (high) prices are not used for valuing food
consumption. Hence, some households will be above the poverty line if that line is priced using
local (i.e., cluster-level) prices, but below the poverty line if regional average prices are used.
Bias in the opposite direction (measured poverty too low) will occur in clusters where regional
average prices understate the local cost of the poverty line basket of foods. These effects don’t
necessarily net out if the distribution of prices within regions is non-Normal, which is what
Gibson and Rozelle (1998) found for Papua New Guinea for three-quarters of foods and regions.
Consequently, the headcount index at the food poverty line was 17 percent when using regional
average prices and only 14 percent when using cluster-level prices.
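A stylized example may help to show why the two biases need not cancel when prices are skewed within a region. In the sketch below, three of four clusters have a low basket cost and one has a high cost, so the regional average lies above the cost in most clusters. All numbers are hypothetical and are chosen only to make the mechanism visible; they are not the Papua New Guinea data.

```python
def headcount(consumption, poverty_lines):
    """Share of households whose consumption is below their poverty line."""
    poor = sum(1 for c, z in zip(consumption, poverty_lines) if c < z)
    return poor / len(consumption)


# Hypothetical cost of the poverty line basket in four clusters of one region.
cluster_basket_cost = [0.9, 0.9, 0.9, 1.3]
regional_cost = sum(cluster_basket_cost) / len(cluster_basket_cost)

# Hypothetical nominal consumption of the two households sampled in each cluster.
household_consumption = [0.95, 1.5]

consumption, local_lines, regional_lines = [], [], []
for cost in cluster_basket_cost:
    for c in household_consumption:
        consumption.append(c)
        local_lines.append(cost)             # line priced with cluster-level prices
        regional_lines.append(regional_cost)  # line priced with the regional average

headcount_local = headcount(consumption, local_lines)
headcount_regional = headcount(consumption, regional_lines)
print(headcount_local, headcount_regional)
```

In this example the households just above the local line in the three low-price clusters are all classified as poor under the regional line, while only one household in the high-price cluster is reclassified in the other direction.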
3.3 How to collect price information?
There are four different methods available for obtaining information on the local prices faced by
households: community price surveys, unit values, price opinions and using prices already
collected for on-going surveys like the CPI. According to Frankenberg (2000), little is known
about how to collect data on community-level prices and there have been many problems in past
LSMS studies, so she recommends that more than one method be used (specifically price
opinions and community price surveys). This duplication would enable a poverty analyst to take
the average of what are, potentially, error-ridden measures of prices, although this averaging may
be useful only if the measurement errors are random.34
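The condition that measurement errors be random can be made explicit with a simple derivation. Suppose that two methods each measure the true price p with error, and that the analyst averages them. The notation below (p_k, e_k, b and sigma) is introduced here for illustration, and equal error variances are assumed for simplicity.

```latex
\begin{align*}
p_k &= p + e_k, \quad k = 1, 2, \qquad \bar{p} = \tfrac{1}{2}(p_1 + p_2) \\
\text{Random errors: } & E[e_k] = 0,\ \operatorname{Var}(e_k) = \sigma^2,\ \operatorname{Cov}(e_1, e_2) = 0
  \;\Rightarrow\; \operatorname{Var}(\bar{p} - p) = \tfrac{1}{2}\sigma^2 \\
\text{Systematic error in method 1: } & E[e_1] = b \neq 0
  \;\Rightarrow\; E[\bar{p} - p] = \tfrac{1}{2} b, \quad
  \operatorname{MSE}(\bar{p}) = \tfrac{1}{4} b^2 + \tfrac{1}{2}\sigma^2
\end{align*}
```

Under these assumptions, averaging halves the error variance when both methods are unbiased, but it has a larger mean squared error than the unbiased method used alone (which has mean squared error of sigma squared) whenever b squared exceeds twice sigma squared.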
3.3.1 Community market price surveys
Community market price surveys of the sort used by the LSMS are described fully by
Frankenberg (2000). Yet with the exception of the LSMS surveys, it has not been common for
household surveys to include a community price survey. For example, state statistical bureaus in
countries such as China, Indonesia and Pakistan do not collect market price data that can be
matched to their rural household income and expenditure surveys. Even research-driven surveys
like the Indonesia Family Life Survey gather only incomplete price data (e.g., IFLS2 used a
consumption recall with 37 food items, but market price surveys were carried out for only nine
foods). Even some of the LSMS surveys did not always use the community price survey data for
poverty analysis, instead using the unit values.
Reasons for being skeptical about community price surveys are described by Deaton and Grosh
(2000), who suggest that they may be unreliable due to being gathered from the wrong market,
for the wrong specification of goods, or for prices that are not actually paid by local residents
due to bargaining and other interactions between buyer and seller.35 However, there is almost no
evidence on these problems, whereas evidence on the problems with unit values is accumulating
(see below). Moreover, the prices for well-defined items collected from market surveys using certain
sampling rules should be the standard for poverty analysis. Since the price collection exercise
requires weighing and measuring, it can also collect other useful data, such as for the conversion
factors to transform traditional units into metric units. Without such factors, reliance has to be
placed on econometric procedures (Capeau and Dercon, 2006) that are essentially untested.
33
Surveys with a within-year longitudinal component are an exception. Muller (2002) reports on an example of such
a survey from Rwanda, where the same households and villages were revisited four times throughout the year.
34
For example, if one method of gathering data systematically understates prices, averaging over this method and a
more reliable method will create more measurement error rather than less.
3.3.2 Unit values
Many consumption surveys also collect food quantities so unit values can be calculated from the
ratio of expenditures to quantity. In addition to their availability, there are two other potential
advantages of using unit values as a proxy for market prices. First, because they are collected
along with household variables it is possible to create price indexes where both ‘prices’ and the
weights in any price index or poverty line are tailored to specific groups in the population. This
may be helpful if markets are segmented, so that different population groups face different
prices. Additionally, unit values can be a rich source of data because there are typically far more
observations (potentially millions in the case of surveys like India’s National Sample Survey
(NSS) and Indonesia’s SUSENAS with large samples of households and a large number of
commodities in the consumption recall) than are available from traditional price surveys.
Good examples of the uses of unit values are provided by Deaton, Friedman and Alatas (2004)
and Deaton and Tarozzi (2005). In both cases these studies use the quantities and expenditures
from the NSS in India, with the first study also using data from the SUSENAS survey in
Indonesia. These studies calculate price indexes for urban and rural sectors, and major states in
India, and also PPP exchange rates between Indonesia and India. The pattern in these indexes
and their movements over time are contrasted with the price indexes that are implicit in the
official poverty lines (and in the Penn World Tables for the case of the PPP exchange rates), and
a number of key differences are highlighted in terms of trends in poverty and living standards
over both time and space.
Offsetting the potential advantages of unit values are four key features which prevent them from
being used directly as a proxy for price:
1. Unlike prices, unit values are available only for purchasers.36 This is particularly a
problem where no households within a survey cluster make a purchase, because then
there is no proxy for the market price in that community. A sample selection problem
may result because the communities where purchases are recorded by the survey may
differ from those where no purchases and unit values are observed (especially because
non-purchase may reflect either that households in that community are self-sufficient in
the good, or conversely that they never consume it). Gibson and Rozelle (2005) give an
example of the extent of this problem: in Papua New Guinea only three-quarters of
survey clusters had a unit value for banana (a secondary staple with an average budget
share of six percent) and only one-half had a unit value for beer (an item with an average
budget share of two percent). The situation was even worse in rural areas, where only one
quarter of clusters had a unit value for beer. Thus relying on purchase behaviour to obtain
unit values and using these as proxies for local prices may cause a poverty analyst to miss
the full range of spatial price variation in a sample.
35
Examples of quality problems in the LSMS community market price surveys presumably include Tajikistan,
where the data were never released due to quality issues, and Côte d’Ivoire where Glewwe (1991) had to use the
price of a can of tomato paste to proxy for the non-food prices, which had measurement problems.
36
Unit values may also be available for own-producers, and for gift givers and receivers because surveys often ask
for quantities and values in the modules of the survey dealing with these means of obtaining consumption goods.
However, these unit values are typically not used as proxies for unavailable price data because they do not refer to
transactions taking place through the market. Gibson and Rozelle (2005) show that there is little agreement amongst
the different types of unit values: for those households in Papua New Guinea who both purchased and produced
either sweet potato, banana or betelnut (three key commodities, comprising over 20% of the average household
budget), the average correlation between the two types of unit values is only 0.26. For those who both purchased and
received gifts, the average correlation is 0.43.
2. Unit values are subject to quality effects. As Prais and Houthakker (1955, p.110) first
pointed out: “An item of expenditure in a family-budget schedule is to be regarded as the
sum of a number of varieties of the commodity each of different quality and sold at a
different price.” Consequently, as the mix of varieties purchased changes across
households, the unit value will change, even if underlying prices are the same. The mix of
varieties is likely to change with changes in household income, household size, and price
changes, all of which affect the real living standard of household members. These
responses may be captured in the ‘quality elasticity’ discussed in equation (7) above and
repeated here:
$$\ln v_i = \alpha + \beta \ln x_i + \gamma \cdot z + \delta c + u_i \qquad (7)$$
noting that the response of quality to price is typically unknown because unit values tend
to be used only when prices are either unavailable or poorly measured.37
The joint response of both quantity and quality to price changes may be particularly
concerning when using unit values to estimate poverty lines. In markets where prices are
high, consumers may react by choosing lower quality, and where prices are low they may
choose higher quality (Deaton, 1997). This type of correlated demand response is not
captured by equation (5) because it affects all households in a community equally
(assuming they face the same prices) and so cannot be identified by the within-cluster
variation in unit values that is due to household characteristics. Consequently, because
unit values reflect both price and quality, they will tend to vary by less than prices and
poverty lines calculated from them may understate spatial and temporal differences in the
cost of living even after removing the effects of quality variation identified from
idiosyncratic household characteristics.
3. Unit values will reflect measurement errors in quantities, expenditures, or both. Even if
all households consumed the same varieties of a particular good and paid the same price,
the reported unit values could show considerable variation. Household surveys often
require respondents to recall the value and quantities of their expenditures (or
consumption) for the previous week, fortnight, month or even year. This difficult task
cannot be done with perfect accuracy, so reporting errors induce a variation in unit values
that might be mistaken for genuine variation in prices. It is possible that these errors
cancel out in large enough samples and when only a measure of central tendency like the
mean is required, although even for this requirement there is contrary evidence (see
below).
37
In equation (5) the cluster level dummy variables allow the other coefficients to be consistently estimated even in
the absence of the missing price data.
4. Insertion of unit values into a standard formula for a price index is likely to lead to bias
because in the presence of price dispersion across households (which is inherent in unit
values) there is no exact aggregation of the price changes felt by households into an
overall price index. Empirical evidence suggests that the unit value indexes often fall
outside the Laspeyres-Paasche bounds, indicating that the bias in these can be larger than
the more widely studied bias from using either the Laspeyres or the Paasche formula
when aggregating individual price changes into an aggregate index (Silver and Webb,
2000; Bradley, 2005).
A number of procedures are used to mitigate the problems affecting unit values,
and may deal with more than one problem at once. For example, a regression can be used to
predict unit values for those clusters that have none, while also stripping household-specific
quality effects out of the unit values for those clusters with data.
Procedures for dealing with quality effects
Visual inspection and regression techniques can help to detect quality effects in unit values.
Deaton, Friedman and Alatas (2004) search for multi-modality, which may indicate that unit
values come from survey categories containing several distinct goods, each with different prices.
For example, “other milk products” in the Indian NSS appeared to have multiple modes. In
contrast, the category “rice” was better defined, with almost 30 percent of rural households
reporting buying it at exactly 10 rupees per kilogram, and 25 percent of urban households buying
it at exactly 12 rupees per kilogram. This within-category variability can also be picked up by
regressions of unit values on household expenditure and other characteristics (see equation (5)).
Similarly, an ANOVA should detect cluster, district and seasonal effects if unit values are
picking up spatial and temporal price effects rather than just quality variations across households.
Household-specific quality effects can be removed from unit values using the coefficients from a
regression of the unit value on a vector of household characteristics. Appendix 1 provides an
example of code to do this that was used by Gibson and Rozelle (2005) when purging unit values
of quality effects, prior to the calculation of a food poverty line in Papua New Guinea. There are
several variants to this type of procedure; Cox and Wohlgenant (1986) regress unit value
deviations from regional/seasonal means on household variables while the various studies by
Deaton and co-authors (summarized in Deaton, 1997) typically use double log regressions like
equation (7). The strength of these household-specific quality effects is likely to vary with the
degree of heterogeneity of the commodity (informally produced and marketed root crops may be
more variable than price-controlled sugar). They will also reflect the broadness of the commodity
category. For example, in the SUSENAS survey from Indonesia the quality elasticity for the
broad category of meat is 0.12 but when a finer disaggregation is used the quality elasticity for
beef is only 0.05 and for chicken 0.04 (Olivia and Gibson, 2005).
While household-specific quality effects can be purged from unit values with a regression there
may still be an uncorrected response of quality to price in the resulting purged unit values. The
reason is that the cluster dummy variables in equation (7) are picking up two sources of variation
between clusters: genuine price variation and possible responses of cluster average quality to
price differences. It may be reasonable to assume that if the estimated response of quality to
income variations within clusters is small (as shown by low quality elasticities in equation (7)),
so too should the response of quality to price differences between clusters be small (seeing as
price differences can be treated as equivalent income effects).38
Procedures for dealing with measurement error
Measurement errors in unit values reflect errors in survey estimates of food quantities, food
expenditures or both. There is likely to be greater interest in these estimates than there is in unit
values, so survey procedures should anyway be attempting to deal with these errors and there is a
large literature on the choices available (see Deaton and Grosh, 2000 for a brief review). One
source of potentially large errors in unit values is discrepancies between the measuring units that
are reported by respondents and entered into the survey database. This discrepancy may arise
when farmers and consumers use traditional units rather than the metric ones needed by poverty
analysts. An econometric procedure for dealing with this problem is suggested by Capéau and
Dercon (2005) who apply it to poverty measurement in Ethiopia.
The main ex post procedure for dealing with measurement error in unit values is to trim them
before they are used to calculate average prices. For example, Deaton, Friedman and Alatas
(2004) trim the top and bottom one percent of unit values, Deaton and Tarozzi (2005) trim log
unit values that are above or below 2.5 standard deviations from the mean of the log unit value,
while Gibson and Rozelle (2005) follow Cox and Wohlgenant (1986) and trim unit values more
than five standard deviations above or below the mean. A useful tool for detecting outliers,
especially due to problems with the units of quantity measurement that are entered into the
survey database (e.g. grams entered as kilograms) is to plot unit values for each household
against the average unit value from other households in the same cluster.
Even after trimming possible outliers, the mean is less robust than either the median or the mode,
and good arguments can be made for using these measures when calculating average unit values
by cluster or by region and season.
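The trimming rules and the robust cluster average discussed above can be sketched as follows. This is a minimal illustration using NumPy; the function names and the example data (40 plausible unit values plus one unit-coding error) are assumptions, and the cited studies may differ in implementation details.

```python
import numpy as np


def trim_percentiles(uv, lower=1.0, upper=99.0):
    """Keep unit values lying between the lower and upper percentiles (inclusive)."""
    uv = np.asarray(uv, dtype=float)
    lo, hi = np.percentile(uv, [lower, upper])
    return uv[(uv >= lo) & (uv <= hi)]


def trim_sd(uv, n_sd=5.0, use_logs=False):
    """Drop values more than n_sd standard deviations from the mean.

    With use_logs=True the rule is applied to log unit values.
    """
    uv = np.asarray(uv, dtype=float)
    x = np.log(uv) if use_logs else uv
    keep = np.abs(x - x.mean()) <= n_sd * x.std(ddof=1)
    return uv[keep]


def cluster_medians(uv, cluster):
    """Median unit value in each cluster, returned as {cluster: median}."""
    uv = np.asarray(uv, dtype=float)
    cluster = np.asarray(cluster)
    labels = sorted(set(cluster.tolist()))
    return {c: float(np.median(uv[cluster == c])) for c in labels}


# Hypothetical data: 40 plausible unit values plus one unit-coding error.
unit_values = np.append(np.linspace(9.0, 11.0, 40), 10000.0)
trimmed_pct = trim_percentiles(unit_values)                  # 1st-99th percentile rule
trimmed_5sd = trim_sd(unit_values, n_sd=5.0)                 # 5 standard deviation rule
trimmed_log = trim_sd(unit_values, n_sd=2.5, use_logs=True)  # 2.5 SD rule on logs

# The median is unaffected by the single extreme value in cluster "a".
medians = cluster_medians([9.0, 10.0, 50.0, 12.0, 12.5], ["a", "a", "a", "b", "b"])
```

One caveat on the standard deviation rules: with n observations, no value can lie more than (n-1)/sqrt(n) sample standard deviations from the mean, so a five standard deviation rule cannot remove anything in samples smaller than about 28 observations.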
Evidence on the performance of unit values
There is mixed evidence on how reliable unit values are, even after procedures have been used
for dealing with clusters with missing values, quality effects and measurement errors. Gibson
and Rozelle (2005) use unit values for nine major foods that contribute half of the poverty line
food basket in Papua New Guinea. After trimming and stripping household-specific quality
effects, the purged unit values are averaged by cluster and then by region, and then used to
calculate the food poverty line. The resulting food poverty line is overstated by 11 percent,
compared with its value when prices from a community market survey are used (Table 3.1).39 In
this setting it is argued that market surveys provide a good benchmark because there is no
38 See Deaton (1997) for a discussion of this separability theory of quality. While the discussion is in the context of
demand estimation rather than poverty measurement the same issues apply.
39 For some regions the overstatement ranges from 16 to 20 percent. The values in column 2 of Table 2 are population-weighted averages of the regional poverty lines.
haggling, local markets are well defined and geographically separated, and there is not much
quality variation amongst goods across the various markets. The overstatement would have been
even larger if the unit values were not purged of household-specific quality effects. This
overstatement of the food poverty line makes a significant difference to the estimated poverty
rates.
Table 3.1: Poverty measures with unit values and community price surveys
Cost of poverty line food    Poverty line       Headcount    Poverty gap    Poverty
basket calculated from:      (Kina per year)    index        index          severity index
Community price survey       K334               22.0         5.9            2.4
Unit values                  K385               30.0**       8.9**          3.8**
Purged unit values (a)       K370               28.0**       8.0**          3.4**
Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression like equation (5).
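The percentage overstatements quoted in the text can be checked against the rounded entries of Table 3.1. The helper name `pct_overstatement` is an illustrative assumption, and because the table entries are rounded the results are approximate.

```python
def pct_overstatement(estimate, benchmark):
    """Percentage by which an estimate exceeds its benchmark."""
    return 100.0 * (estimate - benchmark) / benchmark


# Rounded entries from Table 3.1 (benchmark: community price survey).
line_purged = pct_overstatement(370, 334)         # food poverty line, purged unit values
line_raw = pct_overstatement(385, 334)            # food poverty line, raw unit values
headcount_purged = pct_overstatement(28.0, 22.0)  # headcount index, purged unit values

print(round(line_purged), round(line_raw), round(headcount_purged))
```

The first and third figures correspond to the 11 percent overstatement of the food poverty line and the 27 percent overstatement of the headcount index discussed in the text.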
Similar evidence of an overstatement in poverty estimates when using unit values is found by
Capéau and Dercon (2005) for Ethiopia when they compare with the results using community
surveys. The degree of overstatement is not quite as large as reported in Table 3.1 for PNG
(17 percent rather than 27 percent for the headcount index) but it is still disquietingly large.
Moreover, in the Ethiopian example, unit values also resulted in larger poverty fluctuations over
time than those coming from the market price surveys.
In contrast to these negative results, the poverty lines estimated from unit values by Deaton and
co-workers in India appear plausible and consistent with what limited information is available on
the spatial and temporal distribution of prices. Thus, until more is known about the performance
of unit values, poverty analysts should be cautious when using them to calculate poverty lines,
and where possible should seek additional information on prices. This additional information
may validate the unit values or it may prove to be a more reliable proxy for local market prices.
3.3.3 Price opinions
In addition to community market surveys, unit values and the transaction records from expenditure
diaries, a third source of information about local market prices is for surveys to solicit opinions
from key informants (Saunders and Grootaert, 1980). Variants of the idea have been used in all
three waves of the Indonesia Family Life Survey (IFLS) where community informants are asked
about the local prices of several food and non-food items.40 A further development of the idea of
soliciting price opinions was used by Gibson and Rozelle (2005) in Papua New Guinea. In this
survey, respondents in the sampled households were shown photographs of a variety of different
items and asked their opinion about the current price of the same items in local markets. By using
the full group of sampled households, this approach was able to overcome concerns about the price
40 In wave 3 of the IFLS, fielded in 2000, prices were obtained on 32 foods and 7 non-foods from the volunteers
who staff the local health post. In previous waves, questions about the prices of fewer items were administered to the
staff of the health post and to the head of the Village Women’s Group (Ibu PKK) and one or more of her staff
members in a group interview.
opinions coming from an unrepresentative group. Moreover, by getting price opinions from every
sampled household in a cluster it was possible to treat the opinions analogously to unit values and
apply the equation (5) framework for estimating the determinants of the within-cluster variation.
The results suggested that the quality elasticities for price opinions were only one-quarter of the
size of those for unit values and were all statistically insignificant. The price opinions also
averaged only one-quarter of the measurement error variance of the unit values (that is, variability
about the cluster mean which was not explained by household characteristics) and the covariance
between these measurement errors and the actual demands averaged only one-tenth of that for the
unit values. A final advantage of obtaining price opinions is that there are far fewer missing
observations than with either unit values or community market price surveys.
In terms of poverty measurement, the results using the price opinions in the PNG survey were
closer to those from the benchmark (the community market price survey) than were the results
from using unit values. The population-weighted average of the regional food poverty lines was
only overstated by 3.6 percent using the price opinions, compared with an 11 percent
overstatement when unit values were used (Table 3.2). The headcount poverty rate was only
overstated by eight percent and the difference was not statistically significant, whereas when unit
values were used it was overstated by more than one-quarter.
Table 3.2: Poverty measures with price opinions, unit values and community price surveys
Cost of poverty line food    Poverty line       Headcount    Poverty gap    Poverty
basket calculated from:      (Kina per year)    index        index          severity index
Community price survey       K334               22.0         5.9            2.4
Unit values (a)              K370               28.0**       8.0**          3.4**
Price opinions               K345               23.8         6.8**          2.8**
Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression like equation (5).
The price opinions obtained in the IFLS have not been as closely studied as those from the PNG
survey but a preliminary analysis suggests that the price opinions are at least as good a proxy for
market prices as are unit values. The evidence is limited because there are only six commodities
with price opinions, unit values and community market prices available (rice, beef, sugar, cooking
oil, kangkung and kerosene).41 However, these six items contribute almost one-quarter of the
average consumption budget. The average values of the community market prices, unit values
and price opinions for each of these six commodities are reported in columns 2-4 of Table 3.3,
where the averages are calculated after removing outliers that are more than five standard
deviations from the mean. The proportionate deviation from the market prices is in Columns 5-6
41 The survey recorded information about household total expenditures on 37 food items, but quantity consumed,
and hence the unit value, is available only for seven commodities: rice, beef, chicken, kangkung (water spinach),
cooking oil, granulated sugar and kerosene. The market price survey obtained the prices of 12 foods (rice, noodles,
beef, salted fish, sugar, salt, cooking oil, sweetened condensed milk, banana, kangkung, tofu, and milk powder) and
kerosene. The opinions about prices were obtained for 32 foods and 7 non-foods.
and with the exception of kangkung (water spinach) the averages are within 20 percent of the
average market price. Aggregating across the commodities, the mean unit value deviates from
the mean market price by 10 percent, while the mean price opinion deviates from the mean
market price by only six percent.
The correlation between price opinions and market prices is also substantially higher than that
between unit values and market prices, especially for rice.42 Whether these correlations are high
enough for either unit values or price opinions to be an adequate proxy depends both on whether
community market prices are treated as a defensible benchmark and on the particular purpose
that the prices are being used for. But regardless of that issue, if an analyst needs either
alternatives to or supplements for community market prices, the current evidence suggests that
price opinions would do at least as well as unit values, even when collected from only one group
per community and without the aid of pictures. Moreover, price opinions have the potential
advantage of being more widely available; across the six commodities in Table 3.3 the minimum
number of communities with price opinions was n=304, compared with n=278 for the unit values
and n=147 for the market prices.
Table 3.3: Market prices, unit values and price opinions in the Indonesia Family Life Survey
                                                    Deviation from          Correlation with
              Mean        Mean       Mean           market price (b)        market prices
              market      unit       price          Unit      Price         Unit      Price
              price (a)   value (a)  opinion (a)    values    opinions      values    opinions
Rice          2102        2190       2034           0.04      0.03          0.236     0.605
Beef          25925       22296      25975          0.14      0.00          0.236     0.372
Sugar         3416        3379       3364           0.01      0.02          0.284     0.310
Cooking oil   4309        3645       3509           0.15      0.19          0.263     0.139
Kangkung      284         622        329            1.19      0.16          0.074     0.450
Kerosene      544         587        563            0.08      0.03          0.365     0.444
Average (c)                                         0.10      0.06          0.250     0.483
Source: Author’s calculation from IFLS data.
(a) Rupiah per kilogram (except cooking oil, which is in Rupiah per litre), as calculated from cluster-level
averages after outliers more than 5 standard deviations from the mean have been removed. The averages are based
on a common sample that has all three price measures available.
(b) Absolute value, as a proportion of the mean market price.
(c) Weighted by each commodity’s share in total consumption, in 1999 SUSENAS results.
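The deviation columns of Table 3.3 can be reproduced from the three mean price columns, as the following sketch shows. The variable and function names are illustrative assumptions. The share-weighted averages in the final row are not reproduced because the SUSENAS consumption shares are not reported here.

```python
# Mean market price, mean unit value and mean price opinion, copied from Table 3.3.
table_3_3 = {
    "Rice": (2102, 2190, 2034),
    "Beef": (25925, 22296, 25975),
    "Sugar": (3416, 3379, 3364),
    "Cooking oil": (4309, 3645, 3509),
    "Kangkung": (284, 622, 329),
    "Kerosene": (544, 587, 563),
}


def abs_proportional_deviation(value, market_price):
    """Absolute deviation as a proportion of the mean market price."""
    return abs(value - market_price) / market_price


# For each commodity: (deviation of unit value, deviation of price opinion).
deviations = {
    item: (
        abs_proportional_deviation(unit_value, market),
        abs_proportional_deviation(opinion, market),
    )
    for item, (market, unit_value, opinion) in table_3_3.items()
}

for item, (dev_uv, dev_op) in deviations.items():
    print(f"{item:12s} unit value: {dev_uv:.2f}  price opinion: {dev_op:.2f}")
```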
42 This claim also holds if the price data are aggregated into province-level price indexes. Specifically, Tornqvist
indexes were calculated from data on budget shares (from the 1999 SUSENAS consumption module) in province
pairs, multiplied by the log ratio of prices, where we use Jakarta province as the reference point. Comparing these
price indexes across the 13 provinces covered by IFLS gave a correlation between the market price and unit value
indexes of 0.66, compared with 0.77 between the indexes from the price opinions and market prices.
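For readers unfamiliar with the index used in footnote 42, the standard Törnqvist formula weights the log price ratio of each item by the average of its budget shares in the two places being compared. The sketch below applies that textbook formula to hypothetical prices and shares; the function name, the item names and the data are assumptions, and the details of the author's own calculation (for example, how shares were normalized) are not known from the text.

```python
import math


def tornqvist_index(prices, ref_prices, shares, ref_shares):
    """Törnqvist price index of one province relative to a reference province.

    Each argument maps item name -> price or budget share.
    """
    log_index = 0.0
    for item in ref_prices:
        avg_share = 0.5 * (shares[item] + ref_shares[item])
        log_index += avg_share * math.log(prices[item] / ref_prices[item])
    return math.exp(log_index)


# Hypothetical prices and budget shares for two items.
ref_prices = {"rice": 2000.0, "sugar": 3400.0}
ref_shares = {"rice": 0.6, "sugar": 0.4}
prov_prices = {"rice": 2200.0, "sugar": 3400.0}
prov_shares = {"rice": 0.7, "sugar": 0.3}

index = tornqvist_index(prov_prices, ref_prices, prov_shares, ref_shares)
print(f"Price index relative to the reference province: {index:.3f}")
```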
Section 4: No Previous Poverty Lines and Survey Fieldwork Completed
This section is for those poverty analysts working in a country with no previous poverty line and
after the household survey is already complete. There are not many choices in this situation since
the designers of the household survey have, perhaps unwittingly, already taken many of the
decisions for the poverty analyst. In particular, they will have chosen which items to obtain
prices for and how to obtain them (unit values versus community price surveys). However, since
there is no previous poverty line this lack of choice may be less constraining than if the poverty
analysis also needed to maintain comparability with what has been done to measure poverty in
the past.
Nevertheless, this freedom from attempting retrospective comparability does not exempt the poverty
analyst from thinking about comparability with poverty calculations in other years. Instead,
they need to consider the impact of their choices on future attempts to make (backward looking)
temporal poverty comparisons. For example, if the household survey used a method that is
unlikely to be sustainable (e.g., intensive diaries so that a detailed basket for a CPI could be
formed for the country) it may be unwise to form a food poverty line that can be re-priced in the
future only with a similarly intensive survey. Instead a more abbreviated food poverty line that
used only the main items that any future survey would also include could be a more sustainable
basis for future poverty monitoring in the country.
A choice over how many items to include in the food poverty line also depends on the nature of
the consumption module of the survey. If it is an LSMS style multi-topic survey the consumption
recall may have only 50 categories of food. There will likely be many more food categories if it
is a Household Income and Expenditure Survey (HIES) or a Household Budget Survey (HBS). If
the survey collects food quantities then there will also be unit values available for all of the
categories of food consumption, while a community market price survey would typically only
match the number of foods in a multi-topic survey.
The geographical scale at which to calculate average prices and price the food poverty line
depends on:
• how many missing prices there will be as a finer geographic scale is used for the
calculations and reporting of average prices,
• measurement error that may result from using prices observed in a single village market
on a given day. By averaging over prices collected in surrounding markets within the
region, the share of the variance due to random measurement error will be reduced, and
• introduction of temporal variation such that the prices obtained in a village on a given
day do not reflect the ‘usual’ prices facing the households in that community.43 Regional
prices may be more representative because surveys that stagger fieldwork over several
months or a year will have price samples within a region that are collected over the
entire duration of the fieldwork (unless the survey works entirely in one region and then
moves to the next region). But prices in a single village are likely to be collected only
43 If unit values are used and the recall is based on a “usual” month rather than the current month,
fortnight or week, then in principle the unit value should be free of any seasonal fluctuations. However, the evidence
reported in Figure 2 of Section 2 (p.25) indicates that even this method of getting unit values that are supposedly
free of seasonality does not seem to work.
once, and so will reflect both spatial and temporal/seasonal variation and it will not be
possible for the poverty analyst to identify the purely spatial part, which is needed for
setting the regional poverty lines.44
On the other hand, there are some costs of using regional average prices rather than local prices.
Regional prices will overstate the cost of buying the poverty line basket of foods in low-price
communities within each region, while understating it for others. Measured poverty will be too
high in the low-price communities because these same (high) prices are not used for valuing food
consumption. Hence, some households will be above the poverty line if that line is priced using
local (i.e., cluster-level) prices, but below the poverty line if regional average prices are used.
Bias in the opposite direction (measured poverty too low) will occur in clusters where regional
average prices understate the local cost of the poverty line basket of foods. These effects don’t
necessarily net out if the distribution of prices within regions is non-Normal, which is what
Gibson and Rozelle (1998) found for Papua New Guinea for three-quarters of foods and regions.
Consequently, the headcount index at the food poverty line was 17 percent when using regional
average prices and only 14 percent when using cluster-level prices.
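The way regional average prices can shift measured poverty, even though the biases run in opposite directions across clusters, can be illustrated with a deliberately simple example. Everything in the sketch below is hypothetical: the basket size, the two cluster prices, the household expenditures and the use of an unweighted mean for the regional price are assumptions chosen only to show the mechanism, not to reproduce the Papua New Guinea results.

```python
def headcount(expenditures, poverty_lines):
    """Share of households whose expenditure is below their poverty line."""
    poor = sum(1 for x, z in zip(expenditures, poverty_lines) if x < z)
    return poor / len(expenditures)


BASKET_QUANTITY = 100.0  # hypothetical units of food in the poverty line basket

# Hypothetical region with one low-price and one high-price cluster.
cluster_price = {"low": 2.0, "high": 4.0}
households = [  # (cluster, food expenditure valued at local prices)
    ("low", 180.0), ("low", 250.0), ("low", 290.0), ("low", 295.0),
    ("high", 310.0), ("high", 390.0), ("high", 450.0), ("high", 500.0),
]

# Unweighted mean of the cluster prices (an assumption of this sketch).
regional_price = sum(cluster_price.values()) / len(cluster_price)

expenditures = [x for _, x in households]
local_lines = [BASKET_QUANTITY * cluster_price[c] for c, _ in households]
regional_lines = [BASKET_QUANTITY * regional_price] * len(households)

headcount_local = headcount(expenditures, local_lines)
headcount_regional = headcount(expenditures, regional_lines)
print(headcount_local, headcount_regional)
```

In this constructed case the regional line classifies every household in the low-price cluster as poor and none in the high-price cluster, and the two errors do not cancel because the households are not distributed symmetrically around the lines.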
4.1 Which Prices to Use: Community Market Prices or Unit Values?
It is possible that there is only one type of price information available from the household survey and
in some cases there may be nothing available. If nothing is available from the survey, some other
data on prices will have to be used, such as from the Consumer Price Index. However, in many
countries the CPI is based mostly on urban prices, so these will not be very applicable either for
calculating poverty lines or for imputing the value of consumption for rural households.
If the survey has only one type of price information, where the likely choices are between prices
gathered in community price surveys and unit values, then the analyst is forced to use those. This
situation is very common; for example, state statistical bureaus in countries such as China,
Indonesia and Pakistan do not collect market price data that can be matched to their household
income and expenditure surveys in rural areas. Analysts thus have to use unit values from the
survey. In some other (typically urban) household surveys unit values can’t be calculated
because the survey doesn’t measure quantities (since quantities can be derived by dividing
expenditures by urban prices – which are already being collected for the CPI).
Surveys like the LSMS, which include both a community price survey and allow unit values to be
calculated (since they measure both the value and quantity of food purchases), are surprisingly rare.
But even if the analyst has no choice, it is worth knowing what the literature indicates about the
strengths and weaknesses of unit values compared to prices gathered from community market
surveys.
In terms of community market price surveys, the main criticisms are that these may be unreliable
due to being gathered from the wrong market, for the wrong specification of goods, or for prices
that are not actually paid by local residents due to bargaining and other interactions between
buyer and seller (Deaton and Grosh, 2000). However, there is little evidence for either the
existence or importance of these biases, in part because so many surveys do not collect
44 Surveys with a within-year longitudinal component are an exception. Muller (2002) reports on an example of such
a survey from Rwanda, where the same households and villages were revisited four times throughout the year.
community market prices.
The problems with unit values are that they are only available for households who make
purchases, they may refer to items of varying quality rather than a fixed specification and they
are likely to have measurement errors. The problem of unit values being tied to purchases
matters especially where no households within a survey cluster make a purchase, because then
there is no proxy for the market price in that community. A sample selection problem may result
because the communities where purchases are recorded by the survey may differ from those
where no purchases and unit values are observed (especially because non-purchase may reflect
either that households in that community are self-sufficient in the good, or conversely that they
never consume it).
The importance of the quality effects in unit values will depend on how broad and heterogeneous
is the category of food consumption that the unit value refers to. For a broad category, the mix of
varieties is likely to change with changes in household income, household size, and price
changes, all of which affect the real living standard of household members. These responses may
be captured in the ‘quality elasticity’ discussed in equation (7) above and repeated here:
\ln v_i = \alpha + \beta \ln x_i + \gamma \cdot z + \delta c + u_i \qquad (7)
This same regression can be used both to predict unit values for those clusters that have none,
and to remove household-specific quality effects. Visual inspection also can help to detect
quality effects; for example, Deaton, Friedman and Alatas (2004) search for multi-modality,
which may indicate that unit values come from survey categories containing several distinct
goods, each with different prices. For example, “other milk products” in the Indian NSS
appeared to have multiple modes. In contrast, the category “rice” was better defined, with almost
30 percent of rural households reporting buying it at exactly 10 rupees per kilogram, and 25
percent of urban households buying it at exactly 12 rupees per kilogram.
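A regression like equation (7) can be estimated by demeaning all variables within clusters, which is numerically equivalent to including a dummy variable for each cluster. The sketch below does this with NumPy and then removes the household-specific part of each log unit value. It is not the code from Appendix 1: the function name, the choice to evaluate purged values at sample-mean household characteristics, and the noise-free example data are all assumptions made for illustration. Predicting unit values for clusters with no purchases would need additional cluster-level information and is not shown.

```python
import numpy as np


def purge_quality(ln_v, ln_x, z, cluster):
    """Estimate beta and gamma in equation (7) within clusters and purge unit values.

    Returns (coef, purged_ln_v), where coef = [beta, gamma...] and purged_ln_v
    is the log unit value re-evaluated at sample-mean household characteristics.
    """
    ln_v = np.asarray(ln_v, dtype=float)
    cluster = np.asarray(cluster)
    X = np.column_stack([ln_x, z]).astype(float)

    # Demeaning by cluster is equivalent to including a dummy for every cluster.
    Xd, yd = X.copy(), ln_v.copy()
    for c in np.unique(cluster):
        m = cluster == c
        Xd[m] -= X[m].mean(axis=0)
        yd[m] -= ln_v[m].mean()

    coef, *_ = np.linalg.lstsq(Xd, yd, rcond=None)

    # Strip out the part of each unit value explained by household characteristics.
    purged = ln_v - (X - X.mean(axis=0)) @ coef
    return coef, purged


# Hypothetical, noise-free data: two clusters, three households each.
cluster = np.array([0, 0, 0, 1, 1, 1])
ln_x = np.array([1.0, 2.0, 3.0, 1.5, 2.5, 4.0])     # log household expenditure
hh_char = np.array([2.0, 3.0, 5.0, 4.0, 4.0, 6.0])  # one household characteristic
cluster_effect = np.array([0.0, 0.3])[cluster]
ln_v = 1.0 + 0.10 * ln_x + 0.05 * hh_char + cluster_effect

coef, purged = purge_quality(ln_v, ln_x, hh_char, cluster)
print(coef)  # estimated quality elasticity and household-characteristic coefficient
```

After purging, the remaining variation in the log unit values is between clusters only, so cluster averages of `purged` differ by the cluster effects. As the text goes on to explain, those cluster effects still mix genuine price differences with any response of average quality to price.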
Nevertheless, one quality problem that the regression cannot handle is the joint response of both
quantity and quality to price changes. In markets where prices are high, consumers may react by
choosing lower quality, and where prices are low they may choose higher quality (Deaton,
1997). This type of correlated demand response is not captured by equation (7) because it affects
all households in a community equally (assuming they face the same prices) and so cannot be
identified by the within-cluster variation in unit values that is due to household characteristics.
Consequently, because unit values reflect both price and quality, they will tend to vary by less
than prices and poverty lines calculated from them may understate spatial and temporal
differences in the cost of living even after removing the effects of quality variation identified
from idiosyncratic household characteristics.
Unit values will reflect measurement errors in quantities, expenditures, or both. Even if all
households consumed the same varieties of a particular good and paid the same price, the
reported unit values could show considerable variation. One source of potentially large errors in
unit values is discrepancies between the measuring units that are reported by respondents and
entered into the survey database. This discrepancy may arise when farmers and consumers use
traditional units rather than the metric ones needed by poverty analysts. An econometric
procedure for dealing with this problem is suggested by Capéau and Dercon (2005) who apply it
to poverty measurement in Ethiopia.
While these errors might be expected to cancel out in large enough samples there is contrary
evidence even when procedures are used to remove possible outliers and when unit values are
collapsed to cluster level (which should reduce the impact of measurement error).45 Gibson and
Rozelle (2005) use unit values for nine major foods that contribute half of the poverty line food
basket in Papua New Guinea. After trimming and stripping household-specific quality effects,
the purged unit values are averaged by cluster and then by region, and then used to calculate the
food poverty line. The resulting food poverty line is overstated by 11 percent, compared with its
value when prices from a community market survey are used (Table 4.1).46 In this setting it is
argued that market surveys provide a good benchmark because there is no haggling, local markets
are well defined and geographically separated, and there is not much quality variation amongst
goods across the various markets. The overstatement would have been even larger if the unit
values were not purged of household-specific quality effects. This overstatement of the food
poverty line makes a significant difference to the estimated poverty rates.
Table 4.1: Poverty measures with unit values and community price surveys
Cost of poverty line food    Poverty line       Headcount    Poverty gap    Poverty
basket calculated from:      (Kina per year)    index        index          severity index
Community price survey       K334               22.0         5.9            2.4
Unit values                  K385               30.0**       8.9**          3.8**
Purged unit values (a)       K370               28.0**       8.0**          3.4**
Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression like equation (5).
Similar evidence of an overstatement in poverty estimates when using unit values is found by
Capéau and Dercon (2006) for Ethiopia when they compare with the results using community
surveys. The degree of overstatement is not quite as large as reported in Table 4.1 for PNG
(17 percent rather than 27 percent for the headcount index) but it is still disquietingly large.
Moreover, in the Ethiopian example, unit values also resulted in larger poverty fluctuations over
time than those coming from the market price surveys.
In contrast to these negative results, the poverty lines estimated from unit values by Deaton and
co-workers in India appear plausible and consistent with what limited information is available on
the spatial and temporal distribution of prices. Thus, until more is known about the performance
of unit values, poverty analysts should be cautious when using them to calculate poverty lines,
and where possible should seek additional information on prices. This additional information
may validate the unit values or it may prove to be a more reliable proxy for local market prices.
45 Note that even after trimming possible outliers, the mean is less robust than either the median or the mode, and
good arguments can be made for using these measures when calculating average unit values by cluster or by region
and season.
46 For some regions the overstatement ranges from 16 to 20 percent. The values in column 2 of Table 2 are population-weighted averages of the regional poverty lines.
Section 5: Survey Fieldwork Completed and a Previous Poverty Line Exists
This section is for poverty analysts in the situation with the least choice. Since the household
survey is already completed the survey designers have, perhaps unwittingly, already taken many
of the decisions for the poverty analyst. In particular, they will have chosen which items to
obtain prices for and how to obtain them (unit values versus community price surveys). Since
there is a previous poverty line (and presumably poverty estimates) there will also be a demand
for temporal poverty comparisons. Hopefully it is feasible to maintain comparability with what
has been done to measure poverty in the past but this depends largely on the choices of others;
the poverty analyst from the previous survey and the designer of the current survey.
Where are the choices made by others most likely to be irreconcilable? The poverty
analyst should compare what prices were gathered with what prices are needed for the poverty
line updating, and where and how they were gathered. Specific questions they might like to ask
include:
• Do all foods that were priced in the previous survey and used in the poverty line
calculation have prices gathered in the current survey, and are these prices
gathered in the same way and at the same geographic scale?
• Are the same methods used as in the previous survey for imputing missing values, and
calculating spatial price indexes and the cost of the poverty line bundle of foods (if a Cost
of Basic Needs poverty line was used)?
In the perhaps unlikely case that the survey provides multiple measures, such as having available
both unit values and price data from community market price surveys, the choice for the poverty
analyst is clear; use whichever price data are most consistent with the practice of the previous
poverty estimates for backward looking comparisons. There may be grounds for also using the
other type of data for making alternative poverty estimates if there are reasons to believe either
that the sort of data used in the past are unsuitable or that survey practice in the country is going
to change so that these alternative estimates provide the baseline for forward-looking poverty
comparisons.
The more difficult situation is where the survey has not built any redundancy into the range of
measures available, so prices are measured either with unit values (e.g. the survey gathers
quantities in the recall or diary section but does not have a price survey) or with community
market price surveys (e.g. quantities are not collected since they can be derived as expenditure
divided by community market price). In this setting the analyst has to consider what has been
learnt from comparisons between poverty estimates that rely on two (or more) different types of
price data in the same setting. If these comparisons show no effect, then the fact that the current
survey gathered prices in a different form from that required by the previous poverty line
may not cause any temporal inconsistency.
The evidence on the sensitivity of poverty estimates to the choice of variable for measuring
prices is not very reassuring for analysts stuck in this situation. For example, Gibson and Rozelle
(2005) carry out a comparison of poverty when using either unit values or community market
prices for nine major foods that contribute half of the poverty line food basket in Papua New
Guinea. After trimming outliers and stripping out household-specific quality effects with the regression
approach proposed by Deaton (1997), the purged unit values are averaged by cluster and then by
region, and then used to calculate the food poverty line. The resulting food poverty line is
overstated by 11 percent, compared with its value when prices from a community market survey
are used (Table 5.1).47 In this setting it is argued that market surveys provide a good benchmark
because there is no haggling, local markets are well defined and geographically separated, and
there is not much quality variation amongst goods across the various markets. The overstatement
would have been even larger if the unit values were not purged of household-specific quality
effects. This overstatement of the food poverty line makes a significant difference to the estimated
poverty rates, raising the headcount rate from 22 percent to 28 percent even after the full range of
treatments is applied to unit values to remove the effect of outliers and adjust for quality
variations.
Table 5.1: Poverty measures with unit values and community price surveys

Cost of poverty line food   Poverty line      Headcount   Poverty gap   Poverty
basket calculated from:     (Kina per year)   index       index         severity index
--------------------------------------------------------------------------------------
Community price survey      K334              22.0        5.9           2.4
Unit values                 K385              30.0**      8.9**         3.8**
Purged unit values (a)      K370              28.0**      8.0**         3.4**

Source: Gibson and Rozelle (2005).
Note: The poverty line and poverty estimates are in terms of adult-equivalents. ** indicates that estimates differ from
those obtained with the community price survey at the 1% significance level (corrected for the effect of clustering,
sampling weights and stratification).
(a) The unit values have been purged of quality effects using a regression approach proposed by Deaton (1997).
Similar evidence of an overstatement in poverty estimates when using unit values is found by
Capéau and Dercon (2006) for Ethiopia when they compare with the results using community
surveys. The degree of overstatement is not quite as large as reported in Table 5.1 for PNG
(17 percent rather than 27 percent for the headcount index) but it is still disquietingly large.
Moreover, in the Ethiopian example, unit values also resulted in larger poverty fluctuations over
time than those coming from the market price surveys.
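The percentages quoted for Papua New Guinea are relative differences measured against the community price survey benchmark. As a check, they can be recomputed from the rounded entries in Table 5.1, so small rounding differences from the published figures are possible:

```latex
\[
\frac{370 - 334}{334} \approx 0.108,
\qquad
\frac{385 - 334}{334} \approx 0.153,
\qquad
\frac{28.0 - 22.0}{22.0} \approx 0.273
\]
```

The first and third ratios correspond to the 11 percent overstatement of the food poverty line and the 27 percent overstatement of the headcount index, both for purged unit values. The second ratio is the overstatement of the poverty line when unpurged unit values are used.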
47
For some regions the overstatement is from 16 to 20 percent. The values in column 2 of Table 5.1 are population-weighted averages of the regional poverty lines.
References
Ackland, R., Dowrick, S., and Freyens, B. (2006) Measuring global poverty: why PPP methods
matter. Mimeo Research School of Social Sciences, The Australian National University.
Attanasio, O. and Frayne, C. (2006) Do the poor pay more? Paper presented at the Eighth
BREAD Conference, Cornell University, May 2006.
Ball, A. and Fenwick, D. (2004) Relative regional consumer price levels in 2003, Economic
Trends 603: 42-51.
Beatty, T. and Larsen, E. (2005) Using Engel curves to estimate bias in the Canadian CPI as a
cost of living index. Canadian Journal of Economics 38(2): 482-499.
Beegle, K., Frankenberg, E. and Thomas, D. (1999) Measuring change in Indonesia. Labour and
Population Program Working Paper Series 99-07. Santa Monica, CA: RAND.
Bradley, R. (2005) Pitfalls of using unit values as a price measure or price index. Journal of
Economic and Social Measurement 30(1): 39-61.
Capéau, B. and Dercon, S. (2006) Prices, unit values and local measurement units in rural surveys:
an econometric approach with an application to poverty measurement in Ethiopia. Journal
of African Economies 15(2): 181-211.
Coondoo, D., Majumder, A. and Ray, R. (2004) A method of calculating regional consumer price
differentials with illustrative evidence from India. Review of Income and Wealth 50(1):
51-83.
Cox, T. and Wohlgenant, M. (1986) Prices and quality effects in cross-sectional demand analysis.
American Journal of Agricultural Economics 68(4): 908-919.
Deaton, A. (1989) Household survey data and pricing policies in developing countries. The World
Bank Economic Review 3(2): 183-210.
Deaton, A. (1997) The Analysis of Household Surveys: A Microeconometric Approach to
Development Policy. Johns Hopkins, Baltimore.
Deaton, A. (1998) Getting prices right: what should be done? Journal of Economic Perspectives
12(1): 37-46.
Deaton, A., Friedman, J., and Alatas, V. (2004) Purchasing power parity exchange rates from
household survey data: India and Indonesia. Mimeo Princeton University.
Deaton, A. and Grosh, M. (2000) Consumption. In M. Grosh and P. Glewwe (eds) Designing
Household Survey Questionnaires for Developing Countries The World Bank, Washington,
pp. 91-133.
Deaton, A. and Tarozzi, A. (2005) Prices and poverty in India. In A. Deaton and V. Kozel (eds)
The Great Indian Poverty Debate. Macmillan, New Delhi, pp. 381-411.
Filho, I. and Chamon, M. (2006) The myth of post-reform income stagnation in Brazil. Working
Paper 06/275, International Monetary Fund.
Frankenberg, E. (2000) Community and price data. In M. Grosh and P. Glewwe (eds) Designing
Household Survey Questionnaires for Developing Countries The World Bank, Washington,
pp. 315-338.
Friedman, J. and Levinsohn, J. (2002) The distributional impacts of Indonesia’s financial crisis on
household welfare: a ‘rapid response’ methodology. World Bank Economic Review 16(3):
397-423.
Gibson, J. (2000) A Poverty Profile of Cambodia, 1999. A Report to the World Bank and the
Ministry of Planning, Phnom Penh, Cambodia.
Gibson, J., and Rozelle, S., (1998) Results of the Household Survey Component of the 1996
Poverty Assessment for Papua New Guinea, mimeo Population and Human Resources
Division, The World Bank.
Gibson, J. and Rozelle, S. (2005) Prices and unit values in poverty measurement and tax reform
analysis. World Bank Economic Review 19(1): 69-97.
Gibson, J., Stillman, S. and Le, T. (2004) CPI bias and real living standards in Russia during the
transition. Working Paper 04/02 Department of Economics, University of Waikato.
Glewwe, P. (1991) Investigating the determinants of household welfare in Côte d’Ivoire. Journal
of Development Economics 35(2): 307-337.
Gluschenko, K. (2006). Biases in cross-space comparisons through cross-time price indexes: The
case of Russia. BOFIT Discussion Paper No. 9.
Hamilton, B. (2001) Using Engel’s Law to estimate CPI bias. American Economic Review
91(3): 619-630.
Hill, R. (2004) Constructing price indexes across space and time: the case of the European
Union. American Economic Review 94(5): 1379-1409.
Hill, R. (2006) Convergence or divergence: how to get the answer you want. Paper Presented at
the Joint European Economic Association and Econometric Society European Meetings
August 24-28, Vienna.
Izquierdo, M., Ley, E. and Ruiz-Castillo, J. (2003) The plutocratic gap in the CPI: evidence from
Spain. IMF Staff Papers 50(1): 136-155.
Kakwani, N. (1993) Poverty and economic growth with application to Côte d’Ivoire. Review of
Income and Wealth 39(2): 121-139.
Lanjouw, J. and Lanjouw, P. (2001) How to compare apples and oranges: poverty measurement
based on different definitions of consumption. Review of Income and Wealth 47(1):
25-42.
Ley, E. (2005) Whose inflation? A characterization of the CPI plutocratic gap. Oxford Economic
Papers 57(3): 634-646.
Meng, X., Gregory, R., and Wang, Y. (2005) Poverty, inequality, and growth in urban China,
1986-2000. Journal of Comparative Economics 33(4): 710-729.
Muller, C. (2002) Prices and living standards: evidence from Rwanda. Journal of Development
Economics 68(1): 187-203.
Olivia, S. and Gibson, J. (2005) Unit value biases in price elasticities of demand for meat in
Indonesia. Australasian Agribusiness Review 13(12): 1-17.
Pradhan, M. (2001) Welfare analysis with a proxy consumption measure: evidence from a
repeated experiment in Indonesia. Mimeo Cornell Food and Nutrition Program.
Prais, S. (1958). Whose cost of living? Review of Economic Studies 26(1): 126-134.
Prais, S. and Houthakker, H. (1955) The Analysis of Family Budgets. New York: Cambridge
University Press.
Quinn, J. (2004) USAID Prices Project in Timor-Leste: End of Project Report, mimeo Australian
Bureau of Statistics, Canberra.
Rao, V. (2000) Price heterogeneity and “real” inequality: a case-study of prices and poverty in
rural South India. Review of Income and Wealth 46(2): 201-211.
Ravallion, M. (1988). Expected poverty under risk-induced welfare variability. Economic
Journal 98(393): 1171-1182.
Ravallion, M. (1992) Poverty comparisons: a guide to concepts and methods. Living Standards
Measurement Study Working Paper No. 88, The World Bank.
Saunders, C. and Grootaert, C. (1980) Reflections on the LSMS group meeting. Living Standards
Measurement Study Working Paper No. 10, The World Bank.
Scott, C. and Amenuvegbe, B. (1990) Effect of recall duration on reporting of household
expenditures: an experimental study in Ghana. SDA Working Paper 8905, World Bank.
Senauer, B., Sahn, D., and Alderman, H. (1986) The effect of the value of time on food
consumption patterns in developing countries: evidence from Sri Lanka. American Journal
of Agricultural Economics 68(4): 920-927.
Silver, M. and Webb, B. (2000) The measurement of inflation: aggregation at the basic level.
Journal of Economic and Social Measurement 26(1): 1-15.
Son, H. and Kakwani, N. (2006) Measuring the impact of price changes on poverty. Working
Paper No. 33, International Poverty Centre, UNDP Brazil.
Wood, D., and Knight, J. (1985) The collection of price data for the measurement of living
standards. Living Standards Measurement Study Working Paper No. 21, The World
Bank.
Appendix A
Stata Code for
*** Gets unit values and purged unit values for poverty line calculations
*** Used by Gibson and Rozelle (2005), illustrated just for example of rice
*** Unit values are averaged by cluster and then by region
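version 7.0
#delimit ;
drop _all;
set matsize 800;
set more 1;
capture log close;
log using povline, replace;
* Step 1: build a cluster-level lookup file with one row per cluster and its region;
use uv_data;
egen cluster=group(prov cd cu);
collapse prov cd cu region, by(cluster);
sort cluster;
save cluster.dta, replace;
* Step 2: build a prediction file holding the sample medians of the household covariates;
* The single row of medians is copied 120 times, presumably once per cluster;
use uv_data, clear;
collapse (median) rm* rf* lnexp lnhhs femhead wagebis;
expand 120;
gen cluster=_n;
sort cluster;
merge cluster using cluster;
qui tab cluster, gen(clusd);
save means.dta, replace;
* Step 3: regress the rice unit value on household covariates and cluster dummies;
use uv_data, clear;
egen cluster=group(prov cd cu);
qui tab cluster, gen(clusd);
gen uvric= rice_uv3 ;
reg uvric rm* rf* lnexp lnhhs femhead wagebis clusd*;
* Step 4: predict at the median covariates so that only the cluster effects vary;
use means.dta, clear;
predict purge_ric, xb ;
save means.dta, replace;
* Step 5: average the purged cluster-level unit values by region;
table region, c(mean purge_ric );
log close;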