Further Updating Poverty
Mapping in Albania
Gianni Betti*, Andrew Dabalen**,
Celine Ferrè** and Laura Neri*
* University of Siena, Italy, ** The World Bank, Washington, USA
Poverty and Social Inclusion in the Western Balkans
WBalkans 2010, Brussels, December 14-15, 2010
Scope of the presentation
- Introduction on basic concepts of
poverty mapping
- Concepts of updating poverty mapping
without new census data
- Application to Albania: 2002-2005-2008
- Results for 2008 and comparisons with
2005 and 2002 are reported only in the
paper because of time restrictions
THE METHODOLOGY
Combines Census and Survey Data to
produce disaggregated maps of poverty and
inequality (Elbers, Lanjouw and Lanjouw, 2003,
Econometrica).
THE APPLICATION HERE PROPOSED
Census (2001) and LSMS (2002) in Albania,
firstly updated to LSMS (2005), and then
updated to LSMS (2008).
What is “Poverty Mapping” ?
Citing the paper of Elbers, Lanjouw and
Lanjouw (ELL, 2002 and 2003)
Poverty and inequality maps are spatial
descriptions of the distribution of poverty
and inequality and are most useful to policymakers and researchers when they are
finely disaggregated, i.e. when they
represent small geographic units, such as
cities, municipalities, regions or other
administrative partitions of a country.
Figure 3. Head Count Ratio by Municipality, 2002.
BASIC IDEA OF “POVERTY MAPPING”
Estimate a linear regression model with local
variance components on the LSMS data (the
dependent variable is a monetary variable):
ESTIMATION (Stage 1)
The estimated distribution of the dependent variable is used
to generate the distribution for any subpopulation in
the Census, conditional on the observed data:
IMPUTATION or SIMULATION (Stage 2)
The variables used in the Census data and in the
LSMS should be comparable (i.e. same categories, etc.).
A so-called Stage 0 is therefore needed before estimating the
linear regression model in Stage 1.
Stage 0: are the LSMS and the Census
comparable?
A full analysis of the two data sources is needed to
construct common variables. In Albania we
have identified 38 common variables:
- Housing and dwelling conditions and presence
of durable goods (23)
- Household head characteristics (8)
- Household socio-demographic characteristics
(7)
Missing values in the LSMS have been imputed
(IVEware, Raghunathan et al. 2001).
The Census and LSMS distributions of the common
variables should be compared.
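One simple way to carry out the comparison in Stage 0 is to check, for each common variable, whether the weighted LSMS mean is close to the Census mean. The sketch below is only an illustration under stated assumptions: the function name `compare_means`, the `weight` field and the variable name `has_tv` are hypothetical and do not come from the Albanian data sets.

```python
def compare_means(census_rows, survey_rows, variables, weight_key="weight"):
    """Compare Census means with weighted survey means.

    census_rows / survey_rows are lists of dicts (one per household).
    Returns {variable: (census_mean, survey_mean, difference)}.
    """
    total_weight = sum(row[weight_key] for row in survey_rows)
    report = {}
    for var in variables:
        # Census: every household counts once
        census_mean = sum(row[var] for row in census_rows) / len(census_rows)
        # Survey: households are weighted by their sampling weight
        survey_mean = sum(row[var] * row[weight_key] for row in survey_rows) / total_weight
        report[var] = (census_mean, survey_mean, survey_mean - census_mean)
    return report


# Toy data (hypothetical): a 0/1 indicator for owning a TV set
census = [{"has_tv": 1}, {"has_tv": 0}, {"has_tv": 1}, {"has_tv": 1}]
survey = [{"has_tv": 1, "weight": 3.0}, {"has_tv": 0, "weight": 1.0}]
report = compare_means(census, survey, ["has_tv"])
```

Variables with a large difference would be candidates for recoding or for exclusion from the set of common variables.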
Stage 1: Estimation
The model: it is a linear approximation to the
conditional distribution of the logarithm of
consumption expenditure (or income) of
household h in cluster c,
$$ \ln y_{ch} = E[\ln y_{ch} \mid x_{ch}^{T}] + u_{ch} = x_{ch}^{T}\beta + u_{ch} \tag{1} $$
The error component is specified to allow for
a within cluster correlation in disturbances.
IMPORTANT: several models are estimated,
according to the strata in the LSMS
survey.
Stage 2: Simulation
The estimates obtained are applied to the
Census data to simulate the expenditure for
each household in the Census.
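A certain number (e.g. 100) of simulations are
conducted.
The simulated values are:
$$ \hat{y}_{ch} = \exp\bigl(x_{ch}^{T}\tilde{\beta} + \tilde{\eta}_{c} + \tilde{\varepsilon}_{ch}\bigr) \tag{4} $$
The beta coefficients, $\tilde{\beta}$, are drawn from
a multivariate normal distribution with mean $\hat{\beta}$
and variance-covariance matrix equal to the
one associated with $\hat{\beta}$.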
For the residuals, no specific
distributional form is assumed:
the residuals are drawn directly from the
estimated residuals.
For each of the simulated consumption
expenditure distributions, a set of poverty
and inequality measures is calculated.
- Mean over all the simulations: point
estimates
- Standard deviation over all the simulations:
bootstrap standard errors.
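The simulation stage can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and toy inputs are hypothetical, only the head count ratio is computed, and the coefficients are drawn independently from normal distributions, which is a simplification of the multivariate normal draw described above (it ignores the covariances between coefficients).

```python
import math
import random
import statistics


def headcount(expenditures, poverty_line):
    """Share of households with expenditure below the poverty line."""
    return sum(1 for y in expenditures if y < poverty_line) / len(expenditures)


def simulate_headcount(census, beta_hat, beta_se, cluster_resid,
                       household_resid, poverty_line, n_sim=100, seed=0):
    """census is a list of (cluster_id, x), x being the covariate list
    (including the constant). Returns (point_estimate, standard_error)."""
    rng = random.Random(seed)
    clusters = sorted({c for c, _ in census})
    results = []
    for _ in range(n_sim):
        # ASSUMPTION: independent coefficient draws (diagonal covariance)
        beta = [rng.gauss(b, s) for b, s in zip(beta_hat, beta_se)]
        # One cluster effect per cluster, resampled from estimated residuals
        eta = {c: rng.choice(cluster_resid) for c in clusters}
        simulated = []
        for c, x in census:
            xb = sum(b * v for b, v in zip(beta, x))
            eps = rng.choice(household_resid)  # household-level residual
            simulated.append(math.exp(xb + eta[c] + eps))
        results.append(headcount(simulated, poverty_line))
    # Mean over simulations = point estimate; standard deviation = standard error
    return statistics.mean(results), statistics.stdev(results)


# Toy census (hypothetical): two clusters, constant plus one covariate
toy_census = [(1, [1.0, 0.0]), (1, [1.0, 1.0]), (2, [1.0, 2.0]), (2, [1.0, 3.0])]
point, se = simulate_headcount(toy_census, [0.0, 1.0], [0.1, 0.1],
                               [-0.2, 0.0, 0.2], [-0.3, 0.0, 0.3],
                               poverty_line=5.0)
```

The same loop can return any other poverty or inequality measure by replacing `headcount` with the corresponding function.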
Updating the 2002 Poverty Mapping
In 2005, Dabalen and Ferrè proposed
updating the poverty mapping in two phases:
First Phase: construct the so-called “counterfactual
population distribution”: this is the distribution that
would have prevailed in 2002 if the parameters of
consumption and the distribution of observable
and unobserved covariates were as they were in
2005 (now 2008);
Second Phase: apply the ELL methodology
described in the previous slides using the
“counterfactual population distribution” [CM].
CM – logic behind the model - 1
In this case, Dabalen and Ferrè (2005)
proposed to construct a counterfactual
consumption distribution of the old
household survey, using information from
both the old and new household survey
and match the corresponding estimates
with the old census data, following the
methodology proposed by Lemieux
(2002).
CM – logic behind the model - 2
To construct the counterfactual wealth
distribution, first consider a consumption
model using the new survey:
$$ \ln(y_{05,i}) = \beta_{05} X_{05,i} + \varepsilon_{05,i} \tag{5} $$
where $y_{05,i}$ denotes consumption in year 2005, $i$
indexes the household, $\beta_{05}$ is a parameter (that
captures the “returns” to or “price” of covariates
in 2005), $X_{05,i}$ is a vector of covariates and $\varepsilon_{05,i}$ is
the unobserved component of consumption.
CM – logic behind the model - 3
Note that using this new survey, without
additional adjustment, and applying the ELL
estimator would be problematic because the
returns to covariates, the parameter $\beta$, may have
changed between 2002 and 2005. In addition,
the profile of the population – that is covariates
such as education levels, age composition, and
so on – may also have changed. Finally, the
returns to unobserved covariates may also have
changed. To recreate a consumption distribution
that resembles consumption of 2002, CM would
have to account for these changes. Therefore,
the counterfactual consumption distribution is
constructed in three steps.
CM – First step - 1
The first step is to create a consumption
distribution that would have prevailed in 2002 if
the parameters were as in 2005. That is,
$$ \ln(y^{p}_{02,i}) = \hat{\beta}_{05} X_{02,i} \tag{6} $$
Equation (6) accounts for changes in the
parameters of covariates, by using the
estimated parameters from the new survey to
estimate consumption distribution in the old
survey.
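As a small illustration of equation (6), the predicted log consumption of each 2002 household is simply the 2002 covariate vector multiplied by the coefficients estimated on the 2005 survey. The function name and the numbers below are hypothetical.

```python
def predicted_log_consumption(beta_05, covariates_02):
    """Apply coefficients estimated on the new survey (2005)
    to the covariates of the old survey (2002), as in equation (6)."""
    return [sum(b * x for b, x in zip(beta_05, row)) for row in covariates_02]


# Toy example (hypothetical): constant plus one covariate, two households
beta_05 = [1.0, 0.5]
covariates_02 = [[1.0, 2.0], [1.0, 0.0]]
log_y_p = predicted_log_consumption(beta_05, covariates_02)
```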
CM – First step - 2
However, in addition to these parameters,
levels of covariates may have changed
because, for instance, the population is now
more educated, etc…
CM – Second step - 1
Instead, the CM methodology creates a score that
reduces the dimension of the data, by stacking the new
and old surveys, and then by running a probit model:
$$ P_{it} = \mathrm{Prob}(\text{survey} = 2005 \mid Z_{it}, M_{it}) = \beta_{z} Z_{it} + \beta_{m} M_{it} \tag{7} $$
In principle, a large set of observable household level
characteristics, $Z_{it}$, can be included, and also the
migration status of the household, $M_{it}$, or any suitable
variables that capture the scale of migration, which is
of crucial concern when trying to update poverty maps.
CM – Second step - 2
Equation (7) allows us to obtain a propensity
score, the predicted probability of being in
period $t \in \{2002, 2005\}$, conditional on the
observable characteristics.
$$ \psi_{it} = \frac{1 - P_{it}}{P_{it}} \times \frac{P_{t}}{1 - P_{t}} \tag{8} $$
where $P_{t}$ is the unconditional probability that
an observation belongs to period t, or the share
of year 2005 observations in total observations
(that is, both years).
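Once the propensity scores are available, the reweighting factor of equation (8) is a simple ratio of odds. The sketch below assumes that the propensity scores have already been estimated (the probit itself is not fitted here); the function name and the toy values are hypothetical.

```python
def reweighting_factors(propensity, share_2005):
    """Reweighting factor of equation (8):
    (1 - P_it) / P_it * P_t / (1 - P_t),
    with P_it the propensity score of each household and
    P_t the share of 2005 observations in the stacked sample."""
    if not 0.0 < share_2005 < 1.0:
        raise ValueError("share_2005 must lie strictly between 0 and 1")
    unconditional_odds = share_2005 / (1.0 - share_2005)
    factors = []
    for p in propensity:
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        factors.append((1.0 - p) / p * unconditional_odds)
    return factors


# Toy example (hypothetical): equal-sized surveys, two households
factors = reweighting_factors([0.5, 0.25], share_2005=0.5)
```

A household whose propensity score equals the unconditional share receives a factor of one; households whose characteristics are less typical of 2005 receive a larger factor.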
CM – Second step - 3
In this framework, accounting for
changes in the distribution of observable
characteristics is equivalent to
reweighting the consumption distribution
estimated in equation (6), so that the CM
model becomes:
$$ \ln(y^{r}_{02,i}) = \psi_{02,i}\,\ln(y^{p}_{02,i}) \tag{9} $$
CM – Third step - 1
The only step remaining is to add a measure of
the unobserved component of consumption. If
the dispersion in unobserved consumption is
due to random events that are unrelated to
systematic differences across households, then
there would be nothing more to say about the
error term. However, one reason to add a
measure of the unobserved consumption is
that the residual is unlikely to be just a random
component of consumption.
CM – Third step - 2
CM first estimates a consumption model for the
2002 data, and ranks all the households on the
basis of the residual distribution for that year.
Then CM assigns to each household in year
2002, the value of ranked residual from the
empirical distribution of residuals in year 2005
(equation (5)) that corresponds to the year
2002 rank. We now have the counterfactual
consumption, the consumption that would have
been observed in 2002, if the parameters, the
distribution of covariates and the unmeasured
determinants of consumption are as in 2005.
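The rank-based assignment of residuals can be sketched as follows. This is an illustration only: the function name is hypothetical, and mapping ranks through relative positions (so that surveys of different sizes can be matched) is an assumption made here, not a detail given in the presentation.

```python
def rank_matched_residuals(resid_2002, resid_2005):
    """For each 2002 household, return the 2005 residual that occupies
    the same relative rank in the 2005 empirical distribution."""
    n_old, n_new = len(resid_2002), len(resid_2005)
    sorted_new = sorted(resid_2005)
    # Indices of the 2002 households, ordered from lowest to highest residual
    order = sorted(range(n_old), key=lambda i: resid_2002[i])
    assigned = [0.0] * n_old
    for rank, i in enumerate(order):
        # ASSUMPTION: relative rank in (0, 1) mapped to a 2005 position
        quantile = (rank + 0.5) / n_old
        j = min(int(quantile * n_new), n_new - 1)
        assigned[i] = sorted_new[j]
    return assigned


# Toy example (hypothetical residuals)
matched = rank_matched_residuals([0.3, -0.1, 0.0], [1.0, -2.0, 0.5])
```

In the toy example the household with the largest 2002 residual receives the largest 2005 residual, and so on down the ranking.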
CM – Third step - 3
From equations (5) and (9), this
counterfactual wealth distribution can be
rewritten as:
$$ \ln(y^{c}_{02,i}) = \psi_{02,i}\bigl(\ln(y^{p}_{02,i}) + \varepsilon^{r}_{05,i}\bigr) = \psi_{02,i}\bigl(\hat{\beta}_{05} X_{02,i} + \varepsilon^{r}_{05,i}\bigr) \tag{10} $$
where $\varepsilon^{r}_{05,i}$
denotes the value of the ranked
residual in 2005 assigned to a household
with the same residual rank in year 2002.
Here we have further updated the
poverty mapping using the new
LSMS conducted in 2008.
Clearly, the counterfactual
distribution corresponding to
2008 is less accurate than
the one for 2005.
However, the results are still good,
and the errors are still under control.
IMPLEMENTATION OF THE POVERTY
MAPPING IN ALBANIA
THE DATA:
Population and Housing Census (2001)
- Reference Time: 31 March 2001
- Number of Households: 726,895
- Number of Persons: 3,069,275
- Collected Information: Building,
Dwellings, Household, Individuals.
Living Standard Measurement Study
(LSMS, 2002, 2005 & 2008)
- Reference Time: Spring 2002, 2005
and 2008
- Sampling Frame: 4 Strata, 450 PSUs
(corresponding to the EAs in the
Census), 8 Households per PSU
- Number of Households: 3599
- Collected Information: Household,
Food Consumption, Diary,
Community, Price
POVERTY MEASURES:
The procedure for estimating the poverty
measures has been applied for the whole of
Albania and disaggregated at seven levels:
a) Rural – urban level;
b) The four strata used in sampling the
LSMS;
c) The six strata for which the linear
regression models have been estimated;
d) The 12 Prefectures (or Counties);
e) The 36 Districts;
f) The 374 Communes/Municipalities;
g) The 11 Mini-municipalities in which the
city of Tirana is divided.
Future work:
New Poverty Mapping using the fresh
2011 Census and the fresh 2011 LSMS
THANK YOU FOR YOUR ATTENTION!!