new proxy means test (pmt) models: improving targeting of the poor

12th National Convention on Statistics (NCS)
EDSA Shangri-La Hotel
October 1-2, 2013
NEW PROXY MEANS TEST (PMT) MODELS:
IMPROVING TARGETING OF THE POOR FOR SOCIAL PROTECTION
by
Dennis S. Mapa
Manuel Leonard F. Albis
For additional information, please contact:

Author’s name : Dennis S. Mapa
Designation : Associate Professor
Affiliation : School of Statistics, University of the Philippines, Diliman
Address : School of Statistics Building, Ramon Magsaysay Avenue, U.P. Diliman, Quezon City
Tel. no. : (02) 9280881
E-mail : [email protected]

Author’s name : Manuel Leonard F. Albis
Designation : Assistant Professor
Affiliation : School of Statistics, University of the Philippines, Diliman
Address : School of Statistics Building, Ramon Magsaysay Avenue, U.P. Diliman, Quezon City
Tel. no. : (02) 9280881
NEW PROXY MEANS TEST (PMT) MODELS:
IMPROVING TARGETING OF THE POOR FOR SOCIAL PROTECTION1
by
Dr. Dennis S. Mapa2 and Prof. Manuel Leonard F. Albis3
The National Household Targeting System for Poverty Reduction (NHTS-PR) of the Department
of Social Welfare and Development (DSWD) is a system for identifying poor households. The
system guarantees the generation and establishment of a socio-economic database of poor
households. The NHTS-PR uses the Proxy Means Test (PMT) as the methodology for
estimating the per capita income of households based on a set of verifiable indicators that are
difficult to manipulate.
The PMT model is a statistical method used to predict the income of a household based on
observable characteristics that correlate with income but are easier to measure. In lieu of
the household's actual per capita income, its predicted per capita income from the PMT
model is compared with the official poverty threshold computed by the National
Statistical Coordination Board (NSCB). The household is classified as poor when the predicted
per capita income from the PMT model is less than or equal to the official poverty threshold;
otherwise, the household is classified as non-poor. Classification as poor under the
PMT model is then used as one of the eligibility criteria for the Conditional Cash Transfer
(CCT) program and other government poverty reduction programs.
While the observable characteristics used in the PMT model are relatively easier to collect,
these are also less than perfect correlates of income. Thus, the PMT model risks misclassifying
poor households as non-poor (exclusion error) and non-poor households as poor (inclusion
error). Reducing the exclusion and inclusion errors and in the process increasing the number of
households correctly classified (as poor or non-poor) is one of the primary objectives of the PMT
model.
In preparation for the reassessment of the NHTS-PR database for 2013-2014, new and
better PMT models are developed for the DSWD. The innovations introduced in these new PMT
models, with the objective of reducing the exclusion and inclusion error rates, are: the
addition of variables that measure community characteristics deemed useful in identifying
poor and non-poor households; the use of restricted regression in estimating the model’s
coefficients; the use of the lower limit of a prediction interval, instead of the point estimate, in
predicting the household’s per capita income; and the use of two models (a least-squares regression
model and a logit model) to identify poor and non-poor households. Within-sample evaluation,
using data from the 2009 Family Income and Expenditure Survey (FIES), shows that the
new PMT models have lower exclusion and inclusion errors than the current PMT
model.
1
The paper benefitted from discussions with and inputs from Joseph Capuno of the UP School of
Economics and from Sharon Piza, Rashiel Velarde, Nazmul Chaudhury and Shanna Elaine Rogan of
the World Bank’s Social Protection Unit. All errors and omissions remain the authors’ responsibility.
2
Associate Professor and Director for Research, School of Statistics and Affiliate Associate Professor,
School of Economics, University of the Philippines, Diliman and Consultant, DSWD. Email:
[email protected]
3
Assistant Professor, School of Statistics, University of the Philippines Diliman and Consultant, DSWD.
I. Introduction
The National Household Targeting System for Poverty Reduction (NHTS-PR) of the Department
of Social Welfare and Development (DSWD) is a system for identifying who and where the poor
households are. The system guarantees the generation and establishment of a socio-economic
database of poor households. The targeting system employed by the NHTS-PR is similar to
those which have earned success in Latin American countries in terms of effectively distributing
social assistance and social protection programs to the poorest of the poor. The NHTS-PR
uses the Proxy Means Test (PMT) as the methodology for estimating the per capita income of
households based on a set of verifiable indicators that are not easy to manipulate.
The PMT model is a statistical method used to predict the income of a household based on
observable characteristics that correlate with income but are easier to measure. In lieu of
the household's actual per capita income, its predicted per capita income from the PMT
model is compared with the official poverty threshold computed by the National
Statistical Coordination Board (NSCB).4 The household is classified as poor when the predicted
per capita income from the PMT model is less than or equal to the official poverty threshold;
otherwise, the household is classified as non-poor. Classification as poor under the
PMT model is then used as one of the eligibility criteria for the Conditional Cash Transfer (CCT)
program and other government poverty reduction programs.
While the observable characteristics used in the PMT model are relatively easier to collect, they
are also less than perfect correlates of income. Thus, the PMT model risks misclassifying poor
households as non-poor (exclusion error) and non-poor households as poor (inclusion error).
Reducing the exclusion and inclusion errors and in the process increasing the number of
households correctly classified (as poor or non-poor) is one of the primary objectives of the PMT
model.
In preparation for the reassessment of the database of the NHTS-PR for 2013-2014, the DSWD
is reviewing the methodologies associated with the PMT models, in particular to find alternative
(additional) variables that are useful in identifying the poor and non-poor households, as well as
exploring alternative estimation procedures that will help reduce the exclusion and inclusion
error rates.
4
The annual per capita poverty threshold in 2009 is estimated at Php 16,841 for the entire country. Using this value,
the NSCB estimated poverty incidence to be about 20.9% of the families. This is equivalent to about 26.5% of the
total population. For 2012, the 1st semester (6 months) per capita poverty threshold is estimated at Php 9,385.
Poverty incidence among families is about 22.3%, while poverty incidence among population is estimated at
27.9%.
In identifying the appropriate PMT model, the focus is on the two misclassification errors
produced by the model: exclusion and inclusion errors. The exclusion error rate is defined as the
number of actual poor households classified as non-poor by the models divided by the total
number of actual poor households, while the inclusion error rate is the number of actual
non-poor households classified as poor divided by the total number of households classified as poor
by the models. There is always a trade-off between the two errors: decreasing the exclusion
error rate tends to increase the inclusion error rate and vice versa, all things being the same.
However, researchers and policy makers usually favour a lower exclusion error
rate. Exclusion errors are more costly than inclusion errors because of the costs of re-certifying
poor households (Araujo and Carraro; 2011).
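The two definitions above differ in their denominators, which is easy to overlook. The following is a minimal sketch of the computation, assuming each household is represented by a pair of booleans (actual status, predicted status); the function name `error_rates` and the toy data are hypothetical and not taken from the NHTS-PR system.

```python
def error_rates(actual_poor, predicted_poor):
    """Return (exclusion_rate, inclusion_rate) as fractions.

    actual_poor, predicted_poor: equal-length sequences of booleans,
    one entry per household.
    """
    if len(actual_poor) != len(predicted_poor):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(actual_poor, predicted_poor))
    n_actual_poor = sum(1 for a, _ in pairs if a)
    n_predicted_poor = sum(1 for _, p in pairs if p)
    # Exclusion: actual poor households that the model tags as non-poor.
    excluded = sum(1 for a, p in pairs if a and not p)
    # Inclusion: actual non-poor households that the model tags as poor.
    included = sum(1 for a, p in pairs if p and not a)
    exclusion = excluded / n_actual_poor if n_actual_poor else float("nan")
    inclusion = included / n_predicted_poor if n_predicted_poor else float("nan")
    return exclusion, inclusion


# Hypothetical toy data: 4 actual poor and 6 actual non-poor households.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
exclusion_rate, inclusion_rate = error_rates(actual, predicted)
```

In this toy case one of the four actual poor households is missed, and one of the four households tagged as poor is actually non-poor, so both rates are 25% even though they are computed over different groups.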
II. Current Proxy Means Test (PMT) Model
The standard PMT model seeks to predict per capita income at the household level. The
dependent variable to be estimated is per capita income instead of per capita consumption
because in the Philippines the official poverty statistics produced by the NSCB are based on
income. In theory, income would be a better indicator of the purchasing capacity of the household.
However, it has been observed that income has higher volatility than consumption
(expenditure). Moreover, in countries with a large informal sector such as the
Philippines, the underreporting of income is an issue (Fernandez, 2012). In other countries, such
as Indonesia (Alatas et al.; 2010), Mongolia (Araujo and Carraro; 2011) and Sri Lanka
(Narayan and Yoshida; 2005), per capita consumption or expenditure is used in the PMT
models instead of per capita income.
The DSWD’s current PMT model uses the natural logarithm of the household’s per capita
income as the dependent variable and the following as predictor variables: (i) ownership of appliances and/or assets, (ii)
educational attainment (proportion of family members), (iii) family/household composition, (iv)
employment/kind of business, (v) housing materials and access to basic services or amenities,
and (vi) location (urban/rural and regional classification). The PMT model uses the 10 major
occupational groups based on the Philippine Standard Occupational Classification (PSOC).
The proxy (predictor) variables for per capita income are identified based on data from the
Family Income and Expenditure Survey (FIES) for 2003 and the Labor Force Survey (LFS) for
2003, both collected and reported by the National Statistics Office (NSO). For 2003, the two
surveys can be merged, producing a sample of 42,094 households. The variables selected
were those that can be good proxies of income (highly correlated with it), such
as housing conditions, access to basic services, ownership of assets, family composition,
education variables and specific variables related to the Conditional Cash Transfer (CCT)
operation (Fernandez, 2007). Using the 2003 poverty threshold, this PMT model has an
exclusion error rate of 30% and an inclusion error rate of 24% within sample for 2003.
Moreover, about 89% of the 2003 FIES households were classified correctly as either poor or
non-poor.
Fernandez (2012) performed simulations by applying the coefficients of the 2003 PMT model
(based on the 2003 LFS and FIES data) to the 2009 FIES and LFS data sets. The predicted per
capita income is then compared with the NSCB’s 2009 provincial poverty thresholds. The
simulations show an exclusion error rate of 18% and an inclusion error rate of 45%. Moreover, the
model’s overall prediction rate is 82%, with only 55% of the poor households correctly
predicted.
Attempts have been made to improve the existing PMT model in view of its
relatively high exclusion and inclusion error rates. A technical team from the National Household
Targeting Office (NHTO) was assigned to improve the technical aspects of the PMT model. The
technical team started with the existing PMT model used in developing the initial NHTS-PR
database in 2009 as the baseline model and looked at ways of enhancing its
predictive accuracy. The expectation is that by improving predictive accuracy,
the exclusion and inclusion error rates will be reduced to acceptable levels. Moreover, the
technical team of the NHTO also considered sub-national and cluster PMT models as
alternative models. The alternative models also used the logarithm of per capita income of the
household as the dependent variable and basically the same set of explanatory variables from
the FIES and LFS, modifying only the major occupational groups based on the PSOC. The
alternative models used either the 33 PSOC sub-major occupational groups or the 17 major
industry groups of the Philippine Standard Industrial Classification (PSIC). A summary of the
error rates is presented in table 1 for the best three models, using within-sample validation
on the 38,400 households in the 2009 FIES. A projection for the total households in the
country in 2009, using the sampling weights, is also reported. These models still carry
relatively high exclusion and inclusion error rates. The figures in table 1 show that exclusion
error rates range from 36% to 37%, while inclusion error rates range from 24% to 26%. Using the
alternative PMT models will result in a large number of households (about 2.2 million) being
misclassified (World Bank Technical Note 2012).
The NHTO technical team also considered sub-national and cluster PMT models in the hope of
lowering the exclusion and inclusion error rates. Several alternative PMT models are considered
for urban and rural households and for households residing in Luzon, Visayas, Mindanao and
the National Capital Region (NCR). The summary of the exclusion and inclusion rates, using
within-sample validation for the households in the 2009 FIES, is provided in table 2 below. The
results show these sub-national and cluster models are no better than the national models in
terms of the error rates. In particular, the exclusion error rates from some of these PMT models
(e.g. models for the NCR, Luzon and Urban households) are larger compared to the national
model.
Table 1. Exclusion and Inclusion Error Rates of Alternative (National) PMT Models

Based on FIES 2009 Sample (38,400 Households)
Model   % Exclusion   % Inclusion   # Misclassified   # Correctly Classified   Total        % Correctly Classified
112     36.36         25.25         5,023             33,377                   38,400       86.92
233     36.80         25.28         5,051             33,349                   38,400       86.85
317     36.64         24.82         4,997             33,403                   38,400       86.99

With Weights Applied (18,451,414 Households)
Model   % Exclusion   % Inclusion   # Misclassified   # Correctly Classified   Total        % Correctly Classified
112     36.86         25.56         2,243,281         16,208,133               18,451,414   87.84
233     37.33         25.55         2,254,603         16,196,811               18,451,414   87.78
317     37.20         25.06         2,230,063         16,221,351               18,451,414   87.91
Source: NHTO, May 2012
Table 2. Exclusion and Inclusion Error Rates of Sub-National/Clustered PMT Models

Option            Model      R2       Misclassified HHs    Correctly Classified HHs   Total No.   Exclusion   Inclusion
                                      No.       %          No.        %               of HHs      Rate        Rate
Urbanity Models   Urban      0.7642   1,321     7.62       16,014     92.38           17,335      49.58       27.90
                  Rural      0.703    3,627     17.22      17,438     82.78           21,065      31.56       24.08
                  Overall    -        4,948     12.89      33,452     87.11           38,400      -           -
Cluster Models    NCR        0.7694   104       2.43       4,181      97.57           4,285       78.95       36.84
                  Luzon      0.7351   1,912     11.71      14,415     88.29           16,327      42.62       27.02
                  Visayas    0.7579   1,086     15.23      6,044      84.77           7,130       31.42       23.78
                  Mindanao   0.7305   1,866     17.51      8,792      82.49           10,658      29.99       24.29
Current Models    Urban      0.66     2,390     13.79      14,945     86.21           17,335      23.24       59.67
                  Rural      0.697    4,740     22.5       16,325     77.5            21,065      16.28       38.35
                  Overall    -        7,130     18.57      31,270     81.43           38,400      -           -
Source: NHTO, August 2012
III. Proposed Modifications/Improvements to the Current PMT Model
The PMT model in its current state has large exclusion and inclusion errors within sample (using the
2003 and 2009 FIES and LFS databases). The exclusion error rate is about 30% while the
inclusion error rate is about 24%. These error rates produce high leakages. This paper
identifies three (3) major modifications that can be made to improve the predictive power of the
PMT models in identifying poor and non-poor households. The first is to include additional
explanatory variables in the regression model to increase the model's goodness of fit and
thereby its predictive power within the sample data. Studies, particularly Mapa,
Balisacan and Briones (2008), Balisacan (2005) and Balisacan and Pernia (2003), suggest that
community (or Barangay) characteristics such as the presence of infrastructure (road network,
electricity, telephone, and water system) and the presence and number of business
establishments in the Barangay are factors that explain the households’ per capita income in
that Barangay. Incorporating these variables into the PMT models will likely increase the models’
goodness of fit and the likelihood of having lower errors within sample. These
Barangay characteristic variables are generated from the Census of Population and Housing
(CPH). For this particular undertaking, the data from the 2007 CPH were incorporated in the
models.5
The second proposed improvement to the current PMT model is restricting the least-squares
estimation to the poorest 40% of the households in the sample, instead of using all
households. The idea behind this proposal is that in least-squares estimation, the estimated
coefficients are likely to be BIASED and INCONSISTENT because of problems associated with
having endogenous variables in the model (due to omitted variables, reverse causality and
measurement error). The question is how the BIASED estimates affect the predicted per capita
income of the households. The regression results showed that these BIASED estimates tend to
increase the predicted per capita income, all things being the same, and thus tend to misclassify
a poor household as non-poor, increasing the model’s exclusion error rate. In practice, there is a
high cost if a poor household is misclassified as non-poor. While there is a review mechanism
for the CCT program, a poor household misclassified as non-poor will have a hard time
arguing its case. For one, there is a cost of going through the review process (such as
submission of documents and transportation expenses). The costs associated with the review
create a high barrier for a poor household incorrectly classified as non-poor, which will most likely
accept its fate (resulting from errors in the model). This defeats the main objective of the program,
which is to help poor households. Estimating a restricted model, using only the poorest 40% of the
households, creates estimates that are biased in favour of identifying poor households. Simulations
showed that exclusion error rates are lower by about 15 to 20 percentage points when using
the poorest 40% of the households in the estimation compared to using the entire data.6
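The restricted estimation described above can be sketched in a few lines. This is only an illustration under stated assumptions: a single hypothetical predictor (`asset`) stands in for the full set of proxy variables, the data are synthetic, and the poorest 40% are selected by ranking households on actual (log) per capita income. The helper names `fit_line` and `poorest_share` are made up for this sketch and are not the authors' code.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def poorest_share(x, y, share=0.40):
    """Keep the households whose actual income y is in the lowest `share`."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    keep = order[: int(round(share * len(y)))]
    return [x[i] for i in keep], [y[i] for i in keep]


# Synthetic data (assumption): one asset score and log per capita income.
rng = random.Random(0)
asset = [rng.uniform(0, 10) for _ in range(500)]
log_income = [9.0 + 0.15 * a + rng.gauss(0, 0.5) for a in asset]

full_fit = fit_line(asset, log_income)            # all households
x40, y40 = poorest_share(asset, log_income)       # poorest 40% only
restricted_fit = fit_line(x40, y40)
```

Comparing `full_fit` with `restricted_fit` shows how the coefficients shift when only the lower part of the income distribution is used; whether this lowers the exclusion error in a given data set has to be checked empirically, as the simulations reported in the text do.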
The third proposed modification to the current PMT model is in the process of predicting the
household’s per capita income. The existing model uses the point estimate, $\hat{Y} = X\hat{\beta}$, to
generate the per capita income of the households. The predicted per capita income is then
compared with the per capita poverty threshold reported by the NSCB. If the predicted per
capita income is less than or equal to the per capita poverty threshold, the household is
classified as poor; otherwise, the household is non-poor. The problem is
that the point estimate is subject to error and will generally differ from the actual per capita income.
To illustrate, consider the annual per capita poverty threshold in 2009, estimated at
Php 16,841 for the entire country. If the estimated per capita income for a particular household
from the PMT model is, say, Php 17,000, the household will be classified as non-poor. However,
we know the point estimate has an error. What if we overestimated the household's per
capita income by Php 300, that is, the actual per capita income is only Php 16,700? The PMT
model then misclassifies the actual poor household as non-poor, increasing the model’s
exclusion error. To correct for this, we propose that the lower limit of the 95%
confidence/prediction interval be used as the estimated per capita income instead of the point
estimate. In estimation, reporting a confidence interval estimate is better than reporting the point
estimate alone.7 Initial simulations showed that by using the lower limit of the confidence interval,
instead of the point estimate, the exclusion error rates of the models are reduced by 4 to 5
percentage points.
5
The 2010 CPH database was not yet available when the models were constructed.
6
The reduction in the exclusion error rates also increased the inclusion error rates. The higher inclusion rates are
addressed using a second model to further screen non-poor households.
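The Php 17,000 example can be traced through a short sketch. The threshold and the point estimate come from the text; the standard error of Php 300 is a hypothetical value chosen for illustration, and the interval uses a normal approximation in pesos, which is an assumption of this sketch (the actual models work with the logarithm of per capita income).

```python
from statistics import NormalDist


def lower_limit(point_estimate, standard_error, level=0.95):
    """Lower end of a two-sided interval around a point estimate.

    Uses a normal approximation, which is an assumption of this sketch.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point_estimate - z * standard_error


def classify(predicted_income, threshold):
    """Poor when predicted income is at or below the threshold."""
    return "poor" if predicted_income <= threshold else "non-poor"


THRESHOLD_2009 = 16841  # annual per capita poverty threshold from the text (Php)
point = 17000           # point estimate used in the text's example (Php)
se = 300                # hypothetical standard error of the prediction (Php)

by_point = classify(point, THRESHOLD_2009)
by_lower = classify(lower_limit(point, se), THRESHOLD_2009)
```

With these numbers the point estimate tags the household as non-poor, while the lower limit (roughly Php 16,400) falls below the threshold and tags it as poor. The rule trades a lower exclusion error for a higher inclusion error.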
IV. Explanatory Variables for the PMT Model to Predict the Household’s Per Capita Income
The Proxy Means Test (PMT) model is a statistical method used to predict the per capita
income of a household based on observable characteristics that correlate with income but are easier to
measure. The literature suggests the following sets of variables as close proxies
of income.
• Needs indicators, e.g. household composition, family size, dependency ratio (sources:
Family Income and Expenditure Survey (FIES) and Labor Force Survey (LFS));
• Measures of current command over resources, e.g. assets (presence of television set,
air-conditioning unit, vehicle), housing characteristics/amenities (house structure, toilet
facilities, water source, presence of electricity), presence of a family member who is an
OFW; marital status, proportion of family members employed (sources: FIES and LFS);
• Measures of potential/prospective command over resources, e.g. employment status,
educational attainment (source: LFS);
• Measures of transfers from other people, organizations or government: community-level
(Barangay) public facilities (e.g. roads/ports, water systems); since some of these
local public amenities are also financed from local taxes, they may also indicate the
development of the local economy (source: Census of Population and Housing (CPH));
• Measures of economic opportunity or activity: community-level variables like number of
banks, business enterprises, etc. (source: CPH);
• Geographical location: regional indicator variables and urban/rural indicator variable
(source: FIES).
7
While one could use the upper limit of the 95% confidence/prediction interval estimate, this option is not considered
because the aim is to capture more poor households.
Using the data from the 2009 FIES and LFS and the 2007 CPH, 139 variables are
identified as potential explanatory variables for the household’s per capita income. Of these 139
variables, 45 are from the FIES, 53 from the LFS and 41 from the
CPH. These variables are defined in appendix 1 and appendix 2. Note that the existing
PMT used only the 98 variables from the FIES and the LFS (variables in appendix 1). The
proposed model is expected to provide a better fit (within sample) with the addition of the 41
variables from the CPH. This will likely improve the prediction performance of the
proposed PMT model, thereby reducing the model’s exclusion and inclusion error rates.
The PMT model is given by,
\[ Y_i = \beta_0 + X_i\beta + Z_i\delta + W_i\lambda + F_i\varphi + E_i\alpha + G_i\phi + \varepsilon_i, \qquad i = 1, 2, \dots, n \qquad (1) \]
where $Y_i$ is the per capita income of the ith household; X is a matrix of needs
indicator variables; Z is a matrix of the variables representing the measures of current
command over resources; W is for measures of potential command over resources; F is for the
variables representing transfers from other people, organizations or government; E is for the
variables representing the measures of economic opportunity; and G is for the geographical indicator
variables. The variable $\varepsilon$ is the random error term, assumed to be normally distributed with
mean 0 and constant variance $\sigma^2_\varepsilon$. The vectors $\beta$, $\delta$, $\lambda$, $\varphi$, $\alpha$ and $\phi$ are the structural parameters
and are estimated using the least-squares estimation procedure.
V. Empirical Results and Evaluation of Errors
After an exhaustive analysis of the data, two PMT models were developed: one PMT model for
households in the National Capital Region (NCR) and another PMT model for households
outside the NCR (non-NCR). Having two models, instead of one for all households, is another
innovation introduced in this undertaking. The motivation behind the decision to use two models
is basically to minimize the error rates. Evaluation of the initial models (single model for all
households) showed large exclusion errors for the NCR households, ranging from 70% to 80%.
The large errors suggest that poor households in the NCR may have different characteristics
than poor households outside the NCR.
The PMT model for non-NCR households has 76 significant explanatory variables. The list of
explanatory variables is provided in appendix 3. The variable that explains the largest variation
in the household’s per capita income is family size (in natural logarithm). The R-squared or
coefficient of determination for this model is about 39%.8
The PMT model for NCR households has 39 significant explanatory variables. The list of
variables is provided in appendix 4. Similar to the non-NCR PMT model, family size (in natural
logarithm) is the variable that explains the largest variation in the household’s per capita
income. The NCR PMT model has an R-squared of about 50%.
Moreover, a robustness procedure showed that 73 of the 139 explanatory variables have a
sign certainty probability of at least 97.5% using the Bayesian Averaging of Classical
Estimates (BACE) approach (Sala-i-Martin; 2004), suggesting these variables are robust
determinants of per capita income. A detailed discussion of the BACE is provided in appendix
5. The 73 robust variables are part of the 76 variables included in the final PMT model for
non-NCR households.
Second Model to Lower Inclusion Rate
The three (3) modifications to the existing PMT model, namely (1) the inclusion
of additional explanatory variables from the CPH to measure community (Barangay)
characteristics; (2) the estimation of the regression coefficients using the poorest 40% of
the households instead of all households in the sample; and (3) the use of the lower limit of
the 95% prediction interval of the per capita income instead of the point estimate, contributed to
a large reduction in the exclusion error rates. In other words, more actual poor households
were correctly classified by the models. One drawback, however, is that
making the models biased towards identifying poor households also increased the inclusion
error rates of the models (the number of actual non-poor households classified as poor divided
by the total number of households classified as poor). Initial estimates suggest that inclusion
error rates increased to about 45-50% for non-NCR households and about 75% for NCR
households, from about 24% for non-NCR and 37% for NCR households using the existing
model.
To remedy the problem of high inclusion error rates of the PMT models, a second set of models
is constructed to further screen non-poor households misclassified as poor by the
first set of models. The second set of models consists of logit models (one each for non-NCR
and NCR households).
8
The detailed results of the final model can be made available upon request from the authors.
The second model used in lowering the inclusion error is the logit model. Consider the linear
model,
\[ y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i, \qquad i = 1, 2, \dots, n \qquad (2) \]
where the variable of interest, $y_i$, takes on the value 1 if the household is non-poor and the value 0 if
the household is poor, and $X_1, X_2, \dots, X_k$ represent the determinants of households being non-poor.
Note that $y_i$ is a Bernoulli random variable with probability of success $\theta$, or $y_i \sim Be(\theta)$. The
problem in economics is that most likely $\theta$ is unknown and not constant across the observations.
The solution is to make $\theta$ dependent on the vector of explanatory variables X. Thus, we have,
\[ y_i \sim Be\left( F(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki}) \right) \qquad (3) \]
where the function F(·) maps $\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$ onto the interval [0,1].
Thus, instead of considering the precise value of y, we are now interested in the probability that
y = 1, given the outcome of $\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$, or,
\[ \Pr(y_i = 1 \mid \beta, x_i) = F(x_i'\beta) \qquad (4) \]
where F is a continuous, strictly increasing function that returns a value ranging from 0 to 1. The
choice of F determines the type of binary model. Given such a specification, the parameters of
this model (the betas) can be estimated using the method of maximum likelihood. Once the
identifiable parameters are established, the likelihood function is written as,
\[ L(y; \beta) = \prod_{i=1}^{n} F(x_i'\beta)^{y_i} \left[ 1 - F(x_i'\beta) \right]^{1 - y_i} \qquad (5) \]
where F(·) is a cumulative distribution function.
If F(·) is the logistic distribution then,
\[ F(x_i'\beta) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \qquad (6) \]
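Equations (5) and (6) translate directly into code. The following is a minimal sketch of the logistic function and the log of the likelihood in (5); the function names, the toy data and the two coefficient vectors are hypothetical, and no optimizer is included (in practice the betas would be found by maximizing this quantity).

```python
import math


def logistic(z):
    """Logistic CDF, exp(z) / (1 + exp(z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(beta, rows, y):
    """Log of the likelihood in equation (5) with F = logistic.

    beta: coefficients, intercept first.
    rows: explanatory variables per household, without the constant.
    y:    1 for non-poor, 0 for poor.
    """
    total = 0.0
    for x, yi in zip(rows, y):
        index = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
        p = logistic(index)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


# Hypothetical toy data: one explanatory variable, four households.
rows = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
ll_null = log_likelihood([0.0, 0.0], rows, y)     # every probability is 0.5
ll_better = log_likelihood([-3.0, 2.0], rows, y)  # probabilities track y
```

A coefficient vector that separates the poor from the non-poor households yields a higher log-likelihood than the all-zero vector, which is the comparison that maximum likelihood estimation automates.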
Marginal Effects
Interpretation of the coefficient values is complicated by the fact that estimated coefficients from
a binary model cannot be interpreted as marginal effects on the dependent variable. The
marginal effect of $X_j$ on the conditional probability is given by,
\[ \frac{\partial E(y \mid X, \beta)}{\partial X_j} = f(x_i'\beta)\,\beta_j \qquad (7) \]
where f(·) is the density function corresponding to F(·). Here, $\beta_j$ is weighted by a factor f(·) that
depends on the values of all the variables in X. The direction of the effect of a change in $X_j$
depends only on the sign of the $\beta_j$ coefficient. Positive values of $\beta_j$ imply that increasing $X_j$ will
increase the probability of the response, while negative values of $\beta_j$ imply that increasing $X_j$ will decrease the probability
of the response. The marginal effect is usually estimated using the average of all the values of
the explanatory variables (X) as the representative values in the estimation.
Average Marginal Effect
Some researchers, particularly Bartus (2005), argue that it is preferable to
compute the average marginal effect, that is, the average of each individual’s marginal effect.
The marginal effect computed at the average X is different from the average of the marginal
effects computed at the individual X.
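The difference between the two summaries can be seen in a small sketch of equation (7) for the logit case, where the density is f(z) = F(z)(1 - F(z)). The coefficients and data below are hypothetical and chosen only to make the gap visible.

```python
import math


def logistic_pdf(z):
    """Density of the logistic distribution: F(z) * (1 - F(z))."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def marginal_effect_at_mean(beta0, beta, rows, j):
    """Equation (7) evaluated at the sample mean of each explanatory variable."""
    n = len(rows)
    means = [sum(r[k] for r in rows) / n for k in range(len(beta))]
    index = beta0 + sum(b * m for b, m in zip(beta, means))
    return logistic_pdf(index) * beta[j]


def average_marginal_effect(beta0, beta, rows, j):
    """Equation (7) evaluated for each household, then averaged."""
    effects = []
    for r in rows:
        index = beta0 + sum(b * x for b, x in zip(beta, r))
        effects.append(logistic_pdf(index) * beta[j])
    return sum(effects) / len(effects)


# Hypothetical coefficients and data: one explanatory variable.
beta0, beta = -1.0, [0.8]
rows = [[0.0], [1.0], [2.0], [5.0]]
mem = marginal_effect_at_mean(beta0, beta, rows, 0)
ame = average_marginal_effect(beta0, beta, rows, 0)
```

Here the effect at the mean is about 0.18 while the average marginal effect is about 0.14, because the density f(·) is nonlinear and the household with X = 5 sits in the flat upper tail of the logistic curve.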
The results of the two logit models for non-NCR and NCR households are given in appendix 6A
and appendix 6B, respectively.9 Using the logit model, if the estimated probability of the ith
household being non-poor, given X, is equal to or greater than the cut-off value c, the household is
tagged as non-poor.10 Otherwise, the household is considered poor.
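One way to read the combined use of the two models is as a two-stage screen, sketched below. The cut-off of 0.40 is the value reported in the text; the order of the two checks and the function name `two_stage_label` reflect an interpretation of the description, not a documented DSWD procedure.

```python
CUTOFF = 0.40  # cut-off value reported in the text


def two_stage_label(lower_limit_income, threshold, prob_non_poor, cutoff=CUTOFF):
    """Combine the regression screen with the logit screen.

    Stage 1: a household whose lower-limit predicted income exceeds the
             poverty threshold is tagged non-poor.
    Stage 2: among the remaining households, one whose estimated probability
             of being non-poor is at or above the cutoff is re-tagged non-poor.
    """
    if lower_limit_income > threshold:
        return "non-poor"
    if prob_non_poor >= cutoff:
        return "non-poor"
    return "poor"


# Hypothetical households, using the 2009 national threshold from the text.
labels = [
    two_stage_label(15000, 16841, 0.25),  # passes both screens as poor
    two_stage_label(15000, 16841, 0.55),  # screened out by the logit model
    two_stage_label(18000, 16841, 0.10),  # screened out by the regression
]
```

The second stage can only move households from poor to non-poor, which is why it lowers the inclusion error at the possible cost of some additional exclusion error.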
Applying the two models, the linear regression model and the logit model, results in lower exclusion
and inclusion errors for non-NCR and NCR households, as reported in tables 3 and 4. The
exclusion and inclusion error rates for non-NCR households are 6.8% and 13.9%, respectively.
For NCR households, the exclusion and inclusion error rates are 19.3% and 10.7%,
respectively.
9
The detailed results of the two logit models can be requested from the authors.
10
Simulations made indicate that the optimal cut-off value is equal to 0.40.
Table 3. Expected-Prediction Table for Non-NCR Households (Lower Limit Estimate)
                           True Welfare Level
Predicted Welfare Level    Poor                Non-Poor
                           %        n          %        n
Poor                       93.2     7,827      5.0      1,262
Non-Poor                   6.8      574        95.0     24,026
Total                      100.0    8,401      100.0    25,288
The exclusion error rate is 6.8% while the inclusion error rate is 13.9%.
Table 4. Expected-Prediction Table for NCR Households (Lower Limit Estimate)
                           True Welfare Level
Predicted Welfare Level    Poor                Non-Poor
                           %        n          %        n
Poor                       80.7     92         0.3      11
Non-Poor                   19.3     22         99.7     4,160
Total                      100.0    114        100.0    4,171
The exclusion error rate is 19.3% while the inclusion error rate is 10.7%.
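As a check, the reported rates can be recovered from the household counts in tables 3 and 4 using the definitions given in the Introduction: the exclusion rate divides by the number of actual poor households, and the inclusion rate divides by the number of households predicted as poor.

\[ \text{Non-NCR:} \qquad \text{exclusion} = \frac{574}{8{,}401} \approx 6.8\%, \qquad \text{inclusion} = \frac{1{,}262}{7{,}827 + 1{,}262} \approx 13.9\% \]
\[ \text{NCR:} \qquad \text{exclusion} = \frac{22}{114} \approx 19.3\%, \qquad \text{inclusion} = \frac{11}{92 + 11} \approx 10.7\% \]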
Using the PMT models, the estimated poverty incidence among families in 2009 is about
22.24% compared to the official poverty estimates of 20.9% from the NSCB.11
Table 5. Estimated Poverty Incidence using the PMT Models
                           Un-weighted                           Weighted
Area                       N         Incidence   Number of       N            Incidence   Number of
                                                 Households                               Households*
NCR                        4,285     2.40        103             2,460,993    2.42        59,556
All Regions Outside NCR    33,689    26.98       9,088           15,830,871   25.32       4,008,377
Total                      37,974    24.20       9,191           18,291,864   22.24       4,068,111
* May not sum up to the total predicted poor households due to rounding off errors.
The figures in tables 6A and 6B are the estimated numbers of households that will be
misclassified by the PMT models as poor, given that these households are non-poor.
These households will be part of the inclusion error. The figures are reported by income decile.
11 The PMT models are not constructed for the purpose of estimating the overall poverty incidence. The purpose of this comparison is merely to assess the prediction performance of the models. The authors are grateful to Dean Mon Clarete of the University of the Philippines School of Economics and member of the National Technical Advisory Group (NTAG) of the DSWD for this suggestion.
Table 6A. Estimated Number of Non-NCR Households (Weighted) Misclassified as Poor
National Income Decile | Number of Households: Total | Number of Households: Inclusion Error | Percentage per Decile
1 | 1,798,387 | 7,931 | 0.44
2 | 1,792,055 | 38,780 | 2.16
3 | 1,769,891 | 77,123 | 4.36
4 | 1,736,647 | 121,209 | 6.98
5 | 1,660,658 | 119,185 | 7.18
6 | 1,556,175 | 91,891 | 5.90
7 | 1,480,186 | 58,362 | 3.94
8 | 1,420,029 | 24,997 | 1.76
9 | 1,370,953 | 6,673 | 0.49
10 | 1,244,306 | 875 | 0.07
Total | 15,830,871 | 546,971 | 3.46
For non-NCR households, the bulk of these misclassified households are in the 4th to 6th income deciles (using the 2009 figures). For NCR households, all misclassified households are in the 5th to 9th income deciles. Given the income profile of these households, some of the misclassifications can be corrected naturally (a “non-poor” household will not enrol itself in the program, particularly households in the higher income deciles) or administratively (DSWD can detect these households). Either way, the final inclusion errors may be lower than the reported figures.
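To see how concentrated the non-NCR misclassifications are, the decile counts in Table 6A can be summed directly. The sketch below uses the inclusion-error column of that table and divides by the sum of the decile entries (an assumption made here; the table's printed total differs slightly from that sum). Variable names are illustrative.

```python
# Inclusion-error counts by national income decile (Table 6A, non-NCR, weighted)
inclusion_by_decile = {
    1: 7_931, 2: 38_780, 3: 77_123, 4: 121_209, 5: 119_185,
    6: 91_891, 7: 58_362, 8: 24_997, 9: 6_673, 10: 875,
}

total_misclassified = sum(inclusion_by_decile.values())
middle = sum(inclusion_by_decile[d] for d in (4, 5, 6))

# Share of misclassified households that fall in the 4th to 6th deciles
share_4_to_6 = 100.0 * middle / total_misclassified
print(f"Deciles 4-6 hold {share_4_to_6:.1f}% of misclassified non-NCR households")
```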
Table 6B. Estimated Number of NCR Households (Weighted) Misclassified as Poor
National Income Decile | Number of Households: Total | Number of Households: Inclusion Error | Percentage per Decile
1 | 16,981 | 0 | 0.00
2 | 22,641 | 0 | 0.00
3 | 47,251 | 0 | 0.00
4 | 81,705 | 0 | 0.00
5 | 168,578 | 847 | 0.50
6 | 274,893 | 2,127 | 0.77
7 | 360,535 | 2,634 | 0.73
8 | 419,845 | 733 | 0.17
9 | 470,296 | 489 | 0.10
10 | 598,021 | 0 | 0.00
Total | 2,460,993 | 6,830 | 0.28
VI. Conclusion
This paper proposed PMT models that lower the risks of misclassifying poor households as non-poor (exclusion error) and non-poor households as poor (inclusion error). In the construction of the models, modifications are made to sharpen the targeting process. Learning from the experience gained in using the existing PMT model, several improvements are introduced in the new PMT models: additional explanatory variables to capture community (barangay) characteristics; an alternative estimation procedure that deliberately biases the estimates toward identifying poor households; and an adjustment in the process of predicting the household’s per capita income, using the lower limit of the 95% prediction interval instead of the point estimate. These modifications are introduced to minimize the models’ exclusion error rates, so that the new PMT models capture more poor households. To minimize the inclusion error rates, a logit model is then introduced to further weed out the non-poor households misclassified as poor in the first model. The introduction of this second “screener” model lowers the inclusion error rate to a comfortable level. Using a separate model solely for NCR households also helped lower the error rates.
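The two-stage logic summarized above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the first-stage model is assumed to predict log per capita income, the lower limit of the 95% prediction interval is approximated with a normal multiplier of 1.96, the logit output is assumed to be the probability of being poor, and a household flagged in stage one is kept as poor only when that probability reaches the 0.40 cut-off. The function name, the threshold value, and the example inputs are all hypothetical.

```python
import math

Z_95 = 1.96  # normal approximation to the 95% interval multiplier (assumption)


def classify(pred_log_income, se_pred, poverty_threshold, p_poor_logit, cutoff=0.40):
    """Two-stage classification sketch.

    Stage 1: compare the lower limit of the prediction interval for log
             per capita income against the log poverty threshold.
    Stage 2: households flagged as poor in stage 1 are screened by a logit
             probability; they remain poor only if it is at least `cutoff`.
    """
    lower_limit = pred_log_income - Z_95 * se_pred
    if lower_limit > math.log(poverty_threshold):
        return "non-poor"  # not flagged by the first model
    return "poor" if p_poor_logit >= cutoff else "non-poor"


# Hypothetical households; 17,000 is an illustrative threshold, not the official one.
print(classify(math.log(20_000), 0.3, 17_000, p_poor_logit=0.55))  # flagged, kept
print(classify(math.log(20_000), 0.3, 17_000, p_poor_logit=0.25))  # flagged, screened out
print(classify(math.log(40_000), 0.3, 17_000, p_poor_logit=0.90))  # never flagged
```

Using the lower limit rather than the point estimate makes stage one flag more households as poor, which is why a second screening step is needed to keep the inclusion error in check.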
References
Alatas, V., Banerjee, A., Olken, B. and Tobias, J. (2010), “Targeting the Poor: Evidence from a
Field Experiment in Indonesia” Technical Report prepared for the World Bank.
Araujo, C. and Carraro, L (2011). “A proxy-means test exercise for the selection of beneficiaries
of poverty targeted programs in Mongolia” Technical Report prepared for the Ministry of
Social Welfare, Mongolia.
Balisacan, A. (2005) “In Search of Proxy Indicators for Poverty Targeting: Toward a Framework
for a Poverty Indicator and Monitoring System” Paper prepared for the National Statistics
Office (NSO) component in the UNDP-assisted project: Strengthening Institutional
Mechanism for the Convergence of Poverty Alleviation Efforts.
Balisacan, A.M. and Pernia, E.M. (2003). “Poverty, Inequality and Growth in the Philippines”, in
E.M. Pernia and A.B. Deolalikar (eds.), Poverty, Growth and Institutions in Developing
Asia. Hampshire, England: Palgrave Macmillan Publishers.
Bartus, T. (2005), “Estimation of marginal effects using margeff” The Stata Journal, Vol. 5, No.
3, pp. 309-329.
Capuno, J. (2012), “Enhancing the DSWD NHTSPR Proxy Means Test Model: A review of
current efforts and recommendations” Technical Note, World Bank, Manila.
Fernandez, L. (2007) “Technical Note on Estimation of a Proxy Means Test Model (PMT) for
Conditional Cash Transfer (CCT) Pilot Program in the Philippines” prepared for the
Department of Social Welfare and Development (DSWD), Philippines.
Fernandez, L. (2012), “Technical Note on Estimations of Proxy Means Test Model (PMT) of the
National Household Targeting System for Poverty Reduction (NHTS-PR) with latest
household surveys: FIES-LFS 2009 in the Philippines” prepared for the World Bank,
Manila and the DSWD, Philippines.
Mapa, D., Balisacan, A. and Briones, K. (2008), “Robust Determinants of Income Growth in the
Philippines,” in Joseph J. Capuno and Aniceto C. Orbeta (editors), Human Capital and
Development in the Philippines. Philippine Institute for Development Studies (PIDS),
Makati City.
Narayan, A. and Yoshida, N. (2005), “Proxy Means Test for Targeting Welfare Benefits in Sri
Lanka” Technical Report No. SASPR-7, World Bank, Washington, D.C.
Appendix 1. Explanatory Variables from the FIES and LFS
Var No. | Variable Name | Description | Source
1 | house_strong_3 | roof and walls made of strong materials | FIES
2 | bldg_single | building type = single house | FIES
3 | ts_oh_ol | tenure status = own house and lot; or owner-like possession of house and lot | FIES
4 | ts_squatter | tenure status = squatter | FIES
5 | tf_w_sealed | toilet facility = water sealed | FIES
6 | tf_none | toilet facility = none | FIES
7 | w_elec | with electricity in the building/house | FIES
8 | ws_o_faucet | water source = own use, faucet, community water system | FIES
9 | ws_s_well | water source = shared, tubed / piped well | FIES
10 | ws_dug_well | water source = dug well | FIES
11 | ws_spring | water source = spring, river, stream, etc | FIES
12 | w_radio | family with radio | FIES
13 | w_tv | with television set | FIES
14 | w_vtr | with VTR/VHS/VCD/DVD | FIES
15 | w_stereo | with stereo / CD player | FIES
16 | w_ref | with refrigerator/freezer | FIES
17 | w_wash | with washing machine | FIES
18 | w_aircon | with air conditioner | FIES
19 | w_sala | with sala set | FIES
20 | w_dining | with dining set | FIES
21 | w_car | with car/jeep | FIES
22 | w_phone | with telephone/cellphone | FIES
23 | w_pc | with microcomputer | FIES
24 | w_oven | with microwave oven | FIES
25 | w_motor | with motorcycle/tricycle | FIES
26 | n_hh | number of households in the housing unit | FIES
27 | hh_type_s_fam | household type = single family | FIES
28 | ln_fam_size2 | natural logarithm of family size | LFS
29 | p_mem_0_14 | proportion of members 0-14 years old | LFS
30 | w_helper | family with domestic helper | LFS
31 | p2_educ_ngc | proportion of family members with no grade completed | LFS
32 | p2_educ_eu | proportion of family members who are elementary undergraduates | LFS
33 | p2_educ_eg | proportion of family members who are elementary graduates | LFS
34 | p2_educ_hsu | proportion of family members who are high school undergraduates | LFS
35 | p2_educ_hsg | proportion of family members who are high school graduates | LFS
36 | p2_educ_cu | proportion of family members who are college undergraduates | LFS
37 | p2_educ_cg | proportion of family members who are college graduates | LFS
38 | p2_educ_pg | proportion of family members who are post graduates | LFS
39 | p2_mem_sch | proportion of family members currently attending school | LFS
40 | h_sex2_m | household head = male | LFS
41 | h_age2 | age of household head | LFS
42 | h_ms2_single | household head = single | LFS
43 | p2_mem_emp | proportion of family members employed | LFS
44 | w_occup2_11 | with family member whose primary occupation = officials of government and special-interest organizations | LFS
45 | w_occup2_12 | with family member whose primary occupation = corporate executives and specialized managers | LFS
46 | w_occup2_13 | general managers or managing proprietors | LFS
47 | w_occup2_14 | supervisors | LFS
48 | w_occup2_21 | physical, mathematical and engineering science professionals | LFS
49 | w_occup2_22 | life science and health professionals | LFS
50 | w_occup2_23 | teaching professionals | LFS
51 | w_occup2_24 | other professionals | LFS
52 | w_occup2_31 | physical science and engineering associate professionals | LFS
53 | w_occup2_32 | life science and health professional associates | LFS
54 | w_occup2_33 | teaching associate professionals | LFS
55 | w_occup2_34 | related associate professionals | LFS
56 | w_occup2_41 | office clerks | LFS
57 | w_occup2_42 | customer service clerks | LFS
58 | w_occup2_51 | personal and protective services workers | LFS
59 | w_occup2_52 | models, salespersons and demonstrators | LFS
60 | w_occup2_61 | farmers and other plant growers | LFS
61 | w_occup2_62 | animal producers | LFS
62 | w_occup2_63 | forestry and related workers | LFS
63 | w_occup2_64 | fishermen | LFS
64 | w_occup2_65 | hunters and trappers | LFS
65 | w_occup2_71 | mining, construction and related trades workers | LFS
66 | w_occup2_72 | metal, machinery and related trades workers | LFS
67 | w_occup2_73 | precision, handicraft, printing and related trades workers | LFS
68 | w_occup2_74 | other craft and related trades workers | LFS
69 | w_occup2_81 | stationary-plant and related operators | LFS
70 | w_occup2_82 | machine operators and assemblers | LFS
71 | w_occup2_83 | drivers and mobile plant operators | LFS
72 | w_occup2_91 | sales and services elementary occupations | LFS
73 | w_occup2_92 | agricultural, forestry and fishery laborers | LFS
74 | w_occup2_93 | laborers in mining, construction, manufacturing and transport | LFS
75 | w_occup2_01 | armed forces | LFS
76 | w_occup2_09 | other occupations not classifiable | LFS
77 | w_cw2_employer | with family member whose class of worker = employer in own family-operated farm or business | LFS
78 | w_ne2_s_term | with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work | LFS
79 | w_pb2_month | with family member whose basis of payment = monthly | LFS
80 | ofi_w_ocw | with family member who is an overseas contract worker | LFS
81 | urban | urbanity = urban | FIES
82 | region_01 | Ilocos Region | FIES
83 | region_02 | Cagayan Valley | FIES
84 | region_03 | Central Luzon | FIES
85 | region_41 | CALABARZON | FIES
86 | region_42 | MIMAROPA | FIES
87 | region_05 | Bicol Region | FIES
88 | region_06 | Western Visayas | FIES
89 | region_07 | Central Visayas | FIES
90 | region_08 | Eastern Visayas | FIES
91 | region_09 | Zamboanga Peninsula | FIES
92 | region_10 | Northern Mindanao | FIES
93 | region_11 | Southern Mindanao | FIES
94 | region_12 | Central Mindanao | FIES
95 | region_13 | National Capital Region | FIES
96 | region_14 | Cordillera Administrative Region | FIES
97 | region_15 | Autonomous Region of Muslim Mindanao | FIES
98 | region_16 | Caraga | FIES
Appendix 2. Explanatory Variables from the CPH
Var No. | Variable Name | Description
1 | q_1a | Part of the town/city proper
2 | q_1b | Former poblacion of the municipality
3 | q_1c | Poblacion/city district
4 | q_2 | Street pattern indicator
5 | q_3 | Barangay is accessible to the national highway
6 | q_4a | Town/city hall or provincial capitol indicator
7 | q_4b | Church, chapel or mosque with religious service at least once a month
8 | q_4c | Public plaza or park for recreation
9 | q_4d | Cemetery indicator
10 | q_4e | Market place or building where trading activities are carried out at least once a week
11 | q_4f | Elementary school indicator
12 | q_4g | High school indicator
13 | q_4h | College/university indicator
14 | q_4i | Public library indicator
15 | q_4j | Hospital indicator
16 | q_4k | Health center indicator
17 | q_4l | Landline telephone system or calling station indicator
18 | q_4m | Cellular phone signal indicator
19 | q_4n | Postal service indicator
20 | q_4o | Community waterworks system indicator
21 | q_4p | Operational seaport indicator
22 | q_4q | Public fire-protection service indicator
23 | q_4r | Public-street sweeper indicator
24 | q_5 | Farmers, farm laborers, fishermen, loggers, and forest product gatherers constitute more than half of the population 10 years old and over
25 | q_6a | Number of commercial establishments in the barangay
26 | q_6b | Number of commercial establishments outside the barangay but within 2 km
27 | q_7a | Number of recreational establishments in the barangay
28 | q_7b | Number of recreational establishments outside the barangay but within 2 km
29 | q_8a | Number of manufacturing establishments in the barangay
30 | q_8b | Number of manufacturing establishments outside the barangay but within 2 km
31 | q_9a | Number of hotels, dormitories, motels or other lodging places in the barangay
32 | q_9b | Number of hotels, dormitories, motels or other lodging places outside the barangay but within 2 km
33 | q_10a | Number of banking institutions, pawnshops, financing/investment or insurance companies in the barangay
34 | q_10b | Number of banking institutions, pawnshops, financing/investment or insurance companies outside the barangay but within 2 km
35 | q_11a | Number of auto repair shops, vulcanizing shops, electronic repair shops in the barangay
36 | q_11b | Number of auto repair shops, vulcanizing shops, electronic repair shops outside the barangay but within 2 km
37 | q_12a | Number of establishments offering personal services like restaurant, cafeteria, or refreshment parlor in the barangay
38 | q_12b | Number of establishments offering personal services like restaurant, cafeteria, or refreshment parlor outside the barangay but within 2 km
39 | q_13a | Number of households in danger areas
40 | q_13b | Number of households in government land without legally recognizable claims to the land
41 | q_13c | Number of households in private land which they do not own
Appendix 3. Variables in the PMT Model for Non-NCR Households and the Partial Sum of Squares
Variable | Variable Name | Partial SS
natural logarithm of family size | ln_fam_size2 | 106.9507
Zamboanga Peninsula | region_09 | 12.2461
with telephone/cellphone | w_phone | 12.0351
Autonomous Region of Muslim Mindanao | region_15 | 11.8521
Northern Mindanao | region_10 | 11.3210
Caraga | region_16 | 10.4322
mining, construction and related trades workers | w_occup2_71 | 7.1971
drivers and mobile plant operators | w_occup2_83 | 6.0946
fishermen | w_occup2_64 | 6.0367
proportion of family members who are high school graduates | p2_educ_hsg | 5.1667
proportion of family members who are college undergraduates | p2_educ_cu | 5.0339
proportion of members 0-14 years old | p_mem_0_14 | 4.9922
laborers in mining, construction, manufacturing and transport | w_occup2_93 | 4.8049
proportion of family members who are high school undergraduates | p2_educ_hsu | 4.5039
Eastern Visayas | region_08 | 4.2999
water source = spring, river, stream, etc | ws_spring | 3.8729
with family member who is an overseas contract worker | ofi_w_ocw | 3.7744
roof and walls made of strong materials | house_strong_3 | 3.2691
general managers or managing proprietors | w_occup2_13 | 3.0721
Central Visayas | region_07 | 2.7962
with family member whose basis of payment = monthly | w_pb2_month | 2.7792
with VTR/VHS/VCD/DVD | w_vtr | 2.1045
proportion of family members with no grade completed | p2_educ_ngc | 2.0969
household type = single family | hh_type_s_fam | 1.9464
sales and services elementary occupations | w_occup2_91 | 1.9378
high school indicator | q_4g | 1.8551
with television set | w_tv | 1.7796
Central Mindanao | region_12 | 1.7037
toilet facility = water sealed | tf_w_sealed | 1.6850
with motorcycle/tricycle | w_motor | 1.6811
with electricity in the building/house | w_elec | 1.6673
urbanity = urban | urban | 1.6163
machine operators and assemblers | w_occup2_82 | 1.5660
CALABARZON | region_41 | 1.3110
cellular phone signal indicator | q_4m | 1.2375
household head = single | h_ms2_single | 1.2277
office clerks | w_occup2_41 | 1.1950
proportion of family members currently attending school | p2_mem_sch | 1.1903
age of household head | h_age2 | 1.1313
with refrigerator/freezer | w_ref | 1.0074
metal, machinery and related trades workers | w_occup2_72 | 0.9825
household head = male | h_sex2_m | 0.9642
proportion of family members who are college graduates | p2_educ_cg | 0.9328
proportion of family members who are elementary graduates | p2_educ_eg | 0.8949
with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work | w_ne2_s_term | 0.8556
with car/jeep | w_car | 0.7919
water source = shared, tubed / piped well | ws_s_well | 0.7182
number of auto repair shops, vulcanizing shops, electronic repair shops, or other repair shops in the barangay | q_11a | 0.6295
family with radio | w_radio | 0.5895
with family member whose class of worker = employer in own family-operated farm or business | w_cw2_employer | 0.5560
farmers and other plant growers | w_occup2_61 | 0.5525
tenure status = own house and lot; or owner-like possession of house and lot | ts_oh_ol | 0.5508
models, salespersons and demonstrators | w_occup2_52 | 0.5377
with stereo / CD player | w_stereo | 0.5198
armed forces | w_occup2_01 | 0.4925
with sala set | w_sala | 0.4828
related associate professionals | w_occup2_34 | 0.4693
personal and protective services workers | w_occup2_51 | 0.4506
other craft and related trades workers | w_occup2_74 | 0.4336
number of recreational establishments outside the barangay but within 2 km | q_7b | 0.4207
with family member whose primary occupation = officials of government and special-interest organizations | w_occup2_11 | 0.4200
number of commercial establishments in the barangay | q_6a | 0.3670
customer service clerks | w_occup2_42 | 0.3620
building type = single house | bldg_single | 0.3474
with washing machine | w_wash | 0.2904
with dining set | w_dining | 0.2855
proportion of family members employed | p2_mem_emp | 0.2683
hunters and trappers | w_occup2_65 | 0.2617
Central Luzon | region_03 | 0.2530
animal producers | w_occup2_62 | 0.2499
number of hotels, dormitories, motels or other lodging places in the barangay | q_9a | 0.2428
number of recreational establishments in the barangay | q_7a | 0.1869
number of establishments offering personal services like restaurants, cafeteria, etc in the barangay | q_12a | 0.1847
proportion of family members who are post graduates | p2_educ_pg | 0.0641
supervisors | w_occup2_14 | 0.0519
physical, mathematical and engineering science professionals | w_occup2_21 | 0.0139
Appendix 4. Variables in the PMT Model for NCR Households and the Partial Sum of Squares
Variable | Variable Name | Partial SS
natural logarithm of family size | ln_fam_size2 | 16.8941
number of telephone/cellphone | n_phone | 2.5443
number of washing machines | n_wash | 1.3592
mining, construction and related trades workers | w_occup2_71 | 1.2661
proportion of members 0-14 years old | p_mem_0_14 | 1.1725
drivers and mobile plant operators | w_occup2_83 | 1.1321
proportion of family members who are high school graduates | p2_educ_hsg | 1.0384
with family member whose basis of payment = monthly | w_pb2_month | 0.9155
personal and protective services workers | w_occup2_51 | 0.7428
proportion of family members who are college undergraduates | p2_educ_cu | 0.7133
household type = single family | hh_type_s_fam | 0.6986
farmers, farm laborers, fishermen, loggers, and forest product gatherers constitute more than half of the population 10 years old and over (yes=1) | q_5 | 0.6091
office clerks | w_occup2_41 | 0.5829
number of television sets | n_tv | 0.5566
laborers in mining, construction, manufacturing and transport | w_occup2_93 | 0.5547
tenure status = squatter | ts_squatter | 0.5204
water source = own use, faucet, community water system | ws_o_faucet | 0.4994
metal, machinery and related trades workers | w_occup2_72 | 0.4871
number of VTR/VHS/VCD/DVD | n_vtr | 0.4825
number of stereo/CD player | n_stereo | 0.4646
models, salespersons and demonstrators | w_occup2_52 | 0.4185
general managers or managing proprietors | w_occup2_13 | 0.4014
number of refrigerators | n_ref | 0.3970
toilet facility = none | tf_none | 0.3816
number of banking institutions/pawnshops, financing and investment, inside the barangay | q_10a | 0.3799
customer service clerks | w_occup2_42 | 0.3764
proportion of family members currently attending school | p2_mem_sch | 0.3489
number of commercial establishments outside the barangay but within 2 km | q_6b | 0.2734
age of household head | h_age2 | 0.2580
number of sala sets | n_sala | 0.2249
number of households in the housing unit | n_hh | 0.2078
landline telephone system or calling station (indicator) | q_4l | 0.1919
number of microwave ovens | n_oven | 0.1805
proportion of family members employed | p2_mem_emp | 0.1652
forestry and related workers | w_occup2_63 | 0.1614
with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work | w_ne2_s_term | 0.1592
other craft and related trades workers | w_occup2_74 | 0.1566
number of motorcycle/tricycle | n_motor | 0.1504
other occupations not classifiable | w_occup2_09 | 0.1453
Appendix 5. Bayesian Averaging of Classical Estimates (BACE)
The BACE (Sala-i-Martin et al, 2004) is a procedure that determines the robustness of
explanatory variables in a cross-section regression model. It computes the weighted average of
each variable’s estimated coefficient from a large number of regression models, where the
weights are functions of goodness-of-fit statistics. The focus of BACE is not on the magnitude of
the weighted regression coefficients, but on the robustness of a variable’s effect even if the
model is suffering from misspecification errors.
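The procedure involves estimating all regression models of the form

$$ y = \alpha_j + \beta_{xj}\,x + \beta_{Fj}'F + \beta_{zj}'z_j + \varepsilon_j $$

where $x$ is the variable of interest, $F$ is a vector of fixed variables appearing in all the regressions, and $z_j$ is a vector of variables taken from the collection of all other variables under consideration.

The prior probability of model $M_j$, denoted as $P(M_j)$, assuming equal prior variable inclusion probability, will be:

$$ P(M_j) = \left(\frac{\bar{k}}{K}\right)^{k_j}\left(1-\frac{\bar{k}}{K}\right)^{K-k_j} $$

where $\bar{k}$ is the speculated number of variables in the true model, $K$ is the total number of variables in the dataset, and $k_j$ is the number of variables in the $j$th model. The weights that will be used in the averaging are the posterior probabilities of the models. The weight is a function of the prior probability and is given by:

$$ P(M_j \mid y) = \frac{P(M_j)\,T^{-k_j/2}\,SSE_j^{-T/2}}{\sum_{i} P(M_i)\,T^{-k_i/2}\,SSE_i^{-T/2}} $$

where $SSE_j$ is the sum of squared errors in model $j$ and $T$ is the number of observations. Therefore, the unconditional posterior mean of $\beta$ is given by:

$$ E(\beta \mid y) = \sum_j P(M_j \mid y)\,\hat{\beta}_j $$

where $\hat{\beta}_j$ is the estimated value of the vector of coefficients under OLS; and its corresponding unconditional posterior variance is of the form:

$$ Var(\beta \mid y) = \sum_j P(M_j \mid y)\,Var(\beta \mid y, M_j) + \sum_j P(M_j \mid y)\left[\hat{\beta}_j - E(\beta \mid y)\right]^2 $$

The posterior mean and posterior variance that are conditional on the variable's inclusion are as follows:

$$ E(\beta \mid y, \beta \neq 0) = \frac{E(\beta \mid y)}{\pi} $$

$$ Var(\beta \mid y, \beta \neq 0) = \frac{Var(\beta \mid y) + \left[E(\beta \mid y)\right]^2}{\pi} - \left[E(\beta \mid y, \beta \neq 0)\right]^2 $$

where $\pi$ is the vector of posterior inclusion probabilities with elements equal to the sum of posterior model probabilities taken over models containing the particular variable.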
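The weighting scheme can be sketched in a few lines. The code below assumes the standard BACE form, in which the posterior weight of a model is proportional to its prior probability times T^(-k/2) times SSE^(-T/2), with T the number of observations, and in which the conditional moments are obtained by dividing by the posterior inclusion probability. All function names are illustrative, and the weights are computed in logs only to avoid numerical underflow.

```python
import math


def prior_prob(k_j, k_bar, K):
    """Prior model probability under equal prior inclusion probability k_bar / K."""
    p = k_bar / K
    return p ** k_j * (1.0 - p) ** (K - k_j)


def posterior_weights(models, k_bar, K, T):
    """models: list of (k_j, sse_j) pairs. Returns normalized posterior weights."""
    log_w = [
        math.log(prior_prob(k, k_bar, K)) - 0.5 * k * math.log(T) - 0.5 * T * math.log(sse)
        for k, sse in models
    ]
    shift = max(log_w)  # subtract the maximum before exponentiating
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def conditional_moments(mean, var, pip):
    """Posterior mean and variance conditional on inclusion (pip = inclusion prob.)."""
    cond_mean = mean / pip
    cond_var = (var + mean ** 2) / pip - cond_mean ** 2
    return cond_mean, cond_var


# Toy example: two candidate models on T = 50 observations (hypothetical numbers).
weights = posterior_weights([(1, 10.0), (2, 9.0)], k_bar=1, K=2, T=50)
print([round(w, 3) for w in weights])

# Unconditional figures reported for the natural logarithm of family size.
cm, cv = conditional_moments(mean=-0.2196, var=0.0028, pip=0.8043)
print(round(cm, 4), round(cv, 4))
```

Applied to the rounded unconditional figures reported for the natural logarithm of family size, the conditional formulas give values close to the conditional mean and variance shown in the Needs Indicators table, including the negative conditional variance.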
To determine the robustness of the effects, the sign-certainty probability of the variables is used.
It is the probability that the true effect strictly lies on one side of zero under the Gaussian
distribution. Robust variables are those with sign-certainty probability of at least 97.5%.
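Under the Gaussian assumption, the sign-certainty probability is the normal CDF evaluated at the absolute posterior mean divided by the posterior standard deviation. A minimal sketch using only the standard library; the function name is illustrative, and the inputs are the rounded conditional mean and variance reported for the proportion of family members currently attending school.

```python
import math


def sign_certainty(mean, variance):
    """P(effect lies on the same side of zero as its posterior mean), Gaussian case."""
    z = abs(mean) / math.sqrt(variance)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


scp = sign_certainty(mean=-0.1713, variance=0.0053)
print(round(scp, 3))

# Robustness rule stated in the text: sign certainty of at least 97.5%
is_robust = scp >= 0.975
print(is_robust)
```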
BACE in the DSWD Targeting Model
To check the robustness of the variables’ effects, the BACE method was applied to a subsample of the 2009 FIES comprising households living outside NCR with per capita income below the 40th percentile. If the BACE procedure were strictly implemented, there would be a total of $2^K$ regression models to estimate (for $K$ candidate variables), which leads to computational problems. To simplify the analysis and to reduce the number of runs, variable groupings were used. The main variable groupings are needs indicators, household-head characteristics, geographic dummy variables, employment indicators, asset indicators, water source and toilet facility, occupation indicators, education variables, housing characteristics, and community characteristics.
The needs indicators, household-head characteristics, and geographic dummy variables were fixed in all regression models, while each variable among the employment indicators and education variables was left unrestricted. All in all, a total of 16 variables and variable groupings were tested for robustness, which translates to 65,536 regression runs. The BACE procedure identified 73 robust correlates of per capita income.
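The count of 65,536 runs is the number of subsets of 16 items, since each variable or variable grouping is either in or out of a given regression. A short check by explicit enumeration; the placeholder labels stand in for the actual groupings.

```python
from itertools import combinations

# Placeholder labels for the 16 variables / variable groupings tested
groupings = [f"group_{i}" for i in range(1, 17)]

# Count every subset, from the empty set up to all 16 included
n_models = sum(
    1
    for size in range(len(groupings) + 1)
    for _ in combinations(groupings, size)
)
print(n_models)
```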
Robust Determinants of Income

Geographic Indicators
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
urbanity = urban | 1.0 | 0.0655 | 0.0002 | 1.0000 | 0.0655 | 0.0002 | 1.0000
Ilocos Region | 1.0 | 0.1512 | 0.0003 | 1.0000 | 0.1512 | 0.0003 | 1.0000
Cagayan Valley | 1.0 | 0.1449 | 0.0004 | 1.0000 | 0.1449 | 0.0004 | 1.0000
Central Luzon | 1.0 | 0.1587 | 0.0003 | 1.0000 | 0.1587 | 0.0003 | 1.0000
CALABARZON | 1.0 | 0.1880 | 0.0002 | 1.0000 | 0.1880 | 0.0002 | 1.0000
MIMAROPA | 1.0 | 0.0967 | 0.0002 | 1.0000 | 0.0967 | 0.0002 | 1.0000
Bicol Region | 1.0 | 0.1093 | 0.0002 | 1.0000 | 0.1093 | 0.0002 | 1.0000
Western Visayas | 1.0 | 0.1193 | 0.0002 | 1.0000 | 0.1193 | 0.0002 | 1.0000
Central Visayas | 1.0 | 0.0609 | 0.0002 | 1.0000 | 0.0609 | 0.0002 | 1.0000
Eastern Visayas | 1.0 | 0.0565 | 0.0002 | 1.0000 | 0.0565 | 0.0002 | 1.0000
Zamboanga Peninsula | 1.0 | -0.0149 | 0.0002 | 0.8279 | -0.0149 | 0.0002 | 0.8279
Northern Mindanao | 1.0 | 0.0033 | 0.0002 | 0.5903 | 0.0033 | 0.0002 | 0.5903
Southern Mindanao | 1.0 | 0.1008 | 0.0002 | 1.0000 | 0.1008 | 0.0002 | 1.0000
Central Mindanao | 1.0 | 0.0684 | 0.0002 | 1.0000 | 0.0684 | 0.0002 | 1.0000
CAR | 1.0 | 0.1065 | 0.0004 | 1.0000 | 0.1065 | 0.0004 | 1.0000
ARMM | 1.0 | 0.2210 | 0.0004 | 1.0000 | 0.2210 | 0.0004 | 1.0000
Caraga | Base Region
Needs Indicators
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
natural logarithm of family size | 0.8043 | -0.2196 | 0.0028 | 1.0000 | -0.2730 | -0.0111 |
proportion of members 0-14 years old | 0.0708 | -0.0262 | 0.0091 | 0.6079 | -0.3698 | 0.0020 | 1.0000
Employment Indicators
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
with family member whose basis of payment = monthly | 0.0139 | 0.0011 | 0.0001 | 0.5460 | 0.0806 | 0.0004 | 1.0000
with family member who is an overseas contract worker | 0.0101 | 0.0013 | 0.0002 | 0.5392 | 0.1292 | 0.0010 | 1.0000
Asset Indicators
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
with electricity in the building/house | 0.8250 | 0.0525 | 0.0002 | 1.0000 | 0.0637 | -0.0005 |
with motorcycle/tricycle | 0.8250 | 0.0625 | 0.0003 | 0.9999 | 0.0758 | -0.0007 |
with television set | 0.8250 | 0.0408 | 0.0001 | 0.9999 | 0.0495 | -0.0003 |
with telephone/cellphone | 0.8250 | 0.0940 | 0.0007 | 0.9999 | 0.1139 | -0.0015 |
with VTR/VHS/VCD/DVD | 0.8250 | 0.0344 | 0.0001 | 0.9998 | 0.0417 | -0.0002 |
with refrigerator/freezer | 0.8250 | 0.0508 | 0.0002 | 0.9998 | 0.0616 | -0.0004 |
with car/jeep | 0.8250 | 0.1025 | 0.0010 | 0.9993 | 0.1242 | -0.0015 |
with sala set | 0.8250 | 0.0266 | 0.0001 | 0.9973 | 0.0322 | -0.0001 |
family with radio | 0.8250 | 0.0158 | 0.0000 | 0.9941 | 0.0191 | 0.0000 |
with stereo / CD player | 0.8250 | 0.0252 | 0.0001 | 0.9930 | 0.0305 | 0.0000 |
with washing machine | 0.8250 | 0.0248 | 0.0002 | 0.9758 | 0.0300 | 0.0000 | 1.0000
with dining set | 0.8250 | 0.0143 | 0.0001 | 0.9621 | 0.0173 | 0.0000 | 0.9997
with microcomputer | 0.8250 | 0.0646 | 0.0020 | 0.9279 | 0.0783 | 0.0013 | 0.9851
Water Source and Toilet Facility
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
water source = spring, river, stream, etc | 0.0150 | -0.0013 | 0.0001 | 0.5478 | -0.0874 | 0.0004 | 1.0000
toilet facility = water sealed | 0.0146 | 0.0009 | 0.0001 | 0.5459 | 0.0649 | 0.0005 | 0.9984
water source = shared, tubed / piped well | 0.0150 | -0.0004 | 0.0000 | 0.5462 | -0.0289 | 0.0001 | 0.9970
Occupation Indicators
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
drivers and mobile plant operators | 0.1285 | 0.0142 | 0.0013 | 0.6538 | 0.1105 | -0.0006 |
office clerks | 0.1285 | 0.0188 | 0.0023 | 0.6523 | 0.1461 | -0.0007 |
machine operators and assemblers | 0.1285 | 0.0206 | 0.0028 | 0.6522 | 0.1606 | -0.0008 |
mining, construction and related trades workers | 0.1285 | 0.0146 | 0.0014 | 0.6522 | 0.1138 | -0.0004 |
general managers or managing proprietors | 0.1285 | 0.0093 | 0.0006 | 0.6516 | 0.0728 | -0.0001 |
fishermen | 0.1285 | 0.0113 | 0.0009 | 0.6483 | 0.0876 | 0.0001 | 1.0000
related associate professionals | 0.1285 | 0.0121 | 0.0010 | 0.6472 | 0.0941 | 0.0003 | 1.0000
with family member whose primary occupation = officials of government and special-interest organizations | 0.1285 | 0.0127 | 0.0012 | 0.6458 | 0.0990 | 0.0005 | 1.0000
metal, machinery and related trades workers | 0.1285 | 0.0115 | 0.0009 | 0.6454 | 0.0891 | 0.0004 | 1.0000
armed forces | 0.1285 | 0.0163 | 0.0019 | 0.6454 | 0.1271 | 0.0009 | 1.0000
personal and protective services workers | 0.1285 | 0.0079 | 0.0005 | 0.6434 | 0.0613 | 0.0003 | 0.9998
laborers in mining, construction, manufacturing and transport | 0.1285 | 0.0089 | 0.0006 | 0.6419 | 0.0697 | 0.0005 | 0.9992
customer service clerks | 0.1285 | 0.0107 | 0.0009 | 0.6405 | 0.0832 | 0.0008 | 0.9980
hunters and trappers | 0.1285 | -0.0719 | 0.0401 | 0.6402 | -0.5593 | 0.0395 | 0.9976
animal producers | 0.1285 | 0.0049 | 0.0002 | 0.6399 | 0.0378 | 0.0002 | 0.9973
models, salespersons and demonstrators | 0.1285 | 0.0069 | 0.0004 | 0.6399 | 0.0535 | 0.0004 | 0.9972
teaching professionals | 0.1285 | 0.0134 | 0.0015 | 0.6365 | 0.1041 | 0.0020 | 0.9904
sales and services elementary occupations | 0.1285 | 0.0065 | 0.0004 | 0.6360 | 0.0509 | 0.0005 | 0.9888
Education Variables
Variable | Posterior Inclusion Probability | Unconditional Posterior Mean | Unconditional Posterior Variance | Unconditional Sign Certainty Probability | Conditional Posterior Mean | Conditional Posterior Variance | Conditional Sign Certainty Probability
Proportion of family members currently attending school | 0.2639 | -0.0452 | 0.0071 | 0.7042 | -0.1713 | 0.0053 | 0.9907
Proportion of family members with no grade completed | 0.2639 | 0.0344 | 0.0064 | 0.6667 | 0.1305 | 0.0034 | 1.0000
Proportion of family members who are elementary undergraduates | 0.2639 | 0.0732 | 0.0158 | 0.7196 | 0.2773 | -0.0058 |
Proportion of family members who are elementary graduates | 0.2639 | 0.0988 | 0.0257 | 0.7311 | 0.3745 | -0.0261 |
Proportion of family members who are high school undergraduates | 0.2639 | 0.1325 | 0.0421 | 0.7408 | 0.5021 | -0.0341 |
Proportion of family members who are high school graduates | 0.2639 | 0.1497 | 0.0535 | 0.7412 | 0.5672 | -0.0640 |
Proportion of family members who are college undergraduates | 0.2639 | 0.1937 | 0.0878 | 0.7434 | 0.7340 | -0.0527 |
Proportion of family members who are college graduates | 0.2639 | 0.1941 | 0.0912 | 0.7398 | 0.7355 | 1.8868 | 0.7189
Proportion of family members who are post graduates | 0.2639 | 0.2101 | 0.6209 | 0.6051 | 0.7961 | 0.0053 | 0.9907
Housing Characteristics
                                                                   Posterior    Unconditional                         Conditional
                                                                   Inclusion    Posterior  Posterior  Sign Certainty  Posterior  Posterior  Sign Certainty
Variable                                                           Probability  Mean       Variance   Probability     Mean       Variance   Probability
building type = single house                                       0.0136      -0.0009     0.0001     0.5447         -0.0688     0.0004     0.9994
roof and walls made of strong materials                            0.0136       0.0007     0.0000     0.5443          0.0531     0.0003     0.9984
tenure status = own house and lot; or owner-like possession of     0.0136       0.0002     0.0000     0.5431          0.0150     0.0000     0.9914
  house and lot
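The two sets of posterior means in the table above can be cross-checked against each other. In Bayesian model averaging, the unconditional posterior mean of a coefficient is commonly the posterior inclusion probability times the conditional posterior mean, because the coefficient counts as zero in every model that excludes the variable. That identity is not stated in this appendix, so the sketch below treats it as an assumption and only verifies that the Housing Characteristics figures agree with it to four decimal places. The function name `implied_unconditional_mean` is hypothetical.

```python
# Rows copied from the Housing Characteristics table:
# (posterior inclusion probability, unconditional posterior mean,
#  conditional posterior mean)
rows = {
    "building type = single house": (0.0136, -0.0009, -0.0688),
    "roof and walls made of strong materials": (0.0136, 0.0007, 0.0531),
    "tenure status = own house and lot": (0.0136, 0.0002, 0.0150),
}


def implied_unconditional_mean(pip, conditional_mean):
    """Assumed BMA identity: E[b | data] = PIP * E[b | data, b included]."""
    return pip * conditional_mean


for name, (pip, reported, conditional) in rows.items():
    implied = implied_unconditional_mean(pip, conditional)
    # The table is rounded to four decimals, so compare at that precision.
    agrees = abs(implied - reported) < 5e-5
    print(f"{name}: reported {reported:+.4f}, implied {implied:+.4f}, agrees={agrees}")
```

The same check applies to the other variable groups; the small posterior inclusion probabilities explain why the unconditional means are close to zero even where the conditional means are not.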
Community Characteristics
                                                                   Posterior    Unconditional                         Conditional
                                                                   Inclusion    Posterior  Posterior  Sign Certainty  Posterior  Posterior  Sign Certainty
Variable                                                           Probability  Mean       Variance   Probability     Mean       Variance   Probability
There is a high school                                             0.0167      -0.0004     0.0000     0.5509         -0.0255     0.0000     1.0000
There is Landline Telephone System/Calling Station                 0.0167       0.0004     0.0000     0.5498          0.0251     0.0001     0.9997
There is a town city hall/provincial capitol                       0.0167       0.0005     0.0000     0.5497          0.0272     0.0001     0.9996
There is a cellular phone signal                                   0.0167      -0.0004     0.0000     0.5496         -0.0227     0.0000     0.9995
Commercial establishments inside the barangay                      0.0167       0.0000     0.0000     0.5483          0.0001     0.0000     0.9946
Auto repair shops inside the barangay                              0.0167       0.0000     0.0000     0.5482          0.0018     0.0000     0.9941
There is a barangay health center                                  0.0167       0.0002     0.0000     0.5479          0.0147     0.0000     0.9921
Barangay has street pattern                                        0.0167       0.0002     0.0000     0.5476          0.0142     0.0000     0.9901
There is a church/chapel                                           0.0167       0.0003     0.0000     0.5474          0.0180     0.0001     0.9878
Barangay is part of the town/city proper                           0.0167      -0.0002     0.0000     0.5471         -0.0121     0.0000     0.9854
Appendix 6A. Explanatory Variables in the logit model for Non-NCR households
Variables
number of television sets                                                   n_tv
proportion of members 0-14 years old                                        p_mem_0_14
natural logarithm of family size                                            ln_fam_size2
number of telephone/cellular phone                                          n_phone
number of refrigerators/freezers                                            n_ref
roof and walls made of strong materials                                     house_strong_3
number of washing machines                                                  n_wash
Proportion of family members who are college graduates                      p2_educ_cg
Proportion of family members who are high school graduates                  p2_educ_hsg
Proportion of family members who are college undergraduates                 p2_educ_cu
number of motorcycles                                                       n_motor
with family member whose basis of payment = monthly                         w_pb2_month
with family member who is an overseas contract worker                       ofi_w_ocw
CALABARZON                                                                  region_41
mining, construction and related trades workers                             w_occup2_71
number of VTR/VHS/VCD/DVD                                                   n_vtr
drivers and mobile plant operators                                          w_occup2_83
general managers or managing proprietors                                    w_occup2_13
with sala set                                                               w_sala
Caraga                                                                      region_16
Proportion of family members who are high school undergraduates             p2_educ_hsu
Northern Mindanao                                                           region_10
Fishermen                                                                   w_occup2_64
Number of commercial establishments in the barangay                         q_6a
Autonomous Region in Muslim Mindanao                                        region_15
Proportion of family members with no grade completed                        p2_educ_ngc
laborers in mining, construction, manufacturing and transport               w_occup2_93
machine operators and assemblers                                            w_occup2_82
Central Visayas                                                             region_07
building type = single house                                                bldg_single
sales and services elementary occupations                                   w_occup2_91
office clerks                                                               w_occup2_41
with family member whose nature of employment = short-term or seasonal      w_ne2_s_term
  or casual job/business/unpaid family work
proportion of family members employed                                       p2_mem_emp
With electricity in the building/house                                      w_elec
number of cars/jeep                                                         n_car
water source = shared, tubed / piped well                                   ws_s_well
water source = spring, river, stream, etc.                                  ws_spring
age of household head                                                       h_age2
household type = single family                                              hh_type_s_fam
tenure status = own house and lot; or owner-like possession of house        ts_oh_ol
  and lot
related associate professionals                                             w_occup2_34
number of radio                                                             n_radio
metal, machinery and related trades workers                                 w_occup2_72
Number of auto repair shop, vulcanizing shop, electronic repair shop, or    q_11a
  other repair shops in the barangay
with family member whose class of worker = employer in own                  w_cw2_employer
  family-operated farm or business
farmers and other plant growers                                             w_occup2_61
Proportion of family members currently attending school                     p2_mem_sch
number of dining sets                                                       n_dining
models, salespersons and demonstrators                                      w_occup2_52
armed forces                                                                w_occup2_01
with family member whose primary occupation = officials of government       w_occup2_11
  and special-interest organizations
Proportion of family members who are elementary graduates                   p2_educ_eg
with stereo / CD player                                                     w_stereo
Number of establishments offering personal services like restaurants,       q_12a
  cafeteria, etc., in the barangay
Zamboanga Peninsula                                                         region_09
Eastern Visayas                                                             region_08
Supervisors                                                                 w_occup2_14
personal and protective services workers                                    w_occup2_51
other craft and related trades workers                                      w_occup2_74
Number of recreational establishments, outside the barangay but within      q_7b
  2 km
Appendix 6B. Explanatory Variables in the logit model for NCR households
Variables
With family member whose class of worker = employer in own                  w_cw2_employer
  family-operated farm or business
proportion of family members employed                                       p2_mem_emp
other occupations not classifiable                                          w_occup2_09
number of washing machines                                                  n_wash
Do farmers, farm laborers, fishermen, loggers, and forest product           q_5
  gatherers constitute more than half of the population 10 years old
  and over (yes=1)
Proportion of family members who are elementary undergraduates              p2_educ_eu
Town city hall/provincial capitol indicator                                 q_4a
sales and services elementary occupations                                   w_occup2_91
number of telephone/cell phone                                              n_phone
Number of households dwelling in private land which they do not             q_13c
  own except in danger areas
Barangay street pattern indicator                                           q_2
number of microwave oven                                                    n_oven
toilet facility = water sealed                                              tf_w_sealed
cemetery indicator                                                          q_4d
precision, handicraft, printing and related trades workers                  w_occup2_73
natural logarithm of family size                                            ln_fam_size2
household type = single family                                              hh_type_s_fam
High school indicator                                                       q_4g
Poblacion/City District indicator                                           q_1c