12th National Convention on Statistics (NCS)
EDSA Shangri-La Hotel
October 1-2, 2013

NEW PROXY MEANS TEST (PMT) MODELS: IMPROVING TARGETING OF THE POOR FOR SOCIAL PROTECTION [1]
by Dr. Dennis S. Mapa [2] and Prof. Manuel Leonard F. Albis [3]

For additional information, please contact:

Author’s name: Dennis S. Mapa
Designation: Associate Professor
Affiliation: School of Statistics, University of the Philippines, Diliman
Address: School of Statistics Building, Ramon Magsaysay Avenue, U.P. Diliman, Quezon City
Tel. no.: (02) 9280881
E-mail: [email protected]

Author’s name: Manuel Leonard F. Albis
Designation: Assistant Professor
Affiliation: School of Statistics, University of the Philippines, Diliman
Address: School of Statistics Building, Ramon Magsaysay Avenue, U.P. Diliman, Quezon City
Tel. no.: (02) 9280881

The National Household Targeting System for Poverty Reduction (NHTS-PR) of the Department of Social Welfare and Development (DSWD) is a system for identifying poor households. The system guarantees the generation and establishment of a socio-economic database of poor households. The NHTS-PR uses the Proxy Means Test (PMT) as the methodology for estimating the per capita income of households based on a set of verifiable indicators that are difficult to manipulate. The PMT model is a statistical method used to predict the income of a household from observable characteristics that correlate with income but are easier to measure. In lieu of the household's actual per capita income, its predicted per capita income from the PMT model is compared with the official poverty threshold computed by the National Statistical Coordination Board (NSCB).
The household is classified as poor when the predicted per capita income from the PMT model is less than or equal to the official poverty threshold; otherwise, the household is classified as non-poor. Classification as poor under the PMT model then serves as one of the eligibility criteria for the Conditional Cash Transfer (CCT) program and other government poverty reduction programs. While the observable characteristics used in the PMT model are relatively easier to collect, they are also less than perfect correlates of income. Thus, the PMT model risks misclassifying poor households as non-poor (exclusion error) and non-poor households as poor (inclusion error). Reducing the exclusion and inclusion errors, and in the process increasing the number of households correctly classified (as poor or non-poor), is one of the primary objectives of the PMT model. In preparation for the reassessment of the NHTS-PR database for 2013-2014, new and better PMT models were developed for the DSWD. Four innovations are introduced in these new PMT models with the objective of reducing the exclusion and inclusion error rates: the addition of variables that measure community characteristics deemed useful in identifying poor and non-poor households; the use of restricted regression in estimating the model’s coefficients; the use of the lower limit of a prediction interval, instead of the point estimate, in predicting the household’s per capita income; and the use of two models (a least-squares regression model and a logit model) to identify poor and non-poor households. Within-sample evaluation, using data from the 2009 Family Income and Expenditure Survey (FIES), shows that the new PMT models have lower exclusion and inclusion errors than the current PMT model.
[1] The paper benefitted from discussions with and inputs from Joseph Capuno of the UP School of Economics and from Sharon Piza, Rashiel Velarde, Nazmul Chaudhury and Shanna Elaine Rogan of the World Bank’s Social Protection Unit. All errors and omissions remain the authors’ responsibility.
[2] Associate Professor and Director for Research, School of Statistics and Affiliate Associate Professor, School of Economics, University of the Philippines, Diliman and Consultant, DSWD. Email: [email protected]
[3] Assistant Professor, School of Statistics, University of the Philippines Diliman and Consultant, DSWD.

I. Introduction

The National Household Targeting System for Poverty Reduction (NHTS-PR) of the Department of Social Welfare and Development (DSWD) is a system for identifying who and where the poor households are. The system guarantees the generation and establishment of a socio-economic database of poor households. The targeting system employed by the NHTS-PR is similar to those that have succeeded in Latin American countries in effectively distributing social assistance and social protection programs to the poorest of the poor.

The NHTS-PR uses the Proxy Means Test (PMT) as the methodology for estimating the per capita income of households based on a set of verifiable indicators that are not easy to manipulate. The PMT model is a statistical method used to predict the income of a household from observable characteristics that correlate with income but are easier to measure. In lieu of the household's actual per capita income, its predicted per capita income from the PMT model is compared with the official poverty threshold computed by the National Statistical Coordination Board (NSCB). [4] The household is classified as poor when the predicted per capita income from the PMT model is less than or equal to the official poverty threshold; otherwise, the household is classified as non-poor.
Classification as poor under the PMT model then serves as one of the eligibility criteria for the CCT program (and other government poverty reduction programs). While the observable characteristics used in the PMT model are relatively easier to collect, they are also less than perfect correlates of income. Thus, the PMT model risks misclassifying poor households as non-poor (exclusion error) and non-poor households as poor (inclusion error). Reducing the exclusion and inclusion errors, and in the process increasing the number of households correctly classified (as poor or non-poor), is one of the primary objectives of the PMT model.

In preparation for the reassessment of the NHTS-PR database for 2013-2014, the DSWD is reviewing the methodologies associated with the PMT models, in particular to find alternative (additional) variables that are useful in identifying poor and non-poor households, as well as exploring alternative estimation procedures that will help reduce the exclusion and inclusion error rates.

[4] The annual per capita poverty threshold in 2009 is estimated at Php 16,841 for the entire country. Using this value, the NSCB estimated poverty incidence at about 20.9% of families, equivalent to about 26.5% of the total population. For 2012, the first-semester (6 months) per capita poverty threshold is estimated at Php 9,385. Poverty incidence among families is about 22.3%, while poverty incidence among the population is estimated at 27.9%.

In identifying the appropriate PMT model, the focus is on the two misclassification errors produced by the model: exclusion and inclusion errors. The exclusion error rate is defined as the number of actual poor households classified as non-poor by the models divided by the total number of actual poor households, while the inclusion error rate is the number of actual non-poor households classified as poor divided by the total number of households classified as poor by the models.
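The two definitions can be checked with a short sketch. The function and argument names below are illustrative and not part of the paper; the counts are the unweighted non-NCR figures from Table 3 in Section V, and the sketch should reproduce the 6.8% exclusion and 13.9% inclusion rates reported there.

```python
def error_rates(poor_as_poor, poor_as_nonpoor, nonpoor_as_poor, nonpoor_as_nonpoor):
    """Return (exclusion_rate, inclusion_rate, share_correct) as fractions.

    Arguments are household counts from a 2x2 table of true versus
    predicted welfare level (names are hypothetical, chosen for clarity).
    """
    actual_poor = poor_as_poor + poor_as_nonpoor
    classified_poor = poor_as_poor + nonpoor_as_poor
    total = actual_poor + nonpoor_as_poor + nonpoor_as_nonpoor

    # Exclusion: actual poor classified as non-poor, over all actual poor.
    exclusion = poor_as_nonpoor / actual_poor
    # Inclusion: actual non-poor classified as poor, over all classified poor.
    inclusion = nonpoor_as_poor / classified_poor
    correct = (poor_as_poor + nonpoor_as_nonpoor) / total
    return exclusion, inclusion, correct


# Counts taken from Table 3 (non-NCR households, unweighted).
exclusion, inclusion, correct = error_rates(7827, 574, 1262, 24026)
print(f"exclusion = {exclusion:.1%}, inclusion = {inclusion:.1%}")
```

Note that the two rates use different denominators (actual poor versus classified poor), which is why they do not sum to the overall misclassification share.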
There is always a trade-off between the two errors: decreasing the exclusion error rate tends to increase the inclusion error rate and vice versa, all else being equal. However, researchers and policy makers usually favour a lower exclusion error rate, because exclusion errors are more costly than inclusion errors given the costs of re-certifying poor households (Araujo and Carraro, 2011).

II. Current Proxy Means Test (PMT) Model

The standard PMT model seeks to predict per capita income at the household level. The dependent variable is per capita income rather than per capita consumption because, in the Philippines, the official poverty statistics produced by the NSCB are based on income. In theory, income would be a better indicator of the purchasing capacity of the household. However, it has been observed that income is more volatile than consumption (expenditure). Moreover, in countries with a large informal sector such as the Philippines, the underreporting of income is an issue (Fernandez, 2012). In other countries, such as Indonesia (Alatas et al., 2010), Mongolia (Araujo and Carraro, 2011) and Sri Lanka (Narayan and Yoshida, 2005), per capita consumption or expenditure is used in the PMT models instead of per capita income.

The DSWD’s current PMT model uses the natural logarithm of the household’s per capita income as the dependent variable and the following as predictor variables: (i) ownership of appliances and/or assets, (ii) educational attainment (proportion of family members), (iii) family/household composition, (iv) employment/kind of business, (v) housing materials and access to basic services or amenities, and (vi) location (urban/rural and regional classification). The PMT model uses the 10 major occupational groups based on the Philippine Standard Occupational Classification (PSOC).
The proxy (predictor) variables for per capita income are identified from the Family Income and Expenditure Survey (FIES) for 2003 and the Labor Force Survey (LFS) for 2003, both collected and reported by the National Statistics Office (NSO). The two 2003 surveys can be merged, producing a sample of 42,094 households. Variables were selected for being good proxies of income (highly correlated with it), such as housing conditions, access to basic services, ownership of assets, family composition, education variables and specific variables related to the Conditional Cash Transfer (CCT) operation (Fernandez, 2007). Using the 2003 poverty threshold, this PMT model has an exclusion error rate of 30% and an inclusion error rate of 24% within sample for 2003. Moreover, about 89% of the 2003 FIES households were classified correctly as either poor or non-poor.

Fernandez (2012) performed simulations by applying the coefficients of the 2003 PMT model (based on the 2003 LFS and FIES data) to the 2009 FIES and LFS data sets. The predicted per capita income is then compared with the NSCB’s 2009 provincial poverty thresholds. The simulations show an exclusion error rate of 18% and an inclusion error rate of 45%. Moreover, the model’s overall prediction rate is 82%, with only 55% of the poor households correctly predicted.

Attempts have been made to improve the existing PMT model in view of the relatively high exclusion and inclusion error rates. A technical team from the National Household Targeting Office (NHTO) was assigned to improve the technical aspects of the PMT model. The team started with the existing PMT model, used in developing the initial NHTS-PR database in 2009, as the baseline and looked at ways of enhancing its predictive accuracy.
The expectation is that by improving predictive accuracy, the exclusion and inclusion error rates will be reduced to acceptable levels. Moreover, the technical team of the NHTO also considered sub-national and cluster PMT models as alternative models.

The alternative models also used the logarithm of the household's per capita income as the dependent variable and basically the same set of explanatory variables from the FIES and LFS, modifying only the major occupational groups based on the PSOC. The alternative models used either the 33 PSOC sub-major occupational groups or the 17 major industry groups of the Philippine Standard Industrial Classification (PSIC). A summary of the error rates is presented in table 1 for the best three models, using within-sample validation on the 38,400 households in the 2009 FIES. A projection for the total number of households in the country in 2009, using the sampling weights, is also reported. These models still carry relatively high exclusion and inclusion error rates. The figures in table 1 show that exclusion error rates range from 36% to 37%, while inclusion error rates range from 24% to 26%. Using the alternative PMT models will result in a large number of households (about 2.2 million) being misclassified (World Bank Technical Note, 2012).

The NHTO technical team also considered sub-national and cluster PMT models in the hope of lowering the exclusion and inclusion error rates. Several alternative PMT models are considered for urban and rural households and for households residing in Luzon, Visayas, Mindanao and the National Capital Region (NCR). The summary of the exclusion and inclusion rates, using within-sample validation for the households in the 2009 FIES, is provided in table 2 below. The results show that these sub-national and cluster models are no better than the national models in terms of the error rates. In particular, the exclusion error rates from some of these PMT models (e.g.
models for the NCR, Luzon and Urban households) are larger compared to the national model.

Table 1. Exclusion and Inclusion Error Rates of Alternative (National) PMT Models

Based on FIES 2009 Sample (38,400 Households)
Model   % Exclusion   % Inclusion   # Misclassified   # Correctly Classified   Total         % Correctly Classified
112     36.36         25.25         5,023             33,377                   38,400        86.92
233     36.80         25.28         5,051             33,349                   38,400        86.85
317     36.64         24.82         4,997             33,403                   38,400        86.99

With Weights Applied (18,451,414 Households)
Model   % Exclusion   % Inclusion   # Misclassified   # Correctly Classified   Total         % Correctly Classified
112     36.86         25.56         2,243,281         16,208,133               18,451,414    87.84
233     37.33         25.55         2,254,603         16,196,811               18,451,414    87.78
317     37.20         25.06         2,230,063         16,221,351               18,451,414    87.91

Source: NHTO, May 2012

Table 2. Exclusion and Inclusion Error Rates of Sub-National/Clustered PMT Models

Option            Model      R2       Misclassified HHs    Correctly Classified HHs   Total No.   Exclusion   Inclusion
                                      No.       %          No.        %               of HHs      Rate        Rate
Urbanity Models   Urban      0.7642   1,321     7.62       16,014     92.38           17,335      49.58       27.90
                  Rural      0.703    3,627     17.22      17,438     82.78           21,065      31.56       24.08
                  Overall             4,948     12.89      33,452     87.11           38,400
Cluster Models    NCR        0.7694   104       2.43       4,181      97.57           4,285       78.95       36.84
                  Luzon      0.7351   1,912     11.71      14,415     88.29           16,327      42.62       27.02
                  Visayas    0.7579   1,086     15.23      6,044      84.77           7,130       31.42       23.78
                  Mindanao   0.7305   1,866     17.51      8,792      82.49           10,658      29.99       24.29
Current Models    Urban      0.66     2,390     13.79      14,945     86.21           17,335      23.24       59.67
                  Rural      0.697    4,740     22.5       16,325     77.5            21,065      16.28       38.35
                  Overall             7,130     18.57      31,270     81.43           38,400

Source: NHTO, August 2012

III. Proposed Modifications/Improvements to the Current PMT Model

The PMT model at its current state has large exclusion and inclusion errors within sample (using 2003 and 2009 FIES and LFS database). The exclusion error rate is about 30% while the inclusion error rate is about 24%. These error rates produce high leakages.
This paper identifies three (3) major modifications that can improve the predictive power of the PMT models in identifying poor and non-poor households.

The first is to include additional explanatory variables in the regression model to increase its goodness of fit and thereby its predictive power within the sample data. Studies, particularly Mapa, Balisacan and Briones (2008), Balisacan (2005) and Balisacan and Pernia (2003), suggest that community (or Barangay) characteristics, such as the presence of infrastructure (road network, electricity, telephone and water system) and the presence and number of business establishments in the Barangay, are factors that explain the per capita income of households in that Barangay. Incorporating these variables into the PMT models will likely increase the models' goodness of fit and the likelihood of lower errors within sample. These Barangay characteristic variables are generated from the Census of Population and Housing (CPH). For this particular undertaking, the data from the 2007 CPH were incorporated in the models. [5]

The second proposed improvement to the current PMT model is restricting the least-squares estimation to the poorest 40% of the households in the sample, instead of using all households. The idea behind this proposal is that, in least-squares estimation, the estimated coefficients are likely to be biased and inconsistent because of endogenous variables in the model (arising from omitted variables, reverse causality and measurement error). The question is how these biased estimates affect the predicted per capita income of households. The regression results showed that the biased estimates tend to increase the predicted per capita income, all else being equal, and thus tend to misclassify poor households as non-poor, increasing the model’s exclusion error rate.
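The mechanics of the restricted estimation can be sketched with a toy example. Everything here is an assumption made for illustration: a single proxy variable instead of the paper's full variable set, ranking households by actual per capita income to define the poorest 40%, and synthetic data that are exactly linear (so the restricted and unrestricted fits coincide and only the subsetting step is being shown).

```python
def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def poorest_share(households, share=0.40):
    """Keep the poorest `share` of households, ranked by actual income.

    Ranking on actual per capita income is an assumption of this sketch.
    """
    ranked = sorted(households, key=lambda h: h["income"])
    keep = int(len(ranked) * share)
    return ranked[:keep]


# Synthetic households: one hypothetical proxy, income exactly linear in it.
households = [{"proxy": x, "income": 2.0 + 3.0 * x} for x in range(10)]

subsample = poorest_share(households)
intercept, slope = fit_simple_ols(
    [h["proxy"] for h in subsample],
    [h["income"] for h in subsample],
)
print(len(subsample), intercept, slope)
```

With real survey data the relationship is noisy, so coefficients estimated on the poorest 40% would generally differ from those estimated on the full sample; that difference is what the proposal relies on.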
In practice, there is a high cost when a poor household is misclassified as non-poor. While there is a review mechanism for the CCT program, a poor household misclassified as non-poor will surely have a hard time arguing its case. For one, going through the review process is costly (submission of documents, transportation expenses). These costs create a high barrier for a poor household incorrectly classified as non-poor, which will most likely accept its fate (resulting from errors in the model). This defeats the main objective of the program, which is to help poor households. Estimating a restricted model, using only the poorest 40% of the households, creates estimates biased in favour of identifying poor households. Simulations showed that exclusion errors are lower by about 15 to 20 percentage points when the poorest 40% of the households are used in the estimation rather than the entire data. [6]

[5] The 2010 CPH database was not yet available when the models were constructed.
[6] The reduction in the exclusion error rates also increased the inclusion error rates. The higher inclusion rates are addressed using a second model to further screen non-poor households.

The third proposed modification to the current PMT model concerns the process of predicting the household’s per capita income. The existing model uses the point estimate, $\hat{Y} = X\hat{\beta}$, to generate the per capita income of the households. The predicted per capita income is then compared with the per capita poverty threshold reported by the NSCB. If the predicted per capita income is less than or equal to the per capita poverty threshold, the household is classified as poor; otherwise, the household is non-poor. The problem is that a point estimate is subject to error and will generally differ from the actual per capita income.

To illustrate, take the annual per capita poverty threshold in 2009, estimated at Php 16,841 for the entire country. If the estimated per capita income for a particular household from the PMT model is, say, Php 17,000, the household will be classified as non-poor. However, we know the point estimate has an error. Now, what if we overestimated the household's per capita income by Php 300, that is, the actual per capita income is only Php 16,700? The PMT model then misclassifies an actual poor household as non-poor, increasing the model’s exclusion error. To correct for this, we propose that the lower limit of the 95% confidence/prediction interval be used as the estimated per capita income instead of the point estimate. In estimation, reporting a confidence interval estimate is better than reporting the point estimate. [7] Initial simulations showed that using the lower limit of the confidence interval, instead of the point estimate, reduces the exclusion error rates of the models by 4 to 5 percentage points.

IV. Explanatory Variables for the PMT Model to Predict the Household’s Per Capita Income

The Proxy Means Test (PMT) model is a statistical method used to predict the per capita income of a household from observable characteristics that correlate with income but are easier to measure. The literature suggests the following sets of variables as close proxies of income.

Needs indicators, e.g. household composition, family size, dependency ratio (sources: Family Income and Expenditure Survey (FIES) and Labor Force Survey (LFS));
Measures of current command over resources, e.g. assets (presence of television set, air conditioning unit, vehicle), housing characteristics/amenities (house structure, toilet facilities, water source, presence of electricity), presence of a family member who is an OFW, marital status, proportion of family members employed (sources: FIES and LFS);
Measures of potential/prospective command over resources, e.g.
employment status, educational attainment (source: LFS);
Measures of transfers from other people, organizations or government: community-level (Barangay) public facilities (e.g. roads/ports, water systems); since some of these local public amenities are also financed from local taxes, they may also indicate the development of the local economy (source: Census of Population and Housing (CPH));
Measures of economic opportunity or activity: community-level variables such as the number of banks, business enterprises, etc. (source: CPH);
Geographical location: regional indicator variables and an urban/rural indicator variable (source: FIES).

[7] While one could use the upper limit of the 95% confidence/prediction interval, this option is not considered because the aim is to capture more poor households.

Using the data from the 2009 FIES and LFS and the 2007 CPH, 139 variables are identified as potential explanatory variables for the household’s per capita income. Of these 139 variables, 45 are from the FIES, 53 from the LFS and 41 from the CPH. These variables are defined in appendix 1 and appendix 2. Note that the existing PMT model used only the 98 variables from the FIES and the LFS (variables in appendix 1). The proposed model is expected to provide a better fit (within sample) with the addition of the 41 variables from the CPH. This will likely improve the prediction performance of the proposed PMT model, thereby reducing the model’s exclusion and inclusion error rates.
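The lower-limit rule from the third modification in Section III can be illustrated with the Php 17,000 example. This is a minimal sketch under stated assumptions: the standard error of prediction (`prediction_se`) is a made-up number, a normal quantile is used in place of the t quantile (reasonable only for large samples), and the interval is applied directly on the peso scale rather than on whatever scale the model is actually fitted. Function names are illustrative.

```python
from statistics import NormalDist

POVERTY_THRESHOLD_2009 = 16_841  # Php, annual per capita (figure from the paper)


def lower_limit(point_estimate, prediction_se, level=0.95):
    """Lower limit of a two-sided interval around the point estimate.

    Uses a normal approximation; prediction_se is assumed to be given.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return point_estimate - z * prediction_se


def classify(predicted_income, threshold=POVERTY_THRESHOLD_2009):
    """Poor if predicted per capita income is at or below the threshold."""
    return "poor" if predicted_income <= threshold else "non-poor"


point = 17_000        # point estimate from the paper's illustration
se = 500              # hypothetical standard error of prediction

print(classify(point))                    # rule based on the point estimate
print(classify(lower_limit(point, se)))   # rule based on the lower limit
```

Because the lower limit is always below the point estimate, the second rule can only move households from non-poor to poor, which is consistent with the lower exclusion and higher inclusion error rates the paper reports.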
The PMT model is given by

$$Y_i = \beta_0 + X_i\beta + Z_i\delta + W_i\lambda + F_i\varphi + E_i\alpha + G_i\phi + \varepsilon_i, \qquad i = 1, 2, \ldots, n \tag{1}$$

where $Y_i$ is the per capita income of the ith household; X is a matrix of needs indicator variables; Z is a matrix of variables representing the measures of current command over resources; W is for measures of potential command over resources; F is for the variables representing transfers from other people, organizations or government; E is for the variables representing the measures of economic opportunity; and G is for the geographical indicator variables. The variable $\varepsilon_i$ is the random error term, assumed to be normally distributed with mean 0 and constant variance $\sigma^2_\varepsilon$. The vectors $\beta$, $\delta$, $\lambda$, $\varphi$, $\alpha$ and $\phi$ are the structural parameters and are estimated using the least-squares estimation procedure.

V. Empirical Results and Evaluation of Errors

After an exhaustive analysis of the data, two PMT models were developed: one PMT model for households in the National Capital Region (NCR) and another for households outside the NCR (non-NCR). Having two models, instead of one for all households, is another innovation introduced in this undertaking. The motivation for using two models is to minimize the error rates. Evaluation of the initial models (a single model for all households) showed large exclusion errors for the NCR households, ranging from 70% to 80%. The large errors suggest that poor households in the NCR may have different characteristics from poor households outside the NCR.

The PMT model for non-NCR households has 76 significant explanatory variables. The list of explanatory variables is provided in appendix 3. The variable that explains the largest variation in the household’s per capita income is family size (in natural logarithm). The R-squared, or coefficient of determination, for this model is about 39%. [8] The PMT model for NCR households has 39 significant explanatory variables. The list of variables is provided in appendix 4.
As in the non-NCR PMT model, family size (in natural logarithm) is the variable that explains the largest variation in the household’s per capita income. The NCR PMT model has an R-squared of about 50%.

Moreover, a robustness procedure showed that 73 of the 139 explanatory variables have a sign certainty probability of at least 97.5% using the Bayesian Averaging of Classical Estimates (BACE) approach (Sala-i-Martin, 2004), suggesting that these variables are robust determinants of per capita income. A detailed discussion of BACE is provided in appendix 5. The 73 robust variables are among the 76 variables included in the final PMT model for non-NCR households.

Second Model to Lower Inclusion Rate

The three (3) modifications to the existing PMT model, namely (1) inclusion of additional explanatory variables from the CPH to measure community (Barangay) characteristics; (2) estimation of the regression coefficients using the poorest 40% of the households instead of all households in the sample; and (3) use of the lower limit of the 95% prediction interval of the per capita income instead of the point estimate, contributed to the large reduction in the exclusion error rates. In other words, more actual poor households were correctly classified by the models. One drawback, however, is that making the models biased towards identifying poor households also increased their inclusion error rates (the number of actual non-poor households classified as poor divided by the total number of households classified as poor). Initial estimates suggest that, relative to the existing model, inclusion error rates increased from about 24% to about 45-50% for non-NCR households and from about 37% to about 75% for NCR households.
To remedy the problem of high inclusion error rates of the PMT models, a second set of models is constructed to further screen non-poor households misclassified as poor by the first set of models. The second set consists of logit models (one each for non-NCR and NCR households).

[8] The detailed results of the final model can be made available upon request from the authors.

The second model, used in lowering the inclusion error, is the logit model. Consider the linear model

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i, \qquad i = 1, 2, \ldots, n \tag{2}$$

where the variable of interest, $y_i$, takes the value 1 if the household is non-poor and the value 0 if the household is poor, and $X_1, X_2, \ldots, X_k$ represent the determinants of households being non-poor. Note that $y_i$ is a Bernoulli random variable with probability of success $\pi$, or $y_i \sim Be(\pi)$. The problem in economics is that $\pi$ is most likely unknown and not constant across observations. The solution is to make $\pi$ dependent on the vector of explanatory variables X. Thus, we have

$$y_i \sim Be\left(F(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki})\right) \tag{3}$$

where the function $F(\cdot)$ maps $\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$ onto the interval [0,1]. Thus, instead of considering the precise value of y, we are now interested in the probability that y = 1, given the outcome of $\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$, or

$$\Pr(y_i = 1 \mid \beta, x_i) = F(x_i'\beta) \tag{4}$$

where F is a continuous, strictly increasing function that returns a value ranging from 0 to 1. The choice of F determines the type of binary model. Given such a specification, the parameters of this model (the betas) can be estimated using the method of maximum likelihood. Once the identifiable parameters are established, the likelihood function is written as

$$L(y; \beta) = \prod_{i=1}^{n} \left[F(x_i'\beta)\right]^{y_i} \left[1 - F(x_i'\beta)\right]^{1 - y_i} \tag{5}$$

where $F(\cdot)$ is a cumulative distribution function.
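As a concrete instance of the likelihood in equation (5), the sketch below evaluates its logarithm for the logistic choice of F. The data and coefficient values are hypothetical, chosen only to exercise the functions; an actual fit would maximize this quantity over the coefficients with a numerical optimizer, which is not shown.

```python
import math


def logistic_cdf(z):
    """F(z) = exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(beta, rows, ys):
    """Log of the Bernoulli likelihood: sum of y*log(F) + (1-y)*log(1-F).

    Each row is a tuple of regressors whose first entry is 1 (the intercept).
    """
    total = 0.0
    for x, y in zip(rows, ys):
        p = logistic_cdf(sum(b * xj for b, xj in zip(beta, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Hypothetical data: y = 1 means non-poor, y = 0 means poor.
rows = [(1.0, -1.0), (1.0, 0.0), (1.0, 2.0)]
ys = [0, 1, 1]

print(log_likelihood((0.0, 0.0), rows, ys))  # all probabilities equal 0.5
print(log_likelihood((0.0, 1.0), rows, ys))  # higher, so a better fit
```

Working with the log-likelihood rather than the product itself avoids numerical underflow when the number of households is large.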
If $F(\cdot)$ is the logistic distribution, then

$$F(x_i'\beta) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \tag{6}$$

Marginal Effects

Interpretation of the coefficient values is complicated by the fact that estimated coefficients from a binary model cannot be interpreted as marginal effects on the dependent variable. The marginal effect of $X_j$ on the conditional probability is given by

$$\frac{\partial E(y \mid X, \beta)}{\partial X_j} = f(x_i'\beta)\,\beta_j \tag{7}$$

where $f(\cdot)$ is the density function corresponding to $F(\cdot)$. Here, $\beta_j$ is weighted by a factor $f(\cdot)$ that depends on the values of all the variables in X. The direction of the effect of a change in $X_j$ depends only on the sign of the coefficient $\beta_j$. Positive values of $\beta_j$ imply that increasing $X_j$ will increase the probability of the response, while negative values of $\beta_j$ imply that it will decrease the probability of the response. The marginal effect is usually estimated using the averages of the explanatory variables (X) as the representative values.

Average Marginal Effect

Some researchers, particularly Bartus (2005), argue that it is preferable to compute the average marginal effect, that is, the average of each individual’s marginal effect. The marginal effect computed at the average X is different from the average of the marginal effects computed at the individual X.

The results of the two logit models for non-NCR and NCR households are given in appendix 6A and appendix 6B, respectively. [9] Using the logit model, if the estimated probability of the ith household being non-poor, given X, is greater than or equal to the cut-off value c, the household is tagged as non-poor. [10] Otherwise, the household is considered poor.

Applying the two models, the linear regression model and the logit model, yields lower exclusion and inclusion errors for non-NCR and NCR households, as reported in tables 3 and 4. The exclusion and inclusion error rates for non-NCR households are 6.8% and 13.9%, respectively.
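The distinction between the marginal effect at the mean and the average marginal effect, and the cut-off rule for tagging households, can be sketched as follows. The coefficients and data are hypothetical; only the 0.40 cut-off comes from the paper. The density uses the logistic identity f(z) = F(z)(1 - F(z)).

```python
import math

CUTOFF = 0.40  # cut-off value reported in the paper's footnote


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Density of the logistic distribution: f(z) = F(z) * (1 - F(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def index(beta, x):
    """Linear index x'beta; x includes a leading 1 for the intercept."""
    return sum(b * xj for b, xj in zip(beta, x))


def marginal_effect_at_mean(beta, rows, j):
    """Equation (7) evaluated once, at the column means of X."""
    means = [sum(r[c] for r in rows) / len(rows) for c in range(len(beta))]
    return logistic_pdf(index(beta, means)) * beta[j]


def average_marginal_effect(beta, rows, j):
    """Equation (7) evaluated for every household, then averaged."""
    return sum(logistic_pdf(index(beta, r)) for r in rows) / len(rows) * beta[j]


def tag(beta, x, cutoff=CUTOFF):
    """Non-poor if Pr(non-poor | x) is at or above the cut-off, else poor."""
    return "non-poor" if logistic_cdf(index(beta, x)) >= cutoff else "poor"


beta = (0.0, 1.0)                                # hypothetical coefficients
rows = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]     # hypothetical households

mem = marginal_effect_at_mean(beta, rows, j=1)
ame = average_marginal_effect(beta, rows, j=1)
print(mem, ame, [tag(beta, r) for r in rows])
```

In this toy case the effect at the mean is larger than the average effect because the logistic density peaks at an index of zero, which is exactly where the column means fall.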
For NCR households, the exclusion and inclusion error rates are 19.3% and 10.7%, respectively.

[9] The detailed results of the two logit models can be requested from the authors.
[10] Simulations indicate that the optimal cut-off value is 0.40.

Table 3. Expected-Prediction Table for Non-NCR Households (Lower Limit Estimate)

                            True Welfare Level
Predicted Welfare Level     Poor                  Non-Poor
                            %         n           %         n
Poor                        93.2      7,827       5.0       1,262
Non-Poor                    6.8       574         95.0      24,026
Total                       100.0     8,401       100.0     25,288

The exclusion error rate is 6.8% while the inclusion error rate is 13.9%.

Table 4. Expected-Prediction Table for NCR Households (Lower Limit Estimate)

                            True Welfare Level
Predicted Welfare Level     Poor                  Non-Poor
                            %         n           %         n
Poor                        80.7      92          0.3       11
Non-Poor                    19.3      22          99.7      4,160
Total                       100.0     114         100.0     4,171

The exclusion error rate is 19.3% while the inclusion error rate is 10.7%.

Using the PMT models, the estimated poverty incidence among families in 2009 is about 22.24%, compared with the official poverty estimate of 20.9% from the NSCB. [11]

Table 5. Estimated Poverty Incidence using the PMT Models

                            Un-weighted                                   Weighted
Area                        N         Incidence (%)   Number of Poor     N             Incidence (%)   Number of Poor
                                                      Households                                       Households*
NCR                         4,285     2.40            103                2,460,993     2.42            59,556
All Regions Outside NCR     33,689    26.98           9,088              15,830,871    25.32           4,008,377
Total                       37,974    24.20           9,191              18,291,864    22.24           4,068,111

* May not sum up to the total predicted poor households due to rounding off errors.

The figures in tables 6A and 6B are the estimated numbers of households that will be misclassified by the PMT models as poor, given that these households are non-poor. These households will be part of the inclusion error. The figures are reported by income decile.

[11] The PMT models are not constructed for the purpose of estimating the overall poverty incidence. The purpose of this comparison is merely to assess the prediction performance of the models.
The authors are grateful to Dean Mon Clarete of the University of the Philippines School of Economics, a member of the National Technical Advisory Group (NTAG) of the DSWD, for this suggestion.

Table 6A. Estimated Number of Non-NCR Households (Weighted) Misclassified as Poor

National Income    Number of Households               Percentage
Decile             Total          Inclusion Error     per Decile
1                  1,798,387      7,931               0.44
2                  1,792,055      38,780              2.16
3                  1,769,891      77,123              4.36
4                  1,736,647      121,209             6.98
5                  1,660,658      119,185             7.18
6                  1,556,175      91,891              5.90
7                  1,480,186      58,362              3.94
8                  1,420,029      24,997              1.76
9                  1,370,953      6,673               0.49
10                 1,244,306      875                 0.07
Total              15,830,871     546,971             3.46

For non-NCR households, the bulk of these misclassified households are in the 4th to 6th income deciles (using the 2009 figures). For NCR households, all misclassified households are in the 5th to 9th income deciles. Given the income profile of these households, some of the misclassifications can be corrected naturally (a “non-poor” household will not enrol itself in the program, particularly if it is in a higher income decile) or administratively (the DSWD can detect these households). Either way, the final inclusion errors may still be lower than the reported figures.

Table 6B. Estimated Number of NCR Households (Weighted) Misclassified as Poor

National Income    Number of Households               Percentage
Decile             Total          Inclusion Error     per Decile
1                  16,981         0                   0.00
2                  22,641         0                   0.00
3                  47,251         0                   0.00
4                  81,705         0                   0.00
5                  168,578        847                 0.50
6                  274,893        2,127               0.77
7                  360,535        2,634               0.73
8                  419,845        733                 0.17
9                  470,296        489                 0.10
10                 598,021        0                   0.00
Total              2,460,993      6,830               0.28

VI. Conclusion

This paper proposed PMT models that lower the risks of misclassifying poor households as non-poor (exclusion error) and non-poor households as poor (inclusion error). In the construction of the models, modifications are made to sharpen the targeting process.
Learning from the experience gained in using the existing PMT model, several improvements are introduced in the new PMT models: additional explanatory variables to capture community (Barangay) characteristics; an alternative estimation procedure that deliberately biases the estimates toward identifying poor households; and an adjustment in the process of predicting the household’s per capita income, using the lower limit of the 95% prediction interval instead of the point estimate. These modifications are introduced to minimize the models’ exclusion error rates, so that the new PMT models capture more poor households. To minimize the inclusion error rates, a logit model is then introduced to weed out the non-poor households misclassified as poor in the first model. The introduction of this second “screener” model lowers the inclusion error rate to a comfortable level. Using a separate model solely for NCR households also helped in lowering the error rates.

References

Alatas, V., Banerjee, A., Olken, B. and Tobias, J. (2010), “Targeting the Poor: Evidence from a Field Experiment in Indonesia” Technical Report prepared for the World Bank.

Araujo, C. and Carraro, L. (2011), “A proxy-means test exercise for the selection of beneficiaries of poverty targeted programs in Mongolia” Technical Report prepared for the Ministry of Social Welfare, Mongolia.

Balisacan, A. (2005), “In Search of Proxy Indicators for Poverty Targeting: Toward a Framework for a Poverty Indicator and Monitoring System” Paper prepared for the National Statistics Office (NSO) component in the UNDP-assisted project: Strengthening Institutional Mechanism for the Convergence of Poverty Alleviation Efforts.

Balisacan, A.M. and Pernia, E.M. (2003), “Poverty, Inequality and Growth in the Philippines”, in E.M. Pernia and A.B. Deolalikar (eds.), Poverty, Growth and Institutions in Developing Asia. Hampshire, England: Palgrave Macmillan Publishers.

Bartus, T.
(2005), “Estimation of marginal effects using margeff” The Stata Journal, Vol. 5, No. 3, pp. 309-329.

Capuno, J. (2012), “Enhancing the DSWD NHTSPR Proxy Means Test Model: A review of current efforts and recommendations” Technical Note, World Bank, Manila.

Fernandez, L. (2007), “Technical Note on Estimation of a Proxy Means Test Model (PMT) for Conditional Cash Transfer (CCT) Pilot Program in the Philippines” prepared for the Department of Social Welfare and Development (DSWD), Philippines.

Fernandez, L. (2012), “Technical Note on Estimations of Proxy Means Test Model (PMT) of the National Household Targeting System for Poverty Reduction (NHTS-PR) with latest household surveys: FIES-LFS 2009 in the Philippines” prepared for the World Bank, Manila and the DSWD, Philippines.

Mapa, D., Balisacan, A. and Briones, K. (2008), “Robust Determinants of Income Growth in the Philippines” in Joseph J. Capuno and Aniceto C. Orbeta (editors), Human Capital and Development in the Philippines. Philippine Institute for Development Studies (PIDS), Makati City.

Narayan, A. and Yoshida, N. (2005), “Proxy Means Test for Targeting Welfare Benefits in Sri Lanka” Technical Report No. SASPR-7, World Bank, Washington, D.C.

Appendix 1. Explanatory Variables from the FIES and LFS

No.  Variable Name   Source  Description
1    house_strong_3  FIES    roof and walls made of strong materials
2    bldg_single     FIES    building type = single house
3    ts_oh_ol        FIES    tenure status = own house and lot; or owner-like possession of house and lot
4    ts_squatter     FIES    tenure status = squatter
5    tf_w_sealed     FIES    toilet facility = water sealed
6    tf_none         FIES    toilet facility = none
7    w_elec          FIES    with electricity in the building/house
8    ws_o_faucet     FIES    water source = own use, faucet, community water system
9    ws_s_well       FIES    water source = shared, tubed/piped well
10   ws_dug_well     FIES    water source = dug well
11   ws_spring       FIES    water source = spring, river, stream, etc.
12   w_radio         FIES    family with radio
13   w_tv            FIES    with television set
14   w_vtr           FIES    with VTR/VHS/VCD/DVD
15   w_stereo        FIES    with stereo/CD player
16   w_ref           FIES    with refrigerator/freezer
17   w_wash          FIES    with washing machine
18   w_aircon        FIES    with air conditioner
19   w_sala          FIES    with sala set
20   w_dining        FIES    with dining set
21   w_car           FIES    with car/jeep
22   w_phone         FIES    with telephone/cellphone
23   w_pc            FIES    with microcomputer
24   w_oven          FIES    with microwave oven
25   w_motor         FIES    with motorcycle/tricycle
26   n_hh            FIES    number of households in the housing unit
27   hh_type_s_fam   FIES    household type = single family
28   ln_fam_size2    LFS     natural logarithm of family size
29   p_mem_0_14      LFS     proportion of members 0-14 years old
30   w_helper        LFS     family with domestic helper
31   p2_educ_ngc     LFS     proportion of family members with no grade completed
32   p2_educ_eu      LFS     proportion of family members who are elementary undergraduates
33   p2_educ_eg      LFS     proportion of family members who are elementary graduates
34   p2_educ_hsu     LFS     proportion of family members who are high school undergraduates
35   p2_educ_hsg     LFS     proportion of family members who are high school graduates
36   p2_educ_cu      LFS     proportion of family members who are college undergraduates
37   p2_educ_cg      LFS     proportion of family members who are college graduates
38   p2_educ_pg      LFS     proportion of family members who are post graduates
39   p2_mem_sch      LFS     proportion of family members currently attending school
40   h_sex2_m        LFS     household head = male
41   h_age2          LFS     age of household head
42   h_ms2_single    LFS     household head = single
43   p2_mem_emp      LFS     proportion of family members employed
44   w_occup2_11     LFS     with family member whose primary occupation = officials of government and special-interest organizations
45   w_occup2_12     LFS     with family member whose primary occupation = corporate executives and specialized managers
46   w_occup2_13     LFS     general managers or managing proprietors
47   w_occup2_14     LFS     supervisors
48   w_occup2_21     LFS     physical, mathematical and engineering science professionals
49   w_occup2_22     LFS     life science and health professionals
50   w_occup2_23     LFS     teaching professionals
51   w_occup2_24     LFS     other professionals
52   w_occup2_31     LFS     physical science and engineering associate professionals
53   w_occup2_32     LFS     life science and health professional associates
54   w_occup2_33     LFS     teaching associate professionals
55   w_occup2_34     LFS     related associate professionals
56   w_occup2_41     LFS     office clerks
57   w_occup2_42     LFS     customer service clerks
58   w_occup2_51     LFS     personal and protective services workers
59   w_occup2_52     LFS     models, salespersons and demonstrators
60   w_occup2_61     LFS     farmers and other plant growers
61   w_occup2_62     LFS     animal producers
62   w_occup2_63     LFS     forestry and related workers
63   w_occup2_64     LFS     fishermen
64   w_occup2_65     LFS     hunters and trappers
65   w_occup2_71     LFS     mining, construction and related trades workers
66   w_occup2_72     LFS     metal, machinery and related trades workers
67   w_occup2_73     LFS     precision, handicraft, printing and related trades workers
68   w_occup2_74     LFS     other craft and related trades workers
69   w_occup2_81     LFS     stationary-plant and related operators
70   w_occup2_82     LFS     machine operators and assemblers
71   w_occup2_83     LFS     drivers and mobile plant operators
72   w_occup2_91     LFS     sales and services elementary occupations
73   w_occup2_92     LFS     agricultural, forestry and fishery laborers
74   w_occup2_93     LFS     laborers in mining, construction, manufacturing and transport
75   w_occup2_01     LFS     armed forces
76   w_occup2_09     LFS     other occupations not classifiable
77   w_cw2_employer  LFS     with family member whose class of worker = employer in own family-operated farm or business
78   w_ne2_s_term    LFS     with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work
79   w_pb2_month     LFS     with family member whose basis of payment = monthly
80   ofi_w_ocw       LFS     with family member who is an overseas contract worker
81   urban           FIES    urbanity = urban
82   region_01       FIES    Ilocos Region
83   region_02       FIES    Cagayan Valley
84   region_03       FIES    Central Luzon
85   region_41       FIES    CALABARZON
86   region_42       FIES    MIMAROPA
87   region_05       FIES    Bicol Region
88   region_06       FIES    Western Visayas
89   region_07       FIES    Central Visayas
90   region_08       FIES    Eastern Visayas
91   region_09       FIES    Zamboanga Peninsula
92   region_10       FIES    Northern Mindanao
93   region_11       FIES    Southern Mindanao
94   region_12       FIES    Central Mindanao
95   region_13       FIES    National Capital Region
96   region_14       FIES    Cordillera Administrative Region
97   region_15       FIES    Autonomous Region of Muslim Mindanao
98   region_16       FIES    Caraga

Appendix 2. Explanatory Variables from the CPH

No.  Variable Name  Description
1    q_1a           Part of the town/city proper
2    q_1b           Former poblacion of the municipality
3    q_1c           Poblacion/city district
4    q_2            Street pattern indicator
5    q_3            Barangay is accessible to the national highway
6    q_4a           Town/city hall or provincial capitol indicator
7    q_4b           Church, chapel or mosque with religious service at least once a month
8    q_4c           Public plaza or park for recreation
9    q_4d           Cemetery indicator
10   q_4e           Market place or building where trading activities are carried out at least once a week
11   q_4f           Elementary school indicator
12   q_4g           High school indicator
13   q_4h           College/university indicator
14   q_4i           Public library indicator
15   q_4j           Hospital indicator
16   q_4k           Health center indicator
17   q_4l           Landline telephone system or calling station indicator
18   q_4m           Cellular phone signal indicator
19   q_4n           Postal service indicator
20   q_4o           Community waterworks system indicator
21   q_4p           Operational seaport indicator
22   q_4q           Public fire-protection service indicator
23   q_4r           Public-street sweeper indicator
24   q_5            Farmers, farm laborers, fishermen, loggers, and forest product gatherers constitute more than half of the population 10 years old and over
25   q_6a           Number of commercial establishments in the barangay
26   q_6b           Number of commercial establishments outside the barangay but within 2 km
27   q_7a           Number of recreational establishments in the barangay
28   q_7b           Number of recreational establishments outside the barangay but within 2 km
29   q_8a           Number of manufacturing establishments in the barangay
30   q_8b           Number of manufacturing establishments outside the barangay but within 2 km
31   q_9a           Number of hotel, dormitory, motel or other lodging places in the barangay
32   q_9b           Number of hotel, dormitory, motel or other lodging places outside the barangay but within 2 km
33   q_10a          Number of banking institution, pawnshop, financing/investment or insurance company in the barangay
34   q_10b          Number of banking institution, pawnshop, financing/investment or insurance company outside the barangay but within 2 km
35   q_11a          Number of auto repair shop, vulcanizing shop, electronic repair shop in the barangay
36   q_11b          Number of auto repair shop, vulcanizing shop, electronic repair shop outside the barangay but within 2 km
37   q_12a          Number of establishments offering personal services like restaurant, cafeteria, or refreshment parlor in the barangay
38   q_12b          Number of establishments offering personal services like restaurant, cafeteria, or refreshment parlor outside the barangay but within 2 km
39   q_13a          Number of households in danger areas
40   q_13b          Number of households in government land without legally recognizable claims to the land
41   q_13c          Number of households in private land which they do not own

Appendix 3. Variables in the PMT Model for Non-NCR Households and the Partial Sum of Squares

Variable Name   Partial SS  Description
ln_fam_size2    106.9507    natural logarithm of family size
region_09       12.2461     Zamboanga Peninsula
w_phone         12.0351     with telephone/cellphone
region_15       11.8521     Autonomous Region of Muslim Mindanao
region_10       11.3210     Northern Mindanao
region_16       10.4322     Caraga
w_occup2_71     7.1971      mining, construction and related trades workers
w_occup2_83     6.0946      drivers and mobile plant operators
w_occup2_64     6.0367      fishermen
p2_educ_hsg     5.1667      proportion of family members who are high school graduates
p2_educ_cu      5.0339      proportion of family members who are college undergraduates
p_mem_0_14      4.9922      proportion of members 0-14 years old
w_occup2_93     4.8049      laborers in mining, construction, manufacturing and transport
p2_educ_hsu     4.5039      proportion of family members who are high school undergraduates
region_08       4.2999      Eastern Visayas
ws_spring       3.8729      water source = spring, river, stream, etc.
ofi_w_ocw       3.7744      with family member who is an overseas contract worker
house_strong_3  3.2691      roof and walls made of strong materials
w_occup2_13     3.0721      general managers or managing proprietors
region_07       2.7962      Central Visayas
w_pb2_month     2.7792      with family member whose basis of payment = monthly
w_vtr           2.1045      with VTR/VHS/VCD/DVD
p2_educ_ngc     2.0969      proportion of family members with no grade completed
hh_type_s_fam   1.9464      household type = single family
w_occup2_91     1.9378      sales and services elementary occupations
q_4g            1.8551      High school indicator
w_tv            1.7796      with television set
region_12       1.7037      Central Mindanao
tf_w_sealed     1.6850      toilet facility = water sealed
w_motor         1.6811      with motorcycle/tricycle
w_elec          1.6673      with electricity in the building/house
urban           1.6163      urbanity = urban
w_occup2_82     1.5660      machine operators and assemblers
region_41       1.3110      CALABARZON
q_4m            1.2375      Cellular phone signal indicator
h_ms2_single    1.2277      household head = single
w_occup2_41     1.1950      office clerks
p2_mem_sch      1.1903      proportion of family members currently attending school
h_age2          1.1313      age of household head
w_ref           1.0074      with refrigerator/freezer
w_occup2_72     0.9825      metal, machinery and related trades workers
h_sex2_m        0.9642      household head = male
p2_educ_cg      0.9328      proportion of family members who are college graduates
p2_educ_eg      0.8949      proportion of family members who are elementary graduates
w_ne2_s_term    0.8556      with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work
w_car           0.7919      with car/jeep
ws_s_well       0.7182      water source = shared, tubed/piped well
q_11a           0.6295      Number of auto repair shop, vulcanizing shop, electronic repair shop, or other repair shops in the barangay
w_radio         0.5895      family with radio
w_cw2_employer  0.5560      with family member whose class of worker = employer in own family-operated farm or business
w_occup2_61     0.5525      farmers and other plant growers
ts_oh_ol        0.5508      tenure status = own house and lot; or owner-like possession of house and lot
w_occup2_52     0.5377      models, salespersons and demonstrators
w_stereo        0.5198      with stereo/CD player
w_occup2_01     0.4925      armed forces
w_sala          0.4828      with sala set
w_occup2_34     0.4693      related associate professionals
w_occup2_51     0.4506      personal and protective services workers
w_occup2_74     0.4336      other craft and related trades workers
q_7b            0.4207      Number of recreational establishments outside the barangay but within 2 km
w_occup2_11     0.4200      with family member whose primary occupation = officials of government and special-interest organizations
q_6a            0.3670      Number of commercial establishments in the barangay
w_occup2_42     0.3620      customer service clerks
bldg_single     0.3474      building type = single house
w_wash          0.2904      with washing machine
w_dining        0.2855      with dining set
p2_mem_emp      0.2683      proportion of family members employed
w_occup2_65     0.2617      hunters and trappers
region_03       0.2530      Central Luzon
w_occup2_62     0.2499      animal producers
q_9a            0.2428      Number of hotel, dormitory, motel or other lodging places in the barangay
q_7a            0.1869      Number of recreational establishments in the barangay
q_12a           0.1847      Number of establishments offering personal services like restaurants, cafeteria, etc. in the barangay
p2_educ_pg      0.0641      proportion of family members who are post graduates
w_occup2_14     0.0519      supervisors
w_occup2_21     0.0139      physical, mathematical and engineering science professionals

Appendix 4.
Variables in the PMT Model for NCR Households and the Partial Sum of Squares

Variable Name   Partial SS  Description
ln_fam_size2    16.8941     natural logarithm of family size
n_phone         2.5443      number of telephone/cellphone
n_wash          1.3592      number of washing machines
w_occup2_71     1.2661      mining, construction and related trades workers
p_mem_0_14      1.1725      proportion of members 0-14 years old
w_occup2_83     1.1321      drivers and mobile plant operators
p2_educ_hsg     1.0384      proportion of family members who are high school graduates
w_pb2_month     0.9155      with family member whose basis of payment = monthly
w_occup2_51     0.7428      personal and protective services workers
p2_educ_cu      0.7133      proportion of family members who are college undergraduates
hh_type_s_fam   0.6986      household type = single family
q_5             0.6091      Do farmers, farm laborers, fishermen, loggers, and forest product gatherers constitute more than half of the population 10 years old and over (yes=1)
w_occup2_41     0.5829      office clerks
n_tv            0.5566      number of television sets
w_occup2_93     0.5547      laborers in mining, construction, manufacturing and transport
ts_squatter     0.5204      tenure status = squatter
ws_o_faucet     0.4994      water source = own use, faucet, community water system
w_occup2_72     0.4871      metal, machinery and related trades workers
n_vtr           0.4825      number of VTR/VHS/VCD/DVD
n_stereo        0.4646      number of stereo/CD player
w_occup2_52     0.4185      models, salespersons and demonstrators
w_occup2_13     0.4014      general managers or managing proprietors
n_ref           0.3970      number of refrigerators
tf_none         0.3816      toilet facility = none
q_10a           0.3799      number of banking institutions/pawnshops, financing and investment, inside the barangay
w_occup2_42     0.3764      customer service clerks
p2_mem_sch      0.3489      proportion of family members currently attending school
q_6b            0.2734      Number of commercial establishments outside the barangay but within 2 km
h_age2          0.2580      age of household head
n_sala          0.2249      number of sala sets
n_hh            0.2078      number of households in the housing unit
q_4l            0.1919      Landline telephone system or calling station (indicator)
n_oven          0.1805      number of microwave oven
p2_mem_emp      0.1652      proportion of family members employed
w_occup2_63     0.1614      forestry and related workers
w_ne2_s_term    0.1592      with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work
w_occup2_74     0.1566      other craft and related trades workers
n_motor         0.1504      number of motorcycle/tricycle
w_occup2_09     0.1453      other occupations not classifiable

Appendix 5. Bayesian Averaging of Classical Estimates (BACE)

The BACE (Sala-i-Martin et al., 2004) is a procedure that determines the robustness of explanatory variables in a cross-section regression model. It computes the weighted average of each variable’s estimated coefficient from a large number of regression models, where the weights are functions of goodness-of-fit statistics. The focus of BACE is not on the magnitude of the weighted regression coefficients, but on the robustness of a variable’s effect even if the model suffers from misspecification errors. The procedure involves estimating all regression models of the form

$$ y = \alpha_j + \beta_{zj}\, z + \beta_{Fj}'\, F + \beta_{xj}'\, x_j + \varepsilon_j $$

where $z$ is the variable of interest, $F$ is a vector of fixed variables appearing in all the regressions, and $x_j$ is a vector of variables taken from the collection of all other variables under consideration.
The prior probability of model $M_j$, denoted as $P(M_j)$, assuming equal prior variable inclusion probability, will be:

$$ P(M_j) = \left(\frac{\bar{k}}{K}\right)^{k_j} \left(1 - \frac{\bar{k}}{K}\right)^{K - k_j} $$

where $\bar{k}$ is the speculated number of variables in the true model, $K$ is the total number of variables in the dataset, and $k_j$ is the number of variables in the model. The weights that will be used in the averaging are the posterior probabilities of the models. The weight is a function of the prior probability and is given by:

$$ P(M_j \mid y) = \frac{P(M_j)\, T^{-k_j/2}\, SSE_j^{-T/2}}{\sum_{i=1}^{2^K} P(M_i)\, T^{-k_i/2}\, SSE_i^{-T/2}} $$

where $SSE_j$ is the sum of squared errors in model $M_j$ and $T$ is the number of observations. Therefore, the unconditional posterior mean of $\beta$ is given by:

$$ E(\beta \mid y) = \sum_{j=1}^{2^K} P(M_j \mid y)\, \hat{\beta}_j $$

where $\hat{\beta}_j$ is the estimated value of the vector of coefficients under OLS; and its corresponding unconditional posterior variance is of the form:

$$ Var(\beta \mid y) = \sum_{j=1}^{2^K} P(M_j \mid y)\, Var(\beta \mid y, M_j) + \sum_{j=1}^{2^K} P(M_j \mid y) \left[ \hat{\beta}_j - E(\beta \mid y) \right]^2 $$

The posterior mean and posterior variance that are conditional on inclusion are as follows:

$$ E(\beta \mid y, \beta \neq 0) = \frac{E(\beta \mid y)}{\pi} $$

$$ Var(\beta \mid y, \beta \neq 0) = \frac{Var(\beta \mid y) + \left[ E(\beta \mid y) \right]^2}{\pi} - \left[ E(\beta \mid y, \beta \neq 0) \right]^2 $$

where $\pi$ is the vector of posterior inclusion probabilities, with elements equal to the sum of posterior model probabilities taken over models containing the particular variable, and the divisions are taken element by element. To determine the robustness of the effects, the sign-certainty probability of the variables is used. It is the probability that the true effect lies strictly on one side of zero under the Gaussian distribution. Robust variables are those with a sign-certainty probability of at least 97.5%.

BACE in the DSWD Targeting Model

To check the robustness of the variables’ effects, the BACE method was applied to a subsample of the 2009 FIES comprising households living outside NCR with per capita income below the 40th percentile. If the BACE procedure is strictly implemented, there should be a total of $2^K$ regression models to estimate, which leads to computational problems. To simplify the analysis and to reduce the number of runs, variable groupings were used.
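The averaging steps described above can be illustrated with a small numerical sketch. It enumerates every subset of a handful of candidate regressors, weights each OLS fit by the prior and a goodness-of-fit term in the sum of squared errors, and then forms the posterior inclusion probabilities and posterior means. The function name `bace`, the simulated data, the sample-size symbol `T`, and the choice to keep an intercept in every model are assumptions made for illustration; this is not the authors' code, and it omits the fixed variables and variable groupings used in the paper.

```python
import itertools
import numpy as np


def bace(y, X, k_bar):
    """Average OLS estimates over all 2^K subsets of the columns of X.

    k_bar is the assumed (speculated) number of variables in the true model.
    Returns model weights, inclusion probabilities, and posterior means.
    """
    T, K = X.shape
    log_weights, betas, masks = [], [], []
    for size in range(K + 1):
        for subset in itertools.combinations(range(K), size):
            cols = np.array(subset, dtype=int)
            Z = np.column_stack([np.ones(T), X[:, cols]])  # intercept in every model
            coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
            sse = float(np.sum((y - Z @ coef) ** 2))
            # log prior: (k_bar/K)^k_j * (1 - k_bar/K)^(K - k_j)
            log_prior = size * np.log(k_bar / K) + (K - size) * np.log(1 - k_bar / K)
            # log of the unnormalised weight: prior * T^(-k_j/2) * SSE_j^(-T/2)
            log_weights.append(log_prior - 0.5 * size * np.log(T) - 0.5 * T * np.log(sse))
            beta = np.zeros(K)
            beta[cols] = coef[1:]          # coefficient is zero when the variable is excluded
            betas.append(beta)
            mask = np.zeros(K)
            mask[cols] = 1.0
            masks.append(mask)
    log_weights = np.array(log_weights)
    weights = np.exp(log_weights - log_weights.max())  # rescale before normalising
    weights /= weights.sum()
    pip = weights @ np.array(masks)        # posterior inclusion probabilities
    post_mean = weights @ np.array(betas)  # unconditional posterior means
    cond_mean = post_mean / pip            # posterior means conditional on inclusion
    return weights, pip, post_mean, cond_mean


# Simulated example: only the first of four candidate variables matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(scale=0.5, size=200)
weights, pip, post_mean, cond_mean = bace(y, X, k_bar=2)
print("inclusion probabilities:", np.round(pip, 3))
print("unconditional posterior means:", np.round(post_mean, 3))
```

With four candidate variables the sketch fits 16 models; the count doubles with every added variable, which is why a full enumeration becomes impractical for a long list of regressors.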
The main variable groupings are needs indicators, household-head characteristics, geographic dummy variables, employment indicators, asset indicators, water source and toilet facility, occupation indicators, education variables, housing characteristics, and community characteristics. The needs indicators, household-head characteristics, and geographic dummy variables were fixed in all regression models, and each variable in employment indicators and education variables will be unrestricted. All in all, there were a total of 16 variable and variable groupings tested for robustness, which translates to 65,536 regression runs. The BACE procedure identified 73 robust correlates of per capita income. Robust Determinants of Income Variable urbanity = urban Geographic Indicators Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 1.0 0.0655 0.0002 1.0000 0.0655 0.0002 1.0000 28 Ilocos Region Cagayan Valley Central Luzon CALABARZON MIMAROPA Bicol Region Western Visayas Central Visayas Eastern Visayas Zamboanga Peninsula Northern Mindanao Southern Mindanao Central Mindanao CAR ARMM Caraga 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.1512 0.1449 0.1587 0.1880 0.0967 0.1093 0.1193 0.0609 0.0565 -0.0149 0.0033 0.1008 0.0684 0.1065 0.2210 0.0003 0.0004 0.0003 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0004 0.0004 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.8279 0.5903 1.0000 1.0000 1.0000 1.0000 Base Region 0.1512 0.1449 0.1587 0.1880 0.0967 0.1093 0.1193 0.0609 0.0565 -0.0149 0.0033 0.1008 0.0684 0.1065 0.2210 0.0003 0.0004 0.0003 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0004 0.0004 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.8279 0.5903 1.0000 1.0000 1.0000 1.0000 Needs Indicators Unconditional Variable Conditional Posterio Sign 
Posterio r Certainty r Mean Varianc Probability e Posterior Inclusion Probability Posterio r Mean Posterior Variance Sign Certainty Probability 0.8043 -0.2196 0.0028 1.0000 -0.2730 -0.0111 0.0708 -0.0262 0.0091 0.6079 -0.3698 0.0020 natural logarithm of family size proportion of members 0-14 years old 1.0000 Employment Indicators Variable with family member whose basis of payment = monthly with family member who is an overseas contract worker Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.0139 0.0011 0.0001 0.5460 0.0806 0.0004 1.0000 0.0101 0.0013 0.0002 0.5392 0.1292 0.0010 1.0000 29 Asset Indicators Variable With electricity in the building/house with motorcycle tricycle with television set with telephone/cellphone with VTR/VHS/VCD/DVD with refrigerator/freezer with car/jeep with sala set family with radio with stereo / CD player with washing machine with dining set with microcomputer Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.8250 0.0525 0.0625 0.0408 0.0940 0.0344 0.0508 0.1025 0.0266 0.0158 0.0252 0.0248 0.0143 0.0646 0.0002 0.0003 0.0001 0.0007 0.0001 0.0002 0.0010 0.0001 0.0000 0.0001 0.0002 0.0001 0.0020 1.0000 0.9999 0.9999 0.9999 0.9998 0.9998 0.9993 0.9973 0.9941 0.9930 0.9758 0.9621 0.9279 0.0637 0.0758 0.0495 0.1139 0.0417 0.0616 0.1242 0.0322 0.0191 0.0305 0.0300 0.0173 0.0783 -0.0005 -0.0007 -0.0003 -0.0015 -0.0002 -0.0004 -0.0015 -0.0001 0.0000 0.0000 0.0000 0.0000 0.0013 1.0000 0.9997 0.9851 Water Source and Toilet Facility Variable water source = spring, river, stream, etc toilet facility = water sealed water source = shared, tubed / piped well Posterior Inclusion Probabilit y Unconditional 
Posterio Sign Posterio r Certainty r Mean Varianc Probabilit e y Conditional Posterio Sign Posterio r Certainty r Mean Varianc Probabilit e y 0.0150 -0.0013 0.0001 0.5478 -0.0874 0.0004 1.0000 0.0146 0.0009 0.0001 0.5459 0.0649 0.0005 0.9984 0.0150 -0.0004 0.0000 0.5462 -0.0289 0.0001 0.9970 Occupation Indicators Variable drivers and mobile plant operators office clerks machine operators and assemblers Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.1285 0.1285 0.0142 0.0188 0.0013 0.0023 0.6538 0.6523 0.1105 0.1461 -0.0006 -0.0007 0.1285 0.0206 0.0028 0.6522 0.1606 -0.0008 30 mining, construction and related trades workers general managers or managing proprietors fishermen related associate professionals with family member whose primary occupation = officials of government and specialinterest organizations metal, machinery and related trades workers armed forces personal and protective services workers laborers in mining, construction, manufacturing and transport customer service clerks hunters and trappers animal producers models, salespersons and demonstrators teaching professionals sales and services elementary occupations 0.1285 0.0146 0.0014 0.6522 0.1138 -0.0004 0.1285 0.1285 0.0093 0.0113 0.0006 0.0009 0.6516 0.6483 0.0728 0.0876 -0.0001 0.0001 1.0000 0.1285 0.0121 0.0010 0.6472 0.0941 0.0003 1.0000 0.1285 0.0127 0.0012 0.6458 0.0990 0.0005 1.0000 0.1285 0.1285 0.0115 0.0163 0.0009 0.0019 0.6454 0.6454 0.0891 0.1271 0.0004 0.0009 1.0000 1.0000 0.1285 0.0079 0.0005 0.6434 0.0613 0.0003 0.9998 0.1285 0.1285 0.1285 0.1285 0.0089 0.0107 -0.0719 0.0049 0.0006 0.0009 0.0401 0.0002 0.6419 0.6405 0.6402 0.6399 0.0697 0.0832 -0.5593 0.0378 0.0005 0.0008 0.0395 0.0002 0.9992 0.9980 0.9976 0.9973 0.1285 0.1285 0.0069 0.0134 0.0004 0.0015 0.6399 0.6365 0.0535 0.1041 0.0004 0.0020 0.9972 0.9904 0.1285 0.0065 0.0004 0.6360 0.0509 0.0005 
0.9888 Education Variables Variable Proportion of family members currently attending school Proportion of family members with no grade completed Proportion of family members who are elementary undergraduates Proportion of family members who are Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.2639 -0.0452 0.0071 0.7042 -0.1713 0.0053 0.9907 0.2639 0.0344 0.0064 0.6667 0.1305 0.0034 1.0000 0.2639 0.0732 0.0158 0.7196 0.2773 -0.0058 0.2639 0.0988 0.0257 0.7311 0.3745 -0.0261 31 elementary graduates Proportion of family members who are high school undergraduates Proportion of family members who are high school graduates Proportion of family members who are college undergraduates Proportion of family members who are college graduates Proportion of family members who are post graduates 0.2639 0.1325 0.0421 0.7408 0.5021 -0.0341 0.2639 0.1497 0.0535 0.7412 0.5672 -0.0640 0.2639 0.1937 0.0878 0.7434 0.7340 -0.0527 0.2639 0.1941 0.0912 0.7398 0.7355 1.8868 0.7189 0.2639 0.2101 0.6209 0.6051 0.7961 0.0053 0.9907 Housing Characteristics Variable building type = single house roof and walls made of strong materials tenure status = own house and lot; or ownerlike possession of house and lot Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.0136 -0.0009 0.0001 0.5447 -0.0688 0.0004 0.9994 0.0136 0.0007 0.0000 0.5443 0.0531 0.0003 0.9984 0.0136 0.0002 0.0000 0.5431 0.0150 0.0000 0.9914 Community Characteristics Variable There is a high school There is Landline Telephone System/Calling Station Unconditional Conditional Posterior Sign Sign Inclusion Posterior Posterior Posterior Posterior Certainty Certainty Probability Mean Variance Mean Variance Probability Probability 0.0167 -0.0004 0.0000 0.5509 -0.0255 0.0000 1.0000 
0.0167 0.0004 0.0000 0.5498 0.0251 0.0001 0.9997

There is a town/city hall/provincial capitol     0.0167  0.0005   0.0000  0.5497  0.0272   0.0001  0.9996
There is a cellular phone signal                 0.0167  -0.0004  0.0000  0.5496  -0.0227  0.0000  0.9995
Commercial establishments inside the barangay    0.0167  0.0000   0.0000  0.5483  0.0001   0.0000  0.9946
Auto repair shops inside the barangay            0.0167  0.0000   0.0000  0.5482  0.0018   0.0000  0.9941
There is a barangay health center                0.0167  0.0002   0.0000  0.5479  0.0147   0.0000  0.9921
Barangay has street pattern                      0.0167  0.0002   0.0000  0.5476  0.0142   0.0000  0.9901
There is a church/chapel                         0.0167  0.0003   0.0000  0.5474  0.0180   0.0001  0.9878
Barangay is part of the town/city proper         0.0167  -0.0002  0.0000  0.5471  -0.0121  0.0000  0.9854

Appendix 6A. Explanatory Variables in the logit model for Non-NCR households

Variable Name   Description
n_tv            number of television sets
p_mem_0_14      proportion of members 0-14 years old
ln_fam_size2    natural logarithm of family size
n_phone         number of telephone/cellular phone
n_ref           number of refrigerators/freezers
house_strong_3  roof and walls made of strong materials
n_wash          number of washing machines
p2_educ_cg      proportion of family members who are college graduates
p2_educ_hsg     proportion of family members who are high school graduates
p2_educ_cu      proportion of family members who are college undergraduates
n_motor         number of motorcycles
w_pb2_month     with family member whose basis of payment = monthly
ofi_w_ocw       with family member who is an overseas contract worker
region_41       CALABARZON
w_occup2_71     mining, construction and related trades workers
n_vtr           number of VTR/VHS/VCD/DVD
w_occup2_83     drivers and mobile plant operators
w_occup2_13     general managers or managing proprietors
w_sala          with sala set
region_16       Caraga
p2_educ_hsu     proportion of family members who are high school undergraduates
region_10       Northern Mindanao
w_occup2_64     fishermen
q_6a            Number of commercial establishments in the barangay
region_15       Autonomous Region of Muslim Mindanao
p2_educ_ngc     proportion of family members with no grade completed
w_occup2_93     laborers in mining, construction, manufacturing and transport
w_occup2_82     machine operators and assemblers
region_07       Central Visayas
bldg_single     building type = single house
w_occup2_91     sales and services elementary occupations
w_occup2_41     office clerks
w_ne2_s_term    with family member whose nature of employment = short-term or seasonal or casual job/business/unpaid family work
p2_mem_emp      proportion of family members employed
w_elec          with electricity in the building/house
n_car           number of cars/jeep
ws_s_well       water source = shared, tubed/piped well
ws_spring       water source = spring, river, stream, etc.
h_age2          age of household head
hh_type_s_fam   household type = single family
ts_oh_ol        tenure status = own house and lot; or owner-like possession of house and lot
w_occup2_34     related associate professionals
n_radio         number of radio
w_occup2_72     metal, machinery and related trades workers
q_11a           Number of auto repair shop, vulcanizing shop, electronic repair shop, or other repair shops in the barangay
w_cw2_employer  with family member whose class of worker = employer in own family-operated farm or business
w_occup2_61     farmers and other plant growers
p2_mem_sch      proportion of family members currently attending school
n_dining        number of dining sets
w_occup2_52     models, salespersons and demonstrators
w_occup2_01     armed forces
w_occup2_11     with family member whose primary occupation = officials of government and special-interest organizations
p2_educ_eg      proportion of family members who are elementary graduates
w_stereo        with stereo/CD player
q_12a           Number of establishments offering personal services like restaurants, cafeteria, etc. in the barangay
region_09       Zamboanga Peninsula
region_08       Eastern Visayas
w_occup2_14     supervisors
w_occup2_51     personal and protective services workers
w_occup2_74     other craft and related trades workers
q_7b            Number of recreational establishments outside the barangay but within 2 km

Appendix 6B.
Explanatory Variables in the logit model for NCR households Variables With family member whose class of worker = employer in own family-operated farm or business w_cw2_employer proportion of family members employed p2_mem_emp other occupations not classifiable w_occup2_09 number of washing machines n_wash Do farmers, farm laboreres, fishermen, loggers, and forest product gatherers constitute more than half of the population 10 years old and over (yes=1) q_5 Proportion of family members who are elementary undergraduates p2_educ_eu Town city hall/provincial capitol indicator q_4a sales and services elementary occupations w_occup2_91 number of telephone/cell phone n_phone Number of households dwelling in private land which they do not own except in danger areas q_13c Barangay street pattern indicator q_2 number of microwave oven n_oven toilet facility = water sealed tf_w_sealed cemetery indicator q_4d precision, handicraft, printing and related trades workers w_occup2_73 natural logarithm of family size ln_fam_size2 household type = single family hh_type_s_fam Highs chool indicator q_4g Poblacion/City District indicator q_1c 35
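A note on reading the posterior summary tables (education, housing, and community variables) that precede Appendix 6A: the unconditional and conditional columns appear to be linked by the usual model-averaging identities, in which a coefficient is treated as exactly zero in every model that excludes the variable. The sketch below is only an illustration under that assumption, not the authors' estimation code; the function name `unconditional_moments` is hypothetical, and the input figures are copied from rows of those tables.

```python
def unconditional_moments(pip, cond_mean, cond_var):
    """Return (mean, variance) of a coefficient averaged over all models.

    Assumption: the coefficient equals zero in models that exclude the
    variable, so only the included-model moments are weighted by the
    posterior inclusion probability (pip).
    """
    mean = pip * cond_mean
    second_moment = pip * (cond_var + cond_mean ** 2)
    return mean, second_moment - mean ** 2


# (label, inclusion probability, conditional mean, conditional variance,
#  reported unconditional mean, reported unconditional variance)
rows = [
    ("currently attending school", 0.2639, -0.1713, 0.0053, -0.0452, 0.0071),
    ("building type = single house", 0.0136, -0.0688, 0.0004, -0.0009, 0.0001),
    ("roof and walls of strong materials", 0.0136, 0.0531, 0.0003, 0.0007, 0.0000),
    ("there is a high school", 0.0167, -0.0255, 0.0000, -0.0004, 0.0000),
]

for label, pip, c_mean, c_var, rep_mean, rep_var in rows:
    mean, var = unconditional_moments(pip, c_mean, c_var)
    print(f"{label}: mean {mean:.4f} (reported {rep_mean:.4f}), "
          f"variance {var:.4f} (reported {rep_var:.4f})")
```

Under this reading, a small unconditional mean can reflect a low inclusion probability rather than a weak effect, which is why the conditional columns are reported alongside it.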