ABOUT THE QUALITY OF THE 2013 CENSUS OF POPULATION, HOUSEHOLDS AND DWELLINGS IN BOSNIA AND HERZEGOVINA: BASIC EVIDENCE FROM THE POST-ENUMERATION SURVEY

Edin Šabanović, Agency for Statistics of Bosnia and Herzegovina
Rabija Somun-Kapetanović, School of Economics and Business, Sarajevo

Introduction
• The 1991 Census of Population in BiH became almost useless prematurely
• For more than two decades, BiH lacked basic demographic and socio-economic data based on full statistical coverage
• The 2013 Census of Population, Households and Dwellings was given great political importance
• High-quality Census data, fully harmonised with international and EU standards, were required
• The Post-enumeration Survey (PES) is a tool for measuring the coverage and content quality of the Census

Literature review
• Every statistical survey is governed by different kinds of regulations (laws, specific regulations, recommendations, guidelines, manuals, etc.)
• International comparability of statistical data is very important
• Population censuses, as the largest statistical surveys, have been subject to international regulation and standardisation from the very beginning
• UN Principles and Recommendations for Population and Housing Censuses (2007): "The post enumeration survey (PES) can be defined as the complete re-enumeration of a representative sample of the census population and matching each individual who is enumerated in the post-enumeration survey with information from the main enumeration."
• The UN PES Operational Guidelines (2010) serve as a reference manual for the evaluation of census quality and define the Dual System Estimation method for estimating the true population
• Fellegi and Sunter (1969) wrote a classic paper on the linkage of statistical data and developed a mathematical model for comparing the values recorded in two records
• The theoretical basis of Latent Class Analysis (LCA) was given in papers by Goodman (1974), Hagenaars (1993 and 1997), Wolter
(1986) and Biemer (2011). LCA is treated as a technique and model that generalises classical test theory and the traditional survey error modelling approach

PES and Dual System Estimation Method (1)
• The PES measures the coverage and content quality of the Census
• It indicates how usable the census data are
• Dual System Estimation (DSE) is a standard model for estimating the unknown population total
• It is derived from the capture-recapture model, which is designed for estimating the size of a closed population using two incomplete enumerations
• In the census context, the population size N is unknown and must be estimated from two attempts to count the entire population: the first is the total enumeration (the Census) and the second is a sample-based enumeration (the PES)
• After matching the lists from the two data sources, a 2×2 table is set up, as in Table 1.

PES and Dual System Estimation Method (2)
Table 1. Contingency table of the counts in the Census and in the PES

                   PES
    Census     Present   Absent   Total
    Present    x11       x12      x1·
    Absent     x21       x22      x2·
    Total      x·1       x·2      N

Five assumptions must be satisfied in order to use DSE for estimating N:
a) Captures in the Census and in the PES are independent;
b) The population is closed, so the population being measured in both sources is the same;
c) Records from both sources can be linked without errors;
d) Units have the same capture probability within each source (homogeneity assumption);
e) Over-count in both sources is negligible.
• Under these assumptions, an unbiased estimator of N is given by

    $\hat{N} = \dfrac{N_1 N_2}{x_{11}}$    (1)

  where $N_1 = x_{1\cdot}$ and $N_2 = x_{\cdot 1}$ are the Census and PES totals.

Results of the linkage procedure
• The results of linking units between the PES and the Census, used to obtain the marginal sums of Table 1, are presented in the following tables:
Table 2.
Contingency table of individuals enumerated in the Census and in the PES, unweighted cases (before determination of resident status)

                   PES
    Census     Present   Absent   Total
    Present    36,936    4,540    41,476
    Absent     537       x22      x2·
    Total      37,473    x·2      N

Table 3. Matched units, comparison on resident status, unweighted cases

    Resident status       Resident status at the PES
    at the Census         Status 1   Status 2   Status 3   Grand total
    Resident status 1     34,603     58         829        35,490
    Resident status 2     61         38         116        215
    Resident status 3     667        7          557        1,231
    Grand total           35,331     103        864        36,936

• Units with different answers in the PES and the Census on resident and enumeration status were not considered, because the information needed to resolve the discrepancies was lacking

Table 4. Contingency table of individuals enumerated in the Census and in the PES, final data, unweighted cases

                   PES
    Census     Present   Absent   Total
    Present    34,603    2,656    37,259
    Absent     448       x22      x2·
    Total      35,051    x·2      N

Adjustment of the DSE Method-LCA
• The over-count of individuals was unusual, so Latent Class Analysis (LCA) was used
• If we assume that the PES list does not include ineligible persons, the adjusted DSE estimate of the population size N (i.e.
corrected for over-count), $\hat{N}_O$, is given by

    $\hat{N}_O = \dfrac{\hat{N}_{ce}\, N_2}{x_{11}}$

• LCA is a probabilistic method for assessing whether a unit is part of the target population and for estimating the probability of belonging to this population
• Different models were tested; the best one included three explanatory variables (Sex, Age and Matching status of the household)

Census Coverage Errors (1)
• The model yields a first latent class probability (the marginal probability of belonging to the population) of 0.1379, i.e. 366 unmatched people are estimated to be in the population
• On the basis of the LCA results, all coverage errors of the Population Census in Bosnia and Herzegovina can be estimated
• The main estimates were calculated as follows:

    $OvRate = 1 - \dfrac{\sum_l p_{10,l} X_{10,l} + X_{11}}{\sum_l \left( X_{10,l} + X_{11,l} \right)}$    (2)

    $DSE = N_{census} \cdot \dfrac{1 - OvRate}{1 - UnRate}$    (3)

    $UnRate = 1 - \dfrac{X_{11}}{X_{\cdot 1}}$    (4)

• The main coverage errors are presented in the following three tables:

Census Coverage Errors (2)
Table 5. Main coverage errors at national level for individuals, absolute figures

    Dual system estimate of the true population (DSE)                       3,352,760
    Census count before correcting for collective households               3,530,159
    Census count after correcting for collective households (Ncensus)      3,507,343
    Eligible and correctly enumerated in the Census (Nce)                   3,311,191
    Erroneous inclusion                                                     196,152
    Omission (DSE-Nce)                                                      41,570
    Census net coverage error (DSE-Ncensus)                                 -154,583

Table 6.
Coverage errors at urban, rural, entity and national levels for individuals

                                                Urban       Rural       Federation   Republika   Brcko      National
                                                                        of BiH       Srpska      District
    DSE population estimate       Count         1,413,957   1,938,804   2,113,556    1,167,778   71,426     3,352,760
                                  95% CI LL     1,410,191   1,933,381   2,108,606    1,163,986   70,502     3,346,202
                                  95% CI UL     1,417,722   1,944,226   2,118,506    1,171,571   72,350     3,359,319
    Net coverage error            Rate          -5.34%      -4.08%      -4.28%       -4.49%      -16.51%    -4.61%
                                  95% CI LL     -5.62%      -4.37%      -4.52%       -4.83%      -18.02%    -4.82%
                                  95% CI UL     -5.06%      -3.79%      -4.03%       -4.15%      -15.01%    -4.41%
    Census omission               Rate          1.42%       1.19%       0.98%        1.81%       1.38%      1.24%
                                  95% CI LL     1.09%       0.83%       0.68%        1.40%       -0.26%     0.99%
                                  95% CI UL     1.75%       1.56%       1.28%        2.23%       3.02%      1.49%
    Census erroneous inclusion    Rate          6.76%       5.27%       5.26%        6.30%       17.89%     5.85%
                                  95% CI LL     6.56%       5.03%       5.07%        6.04%       16.84%     5.69%
                                  95% CI UL     6.96%       5.52%       5.45%        6.56%       18.94%     6.01%

Census Coverage Errors (3)
Table 7. Coverage rates at urban, rural, entity and national levels for individuals

                                   Urban    Rural    Federation   Republika   Brcko      National
                                                     of BiH       Srpska      District
    Over-coverage    Rate          6.41%    5.07%    5.04%        6.03%       15.36%     5.59%
                     95% CI LL     6.18%    4.88%    4.86%        5.78%       14.48%     5.44%
                     95% CI UL     6.65%    5.26%    5.23%        6.28%       16.24%     5.75%
    Under-coverage   Rate          1.38%    1.15%    0.96%        1.79%       1.35%      1.24%
                     95% CI LL     1.21%    1.02%    0.83%        1.61%       0.60%      1.14%
                     95% CI UL     1.55%    1.29%    1.09%        1.97%       2.11%      1.34%

Census Content Errors (1)
• Content quality control measures errors due to differences or inconsistencies between Census and PES data
• For this purpose, a 10% sub-sample of the PES sample was used (the sub-sample consisted of 7,637 individuals, 2,114 households and 2,073 dwellings)
• Variability of answers is measured by five content quality indicators:
a) Net difference rate;
b) Index of inconsistency;
c) Aggregate index of inconsistency;
d) Gross difference rate;
e) Rate of agreement.
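The five indicators listed above can be sketched in a few lines of Python. The presentation does not spell out the formulas, so the definitions below are assumptions taken from the usual reinterview conventions: the net difference rate is the Census share minus the PES share of a category, the index of inconsistency divides the observed share of discordant answers by the share expected if the two answers were independent, and the aggregate index does the same over all categories. The function name `content_indicators` and the example counts are hypothetical.

```python
from math import nan


def content_indicators(table):
    """Content-error indicators from a square Census-by-PES cross-table.

    table[i][j] is the number of matched persons reported in category i
    in the Census and in category j in the PES (assumed layout).
    All results are proportions; multiply by 100 for percentages.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    census_tot = [sum(row) for row in table]
    pes_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    agree = sum(table[i][i] for i in range(k))

    per_category = []
    for i in range(k):
        p_census = census_tot[i] / n
        p_pes = pes_tot[i] / n
        # Persons placed in category i by exactly one of the two sources.
        discordant = census_tot[i] + pes_tot[i] - 2 * table[i][i]
        # Discordance expected if Census and PES answers were independent.
        expected = p_census * (1 - p_pes) + p_pes * (1 - p_census)
        per_category.append({
            "net_difference_rate": p_census - p_pes,
            "index_of_inconsistency":
                (discordant / n) / expected if expected else nan,
        })

    # Agreement expected by chance, in number of persons.
    expected_agree = sum(c * p for c, p in zip(census_tot, pes_tot)) / n
    return {
        "per_category": per_category,
        "gross_difference_rate": (n - agree) / n,
        "aggregate_index_of_inconsistency": (n - agree) / (n - expected_agree),
        "rate_of_agreement": agree / n,
    }


# Hypothetical two-category counts, not taken from the Census or the PES.
example = [[480, 12],
           [10, 498]]
result = content_indicators(example)
print(round(100 * result["rate_of_agreement"], 2))
print(round(100 * result["aggregate_index_of_inconsistency"], 2))
```

Under these definitions the gross difference rate and the rate of agreement sum to 100%, which is also the pattern of the variable totals reported in Table 8 (for example 2.45% and 97.55% for Sex).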
In this presentation, we focus only on content errors for selected variables for individuals

Census Content Errors (2)

                                              Net          Index of        Gross        Aggregate index    Rate of
    Variable / category                       difference   inconsistency   difference   of inconsistency   agreement
                                              rate                         rate
    Sex
      M                                       -0.13%       2.88%
      F                                       0.00%        3.30%
      NULL                                    0.13%        99.48%
      Total                                                                2.45%        4.81%              97.55%
    Age
      0-14                                    -0.08%       0.85%
      15-49                                   -0.04%       1.34%
      50-64                                   -0.01%       1.34%
      65+                                     0.03%        1.40%
      NULL                                    0.10%        52.04%
      Total                                                                1.13%        1.72%              98.87%
    Marital status
      Never married                           -1.74%       10.69%
      Married                                 0.39%        5.56%
      Divorced                                0.00%        27.96%
      Widow/Widower                           -0.08%       3.62%
      Not applicable                          -0.07%       0.80%
      NULL                                    1.49%        94.14%
      Total                                                                5.15%        7.66%              94.85%
    Highest completed education
      No education                            0.14%        15.83%
      Uncompleted basic education             0.25%        32.25%
      Primary school                          -1.15%       34.45%
      Lower secondary school                  -0.16%       11.82%
      Post-secondary school                   -0.05%       1.76%
      High school                             0.24%        38.42%
      Tertiary education                      -0.16%       17.13%
      Not applicable                          -0.07%       0.80%
      NULL                                    0.96%        100.02%
      Total                                                                12.31%       16.45%             87.69%

Table 8. Content error indicators for selected variables and their categories at Census and PES

Census Content Errors (3)

                                              Net          Index of        Gross        Aggregate index    Rate of
    Variable / category                       difference   inconsistency   difference   of inconsistency   agreement
                                              rate                         rate
    Activity status
      Employed                                -0.59%       16.93%
      Didn't work but has a job to return to  0.20%        82.02%
      Unemployed                              -0.24%       19.67%
      Not applicable                          -0.07%       0.65%
      NULL                                    0.69%        100.00%
      Total                                                                11.06%       17.76%             88.94%
    Citizenship
      BiH                                     0.64%        19.58%
      BiH and other country                   -1.11%       20.16%
      Other country                           0.03%        48.97%
      Without citizenship                     0.04%        100.00%
      NULL                                    0.41%        100.00%
      Total                                                                5.60%        21.35%             94.40%

Table 8. Content error indicators for selected variables and their categories at Census and PES, cont.
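As a rough aid for reading the variable totals in Table 8, the aggregate indices of inconsistency can be sorted into ranges. The cut-offs used here (below 20 is low, 20 to 50 is medium, above 50 is high) are an assumption based on the convention common in reinterview studies; the presentation does not state which thresholds were applied. The function name `inconsistency_range` is hypothetical.

```python
# Aggregate indices of inconsistency (percent), as reported in Table 8.
AGGREGATE_INDEX = {
    "Sex": 4.81,
    "Age": 1.72,
    "Marital status": 7.66,
    "Highest completed education": 16.45,
    "Activity status": 17.76,
    "Citizenship": 21.35,
}


def inconsistency_range(index_pct, low_cut=20.0, high_cut=50.0):
    """Label an index of inconsistency using ASSUMED cut-offs.

    The default thresholds are a common convention, not values
    confirmed by the presentation.
    """
    if index_pct < low_cut:
        return "low"
    if index_pct <= high_cut:
        return "medium"
    return "high"


ranges = {name: inconsistency_range(value)
          for name, value in AGGREGATE_INDEX.items()}
for name, label in ranges.items():
    print(f"{name}: {AGGREGATE_INDEX[name]:.2f}% -> {label}")
```

With these assumed cut-offs, Citizenship is the only variable outside the low range.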
Conclusions (1)
• The 2013 Census of Population, Households and Dwellings in Bosnia and Herzegovina was conducted in a very specific socio-economic and political atmosphere
• Issue of the general "statistical culture" in Bosnia and Herzegovina
• BHAS was aware of the atmosphere in which the census would be conducted. It prepared and conducted the Population Census in line with all relevant international and European standards, definitions, regulations and recommendations
• Apart from the IMO reports, which were prepared during all census activities, the PES indicators served as data for the final validation of the Population Census in BiH
• The PES provided the DSE of the true population in BiH and the main indicators of census quality in terms of coverage and content

Conclusions (2)
• PES coverage results in absolute terms:
a) The DSE of the true population was 3,352,760, indicating that the Census over-counted the population (the Census figure is 3,507,343 after correcting for institutional households)
b) 196,152 inhabitants were erroneously included in the population count
c) 41,570 inhabitants were missed in the enumeration
d) The net coverage error of the Census was -154,583
• PES coverage results in relative terms:
a) Over-coverage rate: 5.59%
b) Under-coverage rate: 1.24%
c) Net coverage error rate: -4.61%
The coverage indicators show some over- and under-count, but not to an extent that would indicate poor census quality.
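The absolute and relative coverage results are linked by simple arithmetic, which the following Python sketch retraces from the three national counts in Table 5. The denominators are an assumption: the Census count for the over-coverage rate and the DSE for the under-coverage and net error rates, chosen because they reproduce the reported percentages. The function name `coverage_summary` is hypothetical.

```python
# National figures as reported in Table 5.
DSE = 3_352_760        # dual system estimate of the true population
N_CENSUS = 3_507_343   # census count after correcting for collective households
N_CE = 3_311_191       # eligible and correctly enumerated in the Census


def coverage_summary(dse, n_census, n_ce):
    """Derive coverage errors and rates from three counts.

    Denominators are ASSUMED (census count for over-coverage,
    DSE for under-coverage and net error).
    """
    erroneous = n_census - n_ce   # erroneous inclusions
    omission = dse - n_ce         # persons missed by the Census
    net_error = dse - n_census    # negative means net over-count
    over_rate = erroneous / n_census
    under_rate = omission / dse
    return {
        "erroneous_inclusion": erroneous,
        "omission": omission,
        "net_error": net_error,
        "over_coverage_rate": over_rate,
        "under_coverage_rate": under_rate,
        "net_error_rate": net_error / dse,
        # DSE rebuilt as Ncensus * (1 - OvRate) / (1 - UnRate).
        "dse_from_rates": n_census * (1 - over_rate) / (1 - under_rate),
    }


summary = coverage_summary(DSE, N_CENSUS, N_CE)
for name, value in summary.items():
    print(f"{name}: {value:,.4f}")
```

The erroneous inclusions and the net error come out exactly as reported (196,152 and -154,583). The omission computes to 41,569, one less than the reported 41,570, presumably because the published figures are rounded weighted estimates.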
Conclusions (3)
• Rates of agreement between Census and PES variables in terms of content ranged between 87.69% and 98.87% (Highest completed education and Age, respectively)
• Aggregate indices of inconsistency were in the low range, except for one variable (Citizenship: 21.35%, which is in the medium range)
• The average net difference rate for every analysed variable was in the low range
• Only for a few variables were individual net difference rates and individual indices of inconsistency in the medium or high range, indicating problems in understanding the questions and reporting answers
• Data users should be careful when using these results, while statisticians must provide clearer definitions and wording in the design of survey instruments, and more probing during the interview, in future work
• The overall quality of the 2013 Census of Population, Households and Dwellings in BiH can be evaluated as good and usable for statistical purposes and policy making, both in terms of coverage and content.