ABOUT THE QUALITY OF THE 2013 CENSUS OF
POPULATION, HOUSEHOLDS AND DWELLINGS IN BOSNIA
AND HERZEGOVINA: BASIC EVIDENCE FROM THE POST-ENUMERATION SURVEY
Edin Šabanović, Agency for Statistics of Bosnia and Herzegovina
Rabija Somun – Kapetanović, School of Economics and Business, Sarajevo
Introduction
• The 1991 Census of Population in BiH became almost useless prematurely
• For more than two decades, BiH lacked basic demographic and
socio-economic data based on full statistical coverage
• The 2013 Census of Population, Households and Dwellings was given
huge political importance
• High-quality Census data and full harmonisation with international and
EU standards were required
• The Post-enumeration Survey (PES) is a tool for measuring the coverage and
content quality of the Census
Literature review
• Every statistical survey is governed by various kinds of regulation (laws, specific
regulations, recommendations, guidelines, manuals, etc.)
• International comparability of statistical data is very important
• Population censuses, as the largest statistical surveys, have been subject to international
regulation and standardisation from the very beginning
• UN Principles and Recommendations for Population and Housing Censuses (2007): “The
post enumeration survey (PES) can be defined as the complete re-enumeration of a
representative sample of the census population and matching each individual who is
enumerated in the post-enumeration survey with information from the main enumeration.”
• The UN PES Operational Guidelines (2010) serve as a reference manual for evaluating
census quality and define the Dual System Estimation (DSE) method for estimating the true
population
• Fellegi and Sunter (1969) prepared a classic paper on linkage of statistical data and
developed a mathematical model for comparison between the recorded values in two
records
• The theoretical basis of Latent Class Analysis (LCA) was laid out in the papers of Goodman (1974),
Hagenaars (1993 and 1997), Wolter (1986) and Biemer (2011). LCA is treated as a technique
and model that generalises classical test theory and the traditional approach to survey
error modelling
PES and Dual System Estimation Method (1)
• The PES is used to measure the coverage and content quality of the Census
• It indicates how usable the census data are
• DSE is a standard model for estimating the unknown population total
• It is derived from the capture-recapture model, which is designed for
estimating the size of a closed population using two incomplete
enumerations
• In the census context, the population size N is unknown and has to be
estimated from two attempts to count the entire population: the first
is the total enumeration (the Census) and the second is a sample-based
enumeration (the PES)
• After matching the lists from the two data sources, a 2×2 table is set up, as in
Table 1.
PES and Dual System Estimation Method (2)
Table 1. Contingency table of the counts in the Census and in the PES

                      PES
Census       Present   Absent   Total
Present      x11       x12      x1+
Absent       x21       x22      x2+
Total        x+1       x+2      N

Five assumptions must be satisfied in order to use DSE for estimating N:
a) Captures in Census and PES are independent;
b) The population is closed, so the population being measured in both sources is the same;
c) Records from both sources can be linked without errors;
d) Units have the same capture probability within each source (homogeneous capture
probability assumption);
e) Over-count in both sources is negligible.
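• In this way, an unbiased estimator of N is given by
$$\hat{N} = \frac{N_1 \, N_2}{x_{11}} \tag{1}$$
where $N_1 = x_{1+}$ is the Census total, $N_2 = x_{+1}$ is the PES total and $x_{11}$ is the
number of units found in both sources.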
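Equation (1) follows from the independence assumption: if captures are independent, the share of units found in both sources, x11/N, is approximately (N1/N)(N2/N), and solving for N gives N1·N2/x11. A minimal Python sketch of this calculation is given below. The function name and the counts are hypothetical and are not taken from the Census or the PES.

```python
def dual_system_estimate(n_census: float, n_pes: float, n_matched: float) -> float:
    """Dual system (capture-recapture) estimate: N_hat = N1 * N2 / x11."""
    if n_matched <= 0:
        raise ValueError("the number of matched units x11 must be positive")
    return n_census * n_pes / n_matched


# Hypothetical counts for illustration only (not Census or PES figures):
# 1,000 units in the first list, 900 in the second, 800 found in both.
n_hat = dual_system_estimate(n_census=1000, n_pes=900, n_matched=800)
print(n_hat)  # 1125.0
```

The fewer units the two lists have in common, the larger the estimate, which is why linkage errors (assumption c) translate directly into errors in the estimated N.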
Results of the linkage procedure
• The results of linking units between the PES and the Census, which provide the
marginal sums of Table 1, are presented in the following tables:
Table 2. Contingency table of individuals enumerated in the Census and in the
PES, unweighted cases (before determination of resident status)

                      PES
Census       Present   Absent   Total
Present      36,936    4,540    41,476
Absent       537       x22      x2+
Total        37,473    x+2      N

Table 3. Matched units, comparison on resident status, unweighted cases

Resident status        Resident status at the PES
at the Census          Status 1   Status 2   Status 3   Grand total
Resident status 1      34,603     58         829        35,490
Resident status 2      61         38         116        215
Resident status 3      667        7          557        1,231
Grand total            35,331     103        864        36,936

• Units with different answers in the PES and the Census on resident and
enumeration status were not considered, because the information
needed to resolve the discrepancies was lacking
Table 4. Contingency table of individuals enumerated in the Census and in the PES - final data,
unweighted cases

                      PES
Census       Present   Absent   Total
Present      34,603    2,656    37,259
Absent       448       x22      x2+
Total        35,051    x+2      N

Adjustment of the DSE Method-LCA
• The over-count of individuals was unusual, so Latent Class Analysis (LCA) was used
• If we assume that the PES list does not include ineligible persons, the adjusted DSE
estimate of the population size N (i.e. corrected for over-count), $\hat{N}_O$, is given by
$$\hat{N}_O = \frac{\hat{N}_{ce} \, N_2}{x_{11}}$$
• LCA is a probabilistic method for assessing if a unit is a part of the target
population or not and for estimating a probability of belonging to this
population
• Different models have been tested and the best one included three
explanatory variables (Sex, Age and Matching status of the household)
Census Coverage Errors (1)
• The model yields a first latent class probability (the marginal
probability of belonging to the population) of 0.1379 (366 unmatched people
estimated to be in the population)
• On the basis of results of the LCA, it is possible to estimate all coverage errors
of the Population census in Bosnia and Herzegovina
• Main estimates have been calculated as follows:
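$$\mathrm{OvRate} = 1 - \frac{\sum_l p_{10,l} X_{10,l} + X_{11}}{\sum_l \left( X_{10,l} + X_{11,l} \right)} \tag{2}$$
$$\mathrm{DSE} = N_{census} \cdot \frac{1 - \mathrm{OvRate}}{1 - \mathrm{UnRate}} \tag{3}$$
$$\mathrm{UnRate} = 1 - \frac{X_{11}}{X_{.1}} \tag{4}$$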
• Main coverage errors are presented in the following three tables:
Census Coverage Errors (2)
Table 5. Main coverage errors at national level for individuals, absolute figures

Dual system estimate of the true population (DSE)                      3,352,760
Census count before correcting for collective households              3,530,159
Census count after correcting for collective households (Ncensus)     3,507,343
Eligible and correctly enumerated in the Census (Nce)                  3,311,191
Erroneous inclusion                                                      196,152
Omission (DSE-Nce)                                                        41,570
Census net coverage error (DSE-Ncensus)                                 -154,583

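The figures in Table 5 can be cross-checked against equation (3) with simple arithmetic. The Python sketch below does this under one stated assumption: the over-coverage rate is taken relative to the Census count, while the under-coverage and net error rates are taken relative to the DSE. The slides do not state these denominators, but this choice reproduces the published national rates. Variable names are illustrative.

```python
# Figures from Table 5 (individuals, national level).
n_census = 3_507_343  # Census count after correcting for collective households
n_ce = 3_311_191      # eligible and correctly enumerated in the Census
dse = 3_352_760       # dual system estimate of the true population

erroneous_inclusion = n_census - n_ce  # 196,152, as in Table 5
omission = dse - n_ce                  # 41,569 (Table 5 shows 41,570, presumably rounding)
net_error = dse - n_census             # -154,583, as in Table 5

# Assumed denominators: Census count for over-coverage, DSE for the other two.
ov_rate = erroneous_inclusion / n_census
un_rate = omission / dse
net_rate = net_error / dse

print(f"over-coverage rate:  {ov_rate:.2%}")   # 5.59%
print(f"under-coverage rate: {un_rate:.2%}")   # 1.24%
print(f"net error rate:      {net_rate:.2%}")  # -4.61%

# Equation (3) with the published (rounded) rates; the result is close to
# the published DSE, with a small gap caused by rounding of the rates.
dse_from_rates = n_census * (1 - 0.0559) / (1 - 0.0124)
print(round(dse_from_rates))
```

Because 1 - OvRate equals Nce/Ncensus under this reading, equation (3) amounts to dividing the correctly enumerated count by the share of the PES population that was matched.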
Table 6. Coverage errors at urban, rural, entity and national levels for individuals

DSE population estimate
                      Count       95% CI LL   95% CI UL
Urban                 1,413,957   1,410,191   1,417,722
Rural                 1,938,804   1,933,381   1,944,226
Federation of BiH     2,113,556   2,108,606   2,118,506
Republika Srpska      1,167,778   1,163,986   1,171,571
Brcko District        71,426      70,502      72,350
National              3,352,760   3,346,202   3,359,319

Net coverage error
                      Rate        95% CI LL   95% CI UL
Urban                 -5.34%      -5.62%      -5.06%
Rural                 -4.08%      -4.37%      -3.79%
Federation of BiH     -4.28%      -4.52%      -4.03%
Republika Srpska      -4.49%      -4.83%      -4.15%
Brcko District        -16.51%     -18.02%     -15.01%
National              -4.61%      -4.82%      -4.41%

Census omission
                      Rate        95% CI LL   95% CI UL
Urban                 1.42%       1.09%       1.75%
Rural                 1.19%       0.83%       1.56%
Federation of BiH     0.98%       0.68%       1.28%
Republika Srpska      1.81%       1.40%       2.23%
Brcko District        1.38%       -0.26%      3.02%
National              1.24%       0.99%       1.49%

Census erroneous inclusion
                      Rate        95% CI LL   95% CI UL
Urban                 6.76%       6.56%       6.96%
Rural                 5.27%       5.03%       5.52%
Federation of BiH     5.26%       5.07%       5.45%
Republika Srpska      6.30%       6.04%       6.56%
Brcko District        17.89%      16.84%      18.94%
National              5.85%       5.69%       6.01%

Census Coverage Errors (3)
Table 7. Coverage rates at urban, rural, entity and national level for individuals

Over-coverage
                      Rate        95% CI LL   95% CI UL
Urban                 6.41%       6.18%       6.65%
Rural                 5.07%       4.88%       5.26%
Federation of BiH     5.04%       4.86%       5.23%
Republika Srpska      6.03%       5.78%       6.28%
Brcko District        15.36%      14.48%      16.24%
National              5.59%       5.44%       5.75%

Under-coverage
                      Rate        95% CI LL   95% CI UL
Urban                 1.38%       1.21%       1.55%
Rural                 1.15%       1.02%       1.29%
Federation of BiH     0.96%       0.83%       1.09%
Republika Srpska      1.79%       1.61%       1.97%
Brcko District        1.35%       0.60%       2.11%
National              1.24%       1.14%       1.34%

Census Content Errors (1)
• Content quality control is conducted to measure errors due to
differences/inconsistencies between Census and PES data
• For this purpose, a 10% sub-sample of the PES sample was used (the sub-sample
consisted of 7,637 individuals, 2,114 households and 2,073 dwellings)
• Variability of answers is measured by five content quality indicators:
a) Net difference rate;
b) Index of inconsistency;
c) Aggregate index of inconsistency;
d) Gross difference rate;
e) Rate of agreement.
In this presentation, we focus only on content errors for
selected variables for individuals
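The five indicators listed above can be computed from a cross-classification of Census and PES answers. The Python sketch below uses the definitions commonly applied in reinterview studies; the slides do not state the exact formulas or the sign convention of the net difference rate (taken here as Census minus PES), so both are assumptions. The function name and the 2×2 table are hypothetical.

```python
from typing import Dict, List


def content_indicators(table: List[List[int]]) -> Dict[str, object]:
    """table[i][j]: units in category i at the Census and category j at the PES."""
    k = len(table)
    n = sum(sum(row) for row in table)
    census_tot = [sum(table[i]) for i in range(k)]
    pes_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    agree = sum(table[i][i] for i in range(k))  # units with the same answer

    ndr, idx = [], []
    for i in range(k):
        p_census, p_pes = census_tot[i] / n, pes_tot[i] / n
        # Share of units placed in category i by exactly one of the two sources.
        gdr_i = (census_tot[i] + pes_tot[i] - 2 * table[i][i]) / n
        # Disagreement expected if the two classifications were independent.
        expected = p_census * (1 - p_pes) + p_pes * (1 - p_census)
        ndr.append(p_census - p_pes)  # assumed sign: Census minus PES
        idx.append(gdr_i / expected if expected > 0 else float("nan"))

    expected_agree = sum(c * p for c, p in zip(census_tot, pes_tot)) / n
    return {
        "net_difference_rate": ndr,
        "index_of_inconsistency": idx,
        "gross_difference_rate": (n - agree) / n,
        "aggregate_index_of_inconsistency": (n - agree) / (n - expected_agree),
        "rate_of_agreement": agree / n,
    }


# Hypothetical two-category variable (not PES data): rows are Census answers,
# columns are PES answers.
result = content_indicators([[480, 10], [15, 495]])
print(result["rate_of_agreement"])      # 0.975
print(result["gross_difference_rate"])  # 0.025
```

Under these definitions the net difference rates of a variable sum to zero across its categories, and the gross difference rate and the rate of agreement sum to 100%.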
Census Content Errors (2)
Table 8. Content error indicators for selected variables and their categories at Census and PES

Variable / category               Net difference rate   Index of inconsistency

Sex
  M                               -0.13%                2.88%
  F                               0.00%                 3.30%
  Null                            0.13%                 99.48%
  Total: gross difference rate 2.45%, aggregate index of inconsistency 4.81%, rate of agreement 97.55%

Age
  0-14                            -0.08%                0.85%
  15-49                           -0.04%                1.34%
  50-64                           -0.01%                1.34%
  65+                             0.03%                 1.40%
  NULL                            0.10%                 52.04%
  Total: gross difference rate 1.13%, aggregate index of inconsistency 1.72%, rate of agreement 98.87%

Marital status
  Never Married                   -1.74%                10.69%
  Married                         0.39%                 5.56%
  Divorced                        0.00%                 27.96%
  Widow/Widower                   -0.08%                3.62%
  Not applicable                  -0.07%                0.80%
  NULL                            1.49%                 94.14%
  Total: gross difference rate 5.15%, aggregate index of inconsistency 7.66%, rate of agreement 94.85%

Highest completed education
  No education                    0.14%                 15.83%
  Uncompleted basic education     0.25%                 32.25%
  Primary school                  -1.15%                34.45%
  Lower secondary school          -0.16%                11.82%
  High school                     0.24%                 38.42%
  Post-secondary school           -0.05%                1.76%
  Tertiary education              -0.16%                17.13%
  Not applicable                  -0.07%                0.80%
  NULL                            0.96%                 100.02%
  Total: gross difference rate 12.31%, aggregate index of inconsistency 16.45%, rate of agreement 87.69%

Census Content Errors (3)
Table 8. Content error indicators for selected variables and their categories at Census and PES,
cont.

Variable / category                       Net difference rate   Index of inconsistency

Activity status
  Employed                                -0.59%                16.93%
  Didn't work but has a job to return to  0.20%                 82.02%
  Unemployed                              -0.24%                19.67%
  Not applicable                          -0.07%                0.65%
  NULL                                    0.69%                 100.00%
  Total: gross difference rate 11.06%, aggregate index of inconsistency 17.76%, rate of agreement 88.94%

Citizenship
  BiH                                     0.64%                 19.58%
  BiH and other country                   -1.11%                20.16%
  Other country                           0.03%                 48.97%
  Without citizenship                     0.04%                 100.00%
  NULL                                    0.41%                 100.00%
  Total: gross difference rate 5.60%, aggregate index of inconsistency 21.35%, rate of agreement 94.40%

Conclusions (1)
• The 2013 Census of Population, Households and Dwellings in Bosnia and
Herzegovina was conducted in a very specific socio-economic and
political atmosphere
• Issue of the general “statistical culture” in Bosnia and Herzegovina
• BHAS was aware of the atmosphere in which the census would be conducted.
It prepared and conducted the Population Census in line with all relevant
international and European standards, definitions, regulations and
recommendations
• Apart from the IMO reports, which were prepared throughout the census activities, PES
indicators served as data for the final validation of the Population Census in BiH
• The PES provided the DSE of the true population in BiH and the main indicators of
census quality in terms of coverage and content
Conclusions (2)
• PES coverage results in absolute terms:
a) The DSE of the true population was 3,352,760, indicating that the Census
over-counted the population (the Census figure is 3,507,343 after correcting for institutional
households)
b) 196,152 inhabitants were erroneously included in the population count
c) 41,570 inhabitants were missed by the enumeration
d) The net census coverage error was -154,583
• PES coverage results in relative terms:
a) Over-coverage rate: 5.59%
b) Under-coverage rate: 1.24%
c) Net-coverage error rate: -4.61%
The coverage indicators show some over- and under-count, but not to an extent that
would indicate bad census quality.
Conclusions (3)
• Rates of agreement between Census and PES variables in terms of content ranged
between 87.69% and 98.87% (Highest completed education and Age, respectively)
• Aggregate indices of inconsistency were in the low range, except for one variable
(Citizenship: 21.35%, which is in the medium range)
• The average net difference rate for every analysed variable was in the low range
• Only for a few variables were individual net difference rates and individual indices of
inconsistency in the medium or high range, indicating the existence of problems
in understanding and reporting answers
• Data users should be careful when using these results, while statisticians must provide
clearer definitions and wording in the design of survey instruments and more probing
in the course of the interview in future work
• The overall quality of the 2013 Census of Population, Households and Dwellings in BiH
can be evaluated as good and usable for statistical purposes and policy making,
both in terms of coverage and content.