European Social Survey
ESS 2012
Documentation of the Spanish sampling procedure

The 2012 sampling design incorporates small innovations to the 2010 design. These are:

a) Changes in the number of individuals in each PSU

The analysis of the 2010 data has shown that the differences in response rate between the three larger size brackets have decreased. Significant divergences in the response rate only remain between the bracket with less than 10,001 inhabitants and the rest. Therefore, anticipating that these differences will persist in the sixth round, for the 2012 sample 7 individuals will be selected per PSU in the first three brackets and 6 individuals in the smallest bracket.

b) Calculation of the inclusion probabilities

Following the recommendation of the ESS Round 6 Guidelines, the effect due to variation in inclusion probabilities has also been taken into consideration when estimating the design effect.

Additionally, in order to achieve a more accurate sample size for the 2012 round, the estimates of the ineligibility rate, the response rate and the mean response rate in each cluster have been obtained as a mean of the 2008 and 2010 results. The continuous updating of the municipal registers and the fieldwork improvements justify calculating these estimates from the most recent data. The design effect, however, has been estimated using the data from 2006 and 2010. The 2008 data have been excluded from that analysis because the ESS Round 4 design oversampled two regions (Catalonia and Galicia).

A. TARGET POPULATION

The population consists of all people aged 15 or older who are resident within private households in Spain (including Ceuta and Melilla), regardless of nationality, citizenship or language.

B.
SAMPLING FRAME

The sampling frame for the 2012 ESS sample is the Spanish population register, structured in census sections and taken from the Continuous Register (Padrón Continuo) as updated in December 2011 by the Instituto Nacional de Estadística (INE, the Public Statistics Office of Spain). Taking the Continuous Register as the sampling frame ensures the best available coverage of the population residing in Spain.

The Continuous Register is updated using the municipal registers: when a citizen moves from one municipality to another, s/he has to notify the local authorities of his/her new place of residence. Registration grants access to the public health services, public schools and other public services, and also allows the electoral register to be updated. The law obliges every Spanish city council to send the data from its register to the INE once a year. This process guarantees that the national Continuous Register of inhabitants is kept up to date. Foreigners usually register in the municipal rolls in order to benefit from welfare services, even if they are not legal residents in the country.

C. SAMPLE DESIGN

The proposed design for the 2012 round of the ESS is a stratified two-stage sample design. The strata are obtained by crossing two population classification criteria. The first criterion is the Autonomous Community or region of residence (there are 17 of them, plus one more grouping the North-African autonomous cities of Ceuta and Melilla).
The second criterion (the type of habitat) distinguishes four types of habitat according to their size:

• The first bracket: cities with more than 100,000 inhabitants aged 15+
• The second bracket: cities with between 50,001 and 100,000 inhabitants aged 15+
• The third bracket: municipalities with between 10,001 and 50,000 inhabitants aged 15+
• The fourth bracket: municipalities with less than 10,001 inhabitants aged 15+

The difference in response rates between the first three brackets and the fourth one justifies maintaining this stratification. More details about the rationale behind the proposed stratification can be found in Appendix A.1.

The cross-tabulation of the two criteria gives a total of 72 theoretical strata (18x4), of which only 64 are effective. In each stratum the two sampling stages are the following:

1. In the first stage, a fixed number of census sections is drawn with probability proportional to the number of inhabitants aged 15+ in each section. Thus, census sections are the primary sampling units (PSUs) [1].

2. In the second stage, for each PSU selected in the previous stage, 6 or 7 individuals will be randomly drawn: 7 in the sections belonging to the first three brackets and 6 in those belonging to the fourth. As mentioned above, the analysis of the 2008 and 2010 rounds showed that the response rates in the first three brackets have converged, although they remain significantly different from the response rate in the fourth bracket (see Table A.1 in Appendix A.1).

The probabilities of inclusion of sections and individuals are provided by the INE.

D. DESIGN EFFECTS

The effect due to variation in inclusion probabilities has been included in the design effect of the Spanish sample for the first time. The total design effect (DEFF) is now calculated as the product of the design effect due to clustering (DEFFc) and the design effect due to the variation in inclusion probabilities (DEFFp).
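The product rule for the total design effect can be illustrated with a short sketch. The document does not spell out how DEFFp is computed; the function below assumes Kish's approximation for unequal weights, n·Σw² / (Σw)², which is a common choice but is an assumption here. The function names and the example weights are hypothetical; only the values 1.177 and 1.083 come from the document.

```python
def deff_unequal_probabilities(weights):
    """Design effect due to unequal inclusion probabilities.

    Assumption: Kish's approximation, n * sum(w^2) / (sum(w))^2,
    where w are the design weights (inverse inclusion probabilities).
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def total_deff(deff_c, deff_p):
    """Total design effect as the product of its two components."""
    return deff_c * deff_p


# Hypothetical weights: equal weights give no extra design effect.
equal_weights = [1.0] * 10
# Hypothetical weights: half the sample carries twice the weight.
unequal_weights = [1.0] * 5 + [2.0] * 5

print(deff_unequal_probabilities(equal_weights))
print(deff_unequal_probabilities(unequal_weights))

# Values reported in the document for the 2012 round.
print(round(total_deff(1.177, 1.083), 3))
```

With equal weights the first call returns exactly 1, so any value above 1 reflects the loss of precision caused by varying inclusion probabilities.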
For the estimation of the clustering effect in the 2012 round, 22 variables from the 2006 round and 23 variables from the 2010 round have been used. See Appendix A.2 for the calculations leading to the final estimate of the 2012 mean design effect:

DEFF = DEFFc * DEFFp = 1.177 * 1.083 = 1.275

[1] There are 34,600 census sections in Spain. Census sections are the most elementary framing units of eligible voters. The size of the sections varies between 500 and 2,000 voters (18+ years old), with an average size of 1,300. Nevertheless, it should be stressed that although census sections are defined with regard to electoral processes, they are used here only to establish the boundaries of the administrative units used for sample designs. Census sections do include all citizens registered in the municipal registers, regardless of their voting rights.

E. RESPONSE RATE

The first two ESS rounds highlighted the difficulties of achieving the target response rate of 70% in Spain. The response rates were 53% in 2002 and 55% in 2004. However, thanks to the improvements implemented in the fieldwork plan and the serious involvement of the survey company in the way it conducts and monitors the fieldwork, the Spanish response rate rose to 66.8% in 2008 and to 68.5% in 2010. The goal of 70% is within reach. In the calculation of the 2012 sample size an estimated response rate of 69.1% has been used.

F. VALID CASES

The proportion of valid cases in the 2010 round was 0.96, lower than in 2008 (0.973), but still significantly higher than in Round 3 (0.870). Taking into account that the population data used in the 2012 sampling design have been recently updated (December 2011), we expect a high eligibility rate for the sixth round. We estimate the 2012 eligibility rate as a weighted average of the last two eligibility rates, assigning a higher weight to the most recent one.

Estimated proportion of valid cases in the 2012 round = 0.960*(2/3) + 0.973*(1/3) = 0.964

G.
SAMPLE SIZE

In calculating the sample size, three quantities have to be estimated: the proportion of valid cases, the response rate and the design effect. As explained in the previous sections, the estimates of the response rate and the eligibility rate for the 2012 design have been obtained from the two previous rounds, while the design effect has been estimated from the 2006 and 2010 data. The values of these estimates are:

Proportion of valid cases = 0.964
Mean response rate = 0.691
Design effect = 1.275

Taking the above information into account, the calculations to determine the sample size for the 2012 survey are the following:

Minimum effective sample size = 1,500
Net sample size = 1,500*1.275 = 1,912
Gross sample size = 1912.5/(0.964*0.692) = 2,869

In the process of assigning individuals to each stratum proportionally to the population aged 15+, the constraint of taking an integer number of sections in each stratum has led to a slight modification of the total. Therefore, the total sample size is 2,868.

Table 1 presents the distribution of the number of sections and individuals to be selected in each stratum, proportional to the population aged 15+. See Appendix A.3 for the distribution of the 2012 Spanish population aged 15+.

Table 1.
Distribution of sections and individuals by strata (proportionally to the population)

Number of sections, by size of habitat

Region                More than  50,001 to  10,001 to  Less than   Total
                        100,000    100,000     50,000     10,001
Andalucía                    25         13         20         17      75
Aragón                        6          0          2          4      12
Asturias                      5          1          3          2      11
Baleares                      4          0          5          2      11
Canarias                      7          2          8          2      19
Cantabria                     2          0          2          2       6
Castilla y León               7          3          3         11      24
Castilla-La Mancha            2          3          5         10      20
Cataluña                     27          8         17         14      66
Valencia                     14          6         17          9      46
Extremadura                   1          1          2          5       9
Galicia                       5          4          8          9      26
Madrid                       41          7          6          4      58
Murcia                        6          1          5          1      13
Navarra                       2          0          1          3       6
País Vasco                    7          2          6          4      19
La Rioja                      1          0          0          1       2
Ceuta y Melilla               0          1          0          0       1
Total                       162         52        110        100     424

Number of individuals, by size of habitat

Region                More than  50,001 to  10,001 to  Less than   Total
                        100,000    100,000     50,000     10,001
Andalucía                   175         91        140        102     508
Aragón                       42          0         14         24      80
Asturias                     35          7         21         12      75
Baleares                     28          0         35         12      75
Canarias                     49         14         56         12     131
Cantabria                    14          0         14         12      40
Castilla y León              49         21         21         66     157
Castilla-La Mancha           14         21         35         60     130
Cataluña                    189         56        119         84     448
Valencia                     98         42        119         54     313
Extremadura                   7          7         14         30      58
Galicia                      35         28         56         54     173
Madrid                      287         49         42         24     402
Murcia                       42          7         35          6      90
Navarra                      14          0          7         18      39
País Vasco                   49         14         42         24     129
La Rioja                      7          0          0          6      13
Ceuta y Melilla               0          7          0          0       7
Total                      1134        364        770        600    2868

Appendix A.1: Stratification

We discuss below the reasons for stratification by region and type of habitat:

Stratification by region. It is common practice for social surveys in Spain to stratify by Autonomous Community (region). This procedure is based on the observed socio-economic, political and cultural differences. The analysis of the first five rounds of the ESS corroborated those differences among regions for some variables and thus the benefit of stratifying by Autonomous Community, even though the mean design effect due to stratification seems to be negligible.

Stratification by type of habitat. Since the first stratification in the 2002 round, the stratification by type of habitat has been improved. Response rates by habitat bracket justify the applied stratification: the bigger the city, the lower the response rate.
Results of the second round also suggested the need to reconsider the stratification with the aim of reducing the heterogeneity in town size within the strata. Thus, from the third round onwards the stratification has been composed of the following four brackets: cities with more than 100,000 inhabitants, cities with between 50,001 and 100,000 inhabitants, municipalities with between 10,001 and 50,000 inhabitants and, finally, municipalities with less than 10,001 inhabitants.

Table A.1. Expected response rate (%) and size of cluster by bracket

Bracket                       2008      2010      Weighted  2012        Proportion of  Individuals
                              response  response  mean      previsions  population     by section
                              rate      rate
More than 100,000             62.4      66.7      65.3      67.0        0.38           7
Between 50,001 and 100,000    65.2      66.3      65.9      67.0        0.12           7
Between 10,001 and 50,000     67.6      65.6      66.3      67.0        0.26           7
Less than 10,001              74.8      76.3      75.8      76.0        0.24           6
Total                         66.8      68.5      67.9      69.1        1.00           6.76

Appendix A.2: Design effects

The design effect due to variation in inclusion probabilities and the design effect due to clustering for the 2012 round are both estimated as a weighted mean of the corresponding design effects in the 2006 and 2010 data. The special sampling design of the Spanish ESS R4 (with an oversampling of the regions of Catalonia and Galicia) makes the 2008 data inappropriate for these calculations.

The formula used to calculate the design effect due to clustering is the following:

DEFFc = 1 + (k − 1) · ρ

where:

ρ = intra-group correlation coefficient
k = estimated average number of completed interviews per cluster

To estimate the intra-group correlation coefficient for the 2012 round we have used the data from the 2006 and 2010 rounds for a group of numerical, ordinal and dummy variables. All of the variables were also used in the 2010 design. Some of the ordinal variables were also used in the 2006 design and others were proposed by the ESS experts' panel.
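As a small numerical illustration of the clustering formula, the sketch below evaluates DEFFc = 1 + (k − 1)·ρ and a weighted mean of two rounds. The function names are hypothetical. The 1/3 and 2/3 weights are those the document states for the proportion of valid cases; applying them to ρ and DEFFc is an assumption, although it agrees with the figures reported in Tables A.2.1 to A.2.3.

```python
def deff_clustering(k, rho):
    """Design effect due to clustering: 1 + (k - 1) * rho."""
    return 1.0 + (k - 1.0) * rho


def weighted_mean(older, newer, newer_weight=2 / 3):
    """Weighted mean of two rounds, giving more weight to the newer one.

    Assumption: weights of 1/3 (older round) and 2/3 (newer round).
    """
    return older * (1.0 - newer_weight) + newer * newer_weight


# Mean intra-group correlation reported for 2006 and 2010 (Table A.2.2).
rho_2012 = weighted_mean(0.046, 0.058)

# Design effects due to clustering reported for 2006 and 2010 (Table A.2.3).
deff_c_2012 = weighted_mean(1.151, 1.190)

# Direct evaluation of the formula with the reported k and mean rho.
deff_c_direct = deff_clustering(4.257, 0.054)

print(round(rho_2012, 3))
print(round(deff_c_2012, 3))
print(round(deff_c_direct, 3))
```

Evaluating the formula directly with the rounded k and ρ gives a value within about 0.001 of the weighted mean of the two rounds' design effects, which is what one would expect from rounding in the published inputs.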
Table A.2.1 provides the prevision for the average number k of completed interviews per cluster, while Table A.2.2 displays the list of selected variables and the calculation of the intra-group correlation coefficient used in the estimation of the design effect due to clustering.

The intra-group correlation coefficients (ρ) for each of the variables have been estimated using two-level variance decomposition models. Each PSU is considered as a group or cluster (level-2 unit).

We computed the mean number of completed interviews per cluster in the 2006 and 2010 rounds. The prevision for k is the average of these two means.

Table A.2.1. Estimation of the mean response rate per cluster

     2008 data   2010 data   Weighted mean   Mean (2012 prevision)
k    4.223       4.274       4.257           4.257

Table A.2.2. Intra-group correlation coefficient

Variable     2006 ρ   2010 ρ   Mean ρ

Ordinal
PPLTRST      0.023    0.095    0.106
PPLFAIR      0.008    0.083    0.098
PPLHLP       0.087    0.133    0.168
POLINTR      0.088    0.073    0.088
TRSTLGL      0.084    0.063    0.096
TRSTPLT      0.098    0.060    0.095
STFECO       0.016    0.077    0.100
STFGOV       0.038    0.054    0.069
STFDEM       0.098    0.088    0.111
DISCRIGV     -        0.008    0.013

Numerical
HHMMB        0.023    0.034    0.036
YRBRN        0.028    0.064    0.060
EDUYRS       0.108    0.168    0.161
PDJOBYR      0.093    0.012    0.041
WKHCT        0.070    0.047    0.052

Dummy
VOTE         0.031    0.035    0.038
PDJOBEV      0.008    0.028    0.045
MOCNTR       0.049    0.074    0.078
GNDR         0.004    0.002    0.011
UEMP3M       0.024    0.016    0.036
UEMP12M      0.031    0.030    0.083
UEMP5YR      0.002    0.002    0.022
CHLDHHE      0.010    0.098    0.079

Total        0.046    0.058    0.054

Finally, the design effect due to clustering for the 2012 round is estimated as follows:

Table A.2.3. Estimation of design effects

         2006 data   2010 data   2012 prevision
DEFFp    1.016       1.117       1.083
DEFFc    1.151       1.190       1.177

The total design effect is:

DEFF = DEFFc * DEFFp = 1.177 * 1.083 = 1.275

Appendix A.3: Assignment of the number of individuals and sections to strata

Table A.3.1 shows the distribution of the Spanish population in the 64 strata considered:

Table A.3.1.
2011 Spanish population aged 15 and over, by stratum (size of habitat)

Region                    More than    50,001 to    10,001 to    Less than        Total
                            100,000      100,000       50,000       10,001
Andalucía                 2,342,460    1,182,786    1,917,987    1,607,709    7,050,942
Aragón                      580,417            0      187,948      393,260    1,161,625
Asturias                    446,283       74,156      275,037      171,227      966,703
Baleares                    345,586            0      442,203      155,913      943,702
Canarias                    654,156      203,136      745,215      212,934    1,815,441
Cantabria                   158,664            0      146,169      210,064      514,897
Castilla y León             680,984      238,810      298,072    1,034,076    2,251,942
Castilla - La Mancha        143,480      277,547      449,496      919,089    1,789,612
Catalonia                 2,612,271      770,355    1,654,450    1,334,712    6,371,788
Comunidad Valenciana      1,316,461      525,168    1,630,670      880,049    4,352,348
Extremadura                 126,435       80,348      238,331      504,445      949,559
Galicia                     476,448      400,118      751,507      843,230    2,471,303
Madrid                    3,889,789      692,737      544,807      362,683    5,490,016
Murcia                      542,694      130,714      454,700       82,878    1,210,986
Navarra                     170,565            0       98,719      273,677      542,961
País Vasco                  679,471      210,303      583,203      417,275    1,890,252
Rioja (La)                  129,377            0       33,170      113,084      275,631
Ceuta y Melilla                   0      126,607            0            0      126,607
Total                    15,295,541    4,912,785   10,451,684    9,516,305   40,176,315

Source: INE, 2011 Continuous Municipal Register

The total number of sections to be selected comes from the gross sample (2,869) divided by the average number of individuals per section (6.76, see Table A.1), giving an initial total of 424 sections. This total has been distributed among the four habitat brackets proportionally to their population aged 15+. Assigning 7 individuals per section in the first three brackets and 6 in the fourth gives the final distribution of individuals to be selected in each bracket, as provided in Table 1.
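The arithmetic behind the sample size and the section counts can be retraced with a short sketch. All function and variable names are hypothetical. The inputs are the rounded figures printed in section G, Table 1 and Table A.1, so the result only approximates the published gross sample of 2,869: the printed inputs give roughly 2,867 with a response rate of 0.692 and roughly 2,871 with 0.691. The sketch does not attempt to reproduce the stratum-level allocation, whose integer constraints are not fully specified in the document.

```python
def gross_sample_size(n_effective, deff, valid_rate, response_rate):
    """Return (net, gross) sample sizes for a target effective sample size."""
    # Interviews needed once the design effect is accounted for.
    net = n_effective * deff
    # Individuals to select, allowing for ineligibility and non-response.
    gross = net / (valid_rate * response_rate)
    return net, gross


# Inputs as printed in section G (the formula there uses 0.692).
net, gross = gross_sample_size(1500, 1.275, 0.964, 0.692)

# Sections per habitat bracket, copied from the "Total" row of Table 1.
SECTIONS = {
    "more than 100,000": 162,
    "50,001 to 100,000": 52,
    "10,001 to 50,000": 110,
    "less than 10,001": 100,
}
# Individuals drawn per section in each bracket (section C).
PER_SECTION = {
    "more than 100,000": 7,
    "50,001 to 100,000": 7,
    "10,001 to 50,000": 7,
    "less than 10,001": 6,
}

individuals = {b: SECTIONS[b] * PER_SECTION[b] for b in SECTIONS}
total_sections = sum(SECTIONS.values())
total_individuals = sum(individuals.values())
mean_per_section = total_individuals / total_sections

# Initial number of sections: published gross sample divided by the
# average cluster size reported in Table A.1.
initial_sections = round(2869 / 6.76)

print(round(net, 1), round(gross))
print(individuals)
print(total_sections, total_individuals, round(mean_per_section, 2))
print(initial_sections)
```

Multiplying the section totals by 7 or 6 individuals recovers the bracket totals of Table 1 and the final sample of 2,868, and the implied mean cluster size matches the 6.76 shown in Table A.1.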