How similar are different calibration estimators in the presence of a zero-inflated auxiliary variable? Evidence from the German job vacancy survey

Hans Kiesl
Institute for Employment Research (IAB), Germany

NTTS 2009 – New Techniques and Technologies for Statistics
Brussels • February 18-20, 2009

Background

Regulation (EC) No. 453/2008 of the European Parliament and of the Council of 23 April 2008 on quarterly statistics on Community job vacancies

- Member states have to provide
  - quarterly data on job vacancies (broken down to NACE section level)
  - quality reports
- In Germany, the data will be provided by the IAB.

Background (2)

Information on job vacancies in Germany

- Business units might report job vacancies to the Federal Employment Agency
- The Federal Employment Agency publishes monthly statistics on the number of registered job vacancies (by NACE sector)
- Since 1989, the IAB has conducted a yearly (4th quarter) sample survey among business units to estimate the number of job vacancies (registered or not) and to get additional information (e.g.
  about recruiting strategies)
- Mail questionnaire (8 pages in length); voluntary CATI interviews in quarters 1 - 3

Basic estimation strategy

- stratified simple random sampling (by size classes and industry sector)
- calculate design weights as inverse (realized) sampling rate within each stratum
- calibrate design weights to known totals from external data
  - number of business units by size
  - number of business units by industry sector
  - number of employees by size
  - number of employees by industry sector
  - number of registered vacancies by industry sector

Calibration estimators (1)

RAKCON

- raking estimator with weight restrictions
- within each stratum only two different weights allowed
  - units with vacancies, units without vacancies
  - reason: control variance of weights and variance of estimates
- start with design weights and repeat the following two steps until convergence of weights:
  - proportional fitting of weights for units with vacancies to number of registered vacancies by sector
  - iterative proportional fitting of all weights to number of units by size and by sector

Calibration estimators (2)

Generalized regression estimator (GREG)

- minimizes

$$\mathrm{dist}(w, d) = \sum_{i=1}^{n} \frac{(w_i - d_i)^2}{q_i d_i}$$

  so that

$$\hat{t}_{X_j} = \sum_i w_i x_{ij} = t_{X_j}$$

- GREG1 calibrated to
  - number of units by size
  - number of units by sector
  - number of registered vacancies by sector
- GREG2 additionally calibrated to
  - number of employees by size
  - number of employees by sector

Calibration estimators (3)

Generalized regression estimator (GREG) with weight restrictions

$$\mathrm{dist}(w, d) = \sum_{i=1}^{n} \frac{(w_i - d_i)^2}{q_i d_i} = \sum_h \left[ \sum_{i \in N_h^1} \frac{(w_i - d_i)^2}{q_i d_i} + \sum_{i \in N_h^2} \frac{(w_i - d_i)^2}{q_i d_i} \right]$$

$$= \sum_h \left[ \sum_{i \in N_h^1} \frac{(w_h^1 - d_h^1)^2}{q_i d_h^1} + \sum_{i \in N_h^2} \frac{(w_h^2 - d_h^2)^2}{q_i d_h^2} \right]$$

$$= \sum_h \left[ \frac{(w_h^1 - d_h^1)^2}{d_h^1} \sum_{i \in N_h^1} \frac{1}{q_i} + \frac{(w_h^2 - d_h^2)^2}{d_h^2} \sum_{i \in N_h^2} \frac{1}{q_i} \right]$$

- GREGCON1: N1 = set of units with vacancies
- GREGCON2: N1 = set of units with registered vacancies

Result of different calibration estimators

4th quarter 2007, Germany (west)
realized sample size: 7,485 (response rate: 20%)

Algorithm used    Estimated # of job vacancies, Germany (west), 4th quarter 2007
RAKCON            994,735
GREGCON1          951,386
GREGCON2          848,178
GREG1             848,184
GREG2             812,513

Highly skewed distribution of job vacancies

% of 0's

Size       Total # of vacancies    # of registered vacancies
1-10       91%                     97%
10-19      86%                     96%
20-49      77%                     91%
50-199     68%                     86%
200-499    48%                     75%
500 +      36%                     71%

[Figure: distributions of total # of vacancies (excluding 0) and # of registered vacancies (excluding 0); axis ticks 0, 100, 200, 300]

Simulation study

- Create synthetic population by sampling with replacement from original sample
- Draw 300 samples from synthetic population with same sampling design and realized sample sizes as original sample
- Calculate all estimators described above
- Repeat for different nonresponse models
  - RHG1: equal response probability within strata
  - RHG2: equal response probabilities within two groups (units with and without vacancies) in every stratum
  - RHG3: equal response probabilities within two groups (units with and without registered vacancies) in every stratum

Sampling distributions under RHG 1

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1 and greg2 under nonresponse model RHG1; axis from 750,000 to 1,000,000]

Sampling distributions under RHG 2

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1 and greg2 under nonresponse model RHG2; axis from 700,000 to 1,100,000]

Sampling distributions under RHG 3

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1 and greg2 under nonresponse model RHG3; axis from 800,000 to 1,000,000]

Two step GREG estimation

- If we accept RHG2, unconstrained GREG is biased.
- No information in the frame or among non-responding units to directly estimate the response probabilities.
- Suggestion: two step GREG estimation.
  - First step: GREG estimation, calibrating to registered vacancies
  - Using the calibrated weights, we can get estimates for response probabilities.
  - Second step: adjust design weights for different response probabilities, add another GREG estimation step

How do we estimate response probabilities?
$$\hat{N}^{\text{unreg vac}} = \hat{N}^{\text{reg vac}} \cdot \frac{n^{\text{unreg vac}}}{n^{\text{reg vac}}}$$

$$\hat{N}^{\text{no vac}} = N - \hat{N}^{\text{reg vac}} - \hat{N}^{\text{unreg vac}}$$

[Diagram: the population of size $N$ splits into $\hat{N}^{\text{reg vac}}$, $\hat{N}^{\text{unreg vac}}$ and $\hat{N}^{\text{no vac}}$. 1st stage: the sample is drawn with equal inclusion probabilities. Under model RHG 2, the groups with registered and with unregistered vacancies share response probability 1 and the group without vacancies has response probability 2. The respondents number $n^{\text{reg vac}}$, $n^{\text{unreg vac}}$ and $n^{\text{no vac}}$.]

Sampling distributions under RHG 1

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1, greg2 and 2 step greg under nonresponse model RHG1; axis from 750,000 to 1,000,000]

Sampling distributions under RHG 2

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1, greg2 and 2 step greg under nonresponse model RHG2; axis from 700,000 to 1,100,000]

Sampling distributions under RHG 3

[Figure: sampling distributions of rakcon, gregcon1, gregcon2, greg1, greg2 and 2 step greg under nonresponse model RHG3; axis from 750,000 to 1,000,000]

Conclusions

- Weight restrictions lead to larger variance of estimators.
- Calibration estimators work under an implicit nonresponse model.
- Two step GREG estimator applicable if
  - theory suggests certain response homogeneity groups,
  - there is no complete information about RHG membership in the frame or among the non-responding units,
  - the only information is an auxiliary variable applicable for calibration which identifies part of the RHG group.
- Special case: existence of a zero-inflated calibration variable with the property that units with a value greater than zero are in the same RHG, but units with a value of zero might be in different RHGs.

Thank you very much for your attention!
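
Supplementary sketch: the two step GREG idea from the slides can be illustrated with a small Python example. This is a minimal sketch under stated assumptions, not the implementation used for the survey. It assumes a single stratum, q_i = 1, linear (chi-square distance) calibration in closed form, and one possible reading of the second step in which the design weights are replaced by estimated group size divided by number of respondents. The function names `greg_weights` and `estimate_group_sizes` and all toy numbers are hypothetical.

```python
import numpy as np


def greg_weights(d, X, totals, q=None):
    """Linear calibration: minimise sum((w - d)**2 / (q * d))
    subject to X.T @ w == totals."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    q = np.ones_like(d) if q is None else np.asarray(q, dtype=float)
    # Closed-form solution: w_i = d_i * (1 + q_i * x_i' lam)
    M = X.T @ ((d * q)[:, None] * X)
    lam = np.linalg.solve(M, np.asarray(totals, dtype=float) - X.T @ d)
    return d * (1.0 + q * (X @ lam))


def estimate_group_sizes(N, N_reg_hat, n_reg, n_unreg):
    """Under RHG2, units with registered and with unregistered vacancies
    respond at the same rate, so the respondent ratio n_unreg / n_reg
    is carried over to the population."""
    N_unreg_hat = N_reg_hat * n_unreg / n_reg
    N_novac_hat = N - N_reg_hat - N_unreg_hat
    return N_unreg_hat, N_novac_hat


# Hypothetical toy stratum: N = 100 units, 10 respondents,
# 20 registered vacancies known from external data.
N = 100
vac = np.array([0, 0, 0, 0, 0, 0, 3, 2, 1, 1])  # all vacancies per respondent
reg = np.array([0, 0, 0, 0, 0, 0, 0, 2, 0, 1])  # registered vacancies
d = np.full(len(vac), N / len(vac))             # inverse realized sampling rate
X = np.column_stack([np.ones(len(vac)), reg])   # calibrate to N and registered total
totals = np.array([N, 20.0])

# First step: GREG calibrated to registered vacancies.
w1 = greg_weights(d, X, totals)
N_reg_hat = w1[reg > 0].sum()
n_reg = int((reg > 0).sum())
n_unreg = int(((vac > 0) & (reg == 0)).sum())
n_novac = int((vac == 0).sum())
N_unreg_hat, N_novac_hat = estimate_group_sizes(N, N_reg_hat, n_reg, n_unreg)

# Second step (assumed form): adjust weights per response group,
# then calibrate once more.
d_adj = np.where(
    vac > 0,
    (N_reg_hat + N_unreg_hat) / (n_reg + n_unreg),
    N_novac_hat / n_novac,
)
w2 = greg_weights(d_adj, X, totals)

print("one step estimate of vacancies:", float(w1 @ vac))
print("two step estimate of vacancies:", float(w2 @ vac))
```

Both sets of weights reproduce the calibration totals exactly. The two estimates differ only through the starting weights, which is where the assumed response model enters.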