A generic tool to assess impact of changing edit rules in a business survey – SNOWDON-X

Pedro Luis do Nascimento Silva, Robert Bucknall, Ping Zong, Alaa Al-Hamad

Business survey editing in the ONS
• Uses complex sets of edit rules to:
– Check returned questionnaires (records)
– Locate suspicious or unacceptable responses
– Support data cleaning operations
• Edit sets are complex because they may:
– Involve a large number of survey forms and variables
– Contain a large number of edits
– Define complex acceptance / rejection regions
– Depend on a large number of tolerance parameters

Editing costs are high
• The estimated cost of editing is over 40% of the survey process budget
• Edits cause large numbers of record failures
• Edit failures are mainly dealt with by manual follow-up, re-contacting respondents

Aim of paper
• Describe a generic tool developed to assess the potential impact of changing the edits in any specified business survey
• Present an example of applying the tool (SNOWDON-X) to a large-scale annual business survey

Edit revision strategies for efficiency saving
• Filtering or sub-setting
– Introduces a record filter that selects the records to be submitted to the full set of edits
• Gate widening
– Revises the tolerance parameters (gates) in individual edit rules, so that suspicious records are flagged for revision less often than with the previously used values
• Edit deletion
– Discards some of the edits previously used to flag suspicious records

SNOWDON tool
• A SAS program first developed by Al-Hamad, Martín and Brown (2006)
• Developed to enable informed decision making when revising business survey edits
• Aims to “… help survey managers evaluate what savings can be achieved, at what cost to output quality, across many alternative permutations of editing rule parameters.”
• Limited to single-variable surveys, with only gate widening considered

SNOWDON-X tool
• Extended functionality compared to SNOWDON
• Uses SAS IML language for increased performance
• Can handle all three edit revision strategies
• Can handle multivariate surveys
• Provides a wealth of summary indicators relating to:
– Expected savings achievable by edit revision
– Expected bias to survey results, both overall and per variable
– Information on performance of individual edit rules / variables
• Simple to run, once data have been properly organised

Basic scenario
• Previous survey data available in two versions
– Unedited (raw) – at point of capture or prior to any editing
– Edited (clean) – at point of publication or after all editing
• Edit rules used to clean previous survey data are known
• Key idea of SNOWDON-X tool
1. Increase tolerance of some edits (or delete edits or introduce a filter if necessary)
2. Calculate indicators of impact of changes to edits
3. Repeat 1. and 2. until expected savings reach the specified level or quality measures reveal unacceptable bias

Key assumptions behind approach
1. Future survey edition will behave similarly to previous survey edition
2. Edited data from previous survey edition are ‘clean’ or error free
3. Changes to ‘raw’ data in previous survey edition were due to error correction, i.e., any values changed between capture and final were ‘wrong’
4. Once a record is flagged for clerical revision, all errors it contains will be located and corrected

What is required to run SNOWDON-X?
• Previous period unedited data
• Previous period edited data
• Original edits
• Revised edits
• Link between edits and variables

What is the output of SNOWDON-X?
• Descriptive indicators for the data set under analysis
• Indicators about individual variables and edits
• Indicators of impact due to changes to edit rules
• Modified ‘output’ survey data set

Core indicators
Each indicator is reported before and after edit revision:
• Total number of records
• Number of records failing at least one edit rule
• Proportion of records failing at least one edit rule
• Expected savings in number of records to be edited
• Missed error rate
• Average relative absolute global bias (RAGB) for all variables involved in edits
• Maximum RAGB for all variables involved in edits
• Overall hit rate, i.e. the proportion of times that fields were changed during validation when flagged by edits
• Overall false hit rate, i.e. the proportion of times that fields were flagged by edits but were not changed after validation

How to target edits for revision
1. Select most commonly used form type
2. Select edit failing largest proportion of records within each form type
3. Relax edit parameters to reduce proportion of failed records while keeping bias low
4. Repeat 2. and 3. for each form type until further savings are minimal or bias increases above specified threshold
5. Repeat for all relevant form types

ABI/2 (Retail questionnaire) – Number of failing records on original and revised edits
[Figure: Validation failures, Questionnaire RT205. Horizontal axis: edit test number; vertical axis: number of failures (0 to 1,600); series: Original, Revised.]

Results - applying SNOWDON-X to ABI/2 (Retail questionnaire)

Indicator                                               Before edit revision   After edit revision
Total number of records                                 3,809                  3,809
Number of records failing at least one edit rule        3,107                  2,934
Proportion of records failing at least one edit rule    81.57                  77.03
Expected savings in number of records to be edited      --                     173
Missed error rate                                       --                     3.6
Average RAGB for all variables involved in edits        --                     0.05
Maximum RAGB for all variables involved in edits        --                     0.21
Overall hit rate                                        38.01                  38.23
Overall false hit rate                                  61.99                  61.77

Results from applying SNOWDON-X to ABI/2

                                           Number of respondents failing validation
Sector (questionnaire)      Number of      Original        Revised         Savings (number of     Maximum relative bias
                            respondents    edit rules      edit rules      questionnaires)        (over all questions)
Catering                    1,712          606             548             58 (9.6%)              0.65%
Retail                      3,809          3,107           2,934           173 (5.6%)             0.21%
Motor Trades                1,716          810             720             90 (11.1%)             0.31%
Service Trades              5,108          1,162           1,095           67 (5.8%)              0.04%
Wholesale                   3,471          1,815           1,705           110 (6.1%)             0.29%
Property                    1,062          363             350             13 (3.6%)              0.05%
Production & Construction   5,826          2,348           2,242           106 (4.5%)             0.53%

Results summary
• Overall expected saving for ABI/2 ≈ 6% of previously edited records
• Largest expected bias occurs in Catering sector (0.65%), where a saving of 58 (9.6%) records was made
• Highest expected saving was made in Motor Trades sector (11.1%), with an expected bias of 0.31%

Conclusions
• Generic tool developed to assist edit revision
– Successfully applied to ABI/2
– Currently being applied to two monthly surveys
• SNOWDON-X tool enables focus on edit revision, not on programming the calculation of quality and savings indicators
• Further development required for:
– Impact on standard error estimates
– Improved usability
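
Illustrative sketch of the key idea
The basic scenario and a few of the core indicators can be illustrated with a short Python sketch. It is not the SAS IML implementation of SNOWDON-X. The names `Edit`, `ratio_gate` and `indicators`, the example variables and the toy data are all hypothetical. The bias formula used here (absolute difference between the simulated output total and the clean total, divided by the clean total, unweighted) is an assumption, since the slides do not define RAGB. The simulated output follows key assumption 4: a flagged record takes its clean values, an unflagged record keeps its raw values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Record = Dict[str, float]


@dataclass(frozen=True)
class Edit:
    """Hypothetical edit rule: fails(record) is True when the record is flagged."""
    variables: Sequence[str]          # link between the edit and its variables
    fails: Callable[[Record], bool]


def ratio_gate(upper: float) -> Edit:
    """Illustrative gate: flag records whose turnover per employee exceeds `upper`."""
    return Edit(("turnover", "employment"),
                lambda r: r["turnover"] / r["employment"] > upper)


def indicators(raw: List[Record], clean: List[Record],
               edits: List[Edit], variables: Sequence[str]) -> dict:
    # A record fails if it fails at least one edit rule.
    flagged = [any(e.fails(r) for e in edits) for r in raw]

    # Key assumption 4: flagged records are fully corrected (clean values),
    # unflagged records keep whatever was captured (raw values).
    output = [c if f else r for r, c, f in zip(raw, clean, flagged)]

    # ASSUMED bias measure per variable: |output total - clean total| / |clean total|.
    bias = {}
    for v in variables:
        clean_total = sum(c[v] for c in clean)
        bias[v] = abs(sum(o[v] for o in output) - clean_total) / abs(clean_total)

    # Field-level hit rate: share of flagged fields that were changed in validation.
    flagged_fields = {(i, v) for i, r in enumerate(raw)
                      for e in edits if e.fails(r) for v in e.variables}
    hits = sum(raw[i][v] != clean[i][v] for i, v in flagged_fields)
    hit_rate = hits / len(flagged_fields) if flagged_fields else 0.0

    return {"failing": sum(flagged), "bias": bias,
            "max_bias": max(bias.values()), "hit_rate": hit_rate}


# Toy data: previous period raw (unedited) and clean (edited) records.
VARS = ("turnover", "employment")
raw = [{"turnover": 100, "employment": 10},
       {"turnover": 900, "employment": 10},   # keying error, corrected to 90
       {"turnover": 260, "employment": 10},   # unusual but correct
       {"turnover": 300, "employment": 10}]   # small error, corrected to 280
clean = [{"turnover": 100, "employment": 10},
         {"turnover": 90, "employment": 10},
         {"turnover": 260, "employment": 10},
         {"turnover": 280, "employment": 10}]

before = indicators(raw, clean, [ratio_gate(20)], VARS)   # original gate
after = indicators(raw, clean, [ratio_gate(50)], VARS)    # widened gate
saved = before["failing"] - after["failing"]

print("records failing before/after:", before["failing"], after["failing"])
print("expected savings (records):", saved)
print("maximum bias after revision:", round(after["max_bias"], 4))
print("hit rate before/after:", round(before["hit_rate"], 2), after["hit_rate"])
```

In this toy case, widening the gate removes two records from clerical review at the price of letting one small turnover error through, which is the trade-off that steps 1 to 3 of the key idea iterate over.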