Remarks for data cleaning SHARE Data Cleaning Workshop

The Analysis of Interviewers' Remarks
Laura Crespo
Spanish team
CEMFI
Data cleaning workshop
Berlin, 8-10 June 2009

This is based on my presentation for Wave 2 in Frankfurt, 6 December 2007.

It draws on the remarks and feedback from PL, NL, BE-fr, DK, GR and ES
from Wave 2.
Comments and suggestions based on other countries' experiences are very
welcome!
Reminder: When should a “remark” be recorded?
• When a response (or non-response) needs to be commented on.
• When Blaise does not accept the answer provided by the respondent.
• When a response is difficult to code.
• When a response needs to be clarified.

Therefore,
Good news:
• They may contain very useful info for data cleaning (and also for
  SHARE users, working groups, country teams and even the survey
  agency).
• They are an important source of info for detecting errors, missing info,
  clarifications and problems with questions. One of the first things to
  look into.
Bad news:
• Very iwer-specific (large heterogeneity across iwers, questions, and
  even countries).
• They need a case-by-case analysis. Very time-consuming.
• At some point, they will need to be translated into English.
Dealing with iwer remarks: Steps
Step 1: They will be provided by MEA in an Excel file with a particular
format for categorization.
Step 2: Have a look at them and try to define specific categories based on
their content and potential use. Very often we will need to check
the corresponding question to fully understand the remark.
Categories:
1. Remarks that should be investigated for data cleaning.
2. Useful remarks for researchers, working groups or country teams.
3. Both (useful for data cleaning and SHARE users).
4. Other remarks that should be investigated.
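
As a rough illustration of Step 2, the four categories can be held in a small lookup and tallied once the remarks have been categorized by hand. The sketch below is in Python and assumes the Excel file has been exported to rows with hypothetical columns `module`, `remark` and `category`; the actual MEA format may differ, and the example rows are made up.

```python
from collections import Counter

# Category codes as listed above; the short labels are illustrative.
CATEGORIES = {
    1: "data cleaning",
    2: "useful for researchers, working groups or country teams",
    3: "both",
    4: "other, to be investigated",
}

def tally_by_category(rows):
    """Count hand-categorized remarks per category code; reject unknown codes."""
    counts = Counter()
    for row in rows:
        code = int(row["category"])
        if code not in CATEGORIES:
            raise ValueError(f"unknown category {code!r} in row {row}")
        counts[code] += 1
    return counts

def needs_cleaning(row):
    """Categories 1 and 3 are the ones carried forward to Step 3."""
    return int(row["category"]) in (1, 3)

# Made-up example rows (hypothetical column names and content).
rows = [
    {"module": "EP", "remark": "amount given in old currency", "category": "1"},
    {"module": "GS", "remark": "test interrupted, R felt unwell", "category": "2"},
    {"module": "DN", "remark": "proxy answered for R", "category": "3"},
]
counts = tally_by_category(rows)
to_clean = [r for r in rows if needs_cleaning(r)]
```

The assignment of each remark to a category remains a manual, case-by-case judgement; the code only checks the codes and selects the subset for Step 3.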
Step 3: Focus on those that may be useful for data cleaning and identify
which correction should be made.
Step 4: Write programs to correct data or flag cases, following instructions
(example do-files from Wave 1 and Wave 2 provided by MEA?)
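
The do-files themselves are Stata programs and are not reproduced here. Purely to illustrate the logic of Step 4, the following Python sketch applies a reviewed list of instructions to interview records: each instruction either overwrites a value or sets a flag. The record layout, the `respid` key, the variable names, the `remark_flag` field and the example values are all assumptions made for this sketch.

```python
import copy

def apply_instructions(records, instructions):
    """Return a corrected copy of records plus a log; originals stay untouched.

    records:      dict mapping respondent id -> dict of variable -> value
    instructions: list of dicts with keys 'respid', 'variable', 'action'
                  ('correct' or 'flag') and, for corrections, 'new_value'
    """
    cleaned = copy.deepcopy(records)
    log = []
    for ins in instructions:
        respid, var = ins["respid"], ins["variable"]
        rec = cleaned.get(respid)
        if rec is None or var not in rec:
            # Unknown respondent or variable: record it instead of guessing.
            log.append(("skipped", respid, var))
        elif ins["action"] == "correct":
            log.append(("corrected", respid, var, rec[var], ins["new_value"]))
            rec[var] = ins["new_value"]
        elif ins["action"] == "flag":
            rec.setdefault("remark_flag", []).append(var)
            log.append(("flagged", respid, var))
        else:
            raise ValueError(f"unknown action: {ins['action']!r}")
    return cleaned, log

# Made-up records and instructions (hypothetical ids and variables).
records = {
    "ES-0001": {"year_retired": 1895, "monthly_pension": 600},
    "ES-0002": {"year_retired": 2001, "monthly_pension": 90000},
}
instructions = [
    {"respid": "ES-0001", "variable": "year_retired",
     "action": "correct", "new_value": 1995},
    {"respid": "ES-0002", "variable": "monthly_pension", "action": "flag"},
    {"respid": "ES-0003", "variable": "year_retired", "action": "flag"},
]
cleaned, log = apply_instructions(records, instructions)
```

Working on a copy and keeping a log means every change can be traced back to the remark that motivated it.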
Step 2: Categories with different colours/columns
1. Remarks that should be investigated for data cleaning:
• Specific amounts, frequencies, years, time periods (time
  consistency along the calendar or life cycle).
• Currencies (maybe less problematic than in Wave 2).
• Gross terms instead of net terms or vice versa.
• Answer category: Information that may be recorded or imputed to one of
  the categories already defined (instead of the “Other” option) or should
  be back-checked against the reported answer:
  – (RC) Sources of income maternity leave.
  – (RP) Reasons for not living with a partner.
  – (AC) Type of residence.
  – (RE) Situation at 15 if no education, occupation (ISCO),
    economic activity (NACE), why worked part-time, reasons left
    job, title of the job.
  – (GS) Reasons for non-completion, positions during the tests.
  – (HS) Type of illness, reasons for no checks.
  – (IV) Location and type of house.
  – (EP) Employment status, pensions, eligibility for pensions,
    occupation (ISCO), economic activity (NACE).
  – (HO) Housing status.
  – (HC) Health care payments.
  – (GS, WS, PT) Positions during the tests.
  – (CH) Age of children, education.
  – (DN) Marital status.
  – (PH) Illness and disorders, medication, surgery…
  – (IV) Location and type of house.
• Mistakes by iwers when coding the answers or the proxy status.
• The system does not accept a particular answer (e.g., years, dates,
  amounts).
• Corrected information that is included by the iwer when the respondent
  realizes that he/she made a mistake or reported wrong info previously
  (especially when inconsistencies are detected along the calendar).
2. Remarks useful for researchers, working groups or country teams:
• Grip strength test not performed or interrupted due to illness,
  disabilities, fears, concerns, not safe.
• Problems encountered during the physical test (due to
  distraction, lack of concentration or interest, nerves, specific
  physical impairments or conditions, …). Presence of another
  person during the test.
• Does not remember/does not know.
• Does not know how to read or write.
• Difficulties with Spanish (language problems).
• Iwers' opinions about the reliability of the answers:
  contradictions, attitudes, random answers, reluctance…
• Further clarifications or explanations of reported answers.
  – e.g., help (or influence) provided by another person (spouse,
    children, others, …)
• Problems or circumstances with the drop-offs (help provided by
  the iwer, by a relative, …).
• More specific reasons for non-response (private or sensitive
  information, respondent does not understand the question), e.g.,
  stillborn children, no equipment available to perform the tests.
• Complaints about the length of the questionnaire.
3. Both (useful for data cleaning and SHARE users):
• Use of proxies (needs to be back-checked against SMS data and is
  also useful for researchers).
4. Other remarks that should be investigated:
• Unclear meaning.
• Phone numbers and addresses (may be important for contacts
  in next waves).
Some examples.
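
One way to picture the proxy back-check mentioned under category 3 is to compare the proxy status implied by the remarks with the status recorded elsewhere and to list the disagreements for manual review. This Python sketch assumes two simple mappings from a hypothetical respondent id to a true/false proxy indicator; how proxy status is actually stored in the SMS data is not specified here, and the ids and values are made up.

```python
def proxy_mismatches(from_remarks, from_sms):
    """List respondents whose proxy status in the remarks disagrees with SMS.

    Both arguments map a respondent id to True (proxy used) or False.
    Only respondents mentioned in the remarks are checked.
    """
    mismatches = []
    for respid in sorted(from_remarks):
        if respid not in from_sms:
            mismatches.append((respid, "missing in SMS"))
        elif from_remarks[respid] != from_sms[respid]:
            mismatches.append((respid, "status differs"))
    return mismatches

# Made-up example (hypothetical ids).
from_remarks = {"ES-0001": True, "ES-0002": True, "ES-0004": False}
from_sms = {"ES-0001": True, "ES-0002": False}
to_review = proxy_mismatches(from_remarks, from_sms)
```

The output is only a list of cases to look at; deciding which source is right still requires reading the remark.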
Step 3: Focus on remarks for data cleaning and identify the
correction needed.
Step 4: Corrections (do-files):
• Instructions on this?
• Even if a correction or imputation cannot be made, the remark
  could still be useful for SHARE users, working groups/country
  teams and CentERdata (revision of the questionnaire for Wave 4).
• Production of a specific file with translations for this purpose?
• Translation of all remarks: probably not worthwhile!
Thanks for your attention!
Julie’s instructions
Open discussion…