View Poster Presentation - Pitt Public Health

Quality Control Activities for Visit 2 Data of the Long Life Family Study
Yuqing Chen
Advisor: Nancy W. Glynn, PhD
Internship Preceptor: Sharon Cho Welburn, MPH
EPI in Action Student Poster Presentation October 13, 2016
Background
• The Long Life Family Study (LLFS) is an international, multi-center longitudinal cohort study of familial exceptional survival, sponsored by the National Institute on Aging
• Four field centers: University of Pittsburgh, Columbia University, Boston University, and University of Southern Denmark
• Baseline in-person visits were conducted between 2006 and 2009
• Telephone follow-ups have been conducted annually
• Second in-person visits (Visit 2) began in September 2014
Background
• Visit 2 measurements include:
  • Socio-demographics, personal history, medical history, medications, cognitive and physical function, carotid artery scans, depression scale, spirometry, anthropometrics, and phlebotomy
• Interviews were conducted by trained research assistants (RAs)
• All data are entered into a REDCap system by graduate student employees
• Quality control (QC) of both follow-up and Visit 2 data is conducted monthly
Objective
Aim 1:
• Conduct QC of Visit 2 data for June 2016
Aim 2:
• Characterize error subtypes and the causes of errors
Aim 3:
• Recommend methods to improve data quality
My Role
• Assisted with QC reports of Visit 2 data generated by the LLFS Data Management and Coordinating Center (DMCC)
  • Completed certification of data entry using the REDCap data system
  • Pulled original charts to check each reported error
  • Determined whether each reported error was a QC algorithm error or a real error
  • Gave the file to the RA if it was an interviewer error
  • Corrected the value in REDCap if it was a real error
• Characterized and analyzed error types and subtypes (discussed in Methods)
  • Calculated the frequency, percentage, and composition of error types and subtypes
• Gave suggestions for improving data entry and collection
Methods
397 items (382 for Visit 2 data and 15 for follow-up data) were queried in June 2016 by the DMCC for the Pittsburgh site
All items were checked and compared to the original files
All true errors were corrected in REDCap
Files with errors attributed to the interviewer were given to the RA to check for data accuracy, then fixed in REDCap
Missing data and entry errors were identified and then entered in REDCap
Items affected by a data entry delay were classified as missing data (N=11)
These analyses focus on Visit 2 data because there were few follow-up QC items
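The review workflow above can be sketched as a small decision function. This is a minimal illustration in Python, not the study's actual procedure or code: the `Finding` categories and the `triage` function are hypothetical names introduced here, and the steps simply restate the flow described on this slide.

```python
from enum import Enum


class Finding(Enum):
    """Hypothetical labels for the outcome of comparing a queried item to the original file."""
    ALGORITHM_ERROR = "QC algorithm error"          # flagged, but entered correctly
    INTERVIEWER_ERROR = "interviewer error"        # problem originated in the field
    MISSING_OR_ENTRY_ERROR = "missing or entry error"
    ENTRY_DELAY = "data entry delay"


def triage(finding: Finding) -> list[str]:
    """Return the ordered follow-up steps for one queried item."""
    if finding is Finding.ALGORITHM_ERROR:
        return ["no change needed"]
    if finding is Finding.INTERVIEWER_ERROR:
        return ["give file to RA to check data accuracy", "fix in REDCap"]
    if finding is Finding.MISSING_OR_ENTRY_ERROR:
        return ["identify the value in the original file", "enter in REDCap"]
    if finding is Finding.ENTRY_DELAY:
        return ["classify as missing data"]
    raise ValueError(f"unknown finding: {finding!r}")


if __name__ == "__main__":
    for finding in Finding:
        print(f"{finding.value}: {' -> '.join(triage(finding))}")
```

Keeping the outcomes in a single enumeration makes it harder for a reviewer to record a finding that has no defined follow-up step.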
Methods
• Errors in the Visit 2 data (N=382) were categorized by:
  • “Type of error” (E/F)
    E: QC algorithm error, no change needed
    F: True error that needed to be fixed
  • “Data obtainment method”
    Data entry (data entered into the REDCap system)
    Field (data obtained in the field prior to data entry)
  • “Form”, indicating which form had the errors
• Error subtypes were categorized as:
  • Entry error, missing code, missing, inconsistent, and misclassification
• Data were further analyzed (using SAS 9.4) to evaluate:
  • Composition of true errors and DMCC QC algorithm errors within each form type
  • Composition of error subtypes by data obtainment method
  • Composition of the forms with the most errors by data obtainment method and DMCC QC algorithm error
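The composition analyses listed above amount to cross-tabulating two categorical variables. The study used SAS 9.4; the following is only a rough Python sketch of the same idea. The records are hypothetical toy values, not study data, and the chi-square test of independence is an assumption, since the poster reports p-values without naming the test.

```python
from collections import Counter

# Hypothetical toy records: (error subtype, data obtainment method). NOT study data.
records = [
    ("missing", "data entry"),
    ("missing", "data entry"),
    ("missing", "field"),
    ("entry error", "data entry"),
    ("inconsistent", "field"),
    ("inconsistent", "field"),
]


def crosstab(rows):
    """Count how often each (subtype, method) pair occurs."""
    return Counter(rows)


def composition(table):
    """For each subtype, return the share of its errors attributed to each method."""
    totals = Counter()
    for (subtype, _method), count in table.items():
        totals[subtype] += count
    shares = {}
    for (subtype, method), count in table.items():
        shares.setdefault(subtype, {})[method] = count / totals[subtype]
    return shares


def chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom for the table."""
    subtypes = sorted({s for s, _ in table})
    methods = sorted({m for _, m in table})
    n = sum(table.values())
    row_total = {s: sum(table[(s, m)] for m in methods) for s in subtypes}
    col_total = {m: sum(table[(s, m)] for s in subtypes) for m in methods}
    stat = 0.0
    for s in subtypes:
        for m in methods:
            expected = row_total[s] * col_total[m] / n
            stat += (table[(s, m)] - expected) ** 2 / expected
    dof = (len(subtypes) - 1) * (len(methods) - 1)
    return stat, dof


table = crosstab(records)
shares = composition(table)
stat, dof = chi_square(table)

if __name__ == "__main__":
    for subtype, by_method in sorted(shares.items()):
        print(subtype, {m: f"{p:.0%}" for m, p in by_method.items()})
    print(f"chi-square = {stat:.3f}, df = {dof}")
```

With counts as small as these toy values, the chi-square approximation would not be reliable; the example only shows the mechanics of the tabulation.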
Forms Included in Monthly QC Report
Form Name: Form Contents
Alert Tracking: Notation of a measured value that needs medical oversight (Ex: blood pressure higher than normal range)
Cognitive Assessment: Cognitive tests, including Trail Making Test, Digital Clock Drawing, Letter Fluency, Category Fluency-Animals, Hopkins Verbal Learning Test-Revised, Folstein Mini-Mental State Exam, Logical Memory IA and IIA, Number Span Test, Digit Symbol Substitution Test
Venipuncture & Blood Collection: Information on venipuncture, number of attempts, time ended, and phlebotomist code
Carotid Ultrasound Scanning: Information on the common carotid intima media thickness and carotid plaques
Socio-Demographic Information: Participants’ basic demographic information such as marital status, education, income, etc.
Clinical Dementia Rating: Assessment of participants’ cognitive impairment level as rated by trained RAs
Medical History: Information about medical and surgical history for heart and vascular disease, stroke, lung disease, arthritis, etc.
Forms Included in Monthly QC Report (continued)
Form Name: Form Contents
Physical Function and Activity: Participant's perception of his/her ability to carry out activities of daily living, including physical activity and fatigability
Performance Measures: Physical performance of participants, including timed 4-meter walk, balance tests, grip strength, etc.
Interview Proxy: Criteria to determine if there is a need for a proxy-based interview
Medication Inventory: Record of all prescription and non-prescription medications
Blood Pressure, Heart Rate, Height, Weight and Waist Circumference: Blood pressure, heart rate, and anthropometric measurements
Consent Tracking and Interview Feasibility: Components of the study the participant agrees to and feasibility of the study
Spirometry: Assessment of lung function
Instrumental Activities of Daily Living: Assessment of instrumental activities of daily living (Ex: housework, preparing meals)
Personal History: Smoking and alcohol consumption
Results
Percentage of QC Algorithm Error vs. Actual Errors
• Around 24% of flagged problems were outliers that had been entered correctly
  (Ex: shipment date is 7 days prior to the date the form was filled out, which can occur if saliva is taken and shipped prior to Exam One drawing blood)
• 76% of the problems were real errors that needed to be fixed
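The shipment-date example shows how a date-consistency rule can flag data that were entered correctly. The sketch below is a hypothetical Python illustration only: the DMCC's actual QC algorithm is not described in this poster, and the function name, the `allowed_lead` parameter, and the dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def flag_shipment_date(shipment: date, form: date,
                       allowed_lead: timedelta = timedelta(0)) -> bool:
    """Return True if the shipment date precedes the form date by more than allowed_lead."""
    return form - shipment > allowed_lead


# Hypothetical dates: the sample is shipped 7 days before the form is filled out.
form_date = date(2016, 6, 15)
shipment_date = form_date - timedelta(days=7)

# A strict rule flags the item even though it was entered correctly.
strict_flag = flag_shipment_date(shipment_date, form_date)

# A rule that tolerates a 7-day lead (an assumed window) does not flag it.
relaxed_flag = flag_shipment_date(shipment_date, form_date,
                                  allowed_lead=timedelta(days=7))

if __name__ == "__main__":
    print("strict rule flags item:", strict_flag)
    print("relaxed rule flags item:", relaxed_flag)
```

Any such tolerance trades fewer false positives against the risk of missing a genuine date error, so the window would need to be chosen with the study protocol in mind.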
Percentages of Errors by Data Obtainment Method
p<0.0001
• Most errors (74%) were made during the data entry process
• A lower percentage of errors (26%) were made during the interview
Percentage of Error Subtypes Found in QC
• Missing data made up the largest proportion (50%) of errors
• Entry errors and missing codes accounted for similar proportions of errors
• Inconsistency with the medical record accounted for 11% of total errors
• Only 2% of errors were caused by misclassification
Composition of True Errors and DMCC QC Algorithm Errors by Form
• Alert Tracking and Cognitive Assessment had the most errors
• >60% of errors were true errors for most of the forms, except Cognitive Assessment
• In Cognitive Assessment, almost half of the errors were DMCC QC algorithm errors
Frequency of Error Subtypes by Data Obtainment Method
• Error subtypes differed significantly by data obtainment method
• >80% of missing, entry error, and missing code errors were due to data entry
• All inconsistent errors came from the interview
• All misclassification errors occurred in data entry
Distribution of Data Obtainment Method and DMCC QC Algorithm Errors in the Forms with Most Errors
• In all forms except Carotid Ultrasound Scanning, nearly half of the errors were from data entry
• For Carotid Ultrasound Scanning, >80% of errors were from the field interview
Discussion
• About a quarter (around 24%) of flagged items were DMCC QC algorithm errors rather than true errors
• Most errors, by data obtainment method, occurred in data entry
• Missing data was the most common error subtype
• Alert Tracking and Cognitive Assessment were the forms with the most errors
Discussion - Error Subtypes
• Missing data, entry errors, missing code, and misclassification (~89%) occurred most frequently during data entry
  • Likely due to the large amounts of data entered
  • More training could prevent skipping data entry points and entering the wrong value during the data entry process
• Inconsistent errors most commonly occurred during the interview
  • Likely due to the length of the interview and the complex forms
  • Interviewers should review forms prior to sending them to data entry to ensure all information is consistent
    • Ex: An above-normal blood pressure reading being properly flagged as an alert
Discussion - Forms with Most Errors
Alert Tracking and Cognitive Assessment had the most errors
• Cognitive Assessment is the longest form (34 pages), with the most opportunities for a mistake to occur
  • The form should be reviewed slowly and carefully for missing data and any other mistakes
  • Timing between assessments needs improvement, as many were reported out of range
• Alert Tracking was newly implemented in the past year
  • There was some confusion about what counts as an alert, since this is a unique population
  • Not many studies have normative values for health measures in older adults >90 years
  • More training and referencing the alert tracking chapter of the Manual of Operations could reduce the number of alert tracking errors
Conclusion
• The DMCC QC algorithm needs to be refined to be more precise and trigger fewer false positives
• To prevent interviewer errors, RAs need to double-check skip patterns on forms in the field and during their post-visit review of all forms prior to data entry
• To reduce data entry errors and ensure clean data for subsequent analyses, data entry staff need a better understanding of the purpose and meaning of the data items and need to double-check their entries
What I Have Learned
• Large, multi-center studies collect a large amount of data in which errors may occur, making data QC essential
• Given the amount of data collected, the number of errors found in this study is a very small percentage of the existing data
• Determining where errors occur most frequently is a challenge but is crucial to improving data quality
• Complicated and longer forms seem to affect the quality of data
• A complex interview and data entry process increases the likelihood of errors