Roche Template - JMP User Community

Centralised Statistical Monitoring –
‘It’s Just Data Cleaning, Right?’
Implementation and Challenges in Industry
Chris Wells
JMP Discovery, Amsterdam 15th – 18th March
Why Statistical Monitoring
What is Statistical Monitoring
The Methods Being Used
Statistical Monitoring in Roche
Interactive Demonstration
Take Home Messages
What Statistical Monitoring is not !!!!
• Although this is part of Risk Based Monitoring, it is NOT data cleaning
• It IS a statistical analysis of the data
– Not comparing treatments but an analysis comparing sites with
each other, patients with each other irrespective of treatment
• You should ignore anything you know about treatment
• Is it an interim analysis?
– Not strictly speaking because nothing is broken down by treatment.
• This should be carried out by people with a statistics background
Why Statistical Monitoring?
• Much of this has background in fraud and falsification.
• FDA proposed rule for “Reporting Information Regarding Falsification of
Data” defines falsification of data as
– creating, altering, recording or omitting data in such a way that the
data do not represent what actually occurred
• Examples of falsification of data include but are not limited to:
– Creating data that were never obtained
– Altering data that were obtained by substituting different data
– Recording or obtaining data from a specimen, sample or test
whose origin is not accurately described or in a way that does not
accurately reflect the data
– Omitting data that were obtained and ordinarily would be recorded
Why Statistical Monitoring?
• FDA estimates that it will see 73 reports of data falsification across all
divisions.
• The impact is critical.
• A single investigator involved in data falsification could jeopardize the
trial and potentially a drug submission.
• Regulators have the power to terminate a product’s development
immediately or withdraw approval of an already marketed drug
• The U.S. Code on Crimes and Criminal Procedure makes submission
of falsified information to the federal government a criminal act, and
convictions can lead to substantial fines and even imprisonment of
responsible parties.
Why are we using Statistical Monitoring??
• Develop an approach for identifying and managing
issues affecting data integrity such as non-random
errors, GCP misconduct including fraud at
investigational sites.
• Ability to use statistical testing to pick up data issues that
simple thresholds or the naked eye cannot
Recent Examples in the News
Impact of Fraud/Data Falsification
• Jeopardising a submission could:
– Have huge cost implications. Delay in launching a
product could mean the loss of $6 – 15 million per drug,
per day to the company
– Moreover, it can lead to ineffective or harmful treatment
being available or patients being denied effective
treatment.
– Result in a Sponsor company missing getting to market
first and hence the loss of a major patent and the
associated rewards. Possible loss of exclusivity license.
– Loss of Company credibility
Motivation for Fraud
Motivation comes in many forms:
• Desire for Academic Prestige
• Money (Getting to market first provides high rewards;
Fabrication of patients)
• Unintentional Fraud (Possible mix up of patient files when
data entered at sites), patients/parents fabricating data
• Sabotage (aggrieved employee)
• Professional patients
What is Statistical Monitoring?
• Statistical Monitoring comes under the Study Conduct arm of
Risk Based Monitoring
• SAS JMP/Clinical can be used to apply statistical algorithms
to the clinical datasets in order to identify outliers in data that
could indicate a risk to the study.
• We need to use SAS JMP Clinical to identify and manage
issues affecting data integrity and GCP misconduct
Data Integrity and GCP Misconduct
REMIT
Develop industry wide guidelines and approach for identifying and managing issues
affecting data integrity such as non-random errors, GCP misconduct including fraud
at investigational sites.
UNMET NEED
• No Industry approach currently exists for systematically and pro-actively
detecting and handling Data Integrity and GCP Misconduct
• Discovery of non-random errors affecting statistical analysis tends to be detected
after code breaking without much opportunity to remediate the situation
• Very limited investment in detecting fraud (e.g. data falsification)
• Non-random errors, GCP misconduct/fraud can raise serious doubts on the
integrity of clinical study data and jeopardize a submission
Data Integrity and GCP Misconduct
VALUE PROPOSITION
• Recommend best practices for detecting and handling
data integrity risks and GCP misconduct
• Quality improvement with a focus on Data Integrity and
GCP Compliance
• Prevent unexpected post code breaking data pattern
discovery and allow mid-study corrective action
• Audit and Inspection findings minimized with focus on
filing approval threats
• Increase Industry credibility by acting on rare but still
harmful fraud cases affecting public perception of
medical research
Methods Being Used - Standard Statistical
Oversight Assessments: Already Available
TO BE RUN AT 25/50/75 AND 100% OF ENROLLMENT
1. Demographic Distributions
2. Birthdays – This test looks for patients who have duplicate dates of birth.
3. Cluster Subjects Across Sites – This test looks for a patient going to multiple sites or
possible fabrication across sites.
4. Weekdays and Holidays – This test looks for dosing at unexpected time points.
5. Perfect Schedule Attendance – This test checks for sites with no variability in
attendance.
6. Constant Findings – This test looks for subjects that exhibit no variability in a
measurement.
7. Duplicate Records – This test looks for records that are fully duplicated.
8. Digit Preference (leading and trailing) – This test makes comparisons across clinical
sites in order to identify quality issues.
9. Multivariate Inliers and Outliers – If patients are overall too close to or too far from the mean,
this might indicate fabricated data.
10. Cluster Subjects Within Study Sites – This test examines patients within a site to identify
possible fabricated data.
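As an illustration of the Constant Findings check (item 6), here is a minimal Python sketch. The record layout, the function name `constant_findings` and the `min_count` threshold are assumptions made for this example, and the values are made up. It is not how JMP Clinical implements the test.

```python
from collections import defaultdict

def constant_findings(records, min_count=3):
    """Return (subject, test) pairs whose results never vary.

    records: iterable of (subject_id, test_name, value) tuples (hypothetical layout).
    min_count: minimum number of measurements before a flat series is flagged.
    """
    values = defaultdict(list)
    for subject, test, value in records:
        values[(subject, test)].append(value)
    return sorted(
        key for key, vals in values.items()
        if len(vals) >= min_count and len(set(vals)) == 1
    )

# Made-up systolic blood pressure readings.
records = [
    ("S01", "SYSBP", 120), ("S01", "SYSBP", 120), ("S01", "SYSBP", 120),
    ("S02", "SYSBP", 118), ("S02", "SYSBP", 125), ("S02", "SYSBP", 121),
    ("S03", "SYSBP", 130), ("S03", "SYSBP", 130),
]
flagged = constant_findings(records)
print(flagged)
```

A flagged subject is a prompt for review by the study team rather than evidence of fabrication; some measurements are legitimately stable.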
Interactive Demonstration
• Example of Reviewing for Duplicate Dates of Birth
• Clustering Subjects Across Sites
• Constant Findings
• Digit Preference (Trailing)
• Multivariate Outliers and Inliers
• C:\Users\wellsc2\AppData\Local\SAS\JMPClinical\11\JMPC\Output\Nicardipine
Where are we now?
• Challenges setting up licenses on server
• Training – Richard Zink’s help invaluable
• Work required to get ‘buy in’ from Study teams
• Completed 10 studies and delivered reports to Study
Teams. Have a further 18/20 studies between now and
the end of the year (more possibly)
• Some interesting findings, but we are waiting for the
study team to inform us as to the relevance of these
findings
• Need to establish the team strategy for 2017
Future Strategy of who runs the tests
Currently have a small team of 6 people on varying resource
Separate SM Team
• Advantages
– Keeps all the analyses in one
place.
– Maintains consistency.
– Keeps up to date with new
methods
– Unbiased
– Cost of tools cheaper
• Disadvantages
– Resource could be an issue
– The team may be swamped and
it may not get done.
– Never embedded in study teams
Study Statistician Supported
by Expert Team
• Advantages
– Embeds the methods within a
study team.
– Resource rests with study team.
– The analysis can be run more
times and more flexibly
– Expert team are free to research
new methods and keep up to
date
• Disadvantages
– Steep learning curve
– Need more JMP/Clinical licences
Take Home Messages and Discussion
• Statistical monitoring is a reality. The FDA will be conducting Stats
Monitoring on submitted studies. Do we want them to find problems
before we do??
• Data misconduct is rare so let’s not panic too much, however, the
impact is huge, so maybe we should. We MUST be proactive!
• The whole industry is moving in on this – we can be (are) leaders
• Think of Stats Monitoring like any other analysis of data. It uses
statistical procedures.
• IT IS NOT DATA CLEANING !!!!!!!!!
Doing now what patients need next
BACKUP SLIDES
Example of Reviewing for Duplicate Dates of
Birth
This test looks for patients who
have duplicate dates of birth. We
want to see if there is a chance that
we may have 1 patient entering a
study/project multiple times (IT
DOES HAPPEN!!)
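The idea can be sketched in a few lines of Python. The subject list, its layout and the function name `duplicate_birthdates` are hypothetical, and the dates are invented for illustration.

```python
from collections import defaultdict
from datetime import date

def duplicate_birthdates(subjects):
    """Group subjects by date of birth and keep only dates shared by 2+ subjects.

    subjects: iterable of (subject_id, site, date_of_birth) tuples (hypothetical layout).
    """
    by_dob = defaultdict(list)
    for subject_id, site, dob in subjects:
        by_dob[dob].append((subject_id, site))
    return {dob: ids for dob, ids in by_dob.items() if len(ids) > 1}

# Invented subjects; S01 and S03 share a date of birth at different sites.
subjects = [
    ("S01", "Site A", date(1960, 5, 17)),
    ("S02", "Site A", date(1972, 1, 3)),
    ("S03", "Site B", date(1960, 5, 17)),
    ("S04", "Site C", date(1985, 11, 30)),
]
dupes = duplicate_birthdates(subjects)
for dob, ids in dupes.items():
    print(dob.isoformat(), ids)
```

Shared dates of birth also occur by chance, so a match is a starting point for comparing the other characteristics of the subjects involved.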
Example of Clustering Subjects Across Sites
What we are assessing is ‘How
similar is too similar?’
Principal Components Analysis,
Euclidean Distance Matrices
used for analyses.
Box plots presented by Covariate
subgroups (Gender and Race),
then subset to pairs of subjects
with very similar demographic
characteristics.
Once we identify the most similar
pairs of subjects from the box
plot, we can go to the heat map
which can be useful for
identifying the cluster
membership for any selected
pairs of subjects within the
hierarchical clustering analysis or
for identifying groups of 3 or
more that could indicate that the
subject has enrolled more than
twice.
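A minimal sketch of the distance part of this analysis, in Python. It standardizes a few covariates and ranks cross-site pairs of subjects by Euclidean distance; the principal components step, the box plots and the hierarchical clustering are left out. The data layout, the function name `closest_cross_site_pairs` and all values are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

def closest_cross_site_pairs(subjects, top=3):
    """Rank pairs of subjects from different sites by Euclidean distance.

    subjects: list of (subject_id, site, [covariates]) (hypothetical layout).
    Covariates are z-scored first so that no single variable dominates.
    """
    columns = list(zip(*(cov for _, _, cov in subjects)))
    centers = [mean(col) for col in columns]
    scales = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    scaled = {
        sid: [(x - c) / s for x, c, s in zip(cov, centers, scales)]
        for sid, _, cov in subjects
    }
    site_of = {sid: site for sid, site, _ in subjects}
    pairs = []
    for a, b in combinations(scaled, 2):
        if site_of[a] == site_of[b]:
            continue  # this check targets the same person appearing at different sites
        dist = sqrt(sum((x - y) ** 2 for x, y in zip(scaled[a], scaled[b])))
        pairs.append((dist, a, b))
    return sorted(pairs)[:top]

# Age (years), height (cm), weight (kg); made-up values.
subjects = [
    ("S01", "Site A", [54, 172.0, 80.5]),
    ("S02", "Site A", [61, 165.0, 70.2]),
    ("S03", "Site B", [54, 172.0, 80.4]),
    ("S04", "Site B", [38, 181.0, 92.0]),
    ("S05", "Site C", [47, 158.0, 63.1]),
]
closest = closest_cross_site_pairs(subjects)
for dist, a, b in closest:
    print(f"{a} vs {b}: distance {dist:.3f}")
```

In this invented data, S01 and S03 are nearly identical and sit at the top of the ranking, which is the kind of pair one would then look up in the heat map.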
Example of Constant Findings
Example of Digit Preference (Trailing)
We are now making comparisons across clinical sites in order to identify quality
issues. We want to identify anomalies through tests of the trailing or leading digit for
all procedures that provide numeric outcomes.
CMH row mean scores are used to take advantage of the ordinality of the last/first
digit. Further we apply standardized midrank scores to account for the possibility
that the observed last/first digits may not be equally spaced from one another.
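The row mean score statistic can be sketched in Python for one site compared against all other sites pooled. With a single stratum it reduces to Q = (N - 1) × SS_between / SS_total computed on the digit scores, referred to a chi-square distribution with 1 degree of freedom. The function name `trailing_digit_test`, the data layout and the readings are assumptions for illustration, and this is not JMP Clinical's implementation.

```python
from collections import Counter
from math import erfc, sqrt

def trailing_digit_test(values_by_site, site):
    """Row mean score test of one site's trailing digits against all other sites.

    values_by_site: dict of site -> list of integer readings (hypothetical layout).
    Returns (Q, p_value); Q is referred to a chi-square with 1 degree of freedom.
    """
    digits = [(s == site, v % 10) for s, vals in values_by_site.items() for v in vals]
    n = len(digits)
    if not any(flag for flag, _ in digits) or all(flag for flag, _ in digits):
        raise ValueError("need readings from the chosen site and from other sites")
    counts = Counter(d for _, d in digits)
    # Standardized midrank score for each digit: the average rank of that digit
    # in the pooled sample, divided by (n + 1).
    score, seen = {}, 0
    for d in sorted(counts):
        score[d] = (seen + (counts[d] + 1) / 2) / (n + 1)
        seen += counts[d]
    all_scores = [score[d] for _, d in digits]
    grand = sum(all_scores) / n
    ss_total = sum((a - grand) ** 2 for a in all_scores)
    if ss_total == 0:
        return 0.0, 1.0  # every reading ends in the same digit
    ss_between = 0.0
    for flag in (True, False):
        grp = [score[d] for is_site, d in digits if is_site == flag]
        ss_between += len(grp) * (sum(grp) / len(grp) - grand) ** 2
    q = (n - 1) * ss_between / ss_total
    return q, erfc(sqrt(q / 2))  # upper tail of a chi-square with 1 df

# Made-up blood pressure readings; Site A only records values ending in 0 or 5.
values_by_site = {
    "Site A": [120, 130, 125, 140, 120, 135, 130, 120, 110, 125],
    "Site B": [121, 134, 127, 138, 119, 133, 126, 142, 117, 129],
    "Site C": [123, 131, 128, 136, 122, 139, 124, 141, 118, 127],
}
q, p = trailing_digit_test(values_by_site, "Site A")
print(f"Q = {q:.2f}, p = {p:.4f}")
```

The sketch takes the trailing digit from integer readings with `v % 10`; for decimal values the digit would need to be read from the value as recorded, so that trailing zeros are not lost.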
Example of Multivariate Outliers and Inliers
An outlier is a statistical
observation that is markedly
different in value from the
others of the sample
An inlier is a value that lies
close to the mean.
The relationship amongst
the covariates needs to be
considered.
Hence looking into this
multivariate space is where
we can utilize Mahalanobis
distances (MD).
Mahalanobis distances can
be used to calculate the
distance between 2 vectors
of data to assess similarity
or from a vector to a
particular point in
multivariate space (typically
the multivariate mean or
centroid).
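A minimal Python sketch of the distance from each subject to the centroid, restricted to two covariates so that the covariance matrix can be inverted by hand. The function name `mahalanobis_2d` and the height and weight values are assumptions for illustration.

```python
from statistics import mean

def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the sample centroid.

    Limited to two covariates so the 2x2 covariance matrix has a closed-form inverse.
    """
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^-1 [dx dy]^T, with S^-1 = (1/det) [[syy, -sxy], [-sxy, sxx]]
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        dists.append(d2 ** 0.5)
    return dists

# Made-up height (cm) and weight (kg) pairs. The last subject is short but heavy,
# unremarkable on either variable alone yet unusual given how the two move together.
points = [(160, 55), (165, 60), (170, 66), (175, 70),
          (180, 76), (185, 80), (172, 68), (166, 79)]
dists = mahalanobis_2d(points)
for pt, d in zip(points, dists):
    print(pt, round(d, 2))
```

The largest distance marks a candidate outlier and the smallest a candidate inlier. With more covariates the same calculation needs a general matrix inverse, which a numerical library would normally provide.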