Demystifying Electronic Data Standards
for Clinical and Nonclinical Studies
By Michelle Conan-Cibotti, PhD, RAC (US, EU)
As part of the Prescription Drug User Fee Act IV (PDUFA IV) information technology commitments, the US Food and Drug Administration (FDA) is moving toward a fully electronic,
standards-based submission and review environment.1 FDA has issued a series of guidance documents to assist sponsors in providing regulatory submissions in electronic
format. In the latest draft guidance, Providing Regulatory Submissions in Electronic
Format—Standardized Study Data2, issued in February 2012, FDA promotes the use
of data standards in electronic submissions of clinical and nonclinical study data and
provides resources for the various data standards supported by the Center for Drug
Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER)
and Center for Devices and Radiological Health (CDRH).
FDA and many other public and private organizations have been collaborating with the
Clinical Data Interchange Standards Consortium (CDISC)3 to develop standards for study
data submitted in support of regulatory applications. CDISC, a global nonprofit organization, is working on a set of standards to support the acquisition, exchange, submission
and archiving of clinical and nonclinical research data and metadata. CDISC standards
already adopted by many regulatory authorities include the Study Data Tabulation Model
(SDTM) for representation of clinical trial data, the Standard for Exchange of Nonclinical
Data (SEND) for representation of data from nonclinical animal toxicology studies,
the Analysis Data Model (ADaM) for clinical trial data analysis and the Clinical Data
Acquisition Standards Harmonization (CDASH) standards for the collection of data in case
report forms. These standards are updated periodically to better suit the needs of all
types of products, and several other standards have been released. What do SDTM, SEND,
ADaM, CDASH and other standards mean in plain language for regulatory professionals
who do not manipulate data every day? Why should we start using them?
regulatoryfocus.org
May 2012
Benefits and Challenges
Electronic data standardization improves the quality and efficiency of regulatory review. As
stated in FDA’s draft guidance, “[d]ata that are standardized are easier to understand, analyze,
review and synthesize in an integrated manner in a single study or multiple studies, thereby
enabling more effective regulatory decisions.” In 2011, about 30% of New Drug Applications
(NDAs) and about 20% of Biologics License Applications (BLAs) submitted electronically contained CDISC SDTM data. From an industry perspective, clinical data standards enable data
processing efficiencies and facilitate data exchange, communication and coordination of activities internally and with vendors, partners, investigators, patients and regulators.4
However, implementation of CDISC standards remains a challenge. CDISC has
developed implementation guides, available on its website, to harmonize the interpretation of its standards by sponsors. Each FDA center offers various web resources and
recommends that sponsors discuss their data standards implementation approach with the agency as early as possible and no later than the end of Phase 2. CDER has
published a common data standards issues document5 to assist sponsors. CBER advises
sponsors to contact its review division before submitting data in CDISC format and to use formal meetings with FDA to discuss
the datasets that should be provided, the data elements that should be included in each
dataset and the organization of the data within the file.6
Following is an overview of the current electronic data standards.
Study Data Tabulation Model
SDTM provides a general framework for describing the organization of information collected during human and animal studies and submitted to regulatory authorities. SDTM
defines a standard structure for study data tables. The SDTM document available on
the CDISC website describes the basic concepts and general structures of the model.
In summary, the model is built around the concept of observations, which correspond
to rows in the dataset. Each observation is described by a series of named variables,
which correspond to columns in a dataset. Each variable can be classified into one of five
major roles: identifier (e.g., study or subject), topic (e.g., nausea), timing (e.g., start date,
end date), qualifier (e.g., a descriptive adjective such as mild, or a numeric value) and rule
(a method to define start, end or looping conditions in the trial design model). Variables are
codified and qualifier variables are further categorized into sub-classes.
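To make the row-and-column picture concrete, the following Python sketch represents one adverse event observation as a dictionary and groups its variables by role. The variable names follow common SDTM conventions for the AE domain, but the study identifier, the values and the ROLE_OF lookup table are illustrative assumptions, not content taken from the SDTM document. The rule role is left out because it applies to trial design datasets.

```python
# Illustrative sketch only: one SDTM-style observation (a row) whose
# variables (columns) are grouped by role. All values are made up.

# Hypothetical lookup from variable name to role.
ROLE_OF = {
    "STUDYID": "identifier",
    "USUBJID": "identifier",
    "AESEQ": "identifier",
    "AETERM": "topic",
    "AESTDTC": "timing",
    "AEENDTC": "timing",
    "AESEV": "qualifier",
}

# One observation: a mild episode of nausea for one subject.
observation = {
    "STUDYID": "STUDY01",
    "USUBJID": "STUDY01-001",
    "AESEQ": 1,
    "AETERM": "NAUSEA",
    "AESTDTC": "2012-01-15",
    "AEENDTC": "2012-01-17",
    "AESEV": "MILD",
}


def group_by_role(row):
    """Return {role: {variable: value}} for one observation."""
    grouped = {}
    for name, value in row.items():
        role = ROLE_OF[name]
        grouped.setdefault(role, {})[name] = value
    return grouped


roles = group_by_role(observation)
for role, variables in roles.items():
    print(role, variables)
```

Read this way, the topic variable says what the observation is about, while the identifier, timing and qualifier variables say who, when and how.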
Observations are collected for all subjects in a series of domains. A domain is a
collection of related observations with a common topic and is represented by a unique,
two-character domain code (e.g., adverse events = AE, demographics = DM, subject
visits = SV, medical history = MH). There are more than 30 SDTM standard domains
and additional custom domains can be created. Domains are based on three general
observation classes: interventions (e.g., treatments given per protocol or self-administered),
events (e.g., randomization, study completion, AEs) and findings (i.e., observations resulting from planned evaluations). Standardized domains are also used for representing the
trial design model, including the planned trial elements, trial arms, trial visits, trial inclusion/exclusion table and trial summary, and for representing relationship datasets. CDISC
updates the standard domains as they are developed and publishes them on its website.
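As a rough illustration of how observations are organized into domains, the Python sketch below splits a mixed list of records into one table per two-character domain code and reports each domain's general observation class. The DOMAIN_CLASS table is a small assumed subset, the records are made up, and the uppercase-letters check is a simplifying assumption. The SDTMIG remains the authoritative source.

```python
from collections import defaultdict

# Assumed subset of domain codes and their general observation class.
DOMAIN_CLASS = {
    "AE": "events",
    "MH": "events",
    "CM": "interventions",
    "EX": "interventions",
    "VS": "findings",
    "LB": "findings",
}


def is_valid_domain_code(code):
    """Check for a two-character code (uppercase letters only, by assumption)."""
    return len(code) == 2 and code.isalpha() and code.isupper()


def split_by_domain(records):
    """Group records into per-domain tables keyed by domain code."""
    tables = defaultdict(list)
    for record in records:
        code = record["DOMAIN"]
        if not is_valid_domain_code(code):
            raise ValueError(f"not a two-character domain code: {code!r}")
        tables[code].append(record)
    return dict(tables)


# Made-up records for two subjects.
records = [
    {"DOMAIN": "AE", "USUBJID": "STUDY01-001", "AETERM": "NAUSEA"},
    {"DOMAIN": "CM", "USUBJID": "STUDY01-001", "CMTRT": "ONDANSETRON"},
    {"DOMAIN": "VS", "USUBJID": "STUDY01-001", "VSTESTCD": "SYSBP"},
    {"DOMAIN": "AE", "USUBJID": "STUDY01-002", "AETERM": "HEADACHE"},
]

tables = split_by_domain(records)
for code, rows in tables.items():
    print(code, DOMAIN_CLASS.get(code, "custom or unlisted"), len(rows))
```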
The CDISC SDTM Implementation Guide (SDTMIG), also available on the website, provides specific recommendations and examples for mapping the data standards and is a
must-read before preparing a regulatory submission based on SDTM. The implementation
guide describes which variables are required, expected or permitted to be used in specific
domains based on the general observation classes. It also provides details on how to represent relationships among datasets and records (e.g., concomitant medications used to
treat an AE, or comments recorded in association with an AE).
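The idea of relating records across datasets can be sketched in a few lines of Python. The layout below is loosely modeled on the SDTM related-records (RELREC) dataset, in which rows that share a relationship identifier point to the records being linked. Treat the exact variable names and all values as assumptions to be checked against the SDTMIG.

```python
# Illustrative sketch: link a concomitant medication to the adverse event
# it was used to treat. All records below are made up.

ae = [{"USUBJID": "STUDY01-001", "AESEQ": 1, "AETERM": "NAUSEA"}]
cm = [{"USUBJID": "STUDY01-001", "CMSEQ": 3, "CMTRT": "ONDANSETRON"}]

# Two relationship rows with the same RELID tie one AE record to one CM record.
relrec = [
    {"RDOMAIN": "AE", "USUBJID": "STUDY01-001",
     "IDVAR": "AESEQ", "IDVARVAL": "1", "RELID": "AECM1"},
    {"RDOMAIN": "CM", "USUBJID": "STUDY01-001",
     "IDVAR": "CMSEQ", "IDVARVAL": "3", "RELID": "AECM1"},
]

DATASETS = {"AE": ae, "CM": cm}


def resolve(relrec_rows, datasets):
    """Return {(USUBJID, RELID): [linked records]}."""
    linked = {}
    for rel in relrec_rows:
        for row in datasets[rel["RDOMAIN"]]:
            same_subject = row["USUBJID"] == rel["USUBJID"]
            same_key = str(row.get(rel["IDVAR"])) == rel["IDVARVAL"]
            if same_subject and same_key:
                key = (rel["USUBJID"], rel["RELID"])
                linked.setdefault(key, []).append(row)
    return linked


linked = resolve(relrec, DATASETS)
for key, rows in linked.items():
    print(key, rows)
```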
A draft devices supplement to the SDTMIG will be published soon, now that public
comments have been collected. The draft document contains seven proposed new SDTM
domains that are designed to capture basic information about diagnostic devices, implantable devices and imaging devices.7
In addition to providing a standardized dataset, sponsors must describe their dataset in a
data definition document, called “define.xml.” Metadata definitions provide information about
the variables used in the dataset and must be submitted with the data to regulatory authorities.
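To show what variable-level metadata might look like, the Python sketch below writes a small XML description of an AE dataset. The element and attribute names (DatasetMetadata, Variable, name, label, type) are hypothetical and deliberately simplified. They are not the actual define.xml schema, which should be taken from the CDISC specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable-level metadata for an AE dataset (illustrative only).
variables = [
    {"name": "USUBJID", "label": "Unique Subject Identifier", "type": "text"},
    {"name": "AETERM", "label": "Reported Term for the Adverse Event", "type": "text"},
    {"name": "AESTDTC", "label": "Start Date/Time of Adverse Event", "type": "date"},
]


def build_metadata(dataset_name, variable_defs):
    """Build a simplified XML tree describing one dataset's variables."""
    root = ET.Element("DatasetMetadata", attrib={"name": dataset_name})
    for definition in variable_defs:
        ET.SubElement(root, "Variable", attrib=dict(definition))
    return root


root = build_metadata("AE", variables)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The point of the sketch is only that each variable in the submitted dataset has a machine-readable description travelling with it.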
As stated by FDA, “The ideal time to implement the SDTM standards for representation of clinical trial tabulation data is prior to the conduct of the study. This approach is
preferred to the alternative of collecting data in a non-standard format and then converting
to SDTM format after the trial (legacy data conversion).”8
Standard for Exchange of Nonclinical Data
SEND is an extension of the SDTM standard for submission of nonclinical data. The SEND
Implementation Guide (SENDIG) is available on the CDISC website. The current SENDIG,
version 3.0, is designed to support single-dose general toxicology, repeat-dose general toxicology and carcinogenicity studies. In the future, the SENDIG is expected to be updated to
support reproductive toxicology, safety pharmacology and veterinary studies. SEND is used
as an interchange format between organizations, such as sponsors and CROs, and for submission
to regulatory authorities. It follows the SDTM principles described above.
Analysis Data Model
ADaM provides standards to use when generating analysis datasets and associated metadata
following the data format required for eCTD submissions. Per the ADaM document (version
2.1) available on the CDISC website, “the purpose of ADaM is to provide a framework that
enables analysis of the data, while at the same time allowing reviewers and other recipients
of the data to have a clear understanding of the data’s lineage from collection to analysis to
results.” ADaM is optimized to support data derivation and statistical analysis and is intended
to simplify the programming steps necessary for performing an analysis. Analysis datasets
are derived from SDTM datasets and support the results presented in the study report. For
standard dataset structures and variables, including naming conventions, sponsors should
refer to the published ADaM Implementation Guide (ADaMIG) available on the CDISC website.
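The derivation of an analysis dataset from tabulation data, with lineage preserved, can be sketched as follows. This Python example flags adverse events that start on or after the first dose. The variable names are modeled on common ADaM naming (ASTDT, TRTSDT, TRTEMFL), and the treatment-emergent rule is a simplified assumption; real definitions come from the statistical analysis plan and the ADaMIG.

```python
from datetime import date

# Made-up SDTM-like adverse event records.
ae = [
    {"USUBJID": "STUDY01-001", "AESEQ": 1, "AETERM": "NAUSEA",
     "AESTDTC": "2012-01-15"},
    {"USUBJID": "STUDY01-001", "AESEQ": 2, "AETERM": "HEADACHE",
     "AESTDTC": "2012-01-02"},
]

# Made-up date of first exposure to treatment per subject.
first_dose = {"STUDY01-001": date(2012, 1, 10)}


def derive_analysis_rows(ae_rows, first_dose_by_subject):
    """Derive analysis-ready rows, keeping AESEQ so each row traces to its source."""
    analysis = []
    for row in ae_rows:
        start = date.fromisoformat(row["AESTDTC"])
        treatment_start = first_dose_by_subject[row["USUBJID"]]
        analysis.append({
            "USUBJID": row["USUBJID"],
            "AESEQ": row["AESEQ"],  # link back to the tabulation record
            "AETERM": row["AETERM"],
            "ASTDT": start,
            "TRTSDT": treatment_start,
            # Simplified rule: treatment-emergent if onset is on or after first dose.
            "TRTEMFL": "Y" if start >= treatment_start else "",
        })
    return analysis


analysis_rows = derive_analysis_rows(ae, first_dose)
for row in analysis_rows:
    print(row["AETERM"], row["ASTDT"], repr(row["TRTEMFL"]))
```

Keeping the source sequence number on every derived row is one simple way to let a reviewer trace an analysis value back to the collected data.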
Clinical Data Acquisition Standards Harmonization
CDASH defines basic standards for the collection of clinical trial data in case report forms. As
stated on the CDISC website, “It describes the basic recommended (minimal) data collection
fields for 18 domains, including common header fields, and demographic, adverse events, and
other safety domains that are common to all therapeutic areas and phases of clinical research.”
The CDASH collection fields facilitate implementation of SDTM and ADaM. The CDASH document available on the CDISC website provides “recommendations and methodologies for
creating data collection instruments” as well as suggested CDASH domain tables. It also contains “commonly used CDISC controlled terminology” facilitating consistent data collection for
standard domains such as prior and concomitant medications (CM), drug accountability, ECG
test results, exposure and vital signs. A CDASH user guide is in development.
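One way to picture how standardized collection fields ease the later move to SDTM is a simple field mapping. In the Python sketch below, the case report form field names, the FIELD_MAP table and the day/month/year input format are all hypothetical. The only convention relied on is that SDTM-style date variables are expressed in ISO 8601 format.

```python
from datetime import datetime

# Hypothetical CRF field names mapped to SDTM-style variable names.
FIELD_MAP = {
    "subject_id": "USUBJID",
    "ae_term": "AETERM",
    "ae_start_date": "AESTDTC",
}

# CRF fields that hold dates and need reformatting.
DATE_FIELDS = {"ae_start_date"}


def crf_to_tabulation(crf_record, crf_date_format="%d/%m/%Y"):
    """Rename CRF fields and convert collected dates to ISO 8601 text."""
    row = {}
    for field, value in crf_record.items():
        target = FIELD_MAP[field]
        if field in DATE_FIELDS:
            value = datetime.strptime(value, crf_date_format).date().isoformat()
        row[target] = value
    return row


crf_record = {
    "subject_id": "STUDY01-001",
    "ae_term": "NAUSEA",
    "ae_start_date": "15/01/2012",
}
tabulation_row = crf_to_tabulation(crf_record)
print(tabulation_row)
```

When collection fields are standardized up front, a mapping like this stays short and can be reused from study to study.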
Other Electronic Data Standards
Controlled terminology is essential to harmonization and all CDISC standards use
terminology standards, which are developed and maintained by various standards organizations. CDISC and the National Cancer Institute (NCI) in the US are working together
on SDTM and SEND terminology standards, which include controlled standard vocabulary
and code sets. Refer to the NCI terminology resources webpage9 and the FDA study data
standards resources webpage for the current terminology standards supported by CBER,
CDER and CDRH.10 Terms defined by sponsors are not considered controlled terminology. A request to have new terms added to the standards dictionaries can be submitted
to NCI. In addition, CDISC glossaries of terms, including acronyms and abbreviations, are
available on the CDISC website.
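The practical effect of controlled terminology can be sketched as a simple check of collected values against a published codelist. In the Python example below, the CODELISTS table is a small illustrative subset and the records are made up. Current codelists should be taken from the NCI and FDA resources cited above.

```python
# Illustrative subset of codelists; not a complete or current terminology set.
CODELISTS = {
    "AESEV": {"MILD", "MODERATE", "SEVERE"},
    "SEX": {"F", "M", "U"},
}


def find_uncontrolled(rows, codelists):
    """Return (row index, variable, value) for each value outside its codelist."""
    problems = []
    for index, row in enumerate(rows):
        for variable, allowed in codelists.items():
            if variable in row and row[variable] not in allowed:
                problems.append((index, variable, row[variable]))
    return problems


rows = [
    {"USUBJID": "STUDY01-001", "AESEV": "MILD"},
    {"USUBJID": "STUDY01-002", "AESEV": "Slight"},  # sponsor-defined term
]

problems = find_uncontrolled(rows, CODELISTS)
print(problems)
```

A sponsor-defined term such as "Slight" is reported by the check because it does not appear in the codelist.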
The CDISC Protocol Representation Model (PRM) has been released and is available
on the CDISC website. The PRM provides “content and format standards supporting the
interchange of clinical trial protocol information.” It covers study design, eligibility criteria
and the requirements from the ClinicalTrials.gov and World Health Organization registries
and includes the Trial Design Model “representing the planned sequence of events and
treatment plan of a trial.”
The CDISC Laboratory Data Model Base Model Version 1.0.1 has been available for
implementation since 2003. It describes the content and format standards for transferring
clinical laboratory data between clinical laboratories and study sponsors. Specifications,
as well as the recent microbiology extension and range reference model standards, are available on
the CDISC website.
The Operational Data Model (ODM) is used for the transfer of case report form data.
Per the CDISC website, “ODM is designed to facilitate the archive and interchange of the
metadata and data for clinical research, its power being fully unleashed when data are collected from multiple sources.”
For additional standards in development (e.g., therapeutic area standards), refer to
the CDISC website.
Conclusion
You probably have noticed that letters sent by FDA, such as preliminary pre-IND comments
issued by CDER, now include a section entitled “Data Standards for Studies,” which encourages sponsors to consider the implementation and use of data standards for the design,
conduct and analysis of studies as early as possible in the product development lifecycle.
FDA is offering guidance and support to facilitate the submission of data in an electronic format for a more “efficient and comprehensive data review.” In 2008, FDA proposed amending
the regulations governing the format of clinical study data to require that data submitted
for NDAs, BLAs and Abbreviated New Drug Applications and their supplements and amendments be provided in electronic format using the CDISC standards (SDTM, ADaM, SEND,
CDASH, etc.).11 The draft FDA guidance document dated February 2012 also promotes
the use of data standards for the submission of Premarket Approval applications,
premarket notifications (510(k)s), Investigational Device Exemptions and Investigational
New Drug applications (INDs). The use of data standards for clinical and nonclinical studies
could potentially become a mandate when the use of the eCTD format is generalized to all
types of submissions. For those of us not yet using the CDISC data standards, now is the
time to start becoming familiar with them and planning for implementation while their use
is not yet mandatory. Ultimately, data standardization will reduce cost and time to market by
streamlining the clinical trial process and will increase the quality of medical research.
References
1. PDUFA IV Information Technology Plan, FDA-2008-N-0352. FDA website. www.fda.gov/ForIndustry/UserFees/PrescriptionDrugUserFee/ucm093567.htm. Accessed 3 March 2012.
2. Draft Guidance for Industry: Providing Regulatory Submissions in Electronic Format -- Standardized Study Data. February 2012. FDA website. www.fda.gov/downloads/Drugs/.../Guidances/UCM292334.pdf. Accessed 23 February 2012.
3. CDISC website. http://www.cdisc.org. Accessed 23 February 2012.
4. Dubman S, Hinkson B, Soloff D, Fritsche D and Tandon PK. “Genzyme’s GetSMART Program: Implementing Standards End-to-End.” CDISC Journal, October 2011. CDISC website. www.cdisc.org/stuff/contentmgr/files/.../cdisc_journal_dubman_etal_p2.pdf. Accessed 25 February 2012.
5. CDER Common Data Standards Issues Document (Version 1.1/December 2011). FDA website. www.fda.gov/downloads/Drugs/.../UCM254113.pdf. Accessed 25 February 2012.
6. Submission of Data in CDISC Format to CBER. 15 December 2010. FDA website. www.fda.gov/BiologicsBloodVaccines/DevelopmentApprovalProcess/ucm209137.htm. Accessed 25 February 2012.
7. Major Milestone in Development of New CDISC Device Standard. 5 March 2012. CDISC website. www.cdisc.org/content3469. Accessed 25 March 2012.
8. Op cit 5.
9. National Cancer Institute (NCI), Terminology Resources, CDISC Terminology. NCI website. www.cancer.gov/cancertopics/cancerlibrary/terminologyresources/cdisc. Accessed 2 April 2012.
10. US Food and Drug Administration, Study Data Standards Resources. FDA website. www.fda.gov/ForIndustry/DataStandards/StudyDataStandards/default.htm. Accessed 23 February 2012.
11. Office of Information and Regulatory Affairs, Reginfo.gov, Electronic Submission of Data from Studies Evaluating Human Drugs and Biologics, RIN: 0910-AC52. Reginfo.gov website. www.reginfo.gov/public/do/eAgendaViewRule?ruleID=284747. Accessed 25 February 2012.
Author
Michelle Conan-Cibotti, PhD, RAC (US, EU), is a vaccine scientific and regulatory specialist for the Vaccine Research Center and
the Division of AIDS, National Institute of Allergy and Infectious Diseases, National Institutes of Health. She has more than 14
years of regulatory experience, managing US INDs for biologics and drugs and international registrations for IVD devices. Conan-Cibotti is a member of the RAPS Board of Editors for Regulatory Focus and can be reached at [email protected].
© 2012 by the Regulatory Affairs Professionals Society. All rights reserved.