Computation architecture

Academic Computing
Daniella Meeker, PhD
Director, Clinical Research Informatics SC-CTSI
Assistant Professor of Preventive Medicine and Pediatrics
Why doesn’t health care work like Google?
Clinical Algorithms vs. Recommender Algorithms
Dimension | Google | Health System
Platforms for data collection | 1 per application | 1000s without interoperability
Accumulation of data | Continuous | Continuous
Randomized trials to improve performance | Continuous, inexpensive, implicit consent | Expensive, rare, ethical concerns, consent?
Number of competing objectives and incentives | 2: user, advertiser | 5-6: patient, clinician, insurer, pharma, hospital, caregiver
Cost of mistakes | Low (learning opportunities) | High
Authority for data access | Single | Multiple, including lawyers
Distribution of data | Multiple locations | Multiple locations
Computation architecture | Master-worker coordination | Silo
Analysis execution environment | Controlled | Serendipitous
Incentives for research participation | High | Low*
Distribution and comparison of algorithms | Software | Literature
Optimization, Evolution & Dissemination of Tools
o Optimizing requires an evaluation step
o No evaluation platform to analyze next steps
o Evaluation requires data
Machine Learning/System Science
o “Machine” ~ Automation/Efficiency
o “Learning” ~ Optimization and improvement
What is Data Science?
o What is the source of data for data science?
– Data about Data (Metadata)
o Applications and tools generate metadata about workflow, user experience, and effectiveness
– How can we use this to optimize USC's use of tools?
– Breadcrumbs for collaboration and improvement opportunities come from our application use
• Time tracking software can do this automatically
o Where are incentives for such evaluations in academic computing?
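A minimal sketch of how tool-generated metadata could be summarized, assuming a usage log with one record per session. The record fields (user, tool, minutes, task completed), the tool names, and the function names are all hypothetical and not drawn from any actual USC system.

```python
from collections import defaultdict

# Hypothetical usage-log records: (user, tool, minutes, task_completed)
usage_log = [
    ("alice", "tool_a", 30, True),
    ("bob", "tool_a", 45, False),
    ("alice", "tool_b", 20, True),
    ("carol", "tool_b", 25, True),
]

def summarize(log):
    """Aggregate sessions, total minutes, and completion rate per tool."""
    totals = defaultdict(lambda: {"sessions": 0, "minutes": 0, "completed": 0})
    for _user, tool, minutes, completed in log:
        entry = totals[tool]
        entry["sessions"] += 1
        entry["minutes"] += minutes
        entry["completed"] += int(completed)
    return {
        tool: {
            "sessions": entry["sessions"],
            "minutes": entry["minutes"],
            "completion_rate": entry["completed"] / entry["sessions"],
        }
        for tool, entry in totals.items()
    }

def shared_users(log):
    """Map each tool to its set of users: a 'breadcrumb' for finding collaborators."""
    users = defaultdict(set)
    for user, tool, _minutes, _completed in log:
        users[tool].add(user)
    return dict(users)

summary = summarize(usage_log)
collaborators = shared_users(usage_log)
```

Here `summary` supports the evaluation step (which tool is used and how often tasks finish), while `collaborators` points to people already working with the same tool.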
Clinical Data Research Networks
Common problem in clinical informatics
o No platform to compare applications and tools in a head-to-head competition
o Tools are developed and not matured after publication
o Collaboration costs are even higher than in other disciplines. How do we reduce the costs of collaboration?
o Need to distribute both master and worker software to collaborate
What is global, what is local?
o Algorithms are global
– Execution environments (software) may be local
o Workflow specifications can be global
– Workflow execution is local
o Data specifications can be global
– Data storage can be local or global
– Security policies may be local
o Regulations for security and privacy are global
– Interpretations may be local
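The global/local split above can be illustrated with a small master-worker sketch: one globally defined algorithm is sent to every site, each site runs it on data that stays local, and a local policy decides whether the aggregate is released. The `Site` class, the `min_records` threshold, and the function names are illustrative assumptions, not part of any specific network software.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Site:
    name: str
    records: List[float]   # local data; never leaves the site
    min_records: int = 3   # local security policy (hypothetical threshold)

    def run(
        self, algorithm: Callable[[List[float]], Dict[str, float]]
    ) -> Optional[Dict[str, float]]:
        """Execute a globally defined algorithm locally; release only if policy allows."""
        if len(self.records) < self.min_records:
            return None  # local policy blocks release of a too-small aggregate
        return algorithm(self.records)

def local_summary(values: List[float]) -> Dict[str, float]:
    """Global algorithm: returns aggregates only, no row-level data."""
    return {"n": len(values), "total": sum(values)}

def pooled_mean(sites: List[Site]) -> Optional[float]:
    """Master step: combine whatever aggregates the workers released."""
    parts = [r for r in (s.run(local_summary) for s in sites) if r is not None]
    n = sum(p["n"] for p in parts)
    return sum(p["total"] for p in parts) / n if n else None

sites = [
    Site("site_a", [1.0, 2.0, 3.0]),
    Site("site_b", [4.0, 5.0, 6.0, 7.0]),
    Site("site_c", [100.0]),  # blocked by its own local policy
]
result = pooled_mean(sites)
```

The algorithm (`local_summary`) is global, while execution and the release decision are local. Here `site_c` contributes nothing because its own policy blocks it.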
Distributing Innovation & Evaluation in Clinical Informatics
[Figure: workflow diagram of a distributed clinical research network, with numbered steps and iteration loops. Roles and their actions as labeled in the diagram:
– Principal Investigator: Define Protocol, Add Staff and Roles, Define Data Set, Nominate Sites, Specify Analytics
– Site Authority: Execute Data Extract or View, Instantiate Data Set, Register Resource, Approve Release Mechanism
– Authorized Study Investigator: Approve Protocol, Specify Model, Invoke Protocol, Display Results
– Tool developers
Components: PopMedNet, DataMartClient, Data Sets, Analysis Packages, Result Sets
Labeled flows: Approved Query, Data Set Extract Query, Approved Results]
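The role-gated workflow in the diagram can be sketched as a small state machine. The ordering of steps below is an assumption, since the step sequence is not recoverable from the diagram text; the state names and the `Protocol` class are hypothetical, and this is not the PopMedNet API.

```python
from enum import Enum, auto

class State(Enum):
    DRAFT = auto()
    DEFINED = auto()
    APPROVED = auto()
    INVOKED = auto()
    EXTRACTED = auto()
    RELEASED = auto()

# (current state, action) -> (role allowed to act, next state).
# The sequence is assumed for illustration only.
TRANSITIONS = {
    (State.DRAFT, "define_protocol"): ("principal_investigator", State.DEFINED),
    (State.DEFINED, "approve_protocol"): ("authorized_study_investigator", State.APPROVED),
    (State.APPROVED, "invoke_protocol"): ("authorized_study_investigator", State.INVOKED),
    (State.INVOKED, "execute_data_extract"): ("site_authority", State.EXTRACTED),
    (State.EXTRACTED, "approve_release"): ("site_authority", State.RELEASED),
}

class Protocol:
    """Tracks one study protocol as it moves between roles."""

    def __init__(self):
        self.state = State.DRAFT
        self.history = []  # audit trail of (role, action)

    def act(self, role, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not valid in state {self.state.name}")
        required_role, next_state = TRANSITIONS[key]
        if role != required_role:
            raise PermissionError(f"{role!r} may not perform {action!r}")
        self.history.append((role, action))
        self.state = next_state
        return self.state

protocol = Protocol()
protocol.act("principal_investigator", "define_protocol")
protocol.act("authorized_study_investigator", "approve_protocol")
protocol.act("authorized_study_investigator", "invoke_protocol")
protocol.act("site_authority", "execute_data_extract")
protocol.act("site_authority", "approve_release")
```

Keeping the site authority as the only role that can extract and release data mirrors the idea that execution and release decisions stay local to each site.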
Clinical Informatics at USC
Lessons from the mini-DEWARS experiment
DEWARS Clinical Research Data Warehouse
o Collaboration for enterprise warehouse to be used for biomedical data analytics
o Sponsored by SC-CTSI, USC, CHLA
o Data harmonization with Los Angeles County, but distributed storage and stewardship
o 18-month timeline to first release
Mini-DEWARS
o Data set without personal identifiers from CHLA and Keck electronic medical records
o i2b2 application: the “academic standard” for clinical data exploration and cohort identification
o Collecting information about data sources and policies around specific test-cases
– Test-case #1: the LIBERATE study – handoff back to health system to identify patients for consent and contact
– Test-case #2: Los Angeles Data Resource – sharing metadata (patient counts) with UCLA and Cedars-Sinai researchers
– Test-case #3: TriNetX – same metadata, different software, industry clients
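Sharing metadata as patient counts, as in test-cases #2 and #3, can be sketched as a cohort-count query over de-identified records. The record fields, the criteria, and the small-cell suppression threshold `min_cell=10` are assumptions for illustration; they are not taken from i2b2, the Los Angeles Data Resource, or TriNetX.

```python
from typing import Callable, Dict, List, Optional

Patient = Dict[str, object]

def cohort_counts(
    patients: List[Patient],
    criteria: Dict[str, Callable[[Patient], bool]],
    min_cell: int = 10,  # hypothetical suppression threshold
) -> Dict[str, Optional[int]]:
    """Return one count per criterion; counts below min_cell are suppressed (None)."""
    counts: Dict[str, Optional[int]] = {}
    for name, predicate in criteria.items():
        n = sum(1 for p in patients if predicate(p))
        counts[name] = n if n >= min_cell else None
    return counts

# Hypothetical de-identified records (no names, dates, or record numbers)
patients = [{"age": 8, "dx": "asthma"}] * 12 + [{"age": 15, "dx": "diabetes"}] * 3

criteria = {
    "asthma": lambda p: p["dx"] == "asthma",
    "diabetes": lambda p: p["dx"] == "diabetes",
    "under_18": lambda p: p["age"] < 18,
}

counts = cohort_counts(patients, criteria)
```

Only the `counts` dictionary would be shared with outside researchers; row-level records stay with the health system.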
Lessons Learned about USC from the mini-DEWARS data warehouse experiment
o $120K investment by SC-CTSI (Buchanan, NIH)
– “Embedded” and empowered staff at Keck made it a 6-week process to get from clinical data warehouse to research data warehouse
– “Embedded” staff absent at CHLA, no clinical data warehouse, no lines of authority; 6-month process
o CHLA and USC centralized research data warehouse
– Data are not well understood
– Project management styles are very different
o The policy infrastructure is more important than technical infrastructure
– Funding
– Data access
– Decision-making authority is fuzzy
o Clinical researchers are innovative, motivated, frustrated
o Health system authorities are cautious
o Many entrepreneurial aspirations from students
o No model yet to ensure benefits are bidirectional, balancing risk to health systems with business intelligence benefits back to the health system.