
R T U New York State
Center of Excellence in
Bioinformatics & Life Sciences
Ontology and the future of
Evidence-Based Medicine
Dagstuhl, May 23rd, 2006
Werner Ceusters, MD
Ontology Research Group
Center of Excellence in Bioinformatics & Life Sciences
SUNY at Buffalo, NY
Evidence Based Medicine
• the integration of best research evidence with clinical
expertise and patient values.
– best research evidence: clinically relevant patient centered
research into the accuracy and precision of diagnostic tests, the
power of prognostic markers, and the efficacy and safety of care
regimens.
– clinical expertise:
the ability to use clinical skills and past
experience to rapidly identify each patient's unique health state.
– patient values:
the unique preferences, concerns and
expectations each patient brings to a clinical encounter and
which must be integrated into clinical decisions if they are to
serve him.
Application of Evidence Based Medicine
• Now:
– Decisions based on (motivated/justified by) the
reproducible results of well-designed studies
• Guidelines and protocols
– Evidence is hard to get, takes time to accumulate.
• Future:
– Each discovered fact or expressed belief should
instantly become available as contributing to
‘evidence’, wherever its description is generated.
Future scenarios
• Data entered about a successful treatment of a case
in X generates a suggestion for a similar case in Y;
• Submission of a new paper to PubMed on some
adverse drug reaction (ADR) triggers an alert in EHR
systems worldwide for those patients who might be at risk;
• …
 From reactive care to proactive care
Some problem areas
• Pharmaceutical Industry:
– Optimise drug discovery
• Make “messy” databases more useful for everybody
• Consumer health:
– Opposing forces:
• Quality of information
• Make them consume
• Malpractice suits
• Public sector health:
– Cost containment
• Cost-effectiveness of treatment, prevention
• Bio-informatics world:
– How to find out that a ‘discovery’ is a ‘new’ discovery ?
An action plan for a European eHealth Area.
• By the start of 2005:
• MS and EC should agree on an overall approach to benchmarking in
order to assess the quantitative, including economic, and qualitative
impacts of e-Health.
• By end 2006:
• in order to achieve a seamless exchange of health information across
Europe through common structures and ontologies, MS, in collaboration
with the EC, should identify and outline interoperability standards for
health data messages and electronic health records, taking into
account best practices and relevant standardisation efforts.
• By end 2008:
• the majority of all European health organisations and health regions
(communities, counties, districts) should be able to provide online
services such as teleconsultation (second medical opinion), e-prescription, e-referral, telemonitoring and telecare.
One key issue: Semantic Interoperability
• Working definition:
– Two information systems are semantically interoperable if
and only if each can carry out the tasks for which it was
designed using data and information taken from the other as
seamlessly as using its own data and information.
system: Any organized assembly of resources and procedures
united and regulated by interaction or interdependence to
accomplish a set of specific functions.
information system: a system, whether automated or manual,
that comprises people, machines, and/or methods organized to
collect, process, transmit, and disseminate data that represent
user information.
Essential components
• People: physicians, nurses, patients, healthcare
administrators, ...
• Machines (communication & interpretation):
– to make humans interact with the EHR,
– to transmit data from one EHR to another
– to enter data (lab analysers, EMR monitors, ...)
– to interpret data (alerts, quality assessment, protocol
selection, ...)
• Data and information (data in context)
• Procedures
Understanding content (1)
“John Doe has a pyogenic
granuloma of the left thumb”
John Doe has a pyogenic granuloma of the left thumb
Understanding content (2)
<record>
<patient>John Doe</patient>
<diagnosis>pyogenic granuloma of the left thumb</diagnosis>
</record>
<record>
<subject> John Doe </subject>
<diagnosis> pyogenic granuloma
of the left thumb </diagnosis>
</record>
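The point of the two records can be made concrete with a small sketch. The Python below (standard library only) parses both records and shows that a program written against the first layout finds no patient in the second, although a human reader sees the same statement in both. The helper names `patient_name` and `diagnosis` are hypothetical and serve only this illustration.

```python
import xml.etree.ElementTree as ET

RECORD_A = """<record>
<patient>John Doe</patient>
<diagnosis>pyogenic granuloma of the left thumb</diagnosis>
</record>"""

RECORD_B = """<record>
<subject> John Doe </subject>
<diagnosis> pyogenic granuloma
of the left thumb </diagnosis>
</record>"""


def patient_name(xml_text):
    """Return the text of the <patient> element, or None if there is none."""
    root = ET.fromstring(xml_text)
    node = root.find("patient")
    return node.text.strip() if node is not None else None


def diagnosis(xml_text):
    """Return the diagnosis text with whitespace normalised."""
    root = ET.fromstring(xml_text)
    return " ".join(root.findtext("diagnosis").split())


name_a = patient_name(RECORD_A)   # the tag is <patient>, so a name is found
name_b = patient_name(RECORD_B)   # the tag is <subject>, so nothing is found
same_diagnosis = diagnosis(RECORD_A) == diagnosis(RECORD_B)
```

Both records are well-formed, so the parser raises no error; the mismatch only shows up as a missing value, which is the kind of silent failure semantic interoperability is meant to prevent.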
Understanding content (3)
<129465004>
<116154003>John Doe</116154003>
< 8319008 > 17372009
<finding site> 76505004
<laterality>7771000</laterality>
</finding site>
</ 8319008 >
</129465004>
Ontology based on Unqualified Realism
• Accepts the existence of
– a real world outside mind and language
– a structure in that world prior to mind and language
(universals / particulars)
• Rejects nominalism, conceptualism, ontology as a
matter of agreement on ‘conceptualizations’
• Uses reality as a benchmark for testing the quality
of ontologies as artifacts by building appropriate
logics with referential semantics (rather than
model-theoretic)
Relevance for EHR & Semantic Interoperability
The conceptualist approach
[Figure labels: Reality, Belief, Ontology, EHR]
Relevance for EHR & Semantic Interoperability
The realist approach
[Figure labels: Reality, Looking glass, Ontology, EHR]
Terminology
• A theory concerned with those aspects of the
nature and the functions of language which permit
the efficient representation and transmission of
items of knowledge (J. Sager)
• Precise and appropriate terminologies provide
important facilities for human communication (J.
Gamper)
Ontology
• An ontology is a representation of some pre-existing
domain of reality which
– (1) reflects the properties of the objects within its domain in
such a way that there obtains a systematic correlation between
reality and the representation itself,
– (2) is intelligible to a domain expert
– (3) is formalized in a way that allows it to support automatic
information processing
A division of labour
• Terminology:
– Communication amongst humans
– Communication between human and machine
• Ontology:
– Representation of “reality” inside a machine
– Communication amongst machines
– Interpretation by machines
Today’s biggest problem: a confusion between
“terminology” and “ontology”
• The conditions agreed upon for when a certain term may be
used to denote an entity often differ from the conditions
that make the entity what it is.
– Trees would still be different from rabbits if there were
no humans to agree on what these things should be
called.
• “ontos” means “being”. The link with reality tends
to be forgotten: one concentrates on the models
instead of on the reality.
What to do about it ? (1)
• Research:
– Revision of the appropriateness of concept-based
terminology for our purposes
– Relationship between models and that part of reality
that the models want to represent
– Adequacy of current tools and languages for
representation
– Boundaries between terminology and ontology and the
place of each in semantic interoperability in healthcare
What to do about it ? (2)
• Training and awareness
– Make people more critical with respect to terminology
and ontology promises
• What is needed must be based on needs, not on the
popularity of a new concept
• But in a system, it’s not just your own needs, it is each
component’s needs !
– Towards “an ontology of ontologies”
• First description
• Then quality criteria
Ultimate goal
Ontology
continuant
disorder
person
CAG repeat
EHR
Juvenile HD
#IUI-1 ‘affects’ #IUI-2
#IUI-3 ‘affects’ #IUI-2
#IUI-1 ‘causes’ #IUI-3
...
Referent Tracking
Database
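As a rough illustration of what such a database holds, the sketch below stores the three statements from the slide as plain tuples and looks one up. It assumes nothing about the real Referent Tracking Database: the `Statement` type and the `subjects_of` helper are hypothetical, and which particular each `#IUI-n` stands for is deliberately left open, as on the slide.

```python
from collections import namedtuple

# One statement relating two particulars, each identified by its IUI
# (the "#IUI-n" identifiers on the slide). Hypothetical structure.
Statement = namedtuple("Statement", ["subject", "relation", "obj"])

# The three statements shown on the slide, stored as data.
STATEMENTS = [
    Statement("#IUI-1", "affects", "#IUI-2"),
    Statement("#IUI-3", "affects", "#IUI-2"),
    Statement("#IUI-1", "causes", "#IUI-3"),
]


def subjects_of(relation, obj, statements=STATEMENTS):
    """Return the IUIs that stand in `relation` to the particular `obj`."""
    return [s.subject for s in statements
            if s.relation == relation and s.obj == obj]


affecting_iui_2 = subjects_of("affects", "#IUI-2")
```

Because every statement refers to particulars by identifier rather than by description, two statements about the same entity can be joined without any string matching.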
3 fundamentally different levels
1. the reality on the side of the patient;
2. the cognitive representations of this reality
embodied in observations and interpretations on
the part of clinicians and others;
3. the publicly accessible concretizations of such
cognitive representations in representational
artifacts of various sorts, of which ontologies
and terminologies are examples.
Example: the phenotypic gender of a person (in this room)
• Reality:
– Male
– Female
– Other types of phenotypic gender ?
• Cognitive representation
– [male]
– [female]
• In the EHR:
– “male”
– “female”
– “unknown”
4 fundamental reasons for making changes
1. changes in the underlying reality
• does the appearance of an entry (in a new version of
an ontology or in an EHR) relate to the appearance of
an entity or a relationship among entities in reality ?;
2. changes in our (scientific) understanding;
3. reassessments of what is considered to be
relevant for inclusion (notion of purpose); or
4. encoding mistakes introduced during data entry
or ontology development.
Key requirement
Any change in an ontology
or data repository should
be associated with the
reason for that change !
Example: the gender of a person (in this room) in the EHR
• In John Smith’s EHR:
– At t1: “male”
– At t2: “female”
• What are the possibilities ?
• Change in reality: transgender surgery
• Change in understanding: it was female from the very
beginning but interpreted wrongly
• ( No change in relevance )
• Correction of data entry mistake
(was understood as male, but wrongly transcribed)
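One way to make the key requirement enforceable is to refuse any change that arrives without a reason. The Python sketch below is a minimal illustration under that assumption; the `ChangeReason` and `Change` names are hypothetical and are not taken from the Referent Tracking system.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeReason(Enum):
    """The four reasons for change named in this talk."""
    REALITY = "change in the underlying reality"
    UNDERSTANDING = "change in our understanding"
    RELEVANCE = "reassessment of relevance"
    MISTAKE = "correction of an encoding mistake"


@dataclass(frozen=True)
class Change:
    """One recorded change; a reason is mandatory (hypothetical structure)."""
    attribute: str
    old: str
    new: str
    reason: ChangeReason

    def __post_init__(self):
        # Reject changes that arrive without a proper reason.
        if not isinstance(self.reason, ChangeReason):
            raise TypeError("every change must carry a ChangeReason")


def was_reality_different(change):
    """True only if the patient, not just the record, changed between t1 and t2."""
    return change.reason is ChangeReason.REALITY


# The same edit in the record, with two different explanations.
surgery = Change("gender", "male", "female", ChangeReason.REALITY)
typo = Change("gender", "male", "female", ChangeReason.MISTAKE)
```

The two records are identical in what they changed and differ only in why, which is exactly the information a reader at t2 needs in order to interpret the value stored at t1.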
Possible combinations
[Table: one row per combination, with columns Reality (OE, ORV), Understanding (BE, BRV), Encoding (Int., Ref.), G and E. The Y/N and Ref. cell values are not legible in this copy; the G and E columns read as follows.]

Case   G    E
P+1    G1   0
A+1    G2   0
A+2    G3   0
P-1    -    3
P-2    G4   4
P-3    G5   5
P-4    G4   1
P-5    G5   2
P-6    G1   1
P-7    G4   2
P-8    G5   3
A-1    G3   1
A-2    G2   1
A-3    G3   1
A-4    G2   1

OE: objective existence; ORV: objective relevance;
BE: belief in existence; BRV: belief in relevance;
Int.: intended encoding;
Ref.: manner in which the expression refers;
G: typology which results when the factor of external reality is ignored;
E: number of errors when measured against the benchmark of reality;
P/A: presence/absence of term.
Possible evolutions
Towards an implementation
• A client-server application in which the server is
composed of four layers:
– the Web Server Layer (WSL) provides the interface to clients
via web services;
– the RT Core application programming interface (API)
encapsulates the data services related to storage and retrieval.
Its Security Module validates the access rights before any data
service is carried out;
– the database layer stores all the RT data; and
– the reasoner layer (RL) performs inferences upon specific
requests, based on the information available in the database
and, if available, the ontologies that have been used for the
descriptions of the portions of reality.
Shahid Manzoor
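The four layers described above can be sketched as a handful of small classes. This is only an illustrative outline under stated assumptions, not the actual RT server: every class and method name is hypothetical, the database is an in-memory list, and the "reasoner" is a simple filter standing in for real inference over data and ontologies.

```python
class Database:
    """Database layer: stores all RT data (here, an in-memory list)."""
    def __init__(self):
        self.rows = []


class SecurityModule:
    """Validates access rights before any data service is carried out."""
    def __init__(self, allowed_clients):
        self.allowed = set(allowed_clients)

    def check(self, client):
        if client not in self.allowed:
            raise PermissionError(f"{client} has no access rights")


class CoreAPI:
    """RT Core API: encapsulates storage and retrieval."""
    def __init__(self, database, security):
        self.db, self.security = database, security

    def store(self, client, statement):
        self.security.check(client)
        self.db.rows.append(statement)

    def retrieve(self, client):
        self.security.check(client)
        return list(self.db.rows)


class Reasoner:
    """Reasoner layer: answers specific requests from the stored data."""
    def __init__(self, core):
        self.core = core

    def related_to(self, client, iui):
        # Keep every stored tuple that mentions the given identifier.
        return [s for s in self.core.retrieve(client) if iui in s]


class WebServerLayer:
    """Web Server Layer: the only entry point offered to clients."""
    def __init__(self, core, reasoner):
        self.core, self.reasoner = core, reasoner

    def handle(self, client, action, payload):
        if action == "store":
            self.core.store(client, payload)
            return "stored"
        if action == "query":
            return self.reasoner.related_to(client, payload)
        raise ValueError(f"unknown action: {action}")


core = CoreAPI(Database(), SecurityModule({"institution-A"}))
server = WebServerLayer(core, Reasoner(core))
server.handle("institution-A", "store", ("#IUI-1", "affects", "#IUI-2"))
hits = server.handle("institution-A", "query", "#IUI-2")
```

Routing every read and write through `CoreAPI` means the access check cannot be bypassed by the web layer or the reasoner, which mirrors the role the slide gives to the Security Module.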
Schematic representation
[Figure labels: Referent Tracking (RT) Server, containing a Web Server (Referent Tracking Web services), the Referent Tracking Core System API (Security Module, Session Manager), a Reasoner, RT Data, and Imported Ontologies / rules. EHR Clients at Health Institution A (hosting RT) and Health Institution B (registered with RT); Internet.]
Simple Graph Representation
Privacy issues
Complete structure
UML-diagram for the entities in the RDF-graph
Querying the RTDB using ontologies (SPARQL)
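PREFIX rts: <http://ecor#>
PREFIX fma: <http://FMA#>
SELECT ?p ?u ?t
WHERE {
  ?p rts:relation ?u .   # particular ?p takes part in relation ?u
  ?u a rts:PtoU .        # ?u is of type rts:PtoU
  ?u rts:u ?t .          # the rts:u side of ?u is ?t
  ?t a fma:Face .        # ?t is typed as fma:Face
}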
Retrieve particulars
that are related to the
universal face
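To show what the four triple patterns in the query select, the sketch below applies them by hand to a few in-memory triples. It is a pure-Python stand-in for illustration, not a SPARQL engine and not the RTDB: the identifiers `p1`, `r1`, `t1` and so on are made up, and `rdf:type` is written out where the query uses the shorthand `a`.

```python
# Hypothetical triples; the identifiers are made up for illustration.
TRIPLES = {
    ("p1", "rts:relation", "r1"),
    ("r1", "rdf:type", "rts:PtoU"),
    ("r1", "rts:u", "t1"),
    ("t1", "rdf:type", "fma:Face"),
    ("p2", "rts:relation", "r2"),
    ("r2", "rdf:type", "rts:PtoU"),
    ("r2", "rts:u", "t2"),
    ("t2", "rdf:type", "fma:Head"),
}


def objects(subject, predicate):
    """All o such that (subject, predicate, o) is in TRIPLES."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def particulars_related_to(universal):
    """Mirror the four triple patterns of the query, one per step."""
    results = set()
    for p, pred, u in TRIPLES:
        if pred != "rts:relation":                      # ?p rts:relation ?u
            continue
        if "rts:PtoU" not in objects(u, "rdf:type"):    # ?u a rts:PtoU
            continue
        for t in objects(u, "rts:u"):                   # ?u rts:u ?t
            if universal in objects(t, "rdf:type"):     # ?t a <universal>
                results.add((p, u, t))
    return results


faces = particulars_related_to("fma:Face")
```

Only `p1` is returned for `fma:Face`, because `p2` is linked to something typed as `fma:Head`; a real engine with the ontology loaded could additionally follow subclass or part-of links, which this sketch does not attempt.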
Querying the RTDB using ontologies (SPARQL)
WHERE {
  ?p rts:relation ?rf .
  ?rf a rts:PtoP .
  ?rf rts:p ?f .
  ?f a fma:Head .
  ?f rts:relation ?rd .
  ?rd a rts:PtoP .
  ?rd rts:p ?d .
  ?d a dis:DISEASES AND INJURIES .
}
Retrieve
patients with
diseases in the
head
Test interface