Construction And Evaluation Of OWL-DL Ontologies
Mark Wilkinson
Assistant Professor
Department of Medical Genetics
University of British Columbia
iCAPTURE Centre, St. Paul’s Hospital
Presenting the work of
Benjamin Good, M.Sc.
Wilkinson Laboratory
Bioinformatics Doctoral Programme, UBC
Our Perspective
“We believe that [centralized ontology building] efforts are unsustainable and that the Semantic Web will eventually be built in the same way as the WWW was – by its users”
Good and Wilkinson, “The Life Sciences Semantic Web is Full of Creeps!”, Briefings in Bioinformatics (in press)
Why Do We Think This Way?
BioMoby: Mass collaborative ontology building to support Web Services Interoperability
What Does BioMoby Do?
The MOBY Plan
Create an ontology of bioinformatics data-types
Define an XML representation of this ontology
Create an ontology of bioinformatics operations
Open these ontologies to public input
Define Web-service interfaces in terms of these two ontologies
Register Interfaces in an ontology-aware Registry
A Machine can find an appropriate service
A Machine can execute that service unattended
Ontology is community-extensible
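To make the registry idea concrete, here is a minimal sketch in Python. It is not the real BioMoby API or XML format; the data-type names, registry contents, and function names are illustrative assumptions. The point is simply that an ontology-aware registry can match a request against registered services by walking the data-type “is a” hierarchy, so a service registered for a parent type is also discovered for its subtypes.

```python
# Hypothetical (not the real BioMoby) data-type ontology: child -> parent.
DATA_TYPES = {
    "DNASequence": "GenericSequence",
    "AminoAcidSequence": "GenericSequence",
    "GenericSequence": "Object",
}

# Hypothetical registry: service name -> the data type it consumes.
REGISTRY = {
    "BLASTn": "DNASequence",
    "renderSequence": "GenericSequence",
}

def ancestors(data_type):
    """Yield the type itself and every ancestor up the is-a chain."""
    while data_type is not None:
        yield data_type
        data_type = DATA_TYPES.get(data_type)

def find_services(data_type):
    """Return services whose declared input is the type or one of its ancestors."""
    lineage = set(ancestors(data_type))
    return [name for name, consumes in REGISTRY.items() if consumes in lineage]

print(find_services("DNASequence"))        # ['BLASTn', 'renderSequence']
print(find_services("AminoAcidSequence"))  # ['renderSequence']
```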
Take home message
…this was built by a community of non-expert ontologists!
Open Kimono Time
The BioMoby ontology is quite messy…
…communal brains can build useful ontologies, but we will need better tooling
How are ontologies usually constructed?
By A Few People With Lots Of Moola!
Gene Ontology
Curated: ~5 full-time staff
$25 Million (Lewis, S., personal communication)
National Cancer Institute Metathesaurus
Curated: ~12 full-time staff
$75 Million (personal estimate)
Health Level 7 (HL7)
Curated – staffing unknown
$15 Billion(?) (Smith, Barry, KBB Workshop, Montreal, 2005)
Why does it cost so much??
To build the Semantic Web for Life Sciences we need to encode knowledge from EVERY domain of biology, from barley root apex structure and function to HIV clinical-trial outcomes… and this knowledge is constantly changing!
At >>$25M a pop, can we afford the Semantic Web???
The iCAPTURer Method
Template-Assisted Ontology Construction
Pre-iCAPTURer
Extract the brain of one or a very few experts: expensive and time-consuming…
iCAPTURer
Consume as many brains as possible
The iCAPTURer Experiment
Hypotheses
With a starting thesaurus of concepts
With a clear, simple interface for linking them
“wet” researchers can create a robust ontology themselves
Using carefully-defined templates, a Knowledge Engineer can control the structure of an ontology without controlling, or even understanding, the content
Knowledge Capture Parameters
Domain: Cardiovascular and Pulmonary disease, both clinical and molecular
Capture Scope
Thesaurus construction
Definitions (unevaluated)
Synonymy (same as) relations
Hyponymy (is a) relations
Ontology Task: Ontological classification of conference abstracts to aid in semantic searching
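As a rough illustration of what this capture task accumulates (a sketch under our own assumptions, not the project's actual data model), the thesaurus can be held as concepts with unevaluated definitions plus “same as” and “is a” links:

```python
from collections import defaultdict

class Thesaurus:
    """Concepts plus the synonymy and hyponymy assertions gathered from experts."""

    def __init__(self):
        self.definitions = {}             # concept -> unevaluated free-text definition
        self.synonyms = defaultdict(set)  # concept -> "same as" concepts
        self.parents = defaultdict(set)   # concept -> "is a" parents

    def add_concept(self, term, definition=""):
        self.definitions.setdefault(term, definition)

    def add_synonym(self, a, b):           # symmetric "same as" relation
        self.synonyms[a].add(b)
        self.synonyms[b].add(a)

    def add_hyponym(self, child, parent):  # directed "is a" relation
        self.parents[child].add(parent)

t = Thesaurus()
t.add_concept("cardiac myocyte", "a muscle cell of the heart")  # illustrative entry
t.add_hyponym("cardiac myocyte", "cardiac cell")
t.add_synonym("STEMI", "ST Elevated Myocardial Infarction")
```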
Interface
Chatterbot
“I’ve heard that a cardiac myocyte is a type of cardiac cell. Is this true?”
“I’ve heard that STEMI means the same thing as ST Elevated Myocardial Infarction. Is that nonsense, or is it correct?”
“How do you feel about your mother?”
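A minimal sketch of the template idea behind these prompts (our own code; the function and template names are hypothetical): the knowledge engineer fixes the question structure, text-mined candidate terms fill in the content, and the expert only has to answer yes or no.

```python
# Question templates control the *structure* of what gets captured;
# the candidate terms supply the *content*.
TEMPLATES = {
    "is_a":    "I've heard that {child} is a type of {parent}. Is this true?",
    "same_as": "I've heard that {a} means the same thing as {b}. "
               "Is that nonsense, or is it correct?",
}

def ask(relation, **terms):
    """Render one candidate assertion as a question and record the expert's verdict."""
    question = TEMPLATES[relation].format(**terms)
    answer = input(question + " [y/n] ")
    return {"relation": relation, "terms": terms,
            "accepted": answer.strip().lower() == "y"}

# One knowledge-capture event might look like:
# ask("is_a", child="cardiac myocyte", parent="cardiac cell")
```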
Results Over 5 days
Concepts accepted and expert-validated: 661
Text-mined concepts rejected: 232
Relationships captured: 547
Number of distinct expert knowledge capture events in 5 days: >12,000!!
This is approximately the size of the GO
Cost: 4 pints of beer, 4 coffee mugs, 3 T-shirts, 1 chocolate Moose
Was built entirely by volunteers
Full details of this experiment are available in:
Proceedings of the Pacific Symposium on Biocomputing, 2006
Subjective iCAPTURer Observations
Humans had an extremely difficult time classifying things into pre-existing categories
Humans had an extremely difficult time defining new categories and placing them into the existing classification system
How Do We Know If It Is Any Good?
Templates control structure, but not content
Structurally sound, logically valid ontologies can still be nonsensical!
How do we measure the quality of an ontology?
Possible Quality Metrics
Domain independent
Philosophical desiderata: slow, subjective
Graphical structure: fast, questionable value
Satisfiability: fast, useful, not enough
Instance-based: fast in theory, useful…
Domain specific
“Fit” to text: fast, dependent on NLP
Similarity to a gold standard: fast to run, extremely slow to set up
Task-based: real, but not generalizable
Problem: Evaluating The Metrics
No clear winner has yet emerged from the morass of metrics
A “global” winner is unlikely to be found
Each seems to have some benefits and some disadvantages
Each may be useful for one ontology but not another
How do we evaluate which metrics are useful for evaluating our ontologies?
Ontology Permutation As A Metrics-Evaluation Tool
Take an ontology that everyone agrees is “good”
Make it worse by systematically adding random changes (noise)
Quality metric should correlate with the amount of noise added
An Objective Comparison Of Ontology Quality Metrics
[Chart: measured ontology quality (y-axis) for Quality Metric 1 and Quality Metric 2 versus the amount of noise added (x-axis, ontology quality decreasing)]
Adding Noise To Ontologies
Maintain the same number of classes and relationships, as well as satisfiability
Add noise by swapping relationships attached to pairs of classes
Sub/superclass
Domain/range, etc.
Validate with the Pellet reasoner
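A toy version of the whole procedure, as a sketch under our own simplifying assumptions (this is not the authors' code, and the quality metric below is only a placeholder): an ontology is reduced to a list of (subclass, superclass) assertions, noise is added by swapping the superclasses of randomly chosen pairs so that class and relationship counts stay constant, and the candidate metric is checked for correlation with the amount of noise. Satisfiability checking with Pellet is omitted.

```python
import random
from scipy.stats import spearmanr

# Trusted "gold" ontology as (subclass, superclass) assertions.
GOLD = [("dolphin", "air breathing"), ("fisherman", "air breathing"),
        ("tuna", "water breathing"), ("shark", "water breathing"),
        ("ship", "non breathing"), ("seaweed", "non breathing")]

def add_noise(axioms, n_swaps, rng):
    """Swap the superclasses of n_swaps random pairs of assertions.
    Class and relationship counts are unchanged; content degrades."""
    noisy = list(axioms)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(noisy)), 2)
        (c1, p1), (c2, p2) = noisy[i], noisy[j]
        noisy[i], noisy[j] = (c1, p2), (c2, p1)
    return noisy

def metric(reference, candidate):
    """Placeholder quality metric: fraction of assertions left intact.
    A real experiment would plug in the metric under evaluation."""
    return sum(a == b for a, b in zip(reference, candidate)) / len(reference)

rng = random.Random(0)
noise_levels = list(range(0, 6))
scores = [metric(GOLD, add_noise(GOLD, n, rng)) for n in noise_levels]
rho, _ = spearmanr(noise_levels, scores)
print(scores)
print(rho)   # a metric worth keeping should fall as noise rises (rho < 0)
```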
Quantifying Noise
Simple number of changes is misleading, and not a good measure of “noise”
Noise better quantified by the degree of (dis)similarity between the permuted ontology and the source ontology
Maedche, A. and Staab, S., Measuring Similarity between Ontologies. Lecture Notes in Computer Science, 2002, p. 251.
Example Of Similarity Measurement: Semantic Distance
[Figure: an “air-centric” ontology of aquatic things, classified by how they breathe (air breathing, water breathing, non-breathing), containing fishermen, dolphins, fish, anchovies, sharks, tuna, seaweed, ships, sand, and water. Semantic distance: Dolphins ↔ Fishermen = 0; Dolphins ↔ Fish = 4.]
Example Of Similarity Measurement: Semantic Distance
[Figure: the same aquatic things arranged in a “leg-centric” ontology, classified by whether they have legs (has legs, no legs). Semantic distance: Dolphins ↔ Fishermen = 4; Dolphins ↔ Fish = 0.]
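A small sketch of the idea behind these two figures (our own code, using plain edge counting rather than the exact Maedche and Staab measure, so the absolute numbers need not match the slides): the distance between two terms is the number of edges on the path through their closest shared ancestor, and it flips completely when the classification principle changes.

```python
# Toy taxonomies: child -> parent. The groupings echo the slides but are
# simplified; only a few leaves are included.
AIR_CENTRIC = {"air breathing": "aquatic things", "water breathing": "aquatic things",
               "dolphins": "air breathing", "fishermen": "air breathing",
               "fish": "water breathing"}
LEG_CENTRIC = {"has legs": "aquatic things", "no legs": "aquatic things",
               "fishermen": "has legs", "dolphins": "no legs", "fish": "no legs"}

def path_to_root(taxonomy, term):
    """Return the list of nodes from `term` up to the root."""
    path = [term]
    while term in taxonomy:
        term = taxonomy[term]
        path.append(term)
    return path

def semantic_distance(taxonomy, a, b):
    """Edges from a to b via their closest shared ancestor."""
    pa, pb = path_to_root(taxonomy, a), path_to_root(taxonomy, b)
    common = set(pa) & set(pb)
    return min(pa.index(c) for c in common) + min(pb.index(c) for c in common)

for name, tax in (("air-centric", AIR_CENTRIC), ("leg-centric", LEG_CENTRIC)):
    print(name,
          "dolphins/fishermen:", semantic_distance(tax, "dolphins", "fishermen"),
          "dolphins/fish:", semantic_distance(tax, "dolphins", "fish"))
# air-centric: dolphins are close to fishermen and far from fish;
# leg-centric: the opposite.
```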
Conclusions
Communities can build useful ontologies
Better tools make better ontologies
Chatterbot templates seem to work well
Could easily be incorporated into existing software tools for dynamic, organization-wide knowledge capture!
Ontology evaluation is hard!
Some non-task-based evaluation metrics
are showing promise
Genome Canada
Genome Alberta
Genome British Columbia
GA: A Bioinformatics Platform for Genome Canada
GBC: Better Biomarkers in Transplantation
Canadian Institutes For Health Research
Bioinformatics Training Program