Human Phenotype Ontology: Identifying and Studying

Human Phenotype Ontology: Identifying and
Studying Rare Diseases
TMF-Workshop: Registries for patients with undiagnosed rare
diseases
Peter N. Robinson
Professor for Medical Genomics
Institut für medizinische Genetik und Humangenetik
Charité Universitätsmedizin Berlin
Peter N. Robinson (Charité)
HPO and Rare Diseases
1/22
21.11.2013
1 / 22
Deep Phenotyping and Clinical Research
A subjective list of goals for improving RD patient care
1
Reliably identify pathogenicity of variants in known disease genes
2
Quickly identify remaining Mendelian disease genes
3
Differential diagnosis and clinical decision support system
4
Characterize natural history of RDs and discover clinically
actionable complications and risks
5
Basis to include clinical aspects in integrative basic science
research on disease pathophysiology
Peter N. Robinson (Charité)
HPO and Rare Diseases
2/22
21.11.2013
2 / 22
Deep Phenotyping
the precise and comprehensive analysis of phenotypic
abnormalities
the individual components of the phenotype are observed and
described
often for the purposes of scientific examination of human disease.
PN Robinson (2012) Deep phenotyping for precision medicine.
Hum Mutat 33: 777–780
Special Issue of Human Mutation on Deep Phenotyping
Peter N. Robinson (Charité)
HPO and Rare Diseases
3/22
21.11.2013
3 / 22
What is an Ontology?
“An ontology is a specification of a conceptualization.”
– Tom Gruber, 1993
Catalog
Thesaurus
Glossary
instances
subclassing
Peter N. Robinson (Charité)
value
restrictions
properties
HPO and Rare Diseases
disjoint
inverse
part of
...
logical
constraints
4/22
21.11.2013
4 / 22
What is a phenotype ontology?
A phenotype ontology describes not diseases but the individual manifestations of a disease
symptoms
signs
laboratory abnormalities
anomalies found by imaging studies
behavioral manifestations
Peter N. Robinson (Charité)
HPO and Rare Diseases
5/22
21.11.2013
5 / 22
The Human Phenotype Ontology
organ
abnormality
general terms
cardiovascular
abnormality
cardiac
abnormality
cardiac
malformation
abn. of the
cardiac septa
abn. of the
cardiac atria
abn. of the
atrial septum
atrial septal
defect
specific terms
∼ 10, 143 terms, ∼ 110, 000 annotations for ∼ 7000 mainly
monogenic diseases
http://www.human-phenotype-ontology.org
Robinson PN et al. (2008) The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J
Hum Genet. 83:610–5.
Peter N. Robinson (Charité)
HPO and Rare Diseases
6/22
21.11.2013
6 / 22
Uptake in community
Databases & Bioinformatics Resources Using HPO
DECIPHER (Sanger Institute)
DDD (Sanger Institute)
ECARUCA
FORGE (Genome Canada)
GWAS Central
IRDiRC
ISCA
NCBI Genetic Testing Registry
NIH Undiagnosed diseases program
PhenomeNET
RIKEN
...
Close integration with other important efforts
Phenotips (Brudno group, U
Toronto)
Peter N. Robinson (Charité)
HPO and Rare Diseases
7/22
21.11.2013
7 / 22
The Human Phenome: Network of Human
Diseases and Disease Genes
a)
Muscular
Disorder class
Bone
Skeletal/Bone/
Connective tissue/
Development
Cancer
Cardiovascular
Neuro
Connective Tissue
Dermatological
Developmental
Ear, Nose, Throat
Endocrine
Gastrointestinal
Hematological
Heme/
Immuno
Immunological
Endo/Renal
Metabolic
Muscular
Metab
Neurological
Nutritional
Ophtamalogical
CV
Psychiatric
Renal
Respiratory
Ophth
Skeletal
Multiple
Cancer
Derma
c)
60
40
Peter N. Robinson (Charité)
Annotated terms
Mapped to parents
Added 50% random terms
50
#
t ∈d2
"
X
40
max sim(s, t ) + 0.5 · avg
30
s ∈ d1
Rank of OMIM entry
20
Density
= 0.5 · avg
10
sim(d1 , d2 )
X
s∈d2
20
30
"
10
b)
HPO and Rare Diseases
8/22
#
max sim(s, t )
t ∈ d1
21.11.2013
8 / 22
Ontological diagnostics
Noonan Syndrome
a)
abn. of
the eye
abn. of the
eyelid
Opitz Syndrome
b)
abn. of
the eye
abn. of the
ocular region
abn. of the
eyelid
abn. of globe
localization or size
abn. of the
ocular region
abn. of globe
localization or size
telecanthus
abn. of the
palpebral fissures
abn. of the
palpebral fissures
hypertelorism
Syndrome term
downward slanting
palpebral fissures
hypertelorism
downward slanting
palpebral fissures
Query term
Overlap between
query and disease
3.78
c)
Noonan Syndrome
downward slanting
palpebral fissures
3.05
Query (Q)
sim(Q,Noonan) = 3.78 + 3.05
2
= 3.42
hypertelorism
downward slanting
palpebral fissures
hypertelorism
Opitz Syndrome
3.05
sim(Q,Opitz) = 2.452+ 3.05
hypertelorism
= 2.75
2.45
(IC of abn. of the eyelid)
telecanthus
Q: Query terms
d: Disease terms
"
sim(Q → d ) = avg
#
X
s∈Q
Peter N. Robinson (Charité)
HPO and Rare Diseases
max sim(s, t )
t ∈d
9/22
21.11.2013
9 / 22
The Phenomizer
Sebastian Köhler et al. (2009) Clinical Diagnostics with Semantic Similarity Searches
in Ontologies. Am J Hum Genet, 85:457–64.
http://compbio.charite.de/Phenomizer
Peter N. Robinson (Charité)
HPO and Rare Diseases
10/22
21.11.2013
10 / 22
Reasoning over Phenotypes
Building Block
Ontology (ChEBI)
Logical Definitions
Aldohexose
class: decreased aldohexose concentration
equivalent to:
has_part_some:
decreased concentration and
towards some aldohexose and
inheres_in some portion of blood and
qualifier some abnormal
is a
Glucose
Inferrence
Decreased
Aldohexose
concentration
inferred: is a
class: hypoglycemia
equivalent to:
has_part_some:
decreased concentration and
towards some glucose and
inheres_in some portion of blood and
qualifier some abnormal
Hypoglycemia
Köhler et al (2013) F1000Research 2:30
Gkoutos GV et al. (2009) Entity/Quality-Based Logical Definitions for the Human Skeletal Phenome using PATO. IEEE Engineering in Medicine
and Biology (EMBC 2009)
Köhler S et al. (2011) Improving ontologies by automatic reasoning and evaluation of logical definitions.
BMC Bioinformatics 12:418.
Peter N. Robinson (Charité)
HPO and Rare Diseases
11/22
21.11.2013
11 / 22
A semantic web of the human phenotype
HPO defined using 11 other ontologies ⇒ Semantic network
Connections to genes, diseases, anatomy, . . .
Cross phenotype species analysis
Peter N. Robinson (Charité)
HPO and Rare Diseases
12/22
21.11.2013
12 / 22
Of mice, fish, flies, and men
Mutations in orthologous genes are often associated with similar
phenotypes
Phenotype information for >5300 genes currently available only
in model organisms
Peter N. Robinson (Charité)
HPO and Rare Diseases
13/22
21.11.2013
13 / 22
Finding the needle: 3,000,000,000 bases → 1
mutation
X-ome
Exome
Genome
800–1200
20,000-30,000
2–3 Mio.
¬ dbSNP
100–300
1,000–3,000
100K–300K
Indels (<10bp)
100–200
3,000
600K
50
1,500
150K
SNVs
¬ dbSNP
Each genome: Lots of “private” variants with ∼ 100
genuine loss of function variants with ∼20 genes
completely inactivated.
∴ sequence-based prioritization alone will struggle to
identify the disease-associated mutation
Peter N. Robinson (Charité)
HPO and Rare Diseases
14/22
21.11.2013
14 / 22
Exome Sequencing
Whole exome
at least 30,000 variants per exome
Remove off-target and common variants
We all have ∼ 100 genuine loss of
function variants
Prioritization based purely on
sequence variant pathogenicity will
struggle to correctly identify the
disease-associated mutation from
other variants with a deleterious
biochemical effect.
Phenotypic Relevance Score based on
similarity of observed phenotypes
Variant Score based on allele frequency and
pathological impact
Phenotypic
profile
28,176 models
Combined Phenotypic/Variant Score (CPVS)
to give final candidate(s)
The Exomiser
Peter N. Robinson (Charité)
HPO and Rare Diseases
15/22
21.11.2013
15 / 22
PHIVE
Human Disease:
PFEIFFER
SYNDROME
Coronal
craniosynostosis
HP:0004440
Most similar
mouse model:
CD1.Cg-Fgfr2tm4Lni/H
premature
suture
closure
premature
suture closure
MP:0000081
maxilla
hypoplasia
Hypoplasia of
the maxilla
HP:0000327
short maxilla
MP:0000097
malocclusion
Dental crowding
HP:0000678
Brachyturricephaly
HP:0000244
Hypertelorism
HP:0000316
Cross-species
Phenotype
malocclusion
MP:0000120
shortened
head
shortened
head
MP:0000435
ocular
hypertelorism
ocular
hypertelorism
MP:0001300
PHIVE: PHenotypic Interpretation of Variants in Exomes
Peter N. Robinson (Charité)
HPO and Rare Diseases
16/22
21.11.2013
16 / 22
Exomizer
Users enter VCF file and phenotype data and get back ranked list
of candidates
Peter N. Robinson (Charité)
HPO and Rare Diseases
17/22
21.11.2013
17 / 22
Exomiser
Phenotype data and variant data synergistically improve
interpretation of exome data for gene discovery
Peter N. Robinson (Charité)
HPO and Rare Diseases
18/22
21.11.2013
18 / 22
Exomiser
With ESP and 1000G Frequency data
Peter N. Robinson (Charité)
HPO and Rare Diseases
19/22
21.11.2013
19 / 22
Exomiser
With frequency data only from ESP to remove any potential bias
due to the non-causative variants also coming from the 1000
Genomes Project
Peter N. Robinson (Charité)
HPO and Rare Diseases
20/22
21.11.2013
20 / 22
Conclusions
Large-scale validation of PHIVE analysis using 100,000 exomes
containing known mutations demonstrated an improvement of up
to 54.1 fold over purely variant-based methods with the correct
gene recalled as the top hit in up to 83% of samples.
Robinson PN, Köhler S, Oellrich A, Sanger Mouse Genetics Project, Wang K, Mungall C, Washington N, Bauer S, Seelow D, Krawitz
P, Gilessen C, Haendel M, Smedley D (2013) Improved exome prioritization of disease genes through cross species phenotype
comparison. Genome Research, early access, pmid: 24162188
https://www.sanger.ac.uk/resources/databases/exomiser/
Peter N. Robinson (Charité)
HPO and Rare Diseases
21/22
21.11.2013
21 / 22
Any Questions?
Peter N. Robinson (Charité)
HPO and Rare Diseases
22/22
21.11.2013
22 / 22