Thesaurus for plant traits in environmental context :

Thesaurus for plant traits in environmental
context :
The examples of Crop Ontology
And Trait of Plants Thesaurus (TOP)
Marie-Angélique Laporte
Marie-Angélique Laporte, Ulrike Stahl, Jens Kattge
TOP
Context
• “Any morphological, physiological or
phenological feature measurable at the
individual level, from the cell to the wholeorganism level, without reference to the
environment or any other level of
organization” (Violle et al. 2007)
• Functional trait-based approaches are widely
used to address biodiversity issues
Thesauform tool
• Web-based tool to build collaborative SKOS
thesaurus
• Relies on Semantic Web
– SKOS: W3C format to manage thesaurus
• Build and publish thesaurus on the web
• Establish cross references between thesaurus
• Flexible and user-friendly environment
Experts collaboratively developed Trait
Thesaurus
Ahrestani F., Amiaud B., Cornelissen H., Diaz S., Gachet S., Garnier E.,
Kattge J., Kleyer M., Lavorel S., Perez Harguindeguy N., Poorter H.,
Schildhauer M., Shipley B., Violle C., Wright I.
Thesauform : edition phase
Thesauform : validation phase
TOP thesaurus
• SKOS Thesaurus : stable reference resource
• Around 1000 traits of plants (950 are well
defined)
:Specific_leaf_area a skos:Concept;
:prefUnit "m2kg-1[DM]";
rdfs:label "Specific_leaf_area"^^xsd:string;
rdfs:subClassOf :Morphology;
vs:term_status "testing";
skos:broaderTransitive :Morphology;
skos:definition "The one sided area of a fresh leaf devided by its oven-dry
mass";
skos:inScheme :Thesaurustrait;
skos:related :Leaf_blade_thickness,
:Leaf_mass_per_area;
skosxl:prefLabel "Specific_leaf_area" .
Scientists love hierarchies !
-Leaf
-Leaf area
-Leaf mass
-Seed
-Seed mass
-Seed color
-Bark
-Bark thickness
-Bark carbon content
-Root
-Root Length
-Root diameter
How can I
organize my
data ?
By plant
organs?
Scientists love hierarchies !
-Mass
Size
-Seed mass
-Leaf mass
-Area
-Leaf area
-Length
-Bark thickness
-Root length
-Root diameter
-Content
Content
-Bark carbon content
-Color
Color
-Seed color
How can I
organize my
data ?
By
measurement
?
Scientists love hierarchies !
-Reproduction, Dispersion
-Seed mass
-Plant height
-Absorption, Carbon Fluxes
-Root length
-Root diameter
-Resources acquisition
-Leaf mass
-Leaf area
-Root length
-Root diameter
How can I
organize my
data ?
By Biological
Functions?
Scientists love hierarchies !
How can I
organize my
data ?
Unifying system of plant trait
modelling
FACETED SEARCH
Definition
• “a technique for accessing a collection of
information, allowing users to explore by
filtering available information. A faceted
classification system allows the assignment of
multiple classification to an object, enabling
the classification to be ordered in multiple
ways, rather than in a single, pre-determined,
taxonomic order”
source : http://www.mumia-network.eu/index.php/working-groups/wg4
Faceted search main principle
:OrganFacet a skos:Collection; :SizeFacet a skos:Collection;
skos:member :Leaf;
skos:member :Area;
skos:member :Root;
skos:member :Length;
…
skos:member :Density;
skos:member :Seed;
skos:member :Mass;
skos:member :Flower .
skos:member :Volume.
:Leaf a skos:Collection;
skos:member :LeafArea;
skos:member :Specific Leaf
Area;
…
skos:member :LeafLifespan .
:Area a skos:Collection;
skos:member :LeafArea;
skos:member :Specific
Leaf Area;
…
skos:member :XylemArea .
Users’ feedbacks
• Facets have been proposed by experts during
the collaborative building of the thesarus
(using the Thesauform tool)
• Results:
– 5 facets have been implemented
Text Mining Approach : on going work
• Knowledge extracted from literature
• Papers based on TRY data
• Which functions are linked to which traits?
Existing ontologies: on going work
• Use ontological terms as facet elements
– Ex: PO can be used to build the plant part facet
Cell -> Tissue -> Organ
• Collections can be ordered in SKOS
• Facets as the result of queries
– Give me the traits that have been measured on
Leaf or on a part of Leaf
– The facet Leaf will be built from the part_of
relationship expressed by PO
Faceted search benefits
• Facilitate the thesaurus
appropriation by the endusers :
• Reorganize in an
intuitive way the
thesaurus terms
•Facets allow multiple
classification of traits,
depending on trait
usages
Main Architecture
Faceted search benefits
• Access to
disseminated data
with a reorganization
of the information
Marie-Angélique Laporte, Luca Matteis, Harold Duruflé, Elizabeth Arnaud
CROP ONTOLOGY
Context
• Understanding the relationships between
plant genotype and environment, develop the
adaptive traits to respond to biotic and
abiotic stress, promote the adequate
agronomic practices to cultivate it and
understand the heritability of adaptive traits
Harmonization and access to data
•
•
•
Breeders’ data are often
unstructured data - Complex free
text used for phenotypes
description
No semantic coherence :
‘Fruit colour‘
Bean pod color
•
Same trait given different names
by scientists
•
One trait named the same way for Rice grain or
caryopsis colour
various species but refers to
different plant structures
Data and metadata are NOT
interoperable and often not
online
Maize Kernel
Colour
The Crop Ontology
• Species Specific Traits
• Originally to enable the annotations of GCP
data sets and enable web services for data
retrieval and discovery
• Became an application Ontology for
fieldbooks
• Covers the GCP priority crops
Crop Ontology Content
www.cropontology.org
15 online crops
• Banana
• Barley
• Cassava
• Chickpea
• Common beans
• Cowpea
• Groundnut
• Maize
• Pearl millet
• Pigeon Pea
• Potato
• Rice
• Sorghum
• Wheat
• Yam
2014 - adding
Lentil
Soybean
Sweet Potato
Crop Ontology formats
• Excel files
• OBO format
• Simple Knowledge Organization System (SKOS)
<http://www.cropontology.org/rdf/CO_324:0000002>
a
skos:Concept ;
rdfs:label "Flag leaf weight"@en ;
dc:creator _:b1 ;
skos:definition "Weight of the flag leaf (the one just below the panicle)." ;
skos:inScheme co:sorghum ;
skosxl:prefLabel
[a
skosxl:Label ;
co:acronym
[a
skosxl:Label ;
skosxl:literalForm "FLGWT"
];
skosxl:literalForm "Flag leaf weight"@en
].
Local Databases
Breeders & Data Managers
Breeders’
Trait Dictionaries
Crop Database
Data Manager
Curation of the Crop Ontology
Fieldbook Template
Data Annotation
& new terms addition
Cross referencing terms with Plant Ontology &Trait Ontology
Submission of new traits through the term tracker
Crop Trait Dictionary Template
Name of submitting
scientist
Institution
Language of submission
Date of submission
Bibliographic Reference
Comments
n
1
Method ID
Name of Method
Describe how measured (method)
Growth Stage
Field, greenhouse
1
n
Crop Name
Name of Trait
Abbreviated name
Synonyms (separate by commas)
Trait
Description of Trait
How is this trait routinely used?
Trait Class
Scale ID
Type of Measure (Continuous, Discrete or
Categorical)
For Continuous: units of measurement,
reporting units, minimum. maximum
For Discrete: Name of scale or
measurement units
For Categorical: Name of rating scale, Class
#value = meaning
Online visualization of Trait dictionaries
Methods & Scales for annotations
• Relationships between Trait, Methods and Scales required
for annotations in phenotype databases
• 3 namespaces
Annotation for storing phenotypic data in the
IBDB
Property (Trait)- CO_ID
3 namespaces
Method - CO_ID
Scale – CO_ID
continuous
discrete
categorical
Class1-value – CO_ID
Class2-value – CO_ID
Class3-value – CO_ID
A unique combination of IDs for P+M+S+C
= A Standard Variable
Is_a_valid_value_of
Data
Controlled
vocabulary
Term ID
Crop
Ontology
API access by 3rd Party Web sites
IBP Crop Databases
IB Fieldbook
Genotype Data MS
[Text]
API
Phenomics Ontology
Driven DB (PODD)
EU-SOL
Solanaceae Breeding DB
Wageningen.
[Text]
International cassava DB
Agtrials -CCAFS
THANK YOU FOR YOUR ATTENTION