here - Ontology matching

Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Goals of the tutorial
Ontology matching tutorial
Jérôme Euzenat
Pavel Shvaiko
I Provide an introduction to ontology matching
I Discuss practical and methodological issues
&
Montbonnot Saint-Martin, France
[email protected]
I Demonstrate and use (advanced) matching technology
I Motivate future research
Trento, Italy
[email protected]
October 2014
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Outline
1
Problem
2
Applications
3
Methodology
4
Classification
5
Methods
6
Strategies
7
Systems
8
Using alignments
9
Evaluation
10
Conclusions
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
2 / 113
Use
Evaluation
Conclusions
What is an ontology?
An ontology typically provides a vocabulary that describes a domain of
interest and a specification of the meaning of terms used in the vocabulary.
Depending on the precision of this specification, the notion of ontology
encompasses several data and conceptual models, including, sets of terms,
classifications, thesauri, database schemas, or fully axiomatized theories.
3 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
5 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Various forms of ontologies
‘Ordinary’
Data Structured
glossaries dictionaries glossaries
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Being serious about the semantic web
Principled,
informal
hierarchies
XML
Formal Description
schemas taxonomies logics
EntityAd hoc
Database
Terms
Thesauri XML DTDs
relationship
hierarchies
schemas
models
Glossaries and
Thesauri and
Metadata and
data dictionaries
taxonomies
data models
I It is not one guy’s ontology.
I It is not several guys’ common ontology.
I It is many guys and girls’ many ontologies.
expressivity
Frames Logics
I So it is a mess, but a meaningful mess.
Formal
ontologies
adapted from [Uschold and Gruninger, 2004]
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
6 / 113
Use
Evaluation
Conclusions
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
7 / 113
Use
Evaluation
Conclusions
Living with heterogeneity
The heterogeneity problem
The semantic web will be:
I huge,
I dynamic,
Often resources expressed in different ways must be reconciled before being
used.
Mismatch between formalized knowledge can occur when:
I different languages are used,
I heterogeneous.
I different terminologies are used,
I different modelling is used.
These are not bugs, these are features.
We must learn to live with them and master them.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
8 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
9 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
On reducing heterogeneity
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
The matching process
o0
Reconciliation can be performed in 2 steps o
Matcher
o
parameters
A
A
matching
Generator
o0
resources
Match,
thereby determine an alignment
Generate
a processor (for merging, transforming, etc.)
A0
Transformation
Matching can be achieved at run time or at design time.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
10 / 113
Use
Evaluation
Conclusions
Motivation: two ontologies
Product
price
title
doi
creator
topic
integer
Monograph
string
isbn
author
title
Essay
Litterary critics
Person
Human
Book
author
Writer
Applications
Methodology
Classification
Methods
Process
Systems
Politics
Biography
Use
Evaluation
Conclusions
author
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Monograph
isbn
author
title
Essay
≥
Person
≤
≥
Litterary critics
Human
Book
≥
Writer
Politics
Biography
subject
CD
Autobiography
Albert Camus: La chute
≥
Product
price
title
doi
creator
topic
DVD
subject
CD
Bertrand Russell: My life
Problem
11 / 113
Motivation: two ontologies
uri
DVD
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Autobiography
Literature
Literature
12 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
12 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Motivation: two XML schemas
Electronics
Personal Computers
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
PC
⊥
PC board
Definition (Correspondence)
ID
Brand
Amount
Price
Accessories
Given two ontologies o and o 0 , a correspondence between o and o 0 is a
3-uple: he, e 0 , r i such that:
I e and e 0 are entities of o and o 0 , for instance, classes, XML elements;
Cameras and Photo
≥
Photo and Cameras
PID
Name
Quantity
Price
≥
Classification
Methods
I r is a relation, for instance, equivalence (=), more general (w),
disjointness (⊥).
Accessories
Digital Cameras
ID
Brand
Amount
Price
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Methodology
Methodology
Correspondence
PID
Name
Quantity
Price
Applications
Applications
Electronics
Microprocessors
Problem
Problem
Process
Systems
13 / 113
Use
Evaluation
Conclusions
Alignment
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
14 / 113
Use
Evaluation
Conclusions
Terminology: a summary
Matching is the process of finding relationships or correspondences
between entities of different ontologies.
Definition (Alignment)
Alignment is a set of correspondences between two or more (in case of
multiple matching) ontologies. The alignment is the output of
the matching process.
Given two ontologies o and o 0 , an alignment (A) between o and o 0 :
I is a set of correspondences on o and o 0
I with some additional metadata (multiplicity: 1-1, 1-*, method, date,
. . .)
Correspondence is the relation supposed to hold according to a particular
matching algorithm or individual, between entities of different
ontologies.
Mapping is the oriented version of an alignment.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
15 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
16 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Applications: ontology evolution
ot
Applications
Methods
Generator
Generator
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Process
DB
Kbt+n
Systems
18 / 113
Use
Evaluation
Conclusions
DBPortal
Translator
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
19 / 113
Use
Evaluation
Conclusions
Applications: p2p information sharing
o
Matcher
o0
Matcher
A
Applications: linked data interlinking
o
Classification
A
Transformation
Methodology
Methodology
o
ot+n
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Applications: catalog integration
Matcher
Kbt
Problem
o0
Matcher
o0
A
A
Generator
Linker
peer1
d
L
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
d0
20 / 113
query
answer
query
mediator
answer
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
peer2
21 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Applications: web service composition
o
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Applications: query answering
o0
Matcher
A
Generator
service1
mediator
output
service2
input
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
22 / 113
Use
Evaluation
Conclusions
√
√
√
√
√
√
√
√
√
√
√
√
√
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
√
√
√
√
√
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
operation
complete
correct
automatic
√
√
√
√
√
√
Problem
23 / 113
The alignment life cycle
run time
Application
Ontology evolution
Schema integration
Catalog integration
Data integration
Linked data
P2P information sharing
Web service composition
Multi agent communication
Query answering
instances
Applications requirements
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
transformation
merging
data translation
query answering
data interlinking
query answering
data mediation
data translation
query reformulation
24 / 113
enhancement
evaluation
A0
A00
A
exploitation
communication
creation
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
26 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
The matching methodology workflow
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Matching dimensions
Identifying ontologies,
characterising need
I Input dimensions
Retrieving existing
alignments
found
I Underlying models (e.g., XML, OWL)
I Schema-level vs. instance-level
I Content vs. context
not found
I Process dimensions
Selecting and
composing matchers
Enhancing
I Approximate vs. exact
I Interpretation of the input
Matching
failed
I Output dimensions
I Cardinality (e.g., 1-1, 1-*)
I Equivalence vs. diverse relations (e.g., subsumption)
I Graded vs. absolute confidence
Evaluating
passed
Storing and sharing
Rendering
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
27 / 113
Use
Evaluation
Conclusions
Three layers
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
29 / 113
Use
Evaluation
Conclusions
Classification of matching techniques
Matching techniques
Element-level
Semantic
I The upper layer
I Granularity of match
I Interpretation of the input information
Syntactic
Structure-level
Syntactic
Granularity/Input interpretation
Semantic
Formal
StringresourceLanguageGraphbased
based
Constraintbased
based
Informal
Instancename
upperbased
tokenisation,
graph
resourceTaxonomybased
similarity,
level
type
lemmatisation,
homomorbased
based
data
description
ontologies,
similarity, taxonomy
directories, similarity, morphology,
phism,
analysis
domainkey
annotated
elimination,
structure
path,
and
global
specific
properties
resources
lexicons,
children, statistics
namesontologies,
thesauri
leaves
pace
linked
data
I The middle layer represents classes of matching techniques
I The lower layer
I Origin
I Kind of input information
Semantic
Syntactic Terminological Structural
Context-based
Extensional
Concrete techniques
Modelbased
SAT
solvers,
DL
reasoners
Semantic
Content-based
Origin/Kind of input
Matching techniques
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
30 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
31 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Basic methods: string-based
I takes as input two strings and checks whether the first string starts with
the second one
I net = network; but also hot = hotel
I Suffix
I takes as input two strings and checks whether the first string ends with
the second one
I ID = PID; but also word = sword
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Methodology
Classification
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Edit distance
I takes as input two strings and calculates the number of edition
operations, (e.g., insertions, deletions, substitutions) of characters
required to transform one string into another
I normalized by length of the maximum string
I EditDistance(NKN,Nikon) = NiKoN/5 = 2/5 = 0.4
I EditDistance(editeur,editor) = edit oe ur/7= 3/7 = 0.43
(e.g., S-Match, OLA, Anchor-Prompt)
(e.g., COMA, SF, S-Match, OLA)
Applications
Applications
Basic methods: string-based
I Prefix
Problem
Problem
Methods
Process
Systems
33 / 113
Use
Evaluation
Conclusions
Basic methods: string-based
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
34 / 113
Use
Evaluation
Conclusions
Basic methods: language-based
I Tokenization
I N-gram
I takes as input two strings and calculates the number of common n-grams
(i.e., sequences of n characters) between them, normalized by
max(length(string 1), length(string 2))
I trigram(3) for the string nikon are nik, iko, kon
(e.g., COMA, S-Match)
I parses names into tokens by recognizing punctuation, cases
I Hands-Free Kits → h hands, free, kits i
I Lemmatization
I analyses morphologically tokens in order to find all their possible basic
forms
I Kits → Kit
(e.g., COMA, Cupid, S-Match, OLA)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
35 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
36 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Basic methods: language-based
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Basic methods: linguistic resources
I Sense-based: WordNet
I A v B if A is a hyponym or meronym of B
I Brand v Name
I Elimination
I A w B if A is a hypernym or holonym of B
I discards “empty” tokens that are articles, prepositions, conjunctions, etc.
I a, the, by, type of, their, from
I Europe w Greece
I A = B if they are synonyms
(e.g., Cupid, S-Match)
I Quantity = Amount
I A ⊥ B if they are antonyms or the siblings in the part of hierarchy
I Microprocessors ⊥ PC Board
(e.g., Artemis, CtxMatch, S-Match)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
37 / 113
Use
Evaluation
Conclusions
Basic methods: linguistic resources
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
38 / 113
Use
Evaluation
Conclusions
Basic methods: multilingual matching
I Sense-based: WordNet hierarchy distance
artist
person
God
creator2
creator1
Ontologies can be multilingual if they use several different languages, e.g.,
EN, IT, FR. Matching can be done by comparing to a pivot language or
through cross-translation. We distinguish between:
monolingual matching, which matches two ontologies based on their labels
in a single language, such as English;
maker communicator litterate legal document
illustrator
author1 writer2 =author2 writer3
writer1
illustrator
author
creator
writer
Person
multilingual matching, which matches two ontologies based on labels in a
variety of languages, e.g., English, French and Spanish. This
can be achieved by parallel monolingual matching of terms or
crosslingual matching of terms in different languages;
Some other measures (e.g., Resnik measure) depend on the frequency of the
terms in the corpus made of all the labels of the ontologies.
(e.g., S-Match)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
39 / 113
crosslingual matching, which matches two ontologies based on labels in two
identified different languages, e.g., English vs. French.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
40 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Basic methods: constraint-based
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Basic methods: extensional
:C →E
I Datatype comparison
E can be a set of instances, a set of documents which are indexed by
concepts, a set of items, e.g., people, which use these concepts.
Two cases:
I E is common to both ontologies;
I integer < real
I date ∈ [1/4/2005 30/6/2005] < date[year = 2005]
I {a, c, g , t}[1 − 10] < {a, c, g , u, t}+
I Multiplicity comparison
I [1 1] < [0 10]
Can be turned into a distance by estimating the ratio of domain coverage of
each datatype.
(e.g., OLA, COMA)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
41 / 113
Systems
Use
Evaluation
Conclusions
Extensional techniques
I E depends on the ontology. This can be reduced to the former case by
identification or record linkage techniques.
Techniques:
I statistical and machine learning techniques infer and compare the
characteristics of populations;
I set-theoretic techniques compare the extensions;
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
42 / 113
Systems
Use
Evaluation
Conclusions
Extensional techniques
Monograph
Product
DVD
≥
Book
.8
Essay
Literary critics
CD
Politics
Biography
≥
Monograph
Product
DVD
≥
Book
.8
Essay
Literary critics
CD
Politics
Biography
≥
Autobiography
Autobiography
Literature
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Literature
43 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
43 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Global methods: tree-based
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Global methods: tree-based
Electronics
I Children
I Two non-leaf schema elements are structurally similar if their immediate
children sets are highly similar
Electronics
Personal computers
PC
Cameras and photos
Digital cameras
Photos and cameras
I Leaves
PID
I Two non-leaf schema elements are structurally similar if their leaf sets
are highly similar, even if their immediate children are not
ID
Name
Quantity
(e.g., Cupid, COMA)
Brand
Amount
Price
Price
(e.g., Cupid, COMA)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
44 / 113
Use
Evaluation
Conclusions
Global methods: graph-based
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
45 / 113
Use
Evaluation
Conclusions
Global methods: probabilistic matching
Probabilistic methods, such as Bayesian networks or Markov networks, can
be used universally in ontology matching, e.g., to enhance some available
matching candidates.
I Iterative fix point computation
I If the neighbors of two nodes of the two ontologies are similar, they will
be more similar.
(e.g., SF, OLA)
o
A
probabilistic
reasoner
probabilistic
modeling
M
extraction
A0
o0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
46 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
47 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Problem
Applications
Methodology
Classification
Methods
Probabilistic matching: bayesian networks example
Global methods: model-based
Bayesian networks are made up of (i) a directed acyclic graph, containing
nodes (also called variables) and arcs, and (ii) a set of conditional
probability tables. Arcs between nodes stand for conditional dependencies
and indicate the direction of influence.
Description logics (DL)-based
Process
Systems
Use
Evaluation
Conclusions
=
≥
hasWritten m(hasWritten, creates) creates
micro-company = company
u ≤5 employee
SME = firm
u ≤10 associate
≤
Writer
m(Writer, Creator) Creator
company = firm ; associate v employee
micro-company v SME
m(author, hasCreated) hasCreated
author
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
48 / 113
Systems
Use
Evaluation
Conclusions
Ontology partitioning and search-space pruning
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
49 / 113
Use
Evaluation
Conclusions
Sequential composition
Partitioning: split large ontologies into smaller ontologies, and match these
smaller ontologies, e.g., Falcon-AO, TaxoMap. Pruning: dynamically ignore
parts of large ontologies when matching, e.g., AROMA, LogMap.
o
o1
o
A1
matcher
A
o10
A
partitioner
...
parameters 0
parameters
A01
aggregation
matching
resources
A0
A0
matching0
A00
resources 0
on
o0
o0
An
matcher
A0n
on0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
51 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
52 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Parallel composition
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Context-based matching (CBM)
parameters
o
Problem
I Using the ontologies on the web as context
I Composing the relations obtained through these ontologies
A0
matching
The seven steps:
resources
1. Ontology arrangement
2. Contextualization
A000
A
3. Ontology selection
4. Local inference
parameters 0
5. Global inference
matching0
o0
A00
6. Composition
7. Aggregation
resources 0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
53 / 113
Use
Evaluation
Conclusions
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
54 / 113
Use
Evaluation
Conclusions
CBM step 1: ontology arrangement
CBM step 2: contextualization
Ontology arrangement preselects and ranks the ontologies to be explored
as intermediate ontologies
Contextualisation (or anchoring) finds anchors between the ontologies to
be matched and the candidate intermediate ontologies
a00
a000
b0
≤
a0
=
a
b
a
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
a
b
55 / 113
b
b00
=
=
a
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
=
b
56 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
CBM step 3: selection
CBM step 4: local inference
Ontology selection restricts the candidate ontologies that will actually be
used
Local inference obtains relations between entities of a single ontology. It
may be reduced to logical entailment
a00
a000
b0
b0
b0
≤
a0
=
a00
a00
b00
=
=
a0
=
a
=
b00
=
=
Applications
Methodology
Classification
Methods
Process
Systems
57 / 113
Use
Evaluation
Conclusions
b0
w
a0
b00
=
=
=
a
b
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
=
=
a
b
a0
a00
w
c
=
c0
w
b00
=
=
=
a
b
b
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
58 / 113
Use
Evaluation
Conclusions
CBM step 5: global inference
CBM step 6: composition
Global inference finds relations between two concepts of the ontologies to
be matched by concatenating relations obtained from local inference and
correspondences across intermediate ontologies
Composition determines the relations holding between the source and
target entities by composing the relations in the path (sequence of relations)
connecting them
a00
w
c
b0
w
a0
=
=
=
a
c0
w
b00
=
b
a00
w
c
b0
w
a0
=
=
=
a
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
≥
c0
w
b00
b0
w
a0
=
=
a00
w
c
=
=
a
b
59 / 113
≥
c0
w
b00
=
b
a00
w
c
b0
w
a0
=
=
=
a
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
≥
c0
w
b00
≤
=
≥
b
60 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
CBM step 7: aggregation
Problem
Applications
Methodology
b0
w
a0
=
≥
=
a
c0
≤
=
≥
b
a00
w
c
b0
w
a0
w
b00
=
Methods
Process
Systems
≥
=
a
Food
≤
Beef
c0
w
b00
=
=
TAP
=
=
=
Food
≤
RedMeat
≤
MeatOrPoultry
≤
= Beef=
Methodology
Classification
Beef
Methods
Process
Systems
61 / 113
Use
Evaluation
Conclusions
Matching learning
Beef
Food
=
≤
Food
NAL
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
62 / 113
Use
Evaluation
Conclusions
Matching learning: examples
These algorithms learn how to sort alignments through the presentation of
many correct alignments (positive examples) and incorrect alignments
(negative examples)
o1
A multistrategy learning approach is useful when several learners are used,
each one handling a particular kind of pattern that it learns best, e.g.,
GLUE, CSR, YAM++.
Various well-known machine learning methods, which had been used for text
categorisation, were also applied in ontology matching:
I Bayes learning,
o
training
R
Conclusions
=
b
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Applications
Evaluation
=
Agrovoc
Problem
Use
CBM example: Scarlet
Aggregation combines relations obtained between the same pair of entities
a00
w
c
Classification
matching
I WHIRL learning,
I neural networks,
A0
I support vector machines,
I decision trees.
o0
o2
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
And, in many cases the Weka data mining software was used.
63 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
64 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Tuning
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Tuning: examples
Tuning refers to the process of adjusting a matcher for a better functioning
in terms of:
I better quality of matching results, measured, e.g., through precision or
F-measure, and
I better performance of a matcher, measured through resource
consumption, e.g., execution time, main memory.
tuning
(prematch)
o0
matching
tuning
(postmatch)
A00
From a methodological point of view, tuning may be applied at various
levels of architectural granularity:
I for choosing a specific matcher, such as edit distance, from a library of
matchers,
I for setting parameters of the matcher chosen, e.g., cost of edit distance
operations,
I for aggregating the results of several matchers, e.g., through weighting,
I for enforcing constraints, such as 1:1 alignments,
I for selecting the final alignment, e.g., through thresholds.
parameters
o
A
Problem
A0
Informed decisions, for instance, for choosing a specific threshold of 0.55 vs.
0.57 vs. 0.6, should be made.
resources
A variety of systems have explored different possibilities, e.g., eTuner,
MatchPlaner, ECOMatch, AMS.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
65 / 113
Use
Evaluation
Conclusions
Similarity filter, alignment extractor and alignment filter
Many algorithms are based on similarity or distance computation. A number
of operations can be based on similarity/distance matrices.
M0
M
similarity
filter
A0
A
alignment
extractor
alignment
filter
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
67 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
66 / 113
Use
Evaluation
Conclusions
Filtering similarities: thresholding
I Hard threshold retains all the correspondence above threshold n;
I Delta threshold consists of using as a threshold the highest similarity
value out of which a particular constant value d is subtracted;
I Proportional threshold consists of using as a threshold the percentage
of the highest similarity value;
I Percentage retains the n% correspondences above the others.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
68 / 113
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Methodology
Process
Systems
Use
Classification
Methods
Evaluation
Conclusions
r
r
he
0.
0.
.05
.90
.84
.12
.12
.60
.84
I Greedy algorithm: 1.86 (stable marriage)
I Permutation: 2.1 (better)
Process
69 / 113
Systems
Use
Evaluation
Conclusions
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
69 / 113
Systems
Use
Evaluation
Conclusions
These algorithms measure some quality of a produced alignment, reduce the
alignment, so that the quality may improve, and possibly iterate by
expanding the resulting alignment, e.g., LogMap, ASMOV, ALCOMO.
0.
0.
.05
.90
.84
.12
er
W
rit
Pu
bl
ish
er
ns
la
to
r
Alignment improvement
Tr
a
ok
Bo
.84
.12
.60
.84
.12
.60
W
rit
e
Product
Provider
Creator
Extracting alignments
Product
Provider
Creator
Methods
lis
k
Bo
o
.12
.60
.84
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Applications
Classification
r
W
rit
e
0.
0.
.05
I Greedy algorithm: 1.86 (stable marriage)
Problem
Methodology
r
he
lis
.84
.12
.60
.90
.84
.12
Pu
b
Tr
a
k
Bo
o
Product
Provider
Creator
Applications
Extracting alignments
ns
la
to
r
Extracting alignments
Problem
Pu
b
Methodology
ns
la
to
r
Applications
Tr
a
Problem
.12
.60
.84
o
A
A00
measure
A0
o0
I Greedy algorithm: 1.86 (stable marriage)
I Permutation: 2.1 (better)
selection
I Maximal weight match: 2.52 (optimal)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
matching/
expansion
69 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
C
70 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Alignment improvement: quality measures
Problem
Applications
Methodology
Classification
I threshold on confidence or average confidence,
I cohesion measures between matched entities, i.e., their neighbours are
matched with each other,
I ambiguity degree, i.e., proportion of classes matched to several other
classes,
I agreement or non-disagreement between the aligned ontologies,
owl:Thing
⊥
Person
Book
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Methodology
Classification
Methods
Process
Systems
Systems
Use
Evaluation
≥
⊥
Conclusions
Evaluation
Biography
=
string
subject
foaf:Person
≤
71 / 113
Use
Essay
≤
topic
I violation of some constraints, e.g., acyclicity in the correspondence
paths,
I satisfaction of syntactic anti-patterns,
I consistency and coherence.
Applications
Process
Alignment improvement: alignment debugging
Quality measures are the main ingredients for improvement. Contrary to the
evaluation measures, such as precision and recall, these must be intrinsic
measures of the alignment (they do not depend on any reference):
Problem
Methods
Conclusions
User involvement
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
72 / 113
Use
Evaluation
Conclusions
Individual matching
There are at least three areas in which users may be involved:
I by providing initial alignments to the system (before matching),
I by configuring and tuning the system, and
I In traditional applications semi-automatic matching/design-time
interaction is a promising way to improve quality of the results
I Burdenless to the user interaction schemes
I by providing feedback to matchers in order for them to adapt their
results.
I Usability
I Scalability of visualization
I Exploit the user feedback
o
I to adjust matcher parameters
I to take it as (partial) input alignment to a matcher
I ...
seed
A
I In dynamic settings, agents involved in the matching process can
negotiate the mismatches in a fully automated way
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
config tuning
matching
feedback
A0
o0
73 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
74 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Collective matching
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Social and collaborative ontology matching
Besides involving a single user at a time, mostly in a synchronous fashion,
matching may also be a collective effort in which several users are involved.
I The network effect:
I Each person has to do a small amount of work
I Each person can improve on what has been done by others
I Errors remain in minority
I A community of people can share alignments and argue about them by
using annotations
o
seed
A
config tuning
A0
matching
comment discuss
feedback
communication
I The key issues are to:
I
I
I
I
A00
o0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
75 / 113
Use
Evaluation
Conclusions
An overview of the state of the art systems
Provide adequate annotation support and description units
Handle adequately contradictory and incomplete alignments
Incentivise active user participation
Handle adequately the malicious users
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
76 / 113
Use
Evaluation
Conclusions
State of the art systems
DELTA, Hovy, TransScm, DIKE,
100+ matching systems exist, . . . we consider some of them
I Cupid (U. of Washington, Microsoft Corporation and U. of Leipzig)
I S-Match (U. of Trento)
SKAT &ONION, Artemis, H-Match,
SEMINT, IF-Map,
Tess, Anchor-Prompt, OntoBuilder,
NOM & QOM, oMap,
T-tree, CAIMAN,
Xu and Embley, Wise-Integrator,
FCA-merge, LSD,
Cupid, COMA & COMA++, QuickMig,
SF, MapOnto, CtxMatch&CtxMatch2,
IceQ, OLA, Falcon-AO, RiMOM,
GLUE, iMAP,
CBM, iMapper, SAMBO,
Automatch, SBI&NB,
AROMA, ILIADS, SeMap,
Kang and Naughton,
ASMOV, HAMSTER, SmartMatcher,
Dumas, Wang et al.,
I OLA (INRIA Rhône-Alpes and U. de Montréal)
I Falcon-AO (China Southwest U.)
S-Match, HCONE, MoA, ASCO, XClust,
Stroulia & Wang, MWSDI, SeqDisc,
I RiMOM (Tsinghua U.)
I ASMOV (INFOTECH Soft, Inc., U. of Miami)
I LogMap (U. of Oxford)
BayesOWL, OMEN, HSM, GeRoMe,
CBW, AOAS, BLOOMS & BLOOMS++,
GEM/Optima/Optima+, CSR,
sPLMap, FSM,
Prior+, YAM & YAM++, MoTo,
VSBM & GBM,
CODI, LogMap & LogMap2,
ProbaMap
OMviaUO, DCM, Scarlet, CIDER,
Elmeleegy et al., BeMatch, PORSCHE,
I eTuner (U. of Illinois and The MITRE Corporation)
I ...
MatchPlanner, Anchor-Flood, Lily,
PARIS
AgreementMaker, Homolonto, DSSim,
Schema-based
systems
MapPSO, TaxoMap, iMatch
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Instance-based
systems
78 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
79 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Cupid
Problem
Applications
Methodology
Process
Systems
Use
Evaluation
A0
Conclusions
M
O0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Methodology
Classification
Methods
Process
Systems
80 / 113
Use
Evaluation
Conclusions
S-Match
Linguistic
matching
M0
Structure
matching
M 00
Weighting
thesauri
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
81 / 113
Use
Evaluation
Conclusions
S-Match architecture
I Schema-based
I Computes equivalence (=), more general (w), less general (v),
disjointness (⊥)
I Transforms each ontology into a propositional theory based on external
resources (WordNet definitions of terms) and ontology structure
I Sequential system with a composition at the element level
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
M 000
O
I Sequential system
Applications
Methods
Cupid architecture
I Schema-based
I Computes similarity coefficients in the [0 1] range
I Performs linguistic and structure matching
Problem
Classification
82 / 113
o
Preprocessing
o0
Oracles
PTrees
Match
manager
SAT
solvers
A0
Basic
matchers
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
83 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
OLA
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
OLA architecture
O
I Schema- and Instance-based
I Computes dissimilarities + extracts alignments (equivalences in the
[0 1] range)
I Based on terminological (including linguistic) and structural (internal
and relational) distances
I Neither sequential nor parallel
parameters
A
M
similarity
computation
M0
A0
resources
O0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
84 / 113
Use
Evaluation
Conclusions
Falcon-AO
Problem
Applications
Methodology
Classification
Methods
Process
Systems
M0
Structure
matching
85 / 113
Use
Evaluation
Conclusions
Falcon-OA architecture
I Schema- and instance-based
I Sequence of string-based and structural matcher
I Does not use the structural matcher if the terminological match is high
enough
I String-based matcher based on so-called virtual documents
I Structural matcher close to OLA’s
I Partition the ontologies so that they can be processed faster
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
86 / 113
o
M
Linguistic
matching
M 00
A0
o0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
87 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
RiMOM
Problem
o
I Linguistic matching is based on edit distance, WordNet and vector
distance
I Structural matcher implements variations of similarity flooding
o0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
RiMOM architecture
I Schema- and instance-based
I Dynamic strategy selection based on pre-processing
I Sequence of linguistic and structural matchers
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Evaluation
Strategy
selection
PreLinguistic
processing matching
88 / 113
Use
Similarity
factors
Conclusions
ASMOV
Label
matchers
Vector
matcher
Structural
matching
A0
CCP PPP CPP
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
89 / 113
Use
Evaluation
Conclusions
ASMOV architecture
I Schema- and instance-based
I Iterative similarity computation with sematic verification
I Matchers: string-based, language based, WordNet UMLS, iterative fix
point computation
I Verification through rule-based (anti-patterns) inference
o
A
o0
Preprocessing
Iterative
similarity
computation
WordNet
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Combination
90 / 113
A0
Semantic
verification
A00
UMLS
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
91 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
LogMap
Problem
Applications
Methodology
Classification
Methods
Auxiliary resource
(WordNet, UMLS)
Use
Evaluation
Conclusions
Indexing
(lexical,
structural)
Lexical
indices’
intersection
A
o0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Methodology
Classification
Methods
Process
Systems
92 / 113
Use
Evaluation
Conclusions
eTuner
of0
Compute
overlap
o
I Mapping repair through propositional satisfiability
I String-based mapping discovery from anchors through class hierarchies
Applications
Systems
LogMap architecture
I Schema- and instance-based
I Applies partitioning and pruning of large ontologies
I Initial anchors through indices’ intersection (exact strings)
Problem
Process
of
Mapping
repair
Mapping
discovery
A0
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
93 / 113
Use
Evaluation
Conclusions
eTuner architecture
I A metamatching system
I Automatically tunes a matching system for a particular task by choosing
the most effective matchers and the best parameters to be used
I A matching system is modeled as a triple hL, G , K i:
I L is a library of matching components, e.g., edit distance; combiners,
e.g., through averaging; constraint enforcers, e.g., pre-defined domain
constraints; match selectors, e.g., thresholds.
I G is a directed graph which encodes the execution flow among the
components of the given matching system.
I K is a set of knobs to be set.
S
Transformation rules
Tuning procedures
Task generator
Staged
tuner
Synthetic
training
System M:
task
Sample generator
S0
hL, G , K i
Augmented schema S
Tuned
system
M
I Two phases: training through synthetic workload and (greedy) search.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
94 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
95 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Main operations
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Merging
O
O0
Matcher
I Merge(o, o 0 , A) = o 00
I Translate(d, A) = d 0
A
I Interlink(d, d 0 , A) = L
I TransformQuery (q, A) = q 0 and Translate(a0 , Invert(A)) = a
I ...
Generator
SWRL, OWL, SKOS
axioms
Merge(O, O 0 , A)
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
97 / 113
Use
Evaluation
Conclusions
Data translation
O
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Query mediator
Matcher
O0
O
A
A
Generator
Generator
Transformation
Query’
mediator
D
Answer
Answer’
WSML, QueryMediator
XSLT, C-OWL
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
O0
Matcher
Query
D
98 / 113
99 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
100 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Evaluation of matching algorithms
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
6 tracks, 8 test cases, 23 participants in 2013
http://oaei.ontologymatching.org
Goal: improvement of matching algorithms through comparison, measure of
the evolution of the field.
I Yearly campaigns comparing algorithms on different test cases
I Participants submit their alignments in a standard format
I We use alignment API for comparing these formats with reference
alignments
I Various degrees of blindness, expressiveness, realism
test
formalism
relations
benchmark
anatomy
conference
large bio
multifarm
library
interactive
rdft
OWL
OWL
OWL-DL
OWL
OWL
OWL
OWL-DL
RDF
=
=
=, <=
=
=
=
=, <=
=
confidence
[0
[0
[0
[0
[0
[0
[0
[0
1]
1]
1]
1]
1]
1]
1]
1]
modalities
language
blind+open
open
blind+open
open
open
open
open
blind
EN
EN
EN
EN
CZ, CN, EN, . . .
EN, DE
EN
EN
SEALS
√
√
√
√
√
√
√
I Tests and results are published on the web site
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
102 / 113
Use
Evaluation
Conclusions
Evaluation process
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
103 / 113
Use
Evaluation
Conclusions
Precision and recall
A
C × C0 × Q
R
R
o
parameters
matching
o0
evaluator
m
A0
Definition (Precision, Recall)
resources
Given a reference alignment R, the precision of some alignment A is given by
P(A, R) =
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
104 / 113
|R ∩ A|
|R ∩ A|
and recall is given by R(A, R) =
.
|A|
|R|
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
105 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
2013: Precision-recall graph for the conference test case
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Summary
YAM++
AML-bk
LogMap
AML
ODGOMS
StringsAuto
F1 -measure=0.7
ServOMap
MapSSS
F1 -measure=0.6
Hertuda
F1 -measure=0.5
WikiMatch
WeSeE-Match
I Heterogeneity of ontologies is in the nature of the semantic web;
I Ontology matching is part of the solution;
I It can be based on many different techniques;
I There are already numerous systems around;
I A relatively solid research field has emerged (tools, formats, evaluation,
etc.) and it keeps making progress;
I But there remain serious challenges ahead.
IAMA
HotMatch
CIDER-CL
edna
rec=1.0
rec=.8
rec=.6
pre=.6
pre=1.0
pre=.8
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
106 / 113
Use
Evaluation
Conclusions
Challenges
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
Problem
Applications
Methodology
Classification
Methods
Process
Systems
108 / 113
Use
Evaluation
Conclusions
Acknowledgments
We thank all the participants of the Heterogeneity workpackage of the Knowledge Web network of excellence
I Large-scale and efficient matching,
I Matching with background knowledge,
In particular, we are grateful to Than-Le Bach, Jesus Barrasa, Paolo
Bouquet, Jan De Bo, Jos De Bruijn, Rose Dieng-Kuntz, Enrico Franconi,
Raúl Garcı́a Castro, Manfred Hauswirth, Pascal Hitzler, Mustafa Jarrar,
Markus Krötzsch, Ruben Lara, Malgorzata Mochol, Amedeo Napoli, Luciano
Serafini, François Sharffe, Giorgos Stamou, Heiner Stuckenschmidt, York
Sure, Vojtěch Svátek, Valentina Tamma, Sergio Tessaris, Paolo Traverso,
Raphaël Troncy, Sven van Acker, Frank van Harmelen, and Ilya Zaihrayeu.
I Matcher selection, combination and tuning,
I User involvement,
I Social and collaborative matching,
I Uncertainty in matching,
I Reasoning with alignments,
I Alignment management.
And more specifically to Marc Ehrig, Fausto Giunchiglia, Loredana Laera,
Diana Maynard, Deborah McGuinness, Petko Valchev, Mikalai Yatskevich,
and Antoine Zimmermann for their support and insightful comments
and, of course, many others...
Part of this work was carried out while Pavel Shvaiko was with the
University of Trento.
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
109 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
110 / 113
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Problem
Applications
Methodology
Classification
Methods
Process
Systems
Use
Evaluation
Conclusions
Ontology matching the book, 2nd edition
Jérôme Euzenat, Pavel Shvaiko
Ontology matching
1. Applications
2. The matching problem
3. Methodology
Thank you
4. Classification
5. Basic similarity measures
for your attention and interest!
6. Global matching methods
7. Strategies
8. Systems
9. Evaluation
10. Representation
11. User involvement
12. Processing
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
http://book.ontologymatching.org
Questions?
[email protected]
[email protected]
http://www.ontologymatching.org
111 / 113
Ontology matching tutorial (v15): ISWC-2014 (Riva del Garda, Italy) – Euzenat and Shvaiko
112 / 113