Ontology Alignment/Matching Prafulla Palwe ► Introduction ► ► Matching Dimensions Basic Techniques ► Traditional Emergent Classification ► Sequential composition Parallel composition Application Domains ► Matching Operation Motivation Schema Matching Vs Ontology Matching Correspondence Alignment Matching Process ► Being serious about the semantic web Living with heterogeneity Heterogeneity problem I have a plan for you Matching Problem Agenda Element Level Structure Level Summary and Challenges Introduction ► Being serious about the semantic web - It is not one guy's ontology It is not several guys' common ontology It is many guys and girls' many ontologies So it is a mess, but a meaningful mess Introduction ► Living with heterogeneity - The semantic web will be: ► Huge ► Dynamic ► Heterogeneous These are not bugs, they are features. We must learn to live with them. Introduction ► Heterogeneity problem – Resources being expressed in different ways must be reconciled before being used. Mismatch between formalized knowledge can occur when: ► different languages are used; ► different terminologies are used; ► different modeling is used. Introduction ►I have a plan for you – Reconciliation Matching Problem ► Matching Operation Definition – Matching operation takes as input ontologies, each consisting of a set of discrete entities (e.g., tables, XML elements, classes, properties) and determines as output the relationships (e.g., equivalence, subsumption) holding between these entities Matching Problem ► Motivation – 2 XML Schemas 2 Ontologies Matching Problem Matching Problem Matching Problem Matching Problem Matching Problem Matching Problem Matching Problem Matching Problem Matching Problem ► Schema mapping Vs ontology mapping Differences ►Schemas their data often do not provide explicit semantics for Relational schemas provide no generalization ►Ontologies meaning are logical systems that constrain the Ontology definition as set of logical axioms Commonalities ►Schemas and ontologies provide a vocabulary of terms that describes the domain of interest ►Schemas and ontologies constrain the meaning of terms used in the vocabulary. Matching Problem ► Correspondence Definition – Given 2 ontologies O and O’ , a correspondence between M between O and O’ is a 5-uple : <id,e,e’,R,n> such that: ►id is a unique identifier of the correspondence. ►e and e’ are entities of O and O’ (e.g. XML Elements, classes) ►R is a relation (e.g. equivalence (=), disjointness (_|_)) ►n is a confidence measure in some mathematical structure (typically in the [0,1] range) Matching Problem ► Alignment Definition – Given 2 ontologies O and O’, an alignment A between O and O’: ►Is a set of correspondence on O and O’ ►With some cardinality: 1-1, 1-* etc. ►Some additional metadata (method, date, properties etc) Matching Process Matching Process Matching Process Matching Process Matching Process Matching Process Matching Process ► General Basic Matching Process Matching Process ► Sequential Composition Matching Process ► Parallel composition Matching Process ► Similarity Filter, alignment extractor and alignment filter – Matching Process ► Aggregation Operations – There are many different ways to aggregate matcher results, usually depending on confidence/similarity: ► Triangular norms (min, weighted products) useful for selecting only the best results ► Multidimensional distances (Eudidean distance, weighted sum) useful for taking into account all dimensions ► Fuzzy aggregation (min, weighted average) useful for aggregating competing algorithms and averaging their results ► Other specific measures (e.g., ordered weighted average) Application Domains ► Traditional - Ontology evolution Schema integration Catalog integration Data integration Application Domains ► Ontology Evolution Application Domains ► Catalog Integration Application Domains ► Emergent P2P information sharing Agent communication Web service composition Query answering on the web Application Domains ► P2P information sharing Application Domains ► Web Service Composition Application Domains ► Agent communication Classifications ► Matching Dimensions Input Dimensions ► Underlying models (e.g. XML, OWL) ► Schema Level Vs Instance Level Process Dimensions ► Approximate Vs Exact ► Interpretation of the input Output Dimensions ► Cardinality ► Equivalence Vs Diverse relations ► Graded Vs Absolute Confidence Classifications ► Three Layers Upper Layer ► Granularity of match ► Interpretation of the input information Middle Layer ► Represents classes of elementary (basic) matching techniques Lower Layer ► Based on the kind of input which is used by elementary matching techniques Classifications ► Classification of schema based techniques Basic Techniques ► Element Level Techniques String based – Prefix ► Takes an input 2 strings and checks whether the first string starts with the second ► e.g. net = network but also hot = hotel Suffix – ► Takes an input 2 strings and checks whether the first string ends with the second ► e.g. ID = PID but also word = sword Edit Distance – ► Takes as input 2 strings and calculates the number of edit operations (insertion,deletion,substitution) of characters required to transform one string into other normalized by length of the max string. ► editDistance(NKN, Nikon) = 0.4 Basic Techniques Language based – Tokenization – ► Parses names into tokens by recognizing punctuation, cases ► Hands-Free_Kits <hands, free, kits> Lemmatization – ► Analyses morphologically tokens in order to find all their possible basic forms ► Kits Kit Elimination – ► Discards empty tokens that are articles, prepositions, conjuctions ► a, the, by, type of, their, from Basic Techniques ► Structure Level Techniques Ontologies are viewed as graph-like structure containing terms and their inter-relationships. Taxonomy based ►Bounded path matching These take 2 paths with links between classes defined by the hierarchical relations, compare terms and their positions along these paths and identify similar terms. ►Super(sub)-concept rules If super concepts are the same, the actual concepts are similar to each other Basic Techniques Tree based Children ►2 non leaf schema elements are structurally similar if their immediate children sets are highly similar Leaves ►2 non leaf schema elements are structurally similar if their leaf sets are highly similar, even if their immediate children are not. Basic Techniques Basic Techniques Basic Techniques Summary and Challenges ► Summary Ontology Matching and alignment is the process of developing the common or most common structure/semantic terms out of 2 or more different ontologies/structures/schemas. Different efficient and complex algorithms using basic techniques of matching process, can be developed for matching and alignment generation. ► Challenges Developing generic and highly efficient matching and alignment generation algorithms. Thank You
© Copyright 2026 Paperzz