Semantic Enrichment of Ontology Mappings: Advances, Insights and Ideas for Improvement
Patrick Arnold, Universität Leipzig

1. Introduction

Semantic Enrichment of Ontology Mappings
● Annotate the relation type of correspondences
● Important step for:
  – Schema/Ontology Merging
  – Schema/Ontology Evolution
● Also applicable to:
  – Entity resolution
  – Text mining / Information retrieval
  – Linked Open Data
  – etc.

Two-Step Approach (figure)

Compound Strategy (Recap)
● Concept AB matches a concept B → draw an is-a conclusion (vintage car is-a car)
● Works with endocentric compounds: drilling machine, school bus, blackboard
● Does not work with exocentric compounds: nightmare, redhead, saw tooth, butterfly
● Some borderline cases: strawberry, airport, city hall, snowman, ...

Results in December 2012 (reconstructed)
● Strategies: Compound + Itemization
● Goals: improve recall, keep up precision

  Scenario                     Recall   Precision   F-Measure
  Web Directories (German)     44.4 %   70.0 %      57.2 %
  Health (Diseases)            58.5 %   92.3 %      75.4 %
  Text Mining Taxonomies       12.5 %   97.7 %      55.1 %
  Web Directories 2 (German)   35.8 %   75.6 %      55.5 %

Agenda
1. Introduction
2. Advances
3. Evaluation
4. Outlook / Ideas for Improvements

2. Advances and Insights

2.1 WordNet

Using the Java API for WordNet Search (JAWS)
● Renowned thesaurus
● Contains all relevant relations: is-a, part-of, related (cohyponyms), equal
● Used by many other tools and approaches

Our improvement: Gradual Modifier Reduction (GMR)
● Basic idea: compounding is among the most productive ways of word formation
● WordNet contains about 160,000 lexemes
  – English general vocabulary: approx. 1 million lexemes
  – English overall vocabulary: well over 1 million lexemes
● GMR: handle words that do not occur in WordNet

Example
● Correspondence (US Vice President, Person)
● Could not be resolved, because "US Vice President" was not in the dictionary

Gradual Modifier Reduction
● If a compound word does not occur in the dictionary, gradually remove its modifiers
● Start with the left-most modifier
● After each removed modifier, check the word again
A Java sketch of this reduction follows after the Structure Strategy example below.

Two examples resolved with this strategy (figure)

Enhancements with WordNet (simple)

              Health               Text Mining Taxonomies
  Recall      58.5 → 60.9 (+2.4)   12.5 → 44.3 (+31.9)
  Precision   92.3 → 96.1 (+3.8)   97.7 → 99.0 (+1.3)

Enhancements with WordNet (GMR)

              Health               Text Mining Taxonomies
  Recall      60.9 → 56.0 (-4.9)   44.3 → 60.1 (+15.8)
  Precision   96.1 → 92.0 (-4.1)   99.0 → 97.8 (-1.2)

2.2 Structure Strategy

● Motivation: similar to WordNet GMR
● Question: what if no strategy can draw a conclusion between matching concepts A, B?
● Then check whether a relation between (A, B') or (A', B) can be drawn, the prime denoting the father element

Example: Online_Shoe_Store.Shoes.Sneakers ↔ Apparel_Store.Footwear
● Shoes is-a Footwear → Sneakers is-a Footwear
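To make the parent fallback concrete, here is a minimal sketch of the idea. `Concept`, `parent` and `directRelation` are hypothetical placeholders, not the tool's actual model, and the real rule set (e.g. the cross-equivalence handling mentioned in the conclusions) may well be richer.

```java
// Hedged sketch of the Structure Strategy: if no relation is found between
// A and B directly, try A's father against B, and A against B's father.
public class StructureStrategy {

    static class Concept {
        String name;
        Concept parent;          // father element in the ontology, null at the root
    }

    // Placeholder for whatever the other strategies (compound, WordNet, ...)
    // would conclude for a pair of concepts; null means "no conclusion".
    static String directRelation(Concept a, Concept b) {
        return null;             // stub
    }

    static String structureRelation(Concept a, Concept b) {
        // (A', B): A is-a A', so "A' is-a B" or "A' equal B" both entail
        // A is-a B (Shoes is-a Footwear => Sneakers is-a Footwear).
        if (a.parent != null) {
            String rel = directRelation(a.parent, b);
            if ("is-a".equals(rel) || "equal".equals(rel)) {
                return "is-a";
            }
        }
        // (A, B'): symmetrically, "B' equal A" entails B is-a A.
        if (b.parent != null && "equal".equals(directRelation(a, b.parent))) {
            return "inverse is-a";
        }
        return null;             // still no conclusion
    }
}
```

The branches only encode conclusions that are sound by transitivity; a case like "A is-a B'" is deliberately left out, since it would merely make A and B cohyponyms ("related").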
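And, as referenced in section 2.1 above, a minimal sketch of Gradual Modifier Reduction. `inDictionary` stands in for whatever lookup the matcher actually performs (e.g. a WordNet query through JAWS); it is an assumed interface, not the tool's API.

```java
import java.util.Arrays;
import java.util.function.Predicate;

public class GradualModifierReduction {

    // Drop modifiers from the left, one token at a time, until the remaining
    // term is found in the dictionary. Called only after the full term has
    // already failed the lookup, hence the loop starts at index 1.
    static String reduce(String term, Predicate<String> inDictionary) {
        String[] tokens = term.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) {
            String reduced = String.join(" ",
                    Arrays.copyOfRange(tokens, i, tokens.length));
            if (inDictionary.test(reduced)) {
                return reduced;  // "US Vice President" -> "Vice President"
            }
        }
        return null;             // not even the head noun is in the dictionary
    }
}
```

A call like `reduce("US Vice President", dict::contains)` first tries "Vice President" and, failing that, "President". Since a compound is-a its head, relations found for the reduced term can presumably be carried over to the full term, which is how (US Vice President, Person) becomes resolvable.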
Enhancements with Structure Strategy

              Web Directories 1    Web Directories 2
  Recall      44.4 → 47.6 (+3.2)   35.8 → 36.7 (+0.9)
  Precision   70.0 → 69.7 (-0.3)   75.6 → 67.4 (-8.2)

Precision losses with Structure Strategy (figure)

2.3 Compound-Modifier Match Strategy

● Original Compound Strategy: the head matches the compound, e.g., (school, high school)
● Adaptation: the modifier matches the compound, e.g., (roof, roof window), (bed, bedroom)
● The Compound-Modifier Match Strategy is able to detect part-of / has-a relations:
  – bed part-of bedroom
  – doorknob part-of door
● Problem: the direction cannot be determined

Two major cases:
● AB part-of A (23.3 %): heartbeat, moonlight, earring, policeman, eyeballs, bathtub, ...
● A part-of AB (30.7 %): bedroom, motorcycle, babysitter, railroad, fireplace, bookstore, ...
About 50 % of such matches carry the part-of resp. has-a relation; the other 50 % are often merely "related", e.g., nightmare, fingerprint, toothpaste.

The Compound-Modifier Match Strategy is currently disabled:
● Loss of precision
● The benchmarks hardly contain specified part-of relations

3. Evaluation

3.1 General Improvements

Original values:

  Scenario   Recall   Precision   F-Meas.
  Web 1      44.4 %   70.0 %      57.2 %
  Health     58.5 %   92.3 %      75.4 %
  Tax.       12.5 %   97.7 %      55.1 %
  Web 2      35.8 %   75.6 %      55.5 %

New values:

  Scenario   Recall   Precision   F-Meas.
  Web 1      51.6 %   69.5 %      60.5 %
  Health     60.9 %   92.5 %      76.7 %
  Tax.       60.4 %   97.8 %      79.1 %
  Web 2      42.3 %   73.3 %      57.8 %

Improvement:

  Scenario   Recall     Precision   F-Meas.
  Web 1      + 7.2 %    - 0.5 %     + 3.3 %
  Health     + 2.4 %    + 0.2 %     + 1.3 %
  Tax.       + 47.9 %   + 0.1 %     + 24.0 %
  Web 2      + 6.5 %    - 2.3 %     + 2.3 %

Conclusions:
● Original goal: increase recall without doing damage to precision
● This goal was mostly achieved

3.2 Evaluation by Strategy

Assumption: exactly one strategy is enabled.

Recall (%):

  Strategy           Web 1   Health   Tax.   Web 2   Mean
  Compound           7.9     36.5     12.3   3.8     15.1
  Itemization        36.5    14.6     0.2    25.3    19.1
  WordNet (simple)   -       9.7      33.7   -       21.7
  WordNet (GMR)      -       9.7      49.3   -       29.5
  OpenThes.          1.5     0.0      3.4    1.2     1.5
  Structure          3.1     0.0      0.0    1.2     1.1

Precision (%):

  Strategy           Web 1   Health   Tax.    Web 2   Mean
  Compound           62.5    88.2     97.7    42.8    72.8
  Itemization        82.1    100.0    100.0   83.3    91.3
  WordNet (simple)   -       100.0    99.1    -       99.5
  WordNet (GMR)      -       80.0     97.6    -       88.8
  OpenThes.          50.0    0.0      96.0    50.0    65.3
  Structure          50.0    -        -       20.0    35.0

Some questions...
● Why does OpenThesaurus, a German resource, detect relation types in an English-language scenario?
  – (Monarch, Person)
  – (Journalist, Person)
  – (Boxer, Person)
  – (Golfer, Athlete)
  – ...
● Why does WordNet score below 100 %?
  – (Automobile, Vehicle): equal
  – (Road, Street): equal
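As an aside to these questions, here is a hedged sketch of how relation types can be read off WordNet through JAWS. `WordNetDatabase`, `getSynsets`, `getWordForms` and `NounSynset.getHypernyms` are JAWS's public API to the best of my knowledge; the mapping to relation types is a simplified reconstruction, not the actual implementation.

```java
import edu.smu.tspell.wordnet.NounSynset;
import edu.smu.tspell.wordnet.Synset;
import edu.smu.tspell.wordnet.SynsetType;
import edu.smu.tspell.wordnet.WordNetDatabase;

public class WordNetLookup {

    // Simplified lookup: "equal" if the two words share a synset, "is-a" if
    // b occurs among a's direct hypernyms. Requires the system property
    // wordnet.database.dir to point to a local WordNet installation.
    static String relationType(String a, String b) {
        WordNetDatabase db = WordNetDatabase.getFileInstance();
        for (Synset synset : db.getSynsets(a, SynsetType.NOUN)) {
            for (String form : synset.getWordForms()) {
                if (form.equalsIgnoreCase(b)) {
                    return "equal";        // shared synset -> synonyms
                }
            }
            for (NounSynset hypernym : ((NounSynset) synset).getHypernyms()) {
                for (String form : hypernym.getWordForms()) {
                    if (form.equalsIgnoreCase(b)) {
                        return "is-a";     // direct hypernym -> subsumption
                    }
                }
            }
        }
        return null;                       // no conclusion from WordNet
    }
}
```

Which senses are admitted and how far the hypernym chain is walked directly affect verdicts like the two pairs above; the real strategy presumably traverses more than one hypernym level.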
3.3 Time Complexity

Average execution time per correspondence and strategy (ms):

  Strategy           Web 1   Health   Tax.   Web 2   Avg
  Compound           2.36    2.51     4.31   2.01    2.80
  Itemization        4.41    2.36     3.18   2.28    3.05
  WordNet (simple)   2.15    4.39     3.75   1.94    3.06
  WordNet (GMR)      2.26    4.02     5.61   2.09    3.50
  OpenThes.          4.61    5.18     7.57   3.87    5.31
  Structure          2.22    2.50     3.35   1.82    2.47
  Overall            27.3    11.5     15.2   24.9    19.7

● Total execution time: ca. 5 to 15 sec. per mapping
● No time problems (yet)

4. Outlook

4.1 Introduction

Possibilities:
● More background knowledge
  – UMLS (medical domain)
● More linguistic knowledge
  – Exploit cohyponyms
  – Compound-Modifier Match Strategy
● Hybrid strategies (advanced)
  – Wikipedia / Wiktionary
  – Search engines

4.2 Wikipedia

Example: Leipzig
● Leipzig is a city

Example: Bicycle
● A bicycle is a vehicle
● Pushbike, pedal bike, pedal cycle and cycle are synonyms of bike
● Wheels are part of a bike

4.3 Wikipedia (figure)

4.4 Search Engines

● Approach presented in the xxx paper
● Count the number of search results for a specific expression like "A is a B" (a query-construction sketch follows after the conclusions)

Problems:
● Search engines are very restrictive w.r.t. the number of queries per day
● Rather an "emergency solution" if all other strategies fail
● Evaluation?

Two examples: Leipzig | City and President | Politician

  Query (Google)                     Hits
  "leipzig is a city"                62,700
  "city is a leipzig"                0
  "leipzig is part of (a) city"      0 / 0
  "city is part of (a) leipzig"      3 / 0
  "leipzig is related to (a) city"   0 / 0
  "city is related to (a) leipzig"   0 / 0

  Query (Google)                             Hits
  "president is a politician"                28.6 M
  "politician is a president"                4
  "president is part of (a) politician"      0 / 0
  "politician is part of (a) president"      0 / 0
  "president is related to (a) politician"   0 / 0
  "politician is related to (a) president"   0 / 0

(Counts of the form x / y refer to the query variant with and without the optional article "a".)

5. Conclusions

Achievements since December 2012:
● New strategies: Structure, Background Knowledge
● Enhanced methods:
  – Itemization
  – Gradual Modifier Reduction
  – Cross-equivalence (Structure Strategy)
● Better recall, with scarcely any loss in precision

Outlook:
● Many opportunities: Wikipedia, Wiktionary, search engines
● Instance data analysis
  – No appropriate benchmarks so far
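Finally, the query-construction sketch referenced in section 4.4. It builds the phrase patterns from the tables above for one candidate pair; `hitCount` is a deliberate stub, since which search API to use (and how to cope with its query limits) is exactly what the slides leave open.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SearchEngineQueries {

    // Build the phrase queries for a candidate pair (A, B). Comparing the
    // hit counts of "A is a B" vs. "B is a A" etc. suggests both the
    // relation type and its direction.
    static Map<String, String> buildQueries(String a, String b) {
        Map<String, String> queries = new LinkedHashMap<>();
        queries.put("is-a",          String.format("\"%s is a %s\"", a, b));
        queries.put("inverse is-a",  String.format("\"%s is a %s\"", b, a));
        queries.put("part-of",       String.format("\"%s is part of a %s\"", a, b));
        queries.put("has-a",         String.format("\"%s is part of a %s\"", b, a));
        queries.put("related",       String.format("\"%s is related to %s\"", a, b));
        return queries;
    }

    // Placeholder: a real implementation would send the query to a search
    // engine and read the estimated result count.
    static long hitCount(String phraseQuery) {
        throw new UnsupportedOperationException("wire up a search API here");
    }

    public static void main(String[] args) {
        buildQueries("leipzig", "city")
                .forEach((rel, query) -> System.out.println(rel + ": " + query));
    }
}
```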