Semantic Web: The Story So Far
Ian Horrocks
Oxford University Computing Laboratory

Semantic Web
• According to W3C
  – “an evolving extension of the World Wide Web in which web content can be … read and used by software agents, thus permitting them to find, share and integrate information more easily”
• Data will use uniform syntactic structure (RDF)
• (OWL) ontologies will provide
  – Schemas for data
  – Vocabulary for annotations
• Ultimate goal is a “more intelligent web”

Web Ontology Language OWL
• Semantic Web led to requirement for a “web ontology language”
• W3C set up Web-Ontology (WebOnt) Working Group
  – WebOnt developed OWL language
  – OWL based on earlier languages RDF, OIL and DAML+OIL
  – OWL now a W3C recommendation (i.e., a standard)
• OWL is a family of 3 languages: OWL Lite, OWL DL and OWL Full
• OIL, DAML+OIL and OWL (DL & Lite) based on Description Logics
  – Has facilitated development of wide range of high quality tools & infrastructure
• OWL now language of choice in many applications

What Are Description Logics?
• A family of logic based Knowledge Representation formalisms
  – Descendants of semantic networks and KL-ONE
  – Describe domain in terms of concepts (classes), roles (properties, relationships) and individuals
  – Operators allow for composition of complex concepts
  – Names can be given to complex concepts, e.g.:
    HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)

Why (Description) Logic?
• OWL exploits results of 15+ years of DL research
  – Well defined (model theoretic) semantics
    • Most DLs are subsets of C2, i.e., decidable fragments of FOL
  – Formal properties well understood (complexity, decidability)
    • “I can’t find an efficient algorithm, but neither can all these famous people.” [Garey & Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, 1979.]
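To make “model theoretic semantics” concrete, here is a minimal Python sketch that evaluates the HappyParent concept from the earlier slide over a tiny finite interpretation. Concepts denote sets of individuals, roles denote sets of pairs, and ⊓, ⊔ and ∀ become set operations. The domain, the individuals, the facts and the helper all_values_from are all made up for illustration; this is not how any DL reasoner is implemented, since a reasoner must consider all models rather than one.

```python
# A finite interpretation: a domain, concept extensions, and one role extension.
# Every individual and fact below is a hypothetical example.
domain = {"alice", "bob", "carol", "dave", "erin"}
concepts = {
    "Parent": {"alice", "bob"},
    "Intelligent": {"carol"},
    "Athletic": {"dave"},
}
has_child = {("alice", "carol"), ("alice", "dave"), ("bob", "erin")}


def all_values_from(role, filler):
    """Extension of the restriction (forall role.filler) in this interpretation.

    An individual x belongs to it when every role-successor of x is in filler.
    Individuals with no successors belong to it vacuously.
    """
    return {
        x for x in domain
        if all(y in filler for (s, y) in role if s == x)
    }


# Intelligent OR Athletic is set union; Parent AND (...) is set intersection.
intelligent_or_athletic = concepts["Intelligent"] | concepts["Athletic"]
happy_parent = concepts["Parent"] & all_values_from(has_child, intelligent_or_athletic)

print(sorted(happy_parent))
```

In this interpretation only alice qualifies: bob is a Parent, but his child erin is neither Intelligent nor Athletic. The childless individuals satisfy the ∀ restriction vacuously, which is why the intersection with Parent matters.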
Why (Description) Logic?
• OWL exploits results of 15+ years of DL research
  – Well defined (model theoretic) semantics
  – Formal properties well understood (complexity, decidability)
  – Known reasoning algorithms
  – Implemented systems (highly optimised): KAON2, Pellet, CEL

Class/Concept Constructors
• Concept can be thought of as a FOL formula with one free variable

Knowledge Base / Ontology Axioms

OWL RDF/XML Exchange Syntax
E.g., Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic):
  <owl:Class>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="#Parent"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasChild"/>
        <owl:allValuesFrom>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Intelligent"/>
            <owl:Class rdf:about="#Athletic"/>
          </owl:unionOf>
        </owl:allValuesFrom>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:Class>

Ontology based Information Systems
• Similar to relational databases
  – Ontology ≈ schema; instances ≈ data
• Some important (dis)advantages
  + (Relatively) easy to maintain and update schema
    • Both schema and data are “self organising”
  + Query answers reflect both schema and data
  + Able to answer both intensional and extensional queries
  – Semantics may be counter-intuitive or even
inappropriate
    • Open vs. closed world; axioms vs. constraints
  – Query answering (logical entailment) much more difficult
    • Can lead to scalability problems
Very useful, but no miracles!

Ontologies and Reasoning

Support for Ontology Engineering
• Developing and maintaining quality ontologies is very challenging
• Users need tools and services, e.g., to help check if ontology is:
  – Meaningful: all named classes can have instances
  – Correct: captures intuitions of domain experts
  – Minimally redundant: no unintended synonyms (e.g., “Banana split” and “Banana sundae”)

Support for Query Answering
• In an Ontology based Information System (OIS), query answering ≈ computing logical entailment
  – Reasoner needed in order to answer queries, e.g.:
    • C is a sub-class of D iff O ⊨ ∀x. C(x) → D(x)
    • a is an instance of C iff O ⊨ C(a)
• OIS with no reasoner ≈ DBMS with no query engine

Example Applications

e-Science
• E.g., for “in silico” investigations and “hypothesis testing”
  – Comparing data (e.g., on proteins) to (model of) biological knowledge
  – Characteristics of proteins captured in an ontology O
• Goal is to identify protein instances based on characteristics
  – Equivalent to answering queries of form: O ⊨ P(i)?
    for protein P and instance i
  – Result may be discovery of new kinds of protein
    • And these may be potential drug targets if unique to a pathogen
  – Result may also be discovery of errors in model
    • Which may reflect gaps/errors in existing knowledge

Healthcare
• UK NHS has a £6.2 billion “Connecting for Health” IT programme
• Key component is Care Records Service (CRS)
  – “Live, interactive patient record service accessible 24/7”
  – Patient data distributed across local centres in 5 regional clusters, and a national DB
    • Detailed records held by local service providers
    • Diverse applications support radiology, pharmacy, etc.
    • Applications exchange messages containing “semantically rich clinical information”
    • Summaries sent to national database
  – SNOMED-CT ontology provides common vocabulary for data
    • Clinical data uses terms drawn from ontology

SNOMED
• Over 400,000 concepts
• Schema only (no instances)
• Language used is a (well known) fragment of OWL
• NHS version extended with 1,000s of additional classes
  – OWL reasoner (FaCT++) used to classify and check ontology
    • Currently takes ≈ 4 hours
  – 180 missing subClass relationships were found, e.g.:
    • Periocular_dermatitis subClassOf Disease_of_face
    • Fibrin_measurement subClassOf Coagulation_factor_assay

SNOMED
• Vocabulary is extensible at point of use: “post coordination”
  – Users (e.g.
clinicians) may add/define new vocabulary
  – Terminology service (reasoner) used to insert in ontology
• Typical new term:
  – almond_allergy ≡ “allergy caused_by almond”
  – OWL reasoner (FaCT++) used to classify new term
    • Takes <10 ms
  – Classified as a kind of “nut allergy”
    • Clearly of crucial importance to recognise patients with allergy caused by almond as kinds of patient with nut allergy

Recent Developments

OWL 1.1
• Is an extension of OWL
  – Addresses deficiencies identified by users and developers (at OWLED workshop)
• Is based on more expressive DL: SROIQ
  – (OWL is based on SHOIN)
• W3C working group now chartered
  – Will develop recommendation based on existing member submission
• Already supported by popular OWL tools
  – Protégé, Swoop, TopBraid, FaCT++, Pellet

What’s New in OWL 1.1?
Four kinds of features:
• More expressive logic (SROIQ)
  – Qualified cardinality restrictions (≥n R.C) and (≤n R.C), e.g.:
    • Person ⊑ Animal ⊓ =2 hasPart.Legs
    • Car ⊑ =4 hasComponent.Wheel
    • Person ⊑ ≤1 bioParent.Male
    (OWL/SHOIN only allows for concepts (≥n R) and (≤n R))
  – Expressive role axioms (R), e.g., complex role inclusions:
    R1 ∘ … ∘ Rn ⊑ S
    R1 ∘ … ∘ Rn ∘ S ⊑ S
    S ∘ R1 ∘ … ∘ Rn ⊑ S
    (with some restrictions on cycles)
    useful, e.g., for
    • owns ∘ hasPart ⊑ owns ⇒ ∃owns.Bicycle ⊑ ∃owns.Wheels
    • partOf ∘ locatedIn ⊑ locatedIn ⇒ Fracture ⊓ ∃locatedIn.FemurShaft ⊑ Fracture ⊓ ∃locatedIn.Femur
    • hasParent ∘ hasBrother ⊑ hasUncle
  – Expressive role axioms (R), e.g., asymmetry, reflexivity, etc.:
    • Tra(R) (supported by SHOIN)
    • Asy(R) e.g., Asy(properPartOf), Asy(hasParent)
    • Sym(R) (supported by SHOIN)
    • Refl(R) e.g., Refl(knows)
    • Irrefl(R) e.g., Irrefl(properPartOf), Irrefl(hasParent)
    • Disj(R S) e.g., Disj(hasParent hasSibling)
    • ObjectExistsSelf(likes) [for narcissists]

What’s New in OWL 1.1?
Four kinds of features:
• More expressive datatypes
  – OWL 1.1 allows for user-defined datatypes:
    • over18 ≡ base(xsd:integer) minInclusive("18"^^xsd:integer)
    • Adult ≡ Person ⊓ ∃age.over18
  – and n-ary datatype predicates:
    • Spendthrift ≡ ∃spends,earns.>
  – BUT, still cannot:
    • define complex relationships between data properties on different individuals, e.g., women who earn more than their husbands
    • declare a datatype property as inverse-functional (keys)

What’s New in OWL 1.1?
Four kinds of features:
• Metamodelling and annotations
  – Names can be used as any or all of an individual, a class, or a property
  – Allows for a restricted form of metamodelling (“punning”), e.g.:
    SubClassOf(SnowLeopard BigCat)
    ClassAssertion(SnowLeopard EndangeredSpecies)
  – Annotations of axioms as well as entities
    ClassAssertion(Comment(“source: WWF”) SnowLeopard EndangeredSpecies)

What’s New in OWL 1.1?
Four kinds of features:
• Syntactic sugar (make things easier to say)
  – Disjoint unions, e.g.:
    DisjointUnion(Element Earth Wind Fire Water)
  – Negative assertions, e.g.:
    NegativeObjectPropertyAssertion(Ian hasChild Mary)
    NegativeDataPropertyAssertion(Ian hasAge 21)

Tractable Fragments
• OWL defines only one fragment (OWL Lite)
  – And it isn’t very tractable!
• OWL 1.1 defines several different fragments with useful computational properties
  – E.g., reasoning complexity in range LOGSPACE to PTIME
  – Smaller fragments implementable using RDBs

Tools and Methodologies
• OWL 1.1 support already added to several tools:
  – Protégé, Swoop, TopBraid Composer, FaCT++, Pellet
• New features available (soon) in OWL tools:
  – Diagnosis and semi-automatic repair of errors
  – Support for integration and modular design
  – Incremental classification (addition and retraction)
  – Support for bottom up design

Diagnosis
• Editing tools use reasoner to identify inconsistent classes
• May not be very useful without some explanation facility

Modularity in Ontology Engineering
Benefits of a modular ontology design: to simplify
• ontology refinement/update: modifying a module should not lead to modifications in parts of the ontology that are not conceptually related
• understanding: relationships between different modules in an ontology controlled and well-understood
• integration with other ontologies: no unexpected consequences
• partial reuse: reuse only the relevant part/module of an ontology

Tool Support for Modular Design
• Check when integration of modules is “safe”
  – Interface between modules via exported vocabulary
  – Information flows from imported to importing ontology
  – No information flows back the other way
• Formalised using conservative extensions
  – What is the effect of merging O2 into O1?
  – In general, check that O1 ∪ O2 ⊨ C iff O1 ⊨ C for any concept C constructed using vocabulary occurring in O1
[Cuenca Grau & Kazakov, IJCAI-07 and WWW-07]

Tool Support for Modular Design
• Extract smaller modules from large ontologies
  – E.g., starting with FMA, extract module for “Heart”
  – Tool should ensure that module
    • Is as small as possible, but
    • Still contains all relevant knowledge
• More formally:
  – Extract a (small) module from O capturing all “relevant” information about some vocabulary V
  – In general, find O’ ⊆ O s.t.
    O’ ⊨ C iff O ⊨ C for any concept C constructed using terms from V

Incremental Reasoning
• Modules can also be used to support incremental addition and retraction of axioms, e.g.:
  – When retracting C ⊑ D, reclassify only concepts whose module includes this axiom
  – Typically this is only a very small subset of all concepts
• Prototype now implemented in Swoop editor

Tool Support for Bottom-up Design
• Bottom-up design
  – Find a (small and specific) concept describing a set of individuals
  – In general, find most specific C s.t. O ⊨ C(i1) ∧ … ∧ C(in)
    • Where C may be “small” and/or in a sub-language (of O)
  – Prototype: SONIC system [Turhan et al]

Extending Expressive Power
• Database style keys [Lutz et al, JAIR 2004]
  – E.g., make + model + chassis-number is a key for Vehicles
• Rule language extensions
  – W3C RIF WG (see http://www.w3.org/2005/rules/)
  – First order extensions (e.g., SWRL) [Horrocks et al, JWS, 2005]
  – Hybrid language extensions, e.g., [Eiter et al, KR-04; Motik et al, ISWC-04; Rosati, JoWS, 2005]
  – LP/F-Logic/Common Logic [Chen et al, JLP, 1993; de Bruijn et al, WWW-05]
• Other extensions
  – Temporal
  – Fuzzy
  – Extended annotation framework
  – Macro language
  – …

Improving Scalability
• Optimisation techniques
  – Improve performance of DL reasoners, e.g., [Tsarkov et al, JAR]
• New Reasoning Techniques
  – Reduction to disjunctive Datalog [Motik et al, KR-04]
    • Transform SHOIN ontology to Datalog∨ rules
    • Use LP techniques to deal with large numbers of ground facts
  – Hybrid DL-DB systems [Horrocks et al, CADE-05]
    • Use DB to store “Abox” (individual) axioms
    • Cache inferences and use DB queries to answer/scope logical queries
  – Hypertableau based algorithms [Motik et al, CADE-07]
    • Prototypical implementation in HermiT system
• Polynomial time algorithms for sub-ALC logics
  – Graph based techniques for EL+ [Baader et al, IJCAI-05]
  – Database techniques for DL-Lite [Calvanese et al, AAAI-05]

Developing Tools and Infrastructure
• Editors/environments
  – OilEd,
    Protégé, Swoop, TopBraid, Ontotrack, …
• Reasoning systems
  – Cerebra, FaCT++, KAON2, Pellet, Racer, CEL, …
• Design methodologies
  – Foundational ontologies, etc.
  (Figure: foundational ontology taxonomy with nodes Entity, Endurant, Quality, Substantial, Perdurant, Event, Achievement, Stative, Accomplishment)

Summary
• Semantic Web aims to make web content more accessible to automated processes
  – Adds semantic annotations to web resources
• OWL Ontologies provide vocabulary for annotations
  – Terms have well defined meaning
• OWL now being used in a wide range of applications
  – e-Science, medicine, geography, geology, …
• Reasoning enabled tools are of crucial importance
  – For both design and deployment of ontologies
• Active research area
  – Expressive power, scalability, methodologies, tools, …

Thank you for listening
(Cartoon: FRAZZ © Jeff Mallett/Dist. by United Feature Syndicate, Inc.)
Any questions?