ISO/TC37/SC4/TDG6
Language Resource Ontologies
2008-09-27, Pisa
HASIDA Koiti
[email protected]
CfSR, AIST, Japan

Ontologization
- reformulation in terms of ontology
  - provide standard way to convert annotations to labeled directed graphs
  - DCR, LAF, LMF, FS, MAF, SemAF, SynAF, MLIF, etc.
  - Cf. LMF and MAF have UML-based schemas.
- not XML but RDF as base description and modeling tool
  - standard semantic interpretation for RDF
  - highlight semantics rather than syntax

Purposes of Ontologization
- interoperability
  - among ISO/TC37 standards
  - with ontologies from elsewhere
  - with any data containing linguistic content
  - RDF data are easier to integrate than XML data.
    - e.g. external annotation of texts in SMIL data without including linguistic description in SMIL specification
- fuller formalization of IS specifications
- semantic extension of DCR

Semantic Extension of DCR
- sorts of DCs
  - unary predicate → class
  - binary relation → property
  - symmetric binary relation, etc.
- types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)

XML Mess
- Semantic interpretation of XML is not standardized but defined ad hoc.
- Many inconsistent 'standards' on overlapping issues.
- Huge standards containing many different semantic interpretation manners.
  - e.g., MPEG-7 > 2000 pages

RDF
- Resource Description Framework
- labeled directed graph
- W3C recommendation: http://www.w3.org/RDF/
- Schemas are provided by RDFS, OWL, etc.
- textual representation: XML, N3, etc.

RDF Graph
[Figure: RDF graph with the following labeled edges]
  http://meetings.example.com/cal#m1  m:homePage   http://meetings.example.com/m1/hp
  http://www.example.org/people#fred  m:attending  http://meetings.example.com/cal#m1
  http://www.example.org/people#fred  m:hasEmail   mailto:[email protected]
  http://www.example.org/people#fred  m:givenName  Fred

Conversion of XML to RDF
- AnyURI- and IDREF(S)-type attribute → object property (link)
- other attribute → datatype property
- embedded element → object/datatype property

24610: Feature Structure
- typed feature structure as in HPSG, etc.
- ISO 24610-1: Feature Structure Representation
- ISO 24610-2: Feature System Declaration
- labeled directed graph
- AVM (attribute-value matrix)
- textual encoding by XML

FS Graph = RDF Graph
[Figure: feature-structure graph for "la pomme". SPECIFIER leads to a node with POS determiner and ORTH la; HEAD leads to a node with POS noun and ORTH pomme; the AGR edges of both nodes lead to one shared node with NUMBER singular.]

FS in AVM
  SPECIFIER  POS   determiner
             ORTH  'la'
             AGR   [1][NUMBER singular]
  HEAD       POS   noun
             ORTH  'pomme'
             AGR   [1]

Ontologies Subsume Feature Systems
- Features are partial functions, whereas RDF properties are relations in general (possibly partial functions).
- Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).

Feature-System Declaration
  <fsDecl type="word" baseTypes="sign">
    <fsDescr>The fundamental type for individual words</fsDescr>
    <fDecl name="orth">
      <fDescr>The orthographic representation for this word</fDescr>
      <vRange><string/></vRange>
    </fDecl>
  </fsDecl>
[Figure: the same declaration as an RDF graph with the following labeled edges]
  word  rdfs:subClassOf  sign
  word  rdfs:comment     The fundamental type for individual words
  orth  rdf:type         owl:FunctionalProperty
  orth  rdfs:domain      word
  orth  rdfs:range       string
  orth  rdfs:comment     The orthographic representation for this word

Constraint (Conditional)
  <cond>
    <fs>
      <f name="inv"> <binary value="true"/> </f>
    </fs>
    <then/>
    <fs>
      <f name="aux"> <binary value="true"/> </f>
      <f name="vform"> <symbol value="fin"/> </f>
    </fs>
  </cond>
[Figure: graph form of the constraint. A node X with inv true is linked by cond to a node X with aux true and vform fin.]
SWRL representation: inv(?X,true) -> aux(?X,true) & vform(?X,fin)

FS Ontologization (Summary)
- RDF ⊃ FS
- Use ontologies for feature-system declarations.
- SWRL to encode constraints
- Defaults are outside of ontology.

24612: Linguistic Annotation Framework

GrAF in RDF
[Figure: RDF graph of a GrAF annotation of "The clock". Node labels: TOKEN, The, DET, THE, clock, NN, CLOCK, SING, NP. Edge labels: rdf:type, POS, BASE, NUMBER.]
- possibly stand-off annotation

SemAF-DActs
[Figure: UML class diagram. Classes: Dialogue, Agent, Turn, Utterance, DialogueAct. Association roles: sender, addressee, overhearer. Multiplicities shown: 1..1, 0..*, 1..*. One label reads func.dep.]

TODOs (projects in TDG6?)
- include ontologies in documents
  - FSD
- just check UML (as far as no property hierarchy is necessary)
  - LMF, MAF
- finish ontologization (possibly in UML)
  - SynAF
- ontologize from scratch, forgetting XML
  - DCR, SemAF-Time, SemAF-DActs, MLIF, etc.

Issues
- Who should ontologize individual WIs?
  - ontologize future WIs from the beginning
  - TDG6 should exemplify how.
  - whether and how to make ontologization mandatory?
- Where to include ontologies of ongoing WIs?
  - depending on their stages (WD, CD, ...)
- How to keep ontologizing DCs?
  - replace DC metamodel by ontology?
  - modify ISOCat?
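
Sketch: FS as a labeled directed graph

The "FS Graph = RDF Graph" slide claims that a feature structure and an RDF graph are the same kind of object. A minimal Python sketch of that idea follows; it is an illustration, not part of the ISO 24610 or RDF specifications. The function name fs_to_triples and the blank-node naming scheme (_:fs0, _:fs1, ...) are assumptions made for this example. Re-entrancy (the [1] tag in the AVM) is modeled by reusing the same Python dict object, which then maps to a single graph node.

```python
from itertools import count


def fs_to_triples(fs, root="_:fs0"):
    """Flatten a nested feature structure into (subject, feature, value) triples.

    Nested dicts become blank-node identifiers. A dict object reachable by
    two paths (re-entrancy, the [1] tag in the AVM) gets one identifier,
    so both paths end at the same node of the graph.
    """
    ids = {id(fs): root}  # Python object identity -> node identifier
    fresh = count(1)      # supplies numbers for new blank nodes
    triples = []

    def visit(node):
        subject = ids[id(node)]
        for feature, value in node.items():
            if isinstance(value, dict):
                seen = id(value) in ids
                if not seen:
                    ids[id(value)] = f"_:fs{next(fresh)}"
                triples.append((subject, feature, ids[id(value)]))
                if not seen:  # descend only once into a shared node
                    visit(value)
            else:
                triples.append((subject, feature, value))

    visit(fs)
    return triples


# The "la pomme" structure from the AVM slide; `agr` is shared by both parts.
agr = {"NUMBER": "singular"}
la_pomme = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}

triples = fs_to_triples(la_pomme)
for triple in triples:
    print(triple)
```

Both AGR edges point at the same identifier, which is exactly what the shared [1] tag expresses in the AVM. Mapping feature names to RDF properties with URIs, and declaring them functional, would be the job of the ontology discussed on the Feature-System Declaration slide.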