SHOE: A Knowledge Representation Language for Internet Applications

The Problem
• HTML was never meant for computer consumption; its function is to display data for humans to read.
• The "knowledge" on a web page is in a human-readable language (usually English), laid out with tables and graphics and frames in ways that we as humans comprehend visually.
• Even with state-of-the-art natural language technology, getting a computer to read and understand web documents is very difficult.
• This makes it very difficult to create an intelligent agent that can wander the web on its own, reading and comprehending web pages as it goes.

The Solution
• SHOE!
• Simple HTML Ontology Extensions
• SHOE eliminates this problem by making it possible for web pages to include knowledge that intelligent agents can actually read.

The Internet changes things
• The Web is a Knowledge Base.
• A massive source of information for agents to make intelligent queries on.
• Requires a shift in our view of what a KB is and what a KR language should be designed for.

The Web as Knowledge Base
• The Web is massive
  – Most KR systems have semantics too rich to scale well
  – Many KR languages have NP-hard complexity
  – KR for Web must make complexity/expressivity tradeoffs

Web as KB (cont’d)
• The Web is an “Open World”
  – A Web agent is not free to assume it has gathered all available information.
  – Many KR systems assume a “closed world.”
  – Unlikely, on the Web, that any KB describing it could ever be complete.

The Web is Dynamic
• Web changes faster than any bot or agent could keep up with.
• A KR system must assume that data can be, and often will be, out of date.
• Without a unifying ontological framework, web agents will struggle to cross-map conflicting knowledge structures.
• The Web’s KR framework must be flexible yet general to handle the on-line economy of ideas.
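
The open-world point above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Answer` enum, the `ask_closed` and `ask_open` helpers, and the sample facts are hypothetical names introduced here, not part of SHOE. A closed-world query treats a missing fact as false, whereas an open-world query can only report that the fact has not been seen.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Facts an agent has gathered so far (hypothetical sample data).
gathered = {("advisor", "alice", "bob")}


def ask_closed(fact, facts):
    """Closed world: anything not in the KB is assumed to be false."""
    return Answer.TRUE if fact in facts else Answer.FALSE


def ask_open(fact, facts):
    """Open world: a missing fact only means it has not been gathered yet."""
    return Answer.TRUE if fact in facts else Answer.UNKNOWN
```

Under the open-world reading, an agent that has crawled only part of the Web never concludes that a statement is false merely because it did not find it.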

Web as KB redux
• Viewing the Web as a Knowledge Base changes the way we must look at KR and KR languages.
• Web systems cannot assume that all of the information is correct and consistent.
• Authority on the Internet is distributed.

No Central Control
• Each page’s reliability must be questioned.
• No guarantee on the availability of information.
• Information from different sources can be in disagreement, leading to inconsistency.
• Web Hoaxes
• On the Web no one knows you’re a dog.

Ontology
• Modern KR systems designed around concept of categorization.
  – Allows reasoning about the generality of a concept
  – Allows specification of relationships between these concepts.
• Such ontologies allow one to define what is relevant and what is to be ignored.

Ontologies on the Web
• Ontologies on the Web can be used to structure information if we take into account the properties discussed earlier.
• Let’s look at some of the problems that may be solved with the use of ontologies.

Heterogeneity
• Many file formats and protocols:
  – images, music, movies, VR files
  – HTTP, FTP, Telnet, Gopher
• Automated indexing is difficult.
• All of these resources are potentially useful to someone.
• Need method to specify what information is contained in these sources.

Lack of Structure
• Structure of HTML used primarily for presentation, instead of information retrieval.
• Difficult to infer semantic meaning from HTML documents despite limited support for semantic information (META tags, etc.)
• XML will allow semi-structured documents, but will need some form of Ontology.
• No structures for classification or reasoning.

Contextual Dependency
• Reading documents, people draw on contextual knowledge (domain, language) to interpret statements.
• Context required to disambiguate terms and provide framework for understanding.
• Ontologies provide mechanism by which context can be encoded on web pages or other repositories of web-based information.
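
The claim that categorization allows reasoning about the generality of a concept can be sketched in a few lines. This is a minimal illustration, not SHOE syntax: the `PARENTS` table, the category names, and the `is_a` helper are all hypothetical and chosen only for the example.

```python
# Hypothetical toy taxonomy: each category maps to its direct parent categories.
PARENTS = {
    "GraduateStudent": ["Student"],
    "Student": ["Person"],
    "Professor": ["Person"],
    "Person": [],
}


def is_a(category, ancestor, parents=PARENTS):
    """Return True if `category` is `ancestor` or a more specific kind of it."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue  # guard against cycles in a badly formed taxonomy
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False
```

With such a structure an agent asked for every `Person` can also return instances that were only marked as `GraduateStudent`, because the taxonomy records that the second concept is a special case of the first.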

The SHOE Language

Basic Structure
• Ontologies
  – define rules guiding what kinds of assertions may be made and what kinds of inferences may be drawn on ground assertions
• Instances
  – entities which make assertions based on those rules
• SHOE treats assertions as claims being made by specific instances (rather than as facts to be gathered as generally recognized truth).
• SHOE syntax is an application extension of HTML
  – also available in XML syntax
  – SHOE is also designed for more general distributed knowledge and agent issues.

SHOE Ontologies
• SHOE has flexible facilities for ontologies to be derived from one or more superontologies in a multiple-inheritance scheme, or for later versions to modify earlier versions.
• Four basic data types
  – strings, numbers, dates and boolean values
• An additional URL type is under consideration.
• An ontology can define additional arbitrary types.
• An ontology can make category definitions which specify the categories under which instances can be classified.
• Relational Definitions
  – <RELATION> tags specify the format of n-ary relational claims made by instances regarding other instances and data
• Inferential Declarations
  – <DEF-INFERENCE> tags can specify additional inferences agents may freely make on ground information.

LKite: Using the URL as the ID gives agents the ability to determine whether an instance is really what it claims to be.

SHOE Instances
• Fill two functions:
  – Instances are arbitrary objects, like those in an object-oriented database system.
  – Instances are the elements responsible for making claims.
• Each instance has a unique ID
  – SHOE proposes, but does not require, that the ID be based on the URL of the page where the instance is found.
• Instances may specify delegate instances.
• Within an instance may be found category claims and relation claims made by that instance:
  – category claim: instance x should be categorized under category y.
  – relational claim: instance claims that an n-ary relation exists.

Formal Definition
• We’ll skip the details today, but say:
  – SHOE’s semantic knowledge consists of a set of claims, made by instances, about relationships between ground atomic elements (numbers, strings, instances, etc.)
  – Claims are either ground claims explicitly stated in instances or claims SHOE has inferred via the simple rules defined in an ontology.

Language Features
• Compatibility with HTML/XML
  – application of SGML
  – HTML compatible syntax defined in an SGML DTD derived from the HTML DTD.
  – XML version:
    • has familiar format
    • can be analyzed and processed through DOM
    • With XSL, SHOE markup can be both machine- and human-readable.
• Prevention of Contradiction
  – assertions permitted, not retractions
  – no negation
  – no single-valued relations (relational sets having only one value or a fixed number of values)
  – includes claimant as part of a claimed assertion.
• Extensibility and Versioning
  – Shared Ontologies: two ontologies referring to a common concept should both extend an ontology in which that concept is defined.
  – Each version of an ontology is a separate file with a unique version number
  – All versions of an ontology are accessible
  – Ontologies can specify backward-compatibility
  – Depends on compliance of ontology designers

Related Work
• HTML
• Wrappers
• Ontobroker
• Web Analysis and Visualization Environment (WAVE)
• Ontology Markup Language (OML)
• Conceptual Knowledge Markup Language (CKML)

SHOE vs. RDF
• RDF drawbacks:
  – RDF is a semantic network without inheritance; just nodes connected with named links
  – RDF has no mechanism for defining general inferences
  – no way to map between different representations of the same concept.
  – RDF schema can’t rename properties to a local vocabulary (no equivalence)
  – no way to track revision of a schema unless schema maintainer uses a consistent naming scheme for the URIs.
  – Use of XML namespaces leads to difficulty in distinguishing RDF from a different DTD.

LKite: Ensuring that two object references are matched when they conceptually refer to the same object is an open problem.

Language Features
• Other features:
  – Separation of ontologies and instances (unlike RDF)
  – N-ary relations
  – Uniqueness of identification
    • the system will only interpret two objects as equivalent when they are truly equivalent

Final Notes
• Concerns:
  – versioning compliance depends on cooperation of ontology designers
  – reliance on “market forces” to weed out bad ontologies
  – relies on central repository of ontologies
  – Scalability yet to be proved
  – Ditto usability (simple tools needed)
  – Language issues (instance vs. category)
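
The way SHOE avoids contradiction (assertions only, no negation, no single-valued relations, claimant recorded with every assertion) can be illustrated with a small sketch. It is an assumption-laden toy model, not SHOE markup or any SHOE tool's API: the `Claim` and `ClaimStore` classes, the `age` relation, and the example.org URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claimant: str   # ID of the instance making the claim, e.g. a page URL
    relation: str   # name of the relation being claimed
    args: tuple     # ground arguments of the n-ary relation


class ClaimStore:
    """Append-only store: claims can be added, never retracted or negated."""

    def __init__(self):
        self._claims = set()

    def assert_claim(self, claim):
        self._claims.add(claim)

    def claimants(self, relation, args):
        """Return the IDs of all instances that make this exact claim."""
        return {c.claimant for c in self._claims
                if c.relation == relation and c.args == tuple(args)}

    def values(self, relation, subject):
        """Return every value claimed for a binary relation about `subject`."""
        return {c.args[1] for c in self._claims
                if c.relation == relation
                and len(c.args) == 2
                and c.args[0] == subject}


# Two hypothetical pages disagree about the same subject.
store = ClaimStore()
store.assert_claim(Claim("http://example.org/a", "age", ("bob", 30)))
store.assert_claim(Claim("http://example.org/b", "age", ("bob", 31)))
```

Both claims coexist in the store. Because the relation is not single-valued and each claim names its claimant, the disagreement is data the agent can weigh (whom does it trust?) rather than a logical contradiction.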