Introduction to the Semantic Web
Charlie Abela
Department of Artificial Intelligence
[email protected]
CSA 3210 Introduction

Lecture Outline
- Course organisation
- Today's Web limitations
- Machine-processable data
- The Semantic Web
- Impact
- Semantic Web Technologies
- The Layered Approach

Organisation
- This part of the course: approx. 2 ECTS = 14 hrs
- Lectures: usually Tuesday 15:00-16:00
- Assignment: intended to combine all aspects of this course

Course Material
- Slides & Additional Reading
    http://www.cs.um.edu.mt/~cabe2/lectures/sw/course_material.html
- Textbooks
    A Semantic Web Primer by Grigoris Antoniou and Frank van Harmelen, ISBN 9780262012102
    Semantic Web: concepts, technologies and applications by Karin K. Breitman, Marco A. Casanova and Walter Truszkowski, ISBN 9781846285813

The Web
What are the main components of the Web?
- HTTP (how to transfer data)
    GET /index.html
- URI (how to address data)
    http://www.cs.um.edu.mt/....
- HTML (how to mark up data for human readers)
    <html><head><title>.....

The core problem of the Web
Information overload, which leads to problems when:
- Retrieving documents
- Extracting relevant data from retrieved documents
- Combining information from different sources to achieve a particular goal

Retrieve a document
Querying for "jaguar" returns various types of results:
- Cars
- Feline
- Operating system
- Who knows what else

Extracting information
[Figure slides]

Aggregating information
- Find me the cheapest price for the book "Semantic Web Primer"
[Figure slide]

Personal Software Agents
- Let a personal assistant handle all the web-related tasks. Cool!!
- However….

Today's Web
- Today's Web content is suitable for human consumption
- However, to a machine it must look like this [Figure]
- Crazy!!!

Current Web Content
- Web content is currently formatted for human readers rather than programs.
- HTML is the predominant language in which Web pages are written
- Leads to problems where machines are involved:
    How to distinguish staff pages?
    How to determine exact contact hours?
    If links are to be followed, how will the agent find the correct one?
HTML
    <h1> Department of AI</h1>
    Welcome to the Department of Artificial Intelligence.
    <h2>Students’ hours</h2>
    Mon 10am – 11.30am<br>
    Tue 11am – 12.30pm<br>
    Wed 3pm - 4pm<br>
    Thu 11am – 12.30pm<br>
    Fri 10am – 11.30am<p>
    Students are urged to contact us during these slots
    <a href=". . .">Staff Pages</a>

Possible solution
- Apart from making content human-readable, make it also machine-processable!
- Ask queries that are machine-understandable
    i.e. machines must be capable of understanding all the terms involved

The Semantic Web Approach
- The Semantic Web is specifically a web of machine-readable information whose meaning is well-defined by standards.
- It is not:
    artificial intelligence: no magic involved, rather we need to find ways in which our machines can access and use machine-processable information to ease our day-to-day activities
    a separate kind of Web: rather an extension
- Web + machine-processable information

Impact of the Semantic Web
- Knowledge Management:
    concerns itself with acquiring, accessing, and maintaining knowledge within an organization
    Key activity of large businesses: they view internal knowledge as an intellectual asset
- B2C Electronic Commerce:
    A typical scenario: user visits one or several online shops, browses their offers, selects and orders products.
    Browsing multiple stores is too time consuming. Make use of Shopbots.
- B2B Electronic Commerce:
    Currently relies mostly on EDI (complex, difficult to use)
    But B2B is not well supported by Web standards

Semantic Web Technologies
- Explicit metadata
- Ontologies to standardise concepts and relations between them
- Logic and Inference: languages founded in various flavours of logic
- Software Agents: make use of all the above to help us in our tasks

Explicit Metadata
- Metadata: data about data
    is structured data which describes the characteristics of a resource
- Metadata capture part of the meaning of data
    used in HTML: <Meta>… tag
- It shares many characteristics with the cataloguing that takes place in libraries, museums and archives.
    E.g. Dublin Core schema: can be used to define a "virtual card"

A more Comprehensive Representation
XML based
    <department>
      <departmentName>Artificial intelligence</departmentName>
      <hod>
        <name>Roger Right</name>
        <room>312</room>
        <telephone>23400007</telephone>
        <contactHr>11:30am-13:30pm</contactHr>
      </hod>
      <staff>
        <lecturer>Steve Runner</lecturer>
        <lecturer>George Cool</lecturer>
        <secretary>Mary Nice</secretary>
      </staff>
    </department>
XML-based representations are more easily processable by machines, since they are more structured

Ontologies
- The term ontology originates from philosophy:
    The study of the nature of existence
- "Ontology is the study of the categories of things that exist or may exist in some domain…it is a catalogue of the types of things that are assumed to exist in a domain D from the perspective of a person who uses a language L to talk about D." (Sowa 1997)
- Think of an ontology as a vocabulary used to describe things (Guarino 1998)
- Ontologies are used to facilitate knowledge sharing and reuse by formally defining a shared conceptualization

Components of Ontologies
An ontology formally describes a domain of discourse and includes the following components.
- Terms denote important concepts (or classes of objects) in the domain
    e.g. professors, staff, students, courses, departments
- Relationships between these terms: the most typical is a taxonomy relation (is-A)
    a class C is a subclass of another class C' if every object in C is also included in C'
    e.g. all professors are staff members

Other Ontology Components
- Properties:
    e.g. X teaches Y
- Value restrictions
    e.g. only faculty members can teach courses
- Disjointness statements
    e.g. faculty members and general staff are disjoint
- Logical relationships between objects
    e.g. every department must include at least 10 faculty members

Ontologies on the Web
- Ontologies are ideal to provide a shared understanding of a domain:
    enable semantic interoperability
    overcome differences in terminology
    issue: mappings between ontologies
- Ontologies are useful for the organization and navigation of Web sites
- Ontologies are useful for improving the accuracy of Web searches
    search engines can look for pages that refer to a precise concept in an ontology

Semantic Web Languages
[Figure axis: expressive]
- Need languages to define ontologies
- Initially there were RDF/Schema: Resource Description Framework
- then came DAML and OIL
- now we have a W3C recommendation for OWL: Web Ontology Language

Logic and Inference
- Logic is the discipline that studies the principles of reasoning
- Formal languages for expressing knowledge
- Well-understood formal semantics
    Declarative knowledge: we describe what holds without caring about how it can be deduced
- Automated reasoners can deduce (infer) conclusions from the given knowledge

Machine understandable…
- Published facts
    B related-to A
    C related-to A
    D related-to C
- Query: return all entities related to A
    ?x related-to A
- Result
    B
    C

Machine understandable + inference
- Published facts
    B related-to A
    C related-to A
    D related-to C
- also declare
  that related-to is transitive
    ?x related-to ?y and ?y related-to ?z => ?x related-to ?z
- Query: return all entities related to A
    ?x related-to A
- Result
    B
    C
    D

Software Agents
- Software agents work autonomously and proactively
    They evolved out of object-oriented and component-based programming
- A personal agent on the Semantic Web will:
    receive some tasks and preferences from the person
    seek information from Web sources, communicate with other agents
    compare information about user requirements and preferences, suggest certain choices
    recommend answers to the user

Semantic Web Layered Approach
[Figure slide]

In the following lectures…
- We will explore some of the technologies mentioned in the SW layered approach, particularly those in the lower layers:
    present an overview of these technologies
    walk through examples and discuss their importance vis-à-vis application areas

Suggested reading…
- Textbook: Semantic Web Primer, Chapter 1
- TBL, J. Hendler, O. Lassila, The Semantic Web.
    http://www.cs.um.edu.mt/~cabe2/lectures/sw/papers/The_Semantic_Web.pdf
- J. Hendler, Agents and the Semantic Web.
    http://www.cs.umd.edu/users/hendler/AgentWeb.html
Further reading
- The Semantic Web: A Primer, E. Dumbill.
    http://www.xml.com/pub/a/2000/11/01/semanticweb/
- The Semantic Web: An Introduction, S. Palmer.
    http://infomesh.net/2001/swintro/

Next lecture
- Introduction to XML
    DTD
    XML schema
    Comparison

Extra slides

Another typical Example
    prof(X) → facultyMember(X)
    facultyMember(X) → staffMember(X)
    prof(michael)
- We can deduce the following conclusions:
    facultyMember(michael)
    staffMember(michael)
    prof(michael) → staffMember(michael)
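
Inference sketch
The two inference examples in these slides (the transitive related-to facts and the prof / facultyMember / staffMember rules) can be tried out with a few lines of Python. This is only a minimal sketch under assumptions: the function and variable names (transitive_closure, related_to, forward_chain, published, rules) are hypothetical and do not come from any Semantic Web library, and a real reasoner over RDF or OWL works quite differently. It simply keeps applying the declared rule until no new fact appears.

```python
# Illustrative sketch only: all names below are hypothetical helpers,
# not part of any Semantic Web toolkit.

def transitive_closure(facts):
    """Apply 'x R y and y R z => x R z' until no new pair is derived."""
    closed = set(facts)
    while True:
        derived = {(x, z)
                   for (x, y1) in closed
                   for (y2, z) in closed
                   if y1 == y2}
        if derived <= closed:      # nothing new: fixed point reached
            return closed
        closed |= derived


def related_to(target, facts):
    """Answer the query '?x related-to target' over a set of (x, y) pairs."""
    return sorted(x for (x, y) in facts if y == target)


def forward_chain(facts, rules):
    """Apply unary rules 'body(X) -> head(X)' until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for predicate, individual in list(known):
            head = rules.get(predicate)
            if head is not None and (head, individual) not in known:
                known.add((head, individual))
                changed = True
    return known


# Published facts from the slides: B related-to A, C related-to A, D related-to C
published = {("B", "A"), ("C", "A"), ("D", "C")}

# Rules from the extra slide: prof(X) -> facultyMember(X) -> staffMember(X)
rules = {"prof": "facultyMember", "facultyMember": "staffMember"}

if __name__ == "__main__":
    print(related_to("A", published))                      # without inference
    print(related_to("A", transitive_closure(published)))  # with transitivity
    print(sorted(forward_chain({("prof", "michael")}, rules)))
```

Without the transitivity declaration the query only finds the entities stated explicitly; once the rule is applied, D is returned as well, which is the difference between the two "Machine understandable" slides.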