Introducing the Semantic Web

Introduction to the Semantic Web
Charlie Abela
Department of Artificial Intelligence
[email protected]
Lecture Outline






Course organisation
Today’s Web limitations
Machine-processable data
The Semantic Web Impact
Semantic Web Technologies
The Layered Approach
CSA 3210
Introduction
2
Organisation

This part of the course:
    approx. 2 ECTS = 14 hrs
    Lectures: usually Tuesday 15:00-16:00
    Assignment: intended to combine all aspects of this course
Course Material

Slides & Additional Reading
    http://www.cs.um.edu.mt/~cabe2/lectures/sw/course_material.html
Textbooks
    A Semantic Web Primer, by Grigoris Antoniou and Frank van Harmelen, ISBN 9780262012102
    Semantic Web: Concepts, Technologies and Applications, by Karin K. Breitman, Marco A. Casanova and Walter Truszkowski, ISBN 9781846285813
The Web

What are the main components of the Web?

HTTP (how to transfer data)
    GET /index.html
URI (how to address data)
    http://www.cs.um.edu.mt/....
HTML (how to mark up data for human readers)
    <html><head><title>.....
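
As a rough sketch of how the three components fit together, the Python fragment below splits a URI into its parts, builds the HTTP request line a browser would send, and pulls the title out of an HTML reply. The URI path, the reply string and the class name `TitleParser` are assumptions made for illustration, and no network access is involved.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical URI: the host is the one on the slide, the path is assumed.
uri = "http://www.cs.um.edu.mt/index.html"

# URI: how to address data
parts = urlsplit(uri)

# HTTP: how to transfer data (request line plus the Host header)
request = f"GET {parts.path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"

# HTML: how to mark up data for a human reader (assumed reply body)
reply = "<html><head><title>Department of AI</title></head><body>Welcome</body></html>"


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


parser = TitleParser()
parser.feed(reply)
print(parts.scheme, parts.netloc, parts.path)
print(request.splitlines()[0])
print(parser.title)
```

Note that the program only recovers the title because it was told which tag to look for; nothing in the markup says what the page is about.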
The core problem of the Web

Information overload, which leads to problems when:
    Retrieving documents
    Extracting relevant data from retrieved documents
    Combining information from different sources to achieve a particular goal
Retrieve a document

Querying for “jaguar” returns various types of results:
    Cars
    Feline
    Operating system
    Who knows what else
Extracting information
Aggregating information
Find me the cheapest price for the book “Semantic Web Primer”
Personal Software Agents
Let a personal assistant handle all the web-related tasks. Cool!!
However…
Today’s Web


Today’s Web content is suitable for human consumption.
However, to a machine it must look like this
Crazy!!!
Current Web Content




Web content is currently formatted for human readers rather than programs.
HTML is the predominant language in which Web pages are written.
Leads to problems where machines are involved:
    How to distinguish staff pages?
    How to determine exact contact hours?
    If links are to be followed, how will the agent find the correct one?
HTML
<h1> Department of AI</h1>
Welcome to the Department of Artificial Intelligence.
<h2>Students’ hours</h2>
Mon 10am – 11.30am<br>
Tue 11am – 12.30pm<br>
Wed 3pm - 4pm<br>
Thu 11am – 12.30pm<br>
Fri 10am – 11.30am<p>
Students are urged to contact us
during these slots
<a href=". . .">Staff Pages</a>
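
To make the problem concrete, here is a small sketch of what a program can actually recover from such a page. It uses a shortened copy of the markup above; the class name `PageScanner` and the choice of which tags to track are assumptions for illustration.

```python
from html.parser import HTMLParser

# A shortened copy of the page above, markup as on the slide.
page = """<h1> Department of AI</h1>
Welcome to the Department of Artificial Intelligence.
<h2>Students’ hours</h2>
Mon 10am – 11.30am<br>
Tue 11am – 12.30pm<br>
<a href=". . .">Staff Pages</a>"""


class PageScanner(HTMLParser):
    """Records headings, link texts and all remaining text."""

    def __init__(self):
        super().__init__()
        self.current = None  # tag we are inside, if it is one we track
        self.headings = []
        self.links = []
        self.other_text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "a"):
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.current in ("h1", "h2"):
            self.headings.append(text)
        elif self.current == "a":
            self.links.append(text)
        else:
            self.other_text.append(text)


scanner = PageScanner()
scanner.feed(page)
print(scanner.headings)
print(scanner.links)
print(scanner.other_text)
```

The contact hours end up in `other_text` next to the welcome sentence: the markup says how to display them, not that they are contact hours, so an agent would have to guess from the wording.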
Possible solution


Apart from making content human-readable, make it also machine-processable!
Ask queries that are machine-understandable
    i.e. machines must be capable of understanding all the terms involved
The Semantic Web Approach


The Semantic Web is specifically a web of machine-readable information whose meaning is well-defined by standards.
It is not
    artificial intelligence: no magic involved, rather we need to find ways in which our machines can access and use machine-processable information to ease our day-to-day activities
    a separate kind of Web: rather an extension
        Web + machine-processable information
Impact of the Semantic Web

Knowledge Management:
    concerns itself with acquiring, accessing, and maintaining knowledge within an organization
    Key activity of large businesses: they view internal knowledge as an intellectual asset
B2C Electronic Commerce:
    A typical scenario: user visits one or several online shops, browses their offers, selects and orders products.
    Browsing multiple stores is too time consuming. Make use of Shopbots.
B2B Electronic Commerce:
    Currently relies mostly on EDI (complex, difficult to use)
    But B2B is not well supported by Web standards
Semantic Web Technologies




Explicit metadata
Ontologies to standardise concepts and relations between them
Logic and Inference: languages founded in various flavours of logic
Software Agents: make use of all the above to help us in our tasks
Explicit Metadata

Metadata: data about data
    is structured data which describes the characteristics of a resource
    Metadata capture part of the meaning of data
    used in HTML: <meta>… tag
It shares many characteristics with the cataloguing that takes place in libraries, museums and archives.
E.g. Dublin Core schema: can be used to define a “virtual card”
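
A minimal sketch of such a “virtual card”, rendered as HTML meta tags. The title and creator values come from the title slide of this lecture; the `DC.` name prefix and the `schema.DC` link follow the Dublin Core convention for HTML as commonly documented, and the helper name `to_meta_tags` is an assumption for illustration.

```python
from html import escape

# A "virtual card" for this lecture using two Dublin Core element names.
# Choosing only title and creator is an illustrative simplification.
card = {
    "title": "Introduction to the Semantic Web",
    "creator": "Charlie Abela",
}


def to_meta_tags(record):
    """Render a record as HTML <meta> tags using the 'DC.' name prefix."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        lines.append(f'<meta name="DC.{element}" content="{escape(value)}">')
    return "\n".join(lines)


print(to_meta_tags(card))
```

The point of the shared element names is that a program can look up the creator of any page described this way without having to interpret the visible text.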
A more Comprehensive Representation
XML based
XML-based representations are more easily processable by machines, since they are more structured

<department>
  <departmentName>Artificial intelligence</departmentName>
  <hod>
    <name>Roger Right</name>
    <room>312</room>
    <telephone>23400007</telephone>
    <contactHr>11:30am-13:30pm</contactHr>
  </hod>
  <staff>
    <lecturer>Steve Runner</lecturer>
    <lecturer>George Cool</lecturer>
    <secretary>Mary Nice</secretary>
  </staff>
</department>
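
The following sketch shows how directly a program can answer the earlier questions (contact hours, who the lecturers are) once the data is structured this way. It uses Python's standard `xml.etree.ElementTree`; the hyphen between the two contact times is assumed, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The department record from the slide, whitespace tidied.
# The hyphen in <contactHr> is an assumption about the intended value.
doc = """
<department>
  <departmentName>Artificial intelligence</departmentName>
  <hod>
    <name>Roger Right</name>
    <room>312</room>
    <telephone>23400007</telephone>
    <contactHr>11:30am-13:30pm</contactHr>
  </hod>
  <staff>
    <lecturer>Steve Runner</lecturer>
    <lecturer>George Cool</lecturer>
    <secretary>Mary Nice</secretary>
  </staff>
</department>
"""

root = ET.fromstring(doc.strip())

# Each question becomes a path through named elements.
head_name = root.findtext("hod/name")
contact_hours = root.findtext("hod/contactHr")
lecturers = [e.text for e in root.findall("staff/lecturer")]

print(head_name, contact_hours)
print(lecturers)
```

The tag names still carry no agreed meaning outside this document: another department could call the same element `<officeHours>`, which is the gap ontologies are meant to close.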
Ontologies

The term ontology originates from philosophy:
    The study of the nature of existence
    Ontology is the study of the categories of things that exist or may exist in some domain…it is a catalogue of the types of things that are assumed to exist in a domain D from the perspective of a person who uses a language L to talk about D. (Sowa 1997)
Think of an ontology as a vocabulary used to describe things (Guarino 1998)
Ontologies are used to facilitate knowledge sharing and reuse by formally defining a shared conceptualization
Components of Ontologies


An ontology formally describes a domain of discourse and includes the following components.
Terms denote important concepts (or classes of objects) in the domain
    e.g. professors, staff, students, courses, departments
Relationships between these terms: most typical is a taxonomy relation (is-A)
    a class C is a subclass of another class C' if every object in C is also included in C'
    e.g. all professors are staff members
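
The subclass definition can be illustrated with plain sets, treating each class as the collection of its objects. The individuals below are invented for the example, and `is_subclass` is a hypothetical helper name.

```python
# Hypothetical individuals, made up for this example.
professors = {"michael", "anna"}
staff = {"michael", "anna", "mary"}
students = {"tom"}


def is_subclass(c, c_prime):
    """C is a subclass of C' if every object in C is also included in C'."""
    return all(obj in c_prime for obj in c)


print(is_subclass(professors, staff))   # all professors are staff members
print(is_subclass(staff, professors))   # but not every staff member is a professor
```

This checks the relation against known members only. In an ontology the subclass statement is usually declared once, and it then applies to every individual, including ones added later.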
Other Ontology Components

Properties:
    e.g. X teaches Y
Value restrictions
    e.g. only faculty members can teach courses
Disjointness statements
    e.g. faculty members and general staff are disjoint
Logical relationships between objects
    e.g. every department must include at least 10 faculty members
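
As a sketch, the three example constraints can be written as checks over toy data. All individuals, the second course code and the function names are assumptions for illustration; a real ontology language would state these as axioms rather than as Python functions.

```python
# Hypothetical toy data, invented for this sketch.
faculty = {"michael", "anna"}
general_staff = {"mary"}
teaches = {("michael", "CSA3210"), ("anna", "CSA1010")}  # X teaches Y
department_faculty = {"AI": {"michael", "anna"}}


def value_restriction_ok(teaches, faculty):
    """Value restriction: only faculty members can teach courses."""
    return all(teacher in faculty for teacher, _course in teaches)


def disjoint(a, b):
    """Disjointness: the two classes share no member."""
    return a.isdisjoint(b)


def understaffed(department_faculty, minimum=10):
    """Departments that include fewer than `minimum` faculty members."""
    return [d for d, members in department_faculty.items() if len(members) < minimum]


print(value_restriction_ok(teaches, faculty))
print(disjoint(faculty, general_staff))
print(understaffed(department_faculty))
```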
Ontologies on the Web

Ontologies are ideal to provide a shared understanding of a domain: enable semantic interoperability
    overcome differences in terminology
    issue: mappings between ontologies
Ontologies are useful for the organization and navigation of Web sites
Ontologies are useful for improving the accuracy of Web searches
    search engines can look for pages that refer to a precise concept in an ontology
Semantic Web Languages

Need languages to define ontologies
Initially there were RDF and RDF Schema:
    Resource Description Framework
then came DAML and OIL
now we have a W3C recommendation for OWL
    Web Ontology Language
(the languages are listed in order of increasing expressiveness)
Logic and Inference



Logic is the discipline that studies the principles of reasoning
Formal languages for expressing knowledge
Well-understood formal semantics
    Declarative knowledge: we describe what holds without caring about how it can be deduced
Automated reasoners can deduce (infer) conclusions from the given knowledge
Machine understandable…

Published facts
    B related-to A
    C related-to A
    D related-to C
Query
    Return all entities related to A
    ?x related-to A
Result
    B
    C
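
A minimal sketch of this query, with the published facts stored as (subject, predicate, object) triples and the query answered by direct lookup only. The function name `related_to` is an assumption for illustration.

```python
# Published facts as (subject, predicate, object) triples.
facts = {
    ("B", "related-to", "A"),
    ("C", "related-to", "A"),
    ("D", "related-to", "C"),
}


def related_to(target, facts):
    """Answer '?x related-to target' using only the stated facts."""
    return sorted(s for s, p, o in facts if p == "related-to" and o == target)


print(related_to("A", facts))  # ['B', 'C']
```

D is not returned: nothing in the stated facts links D to A directly, and no rule says how facts may be combined.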
Machine understandable + inference

Published facts
    B related-to A
    C related-to A
    D related-to C
    also declare that related-to is transitive
        ?x related-to ?y and ?y related-to ?z => ?x related-to ?z
Query
    Return all entities related to A
    ?x related-to A
Result
    B
    C
    D
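
One way to sketch the effect of the transitivity declaration is to apply the rule repeatedly until no new fact appears, then run the same query. This naive fixpoint loop is an illustrative choice, not how a production reasoner would necessarily work, and the function names are assumptions.

```python
# (x, y) means "x related-to y".
facts = {("B", "A"), ("C", "A"), ("D", "C")}


def transitive_closure(pairs):
    """Apply  x r y and y r z => x r z  until nothing new is derived."""
    closure = set(pairs)
    while True:
        derived = {(x, z) for x, y in closure for y2, z in closure if y == y2}
        if derived <= closure:
            return closure
        closure |= derived


def related_to(target, pairs):
    """Answer '?x related-to target' over stated and inferred facts."""
    return sorted(x for x, y in transitive_closure(pairs) if y == target)


print(related_to("A", facts))  # ['B', 'C', 'D']
```

D now appears because "D related-to C" and "C related-to A" together yield the inferred fact "D related-to A".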

Software Agents

Software agents work autonomously and proactively
    They evolved out of object-oriented and component-based programming
A personal agent on the Semantic Web will:
    receive some tasks and preferences from the person
    seek information from Web sources, communicate with other agents
    compare information about user requirements and preferences, suggest certain choices
    recommend answers to the user
Semantic Web Layered Approach
In the following lectures…

We will explore some of the technologies mentioned in the SW layered approach, particularly those in the lower layers:
    present an overview of these technologies
    walk through examples and
    discuss their importance vis-à-vis application areas
Suggested reading…

Textbook: A Semantic Web Primer, Chapter 1
T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web.
    http://www.cs.um.edu.mt/~cabe2/lectures/sw/papers/The_Semantic_Web.pdf
J. Hendler, Agents and the Semantic Web.
    http://www.cs.umd.edu/users/hendler/AgentWeb.html
Further reading
E. Dumbill, The Semantic Web: A Primer.
    http://www.xml.com/pub/a/2000/11/01/semanticweb/
S. Palmer, The Semantic Web: An Introduction.
    http://infomesh.net/2001/swintro/
Next lecture

Introduction to XML
    DTD
    XML schema
    Comparison
Extra slides
Another typical Example
prof(X) => facultyMember(X)
facultyMember(X) => staffMember(X)
prof(michael)
We can deduce the following conclusions:
    facultyMember(michael)
    staffMember(michael)
    prof(michael) => staffMember(michael)
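
Reading the first two lines as implications, the deduction can be sketched as simple forward chaining: keep applying the rules to the known facts until nothing new is added. The data layout and the function name `forward_chain` are assumptions for illustration.

```python
# Rules of the form premise(X) => conclusion(X).
rules = [("prof", "facultyMember"), ("facultyMember", "staffMember")]
# Facts as (predicate, individual) pairs.
facts = {("prof", "michael")}


def forward_chain(facts, rules):
    """Apply every rule to every known fact until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(known):
                if predicate == premise and (conclusion, individual) not in known:
                    known.add((conclusion, individual))
                    changed = True
    return known


derived = forward_chain(facts, rules) - facts
print(sorted(derived))  # [('facultyMember', 'michael'), ('staffMember', 'michael')]
```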