Querying on the Web:
XQuery, RDQL, SPARQL
Semantic Web - Spring 2006
Computer Engineering Department
Sharif University of Technology
Outline
• XQuery
– Querying on XML Data
• RDQL
– Querying on RDF Data
• SPARQL
– Another RDF query language (under development)
Requirements for an XML Query Language
David Maier, W3C XML Query Requirements:
• Closedness: output must be XML
• Composability: wherever a set of XML elements is
required, a subquery is allowed as well
• Can benefit from a schema, but should also be applicable
without one
• Retains the order of nodes
• Formal semantics
How Does One Design a Query Language?
• In most query languages, there are two aspects to
a query:
– Retrieving data (e.g., from … where … in SQL)
– Creating output (e.g., select … in SQL)
• Retrieval consists of
– Pattern matching (e.g., from … )
– Filtering (e.g., where … )
… although these cannot always be clearly distinguished
XQuery Principles
• A language for querying XML documents
• Data Model identical with the XPath data model
– documents are ordered, labeled trees
– nodes have identity
– nodes can have simple or complex types
(defined in XML Schema)
• XQuery can be used without schemas, but can be checked against
DTDs and XML schemas
• XQuery is a functional language
– no statements
– evaluation of expressions
Sample data
A Query over the Recipes Document
<titles>
{for $r in doc("recipes.xml")//recipe
return
$r/title}
</titles>
returns
<titles>
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
<title>Ricotta Pie</title>
…
</titles>
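The same iterate-and-return pattern can be tried outside an XQuery engine. The Python sketch below uses the standard library module xml.etree.ElementTree. The inline XML is a hypothetical stand-in for recipes.xml (the real file is not reproduced here), and the function name titles_document is an assumption made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml; only the titles matter here.
RECIPES_XML = """
<collection>
  <recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>
  <recipe><title>Ricotta Pie</title></recipe>
</collection>
"""

def titles_document(xml_text):
    """Mimic <titles>{for $r in doc(...)//recipe return $r/title}</titles>."""
    root = ET.fromstring(xml_text)
    titles = ET.Element("titles")        # literal part, returned as is
    for r in root.iter("recipe"):        # for $r in ...//recipe
        for t in r.findall("title"):     # return $r/title
            titles.append(t)
    return titles

result = titles_document(RECIPES_XML)
print(ET.tostring(result, encoding="unicode"))
```

As in the XQuery version, the wrapper element is written once and the loop contributes one title element per binding of the variable.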
Query Features
<titles>
{for $r in doc("recipes.xml")//recipe
return
$r/title}
</titles>
• <titles> and </titles>: part to be returned as it is given
• {...}: part to be evaluated
• doc(String): returns the input document
• for: iteration
• $r: a variable ($var)
• //recipe and /title: XPath
• return: sequence of results, one for each variable binding
Features: Summary
• The result is a new XML document
• A query consists of parts that are returned as is
• ... and others that are evaluated (everything in {...} )
• Calling the function doc(String) returns an input document
• XPath is used to retrieve node sets and values
• Iteration over node sets:
for binds a variable to each node of a node set in turn
• Variables can be used in XPath expressions
• return returns a sequence of results,
one for each binding of a variable
XPath is a Fragment of XQuery
• doc("recipes.xml")//recipe[1]/title
returns
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
an element
• doc("recipes.xml")//recipe[position()<=3]
/title
returns
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>
a list of elements
Beware: XPath Attributes
• doc("recipes.xml")//recipe[1]/ingredient[1]
/@name
→ attribute name {"beef cube steak"}
a constructor for an attribute node
• string(doc("recipes.xml")//recipe[1]
/ingredient[1]/@name)
→
"beef cube steak"
a value of type string
XPath Attributes (cntd.)
• <first-ingredient>
{string(doc("recipes.xml")//recipe[1]
/ingredient[1]/@name)}
</first-ingredient>
→
<first-ingredient>beef cube steak</first-ingredient>
an element with string content
XPath Attributes (cntd.)
• <first-ingredient>
{doc("recipes.xml")//recipe[1]
/ingredient[1]/@name}
</first-ingredient>
→
<first-ingredient name="beef cube steak"/>
an element with an attribute
XPath Attributes (cntd.)
• <first-ingredient
oldName="{doc("recipes.xml")//recipe[1]
/ingredient[1]/@name}">
Beef
</first-ingredient>
→
<first-ingredient oldName="beef cube steak">
Beef
</first-ingredient>
An attribute is cast as a string
Iteration with the For-Clause
Syntax:
for $var in xpath-expr
Example: for $r in doc("recipes.xml")//recipe
return string($r)
• The expression creates a list of bindings for a variable $var
If $var occurs in an expression exp,
then exp is evaluated for each binding
• For-clauses can be nested:
for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...
Nested For-clauses: Example
<my-recipes>
{for $r in doc("recipes.xml")//recipe
return
<my-recipe title="{$r/title}">
{for $i in $r//ingredient
return
<my-ingredient>
{string($i/@name)}
</my-ingredient>
}
</my-recipe>
}
</my-recipes>
Returns my-recipes with titles as attributes
and my-ingredients with names as text content
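The nesting of the two for-clauses corresponds to two nested loops. A minimal Python sketch, assuming a small hypothetical recipe document (the data and the function name my_recipes are not part of the original material):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one recipe with a compound ingredient.
RECIPES_XML = """
<collection>
  <recipe>
    <title>Ricotta Pie</title>
    <ingredient name="filling">
      <ingredient name="ricotta cheese"/>
      <ingredient name="eggs"/>
    </ingredient>
    <ingredient name="pie crust"/>
  </recipe>
</collection>
"""

def my_recipes(xml_text):
    root = ET.fromstring(xml_text)
    out = ET.Element("my-recipes")
    for r in root.iter("recipe"):                 # outer for: //recipe
        # The title becomes an attribute value, i.e. a plain string.
        mine = ET.SubElement(out, "my-recipe", title=r.findtext("title"))
        for i in r.iter("ingredient"):            # inner for: $r//ingredient
            # The name attribute becomes text content.
            ET.SubElement(mine, "my-ingredient").text = i.get("name")
    return out

tree = my_recipes(RECIPES_XML)
print(ET.tostring(tree, encoding="unicode"))
```

Because $r//ingredient selects descendants at any depth, the sub-ingredients of a compound ingredient appear in the output as well, in document order.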
The Let Clause
Syntax: let $var := xpath-expr
• binds variable $var to a list of nodes,
with the nodes in document order
• does not iterate over the list
• allows one to keep intermediate results for reuse
(not possible in SQL)
Example:
let $ooreps := doc("recipes.xml")//recipe
[.//ingredient/@name="olive oil"]
Let Clause: Example
Calories of recipes with olive oil:
<calory-content>
{let $ooreps := doc("recipes.xml")//recipe
[.//ingredient/@name="olive oil"]
for $r in $ooreps return
<calories>
{$r/title/text()}
{": "}
{string($r/nutrition/@calories)}
</calories>}
</calory-content>
Note the implicit string concatenation
Let Clause: Example (cntd.)
The query returns:
<calory-content>
<calories>Beef Parmesan: 1167</calories>
<calories>Linguine Pescadoro: 532</calories>
</calory-content>
The Where Clause
Syntax: where <condition>
• occurs before return clause
• similar to predicates in XPath
• comparisons on nodes:
– "is" for node identity ("=" compares values)
– "<<" and ">>" for document order
• Example:
for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...
Quantifiers
• Syntax:
some/every $var in <node-set>
satisfies <expr>
• $var is bound to all nodes in <node-set>
• Test succeeds if <expr> is true for some/every binding
• Note: if <node-set> is empty, then
“some” is false and “every” is true
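The behaviour of the two quantifiers, including the empty case, matches Python's built-in any and all. The sketch below is only an analogy: the helper names some and every and the ingredient tuples are hypothetical.

```python
def some(items, predicate):
    """True if the predicate holds for at least one item (false on empty input)."""
    return any(predicate(x) for x in items)

def every(items, predicate):
    """True if the predicate holds for all items (true on empty input)."""
    return all(predicate(x) for x in items)

# Hypothetical ingredients: (name, list of sub-ingredient names).
ingredients = [("filling", ["ricotta cheese", "eggs"]), ("pie crust", [])]

def is_compound(ingredient):
    return len(ingredient[1]) > 0

has_compound = some(ingredients, is_compound)
all_simple = every(ingredients, lambda i: not is_compound(i))

empty_some = some([], is_compound)
empty_every = every([], is_compound)
print(has_compound, all_simple, empty_some, empty_every)
```

A recipe without any ingredient therefore passes an "every" test trivially, which is worth keeping in mind when reading the results of such a query.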
Quantifiers (Example)
• Recipes that have some compound ingredient
for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient
satisfies $i/ingredient
return $r/title
• Recipes where every ingredient is non-compound
for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient
satisfies not($i/ingredient)
return $r/title
Element Fusion
“To every recipe, add the attribute calories!”
<result>
{let $rs := doc("recipes.xml")//recipe
for $r in $rs return
<recipe>
{$r/nutrition/@calories}
{$r/title}
</recipe>}
</result>
– {$r/nutrition/@calories} contributes an attribute
– {$r/title} contributes an element
Element Fusion (cntd.)
The query result:
<result>
<recipe calories="1167">
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
</recipe>
<recipe calories="349">
<title>Ricotta Pie</title>
</recipe>
<recipe calories="532">
<title>Linguine Pescadoro</title>
</recipe>
</result>
Eliminating Duplicates
The function distinct-values(Node Set)
– extracts the values of a sequence of nodes
– creates a duplicate free sequence of values
Note the coercion: nodes are cast as values!
Example:
let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)
yields
"beef cube steak",
"onion, sliced into thin rings",
...
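The two steps (extract the values, then drop duplicates) can be sketched in Python as follows. The sample XML and the function name distinct_names are hypothetical. The sketch keeps the order of first occurrence, which is a choice made here; XQuery leaves the order of the distinct-values result to the implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in which "olive oil" occurs in two recipes.
RECIPES_XML = """
<collection>
  <recipe>
    <ingredient name="beef cube steak"/>
    <ingredient name="olive oil"/>
  </recipe>
  <recipe>
    <ingredient name="olive oil"/>
    <ingredient name="eggs"/>
  </recipe>
</collection>
"""

def distinct_names(xml_text):
    root = ET.fromstring(xml_text)
    # Step 1: nodes are turned into values (attribute nodes into strings).
    names = [i.get("name") for i in root.iter("ingredient")
             if i.get("name") is not None]
    # Step 2: duplicates are removed; dict keys keep first-occurrence order.
    return list(dict.fromkeys(names))

names = distinct_names(RECIPES_XML)
print(names)
```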
The Order By Clause
Syntax:
order by expr [ ascending | descending ]
for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)
yields
"whole peppercorns",
"whole baby clams",
"white sugar",
...
The Order By Clause (cntd.)
The interpreter must be told whether the values should be regarded
as numbers or as strings
(alphanumerical sorting is default)
for $r in $rs
order by number($r/nutrition/@calories)
return $r/title
Note:
– The query returns titles ...
– but the ordering is according to calories,
which do not appear in the output
Not possible in strict SQL-92, where sort keys must be output columns!
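Why the conversion with number() matters can be seen by sorting the same calorie values once as strings and once as numbers. The Python sketch below uses the three calorie values shown earlier in these slides; the variable names are assumptions for illustration.

```python
# (title, calories as they appear in the attribute, i.e. as strings)
recipes = [
    ("Ricotta Pie", "349"),
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Linguine Pescadoro", "532"),
]

# Default-like behaviour: compare the attribute values as strings.
by_string = [title for title, cal in sorted(recipes, key=lambda r: r[1])]

# With an explicit numeric conversion, as number(...) does in the query.
by_number = [title for title, cal in sorted(recipes, key=lambda r: float(r[1]))]

print(by_string)
print(by_number)
```

As strings, "1167" sorts before "349" because the comparison goes character by character; as numbers, 349 comes first. In both cases only the titles are returned, while the sort key stays out of the output.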
Grouping and Aggregation
Aggregation functions count, sum, avg, min, max
Example: The number of simple ingredients
per recipe
for $r in doc("recipes.xml")//recipe
return
<number>
{attribute {"title"} {$r/title/text()}}
{count($r//ingredient[not(ingredient)])}
</number>
Grouping and Aggregation (cntd.)
The query result:
<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,
<number title="Ricotta Pie">12</number>,
<number title="Linguine Pescadoro">15</number>,
<number title="Zuppa Inglese">8</number>,
<number title="Cailles en Sarcophages">30</number>
Nested Aggregation
“The recipe with the maximal number of calories!”
let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)
returns
"Cailles en Sarcophages"
Running Queries with Galax
• Galax is an open-source implementation of
XQuery (http://www.galaxquery.org/)
– The main developers have taken part in the definition of
XQuery
RDQL
Querying on RDF data
Introduction
• RDF Data Query Language
• JDBC/ODBC friendly
• Simple:
SELECT some information
FROM somewhere
WHERE these patterns match
AND these constraints hold
USING these vocabularies
Example
Example
• q1 contains a query:
SELECT ?x
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")
• To execute q1 against a model m1.rdf:
java jena.rdfquery --data m1.rdf --query q1
• The outcome is:
x
=============================
<http://somewhere/JohnSmith/>
Example
• Return all the resources that have property FN and
the associated values:
SELECT ?x, ?fname
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)
• The outcome is:
x                              | fname
================================================
<http://somewhere/JohnSmith/>  | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/>  | "Matt Jones"
Example
• Return the given names of all people whose family name is Jones:
SELECT ?givenName
WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),
(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)
• The outcome is:
givenName
=========
"Matthew"
"Sarah"
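The two triple patterns of the Jones query share the variable ?y, so each solution must bind ?y to the same node in both patterns. The Python sketch below shows one way such matching can work. It is not how Jena evaluates RDQL; the triples, the blank-node labels, and the function names are hypothetical.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

# Hypothetical model: (subject, predicate, object) triples.
TRIPLES = [
    ("_:matt", VCARD + "Family", "Jones"),
    ("_:matt", VCARD + "Given", "Matthew"),
    ("_:sarah", VCARD + "Family", "Jones"),
    ("_:sarah", VCARD + "Given", "Sarah"),
    ("_:john", VCARD + "Family", "Smith"),
    ("_:john", VCARD + "Given", "John"),
]

def is_var(term):
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Return the binding extended so that pattern matches triple, else None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:   # variable already bound differently
                return None
        elif p != t:                             # constant does not match
            return None
    return extended

def query(patterns, triples):
    """Evaluate the patterns left to right, keeping all consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for t in triples:
                extended = match_pattern(pattern, t, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

solutions = query(
    [("?y", VCARD + "Family", "Jones"), ("?y", VCARD + "Given", "?givenName")],
    TRIPLES,
)
given_names = [s["?givenName"] for s in solutions]
print(given_names)
```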
URI Prefixes : USING
• RDQL has a syntactic convenience that allows prefix
strings to be defined in the USING clause :
SELECT ?x
WHERE (?x, vCard:FN, "John Smith")
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?givenName
WHERE (?y, vCard:Family, "Smith"),
(?y, vCard:Given, ?givenName)
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
Filters
• RDQL can filter the values bound to variables with
constraints in the AND clause:
SELECT ?resource
WHERE (?resource, info:age, ?age)
AND ?age >= 24
USING info FOR <http://somewhere/peopleInfo#>
Another Example
SELECT
?title ?description ?orbit ?satellite ?sensor ?date
FROM
<http://earth.esa.int/showcase/ers/dublin.rdf>
WHERE
(?item <dc:title> ?title)
(?item <dc:description> ?description)
(?item <isc:orbit> ?orbit)
(?item <isc:satellite> ?satellite)
(?item <isc:sensor> ?sensor)
(?item <dc:date> ?date)
USING
isc FOR <http://earth.esa.int/standards/showcase/>
dc FOR <http://purl.org/dc/elements/1.1/>
rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>
Implementations
• Jena
– http://jena.sourceforge.net/
• Sesame
– http://sesame.aidministrator.nl/
• RDFStore
– http://rdfstore.sourceforge.net/
Limitation
• Does not take into account semantics of RDFS
• For example:
ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student
Query: “To which classes does the resource John belong?”
Expected answer: ex:student, ex:human, ex:animal
However, the query:
SELECT ?x
WHERE (<http://example.org/#john>, rdf:type, ?x)
USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Yields only:
<http://example.org/#student>
• Solution: Inference Engines
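What an inference engine adds for this example is the transitive closure over rdfs:subClassOf. A minimal Python sketch of that single rule, using the three statements from the slide (the dictionary layout and the function name inferred_types are assumptions, and a real RDFS reasoner applies many more rules):

```python
# Direct rdfs:subClassOf statements from the example.
SUBCLASS_OF = {
    "ex:student": ["ex:human"],
    "ex:human": ["ex:animal"],
}

# Asserted rdf:type statements from the example.
TYPES = {"ex:john": ["ex:student"]}

def inferred_types(resource):
    """Return asserted types plus every superclass reachable via subClassOf."""
    result = []
    stack = list(TYPES.get(resource, []))
    while stack:
        cls = stack.pop()
        if cls not in result:            # also guards against subclass cycles
            result.append(cls)
            stack.extend(SUBCLASS_OF.get(cls, []))
    return result

john_classes = inferred_types("ex:john")
print(john_classes)
```

The plain RDQL query corresponds to reading TYPES only, which is why it returns just the student class.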
SPARQL
Introduction
• An RDF query language currently under
development by W3C
• Builds on previous RDF query languages such as
rdfDB, RDQL, and SeRQL.
Example RDF
Example
• Simple Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <bloggers.rdf>
WHERE {
?contributor foaf:name "Jon Foobar" .
?contributor foaf:weblog ?url .
}
Example (cont.)
• Optional block:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?depiction
WHERE { ?person foaf:name ?name .
OPTIONAL { ?person foaf:depiction ?depiction . }
}
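The effect of OPTIONAL is that a person without a depiction still appears in the result, with ?depiction left unbound. The Python sketch below imitates this for the query above; the triples and the function name are hypothetical, and prefixed names are kept as plain strings for brevity.

```python
# Hypothetical FOAF-like data as (subject, predicate, object) triples.
TRIPLES = [
    ("_:a", "foaf:name", "Jon Foobar"),
    ("_:a", "foaf:depiction", "http://example.org/jon.png"),
    ("_:b", "foaf:name", "Jane Doe"),
]

def names_with_optional_depiction(triples):
    rows = []
    for person, predicate, name in triples:
        if predicate != "foaf:name":          # required pattern
            continue
        depictions = [o for s, p, o in triples
                      if s == person and p == "foaf:depiction"]
        if depictions:                        # optional pattern matched
            for d in depictions:
                rows.append({"name": name, "depiction": d})
        else:                                 # optional pattern failed: keep the row
            rows.append({"name": name})       # ?depiction stays unbound
    return rows

rows = names_with_optional_depiction(TRIPLES)
print(rows)
```

Without the OPTIONAL block, that is with both patterns required, the second person would be dropped from the result.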
Example (cont.)
• Alternative matches:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?name ?mbox
WHERE {
?person foaf:name ?name .
{
{ ?person foaf:mbox ?mbox } UNION
{ ?person foaf:mbox_sha1sum ?mbox }
}
}
• SPARQL has many other features that are out of scope for this class.
Refer to the references for more information.
References
• http://www.w3.org/TR/xquery/
• A Programmer's Introduction to RDQL
– http://jena.sourceforge.net/tutorial/RDQL/
• http://rdfstore.sourceforge.net/
• http://jena.sourceforge.net
• http://sesame.aidministrator.nl/
• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
• http://www-128.ibm.com/developerworks/java/library/j-sparql/