Database Systems I
Query Languages for XML
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester
Query Languages for XML
XPath is a simple query language based on
describing similar paths in XML documents.
XQuery extends XPath in a style similar to SQL,
introducing iterations, subqueries, etc.
XPath and XQuery expressions are applied to an
XML document and return a sequence of
qualifying items.
Items can be primitive values or nodes (elements,
attributes, documents).
The items returned do not need to be of the
same type.
XPath
A path expression returns the sequence of all
qualifying items that are reachable from the
input item following the specified path.
A path expression is a sequence consisting of
tags or attributes and special characters such as
slashes (“/”).
Absolute path expressions are applied to some
XML document and return all elements that
are reachable from the document’s root element
following the specified path.
Relative path expressions are applied to an
arbitrary node.
XPath
<?xml version="1.0" standalone="yes"?>
<bibliography>
<book bookID="b100"> <title> Foundations… </title>
<author> Abiteboul </author>
<author> Hull </author>
<author> Vianu </author>
<publisher> Addison Wesley </publisher>
<year> 1995 </year> </book>
…
</bibliography>
Applied to the above document, the XPath expression
/bibliography/book/author returns the sequence
<author> Abiteboul </author>
<author> Hull </author>
<author> Vianu </author> . . .
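The path evaluation above can be tried out in Python. This is only a sketch using the standard library module xml.etree.ElementTree, which supports a small subset of XPath; the document string DOC is a trimmed, hypothetical copy of the bibliography above. ElementTree paths are relative to the element they are called on, so the absolute path /bibliography/book/author is written as book/author starting from the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the bibliography document above.
DOC = """
<bibliography>
  <book bookID="b100">
    <title>Foundations…</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(DOC)  # root is the <bibliography> element

# Counterpart of /bibliography/book/author, relative to the root element.
authors = [a.text for a in root.findall("book/author")]
print(authors)
```

The result is a list of author elements in document order, which mirrors the sequence returned by the XPath expression.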
Attributes
If we do not want to return the qualifying elements,
but the value of one of their attributes, we end the
path expression with @attribute.
Applied to the above document, the XPath
expression
/bibliography/book/@bookID
returns the sequence
"b100" . . .
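A rough Python counterpart, assuming the standard library module xml.etree.ElementTree: its path subset cannot end a path with @attribute, so the sketch selects the book elements and reads the attribute from each one. The second book and its ID b200 are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-book document; only the bookID attributes matter here.
DOC = '<bibliography><book bookID="b100"/><book bookID="b200"/></bibliography>'
root = ET.fromstring(DOC)

# Counterpart of /bibliography/book/@bookID:
# select the elements, then read the attribute value of each.
book_ids = [b.get("bookID") for b in root.findall("book")]
print(book_ids)
```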
Axes
XPath provides a variety of axes, i.e. modes of
navigation through semistructured data.
At each step of a path expression, we can prefix a
tag or attribute name by an axis name and a double colon (::).
For example, the path expression
/child::bibliography/child::book/attribute::bookID
is equivalent to
/bibliography/book/@bookID.
Descendants are all direct and indirect children of a
node.
Axes
Axes include
child,
attribute,
parent,
ancestor,
descendant,
following-sibling,
preceding-sibling,
self, and
descendant-or-self.
XPath has the following shorthands for axes:
/    child
//   descendant-or-self
@    attribute
.    self
..   parent
Axes
<bibliography>
<book bookID="b100"> <title> Foundations… </title>
<author affiliation="IBM"> Abiteboul </author>
<author> Hull </author>
. . . </book>
<article articleID="a245">
<header>
<author authorID="a739"> Codd </author>
<title> A relational database model </title> </header>
<body> . . . </body> </article>
</bibliography>
Applied to the above document, the path expression
/bibliography//author returns the sequence
<author affiliation="IBM"> Abiteboul </author>
<author> Hull </author>
<author authorID="a739"> Codd </author> .
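The difference between the child step and the descendant shorthand can be sketched in Python with xml.etree.ElementTree. The document string is a trimmed, hypothetical copy of the one above (the elided parts are left out), and the paths are written relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the document above.
DOC = """
<bibliography>
  <book bookID="b100">
    <title>Foundations…</title>
    <author affiliation="IBM">Abiteboul</author>
    <author>Hull</author>
  </book>
  <article articleID="a245">
    <header>
      <author authorID="a739">Codd</author>
      <title>A relational database model</title>
    </header>
  </article>
</bibliography>
"""
root = ET.fromstring(DOC)

# Child steps only: authors that are direct children of a book.
book_authors = [a.text for a in root.findall("book/author")]

# Descendant shorthand: authors at any depth below the root.
all_authors = [a.text for a in root.findall(".//author")]
print(book_authors, all_authors)
```

Codd appears only in the second list because that author element is nested inside header, two levels below article.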
Wildcards
We can use wildcards instead of actual tags and
attributes:
* means any tag, and
@* means any attribute.
Examples
/bibliography/*/author returns the sequence
<author affiliation="IBM"> Abiteboul </author>
<author> Hull </author>.
/bibliography//author/@* returns the sequence
"IBM"
"a739".
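Both wildcard examples can be approximated in Python with xml.etree.ElementTree. This is a sketch on a trimmed, hypothetical copy of the document: the * step is supported directly, while @* has no counterpart in ElementTree paths and is emulated by reading every attribute of each author element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the document from the Axes example.
DOC = """
<bibliography>
  <book bookID="b100">
    <author affiliation="IBM">Abiteboul</author>
    <author>Hull</author>
  </book>
  <article articleID="a245">
    <header><author authorID="a739">Codd</author></header>
  </article>
</bibliography>
"""
root = ET.fromstring(DOC)

# Counterpart of /bibliography/*/author: any tag at the second step.
star_authors = [a.text for a in root.findall("*/author")]

# Counterpart of /bibliography//author/@*: all attribute values of all authors.
attr_values = [v for a in root.iter("author") for v in a.attrib.values()]
print(star_authors, attr_values)
```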
Conditions
We can restrict the qualifying paths to those that satisfy a
given condition, surrounded by square brackets.
Conditions can be anything returning a boolean value.
In particular, conditions can be:
[<subpath>=<value>]
there exists a subpath with the specified value
(the subpath is relative to the current element, so it has no leading slash)
[i]
the element is the i-th child of its parent with the specified tag
Example
/bibliography/book[title="Foundations…"]/author[2]
returns
<author> Hull </author>.
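Both kinds of condition are part of the XPath subset of Python's xml.etree.ElementTree, so the example can be sketched directly. The document is hypothetical: the first book mirrors the slides, the second book ("Other") is added only to show that the title condition filters it out. ElementTree compares the complete text of the title element, so the sample titles carry no surrounding whitespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the second book exists only to be filtered out.
DOC = """
<bibliography>
  <book><title>Foundations…</title>
    <author>Abiteboul</author><author>Hull</author><author>Vianu</author>
  </book>
  <book><title>Other</title>
    <author>Smith</author><author>Jones</author>
  </book>
</bibliography>
"""
root = ET.fromstring(DOC)

# Value condition on the book step, position condition on the author step.
path = "book[title='Foundations…']/author[2]"
second_authors = [a.text for a in root.findall(path)]
print(second_authors)
```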
XQuery
XQuery extends XPath, i.e. every XPath
expression is an XQuery expression.
Beyond XPath expressions, XQuery introduces
FLWOR expressions.
Format: for, let, where, order by, return
Evaluation: for/let clauses -> sequence of items -> where clause -> sequence of items -> order by/return clause
XQuery
FLWOR expressions are similar to SQL select . . .
from . . . where . . . queries.
A FLWOR expression has one or more for and let
clauses, in any order.
The where clause is optional.
There is one optional order by clause.
Finally, there is exactly one return clause.
XQuery is case-sensitive.
XQuery and XPath are W3C standards.
XQuery
XQuery is a functional language.
Any XQuery expression can be used in any
place that an expression is expected.
SQL also allows subqueries in many places.
However, SQL does not, for example, allow an
arbitrary subquery as an operand of a comparison
in a WHERE clause.
This implies that every XQuery operator must
be defined for operands that are sequences of
items, not just for individual items.
XQuery Clauses
for $x in expr
Defines node variable $x.
The expression expr evaluates to a sequence of items.
The variable $x is assigned to each item, in turn, and
the body of the for clause is executed once for each
assignment.
let $x := expr
Defines collection variable $x.
The expression expr evaluates to a sequence of items.
The variable is bound to the entire sequence of items.
Useful for common subexpressions and for
aggregations.
XQuery Clauses
where condition
The condition is a boolean expression.
The clause is applied to some item.
If and only if the condition evaluates to true, the
following return clause is executed for that item.
return expression
The result of a FLWOR expression is a sequence of items.
Expression defines the result format for the current
(qualifying) item.
The sequence of items produced by expression is
appended to the sequence of items produced so far.
Document Nodes
The context for a for or let clause is often
provided by a document node.
Typically, the document comes from a file.
The doc function constructs a document node
from a file with a given name.
Examples
doc("bib.xml")
doc("infolab.stanford.edu/~hector/movies.xml")
Interpretation as XQuery Expression
XQuery expressions can be used wherever an XML
expression of any kind is permitted.
Any text string is acceptable as content of a tag or value
of an attribute.
If a string contains an XQuery expression that should be
evaluated, this substring must be surrounded by curly
brackets {}.
Example
for $b in doc("bib.xml")/bibliography/book
return <result id="{$b/@bookID}">{$b/title}</result>
XQuery Examples
Find all books: for vs. let
for $x in doc("bib.xml")/bibliography/book
return <result> {$x} </result>
Returns:
<result> <book>...</book></result>
<result> <book>...</book></result>
<result> <book>...</book></result>
...
let $x := doc("bib.xml")/bibliography/book
return <result> {$x} </result>
Returns:
<result> <book>...</book>
<book>...</book>
<book>...</book>
...
</result>
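The contrast between for and let can be mimicked in Python. This is a sketch with xml.etree.ElementTree on a hypothetical three-book document; the helper wrap is an illustrative name, not part of any library. A for clause corresponds to building one result per item, a let clause to building a single result from the whole sequence.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with three (content-free) books.
DOC = "<bibliography><book>A</book><book>B</book><book>C</book></bibliography>"
books = ET.fromstring(DOC).findall("book")

def wrap(items):
    """Build one <result> element that contains the given items."""
    result = ET.Element("result")
    result.extend(items)
    return result

# "for $x in ...": the return clause runs once per book.
for_results = [wrap([b]) for b in books]

# "let $x := ...": the return clause runs once for the whole sequence.
let_result = wrap(books)

print(len(for_results), len(let_result))
```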
XQuery Examples
Find all titles of books published after 1995.
for $x in doc("bib.xml")/bibliography/book
where $x/year > 1995
return $x/title
Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>
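In Python, the for, where and return clauses of this query map naturally onto a list comprehension. The sketch below uses xml.etree.ElementTree; the titles and years in DOC are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """
<bibliography>
  <book><title>old</title><year>1995</year></book>
  <book><title>abc</title><year>1998</year></book>
  <book><title>def</title><year>2001</year></book>
</bibliography>
"""
root = ET.fromstring(DOC)

titles = [
    b.findtext("title")                  # return $x/title
    for b in root.findall("book")        # for $x in .../bibliography/book
    if int(b.findtext("year")) > 1995    # where $x/year > 1995
]
print(titles)
```

The book from 1995 is excluded because the comparison is strict.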
Ordering the Query Result
The order by clause allows you to order the results of
an XQuery expression.
order by list of expressions
The sort order is based on the value of the first
expression. Ties are broken based on the value of the
second (if necessary, the third, etc.) expression.
By default, the order is ascending.
A descending sort order can be specified using
descending.
Elimination of Duplicates
The built-in function distinct-values eliminates
duplicates from a sequence of result items.
In principle, it applies only to primitive (atomic) types.
It can also be applied to elements, but then it will
remove their tags and return their content as atomic
values, shown here in quotes "".
Example
If $b/title produces
<title> aaa </title> <title> bbb </title>
<title> aaa </title>
then distinct-values($b/title) produces
"aaa" "bbb".
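A Python sketch of the same idea, using xml.etree.ElementTree on hypothetical titles: the elements are first reduced to their text content, then duplicates are dropped. The sketch keeps the first occurrence of each value in document order; XQuery itself leaves the order of the values returned by distinct-values implementation-dependent.

```python
import xml.etree.ElementTree as ET

# Hypothetical titles, one of them duplicated.
DOC = "<titles><title>aaa</title><title>bbb</title><title>aaa</title></titles>"
root = ET.fromstring(DOC)

# Strip the tags: keep only the text content of each title element.
values = [t.text for t in root.findall("title")]

# Drop duplicates; dict keys are unique and keep insertion order.
distinct = list(dict.fromkeys(values))
print(distinct)
```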
XQuery Examples
Find all books published by Morgan Kaufmann
and list them in descending order of their prices.
for $b in doc("bib.xml")
/bibliography/book[publisher="Morgan Kaufmann"]
order by $b/price descending
return $b
Uses order by with option descending.
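The same filter-then-sort pattern can be sketched in Python with xml.etree.ElementTree. Titles, publishers and prices in DOC are hypothetical, and the sketch converts the prices to numbers explicitly before sorting.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """
<bibliography>
  <book><title>abc</title><publisher>Morgan Kaufmann</publisher><price>30</price></book>
  <book><title>def</title><publisher>Addison Wesley</publisher><price>99</price></book>
  <book><title>ghi</title><publisher>Morgan Kaufmann</publisher><price>75</price></book>
</bibliography>
"""
root = ET.fromstring(DOC)

# Condition on the publisher, as in book[publisher="Morgan Kaufmann"].
mk_books = root.findall("book[publisher='Morgan Kaufmann']")

# Descending sort on the numeric price.
ordered = sorted(mk_books, key=lambda b: float(b.findtext("price")), reverse=True)
ordered_titles = [b.findtext("title") for b in ordered]
print(ordered_titles)
```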
XQuery Examples
For each author of a book published by Morgan
Kaufmann, list the author and the titles of all
books she published.
for $a in distinct-values(doc("bib.xml")
/bibliography/book[publisher="Morgan Kaufmann"]/author)
return <result>
<author>{$a}</author>
{for $t in doc("bib.xml")/bibliography/book[author=$a]/title
return $t}
</result>
Uses nested subquery and function distinct-values.
XQuery Examples
Result:
<result>
<author>Jones</author>
<title> abc </title>
<title> def </title>
</result>
<result>
<author> Smith </author>
<title> ghi </title>
</result>
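The nested query can be traced in Python. This sketch uses xml.etree.ElementTree on hypothetical data chosen to reproduce the result above: Jones has one Morgan Kaufmann book and one book with another publisher, and both titles are listed because the inner query ranges over all books.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data matching the result shown above.
DOC = """
<bibliography>
  <book><title>abc</title><author>Jones</author><publisher>Morgan Kaufmann</publisher></book>
  <book><title>def</title><author>Jones</author><publisher>Addison Wesley</publisher></book>
  <book><title>ghi</title><author>Smith</author><publisher>Morgan Kaufmann</publisher></book>
</bibliography>
"""
root = ET.fromstring(DOC)
books = root.findall("book")

# Outer query: distinct authors of Morgan Kaufmann books.
mk_authors = list(dict.fromkeys(
    a.text for a in root.findall("book[publisher='Morgan Kaufmann']/author")
))

# Inner query: for each such author, the titles of all of their books.
results = [
    (name, [b.findtext("title") for b in books
            if name in [a.text for a in b.findall("author")]])
    for name in mk_authors
]
print(results)
```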
Joins
We can join two or more documents by using one
variable for each of the documents.
We let a variable range over the elements of the
corresponding document, within a for-clause.
We need to be careful when comparing elements for
equality: the node comparison operator "is" tests
element identity, not element content.
Typically, we want to compare the element content.
The built-in function data(E) returns the content of an
element E; the comparison operators (=, eq, etc.) apply
it to their operands implicitly.
Example
Find all pairs of titles of books from the same
year.
let $books := doc("bib.xml")
for $b1 in $books/bibliography/book,
$b2 in $books/bibliography/book
where data($b1/year) = data($b2/year)
return <result>{$b1/title} {$b2/title} </result>
Uses two variables ranging over books and the
data function applied to their year elements.
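The self-join corresponds to two nested loops over the same sequence. The Python sketch below uses xml.etree.ElementTree on hypothetical titles and years, and compares the text content of the year elements, which is the role played by data in the query.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """
<bibliography>
  <book><title>abc</title><year>1995</year></book>
  <book><title>def</title><year>1995</year></book>
  <book><title>ghi</title><year>2001</year></book>
</bibliography>
"""
books = ET.fromstring(DOC).findall("book")

pairs = [
    (b1.findtext("title"), b2.findtext("title"))
    for b1 in books                                  # for $b1 in ...
    for b2 in books                                  # $b2 in ...
    if b1.findtext("year") == b2.findtext("year")    # compare content, not identity
]
print(pairs)
```

As written, the query pairs every book with itself and returns each pair of distinct books in both orders; an extra condition would be needed to suppress those pairs.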
Comparison Operators
XQuery supports the standard comparison operators
such as <, >, =.
Comparison operators are applied to a sequence of
items.
Comparisons have an existential nature. I.e., they
return true if and only if at least one of the items
satisfies the condition of the comparison.
for $b in doc("bib.xml")/bibliography/book
where $b/author/firstname = "A"
and $b/author/lastname = "B"
return $b
Books returned can have one author with firstname A and
another author with lastname B.
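The existential behaviour can be reproduced in Python. In this sketch (xml.etree.ElementTree, hypothetical book with two authors) each comparison is an independent membership test over all authors, so the book qualifies although no single author is named A B. Requiring both names on the same author takes one condition per author element, which in XPath could be written as $b/author[firstname="A" and lastname="B"].

```python
import xml.etree.ElementTree as ET

# Hypothetical book: one author has firstname A, another has lastname B.
DOC = """
<bibliography>
  <book>
    <title>mixed</title>
    <author><firstname>A</firstname><lastname>X</lastname></author>
    <author><firstname>Y</firstname><lastname>B</lastname></author>
  </book>
</bibliography>
"""
book = ET.fromstring(DOC).find("book")
authors = book.findall("author")

# Existential semantics: each comparison ranges over all authors separately.
firsts = [a.findtext("firstname") for a in authors]
lasts = [a.findtext("lastname") for a in authors]
existential_match = "A" in firsts and "B" in lasts

# Stricter variant: both conditions must hold for the same author.
same_author_match = any(
    a.findtext("firstname") == "A" and a.findtext("lastname") == "B"
    for a in authors
)
print(existential_match, same_author_match)
```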
Comparison Operators
XQuery also supports special comparison operators
that only compare sequences consisting of a single
item: eq, ne, lt, le, gt, ge.
These comparisons fail if one of the operands
contains more than one item.
XQuery also provides built-in functions for
string matching, in particular the substring test
contains($p, "windsurfing").
Quantification
XQuery supports the existential and the universal
quantifier.
Universal quantifier
every $v in expression1 satisfies expression2
Existential quantifier
some $v in expression1 satisfies expression2
Expression1 evaluates to a sequence of items,
expression2 is a boolean expression.
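The two quantifiers behave like Python's any and all. The sketch below uses xml.etree.ElementTree on hypothetical books and paragraphs; the helper text_of is an illustrative name. Note that every is true for an empty sequence, just as all is for an empty iterable, so a book without paragraphs satisfies the universal condition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; book t3 has no paragraphs at all.
DOC = """
<bibliography>
  <book><title>t1</title>
    <para>sailing and windsurfing</para><para>sailing only</para></book>
  <book><title>t2</title><para>windsurfing</para></book>
  <book><title>t3</title></book>
</bibliography>
"""
books = ET.fromstring(DOC).findall("book")

def text_of(elem):
    """Return the full text content of an element, including nested text."""
    return "".join(elem.itertext())

# some $p in $b//para satisfies (contains sailing and contains windsurfing)
some_titles = [
    b.findtext("title") for b in books
    if any("sailing" in text_of(p) and "windsurfing" in text_of(p)
           for p in b.iter("para"))
]

# every $p in $b//para satisfies contains sailing
every_titles = [
    b.findtext("title") for b in books
    if all("sailing" in text_of(p) for p in b.iter("para"))
]
print(some_titles, every_titles)
```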
Aggregation
XQuery provides built-in functions for the standard
aggregations such as sum, min, max, count and avg.
They can be applied to any XQuery expression, i.e. to
any sequence of items.
Example
avg(doc("bib.xml")/bibliography/book/price)
count(doc("bib.xml")/bibliography/book/price)
Compute the average book price and the number of
book prices (which equals the number of books if every
book has exactly one price), respectively.
XQuery Examples
Find books whose price is larger than the
average price.
let $a:=avg(doc("bib.xml")/bibliography/book/price)
for $b in doc("bib.xml")/bibliography/book
where $b/price > $a
return $b
Uses aggregate operator (avg), applied to the
result of a path expression.
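A Python sketch of the same query, using xml.etree.ElementTree on hypothetical titles and prices: the average is computed once over the whole sequence of prices, mirroring the let clause, and each book is then compared against it, mirroring the for and where clauses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """
<bibliography>
  <book><title>a</title><price>10</price></book>
  <book><title>b</title><price>20</price></book>
  <book><title>c</title><price>60</price></book>
</bibliography>
"""
books = ET.fromstring(DOC).findall("book")

# let $a := avg(.../book/price): aggregate over the whole sequence.
prices = [float(b.findtext("price")) for b in books]
price_count = len(prices)
average = sum(prices) / price_count

# for $b in .../book where $b/price > $a return $b
above_average = [
    b.findtext("title") for b in books
    if float(b.findtext("price")) > average
]
print(average, price_count, above_average)
```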
XQuery Examples
Find the titles of books with a paragraph containing
the terms "sailing" and "windsurfing".
for $b in doc("bib.xml")//book
where some $p in $b//para satisfies
contains($p, "sailing") and contains($p, "windsurfing")
return $b/title
Uses existential quantifier (some) and string
matching (contains).
XQuery Examples
Find the titles of books where every paragraph
contains the term "sailing".
for $b in doc("bib.xml")//book
where every $p in $b//para satisfies
contains($p, "sailing")
return $b/title
Uses universal quantifier (every) and string
matching (contains).
Summary
XQuery is the standard XML query language.
It is a functional language, i.e. any XQuery expression
can be used in any place where an expression is
expected.
A FLWOR expression consists of for, let, where, order by
and return clauses, of which some are optional.
The main new concept compared to SQL is path
expressions, which return sequences of items reachable via
the given path.
Path expressions are defined in XPath, a sublanguage
of XQuery.
In addition, XQuery has equivalent constructs for
most of the main SQL constructs, in particular
quantifiers and aggregate functions.