Xquery - CS, Technion

XQuery
Introduction by Examples
Sources
• XQuery 1.0: An XML Query Language. W3C Working Draft, 22 August 2003.
• Don Chamberlin’s SIGMOD 2003 talk: www.almaden.ibm.com/cs/people/chamberlin/sigmod03_xquery.pdf
• XQuery from the Experts. Katz et al.
• Definitive XML Schema. Priscilla Walmsley.
Example of an Input Expression
This is “onebook.xml”:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bib>
<book year="1994">
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
</bib>
Expression and Result
Expression: document("onebook.xml")
Result:
document {
<bib>
<book year="1994">
<title>TCP/IP Illustrated</title>
<author><last>Stevens</last><first>W.</first></author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
</bib>
}
Bibliography Database
<?xml version="1.0" encoding="ISO-8859-1"?>
<bib>
<book year="1994">
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
<book year="2000">
<title>Data on the Web</title>
<author>
<last>Abiteboul</last>
<first>Serge</first>
</author>
<author>
<last>Buneman</last>
<first>Peter</first>
</author>
<author>
<last>Suciu</last>
<first>Dan</first>
</author>
<publisher>Morgan Kaufmann Publishers</publisher>
<price>39.95</price>
</book>
<book year="1999">
<title>The Economics of Technology and Content for Digital TV</title>
<editor>
<last>Gerbarg</last>
<first>Darcy</first>
<affiliation>CITI</affiliation>
</editor>
<publisher>Kluwer Academic Publishers</publisher>
<price>129.95</price>
</book>
</bib>
Some Path Expressions
Expression:
document("xmpbib.xml")//book[2]
Result
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
<author><last>Stevens</last><first>W.</first></author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
</book>
Node Creation
Expression:
<book>
<title> Go West </title>
<author> J. Cowen </author>
<publisher> Smith </publisher>
</book>
Result:
<book>
<title> Go West </title>
<author> J. Cowen </author>
<publisher> Smith </publisher>
</book>
Element Creation - Constructor
Expression:
element title { "Harold and the purple crayon" }
Result:
<title>Harold and the purple crayon</title>
Element Creation - Constructor
Expression:
element {concat("tit","le")} { "Harold and the purple crayon" }
Result:
<title>Harold and the purple crayon</title>
Document Creation
Expression:
document{
<book>
<title> Go West </title>
<author> J. Cowen </author>
<publisher> Smith </publisher>
</book>
}
Result:
document {
<book>
<title> Go West </title>
<author> J. Cowen </author>
<publisher> Smith </publisher>
</book>
}
Node Creation - Computation
Expression:
<example>
<p> here is a query </p>
<eg> document("b.xml")//book[1]/title </eg>
<eq> { document("../docs/xmpbib.xml")//book[1]/title }
</eq>
</example>
Result:
<example>
<p> here is a query </p>
<eg> document("b.xml")//book[1]/title </eg>
<eq><title>TCP/IP Illustrated</title></eq>
</example>
FLWOR
for $b in
document("../docs/xmpbib.xml")//book
where $b/@year = "2000"
return $b/title
Result:
<title>Data on the Web</title>
FLWOR
for $i in (1,2,3)
return
<tuple>
<i>
{$i}
</i>
</tuple>
Result:
<tuple><i>1</i></tuple>, <tuple><i>2</i></tuple>, <tuple><i>3</i></tuple>
Let
let $i := (1,2,3)
return
<tuple>
<i>
{$i}
</i>
</tuple>
Result:
<tuple><i>1 2 3</i></tuple>
Tuple Stream Production
for $i in (1,2,3), $j in (4,5,6)
return
<tuple>
<i>
{$i}
</i>
<j>
{$j}
</j>
</tuple>
Result:
<tuple><i>1</i><j>4</j></tuple>,
<tuple><i>1</i><j>5</j></tuple>,
<tuple><i>1</i><j>6</j></tuple>,
<tuple><i>2</i><j>4</j></tuple>,
<tuple><i>2</i><j>5</j></tuple>,
<tuple><i>2</i><j>6</j></tuple>,
<tuple><i>3</i><j>4</j></tuple>,
<tuple><i>3</i><j>5</j></tuple>,
<tuple><i>3</i><j>6</j></tuple>
Tuple Stream Production
for $i in (1,2,3)
let $j := (1,2,3)
return
<tuple>
<i>
{$i}
</i>
<j>
{$j}
</j>
</tuple>
Result:
<tuple><i>1</i><j>1 2 3</j></tuple>,
<tuple><i>2</i><j>1 2 3</j></tuple>,
<tuple><i>3</i><j>1 2 3</j></tuple>
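The difference between "for" and "let" in the tuple streams above can be mimicked outside XQuery. The following Python sketch is only an analogy, and the function names for_for and for_let are made up for illustration: each "for" clause iterates over its sequence and yields one tuple per item, while "let" binds the whole sequence once.

```python
from itertools import product

def for_for(i_seq, j_seq):
    # for $i in ..., $j in ...: one tuple per (i, j) pair, with $i varying slowest
    return [(i, j) for i, j in product(i_seq, j_seq)]

def for_let(i_seq, j_seq):
    # for $i in ... let $j := ...: one tuple per $i; $j is the whole sequence
    return [(i, tuple(j_seq)) for i in i_seq]

print(for_for((1, 2, 3), (4, 5, 6)))  # 9 tuples
print(for_let((1, 2, 3), (1, 2, 3)))  # 3 tuples
```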
FLWOR + Let
for $b in document("../docs/xmpbib.xml")//book
let $c := $b/author
return
<book>
{ $b/title,
<count>
{ count($c) }
</count>
}
</book>
Result:
<book><title>TCP/IP Illustrated</title><count>1</count></book>,
<book>
<title>Advanced Programming in the Unix environment</title>
<count>1</count>
</book>,
<book><title>Data on the Web</title><count>3</count></book>,
<book>
<title>The Economics of Technology and Content for Digital TV</title>
<count>0</count>
</book>
More FLWOR
for $b in
document("../docs/xmpbib.xml")//book
where $b/price < 50
return $b/title
Result:
<title>Data on the Web</title>
More FLWOR – Let + Count
for $b in
document("../docs/xmpbib.xml")//book
let $c := $b//author
where count($c) > 2
return $b/title
Result:
<title>Data on the Web</title>
More FLWOR – Order
for $t in
document("../docs/xmpbib.xml")//title
order by $t
return $t
Result:
<title>Advanced Programming in the Unix environment</title>,
<title>Data on the Web</title>,
<title>TCP/IP Illustrated</title>,
<title>The Economics of Technology and Content for Digital TV</title>
More FLWOR – Variable Binding
for $b in document("../docs/xmpbib.xml")//book,
$t in $b/title
order by $t
return $t
Result:
<title>Advanced Programming in the Unix environment</title>,
<title>Data on the Web</title>,
<title>TCP/IP Illustrated</title>,
<title>The Economics of Technology and Content for Digital TV</title>
Order Specification
for $b in document("../docs/xmpbib.xml")//book,
$a in $b/author
order by $a/last descending , $a/first ascending
return $a
Result:
<author><last>Suciu</last><first>Dan</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Buneman</last><first>Peter</first></author>,
<author><last>Abiteboul</last><first>Serge</first></author>
Ordering without Outputting
let $b := document("../docs/xmpbib.xml")//book
for $t in distinct-values($b/title)
let $a1 := $b[title = $t]/author[1]
order by $a1/last , $a1/first
return $t
Result:
<title>Data on the Web</title>,
<title>TCP/IP Illustrated</title>,
<title>Advanced Programming in the Unix environment</title>,
<title>The Economics of Technology and Content for Digital
TV</title>
Order Specification – Empty sequences
let $b := document("../docs/xmpbib.xml")//book
for $t in distinct-values($b/title)
let $a1 := $b[title = $t]/author[1]
stable order by $a1/last empty least, $a1/first empty least
return $t
Result:
<title>The Economics of Technology and Content for Digital TV</title>,
<title>Data on the Web</title>,
<title>TCP/IP Illustrated</title>,
<title>Advanced Programming in the Unix environment</title>
Using FLWOR to Bind a Variable
let $authors := for $a in document("../docs/xmpbib.xml")//author
order by $a/last empty least, $a/first empty least
return $a
return $authors
Result:
<author><last>Abiteboul</last><first>Serge</first></author>,
<author><last>Buneman</last><first>Peter</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Suciu</last><first>Dan</first></author>
Using FLWOR to Bind a Variable
(<author><last>Abiteboul</last><first>Serge</first></author>,
<author><last>Buneman</last><first>Peter</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Suciu</last><first>Dan</first></author> ) /last
Result:
<last>Abiteboul</last>,
<last>Buneman</last>,
<last>Stevens</last>,
<last>Stevens</last>,
<last>Suciu</last>
A Bug?
let $authors := for $a in document("../docs/xmpbib.xml")//author
order by $a/last empty least, $a/first empty least
return $a
return $authors/last
Result:
<last>Stevens</last>,
<last>Stevens</last>,
<last>Abiteboul</last>,
<last>Buneman</last>,
<last>Suciu</last>
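This result is not a bug: a path step such as $authors/last returns its nodes in document order, so the ordering established by "order by" is lost. The Python sketch below is only a model of that behaviour; the tuples and the helper names order_by_name and path_step_last are assumptions made for illustration.

```python
# Authors as (document_position, last, first), listed in document order.
authors = [
    (1, "Stevens", "W."),
    (2, "Stevens", "W."),
    (3, "Abiteboul", "Serge"),
    (4, "Buneman", "Peter"),
    (5, "Suciu", "Dan"),
]

def order_by_name(seq):
    # models: order by $a/last, $a/first
    return sorted(seq, key=lambda a: (a[1], a[2]))

def path_step_last(seq):
    # models: $seq/last, whose result nodes come back in document order
    return [a[1] for a in sorted(seq, key=lambda a: a[0])]

sorted_authors = order_by_name(authors)
print([a[1] for a in sorted_authors])  # ordering kept when the items are returned directly
print(path_step_last(sorted_authors))  # ordering lost after the path step
```

Returning $a/last inside the FLWOR, as in the next example, applies the path step to one author at a time and therefore keeps the sorted order.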
Another Ordering Example
for $a in document("../docs/xmpbib.xml")//author
order by $a/last empty least, $a/first empty least
return $a/last
Result:
<last>Abiteboul</last>,
<last>Buneman</last>,
<last>Stevens</last>,
<last>Stevens</last>,
<last>Suciu</last>
Predicates
let $a := (1,2,3) [. > 2 or . = 1]
return
$a
Result:
1, 3
Predicates
let $a := (1,2,3) [. > 2 or . = 1]
for $i in $a
return
<r>
{ $i }
</r>
Result:
<r>1</r>, <r>3</r>
Computing Content
for $t in (document("../docs/xmpbib.xml")//title)
[count(../author) < 3 ]
return
<review>
{ $t }
</review>
Result:
<review><title>TCP/IP Illustrated</title></review>,
<review><title>Advanced Programming in the Unix environment</title></review>,
<review>
<title>The Economics of Technology and Content for Digital TV</title>
</review>
Returning a Variable Content
let $i := (1, 2, 3)
return
$i
Result:
1, 2, 3
Computing Content
for $b in document("../docs/xmpbib.xml")//book
return
<quote>
{$b/title , $b/price}
</quote>
Result:
<quote><title>TCP/IP Illustrated</title><price>65.95</price></quote>,
<quote>
<title>Advanced Programming in the Unix environment</title>
<price>65.95</price>
</quote>,
<quote><title>Data on the Web</title><price>39.95</price></quote>,
<quote>
<title>The Economics of Technology and Content for Digital TV</title>
<price>129.95</price>
</quote>
Computing Content – Similar
for $b in
document("../docs/xmpbib.xml")//book
return
<quote>
{$b/author , $b/price}
</quote>
Result:
<quote>
<author><last>Stevens</last><first>W.</first></author>
<price>65.95</price>
</quote>,
<quote>
<author><last>Stevens</last><first>W.</first></author>
<price>65.95</price>
</quote>,
<quote>
<author><last>Abiteboul</last><first>Serge</first></author>
<author><last>Buneman</last><first>Peter</first></author>
<author><last>Suciu</last><first>Dan</first></author>
<price>39.95</price>
</quote>,
<quote><price>129.95</price></quote>
More Computing Content
let $i := (1, 2, 3)
return
<result>
{for $a in $i
return <a> { $a } </a> }
</result>
Result:
<result><a>1</a><a>2</a><a>3</a></result>
Some Fine Points
• The part of a direct element constructor between
the start tag and the end tag is called the
content of the element constructor.
• This content may consist of literal text
characters, nested element constructors, and
expressions enclosed in curly braces.
• In general, the value of an enclosed expression
may be any sequence of nodes and/or atomic
values.
• Enclosed expressions can be used in the
content of an element constructor to compute
both the content and the attributes of the
constructed node
Conceptually, the content of an element constructor is
processed as follows:
1. The content is evaluated to produce a sequence of
nodes called the content sequence, as follows:
1. Predefined entity references and character
references are expanded into their referenced
strings, as described in 3.1.1 Literals.
2. Each consecutive sequence of literal characters
evaluates to a single text node containing the
characters. However, if the sequence consists
entirely of boundary whitespace as defined in
3.7.1.4 Whitespace in Element Content and the
Prolog does not specify xmlspace = preserve, then
no text node is generated.
3. Each nested element constructor is evaluated
according to the rules in this section, resulting in a
new element node.
Cont.
4. Enclosed expressions are evaluated as follows: For
each node returned by an enclosed expression, a
new deep copy of the node is constructed, including
all its children, attributes, and namespace nodes (if
any). Each copied node has a new node identity.
Copied element nodes are given the type
annotation xs:anyType, and copied attribute nodes
are given the type annotation xs:anySimpleType.
For each adjacent sequence of one or more atomic
values returned by an enclosed expression, a new
text node is constructed, containing the result of
casting each atomic value to a string, with a single
blank character inserted between adjacent values.
Cont.
2. If the content sequence contains a document
node, a type error is raised.[err:XQ0023]
3. If the content sequence contains an attribute
node following a node that is not an attribute
node, a type error is raised.[err:XQ0024]
Attribute nodes occurring at the beginning of
the content sequence become attributes of the
new element node. If two or more attributes of
the new element node have the same name, a
dynamic error is raised.[err:XQ0025]
Cont.
4. Adjacent text nodes in the content sequence
are coalesced into a single text node by
concatenating their contents, with no
intervening blanks.
5. The resulting sequence of nodes becomes the
children and attributes of the new element
node in the data model representation.
6. The new element node is automatically
validated, as described in 3.7.1.5 Type of a
Constructed Element.
Examples
• Example:
<a>{1}</a> The constructed element node has
one child, a text node containing the value "1".
• Example:
<a>{1, 2, 3}</a> The constructed element node
has one child, a text node containing the value
"1 2 3".
• Example:
<c>{1}{2}{3}</c> The constructed element node
has one child, a text node containing the value
"123".
Examples
• Example:
<b>{1, "2", "3"}</b> The constructed element
node has one child, a text node containing the
value "1 2 3".
• Example:
<fact>I saw 8 cats.</fact> The constructed
element node has one child, a text node
containing the value "I saw 8 cats.".
• Example:
<fact>I saw {5 + 3} cats.</fact> The constructed
element node has one child, a text node
containing the value "I saw 8 cats.".
Examples
• Example:
<fact>I saw <howmany>{5 +
3}</howmany> cats.</fact> The
constructed element node has three
children: a text node containing "I saw ", a
child element node named howmany, and
a text node containing " cats.". The child
element node in turn has a single text
node child containing the value "8".
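The text-node rules behind these examples can be sketched in a few lines. The Python function below is a simplified model that handles only literal text and atomic values (no nested nodes or attributes); the name element_content and its input encoding are assumptions made for illustration.

```python
def element_content(parts):
    """Model the text content of a direct element constructor.

    parts: a list in which a str is literal text and a list holds the
    atomic values returned by one enclosed expression { ... }.
    """
    text_nodes = []
    for part in parts:
        if isinstance(part, str):
            text_nodes.append(part)  # literal characters form one text node
        else:
            # adjacent atomic values: cast each to a string and put a
            # single blank between neighbouring values
            text_nodes.append(" ".join(str(v) for v in part))
    # adjacent text nodes are coalesced with no intervening blanks
    return "".join(text_nodes)

print(element_content([[1, 2, 3]]))                   # <a>{1, 2, 3}</a>
print(element_content([[1], [2], [3]]))               # <c>{1}{2}{3}</c>
print(element_content(["I saw ", [5 + 3], " cats."])) # <fact>I saw {5 + 3} cats.</fact>
```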
Example
for $a in document("../docs/xmpbib.xml")//author
return
<author>
{string($a/first) , "
", $a/last}
</author>
Result:
<author>W.
<last>Stevens</last></author>,
<author>W.
<last>Stevens</last></author>,
<author>Serge
<last>Abiteboul</last></author>,
<author>Peter
<last>Buneman</last></author>,
<author>Dan
<last>Suciu</last></author>
Example
for $a in document("../docs/xmpbib.xml")//author
return
<author>
{string($a/first) , "
", string($a/last)}
</author>
Result:
<author>W.
Stevens</author>,
<author>W.
Stevens</author>,
<author>Serge
Abiteboul</author>,
<author>Peter
Buneman</author>,
<author>Dan
Suciu</author>
Example (XHTML)
document {
<table border="1">
<thead>
<tr>
<td>Title</td>
<td>Publisher</td>
<td>Price</td>
<td>Year</td>
</tr>
</thead>
<tbody>
<tr>
<td>TCP/IP</td>
<td>Addison</td>
<td>29.95</td>
<td>1994</td>
</tr>
<tr>
<td>Elements and Attributes</td>
<td>Weston</td>
<td>23</td>
<td>2002</td>
</tr>
</tbody>
</table>
}
Title                    Publisher  Price  Year
TCP/IP                   Addison    29.95  1994
Elements and Attributes  Weston     23     2002
Result
let $t := document("bib.xhtml")
return
$t
Use “at” - Translate XHTML → “Obvious” XML
let $t := document("bib.xhtml")//table
for $r in $t/tbody/tr
return
<book>
{
for $c at $i in $r/td
return
element { lower-case(data($t/thead/tr/td[$i])) }
{ string($c) }
}
</book>
Result:
<book>
<title>TCP/IP</title>
<publisher>Addison</publisher>
<price>29.95</price>
<year>1994</year>
</book>,
<book>
<title>Elements and Attributes</title>
<publisher>Weston</publisher>
<price>23</price>
<year>2002</year>
</book>
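The positional variable bound by "at" plays the role of a 1-based loop index: it lets each cell find the header cell in the same column. A rough Python analogy follows; the lists stand in for the td elements of bib.xhtml, and row_to_book is a hypothetical helper.

```python
header = ["Title", "Publisher", "Price", "Year"]
rows = [
    ["TCP/IP", "Addison", "29.95", "1994"],
    ["Elements and Attributes", "Weston", "23", "2002"],
]

def row_to_book(header, row):
    # for $c at $i in $r/td: $i is the 1-based position of $c, used to
    # look up the matching header cell td[$i]
    fields = []
    for i, cell in enumerate(row, start=1):
        name = header[i - 1].lower()  # models lower-case(data(...td[$i]))
        fields.append(f"<{name}>{cell}</{name}>")
    return "<book>" + "".join(fields) + "</book>"

books = [row_to_book(header, r) for r in rows]
print(books[0])
```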
distinct
distinct-values(
document("../docs/xmpbib.xml")//author/last )
Result:
<last>Stevens</last>,
<last>Abiteboul</last>,
<last>Buneman</last>,
<last>Suciu</last>
distinct-values
for $a in distinct-values(
document("../docs/xmpbib.xml")//author )
return $a
Result:
<author><last>Stevens</last><first>W.</first></author>,
<author><last>Abiteboul</last><first>Serge</first></author>,
<author><last>Buneman</last><first>Peter</first></author>,
<author><last>Suciu</last><first>Dan</first></author>
Distinct
let $a := (<book>Elements Are Redundant</book>,
<book>Elements Are Redundant</book>)
return
<a>
{ distinct-nodes($a) }
but
{ distinct-values($a) }
</a>
Result
<a><book>Elements Are Redundant</book><book>Elements Are Redundant</book>
but
<book>Elements Are Redundant</book></a>
distinct-nodes
• fn:distinct-nodes($srcval as node*) as node*
• Returns the sequence that results from removing
from $srcval all but one of a set of nodes that
have the same identity as one another, based on
node identity (that is, using op:node-equal()).
The order in which the distinct nodes are
returned is ·implementation dependent·. If
$srcval is the empty sequence, returns the
empty sequence. For detailed semantics see
section 6.2.2 of [XQuery 1.0 and XPath 2.0
Formal Semantics].
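The contrast between identity and value can be illustrated with two separately constructed elements that carry the same text. This Python sketch uses object identity as a stand-in for node identity; distinct_nodes and distinct_by_value are illustrative names, not XQuery functions, and comparing by string value is an assumption about how the value-based variant treats nodes.

```python
import xml.etree.ElementTree as ET

# Two separately constructed nodes with the same content.
a = [ET.fromstring("<book>Elements Are Redundant</book>") for _ in range(2)]

def distinct_nodes(nodes):
    # keep one node per identity (Python object identity stands in for it)
    seen, out = set(), []
    for n in nodes:
        if id(n) not in seen:
            seen.add(id(n))
            out.append(n)
    return out

def distinct_by_value(nodes):
    # keep one node per string value
    seen, out = set(), []
    for n in nodes:
        value = "".join(n.itertext())
        if value not in seen:
            seen.add(value)
            out.append(n)
    return out

print(len(distinct_nodes(a)))     # both nodes survive: different identities
print(len(distinct_by_value(a)))  # one node survives: equal values
```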
distinct-values
• fn:distinct-values($srcval as xs:anyAtomicType*) as xs:anyAtomicType*
• fn:distinct-values($srcval as xs:anyAtomicType*, $collationLiteral as xs:string)
as xs:anyAtomicType*
• Returns the sequence that results from removing from $srcval all but
one of a set of values that are eq to one another. All the values must
be of a single type or one of its subtypes (for numeric values, the
numeric promotion rules defined in 6.2 Operators on Numeric
Values are used to promote all values to a single common type).
• The type returned is a sequence of values of the same type as
$srcval. The type must have a total order. If this condition is not
satisfied, an error is raised ("Type does not have total order").
Equality must also be defined for the type. If this condition is not
satisfied, an error is raised ("Type does not have equality defined").
For detailed semantics see section 6.2.2 of [XQuery 1.0 and XPath
2.0 Formal Semantics].
distinct-values
• If $collationLiteral is not in the lexical space of xs:anyURI
an error is raised ("Invalid collationURI").
• If $srcval is the empty sequence, the empty sequence is
returned.
• For xs:float and xs:double values, NaN is considered to
be equal to itself and 0.0 is equal to -0.0.
• If an xs:dateTime, xs:date or xs:time value does not have
a timezone, an implicit timezone is provided by the
evaluation context. The normalized value is adjusted
using this implicit timezone if necessary. The adjusted
normalized value is used to compute distinctness. If
multiple adjusted normalized values compare equal but
the accompanying timezones are different, it is
·implementation dependent· which value is returned.
distinct-values
• Equality of string values is determined according to the
collation that is used. The order of the values returned is
·implementation dependent·. The collation used by the
invocation of this function is determined according to the
rules in 7.3 Equality and Comparison of Strings. If the
type of the items in $srcval is not xs:string and
$collationLiteral is specified, the collation is ignored.
• 15.1.11.1 Examples
• fn:distinct-values((1, 2.0, 3, 2)) returns (1, 3, 2.0).
• So, what about semantics for node sequence argument?
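For atomic values, the rules above (numeric promotion, NaN equal to itself, 0.0 equal to -0.0) can be sketched as follows. This is a simplified Python model rather than the specified algorithm: it ignores collations and timezones, and it keeps the first occurrence of each value, which is only one of the orders an implementation may choose.

```python
import math

def distinct_values(values):
    result = []
    seen_nan = False
    for v in values:
        if isinstance(v, float) and math.isnan(v):
            # NaN is treated as equal to itself, so keep only the first one
            if not seen_nan:
                seen_nan = True
                result.append(v)
            continue
        # Python's == compares 2 and 2.0 as equal, mirroring numeric promotion
        if not any(v == kept for kept in result):
            result.append(v)
    return result

print(distinct_values([1, 2.0, 3, 2]))
print(distinct_values([0.0, -0.0]))
```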
List by Author
let $books := document("../docs/xmpbib.xml")//bib
return
<authlist>
{
for $a in distinct-values($books//author)
order by $a
return
<author>
<name>
{ $a/last/text() }
</name>
<books>
{
for $b in $books//book [author = $a]
order by $b/title
return $b/title
}
</books>
</author>
}
</authlist>
Could we use distinct-nodes?
Result
<authlist>
<author>
<name>Abiteboul</name>
<books><title>Data on the Web</title></books>
</author>
<author>
<name>Buneman</name>
<books><title>Data on the Web</title></books>
</author>
<author>
<name>Stevens</name>
<books>
<title>Advanced Programming in the Unix environment</title>
<title>TCP/IP Illustrated</title>
</books>
</author>
<author>
<name>Suciu</name>
<books><title>Data on the Web</title></books>
</author>
</authlist>
More Examples with Distinct
for $a in distinct-values( document("../docs/xmpbib.xml")//author )
for $l in distinct-values($a/last),
$f in distinct-values($a [ last = $l]/first)
return
<author>
<last>
{ $l }
</last>
<first>
{ $f }
</first>
</author>
Result:
<author>
<last><last>Stevens</last></last>
<first><first>W.</first></first>
</author>,
<author>
<last><last>Abiteboul</last></last>
<first><first>Serge</first></first>
</author>,
<author>
<last><last>Buneman</last></last>
<first><first>Peter</first></first>
</author>,
<author>
<last><last>Suciu</last></last>
<first><first>Dan</first></first>
</author>
Concatenate
let $a := (<book>Elements Are Redundant</book>,
<book>Elements Are Redundant</book>)
return
op:concatenate(distinct-values($a), distinct-nodes($a))
Result
<book>Elements Are Redundant</book>,
<book>Elements Are Redundant</book>,
<book>Elements Are Redundant</book>
Concatenate
• op:concatenate($seq1 as item*,
$seq2 as item*) as item*
• Returns a sequence consisting of the items in
$seq1 followed by the items in $seq2. This
function backs up the infix operator ",". If either
sequence is the empty sequence, the other
operand is returned.
• 15.1.5.1 Examples
• op:concatenate((1, 2, 3), (4, 5)) returns (1, 2, 3, 4, 5).
• op:concatenate((), ()) returns ().
Data
let $a := (<book>Elements Are Redundant</book>,
<book>Elements Are Redundant</book>)
return
data($a)
Result:
Elements Are Redundant, Elements Are Redundant
Data
data(
<book> <title>Attributes <bold>are</bold>
Redundant</title> <price>12.30</price>
</book>)
Result:
Attributes are Redundant12.30
Data
• fn:data($srcval as item*) as xdt:anyAtomicType*
• fn:data takes a sequence of items and returns a
sequence of atomic values.
• The result of fn:data is the sequence of atomic
values produced by applying the following rules
to each item in $srcval:
• If the item is an atomic value, it is returned.
• If the item is a node, fn:data() returns the typed
value of the node as defined by the accessor
function dm:typed-value in [XQuery 1.0 and
XPath 2.0 Data Model].
dm:typed-value
• dm:typed-value($n as Node) as xdt:anyAtomicType*
• The dm:typed-value accessor returns the typed-value of the node,
which is a sequence of zero or more atomic values derived from the
string-value of the node and its type in such a way as to be
consistent with validation.
• If the node is a comment, document, namespace, processing-instruction, or text node, then its typed value is equal to its string
value as an instance of xdt:untypedAtomic.
• If the node is an attribute node with type xs:anySimpleType, then its
typed value is equal to its string value as an instance of
xdt:untypedAtomic. The typed value of an attribute node with any
other type is derived from its string value and type annotation in a
way that is consistent with XML Schema validation.
• If the node is an element node with type xs:anyType, then its typed
value is equal to its string value, as an instance of
xdt:untypedAtomic.
dm:typed-value (Cont.)
• If the node is an element node with a simple type or with a complex
type of simple content, then its typed value is derived from its string
value and type in a way that is consistent with XML Schema
validation.
• If the item is an element node with complex type of empty content,
then its typed value is the empty sequence.
• If the node is an element node with a complex type of mixed
content, then its typed value is its string value as an instance of
xdt:untypedAtomic.
• Recall dm:string-value: The concatenation of the string-values
of all the text node descendants of the element in document
order.
• If the item is an element node with complex type of complex content,
then its typed value is undefined and dm:typed-value raises a type
error, which may be handled by the host language.
• For detailed semantics see [XQuery 1.0 Formal Semantics].
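For an element without schema type information (type xs:anyType), the typed value is simply the string value, that is, the concatenation of all descendant text nodes in document order. A small Python sketch, assuming the element tree of the standard library is an adequate stand-in for the data model; string_value is an illustrative name.

```python
import xml.etree.ElementTree as ET

def string_value(element):
    # concatenate all descendant text nodes in document order
    return "".join(element.itertext())

book = ET.fromstring(
    "<book><title>Attributes <bold>are</bold> Redundant</title>"
    "<price>12.30</price></book>"
)
print(string_value(book))
```

The output matches the earlier data() example: no blank is inserted between the title text and the price, because both belong to the string value of a single node.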
A note on Understanding the Spec
• The basics are:
– The XPath 2.0 and XQuery 1.0 data model.
– XML Schema.
– Sequence types (XQuery).
• Next come built-in functions and
operators.
• Next come XQuery expressions.
Join – File xmpreviews1.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<reviews>
<entry>
<title>TCP/IP Illustrated</title>
<price>65.95</price>
<review>
One of the best books on TCP/IP.
</review>
</entry>
</reviews>
Join
for $t in
document("../docs/xmpbib.xml")//title,
$e in
document("../docs/xmpreviews.xml")//entry
where $t = $e/title
return
<entry> { $t, $e/review }</entry>
Result:
<entry>
<title>TCP/IP Illustrated</title>
<review>
One of the best books on TCP/IP.
</review>
</entry>,
<entry>
<title>Advanced Programming in the Unix environment</title>
<review>
A clear and detailed discussion of UNIX programming.
</review>
</entry>,
<entry>
<title>Data on the Web</title>
<review>
A very good discussion of semi-structured database
systems and XML.
</review>
</entry>
Join – Another way
for $t in document("../docs/xmpbib.xml")//title
return
<reviewentry>
{ $t }
{for $e in document("xmpreviews1.xml")//entry
where $t = $e/title
return $e/review }
</reviewentry>
Result:
<reviewentry>
<title>TCP/IP Illustrated</title>
<review>
One of the best books on TCP/IP.
</review>
</reviewentry>,
<reviewentry>
<title>Advanced Programming in the Unix environment</title>
</reviewentry>,
<reviewentry><title>Data on the Web</title></reviewentry>,
<reviewentry>
<title>The Economics of Technology and Content for Digital TV</title>
</reviewentry>
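The two join styles differ in what happens to titles without a review: the flat form drops them, while the nested form keeps every title and leaves the review part empty. A Python sketch of both follows; the lists stand in for xmpbib.xml and xmpreviews1.xml, and the function names are hypothetical.

```python
titles = [
    "TCP/IP Illustrated",
    "Advanced Programming in the Unix environment",
    "Data on the Web",
    "The Economics of Technology and Content for Digital TV",
]
# (title, review) pairs standing in for the entries of xmpreviews1.xml
entries = [("TCP/IP Illustrated", "One of the best books on TCP/IP.")]

def join_flat(titles, entries):
    # for $t in ..., $e in ... where $t = $e/title: only matching pairs survive
    return [(t, review) for t in titles for (et, review) in entries if t == et]

def join_nested(titles, entries):
    # one result per title; the inner FLWOR may return the empty sequence
    return [(t, [review for (et, review) in entries if t == et]) for t in titles]

print(len(join_flat(titles, entries)))    # titles that have a review
print(len(join_nested(titles, entries)))  # every title
```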
Hierarchy Inversion
<listings>
{
for $p in distinct-values(document("../docs/xmpbib.xml")//publisher)
order by $p
return
<result>
{ $p }
{
for $b in document("../docs/xmpbib.xml")/bib/book
where $b/publisher = $p
order by $b/title
return $b/title
}
</result>
}
</listings>
Result:
<listings>
<result>
<publisher>Addison-Wesley</publisher>
<title>Advanced Programming in the Unix environment</title>
<title>TCP/IP Illustrated</title>
</result>
<result>
<publisher>Kluwer Academic Publishers</publisher>
<title>The Economics of Technology and Content for Digital
TV</title>
</result>
<result>
<publisher>Morgan Kaufmann Publishers</publisher>
<title>Data on the Web</title>
</result>
</listings>
Existential Quantifiers
for $b in document("../docs/xmpbib.xml")//book
where some $a in $b//author
satisfies ($a/last="Stevens" and
$a/first="W.")
return $b/title
Result:
<title>TCP/IP Illustrated</title>,
<title>Advanced Programming in the Unix
environment</title>
Universal Quantifiers
for $b in document("../docs/xmpbib.xml")//book
where every $a in $b//author
satisfies ($a/last="Stevens" and
$a/first="W.")
return $b/title
Result (the third title qualifies vacuously, because that book has no author elements):
<title>TCP/IP Illustrated</title>,
<title>Advanced Programming in the Unix
environment</title>,
<title>The Economics of Technology and
Content for Digital TV</title>
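The two quantifiers behave like Python's any() and all(), including the vacuous truth of "every" over an empty sequence. The sketch below is an analogy only; the dictionary stands in for the author lists of xmpbib.xml, and is_stevens is a made-up helper.

```python
# title -> list of (last, first) authors, in document order
books = {
    "TCP/IP Illustrated": [("Stevens", "W.")],
    "Advanced Programming in the Unix environment": [("Stevens", "W.")],
    "Data on the Web": [("Abiteboul", "Serge"), ("Buneman", "Peter"), ("Suciu", "Dan")],
    "The Economics of Technology and Content for Digital TV": [],  # editor only
}

def is_stevens(author):
    return author == ("Stevens", "W.")

# some $a in $b//author satisfies (...)
some_titles = [t for t, authors in books.items() if any(is_stevens(a) for a in authors)]
# every $a in $b//author satisfies (...): true when there are no authors at all
every_titles = [t for t, authors in books.items() if all(is_stevens(a) for a in authors)]

print(some_titles)
print(every_titles)
```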
Example: Books per Author
<author-list>
{
let $a := document("../docs/xmpbib.xml")//author
for $l in distinct-values($a/last),
$f in distinct-values($a [ last = $l]/first)
order by data($l), data($f)
return
<author>
<name>
{ $l , " ", $f }
</name>
{
for $b in document("../docs/xmpbib.xml")//bib/book
where some $ba in $b/author satisfies
($ba/last= $l and $ba/first = $f)
order by $b/title
return $b/title
}
</author>
}
</author-list>
Result
<author-list>
<author>
<name><last>Abiteboul</last> <first>Serge</first></name>
<title>Data on the Web</title>
</author>
<author>
<name><last>Buneman</last> <first>Peter</first></name>
<title>Data on the Web</title>
</author>
<author>
<name><last>Stevens</last> <first>W.</first></name>
<title>Advanced Programming in the Unix environment</title>
<title>TCP/IP Illustrated</title>
</author>
<author>
<name><last>Suciu</last> <first>Dan</first></name>
<title>Data on the Web</title>
</author>
</author-list>
One More
for $b in document("../docs/xmpbib.xml")//book
return
<book>
{ $b/title }
{
for $a at $i in $b/author
where $i <= 2
return
<author> { string($a/last), ",", string($a/first) } </author>
}
{
if (count($b/author) > 2)
then <author> et. al. </author>
else ()
}
</book>
Result
<book><title>TCP/IP Illustrated</title><author>Stevens ,
W.</author></book>,
<book>
<title>Advanced Programming in the Unix environment</title>
<author>Stevens , W.</author>
</book>,
<book>
<title>Data on the Web</title>
<author>Abiteboul , Serge</author>
<author>Buneman , Peter</author>
<author> et. al. </author>
</book>,
<book>
<title>The Economics of Technology and Content for Digital TV</title>
</book>
mybook.xml
<book>
<section title="Intro">
<section title="Background">
</section>
<section title="Related Work">
</section>
</section>
<section title="Definitions"> </section>
<section title="The System">
<section title="Design"> </section>
<section title="Implementation">
<section title="Overview"> </section>
<section title="Detailed"> </section>
</section>
</section>
</book>
Recursive Functions
define function toc($book-or-section as element) as element*
{
for $s in $book-or-section/section
return
<section>
{ $s/@title, toc($s) }
</section>
}
<toc>
{
for $a in document("mybook.xml")/book
return toc($a)
}
</toc>
Result:
<toc>
<section title="Intro">
<section title="Background"></section>
<section title="Related Work"></section>
</section>
<section title="Definitions"></section>
<section title="The System">
<section title="Design"></section>
<section title="Implementation">
<section title="Overview"></section>
<section title="Detailed"></section>
</section>
</section>
</toc>
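The same recursion can be written in Python to see its shape: copy the title attribute of each child section, recurse into it, and drop everything else. This is a sketch for comparison, not a translation of the XQuery semantics; the MYBOOK string assumes a book root element around the sections of mybook.xml.

```python
import xml.etree.ElementTree as ET

MYBOOK = """<book>
  <section title="Intro">
    <section title="Background">Some background ...</section>
    <section title="Related Work">Some papers ...</section>
  </section>
  <section title="Definitions">some defs ...</section>
  <section title="The System">
    <section title="Design">System design ...</section>
    <section title="Implementation">
      <section title="Overview">Overview ...</section>
      <section title="Detailed">The details ...</section>
    </section>
  </section>
</book>"""

def toc(book_or_section):
    # one new <section> per child section: keep the title, recurse, drop text
    result = []
    for s in book_or_section.findall("section"):
        new = ET.Element("section", {"title": s.get("title")})
        new.extend(toc(s))
        result.append(new)
    return result

root = ET.Element("toc")
root.extend(toc(ET.fromstring(MYBOOK)))
print(ET.tostring(root, encoding="unicode"))
```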
Trace
define function toc($book-or-section as element) as element*
{
for $s in $book-or-section/section
return
<section>
{ $s/@title, toc( trace($s, " ARG1 ") ) }
</section>
}
<toc>
{
for $a in document("mybook.xml")/book
return toc( trace($a, " ARG ") )
}
</toc>
Result: Next slide
ARG <book>
<section title="Intro">
<section title="Background">
Some background ...
</section>
<section title="Related Work">
Some papers ...
</section>
</section>
<section title="Definitions"> some defs ... </section>
<section title="The System">
<section title="Design"> System design ... </section>
<section title="Implementation">
<section title="Overview"> Overview ... </section>
<section title="Detailed"> The details ... </section>
</section>
</section>
</book> ARG1 <section title="Intro">
<section title="Background">
Some background ...
</section>
<section title="Related Work">
Some papers ...
</section>
</section> ARG1 <section title="Background">
Some background ...
</section> ARG1 <section title="Related Work">
Some papers ...
</section> ARG1 <section title="Definitions"> some defs ... </section> ARG1
<section title="The System">
<section title="Design"> System design ... </section>
<section title="Implementation">
<section title="Overview"> Overview ... </section>
<section title="Detailed"> The details ... </section>
</section>
</section> ARG1 <section title="Design"> System design ... </section> ARG1 <section
title="Implementation">
<section title="Overview"> Overview ... </section>
<section title="Detailed"> The details ... </section>
</section> ARG1 <section title="Overview"> Overview ... </section> ARG1
<section title="Detailed"> The details ... </section><toc>
<section title="Intro">
<section title="Background"></section>
<section title="Related Work"></section>
</section>
<section title="Definitions"></section>
<section title="The System">
<section title="Design"></section>
<section title="Implementation">
<section title="Overview"></section>
<section title="Detailed"></section>
</section>
</section>
</toc>
Assignment
• Modify the previous program so that, in the output, each first-level child section is indented by 5 spaces under its parent section.
• For example:
<toc>
Intro
     Background
     Related Work
Definitions
The System
     Design
     Implementation
          Overview
          Detailed
</toc>