XML Processing Moves Forward
XSLT 2.0 and XQuery 1.0
Michael Kay
Prague 2005
About me
• Database background
• Started using XML in 1998 for content management applications
• Author of XSLT Programmer’s Reference
• Developer of Saxon XSLT processor
• Member of W3C XSL and XQuery Working Groups
• Founded SAXONICA March 2004
Contents
• A tour of the new specs
• What’s significant about XSLT 2.0
• A quick demo
• Why XQuery?
The QT Specification Family
[Diagram: boxes labelled XSLT 2.0, XQuery 1.0, XPath 2.0, Functions and Operators, Data Model, XML Schema]
Standards maturity
[Chart: maturity (levels CR and REC) plotted against time for XML, XML Schema, XSLT 1.0, XPath 1.0, XQuery, XSLT 2.0 and XPath 2.0]
A family of standards
[Diagram: XQuery 1.0, XSLT 2.0, XPath 2.0, XSLT 1.0, XPath 1.0, XML Schema]
XSLT and XQuery
[Diagram: XSLT and XQuery positioned between two labels, Documents and Data]
What’s new in XSLT 2.0
• New Processing Model
• Major Features
– grouping
– regular expressions
– functions
– schema support
• Many “minor” features
Some “minor” features
XSLT 2.0
• Temporary trees
• Multiple output files
• Format date/time
• Tunnel parameters
• Declared variable types
• Multi-mode templates
• xsl:next-match
• Conditional compilation
• XHTML serialization
• xsl:namespace
• separator=","
• Character maps
XPath 2.0
• Sequences
• if..then..else
• for $x in X return f($x)
• some/every
• except/intersect
• $n is $m
Function library
• String functions
• Regex functions
• Date/time arithmetic
• URI handling
• min(), max(), avg()
Handling unstructured text
• unparsed-text() function
– reads a text file into a string
• tokenize() function
– splits a string into substrings
• xsl:analyze-string
– parses a string and generates markup
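The three steps above (read a file into a string, split it, turn matches into markup) can be tried out in Python by readers without an XSLT 2.0 processor to hand. This is only a rough analogue: the file content, the date pattern, and the `<line>` and `<date>` element names are invented for illustration, and Python's regex dialect is not identical to the one XPath 2.0 uses.

```python
import re
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical input: a small text file written just for this sketch.
sample = "Invoice 2005-06-14 paid\nInvoice 2005-07-02 open\n"
path = Path(tempfile.mkdtemp()) / "notes.txt"
path.write_text(sample, encoding="utf-8")

# 1. Like unparsed-text(): read the whole file into one string.
text = path.read_text(encoding="utf-8")

# 2. Like tokenize(): split the string on a separator regex,
#    dropping the empty token left by the trailing newline.
lines = [ln for ln in re.split(r"\r?\n", text) if ln]

# 3. Like xsl:analyze-string: wrap matching substrings in markup and
#    copy the non-matching substrings through as escaped text.
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def mark_dates(line: str) -> str:
    out, pos = [], 0
    for m in DATE.finditer(line):
        out.append(escape(line[pos:m.start()]))   # non-matching part
        out.append(f"<date>{m.group(0)}</date>")  # matching part
        pos = m.end()
    out.append(escape(line[pos:]))                # trailing non-matching part
    return "<line>" + "".join(out) + "</line>"

marked = [mark_dates(ln) for ln in lines]
```

Each entry of `marked` is one line of the file with its dates wrapped in an element, which is the kind of result xsl:analyze-string is typically used to produce.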
Regular expression functions
• matches()
test if a string matches a regex
if (matches($in, '[A-Z]{3}[0-9]{3}')) ...
• tokenize()
split a string into substrings
regex matches the separator
for $s in tokenize($in, ',\s?') ...
• replace()
replace every occurrence of a match
replace($in, '\s', '%20')
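For comparison, the three calls above have close counterparts in Python's `re` module. The sample strings are invented, and the two regex dialects differ in places, so this is a sketch of the behaviour rather than a drop-in equivalent.

```python
import re

# matches() is true if the regex matches any substring of the input,
# so re.search (not re.fullmatch) is the closer analogue.
has_code = re.search(r"[A-Z]{3}[0-9]{3}", "order ABC123 shipped") is not None
no_code = re.search(r"[A-Z]{3}[0-9]{3}", "order AB12 shipped") is not None

# tokenize(): the regex describes the separator, not the tokens.
fields = re.split(r",\s?", "red, green,blue")

# replace(): every occurrence of a match is replaced.
encoded = re.sub(r"\s", "%20", "a b  c")
```

Note that the double space in the last input yields two `%20` sequences, because each whitespace character is a separate match.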
Grouping
• Takes any sequence as input
• Divides the items into groups
• Applies processing to each group
group-by:
items with a common value for a grouping key
group-adjacent:
adjacent items with a common grouping key
group-starting-with:
pattern to match first item in each group
group-ending-with:
pattern to match last item in each group
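The difference between the grouping modes is easiest to see on a small sequence. The following Python sketch mimics three of them; the publisher names and item labels are hypothetical, and the helper `group_starting_with` is written for this illustration only.

```python
from itertools import groupby

# Hypothetical data: (publisher, title) pairs in document order.
books = [("Alpha", "A"), ("Beta", "B"), ("Alpha", "C"), ("Alpha", "D")]

def publisher(book):
    return book[0]

# group-by: one group per distinct key, in order of first appearance.
by_value = {}
for b in books:
    by_value.setdefault(publisher(b), []).append(b[1])

# group-adjacent: a new group starts whenever the key changes.
adjacent = [(k, [b[1] for b in g]) for k, g in groupby(books, publisher)]

# group-starting-with: a new group starts at each item matching the pattern
# (the very first item starts a group even if it does not match).
def group_starting_with(items, starts):
    groups = []
    for item in items:
        if starts(item) or not groups:
            groups.append([])
        groups[-1].append(item)
    return groups

sections = group_starting_with(["h1", "p", "p", "h1", "p"], lambda x: x == "h1")
```

With this input, group-by collects all three "Alpha" titles into one group, while group-adjacent produces two separate "Alpha" groups because "Beta" sits between them.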
Grouping by Value
<xsl:for-each-group select="book" group-by="publisher">
  <xsl:sort select="current-grouping-key()"/>
  <h2>Publisher:
    <xsl:value-of select="current-grouping-key()"/>
  </h2>
  <xsl:for-each select="current-group()">
    <xsl:sort select="title"/>
    <p>author: <xsl:value-of select="author"/></p>
    <p>title: <xsl:value-of select="title"/></p>
  </xsl:for-each>
</xsl:for-each-group>
User-defined Functions
• Written like named templates
• Called from XPath
• Return a result
<xsl:function name="ged:date-to-ISO" as="xs:date">
  <xsl:param name="in" as="ged:date"/>
  <xsl:sequence select="xs:date(concat(
      substring($in, 8, 4), '-',
      format-number(index-of(('JAN', 'FEB', ...), substring($in, 4, 3)), '00'),
      '-', substring($in, 1, 2)))"/>
</xsl:function>
<xsl:sort select="ged:date-to-ISO(@birth-date)"/>
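To make the substring arithmetic in the function concrete, here is the same conversion sketched in Python. It assumes the input has the fixed layout `DD MMM YYYY` implied by the substring offsets, and the full list of month abbreviations is an assumption of this sketch, since the slide elides its own list.

```python
from datetime import date

# Assumed month abbreviations (the slide shows only the first two).
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

def date_to_iso(ged_date: str) -> date:
    """Convert a 'DD MMM YYYY' string to a date value."""
    day = ged_date[0:2]     # substring($in, 1, 2)
    month = ged_date[3:6]   # substring($in, 4, 3)
    year = ged_date[7:11]   # substring($in, 8, 4)
    # index-of() is 1-based in XPath; Python's index() is 0-based.
    return date(int(year), MONTHS.index(month) + 1, int(day))

# Usage mirroring the xsl:sort call: order strings by the converted date.
birth_dates = ["12 DEC 1901", "05 MAR 1875", "30 JAN 1901"]
ordered = sorted(birth_dates, key=date_to_iso)
```

Sorting on the converted value rather than on the raw string is the point of the example: the raw strings would sort by day of month first.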
XQuery 1.0
• Designed to query XML databases
• Also handles in-memory transformations
• Well supported by database vendors
XQuery Example
Join two tables
xquery version "1.0";
<results> {
  for $p in doc("auction.xml")/site/people/person
  let $a := for $t in doc("auction.xml")/site/closed_auctions/closed_auction
            where $t/buyer/@person = $p/@id
            return $t
  return <item person="{$p/name}"> {count($a)} </item>
} </results>
XMark Q8
XSLT Equivalent
<result xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:for-each select="/site/people/person">
    <xsl:variable name="a"
        select="/site/closed_auctions/closed_auction
                [buyer/@person = current()/@id]"/>
    <item person="{name}">
      <xsl:value-of select="count($a)"/>
    </item>
  </xsl:for-each>
</result>
XMark Q8
Optimization
• With multi-GB databases, using indexes is essential
• XQuery does not have template rules
• This makes it possible to do static analysis and join optimization
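What join optimization buys can be sketched outside any particular processor. The Python below evaluates a Q8-style count in two ways: by rescanning all auctions for every person, and by building an index on the join key first. The data and function names are hypothetical, and this is not a description of how any processor named in these slides works internally.

```python
from collections import Counter

# Hypothetical data shaped like the XMark Q8 inputs.
people = [{"id": "p1", "name": "Ann"}, {"id": "p2", "name": "Bob"},
          {"id": "p3", "name": "Cy"}]
closed_auctions = [{"buyer": "p1"}, {"buyer": "p2"}, {"buyer": "p1"}]

def nested_loop_counts(people, auctions):
    # For every person, scan every auction: O(people x auctions).
    return {p["name"]: sum(1 for a in auctions if a["buyer"] == p["id"])
            for p in people}

def indexed_counts(people, auctions):
    # Build an index on the join key once, then probe it per person:
    # O(people + auctions).
    per_buyer = Counter(a["buyer"] for a in auctions)
    return {p["name"]: per_buyer[p["id"]] for p in people}

naive = nested_loop_counts(people, closed_auctions)
fast = indexed_counts(people, closed_auctions)
```

Both functions return the same counts; only the amount of work differs, and that difference grows with the size of the inputs.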
XMark Q8 results (msecs)
Processor           1 MB    4 MB   10 MB
XSLT
  Xalan             1503   11006   65855
  xt                 160    2253   16414
  MSXML               33     519    4248
  Saxon 8.4           90    1340   11126
XQuery
  Saxon 8.4          136    1575   11947
  Qizx               351     711    1813
  Galax             1870    6672   16625
Slide annotations: O(n²), O(n)
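One way to relate the timings to the O(n²) and O(n) annotations is to estimate a scaling exponent from the smallest and largest inputs. The sketch below does this for three processors, assuming the figures are read as 1503 and 65855 ms for Xalan, 136 and 11947 ms for Saxon 8.4 running XQuery, and 351 and 1813 ms for Qizx at 1 MB and 10 MB respectively. Two data points per processor give only a crude estimate.

```python
import math

# (1 MB, 10 MB) timings in ms, under the reading of the figures stated above.
timings = {
    "Xalan (XSLT)": (1503, 65855),
    "Saxon 8.4 (XQuery)": (136, 11947),
    "Qizx (XQuery)": (351, 1813),
}

def scaling_exponent(t_small, t_large, size_ratio=10):
    # If time grows like n**k, then k = log(t_large / t_small) / log(size_ratio).
    return math.log(t_large / t_small) / math.log(size_ratio)

exponents = {name: round(scaling_exponent(*t), 2) for name, t in timings.items()}
```

Under these assumptions the exponent comes out between 1.5 and 2 for Xalan and Saxon 8.4, and below 1 for Qizx. Fixed startup costs pull the estimates down at small sizes, so an exponent below 1 is consistent with linear behaviour rather than evidence of something faster.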
Two can play at that game!
Processor           1 MB    4 MB   10 MB
XSLT
  Xalan             1503   11006   65855
  xt                 160    2253   16414
  MSXML               33     519    4248
  Saxon 8.5           27      26      45
XQuery
  Saxon 8.5           16      16      31
  Qizx               351     711    1813
  Galax             1870    6672   16625
Slide annotations: O(n²), O(n)
caveat: this is one query only!
Conclusions
• XSLT 2.0 and XQuery 1.0 are nearly ready
• XSLT 2.0 has many powerful new features, making new applications possible
• XQuery 1.0 designed for optimization against very large databases