Search and Navigation in XML Documents

Databases and Information Systems 1 - Search and Navigation in XML Documents
Dr. Rita Hartel
Fakultät EIM, Institut für Informatik
Universität Paderborn
WS 2011 / 2012
Axes in XML document trees - Axes names
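Axes are considered relative to the current context node.
[Figure: document tree around a context node (self), with the regions ancestor, parent, preceding, preceding-sibling, following-sibling, following, child and descendant marked; attribute nodes are shown as @]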
Databases and Information Systems I – WS 2011/2012 – Search and navigation in XML documents
Axes in XML document trees – Semantics
The following axes select for a given context node:

Name                  Semantics
child::               its child nodes
descendant::          its descendants (= children and their descendants)
parent::              the parent node (only the root does not have a parent)
ancestor::            nodes on the path to the root (= parent and its ancestors)
following-sibling::   nodes with identical parent, following in document order
                      (empty for attribute and namespace nodes)
preceding-sibling::   inverse to following-sibling
following::           all nodes following the context node in document order
                      (excluding descendant, attribute and namespace nodes)
preceding::           inverse to following
                      (excluding ancestor, attribute and namespace nodes)
attribute::           its attributes (empty for each non-element node)
namespace::           its namespace nodes (empty for each non-element node)
Axes in XML document trees - Partitioning
[Figure: document tree partitioned into the regions ancestor, parent, preceding, preceding-sibling, self, following-sibling, following, child and descendant, relative to the context node]
Axes in XML document trees - Partitioning
When ignoring attribute nodes and namespace nodes, the following holds
for every document node:
The axes
• ancestor::
• descendant::
• following::
• preceding::
• self::
partition a document fully, i.e., the selected node sets do not overlap, and
the union of all partitions contains all nodes of the document.
[Figure: plot with the axes "Position of start-tag" and "Position of end-tag", divided into the regions ancestor, following, self, preceding and descendant]
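The partitioning property can be checked on a small DOM tree. The following is a minimal sketch, assuming the standard `org.w3c.dom` method `Node.compareDocumentPosition` (DOM Level 3) is supported by the parser in use; the class name `AxisPartition`, the sample document and the choice of context node are hypothetical. Attribute nodes are never visited, because they are not children of their element.

```java
import java.io.StringReader;
import java.util.EnumMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AxisPartition {
    enum Axis { SELF, ANCESTOR, DESCENDANT, PRECEDING, FOLLOWING }

    // Hypothetical sample document, written without whitespace between tags
    static final String XML =
        "<doc><customer name=\"Meier\"><order><item>PC500</item></order>"
      + "<address>Isartor 22</address></customer>"
      + "<customer name=\"Reich\"><order><item>PC600</item></order>"
      + "<address>Goldgrube 1</address></customer></doc>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Decide on which of the five axes 'other' lies, seen from 'context'
    static Axis classify(Node context, Node other) {
        if (other.isSameNode(context)) return Axis.SELF;
        short pos = context.compareDocumentPosition(other);
        // An ancestor is also flagged as preceding, so test containment first
        if ((pos & Node.DOCUMENT_POSITION_CONTAINS) != 0) return Axis.ANCESTOR;
        if ((pos & Node.DOCUMENT_POSITION_CONTAINED_BY) != 0) return Axis.DESCENDANT;
        if ((pos & Node.DOCUMENT_POSITION_PRECEDING) != 0) return Axis.PRECEDING;
        return Axis.FOLLOWING;
    }

    // Count how many nodes of the whole document fall on each axis
    static Map<Axis, Integer> partition(Node context) {
        Map<Axis, Integer> counts = new EnumMap<>(Axis.class);
        for (Axis axis : Axis.values()) counts.put(axis, 0);
        visit(context.getOwnerDocument(), context, counts);
        return counts;
    }

    static void visit(Node current, Node context, Map<Axis, Integer> counts) {
        counts.merge(classify(context, current), 1, Integer::sum);
        for (Node c = current.getFirstChild(); c != null; c = c.getNextSibling()) {
            visit(c, context, counts);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Node context = doc.getElementsByTagName("address").item(0);
        System.out.println(partition(context));
    }
}
```

Every node is counted exactly once, so the five counts add up to the number of nodes in the tree (here including the document node itself, which lies on the ancestor axis).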
The Document Object Model (DOM)
Document Object Model (DOM)
The Document Object Model (DOM) is a cross-platform and language-independent convention for representing and interacting with objects
in HTML, XHTML and XML documents. Aspects of the DOM (such as its
"Elements") may be addressed and manipulated within the syntax of
the programming language in use. The public interface of a DOM is
specified in its application programming interface (API)
[…]
Because DOM supports navigation in any direction (e.g., parent and
previous sibling) and allows for arbitrary modifications, an
implementation must at least buffer the document that has been read
so far (or some parsed form of it).
from: http://en.wikipedia.org/wiki/Document_Object_Model
The Document Object Model (DOM)
Initially: load XML document as a tree into main memory
Then (for each access):
navigate from node to node and:
• read nodes or texts,
• insert, update or delete nodes, sub-trees or texts
Pros and Cons:
+ easy to program
- consumes much memory
- long loading time until document is in memory
[Figure: DOM tree of the example document: root doc with two customer elements (name "Meier" and name "Reich"), each with an order containing an item (PC500, PC600) and an address (Isartor 22,…; Goldgrube 1)]
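The access pattern described above (load once, then navigate and modify in any direction) can be sketched with the JAXP `DocumentBuilder` that ships with the JDK. This is an illustration only: the class name `DomNavigation`, its helper methods and the shortened sample document are hypothetical, and the document is written without whitespace so that `getFirstChild()` returns elements rather than whitespace text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNavigation {
    // Hypothetical, shortened version of the example document
    static final String XML =
        "<doc><customer name=\"Meier\"><order><item>PC500</item></order></customer></doc>";

    // Initially: load the whole document as a tree into main memory
    static Document load(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Navigate downwards: doc -> customer -> order -> item
    static Node firstItem(Document doc) {
        Node customer = doc.getDocumentElement().getFirstChild();
        return customer.getFirstChild().getFirstChild();
    }

    // Navigate upwards: item -> order -> customer, then read an attribute
    static String customerNameOf(Node item) {
        Element customer = (Element) item.getParentNode().getParentNode();
        return customer.getAttribute("name");
    }

    // Modify the tree: append an address element to the first customer
    static void addAddress(Document doc, String text) {
        Element address = doc.createElement("address");
        address.setTextContent(text);
        doc.getDocumentElement().getFirstChild().appendChild(address);
    }

    public static void main(String[] args) {
        Document doc = load(XML);
        Node item = firstItem(doc);
        System.out.println(item.getTextContent() + " ordered by " + customerNameOf(item));
        addAddress(doc, "Isartor 22");
        System.out.println(doc.getElementsByTagName("address").getLength());
    }
}
```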
DOM Parser Java API (1) – reading a DOM tree
DOMParser parser = new DOMParser();            // instantiate parser
try {
    parser.parse(uri);                         // parse text found at uri
    Document doc = parser.getDocument();       // get document root
    recurseNodes(doc, …);                      // work on document
} catch (Exception e) { … }

public void recurseNodes(Node node, …) {       // recursively on all nodes
    … ;
    switch (node.getNodeType()) {              // depending on node type
        case Node.DOCUMENT_NODE: …             // if root node …
        case Node.ELEMENT_NODE: …              // if element node …
        case Node.TEXT_NODE: …                 // if text node …
    }
}
DOM-Parser-Java-API (2) – reading a DOM tree
public void recurseNodes(Node node, …) {                   // recursively on all nodes
    …
    String name = node.getNodeName();                      // read element name
    …
    NodeList nodes = node.getChildNodes();                 // collect all children
    for (int i = 0; i < nodes.getLength(); i++)
        recurseNodes(nodes.item(i), "");                   // call each child node
    …
    NamedNodeMap atts = node.getAttributes();              // get attribute list (null unless node is an element)
    for (int i = 0; atts != null && i < atts.getLength(); i++) {
        Node current = atts.item(i);                       // get 1 attribute
        System.out.print(" " + current.getNodeName()       // attribute name
            + "=\"" + current.getNodeValue() + "\"");      // attribute value
    }
}
DOM-Parser-Java-API (3) – modifying a DOM tree
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class GenDoc {
    public static void main(String[] args) {
        // Step 1: generate a Factory and configure it
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);                   // namespaces are considered
        try {
            // Step 2: generate a Builder and a document
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document d = db.newDocument();
            // construct DOM tree rooted in "d" → next slide
            (new DOMOut()).recurseNodes(d, "");        // print document rooted in "d"
        } catch (ParserConfigurationException e) { … }
    }
}
DOM-Parser-Java-API (4) – modifying a DOM tree
Element root = d.createElement("root");           // generate element with name root
Element element = d.createElement("element");     // generate element with name element

// 2 techniques to generate an attribute for a given element
element.setAttribute("att1", "att value1");       // 1st technique

Attr att = d.createAttribute("attribute2");       // 2nd technique
att.setTextContent("attribute value2");
element.setAttributeNode(att);

Text text = d.createTextNode("text");             // create a text node
element.appendChild(text);                        // set text node as child of element

root.appendChild(element);                        // let one element be a child of another element
d.appendChild(root);                              // let element 'root' be child of document root "d"
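To see the result of such a construction, the tree can be serialized again. The sketch below is self-contained and rebuilds a reduced version of the tree from this slide; instead of the `DOMOut` class used on the previous slide (whose code is not shown here), it uses the identity `Transformer` from `javax.xml.transform`, which is a swapped-in technique. The class name `GenDocPrint` and its methods are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class GenDocPrint {
    // Build root -> element (one attribute, one text child) in an empty document
    static Document build() {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = d.createElement("root");
            Element element = d.createElement("element");
            element.setAttribute("att1", "att value1");
            element.appendChild(d.createTextNode("text"));
            root.appendChild(element);
            d.appendChild(root);
            return d;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize the DOM tree to a string with an identity transformation
    static String serialize(Document d) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(d), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(build()));
    }
}
```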
DOM-Parser-Java-API (5) – some further commands
Node localNode = currentNode.getFirstChild();
while (localNode != null) {
    Node next = localNode.getNextSibling();        // remember next sibling before a possible removal
    System.out.println(localNode.getNodeName());
    if (localNode.getNodeName().equals("D")) {     // if name is "D"
        currentNode.removeChild(localNode);        // delete node "D"
    }
    localNode = next;                              // process next sibling
}
More info:
• http://java.sun.com/j2se/1.4.2/docs/api/org/w3c/dom/Node.html
The Streaming API for XML (StAX)
Streaming API for XML (StAX)
Streaming API for XML (StAX) is an application programming interface
(API) to read and write XML documents, originating from the Java
programming language community.
[…]
In the StAX metaphor, the programmatic entry point is a cursor that
represents a point within the document. The application moves the
cursor forward - 'pulling' the information from the parser as it needs.
from: http://en.wikipedia.org/wiki/StAX
Streaming API for XML (StAX)
1.        <doc>
2.          <customer name="Meier">
3.            <order>
4. 5.           <item>PC500</item>
              </order>
6.            <address>
7.              Isartor 22, …
              </address>
            </customer>
8.          <customer name="Reich">
9. – 13.      …
            </customer>
          </doc>
[Figure: document tree with the nodes numbered in document order: 1. doc; 2. customer (name "Meier") with 3. order, 4. item, 5. PC500, 6. address, 7. Isartor 22,…; 8. customer (name "Reich") with 9. order, 10. item, 11. PC600, 12. address, 13. Goldgrube 1]
• nodes are accessed in (textual) document order (1., 2., …, 13.)
• only one node of the XML document at a time in main memory
Streaming API for XML (StAX) – how to use it
parser = factory.createXMLStreamReader(
             new FileInputStream("doc.xml"));    // the StAX parser has already read the first token
do {
    // process the current token
    if ( parser.isStartElement( ) ) …            // if it is a start-element tag
    else if ( parser.isEndElement( ) ) …         // if it is an end-element tag
    else if ( parser.isCharacters( ) ) …         // if it is a text node
    // end if
    parser.next();                               // get the next token
} while ( parser.hasNext( ) ) ;                  // as long as there is a next token
Streaming API for XML (StAX) – Java-Code (1)
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class LittleStAXExample {
    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader parser =                       // generate parser via factory
                factory.createXMLStreamReader(
                    new FileInputStream("doc.xml"));       // the StAX parser has already read the first token
            …                                              // process all tokens → next slide
        } catch (XMLStreamException e) { … }
          catch (java.io.FileNotFoundException e) { … }
    }
}
Streaming API for XML (StAX) – Java-Code (2)
do {                                                        // process the current token
    if ( parser.isStartElement( ) ) {                       // if it is a start-element tag
        write("<" + parser.getName( ) );                    // write the start-element tag
        for (int i = 0; i < parser.getAttributeCount( ); i++)   // for all attributes
            write(" " + parser.getAttributeLocalName(i)     // write attr. name
                + "= \"" + parser.getAttributeValue(i) + "\"");  // write attr. value
        write(">\n");
    }
    else if ( parser.isEndElement( ) )                      // if it is an end-element tag
        write("</" + parser.getName( ) + ">\n");            // print it
    else if ( parser.isCharacters( ) )                      // if it is a text node
        write( parser.getText( ) + "\n" );                  // print text
    parser.next( );                                         // get next token
} while ( parser.hasNext( ) ) ;                             // as long as there is a next token
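Because the application pulls the tokens, it can also stop as soon as it has what it needs. The following minimal sketch reads from a string instead of a file and returns the text of the first item element; the class name `StaxFirstItem`, the method `firstItem` and the sample document are hypothetical, while `getElementText()` is part of the standard `XMLStreamReader` interface.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxFirstItem {
    // Hypothetical sample document
    static final String XML =
        "<doc><customer name=\"Meier\"><order><item>PC500</item></order></customer>"
      + "<customer name=\"Reich\"><order><item>PC600</item></order></customer></doc>";

    // Pull tokens until the first <item> start tag, then read its text and stop
    static String firstItem(String xml) {
        try {
            XMLStreamReader parser = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (parser.hasNext()) {
                    parser.next();                       // application asks for the next token
                    if (parser.isStartElement()
                            && parser.getLocalName().equals("item")) {
                        return parser.getElementText();  // text up to the matching end tag
                    }
                }
                return null;                             // no item element found
            } finally {
                parser.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstItem(XML));
    }
}
```

The rest of the document after the first item is never tokenized, which a push-based parser cannot avoid as easily.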
Streaming API for XML (StAX) –getEventType()
Using getEventType, we do not miss any XML event
• XMLStreamReader.ATTRIBUTE
• XMLStreamReader.CDATA
• XMLStreamReader.CHARACTERS
• XMLStreamReader.COMMENT
• XMLStreamReader.END_DOCUMENT
• XMLStreamReader.END_ELEMENT
• XMLStreamReader.ENTITY_REFERENCE
• XMLStreamReader.NAMESPACE
• XMLStreamReader.NOTATION_DECLARATION
• XMLStreamReader.PROCESSING_INSTRUCTION
• XMLStreamReader.SPACE
• XMLStreamReader.START_DOCUMENT
• XMLStreamReader.START_ELEMENT
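A loop over the event types might look as follows. This is a sketch under assumptions: the class name `StaxEventCount`, the helper `name` and the sample document are hypothetical; the constants are the standard ones from `XMLStreamConstants`, which `XMLStreamReader` inherits. Only a few event types are named explicitly, everything else is reported as `OTHER`.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxEventCount {
    // Hypothetical sample document with one comment and one attribute
    static final String XML =
        "<doc><!-- note --><customer name=\"Meier\"><order><item>PC500</item>"
      + "</order></customer></doc>";

    // Readable name for the event types that this sketch distinguishes
    static String name(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_DOCUMENT: return "START_DOCUMENT";
            case XMLStreamConstants.END_DOCUMENT:   return "END_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:  return "START_ELEMENT";
            case XMLStreamConstants.END_ELEMENT:    return "END_ELEMENT";
            case XMLStreamConstants.CHARACTERS:     return "CHARACTERS";
            case XMLStreamConstants.COMMENT:        return "COMMENT";
            case XMLStreamConstants.ATTRIBUTE:      return "ATTRIBUTE";
            default:                                return "OTHER(" + eventType + ")";
        }
    }

    // Count how often each event type is reported for the given document
    static Map<String, Integer> countEvents(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Report adjacent character data as a single CHARACTERS event
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader parser =
                    factory.createXMLStreamReader(new StringReader(xml));
            int event = parser.getEventType();   // START_DOCUMENT before the first next()
            while (true) {
                counts.merge(name(event), 1, Integer::sum);
                if (!parser.hasNext()) break;
                event = parser.next();
            }
            parser.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countEvents(XML));
    }
}
```

When reading a document this way, attributes are not delivered as separate `ATTRIBUTE` events; they are accessed through the `getAttribute…` methods while the cursor is on a `START_ELEMENT`.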
Streaming API for XML (StAX) - Pros and Cons

- program can only navigate in (textual) document order
- more difficult to process than DOM
+ requires less main memory
+ can be loaded fast into main memory
+ StAX processing is a standard
  → parsers from different companies
  → no dependency on a parser supplier
[Figure: document tree with the nodes numbered in document order: 1. doc; 2. customer (name "Meier") with 3. order, 4. item, 5. PC500, 6. address, 7. Isartor 22,…; 8. customer (name "Reich") with 9. order, 10. item, 11. PC600, 12. address, 13. Goldgrube 1]
Simple API for XML (SAX)
Simple API for XML (SAX)
SAX (Simple API for XML) is an event-based sequential access parser
API developed by the XML-DEV mailing list for XML documents. SAX
provides a mechanism for reading data from an XML document that is
an alternative to that provided by the Document Object Model (DOM).
Where the DOM operates on the document as a whole, SAX parsers
operate on each piece of the XML document sequentially.
[…]
Unlike DOM, there is no formal specification for SAX. The Java
implementation of SAX is considered to be normative. It is used for
state-independent processing of XML documents, in contrast to StAX
that processes the documents state-dependently.
from: http://en.wikipedia.org/wiki/Simple_API_for_XML
Simple API for XML (SAX)
1.        <doc>
2.          <customer name="Meier">
3.            <order>
4. 5.           <item>PC500</item>
              </order>
6.            <address>
7.              Isartor 22, …
              </address>
            </customer>
8.          <customer name="Reich">
9. – 13.      …
            </customer>
          </doc>
[Figure: document tree with the nodes numbered in document order: 1. doc; 2. customer (name "Meier") with 3. order, 4. item, 5. PC500, 6. address, 7. Isartor 22,…; 8. customer (name "Reich") with 9. order, 10. item, 11. PC600, 12. address, 13. Goldgrube 1]
• nodes are accessed in (textual) document order (1., 2., …, 13.)
• only one node of the XML document at a time in main memory
SAX-Parser-Java-API (1)
SAXParserFactory spf = SAXParserFactory.newInstance();       // generate JAXP SAXParserFactory
spf.setNamespaceAware(true);                                 // set namespaceAware to true
SAXParser saxParser = spf.newSAXParser();                    // generate JAXP SAXParser
XMLReader xmlReader = saxParser.getXMLReader();              // get handle to the embedded SAX XMLReader
xmlReader.setContentHandler(new SAXOut());                   // generate new SAX output stream for ContentHandler of XMLReader
xmlReader.setErrorHandler(new MyErrorHandler(System.err));   // setup ErrorHandler, before parsing starts
xmlReader.parse(filename);                                   // parse the XML file
SAX-Parser-Java-API (2)
// Parser calls this procedure once, at begin of parsing
public void startDocument() throws SAXException
{…}

// SAX parser calls this once for each start tag of an element
public void startElement( String namespaceURI,
        String localName, String qName, Attributes atts)
        throws SAXException { …
    for (int i = 0; i < atts.getLength(); i++) {       // for each attribute
        out.println( atts.getQName(i) +                // output attribute name and
            "=\"" + atts.getValue(i) + "\"");          // attribute value
    }
…}

// SAX parser calls this once for each end tag of an element
public void endElement( String namespaceURI,
        String localName, String qName)
        throws SAXException { … }

// SAX parser calls this once at end of parsing
public void endDocument() throws SAXException
{…}
SAX-Parser-Java-API (3)
// SAX parser calls this for each piece of text found in the XML document
// (a parser may deliver one text node in several calls)
public void characters(char[ ] ch, int start,
        int length) throws SAXException
{
    String text = new String (ch, start, length);
    text = text.trim();
    …
}
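The callbacks from the last three slides can be combined into a small runnable handler. This is a minimal sketch, not the `SAXOut` class from the slides: the class name `SaxItems`, the nested `ItemHandler` and the sample document are hypothetical. It extends the standard `DefaultHandler` and collects text in a buffer between start and end tag, since a parser is allowed to report one text node in several `characters` calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxItems {
    // Hypothetical sample document
    static final String XML =
        "<doc><customer name=\"Meier\"><order><item>PC500</item></order></customer>"
      + "<customer name=\"Reich\"><order><item>PC600</item></order></customer></doc>";

    // The parser drives this handler by calling its methods in document order
    static class ItemHandler extends DefaultHandler {
        final List<String> items = new ArrayList<>();
        private StringBuilder text;          // non-null only while inside <item>

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            if (qName.equals("item")) text = new StringBuilder();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) text.append(ch, start, length);   // may be called repeatedly
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("item")) {
                items.add(text.toString().trim());
                text = null;
            }
        }
    }

    // Parse the document and return the text of all item elements
    static List<String> items(String xml) {
        try {
            ItemHandler handler = new ItemHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.items;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(items(XML));
    }
}
```

The handler has to keep its own state (here the `text` buffer), because the parser does not remember which element is currently open on behalf of the application.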
Simple API for XML (SAX) - Pros and Cons
Parser accesses at most one XML element node at a time:
- program can only navigate in (textual) document order
- more difficult to process than DOM
+ requires less main memory
+ can be loaded fast into main memory
[Figure: document tree with the nodes numbered in document order: 1. doc; 2. customer (name "Meier") with 3. order, 4. item, 5. PC500, 6. address, 7. Isartor 22,…; 8. customer (name "Reich") with 9. order, 10. item, 11. PC600, 12. address, 13. Goldgrube 1]
Comparison StAX Parser ↔ SAX-Parser
Common aspects:
• event-oriented processing
• along the tag and text sequence in XML document order
Main difference:
• StAX-Parser: application drives parser
  → 'pull'-based parser
• SAX-Parser: parser drives application
  → 'push'-based parser
XML Path language XPath
XML Path Language (XPath)
XPath, the XML Path Language, is a query language for selecting
nodes from an XML document. In addition, XPath may be used to
compute values (e.g., strings, numbers, or Boolean values) from the
content of an XML document. XPath was defined by the World Wide
Web Consortium (W3C).
The XPath language is based on a tree representation of the XML
document, and provides the ability to navigate around the tree,
selecting nodes by a variety of criteria. In popular use (though not in
the official specification), an XPath expression is often referred to
simply as an XPath.
from: http://en.wikipedia.org/wiki/Xpath
XPath-Navigation within XML documents
[Figure: tree of the example document: doc with two customer elements (name "Meier" and name "Reich"), each with order and address children]
XPath examples:
child-axis:       /child::doc/child::customer/child::order       (short: /doc/customer/order)
attribute-axis:   /child::doc/child::customer/attribute::name    (short: /doc/customer/@name)
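The two example paths can be evaluated with the XPath 1.0 engine in `javax.xml.xpath`. A minimal sketch follows; the class name `XPathLongForm`, the helper `select` and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathLongForm {
    // Hypothetical sample document
    static final String XML =
        "<doc><customer name=\"Meier\"><order/><address/></customer>"
      + "<customer name=\"Reich\"><order/><address/></customer></doc>";

    // Evaluate an XPath expression against the sample document
    static NodeList select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // child axis: all order elements below customer elements below doc
        NodeList orders = select("/child::doc/child::customer/child::order");
        System.out.println(orders.getLength() + " order nodes");

        // attribute axis: the name attributes of all customers
        NodeList names = select("/child::doc/child::customer/attribute::name");
        for (int i = 0; i < names.getLength(); i++) {
            System.out.println(names.item(i).getNodeValue());
        }
    }
}
```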
XML Path language XPath (1)
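/                              document root
.                              current context node
/child::doc/child::customer    absolute path (starting at root)
./child::order/child::PC       relative path (starting at current context node)
Each part between slashes (e.g. child::customer) is a location step; the whole path is an XPath expression.
[Figure: tree of the example document: doc with two customer elements (name "Meier" and name "Reich"), each with order and address children]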
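XML: root element, document element, document root
/        document root (root of the document tree, parent of the root element)
<doc>    root element (also called document element)

<doc>
  <customer name="Meier">
    <order>
      <item>PC500</item>
    </order>
    <address>
      Isartor 22, …
    </address>
  </customer>
  <customer name="Reich">
    …
  </customer>
</doc>
[Figure: tree of the example document with the document root above the root element doc; doc has two customer elements (name "Meier" and name "Reich"), each with order and address children]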
XPath (2): Retrieval of XML data
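[Figure: tree of the example document: doc with two customer elements (name "Meier" and name "Reich"), each with order and address children]
XPath expression:
/ child::doc / child::customer [attribute::name="Meier"] / child::order
Each part between slashes is a location step; [attribute::name="Meier"] is a filter expression.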
XPath (3) – Location steps
XPath-Location-Expression ::=
      LocationStep1 / … / LocationStepN          (relative path)
    | / LocationStep1 / … / LocationStepN        (absolute path)
e.g. child::customer [attribute::name="Meier"] / descendant::order
LocationStepI ::=
      Axis-Specifier '::' NodeTest ( '[' FilterExpression ']' ) *
Examples (given in long form):
      child::customer [attribute::name="Meier"]
      descendant-or-self::address
XPath (4) - Axes in XML document trees (4)
All XML axes can be used in XPath expressions:

Name                  Semantics
child::               its child nodes
descendant::          its descendants (= children and their descendants)
parent::              the parent node (only the root does not have a parent)
ancestor::            nodes on the path to the root (= parent and its ancestors)
following-sibling::   nodes with identical parent, following in document order
                      (empty for attribute and namespace nodes)
preceding-sibling::   inverse to following-sibling
following::           all nodes following the context node in document order
                      (excluding descendant, attribute and namespace nodes)
preceding::           inverse to following
attribute::           its attributes (empty for each non-element node)
namespace::           its namespace nodes (empty for each non-element node)
XPath (5) - Axes in XML document trees (5)
More axes that can be used in XPath expressions:
The following axes select for a given context node:
• self::                 the context node itself
• descendant-or-self::   the context node and its descendants
• ancestor-or-self::     the context node and its ancestors
XPath (6): Node name tests
child::Ename
    selects all Ename nodes reachable via the child-axis from the context node;
    returns an empty set of nodes if context node has no Ename child
attribute::Aname
    selects the Aname attribute of the context node;
    returns an empty set of nodes if context node has no Aname attribute
axis-specifier::Ename
    selects only nodes with name Ename reachable from the context node
    through the specified axis
example:
descendant-or-self::customer
    selects the current context node if it is a customer node and all customer
    descendant nodes of the context node
XPath (7): Node tests
node()                node test is true for all nodes
text()                node test is true for all text nodes
attribute()           node test is true for all attribute nodes
root()                node test is true if current node is root
axis-specifier::*     selects all elements (or attributes) that are reachable
                      from the context node through the specified axis
XPath (8) – Filter expressions
Filter ::= '[' FilterExpression ']'
The standard includes more than the presented definitions:
simple filter expressions (SFE)
If L1 and L2 are LocationPaths and c is a constant value,
the following are simple filter expressions:
    L1
    L1 = c       L1 = L2
    L1 != c      L1 != L2
If SFE1 and SFE2 are simple filter expressions,
the following are simple filter expressions too:
    SFE1 and SFE2
    SFE1 or SFE2
    not(SFE1)
Exercise (1)
Which nodes are selected by the following XPath queries?
1. / descendant-or-self::order / descendant::*
2. / descendant::order / following::*
3. / descendant::item / preceding::*
4. /child::orders/child::order/child::customer/following-sibling::*/parent::*/child::*
5. / descendant-or-self::item / preceding::* / ancestor-or-self::*
6. / descendant::order[child::customer and child::item] / child::item
7. / descendant::order [child::customer='Meier'] / child::customer
8. / descendant::order [not(child::customer='Reich')] [child::item != 'PC600']
[Figure: example tree with numbered nodes: 1. orders; 2. order with children 3. customer ("Meier") and 4. item (PC500); 5. order with children 6. customer ("Reich") and 7. item (PC600)]
Short notation (1) : XPath query examples
/                   document root
.                   current context node
/ doc / customer    absolute path (starting at root)
. / order / PC      relative path (starting at current context node)
order / PC          relative path
[Figure: tree of the example document: doc with two customer elements (name "Meier" and name "Reich"), each with order and address children]
Short notation (2) : axis-specifier + node test
long notation                     short notation
child::customer                   customer
attribute::name                   @name
child::*                          *
attribute::*                      @*
parent::node()                    ..
self::node()                      .
/descendant-or-self::node()/      //
/descendant::customer             //customer
//customer = /descendant-or-self::node()/child::customer
           = /descendant::customer    (both select the same nodes)
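That a long form and its short form select the same nodes can be checked by evaluating both and comparing the results. The sketch below again uses `javax.xml.xpath`; the class name `XPathShortForm`, the helpers `select` and `sameNodes`, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathShortForm {
    // Hypothetical sample document
    static final String XML =
        "<doc><customer name=\"Meier\"><order/></customer>"
      + "<customer name=\"Reich\"><order/></customer></doc>";

    static final Document DOC = parse(XML);

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static NodeList select(String expression) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, DOC, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // True if both expressions select exactly the same nodes, in the same order
    static boolean sameNodes(String longForm, String shortForm) {
        NodeList a = select(longForm);
        NodeList b = select(shortForm);
        if (a.getLength() != b.getLength()) return false;
        for (int i = 0; i < a.getLength(); i++) {
            if (!a.item(i).isSameNode(b.item(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameNodes("/child::doc/child::customer/attribute::name",
                                     "/doc/customer/@name"));
        System.out.println(sameNodes("/descendant-or-self::node()/child::customer",
                                     "//customer"));
        System.out.println(sameNodes("/descendant::customer", "//customer"));
    }
}
```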
Short notation (3) – more XPath examples
short notation   description
/Elem1           absolute path: selects all child nodes Elem1 of the document root
//Elem1          absolute path: selects all descendant nodes Elem1 of the document root
.                self-axis step: selects current context node
Elem1            relative path: child element Elem1 of current context node
@size            attribute with the name size of the current context node
*                all child nodes of current context node that are of type Element
@*               all attribute nodes of current context node
..               parent node
../Elem          sibling node with tag name Elem (or context node, if it has the label Elem)
E1/E2            E2 child nodes of those E1 nodes that are child nodes of the current context node
.//Elem          selects all Elem descendants of current context node
Exercise (2)
Which nodes are selected by the following XPath queries?
1. / orders / order / item
2. // customer / .. / . / *
3. / orders / order [customer='Meier']
4. / orders / order [customer[.='Meier']/..]
5. // order [customer and item] / item
6. // orders // * //
[Figure: example tree with numbered nodes: 1. orders; 2. order with children 3. customer ("Meier") and 4. item (PC500); 5. order with children 6. customer ("Reich") and 7. item (PC600)]
XPath filter examples (1)
order [ PC ]                   only order nodes that have a PC child node
order [ not(PC) ]              only order nodes that have no PC child node
order [ PC='pc500' ]           only order nodes that have a PC child node
                               with a text value of 'pc500'
order [ PC != 'pc500' ]        only order nodes that have a PC child node
                               with a text value different from 'pc500'
order [ not(PC='pc500') ]      only order nodes that have no PC child node
                               with a text value of 'pc500'
(Non-negated) filter expressions are (implicitly) existentially (∃) quantified!
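The difference between `PC != 'pc500'` and `not(PC='pc500')` shows up as soon as an order has several PC children or none at all. The following sketch evaluates the filters with `javax.xml.xpath`; the class name `XPathFilters`, the helper `ids`, the `id` attributes and the sample document are hypothetical and only serve to identify the selected order nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathFilters {
    // Hypothetical data: order a has two PCs, order b has one, order c has none
    static final String XML =
        "<orders>"
      + "<order id=\"a\"><PC>pc500</PC><PC>pc600</PC></order>"
      + "<order id=\"b\"><PC>pc500</PC></order>"
      + "<order id=\"c\"/>"
      + "</orders>";

    // Return the id attributes of all elements selected by the expression
    static List<String> ids(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ids("//order[PC]"));                 // some PC child exists
        System.out.println(ids("//order[not(PC)]"));            // no PC child exists
        System.out.println(ids("//order[PC = 'pc500']"));       // some PC child equals pc500
        System.out.println(ids("//order[PC != 'pc500']"));      // some PC child differs from pc500
        System.out.println(ids("//order[not(PC = 'pc500')]"));  // no PC child equals pc500
    }
}
```

Order a satisfies both `PC = 'pc500'` and `PC != 'pc500'`, because each comparison only needs one matching PC child; order c, which has no PC child, satisfies neither but is selected by the negated filters.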
XPath filter examples (2): Nested Filters
// customer [ order [ PC='pc500' ] ]
    customers who order a 'pc500'
// customer [ order / PC ='pc500' ]
    as before
// customer [ order [ PC != 'pc500' ] ]
    customers who order a PC that is not a 'pc500'
// customer [ order / PC != 'pc500' ]
    as before
// customer [ order [ not(PC = 'pc500') ] ]
    customers who order something that is not a 'pc500' (may be no PC)
    (but they may order a 'pc500' too, however in a different order)
// customer [ not(order [ PC = 'pc500' ]) ]
    customers who don't order a 'pc500'
    (maybe they do not order anything)
// customer [ not(order / PC = 'pc500') ]
    as before
XPath filter examples (3): ∀-quantified queries
Customers who have paid each order: each order has the status 'paid'
{ customer | ∀ order child nodes: attribute status has the value 'paid' }
{ customer | not ∃ order child node: not (attribute status has the value 'paid') }
// customer [ not(order [ not(@status = 'paid') ]) ]
Customers who order a PC within every order:
(= whenever there is an order, this order has a PC child node)
{ customer | ∀ order child nodes: each has a PC child node }
{ customer | not ∃ order child node with not (it has a PC child node) }
// customer [ not(order [ not(PC) ]) ]
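The double negation can be tried out on a small document. The sketch below assumes a hypothetical data set with three customers (A has paid all orders, B has one open order, C has no order at all); the class name `XPathForAll` and the helper `names` are hypothetical as well.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathForAll {
    // Hypothetical data: A paid everything, B has an open order, C has no orders
    static final String XML =
        "<doc>"
      + "<customer name=\"A\"><order status=\"paid\"/><order status=\"paid\"/></customer>"
      + "<customer name=\"B\"><order status=\"paid\"/><order status=\"open\"/></customer>"
      + "<customer name=\"C\"/>"
      + "</doc>";

    // Return the name attributes of all elements selected by the expression
    static List<String> names(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "for all orders: paid" written as "there is no order that is not paid"
        System.out.println(names("//customer[not(order[not(@status = 'paid')])]"));
        // for comparison: "there is some paid order"
        System.out.println(names("//customer[order[@status = 'paid']]"));
    }
}
```

Customer C is selected by the universally quantified query, because a statement about all orders is trivially true when there are no orders.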
XPath filter examples (4): Position filters
First order of each customer
    // customer / order [ 1 ]
Each order of first customer
    // customer [ 1 ] / order
First paid order of each customer
    // customer / order [ @status = 'paid' ] [ 1 ]
First order of each customer, but only if this order is paid
(= first order of each customer who has paid his first order)
    // customer / order [ 1 ] [ @status = 'paid' ]
First order of any customer (whoever the customer parent will be)
    // order [ parent::customer ] [ 1 ]
First order, if its parent node is a customer
    // order [ 1 ] [ parent::customer ]
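The order of the filters matters, as a small experiment shows. In the sketch below, the class name `XPathPositions`, the helper `ids`, the `id` attributes and the sample data are hypothetical: customer Meier has an open order 1 followed by a paid order 2, customer Reich has a single paid order 3.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathPositions {
    // Hypothetical data: ids identify the orders in the results
    static final String XML =
        "<doc>"
      + "<customer name=\"Meier\"><order id=\"1\" status=\"open\"/>"
      + "<order id=\"2\" status=\"paid\"/></customer>"
      + "<customer name=\"Reich\"><order id=\"3\" status=\"paid\"/></customer>"
      + "</doc>";

    // Return the id attributes of all elements selected by the expression
    static List<String> ids(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ids("//customer/order[1]"));                     // first order of each customer
        System.out.println(ids("//customer[1]/order"));                     // each order of first customer
        System.out.println(ids("//customer/order[@status = 'paid'][1]"));   // first paid order
        System.out.println(ids("//customer/order[1][@status = 'paid']"));   // first order, if paid
    }
}
```

Each filter is applied to the node list left over by the filter before it, so `[1]` after the status filter counts only paid orders, while `[1]` before it counts all orders.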
Exercises (3) XPath and DTDs
1. /child::orders /child::order /child::customer /following-sibling::* /parent::* /child::*
2. / descendant-or-self::item / preceding::* / ancestor-or-self::*
3. / descendant-or-self::order / descendant::*
4. / orders / order / item
5. // customer / .. / . / *
6. / orders / order [customer='Meier']
7. / orders / order [customer [.='Meier'] / .. ]
8. // order [customer and item] / item
9. // *// order [customer='Meier'] / customer
10. // order [not(customer='Reich')] [item != 'PC600']
Simplify XPath expressions where possible
Simplify XPath expressions if DTD is:
<!ELEMENT orders (order*)>
<!ELEMENT order (customer,item)>
<!ELEMENT customer (#PCDATA)>
<!ELEMENT item (#PCDATA)>
[Figure: example tree with numbered nodes: 1. orders; 2. order with children 3. customer ("Meier") and 4. item (PC500); 5. order with children 6. customer ("Reich") and 7. item (PC600)]
Summary