Semi-Structured Data and the Web
Topics: Querying XML Documents: XPath
Prof. Dr. Slim Abdennadher
© S. Abdennadher

Querying XML Documents
• Querying XML data means to:
  – identify nodes
  – test further properties of these nodes
  – operate on the matches
  – construct the result as XML documents
• For XML, XQuery plays the role that SQL plays in relational databases:
  – XPath is an embedded sublanguage used to locate and test nodes
  – XQuery iterates over the selected parts, operates on them, and constructs answers

XPath Navigation

XPath
• XPath is a syntax for addressing parts of an XML document.
• XPath uses path expressions to navigate in XML documents.
• XPath is based on the Unix directory notation:
  – In a Unix directory tree: /slim/Slides/CSEN604/Lecture6
  – In an XML tree: /bib/book/year
• XPath was specified by the W3C in 1999: http://www.w3.org/TR/xpath
• XPath is a building block for other W3C standards:
  – XSL Transformations: XSLT
  – XML Link: XLink
  – XML Pointer: XPointer
  – XML Query: XQuery

Example of XPath Queries
    <bib>
      <book>
        <publisher> Springer </publisher>
        <author> Thom Fruehwirth </author>
        <author>
          <first-name> Slim </first-name>
          <last-name> Abdennadher </last-name>
        </author>
        <title> Essentials of Constraint Programming </title>
        <year> 2001 </year>
      </book>
      <book price='55'>
        <publisher> Freeman </publisher>
        <author> Jeffrey D. Ullman </author>
        <title> Principles of Database and Knowledge Base Systems </title>
        <year> 1998 </year>
      </book>
    </bib>

Selecting Nodes – XPath Examples
• Query: All year elements that are direct subelements of book:
    /bib/book/year
• Result:
    <year> 2001 </year>
    <year> 1998 </year>
• Query: All year elements that are direct subelements of paper:
    /bib/paper/year
• Result: empty (there are no paper elements in the document)
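The two queries on this slide can be tried out with Python's standard library. This is only a minimal sketch, assuming xml.etree.ElementTree, which supports a limited subset of XPath and evaluates paths relative to an element. For that reason /bib/book/year is written as ./book/year starting from the bib root. Whitespace around the text values is trimmed for readability.

```python
import xml.etree.ElementTree as ET

# The example document from the slides (whitespace around values trimmed).
BIB = """<bib>
  <book>
    <publisher>Springer</publisher>
    <author>Thom Fruehwirth</author>
    <author><first-name>Slim</first-name><last-name>Abdennadher</last-name></author>
    <title>Essentials of Constraint Programming</title>
    <year>2001</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

root = ET.fromstring(BIB)  # root is the <bib> element

# Counterpart of /bib/book/year, written relative to the <bib> root.
years = [y.text for y in root.findall("./book/year")]

# Counterpart of /bib/paper/year: there are no <paper> elements.
papers = root.findall("./paper/year")

print(years)   # ['2001', '1998']
print(papers)  # []
```

A full XPath 1.0 engine would accept the absolute form /bib/book/year directly; the relative spelling here is a limitation of the assumed library, not of XPath.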

Selecting Nodes – XPath Examples
• Query: All author elements in the current document:
    //author
• Result:
    <author> Thom Fruehwirth </author>
    <author>
      <first-name> Slim </first-name>
      <last-name> Abdennadher </last-name>
    </author>
    <author> Jeffrey D. Ullman </author>
• Query: All first-name elements that are descendants of the bib element, no matter where they are under bib:
    /bib//first-name
• Result:
    <first-name> Slim </first-name>

Selecting Nodes – Attribute Nodes
• Query: Find the prices of all books:
    /bib/book/@price
• Result: '55'
• @price means that price must be an attribute.

Selecting Nodes
    Expression   Description
    nodename     Selects the child elements named nodename of the current node
    /            Selects from the root node
    //           Selects nodes in the document from the current node that match the selection, no matter where they are
    .            Selects the current node
    ..           Selects the parent of the current node
    @            Selects attributes
Note that a path starting with a slash (/) is always an absolute path, evaluated from the root of the document.
• bib: selects the bib child elements of the current node.
• /bib: selects the root element bib.

Selecting Unknown Nodes – XPath Examples
• Query: Select all child elements of all author elements:
    //author/*
• Result:
    <first-name> Slim </first-name>
    <last-name> Abdennadher </last-name>
• Query: Select all book elements that have at least one attribute:
    //book[@*]
• Result:
    <book price='55'>
      <publisher> Freeman </publisher>
      <author> Jeffrey D. Ullman </author>
      <title> Principles of Database and Knowledge Base Systems </title>
      <year> 1998 </year>
    </book>

Selecting Unknown Nodes
XPath wildcards can be used to select unknown XML elements.
    Wildcard   Description
    *          Matches any element node
    @*         Matches any attribute node
    node()     Matches any node of any kind

XPath – Functions
• Query:
    /bib/book/author/text()
• Result:
    Thom Fruehwirth
    Jeffrey D. Ullman
• Slim Abdennadher does not appear because his name is stored in first-name and last-name child elements, not as text directly inside author.
• Functions in XPath:
  – text(): matches text nodes
  – node(): matches any node (corresponds to * or @* or text())
  – name(): returns the name of the current node

Selecting Specific Nodes – Qualifiers
• Predicates are used to find a specific node or a node that contains a specific value.
• Predicates are always enclosed in square brackets.
• Query: Select the first book element that is a child of the bib element:
    /bib/book[1]
• Query: Select the first three book elements that are children of the bib element:
    /bib/book[position()<4]
• Query: Select all the book elements that have a price attribute with a value greater than 50:
    /bib/book[@price>'50']
• Query: Select all the title elements of the book elements of the bib element that have a price attribute with a value greater than 50:
    /bib/book[@price>'50']/title

More on Qualifiers
• Query: Select all the author elements that have a first-name element:
    /bib/book/author[first-name]
• Result:
    <author>
      <first-name> Slim </first-name>
      <last-name> Abdennadher </last-name>
    </author>
• XPath expressions in conditions have existential semantics:
  – The truth value of an XPath expression is true if its result set is not empty.
• XPath expressions in conditions are not limited to simple properties of an object; they are path expressions that are evaluated with respect to the current context node.
• Example:
    //country[.//city/name='Cairo']/name
  returns the names of all countries in which a city named Cairo is located.

More on Qualifiers (continued)
• Note that in conditions .//city and //city are different.
• Example:
    //country[//city/name='Cairo']/name
  returns the names of all countries, provided there is some city named Cairo anywhere in the document.
• Query: Select all the book elements that were written by an author who is younger than 25:
    /bib/book[author/@age < '25']
• When an element is compared with a value, its text content is used implicitly. For an element that contains only text,
    //country[name = 'Egypt']
  is equivalent to
    //country[name/text() = 'Egypt']
• The Boolean connectives and, or, and not() can be used in conditions.

Absolute and Relative Path
• Paths that start with a name are relative paths that are evaluated against the current context node, for example name inside the predicate of:
    //country[name = 'Egypt']
• Paths that start with / or // are absolute paths.
• The | operator combines several paths in one XPath expression.
• Query: Select all the title and publisher elements of all book elements:
    //book/title | //book/publisher

XPath Navigation
• Starting from the current node, it is possible to navigate in an XML tree in several directions.
• Navigation can be done along 13 axes:
    ancestor             ancestor-or-self     attribute
    child                descendant           descendant-or-self
    following            following-sibling    namespace
    parent               preceding            preceding-sibling
    self
• We have only seen child, descendant, attribute, and parent so far.

Navigation Path
• A navigation path is of the form: /step/step/...
• The result of each step is a set of nodes that serves as input for the next step.
• A step is of the form axisname::nodetest[predicate] and consists of:
  – an axis: defines the tree relationship between the selected nodes and the current node
  – a node test: identifies the nodes to select within that axis
  – zero or more predicates: further refine the selected node set

Navigation Path – Examples
• Query: child::book
• Meaning: Selects all book nodes that are children of the current node. It corresponds to book.
• Query: attribute::price
• Meaning: Selects the price attribute of the current node. It corresponds to @price.
• Query: child::*
• Meaning: Selects all element children of the current node.
• Query: child::text()
• Meaning: Selects all text child nodes of the current node.
• Query: descendant::book
• Meaning: Selects all book descendants of the current node.
• Query: ancestor-or-self::book
• Meaning: Selects all book ancestors of the current node, and the current node itself if it is a book node.
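
The meaning of these axes can be illustrated by emulating a few of them with plain tree traversal. This is a rough sketch, not an XPath engine: it assumes Python's xml.etree.ElementTree, which does not accept the axisname:: syntax, and the helper names child_axis, descendant_axis, and ancestor_or_self_axis are hypothetical, introduced only for illustration. The document is a shortened version of the bib example.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib><book price='55'>"
    "<title>Principles of Database and Knowledge Base Systems</title>"
    "<year>1998</year>"
    "</book></bib>"
)

# ElementTree elements do not store their parent, so build a lookup table.
parent_of = {child: parent for parent in bib.iter() for child in parent}

def child_axis(node, name):
    """Emulates child::name: direct children with the given tag."""
    return [c for c in node if c.tag == name]

def descendant_axis(node, name):
    """Emulates descendant::name: matches at any depth, excluding node itself."""
    return [d for d in node.iter(name) if d is not node]

def ancestor_or_self_axis(node, name):
    """Emulates ancestor-or-self::name, listed from the node upward."""
    found = []
    while node is not None:
        if node.tag == name:
            found.append(node)
        node = parent_of.get(node)  # None once the root has been passed
    return found

book = child_axis(bib, "book")[0]       # child::book from bib
year = descendant_axis(bib, "year")[0]  # descendant::year from bib

print(book.get("price"))                                      # attribute::price -> 55
print([e.tag for e in ancestor_or_self_axis(year, "book")])   # ['book']
print([e.tag for e in ancestor_or_self_axis(book, "book")])   # ['book'] (the node itself)
print(descendant_axis(book, "book"))                          # [] (self is excluded)
```

The last two lines show the difference between the two kinds of axis: ancestor-or-self includes the current node when it passes the node test, whereas descendant never does.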