node - University of Malta

XML Technologies
SAX and DOM
Dr Alexiei Dingli
What is SAX?
• Simple API for XML
• Used to parse XML
• But does not build an object model of the document by default
• It just fires events when it detects constructs such as
– open or close tags
– PCDATA or CDATA
– Comments
– entities
Example
• Imagine the following document:
<?xml version = "1.0"?>
<addressbook>
  <person>
    <lastname>Dingli</lastname>
    <firstname>Alexiei</firstname>
    <company>University of Malta</company>
    <email>[email protected]</email>
  </person>
</addressbook>
SAX in 3 steps
1. Creating a custom object model (like
Person and AddressBook classes)
2. Creating a SAX parser
3. Creating a DocumentHandler (to turn your
XML document into instances of your
custom object model).
Custom Object Model (1)
• Create both a Person and an AddressBook class
• Give each class its setters, getters and a to-XML method
Custom Object Model (2)
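A minimal sketch of such an object model in Java, assuming one field per child element of <person> in the example document. The class and method names (Person, AddressBook, toXml, addPerson) are illustrative assumptions, not taken from the original slide.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: builds one Person, adds it to an AddressBook, prints the XML.
class ObjectModelDemo {
    public static void main(String[] args) {
        Person p = new Person();
        p.setLastName("Dingli");
        p.setFirstName("Alexiei");
        p.setCompany("University of Malta");

        AddressBook book = new AddressBook();
        book.addPerson(p);
        System.out.println(book.toXml());
    }
}

// Hypothetical Person class: one field per child element of <person>.
class Person {
    private String lastName = "";
    private String firstName = "";
    private String company = "";
    private String email = "";

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getCompany() { return company; }
    public void setCompany(String company) { this.company = company; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    // Serialises the fields back to XML (this sketch does not escape &, < or >).
    public String toXml() {
        return "<person>"
             + "<lastname>" + lastName + "</lastname>"
             + "<firstname>" + firstName + "</firstname>"
             + "<company>" + company + "</company>"
             + "<email>" + email + "</email>"
             + "</person>";
    }
}

// Hypothetical AddressBook class: an ordered list of Person objects.
class AddressBook {
    private final List<Person> persons = new ArrayList<>();

    public void addPerson(Person p) { persons.add(p); }
    public List<Person> getPersons() { return persons; }

    public String toXml() {
        StringBuilder sb = new StringBuilder("<addressbook>");
        for (Person p : persons) {
            sb.append(p.toXml());
        }
        return sb.append("</addressbook>").toString();
    }
}
```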
Create a SAX parser
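One way to create a SAX parser in Java is through the JDK's JAXP factory. This is a sketch under that assumption and may differ from the code originally shown on this slide; the class name CreateSaxParser and the method isWellFormed are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CreateSaxParser {
    // Returns true if the XML text parses without a SAX error.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            // A plain DefaultHandler ignores the content events,
            // so this call only checks that the document can be parsed.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a problem with the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<addressbook><person/></addressbook>"));
        System.out.println(isWellFormed("<addressbook><person></addressbook>"));
    }
}
```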
Create a Document Handler (1)
• Actually 4 Interfaces ...
– The Document Handler
– The Entity Resolver
– The DTD Handler
– The Error Handler
Create a Document Handler (2)
Setting the parser
parser.setDocumentHandler( ... )
parser.setDTDHandler( ... )
parser.setErrorHandler( ... )
Handler Class
• Rather than implementing all the interfaces mentioned earlier
• Make use of org.xml.sax.helpers.DefaultHandler
• Which provides default implementations of all their methods
• And you simply override what you want to use
• http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/helpers/DefaultHandler.html
Example Handler
SAX Handler
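A possible handler for the address book document, sketched in Java. To stay self-contained it stores each person as a map from element name to text rather than as a custom Person object; the names AddressBookSaxDemo and AddressBookHandler are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AddressBookSaxDemo {
    // Overrides only the three events needed; everything else keeps the default behaviour.
    static class AddressBookHandler extends DefaultHandler {
        final List<Map<String, String>> persons = new ArrayList<>();
        private Map<String, String> current;                    // person being built, or null
        private final StringBuilder text = new StringBuilder(); // text of the current element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            if (qName.equals("person")) {
                current = new LinkedHashMap<>();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append rather than assign.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("person")) {
                persons.add(current);
                current = null;
            } else if (current != null) {
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static List<Map<String, String>> parse(String xml) {
        try {
            AddressBookHandler handler = new AddressBookHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.persons;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<addressbook><person><lastname>Dingli</lastname>"
                   + "<firstname>Alexiei</firstname>"
                   + "<company>University of Malta</company></person></addressbook>";
        System.out.println(parse(xml));
    }
}
```

No tree is ever built here: the handler keeps only the state it needs between events, which is the main difference from the DOM approach.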
DOM
• W3C standard
• Standard way of accessing and manipulating documents
• Divided into 3 parts
– Core DOM (access any structured document)
– XML DOM
– HTML DOM
• Presents the elements of a document as a tree structure
XML DOM
• A standard object model for XML
• A standard programming interface for XML
• Platform- and language-independent
• A W3C standard
• The XML DOM is a standard for how to get,
change, add, or delete XML elements
XML DOM rulez
Everything in XML is a node
– The entire document is a document node
– Every XML element is an element node
– The text inside an XML element is a text node
– Every attribute is an attribute node
– Comments are comment nodes
Example
<bookstore>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <year>2008</year>
  </book>
</bookstore>
• Bookstore is the root node
• It contains one book node
• A book node contains a title node and a year node
• Title contains a text node “Learning XML”
• 2008 is not the value of the year node but a text node inside the year node
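The last point can be checked directly. The sketch below uses the JDK's DOM parser in Java (the DOM interfaces are language-independent); the class name YearNodeDemo and the helper parse are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class YearNodeDemo {
    // Parses XML text into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<bookstore><book category=\"web\" cover=\"paperback\">"
                + "<title lang=\"en\">Learning XML</title><year>2008</year></book></bookstore>");
        Node year = doc.getElementsByTagName("year").item(0);

        System.out.println(year.getNodeName());                  // year
        System.out.println(year.getNodeValue());                 // null: an element has no value
        System.out.println(year.getFirstChild().getNodeName());  // #text
        System.out.println(year.getFirstChild().getNodeValue()); // 2008
    }
}
```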
The node tree
• Every DOM document has a node tree, in which
– The top node is called the root
– Every node, except the root, has exactly one
parent node
– A node can have any number of children
– A leaf is a node with no children
– Siblings are nodes with the same parent
Creating the XML
text="<bookstore>";
text=text+"<book>";
text=text+"<title>Everyday Italian</title>";
text=text+"<author>John Smith</author>";
text=text+"<year>2008</year>";
text=text+"</book>";
text=text+"</bookstore>";
Parsing the XML
try { // Internet Explorer
  xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async=false; // boolean false; the string "false" would count as true
  xmlDoc.loadXML(text);
} catch(e) {
  try { // Firefox, Mozilla, Opera, etc.
    parser=new DOMParser();
    xmlDoc=parser.parseFromString(text,"text/xml");
  } catch(e) {
    alert(e.message);
  }
}
document.write("xmlDoc is loaded, ready for use");
XML DOM Methods
• x.getElementsByTagName(name) - get all elements with a specified tag name
• x.appendChild(node) - append a child node to x
• x.removeChild(node) - remove a child node from x
XML DOM properties
• x.nodeName - the name of x
• x.nodeValue - the value of x
• x.parentNode - the parent node of x
• x.childNodes - the child nodes of x
• x.attributes - the attribute nodes of x
Examples
document.write(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("author")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("year")[0].childNodes[0].nodeValue);
Accessing nodes
1. By using the getElementsByTagName() method
2. By looping through (traversing) the node tree
3. By navigating the node tree, using the node relationships
Example 1
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
Example 2
x=xmlDoc.getElementsByTagName("title");
for ( i=0; i<x.length; i++) {
document.write(x[i].childNodes[0].nodeValue);
document.write("<br />");
}
Example 3
x=xmlDoc.getElementsByTagName("book")[0].childNodes;
y=xmlDoc.getElementsByTagName("book")[0].firstChild;
for (i=0;i<x.length;i++) {
  if (y.nodeType==1) { // Process only element nodes (type 1)
    document.write(y.nodeName + "<br />");
  }
  y=y.nextSibling;
}
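The nodeType check matters because an indented document also contains whitespace-only text nodes between the elements. A sketch of the same loop in Java with the JDK's DOM parser; the class ElementChildrenDemo and its helper methods are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ElementChildrenDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Walks the children via firstChild/nextSibling and keeps only element nodes.
    static List<String> elementChildNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse("<bookstore>\n  <book>\n    <title>Learning XML</title>\n"
                + "    <year>2008</year>\n  </book>\n</bookstore>");
        Node book = doc.getElementsByTagName("book").item(0);

        // Two elements plus three whitespace-only text nodes.
        System.out.println(book.getChildNodes().getLength()); // 5
        System.out.println(elementChildNames(book));          // [title, year]
    }
}
```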
Node properties
• nodeName
• nodeValue
• nodeType
nodeName property
• nodeName is read-only
• nodeName of an element node is the
same as the tag name
• nodeName of an attribute node is the
attribute name
• nodeName of a text node is always #text
• nodeName of the document node is
always #document
nodeValue property
• nodeValue for element nodes is null (the text of an element lives in its child text node)
• nodeValue for text nodes is the text itself
• nodeValue for attribute nodes is the attribute value
nodeType property
Node type    nodeType
Element      1
Attribute    2
Text         3
Comment      8
Document     9
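In Java these values are available as named constants on org.w3c.dom.Node, which reads better than comparing against bare numbers. A small sketch; the class NodeTypeDemo and its methods are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeTypeDemo {
    // Maps the numeric nodeType to the name used in the table above.
    static String typeName(Node n) {
        switch (n.getNodeType()) {
            case Node.ELEMENT_NODE:   return "Element";   // 1
            case Node.ATTRIBUTE_NODE: return "Attribute"; // 2
            case Node.TEXT_NODE:      return "Text";      // 3
            case Node.COMMENT_NODE:   return "Comment";   // 8
            case Node.DOCUMENT_NODE:  return "Document";  // 9
            default:                  return "Other (" + n.getNodeType() + ")";
        }
    }

    // Creates an empty document to build sample nodes from.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element book = doc.createElement("book");
        book.setAttribute("category", "web");

        System.out.println(typeName(doc));                               // Document
        System.out.println(typeName(book));                              // Element
        System.out.println(typeName(book.getAttributeNode("category"))); // Attribute
        System.out.println(typeName(doc.createTextNode("Learning XML"))); // Text
        System.out.println(typeName(doc.createComment("a comment")));    // Comment
    }
}
```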
Accessing node attributes
x=xmlDoc.getElementsByTagName("book")[0].attributes;
document.write(x.getNamedItem("category").nodeValue);
Traversing Example
// documentElement always represents the root element
x=xmlDoc.documentElement.childNodes;
for (i=0;i<x.length;i++) {
  document.write(x[i].nodeName);
  document.write(": ");
  document.write(x[i].childNodes[0].nodeValue);
  document.write("<br />");
}
Navigating Nodes (1)
• parentNode
• childNodes
• firstChild
• lastChild
• nextSibling
• previousSibling
Navigating Nodes (2)
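A short sketch of these relationships in Java, using the JDK's DOM parser on a bookstore document written without whitespace between tags so that every child is an element. The class name NavigateDemo is an assumption for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NavigateDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<bookstore><book><title>Learning XML</title>"
                + "<year>2008</year></book></bookstore>");
        Node title = doc.getElementsByTagName("title").item(0);
        Node book = title.getParentNode();

        System.out.println(book.getNodeName());                   // book
        System.out.println(book.getChildNodes().getLength());     // 2
        System.out.println(book.getFirstChild().getNodeName());   // title
        System.out.println(book.getLastChild().getNodeName());    // year
        System.out.println(title.getNextSibling().getNodeName()); // year
        System.out.println(title.getPreviousSibling());           // null: title is the first child
    }
}
```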
Getting the node value
x=xmlDoc.getElementsByTagName("title")[0];
y=x.childNodes[0];
txt=y.nodeValue;
Result = the name of the book
Title node > Text node
Setting the node value
x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];
x.nodeValue="Easy Cooking";
Removing Nodes
y=xmlDoc.getElementsByTagName("book")[0];
xmlDoc.documentElement.removeChild(y);
Or
y.parentNode.removeChild(y);
Creating nodes
newel=xmlDoc.createElement("edition");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newel);
Creating text nodes
newel=xmlDoc.createElement("edition");
newtext=xmlDoc.createTextNode("first");
newel.appendChild(newtext);
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newel);
Create CDATA nodes
newCDATA=xmlDoc.createCDATASection("Special Offer & Book Sale");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newCDATA);
Create Comment Node
newComment=xmlDoc.createComment("Revised March 2008");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newComment);
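The four kinds of node created on the preceding slides can be combined in one run. The sketch below does so in Java with the JDK's DOM classes and then lists the resulting children of book; the class CreateNodesDemo and its helpers are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CreateNodesDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Appends an element with text, a CDATA section and a comment to the first book.
    static Node buildBook() {
        Document doc = parse("<bookstore><book><title>Learning XML</title></book></bookstore>");
        Node book = doc.getElementsByTagName("book").item(0);

        Element edition = doc.createElement("edition");
        edition.appendChild(doc.createTextNode("first"));
        book.appendChild(edition);

        book.appendChild(doc.createCDATASection("Special Offer & Book Sale"));
        book.appendChild(doc.createComment("Revised March 2008"));
        return book;
    }

    // Collects the nodeName of every child, whatever its type.
    static List<String> childNames(Node parent) {
        List<String> names = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            names.add(kids.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(childNames(buildBook()));
        // [title, edition, #cdata-section, #comment]
    }
}
```

New nodes are always created by the document and only become part of the tree once they are appended to a parent.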
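Additional methods
x.appendChild(newNode) // add newNode as the last child of x
x.insertBefore(newNode,y) // insert newNode before child y of x
x.cloneNode(true) // copy x; true also copies all its descendants
x.insertData(offset,"Easy "); // insert text at offset (x must be a text node)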
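A combined sketch of these methods in Java with the JDK's DOM classes. The sample document and the names MoreMethodsDemo and demo are assumptions for illustration; note that insertData is called on the text node, not on the title element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class MoreMethodsDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static Document demo() {
        Document doc = parse("<bookstore><book><title>Italian</title>"
                + "<year>2008</year></book></bookstore>");
        Node book = doc.getElementsByTagName("book").item(0);

        // insertData: prepend text inside the title's text node.
        Text titleText = (Text) book.getFirstChild().getFirstChild();
        titleText.insertData(0, "Easy ");

        // insertBefore: place a new author element in front of year.
        Element author = doc.createElement("author");
        author.appendChild(doc.createTextNode("John Smith"));
        book.insertBefore(author, book.getLastChild());

        // cloneNode(true): deep copy of book, appended as a second book.
        doc.getDocumentElement().appendChild(book.cloneNode(true));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = demo();
        System.out.println(doc.getElementsByTagName("book").getLength()); // 2
        System.out.println(doc.getElementsByTagName("title").item(1)
                .getFirstChild().getNodeValue());                         // Easy Italian
    }
}
```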
Getting attribute value
x=xmlDoc.getElementsByTagName("title")[0].getAttributeNode("lang");
txt=x.nodeValue;
Creating attributes
newatt=xmlDoc.createAttribute("edition");
newatt.nodeValue="first";
x=xmlDoc.getElementsByTagName("title");
x[0].setAttributeNode(newatt);
Setting the attribute value
x=xmlDoc.getElementsByTagName("book");
x[0].setAttribute("category","food");
Or
x=xmlDoc.getElementsByTagName("book")[0];
y=x.getAttributeNode("category");
y.nodeValue="food";
Removing attributes
x=xmlDoc.getElementsByTagName("book");
x[0].removeAttribute("category");
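The attribute operations from the last few slides, sketched together in Java with the JDK's DOM classes. The class name AttributeDemo and the method demo are hypothetical; one Java-specific detail is that getAttribute returns an empty string, not null, for a missing attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static Element demo() {
        Document doc = parse("<bookstore><book category=\"web\" cover=\"paperback\">"
                + "<title lang=\"en\">Learning XML</title></book></bookstore>");
        Element book = (Element) doc.getElementsByTagName("book").item(0);

        // Set a value directly, then read it back through the attribute node.
        book.setAttribute("category", "food");
        System.out.println(book.getAttributeNode("category").getNodeValue()); // food

        // Create a new attribute node and attach it.
        Attr edition = doc.createAttribute("edition");
        edition.setValue("first");
        book.setAttributeNode(edition);

        // Remove an attribute by name.
        book.removeAttribute("cover");
        return book;
    }

    public static void main(String[] args) {
        Element book = demo();
        System.out.println(book.getAttribute("edition"));   // first
        System.out.println(book.hasAttribute("cover"));     // false
        System.out.println(book.getAttributes().getLength()); // 2
    }
}
```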
Exercise
• Given the following XML file
• How shall we display
– Two buttons
• “Get CD info” and display the Titles and the Composer
• “Get CD info abridged” and display the Titles only
Answer (1)
• The code
• What’s the result?
Answer (2)
Answer (3)
• Any resemblance to the assignment? ;o)
Questions?