CH 8
ELEMENT DECLARATIONS
Objective
Analyzing the document
ANY
#PCDATA
Child elements
Mixed content
Empty elements
Comments in DTDs
Analyzing the Document
The first step in creating a DTD for a particular document is to understand the structure of the information that you will encode.
Adding a DTD to this document enables you to enforce constraints on it.
When designing a new XML application:
Write some actual instance documents first,
Then design the DTD.
In a valid document, an element that has not been declared cannot be used.
http://www.cafeconleche.org/books/bible3/source/04/4-2.xml (link to XML)
Document Type Definitions cont…
The Elements in the Television Schedule
Element | Required Children | Optional Children
SCHEDULE | DATE, STATION(s) |
DATE | Text |
STATION | CHANNEL, SHOW(s) | NETWORK, CALL_LETTERS
SHOW | NAME, START_TIME(s), LENGTH | EPISODE_NUMBER, AIR_DATE, ORIGINAL_AIR_DATE, CLOSED_CAPTIONED, REPEAT, DESCRIPTION, TITLE, RATING, YEAR_MADE, STARS, DIRECTOR, WRITER, PRODUCER, CAST
CAST | ACTOR(s) |
ACTOR | | GIVEN_NAME, MIDDLE_NAME, MIDDLE_INITIAL, SURNAME
WRITER | | GIVEN_NAME, MIDDLE_NAME, …
Element Declarations
Element | Required Children | Optional Children
PRODUCER | |
DIRECTOR | |
NAME | Text |
TYPE | Text |
CALL_LETTERS | Text |
NETWORK | Text |
CHANNEL | Text |
EPISODE_NUMBER | Text |
START_TIME | Text |
LENGTH | Text |
AIR_DATE | Text |
ANY
The keyword ANY is a content specification indicating that there are no restrictions on the content of an element.
<!ELEMENT SCHEDULE ANY>
This declaration says that all declared elements, as well as plain text, can be children of the SCHEDULE element.
Because ANY is so unrestrictive, it lets you very
quickly create a DTD that will validate a document.
ANY cont…
<!ELEMENT SCHEDULE ANY>
<!ELEMENT DATE ANY>
<!ELEMENT STATION ANY>
<!ELEMENT NETWORK ANY>
<!ELEMENT CALL_LETTERS ANY>
<!ELEMENT CHANNEL ANY>
<!ELEMENT SHOW ANY>
<!ELEMENT NAME ANY>
<!ELEMENT TYPE ANY>
<!ELEMENT EPISODE_NUMBER ANY>
<!ELEMENT START_TIME ANY>
<!ELEMENT LENGTH ANY>
<!ELEMENT AIR_DATE ANY>
<!ELEMENT ORIGINAL_AIR_DATE ANY>
<!ELEMENT CLOSED_CAPTIONED ANY>
<!ELEMENT REPEAT ANY>
<!ELEMENT CAST ANY>
<!ELEMENT ACTOR ANY>
<!ELEMENT GIVEN_NAME ANY>
<!ELEMENT SURNAME ANY>
<!ELEMENT PRODUCER ANY>
<!ELEMENT DESCRIPTION ANY>
<!ELEMENT TITLE ANY>
<!ELEMENT MIDDLE_NAME ANY>
<!ELEMENT RATING ANY>
<!ELEMENT YEAR_MADE ANY>
<!ELEMENT STARS ANY>
<!ELEMENT DIRECTOR ANY>
<!ELEMENT WRITER ANY>
<!ELEMENT MIDDLE_INITIAL ANY>
ANY cont…
The DTD in the previous slide does not say very much.
It lists the elements that may be used, but it places no restrictions on where they may appear or what they may contain.
ANY Cont…
<?xml version="1.0"?>
<!DOCTYPE DATE SYSTEM "tvschedule.dtd">
<DATE>
July 3, 2003
<CAST>
<NETWORK>CBS</NETWORK>
<CALL_LETTERS>WCBS</CALL_LETTERS>
<CHANNEL>2</CHANNEL>
</CAST>
<SHOW>
Hollywood Squares
<START_TIME>19:00-0500</START_TIME>
</SHOW>
</DATE>
Figure 1: A Document That’s Valid According to the
DTD
ANY Cont…
<?xml version="1.0"?>
<!DOCTYPE ACTOR SYSTEM "tvschedule.dtd">
<ACTOR>
<NAME>
<GIVEN_NAME>Frank</GIVEN_NAME>
<SURNAME>Oz</SURNAME>
</NAME>
<ROLE>Yoda</ROLE>
<DATE>May 25, 1944</DATE>
</ACTOR>
Figure 2: A Document That’s Invalid According to the DTD (the ROLE element is not declared)
#PCDATA
When we want to specify that an element contains only text and no child elements, we use the keyword #PCDATA.
This keyword specifies that the element contains parsed character data, that is, any text that is not markup.
A literal less-than sign (<) or ampersand (&) must be written as &lt; or &amp;. The greater-than sign (>), single quote (') and double quote (") can normally appear literally in element content.
#PCDATA
<DATE>July 3, 2003</DATE>
<!ELEMENT DATE (#PCDATA)>
This declaration indicates that DATE contains text only.
No child element is allowed.
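As a minimal sketch of how parsed character data behaves, the following self-contained document declares its elements in an internal DTD subset and escapes the less-than sign and the ampersand, which a parser would otherwise read as markup. The NOTE element and its text are hypothetical and appear only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE SCHEDULE [
  <!ELEMENT SCHEDULE (DATE, NOTE)>
  <!-- Both elements hold text only; no child elements are allowed -->
  <!ELEMENT DATE (#PCDATA)>
  <!-- NOTE is a hypothetical element added for this sketch -->
  <!ELEMENT NOTE (#PCDATA)>
]>
<SCHEDULE>
  <DATE>July 3, 2003</DATE>
  <!-- The ampersand and less-than sign are escaped; quotes appear literally -->
  <NOTE>News &amp; weather, "live", for viewers aged &lt; 18</NOTE>
</SCHEDULE>
```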
#PCDATA cont…
<DATE>
<MONTH>July</MONTH>
<DAY>3</DAY>
<YEAR>2003</YEAR>
</DATE>
However, this DATE element is invalid under a (#PCDATA) declaration because it contains child elements.
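One way such a DATE could be made valid is to give it element content instead of #PCDATA. The sketch below assumes MONTH, DAY, and YEAR are each text-only elements; that content model is an assumption for illustration, not part of the television schedule DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE DATE [
  <!-- DATE now has element content instead of #PCDATA -->
  <!ELEMENT DATE (MONTH, DAY, YEAR)>
  <!-- Assumed for this sketch: each part holds text only -->
  <!ELEMENT MONTH (#PCDATA)>
  <!ELEMENT DAY (#PCDATA)>
  <!ELEMENT YEAR (#PCDATA)>
]>
<DATE>
  <MONTH>July</MONTH>
  <DAY>3</DAY>
  <YEAR>2003</YEAR>
</DATE>
```

With this declaration the plain-text form, <DATE>July 3, 2003</DATE>, would in turn become invalid, since an element has exactly one content model.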
Child Elements
The first child element of SCHEDULE is DATE.
To declare that a SCHEDULE must have a DATE, we simply list DATE inside a pair of parentheses:
<!ELEMENT SCHEDULE (DATE)>
This means that each SCHEDULE element must contain exactly one DATE child element and nothing else.
Child Elements cont..
<!ELEMENT SCHEDULE (DATE, STATION,
STATION, STATION)>
This kind of declaration is called a sequence.
The above means that each SCHEDULE element must contain exactly one DATE child element, followed by exactly three STATION elements.
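To make the sequence concrete, here is a small instance document that should satisfy that declaration. It is only a sketch: STATION is simplified to text-only content, and the call letters other than WCBS are sample values.

```xml
<?xml version="1.0"?>
<!DOCTYPE SCHEDULE [
  <!ELEMENT SCHEDULE (DATE, STATION, STATION, STATION)>
  <!ELEMENT DATE (#PCDATA)>
  <!-- Simplified for this sketch: STATION holds text only -->
  <!ELEMENT STATION (#PCDATA)>
]>
<SCHEDULE>
  <DATE>July 3, 2003</DATE>
  <STATION>WCBS</STATION>
  <STATION>WNBC</STATION>
  <STATION>WABC</STATION>
</SCHEDULE>
```

Moving DATE after a STATION, or adding a fourth STATION, would make the document invalid, because a sequence fixes both the order and the number of children.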
Child Elements cont..
Each element must be declared in its own <!ELEMENT> declaration, exactly once.
Child Elements cont..
+ One or More Children
<!ELEMENT SCHEDULE (DATE, STATION+)>
Use a plus sign (+) after the element name in the child list.
The above example means that a SCHEDULE element has one DATE followed by one or more STATION elements.
Child Elements cont..
? Zero or One Child
We can indicate that a child is optional in a sequence, that is, it can appear once or not at all, by suffixing its name with a question mark (?).
<!ELEMENT ACTOR (GIVEN_NAME?,
MIDDLE_NAME?,
MIDDLE_INITIAL?, SURNAME?)>
<!ELEMENT WRITER (GIVEN_NAME?,
MIDDLE_NAME?,
MIDDLE_INITIAL?, SURNAME?)>
<!ELEMENT PRODUCER (GIVEN_NAME?,
MIDDLE_NAME?,
MIDDLE_INITIAL?, SURNAME?)>
Child Elements cont..
<!ELEMENT STATION (NETWORK?,
CALL_LETTERS?, CHANNEL, SHOW+)>
This indicates that each STATION has, in this order:
An optional NETWORK,
An optional CALL_LETTERS,
Exactly one CHANNEL, and
One or more SHOW elements
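The following sketch shows two STATION elements that should both be valid against that declaration: one with every optional child and one with none. SHOW is simplified to text-only content, and the show titles other than Hollywood Squares are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE SCHEDULE [
  <!ELEMENT SCHEDULE (DATE, STATION+)>
  <!ELEMENT DATE (#PCDATA)>
  <!ELEMENT STATION (NETWORK?, CALL_LETTERS?, CHANNEL, SHOW+)>
  <!ELEMENT NETWORK (#PCDATA)>
  <!ELEMENT CALL_LETTERS (#PCDATA)>
  <!ELEMENT CHANNEL (#PCDATA)>
  <!-- Simplified for this sketch: SHOW holds text only -->
  <!ELEMENT SHOW (#PCDATA)>
]>
<SCHEDULE>
  <DATE>July 3, 2003</DATE>
  <!-- Both optional children present, two SHOW elements -->
  <STATION>
    <NETWORK>CBS</NETWORK>
    <CALL_LETTERS>WCBS</CALL_LETTERS>
    <CHANNEL>2</CHANNEL>
    <SHOW>Hollywood Squares</SHOW>
    <SHOW>Evening News</SHOW>
  </STATION>
  <!-- Optional children omitted: only CHANNEL and one SHOW remain -->
  <STATION>
    <CHANNEL>4</CHANNEL>
    <SHOW>Late Movie</SHOW>
  </STATION>
</SCHEDULE>
```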
Child Elements cont..
* Zero or More Children
An asterisk (*) means that a child can appear zero or more times.
It can be used for middle names or middle initials.
<!ELEMENT ACTOR (GIVEN_NAME?, MIDDLE_NAME*,
MIDDLE_INITIAL*, SURNAME?)>
<!ELEMENT WRITER (GIVEN_NAME?, MIDDLE_NAME*,
MIDDLE_INITIAL*, SURNAME?)>
<!ELEMENT PRODUCER (GIVEN_NAME?, MIDDLE_NAME*,
MIDDLE_INITIAL*, SURNAME?)>
<!ELEMENT DIRECTOR (GIVEN_NAME?, MIDDLE_NAME*,
MIDDLE_INITIAL*, SURNAME?)>
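As an illustration of the asterisk, the sketch below places two ACTOR elements in a CAST: one with no middle name and one with two. The CAST content model (ACTOR+) follows the element table, and the second actor's name is invented for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE CAST [
  <!ELEMENT CAST (ACTOR+)>
  <!ELEMENT ACTOR (GIVEN_NAME?, MIDDLE_NAME*, MIDDLE_INITIAL*, SURNAME?)>
  <!ELEMENT GIVEN_NAME (#PCDATA)>
  <!ELEMENT MIDDLE_NAME (#PCDATA)>
  <!ELEMENT MIDDLE_INITIAL (#PCDATA)>
  <!ELEMENT SURNAME (#PCDATA)>
]>
<CAST>
  <!-- Zero middle names -->
  <ACTOR>
    <GIVEN_NAME>Frank</GIVEN_NAME>
    <SURNAME>Oz</SURNAME>
  </ACTOR>
  <!-- Two middle names (invented name) -->
  <ACTOR>
    <GIVEN_NAME>Ann</GIVEN_NAME>
    <MIDDLE_NAME>Marie</MIDDLE_NAME>
    <MIDDLE_NAME>Louise</MIDDLE_NAME>
    <SURNAME>Example</SURNAME>
  </ACTOR>
</CAST>
```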
Child Elements cont..
<!ELEMENT SHOW (NAME, TYPE?,
EPISODE_NUMBER?, START_TIME+, LENGTH,
AIR_DATE, ORIGINAL_AIR_DATE?,
CLOSED_CAPTIONED?, REPEAT?, RATING?,
STARS?, DIRECTOR*, WRITER*, CAST?,
PRODUCER*, DESCRIPTION)>
Some potential child elements of SHOW, including PRODUCER, DIRECTOR, and WRITER, can appear once, several times, or not at all.
Child Elements cont..
Choices
<!ELEMENT PAYMENT (CASH | CREDIT_CARD)>
We use a vertical bar (|) rather than a comma (,) in the parent element declaration to indicate a choice.
The PAYMENT element must have a single child element, either CASH or CREDIT_CARD.
This sort of content specification is called a choice.
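A short sketch of a document that uses this choice follows. The content models chosen for CASH and CREDIT_CARD are assumptions made only so the example is complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE PAYMENT [
  <!ELEMENT PAYMENT (CASH | CREDIT_CARD)>
  <!-- Content models below are assumptions for this sketch -->
  <!ELEMENT CASH EMPTY>
  <!ELEMENT CREDIT_CARD (#PCDATA)>
]>
<PAYMENT>
  <CASH/>
</PAYMENT>
```

A PAYMENT that contained both CASH and CREDIT_CARD, or neither of them, would be invalid.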
Child Elements cont..
Parentheses
Each set of parentheses combines several elements
so that the combination is treated as a single unit
when validating
This parenthesized unit can then be nested inside
other parentheses in place of a single element
You can then affix a plus sign, an asterisk, or a
question mark to it
Child Elements cont..
Parentheses cont …
Both choices and sequences appear in parentheses
These parentheses can also have +, *, or ?
quantifiers suffixed to them
<!ELEMENT ACTOR (GIVEN_NAME|
MIDDLE_NAME | MIDDLE_INITIAL |
SURNAME )+ >
This declaration says that an ACTOR element can have one or more GIVEN_NAME, MIDDLE_NAME, MIDDLE_INITIAL, or SURNAME child elements, in any order.
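Parenthesized groups can also be nested inside sequences and choices. The sketch below is one possible illustration: CREDITS and FULL_NAME are hypothetical elements, and the content models are simplified rather than taken from the television schedule DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE CREDITS [
  <!-- One PRODUCER, then one or more (DIRECTOR, WRITER) pairs -->
  <!ELEMENT CREDITS (PRODUCER, (DIRECTOR, WRITER)+)>
  <!ELEMENT PRODUCER (#PCDATA)>
  <!ELEMENT DIRECTOR (#PCDATA)>
  <!-- A WRITER is either a FULL_NAME or a name split into parts -->
  <!ELEMENT WRITER (FULL_NAME | (GIVEN_NAME, SURNAME))>
  <!ELEMENT FULL_NAME (#PCDATA)>
  <!ELEMENT GIVEN_NAME (#PCDATA)>
  <!ELEMENT SURNAME (#PCDATA)>
]>
<CREDITS>
  <PRODUCER>Producer One</PRODUCER>
  <DIRECTOR>Director One</DIRECTOR>
  <WRITER><FULL_NAME>Writer One</FULL_NAME></WRITER>
  <DIRECTOR>Director Two</DIRECTOR>
  <WRITER>
    <GIVEN_NAME>Writer</GIVEN_NAME>
    <SURNAME>Two</SURNAME>
  </WRITER>
</CREDITS>
```

The + applies to the whole (DIRECTOR, WRITER) group, so a DIRECTOR without a following WRITER would be invalid.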
Mixed Content
You can declare elements that contain both child elements and character data. This is called mixed content.
<!ELEMENT CAST (#PCDATA | ACTOR)*>
The above declaration allows each CAST to include text as well as ACTOR child elements.
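As a sketch of mixed content in use, the document below interleaves text and ACTOR elements inside CAST. In a mixed-content declaration, #PCDATA must be listed first and the group must end with an asterisk. ACTOR is simplified to text-only content here, and the sentence itself is sample text.

```xml
<?xml version="1.0"?>
<!DOCTYPE CAST [
  <!-- Mixed content: #PCDATA comes first and the group ends with * -->
  <!ELEMENT CAST (#PCDATA | ACTOR)*>
  <!-- Simplified for this sketch: ACTOR holds text only -->
  <!ELEMENT ACTOR (#PCDATA)>
]>
<CAST>
  Featuring <ACTOR>Frank Oz</ACTOR> as Yoda, with
  <ACTOR>Mark Hamill</ACTOR> as Luke Skywalker.
</CAST>
```

Mixed content cannot constrain the order or number of the child elements; it only limits which elements may appear among the text.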
Empty Elements
An empty element is an element with no content; it is declared with the keyword EMPTY.
<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (TITLE,
SIGNATURE)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT COPYRIGHT (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
<!ELEMENT BR EMPTY>
<!ELEMENT HR EMPTY>
<!ELEMENT LAST_MODIFIED (#PCDATA)>
<!ELEMENT SIGNATURE (HR, COPYRIGHT,
BR, EMAIL,
BR, LAST_MODIFIED)>
]>
<DOCUMENT>
<TITLE>Empty-element Tags</TITLE>
<SIGNATURE>
<HR/>
<COPYRIGHT>2003 Elliotte Rusty
Harold</COPYRIGHT><BR/>
<EMAIL>[email protected]</EMAIL><BR/>
<LAST_MODIFIED>Wednesday, December 3,
2003</LAST_MODIFIED>
</SIGNATURE>
</DOCUMENT>
Comments in DTDs
DTDs can contain comments, just like the rest of an XML document.
These comments cannot appear inside a declaration, but they can appear outside one.
Comments are often used to organize the DTD into different parts.
Comments are only for the benefit of people reading the source code.
XML processors ignore them.
Comments in DTDs cont…
<!-- A date in the form Month Day, Year
The year is always written with four digits. -->
<!ELEMENT DATE (#PCDATA)>
DTDs often use comments to indicate
Who wrote the DTD
Copyright for the DTD
Usage conditions
Usage instructions
Customary PUBLIC and SYSTEM identifiers
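A header comment covering these items might look like the sketch below. Every value in it is a placeholder, and the SYSTEM identifier simply reuses the tvschedule.dtd file name from the earlier examples.

```xml
<?xml version="1.0"?>
<!DOCTYPE DATE [
  <!--
    Television schedule DTD (sketch)
    Author:    placeholder name
    Copyright: placeholder notice
    Usage:     placeholder conditions and instructions
    Customary SYSTEM identifier: tvschedule.dtd
  -->

  <!-- A date in the form Month Day, Year -->
  <!ELEMENT DATE (#PCDATA)>
]>
<DATE>July 3, 2003</DATE>
```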