A Practical Introduction to XML in Libraries
Marty Kurth
NYLA
October 22, 2004

What we’ll cover
  A functional overview of XML
  One use of XML at Cornell
  How our MARC to XML converter works
  What a Dublin Core XML record looks like to our users
  Concluding thoughts about XML opportunities

A functional overview of XML
  (many thanks to David Ruddy for the source content of this section)

XML = Extensible Markup Language
  A markup language gives meaning to special characters or character sequences, a.k.a. markup delimiters
  In XML, markup delimiters form rules for content designation (hold that thought!)
  In XML, markup delimiters have no inherent meaning (allowing them to serve as a flexible, extensible metalanguage)
  XML uses plain text, is non-proprietary, and is platform and software independent

HTML versus XML
  HTML
    Procedural markup
    Rules govern display (fonts, layout)
    Doesn’t understand content
  XML
    Structural markup
    Rules establish relationships among content components
    Doesn’t control display

A brief detour into metadata: Two ways to designate content
  In MARC: 245 04 $a The Big heat
  In XML: <title>Big heat</title>

<name>value</name>
  In XML the name-value pair comprises an element
  An element has these parts:
    Start tag
    Element content
    End tag
  <tag>content</tag>
  <subject>Goldfinches</subject>

Element rules and features
  Elements can hold data
    <pubPlace>Boston</pubPlace>
  Elements can hold other elements ad infinitum
    <sourceDesc>
      <biblFull>
        <titleStmt>
          <title>A letter to Orestes A. Brownson</title>
          <author>Hildreth, Richard, 1807-1865.</author>
        </titleStmt>
      </biblFull>
    </sourceDesc>
  Elements must be “properly” nested

A quick look at other XML entities
  Attributes qualify elements
    <note type="500">Caption title.</note>
  Document Type Definitions (DTDs) control the structure of XML documents
    <!ELEMENT note (#PCDATA)>
    <!ATTLIST note type CDATA #IMPLIED>
  XML Schemas give more control than DTDs
    <xs:element ref="note" />
  Extensible Stylesheet Language Transformation (XSLT) stylesheets transform one XML document into another (or into HTML)

What does XML allow us to do?
  Structure data with a flexible and extensible set of rules
  Share data in a non-proprietary format, especially among “incompatible” systems
  Reuse data, e.g., in different presentation formats for different purposes

One use of XML at Cornell

A local reason for moving MARC data to XML
  CUL decided to use ENCompass for access to networked resources
  ENCompass requires XML records
  Our records for e-resources are in MARC, so we needed to get them into XML

Using MARCXML
  MARCXML is lossless: it preserves the richness of the MARC record in XML
  LC offers a toolkit for converting MARC to MARCXML at http://www.loc.gov/standards/marcxml/
  MARCXML can serve as a “bus” between MARC and other XML formats

The MARCXML “bus”

Adapting MARCXML tools
  We implemented LC’s converter to convert MARC to Dublin Core in XML
  We created a Web interface for systemwide access
  We extended LC’s Dublin Core XSLT stylesheet

How our MARC to XML converter works

Start with a MARC record

Import it into the converter

    <xsl:for-each select="marc:datafield[@tag=245]">
      <!-- Build the title from 245 subfields a, b, f, g, and p -->
      <xsl:variable name="title">
        <xsl:value-of select="marc:subfield[@code='a']"/><xsl:text> </xsl:text>
        <xsl:value-of select="marc:subfield[@code='b']"/><xsl:text> </xsl:text>
        <xsl:value-of select="marc:subfield[@code='f']"/><xsl:text> </xsl:text>
        <xsl:value-of select="marc:subfield[@code='g']"/><xsl:text> </xsl:text>
        <xsl:value-of select="marc:subfield[@code='p']"/>
      </xsl:variable>
      <xsl:variable name="cleanTitle">
        <xsl:call-template name="clean">
          <xsl:with-param name="toClean" select="normalize-space($title)" />
        </xsl:call-template>
      </xsl:variable>
      <xsl:choose>
        <!-- When the second indicator is greater than 0, skip that many nonfiling characters -->
        <xsl:when test="@ind2 > 0">
          <title>
            <xsl:call-template name="uppercase">
              <xsl:with-param name="toUppercase" select="substring($cleanTitle, @ind2 + 1)" />
            </xsl:call-template>
          </title>
        </xsl:when>
        <xsl:otherwise>
          <title><xsl:value-of select="$cleanTitle" /></title>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>

The converter applies our DC stylesheet

And outputs a Dublin Core XML record

    <QDCrecord>
      <title>Harmonized tariff schedule of the United States </title>
      <alternative>HTS</alternative>
      <creator>United States.</creator>
      <contributor>United States International Trade Commission. Office of Tariff Affairs and Trade Agreements.</contributor>
      <type>Full text</type>
      <publisher>The Commission :</publisher>
      <date>[1987-</date>
      <description>HTSA provides the applicable tariff rates and statistical categories for all merchandise imported into the United States; it is based on the international Harmonized System, the global classification system that is used to describe most world trade in goods.</description>
      <subject type="LCSH">Tariff--Law and legislation--United States-Periodicals.</subject>
      <subject>Education</subject>
    </QDCrecord>

What a DC XML record looks like to our users
  The DC XML record is in our Find Databases system
  Users can view the DC record in a labeled display
  The DC XML is behind the labeled display

Concluding thoughts about XML opportunities
  When XML knocks on your door:
    You can pick up XML encoding quickly
    With a little up-front IT time and XSLT skills, you can convert MARC to XML
    With XSLT skills, you can modify user displays in XML-based delivery systems
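
For readers who want to try the title rule from the stylesheet excerpt without an XSLT processor, here is a minimal Python sketch of the same steps: join 245 subfields a, b, f, g, and p, normalize whitespace, and drop the nonfiling characters counted by the second indicator. It is an illustration under stated assumptions, not the converter's actual code. The sample record is a hypothetical MARCXML rendering of the "245 04 $a The Big heat" example, and the clean and uppercase_first helpers are guesses at what the stylesheet's "clean" and "uppercase" templates do, since the slides do not show them.

```python
import xml.etree.ElementTree as ET

# Namespace used by LC's MARCXML ("slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical MARCXML fragment built from the slide's "245 04 $a The Big heat" example.
SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="0" ind2="4">
    <marc:subfield code="a">The Big heat</marc:subfield>
  </marc:datafield>
</marc:record>"""


def clean(text):
    # Assumption: stands in for the stylesheet's "clean" template, which is not
    # shown in the slides. Here it only strips trailing ISBD punctuation.
    return text.rstrip(" /:;,=")


def uppercase_first(text):
    # Assumption: stands in for the "uppercase" template. Here it capitalizes
    # only the first character of the string.
    return text[:1].upper() + text[1:]


def dc_titles(marcxml):
    """Return one serialized <title> element per 245 field in the record."""
    record = ET.fromstring(marcxml)
    titles = []
    for field in record.findall("marc:datafield[@tag='245']", MARC_NS):
        # Take the first occurrence of each subfield, as xsl:value-of does.
        parts = []
        for code in "abfgp":
            sub = field.find(f"marc:subfield[@code='{code}']", MARC_NS)
            parts.append(sub.text or "" if sub is not None else "")
        # Equivalent of normalize-space(): trim and collapse runs of whitespace.
        title = " ".join(" ".join(parts).split())
        clean_title = clean(title)
        # A blank or non-numeric second indicator means no nonfiling characters.
        ind2 = field.get("ind2", "").strip()
        skip = int(ind2) if ind2.isdigit() else 0
        if skip > 0:
            text = uppercase_first(clean_title[skip:])
        else:
            text = clean_title
        element = ET.Element("title")
        element.text = text
        titles.append(ET.tostring(element, encoding="unicode"))
    return titles


print(dc_titles(SAMPLE))
```

With the sample record, the second indicator of 4 removes "The " from the start of the title, which matches the <title>Big heat</title> example shown earlier in the deck.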