IR2009_Workshop.wn - UP library

Introduction to Metadata:
Overview and Guidelines Using the
Dublin Core Metadata Schema
Amelia Breytenbach
Metadata Specialist
[email protected]
Institutional Repository Workshop
1-3 April 2009
Overview
• What metadata is
• Types of metadata
• What does metadata do and why we use it
• Metadata standards
• Dublin Core Metadata Standard
• Encoding schemes
• Metadata creation
• Metadata documentation
Definition of Metadata
• Metadata describes other data.
– It provides information about an item's
content, e.g. an image may include metadata
that describes the picture size, colour depth,
image resolution and date created
– A text document's metadata may contain
information about the document’s length, the
author, when the document was written and
a short summary
Source: The Tech Terms Computer Dictionary
http://www.techterms.com/definition/metadata
What Is Metadata?
• Standardized descriptions of resources that aid
in the discovery and retrieval of resources,
particularly in reference to information about
electronic, or digital, material
• Describing individual files, single objects or
complete collections
• Traditional library cataloging is a form of
metadata; MARC 21 and the AACR2 rules used
with it are metadata standards
Types of Metadata
• Descriptive
– title, author, extent, subject, keywords
• Structural
– unique identifiers, page numbers, special features (table of contents, indexes)
• Administrative or technical
– file formats, scanning dates, file compression format, image resolution
– Preservation: archival information
– Rights management: ownership, copyright, license information
What Does Metadata Do?
Metadata
• is the key to ensuring that resources will survive
and continue to be accessible into the future
• is searchable and aids the identification and
retrieval of resources
• helps the end user to do accurate searching and
to evaluate a resource
• assists in managing, maintaining and
preserving digital collections
• facilitates interoperability
• supports archiving, security and authentication
of digital resources
Why Use Metadata?
• Metadata provides the essential link between
the information creator and the information user
• We can ensure that this objective is met by
using metadata in accordance with
international standards
Metadata Standards
• Data structure standards
Standardized sets such as Dublin Core, VRA
and MODS
• Data content standards
Rules or guidelines for input
• Data value standards
Lists of allowed values for an element
• Data format or encoding standards
How to encode the metadata
• Data presentation standards
Display of the metadata
Dublin Core as Structure Standard
for DSpace
Qualified Dublin Core Metadata Element Set
• Mandatory elements in DSpace
Title, Language and Date elements
• DSpace refinements
Additional metadata qualifiers for some DC
elements
• System generated metadata
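The mandatory elements listed above can be checked mechanically before a record is submitted. The following is a minimal sketch in Python, not DSpace's own validation. The record layout (a dict of lists keyed by `dc.element[.qualifier]`) and the helper names are assumptions made for illustration, and a qualified field such as `dc.date.created` is assumed to satisfy its parent element.

```python
# Hypothetical pre-submission check; not part of DSpace itself.
MANDATORY_ELEMENTS = ("title", "language", "date")  # as listed on the slide


def element_of(field):
    """Return the element part of a 'dc.element[.qualifier]' field name."""
    parts = field.split(".")
    return parts[1] if len(parts) > 1 else ""


def missing_mandatory(record):
    """List mandatory elements that have no non-empty value in the record."""
    present = {
        element_of(field)
        for field, values in record.items()
        if any(value.strip() for value in values)
    }
    return [name for name in MANDATORY_ELEMENTS if name not in present]


record = {
    "dc.title": ["Mona Lisa"],
    "dc.date.created": ["1500"],
    "dc.creator": ["Leonardo da Vinci"],
}
print(missing_mandatory(record))  # this record has no language field
```

A check like this would typically run in the submission workflow, so that incomplete records are returned to the submitter rather than stored.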
Dublin Core Metadata Initiative
(DCMI)
• An organization that aims to promote more
intelligent resource discovery through the
widespread adoption of interoperable metadata
standards and the development of specialized
metadata vocabularies for describing resources
• DCMI provides an international forum for
identifying problems, developing understanding
and proposing solutions
• Dublin Core website: http://dublincore.org/
Characteristics of Dublin Core
• The DC elements are
– simple to understand and apply
– subject independent with commonly
understood terminology
– optional and repeatable
– international in scope
– extensible
Dublin Core Metadata Element Set
• Unqualified
For coarse-grained discovery of resources
• Qualified
– For richer descriptions to enable more refined
resource discovery
– Most digital library software uses qualified DC
• “Dumb-down” principle
– Collapse a refinement back into a core element
– Unqualified DC required for sharing metadata via
the Open Archives Initiative
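The "dumb-down" principle can be illustrated with a short Python sketch. It assumes, for illustration only, that a record is a dict of lists keyed by `dc.element.qualifier`; the function name `dumb_down` is hypothetical, and the sample values are borrowed from the Mona Lisa example used in this workshop.

```python
from collections import defaultdict


def dumb_down(qualified_record):
    """Collapse 'dc.element.qualifier' fields into 'dc.element' fields."""
    simple = defaultdict(list)
    for field, values in qualified_record.items():
        schema, element = field.split(".")[:2]  # drop the qualifier, if any
        simple[f"{schema}.{element}"].extend(values)
    return dict(simple)


qualified = {
    "dc.title": ["Mona Lisa"],
    "dc.date.created": ["1500"],
    "dc.format.medium": ["Oil painting"],
    "dc.format.extent": ["77 x 53 cm"],
}
unqualified = dumb_down(qualified)
print(unqualified)
```

Both refinements of Format end up as values of the plain `dc.format` element. The values survive, but the distinction between medium and extent is lost, which is the trade-off of sharing records as unqualified DC.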
Dublin Core Metadata Element Set
[Figure omitted. Source: Miller, Steven J., 2007. Metadata for digital
collections: an online workshop.]
Qualified Dublin Core in DSpace
[Screenshot omitted]
Dublin Core Qualifiers
Two categories of qualifiers:
• Element refinement
Makes the meaning of an element narrower or
more specific. A refined element shares the
meaning of the unqualified element, but with a
more restricted scope
• Encoding scheme
Identifies a scheme that aids in the interpretation of
an element value. These schemes include
controlled vocabularies and formal notations
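The two kinds of qualifier can be pictured as two optional attributes on a metadata statement. The sketch below is a hedged illustration in Python, not a DCMI or DSpace API. The `Statement` class, the scheme labels, the tiny vocabulary and the date-only pattern (YYYY, YYYY-MM or YYYY-MM-DD) are all assumptions chosen to keep the example short.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified, date-only notation; an assumption for this sketch rather than
# a complete implementation of any published date profile.
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")

# Tiny hypothetical controlled vocabulary; a real one would be far larger.
TYPE_VOCABULARY = {"Still Image", "Text", "Sound"}


@dataclass
class Statement:
    element: str                      # e.g. "date"
    value: str
    refinement: Optional[str] = None  # narrows the element, e.g. "created"
    scheme: Optional[str] = None      # says how to interpret the value


def conforms(statement):
    """Check the value against its encoding scheme, if one is declared."""
    if statement.scheme == "date-notation":
        return bool(DATE_PATTERN.match(statement.value))
    if statement.scheme == "type-vocabulary":
        return statement.value in TYPE_VOCABULARY
    return True  # no scheme declared: any string is accepted


created = Statement("date", "2002-10-30", refinement="created", scheme="date-notation")
loose = Statement("date", "30 October 2002", refinement="created", scheme="date-notation")
kind = Statement("type", "Still Image", scheme="type-vocabulary")

for statement in (created, loose, kind):
    print(statement.element, statement.value, conforms(statement))
```

The refinement changes what the statement means, while the scheme only constrains how the value is written. That is why the check looks at `scheme` and ignores `refinement`.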
Dublin Core Element Refinements
• Title: Alternative
• Creator
• Contributor
• Publisher
• Description: Abstract, Table of Contents
• Subject
• Coverage: Spatial, Temporal
• Format: Extent, Medium
• Type
• Date: Created, Available, Modified, Valid, Issued
• Language
• Relation: Is Version Of, Has Version, Is Replaced By, Replaces, Is Required By, Requires, Is Part Of, Has Part, Is Referenced By, References, Is Format Of, Has Format, Conforms To
• Source
• Identifier
• Rights
[Screenshots omitted: qualifiers for the dc.date element; encoding scheme for an element value]
Dublin Core Encoding Schemes
• Getty Thesaurus of Geographic Names Online (TGN)
http://www.getty.edu/research/conducting_research/vocabularies/tgn/index.html
– Coverage.spatial field
• Art and Architecture Thesaurus Online (AAT)
http://www.getty.edu/research/conducting_research/vocabularies/aat/
– Subject element
• Library of Congress Name Authority File (LCNAF)
http://authorities.loc.gov/
– Creator, Publisher and Subject elements
• Library of Congress Thesaurus for Graphic Materials (LCTGM)
http://www.loc.gov/rr/print/tgm1/
– Subject element
Dublin Core Original vs Digital
Resource
• 1 : 1 principle
• Single metadata record with mixed elements for the
original and digital object
• Use repeatable Dublin Core elements in the same
metadata description
• Use locally-defined elements and map to a Dublin
Core element
– e.g. Date Original and Date Digital both map to the DC Date element
Example
Element | Original Painting | Digital Image
Title | Mona Lisa | Mona Lisa
Creator | Leonardo da Vinci | Leonardo da Vinci
Date | 1500 | 2002-10-30
Format.medium | Oil painting | Image/JPEG file
Type | Still Image | Still Image
Identifier | No. 779 [museum inventory number] | 2002_0054.jpg
Format.extent | 77 x 53 cm | 158KB
Rights | Not in copyright | © [owner digital collection]
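Under the 1 : 1 principle, the example above becomes two separate records, one per object. The Python sketch below uses the sample values from the example. The field names and the use of a Relation refinement (Is Format Of) to link the digital image to the painting are assumptions for illustration, not a prescribed DSpace layout.

```python
# Sample values come from the example above; the field names are assumptions.
original = {
    "dc.title": "Mona Lisa",
    "dc.creator": "Leonardo da Vinci",
    "dc.date": "1500",
    "dc.format.medium": "Oil painting",
    "dc.format.extent": "77 x 53 cm",
    "dc.identifier": "No. 779",
}

digital = {
    "dc.title": "Mona Lisa",
    "dc.creator": "Leonardo da Vinci",
    "dc.date": "2002-10-30",
    "dc.format.medium": "Image/JPEG file",
    "dc.format.extent": "158KB",
    "dc.identifier": "2002_0054.jpg",
    # Link back to the record for the painting (Relation: Is Format Of).
    "dc.relation.isformatof": original["dc.identifier"],
}


def differing_fields(a, b):
    """Return the shared fields whose values differ between two records."""
    shared = a.keys() & b.keys()
    return sorted(field for field in shared if a[field] != b[field])


print(differing_fields(original, digital))
```

The fields that differ (date, format and identifier) are the ones that become ambiguous when both objects are described in a single mixed record.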
Metadata Creation
• Natural metadata is found in the source document
and created by the researcher or submitter
– supports discovery of resources
– includes the author’s name, date and title
• Added metadata is added by a metadata editor or
by software
– supports resource selection
– includes subject terms, abstracts and rights
metadata
Metadata Creation (cont.)
• Metadata as a view of the resource
• There is no one-size-fits-all metadata record
• Metadata for the same resource differs
depending on collection, use and audience
Metadata Creation (cont.)
Construct a title for this image if the theme of the digital
collection was:
• Rivers of Europe
– Elbe river with passenger boats, Dresden
• European Opera Houses
– “Semperoper” opera house in Dresden, Germany
• Cities of Europe
– River scene in Dresden, Germany
• Bridges of the World
– Augustus Bridge over the Elbe River, Dresden, Germany
Detailed vs Simple Metadata
Descriptions
• Detailed metadata descriptions
– may improve searching precision
– require higher investment in creation of metadata
– make it more difficult to promote consistency in
creation of metadata
• Simple descriptions
– are easier and less costly to generate
– require more effort on the part of searchers to
identify the most relevant results
– improve probability of cross-disciplinary
interoperability
Metadata Design and Documentation
• Documentation takes several forms: metadata registry,
best practice guide, data dictionary, application profile
• It provides standardized information for the
definition, identification, and use of each data
element
• It ensures that a metadata schema and the data
elements in use by an organization can be
applied consistently within the organization or
community, reused by other communities,
and interpreted by computer applications and
human users
Value of Metadata Documentation
• Improve discovery of resources
• Increase interoperability across all collections
created by an institution
• Increase interoperability with other digital
libraries participating in the Open Archives
Initiative
• Inform users on the digital object structure and
the software needed to view the digital resource
• Ensure quality control for metadata records
• Assist with management and long-term
preservation of digital files
Data Dictionaries (DDs)
• A table with applications of the metadata
standard applicable to a specific collection or
digital project or type of material
• Lists of local metadata elements
• Mapping to Dublin Core
• Specifications such as the use of controlled
vocabulary
• Examples and comments about the use of each
element
[Examples shown: data dictionaries for theses, books and images]
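A data dictionary of the kind described above can be sketched as a small lookup table that drives both the mapping to Dublin Core and basic quality checks. This Python example is illustrative only. The local element names "Date Original" and "Date Digital" come from the workshop slides; the dictionary layout, the `required` flags and the one-term vocabulary are assumptions.

```python
# Hypothetical data dictionary for an image collection.
DATA_DICTIONARY = {
    "Title": {"maps_to": "dc.title", "required": True, "vocabulary": None},
    "Date Original": {"maps_to": "dc.date", "required": False, "vocabulary": None},
    "Date Digital": {"maps_to": "dc.date", "required": True, "vocabulary": None},
    "Type": {"maps_to": "dc.type", "required": True, "vocabulary": {"Still Image"}},
}


def to_dublin_core(local_record):
    """Map a local record to Dublin Core, collecting problems along the way."""
    dc_record, problems = {}, []
    for name, rule in DATA_DICTIONARY.items():
        value = local_record.get(name)
        if value is None:
            if rule["required"]:
                problems.append(f"missing required element: {name}")
            continue
        if rule["vocabulary"] is not None and value not in rule["vocabulary"]:
            problems.append(f"{name}: '{value}' is not in the controlled vocabulary")
        # Several local elements may share one (repeatable) DC element.
        dc_record.setdefault(rule["maps_to"], []).append(value)
    return dc_record, problems


dc_record, problems = to_dublin_core({
    "Title": "Mona Lisa",
    "Date Original": "1500",
    "Date Digital": "2002-10-30",
    "Type": "Still Image",
})
print(dc_record)
print(problems)
```

Keeping the rules in one table means the documentation and the checks cannot drift apart: changing a mapping or a vocabulary in the dictionary changes the behaviour of the converter.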
Best Practice Guides
• Guidance and documentation to describe and
standardize the use of metadata elements that best
support a community's needs
• Provide guidelines and decisions for metadata
creators
• Explanation of metadata elements, terms and
concepts
• Examples of the use of the different elements
• CDP Metadata Working Group
Dublin Core Metadata Best Practices, Version 2.1.1, Sept. 2006
http://www.cdpheritage.org/cdp/documents/cdpdcmbp.pdf
Metadata Registries
• A metadata registry is a central location in an
organization where metadata definitions are
stored and maintained in a controlled method
(Wikipedia)
• Protected area
• Stores data elements
• Stores the meaning of a data element
• Defines how the metadata is represented
Closing Remarks
“Metadata” means many different things:
• It involves applying traditional library principles to
new environments
• Good metadata practitioners use fundamental
cataloging principles in non-MARC environments
• Documentation is important
• Good metadata promotes good digital collections
• There is always more to learn
References
• Taylor, Chris. (2003) An Introduction to metadata
http://www.library.uq.edu.au/iad/ctmeta4.html
• Technical Advisory Service for Images (TASI). Metadata
and digital images.
http://www.tasi.ac.uk/advice/delivering/metadata.html
• Technical Advisory Service for Images (TASI). Controlling
your language – links to metadata vocabularies
http://www.tasi.ac.uk/resources/vocabs.html
• Hodge, Gail. (2001) Metadata made simpler.
• Smith, MacKenzie. (2003) DSpace: an open source
dynamic digital repository. D-Lib Magazine, January 2003.
http://www.dlib.org/dlib/january03/smith/1smith.html
• Disa Workshop: Digital collections management,
University of KwaZulu-Natal, 2004.
References (cont.)
• NISO Framework Advisory Group. A Framework of
Guidance for Building Good Digital Collections. 2nd ed.
Bethesda, MD: National Information Standards
Organization, 2004.
http://www.niso.org/framework/framework2.html
• Xia, Jingfeng. Personal name identification in the practice
of digital repositories. Electronic library and information
systems. Vol. 40, no. 3, 2006. pp. 256-267.
• Miller, Steven J. Metadata for digital collections: an online
workshop. University of Wisconsin-Milwaukee, School of
Information Studies, 2007.
• CDP Metadata Working Group. Dublin Core Metadata
Best Practices, Version 2.1.1, Sept. 2006
http://www.cdpheritage.org/cdp/documents/cdpdcmbp.pdf
Thank you!
Amelia Breytenbach
Metadata specialist
University of Pretoria
[email protected]