PREMIS

• What is PREMIS?
– PREservation Metadata: Implementation Strategies
• When is PREMIS used?
– PREMIS is used for “repository design, evaluation, and
archived information packaged among preservation
repositories”
• How is PREMIS used?
– The PREMIS Data Dictionary provides guidelines regarding
“the information a repository uses to support the
digital preservation process”
PREMIS
• What is “preservation metadata” referring to?
– “It is information that supports and documents the digital
preservation process”, including information such as:
• Provenance – who has had custody or ownership of the
digital object
• Authenticity – whether the digital object is what it purports
to be
• Preservation activity – the activities that have been
carried out to preserve the digital object
• Technical environment – what is needed to render,
interpret, and use the digital object
• Rights management – the intellectual property rights
that must be observed
PREMIS Data Dictionary
• Conventions
– Semantic units: pieces of information or knowledge
that a repository records
• Example: objectIdentifier, found under <object>
– Containers: XML elements that hold no value of their
own and serve only to group related semantic units
– Subunits: the semantic units held within a container
– Extension containers: containers designed to give a
place for non-PREMIS metadata
PREMIS Data Model
• Intellectual Entity – refers to content that can
be described as a unit (e.g. books, maps,
articles)
PREMIS Data Model
• Objects – refer to units of information in digital
form. PREMIS defines three kinds of object:
file, bitstream, and representation
• File – a computer file, such as a PDF, TXT, or
JPEG file
• Bitstream – refers to data within a file that
has common properties for preservation
purposes
PREMIS Data Model
• Representation – refers to the set of files, including
structural metadata, that must be identified, stored,
and maintained in order to assemble a complete
rendition of an Intellectual Entity.
– For example, the text files and image files of a
magazine together form a representation.
PREMIS Data Model
• Sample syntax
<object> </object>
• The units of information that can be recorded include:
– Type of object (file, bitstream, or representation)
– A unique identifier for the object (type and value)
• For example,
<object xsi:type="representation">
  <objectIdentifier>
    <objectIdentifierType>FDsys ACP</objectIdentifierType>
    <objectIdentifierValue>R0b002ee180b003b0</objectIdentifierValue>
  </objectIdentifier>
</object>
This segment states that the object is a representation (that is, a set of files) and that the
representation has a unique identifier.
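As a rough illustration of how a program might read the segment above, here is a minimal Python sketch using the standard-library `xml.etree.ElementTree`. The `describe_object` helper is a hypothetical name, and the `xmlns:xsi` declaration is an addition (the slide sample omits it) so that the fragment parses on its own. A real PREMIS file would normally also declare a PREMIS namespace, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace; declared here only so the
# fragment is parseable by itself (assumption, not part of the slide sample).
XSI = "http://www.w3.org/2001/XMLSchema-instance"

SAMPLE = f"""
<object xmlns:xsi="{XSI}" xsi:type="representation">
  <objectIdentifier>
    <objectIdentifierType>FDsys ACP</objectIdentifierType>
    <objectIdentifierValue>R0b002ee180b003b0</objectIdentifierValue>
  </objectIdentifier>
</object>
"""


def describe_object(xml_text):
    """Return (object type, identifier type, identifier value) for one <object>."""
    obj = ET.fromstring(xml_text.strip())
    # ElementTree exposes xsi:type under its expanded name {namespace}type.
    obj_type = obj.get(f"{{{XSI}}}type")
    id_type = obj.findtext("objectIdentifier/objectIdentifierType")
    id_value = obj.findtext("objectIdentifier/objectIdentifierValue")
    return obj_type, id_type, id_value


print(describe_object(SAMPLE))
```

The identifier is returned as a type and value pair because the value alone is only unique within the scheme named by the type.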
PREMIS Object Example
• Other units of information that can be recorded include:
– “Information indicating the policy on the set of preservation functions to
be applied to an object”
<object xsi:type="file">
  <objectIdentifier>
    <objectIdentifierType>FDsys ACP</objectIdentifierType>
    <objectIdentifierValue>D09002ee180b003a9</objectIdentifierValue>
  </objectIdentifier>
  <preservationLevel>
    <preservationLevelValue>full</preservationLevelValue>
  </preservationLevel>
PREMIS Object Example
• Other units of information that can be recorded include:
– Information indicating whether the object is subject to one or
more processes of decoding or unbundling
– Information used to verify whether an object has been changed in
an undocumented or unauthorized way
– The size of the object
– The format of the object
• For example:
PREMIS Object Example
  <objectCharacteristics>
    <compositionLevel>0</compositionLevel>
    <fixity>
      <messageDigestAlgorithm>SHA-256</messageDigestAlgorithm>
      <messageDigest>4977070b92f0bb2642c6be368ad68a8d1d1c5dbbb3310544db781f56a860b0a1</messageDigest>
      <messageDigestOriginator>FDsys</messageDigestOriginator>
    </fixity>
    <size>9326</size>
    <format>
      <formatDesignation>
        <formatName>text/plain</formatName>
      </formatDesignation>
      <formatRegistry>
        <formatRegistryName>PRONOM</formatRegistryName>
        <formatRegistryKey>x-fmt/111</formatRegistryKey>
      </formatRegistry>
      <formatNote>Plain Text File</formatNote>
    </format>
  </objectCharacteristics>
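The fixity and size values above are what a repository would compare against when checking whether a file has changed. The following is a minimal Python sketch of such a check, assuming the file's bytes are already in memory. `verify_object` is a hypothetical helper, and the content used in the demonstration is stand-in data, since the file behind the sample record is not available here.

```python
import hashlib


def verify_object(data, recorded_size, algorithm, recorded_digest):
    """Return True when both the size and the digest match the recorded values."""
    # Map a name such as "SHA-256" to the spelling hashlib expects ("sha256").
    hash_name = algorithm.lower().replace("-", "")
    actual_digest = hashlib.new(hash_name, data).hexdigest()
    return len(data) == recorded_size and actual_digest == recorded_digest.lower()


# Stand-in content and metadata (assumption: not the file from the sample).
content = b"Stand-in text used to illustrate a fixity check.\n"
recorded_size = len(content)
recorded_digest = hashlib.sha256(content).hexdigest()

# Unchanged content passes; content with one extra byte fails.
print(verify_object(content, recorded_size, "SHA-256", recorded_digest))
print(verify_object(content + b"x", recorded_size, "SHA-256", recorded_digest))
```

Checking the size first is cheap, but only the digest comparison detects a change that leaves the length untouched.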
PREMIS Object Example
• Other units of information that can be recorded
include:
– The original name of the object (prior to being named by
the repository)
– Information about where and how a file is stored in the
repository
– A categorization of the nature of a relationship to another
object (for instance, “structural” is a relationship between
parts of an object)
PREMIS Object Example
  <originalName>S3880IS.txt</originalName>
  <storage>
    <contentLocation>
      <contentLocationType>URI</contentLocationType>
      <contentLocationValue>file:/u02/app/emc/documentum/data/fdsysprod1/fdsysprod1/content_storage_01/00002ee1/80/55/b0/48.txt</contentLocationValue>
    </contentLocation>
    <storageMedium>hard disk</storageMedium>
  </storage>
  <relationship>
    <relationshipType>structural</relationshipType>
    <relationshipSubType>is part of</relationshipSubType>
    <relatedObjectIdentification>
      <relatedObjectIdentifierType>FDsys ACP</relatedObjectIdentifierType>
      <relatedObjectIdentifierValue>R0b002ee180b003b0</relatedObjectIdentifierValue>
    </relatedObjectIdentification>
  </relationship>
</object>
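The relationship block above is what ties this file back to the representation shown earlier. As a sketch of how that link could be followed in code, the Python below lists every relationship recorded on an object. `related_objects` is a hypothetical helper, and the fragment is a trimmed, un-namespaced copy of the sample (an assumption made so that it parses on its own).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the file object above, keeping only the identifier
# and the relationship (namespaces and other units left out for brevity).
FILE_OBJECT = """
<object>
  <objectIdentifier>
    <objectIdentifierType>FDsys ACP</objectIdentifierType>
    <objectIdentifierValue>D09002ee180b003a9</objectIdentifierValue>
  </objectIdentifier>
  <relationship>
    <relationshipType>structural</relationshipType>
    <relationshipSubType>is part of</relationshipSubType>
    <relatedObjectIdentification>
      <relatedObjectIdentifierType>FDsys ACP</relatedObjectIdentifierType>
      <relatedObjectIdentifierValue>R0b002ee180b003b0</relatedObjectIdentifierValue>
    </relatedObjectIdentification>
  </relationship>
</object>
"""


def related_objects(xml_text):
    """Return (type, subtype, related identifier value) for each relationship."""
    obj = ET.fromstring(xml_text.strip())
    results = []
    for rel in obj.findall("relationship"):
        rel_type = rel.findtext("relationshipType")
        rel_subtype = rel.findtext("relationshipSubType")
        # One relationship may point at several related objects.
        for target in rel.findall("relatedObjectIdentification"):
            related_id = target.findtext("relatedObjectIdentifierValue")
            results.append((rel_type, rel_subtype, related_id))
    return results


print(related_objects(FILE_OBJECT))
```

The related identifier returned here is the same value that identifies the representation in the first object example, which is how the file and the representation are joined.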
PREMIS Data Model
• Events – refer to actions that involve or affect at least one object or
agent known to the preservation repository
– Events are critical for maintaining the digital provenance of an
object, which helps demonstrate the authenticity of the object
• Examples of Events:
– Modifying a document
– Actions that create new relationships
• An object can become related to another object as a result of a particular
event, for instance when a program takes file 1 and generates a different
version known as file 2
– Actions that check the validity and integrity of objects (e.g. a
virus scan)
PREMIS Data Model
• Sample syntax
<event> </event>
• The information that can be recorded under event includes:
– A unique identifier for the event
– Date, time, and type of event
– A detailed description of the event
– The outcome of the event
– Objects and agents involved in the event and their specific
roles
• Agent roles are defined here because an agent can perform
different roles in different events
PREMIS Event Example
<event>
  <eventIdentifier>
    <eventIdentifierType>FDsys:event</eventIdentifierType>
    <eventIdentifierValue>1cdd2b6c-5a2d-449b-b386-ebb15eb4af11</eventIdentifierValue>
  </eventIdentifier>
  <eventType>Rendition Submitted</eventType>
  <eventDateTime>2010-10-06T19:38:47-04:00</eventDateTime>
  <eventDetail>Rendition R0b002ee180b003b0, uploaded by hotfolderadmin, was submitted in the Submission Information package P0b002ee180b003af</eventDetail>
  <eventOutcomeInformation>
    <eventOutcome>Success</eventOutcome>
  </eventOutcomeInformation>
  <linkingAgentIdentifier>
    <linkingAgentIdentifierType>FDsys:agent</linkingAgentIdentifierType>
    <linkingAgentIdentifierValue>hotfolderadmin</linkingAgentIdentifierValue>
    <linkingAgentRole>implementer</linkingAgentRole>
  </linkingAgentIdentifier>
  <linkingObjectIdentifier>
    <linkingObjectIdentifierType>FDsys</linkingObjectIdentifierType>
    <linkingObjectIdentifierValue>R0b002ee180b003b0</linkingObjectIdentifierValue>
    <linkingObjectRole>outcome</linkingObjectRole>
  </linkingObjectIdentifier>
</event>
PREMIS Data Model
• Agents – refer to people, organizations, or
software associated with events (more
specifically, preservation events) in the life of an object
– In the data model diagram, there is no arrow from
the Agent entity to the Object entity. This is because
Agents influence Objects only indirectly, through
Events.
PREMIS Data Model
• Sample syntax
<agent> </agent>
• The information that can be recorded under
agent includes:
– A unique identifier for the agent
– The agent’s name
– The type of agent (person, organization, or
software)
PREMIS Agent Example
<agent>
  <agentIdentifier>
    <agentIdentifierType>FDsys:agent</agentIdentifierType>
    <agentIdentifierValue>hotfolderadmin</agentIdentifierValue>
  </agentIdentifier>
  <agentName>hotfolderadmin</agentName>
  <agentType>Person</agentType>
</agent>
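Because agents reach objects only through events, finding out who acted on an object means starting from the event and following its linking identifiers. The Python sketch below does this for trimmed, un-namespaced copies of the event and agent samples. `agents_by_id` and `event_links` are hypothetical helper names, and the trimming is an assumption made to keep the fragments short.

```python
import xml.etree.ElementTree as ET

EVENT = """
<event>
  <eventType>Rendition Submitted</eventType>
  <linkingAgentIdentifier>
    <linkingAgentIdentifierType>FDsys:agent</linkingAgentIdentifierType>
    <linkingAgentIdentifierValue>hotfolderadmin</linkingAgentIdentifierValue>
    <linkingAgentRole>implementer</linkingAgentRole>
  </linkingAgentIdentifier>
  <linkingObjectIdentifier>
    <linkingObjectIdentifierType>FDsys</linkingObjectIdentifierType>
    <linkingObjectIdentifierValue>R0b002ee180b003b0</linkingObjectIdentifierValue>
    <linkingObjectRole>outcome</linkingObjectRole>
  </linkingObjectIdentifier>
</event>
"""

AGENT = """
<agent>
  <agentIdentifier>
    <agentIdentifierType>FDsys:agent</agentIdentifierType>
    <agentIdentifierValue>hotfolderadmin</agentIdentifierValue>
  </agentIdentifier>
  <agentName>hotfolderadmin</agentName>
  <agentType>Person</agentType>
</agent>
"""


def agents_by_id(agent_elements):
    """Index agents by identifier value so events can look them up."""
    return {
        a.findtext("agentIdentifier/agentIdentifierValue"): a
        for a in agent_elements
    }


def event_links(event, agents):
    """Return (agent name, agent role, event type, object id) rows for one event."""
    event_type = event.findtext("eventType")
    object_ids = [
        o.findtext("linkingObjectIdentifierValue")
        for o in event.findall("linkingObjectIdentifier")
    ]
    rows = []
    for link in event.findall("linkingAgentIdentifier"):
        agent_id = link.findtext("linkingAgentIdentifierValue")
        role = link.findtext("linkingAgentRole")
        agent = agents.get(agent_id)
        # Fall back to the raw identifier when no agent record is found.
        name = agent.findtext("agentName") if agent is not None else agent_id
        for object_id in object_ids:
            rows.append((name, role, event_type, object_id))
    return rows


agents = agents_by_id([ET.fromstring(AGENT.strip())])
print(event_links(ET.fromstring(EVENT.strip()), agents))
```

The role is read from the event rather than from the agent record, which matches the point that an agent can perform different roles in different events.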
PREMIS Data Model
• Rights – refers to the rights and permissions that are directly
relevant to preserving objects
• Sample syntax
<rights> </rights>
• The information that can be recorded under rights includes:
– A unique identifier for the rights statement
– The action(s) that the rights statement allows
– The object(s) to which the statement applies
– The agents involved in the rights statements and their
roles
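The four items listed above can be pictured as a small record that a repository consults before acting on an object. The Python sketch below is only an in-memory illustration of that idea, not the PREMIS XML form. The `RightsStatement` class, its field names, and every value except the object and agent identifiers reused from the earlier samples are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RightsStatement:
    """Hypothetical in-memory view of the four items listed above."""
    identifier: str                                   # unique identifier for the statement
    acts: set = field(default_factory=set)            # actions the statement allows
    object_ids: set = field(default_factory=set)      # objects the statement applies to
    agent_roles: dict = field(default_factory=dict)   # agent identifier -> role

    def permits(self, act, object_id):
        """True when the statement allows this act on this object."""
        return act in self.acts and object_id in self.object_ids


# Illustrative values only: the identifier, acts, and role are invented.
statement = RightsStatement(
    identifier="rights-0001",
    acts={"replicate", "migrate"},
    object_ids={"R0b002ee180b003b0"},
    agent_roles={"hotfolderadmin": "grantee"},
)

print(statement.permits("migrate", "R0b002ee180b003b0"))
print(statement.permits("delete", "R0b002ee180b003b0"))
```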
References
• http://www.loc.gov/standards/premis/understanding-premis.pdf
• http://www.oclc.org/research/activities/past/orprojects/pmwg/premis-final.pdf