The OAIS experience at the British Library

Deborah Woodyard
Digital Preservation Coordinator
ERPANET OAIS Training Seminar, 28-29 Nov 2002
OVERVIEW
Introduction to the British Library
Why the BL chose to use the OAIS model
OAIS theory versus implementation
Terminology
Metadata
Issues not covered by OAIS
Summary of lessons learned about using the OAIS
THE BRITISH LIBRARY
Deposit library
Aiming to get deposit legislation for digital materials
Receiving digital material by voluntary deposit, purchase and digitisation
Wide variety of types of digital material received
Require method/system for long term storage, preservation and access
Seriously embarked on developing such a system in 2000
Initial work developed detailed functional specification of a system aligned with OAIS model concepts
WHY OAIS?
Very little current experience of a system such as this exists
No ‘off-the-shelf’ systems available
No other standards
OAIS model well developed
Considered to be the guidance for best practice
Provided excellent high level framework and convincing back-up argument for political justification for development of such a system
Provided standard terminology for communication
A good match for almost the entire system we were planning to build
OAIS THEORY vs SYSTEM IMPLEMENTATION
High level standard implies no rules for actual design or implementation
OAIS sounds like one system but is not necessarily, or even likely to be, one single entity
No formal method of implementation used
Analysed business processes and matched to OAIS functions
DIAGRAM COMPARISON
OAIS TERMINOLOGY
Useful as a common vocabulary for communicating both internally and externally
Difficult to explain without reading a lot of the document, therefore opaque to those not heavily involved (e.g. OAIS vs OAI)
Still needed to create another glossary
Especially useful: SIP, AIP, DIP; Ingest; Content Information = Content Data Object + Representation Information
Difficulties with: defining an object; naming preservation users
OAIS METADATA TO BL METADATA

Packaging Information
(i.e. how and where the bits are stored)

Content Information including Representation Information
(i.e. how to interpret the bits into data)

Preservation Description Information including
  - Reference Information
  - Context Information
  - Provenance Information
  - Fixity Information
(i.e. how to interpret the data into information)
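
As a rough illustration of how the three OAIS information categories above nest inside one package, the following Python sketch models them as plain data classes. All class names, field names and sample values are hypothetical, chosen only to mirror the OAIS terms on this slide; this is not the BL schema or an implementation prescribed by OAIS.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    """Content Data Object plus the Representation Information needed to interpret it."""
    content_data_object: bytes
    representation_information: dict = field(default_factory=dict)


@dataclass
class PreservationDescriptionInformation:
    """The four PDI categories named by OAIS (field types are assumptions)."""
    reference: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    fixity: dict = field(default_factory=dict)


@dataclass
class InformationPackage:
    """Packaging Information binds the content and its PDI together."""
    packaging_information: dict
    content_information: ContentInformation
    pdi: PreservationDescriptionInformation


# Hypothetical example values, for illustration only.
package = InformationPackage(
    packaging_information={"volume": "example-volume", "path": "objects/0001"},
    content_information=ContentInformation(
        content_data_object=b"example bitstream",
        representation_information={"format": "plain text"},
    ),
    pdi=PreservationDescriptionInformation(
        reference={"identifier": "example-0001"},
        provenance=["ingested"],
    ),
)
```

The nesting reflects the slide's reading: packaging says where the bits are, representation information turns bits into data, and PDI turns data into information.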

CONTENT INFORMATION
Representation Information (Content data object description)
  - Technical details of files and resource structure
  - How the resource appears, is installed and runs
  - Documentation
  - Significant properties
Representation Information (Environment description)
  - Requirements for hardware, peripherals
  - Operating system, application software
  - Input and output, memory requirements and other parameters
  - Documentation on installation, use and location of environment components
PRESERVATION DESCRIPTION INFORMATION
Reference Information
  - Identifiers & descriptive information
Context Information
  - Reason for creation, relationships with other resources
Provenance Information
  - Origin of the resource & changes made due to its life in the archive
Fixity Information
  - Authentication details
BL METADATA (1/8)
Agent Group
  - Agent Identifier
  - Agent Role
  - Personal Agent Group
    - Personal Agent Name Affix
    - Personal Agent Family Name
    - Personal Agent Given Name
    - Personal Agent Affiliation
    - Personal Agent Vital Date
Corporate Agent Group
  - Corp Agent Name
  - Corp Agent Place
Event Agent Group
  - Event Agent Name
  - Event Agent Number
  - Event Agent Location
  - Event Agent Date
Other Agent Group
  - Other Agent Name
  - Other Agent Description
BL METADATA (2/8)
Descriptive Items Group
  - Language
  - Page Range
  - Frequency Of Serial
  - Issue Data
  - Audience
Title Group
  - Primary Title
  - Title Status
  - Alternative Title
  - Sub Title
  - Series Title
  - Series Title Number
  - Article Title
  - Uniform Title
BL METADATA (3/8)

Subject Group
  - LCSH
  - DDC
  - Name As Subject
  - Free Text
  - Other Subject Vocabularies
  - BL Collection
  - BL Classification
Description Group
  - Abstract
  - Table of Contents
  - Map Scale
  - Free Text
BL METADATA (4/8)
Date Group
  - Date Issued
  - Date Available
  - Date Created
  - Date Archived
  - Licence Check Date
  - Date Modified
  - Date Coverage
  - Date Valid
  - Vital Date
  - Event Date
  - Other Descriptive Dates
  - System Dates
BL METADATA (5/8)
Coverage Group
  - Temporal Coverage
  - Spatial Coverage
Terms Group
  - Price
  - Terms Of Availability Statement
  - Terms Of Availability Reference
Type and Identifier Group
  - Resource Type
  - Object Type
  - Object Preservation Category
  - Resource Identifier
  - System IDs
  - Descriptive IDs
Format Group
BL METADATA (6/8)
Relation Group
  - Relation Is Version Of
  - Relation Is Format Of
  - Relation Is Part Of
  - Relation Is Component Of
  - Relation Is Replaced By
  - Relation Replaces
  - Relation Requires
  - Relation External Object
  - Relation Continues
History Group
  - Custody History
  - Digitisation History
  - Ingest History
  - Preservation History
  - Process Name
  - Process Description
  - Process Reason
  - Process Selection
  - Process Specification
  - Critical Hardware
  - Critical Software
  - Process Result
  - Process Agent
  - Process Date
BL METADATA (7/8)
Object Part Group
  - Digital Signature
  - Digital Signature Name
  - Operating Environment
  - Object Part Preservation Status
  - Viewing Software
  - Object Part Identifier
  - Start File
  - Underlying Abstract Form
  - Essence of Being
External Object Group
  - Source
  - Relation External Object
  - Related Information Object
  - Other
Original Environment Group
  - Operating System
  - Processor Type
  - Processor Speed
  - Hard Disc Capacity
  - RAM
  - Video Card
  - Sound Card
  - CD Speed
BL METADATA (8/8)
Rights Information Group
  - Rights Group
  - Rights URL
  - Rights XRML
  - Rights Statement
  - Rights Holder
Licence Group
  - Licence Type
  - Licence Fee
  - Licence Description
  - Location
  - Number Of Licences
System Parameter Group
  - Licence Key
  - Extraordinary Requirements
  - Original Carrier
  - Copy Counter
ISSUES NOT COVERED BY THE OAIS (1/3)
Boundary of the system under development:
  - Which materials will be stored in this system?
  - Should descriptive information be stored internally?
  - Should object relationships be stored internally?
  - Should a retrieval manager component be included?
  - Should an exit strategy (high volume data transfer) be built from day one?
Changes to metadata:
  - Should changes be allowed without delivery and re-ingest as a new item?
ISSUES NOT COVERED BY THE OAIS (2/3)
Object deletion:
  - Not included and may be difficult to implement
  - Remove content or only access to content?
Object identification in a volume:
  - In the case of corruption or requested refreshment, is it necessary to be able to identify the individual object on a volume?
Independent use of archive volumes:
  - Disaster recovery without exact same system
ISSUES NOT COVERED BY THE OAIS (3/3)
Unique identifier:
  - Where should it be generated?
  - What structure should it have?
How to store licence information:
  - Scan hard copy or data entry?
  - Where should it be stored?
Data integrity:
  - How often should the data be checked?
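
To make the data integrity question concrete, here is a minimal Python sketch of a periodic fixity check: a digest is recorded when an object is stored, and the object is re-verified once a chosen interval has elapsed. The choice of SHA-256, the 90-day interval and all function names are assumptions made for illustration; neither OAIS nor this presentation specifies them.

```python
import hashlib
from datetime import datetime, timedelta


def compute_digest(data: bytes) -> str:
    """Return a hex digest of the stored bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def is_check_due(last_checked: datetime, now: datetime, interval: timedelta) -> bool:
    """True when at least `interval` has passed since the last fixity check."""
    return now - last_checked >= interval


def verify(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return compute_digest(data) == recorded_digest


# Hypothetical policy value: how often to re-check is exactly the open question.
CHECK_INTERVAL = timedelta(days=90)

stored = b"example archived object"
recorded = compute_digest(stored)  # would be kept as Fixity Information

if is_check_due(datetime(2002, 1, 1), datetime(2002, 6, 1), CHECK_INTERVAL):
    print("fixity ok" if verify(stored, recorded) else "fixity FAILED")
```

The interval is a policy decision rather than a technical one: shorter intervals detect corruption sooner but cost more read activity on the storage volumes.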
SUMMARY OF MAIN LESSONS LEARNED
It’s heavy
It’s complex
It doesn’t define your scope
It’s worth understanding the terminology and concepts
It is a very valuable tool and the basis of progressing the long term preservation of digital information