Bits and Bobs: Digital Preservation and Costs

Digital Preservation Coalition:
Forum on preservation of e-learning materials
and cost models for digital preservation
London, 15th October 2002
Alison Macdonald
Secure Sciences Limited
[email protected]
© Secure Sciences Limited 2002
1986 schilling & computer
• Schilling: now has to be taken to bank for migration (to €1.453457 less charges … cost to you of time, to bank of processing)
• Computer: all obsolete technologies: 12” Laserdisk, Philips player; BBC Acorn PC, software in BCPL
• Original book (Domesday) made in 1086, still in use

Summary
1. Some words about costs
2. Identifying digital preservation costs today
- Cost drivers and variables
- Whose cheque book?
3. Some conclusions
What is cost?
• A pressure
• It leads to interference from others
• They then impose (further) limitations
• Costs pay for inputs which produce output
• Cost usually has multiple elements ...
• ... which are subject to multiple factors
• Cost has multiple contexts

The context of costs
• Some words around costs:
– Price
– Payback
– Proportion
– Financing
– Unit of measure
– Charge-back
– Investment
– Return
– Relative cost
– Replacement cost

Some jargon related to costs
• Budget
• Standard costing
• Target costing
• Value analysis
• Value engineering
• Unit of measure
• Opportunity cost

Part 2
Identifying digital preservation costs today
Do we know what digital preservation involves?
Does the finance department have a cost centre for it? Should it?

Digital preservation: OAIS model
[Diagram: OAIS functional model, from CCSDS, 2001. Entities shown: Producer, Consumer, Management. Functions shown: Preservation Planning, Data Management, Ingest, Access, Archival Storage, Administration.]

Digital preservation a sub-activity
• Digital preservation implies purpose behind retention
• Digital preservation also implies awareness
• Economies of scale imply one or more real or virtual repositories = archive(s)

Digital preservation is complex
Need to preserve:
– Accessibility
– Retrievability
– Interpretability
– Meaningfulness
– Integrity
– Authenticity
– Security
Overcoming:
• Technology churn
– Hardware, software, media
• Knowledge, semantic loss
• Infrastructure loss (incl. funding continuity)
• New problem areas

Metadata = the engine of preservation
[Diagram: OAIS model, from CCSDS, 2001, annotated with information packages. Labels shown: SIP (Submission Information Packages), AIP (Archival Information Packages), DIP (Dissemination Information Packages), Descriptive info. (metadata). Entities shown: Producer, Consumer, Management. Functions shown: Preservation Planning, Data Management, Ingest, Access, Archival Storage, Administration.]

Digital preservation: cost elements (1)
• Sub-activity of archive (in this example)
– Set-up & infrastructure
– Running costs: overheads, labour, materials
• Preservation actions as required
– Materials and staff as required
– Lots of knowledge required – almost certainly more than available in-house
– Timing of these actions dependent on external factors

Digital preservation: cost elements (2)
• Digital preservation cost components:
– Media refresh, software elements
– Labour (internal, external)
– Knowledge bank (internal, external?)
• Archive cost components:
– Infrastructure
– Hardware, media
– Software
– Telecommunications
– Staff (archivists, IT, management)
= Uneven costs:

Annual archive cost ex digital preservation and average cumulative cost per 100,000 records
[Chart: £ annual archive cost ex DP and £ cumul cost per 100,000 records, plotted over periods 1 to 9; axis scales run 0 to 900,000 and 0 to 700.]
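
The relationship between the two series in the chart can be sketched in a few lines of Python. This is a minimal illustration, not the model behind the slide: it assumes that "average cumulative cost per 100,000 records" means total archive spend to date divided by the number of records held, expressed per 100,000 records. The function name `cumulative_cost_per_100k` and all figures are hypothetical and are not read from the chart.

```python
from itertools import accumulate


def cumulative_cost_per_100k(annual_costs, records_held):
    """Return, for each year, spend to date per 100,000 records held.

    Assumption (not confirmed by the slides): the measure is cumulative
    cost to date divided by records held at the end of the same year.
    """
    if len(annual_costs) != len(records_held):
        raise ValueError("need one record count per annual cost")
    if any(records <= 0 for records in records_held):
        raise ValueError("record counts must be positive")
    return [
        total / (records / 100_000)
        for total, records in zip(accumulate(annual_costs), records_held)
    ]


# Hypothetical figures: a high set-up year followed by steady running costs,
# while the number of records held keeps growing.
annual_costs = [800_000, 300_000, 300_000]                 # £ per year
records_held = [200_000_000, 400_000_000, 700_000_000]     # records at year end

per_100k = cumulative_cost_per_100k(annual_costs, records_held)
for year, cost in enumerate(per_100k, start=1):
    print(f"Year {year}: £{cost:,.0f} per 100,000 records")
```

With these invented inputs the unit cost falls from £400 to £275 to £200 even though spending continues, which is the kind of economy of scale the slide title points at. Preservation actions are excluded here, as in the chart ("ex DP"); adding them as irregular extra costs in particular years would make the annual series uneven.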
Variables - what to preserve in a record?
• The original functionality?
  – To what extent?
    • Accuracy of numerical results
    • Links – original targets
    • Environmental dependencies
    • Etc.
• The original look and feel?
  – To what extent?
    • Colour fidelity?
    • Screen resolution
    • Screen layout
    • Fonts
    • Etc.
• Just the file / bit sequence?
• Audit trail, annotations?

Digital preservation: variables (2)
• Preservation actions as required
– But which method?
* Migration
* Emulation
* Archaeology
* Museum
* Other (eg Rosetta Stones)
– But each of these has quite different cost characteristics,
– And different implications for management
• Choice affects the value of the record

Digital preservation: cost drivers (1)
• Cost of materials (hardware, software, media)
• Hardware technology changes
• Availability of expert staff

Digital preservation: cost drivers (2)
• Heterogeneity of data types (today and tomorrow)
• Specialization of data
• Complexity of record structures
• Number of user communities
• Technology churn – software and communications
• Quality of the original data
• Indexing:
– Ease of metadata creation
– Depth and breadth of metadata
• Volumes (types, accumulation, annotations, audit trail)
• Frequency & regulation of access (ie, near-line, off-line)

OAIS: metadata the engine – but how many, and where do the costs fall?
[Diagram: OAIS model, from CCSDS, 2001, annotated with information packages. Labels shown: SIP (Submission Information Packages), AIP (Archival Information Packages), DIP (Dissemination Information Packages), Descriptive info. (metadata). Entities shown: Producer, Consumer, Management. Functions shown: Preservation Planning, Data Management, Ingest, Access, Archival Storage, Administration.]

Where do the costs fall?
• Who bears (or shares) which costs when?
– Original funders
– Data originators
– Curator(s)
– Other business units (eg IT departments)
– Data owners
– Current users, future users
– Future funders
• What accounting problems are involved?
– Different archive models imply different problems

Hard costs, soft benefits (mostly): HYPOTHETICAL MAP
[Chart: benefits placed on a scale from Hard to Soft against Conservative, Nominal and Aggressive estimates; individual positions are not recoverable. Items are grouped below by legend marker.]
Strong observed case:
– Decreased network back-up costs
– Higher TB per administrator for network
– Reduced h/w costs for network
– Regulatory compliance
– Digital search capabilities
Reasonable case:
– Reduced “discovery” costs in litigation
– Re-use of digital record
– Enhancements of “record” + annotations
Protection:
– Improved DR capabilities
– Ease of dissemination
– Patent protection
– Space saving

The other side of the coin: securing the budget
• Justifying the costs in terms of the benefits
• These are likely to be indirect
• Are the benefits measurable?
• Some archives may have income streams
• Balancing costs and benefits
• The risk of waste (several kinds)
• Staying within budget

Cost words again
• Reviewing those words around costs:
– Price
– Payback
– Proportion
– Financing
– Unit of measure
– Charge-back
– Investment
– Return
– Relative cost
– Replacement cost

Concluding questions and answers
• Is cost a risk in digital preservation?
• Is it possible to budget for digital preservation?
• Which cost elements can we predict with any degree of accuracy?
• Should the finance department have a cost centre for digital preservation?
• What are the implications of the cost choices identifiable today?
An understanding of costs in digital preservation is an important factor in maintaining maximum funding for digital archives.