Storage Tutorial For Content Creation and

Storage Tutorial For Content
Creation and Distribution
Tom Coughlin
Coughlin Associates
www.tomcoughlin.com
© 2004 Coughlin Associates
Outline
• Content Value Chain
• Storage Demand for Entertainment
Applications
• Storage Devices
• Storage Systems
• Digital Storage Applications for
Entertainment Media
• Conclusions
© 2004 Coughlin Associates
Storage Makes Me
Happpy!
© 2004 Coughlin Associates
Digital Content Value Chain
•PVR/DVR/set-tops
•Game Machines
•Mobile Devices
Content
Reception
Content
Creation
Content
Distribution
•Streaming Media
•VOD
•PPV
•Cameras
•Animation
Content
Editing
Content
Archiving
•Tape
•ATA Disk Arrays
•Optical Jukeboxes
© 2004 Coughlin Associates
•Field Editing
•Studio Editing
•Special Effects
Content Creation, Distribution, & Archive Market
(SCSI/FC HDD, ATA HDD, & Tape)
14,000,000
Source: 2004 Entertaimnent & Digital Media Storage Report
TB Shipped
12,000,000
10,000,000
Archiving Historical Content (Tape, TBs)
Distribution & Projection (ATA, TBs)
8,000,000
Archiving (Tape, TBs)
Editing (SCSI & FC HDD's, TBs)
Capture & Conversion (SCSI & FC HDD's, TBs)
6,000,000
4,000,000
2,000,000
0
2003
2004
2005
© 2004 Coughlin Associates
2006
2007
2008
Activities in Content Creation,
Editing, Archiving and Distribution
Content Creation
Ingest
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
Content Management
QA
Acquire Content
Field Editing
Ingest
Digitization
Capturing Metadata
Auto-Logging
Proxy Creation
Repository
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
Develop
Content Distribution
Web &
Application
Server
Deploy
Library and Archive Storage
Search Tools
Edit and Re-Purpose material
Metadata Management
Rights Management
Distribution Scheduling
Content Import/Export
Web access
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
© 2004 Coughlin Associates
Content
Delivery
Network
Access
Services
Various Delivery Networks
Point to Point & Point to
Multipoint
Push and Pull modes
Real-time and Scheduled
Dedicated Infrastructure
Format/Destination Selection
Caching and Edge Storage
Uncompressed Video Production Storage
Needs (Raw DPX 10 bit log files).
Resolution
Frames/sec
MB/second
Capacity/minute
(GB)
SD
1.7
38.4
2.3
1K
3.2
76.8
4.6
HD
8.2
197
11.8
2K
12.5
300
18.0
4K
50
1.2
72.0
© 2004 Coughlin Associates
Digital Production and Distribution
Rules!
• Save more than a factor of 50 in video
capture and editing costs vs. traditional
film
• Special effects and editing possible with
digital production can’t be matched with
older analog techniques
• Save 80% on digital theatre distribution vs.
film distribution
© 2004 Coughlin Associates
Adventures in Archiving
•
•
•
•
Demand huge, and growing
Long term storage formats
Format obsolescence
Need for format transfer planning—
archiving will not be merely static
• Enormous need for good metadata
tagging and data search and access
improvements
© 2004 Coughlin Associates
Storage Devices
© 2004 Coughlin Associates
Storage Hierarchy
© 2004 Coughlin Associates
Magnetic Rigid Disk Drives (HDD)
Spindle Motor
Disk
15k RPM FC Drive
Head Actuator
2.5 Inch Mobile Drive
High Capacity ATA Drives
(now up to 400 GB)
Head Suspension
© 2004 Coughlin Associates
Toshiba 0.85 inch HDD
HDD Areal and Volumetric Density Growth
Storage areal density CGR starts to slow from 100% per year near 100 Gbit/in2.
Volumetric density follows at similar rate.
Areal Density (Gb/in2)
1000
Progress begins
to slow down due
to technological
challenges
10000
New
Lab
Demos
100
Lab
Demos
10
Products
1
0.1
0.01
100%
CGR
25%
CGR
60%
CGR
Progress from breakthroughs,
including MR, GMR heads,
AFC media
0.001
1980
1990
2000
2010
Lab demos (year 2002) at 130 Gbit/in2
Volumetric Density (Gb/in3)
10000
1000
100
Storage volumetric
density has improved
based on increased
areal density, smaller
form factors and
closer packing of
disks
10
95%
CGR
1
0.1
14/10.8 inch
5.25
3.5
0.01
0.001
1980
© 2004 Coughlin Associates
1990
2000
Availability Year
From Clod Barrara, IBM
March 2004
2.5
1.0
2010
SHIPPING PRODUCT DISK
CAPACITY PROJECTIONS
Year
2002
2003
2004
2005
2006
95mm Mainstream Capacity Per Platter
40
80
120
180
270
By 2006 we could have four disk 3.5-inch disk SATA
drives with storage capacities of over 1 TB.
© 2004 Coughlin Associates
The Universe Still Beats Us by
Far in Information Capacity
50 Tbpsi
•The holographic and
universal information
bounds are far
beyond the data
storage capacities of
any current
technology!
X
Source: Information in the Holographic
Universe, August 2003 Scientific American
•Magnetic recording
technology may allow
up to 50 Tbpsi (50 X
1018 bpsi)
© 2004 Coughlin Associates
HDD Access Density
Sustained HDD I/O Rates per GByte
1000
Desktop and Server Drive Performance
100
<=7200 RPM
10
15K RPM
1
Desktop
5400 and 7200 RPM
10K RPM
0.1
1990
1995
2000
Year
© 2004 Coughlin Associates
2005
2010
From Clod Barrara, IBM
March 2004
HDD Reliability Trends
• MTBF/GB falling – drive rebuild times growing
• Multi-parity RAID a required aggregation
technology
1400
Mean-Time-To-Failure/GB of Storage
100
High/GB
Mean Time to Failure per GB
in k hours
Specified MTBF, khrs.
1200
1000
800
600
400
200
80
60
40
20
0
Ed Grochowski
Almaden Research Center
0
1985
Low/GB
1996
1997
1998
1999
2000
2001
Year of Drive Introduction
1990
1995
2000
2005
FCS Date
HDD MTBF Manufacturer Specifications
© 2004 Coughlin Associates
From Clod Barrara, IBM
March 2004
Popular Digital Tape Formats
(All ½ inch tape cartridge technologies)
SAIT
S-DLT
Tape with Tape Drive
LTO
•Tape is still digital archive media of choice
•Tape data access is on the order of minutes vs. milliseconds or seconds for disk
•Tape media costs have been somewhat underwritten by VCR tape production, implications
for future of tape costs
•½ inch tape capacities of up to 10 TB projected
© 2004 Coughlin Associates
Active tape format CAGRs are about 40%. Disk Drive
CAGRs are expected to be ~60%
100000
S-AIT
AIT
DDS
DLT
LTO
30% CAGR
60% CAGR
120% CAGR
1000
100
10
© 2004 Coughlin Associates
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1
1995
Tape Capacity (GB
10000
Blue Ray Optical Disks and Drive
© 2004 Coughlin Associates
MultiMedia Object Size/Bandwidth
Holographic Disks
© 2004 Coughlin Associates
Source: Telcordia 3/03
Storage Systems
© 2004 Coughlin Associates
On-line In-line
SSD
High Perf. Disk
Array
Capacity Disk
Array
Near-line
Archive
Capacity Disk Array
Tape
Tape
Optical Jukebox
WORM
Digital
Content
Lifecycle
in
Production
and
Distribution.
Content
Stage
Frequent
Changes
Frequent
Accesses
to Fixed
Content
In-Frequent
Accesses
to Fixed
Content
Production
Non-linear
editing
Production
Viewing
Archiving
Distribution
N/A
Distribution
Viewing,
PVR/DVR
DVDs, VHS,
Content
Downloads
© 2004 Coughlin Associates
Storage
Requirements
SGI InfiniteStorage DMF
Data Life Cycle Management
DMF manages data based on:
• age of file
• size of file
• type of file
Primary Storage
Online - high-performance disk
Nearline Disk
High Capacity, Low cost,
Promote Lower performance
used last 24 hrs
Tape Libraries
Higher capacity, lower cost
Promote
used last 7 days
Demote
> 7 days < 365
Gabriel Broner, SGI,
Jan. 2004
Demote
Archive
> 1 Yr < 2 Yr
> 2 Yr
© 2004 Coughlin Associates
Comparative Prices of Storage Systems
$70
$60
$50
Performance Disk
Midrange Disk
Capacity Disk
Automated Tape
$40
$30
$20
$10
$0
Low
High
© 2004 Coughlin Associates
Comparison of Tape and ATA
Disk Storage Economics
1000
Tape Drives
$/GB
100
10
Tape Drive +
100 Media
Ghetto RAID
1
Tape Media
0.1
Sony SAIT Projection
IDE Drives
0.01
1996
1998
2000
Tape Drive + 100 Media
2002
IDE Drive
© 2004 Coughlin Associates
2004
2006
Ghetto RAID
Device Virtualization
RAID subsystem is
a composite device
made with many
smaller devices.
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
Disk
Drive
© 2004 Coughlin Associates
RAID Systems
Volume
Manager
with RAID
Capability
Host System
Host System
Host I/O
Controller
RAID
Capability
or
Disk
Drive
Disk
Drive
Disk
Drive
JBOD with each device addressed
individually by host-resident RAID
Subsystem
RAID
Controller
Disk
Drive
Disk
Drive
Disk
Drive
RAID subsystem acting as a single
virtual device
© 2004 Coughlin Associates
RAID Advantages
• Can allow for more reliable data and/or
improved system performance
• A RAID requires fewer host I/O controller
slots. Also a RAID can use a single network
(e.g. SCSI) address rather than individual
addresses for each drive
• By creating a virtual drive with 1 file system
there is no need of the host to manage
separate file systems on the individual drives
© 2004 Coughlin Associates
Characteristics of RAID Levels
Usable disk space
Parity and
Redundancy
RAID 0
100%
None
RAID 1
50%
Duplicate data
RAID 5
67-93%
Parity distributed
over each drive
Minimum
number of disks
I/Os per Read
2
2
3
1 Read
1 Read
1 Read
I/Os per Write
1 Write
2 Write
2 Reads + 2 Writes
Best
Good
Worst for Writes
Worst
Best
Good
Best
Worst
Good
Tolerant of multiple,
simultaneous drive
failures. Higher
write performance
than RAID 5. Uses
the most storage
capacity for fault
tolerance.
Tolerant of single
drive failures. Uses
the least amount of
storage capacity for
fault tolerance
Performance
Fault Tolerance
Cost
Characteristics
Best over all
performance, but
data is lost if any
drive in the logical
drive fails. Uses
no storage space
forfault tolerance
© 2004 Coughlin Associates
RAID
Reliability
Hot
Spare
Disks
• Redundancy
– drives (hot
spares)
– power
supplies
– fans
– controllers
• Automatic failover to spares
RAID Group 1
RAID Group 2
© 2004 Coughlin Associates
RAID Group 3
Direct Attached vs. Networked
Storage
• In DAS (Direct Attached Storage) data storage
can be incrementally added to a computer
system and is subservient to the computer host.
• A SAN (Storage Area Network) is a “network
storage” system in which storage is accessed
through a separate storage network.
• A NAS (Network Attached Storage) is an
independent aggregated system that can be
attached to an existing LAN network in order to
increase network available storage.
© 2004 Coughlin Associates
© 2004 Coughlin Associates
NAS and SAN Architectures, InfoStor, December 2000
Switch
Zoning
Tape
Library
Loop 1
© 2004 Coughlin Associates
Standard Device Management
Interfaces – SMIS (SNIA Std.)
© 2004 Coughlin Associates
Examples of ATA-based Storage Systems
(Popular for Static Content Storage Systems)
Isilon IQ 3-Node 4.3 TB
Quantum DX30
The DX30 separates
backup functions from
archive functions to
optimize the data
protection process.
Nexsan ATABeast Nexsan's
14 TB for 7 cents a MB.
(See new introductions at
2004 NAB)
STK Bladestore product
uses 3.5 inch drives on
blade acting as one drive
to a fibre channel output
© 2004 Coughlin Associates
R200 now offers
up to 96 TB
MAID (Massive Array of Inactive
Disks)
• Disks inactive most of
the time (only about
25% active at any time)
• Can be RAID or JBOD
• Workload is mostly
writes, seldom read
• Reduced costs since
components shared
• Low power
• Field replaceable drives
• Start-ups ?? offering
MAID systems
© 2004 Coughlin Associates
Virtual Tape Cache for Backup
© 2004 Coughlin Associates
Tape-based Digital Content
Storage System
Sony
Petasite
Tape
Library
© 2004 Coughlin Associates
Content Software
•
•
•
•
•
•
•
SGI
Veritas
Exanet
SANbolic
Kasenna
Context Media
Many Others
© 2004 Coughlin Associates
Connection Interfaces and
Protocols
• SCSI
• Serial Attached
SCSI
• Fibre Channel
• FATA (Seagate and
HP)
• ATA
• Serial ATA
• TCP/IP and
variations
• iSCSI
• FC over IP
• Infiniband
© 2004 Coughlin Associates
Storage Interface Progression
Physical
Sector
ST-506
Logical
Block
Byte
String
Data
Separator
Sector
Buffer
ECC,
Geometry
Mapping
Space
Mgmt.
ESDI,
SMD
IPI-2
SCSI
OSD
• Each change represents intelligence moving from
host to drive
• Each advancement was met with resistance
• Eventually advantages of new intelligence were
From Seagate Technology, 2003
compelling
© 2004 Coughlin Associates
OSD: A New Standard Interface
CPU
CPU
Applications
Applications
System Call Interface
System Call Interface
File System
User Component
File System
User Component
File System
Space Management
Sector/LBA Interface
Block I/O Manager
Storage Device
OSD Interface
File System
Space Management
Block I/O Manager
Storage Device
Completes
CompletesDevice
DeviceAbstraction
Abstraction
© 2004 Coughlin Associates
From Seagate Technology, 2003
Object Storage Systems
Expect wide variety of Object Storage Devices
Disk array subsystem
“Smart” disk for objects
Prototype Seagate OSD
Ie. LLNL with Lustre
2 SATA disks – 240/500 GB
Highly integrated, single disk
16-Port GE
Switch Blade
Orchestrates system activity
Balances objects across OSDs
Stores up to 5 TBs per shelf
4 Gb/sec per
shelf to cluster
Battery-backed redundant power
© 2004 Coughlin Associates
From Seagate Technology, 2003
Applications for Entertainment
Content Storage
© 2004 Coughlin Associates
Professional Digital Camera
(Storage System for Content Capture)
Sony Optical
Camera
© 2004 Coughlin Associates
Asynchronous packet switched architecture
Production
Switcher
Camera
From External
Systems
Ingest
Client
Network
Storage
System
Playout
Client
To External
Systems
WAN
To External
Systems
Non Linear
Editor
© 2004 Coughlin Associates
Lowell Moulton, AF Associates
Jan. 2004
Material Exchange Format
(MXF)
• International standard
• Designed to enable distribution of A/V files
over IT infrastructures
• License free open source wrapper for
video, audio and metadata
• Real time streams or non real time file
transfers
• Wrapper can contain various metadata
such as DRM
© 2004 Coughlin Associates
Lowell Moulton, AF Associates
Jan. 2004
Material Exchange Format
(MXF)
• Partitions enable files to be read while
being written
• Files can also be tuned for file system
– KLV Alignment Grid (KAG)
– KAG specifies file system logical block size
• Standardized index tables
– Enable fast access to edit units and partitions
© 2004 Coughlin Associates
Lowell Moulton, AF Associates
Jan. 2004
Nonlinear Editing System
Connections
Services
Creating
Signals
Footage
Masters
Archives
Synchronous Signal I/O
Flat File
Render
Music &
MIDI
3D Object
Render
2D Paint
Graphics
I. P. Data Flow
Remote
I/O device
3D Titles
& Text
Gateway
Manager
3D Model
Animation
WAN
Asset
Analyzers
Browser
Viewer
Media File
Transcode
Asset
Catalog
Rights
Manager
LAN
Editing
Publishing
Audio
Edit/Mix
Narrative
Story Edit
Audio
Publish
Composite
3D DVE
Picture
Publish
Finishing
Versioning
HD/Film
Conform
HT/XML
Authoring
Review &
Approval
SAN
Near Line
Archives
Synchronous Signal I/O
© 2004 Coughlin Associates
Peter Fasciano, Avid,
Jan. 2004
Nonlinear System Design (Avid)
Video/Data Tape
Enterprise SAN Server
data
Microwave
data
data
Xmit Swx
Class 1 Video Server
Satellite
Prod Swx
Network
Network
Studio/Cameras
Videotape
Telecine
Audition
Satellite
Inbound Signals
Outbound Signals
Video
Signal
Router Associates
© 2004
Coughlin
Peter Fasciano, Avid,
Jan. 2004
Nonlinear System Design (Avid)
• The online real-time effects system
F
I
L
E
data
data
data
data
S
E
R
V
I
C
E
Titles
Switcher
& DVE
S
A
N
© 2004 Coughlin Associates
Synchronous Video
Asynchronous Data
Control & Metadata
Peter Fasciano, Avid,
Jan. 2004
Workflow with File Sharing (SGI)
Near-instantaneous access for data-intensive workflows
File A
Process 1
Process 2
Process 3
Process 4
Media Digitization
Color Correcting
Effects
Compositing
Mfg.
Visualization
Structural
Analysis
Design
Crash
Analysis
File sharing means large files don’t have to be moved
over the network—saving time, speeding workflow. Gabriel Broner, SGI,
© 2004 Coughlin Associates
Jan. 2004
Clustered Digital Content Storage
Isilon IQ Clustered Architecture
Traditional Storage Systems
Storage Device #1
Isilon IQ Cluster
Storage Device #2
Storage Device #3
GigE Switch
Mac Server
Mac Server
Win Server
Win Server
UNIX Server
UNIX Server
Acute Pain with Digital Content
•
•
•
•
Separate islands of storage
Complex & hard to grow
Server performance bottlenecks
Inherent single points of failure
Isilon IQ Eliminates Customer Pain
•
•
•
•
One single pool of storage
Simple, easy, & modular to grow
Cluster eliminates server bottlenecks
No single points of failure
© 2004 Coughlin Associates
Brett Goodwin, Isilon,
Jan. 2004
Conclusions
• Digital content creation and distribution will
require large volumes of storage
• Storage devices and requirements vary
throughout the content value chain.
• Storage device and architecture development
enables ever lower and more capable digital
content creation and distribution!
Acknowledgement: Much of the material from this presentation was created
while researching the 2004 Entertainment and Digital Media Storage
Report, Authors: Tom Coughlin, Pat Hanlon, and Dennis Waid. For more
information see www.tomcoughlin.com.
© 2004 Coughlin Associates
Digital Storage Will Entertain a New
Generation!
© 2004 Coughlin Associates