Nils Haustein | Executive IT Specialist | EMEA Storage Competence Center
Introduction to Digital Archiving and IBM Archive Storage Options
© 2016 IBM Corporation
Agenda
► Introduction to Archiving
Archiving Techniques
IBM Archive Storage Options
Archiving – why, what and how?
Archiving is a well-planned process in which data that usually no longer
changes is moved into an archiving system. Data access, search, processing
and compliance are guaranteed over long life cycles.
Archiving requires strategic thinking!
Reasons, Requirements, Challenges
Reasons
• Data Growth
• Regulatory Duty
• Preservation of Information
Requirements
• Cost and Efficiency
• Scalability, Flexibility
• Compliance*
• Operational Requirements
Challenges
• Long Lifecycle
• Technological Progress
*Compliance in accordance with regulatory requirements
Archive system architecture
• Archive Sources: E-Mails, Files, ERP, Database, PACS, Paper, www
• Connectors and Converters
• Archive Management: Enterprise Content Management System (indexing, search, discovery, information management)
• Archive Storage
Backup vs. Archiving
Backup                                   | Archiving
For recovery                             | For long-term retention
Data is copied                           | Data is moved to archive
Short-term retention, multiple versions  | Retention periods must be enforced over a long lifecycle
Compliance usually not required          | Compliance usually required
Backup is used to protect archived data
Agenda
Introduction to Archiving
► Archiving Techniques
IBM Archive Storage Options
Archive Management functions
• Archive management performs the archiving and retrieval process
– Selection, collection, ingest
– Metadata extraction and indexing
– Information and data management
– Search and retrieve
• Information management
– Classification, indexing and search
– Retention and business process management
• Data management
– Access control, auditing
– Expiration and migration
[Figure: Archive Management, comprising Information Management, Data Management and Index, on top of Archive Storage]
Archiving & retrieval processes
• Archiving is driven by archive management
– Select data to be archived from archive source
– Transfer data into archive management system
– Extract metadata and store it as index
– Store data in archive storage
• Retrieval is driven by archive management
– Provide search and discovery function to find data
o Search client may be integrated in archive source and / or archive management
– Locate selected data in archive storage using index
– Read data from archive storage and provide it to requesting application
[Figure: archive and retrieve flows through Search and Discovery, Archive Management, Index and Archive Storage]
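The archive and retrieve steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any IBM product is implemented: the `ArchiveManager` class, the in-memory dictionaries standing in for the index and the archive storage, and the metadata fields are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ArchiveManager:
    """Toy archive management layer (hypothetical, for illustration only)."""
    index: dict = field(default_factory=dict)    # object id -> metadata
    storage: dict = field(default_factory=dict)  # object id -> data (stands in for archive storage)

    def archive(self, name: str, data: bytes, source: str) -> str:
        # Transfer data into archive management and derive a stable object id.
        object_id = hashlib.sha256(data).hexdigest()
        # Extract metadata and store it as index.
        self.index[object_id] = {"name": name, "source": source, "size": len(data)}
        # Store data in archive storage.
        self.storage[object_id] = data
        return object_id

    def search(self, **criteria) -> list:
        # Search and discovery: match every given criterion against the index.
        return [oid for oid, meta in self.index.items()
                if all(meta.get(k) == v for k, v in criteria.items())]

    def retrieve(self, object_id: str) -> bytes:
        # Locate the data via the index, then read it from archive storage.
        if object_id not in self.index:
            raise KeyError(f"unknown object: {object_id}")
        return self.storage[object_id]


manager = ArchiveManager()
oid = manager.archive("invoice-2016-001.pdf", b"%PDF- example", source="ERP")
hits = manager.search(source="ERP")
print(hits == [oid], manager.retrieve(hits[0]))
```

The point of the sketch is the separation of concerns: search only ever touches the index, and archive storage is read only after an object has been located.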
Archive Storage functions
• Archive storage stores and retains data
• Complements archive management with value-adding storage functions
– Write-Once-Read-Many (WORM) protection
– High availability and disaster protection
– Data management (tiered storage, encryption, compression, deduplication, hybrid cloud etc.)
• Type of archive storage depends on requirements
[Figure: Archive Management with Data Management and Index, on top of Archive Storage]
Archive storage techniques
• Standardized interfaces
– Future-proof, fostering interoperability
– No vendor lock-in
• Support of different storage media technologies
– Increases flexibility and scalability
– Supports compliance aspects
• Tiered storage
– Supports cost efficiency
– Allows migration to new storage media
[Figure: Standard Interface in front of Archive Storage]
Find out more about archiving
Second edition of "Storage Networks Explained"
The second edition adds coverage of archiving, business continuity and FCoE.
2nd edition, September 2009
568 pages, hardcover
ISBN-10: 0-470-74143-0
http://www.wiley-vch.de/publish/en/books/forthcomingTitles/EE00/0-470-74143-0/?sID=502qbja41e6nfl1sl0qs1v9tc4
Agenda
Introduction to Archiving
Archiving Techniques
► IBM Archive Storage Options
What is the best archive storage medium?
• Longevity of the medium is not the dominating factor
• Logical and physical migration is inevitable (1)
– Logical migration: applications, ECM, formats
– Physical migration: platforms, networks, storage
• Typical criteria for archive storage
– Operating cost, scalability, future-proofing, compliance
[Figure: storage media options: Flash, Disk, Optical, Tape, Cloud]
(1) See also: "100 Year Archive Requirement Survey" by SNIA Data Management Forum
Why Tape is good for archiving
• Tape has superior TCO: 3 to 10 times lower cost than disk over 5 to 10 years
• Tape has a long lifecycle: ~10 years per generation
• Tape is reliable: read-after-write verification and two-dimensional ECC
• Tape is storage efficient: advanced compression
• Tape is secure: encryption and Write Once Read Many (WORM)
• Tape is standardized: LTO and LTFS
• Tape has high potential to scale: bit cells on tape can still be scaled down, so areal density can grow
Scalability of tape
[Figure: visualization of bit cells for different storage techniques]
• April 2015: IBM Research demonstrated a new record of 123 Gbit/in² in areal data density on magnetic particulate tape
– LTO-6 has 1.38 Gbit/in²
• At this areal density, a standard LTO size cartridge could store up to 220 terabytes of uncompressed data
– 88 times improvement over an LTO-6
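As a quick plausibility check, the figures on this slide can be put side by side. The sketch below uses only the numbers quoted above plus one assumption that the slide does not state: an LTO-6 native (uncompressed) cartridge capacity of 2.5 TB. Under that assumption the "88 times" figure matches the capacity ratio, while the ratio of the two areal densities comes out slightly higher.

```python
# Figures quoted on the slide
demo_density_gbit_in2 = 123.0   # IBM Research demonstration, April 2015
lto6_density_gbit_in2 = 1.38    # LTO-6 areal density as stated on the slide
projected_capacity_tb = 220.0   # projected uncompressed cartridge capacity

# Assumption (not on the slide): LTO-6 native capacity is 2.5 TB
lto6_native_capacity_tb = 2.5

density_ratio = demo_density_gbit_in2 / lto6_density_gbit_in2
capacity_ratio = projected_capacity_tb / lto6_native_capacity_tb

print(f"areal density ratio: {density_ratio:.1f}x")
print(f"capacity ratio:      {capacity_ratio:.1f}x")
```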
Total cost of ownership comparison
Cost                         | Disk          | Tape
Cost per GB per month        | 10¢           | 0.77¢
Cost per petabyte per month  | $100K         | $7.7K
Cost over 5 years            | $6.6 million  | $462 thousand
Source: third-party independent studies by Clipper Group and Enterprise Strategy Group
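The relationship between the rows of this comparison is simple arithmetic, sketched below. The sketch assumes decimal units (1 PB = 1,000,000 GB), a capacity of exactly one petabyte, and a flat monthly rate over 60 months; the function names are made up for the example. With those assumptions the per-petabyte figures and the five-year tape figure ($462 thousand) are reproduced, while the disk figure comes out at $6.0 million, so the $6.6 million quoted above presumably rests on inputs that are not shown on the slide.

```python
GB_PER_PB = 1_000_000   # decimal units: 1 PB = 1,000,000 GB
MONTHS = 5 * 12         # five years


def cost_per_pb_month(cents_per_gb_month: float) -> float:
    """Convert cents per GB per month into dollars per PB per month."""
    return cents_per_gb_month * GB_PER_PB / 100


def cost_over_period(cents_per_gb_month: float, months: int = MONTHS) -> float:
    """Dollars for one petabyte over the given number of months at a flat rate."""
    return cost_per_pb_month(cents_per_gb_month) * months


for medium, rate in (("disk", 10.0), ("tape", 0.77)):
    print(f"{medium}: ${cost_per_pb_month(rate):,.0f} per PB per month, "
          f"${cost_over_period(rate):,.0f} over 5 years")
```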
Compliance requirements are key decision criteria for storage
What does compliance mean?
• Complying with laws and regulations
– Regulations and laws vary by country and industry
• Most countries have common kinds of laws and regulations
– E.g. trade, tax, stock exchange and civil laws demanding data retention
• Key requirement for archive storage is to prevent deletion and modification
– Write-Once-Read-Many (WORM)
• "Certificates" document an assessment for compliance
– Usually not required by legal authorities
– Help to manage compliance risks
Common compliance requirements
• Specification of the kind of data to be preserved
• Data retention periods
• "Write Once Read Many" protection
– No deletion or modification of data during retention time
• Proof of completeness and authenticity
• Data access for auditing authorities during retention period
• Data and system protection (logical and physical)
• Deletion after expiration
• Compliance must be assured for the entire archive system
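Two of these requirements, WORM protection during the retention time and deletion only after expiration, can be illustrated with a small sketch. The `WormStore` class and its method names are hypothetical and exist only for this example; a real archive storage system enforces these rules below the application, where the application cannot bypass them.

```python
from datetime import datetime, timedelta, timezone


class RetentionError(Exception):
    """Raised when an operation would violate WORM protection."""


class WormStore:
    """Toy WORM store (hypothetical): write once, delete only after expiration."""

    def __init__(self):
        self._objects = {}  # name -> (data, retain_until)

    def write(self, name, data, retention, now=None):
        now = now or datetime.now(timezone.utc)
        if name in self._objects:
            # Write once: an existing object can never be overwritten.
            raise RetentionError(f"{name} already exists and cannot be modified")
        self._objects[name] = (data, now + retention)

    def read(self, name):
        # Read many: reading is always allowed.
        return self._objects[name][0]

    def delete(self, name, now=None):
        now = now or datetime.now(timezone.utc)
        _, retain_until = self._objects[name]
        if now < retain_until:
            raise RetentionError(f"{name} is under retention until {retain_until:%Y-%m-%d}")
        del self._objects[name]


store = WormStore()
t0 = datetime(2016, 1, 1, tzinfo=timezone.utc)
store.write("ledger.csv", b"2015 ledger", retention=timedelta(days=3650), now=t0)
try:
    store.delete("ledger.csv", now=t0 + timedelta(days=365))
except RetentionError as err:
    print("refused:", err)
```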
Archive storage options - overview
Disk only
• For file and object
• Fast access
• High change rates
• Higher cost for large capacities over longer periods of time
Disk and Tape
• For file and object
• Fast and slow access
• Low change rate
• Lower cost for large capacities over longer periods of time
Cloud
• Mostly object
• Medium access
• Cloud is not a storage technique but an architecture and operating model
• Cloud can use disk, tape and other techniques
Non-compliant IBM storage solutions - Overview
Spectrum Protect
• Protocols: TSM API
• Highlights: embedded backup and migration to tape; replication, high availability; most flexible
• Infrastructure (IBM Systems & Storage): Spectrum Protect
File & Object Storage
• Protocols: NFS, CIFS, Swift, S3
• Highlights: embedded backup and migration to tape; replication, high availability; most scalable
• Infrastructure (IBM Systems & Storage): Storwize V7000 Unified, Spectrum Scale, Spectrum Archive, Spectrum Protect (HSM), Cleversafe
Block storage
• Protocols: FCP, iSCSI
• Highlights: replication via disk; simplest
• Infrastructure (IBM Systems & Storage): IBM Storage, IBM Tape
[Figure: overview diagrams: Spectrum Protect on disk; file / object server or filer on disk; block storage on disk with tape management]
Spectrum Protect server
• Application uses TSM API via LAN for archive and retrieve
• Spectrum Protect server stores data on storage pools
– Storage pools can be different storage types
– Data can be migrated between storage pools
► Based on age and size
– Storage pool can also be object storage and cloud
• Archiving functions
– Backup of data and metadata
– Storage tiering
– Replication using node replication
– End-to-end encryption via TSM API
– Deduplication
– Cloud connection
[Figure: archive application, TSM API, TSM server on its operating system, storage network, and the storage types Flash, Disk, Tape, Cloud]
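Migration between storage pools "based on age and size" can be pictured as a selection rule over the objects in a pool. The sketch below is a generic illustration under invented assumptions: the `StoredObject` fields, the pool names and the thresholds are hypothetical, and it does not reproduce how Spectrum Protect itself configures or runs migration.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    name: str
    size_bytes: int
    age_days: int
    pool: str = "disk"


def select_for_migration(objects, min_age_days=30, min_size_bytes=1_000_000,
                         source_pool="disk"):
    """Return objects in source_pool that are both old enough and large enough."""
    return [o for o in objects
            if o.pool == source_pool
            and o.age_days >= min_age_days
            and o.size_bytes >= min_size_bytes]


def migrate(objects, target_pool="tape", **policy):
    """Move the selected objects to target_pool and return their names."""
    selected = select_for_migration(objects, **policy)
    for obj in selected:
        obj.pool = target_pool
    return [obj.name for obj in selected]


pool = [
    StoredObject("scan-001.tif", 40_000_000, age_days=400),
    StoredObject("note.txt", 2_000, age_days=900),
    StoredObject("video.mp4", 900_000_000, age_days=3),
]
print(migrate(pool, min_age_days=90))
```

Small or recently written objects stay on the faster pool, which is the usual motivation for combining both criteria.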
Spectrum Scale
• Application runs on Spectrum Scale node or connects via NAS (NFS, SMB) or object to Spectrum Scale file system
• Spectrum Scale file system is available on all cluster nodes
– Data is stored in pools represented by storage devices
– Data can be migrated between pools
► Based on flexible policies
– Connection to object storage and cloud
• Archiving functions
– Standardized data interfaces (NFS, SMB, S3, Swift)
– Global name space
– Built-in backup function to TSM
– Transparent storage tiering with tape
– Replication (synchronous and asynchronous)
– Encryption and compression
– Native RAID
– Cloud connector
[Figure: apps on GPFS client and NAS client access the file system via GPFS, NFS, SMB, Object over a TCP/IP network; a storage network connects Flash, Disk, Tape, Cloud]
Spectrum Archive
• Application runs on Spectrum Scale node or connects via NAS (NFS, SMB) or object to Spectrum Scale file system
• Spectrum Archive integrates with Spectrum Scale
– Facilitates migration and recall of files to LTFS tape
– Migration controlled by policies
• Archiving functions
– Transparent storage tiering with flash, disk and tape
– Standardized data interfaces (NFS, SMB, S3, Swift)
– Global name space
– Replication (synchronous and asynchronous)
– Encryption and compression
– Native RAID
[Figure: apps on GPFS client and NAS client access the file system via GPFS, NFS, SMB, Object over a TCP/IP network; a storage network connects Disk and Tape]
IBM Cloud Object Storage - Cleversafe
• Application connects via object API (S3, Swift, simple object) to Cleversafe
• Cleversafe Accesser nodes receive data and distribute it to Slicestor nodes
– Leverages the Information Dispersal Algorithm (IDA) to slice objects and perform erasure coding
– Object slices are stored on Slicestor nodes
– Provides data availability across locations
– Can be deployed on-premises, hybrid or off-premises
• Archive functions
– Cost efficient with innovative IDA
– Built-in encryption
– Geographical dispersal
– Central management
– Optimized for object storage
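The idea of slicing an object and adding redundancy so that it survives the loss of a slice can be shown with the simplest possible erasure code. The sketch below is not the Information Dispersal Algorithm: it swaps in single XOR parity, which tolerates exactly one missing slice, purely to illustrate the principle. The function names `disperse` and `reconstruct` are invented for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int) -> list:
    """Split data into k equal-length slices plus one XOR parity slice."""
    slice_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(slice_len * k, b"\0")
    slices = [padded[i * slice_len:(i + 1) * slice_len] for i in range(k)]
    parity = reduce(xor_bytes, slices)
    return slices + [parity]


def reconstruct(slices: list, original_len: int) -> bytes:
    """Rebuild the data when at most one slice (marked None) is missing."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one missing slice")
    if missing:
        present = [s for s in slices if s is not None]
        slices = list(slices)
        # XOR of all surviving slices yields the missing one.
        slices[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity slice and the padding.
    return b"".join(slices[:-1])[:original_len]


payload = b"archive me please"
pieces = disperse(payload, k=4)
pieces[2] = None  # simulate the loss of one slice store
print(reconstruct(pieces, len(payload)) == payload)
```

Production erasure codes generalize this so that any sufficiently large subset of slices can rebuild the object, which is what makes dispersal across several locations practical.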
IBM archive storage solutions provide the best for your workload
• Leverage Spectrum Scale as central point for tape tiering and hybrid cloud
Spectrum Scale
– Big Data Analytics
– Unified File (NAS) and Object storage
– Multi-site file collaboration
Spectrum Archive (tape storage tiering)
– Colder and long-term archives
– Cost efficiency with tape
– Transparency and automation
IBM Cloud Object Storage (Cleversafe) (Transparent Cloud Tiering)
– Active archives
– Geographic dispersal
– Object storage
• IBM Cloud Object Storage (Cleversafe) can be used without Spectrum Scale
Compliant IBM storage solutions - Overview
Spectrum Protect (SSAM)
• Protocols: TSM API
• Highlights: embedded backup and migration to tape; replication, high availability; assessed for compliance; most flexible
• Infrastructure (IBM Systems & Storage): Spectrum Protect for Data Retention (SSAM)
Spectrum Scale immutability
• Protocols: NFS, CIFS, POSIX
• Highlights: embedded backup and migration to tape; replication, high availability; assessed for compliance; most scalable
• Infrastructure (IBM Systems & Storage): Spectrum Scale, Spectrum Archive, Spectrum Protect (HSM)
WORM Tape
• Protocols: FCP, iSCSI
• Highlights: good streaming performance; tape is green; assessed for compliance; simpler
• Infrastructure (IBM Systems & Storage): IBM WORM Tape
[Figure: overview diagrams: SSAM on disk; Spectrum Scale filer on disk; WORM tape with disk and tape management]
Spectrum Protect for Data Retention (SSAM) - Overview
• Application uses TSM API via LAN for archive and retrieve
• SSAM is a special version of TSM, enriched with immutability features, that stores data on storage pools
– Storage pools can be different storage types
– Data can be migrated between storage pools
► Based on age and size
– Data cannot be deleted prior to expiration
– Storage pool can also be object storage and cloud
• Archiving functions
– Assessed for compliance
– Built-in backup of data and metadata
– Storage tiering
– End-to-end encryption via TSM API
– Deduplication
– Direct migration path from DR550 and IA
– Cloud connection
[Figure: archive application, TSM API, SSAM server on its operating system, storage network, and the storage types Flash, Disk, Tape, Cloud]
Spectrum Scale Immutability
• Application runs on Spectrum Scale node or connects via NAS (NFS, SMB) to Spectrum Scale fileset
• Spectrum Scale fileset is configured for immutability
– Allows file retention management with SnapLock®-like function
► Encode retention time in "last access date" and set read-only
– Can leverage many Spectrum Scale archive functions
• Archiving functions
– Assessment for compliance is planned for 3Q16
– Standardized data interfaces (NFS, SMB, S3, Swift)
– Global name space
– Built-in backup function to TSM
– Transparent storage tiering with tape (Spectrum Archive)
– Replication (synchronous)
– Encryption and compression
– Native RAID
[Figure: apps on GPFS client and NAS client access the fileset via GPFS, NFS, SMB over a TCP/IP network; a storage network connects Flash, Disk, Tape]
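The retention mechanism named on this slide, encoding the retention time in the last access date and then setting the file read-only, maps onto two ordinary file operations. The sketch below shows those client-side steps with Python's standard library; the function names are made up for the example. On an ordinary file system this only sets attributes and anyone with sufficient rights can undo it, so the actual protection is assumed to come from the immutable fileset, which is not modeled here.

```python
import os
import stat
import tempfile
import time


def set_retention(path: str, retain_until_epoch: float) -> None:
    """Encode the retention time in the last access time, then set read-only."""
    mtime = os.stat(path).st_mtime
    os.utime(path, (retain_until_epoch, mtime))  # (atime, mtime)
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def retention_expired(path: str, now: float = None) -> bool:
    """Interpret the last access time as the end of the retention period."""
    now = time.time() if now is None else now
    return now >= os.stat(path).st_atime


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "report.pdf")
    with open(path, "wb") as handle:
        handle.write(b"archived content")
    ten_years = 10 * 365 * 24 * 3600
    set_retention(path, time.time() + ten_years)
    print("expired now?", retention_expired(path))
    # Restore write permission so the temporary directory can be cleaned up on any platform.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```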
Summary
• An archive system comprises archive sources, management and storage
– Key decision criterion for selecting archive storage is compliance
• IBM provides all components for archiving solutions
– On-premises, hybrid or in the cloud
– Compliant and non-compliant
• IBM archive storage offers value-adding functions
– Better protection with integrated backup and recovery functions
– Better TCO with integrated storage tiering across different storage technologies
– Better operations with integrated data migration functions to new storage technologies
– Better integration with standardized data interfaces
• Tape helps to optimize cost and provides better protection
External References
• Book: Storage Networks Explained:
http://www.wiley-vch.de/publish/en/books/forthcomingTitles/EE00/0-470-74143-0/?sID=502qbja41e6nfl1sl0qs1v9tc4
• SNIA: 100 Year Archive Requirements Survey:
http://www.snia.org/sites/default/files2/100YrATF_Archive-Requirements-Survey_20070619.pdf
• Total cost of ownership studies for disk and tape storage solutions:
http://www.clipper.com/research/TCG2013009.pdf
http://www.esg-global.com/blogs/active-archival-storage-a-cost-of-ownership-analysis/
• SSAM home page:
http://www-306.ibm.com/software/tivoli/products/storage-mgr-data-reten/
• SSAM 6.3 assessment report by KPMG:
http://www.kpmg.de/bescheinigungen/RequestReport.aspx?38076
• Spectrum Archive home page:
http://www-03.ibm.com/systems/storage/tape/ltfs/index.html
• IBM Tape home page:
http://www-03.ibm.com/systems/storage/tape/
• Whitepaper: File archiving solutions with Spectrum Protect for Data Retention:
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100901
• Whitepaper: Archiving solutions with Spectrum Archive:
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102643
• Whitepaper: Spectrum Scale ILM Policies:
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102642
• Whitepaper: Spectrum Protect for Data Retention solutions:
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102624
• Whitepaper: Spectrum Scale immutability, Introduction and Use Cases:
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102620
Acknowledgements
• Thanks to Tom Clark (Chief Architect Storage Software), Frank Kraemer (Client Technical Architect, IBM Germany) and Ulf Troppens (Spectrum Scale Development) for the valuable feedback shaping this presentation