Managing a Tidal Wave of Data

Manage More Data with Less Infrastructure
Terry Burba
WW Software Evangelist, Tivoli Storage
© 2009 IBM Corporation
Agenda
Are You Drowning in a Tidal Wave of Data?
The 4 Steps to Reducing Your Data Footprint
Why IBM?
Next steps
The Tidal Wave of Data Continues …
The amount of digital information continues to grow exponentially …
And we need to keep more of it, longer …
And the costs of losing data are increasingly unacceptable …
– Lost revenues
– Lost customer confidence
– Embarrassment in the market
– Fines from contracts, government agencies
– CEO and CFO could go to jail
[Chart: data created and copied per year, 2005 to 2010]
Data created and copied is expected to grow at 48% CAGR through 2010
We Need to do More with Less, and we need to do it smarter
Source: Various external consultant reports
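To put the 48% CAGR figure in perspective, the short sketch below compounds it over five years. The function name and the choice of a five-year span (2005 to 2010, the range shown on the chart) are assumptions for illustration; the slide itself gives only the growth rate.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Return how many times larger a quantity is after compounding
    at a constant annual growth rate (CAGR) for the given number of years."""
    return (1.0 + cagr) ** years


if __name__ == "__main__":
    # 48% per year, compounded over five years
    print(f"{growth_multiple(0.48, 5):.1f}x")  # prints 7.1x
```

At that rate the volume of data roughly doubles every 21 months, which is why storage budgets that grow linearly cannot keep up.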
Accumulated Data is Growing Faster Every Year
The world’s total data per person
2003: 0.8 GB/person, “No Problem”
2006: 24 GB/person, “I think I can do this”
2010: 128 GB/person, “Mr. CIO, we have a problem”
Source of data growth: IDC: From Gigabytes to Yottabytes.
Source of population estimates: http://www.ibiblio.org/lunarbin/worldpop
The Pressures on Storage Administrators are Growing
The consequences of data growth:
It takes longer to perform backups
– Not meeting backup window allowances
– Some data is not being adequately protected
It takes longer to perform recoveries
– Increased downtime = lost revenue opportunity
– Data that isn’t protected can’t be recovered
Can’t keep buying more storage
– Running out of floor space / electrical & cooling capacity
– Administration and management costs exploding
New data sources are complicating the problems
– New applications coming on-line
– Mergers and acquisitions increasing # of supported systems
Surviving the Tidal Wave
Reducing your data storage footprint will:
Reduce your costs
– Less storage = less capital expenditures
– Less data = simplified management and
administration
Improve service levels
– Less downtime = higher application availability
– Improved competitiveness and customer
satisfaction
Mitigate risks
– Eliminate consequences of data loss
– Respond faster to events and legal/government
inquiries
IBM can help you build a dynamic storage infrastructure that will intelligently improve service levels, reduce costs and manage risks
Surviving the Tidal Wave
Steps to reducing your data storage footprint:
1. Avoid duplicating data – treat the cause, not the symptom
– Periodic full backups are the #1 cause of duplicate data
– Performing progressive incremental backups eliminates duplication
2. Categorize the data for migration & deletion
– Older data should be moved off production systems
– This will shorten backup cycles and improve application performance
3. Automate the migration, archival and deletion
– Set policies based on business requirements
– Move older, less-frequently accessed data to archive storage tiers
4. Compress and deduplicate what’s left
– Redundant copies may still exist on different source systems
– Deduplication can reduce capacity requirements by another 40-95%
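Reduction figures are quoted sometimes as a percentage (40-95%) and sometimes as a ratio (2:1, 20:1). The sketch below converts between the two so that claims can be compared on the same footing. The function names are illustrative assumptions, not part of any IBM product.

```python
def reduction_ratio(percent_reduction: float) -> float:
    """Convert a percent reduction (e.g. 95) into an N:1 ratio (e.g. 20.0)."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in the range [0, 100)")
    return 100.0 / (100.0 - percent_reduction)


def stored_capacity(original_tb: float, percent_reduction: float) -> float:
    """Capacity actually consumed after the given percent reduction."""
    return original_tb * (100.0 - percent_reduction) / 100.0


if __name__ == "__main__":
    for pct in (40, 95):
        print(f"{pct}% reduction = {reduction_ratio(pct):.2f}:1, "
              f"10 TB stored as {stored_capacity(10, pct):.1f} TB")
```

A 40% reduction is only about 1.67:1, while 95% is 20:1, so the two ends of the quoted range differ by more than an order of magnitude in capacity saved.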
1. Avoiding Data Duplication
Treat the cause of the problem, not the symptom
Avoiding Data Duplication
Performing periodic full backups is typically the largest contributor to data
growth in a data center
As much as 95% of your data doesn’t change from week-to-week
Every week, you make another copy of that data
Data deduplication solutions were created to address this problem
– When they claim 95% reduction ratios, this is the data they’re talking about
IBM has smarter solutions: never perform a full backup again!
Tivoli Storage Manager – ‘progressive-incremental’ and sub-file backup
Tivoli Storage Manager FastBack – block level incremental
Tivoli FastBack for Workstations – continuous incremental
Data Reduction: Progressive Incremental Backup
Features
ONLY new or changed files backed up
NO redundant backups
Restores don’t require the same file to be
restored multiple times
NO wasteful weekly full backups and their
dependent incremental/differential
Data tracked at file level
Accurate restores
Benefits
Requires less storage space, less network bandwidth and less time
Shorter backup windows
Fast accurate restores
Tivoli Storage Manager 6: Progressive Incremental Backup
Never perform a full backup again!
[Chart: Capacity Requirements Comparison. Backup capacity in gigabytes (0 to 2500) per day across Week 1 and Week 2 (Mon through Wkend) for Vendor A, Vendor B and TSM]
Backup Capacity Needed for 1 Month:
Vendor A (Full+Differential): 26TB
Vendor B (Full+Incremental): 14TB
IBM TSM (Progressive Incremental): 7TB
Assumes: Full backup completed, 2TB data to start, 26% annual growth rate, 10% new/changed data per day
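The comparison above can be approximated with a deliberately simplified model. The sketch below is an assumption-laden illustration, not the calculation behind the chart: it ignores the 26% annual growth, runs a backup every day of a 28-day period, takes a full backup on the first day of each week for the two "full" schemes, and assumes each day's changed data is distinct from the previous days' (so a differential grows by one day's changes each day). All names are hypothetical.

```python
def backup_capacity_tb(initial_tb: float = 2.0,
                       daily_change: float = 0.10,
                       days: int = 28) -> dict:
    """Total backup data written over `days` under three backup schemes."""
    daily_delta = initial_tb * daily_change  # new/changed data per day
    totals = {
        "full_plus_differential": 0.0,
        "full_plus_incremental": 0.0,
        "progressive_incremental": 0.0,
    }
    for day in range(days):
        days_since_full = day % 7
        if days_since_full == 0:
            # Weekly full backup copies everything again.
            totals["full_plus_differential"] += initial_tb
            totals["full_plus_incremental"] += initial_tb
        else:
            # Differential: everything changed since the last full.
            totals["full_plus_differential"] += daily_delta * days_since_full
            # Incremental: only what changed since yesterday.
            totals["full_plus_incremental"] += daily_delta
        # Progressive incremental: the initial full is already done,
        # so only the daily changes are ever sent.
        totals["progressive_incremental"] += daily_delta
    return totals


if __name__ == "__main__":
    for scheme, tb in backup_capacity_tb().items():
        print(f"{scheme}: {tb:.1f} TB")
```

With the defaults this model gives roughly 24.8 TB, 12.8 TB and 5.6 TB. The ordering and rough proportions match the 26TB / 14TB / 7TB shown on the slide, but the figures differ because growth and the slide's exact schedule are not modeled.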
2. Categorize the Data for Migration & Deletion
Determine What You Have BEFORE You Try To Fix It
IBM Tivoli Storage Productivity Center 4 for Data
Your file systems are probably bursting with data that is old and
rarely accessed
Some data can become a liability after it has outlived its useful lifetime
– Think e-discovery: do you know what was saved 5 years ago?
Categorizing and then migrating this old data from production
systems will:
– Reduce capacity requirements and lower CAPEX and OPEX
– Improve backup and restore performance
– Help meet data retention and expiration mandates
TPC identifies data for migration and deletion:
– By date saved or last accessed
– By location and owner
– By file type and size
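As a rough illustration of this kind of categorization, the sketch below walks a directory tree and reports files that have not been accessed or modified for a given number of days, along with their type, size and owner. It is a generic standard-library example and does not use or represent the Tivoli Storage Productivity Center interface; the function name, parameters and defaults are assumptions.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, min_age_days=365, min_size_bytes=0, now=None):
    """Return metadata for files under `root` untouched for `min_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Use the later of last access and last modification.
            last_used = max(info.st_atime, info.st_mtime)
            if last_used <= cutoff and info.st_size >= min_size_bytes:
                stale.append({
                    "path": str(path),                # location
                    "suffix": path.suffix.lower(),    # file type
                    "size": info.st_size,             # bytes
                    "owner_uid": info.st_uid,         # numeric owner (0 on Windows)
                    "last_used": last_used,           # epoch seconds
                })
    return stale


if __name__ == "__main__":
    candidates = find_stale_files(".", min_age_days=365)
    total = sum(f["size"] for f in candidates)
    print(f"{len(candidates)} files, {total} bytes are migration candidates")
```

Note that many file systems are mounted with access-time updates reduced or disabled, so last-access times may be less reliable than modification times.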
3. Automate the Migration, Archival and Deletion
Tivoli Storage Manager 6 for Space Management
Tivoli Storage Manager 6 HSM for Windows
Features
Migrates inactive data
Transparent recall
Policy managed
Integrated with backup
Benefits – Efficient Use of Storage
Improve response times of file servers by off-loading inactive data
Move low-activity or inactive files to a hierarchy of lower-cost storage
Use existing storage assets more efficiently
Reduce backup times and resource usage by focusing on active files only
Eliminate manual file system clean-up activities
[Diagram: App Servers with production data, and a TSM Server with disk pools, optical pools and tape pools]
Data Retention: Archive
Features
Long-term storage on cost-effective media
Point in time copy; revision history and auditability
Retention period and ‘retention hold’ enforcement
Fast expiration processing
Benefits – Records Retention
Speed file-server recovery times by moving file archive copies to a hierarchy of
lower-cost storage – recover only active data
Reduce backup times and resource usage by focusing on active files only
Move archived files to a hierarchy of lower-cost storage
Archived files are indexed with descriptive metadata to aid in locating historical
information
IBM’s File Archive Solutions: TSM and Information Archive
[Diagram: App Servers archive to and retrieve from either a TSM Server (disk pools, optical pools, tape pools) or IBM Information Archive]
IBM Tivoli Storage Manager 6
– Integrated Backup, HSM and Archive solution
– Leverage the same hierarchy of storage
IBM Information Archive
– Dedicated archive appliance
– Supported by more than 40 Apps
4. Deduplicate and Compress What’s Left
Basics of Data Deduplication
A ‘hot’ data reduction technology
Eliminates redundant subfiles (known as chunks, blocks, or extents)
Only one instance is stored for each common chunk
Duplicate instances of the chunk point to the stored chunk
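The three points above can be sketched in a few lines. This is a minimal, generic illustration that assumes fixed-size chunks identified by a SHA-256 digest; it is not how any particular IBM product is implemented (commercial products typically use more sophisticated chunking, and not all of them identify duplicates by hash). The class and method names are hypothetical.

```python
import hashlib


class ChunkStore:
    """Toy deduplicating store: each unique chunk is kept once."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (the single stored instance)
        self.files = {}   # file name -> ordered list of digests (pointers)

    def put(self, name: str, data: bytes) -> None:
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if it has not been seen before.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        self.files[name] = refs

    def get(self, name: str) -> bytes:
        # Rebuild the file by following its pointers to stored chunks.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)
    store.put("a", b"AAAABBBBCCCC")
    store.put("b", b"AAAABBBBDDDD")  # shares two chunks with "a"
    print(store.stored_bytes())      # prints 16, not 24
```

Two 12-byte files that share two of their three chunks consume 16 bytes rather than 24, and both can still be reconstructed exactly.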
Where Can Data Deduplication Occur?
Applications like email and content management are building in Single Instance Store and deduplication
Some backup applications can perform client or remote office server deduplication
WAN devices perform deduplication
Some deduplication vendors are promoting their appliances for live data as well as a backup target
VTLs serve as a target for backup applications and have added in-line and post-process deduplication
Some NAS devices perform Single Instance Store or fixed-block deduplication on live data or can serve as a target for backup applications
Backup applications like TSM are including server deduplication
[Diagram: Source Side (Backup Client, Remote Office Server) and Target Side (Backup Server, FC Storage, NAS Storage, VTL), connected by LAN and SAN]
Effective Target-Side Data Deduplication: TSM or ProtecTIER
Use the deduplication capabilities built into IBM
Tivoli Storage Manager 6 Extended Edition when:
– You have a single TSM server
– You want to improve TSM recovery times by storing
more backup data on disk
– There isn’t a large amount of duplicate data across
the systems protected by multiple TSM servers
Use the IBM System Storage ProtecTIER
TS7650 Deduplication solutions when:
– You have multiple TSM servers
– You have other sources of backup and archive data
– You are using other (non-IBM) backup products that
perform periodic full backups
Enhanced Data Reduction in Tivoli Storage Manager 6
New: Built-in Data Deduplication
Tivoli Storage Manager 6 Extended Edition includes data deduplication for disk
storage pools, at no extra charge
Improves recovery times as many more recovery points can be stored on disk; or
reduces the amount of capacity needed
Effective with data from any source including: API, backup, HSM, archive
Operates as a post-process / no impact on backup performance / automatic
space reclamation
Builds on automatic data compression
ESG Lab Report confirms 95% reduction ratio after just 11 weeks of backups
Tivoli Storage Manager 6, combining progressive incremental data
capture with data deduplication, does a better job of reducing
storage capacity requirements than pure data deduplication solutions
ProtecTIER’s Competitive Advantage
Strengths of our solution include:
Performance – In true apples-to-apples comparisons, our solution is the
fastest on the market in real customer environments
Scalability – A single ProtecTIER VTL can easily scale in both
performance (1000MB/sec) AND capacity (1PB)
Data Integrity – ProtecTIER is one of the few solutions that doesn’t rely
on a hash algorithm; it performs a byte-level differential to confirm that data
is a duplicate, for enterprise-class data integrity
Reliability – ProtecTIER features all IBM best-of-breed components
versus inexpensive OEM'd parts
Production Proven – There is more capacity (>25PB) deployed behind
ProtecTIER servers in production than any other vendor in the world
IBM world class service and support available world wide
Data Compression in Tivoli Storage Manager 6
TSM 6 includes algorithmic data compression as a
selectable option – yields 2:1 compression on
average
Sub-file backup can reduce the amount of data
being backed up when only small parts of a file
change
– Byte-Level: for smaller files, TSM can transfer only the bytes
that have changed since the last complete backup of the file
– Block-Level: for larger files, TSM can transfer changed blocks
– File-Level: if more than 60% of the file has changed, TSM
backs up the whole file
Tape Reclamation – increase tape utilization
Result: lower storage, bandwidth and management
costs
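The sub-file rules listed above amount to a small decision procedure, sketched below. Only the 60% whole-file threshold comes from the slide. The size boundary between "smaller" and "larger" files is not stated in the source, so `small_file_limit` is a hypothetical placeholder, as are the function and parameter names.

```python
def choose_backup_mode(file_size: int,
                       changed_bytes: int,
                       small_file_limit: int = 1024 * 1024,  # hypothetical cut-off
                       whole_file_threshold: float = 0.60) -> str:
    """Pick how a changed file would be sent, following the rules on the slide."""
    if file_size < 0 or changed_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if file_size == 0:
        return "file-level"  # nothing to diff against
    changed_fraction = changed_bytes / file_size
    if changed_fraction > whole_file_threshold:
        return "file-level"   # most of the file changed: send it whole
    if file_size <= small_file_limit:
        return "byte-level"   # smaller file: send only changed bytes
    return "block-level"      # larger file: send only changed blocks


if __name__ == "__main__":
    print(choose_backup_mode(100_000, 1_000))                # byte-level
    print(choose_backup_mode(50 * 1024 * 1024, 1024 * 1024))  # block-level
    print(choose_backup_mode(10_000_000, 7_000_000))          # file-level
```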
Review: Choosing TSM or ProtecTIER for Data Deduplication
Both Solutions Offer the Benefits of Target side Deduplication:
– Greatly reduced storage capacity requirements
– Lower operational costs, energy usage and TCO
– Faster recoveries with more data on disk
Use TSM 6 Built-in Deduplication When:
– You desire deduplication operations be completely integrated within TSM 6
– The benefits of deduplication are desired without separate hardware or software
dependencies or licenses (ships with TSM 6 Extended Edition)
– You desire end to end data lifecycle management with minimized data store
Use ProtecTIER When:
– Highest performance and capacity scaling are required!
– Up to 500 MB/sec (1GB/s with 2 node) deduplication rates are needed
– Deduplicated capacities up to 1 PB are required (25 PB of original data)
– You wish to avoid operational impact of post processing deduplication
– A VTL appliance model is desired
– Deduplicating across multiple TSM (or other backup) servers
– You don’t have TSM and are performing weekly full backups
Complementary Solutions Today!
Can be used together but don’t deduplicate the same data twice
Why IBM?
IBM’s Unique Position in the Industry
IBM is the only vendor with a comprehensive set of data
reduction technologies
IBM’s broad portfolio of data reduction solutions gives us the
freedom to solve customer issues with the most effective
technology
IBM’s high quality global support services will ensure your
investment in data reduction will meet your needs
IBM is continuing to invest in research and development to
further develop and deliver the advanced features our
customers are requesting
End to End Unified Recovery Management – Reference Architecture
[Diagram: Remote Office(s), Data Center and D/R Site linked over the WAN. Labeled components: FastBack Clients, TSM B/A Clients, FastBack for Workstations (New), Application Servers, File Servers, VMware Servers, FastBack Servers, Tivoli Storage Manager Server, Tiers of Storage]
Tivoli Storage Manager: enterprise-class data management includes backup, archive,
HSM, support for hundreds of devices and a broad range of operating systems
Tivoli Storage Manager FastBack: next-generation backup and near-instant restore for
critical servers; backup consolidation & disaster recovery for remote offices; and more …
Tivoli Storage Manager FastBack for Workstations: continuous data protection (CDP)
for desktop and laptop systems with centralized management
Thank you for your time today.
For more information:
http://www.ibm.com/software/tivoli/products/storage-mgr-extended/
http://www.ibm.com/systems/storage/tape/ts7530/index.html