Software-Defined Storage (SDS): Market Overview and IBM/TPC

Christian Bolik, IBM Storage Software Development
Software-Defined Storage (SDS)
© 2013 IBM Corporation
Objectives
• Understand the driving forces behind the desire to move to an SDS
• Understand the purpose of SDS, and its relation to SDE and Cloud
• Gain insight into the 2 primary perspectives on SDS
• Learn about what is required for establishing an SDS
The increasing complexity, volume, and value of data
8 zettabytes of digital content created by 2015
The Information Explosion…
The information explosion meets budget reality
• Information doubling every 18-24 months
• Storage growing 20-40% per year
• Storage budgets up 1%-5%
● Velocity of Change
● Acquisitions
● Mergers
● Consolidations
● ILM, Data Retention initiatives
● “Born on the Web” type applications
● Legal Requirements
● Regulations demanding data to be retained for many years
● Ever increasing variety of data stored digitally…
[Chart: data volume scale from Gigabytes through Terabytes, Petabytes and Exabytes to Zettabytes; time axis 2000, 2005, 2010, 2015]
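The gap between data growth and budget growth can be made concrete with a little arithmetic. The sketch below converts the doubling periods quoted above into annual rates and compares compounded capacity growth with compounded budget growth. The 5-year horizon and the mid-range inputs (30% capacity growth, 3% budget growth) are assumptions chosen for illustration, not figures from the slide.

```python
# Illustrative arithmetic only; horizon and mid-range rates are assumptions.

def annual_rate_from_doubling(months: float) -> float:
    """Annual growth rate implied by doubling every `months` months."""
    return 2 ** (12 / months) - 1


def compound(rate: float, years: int) -> float:
    """Growth factor after `years` of compounding at `rate` per year."""
    return (1 + rate) ** years


if __name__ == "__main__":
    for months in (18, 24):
        rate = annual_rate_from_doubling(months)
        print(f"doubling every {months} months = {rate:.1%} per year")

    years = 5                          # assumed planning horizon
    capacity = compound(0.30, years)   # storage growing 30% per year
    budget = compound(0.03, years)     # budget growing 3% per year
    print(f"after {years} years: capacity x{capacity:.2f}, budget x{budget:.2f}")
    print(f"cost per unit of capacity must fall to {budget / capacity:.0%} of today's")
```

Under these assumptions the budget per unit of stored capacity has to shrink substantially, which is the pressure the following slides attribute to growth, cost and complexity.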
“Big Data”
Source: http://www.domo.com/blog/2012/06/how-much-data-is-created-every-minute/
Storage management pain points
• Top pain points are dominated by
– Growth management
– Cost
– Complexity
• Problems seem even more severe for midsize enterprises compared to large enterprises
Source: The InfoPro Storage Study 1H12 – 451 Research
Managing increasing amounts of storage takes time… and money
Survey respondents reported that 77% of storage staff time is devoted to
administration of ongoing operations, that is, to tasks that could be automated.
The InfoPro Storage Study 1H12 – 451 Research
The special needs of Virtual Servers
• 60% of storage spend in 2011 was for attachment to virtual servers
• Nearly all customers reported some storage issue with VM usage.
• Virtual servers bring their own unique storage requirements, and need special consideration for:
– New capacity
– New operational processes – DR
– New performance management
– Planning considerations
[Chart: 2011 Storage Spend, Virtualized vs. Non-Virtualized: $20.082 vs. $13.331. Source: IDC Storage Workloads, 10/2012]
From: Research Report: 2012 Storage Market Survey. Source: Enterprise Strategy Group
Created for Connie Bright, IBM. © 2012 Enterprise Strategy Group, Inc. All Rights Reserved
Changing Workload Requirements
Systems of Engagement (Situational Need): Agility & Rapid Scale, Born on Cloud
• Orchestration across compute/network/storage for provisioning, deployment and management of workloads (DevOps)
• Dynamic scalability as applications and data requirements grow
• Cost-optimized storage via disks embedded in servers
• Multi-tenant security at a fine-grained, highly scaled level
• Open support of industry standards and APIs
Systems of Record (Traditional Operations): Workload Optimized & Transaction Integrity, Enabled for Cloud
• Orchestration across compute/network/storage for provisioning, deployment, and management of workloads
• Automation of provisioning and configuration of storage based on application requirements, with ongoing adjustments based on policies/SLA
• Programmable adjustments to storage (via APIs) as application needs change
• Heterogeneous environment support
• Efficient management of data copies (backup/archive/compliance)
Value is shifting to software to provide the dynamic and agile storage environment required by these workloads
Introducing: Software-Defined Storage (SDS)
• IDC Definition of SDS:
A software-defined data center is “...a loosely coupled set of software
components that seek to virtualize and federate datacenter-wide
hardware resources such as storage, compute, and network
resources.... The goal for a software-defined datacenter is to....make
the datacenter available in the form of an integrated service....”
• Key attributes
– It is software: flexibility, lower cost
– Offers a full suite of storage services: abstraction of storage capabilities
– Federates physical storage capacity from multiple locations/technologies: flexibility through virtualisation
Based on “IDC’s Worldwide Software-Based (Software-Defined) Storage Taxonomy, 2013”
Software Defined Storage = programmable smart storage
Today’s World, with No Software Defined Storage
1. A Workload Definition Layer (or application) defines storage capacity requirements
2. Storage administrators define logical volumes with required storage capacity and do a best guess of performance requirements
3. Storage administrators map the logical volumes to the application
4. All the following events will need storage administrators’ intervention:
a) Storage capacity needs to be increased or decreased
b) Application performance degrades due to resource contention
c) Performance requirements change (increase or decrease)
d) Data protection needs change
e) Replication policies change
f) RPO and RTO of the data changes
g) Backup and archive policy changes

The New World with Software Defined Storage: “Programmable Storage”
1. A Workload Definition Layer (or application) will specify storage requirements explicitly:
a) Performance
b) Capacity
c) RPO/RTO
d) Replication, etc.
2. A Workload Orchestrator will schedule workload with appropriate compute, network and storage resources to satisfy Service Level Objectives
3. If performance of an application is impacted, storage service will automatically detect it and adjust resources to maintain its Service Level Agreements
4. If the requirements are changed, applications will communicate with storage via APIs. Storage service will adjust the resources accordingly
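The “programmable storage” flow described on this slide can be sketched in a few lines. This is a minimal illustration, not an IBM or OpenStack API: the class names `StorageRequirements` and `StorageService`, their fields and their methods are all hypothetical, chosen only to mirror the four steps listed for the new world.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StorageRequirements:
    """Explicit requirements a workload definition would carry (hypothetical fields)."""
    capacity_gb: int
    min_iops: int
    rpo_minutes: int
    rto_minutes: int
    replicas: int


class StorageService:
    """Hypothetical storage service that tracks the requirements per workload."""

    def __init__(self) -> None:
        self._allocations: dict[str, StorageRequirements] = {}

    def provision(self, workload: str, req: StorageRequirements) -> None:
        # Steps 1 and 2: the orchestrator hands over explicit requirements.
        self._allocations[workload] = req

    def needs_adjustment(self, workload: str, observed_iops: int) -> bool:
        # Step 3: detect that observed performance violates the objective.
        return observed_iops < self._allocations[workload].min_iops

    def update(self, workload: str, **changes) -> StorageRequirements:
        # Step 4: the application changes its requirements through the API.
        new_req = replace(self._allocations[workload], **changes)
        self._allocations[workload] = new_req
        return new_req


if __name__ == "__main__":
    service = StorageService()
    service.provision("orders-db", StorageRequirements(500, 2000, 15, 60, 2))
    print(service.needs_adjustment("orders-db", observed_iops=1500))
    print(service.update("orders-db", capacity_gb=750))
```

The point of the sketch is that no administrator appears in the loop: requirement changes and performance deviations are handled through calls on the service rather than through manual volume reconfiguration.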
Key characteristics of an SDS-enabled Storage Service
• Commoditized persistent data storage (lower cost)
• Service-based infrastructure (easy to consume)
• Open standards and interfaces based platform (no vendor lock-in)
• Focus on solution rather than technical platform (application-oriented)
• Scalability (capacity, throughput, performance)
• Resilient (always available)
• Workload-aware (“fit for purpose”, optimized)
• Covering block, file and object storage
• Cost-efficient and highly automated
SDS in SDE – Software Defined Environments
• Tighter coordination between applications and storage/network:
– Exposing storage capabilities for the software to dynamically provision storage with the most suitable characteristics
– Introducing new operations between software and storage to let storage better adapt to the needs of software
– Integrating storage functions to the software to leverage higher-level knowledge
• Control planes separated from the hardware to the software layer. Unified Control Planes allow rich resource abstractions to assemble purpose fit systems
• Programmable infrastructures allow for dynamic optimization to respond to business requirements
[Figure: SDE architecture]
Control Plane: Workload Abstraction; SDE Unified Control Plane: Cross-Domain Orchestration; Resource Abstraction; APIs to SDC Control and Config, SDN Control and Config, SDS Control and Config
Data Plane: Heterogeneous Compute Resources & capabilities; Virtualized Network; Heterogeneous Storage Resources & capabilities
Relation of SDS and SDE to Cloud Layering
[Figure: cloud layer stack, top to bottom]
• Business Process as a Service: Enabling business transformation (Business Process Solutions, Applications)
• Software as a Service: Marketplace of high value consumable business applications (External Ecosystem, Industry, Collaboration, Human Resources, Big Data & Analytics, Commerce, Marketing, Social, Traditional Workloads)
• Platform as a Service: Composable and integrated application development platform, built using open standards (Development, Big Data & Analytics, Security, Integration, Mobile)
• Infrastructure as a Service: Enterprise class, optimized infrastructure, via Software-Defined Environments (SDE), built using open standards (Software-Defined Compute, Software-Defined Storage, Software-Defined Networking)
Public. Private. Dynamic hybrid.
Different views of the same coin...
Expectations on a Storage Service:
• Consumer: Self-service. Provider: Highly automated storage lifecycle management
• Consumer: Flexible and dynamic, elastic. Provider: Virtualized and standards-based, simple capacity planning and forecasting
• Consumer: Cost-efficient, no overprovisioning. Provider: Automated and optimized, space-efficient
• Consumer: Charged by capacity and service level used. Provider: Capacity reporting and metering, multi-tenancy-enabled
• Consumer: Reliable and always available. Provider: Highly available, replicated, self-monitoring and self-healing, secure
• Consumer: No need to have any knowledge of infrastructure details. Provider: Automated mapping of consumer requirements to infrastructure capabilities
Key Aspect of IT Service Management in General:
Mapping Business Requirements (Consumer) to Infrastructure Capabilities (Provider)
Separation of concerns
What this means for Storage Service Management
Mapping Business Requirements to Infrastructure Capabilities
• Business Requirements: Capacity, Accessibility, Availability, Performance, Security, Retention/Compliance
• Infrastructure Capabilities: Media type, Disk technologies, RAID levels, Encryption, Compression, Thin Provisioning, Number of Copies, Access latency, Access protocols, Backup/Replication, etc.
Establishing a service catalog of supported service classes which
service consumers can choose from
Service Catalog
• Service Class “Platinum”: $$$$
• Service Class “Gold”: $$$
• Service Class “Silver”: $$
• Service Class “Bronze”: $
Different service classes map to different levels of service in some or all of the different service level categories:
• Accessibility
• Availability
• Performance
• Consistency
• Retention / Compliance
• Security
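A service catalog of this kind can be represented as plain data, with selection reduced to “cheapest class that satisfies every requested level”. The sketch below is illustrative only: the attribute names, the numeric service levels and the relative prices are invented for the example and do not come from the slide, which only names the four classes and the categories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceClass:
    name: str
    relative_price: float     # assumed: 4 = "$$$$" ... 1 = "$"
    availability_pct: float   # assumed availability level
    avg_iops: int             # assumed performance level
    copies: int               # assumed consistency level (number of copies)


# Hypothetical catalog values; only the class names are taken from the slide.
CATALOG = [
    ServiceClass("Platinum", 4.0, 99.999, 10000, 3),
    ServiceClass("Gold", 3.0, 99.99, 5000, 2),
    ServiceClass("Silver", 2.0, 99.9, 1000, 2),
    ServiceClass("Bronze", 1.0, 99.0, 200, 1),
]


def cheapest_class(availability_pct: float, avg_iops: int, copies: int) -> Optional[ServiceClass]:
    """Return the lowest-priced class meeting all requirements, or None."""
    candidates = [
        c for c in CATALOG
        if c.availability_pct >= availability_pct
        and c.avg_iops >= avg_iops
        and c.copies >= copies
    ]
    return min(candidates, key=lambda c: c.relative_price, default=None)


if __name__ == "__main__":
    print(cheapest_class(availability_pct=99.9, avg_iops=800, copies=2))
```

A consumer then only names requirements or a class; how the class is realized on the infrastructure stays on the provider side.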
Defining Requirements for Storage Services:
Service Level Categories – Service Level Objectives (SLOs)
• Accessibility
– Initial Access Time
– Data Sharing
– Requires Access Transparency
– Max Out-Of-Space Duration
• Availability
– Availability Period
– Planned Downtime
– Max. Unplanned Downtime Aggregate
– Max Unplanned Downtime Per Instance
– Recovery Point Objective (RPO)
– Recovery Time Objective (RTO)
• Consistency
– Number Of Copies
– Number Of Versions
– Retain Deleted
• Performance
– Avg. I/O Rate
– Avg. Data Throughput
• Retention / Compliance
– Immutability
– Disposal
– Durability
– Retention Period
• Security
– Accountability
– Integrity
– Authenticity
– Confidentiality
– Physical Security
Mapping storage resource and management capabilities to SLOs
• Accessibility (Initial Access Time, Data Sharing, Requires Access Transparency, Max Out-Of-Space Duration)
– Capabilities: Tape/Disk/Flash, HSM, NAS exports, vaulting, thin provisioning, ....
• Availability (Availability Period, Planned Downtime, Max. Unplanned Downtime Aggregate, Max Unplanned Downtime Per Instance, Recovery Point Objective (RPO), Recovery Time Objective (RTO))
– Capabilities: Metro Mirror, Global Mirror, Snapshots (app-aware?), Backup/Restore (file/image level), versioning, ....
• Consistency (Number Of Copies, Number Of Versions, Retain Deleted)
• Performance (Avg. I/O Rate, Avg. Data Throughput)
– Capabilities: Different disk media (RPMs etc.), tape, flash, RAID levels, Cache, ...
• Retention / Compliance (Immutability, Disposal, Durability, Retention Period)
– Capabilities: WORM storage, automated deletion, data shredding, media lifetime, ...
• Security (Accountability, Integrity, Authenticity, Confidentiality, Physical Security)
– Capabilities: Encryption, key management, access controls, lockable cabinets, etc.
Provider’s Goal: Maximize storage capacity, minimize down-time:
Store data with as little cost as possible while maintaining committed SLAs (Service Level Agreements) – How?
• Thin provisioning: Allocate only as much storage as is used, expand allocation as needed
• Compression: Reduce actual capacity used
• Data deduplication: Store only one copy of files/blocks containing the same data
• Tiering: Always place data on the lowest cost storage tier which still fulfills customer requirements, optimize continuously
• Monitoring: Threshold-based alerting to detect impending performance bottlenecks early, balance volumes to address them
• Virtualization: Employ virtualization to have the freedom of moving data to lower cost storage without any downtime
Optimal Storage Tier Distribution
• Tier 0: 1-5%
• Tier 1: 15-20%
• Tier 2: 20-25%
• Tier 3: 50-60%
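The tiering rule above (lowest cost tier that still fulfills the requirement) and the tier distribution can be sketched as follows. The per-tier costs and latencies are invented for illustration, and the shares (5%, 15%, 25%, 55%) are one assumed point inside the ranges shown on the slide, picked so that they sum to 100%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_gb: float      # assumed relative cost
    max_latency_ms: float   # assumed worst-case access latency
    share: float            # assumed share of total capacity


# Hypothetical tier characteristics; only the tier names and share ranges are from the slide.
TIERS = [
    Tier("Tier 0", 10.0, 1.0, 0.05),
    Tier("Tier 1", 4.0, 5.0, 0.15),
    Tier("Tier 2", 2.0, 20.0, 0.25),
    Tier("Tier 3", 1.0, 100.0, 0.55),
]


def place(required_latency_ms: float) -> Tier:
    """Pick the cheapest tier whose latency still meets the requirement."""
    eligible = [t for t in TIERS if t.max_latency_ms <= required_latency_ms]
    if not eligible:
        raise ValueError("no tier meets the latency requirement")
    return min(eligible, key=lambda t: t.cost_per_gb)


def blended_cost_per_gb() -> float:
    """Average cost per GB when capacity is spread according to the shares."""
    return sum(t.cost_per_gb * t.share for t in TIERS)


if __name__ == "__main__":
    print(place(5.0).name)
    print(round(blended_cost_per_gb(), 2))
```

With these assumed numbers, the blended cost sits much closer to the cheapest tier than to Tier 0, which is the effect the distribution on the slide aims for.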
Flexibility through Storage Virtualization
Traditional Storage
• Capacity is isolated in SAN islands
• Multiple management points
• Potentially poor capacity utilization
• Capacity is purchased for and owned by individual applications
With Storage Virtualization
• Combines storage capacity into 1 large storage pool
• Single management point
• Uses storage assets more efficiently
• Capacity purchases can be deferred
• Plus: Non-disruptive data migration between storage resources
[Figure: SAN-attached arrays from IBM, EMC, HP and HDS, shown at 20%, 55%, 50% and 95% capacity, without and with a Storage Hypervisor]
Storage Management Interface Abstraction via SMI-S
• SMI-S (Storage Management Initiative – Specification) started in 2002, with the goal to standardize management interfaces of storage devices
• SMI-S is currently supported by 21 different vendors (http://www.snia.org/ctp/conformingproviders/index.html)
• SMI-S is developed by a workgroup of the SNIA (Storage Networking Industry Association); it has since been accepted both as an ANSI and an ISO standard
• SMI-S builds on CIM (Common Information Model), defined by the DMTF, uses XML for formatting the payload, and HTTP as the transport mechanism
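To make “XML payload over HTTP” tangible, the sketch below builds a CIM-XML request body for the intrinsic operation EnumerateInstanceNames, using only the Python standard library. It does not contact any device. The namespace `root/cimv2` is an assumption (SMI-S providers commonly use vendor-specific namespaces), and `CIM_StorageVolume` is used as an example class; the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def enumerate_instance_names_request(class_name: str, namespace: str,
                                     message_id: str = "1001") -> bytes:
    """Build a CIM-XML EnumerateInstanceNames request body (no network I/O)."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    request = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(request, "IMETHODCALL", NAME="EnumerateInstanceNames")

    # The namespace path is sent as one NAMESPACE element per path segment.
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for segment in namespace.split("/"):
        ET.SubElement(path, "NAMESPACE", NAME=segment)

    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)
    return ET.tostring(cim, encoding="utf-8", xml_declaration=True)


# HTTP headers that would accompany the POST of this body to the CIM server.
HEADERS = {
    "Content-Type": 'application/xml; charset="utf-8"',
    "CIMOperation": "MethodCall",
    "CIMMethod": "EnumerateInstanceNames",
    "CIMObject": "root/cimv2",  # assumed namespace, must match the body
}

if __name__ == "__main__":
    body = enumerate_instance_names_request("CIM_StorageVolume", "root/cimv2")
    print(body.decode("utf-8"))
```

In practice a WBEM client library would generate and send such messages; the hand-built form is shown only to illustrate how thin the transport layer beneath SMI-S is.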
OSLC: Built on Linked Data
Linked Data describes a method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried [1]
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards
4. Include links to other URIs, so that they can discover more things
OSLC describes a method for integration of disparate tools, across domains, by providing a set of integration services, through which other tools can be discovered, and more information about resources retrieved. This is enabled by Linked Data.
Open Services for Lifecycle Collaboration: http://open-services.net/
[1] Bizer, Heath, Berners-Lee (2009). "Linked Data - The Story So Far"
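The four Linked Data principles can be illustrated with a toy, in-memory stand-in for the Web. In the sketch below every resource is named by an HTTP URI, a lookup returns a description, and each description links to further URIs that a client can follow. The `example.org` URIs, the resource fields and the function names are all hypothetical; a real OSLC integration would perform HTTP requests and exchange RDF representations instead of Python dictionaries.

```python
# Hypothetical resources; a dictionary lookup stands in for an HTTP GET.
RESOURCES = {
    "http://example.org/storage/volumes/1": {
        "type": "Volume",
        "title": "db-volume",
        "links": {"pool": "http://example.org/storage/pools/7"},
    },
    "http://example.org/storage/pools/7": {
        "type": "Pool",
        "title": "gold-pool",
        "links": {"system": "http://example.org/storage/systems/A"},
    },
    "http://example.org/storage/systems/A": {
        "type": "StorageSystem",
        "title": "array-A",
        "links": {},
    },
}


def lookup(uri: str) -> dict:
    """Principles 2 and 3: looking up a URI yields useful information."""
    return RESOURCES[uri]


def discover(start_uri: str) -> list[str]:
    """Principle 4: follow links to discover further resources."""
    seen: list[str] = []
    queue = [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        queue.extend(lookup(uri)["links"].values())
    return seen


if __name__ == "__main__":
    for uri in discover("http://example.org/storage/volumes/1"):
        print(uri, "->", lookup(uri)["type"])
```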
OpenStack provides an open mechanism for provisioning/managing storage to workloads and is driving a rapidly developing ecosystem
• OpenStack storage includes
– Cinder: Provision and manage block storage for compute
– Swift: object storage
– Manila (future): file storage
• IBM provides support for OpenStack, and provides extensions through standard mechanisms
• OpenStack provides a common, open interface for ISVs, applications and cloud admins to provision and manage storage resources
– Integrated with compute and networking management
[Figure: OpenStack Cinder (commands, capabilities, drivers) with drivers for IBM storage and TPC, an IBM solution for internal storage, and community/competitor storage support; IBM Enterprise Object Storage Solution Platform with object middleware on OpenStack Swift. Labels: Integration with SDN and SDC; Smarter Data Protection; Smarter Mgmt on any storage; Community: Enable & Extend; Differentiate]
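As an illustration of what “provisioning block storage through an open interface” looks like, the sketch below assembles the HTTP request for creating a volume via the OpenStack Block Storage (Cinder) REST API. It only builds the method, path and JSON body and performs no network call or authentication. The `/v3/` path reflects the current API version, which postdates this presentation; the project ID, volume name and the volume type "gold" are placeholders, and the helper function is hypothetical.

```python
import json
from typing import Optional


def create_volume_request(project_id: str, size_gb: int, name: str,
                          volume_type: Optional[str] = None) -> tuple[str, str, str]:
    """Return (method, path, JSON body) for a Cinder create-volume call."""
    volume = {"size": size_gb, "name": name}
    if volume_type is not None:
        # A volume type is how a deployment can expose something like a service class.
        volume["volume_type"] = volume_type
    return "POST", f"/v3/{project_id}/volumes", json.dumps({"volume": volume})


if __name__ == "__main__":
    method, path, body = create_volume_request("my-project-id", 10, "db-data", "gold")
    print(method, path)
    print(body)
```

A deployment could map volume types to service classes such as Gold or Silver, which is one way the catalog idea from the earlier slides can surface in a Cinder-based environment.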
In Summary...
• The exponential and on-going growth in data storage requirements calls
for new, more flexible storage management methods
• Software-Defined Storage promises to provide the flexibility, service
orientation, and cost-efficiency required to address today’s requirements
• By abstracting storage resource capabilities through service classes and
APIs, SDS is enabled to “snap-in” to an SDE
• The 2 primary views on SDS are that of a service consumer and a
service provider, each having overlapping goals and expectations
• Main challenges for the provider are to map consumer-specified
business requirements to storage infrastructure capabilities, and to
maintain committed SLAs
• For the provider to be able to offer an attractively priced, yet sustainable
storage service, various technologies and methods exist
BACKUP
What’s the problem with storage these days?
Data growth is exponential…
Digital Information Created, Captured, Replicated WW
• 2006: 180 exabytes
• 2007: 280 exabytes
• ...
• 2011: 1800 exabytes (1800 billion gigabytes)
• Expected compound annual growth rate is almost 60%
The World’s total data per person:
• 2003: 0.8 GB/person
• 2006: 24 GB/person
• 2010: 128 GB/person
Drivers:
● Velocity of Change
● Acquisitions
● Mergers
● Consolidations
● ILM, Data Retention initiatives
● “Born on the Web” type applications
● Legal Requirements
● Regulations demanding data to be retained for many years
● Ever increasing variety of data stored digitally…
Sources:
IDC, Worldwide Disk Storage Systems 2007-2011 Forecast Update, Doc #209490
IDC Whitepaper: The Diverse and Exploding Digital Universe, March 2008
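The “almost 60%” compound annual growth rate can be checked against the two endpoints given on the slide (180 exabytes in 2006, 1800 exabytes in 2011). The helper name `cagr` is just a label for the standard formula.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    rate = cagr(180, 1800, 2011 - 2006)
    print(f"{rate:.1%}")  # prints 58.5%
```

A tenfold increase over five years corresponds to roughly 58.5% per year, consistent with the figure quoted on the slide.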
Globally, storage requirement is 80% file-based unstructured data,
and growing
• Explosion of data, transactions, and digitally-aware devices strains IT infrastructure and operations. Storage capacity is doubling every 18 months.
• Majority of this data is unstructured and file-based, such as user files, medical images, web and rich media content, growing at 63%
• Block storage, while still well suited for existing OLTP/database workloads, is not where majority of strategic analytics-based applications and strategic storage initiatives are being deployed
[Chart: Worldwide Storage Capacity Shipped by Segment, 2008–2013]
Source: IDC, State of File-Based Storage Use in Organizations: Results from IDC's 2009 Trends in File-Based Storage Survey: Dec 2009: Doc # 221138
Customer Storage Needs - General
• Across a range of customer types, rapid growth of unstructured data, the complexity of data protection, and hardware costs are the biggest challenges.
• There is a list of other issues, made worse by the size and growth rates of data
– Space constraints, poor utilization
– Management tasks
– Long implementation times
– Lack of skills
– Staff costs
• Several System trends show through to Storage
– Support for virtual server environments
– Support for VDI
From: Research Report: 2012 Storage Market Survey. Source: Enterprise Strategy Group
Created for Connie Bright, IBM. © 2012 Enterprise Strategy Group, Inc. All Rights Reserved
[Figure: SOFTWARE DEFINED STORAGE, positioned between SOFTWARE DEFINED COMPUTE and SOFTWARE DEFINED NETWORK]
Storage Services Layer (service classes Platinum, Gold, Silver, Bronze)
• Security and Availability: Authentication/Auditing, Encryption, Mirroring/DR, High Availability, Backup & Recovery
• Performance and Opt.: Striping, Clustering, Compression, Deduplication, Tiering/ILM
Workload Abstraction, Resource Abstraction, Mapping to Resource, Continuous Optimization
RESILIENCY
• Storage replication
• Disaster recovery
• Consistency groups
• Backup
HETEROGENEITY
• Storage Abstraction
• Storage Provisioning
• Storage Monitoring
• SAN/GPFS/NAS/DAS
CAPABILITY OPTIMIZATION
• Storage tiers
• Performance aware placement
• Continuous optimizations
• Migration
FABRIC MANAGEMENT
• FC/FCoE/iSCSI/Infiniband
• Zone management
What is needed for Software Defined Storage?
Management: Control Plane (incl. resource abstraction)
• Storage Service Management
• Storage Resource Management
• Business Continuity Management
• Data Protection Management
I/O: Data Plane
Devices
• Block Storage Systems / Storage Arrays
• File Storage Systems / NAS Filers
• Object Storage Systems
• Tape Systems / Archive Systems
• Storage Virtualizers
• Storage Networks
Services
• Thin Provisioning
• De-Duplication
• Data Replication
• Encryption
• Compression
• ...
Example of a Software Defined Storage Platform
Key attributes (IDC):
• It is software
• It offers a full suite of storage services
• It federates physical storage capacity
Control Plane Layer: Management Software Platform
• IBM SmartCloud Virtual Storage Center (Tivoli Storage Productivity Center / FlashCopy Manager)
– Policy-based Management and Automation
– Snapshot and Backup Management
Data Plane Layer: Storage Software Platform
• IBM Storwize Storage Software Platform, Feature Options:
– Security and Availability: Authentication/Auditing, Encryption, Mirroring/DR, High Availability, Backup & Recovery
– Performance and Opt.: Striping, Clustering, Compression, Deduplication, Tiering/ILM
• Object Storage, Cluster File System, Block Virtualization
Storage Infrastructure