Hyperconvergence in Secondary Storage for VMware Environments

James Dean
Cohesity
Introduction
Improve Data Protection and solve
the Dark Data/Analytics Problem
Confidential & Proprietary
Complexity and fragmentation built out of necessity
• A decade spent consolidating your servers with VMware
• New era of consolidation around secondary storage
• Entire ecosystems were born to integrate with various
VMware solutions
• Multiple vendors were created, and existing vendors
developed new solutions to handle Data Protection, Disaster
Recovery, DevOps, Analytics, and more!
• End users left with pools of data that serve one purpose and
limit the return on investment
The storage truth, secret and problem
(The TRUTH)
Primary storage handles your Tier-1 mission-critical workloads,
including your VMware vSphere environment.
The storage truth, secret and problem
(The SECRET)
Secondary Storage constitutes 80% of the data footprint in the
typical datacenter today.
The storage truth, secret and problem
(The PROBLEM)
Secondary storage is fragmented among various data protection,
automation, and analytics solutions; all bolted together with
different requirements, storage solutions, and software
providers.
That’s a BIG problem
The Storage Iceberg
Primary Storage: Mission Critical
Fragmented
Inefficient
Secondary Storage
File Shares
Archiving
Test / Dev
Dark Data
Analytics
Backups
Cloud
File Shares
Archiving
Analytics
Test / Dev
Backups
Cloud
Hyperconverged Secondary Storage
Solve the Data Protection Problem
Legacy Data Protection Issues
Complex
Master Servers
Media Servers
Tape
Slow
Weekly Full
Daily Incremental
Target Storage
Hours to recover
Cloud = Afterthought
Expensive
✖ Hardware silos
✖ 24-hour RPOs
✖ Bolt-on cloud gateways
✖ >$10K / protected TB
✖ Multiple management UIs
✖ Long backup windows
✖ Long-term retention only
✖ Consulting costs
✖ Slow recovery times
✖ No scale-out
✖ Forklift upgrades
✖ Fragmented dedupe
Just to provide an insurance policy, with protected data unproductive 99% of the time
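To make the "Weekly Full / Daily Incremental" schedule and the 24-hour RPO concrete, here is a minimal Python sketch. It assumes one backup per day and a full backup once a week; the function and constant names are hypothetical and do not come from any backup product.

```python
# Illustrative only: models a weekly-full / daily-incremental schedule.
# Assumes exactly one backup per day; names are hypothetical.

WORST_CASE_RPO_HOURS = 24  # one backup per day, so up to a day of changes can be lost


def restore_chain(days_since_full: int) -> list[str]:
    """Return the backups that must be replayed, oldest first, to restore
    to the most recent daily backup."""
    if not 0 <= days_since_full <= 6:
        raise ValueError("with a weekly full, days_since_full must be 0..6")
    chain = ["weekly full"]
    chain += [f"daily incremental {d}" for d in range(1, days_since_full + 1)]
    return chain


if __name__ == "__main__":
    for step in restore_chain(5):
        print(step)
```

Five days after the full backup, a restore replays six backups in order, which is one reason recovery under this model takes hours.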
Data Protection Is Complex
Virtual environments
Physical servers
Databases
SECONDARY STORAGE
Tape Library
Media Servers
Master Servers
Target Storage
Cloud Gateway
Cloud
Test / Dev
Analytics
Files & Objects
Simplify Data Protection with Cohesity
Virtual environments
Databases
Physical servers
SECONDARY STORAGE
Tape Library
Media Servers
Master Servers
Target Storage
Cloud Gateway
Web-scale platform for secondary data
Disaster
Recovery
Test / Dev
Cloud
Analytics
What Can Cohesity Do For Your Data Protection?
Simple
• Converged data protection
• Scale-out platform
• Single UI
Cloud-Ready
• Integrates with all leading public clouds
• Archival, tiering, and replication
Fast
• Sub-minute RPO
• Instantaneous RTO
Cost-Effective
• 80% lower TCO for backup and target storage
Productive Data
• Disaster Recovery
• Test/Dev
• Analytics
Dark Data and Analytics Problem
What Is Dark Data?
• Dark data is operational data that is not being used. Consulting and market research company Gartner Inc. describes dark data as "information assets that organizations collect, process and store in the course of their regular business activity, but generally fail to use for other purposes."
• Dark data is unstructured, which means that the information is in formats that may be difficult to categorize, read by a computer, and analyze
• Dark data (if analyzed) can reveal patterns, trends, efficiencies, security issues, and other issues that were previously unknown because the data sets were disparate and uncorrelated
• It’s estimated that 54% of enterprise data is dark data
The dark data facts!
• It’s estimated that roughly 90% of data generated by sensors and analog-to-digital conversions never gets used.
• In a traditional enterprise:
  • 54% of data is dark data
  • 32% is redundant, obsolete, or trivial
  • 14% is business critical
• 90% of energy used by data centers is wasted on traditional data storage methods
• It is estimated that most enterprises are only analyzing 1% of their data. (We know that number is higher here!!)
• Backup data is required and largely sits idle 99% of the time, making it DARK!!
• Backup data is already in YOUR possession, why not use it?
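As a rough illustration of the 54% / 32% / 14% split, the sketch below applies those estimated shares to a data footprint. The 500 TB figure and the function name are hypothetical examples, not measurements.

```python
# Illustrative arithmetic only: applies the estimated enterprise data split
# from the slide to a hypothetical total footprint.

SHARE_PERCENT = {
    "dark": 54,
    "redundant, obsolete, or trivial": 32,
    "business critical": 14,
}


def breakdown_tb(total_tb: float) -> dict[str, float]:
    """Split a total footprint (in TB) by the estimated percentages."""
    return {name: total_tb * pct / 100 for name, pct in SHARE_PERCENT.items()}


if __name__ == "__main__":
    for name, tb in breakdown_tb(500).items():
        print(f"{name}: {tb} TB")
```

For a hypothetical 500 TB environment, that works out to 270 TB of dark data against 70 TB that is business critical.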
Analytics Challenge
Production
Test / Dev
Analytics Cluster
***This requires 2–3 copies of your data, not including the full set of data sitting on your backup system
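The copy count above translates into capacity as in the following sketch. It assumes every copy is full size with no deduplication or compression across silos, which is a simplifying assumption; the function name and the 100 TB figure are hypothetical.

```python
# Illustrative arithmetic only: total capacity consumed when each silo keeps
# its own full-size copy. Assumes no dedupe or compression between silos.

def footprint_tb(dataset_tb: float, copies: int, backup_copy: bool = True) -> float:
    """Capacity used by `copies` full-size copies plus an optional backup copy."""
    if copies < 1:
        raise ValueError("there is always at least one copy of the data")
    return dataset_tb * (copies + (1 if backup_copy else 0))


if __name__ == "__main__":
    for copies in (2, 3):
        print(copies, "copies + backup:", footprint_tb(100, copies), "TB")
```

Under these assumptions a 100 TB data set consumes 300 to 400 TB once the backup copy is counted.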
Cohesity Solution
Production
DevOps
Analytics Cluster
• Built-in Analytics
  • Hadoop
  • MapReduce
• Real-time Indexing
• Analytics Workbench
Storage Metrics
• Utilization/Capacity
• Growth Trending
File Metrics
• Usage by File Type
VM Metrics
• Usage by VM
***Single copy of your production data for DevOps and Analytics
Analytics WorkBench (AWB)
• Deep Analysis
• MapReduce Engine
• Native Tools
• Distributed
• Runs Natively
• Customizable
• Inject Custom Code
• Specific Jobs
• Specific File Types
AWB Custom Apps
• Clone/Create Mapper
• Clone/Create Reducer
• Upload Your Own JAR
• Cohesity Pre-conf Apps
  • Distributed GREP
  • Key Word Search
  • Pattern Matching
  • Social Security / Password
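The mapper/reducer split behind a distributed grep can be sketched in plain Python as below. This is not the Analytics Workbench API (custom apps there are uploaded as JARs); it is a single-process illustration of the pattern, and the function names, file names, and the simple Social Security number regex are all assumptions.

```python
# Single-process illustration of a MapReduce-style grep.
# Not a Cohesity or Hadoop API; all names here are hypothetical.
import re
from collections import defaultdict
from typing import Iterable, Iterator

# Simplified pattern for values shaped like a US Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mapper(path: str, text: str) -> Iterator[tuple[str, int]]:
    """Emit (path, line_number) for every line that matches the pattern."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        if SSN_PATTERN.search(line):
            yield path, line_no


def reducer(path: str, line_numbers: Iterable[int]) -> tuple[str, list[int]]:
    """Collect all matching line numbers for one file."""
    return path, sorted(line_numbers)


def run_job(files: dict[str, str]) -> dict[str, list[int]]:
    """Run map, group by key (the 'shuffle'), then reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for path, text in files.items():
        for key, value in mapper(path, text):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = {
        "hr/notes.txt": "name: J. Doe\nssn: 123-45-6789\n",
        "eng/readme.txt": "nothing sensitive here\n",
    }
    print(run_job(sample))
```

In a real cluster the map phase would run in parallel on the nodes holding the data; here it is a plain loop so the example stays self-contained.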
Analytics Workbench - Future
• 3rd Party Support
• Splunk
• Apache Spark
• … and more!
Product Overview
Data Platform
Cohesity DataPlatform
Web-scale platform designed to consolidate secondary data and workflows
Hyperconverged Nodes
Storage and compute capacity
VM
✓ Software-defined
✓ Distributed, web-scale architecture
✓ Support for NFS, SMB and S3
✓ Global dedupe & compression
✓ SnapTree for zero-cost, unlimited snapshots and clones
✓ Multitenancy & QoS
✓ Encryption (FIPS 140-2 Certified)
✓ Remote replication
✓ Public & Gov cloud integration (C2S Coming)
✓ Hadoop & MapReduce built in and open
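To show the general idea behind deduplication, the sketch below stores each unique fixed-size chunk once, keyed by its SHA-256 hash. It is a generic illustration under stated assumptions (fixed 4 KiB chunks, in-memory store) and does not describe how Cohesity implements global dedupe; all names are hypothetical.

```python
# Generic content-hash deduplication sketch.
# Assumes fixed-size chunks and an in-memory store; not a vendor implementation.
import hashlib


def dedupe(blobs: list[bytes], chunk_size: int = 4096) -> tuple[int, int]:
    """Return (logical_bytes, physical_bytes) after storing unique chunks once."""
    store: dict[str, bytes] = {}
    logical = 0
    for blob in blobs:
        for offset in range(0, len(blob), chunk_size):
            chunk = blob[offset:offset + chunk_size]
            logical += len(chunk)
            # Identical chunks hash to the same key and are stored only once.
            store.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    physical = sum(len(chunk) for chunk in store.values())
    return logical, physical


if __name__ == "__main__":
    backups = [b"A" * 8192, b"A" * 8192 + b"B" * 4096]
    logical, physical = dedupe(backups)
    print(f"logical={logical} physical={physical}")
```

Because the second backup shares most of its chunks with the first, far fewer bytes are physically stored than were logically written.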
Questions?
Thank you
James Dean
443-871-2404
[email protected]