Why the storage you have is not the storage your data needs
Laz Vekiarides
ClearSky Data, Inc
2015 Storage Developer Conference. © ClearSky Data. All Rights Reserved.
Enterprise Storage Today
Flash
$/TB
Complex, costly silos
Mid-Range
Scale Out
Capacity
What Enterprises Really Want
High Performance Where It’s Needed
Enterprise Availability & Security
Cloud Economics & Scalability
Flash
$/TB
Complex, costly silos
Mid-Range
Scale Out
Capacity
Tiering is a bad answer
Nothing remains static:
• How fast does hot data cool?
• How fast does it re-warm?
• Is the overhead from this churn manageable?
• How can we use the cloud?

It’s the Latency, Stupid
(Apologies to Stuart Cheshire)


Data travels at the speed of light
• 3 x 10^8 meters per second
• 186,000 miles per second
• Fast, but finite
Example: Boston to San Francisco
• 2740 miles
• 29.4 milliseconds RT
There are more delays
• Light travels more slowly in fiber
• Fiber-optic repeaters every few hundred miles
• Switches, routers
• Protocols, virtualization, etc.
• End result: ~70ms
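The Boston to San Francisco figure can be reproduced with a few lines of arithmetic. The sketch below uses the distance and the speed of light quoted on the slide; the function name `round_trip_ms` and the fiber velocity factor of 0.67 (light in glass travels at roughly two thirds of its vacuum speed) are assumptions added for illustration, not values from the talk.

```python
# Speed of light in miles per second, as quoted on the slide.
SPEED_OF_LIGHT_MI_PER_S = 186_000


def round_trip_ms(distance_miles, velocity_factor=1.0):
    """Round-trip propagation delay in milliseconds.

    velocity_factor scales the speed of light for the medium
    (1.0 = vacuum). It is an illustrative parameter, not from the slide.
    """
    one_way_s = distance_miles / (SPEED_OF_LIGHT_MI_PER_S * velocity_factor)
    return 2 * one_way_s * 1000


# Boston to San Francisco, 2740 miles (slide figure), in vacuum.
vacuum_rtt = round_trip_ms(2740)

# Same path in fiber, ASSUMING a velocity factor of about 0.67.
fiber_rtt = round_trip_ms(2740, velocity_factor=0.67)

print(f"vacuum: {vacuum_rtt:.1f} ms, fiber (assumed 0.67c): {fiber_rtt:.1f} ms")
```

Even before repeaters, switches, routers, and protocol overhead are counted, the assumed fiber slowdown alone adds roughly half again to the vacuum figure, which is consistent with the direction of the slide's ~70ms end result.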
So, Where Exactly Is “The Cloud”?




Amazon East is near Ashburn, VA
West is in Northern California
Boston is closest to East
Best case numbers:
• ~10ms round trip (private line)
• From BOS MPOP via Direct Connect Ethernet
• Does not include time to actually access the storage
Worst case ~150ms (IP transit)
The ClearSky Solution:
A Global Storage Network
Metro-based fully managed service
Primary
Recovery
Back-up
Complete lifecycle management
SLA-guaranteed for enterprise workloads
ClearSky: Geo-Distributed Data Caching
Remote DR
Mirrored
Copy
Edge Cache
iSCSI/NFS/Fiber Channel
Data Services
N x Metro E
ClearSky
Metro
Cache
Cloud
Customer
SAN
Edge
Metro POP
Latency Math
• Best case miss path ~25 ms
• Worst case <50 ms
Edge
CPE
N x Metro Ethernet
Direct 10G links
Metro
Private Line: 1-2 ms
Metro Process: <1 ms
S3 Direct Connect: 10-25 ms
S3 Lookup: 15 ms
Worst case ~1%
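One way to read the per-hop figures on this slide is as a latency budget for a miss that falls all the way through to cloud storage. The sketch below simply sums the low and high end of each hop. Treating the hops as strictly additive, and modeling "<1 ms" as a 0 to 1 ms range, are assumptions made here for illustration; the names `HOPS_MS` and `miss_path_bounds` are hypothetical.

```python
# Per-hop latency ranges in milliseconds, read off the slide.
# ASSUMPTION: the hops add up serially on a full miss.
HOPS_MS = {
    "private_line": (1.0, 2.0),
    "metro_process": (0.0, 1.0),       # slide says "<1 ms"; 0-1 ms assumed
    "s3_direct_connect": (10.0, 25.0),
    "s3_lookup": (15.0, 15.0),
}


def miss_path_bounds(hops):
    """Return (best_case_ms, worst_case_ms) by summing each hop's range."""
    best = sum(low for low, _ in hops.values())
    worst = sum(high for _, high in hops.values())
    return best, worst


best_ms, worst_ms = miss_path_bounds(HOPS_MS)
print(f"miss path: best ~{best_ms:.0f} ms, worst ~{worst_ms:.0f} ms")
```

Under these assumptions the sums land at 26 ms and 43 ms, in line with the slide's "best case miss path ~25 ms" and "worst case <50 ms".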
Current Space Management


All managed data is migrated to a Cloud provider for durability
All data is optimized
• Deduplicated
• Encrypted
At least three tiers
• Hot (local)
• Warm/near-line (POP, <2ms)
• Cold, e.g. S3 (<20ms)
Local appliance need only cache hot dataset (~10%)
EDGE: Hot, 10% (1 copy)
POP: Warm, <30% (1-2 copies)
CLOUD: Cold, 100% (n copies)
Modeling Cache Performance*
Lower is better
Miss Ratio Curve (MRC)
• Performance as f(size)
• Working set knees
• Inform allocation policy
Reuse distance
• Unique intervening blocks between use and reuse
• LRU, stack algorithms
*Courtesy of Irfan Ahmad &
CloudPhysics
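The link between reuse distance and the miss ratio curve can be sketched in a few lines. Under LRU, an access hits in a cache of C blocks exactly when fewer than C unique blocks were touched since the previous access to the same block, so one pass over a trace yields the miss ratio for every cache size. The code below is a naive O(n^2) illustration of that stack-algorithm idea, not the CloudPhysics implementation; the function names and the toy trace are made up for the example.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Per access: number of unique blocks seen since the previous access
    to the same block, or None for a first-time (cold) access."""
    stack = OrderedDict()  # LRU stack, most recently used block last
    distances = []
    for block in trace:
        if block in stack:
            keys = list(stack)
            # Blocks above this one in the stack are the unique
            # intervening blocks between use and reuse.
            distances.append(len(keys) - 1 - keys.index(block))
            stack.move_to_end(block)
        else:
            distances.append(None)
            stack[block] = True
    return distances


def miss_ratio_curve(trace, max_size):
    """Miss ratio of an LRU cache for each size 1..max_size (in blocks)."""
    distances = reuse_distances(trace)
    total = len(distances)
    curve = []
    for size in range(1, max_size + 1):
        # A cold access always misses; a reuse misses if the cache is
        # too small to still hold the block.
        misses = sum(1 for d in distances if d is None or d >= size)
        curve.append(misses / total)
    return curve


# Hypothetical toy trace: a 3-block working set, then one new block.
trace = ["a", "b", "c", "a", "b", "c", "a", "d"]
print(miss_ratio_curve(trace, 4))
```

On this toy trace the miss ratio stays at 1.0 until the cache holds three blocks and then drops to 0.5, which is the kind of working set knee the slide refers to.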
MRCs from Customer Workloads
Lower is better
Customer Heat Map Data Collector



Sizing tool built for VMware environments
Collected 3-9 days per workload, most workloads analyzed for 7 days:
• >1400 virtual disks on >800 VMs
• Logical size of all workloads 27.4TB
• Allocated space 18.9TB (68%)
• Avg Read IOPS 5.2K, write IOPS 5.9K
Performance & latency averages:
• Read IO 36KB, write IO 110KB
• Read latency 9.7ms, write latency 4.5ms
Miss Ratio Curves (>1400 virtual disks)
[Chart: miss ratio (y-axis, 0% to 16%) for Reads, Writes, and Reads+Writes; x-axis 0 to 120, units not recoverable]
Importance of The Warm Tier
Edge Cache (on premise): SSD
CSD Metro PoP:
SSD and HDD
Cloud Storage
Heat Map Example: Production cluster
Production
12%
6%
Hot Data
Warm Data
Cold Data
82%
Example 2: Test / Dev / Beta / Xen
Test / Dev / Beta / XEN
4%
4%
Hot Data
Warm Data
Cold Data
92%
Yes. It Can Work
Data access is very tiered
• Small amounts of flash can yield disproportionate performance benefits
• Variation of latencies must be bounded
• Single-tier cache in front of high-latency storage can't work
• Bounding network latency is as important as bounding media latency
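The case against a single-tier cache can be illustrated with a rough expected-latency model. The sketch below compares an edge cache backed directly by cloud storage with an edge cache backed by a metro POP and then the cloud. The POP latency (2 ms) and cloud miss path (25 ms) are taken from figures elsewhere in this deck; the edge latency and both hit ratios are hypothetical values picked only to make the arithmetic concrete, and the helper `expected_latency_ms` is not from the talk.

```python
def expected_latency_ms(tiers, backing_ms):
    """Average latency and fraction of requests reaching backing storage.

    tiers: list of (hit_ratio, latency_ms), fastest first, where
    hit_ratio is conditional on the request having reached that tier.
    Simplification: a hit is charged only that tier's latency.
    """
    reach = 1.0   # fraction of requests that get this far
    total = 0.0
    for hit_ratio, latency_ms in tiers:
        total += reach * hit_ratio * latency_ms
        reach *= 1.0 - hit_ratio
    total += reach * backing_ms
    return total, reach


EDGE_MS = 0.2     # HYPOTHETICAL local flash latency
POP_MS = 2.0      # deck: warm tier at the POP, <2 ms
CLOUD_MS = 25.0   # deck: best case miss path ~25 ms
EDGE_HIT = 0.95   # HYPOTHETICAL hit ratio at the edge
POP_HIT = 0.80    # HYPOTHETICAL hit ratio at the POP for edge misses

single_avg, single_reach = expected_latency_ms([(EDGE_HIT, EDGE_MS)], CLOUD_MS)
two_avg, two_reach = expected_latency_ms(
    [(EDGE_HIT, EDGE_MS), (POP_HIT, POP_MS)], CLOUD_MS
)

print(f"single tier: avg {single_avg:.2f} ms, {single_reach:.0%} go to cloud")
print(f"edge + POP:  avg {two_avg:.2f} ms, {two_reach:.0%} go to cloud")
```

With these made-up ratios the warm tier cuts the average from about 1.44 ms to about 0.52 ms and, more importantly for bounding variation, cuts the share of requests that pay the full cloud round trip from 5% to 1%.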
