ESG Lab Spotlight
Cohesity Data Platform: Data Efficiency with Converged Secondary Storage
Date: October 2015   Author: Kerry Dolan, Lab Analyst

Abstract: ESG Lab evaluated the Cohesity Data Platform, a scalable, intelligent storage platform that protects data
and leverages a single data copy for DevOps, fileshares, analytics, archiving, and other tasks. This report focuses on
Cohesity’s performance scalability, essential for a solution designed for ongoing data growth. Stay tuned for future
ESG Lab reports focused on Cohesity’s data-aware capabilities.

The Challenges
Secondary storage causes a significant data growth problem for most organizations. Production workloads are protected
and stored for compliance, creating what is essentially an expensive insurance policy. These primary workloads are then
copied for DevOps, data warehouses, analytics, and test/dev environments. They all run on segregated silos of
infrastructure, and the ever-increasing cost to manage them is staggering. ESG research backs this up: for the past five
years, IT managers have cited managing data growth among their top IT priorities, and in 2015, it was the second most-cited
priority, just behind information security.[1]

The Solution: Cohesity Data Platform
Cohesity has developed the Cohesity Data Platform to break down these silos, enabling organizations to leverage existing
data to consolidate secondary storage workflows instead of making new full copies. Cohesity starts with protecting data
and then using the backup to power other tasks such as DevOps, archiving, and analytics. Targeted initially for virtual
machines (VMs), the Cohesity Data Platform lets IT snapshot or clone data sets indefinitely at any interval, without a
performance penalty, making the data copy process completely non-disruptive and infinitely scalable. NFS is currently
supported, while SMB 2.1, SMB 3.0, and HDFS adapters are coming soon, along with support for Microsoft SQL, Exchange,
SharePoint, and other workloads.

[1] Source: ESG Research Report, 2015 IT Spending Intentions Survey, February 2015.
The goal of ESG Lab reports is to educate IT professionals about data center technology products for companies of all types and sizes. ESG Lab reports are not meant
to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our
objective is to go over some of the more valuable feature/functions of products, show how they can be used to solve real customer problems and identify any areas
needing improvement. ESG Lab’s expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these
products in production environments. This ESG Lab report was sponsored by Cohesity.
© 2015 by The Enterprise Strategy Group, Inc. All Rights Reserved.

The Cohesity Data Platform is currently offered in two models, both 2U, four-node appliances. Each node has its own
CPU, memory, and network connectivity. The Cohesity C2500 includes 96 TB of HDD, 6.4 TB of PCIe Flash, 64 GB of
memory, dual 10GbE SFP+ interfaces, and dual 1GbE interfaces. The lower price-point C2300 offers 48 TB of HDD and 3.2
TB of flash. Nodes can be mixed and joined into an infinitely scalable cluster with a minimum of three nodes.
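
To make the mixed-cluster arithmetic concrete, here is a minimal Python sketch that totals raw capacity for a combination of the two models. It assumes the HDD and flash figures quoted above apply to a whole four-node appliance (the report does not state whether they are per appliance or per node) and that capacity is added in whole appliances. The names ApplianceModel and raw_capacity are hypothetical, and the totals are raw, before any deduplication or data-copy overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplianceModel:
    """One 2U appliance; capacities are ASSUMED to be per appliance."""
    name: str
    hdd_tb: float
    flash_tb: float
    nodes: int = 4  # both models are described as four-node appliances


C2500 = ApplianceModel("C2500", hdd_tb=96.0, flash_tb=6.4)
C2300 = ApplianceModel("C2300", hdd_tb=48.0, flash_tb=3.2)

MIN_CLUSTER_NODES = 3  # minimum cluster size stated in the report


def raw_capacity(appliances):
    """Return (hdd_tb, flash_tb, node_count) for (model, count) pairs."""
    hdd = sum(model.hdd_tb * count for model, count in appliances)
    flash = sum(model.flash_tb * count for model, count in appliances)
    nodes = sum(model.nodes * count for model, count in appliances)
    if nodes < MIN_CLUSTER_NODES:
        raise ValueError("a cluster needs at least three nodes")
    return hdd, flash, nodes


if __name__ == "__main__":
    # Hypothetical cluster: two C2500 appliances plus one C2300.
    hdd, flash, nodes = raw_capacity([(C2500, 2), (C2300, 1)])
    print(f"{nodes} nodes, {hdd:.1f} TB HDD, {flash:.1f} TB flash (raw)")
```
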
Solution Stack
Cohesity’s Open Architecture for Scalable, Intelligent Storage (OASIS) combines a flexible architecture with scalability to
support consolidation of data and different workloads. OASIS encompasses a Cluster Manager, I/O Engine, Metadata
Store, Indexing Engine, and Integrated Data Protection Engine that works with APIs such as VMware APIs for Data
Protection (VADP).
The Cohesity Storage Services layer delivers an efficient general purpose storage platform which scales as more nodes
are added to the cluster. These services are powered by Cohesity’s patented SnapTree snapshotting technology which
uses a tree structure of pointers instead of the traditional link-chain metadata journaling. SnapTree limits the number of
hops to retrieve data blocks to three, enabling unlimited snapshots without impacting performance. Also included is a
variable-length deduplication engine which runs globally across the cluster and can be executed inline, post process, or
not at all depending on the workload. Data is intelligently tiered between HDD and flash, and two copies of data are
maintained so that data will be available despite a failure. Data is automatically replicated within the Cohesity block or
rack-optimized for fault tolerance. Upgrades, node addition or removal, and other maintenance tasks can be completed
non-disruptively.
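
The difference between a link-chain of snapshots and a fixed-depth pointer tree can be illustrated with a toy Python model. This is a generic copy-on-write sketch, not Cohesity’s SnapTree implementation, whose internals the report does not describe. The class names, the FANOUT value, and the way hops are counted are all assumptions made for illustration.

```python
FANOUT = 16  # hypothetical number of blocks per intermediate node


class ChainSnapshots:
    """Link-chain model: each snapshot stores only blocks changed since the last."""

    def __init__(self):
        self.deltas = [{}]  # oldest first, live delta last

    def write(self, block, data):
        self.deltas[-1][block] = data

    def snapshot(self):
        self.deltas.append({})  # start a new, empty delta

    def read(self, block):
        hops = 0
        for delta in reversed(self.deltas):  # walk back through the chain
            hops += 1
            if block in delta:
                return delta[block], hops
        raise KeyError(block)


class TreeSnapshots:
    """Fixed-depth model: root -> intermediate node -> data block."""

    def __init__(self):
        self.root = {}       # live root: index -> intermediate node (dict)
        self.snapshots = []  # each snapshot is its own root of pointers

    def write(self, block, data):
        hi, lo = divmod(block, FANOUT)
        node = dict(self.root.get(hi, {}))  # copy-on-write; snapshots keep the old node
        node[lo] = data
        self.root[hi] = node

    def snapshot(self):
        self.snapshots.append(dict(self.root))  # copies pointers, not data

    def read(self, block, root=None):
        root = self.root if root is None else root
        hi, lo = divmod(block, FANOUT)
        hops = 1            # hop 1: the root
        node = root[hi]
        hops += 1           # hop 2: the intermediate node
        data = node[lo]
        hops += 1           # hop 3: the data block
        return data, hops


if __name__ == "__main__":
    chain, tree = ChainSnapshots(), TreeSnapshots()
    for store in (chain, tree):
        store.write(7, "v1")
        for _ in range(5):
            store.snapshot()
    print("chain hops:", chain.read(7)[1])
    print("tree hops:", tree.read(7)[1])
```

In the chain model the read cost grows with the number of snapshots taken since the block was last written, while in the tree model every read, whether from the live view or from any snapshot root, traverses the same three levels.
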
The Cohesity Application Environment provides native applications for common tasks; initially available are Cohesity
Protection, DevOps, and Analytics.
• Cohesity Protection offers integrated backup and recovery features, including unlimited snapshotting and
thin cloning, with built-in global indexing for full searchability. It integrates with vCenter and automatically
ingests the entire VM environment. Protection policies can be created for individual VMs, groups of VMs, or
entire vCenters. All metadata is indexed as it is ingested; the system then cracks open and indexes each
VMDK to understand the file structure of the VM, enabling wildcard searches and granular restores at an
application, VM, or file level.
• Cohesity DevOps provides ease of provisioning and management of development environments. With
general-purpose storage capabilities from OASIS, including native support for NFS and (soon to come) SMB
protocols, enterprises can quickly deploy DevOps clones from backup data, efficiently repurposing passive
data in legacy environments for faster development workflows.
• Cohesity Analytics provides storage capacity, deduplication, and protection reports, plus predictive analysis
based on data growth rates and custom reports. In addition, the distributed compute and memory resources
can be used off cycle with the Cohesity Analytics Workbench to run in-place analytics on the data itself
without the need for separate big data infrastructure.
The platform can also be used as a target for other backup applications through exported NFS, but it does not provide
data awareness capabilities for data not written through Cohesity Protection.
ESG Lab Testing
ESG Lab audited Cohesity testing designed to evaluate performance scalability. Testing was conducted using the Linux-based
FIO utility, a disk I/O benchmark, as the number of Cohesity nodes scaled from four to 32. The FIO tool was used
to first sequentially write eight 2GB files (similar to VMDK files) per node, followed by sequential reads, random reads,
and random writes. A 1MB block size was used for sequential reads and writes, and a 4KB block size was used for random
reads and writes. This test setup simulated virtual machine I/O, and could represent a few very busy VMs, or hundreds
of less busy VMs.
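
The four test phases described above can be sketched as fio command lines. The Python snippet below only builds and prints the commands; it does not run them. The report gives the access patterns, block sizes, file size, and file count, but not the actual job files, so the mount point /mnt/cohesity_nfs, the job name vmdk, and the decision to model "eight 2GB files" as eight jobs of 2 GB each are assumptions.

```python
import shlex

# (fio --rw mode, block size) for each phase, in the order the report lists them.
PHASES = [
    ("write", "1M"),      # sequential write lays out the files
    ("read", "1M"),       # sequential read
    ("randread", "4k"),   # random read
    ("randwrite", "4k"),  # random write
]


def fio_command(rw, bs, directory="/mnt/cohesity_nfs"):
    """Build one fio invocation as an argument list (the path is hypothetical)."""
    return [
        "fio",
        "--name=vmdk",               # same job name in every phase, so default file names match
        f"--directory={directory}",  # assumed NFS mount exported by one node
        f"--rw={rw}",
        f"--bs={bs}",
        "--size=2G",                 # 2 GB per file, as in the report
        "--numjobs=8",               # eight jobs, each with its own file
        "--group_reporting",         # aggregate the eight jobs in the output
    ]


def all_commands(directory="/mnt/cohesity_nfs"):
    """Return the four phase commands in test order."""
    return [fio_command(rw, bs, directory) for rw, bs in PHASES]


if __name__ == "__main__":
    for cmd in all_commands():
        print(shlex.join(cmd))
```

Options such as the I/O engine, queue depth, and direct I/O are left at fio defaults here because the report does not specify them; a real reproduction would need to choose them deliberately.
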
Figure 1 demonstrates the clear linear scalability that Cohesity delivers: For both sequential and random reads and
writes, as more nodes are added, throughput (MB/s) and I/O per second (IOPS) increase. To relate these to common IT
tasks, sequential reads/writes are the I/O types that tasks such as backups and video generate. As the top two charts
demonstrate, adding Cohesity nodes provides increasing amounts of throughput, enabling, for example, more
simultaneous backups or video streaming.

Figure 1. Performance Scalability

The bottom two charts show random reads/writes; especially important are random reads, the type of I/O
generated by common e-mail and database tasks. The addition of Cohesity nodes increases IOPS, meaning that more of
these common business tasks can be executed simultaneously as nodes are added.
Why This Matters
With some storage solutions, data growth can cause performance problems. In a solution designed to protect
growing amounts of data and support a wide range of IT tasks, the ability to scale without performance impact
is essential. Cohesity provides a platform that can leverage a single backup copy, and from that create
snapshots for use in DevOps, analytics, and other tasks. This is only valuable to a customer if the system
doesn’t get bogged down as more data is stored and more snapshots are created.
ESG Lab audited Cohesity testing of performance scalability and validated that the Cohesity Data Platform
increases both throughput and IOPS in a near-linear fashion as nodes are added. This enables organizations to
start by purchasing only what they need, with the assurance that as they grow they can add Cohesity nodes
and count on the system scaling in performance to support additional workloads.
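
One way to put a number on "near-linear" is to compare each cluster size against the smallest one tested: divide the gain in throughput (or IOPS) by the gain in node count. The Python sketch below does this. The function name scaling_efficiency and the sample figures are hypothetical; the report’s measured values appear only in Figure 1 and are not reproduced here.

```python
def scaling_efficiency(results):
    """Map node count to efficiency relative to the smallest configuration.

    results: iterable of (node_count, metric) pairs, where metric is MB/s or IOPS.
    An efficiency of 1.0 means perfectly linear scaling from the baseline.
    """
    ordered = sorted(results)
    base_nodes, base_metric = ordered[0]
    return {
        nodes: (metric / base_metric) / (nodes / base_nodes)
        for nodes, metric in ordered
    }


if __name__ == "__main__":
    # HYPOTHETICAL throughput figures in MB/s, for illustration only.
    sample = [(4, 100.0), (8, 198.0), (16, 390.0), (32, 760.0)]
    for nodes, eff in scaling_efficiency(sample).items():
        print(f"{nodes:>2} nodes: {eff:.0%} of linear")
```
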
ESG Lab also explored the Cohesity Data Platform Management interface. Our testing included adding a Cohesity node,
which was simple and fast. We explored the protection and recovery settings for VMs and vCenters; in addition to
common capabilities like scheduling, alerts, defining retention times, and setting SLAs, ESG Lab noted Cohesity’s abilities
to set latency thresholds (for example, to enable data protection during production time), to create application-consistent
snapshots, and to choose the dedupe method. Also, because Cohesity indexes all vCenter metadata during
ingest, complete VM and vCenter details are available immediately.
We also viewed the ability to recover VMs or files using a Google-like search function; at initial launch, the product
supports recovery to the NFS data store on Cohesity, which can then be vMotioned back. An upcoming release will
provide recovery back to production storage. We viewed the DevOps workflow that enables cloning of snapshots for
individual VMs or application-consistent groups, and analytics features including utilization trending, custom querying,
VM and file reporting, and proactive health monitoring. Additional analytics capabilities are due in the very near future.
The Cohesity Data Platform offers extensive monitoring capabilities, including throughput, IOPS, latency, CPU, and
memory, as well as alerts and the ability to drill down for additional details. Figure 2 shows the dashboard view of a
cluster, showing system status, alerts, and storage details including capacity, data reduction, and performance. ESG Lab
has viewed many storage GUIs, and this one is especially clear, making it easy to gain an understanding of cluster status
at a glance.

Figure 2. Cohesity Dashboard

The Bigger Truth
Cohesity uses the image of an iceberg to demonstrate the magnitude of the secondary storage problem, and that image
is spot on. Many organizations focus their attention on primary storage, which constitutes just a fraction of the data
organizations deal with, while secondary storage continues to grow unabated across multiple infrastructure silos. Most
organizations are willing to spend more on high performance and tight SLAs for their mission-critical, revenue-generating
primary storage assets. But under the surface, they end up spending much more on secondary storage:
buying and managing separate stacks of infrastructure and applications, serviced by different vendors, and generating
multiple data copies for business continuity/disaster recovery, DevOps, analytics, and the like. These multiple
infrastructure stacks and expanding copies of the same data result in higher complexity and cost. And despite keeping so
much data, most organizations actually know very little about what data they have and what it’s used for. They have a
lot of “dark data”: under-utilized data resources that are costing time, money, and management without adding value
to the company.
Cohesity has created a way to eliminate not only the costly redundant data copies, but also the redundant infrastructure
silos organizations buy and manage to act on that data. Cohesity’s Data Platform is a single, highly scalable, intelligent
appliance that first protects your data, and then uses the copy you already have to create snapshots for secondary uses.
There’s nothing wrong with integrated backup appliances. Or deduplication target appliances. Or copy data
management solutions. The question is, why have separate data copies, hardware, and software, when you can have
them all in a single, scalable appliance that leverages one copy of data?
Cohesity makes an excellent case that all these silos are unnecessary: no matter how much vendors may improve upon
their current solutions, they remain individual infrastructure stacks that require separate management. Cohesity’s
approach makes sense: take the data you have in production, copy it for protection, and then leverage that copy for
other tasks. To make this work, the platform must be scalable without a performance penalty. ESG Lab’s auditing of
Cohesity tests indicates that the performance scalability is very real.
Overall, the Cohesity Data Platform can provide an efficient platform that is expandable and moldable to meet the
needs of each customer. The ability to scale incrementally means you can grow organically as you need, with full
assurance of predictable cost and performance. It offers greater insight into your data, with simpler management.
This is the first product for Cohesity, and the unknown at this point is how it will be received by customers. ESG Lab
believes the company made the right choice in ensuring that performance is rock solid, and we are pleased to see that a
number of important enterprise features are expected within the next few months: integration with Active Directory;
support for SMB and for workloads such as Exchange, SQL, and SharePoint; recovery back to production storage; and
additional analytics capabilities. We’re excited to see what comes next, and look forward to getting additional
opportunities to test. Stay tuned!

© 2015 by The Enterprise Strategy Group, Inc. All trademark names are property of their respective companies. Information contained in this publication has been
obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which
are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication,
in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise
Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have
any questions, please contact ESG Client Relations at 508.482.0188.