#NetflixData
How “Stranger Things” can happen with Visual Analytics
Jason Flittner
Senior Analytics Engineer / Manager
Netflix - Content Data Engineering and Analytics
● About Netflix
● Tableau + Big Data
○ Lessons Learned
○ Where we are today
● Analytics and Iterating Quickly
What is Netflix?
Metrics
● 93+ million members
● 190 countries
● 1,000+ devices
● 10B hours/qtr
We plan on spending ~$6B in 2017 on content for our members
● ~60 PB DW on S3
● ~1400 Tableau users
● Live & extract connections
● Analytics on billions of rows
Storage: AWS S3
Compute: Hadoop clusters
Data Interface
Data Access, Analytics and Visualization
● About Netflix
● Tableau + Big Data
○ Lessons Learned
○ Where we are today
● Analytics and Iterating Quickly
Choosing a source
● Hive
● Spark
● Presto
● Redshift
● Published Data Source
● etc...
● Powerful and scalable backend
● “Slower” 1,000,000,000/hr
● Hive + Tableau
○ Thrift Servers
○ Custom SQL vs Tables
○ Metadata
○ ODBC Optimization
● Scalable
● Faster than Hive in many cases
● Spark + Tableau
○ Thrift Servers
○ Long running job on Cluster
○ Query reliability
● Fast query engine
● Great for experimenting and “smaller” data sets
● Connecting to Tableau
○ Web data connector
○ ODBC
● About Netflix
● Tableau + Big Data
○ Lessons Learned
○ Where we are today
● Analytics and Iterating Quickly
Tableau Extract API
Tableau Data Extract
Publish to Server
Distributed Tableau Extract API
Publish to Server
Issues Command Create Extract
Provision Container Resource
Create Tableau Data Extract
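The distributed flow above can be sketched roughly as follows. This is only an illustration, not the actual Netflix implementation: the slide lists the stage labels without an explicit order, so the ordering here (issue command, provision container, create extract, publish) is an assumption, and every function name is a hypothetical placeholder rather than a real Tableau or container API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ExtractJob:
    """One extract request; `steps` records the stages it has passed through."""
    dataset: str
    steps: list = field(default_factory=list)


def provision_container(job):
    # Hypothetical placeholder: a real system would request an isolated worker.
    job.steps.append("provision_container")
    return job


def create_extract(job):
    # Hypothetical placeholder: a real worker would build the Tableau Data Extract here.
    job.steps.append("create_extract")
    return job


def publish_to_server(job):
    # Hypothetical placeholder: a real worker would upload the extract to Tableau Server.
    job.steps.append("publish_to_server")
    return job


def run_job(dataset):
    """Run the three worker-side stages for one dataset, in the assumed order."""
    job = ExtractJob(dataset)
    for stage in (provision_container, create_extract, publish_to_server):
        job = stage(job)
    return job


def issue_create_extract_commands(datasets, max_workers=4):
    """Fan out one job per dataset; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, datasets))
```

Each dataset gets its own job, so a slow extract does not block the others; the thread pool here merely stands in for whatever container scheduler is actually used.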
Amazon Redshift
● Very fast loads from S3
● Native Tableau connector
● Quick Tableau Iteration
● Live or Extract
● Concurrency
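Fast loads from S3 into Redshift are typically done with the `COPY` command. A minimal sketch of building such a statement is shown here; the helper name, table, bucket path and IAM role are all hypothetical, and the slides do not say which file format or options were actually used.

```python
def build_copy_statement(table, s3_path, iam_role, file_format="PARQUET"):
    """Return a Redshift COPY statement that bulk-loads files from S3.

    Sketch only: inputs are not escaped, so use trusted values.
    """
    fmt = file_format.upper()
    # Only formats that need no extra arguments after FORMAT AS are allowed here.
    if fmt not in {"PARQUET", "ORC", "CSV"}:
        raise ValueError(f"unsupported format: {file_format}")
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS {fmt};"
    )


# Hypothetical table, bucket and role, for illustration only.
statement = build_copy_statement(
    "analytics.hours_viewed",
    "s3://example-bucket/hours_viewed/",
    "arn:aws:iam::123456789012:role/example-redshift-load",
)
```

The resulting string would be executed through any Redshift SQL client; once the table is loaded, Tableau can connect to it live or extract from it.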
BIG Data
● Too big to extract?
● Optimized live connections
○ SQL
● Custom data viz with Druid
● Tableau + Hyper!?
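One common way to optimize a live connection with SQL is to push the aggregation down to the backend, so the visualization receives a small summary instead of raw rows. The slide only says "SQL", so this reading is an assumption. The sketch below uses an in-memory SQLite database as a stand-in for the real engine, with a hypothetical `plays` table and made-up rows.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of made-up viewing rows (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plays (title TEXT, country TEXT, hours REAL)")
    rows = [
        ("show_a", "US", 2.0),
        ("show_a", "FR", 1.5),
        ("show_b", "US", 0.5),
        ("show_a", "US", 1.0),
        ("show_b", "FR", 2.5),
    ]
    conn.executemany("INSERT INTO plays VALUES (?, ?, ?)", rows)
    return conn


def hours_by_title(conn):
    """Aggregate in the database so only one row per title reaches the client."""
    query = (
        "SELECT title, SUM(hours) AS total_hours "
        "FROM plays GROUP BY title ORDER BY title"
    )
    return conn.execute(query).fetchall()


conn = build_demo_db()
summary = hours_by_title(conn)
raw_row_count = conn.execute("SELECT COUNT(*) FROM plays").fetchone()[0]
```

Here five raw rows collapse to two summary rows; at billions of rows the same pattern keeps the result set small enough for an interactive dashboard.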
● About Netflix
● Tableau + Big Data
○ Lessons Learned
○ Where we are today
● Analytics and Iterating Quickly
Analytics Engineer
Analytics:
● Binge Analysis
● Viewing Patterns
● Hours Viewed
● Customer Joy
● Content Quality
Business users
Bringing it all together
● Content analytics
● Iterate quickly
● Move between backend sources
● Strong user adoption
Thank you
Jason Flittner -