The Big Business Intelligence Dilemma

Rick F. van der Lans
Industry analyst
Email [email protected] Twitter @rick_vanderlans
www.r20.nl
Copyright © 1991 ‐ 2016 R20/Consultancy B.V., The Hague, The Netherlands. All rights reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the explicit written permission of the copyright owners.
Rick F. van der Lans
Rick F. van der Lans is an independent consultant, lecturer, and author. He specializes in data
warehousing, business intelligence, database technology, and data virtualization. He is managing
director of R20/Consultancy B.V. Rick has been involved in various projects in which data
warehousing and integration technology were applied.
Rick van der Lans is an internationally acclaimed lecturer. He has lectured professionally for the last
twenty-five years in many European and Middle Eastern countries, the USA, South America, and
Australia. He has been invited by several major software vendors to present keynote speeches.
He is the author of several books on computing, including his new Data Virtualization for Business
Intelligence Systems. Some of these books are available in different languages. The
popular Introduction to SQL is available in English, Dutch, Italian, Chinese, and German and is sold
worldwide. He also authored The SQL Guide to Ingres and SQL for MySQL Developers.
As an author for TechTarget.com and BeyeNetwork.com, writer of whitepapers, chairman of the annual
European Enterprise Data and Business Intelligence Conference, and columnist for a few IT
magazines, he has close contacts with many vendors.
R20/Consultancy B.V. is located in The Hague, The Netherlands, www.r20.nl. You can get in touch with Rick via:
Email: [email protected]
Twitter: @Rick_vanderlans
LinkedIn: http://www.linkedin.com/pub/rick-van-der-lans/9/207/223
The Big BI Dilemma
Part 1: The First Stage of
Business Intelligence
Definition of Business Intelligence
Definition by Boris Evelson of Forrester Research:
Business Intelligence is a set of methodologies, processes, architectures, and
technologies that transform raw data into meaningful and useful information
used to enable more effective strategic, tactical, and operational insights and
decision-making.
My Interpretation of this Definition
Business Intelligence
All the reporting and analytical environments
All the data
All business insights
All forms of decision-making
Most Current BI Systems
Business Intelligence
IT-driven reporting and analytical environments
Primarily transactional data
Strategic and tactical business insights
Strategic and tactical and a little operational decision-making
The Classic Data Warehouse Architecture
[Figure: Source systems → ETL → Staging area → ETL → Data warehouse → ETL → Data marts → Analytics & reporting]
Characteristics of Classic Reporting
Reporting
High data quality
Consistent report results
Integrated data
Report reproducibility
Internal production data
High data latency
High and stable performance
Potentially large user group
Repetitive usage
Minimal analytics
Rigid Processes for Development, Operation, and Management
Programming logic to verify, transform, cleanse, integrate, interpret, and standardize source data from production systems
Backup and recovery mechanisms
Security mechanisms and policies to protect against unauthorized access and misuse of the data and reports
A priority scheme for developing and maintaining reports
Manual procedures initiated by calamities
The monitoring of reporting performance, scalability, availability, and other nonfunctional aspects of the operational reporting environment
Various administrative procedures related to human and computer resources
Data governance rules
Master data management
…
Part 2: The Second Stage:
Self‐Service and Big Data
Self‐Service Business Intelligence
Self-Service Reporting
Self-Service Analytics
Self-Service ETL
Self-Service Data Preparation
Self-Service …
Self‐Service BI: Reporting Chaos?
[Figure: the classic architecture (Source systems → ETL → Staging area → ETL → Data warehouse → ETL → Data marts → Reporting), with a question mark on where Self-service BI connects]
Analytics
Wikipedia: Analytics is the discovery,
interpretation, and communication of meaningful
patterns in data
Analytics shows what may happen and how what
may happen can be influenced
Forms:
• predictive analytics, prescriptive analytics, …
Techniques:
• Forecasting: averages, naïve, drift, time series, …
Application areas:
• fraud analytics, weather forecasts, customer churn,
credit risk analysis, …
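The forecasting techniques named above (averages, naïve, drift) can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation. The function names and the sample sales figures are made up for this example.

```python
from statistics import mean

def average_forecast(history, horizon):
    """Forecast every future step as the mean of all observed values."""
    level = mean(history)
    return [level] * horizon

def naive_forecast(history, horizon):
    """Forecast every future step as the last observed value."""
    return [history[-1]] * horizon

def drift_forecast(history, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + step * slope for step in range(1, horizon + 1)]

monthly_sales = [100, 110, 105, 130]  # made-up values for illustration
print(average_forecast(monthly_sales, 2))
print(naive_forecast(monthly_sales, 2))
print(drift_forecast(monthly_sales, 2))
```

All three are simple baselines: the average method ignores recency, the naïve method ignores everything but the last value, and the drift method adds the average historical change per step.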
Result of a Data Science Exercise
The result of a data science exercise is business
insight in the form of a model (not values)
Models
• “Implemented” by management in policies and
decisions
• Implemented in reports as KPIs
• Implemented in operational systems:
• Risk analysis
• Customer churn risk
• Traffic lights
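To make "a model, not values" concrete, the sketch below shows how a customer churn risk model might be embedded in an operational system. Everything here is hypothetical: the coefficients, the feature names, the 0.5 threshold, and the action names are invented for illustration and do not come from a real data science exercise.

```python
import math

# Hypothetical coefficients, as if produced by a data science exercise.
CHURN_MODEL = {"intercept": -2.0, "complaints": 0.8, "months_inactive": 0.5}

def churn_risk(complaints, months_inactive, model=CHURN_MODEL):
    """Return a churn probability between 0 and 1 (logistic model)."""
    z = (model["intercept"]
         + model["complaints"] * complaints
         + model["months_inactive"] * months_inactive)
    return 1.0 / (1.0 + math.exp(-z))

def handle_customer_contact(customer):
    """Operational code path that applies the model to one customer."""
    risk = churn_risk(customer["complaints"], customer["months_inactive"])
    return "offer_retention_deal" if risk >= 0.5 else "standard_service"

print(handle_customer_contact({"complaints": 3, "months_inactive": 2}))
print(handle_customer_contact({"complaints": 0, "months_inactive": 1}))
```

The deliverable is the scoring logic itself, which the operational system evaluates per customer, rather than a table of precomputed numbers.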
The Data Lake or Data Sandbox
[Figure: the classic architecture (Source systems → ETL → Staging area → ETL → Data warehouse → ETL → Data marts → Reporting), with a question mark on where Data science and the Data lake connect]
Importing External Data – Option 1
[Figure: the classic architecture (Source systems, Staging area, Data warehouse, Data marts, Analytics & Data Science) extended with Social media data, Open data, and Spreadsheets as external sources; six ETL flows are shown]
Importing External Data – Option 2
[Figure: the classic architecture (Source systems, Staging area, Data warehouse, Data marts, Analytics & Data Science) extended with Social media data, Open data, and Spreadsheets as external sources; three ETL flows are shown]
Where and How do we Plug In Big Data?
[Figure: the classic architecture (Source systems, Staging area, Data warehouse, Data marts, Analytics & reporting, linked by ETL flows), with question marks on where Big data connects]
Distributed Big Data Production
Everyone is “Doing” Business Intelligence
Users of self-service BI tools
BI specialists
Website developers
Data scientists
Developers of production systems
…
Analytical Islands Everywhere
Graph analytics
Excel
Dedicated analytics for app
Self-service BI with cubes
BI in the cloud
Standard reports and classic DW
Data science
Metadata Specifications Everywhere
Excel
Dedicated analytics for app
Self-service BI with cubes
Standard reports and classic DW
Data science
The Big BI Dilemma:
Integrating Classic BI Forms
with New BI Forms
Part 3: The Next Frontier: Fast Data
Fast Data = Big Data +
Fast Streaming Data +
Fast Analytical Decisions
Traffic Lights for Bicyclists in Rotterdam
Heat sensor
http://nos.nl/artikel/2133816-slim-stoplicht-in-rotterdam-voelt-hoeveel-fietsers-er-staan-te-wachten.html
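As a rough illustration of a fast analytical decision on streaming data, the sketch below keeps a sliding window over sensor readings and derives a reaction from it. Only the heat sensor comes from the slide. The class name, window size, threshold, decision rule, and readings are all assumptions made for this example, not a description of how the Rotterdam system works.

```python
from collections import deque

class WaitingCyclistMonitor:
    """Keeps the most recent sensor readings and derives a decision."""

    def __init__(self, window_size=3, threshold=5):
        self.readings = deque(maxlen=window_size)  # sliding window
        self.threshold = threshold

    def on_reading(self, cyclists_waiting):
        """Listener callback: store the reading and return a reaction."""
        self.readings.append(cyclists_waiting)
        average = sum(self.readings) / len(self.readings)
        return "shorten_red_phase" if average >= self.threshold else "keep_schedule"

monitor = WaitingCyclistMonitor()
for count in [1, 2, 9, 10, 8]:  # made-up sensor stream
    print(count, monitor.on_reading(count))
```

The decision is taken per incoming reading, without first loading the data into a warehouse, which is the property that separates fast data analytics from classic reporting.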
Market Overview
Transmitting fast data:
• Classic message queuing technology, Apache Kafka (developed by
LinkedIn) & Flume, Confluent (Kafka), Apache Storm (developed by
BackType – Twitter), RabbitMQ, Yahoo S4, …
Storage of fast data:
• Files, Hadoop HDFS & HBase, NoSQL, NewSQL, …
Analyzing fast data:
• Spark Streaming, SQLStream, StreamBase, …
Data mining of big data streams:
• MOA (Massive Online Analysis), SAMOA, RapidMiner, …
Monitoring and managing streaming data:
• Apache NiFi, Hortonworks DataFlow (Apache NiFi++, developed by the NSA), …
Fast Data and the Classic Data Warehouse Architecture
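[Figure: the classic architecture (Source systems → ETL → Staging area → ETL → Data warehouse → ETL → Data marts → Analytics & reporting) next to a fast data flow of Listener, Application, and Reaction, with question marks on how the two connect]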
And … More Analytical Islands
Remote analytics
Graph analytics
Excel
Dedicated analytics for app
Self-service BI with cubes
BI in the cloud
Fast data analytics
Standard reports and classic DW
Data science
And … More Analytical Islands
Remote analytics
Excel
Dedicated analytics for app
Self-service BI with cubes
Fast data analytics
Standard reports and classic DW
Data science
The Even Bigger BI Dilemma:
Integrating Classic BI Forms with New BI Forms and
with Fast Data Analytics
Part 4: Closing Remarks
Every Technology has an Expiration Date
Expired?
[Figure: Source systems → ETL → Staging area → ETL → Data warehouse → ETL → Data marts → Analytics & reporting]
The Logical Data Warehouse Architecture
to the Rescue?
The Logical Data Warehouse Architecture
[Figure: a Logical Data Warehouse Architecture layer serving Analytics & reporting; the components shown are Source systems, Staging area, Data warehouse (with ETL flows), Social media data, Open data, Big data, Spreadsheets, and Fast Data with a Listener]
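One common reading of a logical data warehouse is a layer that combines sources at query time instead of copying them first. The slide itself only shows the diagram, so the sketch below is an assumption about that idea, with hypothetical source functions and made-up rows standing in for a data warehouse and a spreadsheet.

```python
# Hypothetical sources; in practice these would be live systems.
def read_warehouse_sales():
    return [{"customer": "A", "revenue": 100}, {"customer": "B", "revenue": 250}]

def read_spreadsheet_targets():
    return [{"customer": "A", "target": 120}, {"customer": "B", "target": 200}]

def virtual_sales_vs_target():
    """Virtual view: joins both sources at query time, stores nothing."""
    targets = {row["customer"]: row["target"] for row in read_spreadsheet_targets()}
    for sale in read_warehouse_sales():
        target = targets.get(sale["customer"])
        yield {**sale, "target": target,
               "target_met": target is not None and sale["revenue"] >= target}

for row in virtual_sales_vs_target():
    print(row)
```

Because the view is computed on demand, a report built on it sees the current contents of both sources, and no extra ETL flow or data mart is needed for the spreadsheet.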
Current Skills of BI Specialist
Business Skills
Technical Skills
Skills of Fast Data Specialist
Business Skills
Technical Skills
Skills of Future BI Specialist
Business Skills
Technical Skills
The Key Challenge is:
Solve the Big BI Dilemma!