WPC047 Data ON THE ROAD

WPC047
Data ON THE ROAD:
the Azure part
Jessica Tibaldi Tech Evangelist Microsoft
[email protected] @_jetiba
PRESENTS
Agenda
Understand how to build a scalable and
performant backend for your IoT
solution to store and analyze data in
the cloud using Azure services
• Azure Machine Learning Studio
• Azure HDInsight
• Azure Data Factory
• (extra) Azure Service Fabric
www.wpc2016.it – [email protected] - +39 02 365738.11
Demo Architecture
[Architecture diagram. Components: Car (Sensor), Xamarin App (device), IoT Hub, Event Hub, Service Fabric, Machine Learning, Data Factory, HDInsight, Storage - Blob, Power BI. Named resources: mydriving-vinlookup, mydriving-archive, mydriving-sqlpbi, mydriving-hourlypbi, mydrivingDB, mydrivingAnalyticsDB]
Build, deploy, and publish predictive analytics solutions
Machine Learning and Analytics: Machine Learning
• Simple, scalable, cutting edge. A fully managed cloud service that enables you to easily build, deploy, and share predictive analytics solutions.
• Deploy in minutes. Azure Machine Learning means business. You can deploy your model into production as a web service that can be called from any device, anywhere, and that can use any data source.
• Publish, share, monetize. Share your solution with the world in the Gallery or on the Azure Marketplace.
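As a rough illustration of "a web service that can be called from any device", the sketch below builds an HTTP scoring request in Python. The endpoint URL, API key, and feature names are hypothetical placeholders. The payload layout is an assumption modeled on the request/response format of Azure Machine Learning Studio web services, so check it against the sample code on the service's own API help page. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint URL and API key shown
# on your own web service dashboard.
ENDPOINT_URL = "https://example.invalid/score"
API_KEY = "<your-api-key>"


def build_scoring_request(columns, rows, url=ENDPOINT_URL, api_key=API_KEY):
    """Build (but do not send) an HTTP POST request for a scoring endpoint."""
    # Assumed payload layout: column names plus rows of values.
    payload = {
        "Inputs": {"input1": {"ColumnNames": columns, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_scoring_request(
    columns=["speed_kmh", "engine_rpm", "throttle_pct"],  # hypothetical features
    rows=[[87.0, 2400, 35.5]],
)
# To actually call the service: urllib.request.urlopen(request).read()
```

Because the client only needs HTTP and JSON, the same call can be made from a phone app, a backend job, or a dashboard.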
Machine Learning Flow and Algorithms
• Define Objective
• Collect Data
• Prepare/Clean Data
• Construct/Train Models
• Score/Evaluate Models
• Publish
• Integrate
• Manage
Azure Machine Learning service
• Data: Blobs and Tables, Hadoop (HDInsight), Relational DB (Azure SQL DB)
• ML Studio: integrated development environment for Machine Learning
• API: the model is now a callable web service
• Clients
• Monetize the API through our marketplace
DEMO
Azure Machine Learning
BigData Analysis
Machine Learning and Analytics: HDInsight (Hadoop and Spark)
HDInsight is a cloud implementation on Microsoft Azure of the Apache Hadoop technology stack, a go-to solution for big data analysis
Batch: Map Reduce
Script: Pig
SQL: Hive
NoSQL: HBase
Streaming: Storm
In-Memory: Spark
Core Engine
• Scale to petabytes on demand
• Deploy on Windows or Linux
• Process unstructured and semi-structured data
• Spin up an Apache Hadoop cluster in minutes
• Develop in Java, .NET, and more
• Visualize your Hadoop data in Excel
• Skip buying and maintaining hardware
• Easily integrate on-premises Hadoop clusters
Hive
• SQL-like query syntax: if you know SQL, you'll be able to use Hive
• Relational set algebra mixed with row-oriented manipulation
• Declare tables (internal and external) and views
• The query processor optimizes the resulting MapReduce jobs
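To make the link between a SQL-like query and a MapReduce job more concrete, here is a minimal pure-Python sketch of how a grouped average could be split into map, shuffle, and reduce phases. It does not use Hive or Hadoop, and it is not how Hive's query processor is implemented. The table, column names, and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical trip records; in Hive these would be rows of a table.
TRIPS = [
    {"vin": "A100", "speed_kmh": 50.0},
    {"vin": "B204", "speed_kmh": 90.0},
    {"vin": "A100", "speed_kmh": 70.0},
    {"vin": "B204", "speed_kmh": 110.0},
]

# The query being imitated (HiveQL-style, shown for reference only):
#   SELECT vin, AVG(speed_kmh) FROM trips GROUP BY vin;


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for row in rows:
        yield row["vin"], row["speed_kmh"]


def shuffle_phase(pairs):
    """Group all values that share a key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of values into a single aggregate."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


average_speed = reduce_phase(shuffle_phase(map_phase(TRIPS)))
```

The point of Hive is that you write only the one-line query, and the three phases are generated for you.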
Compose and orchestrate data services at scale
Information Management: Data Factory
[Diagram labels: DATA SOURCES, INGEST]
• Create, schedule, orchestrate, and manage data pipelines
• Automate cloud resource management
• Visualize data lineage
• Move relational data for Hadoop processing
• Connect to on-premises and cloud data sources
• Transform with Hive, Pig, or custom code
• Monitor data pipeline health
Data Factory Elements
• Pipelines
a grouping of logically related activities that performs a task
o Activities
define the actions to perform on your data
▪ Data transformation
▪ Data movement
• Linked Services
define the information needed for Data Factory to connect to external resources
▪ data store
▪ compute resource
• Datasets
identify data within different data stores, such as tables, files, folders, and documents
Example
Pipeline (Active Period: July 2016 to July 2017)
• Datasets
• Activities
o Activity type, properties & parameters (if required)
o Inputs & outputs
o Policy & schedule
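The parts called out above can be sketched as a pipeline definition written as a Python dictionary and serialized to JSON. The layout loosely follows the JSON shape Data Factory used for pipeline definitions at the time of this talk. Treat every field name as an assumption to check against the official schema. All dataset, linked service, and script names are hypothetical. The small helper checks that each activity only references declared datasets.

```python
import json

# All names below (datasets, linked service, script path) are hypothetical.
pipeline = {
    "name": "TripAggregationPipeline",
    "properties": {
        "description": "Aggregate raw trip data with a Hive script",
        "activities": [
            {
                "name": "AggregateTrips",
                "type": "HDInsightHive",  # activity type
                "linkedServiceName": "HDInsightLinkedService",
                "typeProperties": {  # properties and parameters
                    "scriptPath": "scripts/aggregate_trips.hql",
                    "defines": {"Year": "2016"},
                },
                "inputs": [{"name": "RawTripData"}],
                "outputs": [{"name": "AggregatedTripData"}],
                "policy": {"concurrency": 1, "retry": 3, "timeout": "01:00:00"},
                "scheduler": {"frequency": "Day", "interval": 1},
            }
        ],
        # Active period of the pipeline
        "start": "2016-07-01T00:00:00Z",
        "end": "2017-07-01T00:00:00Z",
    },
}

declared_datasets = {"RawTripData", "AggregatedTripData"}


def undeclared_datasets(pipeline_def, datasets):
    """Return dataset names used by activities but missing from `datasets`."""
    used = set()
    for activity in pipeline_def["properties"]["activities"]:
        for ref in activity["inputs"] + activity["outputs"]:
            used.add(ref["name"])
    return sorted(used - datasets)


missing = undeclared_datasets(pipeline, declared_datasets)
pipeline_json = json.dumps(pipeline, indent=2)
```

Keeping the definition as data makes it easy to validate before deployment, for example by failing a build when `missing` is not empty.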
DEMO
Azure Data Factory
Service Fabric - Microservices approach
Compute: Service Fabric
• High scalability
• High reliability
• High availability
• Constant application evolution
• Deployment and update speed
• Development agility
• Resource optimization and cost reduction
Service Fabric in the demo application
• VINLookupService (stateless): looks up additional vehicle information and saves it to a SQL database
• IoTHubPartitionMap (stateful): obtains the Event Hub partition key that VINLookupService uses to connect
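The split between a stateful and a stateless service can be illustrated with a small plain-Python sketch. It does not use the Service Fabric API, and the leasing behavior is an assumption about how such a partition map could work, not a description of the demo's code. The class and function names are hypothetical.

```python
class PartitionMap:
    """Stateful: remembers which partition is leased to which worker."""

    def __init__(self, partition_count):
        self._free = [str(i) for i in range(partition_count)]
        self._leases = {}  # worker_id -> partition key

    def lease(self, worker_id):
        """Return the partition for this worker, assigning a free one if needed."""
        if worker_id in self._leases:
            return self._leases[worker_id]
        if not self._free:
            return None  # every partition is already taken
        partition = self._free.pop(0)
        self._leases[worker_id] = partition
        return partition

    def release(self, worker_id):
        """Give a worker's partition back so another instance can take it."""
        partition = self._leases.pop(worker_id, None)
        if partition is not None:
            self._free.append(partition)


def lookup_worker(worker_id, partition_map):
    """Stateless: holds no data of its own, asks the map where to connect."""
    partition = partition_map.lease(worker_id)
    if partition is None:
        return worker_id + ": no partition available"
    return worker_id + ": reading partition " + partition


partitions = PartitionMap(partition_count=2)
messages = [lookup_worker(w, partitions) for w in ("worker-0", "worker-1", "worker-2")]
```

Because the worker keeps nothing between calls, any instance can be restarted or scaled out. Only the map has to be persisted reliably.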
Additional extension routes…
• Real-time identification of nearby points of interest based on GPS coordinates
• Identification of the driver in a vehicle fleet management scenario
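For the first route, one possible starting point is a great-circle distance filter over a list of known locations. The sketch below uses the haversine formula in plain Python. The points of interest, coordinates, and radius are hypothetical, and a real-time version would need a spatial index or a streaming job rather than a linear scan.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(points_of_interest, lat, lon, radius_km):
    """Return the names of points within radius_km of the vehicle, nearest first."""
    hits = []
    for name, poi_lat, poi_lon in points_of_interest:
        distance = haversine_km(lat, lon, poi_lat, poi_lon)
        if distance <= radius_km:
            hits.append((distance, name))
    return [name for _, name in sorted(hits)]


# Hypothetical points of interest (name, latitude, longitude).
POIS = [
    ("Fuel station", 45.4700, 9.1900),
    ("Service garage", 45.4650, 9.1850),
    ("Airport parking", 45.6300, 8.7230),
]

close_by = nearby(POIS, lat=45.4642, lon=9.1900, radius_km=2.0)
```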
Some scenarios…
• The front brakes need servicing sooner than expected
• Tow truck is on its way to vehicle B204
• 14/15 vehicles meet standards and 1 is scheduled for maintenance
• 24 vehicles are shown on a map with their status
• Temperature is beyond the ideal range for 13 vehicles
• Vehicle B204 is driving in eco-mode 78% of the time
• 3 vehicles have daily mileage that qualifies them for reduced rates
Q&A
Questions and Answers
Useful Links
MyDriving Docs
https://azure.microsoft.com/it-it/documentation/samples/mydriving/
OverNet Education
Contacts
[email protected]
www.overneteducation.it
Tel. 02 365738
@overnete
www.facebook.com/OverNetEducation
www.linkedin.com/company/overnet-solutions
www.wpc2016.it
Appendix
Input data
Data Transformation
Define model
Train model
Score model => Prediction
Evaluate model => Prediction
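The steps listed above can be walked through end to end with a small pure-Python sketch. It is not Azure Machine Learning Studio, where each step is a module on a canvas. The data is hypothetical, and a one-variable linear regression stands in for whatever model the demo actually used.

```python
# Input data: hypothetical (trip distance in km, fuel used in litres) pairs.
RAW = [(10.0, 0.9), (20.0, 1.4), (30.0, 2.1), (40.0, 2.6), (50.0, 3.0), (60.0, 3.8)]


def transform(rows):
    """Data transformation: drop rows with a non-positive distance."""
    return [(x, y) for x, y in rows if x > 0]


def train(rows):
    """Define and train the model: fit y = slope * x + intercept by least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score(model, xs):
    """Score model: produce a prediction for each input value."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def evaluate(predictions, actuals):
    """Evaluate model: mean absolute error between predictions and actual values."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


data = transform(RAW)
train_rows, test_rows = data[:4], data[4:]  # simple holdout split
model = train(train_rows)
predictions = score(model, [x for x, _ in test_rows])
mae = evaluate(predictions, [y for _, y in test_rows])
```

Evaluating on rows the model never saw during training is what makes the error figure meaningful.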