10 Things to Consider When Using Apache Kafka:
Utilization Points of Apache Kafka Obtained From IoT Use Case
May 16, 2017
NTT DATA Corporation
Naoto Umemori, Yuji Hagiwara
© 2017 NTT DATA Corporation
Contents
1. Project outlines
2. Tips and pitfalls from IoT use case:
• Performance tuning
• Dealing with unusual operations
• Availability pitfalls
3. Summary
Project Outlines
About us
Who are we?
• Naoto Umemori: Platform Engineer
• Yuji Hagiwara: Platform Engineer
OSS professional headquarters at NTT DATA Corporation
Our main targets
• IoT (Connected Vehicle)
• Cloud technology (OpenStack, Docker, …)
• Automation of platforms
Our Target: Connected Vehicle
To use automotive, human, and social data efficiently, the platform must meet
four data technology criteria: volume, velocity, variety, and security.
The assumed volume for connected vehicles:
• Number of connections: > 1 million simultaneous connections
• Number of transactions: > 100k TPS
• Total data rate: > 1 GB/s
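As a rough back-of-the-envelope check, the assumed volumes imply an average payload of about 10 KB per transaction and about one transaction every 10 seconds per connection. The sketch below works that out. It assumes 1 GB = 10^9 bytes and treats the stated lower bounds as exact values; neither assumption comes from the slides.

```python
# Back-of-the-envelope sizing from the assumed volumes.
# Assumptions (not in the slides): 1 GB = 10**9 bytes, and the
# ">" lower bounds are treated as exact values.
CONNECTIONS = 1_000_000          # simultaneous connections
TRANSACTIONS_PER_SEC = 100_000   # 100k TPS
BYTES_PER_SEC = 10**9            # 1 GB/s

# Average payload carried by one transaction, in bytes.
avg_payload_bytes = BYTES_PER_SEC / TRANSACTIONS_PER_SEC

# Average interval between transactions from a single connection, in seconds.
seconds_between_msgs = CONNECTIONS / TRANSACTIONS_PER_SEC

print(avg_payload_bytes)     # 10000.0
print(seconds_between_msgs)  # 10.0
```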
Apache Kafka: A distributed streaming platform
Apache Kafka is a distributed streaming platform with three key capabilities:
• Publish and subscribe to streams of records, similar to a message queue
• Store streams of records in a fault-tolerant way
• Process streams of records
We have used Kafka as a Messaging System in our IoT
platform.
https://kafka.apache.org/intro
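To make the publish/subscribe model concrete, here is a toy in-memory sketch. It is not the Kafka client API: `ToyLog`, `publish`, and `poll` are hypothetical names chosen for illustration. The sketch shows only an append-only record store with per-group read offsets, and it does not model replication or fault tolerance.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one topic partition (hypothetical, not the Kafka API)."""

    def __init__(self):
        self.records = []                # append-only record store
        self.offsets = defaultdict(int)  # next offset to read, per consumer group

    def publish(self, value):
        """Append a record and return its offset."""
        self.records.append(value)
        return len(self.records) - 1

    def poll(self, group):
        """Return the records this group has not read yet and advance its offset."""
        start = self.offsets[group]
        batch = self.records[start:]
        self.offsets[group] = len(self.records)
        return batch


log = ToyLog()
log.publish({"vehicle": "A", "speed": 42})
log.publish({"vehicle": "B", "speed": 58})

# Two independent consumer groups each receive every record (publish/subscribe).
stream_view = log.poll("stream")
batch_view = log.poll("batch")

# Processing a stream: derive a value from the consumed records.
max_speed = max(r["speed"] for r in stream_view)
```

Because each group keeps its own offset, a stream consumer and a batch consumer can read the same records at their own pace.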
Overview of Our IoT Platform
Key Architecture: Separation of Stream and Batch processing unit
[Figure: overview of the IoT platform. Diagram labels: Devices (automobiles, sensors, mobile phones, servers, NW devices, …); IoT Platform with Connection & Collection, Accumulation & Conversion (stream processing unit with data stores for stream, batch processing unit with data stores for batch), Analysis (multiple data stores for analysis), Distribution, and Monitoring & Visualization; Applications and Biz Systems; inventory, map, traffic, and user information.]
Architecture of Our IoT Platform
[Figure: architecture of the IoT platform. Diagram labels: Devices; Collection with Gateway (Kafka Producer); Accumulation & Conversion with Message Broker, data buffering, a stream process unit (stream processing, stream data stores), and a batch process unit (batch processing, archive data stores, temporary data stores, ETL); Analysis with analysis data stores, Analysis API, and Real-time Analysis API; Applications; Distribution; Monitoring & Visualization; Device info.]
Tips and pitfalls from IoT use case
Performance tuning
• Disk I/O of Kafka Broker
• Concurrency of Kafka Producer
• The number of Partitions
• Async/Sync Bridge
Dealing with unusual operations
• Offset Monitoring
• Purging Kafka Topics
• Slow Pub/Sub Log
Availability pitfalls
• Undesirable RAID Group
• Unstable Kafka Topics
• A huge number of Partitions makes the Cluster unhealthy
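The slides list offset monitoring without describing a method. One common way to approach it is to track consumer lag: for each partition, the log end offset minus the consumer group's committed offset. The sketch below is a minimal illustration with hypothetical offset values. The function name `consumer_lag` is an assumption for this example, and in practice the offsets would come from a Kafka client or admin tooling.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return per-partition lag: log end offset minus committed offset.

    A partition with no committed offset is treated as if the group were at
    offset 0 (a simplifying assumption for this sketch).
    """
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in end_offsets.items()
    }


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1500, 1: 1200}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A lag that keeps growing on one partition suggests that its consumer is stalled or too slow for the incoming rate.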
Summary
Disclaimer
1. Any product names, service names, software names, and other marks
are trademarks or registered trademarks of their respective companies.
2. This presentation is intended to share the knowledge
gained from our activities in the IoT field.
3. The presenters and NTT DATA Corporation provide this information on an as-is
basis and bear no responsibility for any results obtained by acting on the
information in this presentation material.