Cloudera Kudu Introduction
Zbigniew Baranowski
Based on:
http://slideshare.net/cloudera/kudu-new-hadoop-storage-for-fast-analytics-on-fast-data
What is KUDU?
• New storage engine for structured data (tables)
– does not use HDFS!
• Columnar store
• Mutable (insert, update, delete, scan)
• Written in C++
• Apache-licensed – open source
• Currently in beta version
Hadoop and KUDU
Yet another engine to store data?
• HDFS excels at
– Scanning large amounts of data at speed
– Accumulating data with high throughput
• HBase (on HDFS) excels at
– Fast random lookups and writes by key
– Making data mutable
KUDU tries to fill the gap
Addressing evolution in hardware
• Hard drives -> solid state disks
– Random scans in milliseconds
– Analytics no longer IO bound -> CPU bound
• RAM is getting cheaper – can be used at
greater scale
Table oriented storage
• Table has Hive/RDBMS like schema
– Primary key (one or many columns), NO
secondary indexes
– Finite number of columns
– Each column has
• name and type
• Horizontally partitioned (range, hash) – called
tablets
– Tablets typically have 3 or 5 replicas
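To make the partitioning concrete, here is a hypothetical sketch of how a row could be routed to a tablet by combining hash and range partitioning; the bucket count, split points, and hash choice are illustrative assumptions, not Kudu's actual defaults.

```python
import hashlib

HASH_BUCKETS = 4                       # assumed hash-bucket count
RANGE_SPLITS = [1_000_000, 2_000_000]  # assumed timestamp split points

def hash_bucket(host: str) -> int:
    """Stable hash of the 'host' key column into a fixed bucket count."""
    digest = hashlib.md5(host.encode()).digest()
    return int.from_bytes(digest[:4], "big") % HASH_BUCKETS

def range_bucket(timestamp: int) -> int:
    """Index of the range partition the 'timestamp' column falls into."""
    for i, split in enumerate(RANGE_SPLITS):
        if timestamp < split:
            return i
    return len(RANGE_SPLITS)

def tablet_for(host: str, timestamp: int) -> tuple:
    """A row lands in the tablet for (hash bucket, range partition)."""
    return (hash_bucket(host), range_bucket(timestamp))

print(tablet_for("myhost", 1_500_000))  # (some bucket 0-3, range 1)
```

Every row with the same key columns always maps to the same tablet, which is what makes key-based lookups and writes cheap.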
Table access and manipulations
• Operations on tables (NoSQL)
– insert, update, delete, scan
– Java and C++ API
• Integrated with
– Impala, MapReduce, Spark
– more are coming
KUDU table with Impala
Source: https://github.com/cloudera/kudu-examples/tree/master/java/collectl
CREATE TABLE `metrics` (
`host` STRING,
`metric` STRING,
`timestamp` INT,
`value` DOUBLE
)
TBLPROPERTIES(
'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
'kudu.table_name' = 'metrics',
'kudu.master_addresses' = 'quickstart.cloudera:7051',
'kudu.key_columns' = 'host, metric, timestamp'
);
insert into metrics values ('myhost', 'temp1', 1449005452, 0.5343242);
select count(distinct metric) from metrics;
delete from metrics where host='myhost' and metric='temp1' …..;
KUDU with Spark
import org.kududb.mapreduce._
import org.apache.hadoop.conf.Configuration
import org.kududb.client._
import org.apache.hadoop.io.NullWritable;
val conf = new Configuration
conf.set("kudu.mapreduce.master.address", "quickstart.cloudera");
conf.set("kudu.mapreduce.input.table", "metrics");
conf.set("kudu.mapreduce.column.projection",
"host,metric,timestamp,value");
val kuduRdd = sc.newAPIHadoopRDD(conf, classOf[KuduTableInputFormat],
classOf[NullWritable], classOf[RowResult])
// Print the first five values
kuduRdd.values.map(r => r.rowToString()).take(5).foreach(x => print(x + "\n"))
Data Consistency
• Reading
– Snapshot consistency
– Point in time queries (based on provided
timestamp)
• Writing
– Single row mutations done atomically across all
columns
– No multi-row transactions
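The snapshot and point-in-time read semantics can be illustrated with a toy versioned cell: each write is tagged with a timestamp, and a query at time t sees only versions written at or before t. This is a sketch of the semantics only, not Kudu's MVCC implementation.

```python
import bisect

class VersionedCell:
    def __init__(self):
        self._versions = []  # sorted list of (timestamp, value)

    def write(self, ts: int, value):
        bisect.insort(self._versions, (ts, value))

    def read_at(self, ts: int):
        """Return the newest value with timestamp <= ts, or None."""
        i = bisect.bisect_right(self._versions, (ts, float("inf")))
        return self._versions[i - 1][1] if i else None

cell = VersionedCell()
cell.write(100, 0.5)
cell.write(200, 0.7)
print(cell.read_at(150))  # 0.5 -- snapshot as of t=150
print(cell.read_at(250))  # 0.7 -- sees the later write
```

Two readers querying at the same timestamp always see the same data, regardless of concurrent writes with later timestamps.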
Architecture overview
• Master server (single)
– Keeps metadata replicated
• Catalog (tables definitions) in a KUDU table (cached)
– Coordination (full view of the cluster)
– Tablets directory (tablets locations) (in memory)
– Fast failover supported
• Tablet servers (worker nodes)
– Store/serve tablets
• On local disks (no HDFS)
– Track the status of tablet replicas (followers)
Tables and tablets
[Diagram] Table metadata lives on the master; table data is split into tablets stored on the tablet servers.
When to use?
• When sequential and random data access is
required simultaneously
• When simplification of a data ingest is needed
• When updates on data are required
• Examples
– Time series
• Streaming data, immediately available
– Online reporting
Typical low latency ingestion flow
Simplified ingestion flow with KUDU
Benchmarking by Cloudera
Benchmarking by customer (Xiaomi)
KUDU under the hood
Data replication - Raft consensus
[Diagram] The client locates the tablet via the Master. Its write goes to the leader replica of Tablet 1 on tablet server X, which appends it to its write-ahead log (WAL) and replicates it to the follower replicas of Tablet 1 on tablet servers Y and Z, each with its own WAL.
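The Raft commit rule behind this replication can be sketched in a few lines: a write is acknowledged once a majority of replicas (leader included) have appended it to their logs. Replica counts and the flat ack model are illustrative simplifications.

```python
# Toy majority-quorum commit: append the write to every reachable
# replica's WAL and commit only if a majority acknowledged it.

def replicate(write: str, replica_logs: list, acks: list) -> bool:
    appended = 0
    for log, up in zip(replica_logs, acks):
        if up:
            log.append(write)   # replica persists the write to its WAL
            appended += 1
    majority = len(replica_logs) // 2 + 1
    return appended >= majority  # committed only with a quorum

logs = [[], [], []]  # 3 replicas: 1 leader + 2 followers
print(replicate("row1", logs, [True, True, False]))   # True: 2/3 ack
print(replicate("row2", logs, [True, False, False]))  # False: 1/3 ack
```

With 3 or 5 replicas per tablet, the cluster tolerates 1 or 2 failed replicas respectively while still accepting writes.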
Data Insertion (without uniqueness check)
[Diagram] An INSERT goes first to the MemRowSet: an in-memory B+tree with leaves sorted by primary key. When full, the MemRowSet is flushed to a new DiskRowSet (32MB): a columnar store, encoded similarly to Parquet, with rows sorted by PK. Each DiskRowSet stores its PK {min, max} and bloom filters for PK ranges (kept in a cached B-tree); an interval tree on the tablet server keeps track of the PK ranges within all DiskRowSets. There might be thousands of sets per tablet.
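The flush path above can be sketched with toy data structures: rows accumulate sorted by primary key in memory, and each flush emits an immutable rowset carrying its PK {min, max} and a bloom filter so later lookups can skip sets that cannot contain a key. The row threshold stands in for the 32MB size, and the single-hash bloom filter is a deliberate simplification.

```python
class DiskRowSet:
    def __init__(self, rows: dict):
        self.rows = rows                             # pk -> row, sorted
        keys = list(rows)
        self.pk_min, self.pk_max = keys[0], keys[-1]
        self.bloom = {hash(k) % 64 for k in keys}    # toy bloom filter

    def might_contain(self, pk) -> bool:
        # Range check first, then bloom filter; false positives possible.
        return (self.pk_min <= pk <= self.pk_max
                and hash(pk) % 64 in self.bloom)

class MemRowSet:
    def __init__(self, flush_threshold=3):
        self.rows, self.threshold = {}, flush_threshold

    def insert(self, pk, row, disk_row_sets: list):
        self.rows[pk] = row
        if len(self.rows) >= self.threshold:         # pretend "32MB" reached
            disk_row_sets.append(DiskRowSet(dict(sorted(self.rows.items()))))
            self.rows = {}

drss = []
mrs = MemRowSet()
for pk in ["a", "c", "b", "e", "d", "f"]:
    mrs.insert(pk, {"value": pk}, drss)
print(len(drss), drss[0].pk_min, drss[0].pk_max)  # 2 a c
```

A key lookup only opens the DiskRowSets whose range and bloom filter both match, which is what keeps random reads fast even with many sets per tablet.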
Column encoding within DiskRowSet
[Diagram] Within a DiskRowSet, each column is stored as a series of pages (size 256KB) with page metadata, addressed by a B-tree index: for the PK the index maps row keys to pages; for a standard column it maps row offsets to pages. Pages are encoded with a variety of encodings, such as dictionary encoding, bitshuffle, or front coding, and can be compressed: LZ4, gzip, or bzip2.
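Of the page encodings named above, dictionary encoding is easy to sketch: repeated column values are replaced by small integer codes plus a lookup dictionary, which pays off for low-cardinality columns such as `host` or `metric`. This is a minimal illustration, not Kudu's page format.

```python
def dict_encode(values: list) -> tuple:
    """Replace repeated values with integer codes into a dictionary."""
    dictionary, codes, seen = [], [], {}
    for v in values:
        if v not in seen:
            seen[v] = len(dictionary)
            dictionary.append(v)
        codes.append(seen[v])
    return dictionary, codes

def dict_decode(dictionary: list, codes: list) -> list:
    """Invert the encoding: look every code back up in the dictionary."""
    return [dictionary[c] for c in codes]

col = ["temp1", "temp1", "temp2", "temp1"]
d, c = dict_encode(col)
print(d, c)                      # ['temp1', 'temp2'] [0, 0, 1, 0]
assert dict_decode(d, c) == col  # round-trips losslessly
```

The codes are small integers, so they compress further under LZ4 or gzip, and scans can often evaluate predicates against the dictionary instead of every row.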
DiskRowSet compaction
[Diagram] Compaction takes DiskRowSets with overlapping PK ranges, e.g. DiskRowSet1 PK {A, G} and DiskRowSet2 PK {B, E}, and rewrites them into non-overlapping sets, e.g. DiskRowSet1 PK {A, D} and DiskRowSet2 PK {E, G}.
• Periodic background task
• Removes deleted rows
• Reduces the number of sets with overlapping PK ranges
• Does not create bigger DiskRowSets
– the 32MB size of each DRS is preserved
Note: the PK has to be provided for UPDATE and DELETE.
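The compaction step can be sketched as a merge of overlapping sorted rowsets that drops deleted rows and re-splits the result into fixed-size, non-overlapping sets; the 3-row chunk is a stand-in for the 32MB DRS size, and `None` marks a deleted row.

```python
CHUNK = 3  # rows per output set (stand-in for 32MB)

def compact(rowsets: list) -> list:
    # Merge all rows by PK (later sets win on conflicts)...
    merged = {}
    for rs in rowsets:
        merged.update(rs)
    # ...drop rows marked deleted, keeping PK order...
    live = {pk: row for pk, row in sorted(merged.items()) if row is not None}
    # ...and re-split into fixed-size, non-overlapping DiskRowSets.
    items = list(live.items())
    return [dict(items[i:i + CHUNK]) for i in range(0, len(items), CHUNK)]

rs1 = {"a": 1, "c": 3, "g": 7}      # PK range {a, g}
rs2 = {"b": 2, "c": None, "e": 5}   # PK range {b, e}; "c" deleted
out = compact([rs1, rs2])
print([sorted(rs) for rs in out])   # [['a', 'b', 'e'], ['g']]
```

After compaction, a key lookup touches at most one rowset instead of every set whose PK range overlapped the key.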
Data updates
[Diagram] An UPDATE (set Col2=x where Col1=y) first checks the MemRowSet, then uses the DiskRowSets' PK ranges and bloom filters to find the set that may hold the row; if the row is there, its row offset is retrieved. The mutation is recorded in an in-memory MemDeltaStore (a B+tree keyed by (row_offset, time)), which is flushed to on-disk DeltaStores; compactions are done periodically.
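The resulting read path can be sketched as base-plus-deltas: the base DiskRowSet data is immutable, updates live as (row_offset, time) deltas, and a reader applies the deltas at or before its snapshot timestamp on top of the base row. This illustrates the idea only; the real delta format is columnar and compacted.

```python
def read_row(base: list, deltas: list, offset: int, as_of: int) -> dict:
    """Return base[offset] with deltas (offset, ts, col, val) applied,
    in timestamp order, up to the snapshot time `as_of`."""
    row = dict(base[offset])              # copy the immutable base row
    for d_off, ts, col, val in sorted(deltas, key=lambda d: d[1]):
        if d_off == offset and ts <= as_of:
            row[col] = val                # overlay the newer value
    return row

base = [{"host": "myhost", "value": 0.5}]
deltas = [(0, 200, "value", 0.7), (0, 300, "value", 0.9)]
print(read_row(base, deltas, 0, as_of=250))  # value 0.7 at snapshot t=250
print(read_row(base, deltas, 0, as_of=400))  # value 0.9
```

Keeping updates out of the base files is what lets Kudu combine Parquet-like columnar scans with mutability; compaction later folds deltas back into new base data.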
Summary
• KUDU is NOT
– a SQL database, a filesystem, or an in-memory database
– a direct replacement for HBase or HDFS
• KUDU is trying to be a compromise between
– Fast sequential scans
– Fast random reads
• Simplifies data ingestion model
Learn more
• Video: https://www.oreilly.com/ideas/kudu-resolving-transactional-and-analytic-trade-offs-in-hadoop
• Whitepaper: http://getkudu.io/kudu.pdf
• KUDU project: https://github.com/cloudera/kudu
• Get Cloudera Quickstart VM and test it