What is Big Data

Big Data & Test Automation
Agenda
• What is Big Data?
• Big Data architecture
• Big Data technologies
• Testing strategy
• Functional test automation on data warehousing
What is Big Data
• Is it just a huge amount of data?
Why do we need Big Data
For effective marketing
To make better business decisions
To gather customer feedback
To attain customer satisfaction
To increase revenue
4 V’s of Big Data
Big Data Architecture
Big Data Technologies
HDFS (Hadoop Distributed File System)
MapReduce Framework
Map - Performs filtering and sorting on data sets
Reduce - Performs a summary operation on the results of the map step
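The two steps above can be illustrated with a small word count written in plain Python. This is only a sketch of the programming model, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the shuffle step stands in for the grouping and sorting that a real framework performs between the map and reduce phases.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: keep alphabetic tokens only (filtering) and emit (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split() if word.isalpha()]


def shuffle(pairs):
    """Group values by key and sort the keys, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reduce: summarize all values collected for one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data big tests", "data tests data"]))
```

In a real cluster the map and reduce functions run in parallel on different nodes; here they run sequentially so the data flow is easy to follow.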
HBase
HBase is a column-oriented database management system that runs on top of HDFS
Hive
Data warehousing infrastructure for Hadoop
Provides data summarization, querying, and analysis
Pig
A high-level platform for creating MapReduce programs used with Hadoop
Test Strategy
Big Data Ecosystem Testing
Big Data Extraction Testing
• Pre-Hadoop Validation
• Metadata Analysis and Validation
• Impala & HDFS Data Storage Validation
• Validation on Data Extraction from Source
• Referential Integrity & Constraint Validation
• Heterogeneous Data Integration Validation
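As one possible illustration of the referential integrity and constraint checks listed above, the sketch below validates extracted rows held as lists of dictionaries. The function names, the table layout (customers and orders), and the column names are all hypothetical; a real project would typically run equivalent checks as queries against the source and staging stores.

```python
def find_orphans(child_rows, parent_rows, fk, pk):
    """Referential integrity: return child rows whose foreign key has no parent."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


def find_null_violations(rows, required_columns):
    """NOT NULL constraint: return rows missing a value in any required column."""
    return [
        row for row in rows
        if any(row.get(column) is None for column in required_columns)
    ]


def find_duplicate_keys(rows, pk):
    """Primary key constraint: return key values that occur more than once."""
    seen, duplicates = set(), set()
    for row in rows:
        key = row[pk]
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return sorted(duplicates)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": None}, {"id": 2, "name": "Bob"}]
    orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 7}]

    print(find_orphans(orders, customers, fk="customer_id", pk="id"))
    print(find_null_violations(customers, ["name"]))
    print(find_duplicate_keys(customers, "id"))
```

Each function returns the offending rows or keys rather than a pass/fail flag, so a test report can show exactly which records broke a constraint.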
Big Data Tools
• QuerySurge - Functional testing
Non-Functional Testing
• Performance Validation
• Security Validation
Data Transformation/Migration Testing
• Data Quality Validation
• Data Correctness/Completeness Validation
• Business Rule Validation
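The completeness, correctness, and business rule checks above could be combined into a single comparison of source and target records, as in the following sketch. It assumes the migration copies rows unchanged; if a transformation is applied, the expected transformation would first have to be applied to the source rows. The function name validate_migration, the report keys, and the sample rule are hypothetical.

```python
def validate_migration(source_rows, target_rows, key, rules=()):
    """Compare source and target rows by key and apply business rules to the target.

    rules is an iterable of (rule_name, predicate) pairs; a predicate returns
    True when a target row satisfies the rule.
    """
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}
    common = source.keys() & target.keys()
    return {
        # Completeness: every source record must arrive, and nothing extra.
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        # Correctness: records present on both sides must be identical.
        "mismatched": sorted(k for k in common if source[k] != target[k]),
        # Business rules: list (key, rule_name) for each failed rule.
        "rule_violations": sorted(
            (row[key], name)
            for row in target_rows
            for name, predicate in rules
            if not predicate(row)
        ),
    }


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    source = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.0}, {"id": 3, "amount": 7.5}]
    target = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": -5.0}, {"id": 4, "amount": 1.0}]
    rules = [("amount_non_negative", lambda row: row["amount"] >= 0)]
    print(validate_migration(source, target, key="id", rules=rules))
```

An empty list under every report key means the sample passed all three kinds of validation.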
Data Analytics and Visualization Testing
• HDFS to SSAS Validation
• Dashboard Validation
• Visualization Validation
• Report Generation and Validation
QuerySurge Tool
Conclusion
Thank You