Data Analytics Presentation

Group: 3
Ian Roberts
Brandon Segal
SUMMARY
Problem Statement
First Approach
 Challenges
 Outcomes
Second Approach
 Challenges
 Solutions
 Outcomes
Conclusion
PROBLEM STATEMENT
Can small pharmacies compete with large pharmacies?
Overcoming Competition (Chain Pharmacies, Food Stores)
Identifying Top Earners
OTC Success
Identifying Poor Products
Identifying Unique Factors to Each Pharmacy
DATA FORMAT AND CONTENTS
Data Format
 Excel (too large to display)
 Relational Database
Data Contents
 Subproduct Categories: 220
 POS Transactions: 1.8 million
 Unique Baskets: 654,543
 Pharmacy Numbers and Zip Codes: 59
FIRST APPROACH
SQL Database
 Microsoft Azure SQL server
Break up sheets into a relational database
Join tables to isolate relationships between data sets
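A minimal sketch of the join step, assuming three hypothetical tables (`pharmacy`, `product`, `pos_transaction`) with invented column names and sample rows. It uses Python's built-in `sqlite3` as a local stand-in for the Azure SQL server named above; the actual schema is not shown in the slides.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Azure SQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows (all names and values are invented).
conn.executescript("""
    CREATE TABLE pharmacy (pharmacy_id INTEGER PRIMARY KEY, zip3 TEXT);
    CREATE TABLE product  (product_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE pos_transaction (
        basket_id INTEGER, pharmacy_id INTEGER, product_id INTEGER,
        sale_date TEXT, amount REAL
    );
    INSERT INTO pharmacy VALUES (1, '100'), (2, '200');
    INSERT INTO product  VALUES (10, 'Allergy Medicine'), (20, 'Beverages');
    INSERT INTO pos_transaction VALUES
        (1000, 1, 10, '2015-01-05', 12.50),
        (1000, 1, 20, '2015-01-05', 2.00),
        (1001, 2, 20, '2015-01-06', 3.00);
""")

# Join transactions to pharmacies and products, then total revenue
# per pharmacy and product.
rows = conn.execute("""
    SELECT ph.pharmacy_id, ph.zip3, pr.description, SUM(t.amount) AS revenue
    FROM pos_transaction AS t
    JOIN pharmacy AS ph ON ph.pharmacy_id = t.pharmacy_id
    JOIN product  AS pr ON pr.product_id  = t.product_id
    GROUP BY ph.pharmacy_id, ph.zip3, pr.description
    ORDER BY ph.pharmacy_id, revenue DESC
""").fetchall()

for row in rows:
    print(row)
```

Keeping the sheets as separate tables and joining on the pharmacy and product keys is one way to relate sales to store location without duplicating descriptions in every transaction row.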
CHALLENGES
Learning Curve
Integration with Other Tools
Lack of Experience with Database Programming
SECOND APPROACH
Create Ubuntu VM on Microsoft Azure
 Enable the entire group to SSH into the VM
 Allow the group to make changes concurrently
Flattening the Database with Python
 Concatenate the sheets of data using dictionaries
 Write the dictionaries to a tab-delimited file
 Replace codes with their descriptions for readability
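The flattening steps above can be sketched roughly as follows. This is an illustration under assumptions, not the group's script: the sheet contents, field names, and lookup dictionaries (`pharmacy_zip3`, `product_names`) are all hypothetical, and the sheets are assumed to be already loaded into Python dictionaries.

```python
import csv
import io

# Hypothetical "sheets" (invented sample values).
transactions = [
    {"basket": "1000", "pharmacy": "1", "product": "10", "sale_date": "2015-01-05"},
    {"basket": "1001", "pharmacy": "2", "product": "20", "sale_date": "2015-01-06"},
]
pharmacy_zip3 = {"1": "100", "2": "200"}                       # pharmacy -> zip prefix
product_names = {"10": "Allergy Medicine", "20": "Beverages"}  # code -> description


def flatten(transactions, pharmacy_zip3, product_names):
    """Return one wide row per transaction, with codes replaced by descriptions."""
    flat = []
    for t in transactions:
        row = dict(t)  # copy so the source sheet is not modified
        row["zip3"] = pharmacy_zip3.get(t["pharmacy"], "")
        # Keep the raw code when no description is known.
        row["product"] = product_names.get(t["product"], t["product"])
        flat.append(row)
    return flat


def to_tsv(rows, fieldnames):
    """Serialize rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


flat_rows = flatten(transactions, pharmacy_zip3, product_names)
tsv_text = to_tsv(flat_rows, ["basket", "pharmacy", "product", "sale_date", "zip3"])
print(tsv_text)
```

Writing through `csv.DictWriter` rather than manual string joins keeps the column order fixed and handles any tabs or quotes inside descriptions.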
CHALLENGES
 What kind of argument are we trying to put forth?
 What other information do we consult?
 How to Visualize a Large Flattened Table?
Basket # | Phrmcy # | Prod # | SLS Date | Zip 3 #
… | … | … | … | …
APPROACH
Supplemental Sources
 US Census Bureau
 Population
 Population Density
 Latitude
 Longitude
Visualization
 Tableau
 Fast
 Aesthetic
 Scalable
 User Friendly
Argument
 Urban Pharmacies: Medical Supplies / Home Healthcare
 Rural Pharmacies: General Stores
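One way the census attributes could be attached to the flattened rows is a lookup keyed on the zip column. The sketch below assumes that "Zip 3 #" is a three-digit ZIP prefix and that census figures have been aggregated to that prefix; the dictionary `census_by_zip3`, the field names, and every value are invented for illustration.

```python
# Hypothetical census lookup keyed by three-digit ZIP prefix (invented values).
census_by_zip3 = {
    "100": {"population": 50000, "density": 2500.0, "lat": 40.0, "lon": -75.0},
    "200": {"population": 3000, "density": 40.0, "lat": 41.0, "lon": -77.0},
}

# Hypothetical flattened rows (only the columns needed here).
flat_rows = [
    {"basket": "1000", "pharmacy": "1", "zip3": "100"},
    {"basket": "1001", "pharmacy": "2", "zip3": "200"},
    {"basket": "1002", "pharmacy": "3", "zip3": "999"},  # no census match
]

CENSUS_FIELDS = ("population", "density", "lat", "lon")


def add_census(rows, census):
    """Return copies of rows with census fields added (None when unmatched)."""
    enriched = []
    for row in rows:
        match = census.get(row["zip3"])
        merged = dict(row)
        for field in CENSUS_FIELDS:
            merged[field] = match[field] if match else None
        enriched.append(merged)
    return enriched


enriched = add_census(flat_rows, census_by_zip3)
for row in enriched:
    print(row)
```

Leaving unmatched rows in place with `None` values, instead of dropping them, makes gaps in the supplemental data visible before the table is loaded into a visualization tool.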
RESULTS
Urban Pharmacies
 Top ten earners from the 10 pharmacies in the most populated areas
Rural Pharmacies
 Top ten earners from the 10 pharmacies in the least populated areas
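The rankings in the slides were produced in Tableau. As a rough sketch of the same selection logic in Python, the code below ranks pharmacies by population, takes the n most and n least populated, and totals revenue per product in each group. The names `pharmacy_population` and `sales` and all figures are invented sample data, not results from the presentation; the slides use n = 10 and the top ten products.

```python
from collections import defaultdict

# Hypothetical population per pharmacy and (pharmacy, product, amount) sales.
pharmacy_population = {"1": 50000, "2": 3000, "3": 20000, "4": 800}
sales = [
    ("1", "Allergy Medicine", 120.0),
    ("1", "Beverages", 30.0),
    ("3", "Allergy Medicine", 80.0),
    ("3", "Pain and Sinus", 95.0),
    ("2", "Lottery Tickets", 60.0),
    ("2", "Beverages", 45.0),
    ("4", "Cigarettes", 70.0),
    ("4", "Lottery Tickets", 25.0),
]


def split_by_population(population, n):
    """Return (n most populated, n least populated) pharmacy id sets."""
    ranked = sorted(population, key=population.get, reverse=True)
    return set(ranked[:n]), set(ranked[-n:])


def top_earners(sales, pharmacies, k):
    """Total revenue per product within the given pharmacies; return the top k."""
    totals = defaultdict(float)
    for pharmacy, product, amount in sales:
        if pharmacy in pharmacies:
            totals[product] += amount
    # Sort by revenue descending, then by name so ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


urban, rural = split_by_population(pharmacy_population, 2)
urban_top = top_earners(sales, urban, 3)
rural_top = top_earners(sales, rural, 3)
print(urban_top)
print(rural_top)
```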
DIFFERENCES BETWEEN PHARMACIES
Rural Pharmacies | Urban Pharmacies
Lottery Tickets | Allergy Medicine
Cigarettes | Pain and Sinus
Beverages |
SHORTCOMINGS
Incomplete Reporting of Store Transactions
Sparse distribution of pharmacies
Outlier transactions are difficult to identify in a large data set