Chapter 1

Review: ETL & OLAP
What is transposition?
What is ETL?
Script/batch/command vs manual?
What is the major cause/source of bad data?
GIGO:
KYD:
If you didn’t finish the 4 queries from last week,
finish them today.
Exercise two: OLAP
Answer the following on the back of the handout.
Query 1:
How many Sales Districts are there?
Which district is consistently the top performer?
Which quarter had the best revenue (use totals)?
What’s the total revenue for 2004?
Which region would you be concerned about and why?
What’s the total revenue for 2003?
Query 2:
How many Sales Regions are there?
How many Sales Districts in the USA?
How many Sales Reps in NE USA?
How many Sales Reps in the USA?
Best and Worst Sales Reps?
Query 3:
Graph the results like this. What’s missing?
How can you fix this? What’s happened in Q1 2005?
Query 4:
Create this report (Sales Reps by Revenue)
Look for the SQL that is created and run for you. Where is it?
select
  pa14.PROD_GRP_ID PROD_GRP_ID,
  a13.PROD_GRP_DESC PROD_GRP_DESC,
  coalesce(pa11.PROD_ID, pa12.PROD_ID) PROD_ID,
  a13.PROD_NAME PROD_NAME,
  coalesce(pa11.SALES_REP_ID, pa12.SALES_REP_ID) SALES_REP_ID,
  a15.SALES_REP_NAME SALES_REP_NAME,
  a15.SALES_REGN_ID SALES_REGN_ID,
  a15.SALES_REGN_DESC SALES_REGN_DESC,
  a15.SALES_DIST_ID SALES_DIST_ID,
  a15.SALES_DIST_DESC SALES_DIST_DESC,
  coalesce(pa11.QTR_ID, pa12.QTR_ID) QTR_ID,
  a16.QTR_DESC QTR_DESC,
  pa11.WJXBFS1 WJXBFS1,
  pa11.WJXBFS2 WJXBFS2,
  ZEROIFNULL((pa11.WJXBFS3 / NULLIFZERO(pa14.WJXBFS1))) WJXBFS3,
  pa11.WJXBFS4 WJXBFS4,
  pa12.WJXBFS1 WJXBFS5,
  pa11.WJXBFS5 WJXBFS6
from
  (select
    coalesce(pa11.SALES_REP_ID, pa12.SALES_REP_ID) SALES_REP_ID,
    coalesce(pa11.QTR_ID, pa12.QTR_ID) QTR_ID,
    coalesce(pa11.PROD_ID, pa12.PROD_ID) PROD_ID,
    pa11.WJXBFS1 WJXBFS1,
    pa11.WJXBFS2 WJXBFS2,
    pa11.WJXBFS4 WJXBFS3,
    ZEROIFNULL((pa11.WJXBFS4 / NULLIFZERO(pa12.WJXBFS1))) WJXBFS4,
    pa11.WJXBFS3 WJXBFS5
  from
    (select
      a11.SALES_REP_ID SALES_REP_ID,
      a12.QTR_ID QTR_ID,
      a11.PROD_ID PROD_ID,
      count(distinct a11.ORDER_ID) WJXBFS1,
      sum(a11.ORDER_AMT) WJXBFS2,
      count(distinct a11.ORDER_ID) WJXBFS3,
      (1.00001 * sum(a11.ORDER_AMT)) WJXBFS4
    from
      F_ORDER a11
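The generated SQL above wraps its ratios in ZEROIFNULL(... / NULLIFZERO(...)). These are Teradata functions that guard against division by zero: NULLIFZERO turns a zero divisor into NULL, so the division yields NULL instead of an error, and ZEROIFNULL turns that NULL back into 0. A minimal sketch of the same idea follows, assuming SQLite (through Python's sqlite3 module) as a stand-in, with the standard NULLIF and COALESCE functions in place of the Teradata ones. The table name demo and its values are hypothetical.

```python
import sqlite3


def safe_ratios(rows):
    """Return (rep_id, ratio) pairs, using 0 when the divisor is zero or NULL."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table, not part of the course database.
    con.execute("CREATE TABLE demo (rep_id INTEGER, amt REAL, orders INTEGER)")
    con.executemany("INSERT INTO demo VALUES (?, ?, ?)", rows)
    # NULLIF(orders, 0) plays the role of NULLIFZERO(orders);
    # COALESCE(..., 0) plays the role of ZEROIFNULL(...).
    result = con.execute(
        "SELECT rep_id, COALESCE(amt / NULLIF(orders, 0), 0) "
        "FROM demo ORDER BY rep_id"
    ).fetchall()
    con.close()
    return result


# Invented values: rep 2 has zero orders, rep 3 has a missing order count.
ratios = safe_ratios([(1, 100.0, 4), (2, 50.0, 0), (3, 75.0, None)])
print(ratios)
```

Without the NULLIF guard, a zero divisor would make the query fail in Teradata; with it, the affected rows simply report 0.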
Query 1
Log on to SQLAssistant
Week 7: Data Mining
Copy the Weka.zip file to a USB drive.
What is Data Mining?
What is ARROWSMITH?
What is Market Basket Analysis?
Refer to the following metadata:
age {0_34,35_51,52_max}
gender {FEMALE,MALE}
region {INNER_CITY,TOWN,RURAL,SUBURBAN}
income {0_24386,24387_43758,43759_max}
married {NO,YES}
children {0,1,2,3}
car {NO,YES}
save_act {NO,YES}
current_act {NO,YES}
mortgage {NO,YES}
What business could it be from?
Suggest some theories as to what patterns or relationships you might
find in this data (e.g. married with children, rural with car).
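One way to check a theory such as "rural with car" is to measure its support (the share of all records where both conditions hold) and its confidence (the share of records matching the first condition that also match the second), the two usual measures in association-rule mining. A minimal sketch in plain Python follows, assuming hypothetical records that use attribute names from the metadata above. The values are invented for illustration and are not from the course data set, and rule_stats is an illustrative helper, not a Weka function.

```python
def rule_stats(records, antecedent, consequent):
    """Support and confidence for the rule antecedent -> consequent.

    Both conditions are dicts mapping attribute name to required value.
    """
    def matches(record, condition):
        return all(record.get(k) == v for k, v in condition.items())

    n_antecedent = sum(1 for r in records if matches(r, antecedent))
    n_both = sum(
        1 for r in records
        if matches(r, antecedent) and matches(r, consequent)
    )
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical records (invented values, attribute names from the metadata).
records = [
    {"married": "YES", "children": "2", "region": "RURAL", "car": "YES"},
    {"married": "YES", "children": "0", "region": "TOWN", "car": "NO"},
    {"married": "NO", "children": "0", "region": "RURAL", "car": "YES"},
    {"married": "YES", "children": "1", "region": "INNER_CITY", "car": "NO"},
]

support, confidence = rule_stats(records, {"region": "RURAL"}, {"car": "YES"})
print(f"rural -> car: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence but very low support rests on only a handful of records, so both numbers are worth reading together before trusting a pattern.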