Statistical Consulting - Cox Associates Consulting

Math 6330: Statistical Consulting
Class 4
Tony Cox
[email protected]
University of Colorado at Denver
Course web site: http://cox-associates.com/6330/
Assignment for next time (February 7)
• Evaluate the evidence that PM2.5 causes elderly mortality in the new
data set, Sample4.xlsx, at http://cox-associates.com/6330/.
Be prepared to present your thoughts in a presentation of under
5 minutes (just in case!)
• Read Russo & Schoemaker, 1989, Chapter 5 (improving
intelligence-gathering and estimation),
https://professional.sauder.ubc.ca/re_creditprogram/course_resources/courses/content/499/russo.pdf
• Software: Download Netica, bring it next time
http://www.norsys.com/download.html
• (Optional) YouTube: Hans Rosling TED talk,
www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen
• (Optional) Fair Coin problem
Introduction to descriptive analytics (cont.) – Some high-value tools
• Interaction plots
• CART trees
• Bayesian Networks (BNs)
• Random Forests
• Partial dependence plots
• Visualization
Interaction plots
Interaction plot descriptions can generate important hypotheses
Why is low income so strongly associated with increased risk of heart attack?
Interaction plot descriptions can raise
worthwhile research questions
How and why does education affect self-reported health risks?
BNs can help to answer such questions
PM2.5 is informative about elderly mortality in Sample4
Dependence between elderly and other mortality suggests hidden confounders
Do changes in PM2.5 predict changes
in elderly mortality in Sample4?
Descriptive analytics: Visualization
• Hans Rosling TED talk,
www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen
Introduction to predictive analytics
Everyone wants predictive analytics
“Move your analytics program from descriptive
to predictive with Microsoft Azure Machine
Learning—part of the Cortana Intelligence
Suite. You can use our pre-built modules or
upload your own R or Python code. Learn
more with our machine learning guide for
data scientists.”
Predictive analytics
• What will happen if we do nothing?
• How sure can we be?
Example: Black-box ARIMA forecasting of losses due to terrorist attacks
http://www.slideshare.net/VictorOdutokun/arima-analysis-project-slide
Predictive analytics challenge: Different models make different predictions for case 7
• Model 1: Outcome = Predictor 3
• Model 2: Outcome = majority(Predictors 2-4)
• Model 3: Outcome = max(Predictors 3-4)

Case  Predictor 1  Predictor 2  Predictor 3  Predictor 4  Outcome
1     1            1            1            1            1
2     0            0            0            0            0
3     0            1            1            0            1
4     1            1            0            0            0
5     0            0            0            0            0
6     1            0            1            1            1
7     1            1            0            1            ?
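The disagreement on case 7 can be checked directly. The following is a minimal sketch, not part of the original slides: it encodes the table above, confirms that each of the three models reproduces the outcomes of cases 1-6, and then applies each model to case 7. The function names (model1, model2, model3) are illustrative choices.

```python
# Cases 1-6 from the table: (Predictor 1, ..., Predictor 4) -> Outcome
cases = {
    1: ((1, 1, 1, 1), 1),
    2: ((0, 0, 0, 0), 0),
    3: ((0, 1, 1, 0), 1),
    4: ((1, 1, 0, 0), 0),
    5: ((0, 0, 0, 0), 0),
    6: ((1, 0, 1, 1), 1),
}
case7 = (1, 1, 0, 1)  # outcome unknown


def model1(p):
    """Outcome = Predictor 3."""
    return p[2]


def model2(p):
    """Outcome = majority vote of Predictors 2-4."""
    return 1 if sum(p[1:4]) >= 2 else 0


def model3(p):
    """Outcome = max of Predictors 3-4."""
    return max(p[2], p[3])


models = {"Model 1": model1, "Model 2": model2, "Model 3": model3}

# Does each model fit every observed case exactly?
fits_all = {
    name: all(m(p) == y for p, y in cases.values())
    for name, m in models.items()
}

# What does each model predict for case 7?
predictions = {name: m(case7) for name, m in models.items()}

print(fits_all)
print(predictions)
```

All three models fit the six observed cases equally well, so the data alone cannot choose among them, yet Model 1 predicts 0 for case 7 while Models 2 and 3 predict 1.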
Predictive analytics techniques
• Forecasting: Pr(future outputs | past)
• Regression: Pr(output | covariates)
• Dynamic simulation: Pr(outputs | inputs)
• Inference: Bayesian network (BN) PDFs
  – Inference: Pr(outputs | observed inputs)
    • Monte-Carlo and exact inference algorithms
    • Structure learning and ensemble learning algorithms
  – Dynamic Bayesian Networks (DBNs)
    • Kalman filtering and extensions
    • Particle swarm optimization
Breakthroughs in predictive analytics
• Averaging predictions from multiple models improves predictions!
  – More accurate, less biased, more precise (lower error variance), less over-confident (fewer type 1 and type 2 errors)
• Ensemble methods improve forecasts
  – Random forest (rf)
  – Gradient boosting (gbm)
  – Cross-validation, BMA
  – Super-learning
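The "lower error variance" point can be illustrated with a small simulation. This is a sketch under strong, stated assumptions that are not from the slides: every model is unbiased, and the models' errors are independent with the same standard deviation. It shows only the variance-reduction effect of plain averaging and is not an implementation of random forests, gradient boosting, BMA, or super-learning. All constants are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_VALUE = 10.0  # hypothetical quantity being predicted
N_MODELS = 5       # hypothetical ensemble size
N_TRIALS = 2000    # number of simulated prediction rounds
NOISE_SD = 1.0     # assumed: each model's error is independent N(0, NOISE_SD^2)

single_sq_errors = []    # squared errors of the individual models
ensemble_sq_errors = []  # squared errors of the averaged prediction

for _ in range(N_TRIALS):
    preds = [TRUE_VALUE + random.gauss(0.0, NOISE_SD) for _ in range(N_MODELS)]
    ensemble = sum(preds) / N_MODELS  # simple unweighted average
    single_sq_errors.extend((p - TRUE_VALUE) ** 2 for p in preds)
    ensemble_sq_errors.append((ensemble - TRUE_VALUE) ** 2)

mean_single_mse = sum(single_sq_errors) / len(single_sq_errors)
ensemble_mse = sum(ensemble_sq_errors) / len(ensemble_sq_errors)

print(f"mean squared error, single model:   {mean_single_mse:.3f}")
print(f"mean squared error, averaged model: {ensemble_mse:.3f}")
```

Under these assumptions the averaged prediction's error variance is about 1/N_MODELS of a single model's. With correlated errors the gain is smaller, which is one reason practical ensemble methods try to make their component models diverse.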
Introduction to Bayesian inference
with Netica®
Example: HIV screening
• Pr(s) = 0.01 = fraction of population with HIV
– s = has HIV, s′ = does not have HIV
– y = test is positive
• Pr(test positive | HIV) = 0.99
• Pr(test positive | no HIV) = 0.02
• Find: Pr(HIV | test positive) = Pr(s | y)
– Subjective probability estimates?
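For comparison with a subjective estimate, the requested posterior follows from Bayes' rule. The worked calculation below uses only the symbols (s, s′, y) and numbers given above; the law of total probability supplies Pr(y).

```latex
\begin{align*}
\Pr(y) &= \Pr(y \mid s)\,\Pr(s) + \Pr(y \mid s')\,\Pr(s') \\
       &= (0.99)(0.01) + (0.02)(0.99) = 0.0099 + 0.0198 = 0.0297, \\[4pt]
\Pr(s \mid y) &= \frac{\Pr(y \mid s)\,\Pr(s)}{\Pr(y)}
              = \frac{0.0099}{0.0297} = \frac{1}{3} \approx 0.333 .
\end{align*}
```

Even with a test that is 99% sensitive, only about one positive result in three comes from a person with HIV, because true positives (0.99% of the population) are outnumbered by false positives (1.98%).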
Solution via Bayesian Network (BN) Solver
• DAG model: “True state → Observation”
– DAG = “directed acyclic graph”: Nodes and arrows, no cycles allowed
• Store “marginal probabilities” at input nodes (having
output arrows only)
• Store “conditional probability tables” at all other
nodes.
• Make observations
• Enter query
– Solver calculates conditional probabilities
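For a two-node network, what the solver does can be sketched by direct enumeration. This is a minimal illustration, not Netica's algorithm or API: the function name posterior and the dictionary layout are assumptions made for this example. The root node stores a marginal probability table, the child node stores a conditional probability table (CPT), and a finding on the child is used to update the root. The numbers are those of the HIV screening example.

```python
def posterior(prior, cpt, observed):
    """Condition a root node on a finding at its single child node.

    prior:    {state: Pr(state)}               marginal table at the root
    cpt:      {state: {obs: Pr(obs | state)}}  conditional table at the child
    observed: the observed value of the child node
    Returns (posterior table over root states, Pr(observed)).
    """
    # Joint probability of each root state together with the finding
    joint = {s: prior[s] * cpt[s][observed] for s in prior}
    # Probability of the finding, summed over all root states
    evidence = sum(joint.values())
    # Normalize to get Pr(state | finding)
    return {s: joint[s] / evidence for s in joint}, evidence


# Marginal probabilities at the input node (true state)
prior = {"HIV present": 0.01, "HIV not present": 0.99}

# Conditional probability table at the observation node
cpt = {
    "HIV present": {"test positive": 0.99, "test negative": 0.01},
    "HIV not present": {"test positive": 0.02, "test negative": 0.98},
}

post, pr_positive = posterior(prior, cpt, "test positive")
print(f"Pr(test positive)       = {pr_positive:.4f}")
print(f"Pr(HIV | test positive) = {post['HIV present']:.3f}")
```

Enumeration like this is only practical for very small networks. Larger networks need the exact or Monte-Carlo inference algorithms that BN software provides.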
Solution in Netica
• Step 1: Build model, compile network

Node beliefs (%):
HIV_status
  HIV present       1.0
  HIV not present  99.0
Test_result
  test positive     2.97
  test negative    97.0
• Step 2: Condition on observation (right-click, choose “Enter findings”), view conditional probabilities

Node beliefs (%) after entering the finding Test_result = test positive:
HIV_status
  HIV present      33.3
  HIV not present  66.7
Test_result
  test positive   100
  test negative     0
Wrap-up on Netica introduction
• User just needs to enter model and
observations (“findings”)
• Netica uses Bayesian Network algorithms to
update all probabilities (conditioning them on
findings)
• We will learn to do this manually for small
problems
• Algorithms and software are essential for
large, complex inference problems
Fair Coin Problem
• A box contains two coins: (a) A fair coin; and (b) A
coin with a head on each side. One coin is
selected at random (we don’t know which) and
tossed once. It comes up heads.
• Q1: What is the probability that the coin is the fair
coin?
• Q2: If the same coin is tossed again and shows
heads again, then what is the new (posterior)
probability that it is the fair coin?
Solve manually and/or using Netica.
Using Netica to solve fair coin problem
• Step 1: Create DAG model.
(Q: What is its root?)
A: Root node is “Coin is fair”
• Step 2: Use “Enter Findings” (right-click) to specify
observations (i.e., histories of observations on which
answers are to be conditioned, e.g., “Head on first toss” or
“Heads on first two tosses”)
• Step 3: Inspect the “Coin is fair” root node to read off the
answer, i.e., Pr(Coin is fair | Observations).
Node beliefs (%) after Step 1 (network compiled, no findings entered):
CoinIsFair
  Yes   50.0
  No    50.0
FirstToss
  Head  75.0
  Tail  25.0
SecondToss1
  Head  75.0
  Tail  25.0
Node beliefs (%) after entering the finding FirstToss = Head:
CoinIsFair
  Yes   33.3
  No    66.7
FirstToss
  Head  100
  Tail    0
SecondToss1
  Head  83.3
  Tail  16.7
Node beliefs (%) after entering the findings FirstToss = Head and SecondToss1 = Head:
CoinIsFair
  Yes   20.0
  No    80.0
FirstToss
  Head  100
  Tail    0
SecondToss1
  Head  100
  Tail    0
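The same answers can be obtained manually by applying Bayes' rule once per toss. The sketch below is an illustration rather than Netica's method. It uses exact fractions to avoid rounding, and its helper names (predictive_head, update_on_head) are assumptions made for this example.

```python
from fractions import Fraction

# Pr(Head | coin) for the two coins in the box
p_head = {"fair": Fraction(1, 2), "two-headed": Fraction(1)}

# Prior: one coin is selected at random
prior = {"fair": Fraction(1, 2), "two-headed": Fraction(1, 2)}


def predictive_head(belief):
    """Pr(next toss is Head) under the current belief about the coin."""
    return sum(belief[c] * p_head[c] for c in belief)


def update_on_head(belief):
    """Posterior belief about the coin after observing one more Head."""
    evidence = predictive_head(belief)
    return {c: belief[c] * p_head[c] / evidence for c in belief}


pred_before = predictive_head(prior)         # Pr(Head) before any toss
after_one = update_on_head(prior)            # Q1: belief after one Head
pred_after_one = predictive_head(after_one)  # Pr(second Head | first Head)
after_two = update_on_head(after_one)        # Q2: belief after two Heads

print("Pr(Head on first toss)         =", pred_before)
print("Pr(fair | one Head)            =", after_one["fair"])
print("Pr(Head on second | one Head)  =", pred_after_one)
print("Pr(fair | two Heads)           =", after_two["fair"])
```

The fractions 3/4, 1/3, 5/6, and 1/5 correspond to the 75.0, 33.3, 83.3, and 20.0 percent values in the Netica displays. The posterior after the first toss serves as the prior for the second, which is why the update can be applied sequentially.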