Statistical Consulting - Cox Associates Consulting

Math 6330: Statistical Consulting
Class 6
Tony Cox
[email protected]
University of Colorado at Denver
Course web site: http://cox-associates.com/6330/
Readings on Bayesian Networks
• Charniak (1991), pages 50-53,
http://www.aaai.org/ojs/index.php/aimagazine/article/viewFile/918/836
– Build the network in Figure 2
• Pearl (2009), Sections 1 and 2 (through page 102).
http://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf
• Methods to Accelerate the Learning of Bayesian Network
Structures, Daly and Shen (2007)
https://pdfs.semanticscholar.org/e7d3/029e84a1775bb12e7e67541beaf2367f7a88.pdf
Causal questions
• Retrospective (evaluation)
– How would Y (or its probability distribution) have
been different if X had been different?
– Would Y have occurred if X had not occurred?
– Answers usually depend on the assumptions we
make about why X would have been different
• Prospective (decision optimization)
– What will happen to Y (or its probability distribution)
if we change X? How sure can we be?
• Explanatory
– Why does Y have the value (or probability
distribution) that it has?
• To what extent is it because of the value of X?
Implications among types of causation
[Diagram: implication relations among types of causation, with example methods and measures]
Types of causation shown: attributive; refutationist; weight of evidence; regularity; mediation; associational; computational/exogeneity; mechanistic; manipulative; predictive; statistical dependence; counterfactual/potential outcomes
Methods and measures shown:
• quasi-experiments
• structural equations
• simulation
• causal pathways
• relative risk (RR)
• odds ratio (OR)
• regression coefficients
• Simon-Iwasaki causal ordering
• etiologic fraction
• population attributable risk
• probability of causation
• burden of disease
• do-calculus
• dynamic causal models
• transfer entropy
• Granger causality
• DAG graph models
• Causal Bayesian networks
• propensity scores, marginal structural models
• instrumental variables
• intervention studies
Types of effects
• Direct effect: How a change in X changes Y if all
other variables are held fixed
• Total effect: How a change in X changes Y if all
other variables are allowed to respond
• Mediated effect: How a change in X changes Y by
changing mediator Z
• Transient and comparative statics effects
• Example: Effect of a change in volume on
pressure in an ideal gas
P = nRT/V
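One possible reading of the ideal-gas example is sketched below in Python. The direct effect of V on P uses P = nRT/V with n and T held fixed. As an illustrative assumption that is not stated on the slide, the transient response is modeled as a reversible adiabatic compression of a monatomic gas (gamma = 5/3), after which the gas is assumed to return to its starting temperature, giving the comparative statics effect. All numerical values are arbitrary.

```python
R = 8.314  # gas constant, J/(mol K)

def pressure(n, T, V):
    """Ideal gas law P = nRT/V in SI units."""
    return n * R * T / V

# Arbitrary illustrative state: 1 mol at 300 K in 0.024 m^3.
n, T0, V0 = 1.0, 300.0, 0.024
V1 = V0 / 2  # the change in X: halve the volume

P0 = pressure(n, T0, V0)

# Direct / comparative statics effect: T is held (or returns to) T0,
# so halving V doubles P.
P_equilibrium = pressure(n, T0, V1)

# Transient effect. ASSUMPTION: reversible adiabatic compression with
# gamma = 5/3, so T * V**(gamma - 1) stays constant and T rises at first.
gamma = 5.0 / 3.0
T_transient = T0 * (V0 / V1) ** (gamma - 1)
P_transient = pressure(n, T_transient, V1)

print(f"initial pressure:        {P0:10.0f} Pa")
print(f"transient pressure:      {P_transient:10.0f} Pa (T = {T_transient:.0f} K)")
print(f"equilibrium pressure:    {P_equilibrium:10.0f} Pa (T = {T0:.0f} K)")
```

Under these assumptions the same change in V produces a larger pressure change while temperature is still responding than it does once temperature is back at its original value.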
Associations are unreliable guides to
causation
Non-causal associations between X
and Y
• Confounding: X ← Z → Y
– Failing to condition on Z leads to spurious
association between X and Y
– Leads many statisticians to “control for” possible
confounding by putting all variables on the right-hand
side of the regression model
• Selection (Berkson): X → Z ← Y
– Conditioning on Z leads to spurious association
between X and Y
Example of selection bias
• Suppose that the only workers who continue
to work in an industry are those who (a) are
accustomed to high exposures or (b) are very
healthy.
• DAG: High exposure → Stay ← Healthy
• Then, among workers who stay, high exposure
is associated with lower health, even if
exposure does not increase risk.
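The selection effect described above can be reproduced in a small simulation. The Python sketch below is illustrative only: exposure and health are drawn independently, and the selection rule (stay if either score exceeds 0.5) is a hypothetical threshold chosen for the demonstration, not something taken from real workforce data.

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

rng = random.Random(0)
n = 20000

# Exposure and health are independent by construction:
# exposure has no effect on health in this simulated population.
exposure = [rng.gauss(0, 1) for _ in range(n)]
health = [rng.gauss(0, 1) for _ in range(n)]

# ASSUMED selection rule: a worker stays if highly exposed OR very healthy.
stayers = [(e, h) for e, h in zip(exposure, health) if e > 0.5 or h > 0.5]
exposure_stay = [e for e, _ in stayers]
health_stay = [h for _, h in stayers]

corr_all = corr(exposure, health)
corr_stayers = corr(exposure_stay, health_stay)

print(f"all workers:  corr(exposure, health) = {corr_all:+.3f}")
print(f"stayers only: corr(exposure, health) = {corr_stayers:+.3f}")
```

Restricting the data to stayers is the same as conditioning on the collider "Stay", which is what induces the negative association.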
Non-causal associations between
measured X and measured Y values
• X ← Z → Y
• Y = Z
• x = measured z + small error
• y = measured z + large error
• Then a regression model may identify X but not
Z as a significant predictor of Y
– Even though Z and not X is a direct cause of Y
Non-causal associations between X
and Y
• X ← Z → Y
• Y = Z²
• X = Z²
• Then a linear regression model may identify X
but not Z as a significant predictor of Y
– Even though Z and not X is a direct cause of Y
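A short simulation makes the point concrete. In the Python sketch below, Z is drawn symmetrically around zero and both X and Y are set to Z squared. The small noise terms are an added assumption (so that X and Y are not numerically identical), and Pearson correlation stands in for what a linear regression would detect.

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

rng = random.Random(1)
n = 20000

z = [rng.gauss(0, 1) for _ in range(n)]          # common cause, symmetric about 0
# ASSUMPTION: small independent noise added to each squared term.
x = [zi ** 2 + rng.gauss(0, 0.1) for zi in z]
y = [zi ** 2 + rng.gauss(0, 0.1) for zi in z]

r_zy = corr(z, y)  # linear association between the true cause Z and Y
r_xy = corr(x, y)  # linear association between the non-cause X and Y

print(f"corr(Z, Y) = {r_zy:+.3f}")
print(f"corr(X, Y) = {r_xy:+.3f}")
```

Because Y depends on Z only through Z squared, the linear association between Z and Y is close to zero, while X tracks Y almost perfectly.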
Identifiability of causal impacts
• Principle: Effects are not conditionally independent of
their direct causes. We can use this as a screen for
possible causes in a multivariate database
• Suppose we had an “oracle” (e.g., a perfect CART tree or
BN learning algorithm) for detecting conditional
independence
• Which of these could it distinguish among?
1. X → Z → Y (e.g., exposure → lifestyle → health)
2. Z → X → Y (e.g., lifestyle → exposure → health)
3. X → Y ← Z (e.g., exposure → health ← lifestyle)
4. X → Y → Z (e.g., exposure → health → lifestyle)
5. X ← Z → Y (e.g., exposure ← lifestyle → health)
Identifiability of causal impacts
1. X → Z → Y (e.g., exposure → lifestyle → health)
2. Z → X → Y (e.g., lifestyle → exposure → health)
3. X → Y ← Z (e.g., exposure → health ← lifestyle)
4. X → Y → Z (e.g., exposure → health → lifestyle)
5. X ← Z → Y (e.g., exposure ← lifestyle → health)
• In 1 and 5, but not the rest, X and Y are conditionally
independent given Z
– Markov equivalence class can be identified
• In 4, but not the rest, X and Z are conditionally independent
given Y
• In 2, but not the rest, Z and Y are conditionally independent
given X
• In 3, X and Z are unconditionally independent but
conditionally dependent given Y
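The independence patterns listed above can be checked numerically. The Python sketch below simulates each of the five structures as a linear-Gaussian model with unit coefficients (an assumption made only for illustration) and uses partial correlation as a rough stand-in for the conditional-independence "oracle". Values near zero suggest independence. This substitution is only appropriate for linear-Gaussian data.

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def partial_corr(a, b, c):
    """Correlation of a and b after linearly adjusting both for c."""
    r_ab, r_ac, r_bc = corr(a, b), corr(a, c), corr(b, c)
    return (r_ab - r_ac * r_bc) / ((1 - r_ac ** 2) * (1 - r_bc ** 2)) ** 0.5

def simulate(structure, n=20000, seed=0):
    """Draw (X, Y, Z) samples from one of the five structures (unit coefficients assumed)."""
    rng = random.Random(seed)
    g = rng.gauss
    xs, ys, zs = [], [], []
    for _ in range(n):
        if structure == 1:        # X -> Z -> Y
            x = g(0, 1)
            z = x + g(0, 1)
            y = z + g(0, 1)
        elif structure == 2:      # Z -> X -> Y
            z = g(0, 1)
            x = z + g(0, 1)
            y = x + g(0, 1)
        elif structure == 3:      # X -> Y <- Z
            x = g(0, 1)
            z = g(0, 1)
            y = x + z + g(0, 1)
        elif structure == 4:      # X -> Y -> Z
            x = g(0, 1)
            y = x + g(0, 1)
            z = y + g(0, 1)
        else:                     # X <- Z -> Y
            z = g(0, 1)
            x = z + g(0, 1)
            y = z + g(0, 1)
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs

results = {}
for s in range(1, 6):
    x, y, z = simulate(s)
    results[s] = {
        "X,Y|Z": partial_corr(x, y, z),
        "X,Z|Y": partial_corr(x, z, y),
        "Z,Y|X": partial_corr(z, y, x),
        "X,Z": corr(x, z),
    }
    print(s, {k: round(v, 2) for k, v in results[s].items()})
```

Structures 1 and 5 give the same pattern of near-zero values, which is why they cannot be told apart from conditional independence alone.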
Quasi-experiments: Refuting non-causal
explanations with control groups
Example:
Do delinquency
interventions work?
http://www.slideshare.net/horatjitra/research-design-and-validity
Threats to validity of causal inferences
http://spectrum.troy.edu/renckly/week6a.htm
Generalizability of findings
• Invariance of causal laws across contexts
• “Transportability” of causal effect estimates
• Threats to external validity in quasi-experiments (QEs)
Overview of causal analytics techniques
• Causal graph models
– Path diagrams, structural equation models
– (Causal) Bayesian Networks, DBNs, influence diagrams (IDs)
• Time series methods
– Granger causality: Causes help to predict effects
– Transfer Entropy: Info flows from causes to their effects
– Hybrid techniques: Inferring causal graph models from time
series data
• Systems dynamics simulation models
Path analysis
Input
Output
Allows estimation of direct, indirect, and total effects
http://crab.rutgers.edu/~goertzel/pathanal.htm
Path analysis (cont.)
Input
Output
Causal hypotheses are provided as inputs; effect strengths are estimated as outputs.
http://crab.rutgers.edu/~goertzel/pathanal.htm
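As a minimal sketch of this input/output idea, the Python example below takes a hypothesized path diagram (X → Z, Z → Y, and X → Y) as the input and estimates path coefficients by least squares as the output. The diagram, the true coefficients (0.8, 0.6, 0.5), and the simulated data are all assumptions made for illustration. Classical path analysis usually reports standardized coefficients; unstandardized ones are used here to keep the code short.

```python
import random

def cov(a, b):
    """Sample covariance (divisor n) of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n

rng = random.Random(2)
n = 20000

# ASSUMED data-generating model, matching the hypothesized diagram.
A_TRUE, B_TRUE, C_TRUE = 0.8, 0.6, 0.5   # X->Z, Z->Y, X->Y
x = [rng.gauss(0, 1) for _ in range(n)]
z = [A_TRUE * xi + rng.gauss(0, 1) for xi in x]
y = [C_TRUE * xi + B_TRUE * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
sxy, szy = cov(x, y), cov(z, y)

# Path X -> Z: simple regression of Z on X.
a_hat = sxz / sxx

# Paths X -> Y and Z -> Y: regression of Y on X and Z (two-predictor closed form).
det = sxx * szz - sxz ** 2
c_hat = (sxy * szz - szy * sxz) / det
b_hat = (szy * sxx - sxy * sxz) / det

direct = c_hat               # effect of X on Y with Z held fixed
indirect = a_hat * b_hat     # effect of X on Y transmitted through Z
total = direct + indirect    # equals the simple regression slope of Y on X

print(f"direct = {direct:.3f}, indirect = {indirect:.3f}, total = {total:.3f}")
```

If the hypothesized diagram is wrong, the same arithmetic still produces numbers, so the estimates are only as credible as the causal hypotheses supplied as input.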
Time series: Granger causality
• X is a Granger-cause of Y if the future of Y is
not conditionally independent of the history
of X, given the history of Y
• Test based on time series regression and F test
for non-independence
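The regression-based test can be sketched as a comparison of two nested models: one predicting Y from its own past, and one that also includes the past of X. The Python example below is a simplified illustration that assumes a single lag and simulated data in which X drives Y; the helper names `ols_rss` and `granger_f` are hypothetical. It reports the F statistic only, which would then be compared with a critical value of the F(q, n - k) distribution.

```python
import random

def ols_rss(rows, y):
    """Residual sum of squares from least-squares regression of y on the columns of rows."""
    k = len(rows[0])
    # Normal equations A b = c.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= f * A[col][j]
            c[r] -= f * c[col]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return sum((yi - sum(bi * ri for bi, ri in zip(b, r))) ** 2 for r, yi in zip(rows, y))

def granger_f(cause, effect):
    """F statistic for adding one lag of `cause` to an AR(1) model of `effect`."""
    y = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    n, k, q = len(y), 3, 1   # observations, unrestricted parameters, restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

# ASSUMED simulated series: X drives Y with a one-step delay, not the reverse.
rng = random.Random(3)
x, y = [0.0], [0.0]
for t in range(1, 500):
    x.append(0.5 * x[t - 1] + rng.gauss(0, 1))
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 1))

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"F (X -> Y) = {f_x_to_y:.1f}")
print(f"F (Y -> X) = {f_y_to_x:.1f}")
```

In practice the lag length is chosen from the data and a library routine would normally be used instead of hand-written least squares.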
Granger test example
http://davegiles.blogspot.com/2011/04/testing-for-granger-causality.html
Granger causality
F-tests
Asymmetry
http://epilepsyu.com/blog/tag/granger-causality-test/
From: Disruption of Frontal–Parietal Communication by Ketamine, Propofol, and Sevoflurane
Anesthesiology. 2013;118(6):1264-1275. doi:10.1097/ALN.0b013e31829103f5
Figure Legend:
Schematic illustration of transfer entropy. Symbolic transfer entropy measures the causal influence of source signal X on target signal Y, and is
based on information theory. The information transfer from signal X to Y is measured by the difference of two mutual information values,
I[Y^F; X^P, Y^P] and I[Y^F; Y^P], where X^P, Y^P, and Y^F are, respectively, the past of the source and target signals and the future of the target signal.
The difference corresponds to information transferred from the past of the source signal X^P to the future of the target signal Y^F and not from
the past of the target signal itself. The average over all vector points measures the information transferred from the source signal to the target
signal. The vector points are symbolized with the rank of their components: e.g., a vector point (30,78,51) is symbolized to (1,3,2) with the rank
of components in ascending order.
Date of download: 2/16/2017
Copyright © 2017 American Society of Anesthesiologists. All rights reserved.
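The difference of mutual information values in the legend above can be computed directly for short discrete signals. The Python sketch below is a simplified illustration, not the symbolic (rank-based) procedure from the cited paper: it assumes binary signals, a single-step past, and plug-in probability estimates from counts. The simulated signals, in which Y copies the previous value of X 90% of the time, are an assumption for demonstration.

```python
import random
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I[A; B] in bits from a list of (a, b) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    return sum(c / n * log2(c * n / (count_a[a] * count_b[b]))
               for (a, b), c in joint.items())

def transfer_entropy(source, target):
    """I[Y_F; X_P, Y_P] - I[Y_F; Y_P] with one-step past and future."""
    y_future = target[1:]
    x_past = source[:-1]
    y_past = target[:-1]
    with_source = mutual_information(list(zip(y_future, zip(x_past, y_past))))
    own_past_only = mutual_information(list(zip(y_future, y_past)))
    return with_source - own_past_only

# ASSUMED signals: X is random bits; Y copies the previous X with probability 0.9.
rng = random.Random(4)
n = 20000
x = [rng.randint(0, 1) for _ in range(n)]
y = [0]
for t in range(1, n):
    copied = rng.random() < 0.9
    y.append(x[t - 1] if copied else 1 - x[t - 1])

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE (X -> Y) = {te_x_to_y:.3f} bits")
print(f"TE (Y -> X) = {te_y_to_x:.3f} bits")
```

The asymmetry between the two directions is what the measure is designed to reveal; with continuous signals, some discretization step such as the rank symbolization described in the legend is needed first.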
Algorithmic challenges
• Learning: Learn causal graph from data
– Structure (DAG) Learning
– CPT estimation
• Dirichlet prior and Bayesian estimation
• Monte Carlo sampling
• Inference: Use causal graph to draw
inferences about probabilities of variables
given observations
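For the CPT estimation step, one common approach is to take the posterior mean under a symmetric Dirichlet prior, which amounts to adding a pseudo-count to every cell before normalizing. The Python sketch below assumes a single parent variable and made-up records; the function name `estimate_cpt`, the state labels, and the choice alpha = 1 are hypothetical.

```python
from collections import Counter

def estimate_cpt(samples, child_states, alpha=1.0):
    """Posterior-mean estimate of P(child | parent) under a symmetric Dirichlet(alpha) prior.

    samples: list of (parent_state, child_state) observations.
    """
    counts = Counter(samples)
    parent_totals = Counter(p for p, _ in samples)
    k = len(child_states)
    return {
        p: {c: (counts[(p, c)] + alpha) / (total + alpha * k) for c in child_states}
        for p, total in parent_totals.items()
    }

# Made-up records of (exposure level, health outcome).
data = ([("high", "ill")] * 3 + [("high", "well")] * 1
        + [("low", "well")] * 4)

cpt = estimate_cpt(data, child_states=["ill", "well"], alpha=1.0)
for parent, row in cpt.items():
    print(parent, {state: round(p, 3) for state, p in row.items()})
```

Note that the ("low", "ill") cell has no observations, yet its estimated probability is 1/6 rather than 0; avoiding zero probabilities from sparse counts is a main reason for using the prior.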
How to get from data to causal
predictions… objectively?
• Causal prediction
– Deterministic causal prediction: Doing X will make
Y happen to people of type Z
– Probabilistic causal prediction: Doing X will change
conditional probability distribution of Y, given
covariates Z
• Goal: Manipulative causation (vs. associational,
counterfactual, predictive, computational, etc.)
• Data: Observed (X, Y, Z) values
• Challenge: How will changing X change Y?
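The gap between observing X and changing X can be illustrated with simulated (X, Y, Z) data. In the Python sketch below, Z affects both X and Y, and the data-generating probabilities are assumptions chosen so that the true effect of setting X = 1 rather than X = 0 is a 0.3 increase in P(Y = 1). The adjusted estimate averages stratum-specific differences over the distribution of Z, which is valid only under the strong assumption that Z is the sole common cause of X and Y.

```python
import random

rng = random.Random(5)
n = 50000

# ASSUMED data-generating process with Z as a common cause of X and Y.
rows = []
for _ in range(n):
    z = 1 if rng.random() < 0.5 else 0
    x = 1 if rng.random() < (0.8 if z else 0.2) else 0
    y = 1 if rng.random() < 0.1 + 0.3 * x + 0.5 * z else 0
    rows.append((x, y, z))

def p_y(rows, x, z=None):
    """Observed P(Y = 1 | X = x), optionally also conditioning on Z = z."""
    selected = [r[1] for r in rows if r[0] == x and (z is None or r[2] == z)]
    return sum(selected) / len(selected)

# Associational contrast: compares groups that also differ in Z.
naive = p_y(rows, 1) - p_y(rows, 0)

# Adjusted contrast: average the within-stratum differences over P(Z).
p_z1 = sum(r[2] for r in rows) / len(rows)
adjusted = sum(pz * (p_y(rows, 1, zv) - p_y(rows, 0, zv))
               for zv, pz in ((0, 1 - p_z1), (1, p_z1)))

print(f"naive difference:    {naive:.3f}")
print(f"adjusted difference: {adjusted:.3f}  (true effect of changing X: 0.300)")
```

Here the observed association roughly doubles the effect that changing X would actually produce, which is the kind of discrepancy a manipulative-causation analysis is meant to avoid.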