Can we establish cause-and-effect relationships in large healthcare databases?
Lawrence McCandless, Associate Professor
[email protected]
Faculty of Health Sciences, Simon Fraser University
Spring 2016

Example of a large healthcare database study
Antidepressants and Suicide: Causal Link?
• The Problem: Confounding variables
• What is causing suicide? SSRIs, depression, or both?
• Were the authors able to control for confounders?

Outline
1. An Example of Confounding: Beta-Blocker Therapy in Heart Failure Patients
2. The Problem of Confounding when inferring causation in large healthcare database studies
3. The assumption of no unmeasured confounders
4. The role of Bayesian statistics

A Motivating Example: The Effectiveness of Beta-Blocker Therapy in Heart Failure Patients

Example of Confounding: Beta-Blocker Therapy in Heart Failure Patients
I present a re-analysis of healthcare administrative data described by McCandless, Gustafson & Levy (2007, Statistics in Medicine), "Bayesian sensitivity analysis for unmeasured confounding in observational studies."
• A retrospective cohort study that examined the relationship between beta-blocker therapy and lower mortality in n = 6969 heart failure patients enrolled between 1999 and 2001.
• Among the 6969 patients, 1295 were treated with beta blockers and 5674 were untreated.
• All participants were followed for up to 2 years, and a total of 1755 (25%) died.
• Observational study: no randomization.

Example of Confounding: Beta-Blocker Therapy in Heart Failure Patients
Here is a simple multiple logistic regression model:
• Let Y = 1, 0 denote a dichotomous outcome variable for death by the end of follow-up.
• Let X = 1, 0 denote a dichotomous exposure variable for beta-blocker treatment.
• Let C denote a k = 12 vector of measured confounders (all dichotomous), including sex, age, and 8 disease indicator variables (e.g. cancer, heart disease).

Logistic Regression Analysis Results
Adjusted odds ratios for the association between treatment and lower mortality

Covariate                 Odds Ratio (95% confidence interval)
Beta blocker              0.72 (0.62-0.86)
Female sex                0.75 (0.67-0.84)
Age
  <65                     1.00
  65-74                   1.43 (1.14-1.80)
  75-84                   2.20 (1.81-2.71)
  85+                     3.29 (2.67-4.08)
Comorbid conditions
  Cerebrovascular Dis     1.37 (0.71-2.60)
  COPD                    1.07 (0.82-1.39)
  Hyponatremia            1.10 (0.81-1.49)
  ...
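To make the adjusted analysis concrete, here is a minimal R sketch of how such a logistic regression could be fit with glm(). The data frame and variable names (heart_failure, death, betablocker, female, age_group, and the comorbidity indicators) are hypothetical stand-ins, not the actual administrative data or the authors' code.

# Minimal sketch (hypothetical names): adjusted logistic regression of
# death on beta-blocker treatment and the measured confounders C
fit <- glm(death ~ betablocker + female + age_group +
             cerebrovascular + copd + hyponatremia,   # plus the remaining comorbidity indicators
           family = binomial(link = "logit"),
           data = heart_failure)                      # hypothetical data frame

# Adjusted odds ratio and 95% Wald confidence interval for beta blockers
exp(coef(fit)["betablocker"])
exp(confint.default(fit)["betablocker", ])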
Example of Confounding: Beta-Blocker Therapy in Heart Failure Patients
The crude odds ratio for the X-Y association was 0.63 with 95% CI (0.55, 0.74).
The question is: Was there adequate adjustment for confounding? Are there additional unmeasured confounders that were not recorded in the healthcare administrative data?
Healthy patients were more likely to be treated with beta blockers than unhealthy patients.
Additional important confounding variables: smoking, physical activity, ...

Causal Inference in Large Healthcare Databases
The problems in the beta-blocker data are typical of database studies. Large healthcare databases of electronic medical records have advantages for epidemiology research:
• Very inexpensive compared to prospective cohort studies
• Extremely high power
• Free of errors from survey instruments (e.g. interviewer and respondent biased reporting)
• Data capture entire populations because of the publicly funded healthcare system

Causal Inference in Large Healthcare Databases
However, observational studies are not randomized controlled trials, where study participants are randomly allocated to receive treatment or placebo. In observational studies, treated patients may differ systematically from control patients due to unmeasured confounding variables. For example, treated patients may be healthier than control patients. Then correlation does not imply causation: there is a mixing of the effect of the treatment with the effect of the unmeasured confounders.

What is Causal Inference?
Causal inference is the process of establishing cause-and-effect relationships from data that were not collected from an experiment (e.g. a randomized controlled trial). The causal effect of a treatment is defined as the contrast between two counterfactual outcomes:
1. The outcome that was observed in a treated individual, versus
2. The outcome that would have been observed in the same individual if they had not been treated.
These are called potential outcomes, or counterfactuals, because they are "contrary to fact".
Hernan & Robins (2006) J Epidemiol Community Health; Maldonado & Greenland (2002) Int J Epidemiol

Inferring causation in large healthcare databases
Causal inference in observational studies requires the assumption of no unmeasured confounders.
Assumption: The risk of death in group 1 (Treatment) would have been the same as the risk of death in group 2 (Control) had those subjects not received treatment (and vice versa).
This is also called the assumption of ignorable treatment assignment, or exchangeability, and it is written as (Y1, Y0) ⊥⊥ X | C. Causal inference also requires the SUTVA assumption.
Hernan & Robins (2006) J Epidemiol Community Health; Maldonado & Greenland (2002) Int J Epidemiol

Inferring causation in large healthcare databases
To reduce confounding from measured variables, we can use several methods:
• Regression adjustment (e.g. logistic regression)
• Propensity scores: adjustment, matching, inverse probability of treatment weighting, and marginal structural models (a sketch of weighting follows below)
• Other methods: restriction, standardization, the g-formula, doubly robust methods, instrumental variables, disease risk scores
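As an illustration of the propensity score approach listed above, here is a minimal R sketch of inverse probability of treatment weighting. The variable names are the same hypothetical ones used earlier; in practice one would also check covariate balance, consider stabilized or truncated weights, and use robust standard errors.

# Minimal sketch (hypothetical names): inverse probability of treatment weighting
# Step 1: propensity score model for treatment given the measured confounders C
ps_fit <- glm(betablocker ~ female + age_group + cerebrovascular + copd + hyponatremia,
              family = binomial, data = heart_failure)
ps <- fitted(ps_fit)

# Step 2: inverse probability of treatment weights
w <- ifelse(heart_failure$betablocker == 1, 1 / ps, 1 / (1 - ps))

# Step 3: weighted outcome model; the coefficient of betablocker now estimates
# a marginal (population-averaged) treatment effect among the measured C
# (quasibinomial avoids warnings about non-integer weights)
msm_fit <- glm(death ~ betablocker, family = quasibinomial,
               data = heart_failure, weights = w)
exp(coef(msm_fit)["betablocker"])   # marginal odds ratio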
Inferring causation in large healthcare databases
A fundamental problem when analyzing large healthcare databases is the assumption of no unmeasured confounders. Large healthcare databases are frequently missing information on important clinical variables:
• Smoking
• Body mass index
• Severity of underlying disease
• Indications for treatment
Therefore correlation may not imply causation. Example: antidepressants and risk of suicide.

Inferring causation in large healthcare databases
[Figure from Rothman (2012) Epidemiology: An Introduction]

Inferring causation in large healthcare databases
[Figure from Szklo & Nieto (2015) Epidemiology: Beyond the Basics]

Dealing with unmeasured confounders
Possible solutions?
⇒ Speculate about the characteristics of the unmeasured confounder and then study the resulting inferences.
⇒ This forms the basis of sensitivity analysis and bias analysis.
Greenland S (2005) Multiple-bias modelling for analysis of observational data. Journal of the Royal Statistical Society Series A 168:267-306.

What is Sensitivity Analysis?
Sensitivity analysis for unmeasured confounding:
1. Expand the model relating treatment and outcome to include extra parameters that model confounding from unmeasured variables. (These are called bias parameters.)
2. "Plug in" plausible values for the bias parameters taken from the literature.
3. Repeat the data analysis and verify that study conclusions are robust to different choices of bias parameters.
• There are many papers on this topic (see Rosenbaum, Rubin, Greenland, Robins). See Schneeweiss for examples in pharmacoepidemiology.

Sensitivity Analysis for Unmeasured Confounding: Suppose There is a Single Binary U
Instead of this model
  Logit[P(Y = 1 | X, C)] = β0 + βX X + βC C
we use this model
  Logit[P(Y = 1 | X, C, U)] = β0 + βX X + βC C + βU U
  Logit[P(U = 1 | X, C)] = γ0 + γX X
This is also known as a latent class model, or a mixture model with 2 components.

Sensitivity Analysis for Unmeasured Confounding: Suppose There is a Single Binary U
The model is indexed by 3 bias parameters:
• βU : log odds ratio for the association between U and Y
• γX : log odds ratio for the association between U and X
• expit(γ0) and expit(γ0 + γX) are the prevalences of U given X = 0 and X = 1.

Sensitivity Analysis Results
Odds ratios for the relationship between beta blockers and mortality*
Columns give the assumed association between U and mortality, from protective (βU = -2) through no effect (βU = 0) to harmful (βU = 2). Rows give the assumed association between U and X, from no relation (γX = 0) to a large increase (γX = 2).

Association between U and X        βU = -2            βU = -1            βU = 0             βU = 1             βU = 2
                                   (OR = 1/6)         (OR = 1/3)         (OR = 1)           (OR = 3)           (OR = 9)
No relation (γX = 0, OR = 1)       0.70 (0.58-0.83)   0.71 (0.60-0.84)   0.72 (0.62-0.86)   0.71 (0.60-0.84)   0.70 (0.58-0.83)
Increase (γX = 1, OR = 3)          1.15 (0.96-1.39)   0.96 (0.81-1.14)   0.72 (0.62-0.86)   0.56 (0.47-0.66)   0.48 (0.40-0.57)
Large increase (γX = 2, OR = 9)    1.49 (1.15-1.78)   1.10 (0.93-1.31)   0.72 (0.62-0.86)   0.50 (0.43-0.60)   0.42 (0.35-0.49)

* Adjusted for measured and unmeasured confounders.
Note: the original odds ratio (adjusted for C only) was 0.72 (0.62-0.86). This is a sensitivity analysis using the method of Lin et al. (1998) Biometrics.
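For intuition about how the bias parameters act, here is a minimal R sketch of a simple external-adjustment correction for a single binary U, in the spirit of the bias formula of Lin et al. (1998). It treats the odds ratio like a risk ratio, so it is only an approximation and will not reproduce the table above exactly (the table comes from refitting the expanded model); the value gamma0 = 0 in the example is a hypothetical choice.

# Minimal sketch: approximate bias correction for a single binary unmeasured
# confounder U, given assumed bias parameters (betaU, gamma0, gammaX)
expit <- function(x) 1 / (1 + exp(-x))

adjust_for_U <- function(or_apparent, betaU, gamma0, gammaX) {
  or_U <- exp(betaU)               # assumed association between U and the outcome
  p0   <- expit(gamma0)            # prevalence of U among the untreated
  p1   <- expit(gamma0 + gammaX)   # prevalence of U among the treated
  bias <- (or_U * p1 + (1 - p1)) / (or_U * p0 + (1 - p0))
  or_apparent / bias               # bias-corrected treatment odds ratio
}

# Example: apparent OR of 0.72 with a harmful U (betaU = 2) that is much more
# common among treated patients (gammaX = 2); gamma0 = 0 is hypothetical
adjust_for_U(0.72, betaU = 2, gamma0 = 0, gammaX = 2)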
How does Bayesian Statistics Fit in Here?

What is Bayesian Statistics?
Bayesian inference is an approach to statistics in which scientific evidence is summarized using probability distributions. Rather than declaring a result to be significant or not significant, the Bayesian approach quantifies the probability that a relationship is true. Thus Bayesian methods give better representations of uncertainty in complex data.

What is Bayesian Statistics?
Some of the advantages of Bayesian statistics:
• Results are easier to interpret (e.g. compared to p-values and confidence intervals).
• Bayesian statistics easily accommodates complex models (e.g. multilevel models, missing data, latent variables).
• We can incorporate prior information using the prior distribution.
• Bayesian approaches are well suited to knowledge synthesis and combining datasets.

What is Bayesian Statistics?
Prior distribution: a summary of what we know about the population quantity before having looked at the data.
Posterior distribution: our knowledge after having seen the data.

Illustration of Bayesian statistics in diagnostic testing
PPV = (sensitivity × prevalence) / [sensitivity × prevalence + (1 − specificity) × (1 − prevalence)]
Equivalently,
P(D=1|T=1) = P(T=1|D=1) P(D=1) / [P(T=1|D=1) P(D=1) + P(T=1|D=0) P(D=0)]
where P(T=1|D=1) = sensitivity, P(T=0|D=0) = specificity, and P(D=1|T=1) = positive predictive value, so that P(T=1|D=0) = 1 − specificity and P(D=0) = 1 − prevalence.

Challenges with Bayesian Modelling
There are serious challenges with the practical implementation of Bayesian models in large databases:
• Custom computer code must be developed.
• Trainees and personnel require advanced training in biostatistics (e.g. Bayesian methods and epidemiology).
• Bayesian computation can be very challenging (e.g. due to nonidentifiability and difficulties with understanding the behaviour of complex models).
However... some good news:

New Software for Bayesian Analysis
Andrew Gelman and a team of more than 20 developers have recently developed the software Stan.
Gelman et al. (2015) Stan Reference Manual 2.7.0

Stan Bayesian software
Stan is generic software for doing Bayesian calculations. It is based on adaptive Hamiltonian Monte Carlo, and it is a black-box procedure that requires little input from the user. It is a probabilistic programming language, which uses a computer program to represent probability distributions by generating data. Stan has a 500-page manual, dozens of built-in distributions and models, and extensive support across platforms (C++, R, Python).

Advantages of Stan software
In the RStan package in R, use the model code:

data {
  int<lower=1> n;                 // number of patients
  int<lower=0, upper=1> y[n];     // Death
  vector[n] x;                    // Treatment (beta blocker)
}
parameters {
  real beta0;
  real betaX;
}
model {
  y ~ bernoulli_logit(beta0 + betaX * x);
}

Or use the function increment_log_prob(), and then Stan does all the work: it discovers the structure of the model and returns a posterior sample.

Advantages of Stan software
15,000 iterations, 10 chains, 1 hour on a new computer.

           mean      se_mean  sd    2.5%      25%       50%       75%       97.5%     n_eff  Rhat
betaX      -0.44     0.03     0.37  -1.14     -0.70     -0.44     -0.18     0.28      136    1.07
beta0      -1.81     0.09     0.82  -3.29     -2.53     -1.63     -1.15     -0.47     79     1.13
beta[1]    -0.27     0.00     0.06  -0.41     -0.32     -0.27     -0.23     -0.15     1190   1.01
beta[2]    0.35      0.01     0.12  0.12      0.27      0.35      0.44      0.60      490    1.02
beta[3]    0.88      0.01     0.12  0.67      0.80      0.88      0.96      1.11      430    1.02
beta[4]    1.40      0.01     0.13  1.16      1.31      1.40      1.49      1.67      418    1.02
beta[5]    0.24      0.01     0.37  -0.50     -0.01     0.25      0.49      0.96      990    1.01
beta[6]    0.08      0.00     0.15  -0.21     -0.01     0.09      0.18      0.38      1119   1.01
beta[7]    0.19      0.00     0.17  -0.16     0.07      0.19      0.30      0.52      1184   1.01
beta[8]    0.76      0.00     0.12  0.53      0.67      0.75      0.83      0.98      1119   1.01
beta[9]    0.38      0.01     0.37  -0.37     0.14      0.39      0.63      1.08      952    1.01
beta[10]   0.60      0.01     0.29  0.02      0.41      0.61      0.80      1.17      1113   1.00
beta[11]   1.46      0.01     0.22  1.05      1.32      1.46      1.61      1.89      794    1.02
beta[12]   0.11      0.01     0.31  -0.51     -0.10     0.12      0.33      0.71      929    1.01
betaU      -0.15     0.17     1.53  -1.98     -1.66     -0.66     1.56      1.97      79     1.12
gammX      -0.07     0.07     1.20  -1.91     -1.17     -0.11     0.99      1.90      267    1.03
gamm0      -0.06     0.04     0.91  -1.76     -0.70     -0.06     0.58      1.74      663    1.01
lp__       -3763.58  0.07     3.04  -3770.38  -3765.41  -3763.25  -3761.40  -3758.62  1744   1.01

Back to the beta-blocker data example...

Bayesian Sensitivity Analysis for Unmeasured Confounding
We assign prior distributions to the model parameters:
• The bias parameters βU, γ0, γX ~ Uniform(−2, 2), to model uniform beliefs about the magnitude of unmeasured confounding.
• The remaining model parameters β0, βX, βC ∝ 1, which are improper flat priors and very uninformative.
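To connect the Stan output above with the odds-ratio scale used in the results below, here is a minimal RStan sketch. It assumes a fitted stanfit object called fit from a model like the one shown earlier, with betaX the log odds ratio for beta-blocker treatment; the object name is hypothetical.

# Minimal sketch: summarize the treatment effect from a fitted stanfit object
# (assumed to be called `fit`) on the odds-ratio scale
library(rstan)

draws <- rstan::extract(fit, pars = "betaX")$betaX   # posterior draws of the log odds ratio
or_draws <- exp(draws)

# Posterior median odds ratio and 95% credible interval
quantile(or_draws, probs = c(0.025, 0.5, 0.975))

# Posterior probability that beta blockers reduce mortality (OR < 1)
mean(or_draws < 1)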
Results
Odds Ratios Adjusted for Measured and Unmeasured Confounders

                          Odds Ratio (95% interval estimate)
Covariate                 NAIVE analysis*      Bayesian analysis** (Unif(-2,2) prior)
Beta blocker              0.72 (0.62-0.86)     0.72 (0.45-1.15)
Female sex                0.75 (0.67-0.84)     0.74 (0.65-0.84)
Age
  <65                     1.00                 1.00
  65-74                   1.43 (1.14-1.80)     1.41 (1.13-1.79)
  75-84                   2.20 (1.81-2.71)     2.22 (1.77-2.74)
  85+                     3.29 (2.67-4.08)     3.37 (2.64-4.29)
Comorbid conditions
  Cerebrovascular Dis     1.37 (0.71-2.60)     1.35 (0.70-2.63)
  COPD                    1.07 (0.82-1.39)     1.07 (0.81-1.42)
  Hyponatremia            1.10 (0.81-1.49)     1.11 (0.81-1.42)
  ...
* NAIVE: adjusted for C.
** Bayesian: adjusted for C and U.

Ongoing Research
Here is the posterior distribution of the bias parameters (βU, γ0, γX); dotted lines indicate the prior distributions. [Figure not shown.] The prior and posterior distributions for the bias parameters are different. This illustrates that Bayesian analysis using Stan software can give a unique perspective on bias and causality.

Can we infer causation in Large Healthcare Databases?
• Conventional statistical methods are inadequate because they do not model all of the uncertainties.
• In massive datasets, confidence intervals shrink and p-values are crushed toward zero, suggesting that everything is significant.
• We can do better, and we can build better statistical tools that capture uncertainty.
• Bayesian methods are an important new tool for causal inference in large healthcare databases and epidemiology research.

Can we infer causation in Large Healthcare Databases?
• Bayesian models can easily accommodate complexity: 1) multilevel models, 2) missing data, 3) unobserved variables such as an individual's disease status.
• The use of prior probability distributions represents a powerful approach for incorporating information from previous studies.
• Posterior probabilities are easier to interpret (e.g. compared to p-values).
• Recent developments in Markov chain Monte Carlo methodology facilitate the implementation of Bayesian analyses of complex data sets.

References
McCandless LC, Gustafson P, Levy AR (2007). Bayesian sensitivity analysis for unmeasured confounding in observational studies. Statistics in Medicine 26:2331-2347.
Gustafson P (2014). Bayesian inference in partially identified models: Is the shape of the posterior distribution useful? Electronic Journal of Statistics 8:476-496.
Gustafson P (2015). Bayesian Inference for Partially Identified Models: Exploring the Limits of Limited Data. CRC Press.
McCandless LC, Gustafson P, Levy AR, Richardson S (2012). Hierarchical priors for bias parameters in Bayesian sensitivity analysis for unmeasured confounding. Statistics in Medicine 31:383-396.
Greenland S (2005). Multiple-bias modelling for analysis of observational data. Journal of the Royal Statistical Society Series A 168:267-306.