Causal Inference for Binary Outcomes
Applied Health Econometrics Symposium, Leeds, October 2013
Frank Windmeijer, University of Bristol

Outline
1. Instrumental variables estimators for binary outcomes
   • Structural economic/econometric models
   • Binary outcomes
   • Identification assumptions
   • Structural Mean Models
   • Risk difference, Risk ratio, Odds ratio
   • Mendelian randomisation
   • GMM estimation

Related papers
• Clarke P.S. and F. Windmeijer, Instrumental Variable Estimators for Binary Outcomes, JASA 2012, 1638-1652.
• Clarke P.S. and F. Windmeijer, Identification of Causal Effects on Binary Outcomes Using Structural Mean Models, Biostatistics 2010.
• Clarke P.S., Palmer T. and F. Windmeijer, Estimating Structural Mean Models with Multiple Instrumental Variables using the Generalised Method of Moments, Cemmap/CMPO Working paper, 2012.
• Von Hinke Kessler Scholder, S., Davey Smith, G., Lawlor, D.A., Propper, C. and F. Windmeijer, Genetic Markers as Instrumental Variables, CMPO working paper.

Structural model and potential outcomes

Consider the simple linear model for a continuous outcome $Y$:

\[ Y = \beta_0 + X \beta_1 + U \]

If this is a structural model then $\beta_1$ is the ceteris paribus effect of a change in $X$, holding $U$ constant.

The potential outcome $Y(x)$ is the outcome that would be obtained if exposure $X$ is set equal to $x$. The average causal effect is then defined as

\[ ACE(x', x) = E[Y(x') - Y(x)] \]

For the simple structural linear model, $Y(x) = \beta_0 + \beta_1 x + U$ and hence

\[ Y(x') - Y(x) = (x' - x) \beta_1 = ACE(x', x). \]

In the linear model an endogeneity problem occurs if $U$ is not conditionally mean independent of observed $X$, $E[U \mid X] \neq 0$, and OLS is inconsistent for $\beta_1$.

A classic example in microeconometrics is the returns to education. A simple Mincer wage equation is given by

\[ \ln w_i = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 exper_i^2 + u_i \]

and endogeneity arises because there are unobserved individual characteristics in $u_i$ that affect both $\ln w_i$ and the choice of the number of years of education.
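The inconsistency of OLS under endogeneity can be illustrated with a small simulation. This is only a sketch under assumed values: the confounder `ability`, all coefficients, and the omission of the experience terms are hypothetical choices for illustration and are not taken from the slides. With these values the population OLS slope is 0.10 + 1/5 = 0.30 rather than the structural 0.10.

```python
import numpy as np


def ols_slope_under_endogeneity(n=200_000, beta1=0.10, seed=0):
    """Simulate a wage equation with an omitted confounder; return the OLS slope on educ."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)                        # hypothetical unobserved confounder
    educ = 12.0 + 2.0 * ability + rng.normal(size=n)    # schooling choice depends on ability
    u = 0.5 * ability + rng.normal(scale=0.3, size=n)   # the error term also contains ability
    ln_w = 1.0 + beta1 * educ + u                       # structural equation, true slope beta1
    # OLS slope of ln_w on educ: sample covariance divided by sample variance
    return np.cov(educ, ln_w)[0, 1] / np.var(educ, ddof=1)


slope = ols_slope_under_endogeneity()
print(f"structural beta1 = 0.10, OLS slope = {slope:.3f}")
```

Because `educ` and `u` share the `ability` component, the OLS slope picks up cov(educ, u) / var(educ) on top of the structural effect.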
[Figure: causal diagram for the wage equation, showing educ]

An Instrumental Variable is then a variable that, in this particular example, determines the level of education, but is independent of the unobservables in the $\ln w$ equation, $u$.

For example, Angrist and Krueger (1991) use quarter of birth as an instrument for education, in a “natural experiment” type setting.

[Figure: causal diagram with QoB as an instrument for educ]

Binary outcomes

We are interested in the case that outcome $Y$, treatment $X$ and instrument $Z$ are all binary $\{0,1\}$ variables. Example: the effect of being overweight on hypertension, using a genetic marker as instrument.

We adopt a potential outcomes framework, and the system is triangular such that potential outcomes are denoted $X(z)$ and $Y(z, x)$.

Causal effects are:

\[ ACE = ATE = E[Y(1) - Y(0)] \]
\[ CRR = \frac{E[Y(1)]}{E[Y(0)]}, \quad \text{causal risk ratio} \]
\[ COR = \frac{E[Y(1)] / E[1 - Y(1)]}{E[Y(0)] / E[1 - Y(0)]}, \quad \text{causal odds ratio} \]

The IV core conditions can be stated as:
1. Independence of the potential outcomes and IV: $Y(z, x) \perp Z$
2. Exclusion restriction: $Y(z, x) = Y(x)$
3. There is an association between exposure and IV: $X \not\perp Z$

Under these core conditions, it is not possible to point-identify a causal effect without any further assumptions on the generating process, which has the general form

\[ X = f_X(Z, V), \qquad Y(x) = f_Y(x, U), \qquad (U, V) \sim F_{UV} \]

Manski (1990) provides “worst case” bounds; Balke and Pearl (1997) provide sharp bounds for causal effects like ACE.

Chesher (2010) also provides sharp bounds for causal effects, but with respect to different structural models that constrain the family of generating processes. We conjecture therefore that Chesher bounds are at least as narrow as those of Balke and Pearl.

At the other extreme, we can point identify all causal effects by specifying a fully parametric structural model:

\[ Y(x) = I(\beta_0 + x \beta_1 + U > 0) \]
\[ X(Z) = I(\alpha_0 + Z \alpha_1 + V > 0) \]

Specifying the joint distribution of $(U, V)$ as bivariate normal results in the bivariate probit model, with the ML estimator consistent for the parameters, provided the model is correctly specified.
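To make the fully parametric case concrete, the sketch below computes the three causal effects implied by a bivariate probit model and contrasts the ACE with the naive treated-minus-untreated risk difference. The parameter values, the function names, and the sign convention (errors entering with a plus sign, so that $E[Y(x)] = \Phi(\beta_0 + \beta_1 x)$) are assumptions made for illustration only.

```python
import math

import numpy as np


def norm_cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def causal_effects(beta0, beta1):
    """ACE, CRR and COR implied by Y(x) = I(beta0 + beta1 * x + U > 0), U ~ N(0, 1)."""
    p0 = norm_cdf(beta0)            # E[Y(0)]
    p1 = norm_cdf(beta0 + beta1)    # E[Y(1)]
    return {
        "ACE": p1 - p0,
        "CRR": p1 / p0,
        "COR": (p1 / (1.0 - p1)) / (p0 / (1.0 - p0)),
    }


def naive_risk_difference(beta0, beta1, alpha0, alpha1, rho, n=200_000, seed=1):
    """Simulate the bivariate probit model; return E[Y|X=1] - E[Y|X=0]."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)                      # binary instrument
    cov = [[1.0, rho], [rho, 1.0]]
    u, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    x = (alpha0 + alpha1 * z + v > 0).astype(int)       # selection equation
    y = (beta0 + beta1 * x + u > 0).astype(int)         # outcome equation
    return y[x == 1].mean() - y[x == 0].mean()


effects = causal_effects(beta0=-0.5, beta1=0.4)         # hypothetical parameter values
naive = naive_risk_difference(-0.5, 0.4, alpha0=0.0, alpha1=1.0, rho=0.5)
print(effects, naive)
```

With correlated errors (`rho=0.5` here) the naive contrast overstates the ACE, whereas with `rho=0` the two agree up to sampling error.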
Structural Mean Models (SMMs)

Robins (1989, 1994) introduced the class of semiparametric structural mean models. See also Vansteelandt and Goetghebeur (2003).

The additive structural mean model is specified as

\[ E[Y \mid X, Z] - E[Y(0) \mid X, Z] = (\psi_0 + \psi_1 Z) X \]

This is a saturated model.

From the core conditions, $Y(z, x) = Y(x) \perp Z$ (conditional mean independence (CMI), or randomisation assumption):

\[ E[Y(0) \mid Z = 1] = E[Y(0) \mid Z = 0] = E[Y(0)] \]

and hence

\[ E[Y - (\psi_0 + \psi_1) X \mid Z = 1] = E[Y - \psi_0 X \mid Z = 0] \]

We cannot identify 2 parameters from 1 moment condition.

The assumption made in SMM models is the “No Effect Modification” assumption (NEM). This assumes that the treatment effect is not modified by the value of the instrument $Z$, i.e. $\psi_1 = 0$, or

\[ E[Y \mid X, Z] - E[Y(0) \mid X, Z] = \psi_0 X, \qquad \psi_0 = E[Y(1) - Y(0) \mid X = 1], \text{ the ATT,} \]

and $\psi_0$ is identified from the moment condition

\[ E[Y - \psi_0 X \mid Z = 1] = E[Y - \psi_0 X \mid Z = 0] \]

or, equivalently, from

\[ E[Y - X \psi_0 - \alpha_0 \mid Z = 1] = 0, \qquad E[Y - X \psi_0 - \alpha_0] = 0 \]

where $\alpha_0 = E[Y(0)]$.

Hence the SMM estimator for $\psi_0$ is the same as the standard linear IV estimator, in this case the Wald estimator, with estimand

\[ \psi_0 = \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[X \mid Z = 1] - E[X \mid Z = 0]} \]

Note, however, that a linear structural model $Y = \alpha_0 + \psi_0 X + U$ does not make sense here, as $U$ is either $1 - \alpha_0 - \psi_0 X$ or $-\alpha_0 - \psi_0 X$ and hence not an antecedent of $X$.

With multivalued $Z$, the moments are, under NEM,

\[ E[Y - X \psi_0 - \alpha_0 \mid Z = j] = 0 \]

and the causal parameters can be estimated efficiently by GMM.

Multiplicative SMM

The multiplicative SMM is

\[ \frac{E[Y \mid X, Z]}{E[Y(0) \mid X, Z]} = \exp\big((\theta_0 + \theta_1 Z) X\big). \]

Under NEM, $\theta_1 = 0$, and from the CMI assumption it follows that:

\[ E[Y \exp(-X \theta_0) - \alpha_0 \mid Z = 1] = 0, \qquad E[Y \exp(-X \theta_0) - \alpha_0] = 0 \]

where $\alpha_0 = E[Y(0)]$, and

\[ \exp(\theta_0) = \frac{E[Y(1) \mid X = 1]}{E[Y(0) \mid X = 1]} \]

is the causal risk ratio among the treated.

Note that moment conditions of the form

\[ E[Y \exp(-X \theta_0) - \alpha_0 \mid Z = j] = 0 \]

are equivalent to the Mullahy multiplicative moments for count data

\[ E\left[ \frac{Y - \exp(\alpha_0^* + X \theta_0)}{\exp(\alpha_0^* + X \theta_0)} \,\Big|\, Z = j \right] = 0 \]

where $\alpha_0^* = \ln E[Y(0)]$.

Logistic SMM

A further generalisation is a logistic SMM, under NEM:

\[ \text{logit}(E[Y \mid X, Z]) - \text{logit}(E[Y(0) \mid X, Z]) = \xi_0 X \]

where $\text{logit}(p) = \ln(p / (1 - p))$, and $\exp(\xi_0)$ is the causal odds ratio for the treated:

\[ \exp(\xi_0) = \frac{E[Y(1) \mid X = 1] / (1 - E[Y(1) \mid X = 1])}{E[Y(0) \mid X = 1] / (1 - E[Y(0) \mid X = 1])} \]

The causal parameters of the SMMs can easily be estimated using the GMM command in Stata, or in R; programmes are given in Clarke, Palmer and Windmeijer (2012).

Local Treatment Effects

For the binary outcome, treatment and instrument case considered before, NEM does not hold if the generating process is, e.g., a bivariate probit.

If the NEM assumption does not hold, we can point identify (weighted) local causal effects, see e.g. Imbens and Angrist (1994), and Frangakis and Rubin (2002).

The core conditions for local estimation can be written as:
1. Independence of all potential outcomes and IV: $\{X(z), Y(z, x)\} \perp Z$.
2. Exclusion restriction: $Y(z, x) = Y(x)$
3. Causal effect of IV on exposure: $E[X(z)]$ is a nontrivial function of $z$.

Then, if the selection model is monotonic, such that $X(z) \geq X(z')$ if $z \geq z'$ or vice versa, the additive SMM identifies the LATE

\[ E[Y(1) - Y(0) \mid X(1) > X(0)] \]

and the multiplicative SMM identifies the LRR

\[ \frac{E[Y(1) \mid X(1) > X(0)]}{E[Y(0) \mid X(1) > X(0)]} \]

With a multivalued instrument, the SMMs identify a weighted LATE (Angrist and Imbens) and a weighted LRR.

For example, let the values for $Z$, $\{0, 1, 2, \ldots, K\}$, be ordered such that $E[YX \mid Z = k] > E[YX \mid Z = k - 1]$; then for the one-step GMM estimator:

\[ e^{\theta} = \sum_{k=1}^{K} \tau_k \, e^{\theta_{k,k-1}} \]

where $e^{\theta_{k,k-1}}$ is the LRR for the subgroup with values for $Z = k$ and $Z = k - 1$.

As an example, consider an instrument that takes the values $Z \in \{0, 1, 2, 3\}$, with $Y$ and $X$ generated from a bivariate probit model as

\[ X = 1(c_0 + c_1 Z_1 + c_2 Z_2 + c_3 Z_3 + V > 0) \]
\[ Y = 1(b_0 + b_1 X + U > 0) \]
\[ \begin{pmatrix} U \\ V \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right) \]

where $Z_j = 1(Z = j)$.

The parameters are such that, writing $X(k)$ for the potential exposure at $Z = k$,

\[ LRR_{1,0} = e^{\theta_{1,0}} = \frac{E[Y(1) \mid X(1) > X(0)]}{E[Y(0) \mid X(1) > X(0)]} = 1.1585 \]
\[ LRR_{2,1} = e^{\theta_{2,1}} = \frac{E[Y(1) \mid X(2) > X(1)]}{E[Y(0) \mid X(2) > X(1)]} = 1.3227 \]
\[ LRR_{3,2} = e^{\theta_{3,2}} = \frac{E[Y(1) \mid X(3) > X(2)]}{E[Y(0) \mid X(3) > X(2)]} = 1.5303 \]

and the population values of the $\tau_k$ are given by

\[ \tau_1 = 0.3725; \quad \tau_2 = 0.3991; \quad \tau_3 = 0.2285 \]

The one-step GMM estimator will thus be an estimate of the weighted average

\[ \tau_1 LRR_{1,0} + \tau_2 LRR_{2,1} + \tau_3 LRR_{3,2} = 1.3090. \]
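The weighted average can be checked directly from the rounded population values quoted above. The short sketch below uses only those reported numbers; the variable and function names are illustrative.

```python
# Local risk ratios and weights as reported in the slides (rounded to 4 decimals)
local_risk_ratios = [1.1585, 1.3227, 1.5303]   # LRR_{1,0}, LRR_{2,1}, LRR_{3,2}
weights = [0.3725, 0.3991, 0.2285]             # tau_1, tau_2, tau_3


def weighted_lrr(lrrs, taus):
    """Weighted average of local risk ratios: sum_k tau_k * LRR_{k,k-1}."""
    return sum(tau * lrr for tau, lrr in zip(taus, lrrs))


estimand = weighted_lrr(local_risk_ratios, weights)
print(f"sum of weights = {sum(weights):.4f}, weighted LRR = {estimand:.4f}")
```

The result agrees with the reported 1.3090 up to rounding of the inputs, and the weights sum to one up to the same rounding.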
The table presents some estimation results confirming this:

| | $e^{\theta_{1,0}}$ | $e^{\theta_{2,1}}$ | $e^{\theta_{3,2}}$ | $e^{\theta}$ | $\tau_1$ | $\tau_2$ | $\tau_3$ |
|---|---|---|---|---|---|---|---|
| Mean | 1.164 | 1.330 | 1.542 | 1.311 | 0.373 | 0.399 | 0.228 |
| St Dev | 0.094 | 0.121 | 0.160 | 0.038 | 0.027 | 0.032 | 0.022 |

10,000 MC replications. Sample size 40,000.

Further, using the two-step GMM results, Hansen’s J-test rejects the null 47% of the time at the 5% level, and therefore clearly has power to reject this violation of the NEM assumption.

Application 1

Ten Have, Joffe and Cary (2003). Randomized placebo-controlled trial involving 266 African-American adults aged between 40 and 70 who had high cholesterol and/or hypertension.

The treatment $X$ is an intervention in which patients were supplied with an audio tape containing advice about good dietary behaviour for lowering cholesterol. (Noncompliance)

The instrument $Z$ is randomisation; the outcome $Y$ is a binary indicator for lower cholesterol.

| Randomisation Z | Selection X | Outcome Y: Positive (1) | Outcome Y: Negative (0) | Total |
|---|---|---|---|---|
| Usual Care (0) | Usual Care (0) | 33 | 99 | 132 |
| Usual Care (0) | Tape (1) | 0 | 0 | 0 |
| Tape (1) | Usual Care (0) | 9 | 20 | 29 |
| Tape (1) | Tape (1) | 40 | 65 | 105 |
| Total | | 82 | 184 | 266 |

| Estimator | | Target Parameter | Estimate |
|---|---|---|---|
| First-Stage Model | | $E[X \mid Z=1] - E[X \mid Z=0]$ | 0.784 (0.713, 0.851) |
| Intention to treat | | $E[Y \mid Z=1] - E[Y \mid Z=0]$ | 0.116 (0.007, 0.226) |
| Ignoring Selection | Linear/Logistic/Probit | ATE | 0.120 (0.003, 0.234) |
| | | CRR | 1.460 (1.010, 2.113) |
| | | COR | 1.744 (1.012, 2.976) |
| Bounds | Balke-Pearl | ATE | 0.049-0.265 |
| | | CRR | 1.194-2.060 |
| | | COR | 1.277-3.185 |
| | Chesher | ATE | 0.116-0.265 |
| | | CRR | 1.463-2.060 |
| | | COR | 1.729-3.185 |
| Fully parametric | IV probit | ATE | 0.151 (0.009, 0.296) |
| | | CRR | 1.603 (1.027, 2.540) |
| | | COR | 2.007 (1.038, 3.973) |
| Semiparametric | 2SLS | LATE/ATT | 0.148 (0.006, 0.285) |
| | Mult. SMM | LRR/CRRT | 1.633 (1.025, 3.122) |
| | Logistic SMM | CORT | 2.022 (1.028, 3.862) |

Application 2

We apply the SMM estimation procedures described above to estimate the causal effect of adiposity on hypertension as in Timpson et al. (2010), using genetic markers as instruments for adiposity.
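For comparison with the Application 1 table, the first-stage, intention-to-treat, 2SLS (Wald) and multiplicative SMM point estimates can be recomputed from the four cell counts alone. This is a sketch rather than the authors' code: the names are illustrative, and the LRR formula is obtained by writing $Y \exp(-X\theta_0) = Y(1-X) + YX e^{-\theta_0}$ for binary $X$ in the multiplicative moment condition and solving for $e^{\theta_0}$.

```python
# (z, x) -> (number with Y = 1, number with Y = 0), from the Application 1 table
counts = {
    (0, 0): (33, 99),
    (0, 1): (0, 0),
    (1, 0): (9, 20),
    (1, 1): (40, 65),
}


def arm_moments(z):
    """Return E[X|Z=z], E[Y|Z=z], E[YX|Z=z] and E[Y(1-X)|Z=z] for arm z."""
    n = sum(sum(counts[(z, x)]) for x in (0, 1))   # arm size
    e_x = sum(counts[(z, 1)]) / n                  # share taking the tape
    e_yx = counts[(z, 1)][0] / n                   # positive outcome and treated
    e_y_untreated = counts[(z, 0)][0] / n          # positive outcome and untreated
    return e_x, e_yx + e_y_untreated, e_yx, e_y_untreated


ex0, ey0, eyx0, eyu0 = arm_moments(0)
ex1, ey1, eyx1, eyu1 = arm_moments(1)

first_stage = ex1 - ex0                   # E[X|Z=1] - E[X|Z=0]
itt = ey1 - ey0                           # E[Y|Z=1] - E[Y|Z=0]
wald = itt / first_stage                  # additive SMM / 2SLS estimate
lrr = (eyx1 - eyx0) / (eyu0 - eyu1)       # multiplicative SMM estimate of exp(theta_0)

print(f"first stage {first_stage:.3f}, ITT {itt:.3f}, Wald {wald:.3f}, LRR {lrr:.3f}")
```

These values match the reported 0.784, 0.116, 0.148 and 1.633 to three decimals; the intervals in the table are not reproduced here.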
The data are from the Copenhagen General Population Study and the full details of the variable definitions and selection criteria are described in Timpson et al. (2010).

The outcome variable is whether an individual has hypertension, defined as a systolic blood pressure of > 140 mmHg, a diastolic blood pressure of > 90 mmHg, or the taking of antihypertensive drugs.

The intermediate adiposity phenotype is being overweight, defined as having a BMI > 25.

We use genetic markers as instruments for being overweight.

[Figure from Davey Smith, G. et al. BMJ 2005;330:1076-1079]

Suitable and robust genetic instrument

We use two SNPs that have been consistently shown to relate to weight.
• Frayling et al. (2007) use 38,759 individuals aged 7-80 from 13 different cohorts of European ancestry. They find a positive association between FTO and all measures of weight:
   • for individuals in all cohorts
   • in all countries
   • of all ages and
   • of both sexes, with no difference between males and females
  No association with birth weight or height.
  Each copy of the FTO risk allele increases weight by 0.8-1 kg.
• Similar, though slightly smaller, associations are found for MC4R using 77,228 adults and 5,988 children (Loos et al., 2008).

Behaviours affected by genotype
E.g. if mothers carry the “fat” alleles, this may have affected their behaviour.

Mechanisms
It is often unknown how the genes affect the phenotype. Recent studies show the effect of FTO on appetite and hence diet.

Assortative Mating
Hardy-Weinberg Equilibrium
Test for patterns in observable characteristics between hetero- and homozygotes

Population Stratification
E.g. ethnicity.

Linkage Disequilibrium
Some variants of different genes are co-inherited.
Degree of linkage is a function of the distance between the loci.

Pleiotropy
A single genetic marker has multiple phenotypic effects.

Gene environment interaction, epigenetics

DAGs for Linkage Disequilibrium

[Figure: two DAGs, labelled "OK" and "Not OK", each with nodes Z: G1, G2, A, Y and u]

Estimation Results, effect of being overweight on hypertension

| SMM | Estimator | Estimate | J-test p |
|---|---|---|---|
| Linear | OLS | 0.2009 (0.0039) | |
| | 2SLS | 0.2091 (0.0819) | |
| | GMM2 | 0.2094 (0.0814) | 0.2965 |
| Multiplicative | Gamma | 0.2974 (0.0063) | |
| | GMM1 | 0.3090 (0.1192) | |
| | GMM2 | 0.3104 (0.1192) | 0.3071 |
| Logistic | Logit | 0.9487 (0.0189) | |
| | GMM1 | 1.0409 (0.4220) | |
| | GMM2 | 1.0528 (0.4217) | 0.2924 |

Sample size 55,523

We can use the same GMM format to estimate the logistic SMM with a continuous exposure $X$.

With a continuous exposure, parametric assumptions have to be made in order to identify causal parameters. Following Vansteelandt and Goetghebeur (2003) and Vansteelandt et al. (2010), we impose that the exposure effect is linear in the exposure on the odds ratio scale and independent of the instrumental variable:

\[ \frac{\text{odds}(Y = 1 \mid X, Z)}{\text{odds}(Y(0) = 1 \mid X, Z)} = \exp(\xi_0 X) \]

| Exposure | Estimate | J-test p |
|---|---|---|
| BMI | 0.1122 (0.0384) | 0.4714 |
| ln(BMI) | 0.3035 (0.1069) | 0.4828 |
| ln(RELBMI) | 0.2879 (0.1016) | 0.5004 |

Application 3

• Child weight and academic performance
Causal effect from weight to educational outcomes, e.g.
• Overweight children experience higher absenteeism in school
• Overweight children are more likely to have sleep problems
• Overweight children may be treated differently by peers and teachers
Reverse causation, e.g.
• Poor school outcomes may cause obesity
Association driven by other unobserved factors that affect both weight and academic outcomes, e.g.
• Time discount rates

Data

Avon Longitudinal Study of Parents and Children (ALSPAC)
Mothers with expected delivery date between 01/04/91 and 31/12/92
Approx. 12,000 pregnancies; genotypes for 7,700 children
Detailed information from a variety of sources: in-depth interviews, questionnaires, medical & school records, etc.
As not all children attended the special clinics where measurements were taken, final sample sizes drop to 3,500.

Outcome: nationally set KS3 exam (age 14, standardised)
Child body size: direct measure of child fat mass (DXA scan, age 11, standardised)
Contextual variables:
• Mother’s pre-pregnancy BMI
• Birth weight, breastfed, age (in months), non-white, hh composition
• Family income (sq), social class, employment status, mother’s and grandparents’ education, lone parenthood, local area deprivation (IMD)
Maternal health and behaviour:
• Smoking/drinking during pregnancy, mother’s age at birth
• Mother’s locus-of-control, EPDS, CCEI
Parental investment in child: teaching scores, activity scores

Estimation results

OLS:

| | (1) KS3 | (2) KS3 | (3) KS3 |
|---|---|---|---|
| DXA, age 11 | -0.099*** (0.015) | -0.052*** (0.014) | -0.040*** (0.014) |
| R-squared | 0.01 | 0.26 | 0.30 |
| Number of observations | 3729 | 3729 | 3729 |
| Contextual variables | | Yes | Yes |
| Mother’s health and behaviour | | | Yes |

First Stage:

| | (1) DXA | (2) DXA | (3) DXA |
|---|---|---|---|
| FTO | 1.49*** (0.24) | 1.40*** (0.23) | 1.43*** (0.23) |
| MC4R | 0.768** (0.28) | 0.890** (0.27) | 0.898** (0.27) |
| IV strength, F-statistic | 22.8 | 22.7 | 23.5 |
| Contextual variables | | Yes | Yes |
| Mother’s health and behaviour | | | Yes |

IV:

| | (1) OLS, KS3 | (2) IV, KS3 | (3) IV, KS3 | (4) IV, KS3 |
|---|---|---|---|---|
| DXA, age 11 | -0.040*** (0.014) | 0.137 (0.143) | 0.098 (0.132) | 0.115 (0.131) |
| Contextual variables | Yes | No | Yes | Yes |
| Mother’s health and behaviour | Yes | No | No | Yes |

OLS shows that heavier children perform worse in school tests compared to their leaner counterparts.
Using the genetic markers, there is no evidence of fat mass affecting outcomes.

Mendelian randomization: Strength of instruments
Even with valid instruments, we need to recognise the limitations:
• Power
• Large standard errors
• We need larger sample sizes, more variants, or markers with larger effects
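The point about power can be made concrete with a back-of-the-envelope calculation. The sketch below rests on several assumptions that are not in the slides: that the IV standard error shrinks proportionally to $1/\sqrt{n}$ with unchanged instrument strength, that the IV sample is the 3,729 observations reported for OLS, that the true effect equals the OLS magnitude of 0.040, and that the usual normal approximation for a two-sided 5% test with 80% power applies.

```python
import math


def required_sample_size(se_now, n_now, effect, z_alpha=1.96, z_power=0.84):
    """Sample size at which |effect| is detectable, assuming SE scales as 1/sqrt(n)."""
    target_se = abs(effect) / (z_alpha + z_power)   # SE needed for the stated power
    return math.ceil(n_now * (se_now / target_se) ** 2)


# IV standard error 0.131 from column (4); effect size 0.040 is an assumed magnitude
n_needed = required_sample_size(se_now=0.131, n_now=3729, effect=0.040)
print(f"approximate sample size needed: {n_needed:,}")
```

Under these assumptions the required sample is more than 80 times the current one, which is in line with the call for larger samples, more variants, or markers with larger effects.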