American Journal of Epidemiology
© The Author 2009. Published by the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: [email protected].
Vol. 169, No. 10
DOI: 10.1093/aje/kwp035
Advance Access publication April 10, 2009

Practice of Epidemiology

Methods of Covariate Selection: Directed Acyclic Graphs and the Change-in-Estimate Procedure

Hsin-Yi Weng, Ya-Hui Hsueh, Locksley L. McV. Messam, and Irva Hertz-Picciotto

Initially submitted August 10, 2007; accepted for publication January 26, 2009.

Four covariate selection approaches were compared: a directed acyclic graph (DAG) full model and 3 DAG and change-in-estimate combined procedures. Twenty-five scenarios with case-control samples were generated from 10 simulated populations in order to address the performance of these covariate selection procedures in the presence of confounders of various strengths and under DAG misspecification with omission of confounders or inclusion of nonconfounders. Performance was evaluated by standard error, bias, square root of the mean-squared error, and 95% confidence interval coverage. In most scenarios, the DAG full model without further covariate selection performed as well as or better than the other procedures when the DAGs were correctly specified, as well as when confounders were omitted. Model reduction by using change-in-estimate procedures showed potential gains in precision when the DAGs included nonconfounders, but underestimation of regression-based standard error might cause reduction in 95% confidence interval coverage. For modeling binary outcomes in a case-control study, the authors recommend construction of a “conservative” DAG, determination of all potential confounders, and then change-in-estimate procedures to simplify this full model.
The authors advocate that, under the conditions investigated, the selection of final model should be based on changes in precision: Adopt the reduced model if its standard error (derived from logistic regression) is substantially smaller; otherwise, the full DAG-based model is appropriate.

bias (epidemiology); computer simulation; confounding factors (epidemiology); epidemiologic methods; logistic models; models, statistical; models, theoretical

Abbreviations: DAG, directed acyclic graph; lnOR, natural logarithm-transformed odds ratio; OR, odds ratio.

Directed acyclic graphs (DAGs) and change-in-estimate procedures for confounder identification and selection during data analysis have, to date, been discussed separately in the epidemiologic literature (1–8). With few exceptions (9–11), data analysts have also tended to apply the procedures separately, although no obvious subject matter considerations preclude their joint use. This has been a natural course of action because the use of DAGs is generally based only on prior knowledge or a priori assumptions about causal relations among variables of interest in the source population from which the study sample is taken, while the change-in-estimate procedure relies on sample-based relations among variables. These fundamental differences serve also to highlight limitations in both approaches: DAGs in ignoring sampling variation and the change-in-estimate in not taking into consideration underlying causal relations. Although use of prior knowledge in model building has long been advocated (2, 12–14), previous studies have not comprehensively examined the performance of covariate selection procedures that combine the effect of both approaches on parameter estimation.
Thus, in this simulation study, we investigated whether combined approaches could improve parameter estimation over the DAG approach alone in the presence of confounders of various strengths and under DAG misspecification resulting from omission of confounders or inclusion of nonconfounders. This objective distinguishes this study from previous studies on confounder selection strategies (5, 15, 16). We do not discuss problems resulting from adjustment for colliders or from confounder selection based on significance tests, as these issues have been addressed elsewhere (3, 4, 5–7, 15, 17).

Correspondence to Dr. Hsin-Yi Weng, 2430 Veterinary Medicine Basic Sciences Building, 2001 South Lincoln Avenue, Urbana, IL 61802 (e-mail: [email protected]).

Am J Epidemiol 2009;169:1182–1190

Table 1. Predetermined Regression Coefficients for Covariate-Exposure (βE) and Covariate-Outcome (βO) Relations Used for Logistic Regression Models Fit to Each of the 10 Populations

Population: 1 2 3 4 5 6 7 8 9 10
Covariate^a: βE βO βE βO βE βO βE βO βE βO βE βO βE βO βE βO βE βO βE βO
X1 1.5 1.5 1.0 1.0 0.8 0.8 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 2.0 0
X2 0.5 0.5 0.5 0.5 0.3 0.3 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.0 1.0 1.5 0
X3 1.5 0.5 1.0 0.5 0.8 0.3 1.5 1.5 0.5 0.5 0.5 0.5 1.5 0 0 1.5 0.5 0.5 0.5 0
X4 0.5 1.5 0.5 1.0 0.3 0.8 0.5 0.5 0 1.5 0 1.5 0 0 2.0
X5 0.5 0.5 0.5 0.5 0.3 0.3 1.5 1.5 0 1.5 1.0 0 0 1.5
X6 1.5 0 1.0 0 0.8 0 0.5 0 0 1.0
X7 0 1.5 0 1.0 0 0.8 0 1.5 0 0.5 0 1.0 0.5 0.5 1.5 0 1.5 0 1.5 0 1.5 0
X8 X9 Exposure Intercept 0 1.5 1.0 0.8 1.5 1.5 1.5 1.5 1.5 0.5 1.5 1.5 3.6 7.2 3.6 6.3 3.3 5.4 2.1 5.2 2.1 5.3 2.1 5.3 2.5 5.2 1.1 6.0 3.5 6.5 3.1 8.4

a All covariates except X5 in populations 1–3 and X7 in population 10 were Bernoulli random variables with success probability of 0.2. These 4 covariates followed a normal distribution with a mean of 3 and a variance of 1.
MATERIALS AND METHODS

Populations and samples

We created 10 different populations, each consisting of 500,000 observations with a binary exposure (E) and outcome (O), as well as covariates ($X_j$) (Table 1). We then selected random samples of size 300 and 1,000 from these populations with confounders of various strengths and generated 25 scenarios (Table 2) to examine the performance of the covariate selection procedures under correctly and incorrectly specified DAGs for populations with different confounding structures. Misspecified DAGs may omit confounders or include nonconfounding covariates (i.e., those nonconfounders that are neither colliders nor consequences of either exposure or outcome). For each sample size, we repeated the cumulative incidence case-control sampling from the population 1,000 times. One set consisted of 150 cases and 150 controls, and the other consisted of 500 cases and 500 controls. Sampling was without replacement within each iteration, but the sampled observations were replaced for the next iteration. SAS software for Windows (18) was used to generate the samples.

Covariate selection procedures

Logistic regression was used to model the parameter of interest, that is, the odds ratio (OR) relating E to O while simultaneously controlling for selected covariates. The covariate selection procedures investigated in this study are as follows.

DAG full model. The covariates identified as confounders by the relevant DAG were included in the logistic regression model without further covariate selection. For example, X1, . . ., X5, but not X6 or X7, were included in the logistic regression models for all the samples in scenarios 1–3 (Tables 1 and 2). Different sets of covariates were included in the DAG full models in scenarios 4–25 to represent different types of DAG misspecification (Table 2). For example, X1, . . ., X4, but not X5, were included in the DAG full model for scenario 4 (sampled from population 1) to represent a misspecified DAG that omitted a confounder (i.e., X5).

DAG gold-standard change-in-estimate procedure. In this procedure, the initial full model was the DAG full model, and covariates were selected by backward elimination. At each stage, the 1 covariate for which removal caused the smallest change in the exposure OR (defined as $\Delta\mathrm{OR}$) was removed, provided that $\Delta\mathrm{OR}$ was less than 0.1 (a 10% change). $\Delta\mathrm{OR}$, at each stage, was given by the following equation:

\[
\Delta\mathrm{OR} = \frac{\lvert \mathrm{OR}_i - \mathrm{OR}_{\mathrm{DAG}} \rvert}{\mathrm{OR}_{\mathrm{DAG}}}, \tag{1}
\]

where $\mathrm{OR}_i$ is the OR estimated at the ith step of the procedure, and $\mathrm{OR}_{\mathrm{DAG}}$ is the OR estimated by using the initial DAG full model. The procedure was discontinued at the step where no covariate’s removal met the criterion (i.e., $\Delta\mathrm{OR} \geq 0.1$ for every remaining covariate).

DAG gold-standard change-in-estimate procedure with consideration of precision. After selecting the model using the previously described DAG gold-standard procedure, we compared precision, quantified using the logistic regression-estimated standard error of the natural logarithm-transformed OR (lnOR), of this simplified model with precision of the DAG full model. At the final step, the model simplified by using the DAG gold-standard change-in-estimate procedure was selected if, and only if, it had greater precision (i.e., a smaller regression-estimated standard error) than did the DAG full model. Otherwise, the DAG full model was selected.

Table 2.
Covariates Included (O) in the Initial Directed Acyclic Graph Full Model for Each of the 25 Scenarios

Covariate^a; Scenarios^b: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
X1 O O O O O O O O O O O O O O O O O O O O O O O
X2 O O O O O O O O O O O O O O O O O O O O O
X3 O O O O O O O O O O O O O O
X4 O O O O
X5 O O O O O O O O O O O O O O O
X6 O O O O O O O O O O O O O O O O O O O O
X7 O O O O O O O O O O
X8 O O
X9 O O
Exposure O O O O O O O O O O O O O O O O O O O O O O O O
Intercept O O O O O O O O O O O O O O O O O O O O O O O O
Source population 1 2 3 1 4 1 1 1 3 1 3 5 6 1 7 9 10 1 8 9 10 10 1 9 10

a Covariate relations with the exposure and outcome are described in Table 1 for each of the 10 source populations.
b Scenarios 1–3 contained confounders of decreasing strengths. Scenarios 4 and 5 represented misspecified directed acyclic graphs that omitted a strong confounder. Scenarios 6–9 represented directed acyclic graphs that omitted a moderate or weak confounder. Scenarios 10–12 represented directed acyclic graphs that omitted 2 same-direction confounders, while scenario 13 omitted 2 opposite-direction confounders. Scenarios 14–17 represented misspecified directed acyclic graphs that included nonconfounders associated with only study exposure, while scenarios 18–22 represented the inclusion of nonconfounders associated with only study outcome. Scenarios 23–25 included nonconfounders that were each associated with only study exposure or outcome.

DAG-stepwise change-in-estimate procedure. The DAG full model was the initial model, and backward elimination was applied. Instead of being compared with the OR obtained from the DAG full model, as in the case of the DAG gold-standard change-in-estimate procedure, the OR obtained at each step was compared with that generated at the previous step (i.e., $\Delta\mathrm{OR} = \lvert \mathrm{OR}_i - \mathrm{OR}_{i-1} \rvert / \mathrm{OR}_{i-1}$). The 0.1 criterion previously described was also applied to this procedure.
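The two backward-elimination rules differ only in the reference OR used to compute the relative change: the DAG full model OR (gold standard) or the OR from the previous step (stepwise). The following is a minimal Python sketch of that logic, not the SAS macro used in the study. The function name `change_in_estimate`, the `estimate_or` callback, and the toy OR table are all hypothetical and chosen purely for illustration; in practice `estimate_or` would refit a logistic regression for each covariate set.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical interface: given the set of adjusted covariates, return the
# exposure OR from a model fitted elsewhere (e.g., a logistic regression).
OrEstimator = Callable[[FrozenSet[str]], float]


def change_in_estimate(
    covariates: Iterable[str],
    estimate_or: OrEstimator,
    threshold: float = 0.1,
    reference: str = "gold",
) -> Tuple[List[str], float]:
    """Backward elimination driven by the relative change in the exposure OR.

    reference="gold":     compare each trial OR with the full-model OR.
    reference="stepwise": compare each trial OR with the previous step's OR.
    Returns the retained covariates and the exposure OR of the final model.
    """
    if reference not in ("gold", "stepwise"):
        raise ValueError("reference must be 'gold' or 'stepwise'")
    current = frozenset(covariates)
    or_full = estimate_or(current)
    or_current = or_full
    while current:
        baseline = or_full if reference == "gold" else or_current
        # Try removing each remaining covariate and record the relative change.
        trials = []
        for name in sorted(current):
            or_trial = estimate_or(current - {name})
            trials.append((abs(or_trial - baseline) / baseline, name, or_trial))
        delta, name, or_trial = min(trials)  # smallest change first
        if delta >= threshold:
            break  # no removal meets the criterion; stop
        current = current - {name}
        or_current = or_trial
    return sorted(current), or_current


# Toy lookup table standing in for refitted models (made-up numbers).
TOY_OR = {
    frozenset("ABC"): 2.00,
    frozenset("BC"): 2.15,
    frozenset("AC"): 2.18,
    frozenset("AB"): 2.30,
    frozenset("C"): 2.32,
    frozenset("B"): 2.50,
    frozenset(): 2.60,
}

gold = change_in_estimate("ABC", TOY_OR.__getitem__, reference="gold")
stepwise = change_in_estimate("ABC", TOY_OR.__getitem__, reference="stepwise")
print(gold)
print(stepwise)
```

In this made-up example the gold-standard rule stops after removing one covariate, whereas the stepwise rule removes a second one because each individual step stays under 10%, even though the final OR has drifted 16% away from the full-model OR.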
We used a SAS macro (19) to perform the DAG-stepwise change-in-estimate procedure (2). This macro was further modified to execute the DAG gold-standard change-in-estimate procedures.

Performance measures

The performances of these 4 procedures were assessed by comparing the lnOR calculated from each sample with the corresponding population lnOR. The performance measures were as follows: 1) standard error, estimated by using the standard deviation of the sample lnORs; 2) bias, estimated as the difference between the mean of the sample lnORs and the population lnOR; 3) square root of the mean-squared error, estimated as the square root of the sum of the bias squared and the variance (squared standard error) of the sample lnORs; and 4) 95% confidence interval coverage, calculated as the percentage of sample 95% confidence intervals that included the population lnOR. Sample confidence intervals were Wald confidence intervals derived from logistic regression. To quantify precision, we calculated the standard errors derived from both the standard deviation of the 1,000 iterations (sampling distribution-based standard error) and logistic regression (regression-based standard error). These 2 standard error estimates were then compared with each other.

Simulations

For simulations, all covariates followed a Bernoulli distribution with a success probability of 0.2 except for X5 in populations 1–3 and X7 in population 10, which were continuous and distributed normally with a mean of 3 and a standard deviation of 1 (Table 1). After predetermination of the distribution of each covariate, its relations with the exposure and outcome were modeled by using logistic regression (refer to equations 2 and 3):

\[
\Pr(E = 1 \mid Z) = \frac{e^{Z\beta_E}}{1 + e^{Z\beta_E}} \tag{2}
\]

and

\[
\Pr(O = 1 \mid Z') = \frac{e^{Z'\beta_O}}{1 + e^{Z'\beta_O}}, \tag{3}
\]

where E and O are as previously defined, and $\Pr(E = 1 \mid Z)$ is the success probability of exposure conditional on the $1 \times n$ covariate matrix $Z = 1, X_1, \ldots, X_j$ ($n = j + 1$); $\Pr(O = 1 \mid Z')$ is the success probability of the outcome conditional on the $1 \times n'$ covariate matrix $Z' = 1, E, X_1, \ldots, X_k$ ($n' = k + 2$); and $\beta_E$ and $\beta_O$ are the $n \times 1$ and $n' \times 1$ matrices of the regression coefficients expressing the covariate-exposure associations and the effects of the study exposure and other covariates on outcome, respectively.

Logistic regression intercepts were predetermined in order to ensure that the 95th percentile of outcome incidence proportion and the median of exposure prevalence in the source populations were approximately 10% and 20%, respectively. The percentiles were derived from the exposure-covariate joint probabilities, computed by using the predetermined regression coefficients (Table 1). The means of the continuous covariates and an exposure prevalence of 0.2 were used in these computations.

RESULTS

Although the relative performances of the procedures did not vary with sample size, the magnitudes of the differences between them were greater at n = 1,000 than at n = 300. We present only the results of n = 1,000.

Simulation results

The simulated 95% confidence interval coverages of properly specified DAG full models were all close to the nominal level (95%–96%), and the absolute values of the bias were all small (0.003–0.025) (only results from scenarios 1–3 are shown) (Table 3). These results indicated that the number of simulations performed in the study was satisfactory, because in this study confounding of less than 0.1 (a 10% change in OR) was considered inconsequential in covariate selection.

Table 3. Scenarios Based on Correctly Specified Directed Acyclic Graphs Representing Confounders of Various Strengths^a

Measure and scenario^b                    DAG     DAG-GS CE   DAG-GS P   DAG-S CE
Standard error
  1                                       0.173   0.177       0.177      0.188
  2                                       0.164   0.167       0.167      0.179
  3                                       0.176   0.177       0.177      0.184
Bias
  1                                       0.007   0.057       0.057      0.077
  2                                       0.006   0.058       0.058      0.119
  3                                       0.003   0.072       0.072      0.159
Square root of the mean-squared error
  1                                       0.173   0.186       0.186      0.204
  2                                       0.164   0.177       0.177      0.215
  3                                       0.176   0.191       0.191      0.244
95% confidence interval coverage, %
  1                                       96      94          94         90
  2                                       95      93          93         86
  3                                       96      94          94         84

Abbreviations: DAG, directed acyclic graph full model; DAG-GS CE, directed acyclic graph gold-standard change-in-estimate procedure without consideration of precision; DAG-GS P, directed acyclic graph gold-standard change-in-estimate procedure with consideration of precision; DAG-S CE, directed acyclic graph stepwise change-in-estimate procedure.
a Standard errors, bias, square root of the mean-squared error, and the 95% confidence interval coverage for the natural logarithm of sample odds ratios were obtained by using the 4 model selection methods shown. The results were from 1,000 case-control samples, each consisting of 500 cases and 500 controls.
b Scenarios 1–3 contained confounders of decreasing strengths, respectively.

Performance measures

No misspecification: strength of confounding (scenarios 1–3). Overall, when the DAG correctly specified the underlying causal relations, the DAG full model performed best, and the DAG-stepwise change-in-estimate procedure performed worst (Table 3). This was true regardless of strength of confounding. A comparison of the DAG gold-standard change-in-estimate procedure with versus without precision considerations showed that the results were identical on all performance measures. Bias and 95% confidence interval coverage were the measures that produced the most substantial differences among the 4 procedures. Although the DAG full model consistently produced the least bias, the bias resulting from the DAG-stepwise change-in-estimate procedure was consistently the greatest.
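The four performance measures reported for each procedure (standard error, bias, square root of the mean-squared error, and 95% confidence interval coverage) can be computed from a set of simulated estimates roughly as follows. This is a hedged Python sketch rather than the study's SAS code: the function name `performance_measures`, its argument names, and the toy inputs are assumptions made for illustration, and the use of the n − 1 sample standard deviation and a 1.96 Wald multiplier is likewise an assumption.

```python
import math
import statistics
from typing import Dict, Sequence


def performance_measures(
    ln_or_hat: Sequence[float],   # lnOR estimate from each simulated sample
    se_hat: Sequence[float],      # regression-based SE of each lnOR estimate
    ln_or_true: float,            # population lnOR
    z: float = 1.96,              # Wald multiplier for a 95% interval (assumed)
) -> Dict[str, float]:
    if len(ln_or_hat) != len(se_hat) or len(ln_or_hat) < 2:
        raise ValueError("need two or more estimates with matching SEs")
    # 1) sampling distribution-based SE: SD of the sample lnORs (n - 1 divisor)
    sampling_se = statistics.stdev(ln_or_hat)
    # 2) bias: mean of the sample lnORs minus the population lnOR
    bias = statistics.fmean(ln_or_hat) - ln_or_true
    # 3) root mean-squared error: sqrt(bias^2 + variance)
    rmse = math.sqrt(bias ** 2 + sampling_se ** 2)
    # 4) coverage: percentage of Wald intervals containing the population lnOR
    covered = sum(
        est - z * se <= ln_or_true <= est + z * se
        for est, se in zip(ln_or_hat, se_hat)
    )
    return {
        "sampling_se": sampling_se,
        "mean_regression_se": statistics.fmean(se_hat),
        "bias": bias,
        "rmse": rmse,
        "coverage_pct": 100.0 * covered / len(ln_or_hat),
    }


# Toy inputs (made-up numbers, 4 "iterations" instead of 1,000).
toy = performance_measures(
    ln_or_hat=[0.9, 1.0, 1.1, 1.2],
    se_hat=[0.1, 0.1, 0.1, 0.02],
    ln_or_true=1.0,
)
print(toy)
```

The same identity can be checked against Table 3: for the DAG gold-standard procedure in scenario 1, a standard error of 0.177 and a bias of 0.057 give sqrt(0.177² + 0.057²) ≈ 0.186, which agrees with the tabulated square root of the mean-squared error.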
Ninety-five percent confidence interval coverage was closer to nominal coverage for both the DAG full model and the DAG gold-standard change-in-estimate procedures than for the DAG-stepwise change-in-estimate procedure. The differences among the 4 procedures for these 2 performance measures (bias and 95% confidence interval coverage) were inversely associated with strength of confounding; that is, the differences increased as the strength of confounding decreased. Standard errors generated by the DAG-stepwise change-in-estimate procedure were as much as 9% greater than DAG full model standard errors. Although the differences in standard error were small between the DAG full model and the DAG gold-standard change-in-estimate procedures, they increased with strength of confounding.

Misspecification: omission of confounders (scenarios 4–13). When the DAG omitted confounders, the DAG full model performed as well as or better than the other procedures, and the DAG-stepwise procedure performed worst (Table 4). In scenarios 5, 12, and 13, in which only 2 covariates were included in the initial full model for selection, performance measures were essentially the same across the 4 methods. In all other scenarios (where 3 covariates were included in the initial full model for selection), the DAG full model performed best, with respect to bias, square root of the mean-squared error, and 95% confidence interval coverage, and the DAG-stepwise change-in-estimate procedure performed worst. This was also true in general for the standard errors. The results of the DAG gold-standard change-in-estimate procedure with versus without precision considerations were identical for all 4 performance measures for scenarios 4–13.

Misspecification: inclusion of nonconfounding covariates (scenarios 14–25). When the initial DAG included nonconfounders, the DAG gold-standard change-in-estimate and DAG-stepwise change-in-estimate procedures produced smaller standard error than did the DAG full model in most scenarios regardless of whether the nonconfounder(s) were associated with only outcome or only exposure (Table 5). These procedures also outperformed the DAG full model with respect to bias and square root of the mean-squared error when the DAGs included nonconfounders that were associated with only study outcome. The DAG full model resulted in 95% confidence interval coverage that was closer to nominal coverage than the other procedures. When the DAG was misspecified to include nonconfounders, the results of the DAG gold-standard change-in-estimate procedure with versus without precision considerations were, again, identical.

Regression-based versus sampling distribution-based standard errors

The DAG gold-standard change-in-estimate procedure had a smaller mean of regression-based standard error than did their corresponding DAG full models in 22 of 25 scenarios (Table 6). On the other hand, only in 11 scenarios did the DAG gold-standard change-in-estimate procedure produce smaller sampling distribution-based standard error than the DAG full model. Ten of these scenarios corresponded to when the DAG included nonconfounders.

Table 4. Scenarios Based on Misspecified Directed Acyclic Graphs That Omitted Confounders^a

Measure and scenario (omitted confounders^b)   DAG     DAG-GS CE   DAG-GS P   DAG-S CE
Standard error
  4 (1 strong)                                 0.169   0.173       0.173      0.183
  5 (1 strong)                                 0.172   0.172       0.172      0.172
  6 (1 moderate)                               0.164   0.165       0.165      0.171
  7 (1 moderate)                               0.163   0.164       0.164      0.170
  8 (1 weak)                                   0.171   0.184       0.184      0.189
  9 (1 weak)                                   0.174   0.178       0.178      0.185
  10 (2 moderate)^c                            0.156   0.155       0.155      0.156
  11 (2 weak)^c                                0.171   0.177       0.177      0.181
  12 (2 weak)^c                                0.169   0.169       0.169      0.169
  13 (2 weak)^d                                0.157   0.157       0.157      0.157
Bias
  4 (1 strong)                                 0.214   0.265       0.265      0.292
  5 (1 strong)                                 0.171   0.171       0.171      0.171
  6 (1 moderate)                               0.153   0.195       0.195      0.212
  7 (1 moderate)                               0.149   0.197       0.197      0.207
  8 (1 weak)                                   0.051   0.076       0.076      0.081
  9 (1 weak)                                   0.017   0.084       0.084      0.159
  10 (2 moderate)^c                            0.272   0.307       0.307      0.308
  11 (2 weak)^c                                0.101   0.152       0.152      0.175
  12 (2 weak)^c                                0.082   0.082       0.082      0.082
  13 (2 weak)^d                                0.022   0.022       0.022      0.022
Square root of the mean-squared error
  4 (1 strong)                                 0.272   0.317       0.317      0.345
  5 (1 strong)                                 0.243   0.243       0.243      0.243
  6 (1 moderate)                               0.224   0.256       0.256      0.272
  7 (1 moderate)                               0.221   0.256       0.256      0.267
  8 (1 weak)                                   0.179   0.199       0.199      0.206
  9 (1 weak)                                   0.175   0.196       0.196      0.244
  10 (2 moderate)^c                            0.314   0.344       0.344      0.345
  11 (2 weak)^c                                0.199   0.234       0.234      0.251
  12 (2 weak)^c                                0.188   0.188       0.188      0.188
  13 (2 weak)^d                                0.159   0.159       0.159      0.159
95% confidence interval coverage, %
  4 (1 strong)                                 78      63          63         55
  5 (1 strong)                                 85      85          85         85
  6 (1 moderate)                               85      79          79         74
  7 (1 moderate)                               88      79          79         76
  8 (1 weak)                                   95      92          92         90
  9 (1 weak)                                   96      92          92         84
  10 (2 moderate)^c                            62      51          51         50
  11 (2 weak)^c                                93      86          86         82
  12 (2 weak)^c                                93      93          93         93
  13 (2 weak)^d                                96      96          96         96

Abbreviations: DAG, directed acyclic graph full model; DAG-GS CE, directed acyclic graph gold-standard change-in-estimate procedure without consideration of precision; DAG-GS P, directed acyclic graph gold-standard change-in-estimate procedure with consideration of precision; DAG-S CE, directed acyclic graph stepwise change-in-estimate procedure.
a Standard errors, bias, square root of the mean-squared error, and the 95% confidence interval coverage for the natural logarithm of sample odds ratios were obtained by using the 4 model selection methods shown. The results were from 1,000 case-control samples, each consisting of 500 cases and 500 controls.
b Number and strength of omitted confounders.
c Omitted confounders were in the same direction.
d Omitted confounders were in the opposite direction.

DISCUSSION

This study generated 25 different scenarios to investigate whether covariate selection strategies that combined DAGs and change-in-estimate approaches could improve parameter estimation over the DAG procedure used by itself under different scenarios of correctly specified DAGs with various strengths of confounding and of DAG misspecification. The finding that the DAG full model consistently performed best when DAGs were correctly specified with various strengths of confounding (scenarios 1–3) suggests that further model simplification by using the change-in-estimate procedure, which takes sampling variation into account, did not on average improve parameter estimation. The observations that the DAG full model provided minimal bias and the best 95% confidence interval coverage when DAGs were misspecified with omission of confounders (scenarios 4–13) simply reflect the fact that the variable selection procedures used in this study will not, in general, attenuate the bias resulting from inaccurate or incomplete causal assumptions made at the initial stage. The findings that the DAG gold-standard change-in-estimate procedures performed better with respect to bias than did the DAG-stepwise change-in-estimate procedure in scenarios 1–13 highlight a potential deficiency in the latter.

Table 5. Scenarios Based on Misspecified Directed Acyclic Graphs That Included Nonconfounders^a

Measure and scenario (included nonconfounders^b)   DAG     DAG-GS CE   DAG-GS P   DAG-S CE
Standard error
  14 (1 E only)                                    0.181   0.184       0.184      0.189
  15 (1 E only)                                    0.180   0.173       0.173      0.173
  16 (3 E only)                                    0.190   0.187       0.187      0.184
  17 (1 E only)                                    0.181   0.173       0.173      0.172
  18 (1 O only)                                    0.181   0.183       0.183      0.192
  19 (1 O only)                                    0.178   0.173       0.173      0.173
  20 (3 O only)                                    0.192   0.184       0.184      0.185
  21 (3 O only)                                    0.197   0.184       0.184      0.181
  22 (4 O only)                                    0.201   0.186       0.186      0.183
  23 (1 E, 1 O only)                               0.188   0.190       0.190      0.193
  24 (3 E, 3 O only)                               0.201   0.192       0.192      0.188
  25 (3 E, 4 O only)                               0.223   0.204       0.204      0.193
Bias
  14 (1 E only)                                    0.010   0.060       0.060      0.077
  15 (1 E only)                                    0       0.005       0.005      0.005
  16 (3 E only)                                    0.059   0.087       0.087      0.083
  17 (1 E only)                                    0.013   0.007       0.007      0.001
  18 (1 O only)                                    0.028   0.065       0.065      0.080
  19 (1 O only)                                    0.032   0.021       0.021      0.021
  20 (3 O only)                                    0.095   0.088       0.088      0.091
  21 (3 O only)                                    0.050   0.029       0.029      0.026
  22 (4 O only)                                    0.065   0.037       0.037      0.028
  23 (1 E, 1 O only)                               0.030   0.069       0.069      0.079
  24 (3 E, 3 O only)                               0.102   0.095       0.095      0.092
  25 (3 E, 4 O only)                               0.069   0.041       0.041      0.027
Square root of the mean-squared error
  14 (1 E only)                                    0.181   0.193       0.193      0.204
  15 (1 E only)                                    0.180   0.173       0.173      0.173
  16 (3 E only)                                    0.199   0.206       0.206      0.202
  17 (1 E only)                                    0.182   0.173       0.173      0.172
  18 (1 O only)                                    0.183   0.194       0.194      0.208
  19 (1 O only)                                    0.181   0.174       0.174      0.174
  20 (3 O only)                                    0.214   0.204       0.204      0.206
  21 (3 O only)                                    0.203   0.186       0.186      0.183
  22 (4 O only)                                    0.211   0.190       0.190      0.185
  23 (1 E, 1 O only)                               0.191   0.202       0.202      0.208
  24 (3 E, 3 O only)                               0.226   0.214       0.214      0.210
  25 (3 E, 4 O only)                               0.234   0.208       0.208      0.195
95% confidence interval coverage, %
  14 (1 E only)                                    95      92          92         90
  15 (1 E only)                                    96      95          95         95
  16 (3 E only)                                    94      92          92         93
  17 (1 E only)                                    95      94          94         94
  18 (1 O only)                                    95      92          92         90
  19 (1 O only)                                    94      94          94         94
  20 (3 O only)                                    92      92          92         92
  21 (3 O only)                                    94      93          93         94
  22 (4 O only)                                    95      93          93         94
  23 (1 E, 1 O only)                               94      90          90         90
  24 (3 E, 3 O only)                               94      92          92         92
  25 (3 E, 4 O only)                               95      90          90         92
Abbreviations: DAG, directed acyclic graph full model; DAG-GS CE, directed acyclic graph gold-standard change-in-estimate procedure without consideration of precision; DAG-GS P, directed acyclic graph gold-standard change-in-estimate procedure with consideration of precision; DAG-S CE, directed acyclic graph stepwise change-in-estimate procedure; E, study exposure; O, study outcome.
a Standard errors, bias, square root of the mean-squared error, and the 95% confidence interval coverage for the natural logarithm of sample odds ratios were obtained by using the 4 model selection methods shown. The results were from 1,000 case-control samples, each consisting of 500 cases and 500 controls.
b Number of included nonconfounders and their relations with study exposure and outcome. For example, scenario 25 has included 7 nonconfounders, 3 of which were associated with exposure only and 4 with outcome only.

Table 6. Sampling Distribution-based Standard Error and the Mean of Regression-based Standard Error for the Natural Logarithm-transformed Odds Ratios Obtained by Using the Directed Acyclic Graph Full Model and the Directed Acyclic Graph Gold-Standard Change-in-Estimate Procedure^a

            Sampling distribution-based SE   Regression-based SE     % of DAG-GS CE with smaller
Scenario    DAG       DAG-GS CE              DAG       DAG-GS CE     regression-based SE
1           0.173     0.177                  0.176     0.173
2           0.164     0.167                  0.165     0.162         98.3
3           0.176     0.177                  0.183     0.179
4           0.169     0.173                  0.170     0.166         99.7
5           0.172     0.172                  0.176     0.176         19.1
6           0.171     0.184                  0.174     0.171         39.5
7           0.164     0.165                  0.169     0.166         99.5
8           0.163     0.164                  0.167     0.164         99.4
9           0.174     0.178                  0.182     0.178         99.8
10          0.156     0.155                  0.160     0.159         99.9
11          0.171     0.177                  0.177     0.176         99.8
12          0.169     0.169                  0.175     0.175         25.9
13          0.157     0.157                  0.167     0.167         24.7
14          0.181     0.184                  0.181     0.173         99.9
15          0.180     0.173                  0.178     0.171         95.8
16          0.190     0.187                  0.190     0.178         100
17          0.181     0.173                  0.182     0.166         100
18          0.181     0.183                  0.181     0.172         100
19          0.178     0.173                  0.178     0.169
20          0.192     0.184                  0.192     0.179         100
21          0.197     0.184                  0.191     0.170         100
22          0.201     0.186                  0.197     0.171         100
23          0.188     0.190                  0.186     0.172         100
24          0.201     0.192                  0.205     0.180         100
25          0.223     0.204                  0.220     0.177         100
99.7 100 93.6

Abbreviations: DAG, directed acyclic graph full model; DAG-GS CE, directed acyclic graph gold-standard change-in-estimate procedure; SE, standard error.
a The percentage of models simplified by using the DAG-GS CE procedure that resulted in a smaller regression-based standard error than the corresponding directed acyclic graph full models was also presented. The results were from 1,000 case-control samples, each consisting of 500 cases and 500 controls.

If a 0.1 change in parameter estimate is used as the criterion, with regard to bias, the DAG gold-standard change-in-estimate and the DAG-stepwise change-in-estimate procedures estimate ORs that conform to $(0.9)\,\mathrm{OR}_{\mathrm{DAG}} < \mathrm{OR} < (1.1)\,\mathrm{OR}_{\mathrm{DAG}}$ and $(0.9)^{j}\,\mathrm{OR}_{\mathrm{DAG}} < \mathrm{OR} < (1.1)^{j}\,\mathrm{OR}_{\mathrm{DAG}}$, respectively (where $\mathrm{OR}_{\mathrm{DAG}}$ and $j$ are the estimated OR and the number of confounders included in the initial DAG full model, respectively). Thus, the DAG-stepwise change-in-estimate procedure results in a wider range of estimated ORs than does the DAG gold-standard change-in-estimate approach and, on average, is expected to produce more biased estimates when a sizable number of covariates are being considered for elimination and when the bias produced by the DAG full model cannot be corrected by model reduction, such as when the DAG is incorrectly specified to omit confounders.
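To give a sense of how quickly the stepwise range can widen, consider a purely illustrative case (the value j = 5 is an assumption, not a scenario from the study) in which each of 5 successive removals shifts the OR by just under the 10% criterion in the same direction. The gold-standard rule keeps the final OR within a factor of 0.9 to 1.1 of the full-model OR, whereas the stepwise rule only guarantees

\[
(0.9)^{5}\,\mathrm{OR}_{\mathrm{DAG}} \approx 0.590\,\mathrm{OR}_{\mathrm{DAG}}
\;<\; \mathrm{OR} \;<\;
(1.1)^{5}\,\mathrm{OR}_{\mathrm{DAG}} \approx 1.611\,\mathrm{OR}_{\mathrm{DAG}} .
\]

These are worst-case bounds; in any particular sample the per-step changes may partly cancel, so the realized drift can be much smaller.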
From a statistical point of view, model reduction should resolve the tradeoff of a larger bias for a smaller variance (13, 20–22). We found, however, that model reduction with logistic regression resulted in a larger bias but not necessarily a smaller standard error in most of the scenarios when the underlying causal assumptions were correctly specified or when the DAG was misspecified by the omission of confounders. Our study demonstrated that use of logistic regression-based standard error in covariate selection is problematic. Although the DAG gold-standard change-in-estimate procedure always produced an equal or smaller mean of regression-based standard error than did the corresponding DAG full model, it had a smaller sampling distribution-based standard error than did the DAG full model only in 1 of these 13 scenarios. Earlier work by Robinson and Jewell (21) and Robinson et al. (22) explains the identical results of the DAG gold-standard change-in-estimate with versus without precision considerations observed in our study. The inconsistency between the regression-based and the sampling distribution-based standard errors is in agreement with findings from a previous study (16) and likely reflects inflated precision resulting from ignoring covariate selection-related uncertainty in regression modeling (16, 23). Use of regression-based standard errors in covariate selection optimizes the regression-estimated precision but not necessarily the true precision, which is estimated by the underlying sampling distribution.

When the DAG was misspecified with inclusion of nonconfounders (scenarios 14–25), the DAG gold-standard change-in-estimate and the DAG-stepwise change-in-estimate procedures produced smaller standard errors than did the DAG full model in most scenarios regardless of whether the nonconfounders that were included were associated with only study exposure or only outcome.
That the largest differences in the mean regression-based standard errors between the DAG gold-standard change-in-estimate procedure and the DAG full model were consistently observed in these scenarios is noteworthy and might serve as an indication for misspecification of DAGs with inclusion of nonconfounders. However, the reduction in the 95% confidence interval coverage resulting from the DAG gold-standard change-in-estimate procedures in many of these scenarios signaled the potential downward bias of the regression-based standard error. The DAG full model produced 95% confidence interval coverage that was closer to nominal coverage more consistently than did the other procedures in these scenarios. The paradox that the DAG full models produced largest bias in most of the scenarios when the DAG included nonconfounders that were associated with only study outcome but not when they were associated with only study exposure could be explained by the noncollapsibility of OR (22, 24–27). After adjustment for these covariates that were associated with only study outcome, the DAG full model produced the smallest bias in 10 of these 12 scenarios (data not shown).

We caution the readers that the scenarios examined in our study are restricted to only a small region of the parameter space, and the conclusions made from this study are limited to the population structures generated and the number of covariates investigated. For instance, the underlying structure that we investigated involved confounders that were each independent of the others, that is, not on the same backdoor paths of the DAGs. Redundant confounders such as those lying on the same backdoor paths as other confounders were not investigated. We further acknowledge that there are different covariate selection procedures commonly used in epidemiologic studies, such as hierarchical backwards elimination (13, 28). The results from this study may not be generalizable or directly comparable with these procedures.
Nevertheless, although we used case-control sampling, the results from this study should be applicable to a cohort study that uses risk or rate ratios as effect measures (29, 30) as we ensured that the outcomes were rare (i.e., incidence proportion <10%) in most joint strata of covariates in the source populations. On the other hand, our study encountered problems specific to using OR in confounder identification (12, 24–26). Additionally, although all logistic regression models converged, this study did not investigate constraints on model convergence. Although we expect that the results would be applicable to other commonly used models such as Poisson and Cox proportional hazards models, the methods might not be feasible for others, such as log binomial models, as problems of model convergence using these models have been previously reported (31–33).

In conclusion, potential, but not conclusive, benefits of performing further covariate selection using change-in-estimate procedures were observed only when the DAGs were misspecified by the inclusion of nonconfounders. We conclude, therefore, that the primary task for the researcher/analyst is to ensure that proper causal assumptions are made, in particular, that no strong confounders are excluded from data collection or analysis. Given that the investigator is never certain about the accuracy of prior causal assumptions, the recommended strategy is to construct a “conservative” DAG, including all known confounders and potential confounders even at the risk of including nonconfounders (given that they are neither colliders nor downstream effects of the exposure or the outcome), and use this full model in data analysis. An alternative is to construct a series of DAGs, each having plausibility based on prior knowledge, with various degrees of “conservativeness” regarding potential but not established confounders.
The DAG full model for each can then be reported with complete transparency about the assumed underlying model. The analysts then could perform covariate selection using the DAG gold-standard change-in-estimate procedure. A large reduction in regression-based standard error in the simplified model might be an indication of misspecification of the underlying causal assumption and, specifically, with inclusion of nonconfounders. However, even under this type of misspecification, bias may be increased, and the 95% confidence interval coverage may deviate from nominal coverage through covariate selection. A final caveat is that results related to bias are applicable on the basis of the average of 1,000 iterations, but in any given single study, the actual performance of model reduction is unknown and may, in some circumstances, produce a less biased estimate of effect.

ACKNOWLEDGMENTS

Author affiliations: Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, Illinois (Hsin-Yi Weng); Department of Biostatistics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana (Ya-Hui Hsueh); Department of Public Health and Preventive Medicine, School of Medicine and School of Veterinary Medicine, St. George’s University, Grenada, West Indies (Locksley L. McV. Messam); and Department of Public Health Sciences, School of Medicine, University of California at Davis, Davis, California (Irva Hertz-Picciotto).

The authors thank Lora D. Delwiche of the Public Health Sciences, University of California at Davis, for assisting with modifying the SAS macros for covariate selection procedures.

Conflict of interest: none declared.

REFERENCES

1. Glymour MM, Weuve J, Berkman LF, et al. When is baseline adjustment useful in analyses of change? An example with education and cognitive change. Am J Epidemiol. 2005;162(3):267–278.
2. Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health. 1989;79(3):340–349.
3. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.
4. Hernán MA, Hernández-Díaz S, Werler MM, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176–184.
5. Maldonado G, Greenland S. Simulation study of confounder-selection strategies. Am J Epidemiol. 1993;138(11):923–936.
6. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82(4):669–688.
7. Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology. 2001;12(3):313–320.
8. Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven; 1998.
9. Nelson MC, Gordon-Larsen P, Adair LS. Are adolescents who were breast-fed less likely to be overweight? Analyses of sibling pairs to reduce confounding. Epidemiology. 2005;16(2):247–253.
10. Weng HY, Kass PH, Hart LA, et al. Risk factors for unsuccessful dog ownership: an epidemiologic study in Taiwan. Prev Vet Med. 2006;77(1-2):82–95.
11. Messam LL, Kass PH, Chomel BB, et al. The human-canine environment: a risk factor for non-play bites? Vet J. 2008;177(2):205–215.
12. Miettinen OS, Cook EF. Confounding—essence and detection. Am J Epidemiol. 1981;114(4):593–603.
13. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research: Principles and Quantitative Methods. Belmont, CA: Lifetime Learning Publications; 1982.
14. Pearl J. Causality: Models, Reasoning, and Inference. Cambridge, United Kingdom: Cambridge University Press; 2000.
15. Mickey RM, Greenland S. The impact of confounder selection criteria on effect estimation. Am J Epidemiol. 1989;129(1):125–137.
16. Budtz-Jørgensen E, Keiding N, Grandjean P, et al. Confounder selection in environmental epidemiology: assessment of health effects of prenatal mercury exposure. Ann Epidemiol. 2007;17(1):27–35.
17. Sonis J, Hertz-Picciotto I. Accessing the presence of confounding. Fam Med. 1996;28(7):462–463.
18. SAS Institute, Inc. SAS Software for Windows. Version 9.1. Cary, NC: SAS Institute, Inc; 1999.
19. Hegewald J, Pfahlberg A, Uter W. A backwards-manual selection macro for binary logistic regression in SAS v. 8.02 PROC LOGISTIC procedure. Presented at the 16th Annual North East SAS Users Group Conference (NESUG 2003), Washington, DC, September 2002.
20. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol. 1986;123(3):392–402.
21. Robinson LD, Jewell NP. Some surprising results about covariate adjustment in logistic regression models. Int Stat Rev. 1991;59(2):227–240.
22. Robinson LD, Dorroh JR, Lien D, et al. The effects of covariate adjustment in generalized linear models. Commun Stat Theory Meth. 1998;27(7):1653–1675.
23. Chatfield C. Model uncertainty, data mining and statistical inference. J R Stat Soc A. 1995;158(part 3):419–466.
24. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15(3):413–419.
25. Gail MH, Wieand S, Piantadosi S. Biased estimates of treatment effect in randomized experiments with non-linear regression and omitted covariates. Biometrika. 1984;71:431–444.
26. Hauck WW, Neuhaus JM, Kalbfleisch JD, et al. A consequence of omitted covariates when estimating odds ratios. J Clin Epidemiol. 1991;44(1):77–81.
27. Negassa A, Hanley JA. The effect of omitted covariates on confidence interval and study power in binary outcome analysis: a simulation study. Contemp Clin Trials. 2007;28(3):242–248.
28. Kleinbaum D, Klein M. Logistic Regression: A Self-Learning Text. 2nd ed. New York, NY: Springer; 2002.
29. Greenland S, Thomas DC. On the need for the rare disease assumption in case-control studies. Am J Epidemiol. 1982;116(3):547–553.
30. Greenland S. Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol. 1987;125(5):761–768.
31. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol. 1986;123(1):174–184.
32. McNutt LA, Wu C, Xue X, et al. Estimating the relative risk in cohort studies and clinical trials of common outcomes. Am J Epidemiol. 2003;157(10):940–943.
33. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702–706.

Am J Epidemiol 2009;169:1182–1190