Type of Paper: Review Article
Title: A Primer on Marginal Effects – Part I: Theory and Formulae
Short title: Primer on Marginal Effects
Authors: Onukwugha E1, Bergtold J2, Jain R3
1 220 Arch Street, Department of Pharmaceutical Health Services Research, University of Maryland School of Pharmacy, Baltimore, MD, USA.
2 304G Waters Hall, Department of Agricultural Economics, Kansas State University, Manhattan, KS, 66506-4011.
3 HealthCore, Inc., 800 Delaware Avenue, 5th Floor, Wilmington, DE 19801, USA.
Corresponding author: Eberechukwu Onukwugha, 220 Arch Street, 12th Floor, Baltimore, MD 21201, (410) 706-8981, [email protected]

TECHNICAL APPENDIX. Marginal effect formulas for the linear, logit, multinomial logit, generalized linear model with log link, Poisson, negative binomial, two-part, sample selection, and survival models.

1. Linear Regression Model

A typical linear regression equation with two independent variables takes the form:

E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2.

We examine each of the cases presented above for the linear regression model, showing how marginal and interaction effects change under different conditions. For each case, assume x1 and x2 are continuous unless indicated otherwise.

Case 1: Linearity in the covariates.

Let E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2. Then:

ME2: ∂f(·)/∂x2 = β2.

In a linear regression with only linear transformations of variables, the marginal effect is constant.

Case 2: Inclusion of nonlinear transformations of the covariates.

Now consider the case where E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2 + β3 x2², then:

ME2: ∂f(·)/∂x2 = β2 + 2 β3 x2.

The marginal effect of x2 in this case is a function of x2. If β3 ≥ 0 (≤ 0), then the marginal effect of x2 on y is increasing (decreasing) in x2.

Case 3: Inclusion of an interaction term between covariates.
If E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2 + β3 (x1 · x2), then:

ME2: ∂f(·)/∂x2 = β2 + β3 x1.

Now, the marginal effect of x2 depends on the value of x1.

Case 4: Linearity in the covariates with a discrete covariate.

Assume that E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2, but now let x1 be binary. Then:

ME1: f(x1, x2)|x1=1 − f(x1, x2)|x1=0 = β1.

Note that in a linear regression with only linear transformations of variables, the marginal effect is still constant.

Case 5: Inclusion of an interaction term with a discrete covariate.

Let E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2 + β3 (x1 · x2), then:

ME1: f(x1, x2)|x1=1 − f(x1, x2)|x1=0 = β1 + β3 x2.

The marginal effect of x1 in this case depends on the value of x2.

Case 6: Interaction effects when both covariates are continuous.

Let E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2 + β3 (x1 · x2). The interaction effect of the marginal effect for x2 given a change in x1 is:

ME21: ∂[ME2]/∂x1 = β3.

Case 7: Interaction between a continuous and discrete covariate.

Let E(y|x1, x2) = f(x1, x2) = β0 + β1 x1 + β2 x2 + β3 (x1 · x2), where x1 is binary. Then:

ME21: ∂f(·)/∂x2|x1=1 − ∂f(·)/∂x2|x1=0 = {β2 + β3 · 1} − β2 = β3, and

ME12: ∂[ME1]/∂x2 = β3.

A special case of the linear regression model is the log-linear model, i.e. the regression model ln(y) = f(x1, x2) + u. For this regression model, if the dependent variable of interest is ln(y), then the marginal effects derived above apply; that is, the marginal effect of a change in xk on ln(y) is the statistic of interest. If instead the applied modeler is interested in the marginal effect of xk on y, then the marginal effect formulas will differ. First, let:

ME̅k = ∂f(x; β)/∂xk = ∂ln(y)/∂xk.

To get the marginal effect of xk on y, one will need to transform the above marginal effect.
This is done by incorporating ∂y in the following way:

∂ln(y)/∂xk = (∂ln(y)/∂y) × (∂y/∂xk) = (1/y) × (∂y/∂xk),

implying that ∂y/∂xk = y · ∂ln(y)/∂xk. Then:

MEk = ME̅k × y.

Thus, if a log-linear model is estimated, then the marginal effects are those derived above times the value of the dependent variable. This shows that marginal effects can include the dependent variable as well. The above marginal effect derivations for the log-linear model assume a homoskedastic retransformation and no need for a "smearing estimator". In the case of heteroskedasticity, the ME functional will include an extra term associated with the derivative of the error term with respect to the independent variable.

2. Logistic Regression Model (or Logit Model)

Let the predictor (or index) function of the model be given by η(x1, x2; β). Then the logit model takes the form:

E(y = 1|x1, x2) = Λ(η(x1, x2; β)) = 1 / (1 + exp(−η(x1, x2; β))),

where Λ(·) is the cumulative distribution function of the standard logistic distribution. It is important to note that E(y = 1|x1, x2) is nonlinear in the β's. Therefore, unlike the linear regression model, the magnitudes of the beta coefficients are not the marginal effects of the independent variables. For the marginal effect derivations below, the following formula will be of use: if Λ(z) = 1/(1 + exp(−z)), then

dΛ(·)/dz = exp(−z)/(1 + exp(−z))² = Λ(z)(1 − Λ(z)).

For each case, assume x1 and x2 are continuous unless indicated otherwise. In addition, for ease of notation, we may represent η(x1, x2; β) simply as η.

Case 1: Linear in the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2, then:

ME2: ∂Λ(·)/∂x2 = (dΛ(·)/dη) · (∂η/∂x2) = Λ(x1, x2)(1 − Λ(x1, x2)) β2.

For the logit model, (i) all independent variables are involved in the calculation of the marginal effect and (ii) the marginal effect depends on the initial value of all the independent variables.
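The logit marginal effect above can be checked numerically. The following minimal sketch uses invented coefficient values (chosen only for illustration) and compares the analytic formula Λ(η)(1 − Λ(η))β2 against a finite-difference derivative:

```python
import math

def logit_cdf(z):
    """Standard logistic CDF, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def me2_logit(b0, b1, b2, x1, x2):
    """Analytic ME of x2 for a logit model linear in the covariates:
    Lambda(eta) * (1 - Lambda(eta)) * beta2."""
    eta = b0 + b1 * x1 + b2 * x2
    p = logit_cdf(eta)
    return p * (1.0 - p) * b2

# Hypothetical coefficients and evaluation point (illustration only).
b0, b1, b2 = -0.5, 0.8, 1.2
x1, x2 = 1.0, 0.3

analytic = me2_logit(b0, b1, b2, x1, x2)

# Central finite-difference derivative of Lambda(eta) with respect to x2.
h = 1e-6
numeric = (logit_cdf(b0 + b1 * x1 + b2 * (x2 + h))
           - logit_cdf(b0 + b1 * x1 + b2 * (x2 - h))) / (2 * h)

print(analytic, numeric)
```

Because Λ(η)(1 − Λ(η)) is at most 1/4, the marginal effect can never exceed β2/4 in absolute value, which is a useful sanity check on any computed value.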
Case 2: Inclusion of nonlinear transformations of the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 x2², then:

ME2: ∂Λ(·)/∂x2 = (dΛ(·)/dη) · (∂η/∂x2) = Λ(x1, x2)(1 − Λ(x1, x2))(β2 + 2 β3 x2).

As in the case of the linear regression model, we assume that the nonlinear transformation of x2 is a square function. In general, it can be any transformation. For example, if the nonlinear term in the predictor were β3 ln(x2) instead of β3 x2², then:

ME2: ∂Λ(·)/∂x2 = (dΛ(·)/dη) · (∂η/∂x2) = Λ(x1, x2)(1 − Λ(x1, x2))(β2 + β3/x2).

Case 3: Inclusion of an interaction term between covariates.

Consider an interaction term between x1 and x2 such that η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1). Then:

ME2: ∂Λ(·)/∂x2 = (dΛ(·)/dη) · (∂η/∂x2) = Λ(x1, x2)(1 − Λ(x1, x2))(β2 + β3 x1).

Case 4: Linearity in the covariates with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 and assume that x1 is binary. Then:

ME1: Λ(x1, x2)|x1=1 − Λ(x1, x2)|x1=0 = 1/(1 + exp(−(β0 + β1 + β2 x2))) − 1/(1 + exp(−(β0 + β2 x2))).

Case 5: Inclusion of an interaction term with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1) and assume that x1 is binary. Then:

ME1: Λ(x1, x2)|x1=1 − Λ(x1, x2)|x1=0 = 1/(1 + exp(−(β0 + β1 + β2 x2 + β3 x2))) − 1/(1 + exp(−(β0 + β2 x2))).

The remainder of this section (Cases 6, 7, 8) is based on the work by Ai and Norton (2003). For this section, consider the predictor: η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1).

Case 6: Interaction effects with continuous covariates.

ME21: ∂[ME2]/∂x1 = (1 − 2Λ(x1, x2)) Λ(x1, x2)(1 − Λ(x1, x2)) (∂η/∂x1)(∂η/∂x2) + Λ(x1, x2)(1 − Λ(x1, x2)) (∂²η/∂x1∂x2)
= [ME1 (1 − 2Λ(x1, x2))](β2 + β3 x1) + Λ(x1, x2)(1 − Λ(x1, x2)) β3.

Case 7: Interaction effects with a continuous and discrete covariate.
Let x1 be binary again, making η1 = β0 + β1 + β2 x2 + β3 x2 and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = Λ(η1)(1 − Λ(η1))(β2 + β3) − Λ(η0)(1 − Λ(η0)) β2.

It should be emphasized that ME21 = ME12, as well.

Case 8: Interaction effects for a predictor linear in the coefficients and covariates.

An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: η(x1, x2; β) = β0 + β1 x1 + β2 x2. Now let η1 = β0 + β1 + β2 x2 and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = Λ(η1)(1 − Λ(η1)) β2 − Λ(η0)(1 − Λ(η0)) β2.

3. Multinomial Logistic Regression Model

There exists a separate set of marginal effects for each outcome in the multinomial logistic regression model. These will be designated MEi,j for the marginal effect of variable i for outcome j and MEik,j for the interaction marginal effect of variable i and variable k for outcome j. For each case, assume x1 and x2 are continuous unless indicated otherwise. In addition, for ease of notation, we may represent ηj(x1, x2; β) simply as ηj. In addition, we will assume that:

E(y = j|x1, x2) = exp(ηj) / (1 + Σ_{k=1}^{J} exp(ηk)) = hj(x1, x2).

Case 1: Linearity in the covariates.

Let ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 for j = 1,…,J. Then:

ME2,j: ∂hj(·)/∂x2 = hj(x1, x2)[β2,j − Σ_{k=0}^{J} hk(x1, x2) β2,k],

where the base category's coefficients (k = 0) are normalized to zero. As with the logit model, for the multinomial logit model (i) all independent variables are involved in the calculation of the marginal effect and (ii) the marginal effect depends on the initial value of the independent variables.

Case 2: Inclusion of nonlinear transformations of the covariates.

Consider the predictor ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 + β3,j x2² for j = 1,…,J. Then:

ME2,j: ∂hj(·)/∂x2 = hj(x1, x2)[(β2,j + 2 β3,j x2) − Σ_{k=0}^{J} hk(x1, x2)(β2,k + 2 β3,k x2)].

Case 3: Inclusion of interaction terms between the covariates.
Let ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 + β3,j (x2 · x1) for j = 1,…,J. Then:

ME2,j: ∂hj(·)/∂x2 = hj(x1, x2)[(β2,j + β3,j x1) − Σ_{k=0}^{J} hk(x1, x2)(β2,k + β3,k x1)].

Case 4: Linearity in the covariates with a discrete covariate.

Let x1 be binary and ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 for j = 1,…,J. Now let ηj,1 = β0,j + β1,j + β2,j x2 and ηj,0 = β0,j + β2,j x2 for j = 1,…,J. Then:

ME1,j: hj(x1, x2)|x1=1 − hj(x1, x2)|x1=0 = exp(ηj,1)/(1 + Σ_{k=1}^{J} exp(ηk,1)) − exp(ηj,0)/(1 + Σ_{k=1}^{J} exp(ηk,0)).

Case 5: Inclusion of an interaction term with a discrete covariate.

Let x1 be binary and ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 + β3,j (x2 · x1) for j = 1,…,J. Now let ηj,1 = β0,j + β1,j + β2,j x2 + β3,j x2 and ηj,0 = β0,j + β2,j x2 for j = 1,…,J. Then:

ME1,j: hj(x1, x2)|x1=1 − hj(x1, x2)|x1=0 = exp(ηj,1)/(1 + Σ_{k=1}^{J} exp(ηk,1)) − exp(ηj,0)/(1 + Σ_{k=1}^{J} exp(ηk,0)).

For the next two cases, consider the predictor: ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 + β3,j (x2 · x1) for j = 1,…,J. The derivations here are based on work by Bergtold and Onukwugha[1].

Case 6: Interaction effects when the covariates are both continuous.

ME21,j: ∂[ME2,j]/∂x1 = hj(x1, x2)[β3,j − Σ_{k=1}^{J} (ME1,k (β2,k + β3,k x1) + hk(x1, x2) β3,k)] + (ME2,j)(ME1,j)/hj(x1, x2).

Case 7: Interaction effects with a continuous and discrete covariate.

Let ηj,1 = β0,j + β1,j + β2,j x2 + β3,j x2 and ηj,0 = β0,j + β2,j x2 for j = 1,…,J. Then:

ME21,j: [ME2,j]|x1=1 − [ME2,j]|x1=0 = hj(ηj,1)[(β2,j + β3,j) − Σ_{k=0}^{J} hk(ηk,1)(β2,k + β3,k)] − hj(ηj,0)[β2,j − Σ_{k=0}^{J} hk(ηk,0) β2,k].

Case 8: Interaction effects for a predictor linear in the coefficients and covariates.

An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: ηj(x1, x2; β) = β0,j + β1,j x1 + β2,j x2 for j = 1,…,J.
Let ηj,1 = β0,j + β1,j + β2,j x2 and ηj,0 = β0,j + β2,j x2 for j = 1,…,J. Then:

ME21,j: [ME2,j]|x1=1 − [ME2,j]|x1=0 = hj(ηj,1)[β2,j − Σ_{k=0}^{J} hk(ηk,1) β2,k] − hj(ηj,0)[β2,j − Σ_{k=0}^{J} hk(ηk,0) β2,k].

4. Generalized Linear Model (GLM) with Log Link Function

The conditional mean for the GLM model with log link takes the form:

E(y|x1, x2) = exp(η(x1, x2; β)),

where η(x1, x2; β) is the predictor function. For each case, assume x1 and x2 are continuous unless indicated otherwise. In addition, for ease of notation, we represent η(x1, x2; β) simply as η.

Case 1: Linear in the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2, then:

ME2: ∂exp(η)/∂x2 = exp(η) β2.

Case 2: Inclusion of nonlinear transformations of the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 x2², then:

ME2: ∂exp(η)/∂x2 = exp(η)(β2 + 2 β3 x2).

Case 3: Inclusion of an interaction term between covariates.

Consider an interaction term between x1 and x2 such that η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1). Then:

ME2: ∂exp(η)/∂x2 = exp(η)(β2 + β3 x1).

Case 4: Linearity in the covariates with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 and assume that x1 is binary. Then:

ME1: exp(η)|x1=1 − exp(η)|x1=0 = exp(β0 + β1 + β2 x2) − exp(β0 + β2 x2).

Case 5: Inclusion of an interaction term with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1) and assume that x1 is binary. Then:

ME1: exp(η)|x1=1 − exp(η)|x1=0 = exp(β0 + β1 + β2 x2 + β3 x2) − exp(β0 + β2 x2).

For the next two cases (Cases 6 and 7), let the predictor be given by: η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1).

Case 6: Interaction effects with continuous covariates.

ME21: ∂[ME2]/∂x1 = exp(η)(β1 + β3 x2)(β2 + β3 x1) + exp(η) β3.

Case 7: Interaction effects with a continuous and discrete covariate.
Let x1 be binary again, making η1 = β0 + β1 + β2 x2 + β3 x2 and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = exp(η1)(β2 + β3) − exp(η0) β2.

It should be emphasized that ME21 = ME12 in this case, as well.

Case 8: Interaction effects for a predictor linear in the coefficients and covariates.

An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: η(x1, x2; β) = β0 + β1 x1 + β2 x2. Again let x1 be binary, η1 = β0 + β1 + β2 x2, and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = exp(η1) β2 − exp(η0) β2.

5. Count Models

The conditional mean function for both Poisson and negative binomial models (as well as a number of other count data models) is:

E(y|x1, x2) = exp(η(x1, x2; β)).

For each case, assume x1 and x2 are continuous unless indicated otherwise. In addition, for ease of notation, we represent η(x1, x2; β) simply as η. The marginal effects for the count models presented below are similar in derivation to those for the GLM model with log link function presented earlier, but it should be emphasized that the parameter estimates will not be the same in both models, as the underlying distributions are different.

Case 1: Linear in the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2, then:

ME2: ∂exp(η)/∂x2 = exp(η) β2.

Case 2: Inclusion of nonlinear transformations of the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 x2², then:

ME2: ∂exp(η)/∂x2 = exp(η)(β2 + 2 β3 x2).

Case 3: Inclusion of an interaction term between covariates.

Consider an interaction term between x1 and x2 such that η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1). Then:

ME2: ∂exp(η)/∂x2 = exp(η)(β2 + β3 x1).

Case 4: Linearity in the covariates with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 and assume that x1 is binary.
Then:

ME1: exp(η)|x1=1 − exp(η)|x1=0 = exp(β0 + β1 + β2 x2) − exp(β0 + β2 x2).

Case 5: Inclusion of an interaction term with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1) and assume that x1 is binary. Then:

ME1: exp(η)|x1=1 − exp(η)|x1=0 = exp(β0 + β1 + β2 x2 + β3 x2) − exp(β0 + β2 x2).

For the next two cases (Cases 6 and 7), let the predictor be given by: η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1).

Case 6: Interaction effects with continuous covariates.

ME21: ∂[ME2]/∂x1 = exp(η)(β1 + β3 x2)(β2 + β3 x1) + exp(η) β3.

Case 7: Interaction effects with a continuous and discrete covariate.

Let x1 be binary again, making η1 = β0 + β1 + β2 x2 + β3 x2 and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = exp(η1)(β2 + β3) − exp(η0) β2.

It should be emphasized that ME21 = ME12 in this case, as well.

Case 8: Interaction effects for a predictor linear in the coefficients and covariates.

An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: η(x1, x2; β) = β0 + β1 x1 + β2 x2. Again let x1 be binary, η1 = β0 + β1 + β2 x2, and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = exp(η1) β2 − exp(η0) β2.

6. Survival Models

To incorporate conditioning factors, a common approach is the use of proportional hazards models. The proportional hazards model takes the form:

λ(t; x) = ψ(x) λ0(t),

where λ0(t) is the baseline hazard. A common parameterization is to let ψ(x) = exp(f(x1, x2; β)), where f(x1, x2; β) is a predictor function. The ME of interest is the marginal change in x on the conditional hazard function. That is:

∂λ(t; x)/∂x = exp(f(x1, x2; β)) · (∂f(x1, x2; β)/∂x) · λ0(t).
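The proportional-hazards marginal effect above can be illustrated numerically. The sketch below assumes a hypothetical Weibull baseline hazard λ0(t) = α t^(α−1) and invented coefficient values, and checks the analytic ME of x2 on the hazard against a finite-difference derivative:

```python
import math

def baseline_hazard_weibull(t, alpha=1.5):
    """Hypothetical Weibull baseline hazard: lambda0(t) = alpha * t**(alpha - 1)."""
    return alpha * t ** (alpha - 1.0)

def hazard(t, x1, x2, b0=-1.0, b1=0.4, b2=0.7):
    """Proportional hazards: lambda(t; x) = exp(f(x1, x2)) * lambda0(t),
    with a linear predictor f = b0 + b1*x1 + b2*x2 (values invented)."""
    return math.exp(b0 + b1 * x1 + b2 * x2) * baseline_hazard_weibull(t)

def me2_hazard(t, x1, x2, b0=-1.0, b1=0.4, b2=0.7):
    """Analytic ME of x2 on the conditional hazard: exp(f) * b2 * lambda0(t)."""
    return math.exp(b0 + b1 * x1 + b2 * x2) * b2 * baseline_hazard_weibull(t)

t, x1, x2 = 2.0, 1.0, 0.5
analytic = me2_hazard(t, x1, x2)

# Central finite-difference derivative of the hazard with respect to x2.
h = 1e-6
numeric = (hazard(t, x1, x2 + h) - hazard(t, x1, x2 - h)) / (2 * h)
print(analytic, numeric)
```

Note that the ME inherits the time dependence of λ0(t): the same covariate change shifts the hazard by different amounts at different survival times.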
In this case, the ME formulae will follow those derived for the Gamma (or log-link GLM) regression model and the count data models presented earlier, except that the corresponding ME will be multiplied by λ0(t) [2]. Despite the similarity in the ME formulae, the estimated ME will differ due to the difference in the underlying distribution.

For time-varying factors or covariates, the proportional hazards model takes the form: λ(t; x(t)) = ψ(x(t)) λ0(t). If λ(t; x(t)) is the Weibull hazard function[3], then the ME will be similar to the ME presented above for the time-invariant case. If λ(t; x(t)) is the log-logistic hazard, then:

λ(t; x(t)) = exp(f(x1, x2; β)) α t^(α−1) / (1 + exp(f(x1, x2; β)) t^α),

where f(x1, x2; β) is a predictor function[2]. The ME is:

∂λ(t; x)/∂x = [exp(f(x1, x2; β)) α t^(α−1) / (1 + exp(f(x1, x2; β)) t^α)²] × ∂f(x1, x2; β)/∂x.

The interaction (marginal) effect of x1 given a change in x2 is given by:

ME12 = [exp(f(x1, x2; β)) α t^(α−1) (1 − exp(f(x1, x2; β)) t^α) / (1 + exp(f(x1, x2; β)) t^α)³] × (∂f(x1, x2; β)/∂x1) × (∂f(x1, x2; β)/∂x2)
+ [exp(f(x1, x2; β)) α t^(α−1) / (1 + exp(f(x1, x2; β)) t^α)²] × ∂²f(x1, x2; β)/∂x1∂x2.

7. Two-Part Regression Model

Consider a two-part model that consists of a probability model and a distribution involving strictly positive values, i.e.:

H(·) = Pr(y = 1) × E(y|y > 0),

where Pr(y = 1) = exp(η)/(1 + exp(η)) and E(y|y > 0) = exp(η). The regression function is then given by:

G(x1, x2; β) = [exp(η)/(1 + exp(η))] × exp(η) = exp(η)/(1 + exp(−η)),

where η = η(x1, x2; β) is the predictor function. For each case, assume x1 and x2 are continuous unless indicated otherwise.

Case 1: Linear in the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2, then:
ME2: ∂G(·)/∂x2 = [exp(η)/(1 + exp(−η))] β2 + [exp(η)/(1 + exp(−η))²] exp(−η) β2
= G(x1, x2; β) β2 [1 + 1/(1 + exp(η))],

because exp(−η)/(1 + exp(−η)) = 1/(1 + exp(η)).

Case 2: Inclusion of nonlinear transformations of the covariates.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 x2², then:

ME2: ∂G(·)/∂x2 = [exp(η)/(1 + exp(−η))](β2 + 2 β3 x2) + [exp(η)/(1 + exp(−η))²] exp(−η)(β2 + 2 β3 x2)
= G(x1, x2; β)(β2 + 2 β3 x2)[1 + 1/(1 + exp(η))].

Case 3: Inclusion of an interaction term between covariates.

Consider an interaction term between x1 and x2 such that η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1). Then:

ME2: ∂G(·)/∂x2 = [exp(η)/(1 + exp(−η))](β2 + β3 x1) + [exp(η)/(1 + exp(−η))²] exp(−η)(β2 + β3 x1)
= G(x1, x2; β)(β2 + β3 x1)[1 + 1/(1 + exp(η))].

Case 4: Linearity in the covariates with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 and assume that x1 is binary. Then:

ME1: G(x1, x2; β)|x1=1 − G(x1, x2; β)|x1=0 = exp(β0 + β1 + β2 x2)/(1 + exp(−β0 − β1 − β2 x2)) − exp(β0 + β2 x2)/(1 + exp(−β0 − β2 x2)).

Case 5: Inclusion of an interaction term with a discrete covariate.

Let η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1) and assume that x1 is binary. Then:

ME1: G(x1, x2; β)|x1=1 − G(x1, x2; β)|x1=0 = exp(β0 + β1 + β2 x2 + β3 x2)/(1 + exp(−β0 − β1 − β2 x2 − β3 x2)) − exp(β0 + β2 x2)/(1 + exp(−β0 − β2 x2)).

For the next two cases (Cases 6 and 7), let the predictor be given by: η(x1, x2; β) = β0 + β1 x1 + β2 x2 + β3 (x2 · x1).

Case 6: Interaction effects with continuous covariates.

ME21: ∂[ME2]/∂x1 = [ME1 (β2 + β3 x1) + G(x1, x2; β) β3] × [1 + 1/(1 + exp(η))]
− G(x1, x2; β)(β2 + β3 x1) exp(η)(β1 + β3 x2)/[1 + exp(η)]²,

where ME1: ∂G(·)/∂x1 = G(x1, x2; β)(β1 + β3 x2)[1 + 1/(1 + exp(η))].

Case 7: Interaction effects with a continuous and discrete covariate.
Let x1 be binary again, making η1 = β0 + β1 + β2 x2 + β3 x2 and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = [exp(η1)/(1 + exp(−η1))](β2 + β3)[1 + 1/(1 + exp(η1))] − [exp(η0)/(1 + exp(−η0))] β2 [1 + 1/(1 + exp(η0))].

Case 8: Interaction effects for a predictor linear in the coefficients and covariates.

An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: η(x1, x2; β) = β0 + β1 x1 + β2 x2. Again let x1 be binary, η1 = β0 + β1 + β2 x2, and η0 = β0 + β2 x2. Then:

ME21: [ME2]|x1=1 − [ME2]|x1=0 = [exp(η1)/(1 + exp(−η1))] β2 [1 + 1/(1 + exp(η1))] − [exp(η0)/(1 + exp(−η0))] β2 [1 + 1/(1 + exp(η0))].

8. Sample Selection Model

The Heckman model with a log-linear dependent variable[4, 5] is specified below:

ln y = x′β + βλ λ(x′γ) + ε,
εi ~ iid, E(εi) = 0, var(εi) = σ², ε ⊥ x,
y = exp(x′β + βλ λ(·) + ε),
E(y) = E(e^(x′β + βλ λ(·) + ε)),
E(y) = e^(x′β) e^(βλ λ(·)) E(e^ε|x).

The general case of the marginal effect for the log-linear formulation is derived as follows[6, 7]:

∂E(y)/∂xk = ∂(e^(x′β) e^(βλ λ(·)) E(e^ε|x))/∂xk

∂E(y)/∂xk = e^(x′β) e^(βλ λ(·)) (βk E(e^ε|x) + γk βλ (−λ(·)² − (x′γ) λ(·)) E(e^ε|x) + ∂E(e^ε|x)/∂xk).

The formulation above is for the general case in which xk is a continuous variable and appears in both the main regression model and the inverse Mills ratio (IMR) function λ(·). The ME function will differ according to the scenarios represented by the cases defined in the main article.
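Under homoskedasticity, E(e^ε|x) reduces to the constant e^(σ²/2), and the general ME formula above can be verified numerically. The sketch below uses invented coefficient values, the IMR λ(c) = φ(c)/Φ(c), and its derivative λ′(c) = −λ(c)(c + λ(c)); the same covariates are assumed to enter both equations:

```python
import math

def norm_pdf(c):
    """Standard normal density phi(c)."""
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def norm_cdf(c):
    """Standard normal CDF Phi(c)."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def imr(c):
    """Inverse Mills ratio: lambda(c) = phi(c) / Phi(c)."""
    return norm_pdf(c) / norm_cdf(c)

# Hypothetical two-covariate setup (all values invented for illustration).
beta = [0.5, 0.3, -0.2]      # outcome-equation coefficients (beta0, beta1, beta2)
gamma = [0.2, 0.6, 0.4]      # selection-equation coefficients (gamma0, gamma1, gamma2)
b_lam, sigma2 = 0.7, 0.25    # IMR coefficient and var(eps)

def e_y(x1, x2):
    """E(y) = exp(x'beta + b_lam * lambda(x'gamma) + sigma^2 / 2)
    under homoskedasticity (lognormal retransformation)."""
    xb = beta[0] + beta[1] * x1 + beta[2] * x2
    c = gamma[0] + gamma[1] * x1 + gamma[2] * x2
    return math.exp(xb + b_lam * imr(c) + sigma2 / 2.0)

def me1(x1, x2):
    """Analytic ME of x1, using lambda'(c) = -lambda(c) * (c + lambda(c))."""
    c = gamma[0] + gamma[1] * x1 + gamma[2] * x2
    lam = imr(c)
    return e_y(x1, x2) * (beta[1] + b_lam * gamma[1] * (-lam * (c + lam)))

x1, x2 = 1.0, 0.5
h = 1e-6
numeric = (e_y(x1 + h, x2) - e_y(x1 - h, x2)) / (2 * h)
print(me1(x1, x2), numeric)
```

The finite-difference check confirms that the selection-correction term γk βλ λ′(·) materially shifts the ME away from the naive value βk E(y).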
If the error is homoskedastic in xk, as assumed in the specification above, then:

∂E(y)/∂xk = e^(x′β) e^(βλ λ(·)) (βk E(e^ε|x) + γk βλ (−λ(·)² − (x′γ) λ(·)) E(e^ε|x)).

If the error is heteroskedastic in xk, then:

∂E(y)/∂xk = e^(x′β) e^(βλ λ(·)) (βk E(e^ε|x) + γk βλ (−λ(·)² − (x′γ) λ(·)) E(e^ε|x) + ∂E(e^ε|x)/∂xk).

The "naïve" estimate is appropriate to use in the retransformation when the error structure is homoskedastic (the log-normal distribution is appropriate). Duan proposed a smearing estimator that does not require the log-normality assumption[8]. Of note, the smearing estimator may differ across patient subgroups, and in this case a subgroup-specific smearing estimator would be required[4].

- Naïve estimate (normal distribution of εi is assumed):

Ê(y) = exp(x′β + βλ λ(·) + σ²/2)

- Smearing estimate (normal distribution of εi is not needed):

Ê(y) = e^(x′β̂ + β̂λ λ(·)) × (1/n) Σ_{i=1}^{n} e^(ε̂i)

References (Technical Appendix)

1. Bergtold JS, Onukwugha E. The probabilistic reduction approach to specifying multinomial logistic regression models in health outcomes research. Journal of Applied Statistics. 2014;41(10):2206-21.
2. Wooldridge J. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press; 2002.
3. Ishak KJ, Kreif N, Benedict A, Muszbek N. Overview of parametric survival analysis for health-economic applications. Pharmacoeconomics. 2013 Aug;31(8):663-75.
4. Manning WG. The logged dependent variable, heteroscedasticity, and the retransformation problem. Journal of Health Economics. 1998;17(3):283-95.
5. Dow WH, Norton EC. Choosing between and interpreting the Heckit and two-part models for corner solutions. Health Services and Outcomes Research Methodology. 2003;4(1):5-18.
6. Greene W. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Prentice Hall; 2012.
7. Vance C. Marginal effects and significance testing with Heckman's sample selection model: a methodological note. Applied Economics Letters. 2009;16(14):1415-9.
8. Duan N.
Smearing estimate: A nonparametric retransformation method. Journal of the American Statistical Association. 1983;78(383):605-10.
© Copyright 2026 Paperzz