
Type of Paper: Review Article
Title:
A Primer on Marginal Effects – Part I: Theory and Formulae
Short title: Primer on Marginal Effects
Authors: Onukwugha E1, Bergtold J2, Jain R3
1 220 Arch Street, Department of Pharmaceutical Health Services Research, University of Maryland School of Pharmacy, Baltimore, MD, USA.
2 304G Waters Hall, Department of Agricultural Economics, Kansas State University, Manhattan, KS, 66506-4011.
3 HealthCore, Inc., 800 Delaware Avenue 5th Floor, Wilmington, DE 19801, USA.
Corresponding author:
Eberechukwu Onukwugha
220 Arch Street, 12th Floor
Baltimore, MD 21201
(410) 706-8981
[email protected]
TECHNICAL APPENDIX.
Marginal effect formulas for the linear, logit, multinomial logit, generalized linear model with log link, Poisson, negative binomial, two-part, sample selection, and survival models.
1. Linear Regression Model
A typical linear regression equation with two independent variables takes the form:
𝐸 (𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 .
We examine each of the cases presented above for the linear regression model, showing how marginal and interaction effects change under different conditions. For each case, assume π‘₯1 and π‘₯2 are continuous unless indicated otherwise.
Case 1: Linearity in the covariates.
Let 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2. Then:
ME2: \(\frac{\partial f(\cdot)}{\partial x_2} = \beta_2\)
In a linear regression with only linear transformations of variables, the marginal effect is
constant.
Case 2: Inclusion of nonlinear transformations of the covariates.
Now consider the case where 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯22 , then
ME2: \(\frac{\partial f(\cdot)}{\partial x_2} = \beta_2 + 2\beta_3 x_2.\)
The marginal effect of π‘₯2 in this case is a function of π‘₯2. If 𝛽3 > 0 (𝛽3 < 0), then the marginal effect of π‘₯2 on y increases (decreases) as π‘₯2 increases.
Case 3: Inclusion of an interaction term between covariates.
If 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯1 βˆ— π‘₯2 ), then
ME2: \(\frac{\partial f(\cdot)}{\partial x_2} = \beta_2 + \beta_3 x_1\)
Now, the marginal effect of π‘₯2 depends on the value of π‘₯1 .
Case 4: Linearity in the covariates with a discrete covariate.
Assume that 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2, but now let π‘₯1 be binary. Then:
ME1: 𝑓(π‘₯1 , π‘₯2 )|π‘₯1 =1 βˆ’ 𝑓(π‘₯1 , π‘₯2 )|π‘₯1=0 = 𝛽1 .
Note that in a linear regression with only linear transformations of variables, the marginal effect
is still constant.
Case 5: Inclusion of an interaction term with a discrete covariate.
Let 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯1 βˆ— π‘₯2 ), then:
ME1: 𝑓(π‘₯1 , π‘₯2 )|π‘₯1 =1 βˆ’ 𝑓(π‘₯1 , π‘₯2 )|π‘₯1=0 = 𝛽1 + 𝛽3 π‘₯2.
The marginal effect of π‘₯1 in this case depends on the value of π‘₯2 .
Case 6: Interaction effects when both covariates are continuous.
Let 𝐸(𝑦|π‘₯1, π‘₯2) = 𝑓(π‘₯1, π‘₯2) = 𝛽0 + 𝛽1π‘₯1 + 𝛽2π‘₯2 + 𝛽3(π‘₯1 βˆ— π‘₯2). The interaction effect, i.e., the change in the marginal effect of π‘₯2 given a change in π‘₯1, is:
ME21: \(\frac{\partial}{\partial x_1}[ME_2] = \beta_3\)
Case 7: Interaction between a continuous and discrete covariate.
Let 𝐸(𝑦|π‘₯1 , π‘₯2 ) = 𝑓(π‘₯1 , π‘₯2 ) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯1 βˆ— π‘₯2 ), where π‘₯1 is binary.
Then:
ME21: \(\left.\frac{\partial f(\cdot)}{\partial x_2}\right|_{x_1=1} - \left.\frac{\partial f(\cdot)}{\partial x_2}\right|_{x_1=0} = \{\beta_2 + \beta_3 \cdot 1\} - \beta_2 = \beta_3\), and
ME12: \(\frac{\partial}{\partial x_2}[ME_1] = \beta_3\)
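As a numerical check on Cases 3 and 6, the sketch below (numpy only, with hypothetical coefficient values) evaluates ME2 and the interaction effect ME21 for the linear model with an interaction term and compares them against finite differences of f.

```python
import numpy as np

# Hypothetical coefficients for f(x1, x2) = b0 + b1*x1 + b2*x2 + b3*x1*x2
b0, b1, b2, b3 = 1.0, 0.5, -0.3, 0.2

def f(x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

x1, x2, h = 2.0, 1.5, 1e-4

# Case 3: analytic ME2 = b2 + b3*x1, checked by a central finite difference
me2_analytic = b2 + b3 * x1
me2_numeric = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)

# Case 6: interaction effect ME21 = d(ME2)/dx1 = b3 (constant for the linear model)
me21_analytic = b3
me21_numeric = ((f(x1 + h, x2 + h) - f(x1 + h, x2 - h))
                - (f(x1 - h, x2 + h) - f(x1 - h, x2 - h))) / (4 * h * h)

print(me2_analytic, me2_numeric)    # both ~ -0.3 + 0.2*2.0 = 0.1
print(me21_analytic, me21_numeric)  # both ~ 0.2
```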
A special case of the linear regression model is the log-linear model, i.e., the regression model:
ln(𝑦) = f(x1 , x2 ) + 𝑒.
For this regression model, if the dependent variable of interest is ln(y), then the marginal effects
derived above apply. That is, the marginal effect of a change in π‘₯π‘˜ on ln(y) is the statistic of
interest. If instead, the applied modeler is interested in the marginal effect of π‘₯π‘˜ on y, then the
marginal effect formulas will differ. First, let:
\(\overline{ME}_k = \frac{\partial f(x;\beta)}{\partial x_k} = \frac{\partial \ln(y)}{\partial x_k}.\)
To get the marginal effect of π‘₯π‘˜ on y, one will need to transform the above marginal effect. By the chain rule,
\[
\frac{\partial \ln(y)}{\partial x_k} = \frac{\partial \ln(y)}{\partial y} \times \frac{\partial y}{\partial x_k} = \frac{1}{y} \times \frac{\partial y}{\partial x_k},
\]
implying that
\[
\frac{\partial y}{\partial x_k} = y \times \frac{\partial \ln(y)}{\partial x_k}.
\]
Then:
\[
ME_k = \overline{ME}_k \times y.
\]
Thus, if a log-linear model is estimated, then the marginal effects are those derived above times
the value of the dependent variable. This shows that marginal effects can include the dependent
variable, as well. The above marginal effect derivations for the log-linear model assume a homoskedastic retransformation and no need for a β€˜smearing estimator’. In the case of heteroskedasticity, the ME formula will include an extra term associated with the derivative of the error term with respect to the independent variable.
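The retransformation step can be illustrated with a minimal numpy sketch, assuming hypothetical coefficients and a homoskedastic error so that the naive retransformation applies: the marginal effect on ln(y) is constant, while the marginal effect on y scales with the level of y.

```python
import numpy as np

# Hypothetical log-linear model: ln(y) = b0 + b1*x1 + b2*x2 + e
b0, b1, b2 = 0.2, 0.4, -0.1

def lny_hat(x1, x2):
    return b0 + b1 * x1 + b2 * x2

x1, x2 = 1.0, 3.0
me_on_lny = b2                      # marginal effect of x2 on ln(y): constant
y_hat = np.exp(lny_hat(x1, x2))     # naive retransformation (homoskedastic errors)
me_on_y = me_on_lny * y_hat         # ME_k = ME_bar_k * y

print(me_on_lny, me_on_y)
```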
2. Logistic Regression Model (or Logit Model)
Let the predictor (or index) function of the model be given by πœ‚(π‘₯1, π‘₯2; 𝛽); the logit model then takes the form
\[
E(y=1|x_1,x_2) = \Lambda\big(\eta(x_1,x_2;\beta)\big) = \frac{1}{1+e^{-\eta(x_1,x_2;\beta)}},
\]
where 𝛬(βˆ™) is the cumulative distribution function of the standard logistic distribution. It is
important to note that 𝐸(𝑦 = 1|π‘₯1 , π‘₯2 ) is non-linear in the β’s. Therefore, unlike the linear
regression model, the magnitudes of the beta coefficients are not the marginal effects of the
independent variables.
For the marginal effect derivations below, the following formula will be of use: if
\(\Lambda(z) = \frac{1}{1+e^{-z}}\), then
\[
\frac{\partial \Lambda(\cdot)}{\partial z} = \frac{e^{-z}}{(1+e^{-z})^2} = \Lambda(z)\big(1-\Lambda(z)\big).
\]
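A quick numerical check of this derivative identity (numpy, arbitrary value of z):

```python
import numpy as np

def Lam(z):
    return 1.0 / (1.0 + np.exp(-z))

z, h = 0.7, 1e-6
deriv_numeric = (Lam(z + h) - Lam(z - h)) / (2 * h)   # central finite difference
deriv_identity = Lam(z) * (1.0 - Lam(z))              # Lambda(z)*(1 - Lambda(z))
print(deriv_numeric, deriv_identity)                  # both approximately 0.2217
```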
For each case, assume π‘₯1 and π‘₯2 are continuous unless indicated otherwise. In addition, for ease
of notation, we may represent πœ‚(π‘₯1 , π‘₯2 ; 𝛽) simply as πœ‚.
Case 1: Linear in the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 , then:
ME2: \(\frac{\partial \Lambda(\cdot)}{\partial x_2} = \frac{\partial \Lambda(\cdot)}{\partial \eta} \cdot \frac{\partial \eta}{\partial x_2} = \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)\beta_2\)
For the logit model, (i) all independent variables are involved in the calculation of the marginal
effect and (ii) the marginal effect depends on the initial value of all the independent variables.
Case 2: Inclusion of nonlinear transformations of the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯22 , then:
ME2: \(\frac{\partial \Lambda(\cdot)}{\partial x_2} = \frac{\partial \Lambda(\cdot)}{\partial \eta} \cdot \frac{\partial \eta}{\partial x_2} = \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)(\beta_2 + 2\beta_3 x_2)\)
As in the case of the linear regression model, we assume that the nonlinear transformation of π‘₯2 is a square function; in general, it can be any transformation. For example, if the nonlinear term in the predictor were 𝛽3 ln(π‘₯2) instead of 𝛽3 π‘₯2Β², then:
ME2: \(\frac{\partial \Lambda(\cdot)}{\partial x_2} = \frac{\partial \Lambda(\cdot)}{\partial \eta} \cdot \frac{\partial \eta}{\partial x_2} = \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)\left(\beta_2 + \frac{\beta_3}{x_2}\right).\)
Case 3: Inclusion of an interaction term between covariates.
Consider an interaction term between π‘₯1 and π‘₯2 such that πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 +
Ξ²3 (x2 βˆ— x1 ). Then:
ME2: \(\frac{\partial \Lambda(\cdot)}{\partial x_2} = \frac{\partial \Lambda(\cdot)}{\partial \eta} \cdot \frac{\partial \eta}{\partial x_2} = \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)(\beta_2 + \beta_3 x_1).\)
Case 4: Linearity in the covariates with a discrete covariate.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 and assume that π‘₯1 is binary. Then:
ME1: \(\Lambda(x_1,x_2)|_{x_1=1} - \Lambda(x_1,x_2)|_{x_1=0} = \frac{1}{1+e^{-(\beta_0+\beta_1+\beta_2 x_2)}} - \frac{1}{1+e^{-(\beta_0+\beta_2 x_2)}}.\)
Case 5: Inclusion of an interaction term with a discrete covariate
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 + Ξ²3 (x2 βˆ— x1 ) and assume that π‘₯1 is binary. Then:
ME1: \(\Lambda(x_1,x_2)|_{x_1=1} - \Lambda(x_1,x_2)|_{x_1=0} = \frac{1}{1+e^{-(\beta_0+\beta_1+\beta_2 x_2+\beta_3 x_2)}} - \frac{1}{1+e^{-(\beta_0+\beta_2 x_2)}}.\)
The remainder of this section (Cases 6, 7, 8) is based on the work by Ai and Norton (2003). For
this section, consider the predictor: πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯2 βˆ— π‘₯1 ).
Case 6: Interaction effects with continuous covariates.
ME21: \(\frac{\partial}{\partial x_1}[ME_2] = \left[\frac{\partial \Lambda(x_1,x_2)}{\partial x_1} - \frac{\partial \Lambda(x_1,x_2)^2}{\partial x_1}\right]\frac{\partial \eta}{\partial x_2} + \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)\frac{\partial^2 \eta}{\partial x_2 \partial x_1}\)
\(= \big[ME_1\big(1-2\Lambda(x_1,x_2)\big)\big](\beta_2 + \beta_3 x_1) + \Lambda(x_1,x_2)\big(1-\Lambda(x_1,x_2)\big)\beta_3.\)
Case 7: Interaction effects with a continuous and discrete covariate.
Let π‘₯1 be binary again, making πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 and πœ‚0 = 𝛽0 + 𝛽2 π‘₯2. Then:
ME21: [𝑀𝐸2 ]|π‘₯1 =1 βˆ’ [𝑀𝐸2 ]|π‘₯1 =0 =𝛬(πœ‚1 )(1 βˆ’ 𝛬(πœ‚1 ))(𝛽2 + 𝛽3 ) βˆ’ 𝛬(πœ‚0 )(1 βˆ’ 𝛬(πœ‚0 ))𝛽2.
It should be emphasized that ME21 = ME12, as well.
Case 8: Interaction effects for predictor linear in the coefficients and covariates.
An interesting case is when there are no nonlinear transformations or interaction terms, i.e.:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 . Now let πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 and πœ‚0 = 𝛽0 + 𝛽2 π‘₯2 . Then:
ME21:[𝑀𝐸2 ]|π‘₯1 =1 βˆ’ [𝑀𝐸2 ]|π‘₯1 =0 =𝛬(πœ‚1 )(1 βˆ’ 𝛬(πœ‚1 ))𝛽2 βˆ’ 𝛬(πœ‚0 )(1 βˆ’ 𝛬(πœ‚0 ))𝛽2 .
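The following sketch (hypothetical coefficients, numpy only) evaluates ME2 from Case 3 and the Case 6 interaction effect ME21 at a single covariate point and verifies both against finite differences of the predicted probability.

```python
import numpy as np

# Hypothetical logit with interaction: eta = b0 + b1*x1 + b2*x2 + b3*x1*x2
b0, b1, b2, b3 = -0.5, 0.8, 0.3, -0.4

def prob(x1, x2):
    eta = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2
    return 1.0 / (1.0 + np.exp(-eta))

x1, x2, h = 1.0, 2.0, 1e-4
L = prob(x1, x2)

# Case 3: ME2 = Lambda*(1 - Lambda)*(b2 + b3*x1)
me2 = L * (1 - L) * (b2 + b3 * x1)
me2_fd = (prob(x1, x2 + h) - prob(x1, x2 - h)) / (2 * h)

# Case 6: ME21 = ME1*(1 - 2*Lambda)*(b2 + b3*x1) + Lambda*(1 - Lambda)*b3
me1 = L * (1 - L) * (b1 + b3 * x2)
me21 = me1 * (1 - 2 * L) * (b2 + b3 * x1) + L * (1 - L) * b3
me21_fd = ((prob(x1 + h, x2 + h) - prob(x1 + h, x2 - h))
           - (prob(x1 - h, x2 + h) - prob(x1 - h, x2 - h))) / (4 * h * h)

print(me2, me2_fd)    # analytic and numerical derivatives agree
print(me21, me21_fd)  # cross-partial matches the Case 6 formula
```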
3. Multinomial Logistic Regression Model
There exists a separate set of marginal effects for each outcome in the multinomial
logistic regression model. These will be designated MEi,j for the marginal effect of variable i for outcome j, and MEik,j for the interaction marginal effect of variables i and k for outcome j. For each case, assume π‘₯1 and π‘₯2 are continuous unless indicated otherwise. For ease of notation, we may represent πœ‚π‘—(π‘₯1, π‘₯2; 𝛽) simply as πœ‚π‘—. In addition, we will assume that:
\[
E(y=j|x_1,x_2) = \frac{\exp(\eta_j)}{1+\sum_{k=1}^{J}\exp(\eta_k)} = h_j(x_1,x_2)
\]
Case 1: Linearity in the covariates.
Let πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0,𝑗 + 𝛽1,𝑗 π‘₯1 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Then:
ME2,j: \(\frac{\partial h_j(\cdot)}{\partial x_2} = h_j(x_1,x_2)\big[\beta_{2,j} - \sum_{k=0}^{J} h_k(x_1,x_2)\beta_{2,k}\big].\)
As with the logit model, for the multinomial logit model (i) all independent variables are involved in the calculation of the marginal effect and (ii) the marginal effect depends on the initial values of the independent variables.
Case 2: Inclusion of nonlinear transformations of the covariates.
Consider the predictor πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0,𝑗 + 𝛽1,𝑗 π‘₯1 + 𝛽2,𝑗 π‘₯2 + 𝛽3,𝑗 π‘₯22 for j = 1,…,J. Then:
ME2,j: \(\frac{\partial h_j(\cdot)}{\partial x_2} = h_j(x_1,x_2)\big[(\beta_{2,j} + 2\beta_{3,j} x_2) - \sum_{k=0}^{J} h_k(x_1,x_2)(\beta_{2,k} + 2\beta_{3,k} x_2)\big].\)
Case 3: Inclusion of interaction terms between the covariates.
Let πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0,j + Ξ²1,j x1 + Ξ²2,j x2 + Ξ²3,j (x2 βˆ— x1 ) for j = 1,…,J. Then:
ME2,j: \(\frac{\partial h_j(\cdot)}{\partial x_2} = h_j(x_1,x_2)\big[(\beta_{2,j} + \beta_{3,j} x_1) - \sum_{k=0}^{J} h_k(x_1,x_2)(\beta_{2,k} + \beta_{3,k} x_1)\big].\)
Case 4: Linearity in the covariates with a discrete covariate.
Let π‘₯1 be binary and πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0,𝑗 + 𝛽1,𝑗 π‘₯1 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Now let πœ‚π‘—,1 =
𝛽0,𝑗 + 𝛽1,𝑗 + 𝛽2,𝑗 π‘₯2 and πœ‚π‘—,0 = 𝛽0,𝑗 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Then:
ME1,j: \(h_j(x_1,x_2)|_{x_1=1} - h_j(x_1,x_2)|_{x_1=0} = \frac{\exp(\eta_{j,1})}{1+\sum_{k=1}^{J}\exp(\eta_{k,1})} - \frac{\exp(\eta_{j,0})}{1+\sum_{k=1}^{J}\exp(\eta_{k,0})}.\)
Case 5: Inclusion of an interaction term with a discrete covariate.
Let π‘₯1 be binary and πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0,j + Ξ²1,j x1 + Ξ²2,j x2 + Ξ²3,j (x2 βˆ— x1 ) for j = 1,…,J. Now let
πœ‚π‘—,1 = 𝛽0,𝑗 + 𝛽1,𝑗 + 𝛽2,𝑗 π‘₯2 + 𝛽3,𝑗 π‘₯2 and πœ‚π‘—,0 = 𝛽0,𝑗 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Then:
ME1,j: \(h_j(x_1,x_2)|_{x_1=1} - h_j(x_1,x_2)|_{x_1=0} = \frac{\exp(\eta_{j,1})}{1+\sum_{k=1}^{J}\exp(\eta_{k,1})} - \frac{\exp(\eta_{j,0})}{1+\sum_{k=1}^{J}\exp(\eta_{k,0})}.\)
For the next two cases, consider the predictor: πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0,𝑗 + 𝛽1,𝑗 π‘₯1 + 𝛽2,𝑗 π‘₯2 + 𝛽3,𝑗 (π‘₯2 βˆ—
π‘₯1 ) for j = 1,…,J. The derivations here are based on work by Bergtold and Onukwugha[1].
Case 6: Interaction effects when the covariates are both continuous.
ME21,j: \(\frac{\partial}{\partial x_1}[ME_{2,j}] = h_j(x_1,x_2)\Big[\beta_{3,j} - \sum_{k=1}^{J}\big(ME_{1,k}(\beta_{2,k} + \beta_{3,k} x_1) + h_k(x_1,x_2)\beta_{3,k}\big)\Big] + \frac{(ME_{2,j})(ME_{1,j})}{h_j(x_1,x_2)}\)
Case 7: Interaction effects with a continuous and discrete covariate.
Let πœ‚π‘—,1 = 𝛽0,𝑗 + 𝛽1,𝑗 + 𝛽2,𝑗 π‘₯2 + 𝛽3,𝑗 π‘₯2 and πœ‚π‘—,0 = 𝛽0,𝑗 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Then:
ME21,j: \([ME_{2,j}]|_{x_1=1} - [ME_{2,j}]|_{x_1=0} = h_j(\eta_{j,1})\Big[(\beta_{2,j} + \beta_{3,j}) - \sum_{k=0}^{J} h_k(\eta_{k,1})(\beta_{2,k} + \beta_{3,k})\Big] - h_j(\eta_{j,0})\Big[\beta_{2,j} - \sum_{k=0}^{J} h_k(\eta_{k,0})\beta_{2,k}\Big].\)
Case 8: Interaction effects for predictor linear in the coefficients and covariates.
An interesting case is when there are no nonlinear transformations or interaction terms, i.e.:
πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0,𝑗 + 𝛽1,𝑗 π‘₯1 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Let πœ‚π‘—,1 = 𝛽0,𝑗 + 𝛽1,𝑗 + 𝛽2,𝑗 π‘₯2 and πœ‚π‘—,0 =
𝛽0,𝑗 + 𝛽2,𝑗 π‘₯2 for j = 1,…,J. Then:
ME21,j: \([ME_{2,j}]|_{x_1=1} - [ME_{2,j}]|_{x_1=0} = h_j(\eta_{j,1})\big[\beta_{2,j} - \sum_{k=0}^{J} h_k(\eta_{k,1})\beta_{2,k}\big] - h_j(\eta_{j,0})\big[\beta_{2,j} - \sum_{k=0}^{J} h_k(\eta_{k,0})\beta_{2,k}\big].\)
4. Generalized Linear Model (GLM) with Log Link Function
The conditional mean for the GLM model with log link takes the form:
𝐸(𝑦|π‘₯1 , π‘₯2 ) = exp (πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽)),
where πœ‚π‘— (π‘₯1 , π‘₯2 ; 𝛽) is the predictor function. For each case, assume π‘₯1 and π‘₯2 are continuous
unless indicated otherwise. In addition, for ease of notation, we represent πœ‚(π‘₯1 , π‘₯2 ; 𝛽) simply as
πœ‚.
Case 1: Linear in the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 , then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,\beta_2\)
Case 2: Inclusion of nonlinear transformations of the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯22 , then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,(\beta_2 + 2\beta_3 x_2)\)
Case 3: Inclusion of an interaction term between covariates.
Consider an interaction term between π‘₯1 and π‘₯2 such that πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 +
Ξ²3 (x2 βˆ— x1 ). Then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,(\beta_2 + \beta_3 x_1).\)
Case 4: Linearity in the covariates with a discrete covariate.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 and assume that π‘₯1 is binary. Then:
ME1: 𝑒π‘₯𝑝(πœ‚)|π‘₯1=1 βˆ’ 𝑒π‘₯𝑝(πœ‚)|π‘₯1 =0= exp(𝛽0 + 𝛽1 + 𝛽2 π‘₯2 ) βˆ’ exp(𝛽0 + 𝛽2 π‘₯2 ).
Case 5: Inclusion of an interaction term with a discrete covariate
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 + Ξ²3 (x2 βˆ— x1 ) and assume that π‘₯1 is binary. Then:
ME1: 𝑒π‘₯𝑝(πœ‚)|π‘₯1=1 βˆ’ 𝑒π‘₯𝑝(πœ‚)|π‘₯1 =0= exp(𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 ) βˆ’ exp(𝛽0 + 𝛽2 π‘₯2 ).
For the next two cases (cases 6 and 7), let the predictor be given by:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯2 βˆ— π‘₯1 ).
Case 6: Interaction effects with continuous covariates.
ME21: \(\frac{\partial}{\partial x_1}[ME_2] = \exp(\eta)\,(\beta_1 + \beta_3 x_2)(\beta_2 + \beta_3 x_1) + \exp(\eta)\,\beta_3.\)
Case 7: Interaction effects with a continuous and discrete covariate.
Let π‘₯1 be binary again, making πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 and πœ‚0 = 𝛽0 + 𝛽2 π‘₯2. Then:
ME21: [ME2 ]|π‘₯1 =1 βˆ’ [ME2 ]|π‘₯1 =0 =exp(πœ‚1 )(𝛽2 + 𝛽3 ) βˆ’ exp(πœ‚0 )𝛽2.
It should be emphasized that ME21 = ME12 in this case, as well.
Case 8: Interaction effects for predictor linear in the coefficients and covariates.
An interesting case is when there are no nonlinear transformations or interaction terms, i.e.:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 . Again let π‘₯1 be binary, πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 , and πœ‚0 = 𝛽0 +
𝛽2 π‘₯2 . Then:
ME21:[𝑀𝐸2 ]|π‘₯1 =1 βˆ’ [𝑀𝐸2 ]|π‘₯1 =0 =exp(πœ‚1 )𝛽2 βˆ’ exp(πœ‚0 )𝛽2.
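The sketch below (hypothetical coefficients, numpy only) evaluates the Case 3 marginal effect and the Case 6 interaction effect for the log-link mean and checks both numerically.

```python
import numpy as np

# Hypothetical log-link mean with interaction: E(y|x) = exp(b0 + b1*x1 + b2*x2 + b3*x1*x2)
b0, b1, b2, b3 = 0.1, 0.3, -0.2, 0.15

def mu(x1, x2):
    return np.exp(b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2)

x1, x2, h = 1.0, 2.0, 1e-4
m = mu(x1, x2)

# Case 3: ME2 = exp(eta)*(b2 + b3*x1)
me2 = m * (b2 + b3 * x1)
me2_fd = (mu(x1, x2 + h) - mu(x1, x2 - h)) / (2 * h)

# Case 6: ME21 = exp(eta)*(b1 + b3*x2)*(b2 + b3*x1) + exp(eta)*b3
me21 = m * (b1 + b3 * x2) * (b2 + b3 * x1) + m * b3
me21_fd = ((mu(x1 + h, x2 + h) - mu(x1 + h, x2 - h))
           - (mu(x1 - h, x2 + h) - mu(x1 - h, x2 - h))) / (4 * h * h)

print(me2, me2_fd)
print(me21, me21_fd)
```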
5. Count Models
The conditional mean function for both Poisson and Negative Binomial models (as well as a
number of other count data models) is:
𝐸(𝑦|π‘₯1 , π‘₯2 ) = exp(πœ‚(π‘₯1 , π‘₯2 ; 𝛽)).
For each case, assume π‘₯1 and π‘₯2 are continuous unless indicated otherwise. In addition, for ease
of notation, we represent πœ‚(π‘₯1 , π‘₯2 ; 𝛽) simply as πœ‚. The marginal effects for the count models
presented below are similar in derivation to those for the GLM model with log link function presented earlier, but it should be emphasized that the parameter estimates will not be the same in both models, as the underlying distributions are different.
Case 1: Linear in the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 , then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,\beta_2\)
Case 2: Inclusion of nonlinear transformations of the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯22 , then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,(\beta_2 + 2\beta_3 x_2)\)
Case 3: Inclusion of an interaction term between covariates.
Consider an interaction term between π‘₯1 and π‘₯2 such that πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 +
Ξ²3 (x2 βˆ— x1 ). Then:
ME2: \(\frac{\partial \exp(\eta)}{\partial x_2} = \exp(\eta)\,(\beta_2 + \beta_3 x_1).\)
Case 4: Linearity in the covariates with a discrete covariate.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 and assume that π‘₯1 is binary. Then:
ME1: 𝑒π‘₯𝑝(πœ‚)|π‘₯1=1 βˆ’ 𝑒π‘₯𝑝(πœ‚)|π‘₯1 =0= exp(𝛽0 + 𝛽1 + 𝛽2 π‘₯2 ) βˆ’ exp(𝛽0 + 𝛽2 π‘₯2 ).
Case 5: Inclusion of an interaction term with a discrete covariate
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 + Ξ²3 (x2 βˆ— x1 ) and assume that π‘₯1 is binary. Then:
ME1: 𝑒π‘₯𝑝(πœ‚)|π‘₯1=1 βˆ’ 𝑒π‘₯𝑝(πœ‚)|π‘₯1 =0= exp(𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 ) βˆ’ exp(𝛽0 + 𝛽2 π‘₯2 ).
For the next two cases (cases 6 and 7), let the predictor be given by:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯2 βˆ— π‘₯1 ).
Case 6: Interaction effects with continuous covariates.
ME21: \(\frac{\partial}{\partial x_1}[ME_2] = \exp(\eta)\,(\beta_1 + \beta_3 x_2)(\beta_2 + \beta_3 x_1) + \exp(\eta)\,\beta_3.\)
Case 7: Interaction effects with a continuous and discrete covariate.
Let π‘₯1 be binary again, making πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 and πœ‚0 = 𝛽0 + 𝛽2 π‘₯2. Then:
ME21: [ME2 ]|π‘₯1 =1 βˆ’ [ME2 ]|π‘₯1 =0 =exp(πœ‚1 )(𝛽2 + 𝛽3 ) βˆ’ exp(πœ‚0 )𝛽2.
It should be emphasized that ME21 = ME12 in this case, as well.
Case 8: Interaction effects for predictor linear in the coefficients and covariates.
An interesting case is when there are no nonlinear transformations or interaction terms, i.e.:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 . Again let π‘₯1 be binary, πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 , and πœ‚0 = 𝛽0 +
𝛽2 π‘₯2 . Then:
ME21:[𝑀𝐸2 ]|π‘₯1 =1 βˆ’ [𝑀𝐸2 ]|π‘₯1 =0 =exp(πœ‚1 )𝛽2 βˆ’ exp(πœ‚0 )𝛽2.
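In practice, packaged routines average these analytic MEs over the estimation sample. A hedged sketch with simulated data, assuming the statsmodels Poisson model and its get_margeff method behave as in recent releases (coefficients and data below are illustrative only):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
x1 = rng.binomial(1, 0.4, n)          # binary covariate
x2 = rng.normal(0.0, 1.0, n)          # continuous covariate
eta = 0.2 + 0.5 * x1 - 0.3 * x2       # simulated "true" predictor, for illustration
y = rng.poisson(np.exp(eta))

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.Poisson(y, X).fit(disp=0)

# Average marginal effects: the sample mean of exp(eta_i)*beta_k for the continuous
# covariate, and the discrete change in exp(eta) for the binary covariate (dummy=True).
ame = res.get_margeff(at='overall', method='dydx', dummy=True)
print(ame.summary())
```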
6. Survival models
To incorporate conditioning factors, a common approach is the use of proportional hazard
models. The proportional hazards model takes the form:
πœ†(𝑑; π‘₯) = πœ…(π‘₯)πœ†0 (𝑑),
where πœ†0 (𝑑) is the baseline hazard. A common parameterization is to let πœ…(π‘₯) =
exp(πœ‚(π‘₯1 , π‘₯2 ; 𝛽)), where πœ‚(π‘₯1 , π‘₯2 ; 𝛽) is a predictor function. The ME of interest is the marginal
change in x on the conditional hazard function. That is:
πœ•πœ†(𝑑;π‘₯)
πœ•π‘₯
= exp(πœ‚(π‘₯1 , π‘₯2 ; 𝛽))
πœ•πœ‚(π‘₯1 ,π‘₯2 ;𝛽)
πœ•π‘₯
× πœ†0 (𝑑).
In this case, the ME formulae will follow those derived for the Gamma (or log-linear) regression model and the count data models presented earlier, except that the corresponding ME will be multiplied by πœ†0(𝑑) [2]. Despite the similarity in the ME formulae, the estimated ME will differ due to the difference in the underlying distribution.
For time-varying factors or covariates, the proportional hazards model takes the form:
πœ†(𝑑; π‘₯(𝑑)) = πœ…(π‘₯(𝑑))πœ†0 (𝑑).
If πœ†(𝑑; π‘₯(𝑑)) is the Weibull hazard function[3], then the ME will be similar to the ME presented
above for the time-invariant case. If πœ†(𝑑; π‘₯(𝑑)) is the log-logistic hazard then:
exp(πœ‚(π‘₯ ,π‘₯ ;𝛽))𝛼𝑑 π›Όβˆ’1
1 2
πœ†(𝑑; π‘₯(𝑑)) = [ 1+exp(πœ‚(π‘₯
,π‘₯
1
2 ;𝛽))𝑑
𝛼
],
where πœ‚(π‘₯1 , π‘₯2 ; 𝛽) is a predictor function[2]. The ME is:
πœ•πœ†(𝑑; π‘₯)
exp(πœ‚(π‘₯1 , π‘₯2 ; 𝛽))𝛼𝑑 π›Όβˆ’1
πœ•πœ‚(π‘₯1 , π‘₯2 ; 𝛽)
=[
]
×
(1 + exp(πœ‚(π‘₯1 , π‘₯2 ; 𝛽))𝑑 𝛼 )2
πœ•π‘₯
πœ•π‘₯
The interaction (marginal) effect of x1 given a change in x2 is given by:
\[
ME_{12} = \left[\frac{\exp\big(\eta(x_1,x_2;\beta)\big)\,\alpha t^{\alpha-1}\big(1-\exp(\eta(x_1,x_2;\beta))\,t^{\alpha}\big)}{\big(1+\exp(\eta(x_1,x_2;\beta))\,t^{\alpha}\big)^3}\right] \times \frac{\partial \eta(x_1,x_2;\beta)}{\partial x_1} \times \frac{\partial \eta(x_1,x_2;\beta)}{\partial x_2} + \left[\frac{\exp\big(\eta(x_1,x_2;\beta)\big)\,\alpha t^{\alpha-1}}{\big(1+\exp(\eta(x_1,x_2;\beta))\,t^{\alpha}\big)^2}\right] \times \frac{\partial^2 \eta(x_1,x_2;\beta)}{\partial x_1 \partial x_2}.
\]
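A numerical sketch of the log-logistic hazard ME (hypothetical Ξ², Ξ±, and t; numpy only), checked against a finite difference in x2:

```python
import numpy as np

# Hypothetical log-logistic hazard with eta = b0 + b1*x1 + b2*x2
b0, b1, b2, alpha = -0.5, 0.4, 0.25, 1.3

def hazard(t, x1, x2):
    k = np.exp(b0 + b1 * x1 + b2 * x2)
    return k * alpha * t ** (alpha - 1) / (1.0 + k * t ** alpha)

t, x1, x2, h = 2.0, 1.0, 0.5, 1e-5
k = np.exp(b0 + b1 * x1 + b2 * x2)

# ME of x2 on the hazard: [exp(eta)*alpha*t^(alpha-1) / (1 + exp(eta)*t^alpha)^2] * d(eta)/dx2
me2 = k * alpha * t ** (alpha - 1) / (1.0 + k * t ** alpha) ** 2 * b2
me2_fd = (hazard(t, x1, x2 + h) - hazard(t, x1, x2 - h)) / (2 * h)
print(me2, me2_fd)
```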
7. Two-Part Regression Model
Consider a two-part model that consists of a probability model and a distribution involving strictly positive values, i.e.,
\[
H(\cdot) = \Pr(y=1) \times E(y|y>0),
\]
where \(\Pr(y=1) = \frac{\exp(\eta)}{1+\exp(\eta)}\) and \(E(y|y>0) = \exp(\eta)\). The regression function is then given by
\[
f(x_1,x_2;\beta) = \frac{\exp(\eta)}{1+\exp(\eta)} \times \exp(\eta) = \frac{\exp(\eta)}{1+\exp(-\eta)},
\]
where πœ‚ = πœ‚(π‘₯1, π‘₯2; 𝛽) is the predictor
function. For each case, assume π‘₯1 and π‘₯2 are continuous unless indicated otherwise.
Case 1: Linear in the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 , then:
πœ•π‘“(.)
𝑒π‘₯𝑝(πœ‚)
𝑒π‘₯𝑝(πœ‚)
ME2: ( πœ•π‘₯ ) = 1+𝑒π‘₯𝑝(βˆ’πœ‚) (𝛽2 ) βˆ’ [1+𝑒π‘₯𝑝(βˆ’πœ‚)]2 𝑒π‘₯ 𝑝(βˆ’πœ‚) (𝛽2 )
2
= 𝑓(π‘₯1 , π‘₯2 ; 𝛽) 𝛽2 [1 +
𝑒π‘₯𝑝(βˆ’πœ‚)
1
],
1 + 𝑒π‘₯𝑝(πœ‚)
1
because 1+𝑒π‘₯𝑝(βˆ’πœ‚) = 1+𝑒π‘₯𝑝(πœ‚).
Case 2: Inclusion of nonlinear transformations of the covariates.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯22 , then:
πœ•π‘“(.)
𝑒π‘₯𝑝(πœ‚)
𝑒π‘₯𝑝(πœ‚)
ME2: ( πœ•π‘₯ ) = 1+𝑒π‘₯𝑝(βˆ’πœ‚) (𝛽2 + 2𝛽3 π‘₯2 ) + [1+𝑒π‘₯𝑝(βˆ’πœ‚)]2 ex p(βˆ’πœ‚) (𝛽2 + 2𝛽3 π‘₯2 )
2
1
= 𝑓(π‘₯1 , π‘₯2 ; 𝛽)(𝛽2 + 2𝛽3 π‘₯2 ) [1 + 1+𝑒π‘₯𝑝(πœ‚)].
16
Case 3: Inclusion of an interaction term between covariates.
Consider an interaction term between π‘₯1 and π‘₯2 such that πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 +
Ξ²3 (x2 βˆ— x1 ). Then:
πœ•π‘“(.)
𝑒π‘₯𝑝(πœ‚)
𝑒π‘₯𝑝(πœ‚)
ME2: ( πœ•π‘₯ ) = 1+𝑒π‘₯𝑝(βˆ’πœ‚) (𝛽2 + 𝛽3 π‘₯1 ) + [1+𝑒π‘₯𝑝(βˆ’πœ‚)]2 ex p(βˆ’πœ‚) (𝛽2 + 𝛽3 π‘₯1 )
2
1
= 𝑓(π‘₯1 , π‘₯2 ; 𝛽)(𝛽2 + 𝛽3 π‘₯1 ) [1 + 1+𝑒π‘₯𝑝(πœ‚)].
Case 4: Linearity in the covariates with a discrete covariate.
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 and assume that π‘₯1 is binary. Then:
ME1: 𝑓(π‘₯1 , π‘₯2 ; 𝛽)|π‘₯1 =1 βˆ’ 𝑓(π‘₯1 , π‘₯2 ; 𝛽)|π‘₯1 =0
𝑒π‘₯𝑝(𝛽 +𝛽1 +𝛽2 π‘₯2 )
0
= 1+𝑒π‘₯𝑝(βˆ’π›½
0 βˆ’π›½1 βˆ’π›½2 π‘₯2 )
𝑒π‘₯𝑝(𝛽 +𝛽2 π‘₯2 )
0
βˆ’ 1+𝑒π‘₯𝑝(βˆ’π›½
.
0 βˆ’π›½2 π‘₯2 )
Case 5: Inclusion of an interaction term with a discrete covariate
Let πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = Ξ²0 + Ξ²1 x1 + Ξ²2 x2 + Ξ²3 (x2 βˆ— x1 ) and assume that π‘₯1 is binary. Then:
ME1: 𝑓(π‘₯1 , π‘₯2 ; 𝛽)|π‘₯1 =1 βˆ’ 𝑓(π‘₯1 , π‘₯2 ; 𝛽)|π‘₯1 =0
𝑒π‘₯𝑝(𝛽 +𝛽1 +𝛽2 π‘₯2 +𝛽3 π‘₯2 )
0
= 1+𝑒π‘₯𝑝(βˆ’π›½
0 βˆ’π›½1 βˆ’π›½2 π‘₯2 βˆ’π›½3 π‘₯2 )
𝑒π‘₯𝑝(𝛽 +𝛽2 π‘₯2 )
0
βˆ’ 1+𝑒π‘₯𝑝(βˆ’π›½
.
0 βˆ’π›½2 π‘₯2 )
For the next two cases (cases 6 and 7), let the predictor be given by:
πœ‚(π‘₯1 , π‘₯2 ; 𝛽) = 𝛽0 + 𝛽1 π‘₯1 + 𝛽2 π‘₯2 + 𝛽3 (π‘₯2 βˆ— π‘₯1 ).
Case 6: Interaction effects with continuous covariates.
ME21: \(\frac{\partial}{\partial x_1}[ME_2] = \big[ME_1(\beta_2 + \beta_3 x_1) + f(x_1,x_2;\beta)\,\beta_3\big]\left[1 + \frac{1}{1+\exp(\eta)}\right] - \frac{f(x_1,x_2;\beta)(\beta_2 + \beta_3 x_1)}{[1+\exp(\eta)]^2}\,\exp(\eta)(\beta_1 + \beta_3 x_2),\)
where ME1: \(\frac{\partial f(\cdot)}{\partial x_1} = f(x_1,x_2;\beta)(\beta_1 + \beta_3 x_2)\left[1 + \frac{1}{1+\exp(\eta)}\right].\)
Case 7: Interaction effects with a continuous and discrete covariate.
Let π‘₯1 be binary again, making πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2 π‘₯2 + 𝛽3 π‘₯2 and πœ‚0 = 𝛽0 + 𝛽2 π‘₯2. Then:
ME21: \([ME_2]|_{x_1=1} - [ME_2]|_{x_1=0} = \frac{\exp(\eta_1)}{1+\exp(-\eta_1)}\,(\beta_2+\beta_3)\left[1 + \frac{1}{1+\exp(\eta_1)}\right] - \frac{\exp(\eta_0)}{1+\exp(-\eta_0)}\,\beta_2\left[1 + \frac{1}{1+\exp(\eta_0)}\right].\)
Case 8: Interaction effects for predictor linear in the coefficients and covariates.
An interesting case is when there are no nonlinear transformations or interaction terms, i.e.: πœ‚(π‘₯1, π‘₯2; 𝛽) = 𝛽0 + 𝛽1π‘₯1 + 𝛽2π‘₯2. Again let π‘₯1 be binary, πœ‚1 = 𝛽0 + 𝛽1 + 𝛽2π‘₯2, and πœ‚0 = 𝛽0 + 𝛽2π‘₯2. Then:
ME21: \([ME_2]|_{x_1=1} - [ME_2]|_{x_1=0} = \frac{\exp(\eta_1)}{1+\exp(-\eta_1)}\,\beta_2\left[1 + \frac{1}{1+\exp(\eta_1)}\right] - \frac{\exp(\eta_0)}{1+\exp(-\eta_0)}\,\beta_2\left[1 + \frac{1}{1+\exp(\eta_0)}\right].\)
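A sketch of the Case 1 two-part ME2 (hypothetical coefficients, with the same predictor Ξ· in both parts, as assumed above), checked by finite difference:

```python
import numpy as np

# Hypothetical shared predictor: eta = b0 + b1*x1 + b2*x2
b0, b1, b2 = -0.2, 0.6, 0.3

def two_part_mean(x1, x2):
    eta = b0 + b1 * x1 + b2 * x2
    pr_pos = np.exp(eta) / (1.0 + np.exp(eta))   # Pr(y = 1)
    e_pos = np.exp(eta)                          # E(y | y > 0)
    return pr_pos * e_pos                        # = exp(eta) / (1 + exp(-eta))

x1, x2, h = 1.0, 0.5, 1e-5
eta = b0 + b1 * x1 + b2 * x2
f = two_part_mean(x1, x2)

# Case 1: ME2 = f * b2 * [1 + 1/(1 + exp(eta))]
me2 = f * b2 * (1.0 + 1.0 / (1.0 + np.exp(eta)))
me2_fd = (two_part_mean(x1, x2 + h) - two_part_mean(x1, x2 - h)) / (2 * h)
print(me2, me2_fd)
```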
8. Sample Selection Model
The Heckman model with a log linear dependent variable[4, 5] is specified below:
ln 𝑦 = 𝑋 β€² 𝛽 + π›½πœ† πœ†(π‘Š β€² 𝛾) + πœ€
πœ€π‘– ~π‘‘π‘–π‘ π‘‘π‘Ÿ. (𝑖𝑖𝑑), 𝐸(πœ€π‘– ) = 0, π‘£π‘Žπ‘Ÿ(πœ€π‘– ) = 𝜎 2 , π‘Š βˆ‹ 𝑋
𝑦 = exp( 𝑋 β€² 𝛽 + π›½πœ† πœ†(β‹…) + πœ€)
\[
E(y) = E\big(e^{X'\beta + \beta_\lambda \lambda(\cdot) + \varepsilon}\big)
\]
\[
E(y) = e^{X'\beta}\, e^{\beta_\lambda \lambda(\cdot)}\, E(e^{\varepsilon}|x)
\]
The general case of the marginal effect of the log linear formulation is derived as follows[6, 7]:
πœ•πΈ(𝑦) πœ•(𝑒 𝑋′𝛽 𝑒 π›½πœ†πœ†(β‹…) 𝐸(𝑒 πœ€ |π‘₯))
=
πœ•π‘₯π‘˜
πœ•π‘₯π‘˜
πœ•πΈ(𝑦)
πœ•πΈ(𝑒 πœ€ |π‘₯)
= 𝑒 𝑋′𝛽 𝑒 π›½πœ†πœ†(β‹…) (π›½π‘˜ 𝐸(𝑒 πœ€ |π‘₯) + π›Ύπ‘˜ π›½πœ† (βˆ’πœ†(β‹…)2 βˆ’ 𝛾π‘₯π‘˜ πœ†(β‹…)𝐸(𝑒 πœ€ |π‘₯)) +
)
πœ•π‘₯π‘˜
πœ•π‘₯π‘˜
The formulation above is for the general case in which π‘₯π‘˜ is a continuous variable and appears in
both the main regression model and the IMR function. The ME function will differ according to
the scenarios represented by the cases defined in the main article.
If the error is homoskedastic in π‘₯π‘˜, as assumed in the specification above, then
\[
\frac{\partial E(y)}{\partial x_k} = e^{X'\beta}\, e^{\beta_\lambda \lambda(\cdot)}\left(\beta_k E(e^{\varepsilon}|x) + \gamma_k \beta_\lambda\big({-\lambda(\cdot)^2} - (W'\gamma)\lambda(\cdot)\big) E(e^{\varepsilon}|x)\right).
\]
If the error is heteroskedastic in π‘₯π‘˜, then
\[
\frac{\partial E(y)}{\partial x_k} = e^{X'\beta}\, e^{\beta_\lambda \lambda(\cdot)}\left(\beta_k E(e^{\varepsilon}|x) + \gamma_k \beta_\lambda\big({-\lambda(\cdot)^2} - (W'\gamma)\lambda(\cdot)\big) E(e^{\varepsilon}|x) + \frac{\partial E(e^{\varepsilon}|x)}{\partial x_k}\right).
\]
The β€˜naïve’ estimate is appropriate to use in retransformation when the error structure is homoskedastic (i.e., a log-normal distribution is appropriate). Duan proposed a smearing estimator that does not require a log-normality assumption[8]. Of note, the smearing estimator may differ across patient subgroups, in which case a subgroup-specific smearing estimator would be required[4].
- Naïve estimate (normal distribution of Ξ΅i is assumed):
\[
\hat{E}(y) = \exp\!\left(X'\beta + \beta_\lambda \lambda(\cdot) + \frac{\sigma^2}{2}\right)
\]
- Smearing estimate (normal distribution of Ξ΅i is not needed):
\[
\hat{E}(y) = e^{X'\hat{\beta}} \times \frac{1}{n}\sum_{i=1}^{n} e^{\hat{\varepsilon}_i}
\]
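The two retransformations can be compared in a short sketch with simulated log-scale residuals; the value of X'Ξ² + Ξ²_λλ(Β·) is collapsed into a single hypothetical number for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
index = 0.8                            # hypothetical value of X'beta + beta_lambda*lambda(.)
sigma = 0.5
resid = rng.normal(0.0, sigma, n)      # estimated residuals on the log scale

# Naive (normal-theory) retransformation: E(y) = exp(index + sigma^2 / 2)
e_y_naive = np.exp(index + sigma ** 2 / 2)

# Duan's smearing retransformation: E(y) = exp(index) * mean(exp(residuals))
e_y_smear = np.exp(index) * np.mean(np.exp(resid))

print(e_y_naive, e_y_smear)   # close here because the simulated residuals really are normal
```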
References (Technical Appendix)
1. Bergtold JS, Onukwugha E. The probabilistic reduction approach to specifying multinomial logistic regression models in health outcomes research. Journal of Applied Statistics. 2014;41(10):2206-21.
2. Wooldridge J. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press; 2002.
3. Ishak KJ, Kreif N, Benedict A, Muszbek N. Overview of parametric survival analysis for health-economic applications. Pharmacoeconomics. 2013 Aug;31(8):663-75.
4. Manning WG. The logged dependent variable, heteroscedasticity, and the retransformation problem. Journal of Health Economics. 1998;17(3):283-95.
5. Dow WH, Norton EC. Choosing between and interpreting the Heckit and two-part models for corner solutions. Health Services and Outcomes Research Methodology. 2003;4(1):5-18.
6. Greene W. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Prentice Hall; 2012.
7. Vance C. Marginal effects and significance testing with Heckman's sample selection model: a methodological note. Applied Economics Letters. 2009;16(14):1415-9.
8. Duan N. Smearing estimate: a nonparametric retransformation method. Journal of the American Statistical Association. 1983;78(383):605-10.