Simultaneous Equation Models with Endogenous Limited Dependent Variables: Efficiency of Alternative Estimators (1)

Giorgio Calzolari
Università di Firenze, Department of Statistics “G. Parenti”

Antonino Di Pino
Università di Messina, Department “V. Pareto”

Abstract: This study considers a simultaneous equation model with two endogenous limited dependent variables characterized by an identical selection mechanism. The FIML procedure proposed by Poirier and Ruud (1981) for a single-equation switching model is extended to estimate two distinct simultaneous equations (for instance, wage and reservation wage equations) jointly with the selection function. The efficiency of the FIML approach relative to the Two-Stage Heckman approach is assessed by a Monte Carlo experiment that takes endogeneity into account and assumes different distributions of the error terms. Simulation results show that, if the distributional assumptions of normality and homoskedasticity of the error terms are relaxed, the two parametric procedures mentioned above produce finite- and large-sample bias. A semiparametric procedure such as the Two-Stage CLAD (Khan and Powell, 2001), however, produces consistent estimates for the equation referring to the regime with more uncensored cases. In an empirical application on a sample of Italian graduates, the results are quite similar for the equation with more uncensored data (the wage equation), while FIML estimates differ markedly from the T-S Heckman estimates in the equation in which most cases are censored (the reservation wage equation).

Keywords: Selection Bias, Endogenous Switching, Semiparametric Regression

(1) A. Di Pino has benefited from financial support of MIUR - National Research Project (PRIN 20082010)

1. Introduction: estimation of a regression model with a two-regime specification

We consider a simultaneous equation model with two equations (Eq. (1) and Eq.
(2)) whose dependent variables (for instance, individual wage and reservation wage taken from a cross-sectional survey sample) are both partially observed, or "limited", as a consequence of a selection mechanism that does not allow them to be observed together. In particular, observing one of the two dependent variables precludes observing the other. The selection mechanism may be specified by a third equation (Eq. (3)) whose dependent variable is a binary “indicator” that produces two different regimes given, in this study, by the working status of the subject [1]. Furthermore, we can consider that the choice of a subject to work or not is influenced by both wage and reservation wage, with opposite effects.

The two-regime characteristic suggests specifying the model as an endogenous two-regime “switching” regression model (Maddala and Nelson, 1975; Lee, 1978; Poirier and Ruud, 1981; Heckman, 1990; inter alia). In this context, we may adopt different approaches to estimate the model: i) a parametric Two-Stage procedure, here called T-S Heckman, generally used to estimate regression models with a limited dependent variable (Heckman, 1976; Lee, 1978; Lee, Maddala and Trost, 1980); ii) a parametric Maximum Likelihood (FIML) approach used to estimate regression models with endogenous switching (Poirier and Ruud, 1981); iii) a Two-Stage Censored Least Absolute Deviations (CLAD) estimator, a semiparametric approach that estimates censored models without imposing distributional assumptions on the error terms (Khan and Powell, 2001). The purpose of our analysis is to compare the performance of the three estimation methods by means of a detailed set of Monte Carlo experiments.
Using simulated data, we analyze the finite sample bias of the estimation results and the relative efficiency of the estimation methods when the error terms of the selection criterion equation are correlated with the error terms of the individual wage and reservation wage equations (endogeneity). Simulated data are produced with different distributions of the error terms. Particularly interesting is the case where the usual assumption of homoskedasticity is relaxed. To conclude, an empirical application is proposed on individual wage and reservation wage equations, using a sample of Italian graduates.

The next section reports a brief survey of estimation methods for endogenous switching models. In the third section, we provide a stochastic specification of, respectively, the maximum likelihood (FIML), T-S Heckman and semiparametric T-S CLAD estimators, taking into account the case in which two different limited dependent variables, explained by potentially different regressors, are observed in each regime. The fourth section discusses the results of simulations based on the parametric and semiparametric procedures mentioned above. The fifth section reports the results of an empirical application estimating both wage and reservation wage functions on a sample of Italian graduates. The sixth section reports final observations and remarks.

[1] Another relevant application of a two-regime approach is a model that simultaneously estimates labour income equations for subjects who attend college after high school and for subjects who decide to enter the labour market directly. The choice to attend college or not is considered as a binary selection mechanism. An extensive discussion of the problem can be found in Carneiro, Hansen and Heckman (2003).

2. Conceptual framework

We may consider the binary dependent variable of the selection criterion equation as analogous to a selection "rule": if the dependent variable is one, the individual is employed, and if it is zero the individual is unemployed. However, the assignment of an individual to working or non-working status follows a “nonrandom” criterion, because the selection criterion depends on the outcome (the wage, for instance). In this context, the effect of the explanatory variables on the two limited dependent variables in the two regimes can be defined as "endogenous switching". Several ways have been suggested to model this causal relationship in an endogenous switching framework.

In an article of 1974, Maddala and Nelson discuss the methodological issues of a model where an endogenous selection criterion produces a two-regime regression model. The problem consists in estimating demand and supply schedules in disequilibrium markets. In this context, the demanded quantity and the supplied quantity of a commodity during period t are not observed; only the exchanged quantity and the market price during the period are observed. The price difference observed during the period represents the selection criterion for the two regimes. The price rises if demand is higher than supply, and we can assume that the observed exchanged quantity is equal to the supplied quantity. Analogously, the price falls if the demanded quantity is lower than the supplied quantity, and in this case we can assume that the demanded quantity is observed. If we consider demanded quantity and supplied quantity as different dependent variables explained by two distinct regression equations, the Maddala-Nelson specification of the demand-supply model does not allow both to be observed at the same time, but only one of them.
The authors suggest adopting a maximum likelihood estimator that uses the conditional densities of the observed exchanged quantity (normally distributed), given that it equals the demanded quantity or the supplied quantity.

Schmidt (1978) and Lee (1978) propose two different estimation methods to explain the wage difference between unionized and non-unionized workers. Schmidt adopts a maximum likelihood estimator given by the product of the marginal densities of two variables that are assumed to be independent of one another: 1) the error term of the wage equation, and 2) the difference between the expected wage of a unionized worker and the expected wage of a non-unionized worker. Lee (1978) suggests a Two-Stage approach in which, at the first stage, an indicator equation, given by a union membership function, is estimated by a reduced-form Probit regression. At the second stage, the two wage equations, for unionized and non-unionized workers respectively, are estimated by OLS. Both OLS regressions are corrected for selectivity by including as a regressor the estimated inverse Mills ratio, provided by the first-stage Probit regression, in both the unionized and the non-unionized workers' equations. This Two-Stage method (known as the Probit Two-Stage method) is similar to the procedure introduced by Heckman to obtain a consistent and computationally feasible estimate of limited dependent variable equations (Heckman, 1976).

However, in the Two-Stage approach to two-regime regression models, the selection function may also be estimated by a reduced-form Tobit regression. This is the case, for instance, in the estimation of the education return model. In this context, the desired years of college education define the two-regime selection criterion. This variable is not dichotomous, but “limited”, and when it is equal to zero, it determines the regime in which the education return does not depend on college attendance (Kenny, Lee, Maddala and Trost, 1979).
In this case, we can estimate the selection criterion equation at the first stage by a reduced-form Tobit regression. The predicted values of the first-stage estimation serve to obtain the inverse Mills ratio estimates, which are used at the second stage as a correction term, as in the T-S Heckman model, for the OLS estimation of the education return for subjects with college attendance.

We focus our attention on the theoretical approach suggested in Poirier and Ruud (1981). The authors show that we have an endogenous switching model only if the specification of the regression equation in the two regimes is related to the expected value of the dependent variable in each regime. Applied to our example, this implies that the individual reservation wage level is given by the subject's evaluation of her/his own qualities and endowment, and is not conditional on her/his non-working status. Analogously, labour income or wage depends on individual endowment, education and exogenous characteristics of the labour market, but does not depend on the specific employment condition of the subject. In this context, a FIML estimation procedure can be based on a likelihood function given by the product of the marginal probabilities that subjects earn a wage or, alternatively, desire a reservation wage. The Poirier-Ruud assumption has consequences for the identifiability of the model. The main consequence is that at least one of the variables conditioning the selection criterion should be independent of the outcome variables (wage and reservation wage). A similar identification criterion is required if a T-S Heckman procedure is applied.

Powers (1993) applies endogenous switching regression estimation methods (including the Poirier-Ruud FIML procedure) to estimate the effect of family structure on early marriage and fertility among women.
The author hypothesizes that the binary variable indicating whether or not a young woman reported giving birth to a child before her 20th birthday can be specified by an equation model characterized by two regimes: the first, in which the woman lived her childhood and adolescence apart from one or both biological parents or in a context of economic deprivation; the second, in which the woman grew up in a family where both biological parents were present.

However, the validity of a FIML estimator, and of a parametric method in general, requires a correct specification of the error distribution. Misspecification of the distribution of the error terms in the model equations, such as heteroskedasticity or non-normality, may make FIML estimates inconsistent. To circumvent this problem, several semiparametric alternatives for the consistent estimation of censored regressions have been proposed in recent decades. A well-known semiparametric approach is the Censored Least Absolute Deviations (CLAD) estimation method, proposed by Powell (1984). CLAD is a root-n-consistent estimator that obtains the regression coefficients by minimizing the sum of absolute residuals. Adopting this loss criterion amounts to assuming that the median of the error distribution is equal to zero. Under this assumption, the median of the dependent variable, $y$, of the regression equation $y = x'\varphi + \eta$ is the regression function $x'\hat{\varphi}$, as in median (quantile) regression. The distributional assumption of "zero median" in the error terms is not very restrictive, and permits non-normal, heteroskedastic and asymmetric errors. However, censored LAD (or MAD [2]) regression works only if the uncensored part of the error distribution includes the median (equal to zero). This implies that a semiparametric CLAD estimator can be adopted only if the censored cases are fewer than the uncensored cases.
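To make the censored-LAD idea concrete, the following is a minimal Python sketch, not the implementation used in this study. It assumes a single regressor and censoring at zero, and it swaps in a simple iterative trimming scheme (refitting a median regression on the observations whose fitted index is positive), with the median regression itself approximated by iteratively reweighted least squares. The function names `lad_irls` and `clad`, the simulated design and all numeric settings are illustrative assumptions.

```python
import numpy as np

def lad_irls(X, y, n_iter=100, eps=1e-6):
    """Approximate median (LAD) regression by iteratively reweighted least squares."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS starting values
    for _ in range(n_iter):
        w = 1.0 / np.maximum(np.abs(y - X @ b), eps)  # weights 1/|residual|
        Xw = X * w[:, None]
        b = np.linalg.solve(X.T @ Xw, Xw.T @ y)       # weighted least squares step
    return b

def clad(X, y, n_outer=20):
    """Censored LAD for y = max(0, x'b + e): refit LAD where the fitted index is positive."""
    b = lad_irls(X, y)
    for _ in range(n_outer):
        keep = X @ b > 0                              # observations with positive index
        b_new = lad_irls(X[keep], y[keep])
        if np.allclose(b_new, b, atol=1e-6):
            break
        b = b_new
    return b

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic, heavy-tailed errors with zero median (Student-t, 3 d.o.f.)
e = (0.5 + np.abs(x)) * rng.standard_t(3, size=n)
y = np.maximum(0.0, 1.0 + 2.0 * x + e)                # censoring at zero

b_clad = clad(X, y)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]          # ignores censoring
print("share censored:", np.mean(y == 0.0))
print("CLAD (intercept, slope):", b_clad)
print("OLS on censored data   :", b_ols)
```

In this design fewer than half of the observations are censored, which is the condition stated above for the median to remain identified. The true coefficients in the sketch are 1 and 2, so the printed CLAD and OLS estimates can be compared against them.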
As a consequence, in our two-regime model, only the equation referring to the subsample with more observations can be estimated using the semiparametric approach. Therefore, the estimation properties of parametric and semiparametric procedures can be compared only in the regime with the higher number of observations.

Regarding the inferential properties of the CLAD estimator, Paarsch (1984) and Moon (1989) show that the finite sample distribution of CLAD is biased, markedly so when the share of censored observations is high. To solve this problem, Khan and Powell (2001) suggest a two-step version of the CLAD estimator that corrects the bias in finite samples. At the first step, the authors adopt a semiparametric or nonparametric procedure to select the observations with a positive value of the regression function; at the second step, a median (MAD or LAD) regression is performed on the selected observations. In the first stage of the Khan-Powell procedure, a problem arises from the computational difficulty of estimating nonparametrically functions with high-dimensional arguments. To circumvent this difficulty, Khan (2005) and Blevins and Khan (2009) suggest using, for the estimation of the selection function, a Nonlinear Least Squares “Sieve” estimator that estimates the selection criterion function semiparametrically. The characteristics of this estimator will be briefly discussed in the next section.

[2] MAD: Median Absolute Deviation

Parametric and semiparametric approaches to the estimation of two-regime simultaneous equations are compared here with reference to their finite-sample inferential properties by means of a Monte Carlo experiment. FIML and T-S Heckman can be compared in order to evaluate estimation performance in both regimes.
The model adopted here, however, has a different stochastic specification from previous versions of the endogenous-switching model, because in the two regimes two limited dependent variable equations with potentially different explanatory variables are specified. For this reason, the next section briefly illustrates the theoretical issues and the stochastic specification of, respectively, the Poirier-Ruud FIML estimator, the Heckman Two-Stage (or Probit Two-Stage) approach, and the Khan-Powell semiparametric procedure.

3. Stochastic specification of the model and estimation procedures

We first provide a generic specification of the model. Consider a three-equation model in which wage, reservation wage and the individual propensity to participate in the labour market are estimated simultaneously:

$$W = \begin{cases} x_1'\alpha + u & \text{if } L = 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

$$RW = \begin{cases} x_2'\beta + v & \text{if } L = 0 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

$$L = \begin{cases} 1 & \text{if } L^* > 0 \\ 0 & \text{otherwise} \end{cases}, \qquad L^* = z'\gamma + \varepsilon \tag{3}$$

Moreover, if $L = 1$, $\varepsilon > -z'\gamma$; if $L = 0$, $\varepsilon \le -z'\gamma$.

The variable $W$ is a vector of $n_1$ elements, the wages and salaries earned by the $n_1$ employed people. The variable $RW$ is a vector of $n_2$ elements, the reservation wages desired by the $n_2$ subjects who are unemployed and looking for a job. The binary variable $L$ is a vector of $n = n_1 + n_2$ elements, composed of $n_1$ elements equal to one and $n_2$ elements equal to zero. Moreover, $x_1'$, $x_2'$ and $z'$ are row vectors, respectively, of the three matrices of exogenous variables $X_1$, $X_2$ and $Z$. Some exogenous explanatory variables on the right-hand side of each equation can be common to the three equations, but the following two identification conditions must hold: i) at least one of the regressors of $L$ (included in $Z$) must be independent of $W$ and $RW$, and ii) at least one of the regressors of, respectively, $W$ and $RW$ must not appear in the equation of $L$.
The error terms $u$, $v$ and $\varepsilon$ are assumed to be normally distributed with covariance matrix given by:

$$\Sigma = \begin{pmatrix} \sigma_u^2 & 0 & \sigma_{u\varepsilon} \\ 0 & \sigma_v^2 & \sigma_{v\varepsilon} \\ \sigma_{u\varepsilon} & \sigma_{v\varepsilon} & 1 \end{pmatrix}. \tag{4}$$

If the covariances $\sigma_{u\varepsilon}$ and $\sigma_{v\varepsilon}$ are assumed to be different from zero, the indicator variable $L$ can be considered as non-exogenous with respect to $W$ and $RW$. Consequently, the "change" of regime from $L = 0$ to $L = 1$ should be considered as an "endogenous" switching. Given the characteristics of the model, the rationale of the three estimation procedures previously discussed is briefly reported here.

3.1 The FIML estimator

We start by considering the random variables $u$, $v$ and $\varepsilon$ defined above. By the well-known properties of density functions and of conditional probability we have:

$$\varphi(u, \varepsilon) = \varphi(u)\,\varphi(\varepsilon \mid u) \quad \text{and} \quad \varphi(v, \varepsilon) = \varphi(v)\,\varphi(\varepsilon \mid v).$$

The marginal distributions are:

$$P(u) = \int_{-z'\gamma}^{+\infty} \varphi(u, \varepsilon)\,d\varepsilon = \int_{-\infty}^{z'\gamma} \varphi(u, \varepsilon)\,d\varepsilon \quad \text{and} \quad P(v) = \int_{-\infty}^{-z'\gamma} \varphi(v, \varepsilon)\,d\varepsilon = \int_{z'\gamma}^{+\infty} \varphi(v, \varepsilon)\,d\varepsilon \tag{5}$$

Furthermore, by substituting $L = 1$ or $L = 0$ into Eq. (3), we have, respectively:

$$P(u \mid u > 0) = P(\varepsilon > -z'\gamma) = 1 - \Phi(-z'\gamma) = \Phi(z'\gamma) \quad \text{if } L = 1 \tag{6a}$$

$$P(v \mid v \le 0) = P(\varepsilon \le -z'\gamma) = \Phi(-z'\gamma) = 1 - \Phi(z'\gamma) \quad \text{if } L = 0 \tag{6b}$$

If we consider the error terms $u$ and $v$ normally distributed, the p.d.f.
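of $u \mid u > 0$ and of $v \mid v \le 0$ are, respectively:

$$\varphi(u) = \frac{\dfrac{1}{\sigma_u}\,\phi\!\left(\dfrac{W - x_1'\alpha}{\sigma_u}\right)}{\Phi(z'\gamma)} \ \text{ if } L = 1, \quad \text{and} \quad \varphi(v) = \frac{\dfrac{1}{\sigma_v}\,\phi\!\left(\dfrac{RW - x_2'\beta}{\sigma_v}\right)}{1 - \Phi(z'\gamma)} \ \text{ if } L = 0 \tag{7}$$

where $\phi$ and $\Phi$ denote the standard normal density and distribution function. Furthermore, using the conditional density of a bivariate normal distribution and assuming

$$\mathcal{L}(u, v, \varepsilon) = \prod_{L=1} \varphi(u, \varepsilon) \prod_{L=0} \varphi(v, \varepsilon) = \prod_{L=1} \varphi(u) \int \varphi(\varepsilon \mid u \ge 0) \prod_{L=0} \varphi(v) \int \varphi(\varepsilon \mid v \le 0)$$

the likelihood function to be maximized is:

$$\prod_{L=1} \frac{1}{\sigma_u}\,\phi\!\left(\frac{W - x_1'\alpha}{\sigma_u}\right) \Phi\!\left(\frac{z'\gamma + \dfrac{\sigma_{u\varepsilon}}{\sigma_u^2}\,(W - x_1'\alpha)}{\left(1 - \sigma_{u\varepsilon}^2/\sigma_u^2\right)^{1/2}}\right) \cdot \prod_{L=0} \frac{1}{\sigma_v}\,\phi\!\left(\frac{RW - x_2'\beta}{\sigma_v}\right) \left[1 - \Phi\!\left(\frac{z'\gamma + \dfrac{\sigma_{v\varepsilon}}{\sigma_v^2}\,(RW - x_2'\beta)}{\left(1 - \sigma_{v\varepsilon}^2/\sigma_v^2\right)^{1/2}}\right)\right] \tag{8}$$

This function differs from the Poirier-Ruud one because different dependent variables are observed in the two regimes ($W$ and $RW$, respectively), and the explanatory variables may differ between the two equations.

3.2 The Two-Stage Heckman (or Probit Two-Stage) estimator

Given the popularity of the T-S Heckman approach to the estimation of models with selectivity and its wide use in empirical studies, here we only briefly describe the specification of the T-S Heckman model applied to the estimation of two-regime simultaneous equations (Lee, 1978; Lee, Maddala and Trost, 1980; Heckman, Tobias and Vytlacil, 2003). Starting from the basic model above, this estimation method consists of a two-stage procedure. The first stage is a Probit estimation of Eq. (3), which gives $z'\hat{\gamma}$ under the normalization $\sigma_\varepsilon^2 = 1$. Then, conditional on working status, $L = 1$, the expected value of the wage is derived from Eq. (1):

$$E(W \mid L = 1) = x_1'\alpha + E(u \mid L = 1) \tag{9}$$

$$E(W \mid L = 1) = x_1'\alpha + E(u \mid \varepsilon > -z'\gamma) \tag{10}$$

If $u$ and $\varepsilon$ are assumed to be normally distributed, we have:

$$E(u \mid \varepsilon) = \frac{\sigma_{u\varepsilon}}{\sigma_\varepsilon^2}\,\varepsilon.$$

Assuming $\sigma_\varepsilon^2 = 1$ and considering the properties of the truncated normal distribution, we have:

$$E(W \mid L = 1) = x_1'\alpha + \sigma_{u\varepsilon}\,\frac{\phi(z'\gamma)}{\Phi(z'\gamma)} \tag{11}$$

Analogously, we obtain the expected value of the reservation wage from Eq.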
(2):

$$E(RW \mid L = 0) = x_2'\beta + \sigma_{v\varepsilon}\,\frac{\phi(z'\gamma)}{1 - \Phi(z'\gamma)} \tag{12}$$

The first-stage Probit estimates can be used to replace $\phi(z'\gamma)/\Phi(z'\gamma)$ with $\phi(z'\hat{\gamma})/\Phi(z'\hat{\gamma})$ in Eq. (11), and $\phi(z'\gamma)/(1 - \Phi(z'\gamma))$ with $\phi(z'\hat{\gamma})/(1 - \Phi(z'\hat{\gamma}))$ in Eq. (12). Then, the parameters $\alpha$ and $\beta$ and the error covariances $\sigma_{u\varepsilon}$ and $\sigma_{v\varepsilon}$ can be estimated consistently at a second stage by OLS using the equations:

$$W = x_1'\alpha + \sigma_{u\varepsilon}\,\frac{\phi(z'\hat{\gamma})}{\Phi(z'\hat{\gamma})} + \eta_u \quad \text{on employed people only} \tag{13}$$

and

$$RW = x_2'\beta + \sigma_{v\varepsilon}\,\frac{\phi(z'\hat{\gamma})}{1 - \Phi(z'\hat{\gamma})} + \eta_v \quad \text{on unemployed people only} \tag{14}$$

where $\eta_u$ and $\eta_v$ are iid random errors with zero mean. This estimation method is not generally efficient, but it is consistent. Lee, Maddala and Trost (1980) showed that the covariance matrix of the coefficients $\alpha$ and $\beta$ is underestimated if heteroskedasticity affects the first-stage Probit estimation. More recently, Heckman, Tobias and Vytlacil (2003) discussed the properties of treatment parameter estimators in a framework similar to a two-regime switching model when the assumption of a correctly specified error distribution is relaxed. In a Monte Carlo experiment, they observed that the treatment parameters estimated by the T-S Heckman procedure generally have small bias even if the error terms are generated by a heavy-tailed Student-t. But when data are generated from highly asymmetric distributions, such as the Chi Square (d.o.f. 3), the estimates show large bias.

3.3 The Two-Stage CLAD semiparametric estimator

The semiparametric estimator proposed by Khan and Powell consists of a two-step procedure. At the first step, the observations with a positive value of the regression function are selected using a nonparametric or semiparametric estimator of the selection equation (Eq. (3)) such as, for instance, Manski's Maximum Score estimator (Manski, 1985). This procedure, in particular, allows consistent estimation of Eq.
(3) as a binary choice model without imposing distributional constraints on the error terms. In particular, this estimator involves choosing $\hat{\gamma}$ to maximize the following objective function:

$$Q_n = \sum_{i=1}^{n} \left[ L_i \cdot \mathbf{1}(z_i'\hat{\gamma} \ge 0) + (1 - L_i) \cdot \mathbf{1}(z_i'\hat{\gamma} < 0) \right] \tag{15}$$

The estimator seeks to maximize the number of correct predictions [3]. However, this procedure is often computationally cumbersome, especially if a large number of variables are included as regressors.

[3] As an alternative to Manski's Maximum Score estimator, Horowitz (1992) proposed a version that includes a smooth function. This approach is computationally simpler than Manski's estimator.

To solve this problem, Khan (2005) shows that an alternative NLLS (Non Linear Least Squares) procedure under a conditional median restriction is observationally equivalent to both the Manski and the Horowitz estimators. This procedure, known as the Sieve-NLLS estimator, is obtained by minimizing the following function:

$$Q_n = \frac{1}{n} \sum_{i=1}^{n} \left[ L_i - \Phi\!\left( z_i'\hat{\gamma}^{*} \cdot \exp(l(z_i)) \right) \right]^2 \tag{16}$$

where $L_i$ is a binary response variable (0;1), $\Phi(\cdot)$ is the standard normal cumulative distribution function, $\gamma^{*}$ is a normalized vector $(1, \gamma)$ of parameters, and $\exp(l(z_i))$ is a “sieve” function [4]. The Monte Carlo results reported in Khan's study confirm the consistency and the finite-sample inferential properties of the Sieve-NLLS estimator [5]. A Sieve-NLLS procedure can then be applied to estimate, in reduced form, the selection (participation) function and consequently to select the observations to include at the second stage in the median regression.

In the present model, we can use the Sieve-NLLS estimation results to select, at a second stage, the observations with a "positive" index. In particular, if the number of observations conditional on $L = 1$ is higher than the number of observations conditional on $L = 0$, Eq. (1) (wage equation) can be estimated using the CLAD estimator at a second stage.
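The first step (estimate the selection index, then keep the observations with a positive index) can be sketched as follows. This is only an illustration under stated assumptions: for brevity it uses the maximum score objective of Eq. (15) with a grid search rather than the Sieve-NLLS estimator of Eq. (16), it assumes a single selection regressor with its slope normalized to one, and the function name `max_score`, the grid, the sample size and the error distribution are hypothetical choices.

```python
import numpy as np

def max_score(gamma0, z, L):
    """Eq. (15) divided by n: share of correct predictions, slope on z normalized to 1."""
    index = gamma0 + z
    return np.mean(L * (index >= 0) + (1 - L) * (index < 0))

rng = np.random.default_rng(1)
n = 20000
z = rng.normal(loc=1.0, scale=1.0, size=n)
# Heteroskedastic, non-normal selection error with zero median (illustrative)
eps = (0.5 + np.abs(z)) * rng.standard_t(3, size=n)
L = (-0.2 + z + eps > 0).astype(float)        # true gamma0 = -0.2, gamma1 = 1

grid = np.linspace(-2.0, 2.0, 401)            # candidate intercepts, step 0.01
scores = np.array([max_score(g, z, L) for g in grid])
gamma0_hat = grid[np.argmax(scores)]          # first maximizer on the grid

selected = gamma0_hat + z > 0                 # observations with a "positive" index
print("gamma0_hat:", gamma0_hat)
print("share of observations selected for the second step:", selected.mean())
```

The objective is a step function of the intercept, so a grid search is used instead of a gradient-based optimizer. With several regressors the grid becomes impractical, which is the computational burden mentioned above.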
In particular, using CLAD, we obtain the value of the coefficients $\hat{\alpha}$ that minimizes:

$$\hat{\alpha} = \arg\min_{\alpha} \frac{1}{n} \sum_{i=1}^{n} \left| W_i - \max(0, x_1'\alpha) \right| \tag{17}$$

This procedure, based upon the conditional median of $W_i$, is analogous to median regression (Powell, 1986):

$$\hat{\alpha} = \arg\min_{\alpha} \frac{1}{n} \sum_{i=1}^{n} \rho_\theta\!\left[ W_i - \max(0, x_1'\alpha) \right] \quad (\text{with } \theta = 0.5) \tag{18}$$

where $\rho_\theta(\cdot)$ is the "check function" introduced by Koenker and Bassett (1978), which is shorthand for the following equivalent expression:

$$\hat{\alpha} = \arg\min_{\alpha} \left[ \theta \sum_{W_i \ge x_1'\alpha} \left| W_i - x_1'\alpha \right| + (1 - \theta) \sum_{W_i < x_1'\alpha} \left| W_i - x_1'\alpha \right| \right] \tag{19}$$

[4] Consider a simple model with two regressors, $z_1$ and $z_2$; the “sieve” function is given by the following polynomial terms: $\exp(l_0 + l_1 z_1 + l_2 z_2 + l_3 z_1^2 + l_4 z_2^2 + l_5 z_1 z_2)$.

[5] However, we must consider that the coefficients of the selection criterion estimated by the Sieve-NLLS procedure, given by the normalized vector $\gamma^{*}$, are generally not comparable with the coefficients estimated by parametric methods.

In the next section we compare, on simulated data, the results of the Two-Stage Heckman, FIML and Two-Stage CLAD (Khan-Powell) estimation procedures.

4. Monte Carlo Results

The Monte Carlo investigation is motivated by the wish to compare the finite sample performance and inferential properties of the parametric and semiparametric approaches to the estimation of the model. All designs considered here are generated by the model equations (1), (2) and (3), each including a single regressor on its right-hand side. Each regressor ($x_1$, $x_2$ and $z$) has mean and variance equal to one. The intercept and slope coefficients and the elements of the error covariance matrix are set as follows:

$$\alpha_0 = 20,\ \alpha_1 = 5; \qquad \beta_0 = 20,\ \beta_1 = 5; \qquad \gamma_0 = -0.2,\ \gamma_1 = 1;$$

$$\sigma_u^2 = \sigma_v^2 = \sigma_\varepsilon^2 = 1 \quad \text{and} \quad \sigma_{u\varepsilon} = \sigma_{v\varepsilon} = 0.95$$

We impose a high correlation (95%) between the error terms of, respectively, Eq. (1) and Eq. (3), and Eq. (2) and Eq. (3), to simulate the presence of strong endogeneity.
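The design just described can be sketched in Python as follows; this is a hedged illustration, not the code used for the experiments. It assumes normally distributed, mutually independent regressors, and normal homoskedastic errors. One further assumption is worth flagging: with unit variances and both covariances equal to 0.95, a zero covariance between $u$ and $v$ does not yield a positive semi-definite 3×3 matrix (its determinant is $1 - 2 \cdot 0.95^2 < 0$), so the sketch builds $u$ and $v$ each from $\varepsilon$ plus independent noise. This reproduces the two stated covariances and lets the covariance between $u$ and $v$ be whatever that construction implies; since $W$ and $RW$ are never observed together, that covariance does not enter the observed data for any single individual. The names `simulate`, `ALPHA`, `BETA`, `GAMMA` and `RHO` are hypothetical.

```python
import numpy as np

ALPHA = (20.0, 5.0)    # intercept and slope of Eq. (1)
BETA = (20.0, 5.0)     # intercept and slope of Eq. (2)
GAMMA = (-0.2, 1.0)    # intercept and slope of Eq. (3)
RHO = 0.95             # covariance of u and of v with eps (unit variances)

def simulate(n, rng):
    """Draw one sample from Eqs. (1)-(3) with normal, homoskedastic errors."""
    x1, x2, z = rng.normal(1.0, 1.0, size=(3, n))      # assumed independent N(1, 1)
    eps = rng.normal(size=n)
    s = np.sqrt(1.0 - RHO**2)
    u = RHO * eps + s * rng.normal(size=n)             # Var(u) = 1, Cov(u, eps) = 0.95
    v = RHO * eps + s * rng.normal(size=n)             # Var(v) = 1, Cov(v, eps) = 0.95
    L = (GAMMA[0] + GAMMA[1] * z + eps > 0).astype(int)
    W = np.where(L == 1, ALPHA[0] + ALPHA[1] * x1 + u, 0.0)    # observed only if L = 1
    RW = np.where(L == 0, BETA[0] + BETA[1] * x2 + v, 0.0)     # observed only if L = 0
    return {"x1": x1, "x2": x2, "z": z, "L": L, "W": W, "RW": RW,
            "u": u, "v": v, "eps": eps}

sample = simulate(1000, np.random.default_rng(42))
print("share with L = 0:", 1 - sample["L"].mean())
```

Under these assumptions the latent index $L^*$ is normal with mean 0.8 and variance 2, so the share of observations with $L = 0$ should be a little under 30%.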
Three different error distributions are considered: Normal(0;1); Student-t (0;1) (d.o.f. = 3); Log-Normal (0;1). Moreover, the experiments conducted under these distributional assumptions are replicated for three distinct cases in which, respectively, homoskedasticity, heteroskedasticity and "strong" heteroskedasticity in the error terms of all three equations are simulated. Heteroskedasticity is produced by imposing on the error term of each equation a variance proportional to the absolute value, |x1|, |x2| and |z|, of the regressor; for strong heteroskedasticity, the variance is proportional to an exponential function of the regressor: exp( ). The simulation experiments consist of 10000 replications of the model estimation for randomly generated samples of size 1000. We use mean bias and root mean-square error (RMSE) to compare the performance of each estimator: T-S Heckman, FIML and T-S CLAD (Khan-Powell). Censoring in the estimation of Eq. (1) is reproduced by setting to 30% the percentage of cases in which the selection response variable is equal to zero (L = 0). Symmetrically, the censored observations in the estimation of Eq. (2) are equal to 70%. In the semiparametric estimation of the coefficients of Eq. (3) we impose γ1 = 1 as the normalization criterion; in this way the normalized vector of parameters is equal to the vector of the "true" parameters, and the Sieve-NLLS estimate of the coefficient γ0 is comparable with the results of the parametric procedures.

Simulation results are reported extensively in the Appendix. In the first experiment, homoskedasticity is assumed (Table A1). If the error distribution is normal, the bias in the estimated coefficients of Eq. (1) and Eq. (2) is substantially negligible for all three estimation methods, but the FIML procedure performs better than T-S Heckman and T-S CLAD in terms of relative efficiency.
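The bias and RMSE criteria used in these experiments can be illustrated with a small Python sketch for the T-S Heckman estimator of Eq. (1) under normal, homoskedastic errors. It is a simplified stand-in rather than the code behind Table A1: it uses 200 replications instead of 10000, assumes normally distributed independent regressors, and fits the Probit by Fisher scoring. All function names (`probit`, `heckman_wage`, `one_sample`, `monte_carlo`) are hypothetical.

```python
import numpy as np
from math import erf, sqrt, pi

def Phi(a):
    """Standard normal c.d.f. (element-wise)."""
    return 0.5 * (1.0 + np.vectorize(erf)(a / sqrt(2.0)))

def phi(a):
    """Standard normal p.d.f. (element-wise)."""
    return np.exp(-0.5 * a**2) / sqrt(2.0 * pi)

def probit(Z, L, n_iter=25):
    """Probit coefficients by Fisher scoring."""
    g = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        idx = Z @ g
        P = np.clip(Phi(idx), 1e-10, 1 - 1e-10)
        d = phi(idx)
        score = Z.T @ ((L - P) * d / (P * (1 - P)))
        info = (Z * (d**2 / (P * (1 - P)))[:, None]).T @ Z
        g = g + np.linalg.solve(info, score)
    return g

def heckman_wage(x1, z, L, W):
    """First-stage Probit, then OLS of W on x1 and the correction term of Eq. (13)."""
    Z = np.column_stack([np.ones_like(z), z])
    idx = Z @ probit(Z, L)
    lam = phi(idx) / Phi(idx)
    emp = L == 1                                        # employed people only
    X = np.column_stack([np.ones(emp.sum()), x1[emp], lam[emp]])
    return np.linalg.lstsq(X, W[emp], rcond=None)[0]    # (alpha0, alpha1, sigma_u_eps)

def one_sample(n, rng):
    """One draw of the wage side of the design (regressors assumed N(1, 1))."""
    x1, z = rng.normal(1.0, 1.0, size=(2, n))
    eps = rng.normal(size=n)
    u = 0.95 * eps + sqrt(1 - 0.95**2) * rng.normal(size=n)
    L = (-0.2 + z + eps > 0).astype(float)
    W = np.where(L == 1, 20.0 + 5.0 * x1 + u, 0.0)
    return x1, z, L, W

def monte_carlo(reps=200, n=1000, seed=0):
    """Mean bias and RMSE of the two-stage estimates over `reps` replications."""
    rng = np.random.default_rng(seed)
    true = np.array([20.0, 5.0, 0.95])
    est = np.array([heckman_wage(*one_sample(n, rng)) for _ in range(reps)])
    bias = est.mean(axis=0) - true
    rmse = np.sqrt(((est - true) ** 2).mean(axis=0))
    return bias, rmse

bias, rmse = monte_carlo()
print("mean bias (alpha0, alpha1, sigma_u_eps):", bias)
print("RMSE      (alpha0, alpha1, sigma_u_eps):", rmse)
```

Replacing `heckman_wage` with another estimator, or the error draws in `one_sample` with Student-t or log-normal draws, gives the kind of comparison the tables in the Appendix report.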
If we consider error terms distributed as a Student-t (3 d.o.f.), the bias of the T-S Heckman and FIML estimates of Eq. (1) and Eq. (2) varies from 0.1% to 1% (in absolute value), and the bias of the T-S CLAD estimates of Eq. (1) is negligible. However, the Eq. (3) estimates are generally biased in each procedure. Analogously, if the error distribution is assumed to be log-normal (0;1), all the procedures are affected by a small bias in the Eq. (1) and Eq. (2) estimates, and a large bias in the estimated coefficients of Eq. (3). The variability of the T-S CLAD estimates seems to be smaller than that of the parametric methods.

A second Monte Carlo experiment takes into account heteroskedasticity in the error distributions (Table A2). In this case, the semiparametric estimator with normal or Student-t errors seems to have a negligible bias in the Eq. (1) coefficients. The parametric procedures, for normal and Student-t error terms, are affected in Eq. (1) and Eq. (2) by a relative bias varying between 0.3% and 3% (in absolute value). With log-normal errors, the relative bias of the Eq. (1) and Eq. (2) coefficients is between 0.24% and 2.37%. The estimation of the Eq. (3) coefficients shows a strong bias in all parametric and semiparametric procedures.

In the third Monte Carlo experiment strong heteroskedasticity in the error terms is assumed (Table A3). In this case, the bias of the parametric procedures increases markedly; and especially if the errors are normally or Student-t distributed, the T-S CLAD procedure performs better than the parametric methods.

Generally, comparing the simulation results reported in Tables A1, A2 and A3, we may observe that the bias in the estimated coefficients seems to depend on the presence of heteroskedasticity rather than on the distributional form of the error terms. Only the estimated coefficients of Eq. (3) show a higher sensitivity across different error distributions.
In conclusion, the semiparametric procedure generally performs better than the parametric methods. Comparing the parametric methods only, the FIML estimator seems to be more efficient than T-S Heckman in the estimation of the coefficients of Eq. (1) (the regression with the smaller percentage of censoring). The RMSE of the coefficients estimated by FIML is notably lower than that of T-S Heckman when errors are normally or Student-t distributed.

5. Empirical application: Italian graduates' labour income and reservation wage estimation

We also propose an empirical application in which the performance of the parametric and semiparametric procedures discussed above is compared by estimating both graduates' labour income and reservation wage on a sample selected from the ISTAT (Italian National Institute of Statistics) Survey on Italian Graduates in the year 2001, interviewed in the year 2004, three years after their degree. The survey sample is composed of 14126 individuals; 12109 of them are employed and 2017 are unemployed and looking for a job. Monthly labour income is observed for the 12109 employed graduates only, and the logarithm of monthly labour income is taken here as the dependent variable in Eq. (1). In the survey, unemployed graduates declare their own desired monthly salary, considered here as their reservation wage and taken as the dependent variable of Eq. (2).

For the wage equation (Eq. (1)) a specification similar to the well-known function developed by Mincer (cf., inter alia, Mincer (1974) and Willis (1986)) is adopted here. But, compared with Mincer's function, it is difficult to find here explanatory variables that proxy for differences across subjects in working experience and education. In fact, the cross-sectional survey sample is composed of graduates interviewed only three years after their degree and consequently with little experience in the labour market.
We try to model individual differences in human capital endowment and ability by introducing as regressors in Eq. (1) both the high school test score (a positive coefficient is expected) and the duration of university attendance beyond the regular completion time (a negative coefficient is expected). Furthermore, the gender gap in wages (generally unfavourable to women) is modeled by introducing sex as a dummy regressor. Weekly working hours, estimated in a preliminary reduced-form regression, are also introduced to explain the wage difference due to the choice of the subject to work full-time or part-time [8]. The size of the firm or administration in which the subject works is also a determinant of the wage level, in the sense that the individual wage is expected to be higher in large firms. Moreover, we consider the career satisfaction and prospects of the subject to be directly related to individual labour income. For this reason, a dummy variable regarding the career prospects of the subject is introduced as a regressor.

The reservation wage equation (Eq. (2)) is specified taking into account the individual factors influencing the desired wage of unemployed subjects. These factors are related to the job search activity and to the expected job characteristics. In particular, we consider the decision to search for a part-time or a full-time job as an important explanatory variable.

To better identify the selection equation (Eq. (3)), we include as a regressor the month in which the subject took the degree, assumed here to influence the working status of the subject but also to be independent of both individual wage and reservation wage. The specification of Eq. (3) also uses information about the previous labour experience of the subject and introduces regional fixed effects as a proxy for the influence of economic activity.
Another factor potentially influencing individual participation in the labour market is the decision to attend training courses and, consequently, to postpone entry into the labour market.

(8) The preliminary estimation of weekly hours, by a reduced-form OLS regression, serves to correct for the potential endogeneity between the time the subject devotes to paid work and his/her perceived wage. The results of this estimation are not reported here for the sake of brevity.

We then estimate the model of three simultaneous equations described above, in which the log wage and the reservation wage (in Eq. (1) and Eq. (2)) are treated as limited dependent variables whose selection criterion for censoring is specified (symmetrically) in Eq. (3), a reduced-form participation function with a binary dependent variable. Preliminary estimates show that the residuals of a Probit participation equation (Eq. (3)) are correlated with the residuals of the OLS estimates (run without any selectivity correction) of both the wage equation (Eq. (1)) and the reservation wage equation (Eq. (2)); the empirical correlations are 9% and 13%, respectively. We may regard these statistics as rough measures of the endogeneity potentially affecting the estimates.

Estimation results are reported in the Appendix (Tables A4, A5 and A6). The estimated coefficients of Eq. (1) (log-wage equation) are similar under the two parametric procedures, T-S Heckman and FIML. The T-S CLAD results differ, especially regarding the influence of sex and of job satisfaction on the individual wage. Furthermore, the estimated standard errors of the semiparametric procedure are lower than those of both parametric methods. For the reservation wage equation (Eq. (2)) we can compare the performance of the parametric methods only. Here the estimation results show non-negligible differences in the estimated coefficients.
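The preliminary endogeneity check described above (correlating Probit residuals with OLS residuals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the paper does not say which Probit residual it uses, so the sketch takes the generalized residual, and the function names, the coding y = 1 for "outcome observed", and all numbers are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_generalized_residual(y, index):
    """Generalized residual of a Probit with P(y = 1) = Phi(index).

    ASSUMPTION: the generalized residual is one common choice; the paper
    does not specify the residual it correlates with the OLS residuals.
    """
    pdf, cdf = norm_pdf(index), norm_cdf(index)
    return pdf / cdf if y == 1 else -pdf / (1.0 - cdf)

def pearson_corr(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data for the subsample in which the outcome is observed
# (y = 1 marks "observed" here; this coding is an assumption of the sketch).
fitted_index = [0.4, -0.3, 1.1, 0.2, -0.5]        # made-up Probit indices
ols_residuals = [0.10, -0.25, 0.30, -0.05, 0.20]  # made-up OLS residuals

probit_residuals = [probit_generalized_residual(1, z) for z in fitted_index]
rho = pearson_corr(probit_residuals, ols_residuals)
print(f"correlation between Probit and OLS residuals: {rho:.3f}")
```

Within the observed subsample the generalized residual equals the inverse Mills ratio, which is the regressor whose coefficient Tables A4 and A5 label "lambda Heckman". A correlation far from zero is therefore an informal signal that the selectivity correction matters.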
For Eq. (2), moreover, the standard errors of the FIML estimates are lower than those of T-S Heckman. The estimation results for Eq. (3) (selection criterion equation) show small differences in the estimated coefficients between the two parametric procedures, and the FIML estimates have lower standard errors. A comparison with the semiparametric estimates is not possible because of the normalization of the coefficients adopted with the Sieve-NLLS estimator. We note, however, that the latter semiparametric procedure is run with nine regressors, intercept included (see Table A6). In this specific case, the intercept is constrained to equal one as a normalization (or identification) restriction.

6. Final remarks

The results on simulated data show that, relative to a parametric Two-Stage procedure, the FIML estimator performs better in terms of finite sample properties and relative efficiency when the estimates are affected by endogeneity. However, the semiparametric T-S CLAD estimator shows a smaller bias and lower variability than the FIML procedure; of course, the semiparametric estimator can be used only for the equation in which uncensored observations outnumber censored ones. The Monte Carlo experiments also show that the estimates are generally more sensitive to heteroskedasticity than to the distributional form of the error terms. The semiparametric T-S CLAD simulation results seem to be comparatively less influenced by heteroskedasticity. The results of the empirical application on the Italian graduates dataset show that the FIML and T-S Heckman estimates differ markedly in the reservation wage regression, where uncensored observations are fewer than censored ones.

References

Carneiro P., Hansen K. T., Heckman J. J. (2003) "Estimating Distributions of Treatment Effects with an Application to the Returns to Schooling and Measurement of the Effects of Uncertainty on College Choice", International Economic Review, Vol. 44, No. 2, pp. 361-422.
Chay K. Y., Powell J. L. (2001) "Semiparametric Censored Regression Models", Journal of Economic Perspectives, Vol. 15, No. 4, pp. 29-42.
Heckman J. J. (1978) "Dummy Endogenous Variables in a Simultaneous Equation System", Econometrica, 46, pp. 931-959.
Heckman J. J. (1990) "Varieties of Selection Bias", The American Economic Review, Vol. 80, No. 2, Papers and Proceedings of the Hundred and Second Annual Meeting of the American Economic Association (May 1990), pp. 313-318.
Heckman J. J., Tobias J., Vytlacil E. (2003) "Simple Estimators for Treatment Parameters in a Latent Variable Framework", Review of Economics and Statistics, 85, Issue 3, pp. 748-755.
Horowitz J. L. (1992) "A Smoothed Maximum Score Estimator for the Binary Response Model", Econometrica, 60, pp. 505-531.
ISTAT, Italian National Institute of Statistics (2004) Survey on Italian Graduates in the Year 2001.
Khan S. (2005) "Distribution Free Estimation of Heteroskedastic Binary Response Models Using Probit/Logit Criterion Functions", University of Rochester Working Paper.
Khan S., Powell J. L. (2001) "Two-step Estimation of Semiparametric Censored Regression Models", Journal of Econometrics, 103, pp. 73-110.
Koenker R. P., Bassett Jr. G. S. (1978) "Regression Quantiles", Econometrica, 46, pp. 33-50.
Lee L. F. (1978) "Unionism and Wage Rates: A Simultaneous Equations Model with Qualitative and Limited Dependent Variables", International Economic Review, Vol. 19, No. 2, pp. 415-433.
Lee L. F., Maddala G. S., Trost R. P. (1980) "Asymptotic Covariance Matrices of Two-Stage Probit and Two-Stage Tobit Methods for Simultaneous Equations Models with Selectivity", Econometrica, Vol. 48, No. 2, pp. 491-503.
Maddala G. S., Nelson F. D. (1975) "Switching Regression Models with Endogenous and Exogenous Switching", Proceedings of the American Statistical Association (Business and Economics Section), pp. 423-426.
Manski C. F. (1985) "Semiparametric Analysis of Discrete Response: Asymptotic Properties of Maximum Score Estimation", Journal of Econometrics, 27, pp. 313-334.
Mincer J. (1974) Schooling, Experience and Earnings, National Bureau of Economic Research.
Moon C. G. (1989) "A Monte Carlo Comparison of Semiparametric Tobit Estimators", Journal of Applied Econometrics, 4, pp. 361-382.
Paarsch H. J. (1984) "A Monte Carlo Comparison of Estimators for Censored Regression Models", Journal of Econometrics, 24, pp. 197-213.
Poirier D. J., Ruud P. A. (1981) "On the Appropriateness of Endogenous Switching", Journal of Econometrics, 16, pp. 249-256.
Powell J. L. (1984) "Least Absolute Deviation Estimation for the Censored Regression Models", Journal of Econometrics, 25, pp. 303-325.
Powell J. L. (1986) "Censored Regression Quantiles", Journal of Econometrics, 32, pp. 143-155.
Powers D. A. (1993) "Endogenous Switching Regression Models with Limited Dependent Variables", Sociological Methods and Research, Vol. 22, No. 2, pp. 248-273.
Willis R. (1986) "Wage Determinants: A Survey and Reinterpretation of Human Capital Earnings Functions", in O. Ashenfelter and R. Layard (eds.), Handbook of Labor Economics, Amsterdam: North-Holland.
Appendix - Monte Carlo experiments and empirical application results

Table A1 - Simulation results for parametric and semiparametric estimators with homoskedastic disturbances (percentage of cases with L = 0 index: 30%)

A1a - Error terms distribution: Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  20.0000    0.0000       0.00%  0.0587 |  19.9997   -0.0003       0.00%  0.0446 |  20.0000    0.0000       0.00%  0.0739
α1 = 5     |   5.0002    0.0002       0.00%  0.0305 |   4.9995   -0.0005      -0.01%  0.0267 |   4.9999   -0.0001       0.00%  0.0532
β0 = 20    |  19.9989   -0.0011      -0.01%  0.1091 |  20.0003    0.0003       0.00%  0.0710 |    -----     -----       -----   -----
β1 = 5     |   5.0001    0.0001       0.00%  0.0406 |   5.0006    0.0006       0.01%  0.0356 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.2017   -0.0017       0.84%  0.0623 |  -0.1997    0.0003      -0.13%  0.0517 |  -0.1904    0.0097      -4.83%  0.0772
γ1 = 1     |   1.0045    0.0045       0.45%  0.0639 |   1.0009    0.0009       0.09%  0.0479 |   1.0000    0.0000       0.00%  0.0000

A1b - Error terms distribution: t-Student (3 d.o.f.); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  20.0166    0.0166       0.08%  0.0541 |  19.9763   -0.0237      -0.12%  0.0390 |  19.9999   -0.0001       0.00%  0.0441
α1 = 5     |   4.9994   -0.0006      -0.01%  0.0294 |   4.9997   -0.0003      -0.01%  0.0233 |   4.9991   -0.0009      -0.02%  0.0299
β0 = 20    |  20.1250    0.1250       0.63%  0.1476 |  20.2112    0.2112       1.06%  0.1604 |    -----     -----       -----   -----
β1 = 5     |   4.9971   -0.0029      -0.06%  0.0538 |   5.0003    0.0003       0.01%  0.0389 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.2334   -0.0334      16.70%  0.0648 |  -0.1154    0.0846     -42.29%  0.0625 |  -0.1930    0.0070      -3.51%  0.0578
γ1 = 1     |   1.2118    0.2118      21.18%  0.0827 |   0.9600   -0.0400      -4.00%  0.1126 |   1.0000    0.0000       0.00%  0.0000

A1c - Error terms distribution: Log-Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.9477   -0.0523      -0.26%  0.0619 |  19.9018   -0.0982      -0.49%  0.0528 |  19.7014   -0.2986      -1.49%  0.0339
α1 = 5     |   5.0010    0.0010       0.02%  0.0396 |   4.9997   -0.0003      -0.01%  0.0342 |   4.9999   -0.0001       0.00%  0.0243
β0 = 20    |  19.7867   -0.2133      -1.07%  0.0442 |  19.8087   -0.1913      -0.96%  0.0481 |    -----     -----       -----   -----
β1 = 5     |   5.0010    0.0010       0.02%  0.0190 |   5.0007    0.0007       0.01%  0.0191 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.5746   -0.3746     187.30%  0.0969 |  -0.4236   -0.2236     111.80%  0.1126 |  -0.4804   -0.2804     140.21%  0.0490
γ1 = 1     |   1.8368    0.8368      83.68%  0.1311 |   1.4727    0.4727      47.27%  0.2127 |   1.0000    0.0000       0.00%  0.0000

Table A2 - Simulation results for parametric and semiparametric estimators with heteroskedastic disturbances (percentage of cases with L = 0 index: 30%)

A2a - Error terms distribution: Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.8858   -0.1142      -0.57%  0.0658 |  19.9404   -0.0596      -0.30%  0.0415 |  19.9993   -0.0007       0.00%  0.0407
α1 = 5     |   5.1191    0.1191       2.38%  0.0401 |   5.0531    0.0531       1.06%  0.0426 |   4.9994   -0.0006      -0.01%  0.0546
β0 = 20    |  20.3189    0.3189       1.59%  0.1022 |  20.2276    0.2276       1.14%  0.0678 |    -----     -----       -----   -----
β1 = 5     |   4.7565   -0.2435      -4.87%  0.0533 |   4.7898   -0.2102      -4.20%  0.0512 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.2876   -0.0876      43.82%  0.0620 |  -0.2710   -0.0710      35.49%  0.0542 |  -0.2618   -0.0618      30.91%  0.0521
γ1 = 1     |   0.8883   -0.1117     -11.17%  0.6367 |   0.8841   -0.1159     -11.59%  0.0600 |   1.0000    0.0000       0.00%  0.0000

A2b - Error terms distribution: t-Student (3 d.o.f.); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.9309   -0.0691      -0.35%  0.0618 |  19.9327   -0.0673      -0.34%  0.0423 |  19.9993   -0.0007       0.00%  0.0236
α1 = 5     |   5.0815    0.0815       1.63%  0.0392 |   5.0056    0.0056       0.11%  0.0349 |   4.9982   -0.0018      -0.04%  0.0319
β0 = 20    |  20.3252    0.3252       1.63%  0.1291 |  20.4162    0.4162       2.08%  0.2037 |    -----     -----       -----   -----
β1 = 5     |   4.7996   -0.2004      -4.01%  0.0699 |   4.8510   -0.1490      -2.98%  0.0568 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.3354   -0.1354      67.71%  0.0612 |  -0.1749    0.0251     -12.53%  0.0741 |  -0.2265   -0.0265      13.23%  0.0332
γ1 = 1     |   1.1510    0.1510      15.10%  0.0874 |   0.8970   -0.1030     -10.30%  0.1305 |   1.0000    0.0000       0.00%  0.0000

A2c - Error terms distribution: Log-Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.8993   -0.1007      -0.50%  0.0637 |  19.8698   -0.1302      -0.65%  0.0608 |  19.8423   -0.1577      -0.79%  0.0217
α1 = 5     |   5.0457    0.0457       0.91%  0.0525 |   5.0302    0.0302       0.60%  0.0456 |   4.8813   -0.1187      -2.37%  0.0283
β0 = 20    |  19.9514   -0.0486      -0.24%  0.0423 |  19.9424   -0.0576      -0.29%  0.0445 |    -----     -----       -----   -----
β1 = 5     |   4.8853   -0.1148      -2.30%  0.0307 |   4.8894   -0.1106      -2.21%  0.0312 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.6556   -0.4556     227.82%  0.1116 |  -0.5039   -0.3039     151.97%  0.1344 |  -0.3896   -0.1896      94.81%  0.0425
γ1 = 1     |   2.0921    1.0921     109.21%  0.1833 |   1.7288    0.7288      72.88%  0.2874 |   1.0000    0.0000       0.00%  0.0000

Table A3 - Simulation results for parametric and semiparametric estimators with strongly heteroskedastic disturbances (percentage of cases with L = 0 index: 30%)

A3a - Error terms distribution: Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.5801   -0.4199      -2.10%  0.2289 |  19.6864   -0.3136      -1.57%  0.0969 |  19.9894   -0.0106      -0.05%  0.0694
α1 = 5     |   5.5234    0.5234      10.47%  0.0932 |   5.2407    0.2407       4.81%  0.0886 |   4.9972   -0.0028      -0.06%  0.0891
β0 = 20    |  20.8650    0.8650       4.32%  0.2520 |  20.5980    0.5980       2.99%  0.1657 |    -----     -----       -----   -----
β1 = 5     |   4.2902   -0.7098     -14.20%  0.1138 |   4.4939   -0.5061     -10.12%  0.0927 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.2121   -0.0121       6.06%  0.0537 |  -0.1821    0.0179      -8.94%  0.0485 |  -0.1988    0.0012      -0.60%  0.0806
γ1 = 1     |   0.4922   -0.5078     -50.78%  0.0458 |   0.4603   -0.5397     -53.97%  0.0433 |   1.0000    0.0000       0.00%  0.0000

A3b - Error terms distribution: t-Student (3 d.o.f.); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.6866   -0.3134      -1.57%  0.2133 |  19.7317   -0.2683      -1.34%  0.1128 |  19.9912   -0.0088      -0.04%  0.0416
α1 = 5     |   5.3631    0.3631       7.26%  0.0960 |   5.0128    0.0128       0.26%  0.0703 |   4.9938   -0.0062      -0.12%  0.0521
β0 = 20    |  20.7357    0.7357       3.68%  0.2469 |  20.9519    0.9519       4.76%  0.3448 |    -----     -----       -----   -----
β1 = 5     |   4.3944   -0.6056     -12.11%  0.1409 |   4.6865   -0.3135      -6.27%  0.0959 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.2266   -0.0266      13.28%  0.0539 |  -0.0712    0.1288     -64.38%  0.0683 |  -0.1991    0.0009      -0.43%  0.0622
γ1 = 1     |   0.6548   -0.3452     -34.52%  0.0517 |   0.4182   -0.5818     -58.18%  0.0744 |   1.0000    0.0000       0.00%  0.0000

A3c - Error terms distribution: Log-Normal (0;1); sample: 1000; NREP: 10000

           | Two-Stage Heckman                      | FIML                                   | Two-Stage CLAD
Parameter  |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE |     mean mean bias % mean bias    RMSE
α0 = 20    |  19.2328   -0.7672      -3.84%  0.3164 |  19.1930   -0.8070      -4.03%  0.2382 |  19.6780   -0.3220      -1.61%  0.0374
α1 = 5     |   5.3088    0.3088       6.18%  0.1548 |   4.8691   -0.1309      -2.62%  0.0801 |   4.7683   -0.2317      -4.63%  0.0487
β0 = 20    |  20.0630    0.0630       0.31%  0.0764 |  20.0120    0.0120       0.06%  0.0983 |    -----     -----       -----   -----
β1 = 5     |   4.5953   -0.4047      -8.09%  0.5441 |   4.6192   -0.3808      -7.62%  0.0544 |    -----     -----       -----   -----
γ0 = −0.2  |  -0.5106   -0.3106     155.29%  0.6250 |  -0.3141   -0.1141      57.05%  0.0543 |  -0.5542   -0.3542     177.10%  0.0569
γ1 = 1     |   0.7790   -0.2210     -22.10%  0.0586 |   0.4952   -0.5048     -50.48%  0.0778 |   1.0000    0.0000       0.00%  0.0000

Table A4 - Estimation of Eq. (1) - dependent variable: log of monthly wage

                                                       | T-S Heckman      | FIML             | T-S CLAD
Variable                                               |    coeff      SE |    coeff      SE |    coeff      SE
Intercept                                              |   5.9340  0.0281 |   5.8860  0.0280 |   6.0180  0.0039
Dummy: sex (man = 0)                                   |  -0.1088  0.0065 |  -0.1223  0.0060 |  -0.0441  0.0008
Years beyond regular completion time                   |  -0.0046  0.0020 |  -0.0082  0.0020 |  -0.0012  0.0003
High school final exam test score                      |   0.0004  0.0001 |   0.0016  0.0004 |   0.0016  0.0002
Dummy: satisfaction of job (unsatisfied = 0)           |   0.0883  0.0086 |   0.0876  0.0087 |  -0.0316  0.0009
Dummy: career (unsatisfied = 0)                        |   0.0446  0.0062 |   0.0437  0.0062 |   0.0101  0.0009
Instrumental: weekly working hours estimated in R.F.   |   0.0274  0.0005 |   0.0283  0.0005 |   0.0271  0.0001
No. of employees in the firm or in the administration  |   0.0003  0.0000 |   0.0003  0.0000 |   0.0003  0.0000
Lambda (Heckman)                                       |  -0.2416  0.0206 |                  |

Table A5 - Estimation of Eq. (2) - dependent variable: RW (monthly reservation wage)

                                                              | T-S Heckman        | FIML
Variable                                                      |     coeff       SE |     coeff       SE
Intercept                                                     |  909.3394  27.3661 |  532.6730  17.7061
Dummy: sex (man = 0)                                          |  -95.5861  12.2727 |  -14.7290   5.6354
Dummy: employee = 1; self-employee = 0                        |    8.1394  23.7694 |   10.6078  17.6415
Dummy: MSc after degree (No = 0)                              |    8.6825  35.4462 |   12.3481  16.0356
Dummy: he/she prefers part-time = 0; full-time = 1            |  126.6278  11.9903 |  132.6950   9.0556
Dummy: he/she could look for a job in foreign state (No = 0)  |   75.1360  11.1506 |   29.9026  11.2068
Dummy: he/she attends program-training course (No = 0)        |   72.9295  14.8289 |   42.6367   8.5750
Lambda (Heckman)                                              |   86.8783  14.4559 |

Table A6 - Estimation of Eq. (3) (participation function) - dependent variable: L = 0 (employed), L = 1 (unemployed)

                                                                            | Probit               | FIML                 | Sieve NLLS (*)
Variable                                                                    |      coeff        SE |      coeff        SE |      coeff        SE
Intercept                                                                   |    -2.3958    0.1570 |    -0.9908    0.1193 |     1.0000        --
Dummy: sex (man = 0)                                                        |    -0.2963    0.0289 |    -0.2789    0.0227 |    -7.2148    1.6596
Years beyond regular completion time                                        |    -0.0800    0.0099 |    -0.0453    0.0068 |    -0.2902    0.0719
Month of the degree                                                         |    -0.0110    0.0292 |    -0.0123    0.0027 |     0.0101    0.0229
Dummy: if he/she worked before the degree (No = 0)                          |     0.3606    0.0291 |     0.2543    0.0210 |    1064.95    0.0000
Dummy: if he/she attends stage program (No = 0)                             |    -0.3251    0.0281 |    -0.2563    0.0208 |     0.4303    0.1716
Fixed effect: regional GDP per capita                                       |  6.880E-05 3.000E-06 |  3.246E-05 2.456E-06 |  1.712E-04 3.940E-05
Fixed effect: percentage in the college of graduates who found a fixed job  |     0.0100    0.0009 |     0.0145    0.0012 |     0.0110    0.0054

(*) Coefficients normalized imposing constant term equal to one.
N. obs.: 14126. Censored obs.: 2017.
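The summary columns of Tables A1-A3 (mean, mean bias, % mean bias, RMSE) can be computed from the replicated estimates of a coefficient along the following lines. This is a sketch using textbook definitions, which are an assumption here because the paper does not state its formulas; the function name and the numbers are hypothetical. Dividing the bias by the true value reproduces the sign convention of the tables, where a negative bias on γ0 = −0.2 is reported as a positive percentage. In several rows the tabulated RMSE is smaller than the absolute mean bias, which a root mean squared error around the true value cannot be, so the tables may report dispersion around the Monte Carlo mean; the sketch returns both quantities.

```python
import math

def monte_carlo_summary(estimates, true_value):
    """Summarize replicated estimates of one coefficient.

    ASSUMPTION: textbook definitions; the paper does not spell out
    how its "RMSE" column is computed.
    """
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    pct_bias = 100.0 * bias / true_value
    # Root mean squared error around the true parameter value.
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    # Dispersion around the Monte Carlo mean (excludes the bias component).
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / n)
    return {
        "mean": mean,
        "mean_bias": bias,
        "pct_mean_bias": pct_bias,
        "rmse": rmse,
        "sd": sd,
    }

# Hypothetical replications of an estimator of gamma_0 = -0.2 (made-up numbers).
draws = [-0.25, -0.15, -0.23]
summary = monte_carlo_summary(draws, true_value=-0.2)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The two dispersion measures are linked by rmse² = sd² + bias², so they coincide only for an unbiased estimator.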