Journal of Econometrics 80 (1997) 1-34
ELSEVIER
Semiparametric estimation of the Type-3 Tobit model
Songnian Chen*
Department of Economics, The Hong Kong University of Science and Technology,
Clear Water Bay, Kowloon, Hong Kong
(Received 1 February 1994; received in revised form 1 March 1996)
Abstract
This paper considers estimation of the Type-3 Tobit model with only some weak
restrictions imposed on the distribution of the error terms. Two least-squares-type estimation approaches are proposed under the condition that the error terms and regressors
are independent. Consistent estimators for the asymptotic covariance matrices are proposed
to facilitate large sample inference, and a small Monte Carlo simulation is performed to
investigate the finite sample behavior of the proposed estimators. Also, the semiparametric
efficiency bound is derived for the Type-3 Tobit model under the independence restriction.
© 1997 Elsevier Science S.A.
Keywords: Sample selection; Semiparametric estimation; Tobit models
JEL classification: C31; C34
1. Introduction
This paper considers semiparametric estimation of the sample selection model
subject to a Tobit-type selection rule (henceforth called the Type-3 Tobit model).
The Type-3 Tobit model is obtained by adding more information to the selection equation in the more common sample selection model subject to a binary
selection rule (henceforth called the Type-2 Tobit model). A classical example
of the sample selection model is Gronau's (1973) female labor supply model,
* I am grateful to Gregory Chow and Robin Lumsdaine for their detailed comments and stimulating
discussions. I thank Bo Honoré for providing some of his computer programs. I would also like
to thank Chunrong Ai, Stephen Cosslett, Be [:c,m~, Lung-Fei Lee, Glenn Sueyoshi, Grant Taylor,
Hal White, workshop participants at HKUST, UCSD, Ohio State University and Princeton University
for helpful comments. The helpful comments and suggestions by an associate editor and two anonymous
referees are greatly appreciated. I am especially indebted to my thesis advisor James Powell
for his constant advice and encouragement.
which is a Type-2 Tobit model. The labor supply model in Heckman (1974) is
a variation of the Type-3 Tobit model. More recently, Udry (1994) proposed a
more complicated sample selection model generalizing the standard Type-3 Tobit
model, in the context of informal credit transaction markets in determining the
effect of loans on risk sharing among households in Northern Nigeria.
Traditionally, estimation of sample selection models and other limited dependent variable models has been based on maximum likelihood and other
likelihood-based methods with the error distribution specified to be in a parametric form known up to a finite-dimensional vector of parameters. As is well
known, misspecification of the error distribution in sample selection models will
in general render likelihood-based estimators inconsistent. Lee (1982) adopted a
rich parametric family to lessen the misspecification problem to a certain extent.
Since a parametric form of error distribution cannot generally be justified by
economic theory, much of the recent econometrics literature has focused on semiparametric estimation methods which only assume weak restrictions on the error
distribution to guard against possible misspecification.
Formally, the Type-3 Tobit model is a two-equation model of the following
form:
$$I^* = w\delta_0 + u_1, \tag{1}$$
$$y^* = x\beta_0 + u_2, \tag{2}$$
where the first equation is the selection equation, the second equation is the main
equation, and the dependent variable $y^*$ can only be observed when the selection variable $I^*$ is positive. This model reduces to the Type-2 Tobit model when
the only information observed on the selection variable is the sign of $I^*$, in which case some exclusion restriction on $\beta_0$ will be necessary for model identification under the assumption
that the error terms are independent of the regressors (see Chamberlain, 1986;
Powell, 1989). Various semiparametric estimators for the Type-2 Tobit model
have been proposed in Andrews (1991), Cosslett (1991), Gallant and Nychka
(1987), Ichimura and Lee (1991), Newey (1988), and Powell (1989), among
others. It was shown by Lee (1994) that the information on the selection variable
$I^*$ other than its sign makes the aforementioned exclusion restriction unnecessary
for model identification in the Type-3 Tobit model under the independence
restriction. Recently, Honoré et al. (1992) and Lee (1994) proposed some semiparametric estimators for the Type-3 Tobit model by exploiting this extra information on the selection equation. This article provides two new semiparametric
estimators for $\beta_0$ under the independence condition. Our procedures differ from
theirs in the way the selection bias is corrected in the main equation.
Like those of Honoré et al. and Lee, the estimators presented here are two-step estimators where the selection equation is estimated first, and the estimator
obtained from that equation is used to eliminate the effect of the selection on the
second equation. Our first estimator (henceforth called estimator I), constructed by
first transforming the selection correction function into a constant term through
trimming the original sample, is a simple least-squares-type estimator. For the
second estimator (estimator II), a Buckley-James-type approach is used to directly
estimate the selection correction function, with estimates of the parameters in
the selection equation given in the first step. In the second step, a weighted
least-squares-type procedure is performed by including the estimated selection
correction function as part of the regressors in the main equation. Our estimator II
is similar to Lee's (1994), except that in ours the estimated selection correction
function is not smooth and in Lee's it is. Honoré et al. (1992) proposed two-step
estimators by adopting a pairwise difference approach to eliminate the effect of
the selection.
Semiparametric efficiency bounds can serve as a criterion for evaluating existing
estimators and suggesting improved estimators. In this paper we also calculate the
semiparametric efficiency bound for the Type-3 Tobit model under the independence restriction. For the Type-2 Tobit model under the independence restriction
(Chamberlain, 1987), a nonsingular information matrix requires some exclusion
restriction in the main equation and the existence of a continuous regressor in
the selection equation. However, neither requirement is needed to ensure a nonsingular information matrix for the Type-3 Tobit model.
The next section describes the model and discusses the estimators and their
motivation. Large sample properties of the estimators are investigated in Section 3. Section 4 contains a limited Monte Carlo study to investigate the practical
performance of the proposed estimators. In Section 5 we provide the semiparametric efficiency bound for the Type-3 Tobit model under the independence condition.
Section 6 concludes the paper with some discussions. The proofs of theorems are
in the appendices.
2. The model and estimators
We consider estimation of the parameter vector $\beta_0$ of the Type-3 Tobit model
defined by the latent variables $I^*$ and $y^*$, which are of the form
$$I^* = w\delta_0 + u_1, \tag{3}$$
$$y^* = x\beta_0 + u_2. \tag{4}$$
Instead of $I^*$ and $y^*$, we observe $I$ and $y$, which satisfy
$$I = \max(I^*, 0), \tag{5}$$
$$y = y^* \, 1\{I > 0\}, \tag{6}$$
where $1\{A\}$ represents the indicator function of the event $A$, $d = 1\{I > 0\}$ is the
binary selection indicator, $y$ and $I$ are the observable dependent variables, $x$ and
$w$ are row vectors of exogenous variables with dimensions $k$ and $p$, respectively,
$\beta_0$ and $\delta_0$ are conformable column vectors of unknown parameters, and $u_1$ and
$u_2$ are error terms. $x$ and $w$ are allowed to have common components. Let $z$ be a
vector consisting of the distinct components in $(w, x)$. Here $y^*$ is observable only
when $d = 1$; hence, the sample is selected according to the values of $w$ and $u_1$.
In this paper we propose two estimators of $\beta_0$ under the restriction that $(u_1, u_2)$
is independent of $z$. As usual, the independence restriction can only facilitate
identification and estimation for the slope parameters. Thus, it is assumed that $z$
does not contain a constant term. Conditional on $z$ and on $y^*$ being observable,
the regression function of $y$ is
$$E(y \mid I > 0, z) = x\beta_0 + E(u_2 \mid I > 0, z). \tag{7}$$
The main difficulty with estimating $\beta_0$ is that the selection correction term $E(u_2 \mid u_1 > -w\delta_0, z)$
is a nondegenerate random variable, because $u_1$ is randomly censored by $-w\delta_0$. Consequently, there will be nonzero correlation between the
regressors and the error term in the main equation for the selected subsample when
there is dependence between the error terms across the two equations, which in
turn causes the least-squares estimator for $\beta_0$ applied to the selected sample
to be inconsistent. In this paper we propose two ways to deal with this inconsistency problem, which results from the presence of the selection correction
term.
The idea behind our estimator I is to trim the original sample so that the
selection correction term in the trimmed subsample is equal to a constant. Consequently, the net effect on the resulting subsample of the selection and trimming
process is only to shift the intercept term in the main equation without affecting
the slope parameters. Therefore, least squares based on the trimmed subsample yields consistent estimates for the slope parameters in the main equation.
Specifically, we can create a constant censoring point, say zero, instead of $-w\delta_0$
for $u_1$ through trimming the original sample so that the conditional expectation
of the error term $u_2$ in the trimmed sample is a constant. After the trimming, the
resulting new sample contains observations with $u_1 > 0$ and $w\delta_0 > 0$. If $(u_1, u_2)$ is
independent of $z$, then the "new error term" $u_2^* = u_2$ given $[(u_1 > 0, w\delta_0 > 0)$ and
$d = 1, z]$ will have a constant mean, conditional on the regressors $z$, i.e., $\alpha_0 \equiv E(u_2 \mid z, u_1 > 0, w\delta_0 > 0, d = 1) = E(u_2 \mid u_1 > 0)$ is a constant under the assumption
that the regressors and error terms are independent. Therefore, the regression
function for the trimmed sample is
$$\begin{aligned}
E(y \mid z, u_1 > 0, w\delta_0 > 0, d = 1) &= E(y \mid z, u_1 > 0, w\delta_0 > 0, u_1 + w\delta_0 > 0) \\
&= E(y \mid z, u_1 > 0, w\delta_0 > 0) \\
&= E(y \mid u_1 > 0, z) = x\beta_0 + \alpha_0,
\end{aligned} \tag{8}$$
which suggests a simple least-squares procedure applied to the trimmed subsample
to estimate $\beta_0$ by
$$\hat\beta_1 = \arg\min_{\beta, \alpha} \frac{1}{n} \sum_{i=1}^{n} 1\{\hat u_{1i} > 0,\ w_i\hat\delta > 0\} (y_i - x_i\beta - \alpha)^2, \tag{9}$$
where $\hat u_{1i} = I_i - w_i\hat\delta$ is an estimate for $u_{1i}$, and $\hat\delta$ is a root-n consistent estimator of
$\delta_0$ obtained in a first step. Note that under some mild conditions (discussed below), $x$ will
be of full rank in the subsample for which $u_1 > 0$, $w\delta_0 > 0$ even if $x = w$; thus, the
exclusion restriction alluded to above is not necessary for model identification.
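The trimming rule in (9) can be illustrated with a minimal sketch, assuming NumPy. The function name `estimator_one`, the simulated design, and the use of the true $\delta_0$ in place of a first-step estimate are illustrative assumptions, not part of the paper.

```python
import numpy as np

def estimator_one(I, y, x, w, delta_hat):
    """Trimmed least squares in the spirit of eq. (9): OLS of y on (1, x)
    over observations with u1_hat > 0 and w @ delta_hat > 0."""
    index = w @ delta_hat              # w_i * delta_hat
    u1_hat = I - index                 # residual from the selection equation
    keep = (u1_hat > 0) & (index > 0)  # trimming rule; it implies I_i > 0
    s = np.column_stack([np.ones(keep.sum()), x[keep]])
    coef, *_ = np.linalg.lstsq(s, y[keep], rcond=None)
    return coef[0], coef[1:], keep     # alpha_hat, beta_hat, trimming mask

# Illustrative data (assumed design): normal errors, x = w, no exclusion restriction.
rng = np.random.default_rng(0)
n = 20_000
delta0, beta0 = np.array([1.0, 1.0]), np.array([1.0, 2.0])
w = np.column_stack([rng.standard_normal(n), rng.uniform(-2, 2, n)])
u1 = rng.standard_normal(n)
u2 = np.sqrt(0.5) * u1 + np.sqrt(0.5) * rng.standard_normal(n)
I = np.maximum(w @ delta0 + u1, 0.0)
y = (w @ beta0 + u2) * (I > 0)

# Assumption: delta0 stands in for a root-n consistent first-step estimate.
alpha_hat, beta_hat, keep = estimator_one(I, y, w, w, delta0)
```

In this assumed design roughly a quarter of the observations survive the trimming, which is the loss of observations that motivates the pooled version of the estimator.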
Instead of restricting $u_1$ to the interval $(0, \infty)$, we can also consider $K$ different
subsamples for which $(u_1 \in (c_{k-1}, c_k),\ w\delta_0 > -c_{k-1})$ holds in the $k$th subsample,
with $c_0 < c_1 < \cdots < c_K$. Then the conditional mean of the error term $u_2$ in the $k$th
subsample is a constant, i.e. $\alpha_k = E(u_2 \mid z, c_k > u_1 > c_{k-1}, w\delta_0 > -c_{k-1}, d = 1) =
E(u_2 \mid c_k > u_1 > c_{k-1})$, and
$$E(y \mid z, c_k > u_1 > c_{k-1}, w\delta_0 > -c_{k-1}, d = 1) = x\beta_0 + \alpha_k. \tag{10}$$
Similarly, a new least-squares-type estimator $\hat\beta_{p1}$ for $\beta_0$ can be defined by pooling
the $K$ subsamples:
$$\hat\beta_{p1} = \arg\min_{\beta, \alpha_1, \ldots, \alpha_K} \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} d_i \, 1\{c_k > \hat u_{1i} > c_{k-1},\ w_i\hat\delta > -c_{k-1}\} (y_i - x_i\beta - \alpha_k)^2. \tag{11}$$
Note that more observations will be included in defining $\hat\beta_{p1}$ than those used in
constructing $\hat\beta_1$ if we set $c_0 < 0$ and $c_K$ moderately large. Therefore, in general,
$\hat\beta_{p1}$ would be more efficient if the sample size is large enough to offset the loss
of degrees of freedom$^1$ that results from estimating $K$ intercept terms $(\alpha_k)$ for
$k = 1, 2, \ldots, K$.
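A minimal sketch of the pooled estimator in (11), assuming NumPy, regresses $y$ on $K$ interval dummies plus $x$ over the pooled subsample. The function name `pooled_estimator`, the simulated design, and the use of the true $\delta_0$ as the first-step value are assumptions made for illustration only.

```python
import numpy as np

def pooled_estimator(I, y, x, w, delta_hat, cuts):
    """Pooled trimmed least squares in the spirit of eq. (11).
    cuts holds c_0 < c_1 < ... < c_K; each interval gets its own intercept."""
    index = w @ delta_hat
    u1_hat = I - index
    n, K = len(y), len(cuts) - 1
    dd = np.zeros((n, K))
    for k in range(K):
        lo, hi = cuts[k], cuts[k + 1]
        dd[:, k] = (u1_hat > lo) & (u1_hat < hi) & (index > -lo) & (I > 0)
    used = dd.sum(axis=1) > 0
    nonempty = dd[used].sum(axis=0) > 0      # empty intervals need no intercept
    s = np.column_stack([dd[used][:, nonempty], x[used]])
    coef, *_ = np.linalg.lstsq(s, y[used], rcond=None)
    k_used = int(nonempty.sum())
    return coef[k_used:], coef[:k_used], used  # beta_hat, interval intercepts, mask

# Illustrative data (assumed design): normal errors, x = w.
rng = np.random.default_rng(0)
n = 20_000
delta0, beta0 = np.array([1.0, 1.0]), np.array([1.0, 2.0])
w = np.column_stack([rng.standard_normal(n), rng.uniform(-2, 2, n)])
u1 = rng.standard_normal(n)
u2 = np.sqrt(0.5) * u1 + np.sqrt(0.5) * rng.standard_normal(n)
I = np.maximum(w @ delta0 + u1, 0.0)
y = (w @ beta0 + u2) * (I > 0)

cuts = np.linspace(-4.0, 4.0, 21)   # K = 20 equispaced intervals on (-4, 4)
beta_p1, alpha_k, used = pooled_estimator(I, y, w, w, delta0, cuts)
```

Dropping empty intervals before the regression mirrors the remark that an interval containing no observations costs no degree of freedom.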
One example of an estimator for $\delta_0$ is a pairwise differenced estimator in
Honoré and Powell (1994):
$$\hat\delta = \arg\min_{\delta \in A} \sum_{i<j} \left(1 - 1\{I_i < \max\{0, (w_i - w_j)\delta\},\ I_j < \max\{0, (w_j - w_i)\delta\}\}\right) \left| I_i - I_j - (w_i - w_j)\delta \right|,$$
where $A$ is the parameter space for $\delta_0$. This estimator is chosen as the first-step estimator
in our subsequent Monte Carlo study. Other alternatives include Buckley and
James (1979), Duncan (1986), Fernandez (1986), Horowitz (1986, 1988) and
Koul et al. (1981), among others.
$^1$ Note that there will be no loss of degrees of freedom due to estimating $\alpha_k$ when no observation
falls into the $k$th interval.
For our estimator II, instead of trimming observations to transform the selection correction term into a constant, we estimate the selection correction term
$E(u_2 \mid u_1 > -w\delta_0, z)$ directly for each observation; then we propose a least-squares-type procedure by including the estimated selection correction term as part of the
regressors in the main equation. More specifically, we construct
$$\hat E(z_i, \beta) = \frac{\frac{1}{n-1} \sum_{j \neq i} D_{ij}(\hat\delta)(y_j - x_j\beta)}{\frac{1}{n-1} \sum_{j \neq i} D_{ij}(\hat\delta)} \tag{12}$$
to estimate $E(y - x\beta \mid u_1 > -w_i\delta_0, z_i)$, where $\hat\delta$ is a first-step preliminary estimator
of $\delta_0$,
$$D_{ij}(\delta) = 1\{I_j > (w_j - w_i)\delta > 0\} = 1\{I_j - w_j\delta > -w_i\delta,\ -w_i\delta > -w_j\delta\},$$
and the sample average in (12) is taken over the observations for which $u_{1j} >
-w_i\delta_0$, $-w_i\delta_0 > -w_j\delta_0$
if $D_{ij}(\hat\delta)$ is replaced by $D_{ij}(\delta_0)$. By the law of large
numbers, $\hat E(z_i, \beta)$ will be consistent for
$$\frac{E[(y - x\beta) 1\{u_1 > -w_i\delta_0,\ w\delta_0 > w_i\delta_0\}]}{E[1\{u_1 > -w_i\delta_0,\ w\delta_0 > w_i\delta_0\}]} = E[(y - x\beta) \mid u_1 > -w_i\delta_0,\ w\delta_0 > w_i\delta_0,\ z_i],$$
which will be the $i$th selection correction term when $\beta = \beta_0$ under the condition that the error terms and regressors are independent. Our proposed second
estimation method is a weighted semiparametric least-squares approach
$$\hat\beta_2 = \arg\min_{\beta} \frac{1}{n} \sum_{i=1}^{n} \left(y_i - x_i\beta - \hat E(z_i, \beta)\right)^2 d_i m_i^2, \tag{13}$$
where $m_i = (1/(n-1)) \sum_{j \neq i} D_{ij}(\hat\delta)$ is a weight function. This weighting scheme
is similar in spirit to the density weighting in Powell et al. (1989); it avoids the erratic behavior associated with small values of the random denominator of $\hat E(z_i, \beta)$
and simplifies the technical analysis. Further discussions on the motivation and
identification conditions related to the weighted semiparametric least-squares estimation approach are available in Lee (1994). In contrast, by imposing some
strong smoothness conditions on the distribution of the regressors, Lee (1994)
proposed a semiparametric least-squares estimator by solving
$$\min_{\beta} \frac{1}{n} \sum_{i=1}^{n} I_W(w_i)(y_i - x_i\beta - E_{ni})^2 d_i, \tag{14}$$
where $E_{ni}$ is a smoothed version of $\hat E(z_i, \beta)$, and $I_W(\cdot)$ is a fixed trimming function. Notice that we use the weight function $m_i$ to deal with the random denominator of $\hat E(z_i, \beta)$, while Lee adopted a fixed trimming approach to ensure that
the random denominator of $E_{ni}$ is uniformly bounded away from zero asymptotically. For Lee's method, it is necessary to choose some smoothing parameters
in constructing $E_{ni}$. Also, the fixed trimming approach could cause some loss of
efficiency, especially when the dimension of $w$ is high. Recently, Honoré et al.
(1992) proposed pairwise differenced estimators $\hat\beta_{h1}$ and $\hat\beta_{h2}$ such that
$$\hat\beta_{h1} = \arg\min_{\beta} \sum_{i<j} \left(1 - 1\{I_i < \max(0, (w_i - w_j)\hat\delta),\ I_j < \max(0, (w_j - w_i)\hat\delta)\}\right) \left| y_i - y_j - (x_i - x_j)\beta \right|$$
and
$$\hat\beta_{h2} = \arg\min_{\beta} \sum_{i<j} \left(1 - 1\{I_i < \max(0, (w_i - w_j)\hat\delta),\ I_j < \max(0, (w_j - w_i)\hat\delta)\}\right) \left( y_i - y_j - (x_i - x_j)\beta \right)^2,$$
where $\hat\delta$ is a root-n consistent estimator for $\delta_0$. Honoré et al. established that
$\hat\beta_{h1}$ is consistent and asymptotically normal. $\hat\beta_{h2}$ is briefly considered along with
$\hat\beta_1$, $\hat\beta_{p1}$ and $\hat\beta_2$ in the next section. Finally, a computational point worth noting
is that among all the estimators mentioned above, given a first-step estimator
for the selection equation, only $\hat\beta_1$ (or $\hat\beta_{p1}$) involves $O(n)$ operations, while the
remaining ones require at least $O(n^2)$ operations.
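Estimator II in (12) and (13) can be sketched as follows, assuming NumPy. Multiplying the residual in (13) by $m_i$ gives $m_i(y_i - x_i\beta) - \frac{1}{n-1}\sum_{j \neq i} D_{ij}(\hat\delta)(y_j - x_j\beta)$, which is linear in $\beta$, so the minimizer is an ordinary least-squares fit on transformed data. That rearrangement, the function name `estimator_two`, the simulated design, and the use of the true $\delta_0$ as the first-step value are assumptions of this sketch rather than statements from the paper.

```python
import numpy as np

def estimator_two(I, y, x, w, delta_hat):
    """Weighted least squares in the spirit of eqs. (12)-(13), O(n^2) in time and memory."""
    n = len(y)
    index = w @ delta_hat
    diff = index[None, :] - index[:, None]          # diff[i, j] = (w_j - w_i) delta_hat
    D = ((I[None, :] > diff) & (diff > 0)).astype(float)  # D_ij; the diagonal is zero
    m = D.sum(axis=1) / (n - 1)                     # weight m_i
    y_t = m * y - D @ y / (n - 1)                   # m_i y_i - mean of D_ij y_j
    x_t = m[:, None] * x - D @ x / (n - 1)          # same transformation for x
    d = I > 0                                       # selection indicator d_i
    beta_hat, *_ = np.linalg.lstsq(x_t[d], y_t[d], rcond=None)
    return beta_hat

# Illustrative data (assumed design): normal errors, x = w.
rng = np.random.default_rng(0)
n = 2_000
delta0, beta0 = np.array([1.0, 1.0]), np.array([1.0, 2.0])
w = np.column_stack([rng.standard_normal(n), rng.uniform(-2, 2, n)])
u1 = rng.standard_normal(n)
u2 = np.sqrt(0.5) * u1 + np.sqrt(0.5) * rng.standard_normal(n)
I = np.maximum(w @ delta0 + u1, 0.0)
y = (w @ beta0 + u2) * (I > 0)

beta_2 = estimator_two(I, y, w, w, delta0)
```

The $n \times n$ matrix `D` makes the $O(n^2)$ cost noted above explicit; for large samples a blocked computation would be needed.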
3. Large sample properties of the estimators
3.1. Estimator I
We consider here estimator I, $\hat\beta_1$, of $\beta_0$ under the assumption that the error
terms and regressors are independent. In this case, the estimator is defined by
minimizing the following objective function:
$$\frac{1}{n} \sum_{i=1}^{n} 1\{I_i > w_i\hat\delta > 0\} (y_i - x_i\beta - \alpha)^2,$$
where $\hat\delta$ is a preliminary estimator of $\delta_0$ in the first step.
We make the following assumptions:
Assumption 3.1 (Random sampling). The vectors $(I_i, w_i, x_i, y_i)$ satisfying (3)
and (4) are independent and identically distributed across $i$.
The assumption that observations are independent of each other is made for
convenience; it might be possible to allow for dependent data. The identical
distribution condition, however, is essential for $\hat\beta_2$ and the estimators of Lee and
Honoré et al., since they either involve cross-observation comparisons (for $\hat\beta_{h1}$
and $\hat\beta_{h2}$) or cross-observation estimations (in defining $\hat E(z_i, \beta)$ and $E_{ni}$). For $\hat\beta_1$
the essential requirement is the condition that $E(u_{2i} \mid u_{1i} > 0)$ is a constant across
observations.
Assumption 3.2. The error term $(u_1, u_2)$ is independent of the regressors $(x, w)$,
and $(u_1, u_2)$ is continuously distributed with its density having uniformly bounded
partial derivatives.
When $x$ includes endogenous variables, all the estimators except $\hat\beta_{h1}$ can be
easily modified to allow instrumental variable approaches. However, a similar
modification is less obvious for $\hat\beta_{h1}$. Let $s_i = (1, x_i)$ and $\chi(\delta) = E s_i'(u_{2i} - \alpha_0) d_i 1\{I_i > w_i\delta > 0\}$.
Assumption 3.3. Each element of $\chi(\delta)$ is differentiable at $\delta_0$.
Assumption 3.3'. The vector of regressors in the selection equation satisfies
$P(w_i\delta_0 = 0) = 0$.
Assumption 3.4 (Moment condition). $E[\|z\|^4]$ and $E[\|y\|^4]$ are finite, where $\|\cdot\|$
denotes the usual Euclidean norm.
Assumption 3.5 (Preliminary estimator). The preliminary estimator $\hat\delta$ of $\delta_0$ is
$\sqrt{n}$-consistent, and it has an asymptotic linear representation
$$\sqrt{n}(\hat\delta - \delta_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_i + o_p(1), \tag{15}$$
where $\psi_i = \psi(I_i, w_i)$, $E\psi_i = 0$, and $E\|\psi_i\|^2$ is finite.
Assumption 3.6 (Identification). The matrix $E[s_i' s_i 1\{I_i > w_i\delta_0 > 0\}]$ is nonsingular.
Assumption 3.3 is made to ensure a valid Taylor series expansion when the expectation is first taken. A more primitive condition is Assumption 3.3'. For a given
first-step estimator $\hat\delta$ for $\delta_0$, the observations in the trimmed subsample satisfy
$(w_i\hat\delta > 0)$. Assumption 3.3' is made to ensure that the trimming criterion $(w_i\hat\delta > 0)$
is asymptotically equivalent to the condition $(w_i\delta_0 > 0)$. In fact, Assumptions
3.2, 3.3' and the Moment Condition 3.4 are sufficient for Assumption 3.3.$^2$
Assumption 3.3' holds when $w_i\delta_0$ has a density with respect to Lebesgue measure.
$^2$ To show this, let $g(u_1, u_2)$ be the density function of $(u_1, u_2)$ with respect to Lebesgue
measure. Let $E_z$ denote the expectation operator with respect to $z$. Define $\chi^*(\delta_1, \delta_2) = E s'(u_2 - \alpha_0) d\, 1\{I > w\delta_1,\ w\delta_2 > 0\}$. Then from Assumption 3.2 and the definitions of $\alpha_0$ and $\chi(\delta)$, we have
$\chi(\delta) = \chi^*(\delta, \delta)$ and $\chi^*(\delta_0, \delta) = 0$. Consequently, by Assumptions 3.2, 3.3' and 3.4,
$$\chi(\delta_0 + \Delta\delta) - \chi(\delta_0) = \chi^*(\delta_0 + \Delta\delta, \delta_0 + \Delta\delta) - \chi^*(\delta_0, \delta_0 + \Delta\delta)$$
$$= E_z\left[ 1\{w(\delta_0 + \Delta\delta) > 0\}\, s' \int\!\!\int_{w\Delta\delta}^{0} (u_2 - \alpha_0)\, g(u_1, u_2)\, du_1\, du_2 \right]$$
Furthermore, Assumption 3.3' does not rule out the case that all the components
of $w$ are discrete. In this case, if $w$ is one-dimensional, then Assumption 3.3'
holds if $P(w = 0) = 0$. Assumption 3.5 assumes a first-step estimator $\hat\delta$ is available.
Some examples of $\hat\delta$ satisfying Assumption 3.5 are listed in the preceding section.
Since $P(u_1 > 0) > 0$ holds generally, the Identification Condition 3.6 requires
that $(1, x)$ be of full rank in the half-space $\{w : w\delta_0 > 0\}$, which in turn holds
generally without an exclusion restriction imposed on the regressors. In the case
of a single regressor and $x = w$, the full rank condition is satisfied if the
conditional variance of $x$ given $x > 0$ is nonzero. Let $\hat\pi = (\hat\alpha, \hat\beta_1')'$. Note that the
estimator $\hat\pi$ of $\pi_0 = (\alpha_0, \beta_0')'$ can be expressed in closed form.
Theorem 1. Under Assumptions 3.1-3.6, $\hat\pi$ is a consistent estimator of $\pi_0$, and
$$\sqrt{n}(\hat\pi - \pi_0) \xrightarrow{d} N(0, \Sigma_1),$$
where
$$\Sigma_1 = A_1^{-1}\left(C_{\lambda\lambda} + C_{\lambda\psi}\Omega_1' + \Omega_1 C_{\psi\lambda} + \Omega_1 C_{\psi\psi}\Omega_1'\right)A_1^{-1},$$
with
$$\Omega_1 = \frac{\partial \chi(\delta_0)}{\partial \delta'}, \qquad A_1 = E s_i' s_i d_i 1\{I_i > w_i\delta_0 > 0\}$$
and
$$\lambda_i = s_i'(u_{2i} - \alpha_0) d_i 1\{I_i > w_i\delta_0 > 0\},$$
$C_{\lambda\lambda} = E\lambda\lambda'$, with $C_{\lambda\psi}$, $C_{\psi\lambda}$, $C_{\psi\psi}$ analogously defined.
Proof. See Appendix A.
Like the OLS estimator for the linear regression model, $\hat\pi$ can be expressed in
closed form; therefore its asymptotic normality can be proved directly without
showing its consistency first.
Once the asymptotic properties of $\hat\pi$ are determined, it is easy to infer those of
$\hat\beta_1$, since $\hat\beta_1$ is a subvector of $\hat\pi$. Let $J = (0, I_{k \times k})$, where $0$ is a null vector and $I_{k \times k}$
is the $k$-dimensional identity matrix. From Theorem 1, we have $\sqrt{n}(\hat\beta_1 - \beta_0) \xrightarrow{d}
N(0, J\Sigma_1 J')$ under Assumptions 3.1-3.6.
$$= -E_z\left[ 1\{w\delta_0 > 0\}\, s'w \int (u_2 - \alpha_0)\, g(0, u_2)\, du_2 \right] \Delta\delta + o(\Delta\delta).$$
Hence $\chi(\delta)$ is differentiable at $\delta_0$.
For the pooling estimator $\hat\beta_{p1}$, let $dd_k(\delta) = 1\{c_k > I - w\delta > c_{k-1},\ w\delta > -c_{k-1}\}$ for
$k = 1, \ldots, K$. Let $s_K(\delta) = (dd_1(\delta), \ldots, dd_K(\delta), x \sum_{k=1}^{K} dd_k(\delta))$, $\lambda_K = s_K'(\delta_0) \sum_{k=1}^{K}
dd_k(\delta_0)(u_2 - \alpha_k)$, and $\chi_K(\delta) = E s_K'(\delta) \sum_{k=1}^{K} dd_k(\delta)(u_2 - \alpha_k)$. We define
$$\Omega_K = \frac{\partial \chi_K(\delta_0)}{\partial \delta'}, \qquad A_K = E s_K'(\delta_0) s_K(\delta_0).$$
Define $\Sigma_K$ similar to $\Sigma_1$ by replacing $\Omega_1$, $A_1$ and $\lambda$ by $\Omega_K$, $A_K$ and $\lambda_K$, respectively. Let $\pi_K = (\alpha_1, \ldots, \alpha_K, \beta_0')'$; then under similar regularity conditions it is
straightforward to show that the estimator $\hat\pi_K$ for $\pi_K$ is asymptotically normal with $\Sigma_K$
as the asymptotic covariance matrix by the same set of arguments. Again it is
easy to derive the asymptotic properties of $\hat\beta_{p1}$ since $\hat\beta_{p1}$ is a subvector of $\hat\pi_K$.
3.2. Estimator II
In this subsection, we derive the large sample properties of estimator II, $\hat\beta_2$,
of $\beta_0$. Let
$$\tau_1(\delta) = E d_i D_{ij}(\delta) D_{ik}(\delta)(x_i - x_k)'(x_i - x_j)$$
and
$$\tau_2(\delta) = E d_i D_{ij}(\delta) D_{ik}(\delta)(u_{2i} - u_{2j})(x_i - x_k)'$$
for $i \neq j \neq k$. The following additional assumptions are made.
Assumption 3.7. Each element of $\tau_2(\delta)$ is differentiable at $\delta_0$.
Assumption 3.7'. For observations indexed by $i$ and $j$ with $i \neq j$ under random
sampling, $P(w_i\delta_0 = w_j\delta_0,\ w_i \neq w_j) = 0$.
Assumption 3.8. The matrix $\tau_1(\delta_0) = E[d_i D_{ij}(\delta_0) D_{ik}(\delta_0)(x_i - x_k)'(x_i - x_j)]$ for
$i \neq j \neq k$ is nonsingular.
Similar to Assumption 3.3, Assumption 3.7 is a smoothness condition. We
can show as in footnote 2 that Assumptions 3.2, 3.4 and 3.7' imply Assumption 3.7. Note that the observations used to construct $\hat E(z_i, \beta)$ in (12) satisfy
$(w_j\hat\delta > w_i\hat\delta)$. Given Assumption 3.7', this requirement will be asymptotically equivalent to the condition $(w_j\delta_0 > w_i\delta_0)$. Assumption 3.7' does not rule out the case
that all the components in $w$ are discrete. When $w$ is one-dimensional,
Assumption 3.7' holds trivially. Like Assumption 3.6, Assumption 3.8 is an
identification condition. $\tau_1(\delta_0)$ can be rewritten as $E\{d_i [x_i - E(x \mid w\delta_0 > w_i\delta_0)]'
[x_i - E(x \mid w\delta_0 > w_i\delta_0)][P(u_1 > -w_i\delta_0,\ w\delta_0 > w_i\delta_0)]^2\}$; therefore our identification
condition is essentially equivalent to that of Lee$^3$ (Assumption 5 in Lee, 1994).
In general, Assumption 3.8 does not require any exclusion restriction on the
regressors. See Lee (1994) for further discussions. In the case that the error
term $u_1$ has positive density everywhere, the identification conditions require
$x_i - E(x_i \mid w_i\delta_0 > 0)$ to be of full rank for $\hat\beta_1$ and $x_i - E(x_j \mid w_j\delta_0 > w_i\delta_0)$ to be
of full rank for $\hat\beta_2$ and Lee's estimator, compared with a slightly weaker condition that $x_i - Ex_i$ be of full rank for identification for the estimators in Honoré
et al. Finally, only Lee's estimator involves smoothing, which, in turn, requires
strong smoothness conditions; in contrast, both of our estimators and those in
Honoré et al. require minimal smoothness conditions.
Theorem 2. If Assumptions 3.1, 3.2, 3.4, 3.5, 3.7 and 3.8 hold, then $\hat\beta_2$ is a
consistent estimator of $\beta_0$, and
$$\sqrt{n}(\hat\beta_2 - \beta_0) \xrightarrow{d} N(0, \Sigma_2),$$
where
$$\Sigma_2 = A_2^{-1}\left(C_{\eta\eta} + C_{\eta\psi}\Omega_2' + \Omega_2 C_{\psi\eta} + \Omega_2 C_{\psi\psi}\Omega_2'\right)A_2^{-1}$$
with
$$\Omega_2 = \frac{\partial \tau_2(\delta_0)}{\partial \delta'}, \qquad A_2 = E d_i D_{ij}(\delta_0) D_{ik}(\delta_0)(x_i - x_k)'(x_i - x_j)$$
and $\eta_i$ is defined by (51) in the proof. $C_{\eta\eta} = E\eta_i\eta_i'$, with $C_{\eta\psi}$ and $C_{\psi\eta}$ analogously
defined.
Proof. See Appendix A.
The consistency and asymptotic normality of $\hat\beta_2$ are proved by making use of
the recent developments in U-processes (see, for example, Arcones and Giné,
1993).
Similarly, the pairwise differenced estimator $\hat\beta_{h2}$ in Honoré et al. (1992) can
also be expressed in terms of U-statistics. Its consistency and asymptotic normality can be shown by following the same line of reasoning under similar regularity
conditions.
In order for large sample inference on $\beta_0$ to be carried out using the estimators
$\hat\beta_1$, $\hat\beta_2$, $\hat\beta_{p1}$, and $\hat\beta_{h2}$, consistent estimators of the asymptotic covariance matrices
of these estimators need to be provided. Consider, for example, estimating $\Sigma_1$.
$^3$ In fact, the fixed trimming set $W$ in Lee is constructed such that $P(u_1 > -w_i\delta_0,\ w\delta_0 > w_i\delta_0)$
is bounded away from zero for $w_i \in W$. Therefore, our identification condition is implied by that of
Lee.
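A consistent estimator of $\Sigma_1$ can be easily obtained if we can find consistent
estimators of $\Omega_1$, $C_{\lambda\lambda}$, $C_{\lambda\psi}$, $C_{\psi\psi}$ and $A_1$. Obviously,
$$\hat A_1 = \frac{1}{n} \sum_{i=1}^{n} s_i' s_i 1\{I_i > w_i\hat\delta > 0\} d_i$$
is a consistent estimator of $A_1$. It follows directly from Pakes and Pollard (1989,
pp. 1043-1044) that one consistent estimator of $\Omega_1$ is a numerical derivative
estimator $\hat\Omega_1$,
$$\hat\Omega_{1j} = \frac{1}{2n\varepsilon_{1n}} \sum_{i=1}^{n} \left( h_1(z_i, \hat\pi, \hat\delta + \varepsilon_{1n} q_{1j}) - h_1(z_i, \hat\pi, \hat\delta - \varepsilon_{1n} q_{1j}) \right)$$
for the $j$th column of $\hat\Omega_1$, where $h_1(z_i, \pi, \delta) = s_i'(y_i - s_i\pi) d_i 1\{I_i > w_i\delta > 0\}$,
$q_{1j}$ is the unit vector with 1 in its $j$th place, and $\varepsilon_{1n}$ converges to zero as the sample
size increases with $n^{-1/2}\varepsilon_{1n}^{-1} = O_p(1)$.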
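The numerical derivative estimator can be sketched with a generic central-difference routine, assuming NumPy. The helper names `numerical_derivative` and `make_trimmed_moment`, the step size `eps`, the data, and the placeholder value of `pi_hat` are illustrative assumptions; the moment function is the sample analogue of $\chi(\delta)$ with residual $y_i - s_i\hat\pi$.

```python
import numpy as np

def numerical_derivative(moment, delta_hat, eps):
    """Central-difference estimate of d moment(delta) / d delta', one column per element."""
    delta_hat = np.asarray(delta_hat, dtype=float)
    cols = []
    for j in range(delta_hat.size):
        q = np.zeros_like(delta_hat)
        q[j] = 1.0                                   # unit vector q_j
        cols.append((moment(delta_hat + eps * q) - moment(delta_hat - eps * q)) / (2 * eps))
    return np.column_stack(cols)

def make_trimmed_moment(I, y, x, w, pi_hat):
    """Sample analogue of chi(delta): mean of s_i'(y_i - s_i pi_hat) 1{I_i > w_i delta > 0}."""
    s = np.column_stack([np.ones(len(y)), x])
    resid = y - s @ pi_hat
    def moment(delta):
        index = w @ delta
        keep = (I > index) & (index > 0)             # implies d_i = 1
        return (s * (resid * keep)[:, None]).mean(axis=0)
    return moment

# Illustrative use on simulated data; pi_hat below is a placeholder value.
rng = np.random.default_rng(0)
n = 5_000
w = np.column_stack([rng.standard_normal(n), rng.uniform(-2, 2, n)])
u1 = rng.standard_normal(n)
u2 = np.sqrt(0.5) * u1 + np.sqrt(0.5) * rng.standard_normal(n)
I = np.maximum(w @ np.array([1.0, 1.0]) + u1, 0.0)
y = (w @ np.array([1.0, 2.0]) + u2) * (I > 0)
moment = make_trimmed_moment(I, y, w, w, pi_hat=np.array([0.56, 1.0, 2.0]))
Omega_1_hat = numerical_derivative(moment, np.array([1.0, 1.0]), eps=0.1)
```

Because the moment contains indicator functions, the step size must shrink slowly relative to $n^{-1/2}$, which is what the rate condition on $\varepsilon_{1n}$ expresses; the fixed value 0.1 here is only for illustration.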
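Consistent estimation of the matrices $C_{\psi\lambda}$, $C_{\lambda\psi}$ and $C_{\lambda\lambda}$ requires suitable estimates of the components $\{\lambda_i\}$, $\{\psi_i\}$ in the asymptotically linear representation
of $\hat\pi$ given in (48). Specifically, it is useful to assume that sequences $\{\hat\psi_i\}$, $\{\hat\lambda_i\}$
exist such that
$$\frac{1}{n} \sum_{i=1}^{n} \|\hat\psi_i - \psi_i\|^2 = o_p(1), \qquad \frac{1}{n} \sum_{i=1}^{n} \|\hat\lambda_i - \lambda_i\|^2 = o_p(1).$$
One example of $\{\hat\psi_i\}$ is suggested in Honoré and Powell (1994). An analogous
sequence of $\{\hat\lambda_i\}$ can be constructed as
$$\hat\lambda_i = s_i' 1\{I_i > w_i\hat\delta > 0\}(y_i - s_i\hat\pi) d_i.$$
With the sequences $\{\hat\psi_i\}$ and $\{\hat\lambda_i\}$ given, the remaining components of the asymptotic covariance matrix $\Sigma_1$, namely $C_{\lambda\lambda}$, $C_{\psi\lambda}$ and $C_{\psi\psi}$, can be consistently
estimated as in Powell (1989). Define
$$\hat C_{\psi\psi} = \frac{1}{n} \sum_{i=1}^{n} \hat\psi_i \hat\psi_i',$$
with analogous definitions of $\hat C_{\lambda\lambda}$ and $\hat C_{\psi\lambda}$. It is easy to show that $\hat C_{\lambda\lambda}$, $\hat C_{\psi\lambda}$
and $\hat C_{\psi\psi}$ are consistent for $C_{\lambda\lambda}$, $C_{\psi\lambda}$ and $C_{\psi\psi}$, respectively. By following the same
approach, it is not difficult to construct consistent estimators for the asymptotic
covariance matrices of $\hat\beta_{p1}$, $\hat\beta_2$ and $\hat\beta_{h2}$.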
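Once $\hat A_1$, $\hat\Omega_1$, $\{\hat\lambda_i\}$ and $\{\hat\psi_i\}$ are available, the covariance formula of Theorem 1 can be assembled as in the following sketch, assuming NumPy. The function name `sandwich_covariance` and the random placeholder inputs are assumptions for illustration; the inputs would come from the estimators described above.

```python
import numpy as np

def sandwich_covariance(A, lam, psi, Omega):
    """A^{-1} (C_ll + C_lp Omega' + Omega C_pl + Omega C_pp Omega') A^{-1}.
    lam is n x q (rows lambda_i'), psi is n x p (rows psi_i'), Omega is q x p."""
    n = lam.shape[0]
    C_ll = lam.T @ lam / n
    C_lp = lam.T @ psi / n          # C_pl is its transpose
    C_pp = psi.T @ psi / n
    middle = C_ll + C_lp @ Omega.T + Omega @ C_lp.T + Omega @ C_pp @ Omega.T
    A_inv = np.linalg.inv(A)        # A is symmetric, so A_inv multiplies both sides
    return A_inv @ middle @ A_inv

# Placeholder inputs with q = 3 (intercept plus two slopes) and p = 2.
rng = np.random.default_rng(0)
lam = rng.standard_normal((500, 3))
psi = rng.standard_normal((500, 2))
Omega = rng.standard_normal((3, 2))
A = np.eye(3) + 0.1 * np.ones((3, 3))
Sigma_hat = sandwich_covariance(A, lam, psi, Omega)
J = np.hstack([np.zeros((2, 1)), np.eye(2)])   # selects the slope block
Sigma_beta = J @ Sigma_hat @ J.T
```

The middle term equals the sample second moment of $\hat\lambda_i + \hat\Omega_1\hat\psi_i$, which makes clear how first-step estimation error enters the second-step variance.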
4. Finite sample behavior
In this section we present a small Monte Carlo study of the proposed estimators. While it cannot completely characterize the sampling distribution of the
estimators, it does reveal certain aspects of their behavior.
In all of the designs
$$I = \max\{w_1\delta_{11} + w_2\delta_{12} + u_1,\ 0\}, \tag{16}$$
$$y = (w_1\beta_{11} + w_2\beta_{12} + u_2) 1\{I > 0\}, \tag{17}$$
where the true parameters are $(\delta_{11}, \delta_{12}) = (1, 1)$ and $(\beta_{11}, \beta_{12}) = (1, 2)$. The regressors $w_1$ and $w_2$ are drawn from a normal $N(0, 1)$ distribution and a uniform
$U(-2, 2)$ distribution, respectively. Different designs are constructed by varying
the distributions of the error terms. Data on $u_1$ are generated from three different distributions, namely, the standard normal distribution $N(0, 1)$ (Normal); a
mixed gamma and normal distribution (Gamma*Normal): $\sqrt{0.8}\,\mathrm{Gamma}(0, 1) +
\sqrt{0.2}\,N(0, 1)$; and a mixed negative gamma and normal distribution ($-$Gamma(0,1)*Normal):
$-[\sqrt{0.8}\,\mathrm{Gamma}(0, 1) + \sqrt{0.2}\,N(0, 1)]$, where Gamma(0,1) is a standardized gamma random variate with zero mean and unit variance, of which the
density function is $f_G(t) = \frac{8}{3}(t + 2)^3 \exp[-2(t + 2)]$, $t > -2$, with its mode at
$-\frac{1}{2}$. The disturbance $u_2$ is obtained from $u_2 = \sqrt{0.5}\,u_1 + \sqrt{0.5}\,u^*$, where $u^*$ is a
$N(0, 1)$ random variable independent of $u_1$. These designs are similar to those in
Lee (1994), and the censoring level is about 50%.
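The three designs can be simulated with a short sketch, assuming NumPy. The density $f_G$ is that of a gamma variate with shape 4 and rate 2 shifted by $-2$, which is how the standardized gamma draw is generated here; the function names, design labels, and sample size are illustrative assumptions.

```python
import numpy as np

DELTA0 = np.array([1.0, 1.0])
BETA0 = np.array([1.0, 2.0])
DESIGNS = ("normal", "gamma_normal", "neg_gamma_normal")

def draw_u1(rng, n, design):
    """Selection-equation error for the three designs."""
    if design == "normal":
        return rng.standard_normal(n)
    # Gamma(shape=4, scale=1/2) has mean 2 and variance 1; subtracting 2 centers it.
    g = rng.gamma(shape=4.0, scale=0.5, size=n) - 2.0
    mix = np.sqrt(0.8) * g + np.sqrt(0.2) * rng.standard_normal(n)
    if design == "gamma_normal":
        return mix
    if design == "neg_gamma_normal":
        return -mix
    raise ValueError(f"unknown design: {design}")

def simulate(rng, n, design="normal"):
    """Draw (I, y, w) from eqs. (16)-(17)."""
    w = np.column_stack([rng.standard_normal(n), rng.uniform(-2, 2, n)])
    u1 = draw_u1(rng, n, design)
    u2 = np.sqrt(0.5) * u1 + np.sqrt(0.5) * rng.standard_normal(n)
    I = np.maximum(w @ DELTA0 + u1, 0.0)
    y = (w @ BETA0 + u2) * (I > 0)
    return I, y, w

rng = np.random.default_rng(0)
censoring = {}
for design in DESIGNS:
    I, y, w = simulate(rng, 100_000, design)
    censoring[design] = np.mean(I == 0)   # share of censored observations
```

The dictionary `censoring` gives the simulated share of censored observations in each design, which can be compared with the roughly 50% censoring level stated above.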
Here we consider the results of estimating $(\beta_{11}, \beta_{12})$ by different estimation
approaches. Lee (1994) compared the finite sample performance of his
semiparametric least-squares estimator for the Type-3 Tobit model with that of some
semiparametric estimators proposed for the Type-2 Tobit model, in the context
of the Type-3 Tobit model with the exclusion restriction on $\beta_0$ alluded to
earlier imposed; the results favored Lee's estimator. We will focus on comparing
various semiparametric estimators designed for the Type-3 Tobit model. More
specifically, the second-step estimators are $\hat\beta_{p1}$, $\hat\beta_2$, $\hat\beta_{h1}$, $\hat\beta_{h2}$, $\hat\beta_{I}$, and $\hat\beta_{II}$, where the
censoring intervals for $\hat\beta_{p1}$ are chosen to be equispaced with $K = 20$, $c_0 = -4$
and $c_K = 4$. We also investigate the performance of $\hat\beta_{p1}$ with different censoring
intervals. $\hat\beta_{I}$ and $\hat\beta_{II}$ are the estimators by Lee with the fixed trimming sets being
$|w_1| < 1.9$, $|w_2| < 1.8$ and $|w_1| < 1.7$, $|w_2| < 1.6$, respectively; also $\hat\beta_2$ is our estimator II, and $\hat\beta_{h1}$ and $\hat\beta_{h2}$ are the pairwise differenced estimators with the absolute
and square loss functions, respectively. Note that $\hat\beta_{h1}$ is the only one without a
closed-form solution given a first-step estimator. The pairwise differenced estimator with the absolute loss function proposed by Honoré and Powell is used
as the first-step estimator due to its good finite sample performance (Honoré and
Powell, 1994). Because estimator I, $\hat\beta_1$, is a special case of $\hat\beta_{p1}$ with only one
censoring interval, namely, $(0, +\infty)$, there could be serious loss of observations
due to heavy trimming; hence its performance is not reported here.
The results from 300 replications from each of the three models are presented
with sample sizes of 50, 100, and 200. For each of the six estimators under
consideration, we report the mean value (Mean), the standard deviation (S.D.),
and the root mean square error (RMSE). For both $\hat\beta_{I}$ and $\hat\beta_{II}$, all the smoothing
parameters and kernel smoothing functions are chosen according to Lee (1994)
(see Lee for further details). The biases of the estimates can be derived by
comparing their mean values with the true parameters.
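The three reported summary statistics can be computed from a matrix of replicated estimates as in this sketch, assuming NumPy. The function name `summarize`, the use of the $n - 1$ divisor for the standard deviation, and the toy numbers are assumptions; the paper does not state which divisor was used.

```python
import numpy as np

def summarize(estimates, truth):
    """Mean, S.D. and RMSE across replications (rows) for each parameter (columns)."""
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    mean = estimates.mean(axis=0)
    sd = estimates.std(axis=0, ddof=1)                    # assumed n - 1 divisor
    rmse = np.sqrt(((estimates - truth) ** 2).mean(axis=0))
    return mean, sd, rmse

# Toy example: three replications of (beta_11, beta_12) with true values (1, 2).
toy = [[0.9, 2.1], [1.1, 1.9], [1.0, 2.3]]
mean, sd, rmse = summarize(toy, [1.0, 2.0])
bias = mean - np.array([1.0, 2.0])
```

With these definitions, the squared RMSE equals the squared bias plus $(R - 1)/R$ times the squared S.D. over $R$ replications, so RMSE and S.D. nearly coincide whenever the bias is small.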
Table 1 reports simulation results of the six semiparametric estimation approaches with various sample sizes when the error term $u_1$ is normal (Normal).
Note that there are only small sample biases even when the sample size is 50,
in which case the effective sample size is only about 25 for the main
equation as a result of the selection scheme. The biases and variances decrease
as the sample sizes increase. Also notice that as the sample size increases from
50 to 100, the relative performance of $\hat\beta_{p1}$ improves significantly, which is to be
expected since the loss of degrees of freedom can be very important when the
sample size is small due to the estimation of several intercept terms corresponding to the same number of censoring intervals. The problem becomes less severe
as the sample size increases while the number of the intercept terms is held fixed.
Table 2 reports simulation results in the case of a Gamma*Normal error term
in the selection equation. The pattern of the performance of the estimators is
close to that reported in Table 1. Most estimation procedures perform worse than
in the first design. As pointed out by Lee, the Gamma*Normal error term has a
heavier upper tail, and the thinner left tail is censored due to the sample selection
and the positive correlation between the error terms across the two equations.
Table 3 contains simulation results with a negative Gamma*Normal error term
in the selection equation. Compared with the Gamma*Normal case, here the
thinner tail is censored while the heavier right tail remains; consequently, all the
estimators have the smallest RMSEs compared with the other two designs.
Overall, we observe that $\hat\beta_{p1}$ is quite competitive among the six estimators
examined when the sample size is moderately large, and $\hat\beta_2$ is slightly worse
than $\hat\beta_{h1}$, $\hat\beta_{h2}$ and $\hat\beta_{I}$. Table 4 reports the simulation results for $\hat\beta_{p1}$ with different censoring intervals.
The sample size is 100. All the censoring intervals are chosen to be equispaced
with $c_0 = -4$. The individual interval length for various censoring schemes is
0.2, 0.4, 0.6, 0.8 and 1.0 with the corresponding $c_K$'s equal to 4, 4, 3.8, 4
and 4, respectively.$^4$ This range seems to be quite wide. There is a trade-off
between loss of degrees of freedom for censoring with small intervals with loss
4 Simulation results are not very sensitive to the equispacing scheme. The interval $(c_0, c_K)$ is
chosen to be wide enough to ensure that the loss of observations is minimal.
Table 1
Normal design

n     Estimator            True    Mean    S.D.    RMSE
50    $\hat\beta_{p1}$     1.000   1.000   0.319   0.318
                           2.000   2.025   0.279   0.279
      $\hat\beta_{2}$      1.000   0.985   0.297   0.297
                           2.000   2.002   0.289   0.288
      $\hat\beta_{h1}$     1.000   0.988   0.289   0.289
                           2.000   2.006   0.270   0.270
      $\hat\beta_{h2}$     1.000   0.984   0.276   0.276
                           2.000   2.006   0.260   0.260
      $\hat\beta_{I}$      1.000   0.982   0.319   0.319
                           2.000   1.999   0.265   0.264
      $\hat\beta_{II}$     1.000   0.980   0.369   0.369
                           2.000   1.998   0.285   0.285
100   $\hat\beta_{p1}$     1.000   1.010   0.197   0.197
                           2.000   1.984   0.189   0.189
      $\hat\beta_{2}$      1.000   1.016   0.210   0.210
                           2.000   1.984   0.203   0.203
      $\hat\beta_{h1}$     1.000   1.012   0.194   0.194
                           2.000   1.988   0.183   0.183
      $\hat\beta_{h2}$     1.000   1.007   0.192   0.191
                           2.000   1.987   0.183   0.183
      $\hat\beta_{I}$      1.000   1.005   0.214   0.213
                           2.000   1.984   0.188   0.188
      $\hat\beta_{II}$     1.000   1.000   0.229   0.229
                           2.000   1.989   0.195   0.195
200   $\hat\beta_{p1}$     1.000   1.005   0.128   0.128
                           2.000   2.003   0.121   0.121
      $\hat\beta_{2}$      1.000   1.000   0.145   0.145
                           2.000   2.002   0.130   0.129
      $\hat\beta_{h1}$     1.000   1.001   0.133   0.133
                           2.000   1.999   0.123   0.123
      $\hat\beta_{h2}$     1.000   1.000   0.132   0.132
                           2.000   2.000   0.121   0.121
      $\hat\beta_{I}$      1.000   0.997   0.137   0.137
                           2.000   1.994   0.121   0.121
      $\hat\beta_{II}$     1.000   0.997   0.154   0.153
                           2.000   1.998   0.130   0.130
of observations associated with censoring with large intervals. The values of $\hat\beta_{p1}$
associated with different censoring intervals do not appear to be very sensitive
to different interval censoring schemes. One practical approach is to take average
values of the estimates. The row marked 'Ave' reports the performance of the
Table 2
Gamma*Normal design

n     Estimator            True    Mean    S.D.    RMSE
50    $\hat\beta_{p1}$     1.000   0.953   0.354   0.357
                           2.000   1.981   0.351   0.351
      $\hat\beta_{2}$      1.000   0.950   0.349   0.351
                           2.000   1.970   0.298   0.299
      $\hat\beta_{h1}$     1.000   0.945   0.327   0.331
                           2.000   1.972   0.288   0.289
      $\hat\beta_{h2}$     1.000   0.946   0.317   0.321
                           2.000   1.971   0.284   0.285
      $\hat\beta_{I}$      1.000   0.934   0.347   0.352
                           2.000   1.981   0.299   0.299
      $\hat\beta_{II}$     1.000   0.922   0.390   0.397
                           2.000   1.977   0.331   0.331
100   $\hat\beta_{p1}$     1.000   0.996   0.196   0.196
                           2.000   2.005   0.198   0.198
      $\hat\beta_{2}$      1.000   0.995   0.220   0.219
                           2.000   1.997   0.221   0.220
      $\hat\beta_{h1}$     1.000   0.991   0.197   0.197
                           2.000   1.993   0.211   0.211
      $\hat\beta_{h2}$     1.000   0.989   0.195   0.195
                           2.000   1.992   0.211   0.210
      $\hat\beta_{I}$      1.000   0.993   0.208   0.208
                           2.000   1.987   0.216   0.216
      $\hat\beta_{II}$     1.000   0.992   0.245   0.245
                           2.000   1.993   0.229   0.229
200   $\hat\beta_{p1}$     1.000   1.005   0.128   0.128
                           2.000   1.999   0.116   0.116
      $\hat\beta_{2}$      1.000   1.005   0.149   0.149
                           2.000   1.992   0.143   0.143
      $\hat\beta_{h1}$     1.000   1.003   0.141   0.140
                           2.000   1.995   0.134   0.134
      $\hat\beta_{h2}$     1.000   1.000   0.138   0.137
                           2.000   1.993   0.132   0.132
      $\hat\beta_{I}$      1.000   1.004   0.154   0.154
                           2.000   1.992   0.135   0.135
      $\hat\beta_{II}$     1.000   1.008   0.174   0.173
                           2.000   1.991   0.145   0.145
averaged estimates. The results are quite encouraging with the average estimator
outperforming individual interval censoring estimators. One possible explanation
for the good performance of the average estimator is its fuller utilization of the
data.
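The censoring schemes and the averaging step described above can be sketched as follows. The helper names `censoring_points` and `average_estimator` are hypothetical, and the sketch assumes that each scheme uses equispaced points from $c_0 = -4$ up to the stated $c_K$; how the estimator uses these points is defined elsewhere in the paper and is not reproduced here.

```python
import numpy as np

def censoring_points(c0, width, c_upper):
    """Equispaced censoring points c0, c0 + width, ..., c_upper."""
    k = int(round((c_upper - c0) / width))   # number of censoring intervals
    return c0 + width * np.arange(k + 1)

# Interval widths and upper end points c_K taken from the text above.
schemes = {0.2: 4.0, 0.4: 4.0, 0.6: 3.8, 0.8: 4.0, 1.0: 4.0}
grids = {w: censoring_points(-4.0, w, cK) for w, cK in schemes.items()}
n_intervals = {w: len(g) - 1 for w, g in grids.items()}

def average_estimator(estimates_by_scheme):
    """Average the estimates across schemes (rows: schemes, columns: parameters)."""
    return np.mean(np.asarray(estimates_by_scheme, dtype=float), axis=0)
```

Under these assumptions a width of 0.6 does not divide the interval (-4, 4) evenly, which is one reading of why its upper end point is 3.8 rather than 4.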
Table 3
Negative Gamma*Normal design

n     Estimator            True    Mean    S.D.    RMSE
50    $\hat\beta_{p1}$     1.000   1.023   0.293   0.294
                           2.000   2.021   0.282   0.282
      $\hat\beta_{2}$      1.000   0.997   0.283   0.282
                           2.000   1.999   0.269   0.268
      $\hat\beta_{h1}$     1.000   0.996   0.277   0.277
                           2.000   1.997   0.267   0.266
      $\hat\beta_{h2}$     1.000   0.994   0.266   0.265
                           2.000   1.997   0.252   0.251
      $\hat\beta_{I}$      1.000   0.988   0.294   0.294
                           2.000   1.995   0.261   0.260
      $\hat\beta_{II}$     1.000   0.996   0.328   0.328
                           2.000   1.997   0.278   0.278
100   $\hat\beta_{p1}$     1.000   0.992   0.177   0.177
                           2.000   1.998   0.160   0.160
      $\hat\beta_{2}$      1.000   0.984   0.185   0.185
                           2.000   1.983   0.173   0.173
      $\hat\beta_{h1}$     1.000   0.985   0.179   0.180
                           2.000   1.984   0.167   0.168
      $\hat\beta_{h2}$     1.000   0.985   0.177   0.177
                           2.000   1.983   0.160   0.161
      $\hat\beta_{I}$      1.000   0.981   0.197   0.198
                           2.000   1.989   0.162   0.162
      $\hat\beta_{II}$     1.000   0.978   0.214   0.215
                           2.000   1.993   0.171   0.171
200   $\hat\beta_{p1}$     1.000   1.004   0.118   0.118
                           2.000   2.004   0.112   0.112
      $\hat\beta_{2}$      1.000   0.996   0.132   0.132
                           2.000   1.999   0.114   0.114
      $\hat\beta_{h1}$     1.000   0.996   0.124   0.124
                           2.000   1.998   0.111   0.111
      $\hat\beta_{h2}$     1.000   0.997   0.124   0.124
                           2.000   1.997   0.110   0.109
      $\hat\beta_{I}$      1.000   0.998   0.130   0.130
                           2.000   1.995   0.106   0.106
      $\hat\beta_{II}$     1.000   0.997   0.146   0.146
                           2.000   1.993   0.111   0.112
In light of the close relationship between our estimator II ($\hat\beta_{2}$) and Lee's (1994)
estimator, some comparisons can be made. The weighting scheme in defining $\hat\beta_{2}$
is adopted largely for convenience rather than for efficiency concerns. However,
it is necessary to choose some smoothing parameters to implement Lee's
Table 4
Results with various interval censoring schemes. Sample size: 100

Design                  Interval width   True    Mean    S.D.    RMSE
Normal                  0.2              1.000   1.016   0.211   0.211
                                         2.000   1.989   0.198   0.198
                        0.4              1.000   1.010   0.197   0.197
                                         2.000   1.984   0.189   0.189
                        0.6              1.000   1.010   0.192   0.192
                                         2.000   1.985   0.184   0.184
                        0.8              1.000   1.019   0.198   0.199
                                         2.000   1.993   0.182   0.181
                        1.0              1.000   1.018   0.211   0.212
                                         2.000   1.991   0.179   0.179
                        Ave              1.000   1.015   0.191   0.191
                                         2.000   1.989   0.174   0.171
Gamma*Normal            0.2              1.000   0.994   0.208   0.208
                                         2.000   2.004   0.214   0.214
                        0.4              1.000   0.996   0.196   0.196
                                         2.000   2.005   0.198   0.198
                        0.6              1.000   1.007   0.196   0.196
                                         2.000   2.008   0.190   0.190
                        0.8              1.000   1.002   0.200   0.200
                                         2.000   2.007   0.205   0.205
                        1.0              1.000   1.012   0.207   0.207
                                         2.000   2.014   0.199   0.199
                        Ave              1.000   1.002   0.190   0.189
                                         2.000   2.007   0.189   0.189
Negative Gamma*Normal   0.2              1.000   0.993   0.190   0.190
                                         2.000   1.999   0.162   0.161
                        0.4              1.000   0.992   0.177   0.177
                                         2.000   1.998   0.160   0.160
                        0.6              1.000   0.986   0.177   0.177
                                         2.000   1.996   0.166   0.166
                        0.8              1.000   0.990   0.184   0.184
                                         2.000   2.000   0.171   0.171
                        1.0              1.000   0.988   0.191   0.191
                                         2.000   1.995   0.179   0.179
                        Ave              1.000   0.990   0.173   0.173
                                         2.000   1.998   0.156   0.155
estimator. Lee (1994) reported that his estimators are not very sensitive
to the smoothing parameters in his Monte Carlo study. Second, Lee's estimator involves a fixed trimming set whose choice could affect its performance.
This point is illustrated by the relatively poor performance of $\hat\beta_{II}$ due to
heavier trimming (27% trimming for $\hat\beta_{II}$ vs. 15% for $\hat\beta_{I}$), and this problem is expected to become more severe when more regressors are included in the selection
equation.
5. Semiparametric efficiency bound for the Type-3 Tobit model
Semiparametric efficiency bounds can serve as a criterion for comparing existing estimators and suggesting improved estimators. In this section we calculate
the semiparametric bound for the Type-3 Tobit model under the condition that
the error terms and regressors are independent. For a general treatment of efficiency
bounds, see Begun et al. (1983) and Newey (1990).
Here we follow the approach of Cosslett (1987) (also Begun et al., 1983). The
discussion is informal, while the arguments in the following can be made rigorous
under a set of assumptions in Appendix B, Section B.1, by following Cosslett
(1987). Like Cosslett, we adopt the Hellinger differential approach (see Begun et al.,
1983 for details). Consider a more general sample selection model
$$l = \max\{-v_1(z,\theta) + u_1,\; 0\}, \tag{18}$$
$$y = 1\{l > 0\}\,\{-v_2(z,\theta) + u_2\}. \tag{19}$$
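To make the selection mechanism in (18)-(19) concrete, the following sketch simulates data from the linear special case $v_1 = -w\delta$, $v_2 = -x\beta$ considered later in this section. The function name, the standard normal regressors, and the bivariate normal errors with correlation `rho` are illustrative assumptions only; the model itself leaves the error distribution unspecified.

```python
import numpy as np

def simulate_type3_tobit(n, delta, beta, rho=0.5, seed=0):
    """Draw (w, x, l, y) from (18)-(19) with v1 = -w @ delta, v2 = -x @ beta."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((n, len(delta)))      # selection regressors (assumed normal)
    x = rng.standard_normal((n, len(beta)))       # outcome regressors (assumed normal)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    u = rng.multivariate_normal(np.zeros(2), cov, size=n)  # (u1, u2), independent of (w, x)
    v1 = -(w @ delta)
    v2 = -(x @ beta)
    l = np.maximum(-v1 + u[:, 0], 0.0)            # eq. (18): censored selection variable
    y = (l > 0) * (-v2 + u[:, 1])                 # eq. (19): observed only when l > 0
    return w, x, l, y

w, x, l, y = simulate_type3_tobit(500, np.array([1.0]), np.array([1.0, 2.0]))
```

Unlike a binary selection rule, the positive values of `l` are observed, which is the extra information the Type-3 model adds to the selection equation.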
Note that under the independence restriction the likelihood is
$$f(y,l,z,\theta,g,h) = h(z)\{G_0(v_1)1\{l=0\} + g(l+v_1,\, y+v_2)1\{l>0\}\}, \tag{20}$$
where $v_1 = v_1(z,\theta)$, $v_2 = v_2(z,\theta)$; $g(u_1,u_2)$, $h(z)$ and $f(y,l,z,\theta,g,h)$ are the densities of $(u_1,u_2)$, $z$ and $(y,l,z)$ with respect to sigma-finite measures $\xi$, $\omega$ and $\mu$,
respectively, on some measurable spaces. Here $\xi$ is Lebesgue measure, while $\omega$
is the product of measures corresponding to each explanatory variable: Lebesgue
measure for continuous variables, counting measure for discrete variables, and
Lebesgue measure plus counting measure for any mixed (discrete and continuous)
variables. Also $g_0$ and $G_0$ are the marginal density and cumulative distribution
functions of $u_1$. Let $L^2(\mu)$, $L^2(\xi)$ and $L^2(\omega)$ be the usual $L^2$-spaces of square-integrable functions; inner products and norms in these spaces are denoted by
$\langle\cdot,\cdot\rangle$ and $\|\cdot\|$, with subscripts $\mu$, $\omega$ and $\xi$ to distinguish the different $L^2$-spaces.
In our application the measure $\mu$ is the product of $\omega$ with counting measure on
$\{l = 0, y = 0\}$ plus Lebesgue measure on $\{l > 0\}$, so that
$$\|\psi\|_\mu^2 = \|\psi(0,0,\cdot)\|_\omega^2 + \int_0^\infty\!\!\int_{-\infty}^{\infty} dy\,dl\,\|\psi(y,l,\cdot)\|_\omega^2 \tag{21}$$
for any function $\psi(y,l,z)$ in $L^2(\mu)$.
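The two branches of the likelihood (20) can be evaluated as in the sketch below. It assumes, purely for illustration, that $g$ is a standard bivariate normal density with correlation `rho` (so that $G_0$ is the standard normal distribution function), and it omits the factor $h(z)$, which does not involve $\theta$. The function names are hypothetical.

```python
import math

def norm_cdf(t):
    """Standard normal distribution function, playing the role of G_0."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def binorm_pdf(a, b, rho):
    """Standard bivariate normal density with correlation rho, playing the role of g."""
    q = (a * a - 2.0 * rho * a * b + b * b) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))

def log_contribution(l, y, v1, v2, rho=0.5):
    """log of G_0(v1) when l == 0, and log of g(l + v1, y + v2) when l > 0."""
    if l == 0:
        return math.log(norm_cdf(v1))            # censored branch: P(u1 <= v1)
    return math.log(binorm_pdf(l + v1, y + v2, rho))  # uncensored branch
```

The censored branch uses $P(l = 0) = P(u_1 \le v_1) = G_0(v_1)$, which follows directly from (18).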
We then proceed to calculate the 'derivatives' of the square-root likelihood
with respect to $\theta$, $g$ and $h$, respectively. Similar to Cosslett, we can obtain the
following 'scores' (times $\tfrac12 f^{1/2}$) for $\theta$ and the nonparametric components $g$ and
$h$:
$$\rho_\theta(y,l,z) = \tfrac12 h^{1/2}(z)\Big\{\frac{\partial v_1}{\partial\theta}\, g_0(v_1)\,G_0^{-1/2}(v_1)\,1\{l=0\} + \Big[g_1(l+v_1,y+v_2)\frac{\partial v_1}{\partial\theta} + g_2(l+v_1,y+v_2)\frac{\partial v_2}{\partial\theta}\Big]\, g^{-1/2}(l+v_1,y+v_2)\,1\{l>0\}\Big\}, \tag{22}$$
$$\mathcal{A}_g\beta(l,y,z) = h^{1/2}(z)\{G_0^{-1/2}(v_1)\,J(v_1,\beta)\,1\{l=0\} + \beta(l+v_1,y+v_2)\,1\{l>0\}\}, \tag{23}$$
where $g_1(\cdot,\cdot)$ and $g_2(\cdot,\cdot)$ denote, respectively, the partial derivatives of $g$ with
respect to its first and second components,
$$J(u,\beta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{u} g^{1/2}(u_1,u_2)\,\beta(u_1,u_2)\,du_1\,du_2 \tag{24}$$
for $\beta \in \mathcal{B}_g = \{\beta \in L^2(\xi) \mid \langle\beta, g^{1/2}\rangle_\xi = 0\}$, and
$$\mathcal{A}_h\gamma(l,y,z) = \gamma(z)\{G_0^{1/2}(v_1)\,1\{l=0\} + g^{1/2}(l+v_1,y+v_2)\,1\{l>0\}\} \tag{25}$$
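for $\gamma \in \mathcal{B}_h = \{\gamma \in L^2(\omega) \mid \langle\gamma, h^{1/2}\rangle_\omega = 0\}$. Here $\rho_\theta \in L^2(\mu)$ and $\mathcal{A}_g \times \mathcal{A}_h$ is a bounded linear operator from $\mathcal{B}_g \times \mathcal{B}_h$ to $L^2(\mu)$. If we can find the projection of
$\rho_\theta$ on $\bar{\mathcal{A}}$, where $\bar{\mathcal{A}}$ is the mean square closure of the set $\mathcal{A} = \{\mathcal{A}_g\beta + \mathcal{A}_h\gamma,\ \text{for } \beta \in \mathcal{B}_g,\ \gamma \in \mathcal{B}_h\}$, then the efficiency bound would be $I_*^{-1}$, provided its
inverse exists, where
$$I_* = 4\langle \rho_\theta - \alpha^*,\ (\rho_\theta - \alpha^*)'\rangle_\mu = 4\{\langle\rho_\theta, \rho_\theta'\rangle_\mu - \langle\alpha^*, \alpha^{*\prime}\rangle_\mu\} \tag{26}$$
and $\alpha^*$ is the projection of $\rho_\theta$ on $\bar{\mathcal{A}}$.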
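Note that $\langle\rho_\theta, \mathcal{A}_h\gamma\rangle_\mu = 0$ and $\langle\mathcal{A}_g\beta, \mathcal{A}_h\gamma\rangle_\mu = 0$ for all $\beta \in \mathcal{B}_g$ and $\gamma \in \mathcal{B}_h$; therefore
the lack of knowledge of $h$ should have no effect on the lower bound.
Let $\mathcal{A}_g^*$ denote the adjoint of $\mathcal{A}_g$. In Appendix B, Section B.2, we find $\alpha^*$, the
projection of $\rho_\theta$ on $\bar{\mathcal{A}}$, by evaluating $\mathcal{A}_g(\mathcal{A}_g^*\mathcal{A}_g)^{-1}\mathcal{A}_g^*\rho_\theta$, which gives
$$\alpha^* = \mathcal{A}_g(\mathcal{A}_g^*\mathcal{A}_g)^{-1}\mathcal{A}_g^*\rho_\theta(l,y,z) = -\tfrac12 h^{1/2}(z)\Big\{G_0^{1/2}(v_1)\int_{v_1}^{\infty} k_1(v)\,d\Big(\frac{g_0(v)}{G_0(v)}\Big)1\{l=0\} + g^{1/2}(u_1,u_2)\int_{u_1}^{\infty} k_1(v)\,d\Big(\frac{g_0(v)}{G_0(v)}\Big)1\{l>0\}$$
$$\qquad - g^{-1/2}(u_1,u_2)[g_1(u_1,u_2)k_1(u_1) + g_2(u_1,u_2)k_2(u_1)]\,1\{l>0\} + \frac{g^{1/2}(u_1,u_2)\,g_0(u_1)\,k_1(u_1)}{G_0(u_1)}\,1\{l>0\}\Big\}, \tag{27}$$
where $u_1 = l + v_1$ and $u_2 = y + v_2$, $k_1(u) = E(\partial v_1/\partial\theta \mid v_1 < u)$ and $k_2(u) = E(\partial v_2/\partial\theta \mid v_1 < u)$. Hence we arrive at the following result.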
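Proposition 3. Under the conditions in Appendix B, Section B.1, the semiparametric efficiency bound for $\theta$ in the nonlinear sample selection model (18)-(19)
is given by $I_*^{-1}$, where
$$I_* = 4\{\langle\rho_\theta, \rho_\theta'\rangle_\mu - \langle\alpha^*, \alpha^{*\prime}\rangle_\mu\}$$
$$= -\int_{-\infty}^{\infty} H_{v_1}(u)\,\mathrm{Var}\Big[\frac{\partial v_1}{\partial\theta}\,\Big|\, v_1 < u\Big]\, d\Big(\frac{g_0^2(u)}{G_0(u)}\Big) + \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^{-1}(u_1,u_2)\Big\{ g_1^2(u_1,u_2)\,\mathrm{Var}\Big[\frac{\partial v_1}{\partial\theta}\,\Big|\, v_1 < u_1\Big]$$
$$\qquad + g_1(u_1,u_2)g_2(u_1,u_2)\,\mathrm{Cov}\Big[\frac{\partial v_1}{\partial\theta}, \frac{\partial v_2}{\partial\theta'}\,\Big|\, v_1 < u_1\Big] + g_1(u_1,u_2)g_2(u_1,u_2)\,\mathrm{Cov}\Big[\frac{\partial v_2}{\partial\theta}, \frac{\partial v_1}{\partial\theta'}\,\Big|\, v_1 < u_1\Big]$$
$$\qquad + g_2^2(u_1,u_2)\,\mathrm{Var}\Big[\frac{\partial v_2}{\partial\theta}\,\Big|\, v_1 < u_1\Big]\Big\}\, H_{v_1}(u_1)\,du_1\,du_2, \tag{28}$$
where
$$\mathrm{Cov}\Big[\frac{\partial v_i}{\partial\theta}, \frac{\partial v_j}{\partial\theta'}\,\Big|\, v_1 < u_1\Big] = E\Big[\frac{\partial v_i}{\partial\theta}\frac{\partial v_j}{\partial\theta'}\,\Big|\, v_1 < u_1\Big] - E\Big[\frac{\partial v_i}{\partial\theta}\,\Big|\, v_1 < u_1\Big] E\Big[\frac{\partial v_j}{\partial\theta'}\,\Big|\, v_1 < u_1\Big] \quad \text{for } i,j = 1,2 \tag{29}$$
and $H_{v_1}(u) = P(v_1 < u)$.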
When $v_1 = v_2$ and $u_1 = u_2$ the model reduces to the censored nonlinear regression model in Cosslett (1987), and the results will be exactly the same as those
in Cosslett.
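A simple consequence of the form of the bound in Proposition 3 is worth spelling out, as a sketch under one stated assumption rather than as part of the original argument. Every term of $I_*$ is weighted by a conditional variance or covariance of the derivatives $\partial v_1/\partial\theta$ and $\partial v_2/\partial\theta$. Suppose, hypothetically, that for some component $\theta_m$ of $\theta$ these derivatives do not vary with $z$, as is the case for an intercept:

```latex
\[
\frac{\partial v_1}{\partial\theta_m} = a_1, \qquad
\frac{\partial v_2}{\partial\theta_m} = a_2
\quad\Longrightarrow\quad
\mathrm{Var}\!\left[\frac{\partial v_i}{\partial\theta_m}\,\Big|\, v_1 < u\right] = 0,
\qquad
\mathrm{Cov}\!\left[\frac{\partial v_i}{\partial\theta_m}, \frac{\partial v_j}{\partial\theta'}\,\Big|\, v_1 < u\right] = 0,
\quad i,j = 1,2,
\]
\[
\text{so that the row and column of } I_* \text{ corresponding to } \theta_m \text{ vanish and } I_* \text{ is singular.}
\]
```

In other words, a parameter that only shifts $v_1$ and $v_2$ by a constant carries no information, since its effect cannot be separated from a relocation of the unknown error distribution.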
We now specialize to the Type-3 Tobit model with a two equation linear
regression model as the underlying latent model, where $v_1 = -w\delta_0$, $v_2 = -x\beta_0$.
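Let $D(u_1) = \{-w\delta_0 < u_1\}$,
$$A(u_1) = \begin{pmatrix} \Gamma_{ww} & 0 \\ 0 & 0 \end{pmatrix}, \qquad B(u_1) = \begin{pmatrix} 0 & \Gamma_{wx} \\ 0 & 0 \end{pmatrix} \qquad \text{and} \qquad C(u_1) = \begin{pmatrix} 0 & 0 \\ 0 & \Gamma_{xx} \end{pmatrix},$$
where
$$\Gamma_{ww} = E(w'w \mid D(u_1)) - E(w' \mid D(u_1))E(w \mid D(u_1)),$$
$$\Gamma_{wx} = E(w'x \mid D(u_1)) - E(w' \mid D(u_1))E(x \mid D(u_1))$$
and
$$\Gamma_{xx} = E(x'x \mid D(u_1)) - E(x' \mid D(u_1))E(x \mid D(u_1));$$
then the lower bound for $\theta = (\delta, \beta)$ would be $V^{-1}$, where
$$V = -\int_{-\infty}^{\infty} H_{v_1}(u)\,A(u)\, d\Big(\frac{g_0^2(u)}{G_0(u)}\Big) + \int_{-\infty}^{\infty}\Big[\int_{-\infty}^{\infty} g^{-1}(u_1,u_2)\big(g_1^2(u_1,u_2)A(u_1) + g_1(u_1,u_2)g_2(u_1,u_2)B(u_1)$$
$$\qquad + g_1(u_1,u_2)g_2(u_1,u_2)B'(u_1) + g_2^2(u_1,u_2)C(u_1)\big)\,du_2\Big]\, du_1\, H_{v_1}(u_1). \tag{30}$$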
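The conditional moment matrices entering (30) have straightforward sample analogues. The sketch below computes them for a given $u_1$ from data on $(w, x)$; the function name is hypothetical, $\delta_0$ is treated as known for illustration, and expectations are replaced by averages over the observations in the event $D(u_1) = \{-w\delta_0 < u_1\}$.

```python
import numpy as np

def conditional_cov_blocks(w, x, delta0, u1):
    """Sample analogues of A(u1), B(u1), C(u1) and H_{v1}(u1) for v1 = -w @ delta0."""
    sel = -(w @ delta0) < u1                 # event D(u1) = {-w delta0 < u1}
    wd, xd = w[sel], x[sel]
    wc = wd - wd.mean(axis=0)                # w - E(w | D(u1))
    xc = xd - xd.mean(axis=0)                # x - E(x | D(u1))
    m = sel.sum()
    g_ww = wc.T @ wc / m                     # Gamma_ww
    g_wx = wc.T @ xc / m                     # Gamma_wx
    g_xx = xc.T @ xc / m                     # Gamma_xx
    p, q = w.shape[1], x.shape[1]
    A = np.block([[g_ww, np.zeros((p, q))], [np.zeros((q, p)), np.zeros((q, q))]])
    B = np.block([[np.zeros((p, p)), g_wx], [np.zeros((q, p)), np.zeros((q, q))]])
    C = np.block([[np.zeros((p, p)), np.zeros((p, q))], [np.zeros((q, p)), g_xx]])
    H = sel.mean()                           # sample analogue of P(v1 < u1)
    return A, B, C, H
```

The sum `A + B + B.T + C` is the conditional covariance matrix of the stacked vector $(w, x)$ given $D(u_1)$, which is the object whose rank matters for nonsingularity of $V$.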
Let
$$M_1 = \frac{g_1(u_1,u_2) - g_0(u_1)\,g(u_1,u_2)/G_0(u_1)}{g^{1/2}(u_1,u_2)}\,(w - E(w \mid D(u_1)))\,D(u_1),$$
$$N_1 = \frac{g_2(u_1,u_2)}{g^{1/2}(u_1,u_2)}\,(x - E(x \mid D(u_1)))\,D(u_1).$$
Let $\xi^*$ and $\xi_0$ denote the measures having densities $g(u_1,u_2)$ and $g_0(u_1)$, respectively, with respect to Lebesgue measures on $R^2$ and $R$. The following proposition
provides a set of sufficient conditions for the information matrix in (30) to be
nonsingular.
Proposition 4. Under conditions that (i) $(dg_0(u_1)/du_1 - g_0^2(u_1)/G_0(u_1))(w - E(w \mid D(u_1)))D(u_1)$ is not linearly dependent almost surely w.r.t. $\omega \times \xi_0$, and (ii)
$N_1$ is not linearly dependent almost surely w.r.t. $\omega \times \xi^*$; the information matrix
$V$ in (30) is nonsingular.
Proof. The information matrix $V$ in (30) can be rewritten as
$$V = \iiint \begin{pmatrix} M_1 \\ N_1 \end{pmatrix}\begin{pmatrix} M_1 \\ N_1 \end{pmatrix}' h(z)\,d\omega(z)\,du_1\,du_2,$$
which is singular if and only if $\lambda_1'M_1 + \lambda_2'N_1 = 0$ almost surely with respect
to $\omega \times \xi^*$, for some nonzero vector $\lambda = (\lambda_1', \lambda_2')'$. There are three cases: (1)
$\lambda_1 = 0$, $\lambda_2 \neq 0$; (2) $\lambda_1 \neq 0$, $\lambda_2 \neq 0$; and (3) $\lambda_1 \neq 0$, $\lambda_2 = 0$. For the first case, we have
$(g_2(u_1,u_2)/g^{1/2}(u_1,u_2))\,\lambda_2'(x - E(x \mid D(u_1)))D(u_1) = 0$ almost surely with respect to
$\omega \times \xi^*$, contradicting assumption (ii). For the second case, $\lambda_1'M_1 + \lambda_2'N_1 = 0$ almost surely w.r.t. $\omega \times \xi^*$. Then we have
$$\int (\lambda_1'M_1 + \lambda_2'N_1)\,g^{1/2}(u_1,u_2)\,du_2 = 0$$
almost surely w.r.t. $\omega \times \xi_0$.
Hence,
$$(dg_0(u_1)/du_1 - g_0^2(u_1)/G_0(u_1))\,\lambda_1'(w - E(w \mid D(u_1)))D(u_1) = 0$$
almost surely w.r.t. $\omega \times \xi_0$, which contradicts assumption (i). Similarly, the third
case will also lead to a contradiction.
If both $x$ and $w$ are of full rank in the half-space $\{-w\delta_0 < c_0\}$, $(dg_0(c_0)/du_1 - g_0^2(c_0)/G_0(c_0)) \neq 0$ and $g_2(c_0, c_1) \neq 0$ for some $(c_0, c_1)$, then the assumptions of
Proposition 4 will be satisfied since both $\mathrm{Var}(x \mid -w\delta_0 < u_1)$ and $\mathrm{Var}(w \mid -w\delta_0 < u_1)$
are left continuous in $u_1$, and $(dg_0(u_1)/du_1 - g_0^2(u_1)/G_0(u_1))$ and $g_2(u_1,u_2)$
are continuous in $(u_1,u_2)$. When $x$ and $w$ both have full rank for the selected
subsample, i.e. both $E[\mathrm{Var}(x \mid I = 1)]$ and $E[\mathrm{Var}(w \mid I = 1)]$ are nonsingular, the assumptions in Proposition 4 hold generally except for special forms$^5$ of $g$. This
full rank condition, in general, does not require the existence of a continuous
regressor or any exclusion restriction on the regressors. In the case of a single
regressor and $w = x$, we will have positive information if $x$ has nonzero conditional variance given $\{-x\delta_0 < c_0\}$, and $dg_0(c_0)/du_1 < 0$ and $g_2(c_0, c_1) \neq 0$ for
some $c_0$ and $c_1$.
Following Chamberlain (1986), singularity of the information matrix will imply
that there is no $\sqrt{n}$-consistent estimator for $\theta$. We can argue as Chamberlain
(1986) that to ensure a nonsingular information matrix in the Type-3 Tobit model
no intercept is allowed; this is natural since no restriction on the location of the
error distributions is imposed, and the intercept terms are not identified. In contrast to the Type-2 Tobit model (Chamberlain, 1986), however, a general exclusion
5 Note that the family of uniform distributions is not regular in the sense of Begun et al. (1983),
and is ruled out by the regularity conditions.
restriction on $\beta$ is not needed; also $v_1$ is not required to have a continuous distribution.
6. Concluding remarks
Semiparametric estimation of the Type-3 Tobit model is considered in this
paper under the restriction that the error terms and the regressors are independent. The first method is a simple least-squares-type estimation approach applied
to an adjusted sample, and the second one is a weighted least-squares-type approach through estimating the selection correction function. Both estimators are
consistent and asymptotically normal. Estimators for the asymptotic covariance
matrices are provided to carry out large sample inference. A limited Monte Carlo
simulation is used to study the practical performance of the estimators. We also
calculate the semiparametric efficiency bound for the Type-3 Tobit model under
the independence restriction.
Since the covariance matrices of all the available estimators designed for the
Type-3 Tobit model are complicated and dependent on a particular first-step
estimator, an exact analytic relationship between these matrices and the semiparametric bound will be difficult to derive. Also all of these estimators are based
on some conditional moments without considering the associated optimal instruments; therefore, none of them achieves the bound. It may be possible to extend
Ritov's (1990) method for the Type-1 Tobit model to construct an efficient estimator for the Type-3 Tobit model, but this is a topic for future research.
Appendix A
Proof of Theorem 1. To simplify the notation, define sr =s~l{i,>,,,~>0 }. With
this convention, r~ can be expressed as
=
d:~ st
)-"
~ disi y~
i=1
it
=
"x- I
n
,~ ,,.,,, )
,=,Ed:;'(~:o
--|
=,~o+ \ N
''"7
+ u 2 , - ~o)
.'1
-
hence,
/1
~
\-t
~ ( ,~ - ~o) = ( ~ E d:~'~; ~|
\
i= I
/
"~
a:;'(u2, - c~o).
i=!
(A.l)
To get the desired result, we need to show that $(1/n)\sum_{i=1}^n d_i s_i^{*\prime} s_i^*$ converges
in probability to a nonsingular matrix and $(1/\sqrt{n})\sum_{i=1}^n d_i s_i^{*\prime}(u_{2i} - \alpha_0)$ is asymptotically normal.
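First, note that
$$\frac1n\sum_{i=1}^n d_i s_i^{*\prime} s_i^* = \frac1n\sum_{i=1}^n d_i s_i' s_i\,1\{l_i > w_i\delta_0 > 0\} + \frac1n\sum_{i=1}^n d_i s_i' s_i\,(1\{l_i > w_i\hat\delta > 0\} - 1\{l_i > w_i\delta_0 > 0\}).$$
By the strong law of large numbers,
$$\frac1n\sum_{i=1}^n d_i s_i' s_i\,1\{l_i > w_i\delta_0 > 0\} \to E\,d_i s_i' s_i\,1\{l_i > w_i\delta_0 > 0\}, \tag{A.2}$$
where convergence means element by element convergence. Also by the Cauchy-Schwarz inequality,
$$\Big\|\frac1n\sum_{i=1}^n d_i s_i' s_i\,(1\{l_i > w_i\hat\delta > 0\} - 1\{l_i > w_i\delta_0 > 0\})\Big\| \le \Big[\frac1n\sum_{i=1}^n \|d_i s_i' s_i\|^2\Big]^{1/2}\Big[\frac1n\sum_{i=1}^n (1\{l_i > w_i\hat\delta > 0\} - 1\{l_i > w_i\delta_0 > 0\})^2\Big]^{1/2}. \tag{A.3}$$
By Lemma 12 in Pakes and Pollard (1989) it is easy to verify that the class
of functions $\{1\{l > w\delta > 0\}, \delta \in R^p\}$ is uniformly bounded and Euclidean. (For
the concept Euclidean and other empirical processes related terms the reader is
referred to Pakes and Pollard.) Then by the uniform law of large numbers in
Lemma 8 in Pakes and Pollard we can prove that
$$\frac1n\sum_{i=1}^n 1\{l_i > w_i\hat\delta > 0\} \to_p E\,1\{l_i > w_i\delta_0 > 0\}$$
and
$$\frac1n\sum_{i=1}^n 1\{l_i > w_i\hat\delta > 0\}\,1\{l_i > w_i\delta_0 > 0\} \to_p E\,1\{l_i > w_i\delta_0 > 0\},$$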
where $\to_p$ denotes convergence in probability. Consequently,
$$\frac1n\sum_{i=1}^n (1\{l_i > w_i\hat\delta > 0\} - 1\{l_i > w_i\delta_0 > 0\})^2 = \frac1n\sum_{i=1}^n 1\{l_i > w_i\hat\delta > 0\} - \frac2n\sum_{i=1}^n 1\{l_i > w_i\hat\delta > 0\}\,1\{l_i > w_i\delta_0 > 0\} + \frac1n\sum_{i=1}^n 1\{l_i > w_i\delta_0 > 0\} = o_p(1). \tag{A.4}$$
Therefore, for the two terms on the right-hand side of (A.3), the first one is
bounded in probability by Assumption 3.4 and the second term converges to 0
in probability by (A.4); hence
$$\frac1n\sum_{i=1}^n d_i s_i^{*\prime} s_i^* \to E\,d_i s_i' s_i\,1\{l_i > w_i\delta_0 > 0\}$$
in probability.
Next, we investigate the asymptotic distribution of $(1/\sqrt{n})\sum_{i=1}^n d_i s_i^{*\prime}(u_{2i} - \alpha_0)$.
To simplify the notation, we define $r_i = (u_{2i}, d_i, w_i, l_i, s_i)$ and $h(r_i, \delta) = s_i'\,1\{l_i > w_i\delta > 0\}(u_{2i} - \alpha_0)d_i$. Since the class of functions $\{1\{l > w\delta > 0\}, \delta \in R^p\}$ is Euclidean, and
$s_i'(u_{2i} - \alpha_0)d_i$ is a fixed function (which is necessarily Euclidean), then the class
of functions $\{h(\cdot,\delta), \delta \in R^p\}$ is a Euclidean class with an envelope $F$ for which
$\int F^2\,dP < \infty$ by Lemma 14 in Pakes and Pollard and Assumption 3.4, and the
parameterization is $L^2(P)$ continuous at $\delta_0$ by Assumption 3.3. Let
$$\nu_n h(\cdot,\delta) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \big(d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\} - E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}\big);$$
then by Lemma 2.17 of Pakes and Pollard (1989), for every sequence of positive
numbers $\{t_{1n}\}$ converging to zero,
$$\sup_{\|\delta - \delta_0\| \le t_{1n}} |\nu_n h(\cdot,\delta) - \nu_n h(\cdot,\delta_0)| = o_p(1). \tag{A.5}$$
But since $\hat\delta \to_p \delta_0$, there exists a sequence of positive numbers $\{t_{2n}\}$ converging
to zero such that $P(\|\hat\delta - \delta_0\| > t_{2n}) \to 0$, as $n \to \infty$. Hence,
$$P(|\nu_n h(\cdot,\hat\delta) - \nu_n h(\cdot,\delta_0)| > \varepsilon) \le P\Big(\sup_{\|\delta - \delta_0\| \le t_{2n}} |\nu_n h(\cdot,\delta) - \nu_n h(\cdot,\delta_0)| > \varepsilon\Big) + P(\|\hat\delta - \delta_0\| > t_{2n}) \to 0 \tag{A.6}$$
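as $n \to \infty$, for any $\varepsilon > 0$. Therefore,
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\hat\delta > 0\} = \frac{1}{\sqrt{n}}\sum_{i=1}^n d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta_0 > 0\} + \sqrt{n}\,E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}\big|_{\delta = \hat\delta} + o_p(1) \tag{A.7}$$
since
$$E\,h(r_i,\delta_0) = E\,d_i s_i'(u_{2i} - \alpha_0)1\{u_{1i} > 0,\, w_i\delta_0 > 0\} = E\,s_i'\big(E(d_i u_{2i} 1\{u_{1i} > 0\} \mid z_i) - \alpha_0 E(d_i 1\{u_{1i} > 0\} \mid z_i)\big)1\{w_i\delta_0 > 0\} = 0.$$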
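If we show that $E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}|_{\delta = \hat\delta}$ is asymptotically linear, then the
asymptotic distribution of $(1/\sqrt{n})\sum_{i=1}^n s_i'(u_{2i} - \alpha_0)d_i\,1\{l_i > w_i\hat\delta > 0\}$ follows easily.
To derive the asymptotic distribution of $\sqrt{n}\,E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}|_{\delta = \hat\delta}$, we
expand the expectation in Taylor series. By Assumption 3.3
$$\sqrt{n}\,E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}\big|_{\delta = \hat\delta} = \sqrt{n}\,E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta_0 > 0\} + \sqrt{n}\,\Omega_1(\hat\delta - \delta_0) + o_p(1) = \sqrt{n}\,\Omega_1(\hat\delta - \delta_0) + o_p(1). \tag{A.8}$$
By the asymptotic linear representation of $\hat\delta$ in Assumption 3.5,
$$\sqrt{n}\,E\,d_i s_i'(u_{2i} - \alpha_0)1\{l_i > w_i\delta > 0\}\big|_{\delta = \hat\delta} = \frac{1}{\sqrt{n}}\sum_{i=1}^n \Omega_1\psi_i + o_p(1) \tag{A.9}$$
with $E\psi_i = 0$. Consequently,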
1 ~-~At_I(A~ + ~ | ~ i ) + o p ( l ) .
Hence,
v ~ ( n -- ~o) ~ N(O, ZI)
as asserted.
[]
(A.IO)
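Proof of Theorem 2. Instead of analyzing the minimization problem (13) directly, we obtain a closed form for $\hat\beta_2$ by some arithmetic,
$$\hat\beta_2 = \Big[(n)_3^{-1}\sum_i\sum_{j\neq i}\sum_{k\neq i} d_i D_{ij}(\hat\delta)D_{ik}(\hat\delta)(x_i - x_k)'(x_i - x_j)\Big]^{-1} \times (n)_3^{-1}\sum_i\sum_{j\neq i}\sum_{k\neq i} d_i D_{ij}(\hat\delta)D_{ik}(\hat\delta)(y_i - y_j)(x_i - x_k)', \tag{A.11}$$
where $(n)_3 = n(n-1)(n-2)$; then
$$\sqrt{n}(\hat\beta_2 - \beta_0) = \Big[(n)_3^{-1}\sum_i\sum_{j\neq i}\sum_{k\neq i} d_i D_{ij}(\hat\delta)D_{ik}(\hat\delta)(x_i - x_k)'(x_i - x_j)\Big]^{-1} \times \sqrt{n}\,(n)_3^{-1}\sum_i\sum_{j\neq i}\sum_{k\neq i} d_i D_{ij}(\hat\delta)D_{ik}(\hat\delta)(u_{2i} - u_{2j})(x_i - x_k)' = S_1^{-1}S_2. \tag{A.12}$$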
We will show that $S_1$ converges to a nonsingular matrix in probability, and
that $S_2$ is asymptotically normally distributed, respectively. Let
~ k (6) = d~Dij(~)Di~ (5)(x: - x~, )'(xi -- xy),
~i~kOS, fl) = diDij(5)Dik(5)(yi - yj - ( x i
+¢@A5,~) + ¢j~;(5,~)
and
then it is easy to show that
=(n); -~ ~
i<j<k
~(g)+op(I)
- x.i)fl)(xi - xk )',
and
i<j<k
j=k~t
i<j<k
-~S~ 4- Op(l).
By Lemma 12 in Pakes and Pollard (1989), $\{D(\cdot,\delta), \delta \in R^p\}$ is a Euclidean
class, and $d_i(x_i - x_k)'(x_i - x_j)$ does not depend on $\delta$, then by Lemma 14 in
Pakes and Pollard and by Assumption 3.4, $\{\varphi(\cdot,\delta), \delta \in R^p\}$ is a Euclidean class
of functions with an integrable envelope. Then by the uniform law of large
numbers for U-processes (e.g. see Arcones and Giné, 1993)
II
(n)y| i <Ej < k (¢ok(~)-E¢ok(o))ll =%(!)
[l
<sup
6~A
(A.13)
also
$$\|E\varphi_{ijk}(\delta)|_{\delta = \hat\delta} - E\varphi_{ijk}(\delta_0)\| = o_p(1) \tag{A.14}$$
by the consistency of $\hat\delta$ and continuity of $E\varphi_{ijk}(\delta)$ at $\delta_0$. Hence,
(n)~ -m ~
~i2~(~)-- ~E~;jk(t~o) = E~k(60 )
in probability.
i<j<h
By the nonsingularity of $E\varphi_{ijk}(\delta_0)$,
$$S_1^{-1} \to [E\varphi_{ijk}(\delta_0)]^{-1} = A_2^{-1}$$
in probability.
Now we turn to analyzing $S_2^*$. Similar to the class of functions $\{\varphi(\cdot,\delta), \delta \in R^p\}$,
$\{\psi(\cdot,\delta), \delta \in R^p\}$ is also a Euclidean class of functions with a square integrable
envelope, then by Corollary 5.7 of Arcones and Giné (1993) and the consistency
of $\hat\delta$,
(s;- -TE%k(,S)I~
v~
- ~(n)7' ,<E<,(%kOo) - E%k(~o)))
[
i<j<k
-(n)3 -1,<j<~ (~bvk(6o) - E~ij,~(6o))]
=
(A.15)
%(I).
Similar to the proof in Section 3.1, by Assumption 3.7, we expand the expectation
term in Taylor series
S~ = ~ ( n ) f
I ~
(~kii,((~o)
--
E~qk(,~o)) + 122x/-~(5 -- ~o) + op(l ) (A.16)
i<j<t~
because
E~b~x.(6o) =- E d i D i j ( 5o )Dik ( 60 )( u2i - u2y )(x~ - xk )'
=- E[Di~(6o)(xi - xt,)' 1{,v,~o>,~,,>~}
x (E[(u2~l {,,,, > _,,.,~0 } - u2j I {,,, ,> .... ~o r ) l z . zj] )]
(A.17)
~- 0,
where 02 = ~E~bi~(60)/~'. Hence,
$2 = v~(n)~ l ~
(@iyk(5o)- E~btjk(ao)) + O2V~(~ - 60) + op(t). (A.18)
i<j<k
Next, we approximate the U-statistic $\sqrt{n}\,(n)_3^{-1}\sum_{i<j<k}(\psi_{ijk}(\delta_0) - E\psi_{ijk}(\delta_0))$ by
its projection, which is a normalized sum of i.i.d. random variables. By
Assumptions 3.2, 3.4 and Theorem 5.3.2 of Serfling (1980), we have
l
r!
$2 = " ~ ~ ('h -- Et/i) + ~2x/n (t~ -/~o) + op( ! ),
(A.19)
where
(A.20)
Finally, from Assumption 3.5
1 ~ Or, - Er/i + O2~bi) + %( l )
$2 = " ~ i=l
(A.21)
with E(~/i + f l z ~ i ) = 0 .
By Assumptions 3.1 and 3.4 and the Lindeberg-Lévy Central Limit Theorem,
we arrive at the desired result. □
Appendix B
B.1. Regularity conditions for Section 5
The following assumptions are made for calculating the semiparametric efficiency bound for the Type-3 Tobit model under the independence restriction. The assumptions are similar to those in Cosslett (Section 4).
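Assumption B.1. Both $v_1(z,\theta)$ and $v_2(z,\theta)$ are differentiable with respect to $\theta$
for almost all $z$, and measurable in $z$ for each $\theta \in \Theta$ (the parameter space for
$\theta$); also the square-root likelihood is Hellinger differentiable with respect to $\theta$,
i.e.
$$\int d\omega(z)\,h(z)\Big\{G_0^{1/2}(v_1(z,\theta+\tau)) - G_0^{1/2}(v_1(z,\theta)) - \tfrac12\tau'\,g_0(v_1(z,\theta))\,G_0^{-1/2}(v_1(z,\theta))\,\frac{\partial v_1(z,\theta)}{\partial\theta}\Big\}^2 = o(|\tau|^2)$$
and
$$\int d\omega(z)\,h(z)\int_0^\infty\!\!\int_{-\infty}^{\infty} dl\,dy\,\Big\{g^{1/2}(l + v_1(z,\theta+\tau),\, y + v_2(z,\theta+\tau)) - g^{1/2}(l+v_1, y+v_2) - \tfrac12\tau'\,g^{-1/2}(l+v_1, y+v_2)\Big[g_1(l+v_1, y+v_2)\frac{\partial v_1}{\partial\theta} + g_2(l+v_1, y+v_2)\frac{\partial v_2}{\partial\theta}\Big]\Big\}^2 = o(|\tau|^2),$$
where $v_1 = v_1(z,\theta)$, $v_2 = v_2(z,\theta)$.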
Assump!ion B.2. The function g(ul,u2) is continuously differentiable with O(ut,
u2), gl(ul,u2) and O2(ut,u2) uniformly bounded. Both ~(uhu2)/O(ul,u2) and
~3~(UhU2)/O(Ubu2) are integrable.
Assumption B.3. The conditional expected values E[(~vt/OO)lv~ < u ] and E[(~v2/
00)Iv I <u] (both defined to be zero if P(vt < u ) ) is a bounded function of u in
any interval [c, oo].
Assumption B.4. The function Oo(u)3/Go(u) (defined to be zero when Go(u)= O)
is absolutely continuous in any interval [c, o¢].
B.2. Derivations for the efficient score
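For $\mathcal{A}_g$ in (23), its adjoint $\mathcal{A}_g^*$ is found to be
$$(\mathcal{A}_g^* r)(u_1,u_2) = g^{1/2}(u_1,u_2)\int_{u_1 < v_1} d\omega(z)\,h^{1/2}(z)\,G_0^{-1/2}(v_1)\,r(0,0,z) + \int_{u_1 > v_1} d\omega(z)\,h^{1/2}(z)\,r(u_1 - v_1, u_2 - v_2, z)$$
$$\qquad - g^{1/2}(u_1,u_2)\int d\omega(z)\,h^{1/2}(z)\Big\{G_0^{1/2}(v_1)\,r(0,0,z) + \int_{v_1}^{\infty}\!\!\int_{-\infty}^{\infty} dv\,dv'\,g^{1/2}(v,v')\,r(v - v_1, v' - v_2, z)\Big\}, \tag{B.1}$$
then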
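$$(\mathcal{A}_g^*\mathcal{A}_g)\beta(u_1,u_2) = g^{1/2}(u_1,u_2)\int_{u_1 < v_1} d\omega(z)\,h(z)\,G_0^{-1}(v_1)\,J(v_1,\beta) + \int_{u_1 > v_1} d\omega(z)\,h(z)\,\beta(u_1,u_2). \tag{B.2}$$
Let $\mathcal{A}_g^*\mathcal{A}_g\beta = \psi$; then it can be verified that
$$\beta(u_1,u_2) = \frac{\psi(u_1,u_2)}{H_{v_1}(u_1)} - g^{1/2}(u_1,u_2)\int_{u_1}^{\infty}\frac{h_{v_1}(v)\,J(v,\psi)}{G_0(v)\,(H_{v_1}(v))^2}\,dv, \tag{B.3}$$
where $h_{v_1}(v)$ and $H_{v_1}(v)$ are probability density and cumulative distribution functions of $v_1$, respectively, solves (B.2), i.e. $\beta(u_1,u_2) = (\mathcal{A}_g^*\mathcal{A}_g)^{-1}\psi(u_1,u_2)$,
$\langle g^{1/2}, \beta\rangle_\xi = 0$, and
$$(\mathcal{A}_g^*\mathcal{A}_g)(\mathcal{A}_g^*\mathcal{A}_g)^{-1}\psi(u_1,u_2) = \psi(u_1,u_2). \tag{B.4}$$
Hence,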
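$$(\mathcal{A}_g^*\mathcal{A}_g)^{-1}\mathcal{A}_g^*\rho_\theta(u_1,u_2) = -\tfrac12 g^{1/2}(u_1,u_2)\int_{u_1}^{\infty} k_1(v)\,d\Big(\frac{g_0(v)}{G_0(v)}\Big) + \tfrac12 g^{-1/2}(u_1,u_2)[g_1(u_1,u_2)k_1(u_1) + g_2(u_1,u_2)k_2(u_1)] - \tfrac12\,\frac{g^{1/2}(u_1,u_2)\,g_0(u_1)\,k_1(u_1)}{G_0(u_1)}. \tag{B.5}$$
Finally, we have
$$\mathcal{A}_g(\mathcal{A}_g^*\mathcal{A}_g)^{-1}\mathcal{A}_g^*\rho_\theta(l,y,z) = -\tfrac12 h^{1/2}(z)\Big\{G_0^{1/2}(v_1)\int_{v_1}^{\infty} k_1(v)\,d\Big(\frac{g_0(v)}{G_0(v)}\Big)1\{l=0\} + g^{1/2}(u_1,u_2)\int_{u_1}^{\infty} k_1(v)\,d\Big(\frac{g_0(v)}{G_0(v)}\Big)1\{l>0\}$$
$$\qquad - g^{-1/2}(u_1,u_2)[g_1(u_1,u_2)k_1(u_1) + g_2(u_1,u_2)k_2(u_1)]\,1\{l>0\} + \frac{g^{1/2}(u_1,u_2)\,g_0(u_1)\,k_1(u_1)}{G_0(u_1)}\,1\{l>0\}\Big\}, \tag{B.6}$$
where $u_1 = l + v_1$ and $u_2 = y + v_2$.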
References
Amemiya, T., 1985. Advanced Econometrics. Harvard University Press, Cambridge, MA.
Andrews, D.W.K., 1991. Asymptotic normality of series estimators for nonparametric and
semiparametric regression models. Econometrica 59, 307-345.
Arcones, M.A., Giné, E., 1993. Limit theorems for U-processes. Annals of Probability 21, 1494-1592.
Begun, J.M., Hall, W.J., Huang, W.M., Wellner, J.A., 1983. Information and asymptotic efficiency in
parametric-nonparametric models. Annals of Statistics 11, 432-452.
Buckley, J., James, I., 1979. Linear regression with censored data. Biometrika 66, 429-436.
Chamberlain, G., 1986. Asymptotic efficiency in semiparametric models with censoring. Journal of
Econometrics 32, 189-218.
Chen, S., 1994. Semiparametric median estimation of Type-3 Tobit model. Economics Letters 44,
349-352.
Cosslett, S., 1987. Efficiency bounds for distribution-free estimators of the binary choice and the
censored regression models. Econometrica 55, 559-585.
Cosslett, S.R., 1991. Semiparametric estimation of a regression model with sample selectivity. In:
Barnett, W.A., Powell, J.L., Tauchen, G. (Eds.), Nonparametric and Semiparametric Methods in
Statistics and Econometrics. Cambridge University Press, Cambridge.
Duncan, G.M., 1986. A semiparametric censored regression estimator. Journal of Econometrics 32,
5-34.
Gallant, R., Nychka, D., 1987. Semi-nonparametric maximum likelihood estimation. Econometrica 55,
363-390.
Gronau, R., 1973. The effects of children on the housewife's value of time. Journal of Political
Economy 81, S168-S199.
Heckman, J.J., 1974. Shadow prices, market wages and labor supply. Econometrica 42, 679-693.
Heckman, J.J., 1976. The common structure of statistical models of truncation, sample selection and
limited dependent variables and a simple estimator for such models. Annals of Economic and
Social Measurement 5, 475-492.
Honoré, B.E., Kyriazidou, E., Udry, C., 1992. Estimation of Type-3 Tobit models using symmetric
trimming and pairwise comparisons. Unpublished manuscript.
Honoré, B.E., Powell, J.L., 1994. Pairwise differences of linear, censored and truncated regression
models. Journal of Econometrics 64, 241-278.
Horowitz, J.L., 1986. A distribution-free least squares estimator for censored linear regression models.
Journal of Econometrics 32, 59-84.
Horowitz, J.L., 1988. Semiparametric M-estimation of censored regression models. Advances in
Econometrics 7, 45-83.
Ichimura, H., Lee, L.F., 1991. Semiparametric least squares estimation of multiple index models:
Single equation estimation. In: Barnett, W.A., Powell, J.L., Tauchen, G. (Eds.), Nonparametric and
Semiparametric Methods in Statistics and Econometrics. Cambridge University Press, Cambridge.
Koul, Susarla, Van Ryzin, 1981. Buckley-James estimator for regression analysis with censored data.
Annals of Statistics 19, 1370-1402.
Lee, L.F., 1982. Some approaches to the correction of selectivity bias. Review of Economic Studies
49, 355-372.
Lee, L.F., 1992. Semiparametric nonlinear least-squares estimation of truncated regression models.
Econometric Theory 8, 52-94.
Lee, L.F., 1994. Semiparametric two-stage estimation of sample selection models subject to Tobit-type
selection rules. Journal of Econometrics 61, 305-344.
Newey, W.K., 1988. Two-step series estimation of sample selection models. Unpublished manuscript,
Princeton University.
Newey, W.K., 1990. An introduction to semiparametric efficiency bounds. Journal of Applied
Econometrics 5, 99-135.
Pakes, A., Pollard, D., 1989. Simulation and the asymptotics of optimization estimators. Econometrica
57, 1027-1058.
Powell, J.L., 1986. Symmetrically trimmed least squares estimation for Tobit models. Econometrica
54, 1435-1460.
Powell, J.L., 1989. Semiparametric estimation of censored selection models. Unpublished manuscript.
Powell, J.L., 1994. Estimation of semiparametric models. In: Engle, R., McFadden, D. (Eds.),
Handbook of Econometrics. North-Holland, Amsterdam.
Powell, J.L., Stock, J., Stoker, T., 1989. Semiparametric estimation of index model. Econometrica
57, 1403-1430.
Ritov, Y., 1990. Estimation in a linear regression model with censored data. Annals of Statistics 18,
303-328.
Serfling, R., 1980. Approximation Theorems in Mathematical Statistics. Wiley, New York.
Sherman, R.P., 1993. The limiting distribution of the maximum rank correlation estimator.
Econometrica 61, 123-137.
Udry, C., 1994. Risk and insurance in a rural credit market: an empirical investigation in northern
Nigeria. Review of Economic Studies 61, 495-526.