A Note on Additive Separability
and Latent Index Models of Binary Choice:
Representation Results
Edward Vytlacil∗
December 30, 2004
Abstract
The standard binary choice model in econometrics has the choice determined by a latent
index crossing a threshold. The latent index is almost always assumed to be additively separable in observable and unobservable regressors, and most commonly linear in all regressors.
This note provides a class of nonseparable latent index functions which will have equivalent
representations as additively separable or linear index functions. These results demonstrate
that assuming a linear or additively separable latent index function is less restrictive than
previously recognized.
JEL Numbers: C25
KEYWORDS: binary choice model, latent index model.
1 Introduction
Let Y denote a binary outcome variable, X denote an observable random vector, and V denote an
unobservable random variable. The standard binary choice model in econometrics is a threshold
crossing model of the form Y = 1[Y* ≥ 0], where Y* is a latent unobserved index and 1[·] is the
logical indicator function taking the value 1 if its argument is true and the value 0 otherwise.
∗ Assistant Professor of Economics, Stanford University, and 2003-2004 W. Glenn Campbell and Rita Ricardo-Campbell Hoover National Fellow. Correspondence: Landau Economics Building, 579 Serra Mall, Stanford CA
94305; Email: [email protected]; Phone: 650-725-7836; Fax: 650-725-5702. I would like to thank Azeem
Shaikh and Nese Yildiz for extremely helpful comments.
In the vast majority of cases, Y* is assumed to be a linear index of X and V, Y* = Xβ + V.
The linear index assumption is sometimes relaxed. For example, it is sometimes assumed that
Y* = m(X) + V, so that the index is additively separable in X and V but not necessarily linear
in X.¹
Imposing the linearity assumption on the latent index is not as restrictive as it might appear,
since there is a broader class of models that have a representation in this linear latent index form.
In particular, the simple observation that weak inequalities are invariant to strictly increasing
functions applied to both sides of the inequality implies that any model of the form Y = 1[f(Xβ +
V) ≥ 0] or Y = 1[f(Xβ) − f(−V) ≥ 0] with f : ℝ ↦ ℝ strictly increasing will have a
representation of the form Y = 1[Xβ + V ≥ 0].² Likewise, if Y = 1[f(m(X) + V) ≥ 0] or
Y = 1[f(m(X)) − f(−V) ≥ 0] with f : ℝ ↦ ℝ strictly increasing, then the model will have a
representation of the form Y = 1[m(X) + V ≥ 0].
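The invariance argument can be written out in two lines. The following is a sketch under the added assumption that 0 lies in the range of f, so that f⁻¹(0) exists; the symbols c and Ṽ are introduced here for illustration only and do not appear in the original text. The same steps go through with m(X) in place of Xβ.

```latex
\begin{align*}
f(X\beta) - f(-V) \ge 0
  &\iff f(X\beta) \ge f(-V)
   \iff X\beta \ge -V
   \iff X\beta + V \ge 0, \\
f(X\beta + V) \ge 0
  &\iff X\beta + V \ge c
   \iff X\beta + \tilde V \ge 0,
  \qquad c = f^{-1}(0),\quad \tilde V = V - c .
\end{align*}
```

In the second case the unobservable is shifted by the constant c, so the linear representation holds with Ṽ in the role of V.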
This note considers conditions under which a possibly nonseparable latent index will have a
representation with X and V additively separable, and conditions under which the latent index
will have a representation linear in (X, V ). These conditions will strictly nest the conditions
discussed above. This note thus provides new representation results, further extending the class
of functions that are known to have a linear latent index or additively separable latent index
representation.
¹ Examples of this form include Das, Newey, and Vella (2003), Heckman and Vytlacil (2001, 2003), Shaikh and
Vytlacil (2004), Vytlacil (2002), and Vytlacil and Yildiz (2004).
² This point is further emphasized by Manski (1988).
2 Representation Results
Let Y = 1[g(X, U) ≥ 0] where g : X × U ↦ ℝ. I do not restrict U to be a scalar random variable;
both X and U may be random vectors or, more generally, random elements. I will use x and u to
denote potential realizations of X and U , and thus potential evaluation points of g. Consider the
following restriction on g:
[A] g : X × U ↦ ℝ. For any x, x̃ ∈ X, g(x, u₀) > g(x̃, u₀) for some u₀ ∈ U ⇒ g(x, u) >
g(x̃, u) ∀ u ∈ U.
Define G to be the set of g functions satisfying restriction [A]. Note that f(m(x) + u) and f(m(x)) −
f(−u) with f a strictly increasing function are contained in G. The most common interpretation
of the latent index is the difference between the indirect utility from choice one and the indirect
utility from choice zero. Under this interpretation, the restriction that g ∈ G imposes structure on
the (difference in) indirect utility functions. In particular, shifting the observable regressors from
x to x′ changes the difference in the indirect utility functions in the same direction regardless
of the level of the unobserved regressors, though the magnitude of the change in indirect utility
functions can vary freely with the level of the unobserved regressors.
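Restriction [A] can be checked by brute force when X and U are replaced by finite grids. The sketch below is illustrative only: the function name `satisfies_A`, the grids, and both example index functions are assumptions made here, not part of the original note, and passing the check on a grid does not establish [A] on the full supports.

```python
import itertools
import math

def satisfies_A(g, xs, us):
    """Check restriction [A] on the finite grids xs and us.

    For every ordered pair (x, x_tilde): if g(x, u) > g(x_tilde, u) holds
    for some u in us, it must hold for every u in us.
    """
    for x, x_tilde in itertools.permutations(xs, 2):
        greater = [g(x, u) > g(x_tilde, u) for u in us]
        if any(greater) and not all(greater):
            return False
    return True

xs = [-1.0, 0.0, 0.5, 2.0]
us = [-2.0, -0.5, 0.0, 1.0]

# f(m(x) + u) with f(t) = exp(t) - 1 strictly increasing and m(x) = x**3.
g_in = lambda x, u: math.exp(x ** 3 + u) - 1.0

# x * u: the direction of the effect of x flips with the sign of u.
g_out = lambda x, u: x * u

print(satisfies_A(g_in, xs, us))   # True
print(satisfies_A(g_out, xs, us))  # False
```

The second example fails because moving from x = 0 to x = 2 raises the index when u > 0 and lowers it when u < 0, which is exactly the sign reversal that [A] rules out.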
I show that for any g ∈ G, Y = 1[g(X, U ) ≥ 0] has a representation in terms of a latent
threshold with additive separability between X and U , Y = 1[m(X) + V ≥ 0] with V = q(U ).
The analysis proceeds as follows. The first lemma shows that for any g ∈ G, the function can
be represented as a separable function in X and U . The second lemma shows that a threshold
crossing model with a latent index separable in X and U can be represented as a threshold crossing
model with a latent index additively separable in X and U. As a by-product, the second lemma
produces a representation result for linear latent index models. The theorem then combines the
two lemmas.
First, for any g ∈ G, we have that the function can be written as a separable function of X
and U .
Lemma 1 For any g ∈ G, there exists m : X ↦ ℝ with range M and h : M × U ↦ ℝ with h
strictly increasing in its first argument, such that g(x, u) = h(m(x), u) for all (x, u) ∈ X × U.
Proof: We construct an h and m function, and then show that they possess the desired properties.
Pick an arbitrary u* ∈ U, and define the function m : X ↦ ℝ by m(x) = g(x, u*) for x ∈ X.³
For any t ∈ {m(x) : x ∈ X}, define h(t, u) by h(t, u) = {g(x̃, u) : m(x̃) = t}. We wish to show
that h(t, u) is a function, i.e., that {g(x̃, u) : m(x̃) = t} is a singleton. By condition [A], we have
g(x₀, u*) = g(x₁, u*) ⇒ g(x₀, u) = g(x₁, u) ∀ u ∈ U: if g(x₀, u) and g(x₁, u) differed for some u,
then [A] would give a strict inequality in the same direction at every u, including u*. Given our definition of m, we thus have that
m(x₁) = m(x₀) implies g(x₀, u) = g(x₁, u) for all u ∈ U. Thus, {g(x̃, u) : m(x̃) = t} is a singleton
and thus h is a function satisfying h(m(x), u) = g(x, u) for all (x, u) ∈ X × U. By condition [A],
m(x₁) > m(x₀) implies g(x₁, u) > g(x₀, u) for all u ∈ U and thus h(m(x₁), u) > h(m(x₀), u). We conclude that
the constructed h and m have the stated properties. q.e.d.
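The construction in the proof can be mimicked on finite grids, with h stored as a lookup table. This is a minimal sketch under assumptions made here for illustration: the helper name `lemma1_construct`, the integer-valued example g, and the grids are hypothetical, and integers are used so that the equality m(x̃) = t is exact.

```python
def lemma1_construct(g, xs, us, u_star):
    """Build m and a table h with g(x, u) == h[(m(x), u)] on the grid.

    m(x) = g(x, u_star), as in the proof. h is keyed by (t, u) with t in
    the range of m. If two x with the same m(x) give different g(x, u),
    h would not be single-valued, which signals that [A] fails.
    """
    m = lambda x: g(x, u_star)
    h = {}
    for x in xs:
        for u in us:
            key, val = (m(x), u), g(x, u)
            if key in h and h[key] != val:
                raise ValueError("h is not single-valued: [A] fails on this grid")
            h[key] = val
    return m, h

# Hypothetical index: x enters only through x[0] + x[1], and the size of
# its effect varies with u, so g is not additively separable in (x, u).
def g(x, u):
    return (x[0] + x[1]) * (1 + u * u) + u

xs = [(0, 1), (1, 0), (2, -1), (1, 1), (-1, 0)]
us = [-1, 0, 2]
m, h = lemma1_construct(g, xs, us, u_star=0)

# h reproduces g and is strictly increasing in its first argument.
reproduces = all(h[(m(x), u)] == g(x, u) for x in xs for u in us)
ts = sorted({m(x) for x in xs})
increasing = all(h[(a, u)] < h[(b, u)] for u in us for a, b in zip(ts, ts[1:]))
print(ts, reproduces, increasing)  # [-1, 1, 2] True True
```

Three of the five grid points share m(x) = 1, which exercises the singleton step of the proof: they must agree on g(x, u) at every u for the table to be well defined.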
We now show that a threshold crossing model with a latent index separable in the observable
and unobservable regressors can be represented as a threshold crossing model additively separable
in observable and unobservable regressors.
Lemma 2 Let m : X ↦ ℝ with range M and h : M × U ↦ ℝ with h strictly increasing in its first
argument. Then there exists a function q : U ↦ ℝ such that 1[h(m(x), u) ≥ 0] = 1[m(x) + v ≥ 0]
with v = q(u), for all (x, u) ∈ X × U.
³ For example, if 0 ∈ U, one can pick m(x) = g(x, 0).
Proof: h strictly increasing in its first argument implies that h(·, u) is an invertible function for
any fixed u ∈ U, and thus 1[h(m(x), u) ≥ 0] = 1[m(x) + q(u) ≥ 0] with q(u) = −h⁻¹(0, u), where
h⁻¹(0, u) is the inverse of h(·, u) evaluated at 0. q.e.d.
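The formula q(u) = −h⁻¹(0, u) can be checked numerically for a specific h. The sketch below rests on assumptions introduced here: the function h(t, u) = t(1 + u²) + u is a hypothetical example, t stands in for m(x), and t ranges over values for which h(·, u) attains 0, so that the inverse at 0 exists. Exact rational arithmetic is used so that the comparison at the threshold is not affected by rounding.

```python
from fractions import Fraction
import random

def h(t, u):
    # Strictly increasing in t for every u, not additively separable in (t, u).
    return t * (1 + u * u) + u

def q(u):
    # q(u) = -h^{-1}(0, u): solve t * (1 + u^2) + u = 0 for t, then negate.
    return u / (1 + u * u)

random.seed(0)
draws = [
    (Fraction(random.randint(-20, 20), 7), Fraction(random.randint(-20, 20), 5))
    for _ in range(1000)
]

# The two threshold-crossing rules agree at every draw of (t, u).
agree = all((h(t, u) >= 0) == (t + q(u) >= 0) for t, u in draws)
print(agree)  # True
```

Here V = q(U) = U / (1 + U²) is a nonlinear transformation of U, which illustrates that the additively separable representation generally changes the unobservable rather than leaving it as U.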
Thus, as a special case of the Lemma, if Y = 1[h(Xβ, U ) ≥ 0], with h strictly increasing in
its first argument, then Y = 1[Xβ + V ≥ 0] where the random variable V is a function of the
underlying random element U . Note that the result nests the cases h(xβ, u) = f (xβ) − f (−u)
and h(xβ, u) = f (xβ + u) with f strictly increasing as special cases. As an immediate implication
of Lemmas 1 and 2, we have the following theorem.
Theorem 1 For g ∈ G, there exist functions m : X ↦ ℝ and q : U ↦ ℝ such that
1[g(x, u) ≥ 0] = 1[m(x) + v ≥ 0] with v = q(u), for all (x, u) ∈ X × U.
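Combining the two lemmas gives a direct recipe on finite grids. The following sketch is an adaptation, not the construction in the proofs: because a grid need not contain an exact zero of h(·, u), q(u) is taken here to be minus the smallest value of m(x) at which g(x, u) ≥ 0, with q(u) = −∞ if no such value exists. The function name `additive_representation`, the example g, and the grids are assumptions made for illustration.

```python
import itertools
import math

def additive_representation(g, xs, us, u_star):
    """Return dicts (m, q) with 1[g(x,u) >= 0] == 1[m[x] + q[u] >= 0] on the grid.

    Assumes g satisfies restriction [A] on the grid.
    """
    m = {x: g(x, u_star) for x in xs}
    q = {}
    for u in us:
        # Index values at which the choice equals 1 for this u.
        on = [m[x] for x in xs if g(x, u) >= 0]
        q[u] = -min(on) if on else -math.inf
    return m, q

# Hypothetical nonseparable index satisfying [A]: x matters only through
# x[0] - 2 * x[1], and the size of its effect depends on u.
def g(x, u):
    return (x[0] - 2 * x[1]) * (1 + u * u) + 3 * u

xs = list(itertools.product(range(-2, 3), repeat=2))
us = list(range(-3, 4))
m, q = additive_representation(g, xs, us, u_star=0)

check = all((g(x, u) >= 0) == (m[x] + q[u] >= 0) for x in xs for u in us)
print(check)  # True
```

The grid rule works because, under [A], g(x, u) ≥ 0 holds exactly when m(x) is at least as large as the smallest index value already known to yield choice 1 at that u.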
3 Conclusion
The standard binary choice model assumes that the choice is determined by a latent index crossing
a threshold. The standard assumption of additive separability between observed and unobserved
variables in the index is not as restrictive as it appears. This note has shown conditions on
a nonseparable latent index under which it will have a representation as a latent index with
additive separability between observed and unobserved variables.
References
[1] Das, M., W. Newey, and F. Vella, 2003, “Nonparametric Estimation of Sample Selection
Models,” Review of Economic Studies, 70, 33-58.
[2] Heckman, J., and E. Vytlacil, 2001, “Local Instrumental Variables” in C. Hsiao, K.
Morimune, and J. Powell, eds., Nonlinear Statistical Inference: Essays in Honor of Takeshi
Amemiya, (Cambridge: Cambridge University Press), 1-46.
[3] Heckman, J., and E. Vytlacil, 2004, “Structural Equations, Treatment Effects and Econometric Policy Evaluation,” forthcoming, Econometrica.
[4] Manski, C., 1988, “Identification of Binary Response Models,” Journal of the American
Statistical Association, 83: 403, 729-738.
[5] Shaikh, A., and E. Vytlacil, 2004, “Threshold Crossing Models and Bounds on Treatment
Effects: A Nonparametric Analysis,” unpublished manuscript, Stanford University.
[6] Vytlacil, E., 2002, “Independence, Monotonicity, and Latent Index Models: An Equivalence
Result,” Econometrica, 70(1), 331-41.
[7] Vytlacil, E., and N. Yildiz, 2004, “Dummy Endogenous Variables in Weakly Separable Models,” unpublished manuscript, Stanford University.