Estimating students’ preferences and
bounded rationality in Mexico City∗
Job Market Paper
Alexis Le Chapelain, Sciences Po Paris
October 30, 2015
Abstract
Many developing countries lag behind in terms of educational achievement and
human capital. This gap is often blamed on the low value placed on education by the
poorest students, and on their inability to identify the best educational alternatives.
This paper contributes to this debate by studying preferences for school
attributes in Mexico, and how well students understand the functioning of the educational system. It describes the school-student matching algorithm
used in Mexico City, and shows that it is non-truthful and unstable, and that it requires a high
level of sophistication. It then presents suggestive evidence that many students choose
dominated strategies and fail to fully understand the mechanism. Last, it exploits
the reports submitted by students to estimate a model of preferences for school attributes.
To overcome the problem raised by strategic reporting, it uses an approach based on
partial identification, and derives robust bounds on the parameters, which make it possible to assess the validity of the point estimates obtained through more traditional methods.
Finally, the paper finds that while students value selectivity and the quality of the
student body, there is considerable heterogeneity among them. The most deprived
students place less value on academic quality and prestige, regardless of their score, and are
often mismatched.
JEL Codes: C78, D47, H41, J24, I21, O15
Keywords: Student Preferences, Serial Dictatorship Mechanism, School Choice, Bounded
Rationality, Partial identification
∗
Corresponding author: Alexis Le Chapelain, Sciences Po Paris, [email protected]. I thank
Jorge Ubaldo Colin Pescina, Jean-Marc Robin, Olivier Tercieux, Il-Myoung Hwang, Yinghua He, Julien
Grenet, and Gabrielle Fack for their help and comments on the paper.
1 Introduction
Since economists first took an interest in education four decades ago, the approach
based on rational choice has been prominent. Students are seen as economic agents able to
balance the costs, risks and benefits of education decisions, and to accurately assess the
quality of various alternatives.1 However, this view has recently been challenged by work
showing the presence of behavioral biases and myopia among students.2 This problem seems
particularly acute in developing countries, where many students fail to recognize the benefit
of additional education, and have difficulty distinguishing good schools from bad ones.3 This can
have important implications, since human capital is an important factor in economic prosperity.4
Educational decisions are known to be strongly influenced by social origin. Students from
a poorer social background are more likely to drop out or to choose less prestigious and
less rewarding educational paths, regardless of their academic ability.5 This can lead to an educational mismatch, where some students are underplaced relative to their academic ability.
This problem is particularly striking in Mexico. While education is compulsory until middle
school, and students are primarily assigned to a school based on their catchment area,
they have to choose which high school they want to attend. This choice is particularly
important, because the choice of high school has a strong influence on the ability to go on
to higher education, and because the Mexican secondary system is strongly stratified by ability. In this context, regardless of their academic score, the poorest students are much more
likely to attend schools with low selectivity and a weak peer group.
1 A good example of such a view comes from the literature on the estimation of structural models of human
capital accumulation which emerged after the seminal paper by Keane and Wolpin (1997).
2 See for instance Lavecchia et al. (2014) for a comprehensive review of recent economic work on education
inspired by behavioral economics.
3 This is documented for instance in the work of Jensen (2010), who shows that students from the Dominican
Republic underestimate the returns to education.
4 Mexico is a good example of such a situation. While attendance at primary and middle schools is near
universal in this country, there is a very large drop-out rate in high schools, with more than 40% of a cohort
unable or unwilling to complete secondary education. At the same time, returns to graduation are large,
and have been estimated to be as high as 24% (Campos-Vázquez (2011)). Dropping out is thus likely to be
an inefficient decision. This low completion rate is often cited as one of the major factors hindering Mexico's
economic development, which has been disappointing during the last decades (Hanson (2010)).
5 For instance, Hoxby and Avery show that in the US, strong students from poor backgrounds fail to
apply to selective colleges with generous scholarships and end up in less selective and more costly institutions.
[Figure 1: Match quality and social origin. Vertical axis: difference between own test score and school test score. Horizontal axis: wealth index. Local polynomial smooth with 95% CI (Epanechnikov kernel, degree 0, bandwidth .45).]
[Figure 2: Self-censorship among strong students. Vertical axis: difference between own test score and school admission threshold. Horizontal axis: wealth index. Local polynomial smooth with 95% CI (Epanechnikov kernel, degree 0, bandwidth .43).]
The two figures above describe the extent of the phenomenon. The first figure reveals that
while rich students tend to be in schools whose academic average is higher than their own
score, the reverse is true for the poorest students, who are underplaced and are on average
stronger than their peers. The second figure focuses on the top quartile of ability. It shows that,
conditional on their test score, the poorest students are on average assigned to less selective
high schools, that is, their scores are well above the admission threshold of their assigned
schools. This is another sign of mismatch among this category of students.
Consequently, the aim of this paper is to investigate the causes of this academic mismatch among poor students. It shows that this mismatch is primarily due to two causes:
students from a more deprived background are more likely to misunderstand the
matching mechanism used in Mexico and to make mistakes in the application process; they
also have weaker preferences for selective schools than other students.
High drop-out rates and low academic ambition among poor students are often blamed on
material factors, such as the lack of resources and the presence of borrowing constraints, which
lead them to enter the labor market quickly.6 However, in our case, public high schools
are free, and mismatch is present from the very choice of a secondary institution. Another
possibility is that the complexity of the school system itself leads students to make poor
choices.7 This is especially the case when the admission process to various educational institutions is decentralized. In such circumstances, students have to apply to multiple schools
at the same time, which is often costly, and involves assessing in which schools one has a reasonable chance of being admitted.8
In order to ease the process for students, the use of a centralized matching system collecting
preferences and processing them with an algorithm has become popular. Designing efficient
matching algorithms has become an active area of research since the seminal work of Abdulkadiroğlu and Sönmez (2003) and Balinski and Sönmez (1999). Following this trend,
Mexico City implemented a centralized admission exam in the early 2000s, as well as a
matching algorithm derived from the serial dictatorship mechanism. Students have to take
an exam known as COMIPENS, and have to submit a rank-ordered list of preferred schools.
They are then assigned to schools, with the students having the highest scores being given
the highest priority.
While centralized admission systems and matching algorithms are meant to simplify the admission procedure and to level the playing field, they can in fact involve complicated strategies
and demand a high level of sophistication from students.9 In the case of Mexico, the mechanism which is used is apparently very simple to understand. However, I show that it requires
a high level of sophistication from the students. This stems from the fact that students do
not know their score before applying, and that they can submit only a partial preference list.
This forces students to make complicated trade-offs between schools with different probabilities
of admission. Moreover, it compels them to form beliefs about their score and the school
admission thresholds.
Not being able to observe the true preferences of students makes it impossible to measure their
rationality directly. Indeed, it is not possible to compute the optimal strategy of a
given student since his real valuation of each alternative is unknown. To overcome this
latter problem, I exploit the structure of the algorithm to show that some strategies are
dominated whatever the preferences are. I describe several behaviors which are difficult to
rationalize, and which point to the presence of bounded rationality among a large part of
the students. I also show that these rationality measures predict being assigned
to a less prestigious school conditional on one’s score, and being left unmatched at the end
of the assignment process.
I then show that students from a poor background are more likely to display irrational behavior. This explains a part of their higher propensity to be undermatched relative to
their ability.
I then try to estimate preferences for school attributes by using the rank-ordered lists
submitted by the students to the planner. This is made difficult by the fact that the algorithm gives students a strong incentive to behave strategically and to misreport their true
preferences. In particular, many students avoid ranking popular but selective schools to which they
have very little chance of being admitted. Moreover, the large choice set (600 alternatives) is
likely to induce “decision fatigue” and to lead students to adopt rough heuristics when
making choices.
In order to obtain robust estimates in this context, I follow the partial identification approach
outlined by Manski (1999) and Tamer (2010). I start with strong assumptions which yield
precise point estimates of the parameters, and I progressively weaken these assumptions
to obtain more robust, but less precise and sometimes partially identified, estimates of the
parameters of interest.
6 These factors seem indeed present at the higher education level in Mexico, as shown by ...
7 For instance Bettinger et al. (2012) document how reducing administrative complexity can lead students
from a poor background to be more likely to apply for financial aid and to attend higher education.
8 Abdulkadiroğlu et al. (2015) discuss this issue in the case of New York, while Che and Koh (2014)
provide theoretical arguments against decentralized matching.
9 For instance, Pathak and Sönmez (2008) and Ergin and Sönmez (2006) show that the Boston Mechanism,
a widely used matching algorithm, generates a game where players have to conceal their true preferences to
maximize their payoff. Evidence from the field shows that not all students are aware of this characteristic of
the mechanism, and that naive students who play a truthful strategy are likely to be disadvantaged compared
to sophisticated players. For these reasons, truthful mechanisms, where the optimal decision involves reporting
one’s true preferences, a simple strategy, are often advocated. However, most of the mechanisms implemented
in practice do not satisfy this condition. In particular, even the modified versions of the Deferred Acceptance
and Top Trading Cycles mechanisms which are implemented in many school systems are not truthful because
they include a truncation of the preference list (Haeringer and Klijn (2009)).
In addition to estimating a standard rank-ordered logit model, I use two other empirical
strategies. First, I define for each student a set of feasible schools based on his test score at
the placement exam, and then estimate a discrete choice model based on this personalized
choice set, in the spirit of the literature on college application and admission,10 which estimates
preferences for college attributes among American students in a context where many
students do not apply to the most selective colleges due to their low probability of being
accepted. While far superior to a naive approach, this strategy relies on assumptions about
the expectations of students about the schools they can gain access to. Since there is some
evidence that Mexican students are over-optimistic about their scores (Bobba and Frisancho
(2014)), this assumption may prove to be restrictive. Moreover, it relies on the assumption
that students truthfully report preferences within their feasible set, while the analysis of the
algorithm shows that it can be optimal not to do so.
As a result, I use another identification strategy which only assumes that students submit a
“truthful partial order”, that is, they truthfully rank the alternatives among the schools they
have included in their preference list (which may omit other highly valued schools).
Not doing so is a dominated strategy under any set of beliefs and preferences, and violates
basic rationality. With this very weak assumption, it is possible to build bounds on the
probability of ranking a school higher than another in a list, and to construct conditional
moment inequalities. However, these inequalities only deliver partial identification. I exploit the recent literature on the estimation of partially identified models (Andrews
and Shi (2013)) to build confidence sets for each of the parameters of the model.
I find that students value factors related to academic quality such as the selectivity of the
school, the mean quality of the student body, or membership in an elite subsystem. Students
also value proximity. While this is unsurprising, there is an interesting pattern of heterogeneity across students based on their ability and social origin. Students coming from a
more deprived background are less prone to apply to prestigious high schools and place less value on
academic quality, even though they have grades which would make them eligible for such
schools. This gives a second explanation for the higher likelihood of mismatch among poor
students.
10 See for instance Arcidiacono (2005), Long (2004), and Jacob et al. (2013). Such an approach is also
advocated by Fack et al. (2015) in the context of school choice.
The paper is organized as follows. The second part presents the algorithm used in Mexico City
to match students to schools, explains how it generates a Bayesian game among students,
and how the complexity of this game makes it unlikely that students behave fully rationally.
The third part presents several measures of irrationality, and shows that many students
exhibit behaviors which are difficult to reconcile with full rationality. The fourth part presents
the econometric model, and discusses the various identification strategies used as well as the
stringency of the assumptions needed. The fifth part presents and discusses the results. The
last part concludes.
Relation to the literature
The present paper is connected with several strands of literature. It is related to the recent literature on school choice which explores how the presence of bounded rationality and
heterogeneity in preferences can lead to suboptimal choices among students from a deprived
background; see for instance Hastings et al. (2008), Hastings and Weinstein (2008), Deming
et al. (2014), Hastings et al. (2015b) and Hastings et al. (2015a). My paper adds to this
literature by offering new evidence of educational mismatch from a developing country, and
by describing two channels explaining why the level of mismatch differs across students from
different social origins.
The availability of data on school choice has led researchers to estimate models of demand
for schools while taking into account the structure of the matching mechanism. Beyond estimating preferences, the aim of this literature has been to empirically assess the performance
of various mechanisms, as well as the degree of sophistication of the students.
The strategic reporting of preferences pointed out by theorists, and the bias it generates, have been
at the heart of the literature. He (2015) was the first to explicitly address this problem
by proposing a structural model of choice in the context of the Boston mechanism. Other
papers estimating preferences under the Boston mechanism include Calsamiglia et al. (2014)
and Agarwal and Somaini (2014). These papers build structural models of school choice,
and assume rational expectations and a high level of rationality, in a way similar to He. They
find evidence of strategic reporting, a finding confirmed by Calsamiglia and Güell (2014)
and De Haan et al. (2015), who use reduced-form methods.
The complexity of the strategy that students have to play under perfect rationality makes
such an assumption unappealing. As a result, a new trend in the literature is to
use less structural approaches and to try to weaken the assumptions under which preferences are
estimated. Hwang (2014) uses a partial identification approach, and estimates confidence sets
for the parameters of a demand model in the case of the Boston mechanism. Similarly, Fack
et al. (2015) propose several estimation methods for preferences under the deferred acceptance mechanism with truncation of the preference list, which mostly rely on traditional
discrete choice methods and quite weak assumptions. Burgess et al. (2014) estimate preferences in England under the Deferred Acceptance mechanism by exploiting the fact that for
many students the choice set is very small.11
My contribution to this strand of research is to propose a robust methodology for estimating
preferences based on reports obtained under the serial dictatorship mechanism with truncation
and uncertain priorities. I am able to overcome the misreporting problem, and to obtain
robust estimates of the underlying preferences. To do so, I do not need to make strong assumptions about the beliefs of the students, contrary to most of the literature, nor do I need
to assume that students fully understand the mechanism.
In parallel to the interest in student preferences, a rich theoretical literature has discussed
the properties of various matching algorithms, and in particular their strategy-proofness and
their vulnerability to manipulation. Examples include the previously cited papers by Pathak
and Sönmez (2008) and Ergin and Sönmez (2006), which discuss the vulnerability to manipulation of the Boston mechanism, and a paper by Haeringer and Klijn (2009), who show
that stable and strategy-proof mechanisms can become unstable and manipulable when preferences are truncated. Pathak and Sönmez (2008) offer interesting evidence demonstrating
that different levels of sophistication can lead the most naive players to become worse off under
a manipulable mechanism.
My paper contributes to this literature by discussing a mechanism little explored so far,
the serial dictatorship mechanism with truncated choices and unknown ex-ante priorities.
It shows that this mechanism lacks some desirable properties such as stability and strategy-proofness,
and that it requires a high level of sophistication from students.
11 The estimation of preferences for school attributes has also been performed for higher education.
Akyol and Krishna (2014) ingeniously adapt the methodology of Berry et al. (1995) to data from higher
education in Turkey. Carvalho et al. (2014) estimate preferences for majors in a Brazilian university using a
structural model.
2 Modeling school choice in Mexico
The aim of this section is to describe the matching mechanism used in Mexico City, and to
locate it within the framework of matching theory. This makes it possible to determine the
optimal strategies of the students, the assumptions that can reasonably be made about
their behavior, and what behavior would violate rationality.
2.1 The school choice plan in Mexico City
General description of the environment: The school choice plan in Mexico City operates at
the transition from junior to senior high school. While most students attend a junior high
school close to their home, they have to choose a high school, which can be located anywhere
in the district of Mexico and in some surrounding suburbs.
They can choose among a very large number of alternatives, since high schools in Mexico
City are run by nine different subsystems, operating many campuses and an even larger
number of programs (more than 600).
Choosing a high school is an important decision, since there is a high return to graduation from
high school in Mexico City (Campos-Vázquez (2011)), and completion rates can vary
significantly across high schools (de Janvry et al. (2015)). Indeed, the drop-out rate is very high,
at about 40%. Moreover, some high schools offer preferential access to some prestigious
universities (in particular, the high schools affiliated with UNAM, one of the most prestigious
universities in Mexico).
Timing: The school choice plan follows a specific timing. First, students have to
submit a rank-ordered list of preferences to the planner about one year before the assignment. The
list can include up to 20 schools. A few months later, students have to take an exam, known as
COMIPENS, which is graded on a scale of 128. The exam consists of 128 true/false questions,
and is divided into several parts covering the whole junior high school curriculum. There is no
penalty for answering a question incorrectly. At this stage, any student with a score below
31 is dismissed (however, this restriction was removed recently). Students are then assigned
to a high school based on their test score and on their submitted preferences.
The algorithm: The algorithm used in Mexico City is a variant of the serial dictatorship
with truncated preference lists. It proceeds in the following steps.
Step (0) The mechanism collects the preferences of the students (namely, a rank-ordered list
of at most twenty schools), and their scores at the placement exam.
Step (1) All students are ordered by their score at the exam.
Step (2a) The best ranked students are assigned to their preferred schools.
Step (2b) If at this stage some schools face excess demand, they choose either to refuse
all the applying students with an equal score, or to accept all these applying students
(namely, they choose an admission threshold, and then increase their number of seats
to meet demand if there are too many ties; conversely, they can choose an admission
threshold just one point higher, but at the cost of having vacant seats).
Step (2c) The students and the school seats which have been matched are removed from
the algorithm.
Step 2 iterates until all admissible students have been assigned, or there are no longer
any vacant seats.
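The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the planner's actual implementation: the function name `serial_dictatorship`, its arguments and the data layout are hypothetical, and ties at the margin are always resolved by admitting the whole tied group, which is only one of the two options described in Step (2b).

```python
from collections import defaultdict


def serial_dictatorship(scores, lists, capacity, min_score=31, max_len=20):
    """Assign students to schools in decreasing order of exam score.

    scores:   dict student -> exam score
    lists:    dict student -> rank-ordered list of schools (truncated at max_len)
    capacity: dict school -> number of seats
    Assumption: a school with at least one free seat admits every student
    in the current tied score group who applies to it (Step 2b, "accept all").
    """
    seats = dict(capacity)
    match = {i: None for i in scores}  # None = unmatched or dismissed

    # Step (1): group admissible students by score.
    by_score = defaultdict(list)
    for i, g in scores.items():
        if g >= min_score:
            by_score[g].append(i)

    # Step (2): process score groups from the highest to the lowest.
    for g in sorted(by_score, reverse=True):
        demand = defaultdict(list)
        for i in by_score[g]:
            # Step (2a): first listed school that still has a vacant seat.
            for s in lists[i][:max_len]:
                if seats.get(s, 0) > 0:
                    demand[s].append(i)
                    break
        # Steps (2b)-(2c): admit the tied group, then remove the seats.
        for s, applicants in demand.items():
            for i in applicants:
                match[i] = s
            seats[s] = max(seats[s] - len(applicants), 0)
    return match
```

A student whose listed schools are all full when his turn comes stays unmatched, which is how truncated lists can leave students without a seat.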
The algorithm produces a matching denoted $\mu$. The specific allocation given to an agent $i$
is denoted $\mu_i$.
An important feature of the algorithm is that it does not use tie-breaking rules to discriminate between
students having the same level of priority. Instead, it asks the high schools to
set an admission threshold. Since programs are large (most of them admit more than
400 students), this does not have much effect on excess capacity.
Note that this algorithm has not been studied in the literature so far, since serial
dictatorship has been investigated only in contexts where preferences are not truncated (in
particular, a random version of this algorithm has been discussed in an important paper
by Bogomolnaia and Moulin (2001), as well as in Che and Kojima (2010)). As
shown by Haeringer and Klijn (2009) in the case of popular algorithms such as Top Trading
Cycles, Deferred Acceptance and the Boston mechanism, truncation can have an important effect,
and can change the properties of the algorithm. While the serial dictatorship mechanism is
truthful, stable and efficient, it may not retain all these characteristics if preferences are
truncated.
2.2 Model, strategy and equilibrium
In this section, I follow the literature on the structural estimation of students’ preferences
(Agarwal and Somaini (2014), He (2015), Fack et al. (2015)), and I model the mechanism
used in Mexico as a Bayesian game.
2.2.1 Model
Definition of the game The game is defined as follows:
• a set of parents / students / households: $\{i\}_{i=1}^{I}$
• a set of schools (= programs): $\{s\}_{s=1}^{S}$, and an outside option ø
• a school capacity vector: $\{q_s\}_{s=1}^{S}$, with $\sum_{s=1}^{S} q_s \geq I$
• students’ rank-ordered lists: $L_i \in \mathcal{L}$, where $L_i = (c_i^1, c_i^2, \dots, c_i^K)$, $K < 21$, and $c_i^k \in \{s\}_{s=1}^{S} \cup \{\text{ø}\}$ for any $k = 1, 2, \dots, K$
• school priorities $P$, which are in this context identical across schools, and equal to the ranking in the
exam. As a result, $P_{is} = g_i$, where $g_i$ is the score at the exam.
Preferences The preferences of student $i$ for school $s$ are defined by
$$u_{is} = U(W_s, X_{is}, \epsilon_{is}) = W_s \beta_W + X_{is} \beta_X + \epsilon_{is} \qquad (1)$$
where $W_s$ is the vector of school characteristics (which define a common value of the school
across all students), $X_{is}$ is a vector of school-student characteristics (such as the distance
to the school, or interaction terms between school and student characteristics), and $\epsilon_{is}$ is a
student-specific idiosyncratic preference shock for alternative $s$.
I assume that $\epsilon_{is}$ follows an extreme value distribution $F(0, \sigma_\epsilon)$, that $\epsilon_{is} \perp W_s, X_{is}$, and that
shocks are uncorrelated, $\mathrm{corr}(\epsilon_{is}, \epsilon_{i's'}) = 0$. I collect the vectors $W_s$, $X_{is}$ into $Z$, and the
parameters $\beta_W$, $\beta_X$ into $\beta_Z$.
I assume that students maximize their utility function, and that they know their preferences,
as well as the distribution of the preferences of other students.
Beliefs Since students do not know in advance their score at the placement exam, or
the admission thresholds of the schools, they have to form beliefs about them.
Students have beliefs about their own score:
$$\hat{g}_i \sim G(\mu_{g_i}, \sigma_{g_i})$$
Students have beliefs about the thresholds (i.e. the score of the last admitted student):
$$\hat{t}_{si} \sim H(\mu_{si}, \sigma_{si})$$
If we impose rational expectations, the means are equal to the true values. However, I assume
that beliefs are not perfectly accurate, and that they take the form of a distribution,
that is, students assign a probability to every possible realization of their score and of the
school admission thresholds. This is realistic: since there is no penalty for giving a
wrong answer, students have a strong incentive to answer all the questions in the test, even
at random. This introduces a strong random component in test scores, and makes them
difficult to predict. As a result, even a student with rational expectations could at most be
expected to know the shape, mean and variance of the distribution of his score.
2.2.2 Strategies
The problem of the student is to find the list which maximizes his expected utility.
For each school, he can define an admission probability $p_{is}(\hat{g}_i, \hat{t}_{si}, L_i)$ which depends on the
student's expected score, the expected threshold, and the submitted list. Based on these
probabilities, the student chooses the following strategy:
$$\sigma^*(u_i) \in \arg\max_{\sigma} \sum_{s \in S} u_{is} \int\!\!\int p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma)\, dG(g_i)\, dH(t_s) \qquad (2)$$
NB: An alternative way to model the strategy of the students would be to assume that
students know the distribution of the characteristics and expectations of the other students,
and that they maximize accordingly. If we define $D(u_{-i}, \hat{g}_{-i})$ as the joint distribution of
utilities and expected scores, we can rewrite the problem as:
$$\sigma^*(u_i) \in \arg\max_{\sigma} \sum_{s \in S} u_{is} \int\!\!\int p_{is}(\hat{g}_i, \sigma, \sigma^*(u_{-i}, g_{-i}))\, dG(g_i)\, dD(u_{-i}, \hat{g}_{-i}) \qquad (3)$$
that is, students choose their best strategy conditional on other students choosing their
optimal strategy conditional on their characteristics. In practice, the threshold depends
directly on the behavior of the other students, and is a sufficient statistic for choosing the
optimal strategy.
The probability $p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma)$ can be defined as follows:
$$p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma) = \Pr\left(g_i < t_{s'},\ s' = l_1, \dots, l_k \ \ \&\ \ g_i > t_s \ \forall s = l_{k+1}\right) \qquad (4)$$
The probability for a student $i$ of being admitted to a given school $s$ is equal to the probability
that his score is below the admission threshold of every better-ranked school $s'$ in the list,
and above the admission threshold of the given school. As a result, this probability
depends on all the choices which are better ranked. When choosing whether to include a school
with a given admission threshold in his list, a student should therefore balance how much
it is going to increase his expected utility (i.e. $p_{is} u_{is}$) against the reduction in the probability
of being admitted to the schools likely to be ranked below (that is, $\sum_{\tilde{s}} \frac{\partial p_{i\tilde{s}}}{\partial s} u_{i\tilde{s}}$, $\tilde{s}$ being the
schools whose threshold is below that of $s$).
The strategy can be further rewritten using this expression for the probability of admission:
$$p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma) = \int_{g_i < t_1} \int_{g_i < t_2} \cdots \int_{g_i < t_k} \int_{g_i > t_{k+1}} dG(g_i)\, dH(t_1, \dots, t_{k+1}) \qquad (5)$$
The maximization problem can be rewritten further in a similar way:
$$\max_{L_i \in \mathcal{L}} U(L_i) = \sum_{s \in L_i} u_{is} \int_{g_i < t_1} \int_{g_i < t_2} \cdots \int_{g_i < t_k} \int_{g_i > t_{k+1}} dG(g_i)\, dH(t_1, \dots, t_{k+1}) \qquad (6)$$
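The integrals in (5) and (6) rarely have a closed form, but they can be approximated by simulation. The Python sketch below draws a score and a vector of thresholds from a student's beliefs, and counts how often each listed school is the first one whose threshold is cleared. The function names, the belief samplers passed as arguments, and the normal distributions and utilities used at the end are illustrative assumptions, not part of the model above; the utility of remaining unmatched is normalized to zero.

```python
import random


def admission_probabilities(ranked_list, draw_score, draw_thresholds,
                            n_draws=20000, seed=0):
    """Monte Carlo estimate of the probability of being assigned to each
    school of a rank-ordered list, as in equation (5)."""
    rng = random.Random(seed)
    counts = {s: 0 for s in ranked_list}
    for _ in range(n_draws):
        g = draw_score(rng)        # one draw from the belief G about the score
        t = draw_thresholds(rng)   # one joint draw from the belief H
        for s in ranked_list:
            # Assigned to the first listed school whose threshold is cleared,
            # i.e. the score is below the thresholds of all schools ranked above.
            if g > t[s]:
                counts[s] += 1
                break
    return {s: c / n_draws for s, c in counts.items()}


def expected_list_utility(ranked_list, utility, probs):
    """Expected utility of a list, as in equation (6)."""
    return sum(utility[s] * probs[s] for s in ranked_list)


# Hypothetical beliefs: all numbers below are made up for illustration.
def draw_score(rng):
    return rng.gauss(80, 10)


def draw_thresholds(rng):
    return {"A": rng.gauss(95, 3), "B": rng.gauss(75, 3)}


probs = admission_probabilities(["A", "B"], draw_score, draw_thresholds)
value = expected_list_utility(["A", "B"], {"A": 10.0, "B": 6.0}, probs)
```

Evaluating one list this way is cheap; the difficulty discussed next comes from the number of lists over which the maximum has to be taken.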
When the dimension of $\mathcal{L}$ is large, the problem quickly becomes intractable. As a result,
it is unlikely that students perfectly optimize their expected utility in our setting, since
there are more than 600 schools. Indeed, for each school to be included in the list,
students have to compute their probability of admission conditional on the presence of other
schools in the list, which clearly requires a high level of sophistication. They also have to
determine the joint distribution of all thresholds, which is a very difficult task. Therefore,
they are much more likely to rely on simple heuristics than on perfect expected utility
maximization.
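To give an order of magnitude, the number of rank-ordered lists a student could submit can be counted directly. The helper below is a hypothetical illustration in Python; it assumes that a list contains between 1 and 20 distinct schools chosen among 600, in line with the figures given earlier.

```python
import math


def count_rank_ordered_lists(n_schools, max_length):
    """Number of ordered lists of 1 to max_length distinct schools."""
    return sum(math.perm(n_schools, k) for k in range(1, max_length + 1))


n_lists = count_rank_ordered_lists(600, 20)
print(f"{n_lists:.2e}")
```

With these assumptions the count is on the order of 10^55, so exhaustive comparison of lists is out of reach even before the admission probabilities are computed.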
2.2.3 Equilibrium
We can use the same argument as Fack et al. (2015) to establish the existence of a pure-strategy Bayesian Nash equilibrium, by applying Theorem 4 (Purification Theorem) in Milgrom and Weber (1985). However, there may be multiple equilibria, including some in mixed
strategies.
2.3 Properties of the algorithm
2.3.1 Some definitions:
I first define some useful concepts used in the matching literature.
A matching is said to be stable if there is no student who prefers another school
to his assignment and either has a higher priority at this school than one of the students
who got a seat there, or could fill a vacant seat in this school.
A stable mechanism is a mechanism that associates a stable matching to every preference
profile and priority.
A mechanism is said to be strategy-proof if truth telling is a weakly dominant strategy
for every preference profile and priority, that is, students cannot manipulate the outcome of
the mechanism by misreporting their true preferences.
A matching is said to be efficient or Pareto optimal if there is no pair of agents who can
improve their situation by exchanging their assignments.
An efficient mechanism is one that associates an efficient matching to every preference
profile and priority.
A strategy $\sigma_i$ is said to be a true partial order of schools if it is a list $L_i$ that
respects the true preference ordering of the schools included in the list.
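The last definition can be made concrete with a short check. The Python function below is a hypothetical helper, written under the assumption that the true preference ordering is available as a complete ranked list, which is of course not the case in the data.

```python
def is_true_partial_order(submitted, true_ranking):
    """Return True if the schools in the submitted list appear in the same
    relative order as in the true ranking (omitted schools are ignored)."""
    position = {s: r for r, s in enumerate(true_ranking)}
    ranks = [position[s] for s in submitted]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


true_ranking = ["A", "B", "C", "D"]
print(is_true_partial_order(["A", "C"], true_ranking))  # omits B, order kept
print(is_true_partial_order(["C", "A"], true_ranking))  # order reversed
```

A list such as (A, C) skips a preferred school but remains a true partial order, whereas (C, A) does not.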
2.3.2 Stability
Proposition 1: With a positive probability, the serial dictatorship with truncation of preferences and ex-ante unknown priorities produces an unstable matching.
Proof. It is easy to build an example to prove Proposition 1. Consider a setting with four
schools A, B, C, D, and a large number of agents. Consider the following two agents {1, 2}
with identical preferences $A \succ B \succ C \succ D$. These two students (who apply among many
other unknown agents) have different utilities and different expected test scores and/or thresholds,
which lead to the following expected admission probability distributions conditional on
applying. Let’s assume that they can submit a list of at most two schools among four.
Example 1

Grade distribution   P(g > tA)   P(tA > g > tB)   P(tB > g > tC)   P(tC > g > tD)   P(tD > g)
Student 1            0.1         0.2              0.3              0.3              0.1
Student 2            0.05        0.15             0.2              0.5              0.1

Utility              A           B                C                D                Ø
Student 1            30          10               9                2                0
Student 2            20          18               10               9                0
With the above grade distributions and utilities, the optimal strategy for agent 1 is to submit the list (A, C), since it maximizes his expected utility. The optimal strategy for agent 2 is to submit the list (B, D). Conditional on the submission of these lists, and provided that the beliefs are correct, there is a 6% probability that student 1 is assigned to C and student 2 is assigned to B. In that case, student 1 may have a grade (and thus a priority) higher than that of student 2. For instance, such an assignment is consistent with student 1 having a grade equal to tA − ε and student 2 having a grade equal to tB + ε, for a small ε > 0. Student 1 then prefers B to his assignment C and has a higher priority at B than student 2. Therefore, such grade distributions and preferences produce an unstable outcome with positive probability.
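The optimal lists claimed in Example 1 can be checked by brute force. The sketch below enumerates every two-school list and computes its expected utility from the probabilities and utilities of the example; the function names and the data layout are assumptions made for illustration, and lists are taken to be ranked in threshold order (tA > tB > tC > tD), as implied by the bands of the grade distribution.

```python
from itertools import combinations

SCHOOLS = ["A", "B", "C", "D"]  # ordered from highest to lowest threshold

# P(score in each band): above t_A, (t_B, t_A), (t_C, t_B), (t_D, t_C), below t_D
BANDS = {1: [0.10, 0.20, 0.30, 0.30, 0.10],
         2: [0.05, 0.15, 0.20, 0.50, 0.10]}
UTILITY = {1: {"A": 30, "B": 10, "C": 9, "D": 2},
           2: {"A": 20, "B": 18, "C": 10, "D": 9}}


def expected_utility(student, listed):
    """Expected payoff of a list whose schools are ranked in threshold order."""
    total = 0.0
    for band, prob in enumerate(BANDS[student][:4]):
        # In band k the score clears the thresholds of SCHOOLS[k:], so the
        # student obtains the first listed school among them, if any.
        feasible = [s for s in listed if SCHOOLS.index(s) >= band]
        if feasible:
            total += prob * UTILITY[student][feasible[0]]
    return total


def best_list(student, size=2):
    """Two-school list with the highest expected utility."""
    return max(combinations(SCHOOLS, size),
               key=lambda lst: expected_utility(student, lst))


for student in (1, 2):
    chosen = best_list(student)
    print(student, chosen, round(expected_utility(student, chosen), 2))
```

Under these assumptions the enumeration returns (A, C) for student 1, with expected utility 7.5, and (B, D) for student 2, with expected utility 9.9, in line with the strategies stated in the proof.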
2.3.3 Efficiency
Proposition 2: If students perfectly know the ranking of schools by threshold ex-ante, or if their expected thresholds are such that the support of the distribution of any threshold never overlaps with the support of another, then serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs is efficient.
Proof. I build a proof by contradiction. Notice that if an assignment is inefficient, then, for some schools A, B and some students 1, 2, we must have A ≻₁ B and B ≻₂ A, together with the assignment {µ1 = B, µ2 = A}. If threshold ranks are known ex-ante, then we must have tA > tB, or the reverse. Suppose that tA > tB. Then it is impossible to have the assignment µ2 = A. Indeed, the grade of student 2 would have to be higher than tA, but then it would also be higher than tB, the threshold of an alternative he prefers. The proof goes the same way if tA < tB.
2.3.4 Truth telling
Proposition 3: Serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs is not strategy-proof.
Proof. See Example 1. Although the schools are ranked truthfully within the list submitted to the planner, students do not reveal their true preferences, since they omit some of their most preferred alternatives from the list. A truthful strategy would have been to submit (A, B) for both students.
Proposition 4: Under serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs, submitting a list which is not a true partial order is a weakly dominated strategy.
Proof. See Theorem 7 in Fack et al. (2015). Since the Serial Dictatorship mechanism is equivalent to the Deferred Acceptance mechanism with the same priorities at all schools, their proof can be directly adapted to my setting.
2.4 Dominated strategies and non-rational expectations
While a wide range of choices can be rationalized by postulating specific preferences and beliefs, some of them are quite unlikely to be made by a truly rational individual. Three sets of behaviors are likely to come from bounded rationality and/or biased expectations.
Proposition 5a: If the admission thresholds are known ex-ante, or if the supports of their distributions do not overlap, then it is a weakly dominated strategy not to respect the threshold hierarchy in the rank-ordered list submitted to the planner.
Proof. Suppose that a student i does not respect the threshold order, that is, he ranks A above B while tB > tA with probability one. Then he can never be admitted to B, and the list including {A, B} (in this order) gives the same utility as an otherwise identical list that omits B.
Indeed, four cases are conceivable. If gi > tA and gi > tB, he is admitted to A. If gi < tA and gi < tB, he is admitted to neither of the two schools. If gi > tA and gi < tB, he is admitted to A. Finally, since tB > tA, it is impossible to have gi < tA and gi > tB, which is a necessary condition for being accepted in B. As a result, the probability of being accepted in B is zero, and omitting this school from the list does not change the expected payoff of the student. Moreover, the student may add to his list another school with a positive probability of admission, which would raise his expected utility.
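The case analysis in the proof can be illustrated numerically. The sketch below is a minimal illustration, not the paper's procedure: the `assign` helper, the cutoff values and the uniform score draws are hypothetical, and admission is taken to require a score at or above the cutoff. It shows that, when tB > tA, a list ranking A above B never leads to B and yields exactly the same assignments as the list without B.

```python
import random


def assign(listed, score, threshold):
    """Return the first listed school whose threshold the score clears, else None."""
    for school in listed:
        if score >= threshold[school]:
            return school
    return None


threshold = {"A": 60, "B": 80}  # hypothetical cutoffs with t_B > t_A
rng = random.Random(0)
scores = [rng.uniform(0, 100) for _ in range(10_000)]

with_b = [assign(["A", "B"], g, threshold) for g in scores]
without_b = [assign(["A"], g, threshold) for g in scores]

print("ever admitted to B:", "B" in with_b)
print("same outcomes without B:", with_b == without_b)
```

The slot spent on B could instead be given to a school with a positive admission probability, which is the sense in which the original list is weakly dominated.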
Proposition 5b: Ranking a school whose admission threshold lies above the support of one’s expected score is a weakly dominated strategy.
Proposition 5c: Submitting a list which is not complete is a weakly dominated strategy.
Propositions 5b and 5c are trivially true. They amount to the fact that students should not waste any choice in their list.
The extent of the behavior described in proposition 5 can tell us how well students understand
the mechanism. If they often play weakly dominated strategies, we can assume that they
have a limited ability to strategize well.
3 Assessing students’ rationality
3.1 School choice and students’ rationality
As shown above, the mechanism used in Mexico requires a high level of sophistication from students and their families. Students have to predict their own score, predict the thresholds of the schools they want to apply to, and understand the functioning of the mechanism. In particular, they should understand the importance of respecting the threshold order when ranking schools, as well as the importance of spreading their choices across schools with different admission thresholds.
One can wonder to what extent students are able to display such a degree of sophistication. Bobba and Frisancho (2014) provide evidence that students in Mexico have strongly upward-biased expectations about their test score, by administering a survey and making students take a mock exam. Following the survey, students somewhat revise their beliefs and are more likely to apply to institutions which are better suited to their academic aptitude. This shows that students’ beliefs are not necessarily correct, and that they may make mistakes in their application strategy.
Such mistakes may have important consequences. They could lead students to stay unmatched (about 15% of our sample). They could also lead students to miss opportunities and be under-matched, or to be assigned to schools which do not fit their ability well.
Students’ rationality is often a concern in school choice. Recent research has tried to distinguish between naive and sophisticated agents, the former having a poor understanding of the mechanism, while the latter are able to design strategies to exploit it. This behavior can sometimes be inferred from the data. For instance, in the Boston mechanism, it is often unwise to reveal one’s true first choice, because it may be inaccessible due to over-demand. It is often safer to aim for a less popular school at which one has priority. Looking at how choices change with priority can therefore be informative (e.g. Agarwal and Somaini (2014)). However, under some other mechanisms, in particular DA or TTC, it is always possible to rationalize the observed choice by referring to the true unobserved underlying preferences. Even though some students make mistakes, it is difficult to detect them.
The interest of the Mexico school choice plan is that it is possible to directly observe strategies which are dominated, without access to the true preferences. As a result, it is possible to build various tests of rationality.
The aim of this section is thus to present the results of these tests for the whole student population, to assess which students are the most at risk of choosing inefficient strategies, and to measure the consequences of these strategies for their ability to secure access to a well-liked school.
This section uses the data gathered by the COMIPENS, which administers a survey before
the test. The data are described in detail in part 5.
3.2 Test 1: respecting threshold order
A first rationality requirement for any strategy is to respect the threshold order of the ranked
schools. Indeed, as shown in proposition 5a, if one ranks a school higher than another school
which has a higher admission threshold, one has a zero percent probability of being admitted
to this latter school. Such a strategy ends up wasting a choice, which could be allocated to
increase the probability of admission to another well-liked school. This is therefore a weakly
dominated strategy.
In practice, thresholds and rankings are not fixed and vary from year to year. As a result, not respecting rankings may be consistent with rational behavior. For instance, a school usually ranked higher than another may have unexpectedly moved below it, or two very well-liked schools may have very close thresholds. However, if the distance between the thresholds of two schools which are not properly ranked is very large, or if the percentage of violations of the threshold order among all the school pairs within a submitted list is very large, it may be a sign that students have incorrect beliefs about the thresholds, or fail to understand the importance of respecting the ranking of schools by admission threshold in their preference list.
In practice, it is possible for students to form accurate expectations of admission thresholds, because these thresholds do not move very much over time and are published each year in the news. Merely inferring future thresholds from past ones gives a reasonably good prediction.
I build several measures of respect of the threshold order. The first measure is the share of school pairs in the rank-ordered list which do not respect the ex-post threshold order. The second measure is the average difference between thresholds over the pairs for which the threshold order is not respected. The share of lists exhibiting at least one pair with a deviation is very large, at 95%, which reflects the fact that thresholds move randomly, and that even a rational student may end up submitting a choice list which does not respect the threshold order. However, there is very large heterogeneity in the ability to respect the threshold order.
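The two measures can be computed from a single list as follows. This is a sketch under stated assumptions: the function name `order_violations` is hypothetical, and all pairs of listed schools are compared (not only adjacent ones), which the text does not specify.

```python
from itertools import combinations


def order_violations(listed_thresholds):
    """Measure deviations from the threshold order in one submitted list.

    listed_thresholds: ex-post thresholds of the listed schools, in the
    order in which the student ranked them (first choice first).
    Returns (share of discordant pairs, mean threshold gap over those pairs).
    """
    pairs = list(combinations(listed_thresholds, 2))
    if not pairs:
        return 0.0, 0.0
    # A pair is discordant when the school ranked later has a higher threshold.
    gaps = [later - earlier for earlier, later in pairs if later > earlier]
    share = len(gaps) / len(pairs)
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return share, mean_gap


# Hypothetical list: second choice is more selective than the first.
share, gap = order_violations([70, 90, 60])
print(share, gap)
```

For the hypothetical list above, one pair out of three is discordant and the gap for that pair is 20 points.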
The figure below shows the distribution of the share of school pairs in the rank-ordered list which do not respect the threshold order. While not respecting the threshold order is widespread, the distribution has a thick right tail, showing that a substantial minority is extremely bad at ranking schools and submits lists in which more than 50% of school pairs do not respect the threshold order. This implies wasting many choices. However, it may also be due to the fact that some students rank schools whose thresholds are close to each other.
Figure 1 − Ability to respect threshold order (x-axis: share of incorrect pairs; y-axis: percentage of students)
The ability to respect threshold order is measured as the fraction of the school pairs in a list which do not respect the ex-post threshold order.
The second figure displays the distribution of the average deviation for each pair. The distribution is very skewed and approximately follows an exponential distribution, with a few students performing very poorly.
Figure 2 − Ability to respect threshold order (weighted) (x-axis: weighted share of incorrect pairs; y-axis: percentage of students)
Each deviation from the threshold order is weighted by the size of the deviation.
To summarize, many students fail to rank schools in threshold order, which means wasting choices. However, there is considerable heterogeneity among them, with a substantial minority showing a very poor ability to follow this simple strategy, probably because they have not understood the mechanism well.
3.3 Test 2: avoiding overoptimism
A second way to test the rationality of students’ application strategies is to look at their optimism. Indeed, students need to form expectations about their own score when applying to schools. They should assess not only its mean, but also its variance. If they consider their score to be highly uncertain (which tends to be true since the test is a multiple-choice exam which relies somewhat on chance), they should adjust their application strategy accordingly.
There are a few rules that students should follow when selecting their strategy. First, they should never list a school with a zero probability of admission (proposition 5b). Second, if their expectations are imprecise, a good rule of thumb is to apply to schools of different selectivity levels, so that they cover a large part of their score distribution. Third, they should apply widely and exhaust all their choices (proposition 5c).
To test whether students indeed follow strategies of this sort, we can first look at the number of applications addressed to schools with an admission threshold well above their expected test score. To do so, I first compute expected test scores based on the previous scores of students in junior high school and some other characteristics (wealth, self-confidence, and so on). I compute confidence intervals for these expected scores, as well as critical values at the 1, 5 and 10% levels (for a one-sided test). Then, I look at the fraction of the choices of each student which lie above this upper bound. Note that this is a conservative test, since the variables I use to predict scores explain only about one third of the variance of the admission exam score. It is likely that students have access to much more accurate private information about their chance of success.
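The overoptimism measure can be sketched as follows. The paper does not spell out how the upper bound is built, so the example makes explicit assumptions: the prediction error is treated as normal with a known standard deviation, and `predicted_score`, `residual_sd` and the function name are hypothetical inputs standing in for the output of the first-stage prediction.

```python
from statistics import NormalDist


def overoptimistic_share(listed_thresholds, predicted_score, residual_sd, level=0.10):
    """Share of listed schools whose threshold exceeds a one-sided upper bound.

    Assumes (for illustration only) normally distributed prediction errors
    with standard deviation residual_sd around predicted_score.
    """
    z = NormalDist().inv_cdf(1 - level)       # one-sided critical value
    upper = predicted_score + z * residual_sd  # score reached with prob. `level`
    above = [t for t in listed_thresholds if t > upper]
    return len(above) / len(listed_thresholds)


# Hypothetical student: predicted score 60, prediction error s.d. 10.
print(overoptimistic_share([95, 85, 70, 60], predicted_score=60, residual_sd=10))
```

With these hypothetical numbers the 10% upper bound is close to 72.8, so two of the four listed schools lie above it.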
The graph below shows the distribution of the share of choices whose threshold is higher than the 10% upper bound. A non-trivial share of students have almost all their choices above this upper bound, which shows a high level of optimism. More generally, many students place more than 10% of their applications above the 10% bound, which means that they are ready to spend quite a large share of their choices on schools they are unlikely to reach. The results are similar, although attenuated, for the bounds at 1 and 5%. Again, a significant share of students seems fairly optimistic.
Figure 3 − Overoptimism (x-axis: share of choices with less than 10% probability of admission; y-axis: percentage of students)
A sign of overoptimism is that some students fail to be admitted to any school because they aim too high, while having test scores which would have allowed them to attend a decent high school. Among students who take the test and have positive test scores, 12.5% are unable to obtain a seat because their score is below the threshold of every school they applied to. These students have a much lower average test score than those who secure a seat, at 59 compared to 72. However, the top quartile of these students scores above 75, and the top decile above 82, which is a fairly good academic result. Moreover, about 50% of all matched students are assigned to a school with a threshold lower than 59. As a result, failure to secure a high school seat is most likely due to overoptimism rather than to a weak score.
Another interesting fact is that students differ very much in the dispersion of the schools
they list. The distribution of the span between the most and least selective institution is
approximately uniform, and is shown in the figure below.
Figure 4 − Choosing a safe strategy (x-axis: span of the list; y-axis: percentage of students)
The span is defined as the difference between the highest and lowest threshold of the schools ranked in the list.
Some students concentrate all their choices within a small threshold window, which means that if they fail to accurately predict their score, they may end up either unmatched or undermatched. By contrast, many students follow a safer strategy and apply widely among schools of different selectivity levels, which may reflect a better understanding of the randomness of the allocation process.
3.4 Test 3: size of choice list
Given that both test scores and thresholds are unknown ex ante, it is always a weakly dominant strategy to add more schools to one’s list and to exhaust one’s choices (proposition 5c). However, as in many school choice plans around the world, very few students use the full list, and some of them report only two or three choices. This behavior is again puzzling and difficult to rationalize. It may be that students who do not report a full list do so because they are sure to be assigned to a desirable choice. However, we have seen that many students eventually end up unmatched. Given the randomness of scores, reporting only a limited number of choices is thus a risky strategy. Moreover, many students report choosing a program mostly for its academic quality, and could thus benefit from ranking more well-regarded schools in the hope of being lucky and obtaining a better-than-expected score or facing a lower-than-expected threshold. As a result, the large number of students who report fewer than the maximum possible number of choices is more likely to be due to their inability to assign a subjective value to many schools, owing to a lack of information and to the difficulty of choosing among so many alternatives.
I present below the distribution of the number of choices.
Figure 5 − Number of submitted choices (x-axis: number of submitted choices; y-axis: percentage of students)
There is a peak at ten, maybe because it is a salient number. The majority of students report ten or fewer choices. Only a minority reaches the size constraint of the list.
3.5 The determinants of rationality deviations
Many students seem to depart from rational strategies in systematic ways. As a result, one could wonder what the determinants of such departures are. Are there some predictors of irrationality? I regress below various measures of rationality on gender, academic achievement, and some demographic characteristics.
Table 1: The determinants of weakly dominated strategies

                           (1)           (2)             (3)             (4)             (5)
                           Number of     Average rank    Percent rank    Choices’ span   Percent
VARIABLES                  choices       deviation       deviation                       overoptimistic

COMIPENS Score             0.00589***    -0.00651***     -0.00561***     0.00228***      -0.00205***
                           (0.000557)    (0.000146)      (0.000153)      (0.000170)      (0.000123)
Mean Score Of Junior HS    0.00107       -0.0127***      -0.0109***      -0.00372***     -0.0137***
                           (0.00470)     (0.000530)      (0.000740)      (0.00101)       (0.000795)
Girl                       0.00128       -0.0612***      -0.0922***      0.0463***       0.129***
                           (0.0238)      (0.00538)       (0.00566)       (0.00582)       (0.00526)
Personality Index          -0.0343**     -0.0466***      -0.0333***      -0.0140***      -0.0334***
                           (0.0137)      (0.00304)       (0.00323)       (0.00350)       (0.00358)
Academic Index             -0.0196       -0.0449***      -0.0166***      -0.0229***      0.00528*
                           (0.0171)      (0.00360)       (0.00376)       (0.00413)       (0.00304)
Index of Parental Care     0.154***      -0.00125        -0.00306        0.00701**       -0.0195***
                           (0.0131)      (0.00320)       (0.00314)       (0.00322)       (0.00307)
Wealth Index               0.0781***     -0.0609***      -0.0474***      -0.00213        0.0817***
                           (0.0179)      (0.00312)       (0.00331)       (0.00415)       (0.00394)
Constant                   9.490***      1.325***        1.157***        0.0956          0.656***
                           (0.311)       (0.0362)        (0.0487)        (0.0658)        (0.0608)

Observations               138,945       138,945         138,945         138,945         138,945
R-squared                  0.004         0.098           0.062           0.003           0.050

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Observations are clustered at the junior high school level.
Most of the time, the R² is very low, below 10%, and sometimes close to zero. Girls, stronger students, and students surrounded by better peers seem to use more efficient strategies. In particular, they are better at respecting the threshold order. However, the number of choices and their span are almost entirely unexplained by these characteristics.
One can also look at the determinants of being unmatched. Students with a lower score are less likely to be matched, which is unsurprising. This is also the case for girls, probably because they score relatively better in continuous assessment during the year than in the admission exam. This may lead them to be overoptimistic and to forget to add some safe choices. Students from good junior high schools are also more at risk, probably because their peers lead them to aim higher. Moreover, wealthier and more confident students are more likely to stay unmatched, probably because these factors lead them to overestimate their ability.
Table 2 - The determinants of being unmatched

                           (1)
VARIABLES                  Being Unmatched

COMIPENS Score             -0.0645***
                           (0.000579)
Mean Score Of Junior HS    0.0480***
                           (0.00206)
Girl                       0.129***
                           (0.0204)
Personality Index          0.183***
                           (0.0122)
Academic Index             0.243***
                           (0.0123)
Index of Parental Care     0.0552***
                           (0.0123)
Wealth Index               0.266***
                           (0.0117)
Constant                   -1.149***
                           (0.143)

Observations               138,945

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Observations are clustered at the junior high school level.
3.6 The consequence of the failure to strategize
So far, I have described various apparent deviations from rationality. However, one can wonder whether they have consequences for the quality of the match achieved by students, conditional on their test score. To explore this possibility, I regress the admission threshold, and the mean and the median of the student scores in the high school to which a student is assigned, on the various rationality measures computed above. I control for test scores to account for the fact that students with mediocre scores will have bad outcomes anyway. Failure to strategize has a significant and quite large impact on all the measures of the quality of the match. Girls are also more likely to get a good outcome conditional on their test score, which is consistent with their superior measured rationality.
Table 3 - Irrationality measures and quality of the match

                               (1)             (2)             (3)
                               High school     Mean high       Median high
VARIABLES                      threshold       school score    school score

Number of Choice               0.297***        0.173***        0.159***
                               (0.0147)        (0.00788)       (0.00744)
Share rank deviation           -2.169***       -1.645***       -1.910***
                               (0.0725)        (0.0386)        (0.0359)
Choices’ span                  3.172***        0.879***        0.402***
                               (0.0796)        (0.0396)        (0.0364)
Share Overoptimistic Choices   -4.676***       -1.977***       -1.376***
                               (0.0982)        (0.0494)        (0.0438)
COMIPENS Score                 1.046***        0.681***        0.666***
                               (0.00570)       (0.00374)       (0.00349)
Mean Score Of Junior HS        -0.167***       -0.0122         0.0281***
                               (0.0151)        (0.00890)       (0.00801)
Girl                           2.189***        0.836***        0.688***
                               (0.101)         (0.0529)        (0.0487)
Wealth Index                   -0.0859         0.200***        0.333***
                               (0.0564)        (0.0304)        (0.0282)
Constant                       -18.57***       18.70***        16.27***
                               (0.955)         (0.566)         (0.511)

Observations                   225,548         225,548         225,548
R-squared                      0.554           0.644           0.667

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Observations are clustered at the junior high school level.
These results reinforce the interpretation of the various rationality measures presented above as indeed representing departures from the optimal strategy, and not the correlates of original and idiosyncratic preferences. When students follow strategies that are difficult to justify, they end up with a worse match conditional on their academic ability.
4 Estimating students’ preferences
The aim of this part is to discuss the identification and estimation of students’ preferences by exploiting the information contained in their reports. As discussed in the introduction, this is not an easy task since the reports only convey partial information about students’ preferences, due to the truncation of the preference list and the unwillingness of many students to fill their list completely. As a result, it is likely that many students do not report schools they like very much because they do not expect to obtain a score high enough in the placement exam to be admitted. As discussed in Burgess et al. (2014) and Fack et al. (2015), among others, this is likely to lead to an underestimation of the quality of the best schools.
I follow the general philosophy of partial identification laid out, for instance, in Tamer (2010). I start with strong assumptions that deliver point identification and allow the use of standard, easy-to-use econometric methods. I then progressively relax these assumptions. This yields more credible estimates, but at the cost of lower precision, and sometimes of point identification. However, even partial identification can provide informative bounds on the parameters. It is also a way to assess the sensitivity of the estimates to various assumptions.
4.1 Empirical model
I start with equation (1) in part 2, which specifies the value of each alternative for a given student. Contrary to most of the literature on demand estimation,12 I do not include fixed effects for each alternative which are later decomposed in a second stage. Indeed, given the large number of programs, this is computationally difficult. Moreover, it is impossible to estimate such a model using partial inference when there are more than a dozen parameters. As a result, for the sake of comparability across specifications, I decompose the value of each alternative for a given student into three parts. $W_s \beta_W$ is the value coming from the characteristics of the alternative, such as the selectivity of the school, its belonging to an elite subsystem, or the quality of its student body. $X_{is} \beta_X$ is the value coming from characteristics belonging both to the school and to the student, such as the distance to school, or interaction terms. Last, $\varepsilon_{is}$ is a random preference shock of the student for this alternative.

$$u_{is} = U(W_s, X_{is}, \varepsilon_{is}) = W_s \beta_W + X_{is} \beta_X + \varepsilon_{is} \tag{7}$$

12 In particular Berry (1994) and Berry et al. (1995). Variations of their estimation strategies have been used in the context of school choice, in particular in papers by Akyol and Krishna (2014) and Epple et al. (2014).
I assume that $\varepsilon_{is}$ follows a type I extreme value (Gumbel) distribution, and that $\varepsilon_{is}$ is orthogonal to $W_s$ and $X_{is}$. I denote by $Z_i$ the matrix collecting all the vectors $W_s$, $X_{is}$ for each student, and by $\beta_Z$ the parameters $\beta_W$ and $\beta_X$. I choose a simple specification to ease computation. In particular, I do not use a random coefficient model, to avoid dealing with large integrals. The presence of interaction terms is sufficient to provide realistic heterogeneity in taste across students.
4.2 Assumption: Perfect rationality under incomplete information
A natural starting point would be to assume perfect rationality and rational expectations, and to estimate the model under these assumptions. This would provide a benchmark. A possible strategy would be to solve the problem of each individual student given some beliefs and preferences, and then compute the resulting equilibrium thresholds. By iterating the process, one would be able to find a rational expectations equilibrium in which beliefs and strategies are consistent. The resulting thresholds and rank-ordered lists could then be used to match the equivalent moments in the data.
However, as discussed in the first part, it is extremely difficult to compute the optimal strategy of a given student, even when thresholds are given, since it implies computing and comparing the payoffs of an exponentially large number of strategies. Since there are 600 alternatives, this is impossible in practice.
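To give an order of magnitude for this combinatorial problem, the sketch below counts the ordered lists a student could submit. The 600 alternatives come from the text; the maximum list length of 20 is an assumption made for illustration and is not stated in this section.

```python
from math import perm

N_PROGRAMS = 600  # number of alternatives quoted in the text
MAX_LIST = 20     # assumed maximum list length, for illustration only

# Number of ordered lists of length 1 to MAX_LIST without repetition.
n_lists = sum(perm(N_PROGRAMS, k) for k in range(1, MAX_LIST + 1))
print(f"about 10^{len(str(n_lists)) - 1} candidate lists")
```

Under these assumptions the count is of the order of 10^55, which rules out exhaustive comparison of expected payoffs.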
Moreover, it is very unlikely that students follow such a complicated strategy. They are much more likely to rely on simpler heuristics, which in fact could give very good results. It is well known in the psychological literature that individuals are unable to choose correctly among many alternatives.13
Another possibility is to specify a boundedly rational model of choice among many alternatives, and then apply the estimation strategy outlined above. However, to my knowledge, there is no such model in the literature. Moreover, modeling bounded rationality often implies making many arbitrary assumptions about the way individuals reason, which are no more credible than a purely rational model.
13 This is commonly referred to as “decision fatigue”, namely the inability to make correct decisions once many decisions have already been considered; see Baumeister et al. for this concept. Barry Schwartz argues in a famous book that exposure to too large a choice set reduces welfare (Schwartz (2005)).
Due to these difficulties, I use a less structural approach in the paper, starting from more
parsimonious assumptions to propose several estimation strategies.
4.3 Assumption: truthful ranking of preferred schools
I first assume that students truthfully report their preferences, that is, they do not take into account that they are likely to be rejected by some schools because their score is too low. Under this assumption, preferences can be directly inferred from the reports using a rank-ordered logit model.
• Assumption A1: The students play a truthful strategy.
Under assumption A1, a given list can be observed only if its first choice is the most preferred
alternative, the second choice is the second most preferred alternative, and so on. This gives
the following likelihood for a given list submitted by an individual:
$$\Pr\left(L_i = (c_i^1, c_i^2, \ldots, c_i^k)\right) = \Pr\left(u_{i,c_i^1} > u_{i,c_i^2} > \ldots > u_{i,c_i^k} > u_{i,s'} \;\; \forall s' \in S \setminus L_i \mid Z_i, \beta_Z\right) \tag{8}$$

$$= \prod_{s \in L_i} \left( \frac{\exp(U_{i,s})}{\sum_{s' \not> s} \exp(U_{i,s'})} \right) \tag{9}$$

Here $s' \not> s$ denotes all the schools which are not ranked above $s$ in the list $L_i$, that is, $s$ itself, the schools ranked below it, and the schools left out of the list. This likelihood function can be described as a product of multinomial logit models, where each model gives the probability that one school is preferred to all the other alternatives which are ranked below it. Note that the set of schools $s'$ changes for each $s$ in $L_i$. The likelihood of observing a given sample is obtained by multiplying the individual probabilities of observing each list.
Taking logs, this function can be written as:

$$LL(\beta_Z, Z_i) = \sum_{i=1}^{I} \sum_{s \in L_i} U_{i,s} - \sum_{i=1}^{I} \sum_{s \in L_i} \ln \sum_{s' \not> s} \exp(U_{i,s'}) \tag{10}$$

By maximizing this log-likelihood function, one obtains point estimates of the parameters.
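The rank-ordered logit log-likelihood can be evaluated with a few lines of code. The sketch below is illustrative only: the function name and data layout are assumptions, the utilities passed in are taken to be the deterministic part of the utility of each school for each student, and no numerical safeguard (such as a log-sum-exp shift) or optimizer is included.

```python
import math


def rank_ordered_loglik(lists, utility):
    """Rank-ordered (exploded) logit log-likelihood.

    lists:   one ranked list of school ids per student, most preferred first
    utility: utility[i][s] = deterministic utility of school s for student i,
             defined for every school in student i's choice set
    """
    total = 0.0
    for i, ranked in enumerate(lists):
        remaining = set(utility[i])  # schools not ranked above the current one
        for s in ranked:
            log_denom = math.log(sum(math.exp(utility[i][r]) for r in remaining))
            total += utility[i][s] - log_denom
            remaining.remove(s)  # s is ranked above every later school
    return total


# One hypothetical student, three equally valued schools, two of them listed.
ll = rank_ordered_loglik([["A", "B"]], [{"A": 0.0, "B": 0.0, "C": 0.0}])
print(ll, math.log(1 / 6))
```

In the toy case the list (A, B) has probability 1/3 × 1/2 = 1/6, which the function reproduces. Restricting the keys of `utility[i]` to a personalized choice set changes the denominators without any other modification.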
These estimates are however likely to be severely biased due to misreporting. In particular, the most selective schools are unlikely to be listed by the many students who expect a low to average score at the placement exam. This leads to underestimating their popularity, and the value given to selectivity by the weakest students.
This misreporting problem plagues all the literature trying to estimate the heterogeneity of preferences among students. In particular, it may well be that the preference for low-score schools found among the weakest students in part of this literature14 is in fact a statistical artifact due to imperfect reporting.
4.4 Assumption: truthful ranking of preferred schools among an expected feasible set
Several strategies have been proposed to solve the misreporting problem. Most of them rely on a personalized choice set which reflects the fact that individuals have different likelihoods of admission, and therefore of application, regardless of their tastes.
The main difficulty comes from the definition of the choice set. Indeed, in this context, admission scores are determined after choices have been made. As a result, students make choices based on expectations of their score and of the admission thresholds. While realized scores and thresholds are observable ex-post, expectations cannot be observed.
Failing to model expectations adequately may lead to a bias, because it could leave outside of the personalized choice set some alternatives which were in fact seen as feasible but not attractive. In this case, it would lead to an overestimation of the quality of these alternatives.
14 For instance, in an influential paper, Hastings et al. (2008) find that students from poor backgrounds and belonging to minorities are less likely to apply to well-off and academically strong schools. However, they do not control for the fact that the application list is severely truncated (no more than three choices are allowed), which casts doubt on their estimates.
I make three assumptions about expectations:
• A21: Students do not underestimate their test score. That is, they do not refrain from applying to a preferred school because they believe it is infeasible when it later turns out to be feasible.
• A22: When students apply to a school outside their ex-post feasible set, the threshold of this school corresponds to the maximum score they expect. That is, they do not expect a score higher than the threshold of the most selective school they apply to.
• A23: Students rank schools truthfully within their expected feasible set.
Assumptions A21 and A22 imply the following expected test score $\hat{g}_i$:
\[ \hat{g}_i = \max \{ \text{actual score},\ \text{threshold of the best ranked alternative in the list } L_i \} \]
I define the set of all the schools whose threshold does not exceed this expected score as $S_i$. This constitutes the expected feasible set of schools for each student (see footnote 15).
\[ s \in S_i \text{ if and only if } t_s \le \hat{g}_i \]
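To make the construction of the expected feasible set concrete, the following minimal Python sketch computes the expected score and the set $S_i$ for one student. It assumes that the "best ranked alternative" is the listed school with the highest threshold (the reading suggested by A22). The function names, school labels and thresholds are hypothetical and serve only as an illustration.

```python
def expected_score(actual_score, listed_thresholds):
    """Expected score: the larger of the realized score and the highest
    threshold among the schools the student listed (assumed reading of A22)."""
    if not listed_thresholds:
        return actual_score
    return max(actual_score, max(listed_thresholds))


def expected_feasible_set(actual_score, ranked_list, thresholds):
    """Return the schools s whose threshold t_s does not exceed the expected score.

    thresholds  : dict mapping each school to its admission threshold
    ranked_list : schools listed by the student
    """
    g_hat = expected_score(actual_score, [thresholds[s] for s in ranked_list])
    return {s for s, t in thresholds.items() if t <= g_hat}


# Made-up thresholds for four schools.
thresholds = {"A": 95, "B": 80, "C": 60, "D": 40}
S_i = expected_feasible_set(70, ["B", "C"], thresholds)
```

In this made-up example the student scored 70 but listed a school with threshold 80, so schools B, C and D enter the expected feasible set while A does not.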
I can re-estimate the same model as before, changing only the set of schools among which each student chooses. That is:
\[ \Pr(L_i = c_{1i}, c_{2i}, \dots, c_{ki}) = \Pr(u_{i,c_{1i}} > u_{i,c_{2i}} > \dots > u_{i,c_{ki}} > u_{i,s'},\ \forall s' \in S_i \setminus L_i \mid Z_i, \beta_Z) \tag{11} \]
I only replace the set of all schools $S$ by the personalized set $S_i$. Similarly, the log-likelihood function can be written as follows:
\[ LL(\beta_Z, Z_i) = \sum_{i=1}^{I} \sum_{s \in L_i} U_{i,s} - \sum_{i=1}^{I} \sum_{s \in L_i} \ln \sum_{\substack{s' \not> s, \\ s' \in S_i}} \exp(U_{i,s'}) \tag{12} \]
15
Several papers have tackled the problem of estimating discrete choice models when some alternatives are unfeasible for a fraction of the students. In particular, Arcidiacono (2004), Arcidiacono (2005) and Long (2004), and more recently Jacob et al. (2013), confronted this problem when estimating preferences for college attributes. They weight each alternative by the observed probability of application given by a probit model. Conlon and Mortimer (2013) show that this method achieves consistency. My framework departs from theirs because I do not rely on objective probabilities of application, but rather choose to model expectations. Moreover, I deal with rank-ordered lists, and I define the feasible set as all the schools where the student believes that he has a positive probability of admission.
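As an illustration of the log-likelihood in equation (12), the sketch below computes the contribution of a single student. It assumes the usual rank-ordered logit form, in which the exponential of the deterministic utility enters the inner sum (as in equation (14)), and it assumes that every listed school belongs to $S_i$. The function name and the utility values are hypothetical.

```python
import math


def rank_ordered_loglik(U, ranked_list, feasible_set):
    """Log-likelihood contribution of one student under a rank-ordered logit.

    U            : dict school -> deterministic utility U_{i,s} (made-up values)
    ranked_list  : schools listed by the student, most preferred first
    feasible_set : personalized choice set S_i (must contain the listed schools)
    """
    remaining = set(feasible_set)
    if not set(ranked_list) <= remaining:
        raise ValueError("every listed school must belong to the feasible set")
    loglik = 0.0
    for s in ranked_list:
        # The inner sum runs over schools of S_i not ranked above s (s included).
        log_denominator = math.log(sum(math.exp(U[r]) for r in remaining))
        loglik += U[s] - log_denominator
        remaining.remove(s)
    return loglik


U_i = {"B": 1.0, "C": 0.5, "D": -0.2}
ll_i = rank_ordered_loglik(U_i, ["B", "C"], {"B", "C", "D"})
```

Summing such contributions over students gives the sample log-likelihood. Restricting the denominator to $S_i$ rather than to all schools is the only change with respect to the model estimated under truthful reporting.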
How credible are assumptions A21 to A23? Other work (Bobba and Frisancho (2014)) shows that students are in general overoptimistic about their test scores. Moreover, given that their test score is random, they are likely to select schools whose threshold is higher than their mean expected score. Indeed, because the test is a multiple-choice exam, any student can get a lucky draw and obtain a good score despite a low baseline level. As a result, it is very likely that students apply to schools whose threshold is much higher than the score they could reasonably expect to reach. This is confirmed in the data by the fact that many students apply to schools which are not feasible ex post (as shown by figure 2 in part 3).
Conversely, students could hold wrong expectations and overestimate the difficulty of entering some schools, leaving them out of their expected feasible set even though they eventually turn out to be feasible. In this case, including these schools in the set will lead to an underestimation of their popularity. However, thresholds are quite stable over time, and it is unlikely that students make very large mistakes when forming expectations about them.
Assumption A22 is more disputable. It is indeed possible that students do not like schools which are better ranked than their most preferred alternative, and still think that these schools are feasible. In fact, assumption A22 can only give a lower bound on the expected test score. Another problem is that the thresholds themselves are random and unknown to the students when they make their choice. These thresholds are thus a noisy measure of students' expectations. In the end, assumption A22 is unlikely to hold perfectly, and defining expected feasible choice sets based on this assumption will only give an approximation of the real sets.
Last, assumption A23 is also unlikely to hold perfectly. As shown in example 1 in part 2, it may be rational for some students to omit some well-liked schools in their feasible set if the constraint on the size of their list binds. In this case, students face a trade-off between the likelihood of admission and the strength of their preference, and have an incentive to spread their choices among schools with different levels of selectivity. While this is theoretically possible, it is noticeable that few students use their entire choice list, and that many of them submit rather short lists (see footnote 16). As a result, within their feasible set, they are very likely to truthfully report their preferences among alternatives.
16
However, submitting an incomplete list does not necessarily mean that one does not leave desirable schools outside of the list. There may be psychological costs associated with considering and ranking many alternatives, which may lead students to rank only a short list to alleviate their cognitive burden.
To conclude, it is likely that assumptions A21 to A23 are violated in practice. However, the magnitude of the violations is likely to be small.
4.5 Assumption: stability
Another possibility explored in the literature (Fack et al. (2015)) is to assume stability, that is, to assume that each student is matched to his most preferred alternative among those where his priority is high enough for admission.
• A3: The observed matching is stable
In our context, it means that students are assigned their favorite choice among all the schools with a threshold equal to or below their test score. Given our data, stability can be expressed as the joint probability of observing the given thresholds (given the characteristics of students) and of each student having his assigned alternative as his first choice among his ex-post feasible schools. The feasible set is thus defined in the following way:
\[ s \in S_i \text{ if and only if } t_s \le g_i \]
where the expected grade is replaced by the grade observed ex post. It is possible to build a likelihood function based on the following probability:
\[ \Pr(\text{stable matching}) = \Pr(\text{threshold} = t \mid Z_i, \theta) \times \prod_i \Pr(u_i = \max_{s \in S_i} u_{i,s} \mid Z_i, \theta, t) \tag{13} \]
Following the literature, I assume here that preferences are independent of the thresholds (which are realized ex post based on the existing preferences). Note again that the max is taken over a school set specific to each individual, which includes all the schools with a threshold equal to or below the test score of the student.
However, this approach raises several problems. First, it is difficult to characterize the likelihood of observing a given threshold. In the case of the deferred acceptance mechanism, it is possible to use a result by Azevedo and Leshno (2014) to obtain a distribution for the thresholds, which is asymptotically unique (namely, when the number of students grows large compared to the number of schools, the matching becomes unique). In our case, there is no such analytical result.
Moreover, the assumption of stability is dubious in the case of serial dictatorship with a truncated preference list. As seen in part 2, one can construct examples of rational strategies which lead to unstable matchings, and there is no reason for these instabilities to disappear asymptotically. However, the number of blocking pairs is likely to be small.
Fack et al. (2015) perform simulations which show that, provided the preferences for schools do not depend on the thresholds and the matching is unique (so that its probability equals 1), the stability condition can be expressed by the second part of equation 13 only. In this case, we just need to estimate a multinomial logit with a personalized choice set equal to $S_i$ (see footnote 17). More formally, the estimation is based on the following individual probabilities, which can easily be used to build a likelihood function:
\[ \Pr(u_i = \max_{s' \in S_i} u_{i,s'} \mid Z_i, \beta_Z, t) = \frac{\exp(U_{i,s})}{\sum_{s' \in S_i} \exp(U_{i,s'})} \tag{14} \]
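The probability in equation (14) can be sketched in a few lines of Python. The example builds the ex-post feasible set from the realized score and returns the logit probability that the assigned school is the preferred one within that set. The function name, thresholds and utilities are hypothetical, and the sketch is not the estimation code used in the paper.

```python
import math


def assignment_probability(U, assigned, score, thresholds):
    """Logit probability that the assigned school is the favorite one within
    the ex-post feasible set {s : t_s <= g_i}.

    U          : dict school -> deterministic utility U_{i,s} (made-up values)
    assigned   : school the student is matched to
    score      : realized test score g_i
    thresholds : dict school -> admission threshold t_s
    """
    feasible = [s for s, t in thresholds.items() if t <= score]
    if assigned not in feasible:
        raise ValueError("the assigned school must be feasible ex post")
    denominator = sum(math.exp(U[s]) for s in feasible)
    return math.exp(U[assigned]) / denominator


thresholds = {"A": 95, "B": 80, "C": 60, "D": 40}
U_i = {"A": 2.0, "B": 1.0, "C": 0.5, "D": -0.2}
p_C = assignment_probability(U_i, "C", 70, thresholds)
```

With a score of 70 only schools C and D are feasible, so the attractive but unreachable schools A and B do not enter the denominator.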
17
It is possible to re-express this logit model with moment conditions, and to add moment inequalities based on the rest of the rank-ordered list. These moment conditions can then be combined to estimate the parameters, using the technique of Andrews and Shi (2013). However, given that this adds little information and is computationally difficult to implement, I did not use it.
4.6 Assumption: truthful partial order
We have seen above that the previously considered identification strategies rely on difficult-to-justify assumptions about the strategies followed by the agents and about their beliefs. While these assumptions make point identification possible, they weaken the credibility of the estimates.
While assumption A1 is difficult to defend, assumptions A2 to A4 are more reasonable. Although they are likely to be violated for some individuals, the number and size of the violations are likely to be small, and so is the resulting bias. However, this is uncertain, and we do not know the magnitude of the bias.
The aim of this section is therefore to propose a partial identification of the parameters based on a very weak assumption, and to propose robust confidence sets for the parameters. I make only the following assumption:
• A4: Students submit a truthful partial order to the social planner. That is, the order of the alternatives that they report respects their true preference order among them.
This assumption is very weak. Indeed, it is always a (weakly) dominated strategy to play otherwise, since ranking an alternative above a more preferred one creates the risk of being assigned the former rather than the latter (see proposition 4). Moreover, there is no plausible psychological reason or cognitive bias which could lead students to play such a strategy.
Under this assumption, it is possible to build inequalities bounding the true parameters:
\[ \Pr(s_1 > s_2 \mid Z_i, \beta_Z) = \Pr(u_{i,s_1} > u_{i,s_2} \ \&\ s_1, s_2 \in L_i \mid Z_i, \beta_Z) \tag{15} \]
The above equation states that the probability of observing school 1 ranked above school 2 in a rank-ordered list is equal to the joint probability that both schools belong to the list and that school 1 is preferred to school 2. This implies that:
\[ \Pr(s_1 > s_2 \mid Z_i, \beta_Z) \le \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{16} \]
There is equality if and only if the two schools are always ranked in the list.
In a similar fashion, one can obtain another inequality. Exchanging the roles of the two schools in equation 15, we have
\[ \Pr(s_2 > s_1 \mid Z_i, \beta_Z) = \Pr(u_{i,s_2} > u_{i,s_1} \ \&\ s_1, s_2 \in L_i \mid Z_i, \beta_Z) \tag{17} \]
Since the event $u_{i,s_2} > u_{i,s_1}$ is the complement of the event $u_{i,s_1} > u_{i,s_2}$ (ties have probability zero when utilities are continuously distributed), this gives:
\[ \Pr(s_2 > s_1 \mid Z_i, \beta_Z) \le \Pr(u_{i,s_2} > u_{i,s_1} \mid Z_i, \beta_Z) = 1 - \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{18} \]
which in turn implies that:
\[ 1 - \Pr(s_2 > s_1 \mid Z_i, \beta_Z) \ge \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{19} \]
For a given individual and a given pair of alternatives, these two inequalities bound the probability of preferring one alternative to the other.
Taking expectations over all individuals for these two inequalities yields conditional moment inequalities, which are functions of the list $L_i$, conditional on $Z_i$ and $\beta_Z$:
\[ E[m_1(L_i \mid Z_i, \beta_Z)] = E[\Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) - 1(s_1 > s_2) \mid Z_i, \beta_Z] \ge 0 \tag{20} \]
\[ E[m_2(L_i \mid Z_i, \beta_Z)] = E[1 - 1(s_2 > s_1) - \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \mid Z_i, \beta_Z] \ge 0 \tag{21} \]
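The sample analogues of the two moment functions can be sketched as follows for one school pair. The sketch assumes a logit specification, so that the probability of preferring school 1 to school 2 is the logistic function of the difference in deterministic utilities. The helper names and the toy lists are hypothetical.

```python
import numpy as np


def ranked_above(ranked_list, a, b):
    """True if both schools are listed and a appears before b."""
    return a in ranked_list and b in ranked_list and ranked_list.index(a) < ranked_list.index(b)


def pair_moments(delta_U, lists, s1, s2):
    """Individual moment functions m1 and m2 for the pair (s1, s2).

    delta_U : array of U_{i,s1} - U_{i,s2}, one entry per student
    lists   : submitted rank-ordered lists, one per student
    """
    # Assumed logit form: Pr(u_{i,s1} > u_{i,s2}) = 1 / (1 + exp(-(U1 - U2))).
    p = 1.0 / (1.0 + np.exp(-np.asarray(delta_U, dtype=float)))
    s1_above_s2 = np.array([ranked_above(L, s1, s2) for L in lists], dtype=float)
    s2_above_s1 = np.array([ranked_above(L, s2, s1) for L in lists], dtype=float)
    m1 = p - s1_above_s2          # should be non-negative in expectation
    m2 = 1.0 - s2_above_s1 - p    # should be non-negative in expectation
    return m1, m2


lists = [["A", "B"], ["B", "A"], ["A"], ["C"]]
m1, m2 = pair_moments(np.zeros(4), lists, "A", "B")
```

Students who do not list both schools contribute to neither indicator, which is why the two inequalities are slack whenever many lists omit one of the schools.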
It is possible to build these inequalities for each school pair in the dataset. The approach can be extended to larger groups of schools, but it is less likely that many students include a given group of schools in a specific order, so I stick to school pairs. These moment inequalities can be used to estimate a confidence set for the parameters $\beta_Z$. Indeed, they are not sufficient to achieve point identification. The bounds become sharper when the alternatives are reported in many lists, and the model is point identified if all the alternatives are reported in all the lists.
Estimating partially identified models is not easy. I start from the approach of Chernozhukov et al. (2007), who propose a criterion function approach where a function $S$ takes the value 0 when all the moment conditions are satisfied, and becomes increasingly large as some of these conditions are violated. Among several possible choices, I choose the modified method of moments function, which is simple to compute and which has been praised for its good statistical properties (Chernozhukov et al. (2007), Andrews and Shi (2013)).
\[ S(m, \Sigma) = \sum_{j=1}^{p} \left[ \frac{m_j}{\sigma_j} \right]_{-}^{2} \tag{22} \]
In the above equation, $m_j$ refers to a given moment inequality, whose variance is $\sigma_j^2$, while the variance-covariance matrix of all the moment inequalities is given by $\Sigma$. The operator $[\cdot]_{-}^{2}$ takes the value 0 if the moment inequality is satisfied, and returns the square of its argument otherwise. Larger violations are thus increasingly penalized. Other criterion functions are proposed in Chernozhukov et al. (2007).
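The criterion function in equation (22) is straightforward to compute. The sketch below is a minimal illustration that takes the sample moments and their standard deviations as given. The function name and the numerical values are hypothetical.

```python
import numpy as np


def mmm_criterion(moments, std_devs):
    """Modified method of moments criterion: sum of squared negative parts
    of the standardized moments.

    moments  : sample moments m_j (each should be non-negative under the model)
    std_devs : standard deviations sigma_j of the moments
    """
    standardized = np.asarray(moments, dtype=float) / np.asarray(std_devs, dtype=float)
    negative_part = np.minimum(standardized, 0.0)  # 0 when the inequality holds
    return float(np.sum(negative_part ** 2))


# The first inequality is satisfied, the second is violated by one standard deviation.
value = mmm_criterion([0.25, -0.5], [1.0, 0.5])
```

Only violated inequalities contribute, and the contribution grows with the square of the standardized violation.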
This criterion function can then be used to create a test statistic $T_n(\beta_Z)$ whose null distribution can be derived. Several possibilities have been proposed in the literature. I use the generalized moment selection technique of Andrews and Soares (2010), which has been shown to be superior to the plug-in asymptotic and the subsampling approaches (see footnote 18). It is possible to test any value in the parameter space, and to build a confidence set such that:
\[ CS_n = \{ \beta_Z \in B_Z : T_n(\beta_Z) \le c_{n,1-\alpha}(\beta_Z) \} \tag{23} \]
A remaining problem is that I deal with conditional moment inequalities. I follow Andrews and Shi (2013) and transform all the conditional moments into unconditional ones using indicator functions which divide the data space into hypercubes. These unconditional moments can then be aggregated into a test. Following the advice of Andrews and Shi, I use a Cramér-von Mises statistic (an alternative would be a Kolmogorov-Smirnov statistic).
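The transformation of conditional moments into unconditional ones can be illustrated with a simple grid of hypercubes. The sketch below assumes that the conditioning variables have already been rescaled to the unit interval and uses a single grid with 2r cells per dimension, whereas the procedure of Andrews and Shi combines several such grids. Function names and data are hypothetical.

```python
import itertools

import numpy as np


def hypercube_indicators(Z, r):
    """Indicator functions of the cells of a regular grid on [0, 1]^d.

    Z : array of shape (n, d) with values in [0, 1] (assumed already rescaled)
    r : grid parameter, giving 2r cells per dimension
    Returns an (n, (2r)^d) array of 0/1 indicators.
    """
    Z = np.asarray(Z, dtype=float)
    cells_per_dim = 2 * r
    # Index of the cell containing each observation; the value 1.0 goes in the last cell.
    cell = np.minimum((Z * cells_per_dim).astype(int), cells_per_dim - 1)
    columns = []
    for corner in itertools.product(range(cells_per_dim), repeat=Z.shape[1]):
        columns.append(np.all(cell == np.array(corner), axis=1))
    return np.column_stack(columns).astype(float)


def unconditional_moments(m, G):
    """Sample means of m_i * g(Z_i), one per hypercube indicator g."""
    return (np.asarray(m, dtype=float)[:, None] * G).mean(axis=0)


Z = np.array([[0.1], [0.6], [0.9], [1.0]])
G = hypercube_indicators(Z, r=1)
m_bar = unconditional_moments([1.0, -1.0, 1.0, 1.0], G)
```

Each conditional inequality thus yields one unconditional inequality per cell, which explains why the number of moments grows quickly with the number of conditioning variables.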
Unfortunately, the only way to compute the confidence set is to draw parameter vectors at random and to test them one by one. This approach works asymptotically, if one can draw an infinite number of parameter vectors. In this context, the Andrews-Shi approach is not very convenient, because it requires computing the variance-covariance matrix of all the moment inequalities. Since there are many of them, this is burdensome. To reduce the computational burden, I randomly select a subset of school pairs and students. I also drop school pairs which appear in a very small number of lists (in a similar fashion as in Ciliberto and Tamer and Hwang (2014)). The computational burden also limits the number of variables I can include in the model. As I explain in the next part, I use standard data reduction techniques to aggregate data on students into more parsimonious indicators.
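The random search over parameter vectors can be sketched as below. The test statistic and the critical value are passed as functions. In the example they are toy stand-ins, not the generalized moment selection procedure, and all names and numbers are hypothetical.

```python
import numpy as np


def confidence_set(candidates, test_statistic, critical_value):
    """Keep the candidate vectors b such that T_n(b) <= c_{n,1-alpha}(b)."""
    return np.array([b for b in candidates if test_statistic(b) <= critical_value(b)])


rng = np.random.default_rng(0)
# Random draws from a box standing in for the parameter space B_Z.
candidates = rng.uniform(-2.0, 2.0, size=(2000, 2))


def toy_statistic(b):
    # Toy "moments": they hold when 0 <= b[0] <= 1 and b[1] <= b[0].
    moments = np.array([b[0], 1.0 - b[0], b[0] - b[1]])
    return float(np.sum(np.minimum(moments, 0.0) ** 2))


def toy_critical_value(b):
    # Constant placeholder; a real critical value depends on the data and on b.
    return 0.01


accepted = confidence_set(candidates, toy_statistic, toy_critical_value)
lower, upper = accepted.min(axis=0), accepted.max(axis=0)
```

The coordinate-wise minimum and maximum of the accepted draws give bounds of the kind reported for each parameter. The cost of the search grows quickly with the dimension of the parameter vector, since the box must be covered densely enough.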
Another problem is that the tests proposed in the literature are quite conservative. This makes the confidence set quite large, and it sometimes includes 0. This is the cost of making only parsimonious and robust identifying assumptions. As a result, the results of the partial identification approach should mostly be interpreted as a robustness check. They make it possible to assess the stronger assumptions which point identify the model. If these assumptions lead to parameter estimates which lie outside the bounds given by the partial identification estimator, then they are likely to lead to a large bias, and the estimates obtained from these assumptions should be treated with caution. Conversely, if the estimated parameters lie within the bounds, and if the bounds are informative enough, the estimated parameters have some credibility and are likely to suffer from only a small bias.
18
The advantage of the GMS procedure lies in the fact that it does not use all the moments to build the null distribution, but only those which are likely to be binding for the tested parameter vector. As a result, it produces a smaller critical value and rejects more often, which gives a smaller identified set.
5 Data description and results
The aim of this section is to describe the data and the choice of variables for the model, and to present and comment on the results.
5.1 Data
Using partial identification creates computational problems. Indeed, the larger the number of parameters, the larger the number of parameter vectors that must be tested to compute the confidence set. As a result, I have chosen to reduce as much as possible the number of variables included in the model.
Since I do not use school fixed effects (again, for computational reasons), I have to include school-level variables likely to affect the preferences of individuals. I focus mostly on measures of quality, such as the average score of the school's students at the placement exam, the admission threshold, and membership in one of the two elite subsystems among Mexican public schools (IPN and UNAM). I also include a variable distinguishing vocational from general education.
I then include interaction variables between school and student characteristics. These variables capture the heterogeneity of preferences between different types of students. I include an interaction term between academic results and academic quality, and interaction terms between a measure of social background and academic quality as well as the indicator for vocational education.
The interaction terms between social background and school quality are a good way to assess whether poor students tend to self-select into academic tracks and schools which are less academically demanding and prestigious. Controlling for an interaction term between own results and school quality (measured by selectivity and average student quality) makes it possible to estimate the effect of the preference for quality when choosing a school, net of the expectation of being admitted.
All the variables are built from the survey which is administered to students before they choose the schools they want to apply to. The survey includes many questions about self-assessment, family consumption, and academic results. I aggregate these variables into one indicator of social status and one indicator of academic achievement using principal component analysis. I discuss the exact methodology (and more specifically which variables are aggregated) in appendix A. Both indicators are standardized.
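As an illustration of this data reduction step, the sketch below extracts a standardized first principal component from a small matrix of survey answers. It assumes that the indicator is the first component of the standardized variables, which may differ from the exact procedure of appendix A. The function name and the data are hypothetical.

```python
import numpy as np


def first_component_index(X):
    """Standardized score on the first principal component of the columns of X."""
    X = np.asarray(X, dtype=float)
    standardized = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(standardized, full_matrices=False)
    scores = standardized @ Vt[0]
    return (scores - scores.mean()) / scores.std()


# Made-up answers of five students to three survey questions.
survey = np.array([
    [1.0, 2.0, 0.0],
    [2.0, 4.1, 1.0],
    [3.0, 5.9, 1.0],
    [4.0, 8.2, 2.0],
    [5.0, 9.8, 3.0],
])
index = first_component_index(survey)
```

The sign of a principal component is arbitrary, so in practice the index has to be oriented so that higher values correspond to higher social status or achievement.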
I present below some summary statistics for the schools.
Table 4: Summary statistics for schools
Variable                                       Mean       (Std. dev.)
Average size                                   1384       (1233)
Average admission threshold                    57.47      (24.89)
Average student quality (COMIPENS score)       72.48      (17.26)
Curriculum is vocational                       0.15       (0.36)
Belong to IPN                                  0.89       (0.28)
Belong to UNAM                                 0.144      (0.35)
Number of seats                                250,031
Number of programs                             612
I compute the value of each variable for each program, and then weight it by the number of seats in the program. The number of seats is computed from the actual number of students enrolled in the program after the assignment process.
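The seat weighting described in this note can be sketched as follows. The sketch assumes that the figures in parentheses are seat-weighted standard deviations, which the note does not state explicitly. The function name and the numbers are hypothetical.

```python
import numpy as np


def weighted_mean_sd(values, seats):
    """Seat-weighted mean and standard deviation of a program-level variable."""
    values = np.asarray(values, dtype=float)
    seats = np.asarray(seats, dtype=float)
    mean = np.average(values, weights=seats)
    variance = np.average((values - mean) ** 2, weights=seats)
    return mean, np.sqrt(variance)


# Two made-up programs: a general one with 300 seats and a vocational one with 100 seats.
share_vocational, sd_vocational = weighted_mean_sd([0.0, 1.0], [300, 100])
```

Weighting by seats makes the statistics describe the school faced by the average enrolled student rather than the average program.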
5.2 Results
I estimate the model using the different estimation strategies outlined in the previous
part.
Table 5: The determinants of the demand for schools
                                          (1)           (2)                    (3)           (4)
Variables                                 Truthful      Truthful reporting     Stability     Truthful
                                          reporting     within feasible set                  partial order
School characteristics
selectivity                               0.52          0.97                   0.108         [0.22 – 0.145]
                                          (0.02)        (0.11)                 (0.14)
average peer quality                      0.41          0.86                   0.81          [0.32 – 0.121]
                                          (0.05)        (0.10)                 (0.14)
Curriculum is vocational                  -2.12         -3.14                  -2.98         [-2.18 – -4.84]
                                          (0.45)        (0.56)                 (0.66)
Belong to elite subsystem                 3.13          4.35                   4.56          [3.01 – 6.23]
                                          (0.34)        (0.39)                 (0.40)
Distance                                  -0.32         -0.45                  -0.39         [-0.64 – -0.22]
                                          (0.02)        (0.023)                (0.03)
Interaction variables
selectivity ×                             -0.11         -0.23                  -0.26         [-0.12 – -0.43]
student test score                        (0.08)        (0.09)                 (0.10)
Curriculum is vocational ×                -0.45         -0.67                  -0.68         [-0.38 – -1.01]
student test score                        (0.18)        (0.23)                 (0.25)
selectivity ×                             -0.21         -0.56                  -0.68         [-1.07 – -0.28]
student social status                     (0.02)        (0.12)                 (0.15)
Curriculum is vocational ×                -2.13         -4.23                  -4.46         [-7.23 – -2.86]
student social status                     (0.42)        (0.54)                 (0.68)
This table presents the results of the estimation of the demand model using different identification strategies. In the last column, I only report the confidence interval in which the estimates lie with 95% probability.
The coefficients have the expected signs and are in line with the rest of the literature. Students value selectivity, peer quality, and membership in an elite subsystem. They prefer to avoid vocational education and schools far from home. This is consistent with the idea that students treat selectivity and the quality of the student body as signals of academic quality. However, as has been noted in the literature, this belief is not necessarily correct. For instance, Akyol and Krishna (2014) document that the schools with the largest value added at the national examination in Turkey are spread evenly across selectivity levels. In the case of Mexico, two studies using a regression discontinuity design have documented that the elite subsystem IPN, which specializes in offering a rigorous education in math and science, does increase its students' test scores at the end-of-high-school exam (Estrada and Gignoux (2014), de Janvry et al. (2015)). However, there is no guarantee that these results hold across the whole selectivity distribution.
As far as the interaction terms are concerned, there is an interesting pattern of heterogeneity among students. Not surprisingly, students with good academic results are more likely to value academic quality, probably because they feel confident in their ability to succeed in rigorous programs. They also want to avoid vocational programs. However, students with a low social status are less likely to place a positive value on academic quality. They also value vocational programs more. This is consistent with self-censorship, as well as with the wish to enter the labor market quickly, maybe because they are credit constrained.
Regarding the effect of the various identification strategies, there is a clear difference in magnitude between the first strategy and the others. The first strategy leads to underestimating the popularity of strong schools, particularly among the weakest students. The two strategies restricting the feasible school set give quite similar results. The strategy based on partial identification gives rather large confidence sets, probably because the test on which it is based is quite conservative. However, it rejects some of the estimates of the first strategy (truthful reporting), while it is consistent with all the estimates of the second and third identification strategies (based on feasible choice sets). As a result, partial identification clearly makes it possible to discriminate between identification strategies which are credible and those which are not.
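This comparison can be reproduced mechanically from Table 5. The sketch below checks, for each point-identified strategy, which coefficients fall outside the bounds of the last column. The values are transcribed from the table under my own reading of its row alignment, and the rows for selectivity and average peer quality are left out because their reported bounds are hard to read. The check ignores the sampling error of the point estimates.

```python
# Point estimates of columns (1) to (3) and bounds of column (4), as read from Table 5.
rows = {
    "Curriculum is vocational": ((-2.12, -3.14, -2.98), (-2.18, -4.84)),
    "Belong to elite subsystem": ((3.13, 4.35, 4.56), (3.01, 6.23)),
    "Distance": ((-0.32, -0.45, -0.39), (-0.64, -0.22)),
    "selectivity x student test score": ((-0.11, -0.23, -0.26), (-0.12, -0.43)),
    "vocational x student test score": ((-0.45, -0.67, -0.68), (-0.38, -1.01)),
    "selectivity x student social status": ((-0.21, -0.56, -0.68), (-1.07, -0.28)),
    "vocational x student social status": ((-2.13, -4.23, -4.46), (-7.23, -2.86)),
}


def inside(estimate, bounds):
    """True if the estimate lies between the two reported bounds (in either order)."""
    return min(bounds) <= estimate <= max(bounds)


# For each strategy (0, 1, 2), the list of coefficients lying outside the bounds.
outside = {
    k: [name for name, (estimates, bounds) in rows.items() if not inside(estimates[k], bounds)]
    for k in range(3)
}
```

On these rows, four estimates of the first strategy fall outside the bounds, while none of the estimates of the second and third strategies do.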
The results are interesting from a policy point of view. Students from a deprived background tend to give comparatively less value to academic quality, to an academically oriented curriculum, and to belonging to an elite school, regardless of their score. For the best of them, this is likely to be detrimental, since they miss the opportunity to be well prepared for college, and maybe to pursue their studies in higher education. It suggests that the low level of education in some developing countries such as Mexico may be due in part to low expectations and ambition among part of the students.
6 Conclusion
This paper estimates the preferences of students in Mexico. It shows that while all students tend to value selectivity and peer quality, students from a poorer background value these characteristics much less than wealthier students.
The paper also assesses the mechanism used in Mexico. It shows that this variant of serial dictatorship is not stable, and that it requires a high level of sophistication from the students, as well as the ability to form accurate beliefs. The paper also presents evidence showing that many students are not fully rational and choose dominated strategies. Weaker and less wealthy students are more likely to do so.
Taken together, these two findings lead to some pessimism concerning the benefits to be expected from centralized assignment mechanisms. These mechanisms are supposed to improve the matching between schools and students by simplifying the application process. However, in practice, the variants of the mechanism which are implemented in the real world often require students to devise sophisticated strategies and to form accurate beliefs. This comes mainly from the truncation of the choice list, and from the fact that many students do not fill in their entire preference list. Such truncation is made necessary by the large number of alternatives that students face.
In practice, it is likely that many students rely on simple heuristics to make their choice. However, it seems that these heuristics are far from perfect. As shown in the second part, many strategies used by the students are dominated. Moreover, the further their strategy is from a non-dominated strategy, the more likely students are to be mismatched or to be left unassigned at the end of the assignment procedure.
Even if students were able to use the right strategy, it is unlikely that centralized matching mechanisms would lead to greater access to good high schools or to higher education among students from a poor background. Indeed, these students are consistently less ambitious than their wealthier peers. Conditional on their academic ability, they are less likely to apply to the best high schools, even though they would be accepted. This is consistent with the presence of bounded rationality among them.
This finding is consistent with recent work on the role of limited information and biased expectations in the formation of human capital. Many students do not correctly assess the various educational alternatives they face. This generates a mismatch between the ability of some students and the quality and requirements of the program they attend.
This leaves some room for effective policy. Improving information, and maybe providing incentives such as merit scholarships, could raise the ambition of some able but badly informed students.
References
Atila Abdulkadiroğlu and Tayfun Sönmez. School choice: A mechanism design approach.
The American Economic Review, 93(3):729–747, 2003.
Atila Abdulkadiroğlu, Nikhil Agarwal, and Parag Pathak. The welfare effects of coordinated
assignment: Evidence from the New York City high school match. NBER Working Paper,
2015.
Nikhil Agarwal and Paulo Somaini. Demand analysis using strategic reports: An application
to a school choice mechanism. NBER Working Paper, 2014.
Saziye Akyol and Kala Krishna. Preferences, selection, and value added: A structural
approach. NBER Working Paper, 2014.
Donald Andrews and Xiaoxia Shi. Inference based on conditional moment inequalities.
Econometrica, 81(2):609–666, 2013.
Donald Andrews and Gustavo Soares. Inference for parameters defined by moment inequalities using generalized moment selection. Econometrica, 78(1):119–157, 2010.
Peter Arcidiacono. Ability sorting and the returns to college major. Journal of Econometrics,
121(1):343–375, 2004.
Peter Arcidiacono. Affirmative action in higher education: How do admission and financial
aid rules affect future earnings? Econometrica, 73(5):1477–1524, 2005.
Eduardo Azevedo and Jacob Leshno. A supply and demand framework for two-sided matching markets. SSRN Working Paper 2260567, 2014.
Michel Balinski and Tayfun Sönmez. A tale of two mechanisms: student placement. Journal
of Economic Theory, 84(1):73–94, 1999.
Steven Berry. Estimating discrete-choice models of product differentiation. The RAND
Journal of Economics, pages 242–262, 1994.
Steven Berry, James Levinsohn, and Ariel Pakes. Automobile prices in market equilibrium.
Econometrica: Journal of the Econometric Society, pages 841–890, 1995.
Eric Bettinger, Bridget Terry Long, Philip Oreopoulos, and Lisa Sanbonmatsu. The role of
application assistance and information in college decisions: Results from the H&R Block
FAFSA experiment. The Quarterly Journal of Economics, 127(3):1205–1242, 2012.
Matteo Bobba and Veronica Frisancho. Learning about oneself: The effects of signaling
academic ability on school choice. Unpublished manuscript, 2014.
Anna Bogomolnaia and Hervé Moulin. A new solution to the random assignment problem.
Journal of Economic Theory, 100(2):295–328, 2001.
Simon Burgess, Ellen Greaves, Anna Vignoles, and Deborah Wilson. What parents want:
school preferences and school choice. The Economic Journal, 2014.
Caterina Calsamiglia and Maia Güell. The illusion of school choice: Empirical evidence from
Barcelona. CEPR Discussion Paper No. DP10011, 2014.
Raymundo Campos-Vázquez. Why did wage inequality decrease in Mexico after NAFTA?
Economía Mexicana, 22(2):245–278, 2011.
José-Raimundo Carvalho, Thierry Magnac, and Qizhou Xiong. College choice allocation
mechanisms: Structural estimates and counterfactuals. IZA Discussion Paper, 2014.
Caterina Calsamiglia, Chao Fu, and Maia Güell. Structural estimation of a model of school
choices: the Boston mechanism vs. its alternatives. Documentos de trabajo (FEDEA),
(21):1–63, 2014.
Yeon-Koo Che and Youngwoo Koh. Decentralized college admissions. Journal of Political
Economy, Forthcoming, 2014.
Yeon-Koo Che and Fuhito Kojima. Asymptotic equivalence of probabilistic serial and random
priority mechanisms. Econometrica, 78(5):1625–1672, 2010.
Victor Chernozhukov, Han Hong, and Elie Tamer. Estimation and confidence regions for
parameter sets in econometric models. Econometrica, 75(5):1243–1284, 2007.
Christopher Conlon and Julie Holland Mortimer. Demand estimation under incomplete
product availability. American Economic Journal: Microeconomics, 5(4):1–30, 2013.
Monique De Haan, Pieter Gautier, Hessel Oosterbeek, and Bas Van der Klaauw. The performance of school assignment mechanisms in practice. IZA Discussion Papers, 2015.
Alain de Janvry, Andrew Dustan, and Elisabeth Sadoulet. Flourish or fail? The risky reward
of elite high school admission in Mexico City. Unpublished manuscript, 2015.
David Deming, Justine Hastings, Thomas Kane, and Douglas Staiger. School choice, school
quality, and postsecondary attainment. American Economic Review, 104(3):991–1013,
2014.
Dennis Epple, Akshaya Jha, and Holger Sieg. Estimating the impact of school closings on
parental choice. 2014.
Haluk Ergin and Tayfun Sönmez. Games of school choice under the Boston mechanism.
Journal of Public Economics, 90(1):215–237, 2006.
Ricardo Estrada and Jérémie Gignoux. Benefits to elite schools and the formation of expected
returns to education: Evidence from Mexico City. Unpublished manuscript, 2014.
Gabrielle Fack, Julien Grenet, and Yinghua He. Beyond truth-telling: Preference estimation
with centralized school choice. Unpublished manuscript, 2015.
Guillaume Haeringer and Flip Klijn. Constrained school choice. Journal of Economic Theory,
144(5):1921–1947, 2009.
Gordon H Hanson. Why isn’t Mexico rich? Journal of Economic Literature, 48(4):987–1004,
2010.
Justine Hastings and Jeffrey Weinstein. Information, school choice, and academic achievement: Evidence from two experiments. The Quarterly Journal of Economics, pages 1373–1414, 2008.
Justine Hastings, Thomas Kane, and Douglas Staiger. Heterogeneous preferences and the
efficacy of public school choice. Unpublished manuscript, 2008.
Justine Hastings, Christopher A Neilson, and Seth D Zimmerman. The effects of earnings
disclosure on college enrollment decisions. 2015a.
Justine S Hastings, Christopher A Neilson, Anely Ramirez, and Seth D Zimmerman. (Un)informed
college and major choice: Evidence from linked survey and administrative data.
2015b.
Yinghua He. Gaming the Boston school choice mechanism in Beijing. Unpublished
manuscript, 2015.
Sam Il Myoung Hwang. A robust redesign of high school match. Unpublished manuscript,
2014.
Brian Jacob, Brian McCall, and Kevin Stange. College as country club: Do colleges cater
to students’ preferences for consumption? NBER Working Paper, 2013.
Robert Jensen. The (perceived) returns to education and the demand for schooling. The
Quarterly Journal of Economics, 125(2):515–548, 2010.
Michael Keane and Kenneth Wolpin. The career decisions of young men. Journal of Political
Economy, 105(3):473–522, 1997.
Adam Lavecchia, Heidi Liu, and Philip Oreopoulos. Behavioral economics of education:
Progress and possibilities. NBER Working Paper, 2014.
Bridget Terry Long. How have college decisions changed over time? An application of the
conditional logistic choice model. Journal of Econometrics, 121(1):271–296, 2004.
Charles Manski. Identification problems in the social sciences. Harvard University Press,
1999.
Paul Milgrom and Robert Weber. Distributional strategies for games with incomplete information. Mathematics of Operations Research, 10(4):619–632, 1985.
Parag Pathak and Tayfun Sönmez. Leveling the playing field: Sincere and sophisticated
players in the Boston mechanism. The American Economic Review, 98(4):1636–1652,
2008.
Barry Schwartz. The paradox of choice: why more is less. New York: Harper Perennial,
2005.
Elie Tamer. Partial identification in econometrics. Annual Review of Economics, 2(1):167–
195, 2010.