Estimating students’ preferences and bounded rationality in Mexico City∗

Job Market Paper

Alexis Le Chapelain, Sciences Po Paris

October 30, 2015

Abstract

Many developing countries are lagging behind in terms of educational achievement and human capital. This gap is often blamed on the low value that the poorest students place on education and on their inability to identify the best educational alternatives. This paper contributes to this debate by studying preferences for school attributes in Mexico and how well students understand the functioning of the educational system. It describes the school-student matching algorithm used in Mexico City and shows that it is non-truthful, unstable, and that it requires a high level of sophistication. It then presents suggestive evidence that many students choose dominated strategies and fail to fully understand the mechanism. Last, it exploits the reports submitted by students to estimate a model of preferences for school attributes. To overcome the problem raised by strategic reporting, it uses an approach based on partial identification and derives robust bounds on the parameters, which make it possible to assess the validity of the point estimates obtained through more traditional methods. Finally, the paper finds that while students value selectivity and the quality of the student body, there is considerable heterogeneity among them. The most deprived students place less value on academic quality and prestige, regardless of their score, and are often mismatched.

JEL Codes: C78, D47, H41, J24, I21, O15

Keywords: Student Preferences, Serial Dictatorship Mechanism, School Choice, Bounded Rationality, Partial Identification

∗ Corresponding author: Alexis Le Chapelain, Sciences Po Paris, [email protected]. I thank Jorge Ubaldo Colin Pescina, Jean-Marc Robin, Olivier Tercieux, Il-Myoung Hwang, Yinghua He, Julien Grenet, and Gabrielle Fack for their help and comments on the paper.
1 Introduction

Since economists first took an interest in education four decades ago, the approach based on rational choice has been prominent. Students are seen as economic agents able to balance the costs, risks and benefits of education decisions, and to accurately assess the quality of various alternatives.[1] However, this view has recently been challenged by work showing the presence of behavioral biases and myopia among students.[2] This problem seems particularly acute in developing countries, where many students fail to recognize the benefit of additional education and have difficulty distinguishing good schools from bad ones.[3] This can have important implications, since human capital is an important factor in economic prosperity.[4]

Educational decisions are known to be strongly influenced by social origin. Students from a poorer social background are more likely to drop out or to choose less prestigious and less rewarding educational paths, regardless of their academic ability.[5] This can lead to an educational mismatch, where some students are underplaced relative to their academic ability.

This problem is particularly striking in Mexico. While education is compulsory until middle school, and students are primarily assigned to a school based on their catchment area, students have to choose which high school they want to attend. This choice is particularly important, because the choice of high school has a strong influence on the ability to go on to higher education, and because the Mexican secondary system is strongly stratified by ability. In this context, regardless of their academic score, the poorest students are much more likely to attend schools with low selectivity and weak peer groups.

[1] A good example of such a view comes from the literature on the estimation of structural models of human capital accumulation, which emerged after the seminal paper by Keane and Wolpin (1997).
[2] See for instance Lavecchia et al. (2014) for a comprehensive review of recent economic work on education inspired by behavioral economics.
[3] This is documented for instance in the work of Jensen (2010), who shows that students in the Dominican Republic underestimate the returns to education.
[4] Mexico is a good example of such a situation. While attendance at primary and middle schools is near universal in this country, there is a very large drop-out rate in high schools, with more than 40% of a cohort unable or unwilling to complete secondary education. At the same time, returns to graduation are large, and have been estimated to be as high as 24% (Campos-Vázquez (2011)). Dropping out is thus likely to be an inefficient decision. This low completion rate is often cited as one of the major factors hindering Mexico's economic development, which has been disappointing during the last decades (Hanson (2010)).
[5] For instance, Hoxby and Avery show that in the US, strong students from poor backgrounds fail to apply to selective colleges with generous scholarships and end up in less selective and more costly institutions.

[Figure 1: Match quality and social origin. Vertical axis: difference between own test score and school test score. Horizontal axis: wealth index. Local polynomial smooth with 95% CI (Epanechnikov kernel, degree 0, bandwidth 0.45).]

[Figure 2: Self-censorship among strong students. Vertical axis: difference between own test score and school admission threshold. Horizontal axis: wealth index. Local polynomial smooth with 95% CI (Epanechnikov kernel, degree 0, bandwidth 0.43).]

The two figures above describe the extent of the phenomenon. The first figure reveals that while rich students tend to be in schools whose academic average is above their own score, the reverse is true for the poorest students, who are underplaced and are on average better than their peers. The second figure focuses on the top quartile of ability.
It shows that, conditional on their test score, the poorest students are on average assigned to less selective high schools: their scores are well above the admission thresholds of their assigned schools. This is another sign of mismatch among this category of students.

Consequently, the aim of this paper is to investigate the causes of this academic mismatch among poor students. It shows that this mismatch is primarily due to two causes: students from a more deprived background are more likely to misunderstand the matching mechanism used in Mexico and to make mistakes in the application process; they also have weaker preferences for selective schools than other students.

High drop-out rates and low academic ambition among poor students are often blamed on material factors, such as the lack of resources and the presence of borrowing constraints, which lead them to enter the labor market quickly.[6] However, in our case, public high schools are free, and mismatch is present from the very choice of a secondary institution.

Another possibility is that the complexity of the school system itself leads students to make poor choices.[7] This is especially the case when the admission process to various educational institutions is decentralized. In such circumstances, students have to apply to multiple schools at the same time, which is often costly, and involves assessing at which schools one has a reasonable chance of being admitted.[8] In order to ease the process for students, the use of a centralized matching system that collects preferences and processes them with an algorithm has become popular. Designing efficient matching algorithms has become an active area of research since the seminal work of Abdulkadiroğlu and Sönmez (2003) and Balinski and Sönmez (1999). Following this trend, Mexico City implemented a centralized admission exam in the early 2000s, as well as a matching algorithm derived from the serial dictatorship mechanism.
Students have to take an exam known as COMIPENS and have to submit a rank-ordered list of preferred schools. They are then assigned to schools, with the students having the highest scores being given the highest priority.

While centralized admission systems and matching algorithms are meant to simplify the admission procedure and to level the playing field, they can in fact involve complicated strategies and demand a high level of sophistication from students.[9] In the case of Mexico, the mechanism which is used appears very simple to understand. However, I show that it requires a high level of sophistication from the students. This stems from the fact that students do not know their score before applying, and that they can submit only a partial preference list. This forces students to make complicated trade-offs between schools with different probabilities of admission. Moreover, it compels them to form beliefs about their score and the school admission thresholds.

[6] These factors seem indeed present at the higher education level in Mexico, as shown by ...
[7] For instance, Bettinger et al. (2012) document how reducing administrative complexity can make students from a poor background more likely to apply for financial aid and to attend higher education.
[8] Abdulkadiroğlu et al. (2015) discuss this issue in the case of New York, while Che and Koh (2014) provide theoretical arguments against decentralized matching.
[9] For instance, Pathak and Sönmez (2008) and Ergin and Sönmez (2006) show that the Boston mechanism, a widely used matching algorithm, generates a game where players have to conceal their true preferences to maximize their payoff. Evidence from the field shows that not all students are aware of this characteristic of the mechanism, and that naive students who play a truthful strategy are likely to be disadvantaged compared to sophisticated players. For these reasons, truthful mechanisms, where the optimal decision involves the simple strategy of reporting one's true preferences, are often advocated. However, most of the mechanisms implemented in practice do not satisfy this condition. In particular, even the modified versions of the Deferred Acceptance and Top Trading Cycles mechanisms which are implemented in many school systems are not truthful, because they include a truncation of the preference list (Haeringer and Klijn (2009)).

Being unable to observe the true preferences of students makes it impossible to measure their rationality directly. Indeed, the optimal strategy of a given student cannot be computed, since his true valuation of each alternative is unknown. To overcome this problem, I exploit the structure of the algorithm to show that some strategies are dominated whatever the preferences are. I describe several behaviors which are difficult to rationalize and which point to the presence of bounded rationality among a large share of the students. I also show that these rationality measures predict being assigned to a less prestigious school conditional on one's score, and being left unmatched at the end of the assignment process. I then show that students from a poor background are more likely to display irrational behavior. This explains part of their higher propensity to be undermatched relative to their ability.

I then estimate preferences for school attributes using the rank-ordered lists submitted by the students to the planner. This is made difficult by the fact that the algorithm gives students strong incentives to behave strategically and to misreport their true preferences. In particular, many students avoid ranking popular but selective schools to which they have very little chance of being admitted. Moreover, the large choice set (600 alternatives) is likely to induce “decision fatigue” and to lead students to adopt rough heuristics when making choices.

In order to obtain robust estimates in this context, I follow the partial identification approach outlined by Manski (1999) and Tamer (2010). I start with strong assumptions that yield precise point estimates of the parameters, and I progressively weaken these assumptions to obtain more robust, but less precise and sometimes only partially identified, estimates of the parameters of interest.

In addition to estimating a standard rank-ordered logit model, I use two other empirical strategies. First, I define for each student a set of feasible schools based on his test score at the placement exam, and then estimate a discrete choice model based on this personalized choice set, in the spirit of the literature on college application and admission,[10] which estimates preferences for college attributes among American students in a context where many students do not apply to the most selective colleges because of their low probability of being accepted. While far superior to a naive approach, this strategy relies on assumptions about students' expectations regarding the schools to which they can gain access. Since there is some evidence that Mexican students are over-optimistic about their scores (Bobba and Frisancho (2014)), this assumption may prove restrictive. Moreover, it relies on the assumption that students truthfully report preferences within their feasible set, while the analysis of the algorithm shows that it can be optimal not to do so.

As a result, I use another identification strategy which only assumes that students submit a “truthful partial order”, that is, they truthfully rank the alternatives among the schools they have included in their preference list (which may omit other highly valued schools). Not doing so is a dominated strategy under any set of beliefs and preferences, and violates basic rationality.
With this very weak assumption, it is possible to build bounds on the probability of ranking one school above another in a list, and to construct conditional moment inequalities. However, these inequalities only achieve partial identification. I exploit the recent literature on the estimation of partially identified models (Andrews and Shi (2013)) to build confidence sets for each of the parameters of the model.

I find that students value factors related to academic quality, such as the selectivity of the school, the mean quality of the student body, or membership in elite subsystems. Students also value proximity. While this is unsurprising, there is an interesting pattern of heterogeneity across students based on their ability and social origin. Students from a more deprived background are less prone to apply to prestigious high schools and place less value on academic quality, even when their grades would make them eligible for such schools. This provides a second explanation for the higher likelihood of mismatch among poor students.

[10] See for instance Arcidiacono (2005), Long (2004), and Jacob et al. (2013). Such an approach is also advocated by Fack et al. (2015) in the context of school choice.

The paper is organized as follows. The second part presents the algorithm used in Mexico City to match students to schools, explains how it generates a Bayesian game among students, and how the complexity of this game makes it unlikely that students behave fully rationally. The third part presents several measures of irrationality, and shows that many students exhibit behaviors which are difficult to reconcile with full rationality. The fourth part presents the econometric model, and discusses the various identification strategies used as well as the stringency of the assumptions needed. The fifth part presents and discusses the results. The last part concludes.

Relation to the literature

The present paper is connected with several strands of literature.
It is related to the recent literature on school choice which explores how bounded rationality and heterogeneity in preferences can lead to suboptimal choices among students from a deprived background; see for instance Hastings et al. (2008), Hastings and Weinstein (2008), Deming et al. (2014), Hastings et al. (2015b) and Hastings et al. (2015a). My paper adds to this literature by offering new evidence of educational mismatch from a developing country, and by describing two channels explaining why the level of mismatch differs across students of different social origins.

The availability of data on school choice has led researchers to estimate models of demand for schools while taking into account the structure of the matching mechanism. Beyond estimating preferences, the aim of this literature has been to empirically assess the performance of various mechanisms, as well as the degree of sophistication of the students. The strategic reporting of preferences pointed out by theorists, and the bias it generates, has been at the heart of the literature. He (2015) was the first to explicitly address this problem by proposing a structural model of choice in the context of the Boston mechanism. Other papers estimating preferences under the Boston mechanism include Calsamiglia et al. (2014) and Agarwal and Somaini (2014). These papers build structural models of school choice, and assume rational expectations and a high level of rationality, in a way similar to He. They find evidence of strategic reporting, a finding confirmed by Calsamiglia and Güell (2014) and De Haan et al. (2015), who use reduced-form methods.

The complexity of the strategy that students have to play under perfect rationality makes such an assumption unappealing. As a result, a new trend in the literature is to use less structural approaches and to weaken the assumptions under which preferences are estimated.
Hwang (2014) uses a partial identification approach, and estimates confidence sets for the parameters of a demand model in the case of the Boston mechanism. Similarly, Fack et al. (2015) propose several estimation methods for preferences under the deferred acceptance mechanism with truncation of the preference list, which mostly rely on traditional discrete choice methods and quite weak assumptions. Burgess et al. (2014) estimate preferences in England under the Deferred Acceptance mechanism by exploiting the fact that for many students the choice set is very small.[11]

My contribution to this strand of research is to propose a robust methodology for estimating preferences from reports obtained under the serial dictatorship mechanism with truncation and uncertain priorities. I am able to overcome the misreporting problem and to obtain robust estimates of the underlying preferences. To do so, I do not need strong assumptions on the beliefs of the students, contrary to most of the literature, nor do I need to assume that students fully understand the mechanism.

In parallel to the interest in student preferences, a rich theoretical literature has discussed the properties of various matching algorithms, in particular their strategy-proofness and their vulnerability to manipulation. Examples include the previously cited papers by Pathak and Sönmez (2008) and Ergin and Sönmez (2006), which discuss the vulnerability to manipulation of the Boston mechanism, and a paper by Haeringer and Klijn (2009), who show that stable and strategy-proof mechanisms can become unstable and manipulable when preferences are truncated. Pathak and Sönmez (2008) offer interesting evidence that differences in sophistication can leave the most naive players worse off under a manipulable mechanism.
My paper contributes to this literature by discussing a mechanism little explored so far, the serial dictatorship mechanism with truncated choices and priorities unknown ex ante. It shows that this mechanism lacks some desirable properties such as stability and strategy-proofness, and that it requires a high level of sophistication among students.

[11] The estimation of preferences for school attributes has also been performed for higher education. Akyol and Krishna (2014) ingeniously adapt the methodology of Berry et al. (1995) to data from higher education in Turkey. Carvalho et al. (2014) estimate preferences for majors in a Brazilian university using a structural model.

2 Modeling school choice in Mexico

The aim of this section is to describe the matching mechanism used in Mexico City, and to locate it within the framework of matching theory. This makes it possible to determine the optimal strategies of the students, the assumptions that can reasonably be made about their behavior, and the behaviors that would violate rationality.

2.1 The school choice plan in Mexico City

General description of the environment: The school choice plan in Mexico City operates at the transition from junior to senior high school. While most students attend a junior high school close to their home, they have to choose a high school, which can be located anywhere in the district of Mexico and in some surrounding suburbs. They can choose among a very large number of alternatives, since high schools in Mexico City are run by nine different subsystems, operating many campuses and an even larger number of programs (more than 600). Choosing a high school is an important decision, since there is a high return to graduation from high school in Mexico City (Campos-Vázquez (2011)), and completion rates can vary significantly across high schools (de Janvry et al. (2015)). Indeed, the drop-out rate is very high, at about 40%.
Moreover, some high schools offer preferential access to some prestigious universities (in particular, the high schools affiliated with UNAM, one of the most prestigious universities in Mexico).

Timing: The school choice plan follows a specific timing. First, students have to submit an ordered list of preferences to the planner about one year before the assignment. The list can include up to 20 schools. A few months later, students have to take an exam, known as COMIPENS, which is graded on a scale of 128. The exam consists of 128 true/false questions, and is divided into several parts covering the whole junior high school curriculum. There is no penalty for answering a question incorrectly. At this stage, any student with a score below 31 is dismissed (however, this restriction was removed recently). Students are then assigned to a high school based on their test score and on their submitted preferences.

The algorithm: The algorithm used in Mexico City is a variant of serial dictatorship with truncated preference lists. It proceeds in the following steps.

Step (0) The mechanism collects the preferences of the students (namely, rank-ordered lists of at most twenty schools), and their scores at the placement exam.
Step (1) All students are ordered by their score at the exam.
Step (2a) The best-ranked students are assigned to their preferred schools.
Step (2b) If at this stage some schools face excess demand, they choose either to refuse all the applying students with an equal score, or to accept all of them (namely, they choose an admission threshold, and then increase their number of seats to meet demand if there are too many ties; conversely, they can choose an admission threshold just one point higher, but at the cost of having vacant seats).
Step (2c) The students and the school seats which have been matched are removed from the algorithm.
Step 2 iterates until all admissible students have been assigned, or until there are no longer any vacant seats. The algorithm produces a matching denoted $\mu$. The specific allocation given to an agent $i$ is denoted $\mu_i$.

An important feature of the algorithm is that it does not break ties between students having the same level of priority. Instead, it asks the high schools to set an admission threshold. Since programs are large (most of them admit more than 400 students), this has little consequence for excess capacity.

This algorithm has not been studied in the literature so far, since serial dictatorship has been investigated only in contexts where preferences are not truncated (in particular, a random version of this algorithm is discussed in an important paper by Bogomolnaia and Moulin (2001), as well as in Che and Kojima (2010)). As shown by Haeringer and Klijn (2009) for popular algorithms such as Top Trading Cycles, Deferred Acceptance and the Boston mechanism, truncation can have an important effect and can change the properties of the algorithm. While the serial dictatorship mechanism is truthful, stable and efficient, it may not retain all these characteristics if preferences are truncated.

2.2 Model, strategy and equilibrium

In this section, I follow the literature on the structural estimation of students’ preferences (Agarwal and Somaini (2014), He (2015), Fack et al. (2015)), and I model the mechanism used in Mexico as a Bayesian game.
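To make the assignment procedure of Section 2.1 concrete, the steps (0) to (2c) can be sketched in code. This is a minimal illustration under simplifying assumptions, not the planner's actual implementation: the function name `assign`, its arguments, and the single switch `accept_ties` (applied to all schools, whereas in practice each school chooses its own tie rule) are hypothetical, and students are processed one score level at a time.

```python
from collections import Counter

def assign(scores, lists, capacity, min_score=31, max_len=20, accept_ties=True):
    """Serial dictatorship with truncated lists (illustrative sketch).

    scores:   dict student -> exam score
    lists:    dict student -> ranked list of schools (truncated to max_len)
    capacity: dict school -> number of seats
    accept_ties: hypothetical global switch. True: a school admits a whole
                 tie group that overflows its seats. False: it rejects the
                 whole group and keeps vacant seats.
    Returns (match, threshold): student -> school or None, and
    school -> score of the last admitted student.
    """
    seats = dict(capacity)
    closed = {s for s, q in seats.items() if q <= 0}
    match = {i: None for i in scores}
    threshold = {}
    eligible = [i for i in scores if scores[i] >= min_score]
    # Process students from the highest score to the lowest, one tie group at a time.
    for g in sorted({scores[i] for i in eligible}, reverse=True):
        group = [i for i in eligible if scores[i] == g]
        while True:
            # Each student points to his best listed school that still admits.
            choice = {i: next((s for s in lists[i][:max_len] if s not in closed), None)
                      for i in group}
            demand = Counter(s for s in choice.values() if s is not None)
            over = [s for s, d in demand.items() if d > seats[s]]
            if accept_ties or not over:
                break
            closed.update(over)  # reject the whole tie group, seats stay vacant
        for i, s in choice.items():
            if s is not None:
                match[i] = s
                seats[s] -= 1
                threshold[s] = g
        # Schools that are full (or over capacity after accepting ties) stop admitting.
        closed.update(s for s, q in seats.items() if q <= 0)
    return match, threshold

# Hypothetical toy data: two schools, five students, one below the minimum score.
scores = {"a": 100, "b": 90, "c": 90, "d": 80, "e": 20}
lists = {"a": ["X"], "b": ["X", "Y"], "c": ["X", "Y"], "d": ["X", "Y"], "e": ["Y"]}
capacity = {"X": 2, "Y": 5}
match_accept, thr_accept = assign(scores, lists, capacity, accept_ties=True)
match_reject, thr_reject = assign(scores, lists, capacity, accept_ties=False)
```

In the toy data, students b and c tie for the last seat at X. When ties are accepted, X expands to three seats and its threshold is 90; when they are rejected, X keeps one vacant seat with a threshold of 100 and both students fall to Y.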
2.2.1 Model

Definition of the game

The game is defined as follows:

• a set of parents / students / households: $\{i\}_{i=1}^{I}$
• a set of schools (= programs): $\{s\}_{s=1}^{S}$, and an outside option $\emptyset$
• a school capacity vector: $\{q_s\}_{s=1}^{S}$, with $\sum_{s=1}^{S} q_s \geq I$
• students’ rank-ordered lists: $L_i \in \mathcal{L}$, where $L_i = (c_i^1, c_i^2, \ldots, c_i^K)$, $K < 21$, and $c_i^k \in \{s\}_{s=1}^{S} \cup \{\emptyset\}$ for any $k = 1, 2, \ldots, K$
• school priorities $P$, which are in this context identical across schools and equal to the ranking in the exam. As a result, $P_{is} = g_i$, where $g_i$ is the score at the exam.

Preferences

The preferences of student $i$ for school $s$ are defined by

$$u_{is} = U(W_s, X_{is}, \varepsilon_{is}) = W_s \beta_W + X_{is} \beta_X + \varepsilon_{is} \tag{1}$$

where $W_s$ is the vector of school characteristics (which define a common value of the school across all students), $X_{is}$ is a vector of school-student characteristics (such as the distance to the school, or interaction terms between school and student characteristics), and $\varepsilon_{is}$ is a student-specific idiosyncratic preference shock for alternative $s$. I assume that $\varepsilon_{is}$ follows an extreme value distribution $F(0, \sigma_\varepsilon)$, that $\varepsilon_{is} \perp W_s, X_{is}$, and that shocks are uncorrelated, $\mathrm{corr}(\varepsilon_{is}, \varepsilon_{i's'}) = 0$. I collect the vectors $W_s, X_{is}$ into $Z$, and the parameters $\beta_W, \beta_X$ into $\beta_Z$. I assume that students maximize their utility function, and that they know their preferences, as well as the distribution of the preferences of other students.

Beliefs

Since students do not know in advance their scores at the placement exam or the admission thresholds of the schools, they have to form beliefs about them.

Students have beliefs about their own score: $\hat{g}_i \sim G(\mu_{g_i}, \sigma_{g_i})$

Students have beliefs about the thresholds (i.e. the score of the last admitted student): $\hat{t}_{si} \sim H(\mu_{si}, \sigma_{si})$

If we impose rational expectations, the means are equal to the true values.
However, I assume that beliefs are not perfectly accurate, and that they take the form of a distribution, that is, students assign a probability to every possible realization of their score and of the school admission thresholds. This is realistic: since there is no penalty for giving a wrong answer, students have a strong incentive to answer all the questions in the test, even at random. This introduces a strong random component into test scores, and makes them difficult to predict. As a result, even a student with rational expectations could at most be expected to know the shape, mean and variance of the distribution of his score.

2.2.2 Strategies

The problem of the student is to find the list which maximizes his expected utility. For each school, he can define an admission probability $p_{is}(\hat{g}_i, \hat{t}_{si}, L_i)$ which depends on the student's expected score, the expected threshold, and the submitted list. Based on these probabilities, the student chooses the following strategy:

$$\sigma^*(u_i) \in \arg\max_{\sigma} \sum_{s \in S} \int\!\!\int u_{is}\, p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma)\, dG(g_i)\, dH(t_s) \tag{2}$$

NB: An alternative way to model the strategy of the students would be to assume that students know the distribution of the characteristics and expectations of the other students, and that they maximize accordingly. If we define $D(u_{-i}, \hat{g}_{-i})$ as the joint distribution of utilities and expected scores, we can rewrite the problem as:

$$\sigma^*(u_i) \in \arg\max_{\sigma} \sum_{s \in S} \int\!\!\int u_{is}\, p_{is}(\hat{g}_i, \sigma, \sigma^*(u_{-i}, g_{-i}))\, dG(g_i)\, dD(u_{-i}, \hat{g}_{-i}) \tag{3}$$

that is, students choose their best strategy conditional on other students choosing their optimal strategy conditional on their characteristics. In practice, the threshold depends directly on the behavior of the other students, and is a sufficient statistic for choosing the optimal strategy.
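The list-choice problem in equation (2) can be illustrated numerically. The sketch below is not the paper's estimation code: it assumes, for simplicity, that the ordering of thresholds is known, so that beliefs reduce to cumulative admission probabilities $P(g_i > t_s)$ and a student is assigned to the first listed school whose threshold lies below his score. The function names are hypothetical, and the numbers are those of Example 1 in Section 2.3.2 (interval probabilities summed into cumulative ones).

```python
from itertools import permutations

def expected_utility(lst, utility, admit_prob):
    """Expected utility of a ranked list when admission events are nested,
    i.e. a score above a higher threshold is also above every lower one."""
    total = 0.0
    covered = 0.0  # probability of being admitted to some school ranked earlier
    for s in lst:
        p_assigned = max(admit_prob[s] - covered, 0.0)
        total += utility[s] * p_assigned
        covered = max(covered, admit_prob[s])
    return total

def best_list(utility, admit_prob, max_len):
    """Brute-force search over all ordered lists of at most max_len schools."""
    schools = list(utility)
    candidates = [l for k in range(1, max_len + 1)
                  for l in permutations(schools, k)]
    return max(candidates, key=lambda l: expected_utility(l, utility, admit_prob))

# Example 1: utilities and cumulative admission probabilities P(g > t_s).
u1 = {"A": 30, "B": 10, "C": 9, "D": 2}
p1 = {"A": 0.1, "B": 0.3, "C": 0.6, "D": 0.9}
u2 = {"A": 20, "B": 18, "C": 10, "D": 9}
p2 = {"A": 0.05, "B": 0.2, "C": 0.4, "D": 0.9}

list1 = best_list(u1, p1, max_len=2)  # ('A', 'C')
list2 = best_list(u2, p2, max_len=2)  # ('B', 'D')
```

With four schools and lists of length two the search covers 16 candidate lists. With more than 600 programs and lists of up to 20 schools, brute-force enumeration of this kind is clearly infeasible, which is the tractability concern raised in this section.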
The probability $p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma)$ can be defined as follows:

$$p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma) = \Pr\left(g_i < t_{s'} \;\; \forall s' \in \{l_1, \ldots, l_k\} \;\text{ and }\; g_i > t_s, \; s = l_{k+1}\right) \tag{4}$$

The probability for student $i$ of being admitted to a given school $s$ is equal to the probability that his score is below the admission threshold of each better-ranked school $s'$ in the list, and that it is above the admission threshold of the given school. As a result, this probability depends on all the choices which are ranked higher. When deciding whether to include a school with a given admission threshold in his list, a student should therefore balance how much it increases his expected utility (i.e. $p_{is} u_{is}$) against the reduction in the probability of being admitted to the schools likely to be ranked below (that is, $\sum_{\tilde{s}} \frac{\partial p_{i\tilde{s}}}{\partial s} u_{i\tilde{s}}$, $\tilde{s}$ being the schools whose threshold is below that of $s$).

The strategy can be further rewritten using this expression for the probability of admission:

$$p_{is}(\hat{g}_i, \hat{t}_{si}, \sigma) = \int_{g_i < t_1} \int_{g_i < t_2} \cdots \int_{g_i < t_k} \int_{g_i > t_{k+1}} dG(g_i)\, dH(t_1, \ldots, t_{k+1}) \tag{5}$$

The maximization problem can be rewritten in a similar way:

$$\max_{L_i \in \mathcal{L}} U(L_i) = \sum_{s \in L_i} u_{is} \int_{g_i < t_1} \int_{g_i < t_2} \cdots \int_{g_i < t_k} \int_{g_i > t_{k+1}} dG(g_i)\, dH(t_1, \ldots, t_{k+1}) \tag{6}$$

When the dimension of $\mathcal{L}$ is large, the problem quickly becomes intractable. As a result, it is unlikely that students perfectly optimize their expected utility in our setting, since there are more than 600 schools. Indeed, for each school to be included in the list, students have to compute their probability of admission conditional on the presence of other schools in the list, which clearly requires a high level of sophistication. They also have to determine the joint distribution of all thresholds, which is a very difficult task. Therefore, they are much more likely to rely on simple heuristics than on perfect expected utility maximization.

2.2.3 Equilibrium

We can use the same argument as Fack et al.
(2015) to establish the existence of a pure-strategy Bayesian Nash equilibrium, by applying Theorem 4 (Purification Theorem) in Milgrom and Weber (1985). However, there may be multiple equilibria, including some in mixed strategies.

2.3 Properties of the algorithm

2.3.1 Some definitions

I first define some useful concepts used in the matching literature.

A matching is said to be stable if there is no student who prefers another school to his assignment and either has a higher priority at this school than one of the students who got a seat there, or finds a vacant seat in this school. A stable mechanism is a mechanism that associates a stable matching to every preference profile and priority.

A mechanism is said to be strategy-proof if truth telling is a weakly dominant strategy for every preference profile and priority, that is, students cannot manipulate the outcome of the mechanism by misreporting their true preferences.

A matching is said to be efficient or Pareto optimal if there is no pair of agents who can improve their situation by exchanging their assignments. An efficient mechanism associates an efficient matching to every preference profile and priority.

A strategy $\sigma_i$ is said to be a true partial order of schools if it is a list $L_i$ that respects the true preference ordering of the schools included in the list.

2.3.2 Stability

Proposition 1: With a positive probability, the serial dictatorship with truncation of preferences and priorities unknown ex ante produces an unstable matching.

Proof. It is easy to build an example to prove Proposition 1. Consider a setting with four schools A, B, C, D, and a large number of agents. Consider the following two agents {1, 2} with identical ordinal preferences $A \succ B \succ C \succ D$.
These two students (who apply among many other unknown agents) have different utilities and different expected test scores and/or thresholds, which lead to the following expected admission probability distributions conditional on applying. Let us assume that they can submit a list of at most two schools among the four.

Example 1

Grade distribution
             P(g > t_A)   P(t_A > g > t_B)   P(t_B > g > t_C)   P(t_C > g > t_D)   P(t_D > g)
student 1    0.1          0.2                0.3                0.3                0.1
student 2    0.05         0.15               0.2                0.5                0.1

Utility
             A     B     C     D     Ø
student 1    30    10    9     2     0
student 2    20    18    10    9     0

With the above grade distributions and utilities, the optimal strategy for agent 1 is to submit the list (A, C), since it maximizes his expected utility (0.1 × 30 + 0.5 × 9 = 7.5). The optimal strategy for agent 2 is to submit the list (B, D) (0.2 × 18 + 0.7 × 9 = 9.9). Conditional on the submission of these lists, and provided that the beliefs are correct, there is a 6% probability that student 1 is assigned to C and student 2 is assigned to B. In that case, student 1 could have a grade (and thus a priority) which is higher than that of student 2. For instance, such an assignment is consistent with student 1 having a grade equal to $t_A - \epsilon$ and student 2 having a grade equal to $t_B + \epsilon$. Student 1 then prefers B to his assignment C and has a higher priority at B than student 2. Therefore, such grade distributions and preferences produce an unstable matching with a positive probability.

2.3.3 Efficiency

Proposition 2: If students perfectly know the ranking of schools by threshold ex-ante, or if their expectations are such that the support of the distribution of any threshold never overlaps with that of another threshold, then serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs is efficient.

Proof. I build a proof by contradiction. Notice that if an assignment is inefficient, then, for some schools A, B and some students 1, 2, we must have $A \succ_1 B$ and $B \succ_2 A$, and an assignment $\{\mu_1 = B, \mu_2 = A\}$. If threshold ranks are known ex-ante, then we must have $t_A > t_B$, or the reverse.
However, in such a case, it is impossible to have the assignment $\mu_2 = A$. Indeed, if so, the grade of student 2 has to be higher than $t_A$, but then it is also higher than $t_B$, the threshold of an alternative he prefers. The proof goes the same way if $t_A < t_B$.

2.3.4 Truth telling

Proposition 3: Serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs is not strategy-proof.

Proof. See Example 1. While the order of the schools within the list submitted by the students to the planner is truthful, students do not reveal their true preferences, since they omit some of their most preferred alternatives from the list. A truthful strategy would have been to submit (A, B) for all the students.

Proposition 4: Under serial dictatorship with truncation of preferences and ex-ante unknown test scores and cutoffs, submitting a list which is not a true partial order is a weakly dominated strategy.

Proof. See Theorem 7 in Fack et al. (2015). Since the Serial Dictatorship mechanism is equivalent to the Deferred Acceptance mechanism with the same priorities at all schools, their proof can be directly adapted to my setting.

2.4 Dominated strategies and non rational expectations

While a wide range of choices can be rationalized by postulating specific preferences and beliefs, some of them are quite unlikely to be generated by a truly rational individual. Three sets of behavior are likely to come from bounded rationality and/or biased expectations.

Proposition 5a: If the admission thresholds are known ex-ante, or if the supports of their distributions do not overlap, then it is a weakly dominated strategy not to respect the threshold hierarchy in the rank-ordered list submitted to the planner.

Proof. Suppose that a student i does not respect the threshold order, that is, he submits $A \succ B$ while $t_B > t_A$ with probability one.
Then, he can never be admitted to B, and the list including {A, B} (in this order) gives the same utility as an identical list which differs only by omitting B. Indeed, four cases are possible. If $g_i > t_A$ and $g_i > t_B$, then he is admitted to A. If $g_i < t_A$ and $g_i < t_B$, then he is not admitted to either of the two schools. If $g_i > t_A$ and $g_i < t_B$, he is admitted to A. By assumption, since $t_B > t_A$, it is not possible to have $g_i < t_A$ and $g_i > t_B$, a necessary condition for being accepted in B. As a result, the probability of being accepted in B is null, and omitting this school from the list does not change the expected payoff of the student. Moreover, the student may add to his list another school with a positive probability of admission, which would raise his expected utility.

Proposition 5b: Ranking a school whose admission threshold is above the support of one’s expected score is a weakly dominated strategy.

Proposition 5c: Submitting a list which is not complete is a weakly dominated strategy.

Propositions 5b and 5c are trivially true. They amount to the fact that students should not waste any choice in their list. The extent of the behavior described in Proposition 5 can tell us how well students understand the mechanism. If they often play weakly dominated strategies, we can assume that they have a limited ability to strategize.

3 Assessing students’ rationality

3.1 School choice and students’ rationality

As shown above, the mechanism used in Mexico requires a high level of sophistication from students and their families. Students have to predict their own score, predict the thresholds of the schools they want to apply to, and understand the functioning of the mechanism. In particular, they should understand the importance of respecting the threshold order when ranking schools, as well as the importance of spreading their choices across schools with different admission thresholds.
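The importance of respecting the threshold order can be illustrated with a small sketch of the assignment rule seen from a single student: the student receives the first school in his list whose threshold his score exceeds. The function name, the score range and the two cutoffs below are hypothetical and chosen only for illustration; the strict inequality follows the one used in the proof of Proposition 5a.

```python
def assign(ranked_list, score, thresholds):
    """Return the first listed school whose threshold the score exceeds, else None."""
    for school in ranked_list:
        if score > thresholds[school]:
            return school
    return None


# Hypothetical cutoffs with t_B > t_A, known with certainty.
thresholds = {"A": 60, "B": 80}

# Outcome for every possible score when B is listed below A, and when B is omitted.
outcomes_with_b = [assign(["A", "B"], g, thresholds) for g in range(0, 101)]
outcomes_without_b = [assign(["A"], g, thresholds) for g in range(0, 101)]

# Listing B below A never changes the outcome: the slot used for B is wasted.
print(outcomes_with_b == outcomes_without_b)
print("B" in outcomes_with_b)
```

Under these assumed cutoffs the two lists lead to the same assignment for every score, which is the content of Proposition 5a; reversing the order to (B, A) is the only way for B to be reached.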
One can wonder to what extent students are able to display such a degree of sophistication. Bobba and Frisancho (2014) provide evidence, from a survey and a mock exam administered to students, that students in Mexico have strongly upward-biased expectations about their test score. Following the survey, students revise their beliefs somewhat and are more likely to apply to institutions which are better suited to their academic aptitude. This shows that students’ beliefs are not necessarily correct, and that they may make mistakes in their application strategy.

Such mistakes may have important consequences. They could lead students to stay unmatched (about 15% of our sample). They could also lead students to miss some opportunities and be under-matched, or to be assigned to schools which do not fit their ability well.

Students’ rationality is often a concern in school choice. Recent research has tried to distinguish between naive and sophisticated agents, the former having a poor understanding of the mechanism, while the latter are able to design strategies to exploit it. This behavior can sometimes be inferred from the data. For instance, in the Boston mechanism, it is often an unwise strategy to reveal one’s true first choice, because it may be inaccessible due to over-demand. It is often safer to aim for a less popular school at which one has priority. Looking at how choices change with priority can therefore be informative (e.g. Agarwal and Somaini (2014)). However, in some other cases, in particular the DA or TTC mechanisms, it is always possible to rationalize the observed choices by referring to the true unobserved underlying preferences. Even though some students make mistakes, it is difficult to detect them.

The interest of the Mexico school choice plan is that it is possible to observe directly strategies which are dominated, without access to the true preferences. As a result, it is possible to build various tests of rationality.
The aim of this section is thus to present the results of these tests for the whole student population, to assess which students are the most at risk of choosing inefficient strategies, and what the consequences of these strategies are in terms of the ability to secure access to a well-liked school. This section uses the data gathered by the COMIPENS, which administers a survey before the test. The data are described in detail in part 5.

3.2 Test 1: respecting threshold order

A first rationality requirement for any strategy is to respect the threshold order of the ranked schools. Indeed, as shown in Proposition 5a, if one ranks a school above another school which has a higher admission threshold, one has a zero probability of being admitted to the latter school. Such a strategy ends up wasting a choice, which could have been used to increase the probability of admission to another well-liked school. This is therefore a weakly dominated strategy.

In practice, thresholds and rankings are not fixed and vary from year to year. As a result, not respecting rankings may be consistent with rational behavior. For instance, a school usually ranked higher than another may have moved below it unexpectedly, or two very well-liked schools may have very close thresholds. However, if the distance between the thresholds of two schools which are not properly ranked is very large, or if the percentage of violations of the threshold order among all the school pairs within a submitted list is very large, it may be a sign that students have incorrect beliefs about the thresholds, or fail to understand the importance of respecting the ranking of schools by admission threshold in their preference list. In practice, it is possible for students to form accurate expectations of admission thresholds, because these thresholds do not move very much over time and are published each year in the news. Merely inferring future thresholds from past ones gives a reasonably good prediction.
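A pairwise check of this kind can be sketched as follows. For one submitted list, the sketch counts the share of school pairs whose ranking contradicts the threshold order, and the average threshold gap among those pairs. It is only an illustration: the function name, the school labels and the cutoffs are hypothetical, and it is not the code used to produce the results reported here.

```python
from itertools import combinations


def threshold_order_violations(ranked_list, thresholds):
    """Return (share of violating pairs, mean threshold gap among violating pairs).

    A pair violates the threshold order when the school ranked lower in the
    list has a strictly higher admission threshold than the one ranked above it.
    """
    gaps = []
    n_pairs = 0
    # combinations() preserves list order: `upper` is always ranked above `lower`.
    for upper, lower in combinations(ranked_list, 2):
        n_pairs += 1
        gap = thresholds[lower] - thresholds[upper]
        if gap > 0:
            gaps.append(gap)
    share = len(gaps) / n_pairs if n_pairs else 0.0
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return share, mean_gap


# Hypothetical ex-post thresholds and a hypothetical submitted list.
example_thresholds = {"A": 90, "B": 70, "C": 80, "D": 60}
share, mean_gap = threshold_order_violations(["B", "A", "D", "C"], example_thresholds)
print(share, mean_gap)
```

In this made-up list, three of the six pairs (B above A, B above C, D above C) are ranked against the threshold order, so the share is one half.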
I build several measures of respect of the threshold order. The first measure is the share of school pairs in the rank-ordered list which do not respect the ex-post threshold order. The second measure is the average difference between thresholds each time the rank order is not respected for a pair. The share of lists exhibiting at least one pair with a deviation is very large, at 95%, which reflects the fact that thresholds move randomly, and that even a rational student may end up submitting a choice list which does not respect the threshold order. However, there is very large heterogeneity in the ability to respect the threshold order. The figure below shows the distribution of the share of school pairs in the rank-ordered list which do not respect the threshold order. While not respecting the threshold order is widespread, the distribution has a thick right tail, showing that a substantial minority is extremely bad at ranking schools, and submits lists where more than 50% of school pairs do not respect the threshold order. This implies wasting many choices. However, it may also be due to the fact that some students rank schools whose thresholds are close to each other.

Figure 1 - Ability to respect threshold order. Horizontal axis: share of incorrect pairs; vertical axis: percentage of students. Note: The ability to respect threshold order is measured as the fraction of the school pairs in a list which do not respect the ex-post threshold order.

The second figure displays the distribution of the average deviation for each pair. The distribution is very skewed and approximately follows an exponential distribution, with a few students performing very poorly.

Figure 2 - Ability to respect threshold order (weighted). Horizontal axis: weighted share of incorrect pairs; vertical axis: percentage of students. Note: Each deviation from the threshold order is weighted by the size of the deviation.

To summarize, many students fail to rank schools in a way that respects the threshold order, which means wasting choices.
However, there is considerable heterogeneity among them, with a substantial minority showing a very poor ability to follow this simple strategy, probably because they have not understood the mechanism well.

3.3 Test 2: avoiding overoptimism

A second way to test the rationality of students’ application strategies is to look at their optimism. Indeed, students need to form expectations about their own score when applying to schools. They should assess not only its mean, but also its variance. If they consider their score to be highly uncertain (which tends to be true, since the test is a multiple-choice exam which relies somewhat on chance), they should adjust their application strategy accordingly. There are a few rules that students should follow when selecting their strategy. First, they should never list a school with a null probability of admission (Proposition 5b). Second, if their expectations are imprecise, a good rule of thumb is to apply to schools with different levels of selectivity, so that they cover a large part of their score distribution. Third, they should apply widely and exhaust all their choices (Proposition 5c).

To test whether students indeed follow strategies of this sort, we can first look at the number of applications addressed to schools whose admission threshold is well above their expected test score. To do so, I first compute expected test scores based on the previous scores of students in junior high school and some other characteristics (wealth, self-confidence, and so on). I compute confidence intervals for these expected scores, as well as critical values at the 1, 5 and 10% levels (for a one-sided test). Then, I look at the fraction of the choices of each student above this upper bound. Note that this is a conservative test, since the variables I use to predict scores explain only about one third of the variance of the admission exam score.
It is likely that students have access to much more accurate private information about their chance of success. The graph below shows the distribution of the share of choices whose threshold is higher than the 10% upper bound. A non-trivial share of students have almost all their choices above this upper bound, which shows a high level of optimism. More generally, many students place more than 10% of their applications above the 10% bound, which means that they are ready to spend quite a large share of their choices on schools they are unlikely to reach. The results are similar, although attenuated, for the bounds at 1 and 5%. Again, a significant share of students seems fairly optimistic.

Figure 3 - Overoptimism. Horizontal axis: share of the choices with less than 10% probability of admission; vertical axis: percentage of students.

A sign of overoptimism is that some students fail to be admitted to any school because they aim too high, while having test scores which would have allowed them to attend a decent high school. Among students who take the test and have positive test scores, 12.5% are unable to obtain a seat because their score is below the threshold of every school they applied to. These students have a much lower average test score than those who secure a seat, at 59 compared to 72. However, the top quartile of these students scores above 75, and the top decile above 82, a quite good academic result. Moreover, about 50% of all matched students are assigned to a school with a threshold lower than 59. As a result, a failure to secure a high school seat is most likely due to overoptimism rather than to a weak score.

Another interesting fact is that students differ very much in the dispersion of the schools they list. The distribution of the span between the most and least selective institutions is approximately uniform, and is shown in the figure below.
Figure 4 - Choosing a safe strategy. Horizontal axis: span of the list; vertical axis: percentage of students. Note: The span is defined as the difference between the highest and lowest thresholds of the schools ranked in the list.

Some students concentrate all their choices within a small threshold window, which means that if they fail to accurately predict their score, they may end up either unmatched or undermatched. By contrast, many students follow a safer strategy and apply widely among schools of different selectivity levels, which may reflect a better understanding of the randomness of the allocation process.

3.4 Test 3: size of choice list

Given that both test scores and thresholds are unknown ex ante, it is always a weakly dominant strategy to add more schools to one’s list, and to exhaust one’s choices (Proposition 5c). However, as in many school choice plans around the world, very few students use the full list, and some of them report only two or three choices. This is again a puzzling behavior that is difficult to rationalize. It may be that the students who do not report a full list do so because they are sure to be assigned to a desirable choice. However, we have seen that many students eventually end up unmatched. Given the randomness of scores, it is thus a risky strategy to report only a limited number of choices. Moreover, many students report choosing a program mostly for its academic quality, and thus could benefit from ranking more well-regarded schools in the hope of being lucky and having a better than expected score, or a lower than expected threshold. As a result, the large number of students who report fewer than the maximum possible number of choices is more likely to be due to their inability to assign a subjective value to many schools, owing to a lack of information and to the difficulty of choosing among so many alternatives. I present below the distribution of the number of choices.
Figure 5 - Number of submitted choices. Horizontal axis: number of submitted choices; vertical axis: percentage of students.

There is a peak at ten, maybe because it is a salient number. The majority of students report ten or fewer choices. Only a minority reaches the size constraint of the list.

3.5 The determinants of rationality deviations

Many students seem to depart from rational strategies in systematic ways. As a result, one could wonder what the determinants of such departures are. Are there some predictors of irrationality? I regress below various measures of rationality on gender, academic achievement, and some demographic characteristics.

Table 1: The determinants of weakly dominated strategies

                           (1)           (2)            (3)            (4)           (5)
VARIABLES                  Number of     Average rank   Percent rank   Choices’      Percent
                           choices       deviation      deviation      span          overoptimistic

COMIPENS Score             0.00589***    -0.00651***    -0.00561***    0.00228***    -0.00205***
                           (0.000557)    (0.000146)     (0.000153)     (0.000170)    (0.000123)
Mean Score Of Junior HS    0.00107       -0.0127***     -0.0109***     -0.00372***   -0.0137***
                           (0.00470)     (0.000530)     (0.000740)     (0.00101)     (0.000795)
Girl                       0.00128       -0.0612***     -0.0922***     0.0463***     0.129***
                           (0.0238)      (0.00538)      (0.00566)      (0.00582)     (0.00526)
Personality Index          -0.0343**     -0.0466***     -0.0333***     -0.0140***    -0.0334***
                           (0.0137)      (0.00304)      (0.00323)      (0.00350)     (0.00358)
Academic Index             -0.0196       -0.0449***     -0.0166***     -0.0229***    0.00528*
                           (0.0171)      (0.00360)      (0.00376)      (0.00413)     (0.00304)
Index of Parental Care     0.154***      -0.00125       -0.00306       0.00701**     -0.0195***
                           (0.0131)      (0.00320)      (0.00314)      (0.00322)     (0.00307)
Wealth Index               0.0781***     -0.0609***     -0.0474***     -0.00213      0.0817***
                           (0.0179)      (0.00312)      (0.00331)      (0.00415)     (0.00394)
Constant                   9.490***      1.325***       1.157***       0.0956        0.656***
                           (0.311)       (0.0362)       (0.0487)       (0.0658)      (0.0608)

Observations               138,945       138,945        138,945        138,945       138,945
R-squared                  0.004         0.098          0.062          0.003         0.050

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Observations are clustered at the junior high school level.
Most of the time, the R2 is very low, below 10%, and sometimes close to zero. Girls, stronger students, and students surrounded by better peers seem to do better at using more efficient strategies. In particular, they are better at respecting ranks. However, the number of choices and their span are almost unrelated to these characteristics (R2 of 0.004 and 0.003, respectively).

One can also look at the determinants of being unmatched. Students with a lower score are less likely to be matched, which is unsurprising. This is also the case for girls, probably because their scores in continuous assessment during the year are relatively better than their scores in the admission exam. This may lead them to be overoptimistic, and to forget to add some safe choices. Students in good junior high schools are also more at risk, probably because they tend to aim higher because of their peers. Moreover, wealthier and more confident students are more likely to stay unmatched, probably because these factors lead them to overestimate their ability.

Table 2 - The determinants of being unmatched

                           (1)
VARIABLES                  Being Unmatched

COMIPENS Score             -0.0645***
                           (0.000579)
Mean Score Of Junior HS    0.0480***
                           (0.00206)
Girl                       0.129***
                           (0.0204)
Personality Index          0.183***
                           (0.0122)
Academic Index             0.243***
                           (0.0123)
Index of Parental Care     0.0552***
                           (0.0123)
Wealth Index               0.266***
                           (0.0117)
Constant                   -1.149***
                           (0.143)

Observations               138,945

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Observations are clustered at the junior high school level.

3.6 The consequences of the failure to strategize

So far, I have described various apparent deviations from rationality. However, one can wonder whether they have consequences for the quality of the match achieved by students, conditional on their test score. To explore this possibility, I regress the admission threshold, the mean and the median of the student scores in the high school to which a student is assigned on the various variables computed above to measure rationality.
I control for test scores to account for the fact that students with mediocre scores will have bad outcomes anyway. Failure to strategize has a significant and quite large impact on all the measures of the quality of the match. Girls are also more likely to get a good outcome conditional on their test score, which is consistent with their superior measured rationality.

Table 3 - Irrationality measures and quality of the match

                                (1)            (2)            (3)
VARIABLES                       High school    Mean high      Median high
                                threshold      school score   school score

Number of Choice                0.297***       0.173***       0.159***
                                (0.0147)       (0.00788)      (0.00744)
Share rank deviation            -2.169***      -1.645***      -1.910***
                                (0.0725)       (0.0386)       (0.0359)
Choices’ span                   3.172***       0.879***       0.402***
                                (0.0796)       (0.0396)       (0.0364)
Share Overoptimistic Choices    -4.676***      -1.977***      -1.376***
                                (0.0982)       (0.0494)       (0.0438)
COMIPENS Score                  1.046***       0.681***       0.666***
                                (0.00570)      (0.00374)      (0.00349)
Mean Score Of Junior HS         -0.167***      -0.0122        0.0281***
                                (0.0151)       (0.00890)      (0.00801)
Girl                            2.189***       0.836***       0.688***
                                (0.101)        (0.0529)       (0.0487)
Wealth Index                    -0.0859        0.200***       0.333***
                                (0.0564)       (0.0304)       (0.0282)
Constant                        -18.57***      18.70***       16.27***
                                (0.955)        (0.566)        (0.511)

Observations                    225,548        225,548        225,548
R-squared                       0.554          0.644          0.667

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Observations are clustered at the junior high school level.

These results reinforce the interpretation of the various rationality measures presented above as indeed representing departures from the optimal strategy, and not the correlates of original and idiosyncratic preferences. When students follow strategies that are difficult to justify, they end up with a worse match conditional on their academic ability.

4 Estimating students’ preferences

The aim of this part is to discuss the identification and estimation of students’ preferences by exploiting the information contained in their reports.
As discussed in the introduction, this is not an easy task, since the reports only convey partial information about the preferences of the students, due to the truncation of the preference list and the unwillingness of many students to fill their list completely. As a result, it is likely that many students do not report schools they like very much because they do not expect to obtain a score high enough in the placement exam to be admitted. As discussed in Burgess et al. (2014) or Fack et al. (2015) among others, this is likely to lead to an underestimation of the quality of the best schools.

I follow the general philosophy of partial identification laid out, for instance, in Tamer (2010). I start with strong assumptions which deliver point identification and allow the use of standard, easy-to-use econometric methods. I then progressively relax these assumptions. This yields more credible estimates, but at the cost of lower precision, and sometimes of point identification. However, even partial identification can provide informative bounds on the parameters. It is also a way to assess the sensitivity of the estimates to various assumptions.

4.1 Empirical model

I start with equation (1) in part 2, which specifies the value of each alternative for a given student. Contrary to most of the literature on demand estimation,12 I do not include fixed effects for each alternative which are later decomposed in a second stage. Indeed, given the large number of programs, this is computationally difficult. Moreover, it is impossible to estimate such a model using partial identification inference when there are more than a dozen parameters. As a result, for the sake of comparability across specifications, I decompose the value of each alternative for a given student into three parts.
$W_s \beta_W$ is the value coming from the characteristics of the alternative, such as the selectivity of the school, its belonging to an elite subsystem, or the quality of its student body. $X_{is} \beta_X$ is the value coming from characteristics belonging both to the school and to the student, such as the distance to school, or interaction terms. Last, $\epsilon_{is}$ is a random preference shock of the student for this alternative.

$$u_{is} = U(W_s, X_{is}, \epsilon_{is}) = W_s \beta_W + X_{is} \beta_X + \epsilon_{is} \tag{7}$$

12 In particular Berry (1994) and Berry et al. (1995). Variations of their estimation strategies have been used in the context of school choice, in particular in papers by Akyol and Krishna (2014) and Epple et al. (2014).

I assume that $\epsilon_{is}$ follows a type I extreme value (Gumbel) distribution, and that $\epsilon_{is}$ is orthogonal to $W_s$ and $X_{is}$. I denote by $Z_i$ the matrix collecting all the vectors $W_s$, $X_{is}$ for each student, and by $\beta_Z$ the parameters $\beta_W$ and $\beta_X$. I choose a simple specification to ease computation. I do not use a random coefficient model, in order to avoid dealing with large integrals. The presence of interaction terms is sufficient to provide realistic heterogeneity in tastes across students.

4.2 Assumption: Perfect rationality under incomplete information

A natural starting point would be to assume perfect rationality and rational expectations, and to estimate the model under these assumptions. This would provide a benchmark. A possible strategy would be to solve the problem of each individual student given some beliefs and preferences, and then compute the resulting equilibrium thresholds. By iterating the process, one would be able to find a rational expectations equilibrium in which beliefs and strategies are consistent. The resulting thresholds and rank-ordered lists could then be used to match the equivalent moments in the data.
However, as discussed in the first part, it is extremely difficult to compute the optimal strategy of a given student, even when thresholds are given, since it implies computing and comparing the payoffs of an exponentially large number of strategies. Since there are 600 alternatives, this is impossible in practice. Moreover, it is very unlikely that students follow such a complicated strategy. They are much more likely to rely on simpler heuristics, which in fact could give very good results. It is well known in the psychological literature that individuals are unable to choose correctly among many alternatives.13

Another possibility is to specify a boundedly rational model of choice among many alternatives, and then apply the estimation strategy outlined above. However, to my knowledge, there is no such model in the literature. Moreover, modeling bounded rationality often implies making many arbitrary assumptions about the way individuals reason, which are no more credible than a purely rational model. Due to these difficulties, I use a less structural approach in the paper, starting from more parsimonious assumptions to propose several estimation strategies.

13 This is commonly referred to as “decision fatigue”, namely the inability to make correct decisions once many decisions have already been considered. Barry Schwartz argues in a famous book that exposure to too large a choice set reduces welfare (Schwartz (2005)). One can also refer to Baumeister et al. for the concept of “decision fatigue”.

4.3 Assumption: truthful ranking of preferred schools

I first assume that students truthfully report their preferences, that is, they do not take into account that they are likely to be refused by some schools because their score is too low. Under this assumption, preferences can be directly inferred from the reports using a rank-ordered logit model.

• Assumption A1: The students play a truthful strategy.
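Under A1 and type I extreme value shocks, the probability of a submitted list takes the usual rank-ordered logit form: a sequence of multinomial logit choices, each made among the schools not yet ranked. The following is a minimal sketch of the log-probability of one list under those assumptions. The function name and the utility values are hypothetical, and `utilities` stands for the deterministic part of each school's value for one student; it is not the estimation code of the paper.

```python
import math


def rank_ordered_logit_loglik(ranked_list, utilities):
    """Log-probability of observing `ranked_list` under truthful reporting.

    utilities: dict mapping every school in the choice set to the deterministic
    part of its utility for this student (listed and unlisted schools alike).
    """
    remaining = set(utilities)  # schools not yet ranked above the current one
    loglik = 0.0
    for school in ranked_list:
        denom = sum(math.exp(utilities[s]) for s in remaining)
        loglik += utilities[school] - math.log(denom)
        remaining.remove(school)
    return loglik


# Hypothetical deterministic utilities for a three-school choice set.
utilities = {"A": 1.0, "B": 0.5, "C": 0.0}
ll = rank_ordered_logit_loglik(["A", "B"], utilities)
print(ll)
```

Summing such terms over students gives a sample log-likelihood that can be passed to a numerical optimizer; restricting `utilities` to a subset of schools changes the choice set without changing the function.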
Under assumption A1, a given list can be observed only if its first choice is the most preferred alternative, the second choice is the second most preferred alternative, and so on. This gives the following likelihood for a given list submitted by an individual:

$$\Pr(L_i = c_i^1, c_i^2, \dots, c_i^k) = \Pr\left(u_{i,c_i^1} > u_{i,c_i^2} > \dots > u_{i,c_i^k} > u_{i,s'}\ \ \forall s' \in S \setminus L_i \mid Z_i, \beta_Z\right) \tag{8}$$

$$= \prod_{s \in L_i} \left( \frac{\exp(U_{i,s})}{\sum_{s' \not\succ s} \exp(U_{i,s'})} \right) \tag{9}$$

where $U_{i,s} = W_s \beta_W + X_{is} \beta_X$ is the deterministic part of $u_{i,s}$, and $s' \not\succ s$ denotes all the schools which are not ranked above s in the list $L_i$. This likelihood function can be described as a product of multinomial logit models, where each model gives the probability that one school is preferred to all the alternatives which are not ranked above it. Note that the set of schools s' changes for each s in $L_i$. The likelihood of observing a given sample is obtained by multiplying the individual probabilities of observing each list. Taking logs, this function can be written as:

$$LL(\beta_Z, Z_i) = \sum_{i=1}^{I} \sum_{s \in L_i} U_{i,s} - \sum_{i=1}^{I} \sum_{s \in L_i} \ln \sum_{s' \not\succ s} \exp(U_{i,s'}) \tag{10}$$

By maximizing this likelihood function, one obtains a point estimate of the parameters. These estimates are however likely to be severely biased due to misreporting. In particular, the most selective schools are unlikely to be included in the list by the many students who expect a low to average score at the placement exam. This leads to underestimating their popularity, and the value given to selectivity by the weakest students. This misreporting problem plagues all the literature trying to estimate the heterogeneity of preferences among students. In particular, it may well be that the preference for low-scoring schools which is found among the weakest students in part of this literature14 is in fact a statistical artifact due to imperfect reporting.

4.4 Assumption: truthful ranking of preferred schools among an expected feasible set

Several strategies have been proposed to solve the misreporting problem.
Most of them rely on a personalized choice set which reflects the fact that individuals have different likelihoods of admission, and therefore of application, regardless of their tastes. The main difficulty comes from the definition of the choice set. Indeed, in this context, admission scores are determined after choices have been made. As a result, students make choices based on expectations of their score and of the admission thresholds. While realized scores and thresholds are observable ex-post, expectations cannot be observed. Failing to model expectations adequately may lead to a bias, because it could leave outside of the personalized choice set some alternatives which were in fact seen as feasible but not attractive. In this case, it would lead to an overestimation of the quality of these alternatives.

14 For instance, in an influential paper, Hastings et al. (2008) find that students from poor backgrounds and belonging to minorities are less likely to apply to well-off and academically strong schools. However, they do not control for the fact that the application list is severely truncated (no more than three choices are allowed), which casts doubt on their estimates.

I make three assumptions about expectations:

• A21: Students do not underestimate their test score. That is, they do not give up applying to a preferred school because they think it is infeasible when it later turns out to be feasible.
• A22: When students apply to a school outside of their ex-post feasible set of schools, the threshold of this school corresponds to their maximum expected score. That is, they do not expect to have a higher score than the threshold of the most selective school they apply to.
• A23: Students apply truthfully within their expected feasible set.
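Read together, A21 and A22 suggest a simple construction of each student's expected feasible set: take the larger of the realized score and the highest threshold among the listed schools as the expected score, and keep every school whose threshold does not exceed it. The sketch below illustrates this reading. Taking the maximum over the whole list is an interpretation of A22 (the most selective school applied to); the function name, scores and cutoffs are hypothetical.

```python
def expected_feasible_set(actual_score, ranked_list, thresholds):
    """Return (expected score, set of schools deemed feasible) for one student.

    Assumed reading of A21-A22: the expected score is the maximum of the
    realized score and the highest threshold among the schools in the list.
    """
    listed_thresholds = [thresholds[s] for s in ranked_list]
    expected_score = max([actual_score] + listed_thresholds)
    feasible = {s for s, t in thresholds.items() if t <= expected_score}
    return expected_score, feasible


# Hypothetical thresholds for four schools.
thresholds = {"A": 90, "B": 75, "C": 60, "D": 40}

# A student who scored 65 but listed B: B is treated as expected-feasible.
print(expected_feasible_set(65, ["B", "C"], thresholds))
# A student who scored 80 and listed only C and D: B stays in the set anyway.
print(expected_feasible_set(80, ["C", "D"], thresholds))
```

In the second case, B is feasible but unlisted, which is the situation in which the restricted model reads the omission as a preference rather than as a constraint.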
Assumptions A21 and A22 imply the following expected test score $\hat{g}_i$:

$$\hat{g}_i = \max\left\{ \text{actual score},\; \text{threshold of the best-ranked alternative in the list } L_i \right\}$$

I define the set of all the schools with a threshold at or below this expected score as $S_i$. This constitutes my expected feasible set of schools for each student:15

$$s \in S_i \iff t_s \le \hat{g}_i$$

I can re-estimate the same model as before, changing only the set of schools among which each student is choosing. That is:

$$\Pr(L_i = c_i^1, c_i^2, \dots, c_i^k) = \Pr\left(u_{i,c_i^1} > u_{i,c_i^2} > \dots > u_{i,c_i^k} > u_{i,s'} \;\; \forall s' \in S_i \setminus L_i \mid Z_i, \beta_Z\right) \tag{11}$$

I only replace the set of all schools $S$ by the personalized set $S_i$. Similarly, the log-likelihood function can be written as follows:

$$LL(\beta_Z, Z_i) = \sum_{i=1}^{I} \sum_{s \in L_i} U_{i,s} - \sum_{i=1}^{I} \sum_{s \in L_i} \ln\left( \sum_{s' \not\succ s,\; s' \in S_i} \exp(U_{i,s'}) \right) \tag{12}$$

15 Several papers have tackled the problem of estimating discrete choice models when some alternatives are unfeasible for a fraction of the students. In particular, Arcidiacono (2004), Arcidiacono (2005) and Long (2004), and more recently Jacob et al. (2013), confronted this problem when estimating preferences for college attributes. They weight each alternative by the observed probability of application given by a probit model. Conlon and Mortimer (2013) show that this method achieves consistency. My framework departs from theirs because I do not rely on objective probabilities of application, but rather choose to model expectations. Moreover, I deal with rank-ordered lists, and I choose to define the feasible set as all the schools where the student believes that he has a positive probability of admission.

One can wonder how credible assumptions A21 to A23 are. Other works (Bobba and Frisancho (2014)) show that students are in general overoptimistic about their test scores. Moreover, given that their test score is random, they are likely to select schools whose threshold is higher than their mean expected score.
Indeed, because the test is a multiple-choice exam, any student can get a lucky draw and obtain a good score despite a low baseline level. As a result, it is very likely that students apply to schools whose threshold is much higher than the score they could reasonably expect to reach. This is confirmed in the data by the fact that many students apply to schools which are not feasible ex post (as shown by figure 2 in part 3). Conversely, students could hold wrong expectations and overestimate the difficulty of accessing some schools, leaving them out of their expected feasible set even though they eventually turned out to be feasible. In this case, including these schools in the set will lead to an underestimation of their popularity. However, thresholds are quite stable over time, and it is unlikely that students make very large mistakes in their expectations about them.

Assumption A22 is more disputable. It is indeed possible that students do not like schools which are better ranked than their most preferred alternative, and still think they are feasible. In fact, assumption A22 can only give a lower bound on the expectation of test scores. Another problem is that the thresholds themselves are random and unknown to the students when they make their choice. These thresholds are thus a noisy measure of the expectations of students. In the end, assumption A22 is unlikely to hold perfectly, and defining expected feasible choice sets based on this assumption will only give an approximation of the real sets.

Last, assumption A23 is also unlikely to hold perfectly. As shown in example 1 in part 2, it may be rational for some students to omit some well-liked schools in their feasible set if the constraint on the size of their list is binding. In this case, students face a trade-off between the likelihood of admission and the strength of their preference, and have an incentive to spread their choices among schools with different levels of selectivity.
While this is theoretically possible, it is noticeable that few students use their entire choice list, and that many of them submit rather short lists.16 As a result, within their feasible set, they are very likely to truthfully report their preferences among alternatives. To conclude, it is likely that assumptions A21 to A23 are violated in practice. However, the magnitude of the violation is likely to be small.

16 However, submitting an incomplete list does not necessarily mean that one does not leave desirable schools outside of the list. There may be psychological costs associated with considering and ranking many alternatives, which may lead students to rank only a short list to alleviate their cognitive burden.

4.5 Assumption: stability

Another possibility explored in the literature (Fack et al. (2015)) is to assume stability, that is, to assume that each student is matched to his most preferred alternative among those in which he has the highest priority.

• A3: The observed matching is stable.

In our context, it means that students are assigned their favorite choice among all the schools with a threshold equal to or below their test score. Given our data, stability can be expressed as the joint probability of observing the given thresholds (given the characteristics of students), and of students having their assigned alternative as their first choice among their (ex-post) feasible schools. The feasible set is thus defined in the following way:

$$s \in S_i \iff t_s \le g_i$$

where the expected grade is replaced by the grade observed ex post. It is possible to build a likelihood function based on the following probability:

$$\Pr(\text{stable matching}) = \Pr(\text{threshold} = t \mid Z_i, \theta) \times \prod_{i} \Pr\left(u_i = \max_{s \in S_i} u_{i,s} \mid Z_i, \theta, t\right) \tag{13}$$

Following the literature, I assume here that the preferences are independent of the thresholds (which are realized ex post based on the existing preferences). Note again that the max is taken over a school set specific to each individual, which includes all the schools with a threshold equal to or below the test score of the student.

However, this approach raises several problems. First, it is difficult to characterize the likelihood of observing a given threshold. In the case of the deferred acceptance mechanism, it is possible to use a result by Azevedo and Leshno (2014) to obtain a distribution for the threshold, which is asymptotically unique (namely, when the number of students grows large compared to the number of schools, the matching becomes unique). In our case, there is no such analytical result. Moreover, the assumption of stability is dubious in the case of the serial dictatorship with truncation of the preference list. As seen in part 2, one can generate examples of rational strategies which lead to unstable matchings, and there is no reason for these instabilities to disappear asymptotically. However, the number of blocking pairs is likely to be small.

Fack et al. (2015) perform simulations which show that, provided the preferences for schools do not depend on the threshold, and the matching is unique (and its probability equal to 1), the stability condition can be expressed by the second part of equation 13 only. In this case, we just need to estimate a multinomial logit with a personalized choice set equal to $S_i$.17 More formally, the estimation is based on the following individual probabilities, which can easily be used to build a likelihood function:

$$\Pr\left(u_i = \max_{s' \in S_i} u_{i,s'} \mid Z_i, \beta_Z, t\right) = \frac{\exp(U_{i,s})}{\sum_{s' \in S_i} \exp(U_{i,s'})} \tag{14}$$

where $s$ denotes the school to which student $i$ is assigned.

4.6 Assumption: truthful partial order

We have seen above that the previously considered identification strategies rely on difficult-to-justify assumptions on the strategy followed by the agents and on their beliefs. While these assumptions make it possible to achieve point identification, they weaken the credibility of the estimates. While assumption A1 is difficult to defend, assumptions A2 to A4 are more reasonable.
Although they are likely to be violated for some individuals, the number and size of the violations are likely to be small, and the magnitude of the bias is likely to be limited. However, this is uncertain, and we do not know the magnitude of the bias. The aim of this section is therefore to propose a partial identification of the parameters based on a very weak assumption, and to propose robust confidence sets for the parameters.

17 It is possible to re-express this logit model with moment conditions, and to add moment inequalities based on the rest of the rank-ordered list. These moment conditions can then be combined to estimate the parameters, using the technique of Andrews and Shi (2013). However, given that it adds little information, and that it is computationally difficult to implement, I did not use it.

I make only the following assumption:
• A4: Students submit a truthful partial order to the social planner. That is, the order of the alternatives that they report respects the true preference order among them.

This assumption is very weak. Indeed, it is always a (weakly) dominated strategy to play otherwise, since ranking an alternative above a more preferred one creates the risk of being assigned the former rather than the latter (see proposition 4). Moreover, there is no plausible psychological reason or cognitive bias which could lead students to play such a strategy. Under this assumption, it is possible to build inequalities bounding the true parameters:

$$\Pr(s_1 \succ s_2 \mid Z_i, \beta_Z) = \Pr(u_{i,s_1} > u_{i,s_2} \;\&\; s_1, s_2 \in L_i \mid Z_i, \beta_Z) \tag{15}$$

The above equation states that the probability of observing school 1 ranked above school 2 in a rank-ordered list is equal to the joint probability that these two schools belong to the list and that school 1 is preferred to school 2. This implies that:

$$\Pr(s_1 \succ s_2 \mid Z_i, \beta_Z) \le \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{16}$$

There is equality if and only if the two schools are always ranked in the list.
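To make the bounding logic concrete, the share of lists ranking school 1 above school 2 gives a lower bound on the probability of preferring school 1, and applying the same argument to the reversed pair gives an upper bound. The sketch below computes both empirical shares on a handful of made-up lists; the function name and the data are hypothetical, and no covariates are conditioned on.

```python
def pairwise_bounds(lists, s1, s2):
    """Empirical bounds on Pr(u_s1 > u_s2) from reported rank-ordered lists.

    Only lists containing both schools are informative under a truthful
    partial order; all other lists count for neither direction.
    """
    n = len(lists)
    s1_above = 0
    s2_above = 0
    for ranked in lists:
        if s1 in ranked and s2 in ranked:
            if ranked.index(s1) < ranked.index(s2):
                s1_above += 1
            else:
                s2_above += 1
    lower = s1_above / n       # share of lists with s1 ranked above s2
    upper = 1 - s2_above / n   # one minus share with s2 ranked above s1
    return lower, upper


# Hypothetical reports from four students.
reports = [["A", "B"], ["B", "A"], ["A", "C"], ["A", "B", "C"]]
print(pairwise_bounds(reports, "A", "B"))
```

The gap between the two bounds equals the share of lists that omit at least one of the two schools, so the interval collapses to a point when every list contains both.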
In a similar fashion, one can obtain another inequality. Applying equation (15) to the reversed pair, we have

$$\Pr(s_2 \succ s_1 \mid Z_i, \beta_Z) = \Pr(u_{i,s_2} > u_{i,s_1} \;\&\; s_1, s_2 \in L_i \mid Z_i, \beta_Z) \tag{17}$$

Since $\Pr(u_{i,s_2} > u_{i,s_1} \mid Z_i, \beta_Z)$ is the complement of $\Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z)$, this implies:

$$\Pr(s_2 \succ s_1 \mid Z_i, \beta_Z) \le 1 - \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{18}$$

which can be rewritten as:

$$1 - \Pr(s_2 \succ s_1 \mid Z_i, \beta_Z) \ge \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \tag{19}$$

For a given individual and a given pair of alternatives, inequalities (16) and (19) bound the probability of preferring one alternative to the other.

Taking expectations over all individuals for these two inequalities yields conditional moment inequalities, which are functions of the list $L_i$, conditional on $Z_i$ and $\beta_Z$:

$$E[m_1(L_i \mid Z_i, \beta_Z)] = E\left[\Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) - \mathbb{1}(s_1 \succ s_2) \mid Z_i, \beta_Z\right] \ge 0 \tag{20}$$

$$E[m_2(L_i \mid Z_i, \beta_Z)] = E\left[1 - \mathbb{1}(s_2 \succ s_1) - \Pr(u_{i,s_1} > u_{i,s_2} \mid Z_i, \beta_Z) \mid Z_i, \beta_Z\right] \ge 0 \tag{21}$$

It is possible to build these inequalities for each school pair in the dataset. The approach can be extended to larger groups of schools, but it is less likely that many students include these schools in a specific order, so I stick to school pairs.

These moment inequalities can be used to estimate a confidence set for the parameters $\beta_Z$, since they are not sufficient to deliver point identification. The bounds become sharper when the alternatives are reported in many lists, and the model is point identified if all the alternatives are reported in all the lists.

Estimating partially identified models is not easy. I start from the approach of Chernozhukov et al. (2007), who propose a criterion function approach where a function $S$ takes the value 0 when all the moment conditions are satisfied, and becomes increasingly large as some of these conditions are violated.
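A criterion with this behaviour can be sketched in a few lines. The version below sums the squared negative parts of standardized sample moments; it is only an illustration, the function name and the toy moment values are hypothetical, and standardizing each moment by its own standard deviation is an assumption of the sketch.

```python
def criterion(moments, std_devs):
    """Sum of squared violations of standardized moment inequalities.

    A moment m_j >= 0 is satisfied and contributes 0; a negative moment
    contributes the square of m_j / sd_j, so larger violations are
    penalized more than proportionally.
    """
    total = 0.0
    for m, sd in zip(moments, std_devs):
        violation = min(m / sd, 0.0)  # negative part of the standardized moment
        total += violation ** 2
    return total


# Hypothetical sample moments for three inequalities of the form E[m] >= 0.
print(criterion([0.2, 0.0, 0.1], [1.0, 1.0, 1.0]))   # all satisfied
print(criterion([-0.5, 0.3, 0.1], [0.5, 1.0, 1.0]))  # first one violated
```

Because satisfied inequalities contribute nothing, only the moments that are violated at a candidate parameter value drive the criterion.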
Among several possible choices, I choose the modified method of moments function, which is simple to compute and which has been praised for its good statistical properties (Chernozhukov et al. (2007), Andrews and Shi (2013)).

$$S(m, \Sigma) = \sum_{j=1}^{p} \left[ \frac{m_j}{\sigma_j} \right]_{-}^{2} \tag{22}$$

In the above equation, $m_j$ refers to a given moment inequality, whose standard deviation is $\sigma_j$, while the variance-covariance matrix of all the moment inequalities is given by $\Sigma$. The operator $[\cdot]_{-}^{2}$ takes the value 0 if the moment inequality is satisfied, and returns the square of its argument otherwise. Larger violations are thus increasingly penalized. Other criterion functions are proposed in Chernozhukov et al. (2007).

This criterion function can then be used to create a test statistic $T_n(\beta_Z)$ whose null distribution can be derived. Several possibilities have been proposed in the literature. I use the generalized moment selection technique of Andrews and Soares (2010), which has been shown to be superior to the plug-in asymptotic and the subsampling approaches.18 It is possible to test any value of the parameter set, and to build a confidence set such that:

$$CS_n = \{\beta_Z \in B_Z : T_n(\beta_Z) \le c_{n,1-\alpha}(\beta_Z)\} \tag{23}$$

A problem to be solved is that I deal with conditional moment inequalities. I follow Andrews and Shi (2013) and transform all the conditional moments into unconditional ones using indicator functions dividing the data space into hypercubes. These unconditional moments can then be aggregated into a test. I follow the advice of Andrews and Shi and use a Cramér-von Mises statistic (an alternative would be a Kolmogorov-Smirnov statistic). Unfortunately, the only way to compute the confidence set is to draw parameter vectors randomly and to test them one by one. This approach works asymptotically, if one can draw an infinite number of parameter vectors. In this context, the Andrews-Shi approach is not very convenient, because it requires computing the variance-covariance matrix of all the moment inequalities.
Since there are many of them, this is burdensome. To reduce the computational burden, I randomly select among school pairs and students. I also drop some school pairs which are present in a very small number of lists (in a similar fashion as in Ciliberto and Tamer and Hwang (2014)). The computational burden also limits the number of variables I can include in the model. As I explain in the next part, I use standard data reduction techniques to aggregate data on students into more parsimonious indicators.

Another problem is that the tests proposed in the literature are quite conservative. This leads the confidence set to be quite large, and to sometimes include 0. This is the cost of making only parsimonious and robust identifying assumptions. As a result, the results of the partial identification approach should mostly be interpreted as a robustness check. They make it possible to assess how restrictive the stronger assumptions needed to point identify the model are. If these assumptions lead to parameter estimates which are outside the bounds given by the partial identification estimators, then they are likely to lead to a large bias, and the estimates obtained from these assumptions should be treated with caution. Conversely, if the estimated parameters lie within the bounds, and if these bounds are informative enough, the estimated parameters have some credibility and only a small bias.

18 The advantage of the GMS procedure lies in the fact that it does not use all the moments to build the null distribution, but only those which are likely to be binding for the tested parameter vector. As a result, it produces a smaller critical value and rejects more often, which gives a smaller identified set.

5 Data description and results

The aim of this section is to describe the choice of variables for the model as well as the data, and to present and comment on the results.

5.1 Data

Using partial identification creates computational problems.
Indeed, the larger the number of parameters, the larger the number of parameter vectors to be tested when computing the confidence set. As a result, I have chosen to reduce as much as possible the number of variables included in the model.

Since I do not use school fixed effects (again, for computational reasons), I have to include school-level variables likely to affect the preferences of individuals. I focus mostly on measures of quality, such as the average score of the students at the placement exam, the admission threshold, and membership in one of the two elite subsystems among Mexico's public schools (IPN and UNAM). I also include a variable distinguishing vocational and general education.

I then include interaction variables between school and student characteristics. These variables capture the heterogeneity of preferences between different types of students. I choose to include an interaction term between academic results and academic quality, and interaction terms between a measure of social background and academic quality as well as the indicator for vocational education. The interaction terms between social background and school quality are a good way to assess whether poor students tend to self-select into academic tracks and schools which are less academically demanding and prestigious. Controlling for an interaction term between own results and school quality (measured by selectivity and average student quality) makes it possible to estimate the net effect of the preference for quality when choosing a school, as distinct from the expectation of being admitted.

All the variables are built from the survey which is administered to the students before they choose the schools they want to apply to. The survey includes many questions about self-assessment, family consumption, and academic results. I aggregate these variables into one indicator of social status, and one indicator of academic achievement using principal component analysis.
I discuss the exact methodology (and more specifically which variables are aggregated) in appendix A. Both indicators are standardized.

I present below some summary statistics for the schools.

Table 4: Summary statistics for schools

| Variable | Value |
|---|---|
| Average size | 1384 (1233) |
| Average admission threshold | 57.47 (24.89) |
| Average student quality (COMIPENS score) | 72.48 (17.26) |
| Curriculum is vocational | 0.15 (0.36) |
| Belong to IPN | 0.89 (0.28) |
| Belong to UNAM | 0.144 (0.35) |
| Number of seats | 250,031 |
| Number of programs | 612 |

I compute the value of each variable for each program, and I then weight the value by the number of seats in the program. The number of seats is computed based on the actual number of students enrolled in the program after the assignment process.

5.2 Results

I estimate the model using the different estimation strategies outlined in the previous part.

Table 5: The determinants of the demand for schools

| Variables | (1) Truthful reporting | (2) Truthful reporting within feasible set | (3) Stability | (4) Truthful partial order |
|---|---|---|---|---|
| School characteristics | | | | |
| Selectivity | 0.52 (0.02) | 0.97 (0.11) | 0.108 (0.14) | [0.22 – 0.145] |
| Average peer quality | 0.41 (0.05) | 0.86 (0.10) | 0.81 (0.14) | [0.32 – 0.121] |
| Curriculum is vocational | -2.12 (0.45) | -3.14 (0.56) | -2.98 (0.66) | [-2.18 – -4.84] |
| Belong to elite subsystem | 3.13 (0.34) | 4.35 (0.39) | 4.56 (0.40) | [3.01 – 6.23] |
| Interaction variables | | | | |
| Distance | -0.32 (0.02) | -0.45 (0.023) | -0.39 (0.03) | [-0.64 – -0.22] |
| Selectivity × student test score | -0.11 (0.08) | -0.23 (0.09) | -0.26 (0.10) | [-0.12 – -0.43] |
| Curriculum is vocational × student test score | -0.45 (0.18) | -0.67 (0.23) | -0.68 (0.25) | [-0.38 – -1.01] |
| Selectivity × student social status | -0.21 (0.02) | -0.56 (0.12) | -0.68 (0.15) | [-1.07 – -0.28] |
| Curriculum is vocational × student social status | -2.13 (0.42) | -4.23 (0.54) | -4.46 (0.68) | [-7.23 – -2.86] |

This table presents the results of the estimation of the demand model using different identification strategies.
In the last column, I only report the confidence interval in which the estimates lie with 95% probability.

The coefficients have the expected signs and are in line with the rest of the literature. Students value selectivity, peer quality, and membership in an elite subsystem. They prefer to avoid vocational education and schools far from home. This is consistent with the idea that students treat the admission exam and the quality of the student body as signals of academic quality. However, as has been noted in the literature, this assumption is not necessarily true. For instance, Akyol and Krishna (2014) document that the schools with the largest value added at the national examination in Turkey are spread evenly among selectivity levels. In the case of Mexico, two studies using a regression discontinuity design have documented that the elite subsystem IPN, which specializes in offering a rigorous education in math and science, is indeed able to increase the test scores of its students at the end-of-high-school exam (Estrada and Gignoux (2014), de Janvry et al. (2015)). However, there is no guarantee that these results hold across the whole selectivity distribution.

As far as the interaction terms are concerned, there is an interesting pattern of heterogeneity among students. Not surprisingly, students with good academic results are more likely to value academic quality, probably because they feel confident in their ability to succeed in rigorous programs. They also want to avoid vocational programs. However, students with a low social status are less likely to value academic quality positively. They also value vocational programs more. This is consistent with self-censorship, as well as with the wish to enter the labor market quickly, maybe because they are credit constrained.

Regarding the effect of the various identification strategies, there is a clear difference in terms of magnitude between the first strategy and the others.
The first strategy leads to underestimating the popularity of strong schools, particularly among the weakest students. The two strategies restricting the feasible school set give quite similar results. The strategy based on partial identification gives rather large confidence sets, probably because the test on which it is based is quite conservative. However, it makes it possible to reject some of the estimates of the first strategy (truthful reporting), while it is consistent with all the estimates of the second and third identification strategies (based on feasible choice sets). As a result, partial identification clearly makes it possible to discriminate between identification strategies which are credible and those which are not.

The results are interesting from a policy point of view. Students coming from a deprived background tend to give comparatively less value to academic quality, to an academically oriented curriculum, and to belonging to an elite school, regardless of their score. For the best of them, this is likely to be detrimental, since they miss the opportunity to be well prepared for college, and maybe to pursue their studies at the higher education level. It shows that the low level of education in some developing countries such as Mexico may be due in part to low expectations and ambition among part of the students.

6 Conclusion

This paper estimates the preferences of students in Mexico. It shows that while all students tend to value selectivity and peer quality, students from a poorer background value these characteristics much less than wealthier students. The paper also assesses the mechanism used in Mexico. It shows that this variant of the serial dictatorship is not stable, and that it requires a high level of sophistication from the students, as well as the ability to form accurate beliefs. The paper also presents evidence showing that many students are not fully rational and choose strategies which are dominated.
Weaker and less wealthy students are more likely to do so.

Taken together, these two findings lead to some pessimism concerning the benefits to be expected from centralized assignment mechanisms. These mechanisms are supposed to improve the matching between schools and students by simplifying the application process. However, in practice, the variants of the mechanism which are implemented in the real world often require elaborating sophisticated strategies and forming accurate beliefs. This comes mainly from the truncation of the choice list, and from the fact that many students do not fill their entire preference list. This is made necessary by the large number of alternatives that students are facing. In practice, it is likely that many students rely on simple heuristics to make their choice. However, it seems that these heuristics are far from perfect. As shown in the second part, many strategies used by the students are dominated. Moreover, the further their strategy is from the non-dominated strategy, the more likely students are to be mismatched or to be left unassigned at the end of the assignment procedure.

Even if students were able to use the right strategy, it is unlikely that centralized matching mechanisms would lead to greater access to good high schools or to higher education among students from a poor background. Indeed, these students are consistently less ambitious than their wealthier peers. Conditional on their academic ability, they are less likely to apply to the best high schools, even though they would be accepted. This is consistent with the presence of bounded rationality among them.

This finding is consistent with recent work on the role of limited information and biased expectations in the formation of human capital. Many students do not correctly assess the various educational alternatives they are facing.
This generates a mismatch between the ability of some students and the quality and requirements of the program they are attending. This leaves some room for effective policy. Improving information, and maybe providing incentives such as merit scholarships, could raise the ambition of some able but badly informed students.

References

Atila Abdulkadiroğlu and Tayfun Sönmez. School choice: A mechanism design approach. The American Economic Review, 93(3):729–747, 2003.

Atila Abdulkadiroğlu, Nikhil Agarwal, and Parag Pathak. The welfare effects of coordinated assignment: Evidence from the New York City high school match. NBER Working Paper, 2015.

Nikhil Agarwal and Paulo Somaini. Demand analysis using strategic reports: An application to a school choice mechanism. NBER Working Paper, 2014.

Saziye Akyol and Kala Krishna. Preferences, selection, and value added: A structural approach. NBER Working Paper, 2014.

Donald Andrews and Xiaoxia Shi. Inference based on conditional moment inequalities. Econometrica, 81(2):609–666, 2013.

Donald Andrews and Gustavo Soares. Inference for parameters defined by moment inequalities using generalized moment selection. Econometrica, 78(1):119–157, 2010.

Peter Arcidiacono. Ability sorting and the returns to college major. Journal of Econometrics, 121(1):343–375, 2004.

Peter Arcidiacono. Affirmative action in higher education: How do admission and financial aid rules affect future earnings? Econometrica, 73(5):1477–1524, 2005.

Eduardo Azevedo and Jacob Leshno. A supply and demand framework for two-sided matching markets. SSRN Working Paper 2260567, 2014.

Michel Balinski and Tayfun Sönmez. A tale of two mechanisms: Student placement. Journal of Economic Theory, 84(1):73–94, 1999.

Steven Berry. Estimating discrete-choice models of product differentiation. The RAND Journal of Economics, pages 242–262, 1994.

Steven Berry, James Levinsohn, and Ariel Pakes. Automobile prices in market equilibrium. Econometrica: Journal of the Econometric Society, pages 841–890, 1995.

Eric Bettinger, Bridget Terry Long, Philip Oreopoulos, and Lisa Sanbonmatsu. The role of application assistance and information in college decisions: Results from the H&R Block FAFSA experiment. The Quarterly Journal of Economics, 127(3):1205–1242, 2012.

Matteo Bobba and Veronica Frisancho. Learning about oneself: The effects of signaling academic ability on school choice. Unpublished manuscript, 2014.

Anna Bogomolnaia and Hervé Moulin. A new solution to the random assignment problem. Journal of Economic Theory, 100(2):295–328, 2001.

Simon Burgess, Ellen Greaves, Anna Vignoles, and Deborah Wilson. What parents want: School preferences and school choice. The Economic Journal, 2014.

Caterina Calsamiglia and Maia Güell. The illusion of school choice: Empirical evidence from Barcelona. CEPR Discussion Paper No. DP10011, 2014.

Raymundo Campos-Vázquez. Why did wage inequality decrease in Mexico after NAFTA? Economía Mexicana, 22(2):245–278, 2011.

José-Raimundo Carvalho, Thierry Magnac, and Qizhou Xiong. College choice allocation mechanisms: Structural estimates and counterfactuals. IZA Discussion Paper, 2014.

Caterina Calsamiglia, Chao Fu, and Maia Güell. Structural estimation of a model of school choices: The Boston mechanism vs. its alternatives. Documentos de trabajo (FEDEA), (21):1–63, 2014.

Yeon-Koo Che and Youngwoo Koh. Decentralized college admissions. Journal of Political Economy, Forthcoming, 2014.

Yeon-Koo Che and Fuhito Kojima. Asymptotic equivalence of probabilistic serial and random priority mechanisms. Econometrica, 78(5):1625–1672, 2010.

Victor Chernozhukov, Han Hong, and Elie Tamer. Estimation and confidence regions for parameter sets in econometric models. Econometrica, 75(5):1243–1284, 2007.

Christopher Conlon and Julie Holland Mortimer. Demand estimation under incomplete product availability. American Economic Journal: Microeconomics, 5(4):1–30, 2013.
Monique De Haan, Pieter Gautier, Hessel Oosterbeek, and Bas Van der Klaauw. The performance of school assignment mechanisms in practice. IZA Discussion Papers, 2015.

Alain de Janvry, Andrew Dustan, and Elisabeth Sadoulet. Flourish or fail? The risky reward of elite high school admission in Mexico City. Unpublished manuscript, 2015.

David Deming, Justine Hastings, Thomas Kane, and Douglas Staiger. School choice, school quality, and postsecondary attainment. American Economic Review, 104(3):991–1013, 2014.

Dennis Epple, Akshaya Jha, and Holger Sieg. Estimating the impact of school closings on parental choice. 2014.

Haluk Ergin and Tayfun Sönmez. Games of school choice under the Boston mechanism. Journal of Public Economics, 90(1):215–237, 2006.

Ricardo Estrada and Jérémie Gignoux. Benefits to elite schools and the formation of expected returns to education: Evidence from Mexico City. Unpublished manuscript, 2014.

Gabrielle Fack, Julien Grenet, and Yinghua He. Beyond truth-telling: Preference estimation with centralized school choice. Unpublished manuscript, 2015.

Guillaume Haeringer and Flip Klijn. Constrained school choice. Journal of Economic Theory, 144(5):1921–1947, 2009.

Gordon H Hanson. Why isn't Mexico rich? Journal of Economic Literature, 48(4):987–1004, 2010.

Justine Hastings and Jeffrey Weinstein. Information, school choice, and academic achievement: Evidence from two experiments. The Quarterly Journal of Economics, pages 1373–1414, 2008.

Justine Hastings, Thomas Kane, and Douglas Staiger. Heterogeneous preferences and the efficacy of public school choice. Unpublished manuscript, 2008.

Justine Hastings, Christopher A Neilson, and Seth D Zimmerman. The effects of earnings disclosure on college enrollment decisions. 2015a.

Justine S Hastings, Christopher A Neilson, Anely Ramirez, and Seth D Zimmerman. (Un)informed college and major choice: Evidence from linked survey and administrative data. 2015b.

Yinghua He. Gaming the Boston school choice mechanism in Beijing. Unpublished manuscript, 2015.

Sam Il Myoung Hwang. A robust redesign of high school match. Unpublished manuscript, 2014.

Brian Jacob, Brian McCall, and Kevin Stange. College as country club: Do colleges cater to students' preferences for consumption? NBER Working Paper, 2013.

Robert Jensen. The (perceived) returns to education and the demand for schooling. The Quarterly Journal of Economics, 125(2):515–548, 2010.

Michael Keane and Kenneth Wolpin. The career decisions of young men. Journal of Political Economy, 105(3):473–522, 1997.

Adam Lavecchia, Heidi Liu, and Philip Oreopoulos. Behavioral economics of education: Progress and possibilities. NBER Working Paper, 2014.

Bridget Terry Long. How have college decisions changed over time? An application of the conditional logistic choice model. Journal of Econometrics, 121(1):271–296, 2004.

Charles Manski. Identification problems in the social sciences. Harvard University Press, 1999.

Paul Milgrom and Robert Weber. Distributional strategies for games with incomplete information. Mathematics of Operations Research, 10(4):619–632, 1985.

Parag Pathak and Tayfun Sönmez. Leveling the playing field: Sincere and sophisticated players in the Boston mechanism. The American Economic Review, 98(4):1636–1652, 2008.

Barry Schwartz. The paradox of choice: Why more is less. New York: Harper Perennial, 2005.

Elie Tamer. Partial identification in econometrics. Annual Review of Economics, 2(1):167–195, 2010.