DOCTORAL THESIS
Submitted for the degree of
DOCTOR OF THE ÉCOLE NORMALE SUPÉRIEURE IN
ECONOMICS
Doctoral school: Économie Panthéon Sorbonne
Research unit: Paris-Jourdan Sciences Économiques
Educational Contexts and School Inequalities
Presented and publicly defended on 6 December 2014 by
Son Thierry Ly
before a jury composed of:
Mr. Éric Maurin
Thesis supervisor
Mr. Patrick Weil
Thesis co-supervisor
Mr. Cédric Afsa
President of the jury
Ms. Julie Berry Cullen
Referee
Mr. Xavier d’Haultfoeuille
Referee
Acknowledgements
I imagined the thesis as just one more step in my life as a student. I discovered the hard way
that it was first and foremost about learning to stop being one. For me, the thesis was
like a passage into adulthood, both difficult and fulfilling, a forced march
toward full intellectual autonomy: the kind that enables you to take strong positions
and defend them with scientific rigor, sometimes against the very people you admire intellectually, and at last to add your own stone to the edifice. Fortunately, I benefited
from exceptional support throughout this adventure.
My gratitude goes first to my thesis supervisor, Eric Maurin. While I was studying
biology at the ENS, Le ghetto français was one of the very first social science books
I read, and it was among the works that sparked my passion for the social sciences. Two
years later, I was lucky enough to have its author as my econometrics professor and as the director
of my master’s program in economics, before writing my master’s thesis and then my doctoral thesis
under his supervision. It is first of all thanks to his availability, his trust and his constant support
that I was able to complete this thesis. I discovered most of the highs and lows of
research work during the many hours spent in his office dissecting each
of my research ideas, abandoning them or, on the contrary, formalizing them rigorously
until they became a genuine, innovative economics article. I often left very depressed,
sometimes satisfied, but in any case always the better for it. I thank him for making me the
researcher I am today.
Very special thanks are of course due to my thesis co-supervisor,
Patrick Weil. Although he acted essentially as a mentor during this thesis, his presence and
unfailing support were invaluable throughout these years. It is above all thanks to him
that I dared to venture into the social sciences when I had never done anything but
biology and science. It was with him that I carried out and published my first piece of
research, in history this time. It was his advice that, during the most difficult moments
of my thesis, kept me from giving up. This thesis would probably never have
been completed without his help.
I will switch to English to express all my gratitude to Julie Cullen. I had the chance to
meet her in Paris in 2011 and she made it possible for me to visit the University of California,
San Diego for three months in the spring of 2014. Let me insist on how rare it is to
interact with such a bright senior economist and such a kind person, always ready to help despite
her limited free time. I was not even her student, but she read all the versions of my papers
during the past years and always came back with excellent suggestions to improve them. The
quality of this thesis would be much lower without her constant support.
I would now like to thank my friends and co-authors, Thomas Breda and Arnaud
Riegert. During the first year of my thesis, I had the good fortune to carry out my first research
project with Thomas, who was then finishing his own thesis. I was able to benefit from his experience while
daring to ask all my questions and to debate freely, and I know it would have been otherwise had
I started out working with a senior researcher. I am grateful to him for this collaboration, thanks to which I quickly became autonomous in my research work,
and for all his advice over these years. As for Arnaud, it is hard for me to gauge how
much our friendship and our joint work have contributed to this thesis. I carried out my
first actions in the field of education with him eight years ago, and today we still conduct
most of our research and projects on education together. The convergence of our interests and the complementarity of our respective strengths and weaknesses make our collaboration
dynamic, efficient and always enjoyable. This thesis owes him a great deal.
I wish to thank Cédric Afsa and Xavier d’Haultfoeuille for agreeing to sit on
my thesis jury. I thank Cédric Afsa in particular for giving me access to
incredible databases at the DEPP, which helped me get going again at a difficult point in my
thesis, and Xavier for hosting me at CREST during the fourth year of my thesis, and for his
availability and kindness.
Thank you to all the members of the Paris School of Economics. First, the professors,
in particular Marc Gurgand and Julien Grenet, for their many pieces of advice throughout the
thesis committees of recent years; Luc Behaghel and Thomas Piketty for their support and
the time they spent discussing my work; as well as Sylvie Lambert, Karen Macours and Denis
Cogneau. Finally, many thanks to all the members of the administrative staff with whom I so
enjoyed interacting during these years, in particular Marie-Christine Paoletti, Marie Philipon,
Damien Herpe, Béatrice Havet, Véronique Guillotin and Weronika Leduc. Thanks also to
all my fellow doctoral students, who made pleasant the little time I managed to spend
on site at PSE and CREST. I especially thank my friends Margaux, Gwenaël and Marie
for their caring presence, their enthusiasm and their comfort throughout these years.
This thesis would never have begun without all the students, families, principals, teachers
and senior education advisers (CPE) with whom I had the opportunity to work in the course of my projects at the ENS over the
past eight years, and to whom I wish to express my full gratitude. It is their
dedication, their enthusiasm and their relentless commitment to the success of all that gave rise to, and
still sustain, my passion for education. I take this opportunity to warmly thank everyone with whom I had the immense good fortune to work on these projects at the ENS,
first at the TALENS association and then at the PESU unit. With them I experienced the most
intense moments of my professional life, which shaped me both personally and intellectually throughout
my studies and my doctorate. I have a very special thought for my team at the
PESU unit, Adeline, Claire, Pauline, Quentin and Matthieu, and also for the whole dream team of
the Talens association: once again Marie, Gwenaël and Arnaud, as well as Muy-Cheng. Finally, thank you
to Véronique Prouvost and Sophie Fermigier for helping me create PESU and for all their
support during these years, as well as to everyone who helped me in these projects and from
whom I learned so much about education, in particular Anne-Christine Lang, Nicole d’Anglejan,
Yannick Loiseau, Hervé Lefeuvre and Olivier Basso.
Finally, thank you to all my friends and above all to my family, in particular my mother, my brothers
Nghiep, Chanh and Hai, and my sisters-in-law, for their affection and support through
all the career changes during my studies. Special thanks go to my sister-in-law Emmanuelle for the countless discussions and debates we have had since my
adolescence, which helped me build myself intellectually and believe in my ability
to venture into the social sciences. My last thanks naturally go to my
husband, Matthieu, who has lived through, day by day, all the moments of happiness but also the hard
trials I have been through over the past ten years, and who has constantly supported me with love and
kindness.
Contents
Acknowledgements
Introduction
I Faculty Biases and the Gender Segregation across Fields
  Joint with Thomas Breda
  1 Background, data, and measures of stereotypes
  2 Methodology
  3 Results
  4 More on the identification assumption
  5 Discussion
  Appendix: On the handwriting detection test
II Persistent Classmates: How Familiarity with Peers Protects from Disruptive Transition
  Joint with Arnaud Riegert
  1 Institutional context and data
  2 Identification
  3 Results
  4 Robustness checks
  5 Discussion and conclusion
  Appendix
III A New School in Town: Public School Openings, Private School Choice and Academic Achievement
  1 Institutional context and data
  2 School openings, educational contexts and private school choice
  3 School openings and educational achievement
  4 Robustness and complementary analysis
  5 Conclusion
Table of contents
List of figures
List of tables
Introduction
Educational inequalities result not only from individual but also environmental factors. School
institutions themselves create a large diversity of contexts, whose causes and effects have raised
a deep interest in social sciences. In contrast to family factors, schooling conditions (schools,
classes, tracks and majors, teachers, etc.) stem from administrative rules that are controlled by
public authorities. As such, understanding their role in students’ outcomes may help design
efficient policy levers for mitigating school inequalities.
Unfortunately, this research agenda has faced a major empirical challenge that has blocked
most attempts to quantify the role of contexts in students’ outcomes. As long as students are
not randomly allocated to schools, classes, majors, peers or teachers, students may be observed
to perform differently in different contexts simply because they are different, not because their
contexts differ. Two kinds of solutions have been developed to overcome this crucial issue.
The first method consists in running controlled field experiments with random assignment to
contexts. For example, students may be randomly assigned to peers within classes (Duflo et al.,
2011) and/or to their teachers (Dee, 2004; Carrell et al., 2010). Field experiments are often the
most convincing in terms of unbiasedness, but real world interventions may affect how agents
behave, raising doubts about the external validity of their conclusions. For obvious reasons
of social acceptability, it is also almost impossible to allocate students randomly to schools or
majors for example.
The second type of method takes advantage of "natural experiments" or settings where the
allocation of students to contexts may be considered exogenous, at least locally. For example,
Angrist and Lavy (1999) use institutional rules that prohibit class sizes from exceeding 40 students
in Israeli public schools. As a consequence, class size drops suddenly when the cohort
size in a school crosses a multiple of 40 students. Comparing school cohorts
just above with cohorts just below a threshold provides an estimate of the causal effect of
class size, at least for similar cohort sizes. Other approaches use year-to-year variations in
cohort size or composition within schools (Hoxby, 2000). After controlling appropriately for
time trends, such variations may be viewed as unexpected (hence exogenous) shocks to school
composition and may be used to estimate the causal effect of school composition. For example,
schools may unexpectedly get more (or fewer) female students in a given cohort than usual. In
principle, comparing such cohorts with other cohorts within a school identifies the causal effect
of schools’ gender composition. Basically, such settings create natural "counterfactual" groups,
i.e. natural "control" and "treatment" groups that may be compared to estimate the causal
effect of school contexts.
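The class-size rule exploited by Angrist and Lavy (1999) can be sketched in a few lines. This is a minimal illustration, assuming a cohort is always split into the smallest number of equally sized classes that respects the cap; the function name `predicted_class_size` is ours.

```python
def predicted_class_size(enrollment: int, cap: int = 40) -> float:
    """Average class size when a cohort is split into the fewest
    classes that each hold at most `cap` students."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    n_classes = (enrollment - 1) // cap + 1  # a new class opens at 41, 81, ...
    return enrollment / n_classes


# Cohorts just below and just above a threshold have very different class sizes.
for cohort in (39, 40, 41, 80, 81):
    print(cohort, round(predicted_class_size(cohort), 1))
```

Cohorts of 40 and 41 students are nearly identical in size, yet the rule puts the first in a single class of 40 and the second in two classes of about 20. That discontinuity is what the comparison relies on.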
This thesis, Educational Contexts and School Inequalities, puts together three independent
research papers whose approaches clearly fit into this second category. Using unique datasets,
I implement natural experiment approaches to shed new light on debates regarding the role
of educational contexts on students’ achievement and school inequalities, and to formulate
insightful policy recommendations.
In Faculty Biases and the Gender Segregation across Fields, co-authored with Thomas Breda
(PSE), we investigate the link between how male-dominated a field is, and gender bias against
women in this field. Stereotypes and social norms influence females’ academic self-concept
and push females to choose humanities rather than science, with large-scale consequences for
inequalities in labor market outcomes. Do faculty reinforce this strong selection through their recruiting behavior and assessment of students’ skills? Thanks to the very particular framework of
the entrance exam of a French higher education institution (the École Normale Supérieure), we
are able to show the opposite: evaluation is biased in favor of females in more male-dominated
subjects (e.g. math, philosophy) and in favor of males in more female-dominated subjects (e.g.
literature, biology). This pattern induces a slight rebalancing of gender ratios between students recruited for research careers in science and humanities majors. The empirical strategy
takes advantage of the multiplicity of written and oral tests a single candidate has to take
in many subjects during these entrance examinations. We identify evaluation bias from systematic differences in students’ scores between oral tests (not gender blind) and anonymous
written tests (gender blind). By making comparisons of these oral-written score differences
across subjects for a given student, we are able to control both for students’ abilities in each
subject and their overall ability at oral compared to written exams. Gender differences in the
way oral-written score gaps evolve across subjects can thus be attributed to examiners’ biases
rather than students’ abilities.
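One way to make this comparison concrete is the following sketch. The notation is ours and purely illustrative; neither the symbols nor the linear form are taken from the paper. $O_{ij}$ and $W_{ij}$ denote candidate $i$'s oral and written scores in subject $j$, $F_i$ equals one for female candidates, and $M_j$ measures how male-dominated subject $j$ is.

```latex
\begin{equation*}
O_{ij} - W_{ij} \;=\; \alpha_i + \gamma_j + \beta \,\bigl(F_i \times M_j\bigr) + \varepsilon_{ij}
\end{equation*}
```

Taking the oral-written difference removes ability in subject $j$, the candidate effect $\alpha_i$ absorbs overall ability at oral compared to written exams, and the subject effect $\gamma_j$ absorbs any oral-written gap shared by all candidates in a subject. Under these assumptions, $\beta > 0$ would correspond to the pattern reported above: oral examiners favor females more in more male-dominated subjects.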
In Persistent Classmates: How Familiarity with Peers Protects from Disruptive Transition, co-authored with Arnaud Riegert, we examine the effect of classmates’ characteristics on
students’ achievement in high school, exploiting natural experiments occurring sporadically in
French high schools. High school principals do not know their first-year students at the time
they assign them to classes, so they do the allocation using only a limited set of information
available on their registration files. In some rare cases, they have to assign to separate classes
two or more students who look nearly identical, according to the information they observe in
their files. We provide strong evidence suggesting that such first-year students are randomly
assigned to their classes. When using these quasi-experiments to investigate the role of several
classmates’ characteristics, we find an important, positive effect of assignment with more persistent classmates, i.e. classmates who were already in the freshman’s class before high school.
We provide strong evidence that this result derives from the benefit of familiarity with peers,
rather than from some unobserved ability characteristics of these classmates. The magnitude
of the estimates suggests that grouping low-achieving freshmen who know each other could
decrease their current repetition rate by around 13 percent, and raise their graduation rate by
the same amount.
The last paper, A New School in Town: Public School Openings, Private School Choice
and Academic Achievement, is a recent research project where I evaluate the effects of new
public school openings on families’ preference for a public rather than a private school, and on
academic achievement. Using French data on 36 new public schools created between 2003 and
2010, I compare students living close to new schools before and after their openings, controlling
for time trends, year and neighborhood fixed effects. New schools strongly decrease the distance
students have to travel to school, reduce school size and overcrowding. I find some evidence,
though not very robust, of an improvement in academic achievement of around 7% of a standard
deviation. A more robust result is the drop in private-sector enrollment after elementary
school. After a new public school opens in the neighborhood, families are 18% less likely
to opt for a private middle school, most likely because of the new geographic proximity to a
public school. This result has important implications, as identifying the determinants of private
school choice is key for understanding how the private sector contributes to the inequality of
educational contexts.
Overall, this thesis draws new perspectives on school inequalities that all pertain to different
dimensions of educational contexts. In the first chapter, we show that the gender segregation
across fields affects faculty behavior as they examine students. Consequently, male and female
students face unequal environments, because faculty behave in a way that favors females
in male-dominated fields and males in female-dominated fields. Therefore, faculty do not seem
to be responsible for the gender segregation across fields, meaning that future research to
understand this persisting issue should focus more on the supply-side (why do females enroll
less in science at school?). In the second chapter, we show that students’ classmates in the first
year of high school matter a lot for achievement. Whereas the role of peer ability, gender or
race has been extensively investigated in the literature, our study reveals how knowing some peers
is enough to raise a student’s outcomes as she faces a new and difficult environment. The third
chapter studies how a school opening impacts local educational contexts (distance to school,
school size and overcrowding) and changes the way students choose to allocate themselves
across schools, in particular across public and private schools. It highlights the role of public
school proximity in the choice between the private and the public sector and thus adds to
our understanding of one of the major driving forces behind the large inequalities of schooling
conditions.
Bibliography
Angrist, Joshua D. and Victor Lavy (May 1999). “Using Maimonides’ Rule to Estimate the
Effect of Class Size on Scholastic Achievement”. In: The Quarterly Journal of Economics
114.2, pp. 533–575.
Carrell, Scott E., Marianne E. Page, and James E. West (Aug. 2010). “Sex and Science: How Professor Gender Perpetuates the Gender Gap”. In: The Quarterly Journal of
Economics 125.3, pp. 1101–1144.
Dee, Thomas S. (Feb. 2004). “Teachers, Race, and Student Achievement in a Randomized
Experiment”. In: The Review of Economics and Statistics 86.1, pp. 195–210.
Duflo, Esther, Pascaline Dupas, and Michael Kremer (Aug. 2011). “Peer Effects, Teacher
Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya”.
In: The American Economic Review 101.5, pp. 1739–1774.
Hoxby, Caroline M. (2000). Peer Effects in the Classroom: Learning from Gender and Race
Variation. NBER Working Paper 7867. National Bureau of Economic Research.
Chapter I
Faculty Biases and the Gender Segregation across Fields
Joint with Thomas Breda
We would like to thank Philippe Askenazy, Francesco Avvisati, Julie B. Cullen, Sandra McNally, Mathilde
Gaini, Julien Grenet, Eric Maurin, Thomas Piketty, Abel Schumann and Helge Thorsen for their helpful comments on this manuscript and the Ecole Normale Supérieure for allowing us access to their entrance exam
records. This research was supported by a grant from the CEPREMAP research center.
Although gender differences have disappeared or evolved in favor of girls in many educational outcomes, male and female students are still strongly segregated across majors (Bettinger
and Long, 2005; Carrell et al., 2010). Females are especially underrepresented in quantitative
science-related fields, leading to substantial gender gaps on the labor market as they comprise
only 25% of the science, technology, engineering and math workforce (National Science Foundation, 2006). Understanding the origin of these discrepancies is important from an economic
perspective: gender differences in entry into science careers account for a significant part of the
gender pay differential among college graduates (Brown and Corcoran, 1997; Weinberger, 1999;
Hunt et al., 2012) and may also reduce aggregate productivity (Weinberger, 1998).
Of all the potential explanations for the gender gap in science majors, a common idea is
that teachers and professors in those fields may be biased against girls (Bernard, 1979; Dusek
and Joseph, 1983; Madon et al., 1998; Tiedemann, 2000; Moss-Racusin et al., 2012). The
contribution of this paper is to test this hypothesis. We study whether the bias against females in
different academic fields varies systematically with the extent to which the fields are dominated
by males.
We use as a quasi-experimental setting the entrance exam of a top French higher education
institution, the Ecole Normale Supérieure (ENS), where students sit a broad series of both
written and oral tests in several subjects. Our strategy exploits the fact that the written
tests are blind (candidates’ gender is not known by the professor who grades the test) while
the oral tests are obviously not gender-blind. Provided that female handwriting cannot be
easily detected, which we show, written tests provide a counterfactual measure of students’
cognitive ability in each subject. We investigate how the bonus a given candidate gets in oral
tests (compared to written tests) varies across subjects, depending on her gender. This enables
us to control both for students’ abilities in each subject, and for students’ differences in abilities
between written and oral tests, as long as the latter are constant across subjects.
This "triple difference" approach reveals that the premium in oral tests for a given girl is
higher on average in more male-dominated subjects (e.g. mathematics and physics) compared to
more female-dominated ones (e.g. biology and foreign languages). This result is driven neither
by the gender of the examiners in oral tests nor by the student’s characteristics. We measure
how male- or female-dominated a field is by the share of females among professors and
associate professors in France. This measure appears to be closely correlated with individuals’
perceptions or field-specific stereotypes.
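The logic of the three differences can be illustrated on simulated data. The sketch below is not the authors' code or data: the sample sizes, the index `male_dom`, the noise levels and the bias parameter `true_beta` are all hypothetical. It takes, for each candidate, the oral-minus-written gap in each subject (first difference), the slope of that gap across subjects ordered by male domination (second difference), and then compares average slopes between female and male candidates (third difference).

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_subjects = 2000, 6
true_beta = 0.3  # hypothetical examiner bias, chosen for illustration

female = rng.integers(0, 2, size=n_students)          # 1 = female candidate
male_dom = np.linspace(-1.0, 1.0, n_subjects)         # hypothetical male-domination index
ability = rng.normal(size=(n_students, n_subjects))   # subject-specific ability
oral_skill = rng.normal(size=(n_students, 1))         # oral advantage, constant across subjects
subject_shift = rng.normal(size=(1, n_subjects))      # oral-written gap common to all candidates

written = ability + rng.normal(scale=0.5, size=(n_students, n_subjects))
oral = (ability + oral_skill + subject_shift
        + true_beta * female[:, None] * male_dom[None, :]
        + rng.normal(scale=0.5, size=(n_students, n_subjects)))

# First difference: oral minus written removes subject-specific ability.
gap = oral - written

# Second difference: per-candidate OLS slope of the gap on male_dom.
# Centering male_dom makes any candidate-level constant (oral_skill) drop out.
centered = male_dom - male_dom.mean()
slopes = (gap @ centered) / (centered @ centered)

# Third difference: female minus male average slope removes subject_shift.
beta_hat = slopes[female == 1].mean() - slopes[female == 0].mean()
print(f"true beta = {true_beta}, estimate = {beta_hat:.3f}")
```

Because a candidate's general oral advantage is constant across subjects, it drops out of the per-candidate slope, and any oral-written gap common to both genders in a subject drops out of the female-male comparison.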
Our identification strategy combines for the first time two different approaches already used
in the literature. Dee (2005, 2007) uses within-student comparisons across different subjects. However, he does not have a blind assessment that can be used as a counterfactual
measure of ability in each subject. A number of studies have used the difference-in-differences
approach between males’ and females’ gaps in blind and non-blind tests to identify discrimination (Blank, 1991; Rouse and Goldin, 2000). As double-difference strategies rely on comparisons between individuals, they may nevertheless be biased by gender-specific differences in individuals’
productivity between the blind and non-blind tests. This problem arises in the education literature that compares scores in anonymous national exams to scores given by students’ own
teachers (e.g. Lavy, 2008; Hinnerich et al., 2011). In these studies, scores given by teachers may
reflect both cognitive skills and the assessment of students’ behavior in the classroom over the
school year. In our setting, both written and oral test scores are given by examiners who have
no personal relationship with the students and receive the same official instruction of evaluating students’ cognitive skills. Our paper is also the first to combine comparisons of blind and
non-blind tests (such as Lavy, 2008; Hinnerich et al., 2011) with within-student comparisons
across subjects (such as Dee, 2005; Dee, 2007) to deal with the fact that blind and non-blind
tests may not pick up exactly the same skills.
The ENS entrance exams are also very appropriate to identify discrimination because blind
and non-blind assessments are almost simultaneous. The time lag between oral and written
tests is only two months, and students only know that they are eligible for the oral tests two
weeks before taking them. Neither do they know their scores in the written tests, so that
low-graders will not prepare more than high-graders for the oral tests. This contrasts with
comparisons between anonymous national exams and assessments by students’ own teachers
(e.g. Lavy, 2008), as well as with studies that use an institutional change from a non-blind
assessment to a blind assessment (e.g. Rouse and Goldin, 2000).
Our results could be biased if female candidates feel especially self-confident in male-dominated subjects and perform better in oral tests in these subjects, which may happen in such
a highly selected context. We provide strong evidence against this scenario. When they have to
choose an additional oral test, female candidates are a lot less likely to choose male-dominated
than female-dominated subjects. This is true even when we control for candidates’ abilities,
showing that female candidates are not especially self-confident in more male-dominated fields.
Female students are thus very unlikely to perform better in oral tests in those subjects. Even if
they were to allocate effort differently from males during the two-month period between written
and oral tests, they would invest more in their specialty, i.e. feminine subjects. Consequently,
we argue that differentials in candidates’ performance in oral and written tests can only bias
our estimates downwards, leading us to underestimate the real extent of examiners’ gender bias.
The pattern we find with our sample of candidates is similar to the observations in a number
of countries and situations in which girls usually do better in language tests and only slightly
less well in science tests1, but are a lot less likely to complete a science degree, even when
controlling for gender differences in abilities (see e.g. Weinberger, 2001). Our setting thus does not
exhibit any peculiarities on the supply side: female candidates behave exactly as (the
literature on) stereotypes would predict, as they shy away from male-dominated fields. This
similarity with other contexts and studies is reassuring from the point of view of the external
validity of our main results on discrimination.
This research complements a recent debate in the Proceedings of the National Academy of
Sciences on whether discrimination could explain part of the gender gap in science. Reviewing
the literature on the potential causes of women’s underrepresentation in science, Ceci and
Williams (2011) argue that there is no evidence that discrimination is one of these causes.
In contrast, Moss-Racusin et al. (2012) find a subtle bias towards males in science using a
correspondence study: reviewing fake applications for a job of lab manager in biology, chemistry
or experimental physics, recruiters were slightly more likely to choose male candidates than their
equally qualified female counterparts. In our view, the main limitations of this experiment are
that it omits the subjects that are the most male-dominated and key to the understanding of the
gender gap in science (math and theoretical physics in particular) and that it focuses on a management
position, for which we may suspect a glass-ceiling effect against women that has nothing to do
with science. Another is that recruiters in the Moss-Racusin et al. (2012) paper were told
that they were participating in an experiment. Our results are consistent with the Moss-Racusin
et al. (2012) study, but are inferred from a "real-life" context. Our key finding that examiners
favor females in more male-dominated fields lends support to Ceci and Williams’ (2011) idea that
explicit discrimination may not drive the gender gap in science. This idea is also consistent
with the literature on gender discrimination at school (Lindahl, 2007; Lavy, 2008; Hinnerich
et al., 2011; Kiss, 2013), which tends to find that teachers’ evaluation biases run against boys.
Even if not explicitly focused on science and on how evaluation biases vary across subjects,
those papers suggest that explicit discrimination against girls at school is difficult to find in a
wide variety of contexts.
1 In particular, gender differences in math and science test scores are now small in all developed countries. They fell in the 1980s and 1990s and remained constant or increased slightly in the 2000s, as shown
for example by PISA studies in 2003 and 2006 (http://pisacountry.acer.edu.au/index.php), and in 2009
(http://stats.oecd.org/PISA2009Profiles/).
The remainder of this paper is organized as follows. Section 1 describes the background
of the ENS entrance exams and the data. Section 2 presents our empirical strategy. Results
are set out in Section 3. Section 4 provides evidence supporting the identification assumption.
Section 5 discusses the possible mechanisms, the link with the literature on stereotypes and
discrimination, and concludes.
1 Background, data, and measures of stereotypes
1.1 Institutional background
1.1.1 The Paris Ecole Normale Supérieure
The French higher education system is said to be particularly selective: after high school, the
best students can enter a highly demanding two-year preparatory school that prepares them
for entrance exams for elite universities called Grandes Ecoles. About 10% of high school
graduates choose this curriculum and enroll in a specific track: the main historical tracks
are “Mathematics-Physics”, “Physics-Chemistry”, “Biology-Geology”, “Humanities”, and “Social
Sciences”. Students’ preparatory school tracks determine the Grandes Ecoles to which they
may apply and the subjects on which they will be tested. These Grandes Ecoles are divided
into four groups: 215 Ecoles d’Ingénieur for scientific and technical studies (the most famous is the
Ecole Polytechnique), a few hundred Business Schools, a few hundred schools for biology, agronomy
and veterinary studies, and three Ecoles Normales Supérieures (ENS). The number of places
available in each Grande Ecole is set and limited, such that the Grandes Ecoles entrance exams
are competitive.
The three ENS prepare students for high-level teaching and academic careers (about 80%
of their students go on to do a PhD). The Paris ENS on which this study focuses is the most
prestigious of them all and the annual entrance exams are designed to select the top students
with a set of highly demanding tests. The ENS are also the only general Grandes Ecoles: they
accept students from the five historical preparatory school tracks. Consequently, the entrance
exams for the Paris ENS are divided into five different competitive exams: candidates have
to apply for the competitive exam that corresponds to their track and are accordingly tested
on specific subjects. Each competitive exam comprises a first “eligibility” stage in the form of
handwritten tests in April (about 3,500 candidates all tracks taken together). All competitive
exam candidates are then ranked according to a weighted average of all written test scores
and the highest-ranking students are declared eligible for the second stage (the threshold is
track-specific for a total of about 500 eligible students). This second “admission” stage takes
place in June and consists of oral tests on the same subjects.2 Importantly, oral test examiners
may differ from the written test examiners and they do not know what grades students have
obtained in the written tests. Students are only informed about their eligibility for oral tests
two weeks before taking them and are also unaware of their scores in the written tests. Lastly,
eligible candidates on each track are ranked according to a weighted average of all written and
oral test scores and the highest-ranking candidates are admitted to the ENS. The admission
threshold is again competitive exam-specific and defined by law (see Table I.1, Panel A for the
average annual number of eligible and admitted candidates in each track).3
1.1.2 Oral tests at the ENS entrance exams
At other schools, oral tests do not necessarily have the same objective as written tests: for
instance, oral tests in French business school entrance exams include interviews that are explicit
personality tests. However, this is not the case with the ENS entrance exams. Officially, the
ENS entrance exams are supposed to assess solely candidates’ academic abilities in each subject
based on both written and oral tests and everything is done to ensure that examiners’ decisions
are as objective as possible.4
Oral tests can be seen as a way of getting an additional and potentially better gauge of
students’ academic skills. Examiners at oral tests may, in particular, want to check whether
candidates can answer difficult questions instantly, an ability that clearly reveals students’
command of the subject. But oral and written tests are based on the same syllabus and on the
same kind of exercises for each subject. This is shown in the reports that recruiting boards
publish each year for tests in each subject on each track.5
These reports describe the examination questions and the length of written tests, how oral
tests work (time allowed for preparation and presentation) and the type of questions asked, but
also examiners’ expectations for each test. They show that the cognitive skills that examiners
try to measure in written and oral tests are very similar.6
2
Eligible candidates for scientific tracks also have to take some written tests in the admission stage.
3
The general design of the exam with a first round of written tests and then oral tests for a subset of eligible
candidates is very common since it is identical for all French Grandes Ecoles. The oral tests are basically
designed to pinpoint the best candidates. They are usually given more weight, so that it is almost impossible
for students who perform badly at the oral tests to pass the exam.
4
For example, every written exam sheet is graded by two different examiners, which is admittedly a very
expensive procedure for the institution. Most oral tests are also evaluated by a panel of two or more interviewers.
5
The ENS website gives access to these reports. See http://www.ens.fr/spip.php?rubrique49 for humanities
tracks and http://www.ens.fr/spip.php?rubrique43 for scientific tracks.
1.2 Data
1.2.1 Candidates
The initial dataset is made up of the scores obtained by all candidates at all five competitive
exams from 2004 to 2009. We focus only on the roughly 500 students eligible for the oral exams
each year, for whom we have both a written and an oral score for each subject. The final
sample of 3,068 eligible candidates for the ENS entrance exam is described in Table I.1, Panel
A. A total of 36 % of these eligible candidates were actually admitted to the ENS.7 40 %
of both the eligible and admitted candidates were girls.8 However, the proportion of female
candidates varies dramatically across tracks. For example, girls only account for 9 % of the
candidates on the Math-Physics track whereas they account for 64 % of the candidates in
Humanities. Interestingly, the proportion of girls among admitted candidates is higher than
their proportion among eligible candidates only on the most scientific tracks.
1.2.2 Subjects
On each track, eligible candidates take a given set of written and oral exams in various subjects (see Table I.2). Unfortunately, a written blind test and an oral non-blind test are not
systematically taken in all subjects. We only consider the subjects for which there is both a
compulsory written test and a compulsory oral test for all students.9 This leaves us with a
calibrated sample of 25,644 test scores (half written, half oral). Depending on the track, there
are between two and six subjects for which all students are scored both at written and oral
tests (see Table I.2). The number of candidates taking both a compulsory written test and a
compulsory oral test may vary slightly from one subject to the next (within a track), because
a few students did not attend all tests (e.g. because of illness). On the Humanities track, the
number of candidates is lower for tests in Latin/Ancient Greek and Foreign Languages because
we only kept the data on students who chose the same language for both written and oral tests,
such that both call for the same abilities.10
6
For instance, the 2007 written philosophy test on the Humanities track consisted of a six-hour essay on the
question “Can we say anything we want?” (http://www.ens.fr/IMG/file/concours/2007/MP/mp_oral_math_ulc-u.pdf)
while the oral test consisted of a 30-minute presentation on a similar question drawn at random by the
student (http://www.ens.fr/IMG/file/concours/2007/AL/philosophie_epreuve_commune_oral.pdf). Reports
on the 2007 mathematics oral tests for Math-Physics track students also give specific examples of examination
questions (http://www.ens.fr/IMG/file/concours/2007/MP/mp_oral_math_ulc-u.pdf), which happen to be
very similar to those asked in the written tests (http://www.ens.fr/IMG/file/concours/2007/MP/mp_math_mpi1.pdf).
7
Only a very small fraction turned down the ENS’ offer of a place.
8
Observing the same proportion of girls within the pools of eligible and admitted candidates may seem
surprising, but it is just a coincidence. This pattern is not observed year by year.
9
In rare cases, students take two written or oral tests in the same subject. In that case, we have averaged
the candidates’ scores over the two tests in order to keep only one observation per triplet (student, subject,
type) where “type” differentiates written from oral tests.
On each track, candidates have some discretionary power to choose an additional optional
test among a set of possible subjects (e.g. computer science in the Math-Physics track). This
choice might be perceived by the examiners of optional tests as a signal of candidates’ interest
or ability. It may influence their grading behavior. To prevent our results from being driven by this
specific context, we have chosen to keep only tests that are mandatory for all candidates for
our baseline empirical analysis. Doing so, we make sure that the pool of candidates graded at
each pair of oral and written tests is exactly identical. Lastly, we do not use tests in foreign
languages in scientific tracks, as they account for less than 5 % of a candidate’s final average
grade. This makes them hard to compare to other tests as students prepare much less for these
tests and examiners may behave differently as the stakes are much lower.
1.2.3 Male- and female-dominated fields
To characterize how much a subject relates to a female- or male-dominated field, we use an
index Ij based on the proportion of women among professors (professeurs des universités) and
assistant professors (maîtres de conférences) working in the corresponding field in all French
universities.11 This choice is particularly relevant to our context because most of the students
recruited by the ENS go on to become researchers. The value of the index for each subject
j is given in parentheses in Table I.2.12 This index shows substantial variations of female
representation across academic fields. This is even true between fields on which the same
candidate may be tested within a track, i.e. between humanities fields or between scientific
fields. For example, 26 % of academics in philosophy and 57 % in foreign languages are female.
Similar disparities are observed in science, with e.g. 21 % in physics and 43 % in biology. These
variations within a track are not much lower than those found across all subjects (the largest
gap is found between math and foreign languages, 57 − 15 = 42 percentage points). This is key in our study,
as we need subjects’ degree of femininity to vary sufficiently within tracks to estimate its link
with examiners’ gender bias, whilst controlling for individual fixed effects (see below in section
2).
10
68 % of the students on the Humanities track chose Latin. The remaining 32 % chose Ancient Greek. The
foreign languages were English (69 %), German (24 %), Spanish (4 %) and other languages (3 %).
11
Statistics available at the French Ministry of Higher Education and Research website
(http://media.enseignementsup-recherche.gouv.fr/file/statistiques/20/9/demog07fniv2_23520_49209.pdf).
Selecting only professors and associate professors to build our index does not affect our results.
12
One may wonder whether this measure accords with people’s subjective perception of how "masculine"
or "feminine" a subject is. To explore this, we built another index by averaging the perceptions of a small
(non-random) sample of individuals asked to rank how female they believe each subject to be on a scale of
0 to 10. Not surprisingly, results for both indices are very similar, suggesting that the proportion of female
1.2.4 Test scores
All tests are initially scored between 0 and 20. We transform these scores into percentile ranks
for each test, i.e. separately by year ∗ track ∗ subject ∗ oral/written.13
We conduct this transformation for the following reasons. First, we focus on a competitive
exam. Candidates are not expected to achieve a given score, but only to be ranked within the
predefined number of available places. As only ranks matter, interpreting our results in terms
of gains or losses in rankings makes sense. Second, the initial test score distributions for the
written and oral tests are very different. This is because our sample contains only the best
candidates following the eligibility stage, who all tend to get good grades in written tests.
However, examiners expect a higher average level from these candidates in oral tests and try to
use the full spread of available grades in their marking, such that the distribution of scores in the
oral tests has a lower mean and is more spread out between 0 and 20. Figure I.1 gives the oral
and written test score distributions for female and male candidates on each track and confirms
this observation.14 Transforming scores into percentile ranks is the most natural way of keeping
only the ordinal information in an outcome variable and of getting rid of all meaningless quantitative
(or cardinal) differences between the units of interest, so that comparisons do not
reflect the magnitude of these meaningless quantitative differences.
academics in each field is strongly related to the stereotype content of each subject.
13
The percentiles are computed by including only eligible candidates, i.e. candidates who take both written
and oral tests.
14
Figure I.1 includes, on each track, all subjects for which there is both a compulsory written test and a
compulsory oral test. It also shows that when all subjects in a track are grouped together, the distributions
of scores in written tests for female and male candidates are remarkably similar for most tracks. There is
only a small difference in the Math-Physics track where the distribution of females’ written test scores appears
narrower.
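As an illustration of this transformation, the following Python sketch computes percentile ranks separately for each year ∗ track ∗ subject ∗ oral/written cell. It is not the code used for the analysis: the tuple layout, the function name `percentile_ranks` and the mid-rank treatment of ties are assumptions made for the example, and the scores shown are made up.

```python
from collections import defaultdict


def percentile_ranks(rows):
    """Return {(year, track, subject, kind, student): rank in (0, 1)}.

    rows: iterable of (year, track, subject, kind, student, score) tuples,
    with kind equal to "written" or "oral". Ranks are computed within each
    (year, track, subject, kind) test; tied scores share the same mid-rank
    (an assumption, the text does not say how ties are handled).
    """
    by_test = defaultdict(list)
    for year, track, subject, kind, student, score in rows:
        by_test[(year, track, subject, kind)].append((student, score))

    ranks = {}
    for test, entries in by_test.items():
        scores = [score for _, score in entries]
        n = len(scores)
        for student, score in entries:
            below = sum(1 for s in scores if s < score)
            tied = sum(1 for s in scores if s == score)  # counts the student too
            ranks[test + (student,)] = (below + 0.5 * tied) / n
    return ranks


# Made-up scores (out of 20) for one hypothetical pair of tests.
rows = [
    (2007, "Humanities", "philosophy", "written", "a", 8.0),
    (2007, "Humanities", "philosophy", "written", "b", 12.0),
    (2007, "Humanities", "philosophy", "written", "c", 12.0),
    (2007, "Humanities", "philosophy", "written", "d", 16.0),
    (2007, "Humanities", "philosophy", "oral", "a", 15.0),
    (2007, "Humanities", "philosophy", "oral", "b", 5.0),
    (2007, "Humanities", "philosophy", "oral", "c", 10.0),
    (2007, "Humanities", "philosophy", "oral", "d", 20.0),
]
ranks = percentile_ranks(rows)

# Oral minus written percentile rank for student "a" in this subject.
key = (2007, "Humanities", "philosophy")
gap_a = ranks[key + ("oral", "a")] - ranks[key + ("written", "a")]
```

Because ranks are computed within each test, the different spreads of the raw written and oral score distributions play no role in the resulting gap.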
1.3 Evidence of gender rebalancing at oral tests
In Panel B of Table I.1, we do a small counterfactual exercise. We compute the number and proportion of
young women who would have been admitted if the exam had consisted only of the written
tests of the eligibility stage. We then compare them to the number and proportion of girls actually admitted to
the ENS. We repeat this exercise for each track over the period 2004-2009.
If the eligibility stage had been the one and only exam, the proportion and number of girls
among admitted candidates would have been 4 % higher (in relative terms) than the actual
proportion and number of girls among accepted candidates (column I). However, this statistic
varies strongly across tracks. On the Math-Physics track, the number of admitted girls is as
much as 55 % higher than it would have been if the exam had stopped after the written tests.
This number is still positive on the Physics-Chemistry track, but dips into the negative on
other tracks. These results already suggest that the minority gender in each track seems to
be favored at oral tests, rebalancing the gender ratio across tracks in the final population of
students admitted.
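The logic of this counterfactual can be sketched in a few lines of Python. The candidates, scores and number of places below are invented for illustration, and the field names (`written`, `final`, `female`) are assumptions; the sketch simply takes the two averages as given rather than recomputing them from weighted test scores.

```python
def female_share_admitted(candidates, score_key, n_places):
    """Share of women among the n_places best candidates on score_key."""
    ranked = sorted(candidates, key=lambda c: c[score_key], reverse=True)
    admitted = ranked[:n_places]
    return sum(c["female"] for c in admitted) / n_places


# Hypothetical track with six eligible candidates and three places.
# "written" is the average over written tests only, "final" the average
# over written and oral tests (both made up).
candidates = [
    {"id": "A", "female": 1, "written": 15.0, "final": 14.2},
    {"id": "B", "female": 0, "written": 14.0, "final": 15.0},
    {"id": "C", "female": 0, "written": 13.0, "final": 14.0},
    {"id": "D", "female": 1, "written": 12.0, "final": 14.5},
    {"id": "E", "female": 0, "written": 11.0, "final": 12.0},
    {"id": "F", "female": 1, "written": 10.0, "final": 11.0},
]

counterfactual = female_share_admitted(candidates, "written", 3)  # written tests only
actual = female_share_admitted(candidates, "final", 3)            # written and oral tests

# Positive when more women are admitted than under a written-only ranking.
relative_gap = actual / counterfactual - 1
```

In this toy track the oral tests move one woman above the admission threshold, so the relative gap is positive, which is the pattern reported for the most male-dominated tracks.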
2 Methodology
The goal of this paper is to estimate how examiners’ gender bias at oral tests varies by subject
at the ENS entrance exams. The notion of "examiners’ gender bias" encompasses everything
in examiners’ behavior that favors one gender relative to the other. It can either be direct
discrimination, or subtler behaviors such as offering a greater level of comfort to one
gender relative to the other.
For this purpose, we investigate how the oral-written score gap evolves across subjects for
females and males. We account for individual and subject heterogeneity in the oral-written
gap, using the following model:
$$\Delta R_{ij} = \beta \cdot F_i \cdot I_j + \gamma_j + \mu_i + \epsilon_{ij} \tag{I.1}$$
where ∆Rij equals the oral minus the written test percentile ranks of student i in subject j.
Fi is an indicator equal to 1 for female candidates and Ij is the index measuring how female-dominated
subject j is (see section 1.2.3). µi captures individual heterogeneity in the oral-written
test gap. γj captures the average gap in each subject. In practice, we actually control
for the average gap in each examiner panel (year ∗ track ∗ subject), but we present only the j
subscript for simplicity. εij represents individual-subject specific shocks to ∆Rij. In particular,
εij may be triggered by specific skills of candidate i in subject j that affect her written
and oral performances differently. If, for example, self-confidence matters more in oral than written tests,
then εij would capture any subject-specific level of self-confidence of candidate i.
β is the parameter of interest, i.e. the change in examiners’ bias towards females when the
subject is more feminine. The inclusion of individual fixed effects implies that β is estimated
using only differences within-student and between-subject, which gives the strategy the flavor
of a difference-in-difference-in-differences method. Females and males may have different oral and
written abilities: β is identified as long as these differences are subject-independent (discussed
later on). Put another way, a candidate’s oral versus written test abilities may differ
between fields, but not in a way that differs systematically for males and females.
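To make the role of the fixed effects concrete, here is a minimal Python sketch of how β in model I.1 could be estimated on a balanced panel in which every student takes every subject. It is an illustration rather than the estimation code used in the paper: the function names `estimate_beta` and `simulate`, the simulated data and the index values are assumptions, and the sketch ignores the examiner-panel (year ∗ track ∗ subject) effects and any adjustment of standard errors. With a balanced panel, removing student and subject means from both sides is equivalent to including the µi and γj dummies.

```python
import random


def estimate_beta(delta_r, female, index):
    """Within estimator of beta in model I.1 for a balanced panel.

    delta_r[i][j]: oral minus written percentile rank of student i in subject j
    female[i]:     1 if student i is female, 0 otherwise
    index[j]:      share of women among academics in subject j
    """
    n, n_subj = len(female), len(index)
    x = [[female[i] * index[j] for j in range(n_subj)] for i in range(n)]

    def demean(m):
        # Remove student (row) and subject (column) means, add back the grand mean.
        row = [sum(r) / n_subj for r in m]
        col = [sum(m[i][j] for i in range(n)) / n for j in range(n_subj)]
        grand = sum(row) / n
        return [[m[i][j] - row[i] - col[j] + grand for j in range(n_subj)]
                for i in range(n)]

    xt, yt = demean(x), demean(delta_r)
    num = sum(xt[i][j] * yt[i][j] for i in range(n) for j in range(n_subj))
    den = sum(xt[i][j] ** 2 for i in range(n) for j in range(n_subj))
    return num / den


def simulate(n, index, beta, noise_sd, seed=0):
    """Simulate a hypothetical track whose rank gaps follow model I.1."""
    rng = random.Random(seed)
    female = [i % 2 for i in range(n)]
    gamma = [rng.gauss(0, 0.1) for _ in index]    # subject effects
    mu = [rng.gauss(0, 0.1) for _ in range(n)]    # student effects
    delta_r = [[beta * female[i] * index[j] + gamma[j] + mu[i]
                + rng.gauss(0, noise_sd)
                for j in range(len(index))] for i in range(n)]
    return delta_r, female


index = [0.26, 0.40, 0.57]  # illustrative index values for three subjects
delta_r, female = simulate(n=500, index=index, beta=-0.3, noise_sd=0.05)
beta_hat = estimate_beta(delta_r, female, index)  # close to -0.3 up to sampling noise
```

The student effects µi and subject effects γj drop out entirely after demeaning, so only the way the female-male difference in the gap varies with Ij across subjects identifies β.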
As model I.1 controls for individual fixed effects, β is estimated using only variations in
∆Rij observed between the subset of subjects on which a given candidate is tested, depending
on her track (see again Table I.2). Strictly speaking, the estimates should only be used to
compare two subjects in which the same candidate may be tested in a track (not math and
French literature for example). Accordingly, β has to be interpreted in a relative way. For
example, β = −0.5 means that females lose 5 percentile ranks on average by switching to a
subject that is 10 percentage points more feminine than another subject in their track, due only
to differences in examiners’ gender bias between both fields.
From this perspective, tracks are framed in such a way that we mostly compare humanities
subjects (e.g. philosophy vs. literature), or scientific subjects (e.g. physics vs. chemistry). In
fact, this is an important advantage for the credibility of our identification. The oral-written
score gap may not be affected to the same extent in each subject by non-cognitive gender-related
skills. For instance, handwriting skills (resp. oral proficiency) may matter more for
written (resp. oral) tests in humanities than in scientific subjects. If the average quality of
handwriting (resp. speaking) differs between males and females, comparing oral-written score
gaps across subjects may be problematic. As a matter of fact, comparing humanities with
humanities and sciences with sciences only makes us focus exclusively on subjects in which both
oral and written tests are set up very similarly. There are very similar requirements for subjects
compared on each track (Table I.2): there is no obvious reason to think that the oral-written
score gap captures different non-cognitive skills between history and literature (Humanities
and Social Sciences tracks), between biology and geology (Biology-Geology track), or between
physics and chemistry (Physics-Chemistry and Biology-Geology tracks). The only exception to
this pattern is math in the Social Sciences track. Therefore, we will systematically check that
our results are robust to removing these latter test scores from the analysis.
3 Results
3.1 Examiners’ bias toward the under-represented gender
Table I.3 presents the β parameter in model I.1 estimated by OLS. Standard errors are clustered
at the level of each examiner panel, that is at the year ∗ track ∗ subject level. We use data for
19 track ∗ subjects and six years, giving us a total of 114 examiner panels.
We find that switching from a subject with no male professors to one with no female professors leads
female candidates to gain about 30 percentile ranks in the scores’ c.d.f. (column I). Switching
from a subject as feminine as biology (Ij = 0.43) to a subject as masculine as physics (Ij = 0.21)
leads female candidates to gain an average 7 percentile ranks in oral tests with respect to
written tests. A difference in proportional rank of .07 is equivalent to about .25 standard
deviations (given that the standard deviation of a uniform [0,1] distribution can be shown to
be .289). Similarly, males benefit from a 9 percentile rank premium (33 % of a s.d.) on average
at oral tests in foreign languages (Ij = 0.57) relative to philosophy (Ij = 0.26).15
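The orders of magnitude quoted above can be checked by hand. Taking β ≈ −0.30 (the column I estimate) and the standard deviation of a uniform distribution on [0,1], the implied oral-written premium for females when moving from Ij = 0.43 to Ij = 0.21 is roughly:

```latex
\[
\beta \,(0.21 - 0.43) \approx (-0.30)\times(-0.22) \approx 0.07,
\qquad
\sigma_{U[0,1]} = \frac{1}{\sqrt{12}} \approx 0.289,
\qquad
\frac{0.07}{0.289} \approx 0.24 .
\]
```

The same calculation for Ij = 0.57 versus Ij = 0.26 gives (−0.30) × 0.31 ≈ −0.09 for females, i.e. a premium of about 9 percentile ranks for males.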
We check that our results are not driven by students’ characteristics that may be correlated
to gender. For instance, social background might be of particular importance. The seminal
work by Bourdieu (1989) shows that applicants with legacies have better chances of entering
the French Grandes Ecoles and that female students trying their chance in core science tracks
are from an even higher social background than their male counterparts. The effect of social
background might be particularly strong in oral examinations where it may be more visible.
As our analysis relies on within-student comparisons, students’ characteristics will bias our
estimates only if they affect students’ oral vs. written performance differently across subjects.
For example, a bias would appear if females are more often upper-class than males in the
Physics-Chemistry track, and if upper-class candidates perform better on Physics oral tests
than on Chemistry ones (relative to their corresponding performance at written tests). To
deal with this potential issue, we replicate the results after controlling for the subject-specific
effects of students’ observable characteristics presented in Table I.1 (Panel B): father and mother’s
occupation, honors obtained at the Baccalaureat exam at the end of high school, preparatory
school quality and repeated year status.16 As shown in column II (Table I.3), the β estimate
remains basically unchanged.
15
We do two quick robustness checks at this stage.
First, as argued in section 2, one may prefer to stick to comparisons between humanities subjects or between
scientific subjects to make the identification even more credible. We do so by estimating the same model after
removing test scores in math in the Social Sciences track. Reassuringly, the estimate increases slightly in both
magnitude (from −.301 to −.357) and precision (the standard error drops from .085 to .080).
Second, the estimate presented in column I gives an equal weight to all subjects. Yet, each subject does not
have the same weight in candidates’ final score and students may adjust their efforts accordingly. Although it
is unclear why this could yield any bias, we checked whether our results were robust to weighting each subject
by its relative importance within all oral exams of the candidate’s track. The results are virtually unchanged.
Our baseline specification assumes that the return to the candidates’ true ability is identical
at oral and written tests. However, it is possible that candidates’ true ability is harder to observe
at oral tests than at written tests (or vice versa). The return to candidates’ true ability would then be
lower at oral tests, penalizing good candidates more. Suppose now that females are better
than males in the most feminine subjects whereas the opposite is true in the most masculine
ones. In that case, our results could simply be a reflection of the greater test noise at oral tests.
A way to deal with this is to include in our first-difference regression model an alternative
measure of ability as a control (see Lavy, 2008). We do so for each candidate and subject by
controlling for the candidate’s grade in the subject at the Baccalaureat exam (corresponding
to ‘A’ levels, taken two years before the ENS entrance exam). Here, we lose about one half
of the candidates from the sample, as they cannot be matched to the national Baccalaureat grade
records. Again, the results are virtually unchanged (Table I.3, column III).17 Taken together,
the estimates in columns II and III are strong evidence suggesting that the differences in the
oral-written score gap across subjects are not driven by students’ abilities.
16
In practice, all student characteristic dummies were interacted with subject dummies (except for the
reference subject) and added into model I.1. The sample size is smaller because these observable characteristics
are only available from 2006 onwards.
17
We also investigated directly differences in test noise between the oral and the written tests. We find that
the correlations between test scores at the ENS exam and the Baccalaureat grades in the corresponding subject
are very close whether we consider only written tests or only oral tests. This suggests that oral tests are not
noisier than written tests.
The richness of the data allows us to do one more test on this: in the Math-Physics track, candidates take
two distinct mandatory written math tests and two distinct mandatory oral math tests. In two regressions
of candidates’ grades at oral or written tests on individual fixed effects, we find that individual fixed effects
explain 63 % (resp. 72 %) of the variance in percentile ranks at the two written (resp. oral) math test scores.
As the individual fixed effects in such specifications should account for candidates’ intrinsic ability in math, the
unexplained part can arguably be attributed to test noise. As this unexplained part is larger at written tests,
we confirm that in math, oral tests are not noisier than written tests.
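The variance decomposition described in footnote 17 can be illustrated with a short Python sketch. The data below are invented and the function name is hypothetical; the sketch assumes that the share of variance "explained" by individual fixed effects is the plain R-squared of a regression of test ranks on student dummies, which the text does not state explicitly.

```python
def share_explained_by_student(ranks):
    """R-squared of a regression of test ranks on student fixed effects.

    ranks: {student: [rank on test 1, rank on test 2, ...]}
    The fitted value for each test is the student's own mean rank.
    """
    values = [v for tests in ranks.values() for v in tests]
    grand_mean = sum(values) / len(values)
    total = sum((v - grand_mean) ** 2 for v in values)
    residual = sum((v - sum(tests) / len(tests)) ** 2
                   for tests in ranks.values() for v in tests)
    return 1 - residual / total


# Made-up percentile ranks on two math tests of the same type.
written_math = {"a": [0.9, 0.8], "b": [0.5, 0.6], "c": [0.2, 0.1]}
r2 = share_explained_by_student(written_math)  # 1 - r2 is read as test noise
```

Running the same function on the two oral tests and comparing the two shares mirrors the comparison made in the footnote: the test type with the larger unexplained share is the noisier one.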
3.2 Robustness checks
One might worry that the result presented in Table I.3 is solely driven by a few examination
boards with a particular behavior. To demonstrate the consistency of the pattern, we decompose
the analysis in two distinct ways.
3.2.1 Subject-by-subject comparisons
First, we check within each track whether examiners’ gender bias goes in favor of females
relative to the most feminine subject.18 To do so, we estimate the following model for each
track:
$$\Delta R_{ij} = \sum_{j \in \Omega_i} (\gamma_j + \beta_j \cdot F_i) + \mu_i + \epsilon_{ij} \tag{I.2}$$
where Ωi is the range of subjects taken by candidate i depending on her track, except for the
most feminine one. Again, we control for individual fixed effects to exploit only within-student
and between-subject comparisons. Consequently, the estimated examiners’ gender biases in all
subjects are only interpretable relative to this most feminine subject.
Column I of Table I.4 reports the βj OLS estimates from model I.2 for each subject and
track. As in Table I.3, columns II and III add controls for individual characteristics interacted
with subjects, and for Baccalaureat score in the subject (not available for social sciences and
Latin/Ancient Greek). Except for the Math-Physics track where female representation is quite
similar in math and physics, all estimates are positive and most of them are statistically different
from the reference subject. For example, the estimate for physics in the Physics-Chemistry track
is 0.133, meaning that females benefit from a 13 percentile rank premium on average between
oral and written tests in physics relative to chemistry. We find similar estimates in other tracks.
In particular, the most robust and precise estimates are in geology relative to biology (Biology-Geology
track, panel C), in philosophy relative to literature (Social Sciences track, panel D),
and in philosophy or literature relative to foreign languages (Humanities track, panel E).
However, the point estimates for the different subjects are not systematically decreasing
with the proportion of females in the corresponding field. The evidence would be fully compelling
if for each pair of subjects in each track, the estimate for the more male-dominated subject
in the pair was the highest one. That is not the case for math as compared to physics in the
"Physics-Chemistry" track, for physics as compared to geology or chemistry in the "Biology-Geology"
track, and for history as compared to literature or philosophy in the Humanities track.
In the Social Sciences track, the estimate for math compared to literature also does not fit the
pattern, but remember that estimates based on comparisons between scientific and humanities
subjects are probably biased (see again section 2). In total, if we exclude this last estimate,
20 pairwise comparisons out of 26 fit our general evidence, and 6 go in the opposite direction.
None of these 6 exceptions is statistically significant at the 5% level, and they could well be due to
statistical error as our estimates tend to have relatively high standard errors. If we restrict ourselves to
pairwise comparisons that are significant at the 5% level, we get 14 pairs satisfying our general
results and 0 pairs going in the opposite direction.
18
The most feminine subject is physics in the Math-Physics track, chemistry in the Physics-Chemistry track,
biology in the Biology-Geology track, literature in the Social Sciences track, and foreign languages in the
Humanities track.
Overall, the pattern observed in Table I.3 is robust in all tracks where comparisons across
subjects are relevant. The results hold both among science subjects and humanities subjects,
and for four different samples of candidates with very different characteristics and very different
types of abilities. These four samples are not random and our estimates should be viewed as
local average treatment effects, as they concern specific individuals who self-selected
into a given track. The fact that our results hold for very different subsamples of candidates is
an additional indication that they do capture differences in examiners’ behavior. If they were
reflecting differences in students’ performance, the pattern would probably appear much less
stable across tracks, since gender differences in candidates’ characteristics vary a lot depending
on the track.
3.2.2 Robustness across years
Second, we check that our results are robust across time by presenting separate estimates of
equation I.1 for each track and year in our data (except the “Math-Physics” track, in which
we consider math and physics as too similar in terms of female representation to make any
comparison relevant). Out of 24 track-year samples, we find the expected negative relationship
between the relative female domination in a subject and examiner bias in favor of females in
21 cases (Table I.5). There are only 3 exceptions: “Physics-Chemistry” in 2006 and 2007 and
“Social Sciences” in 2006 (see figures in bold). In all of these exceptions, the results are not
significant.
3.3 The role of examiner gender
A large body of literature studies the relationship between examiners’ gender and gender discrimination.19 This literature provides mixed results that are presumably context-dependent,
especially with regard to the “gender content” of the setting they consider. Our results could
be driven by the examiners’ gender. The index of feminization captures precisely the share of
females among academics in France. If this translates exactly into the gender composition of
the ENS’ examiner panels, then examiners in more masculine subjects may also more often be
male professors, which could drive our results if they have a positive bias in favor of female
candidates.
We provide two pieces of evidence against this interpretation. First, Table I.6 reports the average,
minimum and maximum shares of females on the oral test examiner panels over the 2004-2009
period. The gender composition of examiners is fairly constant across subjects for almost every
track, except for the Humanities track. For all other tracks at least, it seems very unlikely that
examiners’ gender is the sole underlying driver of examiners’ gender bias. Second, our results
are virtually unchanged when controlling for the examiner panels’ female share (Table I.3 and
Table I.4, column IV). Formally, we add to model I.1 the examiner panels’ female share and
its interaction with the candidates’ gender. As the female share in examiner panels is defined
at the year ∗ track ∗ subject level, the model exploits its year-to-year variations (see Table I.6,
figures in brackets) to disentangle its effect from the effect of the subject’s degree of femininity
(defined at the subject ∗ track level only).
We report the estimated effect of the examiner panels’ female share for females in Table I.3
(column IV). It is very small and not statistically significant at the 5 % level, suggesting that
examiners’ gender does not affect their bias in favor of a gender. This result may seem at odds
with previous studies on the topic (see footnote 19).
19
Broder (1993) finds that female authors applying for grants to the U.S. National Science Foundation (NSF)
have lower chances of success when assessed by female reviewers than when assessed by their male colleagues.
Bagues and Esteve-Volart (2010) find a similar opposite-gender preference among the hiring committees of the
Spanish Judiciary. By contrast, a same-gender preference seems to exist in academic promotion committees
in Italy (De Paola and Scoppa, 2011) and Spain (Zinovyeva and Bagues, 2011). Finally, Booth and Leigh
(2010) test for gender discrimination by sending fake CVs to apply for entry-level jobs and find that female
candidates are more likely to receive a callback, with the difference being largest in occupations that are more
female-dominated.
4 More on the identification assumption
4.1 Are candidates over-confident when under-represented?
Our identification assumption is that students’ productivity at oral versus written tests may
differ across fields, but not in a way that differs for males and females (particularly not in
a way that is proportional to the share of female academics in the field). In particular, this
assumption could be violated if females (males) perceive themselves as particularly good in
male-dominant (female-dominant) fields, compared to other fields, and if confidence in one’s
ability affects performance more in oral than in written tests.20
It is possible to test for students’ confidence with regard to the different fields, by looking
at their decisions when they have to choose a specialty subject (see section 1.2.2). This choice
is made before the exam starts and leads candidates either to assign a greater weight to the
oral tests corresponding to their specialty, or to take an additional oral test in their specialty
subject. We focus on the Physics-Chemistry, Biology-Geology and Humanities tracks, where
the choice of a specialty subject has to be made from among the compulsory subjects taken
by all students on the track, that is, the subjects we have studied in our baseline analysis.
Figure I.2 shows that females mainly choose the most feminine subject for their specialty oral
test. For example, on the Physics-Chemistry track, 26 % of students who chose Chemistry as
their specialty subject were females, versus only 9.5 % for the Physics specialty.
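The statistic plotted in Figure I.2 is simply the share of females among the candidates who chose each specialty. A minimal sketch of that computation follows; the function name and the toy records are hypothetical and are not the ENS data.

```python
from collections import defaultdict

def female_share_by_specialty(candidates):
    """Return {specialty: share of females among candidates who chose it}."""
    counts = defaultdict(lambda: [0, 0])  # specialty -> [n_females, n_total]
    for specialty, is_female in candidates:
        counts[specialty][0] += int(is_female)
        counts[specialty][1] += 1
    return {s: f / n for s, (f, n) in counts.items()}

# Hypothetical toy records (specialty, is_female), for illustration only.
toy = (
    [("Chemistry", True)] + [("Chemistry", False)] * 3
    + [("Physics", True)] + [("Physics", False)] * 9
)
shares = female_share_by_specialty(toy)
for specialty, share in shares.items():
    print(f"{specialty}: {share:.0%} females")
```

With real data, each record would be one candidate on a given track, and the shares would be computed track by track.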
This pattern remains true even if we control for students’ ability. We consider the following
model:
$$\text{Specialty}_{ij} = \sum_{j \in \text{Specialties}} \left( \gamma_j + \beta_j \cdot F_i + A^{W}_{ij} \right) + \mu_i + \epsilon_{ij} \qquad \text{(I.3)}$$
where $\text{Specialty}_{ij}$ is equal to one if candidate i has chosen subject j as a specialty. $A^{W}_{ij}$ is a linear
control for the score of candidate i in the written test in subject j that picks up subject-specific
ability. We restrict our sample to the tracks mentioned above and to subjects that can be
chosen as specialties. Results, presented in Table I.7, are striking. On the Physics-Chemistry
track, for example, females are about 50 % more likely than males to choose chemistry rather
than physics as their specialty oral test, even controlling for ability. Similar results are found
on the two other tracks. Overall, when pooling the three tracks using the index of female
dominance, we find that a subject with 10 % more females is 50 % more likely to be chosen by
female candidates than by male candidates of similar ability. We also try other specifications to
test the robustness of this result. In column II, we control for oral test scores in each subject
instead of written test scores. In column III, we control for both test scores and allow for
non-linearities using dummies per decile. These figures suggest that, on average, candidates
are not especially self-confident in oral tests in fields where their gender is underrepresented.21
20 In the same spirit, the way questions in written (oral) tests are framed could unintentionally favor (penalize)
the dominant gender in the field. As we already argued in section 2, however, this is unlikely since we restrict our
comparisons to subjects that are framed similarly for a given candidate.
Our results on examiner behavior could still be driven by those few females (males) who
unexpectedly choose masculine (feminine) specialties and may thus prepare more for subjects in
which they are under-represented. We replicate our baseline results after excluding from the
sample either females who choose masculine specialties, males who choose feminine specialties,
or both.22 The results are very robust to limiting the sample in these ways (see Table I.8).
4.2 What if written tests are not really blind?
Our proposed identification strategy relies on the assumption that examiners cannot identify
gender in written tests and that it is only revealed in oral tests. However, they may be able
to distinguish between female and male handwriting. Gender may thus be detected in written
tests. We argue that this problem is not likely to be important.
First, grading a supposedly female-handwritten test is very different from being in the
physical presence of a female or male candidate in an oral exam. We can thus expect behavior
toward females – positive or negative – to be stronger in an oral test than a written test. More
importantly, the fact that written tests are not perfectly blind to gender should only lead us to
underestimate gender discrimination, because there is no reason for professors to discriminate
in different directions in written and oral tests. In the extreme case where gender is perfectly
detectable in written tests and affects the jury similarly in both written and oral tests, we
should not find any difference between male and female gaps between the oral and written
tests.
21 Choosing a subject as a specialty increases its weight in the calculation of the candidates’ final ranking. If
females choose feminine specialties, they have clear incentives to prepare more for feminine subjects to maximize
their chances of admission to the ENS. This could bias our estimate, but in the opposite way, i.e. the relative
positive examiner bias for females may be underestimated because of the more intense preparation by females in
more feminine subjects.
22 We consider as masculine specialties: physics on the Physics-Chemistry track, geology on the Biology-Geology
track and philosophy or history on the Humanities track. Other subjects in those tracks are considered
as feminine.
Second, it is highly unlikely that examiners in written tests manage to systematically guess
the candidate’s gender. To support this idea, we conducted an actual handwriting test where
researchers or advanced PhD students at the Paris School of Economics had to guess the gender
of 118 graduate students from their handwritten anonymous exam sheets. The percentage of
correct guesses was 68.6 %, far from perfect detection, albeit significantly higher than the 50 %
that would be obtained from random guessing (see the Online Appendix for more
details on the experiment).
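To give a sense of what "significantly higher than 50 %" means here, the sketch below runs an exact one-sided binomial test. It rests on two assumptions that are ours and not taken from the experiment: that there is one independent guess per exam sheet (118 guesses in total) and that 81 of them were correct, which gives the reported 68.6 %. The actual design, described in the Online Appendix, may involve several raters per sheet.

```python
from math import comb

def binom_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_guesses = 118  # assumption: one independent guess per exam sheet
n_correct = 81   # assumption: 81/118 matches the 68.6 % reported in the text
p_value = binom_upper_tail(n_correct, n_guesses)
print(f"share correct = {n_correct / n_guesses:.3f}, one-sided p-value = {p_value:.2g}")
```

Under these assumptions, random guessing is clearly rejected, while roughly one guess in three is still wrong.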
Finally, examiners may be sensitive to the quality of handwriting, which is usually alleged
to be higher for women. Even if examiners have no gender bias in written tests, they may
give better scores on average to female candidates because of their better handwriting. Our
“triple difference” strategy is immune to this potential problem. As we only compare between
humanities subjects or between scientific subjects that are always set up the same way (see
section 2), handwriting quality is not likely to matter more in one of these subjects than in the
others (for example, in philosophy compared to literature, or in physics compared to chemistry).
Consequently, any handwriting quality effect on the written test scores should be cancelled out
when we differentiate scores across subjects.
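One stylized way to see why a handwriting effect drops out is sketched below. The notation is ours and purely illustrative (it is not model I.1): $W_{ij}$ and $O_{ij}$ are the written and oral scores of candidate i in subject j, $a_{ij}$ is subject-specific ability, $h_i$ is a handwriting premium assumed to be identical across comparable subjects, and $b_j$ is the examiners' gender bias in subject j.

```latex
\begin{align*}
W_{ij} &= a_{ij} + h_i, & O_{ij} &= a_{ij} + b_j F_i, \\
O_{ij} - W_{ij} &= b_j F_i - h_i, \\
(O_{ij} - W_{ij}) - (O_{ik} - W_{ik}) &= (b_j - b_k)\, F_i .
\end{align*}
```

Under these assumptions, comparing oral-written gaps between genders within a single subject would be contaminated by any gender difference in $h_i$, whereas the difference across two subjects j and k is not.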
5 Discussion
5.1 Mechanisms
A natural explanation for the gender ratio balancing observed in the ENS entrance exam would be
that the ENS has an explicit affirmative action policy in order to recruit more females in fields
where there are too few. In contrast with the United States, however, affirmative action is very unlikely
to occur at the ENS. There is no legal basis for affirmative action in France, and the ENS
has a strong reputation for rewarding pure talent only (Bourdieu, 1989). As emphasized by
Bourdieu, the school system in France (and the entrance exams of the Grandes Ecoles in
particular) relies on a fundamental belief in its meritocratic role. To confirm this, we interviewed
several members and heads of recruiting committees. None of them ever faced any explicit or
implicit demands from the institution to implement affirmative action. All of them thought it
inconceivable that the ENS would formulate such demands, either at the track or the subject
level.23
Two other types of discrimination could explain the pattern emphasized in the paper. The
first possible mechanism is similar to what is commonly referred to as "preference-based" discrimination in the literature (Becker, 1957). Even if there is no institutional affirmative action
at play, professors may still be trying to implement a positive discrimination on their own in
order to help what they think is the disadvantaged gender in their field. In that case, they
do so in a non-coordinated way, whereby professors evaluating different subjects on a given
track behave differently. Such preference for the minority gender could explain why we find
a differential bias between-fields for the same candidate. The second mechanism is closer to
an "information-based" type of discrimination, also called statistical discrimination (Phelps,
1972; Arrow, 1973). Assume examiners have higher priors about the ability of candidates from the
under-represented gender in their field. This is credible in a setting with highly selected
individuals: because females who chose to major in science had to go against strong social norms,
examiners may actually expect them to have higher scientific cognitive skills than males, even
if they expect the opposite for typical females (i.e. females that they consider as representative
of the population). This mechanism is well described by Fryer (2006), who referred to it as
“belief-flipping” in statistical discrimination, i.e. “being pessimistic about a group in general,
but optimistic about the successful members of that group” (p.1151). Such priors could explain
our results if two other conditions are fulfilled. First, candidates’ abilities have to be imperfectly
observable during the oral tests. Second, examiners in the same track need to have different
priors about the same selected candidates. For instance, in the Physics-Chemistry track, the
same females are considered better than males by Physics examiners, but not by Chemistry
ones.
23 In any case, our results cannot be explained by affirmative action at the track level, since we identify
variations in examiners’ gender bias within tracks. Moreover, we find the same pattern in all tracks, including
those already quite balanced and where there would be no need for affirmative action (“Biology-Geology”, “Social
Sciences” and “Humanities”, where the share of females among eligible candidates is between 50% and 65%).
Unfortunately, the data do not allow us to draw any definitive conclusions about the
mechanisms. Yet, some evidence suggests that a preference-based explanation is more likely than the
information-based mechanism just described. First, female candidates tend to perform slightly
worse in male-dominated subjects in every track.24 Even if examiners do not know the grades of each
candidate specifically, it is hard to think that they are not aware of these broad patterns.
If anything, examiners’ priors about the relative abilities of the ENS candidates are thus likely to
be in line with general stereotypes concerning women’s and men’s abilities in male- and
female-dominated subjects. Second, the premium for the under-represented gender is larger in years
where females perform relatively poorly. To show this, we add to model I.1 a control for the
year-specific relative performance of females interacted with the female dummy $F_i$. We measure
this relative performance using $A_{jty}$, the average percentile rank of females at the written
test in subject j, track t and year y. $A_{jty}$ is normalized to have mean zero in each subject
and track, and thus reflects the relative performance of females in year y as compared to the
long-run average performance of females at this particular test. The estimate is negative and
statistically significant at the 1 % level (Table I.9, column II), meaning that examiners favor
females relatively more in years where they perform relatively worse at the written test. Column
III adds the triple interaction term ($F_i \cdot I_j \cdot A_{jty}$) to the model. The estimate is again negative
and significant, whereas the estimate for $F_i \cdot A_{jty}$ becomes much smaller and is no longer
significant. This means that (i) there is discrimination in favor of the minority gender in each
subject; and (ii) this discrimination gets stronger when the minority gender in the field performs
relatively worse than usual.
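The construction of the relative performance measure can be sketched as follows. This is a minimal illustration under our own assumptions: the record layout, the function name and the toy values are hypothetical, and the normalization uses an unweighted mean across years, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean

def relative_female_performance(records):
    """records: iterable of (subject, track, year, is_female, written_pct_rank).

    Returns {(subject, track, year): A_jty}, i.e. the average written percentile
    rank of females in each cell, demeaned within each (subject, track).
    """
    by_cell = defaultdict(list)
    for subject, track, year, is_female, rank in records:
        if is_female:
            by_cell[(subject, track, year)].append(rank)
    cell_mean = {cell: mean(ranks) for cell, ranks in by_cell.items()}

    # Assumption: unweighted average over years within each (subject, track).
    by_test = defaultdict(list)
    for (subject, track, _year), m in cell_mean.items():
        by_test[(subject, track)].append(m)
    test_mean = {test: mean(ms) for test, ms in by_test.items()}

    return {cell: m - test_mean[cell[:2]] for cell, m in cell_mean.items()}

# Hypothetical toy records, for illustration only.
toy = [
    ("Physics", "Physics-Chemistry", 2004, True, 0.40),
    ("Physics", "Physics-Chemistry", 2004, True, 0.50),
    ("Physics", "Physics-Chemistry", 2004, False, 0.60),
    ("Physics", "Physics-Chemistry", 2005, True, 0.55),
]
A = relative_female_performance(toy)
```

A negative value flags a year in which females did worse than usual at that written test; the regression then interacts this measure with the female dummy.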
This result tends to support the preference-based rather than the information-based explanation.
As a matter of fact, the statistical discrimination mechanism would predict the opposite result:
if examiners in masculine subjects favor females because they have positive priors on their
abilities, why would they give a higher premium in years where they look less skilled? A
more credible explanation is that examiners try to implement on their own a preference-based
discrimination whereby, for personal motives (political considerations or pure preference for
diversity), they tend to favor the minority gender in their field. Again, however, all this analysis
of the underlying mechanisms should be taken as suggestive only.
24 As shown by gender gaps in written test scores in all subject ∗ track combinations. Available on request.
5.2 Stereotypes and discrimination
The unequal share of female faculty across fields may be intimately related to field-specific
gender stereotypes. Recent research based on more than half a million Implicit Association
Tests completed by citizens of 34 countries shows a clear and systematic implicit association
between women and humanities and between men and science (Nosek et al., 2009). In line with
this evidence of gender-science stereotypes, people may have different beliefs about the relative
adequacy of women and men across academic fields. These beliefs may relate to women’s and
men’s abilities (e.g. “women are less talented in math than in biology”), as well as to normative
views about which subjects suit women, independently of their ability (e.g. “women are
less suited to science than to humanities”). To explore this, we built an index by averaging the
perceptions of a small (non-random) sample of faculty and graduate students asked to rank
how "feminine" they believe each subject to be on a scale of 0 to 10. The femininity ranking of
fields was very similar to the one used in this paper, suggesting that the proportion of female
academics in each field may be strongly related to the stereotype content of each subject.
We do not know whether examiners at the ENS entrance exams also have similar priors. If
they do, our results would suggest that negative stereotypes might lead to positive discrimination, whereas gender stereotypes are usually thought to drive straightforward discrimination.
We emphasize that the links between gender stereotypes and gender discrimination are not
straightforward and may differ enormously from one setting to the next. Do examiners personally know the agents they evaluate? Do they consider them as representative of the larger group
they belong to? Is the assessment a one-off interaction or will examiners work with the agents
after the examination? These issues make a difference, implying that, in the absence of clear
evidence, no assumptions should be made as to how examiners’ stereotypes shape their behavior. As such, this paper stresses the need for empirical investigations into the links between
stereotypes and discrimination.
5.3 Conclusion
This study investigates how gender influences the admission decisions of faculty tasked with
choosing students in male- or female-dominated fields. The unique setting of the entrance exam
for a French higher education institution allows us to identify examiners’ gender bias, using a
triple difference strategy. We show that the bias goes in favor of the under-represented gender
in the field. Even though our results are partly specific to the context in the study, they
provide interesting insights into how examiners might behave in similar settings, i.e. when
recruiting students who have already been highly selected. Many situations may relate to
this one, e.g. recruitment for highly qualified jobs and admission to highly selective graduate
programs. Identifying how examiners behave in such situations is crucial to understanding what
fosters gender inequalities in top academic and labor market positions. In traditionally
male-dominated fields in particular, this "glass ceiling" is a key issue, as it may perpetuate the
scarcity of female role models and reinforce inequalities (Carrell et al., 2010). By revealing that
females may be more favored (or less discriminated against) in more male-dominated subjects,
this study questions the responsibility of professors in the persistent glass ceiling. It suggests
that policies to improve the representation of women in science should focus on the supply side
and encourage girls to enroll more in scientific fields. In that respect, advertising the results we
find in this paper to young women could already be a relevant policy, as providing adequate
information to economic agents can sometimes be the most efficient way to trigger action.
Bibliography
Arrow, Kenneth (1973). “The Theory of Discrimination”. In: Discrimination in Labor Markets. Ed. by O. A. Ashenfelter and A. Rees. Princeton University Press.
Bagues, Manuel F. and Berta Esteve-Volart (2010). “Can Gender Parity Break the Glass
Ceiling? Evidence from a Repeated Randomized Experiment”. In: Review of Economic Studies 77.4, pp. 1301–1328.
Becker, Gary S. (1957). The Economics of Discrimination. The University of Chicago Press.
Bernard, Michael E. (1979). “Does Sex Role Behavior Influence the Way Teachers Evaluate
Students?” In: Journal of Educational Psychology 71, pp. 553–562.
Bettinger, Eric P. and Bridget Terry Long (May 2005). “Do Faculty Serve as Role Models?
The Impact of Instructor Gender on Female Students”. In: American Economic Review 95.2,
pp. 152–157.
Blank, Rebecca M. (Dec. 1991). “The Effects of Double-Blind versus Single-Blind Reviewing:
Experimental Evidence from The American Economic Review”. In: American Economic
Review 81.5, pp. 1041–67.
Booth, Alison and Andrew Leigh (2010). “Do employers discriminate by gender? A field
experiment in female-dominated occupations”. In: Economics Letters 107, pp. 236–238.
Bourdieu, Pierre (1989). La Noblesse d’Etat: Grandes écoles et esprit de corps. Le sens commun. Les Editions de Minuit.
Broder, Ivy E. (1993). “Review of NSF Economics Proposals, Gender and Institutional Patterns”. In: American Economic Review 83, pp. 964–970.
Brown, Charles and Mary Corcoran (July 1997). “Sex-Based Differences in School Content
and the Male-Female Wage Gap”. In: Journal of Labor Economics, University of Chicago
Press 15.3, pp. 431–65.
Carrell, Scott E., Marianne E. Page, and James E. West (Aug. 2010). “Sex and Science: How Professor Gender Perpetuates the Gender Gap”. In: The Quarterly Journal of
Economics, MIT Press 125.3, pp. 1101–1144.
Ceci, Stephen J. and Wendy M. Williams (2011). “Understanding current causes of women’s
underrepresentation in science”. In: Proceedings of the National Academy of Sciences 108.8,
pp. 3157–3162.
De Paola, Maria and Vincenzo Scoppa (June 2011). Gender Discrimination and Evaluators’ Gender: Evidence from the Italian Academy. Working Papers 201106. Università della
Calabria, Dipartimento di Economia, Statistica e Finanza (Ex Dipartimento di Economia e
Statistica).
Dee, Thomas S. (May 2005). “A Teacher Like Me: Does Race, Ethnicity, or Gender Matter?”
In: American Economic Review 95.2, pp. 158–165.
— (2007). “Teachers and the Gender Gaps in Student Achievement”. In: Journal of Human
Resources 42.3.
Dusek, Jerome B. and Gail Joseph (1983). “The bases of teacher expectancies: A meta-analysis”. In: Journal of Educational Psychology 75.3, pp. 327–346.
Fryer, Roland G., Jr. (Apr. 2006). Belief Flipping in a Dynamic Model of Statistical Discrimination. NBER Working Papers 12174. National Bureau of Economic Research, Inc.
Hinnerich, Björn Tyrefors, Erik Höglin, and Magnus Johannesson (Aug. 2011). “Are
boys discriminated in Swedish high schools?” In: Economics of Education Review 30.4,
pp. 682–690.
Hunt, Jennifer, Jean-Philippe Garant, Hannah Herman, and David J. Munroe (Mar.
2012). Why Don’t Women Patent? NBER Working Papers 17888. National Bureau of Economic Research, Inc.
Kiss, David (Dec. 2013). “Are immigrants and girls graded worse? Results of a matching
approach”. In: Education Economics 21.5, pp. 447–463.
Lavy, Victor (Oct. 2008). “Do gender stereotypes reduce girls’ or boys’ human capital outcomes? Evidence from a natural experiment”. In: Journal of Public Economics 92.10-11,
pp. 2083–2105.
Lindahl, Erica (2007). Does gender and ethnic background matter when teachers set school
grades? Evidence from Sweden. IFAU Working Paper 2007:25.
Madon, Stephanie, Lee Jussim, Shelley Keiper, Jacquelynne Eccles, Alison Smith,
and Polly Palumbo (1998). “The accuracy and power of sex, social class, and ethnic
stereotypes, a naturalistic study in person perception”. In: Personality and Social Psychology
Bulletin 12, pp. 1304–1318.
Moss-Racusin, Corinne A., John F. Dovidio, Victoria L. Brescoll, Mark J. Graham,
and Jo Handelsman (2012). “Science faculty’s subtle gender biases favor male students”.
In: Proceedings of the National Academy of Sciences 109.41, pp. 16474–16479.
National Science Foundation (2006). Science and Engineering Degrees, 1966–2004. Manuscript
NSF 07-307. National Science Foundation, Division of Science Resources Statistics.
Nosek, Brian A., Frederick L. Smyth, N. Sriram, Nicole M. Lindner, Thierry Devos,
Alfonso Ayala, et al. (2009). “National differences in gender–science stereotypes predict
national sex differences in science and math achievement”. In: Proceedings of the National
Academy of Sciences 106.26, pp. 10593–10597.
Phelps, Edmund S. (1972). “The Statistical Theory of Racism and Sexism”. In: American
Economic Review 62, pp. 659–661.
Rouse, Cecilia and Claudia Goldin (Sept. 2000). “Orchestrating Impartiality: The Impact
of "Blind" Auditions on Female Musicians”. In: American Economic Review 90.4, pp. 715–
741.
Tiedemann, Joachim (2000). “Parents’ gender stereotypes and teachers’ beliefs as predictors
of children’s concept of their mathematical ability in elementary school”. In: Journal of
Educational Psychology 92, pp. 144–151.
Weinberger, Catherine J. (1998). “Race and Gender Wage Gaps in the Market for Recent College Graduates”. In: Industrial Relations: A Journal of Economy and Society 37.1,
pp. 67–84.
— (1999). “Mathematical College Majors and the Gender Gap in Wages”. In: Industrial Relations: A Journal of Economy and Society 38.3, pp. 407–413.
— (2001). “Is Teaching More Girls More Math the Key to Higher Wages?” In: Squaring Up,
Policy Strategies to Raise Women’s Incomes in the U.S. Ed. by Mary C. King. University
of Michigan Press.
Zinovyeva, Natalia and Manuel F. Bagues (Feb. 2011). Does Gender Matter for Academic
Promotion? Evidence from a Randomized Natural Experiment. IZA Discussion Papers 5537.
Institute for the Study of Labor (IZA).
Figure I.1: Kernel density estimates of scores at written and oral tests, by track and gender.
Note - We keep only subjects present in our baseline data, that is, all subjects for which there are both a
mandatory written test and a mandatory oral test. Distributions on each track are computed over all these
subjects pooled together, with an equal weight given to each one. Kernel density estimates use the Epanechnikov
kernel function in Stata 12.0. The half-width of the kernel is an “optimal” width calculated automatically
by the software, i.e. the width that would minimize the mean integrated squared error if the data
were Gaussian and a Gaussian kernel was used.
Figure I.2: Gender and choice of specialty.
Note - The figure represents the share of females among candidates choosing each specialty.
Table I.1: Descriptive statistics
Track                                         All     Math-Physics   Physics-Chemistry   Biology-Geology   Social Sciences   Humanities
                                                      (0.216)        (0.269)             (0.342)           (0.362)           (0.435)
                                              (I)     (II)           (III)               (IV)              (V)               (VI)
Panel A : Eligible candidates by track (2004-2009)
Total eligible candidates                     3026    745            491                 420               334               1036
Average per year                              504     124            82                  70                56                173
Average admitted per year                     184     42             21                  21                25                75
% Admitted among eligible candidates          37%     34%            26%                 30%               45%               44%
% Girls in eligible candidates                40%     9%             17%                 56%               53%               64%
% Girls in admitted candidates                40%     12%            13%                 44%               47%               59%
Panel B : Counterfactual exercise - Potential admitted candidates after eligibility
N admitted girls (2004-2009) (a)              438     29             17                  56                71                265
% among all admitted candidates               39.6%   11.6%          13.5%               44.4%             47.0%             58.5%
Counterfactual N admitted girls (b)           452     18             15                  58                76                285
% among all counterfactual admitted students  40.9%   7.5%           12.1%               48.7%             48.7%             61.0%
Relative variation between (a) and (b)        -3%     +38%           +12%                -4%               -7%               -8%
Note on panel B - The counterfactual is the number of girls who would have been admitted if the exam consisted only of the
eligibility stage (anonymous written tests only). It is based on the eligibility rank computed by the exam board to determine the
pool of eligible students, to which we applied the final admission threshold of each track. We then computed the number of girls
within the resulting counterfactual pool of admitted students.
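The last row of Panel B can be recovered from rows (a) and (b). The table does not state the formula; the sketch below assumes the variation is expressed relative to the actual number (a), i.e. (a − b)/a, which is our inference from the figures. All input numbers are copied from Panel B.

```python
# Panel B of Table I.1: actual (a) and counterfactual (b) numbers of admitted girls.
tracks = ["All", "Math-Physics", "Physics-Chemistry", "Biology-Geology",
          "Social Sciences", "Humanities"]
actual = [438, 29, 17, 56, 71, 265]
counterfactual = [452, 18, 15, 58, 76, 285]

# Assumed formula (not stated in the table): variation relative to the actual number (a).
relative_variation = [round(100 * (a - b) / a) for a, b in zip(actual, counterfactual)]

for track, v in zip(tracks, relative_variation):
    print(f"{track}: {v:+d}%")
```

Read this way, a positive figure means that more girls were admitted than the anonymous written stage alone would have produced, as in the two most male-dominated tracks.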
Table I.2: Sample sizes for subjects and tracks with both written and oral tests
Track
Math (0.152)
Computer Sciences (0.192)
Physics (0.213)
MathPhysics
PhysicsChemistry
BiologyGeology
Social
Sciences
(0.216)
(0.269)
(0.342)
(0.362)
(0.435)
(I)
(II)
(III)
(IV)
(V)
1480
956
Wr. only
670
982
836
Humanities
Option
1474
Geology (0.250)
828
Philosophy (0.257)
668
2070
Geography (0.319)
Option
Option
Chemistry (0.331)
978
836
Social Sciences (0.335)
666
History (0.389)
666
2070
666
2073
Option
1786
Oral only
1878
Biology (0.432)
830
Literature (0.535)
Latin/Ancient Greek (0.547)
Foreign languages (0.565)
1452
958
83
Note: sample sizes are given for the subjects that we keep in our empirical analysis.
"Wr. only" ("Oral only") means that there is only a written (an oral) test for the subject.
"Option" means that the subject is optional at the written test, oral test or at both, meaning that all
candidates in the track do not necessarily take the test.
A blank is left in the corresponding box when a subject does not belong to a given track exam.
Data for Latin/Ancient Greek and Foreign languages are only kept for students who chose the same
language at written and oral tests. 68 % and 32 % of Humanities students choose Latin and
Ancient Greek, respectively.
Foreign languages are English (69 %), German (24 %), Spanish (4 %) and other languages (3 %).
Indexes of feminization are given in parentheses for each subject and each track. Subjects and tracks are
ordered according to these indexes.
Table I.3: Subjects’ female representation and examiners’ gender bias
                                              (I)          (II)         (III)        (IV)
$F_i \cdot I_j$                               −0.290***    −0.304**     −0.273*      −0.290***
                                              (0.084)      (0.117)      (0.145)      (0.083)
Female share in examiner panel                                                       0.055
                                                                                     (0.103)
$F_i \cdot$ Female share in examiner panel                                           −0.001
                                                                                     (0.050)
$R^2$                                         0.27         0.30         0.36         0.28
N                                             11,196       7,372        5,232        11,196
Controls for student charac. ∗ subject        No           Yes          Yes          No
Candidate’s A-level score in the subject      No           No           Yes          No
Controls for female share in examiner panel   No           No           No           Yes
Note: The dependent variable is the candidate’s difference between the oral and written percentile
ranks.
Each regression includes individual fixed effects and a dummy for examiner panel (year ∗ track ∗ subject).
$F_i$ is the female candidate dummy and $I_j$ the female share among faculty in field j in France.
Subjects are ordered according to the index of feminization (in parentheses).
Standard errors are clustered at the examiner panel level (year ∗ track ∗ subject).
*** p<0.01, ** p<0.05, * p<0.1
Table I.4: Between-subject differences in examiners’ gender bias
                                              (I)          (II)         (III)        (IV)
Panel A : Math-Physics
Math (0.152)                                  -0.018       0.053        0.028        -0.018
                                              (0.073)      (0.085)      (0.078)      (0.073)
Physics (0.213)                               REF          REF          REF          REF
N                                             1,468        936          809          1,468
Panel B : Physics-Chemistry
Math (0.152)                                  0.062        0.038        0.037        0.058
                                              (0.067)      (0.091)      (0.097)      (0.076)
Physics (0.213)                               0.133**      0.165*       0.165*       0.133**
                                              (0.058)      (0.081)      (0.087)      (0.058)
Chemistry (0.331)                             REF          REF          REF          REF
N                                             1,457        952          878          1,457
Panel C : Biology-Geology
Physics (0.213)                               0.135**      0.090        0.105        0.135**
                                              (0.057)      (0.065)      (0.063)      (0.056)
Geology (0.250)                               0.158***     0.162**      0.176**      0.089*
                                              (0.044)      (0.069)      (0.080)      (0.047)
Chemistry (0.331)                             0.142***     0.079        0.069        0.096
                                              (0.050)      (0.080)      (0.076)      (0.057)
Biology (0.432)                               REF          REF          REF          REF
N                                             1,665        1,139        1,019        1,665
Panel D : Social Sciences
Math (0.152)                                  0.020        0.022        0.033        -0.016
                                              (0.082)      (0.115)      (0.106)      (0.071)
Philosophy (0.257)                            0.145***     0.176**      0.211**      0.145***
                                              (0.037)      (0.078)      (0.074)      (0.035)
Social Sciences (0.335)                       0.061        0.046        -            0.078
                                              (0.073)      (0.116)                   (0.071)
History (0.389)                               0.036        0.043        0.041        0.089**
                                              (0.044)      (0.073)      (0.099)      (0.043)
Literature (0.535)                            REF          REF          REF          REF
N                                             1,668        1,108        799          1,668
Panel E : Humanities
Philosophy (0.257)                            0.133***     0.150***     0.125*       0.129***
                                              (0.034)      (0.051)      (0.061)      (0.041)
History (0.389)                               0.078        0.104        0.088        0.072
                                              (0.047)      (0.066)      (0.074)      (0.056)
Literature (0.535)                            0.107**      0.132**      0.151**      0.107**
                                              (0.044)      (0.054)      (0.054)      (0.045)
Latin/Ancient Greek (0.547)                   0.043        0.057        -            0.042
                                              (0.046)      (0.055)                   (0.048)
Foreign languages (0.565)                     REF          REF          REF          REF
N                                             4,938        3,237        1,727        4,938
Controls for student charac. ∗ subject        No           Yes          Yes          No
Candidate’s A-level score in the subject      No           No           Yes          No
Controls for female share in examiner panel   No           No           No           Yes
Note: The dependent variable is the candidate’s difference between the oral and written percentile ranks.
$F_i$ is the female candidate dummy and $I_j$ the female share among faculty in field j in France.
Subjects are ordered according to the index of feminization (in parentheses).
Standard errors are clustered at the examiner panel level (year ∗ track ∗ subject).
*** p<0.01, ** p<0.05, * p<0.1
Table I.5: Subjects’ female representation and examiners’ gender bias - separate estimates for each
track and year
Years
Physics-Chemistry
Biology-Geology
Social Sciences
Humanities
All
2004
2005
2006
2007
2008
2009
(I)
(II)
(III)
(IV)
(V)
(VI)
(VII)
−0.592
(0.209)
−1.245**
(0.357)
−0.278
(0.144)
−0.222
(0.256)
−0.335
(0.129)
−0.084
(0.559)
−0.009
(0.411)
−0.221
(0.178)
−0.454
(0.382)
−0.635**
(0.239)
−0.152
(0.199)
−0.280***
(0.093)
0.252
0.373
−2.212** −1.050
(0.214)
(0.908)
(0.341)
(1.124)
−1.166** −0.232
−0.236
−0.932*
(0.358)
(0.312)
(0.705)
(0.386)
0.126
−1.092** −0.217
0.548
(0.214)
(0.241)
(0.187)
(0.749)
−0.373** −0.451*
−0.440
−0.014
(0.109)
(0.197)
(0.386)
(0.321)
Note: The dependent variable is the candidate’s difference between the oral and written percentile ranks. We
report estimated coefficients for the female dummy interacted with female representation among faculty in the
field. Results are obtained from 28 separate regressions: one for each track (except “Math-Physics”), and one
for each track and year available in the data. Each regression includes individual fixed effects and a dummy for
examiner panel (year ∗ track ∗ subject). Standard errors are clustered at the examiner panel level. *** p<0.01,
** p<0.05, * p<0.1
Table I.6: Female share in ENS oral tests examining boards (2004-2009 average)
Track
MathPhysics
(0.216)
Physics- BiologyChemistry Geology
(0.269)
(0.342)
(I)
Math (0.152)
Physics (0.213)
(II)
0.06
0.06
[0; .33]
[0; .33]
(III)
0.06
0
0
[0; .33]
[0; 0]
[0; 0]
Social
Sciences
(0.362)
(IV)
Humanities
(0.435)
(V)
0.2
Geology (0.250)
[0; .4]
Philosophy (0.257)
Chemistry (0.331)
0
0.14
[0; 0]
[0; .33]
0.5
0.36
[.5; .5]
[.17; .5]
0.58
Social Sciences (0.335)
[0; 1]
History (0.389)
0.75
0.28
[0; 1]
[0; .5]
0
Biology (0.432)
[0; 0]
Literature (0.535)
0.5
0.54
[.5; .5]
[.43; .67]
0.5
Latin/Ancient Greek (0.547)
[.5; .5]
0.79
Foreign languages (0.565)
[.5; 1]
Note: For each subject and track, the female share in the oral test examining board is computed as
the sum of the number of female examiners in oral tests over years 2004-2009, divided by the sum of
the boards’ total size over the same period. The minimum and maximum values across years 2004-2009
are reported in square brackets. Candidates are not necessarily interviewed by all members of the
examining boards.
Table I.7: Gender gap in choice of specialty subjects
                                      (I)          (II)         (III)
Panel A : Physics-Chemistry
Physics (0.213)                     -0.486***    -0.581***    -0.522***
                                    (0.114)      (0.114)      (0.114)
R2                                   0.17         0.14         0.23
N                                    979          979          979
Panel B : Biology-Geology
Geology (0.250)                     -0.130*      -0.188***    -0.170**
                                    (0.070)      (0.070)      (0.070)
R2                                   0.53         0.52         0.58
N                                    829          829          829
Panel C : Humanities
Philosophy (0.257)                  -0.120***    -0.153***    -0.114***
                                    (0.035)      (0.035)      (0.035)
History (0.389)                     -0.069*      -0.089**     -0.053
                                    (0.035)      (0.035)      (0.035)
Literature (0.535)                   0.031        0.005        0.028
                                    (0.035)      (0.035)      (0.035)
Latin/Ancient Greek (0.547)         -0.040       -0.051       -0.039
                                    (0.037)      (0.037)      (0.037)
R2                                   0.13         0.12         0.15
N                                    4,938        4,938        4,938
Panel D : All 3 tracks
Fi · Ij                              0.523***     0.635***     0.503***
                                    (0.100)      (0.100)      (0.099)
R2                                   0.31         0.30         0.32
N                                    6,746        6,746        6,746
Controls for ability in each subject:
Written test score (linear)          Yes          No           No
Oral test score (linear)             No           Yes          No
10 dummies for written test score    No           No           Yes
10 dummies for oral test score       No           No           Yes
Note: The dependent variable is a dummy variable equal to 1 when a subject is the
specialty chosen by a given candidate in the sample.
We keep only subjects corresponding to possible specialties.
Estimated coefficients for the female dummy interacted with each subject dummy are
reported in the table.
Subjects are ordered according to the index of feminization (in parentheses).
Each regression includes individual fixed effects and a dummy for examiner panel (year
∗ track ∗ subject).
*** p<0.01, ** p<0.05, * p<0.1
Table I.8: Baseline results without females (males) with masculine (feminine)
specialties - Physics-Chemistry, Biology-Geology and Humanities tracks only
Candidates    All          w/o females       w/o males        w/o both
                           with masculine    with feminine
                           spe.              spe.
              (I)          (II)              (III)            (IV)
Fi · Ij       -0.340***    -0.471***         -0.287**         -0.423***
              (0.086)      (0.091)           (0.118)          (0.131)
R2            0.25         0.26              0.25             0.26
N             8,060        6,894             6,040            4,874
Note: The dependent variable is the candidate’s difference between the oral and written
percentile ranks.
Each regression includes individual fixed effects and a dummy for examiner panel (year
∗ track ∗ subject).
Fi is the female candidate dummy and Ij the female share among faculty in field j in
France.
Subjects are ordered according to the index of feminization (in parenthesis).
Standard errors are clustered at the examiner panel level (year ∗ track ∗ subject).
*** p<0.01, ** p<0.05, * p<0.1
Table I.9: Gender bias depending on year-specific females’ ability
                    (I)          (II)         (III)
Fi · Ij             −0.290***    −0.298***    −0.299***
                    (0.084)      (0.068)      (0.067)
Fi · Ajty                        −1.324***    −0.411
                                 (0.231)      (0.543)
Fi · Ij · Ajty                                −3.513*
                                              (1.846)
R2                  0.27         0.28         0.28
N                   11,196       11,196       11,196
Note: The dependent variable is the candidate’s difference between the
oral and written percentile ranks. Each regression includes individual
fixed effects and a dummy for examiner panel (year ∗ track ∗ subject).
Fi is the female candidate dummy and Ij the female share among
faculty in field j in France. Ajty is the year ∗ subject ∗ track specific
relative ability of females, as measured by the average rank of females
after the written tests in subject j, track t and year y, centered at
mean zero in each subject and track.
Subjects are ordered according to the index of feminization (in parenthesis). Standard errors are clustered at the examiner panel level (year
∗ track ∗ subject).
*** p<0.01, ** p<0.05, * p<0.1
Appendix
We asked 13 researchers or advanced PhD students at the Paris School of Economics (PSE), all of whom had
grading experience, to guess the gender of 118 students from their anonymous handwritten exam
sheets. The students were first- and second-year Master’s students at PSE,
and we gathered a total of 180 of their exam sheets (102 written by males and 78
by females) in four different subjects.25 Each grader was asked to guess the gender of about
one third of the 180 exam sheets. Out of a total of 858 guesses, 68.6 % are correct.
This share is significantly higher than the 50 % that random guessing would yield
on average. It is nevertheless closer to random guessing than to perfect detection
(100 %). Assessors seem to be somewhat better at recognizing male handwriting: the share of
correct guesses reaches 71.8 % for males’ exam sheets but only 64.5 % for females’ exam
sheets. All 13 assessors guess correctly between 53 % and 78 % of the time (Table I.10), and, except for the
first assessor, they perform quite similarly on females’ and males’ exam sheets. One important
difference between ENS candidates and PSE Master’s students is that the former are all
French whereas about one third of the latter are foreigners. We thus check that our results
are similar when restricting the sample to exam sheets belonging to French students, and find the
share of correct guesses to be only slightly higher in that sample (72.3 %).
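The claim that 68.6 % is significantly above chance can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' test: it assumes a simple one-sample z-test on the pooled 858 guesses and treats all guesses as independent, which ignores that the same assessor and the same exam sheet appear several times. The function names are illustrative.

```python
import math

N_GUESSES = 858        # total number of guesses reported in the text
SHARE_CORRECT = 0.686  # reported share of correct guesses

def z_stat_vs_chance(share, n, p0=0.5):
    """z-statistic for H0: the true share equals p0 (normal approximation)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (share - p0) / standard_error

def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = z_stat_vs_chance(SHARE_CORRECT, N_GUESSES)
p = one_sided_p_value(z)
print(f"z = {z:.2f}, one-sided p-value = {p:.2e}")
```

Under these assumptions the z-statistic is far beyond conventional critical values; accounting for clustering by assessor or by exam sheet would widen the standard error.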
We finally examine to what extent some handwriting can be unambiguously detected. To do this, we focus on a subsample of exam sheets that were assessed by exactly
five researchers and that belong to different students, so that every handwriting in that sample
is different. We find that 40 % of the handwriting in that sample was guessed correctly
by all five assessors (Table I.11), and 21 % by all five assessors but one. By
contrast, 6 % of the handwriting was wrongly guessed by all assessors and another 8 % was
wrongly guessed by all five assessors but one. Additional observations would be necessary to
confirm this, but these results suggest that about one half of handwriting can be detected quite
easily whereas about 15 % is very misleading.
25 Some students took exams in more than one of the topics we had, so that the final number of students is
lower than the number of exam sheets. We reproduced our analysis keeping only one exam sheet per student
and we got the same results.
Table I.10: How easy is it to detect female handwriting? Results obtained by 13 researchers guessing the gender
of 180 anonymous exam sheets.
Assessor  Gender  Field    Exam sheets   Number of     % gender    % gender    % gender    % gender
                           assessed      exam sheets   correctly   correctly   correctly   correctly
                                         assessed      assessed    assessed    assessed    assessed
                                                                   among       among       among non-
                                                                   females     males       foreigners
(I)       (II)    (III)    (IV)          (V)           (VI)        (VII)       (VIII)      (IX)
1         M       Socio.   114 to 156    43            53%         6%          88%         48%
2         F       Econ.    69 to 128     60            57%         59%         54%         58%
3         M       Econ.    131 to 180    50            58%         47%         65%         69%
4         F       Socio.   69 to 130     62            65%         64%         66%         65%
5         M       Econ.    1 to 68       68            65%         65%         64%         67%
6         F       Econ.    69 to 130     62            68%         73%         62%         76%
7         M       Econ.    131 to 180    50            68%         74%         65%         65%
8         M       Socio.   69 to 130     62            71%         64%         79%         74%
9         M       Econ.    131 to 156    26            73%         80%         69%         69%
10        F       Biol.    1 to 171      171           73%         61%         83%         76%
11        F       Econ.    1 to 68       68            74%         85%         67%         74%
12        M       Socio.   1 to 68       68            76%         81%         74%         83%
13        F       Socio.   1 to 68       68            78%         77%         79%         90%
Average                                  66            69%         65%         72%         72%
Note - The last line reports the average number of exam sheets assessed (column V) and the average share of correct gender
assessment (weighted by the number of exam sheets assessed).
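As an arithmetic check of the "Average" line, the sketch below recomputes the mean number of exam sheets (column V) and the sheet-weighted share of correct guesses (column VI) from the per-assessor values in the table. The variable and function names are illustrative, and because the table reports rounded percentages the weighted figure is only approximate.

```python
# (number of exam sheets assessed, % of genders correctly assessed) for assessors 1 to 13,
# copied from columns (V) and (VI) of Table I.10
ASSESSORS = [
    (43, 53), (60, 57), (50, 58), (62, 65), (68, 65), (62, 68), (50, 68),
    (62, 71), (26, 73), (171, 73), (68, 74), (68, 76), (68, 78),
]

def weighted_share(rows):
    """Average of the percentages, weighted by the number of sheets assessed."""
    total_sheets = sum(n for n, _ in rows)
    return sum(n * pct for n, pct in rows) / total_sheets

total_guesses = sum(n for n, _ in ASSESSORS)
mean_sheets = total_guesses / len(ASSESSORS)
overall_share = weighted_share(ASSESSORS)
print(f"total guesses: {total_guesses}")
print(f"average sheets per assessor: {mean_sheets:.0f}")
print(f"weighted share of correct guesses: {overall_share:.1f}%")
```

The total number of guesses recovered this way matches the 858 guesses mentioned in the appendix text.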
Table I.11: Are assessors making the same guess about handwriting? Consistency
between assessors on the sample of exam sheets assessed exactly 5 times and
belonging to different students.
Number of assessors       Proportion of the exam sheets’ sample
making a correct guess    Whole sample   Only girls   Only boys   Only French
                          (N=106)        (N=48)       (N=58)      (N=61)
0                         6%             10%          2%          3%
1                         8%             6%           9%          5%
2                         12%            15%          10%         15%
3                         15%            13%          17%         13%
4                         21%            15%          26%         23%
5                         39%            42%          36%         41%
Chapter II
Persistent Classmates: How Familiarity
with Peers Protects from Disruptive
Transition
Joint with Arnaud Riegert
We thank Éric Maurin, Julie Berry Cullen, Luc Behaghel, Gordon Dahl, Victor Lavy, Thomas Piketty,
Corinne Prost, Gwenaël Roudaut, Camille Terrier and Margaux Vinez for their helpful comments and suggestions. We thank participants at the SOLE conference (Arlington, 2014), IWAEE conference (Catanzaro, 2013),
PSE Applied Economics seminar and CREST internal seminar. We are also grateful to the statistical services
at the French Ministry for Education (DEPP) and in particular Cédric Afsa who facilitated our access to the
datasets. This research was supported by a grant from the CEPREMAP research center.
If peer effects exist, policies that influence the allocation of individuals with their peers
could improve welfare. This idea has aroused a huge amount of interest among economists, despite the great empirical challenges raised by endogenous sorting. For example, many papers
investigate the role of neighborhood (Goux and Maurin, 2007; Kling et al., 2007) and school
composition (Hoxby, 2000; Angrist and Lang, 2004; Cullen et al., 2006; Lavy, Silva, et al.,
2012) on students’ outcomes. The literature is much less extensive when it comes to estimating peer effects within classes, although most students’ interactions are likely to occur at this
level. It is also unfortunate from a policy point of view, because school administrators have
much more leeway in setting up classes than policymakers have in influencing neighborhood
and school choice. The main studies on the subject are based on experimental data either in
primary schools in developing countries (Duflo et al., 2011) or in colleges in developed countries
(Carrell et al., 2011). Evidence based on observational data is rare (see e.g. Lavy, Paserman,
et al., 2012; Fruehwirth, 2013), since it requires both rich data at class level and conclusive
natural experiments that are rarely available.
In this paper, we identify classroom peer effects by using natural experiments arising from
the specific institutional features of student allocation across classes, within schools, in the first
year of high school in France (10th grade).1 By definition, a high school principal does not
know her first-year students before the beginning of the school year. As she has to allocate
them across classes before that time, she has only a finite set of information observed in their
registration files to go on, including e.g. gender, socioeconomic status (SES), middle school
and 9th grade class of origin, the list of optional courses and scores obtained in their grade 9
class. Using a unique administrative dataset, we are able to observe almost all characteristics
observed by principals about their cohort of freshmen.
1 In France, secondary education consists of two blocks: middle school (grades 6 to 9) and high school (grades
10 to 12). High schools are generally separate from middle schools.
Usually, the registration files a given principal has to consider in a given school year are all
different from one another. But there are rare cases where she gets two freshmen registration
files that are identical or very similar with regard to all these characteristics (denoted "similar-file" or SF students). For example, two such students, call them Aurélien and Benoit, are
both low-SES boys coming from the same 9th grade class in the same middle school, who
got very similar scores in grade 9 and ask for the same optional courses in grade 10. If the
high school principal decides to separate them across two 10th grade classes X and Y, the key
intuition is that the choice of assigning Aurélien to class X and Benoit to class Y or the other
way around should be as good as random, because she does not have any additional relevant
information to distinguish between them (see Figure II.1). Note that we do not assume that the
decision to separate the students or to keep them in the same class is exogenous, but only that
the assignment is random in case she decides to separate them. We are able to provide strong
evidence supporting this assumption, using the anonymous scores obtained at the national
exam taken by students at the end of 9th grade, which is unobserved by principals.
Our estimation strategy is as follows. First, we restrict our analysis to students who have a
similar-file mate who ends up in a different class in the same high school. This is the case for
only 0.8 percent of the population of high school freshmen, leaving us with a sample of 28,053
students over the 2004-2011 period. Then, our strategy is to compare a student with his or
her similar-file mate only. We do this by controlling in all regressions for a single fixed effect
accounting altogether for freshmen’s high school, cohort, middle school and class of origin, and
all other characteristics observed in registration files. Finally, we investigate how the outcome
gap between two similar-file students relates to differences in the characteristics of their 10th
grade classes. In other words, the random assignment of Aurélien and Benoit between classes
X and Y can be seen as a given lottery or quasi-experiment, and the whole set of comparable
lotteries happening in French high schools between 2004 and 2011 allows us to estimate the
effect of a number of classroom environment dimensions.
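The logic of comparing a student only with his or her similar-file mate can be sketched on simulated data. This is an illustration under stated assumptions, not the authors' code or data: every similar-file group is assumed to be a pair, the outcome is continuous, the effect size `TRUE_EFFECT` is hypothetical, and all names are made up. With two students per group, controlling for a group fixed effect is equivalent to regressing within-pair outcome differences on within-pair differences in class characteristics.

```python
import random

TRUE_EFFECT = -0.01  # hypothetical effect of one extra persistent classmate on the outcome

def simulate_pairs(n_pairs, rng):
    """Simulate similar-file pairs: both students share a common pair effect."""
    pairs = []
    for _ in range(n_pairs):
        pair_effect = rng.gauss(0.0, 1.0)  # everything observed in the registration file
        pair = []
        for _ in range(2):
            n_pc = rng.randint(0, 5)       # persistent classmates in the assigned class
            outcome = pair_effect + TRUE_EFFECT * n_pc + rng.gauss(0.0, 0.1)
            pair.append((n_pc, outcome))
        pairs.append(tuple(pair))
    return pairs

def within_pair_slope(pairs):
    """OLS slope with a pair fixed effect, computed from within-pair differences."""
    num = den = 0.0
    for (x1, y1), (x2, y2) in pairs:
        dx, dy = x1 - x2, y1 - y2
        num += dx * dy
        den += dx * dx
    return num / den

pairs = simulate_pairs(5000, random.Random(0))
beta_hat = within_pair_slope(pairs)
print(f"true effect: {TRUE_EFFECT}, within-pair estimate: {beta_hat:.4f}")
```

Because the pair effect cancels out in the difference, the estimate is unaffected by how strongly the shared characteristics drive the outcome; only the as-good-as-random allocation within the pair is used.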
Common measures of peer characteristics are considered, such as peer ability, gender and
socioeconomic status (SES). Yet surprisingly, the most robust effect that emerges comes from
the number of persistent classmates (PC) a student gets, i.e. classmates who were already
in the freshman’s class in the last year of middle school. Not only does the number of PCs
significantly reduce the risk of repeating freshman year, but by contrast with other measures
of peer characteristics, the effect also endures in the long run and is associated with differences
in graduation rates at the end of high school.
The second part of this paper sets out to understand why the presence of these persistent
classmates generates positive spillovers. Although the number of PCs might capture some
omitted class characteristics associated with peer ability for instance, our investigations suggest
that students benefit from having more persistent classmates only because of a familiarity
mechanism, i.e. because they know each other well. Three findings lead us to this conclusion.
First, the estimates are extremely robust to the inclusion or not of controls for the other
classmates’ characteristics (ability, gender and SES). Second, we find that the PC effect is
highly heterogeneous and mainly driven by low-achieving, low-SES students. Also, the effect
seems slightly stronger when these students are suddenly more exposed to high-SES students.
This is consistent with our interpretation, as being surrounded by familiar faces should matter
more when the transition to high school is highly disruptive. Third, these students at risk of
underachievement in high school are not more impacted by their high- than their low-achieving
persistent classmates, which would be expected if the PC effect was driven by their higher
unobserved ability. Robustness checks are provided for our main results.
This study makes three important contributions to the literature. First, it sheds light on
the ongoing debate on the complexity of peer effects. While some recent studies offer an insight
into the role of social networks during school transitions (see e.g. Lavy and Sand, 2012), we
use natural experiments that provide a stronger identification of the impact of classmates’
characteristics, including their social links. In keeping with Foster (2006), our results also take
issue with the popular belief that agents are more influenced by their friends than by other peers
(see also Halliday and Kwak, 2012). In particular, students could be influenced by former
classmates that are not friends, though there was no theoretical reason to expect classmate
persistence to have positive effects.2 As a matter of fact, recent results found by De Giorgi and
Pellizzari (2013) on Bocconi University suggest the opposite, as they find a decrease in
performance for undergraduate students who are assigned together more often across classes.
Our results show that former peers generate positive spillovers when reassigned together in the
context of school transitions, emphasizing how peers may not have the same effect depending
on the timing and contexts.
2 On the one hand, former classmates may be friends, and recent evidence suggests that friends may have a
positive effect on well-being and achievement (Calvò-Armengol et al., 2009; Lavy and Sand, 2012). Yet former
classmates may also simply be peers with whom it is easier to talk during the early weeks, to sit next to in the
classroom or to ask for help, thus making it easier to adapt to higher academic expectations and less supervision
from teachers. Even without friendship bonds, familiarity within the classroom could therefore reduce anxiety,
prevent social isolation and foster a student’s sense of belonging in the new school and class. On the other hand,
former classmates could prevent students from socializing with new peers, or be conducive to bad behavior in
the classroom if disruptive students stay together. Former classmates may also be enemies rather than friends,
and their presence could be detrimental to welfare and achievement. Mora and Oreopoulos (2011) and Lavy and
Sand (2012) show that "non-reciprocal friends" (peers who consider you a friend while you do not, or vice
versa) seem to have no or negative effects on outcomes.
Second, our findings need to be considered in relation to the strand of literature on the
impact of mobility across environments. Several papers find that policies that enhance neighborhood or school choice, or expand students’ access to high-performing schools have been
unexpectedly inefficient in improving students’ educational outcomes (Angrist and Lang, 2004;
Cullen et al., 2006; Kling et al., 2007). In line with other recent works (Lavy and Sand, 2012;
Gibbons et al., 2013), this paper suggests that these results could be due to the disruption
caused to a student’s environment by such policies.
The third contribution is more policy-oriented. Policy recommendations to improve achievement in high school are particularly relevant, in view of the issues at stake. In many countries,
formal tracking is implemented in high school, such that short-term low achievement in the
first year may end up with mismatched enrollment in low-skill tracks. In addition, the start of
high school is often simultaneous with the end of compulsory schooling, meaning that underachievement is more likely to lead to drop-out at that stage than at previous stages. As our analysis
shows, principals could substantially raise the achievement in high school of low-ability students
by assigning them to freshman-year classes with some familiar classmates. Our estimates suggest that
their risk of repeating freshman year could be reduced by 4.5 percentage points, and their graduation rate raised by the same amount. This simple recommendation on class composition may
thus improve their performance by around 13 percent at no cost, while highly expensive policies usually target this population of students at risk. Moreover, moving these students across
classes based on their former networks is not a zero-sum game, in contrast to their ability or
gender. Although a high-ability student or a female student might be of benefit to everyone,3
grouping together freshmen from the same class should not affect freshmen from other classes.
Section 1 describes the institutional context and the data. Section 2 describes the identification strategy. We present the results and discuss the distribution of the effect and its
mechanisms in section 3. Robustness checks are then provided in section 4. Section 5 discusses
the implications of our results and concludes.
3 Carrell et al. (2011) built an algorithm designed to optimize peer effects and failed to do so partly for this
reason.
1 Institutional context and data
1.1 The high school curriculum in France
1.1.1 Enrolling in general high schools
By the end of middle school (grades 6 to 9), students apply for either vocational or general
studies, with the approval of middle school teachers. Around two-thirds of 9th grade students
opt for the general track, in which case they apply to general high schools in their district.4 Rules
of admission then differ by school district and year, but they usually depend on the students’
home address, socioeconomic status and school performance (9th grade scores). Allocation is
over by the end of June and high school administrations receive the registration files on their
future 10th graders in the first week of July.
At the same time, 9th grade students take national anonymous exams at the end of June in
three core subjects: mathematics, French and history-geography. These exams are not graded
by teachers from the student’s middle school, but externally (with scores between 0 and 40).
The resulting anonymous scores are combined with continuous assessment scores, i.e. scores
obtained in 9th grade in all courses and graded by the students’ own teachers (between 0
and 20). The anonymous scores and continuous assessment scores are combined to compute a
total score that determines whether they pass the middle school graduation diploma (Diplôme
national du brevet or DNB hereinafter).5
The anonymous scores are only available in mid-July. By that time, students have completed
their administrative registration for high school and class compositions are already determined.
In addition, these scores are not sent to the high school during the summer.6 Therefore, the
principals assign freshmen to grade 10 classes without knowing their anonymous scores, and
only having their continuous assessment scores.
4 Students opting for the vocational track have to choose a specialty, and vocational high schools usually have
places in only one or two classes per specialty. We have therefore decided to exclude vocational high schools
from this study, since class composition is highly constrained and is not really policy-relevant in these schools.
5 Note that students do not need to pass to go on to high school.
6 Some students do inform the high school of their results in the anonymous exams once they receive them
(although this is not a requirement), but informal discussions we have had with some high school administrations
suggest that this hardly ever happens. In any case, principals do not have these scores for all students, so they
are highly unlikely to use them to assign students across classes.
1.1.2 The curriculum in general high schools
In France, freshman year marks a difficult milestone for students attending general high schools.
The average ability of peers rises suddenly as one third of students, usually the lowest-achieving
ones, have enrolled in vocational studies after middle school. This may not only increase teachers’
expectations, but it might also negatively affect students’ self-esteem. Students’ percentile rank
within their class drops from 64 to 52 on average between grades 9 and 10. Naturally, this change
hits students asymmetrically. While students in the top half of the ability distribution only
fall from the 78th to the 71st percentile rank, the other half drops from the 49th to the 32nd
percentile rank.7 Students also undergo a shock in the sociocultural dimension, as reflected by
the share of high-SES students, which increases from 22 percent in middle school to 30 percent in general
high school.
At the same time, the school change triggers an intense disruption in students’ networks, as
illustrated by the number of persistent classmates in their class. Figure II.2 plots the typical
student’s class composition for each grade. As a benchmark, the share of persistent classmates
remains fairly constant throughout middle school at around 30 percent.8 Yet in grade 10, the
number of PCs drops dramatically. Only 5 percent of their classmates come from the same
class and 20 percent from the same middle school. Assuming that students rarely know the
students from other middle schools, this means that students do not know at least 80 percent
of their classmates at the beginning of the year. This figure decreases again in the subsequent
grades, coming to 45 percent in grade 12 due to the partial carryover of major-specific classes
from grade 11.
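To make the notion concrete, the number of persistent classmates of a 10th grade student can be counted as the number of other students who share both his or her 9th grade class and his or her 10th grade class. The sketch below is illustrative only: the record layout (`id`, `class9`, `class10`) and the function name are assumptions, not the structure of the administrative datasets.

```python
from collections import Counter

def persistent_classmate_counts(students):
    """Return {student id: number of grade-10 classmates who shared the grade-9 class}.

    Each student is a dict with hypothetical keys:
      'id'      - any unique identifier
      'class9'  - (middle school, 9th grade class)
      'class10' - (high school, 10th grade class)
    """
    group_sizes = Counter((s["class9"], s["class10"]) for s in students)
    # Subtract one so that a student is not counted as his or her own classmate.
    return {s["id"]: group_sizes[(s["class9"], s["class10"])] - 1 for s in students}

students = [
    {"id": "a", "class9": ("MS1", "9A"), "class10": ("HS1", "10X")},
    {"id": "b", "class9": ("MS1", "9A"), "class10": ("HS1", "10X")},
    {"id": "c", "class9": ("MS1", "9A"), "class10": ("HS1", "10Y")},
    {"id": "d", "class9": ("MS2", "9B"), "class10": ("HS1", "10X")},
]
pc_counts = persistent_classmate_counts(students)
print(pc_counts)
```

In this toy example only students a and b are persistent classmates of each other: c shared their 9th grade class but sits in another 10th grade class, and d comes from a different middle school.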
7 These figures are computed using the anonymous exam scores at the DNB exam.
8 In grade 6, we are only able to identify students from the same elementary school, as we do not have any
information on the classes in grade 5.
It turns out that these disruptions happen precisely at a time when achievement is highly
determinant for long-term outcomes. By the end of the year, students have to apply for a
major that will determine their 11th and 12th grade courses, their baccalauréat (high school
graduation exam) specialty, and the university tracks they will be able to apply for at the end
of high school. First, students have to opt for the academic or technological track, the former
being historically more prestigious with harder, more academic courses. If students are not
accepted for any of the majors they apply for, they can opt for an alternative major suggested
by teachers, if any. Otherwise, they have to repeat grade 10 with a view to applying again the
following year.9 As a result, the repetition rate is exceptionally high in the first year of high school
(10 percent compared to 5 percent on average in middle school).
Lastly, high school ends at grade 12 with the baccalauréat exam. This high school graduation exam includes anonymous tests in different subjects depending on the student’s major, and
is almost entirely graded by teachers outside the student’s high school. Passing the baccalauréat is required in most higher education tracks, and it is sufficient to access most university
tracks.
1.2 The class-assignment mechanism
In France, students are assigned to the same class for all subjects for the entire school year.
Classmates therefore have even more potential influence over each other’s outcomes, as they
spend most of the day together throughout the school year. In practice, classes are assigned in
early July immediately following student registration for high school, and two months before
the start of the school year in September. Classes are assigned entirely by hand, without the
aid of computer algorithms. This process is non-random, even for first year students. However,
unlike with the other grades, high school principals do not know the students personally when
they assign them to classes. Consequently, high school principals rely solely on the set of formal
registration data given in the students’ files and observable, for the most part, in our dataset.10
9 Students not allowed to move up to the next grade may appeal the decision to a committee external to
the school, whose decision is final. Those refused permission to enroll in the major of their choice may appeal
the decision to the principal or even negotiate with a different high school. In any case, the final decision to
award a student a place on a given major rests with the principal of the high school attended in 11th grade.
The principals we met reported that very few students in each year actually go against their teachers’ advice.
10 High schools are usually separate from middle schools. However, 16 percent of French students attend
schools that cover the entire secondary curriculum (mostly private schools). In these schools, principals might
know 10th grade students coming in from their own middle school. Nonetheless, middle school and high school
still have separate deputy heads to whom principals generally delegate class composition. These deputy heads
do not necessarily coordinate over class assignment of 10th grade students, such that the high school deputy
heads may well not use any more information than the registration file. This is supported by our exogeneity
test (see section 2.1), which suggests that students are conditionally randomly assigned even in this case. We
have therefore chosen to keep students from these schools in our sample. Taking them out of the sample has
virtually no impact on the results.
First, principals look at the options chosen by students. While most courses are part of the
common core curriculum and are the same for all (e.g. mathematics and French), students have
certain subjects to choose from such as which foreign language they prefer to study (e.g. English
or Spanish) and some additional optional courses (e.g. Latin and ancient Greek). Students who
take the same options are often grouped in the same class, for the sake of convenience when
timetabling classes.
Conditional on students’ options, school principals generally (but not necessarily) try to
balance classes in terms of gender and ability.11 They rely on the formal data contained in
students’ personal registration files: personal details on the students and their families (mainly
gender, age and parents’ occupations), scores obtained in 9th grade subjects (between 0 and
20) and 9th grade teachers’ comments. These short comments are not written for principals
but for parents, to assess student performance and behavior in each subject. For example,
teachers may write that they are satisfied with the student’s effort and participation in class,
or that the student talks too much with his or her classmates in class (without naming names).
These reports do not include recommendations to high school principals such as, "Do not put
these two students together". Only on the rarest of occasions would such advice be given to
principals, and then via an informal channel.
Unlike with other grades where principals know their students, they cannot count on any
personal knowledge of them such as motivation or emotional resilience.12
In addition, in French high schools, neither parents nor students can ask for placement in
specific classes or with friends. As a rule, families do not liaise directly with principals over
class assignment.13 Instead, tactics to get children assigned to a better class mainly take the
form of choosing specific options. In particular, families may encourage their children to take
"elite" options (e.g. German as a first foreign language, or Latin) to get them assigned to a
better class. This has no impact on our identification since we only compare students taking
the same options. Lastly, students are only notified of their class assignment the week before
the first day of school, and they are not allowed to change it.
11 There is no legal requirement to do so, but the 1975 Haby Act that made middle schools comprehensive
established a tacit rule for schools to favor within-class heterogeneity. Besides, principals probably want to
avoid putting all low-achieving students together in one class that consequently risks being unruly.
12 With other grades, they might, for example, separate two friends who are disrupting lessons, or place a
fragile student with his or her friends for emotional support.
13 They do so only in very special cases, such as where car sharing needs to be organized for students in rural
areas.
There are good reasons to believe that principals do not use all the detailed formal data they
have on students to assign them to a class. As revealed by the class assignment sessions we
attended, simply allocating classes on the basis of options is already complicated and time-consuming enough as it is. Again, they have to do it by hand and take a large number
of constraints into account, while a host of other tasks are pending both to wind up the
current school year and prepare for the new one. Therefore, if two freshmen’s registration files
look broadly similar, principals are highly unlikely to spend time studying their characteristics
to try to find some minor detail to differentiate between them. In particular, principals do
not telephone families or middle school principals to get further information on students. In
practice, then, two 10th grade students do not need to be exactly identical on paper to be
deemed indistinguishable during the class assignment process. Section 2 provides empirical
evidence in support of this field observation.
1.3
Data
1.3.1
Datasets
The empirical analysis is based on two administrative datasets from the French Ministry of
Education.
• Administrative registration records: for all students enrolled in French public and publicly-funded private middle and high schools from 2001 to 2012. This dataset contains students’
personal details (e.g. date and region of birth, gender and parents’ occupation) and
information on their education: in particular grade, school and class attended, options
taken, grade and school attended in t − 1 (but not the class attended in t − 1).
• Examination records: for all students from 2004 to 2011. This dataset contains personal details and information on scores in the 9th grade DNB (both the anonymous exam and
continuous assessment scores) and 12th grade baccalauréat exams.
These datasets are exhaustive and the variables we make use of are well reported for almost
100 percent of the population. Unfortunately, students do not have personal identification
numbers that would allow them to be tracked across the different datasets. Yet for each 10th grade
student, we need to know at least which class they attended in 9th grade, their grade in t + 1
(repeating 10th grade or moving to 11th grade) and chosen major if they do move to 11th
grade. We also have to match the administrative and the examination records.
In order to find this missing information, we use a matching procedure taking the students’
personal details in each dataset. The procedure is based mainly on date and region of birth,
gender, grade and school attended in years t and t − 1. We manage to match 9th grade class
for 94 percent and DNB exam scores for 81 percent of new 10th grade students. The remaining
students either had no match in the auxiliary dataset (60 percent of occurrences) or multiple
matches (40 percent of occurrences).14 The online appendix provides further details on the
matching procedure. In the rest of the paper, all regressions include controls for the share of
missing observations in the class, although they do not change the estimates.
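The matching step described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure documented in the online appendix: the field names in `KEY_FIELDS` and the function `match_records` are hypothetical, and the sketch simply links records that agree exactly on the personal details listed in the text, flagging the "no match" and "multiple matches" cases separately.

```python
from collections import defaultdict

# Hypothetical field names standing in for the details named in the text:
# date and region of birth, gender, grade and school in years t and t-1.
KEY_FIELDS = ("birth_date", "birth_region", "gender",
              "grade_t", "school_t", "grade_t1", "school_t1")


def match_records(main, auxiliary, key_fields=KEY_FIELDS):
    """Link each main record to the auxiliary records sharing its key.

    Returns a list of (main_record, status, aux_record_or_None) tuples, where
    status is "matched" (exactly one candidate), "none" or "multiple".
    """
    # Index the auxiliary dataset by the tuple of matching fields.
    index = defaultdict(list)
    for rec in auxiliary:
        index[tuple(rec[f] for f in key_fields)].append(rec)

    results = []
    for rec in main:
        candidates = index.get(tuple(rec[f] for f in key_fields), [])
        if len(candidates) == 1:
            results.append((rec, "matched", candidates[0]))
        elif not candidates:
            results.append((rec, "none", None))
        else:
            # Several students share the same details: the link is ambiguous,
            # so no auxiliary record is attached.
            results.append((rec, "multiple", None))
    return results
```

Counting the three statuses over the output gives the match rate and the split of unmatched students between the "no match" and "multiple matches" cases.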
Our identification compares the set of information on students observed in our dataset to
the information observed by principals in their registration files at the time of class allocation.
So it is useful at this stage to summarize which variable is observed by whom:
• Covariates observed by both the principal and the econometrician: Date of birth, city of
residence, gender, parents’ occupation, foreign languages and options chosen, 9th grade
continuous assessment scores in all subjects, middle school and 9th grade class. We also
observe a numerical measure of student behavior as graded by the student’s head teacher;
this information is missing for the first two cohorts (out of eight) and it will therefore be
used only for robustness checks.
• Covariates observed by the principal, but not the econometrician: Students’ first and last
name (from which, in particular, ethnicity could be inferred), and exact home address.
The principal also observes the 9th grade teachers’ written comments, which may inform
of behavioral issues. Again, these comments are very short (one sentence from each
teacher), written for parents, and do not include information about the relationships
with specific students.
• Covariates observed by the econometrician, but not by the principal: Anonymous DNB
exam scores.
14
A multiple match means that two students are found in the same school × grade × year with the same
personal details (date of birth, gender, etc.). This may occur only randomly and is not likely to bias our results.
Most information observed by the principal is thus contained in the dataset. Although we
do not observe the teachers’ written comments, we do observe a behavioral score for three-quarters of the sample, which contains precisely the information we expect the principals to
infer from the written comment.15 As we will show, the anonymous DNB exam scores are key
in this study, since they allow us to test our main identification assumption.
Descriptive statistics are presented in Table II.1 for the entire population of 10th grade
students (column I). In particular, it is interesting to note that the average freshman has 1.7
persistent classmates out of the 8.3 former classmates enrolled in their high school.16
15
Note that it would be hard, in any case, to work directly with the written comments even if we could
observe them. If we were to do so, we would try to build a score to summarize the information contained in the
comment, which is the purpose of this behavioral score.
16
The ratio is roughly equal to 5, the average number of classes in high schools, suggesting that principals
do not try to group students having the same class of origin when allocating them among 10th grade classes.
2
Identification
The identification strategy used in this paper is based on a quasi-experimental setting. Since
principals do not know first-year students personally, principals cannot easily distinguish between two students who are similar "on paper", i.e. who exhibit identical or very close registration files. If such students are separated into different classes, it is credible to assume that
they were randomly assigned to their classes. Therefore, classroom peer effects are identified
by sticking to comparisons between students coming to a given high school with the same
observable characteristics, but ending up in different classes. Such students form what we
call "similar-file" groups (or SF groups), and all comparisons throughout the paper are made
between students belonging to the same SF group.
Figure II.1 illustrates this approach, where students A and B have close observable characteristics. If the high school principal has to split them between classes X and Y, we assume
that the decision between assigning A to X and B to Y (case 1) or the reverse (case 2) is as
good as random. Therefore, the differences between students A and B’s classes (e.g. characteristics of classmates C-D-E compared to F-G-H) are uncorrelated with differences in individual
unobserved factors of achievement, allowing for causal inference of peer effects. Classroom
peer effects can be estimated by examining the correlations between students A and B’s gap in
outcome and differences in their class characteristics.
Formally, we estimate the following model using OLS:17
$$y_{igc} = \alpha_g + \beta \cdot C_{igc} + \epsilon_{igc} \tag{II.1}$$
where yigc denotes high school outcomes for student i, assigned to 10th grade class c and belonging to SF group g, as in model (II.3), Cigc is a vector of class characteristics (e.g. peer ability,
female share, or i’s number of persistent classmates), and εigc captures individual unobserved
factors of achievement. αg is the SF group fixed effect that restricts the analysis to comparisons within groups of students with similar registration files and is the key to identification.
17
Although our estimation strategy is similar in spirit to exact-matching methods, we choose not to use
matching estimation as the regressors examined in this paper are not binary. To our knowledge, the literature
is very sparse when it comes to the estimation of average causal effects of multi-valued treatments by propensity
score or exact matching methods (see Imbens, 2000, on this point).
Indeed, β captures the peer effects of interest under the key assumption that 10th grade class
characteristics Cigc are not correlated with εigc conditional on g:
$$C_{igc} \perp \epsilon_{igc} \mid g \quad \text{i.e.} \quad \operatorname{Cov}(C_{igc}, \epsilon_{igc} \mid g) = 0 \tag{II.2}$$
By controlling for SF group fixed effects, model (II.1) estimates β parameters only by
comparing separated similar-file students with each other, as long as their classrooms differ on
Cigc dimensions (in particular, we do not compare students who are separated versus students
who end up in the same class).
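For intuition, consider the simplest case of an SF group g made up of exactly two students, A and B, assigned to different classes c and c′, as in Figure II.1. The short derivation below is only a sketch under this two-student assumption; it uses the notation of model (II.1) and introduces nothing beyond the labels A, B and c′.

```latex
\begin{align*}
y_{Agc}  &= \alpha_g + \beta \cdot C_{Agc}  + \epsilon_{Agc}, \\
y_{Bgc'} &= \alpha_g + \beta \cdot C_{Bgc'} + \epsilon_{Bgc'}, \\
y_{Agc} - y_{Bgc'} &= \beta \cdot \left( C_{Agc} - C_{Bgc'} \right)
                      + \left( \epsilon_{Agc} - \epsilon_{Bgc'} \right).
\end{align*}
```

The fixed effect αg drops out of the difference, so β is identified from the gap in class characteristics within the pair, provided this gap is uncorrelated with the gap in unobserved factors, which is what assumption (II.2) requires.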
In what follows, we describe how we define SF groups before providing empirical evidence
supporting assumption (II.2).
2.1
Definition of similar-file groups
The natural experiments consist of students who enrolled in a given high school and year
with the same or very similar observable characteristics from the principal’s perspective, but
assigned to different classrooms. These groups of students, denoted g ∈ {1, . . . , G} and called
SF groups, are defined using all variables that we observe on first-year students’ registration
files. We consider that students belong to the same "similar-file" group only if they come
from the same 9th grade class in middle school; enroll in the same high school in the same
year; select the same options (i.e. same foreign language and optional courses); share the same
gender, age18 and social background (low- or high-SES) based on father’s occupation; belong
to the same quintile of average 9th grade continuous assessment score in scientific subjects
(mathematics, physics-chemistry, and biology); belong to the same quintile of average 9th
grade continuous assessment score in humanities (French, history and foreign languages)19 ; and
belong to the same decile of average 9th grade continuous assessment score across all subjects
18
We do not look at the exact date of birth but only at whether the students have repeated at least one year:
age is broken down to just one dummy variable.
19
The foreign languages score is the weighted average of the student’s main foreign language (weight = 2/3)
and second foreign language (weight = 1/3). Using different weights does not change the paper’s results. Note
that the continuous assessment score in history is missing for 5.4 percent of observations. For these students,
the average humanities score is the average of the French and foreign languages scores only.
listed above.20 Note that continuous assessment scores are binned into deciles or quintiles in
order to have enough cases where at least two students share the exact same values for all
observed characteristics. We show in section 4.1 that our estimates are insensitive to using
narrower or wider bins.
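The grouping rule above can be sketched as follows. This is a minimal illustration rather than the code used for the paper: all field names are hypothetical, and the sketch assumes that score quantiles are computed over whatever list of students is passed in, which the text does not specify.

```python
from collections import defaultdict


def quantile_bin(value, sorted_values, n_bins):
    """Return the 0-based quantile bin of `value` among `sorted_values`."""
    rank = sum(1 for v in sorted_values if v < value)
    return min(rank * n_bins // len(sorted_values), n_bins - 1)


def build_sf_groups(students):
    """Group students whose registration files look alike.

    `students` is a list of dicts with hypothetical field names. Scores are
    binned over the whole list passed in (an assumption of this sketch).
    """
    sci = sorted(s["science"] for s in students)
    hum = sorted(s["humanities"] for s in students)
    avg = sorted(s["overall"] for s in students)

    groups = defaultdict(list)
    for s in students:
        key = (
            s["high_school"], s["year"], s["class_9th"], s["options"],
            s["gender"], s["repeated_a_year"], s["high_ses"],
            quantile_bin(s["science"], sci, 5),      # quintile, sciences
            quantile_bin(s["humanities"], hum, 5),   # quintile, humanities
            quantile_bin(s["overall"], avg, 10),     # decile, all subjects
        )
        groups[key].append(s)

    # Keep only groups whose members were split across 10th grade classes.
    return [g for g in groups.values()
            if len({s["class_10th"] for s in g}) > 1]
```

Widening or narrowing the bins only requires changing the `n_bins` arguments, which is the kind of variation a robustness check on bin width would explore.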
This definition of SF groups leaves us with 32,492 groups of SF students, 13,680 of which
include students who were not assigned by their principal to the same 10th grade class.21 The
total sample of SF students for which class effects can be estimated thus comprises 28,053
students out of an initial population of 2,897,986 10th grade students covering eight cohorts.22
Model (II.1) is estimated on this "SF sample" of 28,053 students who have at least one
similar-file mate ("SF mate") and end up in separate classes, corresponding to 13,680
separate natural experiments. Each student is only compared to his or her SF mate(s) since we control
for αg fixed effects. This single fixed effect controls jointly for year, class and middle
school of origin, high school of destination and all other characteristics observed in students’
registration files. Therefore, there is no need to control separately for high school or year fixed
effects, or for individual characteristics.
Overall, these students are allocated to 22,162 different classes. Over the whole SF sample,
10,461 students are assigned to the same class as at least one student from another SF group.
When estimating model (II.1), standard errors are therefore clustered by 10th grade class using
the Moulton formula, although this only affects standard errors at the margin as the clusters are
very small in size.
20
Two students who belong to the same average score decile across all subjects may have very different subject-specific profiles: one may have high marks in sciences but not in humanities, or vice versa. Principals most
probably differentiate between such students. This explains why we add quintiles of scientific and humanities
scores separately, aside from the average score decile.
21
A total of 8,341 of the 32,492 SF groups are characterized by a set of optional courses that are only available
in one high school class. Thus, these groups could not have been split up in any case. We conclude that the
principals split up 13,680 SF groups out of 24,151 (57 percent) groups that could be split. The reason for
separating out a group of SF students across different classes might be endogenous to potential outcomes, but
does not affect our strategy. We only assume that, conditional on separating out SF students across different
classes, principals decide randomly which students they assign to which class.
22
This population excludes 10th grade repeaters, but also newcomers for whom data on 9th grade exam
scores is missing. Note that our definition of SF groups allows only 1 percent of the population to come to high
school with at least one other student (only one in 93 percent of cases) who shares the same values for Xi , while
ending up in different classes. This illustrates just how much our identification approach calls for a very rich
database: only a large initial pool of students can yield a sample of SF students large enough to get accurate
estimates of peer effects.
2.2
Empirical evidence of random assignment
The definition of SF groups we just described is already very restrictive. Given principals’
time constraints, they may not look for additional information to decide which of the two
students they assign to which class. As we described in section 1 however, there is still some
information they have that might be used. Assume, for instance, that there is substantial
variation in ethnicity captured by names, even conditional on g, and that this variation
is correlated with potential outcomes. If it is also taken into account by the principal
when assigning these similar-file students to different classes, and if it is correlated with εigc ,
then assumption (II.2) is false and our results would be biased. The same argument holds for
teachers’ short written comments on students’ behavior, which we do not observe.
2.2.1
Balancing test using anonymous exam scores
A first way to provide supporting evidence of the validity of assumption (II.2) is to exploit one
piece of information that we observe but principals do not: the students’ anonymous scores Ai obtained
in the national DNB exam just before entering high school. If conditional on g, principals
do assign students based on some information we do not observe and that is correlated with
potential outcomes, then some correlation between class characteristics Cigc and anonymous
exam scores Ai should be observed, again, conditional on g. For instance, if 9th grade teachers
feel a student is disruptive enough to warrant mentioning it in their written report, then they
probably under-grade his or her performance in class (as measured by continuous assessment
scores). Therefore, disruptive students should display higher anonymous exam scores on average
than their SF mate(s) with no behavioral issues, since SF students have very close continuous
assessment scores by construction. This can be shown empirically using our data on the 2006-2011
cohorts, for which the behavior score NVS is available. A regression of Ai on NVSi controlling
for g (which includes continuous assessment scores) yields a negative coefficient of −0.059
with a standard error of 0.020. It shows that when teachers report that a student’s behavior is
worse than that of his or her SF mate, this student gets a higher score in the anonymous DNB exam,
which suggests that he or she has been under-graded in class.
Ai can thus be used to examine whether assumption (II.2) of a random assignment conditional on g
is credible. We do so by estimating the following model:
$$C_{igc} = \alpha_g + \beta \cdot A_i + u_{igc} \tag{II.3}$$
where Cigc is any of class c’s characteristics and g is the index denoting "similar-file" (SF)
groups, i.e. the groups of students sharing a specific vector of values for Xi . Adding αg to
the model constrains the regression to compare students solely with their SF mate(s). Under
assumption (II.2), we expect β to be equal to zero.
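The point estimate of such a balancing regression can be sketched as follows for a single class characteristic. This is an illustration only: the function name `within_group_slope` and the tuple layout are hypothetical, the sketch relies on the standard equivalence between group fixed effects and within-group demeaning, and it does not compute standard errors (clustered or otherwise).

```python
from collections import defaultdict


def within_group_slope(rows):
    """OLS slope of `c` on `a` with group fixed effects.

    `rows` is an iterable of (group_id, a, c) tuples, where `a` stands for
    the anonymous exam score and `c` for one class characteristic. Both
    variables are demeaned within group, which is equivalent to including
    one dummy per group; the slope is sum(a~ * c~) / sum(a~ ** 2).
    """
    by_group = defaultdict(list)
    for g, a, c in rows:
        by_group[g].append((a, c))

    num = den = 0.0
    for members in by_group.values():
        mean_a = sum(a for a, _ in members) / len(members)
        mean_c = sum(c for _, c in members) / len(members)
        for a, c in members:
            num += (a - mean_a) * (c - mean_c)
            den += (a - mean_a) ** 2
    if den == 0:
        raise ValueError("no within-group variation in the regressor")
    return num / den
```

In a toy dataset where groups with higher scores also sit in classes with higher values of the characteristic, but the characteristic does not vary with the score inside a group, the pooled correlation is positive while this within-group slope is zero, which is the pattern the test looks for.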
Table II.2 reports the estimated β parameters of model (II.3) for the entire SF sample for
a number of class characteristics Cigc (columns I and II). Column I measures the raw sample
correlations between ability and class characteristics, i.e. without the αg fixed effect. In the SF
sample, more able students are assigned to larger classes and with more persistent classmates;
their classmates are also higher-achieving students, more often female and high-SES. All these
correlations are statistically significant at the 1 percent level, except for the number of females.
However, these correlations vanish within SF groups: as soon as we include the SF fixed
effect, the estimates for β become very small and non-significant for all class characteristics
(column II). In other words, for students who were similar with respect to Xi at the time of
class assignment in a given high school and year, remaining differences in ability (unobserved
by principals) have no correlation with differences between class characteristics.23
This is a very strong result in favor of assumption (II.2). It clearly suggests that principals
do not use any achievement-related information we do not observe to decide on class assignment
for separated SF students, and thus split them up randomly (or at least exogenously to potential
outcomes). This result is not surprising, since SF students are similar across
many variables, and the remaining vector of information is very small in comparison (names
and teachers’ written comments). It is fully consistent with our field observations that
principals do not have time (or do not take time) to differentiate between very similar students.
Another possible scenario is that principals do consider the remaining information and
23
We provide two additional tests in the online appendix. First (Table II.11), we show that the results hold
when we use more detailed class characteristics regarding the number of persistent classmates of each type (low- or high-ability, same or opposite gender). Second (Table II.12), we estimate equation (II.3) the other way around,
i.e. regressing Ai on all class characteristics Cigc at the same time, thus measuring partial correlations between
ability and each of the class characteristics. The conclusions of both these tests are identical to Table II.2.
that it is indeed correlated with potential high school outcomes, but without being
correlated with DNB test scores. In other words, testing for imbalances in Ai would not be
relevant since anonymous DNB scores are not a good measure of all unobserved determinants
εigc of achievement in high school.24 Yet we argue that this is very unlikely, since principals
do not observe DNB scores. It is therefore hard to imagine that principals would use the little
information we do not observe to separate SF students in a way that is correlated with εigc but
without any correlation with Ai as shown in Table II.2.
Table II.2 also reports the results of the exogeneity test for a subsample of "at risk" SF
students (columns III and IV). We define them as SF students who are low-achievers (below
the median score of their middle school of origin) and low-SES. As we show in section 3.2.1, our
main results regarding the positive effect of keeping classmates in the transition to high school
are driven mostly by this specific subsample. For this reason, we check that the exogeneity test
performs well for these students, as reported in columns III and IV of Table II.2. Therefore,
we conclude that the SF students driving our main results are credibly exogenously assigned
to their classes.
All in all, our definition of SF groups appears to be most suitable to estimate the causal
effect of class environment on students’ outcomes. Checking the robustness of our results to
other specifications, we show in section 4.1 that our results barely change when using alternative
specifications that are more or less restrictive.
2.2.2
Additional evidence of random assignment
When defining the g groups of SF students, we do not require students to share a similar behavioral score (Note de vie scolaire or NVS). This is because this score is not available for the
first two cohorts, so including it would force us to remove one-quarter of our sample. However, Table II.2 suggests that it does not constitute a threat to our identification assumption.
Otherwise, the resulting allocation of SF students would create a correlation between Cigc
and Ai conditional on Xi , since anonymous DNB scores are correlated with behavioral issues
conditional on teachers’ grades (see above).
24
For instance, a student’s level of autonomy may be more important to high school achievement than
achievement in the DNB exam.
Additional evidence that principals do not use behavioral considerations to differentiate between
SF students can nonetheless be provided. This is done by checking whether differences in class characteristics Cigc are correlated with potential differences in the NVS score in cohorts 2006 to
2011 (when the NVS score is available). We do so by estimating (II.3) after substituting Ai
with a dummy for having an NVS score beneath the 10th percentile (equal to 15 out of 20).25
Results are presented in Table II.3. Most correlations between student behavior and class characteristics are both very small and non-significant at standard levels as long as comparisons are
restricted within groups of SF students. This is true for both the entire SF sample (column II
compared to column I) and the subsample of SF students at risk who drive our main results
(column IV compared to column III).
Lastly, we run additional tests whose results are not reported for the sake of brevity. Firstly,
all the analyses presented in this paper are repeated for the 2006-2011 cohorts, constraining
SF students to share a similar NVS score. Similar results are systematically found, though the
estimates are less accurate. Secondly, we check whether differences in the continuous variables
included in the definition of g groups (notably continuous assessment scores) conditional on
their binned values are correlated with class characteristics Cigc . Once again, the correlations
are mostly small and not significantly different from 0, especially for the subsample of students
at risk of underachievement.
Overall, the empirical evidence strongly supports assumption (II.2) that high school principals randomly assign separated SF students (as defined in section 2.1) to their classes, or at least
exogenously to potential achievement outcomes. The separation of similar-file students across
10th grade classes creates differences in educational outcomes that can therefore be attributed
on average to differences in class characteristics.
2.3
Description of the SF sample
Descriptive statistics on the SF sample compared to the initial population are presented in
column II of Table II.1. All in all, students from the SF sample appear to be slightly higher
25
The NVS score has a very specific, negatively skewed distribution. 33 percent of students have the maximum
score of 20 since they exhibit no disruptive behavior. The average score is 18, while the median score is 19.
Therefore, we choose to define students with disruptive behavior as students with a score below the 10th
percentile, which is precisely equal to 15 out of 20. Our results are not sensitive to the choice of threshold.
achievers than the population of high school freshmen as a whole. Yet the differences are
not always large in magnitude, although they are statistically significant. For example, the
average DNB test score is 25.1 (sd = 5.6) in the SF sample compared to 23.9 (sd = 5.1) for
the population as a whole. 15.0 percent of the SF sample repeat 10th grade as opposed to
15.3 percent of the total population, a difference that is again very small.26 Column III reports
the same descriptive statistics for the subsample of SF students at risk. By construction, these
students are very low down on the ability distribution. They have an average normalized
DNB score of −0.76, repeat grade 10 almost 2.5 times as often as the average student in the
population, and graduate from high school almost half as often.
26
In terms of schools attended, SF students are found in 1,851 out of 2,679 high schools, i.e. 69 percent of all
high schools. The high schools that do not get SF students are mostly very small schools, in which the chances
of getting two students with the same registration files are small. They have 66 students on average in grade
10, versus 259 on average for high schools with SF students. Overall, the high schools containing SF students
account for 91 percent of all 10th grade students.
3
Results
3.1
Freshman-year class characteristics and achievement
Given assumption (II.2), differences in class characteristics between SF mates are orthogonal
to differences in individual unobservable characteristics. Therefore, conditional on g, regressing
outcomes on any class characteristic – e.g. classmates’ average ability – identifies a contextual
effect that is not attributable to unobserved individual characteristics. Yet since classes differ in
several ways simultaneously, the result of such a regression could be driven by some correlated,
omitted class characteristics – e.g. the number of females. Hence as a first step, we attempt to
figure out which aspect of the class environment is correlated with achievement, by regressing
outcomes on several observed characteristics at once. We estimate model (II.1) where Cigc is
a vector of peer characteristics commonly studied in the literature (average ability,27 number
of female students, number of high-SES students and class size), completed by the student’s
number of persistent classmates.
We consider different outcomes measured throughout the high school curriculum. The first
four outcomes pertain to students’ possible situations at the end of freshman year: repeating
grade 10, dropping out,28 enrolling in an academic major, or enrolling in a technological major.29
Results are reported in columns I to IV of Table II.4. The number of persistent classmates
is positively associated with achievement at the end of grade 10. On average, each additional
persistent classmate a student gets relative to her similar-file mate reduces her risk of
repetition by 0.3 percentage point (pp.) and raises her probability of enrolling in an academic
major by a similar amount (se = 0.1 pp.), with small and non-significant effects on drop-out and enrollment in
a technological major.30 One additional female classmate reduces the risk of repetition by
27
As measured by the DNB score. Because this data is missing for all repeating students (around 10 percent of
classmates) and for another 20 percent of classmates (not matched, see section 1.3.1), we also include quadratic
controls for the shares of repeating students and missing data.
28
As described in section 1, this "drop-out" measure picks up attrition due both to matching issues and
actual drop-out. Since class environment is unlikely to substantially affect the matching procedure though, we
believe this measure adequately captures the effect of class characteristics on the risk of drop-out.
29
We also estimated model (II.1) without controlling for the SF fixed effect to get the raw sample correlations.
Basically, the number of persistent classmates displays positive, significant correlations with all outcomes,
with larger estimates than those obtained with model (II.1). However, contrary to Table II.4’s estimates,
the classmates’ average ability and female share are respectively positively and negatively associated with
achievement. Detailed results on raw correlations are reported in the online appendix (Table II.13).
30
We examine whether one academic major drives the effect, but find the same positive, non-significant effect
0.1 pp. on average. This positive relationship between female peers and school achievement is
consistent with the results found by other studies (Hoxby, 2000; Lavy and Schlosser, 2011).31
Classmates’ average ability is negatively associated with performance. A one standard deviation
increase in the classmates’ average DNB score32 significantly raises the risk of repetition by 3.8 pp. on
average, but reduces both the drop-out rate and the probability of enrollment in an academic
major. Peer ability therefore displays effects that are unclear.33 The number of high-SES
students also has a negative effect as it significantly increases the risk of dropping out, but its
magnitude is rather small. Finally, we find no class size effect, most likely because of the small
variance between classes in a given high school (the standard deviation of class size is only 1.9
students within SF groups).
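To gauge the magnitude of the peer ability estimate, it can be rescaled by the dispersion actually observed between SF mates. The back-of-the-envelope calculation below assumes the effect is linear and uses only two figures reported in the text: the 3.8 pp. effect of a one standard deviation increase, and the within-SF-group standard deviation of classmates' average ability, equal to 27 percent of a DNB score standard deviation.

```latex
\[
0.27 \times 3.8 \ \text{pp.} \approx 1.0 \ \text{pp.}
\]
```

Under this linearity assumption, a typical difference in peer ability between two SF mates is thus associated with a change of roughly one percentage point in the risk of repetition.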
Table II.4 also reports results for two outcomes measured later than the end of 10th grade.
Column V shows the effect of freshman-year class characteristics on the probability of students
taking the baccalauréat exam "on schedule", i.e. three years after entering high school, meaning that they do not repeat grade 10 or grade 11 and that they make it through grade 12
without dropping out. Then, column VI investigates whether students with more persistent
classmates in grade 10 are also more likely to pass the exam at that time. Interestingly, only
the number of persistent classmates has a clear and enduring effect over time. Three years after
entering high school, SF students who gain an additional persistent classmate in their freshman
year are still more likely to take the baccalauréat exam at the end of grade 12. This result
implies that the reduction in 10th grade repetition is not cancelled out by a higher propensity to repeat grade 11 or drop out of grade 12. Furthermore, they do not seem to perform
any worse than others over these years, since they are also more likely to graduate from high
school. In comparison, all other class characteristics display estimates that are rather small in
magnitude and never statistically significant. Thus, the number of PCs seems highly relevant
on enrollment in sciences, humanities and social sciences. Results on specific major enrollment are not reported
for the sake of brevity, but are available on request.
31
When adding an interaction term between own gender and the number of female classmates, we find that
this effect is driven entirely by female students (no effect on males). Note that controlling for this interaction
term does not change the estimate of the PC effect. This rules out the interpretation of the PC effect as
capturing the impact of assignment to same-sex classmates.
32
The standard deviation of classmates’ average ability within SF groups is only 27 percent of an average
DNB score standard deviation.
33
The results obtained in the literature as regards the effect of peer ability are also mixed and inconclusive.
Here, the negative peer effect is consistent with the impact of a lower relative position within the class, because
students may look weaker to teachers when assigned with better classmates. This may have little effect on
drop-outs, but it would raise their risk of repeating grade 10 and at the same time reduce their chances of
admission to an academic major track.
to capture the dimension(s) of class environment that matter, even more so than other, classic
peer characteristics commonly studied in the literature.
Yet it is unclear what the number of PCs actually measures or captures. SF students’
persistent classmates could affect them by means of mechanisms implying all sorts of unobserved
characteristics that generate peer effects, such as ability and motivation. In the following, we
provide strong evidence suggesting that the PC effect does not capture an ability peer effect,
but rather works through a social network mechanism. As we will show, the most consistent
interpretation of the data is that students simply benefit from getting peers they know and
with whom they are used to interacting.
3.2
The protective role of familiarity with classmates
We first check that controlling or not for other class characteristics does not affect the PC
estimates. In Table II.5, we report the previous estimates of the effect of PCs from regressions
where Cigc is the full vector (column II). In column I, the effect of the number of PCs is
estimated without controlling for the other class characteristics. The estimates are virtually
identical for the two regressions, indicating that PCs are not correlated with these other class
characteristics.34 This first piece of evidence strongly suggests that the number of persistent
classmates does not capture any other omitted dimension of class environment. Suppose, for
instance, that SF students’ persistent classmates had specific characteristics associated with
higher performance, which benefit SF students without any link with familiarity. Considering
Table II.5’s results, such characteristics should not be correlated with DNB score, gender or
SES. In other words, to be consistent with Table II.5’s pattern, any credible omitted class
characteristic driving the PC effect would have to be uncorrelated with all other observed class
dimensions in Cigc. It is highly unlikely that such a characteristic exists.35 In section 4.2, we
run a robustness check that exploits within-class variation in the number of PCs to confirm that the PC estimate
is unlikely to be driven by characteristics that are fixed at class level.
34
This remains true even if we allow the model to account for non-linearities in the effect of peer ability, e.g.
by controlling for the share of very high- (or low-) achieving peers.
35
Unfortunately, we do not have access to teachers’ characteristics at class level. Yet any correlation between
PCs and teachers’ characteristics should drive a change in the PC estimate when controlling for students’ average
ability, because assignment of teachers to classes is mostly related to students’ ability.
In the remaining tables of the present section, we systematically include quadratic controls
for other class characteristics when we estimate the effect of PCs. As seen in Table II.5, these
controls do not affect our estimates.
3.2.1 Distribution of the PC effect
Another reason to believe in the "familiarity" interpretation concerns the heterogeneity of the
PC effect. If historical familiarity with classmates really matters, it is probably not equally
important for all students. In particular, we expect that the role of former classmates is all the
more important for those students who are likely to experience a difficult transition. This is
exactly what we find.
We provide an analysis of the distribution of the PC effect in Table II.6. We first investigate
how it varies with SF students’ level of achievement (panel B). To retain enough statistical
power, we split the sample into just two parts, defining SF students as either low- or high-ability based on their relative position with respect to the median continuous assessment score
for their middle school of origin.36 The PC effect is strikingly heterogeneous across these two
categories. While the number of PCs has virtually no effect on high-ability students, low-ability
students are strongly and positively impacted. For the sake of brevity, we do not report on the
magnitude of the estimates, since the effects are actually highly heterogeneous again between
low- and high-SES within this subgroup. As reported in panel C, the effects observed on low-ability students are almost exclusively driven by low-SES students. On average, each additional
PC reduces their risk of repetition by 1.4 pp, though not their risk of dropping out. They are
therefore significantly more likely to enroll for either an academic or a technological major, with
a similar increase in magnitude. No backlash to this strong short-term impact can be found
in following grades. On average, each PC in freshman year raises their chances of taking the
baccalauréat exam and graduating by the same amount. In comparison, the estimates are very
small in magnitude and never statistically significant at conventional levels for low-ability high-SES students (column V) and high-ability low-SES students (not reported). This suggests that
keeping some classmates matters only for students who may be experiencing a hard transition
both academically (they were already performing poorly in middle school) and culturally
(their parents come from the working class and might not have studied at high school).
36
Therefore, the terms "low-ability" and "high-ability" do not denote distribution ends. Besides, we use
the continuous assessment score since SF groups are defined with regard to it, such that two SF students are
necessarily both below or above the median. Although the anonymous DNB exam score would be a better
measure of ability, two SF students may be on different sides of the median DNB score. We would thus lose
part of the SF sample by analyzing the PC effect separately on each side of the median DNB score. However,
doing so brings us to the same conclusions as in Table II.6, though the estimates are often less accurate.
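As an illustration of the sample split described above, the following minimal sketch labels students relative to the median continuous assessment score of their middle school of origin. The field names (`id`, `school`, `score`), the toy records, and the rule that a student exactly at the median counts as low-ability are assumptions made for the example; they are not taken from our data or estimation code.

```python
from collections import defaultdict
from statistics import median


def label_ability(students):
    """Label each student 'low' or 'high' relative to the median score
    of his or her middle school of origin (ties count as 'low' here)."""
    scores_by_school = defaultdict(list)
    for s in students:
        scores_by_school[s["school"]].append(s["score"])
    medians = {school: median(v) for school, v in scores_by_school.items()}
    return {
        s["id"]: "high" if s["score"] > medians[s["school"]] else "low"
        for s in students
    }


# Hypothetical records: two middle schools with different score levels.
students = [
    {"id": "a", "school": "M1", "score": 8.0},
    {"id": "b", "school": "M1", "score": 12.0},
    {"id": "c", "school": "M1", "score": 15.0},
    {"id": "d", "school": "M2", "score": 14.0},
    {"id": "e", "school": "M2", "score": 16.0},
]
labels = label_ability(students)
print(labels)
```

Because the median is school-specific, a score of 14 is labeled low in school M2 even though it would be above the median in school M1, which is why the two labels do not denote the ends of the overall distribution.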
In the rest of the paper, we derive results only for this specific subsample of low-ability, low-SES students, which we call students "at risk". We focus on this subsample because the effect
of persistent classmates is entirely driven by these students. As in Table II.5, the presence or
not of other class characteristics on the right hand side of the regression does not change the
estimates of the PC effect.
In panel D, we investigate this distribution pattern further by looking into how the PC
effect varies with the difference in school-level social environment between middle school and
high school (measured by the share of high-SES students). This gap, denoted ∆p, is negative for one-third of low-ability low-SES
students only.37 These low-ability, low-SES students are twice as likely to experience a positive
∆p, meaning that they enter an environment with more high-SES students than they used to
have.
Although the sample size is too small to draw definite conclusions (the differences between the
two groups are not significant at the 10 percent level), we do observe a difference in magnitude
between students at risk depending on the sign of ∆p. The difference goes in the direction we
would expect, i.e. that students who experience a more difficult transition (∆p > 0) are more
sensitive to the presence of persistent classmates.
In panel E, we also estimate the difference of the PC effect between male and female students
at risk. However, the results do not display any clear heterogeneity in the gender dimension.
Persistent classmates seem to have more impact on the male repetition rate than the female
repetition rate. Yet the discrepancy runs in the opposite direction for the baccalauréat outcomes, with larger estimates for females. Both male and female students thus seem to benefit
from persistent classmates in freshman year, although the benefits differ slightly depending on
the stage of the high school curriculum.38
37
This is a mechanical consequence of the lower probability of finding low-SES students enrolled in general
high schools after grade 9.
38
Although the increase in academic major enrollment is similar across both genders, persistent classmates
switch male outcomes from repetition to science majors only and female outcomes to humanities only. To
be more precise, both male and female PCs increase male enrollment in science, while females enroll more in
humanities only when they get more female PCs. Results available on request.
We further analyzed the distribution of the effect in middle and high school contexts. These
results are not reported since no other interesting pattern can be found. For example, the effect
does not appear to vary significantly with middle or high school size, the share of middle school
classmates attending the high school, or the 10th grade class context.39
All in all, the results of this investigation are consistent with our interpretation of the PC
effect. The estimates reported in Table II.4 are a watered-down version of the very strong PC
effect on SF students who experience an upheaval in the transition to high school.40 Already
knowing some peers in the class matters a lot to low-ability students with low socioeconomic
status who come from an environment that is poor compared to the high school. This is most
consistent with the interpretation of the PC effect as reflecting the impact of familiarity. By
contrast, it is unlikely that the former classmates of these low-achieving underprivileged students have better unobserved characteristics than average, which would drive the PC effect. The following
section presents additional evidence in support of this.
3.2.2 Do all former peers matter?
If the effect of persistent classmates is explained by familiarity, then students at risk should be
more affected by peers with whom they have been more likely to interact at middle school.41
In particular, they may have interacted much more with their former classmates than with
middle school peers in other classes. In Table II.7, panel A, we add to the previous regressions
the number of these former middle school mates from other classes. We find a small, negative
effect on grade repetition, but it is not significant. Surprisingly, it is accompanied by a
small increase in the risk of dropping out, statistically significant at the 5 percent level. Other
estimates are very small in magnitude and never statistically significant. Therefore, students
seem to benefit only from their middle school mates who were in the same class, with whom
they probably interacted much more.
39
We check, in particular, whether the extent to which your new classmates are grouped with their former
classmates increases your need to be with yours. Yet again, we find no result to suggest that this is the case.
This is noteworthy as it suggests that grouping former classmates would not drive negative spillovers on their
other classmates, who do not necessarily have many former classmates in high school. It would be helpful,
however, to confirm such a conclusion with a controlled field experiment to directly investigate externalities
within the class.
40
We check whether the other peer characteristics studied in Table II.4 also have a larger effect on this
specific category of students "at risk". Results are provided in the online appendix (Table II.14). Again, other
peer characteristics (average ability, number of females, etc.) display non-significant and non-persisting effects
on high school achievements, even for these students.
41
We focus solely on the subsample of SF students at risk since section 3.2.1 shows that they are the only ones
driving the PC effect. The balancing test shown in Table II.2 is repeated for this subsample for the different
types of PCs in the online appendix (Table II.11). The test is satisfied for all types of PCs.
Students do not appear to benefit more from some specific types of persistent classmates.
After controlling for the number of all persistent classmates, the numbers of same-gender PCs
(panel B) and high-ability PCs (panel C) do not trigger different effects on academic achievement.
Interestingly, high-ability PCs do not appear to be more beneficial to students. Therefore, the
PC effect is highly unlikely to result from a higher unobserved ability of persistent classmates.42
This is further evidence that students benefit from their persistent classmates only because
they know each other.
42
We again define "high-ability" as students with continuous assessment scores above the school median, for
consistency with Table II.6. However, anonymous exam scores are a better measure of ability and could be
used here to define classmates’ ability without any loss of accuracy (by comparison to SF students, see again
footnote 36). The same results are found with either measure.
4 Robustness checks
4.1 Alternative SF group specifications
All the results presented in section 3 are based on a quite restrictive definition of SF groups.
We required students to have the exact same values for all variables that were included in
the registration files, binning test scores into deciles or quintiles. This degree of accuracy was
necessary in order for our exogeneity tests to be valid. In this section, however, we explore the
sensitivity of the results to alternative definitions of the SF groups.
We test both definitions that are more restrictive and less restrictive than the main definition. For instance, we can be less restrictive by allowing students to have different values for
some of the variables observed by principals or by broadening the bins of the continuous test
scores. Conversely, we can be more restrictive by using narrower bins. In Table II.8, we report
the effect of persistent classmates on grade repetition for four alternative specifications for the
primarily affected sample of "at risk" students (low-ability, low-SES). The reference definition
is reproduced in column IV; columns I to III show the results for less restrictive definitions and
column V presents a more restrictive definition. The details of each definition are given in the
table.
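To make the notion of more and less restrictive definitions concrete, the sketch below groups students who match exactly on a list of registration variables and on a binned test score. The variable names, the bin cutoffs and the toy records are hypothetical, and the sketch ignores the further requirement that SF students be separated into different classes; it only illustrates how widening the bins or dropping a variable enlarges the groups.

```python
from collections import defaultdict


def score_bin(score, cutoffs):
    """Index of the bin containing `score`, given ascending bin cutoffs."""
    return sum(score > c for c in cutoffs)


def sf_groups(students, exact_vars, cutoffs):
    """Group students who match exactly on `exact_vars` and on their score bin.

    Only groups with at least two students are kept, since a similar-file
    comparison needs at least one pair.
    """
    groups = defaultdict(list)
    for s in students:
        key = tuple(s[v] for v in exact_vars) + (score_bin(s["score"], cutoffs),)
        groups[key].append(s["id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


# Hypothetical registration records (not actual data).
students = [
    {"id": "a", "school": "M1", "gender": "F", "score": 10.2},
    {"id": "b", "school": "M1", "gender": "F", "score": 10.8},
    {"id": "c", "school": "M1", "gender": "F", "score": 12.5},
    {"id": "d", "school": "M1", "gender": "M", "score": 10.5},
]

reference = sf_groups(students, ["school", "gender"], [10, 11, 12, 13])
broader_bins = sf_groups(students, ["school", "gender"], [13])
fewer_vars = sf_groups(students, ["school"], [10, 11, 12, 13])
print(reference, broader_bins, fewer_vars)
```

In this toy example the reference definition pairs only students a and b, while broader bins add c and dropping gender adds d. Larger groups bring more statistical power, but at the cost of comparing students who differ on information the principal can observe.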
All specifications yield a significant negative effect of the number of persistent classmates on
grade repetition, although the magnitude of the effect varies from one specification to another.
Bear in mind that the balancing test presented in section 2 produces less conclusive results for
the alternative specifications.43 The results in column IV therefore remain our reference results.
However, the fact that the effect retains the same sign and size is reassuring for the validity
and robustness of our results.
43
Results available on request.
4.2 Estimation based on the impact of SF student allocation on their classmates
In section 3, the identification of peer effects within classes is based directly on the comparison of
SF students randomly assigned to different classes. Yet the random assignment of SF students
may also be seen itself as a shock to class composition. Receiving one or another SF classmate
in the class can make a difference to the other students in these classes. A focus on the effects
of SF student allocation on their 10th grade classmates yields new estimates of the effect of
classmate persistence, hence testing the robustness of the results presented in section 3.
4.2.1 Principle
Using the notations from Figure II.1, we now compare students C to H with each other instead
of comparing student A to student B. If A and B are defined as similar-file, the result of the
random allocation of A and B should have no impact on students C–H, since A and B have
the same characteristics. However, if we allow A and B to come from two different classes of
the same middle school, the result of the allocation will have an impact on students C to H if
some of them come from A or B’s 9th grade class. Given the similarity of A and B from most
points of view except their specific class of origin, the result of the allocation will only affect
this dimension of the class characteristics vector Cic . For example, if students A and C come
from the same 9th grade class, C would have one more PC in case 1 than in case 2.
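The example above can be sketched in a few lines, using the student labels of Figure II.1. The 9th grade class labels (`k1`, `k2`, ...) are hypothetical and chosen so that only A and C share a class of origin; the function simply counts peers who share both the 9th grade class and the 10th grade class of a given student.

```python
def count_pcs(student, ninth_class, tenth_class):
    """Number of persistent classmates of `student`: peers who share both
    the student's 9th grade class and 10th grade class."""
    return sum(
        1
        for other in tenth_class
        if other != student
        and ninth_class[other] == ninth_class[student]
        and tenth_class[other] == tenth_class[student]
    )


# Hypothetical classes of origin: only A and C come from the same 9th grade class.
ninth = {"A": "k1", "C": "k1", "B": "k2", "D": "k3",
         "E": "k4", "F": "k5", "G": "k6", "H": "k7"}

# Case 1: A joins class X and B joins class Y. Case 2: the reverse.
case_1 = {"A": "X", "C": "X", "D": "X", "E": "X",
          "B": "Y", "F": "Y", "G": "Y", "H": "Y"}
case_2 = {"B": "X", "C": "X", "D": "X", "E": "X",
          "A": "Y", "F": "Y", "G": "Y", "H": "Y"}

pcs_c_case_1 = count_pcs("C", ninth, case_1)
pcs_c_case_2 = count_pcs("C", ninth, case_2)
pcs_d_case_1 = count_pcs("D", ninth, case_1)
pcs_d_case_2 = count_pcs("D", ninth, case_2)
print(pcs_c_case_1, pcs_c_case_2, pcs_d_case_1, pcs_d_case_2)
```

Student C gains one persistent classmate in case 1 relative to case 2, whereas student D, who shares no class of origin with A or B, is unaffected by the allocation.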
Therefore, in this section, we use another definition of SF students in which we require all
variables to be identical except for the classroom of origin (the middle school of origin must be
the same). However, we find in our exogeneity test (II.3) that principals do distinguish between
students who come from different classes if those classes do not have similar characteristics.
Therefore, in this section, students will be considered similar-file students if they come from
classes that share three characteristics: quintile of the average DNB score, proposing or not
elite options, and quintile of the number of students going to the destination high school.
Formally, for each SF group j produced by the new definition, we can define an instrument
Zij equal to the number of persistent classmates that student i obtains from that particular SF
group. The variable is defined only for students who were in the same 9th grade class as one
of the SF students in group j, and who are in one of the 10th grade classes attended by these
SF students. We denote this sample Pj. Note that in this context, i belongs to the sample of
former classmates of the SF sample, $P = \bigcup_j P_j$, and not to the SF sample itself.
Formally, we estimate the following reduced form model:
$Y_{ijkc} = \alpha_{jk} + \beta \cdot Z_{ij} + \epsilon_{ijkc}$ (II.4)
where c denotes the 10th grade class and k denotes the 9th grade class. The αjk fixed effect
ensures that comparisons are made between students who belong to the same Pj sample and
come from the same 9th grade class. Alternatively, this fixed effect can be replaced with a
αjc fixed effect, where we compare solely students ending up in the same 10th grade class. β
identifies the causal effect of getting one additional persistent classmate.44
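For intuition, a reduced form with a single regressor and group fixed effects can be estimated by demeaning the outcome and the instrument within each fixed-effect cell. The sketch below is a simplified illustration with made-up numbers, not our estimation code: the cell labels and values are hypothetical, the outcome is generated exactly as a cell constant plus −0.5 times Z, and standard errors (including clustering) are not computed.

```python
from collections import defaultdict


def within_beta(rows):
    """Slope of y on z with one fixed effect per cell.

    `rows` is a list of (cell, z, y) tuples. Both variables are demeaned
    within each cell, then a single slope is fitted on the deviations.
    """
    cells = defaultdict(list)
    for cell, z, y in rows:
        cells[cell].append((z, y))
    num = den = 0.0
    for obs in cells.values():
        z_bar = sum(z for z, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for z, y in obs:
            num += (z - z_bar) * (y - y_bar)
            den += (z - z_bar) ** 2
    return num / den


# Hypothetical data: each cell is an (SF group j, 9th grade class k) pair.
rows = [
    (("j1", "k1"), 0, 1.0),
    (("j1", "k1"), 1, 0.5),
    (("j1", "k2"), 0, 3.0),
    (("j1", "k2"), 1, 2.5),
    (("j1", "k2"), 1, 2.5),
]
beta = within_beta(rows)
print(round(beta, 6))
```

The cell constants (1.0 and 3.0 here) drop out of the estimate, so only differences between students in the same cell contribute to the slope. Replacing the cell label with an (SF group, 10th grade class) pair gives the alternative fixed effect discussed above.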
4.2.2 Validity of the test
The exogeneity of this instrument is based on a stronger assumption than the main model.
Although SF students are randomly split, the assignment of other freshmen is not exogenous.
In particular, if they were assigned with respect to the number of PCs, instrument Zij would
not be exogenous. Therefore, model (II.4) is identified under the hypothesis that P students
are not allocated to classes correlated with PCs.
In order to check this additional hypothesis, we estimate the correlation between the value
of instrument Zij (the number of persistent classmates received by random allocation) and the
individual characteristics of the students i ∈ Pj :
$Z_{ij} = \alpha_{jk} + \gamma \cdot X_i + u_{ijkc}$ (II.5)
where Xi is the vector of observable characteristics tested.
We present the results of this test in Table II.9. In column I, we find that individual
characteristics are correlated with the instrument when the controls for αjk fixed effects are
not included. However, these correlations vanish when we include them (column II) or when
we replace them with a 10th-grade-class fixed effect (column III).45 These results suggest that
the students who obtain a randomly assigned PC are comparable in their observed dimensions
to those who do not, within Pj samples. Although these students might be different on an
unobserved level, we argue that this test is satisfactory enough to run a robustness check on
our main results presented in section 3.
44
Zij is a "perfect" instrument for PC, as it has a correlation of one with PC and there is no compliance
issue here. This is why we estimate the reduced form model directly.
4.2.3 Results
The results of the estimation of model (II.4) are reported in Table II.10. As in Table II.9, the
αjk fixed effect is omitted in column I, included in column II and replaced with a 10th-grade-class fixed effect in column III. Since Zkc takes the same value for all students in 10th grade
class c from the same 9th grade class k, the standard errors are clustered within kc groups
(in keeping with Moulton’s formula). Similar to section 3, we find that a higher number of
persistent classmates is associated with lower grade repetition and higher enrollment for the
academic major. We also observe a long-term positive effect on high school graduation. The
orders of magnitude are similar to the results using the first strategy and do not vary drastically
depending on the fixed effect we include.
All in all, these results confirm our main strategy results. Moreover, this approach has some
advantages even though it relies on a stronger assumption. First, the main strategy focuses on
students who have been separated from a very similar former classmate, most likely a friend.
Getting more persistent classmates may have more of an impact than usual in such settings.
This approach finds a similar impact on a different sample, thus removing doubts as to the
external validity of our results. Furthermore, by allowing comparisons of students within the
same class (column III), we can estimate the effect of a pure variation in the number of PCs,
holding other class characteristics constant.46 This allays our concerns about omitted class
characteristics driving our results in section 3. Last but not least, it shows that the positive
effect of persistent classmates does not operate solely through improvement in the global class
context, which would affect everyone similarly.47 Freshmen do therefore benefit from familiar
peers via channels that operate at individual level, such as a greater sense of belonging or social
and academic support.
45
We use DNB quintile dummies instead of the DNB score to avoid losing students with missing values.
Students for whom the DNB score is missing have all five dummies equal to zero.
46
Suppose in Figure II.1 that C comes from A’s class and D from B’s class. For C and D, getting A or
B only changes their relative number of PCs. This is true by construction, because A and B have the same
characteristics as regards all other dimensions.
47
It might have been expected, for example, that it is more comfortable to teach a class if more students
already know each other in the class. This beneficial impact on teachers might then affect all students in the
class, even those who are not directly affected by having more former classmates. Yet in this case, we would
not find any difference between students in the same 10th grade class.
5 Discussion and conclusion
This paper documents how classes influence students’ achievements in high school. Empirical
evidence suggests that freshmen students with very similar registration files, when separated
among different classes, are randomly assigned to their class. With this quasi-experimental
setting, differences in class environments can be credibly assumed orthogonal to potential outcomes. After examining the correlations between a number of measures of class composition
and student outcomes, we find a robust and significant effect of being assigned again with more
former classmates. Yet this effect is anything but homogeneous. It is almost exclusively driven by
low-achieving, low-SES freshmen who enroll in high schools with more high-SES mates than
they used to have. These students "at risk" in high school do not benefit more from the presence of high-achieving persistent classmates. This may be surprising, since these peers are
more likely to provide academic help than low-achieving persistent classmates. Most
probably, low-achieving students benefit from former classmates through social channels.
Mechanisms implying direct interactions could be at work. For example, persistent classmates could be friends or acquaintances to whom freshmen may talk during the early weeks,
ask for help, or even work as a team.48 However, even where there is no interaction, being
surrounded by peers they know and who experience the same difficulties may also be a psychological relief, fostering their sense of belonging in the high school. Precise data on students’
relationships and well-being such as those used by Lavy and Sand (2012) would help understand how freshmen take advantage of familiarity with peers. Unfortunately, such data is not
available in our context.
Most importantly, though, our results show that students do not pay for increased enrollment in grade 11 with lower performance in subsequent years. So, whatever the
mechanisms at work, we know that persistent classmates do raise achievement.49 Basically, we
argue that this result suffices to draw relevant policy recommendations on class composition.
48
In this way, grouping students who come from the same class may be an efficient tool to help teachers
develop cooperative learning within the class, as they might rely on existing friendships and social links between
students right from the start of the year.
49
In particular, mechanisms implying solely a change in preferences are ruled out by this result. For example,
persistent classmates could make students less likely to repeat 10th grade only by increasing their propensity
to appeal against teachers’ decisions (to avoid losing friends). In that case though, drawbacks would be
observed in the following grades since students would enroll in grades and for majors where requirements would
be too high.
Whereas a great deal of money is usually invested in improving outcomes of students at risk of
underachievement, we show that assigning them to some persistent classmates could increase
their performance at no cost. The potential gains could be substantial. Our analysis finds that
each persistent classmate reduces their risk of repetition by 1.5 pp. on average. This figure is
estimated using solely the variance of PCs observed within groups of SF students at risk, with
98 percent of variations ranging from 0 to 3 PCs (no conclusion should be drawn about the PC
effect beyond this range). Students at risk in the freshmen population have 1.5 low-achieving
PCs on average: raising this figure by 3 (see footnote 50) could thus reduce their risk of repetition by 4.5 pp.
(meaning 13 percent of their current rate) while increasing their graduation rate by the same
amount. Non-linearities of the PC effect might result in the same (or even a greater) benefit
with fewer than 3 PCs, but the small variance in the number of PCs in our sample does not
allow us to investigate this. Nonetheless, it would be helpful to back up these recommendations with experimental evidence, as Carrell et al. (2011) stressed the need to test such policy
predictions drawn from reduced-form estimates.
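The back-of-the-envelope calculation behind these figures can be written out explicitly. The sketch below assumes that the PC effect is linear over the 0 to 3 range and takes the baseline repetition rate of students at risk from Table II.1 (column III, 35.9 percent); treating that figure as the "current rate" is our reading of the table, not a separately reported number.

```python
# Inputs taken from the text and from Table II.1; linearity is an assumption.
effect_per_pc_pp = 1.5       # reduction in repetition risk per additional PC (pp)
extra_pcs = 3                # hypothetical increase in the number of PCs
baseline_repetition = 35.9   # repetition rate of students at risk (percent)

reduction_pp = effect_per_pc_pp * extra_pcs
relative_reduction = reduction_pp / baseline_repetition

print(f"Reduction in repetition risk: {reduction_pp:.1f} pp")
print(f"Share of the baseline rate: {100 * relative_reduction:.1f} percent")
```

Under these assumptions, the relative reduction is about one eighth of the baseline rate, in line with the rounded figure of 13 percent quoted above.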
We believe this study makes an important contribution to the existing literature on the role
of the school environment in achievement. We show that low-achieving students benefit from
peer persistency when the rest of their environment gets largely disrupted by the transition to
high school. This study emphasizes the need for a minimum of stability in the face of great
instability, and highlights the huge complexity of peer effects and social interactions.
50
11 former classmates are enrolled in their high school on average.
Bibliography
Angrist, Joshua D. and Kevin Lang (Dec. 2004). “Does School Integration Generate Peer
Effects? Evidence from Boston’s Metco Program”. In: The American Economic Review 94.5,
pp. 1613–1634.
Calvó-Armengol, Antoni, Eleonora Patacchini, and Yves Zenou (2009). “Peer Effects
and Social Networks in Education”. In: Review of Economic Studies 76.4, pp. 1239–1267.
Carrell, Scott E., Bruce I. Sacerdote, and James E. West (Feb. 2011). From Natural
Variation to Optimal Policy? The Lucas Critique Meets Peer Effects. NBER Working Paper
16865. National Bureau of Economic Research.
Cullen, Julie Berry, Brian A. Jacob, and Steven Levitt (Sept. 2006). “The Effect of
School Choice on Participants: Evidence from Randomized Lotteries”. In: Econometrica
74.5, pp. 1191–1230.
De Giorgi, Giacomo and Michele Pellizzari (July 2013). Understanding Social Interactions:
Evidence from the Classroom. NBER Working Papers 19202. National Bureau of Economic
Research, Inc.
Duflo, Esther, Pascaline Dupas, and Michael Kremer (Aug. 2011). “Peer Effects, Teacher
Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya”.
In: The American Economic Review 101.5, pp. 1739–74.
Foster, Gigi (2006). “It’s not your peers, and it’s not your friends: Some progress toward
understanding the educational peer effect mechanism”. In: Journal of Public Economics
90.8–9, pp. 1455–1475.
Fruehwirth, Jane Cooley (2013). “Identifying peer achievement spillovers: Implications for
desegregation and the achievement gap”. In: Quantitative Economics 4.1, pp. 85–124.
Gibbons, Stephen, Olmo Silva, and Felix Weinhardt (Oct. 2013). “I (Don’t) Like the
Way You Move: The Disruptive Effects of Residential Turnover on Student Attainment”.
Preliminary draft, SOLE 2014 Conference.
Goux, Dominique and Éric Maurin (Oct. 2007). “Close Neighbours Matter: Neighbourhood
Effects on Early Performance at School”. In: Economic Journal 117.523, pp. 1192–1215.
Halliday, Timothy J. and Sally Kwak (2012). “What is a peer? The role of network definitions in estimation of endogenous peer effects”. In: Applied Economics 44.3, pp. 289–302.
Hoxby, Caroline M. (2000). Peer Effects in the Classroom: Learning from Gender and Race
Variation. NBER Working Paper 7867. National Bureau of Economic Research.
Imbens, Guido W. (Sept. 2000). “The Role of the Propensity Score in Estimating Dose-Response Functions”. In: Biometrika 87.3, pp. 706–710.
Kling, Jeffrey R., Jeffrey B. Liebman, and Lawrence F. Katz (2007). “Experimental Analysis of Neighborhood Effects”. In: Econometrica 75.1, pp. 83–119.
Lavy, Victor, M. Daniele Paserman, and Analia Schlosser (Mar. 2012). “Inside the Black
Box of Ability Peer Effects: Evidence from Variation in the Proportion of Low Achievers in
the Classroom”. In: Economic Journal 122.559, pp. 208–237.
Lavy, Victor and Edith Sand (Oct. 2012). The Friends Factor: How Students’ Social Networks Affect Their Academic Achievement and Well-Being? NBER Working Paper 18430.
National Bureau of Economic Research.
Lavy, Victor and Analia Schlosser (Apr. 2011). “Mechanisms and Impacts of Gender Peer
Effects at School”. In: American Economic Journal: Applied Economics 3.2, pp. 1–33.
Lavy, Victor, Olmo Silva, and Felix Weinhardt (Apr. 2012). “The Good, the Bad, and the
Average: Evidence on Ability Peer Effects in Schools”. In: Journal of Labor Economics 30.2,
pp. 367–414.
Mora, Toni and Philip Oreopoulos (2011). “Peer effects on high school aspirations: Evidence
from a sample of close and not-so-close friends”. In: Economics of Education Review 30.4,
pp. 575–581.
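[Diagram: on the left, similar-file students A and B in 9th grade. On the right, their two possible 10th grade assignments. Case 1: A joins C, D and E in class X, and B joins F, G and H in class Y. Case 2: B joins C, D and E in class X, and A joins F, G and H in class Y.]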
Consider two "similar-file" students A and B, i.e. two first-year students who share the same
characteristics for all the information included in the registration files. Our key assumption is that
they are not distinguished by the high school principal during class assignment, because she does
not personally know her first-year students at that time.
The principal has two options when she allocates these students: she can either assign them to the
same class or separate them into two classes X and Y. We do not compare these two scenarios.
We focus exclusively on the scenario where the students are separated. In this scenario, there are
two possible cases: student A is assigned to class X and student B is assigned to class Y (case 1)
or student A is assigned to class Y and student B is assigned to class X (case 2).
We argue that the choice between case 1 and case 2 is random and can be seen as a lottery that
affects the two students’ social environments in 10th grade. For instance, student A will be with
students C, D and E in case 1, students F, G and H in case 2.
Differences in outcomes between students A and B can be credibly attributed to differences in
classroom characteristics.
Figure II.1: Class assignment of similar-file students
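[Stacked chart: composition of a student’s class (0 to 100 percent) by grade entered, from grade 6 to grade 12 (middle school, then high school). Stacks from bottom to top: A, persistent classmates; B, classmates from the same school; C, classmates of other origin; D, repeating students.]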
This graph shows the typical composition of a student’s classroom when he or she enters a given
grade g ∈ {6, . . . , 12}. For each grade g, the sample consists of students entering that grade for the
first time (i.e. repeating students are excluded) for cohorts 2000 to 2008 (when they never repeat,
students from cohort 2000 enter grade 6 in 2000, grade 7 in 2001, etc.).
Reading: Among the classmates of a student entering grade 9, 32 percent are persistent classmates
from 8th grade, 82 − 32 = 50 percent are classmates who were in different classes from the same
school in 8th grade, 94 − 82 = 12 percent are non-repeating students who were in a different school
in 8th grade, and 6 percent are students repeating 9th grade.
Notes about the data: Cohorts 2000 and 2001 are missing for grades 6 and 7; cohort 2008 is
missing for grade 12. We only know the school attended in grade 5, not the classroom; therefore
we cannot distinguish between persistent classmates and former schoolmates (stacks A and B) in
grade 6.
Note that during middle school, the average share of persistent classmates is around 30 percent.
This is due to the fact that classes are generally reshuffled from one year to another.
Figure II.2: Composition of the typical classroom from a non-repeating student’s point of view
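The reading note of Figure II.2 converts cumulative stack heights into the share of each type of classmate by successive subtraction. The minimal sketch below reproduces that arithmetic for grade 9, using only the four figures quoted in the note; the labels and the function name are shorthand chosen for this illustration.

```python
# Cumulative stack heights (in percent) for grade 9, as quoted in the
# reading note of Figure II.2. The labels are shorthand for this sketch.
cumulative_heights = [
    ("persistent classmates", 32),
    ("same school, different class", 82),
    ("different school", 94),
    ("repeating students", 100),
]


def stack_shares(cumulative):
    """Turn cumulative heights into the share of each individual stack."""
    shares = {}
    previous_top = 0
    for label, top in cumulative:
        shares[label] = top - previous_top
        previous_top = top
    return shares


shares = stack_shares(cumulative_heights)
for label, share in shares.items():
    print(f"{label}: {share} percent")
```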
Table II.1: Descriptive statistics on students’ characteristics
                                              Population¹    SF sample    At risk
                                              (I)            (II)         (III)
Girl                                          0.547          0.618        0.609
                                              (0.498)        (0.486)      (0.488)
High-SES                                      0.301          0.301        0.000
                                              (0.459)        (0.459)      (0.000)
High quality optional course                  0.145          0.093        0.021
                                              (0.352)        (0.291)      (0.143)
DNB national exam score                       23.935²        25.140       20.114
                                              (5.077)        (5.551)      (3.879)
Normalized DNB national exam score            0.000²         0.246        −0.757
                                              (1.000)        (1.099)      (0.762)
Had repeated at least once before grade       0.147          0.037        0.101
                                              (0.354)        (0.188)      (0.301)
Repeats 10th grade                            0.151          0.150        0.359
                                              (0.358)        (0.357)      (0.480)
Attrition (drop out or unmatched in panel)    0.083          0.040        0.078
                                              (0.276)        (0.197)      (0.269)
Academic major in grade 11                    0.590          0.694        0.303
                                              (0.492)        (0.461)      (0.460)
Technological major in grade 11               0.175          0.117        0.259
                                              (0.380)        (0.321)      (0.438)
Takes Bac on schedule                         0.658          0.737        0.482
                                              (0.474)        (0.441)      (0.500)
Graduates high school                         0.529          0.630        0.314
                                              (0.499)        (0.483)      (0.464)
Number of PC in 10th grade class              1.718³         1.999        1.503
                                              (2.528)        (2.256)      (1.760)
Number of former classmates in high school    8.327³         12.680       10.934
                                              (6.466)        (5.924)      (4.936)
N                                             3,589,710      28,053       8,981
Standard deviations are reported in parentheses.
¹ The population is made of the 10th grade students in the general track coming from the 9th grade (i.e. excluding repeating students). The general track
typically contains more girls and more high-SES students.
² Only students whose DNB score is known: N = 2,897,986.
³ Only students whose 9th grade class is known: N = 3,381,271.
Table II.2: Student’s class characteristics regressed on own anonymous exam
score: Evidence of the random assignment of similar-file students
Dependent variable              All          All          At risk      At risk
                                (I)          (II)         (III)        (IV)
Number of PCs                   0.062***     0.000        0.037***     −0.008
                                (0.003)      (0.005)      (0.005)      (0.008)
Average DNB score¹              0.040***     0.001        0.047***     0.002
                                (0.001)      (0.001)      (0.001)      (0.002)
Number of girls                 0.004        0.001        −0.014       −0.030
                                (0.006)      (0.011)      (0.014)      (0.020)
Number of high-SES students     0.316***     0.013        0.255***     0.019
                                (0.008)      (0.009)      (0.014)      (0.015)
Class size                      0.067***     0.002        0.063***     0.000
                                (0.004)      (0.006)      (0.009)      (0.012)
N                               28,053       28,053       8,981        8,981
SF fixed effect                 No           Yes          No           Yes
Each cell is from a separate regression of the class characteristic of interest
on the student’s standardized average anonymous score at the DNB exam.
The "at risk" sample consists of low-ability, low-SES students, on which
the main effects’ magnitudes are the highest.
The SF fixed effect is a single fixed effect accounting altogether for 9th
grade class and middle school of origin, high school of destination and
individual characteristics, as defined in section 2.1. By controlling for the
SF fixed effect, we compare each student only with her similar-file mate
assigned randomly to another class.
All regressions include quadratic controls for the share of retained students
and of missing DNB scores.
Standard errors (in parentheses) are clustered at the 10th grade class
level.
¹ This is the normalized DNB score. The normalization is done over the
whole population: the sample’s mean is 0.245.
Table II.3: Student’s class characteristics regressed on behavior score: Evidence of the random assignment of similar-file students
Dependent variable
Number of PCs
All
All
At risk
At risk
(I)
(II)
(III)
(IV)
−0.117*
0.017
(0.061)
(0.091)
−0.173*** −0.105
(0.059)
Average DNB score
−0.157*** −0.005
(0.014)
Number of girls
Number of high-SES students
Class size
N
SF fixed effect
(0.084)
(0.013)
−0.557*** −0.039
−0.129*** −0.012
(0.017)
−0.706***
(0.018)
0.115
(0.136)
(0.162)
(0.188)
(0.224)
0.027
0.065
−0.023
−0.028
(0.202)
(0.122)
(0.189)
(0.165)
−0.280*** −0.030
−0.179
−0.023
(0.093)
(0.110)
(0.116)
(0.126)
15,775
15,775
4,952
4,952
No
Yes
No
Yes
Each cell is from a separate regression of the class characteristic of interest
on a dummy for having an NVS score beneath the 10th percentile (equal
to 15 out of 20).
The behavior score is not available for the first two cohorts, hence the
smaller sample size.
The "at risk" sample consists of low-ability, low-SES students, on which
the main effects’ magnitudes are the highest.
The SF fixed effect is a single fixed effect accounting altogether for 9th
grade class and middle school of origin, high school of destination and
individual characteristics, as defined in section 2.1. By controlling for the
SF fixed effect, we compare each student only with her similar-file mate
assigned randomly to another class.
All regressions include quadratic controls for the share of retained students
and of missing DNB scores.
Standard errors (in parentheses) are clustered at the 10th grade class
level.
Table II.4: Effect of class characteristics on high school outcomes
Independent variable            Repeats       Drops       Academic    Tech.       Takes Bac      HS
                                10th grade    out         major       major       on schedule    graduate
                                (I)           (II)        (III)       (IV)        (V)            (VI)
Number of PCs                   −0.003**      −0.001      0.003**     0.001       0.005**        0.004*
                                (0.001)       (0.001)     (0.001)     (0.001)     (0.002)        (0.002)
Average DNB score               0.040***      −0.012*     −0.022**    −0.006      −0.011         0.007
                                (0.010)       (0.006)     (0.010)     (0.010)     (0.014)        (0.014)
Number of girls                 −0.001**      0.001       0.001       0.000       0.001          0.001
                                (0.001)       (0.000)     (0.001)     (0.001)     (0.001)        (0.001)
Number of high-SES students     −0.000        0.001**     −0.000      −0.001      −0.000         0.001
                                (0.001)       (0.001)     (0.001)     (0.001)     (0.001)        (0.001)
Class size                      −0.000        0.000       0.000       −0.000      0.002          0.000
                                (0.001)       (0.001)     (0.001)     (0.001)     (0.002)        (0.002)
R2                              0.68          0.56        0.79        0.63        0.68           0.71
N                               28,053        28,053      28,053      28,053      22,946¹        22,946¹
SF fixed effect                 Yes           Yes         Yes         Yes         Yes            Yes
Each column is from a separate regression of students’ outcomes on their class characteristics.
The SF fixed effect is a single fixed effect accounting altogether for 9th grade class and middle
school of origin, high school of destination and individual characteristics, as defined in section
2.1. By controlling for the SF fixed effect, we compare each student only with her similar-file
mate assigned randomly to another class.
All regressions include quadratic controls for the share of retained students and of missing DNB
scores.
Standard errors (in parentheses) are clustered at the 10th grade class level.
¹ Bac data was not available for the last two cohorts, hence the smaller sample size.
Table II.5: Effect of persistent classmates on high school outcomes with and without controlling for other class characteristics
Dependent variable                           (I)          (II)
Repeats 10th grade                           −0.003**     −0.003***
                                             (0.001)      (0.001)
Drops out                                    −0.001       −0.001
                                             (0.001)      (0.001)
Academic major                               0.003**      0.003**
                                             (0.001)      (0.001)
Technological major                          0.001        0.001
                                             (0.001)      (0.001)
N                                            28,053       28,053
Takes Bac on schedule                        0.005***     0.005***
                                             (0.002)      (0.002)
HS graduate                                  0.004**      0.004*
                                             (0.002)      (0.002)
N¹                                           22,946       22,946
SF fixed effect                              Yes          Yes
Control for other class characteristics      No           Yes
Each cell is from a separate regression of students’ outcomes
on their number of PCs, controlling or not for the other
class characteristics presented in Table II.4. Column (II) is
identical to the first row in Table II.4.
The SF fixed effect is a single fixed effect accounting altogether for 9th grade class and middle school of origin, high
school of destination and individual characteristics, as defined in section 2.1. By controlling for the SF fixed effect,
we compare each student only with her similar-file mate
assigned randomly to another class.
All regressions include quadratic controls for the share of
retained students and of missing DNB scores.
Standard errors (in parentheses) are clustered at the 10th
grade class level.
¹ Bac data was not available for the last two cohorts, hence
the smaller sample size.
Table II.6: Distribution of the PC effect
                                Dependent variable
Independent variable            Repeats       Drops       Academic    Tech.       Takes Bac      HS
                                10th grade    out         major       major       on schedule    graduate
                                (I)           (II)        (III)       (IV)        (V)            (VI)
(A) Whole sample: average effect
Number of PCs                   −0.003***     −0.001      0.003**     0.001       0.005***       0.004*
                                (0.001)       (0.001)     (0.001)     (0.001)     (0.002)        (0.002)
N                               28,053        28,053      28,053      28,053      22,946¹        22,946¹
(B) Whole sample: by ability
Number of PCs                   −0.000        −0.001      0.002       −0.001      0.002          0.001
                                (0.001)       (0.001)     (0.001)     (0.001)     (0.002)        (0.002)
Low-ability × PC                −0.009***     −0.001      0.003       0.006**     0.009**        0.007
                                (0.003)       (0.002)     (0.003)     (0.003)     (0.004)        (0.004)
N                               28,053        28,053      28,053      28,053      22,946¹        22,946¹
(C) Low-ability students: by SES
Number of PCs                   0.002         −0.002      −0.003      0.003       0.003          0.000
                                (0.005)       (0.003)     (0.005)     (0.003)     (0.007)        (0.007)
Low-SES × PC                    −0.015**      0.001       0.011*      0.003       0.011          0.012
                                (0.006)       (0.004)     (0.007)     (0.005)     (0.009)        (0.008)
N                               11,383        11,383      11,383      11,383      9,588¹         9,588¹
(D) Low-ability, low-SES students ("at risk"): by disruption intensity
Number of PCs                   −0.010*       0.003       0.006       0.001       0.007          0.010*
                                (0.006)       (0.004)     (0.005)     (0.005)     (0.006)        (0.006)
∆p > 0 × PC                     −0.009        −0.009      0.006       0.012       0.015          0.005
                                (0.008)       (0.006)     (0.008)     (0.007)     (0.010)        (0.009)
N                               8,981         8,981       8,981       8,981       7,615¹         7,615¹
(E) Low-ability, low-SES students ("at risk"): by gender
Number of PCs                   −0.018***     0.000       0.014**     0.004       0.010          0.009
                                (0.006)       (0.005)     (0.006)     (0.005)     (0.008)        (0.008)
Girl × PC                       0.007         −0.002      −0.009      0.004       0.007          0.005
                                (0.009)       (0.006)     (0.008)     (0.007)     (0.010)        (0.010)
N                               8,981         8,981       8,981       8,981       7,615¹         7,615¹
Each column in each panel is from a separate regression.
The SF fixed effect is a single fixed effect accounting altogether for 9th grade class
and middle school of origin, high school of destination and individual characteristics, as
defined in section 2.1. By controlling for the SF fixed effect, we compare each student
only with her similar-file mate assigned randomly to another class.
All regressions include quadratic controls for the share of retained students, the share
of missing DNB scores, and the class characteristics presented in Table II.4.
Standard errors (in parentheses) are clustered at the 10th grade class level.
¹ Bac data was not available for the last two cohorts, hence the smaller sample size.
Table II.7: Which peers matter? Decomposition of the PC effect on students at risk.
                                             Dependent variable
Independent variable                         Repeats       Drops       Academic    Tech.       Takes Bac      HS
                                             10th grade    out         major       major       on schedule    graduate
                                             (I)           (II)        (III)       (IV)        (V)            (VI)
(A) Persistent classmates and persistent schoolmates
Number of PCs                                −0.014***     −0.001      0.008*      0.006*      0.013***       0.011**
                                             (0.004)       (0.003)     (0.004)     (0.004)     (0.005)        (0.005)
Persistent schoolmates from other classes    0.000         0.002       −0.001      −0.001      −0.004         −0.003
                                             (0.003)       (0.002)     (0.002)     (0.002)     (0.003)        (0.003)
(B) Persistent classmates by gender
Number of PCs                                −0.015**      −0.001      0.007       0.008       0.014*         0.011
                                             (0.007)       (0.004)     (0.007)     (0.006)     (0.007)        (0.007)
Same sex PCs                                 0.001         −0.001      0.002       −0.002      −0.001         0.003
                                             (0.010)       (0.006)     (0.009)     (0.009)     (0.011)        (0.010)
(C) Persistent classmates by ability
Number of PCs                                −0.017**      −0.001      0.014**     0.005       0.014*         0.011
                                             (0.007)       (0.004)     (0.006)     (0.006)     (0.008)        (0.007)
High-ability PCs                             0.007         0.000       −0.010      0.003       −0.000         0.001
                                             (0.010)       (0.006)     (0.009)     (0.009)     (0.011)        (0.011)
N                                            8,981         8,981       8,981       8,981       7,615¹         7,615¹
The sample is limited to low-ability, low-SES students (the "at risk" sample), on which the effects’ magnitudes
are the highest.
Each cell is from a separate regression of students’ outcomes on their number of persistent classmates.
The SF fixed effect is a single fixed effect accounting altogether for 9th grade class and middle school of
origin, high school of destination and individual characteristics, as defined in section 2.1. By controlling for
the SF fixed effect, we compare each student only with her similar-file mate assigned randomly to another
class.
All regressions include quadratic controls for the share of retained students, the share of missing DNB scores,
and the class characteristics presented in Table II.4.
Standard errors (in parentheses) are clustered at the 10th grade class level.
¹ Bac data was not available for the last two cohorts, hence the smaller sample size.
Table II.8: Robustness check: effect of PC on low-ability students’ repetition
rate using different specifications of the SF fixed effect
Specifications                  (I)          (II)         (III)        (IV)         (V)
Independent variable
PC                              −0.004***    −0.012***    −0.015***    −0.014***    −0.019***
                                (0.001)      (0.004)      (0.004)      (0.004)      (0.007)
R2                              0.58         0.68         0.60         0.61         0.62
N                               169,258      19,369       11,404       8,981        3,214
Options
X
X
X
X
X
Middle school
X
X
X
X
X
Same
Similar
Same
Same
Decile
Decile
Decile
Decile
Science score
Quintile
Quintile
Quintile
Decile
Humanities score
Quintile
Quintile
Quintile
Decile
X
X
X
X
Gender
X
X
X
2-category SES
X
X
X
SF students share...
9th grade class
In-school score
Held back
Indifferent
Decile
X
The sample is limited to low-ability, low-SES students (the "at risk" sample), on
which the effects’ magnitudes are the highest.
Each cell is from a separate regression of students’ outcomes on their number
of persistent classmates.
All regressions include quadratic controls for the share of retained students,
the share of missing DNB scores, and the class characteristics presented in
Table II.4.
Standard errors (in parentheses) are clustered at the 10th grade class level.
All regressions include similar-file fixed effects, which are different in each
column. Column IV is the original definition of SF groups, columns I to III
are less restrictive and column V is more restrictive.
Table II.9: IV exogeneity test
Independent variable            (I)          (II)           (III)              (IV)
Held back                       −0.023       −0.024*        −0.016             −0.018
                                (0.014)      (0.013)        (0.012)            (0.012)
Girl                            −0.013       0.003          0.006              −0.004
                                (0.010)      (0.008)        (0.007)            (0.007)
High-SES                        0.062***     0.004          0.001              0.009
                                (0.010)      (0.008)        (0.007)            (0.007)
High quality optional course    0.041*       −0.005         0.006              0.002
                                (0.022)      (0.021)        (0.017)            (0.017)
DNB Quintile 1                  −0.051***    0.020          0.005              0.008
                                (0.017)      (0.016)        (0.015)            (0.015)
DNB Quintile 2                  −0.044***    −0.007         −0.010             −0.000
                                (0.016)      (0.015)        (0.014)            (0.013)
DNB Quintile 3                  −0.015       −0.007         −0.008             −0.014
                                (0.016)      (0.013)        (0.012)            (0.012)
DNB Quintile 4                  0.031**      0.021*         0.014              0.013
                                (0.014)      (0.013)        (0.011)            (0.011)
DNB Quintile 5                  0.035**      0.020          0.001              0.015
                                (0.015)      (0.013)        (0.011)            (0.011)
DNB Missing                     Ref.         Ref.           Ref.               Ref.
                                —            —              —                  —
R2                              0.01         0.21           0.42               0.41
N                               33,663       33,663         33,663             33,663
Fixed effect                    None         High school    HS × 9th grade     10th grade
                                                            class              class
Each column is from a separate regression of the instrument Z (number
of PCs received through one of the random allocations) on students’
characteristics.
Standard errors (in parentheses) are clustered at the class of origin ×
class of destination level.
Table II.10: Effect of PC on high school outcomes using the IV strategy
Dependent variable              (I)          (II)           (III)              (IV)
Repeats 10th grade              −0.014***    −0.009***      −0.007**           −0.008***
                                (0.003)      (0.003)        (0.003)            (0.003)
Drops out                       0.002        0.001          0.002              −0.000
                                (0.002)      (0.002)        (0.002)            (0.002)
Academic major                  0.029***     0.013***       0.009**            0.011***
                                (0.004)      (0.004)        (0.004)            (0.004)
Technological major             −0.018***    −0.004**       −0.004**           −0.003
                                (0.002)      (0.002)        (0.002)            (0.002)
N                               33,663       33,663         33,663             33,663
Takes Bac on schedule           0.010**      0.009**        0.003              0.009**
                                (0.004)      (0.004)        (0.004)            (0.004)
HS graduate                     0.025***     0.013***       0.008*             0.012***
                                (0.005)      (0.004)        (0.004)            (0.004)
N                               26,608       26,608         26,608             26,608
Fixed effect                    None         High school    HS × 9th grade     10th grade
                                                            class              class
Each column is from a separate regression of students’ outcomes on
the instrument Z (number of PCs received through one of the random
allocations).
Standard errors (in parentheses) are clustered at the class of origin ×
class of destination level.
Appendix
Appendix 1 : Additional Tables
Table II.11: Student’s class characteristics regressed on own anonymous exam score: Evidence of the random assignment of similar-file
students
Dependent variable              All          All          At risk      At risk
                                (I)          (II)         (III)        (IV)
Number of PCs                   0.062***     0.000        0.037***     −0.008
                                (0.003)      (0.005)      (0.005)      (0.008)
Same sex PCs                    0.037***     0.004        0.019***     −0.002
                                (0.002)      (0.004)      (0.003)      (0.006)
Opposite sex PCs                0.025***     −0.003       0.018***     −0.006
                                (0.002)      (0.003)      (0.003)      (0.005)
High-ability PCs                0.050***     0.005        0.029***     −0.005
                                (0.002)      (0.004)      (0.003)      (0.006)
Low-ability PCs                 0.012***     −0.004       0.007**      −0.003
                                (0.001)      (0.003)      (0.003)      (0.005)
N                               28,053       28,053       8,981        8,981
SF fixed effect                 No           Yes          No           Yes
Each cell is from a separate regression of the number of PCs
of each type on the student’s standardized average anonymous
score at the DNB exam.
The "at risk" sample consists of low-ability, low-SES students,
on which the main effects’ magnitudes are the highest.
The SF fixed effect is a single fixed effect accounting altogether
for 9th grade class and middle school of origin, high school of
destination and individual characteristics, as defined in section
2.1. By controlling for the SF fixed effect, we compare each
student only with her similar-file mate assigned randomly to
another class.
All regressions include quadratic controls for the share of retained students and of missing DNB scores.
Standard errors (in parentheses) are clustered at the 10th
grade class level.
Table II.12: Anonymous exam score regressed on class characteristics: Evidence of the random assignment of similar-file students
Independent variable            All          All          At risk      At risk
                                (I)          (II)         (III)        (IV)
Number of PCs                   0.149***     0.000        0.059***     −0.025
                                (0.013)      (0.012)      (0.022)      (0.026)
Average DNB score               4.572***     0.088        3.065***     0.094
                                (0.069)      (0.084)      (0.092)      (0.141)
Number of girls                 −0.022***    0.000        −0.010       −0.018
                                (0.006)      (0.007)      (0.008)      (0.012)
Number of high-SES students     −0.010*      0.008        0.024***     0.015
                                (0.005)      (0.008)      (0.009)      (0.016)
Class size                      0.046***     −0.001       −0.008       0.005
                                (0.011)      (0.012)      (0.013)      (0.020)
R2                              0.25         0.90         0.17         0.78
F-test                          0.00         0.47         0.00         0.35
N                               28,053       28,053       8,981        8,981
SF fixed effect                 No           Yes          No           Yes
Each column is from a separate regression of the student’s standardized
average anonymous score at the DNB exam on class characteristics.
The "at risk" sample consists of low-ability, low-SES students, on which
the main effects’ magnitudes are the highest.
The SF fixed effect is a single fixed effect accounting altogether for 9th
grade class and middle school of origin, high school of destination and
individual characteristics, as defined in section 2.1. By controlling for the
SF fixed effect, we compare each student only with her similar-file mate
assigned randomly to another class.
All regressions include quadratic controls for the share of retained students
and of missing DNB scores.
Standard errors (in parentheses) are clustered at the 10th grade class
level.
Table II.13: Raw correlation between class characteristics and high school outcomes
Dependant variable
Independent variable
Number of PCs
Repeats
10th
grade
Drops
out
Academic
major
Tech.
major
Takes
Bac on
schedule
HS
graduate
(I)
(II)
(III)
(IV)
(V)
(VI)
−0.006*** −0.001
(0.001)
Average DNB score
(0.001)
−0.072*** −0.025*** 0.176***
(0.005)
Number of girls
(0.000)
0.013***
0.001***
(0.003)
−0.001***
−0.007*** 0.008***
(0.001)
(0.001)
−0.079*** 0.108***
0.011***
(0.001)
0.177***
(0.006)
(0.005)
(0.007)
(0.007)
0.000
−0.000
0.000
0.000
(0.000)
(0.001)
(0.001)
(0.000)
(0.000)
(0.001)
0.001***
0.000**
0.001**
(0.000)
(0.000)
(0.000)
(0.000)
(0.001)
(0.001)
0.000
0.003***
−0.000
0.005***
0.004***
(0.001)
(0.000)
(0.001)
(0.001)
(0.001)
(0.001)
R2
0.03
0.01
0.09
0.04
0.04
0.08
N
28,053
28,053
28,053
28,053
22,9461
22,9461
No
No
No
No
No
No
Number of high-SES students
Class size
SF fixed effect
1
−0.003***
−0.003*** −0.002***
0.001
Each column is from a separate regression of students’ outcomes on their class characteristics.
The SF fixed effect is not included in these regressions.
All regressions include quadratic controls for the share of retained students and of missing DNB
scores.
Standard errors (in parentheses) are clustered at the 10th grade class level.
¹ Bac data was not available for the last two cohorts, hence the smaller sample size.
Table II.14: Effect of class characteristics on high school outcomes for the sample "at risk"
Independent variable            Repeats       Drops       Academic    Tech.       Takes Bac      HS
                                10th grade    out         major       major       on schedule    graduate
                                (I)           (II)        (III)       (IV)        (V)            (VI)
Number of PCs                   −0.014***     −0.001      0.009**     0.007*      0.014***       0.012**
                                (0.004)       (0.003)     (0.004)     (0.004)     (0.005)        (0.005)
Average DNB score               0.064**       −0.004      −0.053**    −0.007      −0.037         −0.018
                                (0.025)       (0.014)     (0.021)     (0.023)     (0.028)        (0.026)
Number of girls                 −0.004*       0.001       0.001       0.002       0.002          0.001
                                (0.002)       (0.001)     (0.002)     (0.002)     (0.002)        (0.002)
Number of high-SES students     −0.000        0.001       −0.000      −0.001      −0.001         0.004
                                (0.003)       (0.002)     (0.002)     (0.002)     (0.003)        (0.003)
Class size                      0.000         0.001       0.001       −0.003      0.001          −0.001
                                (0.003)       (0.002)     (0.003)     (0.003)     (0.004)        (0.004)
R2                              0.61          0.56        0.67        0.58        0.61           0.60
N                               8,981         8,981       8,981       8,981       7,615¹         7,615¹
SF fixed effect                 Yes           Yes         Yes         Yes         Yes            Yes
The sample is limited to low-ability, low-SES students (the "at risk" sample), on which the effects’
magnitudes are the highest.
Each column is from a separate regression of students’ outcomes on their class characteristics.
The SF fixed effect is a single fixed effect accounting altogether for 9th grade class and middle
school of origin, high school of destination and individual characteristics, as defined in section
2.1. By controlling for the SF fixed effect, we compare each student only with her similar-file
mate assigned randomly to another class.
All regressions include quadratic controls for the share of retained students and of missing DNB
scores.
Standard errors (in parentheses) are clustered at the 10th grade class level.
¹ Bac data was not available for the last two cohorts, hence the smaller sample size.
Appendix 2 : Details on the matching procedure
As detailed in section 1.3.1, we use the two following datasets:
• Administrative registration records: for all students enrolled in French public and publicly-funded private middle and high schools from 2001 to 2012. This dataset contains students’
personal details (e.g. date and region of birth, gender and parents’ occupation) and
information on their education: in particular grade, school and class attended, options
taken, grade and school attended in t − 1 (but not the class attended in t − 1).
• Examination records: for all students from 2004 to 2011. This dataset contains personal details and informal scores in the 9th grade DNB (both the anonymous exam and
continuous assessment scores) and 12th grade baccalauréat exams.
Unfortunately, students do not have personal identification numbers that would allow them to be
tracked across the different datasets. Yet for each 10th grade student, we need to know at
least which class they attended in 9th grade, their grade in t + 1 (repeating 10th grade or
moving to 11th grade) and their chosen major if they do move to 11th grade. We also have to match
the administrative and the examination records.
Matching administrative registration records between consecutive years
In order to match students between datasets in years t and t+1, we use the following algorithm.
For each student in year t, we look for students in t+1 who have the same values for a number of
variables within the following set: gender, date of birth (these first two variables must always
match), district of birth (this variable must either match or be missing on one of the two
datasets), district and city of residence and 31-category occupation of both parents or legal
guardians. In addition, as we have the previous school and grade in t + 1, we require these two
variables to match the current school and grade in t.
More precisely, we execute 16 rounds of matching. In the first round, we require all nine
variables to match exactly. In the following rounds, we exclude some of the variables that are
likely to change from one year to another. In the last round, we require only the gender, date
of birth, school and grade to be identical. At each round and for each student in t, there are
three possible outcomes:
• There is no match in t + 1: the student is kept for the next rounds, where fewer matching
variables will be required.
• There is exactly one match: the student is marked as matched.
• There are several matches: the student is marked as impossible to match (matching on
fewer variables will only increase the number of matches).
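The round-based procedure above can be sketched as follows. This is a simplified illustration, not the code used for the thesis: the field names, the helper `rec` and the toy records are hypothetical, only two rounds are shown instead of 16, and the sketch ignores both the rule allowing the district of birth to be missing and the possibility that the same record in t + 1 is claimed by two students in t.

```python
from collections import defaultdict

# Hypothetical field names: the text lists the matching variables but not
# how they are coded. "school" and "grade" stand for the current school and
# grade in year t, and for the previous school and grade in year t + 1.
CORE = ("gender", "birth_date", "school", "grade")


def match_in_rounds(year_t, year_t1, rounds):
    """Match students of year t to year t + 1 over successive rounds.

    `rounds` lists tuples of variable names, from the strictest set to the
    loosest one. Returns (matched, ambiguous, unmatched).
    """
    matched, ambiguous = {}, set()
    pending = dict(year_t)
    for keys in rounds:
        # Index year t + 1 records by their values on this round's variables.
        index = defaultdict(list)
        for sid, record in year_t1.items():
            index[tuple(record[k] for k in keys)].append(sid)
        for sid in list(pending):
            key = tuple(pending[sid][k] for k in keys)
            candidates = index.get(key, [])
            if len(candidates) == 1:
                matched[sid] = candidates[0]      # exactly one match
                del pending[sid]
            elif len(candidates) > 1:
                ambiguous.add(sid)                # several matches: give up
                del pending[sid]
            # no match: the student stays pending for the next, looser round
    return matched, ambiguous, set(pending)


def rec(gender, birth_date, school, grade, city):
    return {"gender": gender, "birth_date": birth_date, "school": school,
            "grade": grade, "home_city": city}


year_t = {
    1: rec("F", "1990-01-01", "A", 9, "Paris"),
    2: rec("M", "1990-02-02", "A", 9, "Lyon"),    # moves before t + 1
    3: rec("F", "1990-03-03", "A", 9, "Lille"),   # has a look-alike in t + 1
    4: rec("M", "1990-04-04", "B", 9, "Nantes"),  # absent in t + 1
}
year_t1 = {
    10: rec("F", "1990-01-01", "A", 9, "Paris"),
    20: rec("M", "1990-02-02", "A", 9, "Nice"),
    30: rec("F", "1990-03-03", "A", 9, "Lille"),
    31: rec("F", "1990-03-03", "A", 9, "Lille"),
}
rounds = [CORE + ("home_city",), CORE]
matched, ambiguous, unmatched = match_in_rounds(year_t, year_t1, rounds)
print(matched, ambiguous, unmatched)
```

In this toy example, student 2 is only matched in the second round, once the city of residence is dropped, while student 3 is set aside in the first round because two records fit.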
Our population of interest consists of the students entering 10th grade for the first time (i.e.
not repeating the grade). We have 3,589,710 such students over eight cohorts, out of which
3,381,271 students (94 percent) are matched with the dataset from the previous year.
Matching administrative registration records and examination records
The examination records at the end of 9th grade contain fewer reliable variables and our
matching is based on a smaller set of variables. For each student from the administrative
dataset, we look for a student in the examination record who comes from the same school, has
the same date and district of birth and same gender. We keep only students for which we find
exactly one match. This is the case for 2,897,986 students, i.e. 86 percent of the students whose
9th grade class is known. Out of the 14 percent of students whose scores could not be matched,
60 percent had no match in the DNB dataset and 40 percent had more than one match.
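The two match rates quoted in this appendix follow directly from the sample counts. The short check below recomputes them; the variable names are chosen for this sketch only.

```python
# Counts quoted in the text.
first_time_10th_graders = 3_589_710
matched_to_previous_year = 3_381_271
matched_to_dnb_records = 2_897_986

share_with_9th_grade_class = matched_to_previous_year / first_time_10th_graders
share_with_dnb_score = matched_to_dnb_records / matched_to_previous_year

print(f"{share_with_9th_grade_class:.1%} matched to the previous year")
print(f"{share_with_dnb_score:.1%} of those matched to DNB records")
```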
Appendix 3 : Details on the process to define the SF sample
In section 2, we define SF groups to be groups of students who share the same set of characteristics Xi that are observed both by the econometrician and the high school principal. However, it
is virtually impossible to find students who have the exact same characteristics for all variables
in Xi , including for continuous variables such as test scores. Therefore, we needed to allow for
small differences, e.g. by binning continuous variables so that students with similar but not
identical test scores might belong to the same SF group.
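As an illustration of the binning idea, the sketch below groups students who share the same discrete characteristics and fall in the same quintile of a continuous score. It is a minimal example under simplifying assumptions: the records are made up, only one score is binned, and the quantile rule used here is one simple convention among several.

```python
from bisect import bisect_right
from collections import defaultdict


def quantile_cutoffs(values, n_bins):
    """Cutoffs splitting `values` into n_bins groups of roughly equal size."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


def bin_of(value, cutoffs):
    """Index of the bin containing `value` (0 for the lowest bin)."""
    return bisect_right(cutoffs, value)


# Hypothetical records: (student id, 9th grade class, gender, score).
students = [
    ("a", "9A", "F", 8), ("b", "9A", "F", 9), ("c", "9A", "M", 10),
    ("d", "9A", "F", 11), ("e", "9A", "M", 12), ("f", "9B", "F", 13),
    ("g", "9B", "M", 14), ("h", "9B", "M", 15), ("i", "9B", "F", 16),
    ("j", "9B", "F", 17),
]
cutoffs = quantile_cutoffs([s[3] for s in students], n_bins=5)

groups = defaultdict(list)
for sid, class9, gender, score in students:
    groups[(class9, gender, bin_of(score, cutoffs))].append(sid)

# Keep only groups with at least two students, who are then "similar files".
sf_groups = {key: ids for key, ids in groups.items() if len(ids) >= 2}
print(sf_groups)
```

Wider bins (e.g. quintiles instead of deciles) produce larger groups, at the cost of pooling students whose files the principal may in fact distinguish.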
Allowing for small differences in students’ characteristics actually makes sense. In practice,
principals do not have time to examine all the exact characteristics of the students in order
to assign them to classes, as we observed during field observations. The average
academic high school contains about a thousand students, including c. 300 10th grade students
whom the principal does not know personally (except for repeaters).
Given the vector of observable characteristics Xi , we form a series of specifications of Xi that
are more or less accurate. A specification S(Xi ) is obtained by binning continuous variables
into e.g. deciles or quintiles, by simplifying multi-valued discrete variables into binary variables
or even by removing some variables. Then, we call Vi all the information that the principal
observes and that is not accounted for by S(Xi ), i.e. such that (Xi , Ui ) contains the exact
same information as (S(Xi ), Vi ). Therefore, Vi contains the variables observed by the principal
and not by us, the remaining variation within the bins of the continuous variables that we do
observe, and the variables that were dropped from Xi in S(Xi ).
Our objective is to find out the actual level of accuracy with which the principals distinguish
between students. If the specification S(Xi ) is accurate enough, then the 10th grade class
characteristics of two students sharing the same value for S(Xi ) should not be correlated with
Vi :
$C_{ic} \perp V_i \mid S(X_i)$, i.e. $\mathrm{Cov}(C_{ic}, V_i \mid S(X_i)) = 0$ (II.6)
where Cic is the vector of 10th grade class characteristics.
Although this assumption cannot be tested directly, we argue that the balancing test (II.3)
presented in section 2.2.1 provides supporting evidence of its validity. Recall that in the balancing test, we estimate the following model:
$C_{igc} = \alpha_g + \beta \cdot A_i + u_{igc}$ (II.7)
The fixed effect αg restricts comparisons to groups of individuals who have the same S(Xi ). As
explained in section 2.2.1, we expect that if Vi were correlated with Cigc conditional on S(Xi ),
then so would Ai . Therefore, if the balancing test fails, i.e. if β ≠ 0 for at least one class
characteristic Cigc , we conclude that the specification S(Xi ) is not accurate enough. Indeed,
finding that β ≠ 0 means that students who are identical with regard to S(Xi ) end up in
classes whose characteristics may differ significantly and that this difference is associated with
differences in anonymous test scores which are not observed by the principal.
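The logic of the balancing test can be illustrated with a small numerical sketch. Including the group fixed effect amounts to regressing within-group deviations of the class characteristic on within-group deviations of the anonymous score. The data below are made up and the function names are hypothetical; the sketch computes point estimates only and leaves out the additional controls and the clustered standard errors used in the actual regressions.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def within_group_slope(groups, x, y):
    """Slope after removing group means, i.e. with group fixed effects."""
    rows_by_group = defaultdict(list)
    for g, a, b in zip(groups, x, y):
        rows_by_group[g].append((a, b))
    x_dev, y_dev = [], []
    for rows in rows_by_group.values():
        mean_x = sum(a for a, _ in rows) / len(rows)
        mean_y = sum(b for _, b in rows) / len(rows)
        x_dev += [a - mean_x for a, _ in rows]
        y_dev += [b - mean_y for _, b in rows]
    return ols_slope(x_dev, y_dev)


# Made-up data: two SF groups of two students each.
sf_group = ["g1", "g1", "g2", "g2"]
anonymous_score = [0.0, 1.0, 2.0, 3.0]       # A_i
class_characteristic = [2.0, 2.0, 4.0, 4.0]  # C_igc

beta_pooled = ols_slope(anonymous_score, class_characteristic)
beta_within = within_group_slope(sf_group, anonymous_score, class_characteristic)
print(beta_pooled, beta_within)
```

In this example the pooled slope is positive because better students belong to groups that end up in better classes, whereas the within-group slope is zero. This is the pattern the balancing test looks for when the specification is accurate enough.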
Therefore, we performed the balancing test for a number of specifications. The optimal
specification S ∗ (Xi ) should satisfy the two following criteria:
1. pass the balancing test, i.e. that given two students who share the same value for S ∗ (Xi )
and end up in different classes, the differences in class characteristics should not be
correlated with differences in anonymous test scores;
2. be as inaccurate as possible, i.e. that any specification that is less accurate than S ∗ (Xi )
(e.g. with wider bins or with one variable removed) should not pass the balancing test.
These conditions ensure that we have as large a sample as possible while satisfying the balancing
test.
This optimal specification is defined as follows: SF groups contain students who come from
the same 9th grade class in middle school; enroll in the same high school in the same year; select
the same set of options (i.e. same foreign language and optional courses); share the same gender,
age and social background (low- or high-SES) based on father’s occupation; belong to the same
quintile of average 9th grade continuous assessment score in scientific subjects (mathematics,
physics-chemistry, and biology); belong to the same quintile of average 9th grade continuous
assessment score in humanities (French, history and foreign languages) and belong to the same
decile of average 9th grade continuous assessment score across all subjects listed above.51
51 See footnotes in 2.1 for more details on these variables.
This specification is used throughout the paper in most regressions. The "SF fixed effect"
row in our tables indicates that we compare students who share the same values for this set
of characteristics. On an indicative basis, we provide results for alternative specifications in
Table II.8. In this table, columns (I), (II) and (III) are based on less accurate specifications
that do not satisfy the balancing test; (IV) is the main specification S ∗ (Xi ); and column (V)
is a more accurate specification.
Chapter III
A New School in Town: Public School
Openings, Private School Choice and
Academic Achievement
I thank Éric Maurin, Xavier d’Haultfoeuille, Julien Grenet, Marc Gurgand and Arnaud Riegert for their
helpful comments and suggestions. I am also very grateful to the statistical services of the Créteil, Versailles
and Paris académies who opened access to the datasets.
State and local governments spend a significant share of their budget on school facilities every
year. In the US, around $13 billion is invested every year in school construction, not
even counting land acquisitions.1 New school construction represents half of this investment,
the other half being dedicated to building additions and modernization (Lyons, 1999). To
the best of my knowledge, little is known about the effect of opening a new school on
academic achievement in the neighborhood. Duflo (2001) shows a positive impact of school
construction in Indonesia on educational achievement and labor market outcomes, but these
results are found in a very different setting, where schools are built in areas without any school.
Neilson and Zimmerman (2011) evaluate the impact of a comprehensive school construction
project in a poor urban district. They find positive effects on home prices and test scores, which
may be attributed to both pedagogical and motivational effects.
In this paper, I take advantage of a unique French dataset to study the causal effect of
36 public middle school openings on their neighborhood, in terms of academic achievement
and school choice between public and private sectors. I use a difference-in-differences approach
that compares neighborhoods over time and with each other, as new schools open in different
years between 2003 and 2010. In this setting, it is unlikely that the exact year of opening is
endogenous to changes in students’ potential outcomes, a point I discuss in detail. This allows
me to look for shocks in school choice and outcomes happening precisely in the opening year,
controlling for linear time trends before and after this year, and for year and neighborhood
fixed effects. I provide evidence supporting the required assumption that outcomes would have
remained on their pre-opening trend without the new school.
I investigate the primary effects of new school openings on the schooling conditions of students
living in the neighborhood. As 50 % of them attend the new school after its opening, the
average distance to their school decreases by 43 %. Although little is known on the matter, this
large reduction in geographical distance to school could have a positive impact on students, e.g. by
mitigating fatigue due to the use of public transportation. Students may also identify more with
a school that is located in their own neighborhood, which matters for achievement according to
Akerlof and Kranton (2002). The other significant first-order effect of assigning many students
to the new school is to reduce the number of students enrolled in nearby schools. Consequently,
students in new school neighborhoods attend schools that are suddenly smaller, without the
overcrowding issues faced by previous cohorts. Smaller schools may be beneficial to students
through higher identification with school, better participation and less fearful climates (Pittman
and Haughwout, 1987; Alspaugh, 1998; Lee and Loeb, 2000). Goux and Maurin (2005) also
show that housing overcrowding is detrimental to academic achievement. In a similar way, school
overcrowding could be related to higher stress among teachers and students, deteriorated school
facilities and greater exposure to health and safety hazards. Finally, attending a new school
with modern and functional infrastructure and equipment could improve learning. Overall,
new public school openings may transform local educational contexts in several dimensions.
1 Source: Annual School Construction Reports (School Planning & Management).
I then examine the effect of this overall transformation of the local public school
market on students’ school choice and achievement. First, new school openings induce a strong
competition effect on the private sector: students in the neighborhood are 18 % less likely to
enroll in private middle schools after the end of elementary school. Second, I also find a
positive effect of opening a new school on academic achievement in the neighborhood, similar to
Neilson and Zimmerman (2011). Students’ achievement at the end of middle school (grade 9)
increases suddenly by around 7 % of a standard deviation in the opening year, although those
attending the new school had to change schools between grades 8 and 9. However, this effect on
outcomes does not survive some of the robustness checks, unlike the effect on private school choice.
Why do households shift from the private to the public sector as a new public school opens
in their neighborhood? Although it is impossible to disentangle all the mechanisms listed
previously with certainty, some lessons can be drawn by examining the effect of new school
openings on the neighborhoods of nearby schools. Students living in these neighborhoods
undergo similar reductions in school size and overcrowding, as new school openings decrease
school size in all local schools. However, only students living close to the new school experience
a decrease in distance to school. Geographical proximity to school is virtually unaffected for
students living in other nearby neighborhoods. As it happens, these students are no less likely
to attend private schools after the new school opens. This suggests that the sudden preference for
public schooling is driven by geographical proximity and/or new infrastructure, rather than
the reduction in school size or in overcrowding.
The private school sector, reinforced by voucher policies implemented in many countries, is
suspected of increasing stratification and school inequalities (Epple and Romano, 1998; Epple,
Figlio, et al., 2004; Hsieh and Urquiola, 2006). Therefore, economists have been highly
interested in understanding how households choose between public and private schools. According
to Hoxby (2000), increased competition between public schools leads to higher productivity
and keeps families in the public sector. Fack and Grenet (2010) show that public school
performance raises real estate values, but this effect is mitigated when good private schools
are available in the neighborhood and provide a valuable outside option for parents. To the
best of my knowledge, the present study shows for the first time that the availability of a new
public school in the neighborhood, without any reputation, deters parents from choosing
private schools. Neilson and Zimmerman (2011) show that school construction increases housing
prices in the neighborhood, but they do not examine the competition effect of public school
construction on the private sector. Finally, another contribution of the paper is to emphasize
that parents seem to value geographic proximity to school, in line with Owusu-Edusei et al.
(2007).
Section 1 describes the institutional context, the data and the definition of new school
neighborhoods. Section 2 examines the consequences of new school openings on the educational
contexts in which students evolve and measures the effect on private school choice. Section 3
focuses on the effect of school openings on academic achievement. Section 4 provides robustness tests and
examines the underlying mechanisms and the distribution of the effect on private school choice.
Section 5 concludes.
1 Institutional context and data
1.1 New middle schools in France
After elementary school (grades 1 to 5), all French students enroll in middle school from grade 6
to grade 9 with a common curriculum. In 2011, 575,529 middle school students were enrolled in
the Paris region, i.e. 18.1 % of all French middle school students. They were spread across
1,284 middle schools, including 305 private schools.
The decision to open a new public middle school is taken by local administrative districts
called départements, which own the buildings and are thus in charge of their construction and
maintenance.2 For a département, building a new middle school is very expensive. A middle
school of average size (around 450 students) costs around 16 million euros and takes around 3
years to build. Therefore, départements make the investment only when a middle school is overcrowded. As
shown in Figure III.1, the public school closest to the new school was near full capacity
(around 95 % vs. 80 % on average) in the few years preceding the new school opening. In such
a context, any future growth of the student population in the area would be an issue, and
constructing a new school may be a preventive measure. The other option would be to build nothing
and to re-draw the school boundaries determining students’ allocation to public middle schools
only when needed. This implies assigning some students to other public schools located farther
away, which is certainly a much cheaper alternative. However, increasing the distance students
have to travel too much may not be socially acceptable (especially for 11-year-old children).
In France, new schools do not offer different classes or any innovative teaching practices. All
middle school students follow a common and heavy curriculum determined at the national level
for each grade, over which teachers have very limited leeway. Teachers may have different
teaching methods, but only at the individual level. There is no pedagogical vision attached to
a whole school: principals do not choose their teachers and have historically had no say in their
teaching methods. The allocation of teachers is determined at the level of a school administration
district called "académie", and results from teachers’ applications, with priority given to more
experienced teachers. As a consequence, students enrolling in new schools may only get specific
teaching practices if specific teachers are assigned to these schools.
2 The Paris region includes 8 of the 101 French départements (the city of Paris is one of them).
1.2 Private schools
Around 20 % of French middle school students attend private schools. Almost all private schools
are publicly-funded. The cost per child is usually quite low for families, between 500 and 1000
euros per year, and private schools can adjust this price to family resources. Teachers are paid
by the state, in exchange for which students have to follow the same curriculum as in public
schools and take the same national exams. Private schools may only offer optional activities on
top of the common curriculum, such as religious classes (mostly Catholic classes) that are
prohibited in public schools. Therefore, private and public schools provide education framed
similarly, and of similar quality.
As a consequence, the boundary between public and private schools is quite porous. For
students who do not attend religious classes anyway, the choice depends mainly on families’
perception of the relative quality of peers between the local public and private schools. Students
can easily switch from a public to a private school between two school years, and vice versa. They
can apply to any private school they want, as publicly-funded private schools are not allowed
to refuse entry on any racial or religious ground.
1.3 Data
This study exploits a longitudinal and exhaustive administrative dataset on all middle school
students (grade 6 to 9) in the Paris region from 2002 to 2011.3 It contains information on:
• Individual characteristics: gender, age, socioeconomic status (SES) measured by parents’
occupation, whether the student receives a need-based grant (later referred to as "low-income"
students), geographic coordinates of home address, school and class.
• School characteristics: geographic coordinates of school address, number of classes, share
of teachers under 30 years old (later referred to as "low-experienced" teachers).
• Scores obtained at the DNB exam at the end of middle school. This national anonymous
exam is taken by all students at the end of grade 9 in Mathematics, French and History.
Even though the exam syllabus is the same throughout the country, the precise content of
each exam differs by Académie.4 Therefore, scores are converted to percentile ranks at the
Académie level between 0 (the worst rank) and 100 (the best rank). Unfortunately, achievement at
the DNB exam is not available for year 2002.
• Outcomes after middle school: enrollment in general or vocational studies in high school,
attrition. Attrition from the dataset after middle school may result from measurement error
on students’ IDs or from drop-out.5
3 This dataset is built from two sources: the "Bases Elèves Académiques" available from académies’ statistical
services, and the "Base Océan" available from the French Ministry of Education.
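The percentile ranking of DNB scores within each Académie can be sketched as follows. This is only a minimal illustration, assuming a simple rank-based rescaling in which the lowest score of an Académie maps to 0 and the highest to 100, with tied scores sharing their average rank. The exact formula and tie rule used to build the dataset are not stated in the text, and all function names and score values below are hypothetical.

```python
from collections import defaultdict


def percentile_ranks(scores):
    """Rescale scores to 0 (worst) .. 100 (best); tied scores share their mean rank."""
    n = len(scores)
    if n == 1:
        return [50.0]  # arbitrary convention (assumption) for a one-student group
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next score is tied with the current one.
        while end + 1 < n and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2  # average 0-based rank of the tied block
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return [100 * r / (n - 1) for r in ranks]


def rank_within_academie(records):
    """records: list of (academie, raw_score); returns percentile ranks in input order."""
    by_academie = defaultdict(list)
    for idx, (academie, _) in enumerate(records):
        by_academie[academie].append(idx)
    out = [None] * len(records)
    for idxs in by_academie.values():
        pct = percentile_ranks([records[i][1] for i in idxs])
        for i, p in zip(idxs, pct):
            out[i] = p
    return out


# Hypothetical raw scores in two Académies.
records = [("Créteil", 8), ("Créteil", 12), ("Créteil", 15),
           ("Versailles", 12), ("Versailles", 18)]
ranks = rank_within_academie(records)
```

Ranking within each Académie means that a raw score of 12 can map to different percentile ranks in Créteil and Versailles, which is the point of the normalization when exam content differs across Académies.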
Descriptive statistics for all students in the Paris region are reported in Table III.1, column
(I), for year 2002.
1.4 School neighborhoods
I use this dataset to examine the effects of new school openings on school choice and achievement
of students living close to the new school. A school neighborhood $n_{sg}$ is defined as the group
of grade $g$ students living at an "as-the-crow-flies" distance to school $s$ lower than the median
distance to school in the département. I use the département-specific median distance to school
to account for differences in population density across geographical areas in the Paris region.
This median distance lies between 0.75 and 1.5 km depending on the département.
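The neighborhood definition can be illustrated with a short sketch. It assumes that straight-line distances are computed with the haversine formula from latitude and longitude, and that the radius is the median home-to-school distance among students of the département. The text does not specify how distances are computed from the geographic coordinates, and every coordinate, identifier and function name below is hypothetical.

```python
import math
from statistics import median


def crow_flies_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def neighborhood(students, school, radius_km):
    """Ids of students (id, lat, lon) living strictly closer to the school than radius_km."""
    return [sid for sid, lat, lon in students
            if crow_flies_km(lat, lon, school[0], school[1]) < radius_km]


# Hypothetical home-to-school distances (km) observed in one département.
distances_in_departement = [0.4, 0.7, 0.9, 1.1, 2.5]
radius = median(distances_in_departement)

# Hypothetical school and home coordinates.
new_school = (48.80, 2.40)
students = [("a", 48.803, 2.402), ("b", 48.82, 2.45), ("c", 48.795, 2.395)]
members = neighborhood(students, new_school, radius)
```

Membership depends only on the home address, not on the school actually attended, so the same rule can be applied in years before the school exists.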
The strategy consists in examining how a given neighborhood $n_{sg}$ evolves between 2002 and
2011, around the new school opening year $t_{0s}$. Because I look at students enrolled in a given
grade, a neighborhood $n_{sg}$ contains different students every year, with the notable exception
of repeaters. I focus on the grades 6 to 9 neighborhoods of the 36 new middle schools created
over the 2003-2010 period in the Paris region. Note that all students in grade $g$ living around
school $s$ are considered part of its neighborhood $n_{sg}$, even if they do not enroll in this school.
Therefore, new school neighborhoods may be observed even in pre-opening years $t < t_{0s}$.
Figure III.3 and Figure III.4 show that these 36 neighborhoods are very stable over time
around the opening year, for grade 6 and grade 9 students respectively.6 The number of students
in these neighborhoods is constant over time and does not undergo any shock at $t_{0s}$ (panel A).
Students’ characteristics do not seem to vary in the opening year either. Three dimensions are
considered: the share of high-SES students (panel B), the share of "low-income" students as
measured by need-based grant recipients (panel C), and the share of "low-achievers" measured
by repetition in elementary school (panel D, for grade 6 only). Again, no shock is observed at
$t_{0s}$.
4 The Paris region is divided into three Académies: Paris, Créteil and Versailles.
5 Schooling is compulsory until 16 years old. Thus, students who have repeated a grade in previous years
may be old enough to drop out.
6 The patterns are very similar for grades 7 and 8.
To confirm this result, we estimate the following regression:
$$C_{n,t} = \alpha + \beta \cdot \mathrm{postopening}_t + \theta(t - t_{0s}) + \lambda_t + \mu_n + \epsilon_{n,t} \tag{III.1}$$
where $C_{n,t}$ is a characteristic of grade $g$ students living in neighborhood $n_{sg}$ at year $t$, and
$\mathrm{postopening}_t$ is a dummy for post-opening years, $\mathbb{1}(t \geq t_{0s})$. The function $\theta(t - t_{0s})$ captures
continuous time trends around the opening year $t_{0s}$ that may affect $C_{n,t}$, using a spline function
of degree 1 with a knot at $t_{0s}$. I use $\theta(t - t_{0s}) = \theta_0 \cdot (t - t_{0s}) + \theta_1 \cdot (t - t_{0s}) \cdot \mathbb{1}(t \geq t_{0s})$, where
parameter $\theta_0$ captures the pre-opening time trend whereas parameter $\theta_1$ captures the change in
time trend starting at the opening year $t_{0s}$. The parameter $\lambda_t$ represents year fixed effects. The
parameter $\mu_n$ represents neighborhood fixed effects that capture time-invariant neighborhood
heterogeneity. Lastly, the variable $\epsilon_{n,t}$ captures year-specific shocks to $C_{n,t}$.
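To make the structure of this specification concrete, here is a minimal sketch that builds the regressors of model (III.1) and recovers the coefficient on the post-opening dummy by OLS on a synthetic, noise-free panel. It is not the estimation code used for the tables: the data are invented, the helper name is hypothetical, NumPy is an assumed tool, and standard errors are not computed.

```python
import numpy as np


def design_matrix(years, t0, n_ids):
    """Columns: const, post, (t - t0), (t - t0) * post, year dummies, neighborhood dummies."""
    years, t0, n_ids = map(np.asarray, (years, t0, n_ids))
    rel = (years - t0).astype(float)   # event time t - t0s
    post = (rel >= 0).astype(float)    # 1(t >= t0s)
    year_fe = (years[:, None] == np.unique(years)[None, :]).astype(float)
    nbhd_fe = (n_ids[:, None] == np.unique(n_ids)[None, :]).astype(float)
    const = np.ones((len(years), 1))
    return np.hstack([const, post[:, None], rel[:, None],
                      (rel * post)[:, None], year_fe, nbhd_fe])


# Synthetic panel: four hypothetical neighborhoods with staggered opening years.
opening_year = {0: 2004, 1: 2006, 2: 2008, 3: 2009}
rows = [(n, t) for n in opening_year for t in range(2002, 2012)]
n_ids = np.array([n for n, _ in rows])
years = np.array([t for _, t in rows])
t0 = np.array([opening_year[n] for n in n_ids])

true_beta = 2.0
rel = years - t0
post = (rel >= 0).astype(float)
# Noise-free outcome: neighborhood effect + year effect + broken trend + jump at t0s.
y = (10 + 0.5 * n_ids + 0.1 * (years - 2002)
     + 0.3 * rel - 0.2 * rel * post + true_beta * post)

X = design_matrix(years, t0, n_ids)
# lstsq is used because full sets of dummies make the design matrix rank-deficient;
# it returns a minimum-norm least-squares solution.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_hat = coef[1]  # coefficient on the post-opening dummy
```

The staggered opening years are what allow the jump at $t_{0s}$ to be separated from the year effects: in each calendar year, some neighborhoods are already treated while others are not.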
Table III.3 reports the OLS estimates. Reassuringly, the $\beta$ estimates for the $\mathrm{postopening}_t$
dummy are small and far from statistical significance for each dimension, confirming the
impression conveyed graphically that new schools’ close neighborhoods are stable over time.
It is possible to include both year and neighborhood fixed effects in model (III.1) thanks
to the heterogeneity in opening years. Figure III.2 reports the distribution of new school
openings over the period 2003-2010. Although more schools open in some years than in others,
no specific time pattern can be observed. Since schools do not open in the same year, any
change in students’ characteristics (or outcomes in what follows) when the new school opens
in a neighborhood can be compared to the change observed in neighborhoods where the school
has not opened yet or opened earlier. The causal impact of school openings is thus retrieved
from a difference-in-differences approach. For such an approach to be valid, the timing of school
construction must not be endogenous to expected changes in students’ outcomes. The Paris
region map in Figure III.2 provides an overview of where new schools created at the beginning
of the period (2003 to 2006) are built compared to schools opening between 2007 and 2010. There
is no specific geographic pattern in the timing of new school openings. Finally, I examine in
Table III.2 whether neighborhoods differ significantly in their students’ characteristics observed
in 2002, depending on the timing of school openings (before or after 2006). Neighborhoods of schools
opening at the end of the period are less populated. The difference is not statistically significant
at conventional levels, but that may be due to small sample sizes only. Though less numerous,
students are very similar in terms of socio-economic status, low-income status
and academic achievement in elementary school. This result suggests that the timing of school
openings is not endogenous to students’ characteristics. This assumption will be discussed
further in section 3, which examines the effect of school opening on academic achievement.
2 School openings, educational contexts and private school choice
In this section, I explore the effect of school openings on school characteristics $X_{i,n,t}$ of students
living in new school neighborhoods. Grade $g$ students living in the same neighborhood $n_{sg}$ do
not necessarily enroll in the same school, even after $t_{0s}$ when they may or may not be assigned to
the new school $s$ depending on their precise home address. To study how the average school
characteristics of students living in the same neighborhood change with the school opening, I
simply use the same model as previously:
$$\bar{X}_{n,t} = \alpha + \beta \cdot \mathrm{postopening}_t + \theta(t - t_{0s}) + \lambda_t + \mu_n + \epsilon_{n,t} \tag{III.2}$$
where $\bar{X}_{n,t}$ is the average value of $X_{i,n,t}$ over all students living in neighborhood $n_{sg}$ at year
$t$. The parameter $\beta$ captures the average effect of school openings on $X_{i,n,t}$ for students living in $n_{sg}$ after $t_{0s}$,
assuming no year-specific shocks at the opening year conditional on a spline function of time
trend, year and neighborhood fixed effects. Such an effect arises because these students are no longer
enrolled in the same schools, but also because local schools change after $t_{0s}$.
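The dependent variable of model (III.2) is a neighborhood-by-year average of student-level values. A minimal sketch of this aggregation step, with hypothetical rows in which the student-level variable is a dummy for enrollment in the new school:

```python
from collections import defaultdict


def neighborhood_year_means(rows):
    """rows: iterable of (neighborhood, year, x); returns {(neighborhood, year): mean of x}."""
    total = defaultdict(float)
    count = defaultdict(int)
    for nbhd, year, x in rows:
        total[(nbhd, year)] += x
        count[(nbhd, year)] += 1
    return {key: total[key] / count[key] for key in total}


# Hypothetical student-level rows: x = 1 if the student attends the new school.
rows = [("n1", 2005, 0), ("n1", 2005, 0),
        ("n1", 2006, 1), ("n1", 2006, 0), ("n1", 2006, 1), ("n1", 2006, 0)]
means = neighborhood_year_means(rows)
```

With a dummy as the student-level variable, the cell mean is simply the share of students in the neighborhood who attend the new school that year.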
Table III.4 (panel A) reports the OLS estimates of model (III.2) for grade 6 students for several
school characteristics. In column (I), $X_{i,n,t}$ is simply a dummy equal to one if student $i$ is enrolled
in the new school $s$. As the new school opens, 50 % of students in the close neighborhood enroll
in the new school, and they represent 70 % of its cohort on average. Students who do not enroll
in the new school attend either another public school nearby (by assignment or waiver),
or a private school. As half of the students enroll in this very close new school, the average
distance between home and school (column II) drops sharply by 1.2 km (43 %) overall in the
neighborhood. The assignment of grade 6 students to this new school (not only those living
in the close neighborhood) decreases the average size of schools attended by students in the
new school neighborhood by 45 students on average (−23 % of their 2002 size) (column
III). Consequently, the saturation of schools attended, as measured by the number of students
divided by the school capacity, shifts suddenly from 100 % to 78 % (column IV). Therefore,
new school openings achieve their explicit objective, i.e. reducing overcrowding. Graphically,
these two first-stage effects of school openings on distance to school and school size appear very
clearly (Figure III.5, panels A and B).
Interestingly, new public schools induce a clear competition effect on private schools. As it
happens, the share of families enrolling their child in a private middle school after elementary
school drops in post-opening years by 3 percentage points from a baseline of 17 % (column V).
Figure III.5 illustrates this effect (panel C). There is no institutional reason to expect such
an impact, since students are free to attend a private school, whatever the local supply of public
schools. New school openings induce some changes in the public school market that are valued
by families and affect their choice. I discuss this important effect further in section 4.
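As a quick check on magnitudes, the sketch below converts the reported effects into relative terms. It only uses figures quoted in the text (a 3 pp. drop from a 17 % baseline, saturation moving from 100 % to 78 %), and the function name is hypothetical.

```python
def relative_change(abs_change, baseline):
    """Relative change implied by an absolute change from a baseline level."""
    return abs_change / baseline


# Private enrollment: -3 percentage points from a baseline share of 17 %.
private_drop = relative_change(-3.0, 17.0)

# Saturation of attended schools: from 100 % to 78 % of capacity.
saturation_drop = relative_change(78.0 - 100.0, 100.0)
```

The first ratio is roughly −0.18, which matches the 18 % decrease in private enrollment quoted in the introduction.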
As they affect students’ allocation across schools, school openings modify peer composition in schools. Students living in the new school neighborhood get more high-SES peers in
their school (+1.8 pp.) and fewer low-income peers (−1.4 pp.). These estimates are statistically
significant at standard levels. This change in peer composition suggests that school district
boundaries are re-drawn in a way that assigns slightly "higher-quality" students to the new
school. It may also result in part from the reduction of enrollment in private schools, which
may bring many high-SES students to the new school.
Table III.4 (panel B) reports the corresponding estimates for grade 9 students living in the
neighborhood. An important distinction for students enrolled in grades 7 to 9 in the opening
year is that they were already enrolled in a middle school previously. When assigned to the new
school, these students have to change schools and may undergo a disruption. To avoid a school
change for just one year, grade 9 students are not systematically re-assigned and some new
schools do not have any grade 9 students in the first year. This is reflected in the data by a
lower enrollment rate in the new school for grade 9 students, on average (32 % vs 50 % for grade
6 students). As a consequence, all shocks observed on school allocation for grade 6 students
are smaller in magnitude for grade 9 students, though in the same direction and statistically
significant at conventional levels: shorter distance to school (−0.84 km), smaller school size
(−39 students) and better peers (+1.1 pp. high-SES students). A notable exception is the
absence of any significant effect on enrollment in the private sector, which may reflect families’
reluctance to remove their child from her school just for one year.7
7 The effect on private school enrollment for grades 7 and 8 is intermediate: the point estimate is −1.7 pp.,
smaller in magnitude than for grade 6 students, but statistically significant at the 5 % level.
Finally, I investigate whether school openings go along with an improvement in standard resources
such as class size or teachers’ experience (Table III.5). As local school sizes decrease with the new
school opening, the average class size could be reduced. However, the re-allocation of students
across schools goes along with a re-allocation of classes and teachers. After the new school
opens, the drop in the number of students per school and grade is accompanied by a lower
number of classes (−1.9 and −1.7 for grade 6 and 9 respectively, column I) and teachers
(−16.6 and −12.7, column II). As a result, I find no significant change in the average class
size (Table III.5, column III, and Figure III.6, panel A). However, the reassignment of teachers
across local schools is not neutral for their composition: the share of young teachers (below 30
years old) for students in the new school neighborhood rises by 7 pp. and 4.6 pp. respectively
for grade 6 and 9 students (Table III.5, column III). This is likely to result from the assignment
of young teachers to the new school, since they have less bargaining power to stay in their
school when the new school opens. As illustrated in Figure III.6 (panel B), however, this drop in
teachers’ experience is transitory and fades within a few years.
Overall, the net impact of school openings on educational context looks rather beneficial for
students living in new public school neighborhoods. Compared to previous cohorts, they have
to travel a shorter distance and take their classes in an environment with no overcrowding, better
peers and new infrastructure. Although class size is not reduced and teachers’ experience is
lower for a few years, these improvements are valued by families, who more frequently choose
to enroll their child in the public sector.
3 School openings and educational achievement
3.1 Methodology
Since new school openings affect simultaneously several dimensions of students’ educational
contexts, I conduct only a reduced-form analysis of their effect on academic achievement. To
estimate the effect of opening school $s$ on the outcome $Y_{i,n,t}$ of grade $g$ students living in its
neighborhood, I again use a model similar to the previous ones:
$$\bar{Y}_{n,t} = \alpha + \beta \cdot \mathrm{postopening}_t + \theta(t - t_{0s}) + \lambda_t + \mu_n + \epsilon_{n,t} \tag{III.3}$$
where $\bar{Y}_{n,t}$ is the average value of $Y_{i,n,t}$ over all students living in neighborhood $n_{sg}$ at year $t$.
Within this difference-in-differences framework, $\beta$ is the parameter of interest that captures
the effect of school opening on outcomes, assuming that outcomes would have remained on the
same trend without the new school. In other words, the key identification assumption is that there
are no time-specific shocks in students’ potential outcomes at the opening year (conditional on
continuous time trends, year and neighborhood fixed effects).
A first source of bias could come from students with different potential outcomes moving
into the neighborhood precisely at $t_{0s}$. More students should then be observed after $t_{0s}$, and a
sudden increase in their unobserved ability should be reflected in their observed characteristics.
Nonetheless, the number of students and their characteristics (SES, need-based grant status,
repetition before grade 6) do not undergo any significant shock at $t_{0s}$, as shown earlier in
Table III.3. This evidence suggests that students living in new school neighborhoods are
comparable before and after school openings in terms of their individual characteristics. Theoretically,
it is still possible that abilities differ significantly but are not correlated with any of the three
student characteristics we use, but it is unlikely.
A second source of bias may come from endogenous timing of the school opening, as
discussed already in section 1. Considering the overcrowding context in local schools, potential
outcomes without school opening may deviate from their trend precisely at the opening year,
even without any change in students’ profiles. Imagine, for example, that providing a seat to
each student was impossible starting from $t_{0s}$ without a new school. Then students’ outcomes
would have shifted downward without a new school after $t_{0s}$. The département might have decided
to open the new school at $t_{0s}$ because of this expected drop in outcomes, raising a reverse
causality issue.
However, I argue that this scenario is unlikely. First, I find no significant trend in school
sizes (the number of students in the schools attended) before the opening year (Table III.4, column
III). The context is such that local schools of students living in the new school neighborhood
are already saturated (the intercept in column IV is estimated at 100 % saturation). If
the number of students living in the area is rising, they will probably be assigned
to other, more distant schools. Generally speaking, it is always possible to provide seats to
students by using available slots in schools located farther away, if schools located nearby are
already full. However, this does not even seem to happen for the subset of students living
in new school neighborhoods. The number of students living in these neighborhoods is barely
increasing (Table III.3, column I, and Figure III.4, panel A). The estimated pre-opening trends
in distance to school are positive but small and only significant at the 10 % level for grade 9
students (Table III.4, column II). Considering these stable patterns, a sudden drop in potential
outcomes at $t_{0s}$ due to a shock in school supply seems unlikely. In the worst case, had
cohort sizes in the neighborhood increased after $t_{0s}$, the consequence of not building
any new school would have been to assign some students to more distant schools. Trends in outcomes
may slightly deviate from the pre-$t_{0s}$ trend in this counterfactual scenario, but there is no reason
to expect a shock in potential outcomes at $t_{0s}$.
3.2 Results
Table III.6 reports the OLS estimates of model (III.3) for the retention rate of grade 6, 7 and 8
students. We find no effect of school openings on students’ risk of retention for these grades.
The point estimates are very small and not significantly different from 0. Note that many
students in grades 7 and 8 experienced a school change at $t_{0s}$, whose effect is combined with the
strict impact of having a new school in the neighborhood. Assuming that school change has
a negative disruptive impact on students, it may counteract any positive effect of a new school
opening.
Table III.7 (panel A) examines the impact of new school openings on outcomes of grade
9 students. Since new school openings did not affect retention rate in previous grades, no
compositional effect is to be expected on grade 9 outcomes. The estimates suggest a positive
effect of school openings on the educational achievement of students at the end of middle school.
While the percentile rank at the DNB exam was declining over the years, it increases on average
by 1.8 pp. in the opening year (around 7 % of a standard deviation) with a standard error of
0.76 pp (column I). This sudden improvement can be observed graphically (Figure III.7). It is
also confirmed by students’ outcomes after the grade 9 year. Compared to previous cohorts,
students enroll 2.7 pp. more in general studies in high school (column V), rather than vocational
studies (−0.9 pp., not significant), repeating grade 9 (−0.5 pp., not significant) or dropping
out (as measured by attrition from the dataset, −0.8 pp., not significant).
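The precision of the main estimate can be gauged with a back-of-the-envelope calculation. The sketch below computes the ratio of the point estimate to its standard error and the associated two-sided p-value under a standard normal approximation. This approximation is an assumption made here for illustration; the text does not state which reference distribution underlies the reported significance levels. The last line backs out the standard deviation of percentile ranks implied by the statement that 1.8 points is about 7 % of a standard deviation.

```python
import math


def two_sided_p_normal(estimate, std_error):
    """z-ratio and two-sided p-value for H0: effect = 0, normal approximation."""
    z = estimate / std_error
    return z, math.erfc(abs(z) / math.sqrt(2))


# Point estimate and standard error on the DNB percentile rank, as quoted in the text.
z, p = two_sided_p_normal(1.8, 0.76)

# Standard deviation of percentile ranks implied by "around 7 % of a standard deviation".
sd_implied = 1.8 / 0.07
```

Under this approximation the estimate is significant at the 5 % level but not at the 1 % level.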
Up to this point, I have used new schools opening between 2003 and 2010. As the data start
in 2002 (2003 for ranks at the DNB exam), pre-opening trends cannot be estimated or are
poorly estimated for neighborhoods of schools opening in the first years. To check that the
results found in panel A (Table III.7) are not driven by insufficient control for pre-opening
trends, panel B restricts the analysis to new schools opening between 2005 and 2010. The
results are very similar. The effect on exam percentile ranks remains positive though slightly
smaller in magnitude, and still statistically significant at the 10 % level (Table III.7, panel B,
column I), while the estimate for general studies remains positive.
Again, the impact of school opening in the first years after $t_{0s}$ includes the effect of the school
change experienced by middle school students reassigned to the new public school at $t_{0s}$. To
check whether this disruption affects the estimated impact of new school openings on outcomes,
I test whether another shift in outcomes appears 3 years after $t_{0s}$. Compared to previous years,
most students enrolled in grade 9 at $t \geq t_{0s} + 3$ did not experience any school change, except for
those who repeated grade 6, 7 or 8. In panel C, I estimate model (III.3) augmented with a dummy
for $t \geq t_{0s} + 3$. The point estimate is small and not significant, confirming the impression from
Figure III.7 that the only shift happens at $t_{0s}$. This result suggests that students’ achievement
at the end of middle school is not harmed by school changes following the new school opening.
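One way to write the augmented specification of panel C, assuming that the additional dummy enters additively and denoting its coefficient by $\gamma$ (a symbol introduced here for illustration, not taken from the original), is:

$$\bar{Y}_{n,t} = \alpha + \beta \cdot \mathrm{postopening}_t + \gamma \cdot \mathbb{1}(t \geq t_{0s} + 3) + \theta(t - t_{0s}) + \lambda_t + \mu_n + \epsilon_{n,t}$$

Under this reading, $\beta$ still measures the shift at the opening year, while $\gamma$ measures any additional shift for the first cohorts that mostly did not have to change schools.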
Figure III.8 (panel A) represents the distribution of DNB percentile ranks of students living
in new school neighborhoods, before and after the opening year. The whole distribution shifts
to the right, suggesting a positive effect at all levels of the ability distribution. However, the
gap between the two curves captures not only the effect of school openings, but also the impact
of time. To get a more accurate view of the school opening effect, I estimate the residuals from
an OLS regression of DNB percentile ranks on a linear time trend, year and neighborhood fixed
effects. Figure III.8 (panel B) plots the distribution of residuals before and after the school
opening. Here, school openings seem to affect mainly the middle of the distribution, whereas
the extreme bottom and top parts of the ability distribution look unaffected.
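The residualization step can be sketched as follows. This is an illustration on invented, noise-free neighborhood-by-year data rather than the student-level data behind Figure III.8. It assumes that the linear time trend is expressed in years relative to the opening year, and it uses NumPy as an assumed tool. The post-opening dummy is deliberately left out of the regression, so that any jump at the opening year remains in the residuals.

```python
import numpy as np

# Synthetic panel: three hypothetical neighborhoods with staggered opening years.
opening_year = {0: 2004, 1: 2006, 2: 2008}
rows = [(n, t) for n in opening_year for t in range(2003, 2012)]
n_ids = np.array([n for n, _ in rows])
years = np.array([t for _, t in rows])
rel = years - np.array([opening_year[n] for n in n_ids])  # years since opening
post = rel >= 0

# Noise-free percentile ranks with a jump of 2 points at the opening year.
rank = 50 + 1.5 * n_ids - 0.4 * (years - 2003) + 2.0 * post

# Regressors: constant, linear trend, year dummies, neighborhood dummies.
X = np.hstack([
    np.ones((len(rows), 1)),
    rel[:, None].astype(float),
    (years[:, None] == np.unique(years)).astype(float),
    (n_ids[:, None] == np.unique(n_ids)).astype(float),
])
# lstsq handles the rank deficiency created by the full sets of dummies.
coef, *_ = np.linalg.lstsq(X, rank, rcond=None)
resid = rank - X @ coef

# Difference in mean residuals between post- and pre-opening observations.
gap = resid[post].mean() - resid[~post].mean()
```

Comparing the distribution of `resid` before and after the opening, rather than only its mean, is what panel B of Figure III.8 does.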
4 Robustness and complementary analysis
4.1 Robustness
The analysis emphasized two major effects of opening a new public school in the neighborhood:
families are less likely to opt for a private middle school in grade 6, and students achieve
higher performance at the end of middle school. To confirm these results, I check whether
these effects are robust to other neighborhood sizes. Table III.8 reports the estimates for
neighborhoods defined with a radius equal to several multiples of the median distance to
school in the département: 0.5, 0.75, 1 (the baseline distance used previously), 1.25 and 1.5.
The results confirm that new school openings reduce families’ propensity to choose the private
sector after elementary school. The magnitude of the estimates remains around −3 percentage
points for all neighborhood sizes; they are significant at the 1 % level for all radii except the
smallest one, which is significant at the 10 % level. The effect of school openings
on students’ performance in grade 9 is much less robust. The magnitude of the estimate
remains similar to the baseline for only one of the four other neighborhood sizes. In the other cases,
the estimate is small in magnitude and no longer statistically different from 0. This result
calls for caution in drawing conclusions about the impact of new school openings on academic achievement.
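The neighborhood definitions compared in Table III.8 can be illustrated with a short sketch. The distances below are made up, the function name is hypothetical, and the only elements taken from the text are the five radius multiples and the rule that a student belongs to the neighborhood when the distance to the new school does not exceed the chosen multiple of the median distance to school.

```python
from statistics import median

RADIUS_MULTIPLES = (0.5, 0.75, 1.0, 1.25, 1.5)  # multiples listed in the text


def neighborhood_flags(dist_to_new_school, median_dist):
    """Flag, for each radius multiple, whether a student lives in the neighborhood."""
    return {m: dist_to_new_school <= m * median_dist for m in RADIUS_MULTIPLES}


# Made-up home-to-school distances (any common unit), used only to get a median.
distances_to_school = [0.4, 0.9, 1.2, 2.0, 3.5]
d_star = median(distances_to_school)

# A student living at distance 1.0 from the new school.
flags = neighborhood_flags(1.0, d_star)
```

In this toy case the student falls outside the two smallest neighborhoods and inside the three largest, which is how the estimation sample changes from one row of the table to the next.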
4.2 The impact of school openings on nearby public schools
New schools may not only affect their close neighborhood. As the allocation of students across
local schools is impacted, students living close to nearby schools may also be affected, even
without being assigned to the new school. To investigate these externalities, I repeat the
analysis above for students living around the five public schools located closest to the new
school. Again, I use the median distance in the département to define these close public school
(CPS) neighborhoods. Table III.9 reports the estimates of the post-opening dummy for several
outcomes measured in grade 6. Some students living in CPS neighborhoods are assigned to the
new school, though far fewer than in the new school neighborhood (column I).8 Interestingly,
new school openings only slightly affect distance to school in CPS neighborhoods, and in
the opposite direction for the closest public school (+0.116, column II), where 10.8 % of students
are assigned to the new school (column I). The estimate is very small and not statistically
different from 0 for grade 9 students. By contrast, they affect school size of the closest public
school in a similar way (−31 students in schools of grade 6 students, Table III.9, column III).
Although the reduction in school size is somewhat smaller than in new school neighborhoods, it is
still enough to put an end to school overcrowding. For example, grade 9 students are enrolled just
after t0s in schools at 83 % of their capacity on average, compared to 97 % for previous cohorts
(column IV).
8 Note that in some cases, a new school and its CPS can be sufficiently close for their neighborhoods to
overlap. Thereby, some students might belong to both neighborhoods.
An important result is the absence of any strong and significant effect of new school openings
on private school enrollment in CPS neighborhoods (column V), while they reduce it in their
own neighborhoods (see again Table III.4, column V). This suggests that the competition effect
on the private sector is not driven by the reduction in school size and overcrowding, which is
comparable in both types of neighborhoods (though slightly smaller in CPS ones). Rather,
the change in school choice is more likely driven by families’ preference for novelty per
se and/or geographical proximity. Students only switch from the private to the public sector
when the new public school is built in their neighborhood, revealing a preference for proximity.
4.3 Heterogeneity of school opening effect on private school choice
In Table III.10, I examine whether some types of students are more likely than others to
respond to public school openings by shifting from the private to the public sector. Columns
II and III report the regression estimates of model (III.3) using the private school attendance
of high-SES or low-SES students only as dependent variables. The magnitude of the estimate
is larger for high-SES students, but their baseline private school attendance is much higher. The
relative decrease in private school choice is thus similar for both groups: −17 % for high-SES
students versus −18 % for low-SES students. The estimates for males and females are almost
identical (columns IV and V). Overall, I find no significant heterogeneity in the impact of school
openings on private school choice.
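The relative decreases quoted above can be reproduced with a few lines of arithmetic. This sketch assumes that the baseline used in the text is the constant reported in Table III.10, an interpretation that matches the quoted percentages but is not stated explicitly. The variable and function names are illustrative.

```python
# Post-opening estimates and constants copied from Table III.10, columns II and III.
# Assumption: the constant is read as the baseline private enrollment rate.
estimates = {
    "high-SES": {"post_opening": -0.052, "constant": 0.309},
    "low-SES": {"post_opening": -0.021, "constant": 0.116},
}


def relative_change(post_opening, constant):
    """Effect expressed as a percentage of the baseline level."""
    return 100 * post_opening / constant


relative = {group: relative_change(**v) for group, v in estimates.items()}
for group, pct in relative.items():
    print(f"{group}: {pct:.0f} %")
```

The absolute effect is more than twice as large for high-SES students, yet both groups lose a similar share of their baseline private enrollment.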
5 Conclusion
This paper evaluates the impact of opening new schools on private school enrollment and
academic achievement. Opening a new school reduces the size and overcrowding of schools
attended by all students living in the area. By contrast, distance to school is only strongly
reduced for students living in the close neighborhood of the new school. Overall, I find that new
public schools raise families’ preferences for the public sector, but only when these families live
in the close neighborhood of the new school. A positive effect appears on academic achievement,
although it is not robust to alternative neighborhood sizes. Although the educational contexts of
students living close to nearby schools also change as the new school opens, their outcomes are not
affected either positively or negatively. Therefore, parents seem attracted to the public sector
because new and modern public school facilities are provided at a short distance, rather than by
the reduction in school size and overcrowding. Since private school choice may increase
stratification and reduce voters’ concern for public education, knowing what may keep families in
the public sector is of key importance.
Bibliography
Akerlof, George A. and Rachel E. Kranton (Dec. 2002). “Identity and Schooling: Some
Lessons for the Economics of Education”. In: Journal of Economic Literature 40.4, pp. 1167–
1201.
Alspaugh, John W. (1998). “The Relationship of School-to-School Transitions and School
Size to High School Dropout Rates”. In: The High School Journal 81.3, pp. 154–160.
Duflo, Esther (Sept. 2001). “Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment”. In: American Economic
Review 91.4, pp. 795–813.
Epple, Dennis, David Figlio, and Richard Romano (July 2004). “Competition between
private and public schools: testing stratification and pricing predictions”. In: Journal of
Public Economics 88.7-8, pp. 1215–1245.
Epple, Dennis and Richard E Romano (Mar. 1998). “Competition between Private and
Public Schools, Vouchers, and Peer-Group Effects”. In: American Economic Review 88.1,
pp. 33–62.
Fack, Gabrielle and Julien Grenet (2010). “When do better schools raise housing prices?
Evidence from Paris public and private schools”. In: Journal of Public Economics 94.1–2,
pp. 59–77.
Goux, Dominique and Eric Maurin (June 2005). “The effect of overcrowded housing on
children’s performance at school”. In: Journal of Public Economics 89.5-6, pp. 797–819.
Hoxby, Caroline M. (Dec. 2000). “Does Competition among Public Schools Benefit Students
and Taxpayers?” In: American Economic Review 90.5, pp. 1209–1238.
Hsieh, Chang-Tai and Miguel Urquiola (Sept. 2006). “The effects of generalized school
choice on achievement and stratification: Evidence from Chile’s voucher program”. In: Journal of Public Economics 90.8-9, pp. 1477–1503.
Lee, Valerie E. and Susanna Loeb (Mar. 2000). “School Size in Chicago Elementary Schools:
Effects on Teachers’ Attitudes and Students’ Achievement”. In: American Educational Research Journal 37.1, pp. 3–31.
Lyons, John B. (June 1999). School Construction in the United States. PEB Exchange, Programme on Educational Building 1999/9. OECD Publishing.
Neilson, Christopher and Seth D. Zimmerman (Nov. 2011). The Effect of School Construction on Test Scores, School Enrollment, and Home Prices. IZA Discussion Papers 6106.
Institute for the Study of Labor (IZA).
Owusu-Edusei, Kwame, Molly Espey, and Huiyan Lin (Apr. 2007). “Does Close Count?
School Proximity, School Quality, and Residential Property Values”. In: Journal of Agricultural and Applied Economics 39.01.
Pittman, R.B. and P. Haughwout (1987). “Influence of high school size on dropout rate.”
In: Educational Evaluation and Policy Analysis 9.4, pp. 337–343.
Figure III.1: Saturation of public schools located close to the new school.
Note - The treatment consists in a new middle school opening in the neighborhood during the
2003-2010 period. The opening year is centered at year 0. Saturation is computed as the number of
students in the school divided by its capacity. Dashed lines show the upper and lower confidence
bounds (95%).
Figure III.2: Time and geographic distribution of the 36 new school openings
Note - The histogram reports the distribution of opening years between 2003 and 2010. On the Paris
region map, the delimitations of the départements correspond to the thick black lines.
Map legend: opened before 2003; opened between 2003 and 2006; opened between 2007 and 2010.
Figure III.3: Students’ composition in new school neighborhoods (Grade 6).
Note - This figure plots the composition of students in neighborhoods around new schools, before
and after the opening year (centered at year 0). Dashed lines show the upper and lower confidence
bounds (95%). "Low-income" students are students receiving a state need-based grant. "Low-achievers"
are students who repeated at least a year in primary school, before entering middle school.
Figure III.4: Students’ composition in new school neighborhoods (Grade 9).
Note - This figure plots the composition of students in neighborhoods around new schools, before
and after the opening year (centered at year 0). Dashed lines show the upper and lower confidence
bounds (95%). "Low-income" students are students receiving a state need-based grant.
Figure III.5: School openings and educational contexts in new school neighborhoods (Grade 6).
Note - The treatment consists in a new middle school opening in the neighborhood during the
2003-2010 period. The opening year is centered at year 0. Dashed lines show the upper and lower
confidence bounds (95%).
Figure III.6: School resources of students living in new school neighborhoods (Grade 6).
Note - This figure plots school resources of students living in new school neighborhoods, before
and after the opening year (centered at year 0). Dashed lines show the upper and lower confidence
bounds (95%).
Figure III.7: School achievement in new school neighborhoods (Grade 9).
Note - The treatment consists in a new middle school opening in the neighborhood during the
2003-2010 period. The opening year is centered at year 0. Achievement is measured at the DNB exam,
i.e. the anonymous national exam taken by French students at the end of middle school (end of grade 9).
The content of the exam differs across school administration areas called "Académies" ("Créteil",
"Paris" or "Versailles"). Therefore, scores are converted to percentile ranks at the Académie level
(lowest rank = 0, best rank = 100). Dashed lines show the upper and lower confidence bounds (95%).
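The within-Académie percentile ranking described in this note can be sketched as below. The scores are made up and the function is a hypothetical helper. The treatment of ties (share of strictly lower scores) and of single-student groups is an assumption, since the note does not say how these cases are handled.

```python
from collections import defaultdict


def percentile_ranks(scores, academies):
    """Percentile rank of each score within its Académie (0 = lowest, 100 = best).

    Sketch only: ties use the share of strictly lower scores, and a lone
    student in a group is given 50 (both choices are assumptions).
    """
    by_academy = defaultdict(list)
    for score, academy in zip(scores, academies):
        by_academy[academy].append(score)

    ranks = []
    for score, academy in zip(scores, academies):
        group = by_academy[academy]
        if len(group) == 1:
            ranks.append(50.0)
            continue
        below = sum(other < score for other in group)
        ranks.append(100.0 * below / (len(group) - 1))
    return ranks


scores = [8, 12, 16, 10, 14]
academies = ["Créteil", "Créteil", "Créteil", "Paris", "Paris"]
ranks = percentile_ranks(scores, academies)
```

Ranking within each Académie makes scores comparable across areas whose exam content differs, which is the reason given in the note for using ranks rather than raw scores.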
Figure III.8: The effect of school opening on DNB exam percentile rank distribution in new
school neighborhoods
Note - The DNB exam is the anonymous national exam taken by French students at the end of middle
school (end of grade 9). The content of the exam differs across school administration areas called
"Académies" ("Créteil", "Paris" or "Versailles"). Therefore, this figure plots Epanechnikov
kernel density estimates of students’ percentile ranks (lowest = 0, best = 100) computed at the
Académie level. Panel A compares the DNB rank distribution of cohorts before and after the school
opening. Panel B plots the distribution of residuals obtained from an OLS regression of DNB
percentile ranks on a linear time trend, year and neighborhood fixed effects.
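For readers unfamiliar with the estimator named in this note, a minimal Epanechnikov kernel density sketch is given below. The sample values, the evaluation grid and the bandwidth of 10 are arbitrary assumptions, as the note does not report the bandwidth used for the figure.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, sample, bandwidth):
    """Kernel density estimate at point x for a given bandwidth."""
    n = len(sample)
    return sum(epanechnikov((x - xi) / bandwidth) for xi in sample) / (n * bandwidth)


# Made-up percentile ranks; the bandwidth is an arbitrary assumption.
sample_ranks = [12.0, 35.0, 48.0, 51.0, 67.0, 90.0]
grid = [float(g) for g in range(0, 101, 5)]
density = [kernel_density(g, sample_ranks, bandwidth=10.0) for g in grid]
```

Evaluating the same function separately on pre-opening and post-opening cohorts would give two curves of the kind compared in each panel.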
Table III.1: Descriptive statistics on students in 2002

                            Students not     Students in      Difference
                            in new school    new school       (II) - (I)
                            neighborhoods    neighborhoods
                            (I)              (II)             (III)
Female                      0.488            0.488            0.000
                            (0.500)          (0.500)          (0.003)
High-SES                    0.278            0.258            -0.019**
                            (0.448)          (0.438)          (0.003)
Low-income                  0.168            0.215            0.047**
                            (0.374)          (0.411)          (0.002)
Low-achievers               0.273            0.301            0.028**
                            (0.445)          (0.459)          (0.006)
School size                 152.933          169.895          16.963**
                            (49.794)         (55.127)         (0.322)
School saturation           0.828            0.943            0.115**
                            (0.212)          (0.253)          (0.002)
Class size                  23.195           23.361           0.166**
                            (3.113)          (3.200)          (0.020)
Low-experienced teachers    0.273            0.265            -0.008**
                            (0.125)          (0.122)          (0.001)
Private sector              0.163            0.180            0.017**
                            (0.369)          (0.385)          (0.002)
Distance to school          3.476            2.019            -1.457**
                            (16.842)         (4.229)          (0.106)
Retention                   0.070            0.079            0.009**
                            (0.256)          (0.270)          (0.003)
Attrition                   0.125            0.118            -0.007*
                            (0.331)          (0.322)          (0.004)
Vocational studies          0.227            0.251            0.024**
                            (0.419)          (0.434)          (0.005)
General studies             0.569            0.541            -0.028**
                            (0.495)          (0.498)          (0.006)
Percentile rank at DNB exam 50.081           48.116           -2.052**
                            (24.977)         (25.739)         (0.349)
N                           562719           25302

"Low-income" students are students receiving a state need-based grant. "Low-achievers"
are students who repeated at least a year in primary school, before
entering middle school (measured for grade 6 only). "Low-experienced" teachers
are teachers under 30 years old. The last five rows measure outcomes at the end of
grade 9. Standard deviations are reported in parentheses. * p < 0.1; ** p < 0.05;
*** p < 0.01
Table III.2: Neighborhoods of new schools opening before or after
2006

                    2003-2006     2007-2010     Difference
                    (I)           (II)          (III)
# students          230.733       177.667       -53.067
                    (164.557)     (152.722)     (53.313)
% high-SES          0.270         0.253         -0.017
                    (0.101)       (0.133)       (0.041)
% low-income        0.177         0.181         0.004
                    (0.097)       (0.124)       (0.038)
% low-achievers     0.279         0.288         0.009
                    (0.055)       (0.104)       (0.029)
N                   21            15

"Low-income" students are students receiving a state need-based
grant. "Low-achievers" are students who repeated at least a year
in primary school, before entering middle school (measured for
grade 6 only). Standard deviations are reported in parentheses. *
p < 0.1; ** p < 0.05; *** p < 0.01
Table III.3: Students’ composition in new school neighborhoods

                              # students    % high-SES    % low-income  % low-achievers
Independent variable          (I)           (II)          (III)         (IV)
Panel A : Grade 6
Post-opening                  −0.965        0.017         0.002         −0.004
                              (4.021)       (0.010)       (0.008)       (0.012)
Pre-opening trend             1.131         0.003         0.001         −0.015***
                              (1.295)       (0.002)       (0.002)       (0.003)
Post-opening shift in trend   0.702         −0.003        0.003         0.000
                              (1.445)       (0.003)       (0.003)       (0.003)
Constant                      204.492***    0.293***      0.166***      0.204***
R2                            0.99          0.86          0.91          0.75
Panel B : Grade 9
Post-opening                  4.181         0.001         −0.010
                              (4.168)       (0.011)       (0.007)
Pre-opening trend             0.607         0.008***      0.004**
                              (1.259)       (0.003)       (0.002)
Post-opening shift in trend   1.425         −0.009**      −0.000
                              (1.557)       (0.003)       (0.002)
Constant                      189.000***    0.306***      0.160***
R2                            0.98          0.90          0.92
N                             360           360           360           360

Each column is from a separate regression of neighborhood characteristics on a dummy for years
after the new school opening year, and a spline function for time trend (centered around the
opening year) with a knot at the opening year. All regressions include separate controls for year
and neighborhood fixed effects. "Low-income" students are students receiving a state need-based
grant. "Low-achievers" are students who repeated at least a year in primary school, before entering
middle school (measured for grade 6 only). Robust standard errors are reported in parentheses.
* p < 0.1; ** p < 0.05; *** p < 0.01
Table III.4: Effect of school openings on educational contexts

                              Enrollment   Distance     School       School       Enrollment   High-SES     Low-income
                              in new       to school    size         saturation   in private   in school    students
                              school                                              sector
Independent variable          (I)          (II)         (III)        (IV)         (V)          (VI)         (VII)
Panel A : Grade 6
Post-opening                  0.504***     −1.221***    −45.136***   −0.218***    −0.030***    0.018***     −0.014**
                              (0.028)      (0.222)      (3.822)      (0.020)      (0.010)      (0.007)      (0.006)
Pre-opening trend             −0.009       0.048        1.149        0.003        0.004**      0.004**      0.003*
                              (0.007)      (0.051)      (0.970)      (0.005)      (0.002)      (0.002)      (0.002)
Post-opening shift in trend   0.022**      −0.166***    −3.525***    0.004        −0.005*      −0.005**     0.001
                              (0.009)      (0.064)      (1.102)      (0.006)      (0.002)      (0.002)      (0.002)
Constant                      −0.036       2.841***     195.672***   1.004***     0.170***     0.296***     0.187***
R2                            0.85         0.60         0.73         0.64         0.81         0.94         0.93
Panel B : Grade 9
Post-opening                  0.322***     −0.842***    −38.816***   −0.160***    −0.007       0.011*       −0.013***
                              (0.031)      (0.204)      (4.397)      (0.017)      (0.008)      (0.006)      (0.005)
Pre-opening trend             −0.011       0.084*       0.759        0.002        0.003        0.004***     0.004***
                              (0.007)      (0.046)      (1.069)      (0.004)      (0.002)      (0.002)      (0.001)
Post-opening shift in trend   0.056***     −0.271***    −2.290*      −0.007       −0.009***    −0.002       −0.001
                              (0.010)      (0.059)      (1.339)      (0.006)      (0.002)      (0.002)      (0.002)
Constant                      -0.048       2.893***     176.613***   0.989***     0.166***     0.287***     0.163***
R2                            0.76         0.63         0.66         0.69         0.83         0.95         0.94
N                             360          360          360          360          360          360          360

Each column is from a separate regression of neighborhood characteristics on a dummy for years after the new school
opening year, and a spline function for time trend (centered around the opening year) with a knot at the opening year.
All regressions include separate controls for year and neighborhood fixed effects. Robust standard errors are reported in
parentheses. * p < 0.1; ** p < 0.05; *** p < 0.01
Table III.5: Effect of school openings on school resources
Independent variable
Nb of
classes
Nb of
teachers
Class size
(I)
(II)
(III)
% lowexperienced
teachers
(IV)
Panel A : Grade 6
Post-opening
Pre-opening trend
−1.868*** −16.635*** −0.208
(0.149)
(1.219)
(0.193)
(0.011)
0.007
−0.499*
0.242***
−0.008***
(0.039)
(0.298)
(0.048)
(0.003)
0.134
−0.031
−0.009***
(0.045)
(0.357)
(0.056)
(0.003)
7.916***
56.363***
24.621***
0.246***
0.76
0.77
0.73
0.79
0.075
0.046***
Post-opening shift in trend −0.138***
Constant
R2
0.070***
Panel B : Grade 9
Post-opening
−1.729*** −12.690***
(0.196)
(1.159)
(0.243)
(0.010)
−0.038
−0.343
−0.217**
−0.010***
(0.049)
(0.298)
(0.100)
(0.003)
−0.113*
−0.868**
−0.116
−0.002
(0.061)
(0.358)
(0.104)
(0.003)
7.359***
57.514***
21.707***
0.236***
R2
0.67
0.77
0.55
0.81
N
360
360
360
360
Pre-opening trend
Post-opening shift in trend
Constant
Each column is from a separate regression of neighborhood characteristics on a
dummy for years after the new school opening year, and a spline function for
time trend (centered around the opening year) with a knot at the opening year.
All regressions include separate controls for year and neighborhood fixed effects.
Robust standard errors are reported in parentheses. * p < 0.1; ** p < 0.05; ***
p < 0.01
Table III.6: Effect of school openings on students’ retention rate at the end of
the school year (grades 6 to 8)

                              % Grade 6     % Grade 7     % Grade 8
                              Retention     Retention     Retention
Independent variable          (I)           (II)          (III)
Post-opening                  -0.002        0.003         0.002
                              (0.005)       (0.004)       (0.008)
Pre-opening trend             -0.004***     -0.003***     -0.007***
                              (0.001)       (0.001)       (0.002)
Post-opening shift in trend   -0.000        0.001         -0.000
                              (0.002)       (0.001)       (0.002)
Constant                      0.049***      0.014***      0.051***
R2                            0.42          0.40          0.35
N                             360           360           360

Each column is from a separate regression of neighborhood characteristics on a
dummy for years after the new school opening year, and a spline function for
time trend (centered around the opening year) with a knot at the opening year.
All regressions include separate controls for year and neighborhood fixed effects.
Retention is measured at the end of the corresponding grade (not available for
year 2012). Robust standard errors are reported in parentheses.
* p < 0.1; ** p < 0.05; *** p < 0.01
Table III.7: Effect of school openings on students’ outcomes in grade 9

                              Rank          % Grade       % Attrition   % Vocational  % General
                              national      retention                   studies       studies
                              exam
Independent variable          (I)           (II)          (III)         (IV)          (V)
Panel A : All new schools
Post-opening                  1.832**       -0.005        -0.008        -0.009        0.027**
                              (0.764)       (0.007)       (0.009)       (0.010)       (0.011)
Pre-opening trend             -0.458**      -0.002        0.003         -0.003        0.002
                              (0.206)       (0.002)       (0.002)       (0.002)       (0.003)
Post-opening shift in trend   0.387         -0.001        -0.003        0.004         0.002
                              (0.237)       (0.002)       (0.002)       (0.003)       (0.003)
N                             324           360           360           360           360
Panel B: New schools opening after 2005
Post-opening                  1.425*        -0.004        -0.006        -0.019        0.033**
                              (0.860)       (0.010)       (0.013)       (0.012)       (0.013)
Pre-opening trend             -0.388*       -0.001        0.004         -0.004        0.001
                              (0.219)       (0.002)       (0.002)       (0.003)       (0.003)
Post-opening shift in trend   0.373         -0.004        -0.005        0.009*        0.001
                              (0.331)       (0.003)       (0.005)       (0.005)       (0.007)
Panel C: New schools opening after 2005 - Testing for disruption
Post-opening (t ≥ t0s)        1.389         -0.005        -0.006        -0.015        0.031**
                              (0.875)       (0.011)       (0.013)       (0.012)       (0.013)
t ≥ t0s + 3                   -0.280        -0.014        0.001         0.029         -0.015
                              (1.289)       (0.011)       (0.018)       (0.026)       (0.029)
Pre-opening trend             -0.387*       -0.001        0.004         -0.004        0.001
                              (0.220)       (0.002)       (0.002)       (0.003)       (0.003)
Post-opening shift in trend   0.439         -0.001        -0.005        0.002         0.004
                              (0.465)       (0.005)       (0.006)       (0.007)       (0.008)
N                             252           280           280           280           280

Each column is from a separate regression of neighborhood characteristics on a dummy for years after the new school
opening year, and a spline function for time trend (centered around the opening year) with a knot at the opening
year. All regressions include separate controls for year and neighborhood fixed effects. Outcomes are measured at
the end of grade 9. Column I refers to students’ percentile rank at the anonymous exam (DNB) at the end of grade
9, ranging between 0 (lowest rank at the Académie level) and 100 (best rank). Achievement at the DNB exam is
not available for year 2002, hence the smaller sample size compared to other columns. Attrition (column III) refers
to students who cannot be found in our longitudinal dataset after grade 9, because they dropped out from school
or because of measurement errors on their ID. Panel B restricts the analysis to new schools opening after 2005, for
which we have enough years before the school opens to estimate consistent trends. Panel C includes a dummy for
periods later or equal to 3 years after opening, corresponding to cohorts that did not experience a school change
(except for repeaters).
Robust standard errors are reported in parentheses. * p < 0.1; ** p < 0.05; *** p < 0.01
Table III.8: Robustness check : Changing the neighborhood
radius

                              Enrollment in     Rank national
                              private sector    exam
                              (grade 6)         (grade 9)
Radius of neighborhoods       (I)               (II)
d_is ≤ 0.5 · d*_s             -0.032*           -0.266
                              (0.016)           (1.442)
d_is ≤ 0.75 · d*_s            -0.036***         1.865*
                              (0.011)           (1.031)
d_is ≤ 1 · d*_s (baseline)    -0.030***         1.832**
                              (0.010)           (0.764)
d_is ≤ 1.25 · d*_s            -0.030***         0.547
                              (0.008)           (0.622)
d_is ≤ 1.5 · d*_s             -0.026***         0.323
                              (0.008)           (0.593)
N                             360               324

Each cell reports the estimate of the post-opening dummy for
different neighborhood radii and outcomes. For example, the first line defines neighborhoods as all students living
at a distance to the new school that is lower than half of the
median distance to school in the Département. Two outcomes
are considered: enrollment in a private middle school in grade
6 (column I) and students’ percentile rank at the anonymous
exam at the end of grade 9 (column II). Outcomes are regressed on the post-opening dummy, controlling for a spline
function for time trend (centered around the opening year)
with a knot at the opening year. All regressions include separate controls for year and neighborhood fixed effects. Robust
standard errors are reported in parentheses. * p < 0.1; **
p < 0.05; *** p < 0.01
Table III.9: Effect of school opening on close public schools’ neighborhoods

                        Enrollment in   Distance to     School size     School          Enrollment in
                        the new         school                          saturation      private sector
                        school
Closest public school   (I)             (II)            (III)           (IV)            (V)
1st closest             0.108***        0.116**         -31.385***      -0.148***       -0.003
                        (0.011)         (0.052)         (3.394)         (0.016)         (0.007)
2nd closest             0.074***        0.025           -15.092***      -0.062***       -0.009*
                        (0.009)         (0.049)         (2.766)         (0.014)         (0.005)
3rd closest             0.024***        -0.114**        -8.185***       -0.039***       -0.005
                        (0.004)         (0.048)         (2.595)         (0.013)         (0.005)
4th closest             0.024***        0.019           -8.485***       -0.032***       0.003
                        (0.004)         (0.057)         (2.552)         (0.012)         (0.007)
5th closest             0.011***        -0.073          -5.652***       -0.030***       0.000
                        (0.002)         (0.060)         (1.910)         (0.010)         (0.007)
N                       360             360             360             360             360

This table focuses on students living in the neighborhood of public schools located close to the
new school. The first line refers to the closest public school, the second line to the second closest
public school, and so on. Each column relates to a different outcome. Each cell corresponds to a
separate regression of outcomes on the post-opening dummy, controlling for a spline
function for time trend (centered around the opening year) with a knot at the opening year. The
table reports the estimates for the post-opening dummy. All regressions include separate controls for
year and neighborhood fixed effects. Robust standard errors are reported in parentheses.
* p < 0.1; ** p < 0.05; *** p < 0.01
Table III.10: Heterogeneity of the school opening effect on private sector choice

                              All           High-SES      Low-SES       Female        Male
Independent variable          (I)           (II)          (III)         (IV)          (V)
Post-opening                  -0.030***     -0.052**      -0.021**      -0.032**      -0.030**
                              (0.010)       (0.022)       (0.009)       (0.013)       (0.013)
Pre-opening trend             0.004**       0.006         0.003         0.007**       0.003
                              (0.002)       (0.006)       (0.002)       (0.003)       (0.003)
Post-opening shift in trend   -0.005*       -0.005        -0.005**      -0.005        -0.003
                              (0.002)       (0.006)       (0.002)       (0.004)       (0.003)
Constant                      0.170***      0.309***      0.116***      0.182***      0.161***
                              (0.011)       (0.023)       (0.009)       (0.014)       (0.014)
R2                            0.81          0.69          0.63          0.64          0.70
N                             360           360           360           360           360

Each column uses the enrollment rate in private middle school of a specific subgroup of students as an
outcome. For example, column II reports the results of the regression of high-SES students’ enrollment rate
in the private sector on a dummy for years after the new school opening year, and a spline function for time
trend (centered around the opening year) with a knot at the opening year. All regressions include separate
controls for year and neighborhood fixed effects. Robust standard errors are reported in parentheses. *
p < 0.1; ** p < 0.05; *** p < 0.01
Table of contents
Remerciements
Introduction
I Faculty Biases and the Gender Segregation across Fields
  Joint with Thomas Breda
  1 Background, data, and measures of stereotypes
    1.1 Institutional background
      1.1.1 The Paris Ecole Normale Supérieure
      1.1.2 Oral tests at the ENS entrance exams
    1.2 Data
      1.2.1 Candidates
      1.2.2 Subjects
      1.2.3 Male- and female-dominated fields
      1.2.4 Test scores
    1.3 Evidence of gender rebalancing at oral tests
  2 Methodology
  3 Results
    3.1 Examiners’ bias toward the under-represented gender
    3.2 Robustness checks
      3.2.1 Subject-by-subject comparisons
      3.2.2 Robustness across years
    3.3 The role of examiner gender
  4 More on the identification assumption
    4.1 Are candidates over-confident when under-represented?
    4.2 What if written tests are not really blind?
  5 Discussion
    5.1 Mechanisms
    5.2 Stereotypes and discrimination
    5.3 Conclusion
  Appendix: On the handwriting detection test
II Persistent Classmates: How Familiarity with Peers Protects from Disruptive Transition
  Joint with Arnaud Riegert
  1 Institutional context and data
    1.1 The high school curriculum in France
      1.1.1 Enrolling in general high schools
      1.1.2 The curriculum in general high schools
    1.2 The class-assignment mechanism
    1.3 Data
      1.3.1 Datasets
  2 Identification
    2.1 Definition of similar-file groups
    2.2 Empirical evidence of random assignment
      2.2.1 Balancing test using anonymous exam scores
      2.2.2 Additional evidence of random assignment
    2.3 Description of the SF sample
  3 Results
    3.1 Freshman-year class characteristics and achievement
    3.2 The protective role of familiarity with classmates
      3.2.1 Distribution of the PC effect
      3.2.2 Do all former peers matter?
  4 Robustness checks
    4.1 Alternative SF group specifications
    4.2 Estimation based on the impact of SF student allocation on their classmates
      4.2.1 Principle
      4.2.2 Validity of the test
      4.2.3 Results
  5 Discussion and conclusion
  Appendix
    Appendix 1 : Additional Tables
    Appendix 2 : Details on the matching procedure
    Appendix 3 : Details on the process to define the SF sample
III A New School in Town: Public School Openings, Private School Choice and Academic Achievement
  1 Institutional context and data
    1.1 New middle schools in France
    1.2 Private schools
    1.3 Data
    1.4 School neighborhoods
  2 School openings, educational contexts and private school choice
  3 School openings and educational achievement
    3.1 Methodology
    3.2 Results
  4 Robustness and complementary analysis
    4.1 Robustness
    4.2 The impact of school openings on nearby public schools
    4.3 Heterogeneity of school opening effect on private school choice
  5 Conclusion
Table of contents
List of figures
List of tables
List of Figures
I.1 Kernel density estimates of scores at written and oral tests, by track and gender. . . . . 41
I.2 Gender and choice of specialty. . . . . . . . . . . . . . . . . . . . . . . . . . . 42
II.1 Class assignment of similar-file students . . . . . . . . . . . . . . . . . . . . . 92
II.2 Composition of the typical classroom from a non-repeating student’s point of view . . . . 93
III.1 Saturation of public schools located close to the new school. . . . . . . . . . 135
III.2 Time and geographic distribution of the 36 new school openings . . . . . . . 136
III.3 Students’ composition in new school neighborhoods (Grade 6). . . . . . . . . 137
III.4 Students’ composition in new school neighborhoods (Grade 9). . . . . . . . . 138
III.5 School openings and educational contexts in new school neighborhoods (Grade 6). . . . . 139
III.6 School resources of students living in new school neighborhoods (Grade 6). . 140
III.7 School achievement in new school neighborhoods (Grade 9). . . . . . . . . . . 141
III.8 The effect of school opening on DNB exam percentile rank distribution in new school neighborhoods . . . . 142
List of Tables
I.1 Descriptive statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
I.2 Sample sizes for subjects and tracks with both written and oral tests . . . . . 44
I.3 Subjects’ female representation and examiners’ gender bias . . . . . . . . . . 45
I.5 Subjects’ female representation and examiners’ gender bias - separate estimates for each track and year . . . . 48
I.6 Female share in ENS oral tests examining boards (2004-2009 average) . . . . 49
I.7 Gender gap in choice of specialty subjects . . . . . . . . . . . . . . . . . . . . 50
I.8 Baseline results without females (males) with masculine (feminine) specialties - Physics-Chemistry, Biology-Geology and Humanities tracks only . . . . 51
I.9 Gender bias depending on year-specific females’ ability . . . . . . . . . . . . . 52
I.10 How easy is it to detect female handwriting? Results obtained by 13 researchers guessing the gender of 180 anonymous exam sheets. . . . . 54
I.11 Are assessors making the same guess about handwriting? Consistency between assessors on the sample of exam sheets assessed exactly 5 times and belonging to different students. . . . . 55
II.1 Descriptive statistics on students’ characteristics . . . . . . . . . . . . . . . . 94
II.2 Student’s class characteristics regressed on own anonymous exam score: Evidence of the random assignment of similar-file students . . . . 95
II.3 Student’s class characteristics regressed on behavior score: Evidence of the random assignment of similar-file students . . . . 96
II.4 Effect of class characteristics on high school outcomes . . . . . . . . . . . . . 97
II.5 Effect of persistent classmates on high school outcomes with and without controlling for other class characteristics . . . . 98
II.6 Distribution of the PC effect . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
II.7 Which peers do matter? Decomposition of the PC effect on students at risk. . . . . 100
II.8 Robustness check: effect of PC on low-ability students’ repetition rate using different specifications of the SF fixed effect . . . . 101
II.9 IV exogeneity test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
II.10 Effect of PC on high school outcomes using the IV strategy . . . . . . . . . . 103
II.11 Student’s class characteristics regressed on own anonymous exam score: Evidence of the random assignment of similar-file students . . . . 105
II.12 Anonymous exam score regressed on class characteristics: Evidence of the random assignment of similar-file students . . . . 106
II.13 Raw correlation between class characteristics and high school outcomes . . . 107
II.14 Effect of class characteristics on high school outcomes for the sample "at risk" . . . . 108
III.1 Descriptive statistics on students in 2002 . . . . . . . . . . . . . . . . . . . . 143
III.2 Neighborhoods of new schools opening before or after 2006 . . . . . . . . . . 144
III.3 Students’ composition in new school neighborhoods . . . . . . . . . . . . . . 145
III.4 Effect of school openings on educational contexts . . . . . . . . . . . . . . . . 146
III.5 Effect of school openings on school resources . . . . . . . . . . . . . . . . . . 147
III.6 Effect of school openings on students’ retention rate at the end of the school year (grades 6 to 8) . . . . 148
III.7 Effect of school openings on students’ outcomes in grade 9 . . . . . . . . . . . 149
III.8 Robustness check: Changing the neighborhood radius . . . . . . . . . . . . . 150
III.9 Effect of school opening on close public schools’ neighborhoods . . . . . . . . 151
III.10 Heterogeneity of the school opening effect on private sector choice . . . . . . 152