Social interactions, mechanisms, and equilibrium:
Evidence from a model of study time and academic achievement
Tim Conley, Nirav Mehta, Ralph Stinebrickner, Todd Stinebrickner
March 20, 2015
PRELIMINARY
Abstract
Due to data limitations, few papers documenting the existence of peer effects explore
the mechanisms through which they operate. This paper develops and estimates an
equilibrium model of study time choices made by students, given a social network. The
model is designed to exploit unique data collected in the Berea Panel Study (BPS). The
study time data allow us to quantify an intuitive mechanism for social interactions: the
cost of one’s own study time may depend on friends’ study time. The detailed social
network data allow us to embed these individual study time choices in an equilibrium
framework, allowing for feedback effects. We find evidence of a strong effects of friend
study time on own study time, and therefore, student achievement. Not taking into
account equilibrium behavior would understate the effect of peers on achievement by
20-30% of a standard deviation. Non-random sorting into friendships increases the
standard deviation of study time by 30% and the standard deviation of achievement
by 5%.
1
1
Introduction
Peer effects between students are widely believed to be important for determining academic
achievement. Most of the existing research on peer effects in this context has focused on
establishing a causal link between peer characteristics and academic outcomes in an effort
to provide evidence on whether peers matter.1 Empirical evidence on specific mechanisms
generating peer effects is very limited. In this paper we exploit unique, new data on college
students from the Berea Panel Study (BPS) to provide some of the first evidence about
mechanisms underlying peer effects in determining academic outcomes. We focus on the
specific, intuitively appealing mechanism that a student’s study effort is influenced by his
peers. We estimate an equilibrium model of study time choice conditional on a network of
social contacts.
Even taking the natural first step of establishing whether peers matter has proven to
be quite challenging. One difficulty is that the set of peers that are likely to have the
most influence on a particular student may not be easily observable. Peer definitions are
commonly taken to be all other students within an observable administrative unit like a
classroom, dormitory, or platoon; often due to data limitations. These units are inherently
of interest when the key policy question regards an administrative choice regarding their
composition, e.g. classroom tracking or platoon assignment(Gamoran and Hallinan (1995),
Hanushek and Woessmann (2006), Carrell et al. (2009)). However, recent research has recognized that, in many contexts, the relevant social network of peers is not well-represented
as disjoint, complete networks corresponding to administrative groups. Rather, individuals
are modeled as interacting with each other in a (typically sparse) social network that is
not revealed directly from administrative groupings (Brock and Durlauf (2001a), Brock and
Durlauf (2001b), Calvó-Amengol et al. (2009), Conley and Udry (2001), Conley and Udry
(2010), Weinberg (2009), Weinberg (2013), Carrell et al. (2013)). Even when the policy of
interest is a group composition choice, understanding the outcomes generated by a group’s
composition can require knowledge of the social network of associations between individuals
1
See Epple and Romano (2011) for a survey of this extensive literature.
2
and the mechanism underlying peer influence (Blume et al. (2010),Weinberg (2013), Carrell
et al. (2013). A second difficulty is that identifying the causal effect of peers requires addressing the well-known omitted variables (endogeneity) problem (or Manski (1993)’s “correlated
unobservables”) that will be present if peer groups are determined, in part, on the basis of
unobservable student characteristics that also influence academic performance. To deal with
this issue, researchers have often looked for situations where some subset of one’s peers (e.g.,
one’s college roommates (Sacerdote (2001)) are randomly assigned. From the standpoint of
establishing whether peers matter, it may be be less than ideal that the peer group under
study is often dictated by opportunities that arise from institutional details rather than by
a researcher’s priors about which type of peers are likely to be most important.
The goal of this paper is to move beyond a demonstration that peer effects exist and
pursue two of the next steps necessary to understand how they operate. Our first step is to
provide evidence about a mechanism underlying peer influences in our context. Motivated
by recent research which has hypothesized that time-use is likely to be the input that is most
readily influenced by peers in the short run, we focus on study effort as an explicit, underlying
mechanism through which peer effects could arise in the first year of college.2 Our second
step forward involves investigating the importance of spillover effects in the propagation of
peer effects. Not only might student i’s outcome be influenced by i’s friends, but i’s friends’
outcomes may be influenced by i. Moreover, these types of feedback/spillover effects can
work indirectly through students in the social network who are more than one link away
from student i. Then, taking into account equilibrium effects is necessary to understand
the manner in which changes in an input (in our case study effort) propagates throughout a
network, and to avoid an understatement of the total importance of peers.
We move away from black box regressions in which individuals’ outcomes are related to
the characteristics or outcomes of peers, to a model which explicitly incorporates a mechanism and allows for feedback effects in equilibrium. However, estimating such a model adds
substantial data challenges beyond those that are encountered when a researcher’s goal is
2
For a non-education (financial) example of research which is interested in understanding why peer effects
exist, see Bursztyn et al. (2014).
3
only to provide some evidence that peers matter. First, the equilibrium of the model depends
on the whole social network which leads to a need for data that characterize peer connections
over an entire social network. Unfortunately, among existing sources of social network data,
perhaps only one (National Longitudinal Survey of Adolescent Health,Add-Health) could
potentially provide a full view of an entire social network in an educational setting where
academic outcomes are, at least to some extent, also observed. Second, modeling an explicit
peer mechanism requires, for each person in the social network, data on an important input
in the production of human capital through which peer effects are transmitted. Study effort
is a clearly important and fundamental input but data on it is not easy to obtain. Collecting
reliable time-use information is very difficult in annual surveys, and, as a result, current
data sources do not contain information of this type.3 Thus, to our knowledge, there does
not exist an existing data source that is able to both characterize an entire social network
of school aged individuals and provide evidence about a central input in the grade production function (e.g., in our case study effort) that is likely to play an important role in peer
transmission.4
Our project is made possible by unique data from the Berea Panel Study (BPS) that was
collected specifically to overcome these current data limitations. Motivated by the objective
of providing information about the college period that is more detailed than what is available
in other data sources, the design of the BPS involved having each respondent complete as
many as twelve surveys each year while in school. This unique frequency of contact allows
the BPS to provide comprehensive information about the input (mechanism), study effort,
through which peer effects arise in our model. Specifically, detailed time-use information
comes from eight time-diaries that were collected during the freshman year. The frequency
of contact also plays an important role in our ability to characterize one’s peers, with each
3
Typically, the object interest will be how much a respondent studies over an entire academic period
(semester or year). Measurement error will be present in answers to retrospective questions which elicit
study amounts over the entire period. An alternative is to elicit information over a shorter, recent period
using a time diary. However, the amount a person studied in the shorter period will be a noisy measure of
how much the person studied over the entire academic period. The difficulty of finding current surveys that
provide information about study effort can be seen in the work of Babcock and Marks (2011) whose goal is
to detail changes in the amount of time spent studying over the last several decades.
4
The Add-Health dataset does not contain information about time spent studying.
4
person being asked to name his four closest friends four times each year. Given our objectives,
it is necessary to take seriously the task of characterizing the entire social network. The BPS
design is crucial in this respect as it involved surveying all entering students in a particular
cohort at Berea College and survey response rates that were in the range of 85% and 90%
for this cohort. In addition, our attempt to characterize an entire network benefits from a
decision to focus on our grade performance outcome in the first year of college; due, in part,
to the geographic location of the school (in a small town), most students do not know other
students when they arrive at college and we find that the overwhelming majority of named
friends in the freshman year are also freshmen.
The model consists of two time periods. In the beginning of each period, the social
network is realized. All students in the social network simultaneously choose study time,
which may enter as an input both directly and indirectly to their own and peer’s production
of achievement. There is a growing literature studying peer effects that has focused on the
modeling the evolution of social networks, an important and notoriously difficult problem
(see Christakis et al. (2010), Mele (2013), or Hsieh and Lee (2015)). In the present work
we take the network as given, which allows us to make real progress on how students affect
each other given the entire social network. In particular, we prove there is a contraction in
the period study-time game for reasonable parameter values, no matter how many students
there are in the network or how convoluted and complicated the nature of interconnectedness is. This approach has the advantage that we can avoid making commonly invoked
assumptions (e.g. feedback effects are limited to immediate neighbors) that sometimes turn
studies of “networks” into much more manageable studies of peer groups that are much less
interconnected, and therefore easier to work with. Simply put, how the same number links
are distributed between identical students can have drastic implications for the distribution
of achievement.The machinery we develop allows us to estimate the entire distribution of
achievement, given the network in different time periods, thus we can address the question
of how much friends matter.
Our objectives are to provide some of the first evidence about the mechanisms underlying
5
peer effects and the importance of feedback effects rather than developing a new approach
to dealing with peer endogeneity, the focus of much of the previous literature. When the
available data contains a rather comprehensive set of ability measures (as in our dataset), the
primary endogeneity concern is typically that individuals will tend to be grouped on the basis
of an unobserved propensity for effort. In our approach, a concern about unobservable study
effort per se is no longer problematic since we measure study effort directly and specify
it as the channel through which peer effects arise. Instead, as will be discussed in more
detail in Section 5, endogeneity concerns arise if a relationship between the study effort
of an individual and the study effort of his peers arises because individuals tend to group
themselves on the basis of an unobservable propensity to study. Our approach for dealing
with this potential problem is made possible by the luxury of designing a survey specifically
for the purposes of this paper. This allows us, at the time of college entrance, to collect
(normally unobserved) information about how much a student expects to study in college as
well as how much he or she actually studied in high school. We find that this information
has content, with both expected college study effort and high school study effort being
correlated with actual study effort in college. We also discuss alternative avenues through
which endogeneity concerns might arise in our treatment of study effort.
Our main preliminary results are a characterization of an impulse response to an individual increasing his or her study time and an assessment of the importance of sorting in the
social network upon outcomes. An increase in an individual’s study effort impacts his or her
own achievement and that of all others connected in the network. Our point prediction the
effect of a one hour per day increase in study time for a focal student upon that student’s
own achievement is an increase of approximately .025 GPA points. Averaging over which
student is shocked, the corresponding average general equilibrium responses for students
one link away from the focal student is approximately 0.08 GPA points. About a third of
this response for students one link away is due to indirect connections. The average effect
of this impulse upon those two links away is 0.02 GPA points, all due to indirect network
connections.
6
In order to assess the importance of sorting we construct counterfactual predictions in
a scenario with peer connections randomly assigned. These counterfactual predictions are
substantially different from our baseline estimates of study time for women, blacks, and those
with higher than median high school GPA because these groups have relatively high studytime peers in the BPS data. In the counterfactual randomly-assigned peer network they have
peers with more average study times by construction and hence have average reductions of
friend study time of 0.20 to 0.32 hours per day inducing a reduction in own study effort
of 0.15 to 0.20 hours with a corresponding reduction in GPA points of 0.04 to 0.06. It is
important to note that these groups’ losses are not offset by gains of their complements.
The remainder of this paper is organized as follows. Section 2 contains a literature
review and description of the BPS data. Section 3 presents our model. Section 4 presents
our empirical specification and operational definition of networks. Section 5 discusses our
results and counterfactual exercises and once we have a conclusion it will be in Section 6.
2
Data
The Berea Panel Study (BPS) is a longitudinal survey that was designed to provide detailed
information about students’ input choices and educational outcomes in college, and labor
market outcomes shortly after college.5 The BPS project closely followed all students who
entered Berea College in the fall of 2000 and the fall of 2001 from the time they entered
college through the early portion of their working lives. As has been discussed in previous
work that uses the BPS (Stinebrickner and Stinebrickner (2006)), caution is appropriate
when considering actually how results from our case study would generalize to other specific
institutions. At the same time, from an academic standpoint, Berea has much in common
with many colleges. It operates under a standard liberal arts curriculum and the students at
Berea are similar in academic quality to, for example, students at the University of Kentucky
(Stinebrickner and Stinebrickner (2008)). In addition, earlier work found that academic
decisions at Berea look very similar to decisions made elsewhere. For example, dropout
5
The BPS was designed and administered by Todd Stinebrickner and Ralph Stinebrickner.
7
rates are similar to those found elsewhere (for students from similar income backgrounds) and
patterns of major-switching at Berea are similar in spirit to those found in a representative
dataset by Arcidiacono (2004)
It is also worth noting that the study of one school is beneficial for at least two general
reasons. This makes it feasible to collect the multiple surveys each year that provide data
that are more detailed in important dimensions than other existing surveys, Furthermore, it
makes outcomes such as college grade performance directly comparable across individuals.
In terms of specific benefits, this study is made possible by three types of information
that are available in the BPS. First, the BPS elicited each student’s closest friends, four
times each year. Our main analysis utilizes friendship observations from the end of the first
semester and the end of the second semester. The survey question for the end of the first
semester is shown in Appendix A.1. The survey question for the end of the second semester
is similar. Second, the BPS collected detailed time-use information eight times each year,
using a twenty-four hour time diary, which is shown in Appendix A.1. Finally, questions
on the baseline survey reveal the number of hours that a student studied per week in high
school and how much the person expects to study per week in college. The survey data is
merged with detailed administrative data that provides information about race, sex, high
school grade point, average college entrance exam scores, and college grade point average in
each semester.
This paper focuses on the freshman year for students in the 2001 entering cohort. We
focus on understanding grade outcomes during students’ freshmen years for several reasons.
Under the general liberal arts curriculum at Berea, students have very similar course loads in
their first year. Approximately 88% of all entering students in the 2001 cohort completed our
baseline survey (which was administered immediately before the start of freshman classes),
and response rates remained high for the eleven subsequent surveys that were administered
during the freshman year. Finally, we have the most complete network data for these students
and their friendship network is overwhelmingly within-cohort. Over 80% of friends reported
by students in this cohort are themselves in the 2001 cohort (the large majority of the
8
remainder are in the 2000 cohort for which we also have data). These advantages tend to
fade in subsequent years as friendships change (in part due to substantial dropout after first
year) and students’ programs of study specialize.
2.1
Descriptive statistics
Our sample consists of 307 students who are each observed in the two semesters of their
freshman year. The top half of Table 1 shows descriptive statistics of student characteristics.
The first row in each of the five panels shows an overall sample average for the variable
of interest given in the first column. We split variables based on sex, whether or not the
student is black, and whether or not the student’s high school GPA was above the sample
median, to capture heterogeneity by these variables shown to be strongly correlated with
academic performance. Forty-four percent of students are male, 18% of students are black,
the mean high school grade point average for the sample is 3.39, the mean combined ACT
score is 23.26, and, on average, students studied 11.24 hours per week in high school. The
subsequent rows in each panel show sample averages (of the variable of interest in the first
column) for different subgroups. For example, the third panel shows that males and blacks
have lower than average high school grade point averages (3.24 and 3.14, respectively), and
the fifth panel shows that blacks studied more, on average, in high school than other students.
The second half of Table 1 shows descriptive statistics of first year outcomes. Panels 5
and 6 show that, on average, students study 3.45 hours in the first semester and 3.48 hours
in the second semester. On average, male students study less than female students, black
students study more than white students, and students with above-median high school GPAs
study more than students with below-median high school GPAs.Panels 8 and 9 show that
the average first semester grade point average is 2.89 and the average second semester grade
point average is 2.93. Male students, black students, and students with below-median high
school GPAs all have lower than average GPAs than their counterparts.
The same information is represented in Figure 1, which shows the group means by student
characteristic, with 95% confidence intervals.
9
Figure 1: Average own study time by group, with 95% confidence intervals
own study time (hours/day)
4.0
3.5
3.0
nonblack
black
female
male
10
low hsgpa
high hsgpa
total
Table 2 summarizes friend data, for those who report having at least one friend in each
semester, broken down by the same characteristics as in Table 1. The top panel shows
that students have about 3 friends on average. The mean masks considerable variation:
the minimum number of friends is one, while the maximum is 10 friends. The second and
third panels show that male and black students (and, therefore, female and white students)
sort towards students with the same characteristics. In addition, the third panel shows that
students with above-median high school GPAs have fewer black friends. The fourth and
fifth panels show that male and black students have friends with lower incoming GPAs and
combined ACT scores. The sixth and seventh panels show that black students and students
with above-median high school GPAs have friends who were more studious in both high
school and college, while males have less studious friends.
11
Table 1: Own summary statistics
Variable
Male indicator
Group
all
given black
given above med. HS GPA
N
307
55
155
Mean
0.44
0.45
0.33
SD
0.5
0.5
0.47
Min
0
0
0
q1
0
0
0
q2
0
0
0
q3
1
1
1
Max
1
1
1
Black indicator
all
given male
given above med. HS GPA
307
134
155
0.18
0.19
0.1
0.38
0.39
0.31
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
HS GPA
all
given male
given black
given above med. HS GPA
307
134
55
155
3.39
3.24
3.14
3.77
0.47
0.51
0.46
0.17
1.68
1.68
2.24
3.5
3.09
2.9
2.78
3.6
3.5
3.21
3.1
3.8
3.8
3.7
3.52
3.9
4
4
4
4
ACT
all
given male
given black
given above med. HS GPA
307
134
55
155
23.26
22.54
19.91
24.45
3.61
3.77
2.51
3.53
14
14
14
17
21
20
18
22
23
23
20
25
26
25
21
27
33
31
25
33
HS study*
all
given
given
given
all
given
given
given
male
black
above med. HS GPA
307
134
55
155
307
134
55
155
11.24
11.43
15.29
10.66
24.96
22.72
28.56
25.18
11.35
11.94
14
10.44
11.61
11.08
13.56
10.47
0
0
0
0
0
0.97
0
0
4
3.12
5
4
17
16
19
18
8
8
10.5
8
23
20.75
25
23.5
15
15
20
14.5
31
27.38
38.5
32
70
70
70
70
64
64
57.5
56
Sem. 1 Own study**
all
given male
given black
given above med. HS GPA
296
129
53
153
3.45
3.16
3.8
3.6
1.67
1.76
1.63
1.7
0
0
0
0
2.33
2
2.84
2.41
3.29
2.92
3.67
3.34
4.5
4.09
4.78
4.75
10.33
8.66
8.33
8.58
Sem. 2 Own study**
all
given male
given black
given above med. HS GPA
278
117
50
145
3.48
3.19
3.75
3.69
1.6
1.67
1.55
1.47
0
0
0
0
2.23
2
2.74
2.66
3.34
3
3.5
3.67
4.75
4.34
4.96
4.83
9
9
7.33
7.92
Sem. 1 GPA
all
given male
given black
given above med. HS GPA
307
134
55
155
2.89
2.72
2.42
3.19
0.78
0.8
0.78
0.62
0
0.3
0
0.52
2.49
2.17
1.82
2.81
3.06
2.8
2.57
3.29
3.46
3.29
2.84
3.69
4
4
4
4
Sem. 2 GPA
all
given male
given black
given above med. HS GPA
301
131
53
155
2.93
2.74
2.58
3.21
0.78
0.84
0.86
0.66
0
0
0.44
0
2.53
2.38
2.22
2.82
3.05
2.82
2.62
3.36
3.46
3.33
3.33
3.74
4
4
3.78
4
Expected study
male
black
above med. HS GPA
*Note: Hours per week spent studying senior year of high school.
**Note: Own study time is semester-level average hours per day.
12
Figure 2: Average friend study time by group, with 95% confidence intervals
friend study time (hours/day)
4.0
3.5
3.0
nonblack
black
female
male
low hsgpa
high hsgpa
total
Figure 2 shows the group means, with 95% confidence intervals for each group mean.
13
Table 2: Average friend summary statistics, pooled over both semesters
Variable
Group
N
Mean
SD
Min
q1
q2
q3
Max
Num. friends
all
given male
given black
given above med. HS GPA
614
268
110
310
3.31
3.22
3.21
3.34
1.58
1.59
1.35
1.62
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
10
10
7
10
Share male friends
all
given male
given black
given above med. HS GPA
614
268
110
310
0.43
0.74
0.43
0.35
0.39
0.31
0.4
0.38
0
0
0
0
0
0.5
0
0
0.33
0.82
0.33
0.25
0.75
1
0.83
0.67
1
1
1
1
Share black friends
all
given male
given black
given above med. HS GPA
614
268
110
310
0.18
0.18
0.69
0.1
0.32
0.32
0.38
0.22
0
0
0
0
0
0
0.43
0
0
0
1
0
0.25
0.25
1
0
1
1
1
1
Friend HS GPA
all
given male
given black
given above med. HS GPA
614
268
110
310
3.37
3.29
3.18
3.46
0.32
0.33
0.34
0.27
2.24
2.25
2.25
2.65
3.2
3.07
2.96
3.29
3.41
3.34
3.19
3.46
3.62
3.53
3.41
3.63
4
4
4
4
Friend ACT
all
given male
given black
given above med. HS GPA
614
268
110
310
23.29
22.72
21.2
23.79
2.63
2.64
2.53
2.42
16
16.33
16
17.5
21.67
21
19.33
22.23
23.33
23
21
23.67
25
24.64
22.5
25.33
32
31
29
32
Friend HS study
all
given male
given black
given above med. HS GPA
614
268
110
310
11.03
10.53
14.62
11.48
7.64
7.37
7.31
8.44
0
0.5
2.5
0.5
6
5.17
9.18
6
9.5
9
13.92
9.7
14.47
14
18.75
14
70
37.33
37
70
Friends expected study
all
given male
given black
given above med. HS GPA
614
268
110
310
24.82
22.89
28.05
24.72
7.4
6.97
8.53
7.42
0
4.06
12
0
19.75
18.23
21.35
20
23.55
21.65
28.9
23.55
29.62
27.05
33.79
29.48
55
55
51
55
Friend study*
all
given male
given black
given above med. HS GPA
614
268
110
310
3.5
3.16
3.78
3.64
1.72
1.49
1.77
1.79
0
0.5
0.5
0
2.47
2.21
2.7
2.56
3.26
3
3.52
3.36
4.28
3.88
4.47
4.41
11.93
8.46
10.81
11.93
*Note: Friend study time is the average of friends’ own study time that semester.
14
Table 3: Network characteristics
Churn
Prob. friendship reported first
0.51
semester but not second
i.e. Pr{A2 (i, j) = 0|A1 (i, j) = 1}
Prob. second semester
0.51
friendship is new
i.e. Pr{A1 (i, j) = 0|A2 (i, j) = 1}
Correlations bt. own
and avg. of friends
Black
Male
HS GPA
Combined ACT
HS study time
Expected study time
0.74
0.71
0.23
0.31
0.23
0.14
Table 3 shows other network characteristics. Both the probability that a first semester
friendship is not reported in the second semester and the probability that a second-semester
friendship was not reported in the first semester are about .50. Consistent with the findings
from Table 2, the right side of the table shows substantial sorting on race, sex, high school
GPA, combined ACT score, and high school study time.
Pooling observations in the two semesters, Table 4 presents OLS regression results predicting own study time (left column) and GPA (right column). The study time regression
shows evidence of significant partial correlations of one’s own study amount (computed as
the average amount the student studies in the four time diaries in a semester) with own
gender, own high school GPA, and own high school study time. Of particular relevance
for our analysis, one’s own study amount also has a positive partial correlation with the
average study time of one’s friends’ contemporaneous semester study time.6 The GPA regression shows that own achievement has a significant positive partial correlation with being
female, non-black, and having above-median high school GPA. Own achievement also has a
significant partial correlation with own study time and the average of friends’ contemporaneous semester study time.7 The results in Table 4 motivate the two key components of our
theoretical model.
6
Each friend’s semester study time is computed as the average amount that friend’s study in the four
time diaries in the relevant semester. This is then averaged over friends that semester.
7
The standard errors are substantially unchanged when we cluster them at the student level.
15
Table 4: Study and GPA regressions, pooled over both semesters
Dependent variable:
Own study
GPA
(1)
(2)
Male
−0.369∗∗∗
(0.136)
−0.131∗∗
(0.061)
Black
0.116
(0.186)
−0.225∗∗∗
(0.084)
HS GPA
0.413∗∗∗
(0.149)
0.437∗∗∗
(0.068)
ACT
−0.032
(0.021)
0.040∗∗∗
(0.009)
HS study
0.043∗∗∗
(0.006)
0.001
(0.003)
Expected study
−0.002
(0.006)
−0.006∗∗
(0.003)
Friend study
0.166∗∗∗
(0.037)
0.090∗∗∗
(0.019)
Own study
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
1.915∗∗∗
(0.671)
0.417
(0.303)
574
0.169
0.159
1.496 (df = 566)
16.457∗∗∗ (df = 7; 566)
571
0.259
0.249
0.676 (df = 563)
28.044∗∗∗ (df = 7; 563)
∗
Note:
16
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
3
Model
Students are indexed by i = 1, . . . , N and time periods (semesters) by t = 1, 2. We denote
the study time of student i in time t as sit and define a column vector collecting all students’
study times during that period as St . We use an adjacency matrix to represent the peer
network of friendship connections between students, and treat it as pre-determined. Let At
denote this adjacency matrix in period t with an (i, j) entry of one if student i has j as
a peer and zero otherwise, and a main diagonal of zeros. Though our baseline empirical
specification symmetrizes links (i.e. A(i, j) = max{A(i, j), A(j, i)}), the model can also
accommodate links that are not reciprocal, i may link to j without j linking to i, thus At
need not be symmetric.
There are two main implications of our treatment of the adjacency matrix as predetermined. The first is econometric: if students sort into friendships based on their expected benefits of doing so, a model of the friendship formation process could serve as a
control function for unobservables affecting sorting (Chan (2015), Hsieh and Lee (2015),
Badev (2013)). As discussed in Section 5.1, our alternative approach attempts to go to the
core of the endogeneity problem: we directly collect data on typically unobserved variables
that affect study time, which may be correlated between friends, eliminatingour need for
modeling friendship formation to derive such a control function. Second, because we do not
have a model for how friendships are formed, we cannot characterize how friendships might
change in response to policy interventions.8 Therefore, we limit our counterfactual exercises
to quantify the equilibrium effects of changes in friend study time using the observed social
network, and simulating achievement in an economy where friends are randomly assigned.
The average study time of i’s peers during period t depends on At and St according to:
PN
j=1
s−it = PN
At (i, j)sjt
j=1 At (i, j)
.
(1)
Taking into account their friends, students make decisions about how much to study in a
8
There is a growing literature modeling friendship choices (Badev (2013), de Paula et al. (2014), Mele
(2013)), which typically abstracts from mechanisms underlying payoffs to friendship formation.
17
particular semester by considering the costs and benefits of studying. The benefit of studying
comes from the accumulation of human capital. The production function for human capital,
y, for student i in period t is:
y(sit , µyi ) =β1 + β2 sit + µyi ,
(2)
where µyi is an observable “human capital type” which allows the amount a person learns
in school to vary across people, conditional on her own study level. As will be discussed in
Section 4, in practice this type will be constructed using observable characteristics that have
consistently been found to influence academic performance (e.g., high school grade point
average and sex).
The cost of putting effort into studying, c, is determined by
c(sit , s−it , µsi ) = θ1 sit + θ2 γ(µsi )sit +
θ3 sit θ4 γ(µsi )sit
θ5 s2it
+
+
.
s
s
s
sτ−it
sτ−it
2sτ−it
(3)
Peer study time enters the cost function by reducing the cost of one’s own studying. It’s less
arduous (or more fun) to study at the library when everyone else is at the library. Conversely,
if all your friends are at the library, it is no fun to stay in your dormitory room. µsi is a “study
type” which takes into account that the disutility from studying (or utility from leisure) will
vary across people, conditional on own and peer study levels. As will be discussed in Section
4, in practice this type will be constructed from observable characteristics such as how much
a student studied in high school and how much a student expects, at the time of entrance, to
study in college. Then, γ(µsi ) =
1
exp(τµ,1 µsi +τµ,2 µ2si )
allows the cost function to have intercepts
and slopes that vary across people of different study types. As we discuss after equation (7),
we adopt this particular specification for the cost function due to its desirable properties.
We do not include a fixed cost of studying because only a very small number of cases (5%)
have zero reported study time.
With full knowledge of the At sequence and all students’ human capital types {µyj } and
study time types {µsj }, students simultaneously choose study times (one per period) to
18
maximize utility, which we assume to be separable across periods:
u(si1 , si2 ) =
( 2
X
)
y(sit , µyi ) − c(sit , s−it , µsi ) .
(4)
t=1
.
3.1
Model Solution
The student’s decision problem is additively separate across time periods t. Therefore the
student can solve each period’s problem separately.9 Student i’s policy in t is
sit = arg max {y(s, µyi ) − c(s, s−it , µsi )},
(5)
s
which we can find by taking the first order condition with respect to own effort. Differentiating (5) in period t with respect to s gives
∂y
∂s
=
∂c
,
∂s
i.e. the utility-maximizing study
effort level equates the marginal return for increasing study effort with the marginal cost.
Expanding the optimality condition gives:
β2 = θ1 + θ2 γ(µsi ) + θ3
1
s
sτ−it
+ θ4
γ(µsi )
sit
+ θ5 τs
τs
s−it
s−it
⇒
sit = −
θ3 θ4
(β2 − θ1 ) τs
θ2
s
− γ(µsi ) +
s−it − γ(µsi )sτ−it
.
θ5 θ5
θ5
θ5
(6)
The second line shows that θ5 introduces necessary curvature into the student’s period objective – were it zero the student’s objective would be linear in own study time and there
would not exist a unique best response to friend study time. However, equation (6) shows
that one of the preference parameters θ must be normalized. Therefore, we normalize θ5 to
9
If utility were nonlinear in semester achievement or the argument of the cost function were study time
over the whole year, the problem would no longer be separable across time periods. We assume student
utility is linear in achievement because non-linearity of utility in achievement would be difficult to separate
from non-linearity in the cost function, outside functional form restrictions.
19
one, resulting in the student best response function
s
s
sit = −θ3 − θ4 γ(µsi ) + (β2 − θ1 )sτ−it
− θ2 γ(µsi )sτ−it
≡ ψ(s−it , µsi ),
(7)
which shows that sit is a function of the study amount of one’s peers. Note that student
best response functions depend on their study type µsi ; when notationally convenient, we
express this dependency by indexing the best response function by student and suppressing
the study type, i.e. ψi (s−it ). Own effort is increasing linearly in the productivity of own
effort β2 , and may also increase in own study type, µsi , depending on θ2 and θ4 . We restrict
parameters to ensure that own effort is an increasing and (weakly) concave function of peers’
effort, based on both the descriptive relationship in Table 4 and theory.
The above best response function has the desirable property of being weakly concave in
friend study time, a property that would result from any cost function possessing the natural
properties of being strictly convex in sit and weakly concave in s−it (i.e., τs ≤ 1).10 As shown
in Section 3.1, concave best response functions ensure existence of a unique equilibrium for
the study time game. As shown in equations (6) and (7), the separable form we adopted
for the cost function has the benefit of producing a tractable closed-form solution for the
student best response function.
3.1.1
Equilibrium
Definition 1 (Period Nash equilibrium). A pure strategy Nash equilibrium in study times
S ∗ = [s∗1 , s∗2 , · · · , s∗N ]0 satisfies s∗i = ψ(s∗−i , µsi ), for i ∈ N , given adjacency matrix A.
To provide some intuition for how important equilibrium effects are, we first work through
a simple example. Next, we prove existence of a unique equilibrium for a general network A.
Example 1. Consider a simple network of three friends (A, B, C), illustrated in the left side
of Figure 3, at an initial equilibrium S0∗ = (s∗A0 , s∗B0 , s∗C0 ). To consider the effect of an exogenous (fixed) increase to the study time of student A (the top red node) on the achievement
10
See Appendix A.3 for details.
20
Figure 3: Partial versus general equilibrium effects
2.0
ψB
ψC
ψB
ψC
when
when
when
when
sA
sA
sA
sA
= s∗A0
= s∗A0
increased
increased
0.8
1.0
1.2
sB
1.4
1.6
1.8
A
C
0.6
B
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
sC
of B, differentiate yB with respect to sA :
∂yB
∂sB ∂s−B
∂sB ∂( s2A + s2C )
= β2
= β2
= β2 ψB0 (s−B )
∂sA
∂s−B ∂sA
∂s−B
∂sA
1 ψC0 (s−C )
+
2
2
.
To calculate the partial equilibrium effect, we can use the estimated policy function and
technology to calculate how yB changes due to a change in sA , fixing sC = s∗C0 . However,
if ψC0 (s−C ) is positive, which we estimate to be the case, this will understate the effect of
increasing sA . The right side of Figure 3 demonstrates the difference between partial and
general equilibrium effects. First, consider the initial equilibrium characterized by the intersection of best responses of B and C, given sA = s∗A0 (black dot). The increase in sA shifts
ψB vertically for every sC . The partial equilibrium effect ignores the change in ψC (empty red
square). The general equilibrium effect takes into account the change in ψC (blue triangle),
and is therefore larger.
Claim 1. Let k be the number of hours during the entire time period. There exists a unique
pure strategy Nash equilibrium if ψi : RN 7→ R are strictly concave and weakly increasing,
ψi (0) > 0, and ψi (k) < k for i ∈ N .
21
Proof. Define S = [0, k]N , i.e. a compact and convex set. Define a function Ψ:
ψ (x ) − x1
1 −1
ψ2 (x−2 ) − x2
Ψ : S 7→ S =
..
.
ψN (x−N ) − xN
.
Existence: By Brouwer’s Fixed Point Theorem, Ψ is a continuous self map on the
compact set S, so an equilibrium exists.
Uniqueness: Note that Ψ is also strictly concave and because ψi (k) < k, we have
Ψ(K) < 0, where K is an vector of length N whose elements are all k. Say S ∗ solves
Ψ(S ∗ ) = 0. Take S̃ < S ∗ (i.e. no element larger, and at least one element strictly less). Due
to the concavity of Ψ we also have that Ψ is quasiconcave, which means it has convex upper
contour sets. This, combined with Ψ(0) > 0, implies that Ψ(S̃) > 0. Analogously, S̃˜ > S ∗
cannot solve Ψ.
4
Estimation
Our model provides a mapping from the adjacency matrix At and all the students’ (µsi , µyi )
to a unique equilibrium in study times for all students, St∗ . The equilibrium study times
St∗ , generate achievement in equilibrium yit∗ via the technology y(si , µyi ). The model is
operationalized by parameterizing a student’s types µsi and µyi as linear combinations of
observable characteristics collected in a vector xi . This vector includes indicators for being
Black and being male, along with high school GPA, combined ACT score, average hours per
week of study time in high school, and expected hours of study time per week in college. Thus
we take µsi = x0i ωs , and µyi = x0i ωy . This allows us to express each student’s equilibrium
study time and achievement as a function of At and all students characteristics which we
collect in a matrix X. Given the full set of parameters Γ = (β1 , β2 , θ1 , θ2 , θ3 , θ4 , ωs , ωy )0 , we
22
write these outcomes for individual i as
s∗it = ψ(s∗−it , µsi ) = δs (At , X; Γ)
and
yit∗ = y(s∗it , µyi ) = δy (At , X; Γ),
where s∗−it is defined by applying equation (1) to St and At . We estimate Γ by fitting these
functions to our data on achievement and study time. Our measure of achievement is the
student’s semester-level grade point average (GPA) measured on a four-point scale, denoted
ỹit . In our data there are approximately 7% of students who have a GPA of four and 1% who
have a GPA of zero. Therefore we take a Tobit model approach to fitting GPA. We define
latent GPA as yit∗ + νit where νit is a Gaussian measurement error, IID and independent from
A and X. Thus our Tobit model with censoring at zero and four becomes:
4
ỹit = 0
∗
yit + νit
if yit∗ + νit ≥ 4
if yit∗ + νit ≤ 0
otherwise
The GPA component of the likelihood function for individual i at time t is simply the
likelihood for this Tobit model, with censoring at zero and four:
Lyit
=Φ
0 − δy (At , X; Γ)
σν
1{ỹit =0} 1{ỹit =4}
4 − δy (At , X; Γ)
1
ỹit − δy (At , X; Γ)
× 1−Φ
× φ
.
σν
σν
σν
The likelihood function also includes study time information. Our empirical measures for
s∗it are up to four within-semester reports of study time for each student in semester t. Each
study time observation is a report of time the student spent studying in a 24 hour period.
Study time report r for student i in semester t is denoted s̃rit . We use Rit to denote the
set of reports for student i in semester t. We treat each of these study time observations
as being a noisy measure of s∗it . Approximately 5% of our study time observations are zero,
23
therefore we use a Tobit model approach analogous to that for GPA. Defining latent study
time s∗it + rit
s̃rit =
0
if s∗it + rit ≤ 0
s∗ + rit
it
otherwise
The contribution for student i, report r in semester t is:
Lsrit
=Φ
0 − δs (At , X; Γ)
σ
1{s̃rit =0}
1
× φ
σ
s̃rit − δs (At , X; Γ)
σ
.
We treat the ν and measurement errors as jointly independent across students, semesters,
and study time reports. The likelihood contribution for student i is therefore:
Li = (Πt Πr∈Rit Lsrit ) × (Πt Lyit ) .
Standard errors are calculated by computing the outer product of the score.
5
Estimation Results
Table 5 contains parameter estimates. The top panel contains the technology parameters that
enter the human capital production function. The key technology parameter is the marginal
product of own study effort on achievement, β2 . The point estimate of 0.254 implies that
increasing own study time by one hour per week increases achievement by about a quarter
of a GPA point, ceteris paribus.11 The human capital type in the model is a linear function
of observable characteristics which takes into account that individuals with different values
of high school GPA, combined ACT score, black/nonblack and male indicators, high school
study time, and expected study time (anticipated college study effort at time of entrance)
may accumulate different amounts of human capital (observed as receiving different GPAs),
conditional on the amount that they study. Table 5 shows that students with high GPAs in
11
We also estimated a specification of the model where friend study time directly entered the technology.
We exclude it here because we found the coefficient to be close to zero, and the quantitative findings largely
unchanged.
24
high school and high ACT scores accumulate significantly more human capital, and Black
students accumulate significantly less human capital.
Estimates of parameters in the study cost function appear in the second panel. It is
perhaps easiest to interpret the study cost parameters by substituting them into the best
response function, equation (7):
b −it , µsi ) = 0.907 − 0.096b
ψ(s
γ (µsi ) + 1.328s−it − 0.874b
γ (µsi )s−it .
(8)
The first two terms in equation (8) represent the intercept of the best response function, that
is how much a student would study even if her friends did not study at all. The first term,
−θ1 =0.907, is the component of the intercept that is common across students. The estimate
of θ4 is close to zero (0.096). The final two terms in equation (8) reveal how a student’s
study effort responds to the study effort of her friends. Based on our estimates, we set τs to
be 1, because it was indistinguishable from 1 in the baseline model, meaning that peer study
effort enters in a linear manner.12 The estimated values of τµ,1 = 0.105 and τµ,2 = −0.003
indicate that γ() is decreasing and convex is study type, because γ(µsi ) =
1
.
exp(τµ,1 µsi +τµ,2 µ2si )
We can get a rough sense of the shape of best response functions by looking at the
estimated distribution of γ(µs ). Figure 6 plots the histogram of γ̂(µˆs ), which ranges between
0 and 1.3. The estimated effect of (β2 − θ1 ), 1.328, is quite large which suggests that the
effect of peers’ study effort on own study effort will tend to be substantial. The coefficient on
the study-type-specific component pertaining to changes in friend study time is θ2 = 0.874.
Combining these last two terms in (8), we can see that the best response is indeed upwards
sloping for all study types.
Getting a sense of how student best response functions look requires us to combine the
common and type-specific components of equation (8). To provide a sense of the total
effect of peer study effort in the best response functions, Figure 5 plots best response functions for the lowest (dotted purple line), 25th percentile (solid red line), median (dot-dashed
12
We estimate τs in the robustness checks, and find it to be less than one under one specification of the
adjacency matrix. The results are very similar in this and other specifications of the adjacency matrix.
25
Figure 4: Histogram of student heterogeneity term in best response function, γ̂(µˆs )
count
15
10
5
0
0.4
0.6
µ̂s
0.8
1.0
orange line), and 75th percentile study types (solid blue line). The table just below calculates equation (8) for each study type, presenting the type-specific intercept (top row)
and coefficient on friend study time (bottom row). As mentioned before, there is little
heterogeneity in best response function intercepts, and responsiveness to peer study time
is increasing in study type. Further, we note that, even for the lowest study type person
(dotted purple line), the effect of peer study effort is positive. All together, one’s optimal
study choice is increasing in study type,13 and almost all of the heterogeneity across students
is driven by differential responsiveness to friend study time, the last term in equation (8).
13
The exception is for extremely high study types (above the 99th percentile), where γ 0 is positive due to
the concavity from τµ,2 .
26
Figure 5: Estimated study best response functions for different study types µs
4
3
2
0
1
own study effort, ψ(s−it , µsi )
5
6
lowest study type
25th pctile study type
median study type
75th pctile study type
identity function
0
1
2
3
4
5
6
friend study effort, s−it
Study type µ
bs :
Lowest
25th pctile
Median
75th pctile
Intercept
0.81
0.83
0.84
0.85
s
Coefficient on sτ−it
0.46
0.66
0.74
0.83
The final set of cost parameters shows the determinants of study type. Study type is
increasing in high school GPA and high school study time, but is smaller for males and
students with high combined ACT scores, though the latter effect is imprecisely estimated.
Though black students study considerably more than nonblack students, the coefficient on
being black is negative. This is explained by the fact that black students have much higher
high school study levels, which are an important determinant of study type.
Figure 6 is a scatterplot of estimated study type (x-axis) versus estimated human capital
type (y-axis), and shows that they are essentially unrelated – high human capital types are
not also more likely to be high study types.
In addition to describing individual heterogeneity in best response functions, Figure 5
provides evidence about the implications of this heterogeneity. To see this, note that the
intersection of each best response function with the identity function indicates the equilibrium study outcome were that person friends with someone of their study type. Therefore,
by comparing where the different types’ best response functions intersect the identity func27
Figure 6: Estimated human capital and study types
3.0
µ̂y
2.5
2.0
1.5
5
10
estimated study type, µ̂s
tion (dashed black line), we can identify equilibria when each student is matched with her
respective study type. When two 75th percentile study types are paired they would study
almost 5 hours each, almost twice the amount two 25th percentile study types would study
when paired together. The estimated study time policy functions indicate that the game
is exhibits a complementarity: Students study more, and therefore produce more human
capital in total, when matched by type. That being said, whether or not students will take
advantage of this complementarity depends on how they sort into friendships.
Figure 7 shows that the model closely fits mean observed study time (top panel) and GPA
(bottom panel), both overall and by student characteristics.Even though it is not explicitly
targeted, the model also closely captures the relationship between own and friend study time.
Figure 8 plots own versus friend study time, for both the observed data (solid red line) and
simulated outcomes (dashed blue line).14
14
Both observed and simulated data are smoothed according to loess, or local polynomial regression fitting.
See stat smooth in ggplot2 for details (Wickham (2009)).
28
Table 5: Parameter estimates
Parameter
Technology
β1
β2
ωy,HS GPA
ωy,ACT
ωy,Black
ωy,Male
ωy,HS study
ωy,expected study
Estimate
SE
Description
-0.350 0.363 intercept
0.254 0.057 marginal product of own study time
0.470 0.072 coefficient on HS GPA on human capital type
0.047 0.010 coefficient on ACT on human capital type
-0.213 0.089 coefficient on Black on human capital type
-0.037 0.073 coefficient on Male on human capital type
-0.007 0.004 coefficient on HS study on human capital type
-0.005 0.003 coefficient on expected study on human capital type
Study cost function
θ1
-1.074
θ2
0.874
θ3
-0.907
θ4
0.096
τs
1
τµ,1
0.105
τµ,2
-0.003
ωs,HS GPA
1
ωs,ACT
-0.063
ωs,Black
-0.735
ωs,Male
-1.065
ωs,HS study
0.344
ωs,expected study
0.005
0.088
0.146
0.524
0.828
–
0.035
0.002
–
0.054
0.437
0.474
0.096
0.018
study cost terms
study cost terms
study cost terms
study cost terms
curvature on friend study time, fixed to 1
linear term for study type
quadratic term for study type
coefficient on HS GPA on study type, fixed to 1
coefficient on ACT on study type
coefficient on Black on study type
coefficient on Male on study type
coefficient on HS study on study type
coefficient on expected study on study type
Shocks
σ
ση
0.017
0.025
sd measurement error for human capital
sd measurement error for observed study time
0.721
2.159
29
Figure 7: Fit of mean study time (left) and GPA (right)
3.0
GPA
study time (hours/day)
3.5
2.5
3.0
2.0
black
female
male
obs
low hsgpa
high hsgpa
total
nonblack
black
sim
female
male
obs
low hsgpa
sim
Figure 8: Fit of study time against friend study time
7.5
own study time (hours/day)
nonblack
5.0
2.5
0.0
2
4
friend study time (hours/day)
obs
30
sim
6
high hsgpa
total
5.1
Endogeneity and Identification
Our primary endogeneity concerns arise from the potential for the relationship between a
student’s study effort and that of her peers to be due, in part, to individuals’ friendships
being related to unobservable propensities to study. We have mitigated these concerns by
taking advantage of our survey collection to obtain direct measures of students’ propensities
to study. Our baseline survey elicited information about: 1) how much a student expected to
study in college and 2) how much a student studied in high school. These measures of one’s
propensity to study clearly have content; each of these variables is strongly correlated with
how much a student studies in college and comparing columns 1 and 2 of Table 6 shows that
adding these variables doubles the R2 in a regression of one’s own study effort on observable
characteristics and friend study time. We stress that a crucial feature of our two variables
(made possible by the flexibility in the timing of our survey collection) is that they describe
a student’s propensity to study at the time of entrance, immediately before students are
influenced by their friendships at Berea. Observing students’ choices and outcomes during
their freshman year at a liberal arts college (where the large majority of curriculum choices
for freshman are required general or introductory classes) is also important for mitigating
the concern that the correlation between own and peer study time is present simply because
friendships tend to arise between those sharing course specializations with similar workloads.
5.2
Quantitative Findings
We conduct two simulations to quantify the role of social interactions, viewed through the
mechanism of changes in study time, in the production of achievement. The goal of our first
simulation is to provide new evidence about the importance of general equilibrium effects in
the determination of achievement, by showing how general equilibrium interactions increase
estimates of peer effects above and beyond those estimated in individual student policy
functions. To do this, we exogenously increase the study time of a particular student and
measure how study times and achievement change for other students in the network, with
and without taking into account general equilibrium effects. We then examine how shocking
31
Table 6: Study regressions controlling for different sets of characteristics, pooled over both
semesters
Dependent variable:
Own study
(1)
(2)
(3)
Male
−0.369∗∗∗
(0.136)
−0.328∗∗
(0.140)
−0.391∗∗∗
(0.135)
Black
0.116
(0.186)
0.333∗
(0.192)
0.324∗
(0.172)
HS GPA
0.413∗∗∗
(0.149)
0.392∗∗
(0.156)
ACT
−0.032
(0.021)
−0.029
(0.022)
HS study
0.043∗∗∗
(0.006)
Expected study
−0.002
(0.006)
Friends study
0.166∗∗∗
(0.037)
0.198∗∗∗
(0.039)
0.202∗∗∗
(0.039)
0.228∗∗∗
(0.038)
Constant
1.915∗∗∗
(0.671)
2.167∗∗∗
(0.679)
2.850∗∗∗
(0.172)
2.648∗∗∗
(0.152)
574
0.169
0.159
574
0.087
0.079
574
0.076
0.071
574
0.058
0.056
Observations
R2
Adjusted R2
∗
Note:
32
p<0.1;
∗∗
p<0.05;
(4)
∗∗∗
p<0.01
different students has different effects on achievement by repeating this exercise for each
student in the network.
The second simulation builds on the first, by examining not only how equilibrium effects
can augment social interactions, but also how sorting into friendships affects student achievement. To do this we compare achievement in the baseline model with that when friends are
instead unconditionally randomly assigned. By taking into account both responsiveness and
sorting, we can provide some of the first evidence about the total importance of peers in
determining achievement.
Traditionally, obtaining this type of evidence has been difficult. First, general equilibrium effects may depend on the entire social network, which is typically not available to
researchers. Second, studies that go beyond showing how individual achievement depends
on peers typically do not study a particular mechanism. To address this, we specifically
surveyed students about a very plausible mechanism for through which social interactions
might manifest – study time. Our framework exploits these strengths of our data, naturally
embedding social interactions operating through an intuitive mechanism in an equilibrium
environment.
Throughout this section, we compare outcomes (achievement, own study time, and friend
study time) between baseline and counterfactual scenarios. To fix ideas, define the treatbaseline
ment effect on achievement for student i as ∆yi ≡ y(scf
, µyi ), where scf
i , µyi ) − y(si
i and
are student i’s equilibrium study time in the counterfactual and baseline scenarios,
sbaseline
i
respectively. Treatment effects for own and friend study time are defined analogously.
5.2.1
Partial versus general equilibrium effects
Though, in theory, general equilibrium effects will be positive because study time is increasing
for all students in friend study time, as demonstrated when we discuss parameter estimates
in Section 5, the extent to which partial and general equilibrium effects differ is an empirical
question. To provide quantitative evidence about the difference, we exogenously increase a
student’s study time by one standard deviation and measure how the rest of the network
33
responds under both partial equilibrium and general equilibrium scenarios. The partial
equilibrium effect takes into account only how the exogenous shock to one person influences
others who are directly linked to him. Thus, in the context of Figure 3, computing this
effect involves only the movement from the black dot to the square. In contrast, the general
equilibrium effect takes into account the full set of feedback effects in the network. Thus,
computing this effect involves iterating student policy functions until a new equilibrium is
reached at the triangle. We perform this exercise of shocking the study effort of a single
student and examining the effects on the network 307 times, i.e. once for each student. For
both the partial equilibrium (Row 1) and general equilibrium (Row 2) scenarios, Table 7
shows the average change in achievement (measured in GPA standard deviation units) in
the network over the 307 exercises, stratified by the distance of individuals from the shocked
nodes.
Table 7: Avg. change in achievement (GPA points), by distance from shocked node
Dist. from shocked node
0
1
2
3
4
Partial equilibrium
0.254 0.059 0.000 0.000 0.000
General equilibrium
0.254 0.078 0.022 0.006 0.002
The first column of Table 7 shows that, on average, the effect of increasing a person’s
study effort by one hour per day increases her own GPA by 0.254 points, which follows
directly from the estimate of β2 = 0.254. The second column shows the average effect of
the shock on students who are one link away is 1/3 larger for the general equilibrium case
than the partial equilibrium case (0.078 versus 0.059 GPA points). The entries in the third
and fourth columns reflect the reality that individuals who are two or more links away are
not affected by the shock under the partial equilibrium scenario but are influenced by the
shock under the general equilibrium scenario. Figure 9 shows box plots of the data used
to construct Table 7.15 The plot shows that there is substantial heterogeneity in how much
individuals are influenced by the shock both between which student was shocked and distance
15
We used the box plot function in the R package ggplot2 (R Core Team (2014), Wickham (2009)). The
middle line of each box plot is the median, while the top and bottom bars of the box are the 75th and 25th
percentiles, respectively. The top whisker extends to the 98th percentile, while the bottom whisker extends
to the 2nd percentile.
34
Figure 9: Average change in achievement from shocking each student’s study time
average increase in GPA
0.2
0.1
0.0
0
1
2
3
4
distance from shocked node
equilibrium partial
general
of students from the shocked student, indicated on the x-axis. For example, under general
equilibrium, the first quartile, median, and third quartiles for student who are two links away
are 0.015, 0.019, and 0.026 GPA points.
In addition to describing how increasing the study time of different students affects their
neighbors, we can also aggregate these changes and present the total gain in achievement
resulting from shocking different students. Figure 10 shows the total gain in achievement
h hP
ii
16
(Et Ej
∆
Computing the
it ), for partial (left) and general equilibrium (right).
i6=j
change in achievement using only partial equilibrium would have increased achievement by
0.19 GPA points. On average, total achievement increases by 0.52 GPA points when general
equilibrium effects are taken into account. Therefore, considering only partial equilibrium
effects would on average understate the gain in achievement by 64%. To get a better sense
of what drives the heterogeneity in achievement gains by which node is shocked, the left
panel of Figure 11 shows the relationship between the centrality of the shocked node and the
total gain in GPA, taking into account general equilibrium effects.17 Again, this excludes
16
Note that this excludes the mechanical gain in achievement experienced by the shocked node.
We use closeness as our measure of centrality. “closeness” is defined as the reciprocal of the sum of
shortest distances between that node and every other node in the graph. This distance is equal to the
number of nodes for unconnected nodes (Csardi and Nepusz (2006), Freeman (1979)).
17
35
Figure 10: Partial versus general equilibrium effects
total increase in GPA
0.9
0.6
0.3
0.0
partial
general
equilibrium
the mechanical gain in achievement experienced by the shocked node. Each point records
the total gain in GPA in the economy (y-axis) by the percentile centrality, or how central is,
the shocked node (x-axis). The size each point shows the degree of the shocked node. Larger
dots are concentrated at the top-right, and smaller ones at the bottom-left, i.e. students with
higher numbers of friends tend to higher centrality indices. Intuitively, because the effects of
effort changes are stronger the closer students are, the total gain is higher when the shocked
node is more centrally located. The right panel of Figure 11 adds partial equilibrium effects
(red diamonds). We can see here that, though shocked nodes have the same degree (point
sizes), the average gain not increasingly as strongly increasing in centrality of the shocked
node. This is because the general equilibrium effects play a larger role the more densely
connected the shocked node is to the rest of the network.
5.2.2
Randomly assigning friends
Peer effects manifest through a combination of how responsive students are to other students’ input choices (i.e., slopes of best response functions) and who is friends with whom
(i.e., sorting into friendships). Therefore, to understand how important peer effects are, we
36
Figure 11: Total increase in achievement, by centrality of shocked node
1.0
total increase in GPA
total increase in GPA
1.0
0.5
0.0
0.5
0.0
0.00
0.25
0.50
0.75
1.00
0.00
centrality percentile (1 means most central)
equilibrium general
deg. shocked node 2.5
0.25
0.50
equilibrium partial
5.0
7.5
0.75
1.00
centrality percentile (1 means most central)
10.0
general
deg. shocked node 2.5
5.0
7.5
10.0
compare achievement simulated under the baseline social network used to estimate model
parameters with achievement simulated under a scenario where friends are randomly assigned.18 The first column of Table 8 shows that substantial correlations between one’s own
characteristics and the average characteristics of her friends exist in the baseline simulation.
The second column shows the correlation between one’s own characteristics and the average
characteristics of her friends under the random assignment simulation. As expected, with
correlations being present only because of sampling variation, the correlations in this column
are all close to zero.
Table 9 summarizes the changes in model outcomes between the baseline and random
assignment simulations; achievement is measured in GPA points and study times are in
hours per day. Table 10 presents the same results, expressed in standard deviations of the
model outcome variables in the baseline. The first column shows the effect that moving to
random assignment has on one’s average study time. The first row shows that, on average,
moving to random assignment would reduce own study time by 0.08 hours, or 0.10 standard
deviations. Intuitively, students who, in reality, have friends with strong propensities to
18
We report results from one randomization. In principle, we could also present results from a large
number of randomly drawn adjacency matrices.
37
Table 8: Characteristics of baseline and simulated social networks
Avg. number of friends
Baseline Random friends
3.309
3.42
Correlations bt. own and avg. friends
Black
0.74
Male
0.71
HS GPA
0.23
ACT
0.31
HS study
0.23
Expected study
0.14
0.04
0.09
-0.04
0.06
0.04
-0.01
study are most harmed by a movement to random assignment because this makes them
much more likely to have lower study type friends. This explains why females, blacks, and
students with above-median high school GPAs, who are all of high study type and are seen
in Table 8 to often have friends of the same type, see own study time fall by 0.17, 0.20,
and 0.15 GPA points, respectively. Conversely, males and below median HS GPA students,
who had less studious peers under the baseline, tend to experience increases in outcomes
from random assignment. However, importantly, the estimated complementaries apparent
in the study time game shown in Figure 5 implies that the gain to the lower study types
are smaller than the losses to the high study types, with this explaining the overall decrease
in own study time under random assignment. That is, randomly assigning friends does not
merely re-allocate output, but also produces less output in total.
A similar story drives the overall results and the stratified results associated with friend
study time in the second column. The third column shows the changes in achievement
that result from the changes in study effort. The first row shows that, on average, moving
to random assignment would reduce achievement by 0.02 GPA points, or 0.04 standard
deviations. However, as expected given the findings of study effort, the declines are largest
for black students, females, and students with above median high school GPAs. As before,
the losses to these groups are not offset by the gains to other groups. Random assignment
would increase the baseline GPA gap between nonblack and black students of 0.5 GPA points
by 10%, reduce the baseline GPA gap between females and males of 0.31 points by 20%, and
38
reduce the baseline GPA gap between students with above- and below-median high school
GPAs of 0.60 GPA points by 7%.
Table 9: Change in outcomes from randomly assigning friends, study time (hours) and
achievement (GPA points)
Total
Own study time Friend study time Achievement
-0.08
-0.08
-0.02
Nonblack
Black
-0.06
-0.20
-0.03
-0.32
-0.01
-0.06
Female
Male
-0.17
0.05
-0.23
0.12
-0.04
0.02
Below med. HS GPA
Above med. HS GPA
0.01
-0.15
0.05
-0.20
0.00
-0.04
Table 10: Change in outcomes from randomly assigning friends, in sd of outcome variable
Total
Own study time Friend study time Achievement
-0.04
-0.05
-0.02
Nonblack
Black
-0.03
-0.09
-0.02
-0.20
-0.01
-0.08
Female
Male
-0.08
0.02
-0.14
0.08
-0.06
0.02
Below med. HS GPA
Above med. HS GPA
0.00
-0.07
0.03
-0.13
0.01
-0.05
The results from Table 9 present outcomes in terms of GPA at Berea College. To put
the above numbers in broader context, we take advantage of our unique longitudinal data
to map changes in achievement to two outcomes of substantial policy interest: graduation
rates and future real hourly wages.19 For each comparison, we map GPA during the first
year into each outcome.
First, we relate achievement in the first year to the probability of graduating from university.20 Table 11 shows the change in predicted graduation rates, by student characteristics.
19
Sacerdote (2001)’s study of randomly assigned roommates at Dartmouth College finds that increasing
roommate outcome GPA by one standard deviation is associated with a 0.05 point increase in own GPA.
However, it is not clear how to directly compare this with our findings.
20
Details of these calculations are in Appendix A.4. In Kentucky is 2005, 4-year graduation rates
39
The share of students graduating is 0.746, and would fall slightly (less than half a percentTable 11: Change in predicted graduation rate
Observed
0.746
Random friends
0.742
Difference
-0.004
Nonblack
Black
0.758
0.691
0.758
0.668
-0.000
-0.023
Female
Male
0.844
0.619
0.830
0.628
-0.014
0.008
Below med. HS GPA
Above med. HS GPA
0.599
0.890
0.601
0.880
0.002
-0.010
Total
age point) under the random assignment of friends. The share of black students graduating
would decrease from 69.1% to 66.8%, or 2.3 percentage points. The share of nonblack students graduating would essentially remain the same, at 75.8%. The share of women graduating would drop by 1.4 percentage points, from 84.4% to 83.0% The share of men graduating
from increase from 61.9% to 62.8%, an increase of almost a percentage point. Students with
below-median high school GPAs would graduate at slightly higher rates, while those with
above-median high school GPAs would decrease by a percentage point, from 89% to 88%. To
provide a point of comparison, Belley and Lochner (2007) find that moving from the lowest
to highest income quartile would increase college graduation rates by 10 percentage points.
Second, we provide a back-of-the-envelope calculation relating the average of first and
second semester GPAs with future wages, taking advantage of our ongoing unique longitudinal study.21 The mean real average hourly wage is $16.15. The 11.6% change in mean
wages corresponds to a drop of $0.11 an hour for black students, $0.08 an hour for female
students, and $0.08 an hour for students with above-median high school GPAs. The hourly
wage would fall by $0.02 for nonblack students, with negligible effects for males and students
with below-median high school GPAs. Overall, homogenizing friends would cause a drop of
$0.03 an hour, or $70 per year.
were 19.9% and 6-year graduation rates were 45.5% (http://collegecompletion.chronicle.com/state/
#state=ky§or=public_four).
21
Details of these calculations are in Appendix A.5.
40
Figure 12: Change in GPA, by percentile of estimated study type µ̂s
By race
0.2
change in GPA
change in GPA
0.2
0.0
0.0
-0.2
-0.2
-0.4
0.00
-0.4
0.25
0.50
0.75
1.00
percentile of estimated study type
0.00
0.25
0.50
0.75
race
1.00
nonblack
black
percentile of estimated study type
By HS GPA
0.2
0.2
0.0
0.0
change in GPA
change in GPA
By sex
-0.2
-0.2
-0.4
-0.4
0.00
0.25
0.50
0.75
1.00
percentile of estimated study type
sex
female
0.00
0.25
0.50
0.75
percentile of estimated study type
male
HS GPA level low hsgpa high hsgpa
41
1.00
Figure 12 shows how the effect of randomly assigning friends varies by study type µs .
The top-left figure shows the relationship between change in GPA and percentile study type.
The top-right splits this relationship by race, while the bottom-right and bottom-left split by
high school GPA and sex, respectively. Not only is the effect of randomly assigning friends
decreasing in estimated study type, but the effect is larger for black and female students,
who tend to strongly sort into friendships based on these characteristics, and therefore lose
more from random assignment due to the lack of positive spillovers through equilibrium
interactions with high study types.
Finally, Figure 13 shows how the variance of outcomes changes for different groups of
students. The top left figure plots the CDF for the change in own study time, by race and
sex. About three quarters of black women reduce their study time when friends are randomly
assigned, while about 63% of nonblack women would reduce their study time when friends
were randomly assigned. The median black male would not experience a change in own study
time when friends were randomly assigned, while most nonblack males would study more.
The top right figure plots the CDF for the change in friend study time, and resemble the
change in own study time, though with more dispersion, due to the fact that best responses
functions have slopes less than 1. The bottom figure plots the CDF for the change in
achievement, and follows from the top figure. Though mean outcomes for different groups
of students are different, there is substantial heterogeneity within student characteristic as
well. Standard deviations of own study time, friend study time, and achievement fall by
29%, 45%, and 5%, respectively.
6
Conclusion
TBD
42
Figure 13: Changes in own study time (top left), friend study time (top right), and achievement (bottom), by characteristic
1.00
1.00
0.75
0.75
0.50
0.50
0.25
0.25
0.00
0.00
-3
-2
-1
0
change in own study time
race
sex
nonblack
female
1
-4
-2
0
change in friend study time
black
race
male
sex
1.00
0.75
0.50
0.25
0.00
-0.5
0.0
change in achievement
race
sex
nonblack
female
43
black
male
nonblack
female
black
male
2
A
Data
A.1
Survey questions
A.2
Testing for interaction effects in study time, GPA regressions
44
Figure 14: Time diary question
45
Figure 15: Friends question
46
Table 12 estimates the study time regression in Table 4, allowing for interactions between
own characteristics and friend study time (columns 2-4). Other than males, who are less
responsive to increases in friend study time than other students, there are no significant
differences between groups.
Table 12: Pooled over both semesters, interactions
Dependent variable:
sem study
(1)
(2)
(3)
(4)
male
−0.154
(0.137)
−0.153
(0.137)
0.433
(0.293)
−0.154
(0.137)
black
0.414∗∗
(0.177)
0.553
(0.394)
0.411∗∗
(0.177)
0.414∗∗
(0.178)
hsgpa
0.325∗∗
(0.147)
0.330∗∗
(0.148)
0.334∗∗
(0.147)
0.328
(0.294)
0.181∗∗∗
(0.044)
0.189∗∗∗
(0.048)
0.267∗∗∗
(0.058)
0.184
(0.292)
sem friends sem study
−0.043
(0.109)
I(black ∗sem friends sem study)
−0.198∗∗
(0.087)
I(male ∗sem friends sem study)
−0.001
(0.085)
I(hsgpa ∗sem friends sem study)
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
1.835∗∗∗
(0.542)
1.794∗∗∗
(0.552)
1.523∗∗∗
(0.558)
1.826∗
(1.018)
646
0.052
0.046
1.627 (df = 641)
8.726∗∗∗ (df = 4; 641)
646
0.052
0.044
1.628 (df = 640)
7.003∗∗∗ (df = 5; 640)
646
0.059
0.052
1.622 (df = 640)
8.053∗∗∗ (df = 5; 640)
646
0.052
0.044
1.629 (df = 640)
6.970∗∗∗ (df = 5; 640)
∗
Note:
p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01
Table 16 does the same as Table 12, but for GPA. None of the estimated interactions of
one’s own characteristics on either own or friend study time are discernible from zero at the
5% level.
Finally, Table 17 augments the regressions in Table 4 by adding in mean friend characteristics. The takeaway is that friend study time, not friend characteristics, is what correlates
with changes in own study time (column 2) and achievement (column 4).
Table 18 shows that first semester inputs may enter the production of GPA in the second
semester.
Table 19 shows that a measure of study productivity, high school GPA divided by high
school study time, does not substantially affect either the relationship between friend and
own study time (column 1) or own and friend study time and the production of GPA (column
47
Table 13: Study regressions interacted with own characteristics, pooled over both semesters
Dependent variable:
sem study
(1)
(2)
(3)
(4)
male
−0.340∗∗
(0.134)
−0.072
(0.303)
−0.337∗∗
(0.134)
−0.342∗∗
(0.134)
black
0.560
(0.399)
0.217
(0.172)
0.220
(0.172)
0.218
(0.172)
hsgpa
0.361∗∗
(0.145)
0.360∗∗
(0.144)
0.326
(0.297)
0.352∗∗
(0.144)
studyhs
0.042∗∗∗
(0.006)
0.042∗∗∗
(0.006)
0.042∗∗∗
(0.006)
0.035∗∗∗
(0.012)
sem friends study
0.183∗∗∗
(0.041)
0.192∗∗∗
(0.045)
0.143
(0.251)
0.139∗∗
(0.057)
I(black ∗sem friends study)
−0.088
(0.094)
I(male ∗sem friends study)
−0.077
(0.079)
I(hsgpa ∗sem friends study)
0.007
(0.072)
I(studyhs ∗sem friends study)
Constant
Observations
R2
Adjusted R2
Residual Std. Error (df = 567)
F Statistic (df = 6; 567)
0.002
(0.003)
1.208∗∗
(0.544)
1.167∗∗
(0.552)
1.380
(1.040)
1.392∗∗
(0.558)
574
0.167
0.158
1.497
18.919∗∗∗
574
0.167
0.158
1.497
18.932∗∗∗
574
0.166
0.157
1.498
18.744∗∗∗
574
0.166
0.157
1.498
18.828∗∗∗
∗
Note:
48
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
Table 14: Study regressions interacted with friend characteristics, pooled over both semesters
Dependent variable:
sem study
(1)
(2)
(3)
(4)
male
−0.359∗∗∗
(0.134)
−0.346∗∗
(0.134)
−0.343∗∗
(0.133)
−0.327∗∗
(0.133)
black
0.565∗∗
(0.277)
0.170
(0.184)
0.220
(0.172)
0.247
(0.173)
hsgpa
0.333∗∗
(0.145)
0.337∗∗
(0.146)
0.349∗∗
(0.144)
0.367∗∗
(0.144)
studyhs
0.041∗∗∗
(0.006)
0.041∗∗∗
(0.006)
0.034∗∗∗
(0.007)
0.047∗∗∗
(0.007)
sem friends study
0.183∗∗∗
(0.039)
0.161∗∗∗
(0.038)
0.137∗∗∗
(0.042)
0.173∗∗∗
(0.038)
I((black ∗sem friends black) ∗sem friends study)
−0.130
(0.082)
I((black - sem friends black)ˆ2 ∗sem friends study)
0.070
(0.090)
I((studyhs ∗sem friends studyhs) ∗sem friends study)
0.0001
(0.0001)
I((studyhs - sem friends studyhs)ˆ2 ∗sem friends study)
Constant
Observations
R2
Adjusted R2
Residual Std. Error (df = 567)
F Statistic (df = 6; 567)
−0.0001
(0.0001)
1.317∗∗
(0.536)
1.370∗∗
(0.546)
1.428∗∗∗
(0.543)
1.191∗∗
(0.541)
574
0.169
0.160
1.495
19.247∗∗∗
574
0.166
0.158
1.498
18.860∗∗∗
574
0.169
0.161
1.495
19.259∗∗∗
574
0.168
0.159
1.496
19.121∗∗∗
∗
Note:
49
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
Table 15: Study regressions interacted with friend characteristics, pooled over both semesters
Dependent variable:
sem study
(1)
(2)
(3)
(4)
male
−0.359∗∗∗
(0.134)
−0.114
(0.234)
−0.328∗∗
(0.134)
−0.343∗∗
(0.133)
black
0.565∗∗
(0.277)
0.226
(0.172)
0.241
(0.174)
0.220
(0.172)
hsgpa
0.333∗∗
(0.145)
0.368∗∗
(0.145)
0.222
(0.224)
0.349∗∗
(0.144)
studyhs
0.041∗∗∗
(0.006)
0.042∗∗∗
(0.006)
0.042∗∗∗
(0.006)
0.034∗∗∗
(0.007)
sem friends study
0.183∗∗∗
(0.039)
0.185∗∗∗
(0.040)
0.046
(0.164)
0.137∗∗∗
(0.042)
I((black ∗sem friends black) ∗sem friends study)
−0.130
(0.082)
I((male ∗sem friends male) ∗sem friends study)
−0.088
(0.076)
I((hsgpa ∗sem friends hsgpa) ∗sem friends study)
0.010
(0.013)
I((studyhs ∗sem friends studyhs) ∗sem friends study)
Constant
Observations
R2
Adjusted R2
Residual Std. Error (df = 567)
F Statistic (df = 6; 567)
0.0001
(0.0001)
1.317∗∗
(0.536)
1.166∗∗
(0.548)
1.741∗∗
(0.798)
1.428∗∗∗
(0.543)
574
0.169
0.160
1.495
19.247∗∗∗
574
0.167
0.159
1.497
19.011∗∗∗
574
0.166
0.158
1.498
18.857∗∗∗
574
0.169
0.161
1.495
19.259∗∗∗
∗
Note:
50
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
Table 16: Pooled over both semesters, interactions
Dependent variable:
gpa
(1)
(2)
(3)
(4)
male
−0.131∗∗
(0.061)
−0.133∗∗
(0.061)
−0.052
(0.168)
−0.118∗
(0.061)
black
−0.337∗∗∗
(0.079)
−0.570∗∗
(0.244)
−0.334∗∗∗
(0.079)
−0.328∗∗∗
(0.079)
hsgpa
0.567∗∗∗
(0.065)
0.560∗∗∗
(0.066)
0.566∗∗∗
(0.066)
0.987∗∗∗
(0.175)
sem study
0.064∗∗∗
(0.017)
0.066∗∗∗
(0.019)
0.048∗
(0.026)
0.264∗
(0.136)
I(black ∗sem study)
−0.012
(0.050)
I(male ∗sem study)
0.026
(0.035)
−0.058
(0.039)
I(hsgpa ∗sem study)
sem friends sem study
0.054∗∗∗
(0.020)
0.039∗
(0.022)
0.081∗∗∗
(0.027)
0.297∗∗
(0.133)
0.084∗
(0.051)
I(black ∗sem friends sem study)
−0.057
(0.040)
I(male ∗sem friends sem study)
−0.071∗
(0.039)
I(hsgpa ∗sem friends sem study)
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
0.693∗∗∗
(0.242)
0.756∗∗∗
(0.249)
0.664∗∗
(0.258)
−0.743
(0.605)
636
0.228
0.222
0.714 (df = 630)
∗∗∗
37.299
(df = 5; 630)
636
0.232
0.223
0.714 (df = 628)
∗∗∗
27.072
(df = 7; 628)
636
0.231
0.223
0.714 (df = 628)
∗∗∗
26.981
(df = 7; 628)
636
0.237
0.228
0.712 (df = 628)
∗∗∗
27.849
(df = 7; 628)
∗
Note:
51
p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01
Table 17: Pooled over both semesters
Dependent variable:
sem study
(1)
gpa
(2)
(3)
(4)
male
−0.154
(0.137)
0.002
(0.198)
−0.131
(0.061)
−0.178∗∗
(0.088)
black
0.414∗∗
(0.177)
0.357
(0.254)
−0.337∗∗∗
(0.079)
−0.253∗∗
(0.115)
hsgpa
0.325∗∗
(0.147)
0.335∗∗
(0.148)
0.567∗∗∗
(0.065)
0.521∗∗∗
(0.067)
0.064∗∗∗
(0.017)
0.076∗∗∗
(0.018)
0.054∗∗∗
(0.020)
0.049∗∗
(0.021)
sem study
sem friends sem study
0.181∗∗∗
(0.044)
0.208∗∗∗
(0.047)
∗∗
sem friends black
0.225
(0.319)
−0.105
(0.143)
sem friends male
−0.283
(0.258)
0.080
(0.115)
sem friends hsgpa
0.332
(0.238)
0.101
(0.107)
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
1.835∗∗∗
(0.542)
0.601
(0.958)
0.693∗∗∗
(0.242)
646
0.052
0.046
1.627 (df = 641)
8.726∗∗∗ (df = 4; 641)
619
0.071
0.061
1.584 (df = 611)
6.688∗∗∗ (df = 7; 611)
636
0.228
0.222
0.714 (df = 630)
37.299∗∗∗ (df = 5; 630)
0.473
(0.429)
∗
Note:
52
614
0.229
0.219
0.706 (df = 605)
22.429∗∗∗ (df = 8; 6
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0
Table 18: Evidence of human capital dynamics
Dependent variable:
gpa2
(1)
(2)
black
−0.275∗∗
(0.115)
−0.039
(0.103)
male
−0.078
(0.091)
0.009
(0.079)
hsgpa
0.544∗∗∗
(0.096)
0.282∗∗∗
(0.088)
studyhs
−0.003
(0.004)
−0.0001
(0.003)
sem2 study
0.096∗∗∗
(0.031)
0.058∗∗
(0.025)
sem2 frstudy
0.041∗∗
(0.020)
0.031∗
(0.018)
sem1 study
0.045
(0.031)
0.562∗∗∗
(0.059)
gpa1
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
0.568
(0.362)
0.021
(0.318)
271
0.265
0.246
0.680 (df = 263)
13.571∗∗∗ (df = 7; 263)
275
0.455
0.441
0.595 (df = 267)
31.834∗∗∗ (df = 7; 267)
∗
Note:
53
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
2).
Table 20 interacts own and friend study time in the GPA regression. The first column is
the GPA regression from Table 4. The second column excludes a direct effect of friend study
time, but adds an interaction between own and friend study time. This model predicts about
the same variation as the one excluding interactions. The third column adds the interaction
to the first model. Standard errors for both own and friend study times blow up because
the new interaction term is highly co-linear with own and friend study time.
Table 21 shows that excluding friend study time does not appreciably change the relationship between own study time and GPA, even controlling for friend characteristics.
54
Table 19: Study and GPA regressions, pooled over both semesters
Dependent variable:
sem study
gpa
(1)
(2)
male
−0.307∗∗
(0.138)
−0.081
(0.064)
black
0.210
(0.174)
−0.395∗∗∗
(0.080)
hsgpa
0.353∗∗
(0.148)
0.498∗∗∗
(0.069)
studyhs
0.038∗∗∗
(0.006)
−0.003
(0.003)
0.077∗∗∗
(0.022)
sem study
sem friends study
0.142∗∗∗
(0.045)
0.042∗∗
(0.021)
I(hsgpa/studyhs)
−0.186
(0.132)
0.010
(0.067)
I(hsgpa ∗sem study/studyhs)
−0.005
(0.014)
I(hsgpa ∗sem friends study/studyhs)
0.040
(0.038)
−0.006
(0.018)
Constant
1.474∗∗∗
(0.550)
0.990∗∗∗
(0.256)
546
0.162
0.151
1.494 (df = 538)
14.857∗∗∗ (df = 7; 538)
543
0.221
0.208
0.684 (df = 533)
16.822∗∗∗ (df = 9; 533)
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
∗
Note:
55
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
Table 20: GPA regression interacted with own characteristics, pooled over both semesters
Dependent variable:
gpa
(1)
(2)
∗∗
(3)
male
−0.122
(0.062)
−0.121
(0.062)
−0.121∗
(0.062)
black
−0.389∗∗∗
(0.079)
−0.385∗∗∗
(0.079)
−0.387∗∗∗
(0.080)
hsgpa
0.512∗∗∗
(0.067)
0.513∗∗∗
(0.067)
0.513∗∗∗
(0.067)
studyhs
−0.001
(0.003)
−0.001
(0.003)
−0.001
(0.003)
sem study
0.079∗∗∗
(0.019)
0.046∗
(0.026)
0.061
(0.041)
sem friends study
0.039∗∗
(0.018)
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
0.020
(0.043)
0.009∗∗
(0.004)
0.005
(0.010)
0.907∗∗∗
(0.249)
1.034∗∗∗
(0.245)
0.971∗∗∗
(0.281)
571
0.233
0.225
0.686 (df = 564)
28.561∗∗∗ (df = 6; 564)
571
0.233
0.225
0.686 (df = 564)
28.570∗∗∗ (df = 6; 564)
I(sem study ∗sem friends study)
Constant
∗∗
∗
Note:
56
571
0.233
0.224
0.687 (df = 563
24.484∗∗∗ (df = 7;
p<0.1;
∗∗
p<0.05;
∗∗∗
p<
Table 21: Study and GPA regressions with only friend characteristics, pooled over both
semesters
Dependent variable:
sem study
gpa
sem study
gpa
(1)
(2)
(3)
(4)
male
−0.304
(0.186)
−0.138
(0.087)
−0.324∗
(0.187)
−0.141
(0.087)
black
0.210
(0.252)
−0.282∗∗
(0.118)
0.195
(0.254)
−0.286∗∗
(0.118)
hsgpa
0.316∗∗
(0.145)
0.496∗∗∗
(0.068)
0.330∗∗
(0.146)
0.498∗∗∗
(0.068)
studyhs
0.037∗∗∗
(0.006)
−0.001
(0.003)
0.038∗∗∗
(0.006)
−0.001
(0.003)
0.077∗∗∗
(0.020)
sem study
0.083∗∗∗
(0.019)
sem friends study
0.127∗∗∗
(0.039)
0.040∗∗
(0.018)
sem friends black
−0.056
(0.310)
−0.136
(0.145)
0.026
(0.311)
−0.113
(0.145)
sem friends male
0.002
(0.235)
0.046
(0.109)
−0.065
(0.236)
0.025
(0.109)
sem friends hsgpa
0.251
(0.226)
0.103
(0.105)
0.315
(0.227)
0.122
(0.105)
sem friends studyhs
0.033∗∗∗
(0.009)
0.0002
(0.004)
0.041∗∗∗
(0.009)
0.002
(0.004)
Constant
0.399
(0.908)
0.608
(0.422)
0.518
(0.915)
0.639
(0.423)
574
0.187
0.174
1.483 (df = 564)
14.428∗∗∗ (df = 9; 564)
571
0.236
0.223
0.687 (df = 560)
17.337∗∗∗ (df = 10; 560)
574
0.172
0.160
1.495 (df = 565)
14.646∗∗∗ (df = 8; 565)
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
∗
Note:
57
571
0.230
0.218
0.690 (df = 561
18.620∗∗∗ (df = 9;
p<0.1;
∗∗
p<0.05;
∗∗∗
p<
58
Note:
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
Constant
gpa1
sem1 study
sem2 frstudy
sem2 study
sem friends study
sem study
296
0.175
0.161
1.527 (df = 290)
12.290∗∗∗ (df = 5; 290)
1.110
(0.764)
0.269∗∗∗
(0.081)
296
0.217
0.201
0.684 (df = 289)
13.368∗∗∗ (df = 6; 289)
1.243∗∗∗
(0.344)
0.013
(0.037)
0.045∗
(0.026)
0.0004
(0.004)
studyhs
0.040∗∗∗
(0.008)
0.341∗
(0.203)
0.477∗∗∗
(0.094)
0.323
(0.208)
hsgpa
0.179
(0.243)
−0.453∗∗∗
(0.111)
0.220
(0.246)
−0.291
(0.192)
−0.164∗
(0.084)
−0.381∗∗
(0.187)
black
male
278
0.163
0.148
1.474 (df = 272)
10.592∗∗∗ (df = 5; 272)
1.396∗
(0.760)
0.141∗∗∗
(0.042)
0.041∗∗∗
(0.008)
(3)
(2)
(1)
−0.072
(0.090)
(4)
gpa
275
0.268
0.252
0.688 (df = 268)
16.352∗∗∗ (df = 6; 268)
0.540
(0.363)
0.046∗∗
(0.020)
0.121∗∗∗
(0.028)
−0.002
(0.004)
0.563∗∗∗
(0.096)
−0.308∗∗∗
(0.115)
Dependent variable:
sem study
gpa
sem study
Table 22: Study and GPA regressions, by semester
time on achievement becomes similar when we control for first semester GPA.
(6)
∗
275
0.455
0.441
0.595 (df = 267)
31.834∗∗∗ (df = 7; 267)
0.021
(0.318)
0.562∗∗∗
(0.059)
p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01
271
0.265
0.246
0.680 (df = 263)
13.571∗∗∗ (df = 7; 263)
0.568
(0.362)
0.045
(0.031)
0.031∗
(0.018)
0.041∗∗
(0.020)
−0.0001
(0.003)
0.282∗∗∗
(0.088)
−0.039
(0.103)
0.009
(0.079)
0.058∗∗
(0.025)
gpa2
0.096∗∗∗
(0.031)
−0.003
(0.004)
0.544∗∗∗
(0.096)
−0.275∗∗
(0.115)
−0.078
(0.091)
(5)
Table 22 shows that though production of achievement may look different across semesters, the effect of own study
A.3
Concavity of policy function
The optimal study time solves the function G(s, s−i ) =
∂c
∂s
−β2 = 0. To find how s varies
with friend study time, use the Implicit Function Theorem:
∂G
∂2c
∂s
∂s
∂s∂s
= − ∂G−i = − ∂ 2 c−i .
∂s−i
∂s
∂s2
If friend study time decreases the cost of increasing one’s own study time, the numerator is
positive. If the cost of studying is convex in own study time, the denominator is negative,
meaning the overall sign is positive. Moreover, if friend study time enters c(· · · ) in a concave
manner – τs < 1, the numerator is smaller in absolute value for larger values of s−i , i.e.
study time is concave in friend study time.
A.4
Graduation rates
Table 23 shows the results of a probit of graduating on first year GPA. The first column
shows the results of a probit of graduating on the estimated human capital typeˆand average
(model) GPA during the first two semesters of college. The second column runs the same
probit, but substituting student characteristics in for estimated human capital type. Both
statistical models show a strong link between performance during the first year, conditional
on student characteristics, and whether or not a student graduates from Berea College. The
results in Table 11 are calculated using the first column, though results are largely unchanged
were we to use the second statistical model instead.
A.5
Future wage
Table 24 summarizes how we relate average performance during the first year, as measured
by GPA, to labor market outcomes. The first column regresses the logarithm of average real
hourly wage in post-college surveys five through nine on student characteristics and GPA
during the first year. We can see that a 1-point increase in GPA is associated with an 11.6%
increase in average real hourly wage, but the coefficient is quite imprecisely estimated because
59
Table 23: Probit of graduating on average of first year GPA
Dependent variable:
Graduate
(1)
µ̂y
(2)
−0.063
(0.458)
Black
0.417
(0.275)
Male
−0.303
(0.205)
HS GPA
0.309
(0.391)
ACT
−0.051
(0.041)
HS study
−0.013
(0.008)
Expected study
0.003
(0.009)
Avg. GPA♥
1.249∗∗∗
(0.444)
1.309∗
(0.672)
Constant
−2.772∗∗∗
(0.578)
−2.799∗∗∗
(0.900)
Observations
Log Likelihood
Akaike Inf. Crit.
307
−154.725
315.451
307
−146.111
308.221
♥
: Average of model achievement across both semesters
∗
Note:
p<0.1; ∗∗ p<0.05;
60
∗∗∗
p<0.01
of the very small sample size of 77 students. Because of this, we compare this coefficient with
that obtained from two separate regressions: column (2) presents results from a regression
of the logarithm of average real wages on college GPA, measured at the end of university
real wage)
( ∂log(avg.
= 0.168) and column (3) presents results from a regression of end of college
∂college GPA
∂college GPA
GPA on first semester GPA ( ∂avg.
= 0.665). Combing these coefficients returns
first year GPA
∂log(avg. real wage)
∂avg. first year GPA
=
∂log(avg. real wage)
∂college GPA
∂college GPA
× ∂avg.
= 16.8% × 0.665 = 11.2%, which is
first year GPA
strikingly close to the result from the first regression, 11.6%.
Table 24: Wage regressions
Dependent variable:
log(avg. real wage))
log(avg. real wage)
College GPA
(1)
(2)
(3)
Black
0.084
(0.193)
0.110
(0.111)
0.068
(0.119)
Male
−0.016
(0.113)
0.090
(0.075)
−0.025
(0.070)
Avg. first year GPA
0.116
(0.114)
0.665∗∗∗
(0.070)
0.168∗∗
(0.083)
College GPA
Constant
Observations
R2
Adjusted R2
Residual Std. Error
F Statistic
2.336∗∗∗
(0.368)
2.089∗∗∗
(0.282)
1.119∗∗∗
(0.228)
77
0.015
−0.026
0.493 (df = 73)
0.370 (df = 3; 73)
165
0.030
0.012
0.458 (df = 161)
1.680 (df = 3; 161)
77
0.568
0.550
0.305 (df = 73)
32.001∗∗∗ (df = 3; 73)
∗
Note:
61
p<0.1;
∗∗
p<0.05;
∗∗∗
p<0.01
References
Arcidiacono, P., “Ability Sorting and the Returns to College Major.” Journal of Econometrics, 121(1):343–375, 2004.
Babcock, P. and M. Marks, “The Falling Time Cost of College: Evidence from Half Century
of Time Use Data,” The Review of Economics and Statistics., 93(2):468–478, 2011.
Badev, A., “Discrete games in endogenous networks: Theory and policy,” 2013.
Belley, P. and L. Lochner, “The Changing Role of Family Income and Ability in Determining Educational Achievement,” Journal of Human Capital, 1(1):pp. 37–89, 2007, ISSN
19328575.
Blume, L. E., W. A. Brock, S. N. Durlauf and Y. M. Ioannides, “Identification of social
interactions,” Available at SSRN 1660002, 2010.
Brock, W. A. and S. N. Durlauf, “Discrete Choice with Social Interactions.” Review of
Economic Studies, 68:235–260, 2001a.
—, “Interactions-Based Models.” in J. J. Heckman and E. E. Leamer, eds., “Handbook of
Econometrics,” vol. 5, pp. 235–260, Amsterdam: North-Holland Press, 2001b.
Bursztyn, L., F. Ederer, B. Ferman and N. Yuchtman, “Understanding Mechanisms Underlying Peer Effects: Evidence from a Field Experiment on Financial Decisions,” Econometrica, 82(4):1273–1301, 2014.
Calvó-Amengol, A., E. Pattachini and Y. Zenou, “Peer Effects and Social Networks in Education,” The Review of Economic Studies, (76):1239–1267, 2009.
Carrell, S. E., R. L. Fullerton and J. E. West, “Does Your Cohort Matter? Measuring Peer
Effects in College Achievement,” Journal of Labor Economics, 27(3), 2009.
62
Carrell, S. E., B. I. Sacerdote and J. E. West, “From Natural Variation to Optimal Policy?
The Importance of Endogenous Peer Group Formation,” Econometrica, 81(3):855–882,
2013.
Chan, T. J., “Additive Model with Endogenous Network Formation,” 2015.
Christakis, N., J. Fowler, G. Imbens and K. Kalyanaraman, “An empirical model for strategic
network formation,” Tech. rep., National Bureau of Economic Research, 2010.
Conley, T. G. and C. R. Udry, “Social Learning through Networks: The Adoption of
New Agricultural Technologies in Ghana,” American Journal of Agricultural Economics,
83(3):668–673, 2001.
—, “Learning about a new technology: Pineapple in Ghana,” The American Economic
Review, pp. 35–69, 2010.
Csardi, G. and T. Nepusz, “The igraph software package for complex network research,”
InterJournal, Complex Systems:1695, 2006.
de Paula, A., S. Richards-Shubik and E. Tamer, “Identification of Preferences in Network
Formation Games,” 2014.
Epple, D. and R. Romano, “Peer Effects in Education: A Survey of the Theory and Evidence,” in A. B. Jess Benhabib and M. O. Jackson, eds., “Handbook of Social Economics,”
vol. 1, pp. 1053–1163, North-Holland, 2011.
Freeman, L. C., “Centrality in social networks conceptual clarification,” Social Networks,
1(3):215–239, 1979.
Gamoran, A. and M. T. Hallinan, “Tracking Students for Instruction,” in “Restructuring
Schools,” pp. 113–131, Springer, 1995.
Hanushek, E. and L. Woessmann, “Does Educational Tracking Affect Performance and Inequality? Differences-in-Differences Evidence Across Countries,” The Economic Journal,
116(510):C63–C76, 2006.
63
Hsieh, C. and L. F. Lee, “A Social Interactions Model with Endogeneous Friendship Formation and Selectivity,” Journal of Applied Econometrics, 2015.
Manski, C., “Identification of Endogenous Social Effects: The Reflection Problem,” The
Review of Economic Studies, 60(3):531–542, 1993.
Mele, A., “A Structural Model of Segregation in Social Networks,” Working paper, 2013.
R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for
Statistical Computing, Vienna, Austria, 2014.
Sacerdote, B., “Peer Effects with Random Assignment: Results for Dartmouth Roommates.”
Quarterly Journal of Economics, pp. 681–704, 2001.
Stinebrickner, R. and T. R. Stinebrickner, “What can be learned about peer effects using
college roommates? Evidence from new survey data and students from disadvantaged
backgrounds,” Journal of public Economics, 90(8):1435–1454, 2006.
—, “The Effect of Credit Constraints on the College Drop-Out Decision: A Direct Approach
Using a New Panel Study,” The American Economic Review, 98(5):2163–84, 2008.
Weinberg, B. A., “Social interactions with endogenous associations.” Working Paper, 2009.
—, “Group design with endogeneous associations.” Regional Science and Urban Economics,
43:411–421, 2013.
Wickham, H., ggplot2: elegant graphics for data analysis, Springer New York, 2009, ISBN
978-0-387-98140-6.
64
© Copyright 2026 Paperzz