Key Players? Identifying High School Networking Effects on Earnings

Draft 9 - Please do not quote and do not circulate
Lucia M. Barbone¹,² and Peter J. Dolton∗¹,³
¹ Department of Economics, University of Sussex
² Department of Economics and Finance, Catholic University of Milan
³ Centre for Economic Performance, London School of Economics
9 September 2015
Abstract
The early acquisition of non-cognitive and social skills is now recognised
as a fundamental determinant of success in the labour market. Social network
analysis can provide an objective measure of these skills. Various network
measures and different identification strategies are used to examine evidence
of the impact of these skills on adult earnings, using a unique dataset (AddHealth). The results show that social skills are determined not just by the
size of your network, but by the chosen identification assumption, the intensity of networks, and how central you are in the network - i.e. whether you are
a "key player" or not. Our estimates indicate that there is a sizeable direct
impact of these high school network social skills on earnings in adult life: a
one standard deviation increase raises earnings by between 3% and 8%.
Keywords: Social Networks, Non-cognitive Skills, Earnings, Social Interactions
JEL codes: J24, J30, Z13
∗ The authors can be contacted at [email protected] and [email protected]. We
thank Lorenzo Cappellari, Jan Gunning, Marteen Lindeboom, Steve Machin, Maurizio Motolese, Petra
Todd, Peter Urwin, and the seminar participants at the University of Sussex, the Catholic University
of Milan, the WPEG 2014 Conference, the RES 2015 Conference, the ESPE 2015 Conference, and the
2015 Econometric Society World Congress for helpful comments and discussions.
1 Introduction
Who we know, how many people we know, and how central we are in our friendship network will condition, at least partially, how much we know (Woolley et al., 2010;
Christakis and Fowler, 2011; Pentland, 2014). It is also possible that the process of acquiring these friends and becoming central to any network directly augments our social and
non-cognitive skills:¹ maintaining contacts and providing links by various means of communication require interpersonal skills associated with communication, coordination, mutual
understanding, language, nuance, conciliation, empathy, reciprocal respect, punctuality,
reliability, and commitment. The development of these skills whilst in high school is an
important part of the learning and maturation process (Powell et al., 1985). These social
and non-cognitive skills are also exactly the skills needed in the labour market: hence,
the acquisition of these skills will make an individual much more valuable and effective
in most kinds of jobs (Nolfi, 1979). Accordingly, their promotion or advancement is more
likely and hence we would expect their marginal product to be higher. Over time, and
certainly by mid-career, we would then expect that those with more network skills earn
more in the labour market. This paper seeks to characterise this process and measure the
size of the effect of networking skills on earnings outcomes. More specifically, we examine
the array of different models which have been used to identify network effects. We also
examine the proposition that what matters is being "central" (in a well-defined sense) to a
network, rather than just the number of friends you have. Our suggestion is that
being a "key player" in a network is what imparts extra social and non-cognitive skills
and accordingly augments earnings later in life.
Various recent papers have found non-cognitive skills to be a key determinant of
labour market outcomes, such as earnings, occupational choices, and job performance
(e.g. Heckman et al., 2006; Almlund et al., 2011; Heckman, 2013; Borghans, ter Weel
and Weinberg, 2008, among others). However, the definition and the measurement
of these non-cognitive skills are still subject to much debate. This debate is generated
by the need to adopt proxies for these skills, since direct measures of them are elusive.
An important implication of this highly influential research is that a good part of the
non-cognitive skills that are important are learnt, or conferred, prior to school, within the
family, or at least in the early years of life (Heckman et al., 2013; Heckman, 2013; Heckman
et al., 2010). Thus, many authors have argued that early childhood interventions would
be the most effective for the development of these skills (Heckman, 2013). However,
our suggestion is that these non-cognitive skills are malleable in nature, and are still
developed throughout a child's school years² (Almlund et al., 2011). Our study breaks
new ground by suggesting that children go on acquiring these valuable skills via social
networking well into their high school years.
¹ In this paper, the terms 'non-cognitive skills' and 'social skills' will be used interchangeably. Likewise,
we do not distinguish between the concepts of 'social capital' and 'network capital', but use them
interchangeably.
² And even into adulthood, but we do not examine that in this paper.
The previous literature has used three main groups of proxies to estimate the effects
of non-cognitive skills on earnings: personality traits (see Heckman et al., 2006; Drago, 2011),
social skills (Kuhn and Weinberger, 2005; Glaeser et al., 2002), and behavioural
issues (see Heckman and Rubinstein, 2001; Weiss, 2010). Results are typically positive
and significant for earnings and job performance, but estimates vary across the three
groups because of the different approaches and measures adopted. Indeed, results range
from 12% (Barron et al., 2000; Postlewaite and Silverman, 2005) to 2.7% (Carneiro et al.,
2007).³ However, these measures are difficult to compare, since they are calculated in different ways and often capture different aspects of non-cognitive skills. Recent papers have
introduced social network analysis measures of social skills to examine long-term labour
market and health outcomes (Conti et al., 2013; Hill, 2012; Babcock, 2008; Fletcher, 2014;
Badev, 2013). Indeed, network measures can give a proxy for the ability of an individual
to interact with others. The most common metric adopted in the previous literature has
been the number of friends that one has, measured as the nominations that one either
gives or receives in a survey (the so-called "degree"). The effect has then mostly been related
to the popularity that one has in the environment considered, i.e. school or work. Effects
are typically positive and significant: Hill (2012) finds a positive effect of around 5%
on earnings, while Babcock (2008) finds a 1.5% positive effect on employment probability.⁴ Finally, Conti et al. (2013) introduce a two-step estimation strategy to evaluate
the determinants of social skills. The results obtained show a positive and significant
effect: a one standard deviation increase in the number of times one is nominated by
others in the survey (the so-called indegree) increases earnings by 2%. Previous studies did
not examine the endogeneity issue that might affect social network measures. Indeed, a
central problem in the assessment of the value of social skills induced by networks is that
the propensity to make friends could be related to the unobserved heterogeneity which
conditions earnings. Specifically, unobserved traits like
energy, motivation, enthusiasm, personality, and ambition could be important in determining how many network contacts we make and may also influence earnings. We
tackle this endogeneity problem head-on, in a number of ways, in our estimations.
This paper overcomes these limitations and provides additional evidence on the value
of social skills for labour market outcomes, using a unique dataset - the National Longitudinal Study of Adolescent to Adult Health (AddHealth). This dataset is particularly
interesting since it includes fully sampled networks and thus allows us to adopt more
sophisticated network measures. Since it is panel data from age 15 to 28 and contains
contemporaneous information on friendship relationships, it does not suffer from recall
bias. More specifically, we address the concern of whether simply counting the number of
friends one has is adequate. What we seek is a more adequate measure of network capital
and how central one is to a set of interrelationships. This means we are concerned with
not just the number of friends someone has, but how many friends your friends, in turn,
have. The proposition is that capturing the extent to which someone is a well connected
'Key Player' constitutes a better measure of 'network capital'. We therefore propose an
alternative network metric, the Bonacich Centrality, which is not simply a measure of
early connectivity but a superior proxy for social skills, because of its ability to take into
account the location of an individual in a social network (Christakis and Fowler, 2011).
Conti et al. (2013) use a pseudo-likelihood estimator as identification strategy, in order
to address some data issues. The first stage of their estimation models different network
measures, taking into account the nature of the network variables. Their second stage
then evaluates the effect of these non-cognitive skills on earnings thirteen years later. We
compare and evaluate different network metrics to proxy for social skills, namely degree,
indegree, and Bonacich Centrality. Due to the detailed nature of the AddHealth dataset,
we are also able to compare 'partially' and 'fully' sampled networks, and to examine the
impact of network sampling on the estimates.
³ The reported effects are the effect of a one standard deviation increase in non-cognitive skills on wages.
⁴ Only Fletcher (2014) does not find any effect when comparing siblings and twins with different social
skills.
This paper makes several contributions to the existing literature. Firstly, we suggest
that social network measures are extremely useful and potentially superior measures for
social and non-cognitive skills. By contrast, a simple measure such as the number of relationships
one has is shown to be a less effective proxy for social skills, because of the ambiguity of its
informational content (Borgatti et al., 2013). Recent applied network analysis literature
has underlined the importance of metrics able to account for the position in the network,
and not only for their connectedness (Banerjee et al., 2014; Calvó-Armengol et al., 2009).
Following this recent network analysis literature, as well as the sociology literature, we
adopt a metric of network centrality which presents a richer informational content and
which we believe can give a better proxy for social skills: the Bonacich Power Centrality
(Bonacich, 1987). This enables us to take into account the structure of the network and
thus the importance of the single individual in the network itself. Secondly, this paper
directly addresses the identification issues related to the possible presence of endogeneity.
This is carefully analysed and tested using various robustness checks and alternative
identification strategies. In particular, we compare results obtained through various social
interaction models as proposed in the literature (Goldsmith-Pinkham and Imbens, 2013;
Bramoullé et al., 2009; Manski, 1993). This helps to clarify the nature and the meaning
of the estimated effect. We also contribute to this literature with a modification of the
key identifying matrix. Specifically, we introduce a network intensity model that does
not just classify friendships information as zeros or ones, but uses individual rankings
of friends to scale the adjacency matrix. Thirdly, this paper extends previous analyses
in the literature by examining the heterogeneity of the effects by gender, ethnicity, and
earnings quantile. The main findings of this paper are that a one standard deviation
increase in network centrality raises earnings by around 3-8%⁵ and that this effect varies
over earnings quantiles. We find that the effects of network capital are heterogeneous
across gender and ethnicity. Specifically, the benefits accrue largely to whites and males.
The effect is related to social abilities and prosocial qualities of the individuals, and these
skills are affected not only by family background and psychological traits, but also by the
school environment and the opportunities that one has to interact with others. For this
reason, more research is necessary to explore the role of schools in shaping these skills
and possible school policies to favour the development of these social skills in students.
The effect is found to be robust across different social interactions models, with bounds
between 0 and 10%. We also provide evidence of the separability of this effect from the social
capital accumulation effect, i.e. from more standard peer effects.
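To make the centrality measure discussed above concrete, the following is a minimal sketch of a Bonacich-type centrality computed from an adjacency matrix, assuming the standard formulation c(α, β) = α(I − βA)⁻¹A1 of Bonacich (1987). The function names, the toy network, the default α = 1 and, in particular, the linear rank weights in `rank_weighted_adjacency` are illustrative assumptions; they are not the specification or the code used in this paper.

```python
import numpy as np


def bonacich_centrality(adjacency, beta, alpha=1.0):
    """Bonacich (1987) centrality c(alpha, beta) = alpha * (I - beta*A)^{-1} A 1."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    # The underlying series sum_k beta^k A^(k+1) 1 converges only if
    # |beta| < 1 / rho(A), where rho(A) is the spectral radius of A.
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if spectral_radius > 0 and abs(beta) >= 1.0 / spectral_radius:
        raise ValueError("beta must be smaller in modulus than 1 / spectral radius")
    # Solve (I - beta*A) c = alpha * A 1 instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - beta * A, alpha * A @ np.ones(n))


def rank_weighted_adjacency(ranked_nominations, n, max_rank=5):
    """Hypothetical intensity weights: the friend ranked r gets (max_rank - r + 1) / max_rank."""
    W = np.zeros((n, n))
    for i, friends in ranked_nominations.items():
        for r, j in enumerate(friends, start=1):
            W[i, j] = (max_rank - r + 1) / max_rank
    return W


# Toy undirected network: a path 0 - 1 - 2, where node 1 sits in the middle.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
print(bonacich_centrality(path, beta=0.0))  # beta = 0 reduces to degree
print(bonacich_centrality(path, beta=0.1))  # the middle node scores highest
```

With β = 0 the measure collapses to degree, while a positive β additionally rewards ties to friends who are themselves well connected, which is the sense in which centrality carries more information than a simple count of friends.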
The rest of this paper is organised as follows. Section 2 reviews the previous findings in the literature and motivates the analysis. Section 3 illustrates the data and the
variables used to proxy social skills. Section 4 describes the empirical specification and
the identification strategy. Section 5 presents the results and Section 6 discusses some
implications of our findings and concludes.
⁵ Effects are calculated using standardised network measures.
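As a reading aid for the standardised effects quoted in this section, the following worked relation shows how a coefficient on a standardised network measure translates into a percentage effect. It assumes, which the text above does not state explicitly, that earnings enter the regression in logarithms; the symbols x_i, z_i, w_i, α and β are introduced here for illustration only.

```latex
z_i = \frac{x_i - \bar{x}}{s_x}, \qquad
\ln w_i = \alpha + \beta z_i + \varepsilon_i
\quad\Longrightarrow\quad
\%\Delta w \,\big|_{\Delta z = 1} = 100\,\bigl(e^{\beta} - 1\bigr) \approx 100\,\beta
\quad \text{for small } \beta .
```

Under this assumption, a coefficient between 0.03 and 0.08 on the standardised measure corresponds to earnings that are roughly 3.0% to 8.3% higher for a one standard deviation increase.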
2 Interpersonal skills and labour market outcomes
The importance of non-cognitive or social skills for human capital accumulation
and labour market outcomes has been examined only recently. The term ’non-cognitive
skills’ has been used in the literature to define all the individual capacities not measured
through knowledge or ability to solve problems and learn new materials (Heckman, 2013;
Hill, 2012; Gardner, 1993), such as: personality traits, sociability, motivation, charm,
extraversion and various others. The term non-cognitive has been chosen to distinguish
these skills from more standard cognitive abilities, such as mathematical and verbal intelligence, typically measured through IQ. However, the term might be misleading
if one interprets it as referring to skills not involving 'cognition'. Indeed, Gardner (1987, 1993) proposed a different conceptualisation of human intelligence, in which he identifies seven
different dimensions. The first five are: linguistic intelligence, logical-mathematical intelligence, spatial intelligence, musical intelligence, and bodily-kinesthetic intelligence. The
sixth and the seventh, which are more relevant for our purposes, are interpersonal intelligence,
i.e. "the ability to understand other people: what motivates them, how they work, how to
work cooperatively with them" (Gardner, 1993, p. 9), and intrapersonal intelligence, i.e. the
"capacity to form an accurate, veridical model of oneself, and to be able to use that model
to operate effectively in life" (Gardner, 1993, p. 9).
The skills we are considering in this paper are interpersonal intelligence and,
partly, intrapersonal intelligence. The term 'non-cognitive' will still be used, since it
has been widely adopted in the literature, but a clarification is necessary on the terminology. By non-cognitive or social skills, we mean the inter- and intra-personal intelligence
of Gardner. Specifically we are not referring to the ’cognitive’ skills learnt in math, language, logic, science, physical education, geography, or music lessons in school or college.
We are also not referring to psychological personality traits. Rather, we regard inter- and
intra-personal skills as the skills which help individuals to: communicate with others,
work in a team, get on with people, induce individuals to work cooperatively with you,
and develop leadership qualities. The term ’non-cognitive’ is somewhat of a misnomer,
as it implies that intelligence and cognitive brain function are not required in their use.
However, clearly this literature is talking about the same set of skills as us. Our central
idea is that these skills are developed early in school and help to augment productive human capital useful in the workplace. Central to our paper (and what sets us apart from
the ’noncognitive’ literature) is the idea that good information on networking in high
school enables us to capture and measure this propensity for inter- and intra-personal
skills.
Glaeser et al. (2002) consider these social characteristics part of an individual's human capital, resulting from both innate abilities and investments. The measurement of these skills
is challenging since it is not easy to separate cognitive and non-cognitive skills, as they
often influence each other (Heckman, 2011; Borghans, Meijers and ter Weel, 2008).
To clarify the state of the art of the literature to date, the rest of this section briefly reviews previous results and methodologies. Indeed, intrapersonal and non-cognitive skills
have been measured using different proxies, which can be grouped into three main categories: psychological or personality traits, socio-emotional skills, and behavioural issues.
Differences across the categories are due to the different methods of measurement adopted
and the different conceptualisations of non-cognitive skills.
The first category includes the personality traits developed in the psychology literature
and adopted by some authors (see, among others, Borghans et al., 2011; Heckman,
2011). The 'Big Five' or 'OCEAN'⁶ is the most widely adopted taxonomy of human
personality traits, even though many alternatives have been developed (Bouchard and
Loehlin, 2001), together with the measures of locus-of-control and self-esteem. The impact of these traits on a range of outcomes, such as education, labour market, health, and
crime has been widely studied. Almlund et al. (2011) provide an extensive review of the
literature on the topic. Personality traits typically have a significant effect on earnings,
job performance and occupational choices (Kautz et al., 2014; Cobb-Clark and Schurer,
2012; Heckman, 2011; Cobb-Clark and Tan, 2011; Clausen and Gilens, 1990). Furthermore, personality traits have been found to be more relevant for semi-skilled and unskilled workers, while
cognitive abilities are key for jobs such as professors, scientists, and managers (Schmidt
and Hunter, 2004). Personality has also been found to be related to absenteeism (Störmer
and Fahr, 2013), the relationship with the employer (Caliendo et al., 2015), unemployment spells
(Gallo, 2003), and compensation scheme choice (Tsai et al., 2015).
The second category uses measures or indices of behavioural problems, such as 'delinquency' or 'misbehaviours'. Delinquency scales are computed using actions such as damaging property, stealing, selling drugs, and running away from home (see, for example, Loehlin
and Rowe, 1992; Heckman and Rubinstein, 2001). As an alternative, some authors have
used behavioural problems assessments given by parents or teachers (Weiss, 2010; Segal,
2008). Therefore, these measures account for the impact of the lack of non-cognitive
skills, rather than the direct impact of these skills.
The last category includes socio-emotional skills: the skills and the ability to maintain relationships, achieve goals and manage feelings in public situations (Ajwad and
Nikoloski, 2014; Borghans, Meijers and ter Weel, 2008). The measures adopted in this
literature are more relevant to the purposes and the measures that we use. Previous
literature has estimated these skills using self-reported measures such as attitudes or
interaction frequency (Bandiera et al., 2009; Borghans, ter Weel and Weinberg, 2008;
Krueger and Schkade, 2008), or the rate of participation in social activities: a commonly
adopted proxy is being enrolled in an athletic club (Barron et al., 2000; Postlewaite and
Silverman, 2005) or in organisations (Glaeser et al., 2002). Jenks (1979) finds a positive and
significant effect of these non-cognitive skills on both earnings and occupational attainment, while Rosenbaum (2001) shows that leadership in high school is a key determinant
of earnings. Organisation membership and athletic club participation usually give a positive and significant effect of 12% (Barron et al., 2000; Postlewaite and Silverman, 2005)
on wages and occupational choice (Glaeser et al., 2002). The same holds for interaction
skills, which are found to have an impact of 16% on wages by Borghans, ter Weel and
Weinberg (2008). Cunha et al. (2010) estimate that 12% of the variation in educational
attainment is linked to sociability, friendliness and compliance. Carneiro et al. (2007)
find a 2.5% effect on wages of social skills, even though the authors suggest that the
effect works only indirectly via educational attainment.
⁶ The acronym refers to Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism.
2.1 Social Networks and Earnings Outcomes
Many studies have estimated the influence of social network interactions on
economic outcomes. However, the majority of the studies have used social network techniques
to measure information effects and job searching channels (Jackson, 2014).⁷ Only
recently have social network metrics been adopted as a proxy for individuals' social skills.
In particular, great attention has been given to short-term outcomes. Calvó-Armengol
et al. (2009) analyse the impact of a measure of the location of the individual in the network, the Bonacich centrality, on educational outcomes. Results show that a standard
deviation increase in the centrality measure increases school performance by more than
7% of one standard deviation. Mihaly (2009) analyses the effects of popularity within
a network on academic achievements with a model of endogenous network formation.
Network formation is estimated with an instrumental variable regression, using individual demographic characteristics and the grade composition as instruments for popularity.
The identification strategy used is to exploit the variation of the demographic variables
within schools and grades across genders. The author finds strong evidence of endogeneity in the friendship formation process and a negative effect of popularity on academic
outcomes. The argument for such a result is that friendship relationships involve a time
trade-off with studying. Popularity is captured through various network variables: indegree, density and eigenvector centrality measures.
Analyses of long-term outcomes are less frequent in the literature. Babcock (2008)
analyses the impact of degree measured at high school on long-term educational outcomes in the AddHealth dataset. Measures are computed for schools with a sample of
at least half of the students. His results show that connectedness is negatively related
to racial and grade heterogeneity. Furthermore, being in a more connected cohort is
related to more years of schooling and a higher probability of attending college. The
author then provides an interesting discussion about the possible determinants of the
network variables, even though this is not integrated in the estimation of the long-run
outcomes. In addition, challenges posed by partially sampled networks are not addressed,
and therefore the results might be an underestimation of the true effect. Hill (2012) investigates the impact of social networking skills on various long-term outcomes using
the AddHealth dataset. Two social network variables, in-degree and proximity prestige,
are used to explain the variation in adult-life earnings. Results show that the earnings
premium is positive and significant, even if small, across different specifications of the
model: using either in-degree or proximity prestige the coefficient is around 5%. The
author also estimates a model specification to control for the quality of the networks.
Fletcher (2013) analyses the impact of personality traits on labour market returns in the
AddHealth dataset. Results show that extraversion is an important factor for earnings. In
a subsequent paper, Fletcher (2014) evaluates the impact of social network measures on
adult-life salaries. The author compares siblings and their earnings to take into account
family level heterogeneity. Results show no effects of popularity on earnings: therefore,
family level heterogeneity is a more substantial determinant of labour market outcomes.
However, the formation of the network measures is not empirically estimated, leaving
one of the main points of concern unresolved.
⁷ Another branch of literature has examined the effects of social interactions on observed productivity.
For instance, Bandiera et al. (2009) estimate the effect of social connections between workers and managers on productivity. Using an exogenous change in managerial incentives as a natural field experiment,
the authors find that performance bonuses incentivise managers to favour workers according to their
ability and effort, rather than based on their connections. The paper also investigates a model of the
determinants of social connections between workers and managers.
Finally, Conti et al. (2013) estimate the determinants of friendship formation and the
labour market returns to sociability in the Wisconsin Longitudinal Study. The authors
use social network analysis metrics such as ’in-degree’ (the number of times someone
is nominated by others) and ’out-degree’ (the number of times an individual nominates
others). The estimation adopts a two-step analysis: first they estimate the determinants
of link formation and then the determinants of adult wages. Results show that for a
median individual, one additional friendship nomination in high school is linked to a 2%
wage advantage. The authors interpret indegree as a measure of respondents’ popularity:
the more people mention one as a friend, the more popular that person is in the school.
This is then identified as the main channel for this result, and the authors propose as
an explanation the fact that being popular can proxy for social skills. Results
hold even for those who later emigrated, and even when controlling for the number
of jobs found through friends. This suggests that the effect does not stem from benefits
obtained through long-lasting relationships per se, but rather from a relational skill. However,
the dataset used by the authors has some important limitations. Data on high school
relations were collected by phone eighteen years after the completion of high school:
this information is therefore likely to suffer from retrospective bias, since people would only
remember other students they are still in contact with, or people particularly significant
to them, for reasons not observable to the researcher. Also, only a selection of individuals
were surveyed and therefore the mapping of the school network is incomplete, causing a
bias in the estimation (Chandrasekhar and Lewis, 2011). This also dramatically reduces
the range of different measures that the authors can adopt in their analysis.⁸ Table 1
summarises the main findings about the relation between network metrics (as a proxy for
non-cognitive skills) and labour market outcomes.
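The in-degree and out-degree definitions above, and the way an incompletely surveyed school understates them, can be illustrated with a short sketch. The data structure, the function name `degrees` and the four-student school are hypothetical and chosen only for illustration; they do not reproduce any dataset or code from the studies cited.

```python
from collections import Counter


def degrees(nominations, sampled=None):
    """Return (in_degree, out_degree) Counters from {respondent: [nominated friends]}.

    If `sampled` is given, only respondents in that set were interviewed, so
    nominations made by anyone else are unobserved.
    """
    in_degree, out_degree = Counter(), Counter()
    for person, friends in nominations.items():
        if sampled is not None and person not in sampled:
            continue  # this person's nominations were never recorded
        unique_friends = set(friends)
        out_degree[person] = len(unique_friends)  # nominations given
        for friend in unique_friends:
            in_degree[friend] += 1  # nominations received
    return in_degree, out_degree


# Toy school of four students (names are illustrative only).
school = {
    "ana": ["ben", "cat"],
    "ben": ["cat"],
    "cat": ["ana"],
    "dan": ["cat", "ana"],
}

full_in, full_out = degrees(school)
part_in, part_out = degrees(school, sampled={"ana", "ben"})

# cat is nominated by ana, ben and dan, but dan is not in the partial sample.
print(full_in["cat"], part_in["cat"])  # 3 2
```

In the partial sample every observed in-degree is at most its true value, which is one simple way to see why measures built on partially sampled networks can be biased.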
3 Data and Summary Statistics
This paper uses the restricted version of the National Longitudinal Study of Adolescent to Adult Health (AddHealth). AddHealth is a school-based longitudinal study of a
nationally representative sample of adolescents in grades 7-12 in the United States during
the school year 1994-1995. The dataset is composed of one in-school survey of 90,118
students and four in-home interviews administered to 27,000 selected students during the
baseline year (Wave I, 1994) and 1 year (Wave II, 1995), 6 years (Wave III, 2001) and 13
years later (Wave IV, 2008). Wave IV, the last wave, was collected over the period April 2007
- February 2009. Appendix Tables A1 and A2 provide details on the variables and the
sample size in each wave. AddHealth is the most detailed available dataset for research
on social networks at the individual level: it has many features that make it superior
to other datasets previously used in this field. First, it provides precise information on
the relationships of students in the sample. Each student is asked to nominate up to ten
friends (five males and five females). Therefore, it is possible to match the information of
social ties with individual characteristics. Second, the network information was collected
again in Wave II⁹ of the panel, offering a longitudinal and dynamic perspective on friends
and relationship information. Third, a subsample of the schools was 'fully' sampled, in
the sense that all individuals in the school, and their whole network, were recorded. This
makes it possible to completely map the networks for these schools. This has important
consequences for the estimation, as will be described later.
⁸ More sophisticated network measures require a greater sampling of the network.
⁹ The network information was collected in Wave III as well, but the number of observations was
too low to be used for estimation.
Table 1: Previous Literature
| Author | Dataset | Dependent Variable | Metric | Estimate | Identification Strategy |
|---|---|---|---|---|---|
| Fletcher et al. (2014) | AddHealth | Earnings | In-Degree | -0.004 | Siblings Comparison |
| | | | Out-Degree | 0.007 | " |
| Hill (2012) | AddHealth | Earnings | In-Degree | 0.0544** | Networks Exogenous |
| | | | Proximity Prestige | 0.0654** | " |
| Mihaly (2009) | AddHealth | GPA | In-Degree | 0.04** | IV |
| | | | BC | 0.17** | " |
| Babcock (2008) | AddHealth | Employment | Peers' Degree | 0.0136*** | Networks Exogenous |
| | | College | Peers' Degree | 0.0337*** | " |
| Conti et al. (2013) | WLS | Earnings | In-Degree | 0.020** | Joint Pseudo-Likelihood of Networks and Wages |
| | | | Out-Degree | 0.00 | " |
However, the dataset suffers from three main limitations. First, respondents were
asked to nominate up to ten best friends and this limitation causes some truncation
in the data. However, less than 1% of the in-home sample reported ten friends and
therefore the truncation does not appear to be a particular issue for the estimation. Second,
data on nominations were collected by asking for five male friends and five female friends.
Third, the network variables calculated from these nominations are school-based, since
no information was collected at the class-level.
For this paper we use the in-home student questionnaire for the Waves I, II and IV,
the friendship nominations questionnaire, the school administrator questionnaire and the
parents questionnaire administered in Wave I. Also, additional information has been taken
from the Wave I Contextual Files and the constructed variables included in Wave IV.¹⁰
Table 3 presents a selection of the summary statistics of the individual characteristics of
the respondents in the questionnaire. The complete tables with summary statistics are
reported in Appendix A.
In Wave I, students are on average 16 years old, and around half of them are female. A large
part of the sample is composed of white students (65%), and the rest of other ethnicities:
13% black, 12% Asian and 10% other. IQ is measured through the administration of
a modified version of the Peabody Picture Vocabulary Test. Respondents are distributed
across grades 7-12, with larger numbers in grades 10, 11 and 12. 17% of the sample
has repeated a grade at least once. The majority of the sample report themselves as
Christians (either Catholics or other denominations), while only 3% declare themselves as
belonging to another religion or as atheist. Only 8% of the sample consider themselves
not fit, 52% of the sample watch TV more than five times per week and 35% of the
sample are not involved in any club or activity. Turning to the Wave IV data,
in 2008 respondents earn on average $36,000 annually, and around 20% obtained a bachelor's
degree. Less than 8% completed more than a bachelor's degree, which is not surprising since the
sample is quite young. The second panel of Table 3 reports the summary statistics
for family characteristics. Respondents reported having on average 3 siblings, and 59%
of the students live with both parents at Wave I. With respect to income class, 5% of
the respondents belong to a low income family, 45% to a middle income one, and 50% to
a high income one.¹¹ A separate survey administered to school administrators collected
information on the characteristics of the schools. 77% of the students attend a school
with more than 1000 students. The class size is on average 29, and 28% of parents are
actively involved in the Parent-Teacher Association (PTA).
3.1 Social Network Metrics
Social network variables are constructed using the friendship nominations data of
the survey at Wave I. Each respondent was asked to nominate at most 5 female best
friends and 5 male best friends. There is also some information about the activities and
the time that each pair of friends spend together, and this is used in the second section
of this paper. Ties are recorded in binary form: 1 if the tie exists and 0
otherwise. The nominations data do not suffer from retrospective bias because the
survey asked for information on current friends at the time of Waves I and II. A unique
feature of the AddHealth dataset is that selected schools were fully surveyed, which
allows us to have a complete map of the school network. In this way, it is also possible to
compare results obtained with partially sampled schools and fully sampled schools (the
so-called saturated schools). Most of the past research in this area has been constrained
to use partially sampled network data (e.g. Conti et al. (2013)). The partially
sampled school set includes 116 schools in which more than 50% of the student body
filled in the questionnaire. By contrast, the saturated schools sample is composed of
only 16 schools. For the purposes of this paper, only 11 schools are included in the
saturated sample. In both cases, network measures are constructed using the uniquely
identifiable nominations given by each respondent.
10 Network metrics have been computed by the authors using the ’igraph’ and ’sna’ packages in R.
11 Income is measured by the household income, as self-reported by parents. High income is defined
as a family earning more than $52,000 per year, and low income as earning less than $11,000. These
categories are arbitrary but based on the Poverty Guidelines provided by the Office of the Assistant
Secretary for Planning and Evaluation (ASPE, 2014).
In this section, the main characteristics of the measures will be briefly described.
More extended descriptions of the social network analysis are available in Jackson (2008),
Newman (2010), and Kadushin (2011). The social network literature has developed some
metrics to measure the importance of an individual (node) in a network of relations. The
most important ones are degree, in-degree, and the Bonacich Power centrality.
Degree sums the number of links that one has with others in a network. It is a rather
generic figure, since there is no distinction on whether the link has been indicated by
the person, by the friend, or by both. When information on the direction of the link
(who is nominating whom) is available, this measure can be refined as in-degree.
In-degree equals the number of incoming nominations from other people. This figure is
typically interpreted as a proxy for popularity (Borgatti et al., 2013; Conti et al., 2013;
Wasserman and Faust, 1994). However, both measures are limited in their informational
content, since they do not provide information on the position of the person in the network,
and thus on how influential one can be. An individual can have a good number of friends,
but be very peripheral and thus not really at the ’centre of things’ (Borgatti et al.,
2013). In mathematical terms:
$$D_i = \sum_{j} \sum_{i} X_{ji} \tag{1}$$
$$ID_i = \sum_{j} X_{ji} \tag{2}$$
where $D_i$ measures degree and $ID_i$ measures indegree for individual i, j is any other
individual, and X is the 0-1 adjacency matrix, whose entries equal 1 for a friendship tie and 0
otherwise. To address these issues, in his influential paper, Bonacich (1987) introduces a family
of measures (called Bonacich Power centrality or β centrality) that defines centrality as a
function of the statuses of the others to whom one is connected. The interesting feature
of this measure is that one is more central when one is well-connected with other
well-connected persons. In more intuitive terms, centrality can be thought of as the ability
to connect with people with high social skills. A limitation of this measure is that it
can account neither for relationships outside the network nor for differences in
the quality of information received. In mathematical terms, Bonacich centrality (BC) is
defined as:
$$BC = \alpha (I - \beta X)^{-1} X l \tag{3}$$
where l is a column vector of 1s.
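As an illustration of equations (1)-(3), the three metrics can be computed directly from an adjacency matrix. The following is a minimal sketch in Python with NumPy, not the authors' code (the paper's computations use the ’igraph’ and ’sna’ packages in R). The four-student matrix, the function names and the values of α and β are hypothetical. Degree is taken here to count a tie once whether it is sent, received, or both, which is one reading of the verbal definition above.

```python
import numpy as np

# Hypothetical network: X[j, i] = 1 if student j nominates student i.
X = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def in_degree(X):
    # ID_i = sum_j X[j, i]: nominations received by i (column sums).
    return X.sum(axis=0)

def degree(X):
    # Assumption: a tie counts once whether it is sent, received, or both.
    either = ((X + X.T) > 0).astype(int)
    return either.sum(axis=1)

def bonacich(X, alpha=1.0, beta=0.1):
    # BC = alpha * (I - beta X)^{-1} X l, with l a column vector of ones.
    n = X.shape[0]
    ones = np.ones(n)
    return alpha * np.linalg.solve(np.eye(n) - beta * X, X @ ones)

print("in-degree:", in_degree(X))
print("degree:   ", degree(X))
print("BC:       ", bonacich(X, alpha=1.0, beta=0.2))
```

With β = 0 the measure collapses to the row sums of X; a positive β adds weight for ties to well-connected others. The power-series reading of the inverse holds only when |β| is smaller than the reciprocal of the largest absolute eigenvalue of X.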
Table 3: Summary Statistics
                     Mean     StDev    Min     Max
Individual
Age                  16       1.4      12      19
Female               0.5      0.5      0       1
White                0.65     0.47     0       1
IQ                   101      12       56      133
Earnings             36171    21450    1200    140000
BA                   0.2      0.4      0       1
More BA              0.083    0.27     0       1
Many Friends W4      0.35     0.47     0       1
Family and School
Siblings             3        2        0       15
PTA                  0.28     0.45     0       1
High Income          0.38     0.48     0       1
Class Size           29       8        13      38
Large School         0.77     0.42     0       1
N                    1450
Figure 1 plots the network with nodes weighted by centrality. Each dot is a student
in the network, i.e. the high school. The size of the circle is based on the individual degree
figure: the larger the circle, the higher the degree. The colour of the circle is instead
based on the individual BC figure: the darker the dot, the higher the centrality
in the network. Only a few dots have a high level of centrality
in the network. Also, some dots show a similar degree (same size), but different
levels of centrality.
Table 4 reports the summary statistics for the network variables. Degree is on average 6,
with a minimum of 1 and a maximum of 24, while in-degree is
on average 3 (with a minimum of 0 and a maximum of 19). In other words, people have
on average 6 contacts in their school and are on average nominated as best friends by
3 other students. These numbers suggest that students were reporting only connections
that they considered worth mentioning when asked “Please tell me the name of your 5 best
male/female friends, starting with your best male/female friends”. Indeed, best friends are
typically a smaller group of people than acquaintances. The average Bonacich
centrality is 0.8 with a maximum of 4.469. It approximates the closeness of each person
to others in her influence domain. The higher the number, the more central the person is
(compared with people with lower figures). Having described the dataset and the network
variables, the next section illustrates the empirical methodology and the identification
strategy used to address the research questions previously illustrated.
Table 4: Networks Summary Statistics
            Complete Sample                   Male Sample      Female Sample
            (1)     (2)     (3)     (4)       (5)     (6)      (7)     (8)
            Mean    StDev   Min     Max       Mean    StDev    Mean    StDev
Degree      6       4       1       24        6       4        6       4
Indegree    3       3       0       19        3       3        3       3
BC          0.8     0.6     0.002   0.129     0.83    0.6      0.78    0.65
N           1450                              737              713
Figure 1: Network of one school - Centrality
Each circle represents a student. The size of each circle is based on the individual degree figure, while the colour is based on BC metric.
4 Endogeneity, Estimation and Identification of Networks Effects
Various empirical methodologies have been proposed in the literature to estimate and
identify the impact of social connections on the outcome of interest. Indeed, the estimation and identification of the impact is complicated by various forms of endogeneity,
such as reverse causality and omitted-variables. All these strategies aim to identify the
effects of networks (or more broadly agent interactions with peer groups) on the final
outcome. The interpretation of this effect depends on both the empirical methodology
and the research question. The most common approach used in the literature so far is
the evaluation of the impact of social relations through the estimation of peer effects, i.e.
the impact of peers’ or friends’ outcomes (or behaviour) on the individual’s outcome. This is
different from attempting to capture the effect of network skills on individual outcomes.
We suggest that inter- and intra-personal skills are a developmental form of human capital, which yields a return in later life. The peer group modelling literature is typically
attempting to measure the contemporaneous impact of who you are grouped with, e.g.
in a class at school. However, both these perspectives are based on modelling patterns
of relational information, and present similar issues and challenges. It is therefore interesting to compare the different specifications and solutions developed by authors in the
previous literature. We critically implement and compare these different approaches and
suggest the most appropriate for our analysis. This allows us to compare our results and
identification assumptions with ones in the previous literature. In the attempt to analyse
the key factors for labour market outcomes, endogeneity is a common issue to deal with,
and thus it is important to examine its relevance for social network metrics. Schooling
and education decisions are standard examples in the literature of endogenous variables,
and their endogeneity has been extensively studied. Card (2001) has reviewed the theory
and the methods that authors have used to circumvent this problem. In fact, people
decide how many years to stay in education thinking (also) about the returns they will
obtain in the future because of it. Furthermore, those individuals with higher unobserved
abilities will choose to invest more in education due to its lower cost for them, but they
will also earn more because of both higher educational levels and higher ability.
Thus, for estimation and methodological purposes, it is important to clarify whether
network metrics are also subject to the same endogeneity problem. In this work, social network metrics are adopted as a proxy measure of social and non-cognitive skills.
Non-cognitive skills per se cannot be directly measured but a person’s position in the
network and the extent of their interconnectedness can approximate those social skills.
The implicit assumption is that the relationship between social skills and network metrics is positive and monotone in expectation: on average, the higher the social skills, the
more central one would be to the network. This proxy is measuring something that was
unobserved (or proxied differently) before.
It could be argued that the network variable, as a proxy for non-cognitive ability, not
only does not suffer from a standard endogeneity problem, but instead reduces the
unobservables and therefore alleviates the omitted variable bias. That is, network metrics
can be considered not as related to, or caused by, unobservable ability, but
as an (approximate) measure of that same ability. Indeed, it is quite possible that less
able people may have more ’friends’ and ’hang out’ in bigger groups. This is
in line with the approach adopted in the literature so far, where authors typically do not
consider social skill proxies as suffering from an endogeneity problem (see, for
instance, Hill, 2012). By contrast, Conti et al. (2013), who use a basic network measure as a
proxy for popularity, consider network metrics as endogenous and employ a pseudo-likelihood
estimation procedure.
This non-standard potential form of endogeneity of the network metrics is characterised by the time which elapses between the friendship nominations and when the
earnings are reported. Indeed, network metrics are based on nominations recorded at
Wave I of AddHealth, in 1994, whilst earnings are reported at Wave IV, in 2008. It can
be argued that this time-lag helps to exclude the possibility of an endogeneity problem.
It seems implausible to think that individuals would invest in friendships at the age of
15 with the aim of higher returns thirteen years later. The network relationships (and
thus the network metrics) can be relevant for earnings if individuals maintain the same
friends over time and ’use’ them as a job search channel. Indeed, as Galeotti and Merlino
(2010) and Marmaros and Sacerdote (2002) underline, it is possible that pre-existing
networks favour individuals later on in life, but a necessary condition for this is that the
link between the individuals remains active. This is not realistic in our data,
as only 6% of the respondents report having maintained some or most of their friends
from high school in 2001. It seems implausible that adolescents, when deciding
whether to make an additional friend at school, would be thinking about the possible economic
returns of the friendship more than ten years later. Also, even though friendships can
last for decades, it is difficult to believe that they remain relevant for job
opportunities. Tastes and preferences over friendships change over time, as do the
characteristics of the individuals one interacts with. Furthermore, friends met during
high school are different from friends met in adulthood, and it
is plausible that their potential for one’s working career is different. As
Levitt (2009) underlines, “What a 1957 friend could do for you might be very different
than what a 2009 friend can do for you”. This is also confirmed by the results of Conti
et al. (2013): even when the authors control for social capital and current job occupations,
the magnitude and the significance of the social network measures are not affected.
Furthermore, in their analysis the effect is stronger for those who migrate from
their state of origin, and who are therefore less likely to take advantage of high school networks
for job hunting.
However, it is difficult to argue for complete exogeneity of any network metric, since
they might still suffer from an omitted variable bias. For example, some unobservable
(to the researcher) family characteristics might be related to both the social skills and
earnings later in life.
This section is then organised as follows: each empirical specification is described, together with its identification assumptions. Then results of all specifications are compared
in Table 8.
4.1 Two Stage Exclusion Restrictions Model
We first adopt a two-stage approach to eliminate this endogeneity bias. It builds
on the pseudo-likelihood estimator proposed by Conti et al. (2013). The purpose of their
methodology is to estimate the impact of social skills, in this case ’popularity’, on earnings
in adult life. For this purpose the authors adopt some network metrics, such
as indegree, as predictors of wages. Their study uses the Wisconsin Longitudinal Study,
which includes a measure of the individual propensity to nominate friends. However, nominations were
limited to a maximum of three, and questions were posed during adulthood referring to
high-school friends. Therefore, the information is truncated and is likely to suffer from
recall bias, since adult respondents were asked about friendships
at the time of high school (thus, eighteen years earlier). To overcome these limitations,
the authors introduced a pseudo-likelihood approach with a hypergeometric distribution
for their estimations. The pseudo-likelihood function is specified as follows:12
$$L = \prod_{i=1}^{I} P(outdeg_i, indeg_i, w_i \mid x_i, z_i) \tag{4}$$
where $outdeg_i$ is the individual outdegree measure, $indeg_i$ the indegree, $w_i$ the individual
wage, $x_i$ a set of covariates for social preferences, and $z_i$ other individual covariates. This
function estimates the probability of realisation of both individual popularity and earnings.
In Conti et al. (2013), the effect of popularity on earnings is estimated to be around 2%
for each additional friend during high school. The AddHealth dataset does
not have the problem of recall bias, and allows for more friend nominations. Therefore,
the choice of the hypergeometric distribution would be inappropriate. However, we
follow the original contribution in adopting a non-normal distribution for the network
metrics, and we adopt a negative binomial distribution. In our paper, instead
of a pseudo-likelihood, we introduce a two-stage estimator. The two-stage model
offers the advantage of tackling directly the probable endogeneity issue, which arises from the
possible relation of the network measures with the individual unobservables. Identification
in our estimations is ensured through both functional form and exclusion restrictions.
4.1.1 Empirical Specification
The two stage exclusion restriction model is set out in the model of equations (5)-(9).
The first stage estimates the determinants of the network measures, while the second
stage includes the predicted value of the first stage as an explanatory variable for the
earnings outcome. The dependent variable ln(earnings) of interest is estimated as a
function of various determinants, in broad terms: individual characteristics, school characteristics and family characteristics. The variables included are all variables that might
influence the social skills of each individual. In addition to the usual characteristics, such
as age, gender, religion and ethnicity, the X vector includes individual traits such as height,
having been breastfed, time spent in front of the TV, self-perception, participation
in social activities at school, and smoking behaviour. These behaviours influence the
time and opportunities that one has to interact with others. An alternative specification
also includes a control for ability using the AddHealth Picture Vocabulary Test score, a
measure of verbal IQ: this is not included in the main specification since it causes the loss
of many observations from the dataset. Vector S includes school characteristics: urbanity,
size, ethnic proportions, smoking habits in the school, class size, grade proportions
and whether the school is public or private. In addition, it includes the homophily
measures previously described. Lastly, vector F includes variables for family characteristics,
such as the number of siblings, living with both parents, parents’ education, not speaking
English at home and family background, and the active participation of parents in
parent-teacher associations.
12 The notation is adapted to the one used in the rest of the paper.
The second stage uses the results from the first stage to model adult life outcomes.
The aim is to capture how important social skills are for future life earnings. As is
typically done in the literature, the dependent variable is the log of personal earnings
rather than their absolute value.
$$N_1 = \beta_0 + X\beta_1 + Z\beta_2 + S\beta_3 + F\beta_4 + s + g + \varepsilon_1 \tag{5}$$
$$\ln(earnings) = \gamma_0 + \hat{N}_1\gamma_1 + X\gamma_2 + S\gamma_3 + F\gamma_4 + s + g + \varepsilon_2 \tag{6}$$
Where N̂1 is the predicted value of the network variable measuring the social skills,
and s and g are school and grade fixed effects. The parameter of interest is represented
by the coefficient γ1 .
The underlying identification assumptions are:
$$Cov(\varepsilon_1, \varepsilon_2) = 0 \tag{7}$$
$$E[\varepsilon_2 \mid \hat{N}] = 0 \tag{8}$$
and we assume for Z in equation (5) in the exclusion restriction that
$$Cov(Z, \varepsilon_2) = 0 \tag{9}$$
Formally, of course, this system can be identified by parametric functional form assumption on the joint distribution of ε1 and ε2 , for example bivariate normality. However,
Goldberger (1983) has shown us that this can give rise to inherent instability in the estimated parameters. Hence, the validity of this identification strategy is dependent on the
suitability of the exclusion restrictions.
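The mechanics of the two stages in equations (5) and (6) can be sketched on simulated data. This is a stylised illustration under stated assumptions, not the estimation carried out in the paper: the data are simulated, all variable names and coefficients are hypothetical, and the first stage is linear (as for the BC measure) rather than negative binomial. The simulation deliberately lets an unobserved trait enter both equations, the case in which a naive regression is biased and the exclusion restriction Cov(Z, ε2) = 0 does the identifying work.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

ability = rng.normal(size=n)  # unobserved by the researcher (hypothetical)
x = rng.normal(size=n)        # observed control, stands in for X, S, F
z = rng.normal(size=n)        # excluded variable, stands in for Z

# Simulated first-stage and outcome equations; true effect of the network is 0.3.
network = 1.0 + 0.5 * x + 0.8 * z + ability + rng.normal(size=n)
log_earn = 2.0 + 0.3 * network + 0.2 * x + ability + rng.normal(size=n)

def ols(y, *cols):
    # Least-squares coefficients with an intercept in position 0.
    A = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Stage 1: predict the network measure from controls and the excluded variable.
b = ols(network, x, z)
network_hat = b[0] + b[1] * x + b[2] * z

# Stage 2: regress log earnings on the predicted network measure and controls.
gamma_two_stage = ols(log_earn, network_hat, x)[1]
gamma_naive = ols(log_earn, network, x)[1]

print(f"two-stage: {gamma_two_stage:.3f}, naive OLS: {gamma_naive:.3f}")
```

In this simulation the naive coefficient is pushed upwards by the unobserved trait, while the two-stage coefficient stays close to the simulated value. Standard errors from a manually run second stage are not valid as they stand and would need to be corrected, for example by bootstrapping both stages.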
4.1.2 Functional Form
The number of nominations given or received (or the combination of the two) by each
respondent is a non-negative discrete number. Therefore, a count data empirical model
can be appropriately used for the estimations with the choice of the model dictated
by the distribution of the dependent variable. Figure A1 in the Appendix shows the
distribution of the degree in our sample, which appears to be left-skewed compared to a
normal distribution. This particular feature of the dependent variable needs to be taken
into account to obtain unbiased estimates. This is something that the network literature
has acknowledged with the use of the Poisson distribution in the estimations. Instead,
we argue that the Negative Binomial distribution fits relational data better, since it can
account for the over-dispersed nature of the data.13 Figures A2 and A3 in the Appendix
show that the Negative Binomial is a superior choice for the distribution of both degree
and indegree. Lastly, the BC measure is a continuous variable and the estimation is done
with a simple OLS model, assuming a normal distribution. To facilitate interpretation
and inference, predicted values are standardised for all the network variables.
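The over-dispersion argument can be checked with simple arithmetic on the sample moments of degree reported in the accompanying footnote (mean 6.35, variance 18). The sketch below is illustrative only: the method-of-moments dispersion parameter assumes the common NB2 parameterisation Var = μ + αμ², which the paper does not specify, and the vector of predicted values is hypothetical.

```python
import numpy as np

mean_degree = 6.35  # sample mean of degree, as reported in the text
var_degree = 18.0   # sample variance of degree, as reported in the text

# Under a Poisson model the variance equals the mean, so this ratio would be 1.
dispersion_ratio = var_degree / mean_degree

# Assumed NB2 parameterisation: Var = mu + alpha * mu**2 (method of moments).
alpha_mom = (var_degree - mean_degree) / mean_degree ** 2

def standardise(values):
    # Rescale to mean 0 and standard deviation 1.
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

predicted = np.array([4.2, 5.1, 6.0, 7.4, 9.3])  # hypothetical predicted degrees
z_scores = standardise(predicted)

print(f"variance/mean = {dispersion_ratio:.2f}, alpha = {alpha_mom:.3f}")
```

A ratio well above 1 is what motivates the Negative Binomial over the Poisson; standardising the predicted values means a second-stage coefficient can be read as the effect of a one standard deviation increase.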
13 In fact, the Poisson model assumes that links are formed independently with equal probability.
Clearly this is not a sensible assumption for friendship relationships, especially in a context
such as high school. Furthermore, the Poisson distribution assumes equidispersion (Cameron and
Trivedi, 2005), i.e. the equality of mean and variance. If this does not hold, the estimated
coefficients can still be consistent, but the standard errors will be biased. Since the variance
of degree in the sample is 18, while the mean is 6.35, our data exhibit over-dispersion.
4.1.3 Exclusion Restrictions
Exclusion restrictions are those factors influencing the network metrics, but not
directly influencing the wages at Wave IV. In the baseline specification, the main exclusion
restrictions adopted are parents’ participation in the parent-teacher association (PTA), and
the terms capturing homophily. Homophily is the tendency of individuals to relate with
others similar to them along various dimensions, such as socio-economic status, ethnicity,
gender, and grade (Jackson, 2014; Currarini et al., 2009; Burgess et al., 2011; Nahemow
and Lawton, 1975).
The psychology and psychiatry literature has documented the heritability of social
skills: parents’ social abilities can account for up to 68% of the variance of children’s
social skills (Scourfield et al., 1999). Therefore, a proxy for parental non-cognitive skills
(in the absence of network information) is potentially a highly relevant exclusion restriction
for our model. PTA participation measures the involvement of parents with the school
environment and can be considered a good proxy for their relational skills, especially
since we also control for socioeconomic status and education.14 Furthermore, parental
involvement at school can reasonably be considered exogenous for the students, apart
from its link with relational abilities. It does not seem plausible
that it could have a direct influence on earnings measured more than a decade later.
Indeed, previous literature has found effects of parents’ participation only on educational
outcomes (something for which we control in the estimation): participation can signal a
high interest of parents in the schooling results of their children. We are not aware
of any study so far that has demonstrated a direct effect of PTA participation on labour market outcomes.
The homophily phenomenon has been widely documented in both the sociology and the social
network literature, and it has also been called assortative matching in the economics
literature (see, among others, Currarini et al. (2009); Jackson (2014); Nahemow and Lawton
(1975); Moody (2001, 2012); Go et al. (2012); Burgess et al. (2011)). It essentially
consists in the preference of people to establish friendships or social relations with others
whose characteristics resemble their own along different dimensions, such as ethnicity,
age, beliefs, and others (Currarini et al., 2009; Austen-Smith and Fryer, 2005). A crucial
assumption here is that homophily cannot directly affect earnings in adult life (thirteen
years later), apart from indirectly affecting social skills per se. Homophily measures are
modelled on the basis of Conti et al. (2013) as the percentage of students in the same
school presenting a certain characteristic multiplied by an individual dummy capturing
whether the individual presents that characteristic or not. The terms included in our
empirical specifications control for homophily with respect to ethnicity, i.e. white, black, Asian,
and other minorities, and with respect to smoking. Grade fixed effects are also included to control
for within-grade homophily, as well as school fixed effects, which work as network fixed
effects to control for heterogeneity.
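The construction of a homophily term described above (school share of a characteristic multiplied by the individual dummy) can be sketched as follows. The records, field names and function name are hypothetical, and the school share is assumed here to include the individual herself, which the text does not specify.

```python
from collections import defaultdict

# Hypothetical student records: school identifier and a 0/1 characteristic.
students = [
    {"school": "A", "white": 1},
    {"school": "A", "white": 1},
    {"school": "A", "white": 0},
    {"school": "A", "white": 1},
    {"school": "B", "white": 0},
    {"school": "B", "white": 1},
]

def homophily_terms(rows, trait):
    # Share of students with the trait in each school (assumption: includes self).
    totals = defaultdict(int)
    counts = defaultdict(int)
    for r in rows:
        totals[r["school"]] += 1
        counts[r["school"]] += r[trait]
    # Term = school share * individual dummy, so it is zero without the trait.
    return [counts[r["school"]] / totals[r["school"]] * r[trait] for r in rows]

print(homophily_terms(students, "white"))
```

The term is zero for students without the characteristic and increases with the share of similar schoolmates for those who have it, so it varies both across schools and across individuals within a school.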
14 Therefore, we control for the possibility that parents are not involved in PTA because of income
constraints or because of low education levels.
4.1.4 Alternative Specifications
Two alternative specifications can be adopted to further verify the robustness of
the two-step model results. They both use the longitudinal information on the network
estimates contained in the dataset. The first alternative predicts the metrics computed
at Wave II, controlling for the metrics at Wave I in the first stage. This allows us to
control further for unobserved heterogeneity between individuals in the same school.
Furthermore, the exclusion restrictions include not only PTA and homophily, as in the
baseline specification, but also two terms capturing geographical distance. The first one
captures the distance from the school, while the second one captures the average distance
from other students attending the same school. These two factors are relevant for social
skills because physical distance influences the opportunities and the chances to interact
with others. For instance, living closer to school makes students less dependent
on transportation and on parents, and favours participation in after-school activities
and interactions with other students outside school time. Living closer to other students is
likely to favour time spent together on the bus or during the weekend. Indeed, many studies
have documented that meeting opportunities are a key factor in the formation of social
relations, and that they also influence the homophily preferences of the individual, causing
the so-called meeting homophily (Advani and Malde, 2014; Moody, 2001). It is hard to
argue that these physical distances can directly influence earnings after having
controlled for socio-economic background and school location and characteristics.
$$N_2 = \beta_0 + N_1\alpha_1 + X_1\beta_1 + Z\beta_2 + S\beta_3 + F\beta_4 + s + g + u_1 \tag{10}$$
$$\ln(earnings) = \gamma_2 + \hat{N}_2\alpha_1 + X_2\gamma_2 + S\gamma_3 + F\gamma_4 + s + g + u_2 \tag{11}$$
The underlying assumptions of this model are similar to before, namely:
$$Cov(u_1, u_2) = 0 \tag{12}$$
$$E[u_2 \mid \hat{N}_2] = 0 \tag{13}$$
$$Cov(Z, u_2) = 0 \tag{14}$$
The logic of their validity is the same as before with equations (5) and (6), only now
we are suggesting that our underlying relationship will need to be more robust if it is to
hold true between Wave II and Wave IV, as well as between Wave I and Wave IV.
The second alternative exploits the difference between the network metrics in Wave
I and Wave II in the first stage, i.e. ∆N = N2 − N1 , and the difference between the
earnings in Wave III and Wave IV in the second one. This strategy controls for unobserved
heterogeneity in both social skills and earnings. Again the first stage includes the terms
capturing geographical distance from both school and peers. So here we are suggesting
that the change in networking between Wave I and Wave II should affect the change in
earnings between Wave III and Wave IV.
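$$\Delta N = \beta_0 + X_1\beta_1 + Z\beta_2 + S\beta_3 + F\beta_4 + s + g + \eta_1 \tag{15}$$
$$\Delta Earnings = \gamma_2 + \widehat{\Delta N}\delta_1 + X_2\gamma_2 + S\gamma_3 + F\gamma_4 + s + g + \eta_2 \tag{16}$$
The underlying assumptions of this model are:
$$Cov(\eta_1, \eta_2) = 0 \tag{17}$$
$$E[\eta_2 \mid \widehat{\Delta N}] = 0 \tag{18}$$
$$Cov(Z, \eta_2) = 0 \tag{19}$$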
These assumptions are much more plausible as the common underlying unobserved
heterogeneity between Wave I and Wave II is being netted out. By construction,
equation (15) would include a term in (u1 − ε1 ) (from differencing equation (10) and
equation (5)) and so net out any common unobserved heterogeneity. Likewise, equation (16)
would net out for (u2 − ε2 ) (from differencing equation (11) and equation (6)).
Such differencing is demanding of the data, and it will be reassuring if our networking
measures and their differences still prove to be significant in this model.15
The rest of the models considered here are fundamentally different, since they aim to
capture the effect of contemporaneous outcomes on the individual outcome, i.e. peer
effects. Nevertheless, the comparison is interesting since it allows us to check whether the
effect captured in our estimates is related to non-cognitive skills or rather to peer effects.
All the following models are very data demanding, and thus we have included in the
estimations only a selection of covariates: age, ethnicity, gender, education, and financial
background.16 Furthermore, we have specified networks in two different ways, directed
and undirected: this changes the way the adjacency matrix is specified (more details are
provided in the Appendix ??). In essence, in the undirected network specification, the
’influence’ is reciprocal, even though one of the two might not have considered the relationship
worth mentioning. In the directed network case, only the nominating person is affected,
and not the nominee (unless the nomination is reciprocated).
4.2 Linear-in-Means Model
Equation (20) follows the standard Linear-in-Means (LiM) specification. This model
is typically adopted in the peer effect literature (see, among others, Sacerdote (2001,
2011); Epple and Romano (2011)), and estimates the impact that peers’ behaviours and
characteristics have on the individual outcome. The specification introduces average peers
behaviours and average peers characteristics as covariates for the individual outcome.
$$Y = \beta_0 + \beta_x X_i + \beta_{\bar{y}} \bar{Y}_i + \beta_{\bar{x}} \bar{X}_i + \eta_i \tag{20}$$
The main coefficient of interest is the impact of the average peers’ outcome on the individual outcome, and this is defined as the ’endogenous effect’ (βȳ ). The effects of peers’
characteristics instead are defined as the ’exogenous effects’ (βx̄ ), while the ’contextual’
effects include the influences from the environment that cause both the individual and
the peers to behave similarly. Manski (1993) has shown that this model presents some
fundamental identification issues. The most important challenge is the ’reflection’ effect:
with simultaneous behaviours, it is not possible to distinguish the direction of the effect,
i.e. whether the respondent influences the others’ behaviour or vice versa. As Bramoullé
et al. (2009) underline, a major assumption of this model is that individuals interact in
groups, and do not connect with individuals outside them.
Furthermore, Angrist (2014) shows that, although the LiM models may be identified,
the resultant peer effects might be the product of a measurement error bias, rather than
true behavioural effects.
15 Formally, we are not subtracting (5) from (10), as we would not have any regressors.
16 Results for the two stage exclusion restrictions model with only the selected covariates are reported
in Appendix C.
Specification (21) enables us to compare the LiM model with a model of network effects:
$$\ln(earnings) = \alpha_0 + \overline{\ln(earnings)}\alpha_1 + X_2\alpha_2 + \bar{X}\alpha_3 + BC\alpha_4 + s + g + u \tag{21}$$
where $\overline{\ln(earnings)}$ and $\bar{X}$ are respectively the averages of the peers’ earnings and
characteristics, and s and g are the contextual effects, respectively school and grade
fixed effects. The coefficients of interest are α1 , which represents the endogenous effect,
and α3 , which measures the exogenous effect.
It should be noted that the dataset does not include class-level information, and this
constitutes a substantial limitation to the estimation of a LiM model in our data. Indeed,
the peer group in this application includes all the individuals in the school with the same
age and in the same grade. However, this does not necessarily imply that the group contains
only students in the same class. This introduces a high level of measurement error in the
measure of peers’ earnings. Results should therefore be considered with caution. We also
applied the LiM model using nominated friends as the reference peer group. In this case,
instead of the averages over all students in the same school and grade, $\overline{\text{earnings}}$
and $\bar{X}$ indicate the averages of the nominated friends’ earnings and characteristics.
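As an illustration of how the peer averages entering specification (21) could be built, the sketch below computes, for each student, the mean earnings of the other students in the same school and grade. It is a minimal sketch under stated assumptions: the record layout, the field names (`school`, `grade`, `earnings`) and the choice to exclude the student's own value from the average are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def leave_one_out_peer_means(records, value_key, group_keys=("school", "grade")):
    """For each record, average value_key over the other members of its group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in group_keys)
        totals[group] += r[value_key]
        counts[group] += 1

    means = []
    for r in records:
        group = tuple(r[k] for k in group_keys)
        n_others = counts[group] - 1
        # A student who is alone in a school-grade cell has no peers: report None.
        means.append((totals[group] - r[value_key]) / n_others if n_others > 0 else None)
    return means


# Hypothetical data: earnings in thousands of dollars.
students = [
    {"school": "A", "grade": 10, "earnings": 20.0},
    {"school": "A", "grade": 10, "earnings": 30.0},
    {"school": "A", "grade": 10, "earnings": 40.0},
    {"school": "B", "grade": 10, "earnings": 25.0},
]
peer_means = leave_one_out_peer_means(students, "earnings")
print(peer_means)
```

The same function can be applied to each covariate to obtain the corresponding peer average of characteristics; replacing the school-grade cell with the set of nominated friends gives the friends-based variant.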
4.3
Basic Network Model
Equation (22) follows a set of alternative strategies proposed by various authors such as
Calvó-Armengol et al. (2009); Bramoullé et al. (2009); Goldsmith-Pinkham and Imbens
(2013). This approach recognises that observations cannot be considered as independent
from each other, and so the standard empirical models are not identified (Goldsmith-Pinkham
and Imbens, 2013). The relational (or adjacency) matrix G facilitates identification by
taking into account the dependence between the observations.
The adjacency matrix G is constructed as follows: individuals are reported on both
columns and rows, and the matrix itself contains only binary terms, 1 if there exists a
relation between two individuals, and 0 otherwise. Powers of the matrix account for
indirect connections between individuals: $G^2$ captures individuals connected through two
steps, $G^3$ through three steps, and so on. Each entry (i, j) of $G^n$ is equal to the number
of walks of length n that lead from individual i to individual j.
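The construction of G and of its powers can be sketched as follows. The four students and their nominations are hypothetical, and the matrix is treated as directed (row i nominates column j), which is an assumption of this example. The last lines also form the product GY, the sum of the outcomes of each student's nominated friends, which is the kind of term that enters the network models below.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def adjacency_from_nominations(n, nominations):
    """Directed 0/1 adjacency matrix: G[i][j] = 1 if student i nominates student j."""
    g = [[0] * n for _ in range(n)]
    for i, j in nominations:
        g[i][j] = 1
    return g


# Hypothetical nominations among four students: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3.
G = adjacency_from_nominations(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
G2 = mat_mul(G, G)    # entry (i, j): number of two-step walks from i to j
G3 = mat_mul(G2, G)   # entry (i, j): number of three-step walks from i to j

# Network lag of a (hypothetical) outcome vector, stored as a column.
Y = [[10.0], [10.5], [11.0], [9.5]]
GY = mat_mul(G, Y)

print(G2[0], G3[0], GY)
```

In this example student 0 reaches student 3 in two steps (through 2) and also in three steps (through 1 and then 2), which is what the non-zero entries in the first rows of `G2` and `G3` record.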
Bramoullé et al. (2009) have shown that, once the group interaction assumption
present in the linear-in-means model is relaxed, the model can be identified. In this case,
each individual has his or her own reference group, composed of the individuals to whom he or
she is connected. The variation in the individual reference groups makes identification
possible, through what the authors define as ’natural exclusion restrictions’: friends of
friends who are not themselves related to the individual. In mathematical terms, this model
can be expressed as follows:
$$Y = \alpha_0 + X\alpha_1 + GY\alpha_2 + GX\alpha_3 + s + g + \varepsilon \qquad (22)$$
If there are correlated effects, identification is possible if and only if I, G, $G^2$, $G^3$ are
linearly independent. Indeed, if GG = G, so that peer groups partition the population, their
model is essentially equivalent to Manski’s LiM model (Goldsmith-Pinkham and Imbens,
2013).
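The linear independence condition can be checked numerically. The sketch below is only an illustration on two small, hypothetical undirected networks with binary (not row-normalised) matrices; it flattens I, G, G^2 and G^3 into vectors and computes their rank by exact Gaussian elimination. The function names are ours.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rank(vectors):
    """Rank of a list of equal-length vectors, using exact rational arithmetic."""
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def powers_linearly_independent(g, max_power=3):
    """True if I, G, ..., G^max_power are linearly independent as matrices."""
    n = len(g)
    mats = [[[1 if i == j else 0 for j in range(n)] for i in range(n)]]
    for _ in range(max_power):
        mats.append(mat_mul(mats[-1], g))
    flattened = [[x for row in m for x in row] for m in mats]
    return rank(flattened) == len(mats)


# A line of four students (0-1-2-3): friends of friends are not always friends.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
# A closed group of three students who are all friends with each other.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

print(powers_linearly_independent(path), powers_linearly_independent(triangle))
```

In the closed group every friend of a friend is already a friend (or the individual herself), so G^2 = 2I + G and the condition fails; in the line network the intransitive links provide the variation that the condition requires.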
Goldsmith-Pinkham and Imbens (2013) also use the individual adjacency matrix to ensure
identification, and they enrich the model with first an exogenous, and then an
endogenous, network formation element. This is specified as the individual utility of the
link between two agents in the network, as a function of the difference between the two
in terms of both observable and unobservable characteristics. The empirical specification
in Goldsmith-Pinkham and Imbens (2013) includes a likelihood function combining both
the likelihood of the outcome and the likelihood of the network realisation. However, the
analysis is constrained to only one outcome and one covariate.
4.4
Network Centrality Model
Calvó-Armengol et al. (2009) use network location as an estimate of the equilibrium
behaviour for the estimation of peer effects. More specifically, the Bonacich centrality
metric can capture the Nash equilibrium strategy of each individual, and thus the intensity
of peer effects on the outcome can be accounted for by the individual network position.
The individual outcome can then be separated into an idiosyncratic part and the ’peer
effects’ part. The peer effects part can be estimated using the Bonacich centrality metric.
However, this result is valid only if the individual utility function is linear-quadratic in
both the individual effort and the peer effort. The individual equilibrium outcome can
then be written as:
$$y_i^*(x, g) = \theta_i(x) + \frac{\mu}{\phi}\, BC_i \qquad (23)$$
where $\theta_i(x)$ represents the observable differences between individuals, $BC_i$ is the
individual Bonacich Power Centrality, φ is a scalar representing the intensity of indirect
peers’ influences, and µ is a scalar weighting the influence of the peers’ efforts in the
individual utility function (see Calvó-Armengol et al. (2009) for the complete explanation of
the equation). Referring to Manski’s terminology, $\theta_i(x)$ represents the contextual effect,
network fixed effects the correlated effect, and the BC the endogenous effect. The ’reflection
problem’ is avoided thanks to the individually specified reference group, as in Bramoullé et al.
(2009). Identification is then ensured if there are at least two agents with different average
connectivity of direct friends. As a result, the empirical specification includes individual
characteristics, contextual effects, individual BC, and a spatial error term to account for
correlation in the unobservables. Results in the study by Calvó-Armengol et al. (2009)
show that a one standard deviation increase in BC increases school performance by
7% of a standard deviation.
Model (24) includes the BC, the adjacency matrix multiplied by the earnings vector,
and the adjacency matrix multiplied by the friends’ covariates.
$$Y = \alpha_0 + X\alpha_1 + GY\alpha_2 + GX\alpha_3 + BC\,\alpha_4 + s + g + \varepsilon \qquad (24)$$
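For readers unfamiliar with the centrality measure entering equations (23) and (24), the following sketch computes a Bonacich power centrality on a small hypothetical network. It assumes the Bonacich (1987) form b = (I - βG)^(-1) G 1 without any rescaling, and a decay parameter β = 0.1, which is our reading of the ’BC 0.1’ label used in the tables; the exact normalisation used in the estimations may differ.

```python
def bonacich_centrality(g, beta=0.1, tol=1e-12, max_iter=10_000):
    """Solve b = G 1 + beta * G b by fixed-point iteration.

    The iteration converges when beta is smaller than the reciprocal of the
    largest eigenvalue of G; the limit equals (I - beta G)^(-1) G 1.
    """
    n = len(g)
    degree = [float(sum(row)) for row in g]   # G 1: number of direct links
    b = degree[:]
    for _ in range(max_iter):
        new_b = [degree[i] + beta * sum(g[i][j] * b[j] for j in range(n))
                 for i in range(n)]
        if max(abs(x - y) for x, y in zip(new_b, b)) < tol:
            return new_b
        b = new_b
    raise ValueError("no convergence: beta is probably too large for this network")


# Hypothetical star network: student 0 is linked to students 1, 2 and 3.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
bc = bonacich_centrality(star, beta=0.1)
print([round(x, 4) for x in bc])
```

The central student scores higher both because she has more direct links and because, with a positive β, each link is worth more when it leads to a well-connected student. Standardising the resulting vector gives a measure expressed in standard deviation units.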
The coefficient of interest in this model will be α4, which will be a direct estimate of the
network effect, conditional on the possible indirect interactions between the individuals
which come through the regressors and the friends’ values of the regressand. Specifically, we
require that the presence of these network terms conditions out the endogeneity, i.e. that
$$E(\varepsilon \mid GY, GX) = 0 \qquad (25)$$
4.5
Network Intensity Model
The last estimation model used the adjacency matrix in its identification. However, in a
very important sense the simple adjacency matrix is a misspecification. Specifically, a
person’s connection with anyone else is not appropriately captured by a zero or a one. Not all
friends are equivalent or equal, and if we assume that they are, this amounts to a form of
measurement error. Thus, we propose an intensity model, in which the adjacency matrix
is not specified as only binary. Additional information is used to enrich the informational
content of the matrix.
In this way, the dependency between the observations is weighted by the information
on the ’ranking’. This is therefore a relaxation of the limiting assumption that each
friendship matters the same for the individual, an assumption required for the LiM models
(Goldsmith-Pinkham and Imbens, 2013).
Indeed, using only binary values for the description of friendship seems to be limiting the
descriptive power of the estimations. It offers no information on friendship intensity, or
the strength of the interdependency between agent i and j. Thanks to the richness of the
dataset, we can overcome this problem.
Specification (26) weights the matrix components by the individual ranking of the
nominated friends. The friend nominated first has the highest weight, and the last
the lowest (see Appendix for details). The weighted adjacency matrix R is then obtained. An
alternative specification is obtained when α4 is set to zero.
$$Y = \alpha_0 + X\alpha_1 + RY\alpha_2 + RX\alpha_3 + BC\,\alpha_4 + s + g + \varepsilon \qquad (26)$$
Of course, this model assumes that
$$E(\varepsilon \mid RY, RX) = 0 \qquad (27)$$
and that again the inclusion of interaction information between friends ’washes out’
any possible endogeneity. The additional feature here over assumption (25) is that we are
now allowing a much more flexible form of network relationship, which captures not only
who your friends are, but also the strength of these associations. We will see later in
our results that this change causes some slight downward revision in our estimates of the
networking effect, which is reassuring.
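To fix ideas on how a rank-weighted matrix R differs from the binary matrix G, the sketch below builds R from ordered friendship nominations. The weighting scheme is purely hypothetical: the k-th listed friend receives the weight (K - k + 1) / K, with K an assumed maximum number of nominations. The weights actually used in the estimations are described in the Appendix and may differ.

```python
def rank_weighted_matrix(n, ranked_nominations, max_nominations=5):
    """Build R, where R[i][j] is larger the earlier student i lists student j.

    Hypothetical linear weights: the k-th listed friend (k = 1, 2, ...) gets
    (max_nominations - k + 1) / max_nominations; everyone else gets 0.
    """
    r = [[0.0] * n for _ in range(n)]
    for i, friends in ranked_nominations.items():
        for k, j in enumerate(friends[:max_nominations], start=1):
            r[i][j] = (max_nominations - k + 1) / max_nominations
    return r


def weighted_lag(r, y):
    """Row i: weighted sum of the outcomes of the friends nominated by i."""
    return [sum(w * yj for w, yj in zip(row, y)) for row in r]


# Hypothetical ordered nominations: student 0 lists 2 first and 1 second, and so on.
nominations = {0: [2, 1], 1: [0], 2: [3, 0, 1]}
R = rank_weighted_matrix(4, nominations)
RY = weighted_lag(R, [10.0, 10.5, 11.0, 9.5])
print(R[0], R[2])
```

Setting every non-zero weight to one recovers the binary matrix, so the binary specification can be read as the special case in which the ordering of nominations carries no information.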
5
Results
5.1
Two stage exclusion restrictions Model
This section describes the results from both the first stage and the second stage of
the estimation. Our results show that in the first stage after-school activities normally
engaged in at the High School, and ethnicity homophily are the most significant factors
determining intensity, while family background is found to have only a small impact.
Results do not vary too much across the different specifications.
The second stage examines the impact of the different network measures on adult life
outcomes, namely earnings. BC has a consistently positive and significant return on
earnings. Non-cognitive skills are found in particular to favour white and male students
and those working in lower paying jobs. In addition, there is evidence that partially
sampled networks are likely to give biased estimates when compared to fully sampled
networks. This section illustrates the results in detail and then comments on the main
findings, comparing them with the previous literature.
5.1.1
Factors influencing Social Skills
The results obtained from the first stage of our regressions are interesting since
they indicate which factors are important for the development of the network measures. The
estimation results of this first stage are reported in Table B1 in the Appendix. Key
factors are those variables that facilitate opportunities to interact with others and thus
to improve social skills. Indeed, participating in sports clubs, not being fit, not being
religious, and having repeated a grade all have a statistically significant effect on centrality
in the network. There is also evidence of the importance of homophily among
smokers and among ethnic minority students. Also, the participation of parents in the
parent-teacher association is positive and highly significant. One potentially counterintuitive
result is the positive sign attached to whether one watches TV more than five times a week,
as one might expect those with more friends to watch less TV. But this may not be true
if watching common TV programs in this era (e.g. the soap Friends) is important for
social bonding. However, since there is no information on when and how TV is watched,
it is difficult to interpret this coefficient precisely. It might be that TV is watched during
the evenings, and that it can offer a topic of conversation with others during the school
day. Overall, after-school activities and the ethnic composition of the schools seem to be
relevant factors for the development of non-cognitive skills by the students. These results are
consistent with the analyses by Putnam (2001, 2015). He underlines how great changes in
American society are affecting social capital and social interactions. He also suggests
that increasing inequality is facilitating the participation of children from high-income
families in after-school activities, leaving behind their classmates from low-income families.
This suggests that policies which facilitate networking could reduce inequality.
5.1.2
The Effect of Social Skills on Earnings
Table 5 compares the results from different specifications of our regression. Each
column reports results from an alternative specification: without the first stage (column
1), without school and grade fixed effects (columns 2 and 3), with grade and school
fixed effects (column 4), with hours of work (column 5), with psychological variables (column
6), with IQ measures (column 7), and with all of these (column 8). Each cell in this table shows
the coefficient from a different regression, using the three different network measures
separately. The coefficients for the network measures are all positive and statistically
significant across the specifications, and the results are remarkably consistent. It is evident
that when the endogeneity issue is not addressed explicitly, the effects are greatly
underestimated (column 1). Column (4) reports the results from our preferred specification, with
fixed effects but without personality traits: the psychological measures are recorded
at the same time as earnings were reported, which makes it difficult to disentangle
the reciprocal influence between occupation and psychological traits. In the Appendix,
Table B4, we also report the results using earnings per hour, and the results confirm a
positive effect of BC on earnings, with a magnitude of around 7%. However, these results
should be treated with caution since earnings and hours of work are reported with reference to
different years. The magnitude of the network effects tends to vary according to the metric
used. Earnings increase by 13% for an increase of one standard deviation in degree,
7.5% for indegree, and 8.8% for the BC. However measured, either through the ability to
connect with many students or through the ability of one to locate more centrally in the
network, it is clear that higher social skills bring benefits in terms of earnings in adult
life.
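As a reading aid for these magnitudes, a coefficient b on a standardised regressor in a log-earnings regression corresponds to a change of 100 (exp(b) - 1) percent in earnings for a one standard deviation increase; for small b this is close to 100 b. The sketch below applies this conversion to the column (4) coefficients of Table 5, under the assumption that the dependent variable is the natural logarithm of earnings.

```python
import math


def percent_change(log_coefficient):
    """Percentage change in earnings implied by a coefficient in a log-earnings regression."""
    return 100.0 * (math.exp(log_coefficient) - 1.0)


# Coefficients read from column (4) of Table 5.
for name, coef in [("Degree", 0.119), ("Indegree", 0.075), ("BC", 0.088)]:
    print(f"{name}: {100 * coef:.1f} log points, {percent_change(coef):.1f} percent")
```

The two readings are close for coefficients of this size, so the choice between them does not alter the ranking of the three measures.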
It is difficult to interpret the meaning of the degree measure, since it is a combination
of nominations either received or given: it is then not clear which change is
driving the results. At the same time, as described earlier in this paper, indegree has
typically been interpreted as a measure of popularity (Conti et al., 2013), and it is not
clear whether popularity can be a good proxy for social skills: often popularity (especially
during high school) is linked to socioeconomic status, physical characteristics, and
sports rather than to the ability of the individual to relate to others (Parkhurst and
Hopmeyer, 1998). This is why we argue that BC is a superior measure of social skills,
since it contains not only a measure of the number of people one is related to, but also
the ability to connect with other well-connected people, and therefore the ability to be a
’key player’ in one’s own network.
Table 5: Two Stage Exclusion Restriction Results
              (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
Degree        0.074***  0.138***  0.136***  0.119**   0.094**   0.104**   0.106**   0.095**
              (0.018)   (0.040)   (0.040)   (0.049)   (0.043)   (0.043)   (0.044)   (0.042)
Indegree      0.073***  0.090**   0.097***  0.075*    0.065*    0.071**   0.067*    0.056
              (0.016)   (0.033)   (0.031)   (0.040)   (0.034)   (0.035)   (0.035)   (0.034)
BC 0.1        0.033**   0.086**   0.087**   0.088**   0.065*    0.077**   0.067*    0.063*
              (0.016)   (0.037)   (0.034)   (0.042)   (0.037)   (0.039)   (0.039)   (0.038)
Controls      No        No        Yes       Yes       Yes       Yes       Yes       Yes
Grade FE      No        No        No        Yes       Yes       Yes       Yes       Yes
School FE     No        No        No        Yes       Yes       Yes       Yes       Yes
Work hours    No        No        No        No        Yes       Yes       Yes       Yes
OCEAN         No        No        No        No        No        Yes       No        Yes
IQ            No        No        No        No        No        No        Yes       Yes
First Stage   No        Yes       Yes       Yes       Yes       Yes       Yes       Yes
N             1878      1474      1450      1450      1450      1392      1387      1377
R2            0.19      0.20      0.21      0.21      0.30      0.32      0.32      0.32
Bootstrapped errors in parentheses, 1000 replications. * p < 0.10, ** p < 0.05, *** p < 0.01
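The standard errors in Table 5 are bootstrapped with 1000 replications. One common way of obtaining such errors in a two-stage setting is to resample observations and re-estimate both stages on every draw, so that the uncertainty of the first stage is carried into the second. The sketch below illustrates the idea on synthetic data with a single excluded variable and no further covariates; the variable names, the data-generating process and the reduced number of replications are all assumptions of the example, not the specification estimated in this paper.

```python
import random
import statistics


def ols(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


def two_stage_slope(z, network, y):
    """First stage: network measure on z. Second stage: y on the fitted network measure."""
    a0, a1 = ols(z, network)
    fitted = [a0 + a1 * zi for zi in z]
    return ols(fitted, y)[1]


def bootstrap_se(z, network, y, replications=1000, seed=0):
    """Standard deviation of the two-stage slope over resampled observations."""
    rng = random.Random(seed)
    size = len(y)
    draws = []
    for _ in range(replications):
        idx = [rng.randrange(size) for _ in range(size)]
        draws.append(two_stage_slope([z[i] for i in idx],
                                     [network[i] for i in idx],
                                     [y[i] for i in idx]))
    return statistics.stdev(draws)


# Synthetic data: z shifts the network measure, which raises log earnings by 0.5.
rng = random.Random(1)
z = [rng.gauss(0, 1) for _ in range(300)]
network = [2.0 * zi + rng.gauss(0, 0.5) for zi in z]
log_earnings = [1.0 + 0.5 * ni + rng.gauss(0, 0.1) for ni in network]

beta = two_stage_slope(z, network, log_earnings)
se = bootstrap_se(z, network, log_earnings, replications=200)
print(round(beta, 3), round(se, 4))
```

Resampling whole observations keeps each individual's excluded variable, network measure and earnings together, which is what allows the first-stage estimation error to be reflected in the reported standard error.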
5.1.3
Alternative Specifications
The AddHealth dataset followed a sub-sample of the students over time, keeping
track of the friendship nominations one year and six years after the first wave collection.
A sub-sample was administered a questionnaire at home after the main school one in
1994 and then other questionnaires respectively after one year (1995), six years (2001)
and eight years (2008). In this case, the resulting sample for our purposes includes
1030 respondents. We have used the network measures calculated in Wave I in different
ways: as covariates in the first stage and as regressors in the second stage. All the
results are available in Appendix C, while this section reports only the main results.
Tables 6 and 7 report the results from the alternative identification strategies described
in section 4.1.4. Table 6 shows the results when the main variables of interest are the
Wave II network metrics. Column (1) reports the results when predicting only Wave II
network metrics in the first stage. The predicted values are then used in the second stage,
together with Wave I network metrics. Column (2) includes network metrics at Wave I as a
covariate in the first stage, while column (3) includes the metrics at Wave I in the second
stage. The results confirm our baseline, even though the magnitude of the coefficient is
slightly reduced, with an impact of BC of around 6.3%.
As an additional robustness check, we calculated the average distance of each student
from other students in the school and the distance from school. Residential location
clearly affects the chances of relating to others, since the closer two persons live, the
more likely it is that they can spend time together, especially during the high school years (17).
As one would expect, distance is negatively correlated with the network measures: the
farther away a student lives from her friends, the more difficult it is to spend time together
outside class times. This is the identification strategy illustrated by equations 10 and 11
earlier in this paper. The effects on earnings are confirmed as positive and significant,
with an effect of around 10% attached to BC, as reported in Table 6.
A third identification strategy was also applied (equations 15 and 16). Table 7 reports
the results obtained when using the difference in the network metrics between Wave I and Wave
II as a predictor of the difference in earnings between Wave III and Wave IV. Again, the
Bonacich Centrality measure is a positive and significant predictor of earnings:
an increase in the centrality measure between the two waves leads to an
increase in earnings of 10% between Wave III and Wave IV. The effect is significant only
at the 10% level; however, it should be noted that the sample for this estimation includes
only 546 observations.
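The logic of a specification in differences can be illustrated with a generic simulation: when an unobserved individual trait that is fixed over time affects both the network measure and earnings, a regression in levels is contaminated by it, while a regression of changes on changes is not, because the fixed term cancels out. The sketch below is a stylised example with synthetic data and two generic periods; it does not reproduce equations 15 and 16 or the wave structure of the data.

```python
import random
import statistics


def slope(x, y):
    """Slope of a one-regressor least-squares fit."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


rng = random.Random(2)
true_effect = 0.1
# Unobserved trait, fixed over time, raising both centrality and earnings.
trait = [rng.gauss(0, 1) for _ in range(500)]
bc_early = [0.8 * a + rng.gauss(0, 1) for a in trait]
bc_late = [0.8 * a + rng.gauss(0, 1) for a in trait]
earn_early = [a + true_effect * b + rng.gauss(0, 0.1) for a, b in zip(trait, bc_early)]
earn_late = [a + true_effect * b + rng.gauss(0, 0.1) for a, b in zip(trait, bc_late)]

# Levels: the coefficient also picks up the unobserved trait.
levels = slope(bc_late, earn_late)

# Differences: the trait is constant within individual and drops out.
d_bc = [late - early for late, early in zip(bc_late, bc_early)]
d_earn = [late - early for late, early in zip(earn_late, earn_early)]
differences = slope(d_bc, d_earn)

print(round(levels, 3), round(differences, 3))
```

The price of differencing is precision: only individuals observed in both periods contribute, and only within-individual variation is used, which is consistent with the smaller sample and the weaker significance reported for this strategy.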
Overall, the results prove robust across different specifications and identification
strategies. Social skills, defined as the ability to relate to others with high social skills,
bring a significant increase in earnings of between 6.5% and 8% on average.
17 Distance is calculated using the information on the geographical distance of students from a
central point, which is kept secret for privacy reasons. The measure obtained is then standardised
and used in the first stage as a covariate for the network metrics, together with the Wave I
metrics information.
Table 6: II Identification Strategy: Wave II data
                      (1)        (2)        (3)
Degree W2             0.239***   0.102***   0.230***
                      (0.054)    (0.030)    (0.051)
Indegree W2           0.217***   0.080***   0.205***
                      (0.048)    (0.023)    (0.049)
BC 0.1 W2             0.102**    0.063**    0.101**
                      (0.042)    (0.026)    (0.041)
Degree W1             -          -          0.010*
                                            (0.005)
InDegree W1           -          -          0.016**
                                            (0.008)
BC 0.1 W1             -          -          0.033
                                            (0.030)
Grade FE              Yes        Yes        Yes
School FE             Yes        Yes        Yes
Network W1            No         Yes        Yes
First Stage           Yes        Yes        Yes
Geography Distance    No         No         No
N                     1029       1029       1029
R2                    0.25       0.25       0.25
Bootstrapped errors in parentheses. 1000 replications.
* p < 0.10, ** p < 0.05, *** p < 0.01
Table 7: III Identification Strategy: Differences
Dependent variable First Stage ∆N
∆Degree               0.150**
                      (0.071)
∆InDegree                        0.041
                                 (0.066)
∆BC                                         0.086*
                                            (0.045)
Grade FE              No         No         No
School FE             No         No         No
Birth Month           No         No         No
First Stage           Yes        Yes        Yes
Geography Distance    Yes        Yes        Yes
N                     551        551        551
R2                    0.27       0.27       0.27
Bootstrapped errors in parentheses. 1000 replications.
* p < 0.10, ** p < 0.05, *** p < 0.01
5.2
LiM and Networks Models
Table 8 compares a selection of the results from the different models (complete estimates
are available in Appendix C). Only the results from the directed network models are
reported, since the undirected models will be subject to the measurement error bias. BC
is positive and significant in all the specifications that include it as a covariate. The effect
varies from a higher bound of 8% estimated with the two stage exclusion restriction model,
to a lower bound of 3.4% obtained with the LiM model (with friends as the reference
peer group). Therefore, intra-personal and inter-personal skills are found to have an
important impact on earnings, and this result is robust across different estimation models.
It is interesting to notice that the estimates of the network models, in particular of the
intensity network model, are more precise than the two stage model: the standard error
is about one half of the error in the first model.
On average, peers’ and friends’ earnings do not have a significant impact on individual
adult earnings. This result is reasonable, since there is a temporal gap between the high
school friendship nominations and wages of around thirteen years. It is unlikely that a
significant number of relationships will be maintained over such a long period: this is
also confirmed by the small number of nominations available in Wave III.
Overall, the comparison of these different models and specifications suggests that the
effect captured by the Bonacich centrality is not a standard ’peer effect’, i.e. the influence
of peers’ characteristics on individual outcomes. This is confirmed by the non-significance
of the impact of the peers’ or friends’ earnings. On the contrary, the effect seems to be
better accounted for by the definition of interpersonal skills, i.e. the ability to relate
to others and to learn and use social norms.
The different specifications of the adjacency matrices do not show substantial changes
in the impact of friends’ earnings. However, interesting differences in magnitude are
found for other characteristics (complete results tables are available in Appendix C).
While the impact of individual characteristics does not vary much across specifications,
demonstrating the robustness of the results, the coefficients obtained through the Basic
Network and the Centrality models show a downward bias in their magnitude. Indeed,
the coefficients in the Network Intensity model tend to be 30% to 50% smaller. This
suggests that a binary adjacency matrix, often chosen because of data constraints, is a
misspecification, which omits key information on the strength of these relationships. This
is true for both ’directed’ and ’undirected’ networks. These coefficients can be interpreted
as a form of social capital accumulation: individuals build their network capital, i.e. a
network of resources that can help at the right moment, especially through information.
The fact that the main effects are related to the education and financial background of friends
confirms this speculation. These effects are in any case distinct from the effects of social
skills, which do not depend on the resources that others have.
Table 8: Results Comparison
Model              Two Stage FE  LiM FE        LiM Friends   Basic Net FE  Centrality    Intensity FE
Equation           (6)           (21)          (21)          (22)          (24)          (26)
BC                 0.088***      0.034         0.034*                      0.053**       0.054***
                   (0.026)       (0.028)       (0.019)                     (0.017)       (0.013)
Peers’ Earnings                  -0.451
                                 (0.475)
Friends’ Earnings                              0.003         0.027         0.022         0.017
                                               (0.034)       (0.033)       (0.032)       (0.033)
N                  1450          1870          1404          1404          1404          1404
R2                 0.21          0.152         0.178         0.174         0.176         0.178
Variable BC has been standardised to make the results comparable across specifications
6
Observed Heterogeneity and Robustness
The comparison of the results from the selected models has shown that the magnitude
and the significance of the coefficients do not vary dramatically. Thus, interpersonal and
intrapersonal skills have an impact on earnings of between 3% and 8%. It is therefore
interesting to evaluate whether this impact differs by gender and ethnicity, and
whether there is a difference in the network effect across the earnings distribution. The
finding that the coefficients are not drastically different allows us to further apply the
two-stage model to test the hypothesis of different returns according to different individual
characteristics. In this section, we therefore present results obtained through quantile
regressions, to examine whether results vary across the wage distribution, and through
estimations separated by gender and ethnicity. We also examine the role that network sampling might
have on the results, comparing schools where all the students were surveyed with schools
where only a random selection of pupils was covered.
6.1
Quantile Regressions and Non-Monotonicity of the effect
Previous results in the literature have found that non-cognitive skills
confer a bigger advantage on low-skilled workers and students from disadvantaged
backgrounds (Carneiro et al., 2007). For this reason, we investigated whether the effects
vary over earnings quantiles. Results in Table 10 confirm the findings from the earlier
literature, as they show that the effect is positive and significant in particular for the
lowest quantile of earnings. In particular, an increase of one standard deviation in the
BC measure increases earnings in the lowest quantile by 9% and at the median by 12%.
The effects are shown graphically in Figure 2. In addition, the quantile regression was
also run using absolute earnings values. Again, the effect is positive and significant only
in the lowest quantile (Table 10). These results are in line with previous findings and
suggest that non-cognitive skills allow workers in low paying jobs to distinguish themselves
from similar workers.
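Quantile regression differs from least squares in the loss that it minimises: instead of squared residuals it uses the asymmetric ’check’ (or pinball) loss, which penalises positive and negative residuals with weights τ and 1 - τ. The sketch below illustrates this on hypothetical log earnings with a constant only, in which case the minimiser is simply a sample quantile; a quantile regression replaces the constant with a linear function of the covariates.

```python
def pinball_loss(residuals, tau):
    """Check-function loss minimised by quantile regression at quantile tau."""
    return sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)


def best_constant(values, tau):
    """Observation that minimises the pinball loss, i.e. a sample tau-quantile."""
    return min(values, key=lambda c: pinball_loss([v - c for v in values], tau))


# Hypothetical log earnings for nine workers.
log_earnings = [9.2, 9.8, 10.1, 10.3, 10.4, 10.6, 10.9, 11.3, 11.8]
q25 = best_constant(log_earnings, 0.25)
q50 = best_constant(log_earnings, 0.50)
q75 = best_constant(log_earnings, 0.75)
print(q25, q50, q75)
```

Because each quantile has its own minimisation problem, the coefficient on a regressor such as BC is free to differ between the bottom and the top of the earnings distribution, which is what the columns of Table 10 report.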
Another interesting question is whether the non-cognitive skills effects are monotonic
and whether there is any interaction with IQ. To check for the monotonicity of the effect,
we created quantile dummies for the network predicted values and interacted them with
the network metrics themselves. As reported in Table B6 in the Appendix, only the coefficient
for BC in the top quantile is positive and significant, so the effect tends to be larger for
higher levels of non-cognitive skills. To analyse the relationship with IQ (results are reported
in Table B7), we interacted the IQ measures with the standardised predicted values
of the network measures. The only significant interaction term is the one that includes
BC: therefore, not only do non-cognitive skills increase earnings per se, but smarter students
also obtain a higher benefit.
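With an interaction term, the effect of the network measure is no longer a single number: it equals the main coefficient plus the interaction coefficient multiplied by the IQ value at which it is evaluated. The sketch below uses the BC coefficients reported in Table 11 (0.087 and 0.038); the IQ values at which the effect is evaluated are hypothetical and assume that the IQ measure is standardised, which is an assumption of this example.

```python
def marginal_effect(main, interaction, iq):
    """Effect of a one-unit increase in the network measure at a given IQ value."""
    return main + interaction * iq


# BC column of Table 11; IQ values are hypothetical standardised scores.
for iq in (-1.0, 0.0, 1.0):
    print(f"IQ = {iq:+.1f}: effect of BC = {marginal_effect(0.087, 0.038, iq):.3f}")
```

Under these assumptions, the return to BC for a student one standard deviation above the mean IQ would be roughly two and a half times that for a student one standard deviation below it.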
Table 10: Quantile Estimates
                      (1)         (2)         (3)         (4)         (5)
                      Q 0.1       Q 0.25      Q 0.5       Q 0.75      Q 0.9
BC                    0.088**     0.201       0.121**     0.043       0.024
                      (0.040)     (0.131)     (0.060)     (0.037)     (0.052)
Female                -0.326***   -0.410***   -0.304***   -0.271***   -0.327***
                      (0.039)     (0.129)     (0.059)     (0.037)     (0.034)
Bachelor              0.302***    0.382**     0.283***    0.278***    0.198***
                      (0.055)     (0.180)     (0.083)     (0.051)     (0.047)
More than Bachelor    0.356***    0.246**     0.385***    0.385***    0.333***
                      (0.074)     (0.245)     (0.113)     (0.070)     (0.064)
Observations          1450
R2                    0.211
Bootstrapped errors in parentheses, 1000 replications
* p < .10, ** p < .05, *** p < .01
Figure 2: Quantiles Estimates
Table 11: Monotonicity
                Degree      Indegree    BC
Network         0.123***    0.073*      0.087*
                (0.051)     (0.040)     (0.045)
Network * IQ    0.021       0.020       0.038**
                (0.021)     (0.022)     (0.019)
N               1387        1387        1387
R2              0.216       0.214       0.216
Bootstrapped errors in parentheses, 1000 replications
* p < .10, ** p < .05, *** p < .01
6.2
Gender and Ethnicity Effects
Much of the literature has focused on white male samples (for example Drago, 2011; Kuhn and
Weinberger, 2005; Heckman and Rubinstein, 2001; Cawley et al., 2001).
However, some studies have underlined the difference between genders and ethnicities;
clearly different ethnicities regard performance and ability differently, and this also
influences social acceptance and thus social skills (Austen-Smith and Fryer, 2005). It is
therefore very interesting to investigate whether gender and ethnicity might be linked
to differences in the estimates of returns to non-cognitive skills as measured by network
capital. We therefore ran additional regressions separating the sample first by gender and
then by ethnicity. Results are reported in Table 12, in the first and in
the second panel respectively. The gender-based subsamples consist of 737 male students
and 713 female students. Social skills clearly advantage male students: an increase of
one standard deviation in the BC measure increases earnings by around 11%, while no
significant effect is found for females.
Because of the high percentage of white students, we split the sample into white and
non-white students: the two subsamples consist of 941 and 509 students
respectively. Social skills mostly benefit white students: an increase of one standard
deviation in the BC measure is linked to a 10% increase in earnings. On the contrary, the
effect is not statistically significant for non-white students. This effect might be linked to
a possible discrimination effect in the labour market, rather than to non-cognitive skills
per se. However, there was no sign of ethnicity discrimination in our baseline regression
(see Table 5). Furthermore, the coefficients for the different ethnicities are non-significant
in the non-white results.
Table 12: Gender and Ethnicity Estimation
               Gender                  Ethnicity
               Males       Females     White       Non White
Degree         0.204***    0.082       0.162***    -0.019
               (0.069)     (0.056)     (0.053)     (0.075)
Indegree       0.132**     0.0256      0.119***    -0.018
               (0.053)     (0.048)     (0.038)     (0.073)
BC             0.11**      0.082       0.10**      -0.033
               (0.055)     (0.057)     (0.040)     (0.071)
Grade FE       Yes         Yes         Yes         Yes
School FE      Yes         Yes         Yes         Yes
N              737         713         941         509
R2             0.23        0.26        0.26        0.23
Bootstrapped errors in parentheses.
* p < 0.10, ** p < 0.05, *** p < 0.01
6.3
Fully Sampled Schools vs Partially Sampled Schools
Another important source of variability in network effect results can be
the nature of the sampling of the network. For example, Conti et al. (2013)
use only network data (collected retrospectively) from a partial sample. We are in the
fortunate position of being able to compare partially and fully sampled schools. Table 13
compares the results of our preferred specification run on the sample from the fully
sampled schools and from the partially sampled schools. The first sample includes only the
schools where all the students were interviewed, allowing the mapping of the complete
network of relations; the second one also includes the schools where only a sub-selection
of students was interviewed, and where therefore only a partial reconstruction of the
networks is possible. Chandrasekhar and Lewis (2011) underline how a partial sampling
of a network induces a downward bias in the estimated coefficients. Therefore, since
the AddHealth dataset includes both sampling strategies, it is interesting to make a
comparison. From Table 13, the difference is clear: not only are the effects smaller when
we include partially sampled schools, but they are also not statistically significant.
Table 13: Partially Sampled vs. Fully Sampled Schools
               Full        Partial
Degree         0.124***    -0.017
               (0.047)     (0.015)
Indegree       0.077**     -0.012
               (0.039)     (0.015)
BC             0.088**     -0.009
               (0.044)     (0.013)
Grade FE       Yes         Yes
School FE      Yes         Yes
N              1450        3878
R2             0.21        0.20
Bootstrapped errors in parentheses. 1000 replications.
* p < 0.10, ** p < 0.05, *** p < 0.01
7
Discussion and Conclusion
This paper has investigated the importance of interpersonal skills, as measured by
social network capital, for subsequent labour market earnings. A contribution of this paper
is the introduction of the Bonacich centrality measure as a proxy for the social skills of
individuals. Our suggestion is that social network measures have clear advantages and
are less prone to bias than self-reported personality traits, since they use sociometric
measures of the relational pattern of students.
Network metrics as a proxy for popularity have been previously analysed by Conti
et al. (2013). We argue that the basic network measures used previously in the literature,
such as the number of nominations that a person either receives or gives in response
to a survey (i.e. degree, used by Conti et al. (2013)), are not the best metric to proxy
for interpersonal skills: students might be over-stating their relational status to signal
higher social prestige. In contrast, the Bonacich Power Centrality measure takes into
account how well connected one is in the whole network. Therefore, network capital and
interpersonal skills include not only the capacity to create links with others, but also the
capacity to be at the ’heart of things’ and to be connected, in turn, with more connected
others in the network - that is - to be a ’Key Player’.
Our results confirm previous findings in the literature: interpersonal skills have indeed
a positive and significant effect on labour market outcomes. The magnitude of our
estimated effect is similar to, if slightly larger than, that found in previous studies on the
topic (Hill, 2012; Weiss, 2010; Drago, 2011; Ajwad and Nikoloski, 2014), and changes slightly
according to the specification: a one standard deviation increase in social skills raises earnings by
between 3% and 8%. The effects have a magnitude similar to those attributed to
socioeconomic status and to cognitive skills, captured here by educational attainment. This
confirms the results of Jacob (2002).
While the previous literature has focused mainly on white samples, we explored the
possibility of an ethnicity effect, and we found that it is white students who benefit from
interpersonal skills. Thus, we can conclude that social skills create an advantage mainly
for white and male students. This effect might be related to specific socio-economic
contexts or to the fact that certain jobs and educational levels tend to be predominantly
attained by white males. Our findings also suggest that the largest marginal effect of
network capital is at the 10th percentile and median of the earnings distribution. They
also suggest indirectly that schools and after-school activities create favourable conditions
for the development of these skills (Heckman and Rubinstein, 2001; Bowles et al., 2001).
However, we do not find any formal evidence of this in this study, and further work will
be required to support the important role of schools in the formation of
interpersonal skills.
The results in our analysis exhibit a reasonable degree of robustness to specification
and sample subset analysis. A major contributory factor in our investigation has been the
extremely rich AddHealth dataset, which includes specific relational longitudinal data,
which is not available in other datasets. In addition, the networks were entirely sampled,
and therefore the results do not suffer from an underestimation bias due to the partial
sampling of the network which is common in the literature.
It is important to attempt to clarify the mechanisms and the real underlying content
of what we are capturing in our estimations. There are several alternative explanations
of the mechanism by which network measures relate to earnings. The first is that networks
fulfil an informational role in alerting friends to job (or better job) vacancies. The
second is that networks actually augment the learning of cognitive skills or knowledge, as
there is good evidence that social networks improve the knowledge acquisition process
(see Pentland (2014), who describes how networks act as a mechanism for collective intelligence). The third interpretation is that social networks augment interpersonal skills:
acquiring more friends and dealing with them develops social skills.
Our data do not allow us to distinguish definitively between these explanations, but this matters little, as it is the association between network measures and earnings
that we seek to calibrate, irrespective of the mechanism which drives the correlation.
There is a large literature in which labour economists have examined the role of social
networks in providing individuals with information on new job opportunities. The previous literature has analysed the importance of social relationships for those seeking a job:
friends and relatives can provide information or can give access to others with information
that can then be used for job searching (Albert, 1966; Montgomery, 1991; Granovetter, 1995;
Ioannides and Loury, 2004; Galeotti and Merlino, 2010). Petersen et al. (2000) report
that 50% of people in the US find a job through social networks. This mechanism has
been formalised through the system of referrals, allowing employees to suggest potential
candidates for positions to their employers. Some authors have argued that individuals
might strategically decide to invest in social networks, through participation in specific
social activities, with the aim of obtaining job opportunities or opening profitable channels
to them in the future.
In contrast, this paper uses information on social networks during high school, when
students are, on average, 16. These networks can be relevant for earnings if individuals
maintain the same friends over time and these friends eventually become a resource as job search
channels. This possibility has been investigated by Marmaros and Sacerdote (2002), who
find that relationships at college, and in particular being part of a fraternity or a sorority,
can facilitate access to high-paying jobs, but do not influence earnings. Furthermore,
being active in the student union body positively influences the future salary of students.
However, two points need to be clarified regarding this study. Firstly, the authors were
not able to identify the mechanism of this effect, and indeed they also consider the
possibility that the effect is due to social abilities per se, rather than to an information
effect. Secondly, the study focuses on social relationships at college, and not at high
school. Therefore, these results are not directly comparable to ours.
We therefore suggest that the effect captured by the social network metrics in our
analysis is not an information effect, and we posit that the effect is caused by the relational,
social and interpersonal skills of the students, imparted by the more active use of cultivated
friendships and network interactions. Research in sociology and adolescent development
has analysed the factors which influence the relationships of adolescents, in particular
in high school settings. Socially able students are cooperative, kind, trustworthy, and are
less likely to start fights or to disrupt groups. They are students with prosocial attitudes
and qualities (see Coie et al., 1982). These qualities are more likely to be captured by our
measure of centrality, since a simple measure of popularity such as indegree does not provide
information on the people that do not like the individual (Parkhurst and Hopmeyer,
1998), and thus offers only a partial perspective on the connectivity of a person. The
literature has shown a clear difference between the perception of one's social skills and
popularity, and the sociometric measures of the two. Indeed, Parkhurst and Hopmeyer
(1998) find that those who are high on sociometric popularity are typically not high on perceived
popularity, and are perceived as more socially able by other students18. Earlier findings
(Caprara et al., 2012; Caprara and Steca, 2005) in the sociological literature showed
that social abilities and behaviours are related to a common latent factor, the so-called
pro-sociality, and that this tendency is relatively stable across time. Pro-sociality is
influenced by psychological traits, values, culture, and self-efficacy beliefs (Caprara et al.,
2012; Caprara and Steca, 2005).
Of course, it is possible that social and network abilities are, in part, related to psychological traits and attitudes. However, relating to others is also something that one learns
during the schooling period, and also throughout one's working career. To verify whether
the effect captured is actually linked to personality, we have controlled for psychological
measures for three personality traits in the first stage of the baseline regression. The
traits included are extraversion, conscientiousness, and neuroticism19. The magnitude
and the significance of the network measures are not affected either when the psychological measures are introduced in the first stage, or when psychological measures at wave 4
are calculated in the second stage. Therefore, we can conclude that social skills are not
equivalent to psychological traits, even though they are clearly influenced by them.
Throughout this analysis, we have identified a key element for the formation of these social
skills, namely the opportunity to interact with others and to learn and practise social
norms and prosocial behaviours. This is demonstrated by the fact that personality can
influence social skills but does not coincide with them, by the relevance of geographical
proximity to school and peers, and by the importance of time spent together during the
weekends. Further research is thus needed to explore the potential for schools to support
and enrich the development of these social skills.
In summary, this paper provides new evidence and insights on the relevance of interpersonal skills in the form of network capital for labour market outcomes. Using the
AddHealth dataset, we propose the Bonacich centrality measure as a superior proxy for
these skills and we estimate the impact of this social network measure on earnings. We
use only fully sampled schools, which allows us to avoid problems due to partially observed
networks. Our results show that the earnings premium attached to a one-standard-deviation
increase in network centrality is estimated to be between 3.4 and 8%. This paper has
also made several important contributions to our understanding of how we model the
importance of interpersonal social network skills in high school in the determination of
earnings later in life. Our examination of different estimation and identification strategies has led to a clear consensus. We find that our 3-8% networking effect on earnings
is relatively robust to different identification assumptions. Specifically, we investigated
the basic network model and the LiM model as well as Peer Effects models. We also
introduced the Network Intensity model, which explicitly uses information on the ranking of the importance of friends instead of just looking at a zero or unit indicator of the
presence or absence of a friendship link. We suggested that the latter could be a misspecification of the importance of friendship information, as it is a form of measurement
error. Accordingly, our results are slightly ameliorated. We have also shown that the
benefits from non-cognitive skills accrue mostly to white male students, and to those who
are earning relatively less. This paper is the starting point of a more extensive analysis,
which will include different outcomes, such as academic performance and health status.
Further research is still needed to identify the determinants of these social skills. While
we were able to identify some factors correlated with them, more research on the precise
mechanism linking network capital with earnings is necessary. Our research could also
have important policy implications, since our results suggest a key role played by school
friendship networks in the development of these skills.
18 By socially able we mean being trustworthy and able to relate to others, and not being perceived
as aggressive or dominant.
19 Because of data issues, it was not possible to calculate the other two traits.
8 Acknowledgement
This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed
by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill,
and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human
Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgment is due
Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add
Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was
received from grant P01-HD31921 for this analysis.
References
Advani, A. and Malde, B. (2014), ‘Empirical Methods for Networks Data: Social Effects, Network Formation and Measurement Error’, IFS Working Paper 14/34 .
Ajwad, M. I. and Nikoloski, Z. (2014), ‘Cognitive and Non-Cognitive Skills Affect Employment Outcomes : Evidence from
Central Asia’, Mimeo .
Albert, R. (1966), ‘Information Networks in Labor Markets’, American Economic Review 56(1/2), 559–566.
Almlund, M., Duckworth, A. L., Heckman, J. J. and Kautz, T. D. (2011), ‘Personality Psychology and Economics’, NBER
Working Paper 16822 .
Angrist, J. D. (2014), ‘The perils of peer effects’, Labour Economics 30, 98–108.
ASPE (2014), ‘Annual Statistical Supplement to the Social Security Bulletin 2014’, 13-11700.
Austen-Smith, D. and Fryer, R. G. J. (2005), ‘An Economic Analysis of ”Acting White”’, The Quarterly Journal of
Economics 120(2), 551–584.
Babcock, P. (2008), ‘From Ties to Gains? Evidence on Connectedness and Human Capital Acquisition’, Journal of Human
Capital 2(4), 379–409.
Badev, A. I. (2013), ‘Discrete games in endogenous networks: Theory and policy’, Population Study Center Working Papers
PSC 13-05 .
Bandiera, O., Barankay, I. and Rasul, I. (2009), ‘Social Connections and Incentives in the Workplace: Evidence from
Personnel Data’, Econometrica 77(4), 1047–1094.
Banerjee, A., Chandrasekhar, A. G., Duflo, E. and Jackson, M. O. (2014), ‘Gossip: Identifying Central Individuals in a
Social Network’, NBER Working Paper p. 29.
Barron, J. M., Ewing, B. T. and Waddell, G. R. (2000), ‘The Effects of High School Athletic Participation on Education
and Labor Market Outcomes’, The Review of Economics and Statistics 82(3), 409–421.
Bonacich, P. (1987), ‘Power and Centrality: A Family of Measures’, American Journal of Sociology 92(5), 1170–1182.
Borgatti, S. P., Everett, M. G. and Johnson, J. (2013), Analyzing Social Networks, SAGE Publications Ltd.
Borghans, L., Golsteyn, B. H. H., Heckman, J. and Humphries, J. E. (2011), ‘Identification Problems in Personality
Psychology’, IZA Discussion Paper Series 5605 .
Borghans, L., Meijers, H. and ter Weel, B. (2008), ‘The Role of Noncognitive Skills in Explaining Cognitive Test Scores’,
Economic Inquiry 46(1), 2–12.
Borghans, L., ter Weel, B. and Weinberg, B. (2008), ‘Interpersonal Styles and Labor Market Outcomes’, Journal of Human
Resources 43(4), 815–858.
Bouchard, T. J. and Loehlin, J. C. (2001), ‘Genes, Evolution, and Personality’, Behavior Genetics 31(3), 243–273.
Bowles, S., Gintis, H. and Osborne, M. (2001), ‘The Determinants of Earnings : A Behavioural Approach’, Journal of
Economic Literature 39(4), 1137–1176.
Bramoullé, Y., Djebbari, H. and Fortin, B. (2009), ‘Identification of peer effects through social networks’, Journal of
Econometrics 150(1), 41–55.
Burgess, S., Sanderson, E. and Umaña Aponte, M. (2011), ‘School ties: An analysis of homophily in an adolescent friendship
network’, Working Paper No. 11/267, The Centre for Market and Public Organisation .
Caliendo, M., Cobb-Clark, D. and Uhlendorff, A. (2015), ‘Locus of Control and Job Search Strategies’, Review of Economics
and Statistics 97(1), 88–103.
Calvó-Armengol, A., Patacchini, E. and Zenou, Y. (2009), ‘Peer Effects and Social Networks in Education’, Review of
Economic Studies 76(4), 1239–1267.
Cameron, A. and Trivedi, P. (2005), Microeconometrics. Methods and Applications, Cambridge University Press, Cambridge.
Caprara, G. V., Alessandri, G. and Eisenberg, N. (2012), ‘Prosociality: The contribution of traits, values, and self-efficacy
beliefs.’, Journal of Personality and Social Psychology 102(6), 1289–303.
Caprara, G. V. and Steca, P. (2005), ‘Affective and Social Self- Regulatory Efficacy Beliefs as Determinants of Positive
Thinking and Happiness’, European Psychologist 10(4), 275–286.
Card, D. (2001), ‘Estimating the Return to Schooling: Progress on Some Persistent Econometric Problems over the
Past Decade’, Econometrica 69(5), 1127–1160.
Carneiro, P., Crawford, C. and Goodman, A. (2007), ‘The impact of early cognitive and non-cognitive skills on later
outcomes’, Centre for the Economics of Education Discussion Paper 92 .
Cawley, J., Heckman, J. J. and Vytlacil, E. (2001), ‘Three observations on wages and measured cognitive ability’, Labour
Economics 8, 419–442.
Chandrasekhar, A. G. and Lewis, R. (2011), ‘Econometrics of Sampled Networks’, Mimeo .
Christakis, N. A. and Fowler, J. H. (2011), Connected: The Surprising Power of Our Social Networks and How They Shape
Our Lives – How Your Friends’ Friends’ Friends Affect Everything You Feel, Think, and Do, Harper Press, London.
Clausen, J. A. and Gilens, M. (1990), ‘Personality and Labor Force Participation Across the Life Course: A Longitudinal
Study of Women’s Careers’, Sociological Forum 5(4), 595–618.
Cobb-Clark, D. and Schurer, S. (2012), ‘The stability of big-five personality traits’, Economics Letters 115, 11–15.
Cobb-Clark, D. and Tan, M. (2011), ‘Noncognitive skills, occupational attainment, and relative wages’, Labour Economics
18, 1–13.
Coie, J. D., Dodge, K. A. and Coppotelli, H. (1982), ‘Dimensions and Types of Social Status: A Cross-Age Perspective’,
Developmental Psychology 18(4), 557–570.
Conti, G., Galeotti, A., Muller, G. and Pudney, S. (2013), ‘Popularity’, Journal of Human Resources 48(4), 1072–94.
Cunha, F. et al. (2010), ‘Estimating the Technology of Cognitive and Noncognitive Skill Formation.’, Econometrica
78(3), 883–931.
Currarini, S., Jackson, M. O. and Pin, P. (2009), ‘An Economic Model of Friendship: Homophily, Minorities, and Segregation’, Econometrica 77, 1003–1045.
Drago, F. (2011), ‘Self-esteem and earnings’, Journal of Economic Psychology 32(3), 480–488.
Epple, D. and Romano, R. (2011), ‘Peer Effects in Education: Survey of the Theory and Evidence’, Handbook of Social
Economics 1, 1053–1163.
Fletcher, J. M. (2013), ‘The effects of personality traits on adult labor market outcomes: Evidence from siblings’, Journal
of Economic Behavior & Organization 89, 122–135.
Fletcher, J. M. (2014), ‘Friends or family? Revisiting the effects of high school popularity on adult earnings’, Applied
Economics 46(20), 2408–2417.
Fletcher, J. M., Ross, S. L. and Zhang, Y. (2014), ‘The Determinants and Consequences of Friendship Composition’, HCEO
Working Paper 2014-016 .
Galeotti, A. and Merlino, L. (2010), ‘Endogenous Job Contact Networks’, ISER working papers 2010-14 .
Gallo, W. et al. (2003), ‘The Influence of Internal Control on the Employment Status of German Workers’, Schmollers
Jahrbuch 123, 71–82.
Gardner, H. (1987), ‘Beyond the IQ: Education and Human Development’, Harvard Educational Review 57(2), 187–193.
Gardner, H. (1993), Multiple Intelligences. The Theory in Practice, 1 Edn, Basic Books, New York.
Glaeser, E. L., Laibson, D. and Sacerdote, B. (2002), ‘An Economic Approach to Social Capital’, The Economic Journal
112(483), F437–F458.
Go, M.-H., Tucker, J. S., Green, H. D., Pollard, M. and Kennedy, D. (2012), ‘Social distance and homophily in adolescent
smoking initiation.’, Drug and Alcohol Dependence 124(3), 347–54.
Goldberger, A. S. (1983), Abnormal Selection Bias, in S. Karlin, T. Amemiya and L. A. Goodman, eds, ‘Studies in
Econometrics, Time Series, and Multivariate Statistics’, Academic Press, New York, pp. 67–84.
Goldsmith-Pinkham, P. and Imbens, G. W. (2013), ‘Social Networks and the Identification of Peer Effects’, Journal of
Business & Economic Statistics 31(3), 253–264.
Granovetter, M. (1995), Getting a Job: a Study of Contacts and Careers, The University of Chicago Press, Chicago.
Heckman, J. J. (2011), ‘Integrating personality psychology into economics’, NBER Working Paper 17378 .
Heckman, J. J. (2013), Giving Kids A Fair Chance (A Strategy That Works), Boston Review, MIT Press, Cambridge,
Massachusetts, and London, England.
Heckman, J. J. and Rubinstein, Y. (2001), ‘The importance of noncognitive skills: Lessons from the GED testing program’,
American Economic Review 91(2), 145–149.
Heckman, J. J., Stixrud, J. and Urzua, S. (2006), ‘The Effects of Cognitive and Noncognitive Abilities on Labor Market
Outcomes and Social Behavior’, Journal of Labor Economics 24(3), 411–482.
Heckman, J., Moon, S. H., Pinto, R., Savelyev, P. and Yavitz, A. (2010), ‘Analyzing Social Experiments as Implemented:
A reexamination of the Evidence from the HighScope Perry Preschool Program’, Quantitative Economics 1(1), 1–46.
Heckman, J., Pinto, R. and Savelyev, P. (2013), ‘Understanding the mechanisms through which an influential early childhood program boosted adult outcomes’, American Economic Review 103(6), 2052–2086.
Hill, J. M. (2012), ‘The Effects of Social Networking Skills on Labor Market Outcomes’, Mimeo .
Ioannides, Y. M. and Loury, L. D. (2004), ‘Job Information Networks, Neighborhood Effects and Inequality’, Journal of
Economic Literature 42(4), 1056–1093.
Jackson, M. (2008), Social and Economic Networks, Princeton University Press, Princeton.
Jackson, M. O. (2014), ‘Networks in the Understanding of Economic Behaviors’, Journal of Economic Perspectives 28(4), 3–22.
Jacob, B. (2002), ‘Where the boys aren’t: non-cognitive skills, returns to school and the gender gap in higher education’,
Economics of Education Review 21(6), 589–598.
Jencks, C. (1979), Who Gets Ahead?, Basic Books, New York.
Kadushin, C. (2011), Understanding Social Networks: Theories, Concepts, and Findings, OUP USA.
Kautz, T., Heckman, J. J., Diris, R., ter Weel, B. and Borghans, L. (2014), ‘Fostering and Measuring Skills. Improving
Cognitive and Non-Cognitive Skills to Promote Lifetime Success’, OECD Education Working Papers 110 1, 1–86.
Krueger, A. and Schkade, D. (2008), ‘Sorting in the Labor Market: Do Gregarious Workers Flock to Interactive Jobs?’,
Journal of Human Resources 43(4), 859–883.
Kuhn, P. and Weinberger, C. (2005), ‘Leadership Skills and Wages’, Journal of Labor Economics 23(3), 395–436.
Levitt, S. D. (2009), ‘The Economic Value of Popularity’.
URL: http://freakonomics.com/2009/03/04/the-economic-value-of-popularity/
Loehlin, J. C. and Rowe, D. C. (1992), Genes, environment and personality, in G.-V. Caprara and G. L. Van Heck, eds,
‘Modern Personality Psychology’, Harvester Wheatsheaf, New York, pp. 352–370.
Manski, C. F. (1993), ‘Identification of Endogenous Social Effects : The Reflection Problem’, Review of Economic Studies
60(3), 531–542.
Marmaros, D. and Sacerdote, B. (2002), ‘Peer and social networks in job search’, European Economic Review 46, 870–879.
Mihaly, K. (2009), ‘Do More Friends Mean Better Grades? Student Popularity and Academic Achievement’, RAND
Working Paper WR-678 .
Montgomery, J. D. (1991), ‘Social Networks and Labor-Market Outcomes: Toward an Economic Analysis’, The American
Economic Review 81(5), 1408–1418.
Moody, J. (2001), ‘Race, School Integration, and Friendship Segregation in America’, American Journal of Sociology
107(3), 679–716.
Moody, J. (2012), ‘Segregation in America’, America 107(3), 679–716.
Nahemow, L. and Lawton, M. P. (1975), ‘Similarity and propinquity in friendship formation.’, Journal of Personality and
Social Psychology 32(2), 205–213.
Newman, M. (2010), Networks: An Introduction, Oxford University Press, Oxford.
Nolfi, G. J. (1979), Experiences of Recent High School Graduates: The Transition to Work or Post-Secondary Education,
Lexington Books.
Parkhurst, J. T. and Hopmeyer, A. (1998), ‘Sociometric Popularity and Peer-Perceived Popularity: Two Distinct Dimensions of Peer Status’, Journal of Early Adolescence 18(2), 125–144.
Pentland, A. (2014), Social Physics: How Good Ideas Spread - The Lessons from a New Science, Penguin Press.
Petersen, T., Saporta, I. and Seidel, M. L. (2000), ‘Offering a Job: Meritocracy and Social Networks’, American Journal
of Sociology 106(3), 763–816.
Postlewaite, A. and Silverman, D. (2005), ‘Social isolation and inequality’, The Journal of Economic Inequality 3(3), 243–262.
Powell, A. G., Farrar, E., Cohen, D. K. et al. (1985), The Shopping Mall High School: Winners and Losers in the
Educational Marketplace, Houghton Mifflin.
Putnam, R. D. (2001), Bowling Alone: The Collapse and Revival of American Community, Simon & Schuster Inc., New
York.
Putnam, R. D. (2015), Our Kids: The American Dream in Crisis, Simon & Schuster Inc., New York.
Rosenbaum, J. (2001), Beyond College for All: Career Paths for the Forgotten Half, Russell Sage Found, New York.
Sacerdote, B. (2001), ‘Peer Effects with Random Assignment: Results for Dartmouth Roommates’, The Quarterly Journal
of Economics 116(May (2)), 681–704.
Sacerdote, B. (2011), Peer Effects in Education: How Might They Work, How Big Are They and How Much Do We Know
Thus Far?, in ‘Handbook of the Economics of Education’, Vol. 3, North Holland, New York, pp. 249–277.
Schmidt, F. L. and Hunter, J. (2004), ‘General Mental Ability in the World of Work: Occupational Attainment and Job
Performance’, Journal of Personality and Social Psychology 86(1), 162–173.
Scourfield, J., Martin, N. and McGuffin, P. (1999), ‘Heritability of Social Cognitive Skills in Children and Adolescents’,
The British Journal of Psychiatry 175(6), 559–564.
Segal, C. (2008), ‘Classroom Behaviour’, The Journal of Human Resources 43(4), 783–814.
Störmer, S. and Fahr, R. (2013), ‘Individual determinants of work attendance: evidence on the role of personality’, Applied
Economics 45(19), 2863–2875.
Tsai, A. C., Pierce, C. M. and Papachristos, A. V. (2015), ‘From social networks to health: Durkheim after the turn of the
millennium’, Social Science & Medicine 125, 1–7.
Wasserman, S. and Faust, K. (1994), Social Network Analysis: Methods and Applications, Cambridge University Press,
Cambridge.
Weiss, C. T. (2010), ‘The Effects of Cognitive and Noncognitive Abilities on Earnings: Different School Systems’, Mimeo .
Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N. and Malone, T. W. (2010), ‘Evidence for a Collective Intelligence
Factor in the Performance of Human Groups’, Science 330(October), 686–688.
Appendices
A Variable Descriptions, Summary Statistics, and Descriptive Graphs
B Complete Estimation Tables
C Results for Alternative Identification Strategies
Appendix A Variable Descriptions, Summary Statistics, and Descriptive Graphs
A.1 Variable Descriptions
Table A1: Variable Definition
Variable             Description                                                     Wave
Network Metrics
Degree               Number of times one is nominated or nominates friends,          I, II
                     without double counting
Indegree             Number of times one is nominated by peers as a friend           I, II
BC                   Bonacich Power Centrality                                       I, II
Other Variables
Earnings             Self-reported earnings                                          IV
BA                   Respondent has a Bachelor Degree                                IV
More BA              Respondent has a qualification higher than a Bachelor Degree    IV
IQ                   Measured with the PVT test                                      I
PTA                  Parents participate in Parent-teacher association               I
High Income          Household has an income greater than $52,000                    I
Many Friends W4      Respondent reports having more than six close friends           IV
                     in Wave IV
Large School         School has more than 1,000 students                             I
Geography Distance   Average distance from peers                                     I

Table A2: AddHealth Waves, Years, and Observations
                          Wave   Year   Obs
In Home Questionnaire     I      1994   20745
IQ and Height Info        I      1994   20369
Network Metrics I         I      1994   8874
Network Metrics II        II     1995   1298
OCEAN Info                IV     2008   6026
In Home Questionnaire     IV     2008   6026
Parents and School Info   I      1994   5382
Only Saturated Schools    I      1994   2047
PTA Info                  I      1912   1557
Network Metrics II        II     1996   1298
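The Degree and Indegree definitions in Table A1 can be illustrated with a small sketch. The four-student nomination matrix below is hypothetical, and the convention that rows nominate columns is an assumption made for the example.

```python
import numpy as np

# Hypothetical directed nominations: G[i, j] = 1 if student i nominates student j.
G = np.array([
    [0, 1, 1, 0],   # student 0 nominates students 1 and 2
    [1, 0, 0, 0],   # student 1 nominates student 0 (a reciprocated tie)
    [0, 0, 0, 1],   # student 2 nominates student 3
    [0, 0, 0, 0],   # student 3 nominates nobody
])

# Indegree: number of times one is nominated by peers (column sums).
indegree = G.sum(axis=0)

# Degree: nominations given or received, counting a reciprocated tie only once.
linked = (G + G.T) > 0
degree = linked.sum(axis=1)
```

The reciprocated tie between students 0 and 1 adds one, not two, to the degree of each, which is the sense of ‘without double counting’ assumed here.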
A.2 Summary Statistics
Figure A1: Degree Observed vs. Normal Distribution
Figure A2: Degree: Poisson, Negative Binomial and Actual
Figure A3: In-Degree: Poisson, Negative Binomial and Actual
Figure A4: Bonacich Power Centrality
Table A3: Std Bonacich Power Centrality - Percentiles
Percentiles
1%     -1.15
5%     -1.15
10%    -1.15        Obs         1450
25%    -0.75        Mean        0.05
50%    -0.15        St.Dev.     1
75%     0.64        Variance    1
90%     1.38        Skewness    1.16
95%     1.89        Kurtosis    4.82
99%     3.31
Table A4: Individual Characteristics
Variable          Mean      Std. Dev.   Min.   Max.   N
Age               15.638    1.432       12     19     1450
Female            0.492     0.5         0      1      1450
Born in the US    0.699     0.459       0      1      1450
White             0.649     0.477       0      1      1450
Black             0.126     0.332       0      1      1450
Asian             0.116     0.32        0      1      1450
Other Ethn        0.109     0.312       0      1      1450
Non Christian     0.026     0.158       0      1      1450
IQ                100.603   12.79       56     133    1387
Atheist           0.117     0.322       0      1      1450
TV5               0.544     0.498       0      1      1450
Sports            0.950     0.219       0      1      1450
Hobbies           0.786     0.41        0      1      1450
Not fit           0.079     0.269       0      1      1450
No clubs          0.35      0.477       0      1      1450
Repeated          0.179     0.383       0      1      1450
Grade 7           0.057     0.231       0      1      1450
Grade 8           0.063     0.243       0      1      1450
Grade 9           0.128     0.334       0      1      1450
Grade 10          0.299     0.458       0      1      1450
Grade 11          0.246     0.431       0      1      1450
Grade 12          0.209     0.407       0      1      1450
Table A5: Individual Characteristics - Wave 4
Variable                 Mean        Std. Dev.   Min.   Max.     N
Age W4                   28.638      1.432       25     32       1450
Earnings                 36171.263   21449.875   1200   140000   1449
Married                  0.433       0.496       0      1        1450
Bachelor                 0.198       0.399       0      1        1450
More than Bachelor       0.083       0.277       0      1        1450
Same state               0.761       0.426       0      1        1450
US citizen               0.968       0.175       0      1        1450
Good health              0.907       0.291       0      1        1450
Always vote              0.249       0.433       0      1        1450
Many Friends             0.348       0.476       0      1        1450
Friends Same Ethnicity   0.451       0.498       0      1        1450

Table A6: Family Characteristics
Variable             Mean    Std. Dev.   Min.   Max.
Both parents         0.607   0.489       0      1
Siblings             2.941   2.066       0      15
Only child           0.036   0.186       0      1
Twin                 0.02    0.14        0      1
No English at home   0.135   0.342       0      1
Mother BA            0.172   0.378       0      1
Father BA            0.168   0.374       0      1
Low income           0.054   0.227       0      1
Middle income        0.564   0.496       0      1
High income          0.381   0.486       0      1
PTA                  0.281   0.45        0      1
N                    1450

Table A7: Schools Characteristics
Variable                     Mean     Std. Dev.   Min.   Max.
Urban                        0.05     0.219       0      1
Suburban                     0.479    0.5         0      1
Rural                        0.471    0.499       0      1
Safe school                  0.643    0.479       0      1
Public                       0.950    0.219       0      1
Private                      0.05     0.219       0      1
Small (1-400 students)       0.229    0.42        0      1
Large (1001-4000 students)   0.771    0.42        0      1
Ability Grouping             0.847    0.36        0      1
Class size                   29.353   7.846       13     38
Prop White                   0.551    0.466       0      1
Prop Black                   0.044    0.152       0      1
Prop Asian                   0.036    0.099       0      0.312
N                            1450
Table A8: Networks Variables
Variable   Mean    Std. Dev.   Min.    Max.
Degree     6.355   4.262       1       24
Indegree   3.188   2.727       0       19
BC         0.8     0.672       0       4.469
Density    0.01    0.018       0.002   0.129
N          1450

A.3 Adjacency Matrix G
The adjacency matrix G is constructed as follows:
1. A square matrix (n × n) of zeros is constructed with the individuals' IDs on both rows and columns
2. For directed networks: every time an individual i nominates another individual j, the element [i,j] is replaced with a one
3. For undirected networks: every time an individual i nominates another individual j, both the elements [i,j] and [j,i] are replaced with a one
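The three steps above can be sketched as follows. This is an illustration only: the function name, the input format (a list of nominator and nominee ID pairs) and the choice to skip self-nominations and nominees outside the sample are assumptions, not details taken from the AddHealth files.

```python
import numpy as np

def build_adjacency(ids, nominations, directed=True):
    """Build the 0/1 adjacency matrix G from (nominator, nominee) ID pairs."""
    index = {pid: k for k, pid in enumerate(ids)}   # row/column position of each ID
    n = len(ids)
    G = np.zeros((n, n), dtype=int)                 # step 1: n x n matrix of zeros
    for i, j in nominations:
        # Assumption: ignore self-nominations and IDs that are not in the sample.
        if i == j or i not in index or j not in index:
            continue
        G[index[i], index[j]] = 1                   # step 2: i nominates j
        if not directed:
            G[index[j], index[i]] = 1               # step 3: mirror the link
    return G

# Hypothetical example with three students.
ids = ["a", "b", "c"]
nominations = [("a", "b"), ("b", "c"), ("a", "c")]
G_directed = build_adjacency(ids, nominations, directed=True)
G_undirected = build_adjacency(ids, nominations, directed=False)
```

The directed matrix is generally asymmetric, while the undirected matrix is symmetric by construction.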
A.4 Adjacency Matrix R
The adjacency matrix R is constructed as follows:
For directed networks:
1. A square matrix (n × n) of zeros is constructed with the individuals' IDs on both rows and columns
2. The first time an individual i nominates a female individual j, the element [i,j] is replaced with 1
3. The second time i nominates a female individual, the corresponding element [i,j] is replaced with 0.8
4. The third time, the element [i,j] is replaced with 0.6
5. The fourth time, the element [i,j] is replaced with 0.4
6. The fifth time, the element [i,j] is replaced with 0.2
7. The process is then repeated for male nominations
For undirected networks:
1. A square matrix (n × n) of zeros is constructed with the individuals' IDs on both rows and columns
2. The first time an individual i nominates a female individual j, the elements [i,j] and [j,i] are replaced with 1
3. The second time i nominates a female individual, the corresponding elements [i,j] and [j,i] are replaced with 0.8
4. The third time, the elements [i,j] and [j,i] are replaced with 0.6
5. The fourth time, the elements [i,j] and [j,i] are replaced with 0.4
6. The fifth time, the elements [i,j] and [j,i] are replaced with 0.2
7. The process is then repeated for male nominations
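A minimal sketch of the rank-weighted construction follows. The input format (for each student, separate ordered lists of female and male nominees) and the function name are hypothetical. For undirected networks, where two students may nominate each other at different ranks, the sketch keeps the larger of the two weights; that tie-breaking rule is an assumption, since the steps above do not specify it.

```python
import numpy as np

# Weights by order of nomination: 1st, 2nd, 3rd, 4th, 5th friend of a given gender.
WEIGHTS = [1.0, 0.8, 0.6, 0.4, 0.2]

def build_intensity_matrix(ids, ranked, directed=True):
    """Build the weighted matrix R from ranked nominations.

    ranked maps each nominator ID to {"female": [...], "male": [...]},
    with nominees listed in the order in which they were named.
    """
    index = {pid: k for k, pid in enumerate(ids)}
    n = len(ids)
    R = np.zeros((n, n))
    for i, by_gender in ranked.items():
        if i not in index:
            continue
        for gender in ("female", "male"):           # same weights for each list
            nominees = by_gender.get(gender, [])[:len(WEIGHTS)]
            for rank, j in enumerate(nominees):
                if j == i or j not in index:
                    continue
                a, b, w = index[i], index[j], WEIGHTS[rank]
                R[a, b] = max(R[a, b], w)
                if not directed:
                    # Assumption: keep the larger weight of a reciprocated tie.
                    R[b, a] = max(R[b, a], w)
    return R

# Hypothetical example with four students.
ids = ["a", "b", "c", "d"]
ranked = {
    "a": {"female": ["b", "c"], "male": ["d"]},
    "c": {"female": ["a"], "male": []},
}
R_directed = build_intensity_matrix(ids, ranked, directed=True)
R_undirected = build_intensity_matrix(ids, ranked, directed=False)
```

Replacing every non-zero entry of R with one recovers the corresponding adjacency matrix G.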
Appendix B Complete Estimation Tables
Table B1: First Stage Estimates
(1)
Degree
-0.055∗∗
(0.025)
(2)
InDegree
-0.020
(0.030)
(3)
BC 0.1
-0.074∗∗∗
(0.022)
0.137∗∗∗
(0.043)
0.210∗∗
(0.082)
0.060
(0.058)
Height
0.002
(0.002)
-0.001
(0.002)
0.003
(0.002)
Safe school
0.034
(0.043)
0.097∗∗
(0.045)
-0.031
(0.031)
Breastfeed
0.001
(0.023)
0.009
(0.031)
0.006
(0.031)
No Clubs
-0.099∗
(0.052)
-0.166∗∗
(0.074)
-0.052
(0.046)
Not fit
-0.207∗∗∗
(0.039)
-0.271∗∗∗
(0.069)
-0.130∗∗
(0.047)
Ever repeated a grade
-0.095∗∗
(0.039)
-0.133∗
(0.073)
-0.063∗∗∗
(0.015)
First-born
0.053
(0.033)
-0.005
(0.025)
0.061
(0.040)
Smoking Prop
0.096∗∗∗
(0.035)
0.109∗∗
(0.046)
0.096
(0.077)
Black Prop
-1.489∗∗∗
(0.187)
-1.922∗∗∗
(0.233)
-1.212∗∗∗
(0.149)
Class size
-0.076∗∗∗
(0.005)
-0.046∗∗∗
(0.004)
0.001
(0.006)
Hobbies
-0.064∗
(0.034)
-0.107∗∗∗
(0.034)
0.005
(0.034)
TV5
0.052∗∗
(0.023)
0.052
(0.041)
0.044∗∗∗
(0.010)
Sports
0.057
(0.076)
0.014
(0.082)
0.134∗
(0.062)
Hang-out
0.166∗
(0.087)
0.280∗∗∗
(0.062)
0.057
(0.099)
Living with both parents
0.019
(0.036)
0.011
(0.069)
0.013
(0.024)
Only child
0.008
(0.151)
0.081
(0.178)
-0.028
(0.126)
No English at home
0.058∗∗
(0.025)
0.028
(0.033)
0.034
(0.030)
Ability Grouping
1.522∗∗∗
(0.112)
1.200∗∗∗
(0.084)
0.030
(0.034)
Small School
-0.622∗∗∗
(0.095)
-0.456∗∗∗
(0.121)
Suburban
-1.479∗∗∗
(0.058)
-1.286∗∗∗
(0.063)
Rural
-1.987∗∗∗
(0.129)
-1.435∗∗∗
(0.123)
Age
Female Prop
-0.113
(0.081)
Public School
0.000
(.)
0.000
(.)
0.122∗∗
(0.043)
Middle Income
-0.029
(0.087)
-0.041
(0.060)
0.033
(0.109)
High Income
-0.011
(0.079)
-0.021
(0.074)
0.054
(0.080)
0.818∗∗∗
(0.091)
0.633∗∗∗
(0.107)
1.019∗∗∗
(0.074)
Other Ethn Prop
-0.034
(0.090)
-0.403∗∗∗
(0.114)
0.270∗∗
(0.086)
US Prop
-0.099
(0.071)
-0.115∗∗
(0.058)
-0.094
(0.094)
Non Christian Prop
-0.139
(4.191)
2.443
(3.946)
-2.916
(3.655)
Atheist Prop
-0.343
(0.338)
-0.210
(0.501)
-0.391∗
(0.212)
0.086∗∗∗
(0.015)
0.065
(0.044)
0.103∗∗∗
(0.030)
Grade FE
School FE
YES
YES
YES
YES
YES
YES
Constant
4.850∗∗∗
(0.711)
3.218∗∗∗
(0.745)
1.070
(0.614)
-1.808∗∗∗
(0.136)
1450
-1.333∗∗∗
(0.072)
1450
Asian Prop
PTA
lnalpha
Constant
Observations
R2
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
1495
0.106
Table B2: Earnings Estimates

                        (1) Degree          (2) InDegree        (3) BC 0.1
Degree                  0.119∗∗ (0.049)
Indegree                                    0.075∗ (0.040)
BC                                                              0.088∗∗ (0.042)
Density                 0.000 (.)           0.000 (.)           11.353∗∗∗ (4.377)
Female                  -0.347∗∗∗ (0.039)   -0.358∗∗∗ (0.041)   -0.326∗∗∗ (0.039)
Black                   0.113 (0.087)       0.096 (0.087)       0.142 (0.103)
Asian                   -0.052 (0.085)      -0.017 (0.082)      -0.103 (0.107)
Other Ethnicity         -0.021 (0.075)      -0.010 (0.075)      -0.043 (0.082)
Bachelor                0.305∗∗∗ (0.054)    0.310∗∗∗ (0.054)    0.302∗∗∗ (0.051)
More than Bachelor      0.360∗∗∗ (0.074)    0.367∗∗∗ (0.074)    0.356∗∗∗ (0.073)
Age                     -0.068∗ (0.039)     -0.085∗∗ (0.038)    -0.060 (0.046)
Catholic                0.145∗∗ (0.058)     0.147∗∗ (0.058)     0.145∗∗ (0.057)
Christian               0.025 (0.060)       0.021 (0.060)       0.025 (0.056)
Other Religion          0.009 (0.059)       0.004 (0.059)       0.012 (0.061)
Not moved               0.069 (0.044)       0.067 (0.044)       0.069 (0.047)
Only child              -0.004 (0.099)      -0.022 (0.099)      0.012 (0.093)
Always vote             0.040 (0.043)       0.040 (0.043)       0.037 (0.041)
Middle Income           0.173∗∗ (0.082)     0.171∗∗ (0.082)     0.150 (0.092)
High Income             0.197∗∗ (0.085)     0.198∗∗ (0.085)     0.170∗ (0.095)
Ability Grouping        -0.173 (0.168)      -0.165 (0.168)      0.409 (0.276)
Ever arrested           -0.115∗∗ (0.045)    -0.115∗∗ (0.045)    -0.115∗∗ (0.049)
Class size              0.019∗∗∗ (0.006)    0.017∗∗∗ (0.006)    0.056∗∗ (0.027)
Small School            0.000 (.)           0.000 (.)           0.368 (0.393)
Married                 0.178∗∗∗ (0.039)    0.178∗∗∗ (0.039)    0.177∗∗∗ (0.036)
Rural                   -0.007 (0.056)      -0.009 (0.056)      -0.009 (0.055)
Mother Degree           0.013 (0.055)       0.016 (0.055)       0.010 (0.049)
Father Degree           0.068 (0.055)       0.068 (0.055)       0.070 (0.055)
Breastfeed              -0.032 (0.039)      -0.030 (0.039)      -0.033 (0.040)
Many Friends            0.138∗∗∗ (0.040)    0.139∗∗∗ (0.040)    0.140∗∗∗ (0.039)
Friends Ethnicity       0.102∗∗ (0.041)     0.101∗∗ (0.041)     0.104∗∗ (0.041)
Years Working           -0.003 (0.008)      -0.003 (0.008)      -0.003 (0.009)
Good health             0.120∗ (0.063)      0.118∗ (0.064)      0.125∗ (0.070)
Does not want children  0.022 (0.073)       0.024 (0.073)       0.024 (0.078)
US citizen              -0.037 (0.108)      -0.042 (0.109)      -0.036 (0.087)
Constant                11.841∗∗∗ (1.278)   12.374∗∗∗ (1.231)   9.572∗∗∗ (1.935)
Birth Month FE          YES                 YES                 YES
Grade FE                YES                 YES                 YES
School FE               YES                 YES                 YES
Observations            1450                1450                1450
R2                      0.212               0.210               0.211
Bootstrapped errors in parentheses (1000 replications)
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table B3: Intensity Relations Results - Baseline

                        (1) Degree          (2) Indegree        (3) BC
Degree                  0.117∗∗ (0.048)
Indegree                                    0.073∗ (0.038)
BC                                                              0.086∗∗ (0.041)
House                   -0.141 (0.154)      -0.145 (0.153)      -0.143 (0.151)
Meet                    0.055 (0.110)       0.055 (0.115)       0.056 (0.113)
Talk                    -0.130 (0.105)      -0.133 (0.104)      -0.138 (0.099)
Phone                   -0.022 (0.115)      -0.016 (0.115)      -0.026 (0.115)
Time                    0.243∗∗ (0.107)     0.242∗∗ (0.101)     0.246∗∗ (0.107)
Observations            1450                1450                1450
R2                      0.216               0.214               0.215
Bootstrapped errors in parentheses (1000 replications)
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table B4: Annual Earnings and Hourly Wage

                        Annual              Hourly
Degree                  0.095*** (0.032)    0.104** (0.043)
Indegree                0.069*** (0.027)    0.071** (0.035)
BC 0.1                  0.064** (0.03)      0.077** (0.039)
Grade FE                Yes                 Yes
School FE               Yes                 Yes
N                       1808                1450
R2                      0.22                0.21
Bootstrapped errors in parentheses (1000 replications)
* p < 0.10, ** p < 0.05, *** p < 0.01
Table B5: Estimates with OCEAN traits

                        Degree              Indegree            BC
Network                 0.095** (0.042)     0.056 (0.034)       0.063* (0.038)
Anxious                 -0.007 (0.006)      -0.007 (0.006)      -0.007 (0.006)
Depression              -0.197*** (0.061)   -0.196*** (0.055)   -0.20*** (0.055)
Extraversion            0.008 (0.006)       0.008 (0.007)       0.008 (0.007)
Agreeableness           -0.010 (0.009)      -0.010 (0.008)      -0.010 (0.009)
Openness                -0.007 (0.009)      -0.007 (0.009)      -0.008 (0.009)
Conscientiousness       0.011 (0.007)       0.011 (0.007)       0.011 (0.007)
N                       1377                1377                1377
R2                      0.32                0.32                0.32
Bootstrapped errors in parentheses (1000 replications)
* p < 0.10, ** p < 0.05, *** p < 0.01
Table B6: Earnings Estimates - Monotonicity of Effects

                        (1) Degree          (2) InDegree        (3) BC 0.1
Degree                  0.123∗∗∗ (0.047)
Indegree                                    0.073∗ (0.040)
BC                                                              0.087∗∗ (0.043)
IQ                      0.000 (0.002)       0.001 (0.002)       -0.001 (0.002)
Degree*IQ               0.022 (0.021)
Indegree*IQ                                 0.020 (0.022)
BC*IQ                                                           0.038∗ (0.020)
Density                 7.104 (4.394)       9.085∗∗ (4.478)     10.780∗∗ (4.425)
Female                  -0.352∗∗∗ (0.043)   -0.360∗∗∗ (0.043)   -0.328∗∗∗ (0.041)
Black                   0.084 (0.099)       0.064 (0.097)       0.103 (0.106)
Asian                   -0.044 (0.092)      -0.004 (0.088)      -0.074 (0.107)
Other Ethnicity         -0.033 (0.084)      -0.025 (0.081)      -0.045 (0.088)
Bachelor                0.305∗∗∗ (0.052)    0.313∗∗∗ (0.053)    0.299∗∗∗ (0.052)
More than Bachelor      0.339∗∗∗ (0.077)    0.348∗∗∗ (0.073)    0.335∗∗∗ (0.074)
Age                     -0.066 (0.043)      -0.085∗∗ (0.042)    -0.065 (0.046)
Catholic                0.139∗∗ (0.058)     0.142∗∗ (0.056)     0.139∗∗ (0.059)
Christian               0.032 (0.057)       0.027 (0.056)       0.035 (0.056)
Other Religion          -0.028 (0.062)      -0.034 (0.063)      -0.023 (0.059)
Moved                   0.082∗ (0.048)      0.080∗ (0.049)      0.083∗ (0.047)
Only child              0.062 (0.082)       0.047 (0.086)       0.079 (0.086)
Always vote             0.034 (0.042)       0.034 (0.044)       0.033 (0.042)
Middle income           0.107 (0.092)       0.106 (0.090)       0.104 (0.090)
High income             0.126 (0.092)       0.129 (0.093)       0.120 (0.092)
Ability Grouping        0.052 (0.297)       0.213 (0.294)       0.394 (0.284)
Ever arrested           -0.117∗∗ (0.047)    -0.117∗∗ (0.048)    -0.112∗∗ (0.048)
Class size              0.027 (0.029)       0.037 (0.030)       0.051∗ (0.028)
Small School            -0.113 (0.441)      0.044 (0.443)       0.354 (0.411)
Married                 0.167∗∗∗ (0.038)    0.167∗∗∗ (0.036)    0.169∗∗∗ (0.037)
Rural                   -0.006 (0.054)      -0.009 (0.056)      -0.008 (0.057)
Mother BA               0.001 (0.050)       0.004 (0.050)       0.000 (0.050)
Father BA               0.077 (0.057)       0.077 (0.056)       0.074 (0.057)
Breastfeed              -0.034 (0.039)      -0.033 (0.038)      -0.030 (0.040)
Many Friends            0.126∗∗∗ (0.040)    0.128∗∗∗ (0.039)    0.128∗∗∗ (0.038)
Friends Same Ethnicity  0.082∗∗ (0.042)     0.082∗ (0.042)      0.084∗∗ (0.041)
Years Working           -0.001 (0.009)      -0.001 (0.010)      -0.001 (0.009)
Good health             0.114 (0.073)       0.112 (0.075)       0.120 (0.075)
Does not want children  0.017 (0.080)       0.021 (0.078)       0.030 (0.081)
US citizen              -0.030 (0.092)      -0.041 (0.093)      -0.033 (0.091)
Birth Month FE          Yes                 Yes                 Yes
Grade FE                Yes                 Yes                 Yes
School FE               Yes                 Yes                 Yes
Constant                11.243∗∗∗ (1.940)   11.157∗∗∗ (2.010)   10.046∗∗∗ (2.005)
Observations            1387                1387                1387
R2                      0.216               0.214               0.216
Bootstrapped errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table B7: Earnings Estimates - Quantiles Dummies

                                (1) Degree          (2) InDegree        (3) BC 0.1
Degree                          0.222 (0.177)
Indegree                                            0.236 (0.155)
BC                                                                      0.005 (0.069)
I Quantile(Degree)              0.000 (0.000)
II Quantile(Degree)             -0.151 (0.172)
III Quantile(Degree)            -0.102 (0.188)
I Quantile(Degree) * Degree     0.000 (0.000)
II Quantile(Degree) * Degree    -0.099 (0.228)
III Quantile(Degree) * Degree   -0.097 (0.182)
I Quantile(ID)                                      0.000 (0.000)
II Quantile(ID)                                     -0.185 (0.146)
III Quantile(ID)                                    -0.152 (0.165)
I Quantile(ID) * ID                                 0.000 (0.000)
II Quantile(ID) * ID                                -0.192 (0.193)
III Quantile(ID) * ID                               -0.170 (0.162)
I Quantile(BC)                                                          0.000 (0.000)
II Quantile(BC)                                                         0.071 (0.082)
III Quantile(BC)                                                        -0.072 (0.112)
I Quantile(BC) * BC                                                     0.000 (0.000)
II Quantile(BC) * BC                                                    0.035 (0.153)
III Quantile(BC) * BC                                                   0.230∗∗ (0.093)
Density                         6.646 (5.245)       7.973∗ (4.583)      10.844∗∗ (4.313)
Female                          -0.353∗∗∗ (0.040)   -0.367∗∗∗ (0.042)   -0.328∗∗∗ (0.042)
Black                           0.086 (0.114)       0.096 (0.113)       0.064 (0.104)
Asian                           -0.031 (0.105)      -0.004 (0.092)      -0.140 (0.108)
Other Ethnicity                 -0.029 (0.085)      -0.026 (0.083)      -0.046 (0.083)
Bachelor                        0.309∗∗∗ (0.053)    0.317∗∗∗ (0.052)    0.298∗∗∗ (0.050)
More than Bachelor              0.342∗∗∗ (0.074)    0.354∗∗∗ (0.075)    0.331∗∗∗ (0.074)
Age                             -0.065 (0.044)      -0.087∗∗ (0.041)    -0.061 (0.046)
Catholic                        0.138∗∗ (0.058)     0.136∗∗ (0.057)     0.133∗∗ (0.057)
Christian                       0.032 (0.056)       0.022 (0.055)       0.032 (0.055)
Other Religion                  -0.025 (0.060)      -0.031 (0.061)      -0.034 (0.061)
Moved State                     0.084∗ (0.049)      0.082∗ (0.049)      0.084∗ (0.046)
Only child                      0.057 (0.088)       0.044 (0.088)       0.076 (0.088)
Always vote                     0.035 (0.044)       0.032 (0.043)       0.035 (0.043)
Middle income                   0.107 (0.085)       0.110 (0.086)       0.091 (0.085)
High income                     0.129 (0.087)       0.134 (0.090)       0.113 (0.088)
Ability Grouping                0.035 (0.352)       0.166 (0.309)       0.402 (0.271)
Ever arrested                   -0.119∗∗ (0.048)    -0.122∗∗ (0.050)    -0.120∗∗ (0.048)
Class size                      0.028 (0.032)       0.033 (0.030)       0.054∗ (0.028)
Small School                    -0.062 (0.487)      0.032 (0.458)       0.393 (0.405)
Married                         0.166∗∗∗ (0.037)    0.169∗∗∗ (0.037)    0.168∗∗∗ (0.037)
Rural                           -0.007 (0.056)      -0.015 (0.055)      -0.015 (0.056)
Mother BA                       0.001 (0.050)       0.009 (0.050)       -0.009 (0.052)
Father BA                       0.075 (0.058)       0.077 (0.055)       0.074 (0.056)
Breastfeed                      -0.032 (0.040)      -0.031 (0.039)      -0.034 (0.041)
Many Friends                    0.128∗∗∗ (0.039)    0.128∗∗∗ (0.039)    0.132∗∗∗ (0.041)
Friends Same Ethnicity          0.082∗∗ (0.041)     0.083∗∗ (0.041)     0.087∗∗ (0.041)
Years Working                   -0.002 (0.010)      -0.002 (0.009)      -0.001 (0.009)
Good health                     0.114 (0.075)       0.113 (0.076)       0.121∗ (0.070)
Does not want children          0.018 (0.082)       0.020 (0.080)       0.022 (0.077)
US citizen                      -0.035 (0.088)      -0.036 (0.089)      -0.043 (0.090)
Birth Month FE                  Yes                 Yes                 Yes
Grade FE                        Yes                 Yes                 Yes
School FE                       Yes                 Yes                 Yes
Constant                        11.361∗∗∗ (1.944)   11.714∗∗∗ (1.924)   9.663∗∗∗ (1.961)
Observations                    1387                1387                1387
R2                              0.217               0.214               0.217
Bootstrapped errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table B8: Quantile Estimation Results - BC

                        (1) Q0.1    (2) Q 0.25  (3) Q 0.5   (4) Q 0.75  (5) Q0.9    (6) Q(0.9)
BC                      0.088∗∗     0.201       0.121∗∗     0.043       0.024       0.010
                        (0.040)     (0.131)     (0.060)     (0.037)     (0.034)     (0.052)
Density                 0.000       26.379∗∗    11.248∗∗    8.379∗∗     6.264∗∗     -1.010
                        (.)         (11.538)    (5.308)     (3.283)     (3.032)     (4.557)
Female                  -0.326∗∗∗   -0.410∗∗∗   -0.304∗∗∗   -0.271∗∗∗   -0.327∗∗∗   -0.320∗∗∗
                        (0.039)     (0.129)     (0.059)     (0.037)     (0.034)     (0.051)
Black                   0.142       0.259       0.290∗∗     0.192∗∗     0.020       -0.088
                        (0.094)     (0.311)     (0.143)     (0.088)     (0.082)     (0.123)
Asian                   -0.103      -0.348      -0.112      0.008       -0.020      -0.048
                        (0.098)     (0.325)     (0.149)     (0.092)     (0.085)     (0.128)
Other Ethnicity         -0.043      -0.099      -0.099      0.021       -0.037      -0.024
                        (0.075)     (0.249)     (0.114)     (0.071)     (0.065)     (0.098)
Bachelor                0.302∗∗∗    0.382∗∗     0.283∗∗∗    0.278∗∗∗    0.198∗∗∗    0.179∗∗
                        (0.055)     (0.180)     (0.083)     (0.051)     (0.047)     (0.071)
More than Bachelor      0.356∗∗∗    0.246       0.385∗∗∗    0.385∗∗∗    0.333∗∗∗    0.228∗∗
                        (0.074)     (0.245)     (0.113)     (0.070)     (0.064)     (0.097)
Age                     -0.060      -0.103      -0.026      -0.055      -0.056      -0.086
                        (0.041)     (0.137)     (0.063)     (0.039)     (0.036)     (0.054)
Catholic                0.145∗∗     0.174       0.181∗∗     0.155∗∗∗    0.155∗∗∗    0.088
                        (0.058)     (0.192)     (0.088)     (0.055)     (0.050)     (0.076)
Christian               0.025       0.019       -0.074      -0.008      0.079       -0.022
                        (0.060)     (0.198)     (0.091)     (0.056)     (0.052)     (0.078)
Other Religion          0.012       -0.042      0.040       0.037       0.069       0.110
                        (0.059)     (0.197)     (0.090)     (0.056)     (0.052)     (0.078)
Not moved               0.069       0.140       0.142∗∗     0.053       0.007       -0.002
                        (0.044)     (0.147)     (0.067)     (0.042)     (0.039)     (0.058)
Only child              0.012       0.075       0.096       -0.056      0.066       -0.052
                        (0.099)     (0.328)     (0.151)     (0.093)     (0.086)     (0.129)
Always vote             0.037       0.092       -0.017      0.046       0.061       0.051
                        (0.043)     (0.143)     (0.066)     (0.041)     (0.038)     (0.056)
Middle Income           0.150∗      0.371       0.192       0.123       0.159∗∗     0.150
                        (0.082)     (0.272)     (0.125)     (0.077)     (0.071)     (0.107)
High Income             0.170∗∗     0.473∗      0.189       0.152∗      0.212∗∗∗    0.211∗
                        (0.086)     (0.284)     (0.130)     (0.081)     (0.075)     (0.112)
Ability Grouping        -0.350      1.062       0.118       0.362∗      0.325∗      -0.096
                        (0.224)     (0.717)     (0.330)     (0.204)     (0.188)     (0.283)
Ever arrested           -0.115∗∗    -0.219      -0.161∗∗    -0.079∗     -0.055      -0.056
                        (0.045)     (0.149)     (0.069)     (0.043)     (0.039)     (0.059)
Class size              0.014∗∗     0.153∗      0.074∗∗     0.031       0.022       -0.015
                        (0.006)     (0.078)     (0.036)     (0.022)     (0.021)     (0.031)
Small School            0.000       1.186       0.397       0.192       0.095       -0.469
                        (.)         (1.285)     (0.591)     (0.366)     (0.338)     (0.508)
Married                 0.177∗∗∗    0.372∗∗∗    0.204∗∗∗    0.147∗∗∗    0.126∗∗∗    0.074
                        (0.039)     (0.128)     (0.059)     (0.036)     (0.034)     (0.051)
Rural                   -0.009      0.081       -0.056      -0.065      -0.029      0.052
                        (0.056)     (0.186)     (0.085)     (0.053)     (0.049)     (0.073)
Mother Education        0.010       0.162       0.014       -0.049      -0.051      0.031
                        (0.055)     (0.183)     (0.084)     (0.052)     (0.048)     (0.072)
Father Education        0.070       0.060       0.047       0.082       0.183∗∗∗    0.081
                        (0.055)     (0.183)     (0.084)     (0.052)     (0.048)     (0.072)
Breastfeed              -0.033      -0.067      0.001       -0.028      -0.022      0.023
                        (0.039)     (0.129)     (0.059)     (0.037)     (0.034)     (0.051)
Many Friends            0.140∗∗∗    0.160       0.163∗∗∗    0.094∗∗     0.095∗∗∗    0.063
                        (0.040)     (0.133)     (0.061)     (0.038)     (0.035)     (0.052)
Friends Ethnicity       0.104∗∗     0.157       0.149∗∗     0.062       0.077∗∗     0.020
                        (0.041)     (0.135)     (0.062)     (0.038)     (0.035)     (0.053)
Years Working           -0.003      -0.027      -0.004      0.002       0.009       0.004
                        (0.008)     (0.028)     (0.013)     (0.008)     (0.007)     (0.011)
Good health             0.125∗∗     0.225       0.091       0.091       0.091∗      -0.051
                        (0.063)     (0.210)     (0.096)     (0.060)     (0.055)     (0.083)
Does not want children  0.024       0.035       0.016       0.070       0.056       0.061
                        (0.073)     (0.240)     (0.110)     (0.068)     (0.063)     (0.095)
US citizen              -0.036      -0.070      0.029       0.112       0.107       -0.076
                        (0.109)     (0.359)     (0.165)     (0.102)     (0.094)     (0.142)
Constant                11.947∗∗∗   5.859       7.702∗∗∗    10.289∗∗∗   10.845∗∗∗   14.395∗∗∗
                        (1.299)     (5.383)     (2.477)     (1.532)     (1.415)     (2.126)
Birth Month FE          YES         YES         YES         YES         YES         YES
Grade FE                YES         YES         YES         YES         YES         YES
School FE               YES         YES         YES         YES         YES         YES
Observations            1450        1450        1450        1450        1450        1450
R2                      0.211
Bootstrapped errors in parentheses (500 replications)
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table B9: Quantile Estimation Results - Degree

                        (1) Q0.01   (2) Q 0.25  (3) Q 0.5   (4) Q 0.75  (5) Q 0.9   (6) Q(0.9)
Degree                  0.118∗∗     0.189       0.215∗∗∗    0.083∗      0.065       0.037
                        (0.055)     (0.124)     (0.074)     (0.049)     (0.041)     (0.060)
Female                  -0.352∗∗∗   -0.448∗∗∗   -0.386∗∗∗   -0.285∗∗∗   -0.328∗∗∗   -0.336∗∗∗
                        (0.038)     (0.107)     (0.061)     (0.041)     (0.040)     (0.049)
Bachelor                0.306∗∗∗    0.402∗∗∗    0.291∗∗∗    0.281∗∗∗    0.202∗∗∗    0.182∗∗∗
                        (0.059)     (0.136)     (0.076)     (0.053)     (0.049)     (0.068)
More than Bachelor      0.360∗∗∗    0.342       0.369∗∗∗    0.382∗∗∗    0.329∗∗∗    0.226∗∗∗
                        (0.071)     (0.210)     (0.104)     (0.066)     (0.066)     (0.071)
Birth Month FE          YES         YES         YES         YES         YES         YES
Grade FE                YES         YES         YES         YES         YES         YES
School FE               YES         YES         YES         YES         YES         YES
Observations            1450        1450        1450        1450        1450        1450
R2                      0.211
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Appendix C
Results for Alternative Identification Strategies
Table C1: Wave 2 Estimates - Stand

                        (1) Degree          (2) InDegree        (3) BC 0.1
Degree WII              0.230∗∗∗ (0.051)
Indegree WII                                0.205∗∗∗ (0.049)
BC WII                                                          0.101∗∗ (0.041)
Degree WI               0.010∗ (0.005)
Indegree WI                                 0.016∗∗ (0.008)
BC WI                                                           0.033 (0.030)
Density                 12.910∗∗ (5.014)    12.883∗∗∗ (4.994)   9.420∗∗ (4.791)
Female                  -0.362∗∗∗ (0.044)   -0.362∗∗∗ (0.045)   -0.375∗∗∗ (0.046)
Black                   0.203∗∗ (0.103)     0.172∗ (0.104)      0.204∗ (0.112)
Asian                   -0.099 (0.108)      -0.131 (0.112)      -0.059 (0.106)
Other Ethnicity         -0.144 (0.107)      -0.146 (0.110)      -0.144 (0.105)
Bachelor                0.294∗∗∗ (0.057)    0.311∗∗∗ (0.055)    0.303∗∗∗ (0.059)
More than Bachelor      0.261∗∗∗ (0.088)    0.277∗∗∗ (0.092)    0.281∗∗∗ (0.090)
Age                     -0.082 (0.053)      -0.071 (0.056)      -0.110∗∗ (0.053)
Catholic                0.110∗ (0.065)      0.105∗ (0.062)      0.123∗∗ (0.062)
Christian               -0.023 (0.064)      -0.031 (0.067)      -0.026 (0.064)
Other Religion          0.046 (0.068)       0.033 (0.069)       0.050 (0.069)
Not moved               0.023 (0.053)       0.019 (0.052)       0.021 (0.054)
Only child              0.039 (0.103)       0.016 (0.103)       0.034 (0.100)
Always vote             0.046 (0.047)       0.046 (0.047)       0.046 (0.047)
Middle Income           0.088 (0.116)       0.123 (0.107)       0.079 (0.117)
High Income             0.085 (0.117)       0.123 (0.112)       0.073 (0.117)
Ability Grouping        0.314 (0.305)       0.357 (0.301)       0.224 (0.291)
Ever arrested           -0.128∗∗ (0.060)    -0.134∗∗ (0.059)    -0.122∗∗ (0.056)
Class size              0.062∗ (0.033)      0.064∗∗ (0.032)     0.030 (0.032)
Small School            -0.022 (0.468)      0.030 (0.451)       -0.166 (0.472)
Married                 0.185∗∗∗ (0.043)    0.181∗∗∗ (0.045)    0.183∗∗∗ (0.045)
Rural                   -0.016 (0.062)      -0.013 (0.062)      -0.021 (0.062)
Mother Education        -0.069 (0.057)      -0.066 (0.058)      -0.069 (0.057)
Father Education        0.158∗∗∗ (0.060)    0.167∗∗∗ (0.059)    0.153∗∗ (0.062)
Breastfeed              -0.052 (0.045)      -0.026 (0.046)      -0.052 (0.047)
Many Friends            0.188∗∗∗ (0.043)    0.190∗∗∗ (0.045)    0.199∗∗∗ (0.044)
Friends Same Ethnicity  0.115∗∗ (0.048)     0.106∗∗ (0.049)     0.120∗∗ (0.048)
Years Working           -0.004 (0.011)      -0.005 (0.012)      -0.003 (0.012)
Good health             0.145∗ (0.087)      0.137 (0.087)       0.150∗ (0.090)
Does not want children  0.095 (0.088)       0.095 (0.090)       0.084 (0.089)
US citizen              -0.164 (0.114)      -0.159 (0.114)      -0.175 (0.111)
Constant                10.882∗∗∗ (2.398)   10.363∗∗∗ (2.455)   12.726∗∗∗ (2.317)
Birth Month FE          YES                 YES                 YES
Grade FE                YES                 YES                 YES
School FE               YES                 YES                 YES
Observations            1029                1029                1029
R2                      0.258               0.259               0.248
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C2: Wave 2 Estimates

                        (1) Degree          (2) InDegree        (3) BC 0.1
Degree WII              0.102∗∗∗ (0.030)
Indegree WII                                0.080∗∗∗ (0.023)
BC WII                                                          0.063∗∗ (0.026)
Observations            1029                1029                1029
R2                      0.250               0.249               0.246
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C3: Wave 2 Estimates - Stand

                                (1) Degree          (2) InDegree        (3) BC 0.1
Degree                          0.105∗∗∗ (0.027)
Indegree                                            0.077∗∗∗ (0.021)
BC                                                                      0.069∗∗ (0.029)
Density                         10.892∗∗ (5.030)    10.384∗∗ (5.225)    9.291∗ (5.124)
Female                          -0.372∗∗∗ (0.045)   -0.374∗∗∗ (0.046)   -0.375∗∗∗ (0.045)
Black                           0.128 (0.104)       0.113 (0.105)       0.145 (0.106)
Asian                           -0.037 (0.105)      -0.043 (0.104)      -0.026 (0.107)
Other Ethnicity                 -0.139 (0.103)      -0.135 (0.110)      -0.144 (0.107)
Bachelor                        0.301∗∗∗ (0.055)    0.318∗∗∗ (0.058)    0.303∗∗∗ (0.058)
More than Bachelor              0.267∗∗∗ (0.086)    0.287∗∗∗ (0.088)    0.278∗∗∗ (0.088)
Age                             -0.103∗ (0.054)     -0.101∗ (0.054)     -0.115∗∗ (0.052)
Catholic                        0.117∗ (0.067)      0.118∗ (0.063)      0.125∗ (0.064)
Christian                       -0.033 (0.062)      -0.042 (0.065)      -0.027 (0.066)
Other Religion                  0.040 (0.072)       0.034 (0.064)       0.046 (0.070)
Did not move outside the state  0.014 (0.054)       0.011 (0.049)       0.016 (0.056)
Only child                      0.003 (0.105)       -0.009 (0.105)      0.008 (0.100)
Always vote                     0.053 (0.044)       0.049 (0.047)       0.054 (0.043)
Middle Income                   0.124 (0.108)       0.136 (0.102)       0.114 (0.112)
High Income                     0.114 (0.114)       0.124 (0.109)       0.106 (0.115)
Ability Grouping                0.291 (0.302)       0.290 (0.322)       0.224 (0.301)
Ever arrested                   -0.142∗∗ (0.059)    -0.143∗∗ (0.058)    -0.129∗∗ (0.059)
Class size                      0.049 (0.034)       0.046 (0.033)       0.033 (0.033)
Small School (1-400 students)   0.016 (0.490)       0.028 (0.476)       -0.104 (0.467)
Married                         0.182∗∗∗ (0.044)    0.181∗∗∗ (0.045)    0.182∗∗∗ (0.043)
Rural                           -0.031 (0.062)      -0.028 (0.063)      -0.030 (0.064)
Mother Degree                   -0.073 (0.057)      -0.073 (0.057)      -0.072 (0.062)
Father Degree                   0.144∗∗ (0.058)     0.166∗∗∗ (0.060)    0.122∗ (0.066)
Breastfeed                      -0.040 (0.046)      -0.033 (0.046)      -0.040 (0.045)
Many Friends                    0.200∗∗∗ (0.044)    0.208∗∗∗ (0.044)    0.201∗∗∗ (0.043)
Friends Same Ethnicity          0.122∗∗ (0.050)     0.118∗∗ (0.047)     0.122∗∗∗ (0.047)
Years Working                   -0.002 (0.011)      -0.002 (0.012)      -0.001 (0.013)
Good health                     0.172∗∗ (0.084)     0.165∗ (0.088)      0.171∗∗ (0.086)
Does not want children          0.100 (0.090)       0.100 (0.088)       0.089 (0.091)
US citizen                      -0.188∗ (0.111)     -0.186 (0.117)      -0.198 (0.123)
Birth Month FE                  Yes                 Yes                 Yes
Grade FE                        Yes                 Yes                 Yes
School FE                       Yes                 Yes                 Yes
Constant                        11.656∗∗∗ (2.367)   11.598∗∗∗ (2.429)   12.588∗∗∗ (2.344)
Observations                    1023                1023                1023
R2                              0.250               0.248               0.246
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C4: Differences

                        (1) Degree          (2) Indegree        (3) BC
∆Degree                 0.150∗∗ (0.071)
∆Indegree                                   0.041 (0.066)
∆BC                                                             0.086∗ (0.045)
Density                 3.444 (2.754)       4.564∗ (2.671)      4.987∗ (2.620)
Female                  -0.246∗∗∗ (0.089)   -0.290∗∗∗ (0.101)   -0.327∗∗∗ (0.075)
Black                   -0.082 (0.192)      -0.023 (0.191)      -0.043 (0.186)
Asian                   0.136 (0.147)       0.072 (0.149)       0.189 (0.151)
Other Ethnicity         -0.056 (0.147)      -0.113 (0.151)      -0.015 (0.146)
Bachelor                0.631∗∗∗ (0.105)    0.614∗∗∗ (0.105)    0.627∗∗∗ (0.105)
More than Bachelor      0.715∗∗∗ (0.109)    0.707∗∗∗ (0.110)    0.714∗∗∗ (0.109)
Age                     -0.010 (0.040)      -0.027 (0.041)      -0.025 (0.038)
Catholic                0.076 (0.098)       0.081 (0.098)       0.066 (0.098)
Christian               -0.032 (0.108)      -0.017 (0.108)      -0.034 (0.108)
Other Religion          -0.006 (0.111)      -0.004 (0.113)      -0.012 (0.113)
Not moved               -0.015 (0.093)      -0.015 (0.093)      -0.017 (0.092)
Only child              0.052 (0.186)       0.051 (0.185)       0.120 (0.189)
Always vote             -0.061 (0.077)      -0.048 (0.077)      -0.053 (0.077)
Middle Income           -0.264∗ (0.156)     -0.161 (0.154)      -0.261∗ (0.157)
High Income             -0.150 (0.161)      -0.079 (0.165)      -0.140 (0.161)
Ability Grouping        0.037 (0.185)       0.032 (0.192)       0.001 (0.184)
Ever arrested           -0.019 (0.091)      -0.021 (0.092)      -0.011 (0.091)
Class size              0.014 (0.009)       0.018∗ (0.009)      0.016∗ (0.009)
Small School            -0.393∗∗ (0.187)    -0.245 (0.181)      -0.245 (0.175)
Married                 0.152∗∗ (0.072)     0.151∗∗ (0.073)     0.155∗∗ (0.073)
Rural                   -0.102 (0.094)      -0.101 (0.094)      -0.099 (0.094)
Mother Education        -0.121 (0.114)      -0.128 (0.114)      -0.101 (0.116)
Father Education        0.155∗ (0.091)      0.157∗ (0.092)      0.151 (0.092)
Breastfeed              -0.057 (0.074)      -0.053 (0.076)      -0.026 (0.074)
Many Friends            0.283∗∗∗ (0.075)    0.287∗∗∗ (0.075)    0.285∗∗∗ (0.075)
Ethnicity Friends       -0.040 (0.081)      -0.040 (0.082)      -0.036 (0.081)
Years Working           0.027 (0.020)       0.029 (0.020)       0.028 (0.020)
Good health             -0.033 (0.111)      -0.036 (0.111)      -0.055 (0.110)
Does not want children  0.169 (0.133)       0.153 (0.134)       0.156 (0.133)
US citizen              0.087 (0.248)       0.064 (0.254)       0.090 (0.252)
Constant                9.099∗∗∗ (1.233)    9.336∗∗∗ (1.309)    9.450∗∗∗ (1.183)
Observations            551                 551                 551
R2                      0.274               0.268               0.272
Standard errors in parentheses
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C5: Two stage Results

                        (1)                 (2)
BC                      0.108∗∗∗ (0.026)    0.108∗∗∗ (0.033)
Female                  -0.273∗∗∗ (0.039)   -0.289∗∗∗ (0.041)
Bachelor                0.339∗∗∗ (0.043)    0.350∗∗∗ (0.044)
More than Bachelor      0.396∗∗∗ (0.070)    0.405∗∗∗ (0.067)
Age                     0.084∗∗∗ (0.015)    0.020 (0.038)
White                   -0.094∗ (0.050)     0.065 (0.078)
Black                   0.153∗ (0.091)      0.160 (0.102)
Middle Income           0.131 (0.091)       0.142 (0.093)
High Income             0.165∗ (0.094)      0.159∗ (0.096)
Constant                7.796∗∗∗ (0.466)    9.307∗∗∗ (1.186)
School FE               No                  Yes
Grade FE                No                  Yes
Observations            1495                1495
R2                      0.131               0.155
Bootstrapped errors in parentheses, 1000 replications
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C6: LiM

                                (1)                 (2)                 (3)
Log(Mean Earn)                  0.393∗∗∗ (0.111)    -0.450 (0.475)      -0.451 (0.493)
Female                          -0.316∗∗∗ (0.093)   -0.316∗∗∗ (0.098)   -0.312∗∗∗ (0.097)
Mean Female                     -0.061 (0.182)      -0.793∗∗ (0.314)    -0.788∗∗ (0.321)
Bachelor                        0.373∗∗∗ (0.034)    0.372∗∗∗ (0.034)    0.358∗∗∗ (0.025)
Mean BA                         -0.119 (0.202)      0.404 (0.539)       0.433 (0.553)
More than Bachelor              0.446∗∗∗ (0.053)    0.451∗∗∗ (0.055)    0.433∗∗∗ (0.056)
Mean moreBA                     0.000 (.)           0.000 (.)           0.000 (.)
Age                             -0.042 (0.028)      -0.041 (0.026)      -0.037 (0.023)
White                           -0.039∗∗ (0.014)    -0.049∗∗∗ (0.013)   -0.041∗∗ (0.018)
Black                           -0.084∗∗∗ (0.020)   -0.084∗∗∗ (0.020)   -0.061∗ (0.032)
Mean White                      -0.163∗∗∗ (0.050)   0.540 (0.419)       0.477 (0.412)
Mean Black                      -0.250∗∗ (0.088)    1.733 (0.989)       1.753 (0.987)
Mean Age                        0.060 (0.037)       0.020 (0.319)       -0.007 (0.313)
Middle Income                   0.153∗∗ (0.058)     0.140∗∗ (0.056)     0.139∗∗ (0.058)
High Income                     0.188∗∗ (0.067)     0.162∗∗ (0.061)     0.162∗∗ (0.062)
Mean High Income                -0.023 (0.584)      -0.912 (0.918)      -0.913 (0.936)
Mean Middle Income              -0.274 (0.512)      -0.610 (0.840)      -0.597 (0.844)
Standardized values of (bon)                                            0.034 (0.028)
Constant                        5.861∗∗∗ (0.839)    16.840 (10.183)     17.617 (10.181)
School FE                       No                  Yes                 Yes
Grade FE                        No                  Yes                 Yes
Observations                    1870                1870                1870
R2                              0.135               0.151               0.152
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C7: LiM Friends Earnings

                                (1)                 (2)                 (3)
Log(Mean Friends’ Earn)         0.068 (0.051)       0.011 (0.034)       0.003 (0.034)
Friends Female                  0.036 (0.032)       -0.022 (0.040)      -0.024 (0.041)
Friends BA                      0.116 (0.067)       0.172∗∗ (0.067)     0.165∗∗ (0.066)
Friends moreBA                  0.062 (0.035)       0.125∗∗∗ (0.039)    0.121∗∗ (0.040)
Friends Age                     0.014 (0.028)       -0.059 (0.042)      -0.061 (0.041)
Friends White                   0.063∗∗ (0.026)     0.189∗∗∗ (0.027)    0.196∗∗∗ (0.029)
Friends Black                   -0.048 (0.067)      -0.061 (0.069)      -0.056 (0.070)
Friends High Income             0.141 (0.099)       0.036 (0.077)       0.041 (0.076)
Friends Middle Income           0.021 (0.090)       -0.025 (0.087)      -0.023 (0.090)
Female                          -0.301∗∗∗ (0.050)   -0.320∗∗∗ (0.059)   -0.317∗∗∗ (0.060)
Bachelor                        0.324∗∗∗ (0.032)    0.337∗∗∗ (0.027)    0.327∗∗∗ (0.025)
More than Bachelor              0.414∗∗∗ (0.097)    0.413∗∗∗ (0.091)    0.396∗∗∗ (0.092)
Age                             0.041∗∗ (0.016)     -0.049∗∗ (0.018)    -0.045∗∗ (0.017)
White                           -0.128∗∗∗ (0.017)   -0.038∗∗ (0.013)    -0.033∗∗ (0.013)
Black                           0.002 (0.051)       0.035 (0.076)       0.058 (0.074)
Middle Income                   0.182∗∗ (0.069)     0.186∗∗ (0.063)     0.184∗∗ (0.067)
High Income                     0.246∗∗∗ (0.071)    0.215∗∗∗ (0.065)    0.214∗∗ (0.068)
BC                                                                      0.034∗ (0.019)
Constant                        7.774∗∗∗ (0.516)    12.835∗∗∗ (0.882)   12.822∗∗∗ (0.849)
School FE                       No                  Yes                 Yes
Grade FE                        No                  Yes                 Yes
Observations                    1428                1404                1404
R2                              0.134               0.176               0.178
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C8: Undirected Network - Basic and Centrality

                                (1)                 (2)                 (3)
Log(Adj*Earn)                   0.153∗∗ (0.065)     0.096 (0.057)       0.095 (0.056)
Adj*Female                      -0.014 (0.017)      -0.005 (0.012)      -0.004 (0.012)
Adj*BA                          0.031 (0.024)       0.062∗∗∗ (0.019)    0.060∗∗ (0.019)
Adj*moreBA                      -0.001 (0.022)      0.038∗ (0.020)      0.034 (0.021)
Adj*Age                         -0.003 (0.002)      -0.003 (0.002)      -0.004 (0.002)
Adj*White                       -0.005 (0.011)      0.032∗ (0.016)      0.037∗ (0.019)
Adj*Black                       0.028 (0.032)       0.023 (0.017)       0.029 (0.019)
Adj*High Income                 0.073 (0.049)       0.063 (0.050)       0.063 (0.050)
Adj*Middle Income               0.018 (0.054)       0.005 (0.060)       0.003 (0.060)
Female                          -0.286∗∗∗ (0.074)   -0.311∗∗∗ (0.083)   -0.309∗∗∗ (0.084)
Bachelor                        0.314∗∗∗ (0.023)    0.313∗∗∗ (0.026)    0.310∗∗∗ (0.026)
More than Bachelor              0.401∗∗∗ (0.092)    0.389∗∗∗ (0.087)    0.384∗∗∗ (0.088)
Age                             0.040∗∗∗ (0.012)    -0.052∗∗ (0.020)    -0.051∗∗ (0.020)
White                           -0.114∗∗ (0.044)    -0.007 (0.013)      -0.006 (0.013)
Black                           -0.104∗∗∗ (0.030)   -0.075 (0.051)      -0.071 (0.052)
Middle Income                   0.158∗∗ (0.055)     0.168∗∗ (0.056)     0.167∗∗ (0.057)
High Income                     0.206∗∗∗ (0.061)    0.183∗∗ (0.063)     0.182∗∗ (0.064)
Standardized values of (bon)                                            0.030∗ (0.014)
Constant                        7.437∗∗∗ (0.756)    11.200∗∗∗ (0.589)   11.229∗∗∗ (0.591)
School FE                       No                  Yes                 Yes
Grade FE                        No                  Yes                 Yes
Observations                    1595                1567                1567
R2                              0.135               0.166               0.166
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C9: Undirected Network Intensity - Ranking

                        (1)                 (2)                 (3)
Log(R*Earn)             0.134 (0.080)       0.080 (0.057)       0.080 (0.057)
R*Female                -0.048∗∗∗ (0.013)   -0.046∗∗∗ (0.010)   -0.045∗∗∗ (0.011)
R*BA                    0.062∗∗ (0.025)     0.096∗∗∗ (0.015)    0.096∗∗∗ (0.014)
R*moreBA                -0.017 (0.023)      0.022 (0.022)       0.021 (0.022)
R*Age                   0.000 (0.003)       -0.000 (0.003)      -0.000 (0.003)
R*White                 -0.020∗ (0.011)     0.027∗∗ (0.011)     0.028∗ (0.013)
R*Black                 -0.009 (0.028)      -0.009 (0.021)      -0.007 (0.023)
R*High Income           0.017 (0.063)       0.009 (0.054)       0.009 (0.054)
R*Middle Income         -0.017 (0.078)      -0.029 (0.075)      -0.029 (0.076)
Female                  -0.279∗∗∗ (0.081)   -0.302∗∗∗ (0.091)   -0.302∗∗∗ (0.091)
Bachelor                0.303∗∗∗ (0.022)    0.299∗∗∗ (0.022)    0.298∗∗∗ (0.024)
More than Bachelor      0.377∗∗∗ (0.090)    0.360∗∗∗ (0.085)    0.359∗∗∗ (0.086)
Age                     0.039∗∗ (0.013)     -0.051∗∗ (0.021)    -0.051∗∗ (0.020)
White                   -0.129∗∗ (0.057)    0.005 (0.014)       0.005 (0.014)
Black                   -0.046 (0.048)      -0.024 (0.049)      -0.023 (0.049)
Middle Income           0.163∗∗ (0.053)     0.176∗∗∗ (0.054)    0.175∗∗∗ (0.055)
High Income             0.214∗∗∗ (0.061)    0.192∗∗∗ (0.060)    0.191∗∗∗ (0.060)
BC                                                              0.007 (0.020)
Constant                7.637∗∗∗ (0.874)    11.292∗∗∗ (0.498)   11.283∗∗∗ (0.498)
School FE               No                  Yes                 Yes
Grade FE                No                  Yes                 Yes
Observations            1595                1567                1567
R2                      0.138               0.171               0.171
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C10: Directed Network - Basic and Centrality

                                (1)                 (2)                 (3)
Log(Adj*Earn)                   0.061 (0.048)       0.027 (0.033)       0.022 (0.032)
Adj*Female                      -0.008 (0.013)      -0.008 (0.013)      -0.009 (0.014)
Adj*BA                          0.050∗ (0.025)      0.070∗∗∗ (0.020)    0.066∗∗ (0.021)
Adj*moreBA                      0.017 (0.021)       0.048∗ (0.024)      0.042 (0.025)
Adj*Age                         -0.003 (0.002)      -0.003 (0.002)      -0.003 (0.002)
Adj*White                       0.002 (0.013)       0.029 (0.017)       0.039∗ (0.019)
Adj*Black                       -0.028 (0.040)      -0.005 (0.023)      -0.001 (0.026)
Adj*High Income                 0.090∗ (0.044)      0.058 (0.050)       0.056 (0.050)
Adj*Middle Income               0.012 (0.046)       0.005 (0.058)       0.002 (0.059)
Female                          -0.279∗∗∗ (0.060)   -0.315∗∗∗ (0.068)   -0.311∗∗∗ (0.068)
Bachelor                        0.308∗∗∗ (0.029)    0.307∗∗∗ (0.027)    0.302∗∗∗ (0.027)
More than Bachelor              0.397∗∗∗ (0.098)    0.386∗∗∗ (0.093)    0.375∗∗∗ (0.095)
Age                             0.049∗∗∗ (0.011)    -0.059∗∗ (0.019)    -0.057∗∗ (0.018)
White                           -0.106∗∗ (0.042)    -0.010 (0.019)      -0.009 (0.019)
Black                           -0.007 (0.040)      -0.029 (0.050)      -0.012 (0.056)
Middle Income                   0.179∗∗ (0.064)     0.195∗∗∗ (0.060)    0.192∗∗∗ (0.061)
High Income                     0.238∗∗∗ (0.066)    0.215∗∗∗ (0.059)    0.213∗∗∗ (0.060)
Standardized values of (bon)                                            0.053∗∗ (0.017)
Constant                        8.150∗∗∗ (0.581)    11.552∗∗∗ (0.482)   11.617∗∗∗ (0.486)
School FE                       No                  Yes                 Yes
Grade FE                        No                  Yes                 Yes
Observations                    1428                1404                1404
R2                              0.138               0.174               0.176
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01
Table C11: Directed Network Intensity - Ranking

                        (1)                 (2)                 (3)
Log(R*Earn)             0.055 (0.050)       0.021 (0.033)       0.017 (0.033)
R*Female                -0.020 (0.021)      -0.025 (0.016)      -0.027 (0.017)
R*BA                    0.080∗∗ (0.030)     0.109∗∗∗ (0.022)    0.104∗∗∗ (0.024)
R*moreBA                0.007 (0.031)       0.054∗∗ (0.022)     0.049∗∗ (0.021)
R*Age                   -0.004 (0.003)      -0.004 (0.003)      -0.005∗ (0.003)
R*White                 0.022 (0.019)       0.072∗∗ (0.026)     0.085∗∗ (0.028)
R*Black                 -0.046 (0.064)      -0.009 (0.042)      -0.003 (0.048)
R*High Income           0.124∗ (0.061)      0.072 (0.059)       0.069 (0.060)
R*Middle Income         0.035 (0.067)       0.016 (0.072)       0.010 (0.074)
Female                  -0.276∗∗∗ (0.065)   -0.313∗∗∗ (0.074)   -0.308∗∗∗ (0.076)
Bachelor                0.311∗∗∗ (0.029)    0.312∗∗∗ (0.025)    0.305∗∗∗ (0.026)
More than Bachelor      0.403∗∗∗ (0.089)    0.392∗∗∗ (0.086)    0.376∗∗∗ (0.087)
Age                     0.054∗∗∗ (0.010)    -0.058∗∗ (0.019)    -0.055∗∗ (0.018)
White                   -0.130∗∗ (0.047)    -0.021 (0.021)      -0.015 (0.021)
Black                   0.002 (0.052)       -0.027 (0.063)      -0.007 (0.066)
Middle Income           0.183∗∗ (0.064)     0.201∗∗∗ (0.061)    0.200∗∗ (0.064)
High Income             0.242∗∗∗ (0.066)    0.220∗∗∗ (0.061)    0.221∗∗∗ (0.062)
BC                                                              0.054∗∗∗ (0.013)
Constant                8.116∗∗∗ (0.619)    11.587∗∗∗ (0.467)   11.619∗∗∗ (0.471)
School FE               No                  Yes                 Yes
Grade FE                No                  Yes                 Yes
Observations            1428                1404                1404
R2                      0.137               0.175               0.178
Standard errors in parentheses, clustered at school level
∗ p < .10, ∗∗ p < .05, ∗∗∗ p < .01