ESTIMATION OF PEER EFFECTS IN ENDOGENOUS SOCIAL
NETWORKS: CONTROL FUNCTION APPROACH
IDA JOHNSSON AND HYUNGSIK ROGER MOON
Abstract. We propose a method of estimating the linear-in-means model of peer effects
in which the peer group, defined by a social network, is endogenous in the outcome equation
for peer effects. Endogeneity is due to unobservable individual characteristics that influence
both link formation in the network and the outcome of interest. We use estimates of the
unobserved individual effects as a control function to control for the endogeneity of the
social network matrix in the outcome equation for peer effects. We leave the functional
form of the control function unspecified and treat it as unknown. To estimate the model, we
use a sieve semiparametric approach, and we establish asymptotics of the semiparametric
estimator.
Keywords: peer effects, endogenous network, sieve estimation, control function
1. Introduction
The ways in which interconnected individuals influence each other are usually referred
to as peer effects. One of the first to formally model peer effects is Manski (1993). He
proposes the linear-in-means model, in which an individual’s action depends on the average
action of other individuals and possibly also on their average characteristics. Manski (1993)
assumes that all individuals within a given group are connected. Later literature allows for
more complex patterns of connections, in which an individual might be directly influenced
by a subset of the group. Papers that consider this type of model include, for example,
Bramoullé et al. (2009), Lee et al. (2010) and Lee (2007b).
Models of peer effects have been applied in various areas, such as education, health and
development. For example, Graham (2008) uses conditional variance restrictions to identify the effects of social interactions and finds that differences in peer group quality are an
important source of individual-level variation in academic achievement. Conley and Udry
(2010) use data on communication patterns to study the diffusion of new technology among
farmers in Ghana, and Ductor and Fafchamps (2011) study the effects of co-authorship on
academic productivity. A review of the peer effects literature can be found in Blume et al.
(2013), Manski (2000) and Graham (2011).
Many models considered in the literature assume that connections between individuals
are independent of unobserved individual characteristics that influence outcomes. However,
the exogeneity of the network or peer group might be restrictive in many applications. For
example, consider the following, widely studied, empirical application of peer effects: peer
influence on scholarly achievement.1 The assumption that friendships are exogenous in the
outcome equation for scholarly achievement means that there are no unobservables that influence both friendship formation and individual grades. Even if the researcher controls for
observable individual characteristics such as gender, age, race and parents’ education, he is
likely to leave out factors that influence both students’ choice of friends and their GPA, for
example parents’ expectations, psychological disorders or substance use. For more examples
of endogenous peer groups and networks see Weinberg (2007), Hsieh and Lee (2014) and
Shalizi (2012). If the network matrix is endogenous and we do not account for this fact,
estimates of peer effects are biased.
In this paper we make the following contributions. We estimate a linear-in-means model
of peer effects, where the network is endogenous to the outcome equation. Unobserved
individual heterogeneity is the cause of the network endogeneity, and we use the network
formation model proposed by Graham (2015) to control for this endogeneity. We assume
that unobserved individual characteristics can be arbitrarily correlated with observables, i.e.
they are treated as fixed effects. We leave the functional form in which unobservables enter
the outcome equation unspecified and estimate it nonparametrically. We derive the distribution of the semi-parametric estimator of the peer effects coefficient and perform Monte
Carlo simulations to compare its finite sample performance against an estimator that assumes unobserved characteristics enter in a linear way, as well as an instrumental variable
(IV) estimator that does not control for network endogeneity.
The issue of network endogeneity has been considered previously in the literature, and
several approaches have been proposed to control for the endogeneity of the network. Badev
(2013) and Weinberg (2007) model links and outcomes as jointly chosen. Weinberg (2007)
uses data on associations among high school students from the National Longitudinal Study
of Adolescent Health (Add Health) and provides evidence consistent with the hypothesis
that people have a tendency to form links with individuals who are similar to themselves.
Sacerdote (2014) proposes the use of data where peer groups are exogenously and randomly
determined. Zimmerman (2003) and Hoxby (2000) identify the effect of peers on scholarly
achievement using sources of variation that are credibly idiosyncratic. Papers that use the
control function approach include Hsieh and Lee (2014), Goldsmith-Pinkham and Imbens
(2013) and Qu and Lee (2015).2 Hsieh and Lee (2014) and Goldsmith-Pinkham and Imbens
(2013) use a Bayesian approach to estimate their models, whereas we take a frequentist approach. Qu and Lee (2015) introduce endogeneity through selection on unobservables.
1 See Calvó-Armengol et al. (2009) for an empirical application and Epple and Romano (2011) for a survey
of theoretical and empirical literature on peer effects in education.
The remainder of the paper is organized as follows. In Section 2 we formally present our
model and discuss related literature. In Section 3 we show under which assumptions the
model is identified. Estimation is discussed in Section 4, and in Section 5 we present results
of Monte Carlo simulations. Section 6 concludes.
2. Model of Peer Effects with Endogenous Network
The econometrician observes a sample i = 1, . . . , N of individuals. Xi is an lx × 1 vector
of observable exogenous characteristics, Yi denotes the outcome variable and Ai is a scalar
unobserved characteristic of individual i. Ai is treated as an individual fixed-effect, and
hence, might be correlated with Xi . Individuals are connected by an undirected network
DN = {Dij,N }, with Dij,N = Dji,N = 1 if i and j are directly connected and 0 otherwise,
$D_{ii,N} = 0$ for all $i$. There are $n = \binom{N}{2}$ dyads. Let $T_{ij}$ denote an $l_T \times 1$ vector of dyad-specific
characteristics of dyad $ij$. We assume that $T_{ij} = f(X_{1i}, X_{1j})$, where $X_{1i} \subseteq X_i$.
Agents form links according to
$$D_{ij,N} = 1(T_{ij}'\lambda + A_i + A_j - u_{ij} \geq 0), \tag{2.1}$$
where $u_{ij}$ is an idiosyncratic shock that is i.i.d. across dyads and $1(\cdot)$ is the indicator function.
Let $G_N$ be the matrix obtained by row-normalizing $D_N$, i.e. it is an $N \times N$ matrix whose
$(i, j)$th element $G_{ij,N}$ is
$$G_{ij,N} = \begin{cases} 0 & \text{if } i = j, \\ \dfrac{D_{ij,N}}{\sum_{j \neq i} D_{ij,N}} & \text{otherwise.} \end{cases}$$
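As a minimal sketch of how a network satisfying (2.1) could be simulated and then row-normalized, the Python snippet below draws logistic shocks and builds $G_N$. The dyad covariate $T_{ij} = X_i X_j$, the uniform draw for $A_i$, the function names, and the convention of leaving the row of an isolated agent at zero are assumptions made for illustration only; none of them is prescribed by the text above.

```python
import math
import random

def simulate_network(x, a, lam, rng):
    """Draw an undirected network from a link rule of the form (2.1).

    x, a : lists of scalar covariates X_i and heterogeneity terms A_i
    lam  : scalar coefficient on the assumed dyad covariate T_ij = x_i * x_j
    """
    n = len(x)
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            # Logistic shock u_ij drawn by inverse-CDF sampling.
            p = min(max(rng.random(), 1e-12), 1 - 1e-12)
            u = math.log(p / (1 - p))
            link = 1 if x[i] * x[j] * lam + a[i] + a[j] - u >= 0 else 0
            d[i][j] = d[j][i] = link  # undirected: D_ij = D_ji
    return d

def row_normalize(d):
    """Return G with G_ij = D_ij / sum_j D_ij; isolated agents keep a zero row (assumption)."""
    g = []
    for row in d:
        deg = sum(row)
        g.append([v / deg if deg > 0 else 0.0 for v in row])
    return g

rng = random.Random(0)
x = [rng.choice([-1, 1]) for _ in range(50)]
a = [rng.uniform(-0.5, 0.5) for _ in range(50)]
D = simulate_network(x, a, lam=1.0, rng=rng)
G = row_normalize(D)
```

Each non-isolated row of `G` sums to one, so `G` applied to a vector of outcomes returns the average outcome among an agent's direct connections.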
We assume that individual outcomes are given by the linear-in-means model of peer effects
$$Y_i = \beta \sum_{\substack{j=1 \\ j \neq i}}^{N} G_{ij,N} Y_j + X_{2i}'\gamma + \sum_{\substack{j=1 \\ j \neq i}}^{N} G_{ij,N} X_{2j}'\delta + \upsilon_i, \tag{2.2}$$
where $X_{2i} \subseteq X_i$. Using the terminology of Manski (1993), $\beta$ captures the endogenous social
effect, and $\delta$ measures the exogenous social effect. Note that the exogenous characteristics
that influence link formation, $X_{1i}$, and exogenous characteristics that influence outcomes,
$X_{2i}$, could be the same, partially overlapping, or different. We denote $X_N = (X_1', \ldots, X_N')'$,
$Y_N = (Y_1, \ldots, Y_N)'$ and $\upsilon_N = (\upsilon_1, \ldots, \upsilon_N)'$.
2 After finishing the draft of this paper, we were made aware of the paper by Arduini et al. (2015), which
also uses the network formation model proposed by Graham (2015). Arduini et al. (2015) assume that the
errors from the network formation stage and outcome equation are jointly normally distributed and control
for selectivity bias. In our setup, we do not make a distributional assumption on the outcome effect and we
use the unobserved heterogeneity to control for the endogeneity of the network.
In the standard linear-in-means model of peer effects it is assumed that E[υi |XN , GN ] = 0.
Identification of the linear-in-means model of peer effects has been studied in many papers,
see for example Manski (1993), Bramoullé et al. (2009) and Lee (2007b). For a discussion
of identification in linear social interaction models see Blume et al. (2013). Manski (1993)
considers a model where $G_{ij,N} = 1/N$ $\forall\, i \neq j$ and shows that in this scenario peer effects are
not identified. Bramoullé et al. (2009) show that the parameters of the model are identified if
the matrices $I_N$, $G_N$ and $G_N^2$ are linearly independent, with $I_N$ denoting the $N \times N$ identity
matrix.
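In order to estimate the model it is necessary to take into account the fact that the
regressor $\sum_{j=1}^{N} G_{ij,N} Y_j$ is correlated with the error term $\upsilon_i$. Assume $\upsilon_i \sim$ i.i.d.$(0, \sigma^2)$. If
$(I_N - \beta G_N)$ is invertible and $|\beta| < 1$, it is true that
$$\begin{aligned}
E[(G_N Y_N)'\upsilon_N] &= E[(G_N (I_N - \beta G_N)^{-1}(X_N\gamma + G_N X_N\delta + \upsilon_N))'\upsilon_N] \\
&= E[(G_N (I_N - \beta G_N)^{-1}\upsilon_N)'\upsilon_N] = \sigma^2 \operatorname{tr}(G_N (I_N - \beta G_N)^{-1}) \neq 0.
\end{aligned} \tag{2.3}$$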
To solve the endogeneity problem, different estimators have been proposed in the literature.
Kelejian and Prucha (1998) propose a Two-Stage Least Squares (2SLS) estimator which is
further refined by Lee (2003). Lee (2007a) proposes a GMM estimator. Bramoullé et al.
(2009) use the method proposed by Lee (2003), which involves two steps. In the first step a
2SLS estimator $\hat\theta_N^{2SLS}$ is obtained using $Z_N = [X_N, G_N X_N, G_N^2 X_N]$ as instruments:
$$\hat\theta_N^{2SLS} = (W_N' Z_N (Z_N' Z_N)^{-1} Z_N' W_N)^{-1} W_N' Z_N (Z_N' Z_N)^{-1} Z_N' Y_N, \tag{2.4}$$
where $W_N = [G_N Y_N, X_N, G_N X_N]$. The validity of this estimator depends on the assumption that $E[Z_N'\upsilon_N] = 0$. This assumption does not hold if $G_N$ is correlated with $\upsilon_N$, which
is true if unobserved individual characteristics directly influence both link formation and
individual outcomes.
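For concreteness, the 2SLS formula (2.4) can be sketched in Python for a simulated exogenous network. The independent-link rule, the sample size, the parameter values and the helper name `tsls` are illustrative assumptions, not part of the procedure described in the literature cited above.

```python
import numpy as np

def tsls(y, W, Z):
    """2SLS estimate (W'P_Z W)^{-1} W'P_Z y, with P_Z the projection onto the columns of Z."""
    # First-stage fitted values P_Z W, computed by least squares for numerical stability.
    W_hat = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]
    # P_Z is symmetric and idempotent, so W_hat'W = W'P_Z W and W_hat'y = W'P_Z y.
    return np.linalg.solve(W_hat.T @ W, W_hat.T @ y)

rng = np.random.default_rng(0)
N, beta, gamma, delta = 400, 0.5, 1.0, 1.0

# Exogenous network: links drawn independently of the outcome error (assumption).
D = np.triu(rng.random((N, N)) < 0.02, k=1).astype(float)
D = D + D.T
G = D / np.maximum(D.sum(axis=1, keepdims=True), 1.0)  # isolated rows stay zero

X = rng.normal(size=(N, 1))
v = rng.normal(scale=0.5, size=N)
# Reduced form of (2.2): Y = (I - beta G)^{-1} (X gamma + G X delta + v).
Y = np.linalg.solve(np.eye(N) - beta * G, (gamma * X + delta * (G @ X)).ravel() + v)

W = np.column_stack([G @ Y, X, G @ X])       # regressors [G Y, X, G X]
Z = np.column_stack([X, G @ X, G @ G @ X])   # instruments [X, G X, G^2 X]
theta_hat = tsls(Y, W, Z)                    # estimates of (beta, gamma, delta)
```

Because `v` is drawn independently of `D` here, the instruments are valid in this design; the point of the remainder of the paper is that this is no longer the case once the network depends on unobservables that also enter the outcome.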
In this paper, we consider the case where it may be that $E[\upsilon_i | X_N, G_N] \neq 0$. This means
that unobserved characteristics that influence link formation can also have a direct effect
on individual outcomes. When the network matrix is endogenous, $E[G_N \upsilon_N] \neq 0$, and the
procedure used by Bramoullé et al. (2009) and others is no longer valid since the instrumental
variable (IV) matrix $Z_N = [X_N, G_N X_N, G_N^2 X_N]$ is correlated with the error term $\upsilon_N$.
2.1. Related Literature. In this section we briefly present the models that, to our knowledge, are the most closely related to our approach.
Goldsmith-Pinkham and Imbens (2013) assume that the network is observed in two periods, $t = 0, 1$. Let $F_{0,ij} = 1$ if $i$ and $j$ have friends in common in period $t = 0$ and $F_{0,ij} = 0$
otherwise. $D_{0,ij}$ is equal to 1 if $i$ and $j$ are directly connected in period $t = 0$ and equal to 0
otherwise. Let $D_{1,N}$ denote the $N \times N$ binary matrix of connections in period $t = 1$, with
links formed according to the following equation
$$\Pr(D_{1,ij,N} = 1 | X_i, X_j, A_i, A_j, D_0, F_0) = \frac{\exp(\alpha_0 + \alpha_x |X_i - X_j| + \alpha_A |A_i - A_j| + \alpha_D D_{0,ij} + \alpha_F F_{0,ij})}{1 + \exp(\alpha_0 + \alpha_x |X_i - X_j| + \alpha_A |A_i - A_j| + \alpha_D D_{0,ij} + \alpha_F F_{0,ij})}$$
and individual outcomes given by
$$Y_i = \alpha + \beta \sum_{\substack{j=1 \\ j \neq i}}^{N} G_{ij,N} Y_j + \gamma' X_i + \delta' \sum_{\substack{j=1 \\ j \neq i}}^{N} G_{ij,N} X_j + \theta A_i + \varepsilon_i,$$
where $\varepsilon_N | A_N, X_N, D_N \sim N(0, I_N \sigma^2)$ and $G_N$ is the matrix obtained by row-normalizing
$D_{1,N}$. Hence, $\theta A_i$ is a linear control function and the $A_i$ are treated as random effects: $A_i$ is
independent of $X_i$ with distribution
$$\Pr(A_i = 1 | X_N, D_N) = 1 - \Pr(A_i = 0 | X_N, D_N) = p.$$
Goldsmith-Pinkham and Imbens (2013) estimate α, β, γ, δ, θ and A1 , . . . , AN using a Bayesian
approach.
The specification of Hsieh and Lee (2014) is similar to that of Goldsmith-Pinkham and
Imbens (2013); however, Hsieh and Lee (2014) consider one time period and let the probability of a link $ij$ forming depend on the characteristics of individuals other than $i$ and $j$ and
on the network structure. They also use the Add Health data and take smoking behavior
as the dependent variable Yi . The results of Hsieh and Lee (2014) suggest that peer effects
are overestimated when the endogeneity of the network is not taken into account. Similarly
to Goldsmith-Pinkham and Imbens (2013), Hsieh and Lee (2014) use Bayesian methods to
estimate their model and they do not establish identification conditions for the outcome
equation.
Qu and Lee (2015) assume there is a set of exogenously determined “distances” $\rho_{ij}$ between
each pair of individuals $i$ and $j$. The probability of a link between $i$ and $j$ is decaying in $\rho_{ij}$,
which induces sparsity of the network and allows one to control the dependence of the outcomes.
Links are formed according to
$$D_{ij,N} = f(X_N, \varepsilon_N, \rho_{ij}),$$
and individual outcomes are given by
$$Y_i = \alpha + \beta \sum_{\substack{j=1 \\ j \neq i}}^{N} G_{ij,N} Y_j + \gamma' X_i + \upsilon_i.$$
Qu and Lee (2015) assume that the errors from the link formation process and the outcome
equation are jointly normally distributed,
$$(\upsilon_i, \varepsilon_i) \sim \text{i.i.d. } N(0, \Sigma_{\upsilon\varepsilon}).$$
Hence, the errors of the network formation process can be used as a linear control function
in the outcome equation: $E[\upsilon_i | \varepsilon_i] = \delta\varepsilon_i$.
3. Identification of peer effects using control function
In this section we present the assumptions of our model that lead to identification of peer
effects using instrumental variable estimation and a control function approach.
Assumption 3.1. (i) Random sampling: $(X_i, A_i, \upsilon_i) \sim$ i.i.d. across $i$, $i = 1, \ldots, N$.
(ii) $\{u_{ij}\}_{i,j=1,\ldots,N} \perp (X_N, A_N, \upsilon_N)$.
(iii) Fixed effects: $\upsilon_i \perp X_i | A_i$.
Assumption 3.1 (i) is standard in the peer effects literature. Under Assumption 3.1 (ii) the
link formation error is orthogonal to all other observables and unobservables in the model.
This means that the dyad-specific unobservable shock uij from the link formation process
does not influence outcomes Y1 , . . . , YN . Hence, we allow for endogeneity of the social interaction group through dependence between the two unobserved components Ai and υi , but
not through the idiosyncratic network formation error uij .
Assumption 3.1 (iii) allows the regressors Xi of the outcome equation to be endogenous
with respect to unobserved component υi through Ai . This assumption accounts for the fact
that the unobserved individual heterogeneity Ai in the network formation model is fixed and
can be arbitrarily correlated with the observed components Xi .
Notice that $G_N$ is a measurable function of $(X_i, X_{-i}, A_i, A_{-i}, \{u_{ij}\}_{i,j=1,\ldots,N})$, where $X_{-i} =
(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_N)$ and $A_{-i}$ is defined analogously. Under Assumption 3.1, we
have
$$\begin{aligned}
E(\upsilon_i | X_N, G_N, A_i) &= E[\upsilon_i | X_{-i}, G_N(X_{-i}, A_{-i}, \{u_{ij}\}_{i,j=1,\ldots,N}, X_i, A_i), X_i, A_i] \\
&\overset{(1)}{=} E[\upsilon_i | X_i, A_i] \\
&\overset{(2)}{=} E[\upsilon_i | A_i],
\end{aligned}$$
where (1) holds because (X−i , A−i , {uij }i,j=1,...,N ) ⊥ (Xi , Ai , υi ) under Assumptions 3.1 (i)
and (ii) and (2) holds by Assumption 3.1 (iii). This leads to the following Lemma, which
shows that υi and (XN , GN ) are mean independent conditioning on Ai .
Lemma 3.2 (Control Function). Under Assumption 3.1, we have
E[υi |XN , GN , Ai ] = E[υi |Ai ].
Consider now the identification of the parameters of the outcome equation (2.2). Keep in
mind that, regardless of the possible endogeneity of $G_N$ and $X_i$, we need to control for the
endogeneity of the term $\sum_{j=1, j \neq i}^{N} G_{ij,N} Y_j$. Let $Z_N = [X_N, G_N X_N, G_N^2 X_N]$ be the usual IV matrix
used in 2SLS estimation of the peer effects equation and let $W_N = [G_N Y_N, X_N, G_N X_N]$.
Further, denote the $i$th rows of $Z_N$ and $W_N$ by $Z_i$ and $W_i$, respectively. Let $\theta = (\beta, \gamma', \delta')'$
and let $\theta_0 = (\beta_0, \gamma_0', \delta_0')'$ be the true parameter values. Note that since $E[Z_i \upsilon_i] \neq 0$, $Z_N$ is not a
valid IV matrix.
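Assumption 3.3 (Rank condition). $E(Z_i (W_i - E[W_i | A_i])')$ has full rank.
Notice that
$$\begin{aligned}
E[Z_i (Y_i - W_i'\theta - E(Y_i - W_i'\theta | A_i))]
&= E[Z_i (W_i - E[W_i | A_i])'](\theta_0 - \theta) + E[Z_i (\upsilon_i - E[\upsilon_i | A_i])] \\
&\overset{(1)}{=} E[Z_i (W_i - E[W_i | A_i])'](\theta_0 - \theta),
\end{aligned}$$
and
$$E[Z_i (W_i - E[W_i | A_i])'](\theta_0 - \theta) = 0 \overset{(2)}{\iff} \theta = \theta_0.$$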
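Here (2) follows from Assumption 3.3, and (1) follows since
$$\begin{aligned}
E[Z_i \upsilon_i | A_i] &= E[E[Z_i \upsilon_i | A_i, X_N, G_N] | A_i] \\
&= E[Z_i E[\upsilon_i | A_i, X_N, G_N] | A_i] \\
&= E[Z_i E[\upsilon_i | A_i] | A_i] = E[Z_i | A_i] E[\upsilon_i | A_i]
\end{aligned}$$
by Lemma 3.2, so that $E[Z_i (\upsilon_i - E[\upsilon_i | A_i]) | A_i] = 0$ and hence $E[Z_i (\upsilon_i - E[\upsilon_i | A_i])] = 0$.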
As a summary, we have the following identification theorem.
Theorem 3.4 (Identification). Under Assumptions 3.1 and 3.3, the parameter $\theta_0$ is identified
by the moment condition $E[Z_i (Y_i - W_i'\theta - E(Y_i - W_i'\theta | A_i))] = 0$:
$$E[Z_i (Y_i - W_i'\theta - E(Y_i - W_i'\theta | A_i))] = 0 \iff \theta = \theta_0.$$
4. Estimation
This Section is organized as follows. In Section 4.1 we discuss the assumptions and estimation of the model of network formation. In Section 4.2 we describe the non-parametric control
function approach. In Section 4.3 we introduce the Two-Stage Least Squares estimator of θ
and in Section 4.4 we derive the limiting distribution of the estimator.
4.1. Model of Network Formation. Following Graham (2015), we make the following
assumptions about the network formation model.
Assumption 4.1 (Network Formation). (i) $u_{ij} \sim$ i.i.d. for all $ij$ follows a logistic distribution.3
(ii) The support of $\lambda$ is $\mathbb{B}$, a compact subset of $\mathbb{R}^{l_T}$, the support of $T_{ij}$ is $\mathbb{T}$, a compact subset
of $\mathbb{R}^{l_T}$, and the support of $A_i$ is $\mathbb{A}$, a compact subset of $\mathbb{R}$.
Assumption 4.1 (i) implies that the probability of a link between $i$ and $j$ is given by
$$\Pr(D_{ij,N} = 1 | X_N, A_N) = \frac{\exp(T_{ij}'\lambda + A_i + A_j)}{1 + \exp(T_{ij}'\lambda + A_i + A_j)},$$
with
$$\Pr(D_{ij,N} = d_1, D_{kl,N} = d_2 | X_N, A_N) = \Pr(D_{ij,N} = d_1 | X_N, A_N) \Pr(D_{kl,N} = d_2 | X_N, A_N).$$
Utility from link formation is transferable directly across linked agents and network externalities do not play a role in link formation. The utility from link formation rules out homophily
based on unobserved characteristics; however, it allows for homophily based on observables through $T_{ij}$. The unobserved characteristic $A_i$ captures the characteristics of individual $i$ that make
her or him a desirable linking partner, for example trustworthiness or productivity.
Further, Assumptions 4.1 (i) and 4.1 (ii) imply that
$$\kappa < \frac{\exp(T_{ij}'\lambda + A_i + A_j)}{1 + \exp(T_{ij}'\lambda + A_i + A_j)} < 1 - \kappa \quad \forall\, i, j, \qquad \kappa \in (0, 1),$$
that is, the probability that any two agents link is bounded away from zero and one. This assumption is
appealing in settings where it is likely that all agents might interact with one another and
potentially form links.
Graham (2015) proposes two estimators: a conditional maximum likelihood estimator and
a joint maximum likelihood estimator (JMLE). The latter leads to consistent estimation of
AN . Below we present the main results related to the JMLE.
The JMLE chooses $\hat\lambda$ and $\hat A_N$ simultaneously to maximize
$$l_N(\lambda, A_N) = \sum_{i=1}^{N} \sum_{j<i} \left[ D_{ij,N}(T_{ij}'\lambda + A_i + A_j) - \ln\left(1 + \exp(T_{ij}'\lambda + A_i + A_j)\right) \right]. \tag{4.1}$$
3 The logistic distribution assumption could possibly be relaxed as discussed in Graham (2015).
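To illustrate the joint maximization of (4.1), the sketch below treats the problem as a logit regression of the dyad-level link indicators on $T_{ij}$ and a pair of agent dummies, and solves it by Newton's method. This is only one possible numerical approach and is not claimed to be the algorithm used by Graham (2015); the scalar covariate $T_{ij} = X_i X_j$, the simulated design and the function name `jmle` are assumptions.

```python
import numpy as np

def jmle(D, T, n_iter=50, tol=1e-10):
    """Jointly maximize l_N(lambda, A_N) in (4.1) by Newton's method.

    D : (N, N) symmetric 0/1 adjacency matrix with zero diagonal
    T : (N, N) symmetric scalar dyad covariate (l_T = 1 is an assumption)
    """
    N = D.shape[0]
    iu, ju = np.triu_indices(N, k=1)           # one entry per dyad
    d, t = D[iu, ju], T[iu, ju]
    # Design matrix of the equivalent logit: T_ij plus dummies for agents i and j.
    R = np.zeros((d.size, N + 1))
    R[:, 0] = t
    R[np.arange(d.size), 1 + iu] = 1.0
    R[np.arange(d.size), 1 + ju] = 1.0
    par = np.zeros(N + 1)                       # (lambda, A_1, ..., A_N)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(R @ par)))    # fitted link probabilities
        grad = R.T @ (d - p)                    # score of (4.1)
        hess = (R * (p * (1.0 - p))[:, None]).T @ R
        step = np.linalg.solve(hess, grad)
        par = par + step
        if np.max(np.abs(step)) < tol:
            break
    return par[0], par[1:]

# Simulated example (all design choices here are illustrative assumptions).
rng = np.random.default_rng(1)
N = 60
x = rng.choice([-1.0, 1.0], size=N)
A = rng.uniform(-0.5, 0.5, size=N)
T = np.outer(x, x)
U = rng.logistic(size=(N, N))
U = np.triu(U, 1) + np.triu(U, 1).T             # one shock per dyad
D = (1.0 * T + A[:, None] + A[None, :] - U >= 0).astype(float)
np.fill_diagonal(D, 0.0)
lam_hat, A_hat = jmle(D, T)
```

At the maximizer the score with respect to each $A_i$ is zero, so the fitted expected degree of every agent matches the observed degree. The maximizer fails to exist if some agent is linked to nobody or to everybody, which the sketch does not guard against.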
Assumption 4.2. E[lN (λ, AN )|XN , AN ] is uniquely maximized at the true parameter values
λ = λ0 and AN = A0N .
This assumption generally holds if there is sufficient variance in $T_{ij}$ (Graham, 2015). Let
$\hat A_N(\lambda) = \arg\max_{A_N} l_N(\lambda, A_N)$. For computational and analytical purposes Graham (2015)
defines $\hat\lambda$ as the maximizer of the concentrated likelihood
$$l_N^c(\lambda) = \sum_{i=1}^{N} \sum_{j<i} \left[ D_{ij,N}\left(T_{ij}'\lambda + \hat A_i(\lambda) + \hat A_j(\lambda)\right) - \ln\left(1 + \exp(T_{ij}'\lambda + \hat A_i(\lambda) + \hat A_j(\lambda))\right) \right] \tag{4.2}$$
and shows that $\hat A_N(\lambda)$ is the unique solution to a fixed point problem. Denote $\hat A_i = \hat A_i(\hat\lambda)$.
The following Theorem is from Graham (2015).
Theorem 4.3. Under Assumptions 3.1, 4.1 and 4.2, with probability $1 - O(N^{-2})$,
$$\sup_{1 \leq i \leq N} |\hat A_i - A_i| < O\left(\sqrt{\frac{\ln N}{N}}\right).$$
Graham (2015) also shows that under Assumptions 3.1, 4.1 and 4.2, $\hat\lambda \overset{p}{\to} \lambda_0$.
4.2. Non-parametric control function. Let E[υi |Ai ] = h(Ai ), where h(Ai ) is an unknown
function. We use sieve estimation to approximate the unknown function h(Ai ). In this section we formally define the sieve estimator and discuss assumptions related to this estimator.
We give examples of suitable sieve bases and show how the series estimator is applied in the
context of our model of peer effects.
We approximate
$$h(a) \approx \sum_{k=1}^{K_N} q_k(a)\alpha_k.$$
Let $q^{K_N}(a) = (q_1(a), \ldots, q_{K_N}(a))'$, $Q_N = (q^{K_N}(A_1), \ldots, q^{K_N}(A_N))'$, $H(A_N) = (h(A_1), \ldots, h(A_N))'$
and $\alpha = (\alpha_1, \ldots, \alpha_{K_N})'$. We use $\hat A_N$ to approximate $A_N$ and denote $\hat Q_N = (q^{K_N}(\hat A_1), \ldots, q^{K_N}(\hat A_N))'$
and $H(\hat A_N) = (h(\hat A_1), \ldots, h(\hat A_N))'$.
Notation. Throughout the paper, we use the following notation: $M$ denotes a finite generic
constant and $\| A \| = \left(\sum_{i,j} |a_{ij}|^2\right)^{1/2}$ denotes the Frobenius norm. We denote $[GX_2]_i$ by
$X_{2,Gi}$, $[G^2 X_2]_i$ by $X_{2,G^2 i}$, $[GY]_i$ by $Y_{Gi}$, and $Z_i' = [X_{2,i}', X_{2,Gi}, X_{2,G^2 i}]$, where the dimension of $Z_i$
is $(3 l_x) \times 1$. Also, we denote $W_i' = [Y_{Gi}, X_{2,i}', X_{2,Gi}]$.
We impose standard regularity assumptions on the sieve basis QN . In particular, we make
the following assumptions.
Assumption 4.4. For every $K_N$ there is a non-singular matrix of constants $B$ such that
for $\tilde q^{K_N}(a) = B q^{K_N}(a)$,
(i) The smallest eigenvalue of $E[\tilde q^{K_N}(A_i)\tilde q^{K_N}(A_i)']$ is bounded away from zero uniformly in
$K_N$.
(ii) There exists a sequence of constants $\zeta_0(K_N)$ that satisfy the condition
$$\sup_{a \in \mathbb{A}} \|\tilde q^{K_N}(a)\| \leq \zeta_0(K_N),$$
where $K_N = K(N)$ is such that $\zeta_0(K_N)^2 K_N / N \to 0$ as $N \to \infty$.
(iii) For $f \in \{h(a), E[Z_i | A_i], E[W_i | A_i], E[Y_i | A_i]\}$, there exists a sequence of $\alpha^{(f)}_{K_N}$ and a number
$\kappa > 0$ such that
$$\sup_{a \in \mathbb{A}} \| f - q^{K_N \prime}\alpha^{(f)}_{K_N} \|_\infty = O(K_N^{-\kappa})$$
as $K_N \to \infty$.
(iv) $\sqrt{N} K_N^{-\kappa} \to 0$ as $N \to \infty$.
(v) As $N \to \infty$, $K_N \to \infty$ and $K_N / N \to 0$.
Assumption 4.5 (Lipschitz condition). The sieve basis satisfies the following condition:
there exists a positive number $\zeta_1(k)$ such that
$$\| q_k(a) - q_k(a') \| \leq \zeta_1(k) \| a - a' \| \quad \forall\, k = 1, \ldots, K_N$$
with
$$\frac{\ln N}{N} \sum_{k=1}^{K_N} \zeta_1^2(k) = o(1)$$
and
$$\zeta_0^6(K_N) \left( \frac{\ln N}{N} \sum_{k=1}^{K_N} \zeta_1^2(k) \right) = o(1).$$
Assumption 4.4 is standard in the semi-parametric literature, see for example Newey
(1997) and Li and Racine (2007). Assumptions 4.4 (i) and (ii) ensure that $Q_N' Q_N$ is asymptotically non-singular. Assumption 4.4 (iii) controls the rate of approximation of the function $f$
by the sieve estimator. Newey (1997) gives examples of sieve bases that satisfy Assumption
4.4.
Assumption 4.5 is non-standard and it requires that the sieve basis satisfy the Lipschitz
condition. This allows us to control the error introduced by the approximation of $A_i$ by $\hat A_i$.
Below we present examples of sieve bases that satisfy the Lipschitz condition.
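Let $TriPol(K_N)$ denote the space of trigonometric polynomials on $[\underline{a}, \bar{a}]$ of degree $K_N$,
$$TriPol(K_N) = \left\{ \nu_0 + \sum_{k=1}^{K_N} [\nu_k \cos(2k\pi a) + \omega_k \sin(2k\pi a)],\ a \in [\underline{a}, \bar{a}] : \nu_k, \omega_k \in \mathbb{R} \right\}.$$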
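Let $CosPol(K_N)$ denote the space of cosine polynomials on $[\underline{a}, \bar{a}]$ of degree $K_N$,
$$CosPol(K_N) = \left\{ \nu_0 + \sum_{k=1}^{K_N} \nu_k \cos(2k\pi a),\ a \in [\underline{a}, \bar{a}] : \nu_k \in \mathbb{R} \right\}.$$
Let $SinPol(K_N)$ denote the space of sine polynomials on $[\underline{a}, \bar{a}]$ of degree $K_N$,
$$SinPol(K_N) = \left\{ \sum_{k=1}^{K_N} \nu_k \sin(2k\pi a),\ a \in [\underline{a}, \bar{a}] : \nu_k \in \mathbb{R} \right\}.$$
In the case when $|A_i| \leq 1$ $\forall\, i$ we can also use the polynomial sieve with
$$Pol(K_N) = \left\{ \nu_0 + \sum_{k=1}^{K_N} \nu_k a^k,\ a \in [\underline{a}, \bar{a}] : \nu_k \in \mathbb{R} \right\}$$
or the Hermite Polynomial sieve,
$$HPol(K_N) = \left\{ \sum_{k=1}^{K_N + 1} \nu_k H_k(a) \exp\left(-\frac{a^2}{2}\right),\ a \in [\underline{a}, \bar{a}] : \nu_k \in \mathbb{R} \right\},$$
where $H_k(a) = (-1)^k e^{a^2} \frac{d^k}{da^k} e^{-a^2}$.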
To see that the above sieve bases satisfy Assumption 4.5, consider for example the sine
polynomial and w.l.o.g. assume $a \leq \hat a$. For any $k$ we have
$$\| \nu_k [\sin(2k\pi a) - \sin(2k\pi \hat a)] \| = (2k\pi \cos(\tilde a))^2 \| a - \hat a \| \leq M k^2 \| a - \hat a \|,$$
where $\tilde a \in (a, \hat a)$.
Consider the Hermite polynomials. For arbitrary $k$ we have $H_k(a) = \sum_{i=1}^{k} c_i a^{b_i}$, where $b_i$ and
$c_i$ are some finite constants. When $|a| \leq 1$, it is easy to see that $H_k(a)$ satisfies the Lipschitz
condition for arbitrary $k$.
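The Lipschitz property can also be checked numerically. The sketch below builds the trigonometric and polynomial basis terms and compares finite-difference slopes on $[-1, 1]$ with the mean value theorem bounds $2\pi k$ for $\sin(2k\pi a)$ and $k$ for $a^k$; the first bound is in turn no larger than $M k^2$ for a suitable constant $M$. The grid, the interval and the helper names are assumptions made for illustration.

```python
import math

def trig_basis(a, K):
    """Trigonometric sieve terms 1, cos(2k*pi*a), sin(2k*pi*a) for k = 1..K."""
    q = [1.0]
    for k in range(1, K + 1):
        q.append(math.cos(2 * k * math.pi * a))
        q.append(math.sin(2 * k * math.pi * a))
    return q

def poly_basis(a, K):
    """Polynomial sieve terms 1, a, a^2, ..., a^K."""
    return [a ** k for k in range(K + 1)]

def max_slope(f, grid):
    """Largest finite-difference slope of f over consecutive grid points."""
    return max(abs(f(b) - f(a)) / (b - a) for a, b in zip(grid, grid[1:]))

grid = [-1 + 2 * i / 2000 for i in range(2001)]   # assumed support [-1, 1]
slopes = {}
for k in (1, 2, 5):
    slope_sin = max_slope(lambda a: math.sin(2 * k * math.pi * a), grid)
    slope_pow = max_slope(lambda a: a ** k, grid)
    slopes[k] = (slope_sin, slope_pow)
```

Every finite-difference slope is bounded by the supremum of the derivative on the grid interval, so `slopes[k]` cannot exceed `(2 * math.pi * k, k)` up to rounding error.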
Remark 4.6. Denote $\sum_{k=1}^{K_N} \zeta_1^2(k) = \varphi(K_N)$ and consider the trigonometric and polynomial
sieve bases. For the trigonometric sieve, $\zeta_1 = O(K_N^2)$ and $\varphi(K_N) = O(K_N^5)$. For the polynomial sieve, $\zeta_1 = O(K_N)$ and $\varphi(K_N) = O(K_N^3)$. For both the trigonometric and polynomial
sieve, $\zeta_0 = O(\sqrt{K_N})$. Hence, the conditions that must be satisfied for both the trigonometric
sieve and the polynomial sieve are $K_N^2 / N \to 0$ and $\sqrt{N} K_N^{-\kappa} \to 0$. Further, for the trigonometric sieve we must have $\frac{\ln N}{N} O(K_N^8) = o(1)$ and for the polynomial sieve $\frac{\ln N}{N} O(K_N^6) = o(1)$
must hold.
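As a rough numerical illustration of these rate conditions, the snippet below evaluates $K_N^2/N$ and $\frac{\ln N}{N} K_N^p$ (with $p = 8$ for the trigonometric sieve and $p = 6$ for the polynomial sieve) along a hypothetical growth rule $K_N = N^{0.1}$. Setting every $O(\cdot)$ constant to one, the growth exponent, and the function name are all assumptions.

```python
import math

def rate_terms(N, K, basis):
    """Return (K^2 / N, ln(N)/N * K^p) with p = 8 (trigonometric) or 6 (polynomial)."""
    power = {"trigonometric": 8, "polynomial": 6}[basis]
    return K ** 2 / N, math.log(N) / N * K ** power

growth = 0.1  # hypothetical rule K_N = N ** growth, treated as real-valued
vals = [rate_terms(N, N ** growth, "trigonometric") for N in (10**4, 10**6, 10**8)]
```

Under this rule both terms shrink as $N$ grows, although slowly, which suggests that the number of sieve terms can only increase very gradually with the sample size.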
The trigonometric sieve T riP ol is well-suited for approximating periodic functions on
[0, 1], while the cosine sieve CosP ol is suitable for approximating aperiodic functions on
[0, 1]. The sine sieve SinP ol can approximate functions vanishing at the boundary points
(Chen, 2007).
Assumption 4.7 (Bounded Support and Finite Second Moments). (i) The elements of $X_N$
are uniformly bounded in absolute value. (ii) $E[Z_N^2] < M < \infty$ and $E[W_N^2] < M < \infty$.
Remark 4.8. Under Assumption 4.7 (i), $E[X_{2,Gi}' X_{2,Gi}] < \infty$.
4.3. Two-Stage Least Squares estimation. Assuming that $|\beta| < 1$, $I_N - \beta G_N$ is invertible and we can rewrite equation (2.2) as
$$Y_N = W_N\theta + \hat Q_N\alpha + \underbrace{\varepsilon_N + H(A_N) - \hat Q_N\alpha}_{\text{error}}. \tag{4.3}$$
In this section we present the derivation of the 2SLS estimator of $\theta$ in (4.3). Let $P_Q =
Q_N (Q_N' Q_N)^{-} Q_N'$ and $P_{\hat Q} = \hat Q_N (\hat Q_N' \hat Q_N)^{-} \hat Q_N'$, where $^{-}$ denotes any symmetric generalized
inverse.4 Define $M_Q = I - P_Q$ and $M_{\hat Q} = I - P_{\hat Q}$. Finally, we let $\Xi = [M_{\hat Q} Z_N, \hat Q_N]$ and
$P_\Xi = \Xi (\Xi'\Xi)^{-1} \Xi'$. The 2SLS estimator of $\theta$ and $\alpha$ solves the following problem
$$(\hat\theta_{2SLS}, \hat\alpha_{2SLS}) = \arg\min_{\theta, \alpha} \left(Y_N - W_N\theta - \hat Q_N\alpha\right)' P_\Xi \left(Y_N - W_N\theta - \hat Q_N\alpha\right).$$
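By definition,
$$\begin{pmatrix} \hat\theta_{2SLS} \\ \hat\alpha_{2SLS} \end{pmatrix} = \begin{pmatrix} W_N' P_\Xi W_N & W_N' P_\Xi \hat Q_N \\ \hat Q_N' P_\Xi W_N & \hat Q_N' P_\Xi \hat Q_N \end{pmatrix}^{-1} \begin{pmatrix} W_N' P_\Xi Y_N \\ \hat Q_N' P_\Xi Y_N \end{pmatrix}.$$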
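Notice that
$$\begin{aligned}
W_N' P_\Xi W_N &= W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} W_N + W_N' \hat Q_N \left(\hat Q_N' \hat Q_N\right)^{-1} \hat Q_N' W_N, \\
\hat Q_N' P_\Xi W_N &= \hat Q_N' W_N, \qquad \hat Q_N' P_\Xi \hat Q_N = \hat Q_N' \hat Q_N, \qquad \hat Q_N' P_\Xi Y_N = \hat Q_N' Y_N, \\
W_N' P_\Xi Y_N &= W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} Y_N + W_N' \hat Q_N \left(\hat Q_N' \hat Q_N\right)^{-1} \hat Q_N' Y_N.
\end{aligned}$$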
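Then, we have
$$\begin{aligned}
\hat\theta_{2SLS} &= \left[ W_N' P_\Xi W_N - W_N' P_\Xi \hat Q_N \left(\hat Q_N' P_\Xi \hat Q_N\right)^{-1} \hat Q_N' P_\Xi W_N \right]^{-1} \\
&\quad \times \left[ W_N' P_\Xi Y_N - W_N' P_\Xi \hat Q_N \left(\hat Q_N' P_\Xi \hat Q_N\right)^{-1} \hat Q_N' P_\Xi Y_N \right] \\
&= \left[ W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} W_N \right]^{-1} W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} Y_N.
\end{aligned}$$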
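The closed form for $\hat\theta_{2SLS}$ amounts to partialling the sieve terms out of $Y_N$, $W_N$ and $Z_N$ and then running 2SLS on the residuals. A minimal sketch under stated assumptions follows: the simulated design (including the constant $-1.5$ and the covariate $-|X_i - X_j|$ in the link index), the cubic polynomial basis, and the use of the true $A_i$ in place of the first-stage estimates $\hat A_i$ are simplifications for illustration, and `sieve_tsls` is a hypothetical helper name.

```python
import numpy as np

def sieve_tsls(Y, W, Z, Q):
    """theta = [W'MZ (Z'MZ)^{-1} Z'MW]^{-1} W'MZ (Z'MZ)^{-1} Z'MY, with M = I - P_Q."""
    def resid(B):
        # M_Q B: residual from the least-squares projection of B on the sieve terms.
        return B - Q @ np.linalg.lstsq(Q, B, rcond=None)[0]
    Yt, Wt, Zt = resid(Y), resid(W), resid(Z)
    W_fit = Zt @ np.linalg.lstsq(Zt, Wt, rcond=None)[0]   # projection of M W on M Z
    return np.linalg.solve(W_fit.T @ Wt, W_fit.T @ Yt)

rng = np.random.default_rng(2)
N, beta, gamma, delta = 300, 0.5, 1.0, 1.0
A = rng.uniform(-0.5, 0.5, size=N)
x = rng.normal(size=N) + A                    # X correlated with A (fixed effects)
U = rng.logistic(size=(N, N))
U = np.triu(U, 1) + np.triu(U, 1).T           # one logistic shock per dyad
index = -1.5 - np.abs(x[:, None] - x[None, :]) + A[:, None] + A[None, :]
D = (index - U >= 0).astype(float)
np.fill_diagonal(D, 0.0)
G = D / np.maximum(D.sum(axis=1, keepdims=True), 1.0)

h = np.exp(A) / (1.0 + np.exp(A))             # control function h(A_i)
eps = rng.normal(scale=0.3, size=N)
Y = np.linalg.solve(np.eye(N) - beta * G, gamma * x + delta * (G @ x) + h + eps)

W = np.column_stack([G @ Y, x, G @ x])
Z = np.column_stack([x, G @ x, G @ G @ x])
Q = np.vander(A, 4, increasing=True)          # sieve terms 1, A, A^2, A^3
theta_hat = sieve_tsls(Y, W, Z, Q)            # estimates of (beta, gamma, delta)
```

In an application `Q` would be built from $\hat A_i$ obtained in the network formation stage; the partialling-out step itself is unchanged.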
4.4. Limiting distribution of estimator. In this section we discuss the steps involved in
deriving the limiting distribution of the 2SLS estimator. To derive the limiting distribution
of $\hat\theta_{2SLS}$, we first control the sampling error coming from the fact that we do not observe $A_N$
and approximate it with $\hat A_N$. To do this, we use Theorem 4.3. Second, we control the error
introduced by the approximation of $h(A_i)$ by the series estimator. Finally, we show that the
estimator converges to a well-defined limit.
4 Under Assumption 4.4, $Q_N' Q_N$ is non-singular with probability approaching one, hence, $^{-}$ will be the
standard inverse.
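Solving for $\hat\theta_{2SLS}$ we have
$$\begin{aligned}
\hat\theta_{2SLS} - \theta &= \left[ W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} W_N \right]^{-1} \\
&\quad \times W_N' M_{\hat Q} Z_N \left(Z_N' M_{\hat Q} Z_N\right)^{-1} Z_N' M_{\hat Q} \left( \varepsilon_N + H(A_N) - \hat Q_N\alpha \right).
\end{aligned}$$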
In Lemma B.1 in Appendix B we show the following asymptotic results.
(a) $\frac{1}{N} (Z_N' P_{\hat Q} W_N - Z_N' P_Q W_N) = o_p(1)$.
(b) $\frac{1}{N} (Z_N' P_{\hat Q} Z_N - Z_N' P_Q Z_N) = o_p(1)$.
(c) $\frac{1}{\sqrt{N}} (Z_N' P_{\hat Q} \varepsilon_N - Z_N' P_Q \varepsilon_N) = o_p(1)$.
(d) $\frac{1}{\sqrt{N}} (Z_N' M_{\hat Q} (H(A_N) - \hat Q_N\alpha)) = o_p(1)$.
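The above result implies that under suitable assumptions on the sieve basis, the error that
stems from the approximation of $A_N$ with $\hat A_N$ is asymptotically negligible. Using this result,
we can approximate
$$\begin{aligned}
\sqrt{N} \left( \hat\theta_{2SLS} - \theta \right) &= \left[ \left( \frac{1}{N} W_N' M_Q Z_N \right) \left( \frac{1}{N} Z_N' M_Q Z_N \right)^{-1} \left( \frac{1}{N} Z_N' M_Q W_N \right) \right]^{-1} \\
&\quad \times \left( \frac{1}{N} W_N' M_Q Z_N \right) \left( \frac{1}{N} Z_N' M_Q Z_N \right)^{-1} \frac{1}{\sqrt{N}} Z_N' M_Q \varepsilon_N + o_p(1).
\end{aligned}$$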
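Now, consider the error introduced by the non-parametric approximation of $h(A_i)$. Let
$h^W(A_i) = E(W_i | A_i)$, $\eta_i^W = W_i - h^W(A_i)$, $h^Z(A_i) = E(Z_i | A_i)$ and $\eta_i^Z = Z_i - h^Z(A_i)$. Let
$\hat h^W(A_i)$ and $\hat h^Z(A_i)$ denote the series approximation of $h^W(A_i)$ and $h^Z(A_i)$, respectively.
Notice that
$$\begin{aligned}
\frac{1}{N} W_N' M_Q Z_N &= \frac{1}{N} \sum_{i=1}^{N} \left( W_i - \hat h^W(A_i) \right) \left( Z_i - \hat h^Z(A_i) \right)', \\
\frac{1}{N} Z_N' M_Q Z_N &= \frac{1}{N} \sum_{i=1}^{N} \left( Z_i - \hat h^Z(A_i) \right) \left( Z_i - \hat h^Z(A_i) \right)', \\
\frac{1}{\sqrt{N}} Z_N' M_Q \varepsilon_N &= \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \left( Z_i - \hat h^Z(A_i) \right) \varepsilon_i.
\end{aligned}$$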
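In Lemma D.1 in Appendix D we show the following.
(a) $\frac{1}{N} \sum_{i=1}^{N} \left( W_i - \hat h^W(A_i) \right) \left( Z_i - \hat h^Z(A_i) \right)' = \frac{1}{N} \sum_{i=1}^{N} \left( W_i - h^W(A_i) \right) \left( Z_i - h^Z(A_i) \right)' + o_p(1)$,
(b) $\frac{1}{N} \sum_{i=1}^{N} \left( Z_i - \hat h^Z(A_i) \right) \left( Z_i - \hat h^Z(A_i) \right)' = \frac{1}{N} \sum_{i=1}^{N} \left( Z_i - h^Z(A_i) \right) \left( Z_i - h^Z(A_i) \right)' + o_p(1)$,
(c) $\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \left( Z_i - \hat h^Z(A_i) \right) \varepsilon_i = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \left( Z_i - h^Z(A_i) \right) \varepsilon_i + o_p(1)$.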
Finally, to derive the limiting distribution of the 2SLS estimator we need to find the limits of
(i) $\frac{1}{N} \sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))'$, (ii) $\frac{1}{N} \sum_{i=1}^{N} (Z_i - h^Z(A_i))(Z_i - h^Z(A_i))'$, and (iii)
$\frac{1}{\sqrt{N}} \sum_{i=1}^{N} (Z_i - h^Z(A_i))\varepsilon_i$.
We introduce the following notation. Let $s_0(X_i, A_i)$ be a bounded function of $(X_i, A_i)$ and
$S_0(X_N, A_N) = (s_0(X_1, A_1), \ldots, s_0(X_N, A_N))'$ a vector-valued function.
Define $s_{1,N}(X_i, A_i) = [G_N S_0(X_N, A_N)]_i$ and $S_1(X_N, A_N) = (s_{1,N}(X_1, A_1), \ldots, s_{1,N}(X_N, A_N))'$,
$s_{2,N}(X_i, A_i) = [G_N S_1(X_N, A_N)]_i$, $S_2(X_N, A_N) = (s_{2,N}(X_1, A_1), \ldots, s_{2,N}(X_N, A_N))'$ and so on.
Hence, in general we have
$$s_{m,N}(X_i, A_i) = [G_N S_{m-1}(X_N, A_N)]_i.$$
Note that we can equivalently write $s_{m,N}(X_i, A_i) = [G_N^m S_0(X_N, A_N)]_i$. Let $\tilde s_{m,N}(X_i, A_i) =
\operatorname{plim}_{N \to \infty} s_{m,N}(X_i, A_i)$ and $\tilde S_m(X_N, A_N) = (\tilde s_{m,N}(X_1, A_1), \ldots, \tilde s_{m,N}(X_N, A_N))'$.
Let $s_0^X(X_i, A_i) = X_i$, $S_0^X(X_N, A_N) = (s_0^X(X_1, A_1), \ldots, s_0^X(X_N, A_N))'$, and define $s_0^A$ and
$S_0^A$ analogously: $s_0^A(X_i, A_i) = h(A_i)$ and $S_0^A(X_N, A_N) = (s_0^A(X_1, A_1), \ldots, s_0^A(X_N, A_N))'$. Finally, let
$\tilde s_m^X(X_i, A_i) = \operatorname{plim}_{N \to \infty} [G_N^m S_0^X(X_N, A_N)]_i$ and $\tilde s_m^A(X_i, A_i) = \operatorname{plim}_{N \to \infty} [G_N^m S_0^A(X_N, A_N)]_i$.
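In Lemma E.2 in Appendix E we show the following. Keeping in mind that for a matrix
$B$ we define $\eta_i^B = B_i - E(B_i | A_i)$, it is true that
$$\frac{1}{N} \sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))' \overset{p}{\to} S^{WZ} = \begin{pmatrix}
E[\eta_i^{GY} (\eta_i^{X_2})'] & E[\eta_i^{GY} (\eta_i^{GX_2})'] & E[\eta_i^{GY} (\eta_i^{G^2 X_2})'] \\
E[\eta_i^{X_2} (\eta_i^{X_2})'] & E[\eta_i^{X_2} (\eta_i^{GX_2})'] & E[\eta_i^{X_2} (\eta_i^{G^2 X_2})'] \\
E[\eta_i^{GX_2} (\eta_i^{X_2})'] & E[\eta_i^{GX_2} (\eta_i^{GX_2})'] & E[\eta_i^{GX_2} (\eta_i^{G^2 X_2})']
\end{pmatrix}$$
and
$$\frac{1}{N} \sum_{i=1}^{N} (Z_i - h^Z(A_i))(Z_i - h^Z(A_i))' \overset{p}{\to} S^{ZZ} = \begin{pmatrix}
E[\eta_i^{X_2} (\eta_i^{X_2})'] & E[\eta_i^{X_2} (\eta_i^{GX_2})'] & E[\eta_i^{X_2} (\eta_i^{G^2 X_2})'] \\
E[\eta_i^{GX_2} (\eta_i^{X_2})'] & E[\eta_i^{GX_2} (\eta_i^{GX_2})'] & E[\eta_i^{GX_2} (\eta_i^{G^2 X_2})'] \\
E[\eta_i^{G^2 X_2} (\eta_i^{X_2})'] & E[\eta_i^{G^2 X_2} (\eta_i^{GX_2})'] & E[\eta_i^{G^2 X_2} (\eta_i^{G^2 X_2})']
\end{pmatrix},$$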
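where
$$E\left[ \eta_i^{GY} (\eta_i^{G^r X_2})' \right] = E\left[ \left( \sum_{m=0}^{\infty} \beta^m \left( \gamma' \tilde{\tilde s}^X_{m+1}(X_i, A_i) + \delta' \tilde{\tilde s}^X_{m+2}(X_i, A_i) + \tilde{\tilde s}^A_{m+1}(X_i, A_i) \right) \right) \tilde{\tilde s}^X_r(X_i, A_i)' \right], \quad r = 0, 1, 2,$$
$$E\left[ \eta_i^{G^r X_2} (\eta_i^{G^s X_2})' \right] = E\left[ \tilde{\tilde s}^X_r(X_i, A_i)\, \tilde{\tilde s}^X_s(X_i, A_i)' \right], \quad r, s = 0, 1, 2,$$
and
$$\tilde{\tilde s}_{m,N}(X_i, A_i) = \operatorname{plim}_{N \to \infty} \left( s_{m,N}(X_i, A_i) - E[s_{m,N}(X_i, A_i) | A_i] \right).$$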
We assume that the projection error $\varepsilon$ satisfies the following condition.
Assumption 4.9. Let $\mathcal{F}_i = (X_N, A_N, G_N, \varepsilon_1, \ldots, \varepsilon_{i-1})$. Conditional on $(A_N, X_N, G_N)$,
$\varepsilon_i \sim$ i.i.d.$(0, \sigma_0^2)$ with $E[\varepsilon_i^4 | \mathcal{F}_{i-1}] < \mu < \infty$.
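Notice that Assumption 4.9 implies that $(\eta_i^Z \varepsilon_i, \mathcal{F}_i)$ is a martingale difference sequence (see
Lemma E.3 in Appendix E). Then, it is possible to show
$$\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \eta_i^Z \varepsilon_i \overset{d}{\to} N\left(0, E\left[ \eta_i^Z (\eta_i^Z)' \varepsilon_i^2 \right]\right), \tag{4.4}$$
or, given Assumption 4.9,
$$\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \eta_i^Z \varepsilon_i \overset{d}{\to} N\left(0, S^{ZZ} \sigma_0^2\right). \tag{4.5}$$
Combining all the limit results, we have the following theorem.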
Theorem 4.10. Under the assumptions made in Lemma E.2 and Assumption E.1, we have
$$\sqrt{N} (\hat\theta_{2SLS} - \theta_0) \overset{d}{\to} N\left(0, \left( S^{WZ} (S^{ZZ})^{-1} (S^{WZ})' \right)^{-1} \sigma_0^2 \right).$$
See Appendix E.1 for a proof.
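Notice that the asymptotic variance can be consistently estimated by $\left( \hat S^{WZ} (\hat S^{ZZ})^{-1} (\hat S^{WZ})' \right)^{-1} \hat\sigma^2$,
where
$$\hat S^{WZ} = \frac{1}{N} \sum_{i=1}^{N} \left( W_i - \hat h^W(\hat A_i) \right) \left( Z_i - \hat h^Z(\hat A_i) \right)', \qquad
\hat S^{ZZ} = \frac{1}{N} \sum_{i=1}^{N} \left( Z_i - \hat h^Z(\hat A_i) \right) \left( Z_i - \hat h^Z(\hat A_i) \right)',$$
and $\hat\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \hat\varepsilon_i^2$,
with $\hat\varepsilon_i = Y_i - \hat h^Y(\hat A_i) - (W_i - \hat h^W(\hat A_i))' \hat\theta_{2SLS}$.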
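The plug-in variance estimator can be sketched as follows. The function partials the sieve terms out of $Y$, $W$ and $Z$ by least squares, which plays the role of subtracting $\hat h^Y$, $\hat h^W$ and $\hat h^Z$, and then forms $\hat S^{WZ}$, $\hat S^{ZZ}$ and $\hat\sigma^2$. The synthetic data below are generic and do not come from the peer effects model; they, the cubic basis and the function name `sieve_tsls_with_se` are assumptions for illustration.

```python
import numpy as np

def sieve_tsls_with_se(Y, W, Z, Q):
    """Return theta_hat, the estimated asymptotic variance, and standard errors."""
    N = Y.shape[0]
    def resid(B):
        # Residual from projecting on the sieve terms, i.e. B_i minus its series fit.
        return B - Q @ np.linalg.lstsq(Q, B, rcond=None)[0]
    Yt, Wt, Zt = resid(Y), resid(W), resid(Z)
    S_wz = Wt.T @ Zt / N
    S_zz = Zt.T @ Zt / N
    S_zy = Zt.T @ Yt / N
    B = S_wz @ np.linalg.solve(S_zz, S_wz.T)        # S_wz S_zz^{-1} S_wz'
    theta = np.linalg.solve(B, S_wz @ np.linalg.solve(S_zz, S_zy))
    e = Yt - Wt @ theta                             # residuals eps_hat_i
    sigma2 = e @ e / N
    avar = sigma2 * np.linalg.inv(B)                # variance of sqrt(N)(theta_hat - theta_0)
    se = np.sqrt(np.diag(avar) / N)
    return theta, avar, se

# Generic synthetic data (not the peer effects design; purely illustrative).
rng = np.random.default_rng(3)
N = 200
a = rng.uniform(-1, 1, size=N)
Q = np.vander(a, 4, increasing=True)
Z = rng.normal(size=(N, 3)) + a[:, None]
W = Z + 0.5 * rng.normal(size=(N, 3))
theta0 = np.array([0.5, 1.0, 1.0])
Y = W @ theta0 + np.sin(a) + rng.normal(scale=0.3, size=N)
theta_hat, avar, se = sieve_tsls_with_se(Y, W, Z, Q)
```

Dividing the diagonal of `avar` by $N$ before taking square roots converts the variance of $\sqrt{N}(\hat\theta_{2SLS} - \theta_0)$ into standard errors for $\hat\theta_{2SLS}$ itself.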
5. Monte Carlo
The Monte Carlo design of the network formation process follows Graham (2015). Links
are formed according to
$$D_{ij,N} = 1(X_i X_j \lambda + A_i + A_j - u_{ij} \geq 0),$$
where $X_i \in \{-1, 1\}$, $\lambda = 1$ and $u_{ij}$ follows a logistic distribution. This link rule implies that
agents have a strong taste for homophilic matching since $X_i X_j \lambda = 1$ when $X_i = X_j$ and
$X_i X_j \lambda = -1$ when $X_i \neq X_j$. Individual-level degree heterogeneity is generated according to
Ai = ϕ(αL 1(Xi = −1) + αH 1(Xi = 1) + ξi ),
n
o
0
with αL ≤ αH and ξi a centered Beta random variable ξi |Xi ∼ Beta(µ0 , µ1 ) − µ0µ+µ
so
1
h
i
0
1
that Ai ∈ ϕ αL − µ0µ+µ
, αH + µ0µ+µ
. ϕ is a scaling factor that assures that |Ai | ≤ 1 in
1
1
the designs that so require.
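The link rule and the degree-heterogeneity draw can be simulated along the following lines. This is a rough sketch under assumptions the text does not state: the two values of X_i are drawn with equal probability, the network is undirected with one logistic shock per pair, and there are no self-links. The function name `simulate_network` is hypothetical.

```python
import numpy as np

def simulate_network(N, alpha_L, alpha_H, mu0, mu1, lam=1.0, phi=1.0, seed=0):
    """Draw (X, A, D) from the link-formation design described above."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1, 1], size=N)                       # assumption: equal probabilities
    xi = rng.beta(mu0, mu1, size=N) - mu0 / (mu0 + mu1)   # centered Beta draw
    A = phi * (np.where(X == -1, alpha_L, alpha_H) + xi)  # degree heterogeneity
    U = rng.logistic(size=(N, N))                         # pair-specific shocks
    index = lam * np.outer(X, X) + A[:, None] + A[None, :] - U
    iu = np.triu_indices(N, k=1)                          # each pair i < j once
    D = np.zeros((N, N), dtype=int)
    D[iu] = (index[iu] >= 0).astype(int)
    D = D + D.T                                           # undirected, zero diagonal
    return X, A, D
```

For example, `simulate_network(100, -0.25, -0.25, 1.0, 1.0)` uses the parameter values listed for design A2.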
The Monte Carlo parameter specifications are presented in Table 1. Designs A1 to A4
incorporate degree heterogeneity that is (i) uncorrelated with Xi and (ii) symmetrically
distributed. This leads to graphs with bell-shaped degree distributions. The next four
designs, B1 to B4, involve degree heterogeneity distributions that are (i) correlated with Xi
and (ii) right skewed. The latter generates degree distributions closer to those observed in
real world networks. Below we present results for designs A2 and B4; results for the other
designs can be found in the online appendix.
Table 1. Monte Carlo Designs

                      Symmetric Uncorrelated          Right-Skewed Correlated
                      Heterogeneity                   Heterogeneity
                      A1     A2     A3     A4         B1     B2     B3     B4
Parameters
  αL                  0      −1/4   −3/4   −5/4       0      −1/2   −1     −3/2
  αH                  0      −1/4   −3/4   −5/4       1/2    0      −1/2   1
  µ0                  1      1      1      1          1/4    1/4    1/4    1/4
  µ1                  1      1      1      1          3/4    3/4    3/4    3/4
Network Statistics*
  Density             0.50   0.40   0.28   0.24       0.58   0.40   0.28   0.24
  Min. degree         32     23     14     12         40     21     12     11
  Max. degree         66     57     41     36         75     61     46     39
* - average network statistics across 1000 Monte Carlo replications
Individual outcomes are generated according to
\[
Y_i = \beta \sum_{j=1,\, j \neq i}^{N} G_{ij,N} Y_j + \gamma X_i + \delta \sum_{j=1,\, j \neq i}^{N} G_{ij,N} X_j + h(A_i) + \varepsilon_i .
\]
In the simulations, we set $\gamma = \delta = 1$, $\beta \in \{0.2, 0.5, 0.8\}$, $h(A_i) = \frac{\exp(A_i)}{\kappa + \exp(A_i)}$ with
$\kappa \in \{1, 10, 100\}$ and $\varepsilon_i \sim N(0, 0.1)$. The degree of endogeneity of the network is controlled by κ, with larger values of κ making the network links less correlated with individual
fixed-effects Ai. We estimate the outcome equation using the non-parametric control function approach, and compare it with estimates obtained using a linear control function and
no control function. We expect the bias of the two latter methods to be smaller when κ is
larger. We use polynomial, cosine and Hermite polynomial sieve bases with KN = 3, 4, 5, 6.
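Because the outcome appears on both sides of the outcome equation, simulated outcomes come from solving the linear system (I − βG)Y = γX + δGX + h(A) + ε. The sketch below illustrates this under two assumptions that are ours: G is the row-normalized adjacency matrix (rows of isolated nodes are left at zero), and 0.1 is treated as the standard deviation of ε, as in the table notes (σε = 0.1). The function name `simulate_outcomes` is hypothetical.

```python
import numpy as np

def simulate_outcomes(D, X, A, beta, gamma=1.0, delta=1.0, kappa=1.0,
                      sigma_eps=0.1, seed=0):
    """Solve (I - beta*G) Y = gamma*X + delta*G X + h(A) + eps for Y."""
    rng = np.random.default_rng(seed)
    N = D.shape[0]
    deg = D.sum(axis=1, keepdims=True)
    # Assumption: G is D with each row divided by its degree.
    G = np.divide(D, deg, out=np.zeros((N, N)), where=deg > 0)
    h = np.exp(A) / (kappa + np.exp(A))           # control function from the text
    eps = rng.normal(0.0, sigma_eps, size=N)
    rhs = gamma * X + delta * (G @ X) + h + eps
    Y = np.linalg.solve(np.eye(N) - beta * G, rhs)
    return Y, G
```

With |β| < 1 and a row-normalized G, the matrix I − βG is invertible, so the solve step is well defined.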
The main findings of the Monte Carlo simulations are: (i) the non-parametric control
function performs better than the linear control function in all designs, (ii) as expected,
not controlling for h(Ai ) gives biased estimates, (iii) for a given design, the non-parametric
estimates are more precise when KN = 6 than when KN = 3 and (iv) the results do not
change notably when we vary the network density.
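The three estimators being compared differ only in what is partialled out before 2SLS. A minimal sketch follows, under simplifying assumptions of ours: the true A_i is used in place of an estimated Â_i, the sieve is a plain power series in A_i, and "no control function" is read as partialling out an intercept only. The function `cf_2sls` and its `degree` argument are hypothetical, and the mapping from `degree` to the paper's K_N is not taken from the source.

```python
import numpy as np

def cf_2sls(Y, G, X, A, degree=0):
    """2SLS of Y on (GY, X, GX) with instruments (X, GX, G^2 X),
    after partialling out a polynomial in A of the given degree.

    degree=0: intercept only; degree=1: linear control function;
    degree>=2: polynomial sieve approximation.
    """
    W = np.column_stack([G @ Y, X, G @ X])            # regressors
    Z = np.column_stack([X, G @ X, G @ (G @ X)])      # instruments
    Q = np.vander(A, degree + 1, increasing=True)     # columns 1, A, ..., A^degree

    def partial_out(V):
        # Residual from the least-squares projection of V on the sieve terms.
        coef = np.linalg.lstsq(Q, V, rcond=None)[0]
        return V - Q @ coef

    y_r, W_r, Z_r = partial_out(Y), partial_out(W), partial_out(Z)
    P = Z_r @ np.linalg.solve(Z_r.T @ Z_r, Z_r.T @ W_r)   # first-stage fitted values
    return np.linalg.solve(P.T @ W_r, P.T @ y_r)          # (beta, gamma, delta)
```

Calling the function with `degree=0`, `degree=1` and a larger degree on the same simulated sample gives the kind of comparison summarized in findings (i) and (ii).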
Table 2. Design A2, polynomial sieve: Parameter values across 1000 MC replications with KN = 3.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.02 (1.00)     -0.02 (0.15)    -1.90 (34.97)   -0.13 (2.93)    0.22 (6.06)     -0.13 (0.94)
1     (1)    0.08 (3.90)     -0.02 (0.13)    2.03 (76.21)    -0.15 (2.56)    0.47 (23.20)    -0.14 (0.81)
1     (2)    -0.00 (0.07)    -0.00 (0.00)    -0.03 (0.90)    0.00 (0.10)     -0.01 (0.42)    -0.00 (0.01)
10    (0)    0.00 (0.06)     0.00 (0.03)     0.06 (2.57)     -0.00 (0.53)    0.02 (0.34)     0.00 (0.17)
10    (1)    -0.00 (0.22)    0.00 (0.02)     0.14 (6.10)     -0.00 (0.49)    -0.03 (1.31)    0.00 (0.15)
10    (2)    -0.00 (0.02)    -0.00 (0.00)    -0.00 (0.24)    -0.00 (0.09)    -0.00 (0.14)    -0.00 (0.01)
100   (0)    0.00 (0.02)     0.00 (0.00)     -0.01 (0.55)    0.00 (0.11)     0.00 (0.14)     0.00 (0.02)
100   (1)    0.09 (2.86)     0.00 (0.00)     -1.19 (35.80)   0.00 (0.11)     0.64 (20.69)    0.00 (0.02)
100   (2)    -0.00 (0.05)    -0.00 (0.00)    -0.01 (0.42)    0.00 (0.09)     -0.01 (0.31)    -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.23 (4.95)    -0.37 (0.43)    -2.74 (36.86)   -2.09 (2.96)    -0.41 (10.96)   -0.76 (0.89)
1     (1)    -0.14 (6.47)    -0.35 (0.45)    -0.88 (55.86)   -1.81 (3.04)    -0.28 (13.45)   -0.71 (0.91)
1     (2)    0.04 (1.23)     0.00 (0.03)     0.10 (3.53)     -0.00 (0.18)    0.08 (2.68)     0.00 (0.05)
10    (0)    0.00 (2.28)     -0.03 (0.32)    -0.34 (12.83)   -0.11 (1.55)    0.02 (4.79)     -0.05 (0.66)
10    (1)    0.54 (12.38)    -0.02 (0.26)    14.83 (440.08)  -0.12 (1.43)    0.59 (12.76)    -0.05 (0.54)
10    (2)    0.37 (11.00)    0.00 (0.02)     1.05 (31.92)    0.00 (0.16)     0.79 (23.87)    0.00 (0.05)
100   (0)    0.01 (0.37)     0.00 (0.04)     0.04 (2.40)     0.00 (0.24)     0.02 (0.77)     0.00 (0.09)
100   (1)    0.00 (0.66)     -0.00 (0.04)    0.02 (4.16)     -0.00 (0.22)    0.00 (1.34)     -0.00 (0.08)
100   (2)    -0.02 (1.03)    0.00 (0.02)     -0.07 (3.50)    0.00 (0.16)     -0.04 (2.22)    0.00 (0.05)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.76 (2.46)    -0.73 (0.34)    -1.29 (14.73)   -2.01 (1.33)    -1.00 (3.58)    -0.91 (0.45)
1     (1)    1.26 (67.53)    -0.73 (0.43)    2.29 (144.16)   -2.00 (1.42)    1.64 (87.06)    -0.92 (0.53)
1     (2)    0.24 (7.83)     -0.00 (0.09)    0.40 (12.56)    -0.01 (0.26)    0.31 (10.31)    -0.00 (0.12)
10    (0)    -0.74 (31.10)   -0.26 (0.76)    -2.04 (75.52)   -0.68 (2.18)    -0.94 (39.77)   -0.34 (0.94)
10    (1)    -2.05 (92.73)   -0.21 (0.72)    -9.03 (382.67)  -0.54 (1.90)    -2.43 (111.48)  -0.27 (0.92)
10    (2)    -0.14 (3.58)    -0.00 (0.08)    -0.30 (7.45)    -0.01 (0.24)    -0.19 (4.61)    -0.00 (0.10)
100   (0)    0.02 (3.03)     -0.00 (0.15)    -0.45 (11.61)   0.00 (0.44)     0.05 (4.27)     -0.00 (0.19)
100   (1)    0.03 (0.89)     -0.00 (0.14)    0.10 (2.32)     -0.01 (0.38)    0.04 (1.13)     -0.00 (0.17)
100   (2)    -0.07 (3.05)    -0.00 (0.08)    -0.17 (6.29)    -0.01 (0.23)    -0.10 (3.93)    -0.00 (0.10)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 3, σε = 0.1
h(a) = exp(a)/(κ + exp(a))
Average network statistics: density = 0.40, min degree = 23.86, max degree = 57.02
Table 3. Design A2, polynomial sieve: Parameter values across 1000 MC replications with KN = 6.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (1.22)     -0.01 (0.15)    -1.02 (43.81)   -0.13 (2.86)    0.34 (7.06)     -0.12 (0.94)
1     (1)    -0.04 (1.03)    -0.02 (0.13)    -0.28 (14.36)   -0.15 (2.51)    -0.23 (6.83)    -0.14 (0.80)
1     (2)    0.00 (0.04)     0.00 (0.00)     0.01 (0.37)     0.00 (0.10)     0.01 (0.23)     -0.00 (0.01)
10    (0)    0.00 (0.06)     0.00 (0.03)     0.06 (2.56)     0.00 (0.53)     0.02 (0.34)     0.00 (0.17)
10    (1)    -0.00 (0.21)    0.00 (0.02)     0.06 (5.41)     -0.00 (0.49)    -0.04 (1.29)    0.00 (0.15)
10    (2)    0.00 (0.13)     0.00 (0.00)     0.02 (0.90)     0.00 (0.09)     0.02 (0.83)     -0.00 (0.01)
100   (0)    0.00 (0.02)     0.00 (0.00)     -0.01 (0.55)    0.00 (0.11)     0.00 (0.14)     0.00 (0.02)
100   (1)    0.09 (2.86)     -0.00 (0.00)    -1.19 (35.80)   0.00 (0.12)     0.64 (20.69)    0.00 (0.02)
100   (2)    0.00 (0.03)     -0.00 (0.00)    -0.01 (0.34)    0.00 (0.09)     0.00 (0.20)     -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.23 (4.95)    -0.37 (0.43)    -2.75 (36.85)   -2.09 (2.97)    -0.40 (10.96)   -0.76 (0.89)
1     (1)    -0.09 (6.27)    -0.35 (0.45)    -0.87 (55.84)   -1.85 (3.04)    -0.17 (12.98)   -0.71 (0.91)
1     (2)    -0.00 (0.21)    0.00 (0.03)     0.01 (0.74)     -0.00 (0.18)    -0.00 (0.46)    -0.00 (0.06)
10    (0)    0.00 (2.28)     -0.03 (0.32)    -0.32 (12.83)   -0.11 (1.55)    0.02 (4.79)     -0.05 (0.66)
10    (1)    0.54 (12.38)    -0.02 (0.26)    14.63 (440.13)  -0.12 (1.41)    0.60 (12.78)    -0.05 (0.54)
10    (2)    0.01 (0.34)     0.00 (0.02)     0.02 (0.91)     0.00 (0.16)     0.02 (0.75)     0.00 (0.05)
100   (0)    0.01 (0.36)     0.00 (0.04)     0.02 (1.96)     -0.00 (0.25)    0.02 (0.76)     0.00 (0.09)
100   (1)    0.02 (0.66)     -0.00 (0.04)    0.13 (4.01)     -0.00 (0.23)    0.03 (1.35)     0.00 (0.08)
100   (2)    0.01 (0.38)     0.00 (0.02)     0.03 (1.22)     0.00 (0.16)     0.02 (0.83)     0.00 (0.05)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.71 (2.84)    -0.73 (0.35)    -1.03 (16.81)   -2.02 (1.33)    -0.94 (3.93)    -0.91 (0.45)
1     (1)    1.20 (67.56)    -0.73 (0.43)    1.95 (144.58)   -2.00 (1.41)    1.58 (87.08)    -0.92 (0.53)
1     (2)    -0.06 (1.40)    -0.00 (0.09)    -0.09 (2.43)    -0.01 (0.26)    -0.07 (1.83)    -0.00 (0.11)
10    (0)    -0.77 (31.12)   -0.26 (0.75)    -2.14 (75.58)   -0.68 (2.19)    -0.97 (39.80)   -0.34 (0.94)
10    (1)    -2.08 (92.73)   -0.21 (0.73)    -9.14 (382.67)  -0.55 (1.91)    -2.47 (111.49)  -0.27 (0.93)
10    (2)    0.08 (1.53)     -0.00 (0.08)    0.15 (2.68)     -0.00 (0.23)    0.11 (2.00)     -0.00 (0.10)
100   (0)    0.02 (3.03)     -0.00 (0.15)    -0.46 (11.62)   0.00 (0.44)     0.05 (4.27)     -0.00 (0.20)
100   (1)    0.03 (0.88)     -0.00 (0.14)    0.09 (2.31)     -0.01 (0.38)    0.03 (1.13)     -0.00 (0.17)
100   (2)    0.03 (0.98)     -0.00 (0.08)    0.07 (1.64)     -0.00 (0.22)    0.05 (1.29)     -0.00 (0.10)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 6, σε = 0.1
h(a) = exp(a)/(κ + exp(a))
Average network statistics: density = 0.40, min degree = 23.86, max degree = 57.02
Table 4. Design B4, polynomial sieve: Parameter values across 1000 MC replications with N = 100, K = 3.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (0.32)     0.03 (0.05)     0.26 (10.70)    0.11 (1.52)     0.31 (2.51)     0.20 (0.27)
1     (1)    0.01 (0.32)     0.01 (0.03)     0.24 (6.90)     0.19 (0.64)     0.06 (2.58)     0.06 (0.19)
1     (2)    0.00 (0.06)     -0.00 (0.00)    0.09 (1.56)     0.03 (0.08)     0.01 (0.39)     -0.00 (0.01)
10    (0)    0.00 (0.12)     0.00 (0.01)     -0.15 (2.52)    0.01 (0.20)     0.05 (1.05)     0.02 (0.03)
10    (1)    0.00 (0.02)     0.00 (0.00)     -0.02 (1.66)    0.02 (0.12)     0.02 (0.21)     0.01 (0.03)
10    (2)    0.00 (0.02)     -0.00 (0.00)    0.02 (0.49)     0.01 (0.07)     0.01 (0.15)     -0.00 (0.01)
100   (0)    0.00 (0.01)     0.00 (0.00)     -0.07 (2.10)    -0.00 (0.07)    0.01 (0.07)     0.00 (0.01)
100   (1)    -0.00 (0.02)    0.00 (0.00)     0.02 (0.26)     0.00 (0.08)     -0.00 (0.17)    0.00 (0.01)
100   (2)    0.00 (0.02)     -0.00 (0.00)    0.00 (0.28)     0.00 (0.07)     0.00 (0.11)     -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.54 (12.57)   -0.14 (1.26)    -3.35 (57.62)   -1.16 (6.72)    -1.11 (27.23)   -0.25 (2.73)
1     (1)    -0.30 (5.25)    0.04 (0.42)     -1.00 (26.22)   0.25 (2.28)     -0.65 (11.21)   0.10 (0.90)
1     (2)    -0.09 (2.75)    0.00 (0.01)     -0.12 (4.72)    0.03 (0.13)     -0.22 (6.44)    -0.00 (0.03)
10    (0)    0.01 (1.01)     0.04 (0.10)     -0.13 (9.12)    0.13 (0.59)     0.03 (1.90)     0.09 (0.20)
10    (1)    -0.01 (1.27)    0.01 (0.04)     0.08 (7.24)     0.07 (0.28)     -0.02 (2.64)    0.03 (0.10)
10    (2)    0.01 (0.16)     0.00 (0.01)     0.04 (0.46)     0.01 (0.12)     0.02 (0.35)     0.00 (0.03)
100   (0)    0.01 (0.14)     0.00 (0.02)     0.03 (0.88)     0.00 (0.15)     0.02 (0.31)     0.00 (0.05)
100   (1)    0.01 (0.24)     0.00 (0.02)     0.03 (1.44)     0.01 (0.14)     0.02 (0.50)     0.00 (0.04)
100   (2)    0.02 (0.29)     0.00 (0.01)     0.07 (0.87)     0.00 (0.12)     0.05 (0.66)     0.00 (0.03)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.90 (9.66)    -0.87 (1.13)    -2.33 (24.61)   -2.73 (2.48)    -1.16 (12.56)   -1.11 (1.50)
1     (1)    -0.61 (12.68)   -0.18 (1.18)    -1.71 (34.16)   -0.42 (3.56)    -0.77 (16.24)   -0.24 (1.56)
1     (2)    0.00 (0.26)     -0.00 (0.05)    0.04 (0.48)     0.03 (0.18)     0.00 (0.35)     -0.00 (0.06)
10    (0)    0.15 (3.14)     0.11 (0.39)     0.35 (9.94)     0.21 (1.12)     0.19 (3.93)     0.14 (0.49)
10    (1)    -0.07 (1.63)    0.02 (0.16)     -0.12 (3.62)    0.07 (0.46)     -0.09 (2.13)    0.03 (0.21)
10    (2)    -0.01 (0.56)    -0.00 (0.05)    -0.01 (1.19)    0.01 (0.16)     -0.02 (0.73)    -0.00 (0.06)
100   (0)    -0.01 (1.06)    0.00 (0.08)     0.11 (3.22)     0.01 (0.23)     -0.02 (1.46)    0.01 (0.10)
100   (1)    -0.00 (0.39)    0.00 (0.06)     -0.00 (1.35)    0.01 (0.20)     -0.00 (0.49)    0.00 (0.08)
100   (2)    0.05 (1.35)     -0.00 (0.05)    0.13 (2.92)     0.00 (0.16)     0.07 (1.76)     -0.00 (0.06)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 3, σε = 0.1
h(a) = exp(a)/(κ + exp(a))
Average network statistics: density = 0.24, min degree = 11.24, max degree = 39.02
Table 5. Design B4, polynomial sieve: Parameter values across 1000 MC replications with N = 100, K = 6.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (0.32)     0.03 (0.05)     0.28 (10.68)    0.12 (1.54)     0.29 (2.50)     0.20 (0.28)
1     (1)    0.01 (0.31)     0.01 (0.03)     0.27 (6.87)     0.20 (0.65)     0.04 (2.49)     0.06 (0.19)
1     (2)    -0.00 (0.03)    -0.00 (0.00)    0.03 (0.23)     0.03 (0.07)     -0.00 (0.24)    -0.00 (0.01)
10    (0)    0.00 (0.12)     0.00 (0.01)     -0.16 (2.57)    0.01 (0.20)     0.05 (1.05)     0.02 (0.03)
10    (1)    0.00 (0.02)     0.00 (0.00)     -0.02 (1.66)    0.02 (0.12)     0.02 (0.21)     0.01 (0.03)
10    (2)    -0.00 (0.03)    -0.00 (0.00)    -0.01 (0.32)    0.01 (0.07)     -0.01 (0.21)    -0.00 (0.01)
100   (0)    0.00 (0.01)     0.00 (0.00)     -0.07 (2.10)    -0.00 (0.07)    0.01 (0.07)     0.00 (0.01)
100   (1)    -0.00 (0.02)    0.00 (0.00)     0.01 (0.30)     0.00 (0.08)     -0.00 (0.17)    0.00 (0.01)
100   (2)    -0.00 (0.01)    -0.00 (0.00)    0.00 (0.25)     0.00 (0.07)     -0.00 (0.08)    -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.58 (12.59)   -0.14 (1.29)    -3.55 (57.75)   -1.20 (6.90)    -1.19 (27.26)   -0.26 (2.74)
1     (1)    -0.30 (5.25)    0.04 (0.42)     -1.00 (26.22)   0.29 (2.26)     -0.66 (11.21)   0.10 (0.88)
1     (2)    0.04 (1.02)     -0.00 (0.01)    0.18 (3.25)     0.03 (0.12)     0.09 (2.28)     -0.00 (0.03)
10    (0)    0.01 (1.01)     0.04 (0.10)     -0.13 (9.12)    0.13 (0.59)     0.03 (1.90)     0.09 (0.20)
10    (1)    -0.01 (1.27)    0.01 (0.04)     0.08 (7.24)     0.07 (0.27)     -0.02 (2.64)    0.03 (0.09)
10    (2)    -0.01 (0.29)    -0.00 (0.01)    0.01 (0.42)     0.01 (0.11)     -0.02 (0.67)    -0.00 (0.03)
100   (0)    -0.00 (0.40)    0.00 (0.02)     0.12 (3.18)     0.00 (0.15)     -0.01 (1.14)    0.00 (0.05)
100   (1)    0.01 (0.24)     0.00 (0.02)     0.03 (1.44)     0.01 (0.14)     0.02 (0.50)     0.00 (0.04)
100   (2)    0.00 (0.14)     0.00 (0.01)     0.01 (0.42)     0.00 (0.11)     0.00 (0.32)     -0.00 (0.03)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.23 (36.41)    -0.87 (1.12)    -0.23 (69.90)   -2.74 (2.47)    0.34 (48.21)    -1.10 (1.49)
1     (1)    -0.61 (12.69)   -0.18 (1.18)    -1.70 (34.17)   -0.41 (3.47)    -0.77 (16.24)   -0.24 (1.55)
1     (2)    0.08 (2.70)     -0.00 (0.05)    0.13 (3.19)     0.03 (0.16)     0.10 (3.70)     -0.00 (0.06)
10    (0)    0.15 (3.14)     0.11 (0.38)     0.36 (9.92)     0.22 (1.09)     0.20 (3.92)     0.14 (0.48)
10    (1)    -0.07 (1.63)    0.03 (0.16)     -0.12 (3.62)    0.07 (0.46)     -0.09 (2.13)    0.03 (0.21)
10    (2)    -0.01 (0.38)    -0.00 (0.05)    -0.02 (0.76)    0.01 (0.16)     -0.01 (0.51)    -0.00 (0.06)
100   (0)    -0.01 (1.06)    0.00 (0.08)     0.11 (3.22)     0.01 (0.23)     -0.02 (1.46)    0.01 (0.10)
100   (1)    -0.00 (0.39)    0.00 (0.06)     -0.02 (1.41)    0.00 (0.20)     -0.00 (0.49)    0.00 (0.08)
100   (2)    -0.18 (5.19)    -0.00 (0.05)    -0.21 (6.06)    0.00 (0.15)     -0.25 (7.12)    -0.00 (0.06)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 6, σε = 0.1
h(a) = exp(a)/(κ + exp(a))
Average network statistics: density = 0.24, min degree = 11.23, max degree = 39.03
Table 6. Design A2, Hermite polynomial sieve: Parameter values across 1000 MC replications with KN = 3.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (1.22)     -0.02 (0.15)    -0.98 (43.81)   -0.13 (2.90)    0.34 (7.07)     -0.12 (0.95)
1     (1)    -0.03 (1.03)    -0.02 (0.13)    -0.17 (14.45)   -0.13 (2.53)    -0.18 (6.83)    -0.14 (0.80)
1     (2)    0.02 (0.46)     -0.00 (0.03)    -0.08 (7.59)    -0.00 (0.50)    0.13 (3.26)     -0.00 (0.20)
10    (0)    0.00 (0.06)     0.00 (0.03)     0.05 (2.59)     -0.00 (0.52)    0.02 (0.34)     0.00 (0.17)
10    (1)    -0.00 (0.22)    0.00 (0.02)     0.14 (6.10)     -0.00 (0.49)    -0.03 (1.31)    0.00 (0.15)
10    (2)    0.00 (0.10)     0.00 (0.01)     0.08 (1.70)     0.00 (0.13)     0.02 (0.62)     0.00 (0.03)
100   (0)    0.00 (0.02)     0.00 (0.00)     -0.01 (0.55)    0.00 (0.11)     0.00 (0.14)     0.00 (0.02)
100   (1)    0.09 (2.86)     -0.00 (0.00)    -1.19 (35.81)   0.00 (0.12)     0.64 (20.70)    0.00 (0.02)
100   (2)    0.00 (0.48)     -0.00 (0.00)    -1.66 (50.05)   0.00 (0.10)     0.10 (5.11)     -0.00 (0.02)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.20 (4.92)    -0.37 (0.43)    -2.71 (36.86)   -2.07 (2.99)    -0.35 (10.88)   -0.76 (0.88)
1     (1)    -0.08 (6.28)    -0.35 (0.45)    -0.69 (55.99)   -1.81 (3.09)    -0.15 (12.98)   -0.71 (0.92)
1     (2)    -0.00 (3.81)    -0.03 (0.26)    0.32 (20.80)    -0.17 (1.45)    -0.03 (7.89)    -0.07 (0.54)
10    (0)    -0.03 (2.18)    -0.03 (0.32)    -0.26 (12.72)   -0.11 (1.56)    -0.05 (4.48)    -0.05 (0.66)
10    (1)    0.54 (12.38)    -0.02 (0.27)    14.83 (440.08)  -0.12 (1.43)    0.58 (12.76)    -0.06 (0.55)
10    (2)    -0.08 (3.20)    0.00 (0.05)     -0.20 (9.18)    -0.00 (0.28)    -0.17 (7.00)    0.00 (0.11)
100   (0)    0.01 (0.37)     0.00 (0.04)     0.04 (2.40)     0.00 (0.25)     0.02 (0.77)     0.00 (0.09)
100   (1)    0.00 (0.66)     -0.00 (0.04)    0.02 (4.16)     -0.00 (0.22)    0.00 (1.34)     -0.00 (0.08)
100   (2)    0.00 (0.46)     -0.00 (0.03)    -0.01 (2.17)    0.00 (0.19)     0.01 (0.97)     -0.00 (0.06)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.72 (2.85)    -0.73 (0.34)    -1.05 (16.81)   -2.03 (1.33)    -0.96 (3.95)    -0.92 (0.45)
1     (1)    1.22 (67.54)    -0.73 (0.43)    1.95 (144.55)   -2.01 (1.42)    1.60 (87.06)    -0.92 (0.53)
1     (2)    -0.25 (5.11)    -0.21 (0.65)    -0.88 (13.00)   -0.51 (1.72)    -0.31 (6.63)    -0.26 (0.83)
10    (0)    -0.72 (31.10)   -0.26 (0.76)    -2.05 (75.52)   -0.68 (2.23)    -0.91 (39.77)   -0.34 (0.95)
10    (1)    -2.05 (92.85)   -0.21 (0.74)    -9.07 (383.21)  -0.54 (1.92)    -2.43 (111.62)  -0.27 (0.94)
10    (2)    -0.10 (3.21)    -0.01 (0.17)    -0.31 (7.57)    -0.03 (0.45)    -0.12 (4.11)    -0.01 (0.22)
100   (0)    0.02 (3.04)     -0.00 (0.15)    -0.47 (11.62)   0.00 (0.44)     0.05 (4.27)     0.00 (0.19)
100   (1)    0.03 (0.89)     -0.00 (0.14)    0.10 (2.32)     -0.00 (0.38)    0.03 (1.13)     -0.00 (0.17)
100   (2)    -0.02 (2.49)    -0.00 (0.10)    -0.03 (5.06)    -0.00 (0.28)    -0.03 (3.23)    -0.00 (0.13)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 3, σε = 0.1
h(a) = exp(a)/(κ + exp(a)), q_k(a) = H_k(a) exp(−a^2/2)
Average network statistics: density = 0.40, min degree = 23.87, max degree = 57.02
Table 7. Design A2, Hermite polynomial sieve: Parameter values across 1000 MC replications with KN = 6.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.02 (1.00)     -0.01 (0.15)    -1.85 (34.96)   -0.13 (2.86)    0.22 (6.05)     -0.12 (0.95)
1     (1)    -0.04 (1.03)    -0.02 (0.13)    -0.29 (14.36)   -0.15 (2.50)    -0.23 (6.83)    -0.14 (0.80)
1     (2)    -0.00 (0.18)    0.00 (0.00)     -0.02 (0.70)    0.00 (0.10)     -0.03 (1.18)    -0.00 (0.01)
10    (0)    0.00 (0.06)     0.00 (0.03)     0.05 (2.59)     -0.00 (0.53)    0.02 (0.35)     0.00 (0.17)
10    (1)    -0.00 (0.22)    0.00 (0.02)     0.15 (6.10)     -0.00 (0.49)    -0.03 (1.31)    0.00 (0.15)
10    (2)    0.01 (0.30)     -0.00 (0.00)    0.12 (3.43)     0.00 (0.09)     0.06 (1.88)     -0.00 (0.01)
100   (0)    0.00 (0.02)     0.00 (0.00)     -0.01 (0.55)    0.00 (0.11)     0.00 (0.14)     0.00 (0.02)
100   (1)    0.09 (2.86)     -0.00 (0.00)    -1.19 (35.81)   -0.00 (0.11)    0.64 (20.70)    0.00 (0.02)
100   (2)    -0.00 (0.12)    -0.00 (0.00)    -0.01 (0.90)    0.00 (0.09)     -0.01 (0.77)    -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.22 (4.95)    -0.37 (0.43)    -2.71 (36.85)   -2.09 (2.96)    -0.39 (10.96)   -0.76 (0.89)
1     (1)    -0.09 (6.27)    -0.35 (0.45)    -0.84 (55.84)   -1.81 (3.04)    -0.17 (12.98)   -0.71 (0.91)
1     (2)    0.06 (1.19)     0.00 (0.03)     0.19 (3.77)     -0.00 (0.18)    0.13 (2.59)     0.00 (0.06)
10    (0)    0.01 (2.28)     -0.02 (0.32)    -0.29 (12.80)   -0.11 (1.57)    0.03 (4.78)     -0.05 (0.66)
10    (1)    0.57 (12.42)    -0.02 (0.26)    14.99 (440.12)  -0.12 (1.44)    0.64 (12.91)    -0.05 (0.54)
10    (2)    -0.01 (0.29)    0.00 (0.02)     -0.06 (1.38)    0.00 (0.16)     -0.02 (0.60)    0.00 (0.05)
100   (0)    0.01 (0.38)     0.00 (0.04)     0.05 (2.45)     0.00 (0.25)     0.02 (0.78)     0.00 (0.09)
100   (1)    0.01 (0.61)     -0.00 (0.04)    0.08 (3.74)     -0.00 (0.23)    0.02 (1.25)     -0.00 (0.08)
100   (2)    -0.01 (0.20)    0.00 (0.02)     -0.03 (0.74)    0.00 (0.16)     -0.02 (0.43)    0.00 (0.05)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.71 (2.84)    -0.73 (0.34)    -1.03 (16.80)   -2.02 (1.33)    -0.94 (3.93)    -0.91 (0.45)
1     (1)    1.19 (67.54)    -0.73 (0.43)    1.93 (144.55)   -2.00 (1.43)    1.57 (87.07)    -0.92 (0.53)
1     (2)    0.03 (0.59)     -0.00 (0.09)    0.06 (1.14)     -0.00 (0.26)    0.04 (0.77)     -0.00 (0.11)
10    (0)    -0.41 (29.63)   -0.26 (0.75)    -1.08 (69.56)   -0.67 (2.19)    -0.52 (37.99)   -0.33 (0.95)
10    (1)    -2.10 (92.84)   -0.21 (0.72)    -9.23 (383.19)  -0.55 (1.91)    -2.50 (111.61)  -0.27 (0.94)
10    (2)    -0.01 (0.83)    -0.00 (0.08)    0.02 (1.54)     -0.00 (0.24)    -0.01 (1.09)    -0.00 (0.10)
100   (0)    0.02 (3.04)     -0.00 (0.15)    -0.47 (11.62)   0.00 (0.44)     0.05 (4.27)     -0.00 (0.20)
100   (1)    0.03 (0.88)     -0.00 (0.13)    0.10 (2.31)     -0.01 (0.38)    0.03 (1.13)     -0.00 (0.17)
100   (2)    0.10 (2.33)     -0.00 (0.08)    0.18 (4.14)     -0.00 (0.23)    0.12 (3.04)     -0.00 (0.10)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 6, σε = 0.1
h(a) = exp(a)/(κ + exp(a)), q_k(a) = H_k(a) exp(−a^2/2)
Average network statistics: density = 0.40, min degree = 23.86, max degree = 57.01
Table 8. Design B4, Hermite polynomial sieve: Parameter values across 1000 MC replications with N = 100, K = 3.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (0.32)     0.03 (0.05)     0.29 (10.68)    0.12 (1.52)     0.30 (2.50)     0.20 (0.27)
1     (1)    0.01 (0.32)     0.01 (0.03)     0.25 (6.88)     0.19 (0.64)     0.06 (2.58)     0.06 (0.19)
1     (2)    0.01 (0.30)     -0.00 (0.00)    0.16 (2.26)     0.04 (0.08)     0.05 (2.26)     -0.00 (0.01)
10    (0)    0.00 (0.12)     0.00 (0.01)     -0.14 (2.52)    0.01 (0.20)     0.05 (1.05)     0.02 (0.03)
10    (1)    0.00 (0.02)     0.00 (0.00)     -0.02 (1.66)    0.02 (0.12)     0.02 (0.21)     0.01 (0.03)
10    (2)    0.00 (0.01)     0.00 (0.00)     0.01 (0.19)     0.01 (0.07)     0.00 (0.07)     0.00 (0.01)
100   (0)    0.00 (0.01)     0.00 (0.00)     -0.07 (2.10)    -0.00 (0.07)    0.01 (0.07)     0.00 (0.01)
100   (1)    -0.00 (0.02)    0.00 (0.00)     0.02 (0.26)     0.00 (0.08)     -0.00 (0.17)    0.00 (0.01)
100   (2)    0.00 (0.02)     0.00 (0.00)     -0.01 (0.42)    0.00 (0.07)     0.00 (0.17)     -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.57 (12.59)   -0.14 (1.24)    -3.44 (57.78)   -1.16 (6.76)    -1.16 (27.25)   -0.26 (2.71)
1     (1)    -0.30 (5.25)    0.05 (0.41)     -1.01 (26.22)   0.29 (2.27)     -0.65 (11.21)   0.11 (0.88)
1     (2)    0.01 (0.13)     0.00 (0.02)     0.10 (0.47)     0.06 (0.16)     0.03 (0.29)     0.00 (0.05)
10    (0)    0.01 (1.01)     0.04 (0.10)     -0.13 (9.12)    0.13 (0.59)     0.03 (1.90)     0.09 (0.20)
10    (1)    -0.01 (1.27)    0.01 (0.04)     0.08 (7.24)     0.07 (0.27)     -0.02 (2.64)    0.03 (0.09)
10    (2)    -0.00 (0.10)    0.00 (0.02)     -0.02 (0.64)    0.01 (0.13)     -0.00 (0.24)    0.00 (0.03)
100   (0)    -0.00 (0.40)    0.00 (0.02)     0.12 (3.18)     0.00 (0.15)     -0.01 (1.14)    0.00 (0.05)
100   (1)    0.01 (0.24)     0.00 (0.02)     0.03 (1.44)     0.01 (0.14)     0.02 (0.50)     0.00 (0.04)
100   (2)    -0.12 (3.88)    0.00 (0.02)     1.96 (64.17)    0.00 (0.12)     -0.38 (12.63)   -0.00 (0.03)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.89 (9.66)    -0.87 (1.12)    -2.36 (24.67)   -2.73 (2.49)    -1.14 (12.56)   -1.11 (1.49)
1     (1)    -0.51 (13.00)   -0.18 (1.18)    -1.48 (34.71)   -0.42 (3.52)    -0.64 (16.67)   -0.24 (1.54)
1     (2)    0.07 (1.31)     0.01 (0.08)     0.15 (2.83)     0.08 (0.23)     0.09 (1.72)     0.01 (0.10)
10    (0)    0.15 (3.14)     0.11 (0.38)     0.35 (9.94)     0.21 (1.09)     0.19 (3.93)     0.14 (0.49)
10    (1)    -0.07 (1.63)    0.02 (0.16)     -0.13 (3.62)    0.07 (0.46)     -0.09 (2.13)    0.03 (0.21)
10    (2)    -0.04 (1.73)    0.00 (0.05)     -0.19 (5.48)    0.01 (0.16)     -0.05 (2.18)    0.00 (0.06)
100   (0)    -0.01 (1.06)    0.00 (0.08)     0.11 (3.22)     0.01 (0.24)     -0.02 (1.46)    0.01 (0.10)
100   (1)    0.00 (0.39)     0.00 (0.06)     -0.00 (1.35)    0.00 (0.20)     0.00 (0.49)     0.00 (0.08)
100   (2)    -0.05 (2.92)    0.00 (0.05)     0.08 (1.70)     0.00 (0.16)     -0.08 (4.21)    0.00 (0.06)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 3, σε = 0.1
h(a) = exp(a)/(κ + exp(a)), q_k(a) = H_k(a) exp(−a^2/2)
Average network statistics: density = 0.24, min degree = 11.23, max degree = 39.02
Table 9. Design B4, Hermite polynomial sieve: Parameter values across 1000 MC replications with N = 100, K = 6.
Entries are mean bias with std in parentheses and median bias with iqr in parentheses.

Panel β = 0.20, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    0.04 (0.32)     0.03 (0.05)     0.34 (10.66)    0.12 (1.54)     0.27 (2.50)     0.19 (0.27)
1     (1)    -0.00 (0.37)    0.01 (0.03)     0.62 (12.21)    0.20 (0.64)     -0.04 (3.26)    0.06 (0.19)
1     (2)    -0.00 (0.02)    -0.00 (0.00)    0.04 (0.29)     0.03 (0.07)     -0.01 (0.15)    -0.00 (0.01)
10    (0)    0.00 (0.12)     0.00 (0.01)     -0.15 (2.52)    0.01 (0.20)     0.05 (1.05)     0.02 (0.03)
10    (1)    0.00 (0.02)     0.00 (0.00)     -0.02 (1.66)    0.02 (0.12)     0.02 (0.21)     0.01 (0.03)
10    (2)    -0.00 (0.01)    -0.00 (0.00)    0.00 (0.18)     0.01 (0.07)     -0.00 (0.09)    -0.00 (0.01)
100   (0)    0.00 (0.01)     0.00 (0.00)     -0.07 (2.10)    -0.00 (0.07)    0.01 (0.07)     0.00 (0.01)
100   (1)    -0.00 (0.02)    0.00 (0.00)     0.02 (0.26)     0.00 (0.08)     -0.00 (0.18)    0.00 (0.01)
100   (2)    -0.00 (0.01)    -0.00 (0.00)    -0.00 (0.21)    0.00 (0.07)     -0.00 (0.11)    -0.00 (0.01)

Panel β = 0.50, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.57 (12.59)   -0.14 (1.26)    -3.52 (57.75)   -1.18 (6.76)    -1.18 (27.26)   -0.26 (2.72)
1     (1)    -0.30 (5.25)    0.04 (0.42)     -1.00 (26.22)   0.28 (2.27)     -0.66 (11.21)   0.10 (0.89)
1     (2)    -0.35 (11.45)   -0.00 (0.01)    -1.56 (51.31)   0.03 (0.12)     -0.77 (24.92)   -0.00 (0.03)
10    (0)    0.01 (1.01)     0.04 (0.10)     -0.13 (9.12)    0.13 (0.59)     0.03 (1.90)     0.09 (0.20)
10    (1)    -0.01 (1.27)    0.01 (0.04)     0.08 (7.24)     0.07 (0.27)     -0.02 (2.64)    0.03 (0.10)
10    (2)    0.00 (0.15)     -0.00 (0.01)    0.01 (0.33)     0.01 (0.11)     0.00 (0.34)     -0.00 (0.03)
100   (0)    0.01 (0.14)     0.00 (0.02)     0.02 (0.88)     0.00 (0.15)     0.02 (0.31)     0.00 (0.05)
100   (1)    0.01 (0.24)     0.00 (0.02)     0.03 (1.44)     0.01 (0.14)     0.02 (0.50)     0.00 (0.04)
100   (2)    0.00 (0.13)     0.00 (0.01)     0.01 (0.37)     0.00 (0.11)     0.01 (0.30)     0.00 (0.03)

Panel β = 0.80, γ = 1, δ = 1
             β                               γ                               δ
κ     CF     mean (std)      median (iqr)    mean (std)      median (iqr)    mean (std)      median (iqr)
1     (0)    -0.87 (9.64)    -0.86 (1.12)    -2.36 (24.67)   -2.73 (2.48)    -1.11 (12.55)   -1.10 (1.49)
1     (1)    -0.61 (12.68)   -0.18 (1.17)    -1.71 (34.16)   -0.42 (3.57)    -0.77 (16.24)   -0.24 (1.55)
1     (2)    0.02 (1.18)     0.00 (0.05)     0.15 (3.98)     0.03 (0.16)     0.01 (1.46)     -0.00 (0.06)
10    (0)    0.13 (3.03)     0.11 (0.38)     0.27 (9.57)     0.21 (1.09)     0.16 (3.78)     0.14 (0.48)
10    (1)    -0.07 (1.63)    0.03 (0.16)     -0.10 (3.58)    0.07 (0.46)     -0.09 (2.13)    0.03 (0.21)
10    (2)    -0.05 (0.98)    -0.00 (0.05)    -0.07 (1.92)    0.01 (0.15)     -0.07 (1.30)    -0.00 (0.06)
100   (0)    -0.01 (1.06)    0.00 (0.08)     0.11 (3.22)     0.01 (0.24)     -0.02 (1.46)    0.01 (0.10)
100   (1)    0.00 (0.39)     0.00 (0.06)     -0.00 (1.35)    0.01 (0.20)     0.00 (0.49)     0.00 (0.08)
100   (2)    -0.02 (0.39)    -0.00 (0.05)    -0.02 (0.71)    0.01 (0.15)     -0.03 (0.52)    -0.00 (0.06)

CF - control function. (0) - none, (1) - linear, (2) - non-parametric
iqr - interquartile range (75th-25th percentile)
N = 100, K = 6, σε = 0.1
h(a) = exp(a)/(κ + exp(a)), q_k(a) = H_k(a) exp(−a^2/2)
Average network statistics: density = 0.24, min degree = 11.23, max degree = 39.02
6. Conclusions
In this paper we show that, whenever it is likely that the network is endogenous, it is
important to control for this endogeneity when estimating peer effects. Failing to control for
the endogeneity of the connections matrix in general leads to biased estimates of peer effects.
We show that under specific assumptions, we can use the control function approach to deal
with the endogeneity problem. We assume that unobserved individual characteristics directly
affect link formation and individual outcomes. We leave the functional form through which
unobserved individual characteristics enter the outcome equation unspecified and estimate it
using a non-parametric approach. The estimator we propose is easy to use in applied work,
and Monte Carlo results show that it performs well compared to a linear control function
estimator. Erroneously assuming that unobserved characteristics enter the outcome equation
in a linear fashion can lead to a serious bias in the estimated parameters.
References
Arduini, T., E. Patacchini, and E. Rainone (2015). Parametric and Semiparametric IV
Estimation of Network Models with Selectivity. Working Paper.
Badev, A. (2013). Discrete Games in Endogenous Networks: Theory and Policy. Working
Paper.
Blume, L., W. Brock, S. Durlauf, and R. Jayaraman (2013). Linear social interactions
models. NBER Working Paper (19212).
Bramoullé, Y., H. Djebbari, and B. Fortin (2009, May). Identification of peer effects through
social networks. Journal of Econometrics 150 (1), 41–55.
Calvó-Armengol, A., E. Patacchini, and Y. Zenou (2009, October). Peer Effects and Social
Networks in Education. Review of Economic Studies 76 (4), 1239–1267.
Chen, X. (2007). Large sample sieve estimation of semi-nonparametric models. Handbook of
Econometrics (1262).
Conley, T. G. and C. R. Udry (2010, March). Learning about a New Technology: Pineapple
in Ghana. American Economic Review 100 (1), 35–69.
Ductor, L. and M. Fafchamps (2011). Social networks and research output. Review of
Economics and Statistics.
Epple, D. and R. E. Romano (2011). Peer effects in education: A survey of the theory and
evidence (1 ed.), Volume 1. Elsevier B.V.
Goldsmith-Pinkham, P. and G. W. Imbens (2013). Social Networks and the Identification
of Peer Effects. Journal of Business & Economic Statistics 31 (3), 253–264.
Graham, B. (2011). Econometric Methods for the Analysis of Assignment Problems in the
Presence of Complementarity and Social Spillovers. Handbook of Social Economics 1B,
965–1062.
Graham, B. (2015). Methods of identification in social networks. Annual Review of Economics 7, 1–60.
Graham, B. S. (2008, May). Identifying Social Interactions Through Conditional Variance
Restrictions. Econometrica 76 (3), 643–660.
Hoxby, C. (2000). Peer effects in the classroom: Learning from gender and race variation.
National Bureau of Economic Research Working Paper 4978 (August), 1–54.
Hsieh, C. and L. Lee (2014). A social interactions model with endogenous friendship formation and selectivity. Journal of Applied Econometrics, 1099–1255.
Kelejian, H. and I. Prucha (1998). A generalized spatial two-stage least squares procedure for
estimating a spatial autoregressive model with autoregressive disturbances. The Journal
of Real Estate Finance and Economics 17, 99–121.
Lee, L. (2003, January). Best Spatial Two-Stage Least Squares Estimators for a Spatial
Autoregressive Model with Autoregressive Disturbances. Econometric Reviews 22 (4),
307–335.
Lee, L.-f. (2007a, April). GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. Journal of Econometrics 137 (2), 489–514.
Lee, L.-f. (2007b, October). Identification and estimation of econometric models with group
interactions, contextual factors and fixed effects. Journal of Econometrics 140 (2), 333–374.
Lee, L.-f., X. Liu, and X. Lin (2010). Specification and estimation of social interaction models
with network structures. Econometrics Journal 13, 145–176.
Li, Q. and S. J. Racine (2007). Nonparametric Econometrics: Theory and Practice. Princeton
University Press.
Manski, C. (1993). Identification of endogenous social effects: The reflection problem. The
Review of Economic Studies.
Manski, C. (2000). Economic Analysis of Social Interactions. Journal of Economic
Perspectives 14 (3), 115–136.
Newey, W. K. (1997, July). Convergence rates and asymptotic normality for series estimators.
Journal of Econometrics 79 (1), 147–168.
Qu, X. and L.-f. Lee (2015, February). Estimating a spatial autoregressive model with an
endogenous spatial weight matrix. Journal of Econometrics 184 (2), 209–232.
Sacerdote, B. (2014). Experimental and Quasi-Experimental Analysis of Peer Effects: Two
Steps Forward? Annual Review of Economics 6 (1), 253–272.
Shalizi, C. R. (2012, January). Comment on "Why and When 'Flawed' Social Network
Analyses Still Yield Valid Tests of no Contagion". Statistics, Politics, and Policy 3 (1).
Weinberg, B. (2007). Social interactions with endogenous associations.
Zimmerman, D. J. (2003). Peer Effects in Academic Outcomes: Evidence from a Natural
Experiment. Review of Economics and Statistics 85 (1), 9–23.
Appendix
A. Notation. We use the following notation. $M$ denotes a finite generic constant and $a \perp b$ means that $a$ and $b$ are orthogonal to each other. For an $N \times N$ matrix $A$, we define matrix norms as follows: $\|A\| = \big(\sum_{i,j} |a_{ij}|^2\big)^{1/2}$ denotes the Frobenius norm, and $\|A\|_o$ denotes the operator norm of $A$, that is, $\|A\|_o = \lambda_{\max}(A'A)^{1/2}$. $\lambda_{\min}(A)$ denotes the minimum eigenvalue of $A$. Notice that
\[
\|A\|_o \le \|A\| \le \|A\|_o \,\mathrm{rank}(A). \tag{A.1}
\]
Further, for a matrix $A$, $[A]_i$ denotes the $i$-th row of $A$. Denote $[G X_2]_i$ by $X_{2,G,i}$, $[G^2 X_2]_i$ by $X_{2,G^2,i}$, and $[GY]_i$ by $Y_{G,i}$. The $i$-th row of the instrument matrix $\mathbf{Z}_N$ is given by $Z_i' = [X_{2,i}', X_{2,G,i}, X_{2,G^2,i}]$, where $Z_i$ is $(3 l_x) \times 1$. Similarly, $W_i' = [Y_{G,i}, X_{2,i}', X_{2,G,i}]$. We denote matrices by bold letters: $\mathbf{Z}_N = (Z_1, \dots, Z_N)'$, $\mathbf{W}_N = (W_1, \dots, W_N)'$ and $\mathbf{A}_N = (A_1, \dots, A_N)'$. $\theta = (\beta', \gamma', \delta')'$ is the vector of parameters to be estimated.
The series approximation function of order $K$ is given by $q^K(a) = (q_1(a), \dots, q_{K_N}(a))'$, and combining all series terms in matrix form yields $Q_N = (q^K(A_1), \dots, q^K(A_N))'$. We denote the series approximation with the estimated $\hat{A}_N$ by $\hat{Q}_N = (q^K(\hat{A}_1), \dots, q^K(\hat{A}_N))'$. Finally, $\alpha = (\alpha_1, \dots, \alpha_{K_N})'$, $H(\mathbf{A}_N) = (h(A_1), \dots, h(A_N))'$ and $H(\hat{\mathbf{A}}_N) = (h(\hat{A}_1), \dots, h(\hat{A}_N))'$.
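The norm inequality (A.1) is easy to check numerically. The following is a minimal sketch in Python with NumPy, not part of the original derivation; the matrices and the helper name `check_norm_bounds` are hypothetical and serve only as illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def check_norm_bounds(A, tol=1e-10):
    """Return (operator norm, Frobenius norm, rank) after checking (A.1)."""
    op = np.linalg.norm(A, 2)         # largest singular value = lambda_max(A'A)^{1/2}
    fro = np.linalg.norm(A, "fro")    # (sum_ij |a_ij|^2)^{1/2}
    rank = np.linalg.matrix_rank(A)
    assert op <= fro + tol            # ||A||_o <= ||A||
    assert fro <= op * rank + tol     # ||A|| <= ||A||_o rank(A)
    return op, fro, rank

# A generic 5 x 5 matrix and a rank-2 matrix built from thin factors.
A_full = rng.standard_normal((5, 5))
A_low = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 5))
results = [check_norm_bounds(A_full), check_norm_bounds(A_low)]
```

The appendix uses (A.1) to move between the operator norm and the Frobenius norm when the matrix dimension, and hence the rank, is fixed.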
B. Controlling the error from the approximation $\hat{A} - A$. In this section, we show that the error coming from approximating $\mathbf{A}_N$ with $\hat{\mathbf{A}}_N$ is of order $o_p(1)$. All supporting lemmas can be found in Appendix C.
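Lemma B.1. Assume the assumptions in Theorem 4.3 and Assumptions 3.3, 4.4, 4.5, 4.7 and 4.9. Then the following hold.
(a) $\frac{1}{N}\big(\mathbf{Z}_N' P_{\hat{Q}} \mathbf{W}_N - \mathbf{Z}_N' P_Q \mathbf{W}_N\big) = o_p(1)$.
(b) $\frac{1}{N}\big(\mathbf{Z}_N' P_{\hat{Q}} \mathbf{Z}_N - \mathbf{Z}_N' P_Q \mathbf{Z}_N\big) = o_p(1)$.
(c) $\frac{1}{\sqrt{N}}\big(\mathbf{Z}_N' P_{\hat{Q}} \varepsilon_N - \mathbf{Z}_N' P_Q \varepsilon_N\big) = o_p(1)$.
(d) $\frac{1}{\sqrt{N}}\, \mathbf{Z}_N' M_{\hat{Q}} \big(H(\mathbf{A}_N) - \hat{Q}_N \alpha\big) = o_p(1)$.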
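Proof. Part (a). Write
\[
\frac{1}{N}\big(\mathbf{Z}_N' P_{\hat{Q}} \mathbf{W}_N - \mathbf{Z}_N' P_Q \mathbf{W}_N\big)
= \frac{\mathbf{Z}_N' \hat{Q}_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{\hat{Q}_N' \mathbf{W}_N}{N}
- \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{Q_N' Q_N}{N}\right)^{-1}\frac{Q_N' \mathbf{W}_N}{N}
= I_1 + I_2 - I_3 + I_4, \text{ say,}
\]
where
\[
I_1 = \frac{\mathbf{Z}_N' (\hat{Q}_N - Q_N)}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{(\hat{Q}_N - Q_N)' \mathbf{W}_N}{N},
\qquad
I_2 = \frac{\mathbf{Z}_N' (\hat{Q}_N - Q_N)}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{Q_N' \mathbf{W}_N}{N},
\]
\[
I_3 = \frac{\mathbf{Z}_N' Q_N}{N}\left[\left(\frac{Q_N' Q_N}{N}\right)^{-1} - \left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\right]\frac{Q_N' \mathbf{W}_N}{N},
\qquad
I_4 = \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{(\hat{Q}_N - Q_N)' \mathbf{W}_N}{N}.
\]
For the desired result, by (A.1) it suffices to show that
\[
\left\| \frac{1}{N}\big(\mathbf{Z}_N' P_{\hat{Q}} \mathbf{W}_N - \mathbf{Z}_N' P_Q \mathbf{W}_N\big) \right\|_o = o_p(1),
\]
which follows by the triangle inequality if we show
\[
\|I_1\|_o,\ \|I_2\|_o,\ \|I_3\|_o,\ \|I_4\|_o = o_p(1).
\]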
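For term $I_1$,
\[
\|I_1\|_o \le \left\|\frac{\mathbf{W}_N}{\sqrt{N}}\right\| \left\|\frac{\mathbf{Z}_N}{\sqrt{N}}\right\| \left\|\frac{\hat{Q}_N - Q_N}{\sqrt{N}}\right\|^2 \left\|\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\right\|_o
= O_p(1)\left(\frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2\right) O_p(1)\, O(1) = o_p(1),
\]
where the last line holds by (C.1), Lemmas C.2 and C.4, and by Assumption 4.5.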
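For term $I_2$,
\[
\|I_2\|_o \le \left\|\frac{\mathbf{Z}_N}{\sqrt{N}}\right\| \left\|\frac{Q_N}{\sqrt{N}}\right\| \left\|\frac{\mathbf{W}_N}{\sqrt{N}}\right\| \left\|\frac{\hat{Q}_N - Q_N}{\sqrt{N}}\right\| \left\|\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\right\|_o
= O_p(1)\left(\frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2\right)^{1/2} O_p(1)\, \zeta_0(K_N)\, O(1) = o_p(1),
\]
where the last line holds by (C.1), Lemmas C.2 and C.4, and by Assumption 4.5.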
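For term $I_3$, write
\[
I_3 = \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\left\{\frac{\hat{Q}_N' \hat{Q}_N}{N} - \frac{Q_N' Q_N}{N}\right\}\left(\frac{Q_N' Q_N}{N}\right)^{-1}\frac{Q_N' \mathbf{W}_N}{N}
\]
\[
= \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\left(\frac{\hat{Q}_N' (\hat{Q}_N - Q_N)}{N}\right)\left(\frac{Q_N' Q_N}{N}\right)^{-1}\frac{Q_N' \mathbf{W}_N}{N}
+ \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\left(\frac{(\hat{Q}_N - Q_N)' Q_N}{N}\right)\left(\frac{Q_N' Q_N}{N}\right)^{-1}\frac{Q_N' \mathbf{W}_N}{N}.
\]
Then,
\[
\|I_3\|_o \le O_p(1)\,\zeta_0(K_N)\,O_p(1)\,\zeta_0(K_N)\left(\frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2\right)^{1/2} O_p(1)\,\zeta_0(K_N)\,O_p(1) = o_p(1),
\]
where the last equality follows by Assumption 4.5.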
The desired result for term $I_4$ follows by an argument similar to the one used for term $I_2$.
Part (b) can be shown in the same way as Part (a).
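The algebra behind the decomposition in Part (a) can be verified numerically. The sketch below is illustrative only: the matrices are simulated placeholders (not the paper's data), `Qh` is a hypothetical stand-in for the estimated sieve matrix, and `proj` is a helper name introduced here. It checks that $I_1 + I_2 - I_3 + I_4$ reproduces the difference of the two projected cross-products, and that the two expressions for $I_3$ agree.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, dz, dw = 200, 4, 3, 2

Z = rng.standard_normal((N, dz))
W = rng.standard_normal((N, dw))
Q = rng.standard_normal((N, K))
Qh = Q + 0.05 * rng.standard_normal((N, K))   # hypothetical perturbed ("estimated") Q

def proj(B):
    """Projection matrix B (B'B)^{-1} B'."""
    return B @ np.linalg.solve(B.T @ B, B.T)

lhs = (Z.T @ proj(Qh) @ W - Z.T @ proj(Q) @ W) / N

D = Qh - Q
Ah = Qh.T @ Qh / N
A = Q.T @ Q / N
Ah_inv, A_inv = np.linalg.inv(Ah), np.linalg.inv(A)

I1 = (Z.T @ D / N) @ Ah_inv @ (D.T @ W / N)
I2 = (Z.T @ D / N) @ Ah_inv @ (Q.T @ W / N)
I3 = (Z.T @ Q / N) @ (A_inv - Ah_inv) @ (Q.T @ W / N)
I4 = (Z.T @ Q / N) @ Ah_inv @ (D.T @ W / N)

# Alternative form of I3 used when bounding it: Ah^{-1} (Ah - A) A^{-1}.
I3_alt = (Z.T @ Q / N) @ Ah_inv @ (Ah - A) @ A_inv @ (Q.T @ W / N)

max_gap = np.max(np.abs(lhs - (I1 + I2 - I3 + I4)))
i3_gap = np.max(np.abs(I3 - I3_alt))
```

Both gaps should be at the level of floating-point rounding, which confirms the signs of the four terms.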
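Part (c). Write
\[
\frac{1}{\sqrt{N}}\big(\mathbf{Z}_N' P_{\hat{Q}} \varepsilon_N - \mathbf{Z}_N' P_Q \varepsilon_N\big) = III_1 + III_2 - III_3 + III_4, \text{ say,}
\]
where
\[
III_1 = \frac{\mathbf{Z}_N' (\hat{Q}_N - Q_N)}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{(\hat{Q}_N - Q_N)' \varepsilon_N}{\sqrt{N}},
\qquad
III_2 = \frac{\mathbf{Z}_N' (\hat{Q}_N - Q_N)}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{Q_N' \varepsilon_N}{\sqrt{N}},
\]
\[
III_3 = \frac{\mathbf{Z}_N' Q_N}{N}\left[\left(\frac{Q_N' Q_N}{N}\right)^{-1} - \left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\right]\frac{Q_N' \varepsilon_N}{\sqrt{N}},
\qquad
III_4 = \frac{\mathbf{Z}_N' Q_N}{N}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\frac{(\hat{Q}_N - Q_N)' \varepsilon_N}{\sqrt{N}},
\]
and the desired result of Part (c) follows if we show that for $j = 1, \dots, 4$,
\[
\|III_j\| = o_p(1).
\]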
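First, for term $III_1$, we have
\[
\|III_1\| \le \left\|\frac{\mathbf{Z}_N}{\sqrt{N}}\right\| \left\|\frac{\hat{Q}_N - Q_N}{\sqrt{N}}\right\| \left\|\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right)^{-1}\right\|_o \left\|\frac{(\hat{Q}_N - Q_N)' \varepsilon_N}{\sqrt{N}}\right\|
= O_p(1)\left(\frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2\right)^{1/2} O_p(1) \left\|\frac{(\hat{Q}_N - Q_N)' \varepsilon_N}{\sqrt{N}}\right\|,
\]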
where the last line holds by (C.1) and Lemmas C.2 and C.4. Under Assumption 4.9, we can show that
\[
E\left[\left\|\frac{(\hat{Q}_N - Q_N)' \varepsilon_N}{\sqrt{N}}\right\|^2 \,\Big|\, \mathbf{X}_N, G_N, \mathbf{A}_N\right] = \frac{1}{N}\left\|\hat{Q}_N - Q_N\right\|^2.
\]
Then, by Lemma C.2 and Assumption 4.5, we have the required result for term $III_1$.
The remaining results follow in a similar fashion, and we omit the proofs.
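Part (d).
Notice that, since $M_{\hat{Q}} \hat{Q}_N = 0$ and $M_Q Q_N = 0$,
\[
\frac{1}{\sqrt{N}}\, \mathbf{Z}_N' M_{\hat{Q}} \big(H(\mathbf{A}_N) - \hat{Q}_N \alpha\big)
= \frac{1}{\sqrt{N}}\, \mathbf{Z}_N' M_{\hat{Q}} H(\mathbf{A}_N)
= \frac{1}{\sqrt{N}}\, \mathbf{Z}_N' \big(M_{\hat{Q}} - M_Q\big) H(\mathbf{A}_N) + \frac{1}{\sqrt{N}}\, \mathbf{Z}_N' M_Q \big(H(\mathbf{A}_N) - Q_N \alpha\big)
= IV_1 + IV_2, \text{ say.}
\]
We can show $IV_1 = o_p(1)$ by applying arguments similar to those used in the proof of Part (a).
For term $IV_2$, notice that
\[
\|IV_2\| = \|IV_2\|_o \le \left\|\frac{1}{\sqrt{N}} \mathbf{Z}_N\right\|_o \|M_Q\|_o \left\|H(\mathbf{A}_N) - Q_N \alpha\right\|_o
\le \left\|\frac{1}{\sqrt{N}} \mathbf{Z}_N\right\| \left\|H(\mathbf{A}_N) - Q_N \alpha\right\|
= O_p(1)\, \sqrt{N}\, O(K_N^{-\kappa}) = o_p(1)
\]
by Assumption 4.4 (iii) and (iv).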
C. Supporting Lemmas. First notice that by Assumption 4.7
\[
\frac{1}{N}\|\mathbf{Z}_N\|^2 = O_p(1), \qquad \frac{1}{N}\|\mathbf{W}_N\|^2 = O_p(1). \tag{C.1}
\]
Lemma C.1. Under Assumption 4.4, we have
\[
\frac{1}{N}\|Q_N\|^2 \le M \zeta_0^2(K_N).
\]
Proof.
\[
\frac{1}{N}\|Q_N\|^2 = \frac{1}{N}\sum_{i=1}^{N} \|q^K(A_i)\|^2 \le \sup_i \|q^K(A_i)\|^2 = \zeta_0^2(K_N)
\]
by Assumption 4.4 (ii).
Lemma C.2. Under Assumptions 4.4 and 4.5, we have
\[
\frac{1}{N}\|\hat{Q}_N - Q_N\|^2 = M\, \frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2.
\]
Proof.
\[
\frac{1}{N}\|\hat{Q}_N - Q_N\|^2 = \frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K_N} \|q_k(\hat{A}_i) - q_k(A_i)\|^2 \le \frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K_N} \zeta_1(k)^2 \|\hat{A}_i - A_i\|^2
\le \frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K_N} \zeta_1(k)^2\, \frac{\ln N}{N} = \frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1(k)^2,
\]
where the first inequality follows from Assumption 4.5 and the second inequality follows from Theorem 4.3.
Lemma C.3. For symmetric matrices $A$ and $B$ it is true that
\[
|\lambda_{\min}(A) - \lambda_{\min}(B)| \le \|A - B\|.
\]
Proof. Let $x_A$ be the unit-norm eigenvector associated with the minimum eigenvalue of $A$. Define $x_B$ analogously. First we show $\lambda_{\min}(A) - \lambda_{\min}(B) \le \|A - B\|$. Since $x_A' A x_A \le x_B' A x_B$,
\[
\lambda_{\min}(A) - \lambda_{\min}(B) = x_A' A x_A - x_B' B x_B \le x_B' (A - B) x_B \le |x_B' (A - B) x_B| \le \|A - B\|.
\]
Also, we can prove the other direction. Since $x_B' B x_B \le x_A' B x_A$,
\[
\lambda_{\min}(A) - \lambda_{\min}(B) = x_A' A x_A - x_B' B x_B \ge x_A' (A - B) x_A \ge -|x_A' (A - B) x_A| \ge -\|A - B\|.
\]
Then, we have the required result.
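As a sanity check on Lemma C.3, the bound can be tested on random symmetric matrices. This is a hedged numerical sketch, not a proof; the matrix size, the number of draws and the helper name `min_eig_gap` are arbitrary choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def min_eig_gap(A, B):
    """Return |lambda_min(A) - lambda_min(B)| and the two norms of A - B."""
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    gap = abs(np.linalg.eigvalsh(A)[0] - np.linalg.eigvalsh(B)[0])
    op = np.linalg.norm(A - B, 2)        # operator norm
    fro = np.linalg.norm(A - B, "fro")   # Frobenius norm
    return gap, op, fro

checks = []
for _ in range(100):
    X = rng.standard_normal((6, 6))
    Y = rng.standard_normal((6, 6))
    A = (X + X.T) / 2    # symmetrize
    B = (Y + Y.T) / 2
    checks.append(min_eig_gap(A, B))
```

In every draw the eigenvalue gap is bounded by the operator norm of $A - B$, which in turn is bounded by the Frobenius norm, so the lemma holds under either norm.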
Lemma C.4. Under Assumptions 4.4 and 4.5, w.p.a.1, there exists a positive constant $C > 0$ such that
\[
\frac{1}{C} \le \lambda_{\min}\left(\frac{Q_N' Q_N}{N}\right),\ \lambda_{\min}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right).
\]
Proof. First we show that there exists a positive constant $C$ such that $\frac{1}{C} \le \lambda_{\min}\left(\frac{Q_N' Q_N}{N}\right)$, which follows by Assumption 4.4(i) if we show
\[
\lambda_{\min}\left(\frac{Q_N' Q_N}{N} - E[q^{K_N}(A_i) q^{K_N}(A_i)']\right) = o_p(1).
\]
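For this, by Lemma C.3, we have
\[
\left|\lambda_{\min}\left(\frac{Q_N' Q_N}{N} - E[q^{K_N}(A_i) q^{K_N}(A_i)']\right)\right| \le \left\|\frac{Q_N' Q_N}{N} - E[q^{K_N}(A_i) q^{K_N}(A_i)']\right\|
= \left\|\frac{1}{N}\sum_{i=1}^{N} \big(q^{K_N}(A_i) q^{K_N}(A_i)' - E[q^{K_N}(A_i) q^{K_N}(A_i)']\big)\right\|.
\]
Then, by Assumption 4.4(ii), we have
\[
E\left\|\frac{1}{N}\sum_{i=1}^{N} \big(q^{K_N}(A_i) q^{K_N}(A_i)' - E[q^{K_N}(A_i) q^{K_N}(A_i)']\big)\right\|^2
= \sum_{k=1}^{K_N}\sum_{l=1}^{K_N} E\left(\frac{1}{N}\sum_{i=1}^{N} \big(q_k(A_i) q_l(A_i) - E[q_k(A_i) q_l(A_i)]\big)\right)^2
\]
\[
\le \frac{1}{N}\sum_{k=1}^{K_N}\sum_{l=1}^{K_N} E\big[(q_k(A_i) q_l(A_i))^2\big] \le \frac{1}{N}\left(\sup_a \sum_{k=1}^{K_N} q_k(a)^2\right)^2 \le \frac{\zeta_0(K_N)^4}{N} = o(1),
\]
where the last line holds by Assumptions 4.4(ii) and 4.5.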
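Next, given the first part of the lemma, the second claim of the lemma follows if we show
\[
\left|\lambda_{\min}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right) - \lambda_{\min}\left(\frac{Q_N' Q_N}{N}\right)\right| = o_p(1).
\]
By Lemma C.3, for symmetric matrices $A$ and $B$, we have
\[
|\lambda_{\min}(A) - \lambda_{\min}(B)| \le \|A - B\|.
\]
Then,
\[
\left|\lambda_{\min}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right) - \lambda_{\min}\left(\frac{Q_N' Q_N}{N}\right)\right| \le \left\|\frac{\hat{Q}_N' \hat{Q}_N}{N} - \frac{Q_N' Q_N}{N}\right\|
\le \left\|\frac{(\hat{Q}_N - Q_N)'}{\sqrt{N}}\frac{Q_N}{\sqrt{N}}\right\| + \left\|\frac{Q_N'}{\sqrt{N}}\frac{(\hat{Q}_N - Q_N)}{\sqrt{N}}\right\| + \left\|\frac{(\hat{Q}_N - Q_N)'}{\sqrt{N}}\frac{(\hat{Q}_N - Q_N)}{\sqrt{N}}\right\|.
\]
Then, by Lemmas C.1 and C.2 and by Assumption 4.5, we have
\[
\left|\lambda_{\min}\left(\frac{\hat{Q}_N' \hat{Q}_N}{N}\right) - \lambda_{\min}\left(\frac{Q_N' Q_N}{N}\right)\right| \le M \zeta_0(K_N)\sqrt{\frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1^2(k)} + \frac{\ln N}{N}\sum_{k=1}^{K_N} \zeta_1^2(k) = o_p(1),
\]
as desired.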
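D. Controlling the series approximation error.
Lemma D.1 (Series Approximation). Assume the assumptions in Lemma B.1. Then, we have
(a) $\frac{1}{N}\sum_{i=1}^{N} \big(W_i - \hat{h}^W(A_i)\big)\big(Z_i - \hat{h}^Z(A_i)\big)' = \frac{1}{N}\sum_{i=1}^{N} \big(W_i - h^W(A_i)\big)\big(Z_i - h^Z(A_i)\big)' + o_p(1)$,
(b) $\frac{1}{N}\sum_{i=1}^{N} \big(Z_i - \hat{h}^Z(A_i)\big)\big(Z_i - \hat{h}^Z(A_i)\big)' = \frac{1}{N}\sum_{i=1}^{N} \big(Z_i - h^Z(A_i)\big)\big(Z_i - h^Z(A_i)\big)' + o_p(1)$,
(c) $\frac{1}{\sqrt{N}}\sum_{i=1}^{N} \big(Z_i - \hat{h}^Z(A_i)\big)\varepsilon_i = \frac{1}{\sqrt{N}}\sum_{i=1}^{N} \big(Z_i - h^Z(A_i)\big)\varepsilon_i + o_p(1)$.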
Proof. Lemma D.1 follows if we show
(i) $\frac{1}{N}\sum_{i=1}^{N} \big(\hat{h}^W(A_i) - h^W(A_i)\big)\big(\hat{h}^W(A_i) - h^W(A_i)\big)' = o_p(1)$.
(ii) $\frac{1}{N}\sum_{i=1}^{N} \big(\hat{h}^Z(A_i) - h^Z(A_i)\big)\big(\hat{h}^Z(A_i) - h^Z(A_i)\big)' = o_p(1)$.
(iii) $\frac{1}{\sqrt{N}}\sum_{i=1}^{N} \big(\hat{h}^Z(A_i) - h^Z(A_i)\big)\varepsilon_i = o_p(1)$.
Lemma D.1 (i) and (ii) hold by Lemma D.4, and Lemma D.1 (iii) follows from (ii). See the remainder of this section.
Following Newey (1997), we assume $B = I$ in Assumption 4.4, hence $\tilde{q}^K(a) = q^K(a)$. Also, we assume $P = E[q^K(A_i)(q^K(A_i))'] = I$.$^{5}$
Lemma D.2. $E[\|\tilde{P} - I\|^2] = O(\zeta_0(K_N)^2 K_N / N)$, where $\tilde{P} = (Q_N' Q_N)/N$.
For a proof see Li and Racine (2007, p. 481). Note that this lemma implies that $\|\tilde{P} - I\| = O_p\big(\zeta_0(K_N)\sqrt{K_N / N}\big) = o_p(1)$. Also, since the smallest eigenvalue of $\tilde{P} - I$ is bounded by $\|\tilde{P} - I\|$, this implies that the smallest eigenvalue of $\tilde{P}$ converges to one in probability. Letting $1_N$ be the indicator function for the smallest eigenvalue of $\tilde{P}$ being greater than $1/2$, we have $\Pr(1_N = 1) \to 1$.
Lemma D.3. $\|\tilde{\alpha}^{(f)} - \alpha^{(f)}\| = O_p(K_N^{-\kappa})$, where $\tilde{\alpha}^{(f)} = (Q_N' Q_N)^{-1} Q_N' f$, $\alpha^{(f)}$ satisfies Assumption 4.4, and $f = h(\cdot), h^Z(a), h^W(a)$.
Proof.
\[
1_N \|\tilde{\alpha}^{(f)} - \alpha^{(f)}\| = 1_N \|(Q_N' Q_N)^{-1} Q_N' (f - Q_N \alpha^{(f)})\|
\]
\[
= 1_N \big\{(f - Q_N \alpha^{(f)})' Q_N (Q_N' Q_N)^{-1} (Q_N' Q_N / N)^{-1} Q_N' (f - Q_N \alpha^{(f)}) / N\big\}^{1/2}
\]
\[
= 1_N O_p(1) \big\{(f - Q_N \alpha^{(f)})' Q_N (Q_N' Q_N)^{-1} Q_N' (f - Q_N \alpha^{(f)}) / N\big\}^{1/2}
\]
\[
\le O_p(1) \big\{(f - Q_N \alpha^{(f)})' (f - Q_N \alpha^{(f)}) / N\big\}^{1/2} = O_p(K_N^{-\kappa})
\]
by Lemma D.2, Assumption 4.4(iii), the fact that $Q_N (Q_N' Q_N)^{-1} Q_N'$ is idempotent and $\Pr(1_N = 1) \to 1$.
$^{5}$ The lemmas in this section follow Section 15.6 in Li and Racine (2007).
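Define $S_{A,B} = \frac{1}{N}\sum_i A_i B_i'$ and $S_{A,A} = S_A$.
Lemma D.4. $S_{f - \tilde{f}} = O_p(K_N^{-2\kappa}) = o_p(N^{-1/2})$, where $f = h(\cdot), h^Z(a), h^W(a)$ and $\tilde{f} = Q_N \tilde{\alpha}^{(f)}$.
Proof. We have
\[
S_{f - \tilde{f}} = \frac{1}{N}\|f - \tilde{f}\|^2 \le \frac{1}{N}\big\{\|f - Q_N \alpha^{(f)}\|^2 + \|Q_N (\alpha^{(f)} - \tilde{\alpha}^{(f)})\|^2\big\}
\]
\[
= O(K_N^{-2\kappa}) + (\alpha^{(f)} - \tilde{\alpha}^{(f)})' (Q_N' Q_N / N)(\alpha^{(f)} - \tilde{\alpha}^{(f)})
= O(K_N^{-2\kappa}) + O_p(1)\, \|\alpha^{(f)} - \tilde{\alpha}^{(f)}\|^2 = O_p(K_N^{-2\kappa})
\]
by Assumption 4.4(iii), Lemma D.2 and Lemma D.3.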
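The least-squares series fit used in Lemmas D.3 and D.4 can be illustrated numerically. The sketch below is an assumption-laden example rather than the paper's implementation: the Legendre basis, the function $\sin(2a)$ standing in for $h$, the uniform draws standing in for $A_i$, and the name `alpha_other` are all hypothetical choices. It computes $\tilde{\alpha}^{(f)} = (Q_N' Q_N)^{-1} Q_N' f$ and checks that $S_{f - \tilde{f}}$ is no larger than the mean squared error of any other coefficient vector.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 500, 6

A = rng.uniform(-1.0, 1.0, size=N)   # stand-in for the individual effects A_i
f = np.sin(2.0 * A)                  # hypothetical smooth function h(A_i)

# N x K matrix whose i-th row is a K-term Legendre series evaluated at A_i.
Q = np.polynomial.legendre.legvander(A, K - 1)

# alpha_tilde = (Q'Q)^{-1} Q'f, computed by least squares.
alpha_tilde, *_ = np.linalg.lstsq(Q, f, rcond=None)
f_tilde = Q @ alpha_tilde
S_tilde = np.mean((f - f_tilde) ** 2)          # S_{f - f_tilde}

# Any other coefficient vector gives a weakly larger mean squared error.
alpha_other = alpha_tilde + 0.1 * rng.standard_normal(K)
S_other = np.mean((f - Q @ alpha_other) ** 2)

# Pythagoras: ||f - Q a||^2 = ||f - f_tilde||^2 + ||Q (a - alpha_tilde)||^2.
gap_term = np.mean((Q @ (alpha_other - alpha_tilde)) ** 2)
pythagoras_gap = abs(S_other - (S_tilde + gap_term))
```

Because $\tilde{f}$ is the projection of $f$ on the column space of $Q_N$, the inequality in the proof of Lemma D.4 holds with room to spare: the first term alone already bounds $S_{f - \tilde{f}}$.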
E. Distribution of estimator. In this section we derive the distribution of the estimator.
All supporting lemmas can be found in Section E.2.
Assumption E.1. E[s̃m,N (Xi , Ai )|Ai ] < κ < ∞ exists for all i.
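Lemma E.2 (Limiting Distribution). Under the assumptions made in Lemma B.1 and Assumption E.1,
\[
\frac{1}{N}\sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))' \xrightarrow{p} S^{WZ} =
\begin{pmatrix}
E\big[\eta_i^{GY} (\eta_i^{X_2})'\big] & E\big[\eta_i^{GY} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{GY} (\eta_i^{G^2 X_2})'\big] \\
E\big[\eta_i^{X_2} (\eta_i^{X_2})'\big] & E\big[\eta_i^{X_2} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{X_2} (\eta_i^{G^2 X_2})'\big] \\
E\big[\eta_i^{GX_2} (\eta_i^{X_2})'\big] & E\big[\eta_i^{GX_2} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{GX_2} (\eta_i^{G^2 X_2})'\big]
\end{pmatrix}
\]
and
\[
\frac{1}{N}\sum_{i=1}^{N} (Z_i - h^Z(A_i))(Z_i - h^Z(A_i))' \xrightarrow{p} S^{ZZ} =
\begin{pmatrix}
E\big[\eta_i^{X_2} (\eta_i^{X_2})'\big] & E\big[\eta_i^{X_2} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{X_2} (\eta_i^{G^2 X_2})'\big] \\
E\big[\eta_i^{GX_2} (\eta_i^{X_2})'\big] & E\big[\eta_i^{GX_2} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{GX_2} (\eta_i^{G^2 X_2})'\big] \\
E\big[\eta_i^{G^2 X_2} (\eta_i^{X_2})'\big] & E\big[\eta_i^{G^2 X_2} (\eta_i^{GX_2})'\big] & E\big[\eta_i^{G^2 X_2} (\eta_i^{G^2 X_2})'\big]
\end{pmatrix},
\]
where
\[
E\big[\eta_i^{GY} (\eta_i^{G^r X_2})'\big] = E\left[\left(\sum_{m=0}^{\infty} \beta^m \Big(\gamma' \tilde{\tilde{s}}^X_{m+1}(X_i, A_i) + \delta' \tilde{\tilde{s}}^X_{m+2}(X_i, A_i) + \tilde{\tilde{s}}^A_{m+1}(X_i, A_i)\Big)\right) \tilde{\tilde{s}}^X_r(X_i, A_i)'\right], \quad r = 0, 1, 2,
\]
\[
E\big[\eta_i^{G^r X_2} (\eta_i^{G^s X_2})'\big] = E\big[\tilde{\tilde{s}}^X_r(X_i, A_i)\, \tilde{\tilde{s}}^X_s(X_i, A_i)'\big], \quad r, s = 0, 1, 2,
\]
and
\[
\tilde{\tilde{s}}_{m,N}(X_i, A_i) = \operatorname*{plim}_{N \to \infty} \big(s_{m,N}(X_i, A_i) - E[s_{m,N}(X_i, A_i) \mid A_i]\big).
\]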
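Proof. Keeping in mind that we denote $\eta_i^X = X_i - E[X_i \mid A_i]$ and taking $\frac{1}{N}\sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))'$ as an example, we have
\[
\frac{1}{N}\sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))' =
\begin{pmatrix}
\frac{1}{N}\sum_{i=1}^{N} \eta_i^{GY} (\eta_i^{X_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{GY} (\eta_i^{GX_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{GY} (\eta_i^{G^2 X_2})' \\
\frac{1}{N}\sum_{i=1}^{N} \eta_i^{X_2} (\eta_i^{X_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{X_2} (\eta_i^{GX_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{X_2} (\eta_i^{G^2 X_2})' \\
\frac{1}{N}\sum_{i=1}^{N} \eta_i^{GX_2} (\eta_i^{X_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{GX_2} (\eta_i^{GX_2})' & \frac{1}{N}\sum_{i=1}^{N} \eta_i^{GX_2} (\eta_i^{G^2 X_2})'
\end{pmatrix}.
\]
Consider the element $\frac{1}{N}\sum_{i=1}^{N} \eta_i^{GY} (\eta_i^{G^2 X_2})'$.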
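When $|\beta| < 1$,
\[
G_N Y_N = \sum_{m=0}^{\infty} \beta^m G_N^{m+1} \big(X_{2N} \gamma + G_N X_{2N} \delta + H(\mathbf{A}_N) + \varepsilon_N\big)
\]
and
\[
[G_N Y_N]_i = \gamma' \left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} X_{2N}\right]_i + \delta' \left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} G_N X_{2N}\right]_i + \left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} H(\mathbf{A}_N)\right]_i + o_p(1).
\]
$X_{2i} \subseteq X_i$, hence $X_{2i} = s^X_{0,N}(X_i, A_i)$.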
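The series representation of $G_N Y_N$ for $|\beta| < 1$ can be checked on a small simulated network. In the sketch below, the random adjacency matrix, the value $\beta = 0.5$ and the vector `v` (standing in for $X_{2N}\gamma + G_N X_{2N}\delta + H(\mathbf{A}_N) + \varepsilon_N$) are illustrative assumptions, and the outcome is assumed to solve $Y_N = \beta G_N Y_N + v$ with a row-normalized $G_N$.

```python
import numpy as np

rng = np.random.default_rng(4)
N, beta = 30, 0.5

# Hypothetical adjacency matrix with no self-links.
D = (rng.uniform(size=(N, N)) < 0.3).astype(float)
np.fill_diagonal(D, 0.0)
D[np.arange(N), (np.arange(N) + 1) % N] = 1.0   # ensure every row has a link
G = D / D.sum(axis=1, keepdims=True)            # row-normalized network matrix

v = rng.standard_normal(N)   # stands in for X2*gamma + G*X2*delta + H(A) + eps

# Exact solution of Y = beta * G * Y + v, then G Y.
Y = np.linalg.solve(np.eye(N) - beta * G, v)
GY_exact = G @ Y

# Truncated series sum_{m=0}^{199} beta^m G^{m+1} v.
GY_series = np.zeros(N)
term = G @ v
for m in range(200):
    GY_series += beta**m * term
    term = G @ term

series_gap = np.max(np.abs(GY_exact - GY_series))
```

Since the rows of $G_N$ sum to one, its maximum absolute row sum equals one, so the series converges geometrically at rate $|\beta|$ and the truncation error after 200 terms is negligible.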
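We have
\[
\frac{1}{N}\sum_{i=1}^{N} \eta_i^{GY} (\eta_i^{G^2 X_2})' = \frac{1}{N}\sum_{i=1}^{N} \big([G_N Y_N]_i - E\{[G_N Y_N]_i \mid A_i\}\big)\big([G_N^2 X_{2N}]_i - E\{[G_N^2 X_{2N}]_i \mid A_i\}\big)'
\]
\[
= \frac{1}{N}\sum_{i=1}^{N} \gamma' \left(\sum_{m=0}^{\infty} \beta^m \big(s^X_{m+1,N}(X_i, A_i) - E[s^X_{m+1,N}(X_i, A_i) \mid A_i]\big)\right) \big(s^X_{2,N}(X_i, A_i) - E[s^X_{2,N}(X_i, A_i) \mid A_i]\big)'
\]
\[
+ \frac{1}{N}\sum_{i=1}^{N} \delta' \left(\sum_{m=0}^{\infty} \beta^m \big(s^X_{m+2,N}(X_i, A_i) - E[s^X_{m+2,N}(X_i, A_i) \mid A_i]\big)\right) \big(s^X_{2,N}(X_i, A_i) - E[s^X_{2,N}(X_i, A_i) \mid A_i]\big)'
\]
\[
+ \frac{1}{N}\sum_{i=1}^{N} \left(\sum_{m=0}^{\infty} \beta^m \big(s^A_{m+1,N}(X_i, A_i) - E[s^A_{m+1,N}(X_i, A_i) \mid A_i]\big)\right) \big(s^X_{2,N}(X_i, A_i) - E[s^X_{2,N}(X_i, A_i) \mid A_i]\big)'.
\]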
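W.l.o.g. consider the term
\[
\frac{1}{N}\sum_{i=1}^{N} \gamma' \left(\sum_{m=0}^{\infty} \beta^m \big(s^X_{m+1,N}(X_i, A_i) - E[s^X_{m+1,N}(X_i, A_i) \mid A_i]\big)\right) \big(s^X_{2,N}(X_i, A_i) - E[s^X_{2,N}(X_i, A_i) \mid A_i]\big)'.
\]
Denote $\operatorname{plim}_{N \to \infty} \big(s_{m,N}(X_i, A_i) - E[s_{m,N}(X_i, A_i) \mid A_i]\big) = \tilde{\tilde{s}}_{m,N}(X_i, A_i)$.
By the WLLN and Lemma E.6 we have
\[
\frac{1}{N}\sum_{i=1}^{N} \gamma' \left(\sum_{m=0}^{\infty} \beta^m \big(s^X_{m+1,N}(X_i, A_i) - E[s^X_{m+1,N}(X_i, A_i) \mid A_i]\big)\right) \big(s^X_{2,N}(X_i, A_i) - E[s^X_{2,N}(X_i, A_i) \mid A_i]\big)'
\xrightarrow{p} \gamma' \int \sum_{m=0}^{\infty} \beta^m\, \tilde{\tilde{s}}^X_{m+1}(x, a)\, \tilde{\tilde{s}}^X_2(x, a)'\, \pi(x, a)\, dx\, da.
\]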
The limit can be derived in a similar fashion for all other terms of $\frac{1}{N}\sum_{i=1}^{N} (W_i - h^W(A_i))(Z_i - h^Z(A_i))'$ and $\frac{1}{N}\sum_{i=1}^{N} (Z_i - h^Z(A_i))(Z_i - h^Z(A_i))'$.
E.1. Proof of Theorem 4.10.
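Lemma E.3. $(\eta_i^Z \varepsilon_i, \mathcal{F}_i)$ is a martingale difference sequence.
Proof. $E[\eta_i^Z \varepsilon_i \mid \mathcal{F}_i] = \eta_i^Z E[\varepsilon_i \mid \mathcal{F}_i] = 0$, where the last equality follows from Assumption 4.9.
Lemma E.4.
\[
\frac{1}{N}\sum_{i=1}^{N} \eta_i^Z (\eta_i^Z)' \varepsilon_i^2 \xrightarrow{p} E\left[\frac{1}{N}\sum_{i=1}^{N} \eta_i^Z (\eta_i^Z)' \sigma^2\right].
\]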
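Proof. Note that
\[
\frac{1}{N}\sum_{i=1}^{N} E[\eta_i^Z (\eta_i^Z)' \varepsilon_i^2 \mid \mathcal{F}_{i-1}] = \sigma^2\, \frac{1}{N}\sum_{i=1}^{N} E[\eta_i^Z (\eta_i^Z)'] \xrightarrow{p} \sigma^2 E[\eta_i^Z (\eta_i^Z)']. \tag{E.1}
\]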
Theorem 4.10 follows from the Martingale Central Limit theorem, Lemma E.2 and Lemma
E.4.
E.2. Supporting Lemmas.
Lemma E.5 (Uniform Convergence in i). Under Assumptions 3.1, 3.1 (ii), 4.1 (i) and 4.7,
\[
\sup_{1 \le i \le N} |s_{m+1,N}(X_i, A_i) - \tilde{s}_{m+1,N}(X_i, A_i)| = o_p(1) \quad \forall\, m \ge 0,
\]
where
\[
\tilde{s}_{m+1,N}(X_i, A_i) = \frac{\int p(g(X_i, x) + A_i + a)\, s_{m,N}(x, a)\, \pi(x, a)\, dx\, da}{\int p(g(X_i, x) + A_i + a)\, \pi(x, a)\, dx\, da}.
\]
Proof. Note that for m = 0 the statement holds trivially because of the assumption that the
moments of Xi and Ai exist and are finite for all i.
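Step 1
Note that for fixed $i$, $s_{m+1,N}(X_i, A_i)$ has the following limit as $N \to \infty$:
\[
s_{m+1,N}(X_i, A_i) \xrightarrow{p} \tilde{s}_{m+1,N}(X_i, A_i) = \frac{\int p(g(X_i, x) + A_i + a)\, s_{m,N}(x, a)\, \pi(x, a)\, dx\, da}{\int p(g(X_i, x) + A_i + a)\, \pi(x, a)\, dx\, da},
\]
where $\pi(x, a)$ is the joint density of $X_i$ and $A_i$. By definition,
\[
[G_N S_m(\mathbf{X}_N, \mathbf{A}_N)]_i = \left(\frac{1}{N}\sum_{j \ne i} D_{ij,N}\right)^{-1} \frac{1}{N}\sum_{j \ne i} D_{ij,N}\, s_{m,N}(X_j, A_j)
\]
\[
= \left(\frac{1}{N}\sum_{j \ne i} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\}\right)^{-1} \times \frac{1}{N}\sum_{j \ne i} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\}\, s_{m,N}(X_j, A_j).
\]
Then, by the WLLN, we have
\[
\frac{1}{N}\sum_{j \ne i} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\} \xrightarrow{p} \int p(g(X_i, x) + A_i + a)\, \pi(x, a)\, dx\, da,
\]
\[
\frac{1}{N}\sum_{j \ne i} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\}\, s_{m,N}(X_j, A_j) \xrightarrow{p} \int p(g(X_i, x) + A_i + a)\, s_{m,N}(x, a)\, \pi(x, a)\, dx\, da.
\]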
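Step 2
Let
\[
\zeta_{i,N} = \frac{1}{N}\sum_{j=1, j \ne i}^{N} \big(D_{ij,N} - E[D_{ij,N} \mid X_i, A_i]\big)
\]
and
\[
\tau_{i,N} = \frac{1}{N}\sum_{j=1, j \ne i}^{N} \big(D_{ij,N}\, s_{m,N}(X_j, A_j) - E[D_{ij,N}\, s_{m,N}(X_j, A_j) \mid X_i, A_i]\big).
\]
As shown in Step 1, for fixed $i$, $\zeta_{i,N} = o_p(1)$ and $\tau_{i,N} = o_p(1)$. Let $\alpha > 2$ and note that $\sup_i |\zeta_{i,N} \sqrt{N}| = O_p(N^{1/\alpha})$. This follows because
\[
\sup_i |\zeta_{i,N} \sqrt{N}| = O_p(N^{1/\alpha}) \iff \sup_i \frac{1}{N^{1/\alpha}} |\zeta_{i,N} \sqrt{N}| = O_p(1) \iff \sup_i \frac{1}{N} |\zeta_{i,N} \sqrt{N}|^{\alpha} = O_p(1)
\]
and
\[
\sup_i \frac{1}{N} |\zeta_{i,N} \sqrt{N}|^{\alpha} \le \sum_{i=1}^{N} \frac{1}{N} |\zeta_{i,N} \sqrt{N}|^{\alpha} = O_p(1),
\]
since $E[|\zeta_{i,N} \sqrt{N}|^{\alpha}] < \kappa < \infty$. Hence,
\[
\sup_i |\zeta_{i,N}| = \sup_i |\zeta_{i,N} \sqrt{N}|\, \frac{1}{\sqrt{N}} = \frac{1}{\sqrt{N}}\, O_p(N^{1/\alpha}) = O_p(N^{1/\alpha - 1/2}) = o_p(1).
\]
A similar argument can be applied to show that $\sup_i |\tau_{i,N}| = o_p(1)$.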
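Step 3
Now consider
\[
s_{m+1,N}(X_i, A_i) = [G_N S_{m,N}(\mathbf{X}, \mathbf{A})]_i = \left(\frac{1}{N}\sum_{j=1, j \ne i}^{N} D_{ij,N}\right)^{-1} \frac{1}{N}\sum_{j=1, j \ne i}^{N} D_{ij,N}\, s_{m,N}(X_j, A_j)
\]
\[
= \left(\frac{1}{N}\sum_{j=1, j \ne i}^{N} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\}\right)^{-1} \times \frac{1}{N}\sum_{j=1, j \ne i}^{N} I\{g(X_i, X_j) + A_i + A_j \ge v_{ij}\}\, s_{m,N}(X_j, A_j).
\]
Let $s_{m+1,N}(X_i, A_i) = \Psi_{i,N}^{-1} \Phi_{i,N}$, where $\Psi_{i,N} = \frac{1}{N}\sum_{j \ne i} D_{ij,N}$ and $\Phi_{i,N} = \frac{1}{N}\sum_{j \ne i} D_{ij,N}\, s_{m,N}(X_j, A_j)$. As established above,
\[
\Phi_{i,N} \xrightarrow{p} \phi_{i,N}
\]
uniformly in $i$ and
\[
\Psi_{i,N} \xrightarrow{p} \psi_{i,N}
\]
uniformly in $i$, where $\phi_{i,N} = \frac{1}{N}\sum_{j=1, j \ne i}^{N} E[D_{ij,N}\, s_{m,N}(X_j, A_j) \mid X_i, A_i]$ and $\psi_{i,N} = \frac{1}{N}\sum_{j=1, j \ne i}^{N} E[D_{ij,N} \mid X_i, A_i]$. Further,
\[
\frac{\Phi_{i,N}}{\Psi_{i,N}} - \frac{\phi_{i,N}}{\psi_{i,N}} = \frac{\Phi_{i,N} - \phi_{i,N}}{\Psi_{i,N}} - \frac{\phi_{i,N} (\Psi_{i,N} - \psi_{i,N})}{\Psi_{i,N}\, \psi_{i,N}}.
\]
Hence,
\[
\sup_i \left|\frac{\Phi_{i,N}}{\Psi_{i,N}} - \frac{\phi_{i,N}}{\psi_{i,N}}\right| \le \frac{\sup_i |\Phi_{i,N} - \phi_{i,N}|}{\inf_i \Psi_{i,N}} = o_p(1)
\]
since $\sup_i |\Phi_{i,N} - \phi_{i,N}| = o_p(1)$ and $\inf_i \Psi_{i,N}$ is bounded away from 0 uniformly in $i$. Therefore $\sup_{1 \le i \le N} |s_{m+1,N}(X_i, A_i) - \tilde{s}_{m+1,N}(X_i, A_i)| = o_p(1)$.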
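The ratio decomposition in Step 3 is a purely algebraic identity, and it can be checked with simulated numbers. In the sketch below all values are hypothetical placeholders for $\Phi_{i,N}$, $\Psi_{i,N}$, $\phi_{i,N}$ and $\psi_{i,N}$. The code also evaluates the two-term bound that the identity implies through the triangle inequality, which involves the uniform error in $\Psi_{i,N}$ as well and which needs $\psi_{i,N}$ bounded away from zero and $\phi_{i,N}$ bounded (assumptions made here for the illustration).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000

# Hypothetical population quantities: psi bounded away from zero, phi bounded.
psi = rng.uniform(0.2, 1.0, size=n)
phi = rng.uniform(-1.0, 1.0, size=n)

# Hypothetical sample analogues with small estimation noise.
Psi = psi + 0.01 * rng.standard_normal(n)
Phi = phi + 0.01 * rng.standard_normal(n)

lhs = Phi / Psi - phi / psi
rhs = (Phi - phi) / Psi - phi * (Psi - psi) / (Psi * psi)
identity_gap = np.max(np.abs(lhs - rhs))

# Uniform bound implied by the identity and the triangle inequality.
sup_error = np.max(np.abs(lhs))
bound = (np.max(np.abs(Phi - phi)) / np.min(Psi)
         + np.max(np.abs(phi)) * np.max(np.abs(Psi - psi))
         / (np.min(Psi) * np.min(psi)))
```

The identity holds to rounding error, and the uniform error of the ratio is controlled once both the numerator and the denominator converge uniformly.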
Lemma E.6.
\[
\sup_{1 \le i \le N} \left|\sum_{m=0}^{\infty} \beta^m \big[s_{m,N}(X_i, A_i) - \tilde{s}_{m,N}(X_i, A_i)\big]\right| = o_p(1).
\]
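Proof. Step 1
Note that for fixed $i$,
\[
\left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} \mathbf{X}_N\right]_i \xrightarrow{p} \sum_{m=0}^{\infty} \beta^m \tilde{s}^X_{m+1,N}(X_i, A_i), \tag{E.2}
\]
\[
\left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} G_N \mathbf{X}_N\right]_i \xrightarrow{p} \sum_{m=0}^{\infty} \beta^m \tilde{s}^X_{m+2,N}(X_i, A_i), \tag{E.3}
\]
\[
\left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} H(\mathbf{A}_N)\right]_i \xrightarrow{p} \sum_{m=0}^{\infty} \beta^m \tilde{s}^A_{m+1,N}(X_i, A_i). \tag{E.4}
\]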
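Proof: For any $\varepsilon > 0$ and any integer $m^* \ge 0$,
\[
\lim_{N \to \infty} \Pr\left(\left|\left[\sum_{m=0}^{\infty} \beta^m G_N^{m+1} \mathbf{X}_N\right]_i - \sum_{m=0}^{\infty} \beta^m \tilde{s}^X_{m+1,N}(X_i, A_i)\right| \ge \varepsilon\right)
\]
\[
\le \lim_{N \to \infty} \Pr\left(\left|\left[\sum_{m=0}^{m^*} \beta^m G_N^{m+1} \mathbf{X}_N\right]_i - \sum_{m=0}^{m^*} \beta^m \tilde{s}^X_{m+1,N}(X_i, A_i)\right| \ge \varepsilon/2\right)
+ \lim_{N \to \infty} \Pr\left(\left|\left[\sum_{m=m^*+1}^{\infty} \beta^m G_N^{m+1} \mathbf{X}_N\right]_i - \sum_{m=m^*+1}^{\infty} \beta^m \tilde{s}^X_{m+1,N}(X_i, A_i)\right| \ge \varepsilon/2\right).
\]
Denote $\tilde{s}^X_{m+1,N}(X_i, A_i) = c_{m+1,N}$ and consider
\[
\lim_{N \to \infty} \Pr\left(\left|\left[\sum_{m=0}^{m^*} \beta^m G_N^{m+1} \mathbf{X}_N\right]_i - \sum_{m=0}^{m^*} \beta^m \tilde{s}^X_{m+1,N}(X_i, A_i)\right| \ge \varepsilon/2\right).
\]
We have
\[
\lim_{N \to \infty} \Pr\left(\left|\sum_{m=0}^{m^*} \beta^m \big([G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big)\right| \ge \varepsilon/2\right)
\le \lim_{N \to \infty} \Pr\left(\sum_{m=0}^{m^*} |\beta|^m \big|[G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big| \ge \varepsilon/2\right)
\]
\[
\le \lim_{N \to \infty} \sum_{m=0}^{m^*} \Pr\left(|\beta|^m \big|[G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big| \ge \frac{\varepsilon}{2(m^* + 1)}\right) = 0,
\]
since $\operatorname{plim}_{N \to \infty} \big([G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big) = 0$.
Further,
\[
\lim_{N \to \infty} \Pr\left(\left|\sum_{m=m^*+1}^{\infty} \beta^m \big([G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big)\right| \ge \varepsilon/2\right)
\le \lim_{N \to \infty} \Pr\left(\sum_{m=m^*+1}^{\infty} |\beta|^m \big|[G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big| \ge \varepsilon/2\right)
\]
\[
\le \lim_{N \to \infty} \Pr\left(\frac{1}{1 - |\beta|} \sup_{m \ge m^*+1} \big|[G_N^{m+1} \mathbf{X}_N]_i - c_{m+1,N}\big| \ge \varepsilon/2\right) = 0.
\]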
(E.3) and (E.4) can be shown in an analogous way.
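Step 2
\[
\sup_{1 \le i \le N} \left|\sum_{m=0}^{\infty} \beta^m \big[s_{m,N}(X_i, A_i) - \tilde{s}_{m,N}(X_i, A_i)\big]\right| \le \frac{1}{1 - |\beta|} \sup_{1 \le i \le N} \sup_{m \ge 0} |s_{m,N}(X_i, A_i) - \tilde{s}_{m,N}(X_i, A_i)| = o_p(1)
\]
by Lemma E.5.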
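The geometric-series bound in Step 2 can be illustrated with a short computation. This is a sketch under stated assumptions: the array `d` holds hypothetical values standing in for $s_{m,N}(X_i, A_i) - \tilde{s}_{m,N}(X_i, A_i)$, the series is truncated at 400 terms, and the factor is written as $1/(1 - |\beta|)$, which coincides with $1/(1 - \beta)$ when $0 \le \beta < 1$.

```python
import numpy as np

rng = np.random.default_rng(6)
beta = -0.7            # any value with |beta| < 1
n_units, n_terms = 50, 400

# d[i, m] stands in for s_{m,N}(X_i, A_i) - s~_{m,N}(X_i, A_i).
d = 0.01 * rng.uniform(-1.0, 1.0, size=(n_units, n_terms))

weights = beta ** np.arange(n_terms)      # beta^m for m = 0, ..., n_terms - 1
weighted_sums = d @ weights               # sum_m beta^m d[i, m] for each i

lhs = np.max(np.abs(weighted_sums))       # sup_i | sum_m beta^m d[i, m] |
rhs = np.max(np.abs(d)) / (1.0 - abs(beta))
```

The left-hand side never exceeds the right-hand side, because the absolute weights $|\beta|^m$ sum to at most $1/(1 - |\beta|)$.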
University of Southern California and USC Dornsife INET
E-mail address: [email protected]
University of Southern California, USC Dornsife INET, and Yonsei University
E-mail address: [email protected]