Supplement to Two-Step Estimation of Network-Formation Models with Incomplete Information

Michael Leung

May 5, 2015
A Extensions for Kernel Estimator
In this section, we maintain Assumptions 1, 3, and 4. Consistency of the kernel estimator requires that the sequence of equilibria chosen by the equilibrium selection mechanism, one for each $n$, satisfies certain smoothness and finiteness conditions. This requires a modification of Assumption 2, redefining $\mathcal{G}(X^n, \theta_0)$ as the set of $s$-times differentiable, symmetric equilibria such that certain derivatives are uniformly bounded over $n$. We next provide conditions for the existence of such equilibria.
Let $\sigma : \mathbb{R}^m \to \mathbb{R}^p$. For $q \in \mathbb{N}^m$, define $|q| = \sum_{i=1}^m q_i$ and the $|q|$th-order derivative of $\sigma$ with respect to $x \in \mathbb{R}^m$ as
\[
D_x^q \sigma(x) = \frac{\partial^{|q|} \sigma}{\partial x_1^{q_1} \cdots \partial x_m^{q_m}}.
\]
For the next proof, we will view $X$ as a vector and define $X_k$ to be its $k$th component. Additionally, for $i, j, k \in N_n$, let $g_i \in A_n$ be such that $g_{ii} = 0$, let $g_{i,-jk}$ be $g_i$ with components $j$ and $k$ removed, and let $(1, 1, g_{i,-jk})$ be $g_i$ with components $j$ and $k$ replaced with ones. Similarly define $g_{i,-j}$ and $(1, g_{i,-j})$.
Theorem A.1. Suppose Assumption 5 holds and that $\mu$ and the joint CDF of $\varepsilon_i$ are $s$-times differentiable. Then there exists a sequence of equilibria $\{\sigma_n\}_{n=2}^{\infty}$ that are $s$-times differentiable at $X$ and symmetric such that for any $q$ with $|q| \le s$,
\[
\sup_n \max_{i,j \in N_n} \sum_{g_{i,-j}} D_x^q\, \sigma_i\big((1, g_{i,-j}) \mid X\big) < \infty \quad \text{and} \quad \sup_n \max_{i,j,k \in N_n} \sum_{g_{i,-jk}} D_x^q\, \sigma_i\big((1, 1, g_{i,-jk}) \mid X\big) < \infty. \quad (1)
\]
(Note that the sums above correspond to $D_x^q E[G_{ij} \mid X, \sigma]$ and $D_x^q E[G_{ij} G_{ik} \mid X, \sigma]$, respectively.)
Proof. Define $\Sigma^y$, $\Gamma$ as in the proof of Theorem 1. We will write $\Sigma_n^y \equiv \Sigma^y$ and $\Gamma_n \equiv \Gamma$ to emphasize their dependence on the number of agents. For $\sigma_n \in \Sigma_n^y$ and $X \in \mathcal{X}_n$, view $(\sigma_{ni}(a \mid X);\; a \in A_n)$ as the $i$th “row” of $\sigma_n(X)$. For each $n$, let $\tilde{\Sigma}_n^y \subseteq \Sigma_n^y$ be such that (1) holds for each $\sigma \in \tilde{\Sigma}_n^y$. These subsets are nonempty, since constant functions satisfy these requirements. In view of Theorem 1, it only remains to show that $\Gamma_n(\cdot, X)$ maps $\tilde{\Sigma}_n^y$ to itself for all $n$ and that $\tilde{\Sigma}_n^y$ is closed. Regarding the first claim, notice that $E[G_{ij} \mid X, \sigma] = \sum_{g_{i,-j}} \Gamma(\sigma, X)_{i,(1, g_{i,-j})}$, where $\Gamma(\cdot, X)_{i,a}$ is the component of $\Gamma$ associated with agent $i$ and action $a \in A_n$ with $a_i = 0$. Then for any $\sigma \in \tilde{\Sigma}_n^y$ and $|q| = 1$,
\[
D_x^q E[G_{ij} \mid X, \sigma] = D_x^q \Phi\big(Z_{ij}'\theta_0\big) = \bigg( \sum_{g_{j,-i}} D_x^q \sigma_j\big((1, g_{j,-i}) \mid X\big),\; \frac{1}{n} \sum_{k \neq i} D_x^q \sum_{g_{k,-j}} \sigma_k\big((1, g_{k,-j}) \mid X\big)\, \mu(X_{ik}),\; \frac{1}{n} \sum_k \sum_{g_{i,-jk}} D_x^q \sigma_i\big((1, 1, g_{i,-jk}) \mid X\big),\; D_x^q X_{ij} \bigg)' \theta_0 \times \phi\big(Z_{ij}'\theta_0\big),
\]
which is uniformly bounded by Assumptions 1 and 4 and the construction of $\tilde{\Sigma}_n^y$. Repeated application of the chain rule shows that this is also true for $q$ such that $|q| \in [1, s]$.
Next, let $\Phi^{[2]}$ be the joint CDF of $(\varepsilon_{ij}, \varepsilon_{ik})$. Then for $|q| = 1$,
\[
\begin{aligned}
D_x^q E[G_{ij} G_{ik} \mid X, \sigma] &= D_x^q \Phi^{[2]}\big({-Z_{ij}'\theta_0}, {-Z_{ik}'\theta_0}\big) \\
&= \bigg( \sum_{g_{j,-i}} D_x^q \sigma_j\big((1, g_{j,-i}) \mid X\big),\; \frac{1}{n} \sum_{l \neq i} D_x^q \sum_{g_{l,-j}} \sigma_l\big((1, g_{l,-j}) \mid X\big)\, \mu(X_{il}),\; \frac{1}{n} \sum_l \sum_{g_{i,-jl}} D_x^q \sigma_i\big((1, 1, g_{i,-jl}) \mid X\big),\; D_x^q X_{ij} \bigg)' \theta_0 \times D_1 \Phi^{[2]}\big({-Z_{ij}'\theta_0}, {-Z_{ik}'\theta_0}\big) \\
&\quad + \bigg( \sum_{g_{k,-i}} D_x^q \sigma_k\big((1, g_{k,-i}) \mid X\big),\; \frac{1}{n} \sum_{l \neq i} D_x^q \sum_{g_{l,-k}} \sigma_l\big((1, g_{l,-k}) \mid X\big)\, \mu(X_{il}),\; \frac{1}{n} \sum_l \sum_{g_{i,-kl}} D_x^q \sigma_i\big((1, 1, g_{i,-kl}) \mid X\big),\; D_x^q X_{ik} \bigg)' \theta_0 \times D_2 \Phi^{[2]}\big({-Z_{ij}'\theta_0}, {-Z_{ik}'\theta_0}\big),
\end{aligned}
\]
which is also uniformly bounded for the same reason. Repeated application of the chain rule shows that this is also true for $q$ such that $|q| \in [1, s]$. Hence, $\{\Gamma_n\}_{n=2}^{\infty}$ map $\{\tilde{\Sigma}_n^y\}_{n=2}^{\infty}$ to themselves, respectively.

Lastly, note that uniform boundedness must carry over to limit points of $\tilde{\Sigma}_n^y$ for any $n$, so $\tilde{\Sigma}_n^y$ is closed. This completes the proof.
This justifies our next assumption, which replaces Assumption 2 if kernel estimators are used. For each $n$, let $\mathcal{G}(X^n, \theta_0)$ be the set of $s$-times differentiable, symmetric equilibria such that any $\sigma^n \in \mathcal{G}(X^n, \theta_0)$ satisfies (1).
Assumption A.1 (Selection Mechanism II). There exist sequences of equilibrium selection mechanisms $\{\lambda_n(\cdot);\, n \in \mathbb{N}\}$ and public signals $\{\nu^n;\, n \in \mathbb{N}\}$ such that for $n$ sufficiently large, $\mathcal{G}(X^n, \theta_0)$ is nonempty, and for any $g^n \in \mathcal{G}_n$,
\[
P(G^n = g^n \mid X^n) = \sum_{\sigma^n \in \mathcal{G}(X^n, \theta_0)} P\big(\lambda_n(X^n, \nu^n, \theta_0) = \sigma^n \mid X^n\big) \prod_{i=1}^n \sigma_i^n(g_i^n \mid X^n).
\]
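For intuition, the factorization above can be simulated directly. The following sketch is purely illustrative and hypothetical: the two constant link-probability matrices stand in for equilibria in $\mathcal{G}(X^n, \theta_0)$, the selection weights play the role of $P(\lambda_n(X^n, \nu^n, \theta_0) = \sigma^n \mid X^n)$, and links are drawn independently entry-wise, a simplification of drawing each row $g_i^n$ from $\sigma_i^n(\cdot \mid X^n)$.

import numpy as np

rng = np.random.default_rng(0)

def draw_network(equilibria, weights, rng):
    """Draw G under the factorization in Assumption A.1: a public signal
    first selects an equilibrium sigma, then links are drawn from sigma
    independently across agents."""
    sigma = equilibria[rng.choice(len(equilibria), p=weights)]
    n = sigma.shape[0]
    G = (rng.random((n, n)) < sigma).astype(int)
    np.fill_diagonal(G, 0)  # no self-links: g_ii = 0
    return G

# toy example: a 50/50 mixture over two hypothetical equilibria on n = 5
n = 5
eq_low, eq_high = np.full((n, n), 0.2), np.full((n, n), 0.7)
G = draw_network([eq_low, eq_high], [0.5, 0.5], rng)
print(G)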
Next we demonstrate uniform consistency of the kernel estimator. We introduce multi-index notation for multivariate Taylor expansions for the following proof. Let $f : \mathbb{R}^m \to \mathbb{R}$ be continuously differentiable up to order $q$. For $\alpha \in \mathbb{N}^m$ and $x \in \mathbb{R}^m$, define
\[
|\alpha| = \sum_{i=1}^m \alpha_i, \qquad \alpha! = \prod_{i=1}^m \alpha_i!, \qquad x^\alpha = \prod_{i=1}^m x_i^{\alpha_i},
\]
and the $k$th-order derivative
\[
f^{(\alpha)}(x) = D^\alpha f(x) = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}}, \qquad |\alpha| \le k.
\]

Theorem A.2. Under the following conditions,
\[
\sup_{i,j \in N_n} \left| \frac{\sum_{k,l} G_{kl}\, K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)}{\sum_{k,l} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)} - E[G_{ij} \mid X, \sigma] \right| = O_p\!\left( h^s + \sqrt{\frac{\log n}{n h^d}} \right). \quad (2)
\]
(a) $\{X_{ij};\, i, j \in N_n\}$ is identically distributed with an $s$-times differentiable density $p(\cdot)$.

(b) Assumption A.1 holds.

(c) The kernel $K : \mathbb{R}^d \to [0, 1]$ satisfies the following conditions.

• $K$ is a higher-order kernel of degree $2s$: $\int K(u)\, du = 1$; $\int u^r K(u)\, du = 0$ for all $r < 2s$; $\int |u^{2s}| K(u)\, du < \infty$; and $\int K^2(u)\, du < \infty$.¹

• $K$ is regular in the sense of Einmahl and Mason (2005).

(d) $h \to 0$ and $nh^d \to \infty$.

¹ Note that the order of the kernel is twice the smoothness of $p(\tilde{x})$ and $E[G_{ij} \mid X, \sigma]$, unlike the standard case.
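For concreteness, the moment conditions in (c) can be checked numerically. The sketch below is our own illustration, not part of the text: it takes $s = 2$, so (c) requires a kernel of degree $2s = 4$, and uses the fourth-order Gaussian kernel $K_4(u) = \tfrac{1}{2}(3 - u^2)\phi(u)$ on $\mathbb{R}$ (note that kernels of order greater than two necessarily take negative values).

import numpy as np

# Fourth-order Gaussian kernel: K4(u) = 0.5 * (3 - u^2) * phi(u).
# Its moments are 1, 0, 0, 0, -3 for r = 0, ..., 4, so all moments of
# order 0 < r < 2s vanish when s = 2, as condition (c) requires.
def phi(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def K4(u):
    return 0.5 * (3.0 - u**2) * phi(u)

u = np.linspace(-10.0, 10.0, 200001)
du = u[1] - u[0]
for r in range(5):
    moment = np.sum(u**r * K4(u)) * du  # Riemann-sum approximation
    print(f"moment of order {r}: {moment: .6f}")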
Proof. We need some notation first. For $\tilde{x} \in \mathcal{X}$, define
\[
\hat{p}(\tilde{x}) = \frac{1}{n^2 h^d} \sum_{k,l} K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right),
\]
the kernel estimate of $p(\tilde{x})$, and let $X_{-ij}$ denote the matrix of attributes $X$ excluding $X_{ij}$. By a simple variance calculation, standard arguments yield $\sup_{\tilde{x}} |\hat{p}(\tilde{x}) - p(\tilde{x})| \xrightarrow{p} 0$.
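To fix ideas, here is a minimal computational sketch of $\hat{p}$ and the ratio estimator in (2). Everything in it is an illustrative assumption rather than part of the proof: the Gaussian product kernel (second-order, not of degree $2s$), the logistic data-generating process, and the bandwidth choice.

import numpy as np

rng = np.random.default_rng(1)

def gaussian_kernel(u):
    # product Gaussian kernel on R^d (illustrative stand-in for K)
    return np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi) ** (u.shape[-1] / 2)

def dyadic_nw(G, X, h):
    """Kernel estimates of p(X_ij) and E[G_ij | X, sigma] at every dyad,
    pooling over all dyads (k, l) as in equation (2)."""
    n, _, d = X.shape
    Xf, Gf = X.reshape(n * n, d), G.reshape(n * n)
    # W[a, b] = K((X_a - X_b) / h) over flattened dyads a, b
    W = gaussian_kernel((Xf[:, None, :] - Xf[None, :, :]) / h)
    p_hat = W.sum(axis=1) / (n**2 * h**d)   # density estimate p-hat
    E_hat = W @ Gf / W.sum(axis=1)          # ratio estimator in (2)
    return E_hat.reshape(n, n), p_hat.reshape(n, n)

# toy data: symmetric dyadic attribute, logistic link probabilities
n, d = 30, 1
X = rng.normal(size=(n, n, d))
X = (X + X.transpose(1, 0, 2)) / 2
G = (rng.random((n, n)) < 1 / (1 + np.exp(-X[..., 0]))).astype(int)
E_hat, p_hat = dyadic_nw(G, X, h=n ** (-1 / 5))
print(E_hat[0, 1], p_hat[0, 1])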
By the triangle inequality, the left-hand side of equation (2) is at most
\[
\underbrace{\sup_{i,j \in N_n} \left| \frac{\sum_{k,l} (G_{kl} - E[G_{kl} \mid X, \sigma])\, K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)}{\sum_{k,l} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)} \right|}_{\text{variance}} + \underbrace{\sup_{i,j \in N_n} \left| \frac{\sum_{k,l} E[G_{kl} \mid X, \sigma]\, K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)}{\sum_{k,l} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)} - E[G_{ij} \mid X, \sigma] \right|}_{\text{bias}}.
\]
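For intuition (a calculation the text leaves implicit), the two terms trade off in the bandwidth: the bias bound below grows like $h^s$ while the variance bound shrinks like $\sqrt{\log n/(nh^d)}$, so equating them yields a sup-norm rate-optimal bandwidth:
\[
h^s \asymp \sqrt{\frac{\log n}{n h^d}} \iff h^{2s+d} \asymp \frac{\log n}{n} \iff h \asymp \left( \frac{\log n}{n} \right)^{1/(2s+d)},
\]
under which the left-hand side of (2) is $O_p\big( ((\log n)/n)^{s/(2s+d)} \big)$.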
Bias. We show that the bias term is $O_p\big(h^s + \sqrt{\log n/(nh^d)}\big)$ uniformly in $i, j$. Fix $\sigma$ and $i, j \in N_n$. By Proposition 1, $E[G_{ij} \mid X, \sigma]$ is invariant to permutations of the components $X_{kl}$ of $X_{-ij}$. Then by Assumption A.1, for any $i, j \in N_n$ and a given $\sigma$, we can write $E[G_{ij} \mid X, \sigma] = \rho_n(X_i, X_j, X_{-ij})$, where $\rho_n$ is a function that is invariant to permutations of the components $X_{kl}$ of $X_{-ij}$. By the mean-value theorem,
\[
E[G_{kl} \mid X, \sigma] = \rho_n(X_{kl}, X_{ij}, X_{-ij,kl}) = \underbrace{\rho_n(X_{ij}, X_{kl}, X_{-ij,kl})}_{E[G_{ij} \mid X, \sigma]} + \sum_{0 < |q| < s} \frac{1}{q!} D_2^q E[G_{ij} \mid X, \sigma]\, (X_{kl} - X_{ij})^q + \sum_{|q| = s} \frac{1}{q!} D_2^q P^*_{ijkl}\, (X_{kl} - X_{ij})^q.
\]
The term $P^*_{ijkl}$ lies between $E[G_{kl} \mid X, \sigma]$ and $E[G_{ij} \mid X, \sigma]$. The notation $D_2^q$ is meant to underscore the fact that the derivative is taken with respect to the two components highlighted by the underbrace. Then substituting in the Taylor expansions and using
uniform consistency of $\hat{p}(X_{ij})$,
\[
\begin{aligned}
&\left| \frac{\sum_{k,l} E[G_{kl} \mid X, \sigma]\, K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)}{\sum_{k,l} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right)} - E[G_{ij} \mid X, \sigma] \right| \\
&\quad\le \left| \frac{E[G_{ij} \mid X, \sigma]}{\hat{p}(X_{ij})}\, \frac{1}{n^2} \sum_{k,l} \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - E[G_{ij} \mid X, \sigma] \right| \\
&\qquad + \frac{1}{\hat{p}(X_{ij})} \sum_{0 < |q| < s} \frac{1}{q!} \left| \frac{1}{n^2} \sum_{k,l} \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) D_2^q E[G_{ij} \mid X, \sigma]\, (X_{kl} - X_{ij})^q \right| \\
&\qquad + \frac{1}{\hat{p}(X_{ij})} \frac{1}{q!} \left| \frac{1}{n^2} \sum_{k,l} \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) \sum_{|q| = s} D_2^q P^*_{ijkl}\, (X_{kl} - X_{ij})^q \right| \\
&\quad\le \frac{1}{p(X_{ij})} \sum_{0 < |q| \le s} B_q \frac{1}{q!}\, \underbrace{\left| \frac{1}{n^2} \sum_{k,l} \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) (X_{kl} - X_{ij})^q \right|}_{A} \big(1 + o_p(1)\big), \quad (3)
\end{aligned}
\]
where $B_q < \infty$ and $\sup_n \max_{i,j \in N_n} |D_2^q E[G_{ij} \mid X, \sigma]| < B_q$; such a bound exists by condition (b). By standard arguments, the supremum of $A$ over all $X_{ij}$ uniformly converges to its expectation at the rate $O_p\big(\sqrt{(nh^d)^{-1} \log n}\big)$. Moreover, this expectation is $O(h^s)$ by the usual change of variables $u = (X_{kl} - X_{ij})/h$. This establishes the bound on the bias.
Variance. We next show that the variance term is $O_p\big(\sqrt{\log n/(nh^d)}\big)$. By equation (4),
\[
G_{ij} = \mathbf{1}\big\{ Z_{ij}'\theta_0 + \varepsilon_{ij} > 0 \big\},
\]
which is a function of the random elements $\varepsilon_{ij}$ and $X$. Since $\varepsilon \perp\!\!\!\perp X$ by Assumption 1, conditioning on $X$ is the same as fixing attributes and the equilibrium $\sigma$ (by Assumption A.1) and dropping the conditioning, so we can apply the usual empirical-process results for independent data. Hence, in what follows, we treat $X, \sigma$ as fixed.
Notice that the variance term is at most
\[
\sup_{\tilde{x}} \left| \frac{\sum_{k,l} (G_{kl} - E[G_{kl} \mid X, \sigma])\, K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right)}{\sum_{k,l} K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right)} \right| \le \underbrace{\sup_{\tilde{x}} \left| \frac{1}{n^2 h^d} \sum_{k,l} (G_{kl} - E[G_{kl} \mid X, \sigma])\, K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right) \right|}_{B} \times \underbrace{\sup_{\tilde{x}} \frac{1}{\hat{p}(\tilde{x})}}_{C}.
\]
By a simple variance calculation, standard arguments and Assumption 1 yield $C = O_p(1)$ by (a), so it remains to compute the rate of convergence of $B$. We first need several definitions. Define
\[
r_{kl}(\tilde{x}, h) \equiv r(Z_{kl}'\theta_0, X_{kl}, \varepsilon_{kl}; \tilde{x}, h) = K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right) \mathbf{1}\big\{ \varepsilon_{kl} \ge -Z_{kl}'\theta_0 \big\},
\]
let $\mathcal{R} = \{ r(\cdot, \cdot, \cdot; \tilde{x}, h) \}$, and let $\eta = M \sqrt{\log n/(nh^d)}$ for any $M > 0$. Let $\xi_1, \ldots, \xi_n$ be i.i.d. Rademacher random variables, independent of $X, \varepsilon$.² For a given $\eta$, let $N_1\big(\tfrac{1}{8}\eta, P_n, \mathcal{R}\big)$ be the smallest $m$ for which there exist functions $r_{[1]}, \ldots, r_{[m]}$ such that $\min_j P_n |r - r_{[j]}| < \tfrac{1}{8}\eta$ for all $r \in \mathcal{R}$. Then by a standard symmetrization argument (see, e.g., Pollard, 1984, (11) and Theorem 24),
\[
\begin{aligned}
&P\left( \sup_{\tilde{x}} \frac{1}{nh^d} \left| \sum_k \left( \frac{1}{n} \sum_l r_{kl}(\tilde{x}, h) - E\!\left[ \frac{1}{n} \sum_l r_{kl}(\tilde{x}, h) \,\middle|\, X \right] \right) \right| > \eta \,\middle|\, X, \sigma \right) \quad (4) \\
&\quad\le 4\, P\left( \sup_{\tilde{x}} \frac{1}{nh^d} \left| \sum_k \xi_k \frac{1}{n} \sum_l r_{kl}(\tilde{x}, h) \right| > \frac{\eta}{4} \,\middle|\, X, \sigma \right) \\
&\quad\le 4\, N_1\big(\tfrac{1}{8}\eta, P_n, \mathcal{R}\big) \times \max_{j \in \{1, \ldots, m\}} P\left( \frac{1}{nh^d} \left| \sum_k \xi_k \frac{1}{n} \sum_l r_{[j]}(Z_{kl}'\theta_0, X_{kl}, \varepsilon_{kl}) \right| > \frac{\eta}{8} \,\middle|\, X, \sigma \right).
\end{aligned}
\]

² That is, $P(\xi = 1) = P(\xi = -1) = 0.5$.
Since $K(\cdot)$ is bounded by condition (c),
\[
\frac{1}{nh^d} \left| \xi_k \frac{1}{n} \sum_l r_{[j]}(Z_{kl}'\theta_0, X_{kl}, \varepsilon_{kl}; \tilde{x}, h) \right| < C \frac{1}{nh^d}
\]
for some $C > 0$. In view of applying Bernstein’s inequality (e.g., Pollard, 1984, Appendix B), we next show that
\[
\operatorname{Var}\left( \frac{1}{nh^d} \sum_k \xi_k \frac{1}{n} \sum_l r_{[j]}(Z_{kl}'\theta_0, X_{kl}, \varepsilon_{kl}; \tilde{x}, h) \,\middle|\, X, \sigma \right) < D \frac{1}{nh^d} \quad (5)
\]
for some $D > 0$ with probability approaching one. For any $\tilde{x} \in \mathcal{X}$, the left-hand side is at most
\[
\begin{aligned}
&\frac{1}{n^4 h^{2d}} \sum_{k,l} K^2\!\left(\frac{\tilde{x} - X_{kl}}{h}\right) E[G_{kl} \mid X]\,(1 - E[G_{kl} \mid X]) + \frac{1}{n^4 h^{2d}} \sum_{k,l,m} K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right) K\!\left(\frac{\tilde{x} - X_{km}}{h}\right) \operatorname{Cov}(G_{kl}, G_{km} \mid X) \\
&\quad\le \frac{1}{4 n^2 h^d}\, \underbrace{\frac{1}{n^2 h^d} \sum_{k,l} K^2\!\left(\frac{\tilde{x} - X_{kl}}{h}\right)}_{D} + \frac{1}{n h^d}\, \underbrace{\frac{1}{n^3 h^d} \sum_{k,l,m} K\!\left(\frac{\tilde{x} - X_{kl}}{h}\right) K\!\left(\frac{\tilde{x} - X_{km}}{h}\right)}_{E}.
\end{aligned}
\]
By (c),
\[
D = \frac{1}{n} \sum_k \frac{1}{h^d} K^2\!\left(\frac{\tilde{x}_1 - X_k}{h}\right) \times \frac{1}{n} \sum_l \frac{1}{h^d} K^2\!\left(\frac{\tilde{x}_2 - X_l}{h}\right) \quad \text{and}
\]
\[
E \le \sup_u K^2(u) \times \frac{1}{n} \sum_k \frac{1}{h^d} K\!\left(\frac{\tilde{x}_1 - X_k}{h}\right) \times \frac{1}{n} \sum_l \frac{1}{h^d} K\!\left(\frac{\tilde{x}_2 - X_l}{h}\right) \times \frac{1}{n} \sum_m \frac{1}{h^d} K\!\left(\frac{\tilde{x}_2 - X_m}{h}\right).
\]
As argued previously for $\hat{p}(\tilde{x})$, these averages converge to their expectations uniformly in $\tilde{x}$ at rate $\sqrt{(nh^d)^{-1} \log n}$. After a change of variables, the expectations take the form $\int K^2(u)\, p(\tilde{x} + uh)\, du < \sup_{\tilde{x}} p(\tilde{x}) \int K^2(u)\, du$. This is finite by Assumption 1 and condition (c). Meanwhile, $\sup_u K^2(u) < \infty$, since $K$ is bounded by condition (c). This establishes (5).
Thus, by Bernstein’s inequality, with probability approaching one,
\[
\begin{aligned}
\max_j P\left( \frac{1}{nh^d} \left| \sum_k \xi_k \frac{1}{n} \sum_l r_{[j]}(Z_{kl}'\theta_0, X_{kl}, \varepsilon_{kl}) \right| > \frac{\eta}{8} \,\middle|\, X, \sigma \right)
&\le 2 \exp\left( - \frac{\frac{1}{2}\, \eta^2}{\frac{1}{nh^d} D + \frac{1}{3nh^d} C \eta} \right) \\
&= 2 \exp\left( - \frac{\frac{1}{2}\, M^2 \frac{\log n}{nh^d}}{\frac{D}{nh^d} + \frac{C}{3nh^d} M \sqrt{\frac{\log n}{nh^d}}} \right),
\end{aligned}
\]
which is $O\big(\exp\{-M^2 \log n\}\big)$. Further note that functions in $\mathcal{R}$ are products of indicators, which do not depend on $(\tilde{x}, h)$, and regular kernels by (c). By definition, the class of such kernels is VC-subgraph. Hence, $\mathcal{R}$ is a VC-subgraph class by Lemma 2.6.18(vi) of van der Vaart and Wellner (1996). By Theorem 2.6.7 of van der Vaart and Wellner (1996), there exists a constant $K$ such that
\[
\log N_1\big(\tfrac{1}{8}\eta, P_n, \mathcal{R}\big) \le \log\big( K V(\mathcal{R}) (16e)^{V(\mathcal{R})} \big) - (V(\mathcal{R}) - 1)(\log \eta - \log 8),
\]
where $V(\mathcal{R})$ is the VC-index of $\mathcal{R}$. This is $O(\log n)$ by definition of $\eta$. Thus,
\[
(4) = O\big( n^{-M^2} \log n \big) \xrightarrow{n \to \infty} 0
\]
for any $M > 0$ a.s. This completes the proof.
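To see the rate in (2) at work, the following small simulation is again purely illustrative (a second-order Gaussian kernel, a logistic truth, and an arbitrary bandwidth sequence satisfying (d), none of which come from the text); it tracks the sup-norm error over dyads as $n$ grows.

import numpy as np

rng = np.random.default_rng(3)

def sup_error(n, h):
    # dyadic attribute with known truth E[G_ij | X] = logistic(X_ij)
    X = rng.normal(size=(n, n))
    X = (X + X.T) / 2
    truth = 1 / (1 + np.exp(-X))
    G = (rng.random((n, n)) < truth).astype(float)
    Xf, Gf = X.ravel(), G.ravel()
    W = np.exp(-0.5 * ((Xf[:, None] - Xf[None, :]) / h) ** 2)
    est = W @ Gf / W.sum(axis=1)  # ratio estimator in (2), d = 1
    return np.max(np.abs(est - truth.ravel()))

for n in (10, 20, 40):
    print(n, sup_error(n, h=n ** (-1 / 5)))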
With some minor changes, the same argument can be applied to compute the uniform rate of convergence for analogous kernel estimators of supported trust and weighted in-degree. For example, for weighted in-degree, we again use the trick in Proposition 2 to rewrite the numerator of the kernel estimator as a sum over out-degree:
\[
\frac{1}{n^2 h^d} \sum_{i,\, j \neq i} \mu(X_{ij}) \left( \frac{1}{n} \sum_k G_{jk}\, K\!\left(\frac{\tilde{x} - X_{ik}}{h}\right) \right).
\]
Then the main change to the consistency proof is to instead define
\[
r_{kl}(\tilde{x}, h) \equiv r(Z_{jk}'\theta_0, X_{ik}, X_{ij}, \varepsilon_{jk}; \tilde{x}, h) = K\!\left(\frac{\tilde{x} - X_{ik}}{h}\right) \mathbf{1}\big\{ \varepsilon_{jk} \ge -Z_{jk}'\theta_0 \big\}\, \mu(X_{ij}).
\]
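A direct transcription of the rewritten numerator may help. The sketch below is illustrative only: the Gaussian kernel, the simulated data, and the weight function $\mu$ are all hypothetical choices, not part of the text.

import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi) ** (u.shape[-1] / 2)

def windeg_numerator(G, X, x_tilde, h, mu):
    """Numerator of the weighted in-degree estimator, written as a sum over
    out-degree: (1/(n^2 h^d)) sum_{i, j != i} mu(X_ij) (1/n) sum_k G_jk K((x - X_ik)/h)."""
    n, _, d = X.shape
    Kmat = gaussian_kernel((x_tilde - X) / h)  # Kmat[i, k] = K((x - X_ik)/h)
    inner = Kmat @ G.T / n                     # inner[i, j] = (1/n) sum_k G_jk Kmat[i, k]
    W = mu(X)                                  # W[i, j] = mu(X_ij)
    np.fill_diagonal(W, 0.0)                   # restrict the outer sum to j != i
    return (W * inner).sum() / (n**2 * h**d)

# toy example with a hypothetical weight function mu(x) = exp(-|x|)
rng = np.random.default_rng(2)
n, d = 25, 1
X = rng.normal(size=(n, n, d))
G = (rng.random((n, n)) < 0.3).astype(int)
mu = lambda x: np.exp(-np.abs(x[..., 0]))
print(windeg_numerator(G, X, x_tilde=np.zeros(d), h=0.5))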
Theorem A.3 (Asymptotic Normality). Assume the following.

(a) The assumptions of Theorem 3 hold.³

(b) The assumptions of Theorem A.2 hold for $s > d$.

Then $\sqrt{n}\, \Lambda_n^{-1/2} (\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I_p)$, where $\Lambda_n = V_n^{-1} \Psi_n V_n^{-1}$, defined in Theorem 3.⁴

³ As usual, condition (b) of Theorem 3 implies $s > d/2$.
⁴ Recall that $p$ is the dimension of $Z_{ij}$.

Proof. Define $M_{ij}$ as in the proof of Theorem 3. It is enough to show that equation (23) of Theorem 3 is $o_p(1)$; the rest of the proof remains the same. Since $\hat{p}(\tilde{x})$ is uniformly consistent for $p(\tilde{x})$, which is bounded away from zero on its support by
assumption,
\[
\begin{aligned}
(23) &= \frac{1}{n\sqrt{n}} \sum_{k,l} \left( \frac{1}{n^2 h^d} \sum_{i,j} \frac{M_{ij}}{\hat{p}(X_{ij})} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - M_{kl} \right) E_{kl} \\
&= \underbrace{\frac{1}{n\sqrt{n}} \sum_{k,l} \left( \frac{1}{n^2 h^d} \sum_{i,j} \frac{M_{ij}}{p(X_{ij})} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - M_{kl} \right) E_{kl}}_{A} \big(1 + o_p(1)\big) + \underbrace{\frac{1}{n\sqrt{n}} \sum_{k,l} M_{kl} E_{kl}}_{B}\; o_p(1).
\end{aligned}
\]
It remains to show that $A$ is $o_p(1)$ and $B$ is $O_p(1)$. We can rewrite $B$ as
\[
\frac{1}{\sqrt{n}} \sum_k \left( \frac{1}{n} \sum_l M_{kl}[1]\, G_{lk} \right) + \frac{1}{\sqrt{n}} \frac{1}{n^2} \sum_{k,l} M_{kl}[-1] \left( \frac{1}{n} \sum_m \big( G_{mk} G_{ml},\; G_{ml}\, \mu(X_{km}) \big)' \right),
\]
where $M_{kl}[1]$ is the first column of $M_{kl}$, and $M_{kl}[-1]$ is all but the first column of $M_{kl}$. Since the summands are independent conditional on $X$ and uniformly bounded by Assumptions 1 and 4, Lyapunov’s CLT applies, and $B$ is $O_p(1)$ as desired.
By a similar argument, $A$ is $o_p(1)$ if $\frac{1}{n^2 h^d} \sum_{i,j} \frac{M_{ij}}{p(X_{ij})} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - M_{kl}$ is $o_p(1)$ uniformly in $k, l$. To show the latter, we take a Taylor expansion of $M_{kl}$. By the same argument as equation (3) in the proof of Theorem A.2,
argument as equation (3) in the proof of Theorem A.2,
1 X Mij
X
−
X
ij
kl
K
−
M
kl
n2 hd
p(X
)
h
ij
i,j
1 X Mkl
Xij − Xkl
K
= 2 d
− Mkl n h i,j p(Xij )
h
X Dq Mkl 1
1 X
X
−
X
ij
kl
2
q
˜(Xkl − Xij ) + 2 d
K
n h i,j
h
p(Xij ) q!
0<|q|<s
q
∗
1 X
1˜
Xij − Xkl X D2 M̃ijkl
q
+ 2 d
K
(Xkl − Xij ) n h i,j
h
p(Xij ) q!
|q|=s
1 X 1
Xij − Xkl
≤ 2 d
K
− 1 B0
n h i,j p(Xij )
h
X
1 1 X 1
Xij − Xkl
1 ˜
q
Bq 2
+
K
(X
−
X
)
kl
ij
,
q! n i,j hd
h
p(Xij )
0<|q|≤s
where $B_q < \infty$, $\sup_n \max_{i,j \in N_n} |D_2^q M_{ij}| < B_q$ for all $n$, and $D_2^q$ is defined in the proof of Theorem A.2.⁵

⁵ This derivative exists by condition (b) of Theorem A.2. It is uniformly bounded by Assumptions 1 and 4.

Since $K(\cdot)$ is regular by condition (c) of Theorem A.2, the uniform entropy condition of Theorem 2.4.3 of van der Vaart and Wellner (1996) holds (see the argument at the end of the proof of Theorem A.2). Hence, Theorem 2.4.3 implies that the absolute-value terms in the last line converge to their expectations uniformly in $X_{kl}$ and $h$. In particular,
\[
\sup_{h,\, X_{kl}} \left| \frac{1}{n^2} \sum_{i,j} \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) (X_{kl} - X_{ij})^q \frac{1}{p(X_{ij})} - \int \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) (X_{kl} - X_{ij})^q\, dX_{ij} \right| \xrightarrow{p} 0.
\]
As argued in the proof of Theorem A.2, the integral on the right-hand side is $O(h^s)$ uniformly in $X_{kl}$ by a change of variables $u = (X_{ij} - X_{kl})/h$. Similarly,
\[
\begin{aligned}
&\left| \frac{1}{n^2 h^d} \sum_{i,j} \frac{1}{p(X_{ij})} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - 1 \right| \\
&\quad= \Bigg| \underbrace{\frac{1}{n^2 h^d} \sum_{i,j} \frac{1}{p(X_{ij})} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) - \int \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) dX_{ij}}_{o_p(1)} + \underbrace{\int \frac{1}{h^d} K\!\left(\frac{X_{ij} - X_{kl}}{h}\right) dX_{ij} - 1}_{0} \Bigg|.
\end{aligned}
\]
This completes the proof.
B Additional Application Results

The tables below summarize coefficient estimates and standard errors that are not presented in the main text.
Table 1: Controls for i (ego).

                      λ = 0     λ = 0.1   λ = 0.2   dyadic
head of household     0.08*     0.08*     0.08*     0.09***
                      (0.05)    (0.05)    (0.05)    (0.03)
OBC caste            -0.04     -0.04     -0.04     -0.08**
                      (0.08)    (0.08)    (0.09)    (0.03)
scheduled caste      -0.04     -0.04     -0.04     -0.05
                      (0.07)    (0.08)    (0.08)    (0.03)
female                0.11*     0.12*     0.11*     0.11***
                      (0.06)    (0.06)    (0.06)    (0.03)
hindu                -0.06     -0.06     -0.05     -0.06**
                      (0.05)    (0.05)    (0.05)    (0.03)

Note: Standard errors are in parentheses. (*) denotes significance at the 10% level, (**) the 5% level, and (***) the 1% level.
Table 2: Controls for j (alter).

                      λ = 0     λ = 0.1   λ = 0.2   dyadic
head of household     0.02      0.03      0.03      0.04
                      (0.07)    (0.07)    (0.08)    (0.03)
OBC caste            -0.12     -0.12     -0.12     -0.16***
                      (0.09)    (0.10)    (0.11)    (0.03)
scheduled caste      -0.04     -0.05     -0.04     -0.09***
                      (0.10)    (0.11)    (0.12)    (0.03)
female               -0.02     -0.01      0.00      0.02
                      (0.09)    (0.09)    (0.10)    (0.03)
hindu                -0.07     -0.06     -0.05     -0.03
                      (0.07)    (0.08)    (0.09)    (0.03)

Note: Standard errors are in parentheses. (*) denotes significance at the 10% level, (**) the 5% level, and (***) the 1% level.
References

Einmahl, U. and D. Mason, “Uniform in Bandwidth Consistency of Kernel-Type Function Estimators,” Annals of Statistics, 2005, 33 (3), 1380–1403.

Pollard, D., Convergence of Stochastic Processes, Springer-Verlag, 1984.

van der Vaart, A. and J. Wellner, Weak Convergence and Empirical Processes: With Applications to Statistics, Springer Series in Statistics, Springer, 1996.