Social Interactions under Incomplete Information: Games, Equilibria

Social Interactions under Incomplete Information: Games, Equilibria,
and Expectations
Dissertation
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of
Philosophy in the Graduate School of The Ohio State University
By
Chao Yang, B.A., M.A.
Graduate Program in Economics
The Ohio State University
2015
Dissertation Committee:
Lung-fei Lee, Advisor
Jason Blevins
Stephen Cosslett
Lixin Ye
c Copyright by
Chao Yang
2015
Abstract
My dissertation research investigates interactions of agents’ behaviors through social
networks when some information is not shared publicly, focusing on solutions to a series
of challenging problems in empirical research, including heterogeneous expectations and
multiple equilibria.
The first chapter,“Social Interactions under Incomplete Information with Heterogeneous
Expectations”, extends the current literature in social interactions by devising econometric
models and estimation tools with private information in not only the idiosyncratic shocks
but also some exogenous covariates. For example, when analyzing peer effects in class performances, it was previously assumed that all control variables, including individual IQ and
SAT scores, are known to the whole class, which is unrealistic. This chapter allows such
exogenous variables to be private information and models agents’ behaviors as outcomes
of a Bayesian Nash Equilibrium in an incomplete information game. The distribution of
equilibrium outcomes can be described by the equilibrium conditional expectations, which
is unique when the parameters are within a reasonable range according to the contraction mapping theorem in function spaces. The equilibrium conditional expectations are
heterogeneous in both exogenous characteristics and the private information, which makes
estimation in this model more demanding than in previous ones. This problem is solved in
a computationally efficient way by combining the quadrature method and the nested fixed
point maximum likelihood estimation. In Monte Carlo experiments, if some exogenous
ii
characteristics are private information and the model is estimated under the mis-specified
hypothesis that they are known to the public, estimates will be biased. Applying this model
to municipal public spending in North Carolina, significant negative correlations between
contiguous municipalities are found, showing free-riding effects.
The Second chapter,“A Tobit Model with Social Interactions under Incomplete Information”, is an application of the first chapter to censored outcomes, corresponding to the
situation when agents’ behaviors are subjected to some binding restrictions. In an interesting empirical analysis for property tax rates set by North Carolina municipal governments, it
is found that there is a significant positive correlation among near-by municipalities. Additionally, some private information about its own residents is used by a municipal government
to predict others’ tax rates, which enriches current empirical work about tax competition.
The third chapter, “Social Interactions under Incomplete Information with Multiple
Equilibria”, extends the first chapter by investigating effective estimation methods when
the condition for a unique equilibrium may not be satisfied. With multiple equilibria, the
previous model is incomplete due to the unobservable equilibrium selection. Neither conventional likelihoods nor moment conditions can be used to estimate parameters without
further specifications. Although there are some solutions to this issue in the current literature, they are based on strong assumptions such as agents with the same observable
characteristics play the same strategy. This paper relaxes those assumptions and extends
the all-solution method used to estimate discrete choice games to a setting with both discrete and continuous choices, bounded and unbounded outcomes, and a general form of
incomplete information, where the existence of a pure strategy equilibrium has been an
open question for a long time. By the use of differential topology and functional analysis,
it is found that when all exogenous characteristics are public information, there are a finite
iii
number of equilibria. With privately known exogenous characteristics, the equilbria can be
represented by a compact set in a Banach space and be approximated by a finite set. As a
result, a finite-state probability mass function can be used to specify a probability measure
for equilibrium selection, which completes the model. From Monte Carlo experiments about
two types of binary choice models, it is found that assuming equilibrium uniqueness can
bring in estimation biases when the true value of interaction intensity is large and there are
multiple equilibria in the data generating process.
iv
This is dedicated to my beloved Father, Xianmao Yang, and Mother, Fuyun Liao.
v
Acknowledgments
I owe my great thanks to my advisor, Lung-fei Lee, for his guidance in my economic
research and career development. Professor Lee is an experienced economist with deep
insights in econometric theories. His expertise greatly helped me formulate life-intriguing
ideas into concrete theoretical research questions and apply mathematical tools in rigorous
analysis. Moreover, I was greatly impressed by his great passion in developing practical
econometric tools and pursuing for rigor and perfection, which led to me be a responsible
and active researcher.
I would like to thank Jason Blevins. Professor Blevins introduced me a large string of
recent literature about Microeconometrics, which helped me advance my research and catch
up with the frontier. I benefited a lot from him in computation skills, too.
I would also like to thank Stephen Cosslett. I have learned quite a lot from his lectures, which covered a broad range of econometric theories. Additionally, when serving
as the grader for Professor Cosslett, from his well-designed problem sets and exams, I got
knowledge on both econometric theory and teaching skills.
I am also indebted to Lixin Ye for his guidance and encouragement. As my work
combines game theory and econometric theories, it is important to build game-based models
from real life problems. Professor Ye gave me many constructive suggestions, which aided
me to build models with consolidated microeconomic foundations and practical applications.
vi
I appreciate Robert De Jong, Javier Donna, Daeho Kim, Maryam Saeedi, and Bruce
Weinberg. Their precious comments and encouragements led me to pursuing research in
Econometrics and Microeconometrics. I also appreciate Yaron Azrieli, David Blau, Lucia
Dunn, Bill Dupor, Paul J. Healy, John Kagel, Aubhik Khan, Pok-sang Lam, Dan Levin,
Hu McCulloch, James Peck, Julia Thomas, and Huanxing Yang. Their courses provided
me with rigorous training in economic science and their comments were beneficial for me
to make progress.
I send my appreciation to The Ohio Supercomputer Center, for the support in allocation of computing time, which facilitated the investigation of the performances of my new
estimation methods.
I would like to appreciate Hajime Miyazaki. His great devotion to administrative work
ensures graduate students to concentrate their time and energy on research and teaching.
In particular, his encouragement and guidance on speaking English, teaching skills, and
career development greatly helped me to make a firm decision to become a productive and
competent economist.
I am grateful to all the stuff in the Department of Economics, for their efficient work and
warm-hearted help. I also owe thanks to my friends and classmates. As an international
student, without their help, I could not have had such a pleasant time during my graduate
studies.
Last but not the least, I would like to thank my beloved mother, for her strong support,
deep concern, and continuing encouragement.
vii
Vita
1982 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Born - Wuhan, China
2005 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B.A. Economics and Mathematics,
Wuhan University
2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M.A. Economics,
The Ohio State University
2009-present . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Graduate Teaching Associate,
The Ohio State University
Publications
Research Publications
Chao Yang, Lian-sheng Wu, and Xiaohui Bo “Career Concern and Tax Preparer Fraud”.
Annuals of Economics and Finance, 11(2):355–379, 2010.
Fields of Study
Major Field: Economics
viii
Table of Contents
Page
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
ii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
v
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
vi
Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
viii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
xii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
xiv
1.
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
2.
Social Interactions under Incomplete Information with Heterogeneous Expectations
8
2.1
2.2
2.3
2.4
2.5
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.1 A Model Framework . . . . . . . . . . . . . . . . . . . . . . .
2.2.2 Continuous and Discrete Choices and Information Structures
2.2.3 Game Theoretical Explanation . . . . . . . . . . . . . . . . .
Equilibrium Analysis . . . . . . . . . . . . . . . . . . . . . . . . . .
2.3.1 Equilibrium and Expectations . . . . . . . . . . . . . . . . .
2.3.2 Existence and Uniqueness . . . . . . . . . . . . . . . . . . . .
Equilibrium Solution and Computation . . . . . . . . . . . . . . . .
2.4.1 All Characteristics are Publicly Known . . . . . . . . . . . .
2.4.2 Characteristics are Self-Known . . . . . . . . . . . . . . . . .
2.4.3 Socially-Known Characteristics . . . . . . . . . . . . . . . . .
Identification, Likelihood, and Estimation . . . . . . . . . . . . . . .
2.5.1 Identification . . . . . . . . . . . . . . . . . . . . . . . . . . .
ix
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
8
12
12
15
16
20
20
22
25
25
26
31
35
35
.
.
.
.
42
44
48
51
A Tobit Model with Social Interactions under Incomplete Information . . . . .
53
3.1
3.2
.
.
.
.
.
.
.
.
.
.
.
.
.
.
53
57
57
58
60
60
62
64
70
74
75
76
79
83
Social Interactions under Incomplete Information with Multiple Equilibria . . .
85
2.6
2.7
2.8
3.
3.3
3.4
3.5
3.6
3.7
3.8
3.9
4.
4.1
4.2
4.3
4.4
4.5
4.6
2.5.2 Estimation . . . .
Monte Carlo Experiments
An Empirical Application
Conclusion and Remarks
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Introduction . . . . . . . . . . . . . .
The Model . . . . . . . . . . . . . . .
3.2.1 The Model Framework . . . .
3.2.2 Game Theoretical Foundation
Equilibrium Analysis . . . . . . . . .
3.3.1 Equilibrium and Expectations
3.3.2 Unique Equilibrium . . . . . .
3.3.3 Equilibrium Computation . . .
Identification . . . . . . . . . . . . . .
Estimation . . . . . . . . . . . . . . .
Extensions . . . . . . . . . . . . . . .
Monte Carlo Experiments . . . . . . .
Empirical Application . . . . . . . . .
Conclusion . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Introduction . . . . . . . . . . . . . . . . . . . .
Models . . . . . . . . . . . . . . . . . . . . . . .
4.2.1 A Model Framework . . . . . . . . . . . .
4.2.2 Examples . . . . . . . . . . . . . . . . . .
Game, Equilibrium, and Expectations . . . . . .
4.3.1 Game Explanations . . . . . . . . . . . .
4.3.2 Equilibrium and Expectations . . . . . .
Estimation with Publicly Known Characteristics
4.4.1 Characterization of the Equilibrium Set .
4.4.2 Selection Rule and Complete Likelihood .
4.4.3 Identification . . . . . . . . . . . . . . . .
4.4.4 Computation and Estimation . . . . . . .
Estimation with Self-Known Characteristics . . .
4.5.1 Discrete Private Characteristics . . . . .
4.5.2 Continuous Private Characteristics . . . .
Discussions and Extensions . . . . . . . . . . . .
4.6.1 Group Unobservables . . . . . . . . . . .
4.6.2 Peer Effects . . . . . . . . . . . . . . . . .
x
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
85
89
89
91
93
93
94
100
101
105
106
109
113
115
115
126
126
128
.
.
.
.
.
131
132
133
136
142
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
144
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
146
Appendices
153
A.
Appendix to Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
153
A.1
A.2
A.3
A.4
A.5
.
.
.
.
.
153
155
158
160
163
Appendix to Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
183
B.1 Equilibrium Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
B.2 Identification Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
183
185
Appendix to Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
196
C.1
C.2
C.3
C.4
C.5
C.6
196
199
205
208
218
220
4.7
4.8
5.
B.
C.
4.6.3 Deterministic Rule . . . . . .
Binary Choice Models: Analysis and
4.7.1 Binary Choice Model I . . .
4.7.2 Binary Choice Model II . . .
Conclusion . . . . . . . . . . . . . .
Proofs . . . . . . . . . . . . . . . . .
Numerical Methods . . . . . . . . .
Identification . . . . . . . . . . . . .
Unobserved Group Random Effects
Identification: Bayesian Analysis . .
. . . . . . . .
Experiments
. . . . . . . .
. . . . . . . .
. . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Expectations, Equilibria, and Functions . . . . . . . . . . . . . . . . .
Proofs for Equilibrium Characterizations with Public Characteristics .
Proofs for Identification with Public Characteristics . . . . . . . . . .
Equilibrium with Privately Known Characteristics . . . . . . . . . . .
Equilibrium for Peer Effects . . . . . . . . . . . . . . . . . . . . . . . .
Proofs for Equilibrium Set Characterization in Binary Choice Models
xi
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
List of Tables
Table
Page
A.1 Linear Model with Discrete Characteristics for Independent Groups
. . . .
166
A.2 Binary Choice with Discrete Characteristics for Independent Groups . . . .
167
A.3 Linear Model with Continuous Characteristics for Independent Groups . . .
168
A.4 Binary Choice with Continuous Characteristics for Independent Groups . .
169
A.5 Linear Model with Discrete Characteristics and Constant Friend Number for
a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
170
A.6 Linear Model with Discrete Characteristics and Random Friend Number for
a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
171
A.7 Binary Choice with Discrete Characteristics and Constant Friend Number
for a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
172
A.8 Binary Choice with Discrete Characteristics and Random Friend Number for
a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
173
A.9 Linear Model with Continuous Characteristics and Constant Friend Number
for a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
174
A.10 Linear Model with Continuous Characteristics and Random Friend Number
for a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
175
A.11 Binary Choice with Continuous Characteristics and Constant Friend Number
for a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
176
A.12 Binary Choice with Continuous Characteristics and Random Friend Number
for a Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
177
xii
A.13 Linear Model with Discrete Characteristics and Unobserved Group Random
Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
178
A.14 Sample Statistics: Municipalities in North Carolina . . . . . . . . . . . . . .
179
A.15 Empirical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
180
B.1 Tobit Model with Discrete Characteristics for Independent Groups . . . . .
187
B.2 Tobit Model with Continuous Characteristics for Independent Groups . . .
188
B.3 Tobit Model with Discrete Characteristics and Constant Friend Number for
A Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
189
B.4 Tobit Model with Discrete Characteristics and Random Friend Number for
A Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
190
B.5 Tobit Model with Continuous Characteristics and Constant Friend Number
for A Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
191
B.6 Tobit Model with Continuous Characteristics and Random Friend Number
for A Single Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
192
B.7 Tobit Model with Discrete Characteristics and Unobserved Group Random
Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
193
B.8 Sample Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
194
B.9 Tobit Model for Property Tax Competition . . . . . . . . . . . . . . . . . .
195
C.1 Binary Choice I: Estimation Comparison . . . . . . . . . . . . . . . . . . . .
224
C.2 Binary Choice II: Estimation Comparison for Moderate Interactions . . . .
225
C.3 Binary Choice II: Estimation Comparison for Large Interactions . . . . . .
226
xiii
List of Figures
Figure
Page
A.1 Identification for Continuous Choices . . . . . . . . . . . . . . . . . . . . . .
181
A.2 Identification for Binary Choices . . . . . . . . . . . . . . . . . . . . . . . .
182
C.1 The Haar Basis Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . .
227
C.2 Equilibrium Illustration for Binary Choice I with Influences from Peers A .
228
C.3 Equilibrium Illustration for Binary Choice I with Influences from Peers B .
229
C.4 Equilibrium Illustration for Binary Choice I with Influences from Peers C .
230
C.5 Equilibrium Illustration for Binary Choice I with Influences from Peers D .
231
C.6 Equilibrium Illustration for Binary Choice I with General Social Relations .
232
C.7 Equilibrium Outcomes for Binary Choice I with General Social Relations .
233
C.8 Equilibrium Illustration for Binary Choice II with Influences from Peers A .
234
C.9 Equilibrium Illustration for Binary Choice II with Influences from Peers B .
235
C.10 Equilibrium Illustration for Binary Choice II with Influences from Peers C .
236
C.11 Equilibrium Illustration for Binary Choice II with Influences from Peers D .
237
C.12 Equilibrium Illustration for Binary Choice II with General Social Relations
238
C.13 Equilibrium Outcomes for Binary Choice II with General Social Relations .
239
C.14 Homotopic Mappings on Sphere for Binary Choices . . . . . . . . . . . . . .
240
xiv
Chapter 1: Introduction
With the rapid development of social media, social networks have been influencing more
and more aspects of human life. There is also a boom in real life cases and data sets, which
helps to investigate how social networks influence people’s behaviors. However, endeavors in
empirical research face some challenges due to the difference between the social interaction
models and the traditional models about agents’ behaviors. Unlike the previous analysis
where agents are making independent decisions or their direct interdependence is negligible
in a large market, when analyzing people’s behaviors in a social group, the interactions
between two closely related agents should be taken into account. This type of interactions
implies coordination or conflicts of individual behaviors, which in turn relates estimation
of social interactions to estimation of games. Then several issues arise, such as the choice
of equilibrium concepts, the relationship between the expectations and/or outcomes of two
different individuals, and equilibrium multiplicity. The goal of my thesis is to build models
on the foundation of game theory and find effective estimation methods for individual
heterogeneity and multiple equilibria.
The thesis focuses on the setting where agents make decisions simultaneously and some
individual characteristics and idiosyncratic shocks are not shared as public information in
a social group. Agents’ behaviors in this setting can be viewed as the equilibrium outcomes
of a simultaneous move game with incomplete information. This is a natural setting for
1
course enrollment in a class and market entry of a number of firms in the same industry.
Modeling and estimation in this setting has been analyzed by Manski(1993), Brock and
Durlauf(2001), and Lee, Li and Lin(2014). However, in their models it is assumed that all
exogenous characteristics that can be observed from the data by the econometricians are
public information to agents in a social group. Only the variables unobserved by the econometricians are private information. Under this assumption, when investigating students’
interactions in class performances, exogenous characteristics used by the econometricians,
such as individual IQ’s and GPA, are viewed as public information known to every student
in a class. Obviously, this assumption is unrealistic. This assumption is relaxed in this
thesis such that there is private information in not only the variables unobservable from the
data sets but also exogenous characteristics that can be observed by the econometricians.
The framework here is very general. It includes many models which are frequently used
in empirical studies, such as the linear model for continuous choices, the model for binary
choices, and the tobit model for truncated outcomes.
Although this new setting about the information structure is more realistic, it makes
estimation and computation more challenging than that for the previous models. With
incomplete information, a Bayesian Nash Equilibrium (BNE) is equivalent to the equilibrium conditional expectations on agents’ behaviors. When all exogenous characteristics are
public information and independent of the unobservables, every agent has the same expectations as the econometricians do. It is then safe to assume that individuals have rational
expectations. When agents are symmetric, as it is shown by Manski(1993) and Brock and
Durlauf(2001), an equilibrium can be represented by a scalar satisfying an equation. Lee,
Li and Lin(2014) take into account heterogeneity in individual characteristics and allow the
expected outcomes of two different agents to differ. Then in their model, an equilibrium
2
can be represented by a vector, which solves a system of nonlinear equations. However, in
this thesis, with asymmetric private information about exogenous characteristics, there is
another type of heterogeneity. That is, two agents may form different expectations on the
behaviors of the same third agent. As the value of expectations vary with the information
used to make predictions, in this general framework, an equilibrium can be represented
by a vector-valued function, satisfying a system of function equations. Identification and
estimation depend on analyzing these function equations, where the difficulty lies.
To identify and estimate model primitives requires characterize the distribution of observed outcomes, which entails an investigation of the equilibria. Two fundamental issues
about the equilibria of a game are existence and uniqueness. If it can be guaranteed that
there is a unique equilibrium, as the equilibrium can be computed from the model primitives, the distribution of observed outcomes can be captured without further assumptions.
However, if equilibrium multiplicity is possible, as the real equilibrium that is played is not
observed, there is not a complete distribution of the observed outcomes without additional
assumptions. This thesis first searches for a reasonable assumption ensuring the existence
of a unique equilibrium and investigate effective methods to compute it. Then this thesis
extends the methods for game estimation with multiple equilibria by Bajari et al(2010b)
and (2010c) to a more general context and delves into efficient computation and estimation.
The investigation of the equilibrium set relies on mathematical theories about function
spaces. An expedient way to find sufficient conditions for equilibrium uniqueness is to use
the contraction mapping in Banach spaces. It is found that there is a unique equilibrium
when the intensity of social interactions is bounded in a range, which depends on the behaviors agents make and the network structures. For large interaction effects where this
3
restriction is not necessarily satisfied, this thesis applies the intersection theory in differential topology and the Schauder fixed point theorem to characterize the set of equilibria. It
is found that when all exogenous characteristics are public information, there are a finite
number of equilibria under some general regularity conditions. When some exogenous characteristics are private information, the set of equilibrium is compact in a Banach space and
can be approximated by a finite number of equilibria. Therefore, it is possible to specify a
parametric random equilibrium selection rule and use a finite probability mass function for
the distribution of equilibrium selection to complete the model. This extends the estimation
method by Bajari et al(2010b) and (2010c). In their setting, there are a finite number of
players, each of them has a finite number of choices. It is well established that there are a
finite number of Nash equilibria. Nonetheless, the existence of a pure strategy equilibrium
when the set of actions is not finite has been an open question for a long period of time. Following the seminal paper by Milgrom and Weber(1985) and Radner and Rosenthal(1982),
Khan and Sun(1995) and Kan and Zhang(2014) derive existence theorems when the set of
actions is compact. Although their conditions can be used for a general setting about the
information structure, it is not reasonable to restrict the values of outcomes to be bounded
for many empirical studies. The equilibrium characterization in this thesis provides with
complements to that large literature, as the analyses of a specific reduced-form BNE for
unbounded outcomes.
Computation is the challenges this all-solution method has to face. Computation (or
approximation when some exogenous characteristics are private information) of the equilibrium set can be converted to deriving all the solutions to a system of nonlinear equations.
This ambitious goal, nonetheless, is achievable by using the homotopy continuation method
4
proposed by Garcia and Zangwill(1981), given some generic regularity conditions are satisfied. Through detailed discussions and Monte Carlo experiments about two types of Binary
Choice models, it is found in this thesis that this computationally intensive estimation
method is rewarded by good performances in terms of estimate biases and log likelihoods
when there are multiple equilibria.
In the recent literature of social interaction model estimation, there are some estimation methods which does not require solving all the equilibria and can derive consistent
estimates with some special model structures or additional assumptions. The basis of these
methods is the two-step algorithm proposed by Bajari et al.(2010a) for a game with discrete
choice and incomplete information on individual unobservables. They first derive consistent
nonparametric estimates of individual choice probabilities and then use them to estimate
structural parameters. If there are a large number of repetitions of the same game, under
the assumption that the same equilibrium is played for all the repetitions, the individual
choice probabilities can be estimated consistently in the first step. However, unlike the
empirical work in industrial organization for which there are usually a long time series for
a small group of agents in the data sets, for empirical work in social interactions, there
are often cross-section data sets with a large number of heterogeneous agents. Then this
two-step algorithm cannot be directly applied without special model structures and further
assumptions. Bisin et al.(2011) consider the case that individual outcomes are influenced
by a global equilibrium aggregate. Then it is possible to first estimate the equilibrium aggregate and then recover other parameters. This algorithm can also be used if equilibria are
refined. Leung(2013) focuses on one particular type of equilibria, where individuals with the
same observable characteristics play the same strategy. However, this two-step algorithm
would be invalid with the presence of unobserved group heterogeneity which cannot be fully
5
explained by observed characteristics. As a result, although the all-solution method in this
thesis is more computationally intensive, it can ensure consistent and efficient estimates in
a general model framework without imposing stringent assumptions on the data set or the
data generating process.
In addition to the discussions about the equilibrium solution and estimation in a general framework, there are detailed analyses of the identification, estimation, and empirical
studies for three frequently used models that are incorporated in this framework: the linear model for continuous choices, the model for binary choices, and the tobit model for
truncated outcomes. As for the binary choices, two types of the model are analyzed as
their equilibrium sets have different characteristics. The entry problem is a good example
for the type I binary choice models, where firms choose whether or not to enter a market
and the firms who do not enter have no effects on others’ profits ex post. The second type
model for binary choices, like that in Brock and Durlauf(2001), allows interactions from
agents choosing both actions, for individual utilities depend on coordination or conflicts of
actions. The tobit model with social interactions corresponds to the case that the actions
the socially associated agents can take are subject to some bounds. As censored and /or
truncated outcomes are frequently observed in the data sets, the tobit model has gained
more and more attention both empirically and theoretically. Recent theoretical research includes the papers by Kumar(2012) and Abreyava and Shen (2014). This thesis contributes
to this literature by incorporating interactive influences between individuals into the model.
Estimation methods are applied to analyze the interactions among municipalities in
North Carolina for policy-making. It is found that when neighboring cities increase their
public safety spending, the spending on public safety of a city will be reduced, indicating
6
a “free-riding” effects. When the property tax rates of near-by cities are lowered, however,
the property tax rate of a city will be reduced, showing competing effects.
This thesis proceeds as follows. Chapter 2 investigates computation and estimation
of social interaction models with a general form of incomplete information when there
is a unique equilibrium. It discusses about the linear model for continuous choices and
the binary choice model in detail. Chapter 3 applies the theoretical results in Chapter 2
to the tobit model for truncated outcomes. Chapter 4 extends the previous analysis to
investigate estimation with multiple equilibria. Chapter 5 concludes and discusses about
future research.
7
Chapter 2: Social Interactions under Incomplete Information with
Heterogeneous Expectations
2.1
Introduction
It is natural to believe that an agent may not know the features of all other members in
her social group. For example, a college student knows the SAT scores of her close friends
in a class but not those of others. Because of different knowledge, even two students with
the same characteristics may have different predictions on academic performance of a third
student. Hence, heterogeneity in private information may influence an individual’s actions
through social interactions. However, this type of heterogeneity has not been analyzed in
the existing literature. In models like Manski (1993), Brock and Durlauf (2001), and Lee,
Li, and Lin (2014), under the assumption of “rational expectations”, two agents form the
same expectation on a third individual. Although this assumption simplifies equilibrium
solution, it is not suitable when there is asymmetric private information and decisions are
not made repeatedly. In this paper, we related social interactions with private information
to a simultaneous move game under incomplete information and adopt the solution concept
of Bayesian Nash Equilibrium (BNE).
In our behavioral model, an agent’s action depends on her expectations on the behaviors
of her linked agents, based on her information about group features, social relationships,
8
and some specific personal characteristics. We can view this model as a simultaneous move
game under incomplete information. A strategy of an agent is a correspondence which
maps her characteristics and private information to an action. We show a correspondence
between a BNE and the equilibrium conditional expectations on group members’ behaviors. The equilibrium expectation function shows two types of heterogeneity. On one hand,
the expectation on behaviors of two different agents are generally different, for they have
different traits. On the other, if the private information of two agents differ, their expectations on a third person may not be the same. As the equilibrium conditional expectation
is a vector-valued function of privately known individual characteristics, we embed conditional expectations in a function space and use functional contraction mapping to derive
a sufficient condition for the existence of a unique equilibrium. It is shown that for some
frequently used models, such as linear and binary choice models, after row-normalizing the
social weighting matrix, this condition reduces to the intensity of social interaction not
being too large.
Except for some special cases where a closed-form solution of equilibrium conditional
expectations exists, in general, equilibrium must be solved numerically. We illustrate the
solution process when privately known characteristics are discretely distributed with a finite support. When they are continuously distributed, since expectations are integrals, we
approximate them by Gaussian quadratures. We can first solve the values of expectation
functions at fixed quadrature abscissae by contraction mapping and then approximate the
expectation functions. The solution of equilibrium conditional expectations is nested in
calculating sample log likelihood functions for estimation, the “nested fixed point” (NFXP)
ML estimation originated in Rust (1987) for the estimation of dynamic discrete choice models. The NFXP ML estimation performs well for both the linear and binary choice models.
9
In Monte Carlo experiments, we also compare estimation under various information structures. It is found that estimation under a true information structure generally results in
a higher maximized sample log likelihood than that under misspecified information structures. Hence, maximized log likelihood may provide a criterion of model selection. We also
extend the model by incorporating unobserved group effects. Applying the model to correlation of public safety expenditure for neighboring municipalities in North Carolina, we
find significantly negative spatial correlation under various regression settings, indicating
possible “free-riding” effects.
Including linear models and binary choice models as special cases, the model is related to
the “linear-in-means” model of social interactions by Manski (1993) and Brock and Durlauf
(2001). In those models, an individual makes predictions on other agents’ outcomes. After
imposing symmetry and rational expectations, expectation reduces to a scalar. Relaxing
symmetry in exogenous characteristics, Lee, Li, and Lin (2014) allow expectation on one
agent’s behavior to be different from those on others. As personal characteristics are common knowledge for all members in a group, by imposing rational expectations, any two
agents will have the same expectation on the action of a third agent. So those two types of
models are special cases of ours.
Our paper also connects the literature of estimation of games under incomplete information, such as Bajari et al. (2010a) and Aradillas (2010). In Bajari et al. (2010a), a
finite number of agents choose one action from a finite action set simultaneously. Private
information resides in the idiosyncratic shocks. All personal characteristics are public information. Identification and estimation are based on the one-to-one correspondence between
choice probabilities and expected payoffs. Incomplete information about both exogenous
characteristics and idiosyncratic shocks is discussed in a two-player binary choice game in
10
Aradillas (2010). The author focused on one type of equilibrium where expectations of
both players are based on a commonly observed public signal. As a result, expectations do
not vary with asymmetric private information in that paper. Therefore, our model is an
extension to the above two papers in the form of incomplete information.
Methodologies in game estimation have been used to investigate social interactions recently. In Bisin et al. (2011), individual behaviors are socially interacted in the context that
all exogenous characteristics are public information for group members and idiosyncratic
shocks are privately known. In Leung (2013), endogenous formation of social networks is
formulated as a game where agents choose social association simultaneously. In that model,
whether an agent connects with another depends on individual features, pair-specific match
value, and an idiosyncratic shock. Only the idiosyncratic shock is privately known. Discussions are focused on the case when individuals of identical attributes take the same
strategies ex ante. Then Bayesian equilibrium reduces to a single function of exogenous
characteristics. In both papers, a two-stage method similar to that in Bajari et al. (?) is
used for estimation.1 So our model is more general in terms of incomplete information.
The paper proceeds as follows. In Section 2.2, we introduce the model framework, as
well as the relationship between the model and a simultaneous move game with incomplete
information. In Section 2.3, we first show the one-to-one correspondence between a BNE
and a consistent conditional expectation function and then discuss equilibrium existence
and uniqueness. A detailed discussion on computation of equilibrium is in Section 2.4.
We investigate identification and estimation in Section 2.5. Monte Carlo experiments are
conducted and results are presented in Section 2.6. After an empicial analysis in Section
2.7, we conclude in Section 2.8. Proofs of equilibrium analysis and numerical methods
1
In Bisin et al. (2011), an alternative estimation method is to solve all equilibria for each parameter
value and choose the one to maximize log likelihood.
11
for equilibrium computation are put in Appendix A.1 and A.2, respectively. In Appendix
A.3, we discuss in detail sufficient conditions for identification. An extension to unobserved
group random effects and the Bayesian analysis of identification are elaborated, respectively,
in Appendix A.4 and A.5.
2.2
2.2.1
Models
A Model Framework
Consider a group of socially related agents of size n.2 The n × n matrix, W , is used
to represent their (exogenous) social relations. For example, W represents friendship. For
any i 6= j, Wi,j = 1 if an individual i is associated with the individual j; and Wi,j = 0
otherwise. For any i, Wi,i = 0, as is usual in a network. More generally, Wi,j may depend
on geographic, economic, or social distances between i and j. We require that elements of
W are all non-negative, Wi,j ≥ 0 for any i, j. So Wi,j can represent not only whether i and
j are associated but also the strength of their relation. Note that W may be asymmetric
or symmetric.
To incorporate possible discrete choices in addition to continuous outcomes, the model is
presented with a latent variable and its relationship with observed outcomes. The observed
outcome, or agent’s behavior, yi , depends on the latent variable, yi∗ , as
yi = hi (yi∗ ),
(2.2.1)
where hi (·) : < → < is a real-valued function, which can be linear or nonlinear. The latent
variable for i is interacted with her expectations about her linked agents’ outcomes.
yi∗ = u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i .
(2.2.2)
j6=i
2
Theoretically, we can analyze social interactions in a single group. In empirical application, however,
we may have independent groups. In the presence of many groups for estimation, we will use subscript, g,
as each group’s index.
12
u(Xi ) represents the direct effect of exogenous covariates. Xi = (X g , Xic , Xip ) is composed of three types of exogenous characteristics, the group features, X g ; some commonly
known individual characteristics, Xic ; as well as privately known information, Xip . i is the
idiosyncratic shock. i ’s are i.i.d. and independent of all the exogenous characteristics and
social relations. Its distribution is characterized by a pdf function, f (·), with its CDF, F (·).
Assume that i is known by individual i herself, but not by other agents or econometricians.
Innovations of our model reside in the middle part on the right hand side of (4.2.2). It
shows two features of social interactions. First, yi∗ is related to the behavior of agent j only
if i is associated with j; i.e., Wi,j 6= 0. Second, j influences i through i’s expectation on the
true outcome of j’s behavior. This is intuitively appealing, because agents make decisions
simultaneously before actual outcomes are realized. In this model, expectations are based on
public information on social relations in W , group features, X g , commonly known individual
characteristics, Xjc ’s, and private information about exogenous characteristics, Xjp ’s.3 The
information structure can be fully described by specifying for every agent the subset of
agents whose Xjp ’s are known to her. Therefore, for each i, we define an n × 1 vector, Ji ,
such that Ji (j) = 1 if i knows Xjp and Ji (j) = 0 otherwise, for each 1 ≤ j ≤ n. Hence,
0
0
0
information structure in a group can be represented by an n2 × 1 vector, J = (J1 , · · · , Jn ) .
For every i, we define by XJpi , the vector formed by the random variables, Xjp ’s, which are
0
known to i, XJpi = (Xjp : Ji (j) = 1)0 . Suppose that Xjp is of dimension kp . Then the
P
dimension of XJpi is Mi = ( nj=1 Ji (j))kp . For instance, if 1 only knows X1p , X2p and X3p ,
0
0
0 0
J1 (j) = 1 for j = 1, 2, 3 and J1 (j) = 0 for j > 3, and XJp1 = (X1p , X2p , X3p ) . We assume
that such an information structure J is publicly known. However, the realization of those
random variables are private information. For example, it is publicly known that agent 1
0
3
Here, W represents the vector formed by all its elements, (W1,1 , · · · , W1,n , · · · , Wn,1 , · · · , Wn,n ) , which
shows all social relations in the group.
13
knows her own features and those of agents 2 and 3. But the realizations of X1p , X2p and
X3p are unknown to agent 4. To simplify notation, we summarize all the publicly known
variables in one vector,
0
0
0
0
0
Z = (X g , X1c , · · · , Xnc , W1,1 , · · · , W1,n , · · · , Wn,1 , · · · , Wn,n , J1 , · · · , Jn ) .
(2.2.3)
In sum, i’s expectations about other agents’ behaviors are based on realizations of two
random vectors, one about private information, XJpi , and the other about public information,
Z. Two agents have the same private information if Ji = Jj , so that XJpi = XJpj .
Although it is natural to assume that an agent’s prediction may also be based on the
realization of her idiosyncratic shocks, it can be excluded from her private information.
Due to independence across idiosyncratic shocks, {i }’s, and also their independence with
exogenous characteristics, {Xi }’s, an agent’s conditional expectations will not be changed
if her i is also used to make prediction.4 Therefore, the information framework that
all exogenous characteristics are public information and idiosyncratic shocks are private
information, which is frequently used in estimation of incomplete information games (such
as Bajari et al. (2010a)), is included in our model as a special case. Although the presence
of unobserved group effects may complicate identification and estimation, because they are
observed by group members, they do not influence theoretical analysis of social interactions.
So we focus on a model without unobserved group effects first and include them in the model
as an extension in Appendix A.4.
We allow XJpi to vary across agents. As a result, there are two types of heterogeneity
0
in E[yj,g |XJpi , Z]’s. Since agent j and j may be different in personal characteristics, i.e.,
Xj 6= Xj 0 , their expected outcomes may differ in i’s predictions. That is, in general,
R
For any k 6= i, since k is independent of i and all X, E[yi |XJpk , Z, k ] = E[ hi (u(Xi ) +
R
P
P
λ j6=i Wi,j E[yj |XJpi , Z, i ] − i )|XJpk , Z, k ] = E[ hi (u(Xi ) + λ j6=i Wi,j E[yj |XJpi , Z, i ] − i )|XJpk , Z].
4
14
0
0
E[yj |XJpi , Z] 6= E[yj 0 |XJpi , Z], for i 6= j, i 6= j , and j 6= j . On the other, because two agents
may have different private information on an individual, their expectations on the behavior
of a same person might differ; that is, E[yj |XJpi , Z] 6= E[yj |XJpk , Z] when Ji 6= Jk in general.
Those two types of heterogeneity distinguishes the current model from previous ones in
Manski (1993), Brock and Durlauf (2001), and Lee, Li, and Lin (2014). In Manski (1993),
expectations are taken on the average behavior of the whole group based on commonly
known group features. In the discrete choice model by Brock and Durlauf (2001), every agent
has the same expectation of any other agents in her group. Lee, Li, and Lin (2014) introduce
asymmetry in personal characteristics and allow expectation on choices of different agents
to differ. However, with rational expectations, any two agents have the same expectation
on another agent. Therefore, our model extends existing research in social interactions with
incomplete information by allowing two types of heterogeneity in conditional expectations.
The intensity of social interaction effects is captured by the parameter, λ, in (4.2.2). If
λ > 0, the social interaction effect is positive. If λ < 0, outcomes are negatively related.
The case of λ = 0 represents absence of social interactions.
2.2.2
Continuous and Discrete Choices and Information Structures
The model, (2.2.1) and (2.2.2), provides with a framework broad enough to incorporate
many interesting cases, in particular, a linear model with continuous choices and a discrete
choice model with social interactions.
1. (Continuous Choices) hi (z) = h(z) = z for any z ∈ <, and yi = yi∗ ; i.e., the latent
variable, yi∗ , can be perfectly observed.
15
2. (Binary Choice) hi (·) = h(·), where h(·) can take only two values, 0 and 1. That is,
yi = I(yi∗ > 0). The observed outcome is a binary variable, whose value depends on
sign of the latent variable.
Specifications on XJpi ’s reflect information structures. Several examples follow.
1. (Publicly Known Characteristics) All exogenous group and individual characteristics,
social relations, and the information structure are publicly known. In that case, XJpi
0
0 0
include all Xjp ’s, i.e., for all i, XJpi = (X1p , · · · , Xnp ) .
2. (Self-Known Characteristics) Xip is observed by i but not by any other group members.
That is, for each i, XJpi = Xip .
3. (Socially-Known Characteristics) i knows Xjp if i is associated with j; i.e., Wi,j 6= 0.
0
0
In this case, for each i, XJpi = (Xjp : j = i or Wi,j 6= 0) .
They will be discussed in detail in subsequent sections.
2.2.3
Game Theoretical Explanation
In the model, (2.2.1) and (2.2.2), behaviors of agents in a group are interacted when
they are uncertain about others’ attributes. This is just like a simultaneous move game with
incomplete information. According to Harsanyi (1967a; 1967b), such interactions can be
modeled as a Bayesian game. By assuming that agents’ payoffs are related to a randomly
determined “state,” an agent’s uncertainty comes from the fact that her signal does not
completely recover the true state. The game form can be set up following Osborne and
Rubinstein (1994).
There are n players, i = 1, · · · , n. Let Si represent the support of Xip . The set of
states,
Qn
i
Si , is defined as the set of all possible Xip ’s for all players. In this case, player
16
i’s “type” is her private information, XJpi . Accordingly, her set of types is the support
of those characteristics, Ti =
states to her type, τi :
Qn
i
Q
Si →
k:Ji (k)=1 Sk .
Q
The signal function is a mapping from the
k:Ji (k)=1 Sk .
Her prior belief on the set of states is the
joint distribution of Xip ’s, Fp (·). The prior belief is the same across all players. For each
player, i = 1, · · · , n, there is an action set, Yi . For player i, a strategy is a contingent
plan specifying her actions for each of her types, si :
actions, y = (y1 , · · · , yn ) ∈
Qn
i=1 Yi ,
Q
k:Ji (k)=1 Sk
→ Yi . For a profile of
every player receives a payoff. Because the actions
taken by players depend on their types, which is related to the uncertain state, before
actions are taken, the payoff of a player is uncertain. We assume that a player’s preference
over uncertain payoffs can be expressed by expected utilities.
Specification for a continuous choice model differs from that for a discrete choice one.
So we discuss payoffs, strategies, and equilibria separately for continuous choice and binary
choice.
1. (Continuous Choice) With a strategy profile, s = (s1 (·), · · · , sn (·)), given public information, Z = z, if the realized types are xpJ1 , · · · , xpJn , the payoff for player i is
r(si (xpJi ), s−i (xp−Ji ))
=q(xi ) + (u(xi ) − i )si (xpJi ) + λsi (xpJi )
X
j6=i
1
Wi,j sj (xpJj ) − (si (xpJi ))2 ,
2
where s−i (·) denotes the strategies of players other than i, and xp−Ji represents their
realized types.5 Given public information Z = z, the expected payoff of player i when
5
We use uppercase letters to represent random variables and lowercase letters for their realizations.
17
her type is xpJi is6
p
E[r(si (XJpi ), s−i (X−J
))|XJpi = xpJi , Z = z]
i
=q(xi ) + (u(xi ) − i )si (xpJi ) + λsi (xpJi )
X
Wi,j E[sj (xpJj )|XJpi = xpJi , Z = z]
j6=i
1
− (si (xpJi ))2 .
2
This is a quadratic function of the action, si (xpJi ). To maximize expected utility, she
chooses
si (xpJi ) = u(xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi = xpJi , Z = z] − i .
(2.2.4)
j6=i
Eq (2.2.4) shows player i’s best response to other players’ strategies, s−i (·). Therefore,
a profile of strategies, s = (s1 (·), · · · , sn (·)), is a BNE in this game, if it satisfies (2.2.4),
for any i = 1, · · · , n and xpJi ∈
Q
7
k:Ji (k)=1 Sk .
Suppose that our observed actions are
derived from a BNE, with public information, Z, and privately known characteristics,
XJp1 , · · · , XJpn , we have the following relation,
yi = u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i ,
j6=i
which is just the case of (2.2.1) and (2.2.2), where hi (·) is an identity function.
2. (Binary Choice) Consider the case that agent i’s action can only take two values, 0 and
1. Namely, Yi = {0, 1}. In this case, for a realization of her type, player i’s strategy,
si , specifies whether to take action 1 or 0. Given a strategy profile, s = (s1 , · · · , sn ),
the payoff associated with action 0 is normalized to 0 for any player and any type.
When public information is Z = z and the realized types are xpJ1 , · · · , xpJn , if player i
6
Because i ’s are i.i.d. and are also independent of the characteristics, X’s, expectation does not change
if i is used to make predictions, as we discussed before.
7
need (2.2.4) holds on
Q According to Mas-collel, Whinston, and Green (1995), for a player i, we only
p
S
almost
everywhere
with
respect
to
the
probability
distribution
of
X
.
k
Ji
k:Ji (k)=1
18
chooses 1, her payoff will be
ri (si (xpJi ), s−i (xp−Ji )) = u(xi ) + λ
X
Wi,j sj (xpJj ) − i .
j6=i
Thus, her expected payoff when taking action 1 is
p
E[ri (si (XJpi ), s−i (X−J
))|XJpi = xpJi , Z = z]
i
=u(Xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi = xpJi , Z = z] − i .
j6=i
It is optimal to choose si (xpJi ) = 1 if and only if
u(xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi = xpJi , Z = z] > i ,
j6=i
and
si (xpJi )
= 0, otherwise. Thus, the following equation system,
si (xpJi ) = I(u(xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi = xpJi , Z = z] − i > 0),
(2.2.5)
j6=i
defines a BNE. The observed binary choices, y = (y1 , · · · , yn ), can be viewed as actions
in equilibrium. That is to say, given public information, Z, and XJp1 , · · · , XJpn ,
yi = I(u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i > 0),
j6=i
which corresponds to (2.2.1) and (2.2.2), where hi (·) is an indicator.
From the above two examples, the inter-dependence of behaviors in our model, (2.2.1)
and (2.2.2),
yi = hi (u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i ),
(2.2.6)
j6=i
can be viewed as an outcome of a reduced-from Bayesian game where a BNE satisfies
si (XJpi ) = hi (u(Xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi , Z] − i ),
j6=i
such that yi = si (XJpi ).
19
(2.2.7)
2.3
2.3.1
Equilibrium Analysis
Equilibrium and Expectations
Given that observed outcomes are realizations of an equilibrium satisfying (2.2.7), we
n
o
solve equilibrium strategies via solving E[yj |XJpi , Z] . Pick any i and k such that i 6= k.
By consistency, we get that
E[yi |XJpk , Z] = E[hi (u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i )|XJpk , Z].
(2.3.1)
j6=i
As defined, Z is a vector representing public information on group and individual features,
social relations, and information structures. XJpi is a vector representing private information
of i, composed of privately known personal characteristics. According to our assumption,
k only knows Ji but not the realization of XJpi . For her, XJpi is a random vector, and
E[yj |XJpi , Z] is a random variable. As a result, given the distribution of X p , she has to
integrate over all possible realizations of XJpi conditioning on realizations of her own information, XJpk (and Z). The conditional expectation, E[yi |XJpk , Z], is a function of the random
vector XJpi . Conditional expectations for behaviors of all group members, E[y1 |·, Z], · · · ,
E[yn |·, Z], are such functions. This distinguishes our model from previous research by Manski (1993), Brock and Durlauf (2001), and Lee, Li, and Lin (2014). Adopting BNE, with
heterogeneity in private information, expectations of yj vary with private information used
to make predictions.
Conditional on Z = z, let (Ω, F, P ) denote the probability space on which Xjp ’s are
defined. When the dimension of each Xkp is kp , XJpj is a measurable function from (Ω, F)
P
M
M
to (<Mj , B< j ), where Mj = ( nk=1 Jj (k))kp is the dimension of the vector XJpj and B< j
e ,
is the corresponding Borel set. Then for each Jj , we define a measurable function, ψi,J
j
M
e (x) = E[y |X p = x, z], for any x ∈ <Mj . Then
from (<Mj , B< j ) to (<, B< ) as ψi,J
i
Jj
j
e (X p ) : (Ω, F, P ) → (<, B , m ) is a random variable with
the composite function ψi,J
<
<
Jj
j
20
e (X p ) = E[y |X p , z]. The function ψ
ψi,J
i
i,Jj shows how the conditional expectation on yi
Jj
Jj
j
varies with realization of XJpj . Thus, we can characterize how the conditional expectation
changes with the random vector on which the expectation is based by defining a function
on a random vector. With the network connections in our model, an agent needs only to
predict the behaviors of agents with whom she is associated. Therefore, for each i, we only
need to consider conditional expectations on behaviors of her associated agents. Let’s begin
n
o
with the domain defined by Ai = XJpj : Wj,i 6= 0 . For each 1 ≤ i ≤ n, Ai ∈ Ai , there is
j, such that Wj,i 6= 0 and Ai = XJpj .8 Denote the space of all random variables on (Ω, F, P )
by C. Define for each i, ψie : Ai → C, such that, for any Ai ∈ Ai ,
e
ψie (Ai ) = ψi,J
(XJpj ) = E[yi |XJpj , z],
j
(2.3.2)
when Ai = XJpj for some j = 1, · · · , n. By collecting those functions into a vector-valued
function, we have ψ e :
Qn
i=1 Ai
→ Cn , such that for any A = (A1 , · · · , An ) ∈
Qn
i=1 Ai ,
ψ e (A) = (ψ1e (A1 ), · · · , ψne (An )).
Expectation functions are defined as conditional expectations of actions in (2.2.6). We
now investigate its relationship to a BNE. For simplicity, we will write E[yi |XJpj , z] instead
of E[yi |XJpj , Z = z] in the discussions below.
Proposition 2.3.1 Conditional on public information Z = z, s = (s1 , · · · , sn ) :
Qn
i=1 Yi
is a profile of strategies. Define vector-valued function, ψ e :
Qn
i=1 Ai
Qn
i=1 Ti
→
→ Cn such
that
ψ e (A)i = E[si (XJpj )|Ai , z],
8
(2.3.3)
If no one connects to i, that is, Wj,i = 0 for any j, others’ expectations on i’s behavior do not affect
outcomes. For example, n = 3, W2,1 = W1,2 = W3,2 = 1 and Wi,j = 0 otherwise. Agents’ behaviors are
influenced by E[y1 |XJp2 ], E[y2 |XJp1 ], and E[y2 |XJp3 ], and not by E[y3 |XJp1 ] or E[y3 |XJp2 ]. Therefore, although
we can define ψ3e , it is not relevant for the outcome of the model. To simplify notation, for general analysis
of model equilibrium, we define conditional expectations for behaviors of the whole sample and keep in mind
that we only need to do that for agents whose expected actions can influence the outcomes.
21
for any i = 1, 2, · · · , n, and A = (A1 , · · · , An ) ∈
Qn
i=1 Ai ,
where Ai = XJpj , for some j with
Wj,i 6= 0. Suppose that s is a BNE in the model, i.e., it satisfies (2.2.7), then ψ e satisfies
the consistency condition:
ψ e (A)i = ψie (Ai ) = E[hi (u(Xi ) + λ
X
Wi,j ψje (XJpi ) − i )|Ai , z],
(2.3.4)
j6=i
for any i. Additionally, ψ e (A)i = E[yi |Ai , z], where yi ’s are the actions of this equilibrium. On the reverse, suppose that the vector-valued function ψ e :
Qn
i=1 Ai
isfies ψ e (A)i = ψie (Ai ), for any i = 1, 2, · · · , n, and A = (A1 , · · · , An ) ∈
→ Cn sat-
Qn
i=1 Ai ,
and
condition (2.3.4) holds, then the strategy profile, s = (s1 , · · · , sn ), defined by si (XJpi ) =
hi (u(Xi ) + λ
p
e
j6=i Wi,j ψj (XJi )
P
− i ) is a BNE; i.e., it satisfies (2.2.7). Moreover, (2.3.2)
holds for actions associated with this equilibrium.
Proof. See Appendix A.1.
Hence, a BNE corresponds to an expectation function satisfying (2.3.4). We can analyze
equilibrium via analyzing this functional equation.
2.3.2
Existence and Uniqueness
From the above discussion, we see that there is a BNE in our model if and only if there is
a vector-valued expectation function, ψ e :
Qn
i=1 Ai
→ Cn , satisfying (2.3.4). We investigate
possible existence and uniqueness of such a mapping by functional analysis. The following
are primitive assumptions in our model.
Assumption 2.3.1 f () > 0 for all −∞ < < ∞. That is, the support for all i ’s is <.
Assumption 2.3.2 u(·) is continuous.
22
Assumption 2.3.3 For any real number x, Hi (x) = E[hi (x − )] < ∞ and is differentiable
with respect to x. Moreover, the derivative of Hi (x) is uniformly bounded in x and i; i.e.,
i (x)
| < ∞.
max1≤i≤n supx | dHdx
Suppose that conditional on Z = z, the joint distribution of X p = (X1p , · · · , Xnp ) is
Fp (·|Z = z).9 As Z will be a fixed realization for a group, we will suppress it in subsequent
analysis. Let Ψ be a set of functions such that any ψ ∈ Ψ is a vector-valued function which
maps a profile of sets A = (A1 , · · · , An ) ∈
Qn
i=1 Ai
to a random vector in Cn . It has the
0
form, ψ(A) = (ψ1 (A1 ), · · · , ψn (An )) , where for each i = 1, · · · , n, ψi : Ai → C satisfies the
following integrability condition:
Z
max
max
1≤i≤n {j:Wj,i 6=0}
|ψi (xpJj )|dFp (xp ) < ∞.
(2.3.5)
That is, ψ is integrable with respect to the distribution of private personal characteristics,
conditional on public information in the group, Z = z. From previous discussions, the
conditional expectation in equilibrium will be in Ψ once it meets requirement (3.3.4).
Define sum and scalar product operations of functions in Ψ in the conventional way. That
0
0
0
is, for any ψ, ψ ∈ Ψ, α ∈ <1 , (ψ + ψ )(A1 , · · · , An ) = ψ(A1 , · · · , An ) + ψ (A1 , · · · , An ), and
(αψ)(A1 , · · · , An ) = αψ(A1 , · · · , An ) for any (A1 , · · · , An ) ∈
Qn
i=1 Ai .
Zero is a function
which is equal to 0 almost everywhere with respect to Fp (·). It is easy to see that with these
operations, Ψ is a linear space. In addition, define
Z
kψk = max max
|(ψ)i (xpJj )|dFp (xp ),
1≤i≤n {j:Wj,i 6=0}
(2.3.6)
which is a norm from the following lemma.
Lemma 2.3.1 k·k is a well-defined norm on Ψ.
We allow correlation between public information, Z, and private characteristics, (X1p , · · · , Xnp ). When
they are independent, the conditional distribution will be the same as the unconditional distribution.
9
23
Proof. See Appendix A.1.
Moreover, (Ψ, k·k) is a complete linear normed space as the following lemma shows.
Lemma 2.3.2 The linear normed space (Ψ, k·k) is a Banach space.
Proof. See Appendix A.1.
Define an operator T on Ψ, such that
(T (ψ))i (A1 , · · · , An ) = (T (ψ))i (Ai ) = E[Hi (u(Xi ) + λ
X
Wi,j ψj (XJpi ))|Ai , z],
(2.3.7)
j6=i
where Hi (x) =
R
hi (x − )f ()d. We can see that if Hi (u(Xi ) + λ
p
j6=i Wi,j ψj (XJi ))
P
is
integrable with respect to Fp (·), T (ψ) ∈ Ψ for any ψ ∈ Ψ. If ψ e ∈ Ψ, it is a fixed point of
T . Additionally, if ψ ∈ Ψ is a fixed point of T , it satisfies the consistency condition. Thus,
a fixed point of T is an equilibrium conditional expectation (vector-valued) function. That
is to say, if we focus on functions in (Ψ, k · k), there is a one-to-one correspondence between
equilibrium conditional expectation functions and fixed points of operator T . The following
proposition shows that T can be a contraction mapping under a minor condition.
Proposition 2.3.2 Under the condition that
|λ| kW k∞ D < 1,
(2.3.8)
i (x)
where D = maxi supx | dHdx
| is well-defined under Assumption 4.3.2, T : (Ψ, k·k) →
(Ψ, k·k) is a contraction mapping. Thus, there is one unique BNE in the model.
Proof. See Appendix A.1.
For spatial networks, it is conventional to row-normalize W such that kW k∞ = 1. As
a result, how stringent the condition (2.3.8) will be depends on the upper bound, D. We
take the popular continuous choice and binary choice models as examples below.
1. (Continuous Choice) In this case, hi (x) = x, Hi (x) =
Thus, D = 1.
24
R
(x − )f ()d = x − E[].
2. (Binary Choice) When hi (x) = I(x > 0), Hi (x) = F (x). Thus, D = supx |f (x)|.
i (x)
In this case, the uniform boundedness of | dHdx
| will be due to boundedness of ’s
density. For the binary choice probit and logit models, the corresponding densities
are, respectively, the standard normal density with supx |f (x)| ≤
√1 ,
2π
and the logistic
density with supx |f (x)| ≤ 14 .
2.4
Equilibrium Solution and Computation
Proposition 2.3.2 is a key result for the general model framework. It not only provides
a sufficient condition for existence and uniqueness of BNE, but also suggests numerical
methods for estimation. That is because the fixed point of a contraction mapping can be
derived by recursive iterations beginning with an arbitrary initial guess. In this section, we
consider three different information structures and illustrate how equilibrium expectations
can be solved either analytically or numerically in each case. For all the analysis that
follows, condition (2.3.8) is assumed to be satisfied, so there is one and only one BNE.
2.4.1
All Characteristics are Publicly Known
When all personal characteristics are publicly known, XJpi = (X1p , · · · , Xnp ), for all i. An
agent has the same expectation as that of econometricians, so for any i 6= j, E[yi |XJpj , z] =
E[yi |X1p , · · · , Xnp , z]. The equilibrium conditional expectation is a vector,
0
ψ e = (E[y1 |X1p , · · · , Xnp , z], · · · , E[yn |X1p , · · · , Xnp , z]) ,
which is characterized by the consistency condition (2.3.4). That condition now reduces to
ψie = E[hi (u(Xi ) + λ
X
Wi,j ψje − i )|X1p , · · · , Xnp , z],
(2.4.1)
j6=i
for all i. This is the case in Lee, Li, and Lin (2014). For the linear model, as E[i |X1p , · · · , Xnp , z] =
0, we have a linear system ψie = u(Xi ) + λ
e
j6=i Wi,j ψi .
P
25
The equilibrium expectation vector
is
0
ψ e = (I − λW )−1 (u(X1 ), · · · , u(Xn )) .
2.4.2
(2.4.2)
Characteristics are Self-Known
In this case, each Xip is known only to i herself. We have XJpi = Xip and Ai =
n
o
p
Xjp : Wj,i 6= 0 . Xi,g
’s might be correlated across individuals. As they have the same
support, we consider the situation that they are “exchangeable” conditional on group public information, Z = z.
Assumption 2.4.1 Conditional on public information, Z = z, Xip ’s have the same support, Sp . Their conditional joint distribution, f p (X1p , · · · , Xnp |Z = z),10 is exchangeable,
i.e., for any permutation, s : {1, · · · , n} → {1, · · · , n},
p
p
f p (X1p , · · · , Xnp |Z = z) = f p (Xs(1)
, · · · , Xs(n)
|Z = z).
This includes the obvious case that Xip ’s are independent of each other. More generally,
exchangeability of random variables can be represented by Xip = α + pi , where α captures
some unobservable common features or shocks to all group members and pi are some i.i.d.
idiosyncratic shocks. This notion of exchangeability has been characterized by De Finetti
(1975).
Under Assumption 2.4.1, f (Xip |Xkp = x, Z = z) = f (Xip |Xkp0 = x, Z = z), for any k 6= i
0
and k 6= i. Hence,
Z
X
e
ψi,J
(x)
=
Hi (u(X g , Xic , y) + λ
Wi,j ψje (y))fp (y|XkP = x, Z = z)dy
k
j6=i
Z
=
Hi (u(X g , Xic , y) + λ
X
Wi,j ψje (y))fp (y|XkP0 = x, Z = z)dy
(2.4.3)
j6=i
e
=ψi,J
k
0
(x).
10
When Xip ’s are discrete random variables, f p (·|Z = z) is the probability mass function. If Xip ’s are
continuous, f p (·|Z = z) represents the distribution density.
26
where Hi (x) =
R
h(x − )f ()d. Thus, with Assumption 3.3.2, the identity of private
characteristics used to make predictions does not matter. It is its realization that influences
conditional expectations. Owing to this fact, we can directly define ψie as a mapping from
e (x) =
<kp (kp is the dimension of Xjp ’s) to <1 . To be specific, for any i, define ψie (x) = ψi,J
k
E[yi |Xkp = x, z], which is invariant with any k 6= i. Similarly, define ψ e : <nkp → <n
as ψ e (x1 , · · · , xn )i = ψie (xi ). Because this is a special case of the general framework, our
previous analysis about equilibrium existence and uniqueness applies here. In this case,
consistency condition (2.3.4) reduces to
Z
X
e
ψi (x) = Hi (u(X g , Xic , y) + λ
Wi,j ψje (y))fp (y|x)dy,
(2.4.4)
j6=i
where fp (y|x) is the density of Xip conditional on Xjp = x for any j 6= i and Z = z. So
with exchangeability on Xip ’s, one needs to consider the solution of n functions, ψ1e , · · · , ψne ,
characterized by (2.4.4). We investigate possible quantitative solutions in three different
cases that follow.
Independent Characteristics
Given Z = z, if the characteristics Xjp ’s are independent of each other, fp (y|x) =
fp (Xkp = y|Xip = x, z) is the same as fp (y) = fp (Xkp = y|z) without x. Hence, ψie (x) =
R
Hi (u(X g , Xic , y) + λ
e
j6=i Wi,j ψj (y))fp (y)dy.
P
This means that ψie (x) is a constant func-
tion of x, whose value will be denoted as ψie . The equilibrium expectation vector, ψ e =
0
(ψ1e , · · · , ψne ) , is characterized by
Z
X
e
ψi = Hi (u(X g , Xic , y) + λ
Wi,j ψje )fp (y)dy,
(2.4.5)
j6=i
for i = 1, · · · , n. For the binary choice model, it can be solved by contraction mapping
iteration similar to that in Lee, Li and Lin(2014). For the linear model, (2.4.5) is a linear
27
equation system which has an analytical solution,
0
ψ e = (I − λW )−1 (E[u(X g , X1c , y)], · · · , E[u(X g , Xnc , y)]) .
(2.4.6)
Compared with (2.4.2), instead of true realizations, expectations are used in (2.4.6). That
is because Xip is not observed by any agent other than i in the group. If there are no
publicly known individual characteristics, Xic , the expectations on the right hand side of
(2.4.6) would be the same. We can rewrite (2.4.6) as ψ e = E[u(X g , y)](I − λW )−1 ln , where
n is the group population and ln is the n × 1 vector composed of 1’s. Therefore, when W
were row-normalized, ψ e = E[u(X g , y)]ln /(1 − λ). Then all the components in ψ e would be
equal. This would be the case in Brock and Durlauf (2001). If W were not row-normalized
and all its entries were 0 or 1 values, ψ e would be proportional to the vector of “centrality
measures,” as in Ballester et. al. (2010)
Correlated Discrete Characteristics
In general, Xip ’s may be correlated. In this subsection, we focus on the case that Xip has
a finite support. As a result, ψie (·) is a function defined on a finite set and can be viewed
as a vector. Suppose that the support of Xip is xl : 1 ≤ l ≤ m , where xl is a vector with
specific values and m is the number of support points. The conditional probability function,
fp (y|x), is fully captured by the transition matrix,

p11 p12 · · ·
 ..
..
..
P = .
.
.

p1m
..  ,
. 
pm1 pm2 · · · pmm
0
where pll0 = prob(Xip = xl |Xkp = xl ), for k 6= i. With P , (2.4.4) becomes
ψie (xl )
=
m
X
0
pll0 Hi (u(X g , Xic , xl ) + λ
0
X
0
Wij,g ψje (xl )),
j6=i
l =1
which characterizes the equilibrium expectations vector,
0
ψ e = (ψ1e (x1 ), · · · , ψne (x1 ), · · · , ψ1e (xm ), · · · , ψne (xm )) .
28
(2.4.7)
For the linear model, ψ e is an nm × 1 vector and can be solved analytically from the linear
system,
ψie (xl ) =
m
X
0
pll0 (u(X g , Xic , xl ) + λ
X
0
0
Wi,j ψje (xl )),
(2.4.8)
j6=i
l =1
for i = 1, · · · , n, l = 1, · · · , m. To be specific, define
0
u = (u(X g , X1c , x1 ), · · · , u(X g , Xnc , x1 ), · · · , u(X g , X1c , xm ), · · · , u(X g , Xnc , xm )) .
Then (2.4.8) can be written as ψ e = (P ⊗In )u+λ(P ⊗W )ψ e ,where In is the n×n-dimension
identity matrix. As kP ⊗ W k∞ ≤ kP k∞ kW k∞ = kW k∞ , |λ(P ⊗ W )k∞ < 1 under (2.3.8).
So I − λ(P ⊗ W ) is invertible. Hence,
ψ e = (I − λ(P ⊗ W ))−1 (P ⊗ In )u.
(2.4.9)
Correlated Continuous Characteristics
Now we consider the case where Xip ’s are continuous random variables. In this case,
the equilibrium expectations, ψie for i = 1, · · · , n, in (2.4.4) are functions. Under some
stochastic structure for the simpler linear model, analytical solutions may still be possible.
Here we consider a linear model where the expectation of Xip conditional on Xjp and Z is
linear in Xjp .
Assumption 2.4.2 hi (x) = x for any x and all i; i.e., the model is linear.
Assumption 2.4.3 E[i ] = 0 for any i.
0
Assumption 2.4.4 u(X g , Xic , Xip ) = v(X g , Xic ) + Xip β. That is, individual own effect is
separable in public information and private information, and is linear in self-known characteristics, Xip .
29
Assumption 2.4.5 In addition to Assumption 2.4.1, E[Xip |Xjp , z] = µ + CXjp , for any
i 6= j, where the dimension of Xip is kp × 1, µ is a kp × 1 vector and C is a kp × kp matrix.
Both µ and C may depend on Z = z.
One multivariate distribution that satisfies Assumption 2.4.5 is the joint normal distribution. Suppose that conditional on Z = z, (Xip , Xjp ) with i 6= j are jointly nor
Σ1 Σ2
0
0 0
. Then, E[Xip |Xjp , z] =
mal with mean (e
µ ,µ
e ) and variance-covariance matrix
0
Σ2 Σ1
p
(I − Σ2 Σ−1
µ + Σ2 Σ−1
1 )e
1 Xj .
Proposition 2.4.1 Under Assumptions 2.4.1 - 2.4.5, the unique equilibrium conditional
expectation is linear, i.e.,
0
ψie (X) = ai + bi X,
0
0
0
(2.4.10)
0
for any i = 1, · · · , n. b = (b1 , · · · , bn ) and a = (a1 , · · · , an ) are characterized by
0
0
b = (Inkp − λ(W ⊗ C ))−1 (ln ⊗ C β),
(2.4.11)
and
0
0
a = (In − λW )−1 (v + (µ β)ln + λ(W ⊗ µ )b),
(2.4.12)
0
where v = (v(X g , X1c ), · · · , v(X g , Xnc )) and ln is an n × 1 vector of ones.
Proof. See Appendix A.1.
In general, equilibrium expectation cannot be solved analytically. We need to use numerical methods. When self-known characteristics are continuously distributed, the equilibrium
expectation is a vector-valued function. Inspecting model specifics, we find that for each
i = 1, · · · , n, ψie (·) is expressed as an integration, which can be approximated by Gaussian
quadratures. Suppose that there are K abscissae, x1 , ..., and xK . We first solve the vector
0
(ψie (x1 ), · · · , ψie (xK )) by contraction mapping as we did for discrete characteristics in the
30
last section. Then we approximate ψie (x) at any x in the support by associating those
abscissae with corresponding weights. A detailed discussion on numerical methods is in
Appendix A.2. In Monte Carlo experiments, we take K = 8.
2.4.3
Socially-Known Characteristics
Consider the case that Xip is known to i herself and any k who is linked to i, Wk,i 6= 0.
The information structure in this case is Ji (l) = 1 if l = i or Wi,l 6= 0; and Ji (l) = 0,
otherwise. Use XJpi to represent the vector of Xjp ’s which is known to i. Consistency for
equilibrium conditional expectation can be written as follows: for any k 6= i and Wk,i 6= 0,
ψie (XJpk )
e
=ψi,J
(XJpk )
k
e
=ψi,J
(Xkp , Xip , (Xlp : l 6= i, l 6= k, Wk,l 6= 0))
k
=E[Hi (u(Xi ) + λ
X
Wi,j ψje (XJpi ))|XJpk ]
(2.4.13)
j6=i
Z
=
Hi (u(X g , Xic , Xip ) + λ
X
e
Wi,j ψj,J
(Xip , (Xlp0 : Wi,l0 6= 0)))
i
j6=i
fp (Xlp0
0
: l 6= k, Wi,l0 6= 0, Wk,l0 = 0|Xkp , Xip , (Xlp : l 6= i, l 6= k, Wk,l 6= 0), Z = z)
0
d(Xlp0 : l 6= k, Wi,l0 6= 0, Wk,l0 = 0).
To solve the equilibrium conditional expectation is to solve those functions for all those
i and k who are linked and their associated information structures. When all Xip ’s are
discrete, we can solve them as vectors. For continuous variables in Xip , we can either
discretize Xip ’s or use our Gaussian quadrature approximation. What is different from
previous discussion about private information is that the number of such functions increases,
because every agent i can be associated with many agents who have different social relations
and information structures.
31
Independent Characteristics
When Xip ’s are independent of each other, conditional on Z = z, the consistency condition (2.4.13) can be simplified as
e
(Xkp , Xip , (Xlp : l 6= i, l 6= k, Wk,l 6= 0))
ψie (XJpk ) =ψi,J
k
Z
X
e
= Hi (u(X g , Xic , Xip ) + λ
Wi,j ψj,J
(Xip , (Xlp0 : Wi,l0 6= 0)))
i
j6=i
fp (Xlp0
0
0
: l 6= k, Wi,l0 6= 0, Wk,l0 = 0|Z = z)d(Xlp0 : l 6= k, Wi,l0 6= 0, Wk,l0 = 0).
(2.4.14)
We see that the right hand side of the above equation is a function of those Xlp ’s that are
e
known to both i and k. Hence, effectively, the function ψi,J
may have fewer arguments
k
e (X p , X p , (X p : l 6= i, l 6= k, W
than ψie (XJpk ) = ψi,J
k,l 6= 0)). Actually, it can be written as
i
k
l
k
e (X p , X p , (X p : l 6= i, l 6= k, W 6= 0, W
ψi,J
i,l
k,l 6= 0)).
i
k
l
k
Hence, for each i and k with Wk,i 6= 0, we only need to know how k’s expectation on yi
varies with Xlp ’s which are known to both i and k. This feature reduces the dimension of
function domains and hence can alleviate computational burden.
To demonstrate the equilibrium solution, we provide an illustration of a simpler case.
Consider a simply network made of n = 3 agents. The social matrix W is such that Wi,i = 0
0
for all i = 1, 2, 3; for i 6= j, W2,3 = W3,1 = 0 and Wi,j 6= 0 otherwise. Thus, J1 = (1, 1, 1) ,
0
0
J2 = (1, 1, 0) and J3 = (0, 1, 1) . Namely, agent 1 knows all agents. Agent 2 knows the most
informative agent, agent 1. However, agent 3 knows only agent 2. To solve equilibrium, we
solve the following equation system:11
e
e
ψ2,J
(X1p , X2p ) = H2 (u(X g , X2c , X2p ) + λW2,1 ψ1,J
(X1p , X2p )),
1
2
(2.4.15)
e
e
ψ3,J
(X2p , X3p ) = H3 (u(X g , X3c , X3p ) + λW3,2 ψ2,J
(X2p )),
1
3
(2.4.16)
11
We apply two properties of equilibrium expectation functions here. First, i makes a prediction on j only
if i links herself to j; i.e., Wi,j 6= 0. Second, when k is making predictions on i, only the characteristics
which they both know affect the expectation.
32
e
ψ1,J
(X1p , X2p ) =
2
Z
e
e
H1 (u(X g , X1c , X1p ) + λW1,2 ψ2,J
(X1p , X2p ) + λW1,3 ψ3,J
(X2p , X3p ))
1
1
· fp (X3p |Z = z)dX3p ,
(2.4.17)
and
e
ψ2,J
(X2p )
3
Z
=
e
H2 (u(X g , X2c , X2p ) + λW2,1 ψ1,J
(X1p , X2p ))fp (X1p |Z = z)dX1p .
2
(2.4.18)
This is an equation system of four functions. Although Xip ’s are independent of each
other, equilibrium conditional expectations are much different from those when Xip ’s are
only self-known. We need to solve expectations under each possible information structure,
e , with W
ψi,J
k,i 6= 0, as a function.
k
For computation of equilibrium expectations, consider the case that Xip ’s are discrete.
Suppose that conditional on Z = z, Xip ’s are i.i.d. distributed, taking m possible values,
x1 , · · · , xm , with probabilities, pl = prob(Xip = xl |Z = z), for any i = 1, · · · , n and l =
1, · · · , m. Then (2.4.17), (2.4.15), (2.4.18), and (2.4.16) can be written as follows:
0
0
0
e
e
ψ2,J
(xl , xl )),
(xl , xl ) = H2 (u(X g , X2c , xl ) + λW2,1 ψ1,J
2
1
0
∗
(2.4.19)
0
∗
e
e
ψ3,J
(xl , xl ) = H3 (u(X g , X3c , xl ) + λW3,2 ψ2,J
(xl )),
1
3
0
e
ψ1,J
(xl , xl ) =
2
m
X
0
(2.4.20)
0
∗
e
e
pl∗ H1 (u(X g , X1c , xl ) + λW1,2 ψ2,J
(xl , xl ) + λW1,3 ψ3,J
(xl , xl )),
1
1
l∗ =1
(2.4.21)
and
e
ψ2,J
(xl
3
0
)=
m
X
0
0
e
pl H2 (u(X g , X2c , xl ) + λW2,1 ψ2,J
(xl , xl )).
1
(2.4.22)
l=1
e ’s as vectors, for instance,
We can define ψi,J
k
0
e
e
e
e
e
= (ψ1,J
(x1 , x1 ), · · · , ψ1,J
(x1 , xm ), · · · , ψ1,J
(xm , x1 ), · · · , ψ1,J
(xm , xm )) ,
ψ1,J
2
2
2
2
2
and solve them by contraction mapping iteration. For the case of continuous characteristics,
other considerations, such as quadrature method, are possible.
33
Correlated Characteristics
It is possible that Xip ’s are correlated with each other. We illustrate the equations of
equilibrium expectations by using the same network relations in the last section. In that
small group made up of n = 3 agents, W11 = W22 = W2,3 = W3,1 = W3,3 = 0 and Wi,j 6= 0
0
0
0
otherwise. So J1 = (1, 1, 1) , J2 = (1, 1, 0) and J3 = (0, 1, 1) . Then we have the following
equations for equilibrium conditional expectations:
e
e
ψ2,J
(X1p , X2p ) = H2 (u(X g , X2c , X2p ) + λW2,1 ψ1,J
(X1p , X2p )),
1
2
(2.4.23)
e
e
(2.4.24)
ψ3,J
(X2p , X3p ) = H3 (u(X g , X3c , X3p ) + λW3,2 ψ2,J
(X2p , X3p )),
1
3
Z
p
p
e
e
e
ψ1,J
(X2p , X3p ))
(X
,
X
)
=
H1 (u(X g , X1c , X1p ) + λW1,2 ψ2,J
(X1p , X2p ) + λW1,3 ψ3,J
1
2
1
2
1
fp (X3p |X1p , X2p , Z = z)dX3p ,
(2.4.25)
and
e
ψ2,J
(X2p , X3p )
3
Z
=
e
H2 (u(X g , X2c , X2p ) + λW2,1 ψ2,J
(X1p , X2p ))fp (X1p |X2p , X3p , Z = z)dX1p .
1
(2.4.26)
Compared with (2.4.15), (2.4.16), (2.4.17), and (2.4.18), we see that contrary to the
e
independent case, ψ2,J
varies with not only X2p but also X3p . Hence, dimension of the
3
domain of equilibrium conditional expectation can be higher with correlated characteristics
than that with independent characteristics.
To show how to solve equilibrium conditional expectation, let us consider the case
that Xip ’s are discrete random variables. Keep on assuming “exchangeability”. The joint
distribution of those discrete random variables can be represented as P (X1p , · · · , Xnp ) =
R
P (X1p |η) · · · · · P (Xnp |η)dFη (η), where η is some common factor. Then we can simplify
notations for conditional probabilities. Assume that the support for each Xip conditional
34
on Z = z is x1 , · · · , xm , we denote
0
∗
pl∗ |ll0 = P rob(Xip3 = xl |Xip1 = xl , Xip2 = xl , Z = z)
0
for any i1 , i2 , i3 = 1, 2, 3, i1 6= i2 , i1 6= i3 , i2 6= i3 , l, l , l∗ = 1, · · · , m. In this case, (2.4.23),
(2.4.24), (2.4.25), (2.4.26) can be written as
0
0
0
e
e
ψ2,J
(xl , xl ) = H2 (u(X g , X2c , xl ) + λW2,1 ψ1,J
(xl , xl )),
1
2
0
∗
0
∗
(2.4.27)
∗
e
e
ψ3,J
(xl , xl ) = H3 (u(X g , X3c , xl ) + λW3,2 ψ2,J
(xl , xl )),
1
3
0
e
ψ1,J
(xl , xl ) =
2
m
X
0
(2.4.28)
0
∗
e
e
pl∗ |ll0 H1 (u(X g , X1c , xl ) + λW1,2 ψ2,J
(xl , xl ) + λW1,3 ψ3,J
(xl , xl )),
1
1
l∗ =1
(2.4.29)
and
0
∗
e
ψ2,J
(xl , xl ) =
3
m
X
0
0
e
pl|l0 l∗ H2 (u(X g , X2c , xl ) + λW2,1 ψ2,J
(xl , xl )).
1
(2.4.30)
l=1
e ’s
ψi,J
k
Regarding
as vectors, they can be solved analytically for linear models or numer-
ically by contraction mapping iteration for binary choice models.
2.5
2.5.1
Identification, Likelihood, and Estimation
Identification
For identification, we focus on how to recover the model primitives, the function u(·),
the intensity of social interaction effects, λ, and the distribution of the exogenous variables,
FX, (·), from the data. In the following discussion, we use Y , X c and X g to denote the
matrices of observations of those endogenous and exogenous variables for a whole group.
The vector, X g , represents features of a group. Individual observations are represented
by Yi , Xic and Xip . All the exogenous characteristics of a group are denoted as X =
0
ln ⊗ X g , X c , X p , where n is the size of the group; ln is an n × 1 vector composed by
0
0 0
0
0 0
1’s; X c = (X1c , · · · , Xnc ) and X p = (X1p , · · · , Xnp ) .
35
Analogous to Brock and Durlauf (2007), observational equivalence is defined based on
the distribution of observed variables and consistency of equilibrium expectations.
Definition 2.5.1 Given social relations, W = W , (u, λ, F,X ) is observationally equivalent
e Fe,X ) at W , if they imply the same distribution of observables,
to (e
u, λ,
e Fe,X ).
FY,X|W (·, ·; u, λ, F,X ) = FY,X|W (·, ·; u
e, λ,
(2.5.1)
Following from (2.5.1), they are consistent with the same equilibrium expectation, ψ e (·),
Z
X
W i,j ψje (XJpi ) − i )f,X p |X p ,X g ,X c (i , XJpi )ddXJpi
ψie (XJpk ) = hi (u(Xi ) + λ
Ji
j6=i
Z
=
e
hi (e
u(Xi ) + λ
X
Jk
(2.5.2)
W i,j ψje (XJpi )
− i )fe,X p |X p
j6=i
Ji
Jk
p
p
,X g ,X c (, XJi )ddXJi ,
for any Y and X in their support. For some X in the support of X, we say that (u, λ, F,X )
e Fe,X ) are observationally equivalent at (X, W ), if
and (e
u, λ,
(2.5.10 )
e Fe,X ),
FY |X,W (·, ·; u, λ, F,X ) = FY |X,W (·, ·; u
e, λ,
and (2.5.2) holds for X = X.
e Fe,X ) 6= (u, λ, F,X ) is not
u, λ,
Definition 2.5.2 (u, λ, F,X ) is identifiable at W if any (e
observationally equivalent to (u, λ, F,X ) at W = W . (u, λ, F,X ) is identifiable at (X, W )
e Fe,X ) 6= (u, λ, F,X ) is not observationally equivalent to (u, λ, F,X ) at X = X
if any (e
u, λ,
and W = W .
Our discussions of identification are based on the following assumptions.
Assumption 2.5.1 is independent of X. Moreover, has a full support.
Assumption 2.5.2 FX can be identified from observations of X.
0
0
0
Assumption 2.5.3 u(·) is a linear function of the form u(Xi ) = β0 +Xic β1 +Xip β2 +X g β3 .
36
We focus on identification of two models, the continuous choice model, hi (z) = z, and
the binary choice model, hi (z) = I(z > 0), for all i and z. For each model, we first
discuss the special case that all exogenous characteristics are public information to group
members, and then the general structure of incomplete information. Additionally, since X g
is constant for all members within the same group, if there is just one single group, it will
be absorbed by the constant term. As a result, we categorize the sample network structures
into two classes, depending on whether W is block-diagonal or not. The block-diagonal case
corresponds to a sample consisting of several independent groups. Variation across different
groups help identify the effect of group-level features. But of course, for a single network,
the effect of group-level features is unidentified.
Linear Model for Continuous Choices
We focus our analysis on a single network W and then comment on the case with
many different networks. In a single network, X g is absorbed by the constant term. By
0
0
Assumption 2.5.3, u(Xi ) = β0 + Xic β1 + Xip β2 .
If all exogenous characteristics are publicly known to group members, u(Xi ) = β0 +
0
Xic β1 . Given the network structure, W , and the commonly known individual characterisc
tics, X c = X , we try to identify the true parameters, β0∗ , β1∗ , λ∗ , and F∗ from the observable
c
data about Y . To ensure identification, we impose additional assumptions on X and W .
c
Assumption 2.5.4 ln , W ln , X , W X
c
has full column rank, where n is the size of
c
this group and ln is an n × 1 vector of 1’s. When W is row-normalized, ln , X , W X
c
has full column rank.
c
Proposition 2.5.1 Suppose that all exogenous variables are public information. For X in
the support of X c , and a social network matrix, W = W , of a single group, if Assumptions
37
c
2.3.8, 2.5.1, 2.5.2, 2.5.3, and 2.5.4 hold, (β0∗ , β1∗ , λ∗ , F∗ ) is identified at (W , X ) for the
linear model.
Proof. See Appendix A.3.
In the case that some characteristics are privately known, we need to impose restrictions
on X p in order to ensure identification.
c
p
Assumption 2.5.5 Given an information structure, J = J, X c = X , X p = X and
c
p
W = W , E[Yi |J = J, W = W , X c = X , XJpk = X Jk ] can be identified (nonparametrically),
for any i, k = 1, · · · , n, i 6= k.
c
p
Assumption 2.5.6 ln , X , X , E p has full column rank, where E p is an n×1 vector,
whose i-th component is
P
j6=i W i,j E[Yj |J
c
p
= J, W = W , X c = X , XJpi = X Ji ].
c
p
Proposition 2.5.2 Given an information structure, J, X in the support of X c , X in the
support of X p , and a social network matrix, W = W , if Assumptions 2.3.8, 2.5.1, 2.5.2,
c
p
2.5.3, 2.5.5, and 2.5.6 hold, (β0∗ , β1∗ , λ∗ , F∗ ) is identified at (J, W , X , X ) for the linear
model.
Proof. See Appendix A.3.
From Proposition 2.5.2, the key to ensure identification is to make sure the relevant
conditional expectations are not linearly dependent on relevant exogenous characteristics. In
some special cases, more elementary sufficient conditions can be derived. For example, when
Xip is known only to i and is a discrete random variable with a finite support, the conditional
expectation can be solved analytically as (2.4.9). Then we can derive a new set of sufficient
c
conditions for identification. For W = W , X c = X , suppose that X p has a finite support,
0
0 0
0
0
X p,s = (X p,1 , · · · , X p,m ) , where X p,1 , · · · , X p,m are m possible values that X p many
b = lnm , lm ⊗ X c , X p,s ⊗ ln .
take. The corresponding transition matrix is P . Define X
38
b (P 2 ⊗ W )X
b has full column rank. When
Assumption 2.5.7 The matrix (P ⊗ In )X,
c
c
W is row-normalized, lnm , lm ⊗ X , (P X p,s ) ⊗ ln , lm ⊗ (W X ), (P 2 X p,s ) ⊗ ln
has full column rank.
Proposition 2.5.3 Suppose that Xip is only revealed to i, for i = 1, · · · , n, and the distribution of Xip ’s is exchangeable with a finite support, X p,s , and a transition matrix, P . For
c
X in the support of X c , and a social network matrix for a single group, W = W , if Asc
sumptions 2.3.8, 2.5.1, 2.5.2, 2.5.3, and 2.5.7 hold, (β0∗ , β1∗ , λ∗ , F∗ ) is identified at (W , X )
for the linear model.
Proof. See Appendix A.3.
In Section 2.4, we discussed another special case when Xip ’s are continuously distributed.
According to Proposition 2.4.1, when E[Xip |Xjp , z] = µ+CXjp , the equilibrium expectations,
0
ψie (x) = ai + bi x, i = 1, · · · , n, are linear in private characteristics Xip , which is random.
c
c
0
0
For X c = X and W = W , a = (a1 , · · · , an ) = (In − λ∗ W )−1 (β0∗ ln + X β1∗ + (µ β2∗ )ln +
0
0
0
0
0
0
λ∗ (W ⊗ µ )b), and b = (b1 , · · · , bn ) = (In ⊗ Ikp − λ∗ (W ⊗ C ))−1 (ln ⊗ C β2∗ ).
We provide sufficient conditions for identification in this special case. Denote B = (b1 , · · · , bn ),
0
0
Gn = (In − λ∗ W )−1 W and Gc,n = (In ⊗ Ikp − λ∗ W ⊗ C )−1 (W ⊗ C ). We consider the
following two conditions.
0
0
c
Assumption 2.5.8 ln ⊗ C , Gc,n (ln ⊗ C )β2∗ and ln , X have full column ranks.
0
0
0
c
c
Assumption 2.5.9 ln ⊗C and ln , X , Gn [ln β0∗ + X β1∗ + (ln β2∗ + B )µ] have full column ranks.
Proposition 2.5.4 Suppose that Xip is only revealed to i, for i = 1, · · · , n, and the distribution of Xip ’s is exchangeable with a continuous density satisfying Assumption 2.4.5. For
39
c
X in the support of X c , and a social network matrix, W = W , (β0∗ , β1∗ , λ∗ , F∗ ) is identified
c
p
at (W , X , X ) for the linear model, if Assumptions 2.3.8, 2.5.1, 2.5.2, 2.5.3, and 2.5.8
hold, or Assumptions 2.3.8, 2.5.1, 2.5.2, 2.5.3, and 2.5.9 hold.
Proof. See Appendix A.3.
Those rank identification conditions, 2.5.8 and 2.5.9, can be simplified when W is rownormalized. In that case, W ln = ln , so Gn ln = ln /(1 − λ∗ ), and
0
0
0
0
Gc,n (ln ⊗ C ) = (In − λ∗ W ⊗ C )−1 (W ⊗ C )(ln ⊗ C )
0
0
= (In − λ∗ W ⊗ C )−1 (ln ⊗ C )C
0
0
0
0
= (ln ⊗ C )(Ikp − λ∗ C )−1 C ,
because
∗
0
(In − λ W ⊗ C )
−1
0
(ln ⊗ C ) =(
∞
X
0
0
(λ∗ W ⊗ C )m )(ln ⊗ C )
m=0
0
=(ln ⊗ C )
∞
X
0
(λ∗ C )m
m=0
0
0
=(ln ⊗ C )(Ikp − λ∗ C )−1 .
When W is row-normalized, the first part in Assumption 2.5.8 would not hold. This is
0
0
0
because the column rank of (ln ⊗ C ) Ikp , (Ikp − λ∗ C )−1 C β2∗ cannot exceed kp and does
not have full column rank. In this case, identification depends on Assumption 2.5.9, which
0
is simplified to ln ⊗ C and
0
0
c
c
ln , X , ln (β0∗ + µ β2∗ )/(1 − λ∗ ) + Gn (X β1∗ + B µ) have full column ranks.
In the special case that the Xjp ’s are independent of each other, B = 0 and Gc,n = 0.
The identification conditions, 2.5.8 and 2.5.9 would be invalid. This is because there are
only some linear combinations of parameters that can be identified through the equilibrium
conditional expectations. In the proof of Appendix A.3, it is obvious that in this case,
(A.3.1) is not useful. Through (A.3.2), one can identify β1∗ , λ∗ and a linear combination
0
β0∗ + µ β2∗ . This becomes more transparent in the implied equilibrium, (2.4.6), which has
40
c
0
c
0
ψ e = (I − λ∗ W )−1 (ln β0∗ + X β1∗ + ln µ β2∗ ) = (I − λ∗ W )−1 [ln (β0∗ + µ β2∗ ) + X β1∗ ]. However,
identification of each parameter can be achieved by considering (2.5.10 ), because in that
case
c
p
c
p
Yn = ln β0∗ + X β1∗ + X β2∗ + λ∗ W ψ e − c
0
= ln β0∗ + X β1∗ + X β2∗ + λ∗ Gn [ln (β0∗ + µ β2∗ ) + X β1∗ ] − .
Under the condition that
c
p
c
0
ln , X , X , Gn (ln (β0∗ + µ β2∗ ) + X β1∗ )
(2.5.3)
has full column rank, parameters can be identified. We note that in this case, the presence
of X c plays a key role in identification. If X c were not relevant, i.e., β1∗ = 0, then when
either λ∗ = 0, or W is row-normalized, the equilibrium expectations become a constant
0
0
vector, ψ e = (In − λ∗ W )−1 ln (β0∗ + µ β2∗ ) = ln (β0∗ + µ β2∗ )/(1 − λ∗ ). Then the model is simply
0
Y = ln [β0∗ + λ∗ (β0∗ + µ β2∗ )/(1 − λ∗ )] + X p β2∗ − . It can be seen that λ∗ would not be revealed
in this case.
When the whole sample is formed by many independent groups, social relationships of
the whole sample can be represented by a block-diagonal matrix, W . β3∗ can be identified
from the variation of X g among different groups.
Binary Choices
For the binary choice model, since a lot of information about y ∗ is missed, identification
is more difficult. Our discussion will focus on the case with a parametric distribution of the
location-scale family, which is used frequently in empirical studies.
Assumption 2.5.10 The idiosyncratic shocks, i ’s, are i.i.d. with support <. The distri
bution belongs to a location-scale family, f () = σ1 fs ( −µ
σ ), where µ and σ are known and
normalized respectively to be equal to 0 and 1. fs (·) is a known standard density function
in this family.
41
Since fs (·) is a known function, Fs (·) is known. By taking the inverse, Fs−1 , on choice
probabilities which are identifiable from the data set, we can derive a linear equation. That
helps us derive sufficient conditions for identification on model parameters, β ∗ and λ∗ .
Proposition 2.5.5 Suppose that all exogenous variables are public information. For X
c
in the support of X c , and a social network matrix, W = W , if Assumptions 2.3.8, 2.5.1,
c
c 2.5.2, 2.5.3, and 2.5.10 hold and ln , X , W E[Y |W = W , X c = X ] has full column
c
rank, (β0∗ , β1∗ , λ∗ ) is identified at (W , X ) for the binary choice model.
Proof. See Appendix A.3.
c
Proposition 2.5.6 Given a structure of incomplete information, J, X in the support
of X c , X
p
in the support of X p , and a social network matrix, W = W , if Assump-
tions 2.3.8, 2.5.1, 2.5.2, 2.5.3, 2.5.5, 2.5.6, and 2.5.10 hold, (β0∗ , β1∗ , β2∗ , λ∗ ) is identified
c
p
at (J, W , X , X ) for the binary choice model.
Proof. See Appendix A.3.
In addition to identification for classical reference, one may also consider identification
via Bayesian inference. Bayesian analysis is concerned with how sample information can
improve the posterior distribution of parameters over the prior distributions. The posterior distribution can be simulated by the Metropolis-Hastings algorithm. Details are in
Appendix A.5.
2.5.2
Estimation
For identification and estimation, we consider a parametric model where the direct
utility, u(Xi ; β), is a parametric function with parameter, β, and f (; σ) and fp (X p ; η, Z)
are parametric density functions with parameters σ and η.
42
From the econometrician’s point of view, exogenous characteristics are observable, and
observed outcomes, yi = hi (u(Xi ; β) + λ
p
j6=i Wi,j E[yj |XJi , Z] − i,g ),
P
are stochastic due to
the unobserved i . As i ’s are i.i.d. within a group, conditional on exogenous characteristics, likelihoods of individual outcomes are independent of each other. Therefore, the log
likelihood is
log Lc (β, λ, σ, η|y1 , · · · , yn , X1 , · · · , Xn ) =
X
log L(yi |X, β, λ, σ, η).
(2.5.4)
i
In the case that all X’s are public information in the group, an agent does not need
to integrate over unknown characteristics when making predictions. Then, the distribution
of yi , for i = 1, · · · , n, conditional on X, does not depend on η. When Xip ’s are privately
known, the conditional likelihood function will depend on η. For econometricians, all Xip ’s
are observable. η can be correctly inferred from the sample of X p . Hence, econometricians
can plug the estimated value of η into the likelihood (2.5.4). Alternatively, one may set
up the full likelihood function of Y and X p for estimation, if the distribution of X p is
known,
P
i log L(yi |X, β, λ, σ, η)
+ log L(X1p , · · · , Xnp |η). The full likelihood may achieve
efficiency, but optimization will be over a higher dimensional parameter space. The twostep estimation procedure can be computationally simpler. The estimate will be consistent
but there may be some loss of efficiency.
When calculating the likelihood (2.5.4), we need to solve for equilibrium expectations,
E[yj |XJpi , Z], as a fixed point. For estimation, we nest the fixed point solution within an ML
estimation. We solve for the unique fixed point with a given parameter vector and calculate
likelihood at that parameter vector. Then we search for the vector which maximizes the
likelihood. This method is a “nested fixed point” (NFXP) algorithm, frequently used in
estimating dynamic discrete choice models, such as in Rust (1987). By using NFXP, we
fully solve (conditional) expectation functions. Since there is a one-to-one correspondence
43
between (conditional) expectation function and a BNE in the underlying simultaneousmove game with incomplete information, we fully solve for equilibrium when estimating
the model.12 Since the likelihood of yi ’s are independent of each other, consistency and
asymptotic normality of the estimator can be derived in a conventional way as for crosssectional data.
2.6
Monte Carlo Experiments
We investigate finite sample properties of the MLE by Monte Carlo experiments.
As usual, u(·) is assumed to be a linear function, u(Xi ) = β0 + Xic β1 + Xip β2 , where the
observed group feature, X g , is absent. The idiosyncratic shocks, i ’s are i.i.d. distributed
with a pdf of φ(·; σ), where φ(·; σ) is a normal density with zero mean and standard deviation
σ > 0. The true parameter values are β0 = 0, β1 = 1, β2 = 1, λ = 0.3, and σ = 1.
The social weighting matrix, W , represents network relations, and is composed of 0’s
and 1’s and then row-normalized. We consider two types of networks. In the first case,
W can be organized as a block diagonal matrix, as the whole sample is a collection of a
number of independent groups. In our experiments, each group has the same size of n = 20.
Within a group, for every agent, 3 other agents are randomly selected to be linked to her,
represented as F = 3. The number of independent groups, G, increases from 100 to 500.
The second case has a single network with nodes of either n = 200 or n = 1000. We try
two different settings on the number of links an agent can make. Under one assumption,
everyone links with a fixed number of agents, F . We choose F = 30 for n = 200. When
12
Our method differs from some indirect approaches for estimating incomplete information games. For the
discrete choice case, first, some semi- or non-parametric methods are used to derive a consistent estimate of
choice probabilities, and then those estimates are substituted back to estimate model parameters, such as
Bajari et al (?) and Aradilla-Lopez (2010). However, for such indirect approached, a consistent estimate of
choice probabilities often requires a large repetition of the same game played by the same people. It is hard
to meet that requirements in the context of social interactions, as network links are specific for a group of
players. For panels, networks might change over time.
44
n = 1000, we look at F = 30 and F = 150. In another case, the number of associations an
agent can make is randomly determined. We allow that number to take any integer from 0
to an upper bound, U F with equal probabilities. U F = 59 for n = 200. For n = 1000, we
investigate cases when U F = 59 and 299.
The commonly known individual characteristics, X c , are generated as independent standard normal variables. For Xip , we consider two cases. In one case, it is a discrete variable.
In the other case, it is a continuous variable. In the discrete case, Xip is dichotomous, taking
values 0 or 1. The realizations of Xip ’s are determined in the following way: 40% of agents
in a group (or block if W is block diagonal) are picked randomly to have Xip = 1; otherwise, Xip = 0. For the design, the distribution of Xip ’s is “exchangable”, with a transition
(conditional) probability matrix:
P (X2p = 1|X1p = 1) P (X2p = 0|X1p = 1)
=
P =
P (X2p = 1|X1p = 0) P (X2p = 0|X1p = 0)
n1 −1
n−1
n1
n−1
n−n1
n−1
n−n1 −1
n−1
!
,
where n is the number of agents in the group and n1 = 0.4n. Members know the joint
distribution of Xip ’s. Because both n and n1 are observable, econometricians can infer that
distribution from data. Thus, we focus on the likelihood of yi ’s conditional on regressors
for estimation.
p
For the case with continuous Xip ’s, we adopt the framework that Xi,g
= αg + pi,g , for
g = 1, · · · , G and i = 1, · · · , n. Suppose that αg ’s are i.i.d. normal with mean µ and variance
σ12 across g. pi,g ’s are i.i.d. normal with zero mean and variance σ22 for all i and g. They are
0
p
p
also independent of αg ’s. Then within a group, g, (X1,g
, · · · , Xn,g
) is jointly normal with
0
mean (µ, · · · , µ) and variance-covariance

1
ρ

η2  .
 ..
matrix

ρ ··· ρ
1 · · · ρ

.. . . ..  ,
. .
.
ρ ρ ··· 1
45
(2.6.1)
where η 2 = σ12 + σ22 and ρ =
σ12
2
σ1 +σ22
p
p
. For any i 6= j and given Xj,g
= x, Xi,g
is normally
distributed with mean ρx + (1 − ρ)µ and variance (1 − ρ2 )η 2 . We choose µ = 1, η = 2
and ρ = 0.4. While values of those parameters are known to agents, econometricians need
to estimate them from observed data. In this situation, we adopt a two-step algorithm in
p
estimation. We first estimate µ, η and ρ from Xi,g
’s. Those estimates can be consistent,
when the number of independent groups, G, increases to ∞. We then plug those estimates
into the likelihood of y and use the NFXP method to estimate other parameters. If there
is only one single group, instead of estimating µ, η and ρ, we directly use their true values
when maximizing the likelihood of y.13
We consider two models, the linear one for continuous choices with hi (z) = z and the
binary choice one with hi (z) = I(z > 0) for all i. Each model is estimated by the NFXP ML
method. To investigate the influence of information structures on estimation results, for
each model we conduct the following experiment. First, simulate the data for the case that
all Xip ’s are publicly known, and estimate the model under the true information structure,
as well as using a false information structure, self-known characteristics. The latter is
intended to investigate the consequence of estimation if a researcher has hypothesized a
false information structure for agents’ behaviors in estimation. Second, simulate the data
for the case that all Xip ’s are self-known, and estimate the model under both the true
information structure and a model where characteristics are mistakenly publicly known.
Results are tabulated below.
Tables A.1 to A.4 present the estimation results for W being block-diagonal. In each of
those tables, parameter names and true values are listed in the first and second columns.
13
Implicitly, in practice, we assume that those parameters might be recovered from other sources and used
for a study on hand.
46
Each table is divided into two panels. The first panel corresponds to the data generatp
ing process where all Xi,g
’s are known to every agent in the group g. The second panel
p
presents the results when each generated Xi,g
is known only to (i, g). Estimation results
under true information structures and those under misspecified information structures are
in columns of corresponding panels. We calculate the empirical mean and corresponding
empirical standard deviations for every estimated parameter. We also calculate sample average maximized log likelihood values. Additionally, we denote by rtrue , the proportion of
simulations for which maximized sample log likelihood under a true information structure
is larger than that under a corresponding misspecified information structure.
From the results in Tables A.1 and A.2, the NFXP ML estimation works well for both
linear and binary choice models. The bias for estimates is small, and the standard deviation
decreases as sample size increases. Comparing the two models, the performance of estimates
for the linear model is better than that of the binary choice model. That is intuitive,
because compared to the linear model, some information on the latent dependent variables
is lost in the binary choice indicator. It is interesting that if data is generated for the
p
case that all Xi,g
’s are publicly known, there is only a small difference between estimates
under the true information structure and the wrongly hypothesized information structure
in terms of bias and standard deviation. On the other hand, if data is generated from
p
p
all Xi,g
’s being self-known, but the estimated model mistakenly treats all Xi,g
’s as public
information, there will be a larger bias in the estimates of the constant term and the social
interaction intensity. Additionally, that bias cannot be reduced even when sample size
is increased. For each model, comparing estimation under a true information structure
and a falsely hypothesized information structure, we see that estimation under the true
information structure can achieve a larger likelihood value in general. Therefore, information
47
criterion in terms of maximized log likelihood can be valuable for model selection. Results
p
have similar features when Xi,g
’s are continuously distributed, as shown by Tables A.3 and
p
A.4, but the biases of estimates under false information structure are larger when Xi,g
’s are
p
continuously distributed than those when Xi,g
’s are discrete.
Results for a single group when X p is discrete are tabulated in Tables A.5 to A.12. We
can see that our estimation method still works well in this case. One interesting thing is
that when the group size and number of links for each member are both multiplied by 5, the
standard deviations of parameter estimates decrease except for those for β0 and λ. That
is because as sample size increases, interactions between the expectations about yi ’s also
goes up, which increases the multicollinearity in the equation of y ∗ . If we hold the number
of links for each individual constant, the standard deviations for the estimates of β0 and λ
also decreases. Analogous results are derived when the number of links is random, but then
standard deviations of parameter estimates decrease faster as sample size increases.
To assume that all personal characteristics are publicly known is equivalent to assuming
rational expectations. From the above experiments, we can see that if data generating
process is incompatible with that assumption, imposing it can bring in estimation bias.
2.7
An Empirical Application
Public safety provided by local governments is a public good. A safer city can benefit
not only its own residents but also people living in neighboring cities. As a result, a city
might want to be a “free-rider” and spend less on police, fire, and emergency. That is, there
is a substitution effect between two nearby cities in terms of public safety expenditure.
However, a complementary effect may also exist. That is because an improvement in public
safety service in one city may drive criminals to other places not faraway from that city. For
48
a group of cities in a region, those effects imply spatial correlation. We empirically analyze
this correlation by applying our model to municipalities in the state of North Carolina.
We collect data about government finance in the 2012 fiscal year from North Carolina
Department of State Treasurer. From the same data source, we also get county public safety
spending in the same fiscal year. Since residents’ earnings may also influence public safety
spending, we collect data about city median household income from “FindtheData.org”14 ,
which is based on the American Community Survey. To construct spatial relation, we
first collect information about municipal latitudes and longitudes from “CityLatitudeLongitude.com”15 and then calculate the distance using the Haversine formula16 . After that,
we choose a cutoff value. Two cities whose distance is no bigger than that cutoff value is
defined as “neighbors”. In empirical analysis, we choose three different cutoff values and
run regressions for each case. Those three cutoff values are 30 kilometers, 50 kilometers and
100 kilometers. Sample statistics are summarized in Table A.14.
Because expenditure on public safety is a continuous choice variable, we use the following
model for empirical study:
0
0
yi = Xic α + Xip β + λ
X
Wi,j E[yj |XJpi , Z] − i ,
(2.7.1)
j6=i
where yi stands for the public safety spending for city i, Xic contains municipal financial
and demographic information that are known to all cities in the state, such as population
and total revenue. Xip represents city features that are known to i and not to other cities.
14
http://acs-economic-city.findthedata.org.
15
The latitudes and longitudes of most municipalities in our sample are listed on the web page,
http://citylatitudelongitude.com/NC/Matthews.htm. Data about the rest 14 cities are found by searching on Google individually.
16
r
ι2 − ι 1
ξ2 − ξ1
) + cos(ι1 ) cos(ι2 ) sin2 (
)),
2
2
where d is the distance, r is the radius, ι1 and ι2 are latitudes, and ξ1 and ξ2 are longitudes.
d = 2r arcsin(
sin2 (
49
In this paper, the variable chosen to represent them is the median household income in
the current period when decisions about government spending is to made. Specifically, we
assume that city median household income is composed of the state average level income
and city idiosyncratic shocks.
M HHIi = α + pi ,
(2.7.2)
where M HHIi denotes the median household income of municipality i. The random variable, α, represents the state median household income. City-specific shock is denoted by
pi . α is normal with mean µ and standard deviation, ω. pi ’s are i.i.d. normal with zero
mean and standard deviation, γ. pi ’s are also independent of α. Then M HHIi ’s have an
exchangeable joint normal distribution, with mean µ, variance η 2 = ω 2 + γ 2 and correlation
coefficient, ρ =
ω2
.17
ω 2 +γ 2
We estimate the model under two different information structures.
In the firse scenario, the median household income is publicly known to related cities. But
they are assumed to be self-known information in the second case.
Results for regression using nested fixed point maximum likelihood under different settings are tabulated in Table A.15. We can find that there is a significant negative spatial
correlation between neighboring municipalities, when the distance between two “neighbors”
is less than 30 or 50 kilometers, although the magnitude of the effect is small. If we allow
the cutoff distance for “neighbors” to be 100 kilometers, that effect is still negative but
insignificant. Therefore, if two cities are close to each other, the spending on public safety
by one of them has externality effect on another, resulting in a “free-riding” effect. Additionally, for the same cutoff distance, estimates when median household income is public
information are very close to those when the true value of its own median household income
17
We collect the time series of the median household income for the state of North Carolina, αt , from 1984
to 2012, and estimate µ and ω by sample mean and standard deviation.
50
of a city is private information. However, in terms of estimated sample log likelihoods, the
regressions under public information outperform those under private information. That is
to say, when making budget decisions, municipalities in a state know quite well about the
factors influencing the decisions taken by nearby cities.
2.8
Conclusion and Remarks
In this paper, we analyze social interactions under incomplete information in a framework allowing heterogeneity in expectations about agents’ behaviors. Due to heterogeneous
individual exogenous characteristics, one agent’s expectations on another two agents’ behaviors generally differ. With asymmetric private information, two agents may have different
expectations on the behavior of a third agent in the group. Incorporating these two types
of heterogeneity, especially the second one, we can analyze the influence of various forms of
private information on social interactions, which extends the current literature.
We relate our model to a simultaneous move game with incomplete information, in which
agents’ behaviors can be viewed as best responses to their expectations on other agents’
behaviors. We adopt the concept of BNE. As a result, expectations vary with the private
information used to make predictions. We find that in this case conditional expectations of
agents’ behaviors in a group can be viewed as a vector-valued function of private information
about group members. As a BNE corresponds to a consistent conditional expectation
function, it is possible to use methods in functional analysis to investigate the existence and
uniqueness of an equilibrium conditional expectation function. We also show that such a
model can be estimated using the nested fixed point ML estimation.
The essence of the equilibrium analysis is the use of contraction mapping. An equilibrium
of the model corresponds to a fixed point of a contraction mapping in a complete function
51
space. Contraction mapping has the advantage that it not only ensures existence and
uniqueness but also supplies a solution algorithm. However, it may be demanding to ensure
a contraction mapping in some cases. Inspecting our sufficient condition for equilibrium
uniqueness, the structure of a network and its interactions play an important role. Utilizing
the row-normalizing technique in the literature of spatial econometrics, for the linear model
and probit model, the sufficient conditions do not impose restrictive parameter ranges on
interactions. However, the bounds on interaction intensity can be more stringent for other
models or without row-normalization.
It is shown by Monte Carlo experiments that estimation under a misspecified information
structure may result in a loss of efficiency or additional bias. In an empirical investigation,
econometricians may not know the true information structure underlying the data sets.
However, the maximized log likelihood provides a valuable model selection criterion.
Comparing Monte Carlo experiments of linear and binary choice models, we can see that
the performance of NFXP maximum likelihood estimation for the Binary choice model is not
as good as that for linear model, due to a loss of information. Hence, it will be interesting
to examine the Tobit model in this framework, as it is of a nonlinear form connecting linear
and binary choice structures.
52
Chapter 3: A Tobit Model with Social Interactions under Incomplete
Information
3.1
Introduction
Truncated and/or censored outcomes are frequently observed in practice, when agents
face binding constraints in decision, in particular, the nonnegative constraint. As the most
popular way to model this kind of limited dependent variables, the Tobit model has gained
much attention for both empirical and theoretical reasons. For example, Kumar(2012) proposes an extension of nonparametric estimation methods for nonlinear budget-set models
to censored dependent variables. Abreyava and Shen (2014) consider estimation of censored panel-data models with individual-specific slope heterogeneity. In this paper, we are
interested in modeling the censored outcomes associated with social interactions where an
agent’s behavior may be affected by other group members. There are two types of models
about social interactions. In one stream of the literature, agents who move simultaneously
have complete information about all exogenous characteristics and shocks. As a result, each
agent’s behavior is influenced by the actual behaviors of others in that social group. See
Lee(2007) and Boucher et al.(2014) for example. Other models are built on incomplete
information, so an agent’s actions are affected by her expectations on the behaviors of other
53
agents in a social group. For instance, suppose that all exogenous characteristics are public information and idiosyncratic shocks are privately known, Manski (1993) studies linear
models about socially interacted continuous choices; and Brock and Durlauf (2001) and Lee
et al.(2014) investigate binary choices for socially linked agents. In their recent work, Yang
and Lee (2014) extend previous research to a general form of incomplete information. They
find that by allowing incomplete information in not only unobserved idiosyncratic shocks
but also exogenous individual characteristics, conditional expectations about agents’ behaviors can be functions of private information. Their discussions are mainly about continuous
choices and binary choices. In this paper, we analyze social interactions of censored outcomes under a general form of incomplete information and study the tax rate competition
in local government of North Carolina.
The basis of our model is a simultaneous-move Bayesian game when players’ actions
are bounded from below by no action so that a corner solution to the expected utility
maximization problem is possible. We find that a correspondence between a BNE and
a profile of conditional expectation functions consistent with strategies. Similar to Yang
and Lee(2014), we can embed those conditional expectation functions into a normed space
and transform an equilibrium conditional expectation function into a fixed point of an
operator. Additionally, a sufficient condition that ensures the operator to be a contraction
mapping holds for the Tobit model in terms of a reasonable range of the parameter of
interest. That range corresponds to weak or moderate social interactions, but not strong
ones. Strong social interactions might demonstrate multiple (expectation) equilibria which
generate stable and unstable systems as it is shown for the binary choices models by Brock
and Durlauf(2001). Focusing on the case of a unique equilibrium, we discuss computation,
identification, and estimation of the model.
54
We explore the information which is unique in the Tobit model to help identify the
variance of the idiosyncratic shocks. Different from models with continuous and binary
choices where either continuous or discrete outcomes are observed, for the Tobit model with
censored outcomes, two types of information are available from data: whether an outcome
is censored or observed. With both types of information, in addition to coefficients of
explanatory variables, it is possible to identify the variance of idiosyncratic shocks; while
that is not possible for binary choice models.
By investigating interacted behaviors under a general form of incomplete information,
in addition to models with networks and social interactions, our model is also related to
the literature of estimation of games, such as the work of Bajari et al(2010a) and Aradillas
(2010). Bajari et al (2010a) focus on estimation in a framework where all variables are
public information except that each agent only gets the realization of her own shocks.
Incomplete information about both exogenous characteristics and idiosyncratic shocks is
discussed by Aradillas (2010). However, only a two-agent binary choice game is considered
and only equilibria based on public information are discussed. Therefore, heterogeneity in
social relations and in private information is missing there.
For estimation, similar to Rust (1987), we nest the fixed point iteration in the maximum likelihood estimation to derive parameter estimates. That nested fixed point algorithm works well in Monte Carlo experiments. For a sample with a large number groups,
we also consider group common features (unobserved by econometricians) as random effects to investigate unobserved group heterogeneity. The incorporation of group common
features is important in social interactions in order to capture correlation effects as in Manski(1993) and Moffitt(2000). Since group common features are observed by group members,
given their realizations, equilibrium conditional expectations can be solved in a similar way.
55
However, econometricians need to integrate over those random effects in estimation. Utilizing stochastic simulation estimation method, we can calculate a stochastic integration and
estimate parameters using the nested fixed point maximum likelihood method.
As an empirical application, we study the property tax rates for contiguous municipalities in North Carolina. Tax competition among local governments has been theoretically
and empirically studied (See Brueckner (2003) for a comprehensive review). Most research
considers the tax rate as a continuous variable. However, it is more appropriate to adopt
the Tobit model as tax rates are non-negative and local governments’ choices are subject
to this non-negative constraint. More recently, Porto and Revelli (2013) evaluate three empirical approaches to the analysis of spatially dependent limited tax policies. Their Tobit
type models are based on interactions with latent variables and/or spatial time lags, which
are different from ours. We model the property tax rate as an outcome of the Bayesian
Nash Equilibrium from a static simultaneous-move game with incomplete information and
estimate the corresponding parameters using the nested fixed point maximum likelihood
method proposed in this paper. For the sample of municipal property tax rates, we find the
existence of strong competition among near-by municipalities in North Carolina.
The paper proceeds as follows. In Section 3.2, we build the model and explain it as a
reduced-form equilibrium in a simultaneous-move game under incomplete information with
censored outcomes. In Section 3.3, we provide sufficient conditions for the existence of a
unique equilibrium and discuss its computation under different forms of incomplete information. In the subsequent two sections, Section 3.4 and Section 3.5, we discuss identification
and estimation of model parameters. Following an extension to allow group unobservables
56
in Section 3.6, we investigate the finite sample performance of the nested fixed point maximum likelihood estimation by Monte Carlo experiments in Section 3.7 and study the tax
competition among municipalities in North Carolina in Section 3.8. Section 3.9 concludes.
3.2
3.2.1
The Model
The Model Framework
Consider a group of n agents, i = 1, · · · , n, who are socially linked. Their social relations
are represented by an n × n weighting matrix, Wn , such that for all i 6= j, its (i, j) entry,
Wn,ij > 0, if i connects with j; and Wn,ij = 0 otherwise. The diagonal elements, Wn,ii = 0
for all i = 1, · · · , n. Wn may be either symmetric or asymmetric. For example, Wn may
represent a friendship network. Then Wn,ij = 1 if i views j as one of her friends; and
Wn,ij = 0 otherwise. When friendship is mutual, the network is undirected and Wn is
symmetric. However, if friendship is not mutual, it is possible that i regards j as one of
her friends, but j does not think i is among her good friends. Then Wn,ij 6= Wn,ji and
Wn is asymmetric. In spatial econometrics, an example is the relative strength of spatial
interactions among counties. In that case, Wn,ij may be the reciprocal of the geographic
distance between two different counties, i 6= j. Formulated in this way, Wn is symmetric.
However, once it is row-normalized, Wn would in general be asymmetric.
We model the social interactions of agents’ outcomes subject to nonnegative constraint,
yi ’s, under incomplete information:




X
0
0
0
yi = max β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij E[yj |XJpi , Z] − i , 0 .


(3.2.1)
j6=i
This model distinguishes our work from the classical Tobit model by incorporating social
interactions. From (3.2.1), we can see that j’s behavior can affect i only if i links to j, i.e.,
Wn,ij 6= 0. Moreover, that impact is through i’s expectation on j based on her information
57
about various personal characteristics represented by X. That is suitable for the case when
agents take actions simultaneously while some information is unknown. Therefore, similar to
Yang and Lee (2014), we classify observable exogenous characteristics into three categories,
0
0 0
the group features, X g ; commonly known individual characteristics, X c = (X1c , · · · , Xnc ) ;
0
0 0
and some personal traits which may be privately known, X p = (X1p , · · · , Xnp ) . To describe
information structures in a general form, for any i = 1, · · · , n, we use an n × 1 vector Ji to
represent her private information about X p . That is, Ji (j) = 1 if Xjp is known by i; and
Ji (j) = 0 otherwise. Xjpi then represents the random vector composed of those Xjp ’s that
i knows. In (3.2.1), for simplicity, Z collects all public information, including the group
features, X g , commonly known individual traits, X c , the social relations, Wn , as well as
the information structure, J1 , · · · , Jn .
We assume that idiosyncratic shocks i ’s are i.i.d. with the pdf, f (·), and the corresponding CDF, F (·). These idiosyncratic shocks are also independent of all exogenous
characteristics and network connections.
3.2.2
Game Theoretical Foundation
Our model is related to a simultaneous-move game under incomplete information where
the values of actions are bounded below by zero to satisfy the nonnegative constraint. For
example, for investment on community services, although a resident may prefer to disinvest,
a negative amount of investment is not possible. Similarly, for competitive stock brokers,
portfolio choices will also be restricted if short-sales are not allowed. We use the n × n
matrix Wn to represent social relations among a group of n players. Denote the action
taken by agent i by ai . Assume that ai ≥ 0. Her payoff is determined by the following
58
equation:18
0
0
0
r(ai , a−i , X g , Xic , Xip ) = α−γ(ai −β0 −Xic β1 −Xip β2 −X g β3 −λ
X
Wn,ij aj +i )2 . (3.2.2)
j6=i
As before,
Xg
denotes group features,
Xic
refers to publicly known individual character-
istics, Xip stands for privately known personal traits. The idiosyncratic shocks, i , is revealed only to i. They are i.i.d. and are independent of observable exogenous variables,
i.e., X and W . The private information structure about X p is also represented by vectors,
J1 , · · · , Jn . Following Harsanyi (1967a; 1967b), we interpret incomplete information by unknown “types”. Let Si represent the support of Xip and T denote the common support of
i ’s. The set of states,
Si × T n , is defined as the set of possible values of Xip ’s and i ’s
Qn
i
for all players. In this case, player i’s “type” is her private information about exogenous
characteristics and shocks, XJpi and i . Hence, her type set is the corresponding support,
Ri =
τi :
Q
Qn
i
k:Ji (k)=1 Sk
Si × T n →
× T . The signal function is a mapping from the states to her type,
Q
k:Ji (k)=1 Sk
× T . Her prior belief on the set of states is the joint
distribution of Xip ’s, and the distribution for shocks. The prior belief is the same across
all players. A strategy is a plan specifying an action for each possible realization of types.
That is, si : Ri → Ai , where Ai is i’s set of actions.
Because players move at the same time, they do not know which actions others will take
when they are making their own decisions. They can only form expectations based on their
private information. The expected payoff by taking action ai is as follows:
E[r(ai , a−i , X g , Xic , Xip )|XJpi , Z, i ]
0
0
0
=α − γ(ai − β0 − Xic β1 − Xip β2 − X g β3 − λ
X
Wn,ij E[aj |XJpi , Z] + i )2
j6=i
(3.2.3)
X
X
+ γλ2 (
Wn,ij E[aj |XJpi , Z])2 − E[(
Wn,ij aj )2 |XJpi , Z] ,
j6=i
j6=i
18
There are some other specifications which can imply linear equations for socially interacted outcomes,
see Durlauf(2000), Ballester, Calvó-Armengol and Zenou(2010), and Tao and Lee(2014).
59
where the expectation on aj does not depends on i , because i ’s are independent of each
other and also independent of X and Wn . Suppose that γ > 0, if there is no restrictions on
ai , the best response is to choose the ideal value,
0
0
0
a∗i = β0 + X c β1 + Xip β2 + X g β3 + λ
X
Wn,ij E[aj |XJpi , Z] − i .
j6=i
However, with the constraint, ai ≥ 0, it is possible to have corner solutions. The optimal action is ai = max {a∗i , 0}. Thus, (s1 (XJp1 , 1 ), · · · , sn (XJpn , n )) is a Bayesian Nash
Equilibrium (BNE) if




X
p
p0
p
p
c0
g0
si (XJi , i ) = max β0 + X β1 + Xi β2 + X β3 + λ
Wn,ij E[sj (XJi , i )|XJi , Z] − i , 0 .


j6=i
(3.2.4)
Therefore, yi ’s in our model, (3.2.1), can be viewed as the realizations of actions in a BNE.
3.3
3.3.1
Equilibrium Analysis
Equilibrium and Expectations
Similar to Yang and Lee(2014), we employ the conditional expectations on agents’ behaviors as tools to analyze the model. From (3.2.1), fix Z = z, for any k 6= i, we have
that
0
0
0
E[yi |XJpk , z] = E[H(β0 + X c β1 + Xip β2 + X g β3 + λ
X
Wn,ij E[yj |XJpi , Z])|XJpk , z], (3.3.1)
j6=i
where H(·) is a real-valued function such that for any x ∈ <1 ,
Z +∞
Z
H(x) =
I(x > c)(x − c)f (c)dc = xF (x) −
−∞
cf (c)dc.
(3.3.2)
c<x
0
We can see that for any two different agents, k 6= k , their predictions on the behavior of a
0
third agent, i, with i 6= k and i 6= k , will be the same, if they have the same private information about X p , i.e., Jk = Jk0 . Given an information structure, Jk , XJpk is a random vector,
composed of Xjp ’s such that Jk (j) = 1. Denote the underlying sample space for X p , by
60
(Ω, F, P), each Xjp is a mapping from the sample space to (<kp , Bkp ).19 So XJpk is a function
P
from the sample space to (<Mk , Bmk ), where Mk = ( nj=1 Jk (j))kp . Therefore, as a function
of XJpk , given public information Z = z, for any realization ω, E[yi |XJpk , z] will return a real
number, E[yi |XJpk = XJpk (ω), z]. That is to say, fixing an information structure, Jk , the conditional expectation is a random variable. Therefore, summarizing all possible information
structures that are relevant to predict i’s behaviors, we can describe conditional expectations on i as a mapping which returns a random variable for each relevant vector of privately
o
n
known characteristics, ψi : Ai → C, where Ai = XJpk : Wn,ki 6= 0 and C is the set of all
random variables on (Ω, F, P ). Collecting expectations for all group members, we get a vector of those functions, ψ :
0
for any (A1 , · · · , An ) ∈
Qn
i
Qn
i
0
Ai → Cn , such that ψ(A1 , · · · , An ) = (ψ1 (A1 ), · · · , ψn (An )) ,
Ai . We call it the “conditional expectation function”.20 In a
BNE, agents’ predictions are consistent with others’ strategies. As a result, (3.3.1) implies
the following “consistency condition”:
0
0
0
ψi (Ai ) = E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ψj (XJpi ))|Ai , z],
(3.3.3)
j6=i
0
for all i = 1, · · · , n, A = (A1 , · · · , An ) ∈
Qn
i
Ai . If ψ satisfies (3.3.3), we say that it is an
equilibrium conditional expectation function and denote it as ψ e .
Now we related the conditional expectation function to the game on which our Tobit
model is based.
Proposition 3.3.1 Conditional on public information, Z = z, if
(s1 (XJp1 , 1 ), · · · , sn (XJpn , n )) is a BNE of the Bayesian game in Section 3.2.2, i.e., it satisfies (3.2.4), there is a conditional expectation function, ψ :
19
Qn
i
Ai → Cn , such that (3.3.3)
kp is the dimension of Xip .
20
If Wk,i = 0 for all i, the expected value of i’s behavior does not affect those of other agents. Then
expectations about yi is redundant in determining the equilibrium. To facilitate discussion in general, we
define conditional expectation function for every agent, keeping in mind that only agents who are connected
with others are relevant in the system.
61
holds. Conversely, if ψ :
Qn
i
Ai → Cn satisfies (3.3.3), there is a BNE,
(s1 (XJp1 , 1 ), · · · , sn (XJpn , n )).
Proof. See Appendix B.1.
Proposition 3.3.1 indicates a correspondence between a BNE and an equilibrium conditional expectation function, which characterizes the equilibrium outcome distribution. In
this paper, investigations of estimation methods are based on the analysis of existence and
uniqueness of the equilibrium conditional expectation function.
3.3.2
Unique Equilibrium
According to Yang and Lee (2014), we can view each conditional expectation function as
a point in a function space and transform an equilibrium conditional expectation function
into a fixed point of an operator in that space. Particularly, let Ξ be a set of functions such
that any ξ ∈ Ξ maps a profile of random vectors A = (A1 , · · · , An ) ∈
Qn
i=1 Ai
to a random
vector in Cn with two properties,
0
ξ(A) = (ξ1 (A1 ), · · · , ξn (An )) ,
Z
max
max
1≤i≤n {j:Wj,i 6=0}
|ξi (xpJj )|dFp (xp ) < ∞,
(3.3.4)
where Fp is a simplified notation for the conditional distribution of X p given public information Z = z. In Yang and Lee(2014), it is shown that the function, k · k : Ξ → <+ ,
with
Z
kξk = max
max
1≤i≤n {j:Wj,i 6=0}
|ξi (xpJj )|dFp (xpJi ),
for any ξ ∈ Ξ is a well-defined norm and (Ξ, k · k) is a complete metric space.
62
Based on our model for censored behaviors, define an operator on Ξ, T , such that for
any ξ ∈ Ξ, and A = (A1 , · · · , An ) ∈
Qn
i=1 Ai ,
0
0
0
T (ξ)(A)i = T (ξ)i (Ai ) = E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ξj (XJpi ))|Ai , z],
j6=i
(3.3.5)
where H(·) is given in (3.3.2). To disuss equilibrium conditional expectations in Ξ, we
need the image of T to be always in Ξ. To ensure that property, we make the following
assumption on the distribution of the idiosyncratic shocks.
Assumption 3.3.1 E[i ] < ∞ for any i = 1, · · · , n.
For any i = 1, · · · , n, denote
0
0
0
gi (XJpi ) = β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ξj (XJpi ).
j6=i
By (3.3.4),
R
|ξj (XJpi )|dFp
< ∞ uniformly for all j 6= i, given public information Z = z.
gi (XJpi ) is a sum of constant and several integrable functions. Thus,
R
|gi (XJpi )|dFp < ∞
uniformly for all i. For any k 6= i with Wn,ki 6= 0, we have that
Z
|T (ξ)i (XJpk )|dFp
Z
Z
p
p
= |gi (XJi )F (gi (XJk )) −
cf (c)dc|dFp
c<gi (XJp )
i
Z
≤
|gi (XJpi )|dFp
Z
+
Z
|
c<gi (XJp )
cf (c)dc|dFp
i
Z
≤
|gi (XJpi )|dFp + E[||],
where the second inequality follows from Assumption 3.3.1.21 Therefore,
maxi maxk6=i
R
|T (ξ)i (XJpk )|dFp < ∞. Thus, whenever ξ ∈ Ξ, T (ξ) ∈ Ξ.
We can see that if ξ ∈ Ξ is a fixed point of T , it is an equilibrium conditional expectation function. Additionally, for equilibrium conditional expectation functions which are
integrable with respect to the conditional distribution of X p given public information Z,
21
E[i ] < ∞ if and only if E[|i |] < ∞.
63
such a function must be a fixed point of T . That is, focusing on integrable functions, there
is a one-to-one correspondence between a fixed point of T and an equilibrium conditional
expectation function.
Proposition 3.3.2 T : Ξ → Ξ is a contraction mapping if
|λ|kWn k∞ < 1,
where kWn k∞ = max1≤i≤n
Pn
j=1 Wn,ij .
(3.3.6)
Then there is one and only one BNE.
Proof. See Appendix B.1.
If Wn is row-normalized, kWn k∞ = 1. Hence, a sufficient condition for the existence of
a unique equilibrium is |λ| < 1. Namely, the magnitude of influence from socially associated
agents is less than 1 (in absolute value). This assumption is conventional for linear spatial
autoregressive models in the literature of spatial econometrics and social interactions. With
(3.3.6) being satisfied, we investigate solution, identification, and estimation for the model.
3.3.3
Equilibrium Computation
Owing to the properties of a contraction mapping in a complete metric space, when
(3.3.6) holds, there is an algorithm to solve the unique equilibrium. To be specific, since
T is a contraction mapping under (3.3.6), liml→∞ T l (ξ 0 ) = ψ e , for any ξ 0 ∈ Ξ, according
to the norm of (Ξ, k · k). Thus, beginning with any initial guess, by iterating the operator
T , we can approximate the equilibrium conditional expectation function, ψ e . Then by
applying Proposition 3.3.1, we can derive the corresponding BNE. However, in general,
since ψ e :
Qn
i=1 Ai
→ C is a function of random vectors, to solve it means to derive its
realizations for every point on the underlying sample space, which is by no way a trivial
task.
64
Nonetheless, ψ e changes with elements in the sample space indirectly through the random vector X p and a predetermined information structure J. Therefore, if Xip ’s are discrete
random vectors with a finite support, it suffices to characterize ψ e by its values on those
points, which makes it possible to represent ψ e by a vector of finite dimensions. Although
that representation is not applicable when Xip ’s vary continuously on a continuum support,
due to the use of quadrature order stochastic integrals as an approximation of the integrals
generated by expectations, we can approximate every possible realization of ψ e by a finitedimension vector. To elucidate the idea, we begin with the simplest case and then move on
to more complicated ones.
Publicly Known Characteristics
When all exogenous characteristics, X g , X c , X p , are public information in a group, there
is no uncertainty other than the idiosyncratic shocks, which are i.i.d. and independent of
all exogenous characteristics. Given Z = z, ψ e reduces to an n × 1 vector, satisfying
0
0
0
ψie = H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ψje ),
(3.3.7)
j6=i
for all i = 1, · · · , n. Although we cannot get an analytical solution, due to the nonlinearity
of the function H(·), we can solve ψ e numerically by contraction mapping iteration at each
given vector of parameters.
Self-Known Characteristics
If Xip is realized only to i(and the econometricians), we call the information structure as
“self-known characteristics”, in which case Ji (k) = 0 for all k 6= i. Then any two different
agents do not share private information. For i 6= k, we have that
0
0
0
ψie (Xkp ) = E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
j6=i
65
Wn,ij ψje (Xip ))|Xkp , z].
(3.3.8)
Inspecting (3.3.8), if all Xip ’s are independent of each other conditional on Z = z, the
realization of Xkp does not provide new information on Xip given Z = z. That is to say,
0
0
0
ψie (Xkp ) = E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ψje (Xip ))|z],
j6=i
for any i 6= k. Since the right-hand side does not depend on the random vector Xkp , we can
view expectations on i’s behaviors as a constant. Therefore, with independent self-known
characteristics, the equilibrium conditional expectation is still a vector, such that
0
0
0
ψie = E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ψje )|z],
(3.3.9)
j6=i
Comparing (3.3.7) and (3.3.9), we can see that although in both cases every two agents
0
k 6= k have the same expectation on the behavior of a third person, i, when all exogenous
0
characteristics are public information, k and k just integrate over unobserved idiosyncratic
shocks in (3.3.7). In contrast, when they know just their only realizations for X p , they have
to integrate over Xip to predict i’s behaviors in (3.3.9), for Xip is not included in Z.
If Xip ’s are correlated, however, conditional expectations depend on the private infor0
mation used to make predictions. Scrutinizing (3.3.8), if there are two agents k and k with
Wn,ki 6= 0 and Wn,k0 i 6= 0, their private information influences predictions on i’s behaviors
through the conditional distributions Fp (Xip |Xkp , z) and Fp (Xip |Xkp0 , z). Therefore, when the
0
two conditional distributions differ, k and k will form different conditional expectations.
However, if we can provide conditions assuring that the two conditional distributions are
the same, those two agents’ predictions on i’s behaviors will be identical once they get the
same realizations. One sufficient condition is “exchangeability”, cited below from Yang and
Lee (2014):
66
Assumption 3.3.2 Conditional on public information, Z = z, Xip ’s have the same support, Sp . Their conditional joint distribution, f p (X1p , · · · , Xnp |Z = z),22 is exchangeable,
i.e., for any permutation, s : {1, · · · , n} → {1, · · · , n},
p
p
f p (X1p , · · · , Xnp |Z = z) = f p (Xs(1)
, · · · , Xs(n)
|Z = z).
Under “exchangeability”, if Xkp = Xkp0 = x,
0
0
0
Xic β1
0
Xip β2
0
X
g0
X
ψie (Xkp = x) =E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij ψje (Xip ))|Xkp = x, z]
j6=i
=E[H(β0 +
+
+ X β3 + λ
Wn,ij ψje (Xip ))|Xkp0 = x, z]
j6=i
=ψie (Xkp0
= x).
(3.3.10)
0
for any k, k with Wn,ki 6= 0 and Wn,k0 i 6= 0. Therefore, we can directly define
ψie
on the
common support of Xip ’s Sp , and characterize ψ e by
0
0
0
ψie (x) = E[H(β0 + Xic β1 + y β2 + X g β3 + λ
X
Wn,ij ψje (y))|x, z],
(3.3.11)
j6=i
for all i = q, · · · , n and x ∈ Sp . Now we consider two classes of joint distributions that
satisfy Assumption 3.3.2.
1. (Discrete X p ) Suppose that Xip can only take one of m values in xl : 1 ≤ l ≤ m ,
where each xl is a vector with specific values. Given public information Z = z, the
conditional probability function, fp (y|x, z), is fully captured by the transition matrix,


p11 p12 · · · p1m

..
..  ,
..
P =  ...
.
.
. 
pm1 pm2 · · · pmm
0
where pll0 = prob(Xip = xl |Xkp = xl , Z = z), for k 6= i. We can represent ψ e by an
(nm) × 1 vector,
0
ψ e = (ψ1e (x1 ), · · · , ψ1e (xm ), · · · , ψne (x1 ), · · · , ψne (xm )) ,
When Xip ’s are discrete random variables, f p (·|Z = z) is the probability mass function. If Xip ’s are
continuous, f p (·|Z = z) represents the density.
22
67
and characterize it by the following system of nonlinear equations:
ψie (xl )
=
m
X
0
e0
0
plel H(β0 + Xic β1 + xl β2 + X g β3 + λ
X
Wn,ij ψje (xl )),
e
(3.3.12)
j6=i
e
l=1
for i = 1, · · · , n and l, e
l = 1, · · · , m. Beginning with any (nm) × 1 vector and iterating
the contraction mapping, we can derive ψ e .
2. (Continuous X p ) Consider the case when X1p , · · · , Xnp are jointly normal with mean
0
0 0
(µ , · · · , µ ) , and variance-covariance

Σ1
Σ0
 2
 ..
 .
0
Σ2
matrix

Σ2 · · · Σ2
Σ1 · · · Σ2 

.. . .
..  .
. . 
.
0
Σ2 · · · Σ1
Then for any i 6= j, conditional on Xjp = X, Xip is normal with mean µ+Σ2 Σ−1
1 (X −µ)
0
p
kp
and variance Σ1 −Σ2 Σ−1
1 Σ2 . In this case, each Xi can take any value in < . Thus, it is
impossible to represent ψ e by a finite-dimension vector. However, scrutinizing (3.3.11),
we find that for any i and x ∈ <kp , ψie (x) is determined by an integral, which can be
approximated by the values of the function on a fixed number of values (quadrature
points) using the quadrature method. To illustrate the idea, let us consider the special
case that each Xip is a single random variable, i.e., its dimension kp = 1. In this case,
denote Σ1 = η 2 and Σ2 = ρη 2 . We first transform integration in <1 into integration
over a finite interval, [−1, 1], and then apply the Gauss-Legendre quadrature.
68
ψie (x)
Z +∞
=
−∞
0
0
H(β0 + Xic β1 + x
eβ2 + X g β3 + λ
X
Wn,ij ψje (e
x))
j6=i
(e
x − ρx − (1 − ρ)µ)2
1
·p
)de
x
exp(−
2
2
2(1 − ρ2 )η 2
2π(1 − ρ )η
Z 1
X
0
0
z
z
=
H(β0 + Xic β1 + log(
)β2 + X g β3 + λ
Wn,ij ψje (log(
)))
1−z
1−z
0
j6=i
z
(log( 1−z
) − ρx − (1 − ρ)µ)2
1
1
·p
dz
exp(−
)
2(1 − ρ2 )η 2
z(1 − z)
2π(1 − ρ2 )η 2
s
Z 1
X
0
0
2
ze + 1
ze + 1
=
H(β0 + Xic β1 + log(
)β2 + X g β3 + λ
Wn,ij ψje (log(
)))
π(1 − ρ2 )η 2 −1
1 − ze
1 − ze
j6=i
2
z
e+1
(log( 1−e
)
z
− ρx − (1 − ρ)µ)
1
· exp(−
de
z
)
2(1 − ρ2 )η 2
(e
z + 1)(1 − ze)
s
K
X
X
0
0
2
zek + 1
zek + 1
≈
ωk H(β0 + Xic β1 + log(
)β2 + X g β3 + λ
Wn,ij ψje (log(
)))
π(1 − ρ2 )η 2
1 − zek
1 − zek
k=1
· exp(−
j6=i
z
ek +1
(log( 1−e
) − ρx − (1 − ρ)µ)2
zk
2(1 − ρ2 )η 2
)
1
.
(e
zk + 1)(1 − zek )
(3.3.13)
In (A.2.2), the second equality is derived by a change of integration variable, x
e =
z
log( 1−z
) and the third equality comes from a transformation of z =
ze+1
2 .
At last,
the approximation is based on standard Gauss-Legendre quadrature, where ωk ’s are
the weights, zek ’s are the corresponding abscissae, and K is the number of abscissae.
ek +1
Define accordingly, xpk = log( z1−e
zk ), for k = 1, · · · , K, we get nK equalities,
s
K
X
X
2
0
0
Wn,ij ψje (xpk ))
ωk H(β0 + Xic β1 + xpk β2 + X g β3 + λ
ψie (xpk0 ) =
2
2
π(1 − ρ )η
j6=i
k=1
· exp(−
(xpk − ρxpk0 − (1 − ρ)µ)2
2(1 −
ρ2 )η 2
)
1
,
(e
zk + 1)(1 − zek )
0
for all i = 1, · · · , n and k = 1, · · · , K. This is very similar to (3.3.12). Hence, we can
solve ψie (xpk0 )’s by contraction mapping iterations. After that, for any x ∈ <1 , we can
approximate ψie (x) by (A.2.2). Owing to the fast convergence of the Gauss-Legendre
quadrature, we only need to take a small number of abscissae. In our Monte Carlo
experiments, it is shown that good performance can be achieved in estimation when
choosing K = 8.
69
When Xip ’s are of multiple dimensions, multiple-dimension quadrature methods are
available but quite computationally intensive. Alternatively, we can use the stochastic
integral approximation with importance sampling. Let h(ai ) be a density with its
support containing the support of Xip such that fp (xp |x)/h(xp ) is well defined. Then
we can generate K random draws, xpk , from h(·). The stochastic approximation will
be
ψie (x)
K
p
X
1 X
p
c0
g0
e p fp (xk |x)
H(β0 + Xi β1 + xk β2 + X β3 + λ
Wn,ij ψj (xk ))
.
≈
K
h(xpk )
k=1
j6=i
Analogous to previous discussion, we first solve
ψ e (xpk )’s
by contraction mapping and
then approximate the function ψie (x) at any point x.
For general information structures, the unique equilibrium can be calculated in a similar
way. When all Xip ’s are discrete random vectors with a finite support, we can fully solve ψ e
directly via contraction mapping iteration. When Xip ’s are continuous random variables, we
choose a finite number of points and approximate the integration in conditional expectations
by a weighted sum. If continuous Xip ’s are not exchangeable, for the stochastic simulation,
the simulated values from important densities can be different for each agent. However, if
the number of simulated points is the same, say K, for each i, the total number of equations
for solution can remain to be nK. Since values of ψ e on those finite number of variables can
be solved as a vector by contraction mapping iteration, the values of ψ e for all realizations
can be approximated.
3.4
Identification
For identification, we aim at recovering the model primitives, β, λ, Fx and F , from
observations on X, Y , and Wn .
e λ,
e Fex , Fe ) for social conDefinition 3.4.1 (β, λ, Fx , F ) is observationally equivalent to (β,
nections Wn at W n and information structure J at J, if they generate the same distribution
70
of the observables,
e λ,
e Fex , Fe ).
FY,X|W n ,J (·, ·|β, λ, Fx , F ) = FY,X|W n ,J (·, ·|β,
(3.4.1)
e λ,
e Fex , Fe ) at
If (3.4.1) holds for X at X, (β, λ, Fx , F ) is observationally equivalently to (β,
W n , J and X.
The observational equivalence in terms of distribution, (3.4.1) implies, in particular, the
same conditional expectations, i.e.,
e λ,
e Fex , Fe ].
E[yi |XJp , z, β, λ, Fx , F ] = E[yi |XJp , z, β,
k
(3.4.2)
k
Definition 3.4.2 Suppose that (β ∗ , λ∗ , Fx∗ , F∗ ) is the true parameters for Y, X at W n and
e λ,
e Fex , Fe ) 6= (β ∗ , λ∗ , F ∗ , F ∗ ) cannot be observaJ. (β ∗ , λ∗ , Fx∗ , F∗ ) is identified if any (β,
x
tionally equivalent to (β ∗ , λ∗ , Fx∗ , F∗ ) at W n and J.
Our discussion about identification is based on the hypothesis:
Assumption 3.4.1 The distribution of exogenous characteristics, Fx (·), can be inferred
from data about X.
So we can focus on β, λ, and F . Additionally, we parametrize the distribution of i ’s.
Assumption 3.4.2 i ’s are i.i.d. with the full support, <1 , according to a parametric pdf,
f (·; σ), where the functional form, f (·; ·) is known but the parameter value of σ is unknown.
The corresponding CDF is F (·; σ), which is strictly increasing in its argument.
In the following Lemma 3.4.1, we show that σ can be identified from the relationship
between the mean outcomes, E[yi |X p , z], and the average amount of censoring, E[I(yi >
0
0 0
0)|X p , z], where X p refers to the matrix of all privately known characteristics, (X1p , · · · , Xnp ) .
71
Lemma 3.4.1 Given public information, Z = z, for all i = 1, · · · , n,
Z
p
p
−1
p
E[yi |X , z] = E[I(yi > 0)|X , z]F (E[I(yi > 0)|X , z]; σ)−
c<F −1 (E[I(yi >0)|X p ,z];σ)
cf (c; σ)dc.
(3.4.3)
Proof. See Appendix B.2.
Since (3.4.3) holds for an arbitrarily chosen agent, we can suppress the subscript and
simply write
Z
E[y|X p , z] = E[I(y > 0)|X p , z]F−1 (E[I(y > 0)|X p , z]; σ)−
c<F−1 (E[I(y>0)|X p ,z];σ)
cf (c; σ)dc.
(3.4.30 )
In addition to the relation (3.4.30 ), we impose the following two assumptions for the
identification of σ.
(c;σ)
=
Assumption 3.4.3 f (c; σ) is differentiable with respect to σ. Moreover, limc→−∞ c ∂Fdσ
0.
Assumption 3.4.4 The ratio,
∂F (c;σ)
∂σ
f (c;σ)
, is strictly monotonic with respect to c.
Proposition 3.4.1 For any network, Wn , and information structure, J, if Assumptions
3.4.1 to 3.4.4 are satisfied, σ can be identified from moments, E[y|X p , z] and E[I(y >
0)|X p , z].
Proof. See Appendix B.2.
The proof of Proposition 3.4.1 depends on the relationship, (3.4.30 ), which is valid with
any information structure on X. In principle, with appropriate empirical observations,
E[y|X p , z] and E[I(y ∗ > 0)|X p , z] can be identified nonparameterically from empirical observations. Then we can identify σ for any information structure.
As an example of Assumptions 3.4.3 and 3.4.4, consider the case that i is normally distributed with zero mean and standard deviation σ. Then, F (c; σ) = Φ(c/σ) and f (c; σ) =
72
c
1
σ φ( σ ),
where Φ(·) and φ(·) are respectively the CDF and pdf of the standard normal distri-
(c;σ)
bution. limc→−∞ c ∂F∂σ
= limc→−∞ −( σc )2 φ( σc ) = 0 and
∂F (c;σ)
∂σ
f (c;σ)
=
c
φ( σc )
σ2
1
φ( σc )
σ
−
= − σc , which
is decreasing in c. Therefore, the sufficient conditions in Assumptions 3.4.3 and 3.4.4 are
satisfied.
Actually, it is more transparent to see the identification of the standard deviation in the
normal disturbance case through (3.4.30 ). By calculation, we can get that
E[y|X p , z] = σ Φ−1 (E[I(y > 0)|X p , z])E[I(y > 0)|X p , z] + φ(Φ−1 (E[I(y > 0)|X p , z])) ,
which implies that
σ=
E[y|X p , z]
.
Φ−1 (E[I(y > 0)|X p , z])E[I(y > 0)|X p , z] + φ(Φ−1 (E[I(y > 0)|X p , z]))
(3.4.4)
Now we turn to identification of other parameters. For a single group, any group characteristics, X g , is absorbed by the constant term. So we focus on the identification of β0 ,
β1 , β2 and λ. We impose two additional assumptions.
c
p
Assumption 3.4.5 Given an information structure, J = J, X c = X , X p = X , and W =
c
p
W n , E[Yi |J = J, W = W n , X c = X , XJpk = X Jk ] can be identified (nonparametrically), for
any i, k = 1, · · · , n, i 6= k, and Wn,ki 6= 0.
c
p
Assumption 3.4.6 ln , X , X , E p has full column rank, where ln is an n × 1 vector
of 1’s and E p is an n × 1 vector, whose i-th component is
c
P
j6=i W n,ij E[Yj |J
= J, W =
p
W n , X c = X , XJpi = X Ji ].
The full rank condition in Assumption 3.4.6 is essential to rule out multicollinearity of
regressors, similar to conventional linear regressions.
Proposition 3.4.2 For a single group, for a social network matrix, W n , information strucc
p
ture, J, and personal characteristics, X and X , if Assumptions 3.4.1 to 3.4.6 hold, we
can identify β0 , β1 , β2 , λ, and σ.
73
Proof. See Appendix B.2.
When there are multiple independent groups in the sample, we can identify β3 from the
variation of group features, X g .
3.5
Estimation
Because all exogenous characteristics are observed in the data, from econometricians’
point of view, randomness in agents’ behaviors come from the idiosyncratic shocks, i ’s.
As those shocks are independent across agents, the likelihood for behaviors of agents are
independent of each other. Therefore, the sample log likelihood can be written as follows:
log L(Y |X c , X p , X g , Wn )
=
n
X
0
0
0
g0
X
I(yi > 0) log f (yi − (β0 + Xic β1 + Xip β2 + X g β3 + λ
i=1
+ I(yi = 0) log F (β0 +
X
Wn,ij E[yj |XJpi , z]); σ)
j6=i
0
Xic β1
+
0
Xip β2
+ X β3 + λ
Wn,ij E[yj |XJpi , z]; σ) .
j6=i
(3.5.1)
In (3.5.1), the conditional expectations,
E[yj |XJpi , z]’s,
are determined by the equilibrium
condition, (3.3.1). Therefore, we need first solve those conditional expectations in order to
calculate the sample likelihood. Since the conditional expectation for the whole group is the
fixed point of a contraction mapping, we can solve it by contraction mapping iteration for
every parameter vector and then choose the parameter vector to maximize the sample log
likelihood.23 That is, we nest a fixed point solution algorithm in the maximum likelihood
estimation, similar to Rust (1987).
Because the conditional likelihoods of agents’ behaviors are independent of each other
and the joint likelihood of a whole sample is the product of marginal likelihoods, we can
derive large sample properties of the estimator in a conventional way for independent observations.
23
In Section 3.3.3, we discuss in detail how to solve the unique equilibrium under various circumstances.
74
3.6
Extensions
In practice, there are some possible common factors which can influence all group members but are unknown to econometricians. For example, when making decisions on tax
rates, municipal officers know the lobbying power of different parties. Nonetheless, there
might not be a measure about that from data sets. Such factors are group unobservables.
In this paper, we model group unobservables as random effects.
To be specific, consider G independent groups. For a group g, in addition to X g , X c,g ,
and X p,g , there is another set of group features lumped together into a variable, ω g , which
is unobserved. Assume that ωg is independent of other variables and has a zero mean. The
observed censored outcomes satisfy




X
0
p0
c0
yi,g = max β0 + Xi,g
β1 + Xi,g
β2 + X g β3 + ω g + λ
Wng ,ij E[yj,g |XJp,g
,
Z]
−
,
0
.
i,g
i,g


j6=i
(3.2.10 )
Because ω g is public information for agents, its presence will be similar to X g in the
analysis of equilibrium expectation and behaviors. The distribution of ω g can be identified through variation across different groups. However, when unobservable group random
effects are taken into account, our estimation method will need to be modified. Since ω g
is unobservable to econometricians, we integrate over it to construct the sample likelihood
function:
log L(Y |X c , X p , X g , W )
=
G
X
g=1
log
ng
hZ Y
0
0
0
p
c
f (yi,g − (β0 + Xi,g
β1 + Xi,g
β2 + X g β3 + ω g + λ
i=1
0
X
Wng ,ij E[yj,g |XJp,g , z]); σ)I(yi,g >0)
i,g
j6=i
0
0
p
c
· F (β0 + Xi,g
β1 + Xi,g
β2 + X g β3 + ω g + λ
X
(3.6.1)
i
Wng ,ij E[yj,g |XJp,g , z]; σ)I(yi,g )=0 fω (ω g )dω g .
i,g
j6=i
In estimation, we use stochastic integration to approximate the integration over the
unobserved group random effects, ω g ’s. That is, we derive S independent draws, ω g,s , for
s = 1, · · · , S for each group g = 1, · · · , G from the density fω (·; γ), to construct a simulated
75
sample log likelihood:
log L(Y |X c , X p , X g , W )
=
G
X
g=1
log
ng
S Y
h1 X
S
0
0
0
p
c
f (yi,g − (β0 + Xi,g
β1 + Xi,g
β2 + X g β3 + ω g,s + λ
s=1 i=1
0
X
Wng ,ij E[yj,g |XJp,g , z]); σ)I(yi,g >0)
i,g
j6=i
0
0
p
c
· F (β0 + Xi,g
β1 + Xi,g
β2 + X g β3 + ω g,s + λ
X
Wng ,ij E[yj,g |XJp,g , z]; σ)I(yi,g )=0
i,g
i
.
j6=i
(3.6.10 )
In estimation, we will still nest fixed point iteration in a maximum likelihood estimation
algorithm, replacing the true likelihood, (3.6.1), with the simulated one according to (3.6.10 ).
3.7
Monte Carlo Experiments
We investigate finite sample performance of the nested fixed point maximum likelihood
estimation via Monte Carlo experiments. In our experiments, the observed group feature,
X g , is absent, for simplicity. The idiosyncratic shocks, i ’s, are i.i.d. with a pdf, (1/σ)φ(·/σ),
where φ(·) is the standard normal density. We focus on the estimation of the coefficient
of the intercept, β0 , the individual commonly known characteristics, β1 , private personal
features, β2 , the interactions from socially associated agents, λ, and the standard deviation
of the idiosyncratic shocks, σ. Their true values are β0 = 0, β1 = 1, β2 = 1, λ = 0.3, and
σ = 1.
We suppose agents are linked in social networks, which are represented by the matrix
Wn , where n is the population size of the whole network. For any two agents, i 6= j,
Wn,ij = 1 if i links to j; and Wn,ij = 0 otherwise. Wn,ii = 0 for all i = 1, · · · , n. Then
we row-normalize Wn such that the sum of each row is equal to 1. Two types of network
designs are considered. In the first case, the whole sample is composed of a collection of
independent groups. Consequently, Wn can be organized as a block-diagonal matrix, with
each block representing social relations in one group. We assume that all groups have the
same size, 20. Within a group, for every agent, 3 other agents are randomly selected to be
linked to her, which is represented as F = 3 in reported tables of estimates. The number
of groups, G, is either 100 or 500. In the second network design, there is a single group
of socially associated agents with its population size being either n = 200 or n = 1000.
76
As for social relations, we make experiments on two settings. First, we consider the case
that the number of friends an agent can make, F , is fixed. We begin with F = 30 for
the network of size n = 200. When the population size increases to n = 1000, we look
at F = 30 and F = 150. That is, we compare the estimates when the number of social
links is kept constant with that when social links increase proportionally with population
size. After that, we turn to the case where the number of social links an agent can make
is random. It can take any integer value between 0 and an upper bound U F with equal
probabilities. That is, we use social links generated by a uniform discrete distribution. For
population size n = 200, the upper limit is U F = 59 so that the mean is near 30, the
same as the fixed number of social links for group size n = 200 in the previous case. For
n = 1000, we investigate two cases, U F = 59 and U F = 299, corresponding to the settings
for fixed number of social links with a group of size 1000. In this way, we try to find how
the randomness of social links influences the correlation among agents’ behaviors and the
performance of estimators.
The commonly known individual characteristics, X c , are generated as independent variables. For Xip , we consider two cases, corresponding to our discussion in Section 3.3.3. That
is, they are either discretely distributed with a finite support or continuously distributed
with a continuum support. In both cases, Xip ’s are correlated with an “exchangeable” joint
distribution. For the first case, we adopt the simplest setting that each Xip is dichotomous,
taking a value of 0 or 1. The realizations of Xip ’s are determined as follows: 40% of the
agents in a group are picked randomly. People who are picked have Xip = 1; and those who
are not selected have Xip = 0. For this design, the distribution of Xip ’s does not depends
on the identities of agents, with a transition (conditional) probability matrix:


 
p
p
p
p
n−n1
n1 −1
P (X2 = 1|X1 = 1) P r(X2 = 0|X1 = 1)  n−1
n−1 
P =
,
= 1
n
n−n1 −1
P (X2p = 1|X1p = 0) P r(X2p = 0|X1p = 0)
n−1
n−1
where n is the number of agents in the group and n1 = 0.4n is the number of agents who
are picked in the group. Members know the joint distribution of Xip ’s. Because both n and
77
n1 are observable, econometricians can infer Xip ’s joint distribution from data. So we focus
on the likelihood of yi ’s conditional on regressors for estimation.
p
When Xip ’s are continuously distributed, we adopt the framework that Xi,g
= αg + pi,g ,
for g = 1, · · · , G and i = 1, · · · , n. Suppose that αg ’s are i.i.d. normal with mean µ
and variable σ12 . pi,g ’s are i.i.d. normal with zero mean and variance σ22 . They are also
0
p
p
independent of αg ’s. Then within a group g, (X1,g
, · · · , Xn,g
) is jointly normal with mean
0
(µ, · · · , µ) and variance-covariance matrix,

1 ρ


ρ 1

η2  . .
. .
. .

ρ ρ
where η 2 = σ12 + σ22 and ρ =
σ12
2
σ1 +σ22
···
···
..
.
···

ρ


ρ

,
.. 

.

1
p
p
. For any i 6= j, given Xj,g
= x, Xi,g
is normally
distributed with mean ρx + (1 − ρ)µ and variance (1 − ρ2 )η 2 . We can see this is just the
example we discussed in Section 3.3.3. Hence, we can apply the Gauss-Legendre quadrature
approximation in calculating an equilibrium. For the Monte Carlo study, we choose µ = 1,
η = 2, and ρ = 0.4. While values of those parameters are known to agents, econometricians
need to estimate them from observed data. In this situation, we adopt a two-step algorithm
p
in estimation. We first estimate µ, η, and ρ from Xi,g
’s. Those estimates are consistent,
when the number of independent groups, G, increases to ∞. We then plug those estimates
into the likelihood of y and use the nested fixed point maximum likelihood method to
estimate model parameters.24
The experiment results are tabulated in Tables B.1 to B.6. We can see that the nested
fixed point maximum likelihood algorithm works well in general. Estimates have small biases
and their (empirical) standard deviations decrease as sample size increases. Comparing the
estimates when the sample is composed of many independent groups with those when there
is just a single group of socially related agents, we find that the former ones are better
24
When the first stage estimator is consistent, estimates of other parameters will be consistent. However,
the standard errors of the first stage estimates would have effects on the standard errors of the second stage
estimates. Then we need to make adjustments on standard errors of the second stage estimates.
78
than the later ones in terms of smaller biases and standard deviations. That is intuitive.
Although agents make decisions independently, their expectations are correlated with each
other. Hence, the more intensive the social relationships, the higher the collinearity of the
regressors and the less variant the generated regressors. When there are many independent
groups in the sample, as the number of group increases, the independence among the sample
regressors increases, which implies better performance of estimators. Additionally, when
we look at the Monte Carlo experiment results for a single group, we see that when the
number of social links is fixed, the standard deviations of the estimates for the intercept,
β0 , and the social relation intensity, λ, do not decrease when the sample size and the
number of social links increase proportionally. But that situation alleviates if the number
of social links is randomly determined. That is because the additional randomness helps
reduces collinearity and increases variations of the generated regressors. In Table B.7, we
present the results when there is unobserved group random effects. Due to the presence
of group unobservables, we need to use simulated likelihood for estimation(with simulation
size S = 500). As a result, estimation biases are bigger than those without any unobserved
group random effect. The performance of the nested fixed point ML estimator improves
when the number of independent groups is raised.
For all the experiments, we tried two different information structures. In the first case, all
Xip ’s are assumed to be public information to group members. In the second case, however,
Xip is only revealed to i herself. We generate data under one information structure and
estimate the model under both the correct one and the corresponding misspecified one.
We can see that the estimation under the true information structure implies in general a
higher maximized sample log likelihood than that under the misspecified one. Therefore,
maximized sample likelihood is useful for model selection.
3.8
Empirical Application
We apply our model to property tax for municipalities in North Carolina. Municipal
tax revenue depends on both tax rates and the amount of properties upon which tax is
79
levied. A higher rate of property tax can increase tax revenue per unit properties owned
by its residents. At the same time, it may provide incentives for residents to move out to
nearby municipalities which offer lower tax rates. Therefore, there is a trade-off for setting
a high tax rate due to the competition between nearby municipalities. We model this tax
competition by a game with incomplete information.
This tax competition can be modeled as a simultaneous-move game with incomplete information in Section 3.2.2. Consider n municipalities in a state. They are related according
to geographic vicinity, represented by an n × n matrix, Wn . For any two different cities,
i 6= j, Wn,ij = Wn,ji = 1 if the distance between them is less than some cutoff value, e > 0;
and Wn,ij = Wn,ji = 0 otherwise. as usual, Wn,ii = 0, for all i.25 We use ai to denote
the rate of property tax set by city i. A tax rate must be non-negative, i.e., ai ≥ 0 for all
i = 1, · · · , n. i’s payoffs by choosing ai when other municipalities choose a−i is given by
(3.2.2). When a city chooses tax rates without knowing others’ decisions, the expected payoff by choosing ai is given by (3.2.3). In this equation, if λ > 0, the higher the tax rates of
near-by municipalities, the higher the ideal tax rate that i wants to set, showing competition
between cities. Without the non-negative constraint, the ideal tax rate is determined by
some city-level demographics known to all local governments, such as the population, some
city features which may not be known by other cities, say how rich residents in the city are
in the current period, and the tax rates set by contiguous municipalities. For policy-making
on property tax, the tax rates are non-negative.26 Therefore, if the ideal rate is negative,
we will get a corner solution. When the observed tax rates, (a1 , · · · , an ), is an outcome of
such an equilibrium, we have that




X
0
0
ai = max β0 + X c β1 + Xip β2 + λ
Wn,ij E[aj |XJpi , Z] − i , 0 .


(3.8.1)
j6=i
25
We can see that in this special case, Wn is a symmetric matrix with zero diagonal elements. In estimation,
we row-normalize Wn such that the sum of each row is equal to 1.
26
Local governments have other ways to subsidize residents. However, for property tax, the rates are
non-negative.
80
Therefore, we may use our model to investigate the problem of tax competition. Assume
that condition (3.3.6) is satisfied. Hence, the observed data comes from the unique equilibrium in this model, which can be solved as a fixed point.
We look at municipalities in North Carolina, collecting data on property tax rates, government finance, and demographics in the 2012 fiscal year as well as geographic statistics
(latitudes and longitudes). Since the total property tax a household pays is the sum of city
tax and county tax, we also collect data on county property tax rates in 2012. Data of county
and city property tax rates are from North Carolina Department of Revenue. Information
about municipal government finance comes from North Carolina Department of State Treasurer. Data about city median household income is found from “FindtheData.org”, which is
based on the American Community Survey. Latitudes and longitudes are found from “CityLatitudeLongitude.com”27 We calculate distance between any two cities based on latitudes
and longitudes, using the Haversine formula28 . Sample statistics are summarized in Table
B.8. From the table, we can find a big variety among the 506 municipalities in the sample
in terms of demographics and financial status. Among those municipalities, the rates of
property tax is strictly positive except for 29 of them who levy no property taxes.
In defining geographic vicinity, we tried two different cutoff values for distance between
two municipalities, 30 kilometers and 50 kilometers, and estimate model parameters under
the two associated social weighting matrices respectively. As for the public information
about a municipality, we include the population and the property tax rates of related
counties.29 Since it is possible that a municipal government knows more about the financial
situation of people living in its own territory than other governments do, it is reasonable to
include in X p some of the residents’ financial data in the current period. Here we choose
27
The latitudes and longitudes of most municipalities in our sample are listed on the webpage, “CityLatitudeLongitude.com”. Data about the rest 14 cities are found by searching on Google individually.
28
r
ι2 − ι 1
ξ2 − ξ1
) + cos(ι1 ) cos(ι2 ) sin2 (
)),
2
2
where d is the distance, r is the radius, ι1 and ι2 are latitudes, and ξ1 and ξ2 are longitudes.
d = 2r arcsin(
sin2 (
29
When a municipality shares its border with several counties, we use the population-weighted average
property tax rate.
81
the median household income. Specifically, we assume that city median household income
depends on two factors, state average level income and city idiosyncratic shocks.
M HIi = α + pi ,
(3.8.2)
where M HIi is the median household income of municipality i in the sate (North Carolina).
The random variable, α, represents state average median household income and pi , is a
municipal-specific shock. Suppose that α is normal with mean µ and standard deviation, ω;
pi ’s are i.i.d. with zero mean and standard deviation, γ; and pi is independent of α. Then
Xip ’s have an exchangeable joint normal distribution, with mean µ, variance η 2 = ω 2 + γ 2 ,
and correlation coefficient, ρ =
ω2
.
ω 2 +γ 2
30
We estimate the model under two different
information structures. The median household income is publicly known to related cities in
the first scenario and is self-known in the second case.
In Table B.9, we report regression results of the model under different vicinity cutoffs
and information structures. We see that all estimates are significant at the 5% level with
signs consistent across different regressions. The tax rate of a municipality is positively
related to the tax rate set by the county to which it is affiliated and is negatively related
to the median household income of the residents. The wealthier the residents in a city,
the lower the rate of property tax. More importantly, the intensity of interactions between
near-by cities, λ, is siginificantly positive, which supports the competition effects among
municipalities. Comparing the maximized sample likelihood, we can see that models with
spatial interactions outperform the traditional Tobit model. Moreover, among the four
regressions with spatial interactions, we can see the magnitudes of parameters are similar
to each other. The estimated competition effects between neighboring municipalities when
the median household income is self-known are a little bit stronger than those under the
hypothesis of public information.
30
We collect the time series of the median household income for the state of North Carolina, αt , from 1984
to 2012, and estimate µ and ω by sample mean and standard deviation.
82
3.9
Conclusion
We consider social interactions for censored outcomes under incomplete information,
which incorporates various types of nonlinearity. First, outcomes of a social group depends
nonlinearly on exogenous variables and model parameters due to expectations of peer’s
performances among socially linked agents. Second, the outcome of an agent changes with
her own features in a nonlinear way due to censoring. Applying the theoretical results about
socially interacted behaviors under incomplete information in Yang and Lee (2014), we
relate the model with a simultaneous-move game under incomplete information with binding
nonnegative constraint. We transform solution of an equilibrium conditional expectation
function into calculating a fixed point of a function mapping and derive sufficient conditions
for that mapping to be a contraction, which ensures the existence of a unique equilibrium.
Under this scenario, we can solve the unique equilibrium, and identify and estimate model
parameters.
In this paper, we solve and estimate the model under the hypothesis that parameters are
within the range that ensures a unique equilibrium. This corresponds to weak or moderate
social interaction scenario, under which it is possible to derive clear implications from model
estimation. The situation of multiple equilibria corresponds to strong social interaction. In
the literature, there are also methods which allows for multiple equilibria. Cases in point
are the nonparametric two-step method by Bisin et al (2011) and Leung (2013). Their basic
idea is to assume that agents of identical characteristics play the same strategy ex ante and
exactly the same equilibrium is played for any repetition of the game. In that way, it is possible to identify choice probabilities consistently by non-parametric method. Plugging those
choice probabilities back into the likelihood function, we can derive consistent estimates.31 .
For social interactions with large group sizes, a similar approach is also possible such as in
Shang and Lee(2011). However, with small or moderate group sizes or in a general network
31
In Bisin et al (2011), an alternative estimation method is to solve all equilibrium for each parameter
value and choose the one to maximize log likelihood.
83
setting without repetitions, computationally tractable estimation approaches remain to be
found. All those are of interest for future research.
In the Tobit model, there are social interactions for only one type of behaviors. The value
of observed outcomes and the censoring result are determined by just a single equation. A
natural extension will be the sample selection model with social interactions incorporated.
With social interactions in both the outcome equation and selection equation, as well as
connections between the two equations, it is possible to analyze social interactions for two
classes of related choices, one continuous and another discrete. That is a concise case with
multiple equilibria for future research.
84
Chapter 4: Social Interactions under Incomplete Information with
Multiple Equilibria
4.1
Introduction
Estimating models for social interactions with possible multiple equilibria is a challenging issue both theoretically and empirically. The distribution of outcomes is influenced by
not only the unknown parameters, but also the underlying equilibrium which is actually
played but not observed. Without further specifications, with any given parameter values,
the sample likelihood or moment conditions are still indeterminate and cannot be used for
estimation. Bajari et al.(2010a) propose a two-step algorithm to estimate a discrete choice
game under incomplete information. They first derive nonparametric estimates for players’ choice probabilities and then use them to estimate other structural parameters. When
there are a large number of repetitions of the same game, under the assumption that the
same equilibrium is played for all the repetitions, the individual choice probabilities can
be estimated consistently. However, in empirical studies of social interactions, especially
interactions among friends, it is frequent to work with cross-section data sets. For data
sets with individual information over years, there is also a problem with the evolution of
social relations, which makes it unrealistic to assume that the same equilibrium is played
repeatedly. This method, nonetheless, can still be used for some special model structures
or under some additional assumptions. For example, when individual outcomes are influenced by a global equilibrium aggregate, Bisin et al.(2011) first estimate the equilibrium
aggregate and then recover other parameters. Leung(2013) focuses on one particular type of
85
equilibria, where individuals with the same observable characteristics play the same strategy. With a large number of independent groups and/or a large number of agents playing
the same strategy, repetitions are derived. However, the method by Bajari et al.(2010a)
would be invalid with the presence of unobserved group heterogeneity which cannot be fully
explained by observed characteristics. For social interaction models with potentially multiple equilibria, without assuming repetitions of the same equilibrium, this paper proposes
using a parametric stochastic rule to specify a probability distribution of equilibrium selection. Although this method requires solving or approximating equilibria, it can be applied
to a general model framework and data generating processes, including both discrete and
continuous choices, bounded and unbounded outcomes, and incomplete information about
idiosyncratic shocks and exogenous characteristics. In a related paper, Bisin et al(2011)
choose both the parameter values and equilibria to maximize the sample likelihood function. That method is implicitly built on a particular selection distribution. That is, exactly
one equilibrium is chosen with probability one. However, that specification cannot lead
to economic implications on equilibrium selection. Moreover, if the likelihood function is
of a complicated form, it might be difficult to maximize likelihood when choosing both
parameter values and equilibria.
Using a stochastic rule to complete a game with multiple equilibria is not new in the
literature of game estimation. Bajari et al.(2010b),(2010c) use this method to identify and
estimate discrete choice games under both complete and incomplete information. However,
it is not straightforward to make this approach applicable to various empirical studies in
social interactions under incomplete information. Socially interacted behaviors can be either
discrete or continuous, bounded or unbounded. Moreover, in Bajari et al.(2010b),(2010c),
only the idiosyncratic shocks are private information. In their recent research, Yang and
Lee(2014) point out that there can also be incomplete information about some exogenous
characteristics for social interactions. For example, when analyzing peer effects in class
performance, class and individual characteristics such as grades, locations, genders, SAT
scores, and IQ scores are often used as exogenous covariates. It would be unrealistic to
86
assume that individual SAT scores and IQ scores are public information. Nonetheless,
incorporating various types of behaviors and information structures makes it more difficult
to specify the probability distribution of equilibrium selection and compute the likelihood
of the complete model.
The difficulty in specification comes from the equilibrium set. The structure of Nash
equilibria for a game with complete information has been well understood. So is that for a
finite-player finite-action game under incomplete information. However, the existence of a
pure strategy equilibrium in a game with private information when the number of possible
actions and types are not finite has been an open question for a long time. Following the
pioneering work by Milgrom and Weber(1985) and Radner and Rosenthal(1982), Khan and
Sun(1995) and Kan and Zhang(2014) provide existence conditions when the set of actions
is compact. Although their conditions apply to general abstract private information, in
many empirical applications, it is unsatisfactory to restrict the values of outcomes to be
bounded. Moreover, as our model is based on a reduced-form Bayesian Nash Equilibrium
(BNE), their conditions about the payoff functions may not be directly applied. Therefore,
we characterize the equilibrium set and derive conditions for existence and equilibrium
properties specific to our framework.
It is shown that, in terms of the distribution of outcomes, a BNE is equivalent to an
equilibrium conditional expectation of individual choices, which are functions of the private
information used to make predictions. Therefore, the equilibrium conditional expectations
are used to represent BNEs. Particularly, if all exogenous characteristics are public information and only the idiosyncratic shocks are privately known, conditional expectation
functions reduce to vectors in an Euclidean space satisfying a system of nonlinear equations.
By the transversality theorem and the intersection theory in differential topology, it is shown
that under certain regularity conditions, there are a finite number of equilibria. Another
result is a sufficient condition for equilibrium uniqueness, which is weaker than the condition derived from contraction mapping by Yang and Lee (2014). When some exogenous
characteristics are private information and they have a continuum support, the equilibrium
87
conditional expectations are generally functions. They are embedded into a Banach space
of functions, which is related to the classical Lp spaces for integrable functions. By the
Schauder fixed point theorem, sufficient conditions are derived, which ensure that the set
of equilibria is nonempty and compact. As a result, the set of equilibria can be approximated by a finite number of equilibria. With a finite number of elements in the (possibly
approximated) equilibrium set, a probability mass function for equilibrium selection can be
specified based on a parametric selection rule. That completes the model.
By the strategy of “identification at infinity” and techniques in spatial econometrics,
parameters can be identified for the linear model with continuous choices, the model with
binary choices, and the Tobit model. Challenges in estimation come from computation.
When all exogenous characteristics are public information, solving for the set of equilibria
is equivalent to getting all solutions to a system of nonlinear equations. According to Garcia
and Zangwill(1981), it is possible to get all solutions via a homotopy continuation method
under the regularity and path-finiteness condition, when the system can be extended to
complex spaces in an analytic way. It is verified that those conditions hold for a couple of
models with normal shocks. There are also discussions about the application of another
related homotopy algorithm, used by Borkovski et al.(2010a) and (2010b). There is also a
brief discussion about group unobservables, peer effects, and a deterministic selection rule.
For models with peer effects, it suffices to focus on conditional expectations about group
average outcomes. Hence, an equilibrium can be represented by a vector-valued function
with less coordinate functions than that in the general model framework.
Particularly, this paper discusses about two types of binary choice models in detail.
In the Type I model, agents take one of two actions, 0 and 1. The utility for choice 0 is
normalized to 0 for every agent. When an agent chooses 1, however, her utility depends
on the number of agents who are associated with her and also choose 1. Two key features
of this model is that ex post, the agents who choose 0 are not affected by others and have
no effects on the utilities of agents who choose 1. The entry game is a case in point. The
Type II model does not have these two properties. Like the model discussed in Brock and
88
Durlauf(2001), in the Type II model, the utility an agent can get depends on the difference
between her choices and those of the agents who she is associated with. With normal
idiosyncratic shocks, it is more likely to have multiple equilibria in the Type II model than
it is in the Type I model. Additionally, by comparing different estimation methods in the
Monte Carlo experiments, it is found that assuming equilibrium uniqueness can bring in
biases when the intensity of social interactions is large and there are multiple equilibria in
the data generating process.
The paper proceeds as follows. The model framework is introduced in Section 4.2. Then
we relate the model to a game under incomplete information and a BNE to an equilibrium
conditional expectation function in Section 4.3. Sections 4.4 and 4.5 contain a detailed
analysis of the set of equilibria, identification, and estimation for two different types of
information structures. In Section 4.6, there is a brief discussion about unobserved group
characteristics, peer effects, and the deterministic equilibrium selection rule. Section 4.7
focuses on equilibrium set characteristics and Monte Carlo experiments of two types of
binary choice models. Section 4.8 concludes. Technical proofs are put in Appendices C.1
through C.5.
4.2
4.2.1
Models
A Model Framework
The discussion of multiple equilibria is in the framework for social interactions under a
general form of incomplete information analyzed by Yang and Lee (2014). Consider a group
of n socially related agents. Their relations are represented by an n × n matrix Wn . For
any i, j = 1, · · · , n, Wn,ij ≥ 0. Wn,ii = 0 for all i = 1, · · · , n. For any i 6= j, Wn,ij > 0 if
i connects with j; and Wn,ij = 0 otherwise. Take a game with n players for example. As
the payoffs of any two agents are interdependent, Wn,ij = 1 for any i 6= j; and Wn,ii = 0.
This social relation matrix can also be used in the model for peer effects in a social group,
where the behavior of an agent is related to those of all the other group members. If Wn
is used to represent the spatial relations of geographic regions or local governments, Wn,ij
89
is usually negatively correlated with the distance between i and j. In that case, Wn is
symmetric. If we use Wn to represent friendship networks, Wn will be symmetric if only
mutual friendship are considered. However, if the network is directed, it is possible that i
considers j as one of her close friends while j does not regard i as her good friends. Then
Wn may be asymmetric.
Let yi∗ denote the latent variable. The observed outcome, or agent’s behavior, yi , depends on yi∗ in the following way:
yi = hi (yi∗ ),
(4.2.1)
where hi (·) : < → < is a real-valued function, which can be linear or nonlinear. In the
general setting, we allow the form of this function to vary across agents. In applications,
we usually have hi (·) = hj (·) = h(·). The latent variable for i is related to her expectations
about the outcomes for other agents as
yi∗ = u(Xi ) + λ
X
Wi,j E[yj |XJpi , Z] − i .
(4.2.2)
j6=i
According to (4.2.2), the value of yi∗ depends on three parts. The first part, u(Xi ),
represents the direct effects of exogenous covariates, Xi = (X g , Xic , Xip ). We consider the
group features, X g ; some commonly known individual characteristics, Xic , and some personal features which may be privately known, Xip . The third part is the idiosyncratic shock,
represented by i . Those shocks are i.i.d. and independent of all the exogenous characteristics and social relations. Their identical distribution is characterized by a pdf function,
f (·), with its cdf, F (·). Assume that i is known by individual i herself, but not by other
agents or econometricians. The second part represents the interaction effects from socially
associated agents. There are two features in this formulation. First, yi∗ is affected by agent
j only if i connects with j; i.e., Wn,ij 6= 0. Second, j influences i through i’s expectation
on j’s true outcome. In the model, i’s expectations are made on the basis of public information about social relations in Wn , group features, X g , commonly known individual
characteristics, Xjc ’s, and her private information about exogenous characteristics, Xjp ’s.
The information structure can be fully described by specifying the subset of agents whose
90
Xjp ’s are known to an agent. Given a finite number of agents, this is achievable using vectors. For each i, we define an n × 1 vector, Ji , such that Ji (j) = 1 if i knows Xjp ; and
Ji (j) = 0 otherwise, for each 1 ≤ j ≤ n. As a result, information structure in a group of n
0
0
0
agents is represented by an n2 × 1 vector, J = (J1 , · · · , Jn ) . For every i, we define by XJpi ,
0
the vector formed by Xjp ’s, which are known to i, XJpi = (Xjp : Ji (j) = 1)0 . Suppose that
P
Xjp is of dimension kp . Then the dimension of XJpi is Ni = ( nj=1 Ji (j))kp .32 To simplify
notation, we summarize all the publicly known variables in one vector,
0
0
0
0
0
Z = (X g , X1c , · · · , Xnc , Wn,11 , · · · , Wn,1n , · · · , Wn,n1 , · · · , Wn,nn , J1 , · · · , Jn ) .
(4.2.3)
Then we sum up i’s information set used to make predictions by two random vectors, one
about private information, XJpi , and the other about public information, Z.33
The parameter, λ, represent the intensity of social interactions. If λ > 0, the social
interaction effect is positive. If λ < 0, outcomes are negatively related. The case of λ = 0
represents absence of social interactions.
4.2.2
Examples
As it is shown in Yang and Lee (2014) and Yang, Qu and Lee (2014), the model,
(4.2.1) and (4.2.2), is general enough to include the linear model with continuous choices,
the model with binary choices and the Tobit model. We list them along with some other
possible applications below.
1. (Linear Model with Continuous Choices) If hi (d) = d for all i and d ∈ <, we have that
yi = u(Xi ) + λ
X
Wn,ij E[yj |XJpi , Z] − i .
(4.2.4)
j6=i
For example, if 1 only knows X1p , X2p and X3p , J1 (j) = 1 for j = 1, 2, 3; and J1 (j) = 0 for j > 3, and
0
0
0 0
= (X1p , X2p , X3p ) . We assume that such an information structure J is common knowledge. However,
the realizations of those random variables are private information. In this example, although it is publicly
known that agent 1 knows her own features and those of agents 2 and 3, the realizations of X1p , X2p and X3p
may be unknown to agent 4.
32
XJp1
33
As it is explained in Yang and Lee (2014), because the idiosyncratic shocks, i ’s are independent of each
other and they are also independent of the exogenous covariates and social relations, adding the realization
of i to her information set does not change i’s predictions on others’ outcomes.
91
2. (Binary Choice Model I) If hi (d) = I(d > 0), for all i and d ∈ <, where I(·) is the
indicator, we have that
yi = I(u(Xi ) + λ
X
Wn,ij E[yj |XJpi , Z] − i > 0).
(4.2.5)
j6=i
3. (Binary Choice Model II) Based on the assumption that agents drive utilities from
taking actions similar to their friends and/or neighbors, Brock and Durlauf(2001)
consider another model for binary choices, yi = 1 or − 1, according to hi (d) = 2I(d >
0) − 1, and
yi = 2I(u(Xi ) + λ
X
Wn,ij E[yj |XJpi , Z] − i > 0) − 1.
(4.2.6)
j6=i
4. (Tobit Model with Homogeneous Cutoff Points) If all negative outcomes are censored,
i.e, hi (d) = dI(d ≥ 0) for all i and d ∈ <, we have that




X
yi = max u(Xi ) + λ
Wn,ij E[yj |XJpi , Z] − i , 0 .


(4.2.7)
j6=i
5. (Tobit Model with Heterogeneous Cutoff Points) If hi (d) = I(d > v(X g , Xic )) for all
d ∈ < for i = 1, · · · , n, we have that
X
yi =I(u(Xi ) + λ
Wn,ij E[yj |XJpi , Z] − i > v(X g , Xic ))
j6=i
· (u(Xi ) + λ
X
(4.2.8)
Wn,ij E[yj |XJpi , Z]
− i ).
j6=i
6. (Two-sided Censored Outcomes) If hi (d) = dI(c1 < d < c2 ) for all i and d ∈ < and
some parameters, c1 < c2 , we get
yi = (u(Xi )+λ
X
Wn,ij E[yj |XJpi , Z]−i )(c1 < u(Xi )+λ
j6=i
X
Wn,ij E[yj |XJpi , Z]−i < c2 ).
j6=i
(4.2.9)
7. (Ordered Multiple Choices) If hi (d) =
PK
k=0 k(ck
< d < ck+1 ) for all i and d ∈ <,
where K > 1 is a fixed integer, c0 = −∞, c1 < · · · < cK , cK+1 = ∞, we derive the
92
following model:
yi =
K
X
k(ck < u(Xi ) + λ
k=0
X
Wn,ij E[yj |XJpi , Z] − i < ck+1 ).
(4.2.10)
j6=i
8. (Investment Decisions with Cobb-Douglas Production Functions) At last, consider
interactions in investment among competing firms or contiguous local governments.
yi∗ denotes the latent investment. yi represents the output. Assume that output is
influenced by technology A, the labor input Li , as well as the capital investment.
Since investment cannot be negative, the actual investment is max {yi∗ , 0}. Assume
that A can be estimated from other data sources. Li ’s are public information and
exogenously given. The potential investment yi∗ is still determined by (4.2.2). With
a Cobb-Douglas production function,
∗
ι
yi = hi (yi∗ ) = AL1−ι
i (max {yi , 0}) ,
(4.2.11)
where o < ι < 1.
Various information structures can be discussed in this framework.
1. (Publicly-known Characteristics) If all exogenous covariates are public information,
Ji = 1n , for all i = 1, · · · , n, where 1n is an n × 1 vector of 1’s.
2. (Self-known characteristics) If Xip is revealed just to agent i, for any i = 1, · · · , n,
Ji (i) = 1 and Ji (j) = 0 for all j 6= i.
3. (Socially-known Characteristics) If for any two agents i and j, i knows Xjp if and only
if i connects to j, for all i, we have that Ji (i) = 1, Ji (j) = I(Wn,ij > 0) for all i 6= j.
4.3
4.3.1
Game, Equilibrium, and Expectations
Game Explanations
In the model, (4.2.1) and (4.2.2), an agent’s behavior is interacted with those of others
when she is uncertain about some of their attributes. Hence, we can view outcomes of the
model as the outcome of an equilibrium for a simultaneous move game with incomplete
93
information. According to Harsanyi (1967a; 1967b), assuming that agents’ payoffs are
related to a randomly determined “state”, an agent’s uncertainty comes from the fact that
her signal does not completely recover the true state. Then predicting others’ unknown
characteristics is equivalent to making inference about the realized state from her own
signal. We follow the setup in Osborne and Rubinstein (1994).
For n group members, let Xpi represent the support of Xip and E the common support
Q
of i ’s. The set of states, ni Xi × E n , is the set of all possible Xip ’s and i ’s for all players.
In this case, player i’s “type” is her private information, XJpi , and idiosyncratic shocks, i .
Q
Her set of types is then Ti = k:Ji (k)=1 Xk × E. The signal function is a mapping from
Q
Q
the states to her type, τi : ni Xi × E n → k:Ji (k)=1 Xk × E. Her prior belief on the set of
states is the joint distribution of Xip ’s and the distribution of the i.i.d. shocks, F (·), which
is the same for all agents. The set of actions for agent i is denoted by Yi . Her strategy is a
Q
contingent plan specifying the action to take for each realized type, si : k:Ji (k)=1 Xk × E →
Yi . The payoff received by an agent depends on actions taken by all group members,
Q
y = (y1 , · · · , yn ) ∈ ni=1 Yi , as well as the uncertain state. se = (se1 (·), · · · , sen (·)) is a
Bayesian Nash Equilibrium (BNE) in this game if
sei (XJpi , i ) = hi (u(Xi ) + λ
X
Wn,ij E[sej (XJpj , j )|XJpi , Z, i ] − i ).
(4.3.1)
j6=i
With specific hi (·) functions, it is possible to build a structural model. See Yang and
Lee (2014) for continuous and binary choices and Yang, Qu and Lee (2014) for the Tobit
model when agents’ actions are subject to the non-negative constraint.
4.3.2
Equilibrium and Expectations
Under the assumption that observed outcomes are realizations of a BNE (4.3.1), we
relate equilibrium strategies, se = (se1 (·), · · · , sen (·)), to conditional expected outcomes in an
n
o
equilibrium, E[yj |XJpi , Z = z] . Pick any i and k such that i 6= k. By consistency, we get
that
E[yi |XJpk , Z = z] = E[hi (u(Xi ) + λ
X
Wn,ij E[yj |XJpi , Z = z] − i )|XJpk , Z = z].
j6=i
94
(4.3.2)
Therefore, given public information Z = z, for any i, conditional expectations on i’s behavior
0
0
0
depends on the private information. Two agents k and k , where k 6= i, k 6= i, and k 6= k ,
have the same expectations about i’s behavior if they have the same private information,
i.e., XJpk = XJp 0 .
k
Similar to Yang and Lee (2014), conditional expectations can be modeled as functions
and embedded into a function space. In this paper, however, conditional expectation functions are defined in a different way in order to utilize properties of classical function spaces.34
Given social relations Wn and information structure J = (J1 , · · · , Jn ), for each i, we collect
all possible types of private information which are used by others to predict her actions as,
n
o
Jbi = Je ∈ {0, 1}n : Je = Jj for some j s.t. Wn,ji 6= 0 .
(4.3.3)
Denote the number elements in this set by Mi .35 Considering that two agents may have
P
the same type of private information, Mi ≤ j6=i Mn,ji . Labeling elements in Jbi by m =
1, · · · , Mi , we get
n
o
Jbi = Jei,1 , · · · , Jei,Mi .
(4.3.4)
For any j with Wn,ji 6= 0, there is exactly one vector in Jbi representing j’s private information, Jj . That is, there is a unique mi (j) ∈ {1, · · · , Mi }, such that Jj = Jei,mi (j) . The
mapping, mi (·), defined in this way is onto. Let Xpi be the support of Xip (conditional on
public information Z = z). It is a subset of <kp .(Recall that kp is the dimension of Xip ’s.)
34
Treating the privately known characteristics used to make predictions as a random vector, Yang and Lee
(2014) define the conditional expectation about i’s behaviors as a function of all possible random vectors used
to make predictions. That function maps a random vector in its domain to a random variable, specifying the
value of conditional expectations for each realization of that random vector. Consider a group of 3 people as
an example. Agent 1 is linked to agents 2 and 3, i.e., W3,21 6= 0 and W3,31 6= 0. XJp2 and XJp3 are the private
information for 2 and 3respectively.
The conditional expectation about y1 , ψ1 , is then defined on the set of
random vectors, A1 = XJp2 , XJp3 . ψ1 (XJp2 ) is a random vector, such that for each ω in the sample space
of Xip ’s, ψ1 (XJp2 )(ω) = E[y1 |XJp2 = XJp2 (ω), Z = z]. In our model, however, we define exclusively a function
for every possible type of private information that is used to predict i’s behaviors. For the aforementioned
example, if J2 6= J3 , we define functions ξ1,J2 and ξ1,J3 , on the support of those random vectors. That is,
ξ1,J2 (xpJ2 ) = E[y1 |XJp2 = xpJ2 , Z = z] and ξ1,J3 (xpJ3 ) = E[y1 |XJp3 = xpJ3 , Z = z]. The conditional expectation
functions defined in this way are mappings from a subset of an Euclidean space to <1 , which makes it
convenient to apply the properties of the classical Lp spaces.
35
If Mi = 0, i’s actions does not influence others’ choices. Then expectations on her behaviors do not
influence the distribution of outcomes. It is redundant in the system. In computation, we can just exclude
the redundancy.
95
For any (i, m), denote by Xpi,m the support of privately known characteristics contained in
P
Jei,m . Then it is a subset of the Euclidean space with dimension kp ( nj=1 Jei,m (j)). We denote its elements simply by xpi,m . For example, if Jei,m (j) = 1, for j = 2, 3; and Jei,m (j) = 0,
0
0 0
otherwise. Xpi,m is the support for (X2p , X3p ). Its elements are realizations, (xp2 , xp3 ) . Define
e : Xp
1
ξi,m
i,m → < as
e
ξi,m
(xpi,m ) = E[yi |X pe
Ji,m
= xpe , Z = z].
(4.3.5)
Ji,m
e is a mapping from a subset of an Euclidean space with dimension k
Then ξi,m
p
Pn
j=1 Ji,m (j)
e
to <1 .36 Collecting all those functions, we derive a vector-valued function,
e
e
e
e
, · · · , ξn,1
, · · · , ξn,M
),
ξ e = (ξ1,1
, · · · , ξ1,M
n
1
p
m=1 Xi,m
Q n Q Mi
whose domain is
i=1
and range is <M , where M =
Pn
i=1 Mi .
ξ e has two
properties:
e
ξ e (xp1,1 , · · · , xp1,M1 , · · · , xpn,1 , · · · , xpn,Mn )i,m = ξi,m
(xpi,m );
(4.3.6)
and the equilibrium condition,
e
ξi,m
(xpi,m ) = E[hi (u(X g , Xic , Xip ) + λ
X
p
p
e
Wn,ij ξj,m
(Xj,m
) − i )|Xi,m
= xpi,m , Z = z],
j (i)
j (i)
j6=i
(4.3.7)
for all i = 1, · · · , n, m = 1, · · · , Mi , and xpi,m ∈ Xi,m .37 In particular, when all exogenous
characteristics are public information, conditional expectations only depend on public information. In that case, E[yi |Z = z] is a scalar for any i and ξ e reduces to an n × 1 vector,
0
ξ e = (ξ1e , · · · , ξne ) .
In Appendix C.1, we construct a Banach space, (Ξ(Wn , J ), k · k). ξ ∈ (Ξ(Wn , J ), k · k) if
and only if each of its coordinates, ξi,m , is an element of L1 (Xpi,m , Bi,m , µp ; <1 ), the space of
all integrable functions on Xpi,m under the probability measure implied by the distribution
36
e
In principle, ξi,m
can be defined on the whole Euclidean space. However, considering the support of
may not be full, it will be convenient to work with bounded subset of the Euclidean space sometimes.
Therefore, we just use an abstract subset at this stage.
Xip ’s
37
By our definition, given i, for each j 6= i, via the mapping mj (·), we find exactly the private information
p
Jei,mj (i) = Ji . Thus, in (4.3.7), all the Xj,m
’s are the same. They are just XJpi , the random vector of
j (i)
exogenous characteristics which are known by i.
96
of Xip ’s conditional on public information Z = z, µp , with the standard L1 norm. Focusing
on the conditional expectation functions with this property, the properties of the classical
function space, L1 (Xpi,m , Bi,m , µp ; <1 ), can be utilized to characterize equilibria in a general
setting.
The relationship between a BNE and a consistent conditional expectation function is
expressed by the following proposition.
Q
Proposition 4.3.1 Conditional on public information Z = z, if se = (se1 , · · · , sen ) : ni=1 Ti ×
Q
E n → ni=1 Yi is a BNE of this model, then there is a conditional expectation function ξ e satisfying conditions (4.3.6) and (4.3.7). On the contrary, if there is a conditional expectation
function satisfying conditions (4.3.6) and (4.3.7), there is a BNE for this model.
Proof. See Appendix C.1.
Due to Proposition 4.3.1, there is an onto correspondence from the set of BNEs to the set
of equilibrium conditional expectation functions. Although it is possible that two different
BNEs may imply the same equilibrium conditional expectation functions, those two BNEs
are equivalent in terms of the distribution of equilibrium outcomes.38 Therefore, from now
on, we will focus on the equilibrium conditional expectation functions. Such a functions is
simply referred to as an equilibrium. The set of those functions are viewed as the set of
equilibria. For the discussions in subsequent sections, we impose the following assumptions.
Assumption 4.3.1 f () > 0 for all −∞ < < ∞. That is, the support for all i ’s is <.
Assumption 4.3.2 For any real number a, Hi (a) = E [hi (a − )] < ∞ and is differentiable
with respect to a.
Hi (·) can take various forms in different models.
1. (Linear Model with Continuous Choices) Hi (a) = a and dHi (a)/da = 1.
38
Re-defining the set of BNEs by equivalence classes in terms of the distribution of equilibrium outcomes,
there is a one-to-one correspondence between the set of BNEs and the set of equilibrium conditional expectation functions.
97
2. (Binary Choice Model I) Hi (a) = F (a) and dHi (a)/da = f (a).
3. (Binary Choice Model II) Hi (a) = 2F (a) − 1 and dHi (a)/da = 2f (a)
4. (Tobit Model with Homogeneous Cutoff Points) Hi (a) = aF (a) −
R
c<a cf (c)dc
and
dHi (a)/da = F (a).
5. (Tobit Model with Heterogeneous Cutoff Points) With cutoffs being v(X g , Xic ),
R
Hi (a) = aF (a − v(X g , Xic )) − c<a−v(X g ,X c ) cf (c)dc;
i
and dHi (a)/da = F (a −
v(X g , Xic ))
+ v(X g , Xic )f (a − v(X g , Xic )).
6. (Two-sided Censored Outcomes) Hi (a) = a(F (a − c1 ) − F (a − c2 )) −
R a−c1
a−c2
cf (c)dc
and dH1 (a)/da = F (a − c1 ) − F (a − c2 ) + c1 f (a − c1 ) − c2 f (a − c2 ).
P
7. (Ordered Multiple Choices) Hi (a) = K
k=0 k(F (a − ck+1 ) − F (a − ck ))
PK
and dHi (a)/da = k=0 k(f (a − ck+1 ) − f (a − ck )).
8. (Investment Decisions with Cobb-Douglas Production Function)
R
R
1−ι
ι
ι−1 f (c)dc.
Hi (a) = AL1−ι
i
c<a (a − c) f (c)dc and dHi (a)/da = ιALi
c<a (a − c)
Yang and Lee (2014) and Yang, Qu, and Lee (2014) find a sufficient condition for the
existence of a unique equilibrium.
i (a)
Proposition 4.3.2 Under the condition that maxi supa | dHda
| < ∞, if
|λ| kWn k∞ max sup |
i
where kWn k∞ = max1≤i≤n
Pn
j=1 Wn,ij ,
a
dHi (a)
| < 1,
da
(4.3.8)
there is one unique equilibrium in the model.
Proof. See the Appendix in Yang and Lee (2014).
With a unique equilibrium, the model, (4.2.1) and (4.2.2), will be complete. Parameters can
be estimated using standard likelihood or moment conditions. According to (4.3.8), in order
to ensure a unique equilibrium, the possible range for the intensity of social interactions,
λ, depends on the number of links and the derivative of the functions, Hi (·)’s. If they
are large, the range for λ will be very narrow. kWn k∞ can be big for large groups. For
98
example, when Wn represents the relations among n players in a game, kWn k∞ = n − 1,
which increases with the group population. In the spatial econometrics literature, it is often
fn,ij =
to row-normalize Wn so that W
P Wn,ij
j6=i Wn,ij
fn k∞ = 1, which helps alleviate the
and kW
problem. For social interactions, when Wn,ij is equal to either 1 or 0 depending on whether i
is socially associated with j, after this row-normalization, any Wn,ij for j 6= i is weighted by
the number of social links formed by i. In general, the number of social links an agent forms
differs from person to person. For an agent who has more social relations, her links will be
discounted more heavily than those of other agents who have less social relations. That is,
links are treated differently in the same network with row-normalization. In addition, when
the social links are not fully exogenous like those in Yang (2014), the number of social links
an agent has is endogenous. Then row-normalization might change the correlation between
i (a)
| depends on the type of behaviors. It
the social relations and outcomes. maxi supa | dHda
is no bigger than 1 for the continuous choice model and the Tobit model with zero cutoffs.
However, for some models, such as the Tobit model with heterogeneous cutoff points and
the model of ordered multiple choices listed in this paper, maxi supa |dHi (a)/da| can be
very large. See also Yang (2014) for a similar case in the sample selection model. Moreover,
by imposing (4.3.8), possible strong interactions are excluded.
This paper investigates model estimation without imposing Assumption (4.3.8). It is
shown that the method of random equilibrium selection in Bajari et al (2010b) and (2010c)
can be extended to this general framework for social interactions. Suppose that an equilibrium ξ e is selected from the set of equilibria, E(X, Wn ), according to probability measure,
µe (γ(·; X, Wn ), α),
(4.3.9)
where γ(·; X, Wn ) is a vector-valued criterion function whose coordinates correspond to
different criteria, such as Pareto efficiency and maximal entry rate. α is a parameter vector,
0
attaching weights to different criteria. Then the full likelihood for y = (y1 , · · · , yn ) can be
written as
Z
L(y; X, Wn ) =
n
Y
f (yi |ξ e , X, Wn )dµe (γ(ξ e ; X, Wn ), α),
E(X,Wn ) i
99
(4.3.10)
where we apply the independence of yi ’s owing to the independence across the i.i.d. idiosyncratic shocks.
A practical specification depends on characterization of the set of equilibria. Although
it is well established that there are a finite number of equilibria for the finite-player discrete
choice game analyzed by Bajari et al.(2010a),(2010b), and (2010c), as far as I know, there are
no conditions in the literature that can be directly and easily applied to our model framework
(See Khan and Sun (2002) for a survey of related theories). Therefore, we investigate
the set of equilibria in this model and derive conditions specific to our framework. As it
turns out that the equilibrium sets have different characteristics under different information
structures, according to whether exogenous characteristics are private information or not,
we discuss those two scenarios separately.
4.4
Estimation with Publicly Known Characteristics
We begin our discussions with the case that all exogenous covariates are public information and only the idiosyncratic shocks are privately known. In this case, there are just
two types of exogenous characteristics, X g and Xic ’s. Since all expectations are based on
public information, every agent other than i will form the same expectations on i’s behavior.
Therefore, an equilibrium conditional expectation function ξ e reduces to an n × 1 vector,
0
ξ e = (ξ1e , · · · , ξne ) . In this case, (4.3.7) can be rewritten as
ξie = Hi (u(Xi ) + λ
X
Wn,ij ξje ),
(4.4.1)
j6=i
for all i = 1, · · · , n. Given exogenous covariates, X, and social matrix, Wn , define S : <n →
<n such that for all i = 1, · · · .n,
S(ξ; X, Wn )i = Hi (u(Xi ) + λ
X
Wn,ij ξje ) − ξi .
(4.4.2)
j6=i
ξ e ∈ <n is an equilibrium conditional expectation vector if and only if S(ξ e ; X, Wn ) = 0.
Therefore, for a group of n agents with publicly-known exogenous covariates, X, and social
relations, Wn , the set of equilibria, E(X, Wn ), can be describes as the set of solutions to
100
this system of nonlinear equations. That is,
E(X, Wn ) = {ξ ∈ <n : S(ξ; X, Wn ) = 0} .
4.4.1
(4.4.3)
Characterization of the Equilibrium Set
In this section, we characterize the set of equilibria, E(X, Wn ), through inspecting solutions to S(ξ; X, Wn ) = 0. Solving equations is one of the central topics in mathematics.
There are a myriad of well-established results about it. For this particular problem, we employ the oriented intersection theory to analyze solutions to the equation system, (4.4.2),
aiming to derive conditions specific to this model framework. The applications of differential topology for equilibrium characterizations is not new in economic studies. For example,
Debreu (1970) and Dierker (1972) used these theories to analyze the set of competitive
equilibria in an economy. The key idea of this approach is to deform S(·; X, Wn ) and connect it to a function with a simpler form in a “smooth” way, which is called a homotopy.
Implications about the set of zeros of S(·; X, Wn ) are then derived from the set of zeros of
that simpler function. Garcia and Zangwill (1981) provide an intuitive introduction and
basic results for this method. In this paper, we utilize some more general results from the
textbook by Guillemin and Pollack (1974) and construct homotopies tailored to our model
framework. In this way, several properties of the set of equilibria and, especially, a new
sufficient condition for the existence of a unique equilibrium can be derived. We present
our main results and leave technical proofs to Appendix C.2.
Define the function T (·; X, Wn ) : <n → <n such that
T (ξ; X, Wn )i = Hi (u(Xi ) + λ
X
Wn,ij ξj ),
(4.4.4)
j6=i
for all i = 1, · · · , n. We can see that ξ is a solution to S(ξ; X, Wn ) = 0 if and only if ξ is a
fixed point of T . The rate that the Euclidean norm of kT (ξ)kE explodes relative to kξkE is
crucial for the existence of an equilibrium.
101
Assumption 4.4.1 For any group, X, Wn , there is a real number b < 1, such that39
lim
kξkE →∞
kT (ξ; X, Wn )kE /kξkE = b.
(4.4.5)
Under Assumption 4.4.1 and an easy-to-satisfy regularity condition, the conclusions in
Proposition 4.4.1 hold.40
Proposition 4.4.1 Under Assumptions 4.3.1, 4.3.2, C.2.1, for any regular social group,
(X, Wn ), if, in addition, Assumption 4.4.1 holds, there is r0 (X, Wn ) > 0, such that all
equilibria are within the open ball B(0, r0 (X, Wn )) = {ξ ∈ <n : kξkE < r0 (X, Wn )} and the
number of equilibria is finite.
Proof. See Appendix C.2.
Finiteness comes from regularity. Regularity implies that all equilibria are isolated.
As a result, in the closed ball B[0, r0 (X, Wn )], which is compact and contains B(0, r), the
number of equilibria is finite. Moreover, we can derive a new condition for the existence of
a unique equilibrium.
Proposition 4.4.2 Under Assumptions 4.3.1, 4.3.2, C.2.1, and 4.4.1, for a regular group,
the total number of equilibia is odd. In addition, if the Jacobian determinant, det(DS(ξ; X, Wn )),
does not change its sign in the ball B(0, r0 (X, Wn )) which contains all equilibria, there is a
unique equilibrium.
Proof. See Appendix C.2.
Recalling that in Yang and Lee (2014), the sufficient condition for a unique equilibrium
is (4.3.8). The following lemma shows that condition (4.3.8) is stronger than the condition
in Proposition 4.4.2.
Lemma 4.4.1 When (4.3.8) holds, sgn(det(DS(ξ; X, Wn ))) = (−1)n for all ξ ∈ <n .
39
The subscript, “E”, denotes the Euclidean norm.
40
See Appendix C.2 for detailed discussions about the meaning and implications of the regularity condition.
Additionally, it is shown that for models where u(·) is linear in Xic with a non-zero slope, if dHi (a)/da 6= 0
for any i and a ∈ <1 , almost all groups, (X, Wn ), satisfy the regularity condition.
102
Proof. See Appendix C.2.
The above characterizations of the equilibrium set hinges on Assumption 4.4.1. It is
easy to see that this condition is satisfied when Hi (·) are uniformly bounded. Therefore,
the above results apply for models with binary choices, two-sided censored outcomes and
ordered multiple choices. It also holds for unbounded (piecewisesly) continuous choices with
the magnitude of Hi (a) increases with the magnitude of a ∈ <1 , as long as the increasing
rate is not very big. A case in point is the example of investment choices. Since the elasticity
coefficient in the Cobb-Douglas production function, ι, is between 0 and 1, using Jensen’s
inequality, we have that

ι


X
(max
u(X
)
+
λ
W
ξ
−
,
0
f (i )di
|T (ξ)i | = AL1−ι
i
n,ij j
i
i


j6=i


Z


X
≤ AL1−ι
[
max
u(X
)
+
λ
W
ξ
−
,
0
f (i )di ]ι
i
n,ij
j
i
i


Z
j6=i
= AL1−ι
i [HT (u(Xi ) + λ
X
Wn,ij ξj )]ι ,
j6=i
where HT (a) = aF (a)−
R
c<a cf (c)dc
is the H(·) function corresponding to the Tobit model.
When E[] < ∞, HT (a)/|a| ≤ 1 when |a| is sufficiently large. For any i,
HT (u(Xi ) + λ P Wn,ij ξj ) ι
|T (ξ)i |
1−ι
P j6=i
≤AL
kξkE
|u(Xi ) + λ j6=i Wn,ij ξj |
|u(Xi ) + λ P Wn,ij ξj | ι kξkι
j6=i
E
·
.
kξkE
kξkE
When |λ|kWn k∞ < ∞, the right hand side goes to zero as kξkE goes to infinity, for 0 < ι <
1.41 Therefore, we derive a condition which can guarantee the existence of a pure strategy
BNE with unbounded piecewisely continuous choices and non-compact private shocks. Our
condition, Assumption 4.4.1, therefore, is complementary to sufficient conditions about the
existence of an equilibrium for games with private information under general game settings
(See Khan and Sun (2002) for a research survey and Khan and Zhang (2014) for a recent
41
When kξk
→ ∞, |ξj | →
P E
HT (u(Xi )+λ j6=i Wn,ij ξj ) ι
(
) goes to
kξkE
∞ for at least one j.
If |ξk | < ∞ for all k with
Wn,ik > 0,
P
H (u(Xi )+λ
Wn,ij ξj )
i
0, as kξkE goes to infinity. Otherwise, ( T|u(Xi )+λ P j6=W
)ι is
n,ij ξj |
j6=i
P
bounded by 1 as kξkE is large enough. As |u(Xi ) + λ j6=i Wn,ij ξj | ≤ |u(Xi )| + |λ|kWn k∞ kξkE , when
|u(X )+λ P W ξ | ι
i
n,ij j
j6=i
|λ|kWn k∞ < ∞,
is also bounded.
kξkE
103
improvement). However, in some models, to ensure Assumption 4.4.1 to hold, we need
to impose restrictions on the range of λ, which may be stringent sometimes. The linear
model with continuous choices and the Tobit model are such examples. We discuss how to
characterize equilibria in those models.
First, for continuous choices, Hi (a) = a for all i = 1, · · · , n and a ∈ <. S(·; X, Wn ) = 0
is actually a linear equation system:
S(ξ; X, Wn ) = u + λWn ξ − ξ = 0,
0
(4.4.6)
0
where u = (u(X1 ), · · · , u(X)n ) and ξ = (ψ1 , · · · , ψn ) . For a regular group, DS(ξ; X, Wn ) =
λWn − In is non-singular. Therefore, (4.4.6) has one and only one solution in that case.
That is to say, for a regular group, in the linear model of socially interacted continuous
choices, there is one and only one equilibrium.
For the Tobit model with homogeneous cutoffs which are normalized to be equal to 0,
R
Hi (a) = H(a) = aF (a) − c<a cf (c)dc, is unbounded and strictly increasing. S(·; X, Wn ) =
0 is a system of nonlinear equations:
S(ξ; X, Wn )i = H(u(Xi ) + λ
X
Wn,ij ξj ) − ξi ,
(4.4.7)
j6=i
for i = 1, · · · , n. Define pi = E[I(yi > 0)], for all i = 1, · · · , n. There is a one-to-one
correspondence between p and ξ, for ξi = E[yi ] = H(F−1 (pi )) holds for all i = 1, · · · , n.
Then (4.4.7) can be written as
F (u(Xi ) + λ
X
Wn,ij H(F−1 (pj )) − pi = 0.
(4.4.70 )
j6=i
for i = 1, · · · , n. This is a system of equations about p. Because p ∈ [0, 1]n and F (·)
is bounded, we can apply Propositions 4.4.1 and 4.4.2, for the Tobit model and find an
odd number of equilibria within a ball B(0, r) with r > 1. Similarly, if the cutoffs are
heterogeneous and are modeled as v(X g , Xic ), we have that ξi = E[yi ] = H(F−1 (pi ) +
v(X g , Xic )). Then the counterpart to (4.4.7) is
X
F u(Xi ) + λ
Wn,ij H(F−1 (pj ) + v(X g , Xic )) − v(X g , Xic ) − pi = 0.
j6=i
104
(4.4.70 )
4.4.2
Selection Rule and Complete Likelihood
When there are a finite number of equilibria for a group, to attach a probability of equilibrium selection is to attach a probability mass to each point in this set. Following Bajari
et al.(2010b) and (2010c), the probability masses are associated with some selection crite0
ria and parameters. To be specific, let γ(ξ e , X, Wn ) = (γ1 (ξ e , X, Wn ), · · · , γL (ξ e , X, Wn ))
be a vector composed of 0’s and 1’s representing equilibrium properties. For example,
γ1 (ξ e , X, Wn ) = 1 if ξ e is Pareto dominated by another equilibrium; and γ1 (ξ e , X, Wn ) = 0
otherwise. γ2 (ξ e , X, Wn ) = 1 if the equilibrium expected utility is maximal in ξ e . For the
model with binary choices, we can make γ3 (ξ e , X, Wn ) = 1, if the number of agents who
choose 1 is bigger in ξ e than that in any other equilibria. Similarly, for the Tobit model,
we can make γ4 (ξ e , X, Wn ) = 1, if the number of agents whose behaviors are not censored
is maximized at ξ e . γl (ξ e , X, Wn )’s can also be continuously dependent on the properties
of an equilibrium, say the equilibrium total expected utilities. Let α ∈ <L denote the
weight. Suppose that given a set of equilibia, E(X, Wn ), an equilibrium is picked randomly
according to the following random rule: ξ e,l is selected if
0
0
0
α γ(ξ e,l , X, Wn ) + sl ≥ α γ(ξ e,l , X, Wn ) + sl0 ,
0
(4.4.8)
0
for any ξ e,l ∈ E(X, Wn ) and l 6= l , where sl ’s are i.i.d. equilibrium-specific shocks with
type-I extreme value distribution. Therefore, the propabiity that ξ e,l is selected is
0
exp(α γ(ξ e,l , X, Wn ))
.
0
e,l
ξ∈E(X,Wn ) exp(α γ(ξ , X, Wn ))
ρ(ξ e,l ; E(X, Wn ), α) = P
(4.4.9)
0
Then the complete likelihood function for an outcome y = (y1 , · · · , yn ) in a social group is
as follows,
L(y; X, Wn ) =
X
e
ρ(ξ ; E(X, Wn ), α)
ξ e ∈E(X,Wn )
n
Y
i=1
which is the basis for identification and estimation.
105
f (yi |ξ e ),
(4.4.10)
4.4.3
Identification
Our analysis about identification is based on the parametric and distributional assumptions below. First, we assume that the payoff function, u(·), is linear in covariates. Since all
exogenous characteristics are public information in this section, we only need to consider
X g and X c .
0
0
Assumption 4.4.2 u(Xi ) = β0,0 + X g β0,1 + Xic β1 for all i = 1, · · · , n.
Assumption 4.4.3 The pdf for the i.i.d. idiosyncratic shocks, i ’s, is f (·; σ) with a known
function form and an unknown parameter, σ > 0.
Assumption 4.4.4 Xic ∈ <L and i ’s have full support.
For interactions within one group, the group characteristics, X g , is absorbed by the
constant term. Therefore, we suppress β0,1 = 0 now.
e λ,
e σ
α, β,
e) are obserDefinition 4.4.1 Given social relations, W = W n , (α, β, λ, σ) and (e
vationally equivalent at W n , if they imply the same distribution of observables, namely,
e λ,
e σ
FY,X|W n (·, ·; α, β, λ, σ) = FY,X|W n (·, ·; α
e, β,
e).
(4.4.11)
e λ,
e σ
(α, β, λ, σ) is identifiable at W n , if any (e
α, β,
e) 6= (α, β, λ, σ) is not observationally
equivalent to (α, β, λ, σ) at W n .
Different functions hi (·)’s correspond to different types of behaviors. In this paper, we
concentrate on identifying model parameters for three of them, linear model for continuous
choices, binary choices, and the Tobit model for censored outcomes. Those three models
are representative in terms of the relationships between the observed outcomes and the
latent variables. For the linear model with continuous choices, because there is a unique
equilibrium for almost all groups, we can identify β, σ and λ from socially interacted
outcomes without worrying about equilibrium selection.42 For the binary choice model and
42
As a consequence, the parameter for the selection rule, α, is not identifiable in this case.
106
the Tobit model, however, given the possibility of equilibrium multiplicity, the distribution
of outcomes will depend on the probabilities of equilibrium selection and the distribution
of outcomes in an equilibrium. In a nonparametric setting, Aguirregabiria and Mira (2013)
identify structural parameters via exclusion restrictions for a game with discrete choices.
In this parametric setting for social interactions, the technique of identification at infinity
is employed to identify model parameters.
Proposition 4.4.3 In the linear model of continuous choices, for a group with social relaei = (1, P W n,ij , X c0 , P W n,ij X c0 )0 ,
tions W n , denote X
i
j
j6=i
j6=i
P
0
0
0
c
bi = (1, X c ,
and X
i
j6=i W n,ij Xj ) .
Under Assumptions 4.4.2, 4.4.3, and 4.4.4, β0,0 , β1 , σ and λ can be identified, if
ei X
ei0 ] > 0;
min min eigE[X
(4.4.12)
bi X
b 0 ] > 0,
min min eigE[X
i
(4.4.120 )
1≤i≤n
or
1≤i≤n
when W n is row-normalized, where min eig(·) is the minimal eigenvalue of the corresponding
matrices. Since there is only one equilibrium in the linear model for almost all groups, α is
not identified in this case.
Proof. See Appendix C.3.
For the payoff parameters, β, λ and σ, in the binary choice model and the Tobit model, we
make the following assumption on the model coefficients.
Assumption 4.4.5 β1,l 6= 0 for some l ∈ {1, · · · , L}. Without loss of generality, suppose
that l = 1.
Assumption 4.4.6 The i.i.d. idiosyncratic shocks, i ’s, are distributed according to a meanscale family with a pdf, f (c) = (1/σ)fs ((c − µ )/σ), where fs (·) is some known standard
distribution pdf, µ and σ are the location and scale parameters. Normalize µ = 0 and
σ = 1.
107
Lemma 4.4.2 Under Assumptions 4.4.2, 4.4.4, and 4.4.5, for any i and ω−i ∈ {0, 1}n−1 ,
there is a subset X c (ω) ⊆ <nL such that P (X ∈ X c (ω)) > 0,
c
c |→∞,j6=i,X c ∈X c (ω) P (y−i = ω|X ) = 1.
and lim|Xj,1
Proof. See Appendix C.3.
Proposition 4.4.4 In the model of binary choices, for a group with social relations W n ,
P
0
c0
e
for any ω ∈ {0, 1}n−1 , denote X(ω)
i = (1, Xi ,
j6=i W n,ij ωj ) . Under Assumptions 4.4.2,
4.4.4, 4.4.5, and 4.4.6, β and λ can be identified, if for some non-zero vector ω0 ∈ {0, 1}n−1 ,
there is D0 > 0, such that
c
c
c
e
e0
inf min eigE[X(ω)
i X (ω)i |X ∈ X (ω) , |Xj,1 | ≥ D, j 6= i] > 0.
D≥D0
(4.4.13)
Proof. See Appendix C.3.
As for the Tobit model, we still use the technique of “identification at infinity” for
parameters of the payoff function. However, since uncensored outcomes are continuous, we
cannot fix any uncensored outcomes as a result of dominant strategy. Nonetheless, when
no outcomes are censored, the distribution of interacted outcomes will be similar to that
of continuous choices in a linear model, where there is only one equilibrium. Therefore, we
can identify parameters for payoffs and shock distributions separately from the parameters
for equilibrium selection. Additionally, different from the binary choice model where those
parameters are identified up-to-scale, in the Tobit model, σ can be identified based on the
following relationship about the average (individual) outcomes and censoring rate found by
Yang, Qu and Lee (2014):
c
E[yi |X ] = E[I(yi > 0)|X
c
]F−1 (E[I(yi
c
> 0)|X ]; σ) −
Z
c<F−1 (E[I(yi >0)|X c ];σ)
cf (c)dc.
(4.4.14)
Because (4.4.14) holds for any individual under every equilibrium, it can be used to identify
σ regardless equilibrium multiplicity. To utilize this relationship, we impose the assumption
below:
Assumption 4.4.7 f (·; σ) is differentiable with respect to σ, limc→∞ c(dF (c; σ)/dσ) = 0,
and
∂F (c;σ)/∂c
f (c;σ)
is strictly monotonic with respect to c.
108
Lemma 4.4.3 Under Assumptions 4.4.2, 4.4.4, and 4.4.5, there is a subset X1c ⊆ <nL such
c
c |→∞,1≤j≤n,X c ∈X c P (yj = 1, 1 ≤ j ≤ n|X ) = 1.
that P (X ∈ X0c ) > 0 and lim|Xj,1
1
Proof. See Appendix C.3.
Proposition 4.4.5 In the Tobit model, for a group with social relations W n , denote
ei = (1, P W n,ij , X c0 , P W n,ij X c0 )0 and X
bi = (1, X c0 , P W n,ij X c0 )0 .
X
i
j
j
i
j6=i
j6=i
j6=i
Under Assumptions 4.4.2, 4.4.4, and 4.4.5, β0,0 and λ can be identified, if there is D0 > 0
with
inf
ei X
e 0 |X c , |X c | ≥ D, 1 ≤ j ≤ n] > 0;
min min eigE[X
i
1
j,1
D≥D0 1≤i≤n
(4.4.15)
or
inf
bi X
b 0 |X c , |X c | ≥ D, 1 ≤ j ≤ n] > 0;
min min eigE[X
i
1
j,1
D≥D0 1≤i≤n
(4.4.150 )
when W n is row-normalized. If, in addition, Assumption 4.4.7 is satisfied, σ is identified.
Proof. See Appendix C.3.
After parameters for the payoff function u(·) are identified, we identify α given (β, σ, λ).
Suppose that for (β, σ, λ), for group (X, Wn ), there are D equilibria. Denote their values
under the criterion γ(·; X, Wn ) by γ(ξ d ; X, W n ) ∈ <q . Stacking them together, we get an
n × q matrix:
0
0
0
Γ(X c , W n ; β, σ, λ) = (γ (ξ 1 ; X, W n , · · · , γ (ξ D ; X, W n )) .
Proposition 4.4.6 In the binary choice (Tobit) model, with social relations, W n , under
the assumptions in Proposition 4.4.4(Proposition 4.4.5), α can be identified if
0
E[Γ (X c , W n ; β, σ, λ)Γ(X c , W n ; β, σ, λ)|X c ] has full column rank for any X c .
Proof. See Appendix C.3.
When there are different groups, β0,1 can be identified from variations across groups.
4.4.4
Computation and Estimation
With a parametric selection probability distribution, it is natural to derive parameter
estimates by maximizing the complete likelihood function, (4.4.10). However, that requires
109
computation of all the equilibria, which is a challenging issue both theoretically and numerically. According to Garcia and Zangwill(1981), that ambitious goal is achievable for a class
of problems by the homotopy continuation method (simply, the homotopy method). To be
c : S(e
c) = 0}.
specific, in complex spaces, S : Cn → Cn , is analytic. Define a homotopy: {e
we construct the following homotopy:
Ri (e
c, t) = (1 − t)(e
cqi i − 1) + tS i (e
c),
(4.4.16)
for i = 1, · · · , n and 0 ≤ t ≤ 1. qi is a positive integer. For t = 0, Ri (e
c) = e
cqi i − 1, which is
c) = S i (e
c). Separate the real and imaginary
a polynomial with qi solutions. For t = 1, Ri (e
∗
R
I
parts of variables and functions. That is, e
c = (e
cR , e
cI ) and Ri (e
c, t) = (Ri (e
c, t), Ri (e
c, t)). The
∗
c=e
c(ω)
function R then has 2n coordinates. Re-parametrize the system by ω, such that e
R
I
and t = t(ω).43 Then we get Ri (e
c, t) = 0 and Ri (e
c, t) = 0 for all i = 1, · · · , n. Denote this
∗
R
I
R
I
system as R = (R1 , R1 , · · · , Rn , Rn ). Taking derivatives, we get that
0
∗
∂R
∂R
∗
∗ e
0
De
c(ω) +
Dt(ω) = DR ((c), t) De
DR (y)Dy =
c Dt(ω) = 0,
∂e
c
∂t
0
(4.4.17)
0
where y = (e
c , t) . It is shown by Garcia and Zangwill(1981) that the above system can be
solved through “basic differential equations” (BDE):
∗
0
yi (ω) = (−1)i detDR−i (y),
∗
(4.4.18)
∗
for i = 1, · · · , 2n + 1, staring at (e
c0 , t0 )’s with R (e
c0 , t0 ) = 0. DR−i (y) is the Jacobian of
∗
∗
DR (y) with the i-th column removed. The above calculation is possible if DR (e
c, t) is of
full row rank for all (e
c, t) with R(e
c, t) = 0, which is called the “regularity” condition. If
∗
in addition, the homotopy is also “path finite”(that it, for any t, any e
c satisfying R (e
c, t)
never goes to infinity), solving the BDEs can guarantee getting all solutions to S(e
c) = 0.
Similar to the previous discussions about regular groups, the regularity condition is satisfied
43
This re-parametrization is important. With multiple equilibria, any fixed t may corresponds to different
e
c’s.
110
in general.44 A sufficient condition for path-finiteness is that the following limit
S(e
c)i
q
c i −1
ke
ci k→∞ e
lim
(4.4.19)
is not a pure real negative number, for all i. Especially, the path-finite condition is satisfied
when the above limit is equal to zero.
To apply this theory, we need first to extend S(ξ; X, Wn ) from the real line to complex
spaces and make sure its extension is analytic. This extension, actually, is important, in
order to ensure that all solutions to S(ξ; X, Wn ) = 0 can be solved in this way. That is
∗
c, t)/∂e
c) ≥
because for analytic functions in complex spaces, we can make sure det(∂R (e
0.45 It then follows from (4.4.18) that t(ω) is monotonic. That is, on any path, t is
monotonic and never turns back. As a result, stemming backwardly from any points in
n
o
∗
∗
e
c : R (e
c, 1) = S(e
c) = 0 , a path goes down directly to a zero for R (e
c, 0) and never goes
up. That ensures that every zero of S(e
c) can be connected through a path to one of the zeros
∗
∗
c, 0). If we restrict to the real line, however, det(∂R (e
c, t)/∂e
c)
of the homotopic function, R (e
can be either positive or negative or zero. In that case, paths can reverse back and some
zeros of S(e
c) may not be reached. One example can be found in Garcia and Zangwill (1981).
This extension is crucial, nonetheless, not straightforward for some economic applications. Bajari et al.(2010c) show that when the power of the polynomials, qi ’s, are sufficiently
large, the homotopy method can be used to derive all solutions for the multinomial choice
model. For the framework in this paper, instead, it is relatively easy to extend S(ξ; X, Wn )
for many classes of models. Inspecting the models we listed in the paper, most of the Hi (·)’s
Ra
are defined by integrals. Noticing that G(e
a) = a0 g(e
c)de
c is analytic, when g(·) is analytic in
a simply-connected region, we can derive analytic extensions for the models listed in this paper. For example, when the idiosyncratic shocks are normally distributed, its density, f (·),
R
Ra
is analytic. For its CDF, we can manipulate as F (a) = c<a f (c)dc = a0 f (c)dc + F (a0 ),
R ea
for some real number a0 > 0. The extension that Fb (e
a) = a0 f (e
c)de
c + F (a0 ) is analytic.46
44
This result, as well as that for regular groups in equilibrium analysis, are a direct result of the Sard’s
theorem. See the textbook written by Guillemin and Pollack(1974) for details.
45
This result follows from the Cauchy-Riemann equations.
46
The whole space, C2 is simply-connected. The integral here then does not depend on the path.
111
Since sums, products, and composites of analytic functions are analytic, we can derive analytic extensions for other models in a similar way. For the model of investment decisions
under Cobb-Douglas technology, there is a power function, which can be multivalued in the
complex plane. We choose one sheet in that case.
It is not hard to check the sufficient condition for “path-finiteness” in the specific model
structure of this paper. As the diagonal entries of the social relation matrix, Wn , are all
c) does not depend on e
ci . Hence, the limit in (4.4.19) will be zero when
zeros, for any i, S i (e
qi = 2.
Solving the “basic differential equations” requires computing the Jacobinas, which can
increase computation burden. As a result, Garcia and Zangwill(1979) propose to combine
the homotopy continuation method with Newton’s method, which can facilicate computation.
In practice, there is a computation algorithm, the homotopy algorithm, with a Fortran
code suite, HOMPACK90, provided by Watson et al. (1987) and (1997). That algorithm,
is related but not the same as the homotopy continuation method we discussed above. Due
to Watson et al. (1987), the homotopy algorithm is a global convergent algorithm and
is based on the theory that almost all starting points can lead the a zero of a function,
or equivalently, a fixed point of a function. Therefore, there is not guarantee that it can
lead to all solutions to an equation. However, it is also noted by Borkovsly et al. (2010a)
and (2010b), with discretization in computation, the homotopy algorithm can alleviate the
problem that the function we are using is non-analytic. In practice, we may compare the
performance of both methods.
With G independent groups, the log likelihood of the whole sample can be written as:
log L(Y1 , · · · , YG |β, λ, σ, α)
=
G
X
g=1
ng
0
log(
X
ξ e ∈E(Xg ,Wg )
P
Y
exp(α γ(ξ e ; Xg , Wg ))
f (yi,g |ξ e ))
exp(α0 γ(ξee ; Xg , Wg ))
ξee ∈E(Xg ,Wg )
i=1
112
(4.4.20)
The form of the sample likelihood function follows from two types of independence. First,
those groups are independent of each other. Second, because the privately known idiosyncratic shocks are independent, within any group, given an equilibrium, the outcome of a
group member is independent of those of other group members. With a large number of
independent groups, we can apply conventional large sample theory about maximum likelihood estimation. In practice, as it is computationally intensive to compute all the equilibria,
instead of maximizing the sample log likelihood, for any given parameter vector, we can
compute the set of equilibrium and simulate the selection result and outcome distribution.
Then we can calculate simulated moments and estimate parameters through maximizing
the simulated moment conditions.
4.5
Estimation with Self-Known Characteristics
When some exogenous characteristics, Xip ’s, are not public information, conditional
e (·) vary with the private information used to make predictions. This paper
expectations, ξi,m
focuses on the special case that Xip is known only to i (and the econometricians). That is,
for any i, Ji (i) = 1 and Ji (j) = 0 for all j 6= i. Equilibrium and estimation method for the
general information structure can be analyzed in a similar way, notations and calculations
will be more complicated though. We make an additional simplifying assumption, as Yang
and Lee (2014) do.
0
0 0
Assumption 4.5.1 The conditional distribution of X p = (X1p , · · · , Xnp ) is exchangeable,
if Xip ’s have the same support Xp ∈ <kp , for any public information Z = z and for any
permutation, $ : {1, · · · , n} → {1, · · · , n}, the conditional distribution of X p given Z = z,
fp (·), satisfies
p
p
fp (X1p = xp1 , · · · , Xnp = xpn ) = fp (X$(1)
= xp1 , · · · , X$(n)
= xpn ),
0
0 0
for any xp = (xp1 , · · · , xpn ) in their support.
Under Assumption 4.5.1, fixing public information, Z = z, the conditional distribution,
fp (Xip |Xjp , Z = z) is invariant with i, j as long as i 6= j. So we just denote it by fp (e
x|x).
113
0
For any i, and k, k 6= i, for any x in the support Xp ,
X
Wn,ij ξje (Xip ))|Xkp = x, Z = z]
ξie (Xkp = x) =E[Hi (u(Xi ) + λ
j6=i
=E[Hi (u(Xi ) + λ
X
Wn,ij ξje (Xip ))|Xkp0 = x, Z = z]
j6=i
=ξie (Xkp0 = x).
That is to say, the conditional expectation ξie (·) depends only on the realization of self-known
information. Any two agents other than i will make the same prediction on i’s behavior
whenever their own self-known features are the same. According to Appendix C.1, the
conditional expectation about i’s behaviors, ξie : Xp → <, is a mapping from the support of
self-known covariates to the space of possible outcomes. For conditional expectations about
behaviors of all group members, the vector-valued function ξ e = (ξ1e , · · · , ξne ), satisfies:
ξ e (xp1 , · · · , xpn )i = ξie (xpi ),
(4.5.1)
and
ξie (x)
Z
=
x
e
Hi (u(X g , Xic , x
e) + λ
X
Wn,ij ξje (e
x))fp (e
x|x)de
x,
(4.5.2)
j6=i
for all i = 1, · · · , n and x ∈ Xp .
Particularly, if Xip is independent of Xjp for any i 6= j, fp (e
x|x) = fp (e
x). That is,
information about Xjp does not help to predict i’s actions, given public information Z = z.
In this case, the conditional expectation function ξ e reduces to an n × 1 vector with
Z
X
e
ξi = Hi (u(X g , Xic , x
e) + λ
Wn,ij ξje )fp (e
x)de
x,
(4.5.3)
x
e
j6=i
which is a system of nonlinear equations and can be analyzed in a way similar to that used
when all exogenous characteristics are public information.
Hence, in the subsequent sub-sections, discussions are focused on the case that Xip
and Xjp are correlated for i 6= j. Since the domain of each ξie is the support of Xjp ’s,
whether Xjp ’s are discrete or continuous random vectors will have different implications on
the conditional expectation functions. We discuss equilibrium, estimation and identification
with self-known characteristics separately for those two cases.
114
4.5.1
Discrete Private Characteristics
Suppose that Xip ’s are discretely distributed and have a finite number of mass points.
Suppose that their common support is Xp = x1 , · · · , xK for some K < ∞. The transi0
tional probabilities are Pk,k0 = P r(Xip = xk |Xjp = xk , Z = z). Then ξ e can be represented
by a vector,
0
e
e
e
e
ξ e = (ξ1,1
, · · · , ξ1,K
, · · · , ξn,1
, · · · , ξn,K
),
(4.5.4)
e = ξ e (xk ). The consistency condition (4.5.2) reduces to
where ξi,k
i
e
ξi,k
=
K
X
k
0
Pk,k0 Hi (u(X g , Xic , xk ) + λ
0
X
e
Wn,ij ξj,k
0 ),
(4.5.5)
j6=i
for i = 1, · · · , n and k = 1, · · · , K. This is a finite dimension nonlinear system of equations,
similar to the equilibrium condition without private information. Thus, the techniques in
the previous section can be used to analyze equilibria and complete the model. Identification
of model parameters can be proved analogously. Although the support of privately known
characteristics, Xip ’s, is bounded, if the commonly known individual features, Xic ’s, have a
full support, the method of “identification at infinity” can still be used.
4.5.2
Continuous Private Characteristics
Equilibrium Set
If Xip ’s are continuous random variables, ξ e is a function defined on a continuum set
of points satisfying the functional equations (4.5.1) and (4.5.2). According to Appendix
C.1, we can view ξ e as a point in a Banach space, (Ξ(Wn , J ), k · k). Define an operator,
T : (Ξ(Wn , J ), k · k) → (Ξ(Wn , J ), k · k) as
Z
X
p
p
T (ξ)(x1 , · · · , xn )i = Hi (u(X g , Xic , x
e) + λ
Wn,ij ξje (e
x))fp (e
x|xpi )de
x,
x
e
j6=i
115
(4.5.6)
for all i and (xp1 , · · · , xpn ) ∈ Xnp . Then an equilibrium corresponds to a fixed point of
this operator. We can prove the existence of an equilibrium by the Schauder fixed point
theorem.47 To apply this theorem, we impose the following two assumptions.
Assumption 4.5.2 For all i = 1, · · · , n, Hi (·) is differentiable. Additionally,
max sup |
1≤i≤n c∈<
dHi (c)
|kWn k∞ < ∞.
dc
(4.5.7)
Assumption 4.5.3 There is 0 ≤ b < 1 such that maxkξk→∞ kT (ξ)k/kξk = b.
It is proved in Lemma C.4.4 that under Assumption 4.5.2, T is continuous.(Actually,
T is a Lipschitz function when this condition holds.) In addition, it is shown by Lemma
C.4.5 that with Assumption 4.5.3, there is r0 > 0, such that for all ξ in the closed ball,
B[0, r0 ] = {ξ : kξk ≤ r0 }, kT (ξ)k ≤ r0 . That is to say, the images of all points in B[0, r0 ]
is still in this ball. Moreover, if there is any equilibrium, it must be contained in B[0, r0 ].
The ball, B[0, r0 ], is nonempty, closed, and convex. To apply the Schauder fixed point
theorem, it suffices to show that T (B[0, r0 ]) is contained in a compact subset of B[0, r0 ].
However, that is not trivial, because the ball B[0, r0 ] is not compact in the function space,
(Ξ(Wn , J ), k · k). To capture a compact set in this space, we begin with the relatively
compact sets, for the closure of a relatively compact set is compact. As we have mentioned, ξ ∈ (Ξ(Wn , J ), k · k) if any only if each of its coordinate functions, ξi , belongs to
the Lebesgue space, L1 (Xp , BX , µp ; <1 ). Utilizing the characterization of relatively compact subsets in Lebesgue spaces by Dunford and Schwartz(1958), we derive necessary and
sufficient conditions for relative compactness in (Ξ(Wn , J ), k · k). (See Proposition C.4.3
for the results of a general form of private information about Xip ’s.) On the basis of these
discussions, a theorem about the set of equilibria is derived.
Proposition 4.5.1 Under Assumptions 4.5.2 and 4.5.3, if in addition,
Z
max
|T (ξ)i (x + x
e)fp (x + x
e) − T (ξ)i (x)fp (x)|dx → 0,
1≤i≤n Xp
(4.5.8)
47
The theorem is cited as Proposition C.4.4 in Appendix C.4. See Bonsall (1962) for details. A brief
introduction can be found at http://en.wikipedia.org/wiki/Schauder fixed point theorem.
116
as x
e → 0, uniformly for any ξ ∈ B[0, r0 ]; and
Z
max
|T (ξ)i (x)|fp (x)dx → 0,
(4.5.9)
1≤i≤n Xp −Cr
as r → ∞, uniformly for all ξ ∈ B[0, r0 ], the set of equilibria, E(X, Wn ), is a nonempty and
compact subset of (Ξ(Wn , J ), k·k) and is contiained in the closed ball B[0, r0 ]. In particular,
(4.5.8) and (4.5.9) are satisfied, if
0
0
1. Hi (·)’s are uniformly bounded, i.e., max1≤i≤n supa∈<1 |Hi (a)| ≤ B for some B ;
2. E[Xip |Z = z] < ∞, for all i; and
3. For some δ0 > 0, for each i, there is an function gi (x, x
b) such that
R
R
b)|dxdb
x < ∞, fp,i (x + x
e, x
b) ≤ gi (x, x
b), a.e., for any x
e in the cube
Xp
Xp |gi (x, x
i,m
Ji
Cδ0 , where fp,i (·, ·) is the joint density of Xip and Xjp , i 6= j, conditional on public
information Z = z.48
This existence theorem requires some conditions about the behaviors, Hi (·)’s, and the
joint distribution of Xip ’s conditional on public information Z = z. Conditions (4.5.8)
and (4.5.9) are about uniform convergence, which may be hard to verify. However, when
Hi (·)’s are uniformly bounded, we just need to verify some distribution conditions. This
type of scenarios include models with bounded behaviors, such as the models for binary
choices, ordered multinomial choices and two-sided censored choices. In that case, we can
see that if conditional on public information Z = z, Xip ’s have a continuous joint distribution
on a bounded support, the sufficient conditions about distribution are satisfied. When
the support is unbounded, for some distributions, the joint density of Xip ’s can still be
dominated by a integrable function. One example is the normal distribution. See Lemma
C.4.6 in Appendix C.4. Therefore, when outcomes and/or behaviors are uniformly bounded
and the joint distribution of Xip and Xjp (i 6= j) is normal conditional on public information
Z = z, there is at least one equilibrium and the set of equilibria is a compact set contained
in a closed ball B[0, r0 ].
48
A cube, Cr , is the set in <kp such that for any x
e ∈ Cr , all its coordinates are within [−r, r].
117
The compactness of E(X, Wn ) is important. Given this result, for any small positive
number, η > 0, there is a finite number of points in E(X, Wn ), ξ e,1 , · · · , ξ e,K , such that
for any equilibrium ξ e , we can pick one of those points, say ξ e,k , with kξ e − ξ e,k k < η.49 This
way, the finite set ξ e,1 , · · · , ξ e,K can be viewed as an approximation of all equilibria with
precision η. This is the basis on which we specify the distribution of equilibrium selection
in the current structure of incomplete information.
Conditions (4.5.8) and (4.5.9) are used to ensure that we can apply the Schauder fixed
point theorem to the whole ball B[0, r0 ]. If we restrict to equilibria with some special
properties so that they are contained in a compact subset of B[0, r0 ], we may apply the
Schauder fixed point theorem in this compact subset. Then (4.5.8) and (4.5.9) will be
satisfied, some other possible equilibria will be excluded though. For the model of investment
decisions with Cobb-Douglas production function, Assumption 4.5.3 is satisfied. To see this,
via the Jensen’s inequality and the Hölder’s inequality,
R
Xp
|T (ξ)i (x)|fp (x)dx
kξk
n
o
R RR
P
ι
max u(X g , Xic , y) + λ j6=i Wn,ij ξj (y) − i , 0
f (i )di fp (y|x)dyfp (x)dx|
Xp
=AL1−ι
i
R
≤AL1−ι
i
Xp
R
[HT (u(X g , Xic , y) + λ
kξk
ι
W
ξ
(y))]
fp (y|x)dyfp (x)dx
n,ij j
j6=i
P
kξk
R
P
ι
g
c
,
y)
+
λ
W
[
H
(u(X
,
X
n,ij ξj (y))fp (x, y)dxdy]
T
i
j6=i
≤AL1−ι
i
kξk
P
ι h Z H (u(X g , X c , y) + λ
T
i
j6=i Wn,ij ξj (y))
1−ι kξk
P
=ALi
kξk
|u(X g , Xic , y) + λ j6=i Wn,ij ξj (y)|
P
iι
|u(X g , Xic , y) + λ j6=i Wn,ij ξj (y)|
·
fp (x, y)dxdy
kξk
P
g
c
iι
ιhZ
H
T (u(X , Xi , y) + λ
kξk
p
j6=i Wn,ij ξj (y)) p
P
≤AL1−ι
fp (x, y)dxdy
i
c
g
kξk
|u(X , Xi , y) + λ j6=i Wn,ij ξj (y)|
h Z |u(X g , Xic , y) + λ P
iι
q
j6=i Wn,ij ξj (y)| q
·
fp (x, y)dxdy ,
kξk
(4.5.10)
for some p, q ≥ 1 with 1/p + 1/q = 1. Similar to our discussion in the previous section, if |λ|kWn k∞ < ∞,
R
Xp
|T (ξ)i (x)|fp (x)dx
kξk
goes to zero as kξk goes to infinity. Then
49
See Dunford and Schwartz (1958) and Folland(1999) for a brief introduction on compactness and relative
compactness.
118
limkξkE →∞
kT (ξ)k
kξk
= 0. However, other conditions may not hold in this case. Instead, if
we focus on the case that Xip ’s have a compact support and all equilibrium conditional
expectations, ξ e ’s, are continuous, we may derive a similar characterization of the set of
equilibria.50
Nonetheless, for some models with unbounded choices, such as the linear model of
continuous choices and the Tobit model, in order to satisfy Assumption 4.5.3, we might have
to impose strong conditions on λ and kWn k∞ . To avoid that, we employ other techniques
to analyze those models. First consider the linear model with continuous choices. In this
case, the equilibrium condition for conditional expectations is:
Z
Z X
e)fp (e
x|x)de
x+ λ
ξie (x) = u(X g , Xic , x
Wn,ij ξje (e
x))fp (e
x|x)de
x.
x
e
x
e
j6=i
Reorganizing the above equation, we get that
Z
Z
X
e
e
ξi (x) + (−λ)
Wn,ij ξj (e
x))fp (e
x|x)de
x = u(X g , Xic , x
e)fp (e
x|x)de
x.
x
e
(4.5.11)
(4.5.12)
x
e
j6=i
This is an n-dimension Fredholm integration alternative, whose solutions are investigated
0
by general Fredholm theory.51 Especially, when u(X g , Xic , Xip ) = v(X g , Xic ) + Xip β and
0
E[Xjp |Xip ] = µ + CXip , by Yang and Lee (2014), when Inkp − λ(Wn ⊗ C ) and In − λWn are
both invertible, there is one and only one linear equilibrium conditional expectation, ξ e .
For the Tobit model with heterogeneous cutoffs, define
P
pi (x) = F (u(X g , Xic , x) + λ j6=i Wn,ij ξje (x) − v(X g , Xic )),
R
for all i and x ∈ Xp . Then we have that ξi (x) = Hi (F−1 (pi (e
x)) + v(X g , Xic ))f (e
x|x)de
x.
The equilibrium condition can be rewritten as
Z
X
pi (x) = F (u(X g , Xic , x) + λ
Wn,ij Hj (F−1 (pj (e
y )) + v(X g , Xic ))f (e
y |x)de
y − v(X g , Xic )).
j6=i
50
In that case, we may use the sup-norm for each coordinate function, the ξi ’s, instead of the k · k1 norm.
51
See Ruston (1986) for systematic discussions. Lax(2002) provides with
a succinct
discussions. Define
R
P
an operator, TF such that for any ξ, TF (ξ)(x1 , · · · , xn )l = TF (ξ)l (xl ) = xe (−λ) j6=i Wn,ij ξje (e
x)fp (e
x|xl )de
x,
p
p
kp
for all l and xl ∈ X . If X is compact in < , TF is a compact operator from (Ξ(Wn , J ), k · k) to a space of
p
continuous functions
R on gX . cThen this integration
R has a solution if
0
e)fp (e
x|xn )de
x) is orthogonal to the null space
u
e(x1 , · · · , xn ) = ( xe u(X , Xi , x
e)fp (e
x|x1 )de
x, · · · , xe u(X g , Xic , x
0
of the transpose operator, TF .
119
As F (·) is bounded, we can apply Proposition 4.5.1 to analyze equilibrium set. If v(X g , Xic ) =
0 for all i, that is the case of the Tobit model with homogeneous cutoffs normalized to 0.
Equilibrium Set Approximation
Although we can approximate the whole equilibrium set by a finite number of equilibria
for any level of precision, as functions defined on a continuum, it is not possible to derive
the exact values of those functions at every point in their domains. In order to apply the
stochastic selection rule, we approximate such a function. Four possible approximation
approached are discussed here.
The first method uses the simple functions. In Appendix C.4, we show that ξ =
(ξ1 , · · · , ξn ) ∈ (Ξ(Wn , J ), k · k) if and only if each of its coordinate function, ξi , is an
element of the Lebesgue space, L1 (Xp , Bp , µp ; <1 ). In such a function space, there is a
special set of functions called the “simple functions”. A function ξ from Xp to <1 is µp
simple if ξ has only a finite set of values, {ν1 , · · · , νK }; and for any νk , ξ −1 (νk ) is an element of the σ-algebra, Bp . By Dunford and Schwartz (1958), the set of µp -measurable
simple functions is dense in L1 (Xp , Bp , µp ; <1 ). That is, for any given level of precision, we
can always find a simple function to approximate a function in L1 (Xp , Bp , µp ; <1 ). For any
integer K, choose a finite partition of Xp , U K,1 , · · · , U K,K , where each U K,k is in Bp . For
P
x ∈ U K,k ). Then the
any i = 1, · · · , n, define a simple function as ξiK (e
x) = K
k=1 κi,K,k I(e
equilibrium condition (4.5.2) can be approximated as
K
X
κi,K,k I(e
x ∈ U K,k ) ≈
Z
Hi (u(X g , Xic , ye)+λ
k=1
X
Wn,ij
j6=i
K
X
κj,K,k I(e
y ∈ U K,k ))fp (e
y |e
x ∈ U K,k )de
y , (4.5.13)
k=1
for all i = 1, · · · , n and k = 1, · · · , K. Therefore,
Z
κi,K,k =
Hi (u(X g , Xic , ye) + λ
X
Wn,ij
j6=i
=
K
X
P (U
K,k
0
|U
K,k
)E[Hi (u(X
K
X
κj,K,k I(e
y ∈ U K,k ))fp (e
y |e
x ∈ U K,k )de
y
k=1
g
, Xic , ye)
(4.5.14)
+λ
k=1
X
j6=i
120
Wn,ij κj,K,k0 )|e
x∈U
K,k
, ye ∈ U
K,k
0
],
0
0
where P (U K,k |U K,k ) = P (Xjp ∈ U K,k |Xip ∈ U K,k ). As i runs over 1, · · · , n and k runs
over 1, · · · , K, there are nK equations for nK unknowns. This is similar to the previous
analysis for the case of discretely distributed Xip ’s. What is different is that the conditional
distribution, fp (e
y |xpK,k ), is not discrete. When all Hi (·)’s and fp (·) can be extended to the
complex space in an analytic way, as the right-hand-side of (4.5.14) is a weighted sum of
integrations over Hi (·), the above system can be extended to complex spaces. Multiple
solutions for κj,K,k ’s can then be computed by the homotopy method. The corresponding
simple functions are then used as an approximation of the equilibrium set.
To approximate equilibrium conditional expectation functions by simple functions is, in
essence, to discretize the domain, Xp . Rust(1987) uses this method to solve for the optimal
engine replacement scheme in an optimization programming problem. Compared with his
work, instead of a unique optimal scheme in Rust(1987), it is possible to get multiple
solutions to (4.5.14), which makes the model in this paper more computationally intensive.
Precision depends on the choice of partitions. Generally speaking, the finer the partition
(increasing K), the more precise the approximation. However, it remains a problem how to
choose the cutoffs for a partition.
Inspecting (4.5.2), the value of an equilibrium conditional expectation is determined by
an integration, which can be approximated by the quadrature method(See Judd (1998) and
Lee(2001) for details.). Employing this approximation of integration, we get the second
method to approximate equilibria. Take the Gauss-Legendre quadrature as an example.
Consider the simple case that the dimension for Xip ’s is kp = 1 and Xp is an interval, [a, b].
We have that
ξie (x) =
b
Z
a
≈
K
X
Hi (u(xg , xci , x
e) + λ
X
Wn,ij ξje (e
x))fp (e
x|x)de
x
j6=i
ωk Hi (u(xg , xci ,
k=1
· fp (
X
(υk + 1)(b − a)
(υk + 1)(b − a)
+ a) + λ
Wn,ij ξje (
+ a))
2
2
(4.5.15)
j6=i
(υk + 1)(b − a)
b−a
+ a|x)
,
2
2
where υk , for k = 1, · · · , K, are abscissae and ωk are the weights. They are fixed. If we
can get the values of expectation function on a finite number of points, xpk =
(υk +1)(b−a)
2
+ a,
for k = 1, · · · , K, we can approximate the value of expectation function at any point in
121
the support of Xip . To be specific, when we take x = xpk for k = 1, · · · , K, we will get
nK equations for nK unknowns, ξie (xpk ) i,k ’s. Using the homotopy method, we get a
n
o
, for some d = 1, · · · , D. For each d, plugging
finite number of solutions, ξie,d (xpk )
i,k
n
o
back into (A.2.1), we derive an approximation of an equilibrium conditional
ξie,d (xpk )
i,k
expectation function. If Xip has a full support, we need to change variables so that integral
is over a bounded interval. This quadrature method is used by Yang and Lee (2014) and
Yang, Qu and Lee (2014) to derive approximations to equilibrium conditional expectation
functions when there are private information in exogenous characteristics. The difference
lies in the way to solve the system of equations (A.2.1). They search for the unique solution
under condition (4.3.8). Instead, in this paper, the homotopy method is used to derive all
o
n
the solutions, ξie,d (xpk )
.
i,k
When
Xip ’s
are of multiple dimensions, the tensor product may be used. Details are
introduced by Judd (1998). However, when the dimension of Xip ’s are high, computation can
be intensive. Therefore, for high-dimension privately known characteristics, the stochastic
integration may be used for approximation instead. To be specific, let g(·) be a density with
its support containing the support of Xip such that fp (xp |x)/g(xp ) is well defined. Then we
can generate K random draws, say, xpk , from density h(·). The stochastic approximation
will be
ξie (x)
K
p
X
1 X
e p fp (xk |x)
g c p
Wn,ij ξj (xk ))
.
Hi (u(x , xi , xk ) + λ
≈
K
g(xpk )
k=1
(4.5.16)
j6=i
The fourth method is a combination of the quadrature method and basis function approximation. That is, an equilibrium function is approximated by a linear combination of
a finite number of bases. In our model, ξ = (ξ1 , · · · , ξn ) is an element of (Ξ(Wn , J ), k · k)
if and only if each of its coordinate function, ξi is an element of the Lebesgue space,
L1 (Xp , Bp , µp ; <1 ). We can use basis of this Banach space to approximate an equilibrium.
For example, the dimension of Xip ’s is kp = 1 and Xp = [a, b]. We first transform ξ into a
function on [0, 1] by changing variables.
122
1. When −∞ < a < b < +∞, by setting x = a + (b − a)e
x,
Rb
R1
x)fp (a + (b − a)e
x)de
t.
a ξi (x)fp (x)dx = 0 bξ(a + (b − a)e
Define ξei (e
x) = bξ(a + (b − a)e
x)fp (a + (b − a)e
x).
2. When a = −∞ and b < +∞, by setting x = log(be
x),
Rb
R1
x.
x))fp (log(be
x)) xe1 de
a ξi (x)fp (x)dx = 0 ξ(log(be
Define ξei (e
x) = ξ(log(be
x))fp (log(be
x)) xe1 .
3. When a > −∞ and b = +∞, by setting x = log(a/(1 − x
e)),
Rb
R1
1
e))fp (log(a/(1 − x
e))) 1−e
x.
x de
a ξi (x)fp (x)dx = 0 ξ(log(a/(1 − x
1
Define ξei (e
x) = ξ(log(a/(1 − x
e))fp (log(a/(1 − x
e))) 1−e
x.
4. When a = −∞ and b = +∞, by setting t = log(e
x/(1 − x
e)),
R1
Rb
1
x/(1 − x
e)))fp (log(e
x/(1 − x
e))) xe(1−e
x.
a ξi (x)fp (x)dx = 0 ξ(log(e
x) de
1
Define ξei (e
x) = ξ(log(e
x/(1 − x
e)))fp (log(e
x/(1 − x
e))) xe(1−e
x) .
We can see that ξi is an element of L1 (Xp , Bp , µp ; <1 ) if and only if ξei is in
L1 ([0, 1], B[0,1] , m; <1 ), the space of Lebesgue integrable functions defined on the unit interval [0, 1]. Let
τ0 (e
x) = 0,
(4.5.17)
for all x
e ∈ [0, 1]; and
τk,j = 2k−1 (I((2j − 2)2−k ≤ x
e < (2j − 1)2−k ) − I((2j − 1)2−k ≤ x
e < 2j · 2−k )), (4.5.18)
for k and 1 ≤ j < 2k−1 and x
e ∈ [0, 1]. Sort them in the order, τ0 , τ1,1 , τ2,1 , τ2,2 ,
· · · , and re-label those functions as τe0 , τe1 , τe2 , · · · . Then {e
τk } is called the Haar bases for
L1 ([0, 1], B[0,1] , m; <1 ).52 See Figure C.1 for a graphic illustration. Choose an integer L,
we approximate ξei by a linear combination of a finite number of those basis functions, i.e.,
P
ξei ≈ L
el . From it, we get an approximation of ξ. Take the first case listed above as
l=0 κi,L,l τ
P an example, ξi (x) ≈ L
κ
τ
e
((x−a)/(b−a))
/
bf
(x)
. Plugging this approximation
p
i,L,l l
l=0
52
Since ke
τk k1 = 1, the Haar basis is a basis with unit norm.
123
back into the consistency condition, we get that
PL
κi,L,l τel ((x − a)/(b − a))
bfp (x)
Z b L
X
X
κj,L,l0 τel ((e
y − a)/(b − a)) Hi u(X g , Xic , ye) + λ
Wn,ij
=
fp (e
y |e
x)de
y,
bfp (e
y)
a
0
l=0
j6=i
(4.5.19)
l =1
As for the integration, we can choose the quadrature points as we did before, i.e.,
L
X
l=0
=
K
X
k=1
κi,L,l
τel ((x − a)/(b − a))
bfp (x)
L
X
X
(υk + 1)(b − a)
τel ((υk + 1)/2)
ωk Hi u(X g , Xic ,
+ a) + λ
Wn,ij
κj,L,l0
(υ
+1)(b−a)
2
bfp ( k
+ a)
0
j6=i
l =1
(4.5.20)
2
(υk + 1)(b − a)
b−a
· fp (
+ a|e
x)de
y
,
2
2
for i = 1, · · · , n, l = 1, · · · , L, and k = 1, · · · , K. The Gauss-Legendre quadrature
abscrissae, υk , and weights, ωk , are fixed. The Haar bases, τel ’s, are known functions.
From the data, it is possible to get the distribution density of Xip ’s conditional on public
information. Consequently, once we plug in x =
(υk +1)(b−a)
,
2
the value of
τel ((x−a)/(b−a))
bfp (x)
can
be calculated. In this way, we derive a system of nK nonlinear equations for nL coefficients,
κi,L,l ’s. Choosing K = L and applying the homotopy method when analytic extension is
possible, we may get multiple solutions, κdi,L,l , for d = 1, · · · , D. They correspond to multiple
equilibria. Approximations when [a, b] is unbounded can be computed in a similar way by
changing variables.
Comparing the above approaches, we can see that using the first method, simple function
approximation, we first fix the class of functions which are used to make approximation
and then pin down the unknown coefficients of equilibrium functions by the equilibrium
condition. However, there is not a general guidance on the choice of partitioning cutoffs. For
the second and third method, we do not fix the form of equilibrium expectation functions.
Instead, we first solve the values of an equilibrium expectation function at a fixed finite set
of points and then use those values to approximate the equilibrium expectation function
at any point in its domain. These two methods also specify how those points are chosen.
124
Using the last method, we construct a flexible form of functions using basis functions and
then employ the quadrature method to approximate integration. Using the second method,
approximation precision depends on the number of quadrature abscissae. For the fourth
method, instead, the approximation performance hinges on the number of function basis
chosen.
Equilibrium Selection and Parameter Identification
Given a precision η > 0, consider a finite approximation to the set of equilibria,
E(X, Wn ), E0 (X, Wn ) = ξ e,1 (X, Wn ), · · · , ξ e,D (X, Wn ) ,
we can derive an approximation to the likelihood function (4.3.10):
e ; X, Wn ) =
L(Y
D
X
ρ(ξ e,d )
n
Y
f (yi |ξ e ).
(4.5.21)
i=1
d=1
For example, we can set
0
ρe(ξ
e,d
exp(α γ(ξ e,d ; X, Wn ))
; E0 (X, Wn ), α) = PD
d0 =1
0
exp(α0 γ(ξ e,d ; X, Wn ))
.
(4.5.22)
Then we derived the (approximated) sample likelihood function:
e X, Wn ) =
L(y;
D
X
ρe(ξ d,k ; E(X, Wn ), α)
n
Y
f (yi |ξ e,k ).
(4.5.23)
i=1
d=1
When there are G independent groups, the corresponding approximation to the log
likelihood of the whole sample is
e 1 , · · · , YG |β, λ, σ, α)
log L(Y
=
Dg
ng
0
X
Y
exp(α γ(ξ e,g,d ; Xg , Wg ))
log(
f (yi,g |ξ e,g,d )).
PDg
0
0
e,g,d
e
exp(α γ(ξ
; Xg , Wg )) i=1
g=1
d=1
d0 =1
G
X
(4.5.24)
As for identification, β, λ, and σ can still be identified using the strategy of “identification at infinity”. That is because those parameters can be separately from α using this
technique. However, it is more difficult to identify α. The reason is that (4.5.23) is not the
exact likelihood but an approximation. However, as the set of equilibria is determined by
β, λ, and σ, it does not depend on α. If we assume that all independent groups choose the
125
same approximation, Dg = D, and use the same probability mass, (4.5.22), we can identify
α from variation across groups.
Parameters can be estimated either by directly maximizing the approximated likelihood
function or simulated moment conditions, similar to the approach used when all exogenous
characteristics are public information. The performance of estimates relies on the approximation. In this paper, actually, there are two types of approximations. First, approximate
the set of equilibria by a finite subset. Second, approximate each of those functions.
Regarding the equilibrium conditional expectation on i’s behavior based on a structure
p
of private information, Ji,m , as a function of the realization of Xi,m
, the model under a
general form of incomplete information about Xip ’s can be estimated in a similar way.
4.6
4.6.1
Discussions and Extensions
Group Unobservables
In the previous discussions, all exogenous characteristics, X g , Xic ’s, and Xip ’s are observable to econometricians. In reality, however, some variables are known to agents but
unavailable from the data. For example, researchers studying students’ class performances
may not know how good the teachers are, which is known to the students. Similarly, in
market sale data, it is possible that no information about the wealth of customers is available from the data. But the firms may have some relevant information. In this section, we
take into account unobservable variables that are public information in a group and can
affect the payoff of all group members. Modeling them as random effects, we derive the
following framework:
∗
yi,g = hi,g (yi,g
),
(4.2.10 )
and
∗
yi,g
= u(X g , Xic , Xip , ζ g ) + λ
X
j6=i
126
Wg,ij E[yj,g |XJpi,g , Z] − i,g .
(4.2.20 )
where g is the group index. By (4.2.20 ), we implicitly assume that all group unobervables
are additive and summarize them as a single variable, ζ g . Assume that ζ g ’s are i.i.d.
independent of all the other exogenous variables, social relations, as well as the idiosyncratic
shocks. Their distribution is denoted by the pdf, fζ (·; ϑ).
Because each ζ g is publicly known to agents in g, for interactions among group members,
it acts the same as X g . For any group g, we can characterize the equilibrium set and use a
parametric stochastic selection rule to complete the model. Suppose that the distribution
of equilibrium selection is
0
ρ(ξ e ; E(X g , X c , X p , ζ g , Wg ), α) = ρ(α γ(ξ e ; X g , X c , X p , Wg ); E(X g , X c , X p , ζ g , Wg )),
(4.6.1)
with some known selection rule γ(ξ e ; X g , X c , X p , Wg ) and unknown parameter α. The
above form shows that the group unobservables only affect the set of equilibria, but not
the selection rule. Therefore, given a set of equilibria(finite, or approximated by a finite
number of equilibrium expectation functions), the realizations of unobserved group features
does not influence the distribution of equilibrium outcomes. This assumption is reasonable
if the selection rule is consistent with social welfare maximization or Pareto optimization.
The selection rules in previous sections satisfy (4.6.1). Then we can complete the model
and write down the sample log likelihood function as follows:
ng
G
hZ Z
Y
X
log L(Y ; X, W ) =
log
f (yi,g |ξge )·
g=1
ξ e ∈E(X g ,X c ,X p ,ζeg ,Wg ) i=1
(4.6.2)
i
ρ(α γ(ξ e ; X g , X c , X p , Wg ); E(X g , X c , X p , ζeg , Wg ))fζ (ζeg ; ϑ)dζeg .
0
Because X g ’s are observed from the data set and ζ g ’s are not, identification and estimation methods will be different from those in previous sections.
Assuming that u(·) is a linear function of exogenous characteristics, i.e.,
0
0
0
p
c , Xp , ζg) = β
g
c
g
u(X g , Xi,g
0,0 + X β0,1 + Xi,g β1 + Xi,g β2 + ζ ,
i,g
for all i and g. We normalize the mean of ζ g to be equal to 0, i.e., E[ζ g ] = 0. For data about
a single group, applying the technique of identification at infinity, we can identify β1 , β2 ,
0
and βe0 = β0,0 + X g β0,1 + ζ g (as a whole) under certain conditions, based on our previous
127
discussions. If Hi (·) is strictly increasing, the group average behaviors will be increasing in
0
βe0 = β0,0 + X g β0,1 + ζ g , which helps us identify β0,0 , β0,1 , and the distribution of ζ g ’s from
variations across groups.
As for estimation, we can calculate integration over unobserved ζ g ’s by stochastic simulations. That is to say, we randomly take S draws from the distribution fζ (·; ϑ) for each
g, ζ g,1 , · · · , ζ g,S and calculate the simulated log likelihood:
b ; X, W ) =
log L(Y
G
X
g=1
log
ng
Y
S Z
h1 X
S
s=1
ξ e ∈E(X g ,X c ,X p ,ζ g,s ,Wg )
f (yi,g |ξge )·
i=1
(4.6.20 )
i
0
ρ(α γ(ξ e ; X g , X c , X p , Wg ); E(X g , X c , X p , ζ g,s , Wg )) .
4.6.2
Peer Effects
Consider the case that an agent’s behaviors are affected by the performances of her
peers. That is, Wn,ij = 1 for all i 6= j. Since every agent makes predictions on anyone
else, we denote the set of all possible private information about Xip ’s in the group as
n
o
Jb = Je : Je = Ji for some i = 1, · · · , n . Denote the number of elements in this set as
n
o
M0 . We can denote the set as Jb = Je1 , · · · , JeM0 . For each i, there is a unique m(i)
with 1 ≤ m(i) ≤ M0 such that Ji = Jem(i) . We use xpJ,m to represent one realization of
p
the random vector XJ,m
corresponding to the private information, Jem . The corresponding
p
= XJpi , representing the private information
support is denoted as XpJ,m . Note that XJ,m(i)
e ), takes the following special
known to i. The conditional expectation function, ξ e = (ξi,m
form:
e
ξ e (xpJ,1 , · · · , xpJ,M0 )i,m = ξi,m
(xpJ,m ) = E[Hi (u(Xi ) + λ
X
p
p
e
ξj,m
(Xm(i)
))|XJ,m
= xpJ,m , Z],
j6=i
(4.6.3)
for all i = 1, · · · , n, m = 1, · · · , M0 , and xpJ,m ∈ XpJ,m .53 When all exogenous covariates are public information, ξ e reduces to an n × 1 vector, satisfying ξie = E[Hi (u(Xi ) +
P
λ j6=i ξje )|Z = z]. Since only the total expected behaviors of peers is taken into account
53
In the general case, we define the expectation about i conditional on private information for agents other
than i. For convenience, when discussing peer effects, we also define conditional expectation about i based
on private information the same as i’s. Since Wn,ii = 0 for all i, model equilibria and implications do not
change with this small alteration.
128
when an agent is making decisions, we wonder whether it is possible to represent an equilibrium by a vector-valued function, the dimension of whose range is less than n. That means
e ’s, is reduced. Lee, Li and Lin(2014) prove that
the number of coordinate functions, ξi,m
this is possible in the binary choice model when all exogenous characteristics are public
information and only the idiosyncratic shocks are privately known. Their results can be
extended to the general setting in this paper.
Let ξ = (ξ 1 , · · · , ξ M0 ) be a vector-valued function such that
ξ(xpJ,1 , · · · , xpJ,M0 )m = ξ m (xpJ,m ),
(4.6.4)
for any m = 1, · · · , M0 . For any i, there is a unique m(i) = 1, · · · , M0 , such that Ji = Jem(i) .
Consider a function equation system, Gi (ξ m(i) , ξi ; X, Wn ) = 0, where ξi = (ξi,1 , · · · , ξi,M0 )
is a vector-valued function which satisfies (4.6.4) and that for any m,
Gi (ξ m(i) , ξi ; X, Wn )(xpJ,1 , · · · , xpJ,M0 ) m
= Gi (ξ m(i) , ξi ; X, Wn )m (xpJ,m )
p
p
p
=E[Hi (u(Xi ) + λξ m(i) (XJ,m(i)
) − λξi,m(i) (XJ,m(i)
))|XJ,m
= xpJ,m , Z = z] − ξi,m (xpJ,m )
=0,
(4.6.5)
for all xpJ,m ∈ XpJ,m . Applying the Brouwer fixed point theorem, for any ξ m(i) , there is
a function ξi satisfying the above system of equations. For any χ = (χ1 , · · · , χM0 ) with
χm ∈ L1 (XpJ,m , BJ,m , µp ; <1 ), consider the linear operator ∆i (ξ m(i) , ξi ; X, Wn ) such that
∆i (ξ m(i) , ξi ; X, Wn )(χ) m (xpJ,m )
p
p
= − λE[DHi u(Xi ) + λξ(XJ,m(i)
) − λξi,m(i) (XJ,m(i)
) ·
p
p
χm (XJ,m(i)
)XJ,m
= xpJ,m , Z = z] − χm (xpJ,m );
for m = m(i); and
∆i (ξ m(i) , ξi ; X, Wn )(χ)
m
(xpJ,m ) = −χm (xpJ,m ),
for m = 1, · · · , M0 and m 6= m(i), where DHi (a) denotes the derivative of Hi (·) at point a.
∆i (ξ m(i) , ξi ; X, Wn ) is the Fréchet derivative of Gi with respect to ξi at (ξ m(i) , ξi ; X, Wn ).
129
If ∆(ξ m(i) , ξi ; X, Wn ) is an isomorphism, by the Implicit Function Theorem in Banach
e
spaces, for a neighborhood ξ m(i) , there is only one ξie such that the functional equation,
Gi (ξ m(i) , ξi ; X, Wn ) = 0 is satisfied. In this way, we can derive an operator Λi , which defines each function ξi as the image of ξ mi such that Gi (ξ m(i) , Λi (ξ m(i) ); X, Wn ) = 0. If
∆i (ξ m(i) , ξi ; X, Wn ) is an isomorphism for all i, we call the group to be regular.
e
e
e
Proposition 4.6.1 Suppose that the group is regular. If there is a function ξ = (ξ 1 , · · · , ξ M0 )
such that it satisfies (4.6.4), and
e
ξ m (xpJ,m ) =
n
X
e
p
)
E[Hi (u(X g , Xic , Xip ) + λξ m(i) (XJ,m(i)
(4.6.6)
i=1
−
e
p
p
λ(Λi (ξ m(i) ))m(i) (XJ,m(i)
))|XJ,m
=
xpJ,m , z],
where Λi (·) is defined above, then there is a vector of functions,
e
e
e
e
ξ e = (ξ1,1
, · · · , ξ1,M
(0) , · · · , ξn,1 , · · · , ξn,M0 ),
such that (4.6.3) holds for all i, m, and xpJ,m ∈ XpJ,m . On the contrary, if there is a vector of
e , · · · , ξe
e
e
functions, ξ e = (ξ1,1
1,M0 , · · · , ξn,1 , · · · , ξn,M0 ), satisfying (4.6.3), there is a function
e
e
e
ξ = (ξ 1 , · · · , ξ M0 ) such that (4.6.4) and (4.6.6) hold.
Proof. See Appendix C.5.
Particularly, when all exogenous characteristics are public information, the functional
equation system (4.6.5) reduces to
e
e
Gi (ξ , ξie ; X, Wn ) = Hi (u(Xi ) + λξ − λξie ) − ξie = 0,
(4.6.7)
which is just a nonlinear equation. The regularity condition then reduces to
e
−λDHi (u(Xi ) + λξ − λξie ) − 1 6= 0.
We can see that if DHi (a) > 0 for all i and a and λ ≥ 0, the regularity condition is
e
satisfied. Then a BNE is equivalent to the expected group total outcomes, ξ , which is
a scalar. The system analyzed by Lee, Li and Lin (2014) corresponds to the special case
that Hi (a) = F (a). When each Xip is known to i only and the joint distribution of Xip ’s
conditional on the public information Z = z is exchangeable with a pdf fp (·), (4.6.5) takes
130
the following form:
e
Gi (ξ , ξie ; X, Wn )(x)
Z
e
=
Hi (u(X g , Xic , x
e) + λξ (e
x) − λξie (e
x))fp (e
x|x)de
x − ξie (x)
(4.6.8)
Xp
=0,
for all i and x ∈ Xp . In this case, a BNE is equivalent to the expectation of group total
e
outcomes conditional on the realization of an individual’s self-known characteristics. ξ (·)
is a function mapping the realization of that privately known characteristics to a real number. Therefore, focusing on peer effects, the dimension of an equilibrium can be reduced,
simplifying estimation.
4.6.3
Deterministic Rule
In previous sections, we assume that an equilibrium is selected from the set of equilibria
according to a stochastic rule. It is also possible to use a deterministic rule. To be specific,
let E(X, Wn ) denote a set of equilibria for a group (X, Wn ). It is equivalent to the set
of solutions to a system of (generally nonlinear, functional) equations, S(ξ; X, Wn ) = 0.
Let Π(ξ; X, Wn ) denote a real-valued function of equilibria and group features (X, Wn ). For
instance, Π(ξ; X, Wn ) can be the expected total utility of the group, or the expected number
of market entrants. We select an equilibrium to maximize the objective function:
max
Π(ξ e ; X, Wn ) s.t. S(ξ e ; X, Wn ) = 0.
e
ξ
(4.6.9)
When all exogenous covariates are known to the public, (4.6.9) is just an ordination optimization problem. According to our discussion in Section 4.4, the set of equilibria, or
equivalently, the set of zeros for S(·; X, Wn ), is finite. To solve (4.6.9) is to pick one of
those finitely many points to maximize the criterion function. In general, however, the
conditional expectation function, ξ e , is a vector valued function defined on subsets of the
Euclidean spaces, (4.6.9) is a problem of functional optimization. In that case, we may
solve the optimization problem using optimal control.
131
For peer effects, especially, assuming that an equilibrium is selected under a fixed equilibrium selection rule, it is possible to derive a simpler estimation method. When both
Π(ξ e ; X, Wn ) and S(ξ e ; X, Wn ) are continuous with exogenous characteristics, X, and the
set of equilibria is compact, we can apply the Maximum theorem, which claims that the set
of solutions to (4.6.9) is an upper-hemicontinuous correspondence of X.5455 If we can assure
that there is a unique maximizer (by imposing some convexity conditiona, for example),
the unique optimal equilibrium will be a continuous function of the group characteristics of
∗
X. So do the mean, ξ . Therefore, when there is a large number of independent groups,
∗
∗
we can non-parametrically estimate ξ from group means. By Proposition 4.6.1, with ξ ,
we can recover conditional expectations on individual behaviors, the ξi∗ ’s. Then we can
estimate model parameters either from sample likelihood function or moment conditions,
∗
conditioning on the equilibrium represented by ξ .
The equilibrium conditional expectation acts like the group aggregate in the model built
by Bisin et al.(2011). It shows that without assuming that the single equilibrium is played
repeatedly over time periods or across markets, as long as the same criterion rule is used
and there is a unique optimizing equilibrium, we can still use the two-step estimation, if
the actions of individuals are influenced by the average of her peers.
4.7
Binary Choice Models: Analysis and Experiments
This section discusses in detail the equilibrium sets of two forms of binary choice models,
(4.2.5) and (4.2.6). Additionally, by Monte Carlo experiments, there is a comparison about
the small sample performances between the maximum likelihood estimation with complete
likelihood for multiple equilibria in this paper and the nested fixed point maximum likelihood estimation assuming unique equilibrium used by Yang and Lee(2014).
54
In our discussion on equilibria in the general information structure in Appendix C.4, we show that
under some conditions, all equilibria is in a compact set. Since the set of zeros of the continuous function,
S(·; X, Wn ), is closed. The set of equilibria is itself compact.
55
As for the Maximum Theorem, see Stokey et al.(1989) for a proof when criterion, Π(·, X, Wn ), and the
constraint, S(·; X, Wn ), are functions defined on Euclidean spaces. A proof for general metric spaces can be
found in Wikipedia. Seehttp://en.wikipedia.org/wiki/Maximum theorem.
132
4.7.1
Binary Choice Model I
Consider the model, (4.2.1) and (4.2.2), where hi (z) = I(z > 0) for all i, such that
(4.2.5) holds. According to Yang and Lee(2014), this corresponds to a game where agents
in a group simultaneously choose between 0 and 1. In this case, the expected utility following
choice “0” is normalized to be zero. Instead, if i chooses “1”, her expected utility is affected
P
by her expectations about others’ actions, u(Xi ) + λ j6=i Wn,ij E[yj |XJpi , Z] − i . Therefore,
P
yi = I(u(Xi ) + λ j6=i Wn,ij E[yj |XJpi , Z] − i > 0). The entry game for a group of firms in
the same industry is a case in point. Discussions in this section focuses on the case that all
Xi ’s are public information and i ’s are i.i.d. normal with mean 0 and variance 1. It follows
from the previous analysis that an equilibrium is an n × 1 vector in ξ ∈ [0, 1]n , such that
ξi = Φ(ui + λ
X
Wn,ij ξj ),
(4.7.1)
j6=i
for i = 1, · · · , n, where Φ(·) is the cdf for standard normal distribution and ui is a simplified
notation for u(Xi ).
The investigation of the equilibrium set begins with the special case that every two
group members are associated with each other. That is, Wn,ij = 1 for all i 6= j. As it is
shown in Section 4.6, in this case, an equilibrium conditional expectation is a scalar and
can be represented as a zero of a nonlinear function. The characteristics of the equilibrium
set is summarized by the following proposition,
Proposition 4.7.1 In Binary Choice Model I (4.2.5), consider a group of n agents such
that any two of them are associated with each other, i.e., Wn,ij = 1 for all i 6= j. Suppose
that λ 6= 0.
√
• If − 2π < λ < 0, there is a unique equilibrium;
• When λ > 0, there is a unique equilibrium if
min ui >
1≤i≤n
133
λ
,
2
(4.7.2)
or
λ
max ui < − ,
1≤i≤n
2
(4.7.3)
where Φ(·) and φ(·) are respectively the cdf and pdf of the standard normal distribution.
Proof. See Appendix C.6.
From Proposition 4.7.1, it is more likely to have multiple equilibria when λ > 0 than
it is when λ < 0. Thus, further investigation of the equilibrium set focuses on the case
that λ > 0. Figures C.2 to C.5 illustrates the characteristics of the equilibrium set as the
group population and interaction intensity varies. Figures C.2 and C.3 show that there
is a unique equilibrium when (4.7.2) is satisfied. Figure C.2 corresponds to the case that
agents are symmetric with ui = u = 2 for all i. Figure C.3 is for the case that ui ’s are
heterogeneous and vary from 1 to 2. When neither (4.7.2) nor (4.7.3) is satisfied, as it is
shown by Figures C.4 and C.5, there can be multiple equilibria. In Figure C.4, symmetric
agents have ui = u = −2 for all i. However, ui ’s change from -3 to -2 in Figure C.5.
In the above graphs, there are at most three equilibria. Actually, the number of equilibria
is no more than three under a reasonable condition.
Lemma 4.7.1 Suppose that 0 < λ <
√
2 2π
3 .
For the function c(a; λ) = λφ(a) + 1 +
a2 (2λφ(a) − 1), there is a+ (λ) > 0, such that c(a; λ) > 0 for −a+ (λ) < a < a+ (λ);
c(a+ (λ); λ) = c(−a+ (λ); λ) = 0, and c(a; λ) < 0, for a < −a+ (λ) or a > a+ (λ). In
addition, a+ (λ) increases with λ.
Proof. See Appendix C.6.
Proposition 4.7.2 In Binary Choice Model I (4.2.5), consider a group of n agents such
that any two of them are associated with each other, i.e., Wn,ij = 1 for all i 6= j. Suppose
that 0 < λ <
√
2 2π
3 .
There are at most three equilibria if
min ui > λΦ(a+ (λ)) + a+ (λ),
1≤i≤n
134
(4.7.4)
or
max ui < λΦ(−a+ (λ)) − a+ (λ) − λn.
(4.7.5)
1≤i≤n
Proof. See Appendix C.6.
For general social relations, an equilibrium is represented by an n × 1 vector, ξ =
0
(E[y1 ], · · · , E[yn ]) and is a zero of a system of nonlinear equations. Applying Proposition
4.4.2, there is a unique equilibrium if the sign of the determinant of the following matrix
does not change as ξ varies in [0, 1]n :

P
0
···
φ(u1 + λ j6=1 Wn,1j ξj )

P

0
φ(u2 + λ j6=2 Wn,2j ξj ) · · ·

λ
.
..
..

..
.
.


0
0
···
0




0

 Wn − I n .
..

.


P
φ(un + λ j6=n Wn,nj ξj )
(4.7.6)
For a graphical illustration, suppose that u(Xi ) is a linear function:
c
c
u(Xi ) = β0 + Xi,1
β1 + Xi,2
β2 ,
(4.7.7)
c and X c are two commonly known exogenous characteristics. Take β ∗ = 0,
where Xi,1
0
i,2
β1∗ = β2∗ = 1. Simulate one sample with G = 100 independent groups, each of which
has n = 5 members. In each group, the number of social relations an agent can build is
randomly determined, ranging from 0 to n − 1. Based on the randomly generated total link
P
number for agent i, Fi = j6=i Wn,ij , for each j 6= i, generate a random number, rnn,ij .
Then Wn,ij = 1, if rnn,ij is among the Fi largest ones. The social relation matrix, Wn , is
not row-normalized. Characteristics of the equilibrium set in this case are illustrated by
Figures C.6 and C.7. It can be seen that for Binary Choice Model I, although the sufficient
condition for equilibrium uniqueness in Yang and Lee(2014) is violated for a big proportion
of groups when λ is a little bit bigger than 0.6, for groups in this sample, when λ is within 0
and 1, there is only a unique equilibrium. As λ increases, the average equilibrium outcomes
and the proportion of agents who choose action 1 increase.
135
In the Monte Carlo experiments, the social relation matrix is constructed in the same
way as it is for the illustration in Figures C.6 and C.7. There are L = 400 simulations. In
each simulated sample, there are G independent groups with homogeneous population n.
n is fixed at 5 in the experiments. There are two cases about the number of independent
groups, G = 100 and G = 200. The true value of the interaction intensity is λ∗ = 0.8.
Table C.1 summarizes the estimation results for three regression methods. Regression I
is the conventional Probit estimation without social interactions. Regressions II and III
take into account the interactions among socially related agents. Regression II assumes
condition (4.3.8) is satisfied and uses the contraction mapping iteration method to solve for
the (assumed) unique equilibrium. On the contrary, Regression III does not make restrictive
assumptions on the intensity of social interactions, λ, or the number of equilibrium in
estimation. It allows for multiple equilibria, uses the homotopy continuation method to
compute the equilibrium set, and chooses the equilibria with maximal expected probabilities
for choice “1” to complete the model. From the table, it can be seen that Regression III
outperforms Regression I and II in terms of parameter estimation biases and the value of
estimated average log likelihoods. That is because of the distortions imposed by (4.3.8).
From Table C.1, condition (4.3.8) is violated by 67.44% (67.35%)of the groups in the sample
on average when the number of groups is G = 100 (G = 200) under the true parameter
values. Additionally, the average upper bound on the interaction intensity for a sample
imposed by (4.3.8), 0.6267, is very close to the estimates of λ in Regression II (The estimate
for λ is 0.6263 when G = 100 and 0.6266 when G = 200.) Therefore, imposing (4.3.8) can
be restrictive when it is violated by a large proportion of the groups in a sample.
4.7.2
Binary Choice Model II
In the basic framework, (4.2.1) and (4.2.2), take hi (z) = 2I(z > 0) − 1 for all i. Then
(4.2.6) holds. Similar to Brock and Durlauf(2001), this model describes the equilibrium
outcomes of a simultaneous move game with discrete choices where the utility an agent gets
depends on the difference between her own action and those of her friends. To be specific,
136
suppose that in a group of n agents, an agent i can choose two actions, -1 and 1. Her
utilities depend on her own choice and those of the agents who she is associated with. If
e P Wn,ij yj + e
she chooses 1, with others choosing y−i , her utility is u
e(Xi , 1) + λ
1i . If she
j6=i
e P Wn,ij yj + e
e
chooses -1, her utility is u
e(Xi , −1) − λ
−1
j6=i
i . When λ > 0, an agent benefits
e < 0, an
from taking the same action as her friends and/or peers do. In contrast, when λ
agent gets rewarded by distinguishing herself from her friends and/or peers. Suppose that
all exogenous characteristics are public information but the idiosyncratic shocks are private
information. As i does not know her friends’ actions when her decisions are made, she has
e P Wn,ij E[yj ] + e
to maximize her expected utility, which is u
e(Xi , 1) + λ
1i for action 1 and
j6=i
e P Wn,ij yj + e
−1
for action -1. Thus, she will choose 1 if
u
e(Xi , −1) − λ
j6=i
i
e
u
e(Xi , 1) − u
e(Xi , −1) + 2λ
X
Wn,ij E[yj ] − (e
−1
1i ) > 0.
i −e
j6=i
e and i = e
Define u(Xi ) = u
e(Xi , 1) − u
e(Xi , −1), λ = 2λ,
−1
1i . Plug them into (4.2.1)
i −e
and (4.2.2). Choose hi (z) = 2I(z > 0) − 1. Then the Type II model for binary choices
is derived. Suppose that (e
1i , e
−1
i )’s are i.i.d. normal with zero mean and variance
0
1
2.
An
0
equilibrium conditional expectation, ξ = (ξ1 , · · · , ξn ) = (E[y1 ], · · · , E[yn ]) , satisfies
X
ξi = 2Φ(u(Xi ) + λ
Wn,ij ξj ) − 1
j6=i
(4.7.8)
e
= 2Φ(e
u(Xi , 1) − u
e(Xi , −1) + 2λ
X
Wn,ij ξj ) − 1.
j6=i
If any two agents are associated with each other, i.e., Wn,ij = 1 for any i 6= j, λ
represents the intensity of influences from any other group member. Based on the previous
discussions, in this case, the equilibrium can be described by the group total expected
P
P
outcome, ξ = ni=1 ξi = ni=1 E[yi ]. Similar to Binary Choice Model I, ξ can be described
as a zero of a nonlinear function. It is possible to characterize the equilibrium set by
analyzing this function.
Proposition 4.7.3 In Binary Choice Model II (4.2.5), consider a group of n agents such
that any two of them are associated with each other, i.e., Wn,ij = 1 for all i 6= j. Suppose
that λ 6= 0.
137
√
• If −
2π
2
< λ < 0, there is a unique equilibrium;
• When λ > 0, there is a unique equilibrium if
min ui > λn,
(4.7.9)
max ui < −λn,
(4.7.10)
1≤i≤n
or
1≤i≤n
where Φ(·) and φ(·) are respectively the cdf and pdf of the standard normal distribution.
Proof. See Appendix C.6.
Compare Proposition 4.7.1 and Proposition 4.7.3, it is easy to see that for both types
of the Binary choice models, it is easier to ensure uniqueness when λ < 0 than it is when
λ > 0. Additionally, the sufficient conditions for a unique equilibrium is more stringent for
the Type II model of binary choices than it is for the Type I model. Equilibrium multiplicity
for the Type II Binary Choice model is illustrated by Figures C.8 to C.11. Figures C.8 and
C.9 show that there is a unique equilibrium when (4.7.9) is satisfied. Figure ?? is for the
case that in a group of population n, ui = n + 1 for all i. Figure C.9 depicts the case for
heterogeneous agents such that max1≤i≤n ui = n + 3 and min1≤i≤n ui = n + 1. Figures
C.10 and C.11 show that there can be multiple equilibria when both (4.7.9) and (4.7.10)
are violated. Figure C.10 corresponds to the case of homogeneous agents with ui = u = 1
for all i. The case for heterogeneous agents with ui ’s randomly change from 1 to 2 is shown
in Figure C.11. It is interesting to see that there are multiple equilibria in this case for
the Type II model of Binary choices (as it is shown in Figure C.11) and there is a unique
equilibrium for the Type I model (as it is shown in Figure C.3). These graphical illustrations
confirm that it is more likely to have multiple equilibria in the Type II model of binary
choices.
Similar to the discussions for the Type I binary choice model, under a certain condition,
there are at most three equilibria in the Type II model.
138
Lemma 4.7.2 Suppose that 0 < λ <
√
2π
3 .
Fix λ, define a function e
c(a; λ) = 2λφ(a) +
1 + a2 (4λφ(a) − 1). There is e
a+ (λ) > 0, such that e
c(a; λ) > 0, for −e
a+ (λ) < a < e
a+ (λ);
e
c(e
a+ (λ); λ) = e
c(−e
a+ (λ); λ) = 0; and e
c(a; λ) < 0, for a < −e
a+ (λ) or a > e
a+ (λ). In addition,
e
a+ (λ) increases with λ.
Proof. See Appendix C.6.
Proposition 4.7.4 In Binary Choice Model II (4.2.6), consider a group of n agents such
that any two of them are associated with each other, i.e., Wn,ij = 1 for all i 6= j. Suppose
√
that 0 < λ <
2π
3 .
There are at most three equilibria if
min ui > λ(2Φ(e
a+ (λ)) − 1 + n) + e
a+ (λ),
(4.7.11)
max ui < λ(2Φ(−e
a+ (λ)) − 1 − n) − e
a+ (λ).
(4.7.12)
1≤i≤n
or
1≤i≤n
Proof. See Appendix C.6.
For general social relation matrix, Wn , an equilibrium is an n × 1 vector satisfying a
system of nonlinear equations. Applying Proposition 4.4.2, there is a unique equilibrium if
the sign of the determinant of the following matrix does not change as ξ varies in [−1, 1]n :

φ(u1 + λ



2λ 



P
j6=1 Wn,1j ξj )
0
..
.
0
φ(u2 + λ
0
P
···
0
j6=2 Wn,2j ξj ) · · ·
..
..
.
.
0
···
0
..
.
P
φ(un + λ
j6=n

Wn,nj ξj )




 W n − In .



(4.7.13)
When there are multiple equilibria, the selection rule is applied to complete the model.
Based on previous discussions, there are a finite number of equilibria. Denote them by
139
ξ e,1 , · · · , ξ e,L . Suppose that equilibria are selected according to the total (ex ante) exP
pected utilities. That is, γ(ξ e,l , X, W ) = ni=1 Ui (ξ e,l ), where
Ui (ξ e,l )
Z
X
X
e
e
−1
1i > u
e(Xi , −1) − λ
Wn,ij ξje,l + e
= I(e
u(Xi , 1) + λ
Wn,ij ξje,l + e
i )
j6=i
j6=i
e
· (e
u(Xi , 1) + λ
X
j6=i
e
−1
1
e
1
1i ) 2 φ( i )φ( i )de
Wn,ij ξje,l + e
1i de
−1
i
σ
σ
σ
Z
+
e
I(e
u(Xi , 1) + λ
X
e
1i ≤ u
e(Xi , −1) − λ
Wn,ij ξje,l + e
e
· (e
u(Xi , −1) − λ
j6=i
−1
Wn,ij ξje,l + e
i )
j6=i
j6=i
X
X
e
−1
1
e
1i
−1
Wn,ij ξje,l + e
)φ( i )de
1i de
−1
i ) 2 φ(
i
σ
σ
σ
ξie,l + 1 e X
e(Xi , −1)
+λ
Wn,ij ξje,l ξi + u
2
j6=i
Z
e,l
eP
u
e(Xi , 1) − u
e(Xi , −1) + 2λ
1i 1 e
1
j6=i Wn,ij ξj + e
1
+ e
i Φ
φ( i )de
1i
σ
σ σ
Z
e,l
eP
−1
u
e(Xi , −1) − u
e(Xi , 1) − 2λ
e
−1
i 1
j6=i Wn,ij ξj + e
−1
+ e
i Φ
φ( i )de
−1
i
σ
σ
σ
X
ξ e,l
1
=(u(Xi ) + λ
Wn,ij ξje,l ) i + u(Xi ) + u
e(Xi , −1)
2
2
j6=i
P
Z
u(Xi ) + λ j6=i Wn,ij ξje,l + e
1i 1 e
1
φ( i )de
1i
+ e
1i Φ
σ
σ σ
P
Z
−1
−u(Xi ) − λ j6=i Wn,ij ξje,l + e
e
−1
i 1
+ e
−1
Φ
φ( i )de
−1
i
i
σ
σ
σ
=(e
u(Xi , 1) − u
e(Xi , −1))
In this model, σ =
q
1
2.
(4.7.14)
u
e(Xi , −1) is normalized to be equal to 0. Thus, individual
expected utilities can be computed given the model primitives, ui ’s, and an equilibrium set.
Although there are no explicit analytical forms for the last two integrals in (4.7.14), they
can be computed numerically by the Gaussian quadrature.
For further investigation of the equilibrium set for general social relations, suppose that
u(Xi ) = u
e(Xi , 1) − u
e(Xi , −1) takes the linear function form (4.7.7). Like the discussions
in the Type I Binary Choice model, take β0∗ = 0 and β1∗ = β2∗ = 1. Consider a sample of
G = 100 independent groups. Each of them has n = 5 members. In a group, the number of
social links an agent has is randomly determined. For each i, generate a random number,
P
fi . If fi ≥ 0.5, the number of social links for i, Fi =
j6=i Wn,ij , is n − 1; otherwise,
Fi = n − 2. Given Fi , generate random numbers, rnij for each j 6= i. Then Wn,ij = 1 if rnij
140
is among the Fi largest. Use the homotopy continuation to compute the equilibrium set for
this sample. Figures C.12 and C.13 illustrate the characteristics of the equilibrium set and
the (selected) equilibrium outcomes. According to Yang and Lee(2014), when |λ| < 0.3133,
there is a unique equilibrium. From Figure C.12, there is still a unique equilibrium when
λ is a little bit bigger than 0.3133. However, unlike the Type I model for binary choices,
as λ continues to increase, some groups have more than one equilibria. Additionally, there
is an increasing tendency for the sample average number of equilibria and the proportion
of groups with multiple equilibria. Figure C.13 shows the characteristics of the selected
outcomes corresponding to the selection criterion (4.7.14). As the interaction intensity,
λ, increases, there is not an obvious trend for the sample average expected equilibrium
outcomes, actual outcomes, the proportion of agents who choose 1, and the proportion of
agents who choose -1. This is different from the characteristics of the equilibrium outcomes
in the Type I model of binary choices, where as λ increases more agents choose action 1
and the average equilibrium outcome also increases (see Figure C.7).
Samples are generated in the same way for the Monte Carlo experiments. There are
L = 400 simulations. Consider two values that λ takes, 0.2 and 0.8. When λ = 0.2, the
sufficient condition for equilibrium uniqueness in Yang and Lee (2014) is satisfied. However,
when λ = 0.8, as it is shown by Figure C.12, it is possible to have multiple equilibria. In
experiments, four estimation method are compared: (1) conventional maximum likelihood
estimation without social interactions, i.e., assuming λ = 0; (2) nested fixed point maximum
likelihood estimation which assumes a unique equilibrium, restricts λ, and uses contraction
mapping iterations to compute the equilibrium; (3) nested fixed point maximum likelihood
estimation which assumes a unique equilibrium and computes the equilibrium by solving
nonlinear equations without restricting λ; and (4) maximum likelihood estimation for complete likelihood with equilibrium selection which uses the homotopy continuation method
to compute the set of equilibria. Results are summarized in the Tables C.2 and C.3.
It is obvious that the conventional regression which ignores the interactive effects among
socially associated agents can bring in more biases than regressions which take into account
141
social interactions even when the intensity of social interactions is moderate. When λ∗ = 0.2,
there is a unique equilibrium according to Yang and Lee(2014). From Table C.2, the last
three regression methods have similar performances. However, with large interaction intensity, λ∗ = 0.8, from the tabulated results in Table C.3, Regression IV outperforms the
other three regressions. As the upper bound on interaction intensity which ensures equilibrium uniqueness is 0.3133 on average, which is far smaller than the true parameter value,
λ∗ = 0.8, when |λ| is restricted within this bound, estimations are biased, which is shown
by the performances of Regression II. Since the average estimates for λ by Regression I
is 0.3109, which is very close to 0.3133, this upper bound is nearly binding. A potential
improvement may be achieved by relaxing the restrictions on |λ| and computing the equilibrium by solving a system of nonlinear equations, using numerical algorithms such as the
Newton’s method. This method is used in Regression III. Although this method performs
well for moderate interactions when there is a unique equilibrium (as Table C.2 shows),
when λ∗ = 0.8, it brings in biased estimation. That is because this method still assumes
that there is a unique equilibrium. However, the average number of equilibria in the simulated sample is 2.2918 and 78.11% of groups have more than one equilibria. In Regression
IV, equilibrium multiplicity is considered and equilibria are selected according their expected total utilities. It can be seen that Regression IV has much smaller biases and much
larger estimated sample log likelihoods than those of other regression methods. That is,
the computation intensity of Regression IV is rewarded with good estimation performances
when there are multiple equilibria.
4.8
Conclusion
In a general framework of social interactions under incomplete information with multiple equilibria, this paper investigates the approach to complete the model, along with
identification, computation, and estimation issues. The proposed solution to multiple equilibria extends the random equilibrium selection method used by Bajari et al(2010b) and
(2010c) to a setting, which is general in the types of behaviors and information structures.
142
Although this all-solution method can be computationally intensive, it does not impose
strong assumptions on the data generating process and can be applied to a broad range
of empirical studies. Although the model have specific structures on the interdependence
among socially associated agents, it incorporates discrete and continuous choices, bounded
and unbounded outcomes, unbounded idiosyncratic shocks, as well as different information structures. Therefore, the characterization of equilibria in this paper complements the
existence theorems for the Bayesian Nash Equilibrium in the recent theory literature.
Group unobservables are important in empirical studies. In this paper, there is a brief
discussion for the case when those unobservables only affect individual choices but not the
equilibrium selection rule. It will be an interesting extension if that assumption is relaxed.
Using the stochastic selection rule, we do not need to worry about the case that two
equilibria score the same according to that criterion, for only the distribution of equilibrium selection matters. If the deterministic rule is used, however, the model will still be
incomplete when there is a tie between two equilibria according to the deterministic objective function. However, if we can apply some mathematical tools, such as optimal control,
optimization over the equilibrium set may be less computationally intensive than the computations of all the equilibria. Consequently, further investigations of the deterministic rule
are of significance.
143
Chapter 5: Conclusion
This thesis investigates modeling and estimation of social interactions under a general
form of incomplete information. The framework is built on the basis of a simultaneous move
game under incomplete information. It is assumed that the observed outcomes come from
a Bayesian Nash Equilibrium (BNE), which is equivalent to the equilibrium conditional
expectations of the individual outcomes. Utilizing functional fixed point theorems and the
intersection theory in differential topology, it characterizes the set of equilibria and proposes
effective methods for identification, computation, and estimation. This thesis contributes
to the current research in social interactions both theoretically and empirically and also
complements the theoretical analysis of the existence of a pure strategy Bayesian Nash
Equilibrium and estimation of incomplete information simultaneous move games.
My dissertation research is focused on analyzing the effects of social networks on behaviors. Another fundamental problem about networks is how a network depends on people’s
actions. When the social relations are not fully predetermined but rather depend on the
choices of individuals in a social group, the individual behaviors and social relations may
be correlated through some unobservables. In that case, ignoring this correlation can bring
in endogenous biases. As a result, it is of theoretical and empirical significance to investigate the formation of social networks and its implications on the social interaction effects.
This investigation may be more interesting when there is also asymmetric private information about the social relations. Hence, it is of interest to introduce endogenous network
formation in the current framework.
144
Another future research project is to discuss dynamics of the interactions among socially
associated agents and the evolution of endogenous networks. With more and more big data
sets tracing the behaviors of a big group of agents become available, it is necessary to devise
effective estimation methods incorporating time dynamics. Equilibrium multiplicity may
still be a challenging issue. However, with dynamics, it might be possible to refine the
equilibrium set and reduce the number of equilibria, which can potentially alleviate the
computation burden.
145
Bibliography
ABREVAYA, J. and SHEN, S. 2014. Estimation of Censored Panel-Data Models with Slope
Heterogeneity. Journal of Applied Econometrics, 29:523–548.
AGUIRREGABIRIA, V. and MIRA, P. 2002. “Swapping the Nested Fixed Point Algorithm:
A Class of Estimators for Discrete Markov Decision Models”. Econometrica, 70(4):1519–
1543.
AGUIRREGABIRIA, V. and MIRA, P. 2013. “Identification of Games of Incomplete
Information with Multiple Equilibria and Common Observed Heterogeneity”. working
paper.
ARADILLAS-LOPEZ, A. 2010. “Semiparametric Estimation of A Simultaneous Game
with Incomplete Information”. Journal of Econometrics, 157:409–431.
BAJARI, P., HONG, H., KRAINER, J., and NEKIPELOV, D. 2010a. “Estimating Static
Models of Strategic Interactions”. Journal of Business and Economic Statistics, 28(4):
469–482.
BAJARI, P., HONG, H., KRAINER, J., and NEKIPELOV, D. 2010b. “Computing Equilibria in Static Games of Incomplete Information Using the All-Solution Homotopy”.
working paper.
BAJARI, P., HONG, H., and RYAN, S. 2010c. “Identification and Estimation of A Discrete
Game of Complete Information”. Econometrica, 78(5):1529–1568.
BALLESTER, C., CALVÓ-ARMENGOL, H., and ZENOU, Y. 2010. “Who is Who in
Networks. Wanted: the Key Player”. Econometrica, 74(5):1403–1417.
146
BISIN, A., MORO, A., and TOPA, G. 2011. “The Empirical Content of Models with
Multiple Equilibria in Economies with Social Interactions”. working paper.
BLEVINS, J. 2014. Nonparametric Identification of Dynamic Decision Processes with
Discrete and Continuous Choices. Quantitative Economics, forthcoming.
BONSALL, F. F. 1962. “Lectures on Some Fixed Point Theorems of Functional Analysis”.
Tata Institute of Fundemental Research, Bombay.
BORKOVSKY, R., DORASZELSKI, U., and KRYUKOV, Y. 2010a. “A User’s Guide to
Solving Dynamic Stochastic Games Using the Homotopy Method”. Operations Research,
58(4):1116–1132.
BORKOVSKY, R., DORASZELSKI, U., KRYUKOV, Y., and SATTERTHWAITH, M.
2010b. “Learning-by-doing, Organizational Forgetting, and Industry Dynamics”. Econometrica, 78(2):453–508.
BOUCHER, V., BRAMOULLé, Y., Djebbari, H., and Fortin, B. 2014. Do Peer Affect
Student Achievement? Evidence from Canada Using Group Size Variation. Journal of
Applied Econometrics, 29:91–109.
BROCK, W. and DURLAUF, S. 2000. “Estimating Static Models of Strategic Interactions”.
In HECKMAN, K. and LEAMER, E., editors, Handbook of Econometrics: Edition 1,
pages 3297–3380. Elsevier.
BROCK, W. and DURLAUF, S. 2001. “Discrete Choice with Social Interactions”. The
Review of Economic Studies, 42:430–445.
BROCK, W. and DURLAUF, S. 2003. “Multinomial Choice with Social Interactions”.
working paper.
BROCK, W. and DURLAUF, S. 2007. “Identification of Binary Choice Models with Social
Interactions”. Journal of Econometrics, 140:52–75.
147
BROOKS, J. and DINCULEANU, N. 1979. “Conditional Expectations and Weak and
Strong Compactness in Spaces of Bochner Integrable Functions”. Journal of Multivariate
Analysis, 9:420–427.
BRUECKNER, J. 2003. Strategic Interaction among Governments: an Overview of Empirical Studies. International Regional Science Review, 26:175–188.
DE FINETTI, B. 1975. Theory of Probability, volume 2. John Wiley& Sons, New York.
DEBREU, G. 1970. “Economies with a Finite Set of Equilibria”. Econometrica, 38:387–392.
DIERKER, E. 1972. “Two Remarks on the Number of Equilibria of an Economy”. Econometrica, 40:951–953.
DUNFORD, N. and SCHWARTZ, J. 1958. “Linear Operators, Part I”. Interscience, New
York.
DURLAUF, S. 2000. A framework for the study of individual behavior and social interactions. working paper 16,Wisconsin Madison, Social Systems.
FOLLAND, G. B. 1999. “Real Analysis: Modern Techniques and Their Applications”.
John Wiley & Sons, INC., New York, 2nd edition.
GARCIA, C. B. and ZANGWILL, W. I. 1979. “Finding All Solutions to Polynomial
Systems and Other Systems of Equations”. Mathematical Programming, 16:159–176.
GARCIA, C. B. and ZANGWILL, W. I. 1981. Pathways to Solutions, Fixed Points, and
Equilibria. Prentice-Hall Series in Computational Mathematics.
GASPAR, J. and JUDD, K. 1997. “Solving Large-Scale Rational Expectations Models”.
Macroeconomic Dynamics, 68(2):235–260.
GELMAN, A., CARLIN, J. B., STERN, H., B., D. D., VEHTARI, A., and RUBIN, D.
2014. Bayesian Data Analysis. CRC Press, Taylor& Francis Group, 3rd edition.
148
GUILLEMIN, V. and POLLACK, A. 1974. “Differential Topology”. Prentice-Hall, Inc.,
New Jersy.
HARSANYI, J. C. 1967a. “Games with Incomplete Information Played by ‘Bayesian’
players, parts i”. Management Science, 14(3):159–182.
HARSANYI, J. C. 1967b. “Games with Incomplete Information Played by ‘Bayesian’
players, parts ii”. Management Science, 14(3):320–334.
HOTZ, V. J. and MILLER, R. A. 1993. “Conditional Choice Probabilities and the Estimation of Dynamic Models”. The Review of Economic Studies, 60:497–529.
HSIEH, C. S. and LEE, L. 2011. “A Social Interactions Model with Endogenous Friendship
Formation and Selectivity”. working paper.
JUDD, K. 1998. “Numerical Methods in Economics”. MIT Press, Cambridge, MA.
KHAN, M. A. and SUN, Y. N. 1995. “Pure Strategies in Games with Private Information”.
Journal of Mathematical Economics, 24:633–653.
KHAN, M. A. and SUN, Y. N. 2002. “Non-cooperative Games with Many Players”. In
AUMANN, R. J. and HART, S., editors, Handbook of Game Theory with Economic
Applications: Edition 1, volume 3, pages 1761–1808. Elsevier.
KHAN, M. A. and ZHANG, Y. C. 2014. “On the Existence of Pure-strategy Equilibria in
Games with Private Information: A Complete Characterization”. Journal of Mathematical Economics, 50:197–202.
KUMAR, A. 2012. Nonparametric Estimation of the Impact of Taxes on Female Labor
Supply. Journal of Applied Econometrics, 27:415–439.
LAX, D., P. 2002. “Functional Analysis”. John Wilen&Sons, Inc., New York.
LEE, L. 2000. “A Numerically Stable Quadrature Procedure for the One-Factor Random
Component Discrete Choice Model”. Journal of Econometrics, 95(1):117–129.
149
LEE, L. 2001. “Interpolation, Quadrature and Stochastic Integration”. Econometric Theory, 17:933–961.
LEE, L. 2007. “Identification and Estimation of Econometric Models with Group Interactions, Contextual Factors and Fixed Effects”. Journal of Econometrics, 140:333–374.
LEE, L. and LI, J. 2009. “Binary Choice under Social Interactions: An Empirical Study
With and Without Subjective Data on Expectations”. Journal of Applied Econometrics,
24:257–281.
LEE, L., LI, J., and LIN, X. 2014. “Binary Choice Models with Social Network under
Heterogeneous Rational Expectations”. Review of Economics and Statistics, 96:402–417.
LEUNG, M. 2013. “Two-Step Estimation of Network Formation Models with Incomplete
Information”. working paper.
LI, S., LIU, Y., and DEININGER, K. 2013. How Important are Endogenous Peer Effects
in Group Lending? Estimating a Static Game of Incomplete Information. Journal of
Applied Econometrics, 28:864–882.
MANSKI, C. 1985. “Semiparametric Analysis of Discrete Response: Asymptotic Properties
of the Maximum Score Estimator”. Journal of Econometrics, 27:313–333.
MANSKI, C. 1987. “Semiparametric Analysis of Random Effects: Linear Models from
Binary Penel Data”. Econometrica, 55:357–362.
MANSKI, C. 1993. “Identification of Endogenous Social Effects: The Reflection Problem”.
The Review of Economic Studies, 60:531–542.
MANSKI, C. 2000. “Economic Analysis of Social Interactions”. Journal of Economic
Perspectives, 14:115–136.
MANSKI, C. 2004. “Measuring Expectations”. Econometrica, 72:1329–1376.
MAS-COLELL, A., D., W. M., and GREEN, J. 1995. “Microeconomic Theory”. Oxford
University Press.
150
MELE, A. 2010. “A Structure Model of Segregation in Social Networks”. CeMMAP working
paper.
MILGROM, P. R. and WEBER, R. J. 1985. “Distributional Strategies for Games with
Incomplete Information”. Mathematics of Operations Research, 10:619–632.
MOFFITT, R. A. 2000. Policy Interventions, Low-level Equilibria, and Social Interactions.
Social Dynamics, 97:45–82.
OSBORNE, M. J. and RUBINSTEIN, A. 1994. “A Course in Game Theory”. MIT Press.
PORTO, E. D. and REVELLI, F. 2013. Tax-Limited Reaction Functions. Journal of
Applied Econometrics, 28:823–839.
QU, X. and LEE, L. 2013. “LM Tests for Spatial Correlation in Spatial Models with
Limited Dependent Variables”. Regional Science and Urban Economics, 42:430–445.
RADNER, R. and ROSENTHAL, R. W. 1982. “Private Information and Pure-strategy
Equilibria”. Mathematics of Operations Research, 7:401–409.
ROTHENBERG, T. J. 1971. “Identification in Parametric Models”. Econometrica, 39(3):
577–591.
RUST, J. 1987. “Optimal Replacement of GMC Bus Engines: An Empirical Model of
Harold Zurcher”. Econometrica, 55(5):999–1033.
RUSTON, A. F. 1986. “Fredholm Theory in Banach Spaces”. Cambridge University Press.
SHANG, Q. and LEE, L. 2011. Two-Step Estimation of Endogenous and Exogenous Group
Effects. Econometric Reviews, 30(2):173–207.
SIMON, J. 1987. “Commpact Sets in the Space Lp (0, T ; B)”. Annali di Matematica para
ed applicata, CXLVI(IV):65–96.
STOKEY, N. L., LUCAS, E. J., R., and PRESCOTT, E. C. 1989. “Recursive Methods in
Economic Dynamics”. Harvard University Press.
151
TAMER, E. 2003. “Incomplete Simultaneous Discrete Response Model with Multiple
Equilibria”. The Review of Economic Studies, 70(1):147–165.
TAO, J. and LEE, L. 2014. A Social Interaction Model with an Extreme Order Statistics.
The Econometrics Journal, 17(3):197–240.
van NEERVEN, J. 2014. “Compactness in the Lebesgue-Boucher spaces Lp (µ; X)”. Indagationes Mathematicae, 25:389–394.
WATSON, L., BILLUPS, S., and MORGAN, A. 1987. “HOMPACK: A Suite of Codes
for Globally Convergent Homotopy Algorithms”. ACM Transactions on Mathematical
Software, 13:281–310.
WATSON, L., SOSONKINA, M., MELVILLE, R., MORGAN, A., and WALKER, H. 1997.
“Algorithm777: HOMPACK90:A Suite of Fortran 90 Codes for Globally Convergent
Homotopy Algorithms”. ACM Transactions on Mathematical Software, 23:514–549.
YANG, C. 2014. “Sample Selection with Social Interactions”. working paper.
YANG, C. and LEE, L. 2014. “Social Interactions under Incomplete Information with
Heterogeneous Expectations”. working paper.
YANG, C., QU, X., and LEE, L. 2014. “A Tobit Model with Social Interactions under
Incomplete Information”. working paper.
152
Appendix A: Appendix to Chapter 2
A.1
Proofs
Proposition 2.3.1. The first part follows straightforwardly from the definition of ψ e and
consistency condition of expectations in equilibrium, (2.3.1). For the second part, by the
way that a strategy is defined, we have that
E[sj (XJpj )|XJpi , z] =E[hj (u(Xj ) + λ
X
Wj,k ψke (XJpj ) − j )|XJpi , z] = ψje (XJpi ),
k6=j
when Wi,j 6= 0. Therefore, we have that
si (XJpi ) = hi (u(Xi ) + λ
X
Wi,j E[sj (XJpj )|XJpi , z] − i ).
j6=i
Since y = (y1 , · · · , yn ) is the realization of actions in this BNE, (2.3.2) holds. The second
part is proved.
Lemma 2.3.1.
From (2.3.6), kψk is non-negative. Moreover, kψk = 0 if and only if
ψi (Ai ) = 0 a.e. for any i and Ai ∈ Ai , i.e., ψ = 0. For any real scalar α,
Z
Z
p
p
|ψi (xpJj )|dFp (xp )
kαψk = max max
|(αψ)i (xJj )|dFp (x ) = |α| max max
1≤i≤n {j:Wj,i 6=0}
1≤i≤n {j:Wj,i 6=0}
= |α| kψk .
0
Additionally, for any ψ, ψ ∈ Ψ,
Z
0
0
|(ψ + ψ )i (xpJj )|dFp (xp )
ψ + ψ = max max
1≤i≤n {j:Wj,i 6=0}
Z
≤ max max
|ψi (xpJj )|dFp (xp ) + max
1≤i≤n {j:Wj,i 6=0}
Z
max
1≤i≤n {j:Wj,i 6=0}
0
= kψk + ψ .
153
0
|psii (xpJj )|dFp (xp )
Lemma 2.3.2.
Suppose {ψm } is a Cauchy sequence in Ψ. There exists a subsequence
such that
Z
1
|(ψmk+1 )i (Ai ) − (ψmk )i (Ai )|dFp (X p ) ≤ ψmk+1 − ψmk < k ,
2
for any i and Ai ∈ Ai . For every k, define φk such that
(φk )i (Ai ) = |(ψm1 )i (Ai )| +
k
X
|(ψml+1 )i (Ai ) − (ψml )i (Ai )|.
l=1
and function φ such that (φ)i (Ai ) = |(ψm1 )i (Ai )| +
P∞
l=1 |(ψml+1 )i (Ai )
− (ψml )i (Ai )|. It is
easy to see that kφk k ≤ kψm1 k + 1. Because φk ↑ φ, by monotonicity convergence,
Z
Z
p
p
max max
|(φ)i (xJj )|dFp (x ) = max max lim
|(φk )i (xpJj )|dFp (xp ) ≤ kψm1 k+1.
i
i
{j:Wj,i 6=0}
{j:Wj,i 6=0} k→∞
Therefore, φ < ∞ a.e.. So the series ψm1 +
P∞
k=1 (ψmk+1
− ψmk ) converges a.e.. ψ =
limk→∞ ψmk is well-defined and kψk ≤ kφk. Therefore, ψ ∈ Ψ. By Fatou’s Lemma,
Z
max max
|(ψm )i (XJpj ) − (ψ)i (XJpj )|dFp (X p )
i {j:Wj,i 6=0}
Z
≤ max max lim inf |(ψm )i (XJpj ) − (ψmk )i (XJpj )|dFp (X p )
i {j:Wj,i 6=0} k→∞
Z
≤ lim inf max max
|(ψm )i (XJpj ) − (ψmk )i (XJpj )|dFp (X p ).
k→∞
i
{j:Wj,i 6=0}
By definition of the Cauchy sequence, limn→∞ kψm − ψk = 0.
0
Proposition 2.3.2. For any two functions, ψ, ψ ∈ Ψ,
Z
X
0 max
|E[ Hi (u(Xi ) + λ
Wi,j ψj (XJpi ))
T (ψ) − T (ψ ) = max
1≤i≤n {k:Wk,i 6=0}
j6=i
X
0
− Hi (u(Xi ) + λ
Wi,j ψj (XJpi )) |xpJk , z]|dFp (xp )
j6=i
≤ max
max
1≤i≤n {k:Wk,i 6=0}
= max
max
1≤i≤n {k:Wk,i 6=0}
|λ|D
X
|λ|D
X
Z
E[|ψj (XJpi ) − ψj (XJpi )||xpJk , z]dFp (xp )
Z
|ψj (XJpi ) − ψj (XJpi )|dFp (X p )
Wi,j
0
j6=i
Wi,j
0
j6=i
0
≤|λ| kW k∞ D| ψ − ψ .
Proposition 2.4.1. We prove these results by guess-and-verify. Under “exchangeability”,
we have that
154
0
ψie (X) = v(X g , Xic ) + E[Xip |X, z]β + λ
P
j6=i Wi,j
R
ψje (Y )fp (Y |X, z)dY .
0
Try ψie (X) = ai + bi X, where ai ∈ < and bi ∈ <L . As X varies, in order that
P
P
0
0
0
ai + bi X = v(X g , Xic ) + β (µ + CX) + β + λ j6=i Wi,j aj + λ j6=i Wi,j bj (µ + CX),
P
0
0
0
for i = 1, · · · , n, it is necessary that bi = β C + λ j6=i Wi,j bi , and
P
P
0
0
0
0
0
ai = v(X g , Xic ) + β µ + λ j6=i Wi,j aj + λ j6=i Wi,j bj µ. Thus, bi = C β + λC BWi,. , where
0
0
0
0
Wi,. is the i-th row of W , and B = (b1 , · · · , bn ). That in turn gives B = C βln + λC BW ,
where ln denotes the n × 1 vector with all coordinates equal to 1. By vectorization, b =
0
0
0
vec(B) and b = (ln ⊗C β)+λ(W ⊗C )b. Hence, under the assumption that Inkp −λ(W ⊗C )
is invertible, we have (2.4.11). The n dimensional vector a is characterized by
0
0
a = v + (µ β)ln + λW (a + B µ)
0
0
0
= λW a + v + (µ β)ln + λvec(µ BW )
0
0
= λW a + v + (µ β)ln + λ(W ⊗ µ )vec(B).
From it, follows (2.4.12).
A.2
Numerical Methods
We consider the case when Xip is known only to i and the joint distribution of Xip ’s
conditional on Z = z is “exchangeable”. So conditional on Z = z, Xip ’s have the same
distribution. Denote their common support conditional on Z = z as Sp = {x : fp (x) > 0},
where fp (·) is the conditional density of Xip given Z = z. When self-known characteristics
are continuously distributed, equilibrium expectation is a function defined on the continuum
set, Sp . A simple approximation device is to discretize the support of self-known characteristics. That will transform the problem to the case where self-known characteristics are
discrete random variables, which we have already discussed about. However, how refine
discretization needs to be remains to be an issue. A general approach to solve a function
from functional equations is the “projection method”, introduced by Judd (1998). Its key
idea is to approximate the solution by a linear finite combination of a set of function bases.
Those coefficients of the linear combination are pinned down by the functional equation.
155
However, inspecting specifics of our model, we suggest an alternative easier approach.
By (2.4.4), the value of conditional expectation function at a realization of self-known
characteristics is determined by an integral, which can be approximated by Gauss-Legendre
quadrature56 . For simplicity, let us consider the case where Xip ’s are one-dimensional.
Standard Gauss-Legendre quadrature considers integral over [−1, 1]. But one can extend
the range to a finite interval, [a, b], which is assumed to be the support of X p , via a linear
transformation. Then we have that
Z b
X
Hi (u(xg , xci , xp ) + λ
Wi,j ψje (xp ))fp (xp |x)dxp
ψie (x) =
a
≈
K
X
k=1
j6=i
ωk Hi (u(xg , xci ,
X
(zk + 1)(b − a)
(zk + 1)(b − a)
+ a) + λ
Wi,j ψje (
+ a))
2
2
j6=i
b−a
(zk + 1)(b − a)
+ a|x)
,
· fp (
2
2
(A.2.1)
where zk , for k = 1, · · · , K, are abscissae and ωk are weights. They are fixed. We can
see that as long as we get the values of expectation function on those abscissae, xpk =
(zk +1)(b−a)
2
+ a, for k = 1, · · · , K, we can approximate the value of expectation function at
any point in the range of X p . In (A.2.1), when x runs over xp1 , · · · , xpK , using the quadrature
approximation, we derive a system of n × K equations about ψie (xpk ), for i = 1, · · · , n
and k = 1, · · · , K, just like what we have in the case where Xip ’s are discrete. Hence,
approximated values of ψie (xpk )’s can be solved by contraction mapping. Then we can
derive approximation of conditional expectation function, ψ e at any value x via (A.2.1).
Especially, when we plug in the observed Xip ’s of the data, we can derive values of conditional
expectations used to calculate the likelihood function. In practice, a small number of
abscissae of the Gauss-Legendre quadrature is enough to get good approximation. In our
Monte Carlo experiments, we choose K = 8.
When the support of X p is infinite, we may use nonlinear transformation. We illustrate
that by an example. Consider the case when Xip ’s are jointly normal. We have shown that
56
Other quadrature methods can also be used. But we need to make sure that two conditions are satisfied.
First, the range of integral should be big enough to include possible values that Xip ’s can take. Second, the
abscissae used do not change with the realization of X p on which expectation is conditioned.
156
an analytical solution can be derived in some special cases. Generally, we have to solve
for equilibrium conditional expectation functions numerically. Specifically, suppose that
X1p , · · · , Xnp are jointly normal with mean

1


ρ

η2  .
.
.

ρ
0
(µ, · · · , µ) , and variance-covariance matrix

ρ ··· ρ


1 · · · ρ

.
.. . . .. 

.
.
.

ρ ··· 1
Then for any i 6= j, condition on Xjp = x, Xip is normal with mean ρx + (1 − ρ)µ and
variance (1 − ρ2 )η 2 . Then (A.2.1) takes a special form:
Z +∞
X
e
ψi (x) =
Hi (u(X g , Xic , xp ) + λ
Wi,j ψje (xp ))
−∞
j6=i
− ρx − (1 − ρ)µ)2
)dxp
2(1 − ρ2 )η 2
2π(1 − ρ2 )η 2
Z 1
X
z
z
=
Hi (u(X g , Xic , log(
)) + λ
Wi,j ψje (log(
)))
1
−
z
1
−
z
0
·p
1
exp(−
(xp
j6=i
·p
z
(log( 1−z
)
− ρx − (1 − ρ)µ)2
1
)
dz
2
2
2(1 − ρ )η
z(1 − z)
1
(A.2.2)
exp(−
2π(1 − ρ2 )η 2
s
Z 1
X
2
ze + 1
ze + 1
g
c
=
H
(u(X
,
X
,
log(
))
+
λ
Wi,j ψje (log(
)))
i
i
2
2
π(1 − ρ )η −1
1 − ze
1 − ze
j6=i
· exp(−
ze+1
(log( 1−e
z)
− ρx − (1 −
2(1 − ρ2 )η 2
ρ)µ)2
)
1
de
z,
(e
z + 1)(1 − ze)
z
where the second equality is derived by a change of integration variable, xp = log( 1−z
) and
the third equality comes from a transformation of z =
ze+1
2 .
Then we can use weights and
abscissae of standard Gauss-Legendre quadrature57 .
When Xip ’s are of multiple dimensions, we can use multiple-dimension quadrature methods. When the dimension is not very high, we can just use the tensor product. For high
dimension X p , computation can be intensive. In that case, we can use some monomial
formulas, some of which can be found in Judd (1998). Alternatively, we may use stochastic
57
When calculating expectation of a function of a normal random variable, it is conventional to use the
Gauss-Hermite quadrature. However,
p to use standard weights and abscissae of that method, we need to
change integration variable as xp = 2(1 − ρ2 )η 2 z + ρx + (1 − ρ)µ. Then we cannot get a system of equation
for values of the expectation function at a fixed set of known points.
157
integral approximation via importance sampling. Let h(ai ) be a density with its support
containing the support of Xip such that fp (xp |x)/h(xp ) is well defined. Then we can generate K random draws, say, xpk , from h(·). The stochastic approximation will be
p
P
P
g c p
e p fp (xk |x)
ψie (x) ≈ K1 K
k=1 Hi (u(x , xi , xk ) + λ
j6=i Wi,j ψj (xk )) h(xp ) .
k
From these, we can solve
ψ e (xpk )’s
by contraction mapping, and approximate the function
ψie (x) at any point x.
A.3
Identification
Classical analysis results of global identification are proved as follows.
Proposition 2.5.1.
In this case, we get a closed-form of equilibrium expectation,
c
E[Y |W = W , X c = X] = (I − λ∗ W )−1 (β0∗ ln + X β2∗ ). Therefore, if (β ∗ , λ∗ , F ) and
e λ,
e Fe ) are observationally equivalent for (W , X c ), (I − λ∗ W )−1 (β ∗ ln + X c β ∗ ) = (I −
(β,
0
2
e )−1 (βe0 ln + X c βe2 ). Multiply both side by (I − λW
e )(I − λ∗ W ) and use the property that
λW
e )−1 = (I−λW
e )−1 W , we get that (I−λW
e )(β ∗ ln +X c β ∗ ) = (I−λ∗ W )(βe0 ln +X c βe2 ).
W (I−λW
0
2
That is,
c
c
β0∗
λ∗ βe0
e ∗
λβ
0
0
β1∗
0
0
λ∗ βe1
e ∗0
λβ
1
0
= 0.
ln W ln X W X
− βe0
−
− βe1
−
e
Hence, if ln W ln X c W X c has full column rank, β0∗ = βe0 , β1∗ = βe1 , and λ∗ = λ.
When W is row-normalized, W ln = ln . Then we have that
0
0
0
0
0
c
c
∗
∗
∗
∗
∗
∗
e
e
e
e
e
e
= 0.
ln X W X
β0 − β0 + λ β0 − λβ0 β1 − β1 λ β1 − λβ1
In that case, if ln X c W X c has full column rank, we still have that β0∗ = βe0 , β1∗ = βe1 ,
P
∗ ∗
c ∗
∗
c
e Because F
c
and λ∗ = λ.
j6=i W i,j E[Yj |W = W , X ]−y),
Yi |X ,W (y) = 1−F (β0 +Xi β1 +λ
we can then identify F∗ .
Proposition 2.5.2. Since
c0
p0
Yi = β0∗ + X i β1∗ + X i β2∗ + λ∗
P
j6=i W i,j E[Yj |J
c
we have that
c
p
E[Yi |J = J, W = W , X c = X , XJpi = X J i ]
P
c0
p0
c
p
= β0∗ + X i β1∗ + X i β2∗ + λ∗ j6=i W i,j E[Yj |J = J, W = W , X c = X , XJpi = X J i ],
158
p
= J, W = W , X c = X , XJpi = X J i ] − i ,
for Ji (i) = 1, i.e., Xip is known by i herself. Therefore, under Assumption 3.4.6, we can
identify β0∗ , β1∗ , β2∗ and λ∗ . Then we can recover F∗ from the distribution of Yi ’s.
e λ,
e Fe ) are observationally equivaProposition 2.5.3. In this case, if (β ∗ , λ∗ , F∗ ) and (β,
lent,
e ⊗ W ))−1 (P ⊗ In )X
e
b ∗ = (Inm − λ(P
b β,
ψ e = (Inm − λ∗ (P ⊗ W ))−1 (P ⊗ In )Xβ
where m is the number of possible values that Xip may take. Multiply both sides by
e ⊗
e ⊗ W ))(Inm − λ∗ (P ⊗ W )) and use the property that (P ⊗ W )(Inm − λ(P
(Inm − λ(P
e ⊗ W ))−1 (P ⊗ W ), we get that (Inm − λ(P
e ⊗ W ))(P ⊗ In )Xβ
b ∗ =
W ))−1 = (Inm − λ(P
e Therefore,
b β.
(Inm − λ∗ (P ⊗ W ))(P ⊗ In )X
0
0
0
2
∗
∗
∗
e
e
e
b
b
= 0.
(β − β) (λ β − λβ )
(P ⊗ In )X (P ⊗ W )X
e If W is row
b (P 2 ⊗ W )X
b has full column rank, β ∗ = βe and λ∗ = λ.
So if (P ⊗ In )X
normalized, (P ⊗ In )ln = (P 2 ⊗ W )ln = ln . Note that P lm = P 2 lm = lm . We have
c
c
lnm lm ⊗ X (P X p,s ) ⊗ ln lm ⊗ (W X ) (P 2 X p,s ) ⊗ ln )
0
0
0
0
0
∗
∗
∗
∗
∗
∗
∗
∗
∗
e )
e ) (λ βe2 − λβ
e (β − βe1 ) (β − βe2 ) (λ βe1 − λβ
= 0.
β0 − βe0 + βe0 λ − β0 λ
2
1
1
2
Therefore, if lnm lm ⊗ X c (P X p,s ) ⊗ ln lm ⊗ (W X c ) (P 2 X p,s ) ⊗ ln ) has full cole We can then identify F ∗ from the distribution of Y .
umn rank, β ∗ = βe and λ∗ = λ.
Proposition 2.5.4.
Suppose (β, λ) is observationally equivalent to the true (β ∗ , λ∗ ), by
the equivalent (identical) equilibrium functions, the corresponding b’s would be equal, and
so are the a’s. That is, for the equivalence of b’s,
0
0
0
0
(In ⊗ Ikp − λ(W ⊗ C ))−1 (ln ⊗ C β2 ) = (In ⊗ Ikp − λ∗ (W ⊗ C ))−1 (ln ⊗ C β2∗ ),
which, by pre-multiplication to eliminate the inverse, can be written as
0
0
0
0
0
0
(ln ⊗ C β2 ) − λ∗ (W ⊗ C )(ln ⊗ C β2 ) = (ln ⊗ C β2∗ ) − λ(W ⊗ C )(ln ⊗ C β2∗ ).
0
0
0
By rearrangement, it follows that (In ⊗ Ikp − λ∗ W ⊗ C )(ln C )(β2 − β2∗ ) + (W ⊗ C )(ln ⊗
0
C β2∗ )(λ − λ∗ ) = 0, and hence,
0
0
(ln ⊗ C )(β2 − β2∗ ) + Gc,n (ln ⊗ C )β2∗ (λ − λ∗ ) = 0,
159
(A.3.1)
0
0
where Gc,n = (In ⊗ Ikp − λ∗ W ⊗ C )−1 (W ⊗ C ). For the equivalence of a’s, by denoting
Gn = (In − λ∗ W )−1 W , one has
c
0
0
ln (β0 − β0∗ ) + X (β1 − β1∗ ) + ln µ (β2 − β2∗ ) + (W ⊗ µ )b(λ − λ∗ )
c
0
0
+ Gn [ln β0∗ + X β1∗ + ln µ β2∗ + λ∗ (W ⊗ µ )b](λ − λ∗ ) = 0.
Note that (In
+ λ∗ G
n )W
0
0
= Gn and (In ⊗ µ )b = B µ, where B =
b1 · · · bn . Therefore,
it follows
0
c
0
c
0
0
ln [(β0 −β0∗ )+µ (β2 −β2 )]+X (β1 −β1∗ )+Gn [ln β0∗ +X β1∗ +(ln β2∗ +B )µ](λ−λ∗ ) = 0. (A.3.2)
Either of the two assumptions 2.5.8 and 2.5.9, implies that β = β ∗ and λ = λ∗ .
Proposition 2.5.5. Notice that
c
c
c
Fs−1 (E[Y |W = W , X c = X ]) = β0∗ + X β1∗ + λ∗ W E[Y |W = W , X c = X ].
c
e λ)
e are observationally equivalent at W and X ,
If (β ∗ , λ∗ ) and (β,
0
0
c
c
c
∗
∗
∗
e
e
β0 − β0 (β1 − β1 ) λ − λ = 0.
ln , X , W E[Y |W = W , X = X ]
c
c
c
Hence, if ln , X , W E[Y |W = W , X = X ] has full column rank, β0∗ = βe0 , β1∗ = βe1
and λ∗ = λ.
Proposition 2.5.6. Similarly, we have that
c
p
Fs−1 (E[yi |W = W , X c = X , Xip = X i ])
P
c
c
p
= β0∗ + X β1∗ + λ∗ j6=i W i,j E[Y |W = W , X c = X , Xip = X i ].
By the definition of observational equivalence, we get the results.
A.4
Unobserved Group Random Effects
In our discussion in the main text, exogenous characteristics used by agents to make
predictions are all observed by econometricians. In reality, however, there might be exogenous features observed by group members but are unavailable in a data set. For example,
students in a class know how patient their math teacher is. But there is no available data
about that. To investigate that kind of cases, we incorporate unobserved group characteristics into our model, which is possible for samples from many independent groups. Suppose
there are G independent groups. For group g, X g , Xgc , Xgp and Wg represent the observed
160
group features, commonly known individual traits and privately known personal characteristics, as well as social network relations in this group, for g = 1, · · · , G. Suppose that
in addition to the group features, X g , which is observable to econometricians, there are
some other group characteristics, ω g , which are observed by all agents in group g but not
by econometricians. Such unobserved group features might be incorporated into the model
such that:
p
∗
c
yi,g
= u(X g , Xi,g
, Xi,g
, ωg ) + λ
X
Wij,g E[yj,g |XJpi,g , Zg ] − i,g .
(A.4.1)
j6=i
ω g is now contained in public information for group g:
0
0
0
0
0
c
c
Zg = (X g , X1,g
, · · · , Xn,g
, W11,g , · · · , W1n,g , · · · , Wn1,g , · · · , Wnn,g , J1,g , · · · , Jn,g , ωg ) .
(A.4.2)
By assuming all relevant unobserved group features are additive, they can be represented
by a single variable ω g with a pdf fω (·; γ). Therefore, unobserved group features are treated
as random effects in our model.
Since ω g is publicly known to every group member, for socially interacted agents who
are making decisions simultaneously, ω g plays the same role as X g does. However, because
X g is observed from data and ω g is not, the presence of ω g affects the distribution of Y .
So identification and estimation will be different from the case without unobserved group
features.
As for identification, we keep on assuming that is independent of X and is distributed with a full support; and the distribution of observable exogenous variables, X =
(X g , X c , X p ) can be identified from the data about X. We also assume that u(·) is a linear
0
0
0
c β + X p β + X g β + ω . We normalize the mean of ω to
function, u(Xi,g ) = β0 + Xi,g
1
3
g
g
i,g 2
be zero, E[ωg ] = 0, and assume that ω g ’s are i.i.d. across different groups. According to
our discussion in the main text, under certain conditions we can identify β1 , β2 and λ from
data about a single group. Then variation across independent groups help us identify β0 ,
β3 and the distribution of ωg .
161
The presence of unobserved group random effects complicates the sample likelihood.
Conditional on ω g , probability distributions of yi,g ’s in group g are independent of each
other. However, since ω g is unobserved and is treated as random, econometricians have to
integrate over it. Therefore, the sample likelihood can be written as
ng
G Z Y
Y
p
p
c
log L(θ; Y, X ) = log
f (yi,g |X g , Xi,g
, Xi,g
, ω g ; β, σ, η)fω (ω g ; γ)dω g
g=1
i=1
Z Y
ng
G
X
p
c
=
log[
f (yi,g |X g , Xi,g
, Xi,g
, ω g ; β, σ, η)fω (ω g ; γ)dω g ].
g=1
(A.4.3)
i=1
For estimation, the likelihood of yi,g is a nonlinear function of ω g . In the event that
an analytical expression of the integral over ω g is not feasible, one can utilize Monte Carlo
integration and construct a simulated sample likelihood function, The simulated likelihood
function can be estimated with equilibrium expectation solution algorithm nested:
b=
L
G
X
g=1
S ng
1 XY
p
c
f (yi,g |X g , Xi,g
, Xi,g
, ωss ; β, σ, η)],
log[
S
(A.4.4)
s=1 i=1
where for each g, ωgs ’s are S independent draws from distribution fω (·; ι).
Additionally, for samples with large network size or sizes of groups, there is a practical
issue about calculation. In the simulated likelihood, (A.4.4), for each g, the likelihood for all
members in a group is calculated for each simulation, ωgs , and then summed up for repeated
simulations. If the size of group g, ng , is large and individual likelihoods take small values,
their product can be extremely small. As a result, the average simulated group likelihood
will be so small that the result will be treated as zero in a computer, i.e. underflow. That
will bring in computational issues when the logarithm is taken for it, or serious round errors
would occur. To overcome that potential problem, we recommend using the method in Lee
(2000) to change the order of summation and multiplication. To be specific, (A.4.4) can be
transformed as
G
X
g=1
log(
ng S
1 YX
p
c
f (yi,g |X g , Xi,g
, Xi,g
, ω g,s ; β, σ, η)ξi−1,g,s ),
S
i=1 s=1
where the weights, ξ’s are determined recursively:
ξ0,g,s = 1;
162
(A.4.5)
ξi,g,s = PS
c , X p , ω s ; β, σ, η)ξ
f (Yi,g |X g , Xi,g
i−1,g,s
g
i,g
0
s0 =1
c , X p , ω s ; β, σ, η)ξ
f (yi,g |X g , Xi,g
g
i,g
i−1,g,s0
.
To investigate the finite sample performance of the estimation algorithm where we nest
the solution of a fixed point in a simulated likelihood, we do a Monte Carlo experiment
p
for the linear model of continuous choices, hi,g (z) = z for all i, g and z, Xi,g
is discretely
distributed with a finite support and ωg is i.i.d. normal with mean zero and standard
deviation, γ = 1. The results are tabulated in Table ??. Compared with the corresponding
results without unobservable group random effects, estimation bias will be larger when the
number of independent groups is small. But the performance will become much better
when the number of independent groups is increased from 100 to 500. We can also see that
in the presence of unobserved group random effects, the maximized sample log likelihood
is in general bigger for estimation under the true information structure than that under a
misspecified information structure.
A.5
Identification: Bayesian Analysis
By Bayesian analysis, we check parameter identification by comparing prior distribution
0
c β +
and posterior distribution. Consider a linear form for u(·) with u(Xi,g ) = β0 + Xi,g
1
0
0
p
Xi,g
β2 . Denote θ = (β0 , β1 , β2 , λ, σ) . The prior distribution is denoted as π(θ|W ). We
assume that the prior distributions of β0 , β1 , β2 , and λ, and σ are independent. For each
i, i = 0, 1, and 2, βi is normal with mean µi and standard deviation, σi . λ is truncated
normal with mean µ3 and standard deviation σ3 , restricted in [−1/τG , 1/τG ] ∩ [−1, 1], where
τG = maxg τg and
τg = min
g


max
1≤i≤n
n
X
|Wij,g |, max
1≤j≤n
j=1
n
X
i=1
|Wij,g |


.

σ is truncated normal with mean µ4 and standard deviation σ4 , only taking positive values.
Thus, the prior distribution relates to the network structure through the range of λ. The
hyper-parameters are chosen as µ0 = 0, µ3 = 0.1, µ1 = µ2 = µ4 = 1, and σi = 1, for
i = 0, 1, · · · , 4.
163
The sample is generated in the way as described in the Monte Carlo experiments. We
consider two models, the linear model for continuous choices with hi,g (z) = z; and the
binary choice (probit) model with hi,g (z) = I(z > 0), for all i and g; two information
p
structures such that Xi,g
is either public information in group g or known only to (i, g);
and two types of network structures such that the sample is either composed of a number
of independent groups or just a single group.
The posterior distribution is derived by Metropolis-Hastings algorithm. Beginning with
a randomly chosen θ0 , we propose an update from θ to θe according to q(·|·), such that for
k = 0, 1, 2,
βek − βk ∼ δk N (0, 1);
e ∼ T N (λ, δ3 )[−1/τG , 1/τG ] ∩ [−1, 1];
λ
σ
e ∼ T N (σ, δ4 )(0, ∞);
θe is taken with probability α:
(
e
e
e
e = min 1, q(θ|θ) P (Y |θ, X, W ) π(θ)
α(θ, θ)
e P (Y |θ, X, W ) π(θ)
q(θ|θ)
)
.
e we can verify that the detailed balance condition
For any two parameter vectors, θ and θ,
holds as follows:
e
P r(θ|θ)P
(θ|Y, X, W )
P (Y |θ, X, W )π(θ|W )
b X, W )π(θ|W
b )dθb
P (Y |θ,
(
)
e P (Y |θ,
e X, W ) π(θ|W
e )
q(θ|θ)
e R
= min 1,
q(θ|θ)
e P (Y |θ, X, W ) π(θ|W )
q(θ|θ)
(
)
e P (Y |θ, X, W ) π(θ|W )
q(θ|θ)
eR
= min 1,
q(θ|θ)
e P (Y |θ,
e X, W ) π(θ|W
e )
q(θ|θ)
e
e
e θ)q(θ|θ)
e R P (Y |θ, X, W )π(θ|W )
=α(θ,
b X, W )π(θ|W
b )dθb
P (Y |θ,
e θ|θ)
e R
=α(θ, θ)q(
e (θ|Y,
e X, W ).
=P r(θ|θ)P
164
P (Y |θ, X, W )π(θ|W )
b X, W )π(θ|W
b )dθb
P (Y |θ,
e X, W )π(θ|W
e )
P (Y |θ,
b X, W )π(θ|W
b )dθb
P (Y |θ,
As for the step of updating distribution, δ, we begin with some initial guesses and than make
adjustments according to the standard deviation of draws that have been generated, as is
conventional in the literature (Gelman et al. (2014)). Since our main focus is the intensity of
social interaction, λ, we show prior and posterior distributions of this parameter in Figures
A.1 and A.2. When there are a number of independent groups, it is easy to tell the posterior
distribution apart from the prior distribution, which is an evidence of identification. If there
is only one single group in the sample, due to inter-personal interactions, it is much harder to
distinguish the posterior distribution from the prior distribution, especially for the binary
choice model, as lots of information about the latent variable is lost when we only have
binary choices. However, an increase in the sample size can help with identification.
165
Table A.1: Linear Model with Discrete Characteristics for Independent Groups
True parameters
Publicly Known Characteristics
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0007 (0.0299)
-0.0001 (0.0130)
0.0007 (0.0310)
-0.0001 (0.0138)
β1
1
1.0011 (0.0224)
0.9999 (0.0103)
1.0012 (0.0227)
0.9999 (0.0104)
β2
1
0.9989 (0.0465)
1.0007 (0.0204)
0.9988 (0.0468)
1.0007 (0.0206)
λ
0.3
0.2993 (0.0293)
0.3000 (0.0133)
0.2993 (0.0314)
0.2999 (0.0144)
σ
1
0.9999 (0.0156)
0.9999 (0.0071)
1.0031 (0.0156)
1.0031 (0.0071)
-1.4187 (0.0156)
-1.4188 (0.0071)
-1.4219 (0.0155)
-1.4220 (0.0071)
0.9700
1.0000
-
-
m log L
rtrue
True parameters
Self-Known Characteristics
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0007 (0.0308)
-0.0001 (0.0137)
0.0322 (0.0302)
0.0315 (0.0132)
β1
1
1.0011 (0.0226)
0.9999 (0.0104)
1.0070 (0.0223)
1.0060 (0.0102)
β2
1
0.9989 (0.0467)
1.0007 (0.0206)
0.9768 (0.0468)
0.9784 (0.0206)
λ
0.3
0.2993 (0.0313)
0.2999 (0.0143)
0.2597 (0.0318)
0.2604 (0.0144)
σ
1
0.9999 (0.0156)
0.9999 (0.0071)
1.0026 (0.0156)
1.0026 (0.0071)
-1.4187 (0.0156)
-1.4188 (0.0071)
-1.4214 (0.0156)
-1.4215 (0.0071)
0.9590
0.9990
-
-
m log L
rtrue
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average
log likelihood. rtrue is the proportion of simulations for which estimated log likelihood under the true information
structure is larger than that under the misspecified one. The numbers in parentheses are standard deviation.
166
Table A.2: Binary Choice with Discrete Characteristics for Independent Groups
True parameters
Publicly Known Characteristics
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0008 (0.1436)
-0.0008 (0.0619)
-0.0011 (0.1553)
-0.0007 (0.0699)
β1
1
1.0030 (0.0475)
1.0010 (0.0202)
1.0027 (0.0475)
1.0007 (0.0201)
β2
1
1.0022 (0.0747)
1.0019 (0.0336)
1.0022 (0.0748)
1.0017 (0.0336)
λ
0.3
0.2995 (0.2080)
0.3013 (0.0899)
0.3021 (0.2269)
0.3010 (0.1021)
-0.4381 (0.0119)
-4.3850 (0.0051)
-0.4382 (0.0119)
-0.4386 (0.0051)
0.6440
0.7750
-
-
m log L
rtrue
True parameters
Self-Known Characteristics
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
-0.0026 (0.1535)
-0.0011 (0.0701)
0.0381 (0.1437)
0.0378 (0.0625)
β1
1
1.0030 (0.0479)
1.0011 (0.0200)
1.0033 (0.0479)
1.0012 (0.0200)
β2
1
1.0025 (0.0751)
1.0019 (0.0338)
1.0000 (0.0750)
0.9995 (0.0338)
λ
0.3
0.3036 (0.2244)
0.3015 (0.1027)
0.2428 (0.2094)
0.2432 (0.0913)
-0.4382 (0.0120)
-0.4385 (0.0051)
-0.4383 (0.0120)
-0.4386 (0.0051)
0.5910
0.7640
-
-
m log L
rtrue
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average
log likelihood. rtrue is the proportion of simulations for which estimated log likelihood under the true information
structure is larger than that under the misspecified one. The numbers in parentheses are standard deviation.
167
Table A.3: Linear Model with Continuous Characteristics for Independent Groups
True parameters
Publicly Known Characteristics
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0005 (0.0192)
0.0001 (0.0088)
-0.0050 (0.0487)
-0.0331 (0.0215)
β1
1
1.0007 (0.0212)
0.9999 (0.0099)
1.0012 (0.0244)
1.0007 (0.0111)
β2
1
0.9994 (0.0139)
0.9998 (0.0063)
1.0412 (0.0246)
1.0424 (0.0109)
λ
0.3
0.3004 (0.0119)
0.3001 (0.0052)
0.2945 (0.0464)
0.2927 (0.0202)
σ
1
0.9985 (0.0160)
0.9997 (0.0072)
1.1183 (0.0195)
1.1201 (0.0087)
-1.4173 (0.0161)
-1.4186 (0.0072)
-1.5306 (0.0174)
-1.5324 (0.0078)
1.0000
1.0000
-
-
m log L
rtrue
True parameters
Self-Known Characteristics
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0018 (0.0471)
0.0005 (0.0202)
0.2586 (0.0288)
0.2607 (0.0133)
β1
1
1.0007 (0.0223)
0.9999 (0.0102)
1.0109 (0.0223)
1.0105 (0.0103)
β2
1
1.0015 (0.0199)
0.9999 (0.0085)
1.1102 (0.0138)
0.1103 (0.0062)
λ
0.3
0.2993 (0.0323)
0.2999 (0.0136)
0.0425 (0.0149)
0.0417 (0.0066)
σ
1
0.9985 (0.0160)
0.9997 (0.0072)
1.0142 (0.0162)
1.0155 (0.0073)
-1.4173 (0.0161)
-1.4186 (0.0073)
-1.4329 (0.0160)
-1.4343 (0.0072)
1.0000
1.0000
-
-
m log L
rtrue
True parameters
µ
1
Distribution of Self-Known Characteristics
n = 20, G = 100
n = 20, G = 500
1.0054 (0.0037)
0.9993 (0.0570)
η
2
1.9964 (0.0637)
2.0000 (0.0280)
ρ
0.4
0.3969 (0.0384)
0.4000 (0.0164)
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average
log likelihood. rtrue is the proportion of simulations for which estimated log likelihood under the true information
structure is larger than that under the misspecified one. The numbers in parentheses are standard deviation.
168
Table A.4: Binary Choice with Continuous Characteristics for Independent Groups
True parameters
Publicly Known Characteristics
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0074 (0.1167)
0.0013 (0.0516)
-0.0129 (0.2965)
0.0094 (0.1395)
β1
1
1.0079 (0.0612)
1.0000 (0.0264)
1.0050 (0.0611)
0.9776 (0.0264)
β2
1
1.0055 (0.0488)
1.0003 (0.0219)
1.0030 (0.0564)
0.9991 (0.0257)
λ
0.3
0.3095 (0.1759)
0.2984 (0.0769)
0.3129 (0.4627)
0.2809 (0.2162)
-0.2550 (0.0138)
-0.2565 (0.0062)
-0.2557 (0.0138)
-0.2571 (0.0062)
0.7830
0.9820
-
-
m log L
rtrue
True parameters
Self-Known Characteristics
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0064 (0.2898)
0.0129 (0.1410)
0.1509 (0.1195)
0.1681 (0.0524)
β1
1
1.0080 (0.0612)
1.0000 (0.0264)
1.0081 (0.0613)
0.9999 (0.0264)
β2
1
1.0064 (0.0554)
1.0014 (0.0258)
1.0227 (0.0486)
1.0171 (0.0218)
λ
0.3
0.3081 (0.4519)
0.2805 (0.2183)
0.0536 (0.1777)
0.0384 (0.0773)
-0.2549 (0.0136)
-0.2565 (0.0061)
-0.2550 (0.0136)
-0.2565 (0.0061)
0.5640
0.6980
-
-
m log L
rtrue
True parameters
µ
1
Distribution of Self-Known Characteristics
n = 20, G = 100
n = 20, G = 500
1.0054 (0.1337)
0.9993 (0.0571)
η
2
1.9964 (0.0637)
2.0000 (0.0280)
ρ
0.4
0.3969 (0.0384)
0.4000 (0.0164)
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average
log likelihood. rtrue is the proportion of simulations for which estimated log likelihood under the true information
structure is larger than that under the misspecified one. The numbers in parentheses are standard deviation.
169
Table A.5: Linear Model with Discrete Characteristics and Constant Friend Number for a
Single Group
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.0117
(0.2239)
0.9952
(0.0702)
1.0026
(0.1477)
0.2741
(0.3724)
0.9886
(0.0506)
0.0015
(0.2141)
1.0008
(0.0314)
1.0013
(0.0651)
0.2961
(0.3696)
0.9978
(0.0216)
-0.0054
(0.0978)
1.0008
(0.0314)
1.0009
(0.0653)
0.3080
(0.1625)
0.9978
(0.0216)
0.0170
(0.2369)
0.9953
(0.0700)
1.0019
(0.1477)
0.2642
(0.3979)
0.9891
(0.0507)
0.0026
(0.2259)
1.0009
(0.0314)
1.0010
(0.0652)
0.2942
(0.3906)
0.9979
(0.0216)
-0.0055
(0.1093)
1.0008
(0.0314)
1.0010
(0.0654)
0.3079
(0.1827)
0.9982
(0.0216)
-1.4062
(0.0510)
0.5490
-1.4165
(0.0216)
0.5700
-1.4165
(0.0216)
0.6600
-1.4066
(0.0510)
-
-1.4166
(0.0217)
-
-1.4169
(0.0217)
-
Self-Known Characteristics
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
True
Parameters
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0167
(0.2368)
0.9954
(0.0699)
1.0018
(0.1474)
0.2654
(0.3976)
0.9888
(0.0507)
0.0029
(0.2258)
1.0008
(0.0314)
1.0010
(0.0652)
0.2938
(0.3901)
0.9978
(0.0216)
-0.0056
(0.1093)
1.0008
(0.0314)
1.0010
(0.0653)
0.3083
(0.1830)
0.9978
(0.0216)
0.0439
(0.2261)
0.9953
(0.0702)
1.0008
(0.1478)
0.2189
(0.3772)
0.9889
(0.0505)
0.0329
(0.2177)
1.0008
(0.0314)
1.0008
(0.0651)
0.2415
(0.3759)
0.9979
(0.0216)
0.0283
(0.0977)
1.0012
(0.0314)
0.9989
(0.0653)
0.2506
(0.1630)
0.9981
(0.0216)
-1.4063
(0.0510)
0.5280
-1.4166
(0.0217)
0.5130
-1.4165
(0.0216)
0.6380
-1.4064
(0.0510)
-
-1.4166
(0.0217)
-
-1.4168
(0.0216)
-
m log L
rtrue
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is
the estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood
under the true information structure is larger than that under the misspecified one. The numbers in parentheses are
standard deviation.
170
Table A.6: Linear Model with Discrete Characteristics and Random Friend Number for a
Single Group
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.0066
(0.1596)
1.0042
(0.0714)
1.0002
(0.1488)
0.2910
(0.2459)
0.9894
(0.0501)
0.0028
(0.1138)
1.0015
(0.0314)
1.0000
(0.0657)
0.2948
(0.1935)
0.9995
(0.0228)
0.0008
(0.0645)
1.0015
(0.0314)
1.0003
(0.0656)
0.2977
(0.1025)
0.9994
(0.0228)
0.0077
(0.1729)
1.0045
(0.0714)
1.0002
(0.1481)
0.2888
(0.2692)
0.9903
(0.0503)
0.0018
(0.1263)
1.0015
(0.0314)
1.0000
(0.0658)
0.2961
(0.2142)
0.9996
(0.0228)
0.0029
(0.0710)
1.0015
(0.0314)
0.9999
(0.0656)
0.2942
(0.1139)
1.0003
(0.0220)
-1.4070
(0.0508)
0.5760
29.4344
-1.4181
(0.0228)
0.6020
149.6204
-1.4181
(0.0228)
0.7640
29.5245
-1.4079
(0.0509)
29.4344
-1.4183
(0.0228)
149.6204
-1.4190
(0.0229)
29.5245
True
Parameters
n = 200
U F = 59
Self-Known
n = 1000
U F = 299
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0083
(0.1722)
1.0044
(0.0714)
1.0002
(0.1480)
0.2875
(0.2677)
0.9895
(0.0501)
0.0016
(0.1263)
1.0015
(0.0314)
1.0000
(0.0658)
0.2965
(0.2142)
0.9994
(0.0228)
0.0027
(0.0711)
1.0016
(0.0314)
1.0000
(0.0658)
0.2944
(0.1140)
0.9994
(0.0228)
0.0387
(0.1599)
1.0048
(0.0714)
0.9958
(0.1491)
0.2367
(0.2486)
0.9900
(0.0500)
0.0347
(0.1143)
1.0018
(0.0314)
0.9987
(0.0658)
0.2398
(0.1948)
0.9996
(0.0228)
0.0329
(0.0645)
1.0025
(0.0314)
0.9958
(0.0660)
0.2436
(0.1035)
1.0000
(0.0228)
-1.4071
(0.0508)
0.5800
29.4344
-1.4181
(0.0228)
0.6040
149.6204
-1.4181
(0.0228)
0.7020
29.5245
-1.4076
(0.0507)
29.4344
-1.4183
(0.0228)
149.6204
-1.4187
(0.0228)
29.5245
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
mF
m log L
rtrue
mF
Self-Known Characteristics
Publicly Known
n = 1000
n = 200
n = 1000
n = 1000
U F = 59
U F = 59 U F = 299 U F = 59
Note: n is the number of agents in the group. U F is the maximum number of friends a person can make. mF
is the average number of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion
of simulations for which estimated log likelihood under the true information structure is larger than that under the
misspecified one. The numbers in parentheses are standard deviation.
171
Table A.7: Binary Choice with Discrete Characteristics and Constant Friend Number for a
Single Group
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
m log L
rtrue
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.1266
(0.5925)
1.0290
(0.1554)
1.0398
(0.2591)
0.1039
(0.8816)
0.1239
(0.5687)
1.0073
(0.0678)
1.0051
(0.1110)
0.1101
(0.8695)
0.0363
(0.4429)
1.0077
(0.0672)
1.0054
(0.1106)
0.2447
(0.6713)
0.1475
(0.5921)
1.0290
(0.1557)
1.0359
(0.2586)
0.0742
(0.8804)
0.1350
(0.5848)
1.0073
(0.0678)
1.0045
(0.1111)
0.0933
(0.8941)
0.0562
(0.4725)
1.0077
(0.0672)
1.0048
(0.1105)
0.2144
(0.7191)
-0.4307
(0.0381)
0.5950
-0.4369
(0.0171)
0.5630
-0.4367
(0.0170)
0.5420
-0.4309
(0.0381)
-
-0.4369
(0.0171)
-
-0.4368
(0.0170)
-
Self-Known Characteristics
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
True
Parameters
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
0.1505
(0.5902)
1.0286
(0.1554)
1.0362
(0.2584)
0.0684
(0.8795)
0.1351
(0.5838)
1.0073
(0.0677)
1.0043
(0.1110)
0.0934
(0.8930)
0.0583
(0.4716)
1.0073
(0.0675)
1.0043
(0.1101)
0.2110
(0.7173)
0.1400
(0.5941)
1.0286
(0.1551)
1.0399
(0.2588)
0.0823
(0.8851)
0.1382
(0.5704)
1.0073
(0.0677)
1.0049
(0.1109)
0.0883
(0.8728)
0.0660
(0.4466)
1.0073
(0.0675)
1.0047
(0.1102)
0.1993
(0.6774)
-0.4310
(0.0380)
0.4020
-0.4369
(0.0171)
0.4480
-0.4369
(0.0170)
0.4670
-0.4308
(0.0380)
-
-0.4369
(0.0171)
-
-0.4369
(0.0170)
-
m log L
rtrue
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is
the estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood
under the true information structure is larger than that under the misspecified one. The numbers in parentheses are
standard deviation.
172
Table A.8: Binary Choice with Discrete Characteristics and Random Friend Number for a
Single Group
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.0952
(0.4874)
1.0413
(0.1625)
1.0420
(0.2625)
0.1576
(0.7243)
0.0738
(0.4485)
1.0046
(0.0663)
1.0030
(0.1075)
0.1900
(0.6807)
0.0120
(0.2650)
1.0045
(0.0669)
1.0037
(0.1078)
0.2838
(0.4025)
0.0966
(0.4907)
1.0411
(0.1622)
1.0405
(0.2611)
0.1554
(0.7304)
0.0929
(0.4662)
1.0046
(0.0663)
1.0027
(0.1074)
0.1609
(0.7074)
0.0191
(0.2760)
1.0044
(0.0668)
1.0034
(0.1075)
0.2728
(0.4310)
-0.4280
(0.0381)
0.5100
29.4344
-0.4378
(0.0167)
0.5000
149.6204
-0.4381
(0.0169)
0.5300
29.5245
-0.4280
(0.0380)
29.4344
-0.4379
(0.0167)
149.6204
-0.4381
(0.0169)
29.5245
True
Parameters
n = 200
U F = 59
Self-Known
n = 1000
U F = 299
β0
0
β1
1
β2
1
λ
0.3
0.0934
(0.4910)
1.0415
(0.1611)
1.0404
(0.2611)
0.1599
(0.7302)
0.0947
(0.4655)
1.0047
(0.0664)
1.0049
(0.1073)
0.1582
(0.7065)
0.0192
(0.2756)
1.0045
(0.0665)
1.0033
(0.1071)
0.2727
(0.4212)
0.1062
(0.4858)
1.0415
(0.1613)
1.0415
(0.2624)
0.1401
(0.7218)
0.0918
(0.4502)
1.0048
(0.0664)
1.0031
(0.1074)
0.1623
(0.6837)
0.0311
(0.2662)
1.0045
(0.0666)
1.0031
(0.1074)
0.2543
(0.4060)
-0.4281
(0.0378)
0.5040
29.4344
-0.4378
(0.0166)
0.5070
149.6204
-0.4380
(0.0168)
0.5270
29.5245
-0.4280
(0.0507)
29.4344
-0.4378
(0.0166)
149.6204
-0.4381
(0.0168)
29.5245
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
m log L
rtrue
mF
m log L
rtrue
mF
Self-Known Characteristics
Publicly Known
n = 1000
n = 200
n = 1000
n = 1000
U F = 59
U F = 59 U F = 299 U F = 59
Note: n is the number of agents in the group. U F is the maximum number of friends a person can make. mF
is the average number of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion
of simulations for which estimated log likelihood under the true information structure is larger than that under the
misspecified one. The numbers in parentheses are standard deviation.
173
Table A.9: Linear Model with Continuous Characteristics and Constant Friend Number for
a Single Group
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.0040
(0.5329)
1.0008
(0.0698)
0.9995
(0.0460)
0.3086
(0.2362)
0.9909
(0.0496)
0.0179
(0.5118)
1.0003
(0.0316)
1.0007
(0.0215)
0.2864
(0.2302)
0.9973
(0.0225)
-0.0004
(0.2137)
1.0003
(0.0316)
1.0007
(0.0215)
0.3028
(0.0926)
0.9974
(0.0225)
0.1510
(0.6050)
1.0011
(0.0697)
0.8927
(0.1564)
0.2710
(0.3875)
0.9941
(0.0499)
0.2124
(0.6147)
1.0003
(0.0306)
0.9089
(0.1350)
0.2332
(0.3393)
0.9980
(0.0224)
0.1337
(0.4495)
1.0004
(0.0318)
0.8910
(0.0692)
0.2827
(0.1673)
1.0010
(0.0225)
-1.4085
(0.0501)
0.6610
-1.4160
(0.0225)
0.6680
-1.4160
(0.0225)
0.9350
-1.4118
(0.0501)
-
-1.4166
(0.0225)
-
-1.4197
(0.0225)
-
Self-Known Characteristics
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
True
Parameters
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0138
(0.4033)
1.0011
(0.0700)
1.0034
(0.1742)
0.2908
(0.3818)
0.9910
(0.0498)
0.0029
(0.3356)
1.0004
(0.0316)
1.0004
(0.1473)
0.3000
(0.3277)
0.9974
(0.0224)
-0.0035
(0.1722)
1.0003
(0.0316)
0.9988
(0.0754)
0.3042
(0.1647)
0.9974
(0.0225)
0.2034
(0.4569)
1.0002
(0.0701)
1.1310
(0.0474)
0.0821
(0.2177)
0.9920
(0.0449)
0.2180
(0.4382)
1.0002
(0.0316)
1.1316
(0.0237)
0.0635
(0.2146)
0.9977
(0.0225)
0.2034
(0.2028)
1.0007
(0.0316)
1.1314
(0.0237)
0.0775
(0.0854)
0.9986
(0.0225)
-1.4086
(0.0503)
0.5680
-1.4161
(0.0225)
0.5820
-1.4160
(0.0225)
0.7850
-1.4097
(0.0500)
-
-1.4163
(0.0225)
-
-1.4173
(0.0226)
-
m log L
rtrue
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is
the estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood
under the true information structure is larger than that under the misspecified one. The numbers in parentheses are
standard deviation.
174
Table A.10: Linear Model with Continuous Characteristics and Random Friend Number
for a Single Group
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.0043
(0.2582)
0.9985
(0.0738)
0.9993
(0.0465)
0.3025
(0.1329)
0.9871
(0.0504)
-0.0085
(0.2434)
0.9994
(0.0334)
1.0007
(0.0207)
0.2995
(0.1126)
0.9978
(0.0227)
-0.0038
(0.1197)
0.9994
(0.0335)
1.0006
(0.0207)
0.3016
(0.0555)
0.9978
(0.0227)
0.0668
(0.4144)
0.9988
(0.0742)
0.8803
(0.0949)
0.3092
(0.2243)
0.9954
(0.0507)
0.1383
(0.4191)
0.9995
(0.0335)
0.8910
(0.0790)
0.2866
(0.1958)
1.0002
(0.0228)
0.1088
(0.3379)
0.9991
(0.0336)
0.8862
(0.0435)
0.3033
(0.1040)
1.0068
(0.0230)
-1.4047
(0.0512)
0.8070
29.4867
-1.4165
(0.0228)
0.8710
149.5960
-1.4165
(0.0227)
0.9860
29.5193
-1.4030
(0.0510)
29.4867
-1.4189
(0.0228)
149.5960
-1.4254
(0.0228)
29.5193
True
Parameters
n = 200
U F = 59
Self-Known
n = 1000
U F = 299
β0
0
β1
1
β2
1
λ
0.3
σ
1
-0.0094
(0.2248)
0.9986
(0.0471)
0.9934
(0.0999)
0.3129
(0.2122)
0.9871
(0.0503)
-0.0004
(0.1952)
0.9994
(0.0334)
1.0003
(0.0851)
0.3007
(0.1882)
0.9978
(0.0227)
-0.0040
(0.1005)
0.9993
(0.0334)
0.9989
(0.0451)
0.3041
(0.0947)
0.9978
(0.0227)
0.1554
(0.2698)
0.9983
(0.0743)
1.1282
(0.1481)
0.0968
(0.1296)
0.9909
(0.0509)
0.1432
(0.2648)
0.9995
(0.0335)
1.1305
(0.0235)
0.0910
(0.1103)
0.9988
(0.0228)
0.1332
(0.1852)
1.0007
(0.0337)
1.1279
(0.0236)
0.0983
(0.0573)
1.0017
(0.0228)
-1.4047
(0.0510)
0.7000
29.4867
-1.4165
(0.0228)
0.7590
149.5960
-1.4165
(0.0228)
0.9400
29.5193
-1.4085
(0.0515)
29.4867
-1.4175
(0.0228)
149.5960
-1.4204
(0.0227)
29.5193
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
mF
m log L
rtrue
mF
Self-Known Characteristics
Publicly Known
n = 1000
n = 200
n = 1000
n = 1000
U F = 59
U F = 59 U F = 299 U F = 59
Note: n is the number of agents in the group. U F is the maximum number of friends a person can make. mF
is the average number of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion
of simulations for which estimated log likelihood under the true information structure is larger than that under the
misspecified one. The numbers in parentheses are standard deviation.
175
Table A.11: Binary Choice with Continuous Characteristics and Constant Friend Number
for a Single Group
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
m log L
rtrue
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200
n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.2318
(2.4761)
1.3816
(7.4183)
1.3554
(6.1150)
0.0654
(0.8985)
0.1781
(0.6683)
1.0146
(0.1008)
1.0138
(0.0887)
0.0436
(0.8995)
0.0740
(0.5812)
1.0047
(0.1008)
1.0142
(0.0884)
0.2020
(0.7420)
0.0041
(6.7784)
1.9455
(25.0543)
0.9883
(25.9679)
0.0126
(0.9487)
0.1963
(0.6316)
1.0046
(0.1007)
1.0124
(0.1085)
0.0127
(0.9523)
0.1354
(0.5812)
1.0146
(0.1008)
1.0060
(0.0884)
0.1078
(0.7420)
-0.2497
(0.0856)
0.6260
-0.2559
(0.0812)
0.6180
-0.2557
(0.0812)
0.5660
-0.2502
(0.0858)
-
-0.2506
(0.0812)
-
-0.2559
(0.0812)
-
Self-Known Characteristics
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200
n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
True
Parameters
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
0.0740
(0.5812)
1.0147
(0.1008)
1.0142
(0.0884)
0.2020
(0.7420)
0.1756
(0.6206)
1.0138
(0.1007)
1.0322
(0.1120)
0.0205
(0.9525)
0.0194
(0.5665)
1.0140
(0.0988)
1.0250
(0.1056)
0.1245
(0.8755)
0.1354
(0.5745)
1.0146
(0.1010)
1.0060
(0.1040)
0.1078
(0.8663)
0.2013
(0.6546)
1.0138
(0.1007)
1.0341
(0.0918)
-0.0183
(0.8977)
0.1435
(0.5750)
1.0138
(0.0987)
1.0341
(0.0898)
0.0698
(0.7499)
-0.2557
(0.0812)
0.5660
-0.2559
(0.0788)
0.3900
-0.2558
(0.0787)
0.4750
-0.2559
(0.0812)
-
-0.2558
(0.0787)
-
-0.2557
(0.0787)
-
m log L
rtrue
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is
the estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood
under the true information structure is larger than that under the misspecified one. The numbers in parentheses are
standard deviation.
176
Table A.12: Binary Choice with Continuous Characteristics and Random Friend Number
for a Single Group
Publicly Known Characteristics
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.1675
(0.7507)
1.0839
(0.3773)
1.0867
(0.2811)
0.0980
(0.7758)
0.1569
(0.8577)
1.0652
(1.7943)
1.0495
(1.2264)
0.1173
(0.7447)
0.0719
(0.7611)
1.0659
(1.7943)
1.0508
(1.2264)
0.2386
(0.5099)
0.2003
(0.7347)
1.0828
(0.3796)
1.0815
(0.2880)
0.0381
(0.8444)
0.1930
(0.8496)
1.0661
(1.8275)
1.0457
(1.2506)
0.0595
(0.8241)
0.1181
(0.7713)
1.0664
(1.8276)
1.0395
(1.2502)
0.1760
(0.6042)
-0.2500
(0.0838)
0.5270
29.4867
-0.2514
(0.0822)
0.5240
149.5960
-0.2514
(0.0821)
0.5430
29.5193
-0.2503
(0.0838)
29.4867
-0.2514
(0.0822)
149.5960
-0.2515
(0.0822)
29.5193
True
Parameters
n = 200
U F = 59
Self-Known
n = 1000
U F = 299
β0
0
β1
1
β2
1
λ
0.3
0.1823
(0.7438)
1.0870
(0.3982)
1.1083
(0.3402)
0.0452
(0.8468)
0.1771
(0.8437)
1.0650
(1.8271)
1.0644
(1.2500)
0.0586
(0.8225)
0.1032
(0.7699)
1.0660
(1.8272)
1.0580
(1.2496)
0.1736
(0.6094)
0.2034
(0.7555)
1.0879
(0.3967)
1.1140
(0.3364)
0.0183
(0.7761)
0.1934
(0.8576)
1.0641
(1.7939)
1.0682
(1.2257)
0.0303
(0.7535)
0.1274
(0.7589)
1.0650
(1.7939)
1.0688
(1.2258)
0.1267
(0.5160)
-0.2500
(0.0816)
0.4830
29.4867
-0.2512
(0.0794)
0.4640
149.5960
-0.2513
(0.0793)
0.5290
29.5193
-0.2498
(0.0816)
29.4867
-0.2512
(0.0794)
149.5960
-0.2513
(0.0793)
29.5193
True
Parameters
β0
0
β1
1
β2
1
λ
0.3
m log L
rtrue
mF
m log L
rtrue
mF
Self-Known Characteristics
Publicly Known
n = 1000
n = 200
n = 1000
n = 1000
U F = 59
U F = 59 U F = 299 U F = 59
Note: n is the number of agents in the group. U F is the maximum number of friends a person can make. mF
is the average number of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion
of simulations for which estimated log likelihood under the true information structure is larger than that under the
misspecified one. The numbers in parentheses are standard deviation.
177
Table A.13: Linear Model with Discrete Characteristics and Unobserved Group Random
Effects
True parameters
Publicly Known Characteristics
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
-0.0026 (0.1906)
0.0004 (0.0636)
-0.0042 (0.1890)
0.0011 (0.0636)
β1
1
1.0013 (0.0233)
0.9999 (0.0106)
1.0013 (0.0232)
1.0000 (0.0106)
β2
1
0.9989 (0.0463)
1.0007 (0.0203)
0.9991 (0.0469)
1.0007 (0.0207)
λ
0.3
0.3002 (0.0404)
0.2993 (0.0177)
0.2998 (0.0457)
0.2989 (0.0202)
σ
1
1.0012 (0.0161)
1.0005 (0.0073)
1.0043 (0.0161)
1.0036 (0.0073)
γ
1
1.1136 (0.1657)
1.0412 (0.0623)
1.1110 (0.1686)
1.0413 (0.0636)
-1.5145 (0.0156)
-1.5129 (0.0072)
-1.5175 (0.0156)
-1.5158 (0.0072)
0.9500
1.0000
-
-
m log L
rtrue
True parameters
Self-Known Characteristics
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
-0.0071 (0.1924)
0.0008 (0.0637)
0.0377 (0.2010)
0.0410 (0.0681)
β1
1
1.0013 (0.0233)
1.0000 (0.0106)
1.0024 (0.0235)
1.0010 (0.0106)
β2
1
0.9989 (0.0466)
1.0006 (0.0206)
0.9797 (0.0467)
0.9816 (0.0205)
λ
0.3
0.2998 (0.0452)
0.2990 (0.0201)
0.2429 (0.0415)
0.2417 (0.0181)
σ
1
1.0012 (0.0161)
1.0005 (0.0073)
1.0037 (0.0162)
1.0029 (0.0073)
γ
1
1.1116 (0.1681)
1.0411 (0.0634)
1.1980 (0.1777)
1.1268 (0.0673)
-1.5145 (0.0157)
-1.5129 (0.0072)
-1.5168 (0.0157)
-1.5151 (0.0072)
0.9370
1.0000
-
-
m log L
rtrue
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average
log likelihood. rtrue is the proportion of simulations for which estimated log likelihood under the true information
structure is larger than that under the misspecified one. The numbers in parentheses are standard deviation.
178
Table A.14: Sample Statistics: Municipalities in North Carolina
Variables
Mean
Standard Deviation
Min
Max
Total Revenue (×106 )
19.8781
94.5765
0.0043
1.6486×103
Expenditure
($ ×106 )
7.0020
28.6849
0
280.1064
(0.2752)
(0.2261)
(0)
(0.8821)
2.3462
17.4553
-0.2218
350.2160
(0.0603)
(0.0782)
(-0.0179)
(0.8760)
2.2384
20.6795
0
448.8580
(Proportion)
(0.1036)
(0.0930)
(0)
(0.5746)
Government
1.5539
6.5083
0.0027
110.4780
(Proportion)
(0.2277)
(0.2053)
(0.0154)
(1.5187)
Public Safety
4.0027
19.8703
0
357.6710
(Proportion)
(0.1829)
(0.1315)
(0)
(0.7449)
2.9412
13.9665
-1.4503
178.4934
(0.1503)
(0.1375)
(-0.5187)
0.8045
Population ×103
10.2842
45.0819
0.0250
751.9990
Median Household Income $ ×104
4.2092
1.8504
1.1750
15.7297
Related County Public Safety
Expenditure ×106
No. of Related Counties
33.5566
41.7575
0
217.6625
1.1008
0.3326
1
4
Distance (kilometer)
216.7976
125.4035
1.2133
767.1423
No. of Observations
506
-
-
-
Utility
(Proportion)
Debt & Services
(Proportion)
Transportation
Other
(Proportion)
179
Table A.15: Empirical Analysis
Regression
Constant
Regression for Municipal Spending on Public Safety
(1)
(2)
(3)
(4)
(5)
(6)
(7)
0.1268
(0.4486)
-0.0049
(0.5252)
0.0534
(0.5204)
0.2068
(0.4733)
0.2526
(0.4845)
0.5939
(0.5344)
0.6346
(0.5766)
Population
0.3079***
(0.0040)
0.3149***
(0.0042)
0.3145***
(0.0042)
0.3135***
(0.0045)
0.3129***
(0.0045)
0.3109***
(0.0042)
0.3106***
(0.0042)
Total Revenue
0.0631***
(0.0019)
0.0599***
(0.0020)
0.0601***
(0.0020)
0.0607***
(0.0021)
0.0610***
(0.0021)
0.0617***
(0.0020)
0.0619***
(0.0020)
-0.1294*
(0.0771)
-0.0420
(0.1132)
-0.0574
(0.1084)
-0.0661
(0.0898)
-0.0805
(0.0866)
-0.1100
(0.0825)
-0.1257
(0.0805)
-0.0569***
(0.0169)
-0.0649***
(0.0210)
-0.0835***
(0.0313)
-0.0913**
(0.0367)
-0.1356
(0.0860)
-0.1437
(0.0990)
2.1768***
(0.0166)
2.1495***
(0.0161)
2.1522***
(0.0162)
2.1510***
(0.0165)
2.1540***
(0.0166)
2.1572***
(0.0163)
2.1596***
(0.0164)
-1111.6
-1105.2
-1105.8
-1105.6
-1106.3
-1107.0
-1107.6
506
506
506
506
506
506
506
No. of
“Neighbors”
10.8656
(4.3634)
10.8656
(4.3634)
28.3636
(9.0854)
28.3636
(9.0854)
94.0474
(29.1823)
94.0474
(29.1823)
Cutoff
Distance(km)
30
30
50
50
100
100
Median HHI
λ
σ
log L
No. of Obs
Variables
µ
Distribution of Median Household Income
ω
η
ρ
Estimates
3.3971
0.7742
2.0059
0.1490
Note: Regression (1) is ordinary linear regression without social interactions. Regression (2),(4) and (6) correspond to
the linear model when all characteristics are public information. Municipal median household income is assumed to be
self-known for municipalities in Regression (3),(5) and (7). Two municipalities are viewed as close “neighbors” if the
distance between them is less than 30 kilometers for Regression (2) and (3), or less than 50 kilometers for Regression
(4) and (5), or less than 100 kilometers for Regression (6) and (7). Numbers in parentheses are standard deviations.
Estimates that are significant at the %10, %5, and %1 levels are marked by “*”, “**”, and “***”, respectively.
180
Publicly Known Discrete Characteristics for Multiple Groups
Self−Known Discrete Characteristics for Multiple Groups
2500
2000
prior
n=20,G=100,F=3
n=20,G=500,F=3
2000
prior
n=20,G=100,F=3
n=20,G=500,F=3
1500
1500
1000
1000
500
500
0
0.2
0.25
0.3
λ
0.35
0.4
0
0.2
0.45
0.25
Publicly Known Discrete Characteristics for a Single Group
0.3
λ
0.35
0.4
0.45
Self−Known Discrete Characteristics for a Single Group
25
20
prior
n=200,G=1,F=30
n=1000,G=1,F=30
20
prior
n=200,G=1,F=30
n=1000,G=1,F=30
15
15
10
10
5
5
0
−1
−0.8
−0.6
−0.4
−0.2
0
λ
0.2
0.4
0.6
0.8
0
−1
1
−0.8
−0.6
−0.4
−0.2
0
λ
0.2
Figure A.1: Identification for Continuous Choices
181
0.4
0.6
0.8
1
Publicly Known Discrete Characteristics for Multiple Groups
Self−Known Discrete Characteristics for Multiple Groups
60
40
prior
n=20,G=100,F=3
n=20,G=500,F=3
50
prior
n=20,G=100,F=3
n=20,G=500,F=3
35
30
40
25
30
20
15
20
10
10
5
0
−0.5
0
0.5
λ
0
−0.6
1
−0.4
Publicly Known Discrete Characteristics for a Single Group
1.5
−0.2
0
0.2
λ
0.4
0.6
0.8
1
Self−Known Discrete Characteristics for a Single Group
1.4
prior
n=200,G=1,F=30
n=1000,G=1,F=30
prior
n=200,G=1,F=30
n=1000,G=1,F=30
1.2
1
1
0.8
0.6
0.5
0.4
0.2
0
−1
−0.8
−0.6
−0.4
−0.2
0
λ
0.2
0.4
0.6
0.8
0
−1
1
−0.8
−0.6
−0.4
−0.2
0
λ
Figure A.2: Identification for Binary Choices
182
0.2
0.4
0.6
0.8
1
Appendix B: Appendix to Chapter 3
B.1
Equilibrium Analysis
Proof of Proposition 3.3.1.
Suppose that if (s1 (XJp1 , 1 ), · · · , sn (XJpn , n )) satisfies
Qn
Qn
0
n
(3.2.4), define ψ :
i=1 Ai → C as follows. For any A = (A1 · · · , An ) ∈
i=1 Ai ,
there exists k1 , · · · , kn ∈ {1, · · · , n} such that Wn,ki i 6= 0 and Ai = XJpk . Set ψ(A) =
i
0
(ψ1 (A1 ), · · · , ψn (An )) , where ψi (Ai ) =
ψi (XJpk )
i
=
E[si (XJpi , 1 )|XJpk , z],
i
for i = 1, · · · , n.
Then, we have that
ψi (Ai ) =E[max


0
0
0
β0 + Xic β1 + Xip β2 + X g β3 + λ

X
Wn,ij E[sj (XJpi , i )|XJpi , Z] − i , 0
0
0
0
X
0
0
0
X
|Ai , z]

j6=i
=E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ


Wn,ij E[sj (XJpi , i )|XJpi , Z])|Ai , z]
j6=i
=E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij ψj (XJpi ))|Ai , z].
j6=i
In the above equation, the first equality holds because i ’s are independent of each other
and also independent of X and Wn . The second equality follows from the definition of ψ.
Q
Thus, (3.3.3) holds. Conversely, suppose that a function ψ : ni Ai → Cn satisfies (3.3.3).
Define a profile of strategies, (s1 (XJp1 , 1 ), · · · , sn (XJpn , n )), by




X
0
0
0
si (XJpi , i ) = max β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij ψj (XJpi ) − i , 0 ,


j6=i
for i, j = 1, · · · , n and Wj,i 6= 0. Taking expectations over sj (XJpi , j ) conditional on XJpi
and Z = z, we derive that
183
E[sj (XJpi , j )|XJpi , z]
h
=E max


0
β0 + Xic β1 +
0
Xip β2
0
+ X g β3 + λ

X
Wn,ij ψj (XJpi )
j6=i
0
0
0
=E[H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X


i
− i , 0 XJpi , z

Wn,ij ψj (XJpi ))|XJpi , z]
j6=i
=ψj (Xip ),
where the second equality follows from (3.3.3). Hence, we have that




X
0
0
0
si (XJpi , i ) = max β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij E[sj (XJpi , j )|XJpi , z] − i , 0 .


j6=i
Therefore, (3.2.4) is satisfied.
0
Proof of Proposition 3.3.2. For any two functions, ξ, ξ ∈ Ξ,
0 T (ξ) − T (ξ ) = max
Z
max
1≤i≤n {k:Wn,ki 6=0}
0
X
0
0
0
|E[ H(β0 + Xic β1 + Xip β2 + X g β3 + λ
Wn,ij ξj (XJpi ))
j6=i
0
0
− H(β0 + Xic β1 + Xip β2 + X g β3 + λ
X
Wn,ij ξj (XJpi )) |xpJk , z]|dFp (xp )
0
j6=i
Z
0
dH(x) X
|
Wn,ij E[|ξj (XJpi ) − ξj (XJpi )|xpJk , z]dFp (xp )
≤ max
max
|λ| sup |
1≤i≤n {k:Wn,ki 6=0}
dx
x
j6=i
Z
X
0
dH(x)
= max
max
|λ| sup |
|
Wn,ij |ξj (XJpi ) − ξj (XJpi )|dFp (X p )
1≤i≤n {k:Wn,ki 6=0}
dx
x
j6=i
0
dH(x) ≤|λ| kWn k∞ sup |
|| ψ − ψ .
dx
x
Recall that Fp (·) is a simplified notation of the distribution of Xip ’s conditional on
public information Z = z. By calculation, dH(x)/dx = F (x). Therefore, if |λ| kWn k∞ < 1,
T : Ξ → Ξ is a contraction mapping on the complete metric space, (Ξ, k · k). Then, it admits
a unique fixed point. Because there is a one-to-one correspondence between a BNE and a
fixed point of T , there is a unique BNE.
184
B.2
Identification Proofs
0
0
0
Proof of Lemma 3.4.1. Define ci = β0 +Xic β1 +Xip β2 +X g β3 +λ
p
j6=i Wn,ij E[yj |XJi , z].
P
According to (3.2.1), we have that E[I(yi > 0)|X p , z] = P r(ci − i > 0|X p , z) = F (ci ; σ),
where the second equality follows from the independence between the idiosyncratic shocks
and the exogenous characteristics. Since F (c; σ) is strictly increasing with respect to c, we
have the inversion, ci = F−1 (E[I(yi > 0)|X p , z]; σ). Because
Z
Z
E[yi |X p , z] =
(ci − c)f (c; σ)dc = ci F (ci ; σ) −
c<ci
cf (c; σ)dc,
c<ci
we derive (3.4.3).
Proof of Proposition 3.4.1. For X p and z, we define a function S such that
S(E[y|X p , z], E[I(y > 0)|X p , z], σ) = E[y|X p , z] − E[I(y > 0)|X p , z]F−1 (E[I(y > 0)|X p , z]; σ)
Z
+
cf (c; σ)dc.
c<F−1 (E[I(y>0)|X p ,z];σ)
By (3.4.30 ), when data are generated from the true parameter σ0 , S(E[y|X p , z], E[I(y >
0)|X p , z], σ0 ) = 0. We would like to show that given the observed data (E[y|X p , z], E[I(y >
0)|X p , z]), there is a unique σ that satisfies the above equation. That is, σ can be identified
from the observed data. To achieve this, we first take the partial derivative,
∂S(E[y|X p , z], E[I(y > 0)|X p , z], σ)
∂σ
= F−1 (E[I(y > 0)|X p , z]; σ)f (F−1 (E[I(y > 0)|X p , z]; σ); σ) − E[I(y > 0)|X p , z]
∂F−1 (E[I(y > 0)|X p , z]; σ)
∂σ
Z
∂f (c; σ)
+
c
dc.
∂σ
c<F−1 (E[I(y>0)|X p ,z];σ)
·
Since F (F−1 (E[I(y > 0)|X p , z]; σ); σ) = E[I(y ∗ > 0)|X p , z],
−1
f (F−1 (E[I(y > 0)|X p , z]; σ); σ) ∂F
(E[I(y>0)|X p ,z];σ)
∂σ
+
∂F (F−1 (E[I(y>0)|X p ,z];σ);σ)
∂σ
= 0,
where the derivative with respect to σ in the second term is the derivative of F (c; σ) with
185
respect to σ. Thus,
∂F (F −1 (E[I(y>0)|X p ,z];σ);σ)
∂F−1 (E[I(y > 0)|X p , z]; σ)
∂σ
=−
.
∂σ
f (F−1 (E[I(y > 0)|X p , z]; σ); σ)
Additionally, by using integration by parts,
Z
∂f (c; σ)
c
dc
∂σ
c<F−1 (E[I(y>0)|X p ,z];σ)
∂F (F−1 (E[I(y > 0)|X p , z]; σ); σ)
∂F (c; σ)
− lim c
c→−∞
∂σ
∂σ
∂F (c; σ)
dc.
∂σ
c<F−1 (E[I(y>0)|X p ,z];σ)
=F−1 (E[I(y0)|X p , z]; σ)
Z
−
Hence,
∂S(E[y|X p , z], E[I(y > 0)|X p , z], σ)
∂σ
=E[I(y > 0)|X
Z
−
p
∂F (F−1 (E[I(y>0)|X p ,z];σ);σ)
∂σ
, z]
−1
f (F (E[I(y > 0)|X p , z]; σ); σ)
c<F−1 (E[I(y>0)|X p ,z];σ)
∂F (c; σ)
dc
∂σ
∂F (F−1 (E[I(y>0)|X p ,z];σ);σ)
∂σ
(
−1
−1
p
(E[I(y
>
0)|X p , z]; σ); σ)
f
(F
c<F (E[I(y>0)|X ,z];σ) Z
=
Therefore, when Assumption 3.4.4 is satisfied,
−
∂F (c;σ)
∂σ
f (c; σ)
)f (c; σ)dc.
∂S(E[y|X p ,z],E[I(y>0)|X p ,z],σ)
∂σ
is either posi-
tive or negative. As a result, by the implicit function theorem, for (E[y|X p , z], E[I(y ∗ >
0)|X p , z]), there is a unique σ such that (3.4.30 ) holds.
That is, from the moments,
(E[y|X p , z], E[I(y ∗ > 0)|X p , z]), we can identify σ.
Proof of Proposition 3.4.2.
σ is identified from Proposition 3.4.1. As H(x; σ) =
R
xF (x, σ) − c<x cf (c; σ)dc, ∂H(x;σ)
= F (x; σ) > 0 for any x ∈ <1 . Thus, H(x; σ) is strictly
∂x
increasing in x and H −1 (·; σ) is well-defined. Therefore, we have
0
0
H −1 (E[yi |X p , z]) = β0 + Xic β1 + Xip β2 + λ
X
Wn,ij E[yj |XJpi , z].
j6=i
0
0
0
0
0
0 0
e ) are observationally equivalent at W , J and
Therefore, if (β0 , β1 , β2 , λ) and (βe0 , βe1 , βe2 , λ
X,
c
p
ln , X , X ,
As
c
p
ln , X , X ,
Ep
0
0
0
0
e 0 = 0.
(β0 − βe0 , β1 − βe1 , β2 − βe2 , λ − λ)
Ep
0
0
0
0
0
0 0
e).
has full column rank, (β0 , β1 , β2 , λ) = (βe0 , βe1 , βe2 , λ
186
Table B.1: Tobit Model with Discrete Characteristics for Independent Groups
Model Specification
Characteristics are all Publicly Known
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0017 (0.0599)
0.0004 (0.0261)
0.0014 (0.0647)
-0.0000 (0.0285)
β1
1
1.0020 (0.0274)
0.9999 (0.0122)
1.0021 (0.0276)
0.9999 (0.0122)
β2
1
0.9987 (0.0483)
1.0003 (0.0217)
0.9988 (0.0487)
1.0005 (0.0219)
λ
0.3
0.2983 (0.0486)
0.2997 (0.0214)
0.2985 (0.0536)
0.3000 (0.0238)
σ
1
0.9997 (0.0194)
0.9998 (0.0090)
1.0016 (0.0195)
1.0017 (0.0090)
-1.1611 (0.0163)
-1.1618 (0.0072)
-1.1627 (0.0163)
-1.1633 (0.0072)
rtrue
0.8920
0.9990
-
-
rcensor
0.3209
0.3206
0.3209
0.3206
m log L
Model Specification
Characteristics are Self-Known
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0016 (0.0643)
0.0003 (0.0283)
0.0588 (0.0605)
0.0578 (0.0265)
β1
1
1.0020 (0.0278)
0.9999 (0.0122)
1.0051 (0.0275)
1.0032 (0.0122)
β2
1
0.9988 (0.0484)
1.0005 (0.0219)
0.9864 (0.0483)
0.9880 (0.0219)
λ
0.3
0.2983 (0.0535)
0.2997 (0.0235)
0.2469 (0.0510)
0.2481 (0.0225)
σ
1
0.9998 (0.0194)
0.9999 (0.0090)
1.0014 (0.0194)
1.0014 (0.0090)
-1.1614 (0.0163)
-1.1620 (0.0072)
-1.1627 (0.0163)
-1.1633 (0.0072)
rtrue
0.8750
0.9970
-
-
rcensor
0.3208
0.3205
0.3208
0.3205
m log L
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average log
likelihood. rtrue is the proportion of simulations for which estimated log likelihood is bigger than that under wrong
information structure. rcensor is the censoring rate. The numbers in parentheses are standard deviation.
187
Table B.2: Tobit Model with Continuous Characteristics for Independent Groups
Model Specification
Characteristics are All Publicly Known
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0030 (0.0385)
0.0002 (0.0172)
-0.0636 (0.1122)
-0.0592 (0.0494)
β1
1
1.0008 (0.0248)
0.9998 (0.0114)
1.0067 (0.0272)
1.0061 (0.0124)
β2
1
0.9995 (0.0166)
0.9997 (0.0077)
1.0365 (0.0270)
1.0369 (0.0125)
λ
0.3
0.3012 (0.0167)
0.3001 (0.0074)
0.3022 (0.0612)
0.3002 (0.0271)
σ
1
0.9988 (0.0191)
0.9998 (0.0085)
1.0759 (0.0222)
1.0769 (0.0097)
m log L
rtrue
rcensor
-1.1361 (0.0306)
1.0000
0.2746
-1.1368 (0.0133)
1.0000
0.2749
-1.1937 (0.0330)
0.2746
-1.1942 (0.0142)
0.2749
Model Specification
Characteristics are Self-Known
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100
n = 20, G = 500
β0
0
-0.0037 (0.0901)
0.0003 (0.0390)
0.4473 (0.0474)
0.4528 (0.0218)
β1
1
1.0010 (0.0255)
0.9999 (0.0115)
1.0068 (0.0256)
1.0057 (0.0117)
β2
1
1.0014 (0.0210)
0.9998 (0.0094)
1.0777 (0.0167)
1.0774 (0.0077)
λ
0.3
0.3003 (0.0487)
0.3000 (0.0209)
0.0431 (0.0209)
0.0414 (0.0094)
σ
1
0.9989 (0.0191)
0.9998 (0.0084)
1.0077 (0.0193)
1.0087 (0.0085)
-1.1452 (0.0280)
0.9990
0.2682
-1.1459 (0.0121)
1.0000
0.2683
-1.1525 (0.0283)
0.2682
-1.1533 (0.0121)
0.2683
m log L
rtrue
rcensor
True parameters
µ
η
ρ
1
2
0.4
Distribution of Self-Known Characteristics
n = 20, G = 100
n = 20, G = 500
1.0054 (0.0037)
1.9964 (0.0637)
0.3969 (0.0384)
0.9993 (0.0570)
2.0000 (0.0280)
0.4000 (0.0164)
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average log
likelihood. rtrue is the proportion of simulations for which estimated log likelihood is bigger than that under wrong
information structure. rcensor is the censoring rate. The numbers in parentheses are standard deviation.
188
Table B.3: Tobit Model with Discrete Characteristics and Constant Friend Number for A
Single Group
Model
Specification
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
rcensor
Characteristics are Publicly Known
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.0561
(0.5469)
0.9963
(0.0835)
1.0023
(0.1600)
0.2416
(0.5368)
0.9881
(0.0630)
0.0217
(0.5333)
1.0013
(0.0380)
0.9997
(0.0700)
0.2777
(0.5242)
0.9980
(0.0270)
-0.0100
(0.2483)
1.0012
(0.0381)
0.9994
(0.0699)
0.3094
(0.2410)
0.9981
(0.0271)
0.0651
(0.5794)
0.9964
(0.0832)
1.0009
(0.1600)
0.2330
(0.5713)
0.9884
(0.0632)
0.0039
(0.5602)
1.0014
(0.0379)
0.9995
(0.0701)
0.2758
(0.5516)
0.9981
(0.0270)
-0.0091
(0.2818)
1.0012
(0.0381)
0.9994
(0.0699)
0.3085
(0.2740)
0.9983
(0.0270)
-1.1512
(0.0532)
0.5400
0.3202
-1.1607
(0.0228)
0.5730
0.3203
-1.1606
(0.0228)
0.5930
0.3204
-1.1515
(0.0533)
0.3202
-1.1607
(0.0228)
0.3203
-1.1608
(0.0228)
0.3204
Characteristics are Self-Known
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
Model
Specification
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0648
(0.5785)
0.9967
(0.0836)
1.0012
(0.1603)
0.2331
(0.5706)
0.9884
(0.0633)
0.0254
(0.5605)
1.0012
(0.0379)
0.9993
(0.0700)
0.2746
(0.5518)
0.9980
(0.0270)
-0.0094
(0.2825)
1.0012
(0.0380)
0.9994
(0.0701)
0.3088
(0.2745)
0.9980
(0.0270)
0.1136
(0.5529)
0.9964
(0.0840)
1.0015
(0.1604)
0.1850
(0.5441)
0.9884
(0.0632)
0.0794
(0.5410)
1.0012
(0.0380)
0.9993
(0.0699)
0.2211
(0.5323)
0.9980
(0.0270)
0.0560
(0.2490)
1.0013
(0.0381)
0.9982
(0.0701)
0.2448
(0.2423)
0.9982
(0.0270)
-1.1513
(0.0533)
0.4890
0.3202
-1.1607
(0.0229)
0.4690
0.3202
-1.1606
(0.0228)
0.5780
0.3203
-1.1513
(0.0532)
0.3202
-1.1607
(0.0229)
0.3202
-1.1608
(0.0228)
0.3203
m log L
rtrue
rcensor
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is the
estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood is
bigger than that under wrong information structure. rcensor is the censoring rate. The numbers in parentheses are
standard deviation.
189
Table B.4: Tobit Model with Discrete Characteristics and Random Friend Number for A
Single Group
Model
Specification
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
rcensor
mF
Characteristics are Publicly Known
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.0066
(0.1596)
1.0042
(0.0714)
1.0002
(0.1488)
0.2910
(0.2459)
0.9894
(0.0501)
0.0070
(0.2746)
1.0014
(0.0385)
0.9996
(0.0700)
0.2927
(0.2698)
0.9993
(0.0286)
0.0015
(0.1402)
1.0016
(0.0384)
1.0001
(0.0702)
0.2978
(0.1354)
0.9993
(0.0287)
0.0077
(0.1729)
1.0045
(0.0714)
1.0002
(0.1481)
0.2888
(0.2692)
0.9903
(0.0503)
0.0045
(0.3041)
1.0014
(0.0385)
0.9997
(0.0700)
0.2949
(0.2983)
0.9994
(0.0286)
0.0051
(0.1532)
1.0016
(0.0384)
0.9997
(0.0701)
0.2943
(0.1487)
0.9998
(0.0287)
-1.4070
(0.0508)
0.5760
0.3214
29.4344
-1.1616
(0.0236)
0.5440
0.3206
149.6204
-1.1599
(0.0236)
0.6740
0.3219
29.5245
-1.4079
(0.0509)
0.3214
29.4344
-1.1617
(0.0236)
0.3206
149.6204
-1.1604
(0.0236)
0.3219
29.5245
Characteristics are Self-Known
Self-Known
Publicly Known
n = 1000
n = 1000
n = 200
n = 1000
n = 1000
U F = 299 U F = 59
U F = 59 U F = 299 U F = 59
Model
Specification
n = 200
U F = 59
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0162
(0.3633)
1.0058
(0.0847)
1.0010
(0.1575)
0.2830
(0.3491)
0.9894
(0.0631)
0.0043
(0.3044)
1.0014
(0.0384)
0.9997
(0.0701)
0.2951
(0.2984)
0.9992
(0.0286)
0.0056
(0.1532)
1.0015
(0.0383)
0.9998
(0.0704)
0.2939
(0.1485)
0.9992
(0.0286)
0.0633
(0.3233)
1.0060
(0.0849)
0.9983
(0.1584)
0.2370
(0.3211)
0.9897
(0.0631)
0.0590
(0.2742)
1.0015
(0.0385)
0.9989
(0.0702)
0.2416
(0.2699)
0.9993
(0.0287)
0.0516
(0.1394)
1.0019
(0.0384)
0.9974
(0.0707)
0.2487
(0.1354)
0.9995
(0.0286)
-1.1498
(0.0534)
0.5630
0.3215
29.4344
-1.1616
(0.0236)
0.5900
0.3205
149.6204
-1.1600
(0.0236)
0.6250
0.3218
29.5245
-1.1501
(0.0535)
0.3215
29.4344
-1.1617
(0.0236)
0.3205
149.6204
-1.1603
(0.0236)
0.3218
29.5245
m log L
rtrue
rcensor
mF
Note: n is the number of agents in the group. U F and mF are respectively the maximum and average number
of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion of simulations for which
estimated log likelihood is bigger than that under wrong information structure. rcensor is the censoring rate. The
numbers in parentheses are standard deviation.
190
Table B.5: Tobit Model with Continuous Characteristics and Constant Friend Number for
A Single Group
Model
Specification
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
rcensor
Characteristics are Publicly Known
Publicly Known
Self-Known
n = 200 n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 30
F = 150
F = 30
F = 30
F = 150
F = 30
0.0106
(0.6553)
0.9980
(0.0848)
0.9987
(0.0632)
0.2863
(0.3790)
0.9859
(0.0657)
0.0319
(0.6319)
1.0011
(0.0385)
1.0009
(0.0278)
0.2713
(0.3876)
0.9972
(0.0284)
-0.0101
(0.2668)
1.0012
(0.0386)
1.0009
(0.0279)
0.3026
(0.1673)
0.9973
(0.0285)
0.1998
(0.9211)
0.9980
(0.0849)
0.9298
(0.1620)
0.2342
(0.5352)
0.9878
(0.0661)
0.2659
(0.8726)
1.0010
(0.0385)
0.9425
(0.1414)
0.1996
(0.4943)
0.9976
(0.0285)
0.1371
(0.5054)
1.0009
(0.0387)
0.9228
(0.0786)
0.2708
(0.2672)
0.9993
(0.0286)
-1.1261
(0.2576)
0.5990
0.2751
-1.1346
(0.2501)
0.6220
0.2755
-1.1345
(0.2499)
0.8120
0.2756
-1.1280
(0.2587)
0.2751
-1.1350
(0.2502)
0.2755
-1.1365
(0.2499)
0.2756
Characteristics are Self-Known
Self-Known
Publicly Known
n = 1000 n = 1000
n = 200 n = 1000 n = 1000
F = 150
F = 30
F = 30
F = 150
F = 30
Model
Specification
n = 200
F = 30
β0
0
β1
1
β2
1
λ
0.3
σ
1
0.0643
(0.9197)
0.9996
(0.0821)
1.0092
(0.1738)
0.2652
(0.5229)
0.9864
(0.0628)
0.0442
(0.8261)
1.0011
(0.0370)
1.0085
(0.1506)
0.2733
(0.4756)
0.9975
(0.0275)
-0.0171
(0.4353)
1.0010
(0.0370)
0.9979
(0.0809)
0.3089
(0.2487)
0.9974
(0.0275)
0.3756
(0.6291)
0.9990
(0.0818)
1.0919
(0.0615)
0.0811
(0.3527)
0.9871
(0.0627)
0.3959
(0.6078)
1.0011
(0.0370)
1.0936
(0.0282)
0.0595
(0.3577)
0.9977
(0.0275)
0.3649
(0.2574)
1.0016
(0.0371)
1.0936
(0.0281)
0.0824
(0.1464)
0.9984
(0.0276)
-1.1335
(0.2258)
0.5100
0.2696
-1.1430
(0.2163)
0.5270
0.2694
-1.1429
(0.2163)
0.6810
0.2694
-1.1338
(0.2257)
0.2696
-1.1431
(0.2164)
0.2694
-1.1436
(0.2165)
0.2694
m log L
rtrue
rcensor
Note: n is the number of agents in the group. F is the constant number of friends a person can make. m log L is the
estimated sample average log likelihood. rtrue is the proportion of simulations for which estimated log likelihood is
bigger than that under wrong information structure. rcensor is the censoring rate. The numbers in parentheses are
standard deviation.
191
Table B.6: Tobit Model with Continuous Characteristics and Random Friend Number for
A Single Group
Model
Specification
β0
0
β1
1
β2
1
λ
0.3
σ
1
m log L
rtrue
rcensor
mF
Characteristics are Publicly Known
Publicly Known
Self-Known
n = 200
n = 1000
n = 1000
n = 200
n = 1000
U F = 59 U F = 299 U F = 59
U F = 59 U F = 299
n = 1000
U F = 59
0.0009
(0.3353)
0.9978
(0.0905)
1.0007
(0.0661)
0.2919
(0.2771)
0.9836
(0.0655)
-0.0056
(0.2956)
0.9981
(0.0405)
0.9991
(0.0276)
0.2978
(0.2090)
0.9960
(0.0285)
-0.0063
(0.1515)
0.9984
(0.0406)
0.9992
(0.0276)
0.3068
(0.1120)
0.9962
(0.0285)
0.0976
(0.4870)
0.9976
(0.0916)
0.9200
(0.1025)
0.2793
(0.2976)
0.9880
(0.0662)
0.1504
(0.4572)
0.9980
(0.0405)
0.9216
(0.0770)
0.2703
(0.2574)
0.9974
(0.0287)
0.1288
(0.2588)
0.9981
(0.0407)
0.9203
(0.0460)
0.2791
(0.1382)
1.0015
(0.0290)
-1.1039
(0.2721)
0.7090
0.2895
29.4867
-1.1495
(0.2506)
0.7710
0.2620
149.5960
-1.1480
(0.2503)
0.9170
0.2634
29.5193
-1.1081
(0.2744)
0.2895
29.4867
-1.1508
(0.2513)
0.2620
149.5960
-1.1529
(0.2526)
0.2634
29.5193
Characteristics are Self-Known
Self-Known
Publicly Known
n = 1000
n = 1000
n = 200
n = 1000
n = 1000
U F = 299 U F = 59
U F = 59 U F = 299 U F = 59
Model
Specification
n = 200
U F = 59
β0
0
β1
1
β2
1
λ
0.3
σ
1
-0.0258
(0.4592)
0.9969
(0.0887)
0.9954
(0.0977)
0.3160
(0.2631)
0.9835
(0.0621)
-0.0055
(0.4146)
0.9980
(0.0400)
0.9975
(0.0788)
0.3041
(0.2395)
0.9962
(0.0275)
-0.0102
(0.1962)
0.9981
(0.0401)
0.9969
(0.0420)
0.3070
(0.1120)
0.9963
(0.0277)
0.2464
(0.3522)
0.9973
(0.0884)
1.0921
(0.0648)
0.1415
(0.2384)
0.9873
(0.0625)
0.2614
(0.3208)
0.9985
(0.0401)
1.0911
(0.0277)
0.1255
(0.1900)
0.9973
(0.0275)
0.2296
(0.1945)
0.9993
(0.0404)
1.0890
(0.0279)
0.1444
(0.1028)
0.9999
(0.0279)
-1.1134
(0.2394)
0.6490
0.2828
29.4867
-1.1551
(0.2189)
0.7180
0.2582
149.5960
-1.1537
(0.2193)
0.8810
0.2595
29.5193
-1.1161
(0.2398)
0.2828
29.4867
-1.1558
(0.2191)
0.2582
149.5960
-1.1565
(0.2199)
0.2595
29.5193
m log L
rtrue
rcensor
mF
Note: n is the number of agents in the group. U F and mF are respectively the maximum and average number
of friends. m log L is the estimated sample average log likelihood. rtrue is the proportion of simulations for which
estimated log likelihood is bigger than that under wrong information structure. rcensor is the censoring rate. The
numbers in parentheses are standard deviation.
192
Table B.7: Tobit Model with Discrete Characteristics and Unobserved Group Random
Effects
Model Specification
Characteristics are Publicly Known
Publicly Known
Self-Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0323 (0.1733)
0.0127 (0.0612)
0.0287 (0.1776)
0.0121 (0.0630)
β1
1
1.0023 (0.0287)
1.0001 (0.0128)
1.0021 (0.0286)
0.9999 (0.0129)
β2
1
0.9982 (0.0517)
1.0007 (0.0226)
0.9980 (0.0523)
1.0002 (0.0228)
λ
0.3
0.3051 (0.0613)
0.3023 (0.0259)
0.3057 (0.0702)
0.3025 (0.0298)
σ
1
1.0008 (0.0194)
1.0005 (0.0094)
1.0029 (0.0194)
1.0025 (0.0095)
γ
1
1.0514 (0.1364)
1.0194 (0.0513)
1.0477 (0.1371)
1.0190 (0.0516)
-1.1667 (0.0344)
-1.1655 (0.0149)
-1.1682 (0.0344)
-1.1669 (0.0149)
rtrue
0.8880
0.9960
-
-
rcensor
0.3509
0.3516
0.3509
0.3516
m log L
Model Specification
Characteristics are Self-Known
Self-Known
Publicly Known
n = 20, G = 100 n = 20, G = 500
n = 20, G = 100 n = 20, G = 500
β0
0
0.0289 (0.1781)
0.0124 (0.0630)
0.1106 (0.1830)
0.0920 (0.0637)
β1
1
1.0024 (0.0289)
1.0001 (0.0129)
1.0030 (0.0291)
1.0006 (0.0129)
β2
1
0.9985 (0.0520)
1.0006 (0.0229)
0.9870 (0.0519)
0.9895 (0.0228)
λ
0.3
0.3051 (0.0707)
0.3026 (0.0297)
0.2458 (0.0629)
0.2426 (0.0265)
σ
1
1.0009 (0.0194)
1.0005 (0.0093)
1.0026 (0.0195)
1.0022 (0.0093)
γ
1
1.0481 (0.1362)
1.0192 (0.0520)
1.1095 (0.1454)
1.0756 (0.0553)
-1.1669 (0.0345)
-1.1657 (0.0149)
-1.1680 (0.0346)
-1.1668 (0.0149)
rtrue
0.8560
0.9950
-
-
rcensor
0.3509
0.3515
0.3509
0.3515
m log L
Note: G is the number of groups. n is the population for each group. m log L is the estimated sample average log
likelihood. rtrue is the proportion of simulations for which estimated log likelihood is bigger than that under wrong
information structure. rcensor is the censoring rate. The numbers in parentheses are standard deviation.
193
Table B.8: Sample Statistics
Variables
Mean
Standard Deviation
Min
Max
Property Tax Rate
Per $100 Valuation
0.3676
(0.1972)
0
0.8200
Population
×103
10.2842
(45.0819)
0.0250
751.9990
Utility Revenue
$ ×106
7.3150
(29.6843)
0
344.9110
Median Household Income
$ ×104
4.2092
(1.8504)
1.1750
15.7297
Related County Tax Rate
Per $100 Valuation
0.6380
(0.1518)
0.1162
0.9900
1.1008
(0.3326)
1
4
216.7976
(125.4035)
1.2133
767.1423
No. of Related County
Distance
Kilometers
No. of Observations
506
194
Table B.9: Tobit Model for Property Tax Competition
Regression
Regression for City Property Tax Rates
(1)
(2)
(3)
(4)
(5)
Constant
β0
0.3436***
(0.0511)
0.2232***
(0.0488)
0.1534***
(0.0535)
0.1610***
(0.0442)
0.0852**
(0.0364)
Population
β1
0.0005***
(0.0001)
0.0006***
(0.0001)
0.0004***
(0.0001)
0.0007***
(0.0001)
0.0004***
(0.0002)
Related County Tax Rate
β3
0.2822***
(0.0648)
0.1574**
(0.0647)
0.1454**
(0.0658)
0.1652***
(0.0641)
0.1387**
(0.0585)
Median Household Income
β4
-0.0393***
(0.0043)
-0.0334***
(0.0046)
-0.0327***
(0.0044)
-0.0311***
(0.0044)
-0.0301***
(0.0041)
Iteraction Intensity
λ
0.4754***
(0.1401)
0.6843***
(0.2023)
0.6020***
(0.1311)
0.8517***
(0.1493)
Shock Variance
σ
0.1848***
(0.0065)
0.1833***
(0.0064)
0.1835***
(0.0064)
0.1849***
(0.0066)
0.1839***
(0.0065)
89.6458
93.5224
92.7961
93.7981
96.5846
506
506
506
506
506
10.8656
(4.3634)
10.8656
(4.3634)
28.3636
(9.0854)
28.3636
(9.0854)
30
30
50
50
η
ρ
2.0059
0.1490
Estimated log Likelihood
No. of Observations
Number of “Neighbors”
Cutoff Distance (kilometers)
Variables
Estimates
Distribution of Median Household Income
µ
ω
3.3971
0.7742
Note: Regression (1) is ordinary Tobit regression without social interactions. Regressions (2) and (4) correspond to
the Tobit model when all characteristics are public information. Municipal median household income is assumed to
be self-known for municipalities in Regressions (3) and (5). Two municipalities are viewed as close “neighbors” if the
distance between them is less than 30 kilometers for Regressions (2) and (3), or less than 50 kilometers for Regressions
(4) and (5). Numbers in parentheses are standard deviations. Estimates that are significant at the %10, %5, and %1
levels are marked by “*”, “**”, and “***”, respectively.
195
Appendix C: Appendix to Chapter 4
C.1
Expectations, Equilibria, and Functions
In this appendix, we embed conditional expectation functions into a function space. For
a group with size n, social relations Wn and information structure J, we define a function
Q Q i
p
space, Ξ(Wn , J), such that each ξ ∈ Ξ(Wn , J ) is a mapping from an ni=1 M
m=1 Xi,m to
<M , satisfying
1. For all i = 1, · · · , n, m = 1, · · · , Mi , and xpi,m ∈ Xpi,m
ξ(xp1,1 , · · · , xp1,M1 , · · · , xpn,1 , · · · , xpn,Mn )i,m = ξi,m (xpi,m ).
(C.1.1)
2.
Z
max
max
1≤i≤n 1≤m≤Mi
|ξi,m (xpi,m )|dµp < ∞,
(C.1.2)
where µp represents the L-S measure implied by the conditional distribution of Xip ’s
given public information Z = z.
Define summation and scalar product in Ξ(Wn , J ) in a conventional way. According to
(C.1.2), define the norm on Ξ(Wn , J) as
Z
kξk = max
max
1≤i≤n 1≤m≤Mi
|ξi,m (xpi,m )|dµp .
(C.1.3)
Lemma C.1.1 The norm, k · k, defined in (C.1.3) is well-defined.
Proof. It is obviously that kξk ≥ 0 for any ξ ∈ (Ξ(Wn , J ), k · k) and kξk = 0 if and only if
ξ = 0 a.e. according to µp . For any real scalar α and ξ ∈ (Ξ(Wn , J ), k · k),
Z
Z
p
kαξk = max max
|αξi,m (xi,m )|dµp = |α| max max
|ξi,m (xpi,m )|dµp = |α|kξk.
1≤i≤n 1≤m≤Mi
1≤i≤n 1≤m≤Mi
196
0
For any two elements, ξ, ξ ∈ (Ξ(Wn , J), k · k),
Z
0
0
kξ + ξ k = max max
|ξi,m (xpi,m ) + ξi,m (xpi,m )|dµp
1≤i≤n 1≤m≤Mi
Z
Z
0
p
≤ max max
|ξi,m (xi,m )|dµp + max max
|ξi,m (xpi,m )|dµp
1≤i≤n 1≤m≤Mi
1≤i≤n 1≤m≤Mi
0
= kξk + kξ k.
L1 (Xpi,m , Bi,m , µp ; <1 ) is the space of all real-valued functions on Xpi,m which are meaR
R
surable under µp such that kχk1 := Xp
|χ|dµp < ∞ for all χ ∈ L1 (Xpi,m , Bi,m , µp ; <1 ).
i,m
This space belongs to the class of Lebesgue spaces. According to Dunford and Schwartz
(1958), it is a Banach space. Since ξ is a vector-valued function composing of a finite
number of coordinate functions, with the norm defined as the maximal of the absolute integrable of its coordinate functions, ξ ∈ (Ξ(Wn , J ), k · k) if and only if each of its coordinates,
ξi,e ∈ L1 (Xpi,m , Bi,m , µp ; <1 ). Based on this finding, we derive the following result.
Proposition C.1.1 (Ξ(Wn , J ), k · k) is complete. So it is a Banach space.
Proof. Take any Cauchy sequence, ξ k in (Ξ(Wn , J), k · k). That is,
R k
l (xp )|dµ → 0, as k, l → ∞. Then
kξ k − ξ l k = max1≤i≤n max1≤m≤Mi |ξi,m
(xpi,m ) − ξi,m
p
i,m
R k
l (xp )|dµ → 0 as k, l → ∞ uniformly for all i = 1, · · · , n and m =
|ξi,m (xpi,m ) − ξi,m
p
i,m
1, · · · , Mi in the L1 norm. For any η > 0, for each (i, m), the completeness of
L1 (Xpi,m , Bi,m , µp ; <1 ), implied that there is ξi,m ∈ L1 (Xpi,m , Bi,m , µp ; <1 ) such that there
R
p
k (xp ) − ξ
is Ki,m (η) >, whenever k > Ki,m (η), Xp |ξi,m
i,m (xi,m )|dµp < η. Since the
i,m
i,m
total number of coordinate function is finite, define ξ = (ξi,m )1≤i≤n,1≤m≤Mi . For K =
R
k (xp )−
maxi,m Ki,m (η), whenever k > K, we have that kξ k −ξk = max1≤i≤n max1≤m≤Mi Xp |ξi,m
i,m
i,m
ξi,m (xpi,m )|dµp
L1 (Xpi,m , Bi,m , µp ; <1 )
< η. Moreover, since ξi,m ∈
for all (i, m),
R
max1≤i≤n max1≤m≤Mi Xp |ξi,m | < ∞. That is, ξ ∈ (Ξ(Wn , J ), k · k).
i,m
Proposition 4.3.1 claims the equivalence between the BNEs and the equilibrium conditional expectations. This proposition is proved below.
197
Proof of Proposition 4.3.1. If se is a BNE, define ξ e by
e
(xpi,m ) = E[se (XJpi , i )|X pe
ξ e (xp1,1 , · · · , xp1,M1 , · · · , xpn,1 , · · · , xpn,Mn )i,m = ξi,m
Ji,m
= xpi,m , z],
for any i, m, and xpi,m ∈ Xpi,m . It follows from (4.3.1) that
e
ξi,m
(xpi,m ) =E[hi (u(Xi ) + λ
X
=E[hi (u(Xi ) + λ
X
=E[hi (u(Xi ) + λ
X
j6=i
j6=i
j6=i
Wn,ij E[sej (XJpj , j )|XJpi , z, i ] − i )|X pe
Ji,m
Wn,ij E[sej (XJpj , j )|XJpi , z] − i )|X pe
Ji,m
p
e
(Xj,m
) − i )|X pe
Wn,ij ξj,m
j (i)
j (i)
Ji,m
= xpi,m , z]
= xpi,m , z]
= xpi,m , z].
In the above equation, the second equality comes from the independence among i ’s, and
the independence between the idiosyncratic shocks and exogenous covariates and social
relations. The third equality follows directly from the way in which ξ e is defined. Note
p
that Xj,m
= XJpi for any j with Wn,ij 6= 0. On the contrary, assume that ξ e is an
j (i)
equilibrium conditional expectation function, i.e., (4.3.6) and (4.3.7) are satisfied. Define
P
p
e
(Xj,m
) − i ). Then we have that
se by sei (XJpi , i ) = hi (u(Xi ) + λ j6=i Wn,ij ξj,m
j (i)
j (i)
E[Sje (XJpj , j )|XJpi , z, i ] =E[hj (u(Xj ) + λ
X
=E[hj (u(Xj ) + λ
X
p
e
Wn,jk ξk,m
(Xk,m
) − j )|XJpi , z, i ]
k (j)
k (j)
k6=j
p
e
Wn,jk ξk,m
(Xk,m
) − j )|XJpi , z]
k (j)
j (k)
k6=j
p
e
=ξj,m
(Xj,m
),
j (i)
j (i)
p
where Xj,m
corresponds to XJpi . The second equality follows from the independence
j (i)
among all i ’s and the independence between those shocks and exogenous characteristics.
The third equality is derived by applying (4.3.7). Therefore, by the above definition of se ,
sei (XJpi , i ) = hi (u(Xi ) + λ
X
Wi,j E[Sje (XJpj , j )|XJpi , z, i ] − i ).
j6=i
198
C.2
Proofs for Equilibrium Characterizations with Public Characteristics
In this section, we discuss in detail the structure of the set of equilibria when all exogenous covariates are public information. We first prove that there is no loss of generality
by focusing on regular groups. We impose Assumption C.2.1 in order to apply theorems in
differential topology.
Assumption C.2.1 The functions, u(·) and Hi (·), for i = 1, · · · , n, are smooth. That is,
they have continuous partial derivatives of all orders.
In most models used in empirical and theoretical studies, u(·) is linear function of
exogenous covariates. Although hi (·) can be discrete, Hi (·) defined in Assumption 4.3.2
is usually smooth as our examples show. In the subsequent discussions, we focus on the
following case.
Definition C.2.1 For a group (X, Wn ), an equilibrium ξ e is regular if the derivative of
S(·; X, Wn ) at ξ e , DS(ξ e ; X, Wn ) is non-singular. A group (X, Wn ) is regular if each of
its equilibrium is regular. That is, DS(ξ e ; X, Wn ) is non-singular for any ξ e such that
S(ξ e ; X, Wn ) = 0.
When a group if regular, applying the Inverse Function Theorem, in a neighborhood of
an equilibrium, ξ e , there is no other equilibria. That is to say, all equilibria are isolated
from each other. That property is important for the following discussion. We show in the
proposition below that there is no loss of generality by focusing on regular groups. Denote
the support of X by X. For any social matrix Wn , define function Se : <n × X → <n by
e X; Wn ) = S(ξ; X, Wn ). That is, given Wn , S(·; X, Wn ) can be viewed as a family of
S(ξ,
smooth maps, indexed by the exogenous covariates, X. By Assumption C.2.1, Se is smooth.
Proposition C.2.1 follows from the transversality theorem in differential topology.
e X; Wn ) has full row rank for all
Proposition C.2.1 Given social matrix Wn , if DS(ξ,
e e ; X, Wn ) = 0, then for almost every X ∈ X, DS(ξ e ; X, Wn ) is non-singular.
(ξ, X) with S(ξ
That is, almost all groups are regular.
199
Proof. The result follows from the transversality theorem in the context book by Guillemin
e e , X; Wn ) has full row-rank, 0 is a regular value for S(·,
e X; Wn ).
and Pollack (1974). If DS(ξ
e X; Wn ) is transversal to {0}. From the transversality
Since {0} is a singleton in <n , S(·,
theorem, S(·; X, Wn ) is transveral to {0} for almost every X.
The following corollary shows that we can apply Proposition C.2.1 for a large class of
models.
Corollary C.2.1 Suppose that u(·) is linear in exogenous covariates, i.e., u(Xi ) = β0,0 +
0
0
e X; Wn ) has full
X g β0,1 + Xic β1 . If β1 6= 0 and dHi (a)/da 6= 0 for any i and x ∈ <, DS(ξ,
row rank for all Wn . Then almost all groups are regular.
Proof of Corollary C.2.1. Without loss of generality, suppose that β1,1 6= 0, we have
e
that DS(ξ, X; Wn ) = λDH − In β1,1 DH ∗ , where
DH = diag(dH1 (e
x1 )/dx, · · · , dHn (e
xn )/dx) is a diagonal matrix whose diagonal is comP
posed of the derivatives of Hi ’s evaluated at x
ei = u(Xi ) + λ j6=i Wi,j ψj for all i. By the
e X; Wn ) are linearly
assumption, none of dHi (e
xi )/dx is zero. Therefore, the rows for DS(ξ,
independent. So it has full row rank.
Although the function S(·; X, Wn ) is defined on the whole space, <n , we usually begin
searching for an equilibrium in a region. According to Guillemin and Pollack (1974), some
properties of the set of solutions for S(ξ; X, Wn ) = 0 inside a region can be derived by
analyzing the properties of a function on the boundary of that region. Denote the closed
ball in <n with radius r > 0 centered at the origin by B[0, r] = {x ∈ <n : kxkE ≤ r}, where
k·kE is the Euclidean norm. Its interior is the open ball, B(0, r) = {x ∈ <n : kxkE < r}. Its
boundary, ∂B[0, r] = {x ∈ <n : kxkE = r}, is a sphere, centering at the origin with radium
r > 0. In particular, we call the ball a unit sphere if its radius is 1. It is standard to
denote a unit sphere in <n as S n−1 . If S(ξ; X, Wn ) 6= 0 for any ξ ∈ ∂B[0, r], we can define
b X, Wn ), on ∂B[0, r], as
a function, S(·;
b X, Wn )i =
S(ξ;
P
Hi (u(Xi ) + λ j6=i Wn,ij ξj ) − ξi
S(ξ; X, Wn )i
= qP
,
P
n
kS(ξ; X, Wn )kE
2
i=1 (Hi (u(Xi ) + λ
j6=i Wn,ij ξj ) − ξi )
200
(C.2.1)
b X, Wn ) maps points on ∂B[0, r] to a point in
for i = 1, · · · , n. We can see that S(·;
b X, Wn ) to a class of functions via homotopy. We
S n−1 ⊆ <n . We now associate S(·;
analyze properties of this function through deformation.
Definition C.2.2 C and D are smooth manifolds in Euclidean spaces.58 Smooth maps,
e : C ×[0, 1] → D,
R0 : C → D and R1 : C → D are homotopic, if there exists a smooth map, R
e 0) = R0 (c) and R(c,
e 1) = R1 (c), for all c ∈ C, R
e is called a homotopy between
such that R(c,
R0 and R1 .
For any given t ∈ [0, 1], if tHi (u(Xi ) + λ
P
j6=i Wn,ij ψj ) − ψi
6= 0 for all i = 1, · · · , n and
ξ ∈ ∂B[0, r], by setting
tHi (u(Xi ) + λ
Rt (ξ; X, Wn )i = qP
n
P
j6=i Wn,ij ξj )
i=1 (tHi (u(Xi ) + λ
− ξi
,
(C.2.2)
2
j6=i Wn,ij ξj ) − ξi )
P
we can derive a mapping from ∂B[0, r] to S n−1 . If that is possible for all t ∈ [0, 1], we get
e ·; X, Wn ) : ∂B[0, r] × [0, 1] → S 1 such that R(·,
e ·; X, Wn ) = Rt (ξ; X, Wn ).
a homotopy, R(·,
b X, Wn ) is homotopic to the function R0 (·; X, Wn ) : ∂B[0, r] → S n−1 :
In that case, S(·;
−ξi
−ξi
R0 (ξ; X, Wn )i = pPn
=
,
2
r
i=1 (−ξi )
(C.2.3)
for all i = 1, · · · , n. The simplicity of that function makes it convenient to derive properties
b X, Wn ).
which are invariant to smooth changes in a homotopy and can be applied to S(·;
To make sure that this homotopy is well-defined, we impose Assumption 4.4.1 about the
function, T , as in (4.4.4). Figure C.14 is a graphic illustration of the homotopy constructed
for the H(·) function:
H(ξ)i =
exp(β(h + Jξj )) − exp(−(β(h + Jξj )))
,
exp(β(h + Jξj )) + exp(−(β(h + Jξj )))
(C.2.4)
for i, j = 1, 2 and i 6= j, which corresponds to the Binary choice model analyzed by Brock
and Durlauf(2007) without imposing rational expectations, ξ1 = ξ2 . We trace out the
images of Rt (ξ; X, Wn ) for a point in the circle centerer at the original with a radius of 3,
when t takes various values and depict them with the same color and marks. We do that
58
Intuitively speaking, a manifold is a subset of an Euclidean space which looks like an open subset of an
Euclidean space locally. Any open subsets of an Euclidean space is, of course a manifold. See Guillemin and
Pollack (1974) for a rigorous definition.
201
for four points, (r, 0), (0, r), (−r, 0), and (−r, −r). For each of those points, their images
in such a homotopy form an interval in the unit circle in a smooth way. We also depict
the image of the original function, S(·; X, Wn ), at a(1, 0), a(0, 1), a(−1, 0), and a(−1, −1),
when a runs from 0 to 1. Utlitizing the homotopy (C.2.2), we derive some properties of the
set of equilibria, which are summarized in Proposition 4.4.1. It is proved in detail below.
Proof of Proposition 4.4.1.
The proof is composed of two parts. We first show the
existence of an equilibrium in the interior, B(0, r) and then prove finiteness. (1) Under
Assumption 4.4.1, when the positive number r is sufficiently large, for any ξ with kξkE = r,
we have that
kS(ξ; X, Wn )kE = kT (ξ; X, Wn ) − ξkE ≥ |kT (ξ; X, Wn )kE − kξkE | > 0.
b X, Wn ) in (C.2.1) is well-defined on ∂B[0, r].
Thus, S(ξ; X, Wn ) 6= 0 on ∂B[0, r] and S(·;
Similarly, for any t ∈ [0, 1], ktT (ξ; X, Wn ) − ξkE ≥ |ktT (ξ; X, Wn )kE − kξkE | > 0. Thus,
P
tHi (u(Xi ) + λ j6=i Wn,ij ψj ) − ψi 6= 0, for all t ∈ [0, 1], i = 1, · · · , n, and all ξ ∈ ∂B[0, r].
e ·; X, Wn ) : ∂B[0, r] × [0, 1] → S 1 such that
Therefore, we can define a homotopy, R(·,
e ·; X, Wn ) = Rt (ξ; X, Wn ), which is defined in (C.2.2). R0 (ξ; X, Wn ) is a linear transforR(·,
b X, Wn ) :
mation from ∂B[0, r] to S n−1 with degree (−1)n 6= 0. Therefore, the degree of S(·;
∂B[0, r] → S n−1 is equal to (−1)n 6= 0. (2) If there is no solution to S(ξ; X, Wn ) = 0 in the
b X, Wn ) can be extended to the whole closed ball, B[0, r]. According
interior, B(0, r), S(·;
b X, Wn ) on ∂B[0, r] is equal to zero.
to Guillemin and Pollack (1974), the degree of S(·;
That is a contradiction. Therefore, there must be at least one point ξ e ∈ IntB[0, r] such
that S(ξ e ; X, Wn ) = 0. (3) For any group (X, Wn ), the set of equilibria, E(X, Wn ) is the
zeros for the continuous function, S(·; X, Wn ). Therefore, it is closed. As a closed subset
of the closed ball B[0, r], E(X, Wn ) is compact. Given regularity, for any ξ e ∈ B[0, r],
det(DS(ξ; X, Wn ) 6= 0. By the Inverse Function theorem, S(·; X, Wn ) is a local diffeomorphism around ξ. Thus, there is an open neighborhood, O(ξ e ) ∈ B[0, r] for ξ e such that
O(ξ e ) ∩ E(X, Wn ) = {ξ e }. Then the relative open sets, {O(ξ e ) ∩ E(X, Wn )} is an open
cover for E(X, Wn ). By compactness, there is a finite subcover. That is, there is an integer,
202
e
e
e
K, such that E(X, Wn ) ⊆ ∪K
k=1 O(ξk ) ∩ E(X, Wn ). Because each O(ξk ) ∩ E(X, Wn ) = {ξk }
e .
is a singleton, E(X, Wn ) contains just a finite number of points, ξ1e , · · · , ξK
Proof of Proposition 4.4.2. Pick B[0, r] according to Proposition 4.4.1, define a function
b ·; X, Wn ) : B(0, r) × [0, 1] → <n , such that for all t ∈ [0, 1], i = 1, · · · , n, and ξ ∈ <n ,
R(·,
b t; X, Wn )i = tHi (u(Xi ) + λ
R(ξ,
X
Wn,ij ξj ) − ξi .
j6=i
We do not lose any zeros of S(·; X, Wn ), for there is no zeros for this homotopy on the boundb1 (ξ; X, Wn ) = R(ξ,
b 1; X, Wn ) is the restriction of S(·; X, Wn ) in
ary ∂B[0, r]. We can see R
b0 (ξ; X, Wn ) = R(ξ,
b 0; X, Wn ) is just a linear transformation with R
b0 (ξ; X, Wn )i =
B(0, r). R
The restriction, S(·; X, Wn ) : B(0, r) → <n is homotopic
−ψi for all i = 1, · · · , n.
b0 (·; X, Wn ). By the Sard’s theorem, pick a point b ∈ <n , such that b is a reguto R
b Since R
b−1 (b) is a closed set in B(0, r) ⊆ B[0, r], it is compact. Due
lar value of R.
to transversality, Fb−1 (b) is a one-dimension compact submanifold in B(0, r). Therefore,
b−1 (b) is zero. Since the boundary,
the sum of the orientation numbers at points in ∂ R
b is equal to R
b0 on B(0, r) × {0} and
∂(B(0, r) × [0, 1]) = (B(0, r) × {0}) ∪ (B(0, r) × {1}, ∂ R
b1 on B(0, r) × {1}. Therefore, the intersection numbers of those three functions at {b}
R
satisfy the following relation:
b {b}) = I(R
b1 , {b}) − I(R
b0 , {b}).
I(∂ R,
b0 (·; X, Wn ) and R
b1 (·; X, Wn ), have the same intersection
That is, the two homotopic maps, R
numbers at {b}. According to Guillemin and Pollack (1974), since <n is connected and has
the same dimension with B(0, r) ∈ <n , the intersection number is invariant with the point
picked and is defined as the degree of a function. Therefore, choose point {0}, we get that
b1 ) = I(R
b1 , {0}) = I(R
b0 , {0}) = deg(R
b0 ).
deg(R
b−1 ({0}) = {0} and det(DR
b0 ) = (−1)n , we get that the degree of R
b0 is also (−1)n ,
Since R
0
which is equal to +1 when n is even and is equal to −1 when n is odd. Since 0 is a
b1 ) = I(R
b1 , {0})
b1 which is the restriction of S(·; X, Wn ) on B(0, r), deg(R
regular value for R
is actually the sum of orientation numbers for points in S −1 (·; X, Wn ), which are model
203
equilibrium expectations. Because the orientation numbers of those points are by definition
either +1 or −1, if their sum is either +1 or −1, there must be an odd number of such points.
In that case, the equilibria with orientation number +1 (−1) outnumbers the equilibria with
orientation number −1 by exactly 1. In addition, if the sign of det(DS(·; X, Wn )) does not
change in B(0, r), all the equilibria will have the same orientation numbers, either +1 or
−1. If there are more than one equilibria, the absolute value of their sum will be bigger
b1 (·; X, Wn ) = (−1)n . Therefore, in that case, there is
than 1, which contradicts with deg(R
a unique equilibrium.
Proof of Lemma 4.4.1. By calculation,

dH1 (a1 )
 da



DS(ξ; X, Wn ) = λ∆H W − In = λ 



where ai = u(Xi ) + λ
P
e
j6=i Wn,ij ξj
0
···
0
0
..
.
dH2 (a2 )
da
..
.
···
..
.
0
..
.
0
0
···
dHn (an )
da





 W − In ,



for i = 1, · · · , n, and In is the n-dimension identity
matrix. DS(ξ; X, Wn ) is an n × n matrix with all diagonal elements equal to −1, i.e.,
DS(ξ; X, Wn )i,i = −1. All of its off-diagonal elements are equal to their counterparts
in λ∆H W , i.e, DS(ξ; X, Wn )i,j = (λ∆H W )i,j . By the Gershgorin circle theorem, every
P
eigenvalue of DS(ξ; X, Wn ) lies within one of the discs, B[−1, |λ||dHi (ai )/da| j6=i Wi,j ],
for i = 1, · · · , n.59 Those circles are all centered at −1. By (4.3.8), all of their radii are
strictly less than 1. Therefore, every real eigenvalues of DS(ξ; X, Wn ) is strictly negative.
Since the trace of DS(ξ; X, Wn ) is real, if τ is one of DS(ξ; X, Wn )’s eigenvalues, so be its
conjugate. But their product is τ τ > 0. Let 2k denote the number of complex eigenvalues.
The sign of the product of all eigenvalues, which is equal to the sign of the determinant, is
equal to (−1)n−2k = (−1)n . Therefore, sgn(det(DS(ξ; X, Wn ))) = (−1)n for all ξ ∈ <n .
59
P
Let A = (aij ) be an n × n matrix. Denote by Ri =
j6=i |ai,j | the sum of absolute values of offdiagonal elements in the i-th row. Denote the closed disc centered at aii with a radius Ri by B[aii , Ri ].
By the Gershgorin circle theorem, every eigenvalue of A lies within one of those closed discs, B[aii , Ri ], for
i = 1, · · · , n. A brief explanation and proof for this theorem can be found at http://en.wikipedia.org.
204
C.3
Proofs for Identification with Public Characteristics
Proof of Proposition 4.4.3. With public information on all exogenous covariates, simply
denote (X g , X c ) = X. By calculation, E[Y |X] = E[(In − λW n )−1 Xβ|X]. Therefore, if
e λ,
e σ
(β, λ, σ) and (β,
e) are observationally equivalent,
e n )−1 X β|X]
e
E[(In − λW n )−1 Xβ − (In − λW
= 0.
for any X in its support. Multiply both sides by the non-random matrix, (In − λW n )(In −
e n ). Notice that (In − λW n )(In − λW
e n ) = (In − λW
e n )(In − λW n ), we have that
λW
e n )Xβ − (In − λW n )X β|X]
e
E[(In − λW
= 0.
Denote by ln the n × 1 vector of 1’s,
0 i
h
0
0
0
0
e 0,0 β − βe λβe − λβ
e
|X = 0,
E ln W n ln X c W n X c
β0,0 − βe0,0 λβe0,0 − λβ
1
1
1
1
which is equivalent to
0 i
h
c
c
c
c
E ln W n ln X W n X
ln W n ln X W n X |X
0
0
0
0
0
e
e
e
e
e
e
β0,0 − β0,0 λβ0,0 − λβ0,0 β1 − β1 λβ1 − λβ1 = 0.
Taking expectations over X, we get that
0 i
h
E ln W n ln X c W n X c
ln W n ln X c W n X c
0
0
0
0
0
e
e
e
e
e
e
β0,0 − β0,0 λβ0,0 − λβ0,0 β1 − β1 λβ1 − λβ1 = 0.
h
0 Xc
Xc
Xc
Xc
i
is
Under assumption (4.4.12), E ln W n ln
Wn
ln W n ln
Wn
0
e 0,0 β 0 − βe0 λβe0 − λβ
e 0 implies that β =
positive definite. Then β0,0 − βe0,0 λβe0,0 − λβ
1
1
1
1
e If W n is row-normalized, W n ln = ln . Then observationally equivalence implies
βe and λ = λ.
that
0 i
h
c
c
c
c
E ln X W n X
ln X W n X
0
0
0
0
0
e 0,0 β − βe λβe − λβ
e
= 0.
β0,0 − βe0,0 + λβe0,0 − λβ
1
1
1
1
205
e If W n is row-normalized, W n ln = ln .
If (4.4.120 ) holds, we will also have β = βe and λ = λ.
0
Because E[(Y − E[Y |X])(Y − E[Y |X]) |X] = σ 2 In . We can identify σ through conditional
variance of yi ’s given X.
Proof of Lemma 4.4.2.
Without loss of generality, suppose that β1,1 > 0. For ω−i ∈
{0, 1}n−1 , choose X (ω)c as
c
X (ω)c = X c ∈ <nL : Xj,1
(2ωj − 1) ≥ 0, j 6= i .
c | → ∞ in X c (ω) ⊆ <nL , with the restriction,
We can see that for any j 6= i, as |Xj,1
c (2ω − 1) ≥ 0, u(X ) goes to +∞ when ω is 1; and u(X ) goes to −∞ when ω is 0.
Xj,1
j
j
j
j
j
c
c
c |→∞,j6=i,X c ∈X c (ω) P (y−i = ω|X ) = 1. Since all X ’s have full support,
Therefore, lim|Xj,1
i
P (X c ∈ X c (ω)) > 0.
Proof of Proposition 4.4.4. By Lemma 4.4.2,
lim
c |→∞,j6=i,X c ∈X c (ω )
|Xj,1
0 −i
=
lim
c |→∞,j6=i,X c ∈X c (ω )
|Xj,1
0 −i
0
X
c0
X
P (β0,0 + Xic β1 + λ
W n,ij ψj − i > 0|X c )
j6=i
P (β0,0 + Xi β1 + λ
W n,ij ω0,j − i > 0|X c ).
j6=i
Therefore,
lim
log P (yi , ω0 |X c )
lim
yi log F (β0,0 + Xic β1 + λ
c |→∞,j6=i,X c ∈X c (ω )
|Xj,1
0
=
c |→∞,j6=i,X c ∈X c (ω )
|Xj,1
0
0
X
W n,ij ω0,j )
j6=i
c0
+ (1 − yi ) log(1 − F (β0,0 + Xi β1 + λ
X
W n,ij ω0,j )).
j6=i
c | ≥ D, for all i 6= j,
Under condition (4.4.13), when X c ∈ X c (ω0 ) and |Xj,1
0
c | ≥ D, i 6= j]
E[(∂ log P (yi , ω0 |X c )/∂θ)(∂ log P (yi , ω0 |X c )/∂θ) |X c ∈ X c (ω0 ), |Xj,1
0
0
is positive definite. From Rothenberg(1971), θ = (β , λ) can be identified.
Proof of Lemma 4.4.3. Similar to the proof of Lemma4.4.2, choose
c
X1c = X c ∈ <nL : Xj,1
≥ 0, 1 ≤ j ≤ n .
0
c k → ∞, X c β
c
In this set, as kXi,1
E
i,1 1,1 → +∞. As u(Xi ) = β0,0 + Xi β1 → +∞, in the limit,
none outcomes are censored.
206
Proof of Proposition 4.4.5.
c | → ∞ in
It follows from Lemma4.4.3 that in X1c , as |Xi,1
this set, no choices are censored. Therefore, for the distribution of observed outcomes we
have that
lim
c |→∞,1≤j≤n,X c ∈X c
|Xj,1
1
f (Y |X c ) =
lim
c |→∞,1≤j≤n,X c ∈X c
|Xj,1
1
f (Y ∗ |X c ).
That is, in the limit, the observed outcomes are the latent variables which are associated
with each other just as continuous choices in linear models. Since E[|yi ||X c ] = Hi (u(Xi ) +
P
λ j6=i W n,ij ξje ) < ∞ and E[|yi∗ ||X c ] = E[yi |X c ], by the Lebesgue Control convergence
theorem,
lim
c |→∞,1≤j≤n,X c ∈X c
|Xj,1
1
E[Y |X c ] =
=
lim
E[Y ∗ |X c ]
lim
(In − λW n )−1 (β0,0 ln + X c β1 ).
c |→∞,1≤j≤n,X c ∈X c
|Xj,1
1
c |→∞,1≤j≤n,X c ∈X c
|Xj,1
1
e λ,
e σ
Thus, if (β, λ, σ) and (β,
e) are observationally equivalent,
h
lim
E ln W n ln X c W n X c
c
c
c
|Xj,1 |→∞,1≤j≤n,X ∈X1
0
0
0
0
0
e 0,0 β − βe λβe − λβ
e
β0,0 − βe0,0 λβe0,0 − λβ
1
1
1
1
i
|X c = 0,
for all X c ∈ X1c . Similar to the proof of Proposition 4.4.3, we derive that
0 i
h
c
c
c
c
E ln W n ln X W n X
lim
ln W n ln X W n X
c
c
c
|Xj,1 |→∞,1≤j≤n,X ∈X1
e 0,0 β 0 − βe0 λβe0 − λβ
e 0
β0,0 − βe0,0 λβe0,0 − λβ
1
1
1
1
0
= 0.
Under (4.4.15), there is a positive-measure subset of covariates such that
0 i
h
c
c
c
c
E ln W n ln X W n X
ln W n ln X W n X
e λ).
e Identification of σ follows from Yang, Qu,
is positive definite. Therefore, (β, λ) = (β,
and Lee (2014).
Proof of Proposition 4.4.6.
For a group (X, W n ), the sample log likelihood can be
written as
P
log ξe ρ(α, ξ e )f (Y |ξ e ).
207
By calculation, we get that
∂ log L(Y |X c , W n ) ∂ log L(Y |X c , W n ) c
|X ]
∂α
∂α0
1
1
=E[( P
)4 ( P
)2
0
e ; E(X, W ), α)f (Y |ξ e )
e
ρ(ξ
n
ξe
ξ e exp(α γ(ξ ; X, W n ))
E[
1
(P
exp(α
ξe
0
0
γ(ξ e ; X, W
n )f (Y
|ξ e ))
)2 Γ (X c , W n ; β, σ, λ)(D(X, W n ))2 Γ(X c , W n ; β, σ, λ)|X c ].
When there are M equilibria, D(X, W n ) is a M × M diagonal matrix. Its (m, m) element is
0
P
0
e ;X,W ))
exp(α γ(ξ e ;X,W n ))f (y|ξ e )
n
− Pexp(α γ(ξ
.
0
0
e ;X,W ))f (y|ξ
e)
e
e
exp(α
γ(
ξ
exp(α
γ(
ξee ;X,W n ))
n
ξee
ξee
We can see that as long as there are multi0
ple equilibria, D(X, W n ) is positive definite. If E[Γ (X c , W n ; β, σ, λ)Γ(X c , W n ; β, σ, λ)|X c ]
has full column rank, so does E[ ∂ log L(Y∂α|X
c ,W
n)
∂ log L(Y |X c ,W n )
|X c ].
∂α0
From Rothenberg(1971),
α can be identified.
C.4
Equilibrium with Privately Known Characteristics
In this section, we discuss the existence and property of equilibria when some exogenous
characteristics are private information. The case in Section 4.5 is one special case. In the
Banach space, (Ξ(Wn , J ), k · k). define an operator, T : Ξ(Wn , J ) → Ξ(Wn , J ), such that
for all i = 1, · · · , n and m = 1, · · · , Mi ,
T (ξ)(xpJe , · · · , xpJe
1,1
1,M1
, · · · , xpJe , · · · , xpJe
n,1
n,Mn
)i,m = E[Hi (u(Xi ) + λ
X
Wn,ij ξj,Ji (XJpi ))|xpJe
, z].
i,m
j6=i
(C.4.1)
An equilibrium conditional expectation function, ξ e , corresponds to one of T ’s fixed
points. To apply the Schauder fixed point theorem (Proposition C.4.4), we need to capture
a compact set in the function space, (Ξ(Wn , J ), k·k). Analogous to conventional discussions
on Banach spaces, we focus on a weaker condition, relative compactness.
Definition C.4.1 A set in a metric space is relatively compact if its closure is compact.
Simon (1987) introduces a property about relative compact sets, which is used for our
proof. it is cited as the lemma below.
208
Lemma C.4.1 A set V is a normed space U is relatively compact if and only if for any
η > 0, there are a finite subset {v1 , · · · , vL } ⊆ V such that for any v ∈ V , there is vi for
some i = 1, · · · , n with kv − vi kU < η.
Thus, as long as we find a set which is relatively compact, all of its own points and its
limit points form a new set which is compact. Because ξ ∈ (Ξ(Wn , J ), k·k) if and only if each
of its coordinate functions, ξi,m is a function in the Lebesgue space, L1 (Xpi,m , Bi,m , µp ; <1 ),
we may utilize the properties of a relatively compact set in L1 (Xpi,m , Bi,m , µp ; <1 ). That is
possible due to the following lemma.
Lemma C.4.2 Γ0 is a relatively compact subset of (Ξ(Wn , J ), k · k) if and only if
n
o
Q
Γ0,i,m = ξi,m : Xpi,m → <1 : (ξi,m , ξ−im ) ∈ Γ0 for some ξ−im : (i0 ,m0 )6=(i,m) Xpi0 ,m0 → <M −1
is relatively compact in L1 (Xi,m , Bi,m , µp ; <1 ) for all i = 1, · · · , n, m = 1, · · · , Mi .
Proof. Suppose that Γ0,i,m is relatively compact in L1 (Xpi,m , Bi,m , µp ; <1 ) for each (i, m).
By Lemma C.4.1, for an arbitrarily chosen η > 0, for any (i, m), there is (ξi,m,1 , · · · , ξi,m,Li,m )
in Γ0,i,m , such that for any ξi,m ∈ Γ0,i,m , there is ξi,m,l for some 1 ≤ l ≤ Li,m with
kξi − ξi,l k1 < η. Construct a finite subset of Γ0 as
Γ0b = {(ξ1,1,l1 , · · · , ξn,mn ,ln ) : 1 ≤ li ≤ Li,m for all i = 1, · · · , n, m = 1, · · · , Mi }.
Then for any ξ = (ξ1,1 , · · · , ξn,Mn ) ∈ Γ0 , for each (i, m), pick li,m such that kξi,m −
R
ξi,m,li,m k1 = |ξi,m − ξi,m,li,m |dµp < η. Denoting ξ b = (ξ1,1,l1,1 , · · · , ξn,Mn ,ln,Mn ) ∈ Γ0b ,
we have that kξ − ξ b k = max1≤i≤n max1≤m≤Mi kξi,m − ξi,m,li,m k1 < η. Therefore, Γ0 is
relatively compact in (Ξ(Wn , J ), k · k). On the contrary, suppose that Γ0 is relatively
compact in (Ξ(Wn , J ), k · k). If for some (i0 , m0 ), Γ0,i0 ,m0 is not relatively compact in
L1 (Xpi0 ,m0 , Bi0 ,m0 , µp ; <1 ), there is η0 > 0, such that for any finite subset of Γ0,i0 ,m0 ,
n
o
ξi0 ,m0 ,1 , · · · , ξi0 ,m0 ,Li0 ,m0 , there is ξi∗0 ,m0 ∈ Γ0,i0 ,m0 with kξi∗0 ,m0 − ξi0 ,m0 ,l k1 > η for all 1 ≤
o
n
l ≤ Li0 ,m0 . For any finite subset of Γ0 , ξ 1 , · · · , ξ L , ξi10 ,m0 , · · · , ξiL0 ,m0 is a finite subset of
Γ0,i0 ,m0 . Take ξ ∗ = (ξi∗0 ,m0 , ξ −i0 m0 ) ∈ Γ0 . We have that kξ ∗ − ξ l k ≥ kξi∗0 ,m0 − ξil0 ,m0 k1 > η0 ,
contradicting with Γ0 being relatively compact. Therefore, each Γ0,i,m is relatively compact
in L1 (Xpi,m , Bi,m , µp ; <1 ) for all (i, m).
209
Owing to Lemma C.4.2, to characterize relatively compact sets in (Ξ(Wn , J ), k · k), we
need to capture the relatively compact sets in the Lebesgue space L1 (XJei,m , Σi,m , µp ; <1 ).
There are some characterizations for Lebesgue spaces of functions whose domains are general measurable spaces and ranges are Banach spaces, such as the results by Brooks and
Dinculeanu(1979) and more recently, the Diaz-Mayoral Theorem (See van Neerven(2014) for
an elementary proof). Here, we apply the classical results by Dunford and Schwartz(1958).
If Ω = <1 , BR is the Borel σ-algebra, µ is the Lebesgue measure, m, and Y is a Banach
space with norm k·kY . Dunford and Schwartz (1958) have a characterization for a relatively
compact set in the Lebesgue space, Lq (<1 , BR , m; Y), the space consisting of mappings from
<1 to Y, integrable under m, with the Lq norm.
Proposition C.4.1 For 1 ≤ q < ∞, Υ0 ⊆ Lq (<1 , BR , m; Y) is relatively compact if and
only if:
1. It is bounded, i.e., supχ∈Υ0 (
2.
R +∞
−∞
R +∞
−∞
kχ(t)kqY dt)1/q < U for some U > 0;
kχ(t + s) − χ(t)kqY dt → 0 as s → 0 uniformly for all χ ∈ Υ0 ; and
R +∞ R −r
3. ( r + −∞ )kχ(t)kqY dt → 0 as r → ∞ uniformly for all χ ∈ Υ0 .
For Ω = [a, b], (1) and (2) and necessary and sufficient conditions for relative compactness.
Proof. See Dunford and Schwartz(1958) Theorem IV.8.20 (pp.298).
The results above can be extended to the n-dimension Euclidean space.
Proposition C.4.2 For 1 ≤ q < ∞, Υ0 ⊆ Lq (<n , B, m; Y) is relatively compact if and
only if:
R
1. It is bounded, i.e., supχ∈Υ0 ( <n kχ(t)kqY dt)1/q < U for some U > 0;
2.
R +∞
−∞
···
R +∞
−∞
kχ(t1 + s1 , · · · , tn + sn ) − χ(t1 , · · · , tn )|qY dt1 · · · dtn → 0,
as s = (s1 , · · · , sn ) → 0 uniformly for all χ ∈ Υ0 ; and
3.
R
<n −Cr
kχ(t)kqY dt → 0 as r → ∞ uniformly for all χ ∈ Υ0 , where the cube
Cr = {t = (t1 , · · · , tn ) ∈ <n : −r ≤ ti ≤ r ∀i = 1, · · · , n}.
210
If Ω =
Qn
i=1 [ai , bi ],
(1) and (2) are necessary and sufficient for relative compactness.
Proof. See Dunford and Schwartz(1958) Theorem IV.8.21 (pp.301).
Although those results are very general, they are about Lebesgue measure. In our
model, particularly, each coordinate function ξi for ξ ∈ Ξ is defined on a subset of the
Euclidean space with the probability measure, µp , which is induced by the distribution of
Xip ’s conditional on the public information Z = z. We show that when there is a pdf for
this conditional distribution, we can apply Proposition C.4.1 and Proposition C.4.2.
Lemma C.4.3 Let Ω be a subset of <n . When µ << m, dµ/dm = f is a strictly positive
on Ω, Υ0 ∈ L1 (Ω, BΩ , µ; Y) is relatively compact if and only if f Υ0 = {f χ : χ ∈ Υ0 } is
relatively compact in L1 (Ω, ΣB , m; B).
Proof. On one hand, if f Υ0 is relatively compact in L1 (Ω, ΣB , m; B), by Lemma C.4.1, for
any η > 0, there are a finite subset, χ1 , · · · , χK in L1 (Ω, BΩ , m; Y), such that for any χ ∈
R
L1 (Ω, BΩ , m; Y), there is χk in that finite set with Ω kχ − χk kY dm < η. χ1 /f, · · · , χK /f
R
is a finite subset of L1 (Ω, BΩ , µ; Y). For any χ
e ∈ Υ0 , f χ
e ∈ f Υ0 . Therefore, Ω kf χ
e−
R
R
χl kY dm = Ω ke
χ − χl /f kY f dm = Ω ke
χ − χl /f kY dµ < η. Therefore, Υ0 is relatively
compact in L1 (Ω, BΩ , µ; Y). On the other hand, if Υ0 is relatively compact, for any η > 0,
1
there is a finite subset χ
e ,··· ,χ
eK such that for any χ
e ∈ Υ0 , there is a function χ
ek in
1
R
that finite subset such that Ω ke
χ−χ
ek kY dµ < η. Note that f χ
e , · · · , fχ
eK is a finite set
in L1 (Ω, BΩ , m; Y). Take χ ∈ L1 (Ω, BΩ , m; Y), χ/f ∈ L1 (Ω, BΩ , µ; Y). Hence, we have
R
R
R
that Ω kχ − f χ
ek kY dm = Ω kχ/f − χ
ek kY f dm = Ω kχ/f − χ
ek kY dµ < η. Therefore, f Υ0 is
relatively compact in L1 (Ω, BΩ , m; Y).
Corollary C.4.1 Suppose that Ω =
Qn
i=1 [ai , bi ]
for −∞ ≤ ai < bi ≤ +∞. µ << m.
dµ/dm = f is strictly positive on Ω. When |ai | < ∞ and |bi | < ∞ for all i = 1, · · · , n,
Υ0 ∈ L1 (Ω, BΩ , µ; Y) is relatively compact if and only if:
1. supχ∈Υ0
R
Ω kχ(t)kB dt
< U for some U > 0;
211
2.
Z
b1
Z
bn
···
a1
|χ(t1 + s1 , · · · , tn + sn )f (t1 + s1 , · · · , tn + sn ) − χ(t1 , · · · , tn )f (t1 , · · · , tn )|dt1 · · · dtn → 0,
an
as s = (s1 , · · · , sn ) → 0, uniformly for all χ ∈ Υ0 .
When |ai | = ∞ for some i or |bj | = ∞ for some j, the necessary and sufficient conditions
for Υ0 ∈ L1 (Ω, BΩ , µ; Y) to be relatively compact include both (1) and (2), as well as (3):
R
ω−Cr kχ(t1 , · · · , tn )kY f (t1 , · · · , tn )dt1 · · · dtn → 0, as r → ∞ uniformly for all χ ∈ Υ0 ,
where the cube Cr = {t = (t1 , · · · , tn ) ∈ Ω : −r ≤ ti ≤ r ∀i = 1, · · · , n}.
Combining Lemma C.4.2 and Corollary C.4.1, we derive a characterization of a relatively
compact subset of (Ξ(Wn , J ), k · k) using properties of functions.
Proposition C.4.3 Suppose that conditional on public information Z = z, the support of
Xip , Xpi , is a cube in <kp (It can be bounded or unbounded), and the joint distribution of
Xip ’s has a pdf fp (·).
1. When all the Xpi ’s are bounded, Γ0 is relatively compact in (Ξ(Wn , J ), k · k), if and
only if,
(a) (uniformly bounded) there is a real number B > 0 such that
R
supξ∈Γ0 kξk = supξ∈Γ0 max1≤i≤n max1≤m≤Mi Xp |ξi,m (x)|fp (x)dx ≤ B;
i,m
(b) max1≤i≤n max1≤m≤Mi
R
Xpi,m
|ξi,m (x + x
e)fp (x + x
e) − ξi,m (x)fp (x)|dx → 0 as x
e→0
uniformly for any ξ ∈ Γ0 .
2. When some of the support is unbounded, the necessary and sufficient conditions for
Γ0 to be relatively compact in (Ξ(Wn , J ), k · k) include (1) and (2) as well as (3):
R
max1≤i≤n max1≤m≤Mi Xp −Cr,i,m |ξi,m (x)|dx → 0 as r → ∞ uniformly for all ξ ∈ Υ0 ,
ni,m
o
where the cube Cr,i,m = x ∈ Xpi,m : |xj | ≤ r ∀j with Ji,m (j) = 1 .
Proof. (a) Suppose that Γ0 is relatively compact in (Ξ(Wn , J ), k · k). Then for each i and
m, the set, Γ0,i,m = {ξi,m : (ξi,m , ξ−im ) ∈ Γ0 for some ξ−im }
212
is relatively compact in L1 (Xpi,m , Bi,m , µp ; <1 ) according to LemmaC.4.2. It follows from
R
Corollary C.4.1 that for any i and m, there is U i,m , such that supξi,m Γ0,i,m Xp |ξi (x)|fp (x)dx <
i,m
U i,m . Therefore,
Z
sup kξk = sup max
max
ξ∈Γ0 1≤i≤n 1≤m≤Mi
ξ∈Γ0
≤ max
Xpı,m
|ξi,m (x)|fp (x)dx
max U i,m
1≤i≤n 1≤m≤Mi
=U ,
where U = max1≤i≤n max1≤m≤Mi U i,m < ∞. Additionally, for any η > 0, there is δi,m > 0,
R
such that Xp |ξ(x + x
ei,m )f (x + x
ei,m − ξ(x)f (x)|dx < η when ke
xi,m kE < δi,m for all ξ ∈ Γ0 .
i,m
Take δ = min 1 ≤ i ≤ n max1≤m≤Mi δi,m > 0. For a vector
0
0
0
0
0
x
e = (e
x1,1 , · · · , x
e1,M1 , · · · , x
en,1 , · · · , x
en,Mn ) , when ke
xkE < δ, ke
xi,m kE < δi,m for all (i, m).
R
ei,m )f (x + x
ei,m ) − ξi,m (x)f (x)|dx < η for all
Then max1≤i≤n max1≤m≤Mi Xp |ξi,m (x + x
i,m
ξ ∈ Γ0 . When some of the Xip ’s have an unbounded support, for any η > 0, there is
R
Ri,m > 0, such that for all (i, m), Xp −Cr,i,m |χ(x)|dx < η for all r > R and all ξi,m . Take
i,m
R = max1≤i≤n max1≤m≤Mi Ri,m < +∞. Then when r > R,
R
max1≤i≤n max1≤m≤Mi Xp −Cr,i,m |ξi,m (x)|dx < η for all ξ ∈ Υ0 .
i,m
(b) One the contrary, if (1) and (2) hold, suppose that there is (i0 , m0 ), such that for any
R
U > 0, there is ξi0 ,m0 ∈ Γ0,i0 ,m0 , such that kξi0 ,m0 k1 = Xp
|ξi0 ,m0 (x)|fp (x)dx > U . Then
i0 ,m0
pick any ξ−i0 m0 ∈ Γ0,−i0 m0 such that ξ = (ξi0 ,m0 , ξ−i0 m0 ) ∈ Γ0 . Then
R
R
kξk = max1≤i≤n max1≤m≤Mi Xp |ξi,m (x)|fp (x)dx ≥ Xp
|ξi0 ,m0 (x)|fp (x)dx > U ,
i,m
i0 ,m0
which is a contradiction. Therefore, every Γ0,i,m is uniformly bounded under the k · k1
norm.
δi0 ,m0
Similarly, suppose that there is some (i0 , m0 ), with some ηi0 ,m0 > 0, for any
Rb
> 0, there is x
ei0 ,m0 with ke
xi0 kE < δi0 ,m0 and ξi0 ,m0 ∈ Γ0,i0 ,m0 , such that a |ξi0 ,m0 (x +
x
ei0 ,m0 )fp (x + x
ei0 ,m0 ) − ξi0 ,m0 (x)f (x)|dx > ηi0 ,m0 . Pick any ξ−i0 m0 ∈ Γ0,−i0 m0 such that
R
ξ = (ξi0 ,m0 , ξ−i0 m0 ) ∈ Γ0 . Then max1≤i≤n max1≤m≤Mi Xp |ξi,m (x + x
ei,m )fp (x + x
ei,m ) −
i,m
R
ξi,m (x)f (x)|dx ≥ Xp
|ξi0 ,m0 (x + x
ei0 ,m0 )fp (x + x
ei0 ,m0 ) − ξi0 ,m0 (x)f (x)|dx > η, which
i0 ,m0
contradicts that Γ0 satisfies (2).
some ηi0 ,m0
Similarly, suppose that there is some (i0 , m0 ), with
R
> 0, for any R > 0, there is r > R, such that Xp
|ξi0 ,m0 (x)|fp (x)dx >
i0 ,m0
213
ηi0 ,m0 . Then max1≤i≤n max1≤m≤Mi
R
Xpi,m −Cr,i,m
|ξi,m (x)|dx ≥
R
Xpi
0 ,m0
|ξi0 ,m0 (x)|fp (x)dx > η,
which is also a contradiction. By Corollary C.4.1, each Γ0,i,m is relatively compact in
L1 (Xpe , Bi,m , µp ; <1 ). It then follows from LemmaC.4.2 that Γ0 is relatively compact in
Ji,m
(Ξ(Wn , J), k · k).
The existence of an equilibrium is established by Schauder fixed point theorem, which
is cited below.
Proposition C.4.4 [Schauder fixed point theorem] Let K be a nonempty, closed, and convex subset of a normed space. Let T be a continuous mapping from K into a compact subset
of K. Then T has a fixed point in K.
In order to apply this theorem to operator T defined in (C.4.1), we impose two assumptions, Assumptions 4.5.2 and 4.5.3.
Lemma C.4.4 Under Assumption 4.5.2, T is continuous. Actually, it is a Lipschitz function.
0
Proof. For any ξ, ξ ∈ (Ξ(Wn , J ), k · k),
0
kT (ξ) − T (ξ )k
Z h
i
X
X
0
= max max
E Hi (u(Xi ) + λ
Wn,i ξj,Ji (xpJi )) − Hi (u(Xi ) + λ
Wn,i ξj,Ji (xpJi ))|xpi,m , z dFp
1≤i≤n 1≤m≤Mi
≤ max
max sup |dHi (c)/dc||λ|
1≤i≤n 1≤m≤Mi
= max
j6=i
c
max sup |dHi (c)/dc||λ|
1≤i≤n 1≤m≤Mi
c
X
j6=i
Z
Wn,ij
h
0
E |ξi,Ji (xpJi ) − ξi,Ji (xpJi )|xpi,Je
i,m
i
, z dFp
6=i
X
Z
Wn,ij
0
|ξi,Ji (xpJi ) − ξi,Ji (xpJi )|dFp
6=i
0
≤ max sup |dHi (c)/dc||λ|kWn k∞ kξ − ξ k.
1≤i≤n
c
That is to say, T is a Lipschitz function. Thus, it is continuous in (Ξ(Wn , J ), k · k).
Lemma C.4.5 Under Assumption 4.5.3, there is r0 > 0, such that there is no equilibria
out of the closed ball, B[0, r0 ] = {ξ ∈ (Ξ(Wn , J ), k · k) : kξk ≤ r0 }. In addition, for any
ξ ∈ B[0, r0 ], T (ξ) ∈ B[0, r0 ].
214
Proof. Because kT (ξ) − ξk ≥ kT (ξ)k − kξk, under Assumption 4.5.3, there is r1 > 0, such
that kT (ξ) − ξk > 0 for all ξ with kξk > r1 . Now we show that there is r2 > 0, such that for
any r ≥ r2 , T (B[0, r]) ⊆ B[0, r]. If this statement does not hold, for any positive r, there
is r∗ ≥ r and ξ r∗ with kξ r∗ k ≤ r∗ , kT (ξ r∗ )k > r∗ ≥ kξ r∗ k. Then kT (ξ r∗ )k/kξ r∗ k > 1, which
contradicts Assumption 4.5.3. Choose r0 = max {r1 , r2 }, we get the results.
Proposition C.4.5 Under Assumptions 4.5.2 and 4.5.3, if in addition,
Z
max max
|T (ξ)i,m (x + x
e)fp (x + x
e) − T (ξ)i,m (x)fp (x)|dx → 0,
(C.4.2)
as x
e → 0, uniformly for any ξ ∈ B[0, r0 ]; and
Z
max max
|T (ξ)i,m (x)|fp (x)dx → 0,
(C.4.3)
1≤i≤n 1≤m≤Mi
Xp
1≤i≤n 1≤m≤Mi
Xp −Cr
as r → ∞, uniformly for all ξ ∈ B[0, r0 ], the set of equilibria, E(X, Wn ) is a nonempty and
compact subset of (Ξ(Wn , J ), k·k) and is contiained in the closed ball B[0, r0 ]. In particular,
(C.4.2) and (C.4.3) are satisfied, if
0
0
1. Hi (·)’s are uniformly bounded, i.e., max1≤i≤n supa∈<1 |Hi (a)| ≤ B , for some B ;
2. E[Xip |Z = z] < ∞, for all i; and
3. For some δ0 > 0, for each (i, m), there is an function gi,m (x, x
b) such that
R
R
b)|dxdb
x < ∞, |fp,i,m (x + x
e, x
b)| ≤ gi,m (x, x
b), a.e., for any x
e in the
Xp
Xp |gi,m (x, x
i,m
Ji
p
cube Cδ0 , where fp,i,m (·, ·) is the joint density of Xi,m
= XJpi,m and XJpi conditional on
public information Z = z.60
Proof. Choose the closed ball B[0, r0 ] satisfying the properties stated in Lemma C.4.5.
0
It is nonempty, closed, and convex in space (Ξ(Wn , J ), k · k). For any ξ ∈ T (B[0, r0 ]),
0
ξ = T (ξ), for some ξ ∈ B[0, r0 ]. Therefore,
Z
0
0
|ξi,m (x + x
e)fp (x + x
e) − ξi,m (x)fp (x)|dx
max max
1≤i≤n 1≤m≤Mi Xp
Z
|T (ξ)i,m (x + x
e)fp (x + x
e) − T (ξ)i,m (x)fp (x)|dx;
= max max
1≤i≤n 1≤m≤Mi
Xp
p
It is possible that they have overlaps between Xi,m
and XJpi . For example, both agent 1 and agent 2
p
know X3 . We write fp,i,m (x, y) in order to simplify notations.
60
215
Z
max
1≤i≤n 1≤m≤Mi
Z
0
max
Xp −Cr
|ξi,m (x)|fp (x)dx = max
|T (ξ)i,m (x)|fp (x)dx.
max
1≤i≤n 1≤m≤Mi
Xp −Cr
Therefore, if (C.4.2) and (C.4.3) are satisfied, according to Proposition C.4.3, T (B[0, r0 ]) is
relatively compact in the normed space (Ξ(Wn , J ), k·k). Its closure, T (B[0, r0 ]), is compact.
Moreover, T (B[0, r0 ]) ⊆ B[0, r0 ], for B[0, r0 ] is closed. By the Schauder fixed point, T has
a fixed point in B[0, r0 ]. Thus, the set of equilibria, E(X, Wn ), is nonempty. Since it is the
set of fixed points for the continuous operator T , E(X, Wn ) is closed. As a closed subset of
the compact set T (B[0, r0 ]), E(X, Wn ) is compact.
In particular, if Hi (·)’s are uniformly bounded,
Z
max
max
1≤i≤n 1≤m≤Mi
Xp
i,m
Z
= max
max
1≤i≤n 1≤m≤Mi
Xp
i,m
T (ξ)i,m (x + x
e)fp (x + x
e) − T (ξ)i,m (x)fp (x)dx
X
E[Hi (u(X g , Xic , x
b) + λ
Wn,ij ξj (b
x))|x + x
e, z]fp (x + x
e)
j6=i
− E[Hi (u(X g , Xic , x
b) + λ
X
Wn,ij ξj (b
x))|x, z]fp (x)dx
j6=i
Z
= max
max
1≤i≤n 1≤m≤Mi
Xp
i,m
Z
Xp
J
Hi (u(X g , Xic , x
b) + λ
max
i
sup |Hi (a)|
1≤i≤n 1≤m≤Mi a∈<1
Wn,ij ξj (b
x))(fp (x + x
e, x
b) − fp (x, x
b))db
x|dx
j6=i
Z
Z
≤ max
X
Xp
i,m
Xp
J
|fp (x + x
e, x
b) − fp (x, x
b))|db
xdx,
i
which, following from the Lebesgue dominated convergence theorem, goes to 0 uniformly
for all ξ’s as x
e → 0, under the above distribution assumption. Similarly, for any r > 0,
Z
max max
|T (ξ)i,m (x)|fp (x)dx
1≤i≤n 1≤m≤Mi Xp −Cr
Z
Z
X
= max max
|
Hi (u(X g , Xic , x
b) + λ
Wn,ij ξj (y))fp (b
x|x)db
x|fp (x)dx
1≤i≤n 1≤m≤Mi
Xp −Cr
XpJ
j6=i
i
Z
≤ max
max
sup |Hi (a)|
1≤i≤n 1≤m≤Mi a∈<1
≤ max
max
sup |Hi (a)|
1≤i≤n 1≤m≤Mi a∈<1
≤ max
max
sup |Hi (a)|
1≤i≤n 1≤m≤Mi a∈<1
fp (x)dx
Xp −Cr
X
P (|Xjp | > r)
j:Ji,m (j)=1
X
E[|Xjp |z]/r,
j:Ji,m (j)=1
(C.4.4)
216
where the last inequality follows from the Chebyshev’s inequality. As r → ∞, the above
formula goes to zero uniformly for all ξ.
It is obvious that if Xip ’s have a bounded support and their joint density conditional
on public information is continuous, |fp,i,m (x + x
e, y)| can be dominated by the constant
function on the support, which is Lebesgue integrable. Now we show that we can dominate
the density for jointly normal random vectors.
0
0 0
0
0
0
Lemma C.4.6 Suppose that (X1 , X2 ) are jointly normal with mean (µ1 , µ2 ) and variancecovariance matrix,


Σ11 Σ12 
Σ=
.
Σ21 Σ22
Take δ0 > 0 arbitrarily. Then there is a function g : <k1 × <k2 → <1 , such that the joint
density f (x + x
e, x
b) ≤ g(x, x
b) a.e. for any x
e ∈ Cδ0 .
Proof. Denote the inverse of Σ by


e
e
Σ11 Σ12 
Σ−1 = 
.
e 21 Σ
e 22
Σ
The joint density takes the following form:
f (x + x
e, x
b)
n
o
0
0
0
0
0
=(2π)−(k1 +k2 )/2 (det(Σ))−1/2 exp −(1/2)(x + x
e − µ1 , x
b − µ2 )Σ−1 (x + x
b − µ1 , x
b − µ2 )
o
n
0
e
e 22 − Σ
e 21 Σ
e −1 Σ
x − µ2 )
=(2π)−(k1 +k2 )/2 (det(Σ))−1/2 exp −(1/2)(b
x − µ2 ) ( Σ
11 12 )(b
n
o
0
−1/2 e
e −1/2 Σ
e 12 (b
e
e
· exp −(1/2)(e
x + (x − µ1 ) + Σ
x
−
µ
))
Σ
(e
x
+
(x
−
µ
)
+
Σ
Σ
(b
x
−
µ
))
.
2
11
1
12
2
11
11
For any (x, x
b), since the cube Cδ0 is compact in an Euclidean space, we can define
−1/2 e
e
ge(x, x
b) = minxe∈Cδ0 (e
x + (x − µ1 ) + Σ
11
0
−1/2 e
e 11 (e
e
Σ12 (b
x − µ2 )) Σ
x + (x − µ1 ) + Σ
11
Σ12 (b
x − µ2 )).
This function is defined on the basis of an optimization problem about a quadratic form
with respect to linear constraints. For any c ≥ 0, define the lower contour set,
n
o
0
e −1/2 Σ
e 12 (b
e 11 (e
e −1/2 Σ
e 12 (b
L(c, x, x
b) = x
e : (e
x + (x − µ1 ) + Σ
x − µ2 )) Σ
x + (x − µ1 ) + Σ
x − µ2 )) ≤ c .
11
11
217
Each of those sets is an aera composed of an ellipse and its interior. It is convex.
e −1/2 Σ
e 12 (b
Fixing (x, x
b), we either have a solution inside Cδ0 , −(x − µ1 ) − Σ
x − µ2 ), or a
11
corner solution at a boundary point of the cube Cδ0 . In the space for x
e, fixing c, when
(x, x
b) varies, the center of the ellipse moves; while the axes do not change. Therefore,
we can divide the space for (x, x
b) into several regions, such that two points in the same
region either both have interior solutions or corner solutions on the same edge of Cδ0 .61
e −1/2 Σ
e 12 (b
If −(x − µ1 ) − Σ
x − µ2 ) ∈ Cδ0 , ge(x, x
b) = 0. Otherwise, its value depends on the
11
minimal boundary point. In this case, as the edges of Cδ0 are bounded, for any x
e on the
boundary of Cδ0 ,
0
e −1/2 Σ
e 12 (b
e 11 (e
e −1/2 Σ
e 12 (b
x + (x − µ1 ) + Σ
x − µ2 )) Σ
x + (x − µ1 ) + Σ
x − µ2 ))
(e
11
11
0
e −1/2 Σ
e 12 (b
e 11 ((x − µ1 ) + Σ
e −1/2 Σ
e 12 (b
− ((x − µ1 ) + Σ
x
−
µ
))
Σ
x
−
µ
))
b)k2E → 0
/k(x, x
2
2
11
11
as k(x, x
b)kE → ∞. Therefore, there is R > 0, such that when k(x, x
b)kE > R,
0
e −1/2 Σ
e 12 (b
e 11 ((x − µ1 ) + Σ
e −1/2 Σ
e 12 (b
ge(x, x
b) ≥ (1/4)((x − µ1 ) + Σ
x − µ2 )) Σ
x − µ2 )).
11
11
Define g(x, x
b) = (2π)−(k1 +k2 )/2 (det(Σ))−1/2 exp {−(1/2)e
g (x, x
b)}. Then f (x+e
x, x
b) ≤ g(x, x
b),
for any (x, x
b). By the Maximum Theorem, ge(x, x
b) is continuous, so does g(x, x
b). Moreover,
−1/2 e
e
if −(x − µ1 ) − Σ
11
Σ12 (b
x − µ2 ) ∈ Cδ0 , g(x, x
b) = (2π)−(k1 +k2 )/2 (det(Σ))−1/2 ; otherwise,
g(x, x
b) is continuous when k(x, x
b)kE ≤ R, and when k(x, x
b)kE > R,
g(x, x
b) ≤(2π)−(k1 +k2 )/2 (det(Σ))−1/2
n
o
0
e −1/2 Σ
e 12 (b
e 11 ((x − µ1 ) + Σ
e −1/2 Σ
e 12 (b
· exp −(1/8)((x − µ1 ) + Σ
x − µ2 )) Σ
x − µ2 )) .
11
11
Hence, g(x, x
b) is integrable.
C.5
Equilibrium for Peer Effects
e
Proof of Proposition 4.6.1.
On one hand, suppose that ξ satisfying (4.6.6). With a
e
e
regular group, we can define ξj,m
= Λj (ξ m(j) ) m for all j = 1, · · · , n and m = 1, · · · , M0 .
61
This can be seen clearly for the special case when k1 = k2 = 1. In that case, for any (x, x
b), if −(x −
e −1/2 Σ
e 12 (b
e −1/2 Σ
e 12 (b
µ1 ) − Σ
x − µ2 ) < −δ0 , the minimizing solution is −δ0 ; if −δ0 ≤ −(x − µ1 ) − Σ
x − µ2 ) ≤ δ0 ,
11
11
e −1/2 Σ
e 12 (b
the minimizing solution is interior; if −(x − µ1 ) − Σ
x − µ2 ) > δ0 , the minimizing solution is δ0 .
11
218
(4.6.6) implies that
e
ξ m(i) (xpJ,m(i) )
=
=
n
X
j=1
n
X
e
e
p
p
p
) − λ(Λj (ξ m(j) ))m(j) (XJ,m(j)
))|XJ,m(i)
= xpJ,m(i) , z]
E[Hj (u(Xj ) + λξ m(j) (XJ,m(j)
e
Λj (ξ m(j) ) m(i) (xpJ,m(i) )
j=1
=
n
X
e
(xpJ,m(i) ).
ξj,m(i)
j=1
for all i and xpJ,m(i) ∈ XpJ,m(i) . Therefore,
e
e
ξi,m
(xpJ,m ) = Λi (ξ m(i) ) m (xpJ,m )
e
p
p
p
e
=E[Hi (u(Xi ) + λξ m(i) (XJ,m(i)
) − λξi,m(i)
(XJ,m(i)
))|XJ,m
= xpJ,m , z]
X
p
p
e
=E[Hi (u(Xi ) + λ
ξj,m(i)
(XJ,m(i)
))|XJ,m
= xpJ,m , z].
j6=i
e
e
e , · · · , ξe
On the other hand, given that ξ e = (ξ1,1
1,M0 , · · · , ξn,1 , · · · , ξn,M0 ) satisfies (4.6.3),
e
e
e
define ξ = (ξ 1 , · · · , ξ M0 ) such that
ξ
e
(xpJ,1 , · · ·
, xpJ,M0 )m
=
e
ξ m (xpJ,m )
=
n
X
e
ξi,m
(xpJ,m ),
i=1
for all m and
xpJ,m
∈
XpJ,m .
Then we get that
e
ξi,m
(xpJ,m ) =E[Hi (u(Xi ) + λ
X
p
p
e
ξj,m(i)
(XJ,m(i)
))|XJ,m
= xpJ,m , z]
j6=i
e
p
p
p
e
=E[Hi (u(Xi ) + λξ m(i) (XJ,m(i)
) − λξi,m(i)
(XJ,m(i)
))|XJ,m
= xpJ,m , z].
e
e
Applying the Implicit Function theorem in Banach spaces, ξi,m
= (Λi (ξ m(i) ))m , for all
i = 1, · · · , n and m = 1, · · · , M0 . Therefore,
219
e
ξ m (xpJ,m ) =
n
X
e
ξi,m
(xpJ,m )
i=1
=
n
X
e
(Λi (ξ m(i) ))m (xpJ,m )
i=1
=
n
X
e
p
p
p
e
))|XJ,m
= xpJ,m , z]
) − λξi,m(i)
(XJ,m(i)
E[Hi (u(Xi ) + λξ m(i) (XJ,m(i)
i=1
=
n
X
e
e
p
E[Hi (u(Xi ) + λξ m(i) (XJpi ) − λ(Λi (ξ m(i) ))(XJpi ))|XJ,m
= xpJ,m , z].
i=1
C.6
Proofs for Equilibrium Set Characterization in Binary Choice Models
Proof for Proposition 4.7.1. Let ξ denote the total expected outcome in the group.
Pn
Pn
That is, ξ =
1 E[yi ]. Given ξ, for agent i, K(ξi , ξ; ui , λ) = Φ(ui + λξ −
i=1 ξi =
λξi ) − ξi = 0. K(0, ξ; ui , λ) > 0. K(1, ξ; ui , λ) < 0.
∂K
∂ξi
= −(λφ(ui + λξ − λξi ) + 1). If
λφ(ui +λξ −λξi )+1 > 0, for each ξ, there is a unique ξi ∈ (0, 1) such that K(ξi , ξ; ui , λ) = 0.
Thus, individual expected outcomes are determined by a function, ξi = G(ui , ξ), such that
∂G(ui , ξ)
λφ(ui + λξ − λG(ui , ξ))
=
,
∂ξ
λφ(ui + λξ − λG(ui , ξ)) + 1
∂ 2 G(ui , ξ)
∂ξ
2
=−
λ2 φ(ui + λξ − λG(ui , ξ))(ui + λξ − λG(ui , ξ))
.
[λφ(ui + λξ − λG(ui , ξ)) + 1]3
ξ is determined by
S(u, ξ) =
n
X
G(ui , ξ) − ξ.
i=1
As a result,
∂S
∂ξ
=
Pn
i=1
∂G(ui ,ξ)
∂ξ
e
− 1 and
∂2S
2
∂ξ
=
Pn
i=1
∂ 2 G(ui ,ξ)
∂ξ
2
. As ξi ∈ [0, 1] for i = 1, · · · , n,
the valid equilibrium ξ ∈ [0, n]. Since G(ui , ξ) ∈ (0, 1), S(u, 0) > 0 and S(u, n) < 0.
Therefore, there must be an equilibrium between 0 and n.
√
• When − 2π < λ < 0, λφ(ui + λξ − λG(ui , ξ)) + 1 > 0. Thus,
and
∂S
∂ξ
< 0. As a result, there is a unique equilibrium.
220
∂G(ui ,ξ)
∂ξ
< 0 for all i
• When λ > 0,
∂G(ui ,ξ)
∂ξ
> 0 for all i. If min1≤i≤n ui >
λ
2,
Φ(ui − λ2 ) −
is, K( 12 , 0; ui , λ) > 0. Then for all i, G(ui , ξ) ≥ G(ui , 0) >
λξ − λG(ui , ξ) = Φ−1 (G(ui , ξ)) > 0. It follows that
Similarly, if max1≤i≤n ui <
K( 21 , n; ui , λ) <
1
2.
λ
2
∂ 2 G(ui ,ξ)
∂ξ
2
1
2.
Φ−1 (G(ui , ξ) < 0. In this case,
∂ 2 G(ui ,ξ)
2
∂ξ
1
2.
> 0. That
Therefore, ui +
< 0 for any ξ ∈ [0, n].
− λn, for any i, Φ(ui + λn − λ2 ) −
Then G(ui , ξ) ≤ G(ui , n) <
1
2
1
2
< 0. That is,
Then ui + λξ − λG(ui , ξ) =
> 0 for all ξ ∈ [0, n]. In both cases,
∂ 2 G(ui ,ξ)
∂ξ
2
does not change sign as ξ runs in [0, n]. Hence, there is a unique equilibrium.
Proof of Lemma 4.7.1. It is easy to see that c(a; λ) = c(−a; λ). c(0; λ) = λφ(0) + 1 > 0.
0
lima→+∞ c(a; λ) = lima→−∞ c(a; λ) = −∞. c (a; λ) = a(3λφ(a) − 2 − 2λa2 φ(a)). When
0<λ<
√
2 2π
3 ,
0
0
3λφ(a) − 2 − 2λa2 φ(a) < 0. Thus, c (a; λ) < 0 for a > 0 and c (a; λ) > 0
for a < 0. Therefore, there is a+ > 0 with the claimed properties. Additionally,
da+
dλ
=
2
(2a +1)φ(a)
− a(3λφ(a)−2−2λa
2 φ(a)) > 0.
Proof of Proposition 4.7.2. For any i,
∂ 3 G(ui , ξ)
∂ξ
∂3S
3
∂ξ
=
Pn
i=1
3
∂ 3 G(ui ,ξ)
∂ξ
3
=−
λ3 φ(ui + λξ − λG(ui , ξ))c(ui + λξ − λG(ui , ξ); λ)
.
[λφ(ui + λξ − λG(ui , ξ)) + 1]5
. As λ > 0,
∂G(ui ,ξ)
∂ξ
> 0 for all i. If (4.7.4) holds, for any i, Φ(ui −
λΦ(a+ (λ))) − Φ(a+ (λ)) > 0. That is, K(Φ(a+ (λ)), 0; ui , λ) > 0. Therefore, G(ui , ξ) ≥
G(ui , 0) > Φ(a+ (λ)) for all ξ ∈ [0, n]. It follows that ui + λξ − λG(ui , ξ) = Φ−1 (G(ui , ξ)) >
a+ (λ). From Lemma 4.7.1, c(ui +λξ −λG(ui , ξ); λ) < 0, for all i and ξ ∈ [0, n]. Analogously,
under condition (4.7.5), for any i, Φ(ui + λn − λΦ(−a+ (λ))) − Φ(−a+ (λ)) < 0. Then
K(Φ(−a+ (λ)), n; ui , λ) < 0. Therefore, G(ui , ξ) ≤ G(ui , n) < Φ(−a+ (λ)). That is, for any i.
ui +λξ−λG(ui , ξ) = Φ−1 (G(ui , ξ)) < −a+ (λ). By Lemma 4.7.1, c(ui +λb−λG(ui , ξ); λ) < 0,
for all i and ξ ∈ [0, n]. Thus, in both cases,
are more than three equilibria,
∂2S
2
∂ξ
∂3S
3
∂ξ
> 0 when ξ runs from 0 to n. If there
must change its sign, which is impossible if
∂3S
3
∂ξ
keeps
on being positive. Consequently, under condition (4.7.4) or (4.7.5), there are at most three
equilibria.
221
e i , ξ; ui , λ) = 2Φ(ui +λξ−λξi )−1−ξi = 0.
Proof of Proposition 4.7.3. Fix λ, for any i, K(ξ
e ξ; ui , λ) < 0, and
e
ξ; ui , λ) > 0, K(1,
As K(−1,
e
∂K
∂ξi
= −(2λφ(ui + λξ − λξi ) + 1), when
e i , ξ; ui , λ) =
2λφ(ui +λξ −λξi )+1 > 0, for each ξ, there is a unique ξi ∈ [−1, 1] such that K(ξ
e i , ξ; λ). By
0. Thus, ξi is implicitly a function of ξ and ui . Denote this function as G(u
computation,
e i , ξ)
e i , ξ))
∂ G(u
2λφ(ui + λξ − λG(u
=
,
e i , ξ)) + 1
∂ξ
2λφ(ui + λξ − λG(u
e i , ξ)
∂ 2 G(u
∂ξ
2
=−
e i , ξ))(ui + λξ − λG(u
e i , ξ))
2λ2 φ(ui + λξ − λG(u
.
e i , ξ)) + 1]3
[2λφ(ui + λξ − λG(u
ξ is determined by
e ξ) =
S(u,
n
X
e i , ξ) − ξ.
G(u
i=1
As a result,
e
∂S
∂ξ
=
Pn
i=1
e i ,ξ)
∂ G(u
−1
∂ξ
e
and
e
∂2S
2
∂ξ
=
Pn
e i ,ξ)
∂ 2 G(u
i=1
∂ξ
2
. As ξi ∈ [−1, 1] for i = 1, · · · , n,
e i , ξ) ∈ (−1, 1), S(u,
e −n) > 0 and S(u,
e n) < 0.
the valid equilibrium ξ ∈ [−n, n]. Since G(u
Therefore, there must be an equilibrium between −n and n.
√
• When −
and
e
∂S
∂ξ
2π
2
e i , ξ)) + 1 > 0. Then
< λ < 0, 2λφ(ui + λξ − λG(u
e i ,ξ)
∂ G(u
∂ξ
< 0 for all i
< 0. Thus, in this case, there is a unique equilibrium.
• When λ > 0,
e i ,ξ)
∂ G(u
∂ξ
> 0 for all i. If min1≤i≤n ui > λn, 2Φ(ui − λn) − 1 > 0.
e −n; ui , λ) > 0. Then for all i, G(u
e i , ξ) ≥ G(u
e i , −n) > 0. Therefore,
That is, K(0,
e i , ξ) = Φ−1 ( G(ui ,ξ)+1 ) > 0. It follows that
ui + λξ − λG(u
2
e
e i ,ξ)
∂ 2 G(u
∂ξ
2
< 0 for any
ξ ∈ [−n, n]. Similarly, if max1≤i≤n ui < −λn, for any i, 2Φ(ui + λn) − 1 < 0. That
e i , n) < 0. Then ui + λξ − λG(u
e i , ξ) =
e n; ui , λ) < 0. Hence, G(u
e i , ξ) ≤ G(u
is, K(0,
Φ−1 ( G(ui2,ξ)+1 ) < 0. In this case,
e
both cases,
e i ,ξ)
∂ 2 G(u
2
∂ξ
e i ,ξ)
∂ 2 G(u
∂ξ
2
> 0 for all ξ ∈ [−n, n]. That is to say, in
does not change its sign as ξ runs in [−n, n]. Hence, there is a
unique equilibrium in these two cases.
Proof of Lemma 4.7.2. First, it is easy to see that e
c(a; λ) = e
c(−a; λ). e
c(0; λ) = 2λφ(0)+
1 > 0. lima→+∞ e
c(a; λ) = lima→−∞ e
c(a; λ) = −∞.
222
de
c(a;λ)
da
= 2a(3λφ(a) − 1 − 2λφ(a)a2 ).
√
If 0 < λ <
2π
3 ,
3λφ(a) − 1 − 2λφ(a)a2 < 0. Then
de
c(a;λ)
da
> 0 if a < 0; and
de
c(a;λ)
da
<0
if a > 0. Thus, a = 0 is the unique peak for e
c(·; λ). By symmetry, there is e
a+ > 0 such
that e
c(a; λ) > 0 if −e
a+ < a < e
a+ ; e
c(e
a+ ; λ) = e
c(−e
a+ ; λ) = 0; and e
c(a; λ) < 0 for a > e
a+ or
φ(e
a+ )(1+2e
a2 )
a+
+
a < −e
a+ . In addition, from e
c(e
a+ ; λ) = 0, de
> 0.
dλ = e
a+ (2λφ(e
a+ )e
a2+ +1−3λφ(e
a+ ))
P
3
3e
Proof of Proposition 4.7.4. ∂ 3se = ni=1 ∂ G3 , where
∂ξ
e
∂3G
∂ξ
3
=−
∂ξ
e i ; λ))e
e i ; λ); λ)
c(ui + λξ − λG(u
2λφ(ui + λξ − λG(u
.
e i ; λ)) + 1)5
(2λ3 φ(ui + λξ − λG(u
e
∂3G
3
∂ξ
e i ; λ); λ). If
is determined by the sign of e
c(ui + λξ − λG(u
min1≤i≤n ui > λ(2Φ(e
a+ (λ))−1+n)+e
a+ (λ), for any i, 2Φ ui −λn−λ(2Φ(e
a+ (λ))−1) −1 >
When λ > 0, the sign of
e
a+ (λ)) − 1, −n; ui , λ) > 0. Thus, for any ξ ∈ [−n, n],
2Φ(e
a+ (λ)) − 1. That is, K(2Φ(e
a+ (λ)) − 1. Therefore, ui + λξ − λG(ui , ξ) = Φ−1 ( G(ui2,ξ)+1 ) >
G(ui , ξ) ≥ G(ui , −n) > 2Φ(e
e i ; λ); λ) < 0 for all i and
e
a+ . It then follows from Lemma 4.7.2 that e
c(ui + λξ − λG(u
ξ ∈ [−n, n]. Similarly, if max1≤i≤n ui < λ(2Φ(−e
a+ (λ))−1−n)−e
a+ (λ), for all i, 2Φ ui +λn−
e
λ(2Φ(−e
a+ (λ)) − 1) − 1 < 2Φ(−e
a+ (λ)) − 1. That is, K(2Φ(−e
a+ (λ)) − 1, n; ui , λ) < 0. Then
G(ui , ξ) ≤ G(ui , n) < 2Φ(−e
a+ (λ)) − 1. Therefore, ui + λξ − λG(ui , ξ) = Φ−1 ( G(ui2,ξ)+1 ) <
e i ; λ); λ) < 0 for all i and ξ ∈ [−n, n]. That is
−e
a+ . Applying Lemma 4.7.2, e
c(ui + λξ − λG(u
to say, when (4.7.11) or (4.7.12) holds,
this case,
∂ 2 se
2
∂ξ
e
∂3G
3
∂ξ
> 0 for all i and
∂ 3 se
3
∂ξ
> 0 for all ξ ∈ [−n, n]. In
does not change its sign in [−n, n] and there are at most three equilibria.
223
Table C.1: Binary Choice I: Estimation Comparison
I
II
III
True parameters
G = 100
G = 200
G = 100
G = 200
G = 100
G = 200
β0
0
0.9376
(0.2174)
0.9409
(0.1566)
0.1851
(0.2127)
0.1792
(0.1470)
0.0030
(0.2537)
0.0056
(0.1773)
β1
1
0.7397
(0.0957)
0.7470
(0.0662)
0.9237
(0.1050)
0.9281
(0.0674)
1.0210
(0.1299)
1.0165
(0.0848)
β2
1
0.7957
(0.4130)
0.7859
(0.3037)
0.9103
(0.4138)
0.9280
(0.2889)
0.9935
(0.4648)
1.0116
(0.3134)
λ
0.8
0.6263
(0.0037)
0.6266
(0.0004)
0.8259
(0.1047)
0.8113
(0.0833)
-0.2401
(0.0218)
-0.2407
(0.0150)
-0.2356
(0.0231)
-0.2371
(0.0165)
-
-
0.6267
(4.4465 × 10−16 )
0.6267
(4.4465 × 10−16 )
0.6744
(0.0496)
0.6735
(0.0343)
m log L
|λ|
-0.3288
(0.0258)
-0.3288
(0.0180)
runctr
Note: Regression I corresponds to the conventional regression without social interactions. Regression II and II allow
for interactions through social relations. Regression II imposes a sufficient condition on λ, assumes equilibrium
uniqueness, and uses the method of contraction mapping iteration for equilibrium computation. Regression III does
not impose restrictions on the interaction intensity, λ, and uses the homotopy continuation method for equilibrium
computation. |λ| is the upper bound on the intensity of social interactions, corresponding to the sufficient condition
for contraction mapping in Yang and Lee(2014). runctr denotes the proportion of groups in which that condition is
violated under the true parameter values.
224
Table C.2: Binary Choice II: Estimation Comparison for Moderate Interactions
True parameters
I
II
III
IV
β0
0
0.0298
(0.0741)
-0.0010
(0.0483)
-0.0011
(0.0484)
0.0014
(0.0493)
β1
1
1.0322
(0.0976)
1.0219
(0.1004)
1.0219
(0.1005)
1.0241
(0.0979)
β2
1
1.2152
(0.3356)
0.9906
(0.2666)
0.9894
(0.2678)
1.0175
(0.2848)
λ
0.2
0.1962
(0.0294)
0.1962
(0.0294)
0.1707
(0.0695)
-0.4594
(0.0246)
-0.4594
(0.0246)
-0.4620
(0.0264)
m log L
-0.4804
(0.0247)
|λ|
0.3133
(1.6758 × 10−16 )
0
(0)
runctr
ne
1
(0)
0
(0)
rm
n
be
1
(0)
0
(0)
rbm
Note: In each simulation, the number of independent groups is G = 100. The population of every group is n = 5.
Regression I corresponds to the conventional regression without social interactions. Regressions II, III, and IV take
social interactions into account. Regressions II and III assumes equilibrium uniqueness. Regression II restricts the
interaction intensity to satisfy the sufficient condition in Yang and Lee(2014) and uses contraction mapping iterations
to solve for the equilibrium. Regression III does not restrict the interaction intensity and computes the equilibrium
through solving nonlinear equations by the Newton’s method. Regression IV allows for equilibrium multiplicity,
uses the homotopy continuation method to solve for the equilibrium set, and select an equilibrium according to the
expected total utilities to complete the model. |λ| stands for the upper bound on interaction intensity which ensures
equilibrium uniqueness in a sample according to Yang and Lee(2014). runctr represents the proportion of groups
which violate that sufficient condition. ne and n
be refer to respectively the average sample and estimated number
of equilibria in a group. rm and rbm report the sample and estimated proportion of groups with multiple equilibria
respectively. Numbers in parentheses are standard deviations.
225
Table C.3: Binary Choice II: Estimation Comparison for Large Interactions
True parameters
I
II
III
IV
β0
0
0.0899
(0.1339)
0.0021
(0.0431)
-0.0042
(0.0525)
0.0439
(0.0694)
β1
1
0.4767
(0.0525)
0.5191
(0.0636)
0.5748
(0.2164)
1.0544
(0.2492)
β2
1
1.1328
(0.4606)
0.6596
(0.2650)
0.6197
(0.3941)
0.9346
(0.3442)
λ
0.8
0.3109
(0.0039)
0.3900
(0.0277)
0.8325
(0.0968)
α
1
m log L
1.0776
(0.6221)
-0.6031
(0.0200)
-0.4312
(0.0349)
|λ|
-0.3989
(0.0477)
-0.1218
(0.0235)
0.3133
(2.2232 × 10−16 )
1
(0)
runctr
ne
2.2918
(0.0784)
0.7811
(0.0407)
rm
n
be
2.2487
(0.1716)
0.7642
(0.1015)
rbm
Note: In each simulation, the number of independent groups is G = 100. The population of every group is n = 5.
Regression I corresponds to the conventional regression without social interactions. Regressions II, III, and IV take
social interactions into account. Regressions II and III assumes equilibrium uniqueness. Regression II restricts the
interaction intensity to satisfy the sufficient condition in Yang and Lee(2014) and uses contraction mapping iterations
to solve for the equilibrium. Regression III does not restrict the interaction intensity and computes the equilibrium
through solving nonlinear equations by the Newton’s method. Regression IV allows for equilibrium multiplicity,
uses the homotopy continuation method to solve for the equilibrium set, and select an equilibrium according to the
expected total utilities to complete the model. |λ| stands for the upper bound on interaction intensity which ensures
equilibrium uniqueness in a sample according to Yang and Lee(2014). runctr represents the proportion of groups
which violate that sufficient condition. ne and n
be refer to respectively the average sample and estimated number
of equilibria in a group. rm and rbm report the sample and estimated proportion of groups with multiple equilibria
respectively. Numbers in parentheses are standard deviations.
226
The Haar Basis Functions
5
4
3
2
τ7
τ
τ3
τ0
1
0
−1
τ1
τ5
τ2
τ6
−2
τ4
−3
−4
−5
0
0.1
0.2
0.3
0.4
0.5
x
0.6
0.7
Figure C.1: The Haar Basis Functions
227
0.8
0.9
1
n=2, u=1
n=5, u=1
10
n=10, u=1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
10
2
5
0
0
0
−2
−5
−5
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−10
−15
−10
−5
0
tpsi
5
10
15
Figure C.2: Equilibrium Illustration for Binary Choice I with Influences from Peers A
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of symmetric members for the Binary
Choice Model I with influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and homogeneous individual utility, u, vary. The left, middle, and right diagrams respectively
correspond to three cases: n = 2, u = 2; n = 5, u = 2; and n = 10, u = 2.
228
n=2, max u=2, min u=1
n=5, max u=2, min u=1
10
n=10, max u=2, min u=1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
10
2
5
0
0
0
−2
−5
−5
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−10
−15
−10
−5
0
tpsi
5
10
15
Figure C.3: Equilibrium Illustration for Binary Choice I with Influences from Peers B
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of asymmetric members for the Binary
Choice Model I with Influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and heterogeneous individual utilities, ui ’s, vary. The left, middle, and right diagrams
respectively correspond to three cases: n = 2, maxi ui = 2, mini ui = 1; n = 5, maxi ui = 2, mini ui = 1; and
n = 10, maxi ui = 2, mini ui = 1.
229
n=2, u= −2
n=5, u= −2
8
n=10, u= −2
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
6
20
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
15
4
10
2
5
5
0
0
0
−2
−5
−4
−5
−10
−6
−8
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−15
−15
−10
−5
0
tpsi
5
10
15
Figure C.4: Equilibrium Illustration for Binary Choice I with Influences from Peers C
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of symmetric members for the Binary
Choice Model I with influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and homogeneous individual utility, u, vary. The left, middle, and right diagrams respectively
correspond to three cases: n = 2, u = −2; n = 5, u = −2; and n = 10, u = −2.
230
n=2, max u= −2, min u= −3
n=5, max u= −2, min u= −3
8
n=10, max u= −2, min u= −3
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
6
20
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
15
4
10
2
5
5
0
0
0
−2
−5
−4
−5
−10
−6
−8
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−15
−15
−10
−5
0
tpsi
5
10
15
Figure C.5: Equilibrium Illustration for Binary Choice I with Influences from Peers D
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of asymmetric members for the Binary
Choice Model I with Influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and heterogeneous individual utilities, ui ’s, vary. The left, middle, and right diagrams
respectively correspond to three cases: n = 2, maxi ui = −2, mini ui = −3; n = 5, maxi ui = −2, mini ui = −3; and
n = 10, maxi ui = −2, mini ui = −3.
231
Equilibria as the Interaction Intensity Increases
1
0.8
0.6
0.4
ne
rm
0.2
ru
0
−0.2
0
0.1
0.2
0.3
0.4
0.5
λ
0.6
0.7
0.8
0.9
1
Figure C.6: Equilibrium Illustration for Binary Choice I with General Social Relations
Note: This figure shows the features of the equilibrium set in Binary Choice Model I as the interaction intensity, λ,
increases, for a sample of G = 100 groups with homogeneous group population n = 5. ne represents the average
number of equilibria of the sample. rm is the ratio of groups with more than one equilibria. ru stands for the
proportion of groups whose social relation matrix does not satisfy the sufficient condition for contraction mapping in
Yang and Lee(2014).
232
Equilibrium Outcomes as Interaction Intensity Increases
1
0.8
0.6
0.4
0.2
me
0
r1
r0
−0.2
0
0.1
0.2
0.3
0.4
0.5
λ
0.6
0.7
0.8
0.9
1
Figure C.7: Equilibrium Outcomes for Binary Choice I with General Social Relations
Note: This figure shows the features of the equilibrium outcomes in Binary Choice Model I as the interaction
intensity, λ, increases, for a sample of G = 100 groups with homogeneous group population n = 5. me , r1 , and r0 ,
respectively represent the average expected (individual) outcomes, the ratio of agents who choose “1”, and the ratio
of agents who choose “0”, of the unique equilibrium.
233
n=2, u=n+1
n=5, u=n+1
10
n=10, u=n+1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
2
10
0
0
5
−2
−5
0
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−5
−15
−10
−5
0
tpsi
5
10
15
Figure C.8: Equilibrium Illustration for Binary Choice II with Influences from Peers A
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of symmetric members for the Binary
Choice Model II with influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and homogeneous individual utility, u, vary. The left, middle, and right diagrams respectively
correspond to three cases: n = 2, u = n + 1; n = 5, u = n + 1; and n = 10, u = n + 1.
234
n=2, max u=n+3, min u=n+1
n=5, max u=n+3, min u=n+1
10
n=10, max u=n+3, min u=n+1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
2
10
0
0
5
−2
−5
0
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−5
−15
−10
−5
0
tpsi
5
10
15
Figure C.9: Equilibrium Illustration for Binary Choice II with Influences from Peers B
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of asymmetric members for the Binary
Choice Model II with Influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of
a nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and heterogeneous individual utilities, ui ’s, vary. The left, middle, and right diagrams
respectively correspond to three cases: n = 2, maxi ui = n + 3, mini ui = n + 1; n = 5, maxi ui = n + 3,
mini ui = n + 1; and n = 10, maxi ui = n + 3, mini ui = n + 1.
235
n=2, u=1
n=5, u=1
10
n=10, u=1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
10
2
5
0
0
0
−2
−5
−5
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−10
−15
−10
−5
0
tpsi
5
10
15
Figure C.10: Equilibrium Illustration for Binary Choice II with Influences from Peers C
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of symmetric members for the Binary
Choice Model II with influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of a
nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and homogeneous individual utility, u, vary. The left, middle, and right diagrams respectively
correspond to three cases: n = 2, u = 1; n = 5, u = 1; and n = 10, u = 1.
236
n=2, max u=2, min u=1
n=5, max u=2, min u=1
10
n=10, max u=2, min u=1
15
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
8
25
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
10
λ=0
λ=0.2
λ=0.4
λ=0.6
λ=0.8
λ=1
20
6
15
4
5
10
2
5
0
0
0
−2
−5
−5
−4
−6
−6
−4
−2
0
tpsi
2
4
6
−10
−10
−5
0
tpsi
5
10
−10
−15
−10
−5
0
tpsi
5
10
15
Figure C.11: Equilibrium Illustration for Binary Choice II with Influences from Peers D
Note: In this figure, “tpsi” refers to the expected total outcomes in a group of asymmetric members for the Binary
Choice Model II with Influences from peers. For each value of λ, an equilibrium expected total outcome is a zero of
a nonlinear function, whose graph is depicted as a curve. The characteristics of the equilibrium set may differ as the
group population, n, and heterogeneous individual utilities, ui ’s, vary. The left, middle, and right diagrams
respectively correspond to three cases: n = 2, maxi ui = 2, mini ui = 1; n = 5, maxi ui = 2, mini ui = 1; and
n = 10, maxi ui = 2, mini ui = 1.
237
3
ne
rm
r
2.5
u
2
1.5
1
0.5
0
0
0.1
0.2
0.3
0.4
0.5
λ
0.6
0.7
0.8
0.9
1
Figure C.12: Equilibrium Illustration for Binary Choice II with General Social Relations
Note: This figure shows the features of the equilibrium set in Binary Choice Model II as the interaction intensity, λ,
increases, for a sample of G = 100 groups with homogeneous group population n = 5. ne represents the average
number of equilibria of the sample. rm is the ratio of groups with more than one equilibria. ru stands for the
proportion of groups whose social relation matrix does not satisfy the sufficient condition for contraction mapping in
Yang and Lee(2014).
238
1
0.8
0.6
0.4
0.2
0
−0.2
−0.4
me
my
−0.6
rp
rm
−0.8
−1
0
0.1
0.2
0.3
0.4
0.5
λ
0.6
0.7
0.8
0.9
1
Figure C.13: Equilibrium Outcomes for Binary Choice II with General Social Relations
Note: This figure shows the features of the equilibrium outcomes in Binary Choice Model II as the interaction
intensity, λ, increases, for a sample of G = 100 groups with homogeneous group population n = 5. me , my , rp , and
rm , respectively represent the average expected (individual) outcomes, the average individual choices, the ratio of
agents who choose “1”, and the ratio of agents who choose “-1”, of the unique/selected equilibrium.
239
4
(r,0)
(0,r)
(−r,0)
(0,−r)
S(a(1,0))
S(a(0,1))
S(a(−1,0))
S(a(0,−1))
3
2
m*2
1
0
−1
−2
−3
−4
−4
−3
−2
−1
0
m*1
1
2
3
4
Figure C.14: Homotopic Mappings on Sphere for Binary Choices
240