A REVEALED PREFERENCE MEASURE OF STABILITY IN SOCIAL AND ECONOMIC NETWORKS K HAI X IANG C HIONG1 This paper derives the observable consequences of stability in networks, which then allows for measuring and quantifying stability as an observable property of networks. A network is stable if no agents prefer to deviate by forming or dissolving links. The notion of stability then crucially depends on preferences, which are unobserved. To obtain a notion of stability based only on observables (network data), I use the principle of revealed preference: – a network is observed to be stable if and only if there exists preferences that rationalize the network as stable. For any structure on preferences, I characterize the necessary and sufficient conditions for an observed network to be rationalizable as stable. Consistency of network data with these conditions forms the basis for the measure of stability. I explore how stability is empirically associated with economic outcomes and social characteristics. I find that targeting of nodes that have central positions in social networks to increase the spread of information is more effective when the underlying networks are also associated with higher measures of stability. 1. I NTRODUCTION This paper examines theoretically and empirically, the hypothesis of pairwise stability in networks, which is the solution concept that in a pairwise stable network, no pair of agents prefer to deviate from the existing network configuration by unilaterally dissolving links, or bilaterally forming links (Jackson and Wolinsky (1996)). More concretely, given a network of social relationships (who is friends with whom), or a network of financial relationships (who borrows money from whom, and who lends to whom), 1 Draft date: December 20, 2014. For the latest version: http://www.hss.caltech.edu/ ~kchiong. Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA. E-mail address: [email protected]. Notable thanks to Federico Echenique, Matthew Shum and Leeat Yariv for their invaluable encouragement and advice. For helpful discussions, I am also very grateful to Matthew Elliott, Matthew Jackson, Katrina Ligett, Philip Reny, Jean-Laurent Rosenthal and John Quah. The author gratefully acknowledges financial support from the NET Institute, www.NETinst.org 1 2 the goal is to develop and axiomatize a measure of stability that quantifies if a single observed network is pairwise stable or not. The result of this paper is a new network measure that quantifies the stability of a given network. This measure is then shown to be axiomatized by a particular structure on preferences. I also show that this measure unlocks a new descriptive aspect of social network that predicts important economic outcomes. Specifically, using the dataset from Banerjee, Chandrasekhar, Duflo, and Jackson (2013), I examine both the correlates of stability and how stability matters for economic outcomes. I find that targeting of nodes that have central positions in the underlying social network to increase the spread of a new microfinance program is more effective when the network is associated with a higher measure of stability. The challenges in deriving measures of network stabiliy are that (i) I assume only a single network is observed, and (ii) the notion of stability fundamentally relies on preferences, which are not observed. To say that a network is pairwise stable1 , we need to check that agents do not prefer to deviate from the existing network, and this requires knowing the preferences of everyone in the network. How then can we talk about stability as an observable property of a single network? The contribution of this paper is to characterize the observable consequences of stability in networks, and this is the first paper to do so. That is, I derive the notion of stability in terms of data (an observed network) that is equivalent to the standard notion of stability which is formulated in terms of unobserved preferences. This then allows for quantifying stability given some network data. To be more precise, the observable analog of pairwise stability is obtained using the revealed preference approach:– a network is observed to be stable if and only if there exists some preferences that induce this network as pairwise stable. In other words, a network is observed to be stable if and only if it is rationalizable as pairwise stable. The main theoretical result is to characterize, for any given preference structure p, the necessary and sufficient condition on the network data g, such that g is rationalizable as pairwise stable with preferences that satisfy the 1 Pairwise stability can be understood as a mapping from the preferences of a group of agents to the pattern of interactions amongst them, as embodied by the network. 3 structure p. When we do not impose any structure on preferences, any network can be rationalized as stable (the measure has no power). A preference structure is formally an extension of the revealed preference relations. I show that the characterization of what networks can be rationalized as stable amounts to checking for cycles in an appropriately extended profile of revealed preference relations. Thus we have a natural way to extend from a binary to a continuous measure of stability, which is to count the prevalence of cycles in the aforementioned profile. Our characterization resembles GARP (Generalized Axiom of Revealed Preference), which is the necessary and sufficient condition for the existence of some utility functions that rationalize a given consumption dataset (Afriat (1967); Samuelson (1938); Varian (1982, 2006)). Our stability measure parallels the usage of GARP as a measure of rationality (i.e. consistency with utility maximization), which also checks for cycles in the revealed preference relations. More concretely, preferences are revealed from network data as follows. If a link is observed between two agents, we then say that both agents reveal preferred to form this link, conditional on the rest of the network. However when a link is not observed between two agents, we can only say that either one or both of them reveal preferred not to form this link, conditional on the rest of the network. The departure from classical revealed preference theory arises because in networks and other two-sided settings, we are unsure as to which of the two sides blocked and chose not to form a link. Therefore the data induces a multiplicity of possible revealed preference relations. A further theoretical contribution that has practical consequence is to show that for a class of preference structures, this revealed preference characterization can be reduced as follows. We can ignore cycles of length greater than 2, and the indeterminacy in revealed preferences collapsed into one particular profile of revealed preference relations.2 In the application, I impose a particular structure on preferences that is not restrictive, but still has enough power for the dataset considered in the empirical application. Specifically, I restrict an agent’s preference to depend 2 This particular profile corresponds to the preferences that are revealed assuming that link deletion is bilateral not unilateral – dissolving links requires the consent of the other party. Therefore when a link is not observed in the data, both sides reveal preferred not to form this link. 4 only on her local k-subnetwork, which consists of agents that are reachable from her in k or fewer number of links (k can be varied). Now within the scope of this structure, preferences can depend however arbitrarily and non-parametrically on the observed network characteristics, which can include the network structure, demographics, or any interaction of the two. Under this particular preference structure, I show that there is an efficient algorithm that computes this measure of stability. This is due to the theoretical result obtained before: we only have to check for cycles of length 2 in one particular profile of revealed preference relations (among many possible). Now this cycle can be understood as follows: take a pair of agents who reveal preferred to form a link with each other, but there exists another pair of similar agents who reveal preferred not to form a link with each other. In turn, similarity is operationalized by color-preserving graph isomorphism. Using this measure, I examine the empirical distribution of stability of a cross-section of 75 social networks in Banerjee et al. (2013). This measure unearths considerable variations in stability across different villages in the dataset. Moreover this variation in stability is found to be associated with the diffusion of microfinance in these rural Indian villages. Banerjee et al. (2013) show that how central the position of first-informed agents (injection points) in a network plays an important role in the diffusion of microfinance. I show that when stability is taken into control, the network centrality of the injection points matters even more.3 In their findings, one standard deviation increase in the diffusion centrality (a measure how central individuals are in their social network with regard to spreading information) of injection points would lead to an increase in the eventual adoption rate of 3.9 percentage points. I find that this result depends significantly at how stable the underlying network structure is, particularly that of advice networks. For instance, at the 90th, 75th, and 25th percentile stability, the corresponding magnitudes are 7.2-8.1, 5.9-6.1, and 1.3-2.0 percentage points increase in the adoption rate. Interestingly, stability by itself has a small but negative effect on diffusion. A possible explanation is that random meetings of households helps increase diffusion, and stable networks lack these random, non-network-based meetings. Finally, I examined how this measure performs when networks are ran3 Specifically the network centrality of injection points is more strongly and positively correlated with eventual adoption rate when network stability is high. 5 domly generated instead, following the canonical Erdös-Rényi random graph model. To ensure that our measure of stability is not merely an artifact of other commonly used network properties such as the density and average degree of a network, I define the excess stability of a network to be its observed stability benchmarked against the expected stability of a randomly generated network having the same network property. The empirical findings are robust to measuring stability using excess stability. 1.1. Related literature This is the first paper to characterize the empirical restriction of pairwise stability in networks. It belongs to the broader agenda of revealed preference in economics, championed by Afriat (1967); Richter (1966); Samuelson (1947, 1938); Varian (1982). They show that when preferences are unobserved, the empirical restriction of utility maximization is captured by GARP. Their works led to extensive empirical applications using GARP as a basis for measuring rationality (i.e. consistency with utility maximization). Notably, Andreoni and Miller (2002); Blundell et al. (2003); Choi et al. (2014); Echenique et al. (2011); Harbaugh et al. (2001). For surveys on the subject, see Crawford and De Rock (2014); Varian (2006). Beyond the setting of individual decision-making, the revealed preference approach has also been applied to other multi-agents (strategic or otherwise) settings. This paper extends to the network setting, although the desideratum is microfounding measures based on revealed preference characterizations, which is not the objective in these papers. For example, Brown and Matzkin (1996) give a revealed preference characterization of Walrasian equilibria; Carvajal et al. (2013) for Cournot equilibria; Echenique (2008); Echenique et al. (2013) for stability of two-sided matching models; Chambers and Echenique (2014) for bargaining equilibria; Cherchye et al. (2007, 2011); Chiappori (1988); Sprumont (2000) for collective rationality. 2. I LLUSTRATIVE EXAMPLES Before I formally introduce the general setup, the examples in this section help illustrate the heart of the matter. Consider the social network g depicted in Figure 1 below. From observing a link (friendship) between agents 6 a and b, we say that agents a and b both reveal preferred the network in Figure 1 to the network in Figure 2. Notationally, the statement “agent a reveals preferred the network g to g − ab” is represented by g a g − ab. Now we can repeat the same step and define revealed preference this way for all pairs of agents who are linked in Figure 1. a b a b a’ b’ a a’ b’ and b Figure 2: Network g − ab Figure 1: Network g Now from observing no link between agents a0 and b0 , we say that either agent a0 or b0 (or both) reveal preferred the network in Figure 3 to the network in Figure 4. While forming a link requires the consent of both sides (bilateral), dissolving a link is unilateral. Therefore when we do not observe a link, we are unsure as to which side prefers not to form the link. a b a b a’ b’ a0 a’ b’ Figure 3: Network g or b0 Figure 4: Network g + a0 b0 The question we are interested in is: does there exist some underlying preferences that could induce these revealed preferences? If we can find such preferences, then we say the network g in Figure 1 is observed to be stable, or equivalently, rationalizable as pairwise stable. This gives us a binary measure of stability that depends only on the data, and not on unobserved preferences. 7 2.1. Network g is not rationalizable as pairwise stable The network g in Figure 1 (in fact any network) is rationalizable as pairwise stable.4 But under the following structure on preferences, it is not rationalizable as pairwise stable. Let an agent’s degree in g be her number of links (friends) in the network g. For this example (and this example only), assume that preferences are as follows: there is a ranking of degrees d0 , d1 , d2 , . . . , such that everyone agrees having d0 friends is better than d1 friends, and better than d2 friends and so on.5 From Figures 1 and 2, the degrees of agent a in the networks g and g − ab are 2 and 1 respectively. Therefore the revealed preference g a g − ab implies: agent a reveals preferred to have 2 friends to 1. On the other hand, we see in Figures 3 and 4 that the degrees of agent a0 in the network g and g + a0 b0 are 1 and 2 respectively. However we cannot be sure if agent a0 reveals preferred to have 1 friend than 2. Looking again at Figure 3 and 4, the degrees of agent b0 in the network g and g + a0 b0 are also 1 and 2 respectively, just like agent a0 . Therefore even when revealed preferences of agents a0 and b0 are indeterminate, we can still say that there is some agent (either of a0 or b0 ) who reveals preferred 1 friend to 2; yet from the previous paragraph, we found an agent a who reveals preferred 2 friends to 1. Since agents agree on a preference ranking over degrees, we obtained a cycle of length 2 in the revealed preference relation – which implies that network g in Figure 1 is not rationalizable as pairwise stable. 2.2. Rationalizable networks Given the restrictive assumption on preferences in this example, one can still find networks that are rationalizable as pairwise stable. For example, all star networks with more than 3 agents (Figure 5) are rationalizable as 4 To see this, the revealed preference relation we constructed is acyclic, there is therefore an underlying preferences (complete and transitive) that extend or induce them. 5 For instance, this assumption is satisfied when the utility that agent i derives from network g is ui ( g) = di ( g) − cd2i ( g), where di ( g) is the degree of i in g, and c is the cost parameter that is homogenous among all agents. 8 Figure 5: All star networks are rationalizable as pairwise stable. pairwise stable. A preference that rationalizes star networks is as follows: agents prefer fewer friends than more friends below a threshold degree; and agents prefer more friends above a threshold degree. Moreover agents also prefer to have at least one friend to none. 2.3. Sufficient condition Thus far, I have illustrated the intuition behind the necessary condition for a network to be rationalizable as pairwise stable:– check for cycle of length 2 in the revealed preference relations. Is this also sufficient? In particular, one can imagine the possibility of longer cycles, for example, when some agent (reveals) preferred d0 degrees to d1 , another prefers d1 to d2 , and another prefers d2 to d0 . However a simple argument shows that no cycles of length greater than 2 can exist without nesting a cycle of length 2. Therefore we can ignore cycle of length greater than 2. To see this, suppose there is a cycle of length 3, and without loss of generality, say that agent i (reveals) preferred d0 number of friends to d0 + 1 friends. If another agent j prefers d0 + 1 to d0 number of friends, then we obtain a cycle of length 2 in the revealed preference relation. If instead agent j prefers d0 + 1 to d0 + 2 friends, then there can be no agent who reveals preferred d0 + 2 to d0 . This is an artifact of pairwise stability, which is a conservative solution concept that requires stable network to be robust to only one-link deviation. As a result, preferences over two networks that differ in more than one link are never revealed. 2.4. Modus Operandi No doubt the assumption on preferences here is very restrictive. Ultimately, we seek a measure of stability based on more reasonable assumption on 9 preferences. The questions we will address are the following. For any given preference structure, what is the necessary and sufficient condition for a network to be rationalizable as pairwise stable? How can we turn this binary measure of stability into a continuous measure? How can we incorporate richer information about the network, such as the attributes or types of the agents? This example illustrates the modus operandi of our measures. Our measures of stability amount to counting the prevalence of cycles in the revealed preference relations. Moreover our measures will inherit two important simplifications we saw in this example. First, we only need to consider one particular profile of revealed preference relations among many possible.6 Secondly, we can ignore cycle of length greater than 2 in the revealed preference relation. In general, I will show that for a class of preference structures, these simplifications are valid. 2.5. Preview of formal result I will demonstrate the main result by applying Theorem 2 to this example, where preferences are assumed to have the structure given in Section 2.1. The theorem is more broadly applicable to other less restrictive preference structures. Let di ( g) be the degree (or the number of friends) of agent i in network g. The network g is rationalizable as pairwise stable if and only if for all pairs of agents i0 , j0 that are not linked in network g, there does not exist a pair of agents i, j such that di ( g) = di0 ( g + i0 j0 ) and d j ( g) = d j0 ( g + i0 j0 ). This condition is essentially an algorithm that decides if a given network data is observed to be stable or not. A continuous measure of stability can then be constructed, based on how severe an observed network violates this condition. To explain this result further: Suppose for network g, the condition above is violated for the pair of agents i0 , j0 , and the pair of agents i, j. Denote di for di ( g) = di0 ( g + i0 j0 ) ≥ 1 and d j for d j ( g) = d j0 ( g + i0 j0 ) ≥ 1. Since we observe no link between agents i0 and j0 , either agent i0 reveals preferred di − 1 to di number of friends, or agent j0 reveals preferred d j − 1 to d j number of friends. Otherwise they could have formed a link with each other to achieve di and d j number of friends respectively. But we also know that node i reveals preferred di to 6 This profile is uniquely defined by letting link deletion be bilateral. 10 di − 1 number of friends, otherwise he can unilaterally sever one of his links. Likewise node j reveals preferred d j to d j − 1 number of friends. Therefore regardless of the indeterminacy in whether it was agent i0 or j0 who blocked the formation of the link between them, at least one of the following length-2 cycle must be present: (i) agents reveal preferred di to di − 1 number of friends, yet di − 1 to di number of friends, or (ii) they prefer d j to d j − 1 number of friends, yet d j − 1 to d j number of friends. 3. S ETUP The primitives consist of (i) data, (ii) preferences, (iii) preference structure, and (iv) the notion of pairwise stability. 3.1. Data The data is a network g, which is an ordered tuple g = ( N, E, X ) where N = {1, 2, . . . , n} is the set of agents who can form links with each other. The set E is the set of links. It is a set listing who interacts with whom, also known as an adjacency list. For instance, {i, j} ∈ E means there is a link between agents i and j. Implicit is the assumption that the network g is undirected, i.e., if node i is linked to node j, then node j is also linked to node i. X = ( x1 , . . . , xn ) is the vector of agents’ types. The type of node i is xi ∈ X , where the type space X is finite. As an example of a network, let N = {1, 2, 3}, E = {{1, 2}, {1, 3}}, and X = ( Black, White, Black). The network g = ( N, E, X ) then describes a star (or line) network depicted in Figure 6. 1 2 3 Figure 6: g = ( N, E, X ) For ease of exposition, I will refer to the unordered pair {i, j} ∈ E as ij ∈ E, which denotes a link between agents i and j. Moreover when we say ij ∈ g, 11 we mean the link ij ∈ E in the network g = ( N, E, X ). A non-link i0 j0 ∈ / g 0 0 denotes the absence of a link between i and j in the network g. In addition, I will use g + ij to denote the network obtained by adding an undirected link between i and j in g, i.e., g + ij is the network ( N, E ∪ {i, j}, X ). Similarly g − ij is the network obtained by removing ij ∈ g. 3.2. Preferences Agents have preferences over networks. For example, an agent i might prefer the network g to g − ij, denoted as g i g − ij. When g i g − ij, then we also say he prefers to form a link with agent j in the network g. In similar veins, g i g + ij means that agent i prefers not to form a link with agent j in the network g. More formally, let G( N, X ) be the set of all networks with N as the set of nodes, and X as the profile of types.7 Unless stated otherwise, the sets N and X are fixed, and we will suppress the dependence of G( N, X ) on N and X. Now I define agent i’s preference space Gi as follows: let the pairs of networks that are one-link adjacent for agent i be Gi = {( g, g0 ) ∈ G 2 : ∃ j ∈ N, g = g0 + ij or g = g0 − ij}. That is, ( g, g0 ) ∈ Gi if and only if g, g0 differs in one link that involves agent i. Agent i’s preference relation, denoted as i , is formally a subset of Gi . It is a binary relation i ⊂ Gi that is complete and transitive. For instance, the statement ( g, g0 ) ∈i is taken to mean g i g0 , which says that agent i weakly prefers the network g to the network g0 . Moreover by definition of the preference space Gi , we can always write g i g0 as g i g ± ij, for some j. Finally let = (1 , . . . , n ) be a profile of preferences. 3.3. Pairwise Stability In Definition 3.1 below, I introduce pairwise stability, which is the equilibrium condition that no agent wants to dissolve an existing relationship, and no pair of agents want to form a link between them. Pairwise stability was first introduced in Jackson and Wolinsky (1996), and is a weak requirement that individuals have a tendency to form relationships that are mutually 7 It is convenient to think of g ∈ G( N, X ) as a colored graph, where each of the type in X represents a color. 12 beneficial and to drop relationships that are not. Other more restrictive equilibrium concepts exist, for example pairwise Nash equilibrium.8 Pairwise stability is often considered as a necessary condition for any equilibrium concept in strategic network formation (Calvó-Armengol and İlkılıç (2009)). Definition 3.1. The network g ∈ G is pairwise stable with respect to a profile of preferences if 1.) For all ij ∈ g, g i g − ij 2.) For all ij ∈ / g, g i g + ij AND OR g j g − ij. g j g + ij Implicit in this definition is the idea that while forming links is a collective and bilateral effort of two parties, it only takes one side to dissolve a link. 3.4. Preference Structure The final primitive I introduce is the notion of preference structure, as defined below. A preference structure p tells us how preferences must be related to one another. Definition 3.2. A preference structure ∼ p , is a partition or an equivalence relation on N × G 2 , such that if (i0 , f 0 , g0 ) ∼ p (i, f , g), then agent i must rank the networks f and g the same way agent i0 ranks the networks f 0 and g0 . In Remark 1, I state formally the conditions for a preference structure p to be well-defined. In words, (i0 , g0 , g0 − i0 j0 ) ∼ p (i, g, g − ij) is the restriction that agent i prefers to form a link with j in the network g if and only if agent i0 prefers to form a link with j0 in g0 . Note that we can have (i, f , g) ∼ p (i, f , g), which imposes no structure on the preference of agent i with respect to the networks f and g. As an example, we can use ∼ p to formalize the notion that agents only care about their degrees, i.e. the number of links they have. That is, let ∼ p be such that (i0 , g0 , g0 − i0 j0 ) ∼ p (i, g, g − ij) whenever di ( g) = di0 ( g0 ), where di ( g) is the degree of i in g. Therefore whenever agent i in network g has the same degree as agent i0 in network g0 , they have the same preference. Note 8 Intuitively, pairwise stable networks are robust to one-link deviations, while pairwise- Nash networks are robust to many-links deletion and single-link creation. See Goyal and Joshi (2003), Goyal (2009) for more. 13 that we are agnostic about whether more or fewer links are better, as long as individuals only care about the number of links they have. I now state formally what it means for a preference profile to satisfy the structure p. Preferences are said to satisfy the structure p if whenever p says that (i0 , f 0 , g0 ) ∼ p (i, f , g), then the preferences of agent i over networks f , g, and that of agent i0 over networks f 0 , g0 , are the same. So if agent i prefers f to g, then agent i0 also prefers f 0 to g0 . Definition 3.3. The profile of preferences = (1 , . . . , n ) satisfies the structure p if whenever (i0 , f 0 , g0 ) ∼ p (i, f , g) and f i g, then we must have f 0 i0 g0 . Similarly whenever (i0 , f 0 , g0 ) ∼ p (i, f , g) and g i f , then we have g 0 i 0 f 0 . 3.5. Observed Stability We are now ready to introduce our measure of network stability. This is done jointly in Definition 3.4 and 3.5 below. The most important criterion for a meaningful measure is that it must be formulated entirely in terms of observables, which in the current context, is the network data. Since the notion of stability depends on unobservables (such as preferences), I use the revealed preference approach to eliminate the unobservables. That is, we say that a network is observed to be stable if and only if there exists some profile of preference relations that could induce this network as pairwise stable. Strictly speaking, I give a family of measures – as our measure of stability is defined with respect to some preference structure p. In the empirical application, I will take a special instance of p, which is an unrestrictive structure on preferences. Definition 3.4. The network data g ∈ G is strictly rationalizable as pairwise stable with preferences that satisfy p if and only if there exists a profile of strict preferences = (1 , . . . , n ) such that (i) g is pairwise stable with respect to , and (ii) satisfies the structure p. Definition 3.5 (Binary measure of observed stability). The network data g is observed to be stable given the preference structure p if and only if the 14 network g is strictly rationalizable as pairwise stable with preferences that satisfy p. In Section 4.3, I will show how to implement a continuous form of this measure. We will also witness a tradeoff between how restrictive the preference structure is (more formally, how coarse the preference structure is, see Remark 1) and the power of the measure. When not enough structure is imposed on preferences, the measure has no power:– any network can be rationalized as stable. When the measure has no power, we cannot do better: we have to impose a stronger structure on preferences. Although we now have a measure that depends only on observables, the way it is defined here is not useful or constructive. In particular, to show that a network is not observed to be stable, we would have to show that every possible preferences do not rationalize the data as pairwise stable. This is not feasible by any means.9 The main task is to formulate an equivalent statement that gives an effective and operationalizable measure. The idea is to turn statement such as "There exists a black swan” into "All swans are white”. The former is existential, while the latter formulation is universal.10 3.6. Technical remarks R EMARK 1: More formally, the preference structure ∼ p is a binary relation S on G ∗ = i∈ N ({i } × Gi ), where Gi is the preference space of agent i. That is, Gi is the set of pairs of networks that are one-link adjacent for agent i, Gi = {( g, g0 ) ∈ G 2 : ∃ j ∈ N, g = g0 + ij or g = g0 − ij}. Note that G ∗ ⊂ N × G 2 . Moreover the binary relation ∼ p satisfies the following: (i) Transitivity: if (i0 , f 0 , g0 ) ∼ p (i, f , g) and (i00 , f 00 , g00 ) ∼ p (i0 , f 0 , g0 ), then it must be that (i00 , f 00 , g00 ) ∼ p (i, f , g), (ii) Reflexivity: (i, f , g) ∼ p (i, f , g), (iii) Symmetry: (i0 , f 0 , g0 ) ∼ p (i, f , g) implies that (i, f , g) ∼ p (i0 , f 0 , g0 ). In fact, I show in Lemma 2 in the Appendix that: there exists a profile of preference relations that satisfies the structure p if and only if (i, f , g) p (i, g, f ) for all (i, f , g) ∈ G ∗ . Moreover, we say that the preference structure p is less re9 Assuming agents have strict preferences over networks, then the number of distinct n ( n −1) preferences for just one individual is 2 2 factorial. 10 Chambers et al. (2010) contains in-depth discussion of the role of existential vs. universal statements. 15 strictive than p0 if p is finer than p0 (or p0 is coarser than p), that is, whenever (i0 , f 0 , g0 ) ∼ p (i, f , g), then (i0 , f 0 , g0 ) ∼ p0 (i, f , g) R EMARK 2: Definition 3.1 is the strict version of pairwise stability as defined in Jackson and Wolinsky (1996). In their definition, when two agents are indifferent to forming a link with each other, the outcome is indeterminate. This stability concept is too weak: by assuming that agents are always indifferent, any observed networks can be pairwise stable. Therefore it is not possible to distinguish between stable and unstable networks. 4. A MEASURE OF STABILITY In this section, I will consider a special measure of stability. This particular measure will be used in the empirical application that follows. Although we will focus on this measure of stability, Section 6 provides the general result that characterizes the set of networks that are rationalizable as stable under arbitrary preference structures. This measure is chosen because it has desirable theoretical properties – it is axiomatized by a structure on preferences that is not too restrictive. The networks that are stable under this measure are exactly the networks that are rationalizable as pairwise stable with preferences that satisfy a certain preference structure. Therefore we can be rigorous about exactly what this measure captures. Moreover this measure is well-suited because it has enough power for the purpose of our application (Section 7). 4.1. Graph theoretic definitions Our main measure of stability relies on the notion of graph isomorphisms and k-neighborhood. Recall that a network g ∈ G( N, X ) consists of links amongst agents in N = {1, . . . , n}. Associated with each individual i ∈ N is a type xi ∈ X . Definition 4.1 (Isomorphism). Consider networks g, g0 ∈ G( N, X ), (i) g and g0 are isomorphic if and only if there is a bijection σ : N → N such that ij ∈ g ⇐⇒ σ(i )σ( j) ∈ g0 . 16 (ii) The isomorphism σ is a types-preserving isomorphism if and only if xi = xσ(i) for all i. 1 3 1 3 2 4 2 4 Figure 7: The left network: g = {12, 13, 24, 34}, and the right network: g0 = {13, 14, 23, 24} are isomorphic. Both networks represent the same network structure. Two networks are isomorphic if they have the same network structure. That is, we can relabel and permute the identities of the nodes in one network to obtain another isomorphic network. Types-preserving isomorphism is more restrictive in that the relabeling of nodes is restricted to nodes of the same type. For example, consider the network g = {12, 13, 24, 34} depicted in the left side of Figure 7 below. Player 1 and 3’s types are both black, and player 2 and 4’s types are both white. We write this types profile as x = ( B, W, B, W ). Consider a permutation σ on the labels of the nodes such that σ(1) = 3, σ (3) = 1, σ(2) = 2 and σ (4) = 4. Then, we can see that σ( g) = {32, 31, 24, 14} = {13, 14, 23, 24} = g0 . Moreover note that the σ only permutes a node of the same color to another of the same color. Therefore, σ is a types-preserving isomorphism from g to g0 . Definition 4.2 (k-neighborhoods). The neighborhood of node i in network g is the set of nodes that i is linked to. While the k-neighborhood of node i in network g is the set of all nodes that are distance no more than k from i. Now the k-neighborhood graph of node i in network g is the subgraph induced by the nodes in the k-neighborhood of i. For simplicity, k-neighborhood will be understood as the k-neighborhood graph. More formally, Ni ( g) = {i } ∪ { j : ij ∈ g} is the neighborhood of node i in g. The k-neighborhood is recursively defined as Nik ( g) = Ni ( g) ∪ S k −1 N ( g ) . j∈ Ni ( g) j The k-neighborhood graph of node i in g, denoted as g( Nik ( g)), is the subgraph of g with Nik ( g) as the set of nodes such that i0 j0 ∈ g( Nik ( g)) if and 17 i i Figure 8: The 2-neighborhood of node i in the network on the right is the network on the left hand side. only i0 , j0 ∈ Nik ( g) and i0 j0 ∈ g. 4.2. k-stability I will now detail the main measure of stability, which I call k-stability, which is formulated entirely in terms of observables, i.e. the network data. While standard notions of stability are defined in terms of unobserved preferences. Definition 4.3 (k-similar). Node i in network g is k-similar to node i0 in network g0 if and only if there exists a types-preserving isomorphism σ from the k-neighborhood graph of i to that of i0 such that σ (i ) = i0 . Note that k-similarity is not a comparison of two nodes. It is a comparison of a node given a reference network and another node given a possibly different network. Figure 9 below gives an illustration. i i’ j j’ Figure 9: Node i in the network above is 2-similar to node i0 in the same network, but this is not true for nodes j and j0 . Even when the dotted lines are added as links to the network above, the conclusion holds. We can also replace the notion of k-similarity with any other notion of nodes similarity. Notably, we can say that node i in network g is θ-similar to node i0 in network g0 if and only if θi ( g) = θi0 ( g0 ) where θ is a vector of network 18 statistics. For instance, θi ( g) could be the clustering coefficient and network centrality of node i in network g. In fact, given that agents only care about their k-neighborhoods, k-similarity encompasses all possible network statistics. This will be discussed further in Section 4.5. Definition 4.4 (k-stability). The network g is k-stable if and only if for all ordered pairs of nodes i and j, there does not exist a non-link (i0 , j0 ) ∈ / g, 0 0 such that (i) node i in g is k-similar to node i in g + i x, and (ii) node j in g is k-similar to node j0 in g + j0 y, for some x and y.11 See Figure 10. i i’ j j’ Figure 10: This network g is not 2-stable. When a link is formed between nodes i0 and j0 (see the dotted line), node i in g becomes 2-similar to node i0 in g + i0 j0 , and node j is 2-similar to node j0 in g + i0 j0 . Now k-stability as defined here allows us to say which links and nodes are stable. Since we can decompose k-stability of a network into the stability of its links, I will use the fraction of stable nodes as a continuous measure of stability in the next section. The (binary) measure of stability above can in principle be derived on an ad hoc basis. But stability is ultimately a statement about preferences, and we need to know what such measure means exactly in terms of preferences. The following theorem axiomatizes our measure of stability: it tells us what preferences give rise to this measure. T HEOREM 1. The network g is k-stable if and only if g is rationalizable as pairwise stable with preferences that satisfy the structure p∗ in Definition 4.7. I will delay the definition along with the discussion of this preference structure to the end of this section (Section 4.5). The result here is a special case i and j not necessarily distinct, and non-links i0 x and j0 y are not necessarily distinct as well. 11 Nodes 19 of a general revealed preference characterization in Section 6. Notably, kstability, which characterizes precisely the networks that are rationalizable as pairwise stable with preferences that satisfy p∗ , is equivalent to checking for cycles of length 2 in an appropriately constructed revealed preference relation.12 4.3. Continuous Measure of Stability This section introduces a continuous form of k-stability. The k-stability score, denoted as S( g; k ) ∈ [0, 1], tells us how close the network g is to be k-stable. At the extreme end when S( g; k ) = 1, network g is k-stable in the sense of Definition 4.4. Definition 4.5. The k-stability score of a network g, S( g; k), is the fraction of nodes in g that are k-stable. A node i is k-stable if and only if there does not exist another node j and a non-link i0 j0 ∈ / g such that (i) node i in g is 0 0 k-similar to the node i in g + i x, (ii) node j in g is k-similar to the node j0 in g + j0 y for some x and y. When a node i is not k-stable, then inconsistency with pairwise stability can be resolved by having node i dissolve a link. Therefore the stability score defined here intuitively measures the fraction of nodes that should dissolve links. 4.4. Robust k-stability In Proposition 3, I show that the measure in Definition 4.4 has a worst-case running time of O(n3 ), where n is the number of nodes, assuming degrees are bounded.13 Here, I provide a weaker version of k-stability that has a 12 To illustrate the connection between k-stability and revealed preference, consider Figure 10 again. When a link is observed between agents i, j in network g, we say that they both reveal preferred g to g − ij. Now because agents i, j in g are k-similar to agents i0 , j0 in g + i0 j0 respectively, the preference structure p∗ then says that their preferences are similar too. In particular if agent i prefers g to g − ij, then agent i0 also prefers g + i0 j0 to g (and vice versa for agents j, j0 we have g j g − ij =⇒ g + i0 j0 j0 g). Now we have just constructed a cycle in the revealed preference relation of length 2, because from observing no link between agents i0 , j0 in the data g, either of agents i0 , j0 must reveal preferred g to g + i 0 j0 . 13 The maximum number of links a node is allowed to have is a constant number that does not depend on n. In practice, surveys set a constant limit on the number of links a 20 desirable robustness property, with a similar worst-case running time of O ( n3 ). Definition 4.6 (Weak k-stability). The network g is weakly k-stable if and only if for all links (i, j) ∈ g, there does not exist non-links (i0 , j0 ) ∈ / g such that (i) node i in g is k-similar to node i0 in g + i0 j0 , and (ii) node j in g is k-similar to node j0 in g + i0 j0 . While k-stability in Definition 4.4 iterates over all pairs (i, j) ∈ N × N in the outer loop, weak k-stability restricts all such pairs (i, j) to be links. Moreover in the inner loop, while the algorithm for k-stability iterates over all pairs of non-links i0 x, j0 y ∈ / g, weak k-stability only iterates over a single non-link 0 0 (i x and j y are the same). Therefore if a network is k-stable under 4.4, it is also weakly k-stable under 4.6, but the converse is not true. However the following proposition establishes a partial converse. P ROPOSITION 1. Let the network g be weakly k-stable under Definition 4.6 but not k-stable, then g is k0 -stable for all k0 > k. i’ i j’ j Figure 11: This network g is weakly 2-stable, but not 2-stable. Hence from Proposition 1, it is 3-stable. Agents i and j in g are 2-similar to agents i0 and j0 in g + i0 j0 (i.e. when the dotted link is added) respectively. However it is weakly 2-stable, since nodes i and j are not linked, and there is no other pair of linked nodes that are similar to (i0 , j0 ) (per condition in 4.6). According to this proposition, we should use weak k-stability because it is more robust to different specification of k. Henceforth Definition 4.6 will be our notion of k-stability, but Table V in the empirical section shows that respondent can report, and this limit is rarely reached. 21 using the stronger form of k-stability does not change the result. Figure 11 illustrates this. 4.5. Axiomatization The network g is k-stable (Definition 4.4) if and only if g is rationalizable as pairwise stable with preferences that satisfy the structure p∗ defined in Definition 4.9 below. I will define p∗ as composing of two parts: p1 and p2 . Definition 4.7 (Agents only care about their local network configurations). Define the preference structure, p1 , as follows. For all (i0 , g0 , g0 − i0 j0 ) and (i, g, g − ij) in G ∗ , let (i0 , g0 , g0 − i0 j0 ) ∼ p1 (i, g, g − ij) if and only if node i0 in network g0 is k-similar to node i in network g. Definition 4.7 says that (locally) observationally equivalent agents have the same preference. By construction of k-similarity, whenever agent i in g is ksimilar to agent i0 in g0 , their respective k-neighborhoods are observationally equivalent.14 More precisely, it says that for agent i in g who is k-similar to agent i0 in g0 , if i prefers to form a link in g, then i0 also prefers to form a link in g0 . If agent i finds having one fewer link preferable at g, then i0 also prefers to have one fewer link at g0 . Definition 4.8 (Agents care about their degrees sufficiently). Define the preference structure, p2 , as follows. For all (i, g + ij, g) ∈ G ∗ , let (i, g + ik, g) ∼ p2 (i, g + ij, g) for all ik ∈ / g. Similarly, for all (i, g, g − ij) ∈ G ∗ , let (i, g, g − ik ) ∼ p2 (i, g, g − ij) for all ik ∈ g Definition 4.8 says that agents can always be made better off by adopting the following simple decision-making heuristic: at any given network, an agent chooses to either form a link (any), or dissolve a link (any), or neither. Although different links have different values to the agent, but because she cares sufficiently about her degree, if at network g, the network g + ij is acceptable to her (agent i), then g + ik is also acceptable to her. 14 For instance, they have identical network characteristics. Common examples of network characteristics include (Jackson (2008)): the density of links, the average distance between nodes with various characteristics, number of cliques of given sizes and characteristics, and so forth. Formally, a network characteristics is defined to be a property that is preserved under types-preserving isomorphisms of networks. In other words, it is a structural property of the network itself, not of a specific drawing or labeling of the network. 22 This restriction is imposed in order to obtain a close-formed characterization of the networks that are rationalizable as stable. Without it, we have to consider revealed preference violations involving arbitrary off-path15 networks that are computationally infeasible to verify. Moreover as alluded in the example in Section 2, under this structure, we only need to consider one particular profile of revealed preference relations among many possible.16 In Section 6, I derive the condition for the network g to be rationalizable as pairwise stable for other preference structure p. This characterization is not computationally feasible in practice: the number of possible profiles of revealed preference relations is Ω(3n ) where n is the number of agents, and we need to check for cycles of any arbitrary length in the extension of each of these profiles.17 Definition 4.9. Let the preference structure p∗ be the supremum of p1 and p2 , that is the finest partition of G ∗ that is coarser than both partitions p1 and p2 . The structure p∗ leads to a simple measure of stability that has enough power for our empirical application (Section 7). We can choose a different structure on preferences that is less restrictive (formally p is less restrictive than p∗ if p is a finer partition of G ∗ than p∗ , see Remark 1), but the ensued measure of stability will have less power. 4.6. Discussion The measure of stability proposed here captures the extend to which network data are pairwise stable with respect to preferences that satisfy the restriction p∗ . Arguably the most demanding feature of p∗ is that it restricts preferences to depend only on observed network configurations (kneighborhood), with no agent-level idiosyncracies. Therefore we can inter15 Off-path networks are those that cannot be obtained from the observed network with just one addition or deletion of a link, they are more than one link away from the observed network. 16 This profile is uniquely defined by letting link deletion be bilateral. Moreover, we can ignore cycle of length greater than 2 in the revealed preference relation. 17 Assuming that degrees are bounded above. For every non-link ij ∈ / g in the data, there are three permutations of revealed preference relations, either i or j or both prefer not to form the link ij. 23 pret this measure as the amount of stability that can be attributed to observed network structures. This interpretation of stability is unavoidable if we want to infer stability from a single network. The empirical section that follows provides some evidence supporting the validity of the proposed k-stability measure. In particular, the stability measure is strongly correlated in expected ways with factors that ought to affect the stability of social and economic networks. 5. 5.1. E MPIRICAL A PPLICATION Data In this section, I present an empirical application of the stability measure described in Definition 4.5. The dataset in this section is taken from Banerjee et al. (2013). They collected detailed network data by surveying households about a wide range of interactions in 75 rural villages in India. This information was then used to create different network graphs for each village.18 I will consider 4 different types of networked relationships. Note that agents in this context refers to households. Financial network with respect to money: there is a financial relationship with regards to money between agents i and j if and only if agent i reports borrowing money from (or lending to) agent j, and agent j reports lending money to (or borrowing from) i. Similarly, I construct networks of financial relationships with regards to borrowing and lending rice or kerosene. Advice network: there is an advice relationship between agents i and j if and only if agent i reports receiving advice from (or giving to) agent j, and agent j reports giving advice to (or receiving from) i. Social networks are constructed in similar manner.19 In addition to network data, Banerjee et al. (2013) also collected various household characteristics that are then matched to nodes in the various net18 More precisely, individuals are surveyed and a relationship between households exists if any household members indicated a relationship with members from the other household. Although surveyed individuals could name nonsurveyed individuals as friends, relatives, etc., such links are omitted unless both individuals were surveyed. 19 Social network: there is a social relationship between agents i and j if and only if agent i reports visiting the house of (or receiving visits from) agent j, and agent j reports receiving visits from (or visiting the house of) agent i. 24 works. Although our measure of stability can incorporate these attributes, I will focus on measuring stability as if only the network structure is observed, that is, I compute the k-stability score for the observed network without node attributes. There are two reasons for measuring the stability of just the network structure. First, many network data are purely relational – they only say who is linked to whom.20 This represents the most minimal network data that our measure of stability can be applied to. Secondly, not using node attributes in computing stability gives us more power (Section 7). If every node has a different attribute, then we can never tell if a network is not stable. However the empirical findings in this paper are robust to incorporating household indicators. For instance, I repeat the empirical exercise using the household variable indicating whether a household contains a leader (e.g., teachers, shopkeepers, savings group leaders). 5.2. Descriptive result There is considerable variation in k-stability scores across villages for a given network type. For example, Figure 12 plots the histogram of observed stability for the cross section of 75 village networks. Our measure of stability could be picking up basic aspects of network, such as the total number of nodes and links. Section 7 shows that the expected stability of a randomly generated network varies considerably depending on the parameters n, p, where n is the number of nodes and p is the network density. In order to benchmarked the k-stability score of a network, I construct the excess k-stability. Definition 5.1. The excess k-stability of a network g is calculated by subtracting the expected k-stability score of an Erdös-Rényi random network21 with the same parameters as g, from the k-stability score of g. Figure 13 shows how excess stability varies across villages and network types. One thing to note is the distribution of k-stability score can look 20 For instance the well-known network data on the topography of the internet in Albert et al. (1999) is a purely relational one. 21 Specifically the G ( n, p ) random network where there are n nodes and each link is formed independently with probability p 25 different from the distribution of the corresponding excess k-stability, as demonstrated by the difference between Figures 12 and 13. Advice Network 0 1 Density 2 3 Financial Network .2 .4 .6 .8 1 .2 .4 .6 .8 1 Stability Figure 12: Comparing the 2-stability scores of financial (money) and advice networks across the 75 villages. The correlation between the two is 0.232. The histogram and the estimated kernel density are shown. The histogram for 3-stability looks similar, but with mass shifted towards 1. Advice Network 0 2 Density 4 6 Financial Network -.2 0 .2 .4 -.2 0 .2 .4 Excess Stability Figure 13: Comparing the excess 2-stability of various networks types across the 75 villages. The correlation between the two is −0.067. There is a noticeable difference between the distributions of stability and excess stability. The former suggests that financial networks are more stable, while the latter suggests otherwise. 22 There is a link between agents i and j if and only if agent i reports borrowing money from (or lending to) agent j, and agent j reports lending money to (or borrowing from) i. Non-surveyed households are not included. Graph is produced using the spring algorithm. 26 123 129 74 101 148 89 137 55 111 77 115 104 149 98 68 91 107 75 92 97 112 96 84 103 146 46 59 99 78 54 106 69 155 159 67 61 57 49 130 40 151 150 28 160 134 30 34 145 37 157 25 43 14 15 29 31 23 153 44 16 144 13 94 2 58 11 22 21 5 3 120 20 135 Figure 14: Network of borrowing and lending money22 in village number 27 of Banerjee et al. (2013). Bolded links are not 2-stable, which spans 14 out of 73 households. The 2stability score of this network is 80.8%. A (Erdös-Rényi) random network with the same parameter is expected to have a 2-stability score of 83.4%. This difference is the excess 2stability. 5.3. Correlates of stability This section examines the village level predictors of stability. Specifically, I will look at how the network stability of a village is related to the average characteristics of the village. In Table I, I provide results for an ordinary least squares (OLS) regression of stability as a function of various village characteristics. The measure of stability here is the excess 2-stability (Definition 5.1), averaged across financial, social and advice networks. While k-stability is affected by n, the number of agents in a network, excess k-stability takes into control variation in stability caused by variation in n. The key village characteristics that explain stability is the number of house- 27 holds and the fraction of individuals in the village who travel outside the village for work.23 In particular, they are significantly negatively related to stability. This empirical relationship suggests that (i) larger village tends to have lower stability, and (ii) village where individuals spend more time in the village tends to have higher stability. It is plausible that in a larger village, mutually beneficial relationships are not formed because agents have limited attention (Masatlioglu et al. (2012)) and do not consider all such beneficial relationships. This could be due to the geographical spread of the village as village size increases. Recall from earlier discussion that k-stability is measure of stability attributed to the observed network structure, i.e. the extent to which network data are pairwise stable with respect to preferences that depend only on network structures. Instability then could be caused by preferences for non-network characteristics that are not captured by the network structure. The result in Table I could be explained by the fact that in forming links, agents have preferences for whether other agents travel outside the village for work. However I show in Table III at the Appendix that our result is unlikely to be driven by the restriction of preferences to depend only on network structure. I construct an alternative measure of k-stability by directly incorporating as (binary) node types whether household has members who travel outside village for work. The significant predictors in Table I remain valid, with comparable magnitudes. Moreover, another variable are now significant: the fraction of villagers who are not natives is negatively correlated with network stability. It is also possible that the variation in stability is driven by unequal measurement and sampling error across villages, where links are misreported with higher frequency in some villages.24 Although households can name others not in the survey, I omit those links involving non-surveyed households (similar to Jackson et al. (2012)). Moreover, I construct the variable “Recip” that measures the proportion of reports that are reciprocated in a 23 While the first variable is readily available in the dataset provided by Banerjee et al. (2013), the Work Outside Village variable is not. I construct this variable by calculating the fraction of respondents in a village who answer ‘Yes’ to the question ‘Do you travel outside the village for work?’ 24 Chandrasekhar and Lewis (2011) examines the biases that would arise when sampled (and missing) network data are used in regression analysis. 28 village, for instance, when household i reports borrowing from household j, how frequent does j also reports lending to i. This variable can be a proxy for measurement error in constructing village-level undirected network from survey data, but including it did not change the regression result in Table I. Finally, the stability of financial network has negligible correlation (even negative) with the stability of social network, even though the measures are calculated based on the same sampled nodes. Excess Stability Number of Households (per 1000) Work Outside Village -0.508∗∗ (0.216) -0.530∗∗ (0.226) -0.563∗∗ (0.219) -0.125∗ (0.0674) -0.260∗∗ (0.120) -0.292∗∗ (0.136) -0.310∗∗ (0.134) Average Education -1.227 (0.824) -0.437 (0.703) -1.453 (0.990) -1.446 (1.001) -0.912 (0.620) -0.139 (0.523) -1.168 (0.737) -0.975 (0.728) Fraction GM 0.0413 (0.0471) -0.0103 (0.0400) -0.0239 (0.0426) Fraction Nonnatives -0.0888 (0.0975) -0.117 (0.119) -0.102 (0.0987) 0.0239 (0.0284) 0.0170 (0.0310) (per 100) Average Age (per 100) Average No. Rooms Constant N R2 0.0961∗∗∗ (0.0226) 0.544∗ (0.282) 0.110 (0.208) 0.668∗∗ (0.326) 0.563∗ (0.319) 75 0.092 75 0.144 75 0.021 75 0.163 75 0.101 Robust standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 TABLE I The relationship of network stability with village characteristics. The network stability of a village is negatively correlated with the number of households and the fraction of villagers who travel outside the village for work. Excess stability takes into account the random variation in stability due to variation in network density and number of households. Other village-level predictors of stability that I control for are the average education level; average age of the villagers; fraction belonging to the upper two castes; fraction who are not natives of the village; and the average number of rooms owned by each person (wealth indicator). 29 5.4. Relation of stability to diffusion of microfinance In this section, I show how stability is associated with economic outcomes, specifically, the diffusion of microfinance.25 In addition to the network data, Banerjee et al. (2013) also collected data on the diffusion of microfinance in a subset of the 75 surveyed villages. Specifically, the leaders in these villages are the injection points of a newly introduced microfinance loan program. These leaders are known and mapped to nodes in the network data. Banerjee et al. (2013) show that when agents who were first-informed about a new microfinance loan program (injection point) have central position in the underlying social network, then the eventual adoption rate is greater. My main insight is that stability reinforces the importance of network position of these first-informed nodes in diffusion. In particular when stability is high, network centrality of injection points is more strongly and positively correlated with eventual adoption rate. The practical relevance is as follows: the targeting of nodes that have central positions in the network to increase the spread of a new microfinance program is more effective when the network is associated with a higher measure of stability. Moreover the result is driven primarily by the stability of advice network, which is the network type expected to be more salient for households passing on information about a new loan program. Interestingly also, stability itself has a small but negative effect on diffusion. A possible explanation is that random meetings of households helps increase diffusion, and stable networks lack these random, non-network-based meetings. 5.4.1. Details of regression result The first column of Table II replicates the regression result of Banerjee et al. (2013). The dependent variable is the microfinance take-up rate of nonleader households. The main explanatory variables introduced in Banerjee et al. (2013) are the communication and diffusion centrality of the injection 25 Potentially, the effect of stability on diffusion that we obtained here is applicable for diffusion of information on networks in a broader sense. Galeotti et al. (2010); Jackson and Yariv (2007, 2010). 30 points, which are measures of how central individuals are in their social network with regard to spreading information. We can see in Table II that the diffusion centrality of injection points helps significantly in predicting eventual adoption of microfinance. The controls include Number of households, Savings, SHG participation, Fraction GM, and Fraction Leaders.26 In Column (2) and (3) of Table II, I introduced different measures of network stability to the Banerjee et al. (2013)’s specification. Specifically, in Column (2) of Table II, the stability of a village is taken to be the 2-stability score (see Section 4.5), averaged across the village’s four different network types: financial (money), financial (rice and kerosene), advice, and social networks. Similarly in Column (3), I take excess stability to be the excess 2-stability27 , averaged across different network types. As seen in Table II, our measures of stability added considerable explanatory power to the original model of diffusion. In all specifications, the interaction between stability and diffusion centrality is significantly positive. The main empirical finding is that when stability is high, network centrality of injection points matters more for diffusion. To get a sense of the magnitudes, using Column (3) of Table II, one standard deviation increase in diffusion centrality is associated with 6.7, 4.6, 2.1 percentage points increase in the dependent variable at 90th, 75th, 25th percentile of stability scores. Magnitudes using Column (2) are similar. In comparison to Banerjee et al. (2013), one standard deviation increase in diffusion centrality is associated with an increase of 3.9 percentage points (the average participation rate is 18.5%) Interestingly the result also suggests that network with low stability has higher diffusion. Stability by itself seems to have small but negative effect on adoption rate. One explanation is that random and non-network-based meetings of agents is important for the spread of information, and networks that are stable have less random meetings of agents. Therefore our result suggests that for villages with higher stability, network-based diffu26 Savings: Fraction of households engaging in formal savings; SHG participation: Fraction of households participating in a self-help group; Fraction GM: Fraction of households that are not from the scheduled castes and scheduled tribes: groups that historically have been relatively disadvantaged; Fraction Leaders: Fraction of households that are leaders 27 The excess k-stability score is the k-stability score in excess of a randomly generated network with the same network parameters (see Section 7). 31 sion is more prevalent; while for villages with lower stability, random nonnetwork meetings drive diffusion. Note that the result here did not change when I included the significant predictor of stability from Table I ("Fraction Work Outside"). 5.4.2. Robustness In Table V of the Appendix, I repeat the same exercise but focusing instead on advice networks. Here, I also verify if the result is sensitive to different ways of defining stability. The interaction between stability and diffusion centrality is significantly positive in all 7 columns, showing that stability influenced the marginal effect of diffusion centrality. In columns (1) and (2), stability is defined as 2 and 3-stability. In column (3), node characteristics are incorporated into the computation of 2-stability, specifically the variable “leader" that indicates if a household contains a leader. In column (4), I use the stronger version of 2-stability (see Section 4.4). In similar veins, columns (5), (6), (7) are the excess stability counterparts to columns (1), (2) and (3). The magnitudes are comparable across specifications. 32 (1) Microfinance participation rate 0.0222∗∗∗ (0.00690) Diffusion Centrality (2) (3) -0.0886∗ (0.0470) 0.0165∗∗∗ (0.00594) Stability -0.897∗∗∗ (0.323) Stability × Diffusion Centrality 0.143∗∗ (0.0583) Excess Stability -1.275∗∗∗ (0.391) Excess Stability × Diffusion Centrality 0.230∗∗∗ (0.0734) -0.000540∗∗∗ (0.000163) -0.000472∗∗∗ (0.000169) -0.000507∗∗∗ (0.000157) Savings -0.244∗ (0.138) -0.205∗ (0.118) -0.252∗ (0.138) SHG Participation -0.228 (0.200) -0.179 (0.180) -0.165 (0.199) Fraction GM -0.0354 (0.0396) -0.0280 (0.0395) -0.0410 (0.0404) Fraction Leaders 0.270 (0.457) 0.301 (0.442) 0.222 (0.425) 43 0.442 43 0.511 43 0.542 Number of Households N R2 Robust standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 TABLE II The dependent variable is the microfinance take-up rate of non-leader households. The first column reproduces exactly the result of Banerjee et al. (2013). (Excess) stability here is defined as the average (excess) 2-stability scores across all network types. Excess k-stability is the k-stability score of a network in excess of a randomly generated network with the same parameter. Using Column (2), one standard deviation increase in diffusion centrality is associated with 6.8, 5.6, 3.0 percentage points increase in the dependent variable at 90th, 75th, 25th percentile of stability scores. Using Column (3), one standard deviation increase in diffusion centrality is associated with 6.7, 4.6, 2.1 percentage points increase in the dependent variable at 90th, 75th, 25th percentile of stability scores. 33 6. G ENERAL R ESULTS In this section, I lay out the revealed preference framework that leads us to the general result, of which Theorem 1 in the previous section was a special case. The question to keep in mind is: under what necessary and sufficient conditions on the primitive (the network data g), can g be rationalized as pairwise stable with preferences that satisfy the structure p. Just as Samuelson (1938) defined revealed preference from a consumption data set, we will now define how preferences are revealed from an observed network. If a link between agents i and j is observed in the network g, then we say that agents i and j both reveal preferred to have this link to not having this link in g. On the other hand, if we do not observe a link between agents i and j in the network g, then it is revealed that either agent i or agent j prefers not to form this link in g. More formally, Definition 6.1. For a given network g, let R = ∏i∈ N iR be a profile of revealed preference relations defined as follows: 1.) For all ij ∈ g, g iR g − ij AND 2.) For all ij ∈ / g, g iR g + ij OR g Rj g − ij. g Rj g + ij This definition does not pin down a unique revealed preference relation (due to part 2.) of the definition). Now let R( g) be the set of all such profiles of revealed preference relations derived via Definition 6.1 above. The set R( g) contains all the possible revealed preference relations that are induced by the data g. Definition 6.2. The network g ∈ G is rationalizable as pairwise stable if and only if there exists a profile of strict preferences that extends some profile of revealed preference relations in R( g). In classical revealed preference theory, the revealed preference relation is uniquely defined. In our setup, rationalization only requires that we can find just one revealed preference relation in R( g) that can be extended into a complete and transitive relations. P ROPOSITION 2. Any data are rationalizable as pairwise stable. This result can be shown graphically, and the method used will recur through- 34 out the paper. Recall that agent i’s preference space Gi was defined as follows: Gi = {( g, g0 ) ⊂ G 2 : ∃ j ∈ N, g = g0 + ij or g = g0 − ij}, i.e. the pairs of S networks that are one-link adjacent for agent i. Now let G ∗ = i∈ N (Gi × {i }) the space of preferences. The space G ∗ can be graphically represented as follows: define the graph P on G × N such that there is a link between two nodes gi , gi00 ∈ G × N whenever ( g, g0 ) ∈ Gi (note that ( g, g0 ) ∈ Gi implies ( g0 , g) ∈ Gi ). The graph P consists of components that look like Figure 15, and a different component is obtained by varying i ∈ N and g ∈ G . A generic link in this graph P, say { gi , ( g − ij)i }, represents the preference of agent i over the two networks g and g − ij, or in another words, i’s preference in forming the link ij ∈ g. Formally, Definition 6.3. The preference graph P, is a graph on G × N such that: for all gi , gi00 ∈ G × N, there is a link between gi and gi00 if and only if i = i0 and there exists j ∈ N such that g0 = g + ij or g0 = g − ij. gj gi ( g + i j)i ( g ° ik) i ( g + i j ° ik) i ( g + i j) j ( g + jl ) j ( g + i j + jl ) j Figure 15: The graph P represents the preference space G ∗ Now every profile of (strict) preferences can be represented as an acyclic orientation of the graph P.28 For instance, g i g − ij is symbolized by an arrow pointing from gi to ( g − ij)i in P. Therefore, the task of rationalization is equivalent to finding an acyclic orientation of P that respects pairwise stability. The graph P is undirected – any link {v, w} in P is unordered. Now an orientation of an undirected graph is an assignment of a direction to each link or edge, turning the initial graph into a directed graph. That is replacing each undirected link {v, w} with an ordered pair, either (v, w) or 28 There is then a bijection from the set of all acyclic orientation of P to the set of all strict profiles of preference relations. 35 (w, v). An orientation is acyclic if it does not contain any directed cycles.29 For a given data g, we can represent a profile of revealed preference relations R ∈ R( g) as a partial orientation of the graph P. That is, whenever g iR g0 , we assign a direction from gi to gi0 , as illustrated in Figure 16. Denote P( R ) as this partial orientation of the graph P according to R . gj gi ( g + i j)i ( g ° ik) i ( g + i j ° ik) i ( g + i j) j ( g + jl ) j ( g + i j + jl ) j Figure 16: Representing revealed preference relations in P Proposition 2 is then a statement that we can find an acyclic orientation that completes or extends the partial orientation30 P( R ) for some R ∈ R( g). This comes from the fact that given a network g, any revealed preference relation iR induced from the data is acyclic for all i, and therefore the partially oriented graph P( R ) has no directed cycles. If a partially oriented or partially directed graph has no directed cycles, then it is known that there is an acyclic orientation that extends it, by Szpilrajn extension theorem.31 Lemma 1 below gives a procedure for checking if a given data can be rationalized with preferences that satisfy an arbitrary structure p. The procedure starts by taking a profile of revealed preference relations and extending it according to the structure imposed by p. To this end, I define an extended revealed preference relation in Definition 6.4. First let R ∈ R( g) be a profile of revealed preference relations induced by the data g (see Definition 6.1). Let p be a preference structure per Definition 3.2. The p-extension of R takes R and extends it according to the structure imposed by p. 29 i.e. distinct nodes v 0 , . . . , vk −1 such that ( v0 , v1 ), ( v1 , v2 ), . . . , ( vk −2 , vk −1 ), ( vk−1 , v0 ) are present in the directed graph. 30 A partial orientation only assigns directions to some links, and an orientation extends a partial orientation if it only assigns directions to all the undirected links. 31 any transitive and asymmetric relation has a transitive, asymmetric and complete extension. 36 Definition 6.4. Let R = ( R1 , . . . , Rn ) be a profile of binary relations. The pextension of R is another profile of binary relations R̃ = ( R̃1 , . . . , R̃n ) such that f R̃i g if and only if (i0 , f 0 , g0 ) ∼ p (i, f , g) and f 0 Ri0 g0 . Lemma 1. The data g is rationalizable as pairwise stable with preferences that satisfy the structure p, if and only if there exists a profile of revealed preference relations R ∈ R( g) such that the p-extension of R is acyclic.32 gj gi ( g + i j)i ( g ° ik) i ( g + i j ° ik) i ( g + i j) j ( g + jl ) j ( g + i j + jl ) j Figure 17: Extended revealed preference relations with a cycle The Lemma is illustrated in Figure 17, which extends the revealed preference relations in Figure 16. Although revealed preference relations are always acyclic (see Proposition 2), cycles could arise in the extended revealed preference relations. As can be seen in Figure 17, p restricts the preferences colored in blue to be aligned, i.e. g i g + ij ⇐⇒ g + jl j g + jl + jk, hence the direction of the double arrow. Vice versa for the links colored in red. Note that this extended revealed preference relation can be very incomplete in general, such as when we take p to be a very fine partition. 6.1. Tractable characterization Lemma 1 gives an algorithm to check if an observed network is rationalizable as pairwise stable, given any structure function p we have in mind. This algorithm terminates in finite steps because the set of revealed preference relations R( g) is finite. However in practice, such as when the data consist of large networks, this algorithm is not tractable. It turns out that, 32 Given a profile of binary relations = (i , . . . , n ), we say that is acyclic if i is acyclic for all i ∈ N. 37 for the class of preference structure we are interested in, this procedure can be simplified remarkably. This is the main result of the section. In addition to the conditions in Remark 1 for the preference structure p to be well-defined, we will assume for simplicity that all structure p in this paper will be specified in such a way that whenever (i0 , f 0 , g0 ) ∼ p (i, f , g), then (i0 , f 0 , f 0 − i0 j0 ) ∼ p (i, f , f − ij) for some j, j0 . Axiom 1. The preference structure p satisfies Axiom 1 if and only if (i) (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ) implies that for all k ∈ Ni ( g) there exists a k0 ∈ Ni0 ( g0 ) such that (i, g, g − ik ) ∼ p (i0 , g0 , g0 − i0 k0 ). (ii) (i, g + ij, g) ∼ p (i0 , g0 + i0 j0 , g0 ) implies that for all k ∈ / Ni ( g) there exists 0 0 0 0 0 0 / Ni0 ( g ) such that (i, g + ik, g) ∼ p (i , g + i k , g0 ). ak ∈ Axiom 1 says the preference structure p can only restrict the preferences of agent i in g, and agent i0 in g0 to be similar only when agent i in g treats all his links (and non-links) the same way as agent i0 treats all his links (and non-links) in the network g0 . From any structure p, we can always refine p to obtain a structure p∗ that satisfies Axiom 1 such that p∗ is less restrictive than, and more general than p. This follows from the fact that a preference structure p is a partition of S the space G ∗ = i∈ N ({i } × Gi2 ), see Remark 1. Let p be a preference structure that satisfies Axiom 1. T HEOREM 2.A. The data g is rationalizable as pairwise stable with preferences that satisfy p, if and only if there exists a profile of revealed preference relations R ∈ R( g) such that the p-extension of R has no cycle of length 2.33 T HEOREM 2.B. The data g is rationalizable as pairwise stable with preferences that satisfy p, if and only if for all (ordered) non-links (i0 , j0 ) ∈ / g, there does not 34 exist links (i, j) ∈ g and (k, l ) ∈ g such that (i, g, g − ij) ∼ p (i0 , g + i0 j0 , g) and (k, g, g − kl ) ∼ p ( j0 , g + i0 j0 , g). All examples of preference structure given in this paper so far satisfy Axiom 1, including the structure p∗ that was the subject of the main section, and the 33 The profile of binary relations = (i , . . . , n ) has no cycle of length 2 if and only if i has no cycle of length 2 for all i ∈ N. 34 { k, l } is not necessarily distinct from {i, j } 38 example given in Section 3.4 where p restricts preferences to depend only on agents’ degrees. However, Example 1 below does not satisfy Axiom 1. A particular class of preference structures satisfies Axiom 1: – the structure p such that (i, g, g − ik ) ∼ p (i, g, g − ij) for all g ∈ G , i ∈ N, and ij, ik ∈ g.35 That is, every agent treats all her links the same, and all her non-links the same. For instance, if agent i only cares about the number of friends she has, then all links have the same value to her. If agent i prefers to form the link ij ∈ g, then she also prefers to form the link ik ∈ g. E XAMPLE 1 (Preferences only depend on types): (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ) if and only if xi = xi0 and x j = x j0 . Agent i preference for forming a link with j depends only on the type of i and the type of j. There is no network effect, individuals only care about having (or not having) a particular type of friends 6.2. Bilateral link deletion I now present a characterization that is equivalent to Thereom 2, but involves checking just one particular revealed preference relation, instead of checking every profiles of revealed preference relations as in Lemma 1. This unique profile of revealed preference relations is described in Definition 6.5 below. Definition 6.5. For a given network g, let R = ∏i∈ N Ri be the unique profile of revealed preference relations defined as follows: 1.) For all ij ∈ g, g Ri g − ij 2.) For all ij ∈ / g, g Ri g + ij AND AND g R j g − ij. g R j g + ij And define R p to be the profile of extended revealed preference relations obtained by the p-extension of R (see Definition 6.4). This revealed preference relation is obtained by inferring that when a link is not observed between two individuals, both sides do not want to form this link. That is we can treat link deletion as bilateral instead of unilateral. I shall now show that for preference structure p that satisfies Axioms 1, there 35 And similarly, (i, g + ik, g) ∼ p (i, g + ij, g) for all g ∈ G , i ∈ N, and ij, ik ∈ /g 39 is a tractable equivalent to Lemma 1 that uses only the unique revealed preference relation described above. Corollary 1: The data g is rationalizable as pairwise stable with preferences that satisfy p, where p satisfies Axiom 1, if and only if the p-extended relation R p (Definition 6.5), does not have joint cycles of length 2 of the form: p p p p (i) g Ri ( g + ij), but ( g + ij) Ri g, (ii) g R j ( g + ij), but ( g + ij) R j g 7. C OMPARING O BSERVED S TABILITY TO T HAT IN A R ANDOM N ETWORK The stability score of a network must be appropriately benchmarked. The benchmark I use is the stability of a randomly generated network. In this section, I will examine the stability of the canonical Erdös-Rényi random graph model. More specifically, I consider the G (n, p, λ) random graph model, where a network with n number of nodes is randomly generated as follows: each link is included in the network with probability p (independently). With probability λ, a node is assigned as type 0, and with probability 1 − λ, a node is assigned as type 1. We are interested in the expected stability function E(n, p, λ; k), which tells us on average, what is the k-stability score for a network randomly drawn from the random graph model, G (n, p, λ). As defined in Section 4.3, the kstability score of a network is the proportion of its nodes that whose links are k-stable. For instance, if E(n0 , p0 , λ0 ) = 0.4, then we expect a network randomly generated from G (n0 , p0 , λ0 ) to have a k-stability score of 0.4. Therefore the score of 0.4 is the appropriate stability benchmark for an observed network with n0 number of nodes, density of p0 , and λ0 proportion of type-0. In this section, we will focus on k-stability with k = 2, 3, but the same analysis can be carried out for different k-stability scores. Definition 7.1. Let S( g; k) be the k-stability score of the network g as defined in 4.3. Let n, p, λ be the number of nodes, the density, and the proportion of type-0 in the network g respectively. The excess k-stability of a network g is S( g; k) − E(n, p, λ; k ) 40 1 Expected Stability 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.5 1 1.5 2 2.5 3 3.5 4 4.5 (n − 1) · p Figure 18: The simulated expected 2 and 3-stability can be accurately described by an inverted Gaussian function that depends only on the average degree (n − 1) p. That is, the curves above are obtained by fitting ( n −1) p − β 2 ) ] to the simulation points. The curve for expected 1 − α · exp [−( γ 3-stability lies strictly on top. First I fix λ = 0 (or equivalently λ = 1), that is, assuming that nodes do not have different types or characteristics. Although I am not able to analytically characterize the expected stability function, Figure 18 shows simulation evidence that the expected stability of a random network depends only on the product (n − 1) p, which is the average degree of the network, i.e. how many links a node has on average. It is known that (n − 1) p characterized certain phase transitions of Erdös-Rényi random networks.36 Moreover as can be seen in Figure 18, the simulated expected stability can be described by an inverted Gaussian function remarkably well. That is, 2 ( n −1) p − β . E(n, p) = 1 − α · exp − γ Ultimately we wish to know the general shape of the expected stability function, with λ ∈ (0, 0.5]. In Figure 19, I plot a non-parametric smoothed surface over the simulated expected stability. This is obtained by first simulating the expected stability over a range of values for n, p, and λ; and 36 For instance, when the average degree is less than one, the network consists of many components that are small relative to the number of nodes, and when the average degree is greater than one, there is a unique giant component. Expected Stability 41 1 0.9 0.8 0.7 0.6 0.4 0.5 0.2 λ 0.4 1 2 3 0 4 (n − 1) · p Figure 19: Non-parametric smoothed surface (Loess Curve) of the simulated expected stability function E(n, p, λ). The function E(n, p, λ) is the expected 2-stability score of a network randomly drawn from G (n, p, λ). Stability is expected to be lowest around (n − 1) p = 1, which corresponds to the wellknown phase transition for the rise of a giant component in Erdös-Rényi random networks. As λ increases, stability is expected to increase. then secondly, computing the local, non-parametric regression known as the LOESS Curve. Again the non-parametric plot of the simulated expected stability in Figure 19 is suggestive of an inverted multivariate Gaussian function. That is, I obtain an R2 of 0.96 when I fit to the simulated the inverted Gaus data, 2 ( n −1) p − β , where the sian function of the form E(n, p, λ) = 1 − α exp − γ height and location of the peak (α and β respectively) is a linear function of λ.37 Figures 18 is obtained by first drawing 1000 pairs of (n, p) uniformly from n ∈ {40, . . . , 70}, and p such that np ∈ [0, 4.5], then for each parameter (n, p), 10 random networks are drawn from G (n, p) for a total of 10,000 networks. The 2 and 3-stability scores are then computed for each of these networks. To obtain Figure 19, I simulate the expected stability function at 10,000 distinct values of (n, p, λ) drawn uniformly from n ∈ {40, . . . , 70}, λ ∈ [0, 0.5], and p such that np ∈ [0, 4.5]. For each (n, p, λ), I draw 10 ran37 There is however a caveat: the Gaussian function appears only to match the nonparametric surface when λ is not too close to the boundary of 0 and 0.5, and (n − 1) p not too close to the boundary of 0. 42 dom networks from G (n, p, λ), and I then compute the 2-stability score for each randomly drawn networks (100,000 in total). 8. A DDITIONAL TABLES Excess Stability -0.513∗∗∗ (0.180) -0.622∗∗∗ (0.183) Work Outside Village -0.202∗∗∗ (0.0623) -0.262∗∗ (0.128) -0.282∗∗ (0.129) Fraction Nonnatives -0.142∗ (0.0722) -0.174∗∗ (0.0756) -0.157∗ (0.0877) Average No. Rooms 0.0403∗ (0.0220) 0.0327 (0.0237) Average Age -1.164∗ (0.663) Number of Households (per 1000) -0.146 (0.0947) 0.0314 (0.0240) 0.0273 (0.0210) 0.0181 (0.0232) -0.951 (0.643) -0.852 (0.617) -0.478 (0.489) -0.196 (0.507) -0.551 (0.760) -0.544 (0.807) -0.642 (0.814) 0.288 (0.618) 0.364 (0.642) 0.0469 (0.0415) 0.0319 (0.0455) 0.0314 (0.0445) 0.102∗∗∗ (0.0381) 0.0910∗∗ (0.0419) 0.141∗∗∗ (0.0452) 0.533∗ (0.287) 0.418 (0.278) 0.317 (0.261) 0.159 (0.185) 0.00576 (0.188) 75 0.197 75 0.276 75 0.183 75 0.161 75 0.206 75 0.100 Average Education (per 100) Fraction GM N R2 -0.276∗∗ (0.124) -0.165∗∗ (0.0780) (per 100) Constant -0.658∗∗∗ (0.187) Robust standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 TABLE III An alternative measure of excess 2-stability is constructed by directly incorporating as (binary) node types whether household has at least two members who travel outside village for work (or one if the household consists of one member). Variables are explained in Table I. 43 (1) No. of Households (per 1000) (2) (3) Excess Stability (4) (5) -0.549∗∗ (0.217) Average degrees 0.0161 (0.0217) -0.0494 (0.0372) -0.0252∗ (0.0146) -0.0187 (0.0207) 0.415∗∗∗ (0.151) Average clustering 1st Eigenvalue 0.568∗∗∗ (0.199) 0.00718 (0.0124) 2nd Eig. Stoch. Adj N R2 (7) 0.0564 (0.464) Average distance Constant (6) 0.00926 (0.0249) -0.620 (0.610) -0.958 (0.644) 0.0732∗∗∗ (0.0202) -0.0203 (0.0581) 0.140∗∗ (0.0659) -0.0426∗ (0.0246) -0.00969 (0.0560) 0.634 (0.601) 1.050 (0.697) 75 0.062 75 0.012 75 0.052 75 0.121 75 0.005 75 0.016 75 0.192 Standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 TABLE IV Correlation of excess 2-stability with other network measures, at the village-level. All measures are averaged over the four types of networks. These network measures are the ones used in Banerjee et al. (2013). Restricting to the subsample of 43 villages where diffusion data are available, the correlation between stability and diffusion centrality of injection points are 0.1762. 43 0.529 1.107∗∗∗ (0.316) TABLE V 43 0.583 1.191∗∗∗ (0.281) 43 0.531 1.323∗∗∗ (0.340) 43 0.525 1.109∗∗∗ (0.325) 43 0.572 0.723∗∗ (0.280) 43 0.550 0.846∗∗∗ (0.275) 43 0.553 0.870∗∗∗ (0.303) 0.204∗∗ (0.0806) -1.107∗∗∗ (0.376) 0.0141∗∗ (0.00527) (7) The dependent variable is the microfinance take-up rate of non-leader households. All stability scores are measured with respect to the Advice Network. Demographic controls included as in Table II. In columns (1) and (2), stability is defined as 2 and 3-stability; in column (3), 2-stability is computed by further incorporating node characteristics indicating whether household contains a leader; in column (4), the stronger version of 2-stability is used (see Section 4.4). Columns (5), (6) and (7) are excess stability counterparts to (1), (2), and (3). Magnitudes are comparable with Table II. For column (5), one standard deviation increase in the diffusion centrality is associated with 7.2, 5.9, 1.3 percentage points increase in the dependent variable at 90th, 75th, 25th percentile of stability scores. Using column (6), the numbers are 8.1, 6.1, and 1.7; and for column (7), the numbers are 8.1, 6.7, and 2.0. For column (1): 6.6, 5.3 and 1.8; for column (2): 8.3, 5.8, and 1.2; for column (3): 7.9, 6.2 and 2.4; for column (4): 6.5, 5.1 and 2.2. Robust standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 N R2 Constant 0.188∗∗ (0.0707) 0.155∗∗ (0.0577) Excess Stability × Diffusion Centrality 0.123∗∗ (0.0580) 0.0175∗∗∗ (0.00548) (6) -0.822∗∗∗ (0.295) 0.164∗∗ (0.0776) -0.725∗∗∗ (0.265) 0.00370 (0.00795) (5) -0.890∗∗∗ (0.272) 0.200∗∗∗ (0.0405) 0.128∗ (0.0657) Stability × Diffusion Centrality -0.917∗∗ (0.348) -0.0446 (0.0306) (4) Excess Stability -0.954∗∗∗ (0.190) -0.760∗∗ (0.299) Stability -0.0902∗ (0.0511) -0.133∗∗∗ (0.0282) -0.0527 (0.0367) (3) (2) Diffusion Centrality (1) 44 45 (1) (6) (7) Comm. Centrality -0.695 (0.825) 0.449∗∗ (0.204) Network Measures × Comm. Centrality 11.26 (6.725) 12.17∗∗∗ (2.536) Diffusion Centrality Network Measures × Diffusion Centrality Network Measures Controls N R2 (2) (3) (4) Microfinance participation rate (5) 0.0511∗∗ (0.0212) 0.00486 (0.0515) 0.0318 (0.0724) -0.0560∗∗∗ (0.0162) 0.0506 (0.0429) -0.345 (0.228) 0.00623 (0.0199) -0.00219 (0.0156) 0.601∗∗∗ (0.131) -0.00699 (0.00971) 0.445 (3.042) -0.0739 (0.101) 0.0480 (0.0879) -3.206∗∗∗ (0.685) -0.000738 (0.0524) -0.671 (0.406) -0.847∗∗∗ (0.266) Yes 43 0.492 Yes 43 0.470 Yes 43 0.468 Yes 43 0.582 Yes 43 0.495 Yes 43 0.443 Yes 43 0.574 Robust standard errors in parentheses ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 TABLE VI Placebo test of the main specification (Table II) using different network measures in place of stability. The network measures used are (1): Number of nodes (households); (2): Average degrees; (3) Average distance; (4) Average clustering; (5): Largest eigenvalue of the adjacency matrix. Column (6) shows that although average clustering has the same effect on the dependent variable as stability, this effect is not robust when communication centrality of injection points is used instead. Communication centrality is another measure of centrality that is similar to diffusion centrality (see Banerjee et al. (2013)). Finally, Column (7) shows that our measure of stability, excess 2-stability, is robust to using communication centrality in place of diffusion centrality. 46 9. P ROOFS P ROPOSITION 3. Let n be the number of nodes in the network g. Computing the k-stability (Definition 4.4) of g has a worst-case running time of O(n3 ) assuming the degrees of nodes have an upper bound independent of n. P ROOF : The input to the algorithm is a sorted adjacency lists. The outer loop consists of iterating over all nodes i ∈ N, and requires n steps. The inner loop iterates over all non-links i0 x ∈ / g, and requires at most n(n − 1) steps. Note that non-links of a node can be identified from a sorted adjacency lists in constant time. We will then argue that within the inner loop, computing conditions (i) and (ii) in Definition 4.4 takes constant time. When degrees are bounded above by d, the k-neighborhood of any node has at most 1 + dk number of nodes. Since we have a sorted adjacency lists, it then takes constant time to identify the k-neighborhood of a node. 9.1. Proof of Proposition 1 We want to show the following: the network g be weakly k-stable under Definition 4.6 but not k-stable, then g is k0 -stable for all k0 > k. Suppose g is not k-stable, so there are nodes i, j, and a non-link (i0 , j0 ) ∈ / g 0 0 0 0 such that i in g is k-similar to i in g + i j , and j in g is k-similar to j in g + i0 j0 . Since g is weakly k-stable, it follows that for all neighbors of i, i.e. nodes k such that ik ∈ g, k in g is not k-similar to j0 in g + i0 j0 . Therefore node i in g cannot be (k + 1)-similar to node i0 in g + i0 j0 . 9.2. Proof of Lemma 2 Lemma 2. There exists a profile of preference relations that satisfies the structure p if and only if (i, f , g) ∈ / p( g, f , i ) for all (i, f , g) ∈ G ∗ . As in Proposition 2, every profile of preference relations (remember that we only consider strict preference relations) corresponds to an acyclic orientation of the graph P (as defined in Definition 6.3). Therefore we will prove this lemma by showing if the condition of the lemma is satisfied, then there is an acyclic orientation of P that respects the structure p. 47 Definition 9.1. An orientation of P is said to respect the preference structure p (formally p is an equivalence relation on G ∗ ) if whenever (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ), then the directions of the links { gi , ( g − ij)i } and { gi00 , ( g0 − i0 j0 )i0 } in P are the same. In the sense that there is an arc (directed link) from gi to ( g − ij)i if and only if there is an arc from gi00 to ( g0 − i0 j0 )i0 . We now construct an acyclic orientation of P that respects p, via a procedure called contraction: Definition 9.2. Define the p-contraction of P as follows: whenever (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ), we contract the pair of nodes gi and gi0 ; and the pair of nodes ( g − ij)i and ( g0 − i0 j0 )i0 in the graph P. Two nodes u and v are said to be contracted if we form a new graph by removing v from the original graph such that all the nodes that were linked to v are now linked to u. That is, whenever (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ), we ‘merge’ the ordered links ( gi , ( g − ij)i ) and ( gi00 , ( g0 − i0 j0 )i0 ) in P. For this contraction to be valid, this procedure cannot have the following two contradictory mergers: (1) merge the links ( x, y) and ( x 0 , y0 ), and (2) merge the links by ( x, y) and (y0 , x 0 ). This could only happen when (i, g, g − ij) ∼ p (i0 , g0 , g0 − i0 j0 ), and (i, g − ij, g) ∼ p (i0 , g0 , g0 − i0 j0 ). By transitivity of p, this implies that (i, g, g − ij) ∼ p (i, g − ij, g), which is precisely ruled out by the condition of the Lemma. Once we have a p-contraction of P, call it P/p. Observe that any acyclic orientation of P/p corresponds to an acyclic orientation of P that respects p. But for any graph, there is always an acyclic orientation. Therefore, we have shown that if the condition of the lemma is satisfied, there is an acyclic orientation of P that respects p, for any p. 9.3. Proof of Lemma 1 The condition in the lemma is clearly necessary, we will only prove the sufficiency part of the lemma. We want to show: if there exists a profile of revealed preference relations R ∈ R( g) such that the p-extension of R is acyclic, then the data g is rationalizable as pairwise stable with preferences that satisfy p First take some R ∈ R( g) such that the p-extension of R is acyclic, we know this exists by the assumption of the lemma. Call this extended re- 48 vealed preference relation R p . Now let P( R p ) be a partial orientation of P according to R p . This is a well-defined operation since R p is acyclic, so ( x, y) ∈ R p implies that (y, x ) ∈ / R p . Note that P( R p ) is a partially directed graph in P that has no directed cycles. The proof is complete once we construct an acyclic orientation that extends P( R p ) and respects p From Definition 9.2, construct the p-contraction of P( R p ), denote it as P( R p )/p. Note that P( R p )/p is a (smaller) partially directed graph with no directed cycles. By construction, any acyclic orientation that extends P( R p )/p corresponds to an acyclic orientation of P that extends P( R p ) and respects p. Since P( R p )/p is a partially directed graph with no directed cycles, there is then an acyclic orientation that extends it (due to Szpilrajn extension theorem). Therefore there is an acyclic orientation that extends P( R p ) and respects p 9.4. Proof of Necessity of Theorem 2 I will prove a slightly more general result – that is I give a necessary condition for the data g to be rationalizable as pairwise stable with preferences that satisfy any structure p. The condition is: For all (ordered) non-links (i0 , j0 ) ∈ / g, there does not exist links (i, j) ∈ g and (k, l ) ∈ g such that (i, g, g − ij) ∼ p (i0 , g + i0 j0 , g) and (k, g, g − kl ) ∼ p ( j0 , g + i0 j0 , g). Suppose by contradiction, there exists (i0 , j0 ) ∈ / g, (i, j) ∈ g and (k, l ) ∈ g 0 0 0 such that (i, g, g − ij) ∼ p (i , g + i j , g) and (k, g, g − kl ) ∼ p ( j0 , g + i0 j0 , g), and the data g is rationalizable as pairwise stable with preferences that satisfy p In particular, there is a profile of preference relations satisfying the structure p such that the data g is pairwise stable. From observing (i, j) ∈ g, we deduce that g i g − ij, and from p this implies that g + i0 j0 i0 g. Similarly, from observing (k, l ) ∈ g, we deduce that g k g − kl, and from p this implies that g + i0 j0 j0 g. However we know from observing (i0 , j0 ) ∈ / g, 0 0 0 0 that either g i0 g + i j or g j0 g + i j . Therefore cannot be asymmetric, which is a contradiction that is profile of binary relations that are asymmetric, transitive and complete. 49 9.5. Proof of Sufficiency of Theorem 2 We will show that if a preference structure p satisfies Axiom 1, then the particular extended revealed preference relations defined in Definition 6.5 cannot have cycles of length greater than 2, without also having a cycle of length 2. That is, given the data g0 , let R ∈ R( g0 ) be as defined in 6.5, then the p-extension of R cannot have cycles of length greater than 2, without also having a cycle of length 2. The proof then follows from applying Lemma 1. Consider again the graph P that represents the preference space G ∗ , per Definition 6.3. Pick the R ∈ R( g) as in Definition 6.5, and consider the p-extension of R, which we denote by R p . Now represent this extended revealed preference relations in the graph P by considering the partial orientation of P according to R p , which we call P( R p ). Now any cycle in P( R p ) takes one of the two forms in Figure 20 below. gi ( g ° i j)i ( g ° ik) i ( g + ik) i ( g + i j)i gi Figure 20: All cycles in any extended revealed preference relations take one of these two forms. Claim 1: If p satisfies Axiom 1, if the cycle on the left side of Figure 20 occurs in P( R p ), then we also have a cycle of length 2 in P( R p ). Suppose not. We know either there is a g0 R x ( g0 − xy) such that ( x, g0 , g0 − xy) ∼ p (i, g, g − ik ), or that g is the data g0 and ik ∈ g0 . Suppose it is the former, then by Axiom 1, we have ( x, g0 , g0 − xt) ∼ p (i, g, g − ij), for some xt ∈ g0 . Since g0 R x ( g0 − xt), we must have gRi,p ( g − ij). However in the left-hand side Figure 20, the arrow points from ( g − ij)i to gi . Therefore we have found a cycle of length 2 in P( R p ). 50 Now suppose instead that g is in fact the data g0 and ik ∈ g0 . Clearly we also have ij ∈ g0 so that g0 Ri ( g0 − ij). Therefore we have found a cycle of length 2 in P( R p ). Claim 2: If p satisfies Axiom 1, if the cycle on the right side of Figure 20 occurs, then we also have a cycle of length 2 in P( R p ). Suppose not. We know either there is a g0 R x ( g0 + xy) such that ( x, g0 + xy, g0 ) ∼ p (i, g + ik, g), or that g is the data g0 and ik ∈ / g0 . Suppose it is the former, then by Axiom 1, we have ( x, g0 + xt, g0 ) ∼ p (i, g + ij, g), for some xt ∈ / g0 . Since g0 R x ( g0 + xt), we must have gRi,p ( g + ij). However in the left-hand side Figure 20, the arrow points from ( g + ij)i to gi . Therefore we have found a cycle of length 2 in P( R p ). Now suppose instead that g is in fact the data g0 and ik ∈ / g0 . Clearly we 0 0 0 also have ij ∈ / g so that g Ri ( g + ij). Therefore we have again found a cycle of length 2 in P( R p ). Lemma 3. Let g be a partially directed graph.38 g has an acyclic orientation if g contains no directed cycles. P ROOF : Number the nodes from 1 to n such that if the link ij ∈ g is ordered as (i, j) ∈ g, the number assigned to i is less than the number assigned to j. This is always possible since there is no directed cycle in g. Now orient every link from the node with smaller number to the node with the larger. We have just constructed an acyclic orientation of g. R EFERENCES Afriat, S. N. (1967). The construction of utility functions from expenditure data. International Economic Review, 8(1):67–77. Albert, R., Jeong, H., and Barabási, A.-L. (1999). Internet: Diameter of the world-wide web. Nature, 401(6749):130–131. Andreoni, J. and Miller, J. (2002). Giving according to garp: An experimental test of the consistency of preferences for altruism. Econometrica, 70(2):737–753. Banerjee, A., Chandrasekhar, A. G., Duflo, E., and Jackson, M. O. (2013). The diffusion of microfinance. Science, 341(6144). 38 g is a partially directed graph if some links of g are directed, and some links of g are undirected. An orientation of a partially directed graph g is an orientation that only assigns direction to the undirected links in g. 51 Blundell, R. W., Browning, M., and Crawford, I. A. (2003). Nonparametric engel curves and revealed preference. Econometrica, 71(1):205–240. Brown, D. J. and Matzkin, R. L. (1996). Testable restrictions on the equilibrium manifold. Econometrica: Journal of the Econometric Society, pages 1249–1262. Calvó-Armengol, A. and İlkılıç, R. (2009). Pairwise-stability and nash equilibria in network formation. International Journal of Game Theory, 38(1):51–79. Carvajal, A., Deb, R., Fenske, J., and Quah, J. K.-H. (2013). Revealed preference tests of the cournot model. Econometrica, 81(6):2351–2379. Chambers, C. P. and Echenique, F. (2014). On the consistency of data with bargaining theories. Theoretical Economics, 9(1):137–162. Chambers, C. P., Echenique, F., and Shmaya, E. (2010). General revealed preference theory. Chandrasekhar, A. and Lewis, R. (2011). Econometrics of sampled networks. Cherchye, L., De Rock, B., and Vermeulen, F. (2007). The collective model of household consumption: a nonparametric characterization. Econometrica, 75(2):553–574. Cherchye, L., De Rock, B., and Vermeulen, F. (2011). The revealed preference approach to collective consumption behaviour: Testing and sharing rule recovery. The Review of Economic Studies, 78(1):176–198. Chiappori, P.-A. (1988). Rational household labor supply. Econometrica: Journal of the Econometric Society, pages 63–90. Choi, S., Kariv, S., Wieland, M., Silverman, D., et al. (2014). Who is (more) rational? American Economic Review, 104(6):1518–50. Christakis, N., Fowler, J., Imbens, G., and Kalyanaraman, K. (2010). An empirical model for strategic network formation. Crawford, I. and De Rock, B. (2014). Empirical revealed preference. Annual Review of Economics, 6:503–524. Echenique, F. (2008). What matchings can be stable? the testable implications of matching theory. Mathematics of Operations Research, 33(3):757–768. Echenique, F., Lee, S., and Shum, M. (2011). The money pump as a measure of revealed preference violations. Journal of Political Economy, 119(6):1201–1223. Echenique, F., Lee, S., Shum, M., and Yenmez, M. B. (2013). The revealed preference theory of stable and extremal stable matchings. Econometrica, 81(1):153–171. Galeotti, A., Goyal, S., Jackson, M. O., Vega-Redondo, F., and Yariv, L. (2010). Network games. The review of economic studies, 77(1):218–244. Goldsmith-Pinkham, P. and Imbens, G. W. (2013). Social networks and the identification of peer effects. Journal of Business & Economic Statistics, 31(3):253–264. Goyal, S. (2009). Connections: an introduction to the economics of networks. Princeton University Press. Goyal, S. and Joshi, S. (2003). Networks of collaboration in oligopoly. Games and Economic behavior, 43(1):57–85. Harbaugh, W. T., Krause, K., and Berry, T. R. (2001). Garp for kids: On the development of rational choice behavior. American Economic Review, pages 1539–1545. Hsieh, C.-S. and Lee, L.-F. (2014). A social interactions model with endogenous friendship formation and selectivity. Jackson, M. (2008). Social and Economic Networks. Princeton University Press. Jackson, M. and Wolinsky, A. (1996). A strategic model of social and economic networks. Journal of Economic Theory, 71:44–74. 52 Jackson, M. O., Rodriguez-Barraquer, T., and Tan, X. (2012). Social capital and social quilts: Network patterns of favor exchange. The American Economic Review, 102(5):1857–1897. Jackson, M. O. and Yariv, L. (2007). Diffusion of behavior and equilibrium properties in network games. The American economic review, pages 92–98. Jackson, M. O. and Yariv, L. (2010). Diffusion, strategic interaction, and social structure. Handbook of Social Economics, edited by J. Benhabib, A. Bisin and M. Jackson. Masatlioglu, Y., Nakajima, D., and Ozbay, E. Y. (2012). Revealed attention. The American Economic Review, 102(5):2183–2205. Mele, A. (2013). A structural model of segregation in social networks. PhD thesis. Miyauchi, Y. (2014). Structural estimation of a pairwise stable network with nonnegative externality. Available at SSRN: http://ssrn.com/abstract=2165884. Richter, M. K. (1966). Revealed preference theory. Econometrica: Journal of the Econometric Society, pages 635–645. Samuelson, P. (1947). Foundation of Economic Analysis. Harvard University Press, Cambridge, MA. Samuelson, P. A. (1938). A note on the pure theory of consumer’s behaviour. Economica, 5(17):61–71. Sheng, S. (2014). Identification and estimation of network formation games. mimeo. Sprumont, Y. (2000). On the testable implications of collective choice theories. Journal of Economic Theory, 93(2):205–232. Varian, H. R. (1982). The nonparametric approach to demand analysis. Econometrica: Journal of the Econometric Society, pages 945–973. Varian, H. R. (2006). Revealed preference. Samuelsonian economics and the twenty-first century, pages 99–115.
© Copyright 2026 Paperzz