Trading, Profits, and Volatility in a Dynamic Information Network Model∗ Johan Walden† November 14, 2013 Abstract We introduce a dynamic noisy rational expectations model, in which information diffuses through a general network of agents. In equilibrium, agents’ trading behavior and profits are determined by their position in the network. Agents who are more closely connected have more similar period-by-period trades, and an agent’s profitability is determined by a centrality measure that is closely related to so-called Katz centrality. The model generates rich dynamics of aggregate trading volume and volatility, beyond what can be generated by heterogeneous preferences in a symmetric information setting. An initial empirical investigation suggests that price and volume dynamics of small stocks may be especially well explained by such asymmetric information diffusion. The model could potentially be used to study individual investor behavior and performance, and to analyze endogenous network formation in financial markets. Keywords: Information diffusion, information networks, heterogeneous investors, portfolio choice, asset pricing. ∗ I thank seminar participants at the Duisenberg School of Finance, Maastrict University, the Oxford-Man Institute, Rochester (Simon School), Stockholm School of Economics, and USC, Bradyn Breon-Drish, Laurent Calvet, Nicolae Gârleanu, Brett Green, Pierre de la Harpe, Philipp Illeditsch, Matt Jackson, Svante Jansson, Ron Kaniel, Anders Karlsson, Martin Lettau, Dmitry Livdan, Hanno Lustig, Gustavo Manso, Christine Parlour, Han Ozsoylev, Elvira Sojli, Matas Šileikis, and Deniz Yavuz for valuable comments and suggestions. † Haas School of Business, University of California at Berkeley, 545 Student Services Building #1900, CA 94720-1900. E-mail: [email protected], Phone: +1-510-643-0547. Fax: +1-510-643-1420. 1 Introduction There is extensive evidence that heterogeneous and decentralized information diffusion influences investors’ trading behavior. Shiller and Pound (1989) survey institutional investors in the NYSE, and find that a majority attribute their most recent trades to discussions with peers. Ivković and Weisbenner (2007) find similar evidence for households. Hong, Kubik, and Stein (2004) find that fund managers’ portfolio choices are influenced by word-of-mouth communication. Heimer and Simon (2012) find similar influence from on-line communication between retail foreign exchange traders. Such information diffusion may help explain several fundamental stylized facts of stock markets. First, investors are known to hold vastly different portfolios, in contrast to the prediction of classical models that everyone should hold the market portfolio. The standard explanations for such diverse portfolio holdings are hedging motives and heterogeneous preferences, but several studies indicate that there are limitations to how well such motives can explain observed heterogeneous investor behavior.1 With decentralized information diffusion, however, it is unsurprising if significantly heterogeneous behavior of investors is observed in the market. Second, stock markets are known to experience large price movements that are unrelated to public news, as documented in Cutler, Poterba, and Summers (1989), and Fair (2002). These studies find that over two thirds of major stock market movements cannot be attributed to public news events, suggesting that there are other channels through which information is incorporated into asset prices. Third, the dynamics of trading volume and asset prices are known to be very rich. Returns and trading volume in many markets are heavy-tailed, time varying, show “long memory,” and are related to each other in a complex way (see Gabaix, Gopikrishnan, Plerou, and Stanley 2003; Karpoff 1987; Gallant, Rossi, and Tauchen 1992; Bollerslev and Jubinski (1999); Lobato and Velasco 2000). Lumpy information diffusion provides a potential explanation for such behavior. In periods when more information diffuses, volatility is higher, as is trading volume (see, Clark 1973; Epps and Epps 1976; Andersen 1996). To generate rich trading volume dynamics, however, such information diffusion must necessarily be heterogeneous. In this paper, we follow a recent strand of literature that uses information networks to model information diffusion (see Colla and Mele 2010, Ozsoylev and Walden 2011, Han and Yang 2013, and Ozsoylev, Walden, Yavuz, and Bildik 2013). Agents who are directly linked in a network share information, which consequently diffuses among the population over time in a well-specified manner. This literature has made important observations about effects of information networks, but several key questions remain open. How does the network structure in a market determine the dynamic trading behavior of its agents, and their performance? How does the network structure influence aggregate properties of the market, e.g., price volatility and trading volume? What does heterogeneous information diffusion “add” compared with, e.g., what can can be 1 For example, Massa and Simonov (2006) find that hedging motives for human capital risk—a fundamental source of individual investor risk—does not explain heterogeneous investment behavior among individual investors well. Similarly, Calvet, Campbell, and Sodini (2007) and Calvet, Campbell, and Sodini (2009) find that diversification and portfolio rebalancing motives do not explain investors’ portfolio holdings well. 1 generated by heterogeneous preferences alone? These questions are obviously fundamental for our understanding of the impact of information networks on financial markets, and ultimately for how well such information networks can explain observed investor and market behavior. In this paper we analyze these questions. We introduce a dynamic noisy rational expectations model in which agents in a network share information with their neighbors. Agents receive private noisy signals about the unknown value of an asset in stochastic supply, and trade in a market over multiple time periods. In each period, they share all the information they have received up until that point with their direct neighbors, leading to gradual diffusion of the private signals. The structure of the network is completely general. As a first contribution, we prove the existence of a noisy rational expectations equilibrium, and present closed-form expressions for all variables of interest. Theorem 1 provides the main existence and characterization result for a Walrasian equilibrium in a large network economy. To define the large economy equilibrium, we use the concept of replica networks, assuming that that there is a local network structure (e.g., at the level of a municipality) and that there are many similar such local network structures in the economy. This allows for a clean characterization of equilibrium, as well as justifies the assumption that agents act as price takers and are willing to share information. We know of no other network model of information diffusion in a centralized financial market (i.e., exchange) that allows for a complete characterization of equilibrium, that is completely general with respect to network structure, and that is based on first principles of financial economics. The structure of the network is crucial in determining asset pricing dynamics. We show in a simple example that price informativeness and volatility at any given point in time does not only depend on the specific information agents in the model have obtained at that point, but also on how the information has diffused through the network. The equilibrium outcome thus depends on complex properties of the network, beyond the mere precision of agents’ signals at any specific point in time. We next study how the network structure determines the trading behavior and profitability of agents. A priori, one may expect that portfolio holdings of agents who are close in the network should be positively correlated, whereas their trades may be negatively correlated in some periods, because some agents trade earlier on information than others and may be ramping down their investments to realize profits when other agents are ramping up their investments. We show that, contrary to this intuition, the period-by-period trades of agents in the network are always positively related, and increasingly so in the degree of the overlap in their connectedness. This result justifies the use of data on trades to draw inferences about network structure, as has been done in previous studies, providing our second contribution. As a third contribution, we study what determines who in the network makes profits. It is argued in Ozsoylev and Walden (2011) and elsewhere that some type of centrality measure should determine agent profitability.2 Centrality—a fundamental concept in network theory— 2 In an empirical study, Ozsoylev, Walden, Yavuz, and Bildik (2013) study the trades of all investors on the 2 captures the concept that it is not only who your direct neighbors are that matters, but also who your neighbors’ neighbors are, who your neighbors’ neighbors’ neighbors are, etc. The argument is that agents who are centrally placed tend to receive information signals early, and therefore perform better in the market than peripheral agents, who tend to receive information later. It is not a priori clear, however, which centrality measure—among several—that is appropriate in this context. We show that profitability is determined by a centrality measure related to eigenvector centrality or, more generally, to so-called Katz centrality. To the best of our knowledge, this is the first complete characterization of the relationship between agent centrality and performance in a general information network model of financial markets. For large random networks, Katz centrality dominates other common centrality measures in our model, i.e., it dominates degree centrality, average distance and closeness centrality. We show in simulations that the ranking of centrality measures also holds in medium-sized random networks. Thus, the use of centrality measures when studying individual investor behavior (e.g., in Ozsoylev, Walden, Yavuz, and Bildik 2013) is theoretically justified. The result is quite intuitive: The Katz centrality measure provides the best balance between direct connections and connections at farther distances, whereas degree centrality focuses exclusively on direct connections, and average distance and closeness centrality are tilted toward connections at far distances. Our main results that characterize trading behavior, profitability and welfare of agents, and the relationship between centrality and profitability, are Theorems 2-4. Our fourth contribution is to derive and analyze several aggregate results regarding the dynamic behavior of price volatility and trading volume in the model. Specifically, the network structure in an economy is an important determinant of volatility and trading volume in the time series, as well as of the relationship between the two. Several stylized properties naturally arise in the model, for example, persistence of shocks to volatility and trading volume, as well as lead-lag relationships between the two. We show that the rich dynamics of volatility and volume in our general information network model cannot be generated by heterogeneous preferences alone in a symmetric network. Again, the intuition is straightforward: With symmetry, there can be no bottlenecks in the information network. This in turn implies that the rate of information diffusion over time must be unimodal, i.e., it cannot slow down and then speed up again. Consequently, aggregate market dynamics are restricted under symmetry, whereas (as we show) there are no such unimodality restrictions in the general case. Finally, we study volatility and volume of a sample of U.S. stocks. We compare Large Cap and Small Cap companies, and find that the behavior of Small Cap companies is better explained by asymmetric networks with a large degree of nonpublic information diffusion, suggesting that information networks may be especially important for such stocks. Our theoretical results on Istanbul Stock Exchange in 2005, and find a positive relationship between investors’ so-called eigenvector centrality and profitability, but this choice of centrality measure is not theoretically justified. Several other finance papers discuss and use various centrality measures without a complete theoretical justification, see e.g., Das and Sisk (2005), Adamic, Brunetti, Harris, and Kirilenko (2010), Li and Schurhoff (2012) and Buraschi and Porchia (2012). 3 aggregate volume and volatility are summarized in Theorems 5-7. The rest of the paper is organized as follows. In the next section, we discuss related literature. In Section 3, we introduce the model and characterize equilibrium. In Section 4, we analyze trading behavior and profitability of individual agents. In Section 5, we study the implications of network structure for aggregate volatility and trading volume. Finally, Section 6 concludes. All proofs are delegated to the appendix. 2 Related Literature Our paper is most closely related to the recent strand of literature that studies the effects of information diffusion on trading and asset prices. Colla and Mele (2010) show that the correlation of trades among agents in a network varies with distance, so that close agents naturally have positively correlated trades, whereas the correlation may be negative between agents who are far apart. Their model is dynamic, and assumes a very specific symmetric network structure, namely a circle, where each agent has exactly two neighbors. This restricts the type of dynamics that can arise in their model. Ozsoylev and Walden (2011) introduce a static rational expectations model that allows for general network structures and study, among other things, how price volatility varies with network structure. Their model is not appropriate for studying dynamic information diffusion, however, and is therefore not well-suited for several of the questions analyzed in this paper, e.g., the relationship between agent profitability and centrality, and the short-term correlation between agents’ trade. Han and Yang (2013) study the effects of information diffusion on information acquisition. They show that in equilibrium, information diffusion may reduce the amount of aggregate information acquisition, and therefore also the informational efficiency and liquidity in the market. Their model is also static, and does thereby not allow for dynamic effects. In an empirical study, Ozsoylev, Walden, Yavuz, and Bildik (2013) test the relationship between centrality— constructed from the realized trades of all investors in the market—and profitability. They find that more central agents, as measured by eigenvector centrality, are more profitable. However, they do not justify this choice of centrality measure theoretically. Pareek (2012) studies how information networks—proxied by the commonality in stock holdings–among mutual is related to return momentum. A different strand of literature studies information diffusion through so-called information percolation (Duffie and Manso 2007, Duffie, Malamud, and Manso 2009). In the original setting, a large number of agents meet randomly in a bilateral decentralized (OTC) market and share information, and the distribution of beliefs over time can then be strongly characterized. Recently, the model has been adapted to centralized markets, with exchange traded assets and observable prices—a setting more closely related to ours. For example, Andrei (2012), shows that persistent price volatility can arise in such a model. In contrast to our model, in which some agents may be better positioned than others, these models are ex ante symmetric in that all agents have the same chance of meeting and sharing information. 4 Babus and Kondor (2013) also introduce a model of information diffusion in a bilateral OTC market. As in our paper, their network can be perfectly general. In contrast to our model, there is no centralized information aggregation mechanism in their setting, and therefore no interplay between diffusion through public and private channels. Moreover, agents have private values in their model, and their model is static. This paper is also related to the literature on information diffusion and trading volume (Clark 1973). Lumpy information diffusion was suggested to explain heavy-tailed unconditional volatility of asset prices, as an alternative to the stable Paretian hypothesis. Under the Mixture of Distributions Hypothesis (MDH), lumpiness in the arrival of information lead to variation in return volatility and trading volume, as well as a positive relation between the two (see Epps and Epps 1976 and Andersen 1996). Foster and Viswanathan (1995) build upon this intuition to develop a model with endogenous information acquisition, leading to a positive autocorrelation of trading volume over time. Similar results arise in He and Wang (1995), in a model where an infinite number of ex ante identical agents receive noisy signals about an asset’s fundamental value. Admati and Pfleiderer (1988) explain U-shaped intra-daily trading volume in a model with endogenous information acquisition. Our paper further explores the richness of the dynamics of volatility and volume that arises when agents share their signals, allowing for completely general asymmetry in how some agents are better positioned than others. This extension may potentially shed further light on the very rich dynamics of volatility and volume, and the relationship between the two (see Karpoff 1987, Gallant, Rossi, and Tauchen 1992, Bollerslev and Jubinski 1999, Lobato and Velasco 2000, and references therein). A related strand of literature explores the role of trading volume in providing further information to investors about the market, see Blume, Easley, and O’Hara (1994), Schneider (2009), and Breon-Drish (2010). Our model does not explore this potential informational role of trading volume. Our study is related to the large literatures on games on networks, see the survey of Jackson and Zenou (2012). The games in these models are typically not directly adaptable to a finance setting. Our existence result and the characterization of equilibrium in a model based on first principles of financial economics are therefore of interest. Since the welfare of agents in equilibrium can be simply characterized, our model could potentially also be used to study endogenous network formation, see Jackson (2005) for a survey of this literature. Finally, our paper is related to the (vast) general literature on asset pricing with heterogeneous information (see, e.g., the seminal papers by Grossman 1976, Hellwig 1980, Kyle 1985, and Glosten and Milgrom 1985). Technically, we build upon the model in Vives (1995), who introduces a multi-period noisy rational expectations model in a similar spirit as the static model in Hellwig (1980). Like Vives, we assume the presence of a risk-neutral competitive market maker, to facilitate the analysis in a dynamic setting. This simplifies the characterization of equilibrium considerably. The cost of this assumption is that asset prices simply reflect the expected terminal value of the asset conditioned on public information at all points in time. Since our focus is on volatility and trading volume, this is a marginal cost for us. Unlike Vives, 5 we allow for information diffusion among agents, through general network structures. 3 Model There are N agents, enumerated by a ∈ N = {1, . . . , N }, in a T + 1-period economy, t = 0, . . . , T + 1, where T ≥ 2. We define T = {1, . . . , T }. Each agent, a, maximizes expected utility of terminal wealth, and has constant absolute risk aversion (CARA) preferences with risk aversion coefficient γa , a = 1, . . . , N , Ua = E[−e−γa Wa,T +1 ]. We summarize agents’ risk aversion coefficients in the N -vector Γ = (γ1 , . . . , γN ). There is one asset with terminal value v = v̄ + η, where η ∼ N (0, σv2 ), i.e., the value is normally distributed with mean v̄ and variance σv2 . Here, v̄ is known by all agents, whereas η is unobservable. Agents are connected in a network, represented by a graph G = (N , E). The relation E ⊂ N × N describes which agents (vertices, nodes) are connected in the network. Specifically, (a, a ) ∈ E, if and only if there is a connection (edge, link) between agent a and a . We will subsequently assume that there are many identical “replica” copies of this network in the economy, each copy representing a “local” network structure. This will make the economy “large” and justify price taking behavior of agents, as well as simplify the characterization of equilibrium. For the time being, we focus on one representative copy of this large network. We use the convention that each agent is connected to himself, (a, a) ∈ E for all a ∈ N , i.e., E is reflexive. We also assume that connections are bidirectional, i.e., that E is symmetric. A convenient representation of the network is by the adjacency matrix E ∈ {0, 1}N ×N , with (E)aa = 1 if (a, a ) ∈ E and (E)aa = 0 otherwise. The distance function D(a, a ) defines the number of edges in the shortest path between agents a and a . We use the conventions that D(a, a) = 0, and that D(a, a ) = ∞ whenever there is no path between a and a . The set of direct neighbors to agent a is Sa,1 = {a : (a, a ) ∈ E}. Moreover, the set of agents at distance m > 1 from agent a is Sa,m = {a : D(a, a ) = m}, and the set of agents at distance not further away than m is Ra,m = ∪m j=1 Sa,m . The number of agents at a distance not further away than m from agent a is Va,m = |Ra,m |. Here, Va,1 is the degree of agent a, which we also refer to as agent a’s connectedness, whereas Va.m is agent a’s mth order degree. We use the convention that Va,0 = 0 for all a. We also define ΔVa,m = |Sa,m |. We define N -vectors V m , m = 1, . . . , N , where the ath element of V m is Va,m Equivalently, Definition 1 The mth order degree vector, V m ∈ RN + , m = 1, 2, . . . , is defined as V m = χ(E m )1. (1) Here E m is the mth power of the adjacency matrix, and χ : RN ×N → {0, 1}N ×N is a matrix 6 indicator function, such that (χ(A))i,j = 0 if Ai,j = 0 and (χ(A))i,j = 1 otherwise. Moreover, 1 is an N -vector of ones. First order degree is commonly referred to as degree centrality. Finally, the number of agents within a distance of m from both agents a and a is Va,a ,m = |Ra,m ∩Ra ,m |, and the number of neighbors at distance exactly m from both agents is ΔVa,a ,m = |Sa,m ∩ Sa ,m | 3.1 Information diffusion At t = 0, each agent receives a noisy signal about the asset’s value, sa = v + σξa , where ξa ∼ N (0, 1) are jointly independent across agents, and independent of v. At T + 1, the true value of the asset, v, is revealed. It will be convenient to use the precisions τv = σv−2 and τ = σ −2 . The graph, G, determines how agents share information with each other. Specifically, at t+1, agent a shares all signals he has received up until t with all his neighbors. We let Ia,t denote the information set that agent a has received up until t, either directly or via his network. It is natural to ask why agents would voluntarily reveal valuable information to their neighbors. Of course, in a large economy with an infinite number of agents, sharing signals with ones’ (finite number of) neighbors has no cost, since the actions of a finite number of agents will not influence prices. Even with an economy of finite size, as long as signals can be verified ex post, truthful revelation may be optimal in a repeated game setting, since an agent who provides misinformation can be punished by his neighbors, e.g., by being excluded from the network in the future. Even if signals are not ex post verifiable, it may still be possible for an agent to draw inferences about the truthfulness of another agent’s signal, by comparing it with other received signals. Again, the threat of future exclusion from the network, could be used to enforce truthful information sharing. We therefore take the truthful information sharing behavior of agents as given. A potentially fruitful area for future research is to better understand in which finite sized financial networks truthful signal sharing can be sustained. As in Ozsoylev, Walden, Yavuz, and Bildik (2013), we formalize the information sharing role of the network by defining Definition 2 The graph G represents an information network over the signal structure {sa }a , if for all agents a ∈ N , a ∈ N and times t = 1, . . . , T , sa ∈ Ia,t if and only if D(a, a ) ≤ t. The information about the asset’s value that an agent has received through the network up until time t can be summarized (as we shall see) by the sufficient statistic def za,t = 1 sj = v + ζa,t , Va,t j∈Ra,t 7 V 3 The number of signals agent a receives at t is ΔV , where ζa,t = σa,t a,t 2 ξa,t , and ξa,t ∼ N (0, 1). and we therefore expect {ΔVa,t }a∈N ,t∈T to be important for the dynamics of the economy. 3.2 Market The market is open between t = 1 and T + 1. Agents in the information network submit limit orders, and a risk-neutral competitive market maker sets the price such that at each point in time it reflects all publicly available information, pt = Et [v|Itp ], where Itp is the time-t publicly available information set. At T + 1, the asset’s value is revealed so pT +1 = v. Before trading begins, the price is set as the asset’s ex ante expected value, p0 = v̄. To avoid fully revealing prices, we make the standard assumption of stochastic supply of the asset. Specifically, in period t, noise traders submit market orders of ut per trader in the network, where ut ∼ N (0, σu2 ). In other words, the noise trader demand is defined relative to the size of the population in the information network. As argued elsewhere in the literature, the noise trader assumption need not be taken literally, but is rather a reduced-form representation of unmodeled supply shocks. It could, e.g., represent hedging demand among investors due to unobservable wealth shocks, or other unexpected liquidity shocks. We do not further elaborate on the sources of these shocks. We will use the precision τu = σu−2 . Agents in the network are price takers. At each point in time they submit limit orders to optimize their expected utility of terminal wealth. They thus condition their demand on contemporaneous public information, as well as on their private information. An agent’s total demand for the asset at time t is xa,t = arg max E e−γa Wa,T +1 |Īa,t , (2) x subject to the budget constraint Wa,t+1 = Wa,t + xa,t (pt+1 − pt ), t = 1, . . . , T, and his net time-t demand is Δxa,t = xa,t − xa,t−1 , with the convention that xa,0 = 0 for all agents. Here, Īa,t contains all public and private information available to agent a at time t. In the linear equilibrium we study, za,t and pt are jointly sufficient statistics for an agent’s information set, Īa,t = {za,t , pt }, leading to the functional form xa,t = xa,t (za,t , pt ). Of course, an agent’s optimal time-t strategy in (2) depends on the (optimal) future strategy. The dynamic problem can therefore be solved by backward induction. The primitives of the economy are summarized by the tuple M = (G, Γ, τ, τu , τv , v̄, T ). We note that the assumption that the asset’s value is revealed at T + 1 means that any residual uncertainty at T of the asset’s value is completely mitigated by T + 1. We think of this as public information, which becomes available to all agents at T + 1. Alternatively, we 3 A variation is to let agents receive new private signals in each time period. The analysis in this case is qualitatively similar, but not as clean because of the increased number of signals. 8 could have assumed that residual uncertainty is gradually incorporated into the market between T and T for some T > T + 1, keeping the assumption that the information diffusion between agents in the network only occurs until T . The graph, G, determines how information diffuses in the network over time, whereas Γ captures agent preferences. We wish to separate dynamics that can be generated solely by heterogeneity in preferences from those that require heterogeneity in network structure. To this end, we define an economy to be preference symmetric if γa = γ for all agents, and some constant γ > 0. There are several symmetry concepts for graphs. The notion we use is so-called distance transitivity.4 Informally, symmetry captures the idea that any two vertices can be switched without the network changing its structure. To formalize the concept, we define an automorphism on a graph to be a bijection on the vertices of the graph, f : N ↔ N , such that (f (a), f (a )) ∈ E if and only if (a, a ) ∈ E. A graph is distance-transitive if for every quadruple of vertices, a, a , b, and b , such that D(a, b) = D(a , b ), there is an automorphism, f , such that f (a) = a and f (b) = b . An economy is said to be network symmetric if its graph is distance-transitive. Preference symmetric economies and network symmetric economies provide useful benchmarks to which the general class of economies can be compared. We point out that network symmetry does not imply that the same amount of information is diffused among agents at each point in time. It does, however, still impose severe restrictions on how information may spread in the economy, as shown by the following lemmas:5 Lemma 1 In a network symmetric economy, ΔVa,t is the same for all agents at each point in time. That is, for each t, for each a, ΔVa,t = ΔVt for some common ΔVt . Thus, in a network symmetric economy, all agents have an equal precision of information at any point in time, although their signal realizations of course differ. Lemma 2 In a network symmetric economy the sequence ΔV1 , ΔV2 , . . . , ΔVT , is unimodal. Specifically, there are times 1 ≤ t1 ≤ t2 ≤ T , such that ΔVt+1 > ΔVt for all t ≤ t1 , ΔVt+1 = ΔVt for all t1 < t ≤ t2 , and ΔVt+1 < ΔVt for all t > t2 . In other words, the typical behavior of the information diffusion process in a network symmetric economy is “hump-shaped,” initially increasing, after which it reaches a plateau and then decreases. 3.3 Replica network To justify the assumption that agents are price takers, the number of agents needs to be large. Moreover, as analyzed in Ozsoylev and Walden (2011), restrictions on the distribution of number Other notions include vertex transitivity, distance regularity, arc-transitivity, t-transitivity, and strong regularity, see Briggs (1993). Distance transitivity is a stronger concept than vertex transitivity, arc-transitivity, and distance regularity, respectively, but neither stronger, nor weaker, than t-transitivity and strong regularity. 5 The first result follows immediately from the fact that automorphisms preserve distances between nodes, see Briggs (1993), page 118. The second result follows from Taylor and Levingston (1978), where the result is shown for the larger class of distance regular graphs (see also Brouwer, Cohen, and Neumaier 1989, page 167). 4 9 of connections are needed, to ensure existence of equilibrium. Ozsoylev and Walden (2011) carry out a fairly general analysis of the restrictions needed on the degree distribution for the existence of equilibrium to be guaranteed. They show that a sufficient condition is that the degree distribution is not too fat-tailed. Compared with their static model, our model has the additional property of being dynamic. Therefore, not only would restrictions on first-order degrees be needed to ensure the existence of equilibrium, but also on degrees of all higher orders. In the dynamic economy, signals spread over longer distances, thereby “fattening” the tail of the distribution of signals among agents over time. We therefore believe that a general analysis would be technically challenging, while adding limited additional economic insight, which is why we choose the simplified approach. We build on the concept of replica economies, originally introduced by Edgeworth (1881) to study the game theoretic core of an economy (see also Debreu and Scarf 1964). We assume that the full economy consists of a large number, M , of disjoint identical replicas of the network previously introduced, and that agents’ random signals are independent across these replicas. A replica network approach provides the economic and technical advantages of a large economy, namely that price taking behavior is rationalized and that the law of large numbers makes most idiosyncratic signals cancel out in aggregate, while avoiding the issues of signals spreading too quickly among some agents, causing equilibrium to break down. The total number of agents in the economy is N̄ = N × M . Formally, we define the set of agents in an M -replica economy as Am = N × {1, . . . , M }, where a = (i, j) ∈ Am represents the ith agent in the jth replica network, in an economy with M replica networks. There is still one asset, one market, and one competitive market maker in the market with N̄ agents. We use the enumeration a = 1, . . . , M N, of agents, where agent (i, j) maps to a = (j − 1)N + i. Agent (i, j) and (i, j ) are thus ex ante identical in their network positions and in their signal distributions, although their signal realizations (typically) differ. We let M increase in a sequence of replica economies, with the natural embedding A1 ⊂ A2 · · · ⊂ Am ⊂ · · · , and take the limit A = limM →∞ AM , letting A define our large economy, in a similar manner as in Hellwig (1980). The network G is thus a representative network in the large economy, A. Our interpretation is that the network, G, represents a fairly localized structure, perhaps at the level of a town or municipality in an economy, whereas A represents the whole economy. At time t, the market maker observes the average order flow per agent in the network6 wt = ut + 3.4 N̄ 1 Δxa,t . N̄ a=1 (3) Equilibrium We restrict our attention to linear equilibria in which agents in the same position in different replica networks are (distributionally) identical. Such equilibria are thus characterized by the Technically, the market maker observes ut + limM →∞ this can be done without confusion. 6 10 1 MN M N a=1 Δxa,t . We avoid such limit notation when behavior of agents a = 1, . . . , N , who are “representative.” Our main existence result is the following theorem, that shows existence of a linear equilibrium in the large economy under general conditions and, furthermore, characterizes this equilibrium. Theorem 1 Consider an economy characterized by M. For t = 1, . . . , T , define At = N τ Va,t , N γa a=1 yt = τu (At − At−1 )2 , t Yt = ys , s=1 Ca,t = Da,t = τv + τ Va,t+1 + Yt+1 τv + τ Va,t + Yt T τv + Yt τv + Yt+1 1 + τ Va,t 1 1 − τv + Yt τv + Yt+1 , −1/2 Ca,s , s=t+1 with the convention that A0 = 0, and YT +1 = ∞. There is a linear equilibrium, in which prices at time t are given by pt = t τv Yt τu v̄ + v+ (As − As−1 )us . τv + Yt τv + Yt τv + Yt s=1 (4) In equilibrium, agent a’s time-t demand and expected utility, given wealth Wa,t and the realization of signals summarized by za,t , take the form xa,t = τ Va,t (za,t − pt ), γa −γa Wa,t − 12 τ Ua,t = −Da,t e (5) 2 τ 2 Va,t (z −pt )2 +Y +τ Va,t a,t v t . (6) Several observations are in place. First, note that the price function (4) has a fairly standard structure. It is determined by the fundamental value (v) and the aggregate supply shocks (us , s = 1, . . . , t). The weights on these different components are determined by how signals spread through the network. Especially, At summarizes how aggressive—and thereby informative—the trades of agents are at time t, consisting of a weighted average of t-degree connectivity of all agents. The variable Yt corresponds to a cumulative average of squared innovations in A up until time t, and determines how much of the fundamental value that is revealed in the price. The main generalization compared with Vives (1995) is that Va,t varies with agent and over time, depending on the network structure. Moreover, preferences are allowed to vary across agents, through γa . This allows us to compare the equilibrium dynamics that may arise because of heterogeneous 11 preferences with the dynamics that may arise because of heterogeneous information diffusion. It is notable that Yt does not only depend on the total amount of information that has been diffused at time t, but also on how this information has diffused over time. In other words the price at a specific point in time is information path dependent. For example, consider two economies with 4 agents, all with unit risk aversion (γa = 1), and with parameters τ = τv = τu = 1. The first network, shown in panel A of Figure 1, is tight-knit (it is even complete) with every agent being directly connected to every other agent. It is straightforward to calculate 1 . The second V1,a = V2,a = 4, A1 = A2 = 4, Y1 = Y2 = 16, via (4) leading to p2 − v ∼ N 0, 17 network, shown in Panel B of Figure 1, is not as tightly knit, and agents have to wait until t = 2 before they have received all signals. In the latter case, V1,a = 3, V2,a = 4, A1 = 3, A2 = 4, 1 Y1 = 9, Y2 = 16, via (4) leading to p2 − v ∼ N 0, 11 . Thus, the price at t = 2 is less revealing in the second case, even though all agents have the same information at t = 2 in both economies. The reason is that in the tight-knit economy, the information revelation is more lumpy, whereas it is more gradual in the less tight-knit economy. Lumpy information diffusion leads to more revealing prices, since it generates more aggressive trading behavior in some periods, in turn making it easier to separate informed trading from supply shocks. This is our first example of how the network structure impacts asset price dynamics. A. B. Figure 1: Impact of network structure. The figure shows two networks with four agents: In Panel A, a tight-knit network is shown, in which every agent is connected with every other agent. In panel B, a less tight-knit network is shown. At t = 2, prices are more revealing in the tight-knit network, since the aggregate information arrival has been more lumpy, in turn leading to more revealing trading behavior of informed agents. 4 Trading, profits, and centrality of individual agents We study how the trades and performance of agents are determined by their positions in the network. 12 4.1 Correlation of trades Feng and Seasholes (2004) studied retail investors in the People’s Republic of China, and found that the geographical position of investors was related to the correlation of their trades: geographically close investors had more positively correlated trades than investors who were farther apart. Colla and Mele (2010) showed that information networks can give rise to such patterns of trade correlations, under the assumption that geographically close agents are also close in the information network. Their analysis was restricted to a cyclical network, but the effect was also shown to arise in general networks in the static model of Ozsoylev and Walden (2011). In Ozsoylev, Walden, Yavuz, and Bildik (2013), this positive relationship between trades and network position was used to reverse engineer a proxy of the information network in the Istanbul Stock Exchange from individual investor trades. Loosely speaking, agents who repeatedly traded in the same stock, in the same direction, at similar points in time, were assumed to be linked in the market’s information network. Such an approach is justified in the static model of Ozsoylev and Walden (2011), but the situation is more complex in a dynamic setting. Specifically, one may expect a positive relationship between network proximity and portfolio holdings also in the dynamic model, since agents who are close in the network have many overlapping signals and thereby similar information, leading to similar portfolio holdings. However, whereas agents’ trades and portfolio positions are equivalent in the static model, they are not in a dynamic model. Instead, portfolio holdings are equivalent to cumulative period-by-period trades in the dynamics model. For period-by-period trading behavior, the timing of information arrival is also important, and this timing is different even for agents who are close in the network. Consider, for example, an economy in which one agent, a, is connected to many other agents at a distance t, and therefore receives very precise information about the asset’s value at time t. This agent will at t take a large position in the asset (positive or negative, depending on whether the agent believes that it is under- or over-valued). Moreover, assume that one of agent a’s neighbors, agent b, does not have many connections within a distance of t but, being connected to a, receives a lot of information at t + 1. Now, assume that at t + 1 there is also a substantial amount of information incorporated into the asset’s price (driven by many other agents who are also connected to many agents at distance t + 1). In this case, agent b will take on a similar position as agent a at time t + 1, although less extreme since the price will be closer to the fundamental value at this point. Agent a, however, may actually decrease his position, to realize profits and decrease risk exposure. This argument suggests that the time t + 1 trades of the two agents are negatively correlated, although they are neighbors in the network. Thus, the period-by-period relationship between network proximity and correlation of trades seems less clear than that between network proximity and portfolio holdings, suggesting that one needs to be careful in choosing an appropriate window length when inferring network structure from observed trades. The following theorem characterizes covariances of trades, for an individual agent over time, and between agents at a specific point in time. 13 Theorem 2 The covariance between an agent’s trade at t and t + 1 is τ Va,t Va,t+1 − . Cov(Δxa,t+1 , Δxa,t ) = ΔVa,t γa τv + Yt+1 τv + Yt (7) The covariance of agent a and b’s trades at time t is τ2 Cov(Δxa,t , Δxb,t ) = γa γb ΔVa,t ΔVb,t yt + ΔVa,b,t . Va,t−1 Vb,t−1 + (τv + Yt−1 )(τv + Yt ) τv + Yt (8) Equation (7) shows that in the special case when an agent does not receive any new signals (and thus ΔVa,t = 0), the covariance between time t and t + 1 trades is also zero. This is natural since the agent’s trades in this case depends solely on price changes, and the price process is a martingale. In the more interesting case when the agent does receive signals, the covariance between subsequent trades is determined by the information advantage of the agent over the market at t and t + 1, respectively. Specifically, Va,t represents how much information agent a has received at time t, whereas τv + Yt represents how much aggregate information has been incorporated into prices. A high Va,t relative to τv + Yt , means that the agent has a substantial information advantage at time t. If the information advantage increases between t and t + 1, then agent a’s trades will be positively correlated over these two periods, representing a situation where he tends to ramp up investments. If, on the other hand, agent a’s information advantage decreases, his trades will be negatively correlated, representing a situation where he takes home profits and decreases risk exposure by selling stocks (or buying, if in a short position), in line with our previous discussion. An example of the two different situations is shown in Panels A and B of Figure 2. In Panel A, agent a is at the center of an extended star network and will have a large advantage over the market at t = 1. At t = 2, the playing field is more even, since information diffusion has made all of agent a’s neighbors well-informed too. Agent a still has an information advantage, now having received all signals from the periphery of the network, but the advantage is lower than in the previous period, and he therefore decreases his asset position, leading to a negative correlation of his trades over the two periods. In Panel B of the figure, agent a still receives the same signals period-by-period, but since his neighbors are now more directly connected, his information advantage at t = 1 is smaller. In this case, his information advantage may be higher in period 2 than in period 1, so he may tend to gradually ramp up up his position over the two time periods and therefore have positively correlated trades. We note that this intuition, that relative information advantage over time determines an agent’s dynamic trading behavior, which is very clear in the general setting, does not come out clearly if we restrict our attention to symmetric networks. Indeed in a network symmetric economy, ΔVa,t is the same for all agents, and it can be shown that (7) takes the form τ τ Cov(Δxa,t+1 , Δxa,t ) = ΔVt+1 ΔVt τv + Yt − Vt ΔVt+1 . γa γ̄ 14 Thus, all else equal, for low degrees of information diffusion between t and t + 1 (i.e., a low ΔVt+1 ), trades will be positively correlated, whereas they will be negatively correlated when the degree of information diffusion is high (i.e., for a high ΔVt+1 ). This result is the opposite of what we just showed. The issue in the symmetric case is that within the class of network symmetric economies, connectivity must increase for all agents at the same time, so there is no way to increase the relative information advantage of one agent. Increasing the connectivity of all agents at the same time has two effects: it increases the total informativeness of their signals and it increases the amount of information that is incorporated into the asset’s price, and the second effect always dominates. Thus, increased connectivity at t + 1, all else equal, always decreases the information advantage of the agents in a symmetric network setting, making them decrease their portfolio holdings, and thereby potentially leading to negative correlation with their trades in the previous period. This is our first example of how restricting ones’ attention to symmetric economies may be misleading—in this case reversing the result. We next focus on Equation (8), which shows how the time-t trades of two agents are related. The first two terms in the expression represent covariance induced by the fact that two informed agents will tend to trade in the same direction because, both being informed, they will take a similar stand on whether the asset is over-priced or under-priced. This part of the expression increases in the total amount of information the agents have received at t − 1 (through Va,t−1 Vb,t−1 ), as well as in how much additional information they expect to receive between t − 1 and t (through ΔVa,t ΔVb,t ). Offsetting these effects is the aggregate informativeness of the 1 , similarly to what we saw in (7). The third market, through the terms (τv +Yt−1yt)(τv +Yt ) and τv +Y t term in the expression provides an additional positive boost to the covariance, and is increasing in the number of common agents at distance t of both agent a and b. This term is zero if the agents are further apart than a distance of 2t, but will otherwise typically be positive. The term captures the natural intuition that agents who receive identical information signals have more similar trades than agents who receive signals with independent error terms. The first main implication of (8) is that the covariance is always strictly positive. Thus, the situation with negative correlation between trades—because two nearby agents tend to trade in different directions when one is ramping down portfolio exposure whereas the other one is ramping up—never occurs in the model. To understand why this is the case, we use (5) to rewrite agent a’s time-t demand as τ j∈ΔSa,t sj − pt − Va,t (pt − pt−1 ) . ΔVa,t Δxa,t = γa ΔVa,t The first term in this expression represents the agent’s demand because of additional information received between t − 1 and t. We note that j∈ΔSa,t sj /ΔVa,t = v + ζ a , where the error term ζ a ∼ N (0, σ 2 /ΔVa,t ) is independent of prices. The second term represents the agent’s sloping demand curve, causing him to rebalance portfolio holdings when the price catches up, by selling (buying) stocks when the price increases (decreases) between t − 1 and t. For an agent who has 15 an information advantage at time t − 1, but receives no new information between t − 1 and t, this second term is the only one present (since ΔVa,t = 0). Now, agent b’s demand function has the same form as agent a’s and, assuming that agent b receives a lot of new information between t − 1 and t, the first term dominates. Negative correlation would then arise if agent b tends to ramp up when agent a ramps down, which is the case if Cov v + ζ b − pt , −(pt − pt−1 ) < 0. However, since the market is semi-strong form efficient, v − pt is independent of pt − pt−1 . Furthermore, since ζ b is independent of aggregate variables, Cov ζ b , −(pt − pt−1 ) = 0. In other words, since agent a’s rebalancing demand between t − 1 and t is publicly known at t, it must be independent of agent b’s time-t demand which in turn is due to informational advantage at time t. This result is model dependent. It depends on the linear structure of agents’ demand functions. However, to a first order approximation we expect the result to hold in more general settings in semi-strong form efficient markets, given that trading for rebalancing purposes mainly depends on price changes, trading for informational purposes depends on the difference between the true value and market price, and the two terms are uncorrelated in a weak-form efficient market. The second main implication of (8) is that, all else equal, period-by-period covariances (and correlations) between two agents’ trades increase in the number of common acquaintances they have (through the ΔVa,a ,t -term). As an example, in Panel C of Figure 2, we show a cyclical network. It immediately follows that the correlation between two agents’ trades at any time t < 4 (at which point all information has reached all agents) is decreasing in the distance between the two agents. These properties of trade correlations suggest that it may be justified to use trades instead of portfolio holdings to draw inferences about a market’s information network as, e.g., done in Ozsoylev, Walden, Yavuz, and Bildik (2013), using short time horizons over which trades are compared. A. C. B. a a Figure 2: Correlation of trades. The figure shows three networks that lead to different types of trading behavior. Panel A shows a network in which agent a’s trades at time 1 and 2 are negatively correlated, whereas they are positively correlated in Panel B. Panel C shows a network in which the correlation between two agents’ trades, at each point in time, is inversely related to the distance between the two agents. 16 4.2 Profits Who makes profits in an information network? Our starting point is the following theorem: Theorem 3 Define πa,t = τ Va,t πa,T = τ 1 1 − τv + Yt τv + Yt+1 Va,T , τv + YT , t = 1, . . . , T − 1, a = 1, . . . , N a = 1, . . . , N. The ex ante certainty equivalent of agent a is Ua = 1 log(C), 2γa where C = T (1 + πa,t ) . (9) t=1 The expected profit of agent a is τ Πa , γa where Πa = T (τv + Yt )−1 Va,t (10) (11) t=1 is the profitability of agent a. We focus on profitability, and leave welfare implications of the model for future research. Equation (10) determines (ex ante) expected profits of an agent. It shows that expected profits depends on three components. First, profits are inversely proportional to an agent’s risk-aversion, γa , because more risk-averse agents take on smaller positions—all else equal. This follows immediately, since an agent’s equilibrium trading position is proportional to γa , so it corresponds to pure scaling. Therefore, we do not include it in our measure of profitability, as defined by Equation (11). Neither do we include the signal precision, τ , which is constant across agents. Second, expected profits depend on an agent’s position in the network through {Va,t }t , t ∈ T : the higher any given Va,t is, the higher the agent’s expected profits. Third, expected profits depend inversely on the amount of aggregate information available in the market, in that at any given point in time, the higher the total amount of aggregate information, the lower the expected profits of any given agent. The third part represents a negative externality of information. Equation (11) thus provides a direct relationship between the properties of a network, local as well as aggregate, and individual agents’ profitability. 17 4.3 Centrality Equation (11) shows that an agent’s profitability is determined by his centrality, defined appropriately. Recall that Va,t denotes the number of agents that are within distance t from agent a. So, Va,1 is simply the degree of agent a. For t > 1, higher order connections are also important in determining profitability. For example, Va,2 does not only depend on how connected agent a is, but also on how connected his neighbors are. We use (1) to rewrite (11) on vector form as Π= ∞ βt χ(E t )1, (12) t=1 where βt = (τv + Yt )−1 , for t ∈ T , and βt = 0 for t > T . Here, Π is an N -vector where the ath element is the profitability of agent a. We explore the relationship between this profitability measure and standard centrality measures in the literature. Specifically, we define degree centrality, farness, closeness centrality, Katz centrality, and eigenvector centrality (see, e.g., Friedkin (1991) for a detailed discussion of these concepts), and compare (12) with these measures. Definition 3 • The degree centrality vector is the vector of first-order degrees, V 1 , i.e., the degree centrality of agent a is Va,1 . • The farness of agent a is the agent’s average distance to all other agents, i.e., a =a D(a, a ) . Fa = N −1 • The closeness centrality of agent a, Ĉa , is the inverse of that agent’s farness, Ĉa = 1 . Fa • The Katz centrality vector with parameter α < 1 is the vector K ∈ RN + , defined as K = Kα = ∞ αt E t 1. (13) t=1 Here, E t denotes the t:th power of the adjacency matrix, E, and 1 ∈ RN is an N -vector of ones. • The eigenvector centrality vector is the eigenvector corresponding to the largest eigenvalue of E, i.e., the vector C that solves the equation C = λEC, for the largest possible eigenvalue, λ, where we normalize C such that a∈N Ca = 1. 18 We see that the structures of the profitability measure (12) and Katz centrality (13) are similar. Specifically they are both made up by a weighted sum of powers of the adjacency matrix, multiplied with the vector of ones. The differences are that the weighting is a power of α for Katz centrality but varies more generally with t for profitability, and that the matrix indicator function, χ, operates on the power of the adjacency matrix in the profitability measure. It is a standard result that eigenvector centrality can be viewed as a special case of Katz centrality, α since C = limαλ−1 KK α .7 a a The similar structure of the formulas for profitability (12) and Katz centrality (13) suggests that Katz centrality is closely related to profitability, as is eigenvector centrality being a special case of Katz centrality. Therefore, we may expect Katz and eigenvector centrality to dominate the other measures for “an average” network, although there may be exceptions. To formalize this intuition, we introduce a random network generating process, which allows us to derive a probabilistic ranking of the different measures. We use the classical concept of random graphs, originally studied in Erdös (1947), Erdös and Rényi (1959), and Gilbert (1959), in which links between agents are formed randomly, independently, and with constant probability.8 Our focus is on sparse networks—in line with what is observed in practice (see the discussion in Ozsoylev, Walden, Yavuz, and Bildik (2013)). We assume that the expected number of links per agent in a network of size N is c log(N )k + 1 for some c > 0 and k > 3.9 With this assumption, the connectedness of agents grows with N , but the fraction of expected connections in the network (which is approximately cN log(N )k /2) to maximum number of connections (which is approximately N 2 /2) tends to zero for large N , so when N is large the network is indeed sparse. To this end, we define Definition 4 The Erdös-Rényi random graph model of size N , and parameters c > 0, and k > 3, G(N, c, k), is defined so that for each 1 ≤ a ≤ N , 1 ≤ a ≤ N , a = a , (a, a ) ∈ E with )k i.i.d. probability c log(N N −1 . We use cross sectional correlation across agents as the metric of similarity.10 Specifically, for a given network we define the cross sectional correlation between the different centrality measures and profitability, ρD = Corr(D, Π), ρF = Corr(−F, Π), ρĈ = Corr(Ĉ, Π), ρK α = Corr(K α, Π), ρC = Corr(C, Π). 7 Uniqueness of the eigenvector centrality measure is not guaranteed, but is almost never an issue in practice. We focus on the Erdös-Rényi model, since it is the most parsimonious and analytically most tractable. Many other network generating processes have also been suggested in the literature, e.g., the preferential attachment model of Barabasi and Albert (1999), and the model of Watts and Strogatz (1998). 9 The restriction k > 3 is needed for technical reasons. 10 2 2 The cross sectional statistics are defined as EX = N1 a∈N Xa , σ (X) = E[(X − EX) ], Cov(X, Y ) = E[(X − EX)(Y − EY )], and Corr(X, Y ) = Cov(X, Y )/(σ(X)σ(Y )). 8 19 Here, we use the convention that the correlation between any random variable and a constant is zero. In the random network model, these cross sectional correlations are in turn random, since they depend on the random realization of the network. Note that ρF is defined as the correlation between negative farness and profits, since farness is inversely related to connectedness. For simplicity, we focus on preference symmetric economies, but allow for γ, τ , τu , τv , v̄, and T to be arbitrary. The following result shows that Katz centrality dominates degree centrality, farness, and closeness centrality for large random graphs. Theorem 4 For any given c > 0 and k > 3, there is an N0 , such that for all networks of size N > N0 in the G(N, c, k) model, there is an α, such that E [ρK α ] > E [ρD ] > max E ρĈ , E [ρF ] . (14) Thus, for large random networks, profitability is best characterized by Katz centrality. This result provides a strong ranking between different centrality measures for a large class of networks. To the best of our knowledge, this is the first such ranking of centrality measures in equilibrium models of asset pricing with general networks. In contrast, previous applications of network theory to asset pricing typically provide results for very specific networks (e.g., Ozsoylev 2005, Colla and Mele 2010, and Buraschi and Porchia 2012), or is empirically motivated (e.g., Das and Sisk 2005, Adamic, Brunetti, Harris, and Kirilenko 2010, Li and Schurhoff 2012, and Ozsoylev, Walden, Yavuz, and Bildik 2013,). The challenge in proving the result stems from the fact that profitability is determined endogenously in equilibrium. The intuition behind the result is straightforward, as shown in the proof of the theorem, suggesting that it may extend to more general settings. The profitability equation (12) is made up by a weighted average of the connectedness of different orders (with the weights, βt , endogenously determined). Degree centrality exclusively focuses on first-order connections, whereas closeness centrality and farness mainly focus on high-order connections (since the vast majority of agents will be quite far away from any given agent in a large network). Katz and eigenvector centrality, in contrast, balance the weights of different orders of connectedness, and are thereby more closely related to profitability. Of course, our strong ranking of centrality measures holds only for large networks. We also explore the relationship between the different centrality measures in medium-sized networks using simulations, to verify that Katz centrality also works well in such networks. We randomly simulate 1,000 economies of sizes N = 50, 100, 200, 500, and 1000, respectively, and in each economy we randomly generate links between agents, using the random graph model, so that √ each agent is expected to have N links. The growth of the number of links in the network with N is thus slightly faster than in the G(N, c, k) model, although the graph is still asymptotically sparse. We measure the correlation between profitability (Π) and the different centrality measures. The results are shown in Table 1. We see that the ranking is the same as in Theorem 4 and, 20 Size of network, N 50 100 200 500 1000 Mean correlation C 0.96 0.94 0.94 0.95 0.97 0.95 Kα 0.96 0.94 0.94 0.95 0.97 0.95 V1 0.88 0.82 0.79 0.83 0.85 0.83 Ĉ 0.0 0.0 0.0 0.0 0.0 0.0 F 0.0 0.0 0.0 0.0 0.0 0.0 Table 1: Centrality Measures. The table shows the correlation between profitability, Π, and centrality for several different centrality measures, degree centrality (V 1 )), eigenvector centrality (C), Katz centrality (K α ), closeness centrality (Ĉ) and farness (F ). The parameters of the economy are σ = σu = σv = 10, v̄ = 0, T = 5, and γ = 1 for all agents. The number of agents is varied between N = √ 50 and N = 1000, and links are randomly drawn between agents, such that on average an agent has N links. For each N , we simulate 1, 000 random networks. The results show that eigenvector and Katz centrality are most closely related to profitability, followed by degree and betweenness centrality. Closeness centrality and farness perform poorly, since there is usually an isolated agent in the network, leading to all agents having closeness centrality of zero and infinite farness centrality. furthermore, that eigenvector centrality, being a special case of Katz centrality, performs about as well as the Katz centrality measure. 5 Aggregate volatility and trading volume According to the Mixture of Distributions Hypothesis (MDH), return volatility varies over time, which then leads to heavy-tailed unconditional return distributions. A common explanation for such time varying volatility—as well as trading volume—is lumpy diffusion of information into the market (Clark 1973; Epps and Epps 1976; Andersen 1996). Rich dynamics of volatility and trading volume have indeed been documented in the literature. First, trading volume is positively autocorrelated over extended periods, i.e., the autocorrelation function of trading volume has “long memory” in that it decreases very slowly (see Bollerslev and Jubinski 1999 and Lobato and Velasco 2000). Roughly speaking, this means that abnormally high trading volume one period predicts abnormally high trading volume for many future periods. Second, return volatility of individual stocks, and markets, also have long memory (e.g., Bollerslev and Jubinski 1999 and Lobato and Velasco 2000). Third, volume and volatility are related (Karpoff 1987; Crouch 1975, Rogalski 1978), in line with the Wall Street wisdom that it takes volume to move markets. Contemporaneously, trading volume and absolute price change are highly positively correlated. The two series also have positive lagged cross-correlations. Using a semi-parametric approach to study returns and trading volume on the NYSE, Gallant, Rossi, and Tauchen (1992) show that large price movements predict large trading volume. In other markets, there is evidence for a reverse casualty, i.e., that large trading volume leads large price movements. For example, Saatcioglu and Starks (1998) find such evidence in several Latin American equity markets. 21 In our model, agents’ preferences (Γ) and the network structure (E) determine volatility and volume dynamics in the market. It is then natural to ask which type of dynamics can be generated within the model for general Γ and E, and also what one can infer about the network structure from observed volatility and volume dynamics. 5.1 Volatility In a general network economy, we would expect price volatility to vary substantially over time. For example, information diffusion may initially be quite limited, with low price volatility as an effect, but eventually reach a hub in the network, at which point substantial information revelation occurs with associated high price volatility. The following result characterizes the price volatility over time, and moreover shows that any volatility structure can be supported in a general economy. Theorem 5 For t = 1, · · · , T , the variance of prices between t − 1 and t, is 2 = σp,t yt , (τv + Yt )(τv + Yt−1 ) (15) where we use the convention that Y0 = 0, and between T and T + 1, it is 2 σp,T +1 = 1 . τv + YT (16) +1 kt = 1 and an arbiMoreover, given coefficients, k1 , . . . , kT +1 , such that kt > 0, and Tt=1 trarily small > 0, there is a preference symmetric economy, such that 2 σp,t − kt ≤ , t = 1, . . . , T + 1. (17) τv From the first part of the theorem, we see that the volatility (the square root of variance) has a general decreasing trend over time because of the increasing denominator in (15), but that it can still have spikes in some time periods because of large values in the numerator. In fact, (17) shows that any structure of time-varying volatility after an information shock can be generated. Note that the total cumulative variance up until time t ≤ T is 2 = σP,t 1 Yt . τv τv + Yt (18) This part of the variance that is incorporated until T represents the “information diffusion” component of asset dynamics, whereas the part between T and T + 1, i.e., (16), represents the component due to public information sources, in line with the discussion in Section 3.2. The 22 total price variance between 0 and T + 1 is of course equal to the ex ante variance, 2 2 + σp,T σv2 = σP,T +1 = 1 YT 1 + , τv τv + YT τv + YT independently of network structure. But the way the variance is divided period-by-period, and into the information diffusion and public component, depends on the network. It is specifically determined by yt , t = 1, . . . , T . If we restrict our attention to network symmetric economies, the possible dynamic behavior of volatility is much more restricted. This is not surprising, given the restrictions on information diffusion dynamics described in Lemmas 1 and 2. From (18), it follows that yt is proportional 1 2 to Δηt = ηt − ηt−1 , where ηt = σ2 −σ 2 . Since yt is proportional to ΔVt (which is the same for v P,t all agents in a network symmetric economy), ΔVt is unimodal, and the square of a nonnegative unimodal function is also unimodal, it follows that the sequence Δη is also unimodal. Moreover, 2 if the public information component is large compared with the diffusion component, σP,T << 2 σv , it follows from (18) that 2 , (19) σv4 Δηt ≈ σp,t so in this case σp,t is also unimodal. We summarize this in Corollary 1 In a network symmetric economy, 1. Δηt is unimodal, 2 << σv2 , then σp,t is unimodal. 2. if the public information component is high, σP,T Of course, the timing of an information shock is typically not known. We have in mind a situation in which information shocks arrive every now and again, and are then gradually incorporated into prices. We do not observe the timing of these shocks, and therefore do not know when t = 0 in our model. However, autocorrelations — which take averages over all time periods — can be calculated even without observing the timing of shocks. Now, there is a close relationship between a function’s shape and the shape of its autocorrelation function. For a general sequence, f0 , f1 , . . . , fT , we define the difference operator (Δf )k = fk − fk−1 , k = 1, . . . , T , (Δf )T +1 = 0 − fT , and the higher-order operators Δn+1 f = Δ(Δn f ). The following lemma relates the monotonicity of a function and its autocorrelation function: Lemma 3 Consider a nonnegative sequence f0 , f1 , . . . , fT , such that f0 = 0. Define the auto −k fk fk+|τ |, −T + 1 ≤ τ ≤ T − 1. Then, correlation function Rτ = Tk=0 1. if f is unimodal, so is R, 2. if Δn f does not switch sign, then R has at most n − 1 turning points. 23 In light of Corollary 1, the result immediately leads to: Corollary 2 In a network symmetric economy with a high public information component, the autocorrelation function of volatility is unimodal. Thus, given the restrictions imposed by network symmetry, nonmonotonicity of the autocorrelation function of volatility suggests that there is asymmetry in the underlying information network. One way of measuring the degree of monotonicity of an autocorrelation function is by counting its number of turning points for positive τ (since the autocorrelation function is symmetric, it is sufficient to study τ > 0). Another way is to measure its absolute variation, τ >0 |Rτ +1 −Rτ |, which will be higher the more nonmonotone the autocorrelation function. To summarize, network symmetry leads to hump-shaped volatility after an information shock, and also to hump-shaped autocorrelation functions, whereas any dynamics can arise in a preference symmetric economy. 5.2 Volume Just like with volatility, rich dynamics of trading volume can arise within the network model. Since the model is inherently asymmetric, aggregate trading volume will be made by the heterogeneous trades of many different agents. This contrasts to the uniform behavior in models with a representative informed agent (e.g., Kyle 1985), as well as to the ex ante symmetric behavior in economies with symmetric information structures (e.g., Vives 1995; He and Wang 1995). We focus on the aggregate period-by-period trading volume of agents in the network, since the stochastic supply is (quite trivially) normally distributed. To this end, we define: Definition 5 The time-t aggregate (realized) trading volume is Wt = ante) expected trading volume is Xt = E [Wt ]. 1 N a |Δxa,t |, and the (ex The following theorem characterizes the expected trading volume, and mirrors our volatility results by showing that any pattern of expected trading volume can be supported in the model. Theorem 6 The time-t expected trading volume is Xt = N 2 2 2 V V ΔV ΔV τ 1 2 a,t−1 a,t a,t a,t . − + + N a=1 γa π τv + Yt−1 τv + Yt τv + Yt τ 24 (20) Given positive coefficients, c1 , c2 , . . . , cT +1 , and any > 0, there is an economy such that |Xt − ct | ≤ , t = 1, . . . , T + 1. It is clear from (20) that, as is the case for volatility, heterogeneous preferences alone cannot generate such complete generality of trading volume dynamics. In a network symmetric economy, all terms under the square root are identical across agents, and (20) collapses to Xt = 2τ 2 πγ̄ 2 2 Vt−1 Vt2 ΔVt2 ΔVt − + + . τv + Yt−1 τv + Yt τv + Yt τ (21) Again, differences in preferences in this case are only important through the effect they have on the (harmonic) average risk aversion coefficient, γ̄. Moreover, the restrictions on ΔVt imposed by network symmetry carry over to trading volume. It is easily seen that if the public information component is large compared with the diffusion component, the fourth term under the square 2τ 2 root in (21) dominates and Xt ≈ πγ̄ 2 ΔVt . In this case, we therefore get: Corollary 3 In a network symmetric economy, in which the public information component is high: • Xt is unimodal, as is its autocorrelation function, • There is an approximate square root relationship between σp,t and Xt : Xt ∼ √ σp,t . (22) To summarize, the degree of nonmonotonicity of the autocorrelation functions of volatility and trading volume are related to the degree of asymmetry in the information diffusion process. In the network symmetric case, after an information shock trading volume and price volatility increase, reach a peak and then decrease. In the asymmetric case, the dynamics of volume and volatility after an information shock may be nonmonotone. Also, the instantaneous correlation of volatility and volume is high in the symmetric case. In the completely general case, with heterogeneity over both preferences (γa ) and network structure (Va,t ), we expect the interplay between the two to give rise to quite arbitrary trading volume and volatility dynamics. For example, at a large time, t, almost all information may have diffused among the bulk of agents, leading to a small and decreasing ΔVa,t , and thereby low volatility. A peripheral agent with very low risk aversion, who receives many signals very late, may still generate large trading volume at such a late point in time, despite the low volatility. The above example captures the important distinction between trading volume driven by high aggregate information diffusion, and by demand from agents with low risk aversion, a 25 distinction that does not arise in either the preference symmetric or the network symmetric benchmark cases. 5.3 Conditioning on event time In the previous analysis we studied unconditional relations, assuming that the arrival time of information shocks is unknown. In some cases the arrival time may be observable. Earnings announcements may, for example, constitute well defined information events for which we could use such conditioning. Although the actual announcement is a public event, if agents have additional private information, the inferences they draw from the announcement together with their private information may differ. If the time of arrival of the information shock is well-identified, we can condition on the time that has passed since arrival and calculate conditional relations between returns and volume. Specifically, we can use t in our information set when calculating relations, e.g., defining Cov(Wt , Wt+1 ) = E[(Wt −E[Wt ])(Wt+1 −E[Wt+1 )]|t]. We define the price change, μt = pt −pt−1 , and we then have Theorem 7 Trading volume and price changes satisfy the following conditional relations: • Corr(Wt , |μt |) > 0, • Corr(Wt , Wt−1 ) > 0, • Wt is independent of μs for s < t. The first two results state that trading volume is contemporaneously positively related to absolute price changes, as well as positively autocorrelated. These results are in line with the previously discussed empirical literature. The third result states that price movements do not cause lagged trading volume. The result seems to be inconsistent with what is found in Gallant, Rossi, and Tauchen (1992), but consistent with the results in Saatcioglu and Starks (1998). We emphasize though that to test this prediction, one needs to condition on the time elapsed after an information event, which these studies do not do. It is therefore an open question whether the prediction holds empirically. 5.4 Size versus volatility and volume The results in Sections 5.1 and 5.2 can potentially be used to understand variations in dynamics across different markets and assets. One such variation is between companies of different size, where we may expect more transparent information diffusion for large than for small companies, because of the higher coverage of large companies. The underlying information network structure for large companies may therefore be relatively symmetric, and the public information 26 component may be large. In contrast, the network structure of smaller companies may be more asymmetric, and the nonpublic information component more significant. We carry out initial tests of whether the observed dynamics of volatility and volume in the market is consistent with such differences between large and small companies. The tests are suggestive and exploratory. A more extensive empirical investigation, e.g., in the form of reverse engineering the underlying network using maximum likelihood estimation, is outside of the scope of this paper. 5.4.1 Data Using the Center for Research in Security Prices (CRSP), we collect daily returns, market value, and trading volume over a ten-year period (January 2003-December 2012), for all companies in the Russell 3000 Index. We classify the companies as Large Cap (market capitalization above USD 10 Billion), Mid Cap (market capitalization between USD 1 Billion and 10 Billion), and Small Cap (market capitalization below USD 1 Billion). We require data for at least 98% of the days in the sample (2,466 of the 2,517 trading days) to be available, to include a company in the sample, leaving us with 276 Large Cap, 746 Mid Cap, and 502 Small Cap companies, in total J = 1524 companies. We calculate weekly return volatility and turnover for each stock in the sample, which we σ , and turnover, RW , for each stock then use to calculate weekly autocorrelations of volatility, Rj,t j,t (j), and lag t = 1, 2, . . . , 12 weeks. We use turnover instead of raw volume, to get a normalized trading volume measure. 5.4.2 Results We calculate the average number of turning points of the autocorrelation functions for volatility and turnover, in the three groups. The results are shown in Table 2. We see that the number of turning points is larger for smaller firms, both for autocorrelation of volatility and of turnover. The differences are statistically significant at the 2.5% level (t-statistic of 2.02, using a one-sided test) when testing for differences of means between Mid Cap and Small Cap companies, whereas it is significant at the 0.1% level for the other three tests (Large Cap verses Mid Cap, and Mid Cap versus Small Cap for volatility, and Large Cap versus Mid Cap for turnover). This is in line with our analysis, that large companies have more symmetric information networks with a larger public component than small companies. A similar picture emerges for absolute variation of the autocorrelation functions of volatility and volume, for firm j defined as 12 σ |Rj,t − σ Rj,t−1 | and t=1 12 W W |Rj,t − Rj,t−1 |, t=1 respectively. As seen in Table 2, the average absolute variation is higher for smaller companies, with similar statistical significance as for the number of turning points. Since absolute variation 27 Large Cap σ A. Volatility, Rj,t Average number of turning points Mid Cap 3.63 4.08 (3.76) Average absolute variation 0.88 0.92 3.70 4.18 0.71 0.77 0.57 0.85 (6.70) 0.53 (-4.00) Number of firms 4.37 (2.02) (3.90) C. Correlation 0.99 (5.67) (3.74) Average absolute variation 4.46 (3.81) (2.03) W B. Turnover, Rj,t Average number of turning points Small Cap 276 0.41 (-14.1) 746 502 Table 2: Company size versus dynamics of volatility and turnover. Panel A shows how the σ , is related to company size. Companies are divided into three autocorrelation function of volatility, Rj,t groups: Large Cap (market capitalization above USD 10 Billion), Mid Cap (market capitalization between USD 1 and 10 Billion), and Small Cap (market capitalization below USD 1 Billion). Average number of turning points, and average absolute variation are shown in rows 1 and 3, respectively. The t-statistic of a difference of means test between number of turning points of Large Cap and Mid Cap companies, and between Mid Cap and Small Cap companies, are shown in columns 2 and 4 of row 2, respectively, and the same difference in means tests for absolute variation in rows 2 and 4 of row 4. Panel B shows the same for the autocorrelation function of turnover. Panel C shows the average correlation between volatility and turnover for the three groups of companies. Columns 2 and 4 show the t-statistics of difference in means test between Large and Mid Cap companies, and between Mid and Small Cap companies, respectively. provides another measure of nonmonotonicity, this reinforces the view that smaller stocks are associated with more asymmetric information networks than large stocks. Finally, the cross correlation between volatility and turnover is higher for Large Cap companies than for Small Cap companies (0.57 for Large Cap versus 0.41 for Small Cap, with differences in means strongly statistically significant both between Large Cap and Mid Cap, and between Mid Cap and Small Cap). We argued earlier that in a symmetric network with a large public information component, there will be a close instantaneous link between volatility and volume, so this result is also consistent with less asymmetric networks for large stocks. One potential issue with the test above is that trading volume is known to be non-stationary, increasing over time (see, e.g., Lo and Wang 2000). Although we use turnover in our test, which did not seem to have an identifiable trend during the sample period (the average weekly turnover was 2.8% in the first year, 2003, peaked around 10% at the height of the financial crises, 2008, and was then down to 3.4% in the last year of the period, 2012), we wish to control for trends in our test, for the sake of robustness. We therefore carry out the same tests as in Table 2, but normalize volatility and turnover by dividing these variables by the average volatility and turnover across all stocks, in any given week. Thus, the sum of normalized volatilities and turnovers in any given week is equal to one. The results (not reported) are virtually identical to those in Table 2, so non-stationarity does not see to be driving our results. Another implication of the model is that in network symmetric economies with large public 28 information components, expected trading volume scales as the square root of volatility (22). In log-coordinates, we can write wj,t = aj + bj sj,t + j,t , (23) where wj,t = log(Wt ) is the logarithm of trading volume, and sj,t = log(σp,t ) is the logarithm of volatility, for firm j at time t, j,t are random shocks, and the coefficient bj = 0.5 for firms with symmetric networks and large public information components. We investigate this relation. For each firm, we regress log-turnover on log-volatility, and we then calculate the average bj coefficient for firms in the three different groups. The results are shown in Table 3, Panel A. We see that the average coefficients are lower than 0.5, although quite close, for all groups. They are about 0.43 for the Large Cap and Medium Cap groups, and 0.38 for the Small Cap group. This suggests that the relationship between volatility and volume is farthest away from a square root relationship for stocks in the Small Cap group, again consistent with more asymmetry in this group. The difference in means between Mid Cap and Small Cap companies is significant, with a t-stat of -5.67. The regressions in (23) do not separate between aggregate shocks to volatility and volume, and company specific shocks. If shocks are mainly aggregate, a cross sectional comparison between companies will allow for only limited inferences. We can see how well volatility andturnover are regressing log-average turnover, w̄t = related in the aggregate market, by 1 1 log J j Wt (j) , on log-average volatility, s̄t = log J j σp,t (j) , i.e., w̄t = ā + b̄s̄t + t . The b̄ coefficient in this regression is insignificantly different from 0.5 (see Panel C of Table 3), suggesting that the square root relationship holds well at the market level. To focus on excess turnover and volatility, above and beyond market shocks, we regress wj,t − w̄t = aej + bej (sj,t − s̄t ) + ej,t . The average bej coefficients for companies in the three groups are shown in Panel B of Table 3. The coefficients are lower for all groups compared with those in Panel A—between 0.32 and 0.35—as are the R-squares, suggesting a significant market component that generates both volatility and volume for individual stocks. The lowest average coefficient is again for companies in the Small Cap group, significantly lower than the average of the Mid Cap companies (t-statistic of -2.51). However, the average coefficient for the Large Cap companies is in this case lower than for the Mid Cap companies. Altogether, the results are consistent with the following story: Large companies have symmetric information networks, and the type of information that is relevant for these companies is also closely related to aggregate market information, causing their excess regression coefficient to be low. Mid-sized companies also have symmetric networks, but their information is not as closely related to market information, i.e., it is more idiosyncratic. Finally, 29 A. Log-regression Average b Large Cap Mid Cap Small Cap 0.43 (0.009) 0.43 (0.005) 0.38 (0.008) (0.35) Average R2 B. Excess log-regression Average be (-5.67) 0.30 0.25 0.15 0.33 (0.011) 0.35 (0.0065) 0.32 (0.0096) (1.64) Average R2 C. Market regression Market b̄ R2 D. Correlations Average ρ(s̄, sj ) Average ρ(w̄,wj ) 0.17 Market (-2.51) 0.15 0.11 0.51 (0.022) 0.52 0.63 0.60 0.60 0.54 0.54 0.41 Table 3: Relationship between log-turnover and log-volatility. Panel A shows how the regression coefficient of log-turnover on log-volatility is related to company size. Companies are divided into three groups: Large Cap (market capitalization above USD 10 Billion), Mid Cap (market capitalization between USD 1 and 10 Billion), and Small Cap (market capitalization below USD 1 Billion). Average regression coefficients are shown in row 1, and standard deviations in row 2. The t-statistic of difference of means tests between regression coefficients of Large Cap and Mid Cap companies, and between Mid Cap and Small Cap companies, are shown in columns 2 and 4 of row 3, respectively. The average R-squares are shown in Row 4. Panel B shows the same for excess log-turnover (defined as wj,t − w̄t ) regressed on excess log-volatility (defined as sj,t − s̄t ). Panel C shows the coefficient when mean log turnover for all firms in the sample, w̄t , is regressed on mean log volatility, s̄t . Panel D shows the average (time series) correlations between mean log-volatility and individual company volatility (row 1), and between mean log-turnover and individual company log-turnover (row 2), for the three groups. small companies have asymmetric networks, and information about these companies is even more idiosyncratic. The correlations between company volatility and turnover are indeed different for companies of different sizes, as shown in Panel D of Table 3. The panel shows the average (time series) correlation between market and individual firm log-turnover, Corr(w̄,wj ), and between market and individual firm volatility, Corr(s̄, sj ), for the three groups. The average correlations are lower for smaller than for larger companies. Thus, altogether our tests suggest that price and volume dynamics of small stocks may be especially well explained by information diffusion through asymmetric networks. 6 Concluding remarks We have introduced a general network model of a financial market with decentralized information diffusion, allowing us to study the effects of heterogeneous preferences and asymmetric diffusion of information among investors, and the interplay between the two. At the individual investor 30 level, our results show that the trading behavior of investors is closely related to their positions in the network: Closer agents have more positively correlated trades even over short time periods, and standard network centrality measures are closely related to agents’ profits. At the aggregate level, the dynamics of a market’s volatility and trading volume is related to—and could therefore be used to draw inferences about—the underlying information network. In an initial empirical investigation, we show that the dynamics for large stocks are consistent with symmetric information networks and large public information components, whereas the dynamics of smaller stocks suggest asymmetric information diffusion through nonpublic channels. We leave to future research to shed further light on the underlying information network in the market. Another future line of research may be the study of endogenous network formation in financial markets. Given the strong characterization of individual agents’ welfare, the model may be extended to include a period before signals are received and trading occurs, during which agents form connections in anticipation of the future value these will generate. It is an open question what type of networks will form in equilibrium in such an extension, and how well those networks could match observed dynamics at the individual and aggregate level. 31 Proofs Proof of Theorem 1: We prove the result using a slightly more general formulation, where the volatility of noise trade demand is allowed to vary over time, so instead of τu , we have τu1 , . . . , τuT . We first state three (standard) lemmas. Lemma 4 (Projection Theorem) Assume a multivariate signal [μ̃x ; μ̃y ] ∼ N ([μx ; μy ], [Σxx , Σxy ; Σyx , Σyy ]). Then the conditional distribution is −1 μ̃x |μ̃y ∼ N μx + Σxy Σ−1 yy (μ̃y − μy ), Σxx − Σxy Σyy Σyx . Lemma 5 (Special case of projection theorem) Assume an K-dimensional multivariate signal v = [v; s] ∼ N (v̄1, σv2 11 + Λ2 ), where Λ = diag(0, σ1 , . . . , σM −1 ). This is to say that v ∼ N (v̄, σv2 ), si = v + ξi , where ξi ∼ N (0, σi2 )’s are independent of each other and of v, i = 1, . . . , K − 1. Then the conditional distribution is v|s ∼ N Here, τ = (τ1 , . . . , τK−1 )T , τi = σi−2 , τ = M −1 i=1 1 τv 1 v̄ + τ s, τv + τ τv + τ τv + τ . τi , and τv = σv−2 . Lemma 6 (Expectation of exponential quadratic form) Assume x ∼ N (μ, Σ), and that B is a symmetric positive semidefinite matrix. Then 1 E e− 2 (2a x+x Bx) = −1 −1 −1 −1 −1 1 1 e− 2 (μ Σ μ−(Σ μ−a)(Σ +B) (Σ μ−a)) . |I + ΣB|1/2 The structure of the proof is now quite straightforward, the extension compared with previous literature being the heterogeneous information diffusion. We first assume that agents’ demand takes a linear form at each point in time, and calculate the market maker’s pricing function given observed aggregate demand in (3). This turns out to be linear in a way such that the market maker’s information is completely revealed in prices. Thus, pt and wt convey the same information. We then close the loop by verifying that given the market maker’s pricing function in each time period, each agent when solving their backward induction problem will derive demand and utility according to (5,6), verifying that agents’ demand functions are indeed linear. It will be convenient to use the variables Qa,t = τ Va,t . We enumerate the agents from one-dimensionally from 1 to N̄ , so that agent 1, . . . , N represents the agents in the first replica network, agents N + 1, . . . , 2N , the agents in the second replica network, etc. Assume that agent a’s time-t demand function is xa,t (za,t , pt ) = Aa,t za,t + ηa,t (pt ). Then the total average agent demand is xt (v, pt ) = N̄ 1 Aa,t za,t + ηa,t (pt ) = At v + ηt (pt ), N̄ a=1 1 N 1 N where At = N a=1 Aa,t , and ηt = N a=1 ηa,t (pt ), with the convention, A0 = 0, η0 ≡ 0. Here, we are using the fact 1 M −1 that in our large network limM →∞ M r=0 za+rM,t = v for all a and t (almost surely). The net demand at time t is then the difference between time t and t − 1 demands, Δxt = xt (v, pt ) − xt−1 (v, pt−1 ) = (At − At−1 )v + η(pt ) − η(pt−1 ). Now, the market maker observes total time t demands, wt = Δxt + ut , and since the functions ηt and ηt−1 are known, the market maker can back out Rt = (At − At−1 )v + ut . (24) This leads to the following pricing formula, which immediately follows from Lemma 2. Lemma 7 Given the above assumptions, the time-t price is given by pt = t τv τ̂ut 1 v̄ + v + (As − As−1 )τus us , τv + τ̂ut τv + τ̂ut τv + τ̂ut s=1 32 (25) −2 where τ̂ut = ts=1 (As − As−1 )2 τus , τus = σu s . Equivalently, pt = λt Rt + (1 − λt (At − At−1 ))pt−1 , where λt = τut (At −At−1 ) , t τv +τ̂u (26) and p0 = v̄. Proof of Lemma 7: At time t, the market maker has observed R1 , . . . , Rt . We define the vector s = (R1 /(A1 − 2 /(A − A 2 A0 ), R2 /(A2 − A1 ), . . . , Rt /(At − At−1 )) , and it is clear that si ∼ N (v̄, σv2 + σu i i−1 ) ). It then follows immediately i from Lemma 2 that t τv 1 1 Ri 2 v|s ∼ N v̄ + (A − A ) τ , , u i i−1 i τv + τ̂ut τv + τ̂ut i=1 Ai − Ai−1 τv + τ̂ut i.e., v = Vt + σVt ξtV , where Vt = τv t v̄ τv +τ̂u + 1 t τv +τ̂u t i=1 (Ai 2 = − Ai−1 )τui Ri , σV t 1 τVt (27) , where τVt = τv + τ̂ut . So, pt = Vt = E[v|s] takes the given form in the first expression of the lemma. A standard induction argument, assuming that the second expression is valid up until t − 1, shows that the expression then is also valid for t. 2 ). We have shown Note that (27) is the posterior distribution of v given the information R1 , . . . , Rt , so v ∼ N (Vt , σV t Lemma 7. Thus, linear demand functions by agents imply linear pricing functions in the market, showing the first part of the proof. We next move to the demand functions and expected utilities of agent a, given the pricing function of the market maker. We have (using lemma 1 for the posterior distribution) at time t, the distribution of value given {za,t , pt } is v|{za,t , pt } ∼ N τVt τa,t 1 Vt + za,t , τVt + τa,t τVt + τa,t τVt + τa,t . The time-T demand of an agent can now be calculated. Since individual agents condition on prices, they also observe Ř, 2 /Ā2 )) leads to and agent a’s information set is therefore {za , Ř}, which via Lemma 1 (with Λ = diag(0, σi2 , σu ⎛ ⎞ ⎜ ⎟ ⎜ ⎟ τv τa τ̂u 1 ⎟ v|{za , Ř} ∼ N ⎜ ⎜ τ + τ + τ̂ v̄ + τ + τ + τ̂ za + τ + τ + τ̂ Ř, τ + τ + τ̂ ⎟ . v a u v a u v a u v a u ⎝ ⎠ 2 σ̂a μa At time T , given the behavior of the market maker, the asset’s value, given agent a’s information set is therefore conditionally normally distributed so given agent a’s CARA utility, the demand for the asset (2) takes the form: xa,T (za , p) = E[v|Ia,T ] − pT , γa σ2 [v|Ia,T ] a = 1, . . . , N̄ . The demand of agent a is therefore xa,T (za,T , pT ) = = = = = It follows that AT = 1 N N Na,T a=1 σ 2 γa μa − pT γa σ̂a2 1 τv v̄ + τa za + τ̂u Ř − (τv + τa + τ̂u )p γa 1 τa τv v̄ + τa za + τ̂u Ř − 1 + (τv v̄ + τ̂u Ř) γa τv + τ̂u 1 (τa za − τa p) γa τa,T za,T − pT . γa . 33 (28) 2 ), and z Since v − pT ∼ N (0, σV a,T − pT = ζa,T + (v − pT ), where ζa,T is independent of v − pT , it follows that t ζa,T |(za,T − pT ) ∼ N ( τ τV T +τa,T (za,T − pT ), VT of zero), given za,T − pT , is then Ua,T τV T 1 +τa,T ). The expected utility of the agent at time T (with time-T wealth = −E e−γa xa,T (v−pT ) za,T − pT −E e−τa,T (za,T −pT )(za,T −pT −ζa,T ) za,T − pT 2 −e−τa,T (za,T −pT ) E e−τa,T (za,T −pT )(−ζa,T ) za,T − pT = −e−τa,T (za,T −pT ) e = −e = = 2 τa,T Vt +τa,T −τ τV T VT +τa,T τa,T (za,T −pT ) τ (za,T −pT )− 1 τ 2 (za,T −pT )2 τ 2 a,T 1 VT +τa,T (za,T −pT )2 ((τVt +τa,T −τVt )− 1 τ 2 a,T ) 2 τa,T −e = −1 (za,T −pT )2 2 τV +τa,T t . τV T , using lemma 3. It is easy to check that the unconditional expected utility is −E0 [e−γa xa,T (v−pT ) ] = − τ +τ VT a,T t t Va,t t 2 We define Yt = τ̂u = i=1 yi , where yi = (Ai − Ai−1 ) τui , and recall that Qa,t = τa,t = σ 2 = i=1 qa,i , where This shows the result at T . qa,i = Va,i −Va,i−1 σ2 = ΔVa,i . σ2 With this notation we have and −E0 [e−γa xa,T (v−pT ) ] = − Ua,T = xa,T = τv +YT τv +YT +τ Va,T −1 Q2 a,T (za,t −pt )2 −e 2 τv +YT +QT Qa,T (za,T − pT ), γa = −√ 1 Ca,T , = −Da,T . We proceed with an induction argument: We show that given that (5,6) is satisfied at time t, then it is satisfied at time t − 1. As already shown, pt−1 , and za,t−1 sufficiently summarizes agent a’s information at time t − 1 (given the linear pricing function). From the law of motion, Wa,t = Wa,t−1 + xa,t−1 (pt − pt−1 ), an agent’s optimization at time t − 1 is then Ua,t = arg max −Ea,t−1 e−γa Wa,t−1 −γa xa,t−1 (pt −pt−1 ) Da,t e xa,t−1 = arg max −Da,t e−γa Wa,t−1 Ea,t−1 e xa,t−1 def = arg max −Da,t e−γa Wa,t−1 Ea,t−1 e b Q2 t −1 (za,t −pt )2 2 τv +Yt +Qt z a,t−1 , pt−1 Q2 t −γa xa,t−1 (pt −pt−1 )− 1 (za,t −pt )2 2 τv +Yt +Qt z Q2 t −b(pt −pt−1 )− 1 (za,t −pt )2 2 τv +Yt +Qt z a,t−1 , pt−1 a,t−1 , pt−1 . (29) Thus, we need to calculate the distributions of pt − pt−1 and za,t − pt given za,t−1 and pt−1 . From the signal structure, we have the following relationship za,t−1 za,t = = v + ξt−1 , v + ξt = v + ξt−1 ∼ N 0, 1 Qt−1 Qt−1 qt ξt−1 + et , Qt Qt , (30) et ∼ N 0, 1 qt , (31) where et and ξt−1 jointly independent and independent of all other variables. In the new notation, from (4), we have pt = t τv Yt 1 v̄ + v+ (As − As−1 )τus us , τv + Yt τv + Yt τv + Yt s=1 (32) pt−1 = t−1 τv Yt−1 1 v̄ + v+ (As − As−1 )τus us , τv + Yt−1 τv + Yt−1 τv + Yt−1 s=1 (33) A4 34 so pt − pt−1 = τv τv − τv + Yt τv + Yt−1 v̄ + Yt Yt−1 1 − v+ (At − At−1 )τut ut τv + Yt τv + Yt−1 τv + Yt √ y t τu t−1 1 1 − (As − As−1 )τus us , τv + Yt τv + Yt−1 s=1 + t A1 (34) B1 and also za,t − pt Qt−1 qt τv τv ξt−1 + et − v̄ + v Qt Qt τv + Yt τv + Yt = A2 t−1 1 1 (At − At−1 )τut ut − (As − As−1 )τus us . τv + Yt τv + Yt s=1 − √ (35) yt τut This leads to the unconditional distribution: ⎡ ⎤ ⎛⎡ pt − pt−1 st − p t ⎥ ⎜⎢ ⎦ ∼ N ⎝⎣ st−1 pt−1 ⎢ ⎣ ⎤ ⎡ ΣXX 0 0 ⎥ ⎢ v̄ ⎦ , ⎣ ΣXY v̄ ΣXY ⎤⎞ ⎥⎟ ΣY Y ⎦⎠ , (36) Here, ⎡ ΣXX = ⎣ = B1 Yt−1 A2 yt yt A1 A2 2 1 + (τ +Y − (τ +Y 2 + B1 Yt−1 2 − τv +Yt τv τv v t) v t) 2 B1 Yt−1 Yt−1 A A1 A2 yt yt 1 − (τ +Y + τ 2 + (τ +Y 2 − τv +Yt 2 + (τ +Y )2 τv Qt v v t) v t) v t yt 0 (τv +Yt )(τv +Yt−1 ) , 1 1 + τ +Y 0 Qt v t ⎡ ΣY Y = ⎣ 1 τv ⎣ ΣXY = = 1 Qt 1 Qt−1 A4 τv ⎡ = + 1 τv + 1 Qt−1 Yt−1 τv (τv +Yt−1 ) A1 τv + A2 τv A2 4 τv + A4 τv Yt−1 (τv +Yt−1 )2 Yt−1 τv (τv +Yt−1 ) Yt−1 τv (τv +Yt−1 ) 0 ⎦ ⎤ ⎦ ⎤ ⎦, A4 A1 B1 + τ +Y Yt−1 τv v t−1 A2 A4 1 1 − Y τv τv +Yt−1 τv +Yt t−1 yt (τv +Yt )(τv +Yt−1 ) 1 1 + τ +Y Qt v t ⎤ . 0 We use the projection theorem to write [pt − pt−1 ; za,t − pt ] ∼ N (μ, Σ̂), where μ = ΣXY Σ−1 Y Y [za,t−1 − v̄; pt−1 − v̄], and Σ̂ = ΣXX − ΣXY Σ−1 Y Y ΣXY . It follows that ⎡ μ=⎣ Qt−1 yt (τv +Yt )(τv +Qt−1 +Yt−1 ) Qt−1 (τv +Yt−1 )(τv +Qt +Yt ) Qt (τv +Qt−1 +Yt−1 )(τv +Yt ) 35 ⎤ ⎦ (za,t−1 − pt−1 ). We rewrite (29) as Ua,t 1 arg max −Da,t e−γa Wa,t−1 E e−ax1 − 2 x Bx , = q ! where x = [pt − pt−1 ; zt − pt ], B = 0, 0; 0, Q2 t τv +Yt +Qt " , a = [q; 0], and q = γa xa,t−1 . From Lemma 3, it follows directly that this maximization problem is equivalent to Ua,t = arg max −Da,t e−γa Wa,t−1 |I + Σ̂B|1/2 q 1 e− 2 (μ Σ̂−1 μ−(Σ̂−1 μ−a)Z(Σ̂−1 μ−a)) , where Z = (Σ̂−1 + B)−1 . Clearly, the optimal solution is given by arg max q(ZΣ̂−1 μ)1 − q leading to q ∗ = (ZΣ̂−1 μ)1 . Z11 It is easy to verify that Σ̂−1 μ = q∗ 1 Z11 q 2 , 2 Qt Qt−1 [1; 1](za,t−1 Qt −Qt−1 − pt−1 ), and some further algebraic manipulations shows that indeed = Qt−1 (za,t−1 − pt−1 ), leading to the stated demand function at t − 1, (5). Given the form of q, it then follows that μ Σ̂−1 μ − (Σ̂−1 μ − a)Z(Σ̂−1 μ − a) = Q2t−1 τv + Qt−1 + +Yt−1 (za,t−1 − pt−1 )2 , (37) leading to the form of the utility stated in the theorem (6), with Ca,t−1 = |I + Σ̂B|1/2 . It is easy to check that Ca,t−1 −1/2 takes the prescribed form, as does then Da,t−1 = Ca,t−1 Da,t . Thus, given a linear pricing function, agents’ demand take a linear form and, moreover, the coefficients take the functional forms shown in the Theorem, as do agents’ expected utility. We are done. Proof of Theorem 2: The proof follows immediately from (5), (30-31), and (35). Proof of Theorem 3: The certainty equivalent satisfies −e−γa CE = E0 [−e −WT +1 ] = −E0 #T 1 Q2 1 −1 − 2 τv +Q1 +Y1 (za,1 −p1 ) t=2 Ct−1 e It is easy to see that T $ t=2 −1 Ct−1 = T τv + QT + YT τv + Y1 $ Qt−1 (Yt − Yt−1 ) ) 1+ . τv + Q1 + Y1 τv + YT t=2 (τv + Yt−1 )(τv + Yt ) Moreover, since at t = 0, za,1 − p1 ∼ N (0, 1 Q1 E0 e + 1 ), τv +Y1 it follows that Q2 1 −1 (za,1 −p1 )2 2 τv +Q1 +Y1 = τv + Q1 + Y1 , τv + Y1 and the result for the ex ante certainty equivalent follows. The ex ante expected profits between t and t + 1 are E0 [(za,t − pt )(pt+1 − pt )]. Plugging in the form (34,35) yields the result. Using a similar approach for expected profits, we get that the expected total, time T trading profit of agent a’s trade in time t is 1 Qa,t , γa τv + Yt 36 2 . and the total expected trading profit over time therefore is T 1 Qa,t , γa t=1 τv + Yt (38) We are done. Proof of Theorem 4: Proof of Theorem 4 : We focus on the Erdös-Renyi random graph G(N, pN ) model, where pN = c logk (N )/(N − 1), qN = p(N − 1) = c logk (N ). c > 0, k > 3, (39) (40) Throughout the proof we will often suppress the dependence of variables on N , e.g., writing p and q instead of pN and qN . Also, denotes a constant arbitrarily close to zero, and C, c, c1 , etc. denote strictly positive (possibly large) constants. The proof is based on the fact that for arbitrary constants, α > 0, β > 0, it follows that q α << N β , for large N . We shall see that the variables of interest can be given as well-behaved functions of q −α with error terms of order N −β , which then leads to tight bounds on the expected correlations. We first introduce some convenient notation. For a sequence of random variables wN , we write w = Os (f (N )), where it is assumed that limN→∞ f (N ) = 0, if for large N P(|w| ≥ Cf (N )) ≤ Cf (N ) for some constant, C > 0. If wN − r = Os (f (N )), for some function (or constant) r, we also write wN = r + Os (f (N )). We will also use standard O() (Big-O) and o() (little-o) notation. Finally, aN ∼ bN means that 0 < c ≤ aN /bN ≤ C < ∞ for large N (i.e., that aN = Θ(bN )). We use the notation V ar(x), Cov(x, y), and Corr(x, y) for the population variance, covariance and correlation, respectively, of random variables x and y, whereas we use V [X], C[X, Y ], and ρ[X, Y ] for the sample (cross sectional) variance, covariance and correlation of vectors of realization of random variables, X and Y . We are interested in the expectations of the cross sectional correlations ρD ρF ρĈ ρK def = def = def = def = ρ[DN , ΠN ], ρ[−F, ΠN ], ρ[Ĉ, ΠN ], ρ[K α , ΠN ]. We begin with deriving properties of yt and ζt = βt /β1 . To do this, we start with the following Lemma that shows the distributional properties of Va,t for large N . Lemma 8 Consider constants K > 14 (arbitrarily large), and > 0 (arbitrarily close to zero), and an integer T > 1 (arbitrarily large). Then there is an N0 , such that for all N > N0 , for all a = 1, . . . , N , for all t = 1, . . . , T , P Va,t − Va,t−1 − q t ≥ q t ≤ N −K . (41) Proof : We apply Lemma 10.7 in Bollobas (2001). Given that αt = (K − 2) log(N )/(c logk (N ))t ), βt = (c logk (N ))t /N , and γt = 2(c logk (N ))t−1 /N , it follows that α1 ≤ C log(N ) t 1−k 2 , and αi + βi + γi = α1 (1 + o(1)), i=1 where αi , βi and γi are defined in Lemma 10.6 of Bollobas (2001). Lemma 10.7 in Bollobas (2001) now implies that, P Va,t − Va,t−1 − q t ≥ δt q t ≤ N −K , 37 (42) where δt ≤ C log(N ) 1−k 2 , t = 1, . . . , T, (43) and since δt → 0 when N → ∞, the result follows. Bounds on yt : We recall that Va,t is the number of nodes within a distance of t from node a, and derive asymptotic properties of the averages N def 1 Va,t , t = 1, . . . , T. (44) V̄t = N a=1 We note that At = γτ V̄t , so all asymptotic results we derive for V̄t will, up to a constant, also hold for At , and thereby be important for yt . We have yt α(V̄t − V̄t−1 )2 , = 2 where we define V̄0 = 0, and α = τuγτ (the risk aversion coefficient being constant since we assume a preference symmetric economy). Lemma (8) immediately implies that P V̄t − V̄t−1 − q t ≥ q t ≤ N −K , for any > 0, for large N , and therefore V̄t − V̄t−1 − q t ≤ q t = yt − αq 2t α V̄t − V̄t−1 − q t × V̄t − V̄t−1 + q t ≤ αq t × (1 + )q t = q 2t . ⇒ Since is also arbitrarily close to zero, we get: P yt − αq 2t ≥ q 2t ≤ N −K . (45) Bounds on ζt : We can write the expected profit of agent a as Πa = β 1 Va,1 + T ζt Va,t , t=2 where β1 = (τv + y1 )−1 and ζt = τv + y1 t τv + k=1 yk . The coefficients, ζt (which do not depend on a) represent the relative value for agents of having many links at distance t. The bound on yt (45) now immediately imply that % & P ζt × q 2(t−1) − 1 ≥ ≤ N −K . (46) for any > 0, K > 14, for large enough N . Asymptotic behavior of E[ρD ], and E[ρK ]: First, we restrict ΩN to only contain events for which ζt q 2(t−1) ∈ [1 − , 1 + ], t = 1, . . . , T , for some fixed << 1. From (46) we know that this set of events satisfies P(ΩN ) ≥ 1 − O(q −K ) for arbitrarily large K. 38 (47) We have ρ[DN , ΠN ] = = = N N N N T N ρ[Va,1 , β1 (Va,1 + ζ2 Va,2 + ζ3 Va,3 + · · · + ζN Va,T )] S12 + ζ2 C[Va,1 , Va,2 ] + · · · + ζT C[Va,1 , Va,2 ] ' S1 V [Va,1 + ζ2 Va,2 + · · · + ζT Va,T ] S12 + T i=2 ζi Si,1 , T 2 2 2 S1 S1 + i=2 ζi Si + T i=1,j>i 2ζi ζj Sj,i (48) where we have defined ζ1 = 1, and St2 = V [Va,t ], St,s = C[Va,t , Va,s ], t = 1, 2, . . . T, (49) s, t = 1, 2, . . . , T. (50) We define at = ζi q 2(t−1) , t = 2, . . . , T , where the coefficients ai can be chosen arbitrarily close to 1 by letting N be sufficiently large and focusing on events in Ω defined in (47), we have good control over the behavior of ζt . We also need to control St2 , and St,s , t > s. The following Lemma is helpful Lemma 9 Given the definitions (49), (50), we have St2 t = Zi2 + 2 i=1 St,s = t t q j−i Zi + Os (N −1/3+ ), 1 ≤ t ≤ T, (51) i=1 j=i+1 Ss2 + s t q j−i Zi + Os (N −1/3+ ), 1 ≤ s, t ≤ T. (52) i=1 j=s+1 Here, Zt2 = qt t−1 qi , i=0 = q 2t−1 t−1 q −i , t = 1, . . . , T, i=0 Proof of Lemma 9: For large N , the distributions of Va,t for small t will resemble those of a branching process, the difference being that branching processes do not choose from the same nodes when going from t to t + 1. For large N , the difference will be Va,t small. We therefore begin with studying the branching process defined as Za,1 = ξa,0 , Za,t+1 = i=1 ξa,t,i , a = 1, . . . , N , t = 1, . . . , T − 1. Here, ξa,t,i ∼ Bin(N − 1, p), and are independent across a, t, and i. Using iterated expectations, and the law of total variance, it is straightforward to show the following (unconditional) moment formulas E[Za,t ] = E[ξ]t , V ar[Za,t ] = E[ξ]t V ar(ξ) + E[ξ]2 V ar[Za,t−1 ], Cov[Za,t , Za,s ] = E[ξ]t−sV ar(Za,s ), t > s. In the specific case where E[ξ] = q, and V ar(ξ) = q + O(N −1 )—i.e., our case—a recursive argument shows that E[Za,t ] = qt , V ar[Za,t ] = qt t−1 q i + O(N −1+ ), (53) i=0 Cov[Za,t , Za,s ] = qt s−1 q i + O(N −1+ ), i=0 39 t > s. (54) We next show that the moments for Va,t+1 − Va,t are similar to the moments of Za,t+1 . Of course, Va,1 has identical moments as Za,1 . For Va,2 , we have, given that node a has Va,1 links, the probability that node b which is not a neighbor of a is not a neighbor to one of a’s neighbors is (1 − p)Va,1 . Therefore, the conditional distribution of the number of nodes at distance 2 from a, given Va,1 is Va,2 − Va,1 ∼ Bin(N − Va,1 , 1 − (1 − p)Va,1 ), leading to E[Va,2 − Va,1 |Va,1 ] = = = (N − Va,1 )(1 − (1 − p)Va,1 ) ⎞ ⎛ Va,1 % Va,1 & −N (1 − Va,1 /N ) ⎝ (−1)i pi ⎠ i i=1 ⎛ ⎞ Va,1 % Va,1 & −q i ⎝ ⎠, −N (1 − Va,1 /N ) i N −1 i=1 (55) and V ar[Va,2 − Va,1 |Va,1 ] = = = (N − Va,1 )(1 − (1 − p)Va,1 )(1 − p)Va,1 ⎛ ⎞ Va,1 % Va,1 & Va,1 ⎝ i i⎠ −N (1 − Va,1 /N )(1 − p) (−1) p i i=1 ⎛ ⎞ Va,1 % Va,1 & −q i ⎠. −N (1 − Va,1 /N )(1 − p)Va,1 ⎝ i N −1 i=1 (56) The law of total variance now implies that V ar[Va,2 − Va,1 ] E[V ar[Va,2 − Va,1 |Va,1 ]] + V ar[E[Va,2 − Va,1 |Va,1 ]], = i /(N −1)] = O(N −1+ ) for any fixed i and t, and furthermore, E and since (as is easily shown) E[Va,t ∞ O(N −1 ), (55,56) immediately imply, when plugged into (57), that V ar[Va,2 − Va,1 ] (57) i=2 i /(N − 1)i = Va,t q 2 + q 3 + O(N −1+ ) = V ar[Za,1 ] + O(N −1+ ). = The same argument for Va,3 − Va,2 ∼ Bin(N − (Va,2 − Va,1 ), 1 − (1 − p)Va,2 −Va,1 ), leads to V ar[Va,3 − Va,2 ] = q 3 + q 2 (q 2 + q 3 ) + O(N −1+ ) = V ar[Za,2 ] + O(N −1+ ), and for higher orders to V ar[Va,t − Va,t−1 ] = V ar[Za,t ] + O(N −1+ ). (58) A similar argument for the covariances leads to, Cov[Va,t − Va,t−1 , Va,s − Va,s−1 ] = Cov[Za,t , Za,s ] + O(N −1+ ), t > s. (59) Now, since Va,t = Va,1 + (Va,2 − Va,1 ) + . . . + (Va,t − Va,t−1 ), from (58,59) it immediately follows that V ar[Va,t ] = t i=1 Zi2 + 2 t t i=1 j=i+1 40 q j−i Zi + Os (N −1+ ), and (60) Cov[Va,t , Va,s ] = V ar[Va,s ] + s t q j−i Zi + Os (N −1+ ), t > s. (61) i=1 j=s+1 m − 1, is of course We next show that covariances of Va,t and Vb,t when a = b (i.e., across agents) is low. First, Va,1 binomially distributed, Va,1 − 1 ∼ Bin(N, p). We recall that the moment generating function of a binomial distribution is M (t) = (1 − p + pet )n , and it therefore follows that for any fixed m, μm m ] V ar[Va,1 = m E[Va,1 ] = M (m) (0) = q m (1 + O(q −1 )), = M (2m) (0) − (M (m) (0))2 = q 2m−1 (1 + O(q −1 )). def m and V n , a = b is on the form The covariance between Va,1 b,1 m n Cov(Va,1 , Vb,1 ) = E[(wa + I)m (wb + I)n ] − E[(wa + I)m ]E[(wb + I)n ], where I is Bernoulli distributed, I ∼ Ber(p), representing the possible common link between a and b, wa ∼ Bin(N − 1, p), wb ∼ Bin(N − 1, p) represent all the other links, and I, wa , and wb are independent. Moreover, since the Bernoulli distribution is idempotent, I k = I, k ≥ 1, it follows that m n , Vb,1 ) Cov(Va,1 = = − = = = E[(wa + I)m (wb + I)n ] − E[(wa + I)m ]E[(wb + I)n ] ⎡ ⎞⎤ ⎛ m−1 n−1 %m& %n& j m i n E ⎣ wa + wa I ⎝wb + w b I ⎠⎦ i j i=0 j=0 ⎤ ⎡ m−1 n−1 %m& %n& j E wam + wai I E ⎣wbn + wb I ⎦ i j i=0 j=0 ⎡ ⎞⎤ ⎤ ⎛n−1 m−1 ⎡n−1 m−1 %m& %n& j %m& %n& j i i ⎣ ⎝ ⎠ ⎦ ⎣ wa I wb I wa I E wb I ⎦ E −E i j i j i=0 j=0 i=0 j=0 ⎞ ⎞ m−1 m−1 ⎛n−1 ⎛n−1 %m& %n& %m& %n& ⎠ ⎝ ⎝ μi μj E[I] − μi μj ⎠ E[I]2 i j i j i=0 j=0 i=0 j=0 ⎞ ⎛n−1 m−1 %m& %n& ⎝ μi μj ⎠ V ar(I) i j i=0 j=0 = O(q m+n−2 p(1 − p)) = O(N −1+ ). m ), this then implies that Together with the bounds on V ar(Va,1 m n Corr(Va,1 , Vb,1 ) = m ,V n ) Cov(Va,1 b,1 ' V ar(Va,1 )m V ar(Vb,1 )n = O(N −1+ ) ' 2m−1 q q 2n−1 (1 + O(q −1 )) = O(N −1+ ). An identical argument as that above shows that Cov((Va,1 − Vb,1 )m , (Vc,1 − Vd,1 )n ) = O(q m+n−2 N −1+ ) = O(N −1+ ), when a, b, c and d are different, and that Corr((Va,1 − Vb,1 )m , (Vc,1 − Vd,1 )n ) 41 = O(N −1+ ), and similarly for general T ≥ t ≥ s ≥ 1, n ≥ 1, m ≥ 1 m n , Vb,s ) Cov(Va,t = O(N −1+ ), −1+ (62) m n Corr(Va,t , Vb,s ) = O(N ), (63) Cov((Va,t − Vb,t )m , (Vc,s − Vd,s )n ) = O(N −1+ ), (64) Corr((Va,t − Vb,t ) , (Vc,s − Vd,s ) ) m n = O(N −1+ ). (65) To go from these bounds on population variances and covariances to bounds on sample variances and covariances, we use the following lemmas: Lemma 10 Assume x1 , . . . , xN are identically distributed (but not necessarily independent) random variables, with mean μ and (finite) variance σ2 . Also, assume that Corr(xi , xj ) ≤ CN −α , α > 0. Then the sample mean def x̄N = satisfies N 1 xi N i=1 x̄N = μ + Os (max(σ, 1)N −β/3 ), where β = min(α, 1). Here, μ and σ are allowed to depend on N . Moreover, if σ is independent of N , the formula reduces to x̄N = μ + Os (N −β/3 ). σ2 2 2 −β . Chebyshev’s inequality then Proof of Lemma 10: It is clear that V ar(x̄N ) = N 2 (N + i,j=i Corr(xi , xj )) ≤ C σ N immediately implies that P(|x̄N − μ| ≥ CσN −β/2 K) ≤ K −2 , and by choosing K = N β/6 the result follows. Lemma 11 If x̄N = μ + Os (N −α+ ) for all > 0, then, for any fixed m > 1, such that μm−1 = o(N ) for all > 0, m −α+ ). x̄m N = μ + Os (N Proof of Lemma 11: The proof is by induction. Assume that the result has been proved for all m = 1, 2, . . . , M , i.e., m −α+ ) ≤ Cm N −α+ , P(|x̄m N − μ | ≥ Cm N We expand m = 1, 2, . . . , M. M +1 M +1 i M −i − μ | = |x̄ − μ| a μ x̄ |x̄M , i N N N i=0 for some constants ai , i = 1, . . . , M . Now, clearly under our induction assumption, m −α+ , P(|x̄m N | ≥ 2μ ) ≤ N for large enough N . Therefore, m = 1, . . . , M M i M −i M ai μ x̄N ≥ C μ P ≤ C N −α+ , i=0 for some C > 0. Moreover, P(|x̄N − μ| ≥ C1 N −α+ ) ≤ C1 N −α . i M −i ≤ C μM and |x̄ − μ| ≤ C N −α , then and since if M 1 N i=0 ai μ x̄N +1 |x̄M N −μ M +1 | = |x̄N M i M −i − μ| ai μ x̄N ≤ C1 N −α+ C μM ≤ C N −α+2 . i=0 42 Finally, M i M −i M −α ≥ 1 − C1 + C N −α = 1 − C N −α . ai μ x̄N ≤ C μ ∩ |x̄N − μ| ≤ C1 N P i=0 Setting CM +1 = max(C , C ), and recalling that > 0 was arbitrary, the lemma follows. Lemma 12 Assume x1 , . . . , xN are identically distributed (but not necessarily independent) random variables, with mean μ, variance σ2 , and finite fourth moments. Assume that |Corr(xi , xj )| ≤ CN −α and that |Corr((xi − xj )2 , (xk − xn )2 )| ≤ CN −α , where α > 0, and i, j, k, n are different indexes. Then the sample variance N 1 (xi − x̄N )2 N − 1 i=1 def s2N = satisfies s2N = σ2 + Os (N −2β/3 ), where β = min(α, 1). Proof of Lemma 12: We rewrite def s2N = We have 1 1 (xi − xj )2 . N (N − 1) i,j 2 E[(xi − xj )2 ] = E[((xi − μ) − (xj − μ))2 ] = V ar(xi ) + V ar(xj ) − 2Cov(xi , xj ), implying that def E[s2N ] = σ2 − Cov(xi , xj ) = σ2 (1 − r) = σ̄2 , in turn leading to V ar(s2N ) = 1 N (N − 1) 2 r = O(N −α ), ⎡⎛ ⎞ ⎤ 2 1 2 2 ⎠ ⎦ ⎣ ⎝ E (xi − xj ) − σ̄ . 2 i,j We expand the square, and separate • N (N − 1) terms on the form: 1 N (N − 1) 2 E 1 (xi − xj )2 − σ̄2 2 2 . The sum of these terms will thus be of O(N −2 ). • N (N − 1)(N − 2) terms on the form: 1 N (N − 1) ! 2 E 1 (xi − xj )2 − σ̄2 2 1 (xi − xk )2 − σ̄2 2 " . The sum of these terms will thus be of O(N −1 ). • N 4 − 3N 3 − N 2 + N terms on the form: 1 N (N − 1) ! 2 E 1 (xi − xj )2 − σ̄2 2 1 (xk − xn )2 − σ̄2 2 " . Since ! " 1 1 2 2 2 2 E − x ) − σ̄ − x ) − σ̄ (x (x n i j k 2 2 = × = the sum of these terms will be of O(N −α ). 43 Corr 1 (xi − xj )2 , 1 (xk − xn )2 2 2 1 (xi − xj )2 V ar 2 O(N −α ), Thus, altogether, V ar(s2N ) = O(N −β ). As in Lemma 10, the result then follows from Chebyshev’s inequality. Lemma 13 Assume x1 , . . . , xN , and y1 , . . . , yN , are identically distributed random variables, with means μX , μY , vari2 , σ 2 , covariance σ −α , ances σX XY = Cov(xi , yi ) for all i, and finite fourth moments. Assume that |Corr(xi , yj )| ≤ CN Y 2 2 −α and that |Corr((xi − xj ) , (yk − yn ) )| ≤ CN , where α > 0, and i, j, k, n are different indexes. Then the sample covariance N 1 def sXY = (xi − x̄N )(yi − ȳN ) N − 1 i=1 satisfies sXY = σXY + Os (N −2β/3 ), where β = min(α, 1). Proof of Lemma 13: Identical to the Proof of Lemma 12. Lemma 12, together with (60) and (63) therefore implies (51), and Lemma 13 together with (61) and (65) implies (52). . We have shown Lemma 9. Plugging (51-52) into (48), we get ρ[DN , ΠN ] = S1 = q(1 S12 + T i=2 ζi Si,1 + Os (N −1/3+ ) T 2 2 2 S1 + i=2 ζi Si + T i=1,j>i 2ζi ζj Sj,i + a22 (q −2 −1/3+ + q(1 + a2 (q −1 + q −2 ) + a3 (q −2 + q −3 ) + a4 q −3 + q −4 P1 (q −1 )) + 2a2 (q −1 + q −2 ) + 2a3 (q −2 + q −3 ) + 2a4 q −3 + 2a2 a3 q −3 + q −4 Q(q −1 ))1/2 3q −3 ) + Os (N ), = 1 + a2 (q −1 + q −2 ) + a3 (q −2 + q −3 ) + a4 q −3 + q −4 P1 (q −1 ) 1 + a22 (q −2 + 3q −3 ) + 2a2 (q −1 + q −2 ) + 2a3 (q −2 + q −3 ) + 2a4 q −3 + 2a2 a3 q −3 + q −4 Q(q −1 )1/2 + Os (N −1/3+ ), (66) where P and Q are polynomials finite orders, and we define a1 = 1. A Taylor expansion of (66) around x = 0, where x = q −1 now yields, 1+ a22 (q −2 + 1 + a2 (q −1 + q −2 ) + a3 (q −2 + q −3 ) + a4 q −3 + q −4 P1 (q −1 ) = 1 − cq −3 + O(q −4 ) + 2a2 (q −1 + q −2 ) + 2a3 (q −2 + q −3 ) + 2a4 q −3 + 2a2 a3 q −3 + q −4 Q(q −1 )1/2 3q −3 ) altogether leading to ρD = 1 − cq −3 + Os (q −4 ), (67) where c > 0 depends on a2 , . . . , aT , and when at → 1, i = 2 . . . , aT , then c → 1/2. Of course, this immediately implies that P |1 − ρD − cq −3 | ≥ Cq −4 ≤ Cq −4 , (68) for large N for some bounded constant C > 0. Now, since correlations are bounded between -1 and 1, for general sequences of random variables, aN , and bN , if P (|1 − ρ[aN , bN ] − f (N )| ≥ o(f (N ))) = o(f (N )), where limN→∞ f (N ) = 0, then Thus, from (68), it follows that 1 − E(ρ[aN , bN ]) = f (N )(1 + o(1)). 1 − E[ρD ] = cq −3 (1 + o(1)) ∼ q −3 . A similar expansion of VN1 + ζ2 VN2 gives ρ[Va,1 + ζ2 Va,2 , ΠN ] = 44 1 − cq −5 + Os (q −6 ), (69) where c = 1/2 when ai → 1, i = 2 . . . , aT , which then leads to 1 − E[ρK ] ∼ q −5 . (70) This establishes the relationship E[ρK ] > E[ρD ] for large N . We next study F = 1 , the average distance from an agent to all other agents, and show that Ĉ 1 − E[ρ[D, F ]] ∼ q −1 . (71) We note that (53,54) imply that for i > 1, and fixed α, Corr(Z1 , Z1 + αZi ) 1 ,Zi ) 1 + α Cov(Z V ar(Z1 ) Cov(Z1 ,Zi ) V ar(Z ) 1 + 2α V ar(Z + α2 V ar(Z i ) ) = 1 1 1 + αq i−1 ' 1 + 2αq i−1 + α2 (q 2i−2 + q 2i−3 + O(q 2i−4 )) = 1 + (αq)−1 ' 1 + 2(αq)−1 + q −1 + O(q −2 )) 1 1 − q −1 + O(q −2 ), 2 = = in turn implying that for α1 , . . . , αR , such that α1 /( i αi ) ≤ 1 − , > 0, Corr Z1 , R = Zi 1− i=1 1 −1 q + O(q −2 ). 2 (72) Now, from Theorem 10.10 in Bollobas (2001), it follows that if we define R = log(N )/ log(c) + 1, with probability greater than 1 − N −k for arbitrary k, the maximum distance between any two agents is either R − 1 or R. We will therefore have that the average distance between an agent, a, and all other agents is Fa = = 1 N R− Va,1 + R−1 i(Va,i − Va,i−1 ) + R N − (Va,1 + R−1 i=2 R−1 Va,1 − N (Va,i − Va,i−1 ) i=2 R−1 i=2 R−i (Va,i − Va,i−1 ). N Now, from (72), if we replace Va,i − Va,i−1 with Za,i , F̃a def = = 1 N R− Za,1 + R−1 (iZi ) + R N− Za,1 + R−1 i=2 Z1 i=2 R−1 R−i R−1 Za,1 − Za,i N N i=2 we have Corr(Va,1 , −F̃a ) = 1 − 1 −1 + O(q −2 ). q 2 A similar argument as that leading to (67) therefore implies that Corr(Va,1 , −Fa ) = 1 − 1 −1 + O(q −2 ), q 2 and ρ[D, −F ] = 1 − q + Os (q −2 ), 2 45 (73) and therefore, following a similar argument as for ρD , that E[ρ[D, −F ]] = 1 − q + O(q −2 ). 2 We use the following triangle-inequality like lemma: √ √ Lemma 14 Assume ρ(a, b) = 1 − α, and ρ(a, c) = 1 − β, with β > α. Then ρ(b, c) ≤ 1 − ( β − α)2 . Proof of Lemma 14: Because of the simple renormalizations, a → (a − E[a])/σ(a), b → (b − E[b])/σ(b), and c → (c − E[c])/σ(c), we can without loss of generality assume that a, b, and c have zero expectations and unit variances. All ' correlations can then expressed as, ρ(a, b) = E[ab], etc. We introduce the metric d(a, b) = E[(a − b)2 ], and we then have the triangle inequality d(a, c) ≤ d(a, b) + d(b, c), leading to d(b, c) ≥ d(a, c) − d(b, c). We have d(a, b)2 = E[(a − b)2 ] = E[a2 ] + E[b2] − 2E[ab] = 2 − 2(1 − α) = 2α,√and similarly d(a, c)2 = 2β. Finally, d(a, c)2 = 2 − 2ρ(a, c) which via the triangle √ √ √ inequality leads to 2 − 2ρ(a, c) ≥ ( 2β − 2α)2 = 2( β − α)2 , leading to the result. We are done. Equations (68) and (73), together with Lemma 14, where a = D, b = −F , c = Π, now implies that if 1−ρ[D, Π] ≤ cq −3 , and 1 − ρ[D, −F ] ≥ cq −1 , then 1 − ρ[−F, Π] = 1 − ρF > c(1 − )q −1 for large N , and thus, 1 − E[ρF ] ≥ c(2 − )q −1 , implying the third part of the Theorem, E[ρD ] > E[ρF ]. Finally, for Ĉ, we first note that and more generally that E[Va,1 ] = q + O(N −1+ ), V ar[Va,1 ] = q + O(N −1+ ), E[Fa ] = R(1 + O(q −1 )), V ar[Fa ] = R2 (1 + O(q −1 )), q E[(1 − F/R)i ] = q −i (ci + O(q −1 )) for any fixed i ≥ 1, where ci is a (finite) constant, and c2 = 1. We have Corr(Va,1 , Ĉ) = = = = = = 1 Corr Va,1 , F 1 Corr Va,1 , R − (R − F ) ⎞ ⎛ 1 &⎠ % Corr ⎝Va,1 , 1− 1− F R ⎞ ⎛ F F i⎠ i ⎝ (−1) 1 − Corr Va,1 , − + R i≥2 R q di q −i 1 − + O(q −2 ) + 2 i≥2 1− q + O(q −2 ). 2 A similar argument as for ρF , again using Lemma 14, then implies the final part of the theorem, E[ρD ] > E[ρĈ ]. We are done. Proof of Theorem 5: For the first part, we note that since (pt −pt−1 ) is independent of pt−1 (given publicly available inforyt 2 = (Σ mation), it follows that the price volatility between t−1 and t is equal to (ΣXX )11 , σp,t . XX )11 |t−1 = (τ +Y )(τ +Y ) 2 with the convention Y0 = 0. Also, the final period volatility of v − pT is σp,T +1 = theorem. 46 v 1 τv +YT t v t−1 . This proves the first part of the t k1 For the second part, we first note that from the definition of y1 , it follows that y1 = τv 1−k 1 and—defining Kt = 1−KT kt i=1 ki —a simple induction argument further shows yt = τv (1−Kt−1 )(1−Kt ) , t = 1, . . . , T − 1, and yT = τv kT (1−KT −1 ) . We note that all the y1 , . . . , yT are all well defined. y1 Next, we back out the connectedness that is needed to be consistent with the y’s. We have A1 = , At = τu def V def yt γ y γ y a,1 a At−1 + τ , leading to V̄1 = = τ τ 1 , ΔV̄t = V̄t − V̄t−1 = τ τ t . Thus, if we can replicate, arbitrarily closely, N u u u any sequence of diffusions, through which the average number of signals, V̄t , increases over time, then we can generate any yt , and thereby any volatility structures. We note that γ is a free parameter that allows us to scale the network to arbitrary sizes. The result now follows from the following lemma: Lemma 15 For any T , there are networks of size N , such that V̄T = (1 + o(1))V̄T for T > T , and V̄T = for T < T . 1 V̄ N(1+o(1)) T This lemma thus states that we can always find a (possibly large) network such that very little happens before and after time T , with respect to information diffusion. The result follows immediately: For T = 1, a tightly-knit network would have these properties. For T = 2, a large star network. For T = 3, a star-like network with N 2 + N nodes, in which there are N tightly-nit nodes in the center, each connected to N peripheral agents. For even T ≥ 4, adding longer distance to the T = 2 (star) network, and for odd T ≥ 5, adding longer distances to the T = 3 network will generate these properties. Let’s call such a network a T -network. V̄t+1 , t = 1, . . . , T can be generated by choosing a network with many disjoint 1−, 2−, . . . , T −networks Finally, any sequence of V̄ t in such a way so that the relative sizes of the networks match the fractions. We are done. Proof of Lemma 3: i): Let us study the function g, defined by gi = fi , i = 0, . . . , N , and gN+1 = 0, which has the same autocorrelation function as f , for τ ≤ N , and which always has a maximum at an interior point when f is unimodal, nonnegative, and f0 = 0. Let us denote that maximum by m, i.e., gm ≥ max(gm−1 , gm+1 ). Of course, the result follows trivially if f ≡ 0, so we assume that fi > 0 for some i. We use the following lemma Lemma 16 Consider a sequence f0 , f1 , . . . , fN+1 , such that f0 = fN+1 = 0. Then N fi (fi+1 − fi ) − (fk+1 − fk )(fk − fk−1 ) ≤ 0 i=0 for any 1 ≤ k ≤ N . N N Proof of Lemma 16: The summation by parts rule implies that i=0 fi (fi+1 − fi ) = − i=0 fi+1 (fi+1 − fi ), in turn N 2 implying that i=0 fi (fi+1 − fi ) = − 21 N i=0 (fi+1 − fi ) . We now have N fi (fi+1 − fi ) = − i=0 N 1 (fi+1 − fi )2 2 i=0 ≤ 1 1 − ((fk − fk−1 )2 + (fk+1 − fk ))2 ) ≤ 2(fk+1 − fk )(fk − fk−1 ), 2 2 where the second inequality follows from the inequality (x + y)2 = x2 + y 2 + 2xy ≥ 0. Thus, fk )(fk − fk−1 ) ≤ 0, as claimed. This completes the proof of Lemma 16. N i=0 fi (fi+1 − fi ) + (fk+1 − Since Rτ is symmetric around τ = 0, we restrict our attention to the region τ ≥ 0, showing that Rτ is nonincreasing in this region. This will immediately imply i). We prove the result by contradiction: Assume that there is a τ , such that Rτ +1 − Rτ > 0. Given the definition of Rτ , this means that N+1−τ gi (gi+τ − gi ) > 0 (74) i=0 for some τ . By zero padding (adding extra zeros to) g, we can without loss of generality assume that the maximum occurs at m = nτ for some integer n, and that N + 1 − τ = M τ − 1, i.e., by studying the function gi = 0, i < v, gi = gi−v , 0 ≤ i − v ≤ N + 1 − τ , gi = 0, N + 1 − τ < i − v ≤ M τ − 1. Thus, it must be that ⎛ ⎝ (m−1)τ −1 i=0 gi (gi+τ − ⎞ ⎛ gi )⎠ +⎝ mτ −1 ⎞ gi (gi+τ − gi )⎠ + M τ −1 i=mτ i=(m−1)τ 47 gi (gi+τ − gi ) > 0. Now, because gi is increasing for i ≤ mτ , and decreasing for i ≥ mτ , we can replace gi with gr i τ , where ri = τ (i/τ + 1) in the first and third term, and by g(m−1)τ for the middle term, leading to m−2 giτ τ −1 (giτ +j+τ − giτ +j ) i=0 (m−1)τ −1 j=0 g(m−1)τ τ −1 M −1 giτ (gmτ +j+τ − gmτ +j ) ≥ τ −1 j=0 mτ −1 gi (gi+τ − gi ), i=(m−1)τ τ −1 (giτ +j+τ − giτ +j ) i=m gi (gi+τ − gi ), i=0 j=0 But since ≥ > j=0 M τ −1 gi (gi+τ − gi ). i=mτ gk+j+1 − gk+j = gk+τ − gk , this implies that m−2 giτ (g(i+1)τ − giτ ) % & − g(m−1)τ (gmτ − g(m+1)τ ) + M −1 i=0 in turn leading to giτ (g(i+1)τ − giτ ) > 0, i=m M −1 giτ (g(i+1)τ − giτ ) + (g(m−1)τ − gmτ )(g(m+1)τ − gmτ ) > 0. i=0 , 0 ≤ i ≤ M , it follows that h = h Now, defining hi = giτ 0 M = 0, that h is positive and unimodal, and that this in turn implies that M −1 hi (hi+1 − hi ) − (hm − hm−1 )(hm+1 − hm ) > 0. i=0 However, from Lemma 16 no such sequence can exist, and thus neither can a sequence g satisfying (74). So, R is nonincreasing on the positive axis, and it must therefore be unimodal (with maximum at 0). We have shown i). n n ii): It is easy to show that (Δn R)(τ ) = N−τ i=1 fi (Δ f )i . Thus, if Δ f is positive (negative), so is Rτ . This, in turn, implies that Rτ has at most n − 1 turning points. Proof of Theorem 6: The proof is based on the following standard lemma: Lemma 17 Assume a normally distributed random variable, y ∼ N (μ, σ2 ). Then E[|y|] = σ where Φ is the cumulative normal distribution of a standard normal variable. μ2 2 − 2σ2 e π +μ(1−2Φ(−μ/σ)), We note that from (5) and given that v = v̄ + η, it follows that agent a’s net time-t demand is γa Δxa,t = = − = Qa,t (za,t − pt ) − Qa,t (za,t−1 − pt−1 ) t qt τv Yt 1 Qt−1 ξa,t−1 + ea,t − v̄ + (v̄ + η) + (Ai − Ai−1 )τui ui Qa,t v̄ + η + Qa,t Qa,t τv + Yt τv + Yt τv + Yt i=1 t−1 τv Yt−1 1 Qa,t−1 v̄ + η + ξa,t−1 + − v̄ + (v̄ + η) + (Ai − Ai−1 )τui ui τv + Yt−1 τv + Yt−1 τv + Yt−1 i=1 t−1 Qa,t−1 Qa,t Qa,t qa,t ea,t − − (Ai − Ai−1 )τui ui − (At − At−1 )τui ut τv η − τv + Yt τv + Yt−1 τv + Yt i=1 ∼N(0,qa,t ) 2 ) ∼N(0,r̂a,t where r̂t2 = Qa,t−1 Qa,t − τv + Yt τv + Yt−1 2 (τv + Yt−1 ) + Qa,t τv + Yt 48 2 yt = Q2a,t−1 τv + Yt−1 − Q2a,t τv + Yt + 2 qa,t τv + Yt . Recalling that qa,t = τ ΔVa,t and Qa,t = τ Va,t , this leads to γa Δxa,t = ' τ ΔVa,t √ ra,t τ ωa,t + ' ξt , ΔVa,t where ωa,t ∼ N (0, 1) is independent of ξt ∼ N (0, 1) and across agents, rt2 = 2 Va,t−1 τv + Yt−1 when ΔVa,t > 0, and − 2 Va,t τv + Yt + 2 ΔVa,t τv + Yt , γa Δxa,t = ra,t τ ξt , √ Δx r τ when ΔVa,t = 0. Assuming that ΔVa,t > 0, we note that γa τ ΔVa,t ξt ∼ N √a,t ξt , 1 . The law of large numbers, a,t ΔV a,t together with Lemma 17, then in turn implies that, conditioned on ξt , ⎛ ⎞ 2 ξ2 √ √ τ ra,t t ' − 2ΔV 1 2 τ τ r r a,t a,t a,t + ' ⎠. e |Δxa,t | →a.s. τ ΔVa,t ⎝ ξt 1 − 2Φ − ' ξt γa M a π ΔVa,t ΔVa,t It follows immediately that the unconditional expectation of the first term (not conditioning on ξt ) is √ 2 τ ΔVa,t . π 2 +ΔV τ ra,t a,t For the second term, we use the fact that E[yΦ(ay)] = √ 2 τ ΔVa,t π 2 τ ra,t +1 ΔV 1 2π √ a , a2 +1 for a random variable y ∼ N (0, 1), to get 2 ra,t τ √ √ 2 ' τ ra,t 2' 2√ ra,t τ ra,t τ ΔVa,t = τ ΔVa,t E ' ξt 1 − 2Φ − ' ξt τ ΔVa,t = τ . 2 π π 2 ΔVa,t ΔVa,t ra,t τ τ ra,t + ΔVa,t 1 + ΔV a,t Summing the two terms together, we get ( 1 τ 2 2 + ΔVa,t . E |Δxa,t | →a.s. ra,t M a γa π τ We note that this formula also holds when ΔVa,t = 0, since E|ra,t τ ξt | = τ Xt = = = N τ 1 N a=1 γa ( 2 π 2 + ra,t ΔVa,t τ 2 2 r . π a,t This finally leads to (20) ) * N 2 Va,t−1 yt τ 1 * ΔVa,t 2Va,t−1 ΔVa,t +2 + − N a=1 γa π (τv + Yt−1 )(τv + Yt ) τv + Yt τ ) * N 2 2 2 Va,t−1 Va,t ΔVa,t τ 1 * ΔVa,t +2 . − + + N a=1 γa π τv + Yt−1 τv + Yt τv + Yt τ Now, if one of the γa → 0, then agent Va,t will determine At , yt , and Yt . In this case, we get Xt = ) ⎛ * * N τ 1 *2 ⎝ + N a=1 γa π (τv + 2 2 τu τ ΔV 2 Vt−1 t γ2 τu τ 2 2 γa t−1 i=1 a ΔVt2 )(τv + 49 τu τ 2 2 γa t i=1 = a,t ΔVt2 ) − τv + 2Vt−1 ΔVt τu τ 2 t 2 γa i=1 ΔVt2 ⎞ ΔVt ⎠ + . τ Using the fact that Vt = ΔVt , leading to the inequality Vt2 ≤ t ti=1 ΔVi2 (from E[x2 ] ≥ E[x]2 ), it follows that for large ΔVt , the third term will be dominant, and therefore any sequence of Xt can be generated by choosing ΔVt appropriately. This shows the second part of the theorem. We are done. Proof of Theorem 7: We first show the second result: We use (24,26) to define R̂t = (At − At−1 )τut Rt − yt v̄ = yt η + (At − At−1 )τui ui , and, of course, R̂1 , . . . , R̂t can be backed out of p1 , . . . , pt . Now, it is straightforward to use (26) to show that pt = v̄ + = v̄ + 1 τv + Yt 1 τv + Yt Yt η + t (Ai − Ai−1 )τui ui i=1 t R̂t = v̄ + i=1 1 τv + Yt Yt η + t (Ai − Ai−1 )τui ui . (75) i=1 Using this relation, we get γa Δxa,t = = − = − = Qa,t (za,t − pt ) − Qa,t−1 (za,t−1 − pt−1 ) t qt 1 Qt−1 R̂t ξa,t−1 + ea,t − v̄ + Qa,t v̄ + η + Qa,t Qa,t τv + Yt i=1 t 1 R̂t Qa,t−1 v̄ + η + ξa,t−1 − v̄ + τv + Yt i=1 t 1 qa,t ea,t + Qa,t η − Yt η + (Ai − Ai−1 )τui ui τv + Yt i=1 t−1 1 Qa,t−1 η − (Ai − Ai−1 )τui ui Yt−1 η + τv + Yt−1 i=1 t Qa,t (Ai − Ai−1 )τui ui qa,t ea,t + τv η − τv + Yt i=1 ∼N(0,qt ) − Qa,t−1 τv + Yt−1 τv η − = qa,t ea,t − t−1 (Ai − Ai−1 )τui ui i=1 Qa,t Qa,t−1 − τv + Yt τv + Yt−1 ∼N(0,qt ) = qa,t ea,t − Qa,t Qa,t−1 − τv + Yt τv + Yt−1 τv η − (Ai − Ai−1 )τui ui i=1 t−1 − Qa,t (At − At−1 )τui ut τv + Yt ∼N(0,yt ) ∼N(0,τv +Yt−1 ) ft−1 − Qa,t gt , τv + Yt where f1 = τv η ∼ N (0, τv ), ft = ft−1 + gt , gt ∼ N (0, yt ), ft |ft−1 ∼ N (ft−1 , yt ). We note that the coefficient in front of ft−1 is strictly positive. The total trading volume at times t and t + 1 then have the form: Wt = |α1a,t ξa,t − α2a,t ft−1 − α3a,t gt |, a Wt+1 = |α1a,t+1 ξa,t+1 − α2a,t+1 (ft−1 + gt+1 ) − α3a,t+1 gt+1 |, a where ξa,t , ξa,t+1 , ft−1 , gt , and gt+1 are all jointly independent, and all α’s are positive. The result now follows from the following lemma: 50 Lemma 18 Assume that Ã, B̃ and z̃ are independent random variables with standardized normal distributions. Then, for any a > 0 and b > 0, Cov(|aà + z̃|, |bB̃ + z̃|) > 0. Proof of Lemma 18: First, we note that if Cov(X̃, Ỹ |Z̃1 , · · · , Z̃n ) > 0 for all realizations of the random variables Z̃1 , . . . , Z̃n , then it must be that Cov(X̃ , Ỹ ) > 0 unconditionally. This follows from the law of iterated expectations, since Cov(X̃ , Ỹ ) = E[X̃ Ỹ ] − E[X̃]E[Ỹ ] = E E[X̃ Ỹ ] − E[X̃]E[Ỹ ]|Z̃ = E[Cov(X̃, Ỹ )|Z̃] > 0. Now, define Z1 = a|Ã|, Z2 = b|B̃|. Then it follows that E[|aà + z̃| |Z1 ] = E[max(Z1 , |z̃|)], E[|bB̃ + z̃| |Z2 ] = E[max(Z2 , |z̃|)], and E[|aà + z̃||bB̃ + z̃| |Z1 , Z2 ] = E[max(Z1 , |z̃|) max(Z1 , |z̃|)], so Cov(|aà + z̃|, |bB̃ + z̃| |Z1 , Z2 ) = Cov(max(Z1 , |z̃|), max(Z2 , |z̃|) |Z1 , Z2 ) > 0 where the inequality follows from max(c, x) being a monotone transformation of x, so max(Z1 , |z̃|), and max(Z2 , |z̃|) are comonotonic. Given the argument, the positivity must then also hold unconditionally, and the lemma therefore follows. Now, this lemma implies that all the terms that make up Wt and Wt+1 have pairwise positive covariances, and it is indeed the case that Cov(Wt , Wt+1 ) > 0, showing the second result. For the third result, we proceed as follows: It follows from (75) that v − pt = t 1 1 ft = η − R̂i . τv + Yt τv + Yt i=1 (76) Therefore, since v − pt−1 is independent of pt−1 , the same holds for ft−1 (and, of course for gt ), trading volume is indeed independent of pt−1 , and further of pt−i for all i > 1, showing the third result. For the first result, from (76), the relationship |pt − pt−1 | = 1 1 ft − ft−1 = τv + Yt−1 τv + Yt 1 1 1 ft−1 − − gt τ +Y τ + Y τ + Y v v t v t t−1 follows, and a similar argument as for trading volume over time then implies that Cov(|pt − pt−1 |, Wt ) > 0, showing the first result. We are done. 51 References Adamic, L., C. Brunetti, J. Harris, and A. Kirilenko (2010): “Trading networks,” Working Paper, University of Michigan. Admati, A., and P. Pfleiderer (1988): “A Theory of Intraday Patterns: Volume and Price Volatility,” Review of Financial Studies, 1, 3–40. Andersen, T. G. (1996): “Return Volatility and Trading Volume: An Information Flow Interpretation of Stochastic Volatility,” Journal of Finance, 51, 169–204. Andrei, D. (2012): “Information Percolation Driving Volatility,” Working Paper, UCLA. Babus, A., and P. Kondor (2013): “Trading and Information Diffusion in Over-the-Counter Markets,” Working Paper, Imperial College. Barabasi, A. L., and R. Albert (1999): “Emergence of Scaling in Random Networks,” Science, 286, 509–512. Blume, L., D. Easley, and M. O’Hara (1994): “A Theory of Intraday Patterns: Volume and Price Volatility,” Journal of Finance, 49, 153–181. Bollerslev, T., and D. Jubinski (1999): “Equity Trading Volume and Volatility: Latent Information Arrivals and Common Long-Run Dependencies,” Journal of Business and Economic Statistics, 17, 9–21. Bollobas, B. (2001): Random Graphs. Cambridge University Press, 2nd edn. Breon-Drish, B. M. (2010): “Asymmetric Information in Financial Markets: Anything Goes,” Working Paper, Stanford University. Briggs, N. (1993): Algebraic Graph Theory. Cambridge University Press. Brouwer, A. E., A. M. Cohen, and A. Neumaier (1989): Distance-Regular Graphs. Springer-Verlag. Buraschi, A., and P. Porchia (2012): “Dynamic Networks and Asset Pricing,” Working Paper, Imperial College. Calvet, L., J. Campbell, and P. Sodini (2007): “Down or Out: Assessing the Welfare Costs of Household Investment Mistakes,” Journal of Political Economy, 115(5), 707–747. (2009): “Fight or Flight? Portfolio Rebalancing by Individual Investors,” Quarterly Journal of Economics, 124(1), 301–348. Clark, P. K. (1973): “A subordinated Stochastic Process Model with Finite Variance for Speculative Prices,” Econometrica, 41, 135–155. Colla, P., and A. Mele (2010): “Information linkages and correlated trading,” Review of Financial Studies, 23(1), 203–246. Crouch, R. L. (1975): “The volume of transactions and price changes on the New York Stock Exchange,” Financial Analysts Journal, 65, 104–109. Cutler, D. M., J. M. Poterba, and L. H. Summers (1989): “What moves stock prices,” Journal of Portfolio Management, 15, 4–12. Das, S. J., and J. Sisk (2005): “Financial Communities,” Journal of Portfolio Management, 31, 112– 123. Debreu, G., and H. Scarf (1964): “A limit theorem on the core of an economy,” International Economic Review, 4, 235–256. Duffie, D., S. Malamud, and G. Manso (2009): “Information percolation with equilibrium search dynamics,” Econometrica, 77, 1513–1574. 52 Duffie, D., and G. Manso (2007): “Information percolation in large markets,” American Economic Review, 97, 203–209. Edgeworth, F. Y. (1881): Mathematical Psychics: An essay on the mathematics to the moral sciences. Wiley. Epps, T. W., and M. L. Epps (1976): “The Stochastic Dependence of Security Price Changes and Transaction Volumes: Implications for the Mixture-of-Distributions Hypothesis,” Econometrica, 44, 305–321. Erdös, P. (1947): “Some remarks on the theory of graphs,” Bulletin of the American Mathematical Society, 53, 292–294. Erdös, P., and A. Rényi (1959): “On random graphs I,” Publ. Math. Debrecen, 6, 290–297. Fair, R. C. (2002): “Events that Shook the Market,” Journal of Business, 75, 713–731. Feng, L., and M. Seasholes (2004): “Correlated trading and location,” Journal of Finance, 59, 2117–2144. Foster, D., and S. Viswanathan (1995): “Can speculative trading explain the volume-volatility relation?,” Journal of Business and Economics Statistics, 13, 379–396. Friedkin, N. (1991): “Theoretical Foundations for Centrality Measures,” American Journal of Science, 96, 1471–1504. Gabaix, X., P. Gopikrishnan, V. Plerou, and H. E. Stanley (2003): “A theory of power-law distributions in financial market fluctuations,” Nature, 423, 267–270. Gallant, A. R., P. E. Rossi, and G. Tauchen (1992): “A subordinated Stochastic Process Model with Finite Variance for Speculative Prices,” Review of Financial Studies, 5, 199–242. Gilbert, E. N. (1959): “Random Graphs,” Annals of Mathematical Statistics, 30, 1141–1144. Glosten, L., and P. Milgrom (1985): “Bid, ask and transaction prices in a specialist market with heterogeneously informed traders,” Journal of Financial Economics, 14, 71–100. Grossman, S. (1976): “On the Efficiency of Competitive Stock Markets Where Traders Have Diverse Information,” Journal of Finance, 31, 573–584. Han, B., and L. Yang (2013): “Social Networks, Information Acquisition, and Asset Prices,” Management Science, 59, 1444–1457. He, H., and J. Wang (1995): “Differential Information and Dynamic Behavior of Stock Trading Volume,” Review of Financial Studies, 8, 919–972. Heimer, R. Z., and D. Simon (2012): “Facebook Finance: How Social Interaction Propagates Active Investing,” Working Paper, Brandeis University. Hellwig, M. F. (1980): “On the aggregation of information in competitive markets,” Journal of Economic Theory, 22, 477–498. Hong, H., J. D. Kubik, and J. C. Stein (2004): “Social Interaction and Stock-Market Participation,” Journal of Finance, 49, 137–163. Ivković, Z., and S. Weisbenner (2007): “Information diffusion effects in individual investors’ common stock purchases: Covet thy neighbors’ investment choices,” Review of Financial Studies, 20, 1327–1357. Jackson, M., and Y. Zenou (2012): “Games on Networks,” Handbook of Game Theory, Vol. 4, forthcoming. Jackson, M. O. (2005): “A survey of models of network formation: stability and efficiency,” in Group Formation in Economics: Networks, Clubs, and Coalitions, ed. by G. Demange, and M. Wooders. Cambridge University Press. 53 Karpoff, J. M. (1987): “The Relation Between Price Changes and Trading Volume: A Survey,” Journal of Financial and Quantitative Analysis, 22, 109–126. Kyle, A. S. (1985): “Continuous Auctions and Insider Trading,” Econometrica, 53, 1315–1336. Li, D., and N. Schurhoff (2012): “Dealer Networks,” Working Paper, University of Lausanne. Lo, A. W., and J. Wang (2000): “Trading Volume: Definitions, Data Analysis and Implications of Portfolio Theory,” Review of Financial Studies, 13, 257–300. Lobato, I. N., and C. Velasco (2000): “Long Memory in Stock-Market Trading Volume,” Journal of Business and Economic Statistics, 18, 410–427. Massa, M., and A. Simonov (2006): “Hedging, Familiarity, and Portfolio Choice,” Review of Financial Studies, 19(2), 633–685. Ozsoylev, H. (2005): “Asset pricing implications of social networks,” Working paper, University of Oxford. Ozsoylev, H., and J. Walden (2011): “Asset Pricing in Large Information Networks,” Journal of Economic Theory, 146, 2252–2280. Ozsoylev, H., J. Walden, D. Yavuz, and R. Bildik (2013): “Investor Networks in the Stock Market,” Review of Financial Studies, Forthcoming. Pareek, A. (2012): “Information networks: Implications for mutual fund trading behavior and stock returns,” Working Paper, Rutgers University. Rogalski, R. J. (1978): “The dependence of prices and volume,” Review of Economics and Statistics, 60, 268–274. Saatcioglu, K., and L. T. Starks (1998): “The Stock Price-Volume Relationship in Emerging Stock Markets: The Case of Latin America,” International Journal of Forecasting, 14, 215–225. Schneider, J. (2009): “A Rational Expectations Equilibrium with Informative Trading Volume,” Journal of Finance, 64, 2783–2805. Shiller, R. J., and J. Pound (1989): “Survey evidence on the diffusion of interest and information among investors,” Journal of Economic Behavior, 12, 47–66. Taylor, D. E., and R. Levingston (1978): “Distance-Regular Graphs,” in Combinatorial Mathematics, ed. by D. A. Holton, and J. Seberry, Berlin. Vives, X. (1995): “Short term investment and the informational efficiency of the market,” Review of Financial Studies, 8, 125–160. Watts, D. J., and S. H. Strogatz (1998): “Collective dynamis of ‘small world’ networks,” Nature, 393, 440–442. 54
© Copyright 2026 Paperzz