A Framework for Dynamic Oligopoly in Concentrated Industries∗
Bar Ifrach† Gabriel Y. Weintraub‡
This Version: August 2014; First Version: August, 2012

Abstract

In this paper we introduce a new computationally tractable framework for Ericson and Pakes (1995)-style dynamic oligopoly models that overcomes the computational complexity involved in computing Markov perfect equilibrium (MPE). First, we define a new equilibrium concept that we call moment-based Markov equilibrium (MME), in which firms keep track of their own state, the detailed state of dominant firms, and a few moments of the distribution of fringe firms’ states. Second, we provide guidelines to use MME in applied work and illustrate with an application how it can endogenize the market structure in a dynamic industry model even with hundreds of firms. Third, we develop a series of results that provide support for using MME as an approximation. We present numerical experiments showing that MME approximates MPE for important classes of models. Then, we introduce novel unilateral deviation error bounds that can be used to test the accuracy of MME as an approximation in large-scale settings. Overall, our new framework opens the door to study novel issues in industry dynamics.

1 Introduction

Ericson and Pakes (1995)-style dynamic oligopoly models (hereafter, EP) offer a framework for modeling dynamic industries with heterogeneous firms. The main goal of the research agenda put forward by EP was to conduct empirical research and evaluate the effects of policy and environmental changes on market outcomes in different industries. The importance of evaluating policy outcomes in a dynamic setting and the broad flexibility and adaptability of the EP framework have generated many applications in industrial organization and related fields (see Doraszelski and Pakes (2007) for an excellent survey).
∗ We would like to thank Vivek Farias for numerous fruitful conversations, valuable insights, important technical input, and careful reading of the paper. We have had helpful conversations with Lanier Benkard, Jeff Campbell, Allan Collard-Wexler, Dean Corbae, Pablo D’Erasmo, Uli Doraszelski, Przemek Jeziorski, Ramesh Johari, Boyan Jovanovic, Sean Meyn, Ariel Pakes, John Rust, Bernard Salanie, Minjae Song, Ben Van Roy, Daniel Xu, as well as seminar participants at U. of Chicago, Columbia, Harvard, Maryland, Pittsburgh, UT Austin, U. of Chile, and various conferences. We thank Yair Shenfeld for exemplary research assistance. We are grateful to the editor Stéphane Bonhomme and three anonymous referees for constructive feedback that greatly improved the paper.
† Airbnb
‡ Columbia Business School

Despite the broad interest in dynamic oligopoly models, there remain significant hurdles in applying them to problems of interest. Dynamic oligopoly models are typically analytically intractable, hence numerical methods are necessary to solve for their Markov perfect equilibrium (MPE). With recent estimation methods, such as Bajari et al. (2007), it is no longer necessary to solve for the equilibrium in order to structurally estimate a model. However, in the EP framework solving for MPE is still essential to perform counterfactuals and evaluate environmental and policy changes. The complexity of this computation has severely limited the practical application of EP-style models to industries with just a handful of firms, far fewer than in the real-world industries the analysis is directed at. Such limitations have made it difficult to construct realistic empirical models. Thus motivated, we propose a new computationally tractable model to study industry dynamics in settings with a few dominant firms and many fringe firms; in applications, dominant firms would typically be the market share leaders.
A market structure in which a few firms holding significant market share co-exist with many small firms is prevalent in many industries, such as supermarkets, banking, beer, cereals, and heavy duty trucks, to name a few (see, e.g, concentration ratios from the U.S. Economic Census). These industries are intractable in the standard EP framework due to the large number of fringe firms. Although individual fringe firms have negligible market power, they may have significant cumulative market share and may collectively discipline dominant firms’ behavior. Our model and methods capture these types of interactions and therefore significantly expand the set of industries that can be analyzed computationally. In an EP-style model, each firm is distinguished by an individual state at every point in time. The industry state is a vector (or ‘distribution’) encoding the number of firms in each possible value of the individual state variable. Assuming its competitors follow a prescribed strategy, a given firm selects, at each point in time, an action (e.g., an investment level) to maximize its expected discounted profits. The selected action will depend in general on the firm’s individual state and the industry state. Even if firms were restricted to anonymous and symmetric strategies, the computation entailed in selecting such an action quickly becomes infeasible as the number of firms and individual states grows. This renders commonly used dynamic programming algorithms to compute MPE infeasible for many problems of practical interest. The first main contribution of this work is the introduction of a new framework and equilibrium concept that overcomes the computational complexity involved in computing MPE. In an industry with many competitors, it may be reasonable to assume that firms are more sensitive to changes in the dominant firms’ states. Further, it may be unrealistic to expect that firms have resources to keep track of the evolution of all rivals. 
Thus motivated, we postulate a plausible model of firms’ behavior: firms closely monitor dominant firms, but do not monitor fringe firms as closely. Specifically, we assume that firms’ strategies depend on: (i) their own individual state; (ii) the detailed state of dominant firms; and (iii) a small number of aggregate statistics (such as a few moments) of the distribution of fringe firms. We refer to the fringe firms’ distribution or state interchangeably. We call these strategies moment-based strategies, where we use the term ‘moments’ generically, as firms could keep track of other statistics of the distribution of fringe firms, such as un-normalized moments or quantiles. Based on these strategies, we introduce an equilibrium concept that we call moment-based Markov equilibrium (MME). In MME, firms’ beliefs about the evolution of the industry satisfy a consistency condition, and firms employ an optimal moment-based strategy given these beliefs. One challenge in defining this concept, however, is that fringe moments may not be sufficient statistics to predict the future evolution of the industry. For example, suppose firms keep track of the first moment of the fringe state. For a given value of the first moment, there could be many different fringe distributions consistent with that value, from which the future evolution of the industry could vary greatly. Technically, the issue that arises is that the stochastic process of moments may not be a Markov process even if the underlying dynamics are. Therefore, to define MME we introduce a ‘Markov perceived transition kernel’ to approximate the dynamics of the (non-Markov) moment-based state; we provide natural and concrete examples for such a kernel. Notably, this construction allows for fringe firms to become dominant as they grow large and vice-versa, fully endogenizing the market structure in equilibrium. After defining MME, our second main contribution is a series of guidelines to use MME in applied work.
We introduce an algorithm that efficiently computes MME even in industries with hundreds of firms, provided that firms keep track of a few moments of the fringe state and there are a few dominant firms. In addition, we discuss important implementation issues for the use of MME in practice, such as how to specify the perceived transition kernel, and how to select moments and the number of dominant firms allowed. While MME may be appealing intuitively, moment-based strategies may not be close to a best response, and therefore, MME may not be close to a subgame perfect equilibrium in general. The reason is that because moments may not be sufficient statistics to predict the future evolution of the industry, they may not induce a sufficient partition of histories and may not summarize all payoff-relevant history in the sense of Maskin and Tirole (2001). Indeed, observing the full distribution of fringe firms may provide valuable information for decision making. Our third main contribution is a series of results that addresses this difficulty, providing support for using MME as an approximation. First, we explore the performance of MME as a low-dimensional approximation to MPE. We perform a large set of numerical experiments comparing MME and MPE outcomes in industries with a few firms and a few individual states per firm, for which MPE can be computed. We consider two important dynamic oligopoly models: a quality ladder model and a capacity competition model. We show that MME is able to provide accurate approximations to MPE across a wide range of parameter values even when firms keep track of a single and natural fringe moment. For example, in the capacity competition model this moment is the total installed capacity among all fringe firms. Alternatively, MME can be thought of as an appealing heuristic and behavioral model on its own. 
To formalize this interpretation, we define the ‘value of the full information deviation’ that measures the extent of sub-optimality of MME strategies in terms of a unilateral deviation to a strategy that keeps track of the full industry state, similar to the notion of ε-equilibrium. If this value is small, MME may be an appealing behavioral model, as unilaterally deviating to more complex strategies does not significantly increase profits, and doing so may involve costs associated with gathering and processing more information. We provide a theorem showing that if the value of the full information deviation becomes small (e.g., by including more moments in MME), then MME strategies approach MPE strategies. Measuring the value of the full information deviation is intractable in large-scale applications with many firms; in these instances, computing a best response is almost as computationally challenging as solving for an MPE. Of course, in these cases we could not compare MME with MPE either. To address this limitation, using ideas from robust dynamic programming, we propose a novel computationally tractable upper bound to the value of the full information deviation. This bound is useful because it allows one to evaluate whether the state aggregation is appropriate or whether a finer state aggregation is necessary, for example, by adding more moments. We provide numerical experiments showing how the error bound works in practice.

Our fourth and last contribution illustrates the applicability of our approach, showing how it can be used to endogenize the market structure in a fully dynamic model with hundreds of firms for which MPE would be impossible to compute. In particular, we perform numerical experiments motivated by the long trend towards concentration in the beer industry in the US during the years 1960-1990. Over the course of those years, the number of active firms dropped dramatically, and three industry leaders emerged.
One common explanation of this trend is the emergence of national TV advertising as an ‘endogenous sunk cost’ (Sutton, 1991). We build and calibrate a dynamic advertising model of the beer industry and use our methods to determine how a single parameter related to the returns to advertising expenditures critically affects the resulting market structure and the level of concentration in an industry with around two hundred firms overall. We also show how MME generates rich strategic interactions between dominant and fringe firms, in which, for example, dominant firms make investment decisions to deter the entry and growth of fringe firms. As further evidence of the usefulness of our approach, Corbae and D’Erasmo (2014) has already used our framework to study the impact of capital requirements in market structure in a calibrated model of banking industry dynamics with dominant and fringe banks. In summary, our approach offers a computationally tractable model for industries with a dominant/fringe market structure, opening the door to studying novel issues in industry dynamics. As such, our framework greatly increases the applicability of dynamic oligopoly models. The rest of this paper is organized as follows. We discuss related literature in Section 2. Section 3 describes our dynamic oligopoly model. Section 4 introduces our new equilibrium concept and Section 5 discusses the interpretations of MME as an approximation. Section 6 introduces an algorithm to compute MME as well as details on the perceived transition kernel. Section 7 provides a numerical comparison between MME and MPE and discusses important implementation issues. Section 8 describes the application to the beer industry and Section 9 introduces our error bounds. Finally, Section 10 provides extensions and concludes. 2 Related Literature Previous work has already recognized the challenges involved in applying dynamic oligopoly models in practice. 
For example, researchers have used different approaches to overcome the burden involved in computing equilibria in empirical applications. Some papers empirically study industries that contain only a few firms in which exact MPE computation is feasible if the number of individual states assumed is moderate (e.g., Benkard (2004), Ryan (2012), Collard-Wexler (2013a), and Collard-Wexler (2011)). Other researchers structurally estimate models in industries with many firms using approaches that do not require MPE computation, and do not perform counterfactuals that require computing equilibria (e.g., Benkard et al. (2010)). Our work provides a method to perform counterfactuals in concentrated industries with many firms. Other papers perform counterfactuals computing MPE but in reduced-size models compared to the actual industry. These models include a few dominant firms and ignore the rest of the (fringe) firms (for example, see Ryan (2012) and Gallant et al. (2011)). Other papers coarsely discretize the individual states (e.g., Collard-Wexler (2013b) and Corbae and D’Erasmo (2013) assume homogeneous firms). Finally, some papers allow for richer heterogeneity but assume a simplified model of dynamics in which a certain process of ‘moments’ that summarize the industry state information is Markov (e.g., see Kalouptsidi (2014), Jia and Pathak (2012), Santos (2010), and Tomlin (2008); Lee (2013) uses a similar approach in a dynamic model of demand with forward-looking consumers). Our methods can help researchers determine the validity of these simplifications. A stream of empirical literature related to our work uses simplified notions of equilibrium for estimation and counterfactuals. In particular, Xu (2008), Iacovone et al. (2013), Qi (2013), and Saeedi (2013) among others use the notion of oblivious equilibrium (OE) introduced by Weintraub et al. (2008), in which firms assume the average industry state holds at any time.
OE can be shown to approximate MPE in industries with many firms by a law of large numbers, provided that the industry is not too concentrated. Our approach extends this type of analysis to industries that are more concentrated. Our work is related to recent methodological work that tries to overcome the complexity involved in computing MPE. For example, Pakes and McGuire (2001) proposes a stochastic approximation algorithm that solves the model only on a recurrent class of states, and reduces the computation time at each state through simulation. Doraszelski and Judd (2011) suggests casting the game in continuous time, which greatly reduces the computational cost at each state by reducing the number of future states reachable from each current state (see also Jeziorski (2013) for an application of this technique). Other papers propose different computational approaches to approximate MPE. Farias et al. (2012) develops a method based on approximate dynamic programming with value function approximation (see also Sweeting (2013) for an empirical application using value function approximation). Santos (2012) introduces a state aggregation technique based on quantiles of the industry state distribution. Our work is particularly related to Benkard et al. (2014), which extends the notion of oblivious equilibrium to include dominant firms. In that paper, there is a fixed and pre-determined set of dominant firms, which all firms in the industry monitor. Firms assume that at every point in time the fringe state is equal to the expected state conditional on the state of dominant firms. Our paper builds on and generalizes Benkard et al. 
(2014)’s idea of considering a dominant/fringe market structure.1 In particular, our approach is more general in at least two important dimensions: (i) firms keep track not only of the dominant firms’ states but also of statistics that summarize the fringe firms’ states; and (ii) in our model the set of dominant firms arises endogenously in equilibrium. That is, we allow for firms to transition from the fringe to the dominant tier and vice-versa, and therefore, we fully endogenize the market structure and explicitly model how firms grow to become dominant. We provide more details about this connection in Section 4.6.

1 In their analysis of a stylized model of dynamic mergers, Gowrisankaran and Holmes (2004) also provide an earlier example of the dominant/fringe dichotomy.

We introduce MME as a restriction to firms’ strategies in which the only dependence on the fringe state is through a few moments. In this sense, MME limits firms’ information sets and therefore could be cast as an equilibrium concept in a game of asymmetric information even if the underlying economic model is not. Hence, while their model and motivation are fundamentally different from ours, MME is related to the notion of experience-based equilibria of Fershtman and Pakes (2012) that weakens the restrictions imposed by perfect Bayesian equilibria for dynamic games of asymmetric information. We discuss this relation in more detail in Section 4.6. Finally, our moment-based strategies are similar to, and follow the same spirit as, the seminal paper by Krusell and Smith (1998), which replaces the distribution of wealth across agents in the economy with its moments, when computing stationary stochastic equilibrium in a stochastic growth model.
While our main focus has been on dynamic oligopoly models, our methods can also be used to find better approximations in stochastic growth models, as well as in macroeconomic dynamic industry models with an infinite number of heterogeneous firms and an aggregate shock (see, e.g., Khan and Thomas (2008) and Clementi and Palazzo (2014)). We discuss these extensions in Section 10.2. 3 Dynamic Oligopoly Model and Solution Concept In this section we formulate a model of industry dynamics in the spirit of Ericson and Pakes (1995). Similar models have been applied to numerous settings in industrial organization such as advertising, auctions, R&D, collusion, consumer learning, learning-by-doing, and network effects (see Doraszelski and Pakes (2007) for a survey). 3.1 Model We introduce the main elements of our dynamic oligopoly model. Time Horizon. The industry evolves over discrete time periods and along an infinite horizon. We index time periods with nonnegative integers t ∈ N (N = {0, 1, 2, . . .}). All random variables are defined on a probability space (Ω, F, P) equipped with a filtration {Ft : t ≥ 0}. We adopt a convention of indexing by t variables that are Ft -measurable. Firms. Each firm that enters the industry is assigned a unique positive integer-valued index. The set of indices of incumbent firms at time t is denoted by St . State Space. Firm heterogeneity is reflected through firm states that represent the quality level, productivity, capacity, the size of its consumer network, or any other aspect of the firm that affects its profits. At time t the individual state of firm i is denoted by xit ∈ X , where X is a finite subset of Nq , q ≥ 1. We define the industry state s̄t to be a vector over individual states that specifies, for each firm state x ∈ X , the number of incumbent firms at x in period t. 
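The definition of the industry state as a vector of counts can be made concrete with a small sketch. The function name `industry_state`, the toy state space, and the firm indices below are hypothetical and chosen only for illustration; the point is that two industries with different firm identities but the same individual states yield the same count vector.

```python
from collections import Counter


def industry_state(firm_states, state_space):
    """Return the count vector: number of incumbents at each state in state_space."""
    counts = Counter(firm_states)
    unknown = set(counts) - set(state_space)
    if unknown:
        raise ValueError(f"states outside the state space: {sorted(unknown)}")
    # One entry per individual state x, in the order given by state_space.
    return tuple(counts[x] for x in state_space)


# Hypothetical one-dimensional state space (q = 1), e.g. quality levels.
X = (1, 2, 3, 4)

# Two industries: different firm indices, same multiset of individual states.
firms_a = {101: 1, 102: 3, 103: 3}
firms_b = {7: 3, 8: 1, 9: 3}

s_a = industry_state(firms_a.values(), X)
s_b = industry_state(firms_b.values(), X)
```

Because the count vector discards firm indices, `s_a` and `s_b` coincide, which is the sense in which the industry state does not depend on the identities of firms.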
Note that because we will focus on symmetric and anonymous equilibrium strategies in the sense of Doraszelski and Pakes (2007), we can restrict attention to industry states for which the identities of firms do not matter. The integer number N represents the maximum number of incumbent firms that the industry can accommodate at every point in time and we let nt be the number of incumbent firms at time period t.

Exit process. In each period, each incumbent firm i observes a nonnegative real-valued sell-off value φit that is private information to the firm. If the sell-off value exceeds the value of continuing in the industry, then the firm may choose to exit, in which case it earns the sell-off value and then ceases operations permanently. We assume the random variables {φit | t ≥ 0, i ≥ 1} are i.i.d. and have a well-defined density function with finite moments.

Entry process. We consider an entry process similar to the one in Doraszelski and Pakes (2007). At time period t, there are N − nt potential entrants, ensuring that the maximum number of incumbent firms that the industry can accommodate is N (we assume n0 < N ). Each potential entrant is assigned a unique index. In each time period each potential entrant i observes a positive real-valued entry cost κit that is private information to the firm. We assume the random variables {κit | t ≥ 0, i ≥ 1} are i.i.d., independent of all previously defined random quantities, and have a well-defined density function with finite moments. If the entry cost is below the expected value of entering the industry then the firm will choose to enter. Potential entrants make entry decisions simultaneously. Entrants appear in the following period at state xe ∈ X and earn profits thereafter.2 As is common in this literature and to simplify the analysis, we assume potential entrants are short-lived and do not consider the option value of delaying entry.
Potential entrants that do not enter the industry disappear and a new generation of potential entrants is created in the next period.

Transition dynamics. If an incumbent firm decides to remain in the industry, it can take an action to improve its individual state. Let $I \subseteq \Re^k_+$ ($k \ge 1$) be a convex and compact action space; for concreteness, we refer to this action as an investment. Given a firm’s investment ι ∈ I and state at time t, the firm’s transition to a state at time t + 1 is described by the following Markov kernel Q:

$$Q[x' \mid x, \iota] = P[x_{i,t+1} = x' \mid x_{it} = x, \iota_{it} = \iota].$$

Uncertainty in state transitions may arise, for example, due to the risk associated with a research and development endeavor or a marketing campaign. The cost of investment is given by a nonnegative function c(ιit , xit ) that depends on the firm’s individual state xit and investment level ιit . Both the kernel and the investment cost are assumed to be continuous functions of investment. We assume that transitions are independent across firms conditional on the industry state and investment levels. We also assume these transitions are independent of all previously defined random quantities.

Aggregate shock. There is an aggregate profitability shock zt that is common to all firms and that may represent a common demand shock, a common shock to input prices, or a common technology shock. We assume that {zt ∈ Z : t ≥ 0} is a finite, ergodic Markov chain, and independent of all previously defined random quantities.

Single-Period Profit Function. Each incumbent firm earns profits on a spot market. We denote by π(xit , s̄t , zt ) the single-period expected profits garnered by firm i at time period t, which depend on its individual state xit , the industry state s̄t , and the value of the aggregate shock zt .

Timing of Events.
In each period, events occur in the following order: (1) Each incumbent firm observes its sell-off value and then makes exit and investment decisions; (2) Each potential entrant observes its entry cost and makes entry decisions; (3) Incumbent firms compete in the spot market and receive profits; (4) Exiting firms exit and receive their sell-off values; (5) Investment shock outcomes are determined, new entrants enter, and the industry takes on a new state st+1 .

Firms’ objective. Firms aim to maximize expected discounted profits. The interest rate is assumed to be positive and constant over time, resulting in a constant discount factor of β ∈ (0, 1) per time period. Finally, we assume that for all competitors’ decisions and all terminal values, a firm’s one time-step ahead optimization problem to determine its optimal investment has a unique solution.3

2 It is straightforward to generalize the model by assuming that entrants can also invest to improve their initial state.
3 This assumption is a generalization of the unique investment choice admissibility assumption in Doraszelski and Satterthwaite (2010) that is used to guarantee the existence of an equilibrium to the model in pure strategies. It is satisfied by many of the commonly used specifications in the literature.

3.2 Markov Perfect Equilibrium

The most commonly used equilibrium concept in such dynamic oligopoly models is that of symmetric pure strategy Markov perfect equilibrium (MPE) in the sense of Maskin and Tirole (1988). To simplify notation we re-define the industry state to incorporate the aggregate shock, that is st = (s̄t , zt ). Hence, there is a function ι such that at each time t, each incumbent firm i invests an amount ιit = ι(xit , st ), as a function of its own state xit and the industry state st .
Similarly, each firm follows an exit strategy that takes the form of a cutoff rule: there is a real-valued function ρ such that an incumbent firm i exits at time t if and only if φit > ρ(xit , st ). Weintraub et al. (2008) show that there always exists an optimal exit strategy of this form even among very general classes of exit strategies. We define the state space

$$S = \Big\{ (\bar{s}, z) \in \mathbb{N}^{|X|} \times Z \;\Big|\; \sum_{x \in X} \bar{s}(x) \le N \Big\},$$

that is, S is the set of industry states including the aggregate shock. Let M denote the set of exit/investment strategies such that an element µ ∈ M is a set of functions µ = (ι, ρ), where ι : X × S → I is an investment strategy and ρ : X × S → R is an exit strategy.4 Similarly, each potential entrant follows an entry strategy that takes the form of a cutoff rule: there is a real-valued function λ such that a potential entrant i enters at time t if and only if κit < λ(st ). It is simple to show that there always exists an optimal entry strategy of this form even among very general classes of entry strategies (see Doraszelski and Satterthwaite (2010)). We denote the set of entry strategies by Λ, where an element of Λ is a function λ : S → R. We define the value function V (x, s|µ′ , µ, λ) to be the expected discounted value of profits for a firm at state x when the industry state is s, given that its competitors each follow a common strategy µ ∈ M, the entry strategy is λ ∈ Λ, and the firm itself follows strategy µ′ ∈ M. Formally,

$$V(x, s \mid \mu', \mu, \lambda) = E_{\mu', \mu, \lambda}\left[ \sum_{k=t}^{\tau_i} \beta^{k-t} \big( \pi(x_{ik}, s_k) - c(\iota_{ik}, x_{ik}) \big) + \beta^{\tau_i - t} \phi_{i,\tau_i} \;\Big|\; x_{it} = x, s_t = s \right],$$

where i is taken to be the index of a firm at individual state x at time t, τi is a random variable representing the time at which firm i exits the industry, and the subscripts of the expectation indicate the strategy followed by firm i, the strategy followed by its competitors, and the entry strategy.
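As a small worked illustration of the expression inside the expectation, the sketch below evaluates the discounted payoff along a single realized path: per-period profits net of investment costs from period t up to the exit period, plus the discounted sell-off value collected at exit. The function name `realized_value` and the numbers are hypothetical; the value function itself is the expectation of this quantity over all paths induced by the strategies, which this sketch does not attempt to compute.

```python
def realized_value(profits, costs, selloff, beta):
    """Discounted payoff of one realized path.

    profits[j] and costs[j] are the profit and investment cost in period t + j;
    the last entry corresponds to the exit period, in which the firm also
    collects the sell-off value.
    """
    if not profits or len(profits) != len(costs):
        raise ValueError("profits and costs must be non-empty and of equal length")
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    periods_until_exit = len(profits) - 1
    # Sum of beta^(k - t) * (profit_k - cost_k) for k = t, ..., exit period.
    flow = sum(beta ** j * (p - c) for j, (p, c) in enumerate(zip(profits, costs)))
    # Sell-off value discounted back from the exit period.
    return flow + beta ** periods_until_exit * selloff


# Hypothetical path: the firm operates for three periods and exits in the third.
value = realized_value(profits=[10.0, 12.0, 8.0], costs=[2.0, 3.0, 0.0],
                       selloff=5.0, beta=0.9)
```

With these inputs the net flows are 8, 9 and 8, so the flow term is 8 + 0.9 · 9 + 0.81 · 8 = 22.58 and the discounted sell-off value is 0.81 · 5 = 4.05.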
In an abuse of notation, we will use the shorthand, V (x, s|µ, λ) ≡ V (x, s|µ, µ, λ), to refer to the expected discounted value of profits when firm i follows the same strategy µ as its competitors. An equilibrium in our model comprises an investment/exit strategy µ = (ι, ρ) ∈ M and an entry strategy λ ∈ Λ that satisfy the following conditions:

1. Incumbent firms’ strategies represent an MPE:

$$\sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda) = V(x, s \mid \mu, \lambda), \quad \forall (x, s) \in X \times S.$$

2. For all states, the cut-off entry value is equal to the expected discounted value of profits of entering the industry:5

$$\lambda(s) = \beta E_{\mu, \lambda}\left[ V(x^e, s_{t+1} \mid \mu, \lambda) \mid s_t = s \right], \quad \forall s \in S.$$

4 To simplify notation we do not make this restriction explicit; however, we restrict attention to states (x, s) ∈ X × S, such that s̄(x) > 0 (recall that s̄(x) is the x-th component of s̄).

Doraszelski and Satterthwaite (2010) establish the existence of an equilibrium in pure strategies for a similar model. With respect to uniqueness, in general we presume that our model may have multiple equilibria. Doraszelski and Satterthwaite (2010) also provide an example of multiple equilibria in a closely related model. A limitation of MPE is that the set of relevant industry states grows quickly with the number of firms in the industry and individual states, making its computation intractable when there are more than a few firms. This motivates our alternative approach.

4 Moment-Based Markov Equilibrium

In this section we introduce a new equilibrium concept that overcomes the computational complexity involved in solving for MPE.

4.1 Dominant and Fringe Firms

We focus on industries that exhibit the following market structure: there are a few dominant firms and many fringe firms. Let Dt ⊂ St and Ft ⊂ St be the set of incumbent dominant and fringe firms at time period t, respectively. The sets Dt and Ft are common knowledge among firms at every period of time.
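To give a rough sense of the state-space growth that limits MPE computation (Section 3.2), the following sketch counts the anonymous industry states, that is, the count vectors over individual states with at most N incumbents, multiplied by the number of aggregate-shock values. The closed form is the standard stars-and-bars count; the function names and the example sizes are hypothetical and not taken from the paper.

```python
from itertools import product
from math import comb


def num_industry_states(num_individual_states, max_firms, num_shocks=1):
    """Count pairs (count vector, shock) with at most max_firms incumbents."""
    # Stars and bars: vectors in N^m whose entries sum to at most N number C(N + m, m).
    return comb(max_firms + num_individual_states, num_individual_states) * num_shocks


def brute_force_count(num_individual_states, max_firms):
    """Enumerate the count vectors directly; only feasible for tiny examples."""
    grid = product(range(max_firms + 1), repeat=num_individual_states)
    return sum(1 for s_bar in grid if sum(s_bar) <= max_firms)


# A tiny industry, where enumeration can confirm the closed form ...
small = num_industry_states(num_individual_states=3, max_firms=4)
# ... and a moderately sized one, where enumeration is already impractical.
large = num_industry_states(num_individual_states=10, max_firms=20)
```

Even before multiplying by the number of own states and shock values, ten individual states and twenty firms already give tens of millions of industry states, which is consistent with the observation that exact MPE computation is limited to industries with a handful of firms.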
These sets can change over time and our model incorporates a mechanism that endogenizes the process through which fringe firms become dominant and vice-versa. We let Xf ⊆ X and Xd ⊆ X be the set of feasible individual states for fringe and dominant firms, respectively. We define the state or distribution of fringe firms ft to be a vector over individual states that specifies, for each fringe firm state x ∈ Xf , the number of incumbent fringe firms at x in period t. We define

$$S_f = \Big\{ f \in \mathbb{N}^{|X_f|} \;\Big|\; \sum_{x \in X_f} f(x) \le N \Big\}$$

to be the set of all possible states of fringe firms. Let dt be the state of dominant firms, specifying the individual state of each dominant firm at time period t. The set of all possible dominant firms’ states is defined by $S_d = \{ d \in X_d^D \}$, where D is the maximum number of dominant firms the industry can accommodate. Hence, the dominant firms’ state is represented by a list of individual states.6 Note that by defining an inactive state, the number of active dominant firms could be anywhere between 0 and D in a given time period. With some abuse of notation, we define the state space S = Sf × Sd × Z, where a state s = (f, d, z) ∈ S is given by a distribution of fringe firms, a state for dominant firms, and a value for the aggregate shock.

5 Hence, potential entrants enter if the expected discounted profits of doing so are positive. Throughout the paper it is implicit that the industry state at time period t + 1, st+1 , includes the entering firm in state xe whenever we write (xe , st+1 ).
6 Because we focus on equilibrium strategies that are anonymous, we can restrict attention to a set Sd for which the identity of dominant firms does not matter. For example, if the state is one dimensional, we can define $S_d = \{ d \in X_d^D \mid d(1) \le d(2) \le \dots \le d(D) \}$. We choose this state representation for dominant firms as opposed to a distribution over individual states, because we will typically consider applications with a small number of dominant firms.

The division between dominant and fringe firms will depend on the specific application at hand. Typically, however, dominant firms will be market share leaders or, more generally, firms that most affect competitors’ profits. For concreteness, suppose that the state of the firm represents its ability to compete in the market and firms in higher individual states have larger market shares, as in the two examples we introduce below in Section 7.1. It may then be natural to define dominant firms as those that have a sufficiently large individual state. In our numerical experiments, we separate dominant firms from fringe firms by an exogenous threshold state x, such that i ∈ Dt if and only if xit ≥ x. We note that in this case, Xf ∩ Xd = ∅, which we assume throughout the rest of the paper to simplify notation.7 More generally, one could consider a relative threshold that depends on some statistics of the current fringe state; for example, a fringe firm becomes dominant if it is a certain number of standard deviations above the current average size of fringe firms. The threshold could also depend on the current state of dominant firms. Note that under all these specifications the transitions among the fringe and dominant tiers are naturally embedded in the firms’ transitions. Our approach could also accommodate alternative specifications on how firms become dominant depending on the empirical application at hand. In the applications we have in mind, dominant firms are few and have significant market power. In contrast, fringe firms are many and individually hold little market power, although their aggregate market share may be significant. This market structure suggests that firms’ decisions should be more sensitive to the state of dominant firms than to the state of fringe firms.
Moreover, given that the fringe firms’ state is a high-dimensional object, gathering information on the state of each individual small firm is likely to be more expensive than doing so for larger firms, which are not only few but also usually more visible and often publicly traded. Consequently, as the number of fringe firms increases it seems implausible that firms keep track of the individual state of each one. Instead, we postulate that firms only keep track of the state of dominant firms and of a few summary statistics of the fringe state.

4.2 Moment-Based Strategies

Our approach requires that firms compute best responses in strategies that depend only on a few summary statistics of the fringe state. A set of such summary statistics is a multi-variate function θ : Sf → Θ, where Θ is a finite subset of $\mathbb{R}^n$. For example, when the fringe firm state is one-dimensional, $\theta(f) = \sum_{x \in X_f} x^{\alpha} f(x)$ is the α-th un-normalized moment with respect to the distribution f.8 In particular, if α = 1, the statistic θ(ft) corresponds to the first un-normalized moment, which is equivalent to the sum of the individual states of all incumbent fringe firms, $\sum_{i \in F_t} x_{it}$. For brevity and concreteness, we call such summary statistics fringe moments, with the understanding that they could include normalized or un-normalized moments, but also other functions of the fringe distribution such as quantiles or the number of fringe firms that are about to become dominant. Accordingly, we define a set of moments of the distribution of fringe firms:
$$\theta_t = \theta(f_t). \qquad (1)$$

7 Note that more generally we can always make the assumption that Xf ∩ Xd = ∅ by appending one dimension to the individual state in order to indicate whether the firm is fringe or dominant. This construction is useful as it allows, for example, for fringe and dominant firms to have different model primitives and strategies.
8 Note that with a finite number of fringe firms (N) and a finite number of individual states per firm (|Xf|), this and all other statistics that we use in the paper take a finite number of values.

We introduce firm strategies that depend on their own individual state xit, the state of dominant firms dt, the aggregate shock zt, and fringe moments θt as defined in (1). We call such strategies moment-based strategies. We also define Sθ as the set of admissible moments defined by (1); that is, Sθ = {θ | ∃f ∈ Sf s.t. θ = θ(f)}. In light of this, we define the moment-based industry state by ŝ = (θ, d, z) ∈ Ŝ = Sθ × Sd × Z.

A moment-based investment strategy is a function ι such that at each time t, each incumbent firm i ∈ St invests an amount ιit = ι(xit, ŝt), where ŝt is the moment-based industry state at time t. Similarly, each firm follows an exit strategy that takes the form of a cutoff rule: there is a real-valued function ρ such that an incumbent firm i ∈ St exits at time t if and only if φit ≥ ρ(xit, ŝt). Let M̂ denote the set of moment-based exit/investment strategies such that an element µ ∈ M̂ is a pair of functions µ = (ι, ρ), where ι : X × Ŝ → I is an investment strategy and ρ : X × Ŝ → ℝ is an exit strategy.9 Each potential entrant follows an entry strategy that takes the form of a cutoff rule: there is a real-valued function λ such that a potential entrant i enters at time t if and only if κit ≤ λ(ŝt). We denote the set of entry functions by Λ̂, where an element of Λ̂ is a function λ : Ŝ → ℝ. We assume that all new entrants become part of the fringe.

Note that moment-based strategies and the moment-based state space are defined with respect to a specific function of moments (1). Relative to MPE strategies, moment-based strategies do not keep track of the full fringe state; they only keep track of a few fringe moments.
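As a concrete illustration of the moment function in (1), the following minimal Python sketch computes un-normalized fringe moments from a fringe distribution and assembles a moment-based state. The dictionary representation of f, the function name `fringe_moment`, and all numerical values are assumptions made for illustration only; they are not part of the model specification.

```python
def fringe_moment(f, alpha=1):
    """alpha-th un-normalized moment of a fringe distribution.

    f maps each individual state x to the number of fringe firms at x
    (an illustrative encoding of the vector f in the text).
    """
    return sum((x ** alpha) * count for x, count in f.items())


# Hypothetical fringe distribution: 4 firms in state 1, 3 in state 2, 1 in state 5.
f_t = {1: 4, 2: 3, 5: 1}
# The same industry written as a list of individual fringe states x_it.
firm_states = [1, 1, 1, 1, 2, 2, 2, 5]

theta_1 = fringe_moment(f_t, alpha=1)  # first un-normalized moment
theta_2 = fringe_moment(f_t, alpha=2)  # second un-normalized moment

# Moment-based state (theta, d, z); d and z are placeholder values.
d_t = (8, 9)   # individual states of two dominant firms (hypothetical)
z_t = 0        # aggregate shock (hypothetical)
s_hat = (theta_1, d_t, z_t)
```

With α = 1 the moment coincides with the sum of the individual states of the fringe firms, as noted in the text; a firm using a moment-based strategy would condition on `s_hat` rather than on `f_t`.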
4.3 Transition Kernel

With Markov strategies (µ, λ) the underlying state, {st = (ft, dt, zt) : t ≥ 0}, is a Markov process with a transition kernel denoted by Pµ,λ. In addition, we denote by Pµ′,µ,λ the transition kernel of (xit, st) when firm i uses strategy µ′ and its competitors use strategy (µ, λ). Note that given strategies, both these kernels can be derived from the primitives of the model, namely, the distributions of φ and κ, the kernel of the aggregate shock, and the kernel Q. We emphasize that the underlying industry state st should be distinguished from the moment-based industry state ŝt = (θt, dt, zt).

Defining our notion of equilibrium in moment-based strategies will require the construction of what can be viewed as a ‘Markov’ approximation to the dynamics of the moment-based state process {(xit, ŝt) : t ≥ 0}, where i is a generic firm. Note that this process is, in general, not Markov even if the dynamics of the underlying state {(xit, st) : t ≥ 0} are. To see this, suppose for example that the individual state represents a firm’s size and that firms keep track of the first un-normalized moment of the fringe state, $\theta_t = \theta(f_t) = \sum_{x \in X_f} x f_t(x)$. Suppose the current value of that moment is θt = 10; this value is consistent with one fringe firm in individual state 10, but also with 10 fringe firms in individual state 1. These two different states do not necessarily yield the same probability distribution for the first moment in the next period. Therefore, θt may not be a sufficient statistic to predict the future evolution of the industry, because there are many fringe distributions that are consistent with the same value of θt. Once information is aggregated via moments, the resulting process is in general no longer Markovian.
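The θt = 10 example can be checked numerically. The sketch below is a toy illustration under strong assumptions that are not part of the model: there is no entry or exit, and each fringe firm independently moves up one state with probability p and otherwise stays put. The function name `next_moment_distribution` is hypothetical.

```python
from math import comb


def next_moment_distribution(f, p):
    """Exact distribution of next period's first un-normalized moment.

    Toy dynamics (an assumption for illustration): every fringe firm
    independently moves up one state with probability p, else stays.
    f maps individual state x -> number of fringe firms at x.
    """
    n_firms = sum(f.values())
    theta = sum(x * count for x, count in f.items())
    # The number of firms that move up is Binomial(n_firms, p).
    return {
        theta + k: comb(n_firms, k) * p ** k * (1 - p) ** (n_firms - k)
        for k in range(n_firms + 1)
    }


one_big = {10: 1}    # one fringe firm in state 10, so theta = 10
ten_small = {1: 10}  # ten fringe firms in state 1, so theta = 10

dist_a = next_moment_distribution(one_big, p=0.5)
dist_b = next_moment_distribution(ten_small, p=0.5)
```

Both fringe states share the same current moment, yet `dist_a` places all mass on {10, 11} while `dist_b` spreads mass over 10 through 20. Under these toy dynamics, knowing θt alone does not pin down the distribution of θt+1.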
To simplify our presentation of the transition kernel over moment-based states we initially assume that the set of dominant firms is fixed and predetermined, and hence, there are no transitions between the sets of fringe and dominant firms. At the end of this section we discuss the more practical setting that we use in our numerical experiments, in which the set of dominant firms is endogenously determined.

9 To simplify notation we do not make this restriction explicit; however, we restrict attention to states (x, ŝ) such that if the firm in state x is a dominant firm, then its individual state needs to coincide with one of the components of d.

Assuming that firm i follows the moment-based strategy µ′, and that all other firms use moment-based strategies (µ, λ), we will describe a kernel P̂µ′,µ,λ[·|·] with the hope that the Markov process described by this kernel is a good approximation to the (non-Markov) process {(xit, ŝt) : t ≥ 0}. The Markov process described by P̂µ′,µ,λ is firm i’s perception of the evolution of its own state in tandem with the moment-based industry state.

To this end, we first introduce the kernel P̂µ,λ[θ′|ŝ] that describes the evolution of a hypothetical Markov process over moments. To approximate the moment process, we require that the perceived transitions over moments described by P̂µ,λ agree in some manner with the actual transitions over underlying states described by Pµ,λ. Formally, in MME we will impose that:
$$\hat{P}_{\mu,\lambda} = \Phi P_{\mu,\lambda}, \qquad (2)$$
for a given operator Φ. MME could be defined with any operator Φ; however, some specifications yield better approximations and Example 4.1 below provides one such specification. In our numerical experiments we use a perceived transition kernel based on this example unless otherwise noted.

Example 4.1 (Observed transitions).
One possible definition for P̂µ,λ is the kernel that coincides with the long-run average observed transitions from the moment-based state in the current time period to the moment in the next time period under strategies (µ, λ). More specifically, we let the industry evolve for a long time under strategies (µ, λ). For each moment-based state ŝ that is visited, we observe a transition to a new moment θ′. We count the observed frequency of these transitions and set the kernel P̂µ,λ[θ′|ŝ] to be the observed distribution corresponding to these frequencies. A similar construction is used by Fershtman and Pakes (2012) in a setting with asymmetric information.

We now formalize the definition of this kernel. The evolution of the underlying industry state is described by the kernel Pµ,λ. We let Rµ,λ be the recurrent class of moment-based states the industry will eventually reach. For all θ′ ∈ Sθ and ŝ ∈ Rµ,λ, we define the operator:
$$(\Phi P_{\mu,\lambda})[\theta' \mid \hat{s}] = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s},\ \theta_{t+1} = \theta'\}}{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s}\}}. \qquad (3)$$
We require that P̂µ,λ[θ′|ŝ] = (ΦPµ,λ)[θ′|ŝ] for all ŝ ∈ Rµ,λ, and P̂µ,λ is arbitrarily defined for states ŝ ∉ Rµ,λ. In other words, perceived transitions for fringe moments coincide with their observed transitions in the recurrent class of states, and are arbitrary outside this class.10

10 In this paper, as is common in this literature, we focus on ‘capital accumulation’ games without investment spillovers, for which it is simple to show that for given strategies, the industry state process admits a single recurrent class. Alternatively, in other settings that yield more than one recurrent class, to construct the perceived transition kernel we would need to assume that all firms agree on which recurrent class the industry will eventually transition into.

To make the example more concrete, suppose that $\theta_t = \theta(f_t) = \sum_{x \in X_f} x f_t(x)$, the first un-normalized moment of the fringe state.
To simplify the exposition, we ignore dominant firms and the aggregate shock. We let the industry evolve for a long time under given strategies (µ, λ). Suppose that in the course of this simulation, we observe that in one half of the time periods in which θt = 10 the moment in the next time period is θt+1 = 10, in one fourth it is θt+1 = 12, and in the other fourth it is θt+1 = 8. Then, we set P̂µ,λ[θ′ = 10|θ = 10] = 1/2, P̂µ,λ[θ′ = 12|θ = 10] = 1/4, and P̂µ,λ[θ′ = 8|θ = 10] = 1/4. We would set P̂µ,λ[θ′|θ] similarly for all moment states in the recurrent class.

Another possible construction of the perceived transition kernel, which has been successfully used in stochastic growth models in macroeconomics (Krusell and Smith, 1998) and the subsequent literature, assumes a linearly parameterized and deterministic evolution for moments. We describe this construction in detail in Appendix A. This perceived transition kernel has the disadvantage of imposing strong parametric restrictions and assuming deterministic transitions. On the other hand, these same restrictions significantly reduce the computational burden of solving for the equilibrium relative to the ‘observed transitions’ kernel.

Both of these kernels are meant to capture the long-run evolution of the industry. We could also introduce a perceived kernel meant to capture the short-term dynamics of the industry after a policy or an environmental change, such as a merger. For this, we could consider the average observed transitions from the current moment to the next, over many finite (and short) trajectories that start from the same state; this state describes the initial condition of the industry.
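The observed-transitions construction of Example 4.1 amounts to counting transitions along a simulated path and normalizing the counts. The sketch below is a finite-sample analogue of (3) under simplifying assumptions: as in the concrete example above, the state is the fringe moment alone (dominant firms and the aggregate shock are ignored), and the short path is made up for illustration rather than simulated from any model. The name `observed_kernel` is hypothetical.

```python
from collections import Counter, defaultdict


def observed_kernel(path):
    """Empirical transition frequencies along one sample path.

    path is a sequence of visited (hashable) moment-based states.
    Returns {state: {next_state: frequency}} for every state that is
    left at least once; unvisited states get no entry at all.
    """
    counts = defaultdict(Counter)
    for current, nxt in zip(path, path[1:]):
        counts[current][nxt] += 1
    kernel = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        kernel[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return kernel


# Made-up path of first moments; from theta = 10 it moves to 10 twice,
# to 12 once, and to 8 once.
path = [10, 10, 12, 10, 10, 8, 10]
kernel = observed_kernel(path)
```

On this path the estimated kernel at θ = 10 reproduces the probabilities 1/2, 1/4, and 1/4 from the example. States that never appear in the path have no entry, which mirrors the fact that the kernel is left arbitrary outside the recurrent class.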
Having defined the kernel for fringe moments P̂µ,λ[θ′|ŝ], we now define the perceived kernel P̂µ′,µ,λ[·|·] over states (xit, ŝt) according to:
$$\hat{P}_{\mu',\mu,\lambda}[x', \hat{s}' \mid x, \hat{s}] = P_{\mu',\mu,\lambda}[x', d', z' \mid x, \hat{s}] \, \hat{P}_{\mu,\lambda}[\theta' \mid \hat{s}], \qquad (4)$$
where with some abuse of notation Pµ′,µ,λ[x′, d′, z′|x, ŝ] denotes the marginal distribution of the next state of firm i, the next state of dominant firms, and the next value for the aggregate shock, conditional on the current moment-based state, according to the kernel of the underlying state Pµ′,µ,λ. The definition above makes the following facts about this perceived process transparent:

1. Were firm i a fringe firm, the above definition asserts that this fringe firm ignores its own impact on the evolution of industry moments. This is evident in that x′ is distributed independently of θ′ conditional on x and ŝ.
2. Given information about the current moment-based state, the firm correctly assesses the distribution of its next state, the next state of dominant firms, and the next aggregate shock. Note that because firms use moment-based strategies, the moment-based state (x, ŝ) is sufficient to determine the transition probabilities of (x, d, z) according to the transition kernel of the underlying state Pµ′,µ,λ.
3. However, it should be clear that the Markov process given by the above definition remains an approximation, since it posits that the evolution of the moments θ is Markov with respect to ŝ as described by P̂µ,λ[θ′|ŝ]. In fact, the distribution of moments at the next point in time potentially depends on the full distribution of the fringe firms and not only on its moments.

We finish this subsection by briefly describing how to modify the transition kernel when the set of dominant firms is endogenous.
Recall that Pµ′,µ,λ[x′, d′, z′|x, ŝ] (see (4)) is the kernel that describes the actual evolution of (xit, dt, zt), which firms can determine exactly from the primitive transition kernel when transitions between the fringe and dominant tiers are not possible. However, if such transitions are possible, the moments in θ(·) may not contain sufficiently detailed information to precisely calculate the transition probabilities into dominance, and therefore, the distribution of the dominant firm state in the next time period conditional on the current moment-based state. For example, suppose θ(f) is the first un-normalized moment of the fringe state and that firms become dominant if they grow larger than a pre-determined individual state. Then, the probability that a fringe competitor becomes dominant depends on the fringe state beyond that moment. Therefore, in this case, we need to modify the kernel that describes the evolution of (xit, dt, zt) to incorporate firms’ perceived probabilities that a fringe competitor will transition into dominance given the moment-based industry state. The operator Φ will specify consistency conditions for these events in addition to moments’ transitions; for example, the perceived transition probabilities of fringe firms becoming dominant could coincide with the corresponding transition frequencies observed given the strategies firms use. We provide a detailed description of tier transitions in Section 6.2.

4.4 Single-Period Profit Function

The single-period profit function can be written as π(xit, st): a function of the individual state xit and the industry state st = (ft, dt, zt). If firms use moment-based strategies and only keep track of fringe moments, they do not have the ability to evaluate a single-period profit function that depends on the entire fringe state.
Hence, to define our equilibrium concept in these cases, similar to the perceived transition kernel, we need to define a perceived single-period profit function over moment-based states, π̂µ,λ(xit, ŝt), for given strategies (µ, λ). One possible specification for the perceived single-period profit function is:
$$\hat{\pi}_{\mu,\lambda}(x, \hat{s}) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} \pi(x, s_t)\, \mathbf{1}\{\hat{s}_t = \hat{s}\}}{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s}\}}, \qquad (5)$$
for all ŝ in the recurrent class Rµ,λ, and where the evolution of the underlying industry state is described by the kernel Pµ,λ. In other words, this is the long-run average single-period profit experienced whenever the moment-based state is given by (x, ŝ). This specification is similar to the one used in Fershtman and Pakes (2012).

Despite this definition, we envision that our approach will be most useful in settings for which the single-period profit function only depends on the fringe state through a few moments (or for which this provides an accurate approximation), as the next assumption makes explicit. We maintain this assumption throughout the paper unless otherwise noted.

Assumption 4.1. The single-period expected profit of firm i at time t, π(xit, θ̃t, dt, zt), depends on its individual state xit, a vector θ̃t ∈ Θ̃ ⊆ Θ of fringe moments, the state of dominant firms dt, and the value of the aggregate shock zt.

The single-period profit functions we will consider either satisfy the previous assumption exactly or can be well approximated by functions of fringe moments. Given this, in our approach, firms will always keep track of the moments that determine (or approximate) the profit function, θ̃t; that is, θ̃t will always be included in θt in equation (1). The state space spanned by θ̃t is much smaller than the original one if Θ̃ is low dimensional and significantly smaller than |Xf|. In fact, many single-period profit functions of interest depend on a small number of functions of the distribution of firms’ states.
For example, commonly used profit functions that arise from monopolistic competition models depend on a particular moment of that distribution (Dixit and Stiglitz, 1977; Besanko et al., 1990). Further, in Section 7.1 we describe two important profit functions for applied work that we use in our numerical experiments and that can be well approximated by a function of a single fringe moment.

We note that firms may keep track of additional contemporaneous moments beyond the moments θ̃t to improve their predictions of the future evolution of the industry. In our applications, however, we will typically start our analysis by assuming that firms only keep track of the fringe moments associated with the single-period profit function, hence θt = θ̃t.

4.5 Moment-Based Markov Equilibrium

A moment-based Markov equilibrium (MME) is an equilibrium in moment-based strategies. Having defined a Markov process approximating the process {(xit, ŝt) : t ≥ 0}, we define the perceived value function of a deviating firm i that uses moment-based strategy µ′ in response to strategies (µ, λ). Importantly, this value function assumes that firm i’s perception of the evolution of its own state and the moment-based industry state is described by the kernel P̂µ′,µ,λ defined above. Formally,
$$\hat{V}(x, \hat{s} \mid \mu', \mu, \lambda) = E_{\mu',\mu,\lambda}\left[ \sum_{k=t}^{\tau_i} \beta^{k-t} \big( \pi(x_{ik}, \hat{s}_k) - c(\iota_{ik}, x_{ik}) \big) + \beta^{\tau_i - t} \phi_{i,\tau_i} \,\Big|\, x_{it} = x,\ \hat{s}_t = \hat{s} \right],$$
where the expectation is taken with respect to the perceived transition kernel P̂µ′,µ,λ. We will use the shorthand notation V̂(x, ŝ|µ, λ) ≡ V̂(x, ŝ|µ, µ, λ).

A moment-based Markov equilibrium (MME) is defined with respect to a function of moments θ in (1) and an operator Φ in (2) that defines the perceived transition kernel. Hence, the function of moments and the operator are inputs to define MME. We could derive different MME for the same model primitives by changing these inputs.

Definition 4.1.
An MME of our model comprises an investment/exit strategy µ = (ι, ρ) ∈ M̂ and an entry cutoff function λ ∈ Λ̂ that satisfy the following conditions:

C1: Incumbent firms’ strategies optimization:
$$\sup_{\mu' \in \hat{M}} \hat{V}(x, \hat{s} \mid \mu', \mu, \lambda) = \hat{V}(x, \hat{s} \mid \mu, \lambda), \quad \forall (x, \hat{s}) \in X \times \hat{S}.$$
C2: At each state, the cut-off entry value is equal to the expected discounted value of entering the industry:
$$\lambda(\hat{s}) = \beta \, E_{\mu,\lambda}\left[ \hat{V}(x^e, \hat{s}_{t+1} \mid \mu, \lambda) \mid \hat{s}_t = \hat{s} \right], \quad \forall \hat{s} \in \hat{S}.$$
C3: The perceived transition kernel is given by:
$$\hat{P}_{\mu,\lambda} = \Phi P_{\mu,\lambda}.$$

Note that the first two conditions are similar to the conditions imposed in MPE, but incorporate the moment-based state space and strategies. The third condition imposes a relation between the perceived transition kernel and the actual transition kernel through the operator Φ. For example, the operator Φ could be defined as in Example 4.1.

Computationally, MME is appealing if agents keep track of a few moments of the fringe state and there are a small number of dominant firms, because agents optimize over low-dimensional strategies. We note that if the function θ is the identity, so that θ(f) = f, and Φ is such that the perceived transition kernel coincides with the actual transition kernel for all states, then MME is equivalent to MPE.

Using a similar argument to Doraszelski and Satterthwaite (2010) and imposing some additional technical conditions, we can show that an MME always exists. In the numerical experiments that we present later in the paper we are always able to computationally find an MME over a large set of primitives that satisfy the standard assumptions required for the existence of MPE in EP-style models. With respect to uniqueness, in general we presume that our model may have multiple equilibria.

4.6 Relation with Other Equilibrium Concepts

We finish this section by establishing the relationship between MME and related equilibrium concepts previously defined in the literature.
MME is related to the equilibrium concepts defined in Fershtman and Pakes (2012), who consider dynamic oligopoly models with asymmetric information. While their model and motivation are fundamentally different from ours, the two are related in that MME limits firms’ information sets, and therefore could be cast as an equilibrium concept in a game of asymmetric information even if the underlying economic model is not. More specifically, assuming that fringe firms privately observe their own states, that the publicly observed information consists of the fringe moments, the state of dominant firms, and the aggregate shock, and that all industry states are in the recurrent class, MME with the transition kernel of Example 4.1 coincides with the restricted experience based equilibrium (REBE) defined in Fershtman and Pakes (2012). If the set of states in the recurrent class is a strict subset of the state space, which is typically the case, then REBE is different from MME, because it does not require optimality of strategies outside the recurrent class. Instead, we require optimality at all states in MME, but the perceived transition kernel can be arbitrarily defined outside the recurrent class.

This equivalence holds when REBE strategies only keep track of the current moments, state of dominant firms, and aggregate shock. However, REBE strategies typically keep track of past observations as well. To extend the equivalence to these cases, we could extend the definition of MME so that firms keep track of past values of the moment (θt−1, θt−2, ...). Nevertheless, to provide a more direct connection with the commonly used concept of MPE, in which firms keep track only of the current industry state, and to simplify exposition, our definition of MME considers strategies that only depend on current moments of the fringe state.
In addition, in Section 10.1 we provide further connections between our work and Fershtman and Pakes (2012), in particular in relation to our error bounds.

Finally, if we assume (i) the set of dominant firms is fixed and pre-determined; (ii) firms only keep track of the dominant firms’ state and do not keep track of fringe moments; and (iii) the perceived single-period profit function is specified as in equation (5), then MME coincides with the simulated version of partially oblivious equilibrium (POE) of Benkard et al. (2014).11

5 MME as Approximation

There are at least two alternative ways of interpreting MME as an approximation. First, we can interpret MME as a low-dimensional approximation to MPE. Alternatively, MME can be thought of as an appealing heuristic and behavioral model on its own; this is related to the optimality of moment-based strategies. We discuss these two interpretations in the next subsections.

5.1 MME as Behavioral Model

Suppose firms play moment-based strategies with moments θ(·) and perceived transition kernel P̂µ,λ. As previously noted, generally {ŝt : t ≥ 0} is not Markov, so ŝt may not be a sufficient statistic to predict the future evolution of the industry as described by the primitive transition kernel Pµ,λ. Hence, θ(·) does not necessarily summarize all payoff-relevant history in the sense of Maskin and Tirole (2001), and observing the underlying full fringe state may provide valuable information for decision making. Generally, MME strategies may not be close to a best response, and therefore, may not be close to a subgame perfect equilibrium.

However, MME strategies would provide an appealing behavioral model if they are close to optimal, that is, if, assuming that competitors play MME strategies themselves, unilaterally deviating to a strategy that keeps track of the full underlying fringe state would not improve expected discounted profits by much.
In this case, it may be reasonable to assume firms use MME strategies, as unilaterally deviating to more complex strategies does not significantly increase payoffs, and doing so may involve costs associated with gathering and processing more information.

To formalize this notion, for MME strategies (µ, λ), we define the value of the full information deviation:
$$\Delta_{\mu,\lambda}(x, s) = \sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda) - V(x, s \mid \mu, \lambda). \qquad (6)$$
Note that V(x, s|µ, λ) provides the actual expected discounted profits a firm would get if all firms play MME strategies. Similarly, $\sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda)$ provides the expected discounted profits if the firm were to unilaterally deviate to the best response Markov strategy that keeps track of the full industry state. We use the value of the full information deviation to measure the extent of sub-optimality of MME strategies. If this value is small, the MME strategy achieves essentially the same profits as the best possible Markov strategy. In this sense, the value of the full information deviation is similar to the notion of ε-equilibrium.

11 Instead, define $\hat{\pi}_{\mu,\lambda}(x_{it}, d_t, z_t) = \pi(x_{it}, \tilde{f}_{\mu,\lambda}(d_t, z_t), d_t, z_t)$, where $\tilde{f}_{\mu,\lambda}(d, z) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} f_t \, \mathbf{1}\{d_t = d,\ z_t = z\}}{\sum_{t=1}^{T} \mathbf{1}\{d_t = d,\ z_t = z\}}$. In the latter case, we recover the regular notion of POE.

Note that we expect this value to be small when moments in MME are close to being sufficient statistics of the future evolution of the industry; that is, when P̂µ,λ[θ′|ŝ] ≈ Pµ,λ[θ′|s], and the industry state s does not provide additional information to predict the future industry evolution beyond the moment-based industry state ŝ. It is illustrative to study whether there are models for which equilibrium strategies yield moments that are indeed sufficient statistics, and therefore for which MME strategies are optimal. In Appendix B we introduce one such class of models.
Assuming that the single-period profit function for fringe firms exhibits constant returns to scale, that the dynamics of a fringe firm’s evolution are linear in its own state, and that there are no transitions between the dominant and fringe tiers, we show in the appendix that MME strategies yield moments that form a Markov process and hence summarize all payoff-relevant information as the number of fringe firms grows. As a consequence, it is also possible to show that for this model the value of the full information deviation becomes small when the number of fringe firms is large.

The model just described imposes strong assumptions on the model primitives for fringe firms and may be too restrictive for many applications. A natural question is whether we can obtain similar aggregation results for a broader class of models. Unfortunately, as we illustrate below, the answer is usually no. Specifically, our next result shows that when firms keep track of one moment, moments are sufficient statistics only if fringe firms’ equilibrium transitions are, in an appropriate sense, linear in their own state. To our knowledge, such equilibrium transitions arise only in the constant returns to scale model presented in Appendix B or in close variations. There, equilibrium transitions are linear, because the primitive transitions are linear and, as we show in the appendix, the equilibrium investment strategy does not depend on the fringe firms’ individual state.12

We have the following result. All proofs are provided in the appendix. Suppose that the single-period profit function depends on the α-th un-normalized moment of the fringe state (as in Example 7.1).

Proposition 5.1. Assume Xf is a closed interval in ℝ, N − D̄ ≥ 3, there is no entry and exit in the industry, the set of dominant firms is fixed, and the set of moments contains the α-th un-normalized moment only, that is, $\theta(f) = \sum_{x \in X_f} x^{\alpha} f(x)$.
Suppose that under MME strategy µ, the moment is a sufficient statistic for the evolution of the industry, that is: P̂µ,λ[θ′|ŝ] = Pµ,λ[θ′|s], for all θ′ ∈ Sθ, s = (f, d, z) ∈ S, and ŝ = (θ, d, z) ∈ Ŝ, with θ(f) = θ. Then, fringe firms’ transitions in MME must be linear in their own state, in the sense that
$$E_{\mu}\left[ x_{i,t+1}^{\alpha} \mid x_{it} = x,\ \hat{s}_t \right] = x^{\alpha}\, \tilde{\zeta}_1(\hat{s}_t \mid \mu) + \tilde{\zeta}_0(\hat{s}_t),$$
for given functions ζ̃1(·|µ) and ζ̃0(·).

12 These results are closely related to the literature in macroeconomics that deals with the aggregation of macroeconomic quantities from the decisions of a single ‘representative agent’ (Blundell and Stoker, 2005). With heterogeneous agents, similarly to our model, this type of result is only obtained under strong assumptions about consumers’ preferences that are also akin to linearity (see, for example, Altug and Labadie (1994)).

This result suggests that it is unlikely that one can prove a general result showing that MME becomes an exact approximation for the broad classes of models that are typically used in applied work. Instead, a more practical approach, which we take in our numerical experiments, is to use MME in these models assuming that firms approximate the evolution of the moment-based state with a perceived Markov process, and then try to measure the extent of the approximation error. Of course, measuring the value of the full information deviation is intractable in large-scale applications with many firms; in these instances, computing a best response is almost as computationally challenging as solving for an MPE. Thus motivated, in Section 9 we introduce a computationally tractable upper bound to the value of the full information deviation.

5.2 MME as Approximation to MPE

MME can also be interpreted as a low-dimensional approximation to MPE.
Despite the lack of theoretical results showing that moments become sufficient statistics, as argued in the previous section, in Section 7.3 we will provide computational evidence showing that MME approximates MPE for important classes of models. For now, we show that if the value of the full information deviation becomes small (e.g., by including more moments in MME), then MME strategies approach MPE strategies. This result provides a connection between the interpretation of MME as an appealing behavioral model in the sense of an ε-equilibrium, and that of MME as an approximation to MPE. We formalize this in the following theorem. Let Γ ⊆ M × Λ be the set of MPE. For all (µ, λ) ∈ M × Λ, let us define $D(\Gamma, (\mu, \lambda)) = \inf_{(\mu', \lambda') \in \Gamma} \| (\mu', \lambda') - (\mu, \lambda) \|_{\infty}$.13

Theorem 5.1. Consider a sequence of MME strategies {(µn, λn) | n ∈ ℕ}, such that (i) the value of the full information deviation converges to zero for all industry states, that is, $\lim_{n \to \infty} \Delta_{\mu_n,\lambda_n}(x, s) = 0$, ∀(x, s) ∈ X × S; and (ii) the cut-off entry condition is satisfied asymptotically, that is, $\lim_{n \to \infty} \left| \lambda_n(s) - \beta \, E_{\mu_n,\lambda_n}\left[ V(x^e, s_{t+1} \mid \mu_n, \lambda_n) \mid s_t = s \right] \right| = 0$, ∀s ∈ S.14 Then, $\lim_{n \to \infty} D(\Gamma, (\mu_n, \lambda_n)) = 0$.

The proof closely follows the proof of a similar result in Farias et al. (2012); we include it in the appendix for completeness. We note that the assumption that there is no overlap of individual states between dominant and fringe firms is important to obtain the result; if not, MME would not admit symmetric Markov strategies. In addition, the assumption that the value of the full information deviation converges to zero for all industry states is key to proving the result. However, in practice, when using a transition kernel like the one in Example 4.1 we may only expect this to be the case in states in the recurrent class, because transitions outside this class are arbitrary.
This issue would be avoided if the entire state space is recurrent under strategies that could potentially be best responses.^15 Alternatively, if we only assume that the value of the full information deviation converges to zero in states in the recurrent class, we can show that we approximate a weaker equilibrium concept that does not impose optimality requirements for states that are not reachable in equilibrium, similar to those proposed in Fershtman and Pakes (2012).

^13 ‖·‖∞ denotes the supremum norm.

^14 When the value of the full information deviation becomes small, it is also reasonable to expect that the error in the cut-off entry rule also becomes small.

^15 For example, if the model has depreciation and appreciation shocks in all states.

6 Computation of MME

In this section we introduce an algorithm that computes MME and discuss in detail the perceived transition kernel with tier transitions.

6.1 Algorithm

To compute MME we employ a best response-type algorithm. The solver starts with a strategy profile, checks for the equilibrium conditions, and updates the strategies by best responding to the current strategy until an equilibrium is found. In each iteration we construct the perceived transition kernel from firms' strategies according to the operator Φ. Depending on the operator used this could require simulation. We describe the steps in Algorithm 1 for the observed transition kernel introduced in Example 4.1. The following remarks are important:

1. The algorithm terminates when the norm of the distance between a strategy and its best response is small. We consider the following norm: $\|\mu - \mu'\|_h = \max_{x \in \mathcal{X}} \sum_{\hat{s} \in \hat{S}} |\mu(x, \hat{s}) - \mu'(x, \hat{s})|\, h(\hat{s})$, where h is a probability vector. We take h to be the frequency with which each industry state is visited, hence states are weighted according to their relevance. This is useful because simulation errors in the estimated transition kernel are higher for states that are visited infrequently.
We also try a variation of h in which we initially add positive weight to all states.

2. Even if in theory all states are recurrent, in a finite length simulation it is possible that some states will not be visited, and for these states the perceived transition kernel cannot be computed. For these states we set the transitions in P̂ to be some predetermined value. We try different specifications for these states and discuss their impact on the equilibrium in Section 7.3.

Algorithm 1 Equilibrium solver
1: Initialize with (µ, λ) and industry state $s_0 = (f_0, d_0, z_0)$ with corresponding $\hat{s}_0$;
2: $n := 1$, $\delta := \epsilon + 1$;
3: while $\delta > \epsilon$ do
4:   Simulate a T period sample path $\{(f_t, d_t, z_t)\}_{t=1}^{T}$ with corresponding $\{\hat{s}_t\}_{t=1}^{T}$ for large T;
5:   Calculate the observed frequencies of industry states $h(\hat{s}) := \frac{1}{T} \sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s}\}$, for all $\hat{s} \in \hat{S}$;
6:   Use the same sample path to compute the observed transition kernel $\hat{P}_{\mu,\lambda}[\cdot \mid \hat{s}]$, for all $\hat{s}$;
7:   Solve $\mu' := \operatorname{argmax}_{\mu' \in \hat{M}} \hat{V}(x, \hat{s} \mid \mu', \mu, \lambda)$, for all $(x, \hat{s}) \in \mathcal{X} \times \hat{S}$;
8:   Let $\lambda'(\hat{s}) := E_{\mu,\lambda}[\hat{V}(x^e, \hat{s}_{t+1} \mid \mu, \lambda) \mid \hat{s}_t = \hat{s}]$, for all $\hat{s} \in \hat{S}$;
9:   $\delta := \max(\|\mu - \mu'\|_h, \|\lambda - \lambda'\|_h)$;
10:  $\mu := \mu + (\mu' - \mu)/(1 + n^{\sigma})$;
11:  $\lambda := \lambda + (\lambda' - \lambda)/(1 + n^{\sigma})$;
12:  $n := n + 1$;
13: end while;

If the algorithm terminates with ε = 0 and h(·) assigns strictly positive weight to all industry states, we have found an MME. A positive value of ε allows for numerical error. We also note that in the algorithm, 0 < σ < 1 is chosen to speed up convergence by smoothing out the update of strategies from one iteration to the next.^16 For different perceived transition kernels, lines 4-6 in Algorithm 1 will be different. For example, they would require a slight modification to incorporate transitions between the fringe and dominant tiers as we do in our numerical experiments. This extension is explained in detail in Section 6.2.
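The bookkeeping steps of Algorithm 1 (lines 5, 6, 9, 10 and 11) can be illustrated with a minimal Python sketch. It is not the authors' implementation (which the paper reports was written in Java). It assumes, purely for illustration, that each moment-based state is a tuple whose first entry is the moment value and that strategies are stored as dictionaries; the function names are hypothetical. The simulation (line 4) and the best-response step (lines 7 and 8) are not shown.

```python
from collections import Counter, defaultdict


def observed_frequencies(path):
    """Line 5: h(s_hat) = (1/T) * sum_t 1{s_hat_t = s_hat} along the sample path."""
    T = len(path)
    return {s: c / T for s, c in Counter(path).items()}


def observed_moment_kernel(path):
    """Line 6: empirical P_hat[theta' | s_hat] from consecutive pairs of the path.

    Assumption of this sketch: each state is a tuple whose first entry is the moment.
    """
    counts = defaultdict(Counter)
    for s, s_next in zip(path, path[1:]):
        counts[s][s_next[0]] += 1
    kernel = {}
    for s, row in counts.items():
        total = sum(row.values())
        kernel[s] = {theta: c / total for theta, c in row.items()}
    return kernel


def weighted_norm(mu, mu_new, h):
    """Line 9: ||mu - mu'||_h = max_x sum_s |mu(x, s) - mu'(x, s)| * h(s).

    Strategies are dicts keyed by (x, s_hat); unvisited states carry zero weight.
    """
    xs = {x for (x, _) in mu}
    return max(
        sum(abs(mu[(x, s)] - mu_new[(x, s)]) * w for s, w in h.items())
        for x in xs
    )


def damped_update(old, new, n, sigma=2 / 3):
    """Lines 10-11: old + (new - old) / (1 + n**sigma), applied entry by entry."""
    step = 1.0 / (1.0 + n ** sigma)
    return {k: old[k] + step * (new[k] - old[k]) for k in old}
```

States that never appear on the path get no row in the estimated kernel, which is exactly the situation described in remark 2.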
We found that for the observed transition kernel a real-time stochastic algorithm similar to Pakes and McGuire (2001) and Fershtman and Pakes (2012) is much faster than Algorithm 1. In this variant the kernel is not simulated in every iteration, but instead continuation value functions that allow to solve for firms’ optimal strategies are kept in memory and updated through simulation draws. In this sense, this modified real-time algorithm performs the simulation and optimization steps simultaneously. We use one step of Algorithm 1 to check whether C1 and C2 of MME hold and convergence has been achieved. The details of this variant of the algorithm are presented in Appendix D. 6.2 Perceived Transition Kernel In this section we extend the observed perceived transition kernel of Example 4.1 to allow for tier transitions. We use this kernel in the context of the previously introduced algorithm to compute MME in our numerical experiments. In our experiments we assume that firms are dominant if and only if they are above a pre-determined state x. Hence, the transitions in and out of dominance are naturally embedded in firms’ transitions. To simplify the dynamics we impose the constraint that at most one firm becomes dominant in a time period; this constraint was typically never binding in our experiments. Further, to construct the perceived transition kernel we assume that the events that a fringe competitor becomes dominant and vice versa are both statistically independent of the moment transitions. Our numerical results suggest that we are able to approximate MPE even under this simplifying assumption. 
Moreover, this approximation is computationally useful, because without it we would need to store the joint transition kernel of dominant firms and moment transitions, requiring significantly more memory; this could be relevant in large-scale applications like the one presented in Section 8.^17 Under these assumptions, we can extend the observed transition kernel of equations (3) and (4) by:

$$\hat{P}_{\mu',\mu,\lambda}[x', \hat{s}' \mid x, \hat{s}] = P_{\mu',\mu,\lambda}[x', d', z' \mid x, \hat{s}]\; \hat{P}_{\mu,\lambda}[\theta' \mid \hat{s}]\; \hat{P}_{\mu,\lambda}[(f \to d) \mid \hat{s}], \tag{7}$$

where the first term of the right-hand side incorporates the transition probability of firm i at state x (including its chances of becoming dominant in the next time period), the transitions of current dominant firms, and the evolution of the shock. The second term corresponds to the observed transition kernel for moments specified in (3). The event (f → d) corresponds to a fringe competitor of firm i transitioning and becoming a dominant firm in the next period.^18 For our specification, this event corresponds to a fringe competitor growing to or above individual state x̄. For a given moment-based state, we assume firms' perceived probability of this transition event occurring is equal to its long-run average observed frequency, that is:

$$\hat{P}_{\mu,\lambda}[(f \to d) \mid \hat{s}] = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s},\, (f \to d)_t\}}{\sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s}\}}.$$

^16 In practice, we use σ = 2/3.

^17 Under our simplification, to build the transition kernel we need to store the following elements in memory: the perceived transition kernel for moments, the tier transitions' probabilities, and the strategy of dominant firms. This roughly requires memory space proportional to |Ŝ||Sθ| + 2|Ŝ| (assuming no aggregate shock). Instead, the joint transition kernel of dominant firms and moment transitions would require memory space roughly proportional to |Ŝ|². Recall that |Ŝ| = |Sθ||Sd|. For example, if |Sθ| = |Sd| = 100, then the first expression is equal to 1,020,000 and the second to 100,000,000.
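A finite-sample version of the tier-transition frequency and the storage comparison for the factored versus joint kernel can be checked with a short Python sketch. The function names and the input layout (one state label and one boolean flag per simulated period) are assumptions made for this illustration only.

```python
from collections import Counter


def tier_transition_prob(states, became_dominant):
    """Finite-T analogue of P_hat[(f -> d) | s_hat].

    states[t] is the moment-based state in period t; became_dominant[t] is True
    when a fringe competitor becomes dominant in that period (hypothetical layout).
    """
    visits = Counter()
    hits = Counter()
    for s, flag in zip(states, became_dominant):
        visits[s] += 1
        if flag:
            hits[s] += 1
    return {s: hits[s] / visits[s] for s in visits}


def kernel_storage(n_theta, n_d):
    """Storage counts with no aggregate shock, where |S_hat| = |S_theta| * |S_d|."""
    n_hat = n_theta * n_d
    factored = n_hat * n_theta + 2 * n_hat  # |S_hat||S_theta| + 2|S_hat|
    joint = n_hat ** 2                      # |S_hat|^2
    return factored, joint


factored, joint = kernel_storage(100, 100)
print(factored, joint)  # 1020000 100000000
```

With |Sθ| = |Sd| = 100 the factored representation is roughly one hundred times smaller than the joint kernel.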
Note that in the kernel specified in equation (7) we also assume that a fringe firm's own transitions are independent of the moments, which seems reasonable in applications with many fringe firms. However, in the small-scale experiments we run below to compare MME with MPE, we also tried a specification in which a fringe firm that becomes dominant removes itself from the moment. With a small number of fringe firms this had an important effect in improving the approximation to MPE, specifically in the capacity competition model described below.

^18 Since we assumed that at most one firm becomes dominant in a time period, in principle, one should explicitly consider and rule out the event that firm i and a competitor become dominant in the same time period. Even though we explicitly consider this event in our numerical experiments, it turned out not to be important in practice, because the constraint of having at most one fringe firm becoming dominant in a time period was typically never binding. Hence, for simplification, we ignored this effect in the kernel above.

7 Numerical Experiments: Comparisons to MPE

As mentioned in Section 5, MME can be interpreted as a low-dimensional approximation to MPE, and in this section we numerically explore this interpretation. In Section 7.1 we introduce the two profit functions we use in our numerical experiments and in Section 7.2 we provide other model specifications. Then, in Section 7.3 we provide our numerical results showing that MME approximates MPE well. We also discuss important implementation issues to consider when using MME in applied work.

7.1 Single-Period Profit Functions

In this section we provide two important profit functions that we use in our numerical experiments and that can be well approximated by a function of a single fringe moment. Our first model is similar to standard industrial organization models commonly used in empirical work.

Example 7.1 (Quality-Ladder). Similarly to Pakes and McGuire (1994), we consider an industry with differentiated products, where each firm's state variable is a number that represents the quality of its product. Hence, a firm's state is given by a natural number x_it. For clarity, we do not consider aggregate shocks in this example. There are m consumers in the market. In period t, consumer j receives utility u_ijt from consuming the good produced by firm i given by:

$$u_{ijt} = \alpha_1 \ln(x_{it}) + \alpha_2 \ln(Y - p_{it}) + \nu_{ijt}, \quad i \in S_t,\ j = 1, \ldots, m,$$

where Y is the consumer's income, and p_it is the price of the good produced by firm i. The random variables ν_ijt are i.i.d. Gumbel and represent unobserved characteristics for each consumer-good pair. There is also an outside good that provides consumers zero expected utility. We assume consumers buy at most one product each period and that they choose the product that maximizes utility. Under these assumptions our demand system is a standard logit model. Considering a constant marginal cost of production d, there exists a unique Nash equilibrium in pure strategies that can be computed by solving the first-order conditions of the pricing game (see Caplin and Nalebuff (1991)). Let p*(x) denote the price charged by a firm in state x in such an equilibrium. Let K(x, p) = exp(α1 ln(x) + α2 ln(Y − p)). Then, expected profits are given by:

$$\pi(x_{it}, s_t) = m\,(p^*(x_{it}) - d)\, \frac{K(x_{it}, p^*(x_{it}))}{1 + \sum_{x \in \mathcal{X}} s_t(x)\, K(x, p^*(x))}, \quad \forall i \in S_t. \tag{8}$$

Now, if, similar to the logit model of monopolistic competition of Besanko et al. (1990), we assume fringe firms set prices assuming they have no market power, so they all set the price p* = (Y + dα2)/(1 + α2) (but dominant firms continue to compete Nash in prices), expected profits are given by:

$$\pi(x_{it}, s_t) = m\,(p^*(x_{it}) - d)\, \frac{K(x_{it}, p^*(x_{it}))}{1 + (Y - p^*)^{\alpha_2} \sum_{y \in \mathcal{X}_f} y^{\alpha_1} f_t(y) + \sum_{j \in D_t} K(x_{jt}, p^*(x_{jt}))}. \tag{9}$$

Therefore, under this assumption, π(x_it, s_t) can be written as π(x_it, θ̃_t, d_t), that is, as a function of the single moment $\tilde{\theta}_t = \sum_{y \in \mathcal{X}_f} y^{\alpha_1} f_t(y)$, the α1-th un-normalized moment of f_t. In Section 7.3 we numerically show that (9) approximates (8) well in the type of applications we have in mind, in which fringe firms are indeed smaller in size compared to dominant firms.

Our second model is structurally different and we chose it because it has been shown in previous work to yield rougher MPE value functions relative to our first model (see Farias et al. (2012)). In this sense, approximating MPE for this model imposes a more demanding test for MME.

Example 7.2 (Capacity Competition). This model is based on the quantity competition model of Besanko and Doraszelski (2004). We consider an industry with homogeneous products, where each firm's state variable determines its production capacity q(x_it). We assume that capacity grows linearly in state, that is, q(x) = q^i + q^s x, for some constants q^i, q^s > 0. Investment increases this capacity. We again ignore aggregate shocks. At each period, firms compete in a capacity-constrained quantity setting game. There is a linear demand function Q(p) = m(e − f p) and an inverse demand function P(Q) = e/f − Q/(mf), where P represents the common price charged by all firms in the industry, Q represents the total industry output, and e, m and f are positive constants. To simplify the analysis, we assume the marginal costs of all firms are equal to zero.
Given the total quantity produced by its competitors Q_{−i,t}, the profit maximization problem for firm i at time t is given by $\max_{0 \le q_{it} \le q(x_{it})} P(q_{it} + Q_{-i,t})\, q_{it}$, where q(x_it) is the production capacity at individual state x_it. It is possible to show that a simple iterative algorithm yields the unique Nash equilibrium of this game, q*(x_it).^19 Profits for firm i are then given by:

$$\pi(x_{it}, s_t) = P\Big(\sum_{x \in \mathcal{X}} s_t(x)\, q^*(x)\Big)\, q^*(x_{it}). \tag{10}$$

Suppose that all fringe firms are relatively small and they produce at full capacity in the Nash equilibrium. Then,

$$\pi(x_{it}, s_t) = P\Big(\sum_{y \in \mathcal{X}_f} f_t(y)\, \bar{q}(y) + \sum_{j \in D_t} q^*(x_{jt})\Big)\, q^*(x_{it}). \tag{11}$$

Hence, under this assumption π(x_it, s_t) can again be written as a function of a single moment, π(x_it, θ̃_t, d_t), where $\tilde{\theta}_t = \sum_{x \in \mathcal{X}_f} f_t(x)\, \bar{q}(x)$ is the total current installed capacity of fringe firms. In Section 7.3 we numerically show that (11) approximates (10) well in the type of applications we have in mind, in which fringe firms are indeed smaller than dominant firms.

In our numerical experiments firms keep track of the moments described above that approximate the profit function. In the logit model, the fringe moment summarizes the impact of the fringe on market shares; this is a natural statistic to keep track of. In the capacity competition model, firms monitor the aggregate capacity among fringe firms. This again is natural, because (i) it may be a statistic that is easy to gather, for example, from trade associations; and (ii) fringe firms may indeed produce at full capacity. Ryan (2012) uses a profit function similar to our capacity competition model and argues that these two characteristics are present in his study of the cement industry.

7.2 Other Model Specifications

In our numerical experiments in this section we consider the single-period profit functions from Examples 7.1 and 7.2. The rest of the model specifications are common across these two profit functions.
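For the capacity-constrained quantity game of Example 7.2, one way to compute equilibrium quantities is to iterate on firms' capped best responses. The paper does not spell out its "simple iterative algorithm", so the sequential best-response loop below is an assumed stand-in, written as a minimal Python sketch with hypothetical function names. With zero marginal cost and inverse demand P(Q) = e/f − Q/(mf), the unconstrained best response to rivals' output Q_{−i} is (me − Q_{−i})/2, which is then clipped to [0, capacity].

```python
def inverse_demand(Q, m, e, f):
    """P(Q) = e/f - Q/(m*f)."""
    return e / f - Q / (m * f)


def capacity_equilibrium(capacities, m, e, tol=1e-12, max_iter=10_000):
    """Sequential best-response iteration (an assumption of this sketch).

    Each firm in turn plays max(0, min(capacity, (m*e - rivals' output) / 2)),
    using the most recent quantities of its rivals, until no firm moves.
    """
    q = [0.0] * len(capacities)
    for _ in range(max_iter):
        change = 0.0
        for i, cap in enumerate(capacities):
            rivals = sum(q) - q[i]
            best = max(0.0, min(cap, 0.5 * (m * e - rivals)))
            change = max(change, abs(best - q[i]))
            q[i] = best
        if change < tol:
            break
    return q


def profits(q, m, e, f):
    """Single-period profits P(total output) * q_i for every firm."""
    price = inverse_demand(sum(q), m, e, f)
    return [price * qi for qi in q]


# Default demand parameters from Table 2 (m = 2, e = 50, f = 10); the two
# capacities are illustrative values, not taken from the paper.
q_star = capacity_equilibrium([5.0, 1000.0], m=2, e=50)
print(q_star)                       # [5.0, 47.5]
print(profits(q_star, 2, 50, 10))   # [11.875, 112.8125]
```

In this example the small firm is capacity constrained and produces at full capacity, which is the situation assumed for fringe firms in equation (11).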
Firms can invest ι ≥ 0 in order to improve their individual state over time (either product quality or capacity, depending on the model) at a constant marginal investment cost of c. A firm's investment is successful with probability aι/(1 + aι), in which case the individual state increases by one level. The firm's product may also depreciate by one state with probability δ, independently each period. Our model differs from Pakes and McGuire (1994) here because the depreciation shocks in our model are idiosyncratic. Combining the investment and depreciation processes, it follows that the transition probabilities for a firm in state x that does not exit and invests ι are given by:

$$P\big[x_{i,t+1} \mid x_{it} = x, \iota\big] = \begin{cases} \dfrac{(1-\delta)\,a\iota}{1 + a\iota} & \text{if } x_{i,t+1} = x + 1, \\[2mm] \dfrac{(1-\delta) + \delta a\iota}{1 + a\iota} & \text{if } x_{i,t+1} = x, \\[2mm] \dfrac{\delta}{1 + a\iota} & \text{if } x_{i,t+1} = x - 1. \end{cases}$$

While it would be straightforward to implement entry and exit as we do in Section 8, in order to simplify the model we omit them from our computations in this section.

In practice, parameters would either be estimated using data from a particular industry or chosen to reflect an industry under study. We use a set of parameter values to reflect reasonable economic fundamentals, summarized in Tables 1, 2, and 3. We keep these parameters fixed for all experiments, unless otherwise noted.

Parameter | m | d | α1 | α2 | Y | c
Value | 60 | 0.5 | 1 | 0.5 | 1 | 0.65
Table 1: Default parameters for quality ladder model experiments.

Parameter | m | q^i | q^s | e | f | c
Value | 2 | 0 | 2 | 50 | 10 | 4
Table 2: Default parameters for capacity model experiments.

Parameter | β | δ | a | X
Value | 0.925 | 0.7 | 1 | {1,...,9}
Table 3: Default parameters for both models.

^19 The equilibrium quantities are characterized by the following set of equations: $q^*(x_{it}) = \max\big\{0, \min\big\{q(x_{it}),\ \tfrac{1}{2}\big(me + q^*(x_{it}) - \sum_{x \in \mathcal{X}} s_t(x)\, q^*(x)\big)\big\}\big\}$, for all i ∈ S_t.

7.3 Comparison to MPE

In this section we numerically compare MME with MPE for the two models described above.
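The investment and depreciation transition probabilities of Section 7.2 can be checked numerically with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, the defaults a = 1 and δ = 0.7 are the values in Table 3, and the treatment of the lowest and highest individual states is not described in this section, so it is left out.

```python
def transition_probs(iota, a=1.0, delta=0.7):
    """Probabilities of moving up one state, staying, or moving down one state.

    Investment succeeds with probability a*iota / (1 + a*iota); depreciation
    occurs independently with probability delta.
    """
    denom = 1.0 + a * iota
    up = (1.0 - delta) * a * iota / denom               # success, no depreciation
    stay = ((1.0 - delta) + delta * a * iota) / denom   # both events or neither
    down = delta / denom                                # failure and depreciation
    return {+1: up, 0: stay, -1: down}


p = transition_probs(iota=1.0)
print(p)                # approximately {1: 0.15, 0: 0.5, -1: 0.35}
print(sum(p.values()))  # the three cases add up to 1 (up to rounding)
```

The three numerators sum to 1 + aι, so the cases always add up to one for any ι ≥ 0.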
Because exact computation of MPE is only feasible for small state spaces, we consider |X| = 9 individual states and N = 5, 6, 7 firms. We also consider several parameter regimes. First, for the quality ladder model, the default parameters in the tables above represent a regime in which there is a small market size and a small investment cost. We also consider a set of parameters in which these two quantities are relatively larger.^20 We also consider similar regimes for the capacity competition model with a similar interpretation for the default parameters.^21

Both for MME and MPE there are potentially multiple equilibria. We have not made an attempt to exhaustively search for all of them; however, our algorithm always found the same equilibrium for a given instance, even when starting from different initial conditions (see Besanko et al. (2010) for an explanation of this phenomenon). All of the results below apply to these particular equilibria.

Instead of comparing equilibrium strategies directly, we compare economic indicators induced by these strategies. These indicators are long-run averages of various functions of the industry state under the strategy in question. The indicators we examine are those that are typically of interest in applied work; we consider average total investment, average producer surplus, average consumer surplus, and average social welfare.

^20 For the latter, we consider m = 90 and c = 1. The default market size parameters are used for N = 6. For N = 5 and N = 7 we subtract and add 5 units to these numbers, respectively.

^21 For the regime with a larger investment cost, we consider e = 65 and c = 6.5. Again, the default market size parameter is used for N = 6. For N = 5 and N = 7 we use m = 1.66 and m = 2.33, respectively.

For the comparison results presented in Table 4, we consider MME with D = 3 dominant firm slots.
We assume that firms become dominant if they are large enough, specifically if they grow above (and including) a pre-determined individual state x. For all the instances, we fixed x = 6 (recall there are 9 individual states). This threshold state seems like a reasonable compromise between defining dominant firms once they indeed grow large enough, but at the same time allowing enough richness in dominant firms’ dynamics. We assume firms always keep track of a single moment, namely, the one suggested in Section 7.1, for each profit function. Computing MME typically takes a matter of minutes, while computing MPE could take many hours, in particular for the case N = 7.22 We discuss below how the results change when considering different specifications for dominant firms and moments. In Table 4, we observe that MME provides accurate approximations to MPE for all economic indicators of interest in all instances. As expected, MME does a better job approximating MPE for the quality ladder model, for which most indicators are approximated within 3-5 %. For the capacity competition model, the approximation errors are similar, except for consumer surplus, for which the approximation error is around 11-12% for two instances (but significantly smaller for the rest). Note that there are several sources of approximation error for MME relative to MPE. First, differently to MPE, even though fringe firms in MME consider the possibility of becoming dominant when investing, they do not have immediate incentives to invest to deter the growth of other firms, because their state is not monitored by competitors unless the fringe firm becomes dominant. As a consequence, in some states, fringe firms in MME underinvest relative to MPE and this is an important driver in the differences in consumer surplus between MME and MPE, which in all instances is larger in the latter. 
However, we expect that MME may provide an even better approximation to MPE as the number of firms grows, because, in this case, the deterrence incentives of small firms in MPE would also be reduced. Unfortunately, we cannot test this hypothesis because we cannot compute MPE for larger industries.^23

A second source of approximation error for MME relative to MPE could be that in MME we use approximations to the single-period profit function that only depend on a single moment. However, when numerically comparing the long-run average difference between the 'approximate' single-period profit function with the exact profit function, we observe that this does not seem to play an important role. For the capacity competition model, there is no difference; fringe firms are small relative to the size of the market and they always produce at full capacity in all instances. For the quality ladder model, prices in both profit functions are slightly different but the average differences in profits are always less than 1%.

A third source of approximation error for MME relative to MPE is that moments may not be sufficient statistics to predict the evolution of the industry. As previously mentioned, in our numerical experiments we always keep track of a single moment, namely, the one suggested in Section 7.1, for each profit function.^24

^22 For example, for N = 7 in the capacity competition model, the number of industry states for MME and MPE are given by 1120 and 11440, respectively. All runs were performed on a shared cluster of 27 dual Xeon processors. The code was programmed in Java.

^23 In the range of industries we compute, fringe firms are still reasonably sized relative to dominant firms even for N = 7. For this reason and because there are several moving parts in our numerical experiments (e.g., typically the average number of dominant firms increases as N increases) the approximation errors in Table 4 do not necessarily always decrease as N increases.
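^24 If the set of feasible values the moment can take is large due to the presence of a potentially large number of fringe firms, we could discretize the state space of moments using a manageable grid and interpolate the value function in between points of the grid. We experimented with this type of setting using linear interpolation. Further, in the experiments in the beer industry in Section

Model | Investment cost | N | Row | Total Inv. | Prod. Surp. | Cons. Surp. | Soc. Welf.
Quality ladder | High | 5 | MPE | 7.126 | 26.936 | 173.63 | 200.566
Quality ladder | High | 5 | MME | 7.688 | 25.601 | 170.755 | 196.356
Quality ladder | High | 5 | % Diff. | 7.89 | 4.96 | 1.66 | 2.1
Quality ladder | High | 6 | MPE | 7.841 | 28.725 | 194.878 | 223.603
Quality ladder | High | 6 | MME | 8.082 | 27.468 | 190.102 | 217.57
Quality ladder | High | 6 | % Diff. | 3.07 | 4.37 | 2.45 | 2.7
Quality ladder | High | 7 | MPE | 8.519 | 30.47 | 215.628 | 246.098
Quality ladder | High | 7 | MME | 8.373 | 29.257 | 208.545 | 237.801
Quality ladder | High | 7 | % Diff. | 1.73 | 3.98 | 3.28 | 3.37
Quality ladder | Low | 5 | MPE | 7.1 | 17.42 | 112.21 | 129.64
Quality ladder | Low | 5 | MME | 7.66 | 16.56 | 110.36 | 126.91
Quality ladder | Low | 5 | % Diff. | 7.82 | 4.98 | 1.65 | 2.1
Quality ladder | Low | 6 | MPE | 8.04 | 19.18 | 130.99 | 150.17
Quality ladder | Low | 6 | MME | 8.31 | 18.37 | 127.88 | 146.25
Quality ladder | Low | 6 | % Diff. | 3.37 | 4.23 | 2.37 | 2.61
Quality ladder | Low | 7 | MPE | 8.98 | 20.91 | 149.95 | 170.86
Quality ladder | Low | 7 | MME | 8.85 | 20.15 | 145.36 | 165.51
Quality ladder | Low | 7 | % Diff. | 1.44 | 3.65 | 3.06 | 3.13
Capacity competition | High | 5 | MPE | 5.799 | 144.689 | 33.426 | 178.116
Capacity competition | High | 5 | MME | 5.804 | 141.403 | 31.708 | 173.111
Capacity competition | High | 5 | % Diff. | 0.09 | 2.27 | 5.14 | 2.81
Capacity competition | High | 6 | MPE | 7.209 | 176.806 | 41.716 | 218.522
Capacity competition | High | 6 | MME | 7.009 | 171.930 | 38.455 | 210.385
Capacity competition | High | 6 | % Diff. | 2.77 | 2.75 | 7.82 | 3.72
Capacity competition | High | 7 | MPE | 8.61 | 209.003 | 50.24 | 259.242
Capacity competition | High | 7 | MME | 8.179 | 201.605 | 44.41 | 246.015
Capacity competition | High | 7 | % Diff. | 5.01 | 3.54 | 11.6 | 5.1
Capacity competition | Low | 5 | MPE | 6.564 | 98.666 | 38.715 | 137.381
Capacity competition | Low | 5 | MME | 6.867 | 97.234 | 36.496 | 133.729
Capacity competition | Low | 5 | % Diff. | 4.63 | 1.45 | 5.73 | 2.66
Capacity competition | Low | 6 | MPE | 8.073 | 119.87 | 48.206 | 168.077
Capacity competition | Low | 6 | MME | 8.206 | 118.077 | 44.539 | 162.617
Capacity competition | Low | 6 | % Diff. | 1.66 | 1.5 | 7.6 | 3.25
Capacity competition | Low | 7 | MPE | 9.609 | 140.949 | 58.295 | 199.244
Capacity competition | Low | 7 | MME | 9.516 | 138.576 | 51.376 | 189.951
Capacity competition | Low | 7 | % Diff. | 0.97 | 1.68 | 11.87 | 4.66
Table 4: Comparison of MPE and MME indicators (long-run statistics). Long-run statistics computed simulating industry evolution over 10^6 periods.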
As we argued there, these are natural moments to keep track of and it is encouraging that we obtain good approximations to MPE in the two models. If in certain applications using a single moment is not enough to achieve an accurate approximation, one could add an additional moment. One informative statistic that could potentially improve the approximation is the number of fringe firms that have a high chance of becoming dominant soon. For example, in our model this could be the number of fringe firms that are close to the threshold state x. We tried adding this statistic in our experiments and in general it did not improve the approximation significantly. Instead, in our experiments adding dominant firm slots significantly improved the approximation, as we discuss next.

A fourth source of approximation error for MME relative to MPE is that in the former there is a potentially small number of dominant firm slots. In fact, in our numerical experiments we find that having enough slots to accommodate fringe firms transitioning to become dominant is important for providing accurate approximations. For the results presented in Table 4, the number of slots is typically large enough so that there is at least one slot empty in 90% or more of the industry states in the recurrent class. Further, with 3 dominant slots, the average number of dominant firms in the experiments is between only 1 and 2. This is important to avoid distorting investment incentives of fringe firms that are about to become dominant. For example, in the low investment cost instance of the capacity competition model for N = 6, decreasing the number of dominant slots from 3 to 2 and then to 1 increases the error in approximating consumer surplus from 7.6% to 18.1% and then to 37.5%, respectively.
On the other hand, increasing the number of dominant slots to 5 improves the approximation of consumer surplus only to 4.3% (without significantly improving the approximation of producer surplus and total investment), with almost a four-fold increase in the size of the state space. The relatively small improvement in approximation errors does not seem to compensate for the large increase in computational cost.

Finally, we also study the impact of firms' perceived transition probabilities outside the recurrent class on MME. Recall that the perceived transition kernel used in our experiments (defined in Section 6.2) only specifies transitions in the recurrent class for given strategies; transitions outside the recurrent class are arbitrary. In the results presented in Table 4, we assume that firms' perceptions in states outside the recurrent class are such that the current value of the moment remains the same in the next time period, which we call 'status-quo' perceptions. In our numerical experiments, we find that small perturbations around status-quo perceptions (e.g., instead of assuming that the moment stays the same in the next time period, firms assume that the moment transitions to a nearby value) have a minor impact on MME outcomes. We also run experiments with extreme 'optimistic perceptions'; in this case, firms' perceptions in states outside the recurrent class are that the moment transitions to its lowest possible value regardless of its current value. We observe that the differences in MME outcomes between these two modes of perceptions outside the recurrent class could be large in general; however, these differences tend to decrease as the number of fringe firms in the industry increases.^25 For robustness, users should experiment with different specifications for the perceptions outside the recurrent class in their applications.
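The two modes of perceptions outside the recurrent class can be made concrete with a small Python sketch that fills in kernel rows for states never visited in the simulation. The data layout (a dictionary from moment-based states to next-moment distributions, with the moment stored as the first entry of each state tuple) and the function name are assumptions made for this illustration; the paper does not describe its data structures.

```python
def fill_unvisited(kernel, all_states, moment_values, mode="status_quo"):
    """Return a copy of the moment kernel with rows added for unvisited states.

    kernel: dict mapping a state (theta, ...) to a dict {theta_next: probability},
            estimated only for states visited along the sample path.
    mode:   "status_quo" keeps the moment at its current value with probability 1;
            "optimistic" sends it to the lowest feasible moment value.
    """
    if mode not in ("status_quo", "optimistic"):
        raise ValueError("mode must be 'status_quo' or 'optimistic'")
    lowest = min(moment_values)
    filled = dict(kernel)
    for s in all_states:
        if s not in filled:
            theta = s[0]
            target = theta if mode == "status_quo" else lowest
            filled[s] = {target: 1.0}
    return filled


estimated = {(2, "a"): {2: 0.5, 3: 0.5}}   # rows for visited states only
states = [(2, "a"), (3, "a")]              # (3, "a") was never visited
print(fill_unvisited(estimated, states, [1, 2, 3], "status_quo")[(3, "a")])  # {3: 1.0}
print(fill_unvisited(estimated, states, [1, 2, 3], "optimistic")[(3, "a")])  # {1: 1.0}
```

Re-solving for MME under each mode and comparing the resulting long-run indicators is one way to carry out the robustness check recommended above.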
8 we use a bicubic spline to interpolate between grid points when computing firms' optimal strategies (see, e.g., Judd (1998)). We note that changing the fineness of the grid within reasonable ranges did not change the results significantly.

^25 For example, in the low investment cost instance of the quality ladder model, the difference in long-run expected consumer surplus between MME with status-quo and optimistic perceptions is 15.5%, 4.4%, and 1.1% for N = 6, N = 10, and N = 20, respectively.

8 Large-Scale Application: The Beer Industry

In the previous section we provided results for small-scale numerical experiments to be able to compare MME with MPE. In this section, to show how our approach increases the applicability of dynamic oligopoly models, we provide an application of MME in a large industry for which MPE is infeasible to compute. Some questions that have puzzled economists for decades are: What are the determinants of market structure? Why do some industries become dominated by a handful of firms while still holding many small firms? How does the resulting market structure affect market outcomes? We believe that our approach can be used to shed light on these questions, and more broadly, to develop counterfactuals in different empirical settings in which the market structure is endogenously determined in a fully dynamic model.

As an example, we perform numerical experiments that are motivated by the long trend towards concentration in the beer industry in the US during the years 1960-1990. In the course of those years, the number of active firms dropped from about 150 to 30, and three industry leaders emerged: Anheuser-Busch (owner of the Budweiser brand among others), Miller, and Coors. Two competing explanations for this trend are common in the literature (see Tremblay et al. (2005)): an increase in the minimum efficient scale (MES), and an increase in the importance of advertising that, with the emergence of national television, benefited big firms.
The role of advertising as an ‘endogenous sunk cost’ in determining market structure is discussed in detail in Sutton (1991) (see Chapter 13 for a discussion of the beer industry). In this section we calibrate a model of the beer industry, similar to the dynamic advertising model by Doraszelski and Markovich (2007), to examine the role advertising may have on market structure. We emphasize that it is not our goal to develop a complete and exhaustive empirical model of the beer industry, but rather to illustrate our approach in a setting of potential empirical relevance. The model is similar to Example 7.1, where xit is the goodwill of firm i in period t with associated market share σ(xit , st ) ∝ (xit )α1 (Y − pit )α2 . Firms invest in advertising to increase their goodwill stock over time and compete in prices every period. The number of dominant firms is determined endogenously in the equilibrium, with a maximum number of three. The transitions between goodwill states are similar to those introduced in Section 7.2, but we use a multiplicative growth model (as opposed to an additive one) that has been previously used in empirical applications of advertising (Roberts and Samuelson, 1988). Firms become dominant when their individual goodwill level becomes larger than a predetermined value. The numerical experiments examine the effect of different specifications of the contribution of goodwill on firms’ profits as captured by the parameter α1 . Firms keep track of the α1 -th un-normalized moment and MME is computed for each parameter specification. We say that the profit function exhibits: decreasing returns to advertising (DRA) if α1 < 1, constant returns to advertising (CRA) if α1 = 1, and increasing returns to advertising (IRA) if α1 > 1. We consider three specifications of returns to advertising with α1f and α1d controlling the returns to advertising for fringe and dominant firms, respectively. 
These take values in $(\alpha_1^f, \alpha_1^d) \in \{\alpha_D, \alpha_C\} \times \{\alpha_D, \alpha_C, \alpha_I\}$, where $(\alpha_D, \alpha_C, \alpha_I) = (0.85, 1, 1.1)$. The three cases under consideration are: (1) DRA-DRA with $(\alpha_D, \alpha_D)$, (2) DRA-CRA with $(\alpha_D, \alpha_C)$, and (3) CRA-IRA with $(\alpha_C, \alpha_I)$.

We calibrate the model parameters from a variety of empirical research that studies the beer industry or related advertising settings.^26 For example, α2 and Y are chosen to match the price elasticity in the average price, and the average sell-off value is taken from the sales' price of used manufacturing plants. See Appendix E for a list of the parameters and their sources. We emphasize that each of our experiments includes 200 firms and 29 different individual states, which makes it much larger than any problem that could be solved if MPE were used as an equilibrium concept.

Figure 1 plots (on a log-scale) the average goodwill distribution of firms for the three cases, and Table 5 reports some average industry statistics. The experiments suggest that higher returns to advertising indeed give rise to more right-skewed size distributions, as is expected. Indeed, it is clear that in the DRA-DRA case there is on average a vacancy in the dominant tier, and dominant firms are not much bigger than the biggest fringe firms. In both DRA-CRA and CRA-IRA the industry is much more concentrated and dominant firms are much larger than fringe firms. Also, dominant firms remain dominant for a much longer period of time. In addition, as the industry becomes more concentrated, there are fewer incumbent fringe firms, they are smaller, and they spend less time in the industry on average. In that sense, large dominant firms deter the growth of fringe firms. Overall, our results show that increasing returns to advertising could yield a concentrated market structure with a few dominant firms like in the beer industry.

Figure 1: Average size distribution (un-normalized) of firms in log-scale.
The solid line represents fringe firms and the triangles represent dominant firms. Finally, we examine the effect fringe firms have on dominant firms by computing MPE with dominant firms only. In fact, ignoring fringe firms is a common practice in the applied literature to simplify computation. In order to make the comparison fair we normalize the profit function to compute MPE by fixing the fringe firms’ moment to its MME long-run average. We obtain that, for example, in the DRA-CRA case the average size of dominant firms drops by 17% and the average size of the smallest dominant firm falls around 50% in the MPE relative to the MME values. This suggests that deterring entry and pushing down investment from fringe firms are important determinants in dominant firms’ investment incentives. The collective 26 We thank Carol Tremblay and Victor Tremblay for providing supplementary data. 30 Industry averages Concentration ratio C2 Concentration ratio C3 First moment (normalized by # fringe) Active fringe firms (#) Active dominant firms (#) Size incumbent fringe (goodwill) Size dominant (goodwill) Entrants per period (#) Time in industry fringe Time as dominant CRA-IRA 0.42 0.53 1.085 147 3 1.1 83.4 16.5 12.9 1536 DRA-CRA 0.36 0.44 1.22 167 3 1.4 64.9 12.3 16 360 DRA-DRA 0.13 0.17 1.74 180 2.3 2 24.1 7.6 26.2 21 Table 5: Average industry statistics presence of fringe firms, in spite of their weak individual market power, disciplines dominant firms and forces them to invest more than in an oligopoly in which there are only dominant firms. We conclude that explicitly modeling fringe firms may have important effects on conclusions derived from counterfactuals. 9 Approximation Error Bounds In Section 5 we discussed that MME can be interpreted as a way of approximating MPE, but also as an appealing behavioral model on its own. 
To evaluate the performance of MME through the lens of the latter perspective, we introduced the notion of the value of the full information deviation, that is, the value of a unilateral deviation to a strategy that keeps track of the full industry state. While in small-scale experiments like those presented in Section 7.3 we can compute the value of the full information deviation exactly, this is not possible in large-scale applications like the one presented in Section 8. In these instances, computing a best response is almost as computationally challenging as solving for an MPE. Nor, of course, can we compare MME with MPE in these cases.

Thus motivated, we suggest a novel computationally tractable error bound that upper bounds the value of the full information deviation. This error bound is derived by observing a connection between the unilateral deviation problem of a firm in MME and problems in robust dynamic programming (RDP); see Iyengar (2005) for a reference to the RDP literature.[27] We note that, to the best of our knowledge, ideas from RDP have not been previously used to derive sub-optimality error bounds for dynamic programs with partial information as we do here. This bound is useful because it allows one to evaluate whether the state aggregation is appropriate or whether a finer state aggregation is necessary, for example, by adding more moments. In fact, if the value of the full information deviation is small, it is plausible that firms would use the relatively simpler MME strategies as opposed to more complex Markov strategies that do not yield significant additional benefits.

[27] RDP considers Markov decision processes with unknown transition kernels in which the decision maker chooses a ‘robust’ strategy to mitigate this ambiguity. In our case, observing the moment, but not the underlying industry state, is equivalent to having ambiguity about the underlying transition probabilities.
Therefore, upper bounding the extent of a unilateral deviation of an agent in MME can be reformulated as an RDP problem.

9.1 Derivation of Error Bound

Let $(\mu, \lambda)$ be a fixed MME strategy for the remainder of this section. Recall that $V(x, s \mid \mu, \lambda)$ is the actual value of playing MME strategies starting from $(x, s)$. Define $V^*(x, s \mid \mu, \lambda) = \sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda)$. The value of the full information deviation is $\Delta_{\mu,\lambda}(x, s) = V^*(x, s \mid \mu, \lambda) - V(x, s \mid \mu, \lambda)$ (see (6)). Note that given MME strategies $(\mu, \lambda)$, $V(x, s \mid \mu, \lambda)$ can be computed using forward simulation. However, the problem of finding the optimal strategy that achieves $V^*$ is high dimensional. Instead we find an upper bound to $V^*$ with which we can upper bound $\Delta_{\mu,\lambda}$. To do so we construct a robust Bellman operator as follows.

For every $\hat s = (\theta, d, z) \in \hat S$ define the consistency set $S_f(\hat s) = \{f \in S_f : \theta(f) = \theta\}$, that is, the set of all fringe distributions that are consistent with the value of the moments in state $\hat s$. Note that moments are not sufficient statistics for the evolution of the industry, because typically $S_f(\hat s)$ is not a singleton and different fringe distributions in the consistency set may have different future evolutions. Now, define

$$
(T_R \hat V)(x, \hat s) = \sup_{\iota \in I,\ \rho \ge 0}\ \sup_{f \in S_f(\hat s)} \Big\{ \pi(x, \hat s) + E\Big[ \phi\, 1\{\phi \ge \rho\} + 1\{\phi < \rho\} \Big( -c(x, \iota) + \beta\, E_{\mu,\lambda}\big[ \hat V(x_{i,t+1}, \hat s_{t+1}) \,\big|\, x_{it} = x,\ s_t = (f, d, z),\ \iota \big] \Big) \Big] \Big\}, \qquad (12)
$$

where $\hat V \in \hat{\mathcal{V}}$ is a bounded vector $\hat V : X \times \hat S \to \mathbb{R}$ and $1\{\cdot\}$ is the indicator function. Recall that we assume all competitors use MME strategies $(\mu, \lambda)$. That is, the robust Bellman operator is defined on $X \times \hat S$ and it is identical to the Bellman operator associated with the best response in C1 in the definition of MME, except that the firm can also choose any underlying fringe state consistent with the observed moment $\theta$. In Lemma C.1 in the Appendix, we show that the robust Bellman operator $T_R$ has a unique fixed point $\hat V^*$ that we call the robust value function.
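To make the structure of the robust Bellman operator concrete, the following is a minimal toy sketch and not the procedure of Appendix F. It assumes finite own states, finite moment-based states, a finite action set, and a finite consistency set for each moment state; it drops the exit decision and the sell-off value, and it assumes the firm's own action does not affect moment transitions. All names (`robust_bellman`, `robust_value`, `P_own`, `Q`) and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def robust_bellman(V, profit, cost, P_own, Q, beta):
    """One application of a toy robust (optimistic) Bellman operator.

    V, profit : arrays of shape (X, S), indexed by own state x and moment state s_hat
    cost      : array of shape (A,), cost of each investment action
    P_own     : array of shape (A, X, X), own-state transition matrix per action
    Q         : list of length S; Q[s] has shape (F_s, S) and holds, for each
                fringe distribution f in the (finite, assumed) consistency set of
                s_hat, the distribution of next period's moment state
    """
    X, S = V.shape
    TV = np.empty((X, S))
    for s in range(S):
        # EV[f, x'] = sum over s' of Q[s][f, s'] * V[x', s']
        EV = Q[s] @ V.T
        for x in range(X):
            best = -np.inf
            for a in range(len(cost)):
                # cont[f] = expected continuation value if f is the true fringe state
                cont = EV @ P_own[a, x]
                # inner sup over the consistency set, outer sup over actions
                best = max(best, -cost[a] + beta * cont.max())
            TV[x, s] = profit[x, s] + best
    return TV

def robust_value(profit, cost, P_own, Q, beta, tol=1e-10, max_iter=10_000):
    """Iterate the operator to its fixed point (it is a beta-contraction here)."""
    V = np.zeros_like(profit, dtype=float)
    for _ in range(max_iter):
        TV = robust_bellman(V, profit, cost, P_own, Q, beta)
        if np.max(np.abs(TV - V)) < tol:
            return TV
        V = TV
    return V

# Hypothetical toy primitives: 2 own states, 2 moment states, 2 actions.
profit = np.array([[0.0, 1.0], [2.0, 3.0]])
cost = np.array([0.0, 0.5])
P_own = np.array([[[1.0, 0.0], [0.5, 0.5]],    # action 0: no investment
                  [[0.3, 0.7], [0.0, 1.0]]])   # action 1: invest
beta = 0.9

# Singleton consistency sets: the moment is a sufficient statistic.
Q_single = [np.array([[0.5, 0.5]]), np.array([[0.2, 0.8]])]
# Larger consistency sets that contain the singleton case.
Q_robust = [np.array([[0.5, 0.5], [0.9, 0.1]]),
            np.array([[0.2, 0.8], [0.6, 0.4]])]

V_single = robust_value(profit, cost, P_own, Q_single, beta)
V_robust = robust_value(profit, cost, P_own, Q_robust, beta)
gap = V_robust - V_single  # nonnegative: larger consistency sets loosen the bound
```

In this toy, shrinking each consistency set toward a singleton drives `gap` to zero, which mirrors the idea that adding moments reduces the consistency sets and tightens the bound.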
The next result, also proved in the appendix, relates the robust value function to the optimal value function $V^*$.

Theorem 9.1. For all $(x, s) \in X \times S$,

$$
V^*(x, s) \le \hat V^*(x, \hat s),
$$

where $\hat s$ is the moment-based industry state that is consistent with $s$, that is, $\theta = \theta(f)$.

In essence, the robust value function resolves the indeterminacy of moment transitions by choosing the best fringe state from the associated consistency set. Intuitively, this should provide an upper bound for $V^*$. Therefore, we can bound the value of the full information deviation with $\hat\Delta(x, s) = \hat V^*(x, \hat s) - V(x, s)$, where $\hat s$ is the moment-based industry state of $s$. The advantage of computing $\hat V^*$ over $V^*$ is that $T_R$ operates on the moment-based state space $\hat S$, which is much smaller than $S$. Nevertheless, the computation of $\hat V^*$ is still demanding, as we explain next.

Finding the fixed point of the operator $T_R$ is generally a hard computational problem; in fact, it is NP-complete (Iyengar, 2005, §3). This is not surprising, since the optimization over consistency sets may be very complex. However, iterating the operator $T_R$ becomes much simpler if there is a large number of fringe firms. In this case, the one-step transition of the fringe state is close to being deterministic, because, conditional on the current state, the transitions of individual fringe firms average out at the aggregate level by a law of large numbers. When one-step transitions are deterministic, finding the optimal consistent $f$ in (12) is equivalent to choosing the next moment from a set of moments that are accessible from the current industry state. This considerably simplifies the computation of the inner maximization in (12), since the moment accessibility sets are low dimensional. Moreover, characterizing these accessibility sets can be done efficiently. This procedure, described in detail in Appendix F, is tractable for problems for which computing MME is tractable.
We note that this method, based on deterministic transitions for the fringe state, only provides an approximation to the robust upper bound for models with a large but finite number of fringe firms.[28]

It is simple to observe that if moments are sufficient statistics to predict the future evolution of the industry, then the bound is tight. If this is not the case, the gap between the actual value of the full information deviation and the quantity computed with the robust bound depends on the distance $\hat V^*(x, \hat s) - V^*(x, s)$. To lower this gap and make the bound tighter we can compute the robust bound over a refinement of the moment-based state space used in the equilibrium computation by including moments in the robust operator not included in MME. For example, we could add another contemporaneous moment like a quantile, or the number of fringe firms that are close to becoming dominant, or we could add a lagged moment. This will reduce the size of the consistency sets, making the bound tighter, albeit increasing the computational cost. We illustrate this with numerical experiments in the next subsection.

9.2 Numerical Experiments

We have done extensive numerical experiments using the robust bound. We present some results on two simplified models similar to the beer industry experiments of Section 8 with a single dominant firm that operates under constant returns to advertising. In the first model $N = 200$ and fringe firms can enter and exit the market (model E); in the second $N = 100$ and firms cannot enter or exit the industry (model NE). For each model we vary $\alpha_1^f$, the parameter controlling fringe firms’ returns to advertising, from 0.45 to 0.65. The lower the returns to advertising, the smaller the reward from investment and the more homogeneous the fringe firms are in equilibrium. Since the moment is a sufficient statistic of the future evolution of the industry when fringe firms are homogeneous, we expect the error bound to be lower for low values of $\alpha_1^f$.
In both models E and NE, to simplify computation, we use the parametric approach described in Appendix A to construct the perceived transition kernel. In particular, we assume a linear relationship $\theta_{t+1} = \xi_0(d_t) + \xi_1(d_t)\theta_t$, and the parameters $\{(\xi_0(d), \xi_1(d))\}_{d \in S_d}$ are chosen to fit the simulated transitions using ordinary least squares. To simplify, we assume there is no aggregate shock.

We report in Figures 2a and 2b the expected error bound for both dominant and fringe firms as a percentage of the actual value function. Namely, we plot

$$
E_{\mu,\lambda}\left[ \frac{\hat\Delta_{\mu,\lambda}(x, s)}{V(x, s \mid \mu, \lambda)} \right],
$$

where the expectation is taken with respect to the invariant distribution of the underlying industry state $s$.[29]

Figure 2: Error Bound. Panels: (a) No Entry/Exit; (b) Entry/Exit.

[28] It is possible to formally derive a probabilistic version of the robust bound that corrects for the fact that the number of fringe firms $N$ is finite using standard probability bounds. Loosely speaking, by a central limit theorem the correction term is of the order $\sqrt{N}$. However, given our numerical experience, we believe that in many settings of interest in which $N$ is large, the $\sqrt{N}$ correction will not have a significant impact for any practical purposes.

In addition to the standard robust bound, we consider a variation where a lagged fringe state enters the computation of the consistency sets. This reduces the size of these sets and so decreases the gap between the actual value of the full information deviation and the error bound. Indeed, we see that the lagged state decreases the error bound, in particular in model E. Adding further contemporaneous or lagged moments should generally make the error bound even tighter. The experiments agree with our conjecture that a more homogeneous fringe tier (low $\alpha_1^f$) will result in a lower error bound. In addition, we see that the error bounds are generally lower for model NE. This is explained by the better fit of the linear moment transitions without entry and exit.
Entry and exit decisions are inherently nonlinear, and therefore, the assumed linear moment transitions approximate the actual moment transitions in NE better than in E, resulting in a lower value of the bound.

These experiments suggest that the robust bound can be useful to test the extent of sub-optimality of MME strategies in terms of a unilateral deviation. Based on our numerical results, we find that, depending on the model, the bound can sometimes be tight, indicating that the extent of a unilateral deviation is small, but it can also be loose. In the latter cases, refining the state space by adding moments when computing the robust bound can help make the bound tighter and more useful. Also, adding moments to firms’ MME strategies should generally improve the accuracy of the perceived transition kernel, and it is plausible to expect that this would decrease the bound as well. There may still be settings in which, even after adding several moments, the bound will be loose. This is not entirely surprising, as the error bound derived in this paper is very generic and does not use any problem-specific information. We leave potential refinements of our bound that use problem-specific information as a matter for further research.

[29] For dominant firms, the expectation is also taken with respect to the invariant distribution of their individual state evolution. For fringe firms, $x$ is taken to be the most-visited fringe state.

10 Extensions and Conclusions

In this paper we introduced a novel framework to study dynamic oligopolies in concentrated industries that opens the door to study new issues in the empirical analysis of industry dynamics. In particular, we first introduced MME and an algorithm to compute it. We also discussed important implementation issues for applied work and illustrated our approach in a large-scale application.
Then, we provided results to justify MME as an approximation by comparing it to MPE and by introducing error bounds. In the rest of this section, we discuss some extensions of our approach to other important classes of economic models, as well as other possible future directions.

10.1 Extension: Dynamic Games of Asymmetric Information

In the dynamic games of asymmetric information studied by Fershtman and Pakes (2012) a firm keeps track of the history it has observed up to that period. In principle, optimal strategies could depend on the infinite history of observations, because past history may provide useful information about current competitors’ private states. To make the model tractable, the authors assume that histories are finite and discuss environments in which this assumption may be plausible, for example, when there is public periodic revelation of all information.

Our error bounds and related ideas can be used to upper bound the extent of a unilateral deviation from the finite history REBE of Fershtman and Pakes (2012) to a strategy that keeps track of the infinite history. In the spirit of approximating equilibrium in a game of complete information, our error bounds assume that the deviant firm has access to the full current industry state. However, in a game of asymmetric information it is more appropriate to consider a unilateral deviation to a strategy that instead keeps track of the observed infinite history. Note that typically, fixing competitors’ strategies, a unilateral deviation to a strategy of the former type provides larger expected discounted profits than one of the latter type, because having access to the full current state typically provides more information than having access to the observed infinite history. For example, while in the former the firm knows the exact current state of competitors, this may not necessarily be true in the latter.
As a consequence, our error bounds are valid for the model of asymmetric information; however, tighter bounds can be derived.

To derive tighter bounds, we formulate the best response problem of a firm in the dynamic game of asymmetric information as a partially observable Markov decision process (POMDP); see, for example, Lovejoy (1991) for a reference on POMDPs. Then, we use the results from White III and Scherer (1994) that derive upper bounds to the optimal value function of the POMDP that keeps track of the entire infinite history, based on the POMDP that only keeps track of a finite history of length $M$. White III and Scherer (1994) also show that the bounds become weakly tighter as $M$ grows. They also provide sufficient conditions under which the extent of the unilateral deviation becomes zero for finite $M$. The sufficient conditions essentially establish that distant history becomes irrelevant and are related to the literature on contraction rates and forecast horizons in fully observed MDPs (see references in Lovejoy (1991)). However, checking such sufficient conditions over model primitives depends on the underlying stochastic process and is typically quite complex.

In summary, using ideas from POMDPs we can derive bounds on the extent of a unilateral deviation for fixed competitors’ strategies from REBE with a given memory length to a strategy that keeps track of the full infinite history. Note that because competitors’ strategies are fixed, the transition kernel that describes the evolution of competitors is fixed. In contrast, a much harder problem is to consider a sequence of REBE with increasing memory lengths for all firms and show that strategies become optimal as memory length grows. In this case, the underlying competitors’ process faced by a given firm is itself becoming more memory dependent as the memory length grows, and therefore, the previously mentioned POMDP results do not apply.
Studying this problem is outside the scope of this paper but may be interesting for future research.

10.2 Extension: Macroeconomic Models with Heterogeneous Agents

Our approach is related to the important literature in macroeconomics that studies models with heterogeneous agents in the presence of aggregate shocks. Notably, Krusell and Smith (1998) study a stochastic growth model with heterogeneity in consumer income and wealth. Other papers focus on firm-level heterogeneity in capital and productivity (see, e.g., Khan and Thomas (2008) and Clementi and Palazzo (2014)). All of these models assume a continuum of agents and therefore agents’ dynamic programming problems are infinite dimensional; strategies depend on the distribution over individual states. Motivated by the seminal idea in Krusell and Smith (1998), in these papers agents are assumed to keep track of only a small number of statistics of this distribution to simplify the problem. Note that these macroeconomic models do not incorporate dominant agents, so our framework accommodates them by considering moment-based states that include moments and the value of the aggregate shock only.

While our approach is partially inspired by this previous literature, we believe that in turn our results can also be useful in these macroeconomic models. For concreteness, we focus on the seminal paper of Krusell and Smith (1998), who show that in their model the first moment of the wealth distribution is essentially a sufficient statistic for the evolution of the economy, a property they call ‘approximate aggregation’. The reason is that agents’ equilibrium decisions turn out to be almost linear in their state. In fact, if decisions were exactly linear, the first moment would be an exact sufficient statistic. While the approximate aggregation property has been shown to be quite robust in this class of models, Krusell and Smith (2006) and Algan et al.
(2013) acknowledge that it does not hold for all models and may not hold for important models considered in the future. For this reason, it is important to understand the boundaries of approximate aggregation by testing its accuracy. Den Haan (2010) describes the limitations of commonly used accuracy tests, such as the so-called ‘R² test’, and provides other alternatives. These tests often try to measure the accuracy of the proposed perceived transition kernel. We believe that our computationally tractable error bounds provide a more direct test of the validity of a state aggregation technique, because they directly measure improvements in payoffs, in particular, how much an agent can improve its expected discounted utility (in monetary terms) by unilaterally deviating from the approximate equilibrium strategy to a strategy that keeps track of the full distribution of wealth. The bound is useful because it can help determine whether approximate aggregation holds or a finer approximation is required.

10.3 Other Future Directions

We believe that our work suggests several other future directions for research in dynamic oligopolies. First, our error bound is very general and we envision that tighter bounds can be derived using problem-specific information. Second, in our definition of MME, firms keep track of current moments of the fringe state. Our equilibrium concept and error bound can be modified to allow firms to keep track of past values of moments. Finally, MME with the perceived transition kernels used in our experiments are intended to study the long-run equilibrium market structure. We believe that an appropriate modification of our approach would allow the study of short-term dynamics and how the market structure would evolve in the few years after a policy or an environmental change, such as a merger.

In Sections 10.1 and 10.2 we discussed how our methods can be extended to other important models in economics.
In addition, we envision that some of our ideas could potentially also be useful to study dynamic models with forward looking consumers. For example, in a dynamic oligopoly model with durable goods, firms may need to make pricing and investment decisions, keeping track of the distribution of consumers’ ownership, which is a high-dimensional object (e.g., Goettler and Gordon (2011)). Our ideas may be useful in this context as well, where one could replace this distribution by some of its moments. We leave all these extensions for future research.

References

Algan, Y., O. Allais, W. J. Den Haan, P. Rendahl. 2013. Solving and simulating models with heterogeneous agents and aggregate uncertainty. Handbook of Computational Economics Vol. 3. North-Holland, Elsevier.
Altug, S., P. Labadie. 1994. Dynamic Choice and Asset Markets. Academic Press.
Bajari, P., C. L. Benkard, J. Levin. 2007. Estimating dynamic models of imperfect competition. Econometrica 75(5) 1331–1370.
Benkard, C. L. 2004. A dynamic analysis of the market for wide-bodied commercial aircraft. Review of Economic Studies 71(3) 581–611.
Benkard, C. L., A. Bodoh-Creed, J. Lazarev. 2010. Simulating the dynamic effects of horizontal mergers: U.S. airlines. Working Paper, Stanford.
Benkard, C. L., P. Jeziorski, G. Y. Weintraub. 2014. Oblivious equilibrium for concentrated industries. Working Paper, Stanford University.
Bertsekas, D. P., S. Shreve. 1978. Stochastic Optimal Control: The Discrete-Time Case. Academic Press Inc.
Besanko, D., U. Doraszelski. 2004. Capacity dynamics and endogenous asymmetries in firm size. RAND Journal of Economics 35(1) 23–49.
Besanko, D., U. Doraszelski, Y. Kryukov, M. Satterthwaite. 2010. Learning-by-doing, organizational forgetting, and industry dynamics. Econometrica 78(2).
Besanko, D., M. K. Perry, R. H. Spady. 1990. The logit model of monopolistic competition: Brand diversity. The Journal of Industrial Economics 38(4) 397–415.
Blundell, R., T.M. Stoker. 2005.
Heterogeneity and aggregation. Journal of Economic Literature XLIII 347–391.
Caplin, A., B. Nalebuff. 1991. Aggregation and imperfect competition - on the existence of equilibrium. Econometrica 59(1) 25–59.
Clementi, G.L., D. Palazzo. 2014. Entry, exit, firm dynamics, and aggregate fluctuations. NBER Working paper.
Collard-Wexler, A. 2011. Productivity dispersion and plant selection. Working Paper, NYU.
Collard-Wexler, A. 2013a. Demand fluctuations in the ready-mix concrete industry. Econometrica 81(3) 1003–1037.
Collard-Wexler, A. 2013b. Mergers and sunk costs: An application to the ready-mix concrete industry. Forthcoming American Economic Journal: Micro.
Corbae, D., P. D’Erasmo. 2013. A quantitative model of banking industry dynamics. Working paper, UT Austin.
Corbae, D., P. D’Erasmo. 2014. Capital requirements in a quantitative model of banking industry dynamics. Working Paper, Maryland University.
Den Haan, W. J. 2010. Assessing the accuracy of the aggregate law of motion in models with heterogeneous agents. Journal of Economic Dynamics and Control 34 79–99.
Dixit, A. K., J. E. Stiglitz. 1977. Monopolistic competition and optimum product diversity. American Economic Review 67(3) 297–308.
Donoghue, W. F. 1969. Distributions and Fourier Transforms. Academic Press.
Doraszelski, U., K. Judd. 2011. Avoiding the curse of dimensionality in dynamic stochastic games. Quantitative Economics 3(1) 53–93.
Doraszelski, U., S. Markovich. 2007. Advertising dynamics and competitive advantage. RAND Journal of Economics 38(3) 557–592.
Doraszelski, U., A. Pakes. 2007. A framework for applied dynamic analysis in IO. Handbook of Industrial Organization, Volume 3. North-Holland, Amsterdam.
Doraszelski, U., M. Satterthwaite. 2010. Computable Markov-perfect industry dynamics. RAND Journal of Economics 41 215–243.
Ericson, R., A. Pakes. 1995. Markov-perfect industry dynamics: A framework for empirical work. Review of Economic Studies 62(1) 53–82.
Farias, V., D.
Saure, G.Y. Weintraub. 2012. An approximate dynamic programming approach to solving dynamic oligopoly models. The RAND Journal of Economics 43(2) 253–282.
Fershtman, C., A. Pakes. 2012. Dynamic games with asymmetric information: A framework for empirical work. Quarterly Journal of Economics 127(4) 1611–1661.
Gallant, A. R., H. Hong, A. Khwaja. 2011. Dynamic entry with cross product spillovers: An application to the generic drug industry. Working paper.
Goettler, R. L., B. Gordon. 2011. Does AMD spur Intel to innovate more? Journal of Political Economy 119(6) 1141–1200.
Gowrisankaran, G., T. Holmes. 2004. Mergers and the evolution of industry concentration: results from the dominant-firm model. RAND Journal of Economics 35(3) 1–22.
Iacovone, L., B. Javorcik, W. Keller, J. Tybout. 2013. Walmart in Mexico: The impact of FDI on innovation and industry productivity. Working paper, Penn State University.
Iyengar, G. 2005. Robust dynamic programming. Mathematics of Operations Research 30(2) 257–280.
Jeziorski, P. 2013. Estimation of cost synergies from mergers: Application to US radio. Working Paper, U.C. Berkeley.
Jia, P., P. Pathak. 2012. The cost of free entry: Evidence from real estate brokers in greater Boston. Working Paper, MIT.
Judd, K. 1998. Numerical Methods in Economics. MIT Press.
Kalouptsidi, M. 2014. Time to build and fluctuations in bulk shipping. American Economic Review 104(2) 564–608.
Khan, A., J.K Thomas. 2008. Idiosyncratic shocks and the role of nonconvexities in plant and aggregate investment dynamics. Econometrica 76(2) 395–436.
Krusell, P., A. A. Smith, Jr. 1998. Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy 106(5) 867–896.
Krusell, P., A. A. Smith, Jr. 2006. Quantitative macroeconomic models with heterogeneous agents. Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress.
Lee, R. S. 2013. Vertical integration and exclusivity in platform and two-sided markets.
American Economic Review 103(7) 2960–3000.
Lovejoy, W. S. 1991. A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research 18 47–66.
Lucas, R. E., E. C. Prescott. 1971. Investment under uncertainty. Econometrica 39(5) 659–681.
Maskin, E., J. Tirole. 1988. A theory of dynamic oligopoly, I and II. Econometrica 56(3) 549–570.
Maskin, E., J. Tirole. 2001. Markov perfect equilibrium I. observable actions. Journal of Economic Theory 100 191–219.
Pakes, A., P. McGuire. 1994. Computing Markov-perfect Nash equilibria: Numerical implications of a dynamic differentiated product model. RAND Journal of Economics 25(4) 555–589.
Pakes, A., P. McGuire. 2001. Stochastic algorithms, symmetric Markov perfect equilibrium, and the ‘curse’ of dimensionality. Econometrica 69(5) 1261–1281.
Qi, S. 2013. The impact of advertising regulation on industry: The cigarette advertising ban of 1971. RAND Journal of Economics 44(2) 215–248.
Resnick, S.I. 1998. A Probability Path. Birkhäuser.
Roberts, M. J., L. Samuelson. 1988. An empirical analysis of dynamic, nonprice competition in an oligopolistic industry. The RAND Journal of Economics 19(2) 200–220.
Rojas, C. 2008. Price competition in U.S. brewing. The Journal of Industrial Economics 56(1) 1–31.
Ryan, S. 2012. The costs of environmental regulation in a concentrated industry. Econometrica 80(3) 1019–1062.
Saeedi, M. 2013. Reputation and adverse selection: Theory and evidence from eBay. Working Paper, The Ohio State University.
Santos, C. D. 2010. Sunk costs of R&D, trade and productivity: the moulds industry case. Working Paper, U. of Alicante.
Santos, C. D. 2012. An aggregation method to solve dynamic games. Working paper.
Sutton, J. 1991. Sunk Costs and Market Structure. 1st ed. MIT Press.
Sutton, J. 1997. Gibrat’s legacy. Journal of Economic Literature 35(1) 40–59.
Sweeting, A. 2013.
Dynamic product positioning in differentiated product markets: The effect of fees for musical performance rights on the commercial radio industry. Econometrica 81(5) 1763–1803.
Tomlin, B. 2008. Exchange rate volatility, plant turnover and productivity. Working Paper, Boston University.
Tremblay, V. J., N. Iwasaki, C. H. Tremblay. 2005. The dynamics of industry concentration for U.S. micro and macro brewers. Review of Industrial Organization 26 307–324.
Tremblay, V. J., C. H. Tremblay. 2005. The US Brewing Industry: Data and Economic Analysis, MIT Press Books, vol. 1. The MIT Press.
Weintraub, G. Y., C. L. Benkard, B. Van Roy. 2008. Markov perfect industry dynamics with many firms. Econometrica 76(6) 1375–1411.
White III, C. C., W. T. Scherer. 1994. Finite-memory suboptimal design for partially observed Markov decision processes. Operations Research 42(3) 439–455.
Whitt, W. 1980. Representation and approximation of noncooperative sequential games. SIAM J. on Control and Optimization 18(1) 33–48.
Xu, Y. 2008. A structural empirical model of R&D, firm heterogeneity, and industry evolution. Working paper, NYU.

Appendices

A Parametric Perceived Transition Kernels

We describe a construction of the perceived transition kernel that has been successfully used in stochastic growth models in macroeconomics (Krusell and Smith, 1998) and subsequent literature. This perceived kernel assumes a parameterized and deterministic evolution for moments. That is, starting from industry state $\hat s_t = (\theta_t, d_t, z_t)$, the next moment value is assumed to be $\theta_{t+1} = G(\theta_t; \xi(d_t, z_t))$, where $\xi(d_t, z_t) \in \Xi$ are parameters. For example, this could represent a linear relationship with one moment, $\theta_{t+1} = \xi_0(d_t, z_t) + \xi_1(d_t, z_t)\theta_t$. In this case the goal would be to choose functions $\xi_0$ and $\xi_1$ that approximate the actual transitions best, for instance by employing linear regressions.
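As an illustration of the linear-regression step, the following is a minimal sketch that fits $\theta_{t+1} = \xi_0 + \xi_1 \theta_t$ separately for each value of the conditioning state by ordinary least squares. It assumes a single scalar moment and that the pair $(d_t, z_t)$ has been encoded as one integer label per period; the function name `fit_linear_moment_kernel`, the label encoding, and the simulated path are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_linear_moment_kernel(theta, labels):
    """Fit theta[t+1] = xi0(label[t]) + xi1(label[t]) * theta[t] by OLS.

    theta  : array of shape (T + 1,), a simulated path of the scalar moment
    labels : array of shape (T + 1,), integer code of (d_t, z_t) in each period
             (the encoding is an assumption made for this sketch)
    Returns a dict mapping each label to the fitted pair (xi0, xi1).
    """
    theta = np.asarray(theta, dtype=float)
    labels = np.asarray(labels)
    current, nxt, lab = theta[:-1], theta[1:], labels[:-1]
    coef = {}
    for label in np.unique(lab):
        mask = lab == label
        # Design matrix with an intercept column and the current moment.
        A = np.column_stack([np.ones(mask.sum()), current[mask]])
        (xi0, xi1), *_ = np.linalg.lstsq(A, nxt[mask], rcond=None)
        coef[int(label)] = (float(xi0), float(xi1))
    return coef

# Hypothetical noiseless path generated from known coefficients, so that the
# regression should recover them up to numerical precision.
rng = np.random.default_rng(0)
true_xi = {0: (1.0, 0.5), 1: (2.0, 0.8)}
T = 200
labels = rng.integers(0, 2, size=T + 1)
theta = np.empty(T + 1)
theta[0] = 3.0
for t in range(T):
    xi0, xi1 = true_xi[int(labels[t])]
    theta[t + 1] = xi0 + xi1 * theta[t]

xi_hat = fit_linear_moment_kernel(theta, labels)
```

With simulated equilibrium paths the transitions are noisy, so the fitted coefficients would only approximate the actual moment evolution; the quality of that fit is one of the things the error bounds of Section 9 are meant to assess.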
B Moments Become Sufficient Statistics In this section we present a class of industry dynamic models for which a succinct set of moments essentially summarize all payoff relevant information and is close to being sufficient statistics. In these models, the value of the full information deviation is zero, or becomes zero asymptotically as the number of fringe firms grows large. A simple class of models for which this holds is when fringe firms are homogeneous, in the sense that Xf is a singleton. In this case, a single moment, namely the number of incumbent fringe firms, is a sufficient statistic of the future evolution of the industry. Next, we describe a model with firm fringe heterogeneity that does not allow for entry and exit, in which moments also become sufficient statistics as the number of fringe firms grows large. Constant Returns to Scale Model for Fringe Firms. We take Xf = <+ (with another component that we suppress for clarity indicating that the firm is fringe). We assume that there is no entry and exit (therefore µ = ι), that there are N − D fringe firms, and that there are no transitions between the fringe and dominant tier. The analysis below can be applied to single-period profit functions and transition kernels that depend on any integer moments of the fringe state. However, to simplify the exposition we assume that they both depend only on the first moment. A model like Example 7.1 with α1 = 1 would give rise to this type of profit function. Accordingly, we assume that firms keep track only of the first un-normalized moment of the P fringe state, that is, θt = x∈Xf xft (x). We make the following assumptions on the primitives of fringe firms only; no restrictions are placed on the primitives of dominant firms. 1. The single-period profit is linear in the fringe firm’s own state, π(x, ŝ) = xπ1 (ŝ) + π0 (ŝ) with supŝ∈Ŝ {π1 (ŝ), π0 (ŝ)} < ∞. The assumption imposes constant returns to scale. 42 2. 
For a fixed state $x$, the cost function increases linearly with the investment level $\iota$. In addition, the marginal investment cost increases linearly with the state. Formally, $c(x, \iota) = (cx)\iota$ with $\iota \in I = \mathbb{R}_+$.
3. The dynamics of a fringe firm's evolution are linear in its own state: $x_{i,t+1} = x_{it}\,\zeta_1(\iota, \hat s_t, w_{1it}) + \zeta_0(\hat s_t, w_{0it})$, where $\iota$ is the amount invested. Each of the sequences of random variables $\{w_{0it} \mid t \ge 0, i \ge 1\}$ and $\{w_{1it} \mid t \ge 0, i \ge 1\}$ is i.i.d. and independent of all previously defined random quantities and of each other. In addition, we assume the functions $\zeta_0$ and $\zeta_1$ are uniformly bounded over all realizations, investment levels, and industry states.30

We begin by showing that for any perceived kernel $\hat P_\mu$ the corresponding best response investment strategy for a fringe firm does not depend on its own individual state.31 All proofs are relegated to Appendix C.

Lemma B.1. Consider the constant returns to scale model described above. For any MME strategy $\mu$ and for every $x, x' \in X_f$ and $\hat s \in \hat S$ we have $\mu(x, \hat s) = \mu(x', \hat s) = \mu(\hat s)$.

We use this result to characterize the evolution of the moment under MME strategies as the number of fringe firms grows. To obtain a meaningful asymptotic regime, we scale the market size together with the number of fringe firms. Formally, we consider a sequence of industries with growing market size $m$; market size would typically enter the profit function through the underlying demand system, as in Example 7.1. Therefore, we consider a sequence of markets indexed by market sizes $m \in \mathbb{N}$ with profit functions denoted by $\pi^m$. We assume the number of firms increases proportionally to the market size, that is, $N^m = Nm - D$, for some constant $N > 0$. We assume that all other model primitives, including the maximum number of dominant firms, are independent of $m$. Quantities associated with market size $m$ are indexed with the superscript $m$. We state the following assumption.

Assumption B.1.
Let $\{\mu^m : m \ge 1\}$ be a sequence of MME strategies followed by all firms in market $m$. Then, there exists a compact set $\bar X_f \subset X_f$ such that for all $m \ge 0$, $t \ge 0$, and $i \in F^m_t$, $P[x^m_{it} \in \bar X_f] = 1$.

While the previous assumption imposes conditions on equilibrium outcomes, it is quite natural in this context; MME is a sensible equilibrium concept only if fringe firms do not grow unboundedly large. We have the following result.

Proposition B.1. Consider the constant returns to scale model under Assumption B.1, and suppose that for every $m \ge 1$ all firms use MME strategy $\mu^m$. For a given $t$, conditional on the realizations of $\{x_{it} \in \bar X_f \mid i \in F^m_t\}$ and $\hat s^m_t$, we have

$$\frac{1}{N^m}\,\theta^m_{t+1} - \left[\tilde\zeta^m_1(\hat s^m_t)\,\frac{1}{N^m}\,\theta^m_t + \tilde\zeta^m_0(\hat s^m_t)\right] \to 0, \quad \text{a.s.},$$

as $m \to \infty$, where $\theta^m_t = \sum_{i \in F^m_t} x^m_{it} = \sum_{x \in X_f} x f^m_t(x)$, $\tilde\zeta^m_1(\hat s^m_t) = E_{\mu^m}[\zeta_1(\mu^m(\hat s^m_t), \hat s^m_t, w_{1it})]$, and $\tilde\zeta^m_0(\hat s^m_t) = E_{\mu^m}[\zeta_0(\hat s^m_t, w_{0it})]$.

30 The assumption about linear transitions is similar to assuming Gibrat's law in firms' transitions (Sutton, 1997).
31 This result is similar to Lucas and Prescott (1971), which studies a dynamic competitive industry model with assumptions similar to ours, but with deterministic transitions and stochastic demand.

The result shows that the first moment becomes a sufficient statistic for the evolution of the next moment. In particular, the next moment becomes a linear function of the current moment. Heuristically, this suggests defining the following perceived transition kernel:32

$$\hat P_{\mu^m}\left[\tilde\zeta^m_1(\hat s^m_t)\,\theta^m_t + N^m \tilde\zeta^m_0(\hat s^m_t) \,\middle|\, \hat s^m_t\right] = 1, \qquad (13)$$

which will become a good approximation as $m$ grows large. In fact, with this perceived transition kernel and under suitable continuity conditions, one can show that the value of the full information deviation (appropriately normalized) converges to zero as the market size $m$ approaches infinity.33

C Proofs

Proof of Lemma B.1.
Under the assumptions of model (N) of Chapter 9 in Bertsekas and Shreve (1978) we have from Proposition 9.8 that the optimal value function satisfies:

$$(TV)(x, \hat s \mid \mu) := \max_{\iota \in I}\Big\{ x\pi_1(\hat s_0) + \pi_0(\hat s_0) - cx\iota + \beta E_\mu\big[V(x_1, \hat s_1) \,\big|\, \iota, \hat s_0 = \hat s, x_0 = x\big] \Big\} = V(x, \hat s \mid \mu).$$

Moreover, we have that $T^n \hat V \to V$ if $\hat V = 0$ by Proposition 9.14. We first show that

$$\sup_{\mu' \in \mathcal{M}} V(x, \hat s \mid \mu', \mu) = xV^1(\hat s) + V^0(\hat s)$$

for appropriate functions $V^1(\cdot)$ and $V^0(\cdot)$ by demonstrating that the posited form of the perceived value function is stable under an application of the Bellman operator. We have:

$$\begin{aligned}
(T\hat V)(x, \hat s \mid \mu) &= \max_{\iota \in I}\Big\{ x\pi_1(\hat s) + \pi_0(\hat s) - cx\iota + \beta E_\mu\big[(x\zeta_1(\iota, \hat s, w_1) + \zeta_0(\hat s, w_0))\hat V_1(\hat s_{t+1}) + \hat V_0(\hat s_{t+1}) \,\big|\, \iota, \hat s_t = \hat s\big] \Big\} \\
&= x \max_{\iota \in I}\Big\{ -c\iota + \beta E_\mu\big[\zeta_1(\iota, \hat s, w_1)\hat V_1(\hat s_{t+1}) \,\big|\, \iota, \hat s_t = \hat s\big] \Big\} + x\pi_1(\hat s) + \tilde V_0(\hat s) \qquad (14) \\
&= x\tilde V_1(\hat s) + \tilde V_0(\hat s),
\end{aligned}$$

where we define $\tilde V_0(\hat s) = \pi_0(\hat s) + \beta E_\mu\big[\zeta_0(\hat s, w_0)\hat V_1(\hat s_{t+1}) + \hat V_0(\hat s_{t+1}) \,\big|\, \hat s_t = \hat s\big]$.

32 Note that a derivation similar to that in Proposition B.1 will show that any $k$-th moment of fringe firms' states for an integer $k$ would depend on moments $k, k-1, \ldots, 1$ only, as $m$ grows large. Therefore, if higher integer moments are payoff relevant they could be accounted for as well.
33 This result requires appropriate continuity conditions on the model primitives, and also that the equilibrium investment strategy is continuous in the moment-based fringe state. The result is omitted for brevity and can be obtained from the authors upon request.

Now, let us denote by $\hat V^n$ the iterates obtained by applying the Bellman operator $T$. Then, we have concluded that $x\hat V^n_1(\hat s) + \hat V^n_0(\hat s) \to V(x, \hat s)$, $\forall x, \hat s$, as $n \to \infty$, where $V$ is the optimal value function. But since the above holds for at least two distinct values of $x$ for any given $\hat s$, this suffices to conclude that $\hat V^n_1(\hat s) \to \hat V^\infty_1(\hat s)$ and $\hat V^n_0(\hat s) \to \hat V^\infty_0(\hat s)$.
Now, under the additional Assumption C of Chapter 4 in Bertsekas and Shreve (1978), and further assuming that the supremum implicit in the dynamic programming operator applied to $V$ is attained for every $(x, \hat s)$, the second claim follows immediately from equation (14) and Proposition 4.3 of the reference.

Proof of Proposition B.1. Note that

$$\theta^m_{t+1} = \sum_{i=1}^{N^m} x^m_{i,t+1} = \sum_{i=1}^{N^m} \big[x^m_{it}\,\zeta_1(\mu^m(\hat s^m_t), \hat s^m_t, w_{1it}) + \zeta_0(\hat s^m_t, w_{0it})\big],$$

where $(w_{0it}, w_{1it})$ are i.i.d. We need to show that conditional on $s^m_t$,

$$\frac{1}{N^m}\Big(\theta^m_{t+1} - E_{\mu^m}[\theta^m_{t+1} \mid s^m_t]\Big) \to 0, \quad \text{a.s.},$$

where $E_{\mu^m}[\theta^m_{t+1} \mid s^m_t] = \tilde\zeta^m_1(\hat s^m_t)\theta^m_t + N^m \tilde\zeta^m_0(\hat s^m_t)$. The result follows by Assumption B.1 and a law of large numbers (see, for example, Corollary 7.4.1 in Resnick (1998)).

Proof of Proposition 5.1. If $\alpha \ne 1$ redefine the state of fringe firms to be $y = x^\alpha$. Without loss of generality assume $X_f = [0, \bar x]$. Let $s_t, s'_t$ be two consistent underlying industry states, that is, $d_t = d'_t$, $z_t = z'_t$, and $\theta(f_t) = \theta(f'_t) = \theta_t$. For moments to be sufficient statistics it must be that for any such consistent underlying states the expected next moments are the same: $E_\mu[\theta(f_{t+1}) \mid s_t] = E_\mu[\theta(f_{t+1}) \mid s'_t]$. Let us fix a moment-based industry state $\hat s$ and define the function $g(x; \hat s) = E_\mu[x_{i,t+1} \mid x_{it} = x, \hat s_t = \hat s]$. A function $h$ is midpoint convex if for all $x_1, x_2 \in (\underline{x}, \bar x) \subset \mathbb{R}$ the following holds: $h((x_1 + x_2)/2) \le (h(x_1) + h(x_2))/2$. Midpoint concavity is defined by reversing the inequality. A function that is midpoint convex (concave) and bounded on an interval is convex (concave) (see Donoghue (1969)). We will show that $g(\cdot; \hat s)$ is both convex and concave, and so linear, by showing that midpoint convexity and midpoint concavity hold. Note that we need to show this only for $x \le \theta$, as a single fringe firm cannot be larger than the moment. For any $x_0$ in the interior of $X_f$ with $2x_0 \le \theta_t$ we can find (we assumed $N - D \ge 3$) a consistent fringe state $f$ with $f(x_0) \ge 2$ and $\delta > 0$ such that $x_0 \pm \delta \in X_f$.
Construct $f'$ with $f'(x_0) = f(x_0) - 2$, $f'(x_0 - \delta) = f(x_0 - \delta) + 1$, $f'(x_0 + \delta) = f(x_0 + \delta) + 1$, and $f'(x) = f(x)$ for all $x \notin \{x_0, x_0 \pm \delta\}$. Let $s = (f, d, z)$ and $s' = (f', d, z)$. Clearly $\hat s = \hat s'$, because $\theta(f) = \theta(f')$. By assumption

$$E_\mu[\theta(f_{t+1}) \mid s_t] - E_\mu[\theta(f_{t+1}) \mid s'_t] = 2g(x_0; \hat s) - g(x_0 - \delta; \hat s) - g(x_0 + \delta; \hat s) = 0,$$

or $g(x_0; \hat s) = (g(x_0 - \delta; \hat s) + g(x_0 + \delta; \hat s))/2$. That is, $g(x; \hat s)$ is midpoint convex and midpoint concave in the variable $x$, and so linear. For $x_0 \ge \theta/2$ we use the following construction. From the result above we have that $g(\theta_t/2 - \delta; \hat s)$ is linear for $0 \le \delta \le \theta_t/2$. By assumption, $g(\theta_t/2 + \delta; \hat s_t) + g(\theta_t/2 - \delta; \hat s_t) = 2g(\theta_t/2; \hat s_t)$, or $g(\theta_t/2 + \delta; \hat s_t) = 2g(\theta_t/2; \hat s_t) - g(\theta_t/2 - \delta; \hat s_t)$. Substituting for the right hand side we get $g(\theta_t/2 + \delta; \hat s_t) = (\theta_t/2 + \delta)\tilde\zeta_1(\hat s_t \mid \mu) + \tilde\zeta_0(\hat s_t \mid \mu)$ for appropriately defined $\tilde\zeta_0, \tilde\zeta_1$ and for all $0 \le \delta \le \min(\bar x - \theta_t/2, \theta_t/2)$. This completes the proof.

Proof of Theorem 5.1. Assume the claim to be false. It must be that there exists an $\epsilon > 0$ such that for all $n$, there exists an $n' > n$ for which $d(\Gamma, (\mu_{n'}, \lambda_{n'})) > \epsilon$. We may thus construct a subsequence $\{(\tilde\mu_n, \tilde\lambda_n)\}$ for which $\inf_n d(\Gamma, (\tilde\mu_n, \tilde\lambda_n)) > \epsilon$. Now, because the space of strategies is compact, we have that $\{(\tilde\mu_n, \tilde\lambda_n)\}$ has a convergent subsequence; call this subsequence $\{(\mu'_n, \lambda'_n)\}$ and its limit $(\mu^*, \lambda^*)$. We have thus established the existence of a sequence of strategies and entry rate functions $\{(\mu'_n, \lambda'_n)\}$ satisfying:

$$V(x, s \mid BR(\mu'_n, \lambda'_n), \mu'_n, \lambda'_n) - V(x, s \mid \mu'_n, \lambda'_n) \to 0, \quad \forall (x, s) \in X \times S, \qquad (15)$$
$$\lambda'_n(s) - \beta E_{\mu'_n, \lambda'_n}\big[V(x^e, s_{t+1} \mid \mu'_n, \lambda'_n) \,\big|\, s_t = s\big] \to 0, \quad \forall s \in S, \qquad (16)$$
$$(\mu'_n, \lambda'_n) \to (\mu^*, \lambda^*), \qquad (17)$$
$$\inf_n d(\Gamma, (\mu'_n, \lambda'_n)) > \epsilon, \qquad (18)$$

where $BR(\mu, \lambda)$ denotes the best response strategy when competitors play strategy $\mu$ and enter according to $\lambda$. Now, it is simple to show that our assumptions on model primitives guarantee that $V(x, s \mid \mu', \mu, \lambda)$ is continuous in $(\mu', \mu, \lambda)$ for all $(x, s)$.
Moreover, the assumption that, for all competitors' decisions and all terminal values, a firm's one time-step ahead optimization problem to determine its optimal investment has a unique solution yields in addition that $BR(\cdot)$ is continuous on $\mathcal{M} \times \Lambda$. For a proof of this fact, see the proof of Proposition 2 in Doraszelski and Satterthwaite (2010), which in turn employs Lemmas 3.1 and 3.2 of Whitt (1980). Thus, we have from (15), (16), and (17) that

$$V(x, s \mid BR(\mu^*, \lambda^*), \mu^*, \lambda^*) - V(x, s \mid \mu^*, \lambda^*) = 0, \quad \forall (x, s) \in X \times S,$$

and

$$\lambda^*(s) - \beta E_{\mu^*, \lambda^*}\big[V(x^e, s_{t+1} \mid \mu^*, \lambda^*) \,\big|\, s_t = s\big] = 0, \quad \forall s \in S.$$

Hence, $(\mu^*, \lambda^*) \in \Gamma$. But by (17), (18), and the triangle inequality, $d(\Gamma, (\mu^*, \lambda^*)) > \epsilon$, a contradiction. The result follows.

Lemma C.1. The operator $T_R$ satisfies the following properties:
1. $T_R$ is a contraction mapping with modulus $\beta$. That is, for $\hat V, \hat V' \in \hat{\mathcal{V}}$, $\|T_R \hat V - T_R \hat V'\|_\infty \le \beta \|\hat V - \hat V'\|_\infty$.
2. The equation $T_R \hat V = \hat V$ has a unique solution $\hat V^*$.
3. $\hat V^* = \lim_{k \to \infty} T_R^k \hat V$ for all $\hat V \in \hat{\mathcal{V}}$.

Proof. Our result is based on a special case of Iyengar (2005) (see Theorem 3.2). However, to allow for greater generality in the state space and action space we use a different proof based on classic dynamic programming results. Define

$$(\hat T_R \hat V)(x, \hat s) = \sup_{f \in S_f(\hat s),\ \iota \in I,\ \rho \ge 0} \Big\{ \pi(x, \hat s) + E\Big[\phi \mathbf{1}\{\phi \ge \rho\} + \mathbf{1}\{\phi < \rho\}\Big(-c(x, \iota) + \beta E_{\mu, \lambda}\big[\hat V(x_{i,t+1}, \hat s_{t+1}) \,\big|\, x_{it} = x, s_t = (f, d, z), \iota\big]\Big)\Big] \Big\}. \qquad (19)$$

The statement of the lemma holds for $\hat T_R$ from the results for model (D) in Bertsekas and Shreve (1978) (recall that we assumed bounded profits). The statement follows for $T_R$ since $\hat T_R \hat V = T_R \hat V$ for all bounded $\hat V$.

Proof of Theorem 9.1. Take vectors $\hat V \in \hat{\mathcal{V}}$ and $V$, where $V : X \times S \to \mathbb{R}$, such that $V(x, s) = \hat V(x, \hat s)$ for every moment-based industry state $\hat s$ that is consistent with industry state $s$. Let $T^*$ be the Bellman operator associated with the full Markov best response that keeps track of the entire industry state.
It is simple to observe that $T^* V(x, s) \le T_R \hat V(x, \hat s)$ for all $x$, and all $\hat s$ and $s$ consistent. By the monotonicity of the $T_R$ and $T^*$ operators we conclude that $(T^*)^k V(x, s) \le T_R^k \hat V(x, \hat s)$ for all $k \ge 1$. Taking $k$ to infinity we get $T_R^k \hat V \to \hat V^*$ from Lemma C.1, and $(T^*)^k V \to V^*$ by standard dynamic programming arguments. Therefore, $V^*(x, s) \le \hat V^*(x, \hat s)$ for $s$ and $\hat s$ consistent.

D Real Time Algorithm for MME with the 'Observed Transitions' Kernel

Given strategies $(\mu, \lambda)$ and their associated value functions, it is useful to define

$$W(x, \hat s \mid \mu, \lambda) = E_{\mu, \lambda}\big[\hat V(x, \hat s_{t+1} \mid \mu, \lambda) \,\big|\, \hat s_t = \hat s\big], \qquad (20)$$

where the expectation is taken with respect to the perceived transition kernel. The function $W$ is the expected continuation value starting from industry state $\hat s$ and landing in state $x$ in the next period. Note that we only integrate over the possible transitions of $\hat s$, excluding the firm's own transition to $x$. It is worth emphasizing that if $x \in X_d$ in (20) then $\hat s_{t+1}$ depends on $x$, whereas if $x \in X_f$ the next industry state, $\hat s_{t+1}$, is independent of $x$. Namely, dominant firm $i$ that transitions to $x$ will integrate over $(\theta_{t+1}, d_{-i,t+1}, z_{t+1})$ with $d_{t+1} = (x_{i,t+1}, d_{-i,t+1})$. Now, we can write the Bellman equation associated with C1 in the equilibrium definition as follows:

$$\hat V(x, \hat s; W) = \sup_{\iota \in I,\ \rho \ge 0} \Big\{ \pi(x, \hat s) + E\Big[\phi \mathbf{1}\{\phi \ge \rho\} + \mathbf{1}\{\phi < \rho\}\Big(-c(x, \iota) + \beta E_{\mu, \lambda}\big[W(x_{i,t+1}, \hat s) \,\big|\, x_{it} = x, \iota\big]\Big)\Big] \Big\},$$

where the first expectation is taken with respect to the sell-off random value $\phi$, and the second with respect to the firm's transition under investment level $\iota$. Note that when evaluated at the optimal value function, the function $W$ is sufficient to compute a best response strategy.

Based on this formulation we introduce our real-time dynamic programming algorithm to compute MME; see Algorithm 2. The steps of the algorithm are as follows. We begin with $W$ functions with which firms' optimal decisions can be computed (investment, exit, and entry).
Once these are determined, we can simulate the next industry state. The continuation value in the simulated state is used to update the $W$ functions. The update step depends on the number of times the state was visited, $e(\hat s)$, and on the number of rounds (we take $\sigma(n) = \min(n, \bar n)$ for some integer $\bar n > 0$ to allow for quick updating in the early rounds).

Algorithm 2 Equilibrium solver with real-time dynamic programming
  Initiate $W(x, \hat s) := 0$ for all $(x, \hat s) \in X \times \hat S$; $e(\hat s) = 0$ for all $\hat s \in \hat S$;
  Initiate industry state $(f_0, d_0, z_0)$ and $\hat s_0 := (\theta(f_0), d_0, z_0)$;
  $\Delta_w := \delta_w + 1$; $n := 1$;
  while $\Delta_w > \delta_w$ do
    $W'(x, \hat s) := W(x, \hat s)$ for all $(x, \hat s) \in X \times \hat S$;
    $t := 1$;
    while $t \le T$ do
      for all $x$ with $f_t(x) > 0$ or $x \in d_t$ do
        Compute optimal strategies using $\hat V(x, \hat s_t; W')$ and store them;
      end for;
      Compute optimal entry cutoff from $\hat V(x^e, \hat s_t; W)$ and store it;
      Simulate $(f_{t+1}, d_{t+1}, z_{t+1})$ and $\hat s_{t+1}$ from these strategies;
      Let $\gamma := 1/(\sigma(n) + e(\hat s_t))$;
      for all $x' \in X_f$ do
        Compute $\hat V(x', \hat s_{t+1}; W')$;
        Update $W(x', \hat s_t) := \gamma \hat V(x', \hat s_{t+1}; W') + (1 - \gamma) W'(x', \hat s_t)$;
      end for;
      for all dominant firms $i$ and $x' \in X_d$ that is accessible in one step from $x_{it}$ do
        Define $\hat s'_{t+1}$ to be the industry state $\hat s_{t+1}$ when firm $i$ transitions to state $x'$;
        Compute $\hat V(x', \hat s'_{t+1}; W')$;
        Update $W(x', \hat s_t) := \gamma \hat V(x', \hat s'_{t+1}; W') + (1 - \gamma) W'(x', \hat s_t)$;
      end for;
      $e(\hat s_t) := e(\hat s_t) + 1$, $t := t + 1$;
    end while;
    $\Delta_w := \|W' - W\|_\infty$;
    $e(\hat s) := 0$ for all $\hat s \in \hat S$;
    $(f_0, d_0, z_0) := (f_{K+1}, d_{K+1}, z_{K+1})$, and $\hat s_0 := (\theta(f_0), d_0, z_0)$;
    $n := n + 1$;
  end while;
  Compute $\mu(x, \hat s)$ and $\lambda(\hat s)$ from $\hat V(x, \hat s; W)$ for all $(x, \hat s) \in X \times \hat S$;
  Run Algorithm 1 with these strategies (used as stopping criteria);

At the end of the simulation/optimization phase we check whether the $W$ functions have converged, and if so we check for convergence using Algorithm 1 as well.
If convergence has not occurred we iterate on the simulation/optimization phase.

E Beer Industry Numerical Experiments

Denote by $x_{it}$ the goodwill of firm $i$ at time $t$. The evolution of goodwill is similar to Pakes and McGuire (1994), but with a multiplicative growth model following Roberts and Samuelson (1988):

$$x_{i,t+1} = \begin{cases}
\min(x_{\bar n},\, x_{it}(1 + \rho)) & \text{w.p. } \dfrac{\delta \psi(x) \iota_{it}}{1 + \psi(x) \iota_{it}}, \\[2ex]
x_{it} & \text{w.p. } \dfrac{1 - \delta' + (1 - \delta)\psi(x) \iota_{it}}{1 + \psi(x) \iota_{it}}, \\[2ex]
\max(x_1,\, x_{it}/(1 + \rho)) & \text{w.p. } \dfrac{\delta'}{1 + \psi(x) \iota_{it}}.
\end{cases}$$

This is equivalent to a depreciation factor $1/(1 + \rho)$, as is common in the literature on goodwill. With this in mind we define a grid of states $\{x_1, \ldots, x_{\bar n}\}$ for the possible values of goodwill firms can take, where $x_k = x_1(1 + \rho)^{k-1}$ for some $x_1 > 0$. To maintain the relationship between goodwill and advertising costs, we choose the parameter $\psi(x)$ such that $E[x_{i,t+1} \mid x_{it} = x, \iota_{it} = x] = x$, that is, a firm with goodwill $x$ has to invest $x$ dollars in advertising to maintain goodwill level $x$ on average. Equating the expected gain $\delta \psi(x) x \cdot \rho x$ with the expected loss $\delta' \cdot \rho x/(1 + \rho)$ at an interior grid point, it follows that

$$\psi(x) = \frac{\delta'}{\delta} \cdot \frac{1 - (1 + \rho)^{-1}}{\rho x} = \frac{\delta'}{\delta (1 + \rho) x}.$$

Under this condition, the average goodwill (state) of a firm that invests $x_{it}$ in every period is approximately $x_{it}$. The moment space is discretized linearly and we use a bicubic spline to interpolate between grid points when computing firms' optimal strategies.

We use the profit function of Example 7.1. We take $\beta = .925$ and $(\delta', \delta) = (1, .55)$ as the transition parameters. After some experimentation we choose the market size $m = 30$. The other parameters are listed in the next table with their relevant sources.

Description | Value | Source
Number of firms ($N$) | 200 | This number is chosen to be greater than the maximal number of active firms in this period.
Maximal number of dominant firms ($D$) | 3 | This is the actual number of dominant firms in the industry.
Depreciation of goodwill ($\rho$) | .25 | Roberts and Samuelson (1988) estimate this by .2 for the cigarette market.
Production cost per barrel ($c$) | $120 | Rojas (2008) estimated the markup to be about a third of the price, and the average price is $165 per barrel in the period studied.
Average entry cost (exponential dist.) | 35 × 10^6 | Based on costs of new plants.
Sell-off value (exponential dist.) | 7 × 10^6 | Based on sales prices of used plants.
Fixed cost per period, fringe (double for dominant) | 10^6 | We introduce this fixed cost in the single-period profit function.
Profit function parameters ($Y$ and $\alpha_2$) | 200 and 1, resp. | Chosen to match the price elasticity (-.5) for the average price; see Tremblay and Tremblay (2005, p. 23).

F Computation of Robust Bound

The remainder of the appendix provides details on the computation of the robust bound under the assumption that there is a large number of fringe firms and therefore the one-step transition of the fringe state is assumed to be deterministic. When transitions are deterministic, finding the optimal consistent $f$ in (12) is equivalent to choosing the next moment from an accessibility set of moments that can be reached from the current industry state. As we will explain in detail now, this considerably simplifies the computation of the inner maximization in (12), since the moment accessibility sets are low dimensional. Moreover, characterizing these accessibility sets can be done efficiently.

We assume throughout that the set of individual fringe firm states is discrete and univariate, $X_f = \{x_1, \ldots, x_{\bar n}\}$, where $x_n \in \mathbb{R}$ for all $n = 1, \ldots, \bar n < \infty$.34 In addition, for simplicity, we assume that there are no transitions between the dominant and fringe tiers.

Finding the set of accessible moments amounts to solving an integer feasibility problem. To simplify, let us assume that $\theta$ consists of only one moment of the form $\theta = \sum_{x \in \tilde X_f} f(x)\, x^\alpha$ for $\alpha \ge 0$, where $\tilde X_f \subseteq X_f$. This representation is general enough to include moments and other statistics. The extension to more moments is direct.
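To make the single-moment representation concrete, the following minimal sketch evaluates $\theta = \sum_{x \in \tilde X_f} f(x)\, x^\alpha$ for a fringe state stored as a histogram. The function name, the dictionary layout, and the numbers are illustrative assumptions. The three calls show how the same form yields the number of fringe firms ($\alpha = 0$), the first un-normalized moment ($\alpha = 1$), and a count of firms in a subset of states ($\alpha = 0$ with $\tilde X_f \subsetneq X_f$).

```python
def fringe_moment(f, alpha, subset=None):
    """Return sum over x in subset of f(x) * x**alpha.

    f: dict mapping an individual state x to the number of fringe firms in x.
    subset: states included in the sum (the set X-tilde_f); None means all of X_f.
    """
    states = f.keys() if subset is None else subset
    return sum(f.get(x, 0) * x ** alpha for x in states)

# Hypothetical fringe state: 4 firms at x=1, 3 firms at x=2, 1 firm at x=4.
f = {1.0: 4, 2.0: 3, 4.0: 1}
num_firms = fringe_moment(f, alpha=0)                     # number of fringe firms
first_moment = fringe_moment(f, alpha=1)                  # sum of fringe states
num_large = fringe_moment(f, alpha=0, subset=[2.0, 4.0])  # firms with x >= 2
```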
34 The extension to multivariate $x_n$ is simple, and the extension to a continuous state space will require discretization.

Assuming that the next fringe state is deterministic and equal to its expected value, we say that moment $\theta'$ is accessible from moment $\theta$ in industry state $\hat s$ if there exists a fringe state $f \in S_f(\hat s)$ that solves the following system of linear equations,

$$\begin{aligned}
&\sum_{x \in \tilde X_f} f(x)\, x^\alpha = \theta, \\
&\sum_{x \in X_f} f(x)\, E_\mu\big[(x_{i,t+1})^\alpha \,\big|\, (x_{it}, \hat s_t) = (x, \hat s)\big] + \mathbf{1}\{x^e \in \tilde X_f\}\,(x^e)^\alpha\, P(\kappa_{it} \le \lambda(\hat s))\, N^e(f) = \theta', \qquad (21) \\
&\sum_{x \in X_f} f(x) \le N, \quad f \in \mathbb{N}^{\bar n},
\end{aligned}$$

where the first equation states that the current moment is consistent with the fringe state, and the second that the expected next moment is $\theta'$. We denote by $N^e(f)$ the number of potential entrants at fringe state $f$. We say that moment $\theta'$ is accessible from $\hat s$ if this system of linear equations has a solution. This motivates the definition of the accessibility set $A(\hat s)$, where $\theta' \in A(\hat s)$ if and only if it is accessible from $\hat s$, that is, if there is a fringe state consistent with $\hat s$ such that the expected next moment is $\theta'$. Note that the accessibility sets depend on the investment and entry MME strategies that control the expected next moment in (21).

Due to the integrality constraint $f \in \mathbb{N}^{\bar n}$, the computation of $A(\hat s)$ is demanding. However, we can relax this integrality constraint by replacing it with $f \ge 0$. With that, the accessibility problem amounts to checking the feasibility of a system of linear equations and inequalities, which can be done easily. We denote the relaxed accessibility set by $\hat A(\hat s)$; this set contains $A(\hat s)$. Define the operator

$$(\hat T \hat V)(x, \hat s) = \sup_{\iota \in I,\ \rho \ge 0}\ \sup_{\theta' \in \hat A(\hat s)} \Big\{ \pi(x, \hat s) + E\Big[\phi \mathbf{1}\{\phi \ge \rho\} + \mathbf{1}\{\phi < \rho\}\Big(-c(x, \iota) + \beta E_{\mu, \lambda}\big[\hat V(x_{i,t+1}, (\theta', d_{t+1}, z_{t+1})) \,\big|\, x_{it} = x, \hat s_t = \hat s, \iota\big]\Big)\Big] \Big\},$$

where $\hat V \in \hat{\mathcal{V}}$. The next proposition states that we can search over accessibility sets instead of the much larger consistency sets.

Proposition F.1.
Assume that there are no transitions between tiers and that the fringe state follows deterministic transitions given by the expected next state (equation (21)). Then $(T_R \hat V) \le (\hat T \hat V)$ for every $\hat V \in \hat{\mathcal{V}}$. Moreover, the operator $\hat T$ satisfies the same properties as the operator $T_R$ given in Lemma C.1.

Proof. Without tier transitions the evolution of the moments is independent of the evolution of dominant firms. If fringe firms' transitions are also deterministic, each $f \in S_f(\hat s)$ maps to an expected next moment, and so we can optimize over the set of these moments instead. The inequality follows because we consider the relaxed accessibility sets, so we optimize over a larger set in $\hat T$.

Based on this, we propose the following computationally tractable algorithm to find the robust error bound: (i) construct the relaxed accessibility sets by solving the relaxed feasibility problem for all $\hat s \in \hat S$ and $\theta' \in S_\theta$ and store them; and (ii) iterate the operator $\hat T$ over the relaxed accessibility sets until a fixed point is found. Since the relaxed accessibility sets contain the accessibility sets, this provides an upper bound to $V^*$, assuming deterministic fringe transitions (by Theorem 9.1 together with the previous proposition). For problems for which MME is solvable this procedure is generally computationally manageable. In addition, this procedure can be easily extended to accommodate tier transitions between dominant and fringe firms. In this case, however, the robust bound typically becomes looser.
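Step (i) can be illustrated in a simplified setting. The sketch below assumes a single moment with $\tilde X_f = X_f$, drops the entry term in (21), and requires at least one nonzero weight $x^\alpha$; these are simplifications for the example, not the authors' implementation. Under them the relaxed problem has only two constraints besides $f \ge 0$, so the expected next moment is a linear function on a bounded polytope and the relaxed accessibility set is an interval. Its endpoints are attained at vertices, which have at most two occupied states, so enumerating those vertices suffices. All names are hypothetical.

```python
from itertools import combinations

def relaxed_accessible_interval(a, g, theta, n_max, tol=1e-9):
    """Range of sum_k f_k*g[k] over {f >= 0, sum_k f_k*a[k] = theta, sum_k f_k <= n_max}.

    a[k]: x_k**alpha, the contribution of a firm in state k to the current moment.
    g[k]: E[x_{t+1}**alpha | x_t = x_k, s_hat] under the given strategy.
    Returns (lo, hi), or None if no relaxed fringe state is consistent with theta.
    """
    values = []
    n = len(a)
    # Vertices with one occupied state (the firm-count constraint is slack).
    for k in range(n):
        if abs(a[k]) > tol:
            f_k = theta / a[k]
            if -tol <= f_k <= n_max + tol:
                values.append(f_k * g[k])
    # Vertices with two occupied states (the firm-count constraint binds).
    for i, j in combinations(range(n), 2):
        if abs(a[i] - a[j]) > tol:
            f_i = (theta - a[j] * n_max) / (a[i] - a[j])
            f_j = n_max - f_i
            if f_i >= -tol and f_j >= -tol:
                values.append(f_i * g[i] + f_j * g[j])
    return (min(values), max(values)) if values else None

def relaxed_accessible_grid(a, g, theta, n_max, theta_grid, tol=1e-9):
    """Grid points of the moment space that lie in the relaxed accessibility set."""
    interval = relaxed_accessible_interval(a, g, theta, n_max, tol)
    if interval is None:
        return []
    lo, hi = interval
    return [th for th in theta_grid if lo - tol <= th <= hi + tol]

# Hypothetical example: three states, alpha = 1, at most five fringe firms.
a = [1.0, 2.0, 4.0]
g = [1.1, 2.0, 3.6]
lo, hi = relaxed_accessible_interval(a, g, theta=8.0, n_max=5)
reachable = relaxed_accessible_grid(a, g, 8.0, 5, [7.0, 7.5, 8.0, 8.5])
```

With several moments, or with the entry term included, the vertex enumeration no longer applies directly, and the feasibility check would typically be delegated to a linear programming solver.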