A Framework for Dynamic Oligopoly in Concentrated Industries∗
Bar Ifrach†
Gabriel Y. Weintraub‡
This Version: August 2014; First Version: August, 2012
Abstract
In this paper we introduce a new computationally tractable framework for Ericson and Pakes (1995)-style dynamic oligopoly models that overcomes the computational complexity involved in computing
Markov perfect equilibrium (MPE). First, we define a new equilibrium concept that we call moment-based Markov equilibrium (MME), in which firms keep track of their own state, the detailed state of
dominant firms, and a few moments of the distribution of fringe firms’ states. Second, we provide guidelines to use MME in applied work and illustrate with an application how it can endogenize the market
structure in a dynamic industry model even with hundreds of firms. Third, we develop a series of results
that provide support for using MME as an approximation. We present numerical experiments showing
that MME approximates MPE for important classes of models. Then, we introduce novel unilateral deviation error bounds that can be used to test the accuracy of MME as an approximation in large-scale
settings. Overall, our new framework opens the door to study novel issues in industry dynamics.
1 Introduction
Ericson and Pakes (1995)-style dynamic oligopoly models (hereafter, EP) offer a framework for modeling
dynamic industries with heterogeneous firms. The main goal of the research agenda put forward by EP
was to conduct empirical research and evaluate the effects of policy and environmental changes on market
outcomes in different industries. The importance of evaluating policy outcomes in a dynamic setting and
the broad flexibility and adaptability of the EP framework has generated many applications in industrial
organization and related fields (see Doraszelski and Pakes (2007) for an excellent survey).
Despite the broad interest in dynamic oligopoly models, there remain significant hurdles in applying
them to problems of interest. Dynamic oligopoly models are typically analytically intractable; hence, numerical methods are necessary to solve for their Markov perfect equilibrium (MPE). With recent estimation
methods, such as Bajari et al. (2007), it is no longer necessary to solve for the equilibrium in order to
structurally estimate a model. However, in the EP framework solving for MPE is still essential to perform
counterfactuals and evaluate environmental and policy changes. The complexity of this computation has
severely limited the practical application of EP-style models to industries with just a handful of firms, far
fewer than in the real-world industries at which the analysis is directed. Such limitations have made it difficult to
construct realistic empirical models.
∗ We would like to thank Vivek Farias for numerous fruitful conversations, valuable insights, important technical input, and
careful reading of the paper. We have had helpful conversations with Lanier Benkard, Jeff Campbell, Allan Collard-Wexler, Dean
Corbae, Pablo D’Erasmo, Uli Doraszelski, Przemek Jeziorski, Ramesh Johari, Boyan Jovanovic, Sean Meyn, Ariel Pakes, John
Rust, Bernard Salanie, Minjae Song, Ben Van Roy, Daniel Xu, as well as seminar participants at U. of Chicago, Columbia, Harvard,
Maryland, Pittsburgh, UT Austin, U. of Chile, and various conferences. We thank Yair Shenfeld for exemplary research assistance.
We are grateful to the editor Stéphane Bonhomme and three anonymous referees for constructive feedback that greatly improved
the paper.
† Airbnb, [email protected]
‡ Columbia Business School, [email protected]
Thus motivated, we propose a new computationally tractable model to study industry dynamics in settings with a few dominant firms and many fringe firms; in applications, dominant firms would typically be
the market share leaders. A market structure in which a few firms holding significant market share co-exist
with many small firms is prevalent in many industries, such as supermarkets, banking, beer, cereals, and
heavy duty trucks, to name a few (see, e.g., concentration ratios from the U.S. Economic Census). These
industries are intractable in the standard EP framework due to the large number of fringe firms. Although
individual fringe firms have negligible market power, they may have significant cumulative market share
and may collectively discipline dominant firms’ behavior. Our model and methods capture these types of
interactions and therefore significantly expand the set of industries that can be analyzed computationally.
In an EP-style model, each firm is distinguished by an individual state at every point in time. The
industry state is a vector (or ‘distribution’) encoding the number of firms in each possible value of the
individual state variable. Assuming its competitors follow a prescribed strategy, a given firm selects, at
each point in time, an action (e.g., an investment level) to maximize its expected discounted profits. The
selected action will depend in general on the firm’s individual state and the industry state. Even if firms
were restricted to anonymous and symmetric strategies, the computation entailed in selecting such an action
quickly becomes infeasible as the number of firms and individual states grows. This renders commonly used
dynamic programming algorithms to compute MPE infeasible for many problems of practical interest.
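To give a rough sense of this growth, the sketch below counts anonymous industry states, that is, vectors of firm counts over individual states whose entries sum to at most N. By a standard stars-and-bars argument this count is C(N + |X|, |X|). The function name and the sample sizes are illustrative assumptions rather than values taken from the paper, and the count ignores the aggregate shock and the firm's own state.

```python
from math import comb


def num_industry_states(max_firms: int, num_individual_states: int) -> int:
    """Count vectors s in N^{|X|} with sum(s) <= max_firms.

    Adding one slack coordinate turns the inequality into an equality,
    so the count is C(max_firms + |X|, |X|) by stars and bars.
    """
    return comb(max_firms + num_individual_states, num_individual_states)


if __name__ == "__main__":
    # Hypothetical sizes: 10 individual states, increasingly many firms.
    for n in (5, 20, 100):
        print(n, num_industry_states(n, 10))
```

A dynamic programming algorithm has to visit each of these states for each value of the firm's own state, which is why the count matters in practice.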
The first main contribution of this work is the introduction of a new framework and equilibrium concept that overcomes the computational complexity involved in computing MPE. In an industry with many
competitors, it may be reasonable to assume that firms are more sensitive to changes in the dominant firms’
states. Further, it may be unrealistic to expect that firms have resources to keep track of the evolution of all
rivals. Thus motivated, we postulate a plausible model of firms’ behavior: firms closely monitor dominant
firms, but do not monitor fringe firms as closely. Specifically, we assume that firms’ strategies depend on:
(i) their own individual state; (ii) the detailed state of dominant firms; and (iii) a small number of aggregate
statistics (such as a few moments) of the distribution of fringe firms. We refer to the fringe firms’ distribution or state interchangeably. We call these strategies moment-based strategies, where we use the term
‘moments’ generically, as firms could keep track of other statistics of the distribution of fringe firms, such
as un-normalized moments or quantiles.
Based on these strategies, we introduce an equilibrium concept that we call moment-based Markov equilibrium (MME). In MME, firms’ beliefs about the evolution of the industry satisfy a consistency condition,
and firms employ an optimal moment-based strategy given these beliefs. One challenge in defining this
concept, however, is that fringe moments may not be sufficient statistics to predict the future evolution of
the industry. For example, suppose firms keep track of the first moment of the fringe state. For a given
value of the first moment, there could be many different fringe distributions consistent with that value, from
which the future evolution of the industry could vary greatly. Technically, the issue that arises is that the
stochastic process of moments may not be a Markov process even if the underlying dynamics are. Therefore, to define MME we introduce a ‘Markov perceived transition kernel’ to approximate the dynamics of
the (non-Markov) moment-based state; we provide natural and concrete examples for such a kernel. Notably, this construction allows for fringe firms to become dominant as they grow large and vice-versa, fully
endogenizing the market structure in equilibrium.
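The sufficiency problem described above can be seen in a toy calculation. The sketch below assumes, purely for illustration, that each fringe firm moves up one state with a probability that depends on its current state and otherwise stays put; the probabilities and the two fringe distributions are hypothetical. Both distributions share the same first moment, yet the expected first moment one period ahead differs, so the current first moment alone does not pin down its own future.

```python
# Hypothetical up-move probabilities by individual state (illustrative only).
UP_PROB = {0: 0.9, 1: 0.5, 2: 0.2, 3: 0.1, 4: 0.0}


def first_moment(fringe):
    """Unnormalized first moment: sum over x of x * (number of firms at x)."""
    return sum(x * n for x, n in fringe.items())


def expected_next_first_moment(fringe):
    """Expected first moment next period when a firm at x moves to x + 1
    with probability UP_PROB[x] and otherwise stays at x."""
    return sum(n * (x + UP_PROB[x]) for x, n in fringe.items())


fringe_a = {2: 2}        # two firms, both at state 2
fringe_b = {0: 1, 4: 1}  # one firm at state 0, one at state 4

if __name__ == "__main__":
    for name, fringe in (("A", fringe_a), ("B", fringe_b)):
        print(name, first_moment(fringe), expected_next_first_moment(fringe))
```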
After defining MME, our second main contribution is a series of guidelines to use MME in applied
work. We introduce an algorithm that efficiently computes MME even in industries with hundreds of firms,
provided that firms keep track of a few moments of the fringe state and there are a few dominant firms. In
addition, we discuss important implementation issues for the use of MME in practice, such as how to specify
the perceived transition kernel, and how to select moments and the number of dominant firms allowed.
While MME may be appealing intuitively, moment-based strategies may not be close to a best response,
and therefore, MME may not be close to a subgame perfect equilibrium in general. The reason is that because moments may not be sufficient statistics to predict the future evolution of the industry, they may not
induce a sufficient partition of histories and may not summarize all payoff-relevant history in the sense of
Maskin and Tirole (2001). Indeed, observing the full distribution of fringe firms may provide valuable information for decision making. Our third main contribution is a series of results that addresses this difficulty,
providing support for using MME as an approximation.
First, we explore the performance of MME as a low-dimensional approximation to MPE. We perform
a large set of numerical experiments comparing MME and MPE outcomes in industries with a few firms
and a few individual states per firm, for which MPE can be computed. We consider two important dynamic
oligopoly models: a quality ladder model and a capacity competition model. We show that MME is able
to provide accurate approximations to MPE across a wide range of parameter values even when firms keep
track of a single and natural fringe moment. For example, in the capacity competition model this moment is
the total installed capacity among all fringe firms.
Alternatively, MME can be thought of as an appealing heuristic and behavioral model on its own. To
formalize this interpretation, we define the ‘value of the full information deviation’ that measures the extent
of sub-optimality of MME strategies in terms of a unilateral deviation to a strategy that keeps track of the
full industry state, similar to the notion of ε-equilibrium. If this value is small, MME may be an appealing
behavioral model, as unilaterally deviating to more complex strategies does not significantly increase profits,
and doing so may involve costs associated with gathering and processing more information. We provide a
theorem showing that if the value of the full information deviation becomes small (e.g., by including more
moments in MME), then MME strategies approach MPE strategies.
Measuring the value of the full information deviation is intractable in large-scale applications with many
firms; in these instances, computing a best response is almost as computationally challenging as solving for
an MPE. Of course, in these cases we could not compare MME with MPE either. To address this limitation,
using ideas from robust dynamic programming, we propose a novel computationally tractable upper bound
to the value of the full information deviation. This bound is useful because it allows one to evaluate whether
the state aggregation is appropriate or whether a finer state aggregation is necessary, for example, by adding
more moments. We provide numerical experiments showing how the error bound works in practice.
Our fourth and last contribution illustrates the applicability of our approach, showing how it can be used
to endogenize the market structure in a fully dynamic model with hundreds of firms for which MPE would be
impossible to compute. In particular, we perform numerical experiments motivated by the long trend towards
concentration in the beer industry in the US during the years 1960-1990. Over the course of those years, the
number of active firms dropped dramatically, and three industry leaders emerged. One common explanation
of this trend is the emergence of national TV advertising as an ‘endogenous sunk cost’ (Sutton, 1991). We
build and calibrate a dynamic advertising model of the beer industry and use our methods to determine how
a single parameter related to the returns to advertising expenditures critically affects the resulting market
structure and the level of concentration in an industry with around two hundred firms overall. We also show
how MME generates rich strategic interactions between dominant and fringe firms, in which, for example,
dominant firms make investment decisions to deter the entry and growth of fringe firms. As further evidence
of the usefulness of our approach, Corbae and D’Erasmo (2014) have already used our framework to study
the impact of capital requirements on market structure in a calibrated model of banking industry dynamics
with dominant and fringe banks.
In summary, our approach offers a computationally tractable model for industries with a dominant/fringe
market structure, opening the door to studying novel issues in industry dynamics. As such, our framework
greatly increases the applicability of dynamic oligopoly models. The rest of this paper is organized as follows. We discuss related literature in Section 2. Section 3 describes our dynamic oligopoly model. Section
4 introduces our new equilibrium concept and Section 5 discusses the interpretations of MME as an approximation. Section 6 introduces an algorithm to compute MME as well as details on the perceived transition
kernel. Section 7 provides a numerical comparison between MME and MPE and discusses important implementation issues. Section 8 describes the application to the beer industry and Section 9 introduces our error
bounds. Finally, Section 10 provides extensions and concludes.
2 Related Literature
Previous work has already recognized the challenges involved in applying dynamic oligopoly models in
practice. For example, researchers have used different approaches to overcome the burden involved in
computing equilibria in empirical applications. Some papers empirically study industries that contain only
a few firms in which exact MPE computation is feasible if the number of individual states assumed is
moderate (e.g., Benkard (2004), Ryan (2012), Collard-Wexler (2013a), and Collard-Wexler (2011)). Other
researchers structurally estimate models in industries with many firms using approaches that do not require
MPE computation, and do not perform counterfactuals that require computing equilibria (e.g., Benkard et al.
(2010)). Our work provides a method to perform counterfactuals in concentrated industries with many firms.
Other papers perform counterfactuals computing MPE but in reduced-size models compared to the actual
industry. These models include a few dominant firms and ignore the rest of the (fringe) firms (for example,
see Ryan (2012) and Gallant et al. (2011)). Other papers coarsely discretize the individual states (e.g.,
Collard-Wexler (2013b) and Corbae and D’Erasmo (2013) assume homogeneous firms). Finally, some papers
allow for richer heterogeneity but assume a simplified model of dynamics in which a certain process of
‘moments’ that summarize the industry state information is Markov (e.g., see Kalouptsidi (2014), Jia and
Pathak (2012), Santos (2010), and Tomlin (2008); Lee (2013) uses a similar approach in a dynamic model
of demand with forward-looking consumers). Our methods can help researchers determine the validity of
these simplifications.
A stream of empirical literature related to our work uses simplified notions of equilibrium for estimation
and counterfactuals. In particular, Xu (2008), Iacovone et al. (2013), Qi (2013), and Saeedi (2013) among
others use the notion of oblivious equilibrium (OE) introduced by Weintraub et al. (2008), in which firms
assume the average industry state holds at any time. OE can be shown to approximate MPE in industries
with many firms by a law of large numbers, provided that the industry is not too concentrated. Our approach
extends this type of analysis to industries that are more concentrated.
Our work is related to recent methodological work that tries to overcome the complexity involved in
computing MPE. For example, Pakes and McGuire (2001) proposes a stochastic approximation algorithm
that solves the model only on a recurrent class of states, and reduces the computation time at each state
through simulation. Doraszelski and Judd (2011) suggests casting the game in continuous time, which
greatly reduces the computational cost at each state by reducing the number of future states reachable from
each current state (see also Jeziorski (2013) for an application of this technique). Other papers propose
different computational approaches to approximate MPE. Farias et al. (2012) develops a method based on
approximate dynamic programming with value function approximation (see also Sweeting (2013) for an
empirical application using value function approximation). Santos (2012) introduces a state aggregation
technique based on quantiles of the industry state distribution.
Our work is particularly related to Benkard et al. (2014), which extends the notion of oblivious equilibrium to include dominant firms. In that paper, there is a fixed and pre-determined set of dominant firms,
which all firms in the industry monitor. Firms assume that at every point in time the fringe state is equal to
the expected state conditional on the state of dominant firms. Our paper builds on and generalizes Benkard
et al. (2014)’s idea of considering a dominant/fringe market structure.1 In particular, our approach is more
general in at least two important dimensions: (i) firms keep track not only of the dominant firms’
states but also of statistics that summarize the fringe firms’ states; and (ii) the set of dominant firms arises endogenously in equilibrium. That is, we allow for firms to transition from the fringe to the
dominant tier and vice-versa, and therefore, we fully endogenize the market structure and explicitly model
how firms grow to become dominant. We provide more details about this connection in Section 4.6.
We introduce MME as a restriction to firms’ strategies in which the only dependence on the fringe state
is through a few moments. In this sense, MME limits firms’ information sets and therefore could be cast as
an equilibrium concept in a game of asymmetric information even if the underlying economic model is not.
Hence, while their model and motivation are fundamentally different from ours, MME is related to the notion
of experience-based equilibria of Fershtman and Pakes (2012), which weakens the restrictions imposed by
perfect Bayesian equilibria for dynamic games of asymmetric information. We discuss this relation in more
detail in Section 4.6.
Finally, our moment-based strategies are similar to, and follow the same spirit as, the seminal paper by
Krusell and Smith (1998), which replaces the distribution of wealth across agents in the economy with its
moments when computing stationary stochastic equilibrium in a stochastic growth model. While our main
focus has been on dynamic oligopoly models, our methods can also be used to find better approximations in
stochastic growth models, as well as in macroeconomic dynamic industry models with an infinite number of
heterogeneous firms and an aggregate shock (see, e.g., Khan and Thomas (2008) and Clementi and Palazzo
(2014)). We discuss these extensions in Section 10.2.
1 In their analysis of a stylized model of dynamic mergers, Gowrisankaran and Holmes (2004) also provide an earlier example
of the dominant/fringe dichotomy.
3 Dynamic Oligopoly Model and Solution Concept
In this section we formulate a model of industry dynamics in the spirit of Ericson and Pakes (1995). Similar
models have been applied to numerous settings in industrial organization such as advertising, auctions,
R&D, collusion, consumer learning, learning-by-doing, and network effects (see Doraszelski and Pakes
(2007) for a survey).
3.1 Model
We introduce the main elements of our dynamic oligopoly model.
Time Horizon. The industry evolves over discrete time periods and along an infinite horizon. We index
time periods with nonnegative integers t ∈ N (N = {0, 1, 2, . . .}). All random variables are defined on a
probability space (Ω, F, P) equipped with a filtration {Ft : t ≥ 0}. We adopt a convention of indexing by
t variables that are Ft -measurable.
Firms. Each firm that enters the industry is assigned a unique positive integer-valued index. The set of
indices of incumbent firms at time t is denoted by St .
State Space. Firm heterogeneity is reflected through firm states that represent the quality level, productivity,
capacity, the size of its consumer network, or any other aspect of the firm that affects its profits. At time t
the individual state of firm $i$ is denoted by $x_{it} \in \mathcal{X}$, where $\mathcal{X}$ is a finite subset of $\mathbb{N}^q$, $q \ge 1$. We define the
industry state s̄t to be a vector over individual states that specifies, for each firm state x ∈ X , the number of
incumbent firms at x in period t. Note that because we will focus on symmetric and anonymous equilibrium
strategies in the sense of Doraszelski and Pakes (2007), we can restrict attention to industry states for which
the identities of firms do not matter. The integer number N represents the maximum number of incumbent
firms that the industry can accommodate at every point in time and we let nt be the number of incumbent
firms at time period t.
Exit process. In each period, each incumbent firm i observes a nonnegative real-valued sell-off value φit that
is private information to the firm. If the sell-off value exceeds the value of continuing in the industry, then
the firm may choose to exit, in which case it earns the sell-off value and then ceases operations permanently.
We assume the random variables {φit |t ≥ 0, i ≥ 1} are i.i.d. and have a well-defined density function with
finite moments.
Entry process. We consider an entry process similar to the one in Doraszelski and Pakes (2007). At time
period t, there are N − nt potential entrants, ensuring that the maximum number of incumbent firms that the
industry can accommodate is N (we assume n0 < N ). Each potential entrant is assigned a unique index.
In each time period each potential entrant i observes a positive real-valued entry cost κit that is private
information to the firm. We assume the random variables {κit | t ≥ 0, i ≥ 1} are i.i.d., independent of all
previously defined random quantities, and have a well-defined density function with finite moments. If the
entry cost is below the expected value of entering the industry then the firm will choose to enter. Potential
entrants make entry decisions simultaneously. Entrants appear in the following period at state xe ∈ X and
earn profits thereafter.2 As is common in this literature and to simplify the analysis, we assume potential
entrants are short-lived and do not consider the option value of delaying entry. Potential entrants that do not
enter the industry disappear and a new generation of potential entrants is created in the next period.
Transition dynamics. If an incumbent firm decides to remain in the industry, it can take an action to
improve its individual state. Let $\mathcal{I} \subseteq \mathbb{R}^k_+$ ($k \ge 1$) be a convex and compact action space; for concreteness,
we refer to this action as an investment. Given a firm’s investment $\iota \in \mathcal{I}$ and state at time $t$, the firm’s
transition to a state at time $t + 1$ is described by the following Markov kernel $Q$:
$$Q[x' \mid x, \iota] = \mathcal{P}[x_{i,t+1} = x' \mid x_{it} = x, \iota_{it} = \iota].$$
Uncertainty in state transitions may arise, for example, due to the risk associated with a research and development endeavor or a marketing campaign. The cost of investment is given by a nonnegative function
c(ιit , xit ) that depends on the firm’s individual state xit and investment level ιit . Both the kernel and the
investment cost are assumed to be continuous functions of investment. We assume that transitions are
independent across firms conditional on the industry state and investment levels. We also assume these
transitions are independent of all previously defined random quantities.
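As a concrete illustration of such a kernel, the sketch below uses a one-dimensional ladder in which investment succeeds with probability aι/(1 + aι) and the state depreciates with probability δ, a form that is common in the quality-ladder literature. This specification, the parameter values, and the function name are assumptions made for illustration; the model above only requires that Q be continuous in investment.

```python
def transition_kernel(x, inv, x_max=10, a=1.0, delta=0.3):
    """Return {x_next: probability} for a hypothetical ladder kernel Q[. | x, inv].

    The state moves up one step if investment succeeds and there is no
    depreciation, down one step if investment fails and depreciation hits,
    and stays put otherwise. Moves off the grid [0, x_max] are clipped,
    so their probability mass is added to the boundary state.
    """
    success = a * inv / (1.0 + a * inv)   # continuous in inv, equals 0 at inv = 0
    up = (1.0 - delta) * success
    down = delta * (1.0 - success)
    stay = 1.0 - up - down
    probs = {}
    for step, p in ((1, up), (0, stay), (-1, down)):
        x_next = min(max(x + step, 0), x_max)
        probs[x_next] = probs.get(x_next, 0.0) + p
    return probs


if __name__ == "__main__":
    print(transition_kernel(3, 2.0))
```

Returning a dictionary keeps the kernel sparse, since only a few next states are reachable from any current state.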
Aggregate shock. There is an aggregate profitability shock zt that is common to all firms and that may
represent a common demand shock, a common shock to input prices, or a common technology shock. We
assume that {zt ∈ Z : t ≥ 0} is a finite, ergodic Markov chain, and independent of all previously defined
random quantities.
Single-Period Profit Function. Each incumbent firm earns profits on a spot market. We denote by
π(xit , s̄t , zt ) the single-period expected profits garnered by firm i at time period t, which depend on
its individual state xit , the industry state s̄t , and the value of the aggregate shock zt .
Timing of Events. In each period, events occur in the following order: (1) Each incumbent firm observes its
sell-off value and then makes exit and investment decisions; (2) Each potential entrant observes its entry cost
and makes entry decisions; (3) Incumbent firms compete in the spot market and receive profits; (4) Exiting
firms exit and receive their sell-off values; (5) Investment shock outcomes are determined, new entrants
enter, and the industry takes on a new state $s_{t+1}$.
Firms’ objective. Firms aim to maximize expected discounted profits. The interest rate is assumed to
be positive and constant over time, resulting in a constant discount factor of β ∈ (0, 1) per time period.
Finally, we assume that for all competitors’ decisions and all terminal values, a firm’s one time-step ahead
optimization problem to determine its optimal investment has a unique solution.3
2 It is straightforward to generalize the model by assuming that entrants can also invest to improve their initial state.
3 This assumption is a generalization of the unique investment choice admissibility assumption in Doraszelski and Satterthwaite
(2010) that is used to guarantee the existence of an equilibrium to the model in pure strategies. It is satisfied by many of the
commonly used specifications in the literature.
3.2 Markov Perfect Equilibrium
The most commonly used equilibrium concept in such dynamic oligopoly models is that of symmetric
pure strategy Markov perfect equilibrium (MPE) in the sense of Maskin and Tirole (1988). To simplify
notation we re-define the industry state to incorporate the aggregate shock, that is st = (s̄t , zt ). Hence,
there is a function ι such that at each time t, each incumbent firm i invests an amount ιit = ι(xit , st ),
as a function of its own state xit and the industry state st . Similarly, each firm follows an exit strategy
that takes the form of a cutoff rule: there is a real-valued function ρ such that an incumbent firm i exits
at time t if and only if φit > ρ(xit , st ). Weintraub et al. (2008) show that there always exists an optimal
exit strategy of this form even among very general classes of exit strategies. We define the state space
$\mathcal{S} = \{ (\bar{s}, z) \in \mathbb{N}^{|\mathcal{X}|} \times \mathcal{Z} \mid \sum_{x \in \mathcal{X}} \bar{s}(x) \le N \}$, that is, S is the set of industry states including the aggregate
shock. Let M denote the set of exit/investment strategies such that an element µ ∈ M is a set of functions
µ = (ι, ρ), where ι : X × S → I is an investment strategy and ρ : X × S → R is an exit strategy.4
Similarly, each potential entrant follows an entry strategy that takes the form of a cutoff rule: there is a
real-valued function λ such that a potential entrant i enters at time t if and only if κit < λ(st ). It is simple to
show that there always exists an optimal entry strategy of this form even among very general classes of entry
strategies (see Doraszelski and Satterthwaite (2010)). We denote the set of entry strategies by Λ, where an
element of Λ is a function λ : S → R.
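Because sell-off values and entry costs are i.i.d. draws, a cutoff rule translates directly into an exit or entry probability at each state: a firm exits with probability P(φ > ρ) and a potential entrant enters with probability P(κ < λ). The sketch below works this out under the assumption, made only for illustration, that both random variables are exponentially distributed; the function and parameter names are hypothetical.

```python
import math


def exit_probability(rho, mean_selloff=1.0):
    """P(phi > rho) when the sell-off value phi is exponential (assumed)."""
    return math.exp(-max(rho, 0.0) / mean_selloff)


def entry_probability(lam, mean_entry_cost=1.0):
    """P(kappa < lam) when the entry cost kappa is exponential (assumed)."""
    return 1.0 - math.exp(-max(lam, 0.0) / mean_entry_cost)


if __name__ == "__main__":
    # A higher continuation cutoff rho lowers exit; a higher lam raises entry.
    for cutoff in (0.5, 1.0, 2.0):
        print(cutoff, exit_probability(cutoff), entry_probability(cutoff))
```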
We define the value function $V(x, s \mid \mu', \mu, \lambda)$ to be the expected discounted value of profits for a firm at
state x when the industry state is s, given that its competitors each follow a common strategy µ ∈ M, the
entry strategy is λ ∈ Λ, and the firm itself follows strategy $\mu' \in \mathcal{M}$. Formally,
$$V(x, s \mid \mu', \mu, \lambda) = E_{\mu', \mu, \lambda}\left[ \sum_{k=t}^{\tau_i} \beta^{k-t} \left( \pi(x_{ik}, s_k) - c(\iota_{ik}, x_{ik}) \right) + \beta^{\tau_i - t} \phi_{i,\tau_i} \,\middle|\, x_{it} = x, s_t = s \right],$$
where i is taken to be the index of a firm at individual state x at time t, τi is a random variable representing
the time at which firm i exits the industry, and the subscripts of the expectation indicate the strategy followed
by firm i, the strategy followed by its competitors, and the entry strategy. In an abuse of notation, we will use
the shorthand, V (x, s|µ, λ) ≡ V (x, s|µ, µ, λ), to refer to the expected discounted value of profits when firm
i follows the same strategy µ as its competitors. An equilibrium in our model comprises an investment/exit
strategy µ = (ι, ρ) ∈ M and an entry strategy λ ∈ Λ that satisfy the following conditions:
1. Incumbent firms’ strategies represent an MPE:
$$\sup_{\mu' \in \mathcal{M}} V(x, s \mid \mu', \mu, \lambda) = V(x, s \mid \mu, \lambda), \quad \forall (x, s) \in \mathcal{X} \times \mathcal{S}.$$
2. For all states, the cut-off entry value is equal to the expected discounted value of profits of entering
the industry:5
$$\lambda(s) = \beta E_{\mu, \lambda}\left[ V(x^e, s_{t+1} \mid \mu, \lambda) \mid s_t = s \right], \quad \forall s \in \mathcal{S}.$$
4 To simplify notation we do not make this restriction explicit; however, we restrict attention to states (x, s) ∈ X × S, such that
s̄(x) > 0 (recall that s̄(x) is the x-th component of s̄).
Doraszelski and Satterthwaite (2010) establish the existence of an equilibrium in pure strategies for
a similar model. With respect to uniqueness, in general we presume that our model may have multiple
equilibria. Doraszelski and Satterthwaite (2010) also provide an example of multiple equilibria in a closely
related model. A limitation of MPE is that the set of relevant industry states grows quickly with the number
of firms in the industry and individual states, making its computation intractable when there are more than a
few firms. This motivates our alternative approach.
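To make the value function defined in this section concrete, the sketch below evaluates the term inside the expectation along a single realized path: discounted profits net of investment costs from period t up to the exit period τ, plus the discounted sell-off value. The function name and the input layout are assumptions for illustration; averaging this quantity over many simulated paths would give a Monte Carlo estimate of V for fixed strategies.

```python
def realized_discounted_value(profits, costs, selloff, beta):
    """Discounted payoff along one realized path from period t to exit period tau.

    profits[j] and costs[j] are the single-period profit and investment cost
    in period t + j, for j = 0, ..., tau - t; selloff is the sell-off value
    collected in period tau; beta is the discount factor in (0, 1).
    """
    if len(profits) != len(costs) or not profits:
        raise ValueError("profits and costs must be non-empty and of equal length")
    horizon = len(profits) - 1  # equals tau - t
    flow = sum(beta ** j * (p - c) for j, (p, c) in enumerate(zip(profits, costs)))
    return flow + beta ** horizon * selloff


if __name__ == "__main__":
    # Hypothetical path: two periods of profit 1, no investment, sell-off of 2.
    print(realized_discounted_value([1.0, 1.0], [0.0, 0.0], 2.0, 0.5))
```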
4 Moment-Based Markov Equilibrium
In this section we introduce a new equilibrium concept that overcomes the computational complexity involved in solving for MPE.
4.1 Dominant and Fringe Firms
We focus on industries that exhibit the following market structure: there are a few dominant firms and many
fringe firms. Let Dt ⊂ St and Ft ⊂ St be the set of incumbent dominant and fringe firms at time period t,
respectively. The sets Dt and Ft are common knowledge among firms at every period of time. These sets
can change over time and our model incorporates a mechanism that endogenizes the process through which
fringe firms become dominant and vice-versa.
We let Xf ⊆ X and Xd ⊆ X be the set of feasible individual states for fringe and dominant firms,
respectively. We define the state or distribution of fringe firms ft to be a vector over individual states that
specifies, for each fringe firm state x ∈ Xf , the number of incumbent fringe firms at x in period t. We define
$\mathcal{S}_f = \{ f \in \mathbb{N}^{|\mathcal{X}_f|} \mid \sum_{x \in \mathcal{X}_f} f(x) \le N \}$ to be the set of all possible states of fringe firms.
Let dt be the state of dominant firms, specifying the individual state of each dominant firm at time period
t. The set of all possible dominant firms’ states is defined by $\mathcal{S}_d = \{ d \in \mathcal{X}_d^D \}$, where D is the maximum
number of dominant firms the industry can accommodate. Hence, the dominant firms’ state is represented
by a list of individual states.6 Note that by defining an inactive state, the number of active dominant firms
could be anywhere between 0 and D in a given time period. With some abuse of notation, we define the
state space S = Sf × Sd × Z, where a state s = (f, d, z) ∈ S is given by a distribution of fringe firms, a
state for dominant firms, and a value for the aggregate shock.
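The state s = (f, d, z) can be summarized by replacing the fringe distribution f with a few of its moments while keeping the dominant firms' states in full detail. The sketch below is one way to do this, assuming scalar individual states, unnormalized moments, and a sorted list for d so that firm identities do not matter; the function names and the dictionary representation of f are hypothetical.

```python
def fringe_moments(fringe, powers=(1,)):
    """Unnormalized moments: for each k in powers, sum over x of x**k * f(x).

    fringe maps each individual state x to the number of fringe firms at x.
    """
    return tuple(sum((x ** k) * n for x, n in fringe.items()) for k in powers)


def moment_based_state(fringe, dominant, shock, powers=(1,)):
    """Collapse s = (f, d, z) into (moments of f, sorted d, z)."""
    return (fringe_moments(fringe, powers), tuple(sorted(dominant)), shock)


if __name__ == "__main__":
    f = {1: 3, 2: 1}  # three fringe firms at state 1, one at state 2
    print(moment_based_state(f, dominant=[9, 7], shock=0, powers=(1, 2)))
```

Many different fringe distributions map to the same summarized state, which is the source of both the computational savings and the approximation error.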
The division between dominant and fringe firms will depend on the specific application at hand. Typically, however, dominant firms will be market share leaders or, more generally, firms that most affect
5
Hence, potential entrants enter if the expected discounted profits of doing so are positive. Throughout the paper it is implicit
that the industry state at time period t + 1, st+1 , includes the entering firm in state xe whenever we write (xe , st+1 ).
6
Because we focus on equilibrium strategies that are anonymous, we can restrict attention to a set Sd for which the identity of
dominant firms does not matter. For example, if the state is one dimensional, we can define $S_d = \{ d \in X_d^{D} : d(1) \le d(2) \le \dots \le d(D) \}$. We choose this state representation for dominant firms, as opposed to a distribution over individual states, because we will
typically consider applications with a small number of dominant firms.
competitors’ profits. For concreteness, suppose that the state of the firm represents its ability to compete in
the market and firms in higher individual states have larger market shares, as in the two examples we introduce below in Section 7.1. It may then be natural to define dominant firms as those that have a sufficiently
large individual state. In our numerical experiments, we separate dominant firms from fringe firms by an
exogenous threshold state x, such that i ∈ Dt if and only if xit ≥ x. We note that in this case, Xf ∩ Xd = ∅,
which we assume throughout the rest of the paper to simplify notation.7
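The exogenous threshold rule can be sketched as follows. The firm labels, individual states, and threshold value are hypothetical and serve only to illustrate the rule that i ∈ Dt if and only if xit is at or above the threshold.

```python
def split_tiers(states, threshold):
    """Partition firms: firm i is dominant iff states[i] >= threshold."""
    dominant = {i for i, x in states.items() if x >= threshold}
    fringe = set(states) - dominant
    return dominant, fringe

# Hypothetical individual states of four incumbent firms.
states = {"a": 2, "b": 9, "c": 5, "d": 8}
D_t, F_t = split_tiers(states, threshold=8)
```

Because tier membership is a function of the individual state alone, transitions between tiers follow directly from the firms' own state transitions.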
More generally, one could consider a relative threshold that depends on some statistics of the current
fringe state; for example, a fringe firm becomes dominant if it is a certain number of standard deviations
above the current average size of fringe firms. The threshold could also depend on the current state of
dominant firms. Note that under all these specifications the transitions among the fringe and dominant
tiers are naturally embedded in the firms’ transitions. Our approach could also accommodate alternative
specifications on how firms become dominant depending on the empirical application at hand.
In the applications we have in mind, dominant firms are few and have significant market power. In
contrast, fringe firms are many and individually hold little market power, although their aggregate market
share may be significant. This market structure suggests that firms’ decisions should be more sensitive to
the state of dominant firms than to the state of fringe firms. Moreover, given that the fringe firms’ state is
a high-dimensional object, gathering information on the state of each individual small firm is likely to be
more expensive than doing so for larger firms, which are not only few but also usually more visible and often
publicly traded. Consequently, as the number of fringe firms increases it seems implausible that firms keep
track of the individual state of each one. Instead, we postulate that firms only keep track of the state of
dominant firms and of a few summary statistics of the fringe state.
4.2
Moment-Based Strategies
Our approach requires that firms compute best responses in strategies that depend only on a few summary
statistics of the fringe state. A set of such summary statistics is a multi-variate function θ : Sf → Θ, where Θ
is a finite subset of $\mathbb{R}^n$. For example, when the fringe firm state is one-dimensional, $\theta(f) = \sum_{x \in X_f} x^{\alpha} f(x)$
is the α-th un-normalized moment with respect to the distribution f .8 In particular, if α = 1, the statistic
θ(ft ) corresponds to the first un-normalized moment, which is equivalent to the sum of the individual states
of all incumbent fringe firms, $\sum_{i \in F_t} x_{it}$.
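As a quick numerical check of this definition, the sketch below computes un-normalized moments from a fringe distribution stored as a mapping from individual states to firm counts. The distribution and the function name `unnormalized_moment` are hypothetical.

```python
def unnormalized_moment(f, alpha=1):
    """theta(f) = sum over x of x**alpha * f(x), with f mapping state -> count."""
    return sum((x ** alpha) * n for x, n in f.items())

# Hypothetical fringe: three firms at state 1, one at state 3, one at state 4.
f = {1: 3, 3: 1, 4: 1}
firm_states = [1, 1, 1, 3, 4]

theta1 = unnormalized_moment(f, alpha=1)   # first un-normalized moment
theta2 = unnormalized_moment(f, alpha=2)   # second un-normalized moment
```

For α = 1 the result coincides with the sum of the individual states of the fringe firms, as stated in the text.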
For brevity and concreteness, we call such summary statistics fringe moments, with the understanding
that they could include normalized or un-normalized moments, but also other functions of the fringe distribution such as quantiles or the number of fringe firms that are about to become dominant. Accordingly, we
define a set of moments of the distribution of fringe firms:
θt = θ(ft ).
(1)
7
Note that more generally we can always make the assumption that Xf ∩ Xd = ∅ by appending one dimension to the individual
state in order to indicate whether the firm is fringe or dominant. This construction is useful as it allows, for example, for fringe and
dominant firms to have different model primitives and strategies.
8
Note that with a finite number of fringe firms (N ) and a finite number of individual states per firm (|Xf |), this and all other
statistics that we use in the paper take a finite number of values.
We introduce firm strategies that depend on their own individual state xit , the state of dominant firms dt ,
the aggregate shock zt , and fringe moments θt as defined in (1). We call such strategies moment-based
strategies. We also define Sθ as the set of admissible moments defined by (1); that is, Sθ = {θ|∃f ∈
Sf s.t. θ = θ(f )}. In light of this, we define the moment-based industry state by ŝ = (θ, d, z) ∈ Ŝ =
Sθ × Sd × Z.
A moment-based investment strategy is a function ι such that at each time t, each incumbent firm i ∈ St
invests an amount ιit = ι(xit , ŝt ), where ŝt is the moment-based industry state at time t. Similarly, each
firm follows an exit strategy that takes the form of a cutoff rule: there is a real-valued function ρ such that an
incumbent firm i ∈ St exits at time t if and only if φit ≥ ρ(xit , ŝt ). Let M̂ denote the set of moment-based
exit/investment strategies such that an element µ ∈ M̂ is a pair of functions µ = (ι, ρ), where ι : X × Ŝ → I
is an investment strategy and ρ : X × Ŝ → ℝ is an exit strategy.9
Each potential entrant follows an entry strategy that takes the form of a cutoff rule: there is a real-valued
function λ such that a potential entrant i enters at time t if and only if κit ≤ λ(ŝt ). We denote the set of
entry functions by Λ̂, where an element of Λ̂ is a function λ : Ŝ → ℝ. We assume that all new entrants
become part of the fringe. Note that moment-based strategies and the moment-based state space are defined
with respect to a specific function of moments (1). Relative to MPE strategies, moment-based strategies do
not keep track of the full fringe state; they only keep track of a few fringe moments.
4.3
Transition Kernel
With Markov strategies (µ, λ) the underlying state, {st = (ft , dt , zt ) : t ≥ 0}, is a Markov process with
a transition kernel denoted by Pµ,λ . In addition, we denote by Pµ′,µ,λ the transition kernel of (xit , st )
when firm i uses strategy µ′, and its competitors use strategy (µ, λ). Note that given strategies, both these
kernels can be derived from the primitives of the model, namely, the distributions of φ and κ, the kernel
of the aggregate shock, and the kernel Q. We emphasize that the underlying industry state st should be
distinguished from the moment-based industry state ŝt = (θt , dt , zt ).
Defining our notion of equilibrium in moment-based strategies will require the construction of what
can be viewed as a ‘Markov’ approximation to the dynamics of the moment-based state process {(xit , ŝt ) :
t ≥ 0}, where i is a generic firm. Note that this process is, in general, not Markov even if the dynamics
of the underlying state {(xit , st ) : t ≥ 0} are. To see this, consider for example that the individual state
represents a firm’s size and that firms keep track of the first un-normalized moment of the fringe state,
$\theta_t = \theta(f_t) = \sum_{x \in X_f} x f_t(x)$. Suppose the current value of that moment is θt = 10; this value is consistent
with one fringe firm in individual state 10, but also with 10 fringe firms in individual state 1. These two
different states do not necessarily yield the same probabilistic distribution for the first moment in the next
period. Therefore, θt may not be a sufficient statistic to predict the future evolution of the industry, because
there are many fringe distributions that are consistent with the same value of θt . Because information is aggregated
via moments, the resulting process is, in general, no longer Markovian.
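The example above can be checked numerically. The sketch below assumes a purely hypothetical transition rule that is not the model's kernel Q: each fringe firm independently moves up one state with probability 1/4, down one state with probability 1/4, and otherwise stays put. It then computes the exact distribution of next period's first moment for the two fringe states that share θt = 10.

```python
from collections import defaultdict

def next_moment_dist(firm_states, step):
    """Exact distribution of next period's first moment under independent moves."""
    dist = {0: 1.0}
    for x in firm_states:
        new = defaultdict(float)
        for total, p in dist.items():
            for dx, q in step.items():
                new[total + x + dx] += p * q
        dist = dict(new)
    return dist

# Hypothetical per-firm transition: +1, -1, or 0 (an assumption for illustration).
step = {+1: 0.25, -1: 0.25, 0: 0.5}

one_big = next_moment_dist([10], step)        # one fringe firm in state 10
ten_small = next_moment_dist([1] * 10, step)  # ten fringe firms in state 1
```

Under this assumed rule both fringe states have the same expected moment next period, but the single large firm can only reach moments 9, 10, or 11, whereas the ten small firms can reach any value from 0 to 20. Conditioning on θt = 10 alone therefore does not pin down the distribution of θt+1.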
To simplify our presentation of the transition kernel over moment-based states we initially assume that
9
To simplify notation we do not make this restriction explicit; however, we restrict attention to states (x, ŝ), such that if the firm
in state x is a dominant firm, then its individual state needs to coincide with one of the components of d.
the set of dominant firms is fixed and predetermined, and hence, there are no transitions between the sets of
fringe and dominant firms. At the end of this section we discuss the more practical setting that we use in our
numerical experiments in which the set of dominant firms is endogenously determined.
Assuming that firm i follows the moment-based strategy µ′, and that all other firms use moment-based
strategies (µ, λ), we will describe a kernel P̂µ′,µ,λ [·|·] with the hope that the Markov process described by
this kernel is a good approximation to the (non-Markov) process {(xit , ŝt ) : t ≥ 0}. The Markov process
described by P̂µ′,µ,λ is firm i’s perception of the evolution of its own state in tandem with the moment-based industry state. To this end, we first introduce the kernel P̂µ,λ [θ′|ŝ] that describes the evolution of
a hypothetical Markov process over moments. To approximate the moment process, we require that the
perceived transitions over moments described by P̂µ,λ agree in some manner with the actual transitions over
underlying states described by Pµ,λ . Formally, in MME we will impose that:
P̂µ,λ = ΦPµ,λ ,
(2)
for a given operator Φ. MME could be defined with any operator Φ; however, some specifications yield
better approximations and Example 4.1 below provides one such specification. In our numerical experiments
we use a perceived transition kernel based on this example unless otherwise noted.
Example 4.1 (Observed transitions). One possible definition for P̂µ,λ is the kernel that coincides with the
long-run average observed transitions from the moment-based state in the current time period to the moment
in the next time period under strategies (µ, λ). More specifically, we let the industry evolve for a long time
under strategies (µ, λ). For each moment-based state ŝ that is visited, we observe a transition to a new
moment θ′. We count the observed frequency of these transitions and set the kernel P̂µ,λ [θ′|ŝ] to be the
observed distribution corresponding to these frequencies. A similar construction is used by Fershtman and
Pakes (2012) in a setting with asymmetric information.
We now formalize the definition of this kernel. The evolution of the underlying industry state is described
by the kernel Pµ,λ . We let Rµ,λ be the recurrent class of moment-based states the industry will eventually
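reach. For all θ′ ∈ Sθ and ŝ ∈ Rµ,λ , we define the operator:
$$(\Phi P_{\mu,\lambda})[\theta' \mid \hat{s}] = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s},\ \theta_{t+1} = \theta'\}}{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s}\}}.$$
(3)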
We require that
P̂µ,λ [θ′|ŝ] = (ΦPµ,λ )[θ′|ŝ],
for all ŝ ∈ Rµ,λ , and P̂µ,λ is arbitrarily defined for states ŝ ∉ Rµ,λ . In other words, perceived transitions
for fringe moments coincide with their observed transitions in the recurrent class of states, and are arbitrary
outside this class.10
To make the example more concrete, suppose that $\theta_t = \theta(f_t) = \sum_{x \in X_f} x f_t(x)$, the first un-normalized
10
In this paper, as is common in this literature, we focus on ‘capital accumulation’ games without investment spillovers for which
it is simple to show that for given strategies, the industry state process admits a single recurrent class. Alternatively, in other settings
that yield more than one recurrent class, to construct the perceived transition kernel we would need to assume that all firms agree
on which recurrent class the industry will eventually transition into.
moment of the fringe state. To simplify the exposition, we ignore dominant firms and the aggregate shock.
We let the industry evolve for a long time under given strategies (µ, λ). Suppose that in the course of this
simulation, we observe that in one half of the time periods in which θt = 10 the moment in the next time period
is θt+1 = 10, in one fourth it is θt+1 = 12, and in the remaining fourth it is θt+1 = 8. Then,
we set P̂µ,λ [θ′ = 10|θ = 10] = 1/2, P̂µ,λ [θ′ = 12|θ = 10] = 1/4, P̂µ,λ [θ′ = 8|θ = 10] = 1/4. We would
set P̂µ,λ [θ′|θ] similarly for all moment states in the recurrent class.
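The counting procedure in Example 4.1 can be sketched as follows. The function name `observed_kernel` and the short path are hypothetical. In practice the path would come from a long simulation of the industry under (µ, λ), and a finite path only approximates the limit in (3).

```python
from collections import Counter, defaultdict

def observed_kernel(moment_states, moments):
    """Empirical perceived kernel P_hat[theta' | s_hat] from one simulated path.

    moment_states[t] is the (hashable) moment-based state s_hat_t and
    moments[t] is the fringe moment theta_t along the same path.
    """
    counts = defaultdict(Counter)
    for t in range(len(moment_states) - 1):
        counts[moment_states[t]][moments[t + 1]] += 1
    kernel = {}
    for s_hat, c in counts.items():
        visits = sum(c.values())
        kernel[s_hat] = {theta: n / visits for theta, n in c.items()}
    return kernel

# Hypothetical path of the first moment; dominant firms and the aggregate
# shock are ignored, so the moment-based state is the moment itself.
path = [10, 10, 12, 10, 10, 8, 10]
kernel = observed_kernel(path, path)
```

On this short path the state θ = 10 is left four times: twice to 10, once to 12, and once to 8, which reproduces the probabilities 1/2, 1/4, and 1/4 of the example. States never visited along the path receive no entry, mirroring the fact that the kernel is arbitrary outside the recurrent class.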
Another possible construction of the perceived transition kernel that has been successfully used in
stochastic growth models in macroeconomics (Krusell and Smith, 1998) and subsequent literature assumes
a linearly parameterized and deterministic evolution for moments. We describe this construction in detail in
Appendix A. This perceived transition kernel has the disadvantage of imposing strong parametric restrictions and assuming deterministic transitions. On the other hand, these same restrictions significantly reduce
the computational burden of solving for the equilibrium relative to the ‘observed transitions’ kernel.
Both of these kernels are meant to capture the long-run evolution of the industry. We could also introduce a perceived kernel meant to capture the short-term dynamics of the industry after a policy or an
environmental change, such as a merger. For this, we could consider the average observed transitions from
the current moment to the next, over many finite (and short) trajectories that start from the same state; this
state describes the initial condition of the industry.
Having defined the kernel for fringe moments P̂µ,λ [θ′|ŝ], we now define the perceived kernel P̂µ′,µ,λ [·|·]
over states (xit , ŝt ) according to:
P̂µ′,µ,λ [x′, ŝ′|x, ŝ] = Pµ′,µ,λ [x′, d′, z′|x, ŝ] P̂µ,λ [θ′|ŝ],
(4)
where with some abuse of notation Pµ′,µ,λ [x′, d′, z′|x, ŝ] denotes the marginal distribution of the next state
of firm i, the next state of dominant firms, and the next value for the aggregate shock, conditional on the
current moment-based state, according to the kernel of the underlying state Pµ′,µ,λ . The definition above
makes the following facts about this perceived process transparent:
1. Were firm i a fringe firm, the above definition asserts that this fringe firm ignores its own impact on the
evolution of industry moments. This is evident in that x′ is distributed independently of θ′ conditional
on x and ŝ.
2. Given information about the current moment-based state, the firm correctly assesses the distribution of
its next state, the next state of dominant firms, and the next aggregate shock. Note that because firms
use moment-based strategies, the moment-based state (x, ŝ) is sufficient to determine the transition
probabilities of (x, d, z) according to the transition kernel of the underlying state Pµ′,µ,λ .
3. However, it should be clear that the Markov process given by the above definition remains an approximation since it posits that the evolution of the moments θ is Markov with respect to ŝ as described
by P̂µ,λ [θ′|ŝ]. In fact, the distribution of moments at the next point in time potentially depends on the
full distribution of the fringe firms and not only its moments.
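The product structure of (4) can be sketched as follows for a single conditioning state (x, ŝ). Both input distributions are hypothetical numbers chosen for illustration, and `perceived_kernel` is not a name used in the paper.

```python
def perceived_kernel(own_kernel, moment_kernel):
    """Joint perceived distribution over (x', d', z', theta') as in equation (4).

    own_kernel maps (x', d', z') -> probability, conditional on (x, s_hat).
    moment_kernel maps theta' -> probability, conditional on s_hat.
    """
    return {
        (x1, d1, z1, th1): p * q
        for (x1, d1, z1), p in own_kernel.items()
        for th1, q in moment_kernel.items()
    }

# Hypothetical conditional distributions given the current (x, s_hat).
own = {(3, (7,), 0): 0.6, (2, (7,), 0): 0.4}
moment = {10: 0.5, 12: 0.25, 8: 0.25}
joint = perceived_kernel(own, moment)
```

Because the joint distribution is a product, the firm's own next state is independent of the next moment conditional on (x, ŝ), which is the first fact listed above.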
We finish this subsection by briefly describing how to modify the transition kernel when the set of
dominant firms is endogenous. Recall that Pµ′,µ,λ [x′, d′, z′|x, ŝ] (see (4)) is the kernel that describes the
actual evolution of (xit , dt , zt ), which firms can determine exactly from the primitive transition kernel when
transitions between the fringe and dominant tiers are not possible. However, if such transitions are possible
the moments in θ(·) may not contain sufficiently detailed information to precisely calculate the transition
probabilities into dominance, and therefore, the distribution of the dominant firm state the next time period
conditional on the current moment-based state.
For example, suppose θ(f ) is the first un-normalized moment of the fringe state and that firms become
dominant if they grow larger than a pre-determined individual state. Then, the probability that a fringe
competitor becomes dominant depends on the fringe state beyond that moment. Therefore, in this case,
we need to modify the kernel that describes the evolution of (xit , dt , zt ) to incorporate firms’ perceived
probabilities that a fringe competitor will transition into dominance given the moment-based industry state.
The operator Φ will specify consistency conditions for these events in addition to moments’ transitions;
for example, the perceived transition probabilities of fringe firms becoming dominant could coincide with
the corresponding transition frequencies observed given the strategies firms use. We provide a detailed
description of tier transitions in Section 6.2.
4.4
Single-Period Profit Function
The single-period profit function can be written as π(xit , st ): a function of the individual state xit and the
industry state st = (ft , dt , zt ). If firms use moment-based strategies and only keep track of fringe moments,
they do not have the ability to evaluate a single-period profit function that depends on the entire fringe state.
Hence, to define our equilibrium concept in these cases, similar to the perceived transition kernel, we need to
define a perceived single-period profit function over moment-based states, π̂µ,λ (xit , ŝt ), for given strategies
(µ, λ).
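One possible specification for the perceived single-period profit function is:
$$\hat{\pi}_{\mu,\lambda}(x, \hat{s}) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} \pi(x, s_t)\, \mathbf{1}\{\hat{s}_t = \hat{s}\}}{\sum_{t=1}^{T} \mathbf{1}\{\hat{s}_t = \hat{s}\}},$$
(5)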
for all ŝ in the recurrent class Rµ,λ , and where the evolution of the underlying industry state is described by
the kernel Pµ,λ . In other words, this is the long-run average single-period profit experienced whenever the
moment-based state is given by (x, ŝ). This specification is similar to the one used in Fershtman and Pakes
(2012).
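A finite-sample version of the average in (5) can be sketched as follows. The profit function, the path of fringe states, and the helper names are hypothetical; the profit function is deliberately chosen to depend on the fringe state beyond its first moment, so that averaging matters.

```python
from collections import defaultdict

def perceived_profit(x, path, profit, to_moment_state):
    """Average pi(x, s_t) over the periods that share the same moment-based state."""
    total = defaultdict(float)
    visits = defaultdict(int)
    for s in path:
        s_hat = to_moment_state(s)
        total[s_hat] += profit(x, s)
        visits[s_hat] += 1
    return {s_hat: total[s_hat] / visits[s_hat] for s_hat in visits}

# Hypothetical setting: a state lists the fringe firms' individual states, the
# moment is their sum, and profit falls with the number of fringe competitors.
path = [(10,), (1,) * 10, (10,), (5, 5)]
profit = lambda x, s: x - len(s)
pi_hat = perceived_profit(20, path, profit, to_moment_state=sum)
```

All four states on this path have first moment 10 but yield profits 19, 10, 19, and 18, so the perceived profit at ŝ = 10 is their average.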
Despite this definition, we envision that our approach will be most useful in settings for which the
single-period profit function only depends on the fringe state through a few moments (or for which this provides
an accurate approximation). The next assumption makes this explicit; we maintain it throughout the paper
unless otherwise noted.
Assumption 4.1. The single-period expected profit of firm i at time t, π(xit , θ̃t , dt , zt ), depends on its
individual state xit , a vector θ̃t ∈ Θ̃ ⊆ Θ of fringe moments, the state of dominant firms dt , and the
value of the aggregate shock zt .
The single-period profit functions we will consider either satisfy the previous assumption exactly or
can be well approximated by functions of fringe moments. Given this, in our approach, firms will always
keep track of the moments that determine (or approximate) the profit function, θ̃t ; that is θ̃t will always be
included in θt in equation (1). The state space spanned by θ̃t is much smaller than the original one if Θ̃ is low
dimensional and significantly smaller than |Xf |. In fact, many single-period profit functions of interest depend on a
small number of functions of the distribution of firms’ states. For example, commonly used profit functions
that arise from monopolistic competition models depend on a particular moment of that distribution (Dixit
and Stiglitz, 1977; Besanko et al., 1990). Further, in Section 7.1 we describe two important profit functions
for applied work that we use in our numerical experiments that can be well approximated by a function of a
single fringe moment.
We note that firms may keep track of additional contemporaneous moments beyond the moments θ̃t
to improve their predictions of the future evolution of the industry. In our applications, however, we will
typically start our analysis by assuming that firms only keep track of the fringe moments associated with the
single-period profit function, hence θt = θ̃t .
4.5
Moment-Based Markov Equilibrium
A moment-based Markov equilibrium (MME) is an equilibrium in moment-based strategies. Having defined
a Markov process approximating the process {(xit , ŝt ) : t ≥ 0}, we define the perceived value function of a
deviating firm i that uses moment-based strategy µ′ in response to strategies (µ, λ). Importantly, this value
function assumes that firm i’s perception of the evolution of its own state and the moment-based industry
state is described by the kernel P̂µ′,µ,λ defined above. Formally,
$$\hat{V}(x, \hat{s} \mid \mu', \mu, \lambda) = E_{\mu',\mu,\lambda}\left[ \sum_{k=t}^{\tau_i} \beta^{k-t} \bigl( \pi(x_{ik}, \hat{s}_k) - c(\iota_{ik}, x_{ik}) \bigr) + \beta^{\tau_i - t} \phi_{i,\tau_i} \,\middle|\, x_{it} = x,\ \hat{s}_t = \hat{s} \right],$$
where the expectation is taken with respect to the perceived transition kernel P̂µ′,µ,λ . We will use the
shorthand notation V̂ (x, ŝ|µ, λ) ≡ V̂ (x, ŝ|µ, µ, λ).
A moment-based Markov equilibrium (MME) is defined with respect to a function of moments θ in (1)
and an operator Φ in (2) that defines the perceived transition kernel. Hence, the function of moments and
the operator are inputs to define MME. We could derive different MME for the same model primitives by
changing these inputs.
Definition 4.1. A MME of our model is comprised of an investment/exit strategy µ = (ι, ρ) ∈ M̂ and an
entry cutoff function λ ∈ Λ̂ that satisfy the following conditions:
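C1: Incumbent firms’ strategies optimization:
$$\sup_{\mu' \in \hat{M}} \hat{V}(x, \hat{s} \mid \mu', \mu, \lambda) = \hat{V}(x, \hat{s} \mid \mu, \lambda) \quad \forall (x, \hat{s}) \in X \times \hat{S}.$$
C2: At each state, the cut-off entry value is equal to the expected discounted value of entering the industry:
$$\lambda(\hat{s}) = \beta\, E_{\mu,\lambda}\left[ \hat{V}(x^e, \hat{s}_{t+1} \mid \mu, \lambda) \,\middle|\, \hat{s}_t = \hat{s} \right] \quad \forall \hat{s} \in \hat{S}.$$
C3: The perceived transition kernel is given by:
$$\hat{P}_{\mu,\lambda} = \Phi P_{\mu,\lambda}.$$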
Note that the first two conditions are similar to the conditions imposed in MPE, but incorporate the
moment-based state space and strategies. The third condition imposes a relation between the perceived
transition kernel and the actual transition kernel through the operator Φ. For example, the operator Φ
could be defined as in Example 4.1. Computationally, MME is appealing if agents keep track of a few
moments of the fringe state and there are a small number of dominant firms, because agents optimize over
low dimensional strategies. We note that if the function θ is the identity, so that θ(f ) = f , and Φ is such
that the perceived transition kernel coincides with the actual transition kernel for all states, then MME is
equivalent to MPE.
Using a similar argument to Doraszelski and Satterthwaite (2010) and imposing some additional technical conditions we can show that an MME always exists. In the numerical experiments that we present later
in the paper we are always able to computationally find an MME over a large set of primitives that satisfy
the standard assumptions required for the existence of MPE in EP-style models. With respect to uniqueness,
in general we presume that our model may have multiple equilibria.
4.6
Relation with Other Equilibrium Concepts
We finish this section by establishing the relationship between MME and related equilibrium concepts previously defined in the literature.
MME is related to the equilibrium concept defined in Fershtman and Pakes (2012), who consider dynamic oligopoly models with asymmetric information. While their model and motivation are fundamentally
different from ours, they are related in that MME limits firms’ information sets, and therefore could be cast
as an equilibrium concept in a game of asymmetric information even if the underlying economic model
is not. More specifically, assuming that fringe firms privately observe their own states, and the publicly
observed information is the fringe moments, the state of dominant firms, and the aggregate shock (and
assuming that all industry states are in the recurrent class), then MME with the transition kernel of Example
4.1 coincides with the restricted experience-based equilibrium (REBE) defined in Fershtman and Pakes
(2012). If the set of states in the recurrent class is a strict subset of the state space, which is typically the
case, then REBE is different from MME, because it does not require optimality of strategies outside the
recurrent class. In contrast, MME requires optimality at all states, but the perceived transition kernel can
be arbitrarily defined outside the recurrent class.
This equivalence holds when REBE strategies only keep track of the current moments, state of dominant
firms, and aggregate shock. However, REBE strategies typically keep track of past observations as well. To
extend the equivalence to these cases, we could extend the definition of MME so that firms keep track
of past values of the moment (θt−1 , θt−2 , ...). Nevertheless, to provide a more direct connection with the
commonly used concept of MPE, in which firms keep track only of the current industry state, and to simplify
exposition, our definition of MME considers strategies that only depend on current moments of the fringe
state. In addition, in Section 10.1 we provide further connections between our work and Fershtman and
Pakes (2012), in particular in relation to our error bounds.
Finally, if we assume (i) the set of dominant firms is fixed and pre-determined; (ii) firms only keep track
of the dominant firms’ state and do not keep track of fringe moments; and (iii) that the perceived single-period profit function is specified as in equation (5), then MME coincides with the simulated version of
partially oblivious equilibrium (POE) of Benkard et al. (2014).11
5
MME as Approximation
There are at least two alternative ways of interpreting MME as an approximation. First, we can interpret
MME as a low-dimensional approximation to MPE. Alternatively, MME can be thought of as an appealing
heuristic and behavioral model on its own; this is related to the optimality of moment-based strategies. We
discuss these two interpretations in the next subsections.
5.1
MME as Behavioral Model
Suppose firms play moment-based strategies with moments θ(·) and perceived transition kernel P̂µ,λ . As
previously noted, generally {ŝt : t ≥ 0} is not Markov, so it may not be a sufficient statistic to predict
the future evolution of the industry as described by the primitive transition kernel Pµ,λ . Hence, θ(·) does
not necessarily summarize all payoff relevant history in the sense of Maskin and Tirole (2001), and observing the underlying full fringe state may provide valuable information for decision making. Generally,
MME strategies may not be close to a best response, and therefore, may not be close to a subgame perfect
equilibrium.
However, MME strategies would provide an appealing behavioral model if they are close to optimality,
that is, assuming that competitors play MME strategies themselves, unilaterally deviating to a strategy that
keeps track of the full underlying fringe state would not improve expected discounted profits by much.
In this case, it may be reasonable to assume firms use MME strategies, as unilaterally deviating to more
complex strategies does not significantly increase payoffs, and doing so may involve costs associated with
gathering and processing more information.
To formalize this notion, for MME strategies (µ, λ), we define the value of the full information deviation:
$$\Delta_{\mu,\lambda}(x, s) = \sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda) - V(x, s \mid \mu, \lambda).$$
(6)
Note that V (x, s|µ, λ) provides the actual expected discounted profits a firm would get if all firms play MME
strategies. Similarly, $\sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda)$ provides the expected discounted profits if the firm were to
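11
Instead, define $\hat{\pi}_{\mu,\lambda}(x_{it}, d_t, z_t) = \pi(x_{it}, \tilde{f}_{\mu,\lambda}(d_t, z_t), d_t, z_t)$, where $\tilde{f}_{\mu,\lambda}(d, z) = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} f_t\, \mathbf{1}\{d_t = d,\ z_t = z\}}{\sum_{t=1}^{T} \mathbf{1}\{d_t = d,\ z_t = z\}}$. In the
latter case, we recover the regular notion of POE.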
unilaterally deviate to the best response Markov strategy that keeps track of the full industry state. We use
the value of the full information deviation to measure the extent of sub-optimality of MME strategies. If this
value is small, the MME strategy achieves essentially the same profits compared to the best possible Markov
strategy. In this sense, the value of the full information deviation is similar to the notion of ε-equilibrium.
Note that we expect this value to be small when moments in MME are close to being sufficient statistics of
the future evolution of the industry; that is, when P̂µ,λ [θ′|ŝ] ≈ Pµ,λ [θ′|s], and the industry state s does not
provide additional information to predict the future industry evolution beyond the moment-based industry
state ŝ.
It is illustrative to study whether there are models for which equilibrium strategies yield moments that
are indeed sufficient statistics, and therefore for which MME strategies are optimal. In Appendix B we
introduce one such class of models. Assuming that the single-period profit function for fringe firms exhibits
constant returns to scale, that the dynamics of a fringe firm’s evolution are linear in its own state, and that
there are no transitions between the dominant and fringe tiers, we show in the appendix that MME strategies
yield moments that form a Markov process and hence summarize all payoff relevant information as the
number of fringe firms grows. As a consequence, it is also possible to show that for this model the value of
the full information deviation becomes small when the number of fringe firms is large.
The model just described imposes strong assumptions on the model primitives for fringe firms and may
be too restrictive for many applications. A natural question is whether we can obtain similar aggregation
results for a broader class of models. Unfortunately, as we illustrate below, the answer is usually no. Specifically, our next result shows that when firms keep track of one moment, moments are sufficient statistics only
if fringe firms’ equilibrium transitions are, in an appropriate sense, linear in their own state. To our knowledge, such equilibrium transitions arise only in the constant returns to scale model presented in Appendix B
or in close variations. There, equilibrium transitions are linear, because the primitive transitions are linear
and, as we show in the appendix, the equilibrium investment strategy does not depend on the fringe firms’
individual state.12
We have the following result. All proofs are provided in the appendix. Suppose that the single-period
profit function depends on the α-th un-normalized moment of the fringe state (as in Example 7.1).
Proposition 5.1. Assume Xf is a closed interval in ℝ, N − D̄ ≥ 3, there is no entry and exit in the industry,
the set of dominant firms is fixed, and the set of moments contains the α-th un-normalized moment only, that
is, $\theta(f) = \sum_{x \in X_f} x^{\alpha} f(x)$. Suppose that under MME strategy µ, the moment is a sufficient statistic for the
evolution of the industry, that is:
P̂µ,λ [θ′|ŝ] = Pµ,λ [θ′|s],
for all θ′ ∈ Sθ , s = (f, d, z) ∈ S, and ŝ = (θ, d, z) ∈ Ŝ, with θ(f ) = θ. Then, fringe firms’ transitions in
MME must be linear in their own state, in the sense that $E_{\mu}[x_{i,t+1}^{\alpha} \mid x_{it} = x, \hat{s}_t] = x^{\alpha}\, \tilde{\zeta}_1(\hat{s}_t \mid \mu) + \tilde{\zeta}_0(\hat{s}_t)$, for
given functions ζ̃1 (·|µ) and ζ̃0 (·).
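As intuition for why linearity is the natural condition (this is a sketch of the converse direction for the conditional mean only, not part of the proof), take α = 1 and let n = Σx f (x) denote the number of fringe firms, which is fixed under the assumptions of the proposition. If fringe transitions are linear in the sense above, then:

```latex
\begin{aligned}
E_{\mu}\!\left[\theta_{t+1} \mid f_t = f,\ d_t,\ z_t\right]
  &= \sum_{x \in X_f} f(x)\, E_{\mu}\!\left[x_{i,t+1} \mid x_{it} = x,\ \hat{s}_t\right] \\
  &= \sum_{x \in X_f} f(x)\,\bigl(x\,\tilde{\zeta}_1(\hat{s}_t \mid \mu) + \tilde{\zeta}_0(\hat{s}_t)\bigr) \\
  &= \tilde{\zeta}_1(\hat{s}_t \mid \mu)\,\theta_t + n\,\tilde{\zeta}_0(\hat{s}_t).
\end{aligned}
```

The expected next moment then depends on the fringe state f only through θt . If instead the conditional mean of a firm's next state were, say, quadratic in x, the same sum would involve the second moment of f , which firms do not track.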
12
These results are closely related to the literature in macroeconomics that deals with the aggregation of macroeconomic quantities from the decisions of a single ‘representative agent’ (Blundell and Stoker, 2005). With heterogeneous agents, similarly to our
model, this type of result is only obtained under strong assumptions about consumers’ preferences that are also akin to linearity
(see, for example, Altug and Labadie (1994)).
This result suggests that it is unlikely that one can prove a general result showing that MME becomes
exact for the broad classes of models that are typically used in applied work. Instead, a more
practical approach, that we take in our numerical experiments, is to use MME in these models assuming
that firms approximate the evolution of the moment-based state with a perceived Markov process, and then
try to measure the extent of the approximation error. Of course, measuring the value of the full information
deviation is intractable in large-scale applications with many firms; in these instances, computing a best
response is almost as computationally challenging as solving for an MPE. Thus motivated, in Section 9 we
introduce a computationally tractable upper bound to the value of the full information deviation.
5.2
MME as Approximation to MPE
MME can also be interpreted as a low-dimensional approximation to MPE. Despite the lack of theoretical
results showing that moments become sufficient statistics, as argued in the previous section, in Section 7.3
we will provide computational evidence showing that MME approximates MPE for important classes of
models.
For now, we show that if the value of the full information deviation becomes small (e.g., by including
more moments in MME), then MME strategies approach MPE strategies. This result provides a connection
between the interpretation of MME as an appealing behavioral model in the sense of an ε-equilibrium, and
that of MME as an approximation to MPE. We formalize this in the following theorem.
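Let $\Gamma \subseteq M \times \Lambda$ be the set of MPE. For all $(\mu, \lambda) \in M \times \Lambda$, let us define
$D(\Gamma, (\mu, \lambda)) = \inf_{(\mu', \lambda') \in \Gamma} \|(\mu', \lambda') - (\mu, \lambda)\|_\infty$.13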
Theorem 5.1. Consider a sequence of MME strategies $\{(\mu_n, \lambda_n) \mid n \in \mathbb{N}\}$, such that (i) the value of
the full information deviation converges to zero for all industry states, that is $\lim_{n \to \infty} \Delta_{\mu_n, \lambda_n}(x, s) = 0$,
$\forall (x, s) \in X \times S$; and (ii) the cut-off entry condition is satisfied asymptotically, that is
$\lim_{n \to \infty} \left| \lambda_n(s) - \beta E_{\mu_n, \lambda_n}[V(x^e, s_{t+1} \mid \mu_n, \lambda_n) \mid s_t = s] \right| = 0$, $\forall s \in S$.14 Then, $\lim_{n \to \infty} D(\Gamma, (\mu_n, \lambda_n)) = 0$.
The proof closely follows the proof of a similar result in Farias et al. (2012); we include it in the
appendix for completeness. We note that the assumption that there is no overlap of individual states between
dominant and fringe firms is important to obtain the result; if not, MME would not admit symmetric Markov
strategies. In addition, the assumption that the value of the full information deviation converges to zero for
all industry states is key to proving the result. However, in practice, when using a transition kernel like the
one in Example 4.1, we may only expect this to hold in states in the recurrent class, because
transitions outside this class are arbitrary. This issue would be avoided if the entire state space were recurrent
under strategies that could potentially be best responses.15 Alternatively, if we only assume that the value
of the full information deviation converges to zero in states in the recurrent class, we can show that we
approximate a weaker equilibrium concept that does not impose optimality requirements for states that are
not reachable in equilibrium, similar to those proposed in Fershtman and Pakes (2012).
13 ‖·‖∞ denotes the supremum norm.
14 When the value of the full information deviation becomes small, it is also reasonable to expect that the error in the cut-off entry rule also becomes small.
15 For example, if the model has depreciation and appreciation shocks in all states.
6 Computation of MME
In this section we introduce an algorithm that computes MME and discuss in detail the perceived transition
kernel with tier transitions.
6.1 Algorithm
To compute MME we employ a best response-type algorithm. The solver starts with a strategy profile,
checks for the equilibrium conditions, and updates the strategies by best responding to the current strategy until an equilibrium is found. In each iteration we construct the perceived transition kernel from firms’
strategies according to the operator Φ. Depending on the operator used this could require simulation. We describe the steps in Algorithm 1 for the observed transition kernel introduced in Example 4.1. The following
remarks are important:
1. The algorithm terminates when the norm of the distance between a strategy and its best response is
small. We consider the following norm $\|\mu - \mu'\|_h = \max_{x \in X} \left\{ \sum_{\hat{s} \in \hat{S}} |\mu(x, \hat{s}) - \mu'(x, \hat{s})| \, h(\hat{s}) \right\}$,
where h is a probability vector. We take h to be the frequency in which each industry state is visited,
hence states are weighted according to their relevance. This is useful because simulation errors in the
estimated transition kernel are higher for states that are visited infrequently. We also try a variation of
h in which we initially add positive weight to all states.
2. Even if in theory all states are recurrent, in a finite length simulation it is possible that some states
will not be visited, and for these states the perceived transition kernel cannot be computed. For these
states we set the transitions in P̂ to be some predetermined value. We try different specifications for
these states and discuss their impact on the equilibrium in Section 7.3.
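The two remarks above can be made concrete with a small sketch. It assumes, purely for illustration, that moment-based states and moment values are encoded as integer indices and that a simulated path is available as two index sequences; the function names and the `default_row` callback (which supplies the predetermined row for states with no observed transition) are hypothetical and not part of the paper's implementation.

```python
import numpy as np


def observed_frequencies_and_kernel(s_path, theta_path, n_states, n_moments, default_row):
    """Estimate h and the observed moment kernel from one simulated sample path.

    s_path      : integer index of the moment-based state in each period (assumed encoding)
    theta_path  : integer index of the fringe moment in each period (assumed encoding)
    default_row : function state_index -> probability vector over moments, used for
                  states with no observed outgoing transition (the predetermined value)
    """
    s_path = np.asarray(s_path)
    theta_path = np.asarray(theta_path)

    # h(s): fraction of periods spent in each state.
    h = np.bincount(s_path, minlength=n_states) / len(s_path)

    # counts[s, theta'] = number of times state s was followed by moment theta'.
    counts = np.zeros((n_states, n_moments))
    np.add.at(counts, (s_path[:-1], theta_path[1:]), 1.0)

    kernel = np.empty_like(counts)
    for s in range(n_states):
        total = counts[s].sum()
        kernel[s] = counts[s] / total if total > 0 else default_row(s)
    return h, kernel


def weighted_norm(mu, mu_br, h):
    """max over own states x of sum_s |mu(x, s) - mu_br(x, s)| * h(s).

    Rows of mu index the firm's own state x, columns index the industry state s.
    """
    diff = np.abs(np.asarray(mu, dtype=float) - np.asarray(mu_br, dtype=float))
    return float(np.max(diff @ h))
```

Because h weights each state by its visit frequency, strategy differences in states that the path never reaches do not affect the stopping criterion, which is why the text also considers adding positive weight to all states initially.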
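Algorithm 1 Equilibrium solver
1: Initialize with $(\mu, \lambda)$ and industry state $s_0 = (f_0, d_0, z_0)$ with corresponding $\hat{s}_0$;
2: $n := 1$, $\delta := \epsilon + 1$;
3: while $\delta > \epsilon$ do
4:   Simulate a $T$ period sample path $\{(f_t, d_t, z_t)\}_{t=1}^{T}$ with corresponding $\{\hat{s}_t\}_{t=1}^{T}$ for large $T$;
5:   Calculate the observed frequencies of industry states $h(\hat{s}) := \frac{1}{T} \sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s}\}$, for all $\hat{s} \in \hat{S}$;
6:   Use the same sample path to compute the observed transition kernel $\hat{P}_{\mu,\lambda}[\cdot \mid \hat{s}]$, for all $\hat{s}$;
7:   Solve $\mu' := \operatorname{argmax}_{\mu' \in \hat{M}} \hat{V}(x, \hat{s} \mid \mu', \mu, \lambda)$, for all $(x, \hat{s}) \in X \times \hat{S}$;
8:   Let $\lambda'(\hat{s}) := E_{\mu,\lambda}[\hat{V}(x^e, \hat{s}_{t+1} \mid \mu, \lambda) \mid \hat{s}_t = \hat{s}]$, for all $\hat{s} \in \hat{S}$;
9:   $\delta := \max(\|\mu - \mu'\|_h, \|\lambda - \lambda'\|_h)$;
10:  $\mu := \mu + (\mu' - \mu)/(1 + n^\sigma)$;
11:  $\lambda := \lambda + (\lambda' - \lambda)/(1 + n^\sigma)$;
12:  $n := n + 1$;
13: end while;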
If the algorithm terminates with ε = 0 and h(·) assigns strictly positive weight to all industry states we
have found an MME. A positive value of ε allows for numerical error. We also note that in the algorithm,
0 < σ < 1 is chosen to speed up convergence by smoothing out the update of strategies from one iteration to
the next.16 For different perceived transition kernels, lines 4-6 in Algorithm 1 will be different. For example,
they would require a slight modification to incorporate transitions between the fringe and dominant tiers as
we do in our numerical experiments. This extension is explained in detail in Section 6.2.
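The damped update in Algorithm 1 can be illustrated in isolation. The sketch below keeps only the outer loop (compute a best response, measure the distance δ, move a fraction 1/(1 + n^σ) of the way) and replaces the simulation and dynamic programming steps with a generic `best_response` callable; that callable, the function names, and the iteration cap are illustrative assumptions, and the sup norm stands in for the h-weighted norm.

```python
import numpy as np


def step_size(n, sigma=2.0 / 3.0):
    """Damping weight 1 / (1 + n^sigma) applied at iteration n = 1, 2, ..."""
    return 1.0 / (1.0 + n ** sigma)


def damped_iteration(best_response, x0, sigma=2.0 / 3.0, eps=1e-6, max_iter=100_000):
    """Iterate x := x + (BR(x) - x) / (1 + n^sigma) until |x - BR(x)| <= eps.

    best_response is a stand-in for the simulation and optimization steps of the
    solver; here it is any function mapping an array to an array of the same shape.
    Returns the final iterate and the number of iterations performed.
    """
    x = np.asarray(x0, dtype=float).copy()
    n, delta = 1, eps + 1.0
    while delta > eps:
        if n > max_iter:
            raise RuntimeError("no convergence within max_iter iterations")
        x_br = best_response(x)
        delta = float(np.max(np.abs(x - x_br)))   # sup norm as a simple stand-in
        x = x + (x_br - x) * step_size(n, sigma)  # smoothed update
        n += 1
    return x, n - 1


# Toy usage: a contraction with fixed point 2, not an economic model.
x_star, iterations = damped_iteration(lambda x: 0.5 * x + 1.0, [0.0])
```

The weights shrink from one iteration to the next but slowly enough that their sum diverges, so the iterate can still travel an arbitrary distance while late updates are heavily smoothed.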
We found that for the observed transition kernel a real-time stochastic algorithm similar to Pakes and
McGuire (2001) and Fershtman and Pakes (2012) is much faster than Algorithm 1. In this variant the kernel
is not simulated in every iteration, but instead continuation value functions that allow us to solve for firms’
optimal strategies are kept in memory and updated through simulation draws. In this sense, this modified
real-time algorithm performs the simulation and optimization steps simultaneously. We use one step of
Algorithm 1 to check whether C1 and C2 of MME hold and convergence has been achieved. The details of
this variant of the algorithm are presented in Appendix D.
6.2 Perceived Transition Kernel
In this section we extend the observed perceived transition kernel of Example 4.1 to allow for tier transitions.
We use this kernel in the context of the previously introduced algorithm to compute MME in our numerical
experiments.
In our experiments we assume that firms are dominant if and only if they are above a pre-determined
state x̄. Hence, the transitions in and out of dominance are naturally embedded in firms’ transitions. To
simplify the dynamics we impose the constraint that at most one firm becomes dominant in a time period;
this constraint was typically never binding in our experiments. Further, to construct the perceived transition kernel we assume that the events that a fringe competitor becomes dominant and vice versa are both
statistically independent of the moment transitions. Our numerical results suggest that we are able to approximate MPE even under this simplifying assumption. Moreover, this approximation is computationally
useful, because without it we would need to store the joint transition kernel of dominant firms and moment
transitions, requiring significantly more memory; this could be relevant in large-scale applications like the
one presented in Section 8.17 Under these assumptions, we can extend the observed transition kernel of
equations (3) and (4) by:
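$$\hat{P}_{\mu', \mu, \lambda}[x', \hat{s}' \mid x, \hat{s}] = P_{\mu', \mu, \lambda}[x', d', z' \mid x, \hat{s}] \; \hat{P}_{\mu, \lambda}[\theta' \mid \hat{s}] \; \hat{P}_{\mu, \lambda}[(f \to d) \mid \hat{s}], \qquad (7)$$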
where the first term of the right-hand side incorporates the transition probability of firm i at state x (including
its chances of becoming dominant in the next time period), the transitions of current dominant firms, and the
evolution of the shock. The second term corresponds to the observed transition kernel for moments specified
in (3). The event (f → d) corresponds to a fringe competitor of firm i transitioning and becoming a dominant
firm in the next period.18 For our specification, this event corresponds to a fringe competitor growing
to or above individual state x̄. For a given moment-based state, we assume firms’ perceived probability of
this transitioning event occurring is equal to its long-run average observed frequency, that is:
$$\hat{P}_{\mu, \lambda}[(f \to d) \mid \hat{s}] = \lim_{T \to \infty} \frac{\sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s}, (f \to d)_t\}}{\sum_{t=1}^{T} 1\{\hat{s}_t = \hat{s}\}}.$$
16 In practice, we use σ = 2/3.
17 Under our simplification, to build the transition kernel we need to store the following elements in memory: the perceived transition kernel for moments, the tier transitions’ probabilities, and the strategy of dominant firms. This roughly requires memory space proportional to |Ŝ||Sθ| + 2|Ŝ| (assuming no aggregate shock). Instead, the joint transition kernel of dominant firms and moment transitions would require memory space roughly proportional to |Ŝ|². Recall that |Ŝ| = |Sθ||Sd|. For example, if |Sθ| = |Sd| = 100, then the first expression is equal to 1,020,000 and the second to 100,000,000.
Note that in the kernel specified in equation (7) we also assume that a fringe firm’s own transitions are
independent of the moments, which seems reasonable in applications with many fringe firms. However,
in the small-scale experiments we run below to compare MME with MPE, we also tried a specification
in which a fringe firm that becomes dominant removes itself from the moment. With a small number of
fringe firms this had an important effect in improving the approximation to MPE, specifically in the capacity
competition model described below.
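The perceived tier-transition probability is a conditional frequency, so a finite-sample version is straightforward to sketch. The code below assumes a simulated path encoded as integer state indices together with a 0/1 flag per period marking whether a fringe competitor became dominant in the following period; the function name, the encoding, and the `default` value used for unvisited states are illustrative assumptions.

```python
import numpy as np


def tier_transition_frequency(s_path, became_dominant, n_states, default=0.0):
    """Finite-T estimate of the perceived probability of the event (f -> d) given s.

    s_path          : integer index of the moment-based state in each period
    became_dominant : 0/1 flag per period, 1 if a fringe competitor became dominant
    default         : value assigned to states that the path never visits (assumption)
    """
    s_path = np.asarray(s_path)
    flags = np.asarray(became_dominant, dtype=float)

    visits = np.bincount(s_path, minlength=n_states).astype(float)
    events = np.bincount(s_path, weights=flags, minlength=n_states)

    prob = np.full(n_states, default, dtype=float)
    visited = visits > 0
    prob[visited] = events[visited] / visits[visited]
    return prob


# Toy path: state 0 is visited three times (two events), state 1 twice (one event).
p_fd = tier_transition_frequency([0, 0, 1, 0, 1], [1, 0, 0, 1, 1], n_states=3)
```

Storing this vector alongside the moment kernel, rather than a joint kernel over dominant-firm and moment transitions, is what keeps the memory footprint small under the independence assumption.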
7 Numerical Experiments: Comparisons to MPE
As mentioned in Section 5, MME can be interpreted as a low-dimensional approximation to MPE and in
this section we numerically explore this interpretation. In Section 7.1 we introduce the two profit functions
we use in our numerical experiments and in Section 7.2 we provide other model specifications. Then, in
Section 7.3 we provide our numerical results showing that MME approximates MPE well. We also discuss
important implementation issues to consider when using MME in applied work.
7.1 Single-Period Profit Functions
In this section we provide two important profit functions that we use in our numerical experiments and that
can be well approximated by a function of a single fringe moment. Our first model is similar to standard
industrial organization models commonly used in empirical work.
Example 7.1 (Quality-Ladder). Similarly to Pakes and McGuire (1994), we consider an industry with
differentiated products, where each firm’s state variable is a number that represents the quality of its product.
Hence, a firm’s state is given by a natural number xit . For clarity, we do not consider aggregate shocks in
this example. There are m consumers in the market. In period t, consumer j receives utility uijt from
consuming the good produced by firm i given by:
$$u_{ijt} = \alpha_1 \ln(x_{it}) + \alpha_2 \ln(Y - p_{it}) + \nu_{ijt}, \quad i \in S_t, \; j = 1, \dots, m,$$
where Y is the consumer’s income, and pit is the price of the good produced by firm i. The random variables
νijt are i.i.d. Gumbel and represent unobserved characteristics for each consumer-good pair. There is also an
outside good that provides consumers zero expected utility. We assume consumers buy at most one product
each period and that they choose the product that maximizes utility. Under these assumptions our demand
system is a standard logit model.
18 Since we assumed that at most one firm becomes dominant in a time period, in principle, one should explicitly consider and rule out the event that firm i and a competitor become dominant in the same time period. Even though we explicitly consider this event in our numerical experiments, it turned out not to be important in practice, because the constraint of having at most one fringe firm becoming dominant in a time period was typically never binding. Hence, for simplification, we ignored this effect in the kernel above.
Considering a constant marginal cost of production d, there exists a unique Nash equilibrium in pure
strategies that can be computed by solving the first-order conditions of the pricing game (see Caplin and
Nalebuff (1991)). Let p∗ (x) denote the price charged by a firm in state x in such an equilibrium. Let
K(x, p) = exp(α1 ln(x) + α2 ln(Y − p)). Then, expected profits are given by:
$$\pi(x_{it}, s_t) = m \, (p^*(x_{it}) - d) \, \frac{K(x_{it}, p^*(x_{it}))}{1 + \sum_{x \in X} s_t(x) K(x, p^*(x))}, \quad \forall i \in S_t. \qquad (8)$$
Now, if, similar to the logit model of monopolistic competition of Besanko et al. (1990), we assume fringe
firms set prices assuming they have no market power, so they all set the price p∗ = (Y + dα2)/(1 + α2) (but
dominant firms continue to compete Nash in prices), expected profits are given by:
$$\pi(x_{it}, s_t) = m \, (p^*(x_{it}) - d) \, \frac{K(x_{it}, p^*(x_{it}))}{1 + (Y - p^*)^{\alpha_2} \sum_{y \in X_f} y^{\alpha_1} f_t(y) + \sum_{j \in D_t} K(x_{jt}, p^*(x_{jt}))}. \qquad (9)$$
Therefore, under this assumption, π(xit , st ) can be written as π(xit , θ̃t , dt ), that is, as a function of the single
moment $\tilde{\theta}_t = \sum_{y \in X_f} y^{\alpha_1} f_t(y)$, the α1-th un-normalized moment of ft . In Section 7.3 we numerically
show that (9) approximates (8) well in the type of applications we have in mind in which fringe firms are
indeed smaller in size compared to dominant firms.
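The fringe price entering (9) follows from a short first-order condition. The derivation below is a sketch under the stated assumption that a fringe firm treats the denominator of its market share as fixed, so that it maximizes its margin times K(x, p) with respect to its own price, where d is the marginal cost of production; the intermediate steps are supplied here for illustration.

```latex
\begin{align*}
\max_{p} \; (p - d)\,(Y - p)^{\alpha_2}
  &\;\Longrightarrow\; (Y - p)^{\alpha_2} - \alpha_2 (p - d)(Y - p)^{\alpha_2 - 1} = 0 \\
  &\;\Longrightarrow\; Y - p = \alpha_2 (p - d) \\
  &\;\Longrightarrow\; p^* = \frac{Y + \alpha_2 d}{1 + \alpha_2}.
\end{align*}
```

The price does not depend on the firm's quality level, which is why the fringe's contribution to the denominator of (9) factors into a common price term times a single moment of the fringe state distribution. With the default values Y = 1, d = 0.5, and α2 = 0.5, the formula gives p∗ = 1.25/1.5 ≈ 0.833.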
Our second model is structurally different and we chose it because it has been shown in previous work
to yield rougher MPE value functions relative to our first model (see Farias et al. (2012)). In this sense,
approximating MPE for this model imposes a more demanding test for MME.
Example 7.2 (Capacity Competition). This model is based on the quantity competition model of Besanko
and Doraszelski (2004). We consider an industry with homogeneous products, where each firm’s state
variable determines its production capacity $\bar{q}(x_{it})$. We assume that capacity grows linearly in state, that is
$\bar{q}(x) = q^i + q^s x$, for some constants $q^i, q^s > 0$. Investment increases this capacity. We also ignore an
aggregate shock. At each period, firms compete in a capacity-constrained quantity setting game. There is
a linear demand function Q(p) = m(e − f p) and an inverse demand function P (Q) = e/f − Q/(mf ),
where P represents the common price charged by all firms in the industry, Q represents the total industry
output, and e, m and f are positive constants. To simplify the analysis, we assume the marginal costs of all
firms are equal to zero. Given the total quantity produced by its competitors $Q_{-i,t}$, the profit maximization
problem for firm i at time t is given by $\max_{0 \le q_{it} \le \bar{q}(x_{it})} P(q_{it} + Q_{-i,t}) \, q_{it}$, where $\bar{q}(x_{it})$ is the production
capacity at individual state $x_{it}$. It is possible to show that a simple iterative algorithm yields the unique Nash
equilibrium of this game $q^*(x_{it})$.19 Profits for firm i are then given by:
$$\pi(x_{it}, s_t) = P\left( \sum_{x \in X} s_t(x) \, q^*(x) \right) q^*(x_{it}). \qquad (10)$$
Suppose that all fringe firms are relatively small and they produce at full capacity in the Nash equilibrium.
Then,
$$\pi(x_{it}, s_t) = P\left( \sum_{y \in X_f} f_t(y) \, \bar{q}(y) + \sum_{j \in D_t} q^*(x_{jt}) \right) q^*(x_{it}). \qquad (11)$$
Hence, under this assumption π(xit , st ) can again be written as a function of a single moment, π(xit , θ̃t , dt ),
where $\tilde{\theta}_t = \sum_{x \in X_f} f_t(x) \, \bar{q}(x)$, the total current installed capacity of fringe firms. In Section 7.3 we numerically show that (11) approximates (10) well in the type of applications we have in mind in which fringe
firms are indeed smaller than dominant firms.
In our numerical experiments firms keep track of the moments described above that approximate the
profit function. In the logit model, the fringe moment summarizes the impact of the fringe on market shares;
this is a natural statistic to keep track of. In the capacity competition model, firms monitor the aggregate
capacity among fringe firms. This again is natural, because (i) it may be a statistic that is easy to gather,
for example, from trade associations; and (ii) fringe firms may indeed produce at full capacity. Ryan (2012)
uses a profit function similar to our capacity competition model and argues that these two characteristics are
present in his study of the cement industry.
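The capacity-constrained quantity game of Example 7.2 can be solved numerically in a few lines. With zero marginal cost and inverse demand P(Q) = e/f − Q/(mf), a firm's unconstrained best response to rivals' output Q₋ᵢ is (me − Q₋ᵢ)/2, which is then clipped to the interval between zero and its capacity. The sketch below applies these best responses one firm at a time (a Gauss-Seidel sweep); this particular scheme, the function names, and the tolerances are assumptions chosen for illustration, since the text only states that a simple iterative algorithm exists.

```python
def capacity_constrained_cournot(capacities, m, e, tol=1e-10, max_sweeps=10_000):
    """Equilibrium quantities of the capacity-constrained quantity game.

    capacities : list with one production capacity per firm
    m, e       : demand parameters of Q(p) = m (e - f p); f does not affect quantities
    Sequentially replaces each firm's quantity by its clipped best response
    until no quantity changes by more than tol within a full sweep.
    """
    q = [0.0] * len(capacities)
    for _ in range(max_sweeps):
        largest_change = 0.0
        for i, cap in enumerate(capacities):
            rivals = sum(q) - q[i]
            best = max(0.0, min(cap, 0.5 * (m * e - rivals)))
            largest_change = max(largest_change, abs(best - q[i]))
            q[i] = best
        if largest_change < tol:
            return q
    raise RuntimeError("best-response sweeps did not converge")


def inverse_demand(total_q, m, e, f):
    """P(Q) = e / f - Q / (m f)."""
    return e / f - total_q / (m * f)


# Default demand parameters from the capacity model: m = 2, e = 50, f = 10.
# Two small firms with capacity 2 and one large firm with capacity 18.
q_mixed = capacity_constrained_cournot([2.0, 2.0, 18.0], m=2, e=50)
price_mixed = inverse_demand(sum(q_mixed), m=2, e=50, f=10)

# Five identical firms with capacity 18: the constraint is slack in equilibrium.
q_symmetric = capacity_constrained_cournot([18.0] * 5, m=2, e=50)
```

In the first usage example every firm produces at full capacity, which is the situation assumed for fringe firms in (11); in the second the symmetric unconstrained quantity me/(N + 1) lies below capacity.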
7.2 Other Model Specifications
In our numerical experiments in this section we consider the single-period profit functions from Examples
7.1 and 7.2. The rest of the model specifications are common across these two profit functions.
Firms can invest ι ≥ 0 in order to improve their individual state over time (either product quality or
capacity, depending on the model) at a constant marginal investment cost of c. A firm’s investment is
successful with probability $\frac{a\iota}{1 + a\iota}$, in which case the individual state increases by one level. The firm’s product
may also depreciate by one state with probability δ, independently each period. Our model differs from
Pakes and McGuire (1994) here because the depreciation shocks in our model are idiosyncratic. Combining
the investment and depreciation processes, it follows that the transition probabilities for a firm in state x that
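does not exit and invests ι are given by:
$$P\left[ x_{i,t+1} \mid x_{it} = x, \iota \right] =
\begin{cases}
\dfrac{(1 - \delta) a \iota}{1 + a \iota} & \text{if } x_{i,t+1} = x + 1 \\[6pt]
\dfrac{(1 - \delta) + \delta a \iota}{1 + a \iota} & \text{if } x_{i,t+1} = x \\[6pt]
\dfrac{\delta}{1 + a \iota} & \text{if } x_{i,t+1} = x - 1.
\end{cases}$$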
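As a quick check on these transition probabilities, the sketch below evaluates the three cases for a given investment level and confirms that they sum to one. The function name and return format are illustrative assumptions, the defaults a = 1 and δ = 0.7 are the values reported for both models, and boundary handling at the lowest and highest individual states is deliberately left out.

```python
def transition_probabilities(iota, a=1.0, delta=0.7):
    """Probabilities of moving up, staying, or moving down one state.

    iota  : investment level (must be non-negative)
    a     : investment effectiveness parameter
    delta : per-period depreciation probability
    Returns a dict keyed by the change in the individual state (+1, 0, -1).
    """
    if iota < 0:
        raise ValueError("investment must be non-negative")
    denom = 1.0 + a * iota
    up = (1.0 - delta) * a * iota / denom              # success, no depreciation
    stay = ((1.0 - delta) + delta * a * iota) / denom  # both shocks or neither
    down = delta / denom                               # failure and depreciation
    return {+1: up, 0: stay, -1: down}


probs_no_investment = transition_probabilities(0.0)
probs_unit_investment = transition_probabilities(1.0)
```

With no investment the firm can never move up and moves down with probability δ; investing ι = 1 halves the probability of moving down because a successful investment offsets a depreciation shock.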
While it would be straightforward to implement entry and exit as we do in Section 8, in order to simplify
the model we omit them from our computations in this section.
19 The equilibrium quantities are characterized by the following set of equations: $q^*(x_{it}) = \max\left\{ 0, \min\left\{ \bar{q}(x_{it}), \tfrac{1}{2}\left( me + q^*(x_{it}) - \sum_{x \in X} s_t(x) q^*(x) \right) \right\} \right\}$, $\forall i \in S_t$.
In practice, parameters would either be estimated using data from a particular industry or chosen to reflect an industry under study. We use a set of parameter values to reflect reasonable economic fundamentals,
summarized in Tables 1, 2, and 3. We keep these parameters fixed for all experiments, unless otherwise
noted.
Parameter   m     d     α1    α2    Y     c
Value       60    0.5   1     0.5   1     0.65
Table 1: Default parameters for quality ladder model experiments.

Parameter   m     q^i   q^s   e     f     c
Value       2     0     2     50    10    4
Table 2: Default parameters for capacity model experiments.

Parameter   β       δ     a     X
Value       0.925   0.7   1     {1,...,9}
Table 3: Default parameters for both models.
7.3 Comparison to MPE
In this section we numerically compare MME with MPE for the two models described above. Because
exact computation of MPE is only feasible for small state spaces, we consider |X | = 9 individual states
and N = 5, 6, 7 firms. We also consider several parameter regimes. First, for the quality ladder model, the
default parameters in the tables above represent a regime in which there is a small market size and a small
investment cost. We also consider a set of parameters in which these two quantities are relatively larger.20
We also consider similar regimes for the capacity competition model with a similar interpretation for the
default parameters.21
Both for MME and MPE there are potentially multiple equilibria. We have not made an attempt to
exhaustively search for all of them; however, our algorithm always found the same equilibrium for a given
instance, even when starting from different initial conditions (see Besanko et al. (2010) for an explanation
of this phenomenon). All of the results below apply to this particular equilibrium. Instead of comparing equilibrium strategies directly, we compare economic indicators induced by these strategies. These indicators
are long-run averages of various functions of the industry state under the strategy in question. The indicators
we examine are those that are typically of interest in applied work; we consider average total investment,
average producer surplus, average consumer surplus, and average social welfare.
20 For the latter, we consider m = 90 and c = 1. The default market size parameters are used for N = 6. For N = 5 and N = 7 we subtract and add 5 units to these numbers, respectively.
21 For the regime with a larger investment cost, we consider e = 65 and c = 6.5. Again, the default market size parameter is used for N = 6. For N = 5 and N = 7 we use m = 1.66 and m = 2.33, respectively.
For the comparison results presented in Table 4, we consider MME with D = 3 dominant firm slots. We
assume that firms become dominant if they are large enough, specifically if they grow to or above
a pre-determined individual state x̄. For all the instances, we fixed x̄ = 6 (recall there are 9 individual
states). This threshold state seems like a reasonable compromise between defining dominant firms once
they indeed grow large enough, but at the same time allowing enough richness in dominant firms’ dynamics.
We assume firms always keep track of a single moment, namely, the one suggested in Section 7.1, for
each profit function. Computing MME typically takes a matter of minutes, while computing MPE could take
many hours, in particular for the case N = 7.22 We discuss below how the results change when considering
different specifications for dominant firms and moments.
In Table 4, we observe that MME provides accurate approximations to MPE for all economic indicators
of interest in all instances. As expected, MME does a better job approximating MPE for the quality ladder
model, for which most indicators are approximated within 3-5%. For the capacity competition model, the
approximation errors are similar, except for consumer surplus, for which the approximation error is around
11-12% for two instances (but significantly smaller for the rest).
Note that there are several sources of approximation error for MME relative to MPE. First, unlike
in MPE, even though fringe firms in MME consider the possibility of becoming dominant when investing,
they do not have immediate incentives to invest to deter the growth of other firms, because their state is not
monitored by competitors unless the fringe firm becomes dominant. As a consequence, in some states, fringe
firms in MME underinvest relative to MPE and this is an important driver in the differences in consumer
surplus between MME and MPE, which in all instances is larger in the latter. However, we expect that
MME may provide an even better approximation to MPE as the number of firms grows, because, in this
case, the deterrence incentives of small firms in MPE would also be reduced. Unfortunately, we cannot test
this hypothesis because we cannot compute MPE for larger industries.23
A second source of approximation error for MME relative to MPE could be that in MME we use approximations to the single-period profit function that only depend on a single moment. However, when numerically comparing the long-run average difference between the ‘approximate’ single-period profit function
with the exact profit function, we observe that this does not seem to play an important role. For the capacity
competition model, there is no difference; fringe firms are small relative to the size of the market and they
always produce at full capacity in all instances. For the quality ladder model, prices in both profit functions
are slightly different but the average differences in profits are always less than 1%.
A third source of approximation error for MME relative to MPE is that moments may not be sufficient
statistics to predict the evolution of the industry. As previously mentioned, in our numerical experiments we
always keep track of a single moment, namely, the one suggested in Section 7.1, for each profit function.24
22 For example, for N = 7 in the capacity competition model, the number of industry states for MME and MPE are given by 1120 and 11440, respectively. All runs were performed on a shared cluster of 27 dual Xeon processors. The code was programmed in Java.
23 In the range of industries we compute, fringe firms are still reasonably sized relative to dominant firms even for N = 7. For this reason and because there are several moving parts in our numerical experiments (e.g., typically the average number of dominant firms increases as N increases) the approximation errors in Table 4 do not necessarily always decrease as N increases.
24 If the set of feasible values the moment can take is large due to the presence of a potentially large number of fringe firms, we could discretize the state space of moments using a manageable grid and interpolate the value function in between points of the grid. We experimented with this type of setting using linear interpolation. Further, in the experiments in the beer industry in Section
                              Long-Run Statistics
Number of firms   Row       Total Inv.   Prod. Surp.   Cons. Surp.   Soc. Welf.

Quality ladder model, high investment cost
N = 5             MPE       7.126        26.936        173.63        200.566
                  MME       7.688        25.601        170.755       196.356
                  % Diff.   7.89         4.96          1.66          2.1
N = 6             MPE       7.841        28.725        194.878       223.603
                  MME       8.082        27.468        190.102       217.57
                  % Diff.   3.07         4.37          2.45          2.7
N = 7             MPE       8.519        30.47         215.628       246.098
                  MME       8.373        29.257        208.545       237.801
                  % Diff.   1.73         3.98          3.28          3.37

Quality ladder model, low investment cost
N = 5             MPE       7.1          17.42         112.21        129.64
                  MME       7.66         16.56         110.36        126.91
                  % Diff.   7.82         4.98          1.65          2.1
N = 6             MPE       8.04         19.18         130.99        150.17
                  MME       8.31         18.37         127.88        146.25
                  % Diff.   3.37         4.23          2.37          2.61
N = 7             MPE       8.98         20.91         149.95        170.86
                  MME       8.85         20.15         145.36        165.51
                  % Diff.   1.44         3.65          3.06          3.13

Capacity competition model, high investment cost
N = 5             MPE       5.799        144.689       33.426        178.116
                  MME       5.804        141.403       31.708        173.111
                  % Diff.   0.09         2.27          5.14          2.81
N = 6             MPE       7.209        176.806       41.716        218.522
                  MME       7.009        171.930       38.455        210.385
                  % Diff.   2.77         2.75          7.82          3.72
N = 7             MPE       8.61         209.003       50.24         259.242
                  MME       8.179        201.605       44.41         246.015
                  % Diff.   5.01         3.54          11.6          5.1

Capacity competition model, low investment cost
N = 5             MPE       6.564        98.666        38.715        137.381
                  MME       6.867        97.234        36.496        133.729
                  % Diff.   4.63         1.45          5.73          2.66
N = 6             MPE       8.073        119.87        48.206        168.077
                  MME       8.206        118.077       44.539        162.617
                  % Diff.   1.66         1.5           7.6           3.25
N = 7             MPE       9.609        140.949       58.295        199.244
                  MME       9.516        138.576       51.376        189.951
                  % Diff.   0.97         1.68          11.87         4.66

Table 4: Comparison of MPE and MME indicators. Long-run statistics computed simulating industry
evolution over 10^6 periods.
As we argued there, these are natural moments to keep track of and it is encouraging that we obtain good
approximations to MPE in the two models. If in certain applications using a single moment is not enough to
achieve an accurate approximation, one could add an additional moment. One informative statistic that could
potentially improve the approximation is the number of fringe firms that have a high chance of becoming
dominant soon. For example, in our model this could be the number of fringe firms that are close to the
threshold state x̄. We tried adding this statistic in our experiments and in general it did not improve the
approximation significantly. Instead, in our experiments adding dominant firm slots significantly improved
the approximation, as we discuss next.
A fourth source of approximation error for MME relative to MPE is that in the former there is a potentially small number of dominant firm slots. In fact, in our numerical experiments we find that having
enough slots to accommodate fringe firms transitioning to become dominant is important for providing accurate approximations. For the results presented in Table 4, the number of slots is typically large enough so
there is at least one slot empty in 90% or more of the industry states in the recurrent class. Further, with 3
dominant slots, the average number of dominant firms in the experiments is between only 1 and 2. This is
important to avoid distorting investment incentives of fringe firms that are about to become dominant. For
example, in the low investment cost instance of the capacity competition model for N = 6, decreasing the
number of dominant slots from 3 to 2 and then to 1, increases the error in approximating consumer surplus
from 7.6% to 18.1% and then to 37.5%, respectively. On the other hand, increasing the number of dominant
slots to 5 improves the approximation of consumer surplus to only 4.3% (without significantly improving
the approximation of producer surplus and total investment), with almost a four-fold increase in the size of
the state space. The relatively small improvement in approximation errors does not seem to compensate for
the large increase in computational cost.
Finally, we also study the impact of firms’ perceived transition probabilities outside the recurrent class
on MME. Recall that the perceived transition kernel used in our experiments (defined in Section 6.2) only
specifies transitions in the recurrent class for given strategies; transitions outside the recurrent class are
arbitrary. In the results presented in Table 4, we assume that firms’ perceptions in states outside the recurrent
class are such that the current value of the moment remains the same in the next time period, which we call
‘status-quo’ perceptions. In our numerical experiments, we show that small perturbations around status-quo perceptions (e.g., instead of assuming that the moment stays the same in the next time period, firms
assume that the moment transitions to a near-by value) have a minor impact on MME outcomes. We also
run experiments with extreme ‘optimistic perceptions’; in this case, firms’ perceptions in states outside the
recurrent class are that the moment transitions to its lowest possible value regardless of its current value.
We observe that the differences in MME outcomes between these two modes of perceptions outside the
recurrent class could be large in general; however, these differences tend to decrease as the number of fringe
firms in the industry increases.25 For robustness, users should experiment with different specifications for
the perceptions outside the recurrent class in their applications.
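The two modes of perceptions discussed above differ only in the row they assign to states outside the recurrent class. The sketch below assumes the perceived moment kernel is stored as a matrix with one row per moment-based state and one column per moment value, with moment values indexed in increasing order so that column 0 is the lowest possible moment; these encodings and all function names are illustrative assumptions.

```python
import numpy as np


def status_quo_row(current_moment, n_moments):
    """Perceive that the moment keeps its current value next period."""
    row = np.zeros(n_moments)
    row[current_moment] = 1.0
    return row


def optimistic_row(current_moment, n_moments):
    """Perceive that the moment drops to its lowest value, whatever its current value."""
    row = np.zeros(n_moments)
    row[0] = 1.0
    return row


def fill_unvisited(kernel, visits, moment_of_state, perception):
    """Return a copy of the kernel with rows of unvisited states set by `perception`.

    kernel          : array of shape (n_states, n_moments), rows of visited states estimated
    visits          : number of times each state was visited in the simulation
    moment_of_state : index of the moment value embedded in each state
    """
    kernel = np.asarray(kernel, dtype=float)
    filled = kernel.copy()
    for s in np.flatnonzero(np.asarray(visits) == 0):
        filled[s] = perception(moment_of_state[s], kernel.shape[1])
    return filled


# Toy example: state 0 was visited, states 1 and 2 were not.
estimated = [[0.2, 0.8], [0.0, 0.0], [0.0, 0.0]]
visits = [5, 0, 0]
moment_of_state = [0, 1, 1]
kernel_status_quo = fill_unvisited(estimated, visits, moment_of_state, status_quo_row)
kernel_optimistic = fill_unvisited(estimated, visits, moment_of_state, optimistic_row)
```

Solving for MME once under each filled kernel and comparing the resulting long-run indicators is one way to run the robustness check suggested in the text.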
8 we use a bicubic spline to interpolate between grid points when computing firms’ optimal strategies (see, e.g., Judd (1998)). We
note that changing the fineness of the grid within reasonable ranges did not change the results significantly.
25 For example, in the low investment cost instance of the quality ladder model, the difference in long-run expected consumer surplus between MME with status-quo and optimistic perceptions is 15.5%, 4.4%, and 1.1% for N = 6, N = 10, and N = 20, respectively.
8 Large-Scale Application: The Beer Industry
In the previous section we provided results for small-scale numerical experiments to be able to compare
MME with MPE. In this section, to show how our approach increases the applicability of dynamic oligopoly
models, we provide an application of MME in a large industry for which MPE is infeasible to compute.
Some questions that have puzzled economists for decades are: What are the determinants of market
structure? Why do some industries become dominated by a handful of firms while still holding many small
firms? How does the resulting market structure affect market outcomes? We believe that our approach can
be used to shed light on these questions, and more broadly, to develop counterfactuals in different empirical
settings in which the market structure is endogenously determined in a fully dynamic model.
As an example, we perform numerical experiments that are motivated by the long trend towards concentration in the beer industry in the US during the years 1960-1990. In the course of those years, the number of
active firms dropped from about 150 to 30, and three industry leaders emerged: Anheuser-Busch (owner of
the Budweiser brand, among others), Miller, and Coors. Two competing explanations for this trend are common
in the literature (see Tremblay et al. (2005)): an increase in the minimum efficient scale (MES), and an
increase in the importance of advertising, which, with the emergence of national television, benefited big firms.
The role of advertising as an ‘endogenous sunk cost’ in determining market structure is discussed in detail
in Sutton (1991) (see Chapter 13 for a discussion of the beer industry). In this section we calibrate a model
of the beer industry, similar to the dynamic advertising model by Doraszelski and Markovich (2007), to
examine the role advertising may have on market structure. We emphasize that it is not our goal to develop
a complete and exhaustive empirical model of the beer industry, but rather to illustrate our approach in a
setting of potential empirical relevance.
The model is similar to Example 7.1, where xit is the goodwill of firm i in period t with associated
market share
$$\sigma(x_{it}, s_t) \propto (x_{it})^{\alpha_1} (Y - p_{it})^{\alpha_2}.$$
Firms invest in advertising to increase their goodwill stock over time and compete in prices every period.
The number of dominant firms is determined endogenously in the equilibrium, with a maximum number
of three. The transitions between goodwill states are similar to those introduced in Section 7.2, but we use
a multiplicative growth model (as opposed to an additive one) that has been previously used in empirical
applications of advertising (Roberts and Samuelson, 1988). Firms become dominant when their individual
goodwill level becomes larger than a predetermined value.
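As a rough illustration of the market share specification above, the following Python sketch computes shares proportional to $(x_{it})^{\alpha_1}(Y - p_{it})^{\alpha_2}$. The text states only proportionality, so the normalization used here (an outside option with weight 1), the function names, and all numerical values are assumptions made for illustration.

```python
from typing import List, Sequence


def attraction(x: float, p: float, alpha1: float, alpha2: float, Y: float) -> float:
    """Unnormalized share term x^alpha1 * (Y - p)^alpha2 for one firm."""
    if p >= Y:
        return 0.0  # assumption: no demand once the price reaches Y
    return (x ** alpha1) * ((Y - p) ** alpha2)


def market_shares(goodwill: Sequence[float], prices: Sequence[float],
                  alpha1: float, alpha2: float, Y: float,
                  outside: float = 1.0) -> List[float]:
    """Shares proportional to the attraction terms, with an assumed outside option."""
    terms = [attraction(x, p, alpha1, alpha2, Y) for x, p in zip(goodwill, prices)]
    total = outside + sum(terms)
    return [t / total for t in terms]


# Hypothetical industry: two small firms and one large firm, all charging p = 1.
shares = market_shares([1.0, 2.0, 50.0], [1.0, 1.0, 1.0],
                       alpha1=1.0, alpha2=0.5, Y=2.0)
```

With equal prices, relative shares depend only on goodwill raised to the power $\alpha_1$, which is the channel through which the returns to advertising shape the size distribution.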
The numerical experiments examine the effect of different specifications of the contribution of goodwill
on firms’ profits as captured by the parameter α1 . Firms keep track of the α1 -th un-normalized moment and
MME is computed for each parameter specification. We say that the profit function exhibits: decreasing
returns to advertising (DRA) if α1 < 1, constant returns to advertising (CRA) if α1 = 1, and increasing
returns to advertising (IRA) if $\alpha_1 > 1$. We consider three specifications of returns to advertising, with $\alpha_1^f$
and $\alpha_1^d$ controlling the returns to advertising for fringe and dominant firms, respectively. These take values
$(\alpha_1^f, \alpha_1^d) \in \{\alpha_D, \alpha_C\} \times \{\alpha_D, \alpha_C, \alpha_I\}$, where $(\alpha_D, \alpha_C, \alpha_I) = (0.85, 1, 1.1)$. The three cases under
consideration are: (1) DRA-DRA with $(\alpha_D, \alpha_D)$, (2) DRA-CRA with $(\alpha_D, \alpha_C)$, and (3) CRA-IRA with
$(\alpha_C, \alpha_I)$.
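The classification of returns to advertising and the un-normalized moment tracked by firms can be sketched in a few lines of Python. The function and variable names are hypothetical, and which exponent enters the moment in each case is left as an argument rather than assumed.

```python
from typing import Dict, Iterable, Tuple


def returns_to_advertising(alpha1: float) -> str:
    """Classify a value of alpha1 as DRA, CRA, or IRA."""
    if alpha1 < 1.0:
        return "DRA"
    if alpha1 == 1.0:
        return "CRA"
    return "IRA"


def unnormalized_moment(fringe_goodwill: Iterable[float], alpha: float) -> float:
    """Sum of x^alpha over fringe firms (the alpha-th un-normalized moment)."""
    return sum(x ** alpha for x in fringe_goodwill)


ALPHA_D, ALPHA_C, ALPHA_I = 0.85, 1.0, 1.1

# The three cases considered in the text: (fringe exponent, dominant exponent).
CASES: Dict[str, Tuple[float, float]] = {
    "DRA-DRA": (ALPHA_D, ALPHA_D),
    "DRA-CRA": (ALPHA_D, ALPHA_C),
    "CRA-IRA": (ALPHA_C, ALPHA_I),
}

labels = {
    name: f"{returns_to_advertising(a_f)}-{returns_to_advertising(a_d)}"
    for name, (a_f, a_d) in CASES.items()
}

# Hypothetical fringe goodwill levels; with alpha = 1 the moment is a plain sum.
theta = unnormalized_moment([1.0, 2.0, 4.0], ALPHA_C)
```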
We calibrate the model parameters from a variety of empirical research that studies the beer industry
or related advertising settings.26 For example, α2 and Y are chosen to match the price elasticity in the
average price, and the average sell-off value is taken from the sale prices of used manufacturing plants.
See Appendix E for a list of the parameters and their sources. We emphasize that each of our experiments
includes 200 firms and 29 different individual states, which makes it much larger than any problem that could
be solved if MPE were used as the equilibrium concept.
Figure 1 plots (on a log-scale) the average goodwill distribution of firms for the three cases, and Table 5
reports some average industry statistics. The experiments suggest that higher returns to advertising indeed
give rise to more right-skewed size distributions, as is expected. Indeed, it is clear that in the DRA-DRA case
there is on average a vacancy in the dominant tier, and dominant firms are not much bigger than the biggest
fringe firms. In both DRA-CRA and CRA-IRA the industry is much more concentrated and dominant firms
are much larger than fringe firms. Also, dominant firms remain dominant for a much longer period of
time. In addition, as the industry becomes more concentrated, there are fewer incumbent fringe firms, they
are smaller, and they spend less time in the industry on average. In that sense, large dominant firms deter
the growth of fringe firms. Overall, our results show that increasing returns to advertising could yield a
concentrated market structure with a few dominant firms, as in the beer industry.
Figure 1: Average size distribution (un-normalized) of firms in log-scale. The solid line represents fringe
firms and the triangles represent dominant firms.
Finally, we examine the effect fringe firms have on dominant firms by computing MPE with dominant
firms only. In fact, ignoring fringe firms is a common practice in the applied literature to simplify computation. In order to make the comparison fair we normalize the profit function to compute MPE by fixing the
fringe firms’ moment to its MME long-run average. We obtain that, for example, in the DRA-CRA case the
average size of dominant firms drops by 17% and the average size of the smallest dominant firm falls around
50% in the MPE relative to the MME values. This suggests that deterring entry and pushing down investment
from fringe firms are important determinants in dominant firms’ investment incentives. The collective
presence of fringe firms, in spite of their weak individual market power, disciplines dominant firms and
forces them to invest more than in an oligopoly in which there are only dominant firms. We conclude that
explicitly modeling fringe firms may have important effects on conclusions derived from counterfactuals.
26
We thank Carol Tremblay and Victor Tremblay for providing supplementary data.
Industry averages                        CRA-IRA   DRA-CRA   DRA-DRA
Concentration ratio C2                   0.42      0.36      0.13
Concentration ratio C3                   0.53      0.44      0.17
First moment (normalized by # fringe)    1.085     1.22      1.74
Active fringe firms (#)                  147       167       180
Active dominant firms (#)                3         3         2.3
Size incumbent fringe (goodwill)         1.1       1.4       2
Size dominant (goodwill)                 83.4      64.9      24.1
Entrants per period (#)                  16.5      12.3      7.6
Time in industry fringe                  12.9      16        26.2
Time as dominant                         1536      360       21
Table 5: Average industry statistics
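For readers less familiar with the concentration ratios reported in Table 5, the following minimal Python sketch computes C2 and C3 as the combined share of the two and three largest firms. The share vector is hypothetical and is not taken from the experiments.

```python
from typing import Sequence


def concentration_ratio(shares: Sequence[float], k: int) -> float:
    """Combined market share of the k largest firms."""
    return sum(sorted(shares, reverse=True)[:k])


# Hypothetical industry: three large firms and 100 small fringe firms.
shares = [0.30, 0.20, 0.10] + [0.004] * 100

c2 = concentration_ratio(shares, 2)
c3 = concentration_ratio(shares, 3)
```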
9
Approximation Error Bounds
In Section 5 we discussed that MME can be interpreted as a way of approximating MPE, but also as an
appealing behavioral model on its own. To evaluate the performance of MME through the lens of the latter
perspective, we introduced the notion of the value of the full information deviation, that is, the value of a
unilateral deviation to a strategy that keeps track of the full industry state. While in small-scale experiments
like those presented in Section 7.3 we can compute the value of the full information deviation exactly, this
is not possible in large-scale applications like the one presented in Section 8. In these instances, computing
a best response is almost as computationally challenging as solving for an MPE. Nor, of course, could we
compare MME with MPE in these cases.
Thus motivated, we suggest a novel computationally tractable error bound that upper bounds the value of
the full information deviation. This error bound is derived by observing a connection between the unilateral
deviation problem of a firm in MME and problems in robust dynamic programming (RDP); see Iyengar
(2005) for a reference to the RDP literature.27 We note that to the best of our knowledge ideas from RDP
have not been previously used to derive sub-optimality error bounds for dynamic programs with partial
information as we do here.
This bound is useful because it allows one to evaluate whether the state aggregation is appropriate or
whether a finer state aggregation is necessary, for example, by adding more moments. In fact, if the value
of the full information deviation is small, it is plausible that firms would use the relatively simpler MME
strategies as opposed to more complex Markov strategies that do not yield significant additional benefits.
27
RDP considers Markov decision processes with unknown transition kernels in which the decision maker chooses a ‘robust’
strategy to mitigate this ambiguity. In our case, observing the moment, but not the underlying industry state, is equivalent to having
ambiguity about the underlying transition probabilities. Therefore, upper bounding the extent of a unilateral deviation of an agent
in MME can be reformulated as an RDP problem.
9.1
Derivation of Error Bound
Let $(\mu, \lambda)$ be a fixed MME strategy for the remainder of this section. Recall that $V(x, s \mid \mu, \lambda)$ is the actual
value of playing MME strategies starting from $(x, s)$. Denote $V^*(x, s \mid \mu, \lambda) = \sup_{\mu' \in M} V(x, s \mid \mu', \mu, \lambda)$.
The value of the full information deviation is $\Delta_{\mu,\lambda}(x, s) = V^*(x, s \mid \mu, \lambda) - V(x, s \mid \mu, \lambda)$ (see (6)). Note
that given MME strategies $(\mu, \lambda)$, $V(x, s \mid \mu, \lambda)$ can be computed using forward simulation. However, the
problem of finding the optimal strategy that achieves $V^*$ is highly dimensional. Instead we find an upper
bound to $V^*$ with which we can upper bound $\Delta_{\mu,\lambda}$. To do so we construct a robust Bellman operator as
follows. For every $\hat{s} = (\theta, d, z) \in \hat{S}$ define the consistency set $S_f(\hat{s}) = \{f \in S_f : \theta(f) = \theta\}$, that is, the
set of all fringe distributions that are consistent with the value of the moments in state $\hat{s}$. Note that moments
are not sufficient statistics for the evolution of the industry, because typically $S_f(\hat{s})$ is not a singleton and
different fringe distributions in the consistency set may have different future evolutions. Now, define
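$$
(T_R \hat{V})(x, \hat{s}) = \sup_{\iota \in I,\ \rho \ge 0}\ \sup_{f \in S_f(\hat{s})} \Big\{ \pi(x, \hat{s}) + E\Big[ \phi \mathbf{1}\{\phi \ge \rho\} + \mathbf{1}\{\phi < \rho\} \Big( -c(x, \iota) + \beta E_{\mu,\lambda}\big[ \hat{V}(x_{i,t+1}, \hat{s}_{t+1}) \mid x_{it} = x, s_t = (f, d, z), \iota \big] \Big) \Big] \Big\}, \tag{12}
$$
where $\hat{V} \in \hat{\mathcal{V}}$ is a bounded vector $\hat{V} : X \times \hat{S} \to \mathbb{R}$ and $\mathbf{1}\{\cdot\}$ is the indicator function. Recall that we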
assume all competitors use MME strategies (µ, λ). That is, the robust Bellman operator is defined on X × Ŝ
and it is identical to the Bellman operator associated with the best response in C1 in the definition of MME,
except that the firm can also choose any underlying fringe state consistent with observed moment θ.
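To make the consistency sets entering (12) concrete, the following Python sketch enumerates all fringe distributions (counts of firms per individual state) that share a given first moment. It assumes a fixed number of fringe firms, integer-valued states, and a single un-normalized first moment; these simplifications and all names are ours, chosen only to keep the example small.

```python
from itertools import product
from typing import List, Sequence, Tuple

Distribution = Tuple[int, ...]  # number of fringe firms in each individual state


def first_moment(f: Distribution, states: Sequence[int]) -> int:
    """Un-normalized first moment: sum of states weighted by firm counts."""
    return sum(n * x for n, x in zip(f, states))


def consistency_set(theta: int, states: Sequence[int], n_firms: int) -> List[Distribution]:
    """All distributions of n_firms firms over `states` whose first moment equals theta."""
    result = []
    for f in product(range(n_firms + 1), repeat=len(states)):
        if sum(f) == n_firms and first_moment(f, states) == theta:
            result.append(f)
    return result


# Four fringe firms, three goodwill levels, observed moment theta = 8.
states = (1, 2, 3)
S_f = consistency_set(theta=8, states=states, n_firms=4)
```

In this toy case the set contains three distributions, ranging from all firms at the middle state to firms split between the extremes, which is why the moment alone does not pin down the future evolution of the industry.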
In Lemma C.1 in the Appendix, we show that the robust Bellman operator $T_R$ has a unique fixed point
$\hat{V}^*$ that we call the robust value function. The next result, also proved in the appendix, relates it to the
optimal value function $V^*$.
Theorem 9.1. For all $(x, s) \in X \times S$,
$$V^*(x, s) \le \hat{V}^*(x, \hat{s}),$$
where $\hat{s}$ is the moment-based industry state that is consistent with $s$, that is, $\theta = \theta(f)$.
In essence, the robust value function resolves the indeterminacy of moment transitions by choosing the
best fringe state from the associated consistency set. Intuitively, this should provide an upper bound for $V^*$.
Therefore, we can bound the value of the full information deviation with $\hat{\Delta}(x, s) = \hat{V}^*(x, \hat{s}) - V(x, s)$,
where ŝ is the moment-based industry state of s. The advantage of computing V̂ ∗ over V ∗ is that TR operates
on the moment-based state space Ŝ, which is much smaller than S. Nevertheless, the computation of V̂ ∗ is
still demanding, as we explain next.
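The structure of the robust operator in (12) can be illustrated with a stylized Python sketch. It is not the procedure used in the paper: the exit and sell-off option is dropped, the aggregated state space has only two points, and the transition rows (one per underlying state consistent with each aggregated state) are hypothetical numbers. The sketch iterates the operator to its fixed point and compares it with the value obtained when the underlying state is fixed rather than chosen.

```python
from typing import List

# kernels[s][a][f] is a distribution over next aggregated states, given
# aggregated state s, action a, and the f-th consistent underlying state.
Kernels = List[List[List[List[float]]]]


def robust_bellman(V: List[float], profit: List[float], cost: List[float],
                   kernels: Kernels, beta: float) -> List[float]:
    """One application of a toy robust operator: sup over actions and consistent states."""
    new_V = []
    for s in range(len(V)):
        best = float("-inf")
        for a, rows in enumerate(kernels[s]):
            for row in rows:  # inner sup over the consistency set
                continuation = sum(p * V[s_next] for s_next, p in enumerate(row))
                best = max(best, profit[s] - cost[a] + beta * continuation)
        new_V.append(best)
    return new_V


def fixed_point(profit: List[float], cost: List[float], kernels: Kernels,
                beta: float, tol: float = 1e-10, max_iter: int = 10_000) -> List[float]:
    """Iterate the operator from zero until successive iterates are within tol."""
    V = [0.0] * len(profit)
    for _ in range(max_iter):
        new_V = robust_bellman(V, profit, cost, kernels, beta)
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V


profit = [1.0, 2.0]   # hypothetical per-period profit in each aggregated state
cost = [0.0, 0.5]     # hypothetical cost of each action
kernels: Kernels = [
    [[[1.0, 0.0], [0.8, 0.2]], [[0.5, 0.5], [0.3, 0.7]]],  # state 0: two consistent states
    [[[0.2, 0.8]], [[0.0, 1.0]]],                          # state 1: a singleton set
]

V_robust = fixed_point(profit, cost, kernels, beta=0.9)

# Benchmark: fix the underlying state to the first element of each consistency set.
nominal = [[rows[:1] for rows in state_kernels] for state_kernels in kernels]
V_nominal = fixed_point(profit, cost, nominal, beta=0.9)
```

Because the robust operator maximizes over a larger set, its fixed point is at least as large as the benchmark value in every state, which mirrors the upper-bound logic of Theorem 9.1.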
Finding the fixed point of the operator TR is generally a hard computational problem; in fact, it is NP-complete (Iyengar, 2005, §3). This is not surprising, since the optimization over consistency sets may be
very complex. However, iterating the operator TR becomes much simpler if there is a large number of
fringe firms. In this case, the one-step transition of the fringe state is close to being deterministic, because,
conditional on the current state, the transitions of individual fringe firms average out at the aggregate level
by a law of large numbers. When one-step transitions are deterministic, finding the optimal consistent f in
(12) is equivalent to choosing the next moment from a set of moments that are accessible from the current
industry state. This considerably simplifies the computation of the inner maximization in (12), since the
moment accessibility sets are low dimensional. Moreover, characterizing these accessibility sets can be
done efficiently. This procedure, described in detail in Appendix F, is tractable for problems for which
computing MME is tractable. We note that this method, based on deterministic transitions for the fringe
state, only provides an approximation to the robust upper bound for models with a large but finite number
of fringe firms.28
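The law-of-large-numbers argument above can be checked with a small simulation. The Python sketch below uses a hypothetical transition rule (each fringe firm independently moves up one state with probability 0.3, down with probability 0.2, and otherwise stays put) and measures how much the next-period average state varies across simulations for a small and a large fringe.

```python
import random
import statistics
from typing import List


def next_mean_state(current: List[int], p_up: float, p_down: float,
                    rng: random.Random) -> float:
    """Simulate one independent transition per firm and return the average next state."""
    total = 0
    for x in current:
        u = rng.random()
        if u < p_up:
            x += 1
        elif u < p_up + p_down:
            x -= 1
        total += x
    return total / len(current)


def spread_of_next_moment(n_firms: int, reps: int = 200, seed: int = 0) -> float:
    """Standard deviation of the next-period average state across simulated draws."""
    rng = random.Random(seed)
    current = [5] * n_firms  # all fringe firms start in the same (hypothetical) state
    draws = [next_mean_state(current, 0.3, 0.2, rng) for _ in range(reps)]
    return statistics.pstdev(draws)


spread_small = spread_of_next_moment(25)
spread_large = spread_of_next_moment(2500)
```

With one hundred times more firms, the dispersion of the next-period moment should fall by roughly a factor of ten, so the one-step transition of the aggregate is close to deterministic when the fringe is large.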
It is simple to observe that if moments are sufficient statistics to predict the future evolution of the industry, then the bound is tight. If this is not the case, the gap between the actual value of the full information
deviation and the quantity computed with the robust bound depends on the distance V̂ ∗ (x, ŝ) − V ∗ (x, s).
To lower this gap and make the bound tighter we can compute the robust bound over a refinement of the
moment-based state space used in the equilibrium computation by including moments in the robust operator
not included in MME. For example, we could add another contemporaneous moment like a quantile, or the
number of fringe firms that are close to becoming dominant, or we could add a lagged moment. This will
reduce the size of the consistency sets, making the bound tighter, albeit increasing the computational cost.
We illustrate this with numerical experiments in the next subsection.
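The effect of refining the state space on the consistency sets can be seen in a toy enumeration. The Python sketch below compares the set of fringe distributions consistent with a first moment alone against the set consistent with the first moment plus the number of firms in the highest state, a stand-in for "the number of fringe firms that are close to becoming dominant". The fixed number of firms, the integer states, and the function names are assumptions for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Distribution = Tuple[int, ...]
Statistic = Callable[[Distribution, Sequence[int]], int]


def first_moment(f: Distribution, states: Sequence[int]) -> int:
    return sum(n * x for n, x in zip(f, states))


def firms_in_top_state(f: Distribution, states: Sequence[int]) -> int:
    return f[-1]  # count of firms at the highest individual state


def consistency_set(states: Sequence[int], n_firms: int,
                    stats: Sequence[Statistic], targets: Sequence[int]) -> List[Distribution]:
    """Distributions of n_firms firms matching every tracked statistic."""
    result = []
    for f in product(range(n_firms + 1), repeat=len(states)):
        if sum(f) != n_firms:
            continue
        if all(stat(f, states) == t for stat, t in zip(stats, targets)):
            result.append(f)
    return result


states = (1, 2, 3)
coarse = consistency_set(states, 4, [first_moment], [8])
fine = consistency_set(states, 4, [first_moment, firms_in_top_state], [8, 1])
```

Here the extra statistic shrinks the consistency set from three distributions to one, so the inner maximization in the robust operator has less freedom and the resulting bound can only become tighter.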
9.2
Numerical Experiments
We have done extensive numerical experiments using the robust bound. We present some results on two
simplified models similar to the beer industry experiments of Section 8 with a single dominant firm that
operates under constant returns to advertising. In the first model N = 200 and fringe firms can enter and
exit the market (model E), in the second N = 100 and firms cannot enter or exit the industry (model NE).
For each model we vary $\alpha_1^f$, the parameter controlling fringe firms’ returns to advertising, from .45 to .65.
The lower the returns to advertising, the smaller the reward from investment and the more homogeneous
the fringe firms are in equilibrium. Since the moment is a sufficient statistic of the future evolution of the
industry when fringe firms are homogeneous, we expect the error bound to be lower for low values of $\alpha_1^f$.
In both models E and NE, to simplify computation, we use the parametric approach described in
Appendix A to construct the perceived transition kernel. In particular, we assume a linear relationship
$\theta_{t+1} = \xi_0(d_t) + \xi_1(d_t)\theta_t$, and the parameters $\{(\xi_0(d), \xi_1(d))\}_{d \in S_d}$ are chosen to fit the simulated transitions
using ordinary least squares. To simplify, we assume there is no aggregate shock. We report in Figures
2a and 2b the expected error bound for both dominant and fringe firms as a percentage of the actual value
function. Namely, we plot
$$
E_{\mu,\lambda}\left[ \frac{\hat{\Delta}_{\mu,\lambda}(x, s)}{V(x, s \mid \mu, \lambda)} \right],
$$
where the expectation is taken with respect to the invariant distribution of the underlying industry state s.29
In addition to the standard robust bound, we consider a variation where a lagged fringe state enters the
computation of the consistency sets. This reduces the size of these sets and so decreases the gap between the
actual value of the full information deviation and the error bound. Indeed, we see that the lagged state decreases the
error bound, in particular in model E. Adding further contemporaneous or lagged moments should generally
make the error bound even tighter.
Figure 2: Error Bound. (a) No Entry/Exit. (b) Entry/Exit.
28
It is possible to formally derive a probabilistic version of the robust bound that corrects for the fact that the number of fringe
firms N is finite using standard probability bounds. Loosely speaking, by a central limit theorem the correction term is of the order
√N. However, given our numerical experience, we believe that in many settings of interest in which N is large, the square root N
correction will not have a significant impact for any practical purposes.
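The quantity plotted in Figure 2, the expected error bound relative to the actual value function, is a probability-weighted average of ratios. A minimal Python sketch follows; the probabilities and values are hypothetical placeholders, not results from the experiments.

```python
from typing import Iterable, Tuple


def expected_relative_bound(points: Iterable[Tuple[float, float, float]]) -> float:
    """Average of (robust value - actual value) / actual value under given state weights.

    Each point is (probability of the state, robust value, actual value).
    """
    total = 0.0
    for prob, v_robust, v_actual in points:
        total += prob * (v_robust - v_actual) / v_actual
    return total


# Hypothetical invariant-distribution weights and value pairs for three states.
points = [(0.5, 10.5, 10.0), (0.3, 21.0, 20.0), (0.2, 33.0, 30.0)]
bound = expected_relative_bound(points)
```

In this made-up example the bound is 6% of the actual value; whether a number of that size is acceptable is a judgment call that depends on the application.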
The experiments agree with our conjecture that a more homogeneous fringe tier (low $\alpha_1^f$) will result in
a lower error bound. In addition, we see that the error bounds are generally lower for model NE. This is
explained by the better fit of the linear moment transitions without entry and exit. Entry and exit decisions are
inherently nonlinear, and therefore, the assumed linear moment transitions approximate the actual moment
transitions in NE better than in E, resulting in a lower value of the bound.
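The linear perceived transitions $\theta_{t+1} = \xi_0(d) + \xi_1(d)\theta_t$ and the notion of a better or worse linear fit can be sketched with a plain least-squares routine. The Python code below fits one line per dominant state $d$ and reports the $R^2$ of each fit. The sample data are hypothetical: one series is exactly linear, and the other is deliberately nonlinear as a loose stand-in for the nonlinearity that entry and exit introduce.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Sample = Tuple[Hashable, float, float]  # (dominant state d, theta_t, theta_{t+1})


def fit_linear_transitions(samples: Iterable[Sample]) -> Dict[Hashable, Tuple[float, float, float]]:
    """Per-d ordinary least squares; returns {d: (xi0, xi1, r_squared)}."""
    grouped = defaultdict(list)
    for d, theta, theta_next in samples:
        grouped[d].append((theta, theta_next))
    fits = {}
    for d, pairs in grouped.items():
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        syy = sum((y - mean_y) ** 2 for _, y in pairs)
        if sxx == 0.0 or syy == 0.0:
            raise ValueError("need variation in both theta_t and theta_{t+1}")
        xi1 = sxy / sxx
        xi0 = mean_y - xi1 * mean_x
        ss_res = sum((y - xi0 - xi1 * x) ** 2 for x, y in pairs)
        fits[d] = (xi0, xi1, 1.0 - ss_res / syy)
    return fits


# Hypothetical simulated transitions for two dominant states.
samples = [(0, th, 0.5 + 0.9 * th) for th in (1.0, 2.0, 3.0)]   # exactly linear
samples += [(1, th, th ** 2) for th in (1.0, 2.0, 3.0)]         # nonlinear
fits = fit_linear_transitions(samples)
```

A lower $R^2$ signals that the assumed linear kernel is a poorer description of the simulated transitions, although, as discussed in Section 10.2, such fit measures are only an indirect check on the approximation.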
These experiments suggest that the robust bound can be useful to test the extent of sub-optimality of
MME strategies in terms of a unilateral deviation. Based on our numerical results, we find that depending on
the model, the bound can be sometimes tight indicating that the extent of a unilateral deviation is small, but it
can also be loose. In the latter cases, refining the state space by adding additional moments when computing
the robust bound can be helpful to make the bound tighter and more useful. Also, adding additional moments
to firms’ MME strategies should generally improve the accuracy of the perceived transition kernel, and it
is plausible to expect that this would decrease the bound as well. There may still be settings in which even
after adding several moments, the bound will be loose. This is not entirely surprising as the error bound
derived in this paper is very generic and does not use any problem-specific information. We leave potential
refinements of our bound that use problem-specific information as a matter for further research.
29
For dominant firms, the expectation is also taken with respect to the invariant distribution of their individual state evolution.
For fringe firms, x is taken to be the most-visited fringe state.
10
Extensions and Conclusions
In this paper we introduced a novel framework to study dynamic oligopolies in concentrated industries that
opens the door to study new issues in the empirical analysis of industry dynamics. In particular, we first
introduced MME and an algorithm to compute it. We also discussed important implementation issues for
applied work and illustrated our approach in a large-scale application. Then, we provided results to justify
MME as an approximation by comparing it to MPE and by introducing error bounds. In the rest of this
section, we discuss some extensions of our approach to other important classes of economic models, as well
as other possible future directions.
10.1
Extension: Dynamic Games of Asymmetric Information
In the dynamic games of asymmetric information studied by Fershtman and Pakes (2012) a firm keeps track
of the history it has observed up to that period. In principle, optimal strategies could depend on the infinite
history of observations, because past history may provide useful information about current competitors’
private states. To make the model tractable, the authors assume that histories are finite and discuss environments in which this assumption may be plausible, for example, when there is public periodic revelation of
all information. Our error bounds and related ideas can be used to upper bound the extent of a unilateral
deviation from the finite history REBE of Fershtman and Pakes (2012) to a strategy that keeps track of the
infinite history.
In the spirit of approximating equilibrium in a game of complete information, our error bounds assume
that the deviant firm has access to the full current industry state. However, in a game of asymmetric information it is more appropriate to consider a unilateral deviation to a strategy that instead keeps track of
the observed infinite history. Note that typically, fixing competitors’ strategies, a unilateral deviation to a
strategy of the former type provides larger expected discounted profits than one of the latter type, because
having access to the full current state typically provides more information than having access to the observed
infinite history. For example, while in the former the firm knows the exact current state of competitors, this
may not necessarily be true in the latter. As a consequence, our error bounds are valid for the model of
asymmetric information; however, tighter bounds can be derived.
To derive tighter bounds, we formulate the best response problem of a firm in the dynamic game of asymmetric information as a partially observable Markov decision process (POMDP); see, for example, Lovejoy
(1991) for a reference on POMDPs. Then, we use the results from White III and Scherer (1994) that derive
upper bounds to the optimal value function of the POMDP that keeps track of the entire infinite history,
based on the POMDP that only keeps track of a finite history of length M . White III and Scherer (1994)
also show that the bounds become weakly tighter as M grows. They also provide sufficient conditions
under which the extent of the unilateral deviation becomes zero for finite M . The sufficient conditions essentially establish that distant history becomes irrelevant and are related to the literature on contraction rates
and forecast horizons in fully observed MDPs (see references in Lovejoy (1991)). However, checking such
sufficient conditions over model primitives depends on the underlying stochastic process and is typically
quite complex.
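To fix ideas about what a finite-memory strategy conditions on, the following Python sketch maps a stream of observations into the state seen by a firm that remembers only the last M of them. This is only an illustration of the truncation; it is not the construction in White III and Scherer (1994), and the names are hypothetical.

```python
from collections import deque
from typing import Deque, Hashable, Iterable, List, Tuple


def finite_history_states(observations: Iterable[Hashable],
                          memory: int) -> List[Tuple[Hashable, ...]]:
    """State after each period when only the last `memory` observations are kept."""
    window: Deque[Hashable] = deque(maxlen=memory)
    states = []
    for obs in observations:
        window.append(obs)  # the oldest observation is dropped automatically
        states.append(tuple(window))
    return states


# With M = 2, the firm forgets "a" once it has seen "b" and "c".
history_states = finite_history_states("abcd", memory=2)
```

Increasing M enlarges the set of histories the strategy can distinguish, which is the sense in which the bounds can only tighten as memory grows.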
In summary, using ideas from POMDPs we can derive bounds on the extent of a unilateral deviation for
fixed competitors’ strategies from REBE with a given memory length to a strategy that keeps track of the
full infinite history. Note that because competitors’ strategies are fixed, the transition kernel that describes
the evolution of competitors is fixed. In contrast, a much harder problem is to consider a sequence of REBE
with increasing memory lengths for all firms and show that strategies become optimal as memory length
grows. In this case, the underlying competitors’ process faced by a given firm is itself becoming more
memory dependent as the memory length grows, and therefore, the previously mentioned POMDP results
do not apply. Studying this problem is outside the scope of this paper but may be interesting for future
research.
10.2
Extension: Macroeconomic Models with Heterogeneous Agents
Our approach is related to the important literature in macroeconomics that studies models with heterogeneous agents in the presence of aggregate shocks. Notably, Krusell and Smith (1998) studies a stochastic
growth model with heterogeneity in consumer income and wealth. Other papers focus on firm-level heterogeneity in capital and productivity (see, e.g., Khan and Thomas (2008) and Clementi and Palazzo (2014)).
All of these models assume a continuum of agents and therefore agents’ dynamic programming problems
are infinite dimensional; strategies depend on the distribution over individual states. Motivated by the seminal idea in Krusell and Smith (1998), in these papers agents are assumed to keep track of only a small
number of statistics of this distribution to simplify the problem. Note that these macroeconomic models
do not incorporate dominant agents, so our framework accommodates them by considering moment-based
states that include moments and the value of the aggregate shock only.
While our approach is partially inspired by this previous literature, we believe that in turn our results can
also be useful in these macroeconomic models. For concreteness, we focus on the seminal paper of Krusell
and Smith (1998), who show that in their model the first moment of the wealth distribution is essentially a
sufficient statistic for the evolution of the economy, a property they call ‘approximate aggregation’. The
reason is that agents’ equilibrium decisions turn out to be almost linear in their state. In fact, if decisions
were exactly linear, the first moment would be an exact sufficient statistic.
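The role of linearity can be verified numerically. In the Python sketch below, two hypothetical wealth distributions share the same mean; under a linear decision rule they lead to the same next-period mean, whereas under a concave rule they do not. The rules and numbers are illustrative assumptions, not taken from Krusell and Smith (1998).

```python
import math
from typing import Callable, List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def next_mean(wealth: List[float], rule: Callable[[float], float]) -> float:
    """Mean of next-period wealth when every agent applies the same decision rule."""
    return mean([rule(k) for k in wealth])


def linear_rule(k: float) -> float:
    return 0.2 + 0.9 * k


def concave_rule(k: float) -> float:
    return math.sqrt(k)


equal = [4.0, 4.0, 4.0, 4.0]      # no dispersion
unequal = [1.0, 1.0, 1.0, 13.0]   # same mean, high dispersion

linear_gap = abs(next_mean(equal, linear_rule) - next_mean(unequal, linear_rule))
concave_gap = abs(next_mean(equal, concave_rule) - next_mean(unequal, concave_rule))
```

Under the linear rule the next-period mean is $0.2 + 0.9 \times 4 = 3.8$ for both distributions, so the current mean is all an agent needs to forecast it; under the concave rule the two forecasts differ and higher moments matter.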
While the approximate aggregation property has been shown to be quite robust in this class of models,
Krusell and Smith (2006) and Algan et al. (2013) acknowledge that it does not hold for all models and may
not hold for important models considered in the future. For this reason, it is important to understand the
boundaries of approximate aggregation by testing its accuracy. Den Haan (2010) describes the limitations
of commonly used accuracy tests, such as the so-called ‘R² test’, and provides other alternatives. These tests
often try to measure the accuracy of the perceived transition kernel proposed. We believe that our computationally tractable error bounds provide a more direct test of the validity of a state aggregation technique,
because they directly measure improvements in payoffs, in particular, how much an agent can improve its
expected discounted utility (in monetary terms) by unilaterally deviating from the approximate equilibrium
strategy to a strategy that keeps track of the full distribution of wealth. The bound is useful because it can
help determine whether approximate aggregation holds or a finer approximation is required.
10.3
Other Future Directions
We believe that our work suggests several other future directions for research in dynamic oligopolies. First,
our error bound is very general and we envision that tighter bounds can be derived using problem-specific
information. Second, in our definition of MME, firms keep track of current moments of the fringe state. Our
equilibrium concept and error bound can be modified to allow firms to keep track of past values of moments.
Finally, MME with the perceived transition kernels used in our experiments are intended to study the long-run equilibrium market structure. We believe that an appropriate modification of our approach would allow
the study of short-term dynamics and how the market structure would evolve in the few years after a policy or
an environmental change, such as a merger.
In Sections 10.1 and 10.2 we discussed how our methods can be extended to other important models in
economics. In addition, we envision that some of our ideas could potentially also be useful to study dynamic
models with forward looking consumers. For example, in a dynamic oligopoly model with durable goods,
firms may need to make pricing and investment decisions, keeping track of the distribution of consumers’
ownership, which is a highly dimensional object (e.g., Goettler and Gordon (2011)). Our ideas may be
useful in this context as well, where one could replace this distribution by some of its moments. We leave
all these extensions for future research.
References
Algan, Y., O. Allais, W. J. Den Haan, P. Rendahl. 2013. Solving and simulating models with heterogeneous
agents and aggregate uncertainty. Handbook of Computational Economics Vol. 3. North-Holland, Elsevier.
Altug, S., P. Labadie. 1994. Dynamic Choice and Asset Markets. Academic Press.
Bajari, P., C. L. Benkard, J. Levin. 2007. Estimating dynamic models of imperfect competition. Econometrica 75(5) 1331 – 1370.
Benkard, C. L. 2004. A dynamic analysis of the market for wide-bodied commercial aircraft. Review of
Economic Studies 71(3) 581 – 611.
Benkard, C. L., A. Bodoh-Creed, J. Lazarev. 2010. Simulating the dynamic effects of horizontal mergers:
U.S. airlines. Working Paper, Stanford.
Benkard, C. L., P. Jeziorski, G. Y. Weintraub. 2014. Oblivious equilibrium for concentrated industries.
Working Paper, Stanford University.
Bertsekas, D. P., S. Shreve. 1978. Stochastic Optimal Control: The Discrete-Time Case. Academic Press
Inc.
Besanko, D., U. Doraszelski. 2004. Capacity dynamics and endogenous asymmetries in firm size. RAND
Journal of Economics 35(1) 23 – 49.
Besanko, D., U. Doraszelski, Y. Kryukov, M. Satterthwaite. 2010. Learning-by-doing, organizational forgetting, and industry dynamics. Econometrica 78(2).
Besanko, D., M. K. Perry, R. H. Spady. 1990. The logit model of monopolistic competition: Brand diversity.
The Journal of Industrial Economics 38(4) 397 – 415.
Blundell, R., T.M. Stoker. 2005. Heterogeneity and aggregation. Journal of Economic Literature XLIII 347
– 391.
Caplin, A., B. Nalebuff. 1991. Aggregation and imperfect competition - on the existence of equilibrium.
Econometrica 59(1) 25 – 59.
Clementi, G.L., D. Palazzo. 2014. Entry, exit, firm dynamics, and aggregate fluctuations. NBER Working
paper .
Collard-Wexler, A. 2011. Productivity dispersion and plant selection. Working Paper, NYU.
Collard-Wexler, A. 2013a. Demand fluctuations in the ready-mix concrete industry. Econometrica 81(3)
1003–1037.
Collard-Wexler, A. 2013b. Mergers and sunk costs: An application to the ready-mix concrete industry.
Forthcoming American Economic Journal: Micro.
Corbae, D., P. D’Erasmo. 2013. A quantitative model of banking industry dynamics. Working paper, UT
Austin.
Corbae, D., P. D’Erasmo. 2014. Capital requirements in a quantitative model of banking industry dynamics.
Working Paper, Maryland University.
Den Haan, W. J. 2010. Assessing the accuracy of the aggregate law of motion in models with heterogeneous
agents. Journal of Economic Dynamics and Control 34 79 – 99.
Dixit, A. K., J. E. Stiglitz. 1977. Monopolistic competition and optimum product diversity. American
Economic Review 67(3) 297 – 308.
Donoghue, W. F. 1969. Distributions and Fourier Transforms. Academic Press.
Doraszelski, U., K. Judd. 2011. Avoiding the curse of dimensionality in dynamic stochastic games. Quantitative Economics 3(1) 53–93.
Doraszelski, U., S. Markovich. 2007. Advertising dynamics and competitive advantage. RAND Journal of
Economics 38(3) 557–592.
Doraszelski, U., A. Pakes. 2007. A framework for applied dynamic analysis in IO. Handbook of Industrial
Organization, Volume 3. North-Holland, Amsterdam.
Doraszelski, U., M. Satterthwaite. 2010. Computable Markov-perfect industry dynamics. RAND Journal of
Economics 41 215–243.
Ericson, R., A. Pakes. 1995. Markov-perfect industry dynamics: A framework for empirical work. Review
of Economic Studies 62(1) 53 – 82.
Farias, V., D. Saure, G.Y. Weintraub. 2012. An approximate dynamic programming approach to solving
dynamic oligopoly models. The RAND Journal of Economics 43(2) 253–282.
Fershtman, C., A. Pakes. 2012. Dynamic games with asymmetric information: A framework for empirical
work. Quarterly Journal of Economics 127(4) 1611 – 1661.
Gallant, A. R., H. Hong, A. Khwaja. 2011. Dynamic entry with cross product spillovers: An application to
the generic drug industry. Working paper.
Goettler, R. L., B. Gordon. 2011. Does AMD spur Intel to innovate more? Journal of Political Economy
119(6) 1141–1200.
Gowrisankaran, G., T. Holmes. 2004. Mergers and the evolution of industry concentration: results from the
dominant-firm model. RAND Journal of Economics 35(3) 1–22.
Iacovone, L., B. Javorcik, W. Keller, J. Tybout. 2013. Walmart in Mexico: The impact of FDI on innovation
and industry productivity. Working paper, Penn State University.
Iyengar, G. 2005. Robust dynamic programming. Mathematics of Operations Research 30(2) 257–280.
Jeziorski, P. 2013. Estimation of cost synergies from mergers: Application to US radio. Working Paper, U.C.
Berkeley.
Jia, P., P. Pathak. 2012. The cost of free entry: Evidence from real estate brokers in greater Boston. Working
Paper, MIT.
Judd, K. 1998. Numerical Methods in Economics. MIT Press.
Kalouptsidi, M. 2014. Time to build and fluctuations in bulk shipping. American Economic Review 104(2)
564–608.
Khan, A., J.K Thomas. 2008. Idiosyncratic shocks and the role of nonconvexities in plant and aggregate
investment dynamics. Econometrica 76(2) 395 – 436.
Krusell, P., A. A. Smith, Jr. 1998. Income and wealth heterogeneity in the macroeconomy. Journal of
Political Economy 106(5) 867–896.
Krusell, P., A. A. Smith, Jr. 2006. Quantitative macroeconomic models with heterogeneous agents. Advances
in Economics and Econometrics: Theory and Applications, Ninth World Congress.
Lee, R. S. 2013. Vertical integration and exclusivity in platform and two-sided markets. American Economic
Review 103(7) 2960–3000.
Lovejoy, W. S. 1991. A survey of algorithmic methods for partially observed Markov decision processes.
Annals of Operations Research 18 47 – 66.
Lucas, R. E., E. C. Prescott. 1971. Investment under uncertainty. Econometrica 39(5) 659 – 681.
Maskin, E., J. Tirole. 1988. A theory of dynamic oligopoly, I and II. Econometrica 56(3) 549 – 570.
Maskin, E., J. Tirole. 2001. Markov perfect equilibrium I. observable actions. Journal of Economic Theory
100 191 – 219.
Pakes, A., P. McGuire. 1994. Computing Markov-perfect Nash equilibria: Numerical implications of a
dynamic differentiated product model. RAND Journal of Economics 25(4) 555 – 589.
Pakes, A., P. McGuire. 2001. Stochastic algorithms, symmetric Markov perfect equilibrium, and the ‘curse’
of dimensionality. Econometrica 69(5) 1261 – 1281.
Qi, S. 2013. The impact of advertising regulation on industry: The cigarette advertising ban of 1971. RAND
Journal of Economics 44(2) 215–248.
Resnick, S.I. 1998. A Probability Path. Birkhäuser.
Roberts, M. J., L. Samuelson. 1988. An empirical analysis of dynamic, nonprice competition in an
oligopolistic industry. The RAND Journal of Economics 19(2) pp. 200–220.
Rojas, C. 2008. Price competition in U.S. brewing. The Journal of Industrial Economics 56(1) 1–31.
Ryan, S. 2012. The costs of environmental regulation in a concentrated industry. Econometrica 80(3) 1019
– 1062.
Saeedi, M. 2013. Reputation and adverse selection: Theory and evidence from eBay. Working Paper, The
Ohio State University.
Santos, C. D. 2010. Sunk costs of R&D, trade and productivity: the moulds industry case. Working Paper,
U. of Alicante.
Santos, C. D. 2012. An aggregation method to solve dynamic games. Working paper.
Sutton, J. 1991. Sunk Costs and Market Structure. 1st ed. MIT Press.
Sutton, J. 1997. Gibrat’s legacy. Journal of Economic Literature 35(1) 40 – 59.
Sweeting, A. 2013. Dynamic product positioning in differentiated product markets: The effect of fees for
musical performance rights on the commercial radio industry. Econometrica 81(5) 1763–1803.
Tomlin, B. 2008. Exchange rate volatility, plant turnover and productivity. Working Paper, Boston University.
Tremblay, V. J., N. Iwasaki, C. H. Tremblay. 2005. The dynamics of industry concentration for U.S. micro
and macro brewers. Review of Industrial Organization 26 307–324.
Tremblay, V. J., C. H. Tremblay. 2005. The US Brewing Industry: Data and Economic Analysis, MIT Press
Books, vol. 1. The MIT Press.
Weintraub, G. Y., C. L. Benkard, B. Van Roy. 2008. Markov perfect industry dynamics with many firms.
Econometrica 76(6) 1375–1411.
White III, C. C., W. T. Scherer. 1994. Finite-memory suboptimal design for partially observed Markov
decision processes. Operations Research 42(3) 439 – 455.
Whitt, W. 1980. Representation and approximation of noncooperative sequential games. SIAM J. on Control
and Optimization 18(1) 33–48.
Xu, Y. 2008. A structural empirical model of R&D, firm heterogeneity, and industry evolution. Working
paper, New York University.
Appendices
A. Parametric Perceived Transition Kernels
We describe a construction of the perceived transition kernel that has been successfully used in stochastic
growth models in macroeconomics (Krusell and Smith, 1998) and subsequent literature. This perceived
kernel assumes a parameterized and deterministic evolution for moments. That is, starting from industry
state ŝt = (θt , dt , zt ), the next moment value is assumed to be
θt+1 = G(θt ; ξ(dt , zt )),
where ξ(dt , zt ) ∈ Ξ are parameters. For example, this could represent a linear relationship with one moment,
θt+1 = ξ0 (dt , zt )+ξ1 (dt , zt )θt . In this case the goal would be to choose functions ξ0 and ξ1 that approximate
the actual transitions best, for instance by employing linear regressions.
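The linear case can be illustrated with a small sketch. It assumes a single moment and one fixed cell (dt, zt), so that ξ0 and ξ1 are scalars; the function name `fit_linear_kernel`, the simulated path, and the "true" coefficients are hypothetical and serve only to show how a regression of θt+1 on θt would recover the kernel parameters.

```python
import random

def fit_linear_kernel(theta_now, theta_next):
    """Ordinary least squares fit of theta_next ~ xi0 + xi1 * theta_now.

    Returns (xi0, xi1). Intended for the observations of a single (d, z) cell.
    """
    n = len(theta_now)
    mean_x = sum(theta_now) / n
    mean_y = sum(theta_next) / n
    sxx = sum((x - mean_x) ** 2 for x in theta_now)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(theta_now, theta_next))
    xi1 = sxy / sxx
    xi0 = mean_y - xi1 * mean_x
    return xi0, xi1

# Hypothetical simulated moment path (illustrative numbers, not from the paper).
rng = random.Random(0)
true_xi0, true_xi1 = 2.0, 0.8
theta = [10.0]
for _ in range(500):
    theta.append(true_xi0 + true_xi1 * theta[-1] + rng.gauss(0.0, 0.1))

xi0_hat, xi1_hat = fit_linear_kernel(theta[:-1], theta[1:])
print(xi0_hat, xi1_hat)
```

In an application one would run one such regression per (dt, zt) cell, or pool cells with interaction terms, using moment paths simulated under the candidate strategies.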
B. Moments Become Sufficient Statistics
In this section we present a class of industry dynamic models for which a succinct set of moments essentially
summarizes all payoff-relevant information and is close to being a sufficient statistic. In these models, the
value of the full information deviation is zero, or becomes zero asymptotically as the number of fringe firms
grows large.
A simple class of models for which this holds is when fringe firms are homogeneous, in the sense that
Xf is a singleton. In this case, a single moment, namely the number of incumbent fringe firms, is a sufficient
statistic of the future evolution of the industry. Next, we describe a model with fringe firm heterogeneity
that does not allow for entry and exit, in which moments also become sufficient statistics as the number of
fringe firms grows large.
Constant Returns to Scale Model for Fringe Firms. We take Xf = ℝ+ (with another component that
we suppress for clarity indicating that the firm is fringe). We assume that there is no entry and exit (therefore
µ = ι), that there are N − D fringe firms, and that there are no transitions between the fringe and dominant
tier. The analysis below can be applied to single-period profit functions and transition kernels that depend
on any integer moments of the fringe state. However, to simplify the exposition we assume that they both
depend only on the first moment. A model like Example 7.1 with α1 = 1 would give rise to this type of
profit function. Accordingly, we assume that firms keep track only of the first un-normalized moment of the
fringe state, that is, $\theta_t = \sum_{x \in X_f} x f_t(x)$. We make the following assumptions on the primitives of fringe
firms only; no restrictions are placed on the primitives of dominant firms.
1. The single-period profit is linear in the fringe firm’s own state, π(x, ŝ) = xπ1 (ŝ) + π0 (ŝ) with
supŝ∈Ŝ {π1 (ŝ), π0 (ŝ)} < ∞. The assumption imposes constant returns to scale.
2. For a fixed state x, the cost function increases linearly with the investment level ι. In addition, the
marginal investment cost increases linearly with the state. Formally, c(x, ι) = (cx)ι with ι ∈ I = ℝ+ .
3. The dynamics of a fringe firm’s evolution are linear in its own state: xi,t+1 = xit ζ1 (ι, ŝt , w1it ) +
ζ0 (ŝt , w0it ), where ι is the amount invested. Each of the sequences of random variables {w0it |t ≥
0, i ≥ 1} and {w1it |t ≥ 0, i ≥ 1} is i.i.d. and independent of all previously defined random quantities
and of each other. In addition, we assume the functions ζ0 and ζ1 are uniformly bounded over all
realizations, investment levels, and industry states.30
We begin by showing that for any perceived kernel P̂µ the corresponding best response investment
strategy for a fringe firm does not depend on its own individual state.31 All proofs are relegated to Appendix
C.
Lemma B.1. Consider the constant returns to scale model described above. For any MME strategy µ and
for every x, x′ ∈ Xf and ŝ ∈ Ŝ we have µ(x, ŝ) = µ(x′ , ŝ) = µ(ŝ).
We use this result to characterize the evolution of the moment under MME strategies as the number of
fringe firms grows. To obtain a meaningful asymptotic regime, we scale the market size together with the
number of fringe firms. Formally, we consider a sequence of industries with growing market size m; market
size would typically enter the profit function, through the underlying demand system like in Example 7.1.
Therefore, we consider a sequence of markets indexed by market sizes m ∈ N with profit functions denoted
by $\pi^m$. We assume the number of firms increases proportionally to the market size, that is, $N^m = N \cdot m - D$,
for some constant N > 0. We assume that all other model primitives, including the maximum number
of dominant firms, are independent of m. Quantities associated to market size m are indexed with the
superscript m. We state the following assumption.
Assumption B.1. Let {µm : m ≥ 1} be a sequence of MME strategies followed by all firms in market m.
Then, there exists a compact set $\bar{X}_f \subset X_f$ such that for all m ≥ 1, t ≥ 0, and $i \in F^m_t$, $P[x^m_{it} \in \bar{X}_f] = 1$.
While the previous assumption imposes conditions on equilibrium outcomes, it is quite natural in this
context; MME is a sensible equilibrium concept only if fringe firms do not grow unboundedly large. We
have the following result.
Proposition B.1. Consider the constant returns to scale model under Assumption B.1, and suppose that for
every m ≥ 1 all firms use MME strategy $\mu^m$. For a given t, conditional on the realizations of
$\{x_{it} \in \bar{X}_f \mid i \in F^m_t\}$ and $\hat{s}^m_t$, we have
\[
(1/N^m)\,\theta^m_{t+1} - \Big[ \tilde{\zeta}^m_1(\hat{s}^m_t)\,(1/N^m)\,\theta^m_t + \tilde{\zeta}^m_0(\hat{s}^m_t) \Big] \to 0, \quad \text{a.s.},
\]
as m → ∞, where $\theta^m_t = \sum_{i \in F^m_t} x_{it} = \sum_{x \in X_f} x f^m_t(x)$,
$\tilde{\zeta}^m_1(\hat{s}^m_t) = E_{\mu^m}[\zeta_{1it}(\mu^m(\hat{s}^m_t), \hat{s}^m_t, w_{1it})]$, and
$\tilde{\zeta}^m_0(\hat{s}^m_t) = E_{\mu^m}[\zeta_{0it}(\hat{s}^m_t, w_{0it})]$.
30 The assumption about linear transitions is similar to assuming Gibrat’s law in firm’s transitions (Sutton, 1997).
31 This result is similar to Lucas and Prescott (1971) that studies a dynamic competitive industry model with similar assumptions
to ours, but with deterministic transitions and stochastic demand.
The result shows that the first moment becomes a sufficient statistic for the evolution of the next moment.
In particular, the next moment becomes a linear function of the current moment. Heuristically, this suggests
defining the following perceived transition kernel:32
\[
\hat{P}_{\mu^m}\Big[ \tilde{\zeta}^m_1(\hat{s}^m_t)\,\theta^m_t + N^m \tilde{\zeta}^m_0(\hat{s}^m_t) \,\Big|\, \hat{s}^m_t \Big] = 1, \tag{13}
\]
which will become a good approximation as m grows large. In fact, with this perceived transition kernel
and under suitable continuity conditions, one can show that the value of the full information deviation
(appropriately normalized) converges to zero as the market size m approaches infinity.33
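A quick simulation can illustrate why the deterministic linear kernel becomes accurate as the number of fringe firms grows. The sketch below is only illustrative: the shock distributions (ζ1 uniform on [0.9, 1.1], ζ0 uniform on [0, 1]), the compact state set [0, 10], and the function name `moment_gap` are assumptions made for the example, and the industry state is held fixed so that ζ̃1 = 1.0 and ζ̃0 = 0.5 are constants.

```python
import random

def moment_gap(n_firms, seed=0):
    """Normalized gap between the realized next moment and the linear prediction.

    Assumed primitives (hypothetical): x in [0, 10], zeta1 ~ U[0.9, 1.1]
    (mean 1.0), zeta0 ~ U[0, 1] (mean 0.5), all shocks i.i.d. across firms.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n_firms)]
    theta_t = sum(x)
    # Realized next moment: each firm follows x' = x * zeta1 + zeta0.
    theta_next = sum(xi * rng.uniform(0.9, 1.1) + rng.uniform(0.0, 1.0)
                     for xi in x)
    # Prediction of the deterministic linear kernel.
    predicted = 1.0 * theta_t + n_firms * 0.5
    return abs(theta_next - predicted) / n_firms

for n in (100, 10_000, 100_000):
    print(n, moment_gap(n))
```

The printed gaps should shrink roughly at rate 1/√N, in line with the law of large numbers argument used in the proof of Proposition B.1.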
C. Proofs
Proof of Lemma B.1. Under the assumptions of model (N) of Chapter 9 in Bertsekas and Shreve (1978) we
have from Proposition 9.8 that the optimal value function satisfies:
\[
(TV)(x, \hat{s} \mid \mu) := \max_{\iota \in I} \Big\{ x\pi_1(\hat{s}_0) + \pi_0(\hat{s}_0) - cx\iota + \beta E_\mu\big[ V(x_1, \hat{s}_1) \,\big|\, \iota, \hat{s}_0 = \hat{s}, x_0 = x \big] \Big\} = V(x, \hat{s} \mid \mu).
\]
Moreover, we have that $T^n \hat{V} \to V$ if $\hat{V} = 0$ by Proposition 9.14. We first show that
\[
\sup_{\mu' \in M} V(x, \hat{s} \mid \mu', \mu) = x V^1(\hat{s}) + V^0(\hat{s})
\]
for appropriate functions $V^1(\cdot)$ and $V^0(\cdot)$ by demonstrating that the posited form of the perceived value
function is stable under an application of the Bellman operator. We have:
\[
\begin{aligned}
(T\hat{V})(x, \hat{s} \mid \mu)
&= \max_{\iota \in I} \Big\{ x\pi_1(\hat{s}) + \pi_0(\hat{s}) - cx\iota + \beta E_\mu\big[ (x\zeta_1(\iota, \hat{s}, w_1) + \zeta_0(\hat{s}, w_0))\hat{V}_1(\hat{s}_{t+1}) + \hat{V}_0(\hat{s}_{t+1}) \,\big|\, \iota, \hat{s}_t = \hat{s} \big] \Big\} \\
&= x \max_{\iota \in I} \Big\{ -c\iota + \beta E_\mu\big[ \zeta_1(\iota, \hat{s}, w_1)\hat{V}_1(\hat{s}_{t+1}) \,\big|\, \iota, \hat{s}_t = \hat{s} \big] \Big\} + x\pi_1(\hat{s}) + \tilde{V}_0(\hat{s}) \qquad (14) \\
&= x\tilde{V}_1(\hat{s}) + \tilde{V}_0(\hat{s}),
\end{aligned}
\]
where we define $\tilde{V}_0(\hat{s}) = \pi_0(\hat{s}) + \beta E_\mu\big[ \zeta_0(\hat{s}, w_0)\hat{V}_1(\hat{s}_{t+1}) + \hat{V}_0(\hat{s}_{t+1}) \,\big|\, \hat{s}_t = \hat{s} \big]$. Now, let us denote by $\hat{V}^n$
the iterates obtained by applying the Bellman operator T. Then, we have concluded that
\[
x\hat{V}^n_1(\hat{s}) + \hat{V}^n_0(\hat{s}) \to V(x, \hat{s}), \quad \forall x, \hat{s},
\]
as n → ∞, where V is the optimal value function. But since the above holds for at least two distinct values
of x for any given ŝ, this suffices to conclude that $\hat{V}^n_1(\hat{s}) \to \hat{V}^\infty_1(\hat{s})$ and $\hat{V}^n_0(\hat{s}) \to \hat{V}^\infty_0(\hat{s})$.
Now, under the additional Assumption C of Chapter 4 in Bertsekas and Shreve (1978), and further
assuming that the supremum implicit in the dynamic programming operator applied to V is attained for every
(x, ŝ), the second claim follows immediately from equation (14) and Proposition 4.3 of the reference.
32 Note that a derivation similar to that in Proposition B.1 will show that any k-th moment of fringe firms’ states for an integer k
would depend on moments k, k − 1, . . . , 1 only, as m grows large. Therefore, if higher integer moments are payoff relevant they
could be accounted for as well.
33 This result requires appropriate continuity conditions on the model primitives, and also that the equilibrium investment strategy
is continuous in the moment-based fringe state. The result is omitted for brevity and can be obtained from the authors upon request.
Proof of Proposition B.1. Note that
\[
\theta^m_{t+1} = \sum_{i=1}^{N^m} x_{i,t+1} = \sum_{i=1}^{N^m} \big[ x_{it}\,\zeta_1(\mu^m(\hat{s}^m_t), \hat{s}^m_t, w_{1it}) + \zeta_0(\hat{s}^m_t, w_{0it}) \big],
\]
where $(w_{0it}, w_{1it})$ are i.i.d. We need to show that conditional on $s^m_t$,
\[
(1/N^m)\big( \theta^m_{t+1} - E_{\mu^m}[\theta^m_{t+1} \mid s^m_t] \big) \to 0, \quad \text{a.s.},
\]
where $E_{\mu^m}[\theta^m_{t+1} \mid s^m_t] = \tilde{\zeta}^m_1(\hat{s}^m_t)\,\theta^m_t + N^m \tilde{\zeta}^m_0(\hat{s}^m_t)$. The result follows by Assumption B.1 and a law of large
numbers (see, for example, Corollary 7.4.1 in Resnick (1998)).
Proof of Proposition 5.1. If α ≠ 1 redefine the state of fringe firms to be y = x^α. Without loss of generality
assume Xf = [0, x̄]. Let st , s′t be two consistent underlying industry states, that is, dt = d′t , zt = z′t , and
θ(ft ) = θ(f′t ) = θt . For moments to be sufficient statistics it must be that for any such consistent underlying
states the expected next moments are the same: Eµ [θ(ft+1 )|st ] = Eµ [θ(ft+1 )|s′t ].
Let us fix a moment-based industry state ŝ and let us define the function g(x; ŝ) = Eµ [xi,t+1 |xit =
x, ŝt = ŝ]. A function h is midpoint convex if for all x1 , x2 ∈ (x̲, x̄) ⊂ ℝ the following holds: h((x1 +
x2 )/2) ≤ (h(x1 ) + h(x2 ))/2. Midpoint concavity is defined by reversing the inequality. A function that is
midpoint convex (concave) and bounded on an interval is convex (concave) (see Donoghue (1969)). We will
show that g(·; ŝ) is both convex and concave and so linear by showing that midpoint convexity and midpoint
concavity hold. Note that we need to show this only for x ≤ θ as a single fringe firm cannot be larger than
the moment.
For any x0 in the interior of Xf with 2x0 ≤ θt we can find (we assumed N − D ≥ 3) a consistent
fringe state f with f (x0 ) ≥ 2 and δ > 0 such that x0 ± δ ∈ Xf . Construct f′ with f′(x0 ) = f (x0 ) − 2,
f′(x0 − δ) = f (x0 − δ) + 1, f′(x0 + δ) = f (x0 + δ) + 1 and f′(x) = f (x) for all x ∉ {x0 , x0 ± δ}. Let
s = (f, d, z) and s′ = (f′, d, z). Clearly ŝ = ŝ′, because θ(f ) = θ(f′). By assumption Eµ [θ(ft+1 )|st ] −
Eµ [θ(ft+1 )|s′t ] = 2g(x0 ; ŝ) − g(x0 − δ; ŝ) − g(x0 + δ; ŝ) = 0 or g(x0 ; ŝ) = (g(x0 − δ; ŝ) + g(x0 + δ; ŝ))/2.
That is, g(x; ŝ) is midpoint convex and midpoint concave in the variable x, and so linear.
For x0 ≥ θ/2 we use the following construction. From the result above we have that g(θt /2 − δ; ŝ) is
linear for 0 ≤ δ ≤ θt /2. By assumption, g(θt /2 + δ; ŝt ) + g(θt /2 − δ; ŝt ) = 2g(θt /2; ŝt ), or g(θt /2 +
δ; ŝt ) = 2g(θt /2; ŝt ) − g(θt /2 − δ; ŝt ). Substituting for the right hand side we get g(θt /2 + δ; ŝt ) =
(θt /2 + δ)ζ̃1 (ŝt |µ) + ζ̃0 (ŝt |µ) for appropriately defined ζ̃0 , ζ̃1 and for all 0 ≤ δ ≤ min(x̄ − θt /2, θt /2). This
completes the proof.
Proof of Theorem 5.1. Assume the claim to be false. It must be that there exists an ε > 0 such that for all n,
there exists an n′ > n for which d(Γ, (µn′ , λn′ )) > ε. We may thus construct a subsequence {(µ̃n , λ̃n )} for
which inf_n d(Γ, (µ̃n , λ̃n )) > ε. Now, because the space of strategies is compact, we have that {(µ̃n , λ̃n )} has
a convergent subsequence; call this subsequence {(µ′n , λ′n )} and its limit (µ∗ , λ∗ ). We have thus established
the existence of a sequence of strategies and entry rate functions {(µ′n , λ′n )}, satisfying:
\[
\begin{aligned}
& V(x, s \mid BR(\mu'_n, \lambda'_n), \mu'_n, \lambda'_n) - V(x, s \mid \mu'_n, \lambda'_n) \to 0, \quad \forall (x, s) \in X \times S, && (15) \\
& \lambda'_n(s) - \beta E_{\mu'_n, \lambda'_n}\big[ V(x^e, s_{t+1} \mid \mu'_n, \lambda'_n) \,\big|\, s_t = s \big] \to 0, \quad \forall s \in S, && (16) \\
& (\mu'_n, \lambda'_n) \to (\mu^*, \lambda^*), && (17) \\
& \inf_n d(\Gamma, (\mu'_n, \lambda'_n)) > \epsilon, && (18)
\end{aligned}
\]
where BR(µ, λ) denotes the best response strategy when competitors play strategy µ and enter according to
λ.
Now, it is simple to show that our assumptions on model primitives guarantee that V (x, s|µ′ , µ, λ) is
continuous in (µ′ , µ, λ) for all (x, s). Moreover, the assumption that for all competitors’ decisions and all
terminal values, a firm’s one time-step ahead optimization problem to determine its optimal investment has a
unique solution, yields in addition that BR(·) is continuous on M × Λ. For a proof of this fact, see the proof
of Proposition 2 in Doraszelski and Satterthwaite (2010) which in turn employs Lemmas 3.1 and 3.2 of
Whitt (1980). Thus, we have from (15), (16), and (17) that V (x, s|BR(µ∗ , λ∗ ), µ∗ , λ∗ ) − V (x, s|µ∗ , λ∗ ) =
0, ∀(x, s) ∈ X × S, and λ∗ (s) − βEµ∗ ,λ∗ [V (xe , st+1 |µ∗ , λ∗ )|st = s] = 0, ∀s ∈ S. Hence, (µ∗ , λ∗ ) ∈ Γ.
But by (17), (18), and the triangle inequality d(Γ, (µ∗ , λ∗ )) > ε, a contradiction. The result follows.
Lemma C.1. The operator TR satisfies the following properties:
1. TR is a contraction mapping with modulus β. That is, for V̂ , V̂′ ∈ V̂, ‖TR V̂ − TR V̂′‖∞ ≤ β‖V̂ − V̂′‖∞ .
2. The equation TR V̂ = V̂ has a unique solution V̂∗ .
3. V̂∗ = lim_{k→∞} (TR)^k V̂ for all V̂ ∈ V̂.
Proof. Our result is based on a special case of Iyengar (2005) (see Theorem 3.2). However, to allow for
greater generality in the state space and action space we use a different proof based on classic dynamic
programming results. Define
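\[
(\hat{T}_R \hat{V})(x, \hat{s}) = \sup_{\substack{f \in S_f(\hat{s}) \\ \iota \in I,\ \rho \ge 0}} \Big\{ \pi(x, \hat{s}) + E\Big[ \phi\,1\{\phi \ge \rho\} + 1\{\phi < \rho\}\Big( -c(x, \iota) + \beta E_{\mu,\lambda}\big[ \hat{V}(x_{i,t+1}, \hat{s}_{t+1}) \,\big|\, x_{it} = x, s_t = (f, d, z), \iota \big] \Big) \Big] \Big\}. \tag{19}
\]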
The statement of the lemma holds for T̂R from the results for model (D) in Bertsekas and Shreve (1978)
(recall that we assumed bounded profits). The statement follows for TR since T̂R V̂ = TR V̂ for all bounded
V̂ .
Proof of Theorem 9.1. Take vectors V̂ ∈ V̂ and V , where V : X × S → ℝ, such that V (x, s) = V̂ (x, ŝ),
for every moment-based industry state ŝ that is consistent with industry state s. Let T ∗ be the Bellman operator
associated with the full Markov best response that keeps track of the entire industry state. It is simple to
observe that T ∗ V (x, s) ≤ TR V̂ (x, ŝ), for all x, and all ŝ and s consistent. By the monotonicity of the
TR and T ∗ operators we conclude that (T ∗ )k V (x, s) ≤ TRk V̂ (x, ŝ) for all k ≥ 1. Taking k to infinity
we get TRk V̂ → V̂ ∗ from Lemma C.1, and (T ∗ )k V → V ∗ by standard dynamic programming arguments.
Therefore, V ∗ (x, s) ≤ V̂ ∗ (x, ŝ) for s and ŝ consistent.
D. Real Time Algorithm for MME with the ‘Observed Transitions’ Kernel
Given strategies (µ, λ) and their associated value functions, it is useful to define
\[
W(x, \hat{s} \mid \mu, \lambda) = E_{\mu,\lambda}\big[ \hat{V}(x, \hat{s}_{t+1} \mid \mu, \lambda) \,\big|\, \hat{s}_t = \hat{s} \big], \tag{20}
\]
where the expectation is taken with respect to the perceived transition kernel. The function W is the expected
continuation value starting from industry state ŝ and landing in state x in the next period. Note that we only
integrate over the possible transitions of ŝ, excluding the firm’s own transition to x. It is worth emphasizing
that if x ∈ Xd in (20) then ŝt+1 depends on x, whereas if x ∈ Xf the next industry state, ŝt+1 , is independent
of x. Namely, dominant firm i that transitions to x will integrate over (θt+1 , d−i,t+1 , zt+1 ) with dt+1 =
(xi,t+1 , d−i,t+1 ). Now, we can write the Bellman equation associated with C1 in the equilibrium definition
as follows:
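\[
\hat{V}(x, \hat{s}; W) = \sup_{\iota \in I,\ \rho \ge 0} \Big\{ \pi(x, \hat{s}) + E\Big[ \phi\,1\{\phi \ge \rho\} + 1\{\phi < \rho\}\Big( -c(x, \iota) + \beta E_{\mu,\lambda}\big[ W(x_{i,t+1}, \hat{s}) \,\big|\, x_{it} = x, \iota \big] \Big) \Big] \Big\},
\]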
where the first expectation is taken with respect to the sell-off random value φ, and the second with respect
to the firm’s transition under investment level ι. Note that when evaluated at the optimal value function, the
function W is sufficient to compute a best response strategy. Based on this formulation we introduce our
real-time dynamic programming algorithm to compute MME; see Algorithm 2.
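For a fixed investment level ι, the inner problem over the exit cutoff ρ has a closed form when the sell-off value is exponential, which is the distribution listed for the beer application. Writing C = −c(x, ι) + βE[W(·)] for the continuation value, the optimal cutoff is ρ = max(C, 0) and the objective equals E[max(φ, C)] = C + K·exp(−C/K) for C ≥ 0, where K is the mean sell-off value. The sketch below evaluates this; the function name and the numerical inputs are hypothetical.

```python
import math

def exit_cutoff_and_value(cont_value, mean_selloff):
    """Optimal exit cutoff rho and E[phi*1{phi>=rho} + 1{phi<rho}*cont_value].

    Assumes phi ~ Exponential with mean `mean_selloff` (assumption of this sketch).
    """
    if cont_value <= 0.0:
        # Staying is never worth more than a nonnegative sell-off: always exit.
        return 0.0, mean_selloff
    rho = cont_value  # exit exactly when the sell-off beats the continuation value
    value = cont_value + mean_selloff * math.exp(-cont_value / mean_selloff)
    return rho, value

# Illustrative numbers in millions of dollars (hypothetical continuation value).
print(exit_cutoff_and_value(10.0, 7.0))
```

With such a closed form, the supremum over (ι, ρ) reduces to a one-dimensional search over the investment level.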
The steps of the algorithm are as follows. We begin with W functions with which firms’ optimal decisions can be computed (investment, exit, and entry). Once these are determined, we can simulate the
next industry state. The continuation value in the simulated state is used to update the W functions. The
update step depends on the number of times the state was visited e(ŝ) and on the number of rounds (we
take σ(n) = min(n, n̄) for some integer n̄ > 0 to allow for quick updating in the early rounds). At the
end of the simulation/optimization phase we check whether the W functions have converged, and if so
we check for convergence using Algorithm 1 as well. If convergence has not occurred we iterate on the
simulation/optimization phase.

Algorithm 2 Equilibrium solver with real-time dynamic programming
Initiate W (x, ŝ) := 0, for all (x, ŝ) ∈ X × Ŝ;
Initiate e(ŝ) := 0 for all ŝ ∈ Ŝ;
Initiate industry state (f0 , d0 , z0 ) and ŝ0 := (θ(f0 ), d0 , z0 );
∆w := δw + 1; n := 1;
while ∆w > δw do
    W′(x, ŝ) := W (x, ŝ) for all (x, ŝ) ∈ X × Ŝ;
    t := 1;
    while t ≤ T do
        for all x with ft (x) > 0 or x ∈ dt do
            Compute optimal strategies using V̂ (x, ŝt ; W′) and store them;
        end for;
        Compute optimal entry cutoff from V̂ (xe , ŝt ; W ) and store it;
        Simulate (ft+1 , dt+1 , zt+1 ) and ŝt+1 from these strategies;
        Let γ := 1/(σ(n) + e(ŝt ));
        for all x′ ∈ Xf do
            Compute V̂ (x′ , ŝt+1 ; W′);
            Update W (x′ , ŝt ) := γ V̂ (x′ , ŝt+1 ; W′) + (1 − γ)W′(x′ , ŝt );
        end for;
        for all dominant firms i and x′ ∈ Xd accessible in one step from xit do
            Define ŝ′t+1 to be the industry state ŝt+1 when firm i transitions to state x′ ;
            Compute V̂ (x′ , ŝ′t+1 ; W′);
            Update W (x′ , ŝt ) := γ V̂ (x′ , ŝ′t+1 ; W′) + (1 − γ)W′(x′ , ŝt );
        end for;
        e(ŝt ) := e(ŝt ) + 1, t := t + 1;
    end while;
    ∆w := ‖W′ − W ‖∞ ;
    e(ŝ) := 0 for all ŝ ∈ Ŝ;
    (f0 , d0 , z0 ) := (fT+1 , dT+1 , zT+1 ), and ŝ0 := (θ(f0 ), d0 , z0 );
    n := n + 1;
end while;
Compute µ(x, ŝ) and λ(ŝ) from V̂ (x, ŝ; W ) for all (x, ŝ) ∈ X × Ŝ;
Run Algorithm 1 with these strategies (used as stopping criteria);
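The update step of Algorithm 2 is a stochastic-approximation average with step size γ = 1/(σ(n) + e(ŝt )) and σ(n) = min(n, n̄). A minimal sketch of just this step follows; the function names and the default n̄ = 5 are hypothetical choices for illustration, not values from the paper.

```python
def step_size(n, visits, n_bar):
    """gamma = 1 / (sigma(n) + e(s_hat)), with sigma(n) = min(n, n_bar)."""
    return 1.0 / (min(n, n_bar) + visits)

def update_w(w_old, v_new, n, visits, n_bar=5):
    """One update W := gamma * V_hat + (1 - gamma) * W' for a single (x', s_hat) entry."""
    gamma = step_size(n, visits, n_bar)
    return gamma * v_new + (1.0 - gamma) * w_old

# In round n = 1 a first visit overwrites the initial guess completely (gamma = 1);
# later rounds and repeated visits move W more cautiously.
print(update_w(0.0, 8.0, n=1, visits=0))   # full replacement
print(update_w(0.0, 8.0, n=10, visits=3))  # gamma = 1/8
```

Capping σ(n) at n̄ keeps the step size from shrinking too quickly across rounds, while the visit count e(ŝ) averages the simulated continuation values within a round.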
E. Beer Industry Numerical Experiments
Denote by xit the goodwill of firm i at time t. The evolution of goodwill is similar to Pakes and McGuire
(1994), but with a multiplicative growth model following Roberts and Samuelson (1988):
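\[
x_{i,t+1} =
\begin{cases}
\min(x_{\bar{n}},\, x_{it}(1+\rho)) & \text{w.p. } \dfrac{\delta\psi(x)\iota_{it}}{1+\psi(x)\iota_{it}}, \\[2mm]
x_{it} & \text{w.p. } \dfrac{1-\delta'+(1-\delta)\psi(x)\iota_{it}}{1+\psi(x)\iota_{it}}, \\[2mm]
\max(x_1,\, x_{it}/(1+\rho)) & \text{w.p. } \dfrac{\delta'}{1+\psi(x)\iota_{it}}.
\end{cases}
\]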
This is equivalent to a depreciation factor 1/(1 + ρ) as is common in the literature on goodwill. With this
in mind we define a grid of states {x1 , . . . , xn̄ } for the possible values of goodwill firms can take, where
xk = x1 (1 + ρ)k−1 for some x1 > 0. To maintain the relationship between goodwill and advertising costs,
we choose the parameter ψ(x) such that E[xit+1 |xit = x, ιit = x] = x, that is, a firm with goodwill x has to
invest x dollars in advertising to maintain goodwill level x on average. It follows that
\[
\psi(x) = \frac{\delta'}{\delta} \cdot \frac{1 - (1+\rho)^{-1}}{\rho x}.
\]
Under this condition, the average goodwill (state) of a firm that invests xit in every period is approximately
xit .
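The maintenance condition can be checked numerically. The sketch below assumes ψ(x) = (δ′/δ)(1 − (1+ρ)^(−1))/(ρx), which is what the condition E[xit+1 |xit = x, ιit = x] = x implies for an interior grid point (the min and max at the grid boundaries are ignored). The function names and the goodwill level x = 100 are hypothetical; the parameter values are those used in the experiments.

```python
def psi(x, delta, delta_prime, rho):
    """Assumed calibration: psi(x) = (delta'/delta) * (1 - 1/(1+rho)) / (rho * x)."""
    return (delta_prime / delta) * (1.0 - 1.0 / (1.0 + rho)) / (rho * x)

def transition_probs(x, iota, delta, delta_prime, rho):
    """Probabilities of moving up, staying, and moving down one grid step."""
    a = psi(x, delta, delta_prime, rho) * iota
    up = delta * a / (1.0 + a)
    stay = (1.0 - delta_prime + (1.0 - delta) * a) / (1.0 + a)
    down = delta_prime / (1.0 + a)
    return up, stay, down

def expected_next(x, iota, delta, delta_prime, rho):
    """E[x_{t+1} | x_t = x, iota] at an interior state (boundary truncation ignored)."""
    up, stay, down = transition_probs(x, iota, delta, delta_prime, rho)
    return up * x * (1.0 + rho) + stay * x + down * x / (1.0 + rho)

# Parameters from the experiments: (delta', delta) = (1, .55), rho = .25.
x = 100.0
print(transition_probs(x, x, 0.55, 1.0, 0.25))
print(expected_next(x, x, 0.55, 1.0, 0.25))  # should be (numerically) equal to x
```

Investing more than x raises the probability of moving up the grid, and investing less lets depreciation dominate.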
The moment space is discretized linearly and we use a bicubic spline to interpolate between grid points
when computing firms’ optimal strategies.
We use the profit function of Example 7.1. We take β = .925 and (δ′ , δ) = (1, .55) as the transition
parameters. After some experimentation we choose the market size m = 30. The other parameters are listed
in the next table with their relevant sources.
| Description | Value | Source |
|---|---|---|
| Number of firms (N ) | 200 | This number is chosen to be greater than the maximal number of active firms in this period. |
| Maximal number of dominant firms (D) | 3 | This is the actual number of dominant firms in the industry. |
| Depreciation of goodwill (ρ) | .25 | Roberts and Samuelson (1988) estimate this by .2 for the cigarette market. |
| Production cost per barrel (c) | $120 | Rojas (2008) estimated the markup to be about a third of the price, and the average price is $165 per barrel in the period studied. |
| Average entry cost (exponential dist.) | 35 × 10^6 | Based on costs of new plants. |
| Sell-off value (exponential dist.) | 7 × 10^6 | Based on sales’ price of used plants. |
| Fixed cost per period fringe (double for dominant) | 10^6 | We introduce this fixed cost in the single-period profit function. |
| Profit function parameters (Y and α2 ) | 200 and 1, resp. | Chosen to match the price elasticity (-.5) for the average price, see (Tremblay and Tremblay, 2005, p. 23). |
F. Computation of Robust Bound
The remainder of the appendix provides details on the computation of the robust bound under the assumption
that there is a large number of fringe firms and therefore the one-step transition of the fringe state is assumed
to be deterministic. When transitions are deterministic, finding the optimal consistent f in (12) is equivalent
to choosing the next moment from an accessibility set of moments that can be reached from the current
industry state. As we will explain in detail now, this considerably simplifies the computation of the inner
maximization in (12), since the moment accessibility sets are low dimensional. Moreover, characterizing
these accessibility sets can be done efficiently.
We assume throughout that the set of individual fringe firm states is discrete and univariate, Xf =
{x1 , . . . , xn̄ }, where xn ∈ ℝ, for all n = 1, . . . , n̄ < ∞.34 In addition, for simplicity, we assume that there
are no transitions between the dominant and fringe tiers.
Finding the set of accessible moments amounts to solving an integer feasibility problem. To simplify,
let us assume that θ consists of only one moment of the form $\theta = \sum_{x \in \tilde{X}_f} f(x)\, x^{\alpha}$ for α ≥ 0, where
X̃f ⊆ Xf . This representation is general enough to include moments and other statistics. The extension to
more moments is direct.
34 The extension to multivariate xn is simple, and the extension to a continuous state space will require discretization.
Assuming that the next fringe state is deterministic and equal to its expected value, we say that moment
θ′ is accessible from moment θ in industry state ŝ if there exists a fringe state f ∈ Sf (ŝ) that solves the
following system of linear equations,
\[
\begin{aligned}
& \sum_{x \in \tilde{X}_f} f(x)\, x^{\alpha} = \theta, \\
& \sum_{x \in X_f} f(x)\, E_\mu\big[ (x_{i,t+1})^{\alpha} \,\big|\, (x_{it}, \hat{s}_t) = (x, \hat{s}) \big] + 1\{x^e \in \tilde{X}_f\}\,(x^e)^{\alpha}\, P(\kappa_{it} \le \lambda(\hat{s}))\, N^e(f) = \theta', \qquad (21) \\
& \sum_{x \in X_f} f(x) \le N, \quad f \in \mathbb{N}^{\bar{n}},
\end{aligned}
\]
where the first equation states that the current moment is consistent with the fringe state, and the second that
the expected next moment is θ′ . We denote by N e (f ) the number of potential entrants at fringe state f .
We say that moment θ′ is accessible from ŝ if this system of linear equations has a solution. This motivates
the definition of the accessibility set A(ŝ), where θ′ ∈ A(ŝ) if and only if it is accessible from ŝ, that is, if
there is a fringe state consistent with ŝ such that the expected next moment is θ′ . Note that the accessibility
sets depend on the investment and entry MME strategies that control the expected next moment in (21).
Due to the integrality constraint that every f (x) be a nonnegative integer, the computation of A(ŝ) is demanding. However, we can
relax this integrality constraint by replacing it with f ≥ 0. With that, the accessibility problem amounts to
solving a feasibility problem of a system of linear equations that can be solved easily. We denote the relaxed
accessibility set by Â(ŝ); this set contains A(ŝ).
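With a single moment and the integrality constraint relaxed, the set of reachable next moments is the image of a polytope under a linear map, hence an interval. The sketch below computes its endpoints by enumerating the vertices of the relaxed feasible set, a simple substitute for a linear programming solver rather than the authors' procedure. It makes several simplifying assumptions that are not in the text: the entry term in (21) is omitted, X̃f = Xf, and all weights x^α are strictly positive. The function name and the numbers are hypothetical.

```python
from itertools import combinations

def relaxed_accessible_interval(a, g, theta, n_max, tol=1e-12):
    """Range of sum_k f_k*g_k over {f >= 0, sum_k f_k*a_k = theta, sum_k f_k <= n_max}.

    a[k] = x_k**alpha, the weight of state k in the current moment (assumed > 0);
    g[k] = E[x_{t+1}**alpha | x_k], its expected contribution to the next moment.
    Returns (lo, hi), or None if no relaxed fringe state is consistent with theta.
    """
    values = []
    states = range(len(a))
    if abs(theta) <= tol:
        values.append(0.0)  # the empty fringe, f = 0
    # Vertices with one positive component and slack in the mass constraint.
    for k in states:
        fk = theta / a[k]
        if -tol <= fk <= n_max + tol:
            values.append(fk * g[k])
    # Vertices with two positive components and the mass constraint binding.
    for j, k in combinations(states, 2):
        det = a[j] - a[k]
        if abs(det) <= tol:
            continue
        fj = (theta - n_max * a[k]) / det
        fk = n_max - fj
        if fj >= -tol and fk >= -tol:
            values.append(fj * g[j] + fk * g[k])
    return (min(values), max(values)) if values else None

# Hypothetical example: states 1, 2, 4 with alpha = 1, current moment 10, at most 6 firms.
print(relaxed_accessible_interval([1.0, 2.0, 4.0], [1.1, 2.0, 3.6], 10.0, 6.0))
```

In this example the largest next moment is reached by mixing the two smallest states until the firm-count constraint binds, and the smallest by concentrating all mass on the largest state.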
Define the operator
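\[
(\hat{T} \hat{V})(x, \hat{s}) = \sup_{\iota \in I,\ \rho \ge 0}\ \sup_{\theta' \in \hat{A}(\hat{s})} \Big\{ \pi(x, \hat{s}) + E\Big[ \phi\,1\{\phi \ge \rho\} + 1\{\phi < \rho\}\Big( -c(x, \iota) + \beta E_{\mu,\lambda}\big[ \hat{V}(x_{i,t+1}, (\theta', d_{t+1}, z_{t+1})) \,\big|\, x_{it} = x, \hat{s}_t = \hat{s}, \iota \big] \Big) \Big] \Big\},
\]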
where V̂ ∈ V̂. The next proposition states that we can search over accessibility sets instead of the much
larger consistency sets.
Proposition F.1. Assume that there are no transitions between tiers and that the fringe state follows deterministic transitions given by the expected next state (equation (21)). Then
(TR V̂ ) ≤ (T̂ V̂ ),
for every V̂ ∈ V̂. Moreover, the operator T̂ satisfies the same properties as the operator TR given in
Lemma C.1.
Proof. Without tier transitions the evolution of the moments is independent of the evolution of dominant
firms. If fringe firms’ transitions are also deterministic each f ∈ Sf (ŝ) maps to an expected next moment
and so we can optimize over the set of these moments instead. The inequality follows because we consider
the relaxed accessibility sets, so we optimize over a larger set in T̂ .
Based on this, we propose the following computationally tractable algorithm to find the robust error
bound: (i) construct the relaxed accessibility sets by solving the relaxed feasibility problem for all ŝ ∈ Ŝ
and θ′ ∈ Sθ and store them; and (ii) iterate the operator T̂ over the relaxed accessibility sets until a fixed point
is found. Since the relaxed accessibility sets contain the accessibility sets, this provides an upper bound to
V ∗ , assuming deterministic fringe transitions (by Theorem 9.1 together with the previous proposition). For
problems for which MME is solvable this procedure is generally computationally manageable. In addition,
this procedure can be easily extended to accommodate tier transitions between dominant and fringe firms.
In this case, however, the robust bound typically becomes looser.