Reconciling the assumption of Pareto distribution of
firm productivities with the exporter data
Zhanar Akhmetova∗
Department of Economics, Australian School of Business, University of New South Wales
Sydney, NSW 2052, Australia
Cristina Mitaritonna
Centre d’Etudes Prospectives et d’Informations Internationales (CEPII)
113 rue de Grenelle 75007, Paris, France
January 8, 2014
Abstract. We propose a stochastic formulation of the Melitz model with a sunk entry cost and a Pareto distribution of firm productivities. Payoffs of a firm are subject
to random time-varying shocks. As a result, the distribution of firm productivities
conditional on exporting is no longer Pareto, but a hump-shaped distribution with a
heavy right tail. This fits well the observed shape of the distribution of productivities
of exporting firms.
∗ Corresponding author. Telephone: +61 2 93854965; E-mail: [email protected].
Keywords: Heterogeneous Producers, Pareto Distribution, Melitz model, Random
Shocks.
JEL classification: F12, F14, L11.
1 Introduction
The Pareto distribution has been used widely to model the cross-sectional distribution of firm productivity or size in the recent international trade literature. In the
heterogeneous firm models of Helpman et al. (2004), Helpman et al. (2008), Melitz
and Ottaviano (2008), Chaney (2008), Eaton et al. (2011), all of which build on the
seminal paper by Melitz (2003), firm productivities are assumed to follow a Pareto
distribution. A Pareto distribution is characterised by two parameters: a scale parameter and a shape parameter. The probability density function of a random variable
with a Pareto distribution is plotted in Figure (1) for various values of the shape
parameter.
The main justification for this assumption, apart from its analytical convenience,
is derived from the works of Simon (1955), Simon and Bonini (1958), Ijiri and Simon
(1964), Steindl (1965), Axtell (2001), Luttmer (2007). Simon (1955) shows that
some stochastic processes in dynamic models can result in a class of highly-skewed
distributions, particularly Pareto, in the steady state. Simon and Bonini (1958), Ijiri
and Simon (1964), Steindl (1965) have all indicated the good approximation of the
distribution of firm sizes that the Pareto distribution provides. Most recently, Axtell
(2001) demonstrated this using 1997 data on the entire population of firms in the US.
Luttmer (2007) presents a stochastic model of growth where the stationary firm size
distribution is a Pareto distribution.
However, when we consider the productivities in the dataset of French exporters
between 1995-2007, we obtain distributions that do not look like Pareto, at least not
for low values of productivity, a fact that has not been emphasised in the international
trade literature. The dataset provides information on export values and quantities of
all exporters by country and 8-digit-code product (according to the European CN8
classification, which is an 8-digit extension of the HS6 classification system, analogous to the ten-digit extensions (HS10) employed by the US). It is merged with the
domestic balance sheets of the firms, which allows us to measure total factor productivity (TFP) of each firm a la Ackerberg et al. (2006). Our procedure is described in
the Appendix. We consider all exporters (firms that exported at least once between
1995-2007) of a given 8-digit-code product to a given destination. In order to obtain
comparable productivity values across firms exporting the same product in the presence of multi-product firms, we locate the 2-digit industrial sector that dominates the
exports of the given 8-digit-code product, and consider only firms that declare this
sector as their primary industry. For example, most exporters of perfumes (8-digit
code 33030010), shampoos (8-digit code 33051000), soaps (8-digit code 34011100)
and perfumed bath salts (8-digit code 33073000) declare themselves as belonging to
the 2-digit industry 24 (production of soaps, perfumes, cleaning products and other
chemical products). We therefore take into account only exporters that belong to this
sector when plotting the distribution of productivities of exporters of these 8-digit-code products. These plots are presented in Figure (2), for selected countries (Japan
and USA). Distributions of productivities of exporters of other products and to other
destinations look similar.
Figure 1: Plot of probability density functions for a random variable with Pareto distribution, for various values of the shape parameter (shape parameter = 3, 4 and 10; scale parameter = 1). Horizontal axis: values of random variable; vertical axis: probability density function.
Figure 2: Kernel density estimates of the distributions of exporting firm productivities (TFP). Panels: exporters of soap to Japan, exporters of bath salts to Japan, exporters of shampoos to the US, exporters of perfumes to the US. Epanechnikov kernel; bandwidths 0.3317, 0.2953, 0.5009 and 0.3708.

It is shown in what follows that in a standard Melitz (Melitz, 2003) model with
heterogeneous firms whose productivities are drawn from a Pareto distribution, and
with a sunk entry cost, the distribution of productivities of firms that export is also
Pareto. We extend this, as we call it, ‘deterministic’ model by introducing random
time-varying shocks to the payoffs of the firm from every action (exporting and not
exporting). We interpret these shocks as variation in the costs of the firm. They are
firm-specific and independent of firm productivity, which is time-invariant. Under
certain assumptions (Rust, 1987), one can calculate the probabilities that a firm
chooses exporting or not exporting, given its productivity and past history (whether it
has exported before or not). We show how we derive the stationary distribution of this
model and demonstrate that for a sufficiently high sunk entry cost the cross-sectional
distribution of firm productivities conditional on exporting has the same shape as that
shown in Figure (2): increasing for low values of productivities and declining after
some point, with a heavy right tail. This happens because unlike in the deterministic
model, random shocks to firm payoffs imply that even low-productivity firms (firms
that would not export at all in a deterministic model) may pay the sunk entry cost in
a period when they receive a very favourable shock to their exporting payoff. These
firms might then continue exporting afterwards, since they have already invested in
the entry cost. Thus, firms with low productivities will be present in the exporter
dataset with non-zero probability. The higher the productivity of a firm, the higher
the probability it exports in any given period. This positive relationship interacts with
the unconditional Pareto distribution of firm productivities, which implies an inverse
relationship between firm productivity and its prevalence in the firm population, to
generate the hump-shaped distribution of productivities conditional on exporting,
observed in the data.
This paper is distinct from the literature explaining the presence of small exporters
in the exporter dataset. Arkolakis (2010) introduces a costly marketing technology
that allows firms to choose how many consumers to access in a market, into a model
of trade with product differentiation and firm productivity heterogeneity. This helps
explain why we observe so many firms exporting low volumes. Works on learning
under demand uncertainty, e.g. Akhmetova and Mitaritonna (2013), Albornoz et al.
(2012), Eaton et al. (2008) introduce an assumption of uncertainty about demand
in export markets, and allow firms to learn about demand from observed sales. This
framework can help justify the low sales volumes of new exporters, and in the case of
test marketing (experimentation in the market before investing in the sunk entry cost,
as in Akhmetova and Mitaritonna, 2013) can explain the presence of low-productivity
firms among exporters. However, all of these assumptions only lead to a downward
shift in the threshold on productivity beyond which firms choose to export. For
any given level of productivity, all firms either export or not, with probability 1. The
distribution of productivities conditional on exporting is still a Pareto in these models,
assuming a Pareto distribution of firm productivities in the population. Instead,
our work introduces idiosyncratic random shocks to payoffs of the firm, which for
sufficiently high values of the sunk entry cost produce a positive relationship, strictly
monotonic on a certain range, between firm productivity and the probability of
exporting, which ranges between 0 and 1. As a result, the hump-shaped distribution
of firm productivities conditional on exporting emerges.
The rest of the paper is organised as follows. Section 2 analyses the conditional
distribution of productivities of exporting firms in a deterministic Melitz model. We
present the stochastic formulation of this model in Section 3 and demonstrate the
stationary distribution of firm productivities of exporting firms. Section 4 concludes.
2 The Deterministic Model
2.1 Consumer Preferences
We study the optimal behavior of firms at Home producing varieties of a differentiated
good and wishing to sell in the Foreign market. Time is discrete. Normalize the size of
the market to 1. The representative consumer has CES preferences over the varieties
of the differentiated good, so that his/her utility from consuming quantities qjt at
time t is given by
$$U_t = \left[ \int_j (q_{jt})^{\frac{\varepsilon-1}{\varepsilon}} \, dj \right]^{\frac{\varepsilon}{\varepsilon-1}},$$
where j denotes varieties, ε > 1 is the elasticity of substitution.
Utility-maximizing quantity demanded of a variety j by an individual consumer
and in the market as a whole is given by
$$q_{jt} = Q_t \left( \frac{h_{jt}}{P_t} \right)^{-\varepsilon} = y_t (h_{jt})^{-\varepsilon} (P_t)^{\varepsilon-1},$$
where $h_{jt}$ is the price of variety j, $P_t$ is the aggregate price index for the differentiated
good, $P_t = \left[ \int_j (h_{jt})^{1-\varepsilon} \, dj \right]^{\frac{1}{1-\varepsilon}}$, $y_t$ denotes the total expenditures on the differentiated
good, and $Q_t$ is the total consumption of the differentiated good (so that $Q_t P_t = y_t$,
i.e. $Q_t = \frac{y_t}{P_t}$).
2.2 Firm’s Problem
Each firm produces one variety. We index firms with j, and assume that there is a
continuum of firms in the domestic economy, j ∈ [0, 1]. Firm j produces variety j.
Labor is the only factor of production, with the constant marginal cost of production of firm j given by the ratio of wages, $w_t$, and firm j’s productivity, $\varphi_{jt}$. Gross
profits from exporting to the market at any time t are given by
$$\pi_{jt} = q_{jt} h_{jt} - q_{jt} \frac{w_t}{\varphi_{jt}} = y_t (h_{jt})^{-\varepsilon} (P_t)^{\varepsilon-1} \left( h_{jt} - \frac{w_t}{\varphi_{jt}} \right).$$
To maximize these profits, the firm sets the price as a constant mark-up over the
marginal cost:
$$h_{jt} = \frac{\varepsilon}{\varepsilon-1} \frac{w_t}{\varphi_{jt}}.$$
Given optimal prices, we can re-write the aggregate price index as
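$$P_t = \left[ \int_j (h_{jt})^{1-\varepsilon} \, dj \right]^{\frac{1}{1-\varepsilon}} = \left[ \int_j \left( \frac{\varepsilon}{\varepsilon-1} \frac{w_t}{\varphi_{jt}} \right)^{1-\varepsilon} dj \right]^{\frac{1}{1-\varepsilon}}.$$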
In what follows, we drop the index j for simplicity of notation, and will consider a
stationary equilibrium, where all aggregate variables and firms’ productivities take
constant values, so that for a firm with productivity φ
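$$\pi_t = \pi, \ \forall t, \qquad \pi \equiv \varphi^{\varepsilon-1} \frac{1}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1}. \tag{1}$$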
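As a quick check on equation (1), the following minimal sketch evaluates stationary export profits in two ways: from the closed form, and from the demand and pricing equations above. The function names are illustrative only, and the default parameter values (ε = 4, y = P = w = 1) are the ones listed in Table (1).

```python
def export_profit(phi, eps=4.0, y=1.0, P=1.0, w=1.0):
    """Closed-form stationary export profit, equation (1)."""
    markup = eps / (eps - 1.0)
    return (phi ** (eps - 1.0) / (eps - 1.0)
            * markup ** (-eps) * y * (P / w) ** (eps - 1.0))


def export_profit_from_price(phi, eps=4.0, y=1.0, P=1.0, w=1.0):
    """Same profit, built from the optimal price and the CES demand."""
    marginal_cost = w / phi
    price = eps / (eps - 1.0) * marginal_cost          # constant mark-up rule
    quantity = y * price ** (-eps) * P ** (eps - 1.0)  # CES demand
    return quantity * (price - marginal_cost)


if __name__ == "__main__":
    for phi in (1.0, 2.0, 3.0):
        print(phi, export_profit(phi), export_profit_from_price(phi))
```

Both routes should agree up to floating-point error, which is a useful guard against sign or exponent slips when re-typing the formula.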
The firm has to pay a sunk entry cost F to access the foreign market. The firm is
infinitely lived, but every period there is a probability 1 − δ, δ ∈ (0, 1) of an exogenous
death shock. The firm may choose to export or not to export in any period. If the
firm has paid the sunk entry cost by time t, and it does not export at time t, it will
still be able to export at time t + 1.
2.3 Solution of the Problem of the Firm
The Bellman equation for the value function is as follows:
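$$V(\Delta_t) = \max_{i_t = 0,1} V^c(\Delta_t, i_t),$$
where
$$i_t = \begin{cases} 1, & \text{if the firm exports at time } t, \\ 0, & \text{otherwise,} \end{cases}$$
$$\Delta_t = \begin{cases} 1, & \text{if the firm has incurred the cost } F \text{ by time } t, \\ 0, & \text{otherwise,} \end{cases}$$
$$V^c(\Delta_t, 0) = \delta V(\Delta_{t+1}), \quad \forall \Delta_t,$$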
is the choice-specific value function for the decision to not export,
$$V^c(\Delta_t, 1) = \begin{cases} \pi + \delta V(\Delta_{t+1}), & \text{if } \Delta_t = 1, \\ \pi + \delta V(\Delta_{t+1}) - F, & \text{if } \Delta_t = 0, \end{cases}$$
is the choice-specific value function for the decision to export, profits π of a firm with
productivity φ from exporting in a stationary equilibrium are given by (1), and $\Delta_t$
evolves according to
$$\Delta_{t+1} = \begin{cases} 1, & \text{if } \Delta_t = 1 \text{ or } i_t = 1, \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$
From (2) we obtain
$$V^c(\Delta_t, 1) = \begin{cases} \pi + \delta V(1), & \text{if } \Delta_t = 1, \\ \pi + \delta V(1) - F, & \text{if } \Delta_t = 0, \end{cases} \tag{3}$$
and
$$V(1) = \max\{V^c(1, 0), V^c(1, 1)\} = \max\{\delta V(1), \pi + \delta V(1)\}.$$
Since π > 0 for all positive φ, so that π + δV(1) > δV(1), we get
$$V(1) = \pi + \delta V(1), \qquad V(1) = \frac{\pi}{1-\delta}.$$
Hence,
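$$\begin{aligned} V(0) &= \max_{i_t = 0,1} V^c(0, i_t) \\ &= \max\{\delta V(0), \ \pi + \delta V(1) - F\} \\ &= \max\left\{\delta V(0), \ \pi + \delta \frac{\pi}{1-\delta} - F\right\} \\ &= \max\left\{\delta V(0), \ \frac{\pi}{1-\delta} - F\right\}, \end{aligned}$$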
so that
$$V(0) = \begin{cases} 0, & \text{if } \frac{\pi}{1-\delta} - F < 0, \\ \frac{\pi}{1-\delta} - F, & \text{if } \frac{\pi}{1-\delta} - F \geq 0. \end{cases}$$
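To summarise, the optimal policy choice is given by:
$$i_t(\Delta_t) = \begin{cases} 0, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F < 0, \\ 1, & \text{if } \Delta_t = 1 \text{ or } \frac{\pi}{1-\delta} - F \geq 0, \end{cases} \tag{4}$$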
and the value function is
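$$V(\Delta_t) = \begin{cases} 0, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F < 0, \\ \frac{\pi}{1-\delta} - F, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F \geq 0, \\ \frac{\pi}{1-\delta}, & \text{if } \Delta_t = 1. \end{cases} \tag{5}$$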
To find the cutoff on productivity φ such that firms with productivity above this
cutoff export, and all firms with lower productivity do not export, we solve
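$$\varphi^{\varepsilon-1} \frac{1}{1-\delta} \frac{1}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1} - F = 0.$$
Denote this cutoff by $\tilde{\varphi}$:
$$\tilde{\varphi} \equiv \frac{w}{P} \left[ \frac{F (1-\delta)(\varepsilon-1)}{y} \right]^{\frac{1}{\varepsilon-1}} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{\frac{\varepsilon}{\varepsilon-1}}.$$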
We illustrate the probability of exporting conditional on not having exported
before over the values of productivity φ for a hypothetical value of φ̃ = 1.3964 in
Figure (3). Notice that it is a piecewise constant function, equal to 0 for φ < φ̃ and
equal to 1 for φ ≥ φ̃.
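The cutoff and the step-function policy (4) can be sketched in a few lines. This is an illustration under the parameter values of Table (1), not the authors' code; `export_cutoff`, `net_entry_value` and `exports_deterministic` are hypothetical helper names.

```python
def export_profit(phi, eps=4.0, y=1.0, P=1.0, w=1.0):
    """Stationary export profit, equation (1)."""
    return (phi ** (eps - 1.0) / (eps - 1.0)
            * (eps / (eps - 1.0)) ** (-eps) * y * (P / w) ** (eps - 1.0))


def export_cutoff(F, delta=0.9, eps=4.0, y=1.0, P=1.0, w=1.0):
    """Productivity at which pi / (1 - delta) - F = 0."""
    return ((w / P)
            * (F * (1.0 - delta) * (eps - 1.0) / y) ** (1.0 / (eps - 1.0))
            * (eps / (eps - 1.0)) ** (eps / (eps - 1.0)))


def net_entry_value(phi, F, delta=0.9):
    """Discounted profits from exporting forever, net of the sunk cost."""
    return export_profit(phi) / (1.0 - delta) - F


def exports_deterministic(phi, F, has_paid=False, delta=0.9):
    """Policy (4): export if F is already sunk or if phi is at or above the cutoff."""
    return has_paid or phi >= export_cutoff(F, delta)


if __name__ == "__main__":
    cutoff = export_cutoff(10.0)
    print("cutoff:", cutoff, "net value at cutoff:", net_entry_value(cutoff, 10.0))
```

Writing the policy as a comparison with the cutoff makes the 0/1 nature of the deterministic export probability explicit: no firm below the cutoff ever enters.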
Figure 3: Probability of exporting conditional on not having exported before over the
values of productivity φ in the deterministic model.

2.4 The distribution of productivity conditional on exporting in the deterministic model
Assume that productivity φ follows a Pareto distribution with probability density
function
$$f_\Phi(\varphi) = \begin{cases} \alpha \dfrac{\varphi_m^{\alpha}}{\varphi^{\alpha+1}}, & \text{for } \varphi \geq \varphi_m, \\ 0, & \text{for } \varphi < \varphi_m, \end{cases}$$
where φm > 0 is a scale parameter (the lower bound on the support of φ), and α > 0
is a shape parameter. This density function is illustrated in Figure (4) for φm = 1
and α = 3. On the same graph, we illustrate the conditional probability density
function, conditional on φ ≥ 1.3964, where φ̃ = 1.3964, a hypothetical threshold on
productivity for exporting.
The conditional distribution is also a Pareto distribution, with a different lower
bound. In general, the conditional probability distribution of a Pareto-distributed
random variable, given that it is greater than or equal to a particular number φ̃ exceeding φm , is a Pareto distribution with the same shape parameter α, but with scale
parameter φ̃ instead of φm . This is proven in the Appendix. Thus, the distribution of productivities of exporting firms, that is firms with productivity φ above the
threshold φ̃, is also a Pareto distribution. As can be seen in Figure (2), this is not an
accurate depiction of reality.

Figure 4: The unconditional and conditional Pareto distributions. Horizontal axis: values of productivity; vertical axis: probability density function.
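The truncation property stated above follows in one line from the Pareto survival function. The sketch below is the standard textbook argument, given here for convenience; it is not a reproduction of the Appendix proof.

```latex
% Survival function of the Pareto distribution, for \tilde{\varphi} \geq \varphi_m:
\Pr(\Phi \geq \tilde{\varphi})
  = \int_{\tilde{\varphi}}^{\infty} \alpha \varphi_m^{\alpha} s^{-(\alpha+1)} \, ds
  = \left( \frac{\varphi_m}{\tilde{\varphi}} \right)^{\alpha}.

% Conditional density, for \varphi \geq \tilde{\varphi}:
f_\Phi(\varphi \mid \Phi \geq \tilde{\varphi})
  = \frac{f_\Phi(\varphi)}{\Pr(\Phi \geq \tilde{\varphi})}
  = \frac{\alpha \varphi_m^{\alpha} \varphi^{-(\alpha+1)}}
         {\varphi_m^{\alpha} \tilde{\varphi}^{-\alpha}}
  = \alpha \frac{\tilde{\varphi}^{\alpha}}{\varphi^{\alpha+1}}.
```

The result is again a Pareto density with shape α and scale φ̃, so truncation from below shifts the lower bound but cannot produce a hump.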
3 Stochastic formulation of the model
Let us introduce random shocks to the payoffs of the firm. Consider the same problem
as above, only now firm payoffs have an additional component, random, time-varying
and firm-specific. The Bellman equation for the value function is as follows:
$$V(\Delta_t) = \max_{i_t = 0,1} V^c(\Delta_t, i_t),$$
$$V^c(\Delta_t, 0) = \epsilon_{0t} + \delta V(\Delta_{t+1}), \quad \forall \Delta_t,$$
is the choice-specific value function for the decision to not export, $\epsilon_{0t}$ is the random
shock to the payoff from not exporting,
$$V^c(\Delta_t, 1) = \begin{cases} \pi + \epsilon_{1t} + \delta V(\Delta_{t+1}), & \text{if } \Delta_t = 1, \\ \pi + \epsilon_{1t} + \delta V(\Delta_{t+1}) - F, & \text{if } \Delta_t = 0, \end{cases}$$
is the choice-specific value function for the decision to export, and $\epsilon_{1t}$ is the random
shock to the payoff from exporting. All the other variables are as defined above, and
in particular ∆t evolves according to (2).
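Following the seminal paper of Rust (1987), we assume that $\epsilon_t \equiv (\epsilon_{0t}, \epsilon_{1t})$ is
distributed i.i.d. (across choices and periods) and follows a bivariate Type I extreme
value distribution, with mean normalised to (0, 0) and variance normalised to
$(\frac{\pi^2}{6}, \frac{\pi^2}{6})$, where in this expression only π denotes the mathematical constant rather than profits.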
The conditional independence assumption (Rust, 1987) holds due to the deterministic
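nature of the state variable $\Delta_t$. Under these conditions, we can define and evaluate
the expected value function $\widetilde{EV}(\Delta_t, i)$ that has as its arguments the state variable
$\Delta_t$ and the choice variable i:
$$\begin{aligned} \widetilde{EV}(\Delta_t, i) &\equiv E_{\Delta_{t+1}, \epsilon_{t+1} | \Delta_t, i} \left\{ \max_j \left[ Payoff(\Delta_{t+1}, j) + \epsilon_{j(t+1)} \right] \right\} \\ &= E_{\Delta_{t+1} | \Delta_t, i} \, E_{\epsilon_{t+1} | \Delta_{t+1}, \Delta_t, i} \left\{ \max_j \left[ Payoff(\Delta_{t+1}, j) + \epsilon_{j(t+1)} \right] \right\} \\ &= E_{\Delta_{t+1} | \Delta_t, i} \ln \left\{ \sum_j \exp[Payoff(\Delta_{t+1}, j)] \right\} \\ &= \sum_{\Delta_{t+1}} \ln \left\{ \sum_j \exp[Payoff(\Delta_{t+1}, j)] \right\} Tr(\Delta_{t+1} | \Delta_t, i), \end{aligned}$$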
where
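$$Payoff(\Delta_{t+1}, j) = \begin{cases} \delta \widetilde{EV}(\Delta_{t+1}, 0), & \text{if } j = 0, \\ \pi + \delta \widetilde{EV}(1, 1), & \text{if } j = 1, \ \Delta_{t+1} = 1, \\ \pi + \delta \widetilde{EV}(0, 1) - F, & \text{if } j = 1, \ \Delta_{t+1} = 0. \end{cases} \tag{6}$$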
$E_{x|y}$ denotes the expected value with respect to variable x conditional on variable y,
and $Tr(\Delta_{t+1} | \Delta_t, i)$ denotes the transition probability of the state variable $\Delta_t$:
$$Tr(\Delta_{t+1} | \Delta_t, i) = \begin{cases} 0, & \text{if } i = 0, \ \Delta_t = 0, \ \Delta_{t+1} = 1, \\ 1, & \text{if } (i = 1 \text{ or } \Delta_t = 1), \ \Delta_{t+1} = 1, \\ 0, & \text{if } (i = 1 \text{ or } \Delta_t = 1), \ \Delta_{t+1} = 0, \\ 1, & \text{if } i = 0, \ \Delta_t = 0, \ \Delta_{t+1} = 0. \end{cases} \tag{7}$$
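Given (6) and (7), we can write out the values $\widetilde{EV}(0, 0)$, $\widetilde{EV}(0, 1)$, $\widetilde{EV}(1, 0)$,
$\widetilde{EV}(1, 1)$ as follows:
$$\widetilde{EV}(0, 0) = \ln(\exp(\delta \widetilde{EV}(0, 0)) + \exp(\pi + \delta \widetilde{EV}(0, 1) - F)), \tag{8}$$
$$\widetilde{EV}(0, 1) = \ln(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))), \tag{9}$$
$$\widetilde{EV}(1, 0) = \ln(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))), \tag{10}$$
$$\widetilde{EV}(1, 1) = \ln(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))). \tag{11}$$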
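It is evident that
$$\widetilde{EV}(0, 1) = \widetilde{EV}(1, 0) = \widetilde{EV}(1, 1).$$
Denote $x \equiv \widetilde{EV}(1, 0) = \widetilde{EV}(1, 1)$. Then from (10) and (11)
$$x = \ln(\exp(\delta x) + \exp(\pi + \delta x)),$$
$$\exp(x) = \exp(\delta x) + \exp(\pi + \delta x),$$
$$\exp(x) = (\exp(x))^{\delta} + \exp(\pi)(\exp(x))^{\delta},$$
$$x = \frac{\ln(1 + \exp(\pi))}{1 - \delta}. \tag{12}$$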
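The closed form (12) can be cross-checked by iterating the fixed-point equation for x directly. The sketch below does this; the function names and the tolerance are illustrative choices, not part of the model.

```python
import math


def ev_exported_closed_form(pi, delta):
    """Equation (12): x = ln(1 + exp(pi)) / (1 - delta)."""
    return math.log1p(math.exp(pi)) / (1.0 - delta)


def ev_exported_iterated(pi, delta, tol=1e-12, max_iter=100_000):
    """Iterate x <- ln(exp(delta*x) + exp(pi + delta*x)) until it settles.

    The map is a contraction with modulus delta < 1, so it converges
    from any starting point.
    """
    x = 0.0
    for _ in range(max_iter):
        x_new = math.log(math.exp(delta * x) + math.exp(pi + delta * x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    pi, delta = 0.84375, 0.9  # profit of a firm with phi = 2 under Table (1)
    print(ev_exported_closed_form(pi, delta), ev_exported_iterated(pi, delta))
```

For large π the exponentials in the iteration can overflow, so in practice the closed form is the safer way to compute x.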
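Next, denote $y \equiv \widetilde{EV}(0, 0)$. From (8)
$$y = \ln(\exp(\delta y) + \exp(\pi + \delta x - F)),$$
$$y = \ln\left(\exp(\delta y) + \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right)\right),$$
$$\exp(y) = (\exp(y))^{\delta} + \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right),$$
$$\exp(y) - (\exp(y))^{\delta} = \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right). \tag{13}$$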
We simplify this equation further, by denoting $z \equiv \exp(y)$, $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right)$, and obtain
$$z - z^{\delta} = C. \tag{14}$$
The left-hand-side function of z, $g(z) \equiv z - z^{\delta}$, is graphed in Figure (5) for δ = 0.9.
In general, ∀δ ∈ (0, 1), this function is a non-negative and strictly increasing function of z
over z ∈ [1, ∞) (and z ∈ [1, ∞) as long as $\widetilde{EV}(0, 0) \geq 0$), with values g(z) ∈ [0, ∞)
and derivative
$$g'(z) = 1 - \delta z^{\delta - 1} > 0, \quad \forall z \in [1, \infty).$$
Therefore, $\forall C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right) \in [0, \infty)$, there exists a unique solution for
$z \equiv \exp(\widetilde{EV}(0, 0))$, z ≥ 1, such that (14), and therefore (13), holds.
Figure 5: Function g of z, for z ≥ 1. Horizontal axis: values of z; vertical axis: function g.
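Because g is continuous and strictly increasing on [1, ∞), equation (14) can be solved by simple bisection. The sketch below is one possible numerical approach, not necessarily the one used for the paper's simulations; `solve_z` is a hypothetical helper name.

```python
def solve_z(C, delta, n_iter=100):
    """Return the root z >= 1 of z - z**delta = C, for C >= 0 and 0 < delta < 1."""
    if C < 0.0:
        raise ValueError("C must be non-negative")

    def g(z):
        return z - z ** delta

    # Bracket the root: double the upper bound until g(hi) >= C.
    lo, hi = 1.0, 2.0
    while g(hi) < C:
        lo, hi = hi, 2.0 * hi

    # Bisection: g is strictly increasing, so the root stays inside [lo, hi].
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for C in (10.0, 20.0, 30.0):  # the values illustrated in Figure (6)
        z = solve_z(C, 0.9)
        print(C, z, z - z ** 0.9)
```

Bisection is slower than Newton's method but cannot leave the bracket, which matters because C spans many orders of magnitude across the productivity grid.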
3.1 Comparative statics of the choice probabilities
Once the values $\widetilde{EV}(0, 0)$, $\widetilde{EV}(0, 1)$, $\widetilde{EV}(1, 0)$, $\widetilde{EV}(1, 1)$ are obtained, we can calculate
the probability that the firm chooses action i, given $\Delta_t$:
$$Prob(i | \Delta_t) = \frac{\exp(Payoff(\Delta_t, i))}{\sum_j \exp(Payoff(\Delta_t, j))}.$$
In particular,
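$$Prob(0|1) = \frac{\exp(\delta \widetilde{EV}(1, 0))}{\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))} = \frac{\exp(\delta x)}{\exp(\delta x) + \exp(\pi)\exp(\delta x)} = \frac{1}{1 + \exp(\pi)}, \tag{15}$$
$$Prob(1|1) = 1 - Prob(0|1) = \frac{\exp(\pi)}{1 + \exp(\pi)}, \tag{16}$$
where x is given by (12).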
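$$\begin{aligned} Prob(0|0) &= \frac{\exp(\delta \widetilde{EV}(0, 0))}{\exp(\delta \widetilde{EV}(0, 0)) + \exp(\pi + \delta \widetilde{EV}(0, 1) - F)} \\ &= \frac{\exp(\delta y)}{\exp(\delta y) + \exp(\pi + \delta x - F)} \\ &= \frac{\exp(\delta y)}{\exp(y)} \\ &= \exp(y\delta - y), \end{aligned}$$
where the third line follows from (13), and
$$Prob(1|0) = 1 - Prob(0|0) = 1 - \exp(y\delta - y).$$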
Notice that from (13)
$$\exp(y) - \exp(y\delta) = \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right),$$
$$\exp(y)(1 - \exp(y\delta - y)) = C,$$
$$z \, Prob(1|0) = C.$$
Hence,
$$Prob(1|0) = \frac{C}{z} = \frac{z - z^{\delta}}{z} = 1 - \frac{1}{z^{1-\delta}}, \tag{17}$$
where z is obtained from (14), and
$$Prob(0|0) = 1 - Prob(1|0). \tag{18}$$
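Putting (12), (14) and (15) to (18) together, the four choice probabilities can be computed for any profit level π. The following is a minimal sketch; the function names and the dictionary layout, keyed by (i, Δ), are illustrative assumptions, and the root of (14) is found by bisection as one possible method.

```python
import math


def solve_z(C, delta, n_iter=100):
    """Root z >= 1 of z - z**delta = C (equation 14), by bisection."""
    lo, hi = 1.0, 2.0
    while hi - hi ** delta < C:      # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mid - mid ** delta < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def choice_probabilities(pi, F, delta):
    """Return {(i, Delta): Prob(i | Delta)} for i, Delta in {0, 1}."""
    x = math.log1p(math.exp(pi)) / (1.0 - delta)        # equation (12)
    C = math.exp(pi + delta * x - F)
    z = solve_z(C, delta)
    p_again = math.exp(pi) / (1.0 + math.exp(pi))       # equation (16)
    p_first = 1.0 - z ** (delta - 1.0)                  # equation (17)
    return {
        (0, 1): 1.0 - p_again,   # equation (15)
        (1, 1): p_again,
        (1, 0): p_first,
        (0, 0): 1.0 - p_first,   # equation (18)
    }


if __name__ == "__main__":
    for F in (0.0, 10.0, 20.0):
        print(F, choice_probabilities(0.84375, F, 0.9))
```

A useful sanity check is the case F = 0: then z = exp(x) solves (14), so Prob(1|0) coincides with Prob(1|1), as one would expect when entry is costless.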
Let us study the comparative statics of these probabilities with respect to the
parameters F and φ.
Proposition 1. As φ increases, Prob(0|1) decreases, Prob(1|1) increases, Prob(0|0)
decreases, and Prob(1|0) increases.
Proof. This happens because higher φ implies higher $\pi \equiv \varphi^{\varepsilon-1} \frac{1}{\varepsilon-1} \left[\frac{\varepsilon}{\varepsilon-1}\right]^{-\varepsilon} y \left[\frac{P}{w}\right]^{\varepsilon-1}$.
That higher π leads to lower Prob(0|1) and correspondingly higher Prob(1|1) is evident from (15), (16). Higher π implies higher $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right)$ and
therefore higher z, due to the properties of the function $g(z) \equiv z - z^{\delta}$. Different
values of z for various values of C are illustrated in Figure (6). Higher z in turn
leads to higher Prob(1|0), as is seen from (17), and therefore lower Prob(0|0). This
is expected: as profits from exporting go up, a firm is more likely to export.
Proposition 2. As F increases, Prob(0|1) and Prob(1|1) are unaffected, Prob(0|0)
increases, and Prob(1|0) decreases.
Proof. It can be seen from (15), (16) that F does not affect Prob(0|1) and
Prob(1|1). F has a negative effect on $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right)$ and therefore on z. Hence, as F increases, Prob(1|0) decreases and Prob(0|0) increases. This
is intuitive. As the sunk entry cost increases, the behavior of those firms that have
already incurred this cost ($\Delta_t = 1$) does not change. For firms that have not invested
in F yet, the higher sunk entry cost makes exporting less enticing.

Figure 6: The values of z for various values of C: C1 = 10, C2 = 20, C3 = 30. As C
increases, the corresponding value of z satisfying (14) increases.
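The monotonicity used in both proofs can also be written out by implicit differentiation of (14) and (17). The lines below are a sketch of that calculation under the definitions above; they add no new assumption beyond δ ∈ (0, 1) and z ≥ 1.

```latex
% From z - z^{\delta} = C (equation 14):
\frac{dz}{dC} = \frac{1}{1 - \delta z^{\delta - 1}} > 0 \quad \text{for } z \geq 1.

% From Prob(1|0) = 1 - z^{\delta - 1} (equation 17):
\frac{d\,Prob(1|0)}{dz} = (1 - \delta)\, z^{\delta - 2} > 0.

% From C = \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right):
\frac{\partial C}{\partial \pi}
  = C \left( 1 + \frac{\delta}{1 - \delta} \frac{\exp(\pi)}{1 + \exp(\pi)} \right) > 0,
\qquad
\frac{\partial C}{\partial F} = -C < 0.
```

Chaining the three derivatives gives Prob(1|0) increasing in π (and hence in φ) and decreasing in F, which is the content of Propositions 1 and 2 for firms that have not yet paid the entry cost.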
3.2
Stationary distribution
In this dynamic model we need to calculate the stationary equilibrium of the economy
to study the distribution of firms conditional on exporting. Assume that a mass M̄
of new firms is born every year, where each firm draws its productivity from a Pareto
distribution. The new-born firms choose whether to export or not, and the probability
of these choices for a firm with productivity φ is given by (17), (18), where z is the
solution of (14) and $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1 - \delta} - F\right)$. A fraction δ of all new-born
firms survives by the next period. Out of those firms that did not export previously,
a fraction Prob(1|0) exports and a fraction Prob(0|0) does not, where again these
probabilities are given by (17), (18). Out of those firms that exported previously, a
fraction Prob(1|1) exports and a fraction Prob(0|1) does not, where Prob(1|1) and
Prob(0|1) are given by (16), (15), and $\pi \equiv \varphi^{\varepsilon-1} \frac{1}{\varepsilon-1} \left[\frac{\varepsilon}{\varepsilon-1}\right]^{-\varepsilon} y \left[\frac{P}{w}\right]^{\varepsilon-1}$. Out of all firms, a
fraction δ survives to the following period, and so on and so forth.
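For any ν > 0, it is possible to find $t \in \mathbb{N}^{+}$ such that the mass of firms from a given
cohort (all firms that were born in the same period) surviving after t periods is smaller
than ν. We set this ν very close to 0 (say, $e^{-10}$) and calculate the corresponding t
satisfying this property. Denote by M(t) the mass of all firms belonging to the same
cohort that survive after t periods. Then we need
$$M(t) = \delta^{t} \bar{M} \leq \nu,$$
$$t \ln \delta \leq \ln \frac{\nu}{\bar{M}},$$
$$t \geq \frac{\ln \frac{\nu}{\bar{M}}}{\ln \delta},$$
where the inequality reverses in the last step because ln δ < 0. Set
$$\bar{t} \equiv \left\lceil \frac{\ln \frac{\nu}{\bar{M}}}{\ln \delta} \right\rceil,$$
where $\lceil x \rceil$ denotes the least integer function of x (the smallest integer not less than x).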
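For the parameter values in Table (1) the truncation horizon is easy to compute. The snippet below is a minimal sketch; `cohort_horizon` is a hypothetical helper name.

```python
import math


def cohort_horizon(nu, M_bar, delta):
    """Smallest integer t with delta**t * M_bar <= nu."""
    return math.ceil(math.log(nu / M_bar) / math.log(delta))


if __name__ == "__main__":
    t_bar = cohort_horizon(math.exp(-10), 1.0, 0.9)
    print(t_bar)                     # 95
    print(0.9 ** t_bar, 0.9 ** (t_bar - 1), math.exp(-10))
```

With ν = e^{-10}, M̄ = 1 and δ = 0.9 this gives a horizon of 95 periods: the surviving mass after 95 periods is below ν, while after 94 periods it is still above it.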
We can say that after t̄ periods, every cohort of firms dies out completely (if we
set ν arbitrarily close to 0), so when counting all existing firms in any period T , we
need only to consider the firms that were born at T , T − 1 (and survived after 1
period), T − 2 (and survived after 2 periods), ..., T − t̄ (and survived after t̄ periods),
and no further. Denote by N the mass of all existing firms in any period T :
$$N = \sum_{s=0}^{\bar{t}} M(s) = \sum_{s=0}^{\bar{t}} \delta^{s} \bar{M} = \bar{M} \frac{1 - \delta^{\bar{t}+1}}{1 - \delta}.$$
In what follows, we will discretize the Pareto distribution of firm productivities,
so that every new-born firm has productivity $\varphi_i$ with probability $d(\varphi_i)$, i = 1, 2, ..., I,
$\sum_{i=1}^{I} d(\varphi_i) = 1$, $I \in \mathbb{N}^{+}$. For every $\varphi_i$, we can calculate the corresponding profits
$\pi_i \equiv \varphi_i^{\varepsilon-1} \frac{1}{\varepsilon-1} \left[\frac{\varepsilon}{\varepsilon-1}\right]^{-\varepsilon} y \left[\frac{P}{w}\right]^{\varepsilon-1}$. This will also give us the probabilities $Prob_i(0|1)$, $Prob_i(1|1)$,
$Prob_i(0|0)$, $Prob_i(1|0)$, since all of these probabilities depend on the value of π. In
particular, we know from the comparative statics results that firms with higher $\varphi_i$
will have higher $Prob_i(1|1)$ and $Prob_i(1|0)$, and lower $Prob_i(0|1)$ and $Prob_i(0|0)$.
Denote by E(i) the total mass of all firms with productivity $\varphi_i$ that export at
any time period in the stationary equilibrium. E(i) is composed of all firms with
productivity $\varphi_i$ that are born currently and choose to export, all firms with productivity
$\varphi_i$ that were born in the last period, survive until now and choose to export, all firms
with productivity $\varphi_i$ that were born two periods ago, survive until now and choose to
export, and so on, up to the firms with productivity $\varphi_i$ that were born $\bar{t}$ periods ago,
survive until now and choose to export.
$$\begin{aligned} E(i) = {} & d(\varphi_i)\bar{M} \, Prob_i(1|0) + \delta d(\varphi_i)\bar{M} \left[ Prob_i(1|0)Prob_i(1|1) + Prob_i(0|0)Prob_i(1|0) \right] \\ & + \delta^{2} d(\varphi_i)\bar{M} \left[ Prob_i(0|0)Prob_i(0|0)Prob_i(1|0) + Prob_i(0|0)Prob_i(1|0)Prob_i(1|1) \right. \\ & \left. \quad + Prob_i(1|0)Prob_i(1|1)Prob_i(1|1) + Prob_i(1|0)Prob_i(0|1)Prob_i(1|1) \right] + \ldots \\ & + \delta^{\bar{t}} d(\varphi_i)\bar{M} \left[ Prob_i(0|0)^{\bar{t}} Prob_i(1|0) + Prob_i(0|0)^{\bar{t}-1} Prob_i(1|0)Prob_i(1|1) \right. \\ & \left. \quad + Prob_i(0|0)^{\bar{t}-2} Prob_i(1|0)Prob_i(0|1)Prob_i(1|1) + \ldots + Prob_i(1|0)Prob_i(1|1)^{\bar{t}} \right] \end{aligned}$$
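Rather than enumerating every export history, the sum for E(i) can be evaluated by tracking two masses within a cohort as it ages: firms that have never paid F and firms that have. The sketch below uses this recursion, which is an equivalent reformulation offered here as an assumption-free bookkeeping device rather than the authors' stated procedure; the names are illustrative.

```python
def exporter_mass(d_i, M_bar, delta, t_bar, p10, p11):
    """Mass of exporters with a given productivity, summed over cohorts aged 0..t_bar.

    p10 = Prob(1|0): export probability for firms that have not paid F.
    p11 = Prob(1|1): export probability for firms that have paid F.
    """
    never = d_i * M_bar   # cohort mass that has not yet paid F (age 0)
    paid = 0.0            # cohort mass that has already paid F
    total = 0.0
    for _ in range(t_bar + 1):
        # Exporters at this age: first-time entrants plus repeat exporters.
        total += never * p10 + paid * p11
        # Survive with probability delta; entrants move to the "paid" pool.
        never, paid = delta * never * (1.0 - p10), delta * (paid + never * p10)
    return total


if __name__ == "__main__":
    print(exporter_mass(d_i=0.2, M_bar=1.0, delta=0.9, t_bar=2, p10=0.3, p11=0.8))
```

For a horizon of two periods the recursion reproduces the first three terms of the expression for E(i) exactly, because Prob(1|1) + Prob(0|1) = 1 collapses the histories of firms that have already entered.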
Once we numerically evaluate E(i), i = 1, 2, ..., I, for given parameter values, we
can calculate the total mass of firms that export in any given period in stationary
equilibrium:
$$E = \sum_{i=1}^{I} E(i).$$
Denote by $r(\varphi_i)$ the distribution of productivities φ, conditional on exporting, in
stationary equilibrium. This can be obtained as
$$r(\varphi_i) \equiv \frac{E(i)}{E}.$$
We will study the behavior of this conditional distribution, for various values of the
model parameters, and compare it with the unconditional distribution of productivities.
3.3 The distribution of productivity conditional on exporting in the stochastic model
In this section we will simulate the distribution of productivities of exporting firms,
for various values of the sunk entry cost F , for fixed values of parameters P , y, w, δ,
ε, fixed values of M̄ , mass of new firms in every period, and ν, tolerance parameter
defined above, and assuming that firms draw productivity values φ from a Pareto
distribution with fixed values of parameters φm and α. The values of the parameters
are presented in Table (1).
Table 1: Values of the model parameters
Parameter   Specification 1   Specification 2   Specification 3
F           0                 10                20
P           1                 1                 1
y           1                 1                 1
w           1                 1                 1
δ           0.9               0.9               0.9
ε           4                 4                 4
φm          1                 1                 1
α           3                 3                 3
M̄           1                 1                 1
ν           e^{-10}           e^{-10}           e^{-10}
We first generate a discretized version of the Pareto distribution. The details of
this procedure are explained in the Appendix. This distribution has a finite support (in the range [1.0249, 6] in our simulations). The probabilities of exporting,
conditional on having exported before and not, are then obtained for each value of
productivity φi , i = 1, ..., I. I = 100 in our simulations. These probabilities are
depicted in Figures (7), (8), for values of F = 0, 10, 20. The probability of exporting conditional on having exported before is independent of F , since the firms that
have exported at least once view the cost F as sunk. Instead, firms that have never
exported to the market, are strongly affected by the value of F . The probability of
exporting, conditional on not having exported, shifts down for low and intermediate
values of φ, as F increases first from 0 to 10, and then from 10 to 20. This fact has
implications for the distribution of firm productivities, conditional on exporting.
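The simulation steps described in this section can be sketched end to end as follows. Parameter values are those of Table (1). The discretization used here (100 equal-width bins on [1, 6], represented by their midpoints) is a hypothetical stand-in, since the Appendix procedure is not reproduced in this section, and all function names are illustrative.

```python
import math

# Parameter values from Table (1).
EPS, DELTA, Y, P, W = 4.0, 0.9, 1.0, 1.0, 1.0
PHI_M, ALPHA, M_BAR, NU = 1.0, 3.0, 1.0, math.exp(-10)


def profit(phi):
    """Stationary export profit, equation (1)."""
    return (phi ** (EPS - 1) / (EPS - 1)
            * (EPS / (EPS - 1)) ** (-EPS) * Y * (P / W) ** (EPS - 1))


def solve_z(C, n_iter=100):
    """Root z >= 1 of z - z**DELTA = C (equation 14), by bisection."""
    lo, hi = 1.0, 2.0
    while hi - hi ** DELTA < C:
        lo, hi = hi, 2.0 * hi
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if mid - mid ** DELTA < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def export_probs(phi, F):
    """Return (Prob(1|0), Prob(1|1)) from equations (17) and (16)."""
    pi = profit(phi)
    x = math.log1p(math.exp(pi)) / (1 - DELTA)
    z = solve_z(math.exp(pi + DELTA * x - F))
    return 1.0 - z ** (DELTA - 1.0), math.exp(pi) / (1.0 + math.exp(pi))


def discretized_pareto(n_bins=100, upper=6.0):
    """ASSUMED discretization: equal-width bins, midpoints, renormalized masses."""
    edges = [PHI_M + (upper - PHI_M) * k / n_bins for k in range(n_bins + 1)]
    cdf = [1.0 - (PHI_M / e) ** ALPHA for e in edges]
    mass = [cdf[k + 1] - cdf[k] for k in range(n_bins)]
    total = sum(mass)
    mids = [0.5 * (edges[k] + edges[k + 1]) for k in range(n_bins)]
    return mids, [m / total for m in mass]


def conditional_distribution(F):
    """Return (grid, unconditional pmf d, conditional-on-exporting pmf r)."""
    t_bar = math.ceil(math.log(NU / M_BAR) / math.log(DELTA))
    grid, d = discretized_pareto()
    E = []
    for phi, d_i in zip(grid, d):
        p10, p11 = export_probs(phi, F)
        never, paid, total = d_i * M_BAR, 0.0, 0.0
        for _ in range(t_bar + 1):          # cohorts aged 0..t_bar
            total += never * p10 + paid * p11
            never, paid = DELTA * never * (1 - p10), DELTA * (paid + never * p10)
        E.append(total)
    mass = sum(E)
    return grid, d, [e / mass for e in E]


if __name__ == "__main__":
    for F in (0.0, 10.0, 20.0):
        grid, d, r = conditional_distribution(F)
        mode = grid[r.index(max(r))]
        print(f"F = {F}: mode of conditional distribution at phi = {mode:.3f}")
```

Because the grid differs from the one used in the paper, the numbers this produces should not be expected to match the figures exactly; the sketch is meant only to make the sequence of steps concrete.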
Figure 7: Probability of exporting conditional on having exported before over the values of
productivity φ in the stochastic model. The same curve applies to all cases of F = 0, 10, 20.

Figure 8: Probability of exporting conditional on not having exported before over the
values of productivity φ in the stochastic model. Solid curve for F = 20, dashed curve for
F = 10, dotted curve for F = 0.
The probability distribution of firm productivities, conditional on exporting, is
obtained as described above. The unconditional and conditional distributions of
productivity, for values of F = 0, 10, 20, are depicted in Figures (9), (10), (11).
In Figure (9), both the unconditional and conditional probability mass functions
are strictly decreasing. Notice that the conditional probability mass function is lower
than the unconditional one for low values of productivity, and higher for high values
of productivity. That is, due to the fact that the less productive firms are less likely
to export than the more productive firms, the weight in the conditional distribution
is shifted more towards the higher productivity firms.
In Figure (10), where F = 10, this shift in weight is even stronger, since the
probability of exporting conditional on not having exported before shifts down for
low values of φ as F increases from 0 to 10. As a result, there is now a slight hump
in the conditional probability distribution of φ, for values between 1.5 and 2.5.
As we increase F even further, to F = 20, the probability of exporting conditional
on not having exported before shifts down even more for low values of productivity,
as shown in Figure (8), and in particular, becomes nearly 0 for φ between 1 and 2.17.
The shift in weight in the conditional probability distribution of φ towards higher
values is therefore even stronger, and there is a pronounced hump, as is evident in
Figure (11). The probability mass function is strictly increasing for φ ∈ [2, 2.77] and
strictly decreasing for φ ∈ [2.77, 6].
Figure 9: Sunk entry cost F = 0. The solid curve shows the probability distribution
of productivity of all firms, and the dashed curve shows the probability distribution of
productivity of all exporting firms.

Figure 10: Sunk entry cost F = 10. The solid curve shows the probability distribution
of productivity of all firms, and the dashed curve shows the probability distribution of
productivity of all exporting firms.

Figure 11: Sunk entry cost F = 20. The solid curve shows the probability distribution
of productivity of all firms, and the dashed curve shows the probability distribution of
productivity of all exporting firms.
4 Conclusion
The assumption that firm productivities follow a Pareto distribution has been used
extensively in the recent trade literature. However, we show that the empirical productivity
distribution of exporting firms is not exactly Pareto; it is better described as
a hump-shaped distribution with a heavy right tail. In a standard Melitz model
with a sunk entry cost and heterogeneous firms that draw their productivities from a
Pareto distribution, the productivity distribution of exporting firms is still Pareto.
Introducing idiosyncratic random shocks to the firm payoffs results in a stationary
distribution of firm productivities conditional on exporting that matches the shape
of the empirical distribution, for sufficiently high values of the sunk entry cost.
5 Appendix
5.1 Estimating TFP
To carry out TFP estimation we use only data on the domestic sales of the firm, i.e.
Rjt = R̂jt − Xjt , where R̂jt is the total revenue of the firm and Xjt is its export revenue.
We assume that the share of total inputs used for domestic production equals the
share of domestic sales in total sales.
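As a minimal illustration of this apportioning rule, the Python sketch below computes domestic revenue and scales each input by the domestic share of sales. The function name `domestic_inputs`, the dictionary layout and the numbers are hypothetical, chosen only for this example; they are not taken from our data or code.

```python
def domestic_inputs(total_revenue, export_revenue, inputs):
    """Apportion total inputs to domestic production by the domestic sales share.

    total_revenue  : R_hat_jt, total revenue of the firm
    export_revenue : X_jt, export revenue of the firm
    inputs         : dict of total input quantities (hypothetical layout)
    """
    if total_revenue <= 0:
        raise ValueError("total revenue must be positive")
    if not 0 <= export_revenue <= total_revenue:
        raise ValueError("export revenue must lie between 0 and total revenue")
    domestic_revenue = total_revenue - export_revenue  # R_jt = R_hat_jt - X_jt
    share = domestic_revenue / total_revenue           # domestic share of sales
    scaled = {name: share * qty for name, qty in inputs.items()}
    return domestic_revenue, scaled


# Illustrative numbers only (not from the paper's data).
revenue, dom = domestic_inputs(
    200.0, 50.0, {"labor": 40.0, "materials": 80.0, "capital": 120.0}
)
print(revenue, dom)
```

A firm that exports nothing keeps all of its inputs in the domestic measure, since the share is then equal to one.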
In what follows, we borrow many steps from De Loecker (2011). Begin with the
pair of production and demand equations:
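\[
Q_{jt} = L_{jt}^{\alpha_l} M_{jt}^{\alpha_m} K_{jt}^{\alpha_k} e^{\omega_{jt} + u_{jt}},
\]
\[
Q_{jt} = Q_{st} \left( \frac{P_{jt}}{P_{st}} \right)^{-\varepsilon} e^{\eta_{jt}},
\]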
where L, M, K are inputs (labor, intermediate inputs and capital, respectively), ωjt
is an unobserved productivity shock, ujt is an error term, Pjt is firm j's price, Qjt is
firm j's quantity produced and sold, ηjt is an unobserved demand shock, and Qst and Pst
are the industry-wide output and price index, respectively; j indexes firms, and t indexes
time (years in our case). We observe only total domestic revenues (and not physical
output):
\[
R_{jt} \equiv Q_{jt} P_{jt} = Q_{jt} Q_{jt}^{-\frac{1}{\varepsilon}} Q_{st}^{\frac{1}{\varepsilon}} P_{st} e^{\eta_{jt} \frac{1}{\varepsilon}},
\]
so that upon dividing both sides by Pst and taking logs:
\[
\tilde{r}_{jt} = \frac{\varepsilon - 1}{\varepsilon} q_{jt} + \frac{1}{\varepsilon} q_{st} + \frac{1}{\varepsilon} \eta_{jt},
\]
and plugging in the equation for the production function:
r̃jt = βl ljt + βm mjt + βk kjt + βs qst + ωjt + ηjt + ujt ,
where lowercase letters denote the logarithms of the corresponding uppercase variables (e.g. qjt ≡ ln Qjt ),
and all the coefficients are reduced-form parameters combining the production function and demand parameters.
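To make the reduced-form mapping explicit, the substitution step can be sketched as follows. The relabeling of the rescaled shocks in the last line is our reading of the reduced form, stated here as an assumption rather than a result derived in the text.

```latex
\begin{aligned}
\tilde{r}_{jt}
  &= \frac{\varepsilon - 1}{\varepsilon}
     \left( \alpha_l l_{jt} + \alpha_m m_{jt} + \alpha_k k_{jt}
            + \omega_{jt} + u_{jt} \right)
     + \frac{1}{\varepsilon} q_{st} + \frac{1}{\varepsilon} \eta_{jt}, \\
\beta_h &= \frac{\varepsilon - 1}{\varepsilon} \alpha_h
  \quad \text{for } h \in \{l, m, k\},
  \qquad \beta_s = \frac{1}{\varepsilon}, \\
&\text{with } \tfrac{\varepsilon - 1}{\varepsilon} \omega_{jt},\;
  \tfrac{\varepsilon - 1}{\varepsilon} u_{jt},\;
  \tfrac{1}{\varepsilon} \eta_{jt}
  \text{ relabeled as } \omega_{jt},\; u_{jt},\; \eta_{jt}.
\end{aligned}
```

Under this reading, the estimated input coefficients are scaled versions of the production function elasticities and cannot be separated from the demand elasticity without further information.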
We next denote ω̃jt ≡ ωjt + ηjt and re-write:
r̃jt = βl ljt + βm mjt + βk kjt + βs qst + ω̃jt + ujt .
Here we follow Levinsohn and Melitz (2002) in treating both the unobserved productivity and demand shocks jointly, so that we do not identify these two sources of firm
profitability independently. Since our goal is only estimation of TFP, it suffices for
us to control for industry-wide output with industry-time fixed effects. Therefore, we
have
\[
\tilde{r}_{jt} = \beta_l l_{jt} + \beta_m m_{jt} + \beta_k k_{jt} + \tilde{\omega}_{jt} + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt},
\]
where Dst are industry-year dummies. The next issue to deal with is the identification
of the variable inputs’ coefficients. To do that, we take note of the ACF (Ackerberg,
Caves and Frazer, 2006) critique of the Levinsohn and Petrin (2003) and Olley and
Pakes (1996) estimation approach, and estimate all the input coefficients in the second
stage. We use value added in the first stage regression, so that we only estimate the
coefficients on labor and capital in the production function. We do use the data on
intermediate inputs, however, as the control for (unobserved) productivity. We use
the equation for intermediate inputs
mjt = mt (kjt , ω̃jt ),
so that assuming monotonicity in the function mt (.), we can invert:
ω̃jt ≡ ψt (mjt , kjt ).
Similarly, we assume that the optimal quantity of labor is chosen once current productivity is observed by the firm, so that
ljt = lt (kjt , ω̃jt ),
and once we utilize the expression for ω̃jt above,
ljt = lt (kjt , ψt (mjt , kjt )).
Inserting these into the expression for value added:
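\[
\begin{aligned}
\widetilde{va}_{jt} \equiv \tilde{r}_{jt} - \beta_m m_{jt}
  &= \beta_l l_{jt} + \beta_k k_{jt} + \psi(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt} \\
  &= \beta_l l_t(k_{jt}, \psi(m_{jt}, k_{jt})) + \beta_k k_{jt} + \psi(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt} \\
  &\equiv \Psi_t(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt},
\end{aligned}
\]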
where Ψt (mjt , kjt ) is a polynomial in capital and intermediate inputs, one for each
time period (year in our case):
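\[
\begin{aligned}
\Psi(m_{jt}, k_{jt}) \equiv{}
  & \sum_{t=1}^{T} b_{0t} D_t + \sum_{t=1}^{T} b_{1kt} D_t k_{jt} + \sum_{t=1}^{T} b_{1mt} D_t m_{jt} \\
  & + \sum_{t=1}^{T} b_{2kkt} D_t k_{jt}^2 + \sum_{t=1}^{T} b_{2mkt} D_t k_{jt} m_{jt} + \sum_{t=1}^{T} b_{2mmt} D_t m_{jt}^2 \\
  & + \sum_{t=1}^{T} b_{3kkmt} D_t k_{jt}^2 m_{jt} + \sum_{t=1}^{T} b_{3kmmt} D_t k_{jt} m_{jt}^2 \\
  & + \sum_{t=1}^{T} b_{3kkkt} D_t k_{jt}^3 + \sum_{t=1}^{T} b_{3mmmt} D_t m_{jt}^3,
\end{aligned}
\]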
where Dt are year dummies. Assuming that productivity follows a first-order Markov
process:
ω̃jt = E[ω̃jt |ω̃j(t−1) ] + νjt ,
where νjt is uncorrelated with kjt , and given values of βl and βk , one can estimate
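the residual νjt (unobserved innovation to productivity) non-parametrically from
\[
\tilde{\omega}_{jt} = \widetilde{va}_{jt} - \beta_l l_{jt} - \beta_k k_{jt} - \sum_{t=1}^{T} \beta_{st} D_{st},
\]
\[
\tilde{\omega}_{jt} = z_0 + z_1 \tilde{\omega}_{j(t-1)} + z_2 \tilde{\omega}_{j(t-1)}^2 + z_3 \tilde{\omega}_{j(t-1)}^3 + \nu_{jt}.
\]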
Next, use the moments
E[νjt (βk , βl )kjt ] = 0,
E[νjt (βk , βl )lj(t−1) ] = 0,
to identify the coefficients on capital and labor.
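The second-stage logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our estimation code: the observation layout (dictionaries with keys `va`, `l`, `k` and their lags, all net of industry-year effects), the function names, the simulated data and the sum-of-squares objective are hypothetical. In ACF-type procedures the value-added term entering this step is typically the first-stage fitted value.

```python
import random


def ols(X, y):
    """Least-squares coefficients via normal equations and Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * p for x, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def moments(beta_l, beta_k, obs):
    """Sample analogues of E[nu * k_t] and E[nu * l_(t-1)] for candidate coefficients."""
    omega = [o["va"] - beta_l * o["l"] - beta_k * o["k"] for o in obs]
    omega_lag = [o["va_lag"] - beta_l * o["l_lag"] - beta_k * o["k_lag"] for o in obs]
    # Cubic approximation to E[omega_t | omega_(t-1)]; the residual is nu.
    X = [[1.0, w, w ** 2, w ** 3] for w in omega_lag]
    z = ols(X, omega)
    nu = [w - sum(c * x for c, x in zip(z, row)) for w, row in zip(omega, X)]
    n = len(obs)
    return (sum(v * o["k"] for v, o in zip(nu, obs)) / n,
            sum(v * o["l_lag"] for v, o in zip(nu, obs)) / n)


def gmm_objective(beta_l, beta_k, obs):
    """Sum of squared sample moments (identity weighting, an assumption)."""
    m1, m2 = moments(beta_l, beta_k, obs)
    return m1 ** 2 + m2 ** 2


# Simulated data with illustrative coefficients (not estimates from the paper).
random.seed(0)
TRUE_BL, TRUE_BK = 0.6, 0.3
obs = []
for _ in range(200):
    w_lag = random.uniform(0.0, 1.0)
    w = 0.1 + 0.8 * w_lag  # Markov process with the innovation switched off
    l, k, l_lag, k_lag = (random.uniform(0.0, 1.0) for _ in range(4))
    obs.append({"va": TRUE_BL * l + TRUE_BK * k + w, "l": l, "k": k,
                "va_lag": TRUE_BL * l_lag + TRUE_BK * k_lag + w_lag,
                "l_lag": l_lag, "k_lag": k_lag})
print(gmm_objective(TRUE_BL, TRUE_BK, obs))        # close to zero by construction
print(gmm_objective(TRUE_BL, TRUE_BK + 0.2, obs))  # larger away from the true values
```

In an actual application the objective would be passed to a numerical minimizer over (βl , βk ), and the simulated observations would be replaced by the firm-level panel.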
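Finally, given estimates β̂k , β̂l , the estimate of TFP, $\hat{\tilde{\omega}}_{jt}$, is given by
\[
\hat{\tilde{\omega}}_{jt} = \widetilde{va}_{jt} - \hat{\beta}_l l_{jt} - \hat{\beta}_k k_{jt} - \sum_{t=1}^{T} \hat{\beta}_{st} D_{st}.
\]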
5.2 Conditional Pareto distribution
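Consider a random variable φ that has a Pareto distribution:
\[
f_{\Phi}(\phi) =
\begin{cases}
\alpha \dfrac{\phi_m^{\alpha}}{\phi^{\alpha + 1}}, & \text{for } \phi \geq \phi_m, \\
0, & \text{for } \phi < \phi_m,
\end{cases}
\]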
where φm > 0 is a scale parameter and α > 0 is a shape parameter.
The cumulative distribution function of φ is
\[
F_{\Phi}(\phi) =
\begin{cases}
1 - \left( \dfrac{\phi_m}{\phi} \right)^{\alpha}, & \text{for } \phi \geq \phi_m, \\
0, & \text{for } \phi < \phi_m.
\end{cases}
\]
Hence, the conditional probability distribution of φ, given that φ ≥ φ̃, φ̃ > φm , is
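\[
f(\phi \mid \phi \geq \tilde{\phi}) = \frac{f(\phi)}{1 - F(\tilde{\phi})}
= \frac{\alpha \dfrac{\phi_m^{\alpha}}{\phi^{\alpha + 1}}}{\left( \dfrac{\phi_m}{\tilde{\phi}} \right)^{\alpha}}
= \alpha \frac{\tilde{\phi}^{\alpha}}{\phi^{\alpha + 1}},
\]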
for φ ≥ φ̃, and 0 otherwise. That is, the conditional probability distribution of φ,
given that φ ≥ φ̃, φ̃ > φm , is a Pareto distribution with shape parameter α and scale
parameter φ̃.
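This result is easy to check numerically. The short Python sketch below evaluates f(φ)/(1 − F(φ̃)) and compares it with the density of a Pareto distribution whose scale parameter is the cutoff φ̃. The parameter values and function names are illustrative assumptions, not values used elsewhere in the paper.

```python
def pareto_pdf(phi, scale, shape):
    """Density of a Pareto distribution with the given scale and shape."""
    return shape * scale ** shape / phi ** (shape + 1) if phi >= scale else 0.0


def pareto_cdf(phi, scale, shape):
    """Cumulative distribution function of the same Pareto distribution."""
    return 1.0 - (scale / phi) ** shape if phi >= scale else 0.0


def conditional_pdf(phi, cutoff, scale, shape):
    """Density of phi given phi >= cutoff, computed as f(phi) / (1 - F(cutoff))."""
    if phi < cutoff:
        return 0.0
    return pareto_pdf(phi, scale, shape) / (1.0 - pareto_cdf(cutoff, scale, shape))


# Illustrative parameter values (assumptions for this example only).
PHI_M, ALPHA, CUTOFF = 1.0, 2.5, 2.0
for phi in (2.0, 3.0, 4.5, 6.0):
    lhs = conditional_pdf(phi, CUTOFF, PHI_M, ALPHA)
    rhs = pareto_pdf(phi, CUTOFF, ALPHA)  # Pareto with scale equal to the cutoff
    print(phi, lhs, rhs)
```

The two columns agree up to floating-point error, which is the sense in which truncation from below leaves the Pareto shape unchanged.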
5.3 Generating a discretized version of the Pareto distribution
Our method of generating a discretized version of the Pareto distribution with given
scale parameter and shape parameter is an extension of the approach to discretizing
continuous random variables proposed by Tauchen (1986).
Suppose a random variable φ follows a Pareto distribution with scale parameter
φm and shape parameter α. We would like to construct its discrete representation
by first building a grid, that is, a finite collection of equally spaced points on the real line,
and then calculating the weight for each of these points. Fix the desired number of
elements in the grid, I ∈ ℕ⁺. The variable φ has a well-defined lower bound, φm , but
is unbounded from above. We set the upper bound for the discretized version equal
to the 99th percentile of the TFP values observed in the entire dataset of French
firms, which is approximately 6. Denote by φi , i = 1, 2, ..., I, the elements in the grid,
and by ei , i = 1, 2, ..., I + 1, the so-called endpoints, whose purpose will be explained
shortly. We calculate these as follows:
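\[
\begin{aligned}
\phi_I &= 6, \\
\text{gap} &\equiv \frac{\phi_I - \phi_m}{2I - 1}, \\
e_1 &= \phi_m, \\
\phi_1 &= e_1 + \text{gap}, \\
e_2 &= \phi_1 + \text{gap}, \\
&\;\;\vdots \\
\phi_i &= e_i + \text{gap}, \quad i = 3, \ldots, I, \\
e_i &= \phi_{i-1} + \text{gap}, \quad i = 3, \ldots, I + 1.
\end{aligned}
\]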
That is, each grid point φi is contained in the interval (ei , ei+1 ), at equal distance
from both endpoints ei and ei+1 . We calculate the weight associated with each grid
point φi , i = 1, ..., I, as
\[
d(\phi_i) = \frac{F_{\phi}(e_{i+1}) - F_{\phi}(e_i)}{F_{\phi}(e_{I+1}) - F_{\phi}(e_1)},
\]
where Fφ is the cumulative Pareto distribution function with scale parameter φm and shape
parameter α, and we divide the numerator Fφ (ei+1 ) − Fφ (ei ) by the term Fφ (eI+1 ) −
Fφ (e1 ) to ensure that all the weights sum to 1:
\[
\sum_{i=1}^{I} d(\phi_i) = \sum_{i=1}^{I} \frac{F_{\phi}(e_{i+1}) - F_{\phi}(e_i)}{F_{\phi}(e_{I+1}) - F_{\phi}(e_1)}
= \frac{F_{\phi}(e_{I+1}) - F_{\phi}(e_1)}{F_{\phi}(e_{I+1}) - F_{\phi}(e_1)} = 1.
\]
The discretized version of the distribution of the random variable φ then is represented by the grid φi , i = 1, ..., I and the probability mass function d(φ).
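The construction above can be sketched in Python as follows. The function name `discretize_pareto` and the parameter values in the usage lines are illustrative assumptions; the text fixes only the largest grid point at 6.

```python
def discretize_pareto(scale, shape, n_points, upper=6.0):
    """Grid points, endpoints and weights for a discretized Pareto distribution.

    scale    : phi_m, the lower bound of the support
    shape    : alpha, the shape parameter
    n_points : I, the number of grid points
    upper    : phi_I, the largest grid point (6 in the text)
    """
    if n_points < 1 or upper <= scale:
        raise ValueError("need n_points >= 1 and upper > scale")
    gap = (upper - scale) / (2 * n_points - 1)
    # Unrolling the recursion: phi_i = phi_m + (2i - 1) gap, e_i = phi_m + 2(i - 1) gap.
    grid = [scale + (2 * i - 1) * gap for i in range(1, n_points + 1)]
    ends = [scale + 2 * (i - 1) * gap for i in range(1, n_points + 2)]

    def cdf(x):
        return 1.0 - (scale / x) ** shape if x >= scale else 0.0

    # Normalize by the mass between the first and last endpoints.
    total = cdf(ends[-1]) - cdf(ends[0])
    weights = [(cdf(ends[i + 1]) - cdf(ends[i])) / total for i in range(n_points)]
    return grid, ends, weights


# Illustrative parameters (assumptions for this example only).
grid, ends, weights = discretize_pareto(scale=1.0, shape=2.5, n_points=50)
print(grid[0], grid[-1], sum(weights))
```

Because the Pareto density is decreasing and the intervals have equal width, the resulting weights decline monotonically along the grid, which mirrors the shape of the unconditional curves in Figures 9 to 11.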
6 References
Ackerberg, D., K. Caves and G. Frazer, ‘Structural Identification of Production Functions’, Econometrica R&R, 2006.
Akhmetova, Z. and C. Mitaritonna, ‘A Model of Firm Experimentation under
Demand Uncertainty with an Application to Multidestination Exporters’, Working
Paper, University of New South Wales, 2013.
Albornoz, F., H. F. C. Pardo, G. Corcos and E. Ornelas, ‘Sequential Exporting’,
Journal of International Economics 88 (2012), 17-31.
Arkolakis, C., ‘Market Penetration Costs and the New Consumers Margin in International Trade’, Journal of Political Economy 118 (2010), 1151-1199.
Axtell, R. L., ‘Zipf Distribution of U.S. Firm Sizes’, Science 293 (2001), 1818-1820.
Chaney, T., ‘Distorted Gravity: The Intensive and Extensive Margins of International Trade’, American Economic Review 98 (2008), 1707-1721.
De Loecker, J., ‘Product Differentiation, Multi-Product Firms and Estimating the
Impact of Trade Liberalization on Productivity’, Econometrica 79 (2011), 1407-1451.
Eaton, J., M. Eslava, C.J. Krizan, M. Kugler and J. Tybout, ‘A Search and
Learning Model of Export Dynamics’, Working Paper, 2008.
Eaton, J., S. Kortum and F. Kramarz, ‘An Anatomy of International Trade:
Evidence from French Firms’, Econometrica 79 (2011), 1453-1498.
Helpman, E., M.J. Melitz and Y. Rubinstein, ‘Estimating Trade Flows: Trading
Partners and Trading Volumes’, The Quarterly Journal of Economics 123 (2008),
441-487.
Helpman, E., M.J. Melitz and S.R. Yeaple, ‘Export Versus FDI with Heterogeneous Firms’, American Economic Review 94 (2004), 300-316.
Ijiri, Y. and H.A. Simon, ‘Business Firm Growth and Size’, American Economic
Review 54 (1964), 77-89.
Levinsohn, J. and M.J. Melitz, ‘Productivity in a Differentiated Products Market
Equilibrium’, mimeo, Harvard University, 2002.
Levinsohn, J. and A. Petrin, ‘Estimating Production Functions Using Inputs to
Control for Unobservables’, Review of Economic Studies 70 (2003), 317-342.
Luttmer, E., ‘Selection, Growth, and the Size Distribution of Firms’, Quarterly
Journal of Economics 122 (2007), 1103-1144.
Melitz, M. J., ‘The Impact of Trade on Intra-Industry Reallocations and Aggregate
Industry Productivity’, Econometrica 71 (2003), 1695-1725.
Melitz, M. J. and G. I. P. Ottaviano, ‘Market Size, Trade, and Productivity’,
Review of Economic Studies 75 (2008), 295-316.
Olley, S. and A. Pakes, ‘The Dynamics of Productivity in the Telecommunications
Equipment Industry’, Econometrica 64 (1996), 1263-1295.
Rauch, J. E. and J. Watson, ‘Starting Small in an Unfamiliar Environment’, International Journal of Industrial Organization 21 (2003), 1021-1042.
Rust, J., ‘Optimal Replacement of GMC Bus Engines: An Empirical Model of
Harold Zurcher’, Econometrica 55 (1987), 999-1033.
Simon, H. A., ‘On a Class of Skew Distribution Functions’, Biometrika 42 (1955),
425-440.
Simon, H. A. and C. P. Bonini, ‘The Size Distribution of Business Firms’, American Economic Review 48 (1958), 607-617.
Steindl, J., Random Processes and the Growth of Firms; A Study of the Pareto
Law, New York, NY: Hafner Publishing Company, 1965.
Tauchen, G., ‘Finite State Markov-Chain Approximations to Univariate and Vector
Autoregressions’, Economics Letters 20 (1986), 177-181.