Reconciling the assumption of Pareto distribution of firm productivities with the exporter data

Zhanar Akhmetova∗
Department of Economics, Australian School of Business, University of New South Wales, Sydney, NSW 2052, Australia

Cristina Mitaritonna
Centre d’Etudes Prospectives et d’Informations Internationales (CEPII), 113 rue de Grenelle, 75007 Paris, France

∗ Corresponding author. Telephone: +61 2 93854965; E-mail: [email protected].

January 8, 2014

Abstract. We propose a stochastic formulation of the Melitz model with a sunk entry cost and a Pareto distribution of firm productivities. Payoffs of a firm are subject to random time-varying shocks. As a result, the distribution of firm productivities conditional on exporting is no longer Pareto, but a hump-shaped distribution with a heavy right tail. This fits well the observed shape of the distribution of productivities of exporting firms.

Keywords: Heterogeneous Producers, Pareto Distribution, Melitz model, Random Shocks.

JEL classification: F12, F14, L11.

1 Introduction

The Pareto distribution has been used widely to model the cross-sectional distribution of firm productivity or size in the recent international trade literature. In the heterogeneous firm models of Helpman et al. (2004), Helpman et al. (2008), Melitz and Ottaviano (2008), Chaney (2008), Eaton et al. (2011), all of which build on the seminal paper by Melitz (2003), firm productivities are assumed to follow a Pareto distribution. A Pareto distribution is characterised by two parameters: a scale parameter and a shape parameter. The probability density function of a random variable with a Pareto distribution is plotted in Figure (1) for various values of the shape parameter. The main justification for this assumption, apart from its analytical convenience, is derived from the works of Simon (1955), Simon and Bonini (1958), Ijiri and Simon (1964), Steindl (1965), Axtell (2001), Luttmer (2007).
Simon (1955) shows that some stochastic processes in dynamic models can result in a class of highly skewed distributions, particularly Pareto, in the steady state. Simon and Bonini (1958), Ijiri and Simon (1964) and Steindl (1965) have all indicated that the Pareto distribution provides a good approximation of the distribution of firm sizes. Most recently, Axtell (2001) demonstrated this using 1997 data on the entire population of firms in the US. Luttmer (2007) presents a stochastic model of growth where the stationary firm size distribution is a Pareto distribution.

However, when we consider the productivities in the dataset of French exporters between 1995 and 2007, we obtain distributions that do not look like Pareto, at least not for low values of productivity, a fact that has not been emphasised in the international trade literature. The dataset provides information on export values and quantities of all exporters by country and 8-digit-code product (according to the European CN8 classification, which is an 8-digit extension of the HS6 classification system, analogous to the ten-digit extensions (HS10) employed by the US). It is merged with the domestic balance sheets of the firms, which allows us to measure total factor productivity (TFP) of each firm à la Ackerberg et al. (2006). Our procedure is described in the Appendix.

We consider all exporters (firms that exported at least once between 1995 and 2007) of a given 8-digit-code product to a given destination. In order to obtain comparable productivity values across firms exporting the same product in the presence of multi-product firms, we locate the 2-digit industrial sector that dominates the exports of the given 8-digit-code product, and consider only firms that declare this sector as their primary industry.
For example, most exporters of perfumes (8-digit code 33030010), shampoos (8-digit code 33051000), soaps (8-digit code 34011100) and perfumed bath salts (8-digit code 33073000) declare themselves as belonging to the 2-digit industry 24 (production of soaps, perfumes, cleaning products and other chemical products). We therefore take into account only exporters that belong to this sector when plotting the distribution of productivities of exporters of these 8-digit-code products. These plots are presented in Figure (2), for selected countries (Japan and USA). Distributions of productivities of exporters of other products and to other destinations look similar.

Figure 1: Plot of probability density functions for a random variable with Pareto distribution, for various values of the shape parameter. [Curves: shape parameter = 3, 4 and 10, each with scale parameter = 1. Horizontal axis: values of random variable; vertical axis: probability density function.]

Figure 2: Kernel density estimates of the distributions of exporting firm productivities (TFP). [Panels: exporters of soap to Japan; exporters of bath salts to Japan; exporters of shampoos to the US; exporters of perfumes to the US. Horizontal axis: TFP; vertical axis: density. Epanechnikov kernel; panel bandwidths 0.3317, 0.2953, 0.5009 and 0.3708.]

It is shown in what follows that in a standard Melitz (Melitz, 2003) model with heterogeneous firms whose productivities are drawn from a Pareto distribution, and with a sunk entry cost, the distribution of productivities of firms that export is also Pareto.
We extend this model, which we call ‘deterministic’, by introducing random time-varying shocks to the payoffs of the firm from every action (exporting and not exporting). We interpret these shocks as variation in the costs of the firm. They are firm-specific and independent of firm productivity, which is time-invariant. Under certain assumptions (Rust, 1987), one can calculate the probabilities that a firm chooses exporting or not exporting, given its productivity and past history (whether it has exported before or not). We show how we derive the stationary distribution of this model and demonstrate that for a sufficiently high sunk entry cost the cross-sectional distribution of firm productivities conditional on exporting has the same shape as that shown in Figure (2): increasing for low values of productivities and declining after some point, with a heavy right tail.

This happens because, unlike in the deterministic model, random shocks to firm payoffs imply that even low-productivity firms (firms that would not export at all in a deterministic model) may pay the sunk entry cost in a period when they receive a very favourable shock to their exporting payoff. These firms might then continue exporting afterwards, since they have already invested in the entry cost. Thus, firms with low productivities will be present in the exporter dataset with non-zero probability. The higher the productivity of a firm, the higher the probability that it exports in any given period. This positive relationship interacts with the unconditional Pareto distribution of firm productivities, which implies an inverse relationship between firm productivity and its prevalence in the firm population, to generate the hump-shaped distribution of productivities conditional on exporting that is observed in the data.

This paper is distinct from the literature explaining the presence of small exporters in the exporter dataset.
Arkolakis (2010) introduces a costly marketing technology, which allows firms to choose how many consumers to access in a market, into a model of trade with product differentiation and firm productivity heterogeneity. This helps explain why we observe so many firms exporting low volumes. Works on learning under demand uncertainty, e.g. Akhmetova and Mitaritonna (2013), Albornoz et al. (2012), Eaton et al. (2008), introduce an assumption of uncertainty about demand in export markets, and allow firms to learn about demand from observed sales. This framework can help justify the low sales volumes of new exporters, and in the case of test marketing (experimentation in the market before investing in the sunk entry cost, as in Akhmetova and Mitaritonna, 2013) can explain the presence of low-productivity firms among exporters. However, all of these assumptions only lead to a downward shift in the threshold on productivity beyond which firms choose to export. For any given level of productivity, all firms either export or not, with probability 1. The distribution of productivities conditional on exporting is still a Pareto in these models, assuming a Pareto distribution of firm productivities in the population. Instead, our work introduces idiosyncratic random shocks to payoffs of the firm. For sufficiently high values of the sunk entry cost, these shocks produce a positive relationship between firm productivity and the probability of exporting, which is strictly monotonic on a certain range and takes values between 0 and 1. As a result, the hump-shaped distribution of firm productivities conditional on exporting emerges.

The rest of the paper is organised as follows. Section 2 analyses the conditional distribution of productivities of exporting firms in a deterministic Melitz model. We present the stochastic formulation of this model in Section 3 and derive the stationary distribution of productivities of exporting firms. Section 4 concludes.
2 The Deterministic Model

2.1 Consumer Preferences

We study the optimal behavior of firms at Home producing varieties of a differentiated good and wishing to sell in the Foreign market. Time is discrete. Normalize the size of the market to 1. The representative consumer has CES preferences over the varieties of the differentiated good, so that his/her utility from consuming quantities $q_{jt}$ at time $t$ is given by

$$U_t = \left[ \int_j (q_{jt})^{\frac{\varepsilon-1}{\varepsilon}} \, dj \right]^{\frac{\varepsilon}{\varepsilon-1}},$$

where $j$ denotes varieties and $\varepsilon > 1$ is the elasticity of substitution. Utility-maximizing quantity demanded of a variety $j$ by an individual consumer and in the market as a whole is given by

$$q_{jt} = Q_t \left( \frac{h_{jt}}{P_t} \right)^{-\varepsilon} = y_t (h_{jt})^{-\varepsilon} (P_t)^{\varepsilon-1},$$

where $h_{jt}$ is the price of variety $j$, $P_t$ is the aggregate price index for the differentiated good, $P_t = \left[ \int_j (h_{jt})^{1-\varepsilon} \, dj \right]^{\frac{1}{1-\varepsilon}}$, $y_t$ denotes the total expenditures on the differentiated good, and $Q_t$ is the total consumption of the differentiated good (so that $Q_t P_t = y_t$, i.e. $Q_t = y_t / P_t$).

2.2 Firm’s Problem

Each firm produces one variety. We index firms with $j$, and assume that there is a continuum of firms in the domestic economy, $j \in [0, 1]$. Firm $j$ produces variety $j$. Labor is the only factor of production, with the constant marginal cost of production of firm $j$ given by the ratio of wages, $w_t$, and firm $j$'s productivity, $\varphi_{jt}$. Gross profits from exporting to the market at any time $t$ are given by

$$\pi_{jt} = q_{jt} h_{jt} - q_{jt} \frac{w_t}{\varphi_{jt}} = y_t (h_{jt})^{-\varepsilon} (P_t)^{\varepsilon-1} \left( h_{jt} - \frac{w_t}{\varphi_{jt}} \right).$$

To maximize these profits, the firm sets the price as a constant mark-up over the marginal cost:

$$h_{jt} = \frac{\varepsilon}{\varepsilon-1} \frac{w_t}{\varphi_{jt}}.$$

Given optimal prices, we can re-write the aggregate price index as

$$P_t = \left[ \int_j (h_{jt})^{1-\varepsilon} \, dj \right]^{\frac{1}{1-\varepsilon}} = \left[ \int_j \left( \frac{\varepsilon}{\varepsilon-1} \frac{w_t}{\varphi_{jt}} \right)^{1-\varepsilon} dj \right]^{\frac{1}{1-\varepsilon}}.$$

In what follows, we drop the index $j$ for simplicity of notation, and consider a stationary equilibrium, where all aggregate variables and firms’ productivities take constant values, so that for a firm with productivity $\varphi$

$$\pi_t = \pi, \ \forall t, \qquad \pi \equiv \frac{\varphi^{\varepsilon-1}}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1}. \tag{1}$$

The firm has to pay a sunk entry cost $F$ to access the foreign market. The firm is infinitely lived, but every period there is a probability $1 - \delta$, $\delta \in (0, 1)$, of an exogenous death shock. The firm may choose to export or not to export in any period. If the firm has paid the sunk entry cost by time $t$, and it does not export at time $t$, it will still be able to export at time $t + 1$.

2.3 Solution of the Problem of the Firm

The Bellman equation for the value function is as follows:

$$V(\Delta_t) = \max_{i_t = 0, 1} V^c(\Delta_t, i_t),$$

where

$$i_t = \begin{cases} 1, & \text{if the firm exports at time } t, \\ 0, & \text{otherwise,} \end{cases} \qquad \Delta_t = \begin{cases} 1, & \text{if the firm has incurred the cost } F \text{ by time } t, \\ 0, & \text{otherwise,} \end{cases}$$

$$V^c(\Delta_t, 0) = \delta V(\Delta_{t+1}), \ \forall \Delta_t,$$

is the choice-specific value function for the decision to not export,

$$V^c(\Delta_t, 1) = \begin{cases} \pi + \delta V(\Delta_{t+1}), & \text{if } \Delta_t = 1, \\ \pi + \delta V(\Delta_{t+1}) - F, & \text{if } \Delta_t = 0, \end{cases}$$

is the choice-specific value function for the decision to export, profits $\pi$ of a firm with productivity $\varphi$ from exporting in a stationary equilibrium are given by (1), and $\Delta_t$ evolves according to

$$\Delta_{t+1} = \begin{cases} 1, & \text{if } \Delta_t = 1 \text{ or } i_t = 1, \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$

From (2) we obtain

$$V^c(\Delta_t, 1) = \begin{cases} \pi + \delta V(1), & \text{if } \Delta_t = 1, \\ \pi + \delta V(1) - F, & \text{if } \Delta_t = 0, \end{cases} \tag{3}$$

and

$$V(1) = \max\{V^c(1, 0), V^c(1, 1)\} = \max\{\delta V(1), \pi + \delta V(1)\}.$$

Since $\pi > 0$ for all positive $\varphi$, so that $\pi + \delta V(1) > \delta V(1)$, we get

$$V(1) = \pi + \delta V(1), \qquad V(1) = \frac{\pi}{1-\delta}.$$

Hence,

$$V(0) = \max_{i_t = 0, 1} V^c(0, i_t) = \max\{\delta V(0), \pi + \delta V(1) - F\} = \max\left\{\delta V(0), \pi + \delta \frac{\pi}{1-\delta} - F\right\} = \max\left\{\delta V(0), \frac{\pi}{1-\delta} - F\right\},$$

so that

$$V(0) = \begin{cases} 0, & \text{if } \frac{\pi}{1-\delta} - F < 0, \\ \frac{\pi}{1-\delta} - F, & \text{if } \frac{\pi}{1-\delta} - F \geq 0. \end{cases}$$

To summarise, the optimal policy choice is given by

$$i_t(\Delta_t) = \begin{cases} 0, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F < 0, \\ 1, & \text{if } \Delta_t = 1 \text{ or } \frac{\pi}{1-\delta} - F \geq 0, \end{cases} \tag{4}$$

and the value function is

$$V(\Delta_t) = \begin{cases} 0, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F < 0, \\ \frac{\pi}{1-\delta} - F, & \text{if } \Delta_t = 0 \text{ and } \frac{\pi}{1-\delta} - F \geq 0, \\ \frac{\pi}{1-\delta}, & \text{if } \Delta_t = 1. \end{cases} \tag{5}$$
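The policy (4) and value function (5) can be checked numerically against the Bellman equation. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the parameter values are illustrative choices made here rather than a calibration.

```python
def export_profit(phi, eps, y, P, w):
    """Per-period profit from exporting, equation (1)."""
    return (phi ** (eps - 1) / (eps - 1)) * (eps / (eps - 1)) ** (-eps) * y * (P / w) ** (eps - 1)


def optimal_export_choice(entered, pi, F, delta):
    """Policy (4); entered is 1 if the sunk cost F has already been paid, else 0."""
    return 1 if entered == 1 or pi / (1 - delta) - F >= 0 else 0


def value(entered, pi, F, delta):
    """Value function (5)."""
    present_value = pi / (1 - delta)
    if entered == 1:
        return present_value
    return max(0.0, present_value - F)


# Illustrative parameter values (assumed for this sketch only).
eps, y, P, w, delta, F = 4.0, 1.0, 1.0, 1.0, 0.9, 10.0
for phi in (1.5, 2.5):
    pi = export_profit(phi, eps, y, P, w)
    v0, v1 = value(0, pi, F, delta), value(1, pi, F, delta)
    # Right-hand side of the Bellman equation for a firm that has not entered.
    bellman_rhs = max(delta * v0, pi + delta * v1 - F)
    print(phi, optimal_export_choice(0, pi, F, delta), v0, bellman_rhs)
```

Under these assumed values the less productive firm stays out and the more productive firm enters, and in both cases $V(0)$ coincides with the right-hand side of the Bellman equation.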
To find the cutoff on productivity $\varphi$ such that firms with productivity above this cutoff export, and all firms with lower productivity do not export, we solve

$$\frac{1}{1-\delta} \frac{\varphi^{\varepsilon-1}}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1} - F = 0.$$

Denote this cutoff by $\tilde{\varphi}$:

$$\tilde{\varphi} \equiv \frac{w}{P} \left[ \frac{F (1-\delta)(\varepsilon-1)}{y} \right]^{\frac{1}{\varepsilon-1}} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{\frac{\varepsilon}{\varepsilon-1}}.$$

We illustrate the probability of exporting conditional on not having exported before over the values of productivity $\varphi$ for a hypothetical value of $\tilde{\varphi} = 1.3964$ in Figure (3). Notice that it is a piecewise constant function, equal to 0 for $\varphi < \tilde{\varphi}$ and equal to 1 for $\varphi \geq \tilde{\varphi}$.

Figure 3: Probability of exporting conditional on not having exported before over the values of productivity $\varphi$ in the deterministic model. [Horizontal axis: values of productivity; vertical axis: probability of exporting conditional on not having exported before.]

2.4 The distribution of productivity conditional on exporting in the deterministic model

Assume that productivity $\varphi$ follows a Pareto distribution with probability density function

$$f_\Phi(\varphi) = \begin{cases} \alpha \dfrac{\varphi_m^{\alpha}}{\varphi^{\alpha+1}}, & \text{for } \varphi \geq \varphi_m, \\ 0, & \text{for } \varphi < \varphi_m, \end{cases}$$

where $\varphi_m > 0$ is a scale parameter (the lower bound on the support of $\varphi$), and $\alpha > 0$ is a shape parameter. This density function is illustrated in Figure (4) for $\varphi_m = 1$ and $\alpha = 3$. On the same graph, we illustrate the probability density function conditional on $\varphi \geq \tilde{\varphi}$, where $\tilde{\varphi} = 1.3964$ is a hypothetical threshold on productivity for exporting. The conditional distribution is also a Pareto distribution, with a different lower bound. In general, the conditional probability distribution of a Pareto-distributed random variable, given that it is greater than or equal to a particular number $\tilde{\varphi}$ exceeding $\varphi_m$, is a Pareto distribution with the same shape parameter $\alpha$, but with scale parameter $\tilde{\varphi}$ instead of $\varphi_m$. This is proven in the Appendix. Thus, the distribution of productivities of exporting firms, that is firms with productivity $\varphi$ above the threshold $\tilde{\varphi}$, is also a Pareto distribution.
As can be seen in Figure (2), this is not an accurate depiction of reality.

Figure 4: The unconditional and conditional Pareto distributions. [Horizontal axis: values of productivity; vertical axis: probability density function.]

3 Stochastic formulation of the model

Let us introduce random shocks to the payoffs of the firm. Consider the same problem as above, only now firm payoffs have an additional component that is random, time-varying and firm-specific. The Bellman equation for the value function is as follows:

$$V(\Delta_t) = \max_{i_t = 0, 1} V^c(\Delta_t, i_t),$$

$$V^c(\Delta_t, 0) = \epsilon_{0t} + \delta V(\Delta_{t+1}), \ \forall \Delta_t,$$

is the choice-specific value function for the decision to not export, $\epsilon_{0t}$ is the random shock to the payoff from not exporting,

$$V^c(\Delta_t, 1) = \begin{cases} \pi + \epsilon_{1t} + \delta V(\Delta_{t+1}), & \text{if } \Delta_t = 1, \\ \pi + \epsilon_{1t} + \delta V(\Delta_{t+1}) - F, & \text{if } \Delta_t = 0, \end{cases}$$

is the choice-specific value function for the decision to export, and $\epsilon_{1t}$ is the random shock to the payoff from exporting. All the other variables are as defined above, and in particular $\Delta_t$ evolves according to (2).

Following the seminal paper of Rust (1987), we assume that $\epsilon_t \equiv (\epsilon_{0t}, \epsilon_{1t})$ is distributed i.i.d. (across choices and periods) and follows a bivariate Type I extreme value distribution, with mean normalised to $(0, 0)$ and variance normalised to $(\pi^2/6, \pi^2/6)$, where in this variance expression $\pi$ denotes the mathematical constant rather than profits. The conditional independence assumption (Rust, 1987) holds due to the deterministic nature of the state variable $\Delta_t$.
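The practical payoff of the Type I extreme value assumption is that the expected maximum over choices has a closed form: for mean-zero shocks, $E[\max_j (v_j + \epsilon_j)] = \ln \sum_j \exp(v_j)$. The sketch below checks this by simulation. It is an illustration only; the payoff values, the number of draws and the function names are assumptions made here.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel distribution


def draw_shock(rng):
    """One Type I extreme value draw, shifted to mean zero (variance pi^2/6)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u)) - EULER_GAMMA


def simulated_expected_max(payoffs, n_draws, seed=0):
    """Monte Carlo estimate of E[max_j (payoff_j + shock_j)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += max(v + draw_shock(rng) for v in payoffs)
    return total / n_draws


def log_sum_exp(payoffs):
    """Closed form ln(sum_j exp(payoff_j)), computed stably."""
    m = max(payoffs)
    return m + math.log(sum(math.exp(v - m) for v in payoffs))


payoffs = [0.5, 1.2]  # hypothetical deterministic payoffs of the two actions
print(simulated_expected_max(payoffs, 200_000), log_sum_exp(payoffs))
```

The two printed numbers should agree up to simulation noise.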
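Under these conditions, we can define and evaluate the expected value function $\widetilde{EV}(\Delta_t, i)$ that has as its arguments the state variable $\Delta_t$ and the choice variable $i$:

$$\begin{aligned}
\widetilde{EV}(\Delta_t, i) &\equiv E_{\Delta_{t+1}, \epsilon_{t+1} | \Delta_t, i} \left\{ \max_j \left[ Payoff(\Delta_{t+1}, j) + \epsilon_{j(t+1)} \right] \right\} \\
&= E_{\Delta_{t+1} | \Delta_t, i} \, E_{\epsilon_{t+1} | \Delta_{t+1}, \Delta_t, i} \left\{ \max_j \left[ Payoff(\Delta_{t+1}, j) + \epsilon_{j(t+1)} \right] \right\} \\
&= E_{\Delta_{t+1} | \Delta_t, i} \ln \left\{ \sum_j \exp \left[ Payoff(\Delta_{t+1}, j) \right] \right\} \\
&= \sum_{\Delta_{t+1}} \ln \left\{ \sum_j \exp \left[ Payoff(\Delta_{t+1}, j) \right] \right\} Tr(\Delta_{t+1} | \Delta_t, i),
\end{aligned}$$

where

$$Payoff(\Delta_{t+1}, j) = \begin{cases} \delta \widetilde{EV}(\Delta_{t+1}, 0), & \text{if } j = 0, \\ \pi + \delta \widetilde{EV}(1, 1), & \text{if } j = 1, \ \Delta_{t+1} = 1, \\ \pi + \delta \widetilde{EV}(0, 1) - F, & \text{if } j = 1, \ \Delta_{t+1} = 0. \end{cases} \tag{6}$$

$E_{x|y}$ denotes the expected value with respect to variable $x$ conditional on variable $y$, and $Tr(\Delta_{t+1} | \Delta_t, i)$ denotes the transition probability of the state variable $\Delta_t$:

$$Tr(\Delta_{t+1} | \Delta_t, i) = \begin{cases} 0, & \text{if } i = 0, \ \Delta_t = 0, \ \Delta_{t+1} = 1, \\ 1, & \text{if } i = 1 \text{ or } \Delta_t = 1, \ \Delta_{t+1} = 1, \\ 0, & \text{if } i = 1 \text{ or } \Delta_t = 1, \ \Delta_{t+1} = 0, \\ 1, & \text{if } i = 0, \ \Delta_t = 0, \ \Delta_{t+1} = 0. \end{cases} \tag{7}$$

Given (6) and (7), we can write out the values $\widetilde{EV}(0, 0)$, $\widetilde{EV}(0, 1)$, $\widetilde{EV}(1, 0)$, $\widetilde{EV}(1, 1)$ as follows:

$$\widetilde{EV}(0, 0) = \ln\left(\exp(\delta \widetilde{EV}(0, 0)) + \exp(\pi + \delta \widetilde{EV}(0, 1) - F)\right), \tag{8}$$

$$\widetilde{EV}(0, 1) = \ln\left(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))\right), \tag{9}$$

$$\widetilde{EV}(1, 0) = \ln\left(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))\right), \tag{10}$$

$$\widetilde{EV}(1, 1) = \ln\left(\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))\right). \tag{11}$$

It is evident that

$$\widetilde{EV}(0, 1) = \widetilde{EV}(1, 0) = \widetilde{EV}(1, 1).$$

Denote $x \equiv \widetilde{EV}(1, 0) = \widetilde{EV}(1, 1)$. Then from (10) and (11)

$$\begin{aligned}
x &= \ln(\exp(\delta x) + \exp(\pi + \delta x)), \\
\exp(x) &= \exp(\delta x) + \exp(\pi + \delta x), \\
\exp(x) &= (\exp(x))^{\delta} + \exp(\pi)(\exp(x))^{\delta}, \\
x &= \frac{\ln(1 + \exp(\pi))}{1-\delta}.
\end{aligned} \tag{12}$$

Next, denote $y \equiv \widetilde{EV}(0, 0)$. From (8)

$$y = \ln(\exp(\delta y) + \exp(\pi + \delta x - F)),$$

$$y = \ln\left(\exp(\delta y) + \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right)\right), \tag{13}$$

$$\exp(y) = (\exp(y))^{\delta} + \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right),$$

$$\exp(y) - (\exp(y))^{\delta} = \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right).$$

We simplify this equation further by denoting $z \equiv \exp(y)$, $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right)$, and obtain

$$z - z^{\delta} = C. \tag{14}$$

The left-hand-side function of $z$, $g(z) \equiv z - z^{\delta}$, is graphed in Figure (5) for $\delta = 0.9$.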
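Equation (14) generally has to be solved numerically for $z$. The sketch below uses bisection, which is a choice made here for illustration (the source does not state which solver was used) and which relies on $g$ being increasing on $z \geq 1$. The function names are hypothetical, and the values of $C$ are the ones used for illustration in Figure (6).

```python
import math


def g(z, delta):
    """Left-hand side of equation (14)."""
    return z - z ** delta


def solve_z(C, delta):
    """Solve g(z) = C for z >= 1 by bisection, for C > 0 and 0 < delta < 1."""
    lo, hi = 1.0, 2.0
    while g(hi, delta) < C:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid, delta) < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


delta = 0.9
for C in (10.0, 20.0, 30.0):
    z = solve_z(C, delta)
    # y = ln(z) is the corresponding value of EV(0, 0) in equation (13).
    print(C, z, math.log(z))
```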
In general, for all $\delta \in (0, 1)$, $g$ is a non-negative and strictly increasing function of $z$, with values $g(z) \in [0, \infty)$ over $z \in [1, \infty)$ (and $z \in [1, \infty)$ as long as $\widetilde{EV}(0, 0) \geq 0$), and derivative $g'(z) = 1 - \delta z^{\delta-1} > 0$, $\forall z \in [1, \infty)$.

Therefore, for every $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right) \in [0, \infty)$, there exists a unique solution for $z \equiv \exp(\widetilde{EV}(0, 0))$, $z \geq 1$, such that (14), and therefore (13), holds.

Figure 5: Function $g$ of $z$, for $z \geq 1$. [Horizontal axis: values of $z$; vertical axis: function $g$.]

3.1 Comparative statics of the choice probabilities

Once the values $\widetilde{EV}(0, 0)$, $\widetilde{EV}(0, 1)$, $\widetilde{EV}(1, 0)$, $\widetilde{EV}(1, 1)$ are obtained, we can calculate the probability that the firm chooses action $i$, given $\Delta_t$:

$$Prob(i | \Delta_t) = \frac{\exp(Payoff(\Delta_t, i))}{\sum_j \exp(Payoff(\Delta_t, j))}.$$

In particular,

$$Prob(0|1) = \frac{\exp(\delta \widetilde{EV}(1, 0))}{\exp(\delta \widetilde{EV}(1, 0)) + \exp(\pi + \delta \widetilde{EV}(1, 1))} = \frac{\exp(\delta x)}{\exp(\delta x) + \exp(\pi)\exp(\delta x)} = \frac{1}{1 + \exp(\pi)}, \tag{15}$$

$$Prob(1|1) = 1 - Prob(0|1) = \frac{\exp(\pi)}{1 + \exp(\pi)}, \tag{16}$$

where $x$ is given by (12). Similarly,

$$Prob(0|0) = \frac{\exp(\delta \widetilde{EV}(0, 0))}{\exp(\delta \widetilde{EV}(0, 0)) + \exp(\pi + \delta \widetilde{EV}(0, 1) - F)} = \frac{\exp(\delta y)}{\exp(\delta y) + \exp(\pi + \delta x - F)} = \frac{\exp(\delta y)}{\exp(y)} = \exp(y\delta - y),$$

where the third equality follows from (13), and

$$Prob(1|0) = 1 - Prob(0|0) = 1 - \exp(y\delta - y).$$

Notice that from (13)

$$\exp(y) - \exp(y\delta) = \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right),$$

$$\exp(y)(1 - \exp(y\delta - y)) = C,$$

$$z \, Prob(1|0) = C.$$

Hence,

$$Prob(1|0) = \frac{C}{z} = \frac{z - z^{\delta}}{z} = 1 - \frac{1}{z^{1-\delta}}, \tag{17}$$

where $z$ is obtained from (14), and

$$Prob(0|0) = 1 - Prob(1|0). \tag{18}$$

Let us study the comparative statics of these probabilities with respect to the parameters $F$ and $\varphi$.

Proposition 1. As $\varphi$ increases, $Prob(0|1)$ decreases, $Prob(1|1)$ increases, $Prob(0|0)$ decreases, and $Prob(1|0)$ increases.

Proof. This happens because higher $\varphi$ implies higher $\pi \equiv \frac{\varphi^{\varepsilon-1}}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1}$. That higher $\pi$ leads to lower $Prob(0|1)$ and correspondingly higher $Prob(1|1)$ is evident from (15), (16). Higher $\pi$ implies higher $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right)$ and therefore higher $z$, due to the properties of the function $g(z) \equiv z - z^{\delta}$. Different values of $z$ for various values of $C$ are illustrated in Figure (6). Higher $z$ in turn leads to higher $Prob(1|0)$, as is seen from (17), and therefore lower $Prob(0|0)$.

This is expected: as profits from exporting go up, a firm is more likely to export.

Figure 6: The values of $z$ for various values of $C$: $C_1 = 10$, $C_2 = 20$, $C_3 = 30$. As $C$ increases, the corresponding value of $z$ satisfying (14) increases. [Horizontal axis: values of $z$, with $z_1$, $z_2$, $z_3$ marked; vertical axis: function $g$ and values of $C$.]

Proposition 2. As $F$ increases, $Prob(0|1)$ and $Prob(1|1)$ are unaffected, $Prob(0|0)$ increases, and $Prob(1|0)$ decreases.

Proof. It can be seen from (15), (16) that $F$ does not affect $Prob(0|1)$ and $Prob(1|1)$. $F$ has a negative effect on $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right)$ and therefore on $z$. Hence, as $F$ increases, $Prob(1|0)$ decreases and $Prob(0|0)$ increases.

This is intuitive. As the sunk entry cost increases, the behavior of those firms that have already incurred this cost ($\Delta_t = 1$) does not change. For firms that have not invested in $F$ yet, the higher sunk entry cost makes exporting less enticing.

3.2 Stationary distribution

In this dynamic model we need to calculate the stationary equilibrium of the economy to study the distribution of firms conditional on exporting. Assume that a mass $\bar{M}$ of new firms is born every year, where each firm draws its productivity from a Pareto distribution. The new-born firms choose whether to export or not, and the probability of these choices for a firm with productivity $\varphi$ is given by (17), (18), where $z$ is the solution of (14) and $C \equiv \exp\left(\pi + \delta \frac{\ln(1 + \exp(\pi))}{1-\delta} - F\right)$. A fraction $\delta$ of all new-born firms survives to the next period.
Out of those firms that did not export previously, a fraction $Prob(1|0)$ exports and a fraction $Prob(0|0)$ does not, where again these probabilities are given by (17), (18). Out of those firms that exported previously, a fraction $Prob(1|1)$ exports and a fraction $Prob(0|1)$ does not, where $Prob(1|1)$ and $Prob(0|1)$ are given by (16), (15), and $\pi \equiv \frac{\varphi^{\varepsilon-1}}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1}$. Out of all firms, a fraction $\delta$ survives to the following period, and so on.

For any $\nu > 0$, it is possible to find $t \in \mathbb{N}^{+}$ such that the mass of firms from a given cohort (all firms that were born in the same period) surviving after $t$ periods is smaller than $\nu$. We set this $\nu$ very close to 0 (say, $e^{-10}$) and calculate the corresponding $t$ satisfying this property. Denote by $M(t)$ the mass of all firms belonging to the same cohort that survive after $t$ periods. Then we need

$$M(t) = \delta^{t} \bar{M} \leq \nu,$$

$$t \ln \delta \leq \ln \frac{\nu}{\bar{M}},$$

$$t \geq \frac{\ln \frac{\nu}{\bar{M}}}{\ln \delta},$$

where the inequality changes direction because $\ln \delta < 0$. Set

$$\bar{t} \equiv \left\lceil \frac{\ln \frac{\nu}{\bar{M}}}{\ln \delta} \right\rceil,$$

where $\lceil x \rceil$ denotes the least integer greater than or equal to $x$.

We can say that after $\bar{t}$ periods, every cohort of firms dies out completely (if we set $\nu$ arbitrarily close to 0), so when counting all existing firms in any period $T$, we need only to consider the firms that were born at $T$, $T - 1$ (and survived after 1 period), $T - 2$ (and survived after 2 periods), ..., $T - \bar{t}$ (and survived after $\bar{t}$ periods), and no further. Denote by $N$ the mass of all existing firms in any period $T$:

$$N = \sum_{s=0}^{\bar{t}} M(s) = \sum_{s=0}^{\bar{t}} \delta^{s} \bar{M} = \bar{M} \frac{1 - \delta^{\bar{t}+1}}{1-\delta}.$$

In what follows, we will discretize the Pareto distribution of firm productivities, so that every new-born firm has productivity $\varphi_i$ with probability $d(\varphi_i)$, $i = 1, 2, ..., I$, $\sum_{i=1}^{I} d(\varphi_i) = 1$, $I \in \mathbb{N}^{+}$. For every $\varphi_i$, we can calculate the corresponding profits $\pi_i \equiv \frac{\varphi_i^{\varepsilon-1}}{\varepsilon-1} \left[ \frac{\varepsilon}{\varepsilon-1} \right]^{-\varepsilon} y \left[ \frac{P}{w} \right]^{\varepsilon-1}$. This will also give us the probabilities $Prob_i(0|1)$, $Prob_i(1|1)$, $Prob_i(0|0)$, $Prob_i(1|0)$, since all of these probabilities depend on the value of $\pi$.
In particular, we know from the comparative statics results that firms with higher $\varphi_i$ will have higher $Prob_i(1|1)$ and $Prob_i(1|0)$, and lower $Prob_i(0|1)$ and $Prob_i(0|0)$.

Denote by $E(i)$ the total mass of all firms with productivity $\varphi_i$ that export at any time period in the stationary equilibrium. $E(i)$ is composed of all firms with productivity $\varphi_i$ that are born currently and choose to export, all firms with productivity $\varphi_i$ that were born in the last period, survive until now and choose to export, all firms with productivity $\varphi_i$ that were born two periods ago, survive until now and choose to export, and so on, up to the firms with productivity $\varphi_i$ that were born $\bar{t}$ periods ago, survive until now and choose to export:

$$\begin{aligned}
E(i) = {} & d(\varphi_i) \bar{M} \, Prob_i(1|0) \\
& + \delta d(\varphi_i) \bar{M} \left[ Prob_i(1|0) Prob_i(1|1) + Prob_i(0|0) Prob_i(1|0) \right] \\
& + \delta^{2} d(\varphi_i) \bar{M} \left[ Prob_i(0|0) Prob_i(0|0) Prob_i(1|0) + Prob_i(0|0) Prob_i(1|0) Prob_i(1|1) \right. \\
& \qquad \left. + Prob_i(1|0) Prob_i(1|1) Prob_i(1|1) + Prob_i(1|0) Prob_i(0|1) Prob_i(1|1) \right] \\
& + ... \\
& + \delta^{\bar{t}} d(\varphi_i) \bar{M} \left[ Prob_i(0|0)^{\bar{t}} Prob_i(1|0) + Prob_i(0|0)^{\bar{t}-1} Prob_i(1|0) Prob_i(1|1) \right. \\
& \qquad \left. + Prob_i(0|0)^{\bar{t}-2} Prob_i(1|0) Prob_i(0|1) Prob_i(1|1) + ... + Prob_i(1|0) Prob_i(1|1)^{\bar{t}} \right].
\end{aligned}$$

Once we numerically evaluate $E(i)$, $i = 1, 2, ..., I$, for given parameter values, we can calculate the total mass of firms that export in any given period in stationary equilibrium:

$$E = \sum_{i=1}^{I} E(i).$$

Denote by $r(\varphi_i)$ the distribution of productivities $\varphi$, conditional on exporting, in stationary equilibrium. This can be obtained as

$$r(\varphi_i) \equiv \frac{E(i)}{E}.$$

We will study the behavior of this conditional distribution, for various values of the model parameters, and compare it with the unconditional distribution of productivities.
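The sum over export histories in $E(i)$ can be evaluated without enumerating paths. Within a cohort of age $s$, the share of firms that has never paid $F$ is $Prob_i(0|0)^s$; these firms export with probability $Prob_i(1|0)$, and all other firms export with probability $Prob_i(1|1)$. The sketch below uses this recursion. It is an illustration rather than the authors' implementation: the function names are hypothetical, and the three-point productivity distribution and choice probabilities in the example are made-up inputs.

```python
import math


def truncation_horizon(nu, m_bar, delta):
    """Smallest integer t with delta**t * m_bar <= nu (the horizon t-bar)."""
    return math.ceil(math.log(nu / m_bar) / math.log(delta))


def exporter_mass(d_i, p10, p11, delta, m_bar, t_bar):
    """E(i): mass of exporters of one productivity type, summed over cohorts."""
    not_entered = 1.0  # share of the cohort that has never paid F
    total = 0.0
    for s in range(t_bar + 1):
        exporting_share = not_entered * p10 + (1.0 - not_entered) * p11
        total += delta ** s * d_i * m_bar * exporting_share
        not_entered *= 1.0 - p10  # firms that again chose not to export
    return total


def conditional_on_exporting(d, p10, p11, delta, m_bar, t_bar):
    """r(phi_i) = E(i) / E for every productivity type i."""
    masses = [exporter_mass(d[i], p10[i], p11[i], delta, m_bar, t_bar)
              for i in range(len(d))]
    total = sum(masses)
    return [m / total for m in masses]


t_bar = truncation_horizon(math.exp(-10), 1.0, 0.9)
# Hypothetical inputs: d(phi_i), Prob_i(1|0) and Prob_i(1|1) for three types.
d = [0.6, 0.3, 0.1]
p10 = [0.01, 0.2, 0.9]
p11 = [0.55, 0.8, 0.99]
print(t_bar, conditional_on_exporting(d, p10, p11, 0.9, 1.0, t_bar))
```

With $\nu = e^{-10}$, $\bar{M} = 1$ and $\delta = 0.9$, the horizon is $\bar{t} = 95$.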
3.3 The distribution of productivity conditional on exporting in the stochastic model

In this section we simulate the distribution of productivities of exporting firms for various values of the sunk entry cost $F$, holding fixed the parameters $P$, $y$, $w$, $\delta$, $\varepsilon$, the mass of new firms in every period $\bar{M}$, and the tolerance parameter $\nu$ defined above, and assuming that firms draw productivity values $\varphi$ from a Pareto distribution with fixed values of parameters $\varphi_m$ and $\alpha$. The values of the parameters are presented in Table (1).

Table 1: Values of the model parameters

Parameter      Specification 1   Specification 2   Specification 3
$F$            0                 10                20
$P$            1                 1                 1
$y$            1                 1                 1
$w$            1                 1                 1
$\delta$       0.9               0.9               0.9
$\varepsilon$  4                 4                 4
$\varphi_m$    1                 1                 1
$\alpha$       3                 3                 3
$\bar{M}$      1                 1                 1
$\nu$          $e^{-10}$         $e^{-10}$         $e^{-10}$

We first generate a discretized version of the Pareto distribution. The details of this procedure are explained in the Appendix. This distribution has a finite support (in the range [1.0249, 6] in our simulations). The probabilities of exporting, conditional on having exported before and not, are then obtained for each value of productivity $\varphi_i$, $i = 1, ..., I$, with $I = 100$ in our simulations. These probabilities are depicted in Figures (7), (8), for values of $F = 0, 10, 20$. The probability of exporting conditional on having exported before is independent of $F$, since the firms that have exported at least once view the cost $F$ as sunk. In contrast, firms that have never exported to the market are strongly affected by the value of $F$. The probability of exporting, conditional on not having exported, shifts down for low and intermediate values of $\varphi$, as $F$ increases first from 0 to 10, and then from 10 to 20. This fact has implications for the distribution of firm productivities, conditional on exporting.
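The choice probabilities for the Table (1) parameter values can be computed from (13), (16) and (17). The sketch below is one possible way to do so and is not taken from the source: it works with $y = \ln z$ and $\ln C$ to avoid overflow at high productivities, solves (13) by bisection, and evaluates only a few productivity values chosen here for illustration instead of the full 100-point grid. All function names are hypothetical.

```python
import math


def log_add_exp(a, b):
    """Numerically stable ln(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def export_profit(phi, eps=4.0, y=1.0, P=1.0, w=1.0):
    """Equation (1), with the Table (1) values of eps, y, P and w as defaults."""
    return (phi ** (eps - 1) / (eps - 1)) * (eps / (eps - 1)) ** (-eps) * y * (P / w) ** (eps - 1)


def export_probabilities(phi, F, delta=0.9):
    """Return (Prob(1|1), Prob(1|0)) for productivity phi and sunk cost F."""
    pi = export_profit(phi)
    log_one_plus = log_add_exp(0.0, pi)           # ln(1 + exp(pi))
    prob_11 = math.exp(pi - log_one_plus)         # equation (16)
    log_C = pi + delta * log_one_plus / (1 - delta) - F
    # Solve (13), y = ln(exp(delta * y) + C), by bisection on y >= 0.
    lo = 0.0
    hi = max(log_C, 0.0) + math.log(2.0) / (1 - delta) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid < log_add_exp(delta * mid, log_C):
            lo = mid
        else:
            hi = mid
    y_val = 0.5 * (lo + hi)
    prob_10 = -math.expm1(-(1 - delta) * y_val)   # 1 - exp(delta*y - y), as in (17)
    return prob_11, prob_10


for F in (0.0, 10.0, 20.0):
    print(F, [export_probabilities(phi, F) for phi in (1.0249, 2.0, 2.77, 6.0)])
```

When $F = 0$ the two states are identical, so the two probabilities returned for a given $\varphi$ should coincide; this is a convenient sanity check on any implementation.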
Figure 7: Probability of exporting conditional on having exported before over the values of productivity $\varphi$ in the stochastic model. The same curve applies to all cases of $F = 0, 10, 20$. [Horizontal axis: values of productivity; vertical axis: probability of exporting conditional on having exported before.]

Figure 8: Probability of exporting conditional on not having exported before over the values of productivity $\varphi$ in the stochastic model. Solid curve for $F = 20$, dashed curve for $F = 10$, dotted curve for $F = 0$. [Horizontal axis: values of productivity; vertical axis: probability of exporting conditional on not having exported before.]

The probability distribution of firm productivities, conditional on exporting, is obtained as described above. The unconditional and conditional distributions of productivity, for values of $F = 0, 10, 20$, are depicted in Figures (9), (10), (11).

In Figure (9), both the unconditional and conditional probability mass functions are strictly decreasing. Notice that the conditional probability mass function is lower than the unconditional one for low values of productivity, and higher for high values of productivity. That is, because the less productive firms are less likely to export than the more productive firms, the weight in the conditional distribution is shifted towards the higher-productivity firms.

In Figure (10), where $F = 10$, this shift in weight is even stronger, since the probability of exporting conditional on not having exported before shifts down for low values of $\varphi$ as $F$ increases from 0 to 10. As a result, there is now a slight hump in the conditional probability distribution of $\varphi$, for values between 1.5 and 2.5.
As we increase $F$ even further, to $F = 20$, the probability of exporting conditional on not having exported before shifts down even more for low values of productivity, as shown in Figure (8), and in particular becomes nearly 0 for $\varphi$ between 1 and 2.17. The shift in weight in the conditional probability distribution of $\varphi$ towards higher values is therefore even stronger, and there is a pronounced hump, as is evident in Figure (11). The probability mass function is strictly increasing for $\varphi \in [2, 2.77]$ and strictly decreasing for $\varphi \in [2.77, 6]$.

Figure 9: Sunk entry cost $F = 0$. The solid curve shows the probability distribution of productivity of all firms, and the dashed curve shows the probability distribution of productivity of all exporting firms. [Horizontal axis: values of productivity; vertical axis: probability mass function.]

Figure 10: Sunk entry cost $F = 10$. The solid curve shows the probability distribution of productivity of all firms, and the dashed curve shows the probability distribution of productivity of all exporting firms. [Horizontal axis: values of productivity; vertical axis: probability mass function.]

Figure 11: Sunk entry cost $F = 20$. The solid curve shows the probability distribution of productivity of all firms, and the dashed curve shows the probability distribution of productivity of all exporting firms. [Horizontal axis: values of productivity; vertical axis: probability mass function.]

4 Conclusion

The assumption that firm productivities follow a Pareto distribution has been used extensively in the recent trade literature. However, we show that the productivity distribution of exporting firms is not exactly Pareto; rather, it is a hump-shaped distribution with a heavy right tail.
In a standard Melitz model with a sunk entry cost and heterogeneous firms that draw their productivities from a Pareto distribution, the productivity distribution of exporting firms is still a Pareto. Introducing idiosyncratic random shocks to the firm payoffs results in a stationary distribution of firm productivities conditional on exporting that matches the shape of the empirical distribution, for sufficiently high values of the sunk entry cost.

5 Appendix

5.1 Estimating TFP

To carry out TFP estimation we use only data on domestic sales of the firm, i.e. $R_{jt} = \hat{R}_{jt} - X_{jt}$, where $\hat{R}_{jt}$ is total revenues of the firm, and $X_{jt}$ is its export revenues. We assume that the inputs used for domestic production only are the same fraction of total inputs used as the fraction of domestic sales in total sales. In what follows, we borrow many steps from De Loecker (2011). Begin with the pair of production and demand equations:

$$Q_{jt} = L_{jt}^{\alpha_l} M_{jt}^{\alpha_m} K_{jt}^{\alpha_k} e^{\omega_{jt} + u_{jt}},$$

$$Q_{jt} = Q_{st} \left( \frac{P_{jt}}{P_{st}} \right)^{-\varepsilon} e^{\eta_{jt}},$$

where $L$, $M$, $K$ are inputs (labor, intermediate inputs and capital, respectively), $\omega_{jt}$ is an unobserved productivity shock, $u_{jt}$ is an error term, $P_{jt}$ is firm $j$'s price, $Q_{jt}$ is firm $j$'s quantity produced and sold, $\eta_{jt}$ is an unobserved demand shock, and $Q_{st}$ and $P_{st}$ are industry-wide output and price index, respectively. $j$ indexes firms, and $t$ indexes time (years in our case). We observe only total domestic revenues (and not physical output):

$$R_{jt} \equiv Q_{jt} P_{jt} = Q_{jt} Q_{jt}^{-\frac{1}{\varepsilon}} Q_{st}^{\frac{1}{\varepsilon}} P_{st} \left( e^{\eta_{jt}} \right)^{\frac{1}{\varepsilon}},$$

so that upon dividing both sides by $P_{st}$ and taking logs:

$$\tilde{r}_{jt} = \frac{\varepsilon-1}{\varepsilon} q_{jt} + \frac{1}{\varepsilon} q_{st} + \frac{1}{\varepsilon} \eta_{jt},$$

and plugging in the equation for the production function:

$$\tilde{r}_{jt} = \beta_l l_{jt} + \beta_m m_{jt} + \beta_k k_{jt} + \beta_s q_{st} + \omega_{jt} + \eta_{jt} + u_{jt},$$

where the small letters denote the logarithms of the capitalized variables (e.g. $q_{jt} \equiv \ln Q_{jt}$), and all the coefficients are reduced form parameters combining the production function and demand parameters.
We next denote $\tilde{\omega}_{jt} \equiv \omega_{jt} + \eta_{jt}$ and rewrite:

$$\tilde{r}_{jt} = \beta_l l_{jt} + \beta_m m_{jt} + \beta_k k_{jt} + \beta_s q_{st} + \tilde{\omega}_{jt} + u_{jt}.$$

Here we follow Levinsohn and Melitz (2002) in treating the unobserved productivity and demand shocks jointly, so that we do not identify these two sources of firm profitability independently. Since our goal is only the estimation of TFP, it suffices to control for industry-wide output with industry-time fixed effects. Therefore, we have

$$\tilde{r}_{jt} = \beta_l l_{jt} + \beta_m m_{jt} + \beta_k k_{jt} + \tilde{\omega}_{jt} + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt},$$

where $D_{st}$ are industry-year dummies. The next issue is the identification of the coefficients on the variable inputs. To address it, we take note of the ACF (Ackerberg, Caves and Frazer, 2006) critique of the Levinsohn and Petrin (2003) and Olley and Pakes (1996) estimation approach, and estimate all the input coefficients in the second stage. We use value added in the first-stage regression, so that we only estimate the coefficients on labor and capital in the production function. We do use the data on intermediate inputs, however, as the control for (unobserved) productivity. We use the equation for intermediate inputs

$$m_{jt} = m_t(k_{jt}, \tilde{\omega}_{jt}),$$

so that, assuming monotonicity of the function $m_t(\cdot)$, we can invert:

$$\tilde{\omega}_{jt} \equiv \psi_t(m_{jt}, k_{jt}).$$

Similarly, we assume that the optimal quantity of labor is chosen once current productivity is observed by the firm, so that

$$l_{jt} = l_t(k_{jt}, \tilde{\omega}_{jt}),$$

and, once we use the expression for $\tilde{\omega}_{jt}$ above,

$$l_{jt} = l_t(k_{jt}, \psi_t(m_{jt}, k_{jt})).$$
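Inserting these into the expression for value added:

$$\begin{aligned}
\widetilde{va}_{jt} \equiv \tilde{r}_{jt} - \beta_m m_{jt}
  &= \beta_l l_{jt} + \beta_k k_{jt} + \psi_t(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt} \\
  &= \beta_l l_t(k_{jt}, \psi_t(m_{jt}, k_{jt})) + \beta_k k_{jt} + \psi_t(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt} \\
  &\equiv \Psi_t(m_{jt}, k_{jt}) + \sum_{t=1}^{T} \beta_{st} D_{st} + \varepsilon_{jt},
\end{aligned}$$

where $\Psi_t(m_{jt}, k_{jt})$ is a polynomial in capital and intermediate inputs, one for each time period (year in our case):

$$\begin{aligned}
\Psi_t(m_{jt}, k_{jt}) \equiv{}
  & \sum_{t=1}^{T} b_{0t} D_t
    + \sum_{t=1}^{T} b_{1kt} D_t k_{jt}
    + \sum_{t=1}^{T} b_{1mt} D_t m_{jt} \\
  & + \sum_{t=1}^{T} b_{2kkt} D_t k_{jt}^2
    + \sum_{t=1}^{T} b_{2mkt} D_t k_{jt} m_{jt}
    + \sum_{t=1}^{T} b_{2mmt} D_t m_{jt}^2 \\
  & + \sum_{t=1}^{T} b_{3kkkt} D_t k_{jt}^3
    + \sum_{t=1}^{T} b_{3kkmt} D_t k_{jt}^2 m_{jt}
    + \sum_{t=1}^{T} b_{3kmmt} D_t k_{jt} m_{jt}^2
    + \sum_{t=1}^{T} b_{3mmmt} D_t m_{jt}^3,
\end{aligned}$$

where $D_t$ are year dummies. Assuming that productivity follows a first-order Markov process:

$$\tilde{\omega}_{jt} = E[\tilde{\omega}_{jt} \mid \tilde{\omega}_{j(t-1)}] + \nu_{jt},$$

where $\nu_{jt}$ is uncorrelated with $k_{jt}$, and given values of $\beta_l$ and $\beta_k$, one can estimate the residual $\nu_{jt}$ (the unobserved innovation to productivity) non-parametrically from

$$\tilde{\omega}_{jt} = \widetilde{va}_{jt} - \beta_l l_{jt} - \beta_k k_{jt} - \sum_{t=1}^{T} \beta_{st} D_{st},$$

$$\tilde{\omega}_{jt} = z_0 + z_1 \tilde{\omega}_{j(t-1)} + z_2 \tilde{\omega}_{j(t-1)}^2 + z_3 \tilde{\omega}_{j(t-1)}^3 + \nu_{jt}.$$

Next, use the moments

$$E[\nu_{jt}(\beta_k, \beta_l) \, k_{jt}] = 0, \qquad E[\nu_{jt}(\beta_k, \beta_l) \, l_{j(t-1)}] = 0,$$

to identify the coefficients on capital and labor. Finally, given estimates $\hat{\beta}_k$, $\hat{\beta}_l$, the estimate of TFP, $\hat{\tilde{\omega}}_{jt}$, is given by

$$\hat{\tilde{\omega}}_{jt} = \widetilde{va}_{jt} - \hat{\beta}_l l_{jt} - \hat{\beta}_k k_{jt} - \sum_{t=1}^{T} \hat{\beta}_{st} D_{st}.$$

5.2 Conditional Pareto distribution

Consider a random variable $\varphi$ that has a Pareto distribution:

$$f_\Phi(\varphi) = \begin{cases} \dfrac{\alpha \varphi_m^{\alpha}}{\varphi^{\alpha+1}}, & \text{for } \varphi \geq \varphi_m, \\ 0, & \text{for } \varphi < \varphi_m, \end{cases}$$

where $\varphi_m > 0$ is a scale parameter and $\alpha > 0$ is a shape parameter. The cumulative distribution function of $\varphi$ is

$$F_\Phi(\varphi) = \begin{cases} 1 - \left( \dfrac{\varphi_m}{\varphi} \right)^{\alpha}, & \text{for } \varphi \geq \varphi_m, \\ 0, & \text{for } \varphi < \varphi_m. \end{cases}$$

Hence, the conditional probability distribution of $\varphi$, given that $\varphi \geq \tilde{\varphi}$, $\tilde{\varphi} > \varphi_m$, is

$$f(\varphi \mid \varphi \geq \tilde{\varphi}) = \frac{f(\varphi)}{1 - F(\tilde{\varphi})} = \frac{\alpha \varphi_m^{\alpha} / \varphi^{\alpha+1}}{(\varphi_m / \tilde{\varphi})^{\alpha}} = \alpha \frac{\tilde{\varphi}^{\alpha}}{\varphi^{\alpha+1}},$$

for $\varphi \geq \tilde{\varphi}$, and 0 otherwise. That is, the conditional probability distribution of $\varphi$, given that $\varphi \geq \tilde{\varphi}$, $\tilde{\varphi} > \varphi_m$, is a Pareto distribution with shape parameter $\alpha$ and scale parameter $\tilde{\varphi}$.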
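The closure property derived in Section 5.2 can be checked numerically. The following minimal Python sketch compares the truncated density f/(1 - F) with a Pareto density whose scale equals the cutoff. The function names and the parameter values (scale 1, shape 3, cutoff 2) are illustrative assumptions and are not taken from the paper.

```python
def pareto_pdf(phi, phi_m, alpha):
    """Pareto density with scale phi_m and shape alpha."""
    if phi < phi_m:
        return 0.0
    return alpha * phi_m**alpha / phi**(alpha + 1)


def pareto_cdf(phi, phi_m, alpha):
    """Pareto cumulative distribution function."""
    if phi < phi_m:
        return 0.0
    return 1.0 - (phi_m / phi)**alpha


def truncated_pdf(phi, cutoff, phi_m, alpha):
    """Density of phi conditional on phi >= cutoff, i.e. f / (1 - F(cutoff))."""
    if phi < cutoff:
        return 0.0
    return pareto_pdf(phi, phi_m, alpha) / (1.0 - pareto_cdf(cutoff, phi_m, alpha))


# Hypothetical parameter values chosen only for illustration.
phi_m, alpha, cutoff = 1.0, 3.0, 2.0

# Largest absolute difference between the truncated density and a
# Pareto density with the same shape and scale equal to the cutoff.
max_gap = max(
    abs(truncated_pdf(x, cutoff, phi_m, alpha) - pareto_pdf(x, cutoff, alpha))
    for x in [2.0, 2.5, 3.0, 4.5, 6.0]
)
print(max_gap)
```

The difference should vanish up to floating-point error, which is the sense in which a standard Melitz cutoff rule leaves the exporter productivity distribution Pareto.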
5.3 Generating a discretized version of the Pareto distribution

Our method of generating a discretized version of the Pareto distribution with a given scale parameter and shape parameter is an extension of the approach to discretizing continuous random variables proposed by Tauchen (1986). Suppose a random variable $\varphi$ follows a Pareto distribution with scale parameter $\varphi_m$ and shape parameter $\alpha$. We would like to construct its discrete representation by first building a grid, a finite collection of equally spaced points on the real line, and then calculating the weight for each of these points. Fix the desired number of elements in the grid, $I \in \mathbb{N}_+$. The variable $\varphi$ has a well-defined lower bound, $\varphi_m$, but is unbounded from above. We set the upper bound for the discretized version equal to the 99th percentile of the TFP values observed in the entire dataset of French firms, which is approximately 6. Denote by $\varphi_i$, $i = 1, 2, \dots, I$, the elements in the grid, and by $e_i$, $i = 1, 2, \dots, I + 1$, the so-called endpoints, whose purpose will be explained shortly. We calculate these as follows:

$$\begin{aligned}
\varphi_I &= 6, \\
\text{gap} &\equiv \frac{\varphi_I - \varphi_m}{2I - 1}, \\
e_1 &= \varphi_m, \\
\varphi_1 &= e_1 + \text{gap}, \\
e_2 &= \varphi_1 + \text{gap}, \\
&\;\;\vdots \\
\varphi_i &= e_i + \text{gap}, \quad i = 3, \dots, I, \\
e_i &= \varphi_{i-1} + \text{gap}, \quad i = 3, \dots, I + 1.
\end{aligned}$$

That is, each grid point $\varphi_i$ is contained in the interval $(e_i, e_{i+1})$, at equal distance from both endpoints $e_i$ and $e_{i+1}$. We calculate the weight associated with each grid point $\varphi_i$, $i = 1, \dots, I$, as

$$d(\varphi_i) = \frac{F_\varphi(e_{i+1}) - F_\varphi(e_i)}{F_\varphi(e_{I+1}) - F_\varphi(e_1)},$$

where $F_\varphi$ is the cumulative Pareto distribution with scale parameter $\varphi_m$ and shape parameter $\alpha$, and we divide the numerator $F_\varphi(e_{i+1}) - F_\varphi(e_i)$ by the term $F_\varphi(e_{I+1}) - F_\varphi(e_1)$ to ensure that all the weights sum to 1:

$$\sum_{i=1}^{I} d(\varphi_i) = \sum_{i=1}^{I} \frac{F_\varphi(e_{i+1}) - F_\varphi(e_i)}{F_\varphi(e_{I+1}) - F_\varphi(e_1)} = \frac{F_\varphi(e_{I+1}) - F_\varphi(e_1)}{F_\varphi(e_{I+1}) - F_\varphi(e_1)} = 1.$$

The discretized version of the distribution of the random variable $\varphi$ is then represented by the grid $\varphi_i$, $i = 1, \dots, I$, and the probability mass function $d(\varphi)$.

6 References

Ackerberg, D., K. Caves and G. Frazer, ‘Structural Identification of Production Functions’, Econometrica R&R, 2006.
Akhmetova, Z. and C. Mitaritonna, ‘A Model of Firm Experimentation under Demand Uncertainty with an Application to Multidestination Exporters’, Working Paper, University of New South Wales, 2013.
Albornoz, F., H. F. C. Pardo, G. Corcos and E. Ornelas, ‘Sequential Exporting’, Journal of International Economics 88 (2012), 17-31.
Arkolakis, C., ‘Market Penetration Costs and the New Consumers Margin in International Trade’, Journal of Political Economy 118 (2010), 1151-1199.
Axtell, R. L., ‘Zipf Distribution of U.S. Firm Sizes’, Science 293 (2001), 1818-1820.
Chaney, T., ‘Distorted Gravity: The Intensive and Extensive Margins of International Trade’, American Economic Review 98 (2008), 1707-1721.
De Loecker, J., ‘Product Differentiation, Multi-Product Firms and Estimating the Impact of Trade Liberalization on Productivity’, Econometrica 79 (2011), 1407-1451.
Eaton, J., M. Eslava, C.J. Krizan, M. Kugler and J. Tybout, ‘A Search and Learning Model of Export Dynamics’, Working Paper, 2008.
Eaton, J., S. Kortum and F. Kramarz, ‘An Anatomy of International Trade: Evidence from French Firms’, Econometrica 79 (2011), 1453-1498.
Helpman, E., M.J. Melitz and Y. Rubinstein, ‘Estimating Trade Flows: Trading Partners and Trading Volumes’, The Quarterly Journal of Economics 123 (2008), 441-487.
Helpman, E., M.J. Melitz and S.R. Yeaple, ‘Export Versus FDI with Heterogeneous Firms’, American Economic Review 94 (2004), 300-316.
Ijiri, Y. and H.A. Simon, ‘Business Firm Growth and Size’, American Economic Review 54 (1964), 77-89.
Levinsohn, J. and M.J.
Melitz, ‘Productivity in a Differentiated Products Market Equilibrium’, mimeo, Harvard University, 2002.
Levinsohn, J. and A. Petrin, ‘Estimating Production Functions Using Inputs to Control for Unobservables’, Review of Economic Studies 70 (2003), 317-342.
Luttmer, E., ‘Selection, Growth, and the Size Distribution of Firms’, Quarterly Journal of Economics 122 (2007), 1103-1144.
Melitz, M. J., ‘The Impact of Trade on Intra-Industry Reallocations and Aggregate Industry Productivity’, Econometrica 71 (2003), 1695-1725.
Melitz, M. J. and G. I. P. Ottaviano, ‘Market Size, Trade, and Productivity’, Review of Economic Studies 75 (2008), 295-316.
Olley, S. and A. Pakes, ‘The Dynamics of Productivity in the Telecommunications Equipment Industry’, Econometrica 64 (1996), 1263-1295.
Rauch, J. E. and J. Watson, ‘Starting Small in an Unfamiliar Environment’, International Journal of Industrial Organization 21 (2003), 1021-1042.
Rust, J., ‘Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher’, Econometrica 55 (1987), 999-1033.
Simon, H. A., ‘On a Class of Skew Distribution Functions’, Biometrika 42 (1955), 425-440.
Simon, H. A. and C. P. Bonini, ‘The Size Distribution of Business Firms’, American Economic Review 48 (1958), 607-617.
Steindl, J., Random Processes and the Growth of Firms; A Study of the Pareto Law, New York, NY: Hafner Publishing Company, 1965.
Tauchen, G., ‘Finite state markov-chain approximations to univariate and vector autoregressions’, Economics Letters 20 (1986), 177-181.