GRIPS Discussion Paper 14-11

The Henry George Theorem in a second-best world

Kristian Behrens*, Yoshitsugu Kanemoto†, Yasusada Murata‡

August 2014

National Graduate Institute for Policy Studies, 7-22-1 Roppongi, Minato-ku, Tokyo, Japan 106-8677

This draft: August 15, 2014

Abstract

The Henry George Theorem (HGT) states that, in first-best economies, the fiscal surplus of a city government that finances the Pigouvian subsidies for agglomeration externalities and the costs of local public goods by a 100% tax on land is zero at optimal city sizes. We extend the HGT to distorted economies where product differentiation and increasing returns are the sources of agglomeration economies and city governments levy property taxes. Without relying on specific functional forms, we derive a second-best HGT that relates the fiscal surplus to the excess burden expressed as an extended Harberger formula.

Keywords: Henry George Theorem; second-best economies; optimal city size; monopolistic competition; local public goods

JEL Classification: D43; R12; R13

* Canada Research Chair, Département des Sciences Economiques, Université du Québec à Montréal (UQAM), Canada; National Research University, Higher School of Economics, Russia; CIRPÉE, Canada; and CEPR.
† National Graduate Institute for Policy Studies (GRIPS); and Graduate School of Public Policy (GraSPP), University of Tokyo.
‡ Advanced Research Institute for the Sciences and Humanities (ARISH), Nihon University, Japan.

1. Introduction

The equilibrium sizes of agglomerations such as cities (or communities and shopping centers) are determined by the balance between increasing and decreasing returns to spatial concentration.
As is well known, cities need not be optimally sized at equilibrium since the urban environment is replete with externalities (Henderson, 1974; Kanemoto, 1980). Recent empirical research suggests that any departure from optimal city size can generate sizeable economic costs, especially in terms of foregone productivity (e.g., Au and Henderson, 2006a, 2006b). Hence, elaborating efficient urban growth policies is likely to be of first-order importance to many countries, especially developing ones. An important preliminary step for devising such urban policies is to assess whether cities are too large or too small, and by how much. The ‘golden rule’ of local public finance (Flatters et al., 1974), i.e., the Henry George Theorem (henceforth HGT; Stiglitz, 1977; Arnott and Stiglitz, 1979; Kanemoto, 1980; Schweizer, 1983) provides a condition for the optimal size of a city or, equivalently, the optimal number of cities given a fixed total population.¹ It is thus a potentially useful tool that can allow policy makers to assess whether cities are too large or too small.²

Another application of the HGT is concerned with smaller agglomerations such as shopping centers and business subcenters that are mostly developed by private companies. Those developers capture increases in land prices to finance development costs. The HGT shows that free entry of developers yields an efficient allocation in a first-best world. It would be of interest for policy makers to know whether developments are too few or too many in a more realistic setting with various distortions.³

The HGT may be viewed as an extension of the result on the optimal number of firms in an industry – in the first best, entry into an industry is optimal when the marginal social benefit (i.e., the profit) of the last entrant vanishes.
In a spatial context, this optimality condition must be extended to include land rents: aggregate land rents (which capitalize agglomeration benefits) equal the aggregate losses from increasing-returns activities that generate agglomeration. For example, if the driving force for agglomeration is a pure local public good, then the cost of its supply must be equal to aggregate land rents at the optimal city size. In the case of a factory town, where firm-level scale economies generate spatial concentration, aggregate land rents must be equal to the losses that the firm would incur if it were constrained to price at marginal cost. In a new economic geography (henceforth NEG) model with increasing returns and product differentiation, aggregate land rents must be equal to the subsidies paid to firms in order to achieve efficient production scale and optimum product diversity. More generally, using the concept of the fiscal surplus – defined as the net surplus of the city government (property tax revenue minus costs of local public goods and any other losses from increasing returns) plus all the land rents – the HGT at the first best simply states that the fiscal surplus is zero when the number of cities is optimal (or, alternatively, when the cities are of optimal size).

¹ See Mieszkowski and Zodrow (1989) and Arnott (2004) for an overview of the HGT.
² Kanemoto et al. (1996, 2005) applied these ideas to empirically test the often-made claim that Tokyo, with a metropolitan population of about 30 million, is much too large. It is not easy to obtain reliable estimates of key variables such as the aggregate land rents and the aggregate Pigouvian subsidies in a city, but the HGT provides a promising theoretical framework for empirical studies. As stated by Arnott (2004, pp.1086–1087): “Does the Henry George Theorem provide a practical guide to optimal city size? The jury is not yet in, but the approach is sufficiently promising to merit further exploration.”
³ In reality, of course, agglomeration benefits of a shopping center or a business subcenter spill over to other areas, and a developer is unlikely to capture all the benefits if he does not control all those areas. In such a case, the government may want to subsidize or share the burden of infrastructure development. Our analysis shows that, on top of the spillover effects, the government must take into account various second-best issues.

As is well known, the HGT holds in a first-best world without distortions but not necessarily in a second-best world (see Arnott, 2004, for a recent survey). However, as the last two examples above highlight, most factors that drive the concentration of economic activity involve some form of market failure.⁴ Thus, for the HGT to be of practical relevance it must be extended to cope with settings encompassing distortions of various kinds. This has, to the best of our knowledge, not been systematically done to date. The purpose of this article is to fill that gap by identifying conditions under which the HGT holds even in a second-best economy – an economy where policy makers can implement the optimal size of an agglomeration but are constrained to take production and consumption decisions (including entry decisions of firms) and the resulting equilibrium prices as given. We also examine in which directions the theorem needs to be modified should it fail to hold in such a world.

Several related articles have examined variations on the HGT in a second-best world. First, Arnott (2004, p.1073) showed that “in distorted urbanized economies the generalized HGT continues to hold when the aggregate magnitudes are valued at shadow prices.” The article does not, however, suggest how these shadow prices can be calculated, thereby limiting the practical relevance of the generalized HGT.
Second, Helsley and Strange (1990) showed that the theorem does not hold in a matching framework of urban labor markets where firms compete for workers on a circle of skill ranges and where wages are determined via Nash bargaining. Last, Behrens and Murata (2009) examined a monocentric city model with monopolistic competition and showed that the HGT holds at the second best if and only if the second-best allocation is first-best efficient. In their model, the latter holds true only in the constant elasticity of substitution (henceforth CES) case.⁵

As should be clear from the foregoing literature review, whether or not the HGT holds at the second best hinges on the chosen modeling framework. Unlike previous approaches to the second-best HGT that rely from the beginning on specific functional forms, our aim is to derive a more general version of a second-best HGT and apply it to various models of spatial concentration, including those of the NEG featuring imperfect competition and product differentiation, as well as models of local public goods. Our methodology allows us to obtain general results without resorting to specific functional forms.

⁴ Duranton and Puga (2004) provide an excellent survey on the theoretical foundations of agglomeration economies.
⁵ Behrens et al. (2010) have shown that the HGT continues to hold in a CES model with heterogeneous agents and sorting along talent across cities. The reason is that, despite heterogeneity and sorting, the underlying tradeoff in terms of price and variety distortions is unaffected. Many of the results obtained in the CES model (markups, firm size, entry, etc.), however, are clearly knife-edge results. See Zhelobodko et al. (2012) and Dhingra and Morrow (2013) for overviews and discussions.

In a second-best world, the fiscal surplus is not generally zero at the optimum and depends on two things: the excess burden and the total value of distortions for residents who move from the existing cities to a new city.
Creating a new city induces changes in production and consumption in existing cities, in addition to the direct impacts of the migration of residents/workers from the existing cities to the new city. The excess burden captures the welfare effects of the induced changes, whereas the total value of distortions reflects the fact that price distortions cause divergence between the social and market values of the direct impacts. Formally, the former can be expressed as a Harberger excess burden formula (Harberger, 1964, 1971), extended to include distortions in product diversity that have largely escaped attention in the literature because no explicit price exists for the number of varieties: the weighted sum of changes in quantities and product diversity induced by the creation of a new city, with weights being the associated distortions. The latter is the weighted sum of distortions with weights being the quantities consumed by the movers. To illustrate our results in a simple setting, we first obtain the second-best HGT in a NEG-type model where the utility function is additively separable with respect to varieties as in Dixit and Stiglitz (1977), although we follow Behrens and Murata (2007) and Zhelobodko et al. (2012) and do not restrict the functional form of the subutility a priori to the CES type. We also incorporate a property tax on housing and a pure local public good in each city. In this model, distortions in the differentiated good sector take two forms: a price distortion for each variety of the differentiated good, and a distortion for the number of varieties consumed. The variety distortion works in the opposite direction of the price distortion. 
We show that the price distortion depends on the relative risk aversion (henceforth RRA) and that the variety distortion is inversely related to the elasticity of the subutility function.⁶ Unlike the RRA, the elasticity of the subutility function depends on the absolute level of utility: adding a positive constant to the subutility makes the variety distortion larger. This result reveals an important difference between models with endogenous product diversity (like new trade and NEG models) and expected utility theory that has not been emphasized much until now. In expected utility theory, utility is unique up to an affine transformation and its absolute level does not matter. In monopolistic competition models, the desirability of introducing a new variety hinges on the magnitude of the utility increase it produces. The utility increase is the difference between the utility level with equilibrium consumption and that with zero consumption. Models with endogenous product diversity, therefore, depend on the absolute level of utility, whereas expected utility theory depends only on the marginal utility and higher-order derivatives.⁷

⁶ The result that the RRA and the elasticity of the subutility matter for the gap between the first- and the second-best allocations already appears in Behrens and Murata (2009, eqs. (33) and (34)), although they do not explicitly decompose aggregate distortions into price and variety distortions.
⁷ The fact that the variety distortion depends on the absolute level of utility is closely related to the welfare economics of new goods discussed by Romer (1994). The price of a good reflects its marginal value but does not provide the value of introducing a new good, which is given by the total utility rather than the marginal utility.

In our setting, whether or not the Henry George Theorem holds crucially hinges on both the derivatives of the subutility and its absolute level.
In other words, modeling choices related to the value of variety are important. We then extend our framework and show that basically the same results hold in a more general model with a nonseparable utility function, general production technology, multiple differentiated goods, and congestible local public goods.

The remainder of the paper is organized as follows. In Section 2, we examine a simple model with a utility function that is additively separable with respect to varieties and a simple production technology with constant marginal and fixed costs. Section 3 extends the results to a nonseparable utility function and general production technology. Finally, Section 4 concludes.

2. The second-best Henry George Theorem in a simple monopolistic competition model with pure local public goods

Before proceeding to the analysis of the general framework that encompasses many models of urban agglomeration, we illustrate the basic principles in a simple monopolistic competition model, imposing additive separability and symmetry on the demand and supply side of a differentiated good.

The basic structure of the model is as follows. The economy consists of many potential sites for cities, and the number of cities that are developed is the focus of our analysis. All consumers/workers are homogeneous and free to choose a city in which to live and work. These assumptions, combined with zero moving costs, guarantee that all consumers obtain equal utility. Agglomeration economies arise from product differentiation in the consumer good, reflecting benefits from greater consumption diversity in larger cities. The size of a city is limited by the fixed supply of urban land, which causes smaller lot sizes and higher housing costs in larger cities. Housing is produced by combining land and labor, and a property tax is levied on housing. The city government provides a pure local public good.
Our problem is to obtain conditions for the optimal number of cities under the constraint that the price and variety distortions for the differentiated good cannot be removed. We solve this problem in three steps. First, assuming a fixed number of cities, we obtain the equilibrium conditions for all markets in the economy, where monopolistic competition prevails for the differentiated good. The second step is to examine the welfare impact of changing the number of cities. In this step, we use a relatively little-known version of the consumer surplus measure: the Allais surplus (Allais, 1943, 1977). Roughly speaking, we fix the utility level and compute the amount of the numéraire good that can be extracted from the economy. In our case, we obtain the surplus of the numéraire that is generated by an increase in the number of cities. In the final step, we exploit the fact that the equilibrium is second-best optimal if the Allais surplus is maximized there: increasing or decreasing the number of cities does not generate additional numéraire conditional on holding utility fixed at its optimal level.

2.1. The model

Every potential site for cities is endowed with the same fixed amount of land, $\bar{X}_L$. Urban land is homogeneous and we ignore the spatial structure of a city.⁸ Out of these potential sites, $n$ cities are developed, where $n$ is a potential policy variable. The total population of the economy is fixed at $\bar{N}$. Focusing on symmetric allocations, the size of a city is then given by $N = \bar{N}/n$. From our assumption of free mobility and zero moving costs, all consumers achieve the same level of utility in equilibrium.

Consumers derive utility from four types of goods: leisure, a differentiated consumer good, a local public good, and housing. In addition to these consumer goods, we have another good called land that is used together with labor to produce housing. These goods cannot be transported between cities.
This holds for labor as well: although workers can move between cities, once they choose a city to live in, they cannot supply labor in other cities. One special feature of land is worth emphasizing: adding a new city increases the total amount of urban land in the economy, unlike other inputs such as labor. A new city does not increase the total number of workers in the economy, but it increases the total amount of land used for housing.

The utility function is $U(x_0, U_M, x_G, x_H)$, where $x_0$ is the consumption of leisure, $U_M$ is the subutility derived from the consumption of the differentiated consumer good, and $x_G$ and $x_H$ are the consumption levels of a local public good and housing, respectively. For simplicity, we assume that the consumption of housing is fixed. This implies that an increase in the population of a city requires an increase in labor input for housing because the total available land is fixed in a city. The local public good is assumed to be purely public within a city so that all residents in a city consume the same amount of that good.

The subutility function $U_M$ is assumed to be additively separable and symmetric with respect to varieties. Assuming that we have a continuum of varieties, it can be represented by an integral with a common lower-tier subutility function, $u(x_j)$:

$$U_M = \int_0^m u(x_j)\,dj,$$

where $m$ is the number (or mass) of varieties of the differentiated good. Note that this form encompasses various specifications such as the widely used CES ($u(x) = x^{(\sigma-1)/\sigma}$) due to Dixit and Stiglitz (1977) and the constant absolute risk aversion (CARA) ($u(x) = 1 - e^{-\alpha x}$) analyzed by Behrens and Murata (2007, 2009).⁹ We focus on a symmetric equilibrium where all varieties have equal prices and quantities. We also assume symmetry with respect to cities so that all cities have the same allocation. Let $x$ and $p$ denote the consumption and price of a variety, respectively, and let $p_H$ denote the price of housing.
We further assume public ownership, i.e., land and firms in all cities are owned equally by all residents.¹⁰ The budget constraint can then be written as $(\bar{x}_0 - x_0) + s = mpx + p_H x_H$, where $\bar{x}_0$ and $s$ respectively denote the endowment of time and an equal share of profits/rents minus the head tax to finance the deficit of the government.¹¹ Labor is assumed to be the numéraire.¹²

⁸ It is easy to incorporate the spatial structure within a city, for example, by using a monocentric city model as common in the first-best HGT in Arnott and Stiglitz (1979), Arnott (1979), Kanemoto (1980), and others. As shown in Behrens and Murata (2009), this extension is also possible for the second-best version.
⁹ Note that representability by integrals does not mean additive separability. As shown in Section 3, quadratic quasi-linear preferences are not additively separable yet can be represented using integrals.

Concerning the choice of the number of varieties to consume, a consumer is constrained by the available number of varieties in the market, denoted by $m^S$. A consumer then maximizes the utility function, $U(x_0, mu(x), x_G, x_H)$, with respect to $x_0$, $x$, and $m$, subject to the budget constraint and the supply-side constraint on the mass of available varieties, $m \le m^S$. From the first-order conditions for this problem, the marginal rate of substitution between a variety and the numéraire (labor) equals their price ratio:

$$\frac{\partial U/\partial U_M}{\partial U/\partial x_0}\,u'(x) = p. \tag{1}$$

For the choice of the number of varieties, the first-order condition is an inequality:

$$\frac{\partial U/\partial U_M}{\partial U/\partial x_0}\,u(x) \ge px. \tag{2}$$

The left-hand side of (2) is the benefit from adding a variety, which should not be lower than the purchase cost of the variety. If the supply-side constraint is binding, the marginal benefit strictly exceeds the cost.
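As an illustrative sketch (not part of the original derivation), conditions (1) and (2) can be traced back to a Lagrangian for the consumer's problem. The multipliers $\lambda$ (on the budget constraint) and $\mu$ (on the variety constraint $m \le m^S$) are notation assumed here for exposition only; housing consumption $x_H$ is fixed and hence not a choice variable.

$$\mathcal{L} = U\big(x_0, mu(x), x_G, x_H\big) + \lambda\big[(\bar{x}_0 - x_0) + s - mpx - p_H x_H\big] + \mu\,(m^S - m)$$

$$\frac{\partial \mathcal{L}}{\partial x_0} = \frac{\partial U}{\partial x_0} - \lambda = 0, \qquad \frac{\partial \mathcal{L}}{\partial x} = \frac{\partial U}{\partial U_M}\,m\,u'(x) - \lambda m p = 0, \qquad \frac{\partial \mathcal{L}}{\partial m} = \frac{\partial U}{\partial U_M}\,u(x) - \lambda p x - \mu = 0, \quad \mu \ge 0.$$

Dividing the second condition by the first (with $m > 0$) gives (1). Substituting $\lambda = \partial U/\partial x_0$ into the third gives $\frac{\partial U/\partial U_M}{\partial U/\partial x_0}u(x) = px + \mu/\lambda \ge px$, which is (2), with strict inequality exactly when the variety constraint binds ($\mu > 0$).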
Combining (1) and (2) yields the condition that the average utility is larger than or equal to the marginal utility:

$$\frac{u(x)}{x} \ge u'(x), \tag{3}$$

or, alternatively,

$$U_R(x) \le 1, \tag{4}$$

where $U_R(x) \equiv xu'(x)/u(x)$ is the elasticity of the subutility function, which will be important when characterizing the variety distortion. We call condition (4) the ‘love-of-variety’ condition. When it does not hold, a consumer prefers to reduce the number of varieties and increase the consumption of each variety, keeping the total consumption constant. Observe also from the expression of $U_R(x)$ that the absolute level of utility matters, unlike in expected utility theory: shifting the utility function upward, for example by adding a constant term, reduces the elasticity $U_R(x)$ and tends to increase the number of varieties.

¹⁰ The following analysis can be applied with minor modifications to the ‘absentee landlord case’, where we assume that rents are spent outside the economy that is modeled.
¹¹ The deficit of the government equals the cost of the local public good minus the property tax revenue.
¹² Strictly speaking, the numéraire is leisure in one of the cities because labor is not tradable between cities. Because of our symmetry assumption, we can take the price of leisure to be equal across cities.

The assumption of homogeneous consumers implies that all varieties supplied in the market will be consumed because if a consumer chooses zero consumption, all other consumers will do the same, making the market demand zero. We therefore set $m = m^S$ from now on.

The production side is formulated as follows. The differentiated consumer good and housing are produced in each city, and they cannot be transported to other cities. The sole input for the differentiated good is labor, whereas housing is produced by combining land and labor. For the differentiated consumer good, there is only one firm producing a particular variety in a city.
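Returning briefly to the love-of-variety condition (4), a worked illustration may help. The sketch below assumes the standard parameterizations $u(x) = x^{(\sigma-1)/\sigma}$ with $\sigma > 1$ for CES and $u(x) = 1 - e^{-\alpha x}$ with $\alpha > 0$ for CARA; the shift constant $k > 0$ is hypothetical and introduced only for this example.

$$\text{CES:}\quad U_R(x) = \frac{x \cdot \frac{\sigma-1}{\sigma}\,x^{-1/\sigma}}{x^{(\sigma-1)/\sigma}} = \frac{\sigma-1}{\sigma} < 1,$$

$$\text{CARA:}\quad U_R(x) = \frac{\alpha x\,e^{-\alpha x}}{1 - e^{-\alpha x}} = \frac{\alpha x}{e^{\alpha x} - 1} < 1 \quad \text{for } x > 0, \text{ since } e^{\alpha x} > 1 + \alpha x,$$

$$\text{Shifted subutility } \tilde{u}(x) = u(x) + k:\quad \tilde{U}_R(x) = \frac{xu'(x)}{u(x) + k} < \frac{xu'(x)}{u(x)} = U_R(x).$$

Under these assumptions both specifications satisfy (4), and the last line shows why the absolute level of $u$ matters: the upward shift leaves $u'$ and $u''$ (and hence the RRA) unchanged but lowers the elasticity.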
The number of varieties $m$ is endogenous.¹³ Production of a variety by a firm is denoted by $Y$. As noted above, production requires labor only. The fixed cost part is $F$ and the constant marginal cost is $c$. We consistently adopt the notational convention that the net output is positive and the net input is negative. Labor used in producing a variety, denoted by $Y_0^M$, is then negative and satisfies the cost function (in labor units): $-Y_0^M = cY + F$.

Profit maximization yields the standard condition that the price margin equals the inverse of the price elasticity of perceived demand: $(p - c)/p = 1/\eta$, where $\eta$ denotes the price elasticity perceived by a producer. Under the monopolistic competition assumption with a continuum of firms, no producer has any influence on the market aggregates. In our framework, the marginal rate of substitution between the differentiated good and the numéraire in (1), $(\partial U/\partial U_M)/(\partial U/\partial x_0)$, is taken as fixed. This implies that the price elasticity equals the inverse of the relative risk aversion (RRA): $\eta = 1/R_R(x)$, where $R_R(x) \equiv -xu''(x)/u'(x)$ denotes the RRA, which will be important for characterizing the price distortion.¹⁴ The first-order condition for profit maximization is then

$$\frac{p - c}{p} = R_R(x), \tag{5}$$

whereas the second-order condition is

$$-\frac{u'''(x)\,x}{u''(x)} < 2. \tag{6}$$

We assume that there is free entry, so that the profit of a firm is zero: $\pi^M = (p - c)Y - F = 0$.

¹³ The importance of product diversity has been emphasized by, e.g., Ethier (1982), Fujita (1989) and Fujita et al. (1999) in the new trade, urban economics, and NEG literatures.
¹⁴ See Behrens and Murata (2007). Zhelobodko et al. (2012) gave a new name, the relative love of variety (RLV), to the relative risk aversion.

Housing is produced by combining land and labor. The housing sector is assumed to be competitive with free entry. Furthermore, all producers have the same production technology.
Under these conditions, we can represent housing production in a city by an aggregate production function with constant returns to scale: $Y_H = F_H(Y_L^H, Y_0^H)$, where $Y_H$ is the total production of housing in a city and $Y_L^H$ and $Y_0^H$ are land and labor inputs, respectively. We assume that a property tax $t_H$ is levied on housing. Maximizing the profit of the housing sector, $\pi^H = (p_H - t_H)Y_H + p_L Y_L^H + Y_0^H$, yields the first-order conditions:

$$\frac{\partial F_H}{\partial Y_0^H} = -\frac{1}{p_H - t_H}; \qquad \frac{\partial F_H}{\partial Y_L^H} = -\frac{p_L}{p_H - t_H}. \tag{7}$$

Again, from our notational convention, the inputs are negative. The maximized profit is zero from the free entry condition: $\pi^H = 0$. Because housing consumption per capita is fixed, housing production in a city is determined as a function of the number of cities: $Y_H = \bar{N}x_H/n \equiv Y_H^*(n)$. With the land input in a city fixed at $Y_L^H = -\bar{X}_L$, this immediately yields the labor input required for housing production also as a function of $n$: $Y_0^{H*}(n)$.

Finally, we assume that production of the local public good requires only labor input. The latter is given by the cost function: $-Y_0^G = C_G(Y_G)$. Since $Y_G$ is a pure local public good, we immediately have $x_G = Y_G$.

2.2. Equilibrium conditions

In this subsection, we examine the equilibrium conditions, given the number of cities and the supply of the local public good. Substituting the market clearing condition for a variety, $\bar{N}x = nY$, into the zero-profit condition yields

$$(p - c)\,\frac{\bar{N}x}{n} = F. \tag{8}$$

This equation, coupled with the first-order condition for profit maximization (5), determines the equilibrium price and consumption as functions of the number of cities: $p = p^*(n)$ and $x = x^*(n)$. From the market clearing condition for a variety, this in turn yields the production of that variety, $Y = Y^*(n)$. The labor input for a variety also becomes a function of $n$: $Y_0^{M*}(n) = -(cY^*(n) + F)$.
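To make the determination of $p^*(n)$ and $x^*(n)$ concrete, the following is a minimal numerical sketch, not taken from the paper. It solves the pricing rule (5) together with the zero-profit condition (8) by bisection for a user-supplied RRA function. The function name `solve_variety_equilibrium`, the bracket `x_hi`, and all parameter values are hypothetical; the sketch also assumes that $0 < R_R(x) < 1$ on the bracket and that operating profit changes sign there.

```python
from typing import Callable, Tuple


def solve_variety_equilibrium(
    n: float,
    total_pop: float,
    c: float,
    fixed_cost: float,
    rra: Callable[[float], float],
    x_hi: float,
    iterations: int = 200,
) -> Tuple[float, float]:
    """Return (p, x) solving conditions (5) and (8) for a given number of cities n.

    rra(x) is the relative risk aversion R_R(x) = -x u''(x) / u'(x).
    Assumption of this sketch: 0 < rra(x) < 1 on (0, x_hi].
    """

    def excess_profit(x: float) -> float:
        p = c / (1.0 - rra(x))  # pricing rule (5): (p - c) / p = R_R(x)
        # zero-profit condition (8): (p - c) * total_pop * x / n = F
        return (p - c) * total_pop * x / n - fixed_cost

    lo, hi = 1e-12, x_hi
    if excess_profit(lo) >= 0 or excess_profit(hi) <= 0:
        raise ValueError("x_hi does not bracket a zero-profit consumption level")
    for _ in range(iterations):  # plain bisection on the profit of one variety
        mid = 0.5 * (lo + hi)
        if excess_profit(mid) < 0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return c / (1.0 - rra(x)), x


# Hypothetical parameter values, chosen only for illustration.
sigma, alpha = 5.0, 1.0
p_ces, x_ces = solve_variety_equilibrium(
    4, 100.0, 1.0, 2.0, lambda x: 1.0 / sigma, x_hi=1e3)      # CES: R_R = 1/sigma
p_cara, x_cara = solve_variety_equilibrium(
    4, 100.0, 1.0, 2.0, lambda x: alpha * x, x_hi=1.0 / alpha - 1e-9)  # CARA: R_R = alpha*x
```

In the CES case the markup is constant, so (8) alone pins down $x^*(n) = Fn(\sigma - 1)/(c\bar{N})$; in the CARA case the markup varies with $x$, which is why a root-finder is used. Neither calculation requires the income or utility level of the consumer.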
Note that, given the number of cities (or equivalently the size of a city), the equilibrium price, consumption, and production of a variety are determined solely by conditions in the differentiated good sector. In particular, they are independent of the income (or utility) level of a consumer and the supply of the local public good.

Next, the number of varieties and the consumption of leisure depend on the supply of the local public good in addition to the number of cities because they must satisfy the equilibrium condition for the labor market,

$$\bar{N}(\bar{x}_0 - x_0) = -nY_0^*(n, Y_G), \tag{9}$$

and the first-order condition for utility maximization,

$$\frac{\partial U\big(x_0, mu(x^*(n)), Y_G, x_H\big)/\partial U_M}{\partial U\big(x_0, mu(x^*(n)), Y_G, x_H\big)/\partial x_0}\,u'(x^*(n)) = p^*(n), \tag{10}$$

where the total labor input in a city in (9) is given by:

$$Y_0^*(n, Y_G) \equiv mY_0^{M*}(n) + Y_0^{H*}(n) - C_G(Y_G). \tag{11}$$

Equations (9) and (10) determine the equilibrium values of leisure and variety as functions of the number of cities and the supply of the local public good: $x_0^*(n, Y_G)$ and $m^*(n, Y_G)$. The utility level is then given by

$$U^*(n, Y_G) \equiv U\big(x_0^*(n, Y_G),\, m^*(n, Y_G)\,u(x^*(n)),\, Y_G,\, x_H\big). \tag{12}$$

2.3. Allais surplus

Our aim is to see how the first-best HGT has to be modified in a second-best setting with monopolistic pricing, product diversity, and property tax distortions. The HGT is obtained from the first-order conditions for the optimal number of cities or, equivalently since total population is given, for the optimal size of a city. As is well known, in a first-best economy with no price distortions, indirect impacts caused by general equilibrium repercussions yield zero net benefits when measured in monetary terms. It therefore suffices to concentrate on the benefits of direct impacts. In a second-best economy, however, we also have to take into consideration the excess burden (or the deadweight loss) in addition to the direct benefits.
Expressing the excess burden in pecuniary units, Harberger (1964, 1971) developed his celebrated excess burden measure, i.e., the weighted sum of induced changes in quantities consumed with the weights being the corresponding price distortions. In this paper, we take a dual approach and consider the problem of minimizing the resource cost of providing city residents a fixed level of utility.¹⁵ When the resource cost is measured in terms of the numéraire good, as will be done in this paper, this leads to the consumer surplus concept developed by Allais (1943, 1977): the maximum amount of the numéraire good that can be extracted, while fixing the utility levels of all households.¹⁶

¹⁵ Harberger in effect used the Marshallian consumer surplus to convert the utility change into pecuniary units, dividing marginal changes in utility by the marginal utility of income. This Marshallian consumer surplus has a well-known difficulty of path dependence: for a discrete change, its value is not unique and depends on the path of integration.
¹⁶ Debreu (1951) proposed a variant of Allais’ measure. Instead of extracting the numéraire, Debreu’s coefficient of resource utilization reduces all primary inputs proportionally. Debreu’s coefficient of resource utilization is not in pecuniary units, however, and as such it cannot serve as a consumer surplus measure. See Diewert (1983, 1985) and Tsuneki (1987) for further extensions.

The Allais surplus provides a simple measure of welfare while being consistent. Compensating and equivalent variations (CV and EV) yield consistent consumer surplus measures, but, when applied to a general equilibrium setting, they yield much more complicated results than Harberger’s. The reason is that the compensated demand functions do not guarantee that demand equals supply along the path of integration. This generates welfare impacts additional to direct benefits even in a first-best economy, as shown by Kanemoto and Mera (1985).
Unlike the CV and EV, the Allais surplus does not have this problem because, by definition, demand equals supply in all markets where prices change.

For an arbitrary number of cities, the procedure described in the preceding subsection determines the equilibrium allocation and the utility level. Starting from this equilibrium, we examine the welfare impact of a change in the number of cities, using the Allais surplus measure. The Allais surplus is defined as the maximum amount of the numéraire good (i.e., labor) that can be extracted from the economy. Formally, we maximize

$$A = nY_0 + \bar{N}(\bar{x}_0 - x_0), \tag{13}$$

while keeping the utility level constant:

$$U\big(x_0, mu(x), Y_G, x_H\big) = \bar{U}, \tag{14}$$

and all the equilibrium conditions satisfied, except, of course, for the market clearing condition for the numéraire, (9), or equivalently, the budget constraint for a consumer. In the preceding subsection, we obtained a general equilibrium with all the markets cleared, including the market for the numéraire. This yields the equilibrium utility level as a function of the number of cities. The Allais surplus uses an equilibrium in a slightly different setting where the utility level is fixed and the market clearing condition is not imposed for the numéraire. The surplus of the numéraire good in this equilibrium yields the Allais surplus. Intuitively, the equilibrium is optimal if changing the number of cities does not allow us to extract any additional surplus conditional on the equilibrium utility level.

2.4. Harberger’s excess burden extended to include variety distortions

As a preliminary step to obtain the optimal number of cities, we consider the effects of changing the number $n$ of cities (or equivalently, the size $N$ of a city given the total population $\bar{N}$) on the Allais surplus: $dA/dn$. We assume that the supply of local public goods is fixed, relegating the analysis of the optimal supply of local public goods to Section 2.8.
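As a quick consistency check on the definitions (a sketch under the sign convention that the total labor input $Y_0$ is negative), evaluating the Allais surplus (13) at an allocation that also satisfies the labor market clearing condition (9) gives

$$A\,\Big|_{(9)\text{ holds}} = nY_0 + \bar{N}(\bar{x}_0 - x_0) = nY_0 - nY_0 = 0.$$

That is, when $\bar{U}$ is set at the utility level attained in the general equilibrium, the surplus is zero by construction at the initial $n$. The derivative $dA/dn$ then asks whether perturbing the number of cities away from that point frees up labor while households remain on $\bar{U}$.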
As noted before, under our assumption of a separable utility function, the equilibrium consumption of a variety does not depend on the income or the utility level. The derivation of an equilibrium with a fixed utility level therefore proceeds in exactly the same way as before up to (8), and we replace the market clearing condition for the numéraire (9) by the utility constraint (14). The equilibrium values for leisure and variety are different from those in the preceding section and we denote them respectively by $x_0^{**}(n, Y_G; \bar{U})$ and $m^{**}(n, Y_G; \bar{U})$. Noting (11), the Allais surplus can be written as

$$A = n\left[m^{**}(n, Y_G; \bar{U})\,Y_0^{M*}(n) + Y_0^{H*}(n) - C_G(Y_G)\right] - \bar{N}\left[x_0^{**}(n, Y_G; \bar{U}) - \bar{x}_0\right]. \tag{15}$$

We assume that, given the number of cities, a unique equilibrium exists.17 To ease exposition we regard $n$ as a continuous variable. We now examine the effect of a marginal increase in the number of cities on the Allais surplus, assuming that the supply of the local public good is fixed. Differentiating the Allais surplus with respect to the number of cities yields

$$\frac{dA}{dn} = Y_0 + \left[n\left(m\frac{dY_0^M}{dn} + \frac{dY_0^H}{dn} + Y_0^M\frac{dm}{dn}\right) - \bar{N}\frac{dx_0}{dn}\right], \tag{16}$$

where we omit superscripts $*$ and $**$ to simplify notation. This equation shows that the change in the Allais surplus consists of two parts: the first term on the right-hand side represents the change in the new city and the second term captures the impacts on the existing cities. Production in the new city requires labor input there, $Y_0$, but in the existing cities less labor input is required. Because the utility level is fixed, there will be no change in the surplus on the consumption side. Applying the first-order conditions for producer and consumer optimization and the market clearing conditions for the differentiated good and housing to (16), we can decompose it into direct and indirect benefits.
The former ignores general equilibrium impacts on prices and quantities, whereas the latter, which corresponds to Harberger’s excess burden, captures the price and quantity effects induced by changes in the number of cities.

The excess burden in our model includes the distortion for the number of varieties in addition to the price distortions. Price distortions are well known: the price distortion of a differentiated good is $t \equiv p - c > 0$, where the inequality follows from (8). The property tax creates a price distortion for housing, but this does not generate any deadweight loss because we assume that the housing consumption is fixed. In our model with an endogenous number of varieties, we must extend Harberger’s excess burden to include the distortion for the number of varieties, which we simply call the variety distortion.

The variety distortion is the difference between the marginal social benefit and the marginal social cost of increasing the number of varieties. Because all consumers in a city benefit from the variety increase, the marginal social benefit of a variety is the sum of the marginal rates of substitution between the number of varieties and the numéraire in (2) over all residents in a city:

$$P_m \equiv N\,\frac{(\partial U/\partial U_M)\,u(x)}{\partial U/\partial x_0} = Np\,\frac{u(x)}{u'(x)}, \tag{17}$$

where the second equality results from the first-order condition for utility maximization (1). The foregoing expression shows that the number of varieties has the aspect of a local public good because the benefits of an extra unit are summed over all individuals to obtain the aggregate benefit. The marginal social cost is the production cost of an additional variety, $cY + F$. The variety distortion is then defined as the difference between the marginal social benefit of an additional variety and the marginal social cost of that variety:

$$T_m \equiv P_m - (cY + F) = \frac{pY}{u'(x)}\left[\frac{u(x)}{x} - u'(x)\right] > 0,$$

where we use the zero-profit condition, $\Pi^M = pY - (cY + F) = 0$, the market clearing condition, $\bar{N}x = nY$, and the love of variety condition (3) for the differentiated good.

17 Though this assumption is a strong one, which need not be satisfied in general, it will be satisfied in the various applications we present in the remainder of this paper.

The following proposition decomposes the change in the Allais surplus into the direct benefit and Harberger’s excess burden.

Proposition 1 (Extension of Harberger’s excess burden formula). Let $t \equiv p - c > 0$ and $T_m \equiv P_m - (cY + F) > 0$ be the price distortion of a variety and the variety distortion of the differentiated good, respectively. Then, an increase in the number of cities changes the Allais surplus by

$$\frac{dA}{dn} = DB - EB, \tag{18}$$

where

$$DB \equiv Y_0 + N\left[m(p - t)x + (p_H - t_H)x_H\right] \tag{19}$$

and

$$EB \equiv -n\left(Nmt\,\frac{dx}{dn} + T_m\,\frac{dm}{dn}\right) \tag{20}$$

are respectively the direct benefit and the extension of Harberger’s excess burden to include the variety distortion in addition to the price distortion.

Proof: See Appendix A1. ■

The direct benefit of creating a city involves the migration of consumers/workers from the existing cities to the new city. Because we assume the utility level to be fixed, the direct benefit appears as an increase in the Allais surplus: the amount of labor input that can be saved by creating a new city, ignoring indirect impacts through price changes. A new city increases the labor input there but existing cities need less because they have fewer residents to serve. The first term $Y_0 < 0$ on the right-hand side of the direct benefit (19) represents the former, and the second term captures the latter. Because the decrease in the total population of existing cities equals the population of the new city, $N$, the labor inputs saved are $Nm(p - t)x > 0$ and $N(p_H - t_H)x_H > 0$, yielding the second term in the direct benefit. When price distortions exist, creating a new city (or, equivalently, reducing the sizes of existing cities) changes the excess burden in the existing cities in addition to generating direct benefits.
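As a numerical illustration of the variety distortion, the following minimal sketch evaluates $P_m$ and $T_m$ in two ways and checks that they agree. It assumes a CES subutility $u(x) = x^{\rho}$ with $0 < \rho < 1$, for which the constant-markup rule $p = c/\rho$ applies; the functional form, the parameter values, and the function name `variety_distortion` are hypothetical choices made only for this example.

```python
def variety_distortion(N=100.0, c=1.0, F=2.0, rho=0.75):
    """Return (P_m, T_m computed directly, T_m computed from the closed form).

    Assumptions (not from the paper): CES subutility u(x) = x**rho and
    purely illustrative parameter values.
    """
    p = c / rho                      # markup pricing: (p - c) / p = 1 - rho
    Y = F / (p - c)                  # zero profit: pY = cY + F
    x = Y / N                        # market clearing within a city: N x = Y
    u = x ** rho                     # subutility level
    du = rho * x ** (rho - 1.0)      # marginal subutility u'(x)
    P_m = N * p * u / du             # marginal social benefit of a variety
    T_direct = P_m - (c * Y + F)     # definition: benefit minus production cost
    T_formula = (p * Y / du) * (u / x - du)  # closed form after substitution
    return P_m, T_direct, T_formula


P_m, T_direct, T_formula = variety_distortion()
```

Under these assumptions the ratio `T_direct / P_m` equals $1 - \rho$, which is one minus the elasticity of the subutility.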
As shown by Harberger (1964, 1971), the change in the excess burden can be expressed as the weighted sum of induced changes in consumption, with weights being the corresponding price distortions. The intuition is simple. An increase in the consumption of a good benefits the consumers but increases the social cost of production. The consumer and producer prices equal the marginal benefit for the consumer and the marginal cost of production, respectively, and the difference between them is the price distortion. The net social benefit of the increase in consumption is then given by the price distortion times the change in consumption. When the number of varieties is endogenous, the Harberger formula must be extended to include the induced change in the number of varieties multiplied by the variety distortion. Note that the excess burden does not have a term with the property tax distortion because housing consumption is assumed to be fixed. For expositional simplicity, we call the marginal increase in the excess burden simply the excess burden and denote it by $EB$. It is not a priori clear whether the excess burden is positive or negative. We return to this point in Section 2.6 below.

2.5. A second-best Henry George Theorem

We have so far considered the derivative $dA/dn$ to obtain an extension of Harberger’s excess burden formula. In this subsection, we first derive the condition for the optimal number of cities by setting $dA/dn = 0$, which is equivalent to $DB = EB$ from Proposition 1.18 We then rewrite $DB$ using the zero-profit conditions to establish the second-best HGT.

Proposition 2 (Optimal number of cities). If the number of cities is optimal, then $dA/dn = 0$ and the direct benefit equals the excess burden: $DB = EB$.

Proof: Immediate from Proposition 1. ■

Proposition 2 provides the condition for the optimal number of cities. In a symmetric case where all cities have the same allocation, this is equivalent to finding the optimal size of a city. Flatters et al.
(1974) considered the latter problem to obtain the first-best HGT for local public goods (called the ‘golden rule’ there). Because the number of cities satisfies $n = \bar{N}/N$, we can re-express our results in terms of the optimal size of a single city by maximizing the Allais surplus (15) with respect to the population of a city: $dA/dN = 0$. The following corollary states the condition for the optimal city size.

Corollary (Optimal size of a single city). If the city size is optimal at the initial equilibrium (in which $A = 0$ by definition), then we have $DB_N = EB_N$, where

$$DB_N \equiv \bar{x}_0 - x_0 - \left[m(p - t)x + (p_H - t_H)x_H\right] \tag{21}$$

and

$$EB_N \equiv -\left(Nmt\,\frac{dx}{dN} + T_m\,\frac{dm}{dN}\right) \tag{22}$$

can be interpreted as the direct benefit and the excess burden of increasing the population of a city, respectively.

Proof: See Appendix A2. ■

18 This maximization of distributable surplus $A$ is equivalent to maximizing the common utility for a fixed level of surplus under regularity conditions well known in the duality literature.

In a first-best economy, there are no price and variety distortions: $t = T_m = 0$. Hence, the excess burden is zero and the condition for the optimal population of a city is that the direct benefit of adding a worker is zero. This implies that the marginal product of labor equals the consumption by a worker.19 In our framework, the former is labor supply multiplied by its price (i.e., one), and the latter is the sum of the social costs of the differentiated good and housing that she consumes. The social costs of differentiated good consumption must be evaluated at marginal cost rather than at the market price. In a second-best economy with price and variety distortions, the direct benefit need not be zero but must equal the excess burden created by adding a worker in a city.

Now let us derive the second-best HGT by converting the direct benefit into its dual form that includes the total land rent in a city.
First, we define the fiscal surplus as the total land rent in a city plus the property tax revenue minus the cost of the local public good: $FS \equiv p_L X_L + N t_H x_H - C_G(Y_G)$. A positive value implies that a 100% tax on land rents plus the property tax raises enough government revenue to cover the provision of the public good and to generate a surplus. Applying the zero-profit conditions in the differentiated good and housing sectors to the direct benefit (19) yields a relationship between the fiscal surplus and the direct benefit. Since the direct benefit equals the excess burden at the second best by Proposition 2, the second-best Henry George Theorem follows immediately.

Theorem 1 (The second-best Henry George Theorem). When the number of cities is optimal, the fiscal surplus equals the excess burden caused by increasing the number of cities plus the total value of price distortions for residents who move from the existing cities to the new city:

$$FS = EB + T,$$

where the fiscal surplus is the total land rent in a city plus the property tax revenue minus the cost of local public goods: $FS \equiv p_L X_L + N t_H x_H - C_G(Y_G)$; and the total value of price distortions consists of the price distortions for the differentiated good and housing: $T \equiv (tmx + t_H x_H)N > 0$. The price distortion for the differentiated good equals the total fixed cost, $mF$, and that for housing equals the property tax revenue.

Proof: See Appendix A3. ■

In a first-best economy with no distortions, the fiscal surplus equals the direct benefit of adding a city, where the latter is zero when the number of cities is optimal. With price distortions, this is no longer true. The total value of price distortions for those who move from the existing cities to the new city must be subtracted from the fiscal surplus: $FS - T = DB$. The reason is that the direct benefit, which equals the labor input saved in the existing cities minus the labor input required in the new city, is measured by the producer-side shadow price. The fiscal surplus measured by the consumer price exceeds the direct benefit by the total value of price distortions for the movers. In particular, the property tax revenue that is part of the fiscal surplus is completely offset by the cost of price distortions, yielding zero net surplus.20

19 This condition reproduces the simple and appealing interpretation of the first-best HGT in Flatters et al. (1974, p. 102): at the optimum all wage income is devoted to private good consumption of workers and all land rents are devoted to production of the public good and private good consumption of landowners, a golden rule result. The private good consumption of landowners does not appear in our formulation because land is collectively owned by workers.

Theorem 1 shows that the sign of the fiscal surplus at the second-best optimum depends on the signs of the total value of price distortions and the excess burden. The former is positive because the price distortions are positive in our model, but the excess burden can be positive or negative as shown in the next subsection.21

The empirical implementation of the second-best HGT would require estimates of the price distortions, in addition to the total land rent in a city that is required for the first-best version. The tax distortions are not difficult to estimate, and there have been many attempts at estimating monopolistic price distortions (see, e.g., the recent methodology developed by De Loecker and Warzynski, 2012). Most difficult is the estimation of the variety distortion because the marginal social benefit of increasing the number of varieties is not observable directly. As discussed in Kanemoto (2013c), for differentiated intermediate goods, the aggregate production functions of final goods can be used to estimate the shadow price of increasing variety. Unfortunately, this method is not applicable to differentiated consumer goods because, contrary to output levels, utility levels are not observable.
One possible solution is to use variation in housing or land prices to infer the agglomeration benefits of variety. Because the utility level is equalized across cities when agents are mobile, any benefit of increased variety is capitalized into higher land prices.

2.6. The sign of the excess burden

Next, we examine factors that affect the sign of the excess burden. In Proposition 1, the excess burden contains only $dx/dn$ and $dm/dn$ after eliminating $dx_0/dn$ by using the relationship $dx_0/dn = -pm(dx/dn) - (P_m/N)(dm/dn)$ in Appendix A1, which is derived from the fixed utility constraint and the first-order conditions for consumer optimization.22 We now eliminate $dm/dn$ using the same relationship to obtain the following proposition.

Proposition 3 (The sign of the excess burden). The excess burden is given by

$$EB = -\bar{N}\left[pm\left(\frac{t}{p} - \frac{T_m}{P_m}\right)\frac{dx}{dn} - \frac{T_m}{P_m}\frac{dx_0}{dn}\right]. \tag{23}$$

The price margin equals $t/p = R_R(x) > 0$, where $R_R(x) \equiv -xu''(x)/u'(x)$ denotes the relative risk aversion. The variety margin satisfies $T_m/P_m = 1 - U_R(x) > 0$, where $U_R(x) \equiv xu'(x)/u(x)$ denotes the elasticity of the subutility function. Furthermore, $dx/dn > 0$.

Proof: See Appendix A4. ■

20 Note that both the second-best and first-best HGTs do not require the supply of a local public good to be optimal.

21 Note that this may not be true in general. If there is a price subsidy, for example, the price distortion can be negative.

22 Note that this condition is equivalent to the fact that the expenditure function is homogeneous of degree one with respect to prices.

The variety distortion part of the excess burden in Proposition 1 is now rewritten in terms of the changes in consumption of the differentiated good and the numéraire. For the differentiated good, the change in consumption must be multiplied by the difference between the price and variety margins, $t/p - T_m/P_m$, in addition to its price and the number of varieties.
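The step from Proposition 1 to Proposition 3 is pure substitution, and a small numerical check can make it concrete. The sketch below codes the excess burden in both forms and uses the fixed-utility relationship between $dx_0/dn$, $dx/dn$, and $dm/dn$ to move between them. The function names and all numbers are hypothetical; they are arbitrary inputs chosen to exercise the algebra, not equilibrium values of the model.

```python
def eb_prop1(n, N, m, t, T_m, dx_dn, dm_dn):
    """Excess burden in the Proposition 1 form: -n (N m t dx/dn + T_m dm/dn)."""
    return -n * (N * m * t * dx_dn + T_m * dm_dn)


def dx0_from_constraint(N, m, p, P_m, dx_dn, dm_dn):
    """Fixed-utility relationship: dx0/dn = -p m dx/dn - (P_m / N) dm/dn."""
    return -p * m * dx_dn - (P_m / N) * dm_dn


def eb_prop3(n, N, m, p, t, T_m, P_m, dx_dn, dx0_dn):
    """Excess burden in the Proposition 3 form, with total population n * N."""
    N_bar = n * N
    margin_gap = t / p - T_m / P_m   # price margin minus variety margin
    return -N_bar * (p * m * margin_gap * dx_dn - (T_m / P_m) * dx0_dn)


# Arbitrary illustrative inputs (hypothetical, not calibrated).
n, N, m, p, t, P_m, T_m = 5.0, 100.0, 20.0, 1.5, 0.5, 12.0, 3.0
dx_dn, dm_dn = 0.01, -0.8

dx0_dn = dx0_from_constraint(N, m, p, P_m, dx_dn, dm_dn)
eb_a = eb_prop1(n, N, m, t, T_m, dx_dn, dm_dn)
eb_b = eb_prop3(n, N, m, p, t, T_m, P_m, dx_dn, dx0_dn)
```

The two values `eb_a` and `eb_b` coincide for any inputs, since the second form only re-expresses $dm/dn$ through the utility constraint.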
The same structure applies to the numéraire, although the difference between the price and variety margins is simply $-T_m/P_m$ because its price margin is zero.

Since both the price and variety margins are nonnegative and less than one for the differentiated good, the sign of $t/p - T_m/P_m$ may be positive or negative. It is easy to see that both signs are indeed possible. The price margin equals the RRA, $t/p = R_R(x)$, and the variety margin equals one minus the elasticity of the subutility, $T_m/P_m = 1 - U_R(x)$.23 The variety margin depends on the level of subutility $u(x)$, although the price margin does not. Adding a positive (negative) constant, $a$, to the subutility function, $\tilde{u}(x) = a + u(x)$, makes the elasticity, $U_R(x) = xu'(x)/u(x)$, smaller (larger) and the variety margin larger (smaller). Furthermore, the elasticity of subutility can be made arbitrarily close to zero or one by the choice of the constant. Because the RRA is not affected by the constant term, the variety margin can be made larger (smaller) than the price margin by choosing a large (small) enough constant. The excess burden can, therefore, be positive or negative. The next subsection confirms this for a special functional form that extends the CES function. It is worth noting that $T_m/P_m - t/p$ equals the elasticity of the elasticity of subutility:

$$\varepsilon_U(x) \equiv \frac{dU_R(x)}{dx}\,\frac{x}{U_R(x)} = \frac{T_m}{P_m} - \frac{t}{p}. \tag{24}$$

From the second-order condition for a differentiated good producer, $dx/dn$ is positive.

As noted above, for the numéraire, the price margin is zero and the difference between the price and variety margins is $-T_m/P_m$, which is always negative. Unfortunately, we have not been able to obtain a general result on the sign of $dx_0/dn$. In a special case where consumption of leisure is fixed, as assumed in many of the earlier works such as Behrens and Murata (2009), it is zero. The next subsection shows that it is positive if the upper-tier utility function $U$ is Cobb-Douglas and the lower-tier utility function $U_M$ is CES as in Abdel-Rahman and Fujita (1990).

23 The variety margin is closely linked to the measure of taste for variety in Benassy (1996). In the additively separable case, his measure of taste for variety is $\nu(m) \equiv [m\gamma'(m)]/\gamma(m)$, where $\gamma(m) \equiv mu(x)/u(mx)$. This satisfies $\nu(m) = 1 - [mxu'(mx)]/u(mx)$, which is similar to our $T_m/P_m = 1 - xu'(x)/u(x)$. The difference is that in Benassy (1996), holding $mx$ constant implies that a change in $m$ has no impact on $\nu(m)$, which is not the case with our $T_m/P_m$. It is also equal to what Dhingra and Morrow (2013) call the ‘social markup’, which “denotes the utility from consumption of a variety net of its resource cost.”

2.7. Some functional form examples

This subsection examines examples with specific functional forms, which will allow us to establish that all different cases are possible for well-behaved utility functions.24

Example 1: A variable RRA form (extended CES)

The first example assumes that consumption of the numéraire (leisure) is fixed in addition to housing, i.e., $dx_0/dn = 0$. Under this assumption, only the sign of $T_m/P_m - t/p$ matters for characterizing the sign of the excess burden since $dx/dn > 0$. Ogaki and Zhang (2001) proposed a family of utility functions embedding varying degrees of RRA represented by a simple extension of the CES as follows: $v(x) = (x + \theta)^{(\sigma - 1)/\sigma}$. In our context of monopolistic competition, analyzing the case with $\theta < 0$ is involved because of a possible corner solution.25 We thus focus on the case where $\theta \geq 0$. Since the absolute level of utility matters for the sign of the excess burden, we add a constant term to $v(x)$:

$$u(x) = \begin{cases} a + (x + \theta)^{(\sigma - 1)/\sigma} & \text{for } x > 0, \\ 0 & \text{for } x = 0, \end{cases} \tag{25}$$

where $\sigma > 1$. The RRA in this case is given by $R_R(x) = x/[\sigma(x + \theta)]$ and its elasticity $\varepsilon_R(x) \equiv (dR_R(x)/dx)(x/R_R(x))$ is equal to $\varepsilon_R(x) = \theta/(x + \theta)$.
The specification we use hence encapsulates two different cases: (i) CRRA (constant RRA) when $\theta = 0$; and (ii) IRRA (increasing RRA) when $\theta > 0$. Although it does not encapsulate the DRRA (decreasing RRA) case, we now show that the sign of the excess burden can go either way depending on the value of $a$. The following result summarizes how the sign of the excess burden varies with the parameter $a$.

Result 1 (Variable RRA form). In the variable RRA case with $\sigma > 1$, we have $\varepsilon_U(x) = \varepsilon_R(x)$ for $a = 0$. Hence, when $a = 0$, $EB > 0$ in the IRRA case ($\theta > 0$), and $EB = 0$ in the CRRA case ($\theta = 0$). If $a$ is negative, then the excess burden may be negative even in the IRRA case.

Proof: See Appendix A5. ■

24 See the discussion paper version (Behrens et al., 2010) for additional examples including CARA preferences.

25 We thank Sergey Kokovin for bringing this problem to our attention.

These results are illustrated in Figure 1. Some comments are in order. First, in the ‘standard’ CES case (with $\theta = a = 0$), the excess burden is zero, even in the second best. This confirms the results obtained by Duranton and Puga (2001) and by Behrens and Murata (2009). Second, if $\theta > 0$ and $a = 0$, the sign of the excess burden is positive. Last, an increase (decrease) in $a$ tends to make the excess burden larger (smaller). This implies that (i) even in the CRRA case the excess burden is positive (negative) if $a$ is positive (negative), and (ii) the excess burden can become negative even in the IRRA case. When taken together, these findings show that the case of zero excess burden is a rather special one which occurs for a zero-measure set of parameter values.

Figure 1. Variable RRA case: sign of excess burden in $(\theta, a)$-space

Example 2: Intersectoral distortions

Proposition 3 shows that even if distortions exist only in the differentiated good sector, a change in consumption of leisure induces a change in the excess burden. This is an example of intersectoral effects that are associated with changes in the variety distortion.
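The sign pattern in Result 1 can be explored numerically. The following minimal sketch computes the price margin and the variety margin for the extended CES subutility with a constant term and returns their difference, which has the sign of the excess burden when leisure is fixed and $dx/dn > 0$. It also estimates the elasticity of the elasticity of subutility by finite differences as a cross-check. Function names, the step size, and the parameter values used below are hypothetical, and the sketch ignores the value assigned to the subutility at zero consumption.

```python
import math


def margins(x, sigma, theta, a):
    """Price margin t/p and variety margin T_m/P_m for
    u(x) = a + (x + theta)**((sigma - 1)/sigma), x > 0.

    Assumes sigma > 1, theta >= 0, and a such that u(x) > 0.
    """
    rho = (sigma - 1.0) / sigma
    u = a + (x + theta) ** rho
    du = rho * (x + theta) ** (rho - 1.0)
    d2u = rho * (rho - 1.0) * (x + theta) ** (rho - 2.0)
    price_margin = -x * d2u / du        # relative risk aversion
    variety_margin = 1.0 - x * du / u   # one minus elasticity of subutility
    return price_margin, variety_margin


def eb_sign_index(x, sigma, theta, a):
    """Variety margin minus price margin; same sign as EB when dx0/dn = 0."""
    price_margin, variety_margin = margins(x, sigma, theta, a)
    return variety_margin - price_margin


def elasticity_of_elasticity(x, sigma, theta, a, h=1e-6):
    """Central-difference estimate of d ln U_R / d ln x."""
    def log_elasticity(z):
        _, variety_margin = margins(z, sigma, theta, a)
        return math.log(1.0 - variety_margin)
    step = math.log(1.0 + h) - math.log(1.0 - h)
    return (log_elasticity(x * (1.0 + h)) - log_elasticity(x * (1.0 - h))) / step


ces = eb_sign_index(1.0, 2.0, 0.0, 0.0)         # standard CES
irra = eb_sign_index(1.0, 2.0, 1.0, 0.0)        # IRRA with a = 0
irra_neg_a = eb_sign_index(1.0, 2.0, 1.0, -1.0) # IRRA with a negative constant
```

In this parameterization the index is zero for the standard CES, positive in the IRRA case without a constant, and negative once a sufficiently negative constant is added, in line with the pattern described in Result 1.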
We next examine the sign of the intersectoral effect, assuming that the consumption of the numéraire good and that of the differentiated good are aggregated via a Cobb-Douglas form and that the lower-tier utility function $U_M$ has a CES form:

$$U(x_0, U_M, x_G, x_H) = \tilde{U}\left(x_0^{\alpha}\left[\int_0^m x_j^{(\sigma - 1)/\sigma}\,dj\right]^{1 - \alpha},\; x_G,\; x_H\right), \tag{26}$$

where we assume that $\sigma > 1$ and $0 < \alpha < 1$.26 The RRA in this case is constant and given by $R_R(x) = 1/\sigma$, so that $t/p = 1/\sigma$. The elasticity of utility is given by $U_R(x) = (\sigma - 1)/\sigma$, and hence $T_m/P_m = 1/\sigma$. Combining these two results yields $\varepsilon_U(x) = 0$. The sign of the excess burden thus depends solely on the sign of the intersectoral distortion as captured by $T_m/P_m$ and $dx_0/dn$. We can show the following result.

Result 2 (Intersectoral distortion). If the utility function is given by (26), then $t/p = T_m/P_m = 1/\sigma$, so that $\varepsilon_U(x) = 0$. Furthermore, $dx_0/dn = ((1 - \alpha)/\sigma)\,x_0/n$. Hence, the excess burden is positive:

$$EB = \bar{N}\,\frac{T_m}{P_m}\,\frac{dx_0}{dn} = \frac{\bar{N}}{\sigma}\,\frac{dx_0}{dn} > 0.$$

Proof: See Appendix A6. ■

Result 2 shows that, in the Cobb-Douglas/CES case, the excess burden is positive at the second-best city size. Once the intersectoral distortions disappear from the model, the excess burden is zero. This confirms Result 1: the excess burden is zero when the utility function for the differentiated good is of the CES type.

26 This formulation is similar to that in Abdel-Rahman and Fujita (1990), but there are two major differences. First, they focus on a differentiated intermediate good, instead of a differentiated consumption good. Second, they use $\left[\int_0^m x_j^{(\sigma - 1)/\sigma}\,dj\right]^{\sigma/(\sigma - 1)}$, whereas we use $\int_0^m x_j^{(\sigma - 1)/\sigma}\,dj$ for $U_M$. The reason for the latter choice is that we can show that as $\alpha$ goes to zero, $U(x_0, U_M, x_G, x_H)$ in (26) boils down to a special case of a variable RRA model that we analyze in Result 1.

2.8. The optimal supply of local public goods

We now examine the optimal supply of local public goods. The Allais surplus is now a function of the local public goods in all cities in addition to the number of cities: $A(\mathbf{Y}_G, n)$, where $\mathbf{Y}_G$ is a vector of the local public goods in all cities (recall that each local public good is, by definition, only consumed in the city it is provided in). Optimization with respect to each local public good requires $\partial A/\partial Y_G^{\kappa} = 0$ for any city $\kappa$, where $Y_G^{\kappa}$ denotes the supply of the local public good in city $\kappa$.27 Because of our symmetry assumption, all cities have the same allocation at the optimum, but in order to obtain the optimality condition for a local public good we have to break symmetry and examine the effect of changing the supply of the local public good in a city with those in other cities held fixed. In the following proposition, therefore, we have to add superscript $\kappa$ to indicate city $\kappa$ in the excess burden formula.28

Proposition 4 (The optimal supply of a local public good). The optimality condition for the supply of a local public good is

$$P_G - C_G'(Y_G) = EB_G, \tag{27}$$

where

$$P_G \equiv N\,\frac{\partial U/\partial x_G}{\partial U/\partial x_0} \tag{28}$$

is the marginal benefit of increasing the supply of the local public good and

$$EB_G \equiv -\left(Nmt\,\frac{\partial x^{\kappa}}{\partial Y_G^{\kappa}} + T_m\,\frac{\partial m^{\kappa}}{\partial Y_G^{\kappa}}\right) - (n - 1)\left(Nmt\,\frac{\partial x}{\partial Y_G^{\kappa}} + T_m\,\frac{\partial m}{\partial Y_G^{\kappa}}\right) \tag{29}$$

is the excess burden. In the partial derivatives in the excess burden formula, $x$ and $m$ without superscript $\kappa$ denote those in cities other than city $\kappa$.

Proof: See Appendix A7. ■

27 This same condition holds regardless of whether the number of cities is fixed exogenously or optimized simultaneously with the local public goods. In the latter case, the first-order conditions are simply this condition plus $\partial A/\partial n = 0$, and Proposition 2 remains valid as the optimality condition for the number of cities.

28 Except for the excess burden, we do not have to add superscript $\kappa$, since all variables take equal values across cities in equilibrium.

Thus, the supply of the local public good is optimal when the marginal benefit minus the marginal cost equals the excess burden.
The marginal benefit equals the sum of the marginal rates of substitution between the local public good and the numéraire over all city residents, and the excess burden is given by the sum of the induced changes in consumption of the differentiated good and the number of its varieties multiplied by the associated price distortions. Note that the induced changes differ between the city where the local public good is increased and the other cities.

It is worth emphasizing that equation (29) captures the interaction between the local public good, on the one hand, and the price and variety distortions, on the other hand. If there were neither price nor variety distortions (i.e., $t = T_m = 0$), then $EB_G$ would be zero. In a second-best world, however, $EB_G$ would not be zero, thus suggesting that ignoring the price and variety distortions may lead to an incorrect assessment of the impact of the local public good on the excess burden. This result is important, especially in an urban context, since the local public goods and product diversity (and the associated price competition) are the sources of agglomeration that have been highlighted in the classic urban economics literature and in the new economic geography, respectively.

3. Extension to a non-separable utility function, a general production technology, and congestible local public goods

In the additively separable cases analyzed in the preceding section, the first-order condition for profit maximization and the zero-profit condition determine $p$ and $x$ as functions of the number of cities, $p^*(n)$ and $x^*(n)$. Hence, given $n$, these two variables are determined solely by the shape of the subutility function $u(x)$, via the relative risk aversion and the elasticity of $u(x)$, and the cost structure of the differentiated good sector, i.e., $c$ and $F$. This implies that the sign of $T_m/P_m - t/p$ does
not depend on sectors other than that of the differentiated good, which allows us to obtain Proposition 3 to sign the excess burden for the special cases in Results 1 and 2.

In this section we show that our main results, namely Propositions 1 and 2 and Theorem 1, remain valid without the additive separability assumption. In addition to extending the analysis to nonseparable utility functions, we assume a general production technology with multiple differentiated and other goods. We also allow for the possibility of congestion effects for local public goods.29 In order to keep the analysis simple, we stick to the assumptions of identical cities and homogeneous consumers.30 Furthermore, assuming that the supplies of local public goods are fixed, we concentrate on the optimal number of cities.31

3.1. The model

Our model has five categories of goods: the numéraire, differentiated goods, local public goods, land and other factors attached to a city, and the rest of the goods, which include the structural part of housing. Let $\mathcal{I} = \{0, 1, \dots, I\}$ denote the set of indices of goods in the economy, where good 0 is taken as the numéraire. Unlike in Section 2, the numéraire does not have to be leisure and it may or may not be tradable. The other four categories are as follows. The first category, $\mathcal{M}$, represents differentiated goods (either consumer goods or intermediate goods), each of which consists of differentiated varieties. We denote by $m = \{m_i\}_{i \in \mathcal{M}}$ the ‘numbers’ (or more precisely, the masses, since we work with a continuum) of varieties for each differentiated good. The second category, $\mathcal{G}$, contains local public goods. Both $\mathcal{M}$ and $\mathcal{G}$ are not tradable between cities and generate agglomeration forces. The third category, $\mathcal{L}$, consists of land and other factors whose supplies are fixed in each city at $\bar{X}_k$, $k \in \mathcal{L}$. $\mathcal{L}$ is also non-traded between cities but generates dispersion forces.
As noted in Section 2, the most important feature of the third category is that adding a new city increases the use of these goods. We refer to them as ‘land’, but they include other resources attached to the area. The last category, $\mathcal{H}$, may or may not be traded, and includes all remaining goods as well as housing. For short, we call them ‘housing’. Hence $\mathcal{I} = \{0\} \cup \mathcal{M} \cup \mathcal{G} \cup \mathcal{L} \cup \mathcal{H}$.

The utility function of a representative consumer is given by $U(\{U_i\}_{i \in \mathcal{M}}, \{x_i\}_{i \notin \mathcal{M}})$, which consists of subutility functions $\{U_i\}_{i \in \mathcal{M}}$ for differentiated goods and the consumption vector of other goods that include the numéraire, local public goods, land, and housing: $\{x_i\}_{i \notin \mathcal{M}} = (x_0, \{x_i\}_{i \in \mathcal{G}}, \{x_i\}_{i \in \mathcal{L}}, \{x_i\}_{i \in \mathcal{H}})$. We formulate the utility function $U$ in a general form and impose sufficient restrictions to guarantee a unique optimum.

29 See Berglas and Pines (1981) and Hoyt (1991) for early contributions that consider the HGT and congestible public goods. Papageorgiou and Pines (1999, Ch. 10) provide a more recent treatment.

30 It is not difficult to generalize the framework to heterogeneous cities and consumers, in which case the HGT holds approximately for the marginal city. See also Kanemoto (2013a) for the analysis of heterogeneous cities in the context of project evaluation.

31 We do not consider the condition for the optimal supply of a public good because it is basically the same as that obtained in Section 2.8.

Each subutility function $U_i$, in turn, is defined over a continuum of varieties, denoted by $\{x_{ij}, j \in [0, m_i]\}_{i \in \mathcal{M}}$, so that $U_i = U_i(x_{ij}, j \in [0, m_i])$, where $[0, m_i]$ denotes the range of varieties of differentiated good $i$. We focus on a class of subutility functions that allow for an integral representation and love of variety. Thus, wherever our model involves a continuum of varieties, the associated expressions are well-behaved and representable as integrals.
In what follows, to alleviate notation, we denote utility by U (x, m) for short.32 We add m to make it clear that utility depends on the masses of varieties via the subutility functions U i iM . We allow for the possibility that U i is not additively separable, like the quadratic quasi-linear utility function used by Vives (1985) and Ottaviano et al. (2002).33 Finally, each consumer has an initial endowment of the numéraire, x0 , and owns land and firms collectively. There are no endowments of differentiated goods, local public goods, and housing. 3.2. Consumption Let p ({ pij , j [0, mi ]}iM , pi iM ) be the vector of consumer prices, where { pij , j [0, mi ]}iM is the price pi iM p0 , pi iG , pi iL , pi iH vector of the differentiated goods and is that of other goods. The consumer price of a local public good here is its shadow price, i.e., the marginal rate of substitution between the local public good and the numéraire. Although local public goods are congestible, we assume that no price is charged for them. The government finances the local public goods solely by taxes and other revenues, but not by user fees. A consumer receives income from the initial endowment of the numéraire, x0 , as well as from the equal share, s , of land rents, profits, and the deficits of the governments: x0 s . A consumer maximizes utility U (x, m) with respect to x and m, mi subject to the budget constraint x0 s iM pij xij dj iM ,iG pi xi 0 and the constraints on the numbers of varieties, mi miS , where miS is the number of varieties available in the market for differentiated good i. Assuming that the utility function is differentiable, we can write the first-order conditions for quantities consumed as: 32 Note that our vector x of goods may include intermediate goods, which always take the value zero for final consumers. 
$$\frac{\partial U(x, m)/\partial x_{ij}}{\partial U(x, m)/\partial x_0} = p_{ij}, \quad i \in M,\ j \in [0, m_i], \quad \text{and} \quad \frac{\partial U(x, m)/\partial x_i}{\partial U(x, m)/\partial x_0} = p_i, \quad i \notin M. \tag{30}$$

33 The quadratic quasi-linear utility function, $U(x, m) = \alpha \int_0^m x_j\, dj - \frac{\beta - \gamma}{2} \int_0^m (x_j)^2\, dj - \frac{\gamma}{2} \left( \int_0^m x_j\, dj \right)^2 + x_0$, is not additively separable because it contains an interactive term as the third term on the right-hand side.

Note that the local public goods, which are not choice variables for a consumer, are included in the latter condition because we have defined their consumer (shadow) prices as the marginal rates of substitution.

For the choice of the number of varieties, a consumer may have the option of not consuming all the varieties produced. However, the assumption of homogeneous consumers implies that all varieties supplied in the market will be consumed: if one consumer chooses zero consumption of a variety, all other consumers will do the same, making the market demand zero (and thus the variety would not be available in the first place). We therefore take it that a consumer consumes all varieties in the available range so that $m_i = m_i^S$. The number of varieties then satisfies:34

$$\frac{\partial U(x, m)/\partial m_i}{\partial U(x, m)/\partial x_0} \ge p_{ij} x_{ij}, \quad i \in M,\ j \in [0, m_i], \tag{31}$$

which is a generalization of (2). Note that $\partial U(x, m)/\partial m_i$ is a shorthand for a change in utility caused by increasing the consumption of the marginal variety, $x_{i m_i}$, from 0 to the utility-maximizing level.

3.3. Production

Our model has three types of products: differentiated goods, local public goods, and housing.35 First, the production of differentiated good $i \in M$ in a city is represented by a transformation function $F^i(Y_{ij}, Y_0^{ij}, j \in [0, m_i]) = 0$, where $Y_{ij}$ and $Y_0^{ij}$ respectively denote the output of variety $j$ and the input of the numéraire. For notational simplicity, we assume that the numéraire is the only input in the differentiated good sector.36 A differentiated good firm has monopoly power and the producer shadow price diverges from the consumer price that it receives.
By the free entry assumption, however, the profit is zero in equilibrium. The profit of the producer of variety $j$ of differentiated good $i$ then satisfies

$$\pi_{ij} = p_{ij} Y_{ij} + Y_0^{ij} = 0, \quad i \in M,\ j \in [0, m_i]. \tag{32}$$

We continue to follow the sign convention that inputs take negative values and outputs take positive values.

34 For the range of varieties consumed to be represented as an interval $[0, m_i]$, we assume that the variety index $j$ can be ordered so that $(\partial U(x, m)/\partial m_i)/(p_{ij} x_{ij})$ evaluated at the utility-maximizing consumption level is not increasing in $j$. This can always be done without loss of generality.
35 The numéraire can be a product but for simplicity we ignore this possibility.
36 It is straightforward to introduce inputs other than numéraire.

Next, for local public goods and housing, the transformation function is $F^i(Y_i, Y_0^i, \{Y_k^i\}_{k \in L}) = 0$, $i \in G \cup H$, where $Y_i$ denotes the output of good $i$, and $Y_0^i$ and $\{Y_k^i\}_{k \in L}$ respectively denote the numéraire and land inputs. Labor and land inputs in the production of local public goods are assumed to be fixed, i.e., $Y_0^i = \bar{Y}_0^i$ and $Y_k^i = \bar{Y}_k^i$ for any $i \in G$ and $k \in L$, so that the supplies of local public goods are fixed: $Y_i = \bar{Y}_i$ for any $i \in G$.37 However, the consumption of a local public good is endogenous because of congestion effects discussed below. The total cost of local public goods for a city is:38

$$C_G = -\sum_{i \in G} \left( \bar{Y}_0^i + \sum_{k \in L} p_k \bar{Y}_k^i \right). \tag{33}$$

Housing production requires the numéraire and land inputs, denoted respectively by $Y_0^i$, $i \in H$, and $Y_k^i$, $i \in H$, $k \in L$. As in the preceding section, we assume that the profit in the housing sector is zero. The profit in the housing sector in a city then satisfies

$$\pi_i = (p_i - t_i) Y_i + Y_0^i + \sum_{k \in L} p_k Y_k^i = 0, \quad i \in H. \tag{34}$$

A property tax $t_i$ causes price distortion in the housing sector. We assume that the property taxes are uniform across all cities.

We do not spell out the details of producer decisions here because we need a formulation that permits monopolistic as well as competitive behavior of producers.
The only assumption we make is that net outputs are uniquely determined and satisfy the transformation functions.

3.4. Market clearing conditions

Now, we obtain the market equilibrium conditions, given the number of cities. First, market clearing for the numéraire good requires that:

$$\bar{N}(x_0 - \bar{x}_0) = n Y_0, \tag{35}$$

where its total supply in a city satisfies:

$$Y_0 = \sum_{i \in M} \int_0^{m_i} Y_0^{ij}\, dj + \sum_{i \in H} Y_0^i + \sum_{i \in G} \bar{Y}_0^i. \tag{36}$$

For the differentiated goods, which we assume to be non-tradable, the market clearing conditions are given by (recall that there are no initial endowments of differentiated goods):

$$N x_{ij} = Y_{ij}, \quad i \in M,\ j \in [0, m_i]. \tag{37}$$

37 We assume that differentiated goods and housing goods are not used in the production of local public goods. Again, we could relax this assumption at the expense of heavier notation.
38 Note that, unlike in Section 2 where $C_G(Y_G)$ denotes the cost function, $C_G$ here represents only the amount of the total cost.

As noted before, land has the special characteristic that if the number of cities increases, the total amount of land used for consumption and production increases by $X_i$ for $i \in L$ of the added city.39 The market clearing condition for land is hence:

$$\bar{N} x_i = n X_i + n \left( \sum_{k \in G} \bar{Y}_i^k + \sum_{k \in H} Y_i^k \right), \quad i \in L. \tag{38}$$

For housing, which may or may not be traded, we have

$$\bar{N} x_i = n Y_i, \quad i \in H. \tag{39}$$

Note that the equilibrium conditions for non-traded goods must hold for each city separately, but at a symmetric equilibrium, these constraints are always satisfied. We therefore use the aggregate market clearing conditions, (35), (38), and (39), in what follows, regardless of whether the goods are traded or not.

Because migration is free and costless, the utility levels in all cities must be equalized. This equal utility condition and the above market clearing conditions, together with consumer and producer decisions, determine the equilibrium allocation.
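The step from city-level to aggregate conditions can be made explicit. The following is a sketch for non-traded housing, assuming (in our notation) that $\bar{N}$ is the total population and that each of the $n$ symmetric cities has population $N = \bar{N}/n$:

```latex
N x_i = Y_i \ \ \text{(in each city)}
\;\Longrightarrow\; n N x_i = n Y_i
\;\Longrightarrow\; \bar{N} x_i = n Y_i , \qquad i \in H .
```

Summing the identical city-level conditions over the $n$ cities reproduces the aggregate condition for housing, which is why the aggregate form can be used whether or not the good is traded.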
Assuming that the supply of each local public good is fixed, we focus on a class of utility and production functions such that the equilibrium is unique and write the equilibrium utility as $U(x(n), m(n))$.

3.5. A general version of the second-best Henry George Theorem

Assuming that the supplies of local public goods are fixed, we obtain the condition for the optimal number of cities. We do not consider the condition for the optimal supply of each public good because it is basically the same as that obtained in Section 2.8.

39 When we develop a new city, people start to use the land. In the market clearing condition, this looks like newly supplied land. Alternatively, we may think about ‘raw’ land being available in limitless supply, whereas developing a new city requires ‘constructible’ land, the supply of which increases if a new city is created. In any case, we think about land as being ‘freely available’ when developing a new city. We do not consider the conversion of non-urban land to urban land in existing cities, i.e., once a city is developed, its stock of land is fixed. We could relax this assumption and also introduce the size elasticity of city-wide surface with respect to population.

The first step is to evaluate, in pecuniary units, the utility change driven by a change in the number of cities $n$. As in Section 2, we use the Allais surplus, defined as the maximum amount of the numéraire good that can be extracted from the economy with the utility level being fixed at $\bar{U} = U(x(n), m(n))$ obtained in the preceding subsection. For each $n$, given the optimal choices of consumers and producers, the market clearing conditions (37), (38), (39) determine the equilibrium values of all endogenous variables, where the market clearing condition for the numéraire, (35), is replaced by the fixed utility constraint. As said before, we assume that, given the number of cities, a unique equilibrium exists to determine the Allais surplus as a
function of $n$: $A(n) = n Y_0(n) - \bar{N}(x_0(n) - \bar{x}_0)$.

First, let us define price and variety distortions. Following the notation of Boadway and Bruce (1984, Chs. 8.7 and 10.3), we denote the price distortions by $t = (\{t_{ij}, j \in [0, m_i]\}_{i \in M}, \{t_i\}_{i \notin M})$ and the shadow prices on the production side by $p - t$.40 In our model, the price distortions are caused by the monopoly power in the differentiated goods markets and the property taxes. We normalize the shadow prices such that $t_0 = 0$.41 The producer shadow prices of the differentiated goods and housing are given by the marginal rates of transformation,

$$p_{ij} - t_{ij} = \frac{\partial F^i/\partial Y_{ij}}{\partial F^i/\partial Y_0^{ij}}, \quad i \in M,\ j \in [0, m_i]; \qquad p_i - t_i = \frac{\partial F^i/\partial Y_i}{\partial F^i/\partial Y_0^i}, \quad i \in H; \tag{40}$$

and in the housing sector the price of land satisfies

$$p_k = \frac{\partial F^i/\partial Y_k^i}{\partial F^i/\partial Y_0^i}, \quad i \in H \text{ and } k \in L, \tag{41}$$

where we assume differentiability of the transformation function.

40 In general, the vector of price distortions $t$ can depend on the number of cities $n$. We suppress this argument to alleviate the notational burden.

As in (17), we define the consumer-side shadow price of product diversity (i.e., the value of a marginal increase in the mass of varieties $m_i$ for good $i \in M$) by

$$P_{m_i} \equiv N \frac{\partial U(x, m)/\partial m_i}{\partial U(x, m)/\partial x_0}, \quad i \in M. \tag{42}$$

Note that from condition (31) the consumer-side shadow price of product diversity is larger than or equal to the total expenditure on a variety in a city:

$$P_{m_i} = N \frac{\partial U(x, m)/\partial m_i}{\partial U(x, m)/\partial x_0} \ge N p_{ij} x_{ij}, \quad i \in M,\ j \in [0, m_i]. \tag{43}$$

The producer shadow price of increasing the mass of varieties of good $i \in M$ is its production cost:

$$P_{m_i} - T_{m_i} = -Y_0^{i m_i}. \tag{44}$$

The local public goods require special attention. Although we are assuming that the supply of a local public good is fixed, its consumption can change when the population of a city changes, because of congestion effects. The consumption of a local public good is given by:
$$x_i = G_i(\bar{Y}_i, N), \quad i \in G, \tag{45}$$

where $\partial G_i/\partial N \le 0$ reflects congestion effects. The special case of a pure local public good considered in Section 2 assumes $\partial G_i/\partial N = 0$. Although the consumer-side shadow price given by (30) is positive for a local public good, the social cost of a change in its consumption is zero because its supply and hence its production cost are fixed. For a local public good, therefore, we can take the producer shadow price to be zero, i.e., $p_i - t_i = 0$ for $i \in G$. The price distortion then equals the consumer-side shadow price: $t_i = p_i$.

41 As is well known in the optimal commodity tax literature, e.g., Auerbach (1985, pp. 89–90), taking the price distortion to be zero on one of the goods is just a normalization.

As in Section 2.4, differentiating the Allais surplus with respect to the number of cities, $n$, and applying the definitions of price and variety distortions yields an extension of Harberger’s excess burden analogous to that from Proposition 1.

Proposition 5 (Extension of Harberger’s excess burden formula). Let $t_{ij}$, $T_{m_i}$, and $t_i$ denote the price distortion of variety $j$ of differentiated good $i$, the variety distortion of differentiated good $i$, and the price distortion of good $i$, respectively, where $t_i = p_i$ for $i \in G$. Then, an increase in the number of cities changes the Allais surplus by

$$\frac{dA}{dn} = DB - EB, \tag{46}$$

where

$$DB = Y_0 + N \left[ \sum_{i \in M} \int_0^{m_i} (p_{ij} - t_{ij}) x_{ij}\, dj + \sum_{i \in H} (p_i - t_i) x_i + \sum_{i \in L} p_i x_i \right] \tag{47}$$

$$EB = -n \left[ \sum_{i \in M} \left( N \int_0^{m_i} t_{ij} \frac{dx_{ij}}{dn}\, dj + T_{m_i} \frac{dm_i}{dn} \right) + N \sum_{i \in H} t_i \frac{dx_i}{dn} + N \sum_{i \in G} t_i \frac{dx_i}{dn} \right] \tag{48}$$

are respectively the direct benefit and the extension of Harberger’s excess burden to include variety distortions in addition to price distortions.

Proof: See Appendix A8. ■

Proposition 5 shows that Proposition 1 does not depend on the separability assumption and a specific production technology. The direct benefit is given by the social value of the decrease in consumption in existing cities minus the cost of supporting residents in the new city.
The former must be evaluated by using the producer shadow prices when price distortions exist. The excess burden is the change in consumption multiplied by the corresponding price distortion, summed over all goods and consumers. As before, the excess burden includes variety distortions. In addition, it contains the congestion effects for local public goods, as can be seen from the last term in (48). For local public goods, we have

$$\frac{dx_i}{dn} = \frac{\partial G_i}{\partial N} \frac{\partial N}{\partial n} \ge 0, \quad i \in G, \tag{49}$$

which is nonnegative as it is for differentiated goods. A decrease in city size caused by an increase in the number of cities reduces congestion in the consumption of the local public goods. This is captured as a decrease in the excess burden in the proposition.

Following the same procedure as in Section 2.5, we now obtain the general version of the second-best Henry George Theorem. First, if the number of cities is second-best optimal, the direct benefit equals the excess burden, yielding the generalization of Proposition 2. Next, the fiscal surplus of a local government that finances the costs of local public goods by revenues from land rents and the property tax can be defined as:

$$FS = \sum_{i \in L} p_i X_i + N \sum_{i \in H} t_i x_i - C_G. \tag{50}$$

The fiscal surplus here does not contain profits because they are assumed to be zero. Applying the zero-profit conditions to the direct benefit, we can show that the fiscal surplus equals the sum of the direct benefit and the total value of price distortions for residents who move from the existing cities to the new city, $T$: $FS = DB + T$. The second-best Henry George Theorem then follows immediately.

Theorem 2 (The second-best Henry George Theorem). When the number of cities is optimal, the fiscal surplus equals the sum of the excess burden and the total value of price distortions for those who move from the existing cities to the new city: $FS = EB + T$, where the fiscal surplus is given by (50) and the total value of price distortions is:

$$T = N \left[ \sum_{i \in M} \int_0^{m_i} t_{ij} x_{ij}\, dj + \sum_{i \in H} t_i x_i \right]. \tag{51}$$
Proof: See Appendix A9. ■

4. Concluding remarks

Most factors that cause urban agglomeration involve some form of market failure. In this article, we have identified conditions under which the HGT, or the ‘golden rule’ of local public finance, holds in a second-best world and investigated in which direction the theorem needs to be modified otherwise. In so doing, we have largely focused on monopolistic competition models that are widely used in urban economics and the NEG to generate local increasing returns to scale and in which distortions in both prices and product diversity matter.

Our key findings may be summarized as follows. First, the net benefit of creating a new city can be decomposed into two parts: the direct benefit that ignores indirect impacts through price changes, and the excess burden that captures the welfare impacts of the induced changes. As shown by Harberger (1964, 1971), if price distortions exist, the excess burden can be expressed as the weighted sum of induced changes in consumption with weights being the corresponding price distortions. In our model based on monopolistic competition, this must be extended to include the induced change in the number of varieties multiplied by the variety distortion. This result parallels those obtained by Kanemoto (2013a, b) for the benefits of transportation improvements in monopolistic competition models of urban agglomeration.

Using the zero-profit conditions, we can convert the direct benefit into its dual form that includes the total land rent in a city, thereby establishing the HGT. In a first-best economy with no distortions, the fiscal surplus equals the direct benefit, yielding the result that the fiscal surplus is zero when the number of cities is optimal. This is the ‘standard’ HGT where a 100% tax on land raises exactly the revenue required to finance the local public goods.
With price distortions, this is no longer true and the total value of price distortions for residents who move from the existing cities to the new city must be subtracted from the fiscal surplus. The second-best HGT then states that the fiscal surplus equals the sum of the excess burden and the total value of price distortions, where the latter is obtained by summing, over all goods, the price distortion times the total consumption for the movers.

The excess burden can be expressed as an extended Harberger formula: the sum of induced quantity and variety changes from adding another city, weighted by the associated price and variety distortions. In the additively separable case, the sign of the excess burden is determined by the relative magnitudes of the variety and price distortions, which are in turn linked to the elasticity of the subutility and the relative risk aversion, respectively. We have shown that the price distortion tends to make the excess burden negative, while the variety distortion works in the opposite direction. When the subutility is shifted upwards, the variety distortion becomes larger and the excess burden is more likely to be positive. Last, we have shown that the HGT holds, in general, only for a measure-zero subset of cases at the second best. Whether the fiscal surplus is positive or negative, i.e., whether a 100% tax on land and property tax revenue can finance the public goods at the optimal city size, depends strongly on the specification of the model.

Concerning the local public finance issues, we have three main results. First, the Samuelsonian condition for the optimal supply of a local public good must be extended to include the excess burden: the marginal benefit of increasing the supply of the local public good equals the marginal cost plus the excess burden. Second, if a city government levies property taxes, the fiscal surplus includes the tax revenue.
However, it is also a component of the total value of price distortions, and these two offset each other. Third, a congestible local public good involves price distortion, which must be included in the excess burden. A decrease in city size reduces congestion in the consumption of the local public good, which results in a reduction in the excess burden.

Last, we find that ignoring the price and variety distortions may lead to an incorrect assessment of the impact of the local public good on the excess burden. This result is important, especially in an urban context, since the local public goods and product diversity (and the associated price competition) are the sources of agglomeration that have been highlighted in the classic urban economics literature and in the new economic geography, respectively.

A first natural extension of the present work would be to depart from symmetry. As shown in Kanemoto (2013a), an extension to asymmetric cities is not difficult. Introducing asymmetry among producers of differentiated goods, as done in the context of international trade by Melitz (2003), is more involved and left for the future. However, our result that the elasticity of the subutility and the relative risk aversion matter for the gap between the first- and the second-best allocations under firm heterogeneity, as well as for comparative static results of equilibria with respect to key variables such as market size, has been recently reconfirmed in different contexts (see Dhingra and Morrow, 2013, for the former, and Behrens et al., 2014, for the latter).

Another direction for future work is the empirical implementation of our results following Kanemoto et al. (1996, 2005). Our more general conditions may be useful to evaluate whether observed city sizes are likely to be too large or too small.
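As a rough illustration of how such an evaluation might be organized, the sketch below combines $dA/dn = DB - EB$ from Proposition 5 with $FS = DB + T$, which gives $dA/dn = FS - T - EB$. The function names, argument names, and tolerance are hypothetical and are not part of the paper; obtaining empirical estimates of $EB$ and $T$ is the difficult step and is not addressed here. The classification is only local, in the sense that it reads off the sign of $dA/dn$ at the observed allocation.

```python
def marginal_allais_surplus(fiscal_surplus, total_distortion, excess_burden):
    """Return dA/dn = DB - EB, using DB = FS - T (per-city values in numeraire units)."""
    direct_benefit = fiscal_surplus - total_distortion  # DB = FS - T
    return direct_benefit - excess_burden               # dA/dn = DB - EB


def city_size_diagnosis(fiscal_surplus, total_distortion, excess_burden, tol=1e-9):
    """Classify the local direction of improvement from the sign of dA/dn.

    Hypothetical helper: a positive dA/dn means one more city would raise the
    Allais surplus, so there are too few cities (cities are too large).
    """
    d = marginal_allais_surplus(fiscal_surplus, total_distortion, excess_burden)
    if d > tol:
        return "too few cities (cities too large)"
    if d < -tol:
        return "too many cities (cities too small)"
    return "second-best optimal number of cities"
```

With made-up inputs, `city_size_diagnosis(10.0, 3.0, 7.0)` falls in the optimal case because $FS = EB + T$ holds exactly, which is the condition in Theorem 2.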
A third natural extension would be to apply our approach to the evaluation of public policies such as trade and transport policies, as they are likely to significantly affect city sizes and the spatial distribution of economic activity (Melitz, 2003; Venables, 2007). Kanemoto (2013a, b) made a first attempt in this direction for transport policies, but there are many other policy areas open for future research.

Acknowledgments

We thank Richard Arnott, Tom Holmes, Sergey Kokovin, two anonymous referees and the handling editor, Vernon Henderson, as well as participants at the Nagoya Workshop for Spatial Economics and International Trade and at the 56th Meetings of the RSAI in San Francisco for valuable comments and suggestions. Behrens holds the Canada Research Chair in Regional Impacts of Globalization. Financial support from the CRC Program of the Social Sciences and Humanities Research Council (SSHRC) of Canada is gratefully acknowledged. Behrens further gratefully acknowledges financial support from FQRSC Québec (Grant NP-127178). Kanemoto gratefully acknowledges financial support from KAKENHI (18330044), (23330076). Murata gratefully acknowledges financial support from KAKENHI (17730165) and MEXT Academic Frontier (2006-2010). Any remaining errors are ours.

Appendix A: Proofs

A1. Proof of Proposition 1

The proof proceeds in two steps. The first step is to convert the changes in labor inputs in the existing cities into those in outputs and consumption there, noting that they must satisfy the production and utility functions and the first-order conditions for optimal choices of producers and consumers. The second step exploits the fact that creating an additional city reduces the need for consumption goods in the existing cities. Using the market clearing conditions for the differentiated good and housing, we obtain the relationship between the changes in outputs in the existing cities and those in the new city.
First, differentiating the cost function of the differentiated good yields $dY_0^M/dn = -c\,(dY/dn)$, i.e., the change in the amount of input (with minus sign because of our sign convention) equals the change in the output multiplied by the marginal cost parameter, $c$. Differentiating the housing production function, applying the first-order conditions for profit maximization, and noting that the supply of land is fixed, $dY_L^H/dn = 0$, we obtain:

$$\frac{dY_0^H}{dn} = -(p_H - t_H) \frac{dY_H}{dn}. \tag{52}$$

On the consumption side, differentiating the fixed utility constraint (14) and using the first-order condition for optimal consumption choice yields:

$$\frac{dx_0}{dn} = -pm \frac{dx}{dn} - \frac{P_m}{N} \frac{dm}{dn}. \tag{53}$$

Substituting these relationships into (16), we can rewrite the change in Allais surplus as:

$$\frac{dA}{dn} = Y_0 + n \left[ Npm \frac{dx}{dn} - mc \frac{dY}{dn} + (P_m + Y_0^M) \frac{dm}{dn} - (p_H - t_H) \frac{dY_H}{dn} \right]. \tag{54}$$

In the second step, we relate the changes in the outputs and consumption in the existing cities to those in the new city, using the market clearing conditions. For the differentiated good, differentiating the market clearing condition, $\bar{N} x = nY$, summing over all varieties, and applying $Y = Nx$, yields:

$$nm \frac{dY}{dn} = -m \left( Nx - \bar{N} \frac{dx}{dn} \right). \tag{55}$$

This shows that the reduction in the total output of the differentiated good in existing cities equals the output in one city (i.e., the additional city) minus the increase in the total consumption. Thus, the decrease in the total labor input for the differentiated good in existing cities has two components: the shift of production to the new city and the induced change in consumption caused by general equilibrium repercussions. Both have to be multiplied by the marginal cost of production. Because consumption of housing per person is fixed, differentiating the market clearing condition for housing, $\bar{N} x_H = n Y_H$, with respect to $n$ yields

$$n \frac{dY_H}{dn} = -N x_H. \tag{56}$$

This shows that decreases in housing production in the existing cities exactly offset housing produced in the new city.
Substituting (55), (56), and $Y_0^M = -(cY + F)$ into (54) yields

$$\frac{dA}{dn} = Y_0 + N \left[ mcx + (p_H - t_H) x_H \right] + n \left[ Nm(p - c) \frac{dx}{dn} + (P_m - (cY + F)) \frac{dm}{dn} \right]. \tag{57}$$

The first two terms on the right-hand side represent the direct benefit and the last term the excess burden (which enters with a negative sign, since $dA/dn = DB - EB$). Using the price and variety distortions for the differentiated good, we can rewrite (57) to obtain the extension of Harberger’s excess burden to include the variety distortion.

A2. Proof of Corollary

From $n = \bar{N}/N$, we have $dn/dN = -\bar{N}/N^2$, which yields

$$\frac{dA}{dN} = \frac{dA}{dn} \frac{dn}{dN} = -\frac{\bar{N}}{N^2} \frac{dA}{dn} = -\frac{\bar{N}}{N^2} (DB - EB). \tag{58}$$

Because this is for the entire economy, we have to divide this by the number of cities to obtain the condition that corresponds to Flatters et al. (1974):

$$\frac{N}{\bar{N}} \frac{dA}{dN} = DB_N - EB_N, \tag{59}$$

where $DB_N = -DB/N$ and $EB_N = -EB/N$ can be interpreted as the direct benefit and the excess burden of increasing the population of a city. From the market clearing conditions for a variety and housing, we can express the direct benefit using the consumption side variables as:

$$DB_N = -\left[ (x_0 - \bar{x}_0) + mcx + (p_H - t_H) x_H \right] - \frac{A}{\bar{N}}. \tag{60}$$

The excess burden can be written as:

$$EB_N = -\left( Nmt \frac{dx}{dN} + T_m \frac{dm}{dN} \right). \tag{61}$$

If the initial allocation, where $A = 0$, has the optimal city size, then $dA/dN = 0$, which immediately yields the corollary.

A3. Proof of Theorem 1

First, Proposition 1 shows that the direct benefit in the housing sector is the before-tax value of housing minus the labor cost: $(p_H - t_H) Y_H + Y_0^H$. Because the profit is zero in the housing sector, this must equal the total land rent in a city: $(p_H - t_H) Y_H + Y_0^H = p_L X_L$. Second, the direct benefit in the local public good sector is negative and equals its cost: $Y_0^G = -C_G(Y_G)$. Third, the direct benefit in the differentiated good sector is the variable cost minus the total cost in a city: $mcY + mY_0^M$. From $Y_0^M = -(cY + F)$ and the zero-profit condition, $pY = cY + F$, we obtain $mcY + mY_0^M = -tmY$. Summing the three parts of the direct benefit then yields $DB = p_L X_L - C_G(Y_G) - tmY$.
This can be rewritten as DB FS T , where T tmY t H YH . From the zero-profit condition, pY cY F , the fixed cost satisfies F tY . A4. Proof of Proposition 3 First, solving ( 53 ) for dm/ dn and substituting the result into the excess 33 burden in Proposition 1 yields ( 23 ). Second, substituting ( 17 ) into the definition of the variety distortion immediately yields Tm / Pm 1 U R ( x) 0 . Next, the sign of dx / dn can be obtained as follows. From the zero-profit condition and from p c /(1 RR ( x )) , we readily obtain ( p c) Nx F RR ( x) N cx F 0 1 RR ( x) n . For convenience, let us rewrite this expression as follows: xRR ( x ) nRR ( x ) 1 0 , ( 62 ) where F / cN 0 is a bundle of parameters. Differentiating this equation with respect to x and n , and using the definition of the relative risk aversion, we obtain dx RR ( x) ux 2 0 2 dn (1 RR ( x)) u , where the inequality follows from the second-order condition for profit maximization of a differentiated good producer ( 6 ). A5. Proof of Result 1 The elasticity of subutility satisfies ( 63 ) U R ( x) 1 x 1 , x 1 a ( x )(1 ) / which is decreasing in a. The elasticity of the elasticity of subutility is then given by ( 64 ) U ( x ) x 1 x x a( x ) 1 1 a( x ) 1 . Comparing this expression with R (x ) and making use of the love of variety condition ( 3 ), we readily see that ( 65 ) U ( x) R ( x) as a 0 . In particular, if a 0 , then U (x) coincides with the elasticity of RRA and we have (i) U ( x) 0 in the IRRA case where 0 , and (ii) U ( x) 0 in the CRRA case where 0 . If a is negative, then U (x) may be negative even in the IRRA case. Finally, noting that dx / dn 0 from the proof of Proposition 3, we obtain Result 1. 34 A6. Proof of Result 2 The only step that has not been proved is the sign of dx0 / dn . In order to sign this derivative, observe that the first-order condition for utility minimization ( 10 ) implies that 1 1 x0 p. 
mx ( 66 ) From RR 1 / , the profit maximization condition ( 5 ) and the zero-profit condition ( 8 ) yields x ( 1)n and p ( 67 ) 1 c, where F /(cN ) is a constant term. Plugging these into ( 66 ) above then yields a relationship between x0 and n as: m ( 68 ) 1 1 1 x0 . 2 c n Substituting ( 67 ) and ( 68 ) into the utility function ( 26 ) and noting the constant utility constraint then yields: ( 1) ( 1) 1 1 1 1 1 U x0 ( 1 ) n , x , x G H U 2 c . Since xG and x H are constant, differentiating this equation yields dx0 x0 (1 ) 0 dn n . A7. Proof of Proposition 4 From the definition of the Allais surplus, we first obtain ( 69 ) A x0 Y0M Y0H N M m C ( Y ) N m Y ( x x ) G G 0 0 0 YG YG YG YG YG YG x Y M m Y0H N (n 1) N 0 m 0 Y0M ( x0 x0 ) YG YG YG YG YG . Here, variables without superscript denote those in cities other than . Because we start from a symmetric equilibrium, all the variables except for the partial derivatives are the same between city and other cities. We therefore drop superscript for those variables. Because the total population is fixed, we have 35 N N ( x0 x0 ) (n 1) 0 . YG YG ( 70 ) The partial derivatives of the city population with respect to the local public good therefore drop out from ( 69 ). As for housing, production of housing is proportional to the city population: YH xH N . From the production function and the first-order conditions for profit maximization, ( 7 ), the labor input for housing then satisfies: ( 71 ) Y0H YH N ( pH t H ) xH ( pH t H ) , for any city . YG YG YG Hence, the partial derivatives of the labor input for housing production also cancel each other out: ( 72 ) N N Y0H Y0H ( 1 ) ( ) (n 1) 0 . x p t n H H H YG YG YG YG Concerning differentiated good production, from the cost function and the market clearing condition, we have ( 73 ) x N Y Y0M , for any city . x c N c Y Y YG YG G G As before, the population change terms cancel each other out and we are left with ( 74 ) x Y0M Y0M x ( n 1 ) cN (n 1) . 
YG YG YG YG Differentiating the constant utility condition ( 14 ) and substituting the definition of the marginal benefit of increasing the supply of the local public good in ( 28 ) yields x0 x Pm m PG pm . YG YG N YG N ( 75 ) In other cities, we have ( 76 ) x P m x0 . pm m YG YG N YG Substituting ( 72 ), ( 74 ), ( 75 ), and ( 76 ) into ( 69 ) yields ( 77 ) A x m C ( Y ) P Nm ( p c ) ( P ( cY F )) G G G m Y Y YG G G x m (n 1) Nm( p c) ( Pm (cY F )) YG YG . The optimality condition for the supply of the local public good immediately follows from A / YG 0 .

A8. Proof of Proposition 5

First, differentiating the Allais surplus $A(n) = n Y_0(n) - \bar{N}(x_0(n) - \bar{x}_0)$ with respect to the number of cities yields:

$$\frac{dA}{dn} = Y_0 - \bar{N} \frac{dx_0}{dn} + n \frac{dY_0}{dn}. \tag{78}$$

We modify this expression by using the conditions that the equilibrium values satisfy the transformation function and the fixed utility constraint. Differentiating the transformation functions, $F^i(Y_{ij}, Y_0^{ij}, j \in [0, m_i]) = 0$ for differentiated good $i \in M$ and $F^i(Y_i, Y_0^i, \{Y_k^i\}_{k \in L}) = 0$ for housing $i \in H$, with respect to $n$ yields

$$\frac{\partial F^i}{\partial Y_{ij}} \frac{dY_{ij}}{dn} + \frac{\partial F^i}{\partial Y_0^{ij}} \frac{dY_0^{ij}}{dn} = 0 \quad \text{for } j \in [0, m_i],\ i \in M, \tag{79}$$

$$\frac{\partial F^i}{\partial Y_i} \frac{dY_i}{dn} + \frac{\partial F^i}{\partial Y_0^i} \frac{dY_0^i}{dn} + \sum_{k \in L} \frac{\partial F^i}{\partial Y_k^i} \frac{dY_k^i}{dn} = 0 \quad \text{for } i \in H. \tag{80}$$

Differentiating (36) with respect to $n$, we obtain

$$\frac{dY_0}{dn} = \sum_{i \in M} \left( \int_0^{m_i} \frac{dY_0^{ij}}{dn}\, dj + Y_0^{i m_i} \frac{dm_i}{dn} \right) + \sum_{i \in H} \frac{dY_0^i}{dn}. \tag{81}$$

Substituting (79) and (80) into this relationship and applying the definitions of the producer shadow prices, (40), (41), and (44), yields

$$\frac{dY_0}{dn} = -\sum_{i \in M} \left( \int_0^{m_i} (p_{ij} - t_{ij}) \frac{dY_{ij}}{dn}\, dj + (P_{m_i} - T_{m_i}) \frac{dm_i}{dn} \right) - \sum_{i \in H} (p_i - t_i) \frac{dY_i}{dn} - \sum_{k \in L,\, i \in H} p_k \frac{dY_k^i}{dn}, \tag{82}$$

where we have used the normalizations, $p_0 = 1$ and $t_0 = 0$, and the assumption that the supply of a local public good is fixed. Next, differentiating the fixed utility constraint $\bar{U} = U(x(n), m(n))$ and using the first-order condition for optimal consumption choice (30), as well as the definition of the consumer-side shadow price of product diversity (42), yields:

$$\frac{dx_0}{dn} = -\sum_{i \ne 0,\, i \notin M} p_i \frac{dx_i}{dn} - \sum_{i \in M} \left( \int_0^{m_i} p_{ij} \frac{dx_{ij}}{dn}\, dj + \frac{1}{N} P_{m_i} \frac{dm_i}{dn} \right). \tag{83}$$

Substituting these relationships into (78), we can rewrite the change in Allais surplus as:

$$\frac{dA}{dn} = Y_0 + \bar{N} \left[ \sum_{i \ne 0,\, i \notin M} p_i \frac{dx_i}{dn} + \sum_{i \in M} \left( \int_0^{m_i} p_{ij} \frac{dx_{ij}}{dn}\, dj + \frac{1}{N} P_{m_i} \frac{dm_i}{dn} \right) \right] - n \left[ \sum_{i \in H} (p_i - t_i) \frac{dY_i}{dn} + \sum_{i \in L,\, k \in H} p_i \frac{dY_i^k}{dn} + \sum_{i \in M} \left( \int_0^{m_i} (p_{ij} - t_{ij}) \frac{dY_{ij}}{dn}\, dj + (P_{m_i} - T_{m_i}) \frac{dm_i}{dn} \right) \right]. \tag{84}$$

In the second step, we relate the changes in the outputs and consumption in the existing cities to those in the new city, using the market clearing conditions. Differentiating the market clearing conditions (37)–(39) with respect to $n$ yields:

$$\bar{N} \frac{dx_{ij}}{dn} = N x_{ij} + n \frac{dY_{ij}}{dn}, \quad i \in M,\ j \in [0, m_i], \tag{85}$$

$$\bar{N} \frac{dx_i}{dn} = N x_i + n \sum_{k \in H} \frac{dY_i^k}{dn}, \quad i \in L, \tag{86}$$

$$\bar{N} \frac{dx_i}{dn} = N x_i + n \frac{dY_i}{dn}, \quad i \in H, \tag{87}$$

where we are assuming that the supplies of the local public goods are fixed. Substituting these into (84) yields (46), (47), and (48) in Proposition 5.

A9. Proof of Theorem 2

Substituting the market clearing conditions (37), (38), and (39) into the direct benefit yields:

$$DB = Y_0 + \sum_{i \in M} \int_0^{m_i} (p_{ij} - t_{ij}) Y_{ij}\, dj + \sum_{i \in H} (p_i - t_i) Y_i + \sum_{i \in L} p_i \left( X_i + \sum_{k \in H} Y_i^k + \sum_{k \in G} \bar{Y}_i^k \right). \tag{88}$$

Using the zero-profit conditions, (32) and (34), and the definition of the total cost of local public goods, (33), we can further rewrite the direct benefit as:

$$DB = \sum_{i \in L} p_i X_i - C_G - \sum_{i \in M} \int_0^{m_i} t_{ij} Y_{ij}\, dj, \tag{89}$$

which yields $DB = FS - T$.

References

Abdel-Rahman, H., Fujita, M., 1990. Product variety, Marshallian externalities, and city sizes. Journal of Regional Science 30, 165–184.
Allais, M., 1943. A la Recherche d'une Discipline Économique, vol. I. Imprimerie Nationale, Paris.
Allais, M., 1977. Theories of General Economic Equilibrium and Maximum Efficiency. In: Schwödiauer, G. (Ed.), Equilibrium and Disequilibrium in Economic Theory. D. Reidel Publishing Company, Dordrecht, pp. 129–201.
Arnott, R., 2004. Does the Henry George Theorem provide a practical guide to optimal city size? The American Journal of Economics and Sociology 63, 1057–1090.
Arnott, R., Stiglitz, J.E., 1979.
Aggregate land rents, expenditure on public goods, and optimal city size. Quarterly Journal of Economics 93, 471–500. Au, C.-C., Henderson, J.V., 2006a. Are Chinese cities too small? Review of Economic Studies 73, 549–576. Au, C.-C., Henderson, J.V., 2006b. How migration restrictions limit agglomeration and productivity in China. Journal of Development Economics 80, 350–388. Auerbach, A.J., 1985. The theory of excess burden and optimal taxation. In: Auerbach, A.J., Feldstein, M. (Eds.), Handbook of Public Economics, vol. 1. Elsevier, Amsterdam, 61–127. Behrens, K., Duranton G., Robert-Nicoud, F.L., 2010. Productive cities: Sorting, selection and agglomeration. CEPR Discussion Papers 7922, C.E.P.R. Discussion Papers. Behrens, K., Kanemoto, Y., Murata Y., 2010. The Henry George Theorem in a second-best world. CEPR Discussion Papers 8120, C.E.P.R. Discussion Papers. Behrens, K., Murata, Y., 2007. General equilibrium models of monopolistic competition: A new approach. Journal of Economic Theory 136, 776–787. Behrens, K., Murata, Y., 2009. City size and the Henry George Theorem under monopolistic competition. Journal of Urban Economics 65, 228–235. Behrens, K., Pokrovsky, D., Zhelobodko, E., 2014. Market size, entrepreneurship, and income inequality. CEPR Discussion Papers 9831, C.E.P.R. Discussion Papers. Benassy, J.-P., 1996. Taste for variety and optimum production patterns in monopolistic competition. Economics Letters 52, 41–47. Berglas, E., Pines, D., 1981. Clubs, local public goods and transportation models: A synthesis. Journal of Public Economics 15, 141–162. Boadway, R., Bruce, N., 1984. Welfare Economics. Oxford: Basil-Blackwell. Debreu, G. 1951. The coefficient of resource utilization. Econometrica, 19, 273-292. De Loecker, J., Warzynski, F., 2012. Markups and firm-level export status. American Economic Review, 102(6): 2437-2471. Dhingra S., Morrow, J., 2013. Monopolistic competition and optimum product diversity under firm heterogeneity. LSE, mimeo. 
Diewert, W.E., 1983. The measurement of waste within the production sector of an open economy. Scandinavian Journal of Economics 85, 159–179.
Diewert, W.E., 1985. Measurement of waste and welfare in applied general equilibrium models. In: Piggott, J., Whalley, J. (Eds.), New Developments in Applied General Equilibrium Analysis. Cambridge University Press, Cambridge, New York, and Melbourne, pp. 42–102.
Dixit, A.K., Stiglitz, J.E., 1977. Monopolistic competition and optimum product diversity. American Economic Review 67, 297–308.
Duranton, G., Puga, D., 2001. Nursery cities: Urban diversity, process innovation, and the life cycle of products. American Economic Review 91, 1454–1477.
Duranton, G., Puga, D., 2004. Micro-foundations of urban agglomeration economies. In: Henderson, J.V., Thisse, J.F. (Eds.), Handbook of Regional and Urban Economics, vol. 4. Elsevier, Amsterdam, pp. 2063–2117.
Ethier, W.J., 1982. National and international returns to scale in the modern theory of international trade. American Economic Review 72, 389–405.
Flatters, F., Henderson, J.V., Mieszkowski, P., 1974. Public goods, efficiency, and regional fiscal equalization. Journal of Public Economics 3, 99–112.
Fujita, M., 1989. Urban Economic Theory: Land Use and City Size. Cambridge Univ. Press, Cambridge, MA.
Fujita, M., Krugman, P., Venables, A., 1999. The Spatial Economy: Cities, Regions, and International Trade. MIT Press, Massachusetts.
Harberger, A.C., 1964. The measurement of waste. American Economic Review 54, 58–76.
Harberger, A.C., 1971. Three basic postulates for applied welfare economics: An interpretive essay. Journal of Economic Literature 9, 785–797.
Helsley, R.W., Strange, W.C., 1990. Matching and agglomeration economies in a system of cities. Regional Science and Urban Economics 20, 189–212.
Henderson, J.V., 1974. The sizes and types of cities. American Economic Review 64, 640–656.
Hoyt, W.H., 1991. Competitive jurisdictions, congestion, and the Henry George Theorem: When should property be taxed instead of land? Regional Science and Urban Economics 21, 351–370.
Kanemoto, Y., 1980. Theories of Urban Externalities. North-Holland, Amsterdam.
Kanemoto, Y., 2013a. Second-best cost-benefit analysis in monopolistic competition models of urban agglomeration. Journal of Urban Economics 76, 83–92.
Kanemoto, Y., 2013b. Evaluating benefits of transportation in models of new economic geography. Economics of Transportation 2, 53–62.
Kanemoto, Y., 2013c. Pitfalls in estimating “wider economic benefits” of transportation projects. GRIPS Discussion Paper 13–20.
Kanemoto, Y., Kitagawa, T., Saito, H., Shioji, E., 2005. Estimating urban agglomeration economies for Japanese metropolitan areas: Is Tokyo too large? In: Okabe, A. (Ed.), GIS-Based Studies in the Humanities and Social Sciences. Taylor & Francis, Boca Raton, pp. 229–241.
Kanemoto, Y., Mera, K., 1985. General equilibrium analysis of the benefits of large transportation improvements. Regional Science and Urban Economics 15, 343–363.
Kanemoto, Y., Ohkawara, T., Suzuki, T., 1996. Agglomeration economies and a test for optimal city sizes in Japan. Journal of the Japanese and International Economies 10, 379–398.
Melitz, M.J., 2003. The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica 71, 1695–1725.
Mieszkowski, P., Zodrow, G.R., 1989. Taxation and the Tiebout model: The differential effects of head taxes, taxes on land rents, and property taxes. Journal of Economic Literature 27, 1098–1146.
Ogaki, M., Zhang, Q., 2001. Decreasing risk aversion and tests of risk sharing. Econometrica 69, 515–526.
Ottaviano, G.I.P., Tabuchi, T., Thisse, J.-F., 2002. Agglomeration and trade revisited. International Economic Review 43, 409–436.
Papageorgiou, Y.Y., Pines, D., 1999. An Essay on Urban Economic Theory. Kluwer Academic Publishers, Boston.
Romer, P., 1994. New goods, old theory, and the welfare costs of trade restrictions. Journal of Development Economics 43, 5–38.
Schweizer, U., 1983. Efficient exchange with a variable number of consumers. Econometrica 51, 575–584.
Stiglitz, J.E., 1977. The theory of local public goods. In: Feldstein, M.S., Inman, R.P. (Eds.), The Economics of Public Services. MacMillan, London, pp. 274–333.
Tsuneki, A., 1987. The measurement of waste in a public goods economy. Journal of Public Economics 33, 73–94.
Venables, A.J., 2007. Evaluating urban transport improvements: Cost-benefit analysis in the presence of agglomeration and income taxation. Journal of Transport Economics and Policy 41, 173–188.
Vives, X., 1985. On the efficiency of Bertrand and Cournot equilibria with product differentiation. Journal of Economic Theory 36, 166–175.
Zhelobodko, E., Kokovin, S., Parenti, M., Thisse, J.-F., 2012. Monopolistic competition: Beyond the Constant Elasticity of Substitution. Econometrica 80, 2765–2784.