Railroads and American Economic Growth: New Data and Theory∗ Dave Donaldson MIT and NBER Richard Hornbeck Harvard and NBER Draft Version: September 2011 Preliminary and Incomplete Abstract This paper examines the impact of railroads on American economic growth. Drawing on general equilibrium trade theory, changes in counties’ “market access” captures the aggregate impact of local changes in railroads. Using a new spatial network of freight transportation routes, we calculate county-to-county lowest transportation costs and counties’ implied market access. As railroads expanded from 1870 to 1890, changes in market access are capitalized in land values with an estimated elasticity of 1 – 1.5. The computed decline in market access from removing all railroads in 1890 is estimated to reduce GNP by 6.3%, roughly double estimates by Fogel (1964). Without railroads, the total value of US agricultural land would have been lower by 73%. Fogel’s proposed extention of the canal network would have only mitigated 8% of the loss from removing railroads. ∗ For helpful comments and suggestions, we thank seminar participants at Harvard. We are grateful to Jeremy Atack and coauthors for sharing data from their research. Irene Chen, Manning Ding, Jan Kozak, and Rui Wang provided excellent research assistance. During the 19th century, railroads spread throughout a growing United States as the economy rose to global prominence. Railroads became the dominant form of freight transportation; areas around railroad lines prospered and preferential rates could allow particular companies to dominate sectors. The early historical literature often presumed that railroads were indispensable to the United States’ economy or, at least, very influential in US economic growth. Understanding the development of the American economy is shaped by an understanding of the impact of railroads and, more generally, the impact of market integration. In Railroads and American Economic Growth, Fogel (1964) transformed the academic literature on railroads’ impact by using a social savings methodology to focus attention on counterfactuals: in the absence of railroads, freight transportation by rivers and canals would have been only slightly more expensive in most areas. Small differences in freight rates can make-or-break individual firms or induce areas to thrive at the expense of others, but the aggregate impact on economic growth may be small. Fogel’s social savings methodology has been widely applied and continues to influence the evaluation of transportation infrastructure improvements. However, many scholars have discussed limitations of the social savings methodology (reviewed in Fogel 1979).1 One alternative approach is to create a computational general equilibrium (CGE) model, with the explicit inclusion of multiple regions separated by a transportation technology (e.g., Williamson 1974; Herrendorf et al. 2009). Increasingly, econometric studies have exploited local variation in railroad construction and county-level data on population, output, or land values (Haines and Margo 2008; Atack and Margo 2010; Atack et al. 2010; Atack et al. 2011). Similar methods have been used to estimate the impact of railroads in modern China (Banerjee et al. 2010) or highways in the United States (Baum-Snow 2007; Michaels 2008). Given plausibly random variation in the location of transportation routes, these studies estimate the relative impact of transportation infrastructure. General equilibrium effects limit the aggregate interpretation of such estimates, however, because railroads may displace activity from one area to another with little aggregate impact. Railroads are inherently a network technology, and local changes in the network affect all other areas on the network and areas off the network. This paper introduces a new methodology for estimating the aggregate impact of transportation improvements, drawing on recent advances in trade theory and spatial databases to estimate the impact of railroads on the American economy. Following Fogel’s analysis, we focus on the aggregate impact of railroad freight transportation on the agricultural sector in 1890. Drawing on general equilibrium trade theory, counties’ “market access” aggre1 For example, the methodology ignores the possibility of increasing returns, market power, or other market imperfections (David 1969). 1 gates the local impacts and geographic spillover impacts of railroads.2 Building on empirical work by Atack et al., we construct a GIS network database to calculate county-to-county transportation costs and analyze county-level variation in market access. As railroads expanded from 1870 to 1890, changes in market access are capitalized in land values with an estimated elasticity of 1 – 1.5. County market access is influenced by changes elsewhere in the railroad network; indeed, the estimated elasticity is robust to controlling for a county’s own railroads. We then compare the observed impact of railroads against two counterfactual scenarios in 1890. In the first scenario, eliminating railroads lowers the total value of US agricultural land by 73% and generates an annual economic loss of 6.3% of GNP, roughly double Fogel’s estimate. In the second scenario, replacing railroads with extended canals generates an annual economic loss of 5.8% of GNP, mitigating only 8% of the loss from eliminating railroads. Contrary to Fogel’s expectations, the proposed extension of the canal network is estimated to be a poor substitute for the railroad network. In contrast to canals, railroads provided direct routes to many trading partners, and dramatically lowered distances of expensive wagon transportation. As a comparison, we extend a lesser-known component of Fogel’s methodology using recent improvements in county-level data and spatial computational analysis. We calculate the value of agricultural land within different-sized thin buffers around railroads and rivers. Through algebraic manipulation, eliminating railroads is estimated to impose a 4% annual cost and replacing railroads with canal extensions is estimated to impose a 1% annual cost. These estimates are expected to understate the cost of eliminating railroads and, in particular, overstate the benefit of canal extensions. By comparison, the market access methodology results appear of reasonable magnitude. In the United States and around the world, transportation infrastructure is attracting tremendous resources as a popular initiative to increase economic growth. This paper develops and implements a new methodology for measuring the aggregate impacts of transportation infrastructure. The estimates suggest a large role for market integration in economic development. Further, as the trade literature makes progress on tractable general equilibrium models, this empirical setting provides an opportunity to test these models’ empirical insights using a direct shock to trade costs and land value data. While the market access data are drawn from a newly constructed network database, this data collection process can be replicated in other settings. 2 In related work, Redding and Sturm (2007) use a similar model to estimate the impact on population from changes in market access following the division and reunification of Germany. Donaldson (2010) estimates the impact of railroads on price dispersion, trade flows and real income in colonial India, using a different model to calculate overall gains from trade. 2 I Constructing a New Dataset on the Historical US Transportation Network The data are drawn from a newly-constructed geographic information system (GIS) network database on the location of railroads in 1870 and 1890 and the location of other transportation methods.3 This network database, i.e., “Google Maps for the 19th century,” allows the calculation of lowest freight transportation costs between any two counties. For 1870 and 1890, we calculate counties’ “market access:” a weighted sum of transportation costs suggested by a general equilibrium trade model. We also calculate counties’ market access under counterfactual scenarios, such as eliminating railroads or constructing Fogel’s hypothetical expansion of the canal network in lieu of railroads. The database considers three methods of freight transportation: railroads, boats, and wagons. Railroad and boat freight rates were similar per ton-mile in 1890, so the key advantage to railroads was their wider geographic dispersion. Railroads mainly affected transportation costs by reducing the required lengths of expensive wagon transportation and by providing more-direct transportation routes. Faced with a number of data challenges, the network construction compromises in other areas and focuses on the location of railways and waterways and the implied distances of wagon transportation. I.A Transportation Cost Parameters We use Fogel’s transportation cost parameters, which are reasonable choices and enable closer comparison between the methodologies’ results. One strong assumption is that freight rates per ton-mile are constant throughout the country, which only partly reflects data limitations. Local freight rates are likely endogenous to local economic activity and market power, so there are advantages to using average national rates. Fogel’s analysis uses freight rates in 1890, and we maintain these rates in 1870. Thus, measured changes in trade costs and market access are determined by variation in the location of transportation routes rather than prices. Using Fogel’s parameters, railway rates are set at 0.63 cents per ton-mile and waterway rates are set at 0.49 cents per ton-mile.4 Transshipment costs 50 cents per ton, incurred whenever transferring goods to/from a railroad car, river boat, canal barge, or ocean liner.5 Wagon transportation costs 23.1 cents per ton-mile, defined as the straight line distance 3 We are grateful to Jeremy Atack and co-authors for providing initial GIS maps of railways. Rates reflect an output-weighted average of rates for transporting grain and meat. Waterway rates include insurance charges for lost cargo (0.025 cents), inventory and storage costs for slower transport and non-navigable winter months (0.194 cents), and the social cost of public waterway investment (0.073 cents). Waterway or railway rates are not allowed to vary by direction. 5 Fogel considers transshipment charges as a sub-category of water rates, but our modeling of transshipment points allows for a unified treatment of Fogel’s interregional and intraregional scenarios. Fogel’s sources record higher railroad freight costs per ton-mile for shorter routes, but approximately reflect a 100 cent fixed fee and a 0.63 cent fee per mile. 4 3 between two points.6 Due to the wide dispersion in travel costs by method, the key features of the transportation network concern the required length of wagon transportation and the number of transshipment points. I.B Transportation Network Construction The first step is obtaining digitized maps of constructed railroads in 1870 and 1890. We are grateful to Jeremy Atack and co-authors for providing these initial GIS railroad files.7 These railroad files were originally constructed to define mileage of railroad track by county and year; by contrast, for our purposes, railroad lines must connect exactly such that GIS software recognizes travel is possible along a continuous line. We use GIS topology tools to ensure exact connections between all railroad line segments.8 To minimize measurement error in changes, we create a cleaned 1890 railroad file and modify that file to create a version for 1870 that omits lines constructed between 1870 and 1890. Figure 1 maps the railroad network in 1870 and 1890. The second step adds locations of navigable rivers and canals. We follow Fogel’s definition and mapping of navigable waterways, and build on a preliminary file provided by Atack. We enhanced the Fogel-Atack map to follow natural river bends, as the relative straightness of railroads contributes to travel cost differences. Overlapping waterways and railways do not connect by default; instead, we create connections among waterways and railways to allow for fixed transshipment costs. We also include the locations of hypothetical canals, whose construction Fogel argued was feasible and would have largely substituted for railroads. Figure 2 maps the waterway network, and Figure 3 maps the combined waterway network and railroad network in 1870 and 1890. The third step allows for travel over two-dimensional space (lakes, oceans, unmapped land). The network allows for travel along line routes, but space between lines is undefined and travel is either costless or impossible.9 We create waterways that saturate oceans and lakes with a large number of possible routes. For over-land routes, we create wagon connections from every county centroid to every other county centroid within 300km.10 The fourth step connects individual counties to the transportation network. We define av6 This rate reflects a cost of 16.5 cents per mile traveled and Fogel’s adjustment factor of 1.4 between the shortest straight line distance and miles traveled. 7 First, year-specific railroad atlas scans from the Library of Congress are “georeferenced” to US county borders. Second, railroad lines are hand-traced in GIS software to create a digital map of railroad line locations. 8 For example, digitized railroad lines often contain small internal gaps. In addition, by default, two crossing railroad lines are equivalent to a highway overpass without a connection. Correcting all such cases has little effect on total track mileage by county, but can have large impacts on calculated lowest-cost routes. 9 The analysis could use rasters that allow for travel costs for each map pixel, but it would be unable to reflect transshipment costs: a string of railroad pixels will intersect a string of river pixels. 10 Note that wagon rates adjust straight line distance to reflect average travel distance. 4 erage travel costs between counties by calculating the travel cost between county centroids. County centroids must be connected to the network of railways and waterways; otherwise, lowest-cost travel calculations assume that freight first travels (freely) to the closest railway or waterway. We create wagon connections between each county centroid and each nearby type of transportation route in each relevant direction. The fifth step refines the centroid connections to reflect the importance of wagon distances to overall travel costs. Consider that when railroads travel through a county, the centroid’s nearest distance to a railroad no longer reflects the average distance from county points to a railroad. Fogel recognized the importance of modeling this distance and his ideal solution was to break each county into small grids and take the average of nearest distances from each grid to a railroad. Due to technical limitations, Fogel approximated this average distance using one-third of the distance from the farthest point in a county to a railroad. We use GIS software to create 200 random points in each county, calculate the distance from each point to the nearest railroad, and take the average of these nearest distances. We adjust the cost of travel along each centroid connection to within-county railroads to reflect that county’s average travel cost to a railroad. We repeat this process for rivers, canals, and Fogel’s proposed canals. This process allows the analysis to exploit precise variation on the intensive margin of county access to railways and waterways. I.C Transportation Route Cost Calculations We use the completed network dataset to calculate the lowest cost route between each pair of counties, i.e., 5 million calculations. The lowest cost routes are calculated under four scenarios: (1) transportation costs with 1870 railroads, (2) transportation costs with 1890 railroads, (3) transportation costs with no railroads, and (4) transportation costs with no railroads and Fogel’s extended canals. All scenarios include transportation by river/canal/lake/ocean and over-land “wagon routes.” The observed travel paths are consistent with intuitive travel patterns. Ultimately, calculated travel costs are not intended to most accurately reflect actual travel costs (for which partial data are available); rather, calculated travel costs are a proxy for differences over space and time in transportation costs due to the availability of nearby railways and waterways. I.D County-level Census Data County-level data are drawn from the US Censuses of Agriculture and Population (Haines 2005). The main variables of interest are total population and the total value of agricultural land. We adjust data from 1870 to reflect 1890 county boundaries (Hornbeck 2010). Unsettled counties with missing data are assumed to have zero population and zero land value, as all areas contribute to the calculation of market access. The empirical analysis excludes 5 counties with zero land value, and explores robustness to excluding all counties West of Texas. II A Distance Buffer Approach to Valuing Railroads Using only the location of railways and waterways, an initial empirical exercise extends Fogel’s lesser-known approach to valuing the impact of railroads in 1890. Fogel’s best-known approach involves multiplying the quantity of transported agricultural goods by the difference in transportation costs with railroads and without railroads. This method is applied to inter-regional trade between major hubs and some intra-regional trade. One limitation in applying this method to all intra-regional trade is that waterway and wagon transportation costs would be higher than product values in some railroad-serviced areas. As an alternative, Fogel excludes all areas more than 40 miles from a navigable waterway and calculates the economic loss as the value of agricultural land in these areas. Fogel’s alternative method of calculating economic loss is based on a Ricardian model: land values capitalize economic rents, which depend on transportation costs to markets. Railroads and waterways are much cheaper than wagons – 40 miles of wagon transportation is equivalent to 1500-1900 miles of transportation by rail or waterway – so distance to a railroad or waterway is a decent proxy for transportation costs. Fogel makes a stark assumption that farmland is worth its full amount within 40 miles and is worth zero beyond 40 miles, when wagon costs exceed average product prices. We extend this Ricardian method to a full evaluation of railroads, exploiting GIS software and county-level data. For 1890, we calculate the share of each county that is 0-5 miles, 5-10 miles, ..., 35-40 miles, and more than 40 miles from a railroad or river/canal/lake/ocean. We also calculate the share of each county in each distance bin under two counterfactual scenarios: (1) no railroads and (2) no railroads and canal extensions. Figure 4 maps 40-mile buffers for each scenario, which illustrates the intuition in Fogel’s original analysis. Panel A is shaded to reflect the area of land within 40 miles of a railroad or waterway. Panel B is shaded to reflect the area of land within 40 miles of a waterway. Panel C overlays these two maps; the gray area is “serviced” by a railroad but not by a waterway. Panel D adds Fogel’s proposed canal extensions; most of the Midwestern areas become “serviced” by this expanded waterway network. Figure 5 maps 10-mile buffers for each scenario, which illustrates why Fogel’s approach may understate the value of railroads and overstate the value of canal extensions. Panel A is shaded to reflect the area of land within 10 miles of a railroad or waterway; given the high density of railroad construction, most areas remain shaded when compared to panel A of figure 4. In Panel B, however, relatively little area is covered by waterways when compared to panel B of figure 4. Panel C overlays these maps, highlighting the relative advantage 6 of railroads at closer distances. Panel D adds the canal extensions, which are relatively ineffective at covering areas serviced by railroads. Calculating the economic impact of railroads involves algebraic manipulation in two steps. First, we inflate observed land values to reflect lands’ implied “true value” if it were next to a transportation route. Second, we calculate the implied lost land value from changing observed transportation routes to each counterfactual scenario. First, assume a continuous depreciation in land values as nearest distance to a railway or waterway increases from 0 to 40 miles. For each county 𝑖, the implied per-acre “true land value” or 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 is a function of the “observed land value” or 𝑂𝑏𝑠𝑒𝑟𝑣𝑒𝑑𝐿𝑉𝑖 and the share of county 𝑖 in each distance bin 𝛼𝑖𝑑 : (1) 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 = 𝛼𝑖0−5 37.5 40 𝑂𝑏𝑠𝑒𝑟𝑣𝑒𝑑𝐿𝑉𝑖 + 𝛼𝑖5−10 32.5 + ... + 𝛼𝑖35−40 2.5 40 40 Second, we calculate the share of each county 𝑖 in each distance bin 𝛽𝑖𝑑 from only a river/canal/lake/ocean. The implied lost land value from removing railroads is: (2) 𝐿𝑜𝑠𝑠𝑁 𝑜𝑅𝑅 = 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 [(𝛽𝑖40+ − 𝛼𝑖40+ ) + (𝛽𝑖35−40 − 𝛼𝑖35−40 ) 37.5 2.5 + ... + (𝛽𝑖0−5 − 𝛼𝑖0−5 ) ] 40 40 Similarly, we calculate the share of each county 𝑖 in each distance bin 𝛾𝑖𝑑 from a proposed canal extension or a river/canal/lake/ocean. The implied lost land value from replacing railroads with extended canals is: (3) 𝐿𝑜𝑠𝑠𝐶𝑎𝑛𝑎𝑙𝑠 = 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 [(𝛾𝑖40+ − 𝛼𝑖40+ ) + (𝛾𝑖35−40 − 𝛼𝑖35−40 ) 37.5 2.5 + ... + (𝛾𝑖0−5 − 𝛼𝑖0−5 ) ] 40 40 This method estimates that removing railroads in 1890 would decrease land values by $6 billion, generating an annual loss of $480 million using Fogel’s preferred rate of return. By comparison, Fogel estimates annual intra-regional losses of $248 million to $337 million, and annual inter-regional losses of $73 million.11 If land values are not inflated to “true values,” skipping step one above, the estimated annual loss is $420 million. This method estimates that replacing railroads with canal extensions in 1890 would still generate an annual loss of $128 million. In this approach, canal extensions largely mitigate the loss of railroads because extended canals reach near otherwise unserved areas. Canal extensions only provide indirect routes to destinations, however, so a methodology based on trade costs should estimate lower value from canal extensions. The subsequent analysis focuses on county-to-county trade costs because, ultimately, rail11 From the 40-mile buffer loss alone, Fogel estimates an annual loss of $154 million. 7 ways and waterways matter for where they provide transportation. The full trade model simplifies to a county-level regression that maintains this earlier Ricardian intuition: low trade costs and high market access are capitalized in higher land values. The model is identified from changes in market access over time, and we use these parameters to estimate the economic cost from eliminating railroads or replacing railroads with canal extensions. III A Market Access Approach to Valuing Railroads We formalize a model of trade among US counties that specifies how goods prices and factor prices – and hence living standards – respond to changing trade costs. The model contains thousands of counties, each with interacting goods markets and factor markets, that generate positive and negative spillovers on other counties. Remarkably, under a set of assumptions that are standard among modern quantitative trade models, all of these spillovers collapse into a single sufficient statistic represented by a county’s “market access.” The model generates a regression equation that we take to the data; in particular, the model implies a simple relationship between county agricultural land values and county market access (appropriately defined). The model’s simplifications and abstractions are stark, but these assumptions are rewarded with analytical insight and empirical predictions.12 In the end, regressing county land values on a weighted sum of trade costs with other counties (i.e., “market access”) also has a completely atheoretical appeal. III.A A Model of Trade Among US Counties The economy consists of many trading counties, each indexed by 𝑜 if the origin of a trade and by 𝑑 if the destination. Some counties are rural (𝑜 ∈ ℛ) and some are urban (𝑜 ∈ 𝒰). Tastes: Economic agents have Cobb-Douglas preferences over two commodity groups: agricultural goods (weight 𝛾) and manufactured goods (weight 1 − 𝛾). The commodity groups are composed of many differentiated goods varieties (indexed by 𝑗), and tastes over these varieties take a CES form (with elasticities 𝜎𝐴 and 𝜎𝑀 ).13 An agent living in county 12 These modeling assumptions are used in the fields of international trade and economic geography, and reflect recent best-practice to gain traction in general equilibrium spatial settings with many regions and commodities. 13 It is important to note that the elasticities of substitution within each commodity group are not restricted; these could be very high if agricultural varieties are relatively similar. Anderson et al. (1992) provide an attractive microfoundation of aggregate-level CES preferences. If individual agents desire only one variety of the agricultural good (their “ideal variety”) and agents’ ideal varieties are distributed in a Logit fashion, then aggregate consumption data from a population of many such agents behaves as though all agents have CES preferences over all varieties (where in such an interpretation 𝜎 indexes the inverse of the dispersion of agents’ ideal varieties). 8 𝑜 and receiving income 𝑌𝑜 experiences indirect utility: (4) 𝑉 (Po , 𝑌𝑜 ) = 𝑌𝑜 , 𝐴 𝛾 (𝑃𝑜 ) (𝑃𝑜𝑀 )1−𝛾 where 𝑃𝑜𝐴 and 𝑃𝑜𝑀 are standard CES price indices.14 Agricultural technology: For simplicity, we assume a stark form of specialization: rural counties produce only agricultural goods and urban counties produce only manufacturing goods.15 Rural counties use a Cobb-Douglas technology to transform land and labor into agricultural goods varieties. We assume that labor is mobile across space and across sectors, i.e., that farmers have an endogenous outside option to work in other counties or manufacturing. The marginal cost of producing agricultural goods of variety 𝑗 in county 𝑜 is: 𝑀 𝐶𝑜𝐴 (𝑗) = (5) (𝑞𝑜 )𝛼 (𝑤𝑜 )1−𝛼 if 𝑜 ∈ ℛ, 𝑧𝑜𝐴 (𝑗) where 𝑤𝑜 is the wage rate, 𝑞𝑜 is the agricultural land rental rate, and 𝑧𝑜𝐴 (𝑗) is a Hicksneutral productivity shifter that is exogenous and local to county 𝑜. We follow Eaton and Kortum (2002) in modeling these productivity shifters by assuming that each county draws its productivity level, for any given variety 𝑗, from a Frechet (or Type II extreme value) distribution: 𝐹𝑜 (𝑧) = 1−exp(−𝑇𝑜𝐴 𝑧 −𝜃𝐴 ).16 This functional form is restrictive, but it has made multiple-region models of comparative advantage tractable since the seminal work of Eaton and Kortum (2002). Finally, we assume perfect competition among producers, following most of the trade literature on relatively homogeneous goods such as agricultural commodities. Manufacturing technology: Urban counties use a Cobb-Douglas technology to transform capital and labor into manufacturing goods varieties, whose marginal cost of production is: 𝑀 𝐶𝑜𝑀 (𝑗) = (6) 14 That is, 𝑃𝑜𝐴 = 𝑛𝐴 𝐴 (𝑝𝑐 (𝑗))1−𝜎𝐴 𝑑𝑗 0 [∫ (𝑤𝑜 )𝛽 (𝑟𝑜 )1−𝛽 if 𝑜 ∈ 𝒰, 𝑧𝑜𝑀 (𝑗) ]1/(1−𝜎𝐴 ) and 𝑃𝑜𝑀 = 𝑛𝑀 𝑀 (𝑝𝑜 (𝑗))1−𝜎𝑀 𝑑𝑗 0 [∫ ]1/(1−𝜎𝑀 ) . Variables 𝑛𝐴 and 𝑛𝑀 are the number of varieties available to consumers; as we describe below, the number of varieties is fixed in agriculture and the number is endogenous in the manufacturing sector and adjusts to allow free entry. 15 In future work we aim to subdivide all counties into their rural and urban segments (based on population shares in the Census data). An additional extension could solve for the equilibrium specialization of each county into its rural and urban segments. 16 This distribution captures how productivity differences across counties give incentives to specialize and trade. The parameter 𝑇𝑜𝐴 captures county-specific (log) mean productivity, which corresponds to the county’s absolute advantage. The parameter 𝜃𝐴 captures, inversely, the (log) standard deviation of productivity, which corresponds to the scope for comparative advantage. A low 𝜃𝐴 means county productivity draws are dispersed, creating large incentives to trade on the basis of productivity differences. 9 where 𝑤𝑜 is the same wage rate, 𝑟𝑜 is the capital rental rate, and 𝑧𝑜𝑀 (𝑗) is a Hicks-neutral productivity shifter that is exogenous and local to county 𝑜. The urban manufacturing sector pays both this marginal cost and a fixed cost of production, 𝑓𝑜 , paid in the same proportion of factors as marginal costs. We follow Melitz (2003) in assuming a set of potential producers, each with a productivity shifter 𝑧𝑜𝑀 (𝑗) that is drawn, prior to entry, from a distribution with CDF 𝐺𝑜 (𝑧). We also follow Chaney (2008), among others, in modeling the distribution 𝐺𝑜 (𝑧) as a Pareto distribution: 𝐺𝑜 (𝑧) = 1 − (𝑧/𝑇𝑜𝑀 )−𝜃𝑀 .17 The parameter 𝑇𝑜𝑀 captures the location of this distribution and 𝜃𝑀 captures its dispersion. Following Melitz (2003), firms compete monopolistically with free entry such that all profits are zero. Trade costs: Trading goods is costly. Remote rural locations pay high prices for imported goods and receive low prices for goods they produce, as this is the only way locations can be competitive in distant markets. We model trade costs using a simple and standard “iceberg” formulation. When a variety is made in county 𝑜 and sold locally in county 𝑜, its price is 𝑝𝑘𝑜𝑜 (𝑗) in sector 𝑘; but when this same variety is made in county 𝑜 and shipped to county 𝑑, 𝑘 𝑘 𝑘 is applied to each unit of 𝑝𝑜𝑜 (𝑗).18 A proportional trade cost 𝜏𝑜𝑑 it will sell for 𝑝𝑘𝑜𝑑 (𝑗) = 𝜏𝑜𝑑 𝑘 the variety shipped.19 Trade is costly because 𝜏𝑜𝑑 ≥ 1. Our empirical analysis below models 𝑘 𝜏𝑜𝑑 as a function of observables, based on the available transportation routes between county 𝑜 and county 𝑑 and how these routes changed over time. Factor mobility: Workers are free to move around the country. Worker utility, as defined in equation (4), will hence be equalized across counties in equilibrium and wages will satisfy: (7) 𝑤𝑜 = 𝑈¯ (𝑃𝑜𝐴 )𝛾 (𝑃𝑜𝑀 )1−𝛾 , where 𝑈¯ is the endogenous level of utility obtained by workers in each county. Land is fixed by county and capital is perfectly mobile.20 III.B Using the Model to Inform Empirical Work Solving for trade flows—the gravity equation: First, we solve for trade flows from each origin county 𝑜 to each other destination county 𝑑 in the agricultural sector. Due to perfect competition, the marginal costs of producing each variety are equal to their price. We assume that consumers only buy a variety from the cheapest available location. Substituting marginal costs from each supply location 𝑜 (equation 5) into the demand for agricultural 17 This is a good match for the empirical distribution of firm sizes in most samples (e.g., Axtell 2001). Monopolistically competitive producers in the manufacturing sector have an incentive to segment markets but, with CES preferences and many producers, the same mark-up over marginal cost will be charged by each producer in each market. 19 Melitz (2003) proposed that some trade barriers may be fixed (ie, constant regardless of the quantity shipped). For simplicity we have not included such costs here but they could easily be added in future work. 20 Predictions are equivalent if capital is perfectly immobile, but intermediate cases are difficult to model. 18 10 varieties in county 𝑑, Eaton and Kortum (2002) derive the value of total exports from 𝑜 to 𝑑:21 (8) (9) where −1 𝐴 −𝜃𝐴 𝐴 ) (𝐶𝑀 𝐴𝐴 = 𝜒𝐴 𝑇𝑜𝐴 (𝑞𝑜𝛼 𝑤𝑜1−𝛼 )−𝜃𝐴 (𝜏𝑜𝑑 𝑋𝑜𝑑 𝑑 ) 𝑌𝑑 , ∑ ≡ 𝑇𝑖𝐴 (𝑞𝑖𝛼 𝑤𝑖1−𝛼 )−𝜃𝐴 (𝜏𝑖𝑑𝐴 )−𝜃𝐴 . 𝐶𝑀 𝐴𝐴 𝑑 𝑖∈ℛ 𝐶𝑀 𝐴𝐴 𝑑 refers to “consumer market access” for agricultural goods in destination market 𝑑. Consumer market access is a weighted sum of productivity-adjusted costs of production in each market 𝑖 that could supply market 𝑑, with weights declining in the cost of trading from 𝑖 to 𝑑 (i.e., 𝜏𝑖𝑑𝐴 ). From equation (8), county 𝑜 sends more goods to county 𝑑 if county 𝑜 is relatively productive (high 𝑇𝑜𝐴 ) or relatively low cost (low 𝑞𝑜 or 𝑤𝑜 ). County 𝑜 also sends more goods to county 𝑑 if county 𝑑 has high total income (high 𝑌𝑑 ) or low overall consumer market access (low 𝐶𝑀 𝐴𝐴 𝑑 ), meaning that county 𝑜 faces fewer competitors when selling to market 𝑑. A similar expression can be derived for trade in the manufacturing sector. The manufacturing problem is complicated by the presence of free entry, however, Chaney (2008) shows that trade flows again take a gravity equation form: (10) (11) where 𝑀 𝛽 1−𝛽 −𝜃𝑀 𝑀 −𝜃𝑀 −1 𝑋𝑜𝑑 = 𝜒𝑀 𝑇𝑜𝑀 𝑛𝑀 ) (𝜏𝑜𝑑 ) (𝐶𝑀 𝐴𝑀 𝑜 (𝑤𝑜 𝑟𝑜 𝑑 ) 𝑌𝑑 , ∑ 𝛽 1−𝛽 −𝜃𝑀 𝐶𝑀 𝐴𝑀 ≡ 𝑇𝑖𝑀 𝑛𝑀 ) (𝜏𝑖𝑑𝑀 )−𝜃𝑀 . 𝑑 𝑖 (𝑤𝑖 𝑟𝑖 𝑖∈𝒰 The only difference from equation (8) is that manufacturing exports from county 𝑜 also depend on the endogenous number of varieties available in each location (i.e., 𝑛𝑀 𝑜 ). Equations (8) and (10) are known as a gravity equations, and they govern trade flows in this model. Deriving a gravity equation is dependent on the particular functional form assumptions for tastes and technology; indeed, it is difficult to imagine alternative assumptions that give rise to a gravity equation in these two settings (Arkolakis et al. 2011). The gravity equation is appealing because it dramatically simplifies a complex general equilibrium problem of spatial competition. A more appealing feature of the gravity equation is that it appears to fit data universally (Anderson and van Wincoop 2003, 2004; Combes et al. 2008). Thus, it is difficult to imagine an exercise that models trade among counties in the United States that does not predict trade flows will take the “gravity” form. Solving for the land rental rate. The absence of systematic data on 19th century trade flows 𝐴 𝑀 between US counties means that 𝑋𝑜𝑑 and 𝑋𝑜𝑑 are unobservable. However, the gravity equation implies tractable expressions for outcome variables that are observable and important for our purposes, such as the rural land rental rate (𝑞𝑜 ). 21 The term 𝜒𝐴 is a constant that is comprised of the parameters 𝛾, 𝜃𝐴 , and 𝜎𝐴 . 11 The Cobb-Douglas agricultural technology leaves land a fixed share 𝛼 of total output 𝑌𝑜𝐴 , so 𝑞𝑜 𝐿𝑜 = 𝛼𝑌𝑜𝐴 where 𝐿𝑜 is the (exogenous) quantity of agricultural land in county 𝑜. Goods ∑ 𝐴 ). Thus, equation (8) implies: markets clear, so all produced goods are bought (𝑌𝑜𝐴 = 𝑑 𝑋𝑜𝑑 𝑞𝑜1+𝛼𝜃𝐴 (12) ( =𝛼 𝑇𝑜𝐴 𝐿𝑜 ) 𝑤𝑜−𝜃𝐴 (1−𝛼) 𝐹 𝑀 𝐴𝐴 𝑜. 𝐹 𝑀 𝐴𝐴 𝑜 refers to “firm market access” over agricultural goods in origin 𝑜 and is defined as: 𝐹 𝑀 𝐴𝐴 𝑜 ≡ 𝛾 (13) ∑ 𝐴 −𝜃𝐴 −1 (𝜏𝑜𝑑 ) (𝐶𝑀 𝐴𝐴 𝑑 ) 𝑌𝑑 . 𝑑∈ℛ,𝒰 Firm market access (𝐹 𝑀 𝐴𝐴 𝑜 ) is a sum of terms for all destination counties 𝑑 to which the rural county 𝑜 tries to sell its agricultural goods. These terms include the size of the destination market (given by total income, 𝑌𝑑 ) and the competitiveness of the destination market (given by its 𝐶𝑀 𝐴𝐴 𝑑 term). All terms are inversely weighted by the cost of trading with 𝐴 −𝜃𝐴 each distant market ((𝜏𝑜𝑑 ) ). Substituting equation (7) into equation (12) to eliminate the wage, we obtain an expression for the land rental rate in county 𝑜: ) ( 𝐴) ( ) 1 𝑇𝑜 𝛾(1 − 𝛼) ln 𝑞𝑜 = 𝜅 + ln + ln(𝐶𝑀 𝐴𝐴 𝑜) 1 + 𝛼𝜃𝐴 𝐿𝑜 1 + 𝛼𝜃𝐴 ( ) ( ) (1 − 𝛾)(1 − 𝛼) 1 𝑀 + ln(𝐶𝑀 𝐴𝑜 ) + ln(𝐹 𝑀 𝐴𝐴 𝑜 ), 1 + 𝛼𝜃𝐴 1 + 𝛼𝜃𝐴 ( (14) where 𝜅 is a constant that subsumes model parameters and 𝑈¯ . Equation (14) provides a parsimonious guide for empirical work. Equilibrium rural land rental rates (𝑞𝑜 ) are log-linear in just three endogenous economic variables: (1) firm market access for agricultural goods that these rural areas produce (𝐹 𝑀 𝐴𝐴 𝑜 ), (2) consumer market 𝐴 access to agricultural goods (𝐶𝑀 𝐴𝑜 ), and (3) consumer market access to manufacturing goods (𝐶𝑀 𝐴𝑀 𝑜 ). Immobile land in county 𝑜 will be more valuable if county 𝑜 has: easy access to large uncompetitive markets (high 𝐹 𝑀 𝐴𝐴 𝑜 ) or easy access to agricultural goods 𝐴 (high 𝐶𝑀 𝐴𝑜 ) and manufactured goods (high 𝐶𝑀 𝐴𝑀 𝑜 ) that workers value. Equation (14) has two key implications for analyzing the impact of railroads on the development of the American economy. First, all economic forces that make goods and factor markets interdependent across thousands of counties get collapsed into three sufficient statistics, each of which capture a slightly different concept of “market access.” All cross-county general equilibrium spillover effects of railroads are captured by working with this particular concept of market access. For example, one county receiving a railroad line would, in 12 principle, affect other counties: counties that remain without a railroad, counties that previously had a railroad, counties that traded with the new railroad county, counties that can now trade with the new railroad county, and so on. Even if railroad assignment is random, “control” counties are affected and a regression of land rents on railroad access will confound these effects and be biased. However, a regression of land rents on market access will be free of this bias because, according to the model, all counties’ market access variables will adjust with changes in the railroad network. The second key implication of equation (14) is that access to markets matters for county prosperity, as opposed to access to railroads or waterways. Market access can increase or decrease due to events far beyond county borders: any changes in the transportation network or changes in other counties’ market size. Thus, the empirical estimation is not identified solely from particular counties gaining railroad access; rather, county market access changes due to factors independent of decisions to extend railroads into particular locations. The model suggests several instrumental variable strategies, which we explore below. III.C From Theory to an Empirical Specification Equation (14) provides a useful approach for empirical work, though three obstacles prevent its direct empirical implementation. We discuss proposed solutions to these challenges, many of which are preliminary and can be improved upon in future work. First, the Census of Agriculture contains data on land values rather than land rental rates.22 We use land values as a proxy for land rental rates.23 Second, the technology term 𝑇𝑜𝐴 in equation (14) is not directly observable in our setting.24 One plausible assumption is that this term is constant over time, or at least that changes in county agricultural technology are orthogonal to changes in market access. Because technology (𝑇𝑜𝐴 ) enters log-linearly in equation (14), we control for this term using county fixed effects. We also attempt to control for technological change that may be correlated with changes in market access through the use of state-year fixed effects, and year effects interacted with cubic polynomials in each county’s latitude and longitude. 𝑀 Third, the three market access terms (𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴 𝑜 , 𝐶𝑀 𝐴𝑜 ) are also not directly ob22 The Census reports the value of agricultural land and buildings. In some years the Census reports separate values for land and buildings, and land values are the largest share of this combined measure. 23 In particular, we assume that 𝑉𝑜 = 𝑞𝛿𝑜 , where 𝑉𝑜 is the land value and 𝛿 is the interest rate or discount rate relevant to those who own agricultural land. This assumes that agents face a fixed interest rate or discount rate and expect zero serial correlation in market access over time. We are exploring relaxing this assumption by allowing for forward-looking agents with different expectations for changes in market access. The analysis benefits from debates and uncertainty over the future paths of railroad construction, as well as uncertainty over changes in market sizes. 24 Following Eaton and Kortun (2002) and a large body of work since, a typical method for estimating the technology terms 𝑇𝑜𝐴 and 𝑇𝑜𝑀 uses the relationship between these terms and the exporter fixed effect in a gravity equation. Due to the lack of trade data in our setting, this method is unavailable to us. 13 servable because each contains variables that are unobserved in our setting (e.g., the wage 𝑤𝑑 , the capital rental rate 𝑟𝑑 , and the technology terms 𝑇𝑑𝐴 and 𝑇𝑑𝑀 ). We are exploring calibrating these terms to population data (as in Redding and Sturm 2007); for now, we proxy 𝑀 for 𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴 𝑜 , and 𝐶𝑀 𝐴𝑜 with one simplified market access term (𝑀 𝐴𝑜 ) inspired by Harris (1954).25 Harris originally defined a concept of “market potential,” based on the number and magnitude of markets available at low trade costs. This market potential term (𝑀 𝑃𝑜𝑡𝐻 ) effectively ∑ 𝐴 𝐴 −1 𝐴 is a proxy for total demand at location 𝑑, such as pop) 𝑋𝑑𝑡 , where 𝑋𝑑𝑡 equals 𝑑∕=𝑜 (𝜏𝑜𝑑𝑡 ulation or the value of agricultural land. That is, market potential is the sum of all other 𝐴 𝐴 market sizes (𝑋𝑑𝑡 ) down-weighted by inverse trade costs (weights are 1/𝜏𝑜𝑑𝑡 ). We extend the Harris formula in two respects. First, in our model, trade costs do not diminish effective market sizes inversely; rather, they do so to the power of −𝜃𝐴 and typical estimates of 𝜃𝐴 (discussed below) are significantly greater than one. Second, the 𝐹 𝑀 𝐴𝐴 𝑜 formula in equation (13) implies that market size should be deflated by the competitiveness of that market (𝐶𝑀 𝐴𝐴 𝑑 ). Intuitively, a large market 𝑑 is more appealing to exporters in county 𝑜 if market 𝑑 is served by fewer other counties (and, thus, prices are high in market 𝑀 𝑑). The same is true for 𝐶𝑀 𝐴𝐴 𝑜 and 𝐶𝑀 𝐴𝑜 : access to a large market 𝑑 is more appealing to consumers in market 𝑜 if market 𝑑 supplies fewer other counties (and, thus, sells goods 𝐴 ) by a for low prices). To incorporate these multilateral effects, we divide market sizes (𝑋𝑑𝑡 ∑ 𝐴 −1 𝐴 proxy for county 𝑑’s competitiveness level (𝑀 𝑃𝑑𝑡𝐻 = 𝑖∕=𝑑,𝑜 (𝜏𝑑𝑖𝑡 ) 𝑋𝑖𝑡 ). 𝑀 In this way, we condense three unobserved market access terms (𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴 𝑜 , 𝐶𝑀 𝐴𝑜 ) into one observed market access term: (15) 𝑀 𝐴𝑜𝑡 = ∑ 𝐴 −𝜃𝐴 𝐴 (𝜏𝑜𝑑𝑡 ) 𝑋𝑑𝑡 . 𝐴 −𝜃𝐴 𝐴 𝑋𝑖𝑡 𝑖∕=𝑑,𝑜 (𝜏𝑑𝑖𝑡 ) ∑ 𝑑∕=𝑜 This combined market access term is more “ad hoc” than the three terms that follow directly from the model.26 However, we find the assumption plausible that a farmer’s market access is positively correlated with the farmer having lower trade costs with: (1) larger populations demanding the farmer’s goods, (2) larger populations that have fewer other sources for acquiring the farmer’s goods, (3) larger populations producing the farmer’s demanded goods, and (4) larger populations that have fewer other destinations to sell the farmer’s demanded goods. 25 𝑀 𝐴 Note that collapsing 𝐶𝑀 𝐴𝐴 𝑜 , 𝐶𝑀 𝐴𝑜 , and 𝐹 𝑀 𝐴𝑜 into one market access variable removes the model’s distinction between urban and rural counties (and between agricultural and manufacturing sectors). This simplification removes the scope for analyzing cross-county specification into sectors and analyzing separate effects on urban and rural counties. 26 𝑀 𝐴 Formally, 𝑀 𝐴𝑜𝑡 can be shown to approximate 𝐶𝑀 𝐴𝐴 𝑜𝑡 , 𝐶𝑀 𝐴𝑜𝑡 and 𝐹 𝑀 𝐴𝑜𝑡 as the location of manufacturing and agriculture in the vicinity of county 𝑜 becomes more and more similar. 14 These four forces are embodied in (15). The term 𝑀 𝐴𝑜𝑡 is calculated for each county in 1870 and 1890 using the GIS-computed county-to-county lowest cost transportation route.27 Summarizing the above discussion, we regress land values in county 𝑜 and year 𝑡 on a county fixed effect (𝛼𝑜 ), state-by-year fixed effects (𝛼𝑠𝑡 ), a cubic polynomial in county latitude and longitude interacted with year effects (𝑓 (𝑥𝑜 , 𝑦𝑜 )𝛾𝑡 ), and log market access (𝑀 𝐴𝑜𝑡 ): (16) ln 𝑉𝑜𝑡 = 𝛼𝑜 + 𝛼𝑠𝑡 + 𝑓 (𝑥𝑜 , 𝑦𝑜 )𝛾𝑡 + 𝛽 ln(𝑀 𝐴𝑜𝑡 ) + 𝜀𝑜𝑡 . The sample is a balanced panel of 2,161 counties with land value data in 1870 and 1890. Standard errors are clustered at the state level to adjust for heteroskedasticity and withinstate correlation over time. III.D Empirical Results Table 1 presents results from estimating equation (16). Column 1 reports the estimated impact of market access on land values, when defining market access based on other counties’ 𝐴 = 𝐿𝑑𝑡 in equation (15)). Column 2 repeats this regression, defining market population (𝑋𝑑𝑡 access based on other counties’ land value. In columns 1 and 2, market access is estimated to have an economically large and statistically significant impact on county land values. The elasticity between land values and market access is 1.477 (column 1) or 1.441 (column 2), indicating that our preliminary results are insensitive to the variable chosen to capture “size” of each county market. This similarity 𝐴 in results is reassuring, as neither proxy matches total demand in other counties (𝑋𝑑𝑡 in the model).28 There is substantial variation in market access across the country, and the baseline estimates may be influenced by outliers: frontier counties whose market access and land values both increased dramatically. Column 3 weights the regression by 1870 county populations. Column 4 excludes areas West of Texas. The estimated elasticities decline to 1.2 and 1; unsurprisingly smaller if market access is more influential on land values in less-dense agriculture-exporting counties. Estimated coefficients of 1 – 1.5 imply a large elasticity between land values and market access, but in the range one might expect using the approximation of our model discussed above. A coefficient of 1.05 is predicted by plausible parameter values.29 27 As described below, we follow Donaldson (2010) and use a value of 𝜃𝐴 = 3.8 in constructing 𝑀 𝐴𝑜𝑡 . The full model predicts a log-linear relationship between land values and market access. Consistent with this theoretical relationship, when including a fourth-degree polynomial in the log of market access, the coefficient on the linear term remains similar and the higher-order terms are(jointly)insignificant. 28 2−𝛼 under the model The coefficient on market access in Equation (16) should satisfy 𝛽 = 1+𝛼𝜃 𝐴 approximation discussed above. The parameter 𝛼 is the land share in agricultural production, which Caselli 29 15 III.E Economic Impact of Eliminating Railroads and/or Extending Canals The estimated relationship between market access and land values can be used to estimate the aggregate economic impact associated with counterfactual scenarios. In particular, we consider two scenarios: (1) eliminating all railroads in 1890, and (2) replacing all railroads in 1890 with Fogel’s proposed canal extensions. First, consider the economic impact if no railroads existed in 1890. We calculate the resulting market access of each county in 1890, using computed county-to-county freight costs in a spatial network that excludes all railroads. We find that market access falls by 63%, on average, when all railroads are eliminated. By comparison, the average county increase in market access from 1870 to 1890 was 50%; thus, the counterfactual fall in market access is not substantially “out of sample.”30 County land value falls by 1.477 log points for every log point decline in market access (from column 1), and the implied percent decline in land value is multiplied by county land value in 1890. Summing over all counties, eliminating railroads in 1890 would have decreased the total value of US agricultural land by 73% or $9.5 billion (1890 dollars). The implied annual economic loss is $751 million (6.3% of GNP) using Fogel’s preferred rate of return.31 Second, consider the economic impact of replacing all railroads with an extended canal network in 1890. Fogel (1964) proposes an extended canal network that may have largely substituted for railroads; indeed, the distance buffer calculations suggested that extended canals mostly substituted for railroads. Our market access methodology suggests that canal extensions were not as effective, as county market access declines on average by 57%. Replacing railroads with extended canals generates an annual economic loss of $692 million (5.8% of GNP). When the benefits of railroads are measured more accurately – reflecting the wide dispersal of railroad lines and direct connections to markets – the 1890 railroad network generates benefits that far exceed the potential gains from feasible improvements in canals. III.F Endogeneity of Railroad Construction and Changes in Market Access The baseline analysis has developed the necessary theory and data to regress land values on market access; further work can improve “identification” of this regression. Changes in county market access may be correlated with unobserved determinants of county land values, though the preliminary regressions do control for county and state-by-year fixed effects and and Coleman (2001) take to be 0.19 based on Census data. The parameter 𝜃𝐴 is known as the “trade elasticity.” Donaldson (2010) estimates a value of 3.8 in colonial Indian agriculture and Simonovska and Waugh (2011) estimate a value of 4.5 in modern OECD manufacturing. Using 𝛼 = 0.19 and 𝜃𝐴 = 3.8 we obtain 𝛽 = 1.05. Our specification is an approximation to the ideal model, but we find this encouraging. 30 Note also that the impact of market access was estimated to be approximately linear. 31 For an elasticity of 1.175 (from column 3), the annual economic loss is $679 million (5.7% of GNP). The estimated economic loss declines less than linearly in the estimated elasticity. 16 year-specific polynomial controls for county latitude and longitude. County market access depends on three terms: costs of trading with other counties 𝐴 𝐴 −𝜃𝐴 ), and the competitiveness of other coun) ), market sizes in other counties (𝑋𝑑𝑡 ((𝜏𝑜𝑑𝑡 ∑ 𝐴 −𝜃𝐴 ) )). Plausible exogenous variation in any of these three terms can be ties (( 𝑖∕=𝑑,𝑜 𝑋𝑖𝑡𝐴 (𝜏𝑑𝑖𝑡 used to identify the impact of market access on land values. Thus, the identification problem has more potential solutions than a direct impact evaluation of railroad placement. In the first term, county railroad construction is unlikely to be randomly assigned (conditional on the controls). One approach may be to replicate identification strategies by Atack et al. to use some exogenous variation in railroad placement, or adapt methods used by Baum-Snow (2007) and Michaels (2008). Alternatively, we can exploit the larger impact of nationwide railroad expansion on counties with worse access to waterways; in practice, counties’ (non)access to waterways can instrument for improved market access from 1870 to 1890. In the second term, changes in other counties’ population is less likely to be correlated with a county’s own land values (conditional on the controls). We can instrument for changes in market access with changes in other counties’ population; in particular, we can control for changes in nearby counties’ population and define the instrument based on changes in population for the county’s initial trading partners. Alternatively, we can exploit market size variation in the impact of trade cost declines: among counties that experience large declines in trade costs, some become able to trade with larger markets. In the third term, changes in other counties’ access to other counties is less likely to be correlated with a county’s own land values (conditional on the controls). These multilateral competitiveness terms are determined by factors twice-removed from a county: if a county’s trading partner gains access to other markets, it decreases that county’s effective market access. We can instrument for changes in county market access with (inverse) changes in the county’s trading partners’ access to other markets (not including itself). In practice, some of these instrumental variables strategies will have stronger or weaker predictive power, but comparing results across specifications should be informative. As an initial empirical exercise, we focus on variation in county market access that is orthogonal to that particular county’s railroad access. Table 2, column 1 reports the baseline estimate for comparison. Table 2, column 2, reports the estimated impact on land values from a county having any railroad track, conditional on county fixed effects, state-by-year fixed effects, and year-specific cubic polynomials in latitude and longitude. A county containing any railroad track is associated with a 0.359 log point increase in land values, though county railroad construction may be endogenous and jointly determined with land values. The estimated impact of market access is remarkably robust to controlling for whether a county has any railroad track (Table 2, column 3). Once controlling for market access, the 17 estimated impact of any railroad track is no longer substantial or statistically significant. These estimates are similar when also controlling for the length of railroad track by county (Table 2, column 4).32 Thus, county market access is correlated with railroad track, but the estimated impact of market access is robust to exploiting variation in market access that is independent of that county’s railroad construction and associated endogeneity bias. IV Conclusion (To Come) 32 The coefficient on market access is also robust to controlling for high-order polynomials of railroad track length. 18 References Anderson, S. P., A. de Palma, and J. Thisse (1992): Discrete Choice Theory of Product Differentiation. MIT Press, Cambridge. Anderson, J. and E. van Wincoop (2003): “Gravity with Gravitas: A Solution to the Border Puzzle,” American Economic Review, 93(1), pp. 170-192. Anderson, J. and E. van Wincoop (2004): “Trade Costs,” Journal of Economic Literature XLII(3), pp. 691-751. Arkolakis, C., A. Costinot, and A. Rodríguez-Clare (2011): “New Trade Models, Same Old Gains?,” American Economic Review, forthcoming. http://econwww.mit.edu/files/6445. Atack, J., F. Bateman, M. Haines and R.A. Margo (2010): “Did Railroads Induce or Follow Economic Growth? Urbanization and Population Growth in the American Midwest, 1850-1860,” Social Science History, 34(2), pp. 171-197. Atack, J., M. Haines and R.A. Margo (2011): “Railroads and the Rise of the Factory: Evidence for the United States, 1850-1870,” in P. Rhode, J. Rosenbloom, D. Weiman, eds. Economic Evolution and Revolutions in Historical Time. Palo Alto, CA: Stanford University Press. Atack, J. and R.A. Margo (2010): "The Impact of Access to Rail Transportation on Agricultural Improvement: The American Midwest as a Test Case, 1850-1860," BU Mimeo, March. http://www.bu.edu/econ/files/2010/10/FinalDraftJTLUMarch-282010JournalVersion1.pdf Axtell, R. (2001): “Zipf Distribution of U.S. Firm Sizes,” Science, 293(5536), pp. 18181820. Baum-Snow, N. (2007), "Did Highways Cause Suburbanization?," The Quarterly Journal of Economics, 122(2), pp. 775-805. Caselli, F. and W. Coleman II (2001): “The U.S. Structural Transformation and Regional Convergence: A Reinterpretation,” The Journal of Political Economy, 109(3), pp.584-616. Chaney, T. (2008): “Distorted Gravity: The Intensive and Extensive Margins of International Trade,” American Economic Review, 98(4), pp. 1707-1721. Combes, P-P., T. Mayer and J-F. Thisse (2008): Economic Geography: the Integration of Regions and Nations, Princeton: Princeton University Press. 19 David, P. (1969): “Transport Innovation and Economic Growth: Professor Fogel on and off the Rails,” The Economic History Review, 22(3), pp. 506-524. Donaldson, D. (2010): “Railroads of the Raj: Estimating the Impact of Transportation Infrastructure,” revisions requested at The American Economic Review. http://econ-www.mit.edu/files/6038. Eaton, J. and S. Kortum (2002): “Technology, Geography and Trade,” Econometrica, 70(5), pp. 1741-1779. Fogel, R. W. (1964): Railroads and American Economic Growth: Essays in Economic History, Johns Hopkins University Press, Baltimore. Fogel, R. W. (1979): “Notes on the Social Saving Controversy,” Journal of Economic History, 39, pp. 1-54. Haines, M. (2005): Historical, Demographic, Economic, and Social Data: US, 17902000, ICPSR. Haines, M. and R.A. Margo (2008): “Railroads and Local Economic Development: The United States in the 1850s,” in J. Rosenbloom, ed., Quantitative Economic History: The Good of Counting, pp. 78-99. London: Routledge. Harris, C. (1954): “The Market as a Factor in the Localization of Industry in the United States,” Annals of the Association of American Geographers, 64, pp. 315-348. Herrendorf, B. J.A. Schmitz Jr. and A.Teixeira (2009): "Transportation and Development: Insights from the U.S., 1840—1860," Federal Reserve Bank of Minneapolis Research Department Staff Report 425. http://minneapolisfed.org/research/sr/sr425.pdf Hornbeck, R. (2010): “Barbed Wire: Property Rights and Agricultural Development,” The Quarterly Journal of Economics, 125, pp. 767-810. Melitz, M. (2003): “The Impact of Trade on Intra-Industry Reallocations and Aggregate Industry Productivity,” Econometrica, 71, pp. 1695-1725. Michaels, G. (2008): "The Effect of Trade on the Demand for Skill: Evidence from the Interstate Highway System," The Review of Economics and Statistics, 90(4), pp. 683-701. Redding, S. and D. Sturm (2007): “The Cost of Remoteness: Evidence from German Division and Reunification,” American Economic Review, 98(5), pp. 1766-1797. 20 Simonovska, I. and M. Waugh (2011): “Elasticity of Trade: Estimates and Evidence,” Unpublished Manuscript, UC Davis and NYU. https://files.nyu.edu/mw134/public/uploads/56836/estimate_theta_paper.pdf. Williamson, J.G. (1974): Late Nineteenth-Century American Development: a General Equilibrium History, New York: Cambridge University Press. 21 Figure 1. Railroad Network in 1870 and 1890 A. Railroads in 1870 B. Railroads in 1890 Notes: Railroad lines are traced in GIS from georeferenced railroad Atlase images in 1870 and 1887. 22 Figure 2. Waterway Network A. Navigable Rivers B. Navigable Rivers and Canals C. Navigable Rivers, Canals, Lakes, and Coastal Routes D. All Navigable Waterways and Proposed Canal Extensions Notes: Navigable rivers, canals, and proposed canal extensions are traced in GIS, as defined by Fogel (1964). Lakes and coasts are saturated with waterways. 23 Figure 3. Available Waterways and Railroads, by Year A. Waterways and 1870 Railroads B. Waterways and 1890 Railroads Notes: Railroads are as defined in Figure 1. Waterways are as defined in Figure 2C. 24 Figure 4. 40-mile Buffers around Railroads, Waterways, and Proposed Canal Extensions A. Railroads in 1890 B. Waterways C. Railroads, overlaid with Waterways D. Railroads, overlaid with Proposed Canals, then Waterways Notes: Using GIS software, 40-mile buffers are drawn around railroads, waterways, and proposed canal extensions. 25 Figure 5. 10-mile Buffers around Railroads, Waterways, and Proposed Canal Extensions A. Railroads in 1890 B. Waterways C. Railroads, overlaid with Waterways D. Railroads, overlaid with Proposed Canals, then Waterways Notes: Using GIS software, 10-mile buffers are drawn around railroads, waterways, and proposed canal extensions. 26 Table 1. Estimated Elasticity between Market Access and Land Values Log (Value of Agricultural Land and Buildings) Dependent variable: (1) Log Market Access (based on population) (2) (3) (4) 1.477** 1.175** 0.956** (0.254) (0.158) (0.320) Log Market Access 1.441** (based on land values) (0.251) Number of Counties 2,161 2,161 2,161 2,068 R-squared 0.587 0.587 0.538 0.633 Notes: Each column reports estimates from equation (16) in the text. For a balanced panel 2,161 counties in 1870 and 1890, the Log Value of Agricultural Land and Buildings is regressed on Log Market Access, as defined in equation (15). All regressions include county fixed effects, state-by-year fixed effects, and yearspecific cubic polynomials in county latitude and longitude. Column 1 defines market access using county population to proxy for other markets' demand. Column 2 defines market access using land values to proxy for other markets' demand. Column 3 weights the regression by county population in 1870. Column 4 drops counties West of Texas, Oklahoma, Kansas, Nebraksa, South Dakota, and North Dakota. Robust standard errors clustered by state are reported in parentheses: ** denotes statistical significance at the 1% level, and * at the 5% level. 27 Table 2. Market Access Elasticity: Robustness to Direct Controls for Railroads Log (Value of Agricultural Land and Buildings) Dependent variable: (1) Log Market Access (based on population) (2) (3) (4) 1.477** 1.443** 1.455** (0.254) (0.240) (0.320) 0.359** 0.037 0.044 (0.116) (0.098) (0.092) Any Railroad Track Railroad Track Length (100km) - 0.032 (0.070) Number of Counties 2,161 2,161 2,161 2,161 R-squared 0.587 0.544 0.587 0.587 Notes: Column 1 reports the baseline specification from Table 1. All regressions include county fixed effects, state-by-year fixed effects, and year-specific cubic polynomials in county latitude and longitude. Column 2 reports the coefficient on a dummy variable for the county containing any railroad track. Column 3 reports the baseline specification from Table 1, controlling for a dummy variable for the county containing any railroad track. Column 4 also controls for the length of county railroad track (in units of 100km). Robust standard errors clustered by state are reported in parentheses: ** denotes statistical significance at the 1% level, and * at the 5% level. 28
© Copyright 2026 Paperzz