Railroads and American Economic Growth: New Data

Railroads and American Economic Growth:
New Data and Theory∗
Dave Donaldson
MIT and NBER
Richard Hornbeck
Harvard and NBER
Draft Version: September 2011
Preliminary and Incomplete
Abstract
This paper examines the impact of railroads on American economic growth. Drawing
on general equilibrium trade theory, changes in counties’ “market access” captures the
aggregate impact of local changes in railroads. Using a new spatial network of freight
transportation routes, we calculate county-to-county lowest transportation costs and
counties’ implied market access. As railroads expanded from 1870 to 1890, changes in
market access are capitalized in land values with an estimated elasticity of 1 – 1.5. The
computed decline in market access from removing all railroads in 1890 is estimated to
reduce GNP by 6.3%, roughly double estimates by Fogel (1964). Without railroads,
the total value of US agricultural land would have been lower by 73%. Fogel’s proposed
extention of the canal network would have only mitigated 8% of the loss from removing
railroads.
∗
For helpful comments and suggestions, we thank seminar participants at Harvard. We are grateful to
Jeremy Atack and coauthors for sharing data from their research. Irene Chen, Manning Ding, Jan Kozak,
and Rui Wang provided excellent research assistance.
During the 19th century, railroads spread throughout a growing United States as the
economy rose to global prominence. Railroads became the dominant form of freight transportation; areas around railroad lines prospered and preferential rates could allow particular
companies to dominate sectors. The early historical literature often presumed that railroads
were indispensable to the United States’ economy or, at least, very influential in US economic
growth. Understanding the development of the American economy is shaped by an understanding of the impact of railroads and, more generally, the impact of market integration.
In Railroads and American Economic Growth, Fogel (1964) transformed the academic
literature on railroads’ impact by using a social savings methodology to focus attention on
counterfactuals: in the absence of railroads, freight transportation by rivers and canals would
have been only slightly more expensive in most areas. Small differences in freight rates can
make-or-break individual firms or induce areas to thrive at the expense of others, but the
aggregate impact on economic growth may be small.
Fogel’s social savings methodology has been widely applied and continues to influence
the evaluation of transportation infrastructure improvements. However, many scholars have
discussed limitations of the social savings methodology (reviewed in Fogel 1979).1 One alternative approach is to create a computational general equilibrium (CGE) model, with
the explicit inclusion of multiple regions separated by a transportation technology (e.g.,
Williamson 1974; Herrendorf et al. 2009).
Increasingly, econometric studies have exploited local variation in railroad construction
and county-level data on population, output, or land values (Haines and Margo 2008; Atack
and Margo 2010; Atack et al. 2010; Atack et al. 2011). Similar methods have been used to
estimate the impact of railroads in modern China (Banerjee et al. 2010) or highways in the
United States (Baum-Snow 2007; Michaels 2008). Given plausibly random variation in the
location of transportation routes, these studies estimate the relative impact of transportation infrastructure. General equilibrium effects limit the aggregate interpretation of such
estimates, however, because railroads may displace activity from one area to another with
little aggregate impact. Railroads are inherently a network technology, and local changes in
the network affect all other areas on the network and areas off the network.
This paper introduces a new methodology for estimating the aggregate impact of transportation improvements, drawing on recent advances in trade theory and spatial databases
to estimate the impact of railroads on the American economy. Following Fogel’s analysis, we
focus on the aggregate impact of railroad freight transportation on the agricultural sector
in 1890. Drawing on general equilibrium trade theory, counties’ “market access” aggre1
For example, the methodology ignores the possibility of increasing returns, market power, or other
market imperfections (David 1969).
1
gates the local impacts and geographic spillover impacts of railroads.2 Building on empirical
work by Atack et al., we construct a GIS network database to calculate county-to-county
transportation costs and analyze county-level variation in market access.
As railroads expanded from 1870 to 1890, changes in market access are capitalized in land
values with an estimated elasticity of 1 – 1.5. County market access is influenced by changes
elsewhere in the railroad network; indeed, the estimated elasticity is robust to controlling
for a county’s own railroads.
We then compare the observed impact of railroads against two counterfactual scenarios
in 1890. In the first scenario, eliminating railroads lowers the total value of US agricultural
land by 73% and generates an annual economic loss of 6.3% of GNP, roughly double Fogel’s
estimate. In the second scenario, replacing railroads with extended canals generates an annual economic loss of 5.8% of GNP, mitigating only 8% of the loss from eliminating railroads.
Contrary to Fogel’s expectations, the proposed extension of the canal network is estimated
to be a poor substitute for the railroad network. In contrast to canals, railroads provided direct routes to many trading partners, and dramatically lowered distances of expensive wagon
transportation.
As a comparison, we extend a lesser-known component of Fogel’s methodology using recent improvements in county-level data and spatial computational analysis. We calculate
the value of agricultural land within different-sized thin buffers around railroads and rivers.
Through algebraic manipulation, eliminating railroads is estimated to impose a 4% annual
cost and replacing railroads with canal extensions is estimated to impose a 1% annual cost.
These estimates are expected to understate the cost of eliminating railroads and, in particular, overstate the benefit of canal extensions. By comparison, the market access methodology
results appear of reasonable magnitude.
In the United States and around the world, transportation infrastructure is attracting
tremendous resources as a popular initiative to increase economic growth. This paper develops and implements a new methodology for measuring the aggregate impacts of transportation infrastructure. The estimates suggest a large role for market integration in economic
development. Further, as the trade literature makes progress on tractable general equilibrium models, this empirical setting provides an opportunity to test these models’ empirical
insights using a direct shock to trade costs and land value data. While the market access
data are drawn from a newly constructed network database, this data collection process can
be replicated in other settings.
2
In related work, Redding and Sturm (2007) use a similar model to estimate the impact on population
from changes in market access following the division and reunification of Germany. Donaldson (2010)
estimates the impact of railroads on price dispersion, trade flows and real income in colonial India, using a
different model to calculate overall gains from trade.
2
I
Constructing a New Dataset on the Historical US Transportation Network
The data are drawn from a newly-constructed geographic information system (GIS) network
database on the location of railroads in 1870 and 1890 and the location of other transportation methods.3 This network database, i.e., “Google Maps for the 19th century,” allows the
calculation of lowest freight transportation costs between any two counties. For 1870 and
1890, we calculate counties’ “market access:” a weighted sum of transportation costs suggested by a general equilibrium trade model. We also calculate counties’ market access under
counterfactual scenarios, such as eliminating railroads or constructing Fogel’s hypothetical
expansion of the canal network in lieu of railroads.
The database considers three methods of freight transportation: railroads, boats, and wagons. Railroad and boat freight rates were similar per ton-mile in 1890, so the key advantage
to railroads was their wider geographic dispersion. Railroads mainly affected transportation
costs by reducing the required lengths of expensive wagon transportation and by providing
more-direct transportation routes. Faced with a number of data challenges, the network construction compromises in other areas and focuses on the location of railways and waterways
and the implied distances of wagon transportation.
I.A
Transportation Cost Parameters
We use Fogel’s transportation cost parameters, which are reasonable choices and enable closer
comparison between the methodologies’ results. One strong assumption is that freight rates
per ton-mile are constant throughout the country, which only partly reflects data limitations.
Local freight rates are likely endogenous to local economic activity and market power, so
there are advantages to using average national rates. Fogel’s analysis uses freight rates in
1890, and we maintain these rates in 1870. Thus, measured changes in trade costs and market
access are determined by variation in the location of transportation routes rather than prices.
Using Fogel’s parameters, railway rates are set at 0.63 cents per ton-mile and waterway
rates are set at 0.49 cents per ton-mile.4 Transshipment costs 50 cents per ton, incurred
whenever transferring goods to/from a railroad car, river boat, canal barge, or ocean liner.5
Wagon transportation costs 23.1 cents per ton-mile, defined as the straight line distance
3
We are grateful to Jeremy Atack and co-authors for providing initial GIS maps of railways.
Rates reflect an output-weighted average of rates for transporting grain and meat. Waterway rates
include insurance charges for lost cargo (0.025 cents), inventory and storage costs for slower transport and
non-navigable winter months (0.194 cents), and the social cost of public waterway investment (0.073 cents).
Waterway or railway rates are not allowed to vary by direction.
5
Fogel considers transshipment charges as a sub-category of water rates, but our modeling of transshipment points allows for a unified treatment of Fogel’s interregional and intraregional scenarios. Fogel’s
sources record higher railroad freight costs per ton-mile for shorter routes, but approximately reflect a 100
cent fixed fee and a 0.63 cent fee per mile.
4
3
between two points.6 Due to the wide dispersion in travel costs by method, the key features
of the transportation network concern the required length of wagon transportation and the
number of transshipment points.
I.B
Transportation Network Construction
The first step is obtaining digitized maps of constructed railroads in 1870 and 1890. We
are grateful to Jeremy Atack and co-authors for providing these initial GIS railroad files.7
These railroad files were originally constructed to define mileage of railroad track by county
and year; by contrast, for our purposes, railroad lines must connect exactly such that GIS
software recognizes travel is possible along a continuous line. We use GIS topology tools
to ensure exact connections between all railroad line segments.8 To minimize measurement
error in changes, we create a cleaned 1890 railroad file and modify that file to create a version
for 1870 that omits lines constructed between 1870 and 1890. Figure 1 maps the railroad
network in 1870 and 1890.
The second step adds locations of navigable rivers and canals. We follow Fogel’s definition
and mapping of navigable waterways, and build on a preliminary file provided by Atack. We
enhanced the Fogel-Atack map to follow natural river bends, as the relative straightness of
railroads contributes to travel cost differences. Overlapping waterways and railways do not
connect by default; instead, we create connections among waterways and railways to allow
for fixed transshipment costs. We also include the locations of hypothetical canals, whose
construction Fogel argued was feasible and would have largely substituted for railroads. Figure 2 maps the waterway network, and Figure 3 maps the combined waterway network and
railroad network in 1870 and 1890.
The third step allows for travel over two-dimensional space (lakes, oceans, unmapped
land). The network allows for travel along line routes, but space between lines is undefined and travel is either costless or impossible.9 We create waterways that saturate oceans
and lakes with a large number of possible routes. For over-land routes, we create wagon
connections from every county centroid to every other county centroid within 300km.10
The fourth step connects individual counties to the transportation network. We define av6
This rate reflects a cost of 16.5 cents per mile traveled and Fogel’s adjustment factor of 1.4 between the
shortest straight line distance and miles traveled.
7
First, year-specific railroad atlas scans from the Library of Congress are “georeferenced” to US county
borders. Second, railroad lines are hand-traced in GIS software to create a digital map of railroad line
locations.
8
For example, digitized railroad lines often contain small internal gaps. In addition, by default, two
crossing railroad lines are equivalent to a highway overpass without a connection. Correcting all such cases
has little effect on total track mileage by county, but can have large impacts on calculated lowest-cost routes.
9
The analysis could use rasters that allow for travel costs for each map pixel, but it would be unable to
reflect transshipment costs: a string of railroad pixels will intersect a string of river pixels.
10
Note that wagon rates adjust straight line distance to reflect average travel distance.
4
erage travel costs between counties by calculating the travel cost between county centroids.
County centroids must be connected to the network of railways and waterways; otherwise,
lowest-cost travel calculations assume that freight first travels (freely) to the closest railway
or waterway. We create wagon connections between each county centroid and each nearby
type of transportation route in each relevant direction.
The fifth step refines the centroid connections to reflect the importance of wagon distances
to overall travel costs. Consider that when railroads travel through a county, the centroid’s
nearest distance to a railroad no longer reflects the average distance from county points to
a railroad. Fogel recognized the importance of modeling this distance and his ideal solution
was to break each county into small grids and take the average of nearest distances from each
grid to a railroad. Due to technical limitations, Fogel approximated this average distance
using one-third of the distance from the farthest point in a county to a railroad. We use
GIS software to create 200 random points in each county, calculate the distance from each
point to the nearest railroad, and take the average of these nearest distances. We adjust
the cost of travel along each centroid connection to within-county railroads to reflect that
county’s average travel cost to a railroad. We repeat this process for rivers, canals, and
Fogel’s proposed canals. This process allows the analysis to exploit precise variation on the
intensive margin of county access to railways and waterways.
I.C
Transportation Route Cost Calculations
We use the completed network dataset to calculate the lowest cost route between each pair of
counties, i.e., 5 million calculations. The lowest cost routes are calculated under four scenarios: (1) transportation costs with 1870 railroads, (2) transportation costs with 1890 railroads,
(3) transportation costs with no railroads, and (4) transportation costs with no railroads and
Fogel’s extended canals. All scenarios include transportation by river/canal/lake/ocean and
over-land “wagon routes.” The observed travel paths are consistent with intuitive travel patterns. Ultimately, calculated travel costs are not intended to most accurately reflect actual
travel costs (for which partial data are available); rather, calculated travel costs are a proxy
for differences over space and time in transportation costs due to the availability of nearby
railways and waterways.
I.D
County-level Census Data
County-level data are drawn from the US Censuses of Agriculture and Population (Haines
2005). The main variables of interest are total population and the total value of agricultural
land. We adjust data from 1870 to reflect 1890 county boundaries (Hornbeck 2010). Unsettled counties with missing data are assumed to have zero population and zero land value,
as all areas contribute to the calculation of market access. The empirical analysis excludes
5
counties with zero land value, and explores robustness to excluding all counties West of Texas.
II
A Distance Buffer Approach to Valuing Railroads
Using only the location of railways and waterways, an initial empirical exercise extends Fogel’s lesser-known approach to valuing the impact of railroads in 1890. Fogel’s best-known
approach involves multiplying the quantity of transported agricultural goods by the difference in transportation costs with railroads and without railroads. This method is applied
to inter-regional trade between major hubs and some intra-regional trade. One limitation in
applying this method to all intra-regional trade is that waterway and wagon transportation
costs would be higher than product values in some railroad-serviced areas. As an alternative,
Fogel excludes all areas more than 40 miles from a navigable waterway and calculates the
economic loss as the value of agricultural land in these areas.
Fogel’s alternative method of calculating economic loss is based on a Ricardian model:
land values capitalize economic rents, which depend on transportation costs to markets.
Railroads and waterways are much cheaper than wagons – 40 miles of wagon transportation
is equivalent to 1500-1900 miles of transportation by rail or waterway – so distance to a railroad or waterway is a decent proxy for transportation costs. Fogel makes a stark assumption
that farmland is worth its full amount within 40 miles and is worth zero beyond 40 miles,
when wagon costs exceed average product prices.
We extend this Ricardian method to a full evaluation of railroads, exploiting GIS software
and county-level data. For 1890, we calculate the share of each county that is 0-5 miles, 5-10
miles, ..., 35-40 miles, and more than 40 miles from a railroad or river/canal/lake/ocean.
We also calculate the share of each county in each distance bin under two counterfactual
scenarios: (1) no railroads and (2) no railroads and canal extensions.
Figure 4 maps 40-mile buffers for each scenario, which illustrates the intuition in Fogel’s
original analysis. Panel A is shaded to reflect the area of land within 40 miles of a railroad
or waterway. Panel B is shaded to reflect the area of land within 40 miles of a waterway.
Panel C overlays these two maps; the gray area is “serviced” by a railroad but not by a
waterway. Panel D adds Fogel’s proposed canal extensions; most of the Midwestern areas
become “serviced” by this expanded waterway network.
Figure 5 maps 10-mile buffers for each scenario, which illustrates why Fogel’s approach
may understate the value of railroads and overstate the value of canal extensions. Panel A
is shaded to reflect the area of land within 10 miles of a railroad or waterway; given the high
density of railroad construction, most areas remain shaded when compared to panel A of
figure 4. In Panel B, however, relatively little area is covered by waterways when compared
to panel B of figure 4. Panel C overlays these maps, highlighting the relative advantage
6
of railroads at closer distances. Panel D adds the canal extensions, which are relatively
ineffective at covering areas serviced by railroads.
Calculating the economic impact of railroads involves algebraic manipulation in two steps.
First, we inflate observed land values to reflect lands’ implied “true value” if it were next
to a transportation route. Second, we calculate the implied lost land value from changing
observed transportation routes to each counterfactual scenario.
First, assume a continuous depreciation in land values as nearest distance to a railway or
waterway increases from 0 to 40 miles. For each county 𝑖, the implied per-acre “true land
value” or 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 is a function of the “observed land value” or 𝑂𝑏𝑠𝑒𝑟𝑣𝑒𝑑𝐿𝑉𝑖 and the share
of county 𝑖 in each distance bin 𝛼𝑖𝑑 :
(1)
𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 =
𝛼𝑖0−5 37.5
40
𝑂𝑏𝑠𝑒𝑟𝑣𝑒𝑑𝐿𝑉𝑖
+ 𝛼𝑖5−10 32.5
+ ... + 𝛼𝑖35−40 2.5
40
40
Second, we calculate the share of each county 𝑖 in each distance bin 𝛽𝑖𝑑 from only a
river/canal/lake/ocean. The implied lost land value from removing railroads is:
(2) 𝐿𝑜𝑠𝑠𝑁 𝑜𝑅𝑅 = 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 [(𝛽𝑖40+ − 𝛼𝑖40+ ) + (𝛽𝑖35−40 − 𝛼𝑖35−40 )
37.5
2.5
+ ... + (𝛽𝑖0−5 − 𝛼𝑖0−5 ) ]
40
40
Similarly, we calculate the share of each county 𝑖 in each distance bin 𝛾𝑖𝑑 from a proposed
canal extension or a river/canal/lake/ocean. The implied lost land value from replacing
railroads with extended canals is:
(3) 𝐿𝑜𝑠𝑠𝐶𝑎𝑛𝑎𝑙𝑠 = 𝑇 𝑟𝑢𝑒𝐿𝑉𝑖 [(𝛾𝑖40+ − 𝛼𝑖40+ ) + (𝛾𝑖35−40 − 𝛼𝑖35−40 )
37.5
2.5
+ ... + (𝛾𝑖0−5 − 𝛼𝑖0−5 ) ]
40
40
This method estimates that removing railroads in 1890 would decrease land values by $6
billion, generating an annual loss of $480 million using Fogel’s preferred rate of return. By
comparison, Fogel estimates annual intra-regional losses of $248 million to $337 million, and
annual inter-regional losses of $73 million.11 If land values are not inflated to “true values,”
skipping step one above, the estimated annual loss is $420 million.
This method estimates that replacing railroads with canal extensions in 1890 would still
generate an annual loss of $128 million. In this approach, canal extensions largely mitigate
the loss of railroads because extended canals reach near otherwise unserved areas. Canal
extensions only provide indirect routes to destinations, however, so a methodology based on
trade costs should estimate lower value from canal extensions.
The subsequent analysis focuses on county-to-county trade costs because, ultimately, rail11
From the 40-mile buffer loss alone, Fogel estimates an annual loss of $154 million.
7
ways and waterways matter for where they provide transportation. The full trade model
simplifies to a county-level regression that maintains this earlier Ricardian intuition: low
trade costs and high market access are capitalized in higher land values. The model is identified from changes in market access over time, and we use these parameters to estimate the
economic cost from eliminating railroads or replacing railroads with canal extensions.
III
A Market Access Approach to Valuing Railroads
We formalize a model of trade among US counties that specifies how goods prices and factor
prices – and hence living standards – respond to changing trade costs. The model contains
thousands of counties, each with interacting goods markets and factor markets, that generate
positive and negative spillovers on other counties. Remarkably, under a set of assumptions
that are standard among modern quantitative trade models, all of these spillovers collapse
into a single sufficient statistic represented by a county’s “market access.” The model generates a regression equation that we take to the data; in particular, the model implies a simple
relationship between county agricultural land values and county market access (appropriately
defined). The model’s simplifications and abstractions are stark, but these assumptions are
rewarded with analytical insight and empirical predictions.12 In the end, regressing county
land values on a weighted sum of trade costs with other counties (i.e., “market access”) also
has a completely atheoretical appeal.
III.A
A Model of Trade Among US Counties
The economy consists of many trading counties, each indexed by 𝑜 if the origin of a trade
and by 𝑑 if the destination. Some counties are rural (𝑜 ∈ ℛ) and some are urban (𝑜 ∈ 𝒰).
Tastes: Economic agents have Cobb-Douglas preferences over two commodity groups:
agricultural goods (weight 𝛾) and manufactured goods (weight 1 − 𝛾). The commodity
groups are composed of many differentiated goods varieties (indexed by 𝑗), and tastes over
these varieties take a CES form (with elasticities 𝜎𝐴 and 𝜎𝑀 ).13 An agent living in county
12
These modeling assumptions are used in the fields of international trade and economic geography, and
reflect recent best-practice to gain traction in general equilibrium spatial settings with many regions and
commodities.
13
It is important to note that the elasticities of substitution within each commodity group are not
restricted; these could be very high if agricultural varieties are relatively similar. Anderson et al. (1992)
provide an attractive microfoundation of aggregate-level CES preferences. If individual agents desire only
one variety of the agricultural good (their “ideal variety”) and agents’ ideal varieties are distributed in a
Logit fashion, then aggregate consumption data from a population of many such agents behaves as though
all agents have CES preferences over all varieties (where in such an interpretation 𝜎 indexes the inverse of
the dispersion of agents’ ideal varieties).
8
𝑜 and receiving income 𝑌𝑜 experiences indirect utility:
(4)
𝑉 (Po , 𝑌𝑜 ) =
𝑌𝑜
,
𝐴
𝛾
(𝑃𝑜 ) (𝑃𝑜𝑀 )1−𝛾
where 𝑃𝑜𝐴 and 𝑃𝑜𝑀 are standard CES price indices.14
Agricultural technology: For simplicity, we assume a stark form of specialization: rural
counties produce only agricultural goods and urban counties produce only manufacturing
goods.15 Rural counties use a Cobb-Douglas technology to transform land and labor into
agricultural goods varieties. We assume that labor is mobile across space and across sectors,
i.e., that farmers have an endogenous outside option to work in other counties or manufacturing. The marginal cost of producing agricultural goods of variety 𝑗 in county 𝑜 is:
𝑀 𝐶𝑜𝐴 (𝑗) =
(5)
(𝑞𝑜 )𝛼 (𝑤𝑜 )1−𝛼
if 𝑜 ∈ ℛ,
𝑧𝑜𝐴 (𝑗)
where 𝑤𝑜 is the wage rate, 𝑞𝑜 is the agricultural land rental rate, and 𝑧𝑜𝐴 (𝑗) is a Hicksneutral productivity shifter that is exogenous and local to county 𝑜. We follow Eaton and
Kortum (2002) in modeling these productivity shifters by assuming that each county draws
its productivity level, for any given variety 𝑗, from a Frechet (or Type II extreme value)
distribution: 𝐹𝑜 (𝑧) = 1−exp(−𝑇𝑜𝐴 𝑧 −𝜃𝐴 ).16 This functional form is restrictive, but it has made
multiple-region models of comparative advantage tractable since the seminal work of Eaton
and Kortum (2002). Finally, we assume perfect competition among producers, following most
of the trade literature on relatively homogeneous goods such as agricultural commodities.
Manufacturing technology: Urban counties use a Cobb-Douglas technology to transform
capital and labor into manufacturing goods varieties, whose marginal cost of production is:
𝑀 𝐶𝑜𝑀 (𝑗) =
(6)
14
That is, 𝑃𝑜𝐴 =
𝑛𝐴 𝐴
(𝑝𝑐 (𝑗))1−𝜎𝐴 𝑑𝑗
0
[∫
(𝑤𝑜 )𝛽 (𝑟𝑜 )1−𝛽
if 𝑜 ∈ 𝒰,
𝑧𝑜𝑀 (𝑗)
]1/(1−𝜎𝐴 )
and 𝑃𝑜𝑀 =
𝑛𝑀 𝑀
(𝑝𝑜 (𝑗))1−𝜎𝑀 𝑑𝑗
0
[∫
]1/(1−𝜎𝑀 )
. Variables 𝑛𝐴
and 𝑛𝑀 are the number of varieties available to consumers; as we describe below, the number of varieties is
fixed in agriculture and the number is endogenous in the manufacturing sector and adjusts to allow free entry.
15
In future work we aim to subdivide all counties into their rural and urban segments (based on population
shares in the Census data). An additional extension could solve for the equilibrium specialization of each
county into its rural and urban segments.
16
This distribution captures how productivity differences across counties give incentives to specialize
and trade. The parameter 𝑇𝑜𝐴 captures county-specific (log) mean productivity, which corresponds to
the county’s absolute advantage. The parameter 𝜃𝐴 captures, inversely, the (log) standard deviation
of productivity, which corresponds to the scope for comparative advantage. A low 𝜃𝐴 means county
productivity draws are dispersed, creating large incentives to trade on the basis of productivity differences.
9
where 𝑤𝑜 is the same wage rate, 𝑟𝑜 is the capital rental rate, and 𝑧𝑜𝑀 (𝑗) is a Hicks-neutral
productivity shifter that is exogenous and local to county 𝑜. The urban manufacturing sector
pays both this marginal cost and a fixed cost of production, 𝑓𝑜 , paid in the same proportion
of factors as marginal costs. We follow Melitz (2003) in assuming a set of potential producers, each with a productivity shifter 𝑧𝑜𝑀 (𝑗) that is drawn, prior to entry, from a distribution
with CDF 𝐺𝑜 (𝑧). We also follow Chaney (2008), among others, in modeling the distribution
𝐺𝑜 (𝑧) as a Pareto distribution: 𝐺𝑜 (𝑧) = 1 − (𝑧/𝑇𝑜𝑀 )−𝜃𝑀 .17 The parameter 𝑇𝑜𝑀 captures the
location of this distribution and 𝜃𝑀 captures its dispersion. Following Melitz (2003), firms
compete monopolistically with free entry such that all profits are zero.
Trade costs: Trading goods is costly. Remote rural locations pay high prices for imported
goods and receive low prices for goods they produce, as this is the only way locations can be
competitive in distant markets. We model trade costs using a simple and standard “iceberg”
formulation. When a variety is made in county 𝑜 and sold locally in county 𝑜, its price is
𝑝𝑘𝑜𝑜 (𝑗) in sector 𝑘; but when this same variety is made in county 𝑜 and shipped to county 𝑑,
𝑘
𝑘 𝑘
is applied to each unit of
𝑝𝑜𝑜 (𝑗).18 A proportional trade cost 𝜏𝑜𝑑
it will sell for 𝑝𝑘𝑜𝑑 (𝑗) = 𝜏𝑜𝑑
𝑘
the variety shipped.19 Trade is costly because 𝜏𝑜𝑑
≥ 1. Our empirical analysis below models
𝑘
𝜏𝑜𝑑 as a function of observables, based on the available transportation routes between county
𝑜 and county 𝑑 and how these routes changed over time.
Factor mobility: Workers are free to move around the country. Worker utility, as defined
in equation (4), will hence be equalized across counties in equilibrium and wages will satisfy:
(7)
𝑤𝑜 = 𝑈¯ (𝑃𝑜𝐴 )𝛾 (𝑃𝑜𝑀 )1−𝛾 ,
where 𝑈¯ is the endogenous level of utility obtained by workers in each county. Land is fixed
by county and capital is perfectly mobile.20
III.B
Using the Model to Inform Empirical Work
Solving for trade flows—the gravity equation: First, we solve for trade flows from each
origin county 𝑜 to each other destination county 𝑑 in the agricultural sector. Due to perfect
competition, the marginal costs of producing each variety are equal to their price. We
assume that consumers only buy a variety from the cheapest available location. Substituting
marginal costs from each supply location 𝑜 (equation 5) into the demand for agricultural
17
This is a good match for the empirical distribution of firm sizes in most samples (e.g., Axtell 2001).
Monopolistically competitive producers in the manufacturing sector have an incentive to segment
markets but, with CES preferences and many producers, the same mark-up over marginal cost will be
charged by each producer in each market.
19
Melitz (2003) proposed that some trade barriers may be fixed (ie, constant regardless of the quantity
shipped). For simplicity we have not included such costs here but they could easily be added in future work.
20
Predictions are equivalent if capital is perfectly immobile, but intermediate cases are difficult to model.
18
10
varieties in county 𝑑, Eaton and Kortum (2002) derive the value of total exports from 𝑜 to 𝑑:21
(8)
(9)
where
−1
𝐴 −𝜃𝐴
𝐴
) (𝐶𝑀 𝐴𝐴
= 𝜒𝐴 𝑇𝑜𝐴 (𝑞𝑜𝛼 𝑤𝑜1−𝛼 )−𝜃𝐴 (𝜏𝑜𝑑
𝑋𝑜𝑑
𝑑 ) 𝑌𝑑 ,
∑
≡
𝑇𝑖𝐴 (𝑞𝑖𝛼 𝑤𝑖1−𝛼 )−𝜃𝐴 (𝜏𝑖𝑑𝐴 )−𝜃𝐴 .
𝐶𝑀 𝐴𝐴
𝑑
𝑖∈ℛ
𝐶𝑀 𝐴𝐴
𝑑 refers to “consumer market access” for agricultural goods in destination market 𝑑.
Consumer market access is a weighted sum of productivity-adjusted costs of production in
each market 𝑖 that could supply market 𝑑, with weights declining in the cost of trading from
𝑖 to 𝑑 (i.e., 𝜏𝑖𝑑𝐴 ). From equation (8), county 𝑜 sends more goods to county 𝑑 if county 𝑜 is
relatively productive (high 𝑇𝑜𝐴 ) or relatively low cost (low 𝑞𝑜 or 𝑤𝑜 ). County 𝑜 also sends more
goods to county 𝑑 if county 𝑑 has high total income (high 𝑌𝑑 ) or low overall consumer market
access (low 𝐶𝑀 𝐴𝐴
𝑑 ), meaning that county 𝑜 faces fewer competitors when selling to market 𝑑.
A similar expression can be derived for trade in the manufacturing sector. The manufacturing problem is complicated by the presence of free entry, however, Chaney (2008) shows
that trade flows again take a gravity equation form:
(10)
(11)
where
𝑀
𝛽 1−𝛽 −𝜃𝑀
𝑀 −𝜃𝑀
−1
𝑋𝑜𝑑
= 𝜒𝑀 𝑇𝑜𝑀 𝑛𝑀
)
(𝜏𝑜𝑑
)
(𝐶𝑀 𝐴𝑀
𝑜 (𝑤𝑜 𝑟𝑜
𝑑 ) 𝑌𝑑 ,
∑
𝛽 1−𝛽 −𝜃𝑀
𝐶𝑀 𝐴𝑀
≡
𝑇𝑖𝑀 𝑛𝑀
)
(𝜏𝑖𝑑𝑀 )−𝜃𝑀 .
𝑑
𝑖 (𝑤𝑖 𝑟𝑖
𝑖∈𝒰
The only difference from equation (8) is that manufacturing exports from county 𝑜 also
depend on the endogenous number of varieties available in each location (i.e., 𝑛𝑀
𝑜 ).
Equations (8) and (10) are known as a gravity equations, and they govern trade flows
in this model. Deriving a gravity equation is dependent on the particular functional form
assumptions for tastes and technology; indeed, it is difficult to imagine alternative assumptions that give rise to a gravity equation in these two settings (Arkolakis et al. 2011). The
gravity equation is appealing because it dramatically simplifies a complex general equilibrium problem of spatial competition. A more appealing feature of the gravity equation is
that it appears to fit data universally (Anderson and van Wincoop 2003, 2004; Combes et
al. 2008). Thus, it is difficult to imagine an exercise that models trade among counties in
the United States that does not predict trade flows will take the “gravity” form.
Solving for the land rental rate. The absence of systematic data on 19th century trade flows
𝐴
𝑀
between US counties means that 𝑋𝑜𝑑
and 𝑋𝑜𝑑
are unobservable. However, the gravity equation implies tractable expressions for outcome variables that are observable and important
for our purposes, such as the rural land rental rate (𝑞𝑜 ).
21
The term 𝜒𝐴 is a constant that is comprised of the parameters 𝛾, 𝜃𝐴 , and 𝜎𝐴 .
11
The Cobb-Douglas agricultural technology leaves land a fixed share 𝛼 of total output 𝑌𝑜𝐴 ,
so 𝑞𝑜 𝐿𝑜 = 𝛼𝑌𝑜𝐴 where 𝐿𝑜 is the (exogenous) quantity of agricultural land in county 𝑜. Goods
∑ 𝐴
). Thus, equation (8) implies:
markets clear, so all produced goods are bought (𝑌𝑜𝐴 = 𝑑 𝑋𝑜𝑑
𝑞𝑜1+𝛼𝜃𝐴
(12)
(
=𝛼
𝑇𝑜𝐴
𝐿𝑜
)
𝑤𝑜−𝜃𝐴 (1−𝛼) 𝐹 𝑀 𝐴𝐴
𝑜.
𝐹 𝑀 𝐴𝐴
𝑜 refers to “firm market access” over agricultural goods in origin 𝑜 and is defined as:
𝐹 𝑀 𝐴𝐴
𝑜 ≡ 𝛾
(13)
∑
𝐴 −𝜃𝐴
−1
(𝜏𝑜𝑑
) (𝐶𝑀 𝐴𝐴
𝑑 ) 𝑌𝑑 .
𝑑∈ℛ,𝒰
Firm market access (𝐹 𝑀 𝐴𝐴
𝑜 ) is a sum of terms for all destination counties 𝑑 to which the
rural county 𝑜 tries to sell its agricultural goods. These terms include the size of the destination market (given by total income, 𝑌𝑑 ) and the competitiveness of the destination market
(given by its 𝐶𝑀 𝐴𝐴
𝑑 term). All terms are inversely weighted by the cost of trading with
𝐴 −𝜃𝐴
each distant market ((𝜏𝑜𝑑
) ).
Substituting equation (7) into equation (12) to eliminate the wage, we obtain an expression
for the land rental rate in county 𝑜:
) ( 𝐴) (
)
1
𝑇𝑜
𝛾(1 − 𝛼)
ln 𝑞𝑜 = 𝜅 +
ln
+
ln(𝐶𝑀 𝐴𝐴
𝑜)
1 + 𝛼𝜃𝐴
𝐿𝑜
1 + 𝛼𝜃𝐴
(
)
(
)
(1 − 𝛾)(1 − 𝛼)
1
𝑀
+
ln(𝐶𝑀 𝐴𝑜 ) +
ln(𝐹 𝑀 𝐴𝐴
𝑜 ),
1 + 𝛼𝜃𝐴
1 + 𝛼𝜃𝐴
(
(14)
where 𝜅 is a constant that subsumes model parameters and 𝑈¯ .
Equation (14) provides a parsimonious guide for empirical work. Equilibrium rural land
rental rates (𝑞𝑜 ) are log-linear in just three endogenous economic variables: (1) firm market
access for agricultural goods that these rural areas produce (𝐹 𝑀 𝐴𝐴
𝑜 ), (2) consumer market
𝐴
access to agricultural goods (𝐶𝑀 𝐴𝑜 ), and (3) consumer market access to manufacturing
goods (𝐶𝑀 𝐴𝑀
𝑜 ). Immobile land in county 𝑜 will be more valuable if county 𝑜 has: easy
access to large uncompetitive markets (high 𝐹 𝑀 𝐴𝐴
𝑜 ) or easy access to agricultural goods
𝐴
(high 𝐶𝑀 𝐴𝑜 ) and manufactured goods (high 𝐶𝑀 𝐴𝑀
𝑜 ) that workers value.
Equation (14) has two key implications for analyzing the impact of railroads on the development of the American economy. First, all economic forces that make goods and factor
markets interdependent across thousands of counties get collapsed into three sufficient statistics, each of which capture a slightly different concept of “market access.” All cross-county
general equilibrium spillover effects of railroads are captured by working with this particular concept of market access. For example, one county receiving a railroad line would, in
12
principle, affect other counties: counties that remain without a railroad, counties that previously had a railroad, counties that traded with the new railroad county, counties that can
now trade with the new railroad county, and so on. Even if railroad assignment is random,
“control” counties are affected and a regression of land rents on railroad access will confound
these effects and be biased. However, a regression of land rents on market access will be free
of this bias because, according to the model, all counties’ market access variables will adjust
with changes in the railroad network.
The second key implication of equation (14) is that access to markets matters for county
prosperity, as opposed to access to railroads or waterways. Market access can increase or
decrease due to events far beyond county borders: any changes in the transportation network
or changes in other counties’ market size. Thus, the empirical estimation is not identified
solely from particular counties gaining railroad access; rather, county market access changes
due to factors independent of decisions to extend railroads into particular locations. The
model suggests several instrumental variable strategies, which we explore below.
III.C
From Theory to an Empirical Specification
Equation (14) provides a useful approach for empirical work, though three obstacles prevent
its direct empirical implementation. We discuss proposed solutions to these challenges, many
of which are preliminary and can be improved upon in future work.
First, the Census of Agriculture contains data on land values rather than land rental
rates.22 We use land values as a proxy for land rental rates.23
Second, the technology term 𝑇𝑜𝐴 in equation (14) is not directly observable in our setting.24 One plausible assumption is that this term is constant over time, or at least that
changes in county agricultural technology are orthogonal to changes in market access. Because technology (𝑇𝑜𝐴 ) enters log-linearly in equation (14), we control for this term using
county fixed effects. We also attempt to control for technological change that may be correlated with changes in market access through the use of state-year fixed effects, and year
effects interacted with cubic polynomials in each county’s latitude and longitude.
𝑀
Third, the three market access terms (𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴
𝑜 , 𝐶𝑀 𝐴𝑜 ) are also not directly ob22
The Census reports the value of agricultural land and buildings. In some years the Census reports
separate values for land and buildings, and land values are the largest share of this combined measure.
23
In particular, we assume that 𝑉𝑜 = 𝑞𝛿𝑜 , where 𝑉𝑜 is the land value and 𝛿 is the interest rate or discount
rate relevant to those who own agricultural land. This assumes that agents face a fixed interest rate or
discount rate and expect zero serial correlation in market access over time. We are exploring relaxing this
assumption by allowing for forward-looking agents with different expectations for changes in market access.
The analysis benefits from debates and uncertainty over the future paths of railroad construction, as well
as uncertainty over changes in market sizes.
24
Following Eaton and Kortun (2002) and a large body of work since, a typical method for estimating
the technology terms 𝑇𝑜𝐴 and 𝑇𝑜𝑀 uses the relationship between these terms and the exporter fixed effect
in a gravity equation. Due to the lack of trade data in our setting, this method is unavailable to us.
13
servable because each contains variables that are unobserved in our setting (e.g., the wage
𝑤𝑑 , the capital rental rate 𝑟𝑑 , and the technology terms 𝑇𝑑𝐴 and 𝑇𝑑𝑀 ). We are exploring calibrating these terms to population data (as in Redding and Sturm 2007); for now, we proxy
𝑀
for 𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴
𝑜 , and 𝐶𝑀 𝐴𝑜 with one simplified market access term (𝑀 𝐴𝑜 ) inspired by
Harris (1954).25
Harris originally defined a concept of “market potential,” based on the number and magnitude of markets available at low trade costs. This market potential term (𝑀 𝑃𝑜𝑡𝐻 ) effectively
∑
𝐴
𝐴 −1 𝐴
is a proxy for total demand at location 𝑑, such as pop) 𝑋𝑑𝑡 , where 𝑋𝑑𝑡
equals 𝑑∕=𝑜 (𝜏𝑜𝑑𝑡
ulation or the value of agricultural land. That is, market potential is the sum of all other
𝐴
𝐴
market sizes (𝑋𝑑𝑡
) down-weighted by inverse trade costs (weights are 1/𝜏𝑜𝑑𝑡
).
We extend the Harris formula in two respects. First, in our model, trade costs do not
diminish effective market sizes inversely; rather, they do so to the power of −𝜃𝐴 and typical
estimates of 𝜃𝐴 (discussed below) are significantly greater than one. Second, the 𝐹 𝑀 𝐴𝐴
𝑜
formula in equation (13) implies that market size should be deflated by the competitiveness
of that market (𝐶𝑀 𝐴𝐴
𝑑 ). Intuitively, a large market 𝑑 is more appealing to exporters in
county 𝑜 if market 𝑑 is served by fewer other counties (and, thus, prices are high in market
𝑀
𝑑). The same is true for 𝐶𝑀 𝐴𝐴
𝑜 and 𝐶𝑀 𝐴𝑜 : access to a large market 𝑑 is more appealing
to consumers in market 𝑜 if market 𝑑 supplies fewer other counties (and, thus, sells goods
𝐴
) by a
for low prices). To incorporate these multilateral effects, we divide market sizes (𝑋𝑑𝑡
∑
𝐴 −1 𝐴
proxy for county 𝑑’s competitiveness level (𝑀 𝑃𝑑𝑡𝐻 = 𝑖∕=𝑑,𝑜 (𝜏𝑑𝑖𝑡
) 𝑋𝑖𝑡 ).
𝑀
In this way, we condense three unobserved market access terms (𝐹 𝑀 𝐴𝑜 , 𝐶𝑀 𝐴𝐴
𝑜 , 𝐶𝑀 𝐴𝑜 )
into one observed market access term:
(15)
𝑀 𝐴𝑜𝑡 =
∑
𝐴 −𝜃𝐴 𝐴
(𝜏𝑜𝑑𝑡
) 𝑋𝑑𝑡
.
𝐴 −𝜃𝐴 𝐴
𝑋𝑖𝑡
𝑖∕=𝑑,𝑜 (𝜏𝑑𝑖𝑡 )
∑
𝑑∕=𝑜
This combined market access term is more “ad hoc” than the three terms that follow directly
from the model.26 However, we find the assumption plausible that a farmer’s market access
is positively correlated with the farmer having lower trade costs with: (1) larger populations
demanding the farmer’s goods, (2) larger populations that have fewer other sources for acquiring the farmer’s goods, (3) larger populations producing the farmer’s demanded goods, and
(4) larger populations that have fewer other destinations to sell the farmer’s demanded goods.
25
𝑀
𝐴
Note that collapsing 𝐶𝑀 𝐴𝐴
𝑜 , 𝐶𝑀 𝐴𝑜 , and 𝐹 𝑀 𝐴𝑜 into one market access variable removes the model’s
distinction between urban and rural counties (and between agricultural and manufacturing sectors). This
simplification removes the scope for analyzing cross-county specification into sectors and analyzing separate
effects on urban and rural counties.
26
𝑀
𝐴
Formally, 𝑀 𝐴𝑜𝑡 can be shown to approximate 𝐶𝑀 𝐴𝐴
𝑜𝑡 , 𝐶𝑀 𝐴𝑜𝑡 and 𝐹 𝑀 𝐴𝑜𝑡 as the location of
manufacturing and agriculture in the vicinity of county 𝑜 becomes more and more similar.
14
These four forces are embodied in (15). The term 𝑀 𝐴𝑜𝑡 is calculated for each county in 1870
and 1890 using the GIS-computed county-to-county lowest cost transportation route.27
Summarizing the above discussion, we regress land values in county 𝑜 and year 𝑡 on a
county fixed effect (𝛼𝑜 ), state-by-year fixed effects (𝛼𝑠𝑡 ), a cubic polynomial in county latitude and longitude interacted with year effects (𝑓 (𝑥𝑜 , 𝑦𝑜 )𝛾𝑡 ), and log market access (𝑀 𝐴𝑜𝑡 ):
(16)
ln 𝑉𝑜𝑡 = 𝛼𝑜 + 𝛼𝑠𝑡 + 𝑓 (𝑥𝑜 , 𝑦𝑜 )𝛾𝑡 + 𝛽 ln(𝑀 𝐴𝑜𝑡 ) + 𝜀𝑜𝑡 .
The sample is a balanced panel of 2,161 counties with land value data in 1870 and 1890.
Standard errors are clustered at the state level to adjust for heteroskedasticity and withinstate correlation over time.
III.D
Empirical Results
Table 1 presents results from estimating equation (16). Column 1 reports the estimated impact of market access on land values, when defining market access based on other counties’
𝐴
= 𝐿𝑑𝑡 in equation (15)). Column 2 repeats this regression, defining market
population (𝑋𝑑𝑡
access based on other counties’ land value.
In columns 1 and 2, market access is estimated to have an economically large and statistically significant impact on county land values. The elasticity between land values and market
access is 1.477 (column 1) or 1.441 (column 2), indicating that our preliminary results are
insensitive to the variable chosen to capture “size” of each county market. This similarity
𝐴
in results is reassuring, as neither proxy matches total demand in other counties (𝑋𝑑𝑡
in the
model).28
There is substantial variation in market access across the country, and the baseline estimates may be influenced by outliers: frontier counties whose market access and land values both increased dramatically. Column 3 weights the regression by 1870 county populations. Column 4 excludes areas West of Texas. The estimated elasticities decline to 1.2 and
1; unsurprisingly smaller if market access is more influential on land values in less-dense
agriculture-exporting counties.
Estimated coefficients of 1 – 1.5 imply a large elasticity between land values and market
access, but in the range one might expect using the approximation of our model discussed
above. A coefficient of 1.05 is predicted by plausible parameter values.29
27
As described below, we follow Donaldson (2010) and use a value of 𝜃𝐴 = 3.8 in constructing 𝑀 𝐴𝑜𝑡 .
The full model predicts a log-linear relationship between land values and market access. Consistent
with this theoretical relationship, when including a fourth-degree polynomial in the log of market access,
the coefficient on the linear term remains similar and the higher-order terms are(jointly)insignificant.
28
2−𝛼
under the model
The coefficient on market access in Equation (16) should satisfy 𝛽 = 1+𝛼𝜃
𝐴
approximation discussed above. The parameter 𝛼 is the land share in agricultural production, which Caselli
29
15
III.E
Economic Impact of Eliminating Railroads and/or Extending Canals
The estimated relationship between market access and land values can be used to estimate
the aggregate economic impact associated with counterfactual scenarios. In particular, we
consider two scenarios: (1) eliminating all railroads in 1890, and (2) replacing all railroads
in 1890 with Fogel’s proposed canal extensions.
First, consider the economic impact if no railroads existed in 1890. We calculate the resulting market access of each county in 1890, using computed county-to-county freight costs
in a spatial network that excludes all railroads. We find that market access falls by 63%, on
average, when all railroads are eliminated. By comparison, the average county increase in
market access from 1870 to 1890 was 50%; thus, the counterfactual fall in market access is
not substantially “out of sample.”30
County land value falls by 1.477 log points for every log point decline in market access
(from column 1), and the implied percent decline in land value is multiplied by county land
value in 1890. Summing over all counties, eliminating railroads in 1890 would have decreased
the total value of US agricultural land by 73% or $9.5 billion (1890 dollars). The implied
annual economic loss is $751 million (6.3% of GNP) using Fogel’s preferred rate of return.31
Second, consider the economic impact of replacing all railroads with an extended canal
network in 1890. Fogel (1964) proposes an extended canal network that may have largely
substituted for railroads; indeed, the distance buffer calculations suggested that extended
canals mostly substituted for railroads. Our market access methodology suggests that canal
extensions were not as effective, as county market access declines on average by 57%. Replacing railroads with extended canals generates an annual economic loss of $692 million
(5.8% of GNP). When the benefits of railroads are measured more accurately – reflecting the
wide dispersal of railroad lines and direct connections to markets – the 1890 railroad network
generates benefits that far exceed the potential gains from feasible improvements in canals.
III.F
Endogeneity of Railroad Construction and Changes in Market Access
The baseline analysis has developed the necessary theory and data to regress land values
on market access; further work can improve “identification” of this regression. Changes in
county market access may be correlated with unobserved determinants of county land values,
though the preliminary regressions do control for county and state-by-year fixed effects and
and Coleman (2001) take to be 0.19 based on Census data. The parameter 𝜃𝐴 is known as the “trade
elasticity.” Donaldson (2010) estimates a value of 3.8 in colonial Indian agriculture and Simonovska and
Waugh (2011) estimate a value of 4.5 in modern OECD manufacturing. Using 𝛼 = 0.19 and 𝜃𝐴 = 3.8 we
obtain 𝛽 = 1.05. Our specification is an approximation to the ideal model, but we find this encouraging.
30
Note also that the impact of market access was estimated to be approximately linear.
31
For an elasticity of 1.175 (from column 3), the annual economic loss is $679 million (5.7% of GNP).
The estimated economic loss declines less than linearly in the estimated elasticity.
16
year-specific polynomial controls for county latitude and longitude.
County market access depends on three terms: costs of trading with other counties
𝐴
𝐴 −𝜃𝐴
), and the competitiveness of other coun) ), market sizes in other counties (𝑋𝑑𝑡
((𝜏𝑜𝑑𝑡
∑
𝐴 −𝜃𝐴
) )). Plausible exogenous variation in any of these three terms can be
ties (( 𝑖∕=𝑑,𝑜 𝑋𝑖𝑡𝐴 (𝜏𝑑𝑖𝑡
used to identify the impact of market access on land values. Thus, the identification problem
has more potential solutions than a direct impact evaluation of railroad placement.
In the first term, county railroad construction is unlikely to be randomly assigned (conditional on the controls). One approach may be to replicate identification strategies by Atack
et al. to use some exogenous variation in railroad placement, or adapt methods used by
Baum-Snow (2007) and Michaels (2008). Alternatively, we can exploit the larger impact of
nationwide railroad expansion on counties with worse access to waterways; in practice, counties’ (non)access to waterways can instrument for improved market access from 1870 to 1890.
In the second term, changes in other counties’ population is less likely to be correlated
with a county’s own land values (conditional on the controls). We can instrument for changes
in market access with changes in other counties’ population; in particular, we can control
for changes in nearby counties’ population and define the instrument based on changes in
population for the county’s initial trading partners. Alternatively, we can exploit market size
variation in the impact of trade cost declines: among counties that experience large declines
in trade costs, some become able to trade with larger markets.
In the third term, changes in other counties’ access to other counties is less likely to be
correlated with a county’s own land values (conditional on the controls). These multilateral
competitiveness terms are determined by factors twice-removed from a county: if a county’s
trading partner gains access to other markets, it decreases that county’s effective market
access. We can instrument for changes in county market access with (inverse) changes in the
county’s trading partners’ access to other markets (not including itself). In practice, some
of these instrumental variables strategies will have stronger or weaker predictive power, but
comparing results across specifications should be informative.
As an initial empirical exercise, we focus on variation in county market access that is orthogonal to that particular county’s railroad access. Table 2, column 1 reports the baseline
estimate for comparison. Table 2, column 2, reports the estimated impact on land values
from a county having any railroad track, conditional on county fixed effects, state-by-year
fixed effects, and year-specific cubic polynomials in latitude and longitude. A county containing any railroad track is associated with a 0.359 log point increase in land values, though
county railroad construction may be endogenous and jointly determined with land values.
The estimated impact of market access is remarkably robust to controlling for whether a
county has any railroad track (Table 2, column 3). Once controlling for market access, the
17
estimated impact of any railroad track is no longer substantial or statistically significant.
These estimates are similar when also controlling for the length of railroad track by county
(Table 2, column 4).32 Thus, county market access is correlated with railroad track, but the
estimated impact of market access is robust to exploiting variation in market access that is
independent of that county’s railroad construction and associated endogeneity bias.
IV
Conclusion
(To Come)
32
The coefficient on market access is also robust to controlling for high-order polynomials of railroad
track length.
18
References
Anderson, S. P., A. de Palma, and J. Thisse (1992): Discrete Choice Theory of Product
Differentiation. MIT Press, Cambridge.
Anderson, J. and E. van Wincoop (2003): “Gravity with Gravitas: A Solution to the
Border Puzzle,” American Economic Review, 93(1), pp. 170-192.
Anderson, J. and E. van Wincoop (2004): “Trade Costs,” Journal of Economic Literature
XLII(3), pp. 691-751.
Arkolakis, C., A. Costinot, and A. Rodríguez-Clare (2011): “New Trade Models, Same
Old Gains?,” American Economic Review, forthcoming. http://econwww.mit.edu/files/6445.
Atack, J., F. Bateman, M. Haines and R.A. Margo (2010): “Did Railroads Induce or
Follow Economic Growth? Urbanization and Population Growth in the American
Midwest, 1850-1860,” Social Science History, 34(2), pp. 171-197.
Atack, J., M. Haines and R.A. Margo (2011): “Railroads and the Rise of the Factory:
Evidence for the United States, 1850-1870,” in P. Rhode, J. Rosenbloom, D.
Weiman, eds. Economic Evolution and Revolutions in Historical Time. Palo Alto,
CA: Stanford University Press.
Atack, J. and R.A. Margo (2010): "The Impact of Access to Rail Transportation on
Agricultural Improvement: The American Midwest as a Test Case, 1850-1860,"
BU Mimeo, March.
http://www.bu.edu/econ/files/2010/10/FinalDraftJTLUMarch-282010JournalVersion1.pdf
Axtell, R. (2001): “Zipf Distribution of U.S. Firm Sizes,” Science, 293(5536), pp. 18181820.
Baum-Snow, N. (2007), "Did Highways Cause Suburbanization?," The Quarterly
Journal of Economics, 122(2), pp. 775-805.
Caselli, F. and W. Coleman II (2001): “The U.S. Structural Transformation and Regional
Convergence: A Reinterpretation,” The Journal of Political Economy, 109(3),
pp.584-616.
Chaney, T. (2008): “Distorted Gravity: The Intensive and Extensive Margins of
International Trade,” American Economic Review, 98(4), pp. 1707-1721.
Combes, P-P., T. Mayer and J-F. Thisse (2008): Economic Geography: the Integration of
Regions and Nations, Princeton: Princeton University Press.
19
David, P. (1969): “Transport Innovation and Economic Growth: Professor Fogel on and
off the Rails,” The Economic History Review, 22(3), pp. 506-524.
Donaldson, D. (2010): “Railroads of the Raj: Estimating the Impact of Transportation
Infrastructure,” revisions requested at The American Economic Review.
http://econ-www.mit.edu/files/6038.
Eaton, J. and S. Kortum (2002): “Technology, Geography and Trade,” Econometrica,
70(5), pp. 1741-1779.
Fogel, R. W. (1964): Railroads and American Economic Growth: Essays in Economic
History, Johns Hopkins University Press, Baltimore.
Fogel, R. W. (1979): “Notes on the Social Saving Controversy,” Journal of Economic
History, 39, pp. 1-54.
Haines, M. (2005): Historical, Demographic, Economic, and Social Data: US, 17902000, ICPSR.
Haines, M. and R.A. Margo (2008): “Railroads and Local Economic Development: The
United States in the 1850s,” in J. Rosenbloom, ed., Quantitative Economic
History: The Good of Counting, pp. 78-99. London: Routledge.
Harris, C. (1954): “The Market as a Factor in the Localization of Industry in the United
States,” Annals of the Association of American Geographers, 64, pp. 315-348.
Herrendorf, B. J.A. Schmitz Jr. and A.Teixeira (2009): "Transportation and
Development: Insights from the U.S., 1840—1860," Federal Reserve Bank of
Minneapolis Research Department Staff Report 425.
http://minneapolisfed.org/research/sr/sr425.pdf
Hornbeck, R. (2010): “Barbed Wire: Property Rights and Agricultural Development,”
The Quarterly Journal of Economics, 125, pp. 767-810.
Melitz, M. (2003): “The Impact of Trade on Intra-Industry Reallocations and Aggregate
Industry Productivity,” Econometrica, 71, pp. 1695-1725.
Michaels, G. (2008): "The Effect of Trade on the Demand for Skill: Evidence from the
Interstate Highway System," The Review of Economics and Statistics, 90(4), pp.
683-701.
Redding, S. and D. Sturm (2007): “The Cost of Remoteness: Evidence from German
Division and Reunification,” American Economic Review, 98(5), pp. 1766-1797.
20
Simonovska, I. and M. Waugh (2011): “Elasticity of Trade: Estimates and Evidence,”
Unpublished Manuscript, UC Davis and NYU.
https://files.nyu.edu/mw134/public/uploads/56836/estimate_theta_paper.pdf.
Williamson, J.G. (1974): Late Nineteenth-Century American Development: a General
Equilibrium History, New York: Cambridge University Press.
21
Figure 1. Railroad Network in 1870 and 1890
A. Railroads in 1870
B. Railroads in 1890
Notes: Railroad lines are traced in GIS from georeferenced railroad Atlase images in 1870 and 1887.
22
Figure 2. Waterway Network
A. Navigable Rivers
B. Navigable Rivers and Canals
C. Navigable Rivers, Canals, Lakes, and Coastal Routes
D. All Navigable Waterways and Proposed Canal Extensions
Notes: Navigable rivers, canals, and proposed canal extensions are traced in GIS, as defined by Fogel (1964). Lakes and coasts are saturated with waterways.
23
Figure 3. Available Waterways and Railroads, by Year
A. Waterways and 1870 Railroads
B. Waterways and 1890 Railroads
Notes: Railroads are as defined in Figure 1. Waterways are as defined in Figure 2C.
24
Figure 4. 40-mile Buffers around Railroads, Waterways, and Proposed Canal Extensions
A. Railroads in 1890
B. Waterways
C. Railroads, overlaid with Waterways
D. Railroads, overlaid with Proposed Canals, then Waterways
Notes: Using GIS software, 40-mile buffers are drawn around railroads, waterways, and proposed canal extensions.
25
Figure 5. 10-mile Buffers around Railroads, Waterways, and Proposed Canal Extensions
A. Railroads in 1890
B. Waterways
C. Railroads, overlaid with Waterways
D. Railroads, overlaid with Proposed Canals, then Waterways
Notes: Using GIS software, 10-mile buffers are drawn around railroads, waterways, and proposed canal extensions.
26
Table 1. Estimated Elasticity between Market Access and Land Values
Log (Value of Agricultural Land and Buildings)
Dependent variable:
(1)
Log Market Access
(based on population)
(2)
(3)
(4)
1.477**
1.175**
0.956**
(0.254)
(0.158)
(0.320)
Log Market Access
1.441**
(based on land values)
(0.251)
Number of Counties
2,161
2,161
2,161
2,068
R-squared
0.587
0.587
0.538
0.633
Notes: Each column reports estimates from equation (16) in the text. For a balanced panel 2,161 counties in
1870 and 1890, the Log Value of Agricultural Land and Buildings is regressed on Log Market Access, as
defined in equation (15). All regressions include county fixed effects, state-by-year fixed effects, and yearspecific cubic polynomials in county latitude and longitude. Column 1 defines market access using county
population to proxy for other markets' demand. Column 2 defines market access using land values to proxy
for other markets' demand. Column 3 weights the regression by county population in 1870. Column 4 drops
counties West of Texas, Oklahoma, Kansas, Nebraksa, South Dakota, and North Dakota. Robust standard
errors clustered by state are reported in parentheses: ** denotes statistical significance at the 1% level, and *
at the 5% level.
27
Table 2. Market Access Elasticity: Robustness to Direct Controls for Railroads
Log (Value of Agricultural Land and Buildings)
Dependent variable:
(1)
Log Market Access
(based on population)
(2)
(3)
(4)
1.477**
1.443**
1.455**
(0.254)
(0.240)
(0.320)
0.359**
0.037
0.044
(0.116)
(0.098)
(0.092)
Any Railroad Track
Railroad Track Length (100km)
- 0.032
(0.070)
Number of Counties
2,161
2,161
2,161
2,161
R-squared
0.587
0.544
0.587
0.587
Notes: Column 1 reports the baseline specification from Table 1. All regressions include county fixed
effects, state-by-year fixed effects, and year-specific cubic polynomials in county latitude and longitude.
Column 2 reports the coefficient on a dummy variable for the county containing any railroad track. Column
3 reports the baseline specification from Table 1, controlling for a dummy variable for the county containing
any railroad track. Column 4 also controls for the length of county railroad track (in units of 100km).
Robust standard errors clustered by state are reported in parentheses: ** denotes statistical significance at the
1% level, and * at the 5% level.
28