Up from Poverty?

The 1832 Cherokee Land Lottery and
the Long-run Distribution of Wealth∗
Hoyt Bleakley†
Joseph Ferrie‡
First version: December 4, 2012
This version: September 30, 2013
Abstract
The state of Georgia allocated most of its land through lotteries, providing unusual opportunities
to assess the long-term impact of large shocks to wealth, as winning was uncorrelated with
individual characteristics and participation was nearly universal among the eligible population
of adult white male Georgians. We use one of these episodes to examine the idea that the lower
tail of the wealth distribution reflects in part a wealth-based poverty trap because of limited
access to capital. Using wealth measured in the 1850 Census manuscripts, we follow up on a
sample of men eligible to win in the 1832 Cherokee Land Lottery. We assess the impact of lottery
winning on the distribution of wealth 18 years after the fact. Winners are on average richer
(by an amount close to the median of 1850 wealth), but mainly due to a (net) shifting of mass
from the middle to the upper tail of the wealth distribution. The lower tail is largely unaffected.
This is inconsistent with the prediction of an asset-based poverty trap, but is consistent with
heterogeneity in characteristics associated with what wealth would have been absent treatment.
∗ Draft version. Comments welcome. The authors thank Chris Roudiez for managing some of the transcription,
and Lou Cain, Greg Clark, Bob Fogel, Matt Gentzkow, Tim Guinnane, Erik Hurst, Petra Moser, Kevin Murphy,
Paul Niehaus, Emily Oster, Jonathan Pritchett, Rachel Soloveichik, Chris Woodruff, and seminar participants at
the University of Chicago, Yale University, Tulane University, and the University of Pennsylvania (Wharton) for
helpful comments. We gratefully acknowledge funding support from the Stigler Center and the Center for Population
Economics, both at the University of Chicago.
† Booth School of Business, University of Chicago. Postal address: 5807 South Woodlawn Avenue, Chicago, Illinois
60637. Electronic mail: bleakley at uchicago dot edu.
‡ Department of Economics, Northwestern University. Postal address: 2001 Sheridan Road, Evanston, Illinois
60208. Electronic mail: ferrie at northwestern dot edu.
1 Introduction
Wealth disparities interest researchers and policymakers because of concern for the plight of the poor
and societal preferences for less inequality of outcomes. Although transfers to equalize outcomes can
dull the incentives for productive activity, transfers might improve efficiency if so-called poverty
traps prevent the poor from making even very high-return investments. In such cases, unequal
circumstances in the past can create inequality of opportunity in the future.
Yet we seldom observe transfers in settings where their long-term effect on the distribution of
wealth can be properly assessed. Analysis of the distributional effect of transfers depends both
on the measurement of wealth and on credibly exogenous variation in transfers, which are usually
an endogenous response to an individual’s misfortune. Also, a long follow-up period is necessary
if the short-run effects of transfers, which change the wealth distribution purely as a matter of
accounting, are to be distinguished from their more persistent effects. Such longer-run effects could
either amplify or attenuate the initial transfer, depending on the underlying causes of the initial
wealth distribution.
Consider a poverty trap that arises with a limited ability to borrow (thus entrepreneurs can
only make investments with their own wealth) and a fixed cost of production (thus entrepreneurs
with zero wealth cannot grow incrementally by investing retained profits). Those with low wealth
could get stuck in such a poverty trap, which imparts extra persistence to the path of inequality
(see Banerjee and Newman, 1993, and Buera and Shin, 2013, for example). An implication of
these models is that perturbations of the wealth distribution, particularly when pushing up from
poverty, should be highly persistent and may even carry a positive multiplier that amplifies the
initial shock. The relevance of such a poverty trap in understanding the wealth distribution is
a question of interest both in contemporary developing economies and in the historical evolution
of today’s developed nations.1 Perhaps paradoxically, a constant lump-sum grant to everyone
could actually compress the distribution of wealth (in levels) because the added wealth unlocks
high-return investments among those with low wealth at the outset. This thought experiment
of dumping wealth on individuals and then examining the later wealth distribution informs our
empirical analysis in the present study.
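A toy numerical version of this mechanism may help fix ideas. The sketch below is not the model of Banerjee and Newman (1993) or Buera and Shin (2013); the functional forms and every parameter (`wage`, `fixed_cost`, `productivity`, `carry`) are hypothetical choices made only for illustration. An agent who cannot borrow either works for a wage or, once own wealth covers the fixed cost, runs a project with diminishing returns.

```python
import math

def next_wealth(w, wage=1.0, fixed_cost=20.0, productivity=6.0, carry=0.8):
    """One period of a toy no-borrowing economy (all parameters hypothetical)."""
    if w >= fixed_cost:
        # Entrepreneur: can only invest own wealth, with diminishing returns.
        resources = productivity * math.sqrt(w)
    else:
        # Laborer: cannot cover the fixed cost, so keeps wealth and earns a wage.
        resources = w + wage
    # A fixed share of resources is carried into the next period.
    return carry * resources

def simulate(w0, periods=200):
    """Iterate the wealth rule from starting wealth w0."""
    w = w0
    for _ in range(periods):
        w = next_wealth(w)
    return w

low_resting_point = simulate(0.0)
small_grant = simulate(low_resting_point + 10.0)  # stays below the fixed cost
large_grant = simulate(low_resting_point + 16.0)  # just reaches the fixed cost

print(f"long-run wealth, no grant:    {low_resting_point:.2f}")
print(f"long-run wealth, small grant: {small_grant:.2f}")
print(f"long-run wealth, large grant: {large_grant:.2f}")
```

Under these assumed parameters, wealth below the fixed cost drifts back to a low resting point, while wealth at or above it settles at a higher one. A grant that lifts a poor agent past the threshold therefore has a long-run effect larger than the grant itself, whereas a smaller grant eventually dissipates. This is the amplification that the poverty-trap view predicts at the bottom of the distribution.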
We analyze a large-scale lottery to consider the effect of a random disbursement of wealth on
the wealth distribution in the long run. Participation was nearly universal, unlike in other studies
of lotteries, whose participants are a selected subset of the population. The prize in this lottery
1 See, for example, Carter and Barrett, 2006, and McKenzie and Woodruff, 2006, for empirical studies of asset-based poverty traps in developing countries. Fogel and Engerman, 1973, and Wright, 1979, discuss this issue as a
possible cause of wealth inequality in the antebellum Southern United States.
was a claim on a parcel of land. The average value of such parcels was large—comparable to the
median level of wealth at the time. Winning in the lottery was close to a pure wealth shock:
there were no strings attached to the land (such as a homesteading requirement) and the claim
could be liquidated immediately. In addition, we consider a historical episode, which allows us
to retrospectively examine the distributional effect in the long run, almost two decades after the
lottery took place.
Specifically, we investigate the aftermath of the 1832 Cherokee Land Lottery in the US state of
Georgia. In the early 19th century, Georgia opened almost three-quarters of its total land area to
white2 settlers in a series of lotteries. In the history of land opening, this was an unusual allocation
method, chosen in large measure for its sheer transparency in the wake of several tumultuous
corruption scandals in Georgia in the 1790s. We conduct a follow-up on these random wealth
shocks using a sample of over 14,000 men eligible to win land in that lottery. To ascertain the
long-term effect on the wealth distribution, we transcribe information on wealth from the 1850
Census manuscripts, measured 18 years after the lottery. The two measures of wealth available
in the 1850 Census are real-estate and slave holdings. From this sample of eligibles, we identify
winners using a list published by the state of Georgia (Smith, 1838). Those identified in the Smith
list comprise the treatment group, and the lottery eligibles that were not linked to the Smith list
serve as a control group. While, in theory, not all of the men in our sample of ‘eligibles’ were
technically eligible to win the lottery, our analysis in Section 4 suggests this was a minor subset in
practice. Further, in our sample, lottery losers look similar to lottery winners in a series of placebo
checks found in Sections 4 and 5.5.
As a point of departure, consider first the mechanical effect on the wealth distribution of randomly assigned wealth. If everyone receives (and holds on to) the same dollar amount, this simply
shifts the entire distribution of wealth, in levels, to the right by that same amount. It is nevertheless
common to treat the wealth distribution in natural logarithms, which would be strongly compressed
in such a circumstance. (A wealth shock of a given size represents a much larger fraction at the
lower tail of the distribution.) If instead there are much higher returns to capital at the low end,
as argued by some,3 then a constant-level disbursement would compress the distribution both in
levels and, to a greater degree, in logs. A complication, however, comes from the heterogeneity in
quality for the lotteried parcels. While this would increase the variance of the treatment wealth
distribution relative to control, it would still have the effect of draining mass out of the lower tail
2 Slaves and free people of color were excluded from the lottery, as were Native Americans. Indeed, while the present
study is focused on distributional changes for white men in Georgia eligible for the lottery, it bears mentioning that
the land was expropriated (i.e., redistributed) from the Cherokees, who were subsequently expelled from northwest
Georgia in a forced march known as the Trail of Tears.
3 We review related literature in Section 2.
by the random nature of the lottery, as long as the value of winning the lottery was positive. These
cases provide a point of comparison for the empirical results, discussed next.
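The mechanical benchmark can be checked with a few lines of arithmetic. In the sketch below the wealth levels and the size of the grant are hypothetical numbers chosen for illustration, and the grant is assumed to be held with no gain or loss.

```python
import math
import statistics

# Hypothetical control-group wealth levels in 1850 dollars (illustrative only).
control = [50.0, 200.0, 700.0, 2000.0, 10000.0]
grant = 700.0  # the same lump sum for everyone (assumed value)

treated = [w + grant for w in control]

def spread(values):
    """Population standard deviation, used as a simple dispersion measure."""
    return statistics.pstdev(values)

sd_levels_control = spread(control)
sd_levels_treated = spread(treated)
sd_logs_control = spread([math.log(w) for w in control])
sd_logs_treated = spread([math.log(w) for w in treated])

print(f"sd of levels: {sd_levels_control:.1f} -> {sd_levels_treated:.1f}")
print(f"sd of logs:   {sd_logs_control:.3f} -> {sd_logs_treated:.3f}")
```

Adding a constant leaves dispersion in levels unchanged but shrinks dispersion in logs, because the same dollar amount is a much larger proportional change at the bottom than at the top.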
Almost two decades after the lottery, winners were, on average, $700 richer4 than a comparable
population that did not win the lottery. The gains in wealth, however, are not evenly distributed
among the lottery winners. Indeed, the poorest third of lottery winners were essentially as poor
as the poorest third of lottery losers. Rather, the gains from lottery winning are almost entirely
seen as a (net) shifting of mass from the middle of the wealth distribution to the upper tail. The
lower tail is largely unaffected. Therefore this wealth shock tended to exacerbate inequality (at
least when considering the poor versus the rest) rather than reduce it. These results are found
in Section 5, where we compare the probability density functions (PDFs) and cumulative distribution
functions (CDFs) of control and treatment wealth distributions. Further, in Section 5, we use a
quantile-regression estimator to show that winning the lottery affects wealth mostly in the upper
half of the distribution.5 We also show that these results are robust to controlling for various factors,
including characteristics of the person’s name. The latter strengthens the earlier conclusions in that,
although we used the name to link to the list of winners, it did not appear to bias our estimate of
the treatment effect.
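The logic of comparing the two distributions quantile by quantile can be sketched as follows. This is not the quantile-regression estimator used in Section 5, only an unconditional comparison of quantiles, and both wealth lists are made-up data constructed to mimic the qualitative pattern described above.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def quantile_gaps(control, treated, qs):
    """Treated-minus-control difference at each requested quantile."""
    c, t = sorted(control), sorted(treated)
    return {q: quantile(t, q) - quantile(c, q) for q in qs}

# Hypothetical 1850 wealth, in dollars, for losers and winners (illustrative only).
control = [0, 0, 0, 100, 300, 600, 900, 1500, 3000, 8000]
treated = [0, 0, 0, 100, 400, 1200, 2000, 3000, 5000, 11000]

gaps = quantile_gaps(control, treated, qs=(0.10, 0.25, 0.50, 0.75, 0.90))
for q, gap in gaps.items():
    print(f"quantile {q:.2f}: gap = {gap:.0f}")
```

A gap of zero at low quantiles alongside large gaps at high quantiles is the pattern the text describes. As the rank-invariance footnote cautions, such gaps compare the two distributions point by point and need not equal the effect of treatment on any particular individual.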
Whether the wealth transfer actually caused an aggregate improvement for the treated depends
on one’s taste for equity.6 Various measures of inequality, such as the Gini coefficient or the
standard deviation of log wealth, are higher in the treatment group than in the controls. We use
a constant-elasticity-of-substitution aggregator, bootstrapped over both groups, to ask whether
the treatment distribution shows a statistically distinguishable improvement over the control group
under different preferences about the size of the pie versus how it is sliced. For very large elasticities
of substitution (and correspondingly low weights on equity), the treatment group has a significantly
higher aggregate outcome than the control group. But we cannot reject equality of outcomes for
4 Dollar figures reported in the study are in 1850 dollars unless otherwise specified. We suggest a few different
ways to contextualize this number. First, as stated above, this is approximately equal to the median of wealth in
our sample. If instead we convert this number to 2010 values using consumer prices, it is approximately $20,200. In
contrast, it would convert to $142,000 in 2010 if adjusted by the relative value of the unskilled wage. (This latter figure
translates to over ten years of earnings at the 2010 federal minimum wage of $7.25 per hour for a full-time/full-year
worker.) These conversion factors come from MeasuringWorth.com (Williamson, 2013).
5 Absent the property of rank invariance across the distributions of potential outcomes, we cannot literally interpret
these effects as the treatment effects at a given point in the control distribution, in that these results could have
arisen through more complicated patterns of reshuffling from control to treatment. For example, all of the would-have-been-poor could have become rich and an equal number of the would-have-been-rich could have become poor
as a consequence of treatment.
6 For the purposes of the present study, we set aside issues of broader efficiency. The efficiency loss associated with
the lottery could be twofold. First, by not selling the land at its market value, the state of Georgia was forgoing
revenues that then would have to be raised from more distortionary taxes. Second, opening the land through a lottery
was a peculiar form of “market design” that appeared to constrain land use well into the 20th century. This latter
issue is discussed in detail in Weiman (1991) and Bleakley and Ferrie (2013a).
elasticities of substitution much below one, a far cry from a Rawlsian elasticity of zero in which
social welfare depends exclusively on the outcome of the lowest-ranked individual.
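One way to make the aggregator concrete is sketched below. The functional form is a standard CES mean and is an assumption here, since the exact specification is not given in this section; the wealth lists, the $1 floor, the number of bootstrap draws, and the seed are likewise hypothetical. With elasticity of substitution sigma, the sketch uses rho = 1 - 1/sigma and computes (mean of w^rho)^(1/rho).

```python
import math
import random

def ces_aggregate(wealth, sigma):
    """CES aggregate of strictly positive wealth levels (assumed functional form).

    Large sigma approaches the arithmetic mean, sigma = 1 gives the geometric
    mean, and sigma near 0 approaches the minimum (the Rawlsian case).
    """
    n = len(wealth)
    rho = 1.0 - 1.0 / sigma
    if abs(rho) < 1e-12:  # limiting case sigma = 1
        return math.exp(sum(math.log(w) for w in wealth) / n)
    return (sum(w ** rho for w in wealth) / n) ** (1.0 / rho)

def bootstrap_gap(control, treated, sigma, draws=500, seed=0):
    """Percentile interval (2.5%, 97.5%) for the treated-minus-control gap."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(draws):
        c = rng.choices(control, k=len(control))  # resample with replacement
        t = rng.choices(treated, k=len(treated))
        gaps.append(ces_aggregate(t, sigma) - ces_aggregate(c, sigma))
    gaps.sort()
    return gaps[int(0.025 * draws)], gaps[int(0.975 * draws) - 1]

# Hypothetical wealth levels, floored at $1 so low-sigma aggregates are defined.
control = [1.0, 1.0, 1.0, 100.0, 300.0, 600.0, 900.0, 1500.0, 3000.0, 8000.0]
treated = [1.0, 1.0, 1.0, 100.0, 400.0, 1200.0, 2000.0, 3000.0, 5000.0, 11000.0]

for sigma in (0.5, 1.0, 5.0, 1e6):
    low, high = bootstrap_gap(control, treated, sigma)
    print(f"sigma={sigma:g}: gap interval [{low:.1f}, {high:.1f}]")
```

As sigma falls, the aggregate puts more weight on the poorest observations, so a treatment that leaves the lower tail unchanged contributes less and less to the measured gap.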
In Section 6, we ask why we fail to observe the footprint of an asset-based poverty trap—in
which the strongest effects of a positive wealth shock should come from the lower portion of the
wealth distribution. The possibility of negative selection into treatment should be ruled out by
the random nature of the lottery. Further, it is likely that there were indeed important fixed
costs and/or minimum-input requirements in this economy that could generate a poverty trap.
Subsistence constraints would have made it difficult to gradually improve land on the frontier, for
example. Nevertheless, the winnings from the lottery should have been more than enough to eject a
winning landless laborer well into the distribution of existing farms; the expected value of winning
was around $700, close to the peak of the bell curve of (log) asset holdings. But we fail to find
evidence of particularly strong returns from treatment at the low end, and in fact find quite the
opposite: an apparently complete dissipation of winnings among those in that range of the wealth
distribution.
Rather than invoking inflections of the production function and the asset-based poverty traps
they can create, an explanation that better fits the results includes heterogeneity (across the counterfactual wealth distribution) in the characteristics that permit one to hold onto wealth.7 This
heterogeneity could have taken numerous forms. These sources of heterogeneity might also have
resulted in a position low in the wealth distribution even in the absence of treatment. Heterogeneity
consistent with the absence of a large, positive effect of winning on wealth in the lower tail includes
a strong bias towards early consumption (either through very high discount rates or self-control
issues),8 a lack of skill needed to manage a complex venture like a farm, or some tendency to inefficient (read ‘reckless’) risk-taking.9 Sorting out which of these is the main source of heterogeneity is
beyond the scope of the paper (and no doubt each of these mechanisms is operative to some degree).
Nevertheless, the evidence is more consistent with the wealth dissipation in the lower tail coming
7 We say “hold onto” here because, across the 1850 wealth distribution, there was a roughly constant rate of
claiming land by lottery winners; therefore, those who ended up poor did not do so because they disproportionately
failed to collect their winnings at the outset. See Section 6.3.1.
8 One minor complication is that individuals, especially ones with high discount rates, may have begun to consume
down some of their winnings for lifecycle reasons. But note that, if someone who appears to be in the poverty trap
discounts the future heavily enough, there is no trap from his perspective; the very large returns are still not large
enough to justify delaying gratification. In any case, almost all of our sample is young enough to be in the ages
in which a typical person is still accumulating assets, presumably for later-life consumption or bequests. By 1850,
the men in our sample could have expected at least another 20 years of life. Further, the results below are
not sensitive to our accounting for differential fertility or for inter vivos transfers to their children. Nor were lottery
winners who wound up in the lower tail less likely to go out to the high-growth frontier (thereby avoiding the hard
work of land improvement). This analysis is seen in Section 6.
9 The risk would have to have been substantially above and beyond what we observe among those who got their
wealth non-randomly. We measure the degree of wealth churn (and risk of total loss) in Section 6.2.
from an inequality of skills, including what James Heckman and others call non-cognitive skills
such as the ability to delay gratification and/or avoid obviously bad decisions, and less consistent
with the lower tail emerging from an asset-based poverty trap.
Section 7 concludes the study.
2 Related Literature
The condition of the small entrepreneur is a topic that has received attention across a wide variety
of contexts and disciplines. We cannot hope to give a proper survey here, so instead in this
section we touch on a few relevant examples from various perspectives. For starters, recall that
Thomas Jefferson and his later intellectual disciples argued for policies that would encourage yeoman
farming (i.e., small-scale and owner-operated farms) rather than large estates or urban factories,
both employing landless laborers.10 This view gave rise to land policies in the 19th century US
that distributed small landholdings on the frontier at low, often below-market, prices.
While the intent of the policy was to establish the dominance of small farming, the extent to
which it did so may have been limited by other factors. Indeed, Gates (1996) argues the so-called
free-land11 policy was ineffective, because small-scale settlers were often capital constrained and
probably were outbid, outmaneuvered, or bought out by those he called “frontier estate builders”
(chapter 2 title, on page 23). Indeed, Atack (1988) shows that rates of landlessness among agricultural workers in the Midwest were at similar levels in 1860 (when ostensibly free land was still
available on the frontier) and in 1880 (when the frontier was closing). To some extent this is a
puzzle; giving free land in relatively small parcels to individuals will have the mechanical effect
of compressing the logarithmic land-wealth distribution in the short run. The question remains,
however, whether this compression will persist, or whether it will be unwound by some other feature
of the economic environment that makes it difficult for (some) farmers to operate at such a small
scale.
In the historical U.S. South, these issues of scale and inequality are particularly stark. The
South by the first half of the nineteenth century was characterized by a distribution of farm sizes
with less mass in the middle than in other farming regions of the U.S. There were plantations
oriented toward market production that covered in some cases thousands of acres and employed
10 See, for instance, Jefferson’s Notes on the State of Virginia, in response to “Query 19”, at
http://etext.virginia.edu/toc/modeng/public/JefVirg.html.
11 Calling the land “free” was perhaps more a political slogan than a statement about its price or its value. The
Homestead Acts, for example, effectively rationed small parcels to people willing to invest several years in improving
and farming them. It would be a mistake to assume that land obtained through this process was not of productive
value, even if labeled by politicians as “free.”
large numbers of slaves, as well as small family farms producing very small marketable surpluses and
oriented mainly toward meeting subsistence needs. But the middle of the distribution was thinner,
especially when compared with the Midwest, where the class of yeoman farmers was so prominent.
One explanation for this pattern comes from an older literature on antebellum Southern agriculture
(discussed by Fogel and Engerman, 1977, and Wright, 1979) which asserts that
large farming units had “privileged access to capital” (Wright, p. 63), which in turn prevented the
growth of small farms into intermediate-sized farms.12 Another explanation is advanced by Fogel
and Engerman (1977), who acknowledge the possibility that small farmers faced such constraints,
but instead emphasize the scale economies in employing slaves under the “gang system” (with
regimented passage of entire groups of slaves through fields in cultivating and harvesting).
More recently, de Soto (1987) and Khanna (2007) highlight the ubiquity of small-scale entrepreneurs in developing economies, often hidden in the informal sector. De Soto (2002) also
presents parallels between the antebellum US and developing economies today. While his central thesis is that economic development is (or was) held back by conflicting property rights, the
undergirding theme is that capital markets fail(ed) to direct resources to a large class of small
entrepreneurs (including small farmers), who could otherwise make productive use of such capital.
This line of thinking is related to the notion of a “poverty trap,” which appears in a wide
range of theoretical papers. There are many possible motivations for the existence of such traps;
perhaps the easiest one to think about is a simple fixed cost of production. An example from
this theoretical literature is by Banerjee and Newman (1993) who, using a model with a poverty
trap to analyze occupational choice (being a laborer versus a self-employed entrepreneur versus an
employer), demonstrate the possibility of multiple steady-states for the wealth distribution. In this
and related models, the wealth distribution is the state variable and can be highly persistent, even
to the point of path dependence.
But are such poverty traps empirically relevant for small entrepreneurs in developing economies?
In principle, one could take detailed measurements of the production function, although demonstrating the poverty trap requires precise evidence on the third derivative of the production function.13
An alternative approach would be to randomly disburse capital to entrepreneurs and attempt to
12 Evidence presented by Wright and Kunreuther (1975) on the crop mix choices of small Southern farmers is
consistent with those farmers facing a binding liquidity constraint—they produced a mix of corn and cotton more
in line with a desire to satisfy a subsistence constraint (imposed by an inability to borrow in the short run) than a
desire to maximize expected long-run income. It is a short leap from a liquidity constraint on year-to-year borrowing
to a capital constraint that effectively barred longer-run investments such as those needed to expand the farm’s
size. Indeed, according to Govan (1978, p.202), long-term credit in antebellum Georgia was very limited in that the
“chief function of the banks was to furnish credit for mercantile operations, and to supply a medium through which
payments could be made in distant places at a minimum of risk and expense.”
13 McKenzie and Woodruff (2006), for example, study nonconvexities induced by fixed costs.
measure how this changes the distribution of their outcomes. This is the approach of the present
study. Perhaps the most closely related work is by de Mel, McKenzie, and Woodruff (2008, 2012),
who examine the impact on profitability of randomly assigned capital grants to a sample of the self-employed in Sri Lanka. They find large effects on the profitability of microenterprises in the short
and medium runs. (In addition to the obvious difference in location and time period, we note that
their grants were approximately one month’s worth of unskilled wages rather than almost a decade
as was the case for winnings in the Cherokee Land Lottery.) More recently, Blattman, Fiala, and
Martinez (2013) follow up several years after an unconditional cash transfer (of approximately one
year’s wages) to find higher earnings and labor supply.
Risk is another central feature of the environment that an entrepreneur faces, perhaps to an
even greater degree at a small scale of operation. Banerjee and Duflo (2011) argue that there is
“so much risk in the everyday lives of the poor [...] that, somewhat paradoxically, events that are
perceived to be cataclysmic in rich countries often seem to barely register with them” (page 136).
They provide some illustrative anecdotes of such risk.14 Note that high returns can exist in the
short run perhaps as compensation for high risk that becomes more evident at longer horizons.
Thus supernormal returns in the short run are not necessarily an indication of a binding capital
constraint. Instead, it might indicate a failure of diversification, a distortion that can itself hold back
economic development, as in the model of Acemoglu and Zilibotti (1997). Relatedly, Rosenzweig
and Binswanger (1993) study how farmers in India change their crop mix if they face greater weather
risk and Karlan, Osei, Osei-Akoto, and Udry (2012) show how agricultural decisions change with
the provision of insurance for small farmers in Ghana. Wright (1979) argued that small antebellum
Southern farmers practiced “safety first” farming because their risk exposure was so great. Ransom
(2005) labelled the antebellum period as “the era of walk-away farming,” in which small farmers
could cope with bad shocks by simply abandoning their land (and presumably their debts as well).
In contrast, wealthier farmers were better equipped to self-insure and thus not be obliged to abandon
their wealth in response to a transitory negative shock.
Heterogeneity in returns might also arise for reasons that do not bring the specter of inefficiency.
Consider Schultz’s (1975) argument that ability or human capital helps one take advantage of new
opportunities. Indeed, a basic notion of economics is that factors of production should gravitate
to their highest valued use. If the experimentalist somehow manages to perturb the distribution
of factors away from the baseline, this logic suggests that we should expect a reduction in average
returns. It is likely that skill and wealth are complementary, and furthermore that at least some at
14 Richer detail, albeit from nonacademic sources, is presented by the journalist Boo (2012), who relates some of the
difficult shocks endured by several families in an informal settlement in Mumbai, and by Wilder (1971), who details
her own experience as a young mother on the 19th-century US frontier.
the bottom of the (treatment or counterfactual) wealth distribution were there precisely because
they lacked the ability to seize opportunities such as winning the lottery.
Finally, there is earlier work that also analyzes the wealth shock coming from lottery winnings.
Imbens, Rubin, and Sacerdote (2001) follow up on the consumption behavior of people who had
won large jackpots in state-run lotteries in Massachusetts. Hankins, Hoekstra, and Skiba (2010)
examine medium-sized jackpots in the Florida lottery and relate this to bankruptcy filings over
the following several years. Both of these studies are strongly related to the present one by using
lotteries to analyze wealth, although neither considers a developing-economy context and in neither
case is the sample size large enough to permit the distributional analysis that we conduct below.
Further, a perennial concern about examining the shock from gambling winnings is that one can
only analyze the effect on gamblers, who are typically a highly selected population. As we discuss
below, participation in the 1832 lottery was so widespread (at least, among white adult men resident
in Georgia circa 1830) that this selection issue is less important in our case.
3 The Cherokee Land Lottery of 1832
The state of Georgia is quite unusual in the U.S. in that much of the state’s territory was distributed
through a series of land lotteries. The initial Georgia colony was concentrated around the Savannah
River, and this land was distributed through a more traditional grant-based system. However, a
corruption scandal in the 1790s (the Yazoo Land Fraud) provoked such popular outrage that the
Georgia Legislature opted to use lotteries as methods of distributing land from then forward. The
first lottery took place in 1805 and the last ones were held in the early 1830s.
For this study we consider the 1832 lottery of Cherokee County in northwest Georgia. We choose
to focus on the 1832 lottery because the list of winners was available and the later date increases
the chance of tracking these people in census data. The land in this area was made available to
white settlers by the eviction of the Cherokee from that area.
Essentially every adult male residing in Georgia for the three years leading up to 1832 was
eligible for one draw in this lottery. Widows, orphans, and certain veterans were eligible for two
draws. (Because we would not know in the control group who was a widow, orphan, or veteran,
we exclude them from the treated group in our analysis. Practically speaking, this is of little
consequence because our sample excludes females and excludes years of birth that the veterans
or orphans would disproportionately populate.) A group of highwaymen called the “Pony Club”
that operated in old Cherokee County was also explicitly excluded from the lottery, but this group
was trivially small compared to the population of the state. In theory, winners in previous lottery
waves were excluded from participating, and there was also a 12.5¢ registration fee. It is not
immediately evident the extent to which either of these was enforced, but the numbers suggest that
neither was much of an impediment to participating. We do not know the exact population in late
1832 of white men meeting the requirements for age (18+) and residency (3+ years in Georgia),
but the 1830 Census reports the white male population of Georgia ages 15 and over in 1830 as
approximately 80,000. There were close to 15,000 winners (excluding widows and orphans) in the
1832 Land Lottery (Smith, 1838), which implies a winning rate of around 19%.15 Lists of the
eligible population were constructed by each county government and forwarded to the state capital
in Milledgeville.
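The participation and winning rates follow from simple ratios of the figures quoted in the text and in the accompanying footnote. The short check below uses only those figures; the variable names are ours, and the rounded inputs ("close to 15,000", "approximately 80,000") are taken at face value.

```python
# Figures quoted in the text and its footnote.
single_draw_registrants = 75_822   # inferred single-draw registrations
census_white_men_15_plus = 77_968  # 1830 Census, white men aged 15 and older
approx_winners = 15_000            # "close to 15,000" winners, excluding widows and orphans
approx_eligible = 80_000           # "approximately 80,000"

registration_share = single_draw_registrants / census_white_men_15_plus
winning_rate = approx_winners / approx_eligible
winning_rate_among_registrants = approx_winners / single_draw_registrants

print(f"share registered:           {registration_share:.1%}")
print(f"winning rate (text):        {winning_rate:.1%}")
print(f"winning rate (registrants): {winning_rate_among_registrants:.1%}")
```

Using registrants rather than the rounded population figure as the denominator gives a winning rate about one percentage point higher, so "around 19%" is best read as an approximation.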
Concurrent with this, the area known at the time as Cherokee County was divided into four
sections, which were further subdivided into dozens of districts. The districts were generally square,
except for those that were on the boundaries of the original Cherokee County, which were defined
by the state border to the north and west, and by the Chattahoochee River to the southeast.
Surveyors were sent to each district with the aim of further subdividing it into an 18×18 grid of
square parcels of 160 acres each.
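The survey geometry implies a few round numbers for an idealized full, square district. The check below assumes the standard conversion of 640 acres per square mile; boundary districts were irregular, so it does not apply to them.

```python
import math

ACRES_PER_SQUARE_MILE = 640  # standard US survey conversion

parcels_per_side = 18
acres_per_parcel = 160

parcels_per_district = parcels_per_side ** 2
acres_per_district = parcels_per_district * acres_per_parcel
square_miles_per_district = acres_per_district / ACRES_PER_SQUARE_MILE
district_side_miles = math.sqrt(square_miles_per_district)
parcel_side_miles = math.sqrt(acres_per_parcel / ACRES_PER_SQUARE_MILE)

print(f"parcels per district: {parcels_per_district}")
print(f"acres per district:   {acres_per_district}")
print(f"district side length: {district_side_miles} miles")
print(f"parcel side length:   {parcel_side_miles} miles")
```

Each square district therefore held 324 parcels, each half a mile on a side.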
After the surveys were completed and the lists of eligibles were collected, the lottery began. The
drawing proceeded as follows. One drum was filled with slips of paper containing the registration
information on each eligible person. Another drum was filled with slips of paper specifying a parcel.
Blank slips were added to the parcel barrel to equalize the number of pieces of paper in each barrel.
A slip of paper was drawn simultaneously from each barrel to determine who had won which parcel.
(Thus, lottery losers were those matched to a blank piece of paper.) This implies that winning
and losing were assigned randomly, and also that the specific parcel awarded to an individual, even
conditional on winning, was random.16 Over 18,000 parcels were assigned in this manner.
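The drawing procedure amounts to a random pairing of two equally sized stacks of slips. A minimal simulation, with hypothetical registrant and parcel labels and ignoring double draws, might look like this:

```python
import random

def draw_lottery(registrants, parcels, seed=None):
    """Pair every registrant slip with a parcel slip or a blank (None).

    Sketch only: assumes one slip per registrant and at least as many
    registrants as parcels, as in the description above.
    """
    if len(parcels) > len(registrants):
        raise ValueError("sketch assumes at least as many registrants as parcels")
    rng = random.Random(seed)
    names = list(registrants)
    # Blank slips pad the parcel drum to the size of the name drum.
    slips = list(parcels) + [None] * (len(names) - len(parcels))
    rng.shuffle(names)  # mix the name drum
    rng.shuffle(slips)  # mix the parcel drum
    # Drawing one slip from each drum at a time is a random pairing.
    return list(zip(names, slips))

# Hypothetical labels for illustration.
registrants = [f"registrant_{i}" for i in range(20)]
parcels = [f"parcel_{j}" for j in range(4)]

outcome = draw_lottery(registrants, parcels, seed=1)
winners = [(name, parcel) for name, parcel in outcome if parcel is not None]
print(f"{len(winners)} winners out of {len(outcome)} registrants")
```

Because both stacks are shuffled independently, whether a registrant wins and which parcel a winner receives are both random, which is the property the empirical design relies on.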
Very few requirements were imposed on the winners of the lottery. They were not required to
homestead the parcel for any amount of time. They were not even required to set foot on their
parcel. They simply had to register their claim with the state government and pay a nominal fee
15 Cadle (1991, page 278) reports that the total number of registrants was around 85,000, but does not give the
breakdown by single- versus double-draw categories. We use the distribution of single-draw and double-draw winners
in the Smith (1838) book to infer this breakdown, and compute that approximately 75,822 registered for the single
draws. The 1830 Census reports 77,968 white men aged 15 and older in Georgia in 1830. Comparison of these
numbers indicates that around 97.2% of this group in 1830 indeed registered for the lottery. The remaining 2.8% might
easily be explained by a combination of mortality or emigration between 1830 and 1832, in-migration to Georgia
between 1829 and summer 1830 (thus missing the full three years of the residency requirement), and that few of the
15-year-olds on June 1 of 1830 would have attained 18 years of age by the fall of 1832, when the drawing was held.
16
Weiman (1991) argues that the lottery’s outcomes appeared approximately random. Both barrels were rolled
around to ensure adequate mixing (or proper randomization, in today’s parlance). The blank slips of paper further
increased the transparency of the process; it was thus more difficult to increase your odds by excluding other names
from ever making it into the barrel. There were a few instances of corruption after the fact that were easily discovered
by virtue of the transparent nature of the lottery (Cadle, 1991).
($18). If they wished, they could immediately resell title to that parcel. Indeed, it is likely that
many of the winners took this route. One factor that made this sort of “flipping” attractive is that
it took six years before the state of Georgia could effectively exercise its jurisdiction over this land.
The Cherokee nation fought the eviction through the legal system, and the state of Georgia was
not able to evict the Cherokees until 1838. Information on the parcels as well as a list of winners
was circulated throughout the state and compiled into a single source by Smith (1838).
A rough measure of the value of a winning draw in the lottery can be obtained by calculating
the average value of a farm in the 10 counties of Northwest Georgia in 1850, when the U.S. Census
first provides the information necessary to make this calculation. These counties (Cass, Chattooga,
Cherokee, Dade, Floyd, Gilmer, Gordon, Murray, Union, and Walker) contained 1.289 million acres
of farmland (improved and unimproved); the 6,193 farms in these counties had a total cash value of
$8.566 million (1850 dollars), of which $357,000 was implements and machinery.17 Tostlebe (1957,
p. 179) suggests that improved land was three times as valuable as unimproved land in the humid
states (apart from the Great Plains, Iowa and Illinois where the mark-up was 1.5 owing to the lower
cost of clearing land in these states). If we use the 3-to-1 mark-up for 1850, improved acres were
worth $12.45 and unimproved acres were worth $4.15, so an unimproved 160-acre plot was worth
$664 in 1850. If winners improved their land at the average for these ten counties (27 percent), a
160-acre farm would have been valued at $1,048 in 1850.18
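As a rough check on the arithmetic, the sketch below values a plot from the rounded per-acre figures quoted in the text ($4.15 unimproved, with the 3-to-1 mark-up for improved land). The function name `plot_value` is hypothetical. With these rounded inputs, the 27-percent case comes out near $1,023, a little below the $1,048 reported, so the reported figure presumably rests on unrounded inputs or a slightly different improved share.

```python
def plot_value(acres, share_improved, unimproved_per_acre=4.15, markup=3.0):
    """Value of a plot given the share of its acreage that is improved.

    unimproved_per_acre and markup are the rounded figures quoted in the text.
    """
    improved_per_acre = markup * unimproved_per_acre  # about $12.45
    per_acre = (share_improved * improved_per_acre
                + (1 - share_improved) * unimproved_per_acre)
    return acres * per_acre

unimproved = plot_value(160, 0.0)      # fully unimproved plot
cherokee_era = plot_value(160, 0.013)  # 1.3 percent improved (footnote 18)
average_1850 = plot_value(160, 0.27)   # 27 percent improved
```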
Not all of the land in these counties was in farms in 1850, however: 61 percent of the 3.303
million total acres in these ten counties do not appear in the census agricultural schedules. Part of
this discrepancy results from non-farm land uses: pine forest that was not used as part of an active
farm, town lots, roads, bridges, ferries, and mills. The first of these – pine forest – accounts for the
vast majority of the unfarmed land. Another component of the 61 percent discrepancy between
total and farmed land is farms that are missing from the agricultural schedules because the farm
household was missed entirely by the census. Hacker (2013, Table 4) estimates that 6 percent of
white males born in the South were missed in the 1850 census (the number of adults missed will be
below this as the total is skewed by a 19.7 percent rate for ages 0-4). Finally, some farms would also have
been missed in the agricultural schedules if they were owned by farmers who resided outside the
county and were not present when the census marshal visited.19
17
Numbers for this calculation are reported in ICPSR study 2896 (Haines, 2010). Georgia did not move to an ad
valorem land tax until 1852 and few of the county tax digests are easily accessible until 1890, so it is not possible to
use actual assessment records to recover the value of land as reported by county tax collectors. The figures reported
here are based on the Agricultural Schedules of the 1850 U.S. Census of Population.
18
In 1832, the only improved land in these 10 counties was the 19,320 acres cultivated by the Cherokee (Wishart,
1995, Table 1, p. 125), or just 1.3% of the total land in farms by 1850. If we assume each farm was only 1.3%
improved (rather than 27% improved), a farm was worth $681 in 1850.
19
For example, in 1851, Christopher Chaney resided in Militia District 583, Appling County, Georgia, where he
The 1832 lottery exhaustively partitioned the area that was distributed, so the non-farmed areas
must be accounted for and assigned a value in estimating the value of a 160-acre plot. If we assume
that the 5 percent of farms that were missed entirely by the census were otherwise identical to
the farms included, and that another 5 percent of otherwise identical farms were held by residents
outside the ten-county area who farmed the land themselves, the fraction of non-farmed land falls
to 51 percent. But this is too high a fraction to which we should assign a zero value, for two
reasons: first, land in non-farm uses could have had a positive value (e.g. pine forest adjacent to
water or roads that could be used to transport timber to market); and second, the trend from 1850
to 1860 was for an increasing fraction of each county’s land to be farmed, suggesting that some land
counted as non-farmed in 1850 was in fact farmable but simply had not yet been occupied or had
not yet been incorporated into existing farms.20 This fraction rose roughly 20 percentage points in
the counties for which we have comparable data in 1850 and 1860. A conservative estimate of the
fraction of zero-value land in these ten counties is therefore no more than 30 percent.21
This allows us to estimate the expected 1850 value of a 160-acre plot (which will now comprise
112 acres of positive-value land) won in the 1832 lottery: $464 if completely unimproved and $716
if 27 percent improved. Using the GDP deflator and ignoring capital gains, these values correspond
to $375 and $579 in 1832. One measure of capital gains is the New York price for raw cotton,
which averaged 10.3¢ per pound 1831-33 and 10.7¢ per pound in 1849-51 (Historical Statistics of
the U.S., 2006, Series Cc222). Taking account of this small trend would further reduce the 1832
value of 160 acres only slightly.
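The chain of adjustments in the preceding paragraphs can be laid out step by step. The sketch below uses only figures quoted in the text; the deflator ratio is backed out from the reported $464 and $375 rather than taken from an independent source, and all variable names are hypothetical.

```python
ACRES = 160
UNIMPROVED = 4.15            # 1850 dollars per unimproved acre (from the text)
IMPROVED = 3 * UNIMPROVED    # 3-to-1 mark-up for improved land

non_farmed = 0.61            # share of total acres absent from the 1850 schedules
missed_by_census = 0.05      # farms assumed missed entirely by the census
absentee_owned = 0.05        # farms assumed held by non-resident owners
after_corrections = non_farmed - missed_by_census - absentee_owned  # 0.51

# The text then nets out roughly 20 percentage points of land that was
# farmable but not yet farmed, and rounds to "no more than 30 percent".
zero_value_share = 0.30

valued_acres = ACRES * (1 - zero_value_share)  # 112 acres
value_unimproved = valued_acres * UNIMPROVED
value_27_improved = valued_acres * (0.27 * IMPROVED + (1 - 0.27) * UNIMPROVED)

# Ratio of 1832 to 1850 prices implied by the reported $375 and $464.
implied_deflator = 375 / 464
```

Running this gives roughly $465 and $716 for the two 1850 values, in line with the figures reported above.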
An additional measure of the value of land comes from the neighboring counties of Carroll,
Coweta, Muscogee, and Troup, which were opened up in the lottery of 1827. Unlike in 1832,
fractional parcels (produced in large measure by surveying accidents) were withheld from the 1827
lottery and sold instead at auction. These auctions did not use reserve prices, and therefore the
full distribution of prices can be found in the auction records. Weiman (1991, page 845) reports
the mean land value per acre ($2.19) in the auction, which translates into $350.40 for a 160-acre
appears in that county’s tax records for that year. He is reported to own 160 acres of land in Cherokee County
(in Section 2, District 251, Number 5), in addition to his landholdings in Appling County (Ancestry.com, 2011). In
the 1850 agricultural schedules for Georgia, Chaney appears only in Appling County, and the acreage he reports
(490 acres) is equal to his holdings in Appling County only (Ancestry.com, 2010). The Cherokee County land would
have been missed entirely if Chaney was farming the land himself and had no operator who could report the farm’s
characteristics to the census marshal.
20
Bode and Ginter (1986, p. 59, Map 1) show higher ratios of farm acres to total acres for 1860 than we have
calculated for every county in the lottery zone. For example, we find that 42 percent of the total area of Chattooga
and Floyd Counties was in farms in 1850, but Bode and Ginter find 80 percent of the total area in these two counties
was in farms in 1860.
21
Banks (1905, p. 19) suggests that no more than 25 percent of the plots distributed in Georgia’s land lotteries
eventually went unclaimed and reverted to the state. Even the unclaimed parcels would have had some value, and
many were sold subsequently by state-run auctions.
plot circa 1827. If we adjust this for the 6.5% higher farm values in 1850 for Old Cherokee County
versus the four counties just considered, the estimated value of 160 acres rises to $373.18.22
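The auction-based figure is a two-step calculation, reproduced here from the numbers in the text (variable names are hypothetical):

```python
price_per_acre_1827 = 2.19     # mean auction price per acre (Weiman, 1991)
plot_1827 = 160 * price_per_acre_1827          # about $350.40
farm_value_premium = 0.065     # Old Cherokee County vs. the four 1827 counties
plot_adjusted = plot_1827 * (1 + farm_value_premium)
```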
If winners sold the plot before 1850 and bought land with a similar net present value (NPV),
we would expect the same. These effects might be attenuated, however: wealth could be held
in other forms, e.g., slaves (which we observe by linking to the 1850 Slave Schedule) or financial
assets (very rare, except for the wealthiest); or wealth could be consumed (in a variety of forms:
direct consumption goods or larger family sizes). Additionally, those who flipped the land quickly
may have received less than the land’s NPV because of uncertainty about the exact timing of
the expulsion of the Cherokee. There should have been little doubt about their eventual eviction,
however. The Indian Removal Act was passed in 1830, and had been applied several times already
in the region.
Roughly the bottom third of Cherokee County was distributed in 40-acre parcels as part of a
separate lottery (called the “Gold Lottery”). It was thought that this area was particularly rich in
gold deposits, an assumption that proved to be overly optimistic. (For this study, we examine only
winners in the Land Lottery section of old Cherokee County.)
4 Data
4.1 Sources and Construction
The present study follows up on the outcomes of lottery winners and losers. There are two principal
ingredients to this exercise. First, we need to identify who was eligible, and who won. Second, we
need to find these individuals in later, publicly available data sources, so as to follow up on their
outcomes. For the most part, we search for these individuals in the Census manuscripts of 1850
using a preliminary version of the full-count file for the 1850 Population Census from the IPUMS
project, indexed and scanned images of the 1850 manuscript pages from Ancestry.com, and an
index of the 1850 Slave Schedule on Ancestry.com.
The original source for the names of lottery winners in the 1832 Georgia land lottery is Smith
(1838). He lists, in numerical order, each parcel that was available and the associated lottery
winner, along with the winner’s county and minor civil division in 1832. Smith’s list was partially
transcribed and available on accessgenealogy.com, which we downloaded, cleaned, and compared
with a copy of Smith (1838) that we scanned and transcribed with an OCR program.
22
This value might itself be considered an underestimate in that a fractional parcel was probably below the optimal
farm size, and its use depended on combining it with a neighboring plot through an illiquid market. The auctions
themselves took place typically in the state capital and the participants appeared to be market makers and/or
consolidators (Weiman 1991).
In order to generate a control and treatment group for this lottery, we took advantage of the
lottery’s entry requirements: individuals had to be 18 years or older in 1832 and resident in Georgia
for at least three years by 1832. We extracted all males from the complete count file of the 1850
U.S. Census who met two criteria: (1) they had at least one child born in Georgia in the three
years prior to 1832; and (2) they had no children born outside of Georgia in those same years. This
yielded a population of 14,306 individuals. Of these, 1,758 were then identified in the list of lottery
winners based on their surname and given name. These individuals were then sought in the 1850
census manuscripts to transcribe their 1850 real-estate value,23 occupation, and literacy.24 The
complete count file directly provided the other outcomes we will explore below (county of residence
in 1850, marital status in 1850, and the number of children born between the 1832 lottery and
the 1850 census). Slave wealth was added by locating households in the 1850 U.S. Census Slave
Schedules. Together with data on slave prices by age and sex (taken from Kotlikoff, 1979, Table
II), this made it possible to impute a value of slave wealth to each household.25
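The eligibility screen described above can be expressed as a simple filter on the children observed in a household. The sketch below is illustrative only: the `Child` record and `likely_eligible` function are hypothetical, and the window endpoints (births in 1829 through 1832) are an assumption about how the three-year window is dated.

```python
from dataclasses import dataclass

@dataclass
class Child:
    birth_year: int   # e.g., 1850 minus reported age in the census
    birthplace: str   # state of birth as recorded in the census

# Assumed pre-lottery window: births in 1829, 1830, 1831, or 1832.
WINDOW = range(1829, 1833)

def likely_eligible(children):
    """Apply the two criteria: (1) at least one child born in Georgia in the
    window, and (2) no child born outside Georgia in the window."""
    in_window = [c for c in children if c.birth_year in WINDOW]
    return bool(in_window) and all(c.birthplace == "Georgia" for c in in_window)

# Toy households (hypothetical data).
household_a = [Child(1830, "Georgia"), Child(1835, "Alabama")]
household_b = [Child(1830, "Georgia"), Child(1831, "South Carolina")]
household_c = [Child(1825, "Georgia")]
```

Household A passes the screen (the later birth outside Georgia falls after the window), while B and C do not.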
An initial concern regarding our sample design is that individual lottery winners needed to
survive to 1850 in order to be at risk to be linked from the lottery to the 1850 Census. Given the
age structure of the Georgia population, and new life tables produced by Hacker (2010) for the
early nineteenth century U.S., we estimate that over 60% of the males eligible to participate in the
1832 lottery would have survived to 1850. Further, Steckel (1988) finds essentially no relationship
between real estate wealth and survival probabilities 1850-60, so we argue that lottery winners are
no more likely to be found in 1850 than non-winners.
An additional concern is that our reliance on the observed household structure in 1850 to impute
lottery eligibility (i.e., the presence of at least one child born in Georgia 1829-32 and the absence
of any children born outside Georgia in that window) imparts a bias by focusing our attention on
homes where fewer children had left home by 1850 (and were thus present with their fathers and
available for us to examine their birthplace and year of birth). Steckel (1996) reports that only 11%
of children in the antebellum South departed their parental home by the age of 18, so this, too,
is unlikely to contaminate our sample. Nevertheless, children born in 1829-32 must have survived
to 1850 to be at risk to be observed, whether within or outside their parental home. Again, the
aforementioned lack of a wealth effect on survival should prevent mortality from contaminating the
23
Steckel (1994) compares taxable wealth (county records) and census-reported wealth in a sample of individuals
in Massachusetts and Ohio located in both sources. There are some discrepancies (more in Ohio than in Mass.), but
no association between the size of the discrepancies and any observable characteristics apart from gender.
24
These variables were entered two or three times, and any discrepancies were then reconciled by a different transcriber.
25
One complication with the Slave Schedule is that slaves were listed with the household/farm where they resided.
If an absentee owner were not listed in the household on the Population Schedule, then the slave wealth would be
attached to the wrong person. Note that this measurement issue is almost certainly limited to the upper tail of the
distribution (e.g., the absentee planter who resides in Charleston rather than on his plantation).
sample of treated versus control households.
We nonetheless take seriously the possibility that differential survival and differential rates of
children leaving home by wealth could leave questions as to the extent to which our findings are
driven by the wealth differences upon which we focus rather than peculiarities of the sample design.
To alleviate these concerns, we perform a series of balancing tests comparing the pre-treatment
characteristics of lottery winners and non-winners in Section 4.2.
4.2 Summary Statistics and Balancing Tests
We present summary statistics for the sample in Table 1. Each row presents a different variable,
and variables are grouped thematically into panels. Means and standard deviations (in parentheses)
are shown for each variable. These values for the whole sample are seen in Column 1, and then we
provide decompositions based on each individual’s likely lottery status in Columns 2 and 3, which
report the summary statistics for lottery losers and winners, respectively. Additionally, in Column
4, we report the p-value of a test of the difference in means between these two subsamples. We
implement this test with a bivariate regression on a dummy variable for being a lottery winner. In
the cases below in which there is a grouped-data structure, such as the household or surname level,
we cluster the standard errors. The number in square brackets in each row reports the sample size
used to compute this test statistic.
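The test of means described above can be sketched as follows. This is a minimal illustration, not the exact procedure behind Table 1: it uses a cluster-robust variance without any small-sample correction and a normal approximation for the p-value, and the function name and toy data are hypothetical.

```python
import math
from collections import defaultdict

def balance_test(y, winner, cluster):
    """Bivariate OLS of y on a winner dummy, with cluster-robust inference.

    Returns (slope, standard error, two-sided p-value)."""
    n = len(y)
    xbar, ybar = sum(winner) / n, sum(y) / n
    x_dev = [w - xbar for w in winner]
    sxx = sum(v * v for v in x_dev)
    slope = sum(v * (yi - ybar) for v, yi in zip(x_dev, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * w for yi, w in zip(y, winner)]
    # Sum the score (demeaned regressor times residual) within each cluster.
    score = defaultdict(float)
    for v, u, g in zip(x_dev, resid, cluster):
        score[g] += v * u
    se = math.sqrt(sum(s * s for s in score.values())) / sxx
    p_value = math.erfc(abs(slope / se) / math.sqrt(2))  # normal approximation
    return slope, se, p_value

# Toy data: age in 1850, winner dummy, and surname code as the cluster.
age = [50, 48, 53, 55, 47, 51]
won = [1, 0, 0, 1, 0, 0]
surname = ["B424", "B424", "F600", "F600", "S530", "S530"]
slope, se, p_value = balance_test(age, won, surname)
```

With a binary regressor, the slope is exactly the difference in means between winners and losers, which is why the regression and the comparison of means are interchangeable here.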
In the present study, we consider two measures of whether the person won land in the drawing
for the Cherokee Land Lottery of 1832. Summary statistics for these variables are found in Panel
A of Table 1. The first measure is coded to one if that person is a unique match to a name
found on the list of winners published by Smith (1838). Anyone else is coded to zero, including
individuals who were among several persons matched to the same winner’s name. As is seen in
the table, 12.4% of our sample is matched to the list of lottery winners. By construction, this
variable takes on means of zero and one in Columns 2 and 3. In the second measure, we attempt
to accommodate the relatively small fraction of individuals that tie for a match to the Smith list
with others in our sample. In the case of a tie among n observations, we recode the match variable
to 1/n. This recoding of the variable is motivated by the belief that one member of the tying set
did in fact win in the lottery, but we do not know which and thus distribute the probability of
winning evenly across the group as if we had a uniform prior. More sophisticated (i.e., nonuniform)
versions of assigning partial treatment values within such groups are possible, but we shied away
from this approach because of the lack of appropriate benchmark data with which to calibrate such
an approach. The average value for this variable is 15.5% in our sample, which is approximately 3 percentage points
higher than the binary match variable and just slightly below the rates discussed above. The vast
majority of differences occur because numerous groups of small-n ties were recoded from zero up to
1/n. These two lottery-status variables are extremely highly correlated: the regression coefficient
of the second measure on the first has a t statistic of 329.
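The two lottery-status measures can be illustrated with a short sketch. It assumes matching on an exact full-name string and that each name appears at most once on the winners' list; both are simplifications, and the function name and toy names are hypothetical.

```python
from collections import defaultdict

def lottery_status(sample_names, winner_names):
    """Return (binary, fractional) match variables for each sample member.

    binary: 1 only for a unique match to a winner's name, 0 otherwise.
    fractional: 1/n for each of n sample members tied for the same winner."""
    winners = set(winner_names)
    groups = defaultdict(list)
    for idx, name in enumerate(sample_names):
        if name in winners:
            groups[name].append(idx)
    binary = [0] * len(sample_names)
    fractional = [0.0] * len(sample_names)
    for idxs in groups.values():
        n = len(idxs)
        for i in idxs:
            binary[i] = 1 if n == 1 else 0
            fractional[i] = 1.0 / n
    return binary, fractional

sample = ["John Smith", "John Smith", "Asa Bell", "Eli Cook"]
binary, fractional = lottery_status(sample, ["John Smith", "Asa Bell"])
```

In the toy example the two men named John Smith are coded 0 on the binary measure but 0.5 each on the fractional one, so the fractional measure has the higher mean.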
Next, we consider in Panel B of Table 1 a series of outcomes that were determined prior to the
realization of the lottery, and therefore should be unaffected by whether the individual won land in
the 1832 lottery. Analysis of these outcomes therefore serves as a balancing test when comparing
the control and treatment samples. The lottery-eligible men in the sample are approximately 51
years old in 1850, and average age is similar between winners and losers. Almost 50% of the sample
was born in Georgia, with the bulk of the remainder being born in the Carolinas. These fractions
are statistically similar across groups. By the construction of the sample, these individuals have
at least one child born in Georgia in the three years prior to the 1832 lottery. But there is no
reason why lottery status should correlate to the number of children born in this earlier period, if
our sampling design has drawn an appropriately matched treatment and control group. Indeed, we
do find that the sample has approximately 1.33 children born in the pre-lottery window, and this
number is quite similar between the two subsamples.
The next variable that we consider is whether the individual could read and write. While this
variable is measured in 1850 and could theoretically be affected by the lottery some 18 years prior,
literacy was more likely realized in childhood. Whether or not they had won the lottery, these men would
be unlikely to undertake remedial education in literacy given that they were already adults in 1832
and had on their shoulders the demands of supporting a family in a largely agrarian society. By
this measure, almost 15% of our sample was illiterate, with insignificant differences between the
control and treatment groups. (Note that this was probably a fairly weak test of literacy in that
many enumerators classified someone as literate if they could read and write their name. Rates of
illiteracy would be considerably higher if a more modern standard of literacy were applied.)
In the rest of Panel B, we examine characteristics based on the individual’s surname, which
was inherited from the father at birth and therefore predates the lottery. Although there was probably
very little phonetic change in the surname over the life course (or even across generations), the low
rates of literacy and somewhat lax orthography of the time might have occasioned some drift in
how the surname was spelled. For example, in the census manuscripts the surname “Blakely” has
variants “Blakeley,” “Bleakley,” “Blakelee,” and others, and “Ferry” has the variant “Ferrie.”
To accommodate this heterogeneity in spelling, we use the Soundex version of the name, which
reclassifies names that are phonetically similar into a single code. The first surname-based outcome
that we consider is the number of letters in that name (and for this outcome alone we use the original
surname rather than the Soundex version). On average, surnames have 6.2 characters, and this
average is indeed slightly lower in the subsample of lottery winners. Next we find that the average
person has a surname that appears 36 times in the sample, and this is not significantly different
between subsamples. We also find that 10% of the sample has a surname that begins with the
letter ‘M’ or ‘O’ (correlated with Celtic origin), and this rate is insignificantly different between
the group of winners and losers. Indeed, for a cross tabulation of lottery status and the first letter
of last name, a chi-squared test (d.f.=26) of the equality of distribution across groups has a value
of 20 (p=0.8).
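To make the Soundex recoding mentioned above concrete, the following is a minimal sketch of the standard American Soundex rules (keep the first letter, code the remaining consonants, collapse adjacent identical codes, and pad to four characters). It is not necessarily the exact variant used to build the sample.

```python
# Consonant classes of American Soundex; vowels, H, W, and Y carry no code.
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}

def soundex(name):
    """Return the four-character Soundex code of a surname."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]

variants = ["Blakely", "Blakeley", "Bleakley", "Blakelee"]
codes = {soundex(v) for v in variants}
```

All four spelling variants of “Blakely” collapse to the single code B424, and “Ferry” and “Ferrie” both map to F600.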
The final set of surname-related outcomes that we present in Panel B are constructed from
the average characteristics of others in Georgia with the same surname. We restrict ourselves to
Georgia in part to maintain similarity with our sample and also because we had access to a full
transcription of the 1850 census for the counties in Georgia starting with the letters A-J that was
provided to us by the IPUMS project. We took this transcription file and formed averages by
surname (again using the Soundex recoding of surname) for various outcomes. To prevent any
mechanical contamination from our lottery-eligible sample, we exclude anyone in our sample from
the construction of the surname-level averages. The mean surname-average of real estate wealth for
our sample (again, not their real estate wealth but the average wealth of those people with the same
surname) is approximately $1200. Because wealth is right-skewed, the mean presents a somewhat
misleading picture, and accordingly we find the median wealth among individuals with the same
surname is considerably lower: less than $300. The surname-level illiteracy rate is almost 22%.
None of these surname-level outcomes show a statistically significant difference when comparing
the lottery winners versus losers. (Some readers might argue that this is a weak test because
perhaps the surname-level averages are measured with considerable noise. Nevertheless, we show
below in Section 6.2 that the surname averages are strong predictors of individual-level behavior,
even when conditioning on demographic and locational covariates. We also test for interactions of
winning the lottery with these surname averages below as a test of heterogeneity in the response
to wealth shocks.)
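The surname-level averages described above are leave-out statistics: they are computed over other Georgians with the same Soundex code, excluding anyone in the lottery-eligible sample. A minimal sketch, with a hypothetical record layout and toy values:

```python
from collections import defaultdict
from statistics import mean, median

def surname_averages(records, sample_ids):
    """Mean and median wealth by surname code, excluding sample members.

    records: iterable of (person_id, soundex_code, real_estate_wealth)."""
    by_code = defaultdict(list)
    for person_id, code, wealth in records:
        if person_id not in sample_ids:  # avoid mechanical contamination
            by_code[code].append(wealth)
    return {code: {"mean": mean(vals), "median": median(vals)}
            for code, vals in by_code.items()}

# Toy data (hypothetical): person 4 is in the lottery-eligible sample.
records = [
    (1, "B424", 0), (2, "B424", 300), (3, "B424", 3300),
    (4, "B424", 9999), (5, "F600", 400),
]
averages = surname_averages(records, sample_ids={4})
```

In the toy data the B424 mean is well above the B424 median, which illustrates why a right-skewed wealth distribution makes the two summaries diverge.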
In Panel C, we present summary statistics for measures of wealth in 1850. Note that this panel
and the rest of the table can no longer be considered part of a balancing test in that we examine
outcomes that might very well be affected by winning the lottery. For this panel, the numbers in
curly brackets display the 25th, 50th, and 75th percentiles, respectively. The first measure that we
consider is real estate wealth. The whole-sample mean is approximately $2000 and the median is
$650. Unlike many of the outcomes above, here the mean difference by lottery status is significant
at the α = 10% level. Real-estate wealth also shows differences at the median, although not in the
upper or lower tails. Next we consider statistics for slave wealth, which had a mean of approximately
$1340, and a statistically significant difference in means by lottery status. The final row of Panel
C displays the sum of these two wealth components, which we label “total wealth” throughout
the paper.26 This variable, whose mean is over $3000, shows a several-hundred-dollar difference
between control and treatment groups, which is both economically and statistically significant.
The mean difference in total wealth that we observe between lottery winners and losers is close in
magnitude to our earlier back-of-the-envelope estimate of the value of the land won in the lottery.
The median and 75th percentile are higher in the treatment group than in the control group, but the 25th percentile is
the same. Further, a Kolmogorov-Smirnov test rejects equality of the control and treatment wealth
distributions at the α = 5% level.
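For readers unfamiliar with the Kolmogorov-Smirnov test, its statistic is the largest vertical gap between the two empirical CDFs. The sketch below computes only that statistic (not the p-value) on toy wealth values that are purely hypothetical.

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)

    def ecdf(sorted_vals, x):
        # Share of observations less than or equal to x.
        return bisect.bisect_right(sorted_vals, x) / len(sorted_vals)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Toy wealth values: identical lower tails, shifted upper tails.
control = [0, 0, 500, 800, 1200]
treated = [0, 0, 900, 1500, 4000]
d_stat = ks_statistic(control, treated)
```

In this toy example the two distributions coincide at the bottom and the gap opens only higher up, which is the qualitative pattern the text describes.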
Finally, the vast majority of the sample still lived in Georgia in 1850, and the bulk of the
remainder resided in Alabama. (Appendix Figure 1 displays the geographical distribution of our
sample by county in 1850.) Nevertheless, we do not see significant differences across treated versus
control subsamples in the propensity to be in either of these states. However, a chi-squared test
overwhelmingly rejects the equality of the distribution of the subsamples across counties. One main
aspect of this difference is the increased propensity of lottery winners to be in a county whose land
was opened up by the 1832 Cherokee Land Lottery.
5 Estimated Change in the Wealth Distribution
In this section, we characterize the difference in the control versus treatment distributions of 1850
wealth using a variety of estimators. In Section 5.1 we define a simple regression equation that
forms the basis of our empirical analysis. In Section 5.2, we show that the treatment group of lottery
winners had, almost two decades after the lottery, higher mean wealth than the control group of
lottery non-winners. This result is robust to a variety of controls derived from the characteristics
of surnames and given names. However, results from quantile regressions show that the effect of
the lottery on the treatment group is concentrated in the upper part of the wealth distribution.
Then, in Section 5.3, we present estimates of the PDF for control and treatment groups, as well
as estimates of the difference in the CDF (∆CDF) between the two groups. Again, we show that
the treatment associated with lottery winnings perturbs the distribution of wealth primarily in the
upper half of the distribution. Using both the quantile and ∆CDF estimators, we find very little
effect of treatment on the lower 40% of the wealth distribution. (To be clear, we are thinking of the
distribution itself as an object that is being treated. None of our results in this section is meant
26
Plainly, this is not a global total; there are other components of wealth that we cannot measure, such as non-slave
personal property (which was only reported in the 1860 and 1870 censuses) and the individual’s human-capital wealth.
Below we show that these results are not sensitive to using occupation to impute physical capital or to accounting
for investments in children.
to imply anything about the mapping from control to treatment, in the sense of characterizing
the precise relationship between potential outcomes.) Next, in Section 5.4, we evaluate the gains
from treatment (relative to control) under various preferences for distributional equity. Finally, in
Section 5.5, we conduct a placebo exercise using a sample defined by having children born within
the pre-lottery window, but within South Carolina instead of Georgia. Matches to the Smith list in
this case are entirely spurious, and, accordingly, this placebo variable does not predict differences
in wealth between the control and treatment groups.
5.1 Estimation strategy
The basic research design of the study is to compare the long-run outcomes of winners and losers
among participants in the 1832 Cherokee Land Lottery. Above we discussed how we assign lottery
status (winning vs. losing) in a sample of men who, by their characteristics, were eligible to
participate in the lottery. With such a sample, estimating the treatment effect of winning the
lottery is as simple as a comparison of means across the subsamples of winners and losers or,
equivalently, a bivariate regression with the outcome on the left-hand side and a dummy variable
for winning the lottery on the right-hand side. Throughout the present study, we opt for the
regression approach, which is able to accommodate additional control variables on the right-hand
side as well as the 1/n measure of lottery status, which is not dichotomous. At some level, the
random nature of the lottery should obviate the need for control variables as fixes for omitted-variable
problems. Nevertheless, controls might be useful to absorb some of the residual variation
and perhaps improve the precision of the treatment estimates. Further, the methods that we use
for tracking the lottery-eligible sample and imputing lottery status might introduce biases that
control variables could clean up. (The fact that lottery status is not predictive of predetermined
variables, as seen in Section 4.2 and Table 1, casts doubt on this supposition, but we can never rule
it out entirely. We return to this issue in Section 5.5 with an alternative placebo test.)
The basic regression equation, which we generally estimate using OLS, is as follows:
$$Y_{ik} = \gamma T_i + B X_{ik} + \delta_a + \delta_k + \varepsilon_{ik} \qquad (1)$$
in which $i$, $a$, and $k$ index individuals, ages, and 1850 counties of residence. The variable of
interest, $T_i$, is a binary variable that denotes treatment (meaning winning the lottery), and the
control variables are as follows: $\delta_a$ is a set of dummies for age; $\delta_k$ is a set of dummies for location
(county×state $k$), which we include to account for differences in settlement patterns in the control
and treatment groups; and $X_{ik}$ is a vector of other control variables, as specified below. The random
assignment of treatment by the lottery allows us to recover an unbiased estimate of γ.
A principal alternative specification used below also incorporates characteristics of the surname
(last name). The main variant of the specification includes fixed effects at the surname level.
The specification controls for a host of differences that might persist across patrilines. One way
of thinking about the specification is measuring the impact of lottery winning within extended
families (again, defined patrilineally). Recent work by Clark and Cummins (2012) and Güell et al.
(2012) highlights the persistence in outcomes across patrilines, and this effect would be absorbed
by surname fixed effects. Furthermore, specification problems that are introduced by our use of
surname in constructing the lottery variables would also be absorbed by these fixed effects. (As we
discussed above, we use the Soundex version of names to account for minor spelling differences.)
Note that this is a stronger test to pass in that we effectively ignore individuals whose surnames
are unusual enough that the sample does not contain both a winner and loser with that surname.
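The remark that surname fixed effects effectively ignore surnames lacking both a winner and a loser can be seen from the within transformation. The sketch below is a simplified illustration that omits the age and county dummies and the other controls in equation (1); the function name and toy numbers are hypothetical.

```python
from collections import defaultdict

def within_estimate(y, treat, group):
    """Treatment coefficient from OLS with one set of group fixed effects,
    computed by demeaning y and treat within each group."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, ti, g in zip(y, treat, group):
        s = sums[g]
        s[0] += yi
        s[1] += ti
        s[2] += 1
    num = den = 0.0
    for yi, ti, g in zip(y, treat, group):
        sum_y, sum_t, n = sums[g]
        t_dev = ti - sum_t / n  # zero for everyone in a group with no
        y_dev = yi - sum_y / n  # variation in treatment status
        num += t_dev * y_dev
        den += t_dev * t_dev
    if den == 0:
        raise ValueError("no within-group variation in treatment")
    return num / den

# Toy data: surname A has a winner and a loser; B has only losers;
# C is a single winner with an unusual surname.
wealth = [1500, 900, 100, 5000, 99999]
won = [1, 0, 0, 0, 1]
surname = ["A", "A", "B", "B", "C"]
gamma_all = within_estimate(wealth, won, surname)
gamma_a_only = within_estimate(wealth[:2], won[:2], surname[:2])
```

The estimate is identical whether or not surnames B and C are included, because their demeaned treatment values are all zero.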
5.2 Baseline regression results
We estimate a large effect on 1850 wealth from having won the lottery almost two decades earlier.
Table 2 presents the estimates of equation (1) with total wealth (the sum of real estate and slave
wealth) as the dependent variable, and results are shown for both levels and natural logs. The
baseline estimates are found in Column 1. On average, lottery winners have approximately $750
or 14% more wealth in 1850. This number is similar in magnitude to the unconditional difference
seen in Table 1. It is also similar to, perhaps a bit smaller than, the back-of-the-envelope estimate
of the value of land won in 1832. It is possible that the winnings were partially spent or saved
in some other kind of wealth, although there was a relatively limited set of assets that could be
used to store value in the rural Deep South at this time, and we are measuring two of the most
important components (land and slaves).
In any event, the baseline estimates suggest substantial persistence. The remaining columns of
Table 2 report specifications that use different sets of fixed effects as controls. In Columns 2-4, we
control for characteristics of the surname: the first letter, the number of letters, and the frequency of
that surname in our sample. These estimates are within a third of a standard error of the baseline.
In Column 5, we report specifications that include a full set of dummies for each surname (using
the Soundex concept, as discussed above). Estimates drop by about half the standard error in this
case, but we still estimate that lottery winners were almost $600 richer 18 years after the lottery.
In Column 6, we control for a full set of dummies for given (first) names rather than surnames,
and we see that the estimates instead rise by about half a standard error relative to the baseline.
Finally, in Column 7, we include fixed effects both for given name and for surname. (Note that
these are two sets of fixed effects; fixed effects for each given-name-x-surname cell would absorb the
lottery-status variable, which uses the full name for linkage to the Smith list.) These estimates are
a bit below the baseline, but a bit above the estimates that we obtain when controlling for surname
alone.
Table 3 continues the analysis of lottery status and wealth by presenting specifications with
alternative ways of constructing the wealth variable. Panel A presents results for total wealth in
levels or logs. Estimates from the baseline specification are repeated here for reference. Also in this
panel, we attempt to adjust this variable for the truncation of the lower tail. Specifically, census
enumerators were instructed to leave real estate wealth blank if the value was under $100. It is
common in studies of variables that are censored or truncated like this to impute a value of zero
in levels and in logs (=ln($1)). In the previous analysis, we assume the blanks were zeros in levels
and missing values in logs. It is difficult to check these assumptions, but they seem ad hoc. An
inspection of the distribution of real estate wealth reveals that the truncation at $100 is important:
there is a nontrivial amount of density at and just above $100. Furthermore, the distribution looks
approximately log-normal above $100. If we fit a truncated normal to the distribution above $100,
we estimate that the expected value of wealth below $100 is approximately $59.34. We use this
number to impute wealth to those whose real estate wealth is below $100 and rerun the regressions
from above. As can be seen in Panel A, this adjustment for truncation of the lower tail results
in trivial differences in the estimated coefficient on lottery winning. While this adjustment for
truncation is also imperfect, the fact that the results change so minutely when moving around the
lower-tail imputations by so much suggests that lottery status has very little impact on the lower
tail of wealth. We test this directly in the next panel.
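The imputation step can be illustrated with a short sketch. It assumes that log wealth is normal with mean mu and standard deviation sigma, and it takes those two parameters as given; fitting them to the data above $100 is not shown. The parameter values below are hypothetical placeholders, not the estimates behind the figure reported in the text.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_mean_below(cutoff, mu, sigma):
    """E[W | W < cutoff] when ln(W) ~ Normal(mu, sigma^2)."""
    log_c = math.log(cutoff)
    # Partial expectation of a log-normal below the cutoff ...
    numerator = math.exp(mu + 0.5 * sigma ** 2) * normal_cdf((log_c - mu - sigma ** 2) / sigma)
    # ... divided by the probability of falling below the cutoff.
    denominator = normal_cdf((log_c - mu) / sigma)
    return numerator / denominator

# Hypothetical parameters for illustration only (NOT the paper's estimates).
mu, sigma = math.log(800.0), 1.2
imputed = lognormal_mean_below(100.0, mu, sigma)
```

The imputed value necessarily lies between zero and the $100 cutoff, and it rises with the cutoff.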
Panel B of Table 3 presents the results of quantile regressions27 that allow us to explore the
effect of winning the lottery on wealth at various points in the wealth distribution. We estimate
very little effect of winning on wealth in the lower tail, seen in the first row of the panel where
we estimate the treatment effect at the 25th percentile of wealth. (Note that the person at the
25th percentile of the wealth distribution in the sample has zero wealth.) In contrast, we estimate
an effect of approximately $200 at the median and over $500 midway into the upper tail. We see
even larger differences in wealth at the 95th percentile, although this result is only statistically
significant for one of the two specifications. At such high levels of wealth, it is likely that any
treatment effect of winning the lottery is overwhelmed by noise (be it statistical noise or variations
in fortune/endowments/etc.), especially if the noise grows in magnitude as wealth increases even
as the dollar value of winnings does not.
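Because the quantile regressions reported here are bivariate, each coefficient is, up to the handling of ties and interpolation, the gap between the treated and control sample quantiles. The sketch below illustrates this with hypothetical wealth values; the function names are ours, and the quantile convention used (the smallest order statistic covering a share q of the sample) is one of several in common use.

```python
import math

def sample_quantile(values, q):
    """Smallest observed value with at least a share q of observations at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]

def quantile_gap(control, treated, q):
    """Treated-minus-control difference in the q-th sample quantile."""
    return sample_quantile(treated, q) - sample_quantile(control, q)

# Hypothetical control wealth, and a treated group that is a uniform $500 shift of it.
control = [0, 0, 100, 400, 900, 2500, 6000, 15000]
treated = [w + 500 for w in control]
gaps = [quantile_gap(control, treated, q) for q in (0.25, 0.5, 0.75)]
```

A uniform shift of this kind moves every quantile by the same amount, which is the benchmark against which the estimated lower-tail effects are compared.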
27
It was not computationally feasible to estimate the quantile regressions with large sets of dummy variables, so
the results reported here are from bivariate quantile regressions.
This pattern of results across the distribution is also shown in Figure 1, which presents quantile-regression estimates of lottery winning across the distribution of total wealth in 1850. The points
are the quantile-specific estimates of the treatment effect, and the dashed line is a local-polynomial-smoothed mean of these estimates. Here we use the ‘unique match to Smith’ definition of lottery
winning. (Appendix Figure 2 displays analogous results using the 1/n match instead.) Again, we
see that shifts in the distribution are quite small in the lower tail, become larger in the middle,
and then grow quite large in the upper tail. (We omit the display of quantiles above 0.985, where
estimates are larger still, so as to not obscure the shape of the curve for the vast majority of the
distribution.) Note that, while the average coefficient is $525, the gains are quite concentrated in
the upper third of the distribution.
These results are, on their face, inconsistent with the simple hypothesis that the random disbursement of a fixed amount of wealth shifts the distribution equally at all points. For certain,
there was variance in the value of lots won, but the random nature of the lottery ensures that
both the variance and the expected value would have been independent of a winner’s counterfactual point in the control distribution. Thus, if all of the winnings were at least positive, then the
lottery should have to some degree drained mass from the lower tail of the distribution, relative
to control.28 In any case, we cannot interpret these estimates as the treatment effects at a given
point in the control distribution, unless the mapping from control to treatment (which is inherently
unobservable) preserves the relative rank of each observation in the outcome distributions. Absent
this rank-invariance property, the interpretation of quantile-regression estimates is somewhat awkward to render in words, so we return to this issue with graphical presentations of the differences
in the distributional function in Section 5.3 below.29
28
Some readers might wonder what the results would look like if most of the parcels were of zero value. First,
note that this assumption is extreme. (Banks (1905, p. 19) estimates that more than three quarters of the lottery
parcels were eventually claimed, which suggests an expected value greater than the filing fee of $18.) Second, note
that someone who would have been in the lower tail should have been just as likely to win an unusually valuable
parcel as someone who would have been elsewhere in the distribution. The following two simulations are illustrative.
In simulation (1), we take the control group and turn a random 12% of them into spurious winners, and then add
$500 to their wealth. The effect at the 25th percentile is $500 with a standard error of 40. In simulation (2), we instead
add $2000 to a random 1/4 of this group of spurious winners. The quantile regression coefficient is again $500, but
now with a standard error of 43. This pattern continues if we decrease the probability of winning a valuable parcel,
but maintain the expected value of $500: the coefficient stays at 500, but the standard error increases. Thus, the
problem introduced by parcel heterogeneity would seem to be one of precision, not of bias across the quantiles.
29
See also Appendix A, where we present a partial-identification strategy for putting bounds on treatment effects
for those who, absent winning the lottery, would have been in the lower tail of the wealth distribution (call it
the ‘counterfactual lower tail’). With minimal assumptions, we cannot rule out considerable departures from rank
invariance. If we impose that the expected value of treatment was positive throughout the distribution, we can rule
out very large treatment effects for those in the counterfactual lower tail. But this upper bound is high: roughly $1500.
Thus, even with the partial-identification analysis, we cannot rule out that the lower tails of treatment and control
are similar because some of the would-have-been poor moved up and an equivalent number moved down to take their
place.
In Table 3, Panels C and D, we consider the subcomponents of measured wealth: real-estate
wealth and wealth held in the form of slaves. First, consider the intensive margin. We estimate
positive treatment effects of winning the lottery for both categories of wealth, with a somewhat
higher coefficient on slave wealth. The estimate for real estate is about half the median real estate
wealth, while the estimate for slaveholding is considerably larger than the median (of zero) in
that category of wealth. Second, consider the extensive margin of wealth. We estimate essentially
no effect of winning the lottery on holding real estate valued at least $100 (the truncation point
imposed by the census as discussed above). In contrast, lottery winners are four to five percent
more likely to own at least one slave.
Finally, estimates are similar when using either the binary or the 1/n match variables to impute
lottery winning. Throughout the rest of the study, we present only the binary variable to save space.
Nevertheless, it does seem from the results in Table 2 that the specification including controls for
surname fixed effects is a bit more conservative, and so we present that specification as an alternative
to the baseline in the tables to follow.
5.3 Comparison of Density Functions
In this section, we compare the probability density functions and cumulative distribution functions (PDF
and CDF, respectively) of the control and treatment groups. These results are shown graphically
in Figure 2. To construct these graphs, we use the same sample of lottery-eligible household heads
as above and we define “treated” to be the binary variable indicating a unique match to Smith
(1838). The y axes in Figure 2 denote density (or probability) and the x axes measure the natural
log of total wealth (displayed in thousands of dollars for legibility).
Relative to control, the empirical PDF of the treated group appears to be missing mass in the
middle of the distribution and have extra mass in the upper tail. Panel A of Figure 2 shows the
estimated PDFs of both the control and treatment groups (dashed and solid lines, respectively).30
The vertical line at 0.1 (that is, $100) denotes a level below which some enumerators censored
real estate wealth in the 1850 Census. The control PDF is approximately log-normal and roughly
similar to the treatment distribution below a few hundred dollars. Between roughly $300 and $2500,
however, the treatment PDF is noticeably lower than that of the control. Above $2500, this pattern
is reversed, with control below treatment. (As mentioned in Section 4, the two distributions are
significantly different from one another in a Kolmogorov-Smirnov test.)
30
These densities are estimated using a kernel estimator in Stata (“kdensity”) with an Epanechnikov kernel and
Stata’s calculation for the optimal bandwidth. We omit those observations with zero wealth rather than using the
imputation. Otherwise the assumption of smoothness would be violated for the kernel density estimator. The question
of the extensive margin of wealth was treated above.
The CDF for the treated is shifted up relative to control in a statistically significant manner,
but only for wealth between approximately two and ten thousand dollars. This result is seen
in Figure 2, Panel B, which displays the estimated differences in CDFs between the two groups
at various points. We implement this estimator by constructing an indicator variable for wealth
being below a given \(\bar{w}\): \(d_i^{\bar{w}} \equiv I(w_i \le \bar{w})\). We then regress \(d_i^{\bar{w}}\) on the treatment dummy, with
controls as in equation 1. Sweeping the \(\bar{w}\) threshold across the distribution of \(w_i\), we recover a
treatment effect and confidence interval that estimates the difference in CDFs across a series of
wealth levels. (Note that shifting of the treatment PDF will generate negative coefficients in this
procedure, by the definition of the CDF.) These estimates are shown in Panel B, with the solid
gray line being the point estimate and the dotted lines describing the 95% confidence interval. A
solid black horizontal line is drawn at zero for reference. As can be seen, the lower half (or more)
of the CDFs are statistically indistinguishable, as are the extreme upper tails. Nevertheless, there
is a range of wealth values, in the several thousands but not the tens of thousands, for which the
treatment CDF has significantly more mass at higher wealth than has the control CDF. (These
results are similar if we use alternative regression specifications. See Appendix Figure 3.)
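A stripped-down version of this estimator can be sketched as follows. It omits the age and location controls, in which case regressing the threshold indicator on the treatment dummy reduces to a difference in group means of the indicator, that is, the difference in empirical CDFs at the threshold. The data and the function name are hypothetical.

```python
def cdf_gap(wealth, treated, threshold):
    """Treated-minus-control share with wealth <= threshold (no controls)."""
    d_treated = [w <= threshold for w, t in zip(wealth, treated) if t]
    d_control = [w <= threshold for w, t in zip(wealth, treated) if not t]
    return sum(d_treated) / len(d_treated) - sum(d_control) / len(d_control)

# Hypothetical sample: the first four men are controls, the last four are winners.
wealth = [0, 200, 800, 3000, 0, 200, 3000, 9000]
treated = [False] * 4 + [True] * 4

# Sweep the threshold across the wealth distribution.
gaps = {t: cdf_gap(wealth, treated, t) for t in (100, 1000, 5000, 10000)}
```

In this toy example the gap is zero in the lower tail and at the very top, and negative in between, which is the qualitative shape described for Panel B.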
5.4 Incorporating Tastes for Equity
The magnitude of the improvement for the treated group depends on one’s taste for equity (i.e.,
distaste for inequality). There is greater inequality in the treatment than control distributions, as
measured by the coefficient of variation, the standard deviation of logs, and the Gini coefficient, as
well as with the Mehran, Piesch, Kakwani, and Theil entropy measures. Furthermore, a statistical
test that the standard deviation of log total wealth (with the imputation adjustment) is greater for
treatment than control (versus the null of equality) has a p value of 0.0321. But how should we
weigh this increase in inequality (a presumed cost) versus the higher average wealth (an obvious
benefit)?
A simple and standard way to vary the equity weight when considering a group’s distribution
of outcomes is to use an aggregator with a constant elasticity of substitution (CES):
\[
\bar{U}_j = \left[ \sum_{i \in Z_j} \frac{1}{N_j}\, w_i^{\frac{\rho-1}{\rho}} \right]^{\frac{\rho}{\rho-1}} \tag{2}
\]
in which i indexes individuals in our sample, j is an indicator for treatment status (j = 0, 1), \(Z_j\)
is the set of indices i belonging to group j, \(N_j\) is the number of individuals in that group, \(w_i\) is i’s 1850 total wealth, and ρ is the elasticity of
substitution, which relates to the preference for equity in a manner described below. The question
we ask is whether, for a given ρ, we can reject the equality of the CES aggregator between the two
groups or, more formally, \(H_0(\rho): \bar{U}_0 = \bar{U}_1\).
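As an illustration of how the aggregator in equation (2) behaves, the sketch below evaluates it for a hypothetical two-person group. It assumes strictly positive wealth and ρ > 0, and it treats ρ = 1 as the geometric-mean (Cobb-Douglas) limit; the function name and the wealth values are ours.

```python
import math

def ces_aggregate(wealth, rho):
    """CES aggregate of strictly positive wealth values with elasticity rho > 0."""
    n = len(wealth)
    if rho == 1.0:
        # Limit as rho -> 1: the geometric mean (Cobb-Douglas case).
        return math.exp(sum(math.log(w) for w in wealth) / n)
    exponent = (rho - 1.0) / rho
    return (sum(w ** exponent for w in wealth) / n) ** (1.0 / exponent)

# Hypothetical group with one poorer and one richer member.
group = [100.0, 400.0]
near_maximin = ces_aggregate(group, 0.05)      # strong taste for equity
cobb_douglas = ces_aggregate(group, 1.0)       # geometric mean
near_mean = ces_aggregate(group, 1_000_000.0)  # almost no taste for equity
```

The aggregate rises with ρ, moving from close to the poorest member's wealth toward the arithmetic mean of the group.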
Figure 3 shows the relationship between the taste for equity, ρ, and the ratio of the Ū for the
treated divided by that of the controls. Importantly, the graph also displays the 95% confidence
interval of this ratio, so we can see whether the two groups are statistically distinguishable under
varying assumptions for ρ. These statistics are computed with 5000 bootstrapped samples for each
ρ. The sample is the same as in Figure 2, Panel B, and the computations use the untruncated
level31 of wealth.
For a sufficiently high ρ we can reject the null hypothesis of equivalence between the two groups.
As ρ → ∞, Ūj becomes simply the arithmetic mean for each group. In this limit, a dollar held by
any given person becomes a perfect substitute, in social-welfare terms, for a dollar held by someone
else. Thus, the taste for equity disappears as ρ gets larger, and social welfare eventually depends
only on the average outcome. Average wealth is indeed higher in the treatment group, as seen
above. Therefore, as we see in Figure 3, we can reject the null hypothesis of identical Ū for very
large ρ. As ρ → 1, the Ūj aggregator becomes the Cobb-Douglas function, with each individual’s
wealth as its arguments. The H0 is rejected for this case, which is equivalent to the test above for
the natural log of wealth.
Nevertheless, we cannot reject the equality of the two (re-weighted) wealth distributions if we
place a greater weight on equity. Specifically, the control and treatment distributions are no longer
statistically distinguishable once ρ falls somewhat below one. This is seen in Figure 3, where the
confidence intervals overlap with zero for ρ at or below 0.95. This number is very far from,
for example, Rawls’ (1971) maximin preference, an elasticity of zero in which social welfare depends
exclusively on the well-being of the least-well-off individual.
5.5 Placebo test using South Carolina
We perform a falsification exercise using South Carolina rather than Georgia and do not find
statistically significant results. One of the challenges in identifying the treatment effect associated
with winning the 1832 Lottery is that our method of imputing lottery status via name matching
may introduce biases through sample selection. To check for this possibility, we construct a placebo
sample using households with only children born in South Carolina (rather than Georgia) during
the same pre-lottery window (the three years prior to the Cherokee Land Lottery of 1832).32 We
use the names among this South Carolina sample to impute lottery status per the Smith (1838) list.
31
Using natural logarithms would be an inappropriate starting point, tantamount to forcing ρ ≤ 1.
32
We are grateful to Petra Moser for suggesting this test.
As above, we use both a dummy for a unique match to the Smith list and a variable that allows for
probabilistic matches, deflated to 1/n in case of ties. By the eligibility rules of the Cherokee Land
Lottery, any matches to this list from the placebo sample must be spurious. It is then reassuring
that the fraction of unique matches in the placebo sample derived from South Carolina is only one
quarter of the fraction in the Georgia sample. In Table 4, we estimate Equation 1 using this placebo
sample, for the different variables indicating lottery status, and using both the basic specification
and the one that includes surname/Soundex fixed effects. These results are found in Panels A and
B, with analogous results from the Georgia sample provided for reference in Panel C. The first
four columns of Table 4 show outcomes that were determined prior to the 1832 Lottery, and there
are no statistically significant results. (Note that a series of falsification checks using pre-lottery
variables was also performed for the Georgia sample, as shown in Table 1, Panel B.) The remaining
columns show post-lottery outcomes in 1850 such as residing in old Cherokee County and real-estate
wealth. There is no statistically significant pseudo-treatment effect for the South Carolina sample,
in contrast to what we find for Georgia.
6 Discussion
Paying attention to their differential effects across the wealth distribution, we consider in this section
various mechanisms for the results above. We do not present evidence that can rule out any one of
these mechanisms, but we do discuss several channels that pertain to them. To have such similar
lower tails of the control and treatment distributions two decades after the lottery, there must have
been dissipation of the lottery winnings by some. Painting with a broad brushstroke, we suggest
four distinct channels of dissipation. First, there may have been differences in the physical returns
to capital, which might arise because of a fixed cost of production or an interaction with the ability
of the individual. The second mechanism is risk. The economy of the period was characterized by
much uninsurable risk. This may have been differentially incident on the lower tail if they had less
of a buffer against shocks that might take them down to a subsistence level. A third mechanism for
wealth dissipation is consumption: some winners may have consumed all of their newfound wealth
prior to 1850.33 A fourth mechanism is leisure: lottery winners may have chosen to live a quiet
life rather than leveraging their capital with sweat equity (e.g., by purchasing on the frontier and
improving their land).
33
Note that, in an environment without access to credit, it is difficult to separate the timing of consumption and
investment decisions. Thus, even if a non-convexity in production represents a high return to investment relative to
canal bonds, e.g., the return may be low when compared with the individual’s subjective discount rate. If so, there
is effectively no poverty trap in this case because the individual would choose consumption over investment if he had
the wealth.
6.1 Fixed Costs
It is implausible that a fixed cost of production, in and of itself, caused the long-run insensitivity
to treatment of the lower tail of the wealth distribution. We say this not because we doubt the
existence of fixed costs (we do not), but rather because it is inconsistent with what we know about
farming in this period. Without doubt there were farms (plantations, really) that had large fixed
costs of start-up and operation. But a prospective farmer who could not produce such a large
initial investment would have had the option to farm at a lower scale, perhaps with a different crop
and/or in a different area.
Within our data itself, it is difficult to square the results with a large fixed cost of production.
Differences in the PDFs do not emerge until over $400 and the CDFs do not significantly diverge
until over $2000 in wealth. In contrast, we see farmers in our own data working with much
lower levels of wealth. (See Figure 2, Panel A.) The fact that the control wealth distribution is
approximately log-normal, even in this low-wealth range, suggests that we are observing something
close to the steady-state level of operation rather than some transitory range that farms pass
through on the way to either closing or dramatically expanding. In any event, the parcels won in
the 1832 Cherokee Land Lottery were roughly 160 acres, which would have been more than enough
for small-scale farming, except on the lowest-quality land. Even if 160 acres were too small for
some purposes, lottery winners could have simply sold their land in Old Cherokee County (as many
did) and purchased a larger farm with cheaper land farther west on the frontier.
While slaves were manifestly difficult to purchase in non-integral units, this nonconvexity need
not have been an impediment to productivity using the lottery winnings. First, purchasing a prime-age, male fieldhand would have cost several hundred rather than several thousand dollars in 1850.
Second, there were substitutes for purchasing slave labor, such as hired labor or less slave-intensive
crops. Furthermore, while some areas of the South were dominated by slave-based agriculture, others
were not. Indeed, Old Cherokee County itself was in the Upcountry where smaller yeomen-operated
farms were commonplace.
Land improvement was itself an up-front cost, but did not need to be either large or lumpy.
Initial land clearance was often done in a low-cost, slash-and-burn fashion. In any case, improving
land could be done incrementally, perhaps hiring oneself out as a farm laborer until at least enough
land was cleared for subsistence.34
34
We are grateful to Tim Guinnane for pointing this out.
6.2 Risk and Churn
Another mechanism is exposure to risk. Figure 4 displays estimates of the fraction holding no more
than $100 in 1860 as a function of 1850 total wealth. The dependent variable (on the y axis) is a
dummy for whether the 1860 total wealth (personal plus real estate) is less than or equal to $100
in 1860 dollars. The independent variable (on the x axis) in this figure is total measured wealth
in 1850. The sample size for this figure is 5603, because only 40% of the sample is linked to 1860.
The dashed line displays a local-polynomial smoothed estimate of the indicated fraction for each
level of total wealth, and the grayed area denotes the 95% confidence interval. For reference, the
solid gray line presents the PDF for log total wealth (excluding the imputation for those with zero
wealth). Even very wealthy men faced a roughly 5% chance of dropping to the lowest measured
wealth status over the course of a decade. This probability reaches one-quarter for the very poor.
These results highlight the large degree of risk and churn present in that period.
A simple calculation helps in evaluating the following hypothesis: lottery winners who would have
been in the lower tail anyway have zero treatment effect because of risk. Suppose that any lottery
winners who would have been in the lower tail had they lost the lottery faced some catastrophic risk
that would wipe out their lottery winnings. Suppose further that the shock would hit them with
95% probability over the 18 years between the lottery and the 1850 census. If the hazard rate of this
catastrophic risk was constant over time, this would translate into a chance of approximately 15%
per year (\(1 - 0.05^{1/18} \approx 0.15\)) of losing their extra lottery wealth. This number seems high to us, although
there is very little data on the idiosyncratic risk exposure in the pre-1850 period. Indeed, Wright’s
hypothesis that small Southern farmers practised “safety first” was based on observations of crop
selection rather than direct information about uninsured risk. If this sort of yearly risk prevailed
in the linked 1850-60 data shown in Figure 4, the dashed curve (indicating the 1860 wealth had
dropped below the reporting threshold) would reach levels of over 80% (\(1 - 0.85^{10} \approx 0.80\)) rather than below
30% as estimated. This suggests that a story purely of risk (perhaps conditional on wealth) cannot
explain these results, although it leaves open the possibility of a story of heterogeneity across
individuals in risk or returns.
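The arithmetic in this calculation can be reproduced with a few lines. The sketch assumes, as in the text, a constant annual hazard; the function names are ours.

```python
def annual_hazard(cumulative_loss_probability, years):
    """Constant annual hazard implied by a cumulative loss probability over `years`."""
    return 1.0 - (1.0 - cumulative_loss_probability) ** (1.0 / years)

def cumulative_loss(annual_hazard_rate, years):
    """Probability of at least one loss over `years` at a constant annual hazard."""
    return 1.0 - (1.0 - annual_hazard_rate) ** years

# A 95% chance of losing the winnings over the 18 years from 1832 to 1850 ...
hazard = annual_hazard(0.95, 18)          # roughly 0.15 per year
# ... would imply this cumulative loss probability over the 1850-60 decade.
decade_loss = cumulative_loss(0.15, 10)   # roughly 0.80
```

The implied ten-year figure of about 80% is the number compared with the estimated curve in Figure 4, which stays below 30%.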
Aggregate risk was also a major factor. The Panic of 1837 saw a large drop in commodities
prices, including the price of cotton. (There was also a panic in the late 1850s, which would have
affected the 1850-60 transition discussed above.) From a pure portfolio perspective, this should
have affected the yeomen farmers less if “safety first” led them to diversify away from market-oriented crops like cotton. But they might have been more affected if they had less of a buffer to
keep them away from subsistence and/or Ransom’s “walk away” margin.
6.3 Interactions and Human Capital
One possible mechanism is that wealth shocks alone are insufficient, but rather they must be paired
with some complementary skill. Indeed, those at the bottom of the distribution (either control or
treatment) may have been there precisely because they lack the ability to seize opportunities. If so,
the bottom parts of the two distributions might look similar because the low-skilled winners could
not take advantage of their winnings.
6.3.1 Did the would-be poor not claim their winnings?
Failure to claim lottery winnings cannot account for the lower tails of the treatment group being
similar to control. Evidence supporting this claim is seen in Figure 5, which displays estimates of
the fraction of parcels claimed prior to 1838 (as reported in Smith, 1838) versus total 1850 wealth
(adjusted for truncation at the lower tail) for the subsample of lottery winners only. (By definition,
lottery losers were not assigned parcels, so parcel characteristics are unavailable for the full sample.)
The solid line displays a local-polynomial smoothed estimate of the mean claim rate for each level
of total wealth, and the short-dashed lines denote 95% confidence intervals. For reference, the
long-dashed line presents the PDF for log total wealth.
Across the distribution of 1850 wealth in the treated group, there was a relatively stable rate of
claiming land, as shown in Figure 5. Claiming rates are, if anything, a bit higher in the lower tail of
the 1850 wealth distribution. In any event, note the range of the curve: within 5 percentage points
of one half for almost all of the distribution. If less than 50% failed to claim their winnings by the
end of 1837,35 this will attenuate the effect of treatment, but it cannot explain the apparent 100%
markdown of winnings in the lower tail of treatment. Therefore the lottery winners who ended up
poor did not do so because they all failed to collect their winnings at the outset.
6.3.2 Surname-average Characteristics
In this subsection, we consider whether the response to the shock of winning the lottery is related
to characteristics of other people who share the same surname and are therefore likely related along
patrilines.
To assess this possibility, we construct surname-specific averages of wealth, fertility, literacy, and
school attendance as possible proxies for differences across extended families in either preferences
or prices. We used the 1850 100% census file to construct the average fertility, school attendance,
35
Eventual claiming rates are probably even higher because the Georgia state government processed claims for
several more years. Banks (1905) estimates that approximately a quarter of the land lots opened by lottery went
unclaimed.
and real-estate wealth among Georgia-resident households for each (Soundex) surname. Those
individuals that appear in our lottery-eligible sample are excluded from the construction of the
averages. We first check for the statistical power of these proxies by regressing the individual-level
outcome on the surname average:
\[
Y_{ijks} = \alpha Y'_s + \delta_a + \delta_k + \epsilon_{ijk} \tag{3}
\]
where s denotes the surname for each observation, \(Y'_s\) is the surname-average of the \(Y'\) variable,
and each regression contains dummies for age and for state/county of residence. (The ‘prime’ on \(Y'\)
allows for the possibility of a different Y variable’s average on the right-hand side of the equation.)
Furthermore, recognizing the group-level regressor, we adjust the standard errors for clustering at
the surname level. The base sample for these regressions is the same as for analogous estimates of
Equation 1 displayed in earlier tables, with the exception that some households are omitted if there
were no other households in Georgia with the same surname and therefore no one with which to
form the surname-level averages. We consider three surname-averaged wealth outcomes: the level
of 1850 total wealth, its natural log (both adjusted for truncation in the lower tail), and a dummy
variable if total wealth exceeds $5000. The baseline estimate for the treatment dummy is shown in
Column 1 of Table 5.
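The construction of the surname averages can be sketched as follows. This is an illustration under assumed names and a simplified record layout (person identifier, Soundex-style surname code, and the value being averaged), not the procedure actually run on the census file. Members of the lottery-eligible sample are left out of the averages, and a surname with no remaining households receives no average.

```python
from collections import defaultdict

def surname_averages(census_rows, sample_ids):
    """Average a value by surname code, excluding people in the analysis sample.

    `census_rows` holds (person_id, surname_code, value) tuples; `sample_ids`
    is the set of person_ids belonging to the lottery-eligible sample.
    """
    totals = defaultdict(lambda: [0.0, 0])  # surname_code -> [sum, count]
    for person_id, surname_code, value in census_rows:
        if person_id in sample_ids:
            continue  # own observation must not enter the surname average
        totals[surname_code][0] += value
        totals[surname_code][1] += 1
    return {code: total / count for code, (total, count) in totals.items()}

# Hypothetical census extract and sample membership.
rows = [(1, "S530", 1000.0), (2, "S530", 3000.0), (3, "S530", 500.0), (4, "B620", 200.0)]
averages = surname_averages(rows, sample_ids={3, 4})
```

In the example the second surname has no household outside the sample, so it drops out, mirroring the omission of such households from the regressions.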
The estimates indicate that the surname-averaged variable is indeed a strong and statistically
significant predictor of individual wealth, although results are weaker for surname-average fertility.
Estimates of equation 3 are found in even-numbered columns of Table 5. A coefficient of zero
is rejected in most cases at conventional confidence levels. A mechanistic model in which the
patrilineal dynasty (as proxied by surname) predicts outcomes one-for-one is even more strongly
rejected, however; the coefficients are closer to 1/4th or 1/8th. See, for example, Panel A, Column
2, or Panel B, Columns 4 and 6, for apples-to-apples comparisons. (Note that we do not argue that
this is a causal effect of the behavior of their relatives on the individuals’ choices, but rather a proxy
for some shifter that is common within the group. Thus the interaction term should be interpreted
as interacting with the shifter as well.) Surname-average fertility is a weaker predictor of wealth
(Column 8). Nevertheless, men have statistically and economically significantly higher wealth if their surnames
are associated with higher rates of school enrollment among children aged 5-15 or of literacy among
adult men (Columns 10 and 12, respectively). In sum, these surname-level measures are generally
strong predictors of own wealth, and thus may be suitable predetermined proxies of own human
capital. (Recall from Table 1, Panel B, that lottery status did not predict these characteristics.)
The odd-numbered Columns 3–13 of Table 5 report results in which we interact the surname-average
characteristic with winning the lottery. The specific estimating equation is as follows:

$$Y_{ijks} = \gamma T_j + \beta T_j \, z(Y_{s'}) + \alpha Y_{s'} + \delta_a + \delta_k + \epsilon_{ijk} \qquad (4)$$

Note that this is a modified version of Equation 1, to which we add the interaction of the treatment
variable ($T_j$) with the z-score of the surname-average variable ($z(Y_{s'})$). The main effect of the
patriline loads on to $\alpha Y_{s'}$, and standard errors are clustered at the surname level. We report the
coefficients on treatment, the surname average, and the interaction of treatment with the surname
average. The estimated coefficients on being a lottery winner throughout the table are similar
to those reported above, and the coefficients on the surname averages are similar to those in the
even-numbered columns. The estimated interaction terms are generally of the expected sign but
not statistically significant. For example, in Panel B, Column 7, we display estimates in which the
outcome variable is the natural log of total wealth and the surname-level variable used to form
the interaction with treatment is the natural log of the median total wealth among other adult
males in Georgia with the same surname. The coefficient of .009 implies an additional .036 of log
total wealth from treatment as we sweep across four standard deviations of the distribution of the
surname-average variable. The difference is approximately one third of the main effect of treatment.
However, this interaction coefficient is not significantly different from zero. Indeed, only one of the
18 interactions with surname averages yields a significant (at the 10% level) estimate, although 13
of 18 are of the expected sign.
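The arithmetic behind the Panel B, Column 7 example can be made explicit. Because the surname-level variable enters the interaction as a z-score, the interaction coefficient is the change in the treatment effect per standard deviation of that variable. The following is a minimal sketch only: the function names are introduced here for illustration, the sample values are made up, and the use of the population standard deviation is an assumption rather than a detail taken from the estimation.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean zero and unit (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def implied_shift(beta, n_sd):
    """Change in the treatment effect when the z-score moves by n_sd standard deviations."""
    return beta * n_sd


# Interaction coefficient reported in the text for Panel B, Column 7.
beta = 0.009
# Sweep across four standard deviations of the surname-level variable.
shift = implied_shift(beta, 4)

# Hypothetical surname-level values (illustrative only, not from the data).
example = [5.0, 6.0, 7.0, 8.0, 9.0]
z = z_scores(example)
```

Here `shift` evaluates to the .036 of log total wealth quoted above, up to floating-point rounding.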
6.3.3 Own Illiteracy
We do not find a significant interaction between treatment and the lottery winner’s own illiteracy.
While this could in theory have been affected by the lottery, it is unlikely that many men would
have become literate during adulthood, whether they won in the lottery or not. Nor does lottery
winning predict illiteracy, as we saw in Table 1. With these facts, we take license to use the
lottery eligible’s illiteracy (defined as unable to read or write) as a predetermined (prior to the
lottery, that is) variable. These estimates should be taken cum grano salis in case this assumption
is incorrect. In Table 5, Column 14, we show that illiterate men indeed had substantially lower
wealth by all three measures. However, estimates for the interaction term (shown in Column 15)
are not significantly different from zero.
All told, evidence in support of both a statistically and an economically important interaction
between treatment and human capital is weak, at least with the proxies of human capital at our
disposal.
6.4 Life-Cycle and Family Considerations
The results above are unlikely to reflect life-cycle-related wealth decumulation (be it for
spending or bequests). First, note that inter vivos transfers to current members of the same
household are already included in the measure of wealth used above. Second, the age distribution
and the life-cycle profile of wealth do not favor this hypothesis of decumulation. The distribution of age and
the average wealth by age, in our sample, are shown in Appendix Figure 4. Wealth accumulation
peaks around 60 years, and over 80% of our sample is below this age. In any event, it is probable
that some would have maintained their gross asset position (which is what we measure above)
into old age to use as collateral against a mortgage or as a kind of social collateral to secure help
from their children who would be waiting for their inheritance. Again using Hacker’s life table,
we compute that remaining life expectancy for ages 35-75 in 1850 was an additional 17.7 years, which
is similar to the number of years elapsed since the lottery. Thus, our sample of lottery eligibles
was nowhere near the end of their planning horizon for consumption. We argue, therefore, that
the similarities of the lower tail of the control and treatment wealth distributions are not because
people had spent or bequeathed all of their wealth for standard life-cycle motives.
One complication is that some of the wealth might have already been bequeathed to adult
children outside the house. Note that our sample would still have children in the household by
construction, and it is unlikely that they would have already bequeathed all of their wealth to
children outside the household.36 In any case, we attempt to include inter vivos transfers by
measuring the wealth of nearby adult sons. (We have no way to track sons that moved far away.)
In order to identify potential sons of lottery participants who may have left home by 1850 and
established nearby farms themselves, the manuscript pages were searched for all males within 50
individuals of each lottery participant with a surname similar to the participant but a reported 1850
age that was 20 or more years younger than that of the participant. The real estate wealth of these
individuals was then transcribed and their wealth was re-inserted into the participant household’s
total to account for the possibility that some wealth might have been disbursed to these potential
sons as they left their father’s home and set up farms. Though some were no doubt nephews rather
than sons or may in fact have been unrelated to the lottery participants, we do not expect the
quality of this indicator of wealth “leakage” to vary systematically between lottery winners and
lottery losers. We find that the relationship between having a possible son nearby and the 1850
distribution of wealth is quite similar in the control and treatment groups. (See Appendix Figure
5, Panel B, which is patterned on Figure 4.) Furthermore, when we construct the difference in the
CDF of wealth including potential sons’ wealth (shown in Appendix Figure 3, Panel F) the pattern
is similar to the baseline estimates in Figure 2, Panel B.
36
Primogeniture was abolished by the Georgia colony in 1777 (19 Ga col rec part 2 1912 455), so children were
entitled to equal treatment as heirs when their father died intestate. There are numerous examples of trousseau
(dowries) being provided to daughters at the time of their marriage, but the practice was not legally mandated.
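The search for potential sons described above can be sketched in code. This is an illustrative sketch under stated assumptions, not the procedure actually used in the transcription: the `Person` record, the use of `difflib.SequenceMatcher`, and the 0.8 similarity threshold are all hypothetical, since the text specifies only the 50-person window, the similar-surname requirement, and the 20-year age gap.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Person:
    line: int           # position in the enumeration order of the manuscript pages
    surname: str
    age: int            # reported age in 1850
    male: bool
    real_estate: float  # reported real-estate wealth in 1850


def similar_surname(a, b, threshold=0.8):
    """Hypothetical similarity rule: share of matching characters at or above threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def potential_sons(participant, people, window=50, min_age_gap=20):
    """Males within `window` individuals, with a similar surname, at least `min_age_gap` years younger."""
    return [
        p for p in people
        if p is not participant
        and p.male
        and abs(p.line - participant.line) <= window
        and participant.age - p.age >= min_age_gap
        and similar_surname(p.surname, participant.surname)
    ]


def wealth_with_sons(participant, people, total_wealth):
    """Add the real-estate wealth of potential sons back into the participant's total."""
    return total_wealth + sum(s.real_estate for s in potential_sons(participant, people))


# Made-up records for illustration.
father = Person(100, "Harrison", 55, True, 1000.0)
people = [
    father,
    Person(112, "Harrison", 28, True, 300.0),  # counted: nearby, similar name, 27 years younger
    Person(120, "Harison", 30, True, 200.0),   # counted: spelling variant of the surname
    Person(130, "Smith", 25, True, 500.0),     # excluded: different surname
    Person(400, "Harrison", 27, True, 900.0),  # excluded: more than 50 individuals away
    Person(105, "Harrison", 50, True, 700.0),  # excluded: only 5 years younger
]
adjusted_total = wealth_with_sons(father, people, 1000.0)
```

As the text notes, a rule of this kind will also pick up some nephews and unrelated men; what matters for the comparison is that such errors should not differ systematically between winners and losers.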
We also argue that changes in fertility are not an important mechanism in explaining the pattern
of results across the distribution. Using a similar research design, Bleakley and Ferrie (2013a) found
that lottery winners had a higher fertility rate (approximately 0.1 more children by 1850, significant
at conventional levels of confidence) and that the effect on fertility37 is strongest in the lower half of
the wealth distribution. This result is shown in Appendix Figure 5, Panel A, where the control and
treatment means show a gap of roughly 0.1 over approximately the same range for which the ∆CDF
results in Figure 2, Panel B were insignificantly different from zero. Nevertheless, this coincidence
does not imply that the pattern of excess fertility among lottery winners explains the pattern of
the wealth results. First, if the average number of additional children is approximately 0.1, it
implies that the vast majority of the treated did not have additional children, and therefore had no
reason to lose their extra wealth because of extra fertility. Second, even if we ignore the integral
nature of childbearing, comparing 0.1 extra children with the approximate value of winnings of
$500-$1000 would imply a price of child-rearing on the order of $5000 of wealth per child, which
seems implausibly large.
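The implied price per child follows from dividing the value of winnings by the extra fertility. As a back-of-the-envelope sketch, with symbols introduced here purely for illustration ($p$ for the implied wealth cost per additional child, $\Delta W$ for the value of winnings, $\Delta n$ for the additional children per winner):

```latex
% p: implied wealth cost per additional child (notation introduced here)
% \Delta W: approximate value of winnings; \Delta n: extra children per winner
\[
p \;=\; \frac{\Delta W}{\Delta n}, \qquad
\frac{\$500}{0.1} = \$5{,}000
\quad\text{to}\quad
\frac{\$1{,}000}{0.1} = \$10{,}000 .
\]
```

The lower end of this range corresponds to the figure of $5000 per child cited in the text.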
6.5 Locational choice
Finally, we do not find locational choice to be an important consideration in explaining differences in
wealth between control and treatment. First, note that the lottery winners were slightly more likely
to be in old Cherokee County in 1850, by approximately 2 percentage points. But apart from this,
lottery winnings did not appear to bring them to a place that had peculiar characteristics, at least
across a wide range of observables from aggregate county data, including fertility, schooling, farm
values, farm sizes, land improvement, slave density, urbanization, or access to transport. (These
results are found in Appendix Table 1.) In any event, we obtain fairly similar results for wealth
whether or not we condition on state×county dummies; therefore, the change across the wealth
distribution is not because some part of the treatment group happened to pick counties that saw
faster appreciation of land values, for example. Note that (as seen in Appendix Figure 5, Panel B)
the tendency of the treated to reside in old Cherokee County in 1850 seems to be strongest in the lower/middle
part of the wealth distribution. This is consistent with claiming rates being slightly higher in the
lower part of the distribution, as seen above. Generally, this pattern makes sense in that much of
the work of actually going out to improve land would have been labor-intensive, and therefore a
task disproportionately taken up by those with a low opportunity cost of time.
37
Note that the same study, focused on the child-quantity versus child-quality effects of parental wealth, reports that
lottery winners did not change their investments in average child quality. The children of lottery winners did not go
to school more in 1850, nor did they hold more wealth in 1870, nor have higher-paid occupations in 1880, nor were
they more literate in 1880.
At the same time, going out to the frontier and waiting for population growth might have been
a high-return investment, albeit one that required the hard work of improving land. We find that
treated men tended to live in counties in 1850 that had lower population density in both 1830
and 1850.38 But the 1830-50 growth in log population density in the 1850 county of residence was
not significantly different for control versus treatment groups, and the pattern of density growth
is similar across the 1850 wealth distribution for both groups (Appendix Figure 5, Panel E). (We
would have preferred to conduct this calculation using the 1830-50 growth in land value, but farm
values by county were not available before 1850.) In any event, we do not find that those in the
treated group who ended up in the lower tail had opted for the ‘quiet life’ of purchasing land in
already developed (dense) counties (Appendix Figure 5, Panel F).
6.6 Unmeasured assets
The analysis above is limited to two categories of assets: land and slaves. Here we discuss whether our
conclusions might be affected by unmeasured aspects of the household’s balance sheet. First,
financial assets (such as stocks and bonds) would not have been an important part of the portfolio,
except for the wealthiest households. Access to banks would also have been limited, especially for
those in the lower tail in view of the fact that they were more likely to live in sparsely populated
counties. Some important assets for those engaged in agriculture, however, would have been farm
implements and livestock. While we do not measure them, holdings of such assets would presumably
rise in proportion with landownership, which we do measure.
Another important class of assets is physical capital that might have been used by artisans who
were not themselves farmers, but whose work was demanded by local farmers. For example, blacksmiths, farriers, and millwrights supplied goods and services to farmers, but these activities were
physical-capital intensive rather than land intensive. To identify occupations other than farming
that had substantial requirements in terms of physical capital,39 the 1860 Census of Population
was used to generate average personal wealth (reported for the first time in 1860) by occupation
in the non-slave states40 among white, native-born males aged 20-65. Using their 1850 occupation,
we add this occupational index of non-farm, non-slave wealth to the total wealth of the sample of
lottery eligibles. The similarity between control and treatment groups in their lower tails is not
affected by this imputation. If anything, the distributions are even more similar at the lower end,
with statistically significant differences not emerging until $2500. (See Appendix Figure 3, Panel
E.)
38
The 1830 density is computed for the 1850 county boundaries using a raster-based method described in Bleakley
and Lin (2012). The underlying data are from NHGIS and ICPSR study # 2896 (Haines, 2010).
39
We are grateful to Chris Woodruff for suggesting this strategy.
40
Slaves would have been counted as personal wealth in 1860, so we used the pattern of occupational wealth in
non-slave states to avoid double counting.
This index of personal wealth was then culled to remove occupations in which the sole source
of earnings was likely to be human capital (physicians, lawyers, teachers) or farm implements
(farmers, e.g.). We then grouped the occupations into those with above-median and below-median
averages of filtered personal wealth. The control and treatment groups have similar propensities to
be in occupations with apparently high physical-capital requirements. If anything, the lower tail
of wealth has a higher representation of such ‘artisan’ occupations (Appendix Figure 5, Panel D).
These results are inconsistent with the idea that the lower tail of the treatment group was in fact
shifted up from poverty, but that this shift was obscured because we do not measure artisans’ wealth above.
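The two steps of the occupational imputation described above (building an index of average personal wealth by occupation, then filtering and splitting it at the median) can be sketched as follows. This is a simplified illustration, not the code used for the estimates: the function names and all data values are hypothetical, and giving no imputation to occupations missing from the index is an assumption made here.

```python
from collections import defaultdict
from statistics import mean, median


def occupation_index(records_1860):
    """Average personal wealth by occupation from (occupation, personal_wealth) pairs."""
    by_occ = defaultdict(list)
    for occ, wealth in records_1860:
        by_occ[occ].append(wealth)
    return {occ: mean(vals) for occ, vals in by_occ.items()}


def add_imputed_wealth(sample_1850, index):
    """Add the occupational index to measured total wealth.

    Assumption: an occupation absent from the index receives no imputation.
    """
    return [total + index.get(occ, 0.0) for occ, total in sample_1850]


def split_by_median(index, excluded):
    """Drop excluded occupations, then flag those strictly above the median of the rest."""
    kept = {occ: w for occ, w in index.items() if occ not in excluded}
    cut = median(kept.values())
    return {occ: w > cut for occ, w in kept.items()}


# Made-up 1860 records: (occupation, personal wealth).
records_1860 = [
    ("blacksmith", 400.0), ("blacksmith", 600.0), ("farmer", 800.0),
    ("physician", 1500.0), ("millwright", 900.0),
    ("laborer", 50.0), ("laborer", 150.0),
]
index = occupation_index(records_1860)

# Made-up 1850 sample: (occupation, measured total wealth).
sample_1850 = [("blacksmith", 200.0), ("laborer", 0.0), ("cooper", 300.0)]
adjusted = add_imputed_wealth(sample_1850, index)

# Remove occupations whose wealth likely reflects human capital or farm implements.
high_capital = split_by_median(index, excluded={"farmer", "physician"})
```

In this toy example only the millwright falls in the above-median group; the blacksmith sits exactly at the median and so is not flagged.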
Another issue is that we measure the gross asset holdings, which might be inflated if the farmer
could leverage his wealth with debt. Farm mortgages in the first half of the nineteenth century
were generally valued at no more than half the value of the underlying property,41 so the reported
gross real-estate wealth can deviate from real-estate wealth net of encumbrances by no more than
a factor of two. This is an upper bound in that few farms were at their maximum loan-to-value
ratio. Bogue (1976) reports that for the 1850s, between 5% and 40% of all farmland in a set of
Midwestern counties was encumbered. The fraction of land encumbered was probably lower in the
South in this period, as Ransom and Sutch (1973) report that “as late as 1890, only 5.3 percent
of all farms in the South were mortgaged, compared to 36.4 percent of all farms outside of the
South” (Ransom and Sutch, 1973, note 4, p. 136). This low number should not be surprising
given the antebellum Southern banks’ overwhelming focus on short-term commercial affairs. (See
footnote 12 above.)
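The factor-of-two bound on the gap between gross and net real-estate wealth follows directly from the cap on loan-to-value ratios. As a short derivation, with notation introduced here for illustration ($G$ for gross real-estate wealth, $D$ for mortgage debt, $N$ for wealth net of encumbrances):

```latex
% G: gross real-estate wealth, D: mortgage debt, N: net wealth (notation introduced here)
\[
D \le \tfrac{1}{2}\,G
\;\Longrightarrow\;
N = G - D \ge \tfrac{1}{2}\,G
\;\Longrightarrow\;
\frac{G}{N} \le 2 .
\]
```

The bound binds only for a farm mortgaged to the maximum; for unencumbered land, $D = 0$ and gross and net wealth coincide.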
7 Conclusions
The 1832 Cherokee Land Lottery in Georgia represents an unusual environment in which to assess
the long-term impact of shocks to wealth on the wealth distribution. Winning should have been
uncorrelated with individual characteristics by the random nature of the lottery, and indeed our
model passes numerous falsification tests using predetermined variables and a placebo sample from
South Carolina. Further, participation in the lottery was nearly universal (albeit largely limited to adult
white men), thus ameliorating some of the possible problems with external validity that can
arise with self-selected samples.
41
Among southern land banks, “mortgages were to represent property of double value” (Sparks, 1932, p. 96).
Using wealth measured in the 1850 Census manuscripts, we follow up on a sample of men eligible
to win in the 1832 Cherokee Land Lottery. With these data, we can assess the effect of winning
that lottery on the distribution of wealth almost two decades after the fact.
We show that winners are, on average, richer, but mainly because the middle of the distribution
is thinner and the upper tail is fatter. In contrast, the lower tail is largely unaffected. This
stands in contrast with a ‘mechanical’ short-run effect of the lottery, which would tend to compress
the distribution of (log) wealth. The results are also inconsistent with the view that the effect
of winning would have been greatest on the lower tail because credit constraints had created a
wealth-based ‘poverty trap’. Instead, the results are consistent with heterogeneity along a number
of dimensions—differences by wealth in risk-taking or the ability to manage resources and delay
gratification, for example—that may themselves help account for wealth levels that prevail absent
a lottery. As we see in this episode, then, it may take more than just wealth to move the lower tail
of the long-run wealth distribution “up from poverty.”
8 References
accessgenealogy.com. 2009. “1832 Cherokee Country Georgia Land Lottery”. [Accessed May 29,
2009.]
Acemoglu, Daron, and Fabrizio Zilibotti. 1997. “Was Prometheus Unbound by Chance? Risk,
Diversification, and Growth,” Journal of Political Economy, 105(4, August):709-751.
Ancestry.com. 2010. Selected U.S. Federal Census Non-Population Schedules, 1850-1880 [database
on-line]. Provo, UT: Ancestry.com Operations, Inc.
Ancestry.com. 2011. Georgia, Property Tax Digests, 1793-1893 [database on-line]. Provo, UT:
Ancestry.com Operations, Inc.
Atack, Jeremy. 1988. “Tenants and Yeomen in the Nineteenth Century,” Agricultural History,
62(3, Summer):6-32.
Banerjee, Abhijit V. and Esther Duflo. 2011. Poor economics: a radical rethinking of the way to
fight global poverty. New York: Public Affairs.
Banerjee, Abhijit and Andrew Newman. 1993. “Occupational Choice and the Process of Development,” Journal of Political Economy, 101(2, April):274-298.
Banks, Enoch Marvin. 1905. The economics of land tenure in Georgia. New York: The Columbia
University Press, The Macmillan Company.
Blattman, Christopher, Nathan Fiala, and Sebastian Martinez. 2013. “Credit Constraints, Occupational Choice, and the Process of Development: Long Run Evidence from Cash Transfers in
Uganda.” SSRN working paper: http://ssrn.com/abstract=2268552.
Bleakley, Hoyt and Joseph Ferrie. 2013a. “Land Openings on the Georgia Frontier and the Coase
Theorem in the Short- and Long-Run,” Unpublished manuscript, March.
Bleakley, Hoyt and Joseph Ferrie. 2013b. “Shocking Behavior: Random Wealth in Antebellum
Georgia and Human Capital Across Generations.” Unpublished manuscript, August.
Bleakley, Hoyt, and Jeffrey Lin. 2012. “Portage and Path Dependence,” Quarterly Journal of
Economics, May 2012, pp 587-644.
Bode, Frederick A., and Donald E. Ginter. 1986. Farm Tenancy and the Census in Antebellum
Georgia. Athens: University of Georgia Press.
Bogue, Allan G. 1976. “Land Credit for Northern Farmers 1789-1940,” Agricultural History, Vol.
50, No. 1 (Jan.), pp. 68-100.
Boo, Katherine. 2012. Behind the beautiful forevers. New York: Random House.
Buera, Francisco J. and Yongseok Shin. 2013. “Financial Frictions and the Persistence of History:
A Quantitative Exploration,” Journal of Political Economy, 121(2):221-273.
Cadle, Farris W. 1991. Georgia Land Surveying History and Law. Athens, Georgia: University of
Georgia Press.
Carter, Michael and Christopher Barrett. 2006. “The Economics of Poverty Traps and Persistent
Poverty: An Asset-based Approach,” Journal of Development Studies 42(2):178-199.
Clark, Gregory and Neil Cummins. 2012. “What is the True Rate of Social Mobility? Surnames
and Social Mobility, England 1800-2012.” Unpublished manuscript.
Conley, Timothy G., and David W. Galenson. 1994. “Quantile Regression Analysis of Censored
Wealth Data,” Historical Methods, 27(4):149-165.
de Mel, Suresh, David McKenzie, and Christopher Woodruff. 2008. “Returns to Capital in Microenterprises: Evidence from a Field Experiment,” The Quarterly Journal of Economics, 123(4):1329-1372,
November.
de Mel, Suresh, David McKenzie, and Christopher Woodruff. 2012. “One-Time Transfers of Cash or
Capital Have Long-Lasting Effects on Microenterprises in Sri Lanka,” Science, February, Vol. 335
no. 6071 pp. 962-966.
de Soto, Hernando. 1987. El Otro Sendero: La Revolución Informal. Editorial Sudamericana:
Buenos Aires.
de Soto, Hernando. 2002. The Mystery of Capital: Why Capitalism Triumphs in the West and
Fails Everywhere Else. New York: Basic Books.
Fogel, Robert W., and Stanley L. Engerman. 1977. “Explaining the Relative Efficiency of Slave
Agriculture in the Antebellum South.” The American Economic Review, Vol. 67, No. 3 (Jun.),
pp. 275-296.
Gates, Paul W. 1996. The Jeffersonian Dream: Studies in the History of American Land Policy
and Development. Albuquerque : University of New Mexico Press. (Bogue, Allan G., and Margaret
Beattie Bogue, eds.)
Govan, Thomas P. 1978. Banking and the Credit System In Georgia, 1810-1860. New York: Arno
Press.
Güell, Maia, José V. Rodrı́guez, and Christopher I. Telmer. 2012. “Intergenerational Mobility and
the Informational Content of Surnames.” Unpublished manuscript. January.
Hacker, J. David. 2010. “Decennial Life Tables for the White Population of the United States,
1790-1900,” Historical Methods, Volume 43, Number 2 (April-June), pp. 45-79.
Hacker, J. David. 2013. “New Estimates of Census Coverage in the United States, 1850-1930.”
Social Science History, Volume 37, Number 1: 71-101.
Hankins, Scott, Mark Hoekstra, and Paige Marta Skiba. 2010. “The Ticket to Easy Street? The
Financial Consequences of Winning the Lottery.” Unpublished manuscript.
Haines, Michael R. 2010. Historical, Demographic, Economic, and Social Data: The United States,
1790-2002, Inter-university Consortium for Political and Social Research, icpsr.org, study 2896.
Imbens, Guido W., Donald B. Rubin and Bruce I. Sacerdote. 2001. “Estimating the Effect of
Unearned Income on Labor Earnings, Savings, and Consumption: Evidence from a Survey of
Lottery Players,” The American Economic Review, 91(4):778-794.
Khanna, Tarun. 2007. Billions of Entrepreneurs: How China and India are Reshaping Their
Futures–and Yours. Boston, Mass.: Harvard Business School Press.
Kotlikoff, Laurence J. 1979. “The Structure of Slave Prices in New Orleans, 1804 to 1862,” Economic Inquiry, 17(4):496-518, October.
McKenzie, David J., and Christopher Woodruff. 2006. “Do Entry Costs Provide an Empirical Basis
for Poverty Traps? Evidence from Mexican Microenterprises,” Economic Development and Cultural
Change, 55(1):3-42, October.
Minnesota Population Center. 2004. National Historical Geographic Information System: Prerelease Version 0.1, University of Minnesota, nhgis.org.
Ransom, Roger L. 2005. The Confederate States of America. New York: W.W. Norton & Co.
Ransom, Roger L., and Richard Sutch. 1973. “The Ex-Slave in the Post-Bellum South: A Study
of the Economic Impact of Racism in a Market Environment.” The Journal of Economic History,
Vol. 33, No. 1, The Tasks of Economic History (Mar.), pp. 131-148.
Rawls, John. 1971. A Theory of Justice. Cambridge, MA: Belknap Press.
Rosenzweig, Mark R., and Hans P. Binswanger. 1993. “Wealth, Weather Risk and the Composition
and Profitability of Agricultural Investments,” The Economic Journal, Vol. 103, No. 416 (Jan.),
pp. 56-78.
Schultz, Theodore W. 1975. “The Value of the Ability to Deal with Disequilibria,” Journal of
Economic Literature, 13(3):827-846.
Smith, James F. 1838. The Cherokee Land Lottery, Containing a Numerical List of the Names of
the Fortunate Drawers in Said Lottery, with an Engraved Map of Each District. New York: Harper
and Brothers.
Sparks, Earl Sylvester. 1932. History and theory of agricultural credit in the United States. New
York : Thomas Y. Crowell.
Steckel, Richard H. 1988. “The Health and Mortality of Women and Children, 1850-1860.” The
Journal of Economic History, Vol. 48, No. 2, The Tasks of Economic History (Jun.), pp. 333-345.
Steckel, Richard H. 1994. “Census Manuscript Schedules Matched with Property Tax Lists:
A Source of Information on Long-Term Trends in Wealth Inequality,” Historical Methods, 27:2
(Spring) pp. 71-85.
Steckel, Richard H. 1996. “The Age at Leaving Home in the United States, 1850-1860.” Social Science
History, Vol. 20, No. 4 (Winter), pp. 507-532.
Tostlebe, Alvin S. 1957. Capital in Agriculture. Princeton: Princeton University Press.
Weiman, David F. 1991. “Peopling the land by lottery? The market in public lands and the
regional differentiation of territory on the Georgia frontier.” The Journal of Economic History,
51(4):835-860.
Wilder, Laura Ingalls. 1971. The first four years. New York: Harper & Row.
Williamson, Samuel H. 2013. “Seven Ways to Compute the Relative Value of a U.S. Dollar Amount,
1774 to present,” MeasuringWorth.com, accessed March 15, 2013.
Wishart, David M. 1995. “Evidence of Surplus Production in the Cherokee Nation Prior to Removal.” The Journal of Economic History, Vol. 55, No. 1 (Mar.): 120-138.
Wright, Gavin. 1979. “The Efficiency of Slavery: Another Interpretation.” The American Economic Review, Vol. 69, No. 1 (Mar.), pp. 219-226.
Wright, Gavin, and Howard Kunreuther. 1975. “Cotton, Corn and Risk in the Nineteenth Century.” The Journal of Economic History, Vol. 35, No. 3 (Sep.), pp. 526-551.
Notes: This figure displays estimates from a quantile regression of the effect of winning the lottery ("treatment") on total
wealth in 1850. The points are the quantile-specific estimates of the treatment effect at various quantile points. The dashed
line is a local-polynomial-smoothed (Epanechnikov kernel, with a bandwidth of .11) estimate of the treatment effect. The
sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the
Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. An observation is coded as a
lottery winner ("treated") if he is a unique match to a name found on the list of winners published by Smith (1838); anyone
else in the sample is coded to zero ("control"). The dependent variable in this figure is total measured wealth in 1850, the
sum of real-estate wealth and slave holdings. Real-estate wealth is as reported on and transcribed from the manuscript pages
of the 1850 Census of Population. Slave wealth was estimated by linking the household to the 1850 Slave Schedule and
imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on the Schedule. The sample
size for this figure is 13094. Data sources and additional variable and sample definitions are found in the text and in the
appendices.
Figure 1: Quantile Regression Estimates on Total Wealth and Lottery Winning
Notes: This figure displays estimates of the distribution, decomposed by lottery status, of total 1850 wealth for the sample of lottery-eligible men. The sample consists of all household heads in the 1850
census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. An observation is coded as a
lottery winner ("treated") if he is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero ("control"). The dependent variable in this
figure is total measured wealth in 1850, the sum of real-estate wealth and slave holdings. Real-estate wealth is as reported on and transcribed from the manuscript pages of the 1850 Census of Population.
Slave wealth was estimated by linking the household to the 1850 Slave Schedule and imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on the Schedule. The
sample size for this figure is 13094. Panel A presents the probability distribution functions, estimated using the "kdensity" command in Stata. The vertical line denotes $100, the level below which some
enumerators censored real-estate wealth. The solid line in Panel B presents the differences in the cumulative distribution function between groups (treatment minus control), estimated using a linear
probability model (equation 1) of an indicator for being below 200 quantile cut-points for wealth. Heteroskedasticity-robust 95% confidence intervals are plotted with the dashed lines. Data sources and
additional variable and sample definitions are found in the text and in the appendices.
Panel B: Estimated Differences in the Cumulative Distribution Functions
Panel A: Estimated Probability Distribution Functions
Figure 2: Differences in the Distribution of Total Wealth Between Lottery Winners and Losers
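The construction of Panel B can be illustrated with a simplified sketch. It assumes no control variables, in which case the treatment coefficient from a linear probability model of an at-or-below-cut-point indicator equals the treatment-minus-control difference in the share at or below that cut-point. The actual estimates use equation 1, and confidence intervals are omitted here; the function names and data are hypothetical, and the toy call uses 3 cut-points rather than 200.

```python
from statistics import quantiles


def share_at_or_below(values, cut):
    """Empirical CDF of `values` evaluated at `cut`."""
    return sum(1 for v in values if v <= cut) / len(values)


def delta_cdf(treated, control, n_cuts=200):
    """Treatment-minus-control difference in empirical CDFs at pooled quantile cut-points.

    Without controls, each difference equals the treatment coefficient from a
    linear probability model of the indicator (wealth <= cut).
    """
    # statistics.quantiles returns n - 1 cut points, so request n_cuts + 1 groups.
    cuts = quantiles(treated + control, n=n_cuts + 1)
    return [
        (c, share_at_or_below(treated, c) - share_at_or_below(control, c))
        for c in cuts
    ]


# Made-up wealth values for illustration.
treated = [100, 500, 2000, 8000]
control = [100, 400, 900, 1500]
result = delta_cdf(treated, control, n_cuts=3)
diffs = [d for _, d in result]
```

In this toy example the difference is zero at the two lower cut-points and negative at the highest one, which is the qualitative pattern that arises when mass shifts toward the upper tail while the lower tail is unaffected.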
Notes: This figure compares treatment/control differences using a range of preferences for equity over total wealth. The summary statistic
for each group is computed using a constant-elasticity-of-substitution (CES) aggregator. The ratio (treatment divided by control) of this
statistic is indicated on the y axis. This ratio is computed for various values of the elasticity of substitution (rho), denoted on the x axis.
The graph displays the ratio and 95% confidence interval, computed with 5000 bootstrapped samples for each point in the rho grid. The
sample is the same as in Figure 1, Panel B. See the notes to Figure 1 for variable and sample definitions.
Figure 3: Total-Wealth Differences, Lottery Winners versus Losers, Under Various Tastes for Equity
Figure 4: Fraction with 1860 Wealth Less Than or Equal to $100 versus 1850 Wealth
Notes: This figure displays estimates of the fraction holding no more than $100 in 1860 as a function of 1850 wealth. The base sample consists of all household heads in the 1850 census with children
born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. An observation is coded as a lottery winner ("treated")
if he is a unique match to a name found on the list of winners published by Smith (1838). The dependent variable (on the y axis) is a dummy for whether the 1860 total wealth (personal plus real estate) is
less than or equal to $100. The independent variable (on the x axis) in this figure is total measured wealth in 1850, the sum of real-estate wealth and slave holdings. Real-estate wealth is as reported in
and transcribed from the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the household to the 1850 Slave Schedule and imputing a market value of slave
holdings adjusting for the reported ages and gender of slaves on the Schedule. The sample size for this figure is 5603. The dashed line displays a local-polynomial smoothed estimate of the indicated fraction
for each level of total wealth, and the grayed area denotes the 95% confidence interval. For reference, the solid gray line presents the probability distribution function for log total wealth (excluding the
imputation for those with zero wealth). These curves are estimated using, respectively, the "lpoly" and "kdensity" commands in Stata version 12. Data sources and additional variable and sample
definitions are found in the text and in the appendices.
Figure 5: Probability that Parcel is Claimed by 1838 versus Realized 1850 Wealth among Lottery Winners
Notes: This figure displays estimates of the fraction of parcels claimed as of 1838 (as reported in Smith, 1838) versus total 1850 wealth for the subsample of lottery winners only. (By definition, lottery
losers were not assigned parcels, so parcel characteristics are unavailable for the full sample.) The base sample consists of all household heads in the 1850 census with children born in Georgia during the
three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. An observation is coded as a lottery winner ("treated") if he is a unique match to a
name found on the list of winners published by Smith (1838). The dependent variable (on the y axis) is a dummy for whether the lottery winner's parcel had been claimed by the time of publication of the
Smith (1838) list. The independent variable (on the x axis) in this figure is total measured wealth in 1850, the sum of real-estate wealth and slave holdings. Real-estate wealth is as reported in and
transcribed from the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the household to the 1850 Slave Schedule and imputing a market value of slave holdings
adjusting for the reported ages and gender of slaves on the Schedule. The sample size for this figure is 1607. The solid line displays a local-polynomial smoothed estimate of the mean claim rate for each
level of total wealth, and the short-dashed lines denote 95% confidence intervals. For reference, the long-dashed line presents the probability distribution function for log total wealth (excluding the
imputation for those with zero wealth). These curves are estimated using, respectively, the "lpoly" and "kdensity" commands in Stata version 12. Data sources and additional variable and sample
definitions are found in the text and in the appendices.
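The smoothed curves described in these notes can be illustrated with a kernel-weighted local mean, which is a local polynomial of degree zero. The sketch below is only an illustration of the idea and not a reproduction of Stata's "lpoly" output: the function names, the bandwidth, and the data are hypothetical, and the Epanechnikov kernel is written in its textbook form.

```python
def epanechnikov(u):
    """Textbook Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_mean(x, y, grid, bandwidth):
    """Kernel-weighted mean of y at each grid point (degree-zero local polynomial)."""
    fitted = []
    for g in grid:
        weights = [epanechnikov((xi - g) / bandwidth) for xi in x]
        total = sum(weights)
        if total == 0.0:
            fitted.append(None)  # no observations within one bandwidth of g
        else:
            fitted.append(sum(w * yi for w, yi in zip(weights, y)) / total)
    return fitted


# Hypothetical data (not from the paper): log wealth and a 0/1 claim indicator.
log_wealth = [4.6, 5.3, 6.2, 6.9, 7.6, 8.5, 9.2]
claimed = [0, 0, 1, 0, 1, 1, 1]
smoothed = local_mean(log_wealth, claimed, grid=[5.0, 7.0, 9.0], bandwidth=1.5)
```

Each fitted value is a weighted share of claimed parcels among observations within one bandwidth of the grid point, so it always lies between 0 and 1.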
Table 1: Summary Statistics

| Variable | (1) Whole Sample | (2) Lottery “Losers” | (3) Lottery “Winners” | (4) p-value, mean difference [N] |
|---|---|---|---|---|
| Panel A: Lottery Winner or Loser | | | | |
| Dummy for unique match to Smith (1838) list | 0.124 (0.329) | 0 | 1 | --- |
| Dummy for match to Smith (1838), deflated to 1/n in case of ties | 0.155 (0.335) | 0.037 (0.121) | 0.995 (0.053) | 0.000 [14375] |
| Panel B: Predetermined Outcomes | | | | |
| Age, in years | 51.2 (8.5) | 51.3 (8.5) | 50.9 (8.6) | 0.122 [14375] |
| Born in Georgia | 0.497 (0.500) | 0.497 (0.500) | 0.498 (0.500) | 0.889 [14375] |
| Born in South Carolina | 0.212 (0.408) | 0.210 (0.407) | 0.222 (0.416) | 0.263 [14375] |
| Born in North Carolina | 0.180 (0.384) | 0.180 (0.384) | 0.178 (0.383) | 0.804 [14375] |
| Number of Georgia-born children in the three years prior to the lottery | 1.333 (0.542) | 1.333 (0.541) | 1.332 (0.542) | 0.910 [14375] |
| Cannot read and write | 0.147 (0.354) | 0.147 (0.354) | 0.142 (0.350) | 0.593 [14340] |
| Number of letters in surname | 6.19 (1.61) | 6.20 (1.62) | 6.13 (1.51) | 0.072 [14375] |
| Frequency with which surname appears in sample | 36.2 (46.3) | 36.3 (46.9) | 35.3 (41.9) | 0.380 [14375] |
| Surname begins with “M” or “O” | 0.101 (0.302) | 0.101 (0.301) | 0.104 (0.305) | 0.740 [14375] |
| Mean wealth of families in Georgia with same surname | 1186.3 (1257.8) | 1185.4 (1288.4) | 1192.3 (1021.8) | 0.811 [13848] |
| Median wealth of families in Georgia with same surname | 289.1 (716.6) | 290.0 (717.6) | 282.7 (709.9) | 0.686 [13848] |
| Mean illiteracy of adults in Georgia with same surname | 0.219 (0.107) | 0.219 (0.108) | 0.218 (0.098) | 0.648 [13848] |
| Panel C: Measures of Wealth in 1850 | | | | |
| Real-estate wealth | 1999.0 (4694.2) {0,650,2000} | 1970.8 (4422.0) {0,640,2000} | 2198.2 (6290.1) {0,700,2000} | 0.068 [13094] |
| Slave wealth | 1339.1 (5761.0) {0,0,0} | 1297.3 (5329.7) {0,0,0} | 1635.3 (8189.0) {0,0,326} | 0.021 [14375] |
| Total wealth (sum of wealth in real estate and slaves) | 3323.7 (8691.0) {100,800,3000} | 3245.5 (7952.9) {100,800,3000} | 3876.5 (12734.4) {100,1000,3550} | 0.006 [13094] |
Notes: This table displays summary statistics for the main data used in the present study. The sample consists of all household heads in the
1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of
Georgia during the same period. Column (1) presents means and standard deviations (in parentheses) of variables for this entire sample. We
use two measures of whether the person won land in the drawing for the Cherokee Land Lottery of 1832. The first measure is coded to 1 if
that person is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero.
The second measure takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. These
variables are summarized in Panel A. Columns (2) and (3) present means and standard deviations of variables for the subsamples of,
respectively, lottery losers and winners (decomposed using the first measure). Column (4) presents the p-value on the test of zero difference
in means between the subsamples of losers and winners. In square brackets, we report the sample size used for this test, although the tests
involving children or surnames adjust for the clustering of errors. With the exception of the measure of surname length, we use the Soundex
version of each name to account for minor spelling differences. For the variables that are means by surname, we use the 1850 100% census
file to construct average fertility, school attendance, and real-estate wealth among Georgia-resident households for each (soundex) surname.
(Those individuals that appear in our lottery-eligible sample are excluded from the construction of these indices.) Real-estate wealth is as
reported on and transcribed from the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the
household to the 1850 Slave Schedule and imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on
the Schedule. Numbers in curly brackets in Panel C are the 25th, 50th, and 75th percentiles of the respective wealth measures. Data sources
and additional variable and sample definitions are found in the text and in the appendices.
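The two lottery-status measures described in these notes can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the authors' matching procedure: the function name and the name keys are hypothetical, and a "tie" is taken to mean that n sample observations share one key that appears on the winners' list.

```python
from collections import Counter


def code_lottery_status(sample_keys, winner_keys):
    """Return (binary, fractional) treatment codes for each sample observation.

    sample_keys: one name-matching key per man in the sample (hypothetical format).
    winner_keys: set of keys that appear on the list of winners.
    """
    counts = Counter(sample_keys)  # how many sample men share each key
    binary, fractional = [], []
    for key in sample_keys:
        matched = key in winner_keys
        n = counts[key]
        # First measure: 1 only for a unique match, 0 for everyone else.
        binary.append(1 if matched and n == 1 else 0)
        # Second measure: men who tie with (n - 1) others are recoded to 1/n.
        fractional.append(1.0 / n if matched else 0.0)
    return binary, fractional


# Hypothetical keys: two sample men share the first key, so they tie.
sample = ["S530 John", "S530 John", "J520 Asa", "B650 Eli"]
winners = {"S530 John", "J520 Asa"}
binary, fractional = code_lottery_status(sample, winners)
```

In this toy example the two tied men are coded 0 under the first measure and 1/2 under the second, while the unique match is coded 1 under both.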
777.7
(310.7) **
0.146
(0.042) ***
Levels
Natural Logs
(3)
(4)
0.126
(0.043) ***
710.1
(325.4) **
0.121
(0.043) ***
632.4
(311.2) **
0.098
(0.049) **
593.6
(352.3) *
(5)
First letter
of
surname
0.147
(0.042) ***
749.8
(303.0) **
Number
of letters
in
surname
0.146
(0.042) ***
762.5
(310.5) **
Freq. of
surname
in sample
0.135
(0.042) ***
660.2
(300.2) **
Surname
0.112
(0.049) **
572.0
(335.6) *
Panel B: Allow for 1/n Matching to Smith (1838)
0.128
(0.043) ***
714.4
(319.5) **
Panel A: Binary Match to Smith (1838)
(2)
Given
name
0.158
(0.045) ***
922.7
(331.3) ***
0.142
(0.045) ***
855.1
(348.9) **
(6)
Surname;
Given
name
0.110
(0.053) **
645.6
(332.6) **
0.098
(0.053) *
677.8
(385.6) *
(7)
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning
the lottery is reported. The sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee
Land Lottery of 1832 and no children born outside of Georgia during the same period. The dependent variable in this table is total measured wealth.
This variable is the sum of real-estate wealth, which was reported to enumerators on the population schedule, and slave wealth, which was computed from the
slave schedule. Whether this variable enters the specification in levels or natural logs is indicated by the row headings. The sample size for the levels regressions is
13094, and is 10013 for the logs regressions. The baseline specification also includes dummies for age and for (state x county) of residence. Additional sets of
fixed effects are included in columns 2-7, as reported in the bottom row. In columns 4-7, we use the Soundex version of each name to account for minor
spelling differences. Two variables are constructed to measure whether the person was a lottery winner. The first measure, used in Panel A, is coded to 1 if that
person is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero. The second measure,
which is used in Panel B, takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. A single asterisk
denotes statistical significance at the 90% confidence level; double 95% and triple 99%. Data sources and additional variable and sample definitions are found
in the text and in the appendices.
None
0.127
(0.043) ***
Natural Logs
Additional Fixed-Effect Controls:
723.4
(325.3) **
Levels
(1)
Table 2: Lottery Status versus Total Wealth in 1850
Table 3: Lottery Status vs. Various Wealth Measures in 1850

| Dependent variable | (1) Binary match to Smith | (2) Allow 1/n match |
|---|---|---|
| Panel A: Total Wealth (N=13094) | | |
| Levels | 723.4 (325.3) ** | 777.7 (310.7) ** |
| Levels, Adjusted for Truncation of Lower Tail | 723.6 (325.2) ** | 777.6 (310.6) ** |
| Natural Logs (N=10013) | 0.127 (0.043) *** | 0.146 (0.042) *** |
| Natural Logs, Adjusted for Truncation of Lower Tail | 0.121 (0.049) *** | 0.142 (0.049) *** |
| Panel B: Quantiles of Total Wealth (N=13094) | | |
| Levels, 25th percentile | 0.0 (32.7) | 0.0 (25.7) |
| Levels, 50th percentile (median) | 200.0 (39.8) *** | 200.0 (26.7) *** |
| Levels, 75th percentile | 550.0 (109.7) *** | 511.8 (116.1) *** |
| Levels, 95th percentile | 1503.7 (1114.3) | 2022.1 (1076.6) ** |
| Panel C: Real-Estate Wealth (N=13094) | | |
| Levels | 286.3 (159.7) * | 295.2 (154.4) ** |
| Indicator for Wealth At Least $100 | 0.002 (0.011) | -0.003 (0.010) |
| Panel D: Slave Wealth (N=14375) | | |
| Levels | 391.8 (201.8) * | 431.8 (192.7) ** |
| Indicator for Wealth > 0 | 0.039 (0.011) *** | 0.052 (0.011) *** |
Notes: This table displays OLS estimates of equation (1) in the text, except for Panel B where a quantile regression is used. Each cell
presents results from a separate regression, and only the coefficient on winning the lottery is reported. The specification also includes
dummies for age and for (state x county) of residence. The sample consists of all household heads in the 1850 census with children born in
Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period.
The dependent variables are various measures of wealth, as indicated in the Panel descriptions and row headings. The wealth variable in
Panels A and B is the sum of real-estate wealth, which was reported to enumerators on the population schedule, and slave wealth, which
was computed from the slave schedule. Panel C reports results for real-estate wealth. Enumerators in 1850 were instructed to record such
wealth only if it were at least $100, which is the cutoff we used for analyzing the extensive margin in Panel C as well as for estimating the
truncated normal used to impute values below $100 in the truncation adjustment in Panel A. Two variables are constructed to measure
whether the person was a lottery winner. The first measure, used in Column (1), is coded to 1 if that person is a unique match to a name
found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero. The second measure, used in Column
(2), takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. A single asterisk
denotes statistical significance at the 90% confidence level; double 95% and triple 99%. Data sources and additional variable and sample
definitions are found in the text and in the appendices.
-0.016
(0.012)
-0.003
(0.014)
-0.004
(0.014)
0.014
(0.011)
0.012
(0.012)
0.004
(0.004)
0.000
(0.004)
0.003
(0.004)
-0.004
(0.012)
0.001
(0.014)
Dummy for match to Smith (1838),
deflated to 1/n in case of ties
Dummy for unique match to Smith
(1838) list
Dummy for match to Smith (1838),
deflated to 1/n in case of ties
Basic specification
Control for surname fixed effects
Number
SC-born
children,
prelottery
(4)
Resides in
Old
Cherokee
County
(5)
Real-estate
Wealth
($)
(6)
0.010
(0.007)
0.005
(0.007)
-15.6
(232.0)
-41.1
(236.2)
-0.004
(0.020)
-0.016
(0.021)
0.009
(0.008)
0.006
(0.008)
-72.6
-(72.6)
-93.3
(229.4)
Panel B: South Carolina, including surname fixed effects
-0.002
(0.018)
-0.019
(0.019)
Panel A: South Carolina, basic specification
Number
Ga.-born
children,
prelottery
(3)
0.015
(0.015)
0.005
(0.016)
0.001
(0.015)
0.001
(0.016)
Real-estate
Wealth
>$100
(7)
0.023
(0.008) ***
0.022
(0.008) ***
315.8
(146.8) ***
295.2
(154.4) *
0.002
(0.011)
0.002
(0.011)
0.021
(0.010) **
0.020
(0.009) **
0.012
(0.014)
0.012
(0.012)
0.004
(0.012)
0.007
(0.013)
Real-estate
Wealth
>$3000
(8)
Notes: This table displays estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on "winning the lottery" is reported. The sample for
Panels A and B consists of all households in the 1850 census with children born in South Carolina during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of
Georgia during the same period. The sample for Panel C, which repeats some results from earlier tables, uses households with Georgia-born children in this same window. We use two measures of
whether the person won land in the drawing for the Cherokee Land Lottery of 1832. The first measure is coded to 1 if that person is a unique match to a name found on the list of winners published by
Smith (1838); anyone else in the sample is coded to zero. The second measure takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. Note that
these are spurious measures for the South-Carolina samples because the birthplace of their children implies that they lived outside of Georgia at some point during the three years prior to the lottery,
and were therefore ineligible. The basic specification also includes dummies for age. The other specification used includes fixed effects for surname (soundex). The dependent variables are indicated
in the column headings. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors (shown in parentheses) are heteroskedasticity
robust and clustered on the lottery-eligible man when there are multiple observations per household. Data sources and additional variable and sample definitions are found in the text and in the
appendices.
0.009
(0.016)
0.002
(0.014)
Panel C: Analogous results for Georgia, dummy for unique match to Smith list
-0.017
(0.013)
0.001
(0.004)
Lottery-status variables:
Dummy for unique match to Smith
(1838) list
(2)
Born in
Georgia
(1)
Born in
South
Carolina
Dependent variables:
Table 4: Falsification test using South Carolina instead of Georgia to construct sample
0.115
(0.048) **
Dummy for unique match to
Smith (1838) list
(3)
(4)
(5)
13780
6.96E-6
(3.23E-6) **
12553
13780
0.000
(0.010)
7.09E-6
(3.26E-6) **
0.029
(0.010) ***
12553
0.083
(0.050) *
3.83E-5
(1.63E-5) **
0.120
(0.051) **
12553
44
(216)
(8)
Log median wealth
Fertility
(9)
(10)
(11)
12553
12553
25
(224)
361
(87) ***
12896
126
(223)
0.115
(0.051) **
0.117
(0.051) **
0.028
(0.009) ***
12553
0.043
(0.050)
12553
12896
13780
13780
0.004
(0.009)
12896
-0.028
(0.063)
-0.004
(0.052)
0.117
(0.050) **
0.028
(0.009) ***
(12)
12417
1438
(433) ***
13780
13780
0.004
(0.010)
14144
12595
1802
(932) *
0.118
(0.051) **
12595
-397
(390)
1631
(766) **
(15)
13005
0.115
(0.051) **
13005
-811
(650)
-2186
-2084
(227) *** (241) ***
826
(251) ***
Own illiteracy
(14)
12417
0.028
(0.010) ***
12417
-0.056
(0.051)
12595
0.028
(0.009) ***
12595
0.038
(0.049)
13005
0.034
(0.010) ***
13005
-0.024
(0.132)
0.444
0.411
0.911
0.679
-0.974
-0.971
(0.117) *** (0.119) *** (0.168) *** (0.172) *** (0.046) *** (0.049) ***
0.121
(0.051) **
12417
42
(198)
1260
(469) ***
713
(298) **
Literacy rate
(13)
14144
0.012
(0.013)
13615
13615
0.001
(0.010)
13824
13824
-0.005
(0.009)
14271
14271
-0.037
(0.025)
0.019
0.071
0.047
0.123
0.100
-0.108
-0.104
(0.009) ** (0.022) *** (0.023) ** (0.030) *** (0.032) *** (0.009) *** (0.009) ***
0.027
(0.009) ***
Panel C: Total wealth Greater than $5000
12553
0.009
(0.049)
0.042
0.038
0.022
0.019
0.017
(0.007) *** (0.007) *** (0.004) *** (0.004) *** (0.009) *
12553
12896
346
(304)
182
(224)
682
(283) **
709
(297) **
In school, ages 5-15
Panel B: Total wealth, Natural Logs, Adjusted for Truncation of Lower Tail
12553
443
(86) ***
710
(298) **
0.282
0.236
0.164
0.140
-0.063
(0.035) *** (0.036) *** (0.022) *** (0.022) *** (0.053)
12553
-12
(241)
684
(150) ***
0.158
0.144
(0.067) ** (0.062) **
821
(145) ***
707
(298) **
4.04E-5
(1.75E-5) **
(7)
Panel A: Total wealth, Levels, Adjusted for Truncation of Lower Tail
Mean log wealth
717
(298) **
12553
(6)
Additional variables constructed from surname averages in 1850 Census data from Georgia (excluding own realizations)
Mean wealth
(2)
Notes: This table displays OLS estimates of equations (2) and (3) in the text. This table departs from previous ones in the use of surname-specific characteristics to proxy for differences across extended families ("dynasties") in child and wealth outcomes. We use the
1850 100% census file to construct average fertility, school attendance, and real-estate wealth among Georgia-resident households for each (soundex) surname. (Those individuals that appear in our lottery-eligible sample are excluded from the construction of these
indices.) Each panel/column presents results from a separate regression. In addition to the displayed coefficients, regressions include dummies for age and for (state x county) of residence. The base sample for these regressions consists of all households in the 1850
census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. Households without a corresponding surname in the database of surname averages are excluded from
the regressions. The dependent variables are indicated in the column headings. A household is coded as a lottery winner if the head is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero. A single
asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors (shown in parentheses) are heteroskedasticity robust and clustered on the surname level to account for correlation induced by the surname-averages. Data
sources and additional variable and sample definitions are found in the text and in the appendices.
Number of observations
Interaction term (using z
score for surname variables)
14306
0.029
(0.009) ***
Dummy for unique match to
Smith (1838) list
Additional variable
13036
Number of observations
Interaction term (using z
score for surname variables)
Additional variable
13036
Number of observations
Interaction term (using z
score for surname variables)
Additional variable
Dummy for unique match to
Smith (1838) list
716
(232) ***
Baseline
(1)
Table 5: Interaction with Own Illiteracy and Surname Averages
Notes: This figure displays a map of the southeastern United States with information on the location (by county) in 1850 of the lottery-eligible households in our main sample. Black lines indicate the
1850 county boundaries, drawn from the NHGIS database. The area shaded in blue in northwest Georgia denotes old Cherokee County, which was allocated by the Cherokee Lottery of 1832. The
sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during
the same period. If households in our sample are resident in a county in 1850, we place a red dot at the county centroid. The area of a dot is proportional to the number of sample households resident in
that county. A minor fraction of sampled households resides in counties outside the frame of this map. Such households are included in the econometric analysis, but we zoom in on this region to make
the features legible in the map. Data sources and additional variable and sample definitions are found in the text and in the appendices.
Appendix Figure 1: Old Cherokee County and the 1850 Locations of the Sample
Notes: This figure displays estimates from a quantile regression of the effect of winning the lottery ("treatment") on total
wealth in 1850. The points are the quantile-specific estimates of the treatment effect at various quantile points. The dashed
line is a local-polynomial-smoothed (Epanechnikov kernel, with a bandwidth of .11) estimate of the treatment effect. The
sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the
Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. An observation is coded to a
lottery status using the 1/n match procedure as described in the text. The dependent variable in this figure is total measured
wealth in 1850, the sum of real-estate wealth and slave holdings. Real-estate wealth is as reported on and transcribed from
the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the household to the 1850
Slave Schedule and imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on the
Schedule. The sample size for this figure is 13094. Data sources and additional variable and sample definitions are found in
the text and in the appendices.
Appendix Figure 2: Replicate Figure 1 (Quantile Regressions) Using 1/n Match to Smith Instead
Appendix Figure 3: CDF Differences under Various Specifications
Panel A: Binary Match to Smith List, Baseline Specification (Shown in Figure 2, Panel B)
Panel B: Binary Match to Smith List, Bivariate Specification
Panel C: Binary Match to Smith List, Baseline Specification with Soundex Fixed Effects
Notes: Figure continues on next page.
Appendix Figure 3 (Continued): CDF Differences under Various Specifications
Panel D: 1/n Match to Smith List, Baseline Specification
Panel E: Include Capital Imputed by Occupation
Panel F: Include the Real-Estate Wealth of Possible Nearby Sons
Notes: This figure displays alternate estimates of the change in cumulative distribution functions in Figure 2, Panel B. The solid line in each Panel presents
the differences in the cumulative distribution function between groups (treatment minus control), estimated using a linear probability model (equation 1) of an
indicator for being below 200 quantile cut-points for wealth. Heteroskedasticity-robust 95% confidence intervals are plotted with the dashed lines. See the
notes for Figure 2 for definitions of the data and specifications.
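In the bivariate case, with no controls, the linear-probability coefficient on treatment for the indicator of lying below a cut-point is simply the difference in empirical CDFs at that cut-point. The sketch below illustrates that special case only. The function name and the wealth values are hypothetical, and "below" is taken here to mean "at or below", which is an assumption.

```python
def cdf_difference(treated, control, cutpoints):
    """Treated-minus-control difference in the share at or below each cut-point."""

    def share_at_or_below(values, cut):
        return sum(1 for v in values if v <= cut) / len(values)

    return [
        share_at_or_below(treated, cut) - share_at_or_below(control, cut)
        for cut in cutpoints
    ]


# Hypothetical wealth values (not from the paper).
treated = [0, 500, 1200, 4000]
control = [0, 300, 800, 1000]
diffs = cdf_difference(treated, control, cutpoints=[100, 1000, 5000])
```

A negative entry means that fewer treated than control observations lie at or below that cut-point, that is, mass has shifted upward among the treated.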
Appendix Figure 4: Wealth versus Age
Notes: This figure plots the average total 1850 wealth by age in the sample of lottery eligibles, except for the upper left-hand
panel, which displays the fraction of the sample in each age cell. The age-specific averages are denoted with the square
symbols. A quadratic fit (plus confidence interval) is displayed with the solid line (and associated shading). The sample and
data are defined as in Figures 1-3, except that we do not display results for the handful of observations with ages above 75.
Appendix Figure 5: Various Outcomes versus Realized 1850 Wealth, by Lottery Status
Panel A: Number of Children Born Post Lottery in 1850 Household
Panel B: Fraction with Possible Sons within 50 Lines of Household on the Census Manuscripts
Panel C: Fraction Residing in Old Cherokee County in 1850
Notes: Figure continues on next page.
Appendix Figure 5 (Continued): Various Outcomes versus Realized 1850 Wealth, by Lottery Status
Panel D: Occupation has Significant Physical-Capital Requirement
Panel E: 1830-50 Change in Log Population Density in 1850 County of Residence
Panel F: 1850 Log Population Density in 1850 County of Residence
Notes: This figure displays estimates for 1850 of the number of children and residence in Old Cherokee County versus total wealth, by lottery status. The base sample consists of all
household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during
the same period. An observation is coded as a lottery winner ("treated") if he is a unique match to a name found on the list of winners published by Smith (1838). The dependent
variable (on the y axis) is as indicated above. The independent variable (on the x axis) in this figure is total measured wealth in 1850, the sum of real-estate wealth and slave holdings.
Real-estate wealth is as reported in and transcribed from the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the household to the 1850
Slave Schedule and imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on the Schedule. The solid and dashed lines display local-polynomial
smoothed estimates of the means for each level of total wealth for the treated and controls, respectively. The gray shaded areas denote 95% confidence intervals for the treatment-group conditional mean. These curves are estimated using the "lpoly" command in Stata version 12. Data sources and additional variable and sample definitions are found in the text
and in the appendices.
0.004
(0.013)
---
---
0.022
(0.008) ***
---
---
4.560
(4.727)
4.654
(4.343)
4.265
(3.997)
4.320
(3.643)
Miles
East
(3)
-6.569
(2.761) **
-5.924
(2.781) **
-4.661
(2.306) **
-4.026
(2.211) *
Miles
North
(4)
Log of
Average
Farm Size
(9)
-0.007
(0.021)
-0.014
(0.017)
Panel A: Basic Specification
Log of
Farm
Value per
Acre
(8)
-0.017
(0.017)
Log of
Improved
Land
Ratio
(10)
0.009
(0.012)
-0.011
(0.022)
-0.005
(0.017)
-0.024
(0.018)
Panel B: Control for Surname Fixed Effects
0.011
(0.012)
Total
Fertility
Rate
(TFR19)
(7)
0.000
(0.026)
-0.001
(0.026)
Log
Slaves per
Area
(11)
0.005
(0.004)
0.007
(0.012)
-0.012
(0.022)
-0.004
(0.016)
-0.015
(0.017)
0.008
(0.025)
Panel C: Basic Specification, Control for Residence in Old Cherokee County
0.005
(0.004)
0.006
(0.004)
Total
Fertility
Rate
(TFR5)
(6)
-0.048
(0.029)
-0.057
(0.030) *
-0.045
(0.029)
Log Pop.
Density in
1850
(12)
-0.005
(0.003) *
0.004
(0.004)
0.005
(0.012)
-0.016
(0.022)
0.004
(0.016)
-0.022
(0.019)
0.009
(0.026)
-0.058
(0.030) *
Panel D: Control for Surname Fixed Effects, Control for Residence in Old Cherokee County
-0.004
(0.003)
-0.004
(0.003)
-0.003
(0.003)
School
Enroll.
Rate
(5)
-0.108
(0.054) **
-0.101
(0.052) *
-0.117
(0.054) **
-0.111
(0.052) **
Log Pop.
Density in
1830
(13)
0.066
(0.062)
0.072
(0.072)
0.066
(0.062)
0.072
(0.072)
Log
Fraction
Urban
(14)
0.002
(0.011)
-0.003
(0.011)
-0.001
(0.011)
-0.007
(0.011)
Access to
Water
Transport
(15)
0.014
(0.015)
0.018
(0.015)
0.015
(0.016)
0.018
(0.016)
Access to
Railroads
(16)
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning the lottery is reported. The basic specification (shown in Panel A) also includes dummies for age. The specification
used in Panel B includes fixed effects for surname (soundex). Panels C and D repeat specifications from Panels A and B, respectively, but also include a dummy variable for residence in Old Cherokee County. The sample consists of all household heads in the 1850 census
with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. The dependent variables are the locational county-specific characteristics denoted in the column headings.
Location data used in Columns 3 and 4 are county centroids computed from NHGIS data, and are converted into miles east or north of the NAD83 reference point in central Oklahoma. County data used in Columns 5-14 are drawn from ICPSR study #2896. The number of
observations for Columns 1-4 is 14375 and for Columns 5-14 is 14237 because of missing data for some (mostly unorganized) counties. A household is coded as a lottery winner if the head is a unique match to a name found on the list of winners published by Smith (1838);
anyone else in the sample is coded to zero. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors are heteroskedasticity robust and, in Columns 3-14, clustered at the (state x county) level to account for
multiple observations per county. Data sources and additional variable and sample definitions are found in the text and in the appendices.
0.005
(0.011)
Resides in
Georgia
Resides in
Old
Cherokee
County
0.022
(0.008) ***
(2)
(1)
Appendix Table 1: Differences in 1850-County-of-Residence Characteristics by Lottery Status
Appendix A: A Partial Characterization of Treatment Effects Across
the Counterfactual Distribution
In this section, we present a partial characterization of mean treatment effects across the distribution
of counterfactual outcomes. In plainer language, what were the returns to winning for those lottery
losers who ended up with a certain 1850 wealth? Recall that the quantile regressions answer
this question only under the assumption of rank stability. This assumption seems unpalatable,
particularly in light of the assertion that some of the poor could make profitable investments if only
they could raise capital. Instead, we attempt this analysis with minimal assumptions about the
mapping from one’s outcome assuming no treatment and one’s outcome if treated. (It has come to
our attention that this exercise can be called a partial-identification strategy.)
Let the potential outcomes for individual $i$'s total wealth be $w_{1i}$ and $w_{0i}$, where the subscript
1 denotes his 1850 wealth if he were treated with lottery winning and the subscript 0 denotes
his 1850 wealth if he did not win the lottery. Note that one of these variables is counterfactual:
we only observe one of the potential outcomes for a given person, who is either treated or not.
Nevertheless, we can postulate that they exist and, furthermore, that the potential outcomes can
be characterized by a joint distribution function, $F(w_1, w_0)$, and marginal distributions $g(w_1)$ and
$h(w_0)$.
For the remainder of this appendix, we consider a discretized version of the marginal potential-outcomes distributions ($g$ and $h$) that can be summarized with the vectors $x_1$ and $x_0$. The joint
distribution of the $w$ implies a Markov-process mapping that relates the discretized marginal distributions:
$$x_1' = x_0' P$$
where $P$ is the Markov matrix of conditional probabilities. Even with the normalization that
$\vec{1} = P \vec{1}$, this is a grossly underdetermined system, with $m^2$ unknowns and only $2m$ equations. The
elements of a Markov matrix being probabilities, it is also the case that $0 \le P \le 1$, where the
inequalities refer to each element of $P$. This adds another $2m^2$ constraints, although at most $m^2$
can be binding.
The vectors x are also unobserved because they depend on both the potential outcomes that
we can observe and on those that we cannot. However, because treatment was randomly assigned,
the expected values of the discretized marginals are equal between the potential outcomes of
the whole sample and the realized outcomes of the treatment group. Thus, we use the treatment-group
distribution to estimate $x_1$. Similarly, we use the control-group distribution to estimate $x_0$.
We can characterize the bounds on mean treatment effects using these equations. Note that
this system is linear in the elements of P . Therefore we can transform it into a linear-programming
problem that searches over the feasible set of possible P to find the extrema of some linear transformation of P . The expected treatment effect (conditional on the untreated potential outcome) is
itself linear in the elements of P .
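When only the marginal and non-negativity constraints are imposed, the extrema of this linear program can also be found by a sorting argument: the treated outcomes assigned to a given set of untreated bins can be any portion of the treated distribution with the same total mass, so the mean is lowest when that mass is drawn from the bottom of the treated distribution and highest when drawn from the top. The sketch below implements only that special case. It is an illustration and not the authors' code: the function name, bin values, and masses are hypothetical, and any further restriction on $P$ would require a general linear-programming solver instead.

```python
def bounds_on_mean_effect(values, x0, x1, subset):
    """Bounds on E[w1 - w0 | w0 bin in subset] using only the two marginals.

    values: wealth level representing each bin, in ascending order.
    x0, x1: control and treated probability mass in each bin.
    subset: indices of the untreated-outcome bins of interest.
    """
    q = sum(x0[i] for i in subset)  # probability mass of the subset
    mean_w0 = sum(x0[i] * values[i] for i in subset) / q

    def mean_of_extreme_mass(order):
        # Mean of the treated outcomes over mass q, filling bins in this order.
        remaining, total = q, 0.0
        for j in order:
            take = min(x1[j], remaining)
            total += take * values[j]
            remaining -= take
            if remaining <= 1e-12:
                break
        return total / q

    m = len(values)
    lowest = mean_of_extreme_mass(range(m))  # bottom q of treated mass
    highest = mean_of_extreme_mass(range(m - 1, -1, -1))  # top q of treated mass
    return lowest - mean_w0, highest - mean_w0


# Hypothetical three-bin example (not from the paper).
values = [0.0, 1000.0, 2000.0]
x0 = [0.5, 0.3, 0.2]
x1 = [0.5, 0.2, 0.3]
lower, upper = bounds_on_mean_effect(values, x0, x1, subset=[0])
```

In this toy example the bounds for the lowest untreated bin run from 0 to 1600, which shows how little the marginals alone pin down about any one part of the counterfactual distribution.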
These results are presented in Appendix Table 2. We discretize the data into M bins that
cover the extrema of the total-wealth data. The bins are of equal width in levels to account for
the expected value of lottery winnings being a shock in levels rather than logs. We start with a
grid of 100 bins (results shown in Panel A), but also verify that our results are similar using grids
of 50 or 200 (results shown in Panels B and C, respectively). The rows of each Panel present
results for various sets of constraints on P . In each row, we report on whether the linear program
has a feasible solution in the full sample (Column 1) and in what fraction of 1000 bootstrapped42
re-samples (Column 2). In the remaining columns, we report the lower and upper bounds on the
mean treatment effect for various subsets of the counterfactual, untreated state (i.e., for ranges
of $w_{0i}$). We consider mean treatment effects above and below $800 (roughly the median of total
wealth) as well as below $500 and below $400.
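The stratified bootstrap used for the feasibility shares can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and `check` stands in for whatever feasibility test is applied to each resample.

```python
import random


def stratified_resample(treated, control, rng):
    """Draw one bootstrap sample, resampling with replacement within each arm."""
    new_treated = [rng.choice(treated) for _ in treated]
    new_control = [rng.choice(control) for _ in control]
    return new_treated, new_control


def bootstrap_share(treated, control, check, draws=1000, seed=0):
    """Fraction of stratified bootstrap draws for which check(treated, control) is True."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(draws)
        if check(*stratified_resample(treated, control, rng))
    )
    return hits / draws


# Hypothetical wealth data and a placeholder check (not from the paper).
treated = [0, 500, 1200, 4000]
control = [0, 300, 800, 1000, 2500]
share = bootstrap_share(
    treated, control, check=lambda t, c: sum(t) / len(t) > sum(c) / len(c)
)
```

Because each arm is resampled separately, every draw keeps the original numbers of treated and control observations.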
The first row of each Panel, labelled “Basic only”, displays the results from imposing only the
constraints above. These are minimal bounds and therefore a feasible solution is found. As can be
seen in Columns 3–4, the bounds on returns for this case are quite wide. The upper bound on the
treatment effect for the lower tail is around $9000, while the lower bound is in the single digits.
The gap between upper and lower bounds for the above-$800 range is about as wide, but is shifted
down by around $6000. This basic restriction rules out very little in terms of explanations for the
similarity of the lower tails between control and treatment.
The next case, denoted “no worse off” in the second rows of each Panel, attempts to restrict
the treatment to be weakly positive, in an ex post sense. In other words, the constraint on P is that
everyone who was treated end up with higher wealth than they would have had if untreated.43 This
restriction, it turns out, is not feasible in the main data, nor in more than 99% of the bootstrapped
samples. We do not bother reporting the implied bounds for this case because this constraint would
42
The bootstrap is stratified by treatment status, so that the size of control and treatment groups remains constant
throughout the simulation.
43
For example, if we discretized using 5 bins, the modified upper-bound restriction would be
\[
P \le
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
with the diagonal and above-diagonal elements of P allowed to be up to unity, but the below-diagonal elements
pinned at zero.
seem to be violated in the data.
A related question to ask is what are the minimal and maximal fractions of the joint distribution
of potential outcomes that are below the diagonal of P ? The answer is that we can obtain feasible
solutions if as few as 0.2% are worse off from treatment. But we can also obtain feasible solutions
if as many as 76.6% are worse off from treatment.
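To make the "no worse off" restriction and the below-diagonal mass concrete, here is a small sketch. It assumes that rows of P index the untreated bin and columns index the treated bin, so that cells below the diagonal correspond to ending up in a lower bin when treated; the function names and the 2-by-2 example are hypothetical.

```python
def no_worse_off_mask(m):
    """Upper-triangular 0/1 matrix: cell (i, j) is allowed only if treated bin j >= untreated bin i."""
    return [[1 if j >= i else 0 for j in range(m)] for i in range(m)]


def share_worse_off(P):
    """Total mass of P strictly below the diagonal (treated bin lower than untreated bin)."""
    m = len(P)
    return sum(P[i][j] for i in range(m) for j in range(m) if j < i)


mask = no_worse_off_mask(5)  # the 5-bin case used as an example in the text
example_share = share_worse_off([[0.2, 0.1], [0.3, 0.4]])  # only the (1, 0) cell is below the diagonal
```

Minimizing or maximizing `share_worse_off` over the feasible set of P is itself a linear objective, which is why the same linear-programming machinery applies.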
In any event, this “no worse off” constraint may not be as appealing on a priori grounds in
that lottery winners might take on more risk and therefore some might end up worse off than if
they had instead lost the lottery. A more sensible restriction might be that lottery winners were
weakly better off when viewed from the perspective of the moments just after the lottery occurred. We
test this idea in the third rows of the Panels of Appendix Table 2. There we report results from
restricting the expected return from each and every grid cell to be (weakly) positive. Feasible
solutions exist in about 80% of the bootstrapped samples with this constraint applied. Also, by restricting the degree of churn
from above, the upper bound for the mean treatment effects is reduced in the lower tail of the
counterfactual, untreated distribution. For example, the highest feasible mean treatment effect is
$1505 for those who would have had a potential outcome of less than $500 in the untreated state (Panel
A, Column 5). This upper bound on treatment is greater than our estimate above of the average
parcel value won, but only by a factor of 1.67 or a log difference of 0.51. Divided over 18 years,
this reflects a maximal difference in returns of 2.85% per annum.
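The arithmetic behind the annualized figure can be checked directly. The snippet below uses only the ratio and the horizon quoted in the text; the variable names are illustrative.

```python
import math

ratio = 1.67                   # upper bound relative to the average parcel value (from the text)
years = 18                     # from the 1832 lottery to the 1850 census
log_diff = math.log(ratio)     # about 0.51
per_annum = log_diff / years   # about 0.0285, i.e. roughly 2.85% per annum
```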
Finally, we impose the restriction that no grid-cell’s expected benefit of treatment be greater
than $2000. We can find feasible solutions in this case for both the original data and for 100% of
the bootstrapped samples. Unfortunately, the results are not particularly informative about the
maximal returns in the lower tail; the upper bound that comes out is simply the upper bound that
we fed in.
In conclusion, with only basic assumptions, we cannot place sharp bounds on the effect of
treatment for those who would have ended up in the lower tail if untreated. Put another way, with
few restrictions, the data are consistent with very little churning or quite a lot of churning of one’s
position in the treated versus untreated distributions of potential outcomes. However, if we assume
that the expected value of treatment was positive throughout the distribution, the upper bound on
treatment effects in the lower tail of w0i is somewhat closer to our estimate of the average value of
land won.
Appendix Table 2: Bounds on Mean Treatment Effects for Subsets of the Counterfactual Distribution

                            Feasible solution?        Bounds on treatment effect for various counterfactuals [lower, upper]
                            (1)        (2)            (3)               (4)             (5)             (6)
                            In full    Percent in     At or above       Below           Below           Below
                            sample     bootstrap      $800              $800            $500            $400
                                       samples

Panel A: Discretize using 100-cell Grid
Basic only                  Yes        100.0%         [-6047, 1259]     [22, 7518]      [14, 8535]      [10, 9453]
No worse off                No         0.9%
Expected value positive     Yes        81.1%          [0, 1259]         [22, 1271]      [14, 1505]      [11, 1689]
Expected value ≤ $2000      Yes        100.0%         [-735, 1259]      [22, 2000]      [14, 2000]      [11, 2000]

Panel B: Discretize using 50-cell Grid
Basic only                  Yes        100.0%         [-6046, 1244]     [23, 7253]      [14, 8516]      [11, 9431]
No worse off                No         0.8%
Expected value positive     Yes        81.5%          [0, 1244]         [23, 1257]      [14, 1488]      [11, 1670]
Expected value ≤ $2000      Yes        100.0%         [-749, 1244]      [23, 2000]      [14, 2000]      [11, 2000]

Panel C: Discretize using 200-cell Grid
Basic only                  Yes        100.0%         [-5720, 1191]     [19, 7672]      [11, 9336]      [9, 10045]
No worse off                No         0.9%
Expected value positive     Yes        79.7%          [0, 1191]         [19, 1338]      [11, 1658]      [9, 1807]
Expected value ≤ $2000      Yes        100.0%         [-598, 1191]      [19, 2000]      [11, 2000]      [9, 2000]

Notes: This table displays results from a linear program consisting of (i) two vectors summarizing the discretized distributions of total wealth in the
control and treatment samples and (ii) auxiliary constraint matrices, specified in the first column and in the appendix. The number of bins used to
discretize the distributions is reported in the panel headings. Column (1) reports whether there exists a feasible solution to the linear program formed
from the various constraints and the two vectors using the distribution vectors constructed from the full sample. Results in Column (2) come from
1,000 bootstrapped samples, each of which was used to recompute the control and treatment distribution vectors and recast the linear program. The
remaining columns report lower and upper bounds on the expected (mean) effect of treatment for specified ranges of the distribution of potential
outcomes. These effects are weighted by the control distribution within the specified range. See the appendix for details on this procedure. The
underlying sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land
Lottery of 1832 and no children born outside of Georgia during the same period. The outcome variable is "total wealth," as defined above. An
individual is coded as treated if he is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is
coded as control. Data sources and additional variable and sample definitions are found in the text.
Shocking Behavior:
Random Wealth in Antebellum Georgia
and Human Capital Across Generations
Hoyt Bleakley† and Joseph Ferrie††
First version: March 19, 2010
This version: August 20, 2013
Abstract
Does the lack of wealth constrain parents’ investments in the human capital of their
descendants? We conduct a fifty-year followup of an episode in which such constraints
would have been plausibly relaxed by a random allocation of wealth to families. We
track descendants of those eligible to win in Georgia’s Cherokee Land Lottery of 1832,
which had nearly universal participation among adult white males. Winners received
close to the median level of wealth – a large financial windfall orthogonal to parents’
underlying characteristics that might have also affected their children’s human capital.
Although winners had slightly more children than non-winners, they did not send them
to school more. Sons of winners have no better adult outcomes (wealth, income,
literacy) than the sons of non-winners, and winners’ grandchildren do not have higher
literacy or school attendance than non-winners’ grandchildren. This suggests only a
limited role for family financial resources in the formation of human capital in the next
generations in this environment and a potentially more important role for other factors
that persist through family lines.
For comments on early versions, we thank Lou Cain, Greg Clark, Robert Pollack, Chris Roudiez,
Rachel Soloveichik, and seminar and conference participants at the NBER Cohort Studies meeting, the
Stanford SITE meeting, the University of Washington, the University of Michigan, the University of
Chicago, and the Minnesota Population Center. We also thank Steven Ruggles for access to
a preliminary version of the 1850 full-count file. Bleakley gratefully acknowledges research support from
the Center on Aging (which provided a pilot grant), the Center for Population Economics (funded under
National Institutes of Health Grant P01 AG010120), the Stigler Center, and the Deans’ Office of the Booth
School of Business, all at the University of Chicago.
†
University of Chicago & NBER. Email: [email protected].
††
Northwestern University & NBER. Email: [email protected].
1. Introduction
The role of parents’ resources in shaping the human capital of their children has been a central
concern of economists since the work of Schumpeter (1951) and of social scientists more generally
since Dewey (1889).1 The extent to which parents are constrained in this process is of particular
concern when capital markets are imperfect and parents cannot fully borrow against the future labor
earnings of their children. In such situations, much of the productive potential of the children of
resource-constrained parents may go unrealized, leaving both them and society as a whole poorer
as a result. Nevertheless, the importance of such constraints is difficult to assess in the absence of
randomized perturbations to family wealth.
In this study, we assess the impact over several generations of a large, exogenous random
shock to family wealth that had the potential to relax the constraint faced by parents and allow them
to make investments in the human capital of their children that they would not have otherwise been
able to afford. Both the size of the windfall we study (close to the median wealth at the time) and the
near-universal participation in the system that provided this windfall to roughly 15% of adult white
males make this setting unique until the advent of large-scale income-based social experiments in
the late twentieth century.
Parental wealth might predict child outcomes for reasons other than borrowing constraints.
Though more advantaged parents may simply have more assets (e.g. financial or human capital) that
they can directly transfer to their children, their advantages might result as well from underlying
characteristics (e.g. ability, ambition, or access to superior investment opportunities) that have led
to their accumulation of more assets. If these underlying characteristics are passed on to their
1
See the summary in Becker and Tomes (1986).
children, they would exhibit superior outcomes regardless of their parents’ direct investments in
them.
The design of effective interventions depends on what plays a greater role in human-capital
formation: financial constraints or the household’s underlying characteristics. For example, if
investment in the next generation is governed primarily by the resources parents have available,
policies that relax that constraint will lead to greater investment.2 But if children’s human capital is
instead the result of particular characteristics transmitted to them by their parents, policies that
merely relax the financial resource constraint will prove ineffective in generating additional
investment in children. The challenge is determining whether parents’ financial and human capital
are crucial inputs themselves in producing children’s human capital, or whether these resources are
merely indicators of higher underlying productivity transmitted across generations.
Previous research seeking to disentangle the various intergenerational determinants of human
capital (Chevalier 2004, Black et al. 2005, Oreopoulos et al. 2006) has focused on identifying
plausibly exogenous policy changes that forced parents to invest more in their children’s human
capital than otherwise (e.g. the imposition of compulsory school attendance laws or raising school
leaving ages). In the present study, we focus instead on a random shock to financial resources to
separate these effects. If the household’s resources are expanded through a random wealth shock but
children’s human capital is nonetheless unaffected, this is evidence for the role of underlying
2
In an extreme case, a “poverty trap” (Galor and Zeira 1993), parents cannot make even small
investments in their children’s human capital because of a fixed cost, and they cannot make large investments
because of the binding financing constraint and their inability to borrow against their children’s future
earnings. In this situation, large transfers may be necessary to move families past the threshold at which it
becomes feasible to begin investing in their children.
characteristics rather than parents’ resources in shaping children’s outcomes, as posited by Clark and
Cummins (2013) in the British context.
We examine the results of a large-scale lottery in the mid-nineteenth century in the U.S. state
of Georgia (the 1832 Cherokee Land Lottery).3 At this time in Georgia, it was already clear that
parents’ resources were linked to the human capital outcomes of their children and even their
grandchildren: correlations were both substantively large and statistically significant between
parental resources and the school attendance, literacy, wealth, and occupations of their children and
grandchildren.4 In this setting, we analyze the effects of random disbursements of wealth on fertility
and human-capital investments over a long horizon.
This lottery has several advantages when it comes to researching this question. First, the lottery
generated a shock to an individual’s wealth that we can plausibly expect was exogenous to his
characteristics. Second, registration in this lottery was cheap and widespread (over 97% of those
eligible were registered, by our calculations), unlike many studies of lotteries whose participants are
a selective subset of the population. Third, the prize in this lottery was a claim on a parcel of land
with an average value close to the median wealth of the period. Fourth, lottery winnings were
essentially a pure wealth shock – there was no homesteading requirement and the claim could be
readily liquidated without even setting foot on the land. Finally, we are able to undertake a long-run
followup on the effects of this wealth shock: we are able to examine family outcomes over nearly
fifty years after the wealth disbursement.
3
See Figure 1 for the location of Old Cherokee County, the area settled through this lottery.
4
See Section 8.2 below.
In the first three decades of the nineteenth century, Georgia used lotteries to allocate over
two-thirds of its area to white settlers. Following several large corruption scandals in Georgia in the
1790s, this peculiar manner of opening land was chosen because of its relative incorruptibility.
(Section 2 presents background details on the Cherokee Land Lottery and discusses related work in
the literature on intergenerational transmission.) We conduct a follow-up on these random wealth
shocks using a sample (drawn from the 1850 Census) of over 14,000 men eligible to win land in the
1832 Land Lottery. From this sample of eligibles, we identify winners using a list published by the
state of Georgia (Smith, 1838). Those found in the list comprise the treatment group, and the lottery
eligibles not found in the list serve as a control group. Note that we cannot verify that all of the men
in our sample of eligibles did in fact register for the lottery, but our calculations (seen in Section 3)
indicate that non-registrants were a small minority. Further, in our sample, lottery losers look similar to lottery
winners in a series of balancing tests using outcomes determined prior to the lottery and placebo
regressions using a sample drawn from South Carolina instead. (These results are found in Sections
3 and 8.3, respectively.) We compute that the net-present-value of land won was at least several
hundred dollars and perhaps as high as $1,000 in 1850 dollars. Further, we estimate with 1850 data
that winners were, on average, $700 richer than the controls, almost two decades after the lottery.5
(We present the estimation strategy, main regression equations, and estimates for 1850 wealth in
Section 4.)
To measure the long-term effect of wealth on investments in children, we collect information
from various years of the Census manuscripts. (Further description of the data is found in Section
3.) Existing indices of the census contain information on net fertility, residence, spousal age, etc.
5
An unskilled laborer in the South Atlantic region earned $0.74 per day in 1849 (Margo and
Villaflor 1987, p. 880), so $700 in 1850 represented 945 days of work by an unskilled worker.
Adding to this, we transcribed information on wealth, literacy, and school attendance from the 1850
Census manuscripts, where children present in the household are recorded with their lottery-eligible
fathers. We also link a subsample of these children to the 1870 and 1880 Census manuscripts to
observe their outcomes as adults, as well as the outcomes of any grandchildren present by 1880.
When compared to a similar population that did not win the lottery, winners had only slightly
higher (post-lottery) fertility but were no more likely to send their children to school. The fertility
estimates, however, do not imply a particularly steep Engel curve for the number of children. The
result for school attendance, on the other hand, suggests that the lack of paternal wealth was not a
significant impediment to investing in a child’s education. We show that the increase in
(post-lottery) fertility took place along both intensive and extensive margins. In contrast, we do not
find a significant result for schooling when decomposing the result by age or gender of the children.
We also show that these results (and all others of the paper) are robust to controlling for various
factors, including characteristics of the person’s name. The latter strengthens our conclusions in that,
although we used the name to link to the list of winners, this does not appear to bias our estimate of
the treatment effect. Further, while contextual influences on fertility and school attendance are no
doubt important, they are not apparently an important mechanism for these results: effects are not
sensitive to controlling for county of residence, nor do we find evidence that lottery winners move
to counties with unusual fertility, schooling, land value, slave intensity, farm sizes, land
improvement, urbanization, or transport access.
Effects of a random shock to paternal wealth on sons’ human capital do not manifest
themselves as we follow these children into adulthood in 1870 and 1880. School attendance is an
imperfect proxy for human-capital investments and some aspects of human capital may not appear
until adulthood. Linkage forward also allows us to determine if any fertility effect persists across
generations. The sons’ 1870 wealth (in real estate and/or personal property) is not statistically
distinguishable between control and treatment groups. Nevertheless, a mechanical split of the ‘extra’
1850 paternal wealth among his children would suggest a treatment effect in 1870 of $140 (in 1850
dollars), which we can reject for reasonable discount rates. Further, in 1880, we do not find
differences in occupational standing or literacy as a function of their father’s lottery status. In 1880,
the grandchildren themselves do not have significantly greater literacy or school attendance if their
grandfather was a lottery winner. If anything, the grandchildren of the treated are less likely to be
in school in 1880, some five decades after the lottery. Treated families have fewer grandchildren
per son in 1880, which roughly offsets the small fertility effect in the previous generation, leaving
a statistically similar number of grandchildren by lottery status. In other words, the additional wealth
causes a one-generation blip to the size of that dynasty.
The failure of lottery winners to invest more in their children’s human capital is not the result
of a lack of a substantial return on such investment. Cross-sectional comparisons show signs of
returns to skill in nineteenth-century Georgia. This evidence is found in Section 8.1. We show that
literacy and childhood school attendance predict adult wealth and sons with more siblings tend to
have worse adult outcomes. Whether this reflects a causal effect is uncertain, but the standard
methodologies for measuring these relationships indicate their presence in one form or another in
the context of our study. The presence of positive returns and the absence of an effect of lottery
winning indicate that parents did not use the wealth to relax a financial resource constraint. The
results are instead consistent with the presence of deeper, underlying characteristics that persist
through family lines and are associated with superior outcomes, like those posited by Clark and
Cummins (2013). Section 8.2 provides evidence on correlations between parental resources and
children’s human capital outcomes. It also provides suggestive evidence for the presence of linkages
that persist through family lines in the form of regressions on individual-level outcomes (wealth,
fertility, human capital) using the average values of these variables for all individuals (except the
individual in our sample) with the same surname. The results are also not the product of biases
introduced in the process of generating our data: a placebo exercise in Section 8.3 uses the same
procedure we employed but applied to South Carolina, for which any data linkages generated should
be entirely spurious, and finds no effect of lottery winning. Finally, we perform a simulation in
Section 8.4 using our control sample, and find that the cross-sectional relationships between parental
wealth and human capital are not consistent with our estimates derived from the lottery treatment.6
Section 9 concludes the study.
2. Background
2.1. The Cherokee Land Lottery in Northwest Georgia, 1832
Georgia, unlike most U.S. states, placed most of its land in the hands of the public through a
series of land lotteries.7 At its origin, the colony of Georgia was located along the Savannah River,
where most of the land was distributed through the headright system, in which arriving settlers were
given land in proportion to the number of individuals they brought with them. The impetus for the
lottery system as a new means of distributing land was a widely-reported corruption scandal, the
Yazoo Land Fraud of the 1790s. The legislature, in response to the public uproar brought on by the
scandal, introduced lotteries as an ostensibly fair and transparent system to distribute the state’s land,
6
The cross-sectional relationships for human capital and wealth are present in our control sample
and similar to modern intergenerational correlations despite the Civil War’s occurrence between the 1832
lottery and when the children of 1832 lottery participants are observed as adults.
7
This summary of early Georgia land policy is drawn from Cadle (1991, pp. 60-108 and 267-283).
beginning in 1805. As new land was acquired by the state in treaties concluded with the indigenous
population, new lotteries were conducted, the last following the eviction of the Cherokee from
northwest Georgia beginning in the early 1830s.
It is this last Georgia lottery in 1832 on which we focus our attention in this study. A list of the
lottery’s winners was easy to obtain, and the late date of the lottery makes it possible for us to
identify the lottery’s winners and losers in the first U.S. Census (1850) with complete names, ages,
and birthplaces for all household members, as well as measures of literacy, school attendance, and
real estate and slave wealth. Once individual lottery winners and losers were located in 1850, it was
straightforward to locate their children in 1850, 1860, 1870, and 1880, and their grandchildren in
1880.
The lottery’s rules were simple: every male age 18 and older who had resided in Georgia for
the three years prior to the 1832 drawing was eligible for one draw (Cadle 1991, pp. 267-283). Some
others (widows, orphans, and military veterans) were entitled to two draws. Our inability to identify
these groups in our treatment and control groups, as well as the small numbers represented by these
groups, led us to exclude them from the subsequent analysis. Finally, members of a band of outlaws
known collectively as “The Pony Club” were excluded from participation in the lottery and are
ignored in our analysis. In theory, individuals who had won land in prior lotteries conducted by the
state were also excluded. A nominal fee of 12.5¢ was imposed on registrants. In view of the near-universal participation rate (see below), we doubt that these latter two constraints were rigorously
enforced.
We can estimate the participation rate in the 1832 lottery. The state’s 1830 census returns
report 77,968 white males age 15+.8 Using data in Cadle (1991), we estimate that just under 76,000
males registered for single draws, a number only 2.8% lower than the population of males in the
eligible age range by 1832 based on the 1830 census. The small discrepancy can be accounted for
by out-migration and mortality between 1830 and 1832, combined with those aged 15 at the time
of the 1830 census who had not reached 18 by the time of the 1832 lottery. Thus, the participation
rate among the eligible population was extremely high – in fact, close to universal (from 97.2 percent
to 99.5 percent depending on the assumptions we make). Smith (1838) reports that there were 15,000
winners in the 1832 lottery, excluding widows and orphans, which corresponds to a winning rate of
roughly 19%, only slightly higher than rates observed in Columbia County (16.0%) and Oglethorpe
County (16.8%) where the full lists of lottery participants and winners have survived.
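The participation and winning rates quoted above can be roughly reproduced from the rounded figures in the text. This is a back-of-envelope sketch, not the authors' calculation: the registrant count is backed out from the stated 2.8% shortfall rather than taken from Cadle (1991), and the variable names are illustrative.

```python
pop_1830 = 77_968    # white males age 15+, 1830 census (from the text)
shortfall = 0.028    # registrants are 2.8% below that count (from the text)
winners = 15_000     # Smith (1838), excluding widows and orphans (from the text)

registrants = pop_1830 * (1 - shortfall)        # just under 76,000
participation_floor = registrants / pop_1830    # 0.972, the low end of the quoted range
win_rate = winners / registrants                # a little under one in five
```

The higher end of the quoted participation range depends on adjustments for migration, mortality, and those not yet 18 by 1832, which are not reproduced here.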
As Cherokee County was being surveyed in preparation for the lottery, lists of those eligible
to participate were forwarded to the state capital at Milledgeville. The survey divided what was to
become the 10 northwestern counties of present-day Georgia into four sections, each of which was
then divided into districts (generally square in shape, though less regularly-shaped right along the
border between Cherokee County and both the older-settled region to the east and the Chattahoochee
River to the county’s southeast). The districts were then further divided into parcels of 160 acres,
with 324 parcels in each of the square districts.
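The survey geometry implied by these numbers can be worked out in a few lines. The sketch assumes square parcels and uses the standard conversion of 640 acres per square mile; the resulting 18-by-18 layout and 9-mile district side are inferences from the figures in the text, not statements made by the authors.

```python
import math

ACRES_PER_SQ_MILE = 640
parcel_acres = 160            # from the text
parcels_per_district = 324    # from the text

district_acres = parcel_acres * parcels_per_district      # 51,840 acres
district_sq_miles = district_acres / ACRES_PER_SQ_MILE    # 81 square miles
side_parcels = math.isqrt(parcels_per_district)           # 18 parcels along each side
side_miles = math.sqrt(district_sq_miles)                 # 9 miles along each side
```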
The lottery itself was conducted to ensure the greatest possible transparency, with a slip of
paper for each participant placed in one large barrel and a slip of paper for each plot placed in a
second barrel, which also contained enough additional blank slips so the number of slips was the
same in each barrel. Names and parcels (or blank slips) were then drawn simultaneously from the
8
Calculated from ICPSR Study 2896 (Haines, 2010). These figures also appear in Bleakley and
Ferrie (2013).
two barrels until both were empty. As a result, winning was random and so was the quality of the
parcel won among winners.
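The two-barrel procedure described above can be mimicked in a short simulation. This is a hypothetical sketch: the function name, the use of `None` for a blank slip, and the seed are illustrative choices, and it assumes there are at least as many participants as parcels.

```python
import random


def run_lottery(names, parcels, seed=0):
    """Pair every name with either a parcel or a blank slip (None), uniformly at random."""
    rng = random.Random(seed)
    names = list(names)
    # Second barrel: one slip per parcel, padded with blanks to match the number of names.
    slips = list(parcels) + [None] * (len(names) - len(parcels))
    rng.shuffle(names)
    rng.shuffle(slips)
    # Drawing simultaneously from both barrels until empty amounts to pairing the two shuffles.
    return dict(zip(names, slips))


draws = run_lottery([f"participant_{i}" for i in range(10)], ["parcel_A", "parcel_B"])
winners_only = {name: p for name, p in draws.items() if p is not None}
```

Because both barrels are shuffled, who wins and which parcel each winner receives are both random, which is the property the paragraph above relies on.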
Once the lottery was completed, winners could immediately sell their winning draw. Unlike
land distributions in many Midwestern states, there was no requirement that the recipient spend any
time on the land or make any improvements whatsoever. The only requirement imposed was that
winners register their claim and pay an $18 registration fee to the state.9 The land could not be
immediately occupied, however, as the Cherokee Nation was engaged in legal action to fight their
eviction, and the final ruling in favor of the state did not come until 1838. As a result, some lottery
winners may have exercised their option of immediately “flipping” their property.
We estimate that the value of a winning draw was perhaps as high as $700 in 1850 for a 160-acre parcel. This $700 figure is based on the value of farmland in the ten counties of northwestern
Georgia in 1850, minus the value of implements and machinery.10 In Table 2 below, we find that
winners were in fact $700 wealthier than losers by 1850 – the equivalent of more than 900 days of
earnings for an unskilled laborer in the South at this time. Even if they sold their parcel between
1832 and 1850 and bought land that rose in value at a similar rate, we would also expect them to be
wealthier in 1850 than lottery losers. Those who sold out before the uncertainty over the timing of
the expulsion of the Cherokee was resolved might have received somewhat less than this, but the timing was the
only source of uncertainty in this process, as the Indian Removal Act of 1830 under which the
eviction was conducted had already been applied elsewhere.11
9
The registration fee need not have been an obstacle to liquidity-constrained winners in that there
were many who simply sold the claim itself.
10
See Bleakley and Ferrie (2013) for the details of this calculation.
11
The bottom third of Cherokee County was thought to contain gold, and was distributed in a
separate lottery in smaller, 40-acre parcels. We focus only on the main lottery of 160-acre parcels.
2.2. Related literature
The literature on the effect of parental resources on child outcomes is so large that we cannot
possibly do it justice here. One could start with the claims of Malthus (1806). As presented by
Becker (1992), the simple Malthusian idea was that income was a “preventive check” that was the
main constraint on fertility. Post-Malthus, however, many societies experienced the Demographic
Transition in which fertility declined while human-capital investments took off. Becker argued that
a simple trade-off between child quantity and child quality was at play during this transition.
Once opportunities for investing in children’s human capital become available, it is possible
to imagine how parents’ circumstances affect the outcomes of each of their children (rather than
merely the total number of their children). Becker and Tomes (1986) modeled the decision made by
parents to invest in their children, subject to a budget constraint and the presence of “a family’s
cultural and genetic ‘infrastructure.’” (1986, p. S6). In this setting, wealthier and better-educated
parents face a different budget constraint than poorer and less-educated parents, resulting in a
correlation in outcomes across generations even if all families possess the same “infrastructure.”
Conversely, parents facing identical budget constraints might also see different outcomes for their
children if their “infrastructures” are different. Some of the advantages enjoyed by certain parents
might be dissipated (i.e., not exclusively generate better per-child outcomes) to the extent that they
result in greater fertility. Becker and Tomes (1986) predict on the basis of studies available in 1986
that any earnings advantage would be erased within three generations.
The existence of such intergenerational correlations in outcomes in the nineteenth century U.S.
is clear: for example, Long and Ferrie (2013) show the links between the occupations of fathers and
sons in the U.S., 1850-80. Sacerdote (2005) examines father-son links after the Civil War and finds
that it took roughly two generations for the descendants of those born into slavery in the U.S. (who
faced severely limited opportunities for human or financial capital accumulation) to converge to the
human capital outcomes of blacks who were born free. In Britain, Clark and Cummins (2013) use
evidence from rare surnames to show how advantages in educational opportunities (attendance at
Oxford and Cambridge) persist for eight centuries.
A large number of contemporary studies have examined correlations in human capital across
generations (see Black et al. 2005 for a summary of several such studies). Oreopoulos et al. (2006)
use a change in compulsory schooling laws in the U.S. and show that parents’ education has a causal
impact on children’s education. Black et al. (2005) use a large sample of twins and a change in
schooling policy that was rolled out only gradually across Norway to isolate the effect of parents’
own human capital on that of their children from unobserved family effects, finding that the latter
(including the genetic inheritance of ability) were most important. What all of these studies have in
common is an interest in separating the effect of parents’ outcomes (e.g. higher educational
attainment) from the effect of their ability and its effect on their own education and then on their
children’s. The focus in these exercises has largely been upon finding plausible exogenous variation
in educational attainment that is not the product of variation in ability or other unobserved family
characteristics.
We are not, however, acquainted with any study that uses random variation in wealth to study
human capital transmission at the start of the demographic transition, nor are we aware of one that
follows up on random wealth shocks over such a long horizon (five decades). We examine just such
an “experiment” – outcomes for families that participated in a large-scale lottery with a significant
prize awarded to a large number of winners. Since the lottery took place well after parents had
completed their own schooling, there was little opportunity for the outcome to alter their own human
capital stock. Instead, it should have relaxed the budget constraint faced by poorer households and
allowed them to invest more in the human capital of their children. If human capital was unaffected
in the next generations, this is evidence in favor of the view (recently advanced by Clark and
Cummins 2013) that a substantial portion of the intergenerational correlation in outcomes is driven
by fundamental, family-specific effects (the “family’s cultural and genetic ‘infrastructure’” in the
Becker and Tomes 1986 model) and that the latitude to improve mobility across generations through
interventions that address only the outcomes themselves (e.g. improving parents’ or children’s
education) is severely limited.
Our results also relate to a literature analyzing so-called Conditional Cash Transfer (CCT)
programs that have become quite popular in the contemporary developing world. While the 1832
lottery was a wealth shock with no strings attached, a CCT is an ongoing payment conditioned on
certain behaviors, such as sending one’s children to school. (See Das, Do, and Özler, 2005 for a
review.) There is mixed evidence on whether the transfer itself promotes school attendance if such
conditionality is removed (Baird, McIntosh, and Özler, 2011; Akresh, de Walque, and Kazianga,
2013). Nevertheless, work in Brazil by Bursztyn and Coffman (2012) is not consistent with the idea
that school investments are held back by imperfectly altruistic parents who cannot borrow against
the future earnings of their children versus some other within-household bargaining problem.
3. Data
3.1 Data Sources and Construction
The present study follows up on the outcomes of lottery winners and losers and their children
and grandchildren. In order to do this, we first need to identify who was eligible, and who then won.
We find these individuals, their children, and their children’s children in later, publicly available data
sources, and ascertain their outcomes. We initially search for these individuals in the Census
manuscripts of 1850-1880 using a preliminary version of the full-count file for the 1850 census from
the IPUMS project, the full-count file for the 1880 census from the NAPP project, and also indexes
to the 1860 and 1870 censuses searchable on Ancestry.com.
The names of winners in the 1832 Georgia land lottery were published in Smith (1838), who
lists each parcel in the 1832 lottery area and the name of the winner of that parcel, as well as the
county and minor civil division where the winner resided in 1832. A version of this list was obtained
on-line from accessgenealogy.com. It was compared against a copy of the Smith book that was
converted to a computer file using an OCR program and any discrepancies were resolved.
Although we possess a list of winners, there is no surviving state-wide list of all participants
from which we could construct a control population to compare to those treated by winning the
lottery. To create the control population, we exploited the lottery’s eligibility requirements and
information available in the 1850 Census of Population, which identifies all household members by
name, age, and state of birth. The bulk of those eligible to participate in the lottery had to have been
males age 18+ in 1832 who had been present in Georgia continuously over the preceding three years.
Using the full-count file created by the transcription of the 1850 census, we identified all white males
who would have been age 18+ in 1832 and who had at least one child who was born in Georgia
1829-1832 and who had no children born outside Georgia in the same interval. There were 14,306
individuals who met these criteria, 1,758 of whom were subsequently located – using their surname
and given name – in the Smith (1838) list of lottery winners.12 For both the control and treated
populations, information reported in the 1850 census (county of residence, marital status, number
and ages of all children, occupation, literacy, school attendance, and real estate value) was then
transcribed and combined with information on the number and age and gender of slaves owned as
reported in the slave schedule that was created by census marshals concurrently with the population
schedule. The value of slave wealth was estimated using slaves’ age and gender and
contemporaneous slave prices disaggregated by these characteristics.
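The linkage of 1850 eligibles to the Smith (1838) winners list can be illustrated with a minimal sketch. It is not the procedure actually used: the SAS SPEDIS function is replaced here by a plain Levenshtein edit distance (an assumption, since SPEDIS scores errors differently and adjusts for name length), and the threshold `max_dist`, the function names, and the sample names are all hypothetical. The sketch also simplifies the tie rule by counting every candidate within the threshold.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a stand-in for SAS SPEDIS, not an equivalent."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def candidates(winner, eligibles, max_dist=1):
    """Eligibles with the same given name and a surname within max_dist."""
    given, surname = winner
    return [e for e in eligibles
            if e[0] == given and edit_distance(e[1], surname) <= max_dist]


def classify(winner, eligibles, max_dist=1):
    """Return 'unique', 'multiple', or 'none' for one winners-list entry."""
    n = len(candidates(winner, eligibles, max_dist))
    if n == 1:
        return "unique"
    return "multiple" if n > 1 else "none"


# Hypothetical (given name, surname) records for illustration only.
eligibles = [("John", "Smith"), ("John", "Smyth"),
             ("Mary", "Smith"), ("John", "Jones")]
```

With these illustrative records, the winner `("John", "Smith")` has two candidates and is classified as multiply matched, while `("John", "Jones")` has exactly one.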
Linkage to later censuses was then performed to generate multigenerational outcomes for the
control and treated populations. The sons of the 1850 male household heads previously
identified as the control and treatment groups were sought in the 1880 U.S. Census of Population
in two ways: (1) the characteristics of 1850 sons (name, year of birth, birthplace, and parents’
birthplaces) were used to locate them in the 1880 U.S. Census 100% File13; and (2) individuals not
successfully linked 1850-80 were located in the Ancestry.com on-line 1850 U.S. Census index,
where any hints to their 1880 record were followed (these hints are generated by Ancestry.com on
the basis of both actual links among individuals made by genealogists in the construction of their
family trees, and links generated by Ancestry.com through a machine learning process in which
actual genealogist-generated links were used as training data and the system then generated links
automatically for individuals not previously linked by genealogists). When 1850 sons were identified
as 1880 household heads through either of these mechanisms, the 1880 information on their entire
1880 family was transcribed (occupation, literacy, school attendance).
12
An individual was considered to have been uniquely linked if exactly one individual in the Smith
list appeared in the 1850 census group of eligibles with the correct given name and a surname that differed
by no more than 15 units in the SPEDIS “phonetic distance” function in SAS (which assigns points to
different sorts of transcription errors such as omitting a letter, sums the points, and adjusts for the name
length). If several individuals were matched by given name and all had exactly the same SPEDIS value
(below 15), the individual was considered to have been multiply matched.
13
When multiple matches were found in 1880 for the same 1850 individual, the match that
minimized the SPEDIS “phonetic distance” between the 1850 individual sought and the 1880 individual
located was chosen; if multiple 1880 individuals minimized this distance, the observations were rejected.
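The rule in footnote 13 for resolving multiple 1880 candidates (keep the closest match, reject ties at the minimum) can be sketched as follows. The function name `best_match`, the candidate identifiers, and the distance values are hypothetical; the distances are assumed to have been computed already by whatever phonetic-distance function is in use.

```python
def best_match(distances):
    """Pick the single closest candidate, rejecting ties at the minimum.

    distances: dict mapping a candidate id to its distance from the
    1850 individual being sought. Returns the id of the unique closest
    candidate, or None if there are no candidates or the minimum is tied.
    """
    if not distances:
        return None
    smallest = min(distances.values())
    closest = [cid for cid, d in distances.items() if d == smallest]
    return closest[0] if len(closest) == 1 else None
```

For example, `best_match({"a": 3, "b": 7})` selects `"a"`, while `best_match({"a": 3, "b": 3, "c": 9})` returns `None` because two candidates tie for the minimum.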
The sons of the 1850 male household heads we previously identified as the control and
treatment groups were sought in the 1870 U.S. Census of Population in two ways: (1) individuals
linked to the 1880 100% file in the manner described above were located in the Ancestry.com
on-line 1880 U.S. Census index, from which any hints to their 1870 record were followed; and (2)
individuals not successfully linked 1850-80 in the manner described above had their hinted links
forward from their 1850 census record on Ancestry.com followed. When 1850 sons were identified
as 1870 household heads through either of these mechanisms, their 1870 real estate and personal
estate were transcribed.
The initial sample drawn from the 1850 census yielded 47,749 children age 5-17 whose
schooling and literacy were observed (as the lottery occurred in 1832, the number of children under
18 years of age was also an outcome that we observed for families of winners and losers). The
linkage to 1880 yielded 14,963 male children of lottery winners and losers whose outcomes could
be observed in both 1850 and 1880, together with 40,658 grandchildren in 1880 of the original
lottery winners and losers. Finally, the linkage to 1870 yielded 24,510 male children of lottery
winners and losers whose outcomes were observed in 1870; of these, 6,823 were adults in 1870,
so their 1870 real and personal wealth was observed.
3.2 Summary Statistics and Balancing Tests
Table 1 presents the sample’s summary statistics. Each variable appears in its own row, and
each panel contains similar variables.14 Column 1 displays values (means and, in parentheses,
standard deviations) for the entire sample, while Columns 2 and 3 report, respectively, the
corresponding values for lottery losers and winners. Column 4 reports p-values for a test of the null
hypothesis that the means in Columns 2 and 3 are identical (where the test is a simple bivariate
regression on a dummy for lottery winner). Clustered standard errors are calculated throughout the
analysis when the data have a grouped structure. Sample sizes are reported in square brackets.
We use two measures of winning land in the 1832 Cherokee Land Lottery (Panel A, Table 1).
If an individual was uniquely matched to the list of winners (Smith, 1838), the first measure is coded
to one; otherwise this measure is coded to zero. By this measure, 12.4% of our observations are
lottery winners. This measure has a mean of zero for losers (Column 2) and a mean of one for
winners (Column 3) by construction. The second measure is designed to account for the few cases
where more than one individual is matched to the list of winners. If n individuals are matched to the
same winner, the match variable is recoded to 1/n. Our maintained assumption in constructing this
measure is that one of the “tied” individuals in fact won a parcel, but in the absence of additional
information, we can do no better than assigning equal probabilities of this event to all n individuals
in the “tied” set. The mean value for this measure of the probability of winning is 15.5%, which is
about 3 percentage points higher than the original measure but similar to the winning rates in Columbia and Oglethorpe
counties where we have actual lists of both lottery participants and lottery winners.
14
Portions of Tables 1 and 2 also appear in Bleakley and Ferrie (2013).
The second measure is higher than the first, as some individuals who were multiply matched
have a zero for the first measure but 1/n for the second. In 9 cases, there was one unique match to
1850 but several similar quality matches (e.g. the additional multiple matches had full given names
but lacked middle initials, while the unique match had a middle initial and full given name in the
winners list and in the 1850 census). In these cases, there is one observation with the value one and
n-1 with the value zero by the first measure, and n observations with the value 1/n by the second.
Overall, these two lottery winning indicators have a very high correlation.
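The construction of the two winning measures can be illustrated with a short sketch. It assumes a hypothetical input format (a list of candidate links, with each eligible linked to at most one winner) and does not handle the nine special cases described above in which one of several matches was judged to be the unique one. Eligibles who appear in no link would receive zero on both measures.

```python
from collections import defaultdict


def winner_measures(links):
    """Build the binary and 1/n measures from candidate links.

    links: list of (eligible_id, winner_id) pairs, one per candidate link.
    Returns {eligible_id: (binary, fractional)}, where binary is 1 only
    for a unique link and fractional is 1/n when n eligibles are matched
    to the same winner.
    """
    by_winner = defaultdict(list)
    for eligible_id, winner_id in links:
        by_winner[winner_id].append(eligible_id)

    measures = {}
    for tied in by_winner.values():
        n = len(tied)
        for eligible_id in tied:
            measures[eligible_id] = (1 if n == 1 else 0, 1.0 / n)
    return measures


# Hypothetical ids: e1 is uniquely matched; e2, e3, e4 tie for winner w2.
links = [("e1", "w1"), ("e2", "w2"), ("e3", "w2"), ("e4", "w2")]
measures = winner_measures(links)
```

In this example the binary measure sums to one winner while the 1/n measure sums to two, which is why the second measure has the higher mean.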
Panel B of Table 1 presents outcomes determined before the 1832 lottery, which should not
be affected by whether the individual was a lottery winner or loser. The comparisons between
Columns 2 and 3 here represent a balancing test – an analysis of how well the treated group
compares to the control group prior to the application of the treatment. Average age, the fraction
born in Georgia, the number of co-resident children present in 1850 and born in the three-year pre-lottery window, and the fraction of adults who could not read or write are similar in the control and
treatment groups.
We then examine characteristics associated with the surname of each individual. Since
surname was determined at birth and could not have been affected by the lottery, we would also
expect no differences between the control and treatment groups in these measures. We account for
minor variation in spelling by using the Soundex code for each. Surnames (prior to Soundex coding)
are 6.2 characters in length on average, though this measure is slightly lower for lottery winners. On
average, each individual’s surname occurs 36 times in the sample, with no difference between
winners and losers. Surnames began with the letter “M” or “O” (a rough indicator of Celtic origin)
in 10% of all cases, with no difference between winners and losers.
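For readers unfamiliar with Soundex, a minimal sketch of the coding follows. The text does not state which variant was used, so this assumes the standard American Soundex rules (first letter retained, consonants mapped to digits, adjacent repeats collapsed, H and W transparent, result padded to four characters). The names `CODES` and `soundex` belong to this sketch only.

```python
# Letter-to-digit table for American Soundex; vowels, H, W, Y are uncoded.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four-character American Soundex code of a surname."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return ("".join(out) + "000")[:4]
```

Under these rules, spelling variants such as "Smith" and "Smyth" receive the same code, so they fall into the same surname group.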
Finally, we constructed average characteristics from other males resident in Georgia in 1850
with the same surname.15 Mean real estate wealth of those people with the same surname as the
sample individuals is $1,200, while the median wealth is below $300. The surname average rate for
illiteracy is 22%. Neither measure differs between the winners and losers.16
Panel C summarizes our measures of 1850 wealth. Outcomes measured here and in the rest of
Table 1 are no longer expected to be the same between winners and losers and thus, unlike Panel B,
are not useful as a balancing test. We report real estate wealth, slave wealth, and the sum of these.
Although we label the latter “total wealth,” there are other forms in which wealth could be held that
were not recorded in the 1850 census (a personal wealth question was added in 1860 and 1870).
Mean wealth by all three measures (real estate, slave, and “total”) is several hundred dollars higher
for lottery winners than for lottery losers. The economically large magnitude is similar to the value
of winning a parcel that we calculated previously.
Summary statistics for fertility and school attendance among the children of winners and losers
are shown in Panel D. Lottery winners had, on average, 0.2 more children born after 1832 who
survived to 1850 than did lottery losers. By contrast, the fraction of school-eligible children who
attended school at any time during the 12 months prior to the census reference date (June 1, 1850)
did not differ between the winners and losers.
15
Although some lottery winners and losers no doubt migrated out of Georgia after the 1832 lottery,
we have limited our attention to Georgia in constructing these surname-average characteristics because
roughly half of the counties in Georgia have already been completely transcribed. Individuals in the lottery
sample are themselves excluded from these surname averages.
16
Although this is a weak test due to noise in the surname averages, surname averages (in results
not shown) are statistically significant predictors of individual-level behavior even controlling for a variety
of other covariates.
Additional characteristics for spouses and 1850 locations are compared in Panel E. Slightly
more winners than losers still had a spouse present in 1850, while among those winners and losers
with spouses present in 1850 the winners had spouses 6 months younger on average than the losers.
Roughly equal percentages of spouses were illiterate among the winners and losers. Most of the
sample still resided in Georgia in 1850 (Figure 1), while most of the balance outside Georgia was
in Alabama. Although the fractions residing in Georgia and Alabama do not differ between the
winners and losers, the equality of the distributions of winners and losers across counties is strongly
rejected by a simple χ² test. As we will see below, lottery winners were slightly more likely than
losers to reside in 1850 in Old Cherokee County (the counties settled through the 1832 Cherokee
Land Lottery).
4. Estimation strategy
Our data allow us to analyze outcomes for lottery winners themselves, their children, and their
grandchildren – a span of roughly 50 years from the date of the lottery. We were able to construct
treated (winners) and control (losers) groups based on the list of winners and the criteria for
participation, where the latter allowed us to identify all individuals likely to have been eligible to win
and the former allowed us to identify winners in that eligible population. In this sample, the
treatment effect of winning a parcel in the lottery can be assessed directly by comparing mean
outcomes for winners and losers (and their descendants), or by estimating a simple bivariate
regression with a relevant outcome on the left-hand side and a dummy variable for winning a parcel
on the right-hand side. We adopt the regression-based approach to permit both the inclusion of
additional control variables and the continuous 1/n lottery status indicator. Although the random
assignment of parcels among participants reduces the omitted-variable problem and thereby
diminishes the need to introduce additional controls, such controls can improve the precision of our
estimated treatment effect and reduce the residual variation. These controls can also reduce any
biases resulting from our process for imputing lottery status, although the inability of lottery status
to predict pre-determined outcomes reduces this concern.17
We estimate OLS regressions of the following form:
(1)    $Y_{ijk} = g T_j + B X_{ijk} + d_a + d_k + e_{ijk}$
where i is the individual, j indexes the lottery-eligible person, Tj (a binary variable) denotes
treatment (winning a parcel in the lottery), and the control variables are: da (a set of age dummies); dk
(a set of county × state location dummies to allow for differences between control and treated in
settlement patterns); and Xijk (a vector of other control variables specified below). The error term is
allowed to vary by both i and j. When we examine outcomes for the original lottery participants, i=j.
But many of the regressions below use instead samples of children or grandchildren of the lottery
participants, generating potentially numerous observations (i) for each lottery participant (j). In these
regressions, standard errors will be clustered at the lottery participant (j) level. The estimate of g that
we recover should be uncontaminated by omitted-variable or endogeneity problems, as a result of
the random assignment of treatment by the lottery.
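The core of this estimation strategy, a regression of an outcome on a treatment dummy with standard errors clustered on the lottery participant, can be sketched in a few lines. This is a deliberately stripped-down illustration under stated assumptions: it includes only a constant and the treatment dummy (no age, location, or surname dummies and no other controls), it applies no finite-sample correction to the variance, and the function name `ols_clustered` and the toy data are hypothetical. Applied work would normally rely on a statistical package.

```python
import math
from collections import defaultdict


def ols_clustered(y, t, cluster):
    """Regress y on a constant and t; return (slope, cluster-robust s.e.)."""
    n = len(y)
    st, stt = sum(t), sum(v * v for v in t)
    sy, sty = sum(y), sum(a * b for a, b in zip(t, y))
    det = n * stt - st * st
    # Inverse of X'X for the design matrix X = [1, t]
    inv = [[stt / det, -st / det], [-st / det, n / det]]
    a = inv[0][0] * sy + inv[0][1] * sty  # intercept
    g = inv[1][0] * sy + inv[1][1] * sty  # slope on the treatment dummy

    # Sum the scores x_i * u_i within each cluster
    scores = defaultdict(lambda: [0.0, 0.0])
    for yi, ti, ci in zip(y, t, cluster):
        u = yi - a - g * ti
        scores[ci][0] += u
        scores[ci][1] += ti * u

    # Middle of the sandwich: sum over clusters of s_c s_c'
    m = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        for r in range(2):
            for c in range(2):
                m[r][c] += s[r] * s[c]

    # Slope variance: second diagonal element of inv * m * inv
    row = [inv[1][0] * m[0][k] + inv[1][1] * m[1][k] for k in range(2)]
    var_g = row[0] * inv[0][1] + row[1] * inv[1][1]
    return g, math.sqrt(var_g)


# Toy data: six children (i) of four fathers (j); treatment varies by father.
y = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
t = [0, 0, 0, 1, 1, 1]
father = [1, 1, 2, 3, 3, 4]
slope, se = ols_clustered(y, t, father)
```

With a single binary regressor, the slope equals the difference in mean outcomes between the treated and control observations, which is the comparison of means described above.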
We also employ an additional specification that incorporates characteristics measured at the
level of surnames, in the simplest case adding a fixed effect for each surname. Such a specification
controls for numerous differences that might be constant in family lines (patrilineal lines here, as we
only have information on surnames), allowing the impact of winning a parcel in the lottery to persist
within extended patrilineal families. Clark and Cummins (2013) and Güell et al. (2012) both
highlight striking persistence in a variety of outcomes across family lines, an effect that surname
fixed effects would absorb. At the same time, our imputation process for determining lottery status
relies on matching by surname, so noise introduced in this process can be absorbed by surname fixed
effects.
17
We also address this issue directly with a placebo test in Section 8.3.
5. Analysis of wealth differences for original lottery participants
As a first step in assessing the impact of lottery winnings on family outcomes across
generations, we estimate the direct effect on lottery participants of winning a parcel on both real
estate and total wealth (real estate plus slave wealth) levels in Table 2. The effect is large: the
baseline estimate in Column 1 is an impact of $750 on 1850 wealth, similar to the unconditional
difference in Table 1 and our estimate of the value of a parcel of land in northwest Georgia by 1850.
Although winnings could in theory have been invested in a variety of instruments other than land and
slaves, such alternative investment opportunities were rare in the Deep South in the antebellum
period.18 The baseline estimates suggest that the effect of winning a parcel in the lottery persisted
for at least the two decades following the drawing.19
18
Ransom and Sutch (1988, Table A.1, pp. 150-1) report that the total value of slaves in the U.S. in
1860 (the first time the census reported both real and personal wealth) was $3.1 billion. In that year, total real
estate and personal estate in the South were $3.4 billion and $4.7 billion, respectively (IPUMS 1860 1%
Sample: Ruggles et al., 2010). Thus, slaves accounted for 2/3 of all personal wealth in 1860, and land plus
slaves accounted for 80% of total wealth in 1860.
19
Bleakley and Ferrie (2013) present a more detailed analysis of the 1850 wealth of the lottery
winners and losers.
In the rest of Table 2, we employ specifications with different fixed effects: surname
characteristics (initial letter, length, and frequency in the sample) in Columns 2-4, which yield results
within a third of a standard error of the baseline; dummies for each surname (by Soundex code) in
Column 5, which reduces the effect of winning by half a standard error, although the effect remains
large ($600) even 18 years after the lottery; dummies for given name in Column 6, which raises the
estimated effect by half a standard error; and both given name and surname fixed effects in Column
7, which also yields a substantial impact of lottery winning, slightly below the baseline but higher
than with surname controls alone.20
Finally, estimates of the effect of winning are similar whether we use the binary or the 1/n
match versions of our lottery status. We present only the binary variable results in what follows. The
Table 2 results reveal that including surname fixed effects has a larger impact, so we provide that
specification as an alternative throughout.
6. Effects on Child Quantity versus Child Quality
Lottery winners tended to have slightly more children, but did not send them to school more.
In Becker’s (1982) terminology, they invested in child quantity but not child quality. These results
are found in Table 3, where, as above, we estimate equation (1). We report the coefficients on the
binary measure of lottery winning. Column 1 reports results when the dependent variable is the
number of children born after 1832 (the year of the lottery) who were still present in the household
in 1850. (Recall that the number of children born in the three years prior to the 1832 Lottery was
not significantly related to lottery status.) In the basic specification, we estimate lottery winners have
0.13 more children on average and, in the specification augmented with surname fixed effects, we
estimate instead a coefficient of 0.19. These numbers are consistent with the unconditional
difference seen in Table 1 of 0.2. When considered over the entire set of children still in the
household, this represents a 3% increase in fertility, as seen in Column 2.
20
The specification in Column 7 uses two sets of fixed effects: one for each surname and one for
each given name. Dummies for each given-name × surname cell would entirely absorb the lottery-status
variable, as lottery status in the eligible population was determined by linkage to the Smith (1838) winners
list using surname and given name.
The remaining two columns of Table 3 examine school attendance by children in the household
aged at least five years old but not more than 17 years old. (Note that this age range excludes
children born prior to the lottery.) These children are linked to the lottery status of their father, and
the standard errors are adjusted for clustering at the level of the father. Column 3 uses the OLS
estimator and therefore this regression is a linear probability model, while Column 4 uses the logit
estimator, with marginal effects evaluated at the mean of observables and assuming the surname
fixed effects are all zero. The resulting coefficients imply an effect of winning the lottery of close
to zero, and we can rule out effects of more than a few percentage points.
In Table 4, we consider some decompositions and possible mechanisms for the quantity/quality
result. One hypothesis for these results is that a richer husband might be able to remarry more easily
(and/or to a younger spouse) if his first wife had died in childbirth (which was not an uncommon
occurrence in this period). This higher remarriage probability could result in higher fertility in
families headed by lottery winners. But we see in Columns 1 and 2 that there is not a statistically
significant difference by lottery status in the wife being present or in the wife’s age, if she is present.
Next we consider the extensive margin of post-lottery fertility in Column 3, where we see that lottery
winners are more likely to have children after 1832 than the lottery losers. Indeed, the entire CDF
of the number of post-lottery children (Nkp) is shifted out for lottery winners, although such
differences are strongest when assessing whether winners were more likely than losers to have
had one or two additional post-lottery children, as seen in Columns 3 and 4. In results not shown,
we find evidence of differential stopping behavior: the average age of children in the household or
the age of the youngest child is about 0.2 years lower for winners, although this result is only
marginally significant (p=13%). In Column 5, we find essentially no effect on the gender
composition of children, suggesting that the fertility effect is not due to the differential survival of
one gender or the other. Finally, in Columns 6-9, we obtain similar school-attendance results when
decomposing the sample by gender or by broad age groups.
Locational choice, at least at the county level, does not appear to be a central mechanism in
driving these results. First note that results are quite similar whether we include fixed effects for
county × state of residence (in 1850) (in Tables 2 and 3) or not (in Table 1). We further investigate
this mechanism by examining characteristics of the 1850 county of residence in Table 5. We begin
by noting that lottery winners are slightly more likely to end up in Old Cherokee County in 1850
(Column 1), although this difference in probabilities is quite small (2.2%). The lack of a
homesteading requirement implies that there is no mechanical reason why the lottery winners should
have higher rates of residence in Old Cherokee County than the lottery losers. Nonetheless, some
of them may have chosen to settle on their parcel rather than flip it, and this decision apparently
stuck for a small fraction. However, the treatment group shows no differential probability of residing
in the state of Georgia (Column 2) or residing in a county that is farther east (Column 3). But lottery
winners do, on average, live somewhat farther south when compared to lottery losers. This may be
because land in the Upcountry frontier was cheaper and therefore more attractive to the poorer lottery
losers. Alternatively, this may be because someone with enough capital to buy a slave preferred to
stay farther south where slave agriculture was more productive.
The remainder of Table 5 (Columns 5-16) uses county-level data to construct left-hand-side
variables describing the local economic and demographic conditions in the 1850 county of residence.
Because of the repeated data within counties, we now cluster the standard errors on county of
residence. Most important for the quantity/quality results, we do not see differences by lottery status
in the average school-enrollment or fertility rates (Columns 5-7). This suggests that lottery winners
were not differentially moving to areas that were more conducive to higher fertility or school
attendance (the latter being perhaps because of the provision of school infrastructure). Additionally,
being a lottery winner does not predict differences in county-of-residence farm values, farm sizes,
land improvement, or slave density (Columns 8-11). While this might suggest that lottery winners
bought more acreage instead of moving to counties with more valuable land, we cannot rule out that
they bought the land that was more valuable within a county. Finally, we do not find statistically
significant differences for county-of-residence urbanization or access to transport (Columns 12–15).
Additional analyses in Panels C and D add a dummy variable for residence in Old Cherokee County
and controls for surname fixed effects and find no difference from the results in Panel B.
7. Outcomes of the next generations in 1870 and 1880
We now follow up on the outcomes in 1870 and in 1880 of children observed in the 1850
households. Someone who was a child in 1850 will have advanced to adulthood in those later years,
thus giving us an opportunity to observe the adult outcomes of children whose parents were eligible
to win in the Cherokee Land Lottery. Many of those in the second generation following the lottery
had formed households by 1880, which also allows us to observe the childhood outcomes of the
grandchildren of those who were eligible to participate in the 1832 Lottery. Note that here we are
examining outcomes that are almost 50 years after the lottery took place.
We next track this sample by taking the children under 19 in the 1850 households and looking
for likely matches in the 1870 and 1880 censuses. We use the 100% file from NAPPdata.org for
1880, as well as indexes for 1870 and 1880 that are searchable on Ancestry.com. The conditions
under which lottery winning predicts linkage are discussed below. It should be noted that we only
attempt to link male children across censuses, because female children would almost certainly
change their surname at marriage. Linkage rates to 1870 and 1880 are somewhat low (28% and 35%,
respectively). The lower linkage rate for 1870 results from the exclusive reliance on the hints
generated by Ancestry.com to perform the matching – for 1880, we were able to use both the hints
and the 1880 100% file from NAPP. Approximately 59% of the lottery eligible men have at least
one child in the 1880 sample.
The relationship between having a father win the 1832 Lottery and various outcomes for these
children as adults is presented in Table 6. As above, we present results from a basic specification
that includes dummies for age and place of residence, and for an augmented specification that
controls for surname fixed effects as well (Panels A and B, respectively). For our purposes, the bulk
of the outcomes of interest are drawn from the 1880 census, so we focus on those first. Note in
Columns 1 and 2 that having a lottery-winning father is a significant predictor for the child being
linked to the 1870 or 1880 census. This might induce a bias in the coefficients for other outcomes
in 1880, although the fact that the 1880 outcomes are all either binary or have limited ranges puts
an upper bound on the magnitude of such bias. In any event, this differential linkage seems to result
from differences in the characteristics of given (first) names. Accordingly, if we condition on a
variety of characteristics of the given name, lottery winning no longer significantly predicts
differential linkage. Therefore, to our standard set of specifications, we add a Panel C in which we
also control for the number of letters in the given name.21
21
We find similar results if we use other characteristics of the given name.
Next, we turn to outcomes of the children of winners and losers. Having a lottery-winning
father predicts linkage to 1870 or 1880, but this correlation dissipates when controlling for
characteristics of the given name. The first outcome variable examined is illiteracy in Column 3,
measured as whether the lottery participant’s son is unable to read and unable to write. In Column
4, the outcome variable is the occupational score, in adulthood (1880), of the children of the lottery
participants. Neither of these outcomes is significantly different when comparing the children of
lottery winners versus those of lottery losers. In Columns 5 through 7, we consider outcomes in
1870. It is perhaps too early in 1870 to reliably measure the outcomes of grandchildren simply
because many of the children of lottery eligibles would just be starting families. But 1870 is the last
nineteenth-century census in which wealth is reported, which we use as an outcome variable here.
We transcribed both real-estate and personal wealth, and the results here are for the sum of these two
variables. To compare with the estimates above, we deflate the 1870 wealth to 1850 dollars using
the consumer price index from measuringworth.com (Williamson, 2013). Results are similar using
other deflators.
The 1870 total wealth is statistically and economically similar between control and treatment
groups. An alternative point of comparison is a mechanical split of the lottery winnings among the
average number of children. This would suggest a treatment effect in 1870 of $140 (in 1850 dollars),
which we also cannot reject at conventional levels of confidence. On the other hand, we can reject
values larger than that. Note that the deflator adjusts for inflation only and does not convert the 1870
wealth into its present-value equivalent; results for 1870 wealth would drop by a factor of 2 to 5 for
annual interest rates of 3% to 8%. At standard confidence levels, we could handily reject a lottery-winning
effect of $140 in 1850 dollars for interest rates much above 3% per annum. These
estimates for the wealth of the sons are inconsistent with a claim of supernormal returns in
intergenerational transmission. Finally, we show in Appendix Figure 1 using a quantile regression
of 1870 wealth on treatment that effects are similar across the distribution of 1870 wealth.
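The present-value adjustment mentioned above can be checked with a short calculation. The sketch below assumes annual compounding over the 20 years between 1850 and 1870; the compounding convention is an assumption made for illustration, since the text does not state one.

```python
# Illustrative check of the present-value adjustment for 1870 wealth.
# Assumption (not stated in the text): annual compounding over 1850-1870.
YEARS = 1870 - 1850

def discount_factor(rate: float, years: int = YEARS) -> float:
    """Factor by which an 1870 amount is divided to express it in 1850 present value."""
    return (1.0 + rate) ** years

factors = {rate: discount_factor(rate) for rate in (0.03, 0.08)}
for rate, factor in factors.items():
    print(f"{rate:.0%} per annum -> divide 1870 wealth by {factor:.2f}")
```

Under these assumptions the divisors are about 1.8 and 4.7, in line with the "factor of 2 to 5" quoted in the text.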
In Columns 8 through 10, we turn to outcomes in the third generation (the grandchildren of
lottery participants). Note that children in the household in 1880 are, by construction, the
grandchildren of lottery eligibles. There are two principal outcomes we consider: illiteracy (Column
8) and school enrollment (Column 9). Differences in illiteracy between the grandchildren by lottery
status of their paternal grandfather are not statistically significant. In contrast, the grandchildren of
lottery winners have a 2 to 3 percentage point lower probability of attending school. (These two
columns use restricted ages corresponding to the age ranges over which the variables are measured
and/or meaningful.) The result for schooling is the opposite of what one would expect if wealth
were relaxing a constraint on human-capital investment. Nor are these results consistent with
moving along or relaxing a quantity/quality tradeoff in that there is less of both variables. Men
whose fathers had won the lottery had fewer children by 1880 (Column 10), although this effect is
only significant at the 10% level. Nevertheless, the magnitude of this effect is approximately the
same as it was for the previous generation. A regression, at the grandfather level, of the number of
grandchildren on lottery winning cannot reject equality between winners and losers in the 1832 lottery.
Furthermore, the fact that the fertility effects in the first and second generations roughly cancel out
suggests that the wealth shock induced only a one-generation blip in the size of the dynasty.
8. Discussion
In this section, we address four distinct questions: (1) is there evidence in these data of a return
to skill? (Yes.); (2) is there evidence of intergenerational correlations in outcomes at this time, and
are such correlations consistent with the presence of characteristics that might be passed along family
lines? (Yes.); (3) do we obtain similar results using a placebo sample constructed using children born
in South Carolina rather than Georgia in the years prior to 1832? (No.); and (4) are these results
consistent with the relationship between parental wealth and children’s human capital observed in
our control population? (No.)
8.1. Was there a return to skill in nineteenth-century Georgia?
A possible response to the results above is that antebellum Georgia is not the right environment
to observe parents investing in skills or facing a quantity/quality trade-off, perhaps because it was
too early in that region’s path of economic development. But was this indeed the case? It is possible
that the contemporary reader might be unduly influenced by the seemingly moribund state of
education in the South after the Civil War. Nevertheless, Bleakley and Hong (2013) show that
antebellum rates of school enrollment among white children in the South were considerably higher
than rates postwar, and, indeed, that the South would have caught up to the North by circa 1890 if
the antebellum trends in school enrollment had continued after the Civil War.
Schultz (1975) has emphasized the importance of returns to education in agriculture once
farming has passed out of its “traditional” phase (in which prices are stable, long-used production
techniques can be employed year after year, and there are no new technological or financial
innovations that need to be dealt with). In light of the non-traditional nature of farming in Georgia
from the 1830s forward (with new crops, such as new cotton varieties, being introduced, and increasingly
national and international markets for the state’s products with wide year-to-year swings in prices),
it would not be surprising to find a substantial value for education in this environment.
In any event, lacking an intervention or instrument that specifically manipulates time in school
or the price of fertility control or some such, we cannot provide causal evidence on the return to
school or the technological rate of substitution between quantity and quality in this context. Instead,
we apply standard methods using observational data to get a first-pass estimate of these effects. We
use the lottery-eligible sample from above because (i) it allows for estimates that are most internally
comparable to the results above, and (ii) it can be used ‘off the shelf’ without the need for further
linkage, transcription, or data description. There are a variety of outcomes in these data that are
suitable for estimating such models, including wealth, occupational score, illiteracy, school
attendance, and family size.
We find substantial (correlational) evidence of returns to skill in our sample. These results are
found in Panel A of Table 7, which displays estimates of equation (2):
(2)   $Y_{ijks} = \omega Q_{ijks} + \delta_a + \delta_k + \epsilon_{ijk}$
where i is the individual, j indexes the lottery-eligible person, Q (“quality”) is the skill variable, $\delta_a$
are dummies for age, and $\delta_k$ are dummies for state/county of residence. To this baseline
specification, we add, in some cases, fixed effects for surname/Soundex. Column 1 of Panel A
regresses the 1850 wealth of the lottery-eligible man on whether he is illiterate. Men in our sample
had substantially lower wealth in 1850 if they could not read and write. The remaining columns of
Panel A consider 1870-80 outcomes for the children of the lottery-eligible men. As seen in Columns
2 and 3, if these sons were illiterate in 1880, they had lower income (using the occupational income
score as a proxy) and wealth. We see also in Columns 4-6 that attending school in 1850 is associated
with lower illiteracy and higher income in 1880 and higher wealth in 1870.
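A regression of the form of equation (2) can be sketched as ordinary least squares of the outcome on the skill variable Q plus full sets of age and county dummies. The example below is only an illustration: the data are synthetic, the function names are hypothetical, and it reports a point estimate without the standard errors or surname fixed effects used in the tables.

```python
import numpy as np

def one_hot(codes, drop_first=False):
    """Dummy columns for an array of category labels."""
    levels = np.unique(codes)
    if drop_first:
        levels = levels[1:]          # omit one category to avoid collinearity
    return (codes[:, None] == levels[None, :]).astype(float)

def skill_coefficient(y, q, age, county):
    """OLS coefficient on q, controlling for age and county dummies."""
    X = np.column_stack([q, one_hot(age), one_hot(county, drop_first=True)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]

# Synthetic illustration (made-up numbers, not the paper's data).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(30, 60, size=n)
county = rng.integers(0, 10, size=n)
illiterate = (rng.random(n) < 0.2).astype(float)
# Noise-free outcome with a true coefficient of -0.8 on illiteracy.
log_wealth = 7.0 - 0.8 * illiterate + 0.01 * age + 0.05 * county
coef_hat = skill_coefficient(log_wealth, illiterate, age, county)
print(coef_hat)
```

Because the synthetic outcome has no noise and the dummies absorb the age and county terms, the regression recovers the coefficient used to generate the data.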
Though the absence of an effect of lottery winning on children’s (and grandchildren’s) human
capital might reflect a quantity/quality trade-off (winners increased their family size but did not
invest more in their children’s education), this is implausible in light of the very small effect of
winning on the number of children in winners’ households (and the negative sign on the number of
children in winners’ sons’ households) and the associated high value for children this would imply:
for example, winners had between 0.13 and 0.19 children more than losers (Table 2), despite their
having won as much as $700 in the lottery. This implies that an additional child was worth between
$3,600 and $5,300. Kotlikoff (1979) reports that a prime-age male slave could be purchased at
auction in 1832 in New Orleans for $701. Given high antebellum infant and child mortality rates,
parents would have had to place an implausibly high premium on their own children’s labor (and any
non-pecuniary benefits from populating their households with their own children rather than slaves)
for the measured effect of lottery winning on fertility to be consistent with investment at the
extensive margin (quantity) in lieu of investment at the intensive margin (quality).
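The implied per-child valuation above is the prize divided by the estimated increase in family size. The calculation below is an illustrative check, assuming the full $700 is attributed to the additional children. It gives roughly $3,700 and $5,400, close to the range quoted in the text; any small difference presumably reflects rounding.

```python
# Illustrative arithmetic: implied dollar value per additional child.
PRIZE = 700.0                    # upper-end prize value quoted in the text (1850 dollars)
EXTRA_CHILDREN = (0.13, 0.19)    # range of fertility effects quoted from Table 2

implied_value = {extra: PRIZE / extra for extra in EXTRA_CHILDREN}
for extra, value in implied_value.items():
    print(f"{extra:.2f} extra children -> ${value:,.0f} per additional child")
```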
In sum, it appears that in antebellum Georgia there were indeed returns to human-capital
investment. To reiterate, this table departs from previous ones in that we are not estimating the
treatment effect of winning the lottery, but rather estimating the relationship (not necessarily a causal
one) between human capital variables and other outcomes.
8.2. Intergenerational correlations
Next, we consider the extent to which outcomes are in fact correlated across generations at this
time and whether outcomes are related to characteristics of other people who share the same surname
and are therefore likely related along patrilineal lines of descent. As an example of correlation across
generations, we examine the 1850 outcomes as predictors of the son’s own 1870 wealth. There is
evidence of a strong relationship between the log wealth of fathers and sons in Columns 6 and 7. In
the control sample, the elasticity of son’s 1870 wealth with respect to the father’s 1850 wealth is 0.23
(and the correlation is 0.57) and statistically significant.22 This linkage persists despite the
intervening Civil War that destroyed much of the South’s physical capital (though most had been
restored by 1870), but – perhaps more importantly – resulted in emancipation and the disappearance
from slaveowners’ balance sheets of a significant quantity of capital. Despite the loss of this capital,
the link between fathers’ and sons’ wealth remains strong. This is perhaps not surprising as the war
would not have destroyed human capital acquired before it took place. The correlations in the
control sample are also substantively large and statistically significant at the one percent level
between father’s 1850 log total wealth and his children’s 1850 school attendance (0.25), 1880
literacy (0.11) and 1880 occupational score (0.11), and between the father’s 1850 log total wealth
and his grandchildren’s 1880 school attendance (0.04).
22 Charles and Hurst (2001, 2003) report an intergenerational wealth elasticity of 0.37 and an
intergenerational correlation of 0.23 to 0.50 for the modern U.S.
We now show that this linkage comes through characteristics that are common across
patrilineal lines, using surname-specific averages of fertility, school attendance, and wealth as
possible proxies for differences across extended families in either preferences or prices. We used
the 1850 100% census file to construct the average fertility, school attendance, and real-estate wealth
among Georgia-resident households for each (Soundex) surname.23 Those individuals that appear
in our lottery-eligible sample are excluded from the construction of the averages. We first check for
the statistical power of these proxies by regressing the individual-level outcome on the surname
average:
(3)   $Y_{ijks} = \alpha \bar{Y}_s + \delta_a + \delta_k + \epsilon_{ijk}$
where s denotes the surname for each observation, $\bar{Y}_s$ is the surname-average of the Y variable, and
each regression contains dummies for age and for state/county of residence. Furthermore, due to the
group-level nature of the regressor, we adjust the standard errors for clustering at the surname level.
The base sample for these regressions is the same as for analogous estimates of equation (1)
displayed in earlier tables, with the exception that some households are omitted if there were no
other households in Georgia with the same surname and therefore no one from whom to form the
surname-level averages. Estimates of this equation are found in Panel B of Table 7.
23 On the use of surnames in this way, see Clark and Cummins (2013) and Güell et al. (2012).
The surname-averaged variable is indeed a strong and statistically significant predictor of the
individual-level outcome. A coefficient of zero is rejected in all three cases at conventional
confidence levels. A mechanistic model in which the patrilineal dynasty (proxied by surname)
predicts outcomes one-for-one is even more strongly rejected; the coefficients are closer to 1/5th or
1/8th. (Note that we are not arguing that this is a causal effect of the behavior of their relatives on
the individuals’ choices, but rather a proxy for some shifter that is common within the group.) The
surname has little effect on the son’s 1870 log total wealth, however, when both the father’s log
1850 total wealth and the surname-average log 1850 total wealth are included together in Column
7 – the father’s wealth dominates the surname effect.
8.3. Falsification exercise using a placebo sample from South Carolina
We perform a falsification exercise using South Carolina rather than Georgia and do not find
statistically significant results. One of the challenges in identifying the treatment effect associated
with winning the 1832 Lottery is that our method of imputing lottery status via name matching may
introduce biases through sample selection. To check for this possibility, we construct a placebo
sample using households with children born only in South Carolina (rather than Georgia) during the
same pre-lottery window (the three years prior to the Cherokee Land Lottery of 1832).24 We use the
names among this South Carolina sample to impute a pseudo-lottery-status by linking to the Smith
(1838) list. As above, we use both a dummy for a unique match to the Smith list and a variable that
allows for probabilistic matches, deflated to 1/n in case of ties. By the eligibility rules of the Cherokee
Land Lottery, any matches from the South Carolina sample to this list must be spurious. It is then
reassuring that the fraction of unique matches in the placebo sample derived from South Carolina
is only one quarter of the fraction in the Georgia sample.
24 This exercise required the transcription of an additional 55,739 observations.
In Table 8, we estimate equation (1) using this placebo sample, for the different variables
indicating lottery status, and using both the basic specification and the one that includes
surname/Soundex fixed effects. These results are found in Panels A and B, with analogous results
from the Georgia sample provided for reference in Panel C. The first four columns of Table 8 show
outcomes that were determined prior to the 1832 Lottery, and there are no statistically significant
results. (Note that a series of falsification checks using pre-lottery variables was also performed
for the Georgia sample, as shown in Table 1, Panel B.) The remaining columns show post-lottery
outcomes such as residing in old Cherokee County and fertility by 1850. There is no statistically
significant pseudo-treatment effect for the South Carolina sample, in contrast to what we find for
Georgia.25 Nor is there a statistically significant effect of treatment for 1850 real-estate wealth, along
either intensive or extensive margins (Columns 7 and 8, respectively). In Column 9, there is some
evidence of a positive relationship between the pseudo-treatment and school attendance. If we
choose to subtract this estimate from the Georgia estimates, it would make the above estimates even
less supportive of the idea that wealth allowed families to buy their way around credit constraints
to invest more in their children’s schooling.
25 We generally limited this falsification test to variables that were already available in the 1850
census index. We also transcribed wealth and school attendance in these households from the 1850 Census
manuscripts. Our efforts to link to the slave schedule were considerably more skilled-labor intensive, so we
did not duplicate these efforts for the placebo sample.
8.4. Simulated results using cross-sectional relationships in the control group
Based on the cross-sectional relationship between paternal wealth and sons’ outcomes, we
would have expected much larger effects of winning the lottery on sons’ human capital but not on
fertility. We come to this conclusion by conducting a simple shift/share analysis using the expected
change in the wealth distribution interacted with the relationship between wealth and various
outcomes in the control group. We use the control group to conduct this calculation because we wish
to compare the results from the randomized wealth with those for wealth in a sample that did not
receive a random wealth disbursement. Some readers might ask why we did not instead set this up
as a two-stage-least-squares (2SLS) problem with 1850 wealth as the endogenous regressor. This is
inappropriate in that lottery winners may have spent some of their wealth precisely on the human-capital
formation of children. This would violate the 2SLS exclusion restriction in that lottery
treatment has an effect on child outcomes via a channel other than measured 1850 wealth. Such
transitional dynamics of wealth would not be present in the control group, which did not receive the
extra wealth. However, we would not argue that the relationship between child outcomes and wealth
in the control group is necessarily causal, but rather is a useful benchmark. One additional
complication that motivates our use of the shift/share analysis (versus a more common comparison
of 2SLS and OLS estimates) is that the relationship between wealth and various outcomes might not
be linear.
The specifics of the shift/share calculation are as follows. We use 100 grid points, evenly
spaced across the distribution of log 1850 total wealth, to discretize the 1850 wealth distribution.
Within each cell j there is an estimated average $x_j$ of some outcome. Let the vector of these averages
be x and the probability of being in each cell summarized by the vector $p = \{p_j\}$. The expected value
of this outcome variable across the whole sample is therefore the dot product of p and x. Suppose
the distribution of wealth is perturbed to be q. The change in the expected value of the outcome
variable would be $\Delta = (q - p) \cdot x$. For a given perturbation of the wealth distribution, we compute the
distribution of $\Delta$ with 500 bootstraps from the control sample. In the case of child or grandchild
outcomes, we use a block bootstrap grouped by the lottery-eligible father.
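The shift/share calculation described above can be sketched in a few lines. This is a minimal illustration under several assumptions that the text does not specify. The grid is evenly spaced over the observed range of log wealth, empty cells are filled by linear interpolation, perturbed values beyond the grid are assigned to the end cells, the interval is a percentile interval, and the bootstrap resamples individuals rather than blocks of siblings. All function names are hypothetical and the data are synthetic.

```python
import numpy as np

def cell_index(log_w, edges):
    """Map log wealth to grid cells 0..n_cells-1, clipping values beyond the grid."""
    idx = np.searchsorted(edges, log_w, side="right") - 1
    return np.clip(idx, 0, len(edges) - 2)

def shift_share_delta(wealth, outcome, shock, n_cells=100):
    """Delta = (q - p) . x for a homogeneous dollar shock added to every observation."""
    log_w = np.log(wealth)                       # assumes strictly positive wealth
    edges = np.linspace(log_w.min(), log_w.max(), n_cells + 1)
    base = cell_index(log_w, edges)
    shifted = cell_index(np.log(wealth + shock), edges)
    counts = np.bincount(base, minlength=n_cells)
    sums = np.bincount(base, weights=outcome, minlength=n_cells)
    # Cell means x_j; empty cells are filled by linear interpolation (an assumption).
    mids = 0.5 * (edges[:-1] + edges[1:])
    filled = counts > 0
    x = np.interp(mids, mids[filled], sums[filled] / counts[filled])
    p = counts / len(wealth)                     # baseline cell probabilities
    q = np.bincount(shifted, minlength=n_cells) / len(wealth)   # perturbed probabilities
    return float((q - p) @ x)

def bootstrap_delta(wealth, outcome, shock, n_boot=500, seed=0):
    """Mean and 95% percentile interval of Delta across bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(wealth)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        i = rng.integers(0, n, size=n)           # resample individuals with replacement
        draws[b] = shift_share_delta(wealth[i], outcome[i], shock)
    return draws.mean(), np.percentile(draws, [2.5, 97.5])

# Synthetic control sample (made-up numbers, not the paper's data).
rng = np.random.default_rng(1)
wealth = np.exp(rng.normal(6.5, 1.2, size=2000))
prob = 1.0 / (1.0 + np.exp(-(np.log(wealth) - 6.5)))
outcome = (rng.random(2000) < prob).astype(float)   # e.g. a school-attendance dummy
mean_delta, ci = bootstrap_delta(wealth, outcome, shock=700.0)
print(mean_delta, ci)
```

A zero shock leaves q equal to p, so the computed change is exactly zero, which is a convenient sanity check on the implementation.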
Results from this exercise are shown in Table 9. The outcome measures and the year in which
they are measured are displayed on the leftmost columns of the table. Each row and column group
displays the mean and, in square brackets, the 95% confidence interval from a different simulation.
The rightmost columns display estimates from the lottery-based design above. A dagger denotes that
the confidence interval for that simulation does not overlap with the confidence interval estimated
from the lottery treatment. For each simulation, we specify an expected value of winning,
discounted to 1850 and denoted in the “EV” column-group headings. The first expected value we
use for the simulation is $700, corresponding to our estimate in Section 2.1 of the value of land won.
We also consider expected values $200 above and $200 and $400 below $700. We focus mostly on
the $700 case, but discuss the robustness to alternate assumptions.
We also allow for heterogeneity in the value of land winnings by using a simple, two-point
distribution including zero as a possible “prize.” For each outcome and expected value of winning,
we conduct three simulations with varying degrees of heterogeneity. These are denoted in the
column “Fraction with zero,” and indicate the fraction t of the simulated winners that receive zero
change in wealth. In other words, we use the control sample to construct p using 1850 total wealth.
We then define a perturbed-wealth variable equal to measured wealth for t of the sample and equal
to measured wealth plus EV/(1-t) for the remaining (1-t). (Receiving zero wealth is randomly
assigned separately for each bootstrapped sample.) In the end, these alternate assumptions do not
make much difference. For a given outcome and expected value of the lottery prize, the simulated
and estimated confidence intervals tend to overlap either in all three cases or in none at all. (Of the
44 blocks of cells in Table 9, 38 have either zero or three daggers.) In practice, this relative
insensitivity arises from the approximate linearity of the relationship between most outcomes and
1850 wealth, at least across the densest part of the wealth distribution.
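The two-point prize distribution described above keeps the expected value of winning fixed while varying the share of winners who receive nothing. The sketch below illustrates this with synthetic wealth data. The function name, the use of an exact share of zero prizes, and the particular values of t are assumptions made for illustration.

```python
import numpy as np

def perturb_wealth(wealth, ev, frac_zero, rng):
    """Add a two-point prize: zero for a share frac_zero, ev / (1 - frac_zero) for the rest."""
    n = len(wealth)
    n_zero = int(round(frac_zero * n))
    gets_zero = np.zeros(n, dtype=bool)
    gets_zero[rng.permutation(n)[:n_zero]] = True    # random assignment of the zero prize
    prize = np.where(gets_zero, 0.0, ev / (1.0 - frac_zero))
    return wealth + prize

rng = np.random.default_rng(0)
wealth = np.exp(rng.normal(6.5, 1.2, size=1000))      # synthetic wealth, for illustration only
mean_prize = {
    t: float(np.mean(perturb_wealth(wealth, 700.0, t, rng) - wealth))
    for t in (0.0, 0.25, 0.5)                          # hypothetical values of t
}
print(mean_prize)
```

In each case the average prize equals the stated expected value, so only the dispersion of the shock changes across simulations.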
The simulations are generally consistent with our estimates of lottery treatment on fertility
above. These are seen in row-groups A, B, and D, in Table 9. The fertility/wealth relationship in the
control sample (not shown) is approximately flat in terms of economic significance. (A smoothed
plot of 1850 fertility versus 1850 wealth displays an inverse c-shape. However, the range on the y-axis
is quite small and only a minor change in fertility is associated with large changes in wealth.)
In the simulations, wealth shocks of various sizes change the fertility rate by only a few children per
hundred. The 95% confidence intervals for the simulation typically do not contain the point
estimates from above, but they do overlap with the estimated confidence interval.
With a few exceptions noted below, the simulations are generally not consistent with our
estimates of lottery treatment on human-capital variables, particularly at the low end. These results
are found in Table 9, row-groups C and D-J. First, consider 1850 school attendance in row-group
C. By this simulation, a homogeneous $700 wealth shock would increase school attendance by
approximately 5.4%. This is different in both statistical and economic terms from the lottery-based
estimate of -0.001. The simulation delivers larger (smaller) effects for larger (smaller) wealth
shocks. In contrast, while we find positive rather than negative simulated effects for the occupational
income score of the sons in 1880, there is generally a substantial overlap between the confidence
intervals of the simulation and the estimate. (This outcome is complicated by the fact that essentially
the entire sample was involved in farming, thus narrowing the occupational range.) The results for
the sons’ literacy, however, show economically and statistically significant differences between the
simulation and the estimates. In the simulation, a positive wealth shock should have reduced the rate
of illiteracy. However, the relationship between 1850 wealth and human capital of the descendants
is weaker for the grandchildren than for the sons. Accordingly, there is substantial overlap in the
confidence intervals for grandchildren’s human capital, except for the $900 wealth shock.
Finally, we examine the 1870 wealth of sons, which the simulations suggest would have been
markedly different at the low end (row-groups H-J). By the simulation, we would have expected an
increase in the proportion of the sons with positive wealth in 1870, rather than a decrease as was
estimated above. Relatedly, the simulations imply a large increase in the natural log of the sons’
1870 wealth, while we observed essentially no change using the lottery-based estimates. While these
latter two outcomes are weighted towards changes in the lower tail of the sons’ wealth distribution,
we also examine the level of wealth in row-group H. In each simulation for 1870 wealth levels, the
estimated and simulated confidence intervals have substantial overlap. Taken together, these results
indicate that the simulations imply the strongest effect of winning the lottery at the low end of the sons’ wealth distribution.
We might expect this pattern of results on a priori grounds as well in that high-wealth families were
presumably less likely to be liquidity constrained.
9. Conclusions
The state of Georgia allocated most of its land to the public through a system of lotteries. These
episodes provide a unique opportunity to assess the impact of shocks to wealth, in that the random
assignment implied that the wealth shock was uncorrelated with individual characteristics. We focus
on the 1832 Cherokee Land Lottery. We assess the impact on the winners themselves and their
families into the third generation. Using 1850 Census microdata, we draw a sample of male
household heads that likely were eligible for the lottery. The rate of registration for this eligible
population was very high. We identify the lottery winners using Georgia state records and define
them as our treatment group. We cannot reject that the treatment variable was randomly assigned
in several balancing and placebo tests. We estimate that lottery winners won some $700 – close to
median wealth in 1850 and the equivalent of nearly two and a half years of wages for an unskilled
laborer in the South.
We focus on child outcomes in response to this wealth shock. Lottery winners increased their
family size after the lottery slightly more than non-winners, but were not more likely to send
their children to school. Children of lottery winners did not have more wealth, literacy, or income
as adults. Further, the grandchildren of winners were not more likely to be literate or attend school.
Indeed, the sons of lottery winners actually have fewer children and, if anything, send their children
to school less than the control-group sons. This reduction of treated fertility in the second generation
actually leaves the estimated number of grandchildren similar between control and treatment groups,
effectively nullifying any fertility effect from treatment in the long run.
Despite the substantial size of the financial windfall received by lottery winners and the
presence of returns to human capital, it does not appear that lottery winners invested more in their
children (or that winners’ children in turn invested more in their own children) than did losers (or
losers’ children). The random nature of the lottery assures us that winning was orthogonal to parents’
wealth or their underlying characteristics. Taken together, these findings are inconsistent with
parents’ financial resources being a significant constraint in shaping their children’s human capital.
The results are also inconsistent with a wealth-based “poverty trap” for human capital. The observed
intergenerational links are consistent instead with the presence of underlying characteristics that are
passed down along family lines and are associated with better outcomes.
Clark (2007, p. 8) describes a similar process by which characteristics associated with better
economic outcomes – patience, hard work, ingenuity, innovativeness, education – persisted and
spread within family lines in England as the fertility of the affluent exceeded that of the poor, setting
the stage for the Industrial Revolution. Attributes like these and the attitudes toward human capital
accumulation they inform might be transmitted through a variety of channels, generating the
intergenerational correlations and their apparent immunity to wealth shocks that we have shown.
10. References
accessgenealogy.com. 1832 Cherokee Country Georgia Land Lottery. Web site.
http://www.accessgenealogy.com/georgia/landlottery/index.htm [Accessed May 29, 2009.]
Akresh, Richard, Damien de Walque and Harounan Kazianga. 2013. Cash Transfers and Child
Schooling: Evidence from a Randomized Evaluation of the Role of Conditionality. World Bank
Policy Research Working Paper 6340. January.
Baird, Sarah, Craig McIntosh, and Berk Özler. 2011. Cash or Condition? Evidence from a Cash
Transfer Experiment. Quarterly Journal of Economics 126 (4): 1709-53.
Becker, Gary. 1982. A Treatise on the Family. Chicago: University of Chicago Press.
Becker, Gary. 1992. Fertility and the Economy. Journal of Population Economics, Vol. 5, No. 3:
185-201.
Becker, Gary S., and Nigel Tomes. 1986. Human Capital and the Rise and Fall of Families. Journal
of Labor Economics, Vol. 4, No. 3 (July): S1-39.
Black, Sandra E., Paul J. Devereux and Kjell G. Salvanes. 2005. Why The Apple Doesn’t Fall Far:
Understanding Intergenerational Transmission Of Human Capital. American Economic Review, Vol.
95, No. 1 (Mar.), pp. 437-449.
Bursztyn, Leonardo and Lucas C. Coffman. 2012. The Schooling Decision: Family Preferences,
Intergenerational Conflict, and Moral Hazard in the Brazilian Favelas. Journal of Political
Economy, 120(3): 359-397.
Bleakley, Hoyt and Joseph Ferrie. 2013. Up from Poverty? The 1832 Cherokee Land Lottery and the
Long-run Distribution of Wealth. NBER Working Paper No. w19175.
Bleakley, Hoyt and Sok Chul Hong. 2013. When the Race between Education and Technology Goes
Backwards: The Postbellum Decline of White School Attendance in the Southern U.S. Unpublished
manuscript, University of Chicago. April.
Cadle, Farris W. 1991. Georgia Land Surveying History and Law. Athens, Georgia: Univ. of
Georgia Press.
Charles, Kerwin Kofi, and Erik Hurst. 2001. The Correlation of Wealth across Generations. Working
Paper (June). University of Chicago, Booth School of Business.
Charles, Kerwin Kofi, and Erik Hurst. 2003. The Correlation of Wealth across Generations. Journal
of Political Economy, Vol. 111, No. 6 (December): 1155-1182.
Chevalier, Arnaud. 2004. Parental Education and Child’s Education: A Natural Experiment. IZA
Discussion Paper No. 1153.
Clark, Gregory. 2007. A Farewell to Alms. Princeton: Princeton Univ. Press.
Clark, Gregory and Neil Cummins. 2013. What is the True Rate of Social Mobility? Surnames and
Social Mobility, England 1800-2012. Unpublished manuscript.
Das, Jishnu, Quy-Toan Do, and Berk Özler. 2005. Reassessing Conditional Cash Transfer Programs.
The World Bank Research Observer, 20(1):57-80. http://wbro.oxfordjournals.org/cgi/reprint/20/1/57
Dewey, John. 1889. Galton’s Statistical Methods. Publications of the American
Statistical Association, Vol. 1, No. 7: 331-34.
Galor, Oded and Joseph Zeira. 1993. Income Distribution and Macroeconomics. Review of Economic
Studies, Vol. 60(Jan.): 35-52.
Graham, Paul K. 2010. Georgia Land Lottery Research. Atlanta: Georgia Genealogical Society.
Güell, Maia, Jose V. Rodriguez, and Christopher I. Telmer. 2012. Intergenerational Mobility and the
Informational Content of Surnames. Unpublished manuscript. January.
Haines, Michael R., and Inter-university Consortium for Political and Social Research, 2010.
Historical, Demographic, Economic, and Social Data: The United States, 1790–2002,
Inter-university Consortium for Political and Social Research, icpsr.org.
Kotlikoff, Laurence J. 1979. The Structure of Slave Prices in New Orleans, 1804 to 1862. Economic
Inquiry, Vol. 17, No. 4 (Oct.): 496-518.
Long, Jason, and Joseph Ferrie. 2013. Intergenerational Occupational Mobility in Great Britain and
the United States since 1850. American Economic Review, Vol. 103, No. 4 (June): 1109-37.
Malthus, Thomas Robert. 1806. Essay on the principle of population; or, A view of its past and
present effects on Human happiness. London: Johnson.
Margo, Robert A., and Georgia C. Villaflor. 1987. The Growth of Wages in Antebellum America:
New Evidence. The Journal of Economic History, Vol. 47, No. 4 (Dec.): 873-895.
North Atlantic Population Project. 2004. NAPP: Complete Count Microdata, Preliminary Version
0.2. Computer File. Distributed by University of Minnesota, Minnesota Population Center:
Minneapolis, MN. www.nappdata.org.
Oreopoulos, Philip, Marianne E. Page, and Ann Huff Stevens. 2006. The Intergenerational Effects
of Compulsory Schooling. Journal of Labor Economics, Vol. 24, No. 4 (Oct.): 729-760.
Ransom, Roger, and Richard Sutch. 1988. Capitalists without Capital: The Burden of Slavery and
the Impact of Emancipation. Agricultural History, Vol. 62, No. 3, Quantitative Studies in Agrarian
History (Summer): 133-160.
Ruggles, Steven, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and
Matthew Sobek. 2010. Integrated Public Use Microdata Series: Version 5.0 [Machine-readable
database]. Minneapolis, MN: Minnesota Population Center [producer and distributor].
Sacerdote, Bruce. 2005. Slavery and the Intergenerational Transmission of Human Capital. The
Review of Economics and Statistics, Vol. 87, No. 2 (May): 217-234.
Schultz, T.W. 1975. The Value of the Ability to Deal with Disequilibria. Journal of Economic
Literature, Vol. 13, No. 3 (Sep.): 827-846.
Schumpeter, J.A. 1951. Imperialism and Social Classes, translated by Heinz Norden. New York:
Augustus M. Kelley.
Smith, James F. 1838. The Cherokee land lottery, containing a numerical list of the names of the
fortunate drawers in said lottery, with an engraved map of each district. New York: Harper and
Brothers.
Williams, H. David. 1989. Gambling Away the Inheritance: The Cherokee Nation and Georgia’s
Gold and Land Lotteries of 1832-33. The Georgia Historical Quarterly, Vol. 73, No. 3, Special Issue
Commemorating the Sesquicentennial of Cherokee Removal 1838-1839 (Fall): 519-539.
Williamson, Samuel H. 2013. Seven Ways to Compute the Relative Value of a U.S. Dollar Amount,
1774 to present. MeasuringWorth.com, accessed March 15, 2013.
Table 1: Summary Statistics
| Variable | (1) Whole Sample | (2) Lottery “Losers” | (3) Lottery “Winners” | (4) p-value, mean difference [N] |
|---|---|---|---|---|
| Panel A: Lottery Winner or Loser | | | | |
| Dummy for unique match to Smith (1838) list | 0.124 (0.329) | 0 | 1 | --- |
| Dummy for match to Smith (1838), deflated to 1/n in case of ties | 0.155 (0.335) | 0.037 (0.121) | 0.995 (0.053) | 0.000 [14375] |
| Panel B: Predetermined Outcomes | | | | |
| Age, in years | 51.2 (8.5) | 51.3 (8.5) | 50.9 (8.6) | 0.122 [14375] |
| Born in Georgia | 0.497 (0.500) | 0.497 (0.500) | 0.498 (0.500) | 0.889 [14375] |
| Born in South Carolina | 0.212 (0.408) | 0.210 (0.407) | 0.222 (0.416) | 0.263 [14375] |
| Born in North Carolina | 0.180 (0.384) | 0.180 (0.384) | 0.178 (0.383) | 0.804 [14375] |
| Number of Georgia-born children in the three years prior to the lottery | 1.333 (0.542) | 1.333 (0.541) | 1.332 (0.542) | 0.910 [14375] |
| Cannot read and write | 0.147 (0.354) | 0.147 (0.354) | 0.142 (0.350) | 0.593 [14340] |
| Number of letters in surname | 6.19 (1.61) | 6.20 (1.62) | 6.13 (1.51) | 0.072 [14375] |
| Frequency with which surname appears in sample | 36.2 (46.3) | 36.3 (46.9) | 35.3 (41.9) | 0.380 [14375] |
| Surname begins with “M” or “O” | 0.101 (0.302) | 0.101 (0.301) | 0.104 (0.305) | 0.740 [14375] |
| Mean wealth of families in Georgia with same surname | 1186.3 (1257.8) | 1185.4 (1288.4) | 1192.3 (1021.8) | 0.811 [13848] |
| Median wealth of families in Georgia with same surname | 289.1 (716.6) | 290.0 (717.6) | 282.7 (709.9) | 0.686 [13848] |
| Mean illiteracy of adults in Georgia with same surname | 0.219 (0.107) | 0.219 (0.108) | 0.218 (0.098) | 0.648 [13848] |
Notes: Table continues on next page.
Table 1 (continued): Summary Statistics
| Variable | (1) Whole Sample | (2) Lottery “Losers” | (3) Lottery “Winners” | (4) p-value, mean difference [N] |
|---|---|---|---|---|
| Panel C: Measures of Wealth in 1850 | | | | |
| Real-estate wealth | 1999.0 (4694.2) | 1970.8 (4422.0) | 2198.2 (6290.1) | 0.068 [13094] |
| Slave wealth | 1339.1 (5761.0) | 1297.3 (5329.7) | 1635.3 (8189.0) | 0.021 [14375] |
| Total wealth (sum of wealth in real estate and slaves) | 3323.7 (8691.0) | 3245.5 (7952.9) | 3876.5 (12734.4) | 0.006 [13094] |
| Panel D: Child Quantity versus Quality | | | | |
| Number of children in household born after the 1832 lottery | 3.955 (2.546) | 3.930 (2.539) | 4.135 (2.586) | 0.002 [14375] |
| School attendance among children aged 5-18, inclusive | 0.342 (0.474) | 0.342 (0.475) | 0.341 (0.474) | 0.799 [47749] |
| Panel E: Other Outcomes | | | | |
| Spouse present in household | 0.806 (0.395) | 0.804 (0.397) | 0.820 (0.384) | 0.109 [14375] |
| Spouse age, in years | 45.9 (7.8) | 46.0 (7.8) | 45.5 (7.8) | 0.037 [11591] |
| Spouse cannot read and write | 0.235 (0.424) | 0.236 (0.424) | 0.231 (0.421) | 0.676 [11563] |
| Resides in Georgia | 0.723 (0.447) | 0.722 (0.448) | 0.729 (0.445) | 0.548 [14375] |
| Resides in Alabama | 0.144 (0.351) | 0.144 (0.351) | 0.145 (0.352) | 0.935 [14375] |
Notes: This table displays summary statistics for the main data used in the present study. The sample consists of all household heads in the
1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of
Georgia during the same period. Column (1) presents means and standard deviations (in parentheses) of variables for this entire sample. We
use two measures of whether the person won land in the drawing for the Cherokee Land Lottery of 1832. The first measure is coded to 1 if
that person is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero.
The second measure takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. These
variables are summarized in Panel A. Columns (2) and (3) present means and standard deviations of variables for the subsamples of,
respectively, lottery losers and winners (decomposed using the first measure). Column (4) presents the p-value on the test of zero difference
in means between the subsamples of losers and winners. In square brackets, we report the sample size used for this test, although the tests
involving children or surnames adjust for the clustering of errors. With the exception of the measure of surname length, we use the Soundex
version of each name to account for minor spelling differences. For the variables that are means by surname, we use the 1850 100% census
file to construct average fertility, school attendance, and real-estate wealth among Georgia-resident households for each (soundex) surname.
(Those individuals that appear in our lottery-eligible sample are excluded from the construction of these indices.) Real-estate wealth is as
reported on and transcribed from the manuscript pages of the 1850 Census of Population. Slave wealth was estimated by linking the
household to the 1850 Slave Schedule and imputing a market value of slave holdings adjusting for the reported ages and gender of slaves on
the Schedule. Numbers in curly brackets in Panel C are the 25th, 50th, and 75th percentiles of the respective wealth measures. Data sources
and additional variable and sample definitions are found in the text.
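The two lottery-status measures described in these notes can be sketched in a few lines of Python. This is a simplified reading, not the procedure used to build the data: the matching key (here a plain string), the function name `code_lottery_status`, and the treatment of ties as "sample members sharing a key" are all assumptions made for illustration. The matching rule described in the text may differ in detail.

```python
from collections import Counter

def code_lottery_status(sample_keys, winner_keys):
    """Return (binary, fractional) winner measures, one entry per sample person.

    sample_keys: hypothetical matching key for each person in the sample
                 (for example, a normalized name string).
    winner_keys: set of keys found on the published list of winners.
    """
    counts = Counter(sample_keys)  # how many sample persons share each key
    binary, fractional = [], []
    for key in sample_keys:
        if key in winner_keys:
            n = counts[key]                    # n persons tie for this match
            binary.append(1 if n == 1 else 0)  # first measure: unique matches only
            fractional.append(1.0 / n)         # second measure: deflate ties to 1/n
        else:
            binary.append(0)
            fractional.append(0.0)
    return binary, fractional

# Two men named "b" tie for one winning entry; "c" is not on the list.
binary, fractional = code_lottery_status(["a", "b", "b", "c"], {"a", "b"})
print(binary)      # [1, 0, 0, 0]
print(fractional)  # [1.0, 0.5, 0.5, 0.0]
```

Under this coding, tied observations count as losers in the binary measure but carry positive weight in the fractional one, which is why the 1/n measure has a nonzero mean among "losers" in Panel A.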
Table 2: Lottery Status versus Total Wealth in 1850
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
855.1
(348.9) **
677.8
(385.6) *
723.6
(325.2) **
645.6
(332.6) **
777.6
(310.6) **
Panel A: Binary Match to Smith (1838)
723.4
(325.3) **
714.4
(319.5) **
710.1
(325.4) **
632.4
(311.2) **
593.6
(352.3) *
Panel B: Allow for 1/n Matching to Smith (1838)
777.7
(310.7) **
Additional Fixed-Effect Controls:
None
749.8
(303.0) **
First letter
of
surname
762.5
(310.5) **
Number
of letters
in
surname
660.2
(300.2) **
Freq. of
surname
in sample
572.0
(335.6) *
Surname
922.7
(331.3) ***
Given
name
Surname;
Given
name
None;
Adjust
truncated
lower tail
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning the lottery is
reported. The sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no
children born outside of Georgia during the same lapse of time. The dependent variable in this table is total measured wealth. This variable is the sum of real-estate wealth, which
was reported to enumerators on the population schedule, and slave wealth, which was computed from the slave schedule. The sample size is 13,094. The baseline specification
also includes dummies for age and for (state x county) of residence. Additional sets of fixed effects are included in columns 2-7, as reported in the bottom row. In columns 4-7,
we use the Soundex version of each name to account for minor spelling differences. Two variables are constructed to measure whether the person was a lottery winner. The first
measure, used in Panel A, is coded to 1 if that person is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to
zero. The second measure, which is used in Panel B, takes individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. A single
asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. Data sources and additional variable and sample definitions are found in the text.
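For readers unfamiliar with Soundex, the following is a minimal sketch of the standard American Soundex code, which maps similar-sounding surnames to the same four-character key. It is assumed here, not confirmed by the notes, that this standard variant is the one meant; the function name `soundex` is illustrative.

```python
# Digit codes for consonant groups in standard American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the four-character Soundex key of a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")   # its code still blocks an adjacent repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate repeated codes
        code = CODES.get(ch, "")       # vowels (and Y) have no code
        if code and code != prev:
            key.append(code)
        prev = code                    # a vowel resets prev, allowing a repeat
    return ("".join(key) + "000")[:4]  # pad with zeros or truncate to length 4

print(soundex("Smith"), soundex("Smyth"))    # S530 S530
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Grouping on such a key is what lets minor spelling variants of a surname share one fixed effect.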
Table 3: Effects on Child Quantity versus Quality
| | (1) Number children born post 1832 | (2) Natural log of total children | (3) Attended school | (4) Attended school (Logit) |
|---|---|---|---|---|
| 1. Estimates of the effect of winning the lottery | | | | |
| Panel A: Basic Specification | 0.134 (0.059) ** | 0.032 (0.015) ** | -0.001 (0.011) | -0.005 (0.051) |
| Panel B: Control for Surname Fixed Effects | 0.193 (0.073) *** | 0.030 (0.014) ** | -0.003 (0.011) | -0.010 (0.033) |
| 2. Estimation sample | Lottery-eligible person, linked to household characteristics [N=14375] | Lottery-eligible person, linked to household characteristics [N=14375] | Children aged 5-17, inclusive [N=47749] | Children aged 5-17, inclusive [N=47749] |
Notes: This table displays estimates of equation (1) in the text. Each cell presents results from a separate
regression, and only the coefficient on winning the lottery is reported. Estimates are computed using OLS, except
in Column 4, which uses logit. The basic specification (shown in Panel A) also includes dummies for age and for
(state x county) of residence. The specification used in Panel B includes fixed effects for surname (soundex).
The base sample consists of all households in the 1850 census with children born in Georgia during the three
years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period.
The sample for Columns 1-2 consists of household heads, while the sample for Columns 3-4 consist of their
children at least 5 but not more than 17 years of age. The dependent variables are indicated in the column
headings. A household is coded as a lottery winner if the head is a unique match to a name found on the list of
winners published by Smith (1838); anyone else in the sample is coded to zero. A single asterisk denotes
statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors are
heteroskedasticity robust and clustered on the lottery-eligible man if there are multiple observations per
household. Data sources and additional variable and sample definitions are found in the text.
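The asterisk convention in these notes can be checked against the reported coefficients and standard errors. The sketch below assumes two-sided tests with normal critical values (1.645, 1.960, 2.576); that choice, and the function names, are assumptions of this illustration and not a statement of how the tables were produced.

```python
def significance_stars(coef, se):
    """Map a coefficient and its standard error to the asterisk notation."""
    z = abs(coef / se)
    if z >= 2.576:   # 99% confidence level
        return "***"
    if z >= 1.960:   # 95% confidence level
        return "**"
    if z >= 1.645:   # 90% confidence level
        return "*"
    return ""

def conf_interval_95(coef, se):
    """Normal-approximation 95% confidence interval."""
    half_width = 1.96 * se
    return coef - half_width, coef + half_width

# Table 3, Panel A, Column 1: 0.134 (0.059)
print(significance_stars(0.134, 0.059))  # **
low, high = conf_interval_95(0.134, 0.059)
print(round(low, 3), round(high, 3))     # 0.018 0.25
```

The interval for this coefficient agrees with the [.018, .250] reported for the same estimate in Table 9.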
Table 4: Mechanisms and Decompositions, Fertility and School Attendance
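| | (1) Spouse present | (2) Spouse age, if present | (3) Post-1832 children > 0 | (4) Post-1832 children > 3 | (5) Child gender is male | (6) Attended school in past year | (7) Attended school in past year | (8) Attended school in past year | (9) Attended school in past year |
|---|---|---|---|---|---|---|---|---|---|
| 1. Estimates of the effect of winning the lottery | | | | | | | | | |
| Panel A: Basic Specification | 0.013 (0.010) | -0.104 (0.138) | 0.014 (0.007) ** | 0.021 (0.011) * | 0.000 (0.008) | 0.003 (0.013) | -0.006 (0.013) | -0.007 (0.012) | 0.007 (0.013) |
| Panel B: Control for Surname Fixed Effects | 0.009 (0.011) | -0.038 (0.153) | 0.015 (0.007) ** | 0.018 (0.010) * | 0.000 (0.009) | 0.005 (0.014) | -0.008 (0.013) | -0.013 (0.013) | 0.009 (0.014) |
| 2. Estimation sample | Lottery-eligible person, linked to household characteristics [N=14375] | Lottery-eligible person, linked to household characteristics [N=14375] | Lottery-eligible person, linked to household characteristics [N=14375] | Lottery-eligible person, linked to household characteristics [N=14375] | Children under 18 [N=47749] | Males, age 5-17 [N=24510] | Females, age 5-17 [N=23239] | Children, age 5-12 [N=26756] | Children, age 13-17 [N=20993] |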
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning the lottery is
reported. The basic specification (shown in Panel A) also includes dummies for age and for (state x county) of residence. The specification used in Panel B includes fixed
effects for surname (soundex). The base sample consists of all households in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land
Lottery of 1832 and no children born outside of Georgia during the same period. The sample for Columns 1-4 consists of household heads, while the sample for Columns 5-8
consist of their children at least 5 but not more than 17 years of age, with subsamples noted in the last row. The dependent variables are indicated in the column headings. A
household is coded as a lottery winner if the head is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to
zero. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors are heteroskedasticity robust and clustered
on the lottery-eligible man if there are multiple observations per household. Data sources and additional variable and sample definitions are found in the text.
Table 5: Differences in 1850-County-of-Residence Characteristics by Lottery Status
Columns (dependent variables): (1) Resides in Old Cherokee County; (2) Resides in Georgia; (3) Miles East; (4) Miles North; (5) School Enroll. Rate; (6) Total Fertility Rate (TFR5); (7) Total Fertility Rate (TFR19); (8) Log of Farm Value per Acre; (9) Log of Average Farm Size; (10) Log of Improved Land Ratio; (11) Log Slaves per Area; (12) Log Pop. Density in 1850; (13) Log Pop. Density in 1830; (14) Log Fraction Urban; (15) Access to Water Transport; (16) Access to Railroads
-0.017
(0.017)
-0.001
(0.026)
-0.045
(0.029)
-0.111
(0.052) **
0.072
(0.072)
-0.007
(0.011)
0.018
(0.016)
0.000
(0.026)
-0.057
(0.030) *
-0.117
(0.054) **
0.066
(0.062)
-0.001
(0.011)
0.015
(0.016)
-0.048
(0.029)
-0.101
(0.052) *
0.072
(0.072)
-0.003
(0.011)
0.018
(0.015)
-0.108
(0.054) **
0.066
(0.062)
0.002
(0.011)
0.014
(0.015)
Panel A: Basic Specification
0.022
(0.008) ***
0.005
(0.011)
4.320
(3.643)
-4.026
(2.211) *
-0.003
(0.003)
0.006
(0.004)
0.022
(0.008) ***
0.004
(0.013)
4.265
(3.997)
-4.661
(2.306) **
-0.004
(0.003)
0.005
(0.004)
0.011
(0.012)
-0.007
(0.021)
-0.014
(0.017)
Panel B: Control for Surname Fixed Effects
0.009
(0.012)
-0.011
(0.022)
-0.005
(0.017)
-0.024
(0.018)
Panel C: Basic Specification, Control for Residence in Old Cherokee County
---
---
4.654
(4.343)
-5.924
(2.781) **
-0.004
(0.003)
---
---
4.560
(4.727)
-6.569
(2.761) **
-0.005
(0.003) *
0.005
(0.004)
0.007
(0.012)
-0.012
(0.022)
-0.004
(0.016)
-0.015
(0.017)
0.008
(0.025)
Panel D: Control for Surname Fixed Effects, Control for Residence in Old Cherokee County
0.004
(0.004)
0.005
(0.012)
-0.016
(0.022)
0.004
(0.016)
-0.022
(0.019)
0.009
(0.026)
-0.058
(0.030) *
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning the lottery is reported. The basic specification (shown in Panel A) also includes dummies for age. The specification
used in Panel B includes fixed effects for surname (soundex). Panels C and D repeat specifications from Panels A and B, respectively, but also include a dummy variable for residence in Old Cherokee County. The sample consists of all household heads in the 1850 census
with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during the same period. The dependent variables are the locational county-specific characteristics denoted in the column headings.
Location data used in Columns 3 and 4 are county centroids computed from NHGIS data, and are converted into miles east or north of the NAD83 reference point in central Oklahoma. County data used in Columns 5-14 are drawn from ICPSR study #2896. The number of
observations for Columns 1-4 is 14375 and for Columns 5-14 is 14237 because of missing data for some (mostly unorganized) counties. A household is coded as a lottery winner if the head is a unique match to a name found on the list of winners published by Smith (1838);
anyone else in the sample is coded to zero. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors are heteroskedasticity robust and, in Columns 3-14, clustered at the (state x county) level to account for
multiple observations per county. Data sources and additional variable and sample definitions are found in the text.
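The conversion of county centroids into miles east or north of a reference point can be illustrated with a simple equirectangular approximation. This is only a sketch: the projection, the Earth-radius constant, and the function name `miles_east_north` are assumptions, and the coordinates of the reference point are left as parameters because they are not given in these notes.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch

def miles_east_north(lon, lat, ref_lon, ref_lat):
    """Approximate offset, in miles east and miles north, of (lon, lat)
    from a reference point, with all coordinates in decimal degrees."""
    north = math.radians(lat - ref_lat) * EARTH_RADIUS_MILES
    # East-west distance shrinks with latitude; use the mean latitude.
    mean_lat = math.radians((lat + ref_lat) / 2.0)
    east = math.radians(lon - ref_lon) * math.cos(mean_lat) * EARTH_RADIUS_MILES
    return east, north

# Hypothetical coordinates, for illustration only.
east, north = miles_east_north(lon=-84.0, lat=34.0, ref_lon=-98.0, ref_lat=35.0)
print(east > 0, north < 0)  # True True: east and slightly south of the reference
```

With outcomes measured this way, a negative coefficient in the "Miles North" column corresponds to winners residing, on average, slightly farther south.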
Table 6: Outcomes of Next Generation(s) in 1870-80
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
Linked to
1880
census
Linked to
1870
census
Unable to
read and
write
Occup.
Score
Total
Wealth
($)
Wealth
Positive
Natural
log of
Wealth
Unable to
read and
write
Enrolled
in school
Number
children
under 18
1. Estimates of the effect of father or grandfather winning the lottery
Panel A: Basic Specification
0.037
(0.010) ***
0.018
(0.008) **
0.004
(0.008)
-0.124
(0.285)
57.6
(86.8)
-0.031
(0.016) *
-0.038
(0.048)
-0.002
(0.013)
-0.020
(0.012) *
-0.092
(0.056) *
0.001
(0.014)
-0.028
(0.012) **
-0.092
(0.056) *
-0.003
(0.015)
-0.035
(0.014) **
-0.105
(0.060) *
Panel B: Control for Surname Fixed Effects
0.027
(0.011) **
0.011
(0.008)
0.005
(0.009)
-0.087
(0.326)
112.4
(100.9)
-0.017
(0.019)
0.023
(0.060)
Panel C: Control for Surname Effects and Length of Given Name
0.017
(0.013)
0.003
(0.010)
0.011
(0.011)
0.328
(0.398)
115.7
(119.0)
-0.008
(0.025)
0.055
(0.078)
2. Estimation sample
Children
in 1850
Children
in 1850
1850
children
as adults
in 1880
[N=40024]
[N=24510]
[N=14963]
1850
children
as adults
in 1880
1850
children
as adults
in 1870
1850
children
as adults
in 1870
1850
children
as adults
in 1870
Children
in 1880,
10-19
years old
Children
in 1880,
ages 5-19
1850
children
as adults
in 1880
[N=14956]
[N=6823]
[N=6823]
[N=6823]
[N=23544]
[N=40658]
[N=14963]
Notes: This table displays OLS estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on winning the lottery is reported. The basic
specification (shown in Panel A) includes dummies for age and for (state x county) of residence. The specification used in Panel B also includes fixed effects for surname (soundex), and the
specification in Panel C adds to this dummies for the length (number of letters) of the given name. The base sample of children in 1850 is as described in prior tables, and this sample is used in
Columns 1 and 8 to estimate the differential probability of linkage to 1870 and 1880 censuses. The samples in the remaining columns are drawn from the 1870 or 1880 households of those
male children linked from 1850. The dependent variables are indicated in the column headings. A household is coded as a lottery winner if the head is a unique match to a name found on the
list of winners published by Smith (1838); anyone else in the sample is coded to zero. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%.
All standard errors are heteroskedasticity robust and clustered on the lottery-eligible man if there are multiple observations per household. Data sources and additional variable and sample
definitions are found in the text.
Table 7: Estimated Returns to Skill from the Cross Section
(1)
(2)
(3)
(4)
(5)
(6)
(7)
Panel A: Estimated returns to human capital
Dependent variable:
Total
wealth in
1850
Occup.
score in
1880
Total
wealth in
1870
Literacy
in 1880
Occup.
score in
1880
Total
wealth in
1870
Measure of human capital:
Cannot
read and
write,
1850
Cannot
read and
write,
1880
Cannot
read and
write,
1880
Attend
school in
1850
Attend
school in
1850
Attend
school in
1850
Estimates from basic
specification:
-2737
(103) ***
[-770]
-3.107
(0.188) ***
-661
(100) ***
[-300]
-0.038
(0.008) ***
[-300]
2.004
(0.325) ***
633
(97) ***
[205]
Estimates using surname
fixed effects:
-2828
(162) ***
-2.994
(0.218) ***
-648 ***
(149)
-0.027
(0.010) ***
1.586
(1.586) ***
548 ***
(115)
Sample:
Sample size:
Lotto-eligible
13063
Children in 1850 as adults in 1870-1880 (ages [5,18] for col 4-6)
14956
6501
7524
7524
5380
Panel B: Does surname average predict own level?
Dependent variable:
Average of outcome for
surname in 1850 Georgia
Number
children
born post
1832
0.169
(0.063) ***
Attended
school in
1850
0.130
(0.015) ***
Total
wealth in
1850,
levels
0.225
(0.088) **
Total
wealth in
1850,
logs
0.164
(0.022) ***
Son's total wealth in 1870, logs
0.076
(0.033) **
Father's total wealth in 1850,
logs
Number of observations:
14213
45688
12661
12553
5080
0.023
(0.032)
0.226
(0.012) ***
0.225
(0.012) ***
5080
5080
Notes: This table displays OLS estimates of equations (2) and (3). (Terms in square brackets for wealth outcomes are from a quantile regression at the median.) This table
departs from previous ones in that we are not estimating a treatment effect of winning the lottery, but rather estimating the relationship (not necessarily a causal one) between
human-capital variables and other outcomes. Each cell presents results from a separate regression. In addition to the displayed coefficients, regressions include dummies for
age and state/county of residence. The second specification in Panel A includes fixed effects for surname (soundex). The samples are as described in the previous tables. In
the regressions with school attendance, children under 5 and over 18 years of age in the year in which that variable is measured are excluded from the sample. The dependent
and main independent variables for each column are indicated in the first two rows of each table. For Panel B, we use the 1850 100% census file to construct average fertility,
school attendance, and real-estate wealth among Georgia-resident households for each (soundex) surname. (Those individuals that appear in our lottery-eligible sample are
excluded from the construction of these indices.) A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All standard errors
(shown in parentheses) are heteroskedasticity robust and clustered at the level of the original lottery-eligible person. Data sources and additional variable and sample
definitions are found in the text.
Table 8: Falsification test using South Carolina instead of Georgia to construct sample
| Lottery-status variables | (1) Born in Georgia | (2) Born in South Carolina | (3) Number Ga.-born children, pre-lottery | (4) Number SC-born children, pre-lottery | (5) Resides in Old Cherokee County | (6) Number children born post 1832 | (7) Real-estate Wealth ($) | (8) Real-estate Wealth >$100 | (9) In school, children ages [5,18] |
|---|---|---|---|---|---|---|---|---|---|
| Panel A: South Carolina, basic specification | | | | | | | | | |
| Dummy for unique match to Smith (1838) list | 0.001 (0.004) | -0.017 (0.013) | --- | -0.019 (0.019) | 0.005 (0.007) | 0.019 (0.077) | -41.1 (236.2) | 0.001 (0.016) | 0.025 (0.021) |
| Dummy for match to Smith (1838), deflated to 1/n in case of ties | 0.004 (0.004) | -0.016 (0.012) | --- | -0.002 (0.018) | 0.010 (0.007) | 0.001 (0.074) | -15.6 (232.0) | 0.001 (0.015) | 0.025 (0.020) |
| Panel B: South Carolina, including surname fixed effects | | | | | | | | | |
| Dummy for unique match to Smith (1838) list | 0.000 (0.004) | -0.003 (0.014) | --- | -0.016 (0.021) | 0.006 (0.008) | 0.016 (0.096) | -93.3 (229.4) | 0.005 (0.016) | 0.053 (0.023) ** |
| Dummy for match to Smith (1838), deflated to 1/n in case of ties | 0.003 (0.004) | -0.004 (0.014) | --- | -0.004 (0.020) | 0.009 (0.008) | -0.029 (0.093) | -72.6 -(72.6) | 0.015 (0.015) | 0.055 (0.022) ** |
| Panel C: Analogous results for Georgia, dummy for unique match to Smith list | | | | | | | | | |
| Basic specification | -0.004 (0.012) | 0.014 (0.011) | 0.002 (0.014) | --- | 0.022 (0.008) *** | 0.134 (0.058) ** | 295.2 (154.4) * | 0.002 (0.011) | -0.001 (0.011) |
| Control for surname fixed effects | 0.001 (0.014) | 0.012 (0.012) | 0.009 (0.016) | --- | 0.023 (0.008) *** | 0.193 (0.073) *** | 315.8 (146.8) *** | 0.002 (0.011) | -0.003 (0.011) |
Notes: This table displays estimates of equation (1) in the text. Each cell presents results from a separate regression, and only the coefficient on "winning the lottery" is reported. The sample for Panels A and B consists
of all households in the 1850 census with children born in South Carolina during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of South Carolina during the same period. The sample
for Panel C, which repeats some results from earlier tables, uses households with Georgia-born children in this same window. We use two measures of whether the person won land in the drawing for the Cherokee Land
Lottery of 1832. The first measure is coded to 1 if that person is a unique match to a name found on the list of winners published by Smith (1838); anyone else in the sample is coded to zero. The second measure takes
individuals that “tie” for a match to the Smith list with (n-1) other observations and recodes them to 1/n. Note that these are spurious measures for the South-Carolina samples because the birthplace of their children
implies that they lived outside of Georgia at some point during the three years prior to the lottery, and were therefore ineligible. The basic specification also includes dummies for age. The other specification used
includes fixed effects for surname (soundex). The dependent variables are indicated in the column headings. A single asterisk denotes statistical significance at the 90% confidence level; double 95% and triple 99%. All
standard errors (shown in parentheses) are heteroskedasticity robust and clustered on the lottery-eligible man if there are multiple observations per household. Data sources and additional variable and sample definitions
are found in the text.
Table 9: Simulated effects of wealth and comparison with estimates from the lottery
Outcome measure:
Fraction
w/ zero
Results from simulations or estimates; Mean and [95% Confidence Interval]
EV $700
EV $500
EV $300
EV $900
Estimates from above
Number children
born post 1832
0
.25
.5
.029 [-.008, .065]
.012 [-.023, .041]
-.004 [-.027, .017] †
.038 [.003, .073]
.023 [-.006, .047]
.006 [-.013, .028]
.045 [.010, .077]
.031 [.005, .054]
.017 [-.003, .036]
.018 [-.022, .056]
.001 [-.034, .033]
-.014 [-.039, .011] †
.134 [.018, .250]
3A1
(B)
Natural log of
total children
0
.25
.5
.005 [-.005, .013]
.002 [-.007, .009]
-.002 [-.007, .004]
.007 [-.001, .015]
.004 [-.003, .010]
.001 [-.005, .006]
.009 [.001, .017]
.006 [.000, .012]
.003 [-.002, .007]
.003 [-.008, .012]
-.001 [-.009, .007]
-.004 [-.011, .003]
.032 [.003, .061]
3A2
(C)
Attended school
(children)
0
.25
.5
.042 [.038, .045] †
.039 [.036, .042] †
.034 [.032, .037] †
.028 [.025, .031] †
.026 [.024, .029] †
.024 [.022, .026] †
-.001 [-.023, .021]
3A3
(D)
Number
grandchildren (per
son) under 18
0
.25
.5
-.045 [-.073, -.012]
-.042 [-.065, -.015]
-.035 [-.061, -.020]
-.034 [-.057, -.003]
-.033 [-.051, -.005]
-.029 [-.046, -.011]
-.021 [-.043, .009]
-.021 [-.039, .001]
-.020 [-.032, -.002]
-.055 [-.082, -.018]
-.049 [-.075, -.026]
-.040 [-.070, -.030]
-.092 [-.202, .018]
6A2
Occupational
score (sons)
0
.25
.5
.347 [.207, .482]
.354 [.243, .477]
.346 [.260, .442]
.229 [.086, .351]
.247 [.134, .338]
.252 [.174, .322]
.112 [-.013, .233]
.126 [.025, .216]
.142 [.076, .209]
.458 [.319, .598]
.459 [.331, .609]
.435 [.311, .550]
-.124 [-.683, .435]
6A4
Unable to read
and write (sons)
0
.25
.5
-.021 [-.025, -.015] †
-.018 [-.022, -.014] †
-.015 [-.017, -.011]
-.018 [-.022, -.013] †
-.016 [-.019, -.011]
-.012 [-.015, -.009]
-.015 [-.019, -.010]
-.012 [-.016, -.009]
-.010 [-.012, -.007]
-.023 [-.028, -.018] †
-.020 [-.024, -.015] †
-.016 [-.019, -.013] †
.004 [-.012, .020]
6A3
Unable to read
and write
(grandchildren)
0
.25
.5
-.030 [-.035, -.025]
-.025 [-.030, -.022]
-.020 [-.023, -.017]
-.026 [-.031, -.022]
-.022 [-.026, -.019]
-.018 [-.020, -.015]
-.020 [-.025, -.016]
-.018 [-.021, -.014]
-.014 [-.016, -.012]
-.033 [-.039, -.028] †
-.028 [-.032, -.024]
-.023 [-.026, -.019]
-.002 [-.027, .023]
6A5
(H)
Attended school
(grandchildren)
0
.25
.5
.005 [.001, .009]
.007 [.003, .010]
.008 [.006, .011] †
.001 [-.003, .004]
.003 [.000, .006]
.005 [.003, .008]
-.002 [-.005, .001]
-.001 [-.003, .002]
.001 [-.001, .003]
.008 [.004, .013] †
.010 [.006, .014] †
.011 [.008, .013] †
-.020 [-.044, .004]
6A6
(I)
Total Wealth ($)
0
.25
.5
70 [-199, 227]
93 [-123, 218]
111 [-43, 208]
20 [-252, 177]
46 [-169, 163]
68 [-78, 151]
-27 [-261, 115]
-5 [-213, 110]
22 [-131, 99]
58 [-113, 228]
6A8
Wealth is positive
0
.25
.5
.035 [.028, .043] †
.029 [.023, .036] †
.022 [.017, .027] †
.032 [.024, .039] †
.026 [.020, .033] †
.020 [.015, .025] †
.026 [.018, .033] †
.022 [.016, .027] †
.017 [.013, .021] †
.038 [.029, .048] †
.031 [.025, .040] †
.023 [.019, .030] †
-.031 [-.062, .000]
6A9
Natural log of
total wealth
0
.25
.5
.320 [.298, .339] †
.292 [.277, .310] †
.249 [.238, .266] †
.239 [.221, .259] †
.231 [.217, .247] †
.208 [.195, .219] †
.129 [.111, .148] †
.144 [.131, .157] †
.142 [.133, .152] †
.382 [.359, .403] †
.345 [.323, .362] †
.290 [.274, .303] †
-.038 [-.132, .056] 6A10
(G)
(J)
(K)
1880 (sons and grandchildren)
(F)
1870 (sons only)
(E)
1850
(A)
.054 [.050, .057] †
.049 [.046, .053] †
.043 [.040, .045] †
.064 [.060, .068] †
.058 [.055, .062] †
.049 [.047, .053] †
116 [-170, 281]
135 [-82, 270]
146 [-22, 262]
Notes: This table provides a shift-share analysis with the differences in probability generated by various perturbations of the wealth distribution and the relationship between each outcome and 1850 wealth in the control group. The outcome measures and the year in
which they are measured are displayed on the leftmost columns of the table. We use a discretized distribution of 1850 wealth using 100 grid points evenly spaced across 1850 log wealth. For each simulation, we specify the expected value of winning (in 1850$), as
denoted in the "EV" column-group headings. For each outcome and expected value of winning, we conduct three simulations with varying degrees of heterogeneity in the value of land winnings. These are denoted in the column "Fraction w/ zero", and indicate the
fraction of the simulated winners that receive zero change in wealth. The rightmost columns display estimates from the treatment/control comparisons above. The final column on the right (a number-letter-number sequence) denotes the Table, Panel, and Column
from which the estimate is drawn. Each row and column group displays the mean and, in square brackets, the 95% confidence interval from a different simulation. A dagger denotes that the confidence interval for that simulation does not overlap with the
confidence interval estimated from the lottery treatment. The data for the simulation are the lottery losers, defined as those with no match to the Smith (1838) list. The statistics for each simulation come from 500 bootstrapped samples of the control group, with the
lottery-eligible man being the block for the bootstrap when the outcomes are for their descendants.
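The shift-share calculation described in these notes can be sketched as follows. This is a stylized illustration under stated assumptions, not the procedure behind Table 9: zero wealth is handled with log(1 + wealth), a fraction `frac_zero` of simulated winners gains nothing while the rest gain `ev / (1 - frac_zero)` dollars (so the mean gain equals `ev`), empty grid points borrow the outcome mean of the nearest lower grid point, and the bootstrap resamples individuals and does not block on the lottery-eligible man. All function names are hypothetical.

```python
import math
import random

N_GRID = 100  # evenly spaced grid points on log wealth

def _bin(x, lo, step):
    """Index of the grid point nearest to log-wealth x, clipped to the grid."""
    return min(max(round((x - lo) / step), 0), N_GRID - 1)

def shift_share_effect(wealth, outcome, ev, frac_zero):
    """Predicted change in the mean outcome when control wealth is perturbed."""
    log_w = [math.log1p(w) for w in wealth]
    lo, hi = min(log_w), max(log_w)
    step = (hi - lo) / (N_GRID - 1) if hi > lo else 1.0

    # Outcome-wealth relationship in the control group: mean outcome per grid point.
    sums, counts = [0.0] * N_GRID, [0] * N_GRID
    for x, y in zip(log_w, outcome):
        k = _bin(x, lo, step)
        sums[k] += y
        counts[k] += 1
    profile, last = [], 0.0
    for s, c in zip(sums, counts):
        last = s / c if c else last  # empty points reuse the nearest lower mean
        profile.append(last)

    gain = ev / (1.0 - frac_zero)    # gain among winners with a nonzero prize
    change = 0.0
    for w, x in zip(wealth, log_w):
        before = profile[_bin(x, lo, step)]
        after = profile[_bin(math.log1p(w + gain), lo, step)]
        change += after - before
    return (1.0 - frac_zero) * change / len(wealth)

def bootstrap_ci(wealth, outcome, ev, frac_zero, reps=500, seed=0):
    """Percentile 95% interval from resampling the control group."""
    rng = random.Random(seed)
    n = len(wealth)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(shift_share_effect([wealth[i] for i in idx],
                                        [outcome[i] for i in idx], ev, frac_zero))
    draws.sort()
    return draws[int(0.025 * (reps - 1))], draws[int(0.975 * (reps - 1))]

def intervals_overlap(a, b):
    """False corresponds to a dagger in the table."""
    return a[0] <= b[1] and b[0] <= a[1]

# Row (C), first simulation versus the lottery estimate: marked with a dagger.
print(intervals_overlap((0.038, 0.045), (-0.023, 0.021)))  # False
```

The comparison in the last line mirrors how the daggers are assigned: a simulated interval that lies entirely outside the interval estimated from the lottery is flagged.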
Figure 1: Old Cherokee County and the 1850 Locations of the Sample
Notes: This figure displays a map of the southeastern United States with information on the location (by county) in 1850 of the lottery-eligible households in our main sample. Black lines indicate the
1850 county boundaries, drawn from the NHGIS database. The area shaded in blue in northwest Georgia denotes old Cherokee County, which was allocated by the Cherokee Lottery of 1832. The
sample consists of all household heads in the 1850 census with children born in Georgia during the three years prior to the Cherokee Land Lottery of 1832 and no children born outside of Georgia during
the same period. If households in our sample are resident in a county in 1850, we place a red dot at the county centroid. The area of a dot is proportional to the number of sample households resident in
that county. A minor fraction of sampled households resides in counties outside the frame of this map. Such households are included in the econometric analysis, but we chose to zoom in on this region
to keep the features legible in this figure. Data sources and additional variable and sample definitions are found in the text.
Appendix Figure 1: Quantile Regression Estimates of Treatment Effects on Children's 1870 Wealth
Notes: This figure displays quantile-regression estimates of equation (1) in the text. The coefficient on winning the lottery is reported for various quantiles. Data sources and additional variable and sample definitions are found in the text.