Does Sample-Selection Bias Explain the Antebellum Puzzle?

Evidence from Military Enlistment in the Nineteenth-Century United States
Ariell Zimran
Northwestern University
November 28, 2015
Job Market Paper
Abstract
It is widely believed that the antebellum puzzle—a simultaneous decline in average stature and rise in
GDP per capita, coupled with lower average stature in richer regions, in the antebellum United States—
reveals a welfare cost of economic development that is hidden by conventional indicators; but it has
been argued that the puzzle may be an artifact of sample-selection bias, stemming from a reliance on
data from military volunteers to construct height samples. In this paper I provide the first empirical
test of whether sample-selection bias explains the antebellum puzzle by using a two-step semi-parametric
sample-selection model to estimate trends and patterns in average stature in the antebellum United States
that are corrected for selection into military service on the basis of both observable and unobservable
characteristics. This estimation is based on a new data set of my construction, consisting of military data
from US Army enlistment records—including stature—for the birth cohorts of 1832–1860, linked to US
census data and combined with similar data from the Union Army project. Identification is based on the
incorporation of voting data from the presidential elections of 1856 and 1860, which measure political
motives for military enlistment, and on differences in the nature of the military enlistment decision
depending on whether an individual was old enough to serve in the Civil War. I find that the antebellum
puzzle is robust to these corrections, and therefore is not an artifact of sample-selection bias. A net
decrease in average stature of approximately 0.6 to 0.7 inches between 1832 and 1860 is present despite
the correction, as is a gap of about 0.5 to 0.6 inches in favor of Southerners relative to Northeasterners.
These results, however, do not imply that sample-selection bias can be disregarded in studies of historical
heights. On the contrary, the degree of sample-selection bias is shown to vary over birth cohorts and
across regions and sectors, and accounting for sample selection meaningfully and statistically significantly
alters patterns in average stature across birth cohorts, regions, and sectors.
Acknowledgements I am indebted to Joel Mokyr, Joseph Ferrie and Matthew Notowidigdo for encouragement and guidance. I also thank John Cawley, Stephanie Chapman, Carola Frydman, Seema Jayachandran,
John Komlos, Aviv Nevo, Sangyoon Park, Yannay Spitzer, Richard Steckel, Benjamin Ukert, and Carlos
Villareal for helpful suggestions and insightful comments. Thanks are also due to Roy Mill for access to the
dEntry transcription system and for investing considerable time and energy adjusting it to my needs; to seminar participants at Northwestern University; to participants in the 2013 Asian Meetings of the Econometric
Society, the 2015 Western Economic Association International Graduate Student Dissertation Workshop
and Conference, the 2015 European Historical Economics Society Conference, the 2015 Illinois Economic
Association Conference, and the 2015 Social Science History Association Conference for helpful comments;
and to Christie Jeung for excellent research assistance. This project was supported by the Northwestern
University Center for Economic History, a Northwestern University Graduate Research Grant, an Economic
History Association Exploratory Data and Travel Grant, and an Economic History Association Dissertation
Fellowship. This research was supported in part through the computational resources and staff contributions
provided for the Social Sciences Computing Cluster (SSCC) at Northwestern University. Recurring funding
for the SSCC is provided by the Office of the President, Weinberg College of Arts and Sciences, Kellogg
School of Management, the School of Professional Studies, and Northwestern University Information Technology. This project, by virtue of its use of the Union Army Data, was supported by Award Number P01
AG10120 from the National Institute on Aging. The content is solely the responsibility of the author and
does not necessarily represent the official views of the National Institute on Aging or the National Institutes
of Health. All errors are my own.
Notes Previous versions of this paper were titled “New Perspectives on Historical Standards of Living:
Evidence from US Military Enlistment in the Late Nineteenth Century” and “Does Sample-Selection Bias
Explain the Industrialization Puzzle? Evidence from Military Enlistment in the Nineteenth-Century United
States.”
1 Introduction
Sometime around 1800, the western world reached a major economic turning point. Spurred by twin events,
the Industrial Revolution and the Demographic Transition, the western economies broke the Malthusian trap
that had bound them for millennia and entered the period of modern economic growth that persists to today
(Clark, 2007). There is no doubt that in the long run living standards in these economies have improved
tremendously. Figure 1 illustrates these long-term gains in the United States, presenting two indicators.
The first, GDP per capita, provides a measure of the economic standard of living; that is, of the material
goods available to the average person. It shows a marked secular improvement in living standards since 1800,
rising more than twenty-fold to 2010.1 The other statistic, height, summarizes the biological standard of
living—an alternative measure of welfare that captures health and physical well-being (Steckel, 2008). Such
data are widely studied in this context because they are available in large quantities and in many contexts,
because there is considerable evidence of a strong relationship between a population’s average stature and
its welfare as measured by conventional monetary indicators (Deaton, 2007, 2013; Eveleth and Tanner, 1976;
Floud et al., 2011; Frisancho, 1993; Martorell and Habicht, 1986; Steckel, 1995, 2008), and because they
capture aspects of welfare, such as health and “the effects of environmental and exogenous influences” (Floud,
1985, p. 33), that conventional monetary measures do not. Just as GDP per capita shows a marked increase
in the economic standard of living in the United States since 1800, so too does height show a considerable
improvement, with a nearly three-inch increase in average stature since 1760.2 The fact that the improvement
has manifested in both the economic and biological standards of living confirms the standard view that the
long-term economic growth since about 1800 has unambiguously improved welfare.
The gains in the biological standard of living, however, do not show the same monotonicity in the United
States as does the trend in GDP per capita. In fact, up to 1900 average heights in the United States had
actually declined by about an inch from 1760 levels. Thus, although economic growth up to 1900 showed
an improvement in the economic standard of living, these improvements did not manifest in the biological
standard of living, which actually deteriorated over this period. Some aspect of welfare of at least some
portion of the population suffered during (and likely as a result of) early modern economic growth. The
welfare effects of modern economic growth were therefore not unambiguously good over the short and medium
term as they were in the long term.
1 This trend is reaffirmed by Costa and Steckel (1997), Goldin and Margo (1992), Lindert and Williamson (2015), and Margo
and Villaflor (1987), who show that other conventional measures of living standards showed a similar pattern of improvement.
It is well known that the whole western world experienced such improvements (Clark, 2007).
2 Similar increases are also evident in Western Europe (Hatton and Bray, 2010).
As shown in Figure 1, the decline in heights that leads to the divergence of the economic and biological
standard of living occurred between roughly 1830 and 1860. This timing, together with the fact that this
phenomenon has challenged conventional views on the welfare effects of economic growth, has led to it
being called the “Antebellum Puzzle” (A’Hearn, 1998; Craig, Forthcoming; Floud et al., 2011; Fogel, 1986;
Fogel et al., 1983; Komlos, 1987, 1992; Margo and Steckel, 1983; Zehetmayer, 2011); its English cousin is
the “Industrialization Puzzle” (Crafts, 1997; Floud, Wachter, and Gregory, 1990). The puzzle persists in
the birth cohorts of the latter half of the nineteenth century, among which stature stagnated rather than
rising like the economic standard of living. Cross-sectional analogs of the puzzle have also been discovered.
For example, despite being the richest region of the United States in the antebellum period (Lindert and
Williamson, 2015), stature data consistently show the Northeast as the shortest during this period (A’Hearn,
1998; Zehetmayer, 2011).3 Modern analogs exist as well. Stunting of Indian children has not improved to the
extent that that country’s GDP per capita growth might suggest; similarly, the heights of African children
exceed those of Indian children despite India’s advantage in income per capita (Deaton, 2007; Jayachandran
and Pande, 2015).
The discovery of the antebellum puzzle is arguably the key contribution of the anthropometric history
research program to the prodigious standards-of-living debate and is one of the fundamental stylized facts
in economic history. Most authors have taken at face value the puzzle’s implication that early modern
economic growth was harmful, to some extent, to the welfare of some parts of the population, and that
economic growth was, therefore, not unambiguously welfare-improving in the short- to medium-term.4 They
have sought to understand the mechanisms driving the decline in average stature and the height disadvantage
of richer regions (Floud et al., 2011; Komlos, 1987, 1998b, 2012; Steckel, 1992, 2009).5
3 Lindert and Williamson (2015, ch. 5, Table 5-4) show that the Northeast surpassed the South in per-capita income sometime
between 1800 and 1840. Similar patterns are evident elsewhere. For example, Mokyr and Ó Gráda (1996) find that English
recruits to the East India Company Army were shorter than Irish recruits despite considerable evidence that England was better
off in economic terms.
4 The simultaneous decline in average stature and improvement in the economic standard of living and the regional divergence
of the biological and economic standards of living need not imply that growth was harmful. Unobserved exogenous factors, such
as an outbreak of disease, could lead to a decline in average stature. However, studies of living standards in the United States
have generally ruled out such exogenous explanations and focused on explanations in which the decline in stature is endogenous
to the onset of modern economic growth (Floud et al., 2011; Komlos, 1998b, 2012).
5 Several explanations have been offered (A’Hearn, 1998; Floud et al., 2011; Gallman, 1996; Haines, 1998; Haines, Craig,
and Weiss, 2003; Komlos, 1987, 1998b, 2012; Margo, 2000; Steckel, 1992, 2009; Woitek, 2003), including increased urbanization,
increased integration of the disease environment, the economic downturn of the 1830s, an increase in inequality, and an increase
in the relative price of food coupled with a decline in per-capita food production. The latter explanation is most widely accepted
(Floud et al., 2011; Komlos, 2012). It has also been shown that the deterioration in health implied by the antebellum puzzle did
not apply uniformly throughout the population, with the relatively well-off less likely to suffer a decline in height (e.g., Sunder,
2011). Another possible explanation is that reductions in infant mortality (Floud et al., 2011, p. 322, Table 6.8) may have led to
a decline in heights if it led those who would have died under worse conditions to survive as shorter adults—in the language of
Bozzoli, Deaton, and Quintana-Domeque (2009), if the selection effect of infant mortality on heights were surpassed by scarring
effects. Studying and testing these explanations is beyond the scope of the current paper. The task here is to evaluate the
first-order question of whether there exists at all a puzzle that requires explanation.
Others, however, have been less willing to accept the pessimist implication of this puzzle. They have
challenged the veracity of the antebellum puzzle, suggesting that it might be an artifact of some deficiencies in
the stature data rather than a true representation of the living standards of the population. They thus raise
the first-order question of whether there was any divergence between the biological and economic standards
of living that requires explanation. Bodenhorn, Guinnane, and Mroz (2013, henceforth BGM) have recently
given new life to this perspective. Echoing, formalizing, and expanding arguments made by Gallman (1996)
and Mokyr and Ó Gráda (1994, 1996), they point out that stature data are often drawn from sources that
may not be representative of the population, such as the records of military volunteers. On the basis of
a Roy (1951)-type model and a variety of simulation exercises, they argue that, if the resulting sample-selection bias (from observing heights only among those who choose to enter the military) varies across birth
cohorts, then the observed trend in average stature will not represent the trend in the average stature of the
population.6 Similarly, if the sample-selection bias differs across regions, the observed differences in average
stature across regions may not reflect the differences in the population. Such a scenario can theoretically
generate a spurious antebellum puzzle. If economic conditions were improving (as Figure 1 shows they were),
height might have been rising in the population, but selection into military service on the basis of height
may have become more negative over time as successive cohorts faced better options in the civilian labor
market.7 This is precisely what BGM argue happened in the antebellum United States: military samples
were characterized by cohort-variant sample-selection bias, and showed declining average heights even though
population average heights were increasing, making the antebellum puzzle entirely an artifact of sample-selection bias.8 Similarly, if better labor market opportunities in richer areas provided a better alternative
to military service, then the selection bias may have been more severe in these richer areas than in poorer
areas in which fewer opportunities resulted in less negative selection. If the effect were sufficiently strong, it
would lead the richer areas to appear shorter than poorer areas.
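The mechanism can be illustrated with a small simulation. The sketch below is not BGM's model or the estimator used in this paper; it assumes a simple latent-propensity rule for enlistment, and every number in it (the population means, the standard deviation, the threshold, and the selection strengths) is hypothetical and chosen only to show how rising population heights can coexist with falling heights among enlisters.

```python
import random
from statistics import mean

SD_HEIGHT = 2.5   # hypothetical standard deviation of adult male height (inches)
CUTOFF = 1.5      # hypothetical enlistment threshold on the latent propensity


def simulate_cohort(pop_mean, selection_strength, n=100_000, seed=0):
    """Return (population mean height, mean height among enlisters)."""
    rng = random.Random(seed)
    population, enlisters = [], []
    for _ in range(n):
        height = rng.gauss(pop_mean, SD_HEIGHT)
        # Assumed enlistment rule: the propensity to enlist falls with height
        # (standing in for better civilian options), plus an idiosyncratic
        # taste for service that is unrelated to height.
        propensity = (-selection_strength * (height - pop_mean) / SD_HEIGHT
                      + rng.gauss(0.0, 1.0))
        population.append(height)
        if propensity > CUTOFF:
            enlisters.append(height)
    return mean(population), mean(enlisters)


# (birth cohort, true population mean, strength of negative selection);
# all values are made up purely to illustrate the mechanism.
COHORTS = [(1832, 68.00, 0.2), (1846, 68.25, 0.6), (1860, 68.50, 1.0)]

results = [(year, *simulate_cohort(mu, s, seed=year)) for year, mu, s in COHORTS]

for year, pop_avg, enlisted_avg in results:
    print(f"{year}: population {pop_avg:.2f} in, enlisters {enlisted_avg:.2f} in")
```

Because selection strengthens across cohorts in this sketch, the enlisters' average falls even though the population average rises. Holding the selection strength fixed instead would leave a constant gap, so changes in the observed average would track changes in the population.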
While this argument is theoretically valid, empirical support for it is quite limited. Several studies
have proposed tests for the presence of cohort-variant sample-selection bias (Bodenhorn, Guinnane, and
6 If the sample-selection bias is constant across cohorts, the observed trend in heights would lie below the population trend
if military enlisters are negatively selected on the basis of height, but the difference would be fixed over time. Changes over
time in observed heights would thus be representative of changes in the average height of the population.
7 As Mokyr and Ó Gráda (1996, p. 164) put it, data may have been “drawn increasingly from the left tail of a distribution
which itself is shifting to the right.” In Gallman’s (1996, p. 194) words, military enlisters may not have “retained an unchanging
character” over time. However, it is important to point out that the sample-selection problem is distinct from left truncation, left-tail shortfall, or unknown minimum height requirements (e.g., Penttinen, Moltchanova, and Nummela, 2013). These approaches
assume that there is some height above which observed individuals are a random sample of the population above that height.
No such assumption is made here. Bodenhorn, Guinnane, and Mroz (2013) develop this distinction in detail.
8 Indeed, Bodenhorn, Guinnane, and Mroz (2015a) conclude that “The Industrial Revolution posed challenges for those
facing the transformations it wrought, but it did not make people shorter.”
Mroz, 2013, 2014, 2015b; Fourie, Inwood, and Mariotti, 2014). Although these studies have found evidence
consistent with the presence of such bias, they are unable to separately identify changes in sample-selection
bias from changes in population average heights and thus cannot definitively conclude that such bias was
present. Moreover, they are unable to quantify or correct for any cohort-variant sample-selection bias that
may be present, and do not address differences in sample-selection bias by region. It is therefore still unknown
whether sample-selection bias wholly, or even partially, accounts for the antebellum puzzle. More generally,
it is not known whether recognizing and correcting for sample-selection bias leads to any meaningful changes
in the conclusions drawn from height data taken from samples in which selection is likely to be an issue.
In this paper I fill this gap in the literature by estimating a two-step semi-parametric sample-selection
model (Klein and Spady, 1993; Li and Racine, 2007; Newey, 2009) for height observed only among military
enlisters. I then use the results of this estimation to compute unconditional trends in average stature over
time and differences in average stature across regions in the United States, both of which are corrected for
selection into military service on the basis of both observable and unobservable characteristics. The results
of this estimation permit me to answer two questions. First, is the antebellum puzzle an artifact of sample-selection bias that varies over birth cohorts and regions, or a reflection of true patterns in the population? I
answer this question by determining whether the corrected height patterns exhibit an antebellum puzzle. To
conclude that no puzzle is present, it must be the case that heights are increasing over time after correction,
and are greater in richer than poorer regions, thus following the economic standard of living. Second,
does correcting for cohort-variant or region-variant sample-selection bias meaningfully alter the conclusions
drawn from stature data? I address this issue by comparing the corrected and uncorrected patterns in
average stature and testing for differences.9 To my knowledge, this is the first paper that seeks to correct
patterns in population average stature for sample-selection bias stemming from selection on unobservables.10
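For readers unfamiliar with this class of models, the generic structure of a two-step semi-parametric sample-selection model can be sketched as follows. This is the textbook form in the spirit of Newey (2009), not necessarily the exact specification estimated in this paper. The symbols $h_i$ (height), $d_i$ (enlistment indicator), $x_i$ and $z_i$ (covariates), $\beta$, $\gamma$, and $g$ are notation assumed here for illustration, and the sketch assumes that selection depends on the covariates only through the index $z_i'\gamma$.

```latex
% Generic two-step selection model (illustrative notation, assumed here)
\begin{align*}
  d_i &= \mathbf{1}\{ z_i'\gamma + u_i > 0 \}
      && \text{(enlistment decision)} \\
  h_i &= x_i'\beta + \varepsilon_i
      && \text{(height, observed only if } d_i = 1\text{)} \\
  \mathbb{E}[h_i \mid x_i, z_i, d_i = 1] &= x_i'\beta + g(z_i'\gamma)
      && \text{(regression among enlisters)}
\end{align*}
```

Step one estimates the index $z_i'\gamma$ without assuming a distribution for $u_i$ (as in Klein and Spady, 1993). Step two regresses height on $x_i$ and a flexible function of the estimated index, so that $g$ absorbs selection on unobservables. Under this sketch, a corrected average for a cohort or region would average $x_i'\beta$ over the population at risk rather than over enlisters, with the caveat that the level of $g$ is not separately identified from the intercept without further assumptions.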
This estimation is based on a new data set of my construction. Unlike most studies of historical heights,
this data set supplements information on those whose heights are observed with data on the portion of the
population for which stature is not observed, making possible the estimation of the sample-selection model.
The core of the data set consists of military data—including stature—for the birth cohorts of 1832–1860
(thus spanning the key years of the antebellum puzzle), collected from the Register of Enlistments in the U.S.
9 These uncorrected patterns are corrected for selection on the basis of observable characteristics through the use of appropriate weights, but do not correct for selection on unobservables (sample-selection bias).
10 This is not the first paper to correct for selection on observable characteristics. Previous research has done this through
the use of appropriate weights. However, in several ways the correction for observables performed in this paper has desirable
characteristics relative to that in previous papers. In particular, linkage of military records to the census provides a much
larger set of observables on which to correct, rather than the ordinary occupation and place of birth given in military records.
Moreover, the collection of data from the enlister’s childhood rather than at the time of enlistment implies that the observables
are more likely to be height-relevant. The correction for selection on unobservables is novel to the historical heights literature.
Army, 1798–1914 (n.d., henceforth, Register of Enlistments). Using conventional methods, I linked these
individuals to their census records in adolescence, thus adding socioeconomic characteristics of enlisters to
the data set and making possible comparisons of enlisters to the general population. I then transcribed, for a
sample of roughly 4,200 individuals, information from all of their military enlistments and from the census record
in which they were observed between ages 9 and 18. These data are combined with similar data from the Union Army
project (Costa, 2013) for the birth cohorts of 1832–1846. I also collected a random sample of micro-level
census data from the complete population at risk for military service (Ruggles et al., 2010). Stature data are
unavailable for these individuals, but by providing information on the characteristics of the population at
risk for military enlistment they make possible the estimation of the sample-selection model (in particular,
the first stage binary choice model for military enlistment). All analyses are performed on two samples: the
first uses Union Army enlisters to represent the 1832–1846 cohorts and Regular Army enlisters to represent
the 1847–1860 cohorts. This sample represents the data that currently provide the most sophisticated and
widely accepted documentation of the decline in heights during the antebellum period (A’Hearn, 1998;
Craig, Forthcoming; Zehetmayer, 2011). The second uses Regular Army enlisters to represent all cohorts
from 1832–1860.
Identification of the sample-selection model stems from two sources. The first is the incorporation of
county-level voting data from the US presidential elections of 1856 and 1860 in the model of military enlistment. These election data, which provide information on a county’s views on slavery and related issues (the
central issues in the elections in question), are a proxy for individuals’ propensity to join the military for
political, rather than economic, reasons. They are used as exclusion restrictions under the assumption that,
although they affect the probability of enlisting in the military, they have no effect on height conditional on
the other covariates that are included. These voting data have previously been shown to be important in
the military desertion decision during the Civil War (Costa and Kahn, 2003, 2007). The second source of
identification is that the model allows the effects of covariates on the probability of enlisting in the military
to vary based on whether an individual’s birth year made him eligible to serve during the Civil War or not,
whereas the equation determining height is assumed to be time invariant.11 This restriction is based on the
fact that historical accounts of military enlistment report that recruits to the military differed considerably
between the Civil War and the postbellum period (e.g., Coffman, 1986; Foner, 1970; Weigley, 1967).12
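The two identifying restrictions can be written compactly. The notation below is assumed for illustration and does not appear in the original text: $w_i$ collects the covariates common to both equations, $v_{c(i)}$ the vote shares of individual $i$'s county in 1856 and 1860, $b_i$ the birth year, $d_i$ the enlistment indicator, $h_i$ height, and $g$ an unknown selection-correction function.

```latex
% Identifying restrictions (illustrative notation, assumed here)
\begin{align*}
  z_i &= (w_i', v_{c(i)}')', \qquad C_i = \mathbf{1}\{ b_i \le 1846 \} \\
  m_i &= z_i'\gamma_0 + C_i \, z_i'\gamma_1 \\
  d_i &= \mathbf{1}\{ m_i + u_i > 0 \} \\
  \mathbb{E}[h_i \mid z_i, C_i, d_i = 1] &= w_i'\beta + g(m_i)
\end{align*}
```

Identification in this sketch comes from variation in $m_i$ that is absent from $w_i'\beta$: the vote shares enter only the enlistment index, and the interaction with $C_i$ shifts the index without shifting the height equation, whose coefficients $\beta$ do not vary with $C_i$.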
Estimation using these data requires some small adaptations of the standard two-step semi-parametric
11 That is, the first stage interacts all covariates with an indicator for being born in 1846 or earlier, while the second stage
does not.
12 Results are similar when identification is based only on one source or the other.
sample-selection model. In particular, the structure of the data set requires adaptation of the standard model
to accept choice-restricted samples.13 This is accomplished by slightly modifying Klein and Spady’s (1993)
semi-parametric binary choice model to accept Cosslett’s (1981) likelihood for such data. The estimation
must also be adjusted to compute the unconditional trends in average height rather than the conditional
trends that a standard model would produce.
The main finding of this paper is that the antebellum puzzle survives the correction for sample-selection
bias. A decline in stature of 0.6 to 0.7 inches is evident in the corrected trends, and it is possible to reject
the null hypothesis of no decline in heights over time. Moreover, a 0.5 to 0.6 inch height advantage for the
South over the Northeast is present after correction, and the null of no difference can be rejected.14 Thus,
I conclude that the antebellum puzzle, in both its temporal and cross-sectional forms, is not an artifact of
sample-selection bias.
Nonetheless, I show that failing to take sample selection into account can appreciably affect the conclusions drawn from historical height data. I find that there existed sample-selection bias in the military samples
that I study, and that this bias differed over birth cohorts and across regions and sectors, thus altering the
observed (uncorrected) patterns in stature. I thus validate concerns over the existence of sample-selection
bias in historical heights samples. In particular, the magnitude of the decline in average stature between
1832 and 1860 before the correction for selection on unobservables ranges from 0.8 to 1.25 inches (depending
on the sample) in my data, with 1.25 inches being the benchmark in the literature (e.g., Craig, Forthcoming). Correcting for selection on unobservables leads to a statistically significant difference in the trend
over time, and results in a considerably smaller decline of only 0.6 to 0.7 inches.15 There are similar effects
of the selection correction on the magnitude of cross-sectional differences. Specifically, the uncorrected gap
of roughly 0.8 inches in favor of the South is (statistically significantly) reduced by about 0.2 to 0.3 inches
when corrections are made for sample-selection bias. I also find that the unconditional urban height penalty
in my data is statistically significantly reduced by the correction for selection on observables, decreasing by
about 0.2 to 0.3 inches relative to a base of about 0.6 to 0.8 inches.16
13 That is, a sample with the following structure. One subsample contains the outcome of interest and covariates for the
selected sample only. The other subsample contains only the covariates of a random sample of the population; the selection
state (i.e., military enlistment decision) and outcome of interest (i.e., height) are not observed for members of this subsample.
14 To put these results in perspective, consider the following patterns presented by Deaton and Arora (2009) from modern
American data. They show that the difference in average stature between men with college degrees and those without a high
school diploma is about 1.2 inches, that the gap between college graduates and those with a high school diploma is about 0.7
inches, and that the gap between those with some college (but no degree) and those with a technical or vocational education is
about 0.3 inches.
15 In particular, the combination of Union Army and Regular Army data shows a decline from roughly 68.2 inches to 67.0
inches before correction, and from 69.1 to 68.4 inches after correction. Regular Army data alone show declines from 68.1 inches
to 67.2 inches and from 68.3 inches to 67.7 inches, respectively. These are net declines over the whole 1832–1860 period. A more detailed
breakdown of these results is given in section 6.3.
16 The urban height penalty is an empirical regularity in many historical settings that health, measured by height or life
The remainder of the paper proceeds as follows. Section 2 provides additional background information on
the role of height in the standards of living debate, and on existing tests for sample-selection bias in historical
height samples. In section 3, I discuss the empirical method to be used in estimating selection-corrected
trends in average stature. Section 4 details the various data sources used, and outlines the procedure by which
the data set for analysis was constructed from these sources. In section 5, I discuss several estimation issues
arising from the particular data to be used, and present the selection-correction procedure in the presence
of these issues. In section 6, I execute the selection-correction procedure, compute selection-corrected trends
in average stature, and present the main results of the paper regarding the robustness of the antebellum
puzzle. Section 7 discusses potential threats to the validity of the results. Section 8 concludes.
2 Background
2.1 Height, Welfare, and the Biological Standard of Living
Height data are commonly used as measures of living standards in studies in economic history and economic
development. On the individual level, height is overwhelmingly the product of genetics; in modern populations, roughly 80 percent of the variation in individual heights is driven by genetic differences, with the
remaining 20 percent the product of environmental characteristics (Silventoinen, 2003).17 This non-genetic
portion is essentially a measure of an individual’s net nutrition and health in youth and adolescence (Steckel,
1995). Greater calorie consumption and better health tend to increase height, while physical labor, malnutrition, and chronic disease tend to decrease it. In means of large groups, the genetic component of stature
tends to average out, making the average stature of a large group indicative of the non-genetic component
and thus a good measure of its living standards (Steckel, 1995).
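The averaging argument can be made concrete with a short simulation. This is a sketch under stated assumptions, not an analysis of the paper's data: it assumes that the genetic and environmental components are independent and normally distributed, with variances set to give the 80/20 split cited above, and the group sizes, the 68-inch baseline, and the half-inch environmental penalty are hypothetical.

```python
import random
from statistics import mean, pvariance

rng = random.Random(42)

GENETIC_SD = 5.0 ** 0.5        # genetic variance 5.0 -> 80% of total 6.25
ENVIRONMENT_SD = 1.25 ** 0.5   # environmental variance 1.25 -> 20% of total


def draw_group(n, environment_mean):
    """Draw n (genetic, environmental) height components, in inches."""
    return [(rng.gauss(0.0, GENETIC_SD),
             rng.gauss(environment_mean, ENVIRONMENT_SD))
            for _ in range(n)]


# Two hypothetical groups drawn from the same gene pool; group B grew up
# with worse net nutrition, worth half an inch on average.
group_a = draw_group(20_000, environment_mean=0.0)
group_b = draw_group(20_000, environment_mean=-0.5)

heights_a = [68.0 + g + e for g, e in group_a]
heights_b = [68.0 + g + e for g, e in group_b]

# Share of individual-level variation in group A that is genetic.
genetic_share = pvariance([g for g, _ in group_a]) / pvariance(heights_a)
# Difference in group averages, which should reflect the environment only.
mean_gap = mean(heights_a) - mean(heights_b)

print(f"genetic share of individual variance: {genetic_share:.2f}")
print(f"gap in group means (A minus B): {mean_gap:.2f} inches")
```

Within either group most of the variation between individuals is genetic, yet the gap between the group averages is driven by the environmental difference, because the genetic components average out over large samples.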
Average stature has also been shown to be correlated with a number of conventional indicators of living
standards (Steckel, 1995). For example, outside of the period of the antebellum puzzle, richer countries tend
to have greater average stature, and within countries, taller individuals tend to have higher income (Case
and Paxson, 2008). Average stature even provides a measure of inequality as a result of nonlinearities in the
height production function (Steckel, 1995).18
As a result of these useful features of stature, it has been argued that stature might be a better measure
of welfare than conventional indicators. For instance, Floud (1985, p. 33) argues that “height data are a
expectancy, was usually worse in urban than in rural environments (e.g., Haines, Craig, and Weiss, 2003).
17 Silventoinen (2003) points out that the fraction of variation in height attributable to genetics is likely smaller in more
deprived populations, such as in the developing United States of the nineteenth century.
18 Recent innovations in measuring welfare, such as that of Jones and Klenow (2015), also seek to incorporate inequality.
direct measure of welfare, much closer to what we think of as welfare or the standard of living than artificial
constructs such as national income per capita or the real wage.” However, the virtues of stature are usually
viewed more modestly, and it is generally taken as a complement to conventional measures,
augmenting them by providing information on facets of welfare, such as health, that they do not capture.
Thus, height is thought of as providing a measure of the “biological standard of living” (roughly speaking,
health and physical well-being), as opposed to the “economic standard of living” (access to resources). Of
course, the two measures of the standard of living are not mutually exclusive—the two generally move in
tandem, but may diverge in unusual circumstances, such as in the early nineteenth century.
Overall, the value of height as a measure of welfare is well established. It is therefore used often when
alternative measures are unavailable or lacking (e.g., Spitzer and Zimran, 2015), or in cases where one wishes
to learn about health specifically (e.g., Deaton, 2007, 2013; Jayachandran and Pande, 2015).
2.2 The Standards-of-Living Debate
The standards-of-living debate focuses on the welfare effects of early industrialization and modern economic
growth, primarily for the working classes. Its primary focus is England (during the Industrial Revolution),
though considerable attention has also been paid to the United States and other Western European countries.
Contributors to this debate have divided into three camps. “Optimists” argue that the bulk of the population
experienced an appreciable short- to medium-term increase in living standards. “Pessimists” argue that living
standards deteriorated (“strong pessimists”), or at best improved only slowly (“weak pessimists”) during this
period.
Four classes of indicators are generally called on to assess living standards in this debate. The first
three classes are conventional measures of the economic standard of living—GDP per capita, real wages, and
consumption. In the English case, these indicators tend to support the weak pessimist perspective (Brown,
1990; Clark, Huberman, and Lindert, 1995; Crafts, 1985, 1997; Crafts and Harley, 1992; Feinstein, 1998;
Mokyr, 1988; Schwarz, 1985; Voth, 2003). Thus, although industrialization was not a boon to the English
working classes in the short- to medium-term, it did not make them worse off. The situation in the United
States was more positive, with more rapid improvements evident in conventional indicators of welfare (Costa
and Steckel, 1997; Goldin and Margo, 1992; Lindert and Williamson, 2015; Margo and Villaflor, 1987).
The fourth class of indicators, stature, is a boon to the strong pessimist case. Both in the United States
and Great Britain, the biological standard of living, as measured by average stature, declined precipitously
in the mid-nineteenth century (A’Hearn, 1998; Floud et al., 2011; Floud, Wachter, and Gregory, 1990; Fogel,
1986; Fogel et al., 1983; Komlos, 1987, 1992; Margo and Steckel, 1983; Mokyr and Ó Gráda, 1996; Voth
and Leunig, 1996), indicating deterioration of the biological standard of living, and subsequently stagnated,
failing to improve despite considerable gains in the economic standard of living.
2.3 Tests for Cohort-Variant Sample-Selection Bias
A number of empirical strategies have been proposed to test whether sample-selection bias may be present
and problematic in historical heights data. Bodenhorn, Guinnane, and Mroz (2015b) perform a meta-analysis of the historical heights literature in which they show that findings of declining heights are more
likely to come from samples to which individuals enter only by choice (e.g., military volunteers rather than
conscripts). The motivating example given for this approach is the case of England and the United States, as
opposed to Sweden and France. The former two had volunteer armies and show declines in average stature,
whereas the latter two had conscript armies and show no such declines (Sandberg and Steckel, 1997; Weir,
1997). Bodenhorn, Guinnane, and Mroz (2015b) argue that this is evidence that the declines observed in
England and the United States are artifacts of the data sources. However, alternate explanations for these
differences have also been offered, such as a slower rate of industrialization in Sweden and France. More
generally, drawing conclusions from this analysis requires making comparisons across countries, classes, and
time periods, across which the actual patterns of height may have differed. Thus, finding that heights are
more likely to decrease in voluntary samples may be a sign of sample-selection bias, or of a different military
recruitment policy in countries that tended to industrialize more rapidly.
Bodenhorn, Guinnane, and Mroz (2013, 2014, 2015b) also propose a “diagnostic test” for the presence of
cohort-variant sample-selection bias. They perform a regression of observed heights on dummy variables for
birth cohort and year of enlistment (possibly interacted), with the test of interest being for joint significance
of the year-of-enlistment indicators (or the interactions if they are included). This test is applied to a variety
of data sets that were used to establish the existence of the antebellum puzzle, with evidence of sample-selection bias discovered in all samples. However, it can be shown that the test fails to identify certain forms
of cohort-variant sample-selection bias and incorrectly characterizes some innocuous patterns of enlistment
over the life cycle as cohort-variant sample-selection bias. Moreover, the approach can only identify changes
over time in the selection of a cohort, but cannot identify the level of selection experienced by each cohort.
Therefore, no quantification or correction for sample-selection bias is possible.
Fourie, Inwood, and Mariotti (2014) also find evidence of changing sample-selection bias over time. They
compare South African military recruits in the Anglo-Boer War and World War I and find that, conditional
on their observable characteristics, soldiers in World War I were significantly shorter, providing evidence of
changing unobservables among enlisters. Failing to estimate a sample-selection model, however, leads to bias
in these estimates. Moreover, the lack of some comparison to the population as a whole makes it impossible
to correct for selection, and the methodology is not applied to the industrialization puzzle period of the
mid-19th century but to a later period. Nonetheless, this study provides strong evidence that cohort-variant
sample-selection bias may be present in military samples.
Steckel and Ziebarth (2015) are able to determine whether sample-selection bias is responsible for spuriously generating important conclusions in the literature regarding the catch-up growth of slaves, finding no
evidence of problematic sample-selection bias. However, their approach requires the existence of a sample
that is unlikely to have suffered from sample-selection bias. As a result, it cannot be applied to testing
whether the antebellum puzzle is an artifact of sample-selection bias because no such unselected sample is
available for the native-born white male population of the United States born in the antebellum period.
This discussion shows that, although some studies have been able to provide empirical support for BGM’s
argument, they have at best been able to show that cohort-variant sample-selection bias may have been
present in historical height samples. The magnitude of such bias is entirely unknown, and it thus remains
an open question whether the antebellum puzzle is a statistical artifact and whether correcting for this
sample-selection bias changes existing conclusions at all. The present paper fills this gap.
3 Empirical Framework
In this section I develop the method that I will use to estimate unconditional trends in average stature.
Corrections must be made for two types of selection—selection on observables and selection on unobservables. Selection on observables arises through the fact that some observable characteristics are important
in determining both height and military enlistment.19 This type of selection is commonly addressed in the
historical heights literature through the use of appropriate weights. Selection on unobservables arises from
correlation between unobservables in the military enlistment decision and height determination, possibly as
a result of height entering directly into the military enlistment decision. This is the type of selection that
BGM argue has not been appropriately addressed in the historical heights literature, possibly leading to
sample-selection bias.
The method to be used is essentially the same as existing two-step parametric and semi-parametric
19 For example, if urbanites are both shorter and more likely to enlist in the military than rural residents, observed heights
will be biased downwards relative to population heights.
techniques that are used to correct for sample-selection bias in regression (e.g., Ahn and Powell, 1993;
Cosslett, 1991; Heckman, 1979; Li and Racine, 2007; Newey, 2009; Powell, 2001; Vella, 1998). However,
the fact that I wish to estimate the unconditional trend in average stature, rather than the conditional
trend that a standard selection correction would yield, requires minor modifications to the method. The
difference between these trends is one of interpretation. Essentially, an unconditional trend is equivalent to
simply computing average stature (possibly with a correction for truncation or selection on observables) for
each birth cohort. The conditional trend is equivalent to regressing heights on covariates and birth cohort
indicators, and using the coefficients on the birth cohort indicators to represent the trend in heights. This
method corrects for selection on observables, but gives a different trend than the unconditional trend. To
see that the object estimated changes when regression is used, consider the following two-period example.
Suppose that the population is initially half rural and half urban, and that rural residents are one inch taller
on average than urbanites. Suppose that in the second period there is no change in the average height of
the rural and urban groups, but that the mixture of the population changes such that the population is now
one-quarter rural and three-quarters urban. The population of the second period would thus be, on average,
one quarter inch shorter than the first-period population. This decline would represent a real decline in the
living standards of the population as a whole, which is the object of interest; however, a regression of stature
on birth cohort and urban dummy variables would yield a regression coefficient on the birth-cohort dummy
variable of zero, suggesting no change in average stature or living standards. In order to capture changes in
living standards of this type, I focus the estimation on computing selection-corrected unconditional trends
in average stature.
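The two-period example can be checked numerically. The sketch below is illustrative only: the height levels (68 inches for rural residents, 67 for urbanites), the eight-person population, and the helper `ols` are assumptions chosen to reproduce the stated shares and the one-inch gap; they are not data from this paper.

```python
from fractions import Fraction

def ols(X, y):
    """Exact OLS coefficients via Gauss-Jordan elimination on the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], held as exact rationals.
    A = [[Fraction(sum(r[i] * r[j] for r in X)) for j in range(k)]
         + [Fraction(sum(r[i] * v for r, v in zip(X, y)))]
         for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

RURAL_HEIGHT, URBAN_HEIGHT = 68, 67  # hypothetical levels; only the 1-inch gap matters

# (second_cohort, urban) per person: period 1 is half rural, period 2 one-quarter rural.
people = [(0, 0), (0, 0), (0, 1), (0, 1),
          (1, 0), (1, 1), (1, 1), (1, 1)]
heights = [URBAN_HEIGHT if urban else RURAL_HEIGHT for _, urban in people]

def cohort_mean(cohort):
    """Unconditional average height of one birth cohort."""
    values = [h for (c, _), h in zip(people, heights) if c == cohort]
    return Fraction(sum(values), len(values))

# Unconditional trend: difference in raw cohort averages.
unconditional_change = cohort_mean(1) - cohort_mean(0)

# Conditional trend: coefficient on the cohort dummy, controlling for urban residence.
X = [(1, c, u) for c, u in people]
intercept, cohort_coef, urban_coef = ols(X, heights)

print(unconditional_change, cohort_coef)
```

The unconditional change picks up the shift in population composition, whereas the cohort coefficient holds composition fixed and so reports no change.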
3.1 General Framework
3.1.1 Notation
The following notation is used throughout this paper.
• hi : The height of individual i
• yi∗ : The latent utility for individual i of ever joining the military in his lifetime;20 it is assumed that
20 I model the choice of whether or not to enlist in the military as a static once-per-lifetime decision. The goal is to compute
the average height of each birth cohort by using data from the complete set of enlistments by that cohort without respect to
timing. Thus, as long as accommodations are made for cases in which terminal height may not yet have been reached, all that
matters ultimately is whether an individual ever enlists or not, not when he enlists. Focusing on changing selection within
a cohort over time would give undue importance to changes in selection that might be wholly or partially averaged away by
subsequent enlisters in the same cohort. Differences in the temporary economic conditions experienced by different cohorts can
be controlled for in a reduced-form specification by including cohort fixed effects in all stages of estimation.
an individual enlists in the Army if and only if yi∗ > 0
• $y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$
• ci : The birth cohort of individual i
• $\mathbf{c}_i$ : A vector of birth cohort indicators for individual i
• xi : A vector of individual i’s covariates affecting both height and military enlistment
• zi : A vector of individual i’s covariates affecting military enlistment but not height
• $w_i = \begin{pmatrix} x_i \\ z_i \end{pmatrix}$
• εi : Unobserved components in the determination of height
• ui : Unobserved components in the determination of military enlistment
• F (·): The joint distribution of xi and zi (that is, of wi ); its density is f (·) and its support is X
Other notation will be introduced below as necessary. The unobservables εi and ui are assumed to be
independent of ci , xi , and zi , though they may be correlated with one another.21
3.1.2 Model
Suppose that an individual’s height is determined by
$$h_i = h(c_i, x_i) + \varepsilon_i, \tag{1}$$
and that the latent utility of military enlistment is
$$y_i^* = g(c_i, x_i, z_i) + u_i. \tag{2}$$
Height (hi ) is observed if and only if yi∗ > 0 (i.e., yi = 1). The latent utility yi∗ is never observed. This is a
Tobit type-II model (Amemiya, 1985).
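A small simulation may help fix ideas about how this model generates selection bias. It is only a sketch: it assumes jointly normal unobservables, a constant height function, and a constant enlistment index, none of which is imposed in this paper, and every parameter value (`rho`, `g`, `mean_height`, `sigma`) is hypothetical.

```python
import math
import random

def simulate_selected_heights(n, rho, g=-1.0, mean_height=68.0, sigma=2.5, seed=0):
    """Draw n people from h = mean_height + eps and y* = g + u with corr(eps, u) = rho.

    Returns (population mean height, mean height among those with y* > 0).
    """
    rng = random.Random(seed)
    total, selected_total, selected_n = 0.0, 0.0, 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        eps = sigma * (rho * u + math.sqrt(1.0 - rho ** 2) * v)
        h = mean_height + eps
        total += h
        if g + u > 0:  # height is observed only for enlisters
            selected_total += h
            selected_n += 1
    return total / n, selected_total / selected_n

def normal_selection_bias(rho, g=-1.0, sigma=2.5):
    """E(eps | u > -g) under joint normality: rho * sigma * phi(g) / Phi(g)."""
    phi = math.exp(-0.5 * g * g) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(g / math.sqrt(2.0)))
    return rho * sigma * phi / Phi

pop_mean, enlisted_mean = simulate_selected_heights(100_000, rho=-0.5)
print(round(enlisted_mean - pop_mean, 2), round(normal_selection_bias(-0.5), 2))
```

With a negative correlation between the two unobservables, enlisters are shorter on average than the population even though the height equation is identical for everyone; setting `rho=0.0` removes the gap.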
21 This independence assumption is stronger than necessary, but is adopted for simplicity. For instance, for identification
of the sample-selection model, zi need only be independent of εi and ui conditional on xi and ci (Huber and Mellace, 2014;
Manski, 1994). A weaker assumption than independence may even be made, such as conditional mean independence; that is,
E(hi |xi , ci , zi ) = E(hi |xi , ci ).
The ultimate goal is to learn E(hi |ci )—the unconditional average height of each birth cohort—from the
available data.22 If a random sample of the population were observed, it would be possible to estimate this
object simply by taking averages for each cohort or by regressing heights on a series of birth cohort dummies
(and no controls). However, selection into military service, whether on the basis of observable characteristics (i.e., through the impact of xi on both the military enlistment decision and height) or unobservable
characteristics (i.e., through correlation of εi and ui ), makes it impossible to apply this simple approach to
a selected sample. Section 3.2 and Appendix C outline the methods that I will use to address these two
potential complications.
3.2 Correcting for Selection
If selection into military service is based only on observable characteristics (that is, if εi and ui are uncorrelated with one another), then it can be addressed by computing weights from aggregate data. This
approach, which is suggested by Fogel et al. (1983), is discussed formally in Appendix C.23
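The idea behind such weights can be sketched with a simple post-stratification calculation. This is an illustration under assumed inputs, not the procedure of Appendix C: the cells, the enlister records, the population shares, and the function name `reweighted_mean` are all hypothetical.

```python
from collections import defaultdict

def reweighted_mean(sample, population_shares):
    """Average cell means using population shares instead of sample shares."""
    totals, counts = defaultdict(float), defaultdict(int)
    for cell, height in sample:
        totals[cell] += height
        counts[cell] += 1
    return sum(share * totals[cell] / counts[cell]
               for cell, share in population_shares.items())

# Hypothetical enlisters: urbanites are shorter and over-represented (3 of 4).
enlisters = [("urban", 67.0), ("urban", 67.0), ("urban", 67.0), ("rural", 68.0)]
census_shares = {"urban": 0.5, "rural": 0.5}  # assumed aggregate population shares

raw_mean = sum(h for _, h in enlisters) / len(enlisters)
weighted_mean = reweighted_mean(enlisters, census_shares)
print(raw_mean, weighted_mean)
```

Because the shorter group is over-represented among enlisters, the raw mean understates the population mean, and reweighting to the aggregate shares undoes this. The correction is valid only if, within each cell, enlisters are as tall as non-enlisters on average.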
Allowing also for selection on unobservables—by permitting correlation of εi and ui —complicates this
correction. As is shown in Appendix C, the simultaneous correction for selection on observables and unobservables requires the additional estimation of E(εi |ci , xi , zi , yi = 1). A considerable literature exists
providing methods to estimate this object (see, e.g., Li and Racine, 2007; Vella, 1998). The specific estimation procedure based on this literature and the derivations in Appendix C is discussed in section 5, where
complications arising from the data to be introduced in section 4 can be addressed.
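The role of this conditional expectation follows from combining equations (1) and (2) with the observation rule. The lines below are a sketch rather than a reproduction of Appendix C. The second line uses the independence of the unobservables from ci, xi, and zi. The third line additionally assumes joint normality of εi and ui with Var(ui) = 1, an assumption introduced here purely for illustration and not imposed by the semi-parametric approach; ρ and σε denote the correlation of the unobservables and the standard deviation of εi.

```latex
\begin{align*}
E(h_i \mid c_i, x_i, z_i, y_i = 1)
  &= h(c_i, x_i) + E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1) \\
  &= h(c_i, x_i) + E\bigl(\varepsilon_i \mid u_i > -g(c_i, x_i, z_i)\bigr) \\
  &= h(c_i, x_i) + \rho\,\sigma_\varepsilon\,
     \frac{\phi\bigl(g(c_i, x_i, z_i)\bigr)}{\Phi\bigl(g(c_i, x_i, z_i)\bigr)}
     \qquad \text{(joint normality only)}
\end{align*}
```

In either case the selection term depends on the covariates only through the enlistment index, which is why variables in zi that shift enlistment but not height help to separate it from h(ci, xi).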
4 Data
In a standard sample-selection model for height observed only for military enlisters, the researcher would
seek to collect data on a random sample of the population. For all individuals in this sample a number
of covariates would be observed, together with a binary indicator for whether or not an individual chooses
to enlist in the military (observed for all). Height would be available only for those who chose to enlist
in the military. The construction of such a dataset is not possible in this context. In particular, it is not
possible to observe, in random samples of the population at risk for military enlistment, whether a particular
22 I will also study E(hi |xi ) for some discrete covariate xi , such as region of birth or sector of residence. The argument is
almost identical to that below and therefore is not made explicitly.
23 Although the use of weights to correct for selection on observables is well established, the data used in this paper enable a
more powerful application of this method than in previous work. In particular, the broader range of covariates achieved through
census linkage, and the fact that they are taken in childhood and adolescence (when they are more likely to affect height) rather
than at the time of enlistment results in more of the variation in heights being captured by variation in observables.
individual did or did not enlist. Nonetheless, the model can still be estimated using a data set consisting
of two subsamples. The first subsample consists of a random sample of individuals taken from military
enlistment records. By virtue of their presence in these records, they are known to have made the decision
to enlist, and their heights are observed, in addition to a number of covariates.24 The second subsample
consists of a random sample of the population as a whole (including both enlisters and non-enlisters), but
includes only the covariates, leaving height and the military enlistment decision unobserved. With one
more piece of information—the fraction of the population at risk for enlistment that chooses to enlist—the
sample-selection model can then be estimated in two stages. In the first, a binary choice model is estimated,
essentially by comparing the covariates of the military enlistment sample to those of the population as a
whole (Cosslett, 1981).25 In the second, a Heckman (1979)-type correction is conducted using the results of
the binary choice model of the first stage and data only on those whose heights are observed (Newey, 2009);
the results can then be averaged as discussed in section 3 to produce the unconditional selection-corrected
trend in average height. As a result, for every military enlistment data set to be constructed below that
includes only military enlisters (a choice-restricted sample), a supplementary data set must be constructed
that includes the covariates of a supplementary random sample of the population at risk for enlistment.
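The logic of the first stage can be illustrated with a single discrete covariate, for which the conditional enlistment probability is recovered from cell frequencies in the two subsamples and the aggregate enlistment share. This is a deliberately simplified sketch and not the estimator of Cosslett (1981); the samples, the 10 percent enlistment share, and the function name are hypothetical.

```python
from collections import Counter

def enlistment_probabilities(enlister_cells, population_cells, enlist_share):
    """P(y=1 | cell) = f(cell | y=1) * P(y=1) / f(cell), using cell frequencies."""
    f_enlisted = Counter(enlister_cells)
    f_population = Counter(population_cells)
    n1, n = len(enlister_cells), len(population_cells)
    return {cell: (f_enlisted[cell] / n1) * enlist_share / (f_population[cell] / n)
            for cell in f_population}

# Hypothetical samples with one two-valued covariate.
enlister_sample = ["urban"] * 60 + ["rural"] * 40      # choice-restricted sample
population_sample = ["urban"] * 300 + ["rural"] * 700  # enlistment status unobserved

probs = enlistment_probabilities(enlister_sample, population_sample, enlist_share=0.10)
print(probs)
```

Averaging the resulting probabilities over the population distribution of the covariate returns the aggregate enlistment share, which is a useful consistency check on any implementation.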
I construct three such data sets, representing the birth cohorts of 1832–1860, a range that spans the
heart of the antebellum puzzle. The first consists of census-linked Union Army enlisters, together with a
supplementary sample of the general population at risk for enlistment in the Union Army for comparison
of covariates. Due to the timing of the Civil War, this dataset provides information for the birth cohorts
of 1832–1846.26 The second consists of census-linked Regular Army enlisters (again with a supplementary
sample) from the Register of Enlistments, also for the birth cohorts of 1832–1846. The third includes census-linked Regular Army enlisters for the birth cohorts of 1847–1860 from the Register of Enlistments, again
with a supplementary sample of the covariates of the complete population at risk for enlistment. Thus,
as I will collect three choice-restricted samples of military enlistment data, a total of six samples must be
constructed—the three military enlistment samples, and one random sample of the population for comparison
to each of the three military enlistment samples.27 I perform all analyses twice—once using the Union Army
24 In practice, these covariates will be collected through census linkage.
25 It is easy to see that estimation of the binary choice model is possible with such data. Bayes’s Theorem (and conditioning
on ci throughout) enables the writing of the conditional choice probability as
$P(y_i = 1 \mid x_i, z_i) = \frac{f(x_i, z_i \mid y_i = 1)\, P(y_i = 1)}{f(x_i, z_i)}$. The
density f (xi , zi |yi = 1) can be learned from the choice-restricted sample of military enlisters, while the density f (xi , zi ) can be
learned from the supplementary random samples of the whole population.
26 As I restrict the sample to those 18 and older at the time of enlistment, and the Civil War ended in April 1865, the cohort
of 1846 was the last cohort to be entirely eligible for service (above age 18) in the Civil War. The restriction of the sample to
post-1832 cohorts is explained below.
27 In practice, the supplementary sample for the Union Army is a strict subset of that for the 1832–1846 cohorts of the
Regular Army, for reasons to be discussed below.
enlisters to represent the 1832–1846 cohorts and once using the Regular Army enlisters to represent these
cohorts; the 1847–1860 cohorts are always represented by the Regular Army.
4.1 Data Sources
4.1.1 Military Height Data
Military height data were collected from two sources. The first source is the Union Army Project (Costa,
2013), which provides information collected at the time of entry into the Union Army, including stature.
These data are described extensively by Costa (2013), Costa and Steckel (1997), Fogel (1986), Fogel et al.
(1983), and Margo and Steckel (1983). As a result, I omit a detailed discussion here and note only that the sample
is considered to be representative of the northern white male population (Costa, 2013). I limit the data
extracted from this source to individuals born between 1832 and 1846, for whom height, birth year, and age
of measurement are known. I also collected records of enlistments in the Regular US Army for the birth
cohorts of 1832–1860 from the Register of Enlistments. This source provides some advantages over the Union
Army data. First, it permits me to extend the coverage of this stature series past the 1846 birth cohort.
Second, it permits me to gather data on enlistments from cohorts that might have enlisted during the Civil
War from a source other than the Union Army, thus ensuring that results are not driven by some peculiarity
of that data source. Finally, the Union Army data exclude individuals from Confederate states; the Regular
Army data, in that they include enlistments both before and after the Civil War, provide additional coverage
of southerners. In order to make the results of the present study comparable to those of most other studies
of military stature in the United States, I restrict attention to native-born whites and exclude individuals
born in the West region of the United States (e.g., Fogel, 1986; Zehetmayer, 2011).
The Register of Enlistments has not been used as frequently as the Union Army data (cf. Komlos and
Carlson, 2012; Zehetmayer, 2010, 2011) and as a result it is not as well documented. I therefore describe it
in more detail here. This source contains information on all soldiers enlisting in the Regular Army for the
period 1798 to 1914, and appears to have been created as a summary of muster rolls, enlistment documents,
service records, and discharge papers that were created during the soldier’s service. For each soldier and
each enlistment, the Register of Enlistments contains the soldier’s name, date of enlistment, place of birth,
place of enlistment, age, occupation, physical description (including height), and a description of the soldier’s
service, including the cause of his separation from service (e.g., discharge, death, desertion), and (except for
some earlier enlistments in the study period) the number of previous enlistments. An index of this source,
including name, age, and birthplace, is available from Ancestry.com (2007). I supplemented this index by
transcribing, for each enlistment for the linked sample to be described below, the soldier’s date of enlistment,
occupation, and height. I refer to the sample of individuals collected from this source for the birth cohorts
of 1832–1860 as the Regular Army sample.
Both the Union Army and Regular Army data exclude officers and individuals serving in the Navy or
Marine Corps. They are also composed either exclusively or primarily of volunteers.28 In brief, the differences
between the two sources are the following. The Union Army was active only from 1861 to 1865, and was a
citizens’ army composed of units raised at the local level under the auspices of the various northern states.
The Regular Army, on the other hand, is the same professional army that now exists, and was the sole option
for those enlisting in the Army before or after the Civil War. It remained as a separate entity during the
Civil War (Weigley, 1967, p. 199), and its strength was only increased by a small amount from its pre-war
level of 15,000 enlisted men, with the vast majority of those serving during the war entering the Union Army.
There was considerable difficulty in maintaining the strength of the Regular Army during the Civil War.
Bernardo and Bacon (1955, pp. 201–203) report that the incentives for enlistment in the Regular Army and
in the volunteers (the Union Army) were identical, whereas the latter provided a shorter term of service,
less rigorous discipline, and the freedom to elect one’s own officers. To my knowledge, no evidence exists on
why one would choose to enlist in one branch of the wartime military rather than another.
In the event of multiple enlistments, I use the height data from an individual’s first enlistment.29 However,
if an individual enlists prior to age 21 and subsequently re-enlists, I use information from his first post-age
21 enlistment. If he does not enlist after the age of 21, but does enlist prior to age 18, I use his first post-age
18 enlistment. If there are no enlistments at age 18 or later, I remove the individual from the sample. These
restrictions are intended to minimize the effect of post-observation growth to the extent possible,30 while
retaining the bulk of observations. I control for age of measurement by including
age-of-measurement indicators in all regressions in which height is the dependent variable.
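The rule for choosing among multiple enlistments can be written compactly. The function below is a sketch of that rule as described in the preceding paragraph; its name and input format are hypothetical, and it assumes that "post-age 21" and "post-age 18" include enlistments at exactly ages 21 and 18.

```python
def enlistment_for_height(ages_at_enlistment):
    """Return the index of the enlistment whose height is used, or None to drop.

    `ages_at_enlistment` lists one person's ages in chronological order of enlistment.
    The first enlistment at age 21 or later is preferred; failing that, the first
    at age 18 or later; a person with no enlistment at 18 or later is dropped.
    """
    for cutoff in (21, 18):
        for index, age in enumerate(ages_at_enlistment):
            if age >= cutoff:
                return index
    return None

print(enlistment_for_height([19, 24]))  # re-enlisted after 21: second record is used
print(enlistment_for_height([16, 17]))  # never measured at 18 or later: dropped
```

A person whose first enlistment is at age 21 or later is therefore always measured at that first enlistment, consistent with the default rule.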
4.1.2 Census Data for Enlisters
Data on the covariates xi for military enlisters were collected through linkage of the military data with the US
Census. Collecting data from the census (as will also be done for the supplementary random samples of the
28 The type of enlistment is recorded for about half of those in the Union Army sample. Among these, individuals are about
86 percent volunteers, 9 percent draftees, and 5 percent substitutes.
29 No linkage is available between multiple enlistments for the same person in this source. I identify multiple enlistments by
comparing all enlistments linked to a given census record in the linkage to be discussed below.
30 In particular, final stature among males in relatively poorly nourished populations is not achieved until approximately age
22 (A’Hearn, Peracchi, and Vecchi, 2009; Beard and Blaser, 2002; Eveleth and Tanner, 1976; Frisancho, 1993). The age at
which final height is achieved appears to be increasing in the degree of malnourishment of the population (Horrell and Oxley,
2015).
population) ensures that comparable data are collected for the entire data set, whether located in enlistment
records or not. Enlisters in the Union Army sample have already been linked to the US Censuses of 1850
and 1860 by the Union Army project (Costa, 2013).31 Using information from the 1850 and 1860 censuses,
I can observe characteristics of the enlisters and their households during adolescence. No previous linkage
has been performed for the enlisters that I extracted from the Register of Enlistments.32 I therefore linked
enlisters to census data using procedures outlined in Appendix D. I then transcribed census information for
the linked individuals and their households, covering the censuses of 1850–1880.33 One possible concern is
that selection into the linked sample may introduce some biases into the data. This possibility is studied in
section 4.3, where I conclude that my results are unlikely to be affected by selection into the linked sample.
For both the Union Army and the Regular Army samples I retain only individuals for whom census
information could be located between the ages of 9 and 18.34 This age range is desirable for several reasons.
First, in censuses prior to 1880, the relationship between the various members of a household was not
explicitly recorded. Capping the age range of the sample at 18 rather than some older age makes it much
simpler to determine whether an individual was a first-degree relative of the head of household, or the head
of household himself.35 Second and more importantly, it covers a period during which household conditions
are likely to contribute both to the determination of final height and to the enlistment decision. The future
recruits are children or adolescents and are thus in a period of important growth, but are old enough such that
their household characteristics are more likely to affect their labor market outcomes, and thus their decision
of whether or not to enlist in the military. Using information from older ages would raise the concern of
whether terminal height had already been achieved, and thus of whether the observed characteristics would
affect height. Moreover, focusing on pre-age 18 data reduces the possibility that individuals are observed
31 Linkage to other censuses has also been performed, but is either for only a subset of the data, or provides information at
ages beyond those in which the present study is interested.
32 Zehetmayer (2010) links enlisters in the Register of Enlistments from the birth cohorts of 1847 to 1880 in the 1880 census.
However, he collects data only on enlistments between 1898 and 1912. Thus, my linkage of those born 1832–1846 to the census
from the Register of Enlistments is novel. Moreover, his limitation of his sample to those enlisting after 1898 implies that the
youngest of the enlisters that I study (the 1860 cohort) would have to have stayed in the military to age 38, and subsequently
re-enlisted, in order to be observed. Clearly these individuals are not representative of the population of enlisters from these
birth cohorts. My linkage of individuals born 1847–1860 is therefore also necessary (i.e., it is not possible to simply use his
data).
33 As discussed in Appendix D, the 1832–1846 cohorts of the Regular Army are sampled separately from the 1847–1860
cohorts. For this reason I study each group separately. I transcribed a random sample of the linked individuals from the
1832–1846 cohorts, and all linked individuals from the 1847–1860 cohorts.
34 A range of ten years is necessary because of the decennial nature of the census: all individuals should, in principle, be
enumerated in a census exactly once between these ages.
35 In most census data sets, it is clear who is the head of household, as he (or she) is listed first. However, in the Union
Army data, the transcription was performed such that the recruit himself is always listed first followed by the remainder of the
household in the order that they were enumerated; the original ordering is not preserved. It thus cannot be determined if the
recruit is the head of household or if the second individual listed is the head of household. When consideration is restricted to
individuals under the age of 18, it is safe to assume that the individual himself is not the head of household, and in cases of
ambiguity (when the second listed individual and the recruit have different surnames), the recruit can be assumed to not be
related to the head of household.
after they have already joined the Army, which would raise the concern of whether the observed covariates
affected the military enlistment decision. Finally, as opposed to samples covering individuals at a younger
age, this sample makes it possible to include individuals as old as the birth cohort of 1832. This is due to
the fact that census data for 1840 and earlier are not particularly useful for such exercises, as they do not
include detailed household information, as do the censuses of 1850 and later. Samples covering individuals
at younger ages would require beginning at later birth cohorts and omitting individuals born at the early
end of the period of interest. For example, if the sample instead included individuals from ages 0–9, the
earliest birth cohort to have such an observation in a census of 1850 or later would be that of 1841, thus
omitting most of the birth cohorts experiencing the decline in heights from the data.
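The arithmetic behind the choice of age window can be verified directly. The sketch below approximates age as census year minus birth year (ignoring enumeration dates, an assumption made here for simplicity) and uses hypothetical function names.

```python
CENSUS_YEARS = (1850, 1860, 1870, 1880)  # censuses with detailed household information

def censuses_in_window(birth_year, min_age=9, max_age=18):
    """Census years in which someone born in birth_year is aged min_age..max_age."""
    return [year for year in CENSUS_YEARS
            if min_age <= year - birth_year <= max_age]

def earliest_covered_cohort(min_age, max_age, first_year=1800, last_year=1860):
    """First birth cohort observed within the age window in a census of 1850 or later."""
    return next(b for b in range(first_year, last_year + 1)
                if censuses_in_window(b, min_age, max_age))

print(earliest_covered_cohort(9, 18))  # window used in this paper
print(earliest_covered_cohort(0, 9))   # alternative window at younger ages
```

Under this approximation, every cohort from 1832 to 1860 falls in the 9 to 18 window in exactly one census, whereas a window of ages 0 to 9 would lose the cohorts born before 1841.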
Many previous studies have instead used information collected at the time of enlistment to gain additional information on enlisters (Haines, Craig, and Weiss, 2003; Zehetmayer, 2011). However, this approach
has several drawbacks in the present context. First, it describes an individual’s circumstances at the time
of enlistment, at which final height is more likely to have already been achieved. Second, although military
enlistment data provide information that is unavailable from the Census—such as specific (a finer level than
the state) place of birth—information on the individual’s household (e.g., wealth, size, occupational composition) is not available from this source. Bailey, Hatton, and Inwood (2014) show that these characteristics are
important in determining height. Finally, the information taken at the time of enlistment is collected under
different circumstances from that in the Census and in a context in which non-enlisters cannot be observed;
collecting all information from the Census makes the samples of enlisters and of non-enlisters comparable.
From these sources, I collected information on the property ownership of the enlister’s household, the
enlister’s place of residence, the size and composition of the enlister’s household, the occupations of the
members of the enlister’s household, and the enlister’s school attendance.
4.1.3 Census Data for the Population
As discussed above, the introduction of a separate comparison sample of the covariates of the general population is required in order to estimate the model. To create such a sample for each of the three military
enlistment samples, I collect information on the covariates $x_i$ for a random sample of the general population:
the public use samples of the 1850, 1860, and 1870 censuses (Ruggles et al., 2010).36 These sources contain
the same information as that discussed in section 4.1.2, and I again restrict attention to native-born (outside
of the West region) white males observed between ages 9 and 18.
36 Some of these individuals are also linked to the 1880 census (Ruggles et al., 2010), providing additional information on
nativity.
There is no information in this source on military enlistment, and it is therefore possible (and, in the
case of the Union Army, quite likely) that there are individuals in these samples who did serve in the military.
Moreover, it is not possible to systematically identify individuals in these samples who did serve. In the
case of the Union Army, there is not (to my knowledge) a comprehensive list of enlisters; moreover, even
if there were such a list (such as the Register of Enlistments for the Regular Army enlisters), identifying
enlisters would require the use of (necessarily imperfect) record linkage techniques. I therefore make no
attempt to remove enlisters from this sample. As a result, a standard binary choice estimator for stratified
samples cannot be used, as the outcome is essentially unobserved for this sample. I discuss an approach
for estimating conditional choice probabilities in this context in section 5.1. At this point it is convenient
to define two relevant terms using the language of Cosslett (1981): I will refer to the several samples of
enlisters linked to census data as the “choice-restricted samples” (that is, restricted to those who choose to
enlist in the military and containing their census information and height) and to the samples of the general
population as the “supplementary samples.”
These data are used to create three supplementary samples—one each for comparison to each of the
military samples created above. In each case, the supplementary sample is a random sample of the population
at risk for enlistment in the military sample to which it is being compared. Thus, the supplementary sample
for comparison to the Union Army is a random sample of native-born white males living in non-seceding areas
and born between 1832 and 1846. The supplementary sample for the 1832–1846 cohorts of the Regular Army
is the same as that for the Union Army, but includes individuals living in seceding states. The supplementary
sample for the 1847–1860 cohorts of the Regular Army includes a random sample of native-born white males
in those birth cohorts.
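The three populations at risk can be summarized as membership rules. The sketch below is a simplified illustration: the `Person` record and its field names are hypothetical, and the further restrictions described in the text (observation at ages 9–18 and the exclusion of the West region) are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record layout; not the actual census variable names.
    birth_year: int
    native_born: bool
    white: bool
    male: bool
    in_seceding_area: bool  # residence in an area that seceded


def _native_white_male(p):
    return p.native_born and p.white and p.male


def at_risk_union_army(p):
    """Supplementary sample for the Union Army (1832-1846, non-seceding areas)."""
    return (_native_white_male(p)
            and 1832 <= p.birth_year <= 1846
            and not p.in_seceding_area)


def at_risk_regular_army_early(p):
    """Supplementary sample for the 1832-1846 Regular Army cohorts."""
    return _native_white_male(p) and 1832 <= p.birth_year <= 1846


def at_risk_regular_army_late(p):
    """Supplementary sample for the 1847-1860 Regular Army cohorts."""
    return _native_white_male(p) and 1847 <= p.birth_year <= 1860
```

By construction, anyone satisfying `at_risk_union_army` also satisfies `at_risk_regular_army_early`, so the first population is nested in the second.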
4.1.4 Locality Information
In addition to the individual- and household-level data collected from the census manuscript schedules, I
collect some information—for both the choice-restricted and supplementary samples—on an individual’s
county of residence from the Minnesota Population Center (2011, henceforth, NHGIS). From this source
I collect, based on county of residence, data on agricultural and industrial production and on population,
yielding per-capita measures of wheat production, per-capita quantities of milk cows and pigs, and per-capita
measures of industrial production. A better measure of the conditions faced by the potential enlister would
likely be the same measure taken from the county of birth (Craig and Weiss, 1998; Haines, Craig, and Weiss,
2003; Wilson and Pope, 2003). However, extraction of the county of birth from the Register of Enlistments
is quite difficult and would be impossible for the supplementary census samples, in which only the state of
birth is given; for this reason, I focus on the county of residence at ages 9–18, taken from the census.
4.1.5 Voting Data
For identification of the sample-selection model, exclusion-restricted variables $z_i$ (i.e., variables that enter the
enlistment equation but not the height equation) are required.37 For this purpose, I collect county-level voting data for the presidential elections of 1856 and 1860 (ICPSR, 1999).38 Formally, the exclusion
restrictions must satisfy the condition that $h_i$ is mean independent of $z_i$, conditional on $x_i$ and $c_i$ (Huber
and Mellace, 2014; see also Kitagawa, 2010; Manski, 1994).39 That is, omission of the voting variables
from a regression with height as the dependent variable and cohort indicators and the various elements of
$x_i$ (including the individual-, household-, and county-level information described above, as well as region
fixed effects) as regressors must not result in omitted variables bias. Moreover, the excluded variables must
be relevant to the military enlistment decision.40
The voting data are intended to measure an individual’s (proxied by his county of residence’s) attitude
toward military service. Intuitively, relevance should come from the fact that the central issues of these
elections were essentially the same issues over which the Civil War was fought, and were closely related to
issues central to the Army’s post-war mission. The central issue in the 1856 election was the general debate
over the spread of slavery, in particular the Kansas-Nebraska Act. The Democratic candidate and eventual
victor, James Buchanan, ran on a platform supporting the Kansas-Nebraska Act in particular and popular
sovereignty more generally, as well as the annexation of Cuba as a slave state.41 His main challenger was
the Republican John Frémont (the first Republican presidential candidate), who opposed him on all of these
issues. Former President Millard Fillmore also ran on behalf of the nativist American party.
The chief issues in the election of 1860 were slavery and the use of force in the preservation of the Union
(“coercion”). Of the four candidates, the Northern Democrat Stephen Douglas of Illinois was the most vocally
in favor of the use of force to preserve the Union (the “Norfolk Doctrine”), and campaigned on this issue
37 The reasons for this requirement are discussed further in section 5.2.
38 Data from the election of 1864 are also used for additional specifications, which are available on request. Results are similar
to those generated using the 1856 and 1860 data.
39 That is, $E(h_i \mid x_i, c_i, z_i) = E(h_i \mid x_i, c_i)$.
40 Huber and Mellace (2014) also discuss monotonicity conditions stemming from the assumption of additive separability of
the errors. These will be revisited in section 7.2.
41 All three of these were essentially pro-slavery measures. Popular sovereignty was a doctrine advocated by Illinois Senator
Stephen Douglas. It allowed the citizens of a newly admitted state or territory to vote on whether to be admitted as slave or
free, and overturned the Missouri Compromise of 1820, which prevented the spread of slavery north of the 36°30′ N parallel. In
practice, the doctrine led to violent episodes such as “Bleeding Kansas,” in which a war was fought after the Kansas-Nebraska
Act allowed the residents popular sovereignty.
throughout the South (Bowman, 2010, pp. 142–143). Though he received the fewest electoral votes of any
candidate, Douglas carried the second-largest total of the popular vote. The eventual victor, Republican
Abraham Lincoln, held stronger anti-slavery views than Douglas (who had authored the popular sovereignty
doctrine and advocated annexation of Cuba), but did not (at least during the campaign) advocate the use
of force to preserve the Union. Indeed, “the coalition voting for Lincoln was not pro-war in 1860,” preferring
to let the southern states go in peace (Costa and Kahn, 2007, p. 326, fn. 2). Only after the bombardment of
Fort Sumter, five months after the election, did Lincoln call for the use of force.42 The Southern branch of
the Democratic party was represented by John Breckinridge, who would later become a Confederate general.
He garnered the support of secessionists in the South. Finally, John Bell, of the Constitutional Union party,
took no stance on slavery, advocating preservation of the Union; however, he opposed the use of force to do
so, and eventually defected to the Confederacy because he did not see the Federal government as having the
right to invade a state.
That voting in these elections should have had an effect on the military enlistment decision is intuitively
clear. The Civil War was fought over the issues of slavery and preservation of the Union. These elections
centered on these issues, and the views on these issues should affect both voting and the propensity to enlist.
Similarly, after the Civil War, one of the Army’s main duties was Reconstruction—the military occupation
of the South. It is not hard to imagine that this type of service would be relatively more attractive to those
who had favored the Civil War.43
This approach is very similar to the use of election data by Costa and Kahn (2003, 2007). Costa and
Kahn (2003) relate voting data from the elections of 1856 and 1860 to the probability of desertion from the
Union Army, finding that enlisters from counties with greater support for Republican candidates—described
by Costa and Kahn (2003) as a proxy for ideology—were less likely to desert. Similarly, Costa and Kahn
(2007) use voting data from the 1864 election to measure a community’s support for the Civil War, finding
that deserters from communities with greater support for the war were more likely to migrate after the war,
and were more likely to settle in more anti-war communities, as measured by voting. The relevance to the
desertion decision suggests relevance to the enlistment decision.
I specifically use two variables in the primary specifications: the fraction of the individual’s county of
residence voting for James Buchanan in 1856, and the fraction voting for Stephen Douglas in 1860; these
42 Interestingly, Costa and Kahn (2007) show that religion was an important determining factor of voting in this election and
in 1864. Nevins (1959, pp. 72–74) also discusses reluctance among those in the Lincoln administration to go to war prior to
Fort Sumter.
43 The main issue in the 1864 election was continuation of the Civil War. The election pitted the Republican incumbent
Abraham Lincoln against the Democrat George McClellan, who ran on a platform favoring a negotiated settlement with the
South.
variables are summarized in Figures B.1 and B.2.44 Besides their strong positions on slavery and preservation
of the Union, these candidates have the advantage of having been on the ballot in every state (except Texas
in Douglas’s case, and South Carolina, where electors were chosen by the state legislature). Focusing on the
Republican candidates in 1856 and 1860 would require the omission of southern states, in which they were
not on the ballot.
What remains to be considered is the excludability of these voting variables; that is, whether it is in
fact the case that they are conditionally independent of height. At this stage, this assumption cannot be
tested formally. However, I argue that, conditional on the other covariates included in estimation, the voting
variables should not be related to height. One might imagine, for example, that poor southerners might have
been more likely to support slavery, less likely to enter the military, and shorter. However, any correlation
with height must come after conditioning on covariates such as occupation, region, and property ownership
in order to be problematic. I argue that this is unlikely to be the case. I investigate the matter in more
detail in section 7.2.
4.2 Summary Statistics
Table 1 summarizes the structure of the sample, including presenting information on which censuses each
cohort’s data are drawn from. Note that although the supplementary samples are listed under the heading
of each military sample, these are not military samples; only the choice-restricted samples are taken from
military data. Figure 2 shows the sample size by sample and birth cohort for the choice-restricted samples
and for the supplementary samples.
4.2.1 Heights
Figures 3–6 describe the height data collected from the various sources described above. Figure 3 presents
kernel density estimates of the raw (that is, uncorrected for either truncation or measurement age) distributions of heights in each of the three military samples, as well as histogram approximations of these
distributions. The Union Army enlisters are taller than the 1832–1846 cohorts of Regular Army enlisters
by 0.64 inches and taller than the 1847–1860 cohorts of Regular Army enlisters by 0.795 inches. The earlier
group of cohorts of Regular Army enlisters is taller than the second group of cohorts by 0.153 inches. All of
these differences are statistically significant at the five-percent level or better.
44 The apparent shortfall in Pennsylvania for Douglas comes from his combination into a Fusion ticket there, which joined
him with John Bell.
Figure 4 depicts the mean observed height for each birth cohort, controlling for the age of measurement
through the inclusion of age-of-measurement indicators, and for minimum height requirements through the
use of truncated regressions with a truncation point of 64 inches (A’Hearn, 1998; Komlos, 1998a).45 As will
be the case throughout the paper, the 1847–1860 cohorts are represented by the Regular Army enlisters
from these cohorts; the 1832–1846 cohorts are represented by the Union Army in one curve, and by the
Regular Army enlisters for these cohorts in the other. Kernel-smoothed versions of these trends
are also included. The antebellum puzzle is clearly visible in both trends, which each exhibit a decline in
heights in the 1830s and early 1840s. However, the magnitudes of the declines are quite different. While
the sample composed by combining the Union Army sample and the later cohorts of the Regular Army
sample exhibits a decline of approximately 1.75 inches, the Regular Army-based trend is milder, exhibiting
a decline of only about 0.60 inches. The decline in the Union Army-based sample is greater than that found
by A’Hearn (1998), Costa and Steckel (1997), and Fogel (1986), who find a decline of about 0.7 (in the case
of A’Hearn, 1998) to 1.2 inches (in the case of Costa and Steckel, 1997; Fogel, 1986). However, apart from
showing that the raw data exhibit a decline and that the decline is larger in the Union Army-based sample,
these trends should not be taken to be indicative of the population, as they are not corrected for selection
on observables, as is commonly done in the literature. Moreover, they extend to 1860, whereas A’Hearn’s
(1998) data do not go beyond the 1847 cohort. When the trends are corrected for selection on observables
(as will be shown in section 6), the figure becomes approximately 1.25 inches, and thus more comparable to
the existing estimates.
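To see why a minimum height requirement calls for a truncated regression rather than a simple average, consider a sketch that assumes heights are normally distributed. The parameter values used in the note after the code are hypothetical and chosen only for illustration; they are not estimates from the samples described here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_mean(mu, sigma, lower):
    """Mean of a Normal(mu, sigma^2) variable conditional on exceeding `lower`."""
    alpha = (lower - mu) / sigma
    return mu + sigma * normal_pdf(alpha) / (1.0 - normal_cdf(alpha))


def truncated_loglik(heights, mu, sigma, lower):
    """Log-likelihood of heights observed only above `lower`.

    A truncated regression maximizes a likelihood of this form, with mu
    replaced by a linear function of the regressors.
    """
    alpha = (lower - mu) / sigma
    log_tail = math.log(1.0 - normal_cdf(alpha))
    return sum(
        math.log(normal_pdf((h - mu) / sigma) / sigma) - log_tail
        for h in heights
    )
```

With an assumed population mean of 68 inches and standard deviation of 2.5 inches, `truncated_mean(68.0, 2.5, 64.0)` exceeds 68 by roughly 0.3 inches, which is the upward bias that a naive sample mean would carry under a 64-inch minimum.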
Enlistment years for the two Regular Army subsamples are presented in Figure 5. The bulk of enlistments
for the 1832–1846 cohorts occurred during the Civil War; however, there were also a considerable number of
enlistments before and after the War for these cohorts. The 1847–1860 cohorts of the Regular Army exhibit
a more uniform pattern of enlistments over time, likely due to the fact that they had not yet reached age 18
by the end of the war.
The average measurement age for each birth cohort and each sample of military enlisters is depicted in
Figure 6. There is a sharp decline in the age of measurement between the birth cohorts of 1832 and 1846 in
both the Union Army and Regular Army samples, though the decline is less precipitous in the Regular Army
45 That is, the plot shows the values of the estimated coefficients $\beta_t$ from a truncated regression (with truncation point of 64
inches) of the form
$$h_{it\tau} = \sum_t \beta_t c_{it} + \sum_\tau \gamma_\tau m_{i\tau} + \varepsilon_{it\tau},$$
where $h_{it\tau}$ is the height of individual $i$, of birth cohort $t$, measured at age $\tau$; $c_{it}$ are a series of dummy variables indicating
individual $i$’s membership in cohort $t$; and $m_{i\tau}$ are a series of age-of-measurement indicators, indicating measurement of
individual $i$ at age $\tau$. The base age is $\tau = 21$.
sample than in the Union Army sample. The age of measurement then remains at a roughly constant level
for the 1847–1860 cohorts. Given that the decline in stature of the antebellum puzzle is contemporaneous
with the sharp decline of the age of measurement, this pattern might lead one to be concerned that the
decline in heights of these birth cohorts is driven by some idiosyncrasy of service in the Civil War rather
than by an actual decline in population heights. One possibility is that younger cohorts appeared shorter
because they had not yet reached their terminal height at the time of the Civil War, when they enlisted.
For two reasons, this issue is not concerning. First, there is a general consensus in the anthropological and
biological literature (Cline et al., 1989; Frisancho, 1993) that male growth ceases by age 22 at the latest,
and even earlier in better nourished populations. Thus, it is safe to conclude that the decline in heights
from roughly 1832 to 1840 cannot be driven by declining measurement ages because these individuals are
overwhelmingly (if not exclusively) observed after age 22. Second, the trends in height are corrected for the
age of measurement through the inclusion of measurement age indicators; thus any link between height and
measurement age (which may be present among later cohorts) should not survive into the reported trends.
The more pressing concern is that selection into military service among the older cohorts is more positive.
For example, if by age 30 only the fittest would enlist, then the older cohort might appear artificially taller.
This is a distinct possibility, but the selection-correction method, which permits selection to differ by birth
cohort, should address this concern in the corrected trends.
4.2.2 Census, Voting, and County-Level Data
From the linked and unlinked census samples (representing the choice-restricted and supplementary samples,
respectively), data were collected on the property ownership of each individual’s household (expressed in 1860
dollars, using deflators of Lindert and Margo, 2006), the composition of the household (including its size
and whether the individual of interest—either the enlister or prospective enlister—was related to its head),
whether the household resided in an urban or rural county,46 and whether the individual of interest attended
school in the year prior to observation. The occupations of each member of the individual’s household were
also gathered and were classified according to the system used by the Union Army Project (Costa, 2013).
As these occupations can, in a sense, be ordered, the household is classified by the “maximum” occupational
status of its members; for example, if one member of the household is professional and the other is clerical,
the household is categorized as professional. In addition, the birth region of the individual in question is
classified using the major US census regions (Northeast, Midwest, South, and West). Recall that the sample
46 I define an urban area as a county with a non-zero urban population, as defined by NHGIS.
is restricted to include native-born whites, so individual nativity and race are not of interest.47
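The "maximum" occupational status rule can be sketched as follows. The ranking below is a hypothetical placeholder: only the ordering of professional above clerical comes from the text, and the remaining class names and their order are assumptions rather than the Union Army Project's actual scheme.

```python
# Hypothetical ranking, lowest to highest. Only "professional" above
# "clerical" is taken from the text; the other classes are placeholders.
OCCUPATION_RANK = {
    "unskilled": 0,
    "farmer": 1,
    "artisan": 2,
    "clerical": 3,
    "professional": 4,
}


def household_occupation(member_occupations):
    """Return the highest-ranked occupation among household members.

    Returns None when no occupations are recorded for the household.
    """
    if not member_occupations:
        return None
    return max(member_occupations, key=lambda occ: OCCUPATION_RANK[occ])


print(household_occupation(["clerical", "professional"]))  # professional
```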
Table 2 summarizes some of the individual- and household-level data taken from the census for each
subsample for the choice-restricted samples and for the supplementary samples together with the voting
data.48 Columns (1), (4), and (7) present information for the choice-restricted samples of military enlisters.
Columns (2), (5), and (8) present information for the supplementary random samples of census information
from the population as a whole. Columns (3), (6), and (9) present t-tests of the difference between the supplementary and the choice-restricted samples for each of the military enlistment samples. For the most part
(with the exception of birth region, which is unsurprising given the exclusion of most southerners from the
Union Army sample or its supplementary sample, and some occupational variables) the summary statistics
of the supplementary samples appear similar.49 The choice-restricted samples, however, are quite different
from one another and from the supplementary samples. Indeed, nearly all of the t-tests indicate differences
between the enlisters and the general population that are statistically significant at the one-percent level. These differences extend to the voting data, which
exhibit statistically significant differences between the choice-restricted and supplementary samples for each
military enlistment sample (that is, at least one of the vote fractions shows a significant difference between
the choice-restricted and supplementary samples in each sample), suggesting (but, of course, not proving)
that the voting data may be important in the military enlistment decision.
4.2.3 Height Regressions
In order to document the relationship between height and the several covariates collected from the census
data, I perform a number of regressions with height as the dependent variable. These regressions do not
correct for selection into the sample, either on the basis of observable or unobservable characteristics. As a
47 For a select portion of the sample, paternal nativity is known and the mother’s age at the birth of the (prospective)
enlister can be computed. Although the sample is restricted to the native born, parental nativity must be considered. The
Army was disproportionately composed of foreign born individuals (Foner, 1970); it is thus reasonable to expect that those
with parents of foreign birth would be more likely to enlist than those with native-born parents. Moreover, individuals with
parents of foreign birth could reasonably be expected to be shorter than those with parents of native birth because of differences
in average heights between Europe and the United States (Fogel, 1986; Hatton and Bray, 2010). Mother’s age is used as a
proxy for birth order. Birth order is likely to affect height (Black, Devereux, and Salvanes, 2015; Hatton and Martin, 2010;
Hermanussen, Hermanussen, and Burmeister, 1988; Lundborg, Ralsmark, and Rooth, 2015) and may affect military enlistment
because of different inheritance prospects. However, census records cannot definitively determine birth order; for this reason, I
proxy this variable by mother’s age at the individual’s birth, which is likely to be correlated with birth order. However, these
variables are available only for a select subset of the sample, so they are not included in baseline specifications to avoid the
resulting sample-selection issues. In specifications that include these variables (and are limited to observations for which they
are available) results are similar. These results are available upon request.
48 The full table, including occupational data and county-level NHGIS data, is presented in Table A.1.
49 Some southerners will still be included in the supplementary sample for the Union Army, however, as not all portions of
the South seceded and southerners may have migrated out of seceding areas. The Union Army supplementary sample is a strict
subset of that for the Regular Army for the same birth cohorts; the latter includes residents of seceding states, as they may
have enlisted in the Regular Army prior to the Civil War or after it.
result, no interpretation beyond sample-level correlations should be attached to the results. Table 3 presents
the results of truncated regression of height on a variety of covariates $x_i$ and on the excluded variables $z_i$ with
a truncation point of 64 inches.50 The table includes regressions only for the combined samples spanning
the entirety of the 1832–1860 cohorts, weighting appropriately to account for the separate sampling of each
group of cohorts and displaying results for only a limited set of covariates.51 Columns (1) and (3) present
the results of regression of heights on the covariates $x_i$ alone. Columns (2) and (4) include the voting data
$z_i$ as well.
The results of columns (1) and (3) show that the height data collected from each source display the
expected relationships with many of the covariates, though these relationships are at times imprecisely
estimated. For example, property-owning households produced taller individuals; individuals related to
the head of their household in childhood and adolescence were taller; an urban height penalty is apparent
relative to residents of rural areas; and residents of the Northeast are considerably shorter than those from
other regions. Other covariates produce less consistent results. For instance, school attendance displays
a statistically significant and positive relationship with height in the Regular Army-based samples, while
a negative and statistically insignificant relationship is evident in the sample combining the Union Army
enlisters with the second group of Regular Army enlisters.
The results of the even-numbered columns provide information regarding the relationship between the
voting variables and height in the selected sample, conditional on the other covariates. Of course, showing
that height is mean independent of voting conditional on the other covariates in the selected sample does not
imply that they are mean independent in the unselected sample (nor would showing that mean independence
fails in the selected sample imply that it fails in the population). Nonetheless, studying the relationship in
the selected sample provides interesting information. In neither sample are $\chi^2$-tests of the joint significance
of the two variables able to reject the null of no explanatory power of $z_i$ (Column (2): $\chi^2_2 = 0.57$, $p = 0.75$;
Column (4): $\chi^2_2 = 1.05$, $p = 0.59$). Moreover, the coefficients are small in magnitude. Consider, for example,
the largest coefficient that appears for the voting data—0.245 for the Douglas vote share in the Regular
Army sample; a one standard-deviation increase in this vote fraction of approximately 19 percentage points
(see Table 2) is thus associated with only a 0.05 inch increase in average height, or only about 0.02 standard
deviations of height.
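The reported p-values and the implied effect size can be reproduced from the figures quoted above. The sketch relies on the closed form for the upper tail of a chi-squared distribution with two degrees of freedom, $P(X > x) = e^{-x/2}$; the function and variable names are illustrative.

```python
import math


def chi2_sf_2df(stat):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Test statistics quoted in the text for the joint test of the voting variables.
for label, stat in [("Column (2)", 0.57), ("Column (4)", 1.05)]:
    print(label, round(chi2_sf_2df(stat), 2))  # 0.75 and 0.59

# Largest voting coefficient times one standard deviation of the vote share.
coefficient = 0.245      # inches per unit of vote share (Douglas, Regular Army)
sd_vote_share = 0.19     # approximately 19 percentage points
print(round(coefficient * sd_vote_share, 3))  # 0.047, i.e. about 0.05 inches
```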
50 Table 3 includes only a subset of variables and samples; the full sets of variables and samples are presented in Table A.2.
Regression results including paternal nativity and mother’s age at birth are available on request.
51 Table A.2 includes specifications for additional samples and variables.
4.3 Representativeness of the Linked Samples
The military height data used in this paper differ from those typically used in the anthropometric history
literature. While this literature generally takes random samples of the height data available in any particular
source (leaving, in a military enlistment sample, the military enlistment decision as the only relevant point
of sample selection), I limit my samples to the subset of these records that could also be linked to census
records. The introduction of this second selection mechanism is necessary in order to gather covariates for
comparison to the population as a whole and thus for estimation of the sample-selection model; but it may
induce additional and potentially problematic bias.52
Bias from non-representative linking can take two forms. First, I determine whether correcting for
sample-selection bias has a meaningful effect on the trends in stature by comparing the trends corrected for
selection on both observable and unobservable characteristics (which should represent the population trend
in heights) to those corrected only for selection on observable characteristics (representing the conventional
approach in the anthropometric history literature). If the sample-selection model corrects for selection both
into military service and into census linking,53 then any difference in trends may be due to selection into
census linking and not into military enlistment. Such bias would not be present in most studies of historical
heights (because they do not use linked samples), but I would erroneously conclude that sample-selection
bias existed in the height samples. Second, it is unlikely that the sample-selection correction would properly
correct for selection both into military enlistment and census linking. Mroz (2015) shows that studying two
types of selection in a single index model has the potential to exacerbate any sample-selection bias, leading
to estimates with even greater bias than those from a naive approach that ignores selection altogether.
Due to the possibly severe consequences of selection into census linkage, it is important to determine
empirically whether such selection is likely to be present. As with the sample-selection issue that motivates
this paper, selection into linkage is only problematic if it varies over cohorts. Fortunately, unlike the sample-selection problem, in which the outcome of interest is observed only for the selected sample, it is possible to
directly test for selection of this type because the outcome variable (height) can be observed for both the
selected (successfully census-linked) and unselected (failed to link to the census) samples. In order to test
for sample-selection bias induced by selection into census-linking, I collected data on a random sample of
Regular Army enlisters for the 1832–1860 cohorts without any attempt at linkage to the census.54 Similarly,
52 The potential for non-representative samples to be generated by linkage procedures is discussed by Ferrie (1996), among
others.
53 A strong condition must hold for this to be the case (Mroz, 2015).
54 The Regular Army samples are weighted so that the distribution of enlistment years of the unlinked sample is the same
as that of the linked sample. I perform this weighting (rather than the reverse) in order to correct for the overrepresentation
I collect from the Union Army project information on enlisters without regard to linkage. Comparing the
distributions and trends in heights of the linked sample and the unlinked sample (which represents the
population of military enlisters as a whole rather than only those who could not be linked) makes it possible
to determine whether problematic sample-selection bias is likely to exist.
Table 4 presents regressions comparing the trends in heights of the linked and unlinked (that is, representative of the whole enlisting population regardless of census linking) samples. Each column of this table
presents the results of two specifications. The first regresses heights on birth year indicators, measurement
age indicators, and an indicator for being in the linked sample. The coefficient on the linked indicator is
presented in Table 4. This tests whether the linked and unlinked trends differ in level. The second regression adds interactions of the linkage indicator and the cohort indicators. The results of a $\chi^2$-test of the joint
significance of these interactions (a test of whether the trends in height of the linked and unlinked samples differ
from one another) are also presented in Table 4. Results of these regressions show statistically significant
evidence of positive selection into linkage on the basis of height for the Union Army, the second cohort group
of the Regular Army and both combination samples spanning the entire study period; the first cohort group
of the Regular Army shows statistically insignificant evidence of positive selection into linkage on the basis of
height. In no case, however, is there any indication of a statistically significant difference in trends.55 Figure
7 replicates this analysis graphically by plotting the trends in average heights over birth cohorts in both
the linked and the unlinked groups. While differences in levels are evident between the linked and unlinked
trends, the trends themselves are visually quite similar.
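As a sketch, the two specifications behind Table 4 can be written out as follows. The symbols $\lambda$, $\kappa$, and $e_i$ are introduced here purely for illustration and are not taken from the paper; $\ell_i$ denotes the linked-sample indicator, $c_i$ the birth-cohort indicators, and $m_i$ the measurement-age indicators, with one cohort interaction omitted as the reference category.

```latex
\begin{align*}
h_i &= c_i'\gamma + m_i'\pi + \lambda \ell_i + e_i, \\
h_i &= c_i'\gamma + m_i'\pi + \lambda \ell_i + (\ell_i c_i)'\kappa + e_i .
\end{align*}
```

Under this assumed notation, the coefficient reported in Table 4 corresponds to $\lambda$ in the first line, and the χ²-test is a test of $H_0\colon \kappa = 0$ in the second.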
It therefore appears that the only difference between the linked and unlinked trends is in level, with the
linked taller than the population of military enlisters as a whole; that is, there exists selection into linkage,
but it is cohort-invariant. The presence of positive selection into census linking on the basis of height is
unsurprising, as census linkage is likely to favor those who provide accurate information in a number of
sources, and who are therefore likely to be better educated, and thus also likely to be taller (Ferrie, 1996).
This difference suggests that one should be cautious in interpreting the level of the trends corrected for
selection on both observable and unobservable characteristics. However, I find no reason to believe that
the trend itself is unrepresentative of the population of military enlisters, and therefore conclude that the
correction for sample-selection is informative regarding the trend in heights of the population. To put it
of individuals who enlist many times in the unlinked sample. In order to transcribe more effectively, I randomized on the level
of the microfilm image rather than the individual enlister. As a result, any analysis performed clusters standard errors on the
image level. There were 1,382 images sampled for the sample of 6,805 individuals, for an average of 4.9 individuals per image.
55 I perform a similar analysis for regional differences, with the results available upon request. No statistically significant
selection into linkage is found for either compound sample.
briefly, any bias from census linking should be captured by the intercept.
To further explore differences between the linked and unlinked military samples, I also gather a number
of covariates from the military enlistment records for both the linked and unlinked. As these are taken from
military records rather than from census records located through linkage (as are the covariates used in the
main analysis), these covariates are not necessarily comparable to those discussed above. The covariates
collected are region of birth, year of birth, year of enlistment, and occupation (categorized using the same
categories as above). I also create measures of name complexity and length separately for first name and
surname. Name complexity is measured by the scrabble score, which is increasing in the length and complexity of a name.56 These measures are included in order to capture the fact that individuals with unique
names are generally easier to match.
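A minimal sketch of a Scrabble-score measure of name complexity is given below. It assumes the standard English Scrabble tile values and ignores case and non-letter characters; how the paper treats such details is not stated, so the function name and these choices are illustrative assumptions.

```python
# Standard English Scrabble tile values (a fact about the game, not taken from the paper).
SCRABBLE_POINTS = {
    1: "AEILNORSTU",
    2: "DG",
    3: "BCMP",
    4: "FHVWY",
    5: "K",
    8: "JX",
    10: "QZ",
}
LETTER_VALUE = {
    letter: points
    for points, letters in SCRABBLE_POINTS.items()
    for letter in letters
}


def scrabble_score(name: str) -> int:
    """Sum tile values over the letters of a name (hypothetical helper).

    Case is ignored and characters without a tile value (spaces,
    apostrophes, hyphens) contribute zero.
    """
    return sum(LETTER_VALUE.get(ch, 0) for ch in name.upper())


# Longer names and names with rarer letters both score higher.
common = scrabble_score("John")
rare = scrabble_score("Zachariah")
```

Because every letter adds at least one point, the score rises with name length, and rare letters such as Z or J add more, which is the sense in which the measure captures both length and complexity.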
In Table 5, I study whether the sample is balanced on the basis of these covariates. In particular, I
present the results of a number of regressions of the form
$$x_i = \varsigma_0 + \varsigma_1 \ell_i + \nu_i,$$
where $x_i$ is some covariate and $\ell_i$ is an indicator for being in the linked sample. Cells of Table 5 present
estimates of $\varsigma_1$, which is the degree to which a particular covariate is overrepresented in the linked sample
relative to the population of enlisters as a whole. For example, Northeasterners make up 3.2 percentage
points less of the linked Union Army sample than the population of the Union Army. Overall, there are
statistically significant differences between the linked and unlinked samples on the basis of name length and
complexity, in terms of region of birth (generally under-representing the Northeast and over-representing the
Midwest), and on the basis of occupation.
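With a single binary regressor, the slope of the balance regression is simply the difference in group means. The sketch below illustrates this on made-up data; the function names and the example values are hypothetical, and the clustering of standard errors is not covered.

```python
from statistics import mean


def balance_coefficient(x, linked):
    """Intercept and slope from regressing x on a constant and a 0/1 indicator.

    The slope equals mean(x | linked = 1) - mean(x | linked = 0), and the
    intercept equals mean(x | linked = 0).
    """
    x1 = [xi for xi, li in zip(x, linked) if li == 1]
    x0 = [xi for xi, li in zip(x, linked) if li == 0]
    intercept = mean(x0)
    return intercept, mean(x1) - intercept


def ols_slope(x, linked):
    """Textbook OLS slope, cov(linked, x) / var(linked), for comparison."""
    l_bar, x_bar = mean(linked), mean(x)
    cov = sum((li - l_bar) * (xi - x_bar) for xi, li in zip(x, linked))
    var = sum((li - l_bar) ** 2 for li in linked)
    return cov / var


# Hypothetical data: an indicator for Northeastern birth in each sample.
northeast = [1, 0, 0, 0, 1, 1, 0, 1]
linked = [1, 1, 1, 1, 0, 0, 0, 0]
intercept, slope = balance_coefficient(northeast, linked)
```

In this toy example the slope is negative, which would be read as Northeasterners being underrepresented in the linked sample.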
To correct for these imbalances, I compute weights in order to correct for selection into census linkage on
the basis of these observable characteristics. In particular, I estimate a probit model for selection into linkage
using Cosslett’s (1981) likelihood, and use the results to compute inverse conditional linkage probabilities by
which to weight.57 I first reproduce the above analysis of the trends in heights of the linked and unlinked,
weighting the linked samples by the inverse linkage probability. The results are presented in Table A.3
and Figure B.3. The differences in level between the heights of the linked and unlinked are smaller than
in the unweighted equivalents, suggesting that some of the level differences are due to differences in these
56 See Biavaschi, Giulietti, and Siddique (2013) for more details on the use of the scrabble score as a measure of name
complexity.
57 I omit the top one percent and bottom one percent of the sample, in terms of weights, in order to avoid having the results
be driven by such outliers receiving too much weight. Where this omission qualitatively affects the results, I make a notation
in a footnote.
observable characteristics. Importantly, any differences in trend between the linked and unlinked are small
and statistically insignificant; only in the case of the first cohort group of the Regular Army is there any
difference, but this difference is only marginally statistically significant, and a visual inspection of Figure B.3
shows that the trend is nonetheless quite similar in both the linked and unlinked groups.58 In all analyses
below (except where indicated otherwise), I weight by this inverse linkage probability in order to correct for
non-representative linkage.59
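The weighting step can be sketched as follows, taking the estimated conditional linkage probabilities as given. The function name and the integer-percent trimming parameter are assumptions made for illustration; the sketch does not estimate the probit model itself.

```python
def trimmed_inverse_weights(link_probs, trim_percent=1):
    """Inverse-probability weights with the extreme weights dropped.

    link_probs: estimated linkage probabilities, each in (0, 1].
    trim_percent: percent of observations dropped from each tail of the
        weight distribution (1 mirrors the one-percent rule in footnote 57).
    Returns a list of (original_index, weight) pairs for retained observations.
    """
    weights = [(i, 1.0 / p) for i, p in enumerate(link_probs)]
    ranked = sorted(weights, key=lambda pair: pair[1])
    n_drop = len(ranked) * trim_percent // 100  # observations dropped per tail
    kept = ranked[n_drop:len(ranked) - n_drop]
    return sorted(kept)  # restore the original ordering by index


# Hypothetical probabilities for 200 linked individuals.
probs = [(i + 1) / 200 for i in range(200)]
kept = trimmed_inverse_weights(probs)
```

Dropping both tails keeps a handful of individuals with very small linkage probabilities from dominating the weighted estimates.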
5 Estimation
The estimation procedure employed can be thought of as a two-step Heckman (1979)-type procedure with
some variations at each stage, followed by the final weighted averaging of estimates to compute the unconditional trend. In brief, the procedure is as follows. First, I estimate a binary choice model for military
enlistment. The structure of the choice-based sample complicates this estimation, as does the fact that I
estimate the model semi-parametrically rather than assuming a particular functional form. After this stage
of estimation, the correction for selection on observables alone can be performed using the estimated conditional enlistment probabilities from this stage. Next, I use a flexible function of the estimated linear index
of the selection function as another regressor in a regression of height to correct for sample-selection bias
(that is, selection on unobservables); the functional form is left free rather than assumed to be of a particular
form. The selection-free estimates can then be averaged (with appropriate weights as discussed above) to
compute the unconditional selection-free trend in heights. In the following section, I discuss the remaining
details necessary to perform this estimation, and provide the estimation procedure in more detail.
5.1 Estimating Conditional Enlistment Probabilities
In principle it is possible to estimate conditional enlistment probabilities completely non-parametrically,
imposing very few assumptions beyond those imposed in section 3; however, with many covariates on which
to condition, some structure must be imposed to make estimation practical. I therefore assume that the
function g(·, ·, ·) in equation (2) is linear; that is, that
$$y_i^* = x_i'\beta + c_i'\alpha + z_i'\delta + u_i, \tag{3}$$
58 I also perform the analysis of regional differences in this way, with results available upon request. Results are similar to
those described above.
59 As a result, the sample size is reduced compared to the summary statistics described above. In particular, individuals with
missing covariates in their enlistment records must be omitted as weights cannot be computed for them.
where β, α, and δ are parameter vectors. Let G(·) denote the distribution of $-u_i$. Then the object of interest
is
$$\begin{aligned}
P(y_i = 1 \mid c_i, x_i, z_i) &= P(y_i^* > 0 \mid c_i, x_i, z_i) \\
&= P(-u_i < x_i'\beta + c_i'\alpha + z_i'\delta \mid c_i, x_i, z_i) \\
&= G(x_i'\beta + c_i'\alpha + z_i'\delta),
\end{aligned} \tag{4}$$
where the last equality follows from the independence of $u_i$ and $x_i'\beta + c_i'\alpha + z_i'\delta$. The function G(·) and the
parameters β, α, and δ can be consistently estimated according to the method of Klein and Spady (1993),
who provide a framework to estimate this model semi-parametrically, leaving the form of G(·) free and
estimating by maximum likelihood using the likelihood function (E.5) below, developed by Cosslett (1981)
and discussed in section 5.1.1. I will adopt this method, though it requires some adjustments to Klein and
Spady’s (1993) method due to the use of the choice-restricted sample; these adjustments are outlined in
Appendix E.
I allow the coefficients β and δ to differ between the two groups of birth cohorts (1832–1846 and 1847–
1860), but constrain the distribution G(·) to be the same so that the resulting linear index can be used in
the second stage.60 This is accomplished by estimating the model for both sub-samples jointly, but including
interactions of all components of $x_i$ and $z_i$ with an indicator for the 1832–1846 cohorts. I allow the coefficients
to differ because the nature of the enlistment decision likely differed considerably between the cohorts
that were eligible for Civil War service and those that were not. Thus, expression (4) can be written as
$$P(y_i = 1 \mid c_i, x_i, z_i) = G(x_i'\beta_k + c_i'\alpha + z_i'\delta_k), \tag{5}$$
where β and δ are now explicitly allowed to differ by cohort group k ∈ {1, 2}.
Once I have estimated $P(y_i = 1 \mid c_i, x_i, z_i)$ by this method, I can correct for selection on observables as
outlined in Appendix C by regressing observed height on age-of-measurement dummies and birth cohort
dummies, weighting by the estimated inverse conditional enlistment probabilities. Correcting for selection
on unobservables and observables together requires an additional step, which I discuss in section 5.2.
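The observables-only correction can be illustrated with a simplified sketch. It assumes the estimated enlistment probabilities are already available and omits the age-of-measurement dummies, in which case weighted least squares on a full set of cohort indicators reduces to a weighted mean within each cohort. The function name and data are hypothetical.

```python
from collections import defaultdict


def ipw_cohort_means(heights, cohorts, enlist_probs):
    """Weighted mean height by birth cohort, with weights 1 / P(enlist | covariates).

    This matches weighted least squares of height on cohort indicators
    alone, a simplification of the regression described in the text.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for h, c, p in zip(heights, cohorts, enlist_probs):
        w = 1.0 / p  # enlisters who were unlikely to enlist count for more
        numerator[c] += w * h
        denominator[c] += w
    return {c: numerator[c] / denominator[c] for c in numerator}


# Hypothetical enlisters: height (inches), birth year, estimated P(enlist).
means = ipw_cohort_means(
    heights=[68.0, 66.0, 67.0],
    cohorts=[1832, 1832, 1833],
    enlist_probs=[0.5, 0.25, 0.1],
)
```

In the 1832 cohort of this example, the shorter man has half the enlistment probability of the taller one and therefore twice the weight, which pulls the corrected mean below the raw mean of 67 inches.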
60 It is not necessary to allow α to differ between samples, as the samples are defined by birth cohorts. Thus, each element
of α corresponds only to one sample.
5.1.1 The Likelihood of the Sample
The choice-restricted sample described in section 4 makes standard binary choice likelihoods invalid for two
reasons. First, because the data on military enlisters and the general population are sampled separately, the
proportion of military enlisters in the sample is not equal to (nor will it converge in a large sample to) the
proportion of military enlisters in the population. The data are thus a choice-based sample (Cosslett, 1981;
Manski and Lerman, 1977; Manski and McFadden, 1981). Second, while it is known that individuals in the
choice-restricted military samples did enlist in the military, the military enlistment status of the members
of the supplementary census samples is unknown. Thus, in essence, the variable yi is equal to one for all
members of the choice-restricted samples, but missing for all members of the supplementary sample—the
observed yi has no variance (Steinberg and Cardell, 1992). Cosslett (1981) provides a maximum-likelihood
estimator for such samples, which is discussed in Appendix E.
5.2 Accounting for Selection on Unobservables
In order to perform the corrections of equations (C.8) and (C.9) for selection on unobservables, I must also
account for the fact that $E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1) \neq 0$. Under assumption (3), equation (C.8) can be written as
$$\begin{aligned}
E(h_i \mid c_i, x_i, z_i, y_i = 1) &= E(h_i \mid c_i, x_i) + E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1) \\
&= E(h_i \mid c_i, x_i) + E(\varepsilon_i \mid c_i, x_i, z_i, u_i > -x_i'\beta_k - c_i'\alpha - z_i'\delta_k) \\
&= E(h_i \mid c_i, x_i) + \Omega(x_i'\beta_k + c_i'\alpha + z_i'\delta_k),
\end{aligned} \tag{6}$$
where Ω(·) is an unknown function that captures the degree of sample-selection bias for an individual,
conditional on his observables. For tractability, I also impose some structure on equation (6). In particular,
I assume that it can be written as
$$E(h_i \mid c_i, x_i, z_i, y_i = 1) = \mu + c_i'\gamma + x_i'\theta + \Omega(x_i'\beta_k + c_i'\alpha + z_i'\delta_k), \tag{7}$$
where µ is an intercept. I also include a vector of measurement age indicators in equation (7), yielding the
final second stage estimation equation (for only the military enlistment sample)
$$h_i = c_i'\gamma + x_i'\theta + m_i'\pi + \Omega(x_i'\beta_k + c_i'\alpha + z_i'\delta_k) + \xi_i, \tag{8}$$
where $m_i$ are age-of-measurement indicators that normalize the age of measurement to 21, and $\xi_i$ is an error
term that is orthogonal to $u_i$. Define $\hat h_i = h_i - \hat\Omega(x_i'\hat\beta_k + c_i'\hat\alpha + z_i'\hat\delta_k)$. The coefficients θ and π are assumed
to be time-invariant.
Expression (8) helps to illustrate the formal source of the sample-selection bias. If the sample-selection
problem is disregarded, then $\Omega(x_i'\beta_k + c_i'\alpha + z_i'\delta_k)$ is an omitted variable. The fact that it includes $x_i$ and $c_i$
makes it correlated with the regressors, leading to omitted-variable bias. Inclusion of the function Ω(·) in the
estimation of equation (8) corrects for this bias. This expression also helps to demonstrate the importance
of the exclusion restrictions zi for identification. Suppose that β and δ are not permitted to vary across
cohort groups and that δ = 0; that is, suppose that all of the variables entering into Ω(·) were also included
in expression (8) as determinants of height. In the standard parametric sample-selection model, where εi
and ui are assumed to be jointly normally distributed, Ω(·) is the inverse Mills ratio; its nonlinearity ensures
that there is no collinearity problem between $c_i$ and $x_i$ on the one hand, and $\Omega(x_i'\beta + c_i'\alpha)$ on the other,
which, if present, would preclude estimation of equation (8). In the present setting, however, I prefer to
avoid the joint normality assumption, implying no particular form for Ω(·). The fact that Ω(·) is permitted
to potentially be linear reintroduces the possibility of collinearity. The two features of the model discussed
above—allowing β and δ to vary by cohort group and including the voting variables zi —contribute to solving
this problem by ensuring that the linear index in Ω(·) is not a linear combination of ci and xi alone. All
that is required is that either δ ≠ 0 or that some component of $\beta_k$ differ between cohort groups.61
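The collinearity argument can be made concrete with a short worked case. Suppose, purely for illustration, that the selection function happens to be linear, $\Omega(v) = a + bv$; the constants $a$ and $b$ are notation introduced here and do not appear in the paper. With $\delta = 0$ and a common $\beta$, equation (8) becomes

```latex
\begin{align*}
h_i &= c_i'\gamma + x_i'\theta + m_i'\pi + a + b\,(x_i'\beta + c_i'\alpha) + \xi_i \\
    &= a + c_i'(\gamma + b\alpha) + x_i'(\theta + b\beta) + m_i'\pi + \xi_i ,
\end{align*}
```

so only the sums $\gamma + b\alpha$ and $\theta + b\beta$ are identified, and the selection-free cohort effects $\gamma$ cannot be separated from the selection term. With an excluded variable and $\delta \neq 0$, the same linear case gives

```latex
\begin{align*}
h_i = a + c_i'(\gamma + b\alpha) + x_i'(\theta + b\beta) + b\,z_i'\delta + m_i'\pi + \xi_i .
\end{align*}
```

Because $\delta$ and $\alpha$ are estimated in the first stage and $z_i$ does not otherwise enter the height equation, the coefficient on $z_i$ pins down $b$, from which $\gamma$ and $\theta$ can be recovered.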
Newey (2009) provides a method to estimate such models. He suggests estimating a sample-selection
model by first using Klein and Spady’s (1993) method to estimate a binary choice model and then using
a spline to flexibly estimate the selection bias in the second stage.62 Estimation is based on this method,
61 Strictly speaking, the interaction of $x_i$ with the Union Army indicator is alone sufficient for identification of the binary
choice model, as a similar interaction is not performed in the second stage. This is, of course, a very weak source of identification
as it hinges on the functional form of the enlistment decision. As a result, the additional exclusion restrictions are also exploited.
When a parametric form is assumed and functional-form-based identification methods are used, results are similar. I have also
replicated the results using the enlister’s school attendance as a third exclusion restriction; this has the advantage of providing
individual-level variation rather than only county-level variation. This variable should clearly affect military enlistment. There
is also likely to be a correlation between school enrollment and height, but much of the literature on height (Case and Paxson,
2008; Case, Paxson, and Islam, 2009) suggests that causality runs from height (or, more accurately, the cognitive ability with which it
is correlated) to schooling. Results are very similar in this case. Results are also similar when the coefficients θ are allowed to
vary by cohort group, leading to identification based only on the exclusion restrictions zi .
62 A number of alternative methods exist for estimating sample-selection models. Cosslett (1991) proposes a method similar
to Newey’s (2009), approximating the unknown function Ω(·) by a series of indicator variables. Ahn and Powell (1993) and
Powell (2001) provide a method based on the observation that individuals with a similar value of the linear index of the selection
equation should have similar selection bias. Ichimura and Lee (1991) provide a one-step semi-parametric nonlinear least squares
estimator. The motivation in many of these papers for estimation of the second stage is often based on Robinson’s (1988)
method, which is used by Lee and Vella (2006) and Li and Wooldridge (2002) in estimation of a semi-parametric type-III Tobit
model (in which the variable determining selection is truncated rather than observed only in sign as in this case). I have also
implemented Procedure 1 using Robinson’s (1988) method instead of Newey’s (2009) method and all results are nearly identical
to those presented in the paper. Das, Newey, and Vella (2003) provide a nonparametric series estimator for sample-selection
models, but the high dimensionality of the present problem precludes its use in this context.
incorporating the necessary adjustments to Klein and Spady’s (1993) estimator for the choice-restricted
samples, and making my adjustment for selection on observables in computing the unconditional trend.
Procedure 1. The procedure is the following. Estimation of standard errors at each stage is discussed in
Appendix F.
1. The binary choice model of equation (5) is estimated using my adapted Klein and Spady (1993)
estimator, discussed in Appendix E.
2. Using Newey’s (2009) method, I estimate equation (8), leaving the form of Ω(·) free and approximating
it by a spline. This estimation is weighted to account for the separate sampling of the two groups of
birth cohorts. This yields an estimate of $E(h_i \mid c_i, x_i)$, namely $c_i'\hat\gamma + x_i'\hat\theta$ (excluding the age-of-measurement
indicators, so that heights are normalized to age 21). Importantly, the $\hat\gamma$ represent only the selection-corrected conditional trends in height; as the object of interest is the unconditional trend, further
analysis is necessary.
3. Leaving the form of Ω(·) free in equation (8) captures µ, preventing it from being separately estimated
in step 2. I therefore estimate µ after step 2 using Andrews and Schafgans’s (1998) method.63 The
estimate of the intercept is denoted µ̂, and is defined as
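$$\hat\mu = \frac{\sum_{i=1}^{N} \Gamma(x_i'\hat\beta_k + c_i'\hat\alpha + z_i'\hat\delta_k)\,(h_i - c_i'\hat\gamma - x_i'\hat\theta - m_i'\hat\pi)}{\sum_{i=1}^{N} \Gamma(x_i'\hat\beta_k + c_i'\hat\alpha + z_i'\hat\delta_k)}, \tag{9}$$
where Γ(·) is a weighting function that gives weight only to the upper tail of $x_i'\hat\beta_k + c_i'\hat\alpha + z_i'\hat\delta_k$.64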
4. I estimate equation (C.10) by computing
$$\widehat{E}(h_i \mid c_i) = \frac{\hat{k}_{c_i}}{N_{c_i}} \sum_{i \in c_i} \frac{\hat\mu + c_i'\hat\gamma + x_i'\hat\theta + \hat\xi_i}{\widehat{P}(y_i = 1 \mid c_i, x_i, z_i^*)}, \tag{10}$$
where $\hat\xi_i$ are the residuals from the regression of step 2 (and are thus purged of the function Ω(·)). This
is accomplished by a regression of the predicted selection- and measurement age-corrected height,
$\hat h_i = \hat\mu + c_i'\hat\gamma + x_i'\hat\theta + \hat\xi_i$, for each member of the choice-restricted sample, on birth-cohort indicators,
weighting by inverse enlistment probability.
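Step 3 can be sketched in a few lines, using the weighting function described in footnote 64 (zero below the 90th percentile of the estimated index, one above the 92.5th, linear in between). The function names and the simple interpolating percentile rule are assumptions of this sketch rather than details from the paper, and `residual_height` stands for $h_i - c_i'\hat\gamma - x_i'\hat\theta - m_i'\hat\pi$.

```python
def tail_weight(v, lower, upper):
    """Weighting function: 0 at or below `lower`, 1 at or above `upper`, linear between."""
    if v <= lower:
        return 0.0
    if v >= upper:
        return 1.0
    return (v - lower) / (upper - lower)


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100] (an assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def intercept_estimate(index, residual_height):
    """Weighted mean of residual heights over the upper tail of the index, as in (9)."""
    lower = percentile(index, 90.0)
    upper = percentile(index, 92.5)
    w = [tail_weight(v, lower, upper) for v in index]
    return sum(wi * r for wi, r in zip(w, residual_height)) / sum(w)


# Hypothetical data: index values 0..200, with residual heights that are
# larger in the upper tail, where selection bias is presumed negligible.
index = [float(v) for v in range(201)]
resid = [1.0 if v < 185 else 3.0 for v in index]
mu_hat = intercept_estimate(index, resid)
```

The rationale for using only the upper tail is that individuals with very high estimated enlistment propensities are nearly certain to enlist, so their selection bias should be close to zero and their mean residual height approximates the intercept.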
63 A simpler version of this estimator, with less desirable asymptotic properties, is given by Heckman (1990).
64 In particular, the weight is zero below the 90th percentile of $x_i'\hat\beta_k + c_i'\hat\alpha + z_i'\hat\delta_k$, one above the 92.5th percentile, and linearly
increasing in between. As a result, the estimate of µ is based only on a small fraction (10 percent) of the data. Due to the fact
that this estimate may be imprecise, and because section 4.3 shows that it may be contaminated by selection into census linking,
I do not place much stock in it. Nonetheless, it is still possible to compare the estimated trends, as these are independent of µ̂.
It is instructive to consider how the model admits changing sample selection over birth cohorts. In this
model, the selection function Ω(·) is a time-invariant function of the single index $x_i'\beta_k + c_i'\alpha + z_i'\delta_k$. Thus
there are essentially three mechanisms by which selection can change over cohorts. The first is changes in
the distribution of xi and zi over cohorts. The second is in the inclusion of the cohort indicators ci , which
in principle allow the selection to vary across cohorts. The last is any variation in β and δ across cohort
groups. In all cases, the probability of enlistment must vary in order to generate variations in selection.
5.3 Constructing Weights
The only remaining piece of information necessary to implement the maximum likelihood estimator is $Q_1$, the
fraction of the population joining the military.65 This weight is computed from external data separately for
each subsample. The estimates are presented in Table 6 and the details of their computation are discussed
in Appendix G. In Table 6 and throughout the paper I use the shorthand “UA,” “RA(a),” and “RA(b)”
to refer to the Union Army, Regular Army (1832–1846 cohorts), and Regular Army (1847–1860 cohorts),
respectively. I refer to estimation that uses the Union Army sample to represent the 1832–1846 cohorts
(and the Regular Army to represent the 1847–1860 cohorts) as “UA & RA(b),” and estimation that uses the
Regular Army data to represent both cohort groups as “RA.”
5.4 Truncation
The Union Army and Regular Army were subject to minimum height requirements. The correction for
selection on unobservables also corrects for truncation, and thus no further correction is necessary in the
trends corrected for selection on unobservables.66 However, the raw trends and the trends corrected only
for selection on observables require correction. To this end, I estimate the raw and observables-corrected
trends by maximum likelihood (A’Hearn, 1998).67 For all three samples, I set the truncation point to be
64 inches. For the Union Army, this truncation point follows Komlos (1998a);68 for both Regular Army
65 I have experimented with a variety of methods of estimating $Q_1$. The results are not meaningfully changed when $Q_1$ is
varied.
66 In this case the binary choice model for military enlistment must be thought of as representing the compound event in
which an individual both meets the minimum height requirement and chooses to join the military. Bodenhorn, Guinnane, and
Mroz (2015b) make a similar assumption.
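67 The log-likelihood function is
$$\log(L) = \sum_{i=1}^{N} \left[ -\log(\sigma) + \log \phi\!\left(\frac{h_i - c_i'\gamma - m_i'\pi}{\sigma}\right) - \log\!\left(1 - \Phi\!\left(\frac{t - c_i'\gamma - m_i'\pi}{\sigma}\right)\right) \right], \tag{11}$$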
where t is the truncation point. Other approaches to correcting for truncation are proposed by A’Hearn (2004), Jacobs, Katzur,
and Tassenaar (2008), Komlos and Kim (1990), and Wachter and Trussell (1982).
68 A’Hearn (1998) uses a cutoff of 64.5 inches.
samples, this was the official minimum height requirement (Coffman, 1986, p. 332), and is corroborated by
a visual inspection of Figure 3.
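The truncated-normal log-likelihood of footnote 67 can be sketched with the standard library alone. Here `means` stands for the fitted values $c_i'\gamma + m_i'\pi$, and the function names are illustrative; a full estimator would maximize this function over γ, π, and σ, which the sketch does not attempt.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_loglik(heights, means, sigma, t):
    """Log-likelihood of heights that are observed only when at least t.

    Each term is -log(sigma) + log(phi((h - mean) / sigma))
                 - log(1 - Phi((t - mean) / sigma)).
    """
    total = 0.0
    for h, mu in zip(heights, means):
        total += (
            -math.log(sigma)
            + math.log(normal_pdf((h - mu) / sigma))
            - math.log(1.0 - normal_cdf((t - mu) / sigma))
        )
    return total


# One hypothetical recruit of 66 inches with a fitted mean of 66 and sigma of 1.
no_truncation = truncated_loglik([66.0], [66.0], 1.0, t=26.0)  # t far below the mean
at_the_mean = truncated_loglik([66.0], [66.0], 1.0, t=66.0)    # half the mass is cut off
```

The last term rescales the density by the probability of clearing the height requirement, so the contribution of an observation grows as the truncation point approaches the mean.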
6 The Antebellum Puzzle: Fact or Artifact?
6.1 Selection into Military Service
The results of estimation of the binary choice model in equation (5) for lifetime military enlistment are
presented in Table 7.69 Rather than presenting the estimates of α, β, and δ, which are not inherently meaningful, I present average semi-elasticities—the sample average of the derivative of log(Enlistment Probability)
with respect to each covariate.70 Column (1) presents the results for the sample representing the 1832–1846
cohorts with the Union Army enlisters, and column (2) presents the results for the sample representing
these cohorts with Regular Army data. In each column (denoted by a number), sub-column (a) presents the
semi-elasticity for the 1832–1846 cohorts and sub-column (b) presents the semi-elasticity for the 1847–1860
cohorts.
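For intuition, an average semi-elasticity in a single-index model such as equation (5) can be sketched as follows for a continuous covariate $x_{ij}$ with coefficient $\beta_{kj}$. The density $g = G'$, the index shorthand $v_i$, and the symbol $\overline{\mathrm{SE}}_j$ are notation introduced here; the paper's own computation is described in section E.4 of Appendix E and may differ in its details, for example in the treatment of binary covariates or of weights.

```latex
\[
v_i = x_i'\beta_k + c_i'\alpha + z_i'\delta_k, \qquad
\frac{\partial \log G(v_i)}{\partial x_{ij}} = \frac{g(v_i)}{G(v_i)}\,\beta_{kj}, \qquad
\overline{\mathrm{SE}}_j = \frac{1}{N}\sum_{i=1}^{N} \frac{g(v_i)}{G(v_i)}\,\beta_{kj}.
\]
```

Multiplying such a semi-elasticity by a given change in the covariate yields the approximate proportional change in the enlistment probability, which is how the one-standard-deviation effects discussed in this section can be read.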
The semi-elasticities of the two vote share variables indicate that they have economically significant
effects of the expected sign on the enlistment decision. For example, a one standard-deviation increase in
the vote share for Douglas (approximately 18 percentage points) is associated with an approximately three
to 14 percent increase in the probability of enlistment in specification (1); the vote share for Buchanan
also has a meaningful effect in specification (1), with a one-standard deviation increase (approximately 14
percentage points) associated with a roughly four to 13 percent decline in the probability of enlistment. The
effects in column (2) are, for the most part, slightly smaller but still of meaningful magnitude, with a one-standard-deviation increase in either vote share associated with a three to nine percent change in enlistment
probability. These magnitudes speak to one aspect of the identifying power of the vote share variables by
showing that they were relevant to determining the enlistment decision.
Several other covariates also show interesting results. Urban residence, for instance, is associated with a
five to 19 percent greater probability of enlistment in all three of the Regular Army samples—consistent with
contemporary reports (Foner, 1970; Weigley, 1967)—and with a four percent lower probability of enlistment
in the Union Army. Similarly, relation to the head of household displays a positive effect on enlistment in the
Union Army and a negative effect on the probability of enlistment in the Regular Army, regardless of cohort
69 This table presents the results for the benchmark specifications only. Results of other specifications are available upon
request.
70 Computation of the semi-elasticities is discussed in section E.4 of Appendix E.
group, again consistent with contemporary reports of the Regular Army as being composed of the relatively
poor. Regardless of the sample, school attendance is associated with a decrease in enlistment probability.
Moreover, consistent with contemporary reports and recruiting practices, Northeasterners were more likely
than residents of other regions to enlist in the Regular Army.
There is also evidence to support identification on the basis of changes in the effects of covariates on the
probability of enlistment between cohort groups, though this is much more the case in the Union Army-based specification in column (1). For example, the opposing effects of urban residence on the probability of
enlistment in column (1) show evidence of differences in the effects of covariates on enlistment probability
between the two sub-samples.
6.2 Selection-Corrected Height Regressions
The next step in estimation is to estimate equations (8) and (9), the second-stage selection-corrected height
regression and estimate of the intercept of that equation. The results of this estimation are presented in
Table 8, with the Constant row presenting the estimate µ̂.71 Column (1) of this table is comparable to
column (1) of Table 3 (in that it includes the same variables but is corrected for sample-selection bias), while
column (2) is comparable to column (3) of Table 3.72 The results of the selection-corrected regressions are
similar to those of the uncorrected regressions of Table 3, though there are some exceptions. For example,
the conditional urban height penalty is larger after the selection correction. The Northeast’s conditional
height disadvantage relative to the South decreases after the correction.73
6.3 Corrected Trends in Height
I present the results of the complete correction for selection on observables and unobservables in two sets
of figures. The estimated selection bias function Ω(·) for each specification, averaged for each birth cohort
(weighting by inverse enlistment probability) and smoothed, is presented in Figure 8. One particular feature
of these graphs is relevant to the argument being tested. There is a clear pattern of change in the value of
the selection bias over birth cohorts in both panels of Figure 8. When the Union Army is used to represent
the 1832–1846 cohorts, a clear difference in the level of selection exists between the two cohort groups, with
71 Only a selection of variables are presented in Table 8; the full results are presented in Table A.4.
72 The sample sizes differ between the two tables because the truncated regressions reported in Table 3 require the omission
of individuals below the minimum height requirement. If the minimum height requirements were stringently enforced (that is,
if nobody below 64 inches were permitted to enlist) then no exclusion would be necessary and the sample would be the same.
Table 8 also weights to correct for selection into linkage on the basis of observable characteristics.
73 The unconditional urban height penalty and unconditional regional differences are discussed in section 6.5.
the Regular Army subsample exhibiting much stronger negative selection than the Union Army.74 Moreover,
within the Union Army, the degree of selection becomes less negative over time. When the Regular Army
is used to represent the complete span of cohorts, more negative sample-selection bias is evident over later
birth cohorts, though no abrupt change is evident in the transition between cohort groups and the change in
selection over cohorts is small. If the antebellum puzzle were indeed an artifact of sample-selection bias—that
is, if the decline in average stature were driven by changing sample-selection bias—the function Ω(·), which
captures the degree of sample-selection bias, would have to decrease over time in order to generate a decline
in observed heights despite the absence of such a decline in the population. Thus, the estimates of Ω(·)
provide mixed evidence on the antebellum puzzle. While declines are evident in Ω(·) over the whole period,
the Union Army shows increases, and the magnitudes do not appear to be sufficiently large to overturn the
observed decline.
Figure 9 presents the observables- and unobservables-corrected trends in average height by birth cohort.
This Figure includes both the estimated average heights for each cohort and a kernel-smoothed version
of these trends,75 for both the observables-corrected and the unobservables-corrected trends;76 the raw
trends (corrected only for minimum height requirements and age of measurement) are also included in their
smoothed versions for reference.77 The observables-corrected trends represent the state of the art of the
historical heights literature, which disregards the possibility of selection into observation on the basis of
unobservables. The unobservables-corrected trends are the contribution of the present paper, incorporating
the correction for selection on unobservables. Two key insights can be drawn from this Figure. First, it shows
that correcting for sample-selection bias does not eliminate the antebellum puzzle. In the specification that
74 Though the level of the trend as a whole should be interpreted with great caution, much more stock can be placed in the
levels of the two portions of the trend relative to one another.
75 A table presenting the estimated cohort averages themselves is available on request.
76 There is an inconsistency between the estimates of Ω(·) in Figure 8 and the trends in Figure 9. In particular, it is not
the case that the trend corrected for selection on observables only is the sum of the trend corrected for both observables and
unobservables and the estimate of Ω(·), as the analysis of section 3 shows should be the case. This discrepancy is due to the
correction for truncation. When no such correction is made, the difference between the observables-corrected and unobservables-corrected trend is indeed given by Ω(·). However, the correction for truncation pushes the observables-corrected trend down,
and does so even more in the 1847–1860 cohorts than in the 1832–1846 cohorts, thus expanding the difference between the
observables-corrected and unobservables-corrected trends. This should not be considered a deficiency of the estimation. First,
the goal of the observables-corrected trend is to represent the state of the art in the historical heights literature, which performs
the truncation correction. Differences from this trend indicate that failing to correct for selection leads to possibly erroneous
correlations. Second, as argued above, the selection correction method should also correct for truncation; the greater difference
between the unobservables-corrected and observables-corrected trends thus represents the change in trend due to selection alone,
while the estimates of Ω(·) are net of the correction for truncation, which tends to oppose the direction of the selection correction.
Moreover, by modeling the compound event of meeting the minimum height requirement and choosing to enlist, the current
approach may even be better as it can incorporate cases in which individuals join despite failing to meet the minimum height
requirement (whereas the truncation method requires dropping them). I am still investigating the interaction of minimum
height requirements and selection based on the desire to join the military in determining the magnitude of the correction.
77 Results of additional specifications are also presented. Figure B.4 presents results incorporating interactions of the covariates $x_i$ in the second-stage equation so that identification is only from the exclusion restrictions $z_i$. Results are similar to those
of Figure 9, except for the intercept, and slight evidence of a reversal of the decline in the 1850s in the Regular Army sample.
represents the 1832–1846 cohorts with the Union Army data (Figure 9a), a decline of approximately 0.95
inches is present in the smoothed trends corrected for selection on unobservables between 1832 and 1846; the
net decline between 1832 and 1860 is estimated to be approximately 0.69 inches. In the specifications that
use only Regular Army data for all cohorts (Figure 9b), a decline of approximately 0.90 inches is present
between 1832 and 1846, and the net decline between 1832 and 1860 is estimated to be approximately 0.56
inches. Confidence intervals for the smoothed declines are given in Figure 10, and show that it is possible
to reject the null of no net decline and of no decline at all. Thus, I conclude that the decline of heights of
the antebellum puzzle is not an artifact of sample-selection bias. It should be noted that even if I had found
no evidence of a decline in average heights, that would not constitute sufficient evidence to conclude that
the antebellum puzzle was resolved. It would still be necessary to explain why stature did not increase in
the presence of rapid economic growth. To resolve the puzzle based on selection alone, the corrected trends
would have to show an increase in average stature.
The second key insight evident in Figure 9 is that although the corrected trends continue to display
a decline in stature, the decline is smaller in the trends corrected for selection on both observable and
unobservable characteristics than it is in the trends corrected only for selection on observables. In particular,
the sample consisting of both Union Army and Regular Army enlisters shows a decline of about 1.40 inches
from 1832 to 1846 in the smoothed trend, and a net decline of about 1.25 inches to 1860; the Regular
Army-only sample shows a decline of approximately 1.22 inches from 1832 to 1846, and a net decline of 0.82
inches from 1832 to 1860. These declines are all larger than those discussed above from the fully corrected
trends. Moreover, it is possible to reject the null of equality between the (unsmoothed) corrected
and uncorrected trends in each sample (UA∪RA(b): $\chi^2 = 541.07$, $p < 0.01$; RA: $\chi^2 = 54.66$, $p < 0.01$).
Thus, although the view that the antebellum puzzle is an artifact of sample-selection bias is not supported,
the general argument, that failing to properly account for sample-selection bias may lead to biased estimates
of the trends in height over birth cohorts, is supported. Making this correction leads to both statistically
and economically significant changes in the observed trends.
Another puzzle is resolved by these results. Although in principle they represent the same populations—
native-born whites in the 1832–1860 birth cohorts—the Union Army-Regular Army combination sample
and the Regular Army only sample show declines in heights of different magnitudes when not corrected for
selection on unobservables. The difference in these magnitudes is approximately 0.4 inches. After correction,
however, the two samples are much more similar to one another, with the magnitude of the decline between
them differing by only about 0.1 inches. This convergence stems from the fact that the correction for selection
on unobservables has a much larger impact on the decline in heights in the Union Army-based sample than
in the Regular Army-based sample, where it leads to only a small change in the magnitude of the decline.
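The convergence described above can be checked directly from the smoothed declines quoted in the text. The following Python sketch is only bookkeeping on those reported figures; the dictionary layout, the sample labels, and the function names are illustrative assumptions and are not part of the estimation itself.

```python
# Smoothed declines in inches as reported in the text:
# (decline over 1832-1846, net decline over 1832-1860).
# Keys and labels are illustrative, not the paper's notation.
DECLINES = {
    ("UA+RA", "observables"): (1.40, 1.25),
    ("UA+RA", "both"): (0.95, 0.69),
    ("RA", "observables"): (1.22, 0.82),
    ("RA", "both"): (0.90, 0.56),
}


def net_decline_gap(correction):
    """Difference between the two samples' net declines over 1832-1860."""
    combined = DECLINES[("UA+RA", correction)][1]
    regular = DECLINES[("RA", correction)][1]
    return round(combined - regular, 2)


def correction_effect(sample):
    """Reduction in the net decline from also correcting for unobservables."""
    before = DECLINES[(sample, "observables")][1]
    after = DECLINES[(sample, "both")][1]
    return round(before - after, 2)


gap_before = net_decline_gap("observables")  # text: approximately 0.4
gap_after = net_decline_gap("both")          # text: about 0.1
effect_combined = correction_effect("UA+RA")
effect_regular = correction_effect("RA")
```

With the rounded figures, the gap between the two samples is 0.43 inches before the correction for unobservables and 0.13 inches after it, in line with the approximate values of 0.4 and 0.1 quoted above. The correction lowers the net decline by 0.56 inches in the combined sample and by 0.26 inches in the Regular Army-only sample.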
6.4 Interpretation
In order to better understand the mechanisms driving the results, I also estimate piecewise linear versions
of the trends discussed above, presenting graphical results in Figure 11 and numerical results in Table 9. In
particular, I estimate the specification
$$\hat{h}_i + \xi_i = \kappa_0 + \kappa_1 c_i + \kappa_2 p_i + \kappa_3 c_i p_i + \zeta_i, \tag{12}$$
where $c_i$ is the individual’s birth year (not a series of indicators) and $p_i$ is an indicator for being born between
1847 and 1860; thus, this is a piecewise linear version of the trend, allowing the trend to change after 1846.
By taking focus away from the cohort-to-cohort fluctuations and summarizing the trends in heights in four
parameters, this estimation helps to understand the mechanisms driving the change of the estimated trend
when correcting for sample-selection bias. It also helps to ensure that the statistical significance of results is
not driven by only a small number of cohorts.78 For each sample, I estimate specification (12) for the raw
trends (correcting for truncation and measurement age only), correcting for selection on observables, and
correcting for selection on both observables and unobservables. I also perform tests of the null hypotheses
that each individual component of this trend is the same after correction, and of the joint null hypothesis that all components
(except the intercept $\kappa_0$) are the same after correction.
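A minimal Python sketch of the piecewise linear trend in specification (12) follows. It relies on the fact that a specification fully interacted with the post-1846 indicator gives the same OLS fit as one separate line per cohort group. The function names, the use of cohort-level means rather than individual observations, and the noise-free toy data are assumptions made for illustration; standard errors and the equality tests are not shown.

```python
def fit_line(xs, ys):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def piecewise_trend(birth_years, heights, last_early_year=1846):
    """Return (k0, k1, k2, k3) in h = k0 + k1*c + k2*p + k3*c*p,
    where p = 1 for cohorts born after last_early_year.

    Because the model is fully interacted with p, OLS on (1, c, p, c*p)
    is equivalent to fitting one separate line per cohort group.
    """
    early = [(c, h) for c, h in zip(birth_years, heights) if c <= last_early_year]
    late = [(c, h) for c, h in zip(birth_years, heights) if c > last_early_year]
    a_early, b_early = fit_line(*zip(*early))
    a_late, b_late = fit_line(*zip(*late))
    return a_early, b_early, a_late - a_early, b_late - b_early


# Hypothetical parameters and noise-free cohort heights, for illustration only.
TRUE_PARAMS = (178.62, -0.06, -147.66, 0.08)
years = list(range(1832, 1861))
heights = []
for c in years:
    p = 1 if c > 1846 else 0
    k0, k1, k2, k3 = TRUE_PARAMS
    heights.append(k0 + k1 * c + k2 * p + k3 * c * p)

estimates = piecewise_trend(years, heights)
```

In this parameterization the slope for the 1832–1846 cohorts is the second element of the result, and the slope for the 1847–1860 cohorts is the sum of the second and fourth elements.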
Table 9 shows that for both samples, the null of equality of trends measured in this way can be rejected.
For the sample combining the Union and Regular Army enlisters, column (1) shows that I fail to reject the
null hypothesis that the trend within the 1832–1846 cohorts is the same after correction (that is, that $\kappa_1$ is
not statistically significantly changed by the correction; $\chi^2 = 1.80$, $p = 0.18$). It is also not possible to reject
the null that the trend in the 1847–1860 cohorts ($\kappa_1 + \kappa_3$) is the same after correction at any but a marginal
level of significance ($\chi^2 = 3.36$, $p = 0.07$). I can reject the null hypothesis that $\kappa_2$, the difference in levels
between the two portions of the sample, is unchanged, at the one-percent level ($\chi^2 = 57.58$, $p < 0.01$); the
difference is also large at over one inch. Thus, I conclude that the difference in trends in the Union Army-Regular Army combination sample is not driven by changes over time within armies in selection. Instead it
78 For example, it could be the case that the corrected and uncorrected trends do not differ drastically for all but one birth
cohort. If that difference is sufficiently large, the null of equality of trends would be rejected, but it would not be clear that
a meaningful difference had been discovered. The piecewise linear approach shifts the focus from such small fluctuations
to larger trends.
is the product of distinctly different levels of sample-selection bias between the two armies, and thus in the
two portions of the sample. The confounding effects of sample-selection bias thus arise when the two very
differently selected samples are placed side by side and used to construct a trend. When the circumstances
surrounding enlistment in each sample group are considered, this result is not surprising. The Union Army
was a citizens’ army, comprising nearly half of the eligible population (see Table 6). The Regular Army, on
the other hand, was a professional army, drawing only a small fraction of individuals, often from the bottom
rungs of society (Foner, 1970; Weigley, 1967). That the transition from one group to the next would distort
the true trend in height to show a greater decline is thus unsurprising.
The Regular Army-only sample exhibits a different pattern. As shown in column (2) of Table 9, it is
possible to reject the null hypothesis that $\kappa_1$, the slope for the 1832–1846 cohorts, is unchanged as a result of
the correction ($\chi^2 = 5.00$, $p = 0.03$), indicating that the trend within the 1832–1846 cohorts is changed as
a result of the correction. The magnitude of the change is meaningful, at about 0.025 inches per year. The
difference in the trend in the 1847–1860 cohorts ($\kappa_1 + \kappa_3$) is also statistically significant ($\chi^2 = 6.37$, $p = 0.01$).
It is also possible to reject the null hypothesis that the difference in levels between the two cohort groups
($\kappa_2$) is unchanged, but only at a marginal level of significance ($\chi^2 = 2.73$, $p = 0.10$), with the
size of the change small. As with the patterns in the previous sample, the absence of a tremendous change in
the nature of enlistment within the Regular Army explains the lack of an abrupt change in selection between
the two samples, either before or after the correction for selection on unobservables.
How do the selection-corrected estimates compare to existing estimates in the literature? The most
sophisticated existing estimates of the decline in heights within the Union Army sample from approximately
1830 to 1846 are provided by A’Hearn (1998, Table 4, p. 262), who places the decline at approximately 0.6
inches.79 The most recent estimates of the decline in heights from about 1830 to about 1860 are provided by
Craig (Forthcoming), based on the estimates of A’Hearn (1998) and Zehetmayer (2011). He produces a figure
that shows a decline of roughly 1.25 inches over the period. My observables-corrected trends are comparable
to these estimates when computed using Union Army data. My estimated observables-corrected decline
over the range 1832–1860 is 1.25 inches, while the estimated decline over the 1832–1846 period is about
1.4 inches. The net decline over the period is quite similar to the estimates of Craig (Forthcoming). The
decline over 1832–1846 is larger than A’Hearn’s (1998) estimate. However, this need not indicate that my
observables-corrected trends are not comparable to existing estimates. In particular, the fact that A’Hearn
79 A’Hearn’s (1998) innovation was to correct for minimum height requirements by using a truncated regression. He estimates
a decline of about 0.2 inches from the 1830–1834 cohorts to the 1835–1839 cohorts, and a decline of about 0.4 inches from the
1835–1839 cohorts to the 1840–1847 cohorts.
(1998) computes the decline using three cohort-group averages implies that the magnitude of his decline
should be biased downwards relative to mine. Indeed, when I compute the decline in a manner analogous
to A’Hearn (1998), my observables-corrected decline is very similar to his. The observables-corrected means
are 68.69 inches for the 1832–1834 cohorts, 68.46 inches for the 1835–1839 cohorts, and 68.24 inches for
the 1840–1846 cohorts. The decline of 0.23 inches between the first two groups is very similar to A’Hearn’s
(1998) 0.2 inches. The decline of 0.23 inches between the second two groups is also not very different from
A’Hearn’s (1998) 0.40 inches. Thus, I conclude that my observables-corrected trend is roughly comparable
to benchmark estimates in the historical heights literature.
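As an arithmetic check, the cohort-group declines implied by the rounded means reported above can be written out as follows. The second difference computed from the rounded means is 0.22 rather than 0.23 inches; this small discrepancy presumably reflects rounding of the underlying means, which cannot be verified from the figures shown here.

```latex
\begin{align*}
68.69 - 68.46 &= 0.23 \text{ inches (1832--1834 to 1835--1839)},\\
68.46 - 68.24 &= 0.22 \text{ inches (1835--1839 to 1840--1846)},\\
68.69 - 68.24 &= 0.45 \text{ inches (1832--1834 to 1840--1846)}.
\end{align*}
```

The corresponding total from the two A’Hearn (1998) steps quoted above is $0.2 + 0.4 = 0.6$ inches.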
My finding that the selection-corrected trends (for both observables and unobservables) differ considerably
from the observables-corrected trends, can thus be interpreted as showing that correction for sample-selection
bias leads to differences from the existing benchmark estimates in the literature. Thus, I conclude that the
differences between my observables-corrected trends and unobservables-corrected trends constitute evidence
that existing estimates of the decline in average stature in the literature are biased upwards as a result of
cohort-variant sample-selection bias.
6.5 Cross-Sectional Puzzles
In addition to the temporal antebellum puzzle, a number of studies (e.g., Komlos, 1989; Mokyr and Ó
Gráda, 1996; Zehetmayer, 2011) have found evidence of a cross-sectional antebellum (or industrialization)
puzzle, in which residents of poorer areas are found to be paradoxically taller. In the American case, the
paradoxical cross-sectional relationship in heights manifests through the ranking of the various regions of
the United States in terms of income and stature. In particular, despite being the poorest region of the
United States, the South is tallest in the selected samples; similarly, the Northeast, despite being the richest,
appears shortest.80 However, just as sample-selection bias that differs over birth cohorts can theoretically
lead to temporal patterns in heights that contradict traditional indicators, sample-selection bias that varies
cross-sectionally can theoretically lead to cross-sectional patterns in heights such as those in the United
States.
To determine whether sample-selection bias is responsible for these cross-sectional paradoxes, I repeat
Procedure 1. Instead of averaging the estimated corrected heights over birth cohorts, however, I do so over
regions of birth. As above, I correct for selection on observables by weighting by inverse conditional enlistment
80 Similar relationships are present in selected samples in other contexts, such as between England and Ireland in the East
India Company Army (Mokyr and Ó Gráda, 1996) and in the Habsburg empire (Komlos, 1989). Mokyr and Ó Gráda (1996)
acknowledge that this puzzle may be due in part to different selection into military enlistment in each country.
probability. Results are presented in Table 10. Panel A shows the mean heights per region, correcting for
truncation, measurement age, and selection on observables only. The puzzling relationship between the
regions is present, with Northeasterners roughly 0.75 to 0.80 inches shorter than Southerners in both the
Union Army- and Regular Army-based specifications. Panel B corrects for selection on unobservables; the
difference between the Northeast and the South is smaller, but still present at roughly 0.5 to 0.6 inches. As
was the case for the temporal antebellum puzzle, the differences between the heights of the various regions,
in particular those of the Northeast and South, are smaller, but still present and statistically significant.
Once again, sample-selection bias cannot wholly account for a puzzling relationship in height.
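The inverse-probability weighting used to correct for selection on observables can be sketched as a weighted group mean. This Python fragment is a minimal illustration, assuming each enlister carries an estimated conditional enlistment probability; the function name, the argument names, and the toy numbers are hypothetical, and the corrections for truncation, measurement age, and selection on unobservables are not shown.

```python
from collections import defaultdict


def ipw_group_means(groups, heights, enlist_probs):
    """Mean height within each group, weighting each observation by
    the inverse of its estimated enlistment probability."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for group, height, prob in zip(groups, heights, enlist_probs):
        if not 0.0 < prob <= 1.0:
            raise ValueError("enlistment probability must lie in (0, 1]")
        weight = 1.0 / prob
        weighted_sum[group] += weight * height
        weight_total[group] += weight
    return {g: weighted_sum[g] / weight_total[g] for g in weighted_sum}


# Toy data (hypothetical): two enlisters per region of birth.
regions = ["Northeast", "Northeast", "South", "South"]
toy_heights = [67.0, 69.0, 68.0, 70.0]
toy_probs = [0.5, 0.25, 0.2, 0.2]
region_means = ipw_group_means(regions, toy_heights, toy_probs)
```

In the toy data, the Northeastern enlister with the lower enlistment probability receives twice the weight of the other, so the weighted mean (about 68.33 inches) lies above the unweighted mean of 68.0 inches. Individuals who were unlikely to enlist stand in for more of the population that is not observed.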
But again, the researcher cannot simply ignore sample-selection bias. As shown in Panel C, the differences
in heights between Northeasterners and natives of other regions become statistically significantly smaller in both samples, with the Northeast-South
difference in the Union Army sample falling by roughly half. These large and statistically significant changes in the difference in
heights between Northeasterners and those from other regions show again that researchers cannot disregard
sample-selection bias, even in cross-sectional comparisons.81
A similar analysis is possible in order to investigate the urban height penalty—a robust finding that
residents of urban areas were shorter than residents of rural areas (Haines, Craig, and Weiss, 2003; Haines and
Steckel, 2000; Humphries and Leunig, 2009). The penalty is not evident in all contexts, however, suggesting
that it may be an artifact of sample-selection bias (following the Bodenhorn, Guinnane, and Mroz, 2015b
argument).82 Performing Procedure 1 and averaging estimated corrected heights over sector makes it possible
to determine to what extent this finding is the result of sample-selection bias that differs by sector. The
results of this procedure are presented in Table 11, in which an urban height penalty of approximately 0.62
to 0.78 inches is present when correcting for selection on observables (Panel B). The results of the correction
for sample-selection in Panel C show a similar pattern to the regional case. Correcting for selection does not
eliminate the difference in average stature, but does statistically significantly and meaningfully change the
magnitude of the difference to about 0.48 to 0.51 inches.
I therefore conclude that correcting for selection on unobservables in cross-sectional settings does not
eliminate puzzling differences in average stature, but does lead to statistically significant declines in the
magnitudes of these differences. These results are similar to the temporal case studied in section 6.3.
81 The magnitude of the effect of correcting for sample-selection bias on the regional differences is not robust to the exclusion
of outliers in the linkage probability. When outliers are included, the magnitudes of the changes are much smaller, and only
marginally statistically significant.
82 See, for instance, Humphries and Leunig (2009), Martínez-Carrión and Moreno-Lázaro (2007), Reis (2009), and Twarog
(1997).
7 Threats to Validity
7.1 Robustness to Weighting
I test the robustness of the results to omitting the weighting to correct for selection into census linkage on
the basis of observable characteristics, as discussed in section 4.3. The results of performing this estimation
are presented in Figure 12 for the trend in average stature over birth cohorts. In this specification, the Union
Army-Regular Army combination sample shows a decline of 1.38 inches before correcting for selection on
unobservables, and a decline of 1.15 inches after the correction. The Regular Army-only sample
shows a decline of 0.69 inches before correction and 0.46 inches after correction. The most fundamental
result—that the antebellum decline is not an artifact of sample-selection bias—is thus robust to the omission
of the weighting. However, the secondary result—that the correction for sample-selection bias has an effect
of meaningful magnitude on the trend in stature—is not supported. In particular, while correction is shown
to lead to statistically significant changes in the trend in average stature, the effect is small. Results of the
cross-sectional comparisons are also similar in the weighted and unweighted cases, although in the case of
the cross-sectional comparisons, the effects of correcting for selection are of meaningful magnitude in both
cases.
7.2 Exclusion Restrictions
Validity of the exclusion restrictions is essential in this context. It is crucial for identification, because a semi-parametric model cannot obtain identification from non-linearity of functional form. Thus it
is critical to ensure that the exclusion restrictions satisfy the relevant assumptions—relevance (i.e., the
exclusion restrictions affect the enlistment probability) and excludability (i.e., it is appropriate to omit them
from the height equation). I address each of these two issues separately below, focusing on the vote share
variables.
7.2.1 Relevance
The issue of relevance of the voting variables has already been addressed in some detail above. Section 4.1.5
laid out the theoretical foundation for why military enlistment should be related to these votes: enlistment and
voting turned on similar issues, and Costa and Kahn (2003, 2007) relate voting to the similar decision of
desertion. Section 6.1 showed that there was a
relationship of economically significant magnitude between the voting variables and the probability of military
enlistment. Figure 13 serves to reiterate this point by depicting graphically the unconditional relationship
between the two vote shares and the estimated probability of enlistment, $G(x_i'\hat{\beta}_k + c_i'\hat{\alpha} + z_i'\hat{\delta}_k)$.83 In both
the Union Army-based sample and the Regular Army-only sample, these graphs show a clear relationship
between enlistment probability and the vote shares.
7.2.2 Excludability
Establishing the excludability of the vote share variables is more difficult, and it cannot be definitively
proven.84 Excludability might fail, for instance, if the voting variables capture some unobserved economic
conditions that also affect height. It is hoped that including the individual- and household-level covariates
mitigates this possibility, but it nevertheless still exists.
While it is difficult to test for excludability of the vote share variables, it is possible to determine whether
the exclusion restriction assumptions are crucial to the results by reproducing the results with an alternative
identification method that does not rely on the excludability of the vote shares or the assumption that the
covariates impact the military enlistment decision differently across cohort groups. As discussed by Mulligan
and Rubinstein (2008), it is possible to identify the sample-selection model in the absence of exclusion
restrictions or distributional assumptions. Known as “identification at infinity,” this approach (used by
Andrews and Schafgans, 1998; Heckman, 1990 to identify the intercept of a sample-selection model) relies
on the limiting properties of equation (8) as the probability of selection into the sample conditional on
individuals’ covariates approaches one. Intuitively, if an individual’s covariates are such that his probability
of selection into the sample is high, he is observed almost regardless of his draw of unobservables, making
selection bias relatively unproblematic, and more so as the probability of observation conditional on covariates
approaches one. All that is required for identification in this approach is that there exist at least one covariate
with unbounded support (Chamberlain, 1986; Lewbel, 2007);85 the several continuous variables employed
in estimating the binary choice model ensure that there is sufficient variation for identification using this
method. The major drawback of this approach is that only a small subsample of the data can be used
83 As with the partial effects presented in Table 7, these graphs cover only the supplementary sample, giving an estimate of
the relationship in the whole population.
84 Huber and Mellace (2014) provide a test for joint satisfaction of excludability and monotonicity (a consequence of additive
separability) of exclusion restrictions. Excludability has already been discussed in section 4.1.5 above. Monotonicity requires
that the selection state change monotonically in the exclusion restriction; for a continuous exclusion restriction, this implies
that “each individual switches its selection state at most once under the null as a reaction to different values of the instrument”
(Huber and Mellace, 2011, p. 17). However, Huber and Mellace’s (2014) test has validity of the instrument as the null, and
is not particularly high powered. Thus, I do nothing more here than note that my performance of the test does not indicate
violations of the assumptions for the exclusion restrictions, but I do not base my claims of validity of the exclusion restrictions
on the results of this test.
85 d’Haultfoeuille and Maurel (2009) provide an alternative condition that hinges on unbounded support of the outcome.
because it relies on individuals with a high conditional enlistment probability.
In practice, I implement this identification approach as follows. First, using the results of the binary
choice model laid out in equation (5) and Table 7, I identify the top 7.5 percent (for the Union Army-Regular Army combination sample) or 15 percent (for the Regular Army-only sample) of individual enlisters
in each birth year on the basis of enlistment probability. Next, using this subset of the data, I estimate an
analog of equation (8) that imposes the assumption that the selection function can be disregarded among
these individuals with higher enlistment probability:
$$h_i = c_i'\gamma + x_i'\theta + m_i'\pi + \varepsilon_i. \tag{13}$$
Under the identification at infinity assumption, the model can be estimated by OLS on the small sample of
individuals with high enlistment probability. Finally, to correct for the even stronger selection on observables
induced by limiting the sample to those individuals, I use the estimates of equation (13) from the small sample
to compute predicted heights for the whole sample (as $\hat{h}_i = c_i'\hat{\gamma} + x_i'\hat{\theta}$), which I then average within cohorts
(weighting by inverse enlistment probability to correct for selection on observables into the original sample)
and smooth.
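The first step of this procedure, selecting the enlisters with the highest estimated enlistment probability within each birth year, can be sketched in Python as follows. The record keys `birth_year` and `p_enlist`, the function name, and the rounding rule (round up, and keep at least one record per cohort) are assumptions made for illustration; the OLS estimation of equation (13), the prediction step, and the weighted cohort averages are not shown.

```python
import math
from collections import defaultdict


def top_share_by_cohort(records, share):
    """Keep, within each birth year, the given share of records with the
    highest estimated enlistment probability.

    records: list of dicts with hypothetical keys 'birth_year' and 'p_enlist'.
    share: fraction in (0, 1], for example 0.075 or 0.15.
    """
    if not 0.0 < share <= 1.0:
        raise ValueError("share must lie in (0, 1]")
    by_year = defaultdict(list)
    for record in records:
        by_year[record["birth_year"]].append(record)
    kept = []
    for year in sorted(by_year):
        ranked = sorted(by_year[year], key=lambda r: r["p_enlist"], reverse=True)
        # Assumed rounding rule: round up and keep at least one record.
        n_keep = max(1, math.ceil(share * len(ranked)))
        kept.extend(ranked[:n_keep])
    return kept


# Toy records (hypothetical values).
toy_records = [
    {"birth_year": 1840, "p_enlist": 0.90},
    {"birth_year": 1840, "p_enlist": 0.50},
    {"birth_year": 1840, "p_enlist": 0.95},
    {"birth_year": 1840, "p_enlist": 0.20},
    {"birth_year": 1850, "p_enlist": 0.05},
    {"birth_year": 1850, "p_enlist": 0.09},
]
subsample = top_share_by_cohort(toy_records, share=0.25)
```

Ranking within birth year rather than over the pooled sample keeps every cohort represented in the subsample, even when, as for the later Regular Army cohorts, the highest enlistment probabilities are far from one.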
Results of estimation using this approach are presented in Figure 14, which includes the results of the
observables-corrected and unobservables-corrected trends of Figure 9 for comparison. It should be noted
that because the estimates of the model identified at infinity rely on very small samples, and because the
enlistment probabilities for the Regular Army are not very large, these estimates are likely not precise.86
As a result, I do not focus on the magnitude of the decline implied by this estimation. The key implication
that I draw from the identification at infinity results is that both samples show declines in stature over birth cohorts
(though this decline is quite small in the Regular Army sample). This result further supports (without
reliance on validity of the exclusion restrictions) the conclusion that sample-selection bias is not responsible
for the antebellum puzzle.
It is also possible to conduct a sort of overidentifying restrictions test. As allowing the coefficients of the
binary choice model to differ by cohort group is alone sufficient for identification, it is possible to include the
voting variables in the second stage to obtain selection-corrected estimates of their relationship with height.
Though this approach has validity as the null and assumes that it is appropriate not to include interactions in
86 The sample size for the combination sample of the Union and Regular Armies peaks at 52 individuals for the 1844 cohort.
The average enlistment probability for the Union Army sample among the top 7.5 percent is 91.4 percent; for the 1847–1860
cohorts of the Regular Army it is 9.2 percent. For the Regular Army-only sample, the largest sample is 35 for the 1850 cohort.
The average enlistment probability for the top 15 percent of the 1832–1846 cohorts is 5.5 percent; for the 1847–1860 cohorts it
is 3.7 percent.
the second stage, it is informative to consider the results.87 Table 12 presents the results of this exercise. The
first item of note in this table is that in neither specification are the relationships of the voting variables with
height statistically significant, either individually or jointly. Moreover, the magnitudes of the coefficients
are again small. The largest in magnitude—that for Douglas’s vote share in column (2)—is interpreted as
showing that a one-standard-deviation increase in the vote share for Douglas (about 18 percentage points)
is associated with a decrease in average height of only 0.04 inches, or less than three percent of a standard
deviation of height. This is thus additional evidence in favor of the excludability of the vote share variables.
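The magnitudes quoted above can be restated as a short calculation. The symbols $\hat{\beta}_D$ (the coefficient on Douglas's vote share, assumed here to be expressed per percentage point), $\sigma_v$ (the standard deviation of that vote share), and $\sigma_h$ (the standard deviation of height) are introduced only for this illustration and are not notation from the estimation.

```latex
\begin{align*}
\hat{\beta}_D \, \sigma_v \approx -0.04 \text{ inches}, \quad \sigma_v \approx 18
&\;\Longrightarrow\; \hat{\beta}_D \approx \frac{-0.04}{18} \approx -0.0022 \text{ inches per percentage point},\\
\frac{0.04}{\sigma_h} < 0.03
&\;\Longrightarrow\; \sigma_h > \frac{0.04}{0.03} \approx 1.33 \text{ inches}.
\end{align*}
```

The second line shows only what the statement "less than three percent of a standard deviation" implies about $\sigma_h$; it is a lower bound, not an estimate of the standard deviation of height.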
8 Conclusion
The antebellum puzzle—and its European cousin, the industrialization puzzle—is a major stylized fact of
the economic history literature. Its surprising implication that living standards in the United States and
England were not unambiguously improved by the onset of modern economic growth and industrialization
has changed economists’ understanding of economic development and spawned a near forty-year effort to
document, understand, and explain the response of the human body to modern economic growth.
In addition to its historical importance, this puzzle has provided valuable insights in modern contexts.
Debates over the future of economic growth in the developed world hinge on trends in GDP and productivity
statistics (e.g., Gordon, 2012). This puzzle provides an example of a divergence between GDP statistics and
(at least a portion of) living standards. If considerable GDP growth could come without considerable gains
in health in the nineteenth century, it is not infeasible that gains in living standards could come without a
considerable gain in GDP per capita in the twenty-first. This puzzle is also relevant in modern developing
countries. Deaton (2007) and Jayachandran and Pande (2015) report that fast-paced economic growth in
India has not been matched by improvements in height. They also report that cross-sectional relationships
in height exist that contradict monetary measures of welfare, such as between India and Sub-Saharan Africa,
with Africa poorer but taller. Better modern data collection rules out the possibility that these are statistical
artifacts. The similarity of these relationships to those experienced by the United States and United Kingdom
during their early economic development suggests that important insights can be drawn from the economic
history of this puzzle to understand modern economic development in Africa and India. That the temporal
and cross-sectional divergences between the economic and biological standards of living that constitute the
antebellum puzzle are not statistical artifacts but are real phenomena shows that it is possible to use insights
87 I do not present results when one of the vote share variables is excluded and the other is included. Results are very similar
in this case.
from history to better understand these modern puzzles, and suggests that divergence between the economic
and biological standards of living may be a symptom of the early stages of rapid economic growth.
The tremendous amount of scholarly energy devoted to studying this puzzle, together with its importance
in understanding historical and modern economic development, has made suggestions that the puzzle may
simply be a statistical artifact particularly apposite. The magnitude and importance of the literature under
threat by this recent suggestion therefore make it quite important to empirically verify the argument. In
this paper, I supplement suggestive evidence from existing studies (Bodenhorn, Guinnane, and Mroz, 2013,
2014, 2015b; Fourie, Inwood, and Mariotti, 2014) and evidence from studies on related puzzles in the
anthropometric history literature (Steckel and Ziebarth, 2015) with the first rigorous test of whether the
antebellum puzzle is an artifact of sample-selection bias.
Based on the estimation of a two-step semi-parametric sample-selection model on a set of military-linked census data from the birth cohorts of 1832–1860 in the United States, I find robust evidence that
the antebellum puzzle is not an artifact of sample-selection bias. A decline in average stature through the
birth cohorts of the 1830s and 1840s is evident even in the selection-corrected trends in height. Similarly,
regional divergences between the economic and biological standards of living are also found to be present
after the correction. Nonetheless, I do find evidence supporting the argument that the anthropometric
history literature has failed to properly take sample-selection bias into account. I find that the selection-corrected trends in stature differ considerably from the baseline results in the literature (A’Hearn, 1998;
Craig, Forthcoming; Zehetmayer, 2011) and from the trends in height that I compute using the standard
techniques of the literature. The difference stems primarily from large changes in the degree of sample-selection bias across different sources of data. Similar results are obtained for cross-sectional analogs of the
puzzle. Thus, although I do not find that the antebellum puzzle is an artifact of sample-selection bias, the
general argument that it is important to correct for sample-selection bias in studying historical heights is
supported.88 Future studies of historical heights must take selection bias seriously where it is likely to exist.
If sample-selection bias does not explain the antebellum puzzle, then what does? A large literature has
attempted to answer this question, producing a number of possible solutions (Komlos, 1998b). The most
widely accepted cites an increase in the relative price of food and a decline in calorie production per capita
over the period 1830–1860 (Floud et al., 2011; Komlos, 1987, 2012). An alternative explanation focuses on
the fact that inequality rose considerably in the antebellum period (Lindert and Williamson, 2015), possibly
88 In a sense, this result is similar to that of Chetty, Friedman, and Rockoff’s (2014) evaluation of the importance of selection
and sorting in the literature on teacher value added (Rothstein, 2010). They find evidence of sorting, but it is small and
insufficient to render the method uninformative.
48
leading to a decline in average stature because of non-linearities in the production of height (Steckel, 1995).
Other explanations include an increased spread of disease due to urbanization and regional integration
(Steckel, 2008) or an intensification of labor as a result of industrialization (Komlos, 1998b). The results of
this paper show that it is not possible to simply attribute the puzzling behavior of heights in the antebellum
period to sampling irregularities. A true explanation is therefore required, and although none of those cited
above are backed by strong and direct evidence, there is reason to believe that they can explain at least part
of the puzzle. Investigating these and other explanations is the object of my ongoing research.
It must be pointed out that I have established the robustness to corrections for sample-selection bias of
the antebellum puzzle only, and in particular not of the industrialization puzzle. Determining the robustness
of this phenomenon, which is perhaps even more important to the study of the welfare effects of economic
growth than the antebellum puzzle, must be the object of future research. However, studying the trend in
heights in England will be more difficult for several reasons, two of which stand out. First, the decline
in stature in England occurs slightly earlier than in the United States, and given the lack of high-quality census
data prior to 1851, linkage to height-relevant covariates may be difficult. Second, identification in the present
paper is essentially driven by the political turmoil of the Civil War; the lack of similar upheaval in England
may make identification challenging.
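The two-step logic behind the selection correction can be illustrated with a minimal simulated sketch. It is not the semi-parametric estimator used in this paper: for brevity it substitutes the fully parametric correction of Heckman (1979), and every name and parameter value in it (the excluded variable `vote_share`, the covariate `wealth`, the coefficients, and the error correlation) is a hypothetical assumption chosen only for illustration. The first step models the enlistment decision using a variable excluded from the height equation; the second step adds the implied selection term to the height regression on enlistees.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical simulated population (all values are illustrative assumptions).
wealth = rng.normal(size=n)        # covariate affecting height and enlistment
vote_share = rng.normal(size=n)    # excluded variable: affects enlistment only
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
u, e = errors[:, 0], errors[:, 1]  # correlated unobservables

enlisted = (0.5 * wealth + 1.0 * vote_share + u) > 0
height = 67.0 + 0.5 * wealth + 2.0 * e  # treated as observed only for enlistees

# Step 1: probit for enlistment, fitted by maximum likelihood.
W = np.column_stack([np.ones(n), wealth, vote_share])
sign = np.where(enlisted, 1.0, -1.0)

def neg_loglik(gamma):
    # Average negative probit log-likelihood.
    return -stats.norm.logcdf(sign * (W @ gamma)).mean()

gamma_hat = optimize.minimize(neg_loglik, np.zeros(3), method="BFGS").x

# Step 2: OLS on enlistees, adding the inverse Mills ratio as a regressor.
index = W @ gamma_hat
mills = stats.norm.pdf(index) / stats.norm.cdf(index)
X = np.column_stack([np.ones(n), wealth, mills])[enlisted]
y = height[enlisted]

naive = np.linalg.lstsq(X[:, :2], y, rcond=None)[0]   # ignores selection
corrected = np.linalg.lstsq(X, y, rcond=None)[0]      # includes selection term

print("naive intercept:    ", round(float(naive[0]), 2))
print("corrected intercept:", round(float(corrected[0]), 2))
```

In this simulation the naive intercept overstates the population value of 67 inches because enlistees are positively selected on unobservables, while the corrected intercept lies close to it. The semi-parametric approach relaxes the joint-normality assumption that this parametric sketch imposes.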
References
1860 US Census, Population Schedule. (National Archives Microfilm Publication M653, 1,438 Rolls). Washington, D.C.: National Archives and Records Administration.
1870 US Census, Population Schedule. (National Archives Microfilm Publication M593, 1,761 Rolls). Washington, D.C.: National Archives and Records Administration.
Abramitzky, Ran, Leah Platt Boustan, and Katherine Eriksson (2012). “Europe’s Tired, Poor, Huddled
Masses: Self-Selection and Economic Outcomes in the Age of Mass Migration.” The American Economic
Review 102:5, pp. 1832–1856.
——— (2013). “Have the Poor Always Been Less Likely to Migrate? Evidence from Inheritance Practices
during the Age of Mass Migration.” Journal of Development Economics 102, pp. 2–14.
——— (2014). “A Nation of Immigrants: Assimilation and Economic Outcomes in the Age of Mass Migration.” Journal of Political Economy 122:3, pp. 467–506.
A’Hearn, Brian (1998). “The Antebellum Puzzle Revisited: A New Look at the Physical Stature of Union
Army Recruits during the Civil War.” In The Biological Standard of Living in Comparative Perspective.
John Komlos and Jörg Baten (ed.). Stuttgart: Franz Steiner Verlag, pp. 250–267.
——— (2004). “A Restricted Maximum Likelihood Estimator for Truncated Height Samples.” Economics
and Human Biology 2, pp. 5–19.
A’Hearn, Brian, Franco Peracchi, and Giovanni Vecchi (2009). “Height and the Normal Distribution: Evidence
from Italian Military Data.” Demography 46:1, pp. 1–25.
Ahn, Hyungtaik and James L. Powell (1993). “Semiparametric Estimation of Censored Selection Models
with a Nonparametric Selection Mechanism.” Journal of Econometrics 58, pp. 3–29.
Amemiya, Takeshi (1985). Advanced Econometrics. Cambridge: Harvard University Press.
Ancestry.com (2007). U.S. Army, Register of Enlistments, 1798–1914 [database on-line]. Provo: Ancestry.com Operations Inc.
——— (2009a). 1850 United States Federal Census [database on-line]. Provo: Ancestry.com Operations Inc.
——— (2009b). 1860 United States Federal Census [database on-line]. Provo: Ancestry.com Operations Inc.
——— (2009c). 1870 United States Federal Census [database on-line]. Provo: Ancestry.com Operations Inc.
Andrews, Donald W. K. and Marcia M. A. Schafgans (1998). “Semiparametric Estimation of the Intercept
of a Sample Selection Model.” The Review of Economic Studies 65:3, pp. 497–517.
Bailey, Roy E., Timothy J. Hatton, and Kris Inwood (2014). “Health, Height and the Household at the Turn
of the 20th Century.” IZA Discussion Paper No. 8128.
Beard, Albertine S. and Martin J. Blaser (2002). “The Ecology of Height: The Effect of Microbial Transmission on Human Height.” Perspectives in Biology and Medicine 45:4, pp. 475–498.
Ben-Akiva, Moshe, Daniel McFadden, Makoto Abe, Ulf Böckenholt, Denis Bolduc, Dinesh Gopinath, Takayuki
Morikawa, Venkatram Ramaswamy, Vithala Rao, David Revelt, and Dan Steinberg (2007). “Modeling
Methods for Discrete Choice Analysis.” Marketing Letters 8:3, pp. 273–286.
Bernardo, C. Joseph and Eugene H. Bacon (1955). American Military Policy: Its Development Since 1775.
Harrisburg: The Telegraph Press.
Biavaschi, Costanza, Corrado Giulietti, and Zahra Siddique (2013). “The Economic Payoff of Name Americanization.” IZA Discussion Paper No. 7725.
Black, Sandra E., Paul J. Devereux, and Kjell G. Salvanes (2015). “Healthy(?), Wealthy and Wise: Birth
Order and Adult Health.” NBER Working Paper 21337.
Bodenhorn, Howard, Timothy W. Guinnane, and Thomas A. Mroz (2013). “Problems of Sample-selection
Bias in the Historical Heights Literature: A Theoretical and Econometric Analysis.” Yale University
Economics Department Working Paper No. 114.
——— (2014). “Caveat Lector: Sample Selection in Historical Heights and the Interpretation of Early Industrializing Economies.” NBER Working Paper 19955.
——— (2015a). “Biased Samples Yield Biased Results: What Historical Heights Can Teach Us About Past
Living Standards.” Vox EU.
——— (2015b). “Sample-Selection Biases and the ‘Industrialization Puzzle’.” NBER Working Paper 21249.
Bolt, J. and J. L. van Zanden (2013). “The First Update of the Maddison Project: Re-Estimating Growth
Before 1820.” Maddison Project Working Paper 4.
Bowman, Shearer Davis (2010). At the Precipice: Americans North and South During the Secession Crisis.
Chapel Hill: University of North Carolina Press.
Bozzoli, Carlos, Angus Deaton, and Climent Quintana-Domeque (2009). “Adult Height and Childhood Disease.” Demography 46:4, pp. 647–669.
Brown, John C. (1990). “The Condition of England and the Standard of Living: Cotton Textiles in the
Northwest, 1806–1850.” The Journal of Economic History 50:3, pp. 591–614.
Case, Anne and Christina Paxson (2008). “Stature and Status: Height, Ability, and Labor Market Outcomes.”
Journal of Political Economy 116:3, pp. 499–532.
Case, Anne, Christina Paxson, and Mahnaz Islam (2009). “Making Sense of the Labor Market Height Premium: Evidence from the British Household Panel Survey.” Economics Letters 102, pp. 174–176.
Chamberlain, Gary (1986). “Asymptotic Efficiency in Semi-Parametric Models with Censoring.” Journal of
Econometrics 32, pp. 189–218.
Chetty, Raj, John N. Friedman, and Jonah E. Rockoff (2014). “Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates.” The American Economic Review 104:9, pp. 2593–2632.
Clark, Gregory (2007). A Farewell to Alms. Princeton: Princeton University Press.
Clark, Gregory, Michael Huberman, and Peter H. Lindert (1995). “A British Food Puzzle, 1770–1850.” The
Economic History Review 48:2, pp. 215–237.
Cline, Martha G., Keith E. Meredith, John T. Boyer, and Benjamin Burrows (1989). “Decline of Height with
Age in Adults in a General Population Sample: Estimating Maximum Height and Distinguishing Birth
Cohort Effects from Actual Loss of Stature with Aging.” Human Biology 61:3, pp. 415–425.
Coffman, Edward M. (1986). The Old Army: A Portrait of the American Army in Peacetime, 1784–1898.
New York: Oxford University Press.
Cosslett, Stephen R. (1981). “Efficient Estimation of Discrete-Choice Models.” In Structural Analysis of
Discrete Data with Econometric Applications. Charles F. Manski and Daniel McFadden (ed.). Cambridge:
MIT Press, 1990. Chap. 2, pp. 51–111.
——— (1991). “Semiparametric Estimation of a Regression Model with Sample Selectivity.” In Nonparametric and Semiparametric Methods in Econometrics and Statistics. William A. Barnett, James L. Powell,
and George Tauchen (ed.). New York: Cambridge University Press. Chap. 7, pp. 175–198.
Costa, Dora L. (2013). “Early Indicators of Later Work Levels, Disease and Death.” NIH/NIA Grant P01
AG10120.
Costa, Dora L. and Matthew E. Kahn (2003). “Cowards and Heroes: Group Loyalty in the American Civil
War.” The Quarterly Journal of Economics 118:2, pp. 519–548.
——— (2007). “Deserters, Social Norms, and Migration.” Journal of Law and Economics 50:2, pp. 323–353.
Costa, Dora L. and Richard H. Steckel (1997). “Long-Term Trends in Health, Welfare, and Economic Growth
in the United States.” In Health and Welfare During Industrialization. Richard H. Steckel and Roderick
Floud (ed.). Chicago: University of Chicago Press. Chap. 2, pp. 47–90.
Crafts, N. F. R. (1985). “English Workers’ Real Wages During the Industrial Revolution: Some Remaining
Problems.” The Journal of Economic History 45:1, pp. 139–144.
——— (1997). “Some Dimensions of the ‘Quality of Life’ during the British Industrial Revolution.” The
Economic History Review 50:4, pp. 617–639.
Crafts, N. F. R. and C. K. Harley (1992). “Output Growth and the British Industrial Revolution: A Restatement of the Crafts-Harley View.” The Economic History Review 45:4, pp. 703–730.
Craig, Lee A. (Forthcoming). “Antebellum Puzzle: The Decline in Heights at the Onset of Modern Economic
Growth.” In Handbook of Economics and Human Biology. John Komlos and Inas Kelly (ed.). Oxford:
Oxford University Press.
Craig, Lee A. and Thomas Weiss (1998). “Nutritional Status and Agricultural Surpluses in the Antebellum
United States.” In The Biological Standard of Living in Comparative Perspective. John Komlos and Jörg
Baten (ed.). Stuttgart: Franz Steiner Verlag, pp. 190–207.
Das, Mitali, Whitney K. Newey, and Francis Vella (2003). “Nonparametric Estimation of Sample Selection
Models.” Review of Economic Studies 70, pp. 33–58.
Deaton, Angus (2007). “Height, Health, and Development.” Proceedings of the National Academy of Sciences
of the United States of America 104:33, pp. 13232–13237.
——— (2013). The Great Escape: Health, Wealth and the Origins of Inequality. Princeton: Princeton University Press.
Deaton, Angus and Raksha Arora (2009). “Life at the Top: The Benefits of Height.” Economics and Human
Biology 7, pp. 133–136.
d’Haultfoeuille, Xavier and Arnaud Maurel (2009). “Another Look at the Identification at Infinity of Sample
Selection Models.” IZA Discussion Paper No. 4334.
Domowitz, Ian and Robert L. Sartain (1999). “Determinants of the Consumer Bankruptcy Decision.” The
Journal of Finance 54:1, pp. 403–420.
Eveleth, Phyllis B. and James M. Tanner (1976). Worldwide Variation in Human Growth. Cambridge University Press.
Feinstein, Charles H. (1998). “Pessimism Perpetuated: Real Wages and the Standard of Living in Britain
during and after the Industrial Revolution.” The Journal of Economic History 58:3, pp. 625–658.
Ferrie, Joseph P. (1996). “A New Sample of Males Linked from the Public Use Microdata Sample of the
1850 US Federal Census to the 1860 US Federal Census Manuscript Schedules.” Historical Methods 29:4,
pp. 141–156.
——— (1997). “The Entry into the US Labor Market of Antebellum European Immigrants, 1840–1860.”
Explorations in Economic History 34, pp. 295–330.
Floud, Roderick (1985). “Measuring the Transformation of the European Economies: Income, Health, and
Welfare.” Historical Social Research 33, pp. 25–41.
Floud, Roderick, Robert W. Fogel, Bernard Harris, and S. C. Hong (2011). The Changing Body: Health,
Nutrition, and Human Development in the Western World since 1700. New York: Cambridge University
Press.
Floud, Roderick, Kenneth W. Wachter, and Anabel S. Gregory (1990). Height, Health and History: Nutritional Status in the United Kingdom, 1750–1980. Cambridge: Cambridge University Press.
Fogel, Robert W. (1986). “Nutrition and the Decline in Mortality since 1700: Some Preliminary Findings.” In
Long-Term Factors in American Economic Growth. Stanley L. Engerman and Robert E. Gallman (ed.).
Chicago: University of Chicago Press. Chap. 9, pp. 439–556.
Fogel, Robert W., Stanley L. Engerman, Roderick Floud, Gerald Friedman, Robert A. Margo, Kenneth
Sokoloff, Richard H. Steckel, T. James Trussell, Georgia Villaflor, and Kenneth W. Wachter (1983).
“Secular Changes in American and British Stature and Nutrition.” The Journal of Interdisciplinary
History 14:2, pp. 445–481.
Foner, Jack D. (1970). The United States Soldier Between Two Wars: Army Life and Reforms, 1865–1898.
New York: Humanities Press.
Fourie, Johan, Kris Inwood, and Martine Mariotti (2014). “Can Historical Changes in Military Technology
Explain the Industrial Growth Puzzle?” Mimeo., London School of Economics.
Frisancho, A. Roberto (1993). Human Adaptation and Accommodation. Ann Arbor: The University of Michigan Press.
Gallman, Robert E. (1996). “Dietary Change in Antebellum America.” The Journal of Economic History
56:1, pp. 193–201.
Goldin, Claudia and Robert A. Margo (1992). “Wages, Prices, and Labor Markets before the Civil War.” In
Strategic Factors in Nineteenth Century American Economic History. Claudia Goldin and Hugh Rockoff
(ed.). Chicago: University of Chicago Press. Chap. 2, pp. 67–104.
Gordon, Robert J. (2012). “Is US Economic Growth Over? Faltering Innovation Confronts the Six Headwinds.” NBER Working Paper 18315.
Gould, Benjamin Apthorp (1869). Investigations in the Military and Anthropological Statistics of American
Soldiers. Sanitary Memoirs of the War of the Rebellion. Collected and Published by the United States
Sanitary Commission. New York: Hurd and Houghton.
Haines, Michael R. (1998). “Health, Height, Nutrition, and Mortality: Evidence on the ‘Antebellum Puzzle’
from Union Army Recruits for New York State and the United States.” In The Biological Standard of
Living in Comparative Perspective. John Komlos and Jörg Baten (ed.). Stuttgart: Franz Steiner Verlag,
pp. 155–180.
Haines, Michael R., Lee A. Craig, and Thomas Weiss (2003). “The Short and the Dead: Nutrition, Mortality,
and the ‘Antebellum Puzzle’ in the United States.” The Journal of Economic History 63:2, pp. 382–413.
Haines, Michael R. and Richard H. Steckel (2000). “Childhood Mortality and Nutritional Status as Indicators of Standard of Living: Evidence from World War I Recruits in the United States.” Jahrbuch für
Wirtschaftsgeschichte 43:1.
Hatton, Timothy J. and Bernice E. Bray (2010). “Long Run Trends in the Heights of European Men, 19th–20th
Centuries.” Economics and Human Biology 8, pp. 405–413.
Hatton, Timothy J. and Richard M. Martin (2010). “The Effects on Stature of Poverty, Family Size, and
Birth Order: British Children in the 1930s.” Oxford Economic Papers 62, pp. 157–184.
Heckman, James J. (1979). “Sample Selection Bias as a Specification Error.” Econometrica 47:1, pp. 153–161.
Heckman, James J. (1990). “Varieties of Selection Bias.” The American Economic Review, Papers and Proceedings 80:2, pp. 313–318.
Hermanussen, Michael, Beate Hermanussen, and Jens Burmeister (1988). “The Association between Birth
Order and Adult Stature.” Annals of Human Biology 15:2, pp. 161–165.
Horrell, Sara and Deborah Oxley (2015). “Gender Discrimination in 19th Century England: Evidence from
Factory Children.” University of Oxford Discussion Papers in Economic and Social History.
Huber, Martin and Giovanni Mellace (2011). “Testing Instrument Validity in Sample Selection Models.”
Universität St. Gallen Discussion Paper no. 2011-45.
——— (2014). “Testing Exclusion Restrictions and Additive Separability in Sample Selection Models.” Empirical Economics 47, pp. 75–92.
Humphries, Jane and Timothy Leunig (2009). “Cities, Market Integration, and Going to Sea: Stunting and
the Standard of Living in Early Nineteenth-Century England and Wales.” The Economic History Review
62:2, pp. 458–478.
Ichimura, Hidehiko and Lung-Fei Lee (1991). “Semiparametric Least Squares Estimation of Multiple Index
Models: Single Equation Estimation.” In Nonparametric and Semiparametric Methods in Econometrics
and Statistics. William A. Barnett, James L. Powell, and George Tauchen (ed.). New York: Cambridge
University Press.
ICPSR (1999). United States Historical Election Returns, 1824–1968 (ICPSR 1) [Machine-readable Database].
Ann Arbor: Inter-university Consortium for Political and Social Research (ICPSR).
Jacobs, Jan, Tomek Katzur, and Vincent Tassenaar (2008). “On Estimators for Truncated Height Samples.”
Economics and Human Biology 6, pp. 43–56.
Jayachandran, Seema and Rohini Pande (2015). “Why are Indian Children so Short?” NBER Working Paper
21036.
Jones, Charles I. and Peter J. Klenow (2015). “Beyond GDP? Welfare across Countries and Time.” Mimeo,
Stanford University.
Kitagawa, Toru (2010). “Testing for Instrument Independence in the Selection Model.” Mimeo., University
College London.
Klein, Roger W. and Richard H. Spady (1993). “An Efficient Semiparametric Estimator for Binary Response
Models.” Econometrica 61:2, pp. 387–421.
Komlos, John (1987). “The Height and Weight of West Point Cadets: Dietary Change in Antebellum America.” The Journal of Economic History 47:4, pp. 897–927.
——— (1989). Nutrition and Economic Development in the Eighteenth-Century Hapsburg Monarchy: An
Anthropometric History. Princeton: Princeton University Press.
——— (1992). “Toward an Anthropometric History of African-Americans: The Case of the Free Blacks
in Antebellum Maryland.” In Strategic Factors in Nineteenth Century American Economic History: A
Volume to Honor Robert W. Fogel. Claudia Goldin and Hugh Rockoff (ed.). Chicago: University of Chicago
Press. Chap. 10, pp. 297–329.
——— (1998a). “On the Biological Standard of Living of African-Americans: the Case of the Civil War
Soldiers.” In The Biological Standard of Living in Comparative Perspective. John Komlos and Jörg Baten
(ed.). Stuttgart: Franz Steiner Verlag, pp. 236–249.
——— (1998b). “Shrinking in a Growing Economy? The Mystery of Physical Stature during the Industrial
Revolution.” The Journal of Economic History 58:3, pp. 779–802.
Komlos, John (2012). “A Three-Decade History of the Antebellum Puzzle: Explaining the Shrinking of the
U.S. Population at the Onset of Modern Economic Growth.” The Journal of the Historical Society 12:4,
pp. 395–445.
Komlos, John and Leonard Carlson (2012). “The Anthropometric History of Native Americans,
c. 1820–1890.” CESifo Working Paper No. 3740.
Komlos, John and Joo Han Kim (1990). “Estimating Trends in Historical Heights.” Historical Methods 23:3,
pp. 116–120.
Lee, Myoung-jae and Francis Vella (2006). “A Semi-Parametric Estimator for Censored Selection Models
with Endogeneity.” Journal of Econometrics 130, pp. 235–252.
Lewbel, Arthur (2007). “Endogenous Selection or Treatment Model Selection.” Journal of Econometrics 141,
pp. 777–806.
Li, Qi and Jeffrey Scott Racine (2007). Nonparametric Econometrics: Theory and Practice. Princeton: Princeton University Press.
Li, Qi and Jeffrey M. Wooldridge (2002). “Semiparametric Estimation of Partially Linear Models for Dependent Data with Generated Regressors.” Econometric Theory 18:3, pp. 625–645.
Lindert, Peter H. and Robert A. Margo (2006). “Table Cc1–2: Consumer Price Indexes, for All Items,
1774–2003.” In Historical Statistics of the United States. Susan B. Carter, Scott Sigmund Gartner, Michael R.
Haines, Alan L. Olmstead, Richard Sutch, and Gavin Wright (ed.). Cambridge: Cambridge University
Press, pp. 3.158–3.159.
Lindert, Peter H. and Jeffrey G. Williamson (2015). Unequal Gains: American Incomes since the 1600s.
Mimeo.
Lundborg, Petter, Hilda Ralsmark, and Dan-Olof Rooth (2015). “The More the Healthier? Health and Family
Size.” Mimeo., Lund University.
Manski, Charles F. (1994). “The Selection Problem.” In Advances in Econometrics: Sixth World Congress.
Christopher A. Sims (ed.). Vol. I. Cambridge: Cambridge University Press. Chap. 4, pp. 143–170.
Manski, Charles F. and Steven R. Lerman (1977). “The Estimation of Choice Probabilities from Choice
Based Samples.” Econometrica 45:8, pp. 1977–1988.
Manski, Charles F. and Daniel McFadden (1981). “Alternative Estimators and Sample Designs for Discrete
Choice Analysis.” In Structural Analysis of Discrete Data with Econometric Applications. Charles F.
Manski and Daniel McFadden (ed.). Cambridge: MIT Press, 1990. Chap. 1, pp. 1–50.
Margo, Robert A. (2000). Wages and Labor Markets in the United States, 1820–1860. Chicago: University
of Chicago Press.
Margo, Robert A. and Richard H. Steckel (1983). “Heights of Native-Born Whites During the Antebellum
Period.” The Journal of Economic History 43:1, pp. 167–174.
Margo, Robert A. and Georgia C. Villaflor (1987). “The Growth of Wages in Antebellum America: New
Evidence.” The Journal of Economic History 47:4, pp. 873–895.
Martínez-Carrión, José-Miguel and Javier Moreno-Lázaro (2007). “Was there an Urban Height Penalty in
Spain, 1840–1913?” Economics and Human Biology 5:1, pp. 144–164.
Martorell, Reynaldo and Jean-Pierre Habicht (1986). “Growth in Early Childhood in Developing Countries.”
In Human Growth: A Comprehensive Treatise. Frank Falkner and James M. Tanner (ed.). Vol. 3. Plenum
Press, pp. 241–262.
Minnesota Population Center (2011). National Historic Geographic Information System: Version 2.0 [Machine-readable database]. Minneapolis: University of Minnesota.
Mokyr, Joel (1988). “Is There Still Life in the Pessimist Case? Consumption during the Industrial Revolution,
1790–1850.” The Journal of Economic History 48:1, pp. 69–92.
Mokyr, Joel and Cormac Ó Gráda (1994). “The Heights of the British and the Irish c. 1800–1815: Evidence
from Recruits to the East India Company’s Army.” In Stature, Living Standards, and Economic Development: Essays in Anthropometric History. John Komlos (ed.). Chicago: University of Chicago Press.
Chap. 3, pp. 39–59.
——— (1996). “Height and Health in the United Kingdom 1815–1860: Evidence from the East India Company Army.” Explorations in Economic History 33, pp. 141–168.
Mroz, Thomas A. (2015). “Sample Selection with Multiple Selection Indicators in Two-Step Estimators.”
Mimeo., Georgia State University.
Mulligan, Casey B. and Yona Rubinstein (2008). “Selection, Investment, and Women’s Relative Wages over
Time.” The Quarterly Journal of Economics 123:3, pp. 1061–1110.
Nadaraya, Elizbar A. (1964). “On Estimating Regression.” Theory of Probability & Its Applications 9:1,
pp. 141–142.
Nevins, Allan (1959). The War for the Union. Vol. 1. New York: Charles Scribner’s Sons.
Newey, Whitney K. (2009). “Two-Step Series Estimation of Sample Selection Models.” The Econometrics
Journal 12:S1, S217–S229.
Penttinen, Antti, Elena Moltchanova, and Ilkka Nummela (2013). “Bayesian Modeling of the Evolution of
Male Height in 18th Century Finland from Incomplete Data.” Economics and Human Biology 11,
pp. 405–415.
Powell, James L. (2001). “Semiparametric Estimation of Censored Selection Models.” In Nonlinear Statistical
Modeling. Cheng Hsiao, Kimio Morimune, and James L. Powell (ed.). New York: Cambridge University
Press. Chap. 6, pp. 165–196.
Register of Enlistments in the U.S. Army, 1798–1914. (National Archives Microfilm Publication M233, 81
Rolls), Records of the Adjutant General’s Office, 1780’s–1917, Record Group 94. Washington, D.C.:
National Archives.
Reis, Jaime (2009). “Urban Premium or Urban Penalty? The Case of Lisbon, 1840–1912.” Historia Agraria
47.
Robinson, P. M. (1988). “Root-N-Consistent Semiparametric Regression.” Econometrica 56:4, pp. 931–954.
Rothstein, Jesse (2010). “Teacher Quality in Educational Production: Tracking, Decay, and Student Achievement.” The Quarterly Journal of Economics 125:1, pp. 175–214.
Roy, A. D. (1951). “Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers 3:2,
pp. 135–146.
Ruggles, Steven, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew
Sobek (2010). Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database]. Minneapolis: University of Minnesota.
Sandberg, Lars G. and Richard H. Steckel (1997). “Was Industrialization Hazardous to Your Health? Not in
Sweden!” In Health and Welfare During Industrialization. Richard H. Steckel and Roderick Floud (ed.).
Chicago: University of Chicago Press.
Schwarz, L. D. (1985). “The Standard of Living in the Long Run: London 1700–1860.” Economic History
Review 38, pp. 24–41.
Seventh Census of the United States, 1850. (National Archives Microfilm Publication M432, 1,009 Rolls),
Records of the Bureau of the Census, Record Group 29. Washington, D.C.: National Archives.
Silventoinen, Karri (2003). “Determinants of Variation in Adult Body Height.” Journal of Biosocial Science
35:2, pp. 263–285.
Spitzer, Yannay and Ariell Zimran (2015). “Migrant Self-Selection: Anthropometric Evidence from the Mass
Migration of Italians to the United States, 1907–1925.” Mimeo., Northwestern University.
Steckel, Richard H. (1992). “Stature and Living Standards in the United States.” In American Economic
Growth and Standards of Living before the Civil War. Robert E. Gallman and John Joseph Wallis (ed.).
Chicago: University of Chicago Press. Chap. 6, pp. 265–310.
——— (1995). “Stature and the Standard of Living.” Journal of Economic Literature 33:4, pp. 1903–1940.
——— (2008). “Biological Measures of the Standard of Living.” Journal of Economic Perspectives 22:1,
pp. 129–152.
——— (2009). “Heights and Human Welfare: Recent Developments and New Directions.” Explorations in
Economic History 46, pp. 1–23.
Steckel, Richard H. and Nicolas Ziebarth (2015). “Selectivity and Measured Catch-up Growth of American
Slaves.” The Journal of Economic History Forthcoming.
Steinberg, Dan and N. Scott Cardell (1992). “Estimating Logistic Regression Models when the Dependent
Variable Has No Variance.” Communications in Statistics—Theory and Methods 21:2, pp. 423–450.
Sunder, Marco (2011). “Upward and Onward: High-Society American Women Eluded the Antebellum Puzzle.” Economics and Human Biology 9, pp. 165–171.
Twarog, Sophia (1997). “Heights and Living Standards in Germany, 1850–1939: The Case of Württemberg.”
In Health and Welfare During Industrialization. Richard H. Steckel and Roderick Floud (ed.). Chicago:
University of Chicago Press, pp. 285–330.
Vella, Francis (1998). “Estimating Models with Sample Selection Bias: A Survey.” The Journal of Human
Resources 33:1, pp. 127–169.
Voth, Hans-Joachim (2003). “Living Standards during the Industrial Revolution: An Economist’s Guide.”
The American Economic Review, Papers and Proceedings 93:2, pp. 221–226.
Voth, Hans-Joachim and Timothy Leunig (1996). “Did Smallpox Reduce Height? Stature and the Standard
of Living in London, 1770–1873.” The Economic History Review 49:3, pp. 541–560.
Wachter, Kenneth W. and James Trussell (1982). “Estimating Historical Heights.” Journal of the American
Statistical Association 77:378, pp. 279–293.
Watson, Geoffrey S. (1964). “Smooth Regression Analysis.” Sankhya: The Indian Journal of Statistics 26:4,
pp. 359–372.
Weigley, Russell F. (1967). History of the United States Army. New York: The Macmillan Company.
Weir, David R. (1997). “Economic Welfare and Physical Well-Being in France, 1750–1990.” In Health and
Welfare During Industrialization. Richard H. Steckel and Roderick Floud (ed.). Chicago: University of
Chicago Press.
Wilson, Sven and Clayne Pope (2003). “The Height of Union Army Recruits: Family and Community Influences.” In Health and Labor Force Participation over the Life Cycle: Evidence from the Past. Dora L.
Costa (ed.). Chicago: University of Chicago Press, pp. 113–146.
Woitek, Ulrich (2003). “Height Cycles in the 18th and 19th Centuries.” Economics and Human Biology 1,
pp. 243–257.
Wooldridge, Jeffrey M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT
Press.
Zehetmayer, Matthias (2010). “An Anthropometric History of the Postbellum US, 1847–1894.” PhD thesis.
Munich: Ludwig-Maximilians-Universität München.
——— (2011). “The Continuation of the Antebellum Puzzle: Stature in the US, 1847–1894.” European Review
of Economic History 15, pp. 313–327.
Tables
Table 1: Distribution of observations by census.

                    Union Army [UA]         Regular Army [RA(a)]    Regular Army [RA(b)]
                    1832–1846               1832–1846               1847–1860
Census  Cohorts     CR          Supp.       CR          Supp.       CR          Supp.
                    (1)         (2)         (3)         (4)         (5)         (6)
1850    1832–1841   3398        5659        1018        7387
                    (58.07)     (67.47)     (55.51)     (68.04)
1860    1842–1851   2454        2728        816         3470        1032        3796
                    (41.93)     (32.53)     (44.49)     (31.96)     (39.98)     (32.89)
1870    1852–1860                                                   1549        7745
                                                                    (60.02)     (67.11)
Total               5852        8387        1834        10857       2581        11541

Notes: Each cell reports the number of individuals in each sample with data taken from a particular census.
Samples are restricted to cover individuals with data on all individual-level variables other than father’s nativity.
Numbers in parentheses are percentages of the column total. Supplementary samples are random samples of the
population from the census, for which military enlistment status is unobserved. They are intended for comparison
to the choice-restricted military sample from enlistments under their particular headings.
Table 2: Summary statistics.

                            Union Army [UA]               Regular Army [RA(a)]          Regular Army [RA(b)]
                                                          1832–1846                     1847–1860
                            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Variable                    CR        Supp.     Diff.     CR        Supp.     Diff.     CR        Supp.     Diff.

Height (in)                 68.198                        67.556                        67.403
                            (2.584)                       (2.266)                       (2.103)
HH Owns Prop.               0.716     0.693     0.023     0.588     0.684     −0.096a   0.827     0.731     0.097a
                            (0.451)   (0.461)   [0.015]   (0.492)   (0.465)   [0.014]   (0.378)   (0.444)   [0.019]
HH Real Prop. ($1,000)      1.778     2.264     −0.486a   1.995     2.330     −0.335c   2.192     2.255     −0.062
                            (3.776)   (6.096)   [0.100]   (6.408)   (9.864)   [0.181]   (6.730)   (7.405)   [0.150]
Related to Head of HH       0.889     0.862     0.027a    0.797     0.869     −0.072a   0.879     0.896     −0.017b
                            (0.315)   (0.345)   [0.007]   (0.403)   (0.337)   [0.010]   (0.326)   (0.305)   [0.008]
HH Size                     7.521     7.408     0.113b    7.387     7.512     −0.125c   6.981     7.119     −0.139c
                            (2.407)   (2.409)   [0.050]   (2.493)   (2.450)   [0.064]   (2.460)   (2.329)   [0.072]
Urban                       0.297     0.423     −0.126a   0.510     0.349     0.162a    0.655     0.484     0.172a
                            (0.457)   (0.494)   [0.026]   (0.500)   (0.477)   [0.017]   (0.475)   (0.500)   [0.033]
Attended School             0.568     0.656     −0.088a   0.575     0.615     −0.040a   0.653     0.675     −0.022c
                            (0.495)   (0.475)   [0.021]   (0.494)   (0.487)   [0.013]   (0.476)   (0.469)   [0.012]
Birth Region
  Midwest                   0.450     0.299     0.151a    0.253     0.232     0.021     0.320     0.334     −0.014
                            (0.498)   (0.458)   [0.023]   (0.435)   (0.422)   [0.015]   (0.467)   (0.472)   [0.030]
  Northeast                 0.420     0.562     −0.141a   0.595     0.435     0.160a    0.504     0.360     0.144a
                            (0.494)   (0.496)   [0.025]   (0.491)   (0.496)   [0.018]   (0.500)   (0.480)   [0.040]
  South                     0.130     0.140     −0.010    0.152     0.333     −0.181a   0.177     0.307     −0.130a
                            (0.336)   (0.347)   [0.014]   (0.359)   (0.471)   [0.018]   (0.381)   (0.461)   [0.030]
Buchanan Vote Frac. (1856)  0.443     0.428     0.015c    0.422     0.461     −0.040a   0.428     0.469     −0.040a
                            (0.142)   (0.141)   [0.008]   (0.140)   (0.158)   [0.006]   (0.135)   (0.156)   [0.010]
Douglas Vote Frac. (1860)   0.347     0.324     0.023b    0.299     0.270     0.029a    0.311     0.286     0.025
                            (0.175)   (0.185)   [0.011]   (0.188)   (0.199)   [0.007]   (0.187)   (0.202)   [0.017]

Observations                5,750     8,210               1,786     10,449              2,417     10,634

Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Standard deviations in parentheses. Standard errors are in square brackets and are clustered at the county level. Only a
selection of variables are covered in this table; the full list of variables and summary statistics are presented in Table A.1. Sample
sizes are the minimum of the column with observations for all variables (including variables listed only in Table A.1). Supplementary
samples are random samples of the population from the census, for which military enlistment status is unobserved. They are intended
for comparison to the choice-restricted military sample from enlistments under their particular headings.
Table 3: Height regressions.

                            (1)           (2)           (3)           (4)
Variables                   UA & RA(b)    UA & RA(b)    RA            RA

HH Owns Prop.               0.181b        0.168b        0.047         0.039
                            (0.080)       (0.080)       (0.108)       (0.109)
HH Real Prop. ($1,000)      −0.006        −0.007        0.002         0.001
                            (0.006)       (0.006)       (0.005)       (0.005)
Related to Head of HH       0.219b        0.240b        0.212         0.216
                            (0.103)       (0.104)       (0.133)       (0.135)
HH Size                     0.023c        0.023c        0.002         0.003
                            (0.013)       (0.013)       (0.018)       (0.019)
Urban                       −0.169b       −0.161b       −0.187c       −0.183c
                            (0.079)       (0.081)       (0.109)       (0.111)
Attended School             −0.030        −0.031        0.235b        0.221b
                            (0.063)       (0.064)       (0.101)       (0.103)
Midwest                     −0.065        −0.086        0.128         0.052
                            (0.113)       (0.131)       (0.153)       (0.174)
Northeast                   −0.557a       −0.575a       −0.321b       −0.374b
                            (0.120)       (0.128)       (0.152)       (0.165)
Buchanan Vote Frac. (1856)                −0.056                      −0.166
                                          (0.297)                     (0.356)
Douglas Vote Frac. (1860)                 0.166                       0.245
                                          (0.221)                     (0.263)
Constant                    66.973a       67.037a       66.817a       66.941a
                            (0.311)       (0.354)       (0.361)       (0.420)

Observations                7,881         7,732         4,120         4,018

Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. All regressions are adjusted for minimum height requirements with a truncation point of 64 inches. All specifications include age-of-measurement indicators, year-of-birth indicators, and all variables from Table A.1 other than
father’s nativity and mother’s age at birth. Standard errors are clustered on the county level. UA
denotes Union Army. RA denotes Regular Army; these regressions are weighted to correct for the
different sampling of the two groups of cohorts. UA & RA(b) denotes the use of Union Army data
for the 1832–1846 cohorts and the use of Regular Army data for the 1847–1860 cohorts; these
regressions are weighted to correct for the different sampling of the two cohorts.
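The adjustment for minimum height requirements mentioned in these notes can be illustrated with a standard truncated-normal setup. The following is a minimal Python sketch, assuming heights are normally distributed and observed only at or above 64 inches; the function names and the illustrative mean and standard deviation are assumptions, not the paper's estimator or estimates. In a regression the mean would be a linear function of the covariates.

```python
import math

TRUNCATION_POINT = 64.0  # minimum height in inches, as stated in the table notes


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_loglik(heights, mean, sd, lower=TRUNCATION_POINT):
    """Log-likelihood of heights that are observed only when height >= lower."""
    alpha = (lower - mean) / sd
    log_tail = math.log(1.0 - normal_cdf(alpha))  # log P(height >= lower)
    total = 0.0
    for h in heights:
        if h < lower:
            raise ValueError("observation below the truncation point")
        z = (h - mean) / sd
        total += math.log(normal_pdf(z)) - math.log(sd) - log_tail
    return total


def truncated_mean(mean, sd, lower=TRUNCATION_POINT):
    """Expected height among those at or above the truncation point."""
    alpha = (lower - mean) / sd
    return mean + sd * normal_pdf(alpha) / (1.0 - normal_cdf(alpha))


# Illustrative (hypothetical) latent mean and standard deviation.
print(truncated_mean(67.0, 2.5))
```

The sketch shows why the adjustment matters: the mean of a truncated sample exceeds the mean of the underlying population, so an unadjusted regression on enlisters would overstate average stature.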
Table 4: Regressions of selection into linkage.
                                    (1)       (2)       (3)       (4)          (5)
Variables                           UA        RA(a)     RA(b)     UA & RA(b)   UA & RA(b)
Linked                              0.142a    0.118     0.197a    0.167a       0.163a
                                    (0.047)   (0.085)   (0.073)   (0.047)      (0.055)
Observations                        12,234    5,041     5,428     17,662       10,469
χ2-Test of Birth Year FE × Linked   11.75     17.86     10.11     21.17        26.04
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. Truncated regression is performed to account for minimum
height requirements with a truncation point of 64 inches. All specifications include measurement-age and birth-year
dummy variables. Standard errors are clustered by image for the unlinked sample. The sample includes linked and
unlinked members of the Regular Army and Union Army. UA denotes the Union Army. RA(a) denotes the 1832–
1846 cohorts of the Regular Army. RA(b) denotes the 1847–1860 cohorts of the Regular Army. The coefficients on
linked are from a regression without interactions. The statistics on the interactions are from a separate regression
with interactions.
Table 5: Balancing tests for selection into linkage.
                                (1)               (2)                    (3)
                                Union Army [UA]   Regular Army [RA(a)]   Regular Army [RA(b)]
Dep. Variable                                     1832–1846              1847–1860
Name
  Surname Scrabble Score        0.013             0.304b                 0.022
                                (0.066)           (0.134)                (0.125)
  First Name Scrabble Score     −0.035            −1.738a                −1.866a
                                (0.071)           (0.125)                (0.105)
  Surname Length                0.009             0.161a                 0.125a
                                (0.028)           (0.050)                (0.046)
  First Name Length             0.018             −0.472a                −0.797a
                                (0.031)           (0.052)                (0.047)
Region
  Northeast                     −0.032a           −0.053a                −0.021
                                (0.009)           (0.016)                (0.015)
  Midwest                       0.007             0.040a                 0.069a
                                (0.009)           (0.015)                (0.013)
  South                         0.014b            0.012                  −0.048a
                                (0.006)           (0.012)                (0.012)
Birth Year                      −0.367a           −0.113                 0.634a
                                (0.066)           (0.151)                (0.160)
Enlistment Year                 −0.031            0.000                  −0.000
                                (0.023)           (0.170)                (0.236)
Occupation
  Farmer                        0.011             0.018                  0.027b
                                (0.008)           (0.017)                (0.011)
  Professional                  −0.001            0.013a                 0.009a
                                (0.003)           (0.005)                (0.003)
  Clerical                      −0.000            0.008                  0.030a
                                (0.003)           (0.010)                (0.009)
  Skilled and Artisan           −0.006            0.003                  0.052a
                                (0.006)           (0.014)                (0.012)
  Semi-Skilled and Operative    −0.002            −0.039a                −0.092a
                                (0.003)           (0.012)                (0.010)
  Unskilled                     −0.008c           −0.023c                −0.035a
                                (0.005)           (0.013)                (0.013)
  Unproductive                  0.006a            0.019a                 0.009a
                                (0.002)           (0.004)                (0.002)
Observations (Linked)           5,558             1,803                  2,434
Observations (Unlinked)         7,280             3,493                  3,162
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Each cell represents the coefficient from a regression of the dependent variable in the first column on an indicator for
linkage. The “unlinked” group is not composed only of the unlinked, but is a random sample of the population of enlisters,
so the coefficients are to be interpreted as the difference in the dependent variable between the linked and the population
of enlisters. All regressions cluster standard errors on image, and are weighted to account for stratification; for the Regular
Army, weighting is also performed to make the enlistment years of the whole population similar to that of the linked.
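Each cell in this table is the coefficient from a regression of a characteristic on a linkage indicator, which in the simplest case is a difference in group means. A minimal Python sketch of that equivalence follows; it uses made-up surname lengths, omits the weighting and clustering described in the notes, and its function and variable names are illustrative only.

```python
from statistics import fmean


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: 1 = linked to the census, 0 = comparison sample.
linked = [1, 1, 1, 0, 0, 0, 0]
surname_length = [6, 7, 8, 5, 6, 6, 7]

slope = ols_slope(linked, surname_length)

# The same quantity computed directly as a difference in means.
mean_linked = fmean([y for d, y in zip(linked, surname_length) if d == 1])
mean_other = fmean([y for d, y in zip(linked, surname_length) if d == 0])
difference = mean_linked - mean_other

print(slope, difference)
```

With a single binary regressor and no weights the two numbers coincide, so a coefficient near zero indicates that the linked sample resembles the comparison sample on that characteristic.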
Table 6: Computing weights (Q1) for estimation of the binary choice model.
Sample   Cohorts     Enlisters   Population at Risk   Fraction Enlisting
UA       1832–1846   1,660,068   3,720,008            0.446
RA(a)    1832–1846   96,851      3,101,313            0.031
RA(b)    1847–1860   83,633      4,327,190            0.019
Notes: The sources of these figures are discussed in Appendix G
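The last column of Table 6 follows directly from the two count columns. A minimal Python sketch of that arithmetic is given here, using the counts reported in the table; the dictionary layout and function name are illustrative, and the way these fractions enter the weights Q1 is not reproduced.

```python
# Counts copied from Table 6; the data structure and names are illustrative only.
TABLE_6 = {
    "UA": {"cohorts": "1832-1846", "enlisters": 1_660_068, "at_risk": 3_720_008},
    "RA(a)": {"cohorts": "1832-1846", "enlisters": 96_851, "at_risk": 3_101_313},
    "RA(b)": {"cohorts": "1847-1860", "enlisters": 83_633, "at_risk": 4_327_190},
}


def fraction_enlisting(enlisters: int, at_risk: int) -> float:
    """Share of the population at risk that enlisted."""
    return enlisters / at_risk


fractions = {
    sample: round(fraction_enlisting(row["enlisters"], row["at_risk"]), 3)
    for sample, row in TABLE_6.items()
}

for sample, fraction in fractions.items():
    print(sample, fraction)
```

Rounded to three decimals, the computed shares match the reported 0.446, 0.031, and 0.019.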
Table 7: Binary choice model estimation.
                                      (1)                 (2)
                                      (a)       (b)       (a)       (b)
Variable                              UA        RA(b)     RA(a)     RA(b)
HH Owns Prop.                         0.270     0.320     0.065     0.243
HH Real Prop. (1,000)                 −0.223    −0.553    −0.170    −0.148
Related to Head of HH                 0.217     −0.060    −0.214    −0.177
HH Size                               0.385     −0.115    0.202     −0.003
Urban                                 −0.038    0.054     0.194     0.131
Attended School                       −0.455    −0.390    −0.213    −0.226
HH Occupation (Base: Unproductive)
  Farmer                              −0.304    −0.314    −0.359    −0.246
  Professional                        −0.122    −0.052    0.084     0.057
  Clerical                            −0.307    −0.156    0.011     −0.012
  Skilled and Artisan                 −0.206    −0.057    0.047     0.090
  Semi-Skilled and Operative          −0.202    −0.118    −0.008    −0.017
  Unskilled                           −0.150    −0.066    0.109     0.078
  Farm Labor                          −0.107    −0.108    −0.048    −0.056
Birth Region (Base: South)
  Midwest                             0.519     0.196     0.137     0.018
  Northeast                           0.354     0.214     0.311     0.178
Wheat Bushels PC                      0.333     −0.056    0.062     −0.009
Milk Cows PC                          0.823     −0.178    −0.094    −0.286
Pigs PC                               0.842     −1.046    −1.190    −1.195
Buchanan Vote Frac.                   −0.285    −0.915    −0.245    −0.401
Douglas Vote Frac.                    0.779     0.179     0.483     0.268
Observations (CR)                     5164      2254      1719      2254
Observations (S)                      8210      10634     10449     10634
Notes: Dependent variable is an indicator for military enlistment. These are the results of
estimation of the binary choice model in the first stage of the correction procedure. Average
semi-elasticities reported; these are derivatives of log(Enlistment Probability) for continuous
variables and the effect on log(Enlistment Probability) of a change from 0 to 1 for binary
variables. All specifications include birth-year indicators. Column (1) presents the results
for the sample representing the 1832–1846 birth cohorts with data on Union Army enlisters.
Column (2) represents these cohorts with data from Regular Army enlisters. Observations (CR)
indicates the number of observations in the choice-restricted sample. Observations (S) indicates
the number of observations in the supplementary sample. UA denotes the Union Army. RA(a)
denotes the Regular Army for the 1832–1846 cohorts. RA(b) denotes the Regular Army for
the 1847–1860 cohorts.
Table 8: Second-stage regressions.
                           (1)          (2)
Variables                  UA & RA(b)   RA
HH Owns Prop.              0.112        −0.109
                           (0.087)      (0.118)
HH Real Prop. (1,000)      −0.024b      0.003
                           (0.011)      (0.018)
Related to Head of HH      0.273a       0.286b
                           (0.102)      (0.130)
HH Size                    0.024        0.025
                           (0.015)      (0.018)
Urban                      −0.192b      −0.253b
                           (0.084)      (0.116)
Attended School            0.023        0.244b
                           (0.076)      (0.115)
Midwest                    0.074        0.182
                           (0.126)      (0.145)
Northeast                  −0.272b      −0.173
                           (0.131)      (0.168)
Observations               7,299        3,969
Constant                   67.73        67
                           (0.0781)     (0.113)
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches.
These are the results of the second stage of the correction
procedure, and include a non-parametric function of the linear index of the selection function. These regressions include
all variables in Table A.1 except for father’s nativity, mother’s
age at birth, and the voting variables; full results are reported
in Table A.4. All regressions include age-of-measurement indicators and birth year indicators. Standard errors are corrected for error in the estimation of the linear index of selection.
Table 9: Linear trends and tests for differences.
                                              (1)          (2)
Variables                                     UA & RA(b)   RA
Panel A: Raw Data
Constant                                      67.571a      66.745a
                                              (0.160)      (0.199)
Birthyear                                     −0.091a      −0.093a
                                              (0.022)      (0.022)
1847–1860 Cohorts                             −1.079a      −0.142
                                              (0.210)      (0.217)
Birthyear × 1847–1860 Cohorts                 0.130a       0.128a
                                              (0.032)      (0.030)
Panel B: Observables-Corrected
Constant                                      67.353a      66.769a
                                              (0.250)      (0.222)
Birthyear                                     −0.081a      −0.115a
                                              (0.026)      (0.025)
1847–1860 Cohorts                             −0.875a      −0.016
                                              (0.232)      (0.244)
Birthyear × 1847–1860 Cohorts                 0.125a       0.163a
                                              (0.036)      (0.035)
Panel C: Unobservables-Corrected
Constant                                      67.840a      67.237a
                                              (0.079)      (0.118)
Birthyear                                     −0.111a      −0.090a
                                              (0.010)      (0.017)
1847–1860 Cohorts                             0.203        0.117
                                              (0.144)      (0.173)
Birthyear × 1847–1860 Cohorts                 0.140a       0.124a
                                              (0.018)      (0.024)
Panel D: χ2 Tests
Birthyear                                     1.80         5.00b
1847–1860 Cohorts                             57.58a       2.73c
Birthyear + Birthyear × 1847–1860 Cohorts     3.36c        3.20c
Trend                                         347.42a      27.80a
Observations                                  7,063        3,871
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. All regressions are corrected
for age of measurement. Standard errors are corrected for error in previous stages of
estimation. The variable “1847–1860 Cohorts” is an indicator for being in that group of
cohorts. Birthyear is centered at 1846. Panel D presents χ2 tests for differences between
the results of Panels B and C.
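The piecewise-linear specification in Table 9 can be read as a prediction rule: a slope for the 1832–1846 cohorts, plus a level shift and a slope change for the 1847–1860 cohorts, with birth year centered at 1846. The Python sketch below applies that rule to the Panel C point estimates; the class and function names are illustrative, and the predictions are for the reference measurement age only, so they should be read as an illustration of how the coefficients combine rather than as the paper's reported estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrendCoefficients:
    constant: float      # height at birth year 1846 for the early cohort group
    birthyear: float     # slope for the 1832-1846 cohorts
    late_cohorts: float  # level shift for the 1847-1860 cohorts
    interaction: float   # change in slope for the 1847-1860 cohorts


# Panel C (unobservables-corrected) point estimates from Table 9.
PANEL_C = {
    "UA & RA(b)": TrendCoefficients(67.840, -0.111, 0.203, 0.140),
    "RA": TrendCoefficients(67.237, -0.090, 0.117, 0.124),
}


def predicted_height(coef: TrendCoefficients, birth_year: int) -> float:
    """Predicted height in inches from the piecewise-linear trend."""
    centered = birth_year - 1846              # birth year is centered at 1846
    late = 1 if birth_year >= 1847 else 0     # indicator for the 1847-1860 cohorts
    return (coef.constant
            + coef.birthyear * centered
            + late * (coef.late_cohorts + coef.interaction * centered))


def late_slope(coef: TrendCoefficients) -> float:
    """Slope for the 1847-1860 cohorts: Birthyear + Birthyear x 1847-1860 Cohorts."""
    return coef.birthyear + coef.interaction


for name, coef in PANEL_C.items():
    print(name, round(late_slope(coef), 3), round(predicted_height(coef, 1832), 3))
```

The sum returned by `late_slope` is the quantity tested in the third row of Panel D. It is small and positive in both columns, in contrast to the negative slope for the earlier cohorts.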
Table 10: Tests for differences in levels, regional decomposition.
                 UA & RA(b)                           RA
                 (1)          (2)        (3)          (4)          (5)        (6)
Region           Northeast    Midwest    South        Northeast    Midwest    South
Panel A: Observables-Corrected
Northeast        66.405                               66.791
                 (0.235)                              (0.189)
Midwest          −0.708a      66.405                  −0.636a      67.427
                 (0.169)      (0.049)                 (0.120)      (0.204)
South            −0.793a      −0.085     67.198       −0.745a      −0.109     67.536
                 (0.202)      (0.215)    (0.260)      (0.169)      (0.188)    (0.216)
Panel B: Unobservables-Corrected
Northeast        68.003                               67.346
                 (0.084)                              (0.061)
Midwest          −0.428a      68.431                  −0.513a      67.859
                 (0.122)      (0.094)                 (0.102)      (0.083)
South            −0.505a      −0.077     68.508       −0.606a      −0.093     67.952
                 (0.147)      (0.154)    (0.126)      (0.122)      (0.135)    (0.107)
Panel C: B − A
Northeast        1.598a                               0.555a
                 (0.208)                              (0.168)
Midwest          0.280a       1.318a                  −65.551a     0.432a
                 (0.058)      (0.203)                 (0.046)      (0.165)
South            0.288a       0.008      1.310a       0.139b       0.016      0.416b
                 (0.072)      (0.074)    (0.203)      (0.059)      (0.058)    (0.164)
Observations     3,159        3,032      872          2,158        1,149      564
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: In Panels A and B, the diagonals present the estimated mean heights in each region,
corrected for minimum height requirements with a truncation point of 64 inches, for the type of
selection in the panel title, for measurement age, and for the separate sampling of the two groups
of birth cohorts; standard errors are in parentheses. The off-diagonals present the differences
between the diagonal elements, with standard errors in parentheses. Panel C presents differences
between Panels A and B, with standard errors in parentheses.
Table 11: Tests for differences in levels, sectoral decomposition.
                 UA & RA(b)              RA
                 (1)         (2)         (3)         (4)
Sector           Urban       Rural       Urban       Rural
Panel A: Observables-Corrected
Urban            66.502                  66.848
                 (0.225)                 (0.189)
Rural            −0.779a     67.281      −0.623a     67.471
                 (0.153)     (0.189)     (0.125)     (0.189)
Panel B: Unobservables-Corrected
Urban            68.042                  67.409
                 (0.075)                 (0.060)
Rural            −0.508a     68.550      −0.481a     67.890
                 (0.111)     (0.089)     (0.091)     (0.071)
Panel C: B − A
Urban            1.540a                  0.561a
                 (0.202)                 (0.169)
Rural            0.271a      1.269a      0.142a      0.419a
                 (0.052)     (0.200)     (0.042)     (0.160)
Observations     2,880       4,183       2,290       1,581
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: In Panels A and B, the diagonals present the estimated
mean heights for each sector, corrected for minimum height requirements with a truncation point of 64 inches, for the type
of selection in the panel title, for measurement age, and for
the separate sampling of the two groups of birth cohorts; standard errors are in parentheses. The off-diagonals present the
differences between the diagonal elements, with standard errors
in parentheses. Panel C presents differences between Panels A
and B, with standard errors in parentheses.
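Panel C of Table 11 is the element-wise difference between Panels B and A. The short Python sketch below recomputes the diagonal entries from the reported means as a consistency check; the dictionary keys and function name are illustrative, and standard errors are not recomputed.

```python
# Estimated mean heights (inches) from the diagonals of Table 11.
PANEL_A = {  # observables-corrected
    ("UA & RA(b)", "Urban"): 66.502, ("UA & RA(b)", "Rural"): 67.281,
    ("RA", "Urban"): 66.848, ("RA", "Rural"): 67.471,
}
PANEL_B = {  # unobservables-corrected
    ("UA & RA(b)", "Urban"): 68.042, ("UA & RA(b)", "Rural"): 68.550,
    ("RA", "Urban"): 67.409, ("RA", "Rural"): 67.890,
}
PANEL_C_REPORTED = {  # B - A as printed in the table
    ("UA & RA(b)", "Urban"): 1.540, ("UA & RA(b)", "Rural"): 1.269,
    ("RA", "Urban"): 0.561, ("RA", "Rural"): 0.419,
}


def panel_difference(panel_b, panel_a):
    """Element-wise B - A, rounded to the three decimals used in the table."""
    return {key: round(panel_b[key] - panel_a[key], 3) for key in panel_a}


computed = panel_difference(PANEL_B, PANEL_A)
for key, value in computed.items():
    print(key, value, PANEL_C_REPORTED[key])
```

The recomputed differences agree with the printed Panel C diagonal. The additional correction raises estimated mean height in every cell, by more in the UA & RA(b) columns than in the RA columns.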
Table 12: Overidentification tests for exclusion restrictions.
                              (1)          (2)
Variables                     UA & RA(b)   RA
HH Owns Prop.                 0.111        −0.097
                              (0.088)      (0.118)
HH Real Prop. (1,000)         −0.024b      0.001
                              (0.011)      (0.018)
Related to Head of HH         0.275a       0.276b
                              (0.102)      (0.130)
HH Size                       0.024c       0.027
                              (0.015)      (0.019)
Urban                         −0.192b      −0.247b
                              (0.083)      (0.116)
Attended School               0.026        0.235b
                              (0.076)      (0.115)
Midwest                       0.051        0.117
                              (0.134)      (0.157)
Northeast                     −0.277b      −0.207
                              (0.134)      (0.170)
Buchanan Vote Frac. (1856)    0.138        −0.130
                              (0.278)      (0.331)
Douglas Vote Frac. (1860)     0.131        0.233
                              (0.200)      (0.233)
Observations                  7,299        3,969
Exclusion Restrictions        0.800        1.120
Constant                      67.65        67.07
                              (0.0781)     (0.113)
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Standard errors in parentheses. Dependent variable is
height, measured in inches. These are results of the second-stage
of the correction procedure, including the voting variables. Identification is gained solely from allowing the coefficients of the binary
choice model to differ by cohort group. Occupation indicators and
county-level agricultural variables are included but not reported.
The last row before the constant presents the results of a test of
joint significance of the two vote share variables.
Figures
[Figure 1 plot. x-axis: Year, 1750–2000; left y-axis: Height (Inches); right y-axis: GDP Per Capita (1990 Dollars, Log Scale); series: Height, GDP.]
Figure 1: The Antebellum Puzzle in the United States.
Note: GDP data are smooth and presented in log scale. Height data are of native-born whites. They are decade averages
combined by stitching together the series of Craig (Forthcoming) and Floud et al. (2011). This differs from the standard graph
of Costa and Steckel (1997) because it is updated to reflect Zehetmayer’s (2011) improved estimates of heights for the birth
cohorts of the 1850s–1890s.
Source: GDP data are from Bolt and van Zanden (2013). Height data are from Craig (Forthcoming) and Floud et al. (2011).
[Figure 2 plot. x-axis: Birth Cohort, 1830–1860; y-axis: Number of Observations, 0–1000; series: Union Army, Regular Army, Supp. Sample (UA), Supp. Sample (RA).]
Figure 2: Number of observations by birth cohort.
Note: The three Union Army and Regular Army lines show the sample sizes of the choice-restricted samples. These are
individuals who are known to have enlisted in the military, and whose height is therefore observed. They could also be
successfully linked to the census, and covariates from the census are therefore available for them. The three other lines
represent the sizes of the supplementary samples. These are random samples of the census; their military enlistment status and
height are unknown.
[Figure 3 plots. Panels: (a) Union Army; (b) Regular Army, 1832–1846 Cohorts; (c) Regular Army, 1847–1860 Cohorts. x-axis: Height (Inches), 60–75; y-axis: Density, 0–.3.]
Figure 3: Height distributions.
Note: These figures present histograms and kernel density estimates of the height distributions for the three samples from which
height data are available.
[Figure 4 plot. x-axis: Birth Cohort, 1830–1860; y-axis: Average Age-Adjusted Height (Inches), 66.5–69.5; series: Union + Regular Army, Regular Army.]
Figure 4: Uncorrected trend in average stature.
Note: Both trends are corrected for minimum height requirements with a truncation point of 64 inches, and are standardized
to age 21 through the incorporation of measurement-age dummies. The vertical dotted line indicates the division between the
two groups of birth cohorts. The black line uses the Union Army for coverage of the 1832–1846 birth cohorts, and the Regular
Army sample for the 1847–1860 cohorts. The gray line uses the Regular Army sample for both groups of cohorts. A $\chi^2$-test for
statistical significance of the difference between the 1832–1846 cohorts in each group rejects the null of equality ($\chi^2_{15} = 146.98$,
$p < 0.01$).
[Figure 5 plot. x-axis: enlistment year, 1850–1898; y-axis: Fraction, 0–.15; series: 1832-1846, 1847-1860.]
Figure 5: Enlistment year by subsample of the Regular Army.
Note: This figure depicts the enlistment year of the first enlistment from which height data are taken (the first post-age-21
enlistment if one exists, and the first post-age-18 enlistment otherwise) for each sample. The y-axis presents the fraction of
each subsample enlisting in a particular year.
[Figure 6 plot. x-axis: Birth Cohort, 1830–1860; y-axis: Average Measurement Age, 15–30; series: Union Army, Regular Army.]
Figure 6: Measurement age by birth cohort.
[Figure 7 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Height (Inches), 66–70; series: Linked, Unlinked.]
Figure 7: Height trends of linked and unlinked samples.
Note: These trends compare enlisters who could be linked to census data (the “Linked”) to a random sample collected without
respect to linking (the “Unlinked”). Tests for the statistical significance of differences between the trends are presented in Table 4.
[Figure 8 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Selection Function (Inches), −1 to .2.]
Figure 8: Estimated Ω(·) function by birth cohort.
Note: Each graph plots the coefficients from a regression of the estimated function $\Omega(x_i'\hat{\beta}_k + c_i'\hat{\alpha} + z_i'\hat{\delta}_k)$ on birth year indicators,
weighting by inverse enlistment probability, as well as versions of these smoothed over time. Panel (a) presents the graphs using
the Union Army sample to represent the 1832–1846 birth cohorts, while Panel (b) presents the graphs using the Regular Army
sample to represent the 1832–1846 birth cohorts; both panels use the Regular Army sample to represent the 1847–1860 cohorts.
[Figure 9 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Height (Inches); series: Uncorrected, Observables, Unobservables and Observables.]
Figure 9: Estimated corrected trends by birth cohort.
Note: Each graph plots three trends in average height by birth cohort, smoothed over time. The first, in dotted gray, is the
raw trend, corrected only for age of measurement and minimum-height requirements with a truncation point of 64 inches. The
second, in solid gray, corrects also for selection on observables. The black line corrects for selection on unobservables as well.
Panel (a) presents the graphs using the Union Army sample to represent the 1832–1846 birth cohorts, while panel (b) presents
the graphs using the Regular Army sample to represent the 1832–1846 birth cohorts; both panels use the Regular Army sample
to represent the 1847–1860 cohorts. The underlying trends on which the smoother is based are also included. Tests for statistical
significance of the difference between the unobservables-corrected and observables-corrected trends are as follows. Panel (a):
$\chi^2 = 541.07$, $p < 0.01$. Panel (b): $\chi^2 = 54.66$, $p < 0.01$.
[Figure 10 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Normalized Height (Inches), −1.4 to 0.]
Figure 10: Confidence intervals of corrected decline.
Note: These confidence intervals are computed as follows. The estimation procedure produces an estimate of corrected cohort
average stature for each cohort, together with a variance-covariance matrix of these estimates. For each sample group, I take
250 Monte Carlo draws from this distribution. I then smooth these draws over cohorts and normalize the 1832 value to zero.
Finally, I compute 95 percent pointwise confidence intervals from these draws. The centers of the confidence intervals are the
smoothed trends from Figure 9 corrected for selection on both observables and unobservables.
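The procedure described in this note can be sketched in a few steps: draw cohort means from a multivariate normal, smooth each draw, normalize the first cohort to zero, and take pointwise percentiles. The Python sketch below uses only the standard library and makes several assumptions that are not from the paper: the cohort means and covariance matrix are made up, the smoother is a simple centered moving average, and all function names are illustrative.

```python
import math
import random
import statistics


def cholesky(cov):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def draw_mvn(mean, chol, rng):
    """One draw from N(mean, chol * chol')."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def smooth(series, half_window=1):
    """Centered moving average (an assumed stand-in for the paper's smoother)."""
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half_window), min(len(series), i + half_window + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def pointwise_ci(mean, cov, n_draws=250, seed=0):
    """95 percent pointwise intervals for the smoothed, normalized trend."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    paths = []
    for _ in range(n_draws):
        s = smooth(draw_mvn(mean, chol, rng))
        paths.append([v - s[0] for v in s])  # normalize the first cohort to zero
    bounds = []
    for t in range(len(mean)):
        cuts = statistics.quantiles([p[t] for p in paths], n=40)
        bounds.append((cuts[0], cuts[-1]))   # 2.5th and 97.5th percentiles
    return bounds


# Hypothetical cohort means (inches) and covariance matrix, for illustration only.
mean = [68.4, 68.2, 68.0, 67.9, 67.8]
cov = [[0.04 if i == j else 0.01 for j in range(5)] for i in range(5)]
bounds = pointwise_ci(mean, cov)
print(bounds)
```

Because every draw is normalized to its own first value, the interval for the first cohort collapses to zero, and the remaining intervals describe uncertainty about the decline relative to that cohort.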
[Figure 11 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Height (Inches); series: Uncorrected, Observables, Unobservables and Observables.]
Figure 11: Estimated corrected trends by birth cohort; piecewise-linear representation.
Note: This graph is analogous to Figure 9, but is based on piecewise linear regressions. Tests of statistical significance of the
difference between the unobservables-corrected and observables-corrected trends are given in Table 9.
[Figure 12 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Height (Inches); series: Uncorrected, Observables, Unobservables and Observables.]
Figure 12: Estimated corrected trends by birth cohort, not weighted to correct for selection into linkage.
Note: See Figure 9. These specifications do not weight to correct for selection into linkage on the basis of observable characteristics. Tests for statistical significance of the difference between the unobservables-corrected and observables-corrected trends
are as follows. Panel (a): $\chi^2 = 118.98$, $p < 0.01$. Panel (b): $\chi^2 = 54.65$, $p < 0.01$.
[Figure 13 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Vote Share, 0–1; y-axis: Enlistment Probability; series: Douglas, Buchanan.]
Figure 13: Enlistment probability by vote share.
[Figure 14 plots. Panels: (a) Union & Regular Army; (b) Regular Army. x-axis: Birth Cohort, 1830–1860; y-axis: Height (Inches), 66–70; series: No ER, Baseline, ER.]
Figure 14: Results based on identification at infinity.
Note: “No ER” denotes the trends based on identification at infinity. “ER” denotes the trends based on identification with the
exclusion restrictions (the unobservables-corrected trends from Figure 9). “Baseline” denotes the observables-corrected trends
from Figure 9.
A
Appendix Tables
Table A.1: Summary statistics.
Union Army [UA]
Variable
Height (in)
HH Owns Prop.
HH Real Prop. ($1,000)
Related to Head of HH
HH Size
Urban
Attended School
(1)
(2)
(3)
CR
68.198
(2.584)
0.716
(0.451)
1.778
(3.776)
0.889
(0.315)
7.521
(2.407)
0.297
(0.457)
0.568
(0.495)
Supp.
Diff.
0.466
(0.499)
0.029
(0.167)
0.028
(0.166)
0.138
(0.345)
0.048
(0.213)
0.044
(0.206)
0.006
(0.080)
0.239
(0.427)
0.518
(0.500)
0.038
(0.191)
0.066
(0.248)
0.187
(0.390)
0.059
(0.235)
0.064
(0.245)
0.006
(0.076)
0.063
(0.243)
0.450
(0.498)
0.420
(0.494)
0.130
(0.336)
8.587
(9.299)
0.297
(0.166)
1.269
(1.205)
0.443
(0.142)
0.347
(0.175)
5,750
Regular Army [RA(a)]
1832–1846
(4)
(5)
(6)
Supp.
Diff.
0.693
0.023
(0.461) [0.015]
2.264 −0.486a
(6.096) [0.100]
0.862
0.027a
(0.345) [0.007]
7.408
0.113b
(2.409) [0.050]
0.423 −0.126a
(0.494) [0.026]
0.656 −0.088a
(0.475) [0.021]
CR
67.556
(2.266)
0.588
(0.492)
1.995
(6.408)
0.797
(0.403)
7.387
(2.493)
0.510
(0.500)
0.575
(0.494)
0.684
(0.465)
2.330
(9.864)
0.869
(0.337)
7.512
(2.450)
0.349
(0.477)
0.615
(0.487)
−0.051b
[0.020]
−0.009b
[0.004]
−0.038a
[0.006]
−0.048a
[0.010]
−0.011b
[0.005]
−0.020a
[0.005]
0.001
[0.001]
0.176a
[0.019]
0.299
0.151a
(0.458) [0.023]
0.562 −0.141a
(0.496) [0.025]
0.140 −0.010
(0.347) [0.014]
5.845
2.743a
(6.879) [0.531]
0.285
0.012
(0.178) [0.009]
0.960
0.309a
(1.088) [0.063]
0.428
0.015c
(0.141) [0.008]
0.324
0.023b
(0.185) [0.011]
8,210
Regular Army [RA(b)]
1847–1860
(7)
(8)
(9)
Supp.
−0.096a
[0.014]
−0.335c
[0.181]
−0.072a
[0.010]
−0.125c
[0.064]
0.162a
[0.017]
−0.040a
[0.013]
CR
67.403
(2.103)
0.827
(0.378)
2.192
(6.730)
0.879
(0.326)
6.981
(2.460)
0.655
(0.475)
0.653
(0.476)
0.372
(0.484)
0.052
(0.222)
0.078
(0.269)
0.234
(0.423)
0.072
(0.259)
0.090
(0.286)
0.008
(0.087)
0.094
(0.292)
0.546 −0.174a
(0.498) [0.015]
0.039
0.013b
(0.194) [0.006]
0.061
0.018c
(0.239) [0.010]
0.165
0.068a
(0.372) [0.011]
0.056
0.016b
(0.231) [0.007]
0.060
0.030a
(0.238) [0.007]
0.007
0.001
(0.082) [0.002]
0.065
0.029a
(0.247) [0.008]
0.302
(0.459)
0.059
(0.235)
0.091
(0.287)
0.230
(0.421)
0.105
(0.307)
0.091
(0.288)
0.031
(0.174)
0.090
(0.287)
0.455 −0.152a
(0.498) [0.023]
0.038
0.021a
(0.191) [0.005]
0.072
0.018
(0.259) [0.012]
0.150
0.080a
(0.357) [0.014]
0.099
0.006
(0.299) [0.007]
0.069
0.023a
(0.253) [0.007]
0.049 −0.018a
(0.216) [0.004]
0.068
0.022a
(0.252) [0.008]
0.253
(0.435)
0.595
(0.491)
0.152
(0.359)
5.612
(7.579)
0.264
(0.173)
0.745
(0.928)
0.422
(0.140)
0.299
(0.188)
1,786
0.232
(0.422)
0.435
(0.496)
0.333
(0.471)
5.100
(6.419)
0.294
(0.214)
1.258
(1.219)
0.461
(0.158)
0.270
(0.199)
10,449
0.320
0.334 −0.014
(0.467) (0.472) [0.030]
0.504
0.360
0.144a
(0.500) (0.480) [0.040]
0.177
0.307 −0.130a
(0.381) (0.461) [0.030]
6.693
7.448 −0.755
(9.496) (10.301) [0.588]
0.231
0.265 −0.034b
(0.188) (0.206) [0.015]
0.575
0.858 −0.283a
(0.811) (0.846) [0.049]
0.428
0.469 −0.040a
(0.135) (0.156) [0.010]
0.311
0.286
0.025
(0.187) (0.202) [0.017]
2,417
10,634
0.731
(0.444)
2.255
(7.405)
0.896
(0.305)
7.119
(2.329)
0.484
(0.500)
0.675
(0.469)
Diff.
0.097a
[0.019]
−0.062
[0.150]
−0.017b
[0.008]
−0.139c
[0.072]
0.172a
[0.033]
−0.022c
[0.012]
HH Occupation
Farmer
Professional
Clerical
Skilled and Artisan
Semi-Skilled and Operative
Unskilled
Farm Labor
Unproductive
Birth Region
Midwest
Northeast
South
Wheat Bushels PC
Milk Cows PC
Pigs PC
Buchanan Vote Frac. (1856)
Douglas Vote Frac. (1860)
Observations
0.021
[0.015]
0.160a
[0.018]
−0.181a
[0.018]
0.512b
[0.241]
−0.030a
[0.007]
−0.512a
[0.031]
−0.040a
[0.006]
0.029a
[0.007]
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Standard deviations in parentheses. Standard errors are in square brackets and are clustered at the county level. Sample sizes are
the minimum of the column with observations for all variables. Supplementary samples are random samples of the population from the
census, for which military enlistment status is unobserved. They are intended for comparison to the choice-restricted military sample from
enlistments under their particular headings.
Table A.2: Height regressions, full specification.
(3)
RA(a)
(4)
RA(a)
0.262a
(0.090)
0.028
(0.143)
0.038 −0.000 −0.033
(0.146) (0.153) (0.157)
0.181b
(0.080)
0.168b
(0.080)
0.047
(0.108)
0.039
(0.109)
−0.015 −0.015
(0.010) (0.010)
0.006
(0.010)
0.005 −0.003 −0.005
(0.010) (0.006) (0.006)
−0.006
(0.006)
−0.007
(0.006)
0.002
(0.005)
0.001
(0.005)
0.361b
(0.180)
0.360b −0.012
(0.183) (0.167)
0.001
(0.170)
0.219b
(0.103)
0.240b
(0.104)
0.212
(0.133)
0.216
(0.135)
0.028
(0.022)
0.023c
(0.013)
0.023c
(0.013)
0.002
(0.018)
0.003
(0.019)
Urban
−0.054 −0.051 −0.085 −0.090 −0.313b −0.313b
(0.099) (0.100) (0.156) (0.158) (0.132) (0.137)
−0.169b
(0.079)
−0.161b −0.187c −0.183c
(0.081) (0.109) (0.111)
Attended School
−0.133c −0.139c
(0.075) (0.075)
0.254c
(0.148)
0.225
(0.151)
0.184c
(0.112)
0.195c
(0.115)
−0.030
(0.063)
−0.031
(0.064)
0.235b
(0.101)
0.221b
(0.103)
Farmer
−0.043 −0.022
(0.116) (0.115)
0.500c
(0.277)
0.548c
(0.282)
0.211
(0.206)
0.197
(0.209)
0.030
(0.099)
0.026
(0.100)
0.407b
(0.185)
0.426b
(0.190)
Professional
−0.174 −0.201
(0.267) (0.266)
0.224
(0.353)
0.269
(0.359)
0.179
(0.283)
0.188
(0.285)
0.022
(0.187)
0.010
(0.188)
0.236
(0.238)
0.256
(0.242)
Clerical
−0.477c −0.466c −0.523 −0.493 −0.130 −0.191
(0.278) (0.277) (0.349) (0.353) (0.267) (0.272)
−0.296
(0.183)
−0.338c −0.310 −0.325
(0.185) (0.225) (0.230)
Skilled and Artisan
−0.187 −0.162 −0.040 −0.004 −0.409b −0.446b
(0.142) (0.142) (0.285) (0.289) (0.203) (0.205)
−0.327a
(0.113)
−0.333a −0.175 −0.171
(0.114) (0.191) (0.195)
Semi-Skilled and Operative
−0.132 −0.153 −0.063 −0.061 −0.002
(0.204) (0.202) (0.334) (0.339) (0.262)
0.025
(0.263)
−0.085
(0.152)
−0.075
(0.152)
0.035
(0.216)
0.053
(0.219)
Unskilled
−0.308 −0.300
(0.197) (0.197)
0.045
(0.304)
0.064
(0.312)
0.114
(0.228)
−0.087
(0.145)
−0.091
(0.147)
0.113
(0.201)
0.113
(0.207)
Farm Labor
−0.196 −0.188
(0.540) (0.540)
0.241
(0.883)
0.292 −0.298 −0.362
(0.888) (0.305) (0.314)
−0.362
(0.251)
−0.410 −0.058 −0.084
(0.256) (0.339) (0.348)
Midwest
−0.067 −0.060
(0.145) (0.164)
0.308
(0.240)
0.198 −0.030 −0.080
(0.268) (0.177) (0.208)
−0.065
(0.113)
−0.086
(0.131)
Northeast
−0.647a −0.623a −0.195 −0.257 −0.420b −0.486b
(0.157) (0.164) (0.239) (0.255) (0.183) (0.199)
−0.557a
(0.120)
−0.575a −0.321b −0.374b
(0.128) (0.152) (0.165)
Variables
HH Owns Prop.
HH Real Prop. (1,000)
(1)
UA
0.263a
(0.090)
(2)
UA
(5)
RA(b)
Related to Head of HH
0.401a
(0.127)
0.426a
(0.128)
HH Size
0.019
(0.017)
0.016 −0.013 −0.012
(0.017) (0.028) (0.028)
0.124
(0.224)
0.014 −0.001
(0.009) (0.006)
Wheat Bushels PC
0.005
(0.005)
0.005
(0.005)
Milk Cows PC
0.439b
(0.183)
0.467b −0.217 −0.213
(0.194) (0.420) (0.456)
Pigs PC
0.013
(0.009)
0.026
(0.022)
−0.007 −0.024 −0.029 −0.033
(0.039) (0.040) (0.089) (0.096)
(6)
RA(b)
(7)
UA & RA(b)
(8)
UA & RA(b)
(9)
RA
0.128
(0.153)
0.052
(0.174)
0.000
(0.006)
0.002
(0.004)
0.002
(0.004)
0.098 −0.009
(0.299) (0.314)
0.357b
(0.150)
0.340b −0.073 −0.112
(0.151) (0.260) (0.278)
0.073
(0.105)
0.017
(0.037)
0.007 −0.009 −0.006
(0.038) (0.062) (0.066)
0.072
(0.108)
0.005
(0.006)
(10)
RA
0.006
(0.006)
Buchanan Vote Frac. (1856)
0.311
(0.379)
0.055
(0.515)
−0.452
(0.470)
−0.056
(0.297)
−0.166
(0.356)
Douglas Vote Frac. (1860)
0.022
(0.297)
0.294
(0.393)
0.209
(0.328)
0.166
(0.221)
0.245
(0.263)
Constant
Observations
66.945a
(0.318)
66.809a
(0.377)
66.344a
(0.485)
66.285a
(0.562)
67.134a
(0.417)
67.396a
(0.513)
66.973a
(0.311)
67.037a
(0.354)
66.817a
(0.361)
66.941a
(0.420)
5,475
5,405
1,714
1,691
2,406
2,327
7,881
7,732
4,120
4,018
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. All regressions are adjusted for minimum height requirements with a truncation point of 64 inches. All
specifications include age-of-measurement indicators and year-of-birth indicators. Standard errors are clustered on the county level. UA denotes Union Army. RA
denotes Regular Army; these regressions are weighted to correct for the different sampling of the two groups of cohorts. UA & RA(b) denotes the use of Union Army
data for the 1832–1846 cohorts and the use of Regular Army data for the 1847–1860 cohorts; these regressions are weighted to correct for the different sampling of the
two cohorts. RA(a) denotes Regular Army for the 1832–1846 cohorts. RA(b) denotes Regular Army for the 1847–1860 cohorts.
Table A.3: Selection into linkage, weighted by inverse conditional linkage probability.
                                    (1)       (2)       (3)       (4)          (5)
Variables                           UA        RA(a)     RA(b)     UA & RA(b)   UA & RA(b)
Linked                              0.105b    0.144     0.106     0.097c       0.125b
                                    (0.048)   (0.092)   (0.085)   (0.052)      (0.062)
Observations                        11,894    4,990     5,352     17,246       10,342
χ2-Test of Birth Year FE × Linked   10.34     22.27c    10.52     19.58        29.56
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. Truncated regression is performed to account for minimum
height requirements with a truncation point of 64 inches. All specifications include measurement-age and birth-year
dummy variables. Standard errors are clustered by image for the unlinked sample. The sample includes linked and
unlinked members of the Regular Army and Union Army. UA denotes the Union Army. RA(a) denotes the 1832–
1846 cohorts of the Regular Army. RA(b) denotes the 1847–1860 cohorts of the Regular Army. The coefficients on
linked are from a regression without interactions. The statistics on the interactions are from a separate regression
with interactions.
Table A.4: Second-stage height regressions.
Variables                     (1) UA & RA(b)    (2) RA
HH Owns Prop.                  0.112            −0.109
                              (0.087)           (0.118)
HH Real Prop. (1,000)         −0.024b            0.003
                              (0.011)           (0.018)
Related to Head of HH          0.273a            0.286b
                              (0.102)           (0.130)
HH Size                        0.024             0.025
                              (0.015)           (0.018)
Urban                         −0.192b           −0.253b
                              (0.084)           (0.116)
Attended School                0.023             0.244b
                              (0.076)           (0.115)
Farmer                         0.148             0.169
                              (0.105)           (0.201)
Professional                   0.103             0.059
                              (0.173)           (0.238)
Clerical                      −0.030            −0.125
                              (0.168)           (0.205)
Skilled and Artisan           −0.180            −0.350b
                              (0.120)           (0.167)
Semi-Skilled and Clerical      0.069            −0.031
                              (0.156)           (0.197)
Unskilled                      0.140             0.012
                              (0.153)           (0.192)
Farm Labor                    −0.267            −0.313
                              (0.203)           (0.241)
Midwest                        0.074             0.182
                              (0.126)           (0.145)
Northeast                     −0.272b           −0.173
                              (0.131)           (0.168)
Wheat Bushels PC              −0.006            −0.003
                              (0.006)           (0.006)
Milk Cows PC                  −0.058            −0.225
                              (0.183)           (0.240)
Pigs PC                        0.042             0.106
                              (0.046)           (0.141)
Constant                       67.73             67
                              (0.0781)          (0.113)
Observations                   7,299             3,969
Significance levels: a p<0.01, b p<0.05, c p<0.1
Notes: Dependent variable is height, measured in inches. These
are the results of the second stage of the correction procedure,
and include a non-parametric function of the linear index of the
selection function. All regressions include age-of-measurement
indicators and birth year indicators. Standard errors are corrected for error in the estimation of the linear index of selection.
B Appendix Figures
Figure B.1: Buchanan vote share, 1856, difference from regional means.
Note: Regional boundaries are darkened. The figures presented are differences of the vote share from the regional average.
Black indicates no data.
Source: ICPSR (1999); map from NHGIS.
Figure B.2: Douglas vote share, 1860, difference from regional means.
Note: See Figure B.1.
Source: ICPSR (1999); map from NHGIS.
[Figure B.3 has two panels: (a) Union & Regular Army; (b) Regular Army. Each panel plots Height (Inches) against Birth Cohort (1830 to 1860) for the Linked and Unlinked samples.]
Figure B.3: Height trends of linked and unlinked samples, weighting by inverse conditional linkage probability.
Note: See Figure 7. These graphs are weighted by inverse conditional linkage probability.
[Figure B.4 panels: (a) Union & Regular Army; (b) Regular Army. The panels plot Height (Inches) against Birth Cohort (1830 to 1860) for three series: Uncorrected, Observables, and Unobservables and Observables.]
Figure B.4: Estimated corrected trends by birth cohort including interactions in the second stage.
Note: See Figure 9. These results allow coefficients in the second stage estimation to differ by cohort group so that identification
is only from the voting variables acting as exclusion restrictions. Estimation of the second stage is by Robinson’s (1988) method.
[Figure B.5 panels: (a) Union & Regular Army; (b) Regular Army. The panels plot Height (Inches) against Birth Cohort (1830 to 1860) for three series: Uncorrected, Observables, and Unobservables and Observables.]
Figure B.5: Estimated corrected trends by birth cohort using Robinson’s (1988) method.
Note: See Figure 9. Estimation of the second stage is by Robinson’s (1988) method.
C Correcting for Selection: Formal Arguments
This appendix provides the formal details for the methods discussed in section 3.
C.1 Selection on Observables
In this section I briefly develop the method that I will use to compute weights to correct for selection on
observables under the framework discussed above; this method will be combined with the sample-selection
model below in order to correct for selection on both observables and unobservables together.89
By the law of iterated expectations, the object of interest can be written as
\[
\begin{aligned}
E(h_i \mid c_i) &= \int_X E(h_i \mid c_i, x_i, z_i)\, f(x_i, z_i \mid c_i)\, dw_i \\
&= \int_X E(h_i \mid c_i, x_i)\, f(x_i, z_i \mid c_i)\, dw_i,
\end{aligned}
\tag{C.1}
\]
where equation (C.1) follows from the assumption that height is independent of zi conditional on ci and
xi. If there were no self-selection, either on the basis of observables or unobservables, the left-hand side of
equation (C.1) could be computed trivially from the data; however, the researcher observes E(hi|ci, yi = 1)
and not E(hi|ci) in a selected sample. Moreover, the components of the right-hand side of equation (C.1)
cannot, in general, be directly computed from a sample consisting solely of military enlisters: the researcher
observes E(hi|ci, xi, yi = 1) and not E(hi|ci, xi); but if the selection is only on observables, the assumptions
discussed above imply that
\[
E(h_i \mid c_i, x_i, y_i = 1) = E(h_i \mid c_i, x_i).
\tag{C.2}
\]
What remains as the main pitfall is that selection into military service on the basis of observables implies
that f(xi, zi|ci) ≠ f(xi, zi|ci, yi = 1). That is to say, simply averaging the observed heights within each
birth cohort will not yield consistent estimates of the true heights because the weighting is based on the
distribution of covariates in the selected sample, which differs from that in the population. However, Bayes’s
Theorem implies that
\[
f(x_i, z_i \mid c_i, y_i = 1) = \frac{P(y_i = 1 \mid c_i, x_i, z_i)\, f(x_i, z_i \mid c_i)}{P(y_i = 1 \mid c_i)},
\tag{C.3}
\]
so that
\[
f(x_i, z_i \mid c_i) = \frac{f(x_i, z_i \mid c_i, y_i = 1)\, P(y_i = 1 \mid c_i)}{P(y_i = 1 \mid c_i, x_i, z_i)}
\propto \frac{f(x_i, z_i \mid c_i, y_i = 1)}{P(y_i = 1 \mid c_i, x_i, z_i)}.
\tag{C.4}
\]
Substituting expressions (C.2) and (C.4) into equation (C.1) gives
\[
E(h_i \mid c_i) = \int_X E(h_i \mid c_i, x_i, y_i = 1)\, f(x_i, z_i \mid c_i, y_i = 1)\,
\frac{k_{c_i}}{P(y_i = 1 \mid c_i, x_i, z_i)}\, dw_i,
\tag{C.5}
\]
where kci is the normalizing constant for cohort ci. Note that if the researcher were simply to take the
(unweighted) average height for each birth cohort from a selected sample, he would estimate
\[
E(h_i \mid c_i, y_i = 1) = \int_X E(h_i \mid c_i, x_i, y_i = 1)\, f(x_i, z_i \mid c_i, y_i = 1)\, dw_i
\tag{C.6}
\]
from its sample analog (1/Nci) Σi∈ci hi, where Nci denotes the number of individuals in the sample belonging
to birth cohort ci; because expression (C.6) is equivalent to expression (C.5) save for the inclusion of the
weights kci/P(yi = 1|ci, xi, zi), it is natural to estimate expression (C.5) by its sample analog
\[
\hat{h}(c_i) = \frac{\hat{k}_{c_i}}{N_{c_i}} \sum_{i \in c_i} \frac{h_i}{P(y_i = 1 \mid c_i, x_i, z_i)}.
\tag{C.7}
\]
It can be shown that expression (C.7) is precisely the estimated coefficient on the year-of-birth indicator for
cohort ci when observed heights are regressed on birth-cohort indicators (and no constant), weighting by
inverse conditional enlistment probabilities, thus providing a method to perform this correction.
89 Although consideration of the exclusion-restriction variables zi is unnecessary for the correction of this type of selection,
I include them as they are required for identification in the correction for selection on unobservables.
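The weighting in expression (C.7) can be illustrated with a short sketch. This is not the code used for the estimates in this paper; the function name `ipw_cohort_means` and the toy inputs are hypothetical, and the enlistment probabilities are taken as given rather than estimated. Dividing by the sum of the inverse probabilities plays the role of the normalizing constant.

```python
from collections import defaultdict

def ipw_cohort_means(heights, cohorts, p_enlist):
    """Cohort means weighted by inverse enlistment probability.

    Computes sum(h / p) / sum(1 / p) within each birth cohort; the
    denominator stands in for the normalizing constant k_c / N_c in (C.7).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for h, c, p in zip(heights, cohorts, p_enlist):
        if not 0.0 < p <= 1.0:
            raise ValueError("enlistment probabilities must lie in (0, 1]")
        num[c] += h / p
        den[c] += 1.0 / p
    return {c: num[c] / den[c] for c in num}

# Toy inputs (made-up numbers, for illustration only).
heights = [66.0, 70.0, 68.0]
cohorts = [1840, 1840, 1850]
p_enlist = [0.5, 0.25, 0.1]
means = ipw_cohort_means(heights, cohorts, p_enlist)
```

In the toy example the taller 1840 enlister has the lower enlistment probability, so he receives the larger weight and the weighted mean lies above the unweighted mean of 68.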
C.2 Selection on Unobservables
When selection on unobservables is admitted alongside selection on observables, the arguments made in
equations (C.3) and (C.4) continue to hold, as they did not rely on independence of εi and ui , but rather
were simply an application of Bayes’s theorem. However, expression (C.2) is no longer true. Instead,
\[
\begin{aligned}
E(h_i \mid c_i, x_i, z_i, y_i = 1) &= E[h(c_i, x_i) \mid c_i, x_i, z_i, y_i = 1] + E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1) \\
&= E(h_i \mid c_i, x_i) + E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1),
\end{aligned}
\tag{C.8}
\]
where (C.8) follows from the assumptions regarding ui and εi and the relationship between height and zi.
The analog of equation (C.5) is then
\[
E(h_i \mid c_i) = \int_X \left[ E(h_i \mid c_i, x_i, z_i, y_i = 1) - E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1) \right]
f(x_i, z_i \mid c_i, y_i = 1)\, \frac{k_{c_i}}{P(y_i = 1 \mid c_i, x_i, z_i)}\, dw_i,
\tag{C.9}
\]
which can also be estimated by its sample analog:
\[
\hat{h}(c_i) = \frac{\hat{k}_{c_i}}{N_{c_i}} \sum_{i \in c_i}
\frac{E(h_i \mid c_i, x_i, z_i, y_i = 1) - E(\varepsilon_i \mid c_i, x_i, z_i, y_i = 1)}{P(y_i = 1 \mid c_i, x_i, z_i)}.
\tag{C.10}
\]
Equation (C.10) illustrates the difference between the selection correction that is used here to compute unconditional selection-corrected trends, and the standard selection-correction model, which estimates conditional
trends, E(hi |ci , xi ). The estimated selection-corrected conditional trends must simply be averaged within
cohorts, weighting by the inverse conditional enlistment probability in order to compute selection-corrected
unconditional trends.
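The averaging step in expression (C.10) can be sketched as follows. This is an illustration under stated assumptions, not the estimation code: the function name `corrected_cohort_means` is hypothetical, and the conditional heights, the selection terms, and the enlistment probabilities are all taken as already estimated.

```python
def corrected_cohort_means(h_cond, eps_cond, cohorts, p_enlist):
    """Average (h - selection term) within cohorts, weighting by 1 / p.

    h_cond[i]   : estimate of E(h_i | c_i, x_i, z_i, y_i = 1)
    eps_cond[i] : estimate of E(eps_i | c_i, x_i, z_i, y_i = 1)
    p_enlist[i] : estimate of P(y_i = 1 | c_i, x_i, z_i)
    """
    totals = {}
    for h, e, c, p in zip(h_cond, eps_cond, cohorts, p_enlist):
        num, den = totals.get(c, (0.0, 0.0))
        totals[c] = (num + (h - e) / p, den + 1.0 / p)
    return {c: num / den for c, (num, den) in totals.items()}

# Toy inputs (made-up numbers): a common selection term of one inch.
trend = corrected_cohort_means([67.0, 69.0], [1.0, 1.0], [1840, 1840], [0.5, 0.5])
```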
D Procedure to Link Regular Army Enlisters to the US Censuses
I use standard census linking techniques developed by Abramitzky, Boustan, and Eriksson (2012, 2013, 2014)
and Ferrie (1996, 1997) in order to link the Regular Army samples to the US Censuses. For two reasons,
I use a separate linking procedure for the 1832–1846 cohorts and the 1847–1860 cohorts. First, in order to
ensure that individuals with non-unique combinations of name, year of birth (±4), and state of birth are not
included in linking,90 a complete census index was required. At the time that the 1847–1860 cohorts were
linked, the only readily available 100% index was for 1880, and linkage to this source was thus necessary.
When the 1832–1846 cohorts were linked, full indexes of all censuses had become readily available, making
linkage to 1880 unnecessary. Second, I do not link the 1832–1846 cohorts to 1880 (although doing so would
make the procedures for the two groups more comparable) because doing so would require that an individual
survive the Civil War, leading to possibly severe selection into linkage. The two procedures are as follows.
Procedure D.1. The procedure for linking the 1832–1846 cohorts to the censuses is as follows.
1. I obtained 100 percent samples of the 1850 and 1860 censuses from Ancestry.com (2009a,b), 1850
Census, and 1860 Census.
2. On the basis of first name, last name, state of birth, and year of birth (± 4), I link the 1850 and 1860
census samples to themselves. Any individual for whom another individual similar on these identifying
characteristics existed was removed from the sample.
3. I obtained a 100 percent sample of individuals born 1832–1846 listed in the Register of Enlistments in
the U.S. Army, 1798–1914 (n.d.) from Ancestry.com (2007).
4. Using the same identifying information described in step 2, I linked individuals in the Register of
Enlistments to the remaining individuals from the 1850 and 1860 censuses. Due to the possibility
of multiple enlistments in the lifetime, I permit several individuals in the Register of Enlistments to
match to one individual in the censuses. However, in the event that several individuals in the censuses
are matched to a single individual in the Register of Enlistments, I drop all concerned individuals.
Procedure D.2. The procedure for linking the 1847–1860 cohorts to the censuses is as follows.
1. I obtained a 100 percent sample of the 1880 census from Ruggles et al. (2010).
2. On the basis of first name, last name, state of birth, and year of birth (± 4), I link the 1880 census
sample to itself. Any individual for whom another individual similar on these identifying characteristics
existed was removed from the sample.
3. I obtained a 100 percent sample of individuals born 1847–1860 listed in the Register of Enlistments in
the U.S. Army, 1798–1914 (n.d.) from Ancestry.com (2007).
4. Using the same identifying information described in step 2, I linked individuals in the Register of
Enlistments to the remaining individuals from the 1880 census. Due to the possibility of multiple
enlistments in the lifetime, I permit several individuals in the Register of Enlistments to match to one
individual in the census. However, in the event that several individuals in the 1880 census are matched
to a single individual in the Register of Enlistments, I drop all concerned individuals.91
5. I match individuals in the Register of Enlistments who are matched to the 1880 census to the Register
of Enlistments once more (using stricter criteria), thus bringing in additional enlistments. Multiple
matches from the 1880 census to the Register of Enlistments are once again omitted.
6. I then link the 1880 individuals who were linked to the Register of Enlistments to the 1860 and 1870
censuses (gathered from Ancestry.com, 2009b,c, 1860 Census, and 1870 Census) using the same criteria.
Tables D.1 and D.2 present the numbers of individuals included in the sample at each stage.
90 This helps to reduce the probability of a spurious match.
91 This linkage is performed because the only fully-available census index at the time that the linkage was performed was
that for 1880. Linkage to 1880 ensures that only individuals with unique identifying information survive into the linkage to the
1870 and earlier census, ensuring a higher certainty of links to the earlier census.
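The core of the two linking procedures (removing non-unique census records, then matching enlistments on name, state of birth, and year of birth within four years) can be sketched as below. This is a simplified illustration, not the procedure's actual implementation: the record keys `first`, `last`, `state`, and `year` and both function names are hypothetical, names are compared by exact case-insensitive equality rather than by similarity, and an enlistment with several census candidates is dropped without also removing the census records concerned.

```python
from collections import defaultdict

def block_key(rec):
    # Records are grouped on name and state of birth.
    return (rec["first"].lower(), rec["last"].lower(), rec["state"])

def drop_non_unique(records, window=4):
    """Remove every record that has a namesake from the same state born within +/- window years."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec)
    keep = []
    for recs in blocks.values():
        for r in recs:
            clash = any(o is not r and abs(o["year"] - r["year"]) <= window for o in recs)
            if not clash:
                keep.append(r)
    return keep

def link(enlistments, census, window=4):
    """Return (enlistment, census) pairs.

    Several enlistments may link to one census record (repeat enlistments);
    an enlistment with more than one census candidate is dropped.
    """
    blocks = defaultdict(list)
    for rec in census:
        blocks[block_key(rec)].append(rec)
    links = []
    for e in enlistments:
        cands = [c for c in blocks.get(block_key(e), [])
                 if abs(c["year"] - e["year"]) <= window]
        if len(cands) == 1:
            links.append((e, cands[0]))
    return links

def person(first, last, state, year):
    return {"first": first, "last": last, "state": state, "year": year}

# Toy records (invented for illustration).
census = [person("John", "Smith", "OH", 1840), person("John", "Smith", "OH", 1842),
          person("Tom", "Jones", "NY", 1836), person("Tom", "Jones", "NY", 1844),
          person("Amos", "Reed", "PA", 1838)]
enlistments = [person("Amos", "Reed", "PA", 1839), person("Amos", "Reed", "PA", 1840),
               person("Tom", "Jones", "NY", 1840), person("John", "Smith", "OH", 1841)]
unique_census = drop_non_unique(census)
links = link(enlistments, unique_census)
```

In the toy data the two John Smiths are removed as non-unique, both Amos Reed enlistments link to the same census record, and the Tom Jones enlistment is dropped because two census records fall within four years of his reported birth year.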
Table D.1: Sample sizes at each stage of linking for 1832–1846 cohorts.
(1)        (2)                      (3) Sample Size                  (4) % of Previous
Step No.   Description              (a) Census   (b) Enlistments     (a) Census   (b) Enlistments
3          Enlist Full                           157,429
4          1850 Link                16,996       24,449                           15.53%
4          1860 Link                14,998       21,773                           13.83%

Table D.2: Sample sizes at each stage of linking for 1847–1860 cohorts.
(1)        (2)                      (3) Sample Size                  (4) % of Previous
Step No.   Description              (a) Census   (b) Enlistments     (a) Census   (b) Enlistments
3          Enlist Full                           93,085
4&5        1880-Enlistments Link    14,343       18,802                           20.20%
6          1860 Link                3,133        4,611               21.84%       24.52%
6          1870 Link                3,129        4,632               21.82%       24.64%
E Estimation Details
This appendix provides the details of the estimation procedure described in section E.
E.1 Adapting Klein and Spady’s (1993) Estimator
Klein and Spady (1993) develop a method to estimate a model of the form of equation (4) in a simple random
sample of a population. The sample used in the present research differs from such a sample in two ways.
First, the sample is a choice-restricted sample with a supplementary sample (as discussed by Cosslett, 1981),
rather than a simple random sample. Second, the sample is composed of two distinct subsamples that are
sampled separately: the 1832–1846 cohorts and their supplementary sample, and the 1847–1860 cohorts and
their supplementary sample.
Klein and Spady’s (1993) estimator is a maximum likelihood estimator, where the likelihood function
takes the usual form for binary choice models, and where P(yi = 1|xi′β̂k + ci′α̂ + zi′δ̂k) is estimated using
the leave-one-out Nadaraya (1964) and Watson (1964) (NW) estimator.
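To adapt this estimator to the sample available in the present context, it is useful to define the variable
si as
\[
s_i =
\begin{cases}
0 & \text{for members of the 1847--1860 cohorts} \\
1 & \text{for members of the 1832--1846 cohorts}
\end{cases}
\]
and ψi := xi′β̂k + ci′α̂ + zi′δ̂k. Let λ(·) denote the probability density function of ψi. I then use Bayes’s
Theorem and the law of total probability to write P(yi = 1|ci, xi, zi) in terms of objects that can be learned
from the available sample:
\[
\begin{aligned}
P(y_i = 1 \mid \psi_i) &= P(y_i = 1 \mid \psi_i, s_i = 0) P(s_i = 0 \mid \psi_i) + P(y_i = 1 \mid \psi_i, s_i = 1) P(s_i = 1 \mid \psi_i) \\
&= \frac{\lambda(\psi_i \mid y_i = 1, s_i = 0) P(y_i = 1 \mid s_i = 0)}{\lambda(\psi_i \mid s_i = 0)} P(s_i = 0 \mid \psi_i)
 + \frac{\lambda(\psi_i \mid y_i = 1, s_i = 1) P(y_i = 1 \mid s_i = 1)}{\lambda(\psi_i \mid s_i = 1)} P(s_i = 1 \mid \psi_i) \\
&= \frac{\lambda(\psi_i \mid y_i = 1, s_i = 0) P(y_i = 1 \mid s_i = 0)}{\lambda(\psi_i \mid s_i = 0)} \cdot \frac{\lambda(\psi_i \mid s_i = 0) P(s_i = 0)}{\lambda(\psi_i)}
 + \frac{\lambda(\psi_i \mid y_i = 1, s_i = 1) P(y_i = 1 \mid s_i = 1)}{\lambda(\psi_i \mid s_i = 1)} \cdot \frac{\lambda(\psi_i \mid s_i = 1) P(s_i = 1)}{\lambda(\psi_i)} \\
&= \frac{\lambda(\psi_i \mid y_i = 1, s_i = 0) P(y_i = 1 \mid s_i = 0) P(s_i = 0)}{\lambda(\psi_i \mid s_i = 0) P(s_i = 0) + \lambda(\psi_i \mid s_i = 1) P(s_i = 1)}
 + \frac{\lambda(\psi_i \mid y_i = 1, s_i = 1) P(y_i = 1 \mid s_i = 1) P(s_i = 1)}{\lambda(\psi_i \mid s_i = 0) P(s_i = 0) + \lambda(\psi_i \mid s_i = 1) P(s_i = 1)}.
\end{aligned}
\tag{E.1}
\]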
Every portion of equation (E.1) can either be non-parametrically estimated from the available data, or can
be deduced from aggregate statistics. The distribution of the linear index in the military service sample,
λ(ψi |yi = 1, ·), can be learned from each of the choice-restricted subsamples. The distribution of this same
index in the population, λ(ψi |·) can be learned from each of the supplemental samples. The aggregate
enlistment probabilities, P (yi = 1|·) are given in Table 6. Finally, P (si = 0) and P (si = 1) can be learned
from aggregate data.
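As a minimal sketch of how the pieces of equation (E.1) combine at a single value of the index, consider the function below. The name `enlist_prob` and the numerical inputs are hypothetical; in practice the densities would be estimated non-parametrically and the remaining terms taken from aggregate statistics.

```python
def enlist_prob(lam_enlisters, lam_population, p_y1_given_s, p_s):
    """Evaluate the final line of (E.1) at one value of psi.

    lam_enlisters[s]  : density of psi among enlisters in cohort group s
    lam_population[s] : density of psi in the population of cohort group s
    p_y1_given_s[s]   : aggregate enlistment probability in cohort group s
    p_s[s]            : population share of cohort group s
    """
    groups = (0, 1)
    num = sum(lam_enlisters[s] * p_y1_given_s[s] * p_s[s] for s in groups)
    den = sum(lam_population[s] * p_s[s] for s in groups)
    return num / den

# Made-up inputs: enlisters are twice as dense as the population at this psi.
prob = enlist_prob({0: 0.2, 1: 0.4}, {0: 0.1, 1: 0.2}, {0: 0.02, 1: 0.45}, {0: 0.6, 1: 0.4})
```

A useful sanity check on this form: if the index is distributed identically among enlisters and in the population, and both cohort groups share one aggregate enlistment rate, the expression returns that rate.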
In order to discuss the estimation procedure, it is convenient to define an indicator for whether individual
i is a member of the choice-restricted or supplementary sample. Define
\[
\tilde{y}_i =
\begin{cases}
1 & \text{for members of the choice-restricted sample} \\
0 & \text{for members of the supplementary sample.}
\end{cases}
\]
Observations for which ỹi = 1 make it possible to learn the terms in equation (E.1) that are conditional on
yi = 1, while those for which ỹi = 0 make it possible to learn the terms that do not condition on yi = 1 and
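which are not learned from aggregate data. I adapt the NW estimator and estimate equation (E.1) with the
statistic
\[
\widehat{P}(y_i = 1 \mid \psi_i) =
\frac{
\begin{aligned}
& P(y_i = 1 \mid s_i = 0) P(s_i = 0) \Big[ \sum_j \tilde{y}_j (1 - s_j) \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j (1 - s_j) \\
& + P(y_i = 1 \mid s_i = 1) P(s_i = 1) \Big[ \sum_j \tilde{y}_j s_j \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j s_j
\end{aligned}
}{
\begin{aligned}
& P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
& + P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j
\end{aligned}
},
\tag{E.2}
\]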
where ω is a bandwidth and K(·) is a kernel function.92 I use this statistic rather than estimating the model
separately for each of the two subpopulations in order to ensure that the estimates of β and α created by this
procedure are comparable across subpopulations and can thus be used in the selection model. Thus, rather
than simply maximizing the likelihood (E.5), defined below, separately for each sample, I maximize the sum
of the likelihoods for each of the samples, estimating the choice probabilities by (E.2); to accommodate the
separate sampling of the two groups of birth cohorts, I weight the likelihood, so that the final likelihood
function becomes
\[
L = L_{\{s_i = 0\}}\big( [\beta' \;\; \alpha' \;\; \delta']' \big) \times \frac{P(s_i = 0)}{\Xi(s_i = 0)}
  + L_{\{s_i = 1\}}\big( [\beta' \;\; \alpha' \;\; \delta']' \big) \times \frac{P(s_i = 1)}{\Xi(s_i = 1)},
\tag{E.3}
\]
where L{si =j} (·) is the likelihood function (E.5) for sample si = j, j ∈ {0, 1}, P (si = j) is the population
proportion, and Ξ(si = j) is the sample proportion.
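The adapted leave-one-out statistic in expression (E.2) can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions, not the code used for the paper: the function names and argument layout are hypothetical, the kernel is Gaussian, and the aggregate probabilities are passed in as known constants indexed by cohort group.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def nw_enlist_prob(psi, y_tilde, s, p_y1_given_s, p_s, omega):
    """Leave-one-out estimate of P(y = 1 | psi_i) with the structure of (E.2).

    psi          : index values for all observations (both samples)
    y_tilde      : 1 for the choice-restricted sample, 0 for the supplementary sample
    s            : cohort-group indicator (0 or 1)
    p_y1_given_s : aggregate enlistment probabilities, indexed by group
    p_s          : population shares of the groups, indexed by group
    omega        : bandwidth
    """
    psi, y_tilde, s = (np.asarray(a) for a in (psi, y_tilde, s))
    K = gaussian_kernel((psi[:, None] - psi[None, :]) / omega)
    np.fill_diagonal(K, 0.0)  # leave-one-out: exclude j == i from every sum
    num = np.zeros(len(psi))
    den = np.zeros(len(psi))
    for g in (0, 1):
        enl = (y_tilde == 1) & (s == g)  # enlisters in group g
        pop = (y_tilde == 0) & (s == g)  # supplementary sample in group g
        num += p_y1_given_s[g] * p_s[g] * K[:, enl].sum(axis=1) / enl.sum()
        den += p_s[g] * K[:, pop].sum(axis=1) / pop.sum()
    return num / den
```

The kernel sums are scaled by the size of each subsample, as in (E.2), so only the aggregate probabilities carry information about how common enlistment is in the population.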
Klein and Spady (1993) also suggest the use of a trimming function, though they report that the particular
function is empirically unimportant. When estimating, I first assume that G(·) is normally distributed and
estimate a probit model. I then compute a kernel density estimate of the estimated value of xi′βk + ci′α + zi′δk,
and trim from the estimation (that is, exclude from the likelihood function) individuals for whom the density
falls below 0.005.
E.2 Cosslett’s (1981) Likelihood
The likelihood function also requires adaptation to the structure of the sample. Before proceeding further,
it is useful to introduce some additional notation:
• S = {0, 1}: the set of options for each individual in the sample, where 0 denotes never enlisting and 1
denotes enlisting in the military at some point in the lifetime
• N : the number of individuals in the choice-restricted (military enlister) sample
• N0 : the number of individuals in the supplementary (general population) sample
• Hs = N0/N
• Qj: the proportion of each choice j ∈ S in the population
• Hj: the proportion of each choice j ∈ S in the choice-restricted sample
• ηj = Hj/Qj
I denote the choice-restricted sample by i ∈ {1, . . . , N} and the supplementary sample by i ∈ {N + 1, . . . , N + N0}.
92 For estimation, I use a Gaussian kernel.
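Models of this type are studied by Cosslett (1981), who provides a maximum-likelihood estimator.
\[
\begin{aligned}
\begin{bmatrix} \hat{\beta} \\ \hat{\alpha} \\ \hat{\delta} \end{bmatrix}
&= \arg\max_{[\beta' \; \alpha' \; \delta']'} L\big( [\beta' \;\; \alpha' \;\; \delta']' \big) \\
&= \arg\max_{[\beta' \; \alpha' \; \delta']'} \sum_{i=1}^{N} \log \left\{ \frac{\eta_1 P(y_i = 1 \mid c_i, x_i, z_i)}{\sum_{j \in S} \eta_j P(y_i = j \mid c_i, x_i, z_i) + H_s} \right\}
 - \sum_{i=N+1}^{N+N_0} \log \left\{ \sum_{j \in S} \eta_j P(y_i = j \mid c_i, x_i, z_i) + H_s \right\}.^{93}
\end{aligned}
\tag{E.4}
\]
Since the choice-restricted sample contains only enlisters, η1 = 1/Q1 and η0 = 0/Q0 = 0. Thus, the pseudo-log-
likelihood function in expression (E.4) reduces to
\[
L\big( [\beta' \;\; \alpha' \;\; \delta']' \big) = \sum_{i=1}^{N} \log \left[ \frac{\eta_1 P(y_i = 1 \mid c_i, x_i, z_i)}{\eta_1 P(y_i = 1 \mid c_i, x_i, z_i) + H_s} \right]
 - \sum_{i=N+1}^{N+N_0} \log \left\{ \eta_1 P(y_i = 1 \mid c_i, x_i, z_i) + H_s \right\}.^{94}
\tag{E.5}
\]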
Finally, because Q1 differs between the two groups of cohorts, I maximize the sum of equation (E.5) evaluated
separately for the two subsamples, weighting the sums to account for the separate sampling of each group
of birth cohorts, as described above.
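For a single group of cohorts, the pseudo-log-likelihood in expression (E.5) can be evaluated as in the sketch below. The function name `cosslett_loglik` and its arguments are hypothetical, and the choice probabilities are taken as given; in the estimation they would come from the adapted NW statistic and would depend on the coefficients.

```python
import math

def cosslett_loglik(p_enlisters, p_supplementary, q1):
    """Pseudo-log-likelihood of (E.5) for one group of cohorts.

    p_enlisters     : P(y = 1 | c, x, z) for the N choice-restricted observations
    p_supplementary : P(y = 1 | c, x, z) for the N0 supplementary observations
    q1              : population share of enlisters, so that eta_1 = 1 / q1
    """
    eta1 = 1.0 / q1
    hs = len(p_supplementary) / len(p_enlisters)  # H_s = N0 / N
    ll = sum(math.log(eta1 * p / (eta1 * p + hs)) for p in p_enlisters)
    ll -= sum(math.log(eta1 * p + hs) for p in p_supplementary)
    return ll

# Made-up one-observation samples: log(1/2) - log(2).
value = cosslett_loglik([0.5], [0.5], q1=0.5)
```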
E.3 The Gradient Matrix
When the Gaussian kernel is used, the gradient matrix of the estimated probability with respect to the
first-stage coefficients Θ1 is given by
\[
\frac{\partial \widehat{P}(y_i = 1 \mid \psi_i)}{\partial \Theta_{1k}} = \frac{\widehat{P}(y_i = 1 \mid \psi_i)\,(A + B - C - D)}{E + F}
\]
93 Cosslett’s (1981) estimator has also been applied by Domowitz and Sartain (1999). Ben-Akiva et al. (2007) and Steinberg
and Cardell (1992) suggest an alternative maximum likelihood estimator, which is simpler to implement than Cosslett’s (1981),
but inefficient. I use Cosslett’s (1981) estimator in order to preserve efficiency.
94 Location and scale normalizations are required. To this end, I omit a constant and require that
\[
\begin{bmatrix} \beta' & \alpha' & \delta' \end{bmatrix}
\begin{bmatrix} \beta \\ \alpha \\ \delta \end{bmatrix} = 1.
\]
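where
\[
\begin{aligned}
A &= P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} \frac{\psi_i - \psi_j}{\omega} \cdot \frac{x_{ik} - x_{jk}}{\omega}\, K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
B &= P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} \frac{\psi_i - \psi_j}{\omega} \cdot \frac{x_{ik} - x_{jk}}{\omega}\, K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j \\
C &= P(y_i = 1 \mid s_i = 0) P(s_i = 0) \Big[ \sum_j \tilde{y}_j (1 - s_j) \Big]^{-1} \sum_{j \neq i} \frac{\psi_i - \psi_j}{\omega} \cdot \frac{x_{ik} - x_{jk}}{\omega}\, K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j (1 - s_j) \\
D &= P(y_i = 1 \mid s_i = 1) P(s_i = 1) \Big[ \sum_j \tilde{y}_j s_j \Big]^{-1} \sum_{j \neq i} \frac{\psi_i - \psi_j}{\omega} \cdot \frac{x_{ik} - x_{jk}}{\omega}\, K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j s_j \\
E &= P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
F &= P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j
\end{aligned}
\]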
and xik may be substituted by any element of xi , ci , or zi as appropriate.
E.4 Computing Partial Effects
The semi-elasticities presented in Table 7 are computed as follows. Differentiating expression (E.2) with
respect to xik yields the partial effect for individual i for a continuous covariate:
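\[
\begin{aligned}
\frac{\partial \log[\widehat{P}(y_i = 1 \mid \psi_i)]}{\partial x_{ik}}
= {} & \frac{1}{\widehat{P}(y_i = 1 \mid \psi_i)} \times \frac{\beta_k}{\omega} \times \Bigg[
\frac{
\begin{aligned}
& P(y_i = 1 \mid s_i = 0) P(s_i = 0) \Big[ \sum_j \tilde{y}_j (1 - s_j) \Big]^{-1} \sum_{j \neq i} K'\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j (1 - s_j) \\
& + P(y_i = 1 \mid s_i = 1) P(s_i = 1) \Big[ \sum_j \tilde{y}_j s_j \Big]^{-1} \sum_{j \neq i} K'\Big( \frac{\psi_i - \psi_j}{\omega} \Big) \tilde{y}_j s_j
\end{aligned}
}{
\begin{aligned}
& P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
& + P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j
\end{aligned}
} \\
& - \widehat{P}(y_i = 1 \mid \psi_i) \times
\frac{
\begin{aligned}
& P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} K'\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
& + P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} K'\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j
\end{aligned}
}{
\begin{aligned}
& P(s_i = 0) \Big[ \sum_j (1 - \tilde{y}_j)(1 - s_j) \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j)(1 - s_j) \\
& + P(s_i = 1) \Big[ \sum_j (1 - \tilde{y}_j) s_j \Big]^{-1} \sum_{j \neq i} K\Big( \frac{\psi_i - \psi_j}{\omega} \Big) (1 - \tilde{y}_j) s_j
\end{aligned}
}
\Bigg].
\end{aligned}
\tag{E.6}
\]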
For a discrete covariate, the partial effect for individual i is calculated as
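\[
\frac{\Delta \log[\widehat{P}(y_i = 1 \mid \psi_i)]}{\Delta x_{ik}} = \frac{1}{\widehat{P}(y_i = 1 \mid \psi_i)} \times
\left[ \widehat{P}(y_i = 1 \mid \psi_i) \Big|_{x_{ik} = 1} - \widehat{P}(y_i = 1 \mid \psi_i) \Big|_{x_{ik} = 0} \right];
\tag{E.7}
\]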
that is, the difference between the estimated probability of enlistment for each of the two possible values of
xik .
The numbers presented in Table 7 are averages of expressions (E.6) and (E.7) across the supplementary
samples, thus representing the average marginal effect of the covariate on enlistment in the whole population.
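The averaging of expression (E.7) for a discrete covariate can be sketched as below. The function name, the representation of individuals as dictionaries, and the toy probability function are all hypothetical; the sketch only illustrates the difference-and-average step, with the estimated probability function supplied by the caller.

```python
def avg_discrete_semi_elasticity(prob, rows, k):
    """Average of (E.7) over rows: [P(x_k = 1) - P(x_k = 0)] / P(x as observed).

    prob : callable mapping a covariate dictionary to an estimated probability
    rows : covariate dictionaries for members of the supplementary sample
    k    : name of the binary covariate
    """
    effects = []
    for x in rows:
        p_one = prob({**x, k: 1})
        p_zero = prob({**x, k: 0})
        effects.append((p_one - p_zero) / prob(x))
    return sum(effects) / len(effects)

# Toy probability function (made up): urban residence raises enlistment by 0.05.
toy_prob = lambda x: 0.1 + 0.05 * x["urban"]
effect = avg_discrete_semi_elasticity(toy_prob, [{"urban": 0}, {"urban": 1}], "urban")
```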
F Computing Standard Errors
The estimation consists of five distinct estimation steps (including the estimation of the constant of the
second-stage regression), and standard errors must be computed at each stage. The computation proceeds
as follows.
F.1 First Stage: Binary Choice Model
Computation of the standard errors is straightforward in this case, as the standard maximum likelihood
standard errors can be used (Klein and Spady, 1993, p. 404). In particular, I estimate the first-stage
variance-covariance matrix as Σ11 = Â⁻¹, where
\[
\hat{A} = \sum_{i=1}^{M} \frac{\partial L_i}{\partial \Theta_1} \frac{\partial L_i}{\partial \Theta_1'},
\]
M is the sample size for this stage, Θ1 are the coefficients for this stage, Li is as in expression (E.3), and
Σ11 is of dimension k-by-k.
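Given the per-observation gradients, this outer-product form of the first-stage variance can be computed in a few lines. The sketch below assumes the gradients have already been stacked into an M-by-k array; the function name `opg_covariance` is hypothetical.

```python
import numpy as np

def opg_covariance(scores):
    """Inverse of the summed outer products of the per-observation gradients.

    scores : M-by-k array whose i-th row is the gradient of L_i with respect to Theta_1
    """
    scores = np.asarray(scores, dtype=float)
    a_hat = scores.T @ scores  # sum over i of g_i g_i'
    return np.linalg.inv(a_hat)

# Toy gradients (made up) for M = 3 observations and k = 2 coefficients.
sigma_11 = opg_covariance([[1.0, 0.0], [0.0, 2.0], [1.0, 0.0]])
```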
F.2 Second Stage: Selection-Corrected Regression
The standard errors for the second stage can be computed using the method of Newey (2009, p. S221). Let M
denote the sample size, and let Θ2 (of dimension ℓ-by-1) be the coefficients. The variance-covariance matrix
in this case is computed as
\[
\Sigma_{22} = \Phi^{-1} [\Lambda + \Psi \Sigma_{11} \Psi'] \Phi^{-1}.
\]
The definitions of the terms are as follows.
\[
\Phi = \frac{1}{M} \sum_{i=1}^{M} w_i (X_i - \hat{X}_i)' (X_i - \hat{X}_i) = (X - \hat{X})' W (X - \hat{X}),
\]
where Xi are the regressors in the second stage, wi are weights, X̂i are the estimates of the expected values
of the independent variables conditional on the spline of the linear index of the first stage, and X, W, and
X̂, respectively, are matrix representations of these.
\[
\Lambda = \frac{1}{M} \sum_{i=1}^{M} w_i^2 (X_i - \hat{X}_i)' (X_i - \hat{X}_i)\, \hat{\xi}_i^2,
\]
where ξ̂i = hi − XiΘ̂2 − Ω̂(Zi′Θ̂1), Ω̂(·) is the estimated selection function (from the spline), and Zi are the
variables entering into the first stage.
\[
\Psi = \frac{1}{M} \sum_{i=1}^{M} w_i (X_i - \hat{X}_i)' \frac{\partial \hat{\Omega}(Z_i' \hat{\Theta}_1)}{\partial Z_i' \hat{\Theta}_1} Z_i.
\]
F.2.1 The Intercept
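Andrews and Schafgans (1998, p. 505) show that the estimated intercept has variance
\[
\frac{\hat{\sigma}_n^2 \sum_{i=1}^{M} \Gamma^2(Z_i \Theta_1)}{\left[ \sum_{i=1}^{M} \Gamma(Z_i \Theta_1) \right]^2},
\]
where
\[
\hat{\sigma}_n^2 = \frac{\sum_{i=1}^{M} (h_i - \hat{\mu}_n - X_i \Theta_2)^2\, \Gamma(Z_i \Theta_1)}{\sum_{i=1}^{M} \Gamma(Z_i \Theta_1)},
\]
so that the final estimator is
\[
\sigma_\mu^2 = \frac{\sum_{i=1}^{M} \Gamma^2(Z_i \Theta_1) \sum_{i=1}^{M} (h_i - \hat{\mu}_n - X_i \Theta_2)^2\, \Gamma(Z_i \Theta_1)}{\left[ \sum_{i=1}^{M} \Gamma(Z_i \Theta_1) \right]^3}.
\]
F.3 Correction for Selection on Observables Only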
For the correction on observables only, I use the estimator of Wooldridge (2002, pp. 361–362) for the variance
of two-step m-estimators. The model in this case is
hi = ci Θo + εi ,
estimated with weights and correcting for minimum height requirements using truncated regression with
likelihood function given in expression (11). Let Ki denote individual i’s log-likelihood at this stage. The
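generated components of this estimation are the weights wi. The variance is computed as
\[
\Sigma_o = \frac{M}{M - T} \left( \sum_{i=1}^{M} \hat{H}_i \right)^{-1} \left( \sum_{i=1}^{M} \hat{g}_i \hat{g}_i' \right) \left( \sum_{i=1}^{M} \hat{H}_i \right)^{-1}.
\]
The components of Σo are computed as follows in this case.
\[
\hat{H}_i = w_i \frac{\partial K_i}{\partial \Theta_o} \frac{\partial K_i}{\partial \Theta_o'}, \qquad
\hat{g}_i = \hat{s}_i + \hat{F} \hat{r}_i, \qquad
\hat{s}_i = w_i \frac{\partial K_i}{\partial \Theta_o},
\]
\[
\hat{F} = M^{-1} \sum_{i=1}^{M} \nabla_{\Theta_1} \hat{s}_i = M^{-1} \sum_{i=1}^{M} (\nabla_{\Theta_1} w_i) \frac{\partial K_i}{\partial \Theta_o}, \qquad
\hat{r}_i = \sqrt{M}\, \hat{A}^{-1} \frac{\partial L_i}{\partial \Theta_1}.
\]
The weights are wi = w̃i/G(ZiΘ1) (where w̃i are the first-stage weights, which are not estimated, and G(ZiΘ1)
are the estimated enlistment probabilities), so the matrix F̂ is
\[
-M^{-1} \sum_{i=1}^{M} \frac{\nabla_{\Theta_1} G(Z_i \Theta_1)}{G(Z_i \Theta_1)}\, \hat{s}_i.
\]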
The gradient matrix ∇Θ1 G(Zi Θ1 ) is as in Appendix E.3.
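Once the components have been computed, assembling the sandwich form of Σo is mechanical. The sketch below is an illustration only: the function name, the array layout, and the reading of T as a degrees-of-freedom adjustment passed in by the caller are assumptions, since T is not defined in the text above.

```python
import numpy as np

def two_step_sandwich(H, s, r, F, T):
    """M / (M - T) * (sum H_i)^-1 (sum g_i g_i') (sum H_i)^-1, with g_i = s_i + F r_i.

    H : (M, p, p) array of per-observation H_i
    s : (M, p) array of per-observation scores s_i
    r : (M, k) array of first-stage terms r_i
    F : (p, k) matrix linking first-stage error to the second-step scores
    T : degrees-of-freedom adjustment (assumed to be supplied by the caller)
    """
    H, s, r, F = (np.asarray(a, dtype=float) for a in (H, s, r, F))
    M = s.shape[0]
    g = s + r @ F.T                      # row i holds g_i' = (s_i + F r_i)'
    bread = np.linalg.inv(H.sum(axis=0))
    meat = g.T @ g                       # sum over i of g_i g_i'
    return M / (M - T) * bread @ meat @ bread

# Toy scalar example (made-up numbers): M = 3, p = k = 1, T = 1.
sigma_o = two_step_sandwich(
    H=[[[1.0]], [[1.0]], [[2.0]]],
    s=[[1.0], [0.0], [-1.0]],
    r=[[0.0], [1.0], [0.0]],
    F=[[2.0]],
    T=1,
)
```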
F.4 Correction for Selection on Unobservables
When correcting for selection on unobservables, no further correction for truncation is required, and so
estimation is by OLS, with a generated dependent variable (ĥi + ξ̂i) and generated weights. The variance
Σu is then
\[
\Sigma_u = \frac{M}{M - T} \left( \sum_{i=1}^{M} \hat{H}_i \right)^{-1} \left( \sum_{i=1}^{M} \hat{g}_i \hat{g}_i' \right) \left( \sum_{i=1}^{M} \hat{H}_i \right)^{-1}.
\]
The components of Σu are computed as follows in this case.
\[
\hat{H}_i = w_i c_i' c_i, \qquad
\hat{g}_i = \hat{s}_i + \hat{F} \hat{r}_i, \qquad
\hat{s}_i = w_i c_i' (\hat{h}_i + \hat{\xi}_i - c_i \Theta_u).
\]
F̂ = M −1
M
X
c0i ∇Θ,µ wi (ĥi + ûi − ci Θu ) + M −1
i=1
and

√
r̂i =
M
X
c0i wi ∇Θ,µ (ĥi + ξˆi )
i=1
∂Li
Â−1 ∂Θ
1





−1 P
 PM

M

Γ(Z
Θ
)
(h
−
µ̂
−
X
Θ
)Γ(Z
Θ
)
M
i 1
n
i 2
i 1 
i=1
i=1 i





h
i
∂Li
Φ−1 (Xi − X̂i )w̃i ξˆi + ΨÂ−1 ∂Θ
1
where ξ̂i = hi − XiΘ̂2 − Ω̂(Zi′Θ̂1) and w̃i are the weights from the second-stage regression.95
F.5 Covariance of Estimated Trends
It is possible to test for the statistical significance of differences between the raw and corrected trends
by estimating the raw, observables-corrected and unobservables-corrected trends in a seemingly unrelated
regression. When correcting for truncation, the three models are estimated by maximum likelihood
(truncated for the raw and observables-corrected trends, and ordinary normal maximum likelihood for
the unobservables-corrected trends), and standard errors can be calculated similarly. This also slightly alters
the estimation of the matrix F̂, with ∇Θ1 ŝi taking a slightly different form.96
95 The second row of the matrix ri is from Andrews and Schafgans (1998). The third row is from Newey (2009).
96 In particular, the derivatives are
\[
\frac{\partial^2 \log L_i}{\partial \Theta_u\, \partial \Theta_1} = w_i \frac{c_i' \nabla_{\Theta_1} \hat{h}_i}{\sigma^2}
\]
and
\[
\frac{\partial^2 \log L_i}{\partial \sigma\, \partial \Theta_1} = 2 w_i \left( \frac{\nabla_{\Theta_1} \hat{h}_i}{\sigma^2} \right) \frac{\hat{h}_i - c_i \Theta_u}{\sigma}.
\]
G Constructing Weights
This Appendix describes the computation of the contents of Table 6.
In order to compute the fraction of the relevant population serving in the Union Army, I consulted two
sources. The first was Gould (1869, p. 28), who reports that 1,660,068 native-born men served in the Union
Army. Next I consulted the 1860 census, finding that there were 3,720,008 native-born men aged 15–45 in
the portions of the United States that did not secede. Thus, for the Union Army population
\[
Q_1^{UA} = \frac{1{,}660{,}068}{3{,}720{,}008} = 0.446.
\]
To determine the value of Q1 for the two Regular Army subsamples, I again consulted two sources. First,
I determined from the 1860 census that the total native-born white male population born between 1832 and
1846 was 3,101,313. From the 1870 census, I found that the total native-born white male population born
between 1847 and 1860 was 4,327,190. Second, I collected the index of the Register of Enlistments for the
native-born, and removed duplicate entries of name, birth year and state of birth.97 This procedure yielded
96,851 distinct enlisters for the 1832–1846 cohorts and 83,633 distinct enlisters for the 1847–1860 cohorts.
The estimates of Q1 are thus
\[
Q_1^{RA(a)} = \frac{96{,}851}{3{,}101{,}313} = 0.031
\]
for the 1832–1846 cohorts and
\[
Q_1^{RA(b)} = \frac{83{,}633}{4{,}327{,}190} = 0.019
\]
for the 1847–1860 cohorts.
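The three enlistment shares can be reproduced directly from the counts quoted above. The short check below is only an arithmetic illustration; the dictionary names are arbitrary.

```python
# (enlisters, relevant male population) as quoted in this appendix.
counts = {
    "UA": (1_660_068, 3_720_008),
    "RA(a)": (96_851, 3_101_313),
    "RA(b)": (83_633, 4_327_190),
}

# Share of each population that enlisted, rounded to three decimals.
q1 = {group: round(enlisted / population, 3)
      for group, (enlisted, population) in counts.items()}
```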
97 The removal of duplicates is necessary because individuals could enlist multiple times. Dropping duplicate appearances of
name, birth year, and state of birth is an imperfect way of addressing this possibility. It is simultaneously too restrictive—there
may have been two enlisters with the same name born in the same state in the same year—and too loose—slight deviations
in the spelling or abbreviation of names, or misreporting of birth years would allow multiple enlistments by one individual to
survive the removal of duplicates and be counted. Given that the actual value of Q1 does not seem particularly important in
practice, I do not investigate this potential problem further. An alternative method is to estimate the fraction of enlistments
that are repeat enlistments using information from the Register of Enlistments; however, this information is not readily available
for those in the earlier birth cohorts.