Sample-selection Bias in the Historical Heights

Sample-selection Bias in the Historical Heights Literature
Howard Bodenhorn
Department of Economics, Clemson University
Timothy W. Guinnane
Department of Economics, Yale University
Thomas A. Mroz
Department of Economics, Clemson University
∗
This draft: March 5, 2012
∗
Note : Preliminary, please do not cite or circulate. For comments and suggestions we thank Cihan Artunç, Jeremy
Edwards, Farly Grubb, Sheilagh Ogilvie, Jonathan Pritchett, Paul Rhode, Mark Rosenzweig, William Sundstrom,
and Christopher Udry. We acknowledge nancial support from the Yale University Economic Growth Center. Direct
correspondence to Guinnane: [email protected].
1
Abstract
Over the past thirty years, an extensive literature has developed that uses anthropometric
measures, typically heights, to draw inferences about living standards in the past. Much of
this literature relies on micro-samples drawn from sub-populations that are themselves selected:
examples include volunteer soldiers, prisoners, and runaway slaves, among others. Contributors
to the heights literature sometimes acknowledge that their samples might not be random draws
from the population cohorts in question, but rely on normality alone to correct for potential
selection into the sample. We argue that selection cannot be resolved simply by augmenting
truncated samples for left-tail shortfall. Consequently, widely accepted results in the literature
may not reect variations in living standards. Observed heights could be predominantly determined by the process determining selection into the sample. We use a simple theoretical model
combined with simulation and econometric results to make our point. A survey of the current
historical heights literature illustrates the problem for the three most common sources: military
personnel, slaves, and prisoners.
2
. . . the socio-economic composition of the institution studied might have varied over
time, even in the absence of explicit changes in the admission criteria. This might be due
both to supply and demand considerations. The willingness of individuals to enter the
1
military, for instance, might have varied over time. . . This problem is quite intractable.
In volunteer services there are two selection processes at work, neither of which can be
considered random: the demand of the recruitment services for men of certain heights
(or other characteristics correlated with height) and the supply of potential recruits.
2
. . . The nature of 19th century volunteer armies such as that studied here presents a
statistical dilemma:
the sampling bias tends to move in a direction opposite to the
population mean. For instance, if population income goes up, options such as military
service in India become less attractive, and our sample would be drawn increasingly
from the left tail of a distribution which itself is shifting to the right. A nding that
the sample mean went up would thus be consistent with both a falling and a rising
population income, depending on which inuence on the sample dominated.
1
2
3
Komlos (2004, note 44).
Weir (1997, p.174).
Mokyr and Ó Gráda (1996), pp.1634
3
3
1
Introduction
Anthropometric history came into its own in the 1980s, and now enjoys a prominent place in
quantitative economic history.
The appeal of this approach to the study of the past is obvious.
Many historical sources record heights, and by studying direct measurements of the human organism
one can hope to achieve a broader picture of how economic growth and development aected human
well-being.
4
The central idea of the heights literature is that, at the population level, heights at a given age
5 Thus a cohort
are a measure of the net nutrition available to individuals during the growing years.
might be unusually short because its members had less food, or gross nutrition; because hard work
when young made greater caloric demands on gross nutrition, leaving less for growth; or because
disease made demands on gross nutrition. Some early papers in this literature viewed height as a
proxy for conventional measures of economic output such as GDP per capita, one that could be
6 As it has developed, the heights literature presents the
used in the absence of direct measures.
anthropometric approach as capable of capturing dimensions of economic well-being that the real
wage or GDP per capita cannot.
As scholars integrated insights from the medical and social sciences, they faced a number of
theoretical and statistical challenges. Perhaps the most discussed problem is the fact that many
samples are drawn from military populations for which there was a minimum height requirement.
The resulting sample of heights is left-truncated at or near that minimum height. This truncation
problem led to a number of approaches to estimating mean height from truncated samples. While
there remain some debates about how this is best done, most recent studies account for it using
7
well-tried approaches.
4
There is even a specialist eld journal,
Economics and Human Biology, started in 2003.
uses anthropometric measures other than heights.
Some historical research
Our argument applies with equal force to any anthropometric
measure used to reect on human well-being; for brevity here we use the term heights.
5
6
Most research on heights relies on sources that pertain to men only, such as soldiers.
Brinkman, Drukker and Slot (1988), for example, regress the heights of Dutch males on per-capita income in
Holland in the period 1900-1940, and then use the resulting equation and measured heights to infer incomes in the
nineteenth century.
7
The seminal reference is Wachter and Trussell (1982).
They discuss two distinct approaches.
The rst, the
Reduced-Sample Maximum Likelihood Estimator (RSMLE), requires ex ante specication of a (maximum) xed
truncation point.
The second, the Quantile Bend Estimator, relies on tting a linear relationship between the
expected and actual normality probability plots. Both rely on the normality of the sample distribution. Some studies
focus on a distinct problem, which is heaping of heights.
One would expect the distribution of heights to be
continuous. Measured heights often show both rounding (say, to 5'4 when the true height is 5' 4.1 inches) and a
tendency to cluster at certain heights, such as 5 feet. Our argument is distinct from both truncation and heaping.
4
Unfortunately the historical literature has not satisfactorily confronted the consequences of a
potentially more serious problem; namely, that many if not most of the samples used are not random
draws from the population in question.
This issue is distinct from that of left truncation, and
applies even in a situation where there are no left-truncation issues. In most sources, an individual
is included in the sample only if he or someone else made a decision that led to their inclusion.
Examples include soldiers and sailors in volunteer forces; prisoners ; runaway slaves; slaves that
entered into the international or interregional slave trade; students at elite educational institutions;
and holders of passports. In some cases more than one actor made a choice that allowed a person to
be observed in a sample. If the chances of inclusion in the sample dier across individuals in ways
that are correlated with height, then the sample distribution of heights are unlikely to be unbiased
measures of the mean height in the population, even after adjustment for possible left-truncation.
More importantly,
changes in observed heights over time may be driven not only by changes in the
actual heights of the population, but also by changes in the probability that individuals of dierent
heights were selected into the sample.
Similarly, cross-sectional dierences in height within a sample
may reect a particular variable's eect on the probability of inclusion in the sample, rather than
population dierences in height correlated with a given characteristic.
8 This is
Thus many heights studies may suer 'from sample-selection bias (SSB).
not
a trun-
cation problem, and none of the approaches that deal with truncation or shortfall address the
issue. It is reasonable to think that selection into a volunteer Army will aect the right-hand tail of
the distribution as well as the left-hand tail: taller men are from more prosperous backgrounds and
as such are more likely to have more attractive civilian opportunities than Army life.. The implications of SSB are grave. Komlos notes in the trenchant epigraph that the problem is intractable.
Many papers in the historical heights literature recognize that their samples might not be representative, but few have acknowledged that some of the most interesting and most discussed results
in this literature may be entirely an artifact of SSB. We are not the rst to recognize the problem.
Pritchett and Chamberlain (1992) and Grubb (1999) discuss selection in specic contexts. But we
8
Extensive discussion of SSB in the economics literature dates to eorts to estimate models of married women's
wages and hours of work. See Heckman (1979). Some sources used in the heights literature are themselves random
samples drawn from a larger source, for example, a random sample of men who joined a particular volunteer army.
We are not criticizing the method of drawing the sample; our argument is that the Army itself represents a selected
sample of men from that time.
5
are the rst to systematically explore the consequences of SSB in the literature more generally.
9
The current historical heights literature suers from two fundamental aws. First, the literature
has emphasized results that may reect nothing more than sample selection bias. Second, it has
labored to construct complicated, indirect explanations of puzzling ndings that reect, at least in
part, changes in the nature of the selection. Our message here is a warning. We cannot demonstrate
fully that selection problems are more or less important than the economic and social forces emphasized by the heights literature. The only fully convincing way to do that would be to compare
random and selected samples drawn from the same source. If such random samples were available,
we are condent someone would have used them. Rather, we show that selection is an important
problem that needs to be taken more seriously than it has been.
We do not argue that scholars should dismiss the heights literature. First, many studies (mostly
in developing countries, but also in a few historical examples) rely on sources relatively free of
selection bias.
Second, economic history always requires careful use of imperfect sources, and
we encourage scholars interested in studying historical heights to recognize and acknowledge this
imperfection in their sources. Finally, those who rely on this literature as part of a broader eort
to study living standards in the past should be skeptical about some of the results reported to this
point.
2
Declining heights in the presence of rising real wages
The sample-selection problems we discuss are potentially key issues in any study using a choice-based
sample of heights in the past. To x our ideas, however, we focus mostly on a set of related issues that
have occupied much of this literature. (In section 7 we discuss the sample-selection issue as it arises
in other heights research more broadly.) Economic historians and others are naturally interested in
how economic growth, and more specically industrialization, aected human welfare in the past.
The most famous version of this standard of living debate focuses on the working classes during the
British Industrial Revolution. A long debate that goes back at least to Engels (1845/1897) divides
optimists, who think the working classes beneted from early industrialization, from pessimists,
9
Mokyr and Ó Gráda (1996) is the only anthropometric study we know of that acknowledges that their result
may be a consequences of SSB. We read their conclusions as a precursor of what we say here. Weir (1997, p.175-7)'s
discussion also alludes to SSB in the British Army data.
6
who think they did not. The earliest literature focused on conditions of work and life sanitation,
diet, disease and mortality but in its modern incarnation the literature has focused on conventional
10
economic measures such as real wages, or incomes or consumption per capita.
Floud, Wachter, and Gregory's (1990) nding that British soldiers grew shorter during the
British Industrial Revolution, and after, seemed a strong point for the pessimist case. Figure 1
(From Nicholas and Steckel (1991)) presents similar evidence. The mean height of English males
declined by about 1.5 inches during a period in which wages were at worst stagnant or, more likely,
growing slowly.
If anything, this evidence seemed even stronger than conventional pessimist
arguments: Feinstein (1988) nds wage
stagnation, not wage declines, and Allen (2007) nds slow
real-wage growth throughout the nineteenth century.
But British soldiers became signicantly
shorter. Something similar also occurred in the United States. During a period when the economy
entered a period of long-term acceleration in real GDP, estimated mean height declined by more than
one inch (Weiss 1992). Growth in estimated heights did not recover until the end of the nineteenth
century. This fact of US economic history is now so much a part of the mainstream narrative that
it is known simply as the antebellum puzzle. The puzzle is the concurrent (apparent) decline in
biological well-being and (apparent) accelerating growth in per capita income.
The heights literature and, in fact, much of the profession, has taken both ndings at face value
(rising incomes and shorter people) and has endeavored to explain it. Because the same phenomena
have been documented for the mid-nineteenth century US, late eighteenth-century UK and the industrializing Hapsburg Empire, we label the phenomena the industrialization puzzle. The ndings
were unexpected and led to considerable eorts to understand the divergence between heights and
real wages.
Komlos (1998) presents a laundry list of possible explanations, including increasing
income inequality, increased income variability, changes in relative prices between nutritious (meat
and grains) and non-nutritious (coee and sugar) foods, transportation innovations and the integration of the disease environment, intensication of labor, among others. The heights literature
focuses on three of these eects.
First, changes in the distribution of income might reduce net nutrition for the working classes,
even if overall measures such as GDP per capita are rising. We nd this explanation particularly
10
Feinstein (1988) is probably the last word on English real wages per se. He nds little evidence real-wage increases
until least the rst decade of the 19th century. Brown (1990) uses hedonic methods to show that much of any increase
in real wages was necessary to oset the utility costs of more unpleasant living and working conditions.
7
interesting because it is so closely related to the SSB problem. In the heights literature, changes in
income distribution are usually thought of as a shift in income from, say, labor to capital. This would
explain the English case especially, with a divergence between the trends in GDP per capita and
measured heights. Changes in income distribution could well be the driving force behind dierent
types of selection into the Army; if workers with particular sets of skills saw sharp declines in their
wages, they might be more ready to enlist. Yet this implication is seldom mentioned in the heights
literature.
11
A second explanation for falling heights in the face of economic growth focuses on an unhealthy
environment. Industrialization was associated with the growth of cities, which were comparatively
unhealthy places to live until at least the late nineteenth century.
Diarrheal and other endemic
infectious diseases drive a wedge between gross and net nutrition. This observation is consistent
with what we know about public health and urban environments in the period.
A third explanation emphasizes instead the price and availability of important food items. While
real wages might have improved, the relative price of nutritious height-enhancing foods might change
in ways that lead to less consumption.
Komlos (1987), Bodenhorn (1999) and others have, in
fact, developed estimates of nutrient availability and consumption for 19th century school boys
and African-Americans to investigate this possibility. This approach requires so many assumptions
about animal weight, waste and spoilage, seed requirements and so forth that even the most carefully
crafted estimate is subject to challenge.
Floud, Gregory and Wachter (1990, 233-243) doubted
whether such studies could overcome the insurmountable hurdles and contribute much to the debate.
We do not deny the value of the fresh research eorts provoked by the nding of the industrialization puzzle. We simply doubt that the puzzle is real. The problem is that scholars have only
identied it in contexts where the data come from choice-based samples.
US had volunteer Armies for most of the period in question.
Both the UK and the
12 France, on the other hand, drafted
men according to lotteries so that all young men had an approximately equal chance of being called
for service. Consider Figure 2, from Weir (1997), which presents the heights of French males over
the same period. There is no French equivalent of the industrialization puzzle. French men grow
11
In other cases, income distribution is invoked to explain above-average height. Steckel and Prince (2001, p. 291)
appeal to the egalitarian practices of the Native Americans of the Great Plains to explain their relatively tall stature.
12
The Union (North) subjected men to a draft in the period 18621865 (during the U.S. Civil War), but the policy
of allowing draftees to pay a substitute to take their place makes it much like a volunteer Army for our purposes.
8
without interruption; in fact, a simple regression of these gures on a linear time trend yields an
R-square of about 0.97.
13 What accounts for the dierence between the US and British case on
the one hand, and the French, on the other?
The heights literature focuses on France's slower
urbanization, which reduced the disease burden and thus the impediments to converting calories
into height. We think there is a much more obvious explanation.
Figure 3 also plots the heights of British military recruits. The year-to-year variations in the
estimated mean heights of British soldiers are enormousmuch larger than could be due to any
short term variations in the standard of living. Consider the cohorts that reached age 20 around
1810. Between 1804 and 1812, French heights uctuate between 163.5 cm and 163.9 cm without
obvious trend. British heights, after having already experiencing a 2 cm decline 20 years earlier,
rst decline by about 2 cm and then as quickly recover. In short, the heights of men potentially
eligible for the French military do not oscillate as wildly as British heights even though both imposed
minimum (but not common) height standards. With a standard deviation of height within a British
cohort of about 2.3 cm, the comparatively smooth decline in mean heights from the cohort born
in 1820 to that born at the end of the 1830s represents a very large decline in mean heights. We
return to this issue in section 6 below.
3
14
Modeling Military Heights
One problem with the limited discussions of selection bias in the heights literature is lack of clarity
about what might cause the selection and how that selection would aect the samples used. To avoid
that weakness, this section describes a simple model in which an individual of given characteristics
decides whether to join the Army or remain a civilian. This model could be adapted to t several
other contexts that generate height data. The model delivers expressions describing the degree of
selection into the Army, by height.
13
In Sections 4 and 5 below we use this feature of the model
Our views on SSB and the heights literature probably originated in conversations with Weir. Weir (1997, p.174)
notes that British military heights uctuate a great deal in short periods. The cohort who reached age 20 in 1850
were apparently 5 centimeters taller than those who reached that age 1 year later. That is far beyond any plausible
range of measurement error in GDP and suggests that there must have been other very powerful forces driving the
heights of British volunteer forces. We agree.
14
Militia conscription records for Sweden for the period 18201965 cover nearly the entire male population, and,
like France, show no deterioration in heights (save for a .5 centimeter dip from 183540, which was made good
by 1845 (Sandberg and Steckel 1997, Table 4.1). Using the settled Army, which was apparently a volunteer force,
Sandberg and Steckel show a sharp decline in height for the cohort born in the 1840s Sandberg and Steckel 1997,
p.135; Sandberg and Steckel 1988, Figure 1).
9
to simulate height distributions for specic distributional parameters.
Our model draws on Roy
(1951), and is similar to Heckman and Sedlacek's (1984) two-sector occupational choice model.
Each person's utility depends on his wage or material compensation in the Army or in the civilian
world. Each individual also has a parameter that describes his taste for military life, and another
that describes his taste for civilian life.
Thus two individuals with identical wages can make
dierent decisions.
We treat both Army and civilian wages as (possibly) a function of the individual's height. This
assumption has two interpretations that are equivalent for our purposes.
Army or some civilian occupations might actually reward height itself.
The rst is that the
Promotion might come
faster to taller soldiers, and in some military and civilian occupations a tall person might be more
productive.
15
A second interpretation seems more natural, and more consistent with the basic tenets of the
heights literature. Height is correlated with a person's biological standard of living and with other
aspects of health human capital (Schultz 2002). The Army might reward a tall person not for
being tall per se, but because taller individuals are generally healthier than shorter people. Similarly,
the civilian world might reward a tall person for his health in addition to any skills correlated with
height.
3.1
Individual Returns
The military pays soldiers as a function of their height,
military traits summarized in by the variable
h,
and their observable set of productive
εM :
ln (wM ) = αM + βM h + γM εM
15
(1)
Studies of modern labor markets have uncovered a positive height-wage correlation and attribute it to height-
based productivity eect. Persico et al (2005) identify an association between height and human capital investment
during adolescence that manifests itself as a positive height-wage correlation for adults.
Case and Paxson (2008)
argue that the positive correlation between height and wages captures, in part, a positive association between height
and cognitive ability.
After controlling for cognitive abilities, the height eect remains, but is less strong.
It is
not unreasonable to believe that human capital and cognitive ability would be rewarded in military labor markets,
though not necessarily at the same rate as civilian markets. Komlos (1989, p.237) reports enlistment bonuses into
the Hapsburg Empire's army, circa 1809, increasing in heights, from 3 . for soldiers just 5'-0 to 45 . for soldiers
5'-5 and above.
10
(Note that
βM
might be zero.)
reported in Section 4).
εM
(Table 1 sets out all notation used here and in the simulations
can entail anything other than height, such as literacy or some specic
skill such as the ability to shoot straight. Individuals dier in their tolerance for other features of
military life, and we summarize those preferences under
τM .
An individual's utility from entering
the military is additive in the military log-wage and his tastes for the military:
U (M ) = ln (wM ) + τM = αM + βM h + γM εM + τM .
(2)
For the moment, we do not make any assumptions about the relationship between height and
productive military traits (εM ).
Height and military productivity could be positively correlated
(tall people might be better with a rie) or negatively correlated (tall people might nd it harder
to nd cover in an ambush). Now suppose civilian wages are given by:
ln (wC ) = αC + βC h + γC εM + δC εC .
(3)
Note that height and other military characteristics (εM ) might be valued by both the military and
the civilian economy, but another set of characteristics,
εC ,
are valuable only in the civilian world.
(The model can be extended such that the military also rewards
εC ,
results below.) Individual preferences for civilian life are denoted by
but that does not change the
τC .
Thus the utility of being
in the civilian world is:
U (C) = ln (wC ) + τC = αC + βC h + γC εM + δC εC + τC .
As above, all productive traits can be correlated. For simplicity we assume that tastes (the
are independent of the productive traits (h and the
ε terms).
(4)
τ
terms)
Relaxing this assumption only requires
the introduction of additional covariance terms in the following derivations.
While we view this primarily as a static model of occupational choice, the functions
and
U (C)
could be interpreted in a dynamic context.
U (C)
U (M )
could represent the expected utility
from choosing the civilian sector today when there is a possibility (option) that one might choose
to enter the military at some later date.
However, throughout this analysis we assume that an
11
16 Similarly, the military pay function
individual has just one opportunity to enlist in the military.
could represent expected lifetime payments that could include, in part, the expected increase in
pay if the volunteer were to be promoted at some later date from soldier to sergeant.
An expected utility maximizing individual enters the military if and only if
U (C) ≤ U (M ),
or
η ≤ (αM − αC )
(5)
η = βh + γεM + δεC + τ
(6)
where
and
β = (βC − βM ) , γ = (γC − γM ) , δ = δC ,
and
τ = τC − τM .
(7)
Equation (7) makes clear that what drives the model (and the simulation below) is
between features of the military and civilian worlds.
If height is more highly rewarded in the
β = (βC − βM ) > 0,
civilian sector than in the military sector, i.e.,
dierences
then taller individuals will
prefer the civilian sector over the military sector, all else equal. Similarly, if the civilian sector's
reward to the military trait
higher values of
εM
εM
increases (i.e.,
γ
or
γC
increases), then
will be less likely to join the military.
ceteris paribus
those with
Additionally if there is an increase
(decrease) in the reward to a civilian sector trait that is positively (negatively) correlated with the
military trait
εM ,
then those with higher (lower) values of
εM
will have a lower (higher) propensity
to enter the military.
Suppose that all productive traits follow standard normal distributions, not necessarily independent of each other or of heights, and that the taste parameters follow mean zero normal distributions
independent of heights and productive traits.
16
17 Then height h and selection error
h
will follow a
Given that most studies using military heights focus on older enlistees (e.g., over age 21 or 22) who likely have
nished growing, the selection process would include the decision to be a civilian during the teenage years and then the
decision in one's early to mid 20's to leave the civilian sector and join the military. This more complicated selection
process would alter the specic characterizations of the self selection biases described below, but it is unlikely that it
would eliminate them.
17
The unit variance assumption is innocuous, and the mean zero assumption for all of these factors only requires
simple redenitions of the intercepts in the two wage oer equations.
assumption for tastes.
12
One could easily relax the independence
bivariate normal distribution with




 h   µh 
2
E =
 ; V ar (h) = σh ;
η
0
(8)
V ar (η) = ση2 = β 2 V ar (h) + γ 2 V ar (εM ) + δ 2 V ar (εC ) + V ar (τ )
+2βγCov (h, εM ) + 2βδCov (h, εC ) + 2γδCov (εM , εC )
(9)
and
Cov (h, η) = σh,η = βV ar (h) + γCov (h, εM ) + δCov (h, εC ) .
(10)
Under these assumptions, we can derive the distribution of heights for those who prefer military
service to working in the civilian sector.
described above over the range of
Integrating the bivariate normal distribution of
η ≤ (αM − αC ),
and normalizing by the Prob (η
(h, η)
≤ (αM − αC ))
(see Kotz et al 2000, p.316) yields
fh|mil (h) = fh|η≤(αM −αC ) (h) = fh (h) Z (h)
(11)
where
Z (h) =
Φ (·)
o
n
q
Φ [(αM − αC ) /ση − ρη,h (h − µh ) /σh ] / 1 − ρ2η,h
Φ [(αM − αC ) /ση ]
ulation) distribution of heights,
1
√
σh 2π
of height and the population selection error
η
and
(12)
fh (h)
is the unconditional (pop-
ρη,h =
Cov(h,η)
is the correlation
σh ση
is the standard normal cumulative distribution function,
2 1 h−µh
exp − 2
,
σh
.
we dened above.
Z (h)
summarizes the selection
process that generates the observed distribution of military heights from the underlying distribution
of population heights.
18
Most historical studies of military heights try to use the observed military height distribution
18
The appendix extends the model to account for the military's optimization decisions.
We show that in con-
structing the least-cost military force, the Army might impose a minimum height restriction, as we actually see in
practice.
13
(fh|mil
(h)) to make inferences about the population height distribution fh (h).
Our derivation makes
clear that accurate inferences about the population distribution depend on the properties of
and especially how
Z (h)
Z (h)
might vary over time (or in the cross-section) in a way that might be
unrelated to variations in the population height distribution. As derived, this selection mechanism
ignores any possible minimum height requirement that the military might impose. The existence
of an exact and binding minimum height requirement implies that instead of the conditional height
distribution described above, we would only observe its truncated analogue.
Suppose there is
some loosely enforced minimum height requirement, but one that is completely non-binding above
some height
h∗ .
Then all observed heights above
h∗
can be considered as random draws from the
upper tail of this conditional height distribution. This assumption underlies Trussell and Watcher
(1982)'s reduced sample maximum likelihood (RSMLE) and quantile bend estimators (QBE) (i.e.,
h∗
is above the extent of the shortfall in Trussell and Watcher's terminology). Our simple choice
model highlights an important implicit assumption in Trussel and Watcher's (1982) description of
the process giving rise to observed military heights.
operating for heights
above h∗ .
They assume there is no selection process
That is, individuals taller than
h∗
choose to enter the Army based
on a coin ip.
Our Roy-style model suggests an alternative process. Suppose that the civilian sector rewards
heights and productive traits more than the military, and that the covariances of heights and
productive traits are positive, zero, or at most only slightly negative.
correlation) of heights and the selection term
η would be positive.
Then the covariance (or
This implies that
Z (h) approaches
zero for taller individuals. Taller individuals would increasingly become more self-selected (less wellrepresented in the Army) under this optimizing model. This conclusion, which is the central point
of our critique, has two, related implications.
rules out such a possibility.
First, Wachter and Trussel's approach implicitly
Their estimation techniques (whether RSMLE or QBE) are biased
estimators for the unconditional height distribution
fh (h)
in this case. Second, our simple choice
model provides a clear understanding of why the right-hand tail may have fewer taller individuals
than one would observe in a true normal distribution.
The Wachter and Trussel assumptions are innocuous only under fairly extreme conditions, when
h∗
is above the unconditional mean height and the correlation of height and the selection error (ρ)
approaches
−1.
How plausible is this? Consider rst the case where the civilian sector more strongly
14
rewards productive traits than the military sector.
In this instance, a large negative correlation
could arise only when the covariances of height with the productive traits
εM
and
εC
are large
and negative. Such negative correlations seem unlikely, as one typically considers height to be a
marker for a better childhood environment which would suggest positive correlations of heights and
other skills. Next, consider the (probably implausible) case where the military rewards height and
the relevant skills more strongly than does the civilian sector. Then we will have a large negative
correlation of heights and the selection error if heights and skills are positively correlated and if the
reward to the civilian sector specic skill,
δC ,
is small. Under these assumptions
Z (h)
would no
19
longer approach zero, and the selection operating on the upper tail would be weak or non-existent.
If the military rewarded common skills more than the civilian sector while the civilian sector pays at
most a small premium for its sector-specic skill, then the variance of wages in the military would
be much larger than the variance of wages in the civilian sector. This implication is at odds with
what we know about military pay.
This discussion raises an important question: if our model is a reasonable approximation to why
individuals choose to enter the military, then why does the upper tail of nearly every one of the
height distributions examined by Wachter and Trussel appear to follow a normal distribution? In
Section 4 we use our occupational choice model to simulate historical height data sets. For a wide
variety of reasonable variances, covariances, and reward structures, we nd that the distributions
of the self-selected heights in the military do follow distributions that closely resemble the shapes
of normal distributions,
even when there is selection built it into the simulation model.
This result highlights a serious statistical problem documented in Section 5. Even with moderate to large sized samples by cliometric standards, the power of standard skewness-kurtosis tests to
reject normality is quite limited for the forms of self-selection modeled here. Even kernel density
estimators for heights in the self-selected military samples seem at most trivially dierent from
normal distributions. Self-selected samples of heights typically dier from normal height distributions in the variance of the observed height.
The self-selected height distributions, even though
they appear to be almost normally distributed, typically have standard deviations for heights well
19
Note that under these conditions, our occupational selection model would imply gross over-selection of tall
individuals into the military. However, for a value of the correlation quite close to
−1,
the amount of over-selection
could be relatively constant for all heights above the population mean. In this case the shape of the upper tail of the
distribution would reect, nearly exactly, that of the population distribution of heights.
15
below those in the underlying population.
Many of the estimated standard deviations in Floud,
Wachter and Gregory's (1990) Table 4.8, especially for the 1800s, seem quite small, suggesting a
strong degree of self-selection into the military consistent with this simple extension of the Roy
model.
4
Simulating the Model
To assess the potential importance of selection bias, we can apply reasonable values to the occupational selection model to investigate the ability of selected samples to yield accurate information
about the characteristics of the unconditional population height distributions. The model underlying these simulations is a simplied version of the model presented in section 3. We focus on the
decision to join the Army and abstract from the taste parameters (τ ); the constant terms in the
log-linear wage equations and the wage shocks capture variations in tastes. Finally, we assume that
each individual has a given shock that is specic to his earnings in the civilian or military sector.
That is, our simulation is based on equations (1) and (3), but for the simulation we set
τC = 0,and τ M = 0.
Our simulation exercises vary
αM , αC , γM , δ C , β M ,
and
βC ,
γ C =0,
as described
below. To uncover accurate characterizations of the distributions of self-selected samples, we start
with hypothetical populations of one million observations and simulate the process of selecting
into the military.
20 A pseudo-random number generator gives this population normally distributed
heights with a mean of 66 inches and a standard deviation of 2.5 inches. We assume values for the
standard deviations of the two shocks, and for their covariance.
21
Table 1 lists the baseline values for each simulation parameter. Our primary interest is in the
four behavioral parameters,
αC , αM , βC
and
βM ,
though in this simplied model only the sectoral
dierences in the intercepts and in the slopes matter for the decision to select into the military. We
also investigate the implications of varying the standard deviations and covariances of the shock
terms
γ M M
and
δ C C
. The results are not at all sensitive to the former for reasonable values,
22 Our baseline behavioral parameters were chosen in the
so we only discuss them briey below.
20
Many characteristics of the self-selected population can be obtained directly from the analytic formula for the
self-selected density function derived in section 3.1. See equation (11).
21
22
The simulation programs used here (written in Stata) can be obtained from the corresponding author.
We performed these simulations in Stata version 12, using the seed 6175 for the pseudo-random number generator
in our sets of simulations. With our sample sizes the seed is nearly irrelevant. We used the mean height of 18 year-old
recruits around 1747, 61.75 inches (Floud et al, Table 4.1), to set the seed.
16
following way.
First, we assume that the civilian sector was almost always preferred to military
sector in the absence of shocks for all possible returns to height for persons of very low height.
We do this by specifying that civilian log pay is
0.1
above military pay for individuals at a height
four standard deviations below the mean (i.e., 56 inches). Second, we consider a range of dierent
rewards to height.
value,
0.02,
23 What matters is the dierence between
and let
βC
βC
and
βM ,
so we set
βM
to a xed
vary. We consider returns to height in the civilian sector that both exceed
and fall below returns in the military sector. Finally, we choose baseline statistical values assuming
that variation in the civilian labor market random component exceeds that of the random military
component. This assumption is consistent with the narrow pay ranges determined by rank in the
military.
Figure 4.A reports the mean heights of soldiers and the fraction of the population in the military
for a range of returns to height in the civilian sector. The gure also highlights two benchmarks.
The vertical line at
βC = 0.02
marks the point at which the return to heights in the Army and
the civilian sector are identical, and there is no selection on height. The horizontal line marks the
point at which half the population is in the military. Note rst that the fraction in the military
declines monotonically with increasing returns to height in the civilian sector.
Military heights
in this graph also decline monotonically from a level above the mean population height to a level
more than one inch below the mean population height. This pattern highlights the importance of
selection on the height distribution for those wanting to enter the military before the imposition of
any minimum height standard. At the point
βC = 0.02, at which there is no selection on height, the
24 Above
mean height in the military equals the population mean value of 66 inches.
βC = 0.02,
the
region in which the reward to the civilian height exceeds the reward to height in the military, the
mean heights of individuals willing to join the Army drops rapidly. When the return to height in
the civilian sector exceeds that in the Army by 2 percentage points
βC = 0.04,
about 23 percent of
the population nds the military sector more attractive than civilian sector, and the average soldier
23
We re-parameterize the log-wage specications as
ln (w) = α + β ∗ (h − 56) + .
This has the eect of allowing us
to use real heights without articially magnifying dierences in returns to height in the two sectors.
24
The gure does not display an interesting feature generated by the model. It is possible for military heights to
increase with increases in the return to height in the civilian sector,
for the military sector.
so long as civilian height returns are below that
This can happen because of the dependence of the selection error variance and the sign and
magnitude of the correlation of the selection error and height on the dierence in the height reward as described in
equation (11). The fraction in the military, however, is always monotonically non-increasing in the return to height
in the civilian sector.
17
is nearly one half inch shorter than the population mean. When
βC
equals
0.06,
so the return to
civilian height exceeds the return to military height by 4 percentage points, only about 11 percent
of the population volunteers for the Army, and the volunteers are more than one inch shorter than
the population mean.
An important implication of the model is that selection operates across the entire distribution
of height, not just on the left tail, so that the military may attract disproportionately many or few
tall recruits. In Figure 4.B we illustrate this feature. We dene tall as taller than the population
mean (66 inches) and we plot the fraction of individuals in the military and in the civilian sectors
who are tall, according to this denition, against the civilian return to height.
As the civilian
return to height increases above the military return, the fraction of soldiers who are tall declines
precipitously. When height rewards are identical in the two sectors (that is,
βC = βM ),
there is no
selection on height and exactly half of the people choosing each sector are tall.
Figures 5.A and 5.B contrast the density of heights for those in the Army and the population
height distribution for two dierent returns to height in the civilian sector, namely
0.04 (in 6.A) and
0.06 (in 6.B). These values correspond to the civilian labor market rewarding each inch of height by
two and four percentage points more than the military does. Figures 5.A and 5.B clearly illustrate
one of our principal contentions: the military height distribution unambiguously shifts to the left,
meaning the Army has relatively few of the tallest men and relatively more of the shortest men. The
measured standard deviation of the selected heights for those in the military is also smaller than
that in the population (2.49 for Figure 5.A and 2.44 for Figure 5.B versus 2.50 in the population).
Yet the distribution of heights in the Army looks normal. Moreover, we know that the military
densities reported there reect deliberate, non-random sampling schemes, and the analytic density
formula clearly rules out that military heights follow a normal distribution.
In any simulation exercise, a reasonable concern is that particular parametric assumptions generate articial results. We experimented with ranges of values for all of our behavioral and statistical
25 Changes to the intercept terms
parameters and have a clear sense of how they aect the results.
(αC ,
αM )
produce a small shift in the point at which heights begin to decline in Figure 5.A, but do
not alter the fundamental result. The same can be said of reasonable variations in other statistical
25
Note that we can solve the two wage equations for the height at which an individual with mean-zero shocks
is indierent between the Army and the civilian sector.
We only consider parameter values that imply interior
solutions; that is, we do not consider values that imply either everyone or nobody wants to join the Army.
18
parameters; increasing or decreasing the standard deviations of the shocks or their covariance aects
the steepness and locations of lines in the graphs reported in Figures 5 and 6, but not the essential
result that selection on height occurs across the entire population distribution of heights.
These gures capture, in dierent ways, the core message of this paper. Even modest relative
dierences in returns produce an Army that is selected in such a way as to be signicantly shorter
than the population from which it was drawn. The message to practitioners engaged in empirical
historical height research is also clear: apparent normality of the military height distribution does
not obviate the need for clear and precise thinking about how selection into the sample may drive
a wedge between the mean height of the sample and the mean height of the population from which
the sample is drawn.
5
The power to reject normality in selected samples
Almost all researchers in the historical height literature accept the hypothesis that population
height distributions closely follow normal distribution laws. From equation (11) in Section 3, it is
clear that the analytic distributions of self-selected heights derived from the simple application of
the Roy model do not follow a normal distributions when height, or whatever height proxies for,
dierentially aects the rewards in the military and civilian sectors. This suggests that a test of the
null hypothesis of normality for an observed height distribution, against the alternative hypothesis of
observed heights not following a normal distribution, might constitute a useful device for detecting
the presence of sample selection bias. Tests of this sort seem to be the standard way scholars working
in the heights literature argue that their samples do not suer from selection problems. Nicholas
and Steckel (1991, pp. 941-2), for example, have used the failure to reject such a null hypothesis as
a reason to dismiss concerns about self-selection biases.
26 In this section, however, we demonstrate
that such tests often have little power to uncover the existence of self-selection biases even when
the magnitudes of the self-selection biases are substantively important.
In Figures 6.A and 6.B we repeat these comparisons of the population and selected density
functions from Figures 5.A and 5.B along with the adjustment terms
26
Z (h)
in equation (12) that
Given the Jarque-Bera tests shown in Table 1, we cannot reject the hypothesis that the distribution of male
English and Irish height is normal, or Gaussian. The absence of truncation bias is a desireable feature of our height
data...
This approach goes back to at least Soklo and Vilaor (1982, p.469). See also Margo and Steckel (1982,
p.533) and Johnson and Nicholas (1997, p.206).
19
allows one to derive self-selected distributions from the population distribution for our simplied
Roy model. Figure 6.A corresponds to a 2 percentage point higher return to height in the civilian
sector (bC
= 0.04
versus
bM = 0.02),
while in Figure 6.B the civilian sector has a 4 percentage
point higher return to height than the military sector. The adjustment term for the 2 percentage
point dierential implies, relatively, that ve foot tall individuals on average would be 2.21 times
more likely to prefer the military sector than six foot tall individuals.
The adjustment term for
the 4 percentage point dierential is even more striking. Five foot tall individuals are 7.86 times
more likely to prefer the military sector to the civilian sector. Those dierences in self-selection, of
course, are what cause the 0.4 and 1 inch mean height dierentials of the selected military samples
from the true population mean height. These are substantively meaningful dierences.
These self-selected densities, as noted above, do look normal, even though they are not normal
distributions, and so it is important to evaluate the ability of standard tests for normality to detect
deviations from normality.
A failure to reject normality would provide some reassurance about
the representativeness of the observed data only if the power of the tests for normality were large
enough to reject interesting alternative distributions.To evaluate this we use a popular test available
in Stata, the sktest, that examines deviations of estimated measures of skewness and kurtosis in
an observed sample from their theoretical counterparts derived from a normal distribution with
the same mean and variances as estimated in the observed sample.
27 In Figure 7 we present power
functions as a function of sample size for the selection model with a 4 percentage point dierential in
the return. We use expected military sample sizes roughly in the range of 100 to 50,000 observations
to correspond to sample sizes encountered in the military height literature. This specication of
the data generating process (with a 0.04 percentage point dierence in the return to height, as
above) implies a 11.34% unconditional probability of preferring the military to the civilian sector to
the civilian sector, and so we generate our selected military samples from population samples
sized from 1,000 to 500,000.
For each population size, we draw 1000 samples, and for each of
these samples we apply the selection model to obtain the military subsample. We then calculate
whether or not the sktest test would reject the normal distribution assumption for the selected
military subsample. For tests at the 5% level, even for military subsamples as large as 50,000,
27
For most sample sizes considered here, the Shapiro-Wilk (for N≤2000) and Shapiro-Francia (for N
performed similarly.
20
≤
5000) tests
one would correctly reject the null hypothesis only about 5% of the time. At the 10% level, one
would reject only about 10% of the time.
Even for the largest sample sizes in these two gures
(which are large relative to those found in the historical literature), the tests have virtually zero
power (above the tests' size/level) to reject the assumption that the selected samples come from a
normal distribution.
28 This happens even though we
know
that they are not normally distributed;
these samples have mean heights more than one inch shorter than the mean in the true underlying
population height distribution.
We can see the reason for the poor performance of these tests by examining
bution of the selected heights sample. We compare
Z (h)
Z (h)
in the distri-
to rejection sampling. The idea behind
rejection sampling is that one can sample from an instrumental distribution, and randomly accept
each draw from that instrumental distribution as if it were from the target distribution.
If this
rejection sampling rate is proportional to the ratio of the density of the target distribution to the
density of the instrumental distribution, then the random draws selected in this way will follow the
29 In other words, with the right kind
distribution of random samples from the target distribution.
of rejection sampling, one can sample from a true normal distribution in ways that yield another
normal distribution.
Suppose that the target distribution is a normal distribution with mean and variance equal
to the mean and variance for the non-normal selected height distribution.
tion
fnor@mil (h).
Let the population height distribution,
f (h),
Denote this distribu-
be the instrumental distribution.
When drawing from the population height distribution acceptance probabilities proportional to
fnor@mil (h) /f (h)
will yield samples following those that would be drawn from the selected height
distribution because
ˆ
f (h)
fnor@mil (h)
f (h)
ˆ
dh =
fnor@mil (h) dh.
A comparison of this relative acceptance probability,
(13)
q (h) = fnor@mil (h) /f (h),
to the
Z (h)
function modifying the population height distribution to yield the selected height distribution (in
28
29
Some studies do recognize the low power of these tests. See, for example, Sokolo and Villaor (1982, p.457).
To make this analogy to rejection/acceptance sampling precise, we do need to assume that the height distributions
only have positive support over nite subset of the real line, say for adult heights in the range 46 to 86 inches (e.g.,
the mean plus or minus eight standard deviations). The ratio of the target to the instrumental distribution needs to
be bounded at all relevant values such that
ftarget (h) < M finst (h)
for some xed, nite value of
M;
this need not
be satised for normal distributions with unbounded support. To simplify the notation, throughout this discussion
we assume that
M = 1,
satises this criteria; allowing
M
to be larger than
21
1
only complicates the notation.
equation (11)) reveals why the non-normal selected height distribution so close reembles a normal
distribution. Figure 8 provides such a comparison for the case when the reward to height diers by
four percentage points in the two sectors and the selected mean height diers from the population
mean height by over one inch. This gure shows that the two dierent adjustments to the normal
population height distribution are nearly identical, even though one adjustment,
a normal distribution and the other,
Z (h)
q (h), yields exactly
a self-selected distribution that clearly does not follow
a normal probability model. Moderate amounts of self-selection can generate selected populations
that dier substantively from the unconditional population distribution in terms of the rst two
moments, but tests for normality would be quite unlikely to provide any evidence of self-selection
biases because of the similarity of the selected distribution to a normal distribution.
6
Econometric analysis of selection in the British military data
Our theoretical and simulation analysis demonstrates that studies based on volunteer armies and
similar sources may suer from serious but undetected selection problems.
We now ask whether
a closer look at the sources might have identied the problem. Note the central diculty: we are
asking whether we can identify selection bias from the selected alone. That said, we show in this
section that there is discernible evidence of selection bias in at least one well-known study. Scholars
rightly consider Floud, Wachter, and Gregory (1990) a central contribution to the historical heights
literature. Their discussion of living standards in Britain in the period 17501980 relies heavily on
three distinct samples that are all in some sense military. The Royal Army and the Royal Marines
recruited adult men to serve in the forces. Floud et al also use samples of younger males to study
age-patterns of growth
We use a subset of the public use version of the Army data to show that
heights of soldiers display patterns that are consistent with selection and inconsistent with the type
30
of explanations usually advanced in the heights literature.
We begin by estimating two types of simple models. In the OLS versions, the dependent variable
is height measured in inches. In the binary probits, the dependent variable is one if the individual is
tall (which we dene in several dierent ways discussed below). In all models we have two types
30
The dataset we use is taken from the UK data archive, Long-term Changes in Nutrition, Welfare and Productivity
in Britain; Physical and Socio-economic Characteristics of Recruits to the Army and Royal Marines, 1760-1879, study
number 2131. We are grateful to Floud et al for making the data public in this way.
22
of regressors, which we call cohort conditions and current conditions .
The rst is a series of
dummy variables for the years in which the individuals were born. These variables should, under
the standard of living interpretation, aect height, because they characterize the individual's
experience as a child and young adult. Everyone born into similar socioeconomic circumstances in
31 The current conditions variables reect not
1820 faced a similar biological standard of living.
the standard of living for a person's birth-cohort, but conditions obtaining at the time
Army.
he joined the
Consider two men, both born in 1820. One joins the Army in 1842 and other in 1843. The
heights literature gives no reason to think the former should be taller or shorter, assuming both have
stopped growing by the time they joined. We might think, however, that if economic conditions in
1843 were relatively bad, then men from relatively more privileged background will be more likely
to join than were similar men in 1842.
Our argument implies that if there is no selection, then no current-conditions variable should
explain the heights of current Army recruits. The current conditions variables we employ here
enter with signs that make sense in the way just outlined. But we should stress that any currentconditions variable that aects conditional mean height, whether positively or negatively, is evidence
of selection. If the heights of current soldiers yield an unbiased estimate of the heights of the entire
male population, then no current conditions variables should enter the regression with meaningful
magnitude or statistical signicance.
We experimented with a number of dierent measures of current economic conditions. In general,
they produce similar results, although they enter dierent specications in dierent ways.
Our
implicit baseline is a model such as those reported in Table 2. In this and all other work reported
here, we limit the sample to men who joined the Army between 22 and 27 years of age, and who
reported a height of 67 to 77 inches.
(We restrict the sub-sample to older men to reduce the
chance that some have not attained their terminal height.
Floud et al do not always report the
lower truncation point they use, but in one example (the birth cohort of 1806-1809), they use a
truncation of 65 inches (Table 3.13)).
We regress height on a series of single-year dummies for
year of birth (1800 is the reference year).
31
The two binary probit models handle the information
More precisely: in studies such as Floud et al, the main variable aecting height is birthyear. In this case the
birth cohort (b) is the only determinant of heights. For those enlisting at date t, in the absence of selection biases
E[h|a<h<c,b,z(t)]=f(a,c,b) and P[(a<h<c|b,z(t)]=g(a,c,b).
background is known and used.
23
In other studies more information about the person's
dierently. In one probit model we classify men as tall or short, dening tall if they are 68 inches
or taller; in the other probit model, we dene tall as a recruit being 69 inches or taller.
Tables 3-5 report the eects of adding dierent types of current-condition variables to the base-
32 The most straightforward approach is to let year of recruitment
line models reported in Table 2.
or age at joining the Army proxy for current conditions. Table 3 reports summaries from specications in which we enter single-year dummies for the age at which the soldier joined to the baseline
models reported Table 2. We can reject the null hypothesis that age dummies are collectively zero
in all three specications. And adding the age dummies does not reduce the importance of the
year-of birth dummies.
33 This is a striking result, and shows that a simple way of parameterizing
the impact of current conditions helps to explain the heights of recruits. Table 4 reports a similar
exercise. This time we add single-year recruitment-year dummies to the baseline specications in
Table 2. We reject the null that the year-of-recruitment dummies are zero for all three models. More
importantly, in the probit specications we now cannot reject the null hypothesis that birthyear
dummies are collectively zero. This suggests that current conditions might be more important in
explaining recruit heights than are the birthyear dummies thought to proxy for the individual's
biological standard of living.
Table 5 reports a nal exercise of this type, which yields more ambigous conclusions. Rather
than age at recruitment or year of recruitment, we wanted to see the impact of a more direct
measure of current economic conditions. We use two measures of current economic conditions for a
working-class person, both derived from the work of the Global Price and Income History Group.
One measure is the wage of laborers relative to the wage of craftsmen, as estimated by Allen for
London.
34 These are both real wages, but since Allen uses the same price index for each, dividing
the two removes the inuence of consumer prices (so long as laborers and craftsmen consumed the
35 We
same basket of goods). The other measure is the nominal price of wheat, again for London.
also considered oats as a substitute for wheat prices, but the two series are highly collinear and using
32
We do not report the actual estimation results, but they are available upon request from the corresponding
author.
33
The age-of-recruitment dummies imply the restriction that the current conditions are always the same for indi-
viduals of a given age, which is clearly unrealistic. Alternatively, one could interpret signicant age at recruitment
eects in these models as evidence for dierential self-selection into the military by age for those who are above any
reasonable minimum height requirement.
34
35
See http://www.economics.ox.ac.uk/members/robert.allen/WagesPrices.htm
See http://www.iisg.nl/hpw/data.php#united
24
wheat allows us to use a longer run of data. The wage measure and wheat prices performed similarly,
so we only discuss models using wages. Why should current wages aect the decision to join the
Army?
Refer back to our model in section 3 above.
Implicitly, young men were comparing the
Army to the civilian world. When civilian wages rose, the Army became relatively less attractive.
The regression asks whether this variation aected heights of new soldiers, that is, whether the
variations in civilian wages at enlistment times aected men of dierent heights dierentially. Table
5 reports the coecients and t-ratios for the variables in question, along with formal tests for the null
hypothesis that together the relative wage and price variables are zero. We see the expected impact
of the relative wage variable in the OLS specication, but in the two probit models its impact
is not statistically signicant.
The wheat-price variable is never statistically signicant, and the
two variables only belong collectively in the OLS specication. The OLS and probit specications
capture dierent features of the height distribution, with the OLS model using more information
from the distribution of heights in this sub-sample; the relative wage and price variables are used
again in another specication, below.
6.1
36
Selection using the RSMLE approach
The regressions discussed above are only an approximation to the model most heights studies have
in mind. To get closer, we estimate the RSMLE model Floud et al use. Note the complication here;
the RSMLE is only correct under the assumption that the distribution above the truncation point
is free of any selection. We have shown this not to be true. So while our RSMLE estimates replicate
and expand upon those reported by Floud et al, they are not, in some sense, correct, because the
statistical assumptions upon which they rest are violated. But the same observation applies to the
Floud et al estimates.
Following Wachter and Trussell (1982a), we assume heights are normally distributed conditional
on a set of covariates
x, where the covariates can inuence either or both the mean and the standard
deviation of heights. Because of possible rounding issues, we only consider the integer portion of
reported heights; that is, we round heights down to 67 inches, 68 inches, etc. For each observation
36
It is important to recognize that the elasticities reported in Table 5 and how the relative wage and wheat price
variables aect the birth year dummy variables tell us very little about changes over time in the population height
distribution. That is because we have not explicitly incorporated the selection process into our empirical model. The
fact that the time of enlistment variables are signicantly related to enlistee' heights, however, is evidence of some
kind of selection.
25
i,
dene the mean conditional on
σi = exp (x0i πσ ).
x
and standard deviation conditional on
x
as
µi = x0i πM
The likelihood function for the integer values of heights, conditional on height
being at or above a possibly person specic lower limit
Li ,
is given by
1(hi ≥Li )
 xhi y−µi
i
−
Φ
Φ xhi y+1−µ
σi
σi


i
1 − Φ Liσ−µ
i=1
i
N
Y
where
1 (·)
and
is the indicator function and
x·y
(14)
is the oor function that extracts the integer portion
of the height.
Table 6 reports estimates and standard errors for the RSMLE, where the covariates are birthcohort dummies in ve-year intervals, and the real-wage and wheat-price variables described above.
The model suers from stability problems, and to estimate it, it was necessary to move from the
37 The message here is very strong: current conditions
single-year to the ve-year birth cohorts.
aect both the mean and the variance of heights in the RSMLE estimates. For the null hypothesis
that both the relative-wage and the wheat-price variables are zero, the
q2
statistic is 33.24, which
38 Under the null of no
with four degrees of freedom is signicant at standard signicance levels.
selection on current conditions, neither variable should matter.
interpretation as in the OLS estimates.
Shifting the mean has the same
The eect on variance is also important, as it probably
directly reects the changes in the right-hand tail that are a manifestation of selection.
It is important to recognize that the birth cohort eect estimates in Table 8, even though the
model controls for variables related to selection at the time of entry into the military, do not provide
simply interpretable height variations.
To obtain such estimates, one would need to construct a
correct model for heights that incorporates explicitly the selection into the military process. For a
static selection model, and under the assumption that all unobserved skills and tastes are normally
distributed, one could use the selected height density function in equation (11) to derive maximum
likelihood estimators of the parameters of the birth-cohort unconditional height distributions. We
37
For several single year birth cohorts, the model estimated mean population heights below 50 inches, sometimes
well below zero inches.
The ve-year birth cohort 1836-1840 yielded absurd mean height and standard deviation
estimates (e.g. N(-10,15)), so we removd individuals in this ve year group from this part of the analysis.
38
The log-likelihood for the unrestricted model presented in Table 8 is -9689.4979. For the restricted version, where
we remove the relative wage and wheat price variables from both the mean and the standard deviation in the RSMLE
model, the log-likelihood is -9706.1187. The test statistic (2 x the dierence in the log-likelihoods) is distribution as
2
with degrees of freedom equal to the dierence in the number of parameters in the two models.
q
26
did use this density function, with corrections for possible minimum height standards and rounding
as discussed above, but the estimates were quite unstable. That is not surprising; the estimation
strategy attempts to model the joint distribution of heights and selection using cross sectional macro
economic measures and height data only from the right hand tail of a self-selected sample.
7
Treatment of sample-selection bias in the historical heights literature
The scientic study of the human physical growth auxology emerged in the 1830s, with studies
by European scientists, including Louis-René Villermé, Adolphe Quetelet and Eduoard Mallet, who
gathered information on the heights of army recruits in France, Belgium and Switzerland respectively
(Staub et al. 2011). Villermé drew a connection between height and health; Quetelet introduced the
normal distribution to the study of practical scientic questions, including human growth; Mallet
noticed a modest urban height advantage and attributed it to Geneva's relative prosperity. A halfcentury later Danson (1881) published his statistical study of English prisoners, which showed that
males did not reach their terminal heights until after age 22.
He concluded that armies should
eschew 18-year old recruits because slightly older men who had already reached their terminal
height would prove to be hardier soldiers. The thread that connects these studies is their belief that
height reected well being. The thread that connects them to the modern literature, besides their
concerns with human height and well-being, is that they relied on readily available and possibly
selected samples. Selection was an issue at the dawn of statistical anthropometrics and, as we argue
below, remains an underappreciated issue in the literature. In this section, we review how selection
issues have shaped the discussion of three principal sources of height data: military recruits, slaves,
and prisoners. With the exception of a fruitful, but underappreciated debate about height-based
selection into the slave trade, concerns with selection have not been given their due.
7.1
Soldiers, military recruiting, and military school students
Fogel et al (1982, pp.29-30) were the rst to call attention to the industrialization puzzle, namely that
the positive correlation between height and per capita income observed in modern cross sectional
studies did not hold in the time series for adult white males in the late antebellum United States.
27
They postulated that the decline in heights was concentrated in the urban-born population. Rapid
urbanization, poor public sanitation and more pronounced urban income inequality meant the
conditions of city life deteriorated during industrialization. Urbanites were shorter as a consequence.
Several subsequent studies report remarkably similar patterns.
Komlos (1989) reports a late
eighteenth-century decline in the heights of adult male Hapsburg recruits.
The mid-eighteenth
century peak was not attained again for nearly 150 years. Floud et al (1990) nd a general secular
increase in English heights during the Industrial Revolution, with a sharp reversal for cohorts born
in the 1840s and 1850s. Komlos (1993) reworks the English gures and argues that that English
heights declined between 1770 and 1830.
Subsequently Komlos and Küchenho (2012) push the
onset of English height decline back to the 1750s, a secular decline that persists to 1850. The era of
early industrialization was an era of shrinking men. A'Hearn (2003) reports a 3cm decline in Italian
heights between 1740 and 1800, attributing it to Malthusian pressures.
Studies of U.S. soldiers generate similar results. Komlos (1987) and Coclanis and Komlos 1995,
g. 3, p.101, use military school students to demonstrate long height cycles. The mean height of
19-year old West Point cadets falls by a half-inch in the late antebellum era, but recovers by the end
of the Civil War. The mean height of 19-year old Citadel cadets is stable up to 1900 and increases
39 Steckel (1995) links several independently constructed series
by about 2.5 inches up to the 1930s.
of US soldier heights, from the French and Indian War (1754-1763) through the Second World War
(1941-1945). His now nearly iconic diagram, reproduced above as Figure 2, demonstrates a decline
in terminal adult heights for cohorts born between the 1830s and the 1880s.
Once the industrialization puzzle was found in the time series, it was quickly uncovered in the
cross section as well.
Komlos (1989) nds that Hapsburg Empire army recruits from the most
economically developed regions within the empire were the shortest, while recruits from the least
developed regions were among the tallest. Similar patterns emerged elsewhere in Europe. Mokyr
and Ó Gráda (1996) report that poor Irish recruits into the English East Indian Company (EIC)
army were taller than less poor English recruits. Moreover, Irish EIC recruits from relatively wealthy
39
Gallman (1996) disagrees with Komlos' interpretation.
The heights of 16- and 17-year old recruits declined
between the 1820s and the 1830s, but then increased in the 1840s and 1850s. The cohorts of the 1860s were slightly
taller than the 1820 cohort. These two age groups account for two-thirds of Komlos' sample and fail to demonstrate any
long-run decline in height. A simple OLS regression of Komlos' height index on decade of birth yields a statistically
insignicant coecient of -0.0066, or that heights declined by 0.6% per decade between 1820 and 1860.
But the
condence interval includes no change in height. The downward trend in the Coclanis and Komlos (1995) diagram is
not readily apparent.
28
Ulster were shorter than recruits from elsewhere in Ireland. Sandberg and Steckel (1988) nd that
mid-nineteenth century Swedish soldiers from the less developed north and east were taller than
recruits from the more developed west.
rural-born peers (A'Hearn 2003).
Urban Italians paid a height penalty relative to their
In the United States, Union Army troops from less developed
Kentucky and Tennessee were taller than troops from the Old Northwest who were taller than troops
from industrializing New England (Johnson and Nicholas 1997, p.
208).
Among Pennsylvanian
recruits, men from less developed and commercially-oriented regions were taller than men from the
industrializing southeast region of the state (Cu 2005 p.207). Margo and Steckel (1992) also found
that ex-slave recruits into the Union Army from the less commercialized inland areas were taller
than those from the more commercialized coastal regions.
Explanations of both the time-series and the cross-section puzzles built on those oered by
Fogel et al (1982). The resolution of the industrialization puzzle oered in the literature focuses on
the decline in net nutrition that occurred in the early stages of industrialization. According to this
view, the underlying sources of decline were: (1) increasing income inequality; (2) increasing income
variability; (3) increases in the price of food relative to manufactured goods; (4) increasing distance
between the production and the consumption of food, with the consequent spoilage waste and loss of
nutrients; (5) increased work eort; and (6) increased infection rates and disease incidence (Komlos
and Coclanis 1997, p.455). The problem, of course, is accounting for many or most of these eects,
40
which, in the end, are more commonly asserted than shown to be the cause of height trends.
Coclanis and Komlos (1995, p.
92), in fact, insist that any residual controversy concerning the
industrialization puzzle centers on the nature and causal connections of height cycles because the
existence of heights cycles, is no longer questioned. We beg to dier. While we do not reject the
possibility that heights cycled prior to the long secular increase in heights observed in the twentieth
century in the developed world, we remain skeptical about eighteenth- and nineteenth-century cycles
because the literature has not taken the selection problem (as opposed to the truncation problem)
seriously.
Much as the literature follows Fogel et al (1982, p.42-45) in documenting the puzzle, it follows
40
Several studies attempt to estimate per capita food consumption (or disease incidence) and link it to birth cohort
heights, but the exercise of recreating diets is fraught with assumptions, judgment calls and error (Floud et al. 1990)
See Haines, Craig and Weiss (2003), Komlos (1987), and Bodenhorn (1999) for examples. Gallman (1996) and Haines,
Craig and Weiss (2003) express skepticism concerning antebellum nutritional decline.
29
them in its discussion of selection issues. They write:
Much of our work during the past four years has been devoted to assessing the
quality of the data .... Volunteer armies, especially in peacetime, are selective in their
admission criteria and often have minimum height requirements. Consequently, even if
information on rejectees exists, there is the question of the extent to which applicants
are self-screened... There is clearly evidence of self-selection bias in volunteer armies.
Persons of foreign birth and from cities are overrepresented.
Native-born individuals
living in rural areas are underrepresented...
Their concerns with the nativity and urban-rural composition of their samples do not address
height-based selection into the sample; rather it reects selection on other characteristics within the
sample. They then turn their attention to left-tail shortfall and their approach to correct for the
nonnormality of the data that followed from the military's minimum height standard.
More than two decades later, the discussion is hardly changed. After a brief discussion of existing
methods for dealing with left-tail shortfall, A'Hearn (2003, p.359) turns his attention to selection:
A second complication concerns selection eects. Whatever method is used [to correct
for left-hand tail shortfall], the legitimacy of comparisons across groups depends on
constant selectivity above the truncation point... [T]here is reason to worry about this
for the sample at hand. In the eighteenth century, especially before 1770, it seems likely
that recruiters did not get a random sample of individuals exceeding 63 in height...[But]
it seems likely that selection disproportionately emphasizes heights in the immediate
neighborhood of the statutory minimum.
From there, the discussion returns to the truncation issue, and A'Hearn lets pass an opportunity to
explore the implications of selection on heights throughout the distribution of heights rather than
in the immediate neighborhood of the minimum height standard.
Our survey of the literature uncovered only a few instances in which the selection problem
discussed above is recognized and given its due. Floud (1994, p.19) acknowledges that recruits into
volunteer armies represented a self-selected sample that may not be representative of the population.
He continues, however, that any selection is unlikely to be large enough to vitiate comparisons
30
over time and between...
countries.
Weir (1997, p.174-5) disagrees.
He recognizes if recruiter
selection manifests as a strict exclusion only of those below the truncation point, the methods, such
as RSMLE and QBE may correct for it, but if selection is continuous across the entire distribution of
heights, these estimators fail to generate accurate estimates of mean height. Meier(1982, p.297), in
his comment on the original Wachter and Trussell (1982a) paper introducing the RSMLE and QBE
estimators, pointed out that recruitment eort and disparagement of shorter individuals might
vary continuously over a range of heights and could yield a selected military height distribution with
a nearly normal shape. Wachter and Trussell (1982b, p.302), in their rejoinder, agreed that some
scenarios for recruitment and volunteering could yield spuriously normal observed distributions
whose failure to represent the underlying population would be undetectable from internal evidence.
In his critique of Komlos' (1987) study of West Point cadets, Gallman (1996, p.194) not only
refuted Komlos' claim of nutritional decline but raised serious concerns about selection. If Komlos'
contention that cadets' heights actually declined were true, it would only be interesting if cadets
can be taken to be a random sample of some larger, more interesting group say all young white
men in the United States. That, of course, cannot be. Not everyone was eligible for West Point
and, of those who were, young men who attended were interested in either a military or engineering
career.
Despite the unrepresentative nature of the sample, Gallman (1996, p.194) notes that it
might still have wide meaning if the pool of candidates retained an unchanging character across
the full period of this study that is, if the same segments of the society continued to supply
candidates, in roughly the same relative numbers. But is that likely to have been so?
Gallman
doubts it. Cohorts born after the early 1840s and interested in a military career would have faced
the prospect of serving in an army with a large permanent class of lieutenants, captains and majors
who had gained battleeld experience in the U.S. Civil War. This surely pushed many otherwise
promising cadets into other pursuits.
A stylized fact of the European industrialization puzzle literature is Mokyr and Ó Gráda's (1994;
1996) nding that poor Irish recruits into the English East India Company (EIC) army were taller
than similarly situated English recruits. A number of possible explanations have been oered for
this counterintuitive insight, including the relatively nutritious Irish diet of milk and potatoes and
epidemiological isolation (Nicholas and Steckel 1997). Mokyr and Ó Gráda (1994, p.42) oer an
alternative explanation: because incomes were lower in Ireland than England, the relative quality
31
of Irish recruits was higher.
The poor Irish appear to be taller in the record than they really
were because the EIC drew a larger proportion of recruits from better-o families. The taller Irish
were not really better o than the shorter English. Faced with less attractive civilian employment
opportunities for a given height, taller Irish men were disproportionately more willing to enlist
than were otherwise comparable (i.e., equally tall) English men. The tall Irish recruits, they note,
were a supply-driven rather than a demand-driven selection phenomenon. The remarkable feature
is not that Irish men presenting themselves for service in the EIC were taller than their English
counterparts; the remarkable feature is that the tall-but-poor Irish result is widely repeated in the
literature absent Mokyr and Ó Gráda's repeated concern that the result is chimerical.
41
Our reading of the historical heights literature that makes use of military samples, which represents a plurality, if not an outright majority of the studies, reveals that sample selection bias
is underappreciated. When the issue does arise, it is given passing notice and attention turns to
left-tail shortfall (or truncation bias).
Given our theoretical and empirical discussion above, the
failure to take sample selection bias seriously raises substantive concerns about the existence of and
explanations for the industrialization puzzle.
7.2
Slavery and the slave trade
Given that one of the principal contributions of the modern heights literature is the demonstration
that income and wealth inequality manifests itself in human height-at-age, it is not surprising
that historians were intrigued by the anthropometric consequences of slavery. Relatively deprived
children are consistently shorter at age than relatively well-o children, and it is hard to imagine
a more potentially deprived population than slaves.
Economic historians have produced several
notable studies of slave heights, which have led to two general results. First, slave children were
extraordinarily small (Steckel 1995, 1923). The mean height of slave children generally fell below
the rst percentile of modern stature until about their tenth year.
These kinds of heights are
sometimes observed in developing countries today, but are practically unheard of in the developed
world. As Steckel (1987) observes, a modern American pediatrician would be alarmed if presented
41
Nicholas and Steckel (1997), Johnson and Nicholas (1997), Komlos (1998, p.
780), among others, discuss the
tall-but-poor Irish without referring to selection concerns. Mokyr and Ó Gráda (1994, p.50) also doubt the time series
evidence. The 1810-1814 (birth) cohort of recruits was taller than the 1802-1809 cohort not because the biological
standard of living had changed noticeably, but because labor market conditions changed enough that the EIC had
greater success attracting taller Irish recruits.
32
with such a short child. Second, the growth of slaves in adolescence was remarkably vigorous so that
adult slaves attained nearly the 20th percentile of modern stature. Such remarkable recovery growth
generated tall slaves, at least by historical standards. Although they were shorter, by about one
inch, than contemporary white Americans, slaves were taller than many contemporary European
populations, especially low-income Europeans.
Economic historians have constructed plausible explanations for these two features of slave
heights. Poor medical knowledge and even poorer practice led to high infant mortality rates, which
is probably indicative of high morbidity rates from both acute and endemic (gastrointestinal and
42 Because diarrheal infection interferes with the nutrient-growth nexus, per-
diarrheal) infections.
sistent endemic infection will lead to stunting.
If nutrition is simultaneously low, the negative
consequences on growth are magnied. Slave children, according to this literature, survived on a
poor diet of hominy and pork fat. Growth recovery began around age 10 because the typical slave
child entered the plantation labor force around that time. Normally, the demands of heavy work
expected of slaves would have further interfered with growth, but once children entered the labor
force they received shoes, which reduced the extent of fecal-based gastrointestinal infection, as well
as more and better food, perhaps as much as one-half pound of pork per day. The increased meat
ration for working slaves was further supplemented by vegetables and legumes, which contributed
to the remarkable catch-up growth of North American slaves.
Although information on US slave heights comes from a host of sources contraband or
runaway slaves that joined the Union Army in the 1860s (Margo and Steckel 1982), notarized
certicates of good behavior led with New Orleans' courts (Freudenberger and Pritchett 1991),
manumission and freedom papers (Komlos 1992; Bodenhorn 2011), and runaway advertisements
(Komlos 1994), among others the principal source of information are the data recorded in the
coastwise manifests led by slave traders (Steckel 1979; Steckel 1987). To enforce the prohibition
42
Infant survival may create its own self-selection bias in that the ability to resist certain infections, which may
manifest itself as vigorous growth capacity in adolescence, may be partly responsible for the observed pattern of
slave growth. Instead of recognizing this possibility, vigorous adolescent growth is attributed to dicult to document
changes in the slave child's work regimen, disease environment and food allotments.
Rees et al (2003) develop a
dynamic optimization slave owner model linking planter prot maximization with the pattern of slave growth.
A
related literature arose around the question of whether smallpox reduced the height of survivors. Voth and Leunig
(1996) claim that the near eradication of small pox in nineteenth century England is responsible for about one-third
of the reported increase in average heights between 1770 and 1873. Voth and Leunig's nding brings attention to
potential survivor bias and the eects it may have on recorded heights over time. See Razzell (1998) and Oxley (2003)
for critiques of the Voth-Leunig hypothesis.
33
on international slave trading, slave traders moving slaves in the seaborne, interregional, domestic
trade were required to provide shipping manifests that included the name, age, sex, color and height
(in feet and inches) of the slave, as well as the name and residence of the slave trader. Prior to a
ship's departure the captain prepared a duplicate manifest with slave and trader information along
with information on the ship. One copy of the manifest was kept by the collector at the port of
origin; the other was presented to the collector at the port of destination.
Tens of thousands of
slaves were recorded on manifests, many of which are now available at the National Archives.
The issue of whether selection bias aects the observed pattern of height-at-age and growth
velocity is not discussed in either Steckel (1979) or Steckel (1987), but it merits two paragraphs in
Trussell and Steckel (1978, 550-551), who use the data to conrm their estimates (drawn from other
sources) of age at menarche and age at rst birth among slave women. They discuss two possible
selection eects: a downward bias due to a lemon's market in slaves, in which only below-average
height slaves entered the interregional trade; or, an Alchian-Allen (1964) shipping the good apples
out market in which a xed shipping cost represented a smaller fraction of the higher price received
for taller slaves making taller slaves more likely to enter the interregional trade. They contend that
age at peak growth velocity (onset of adolescent growth spurt) would not be much aected by either
eect and do not further pursue the issue of potential sample-selection bias.
Higman's (1979; 1984) studies of Caribbean slaves provide some empirical evidence that selection
may be driving Steckel's adolescent recovery nding. Unlike the heights of US slaves, which was
recorded only if slaves entered into the coastwise trade, the British government required universal
registration of slaves in anticipation of general emancipation. Registers of Trinidadian slaves were
open for public inspection and government ocials visited plantations to conrm their validity,
making corrections when necessary. Caribbean slaves demonstrate less adolescent recovery; some
attain only the 2nd percentile of modern height (Steckel 1995, p. 1924). Height dierences between
US and Caribbean slaves may be due, of course, to dierences in genetic potential, disease incidence,
work load or diet (Steckel 1995, p. 1925, in fact, attributes them to fetal-alcohol syndrome due to
substantial and growth-retarding rum allowances on Caribbean plantations). But we need not turn
to dicult-to-prove hypotheses when the dierences were driven, at least in part, by height-based
selection into the US sample and near-universal coverage in the Caribbean sample.
Freudenberger and Pritchett (1991), Pritchett and Freudenberger (1992), and Pritchett and
34
Chamberlain (1993) systematically explore the sample-selection bias problem of the coastwise manifest sample. They argue that the manifest sample is subject to substantial selection on height of
the shipping the good apples out type. That is, when a xed transportation cost is applied to
similar goods, the high-quality, high-priced good (a taller slave in this instance) becomes relatively
less expensive in the destination market. Four features are needed for the good-apples eect to hold:
(1) transport costs must be non-negligible; (2) transportation costs are not proportional to price at
the source; (3) the goods are close, but not perfect substitutes; and (4), the elasticity of substitution
between each of the two goods in question and a composite third good (say, free or indentured labor)
not be substantially dierent (Borcherding and Silberberg 1978). Pritchett (1997) and his coauthors
provide evidence on transportation costs that supports conditions (1) and (2). Conditions (3) and
(4), they contend, are defensible: tall and short slaves are (imperfect) substitutes in production;
and free or indentured labor, as the historical record shows, were substitutable for slaves, whether
tall or short.
43
Moreover, they show that the eect will be more pronounced for younger than adult slaves. One
prediction of the good-apples slave model is that prices will be a positive, but declining function of
age (Pritchett 1997, p.73 note). Because buyers were willing to pay more for taller slaves, traders
faced incentives to select taller slaves for the coastwise market and the incentives were stronger,
the younger the slave. Calomiris and Pritchett (2009) nd that slave children shipped with their
mothers were shorter than children of the same age shipped alone. This result is puzzling absent
selection on height.The available evidence is consistent with this prediction; the height dierential
between slaves traded in the interregional market and those traded in local markets were most
44
pronounced in childhood, declining in adolescence and largely disappearing among adults.
The conclusion to draw from the exchange between Pritchett and his coauthors and Komlos and
Alecke (1996) is that slaves shipped coastwise were not representative of the general population of
North American slaves (Pritchett 1997, p. 83). In this instance, the consequence of sample-selection
bias works in favor of one of existing interpretations.
Because selection on height was stronger
for younger slaves than for adult slaves, Steckel's coastwise sample includes proportionately more
43
See Komlos and Alecke's (1996) response to Pritchett's evidence The substitutability of slave and indentured or
free labor is discussed in Galenson (1981) and Grubb (1994; 2001).
44
Pritchett's (1997, p.
78) empirical work is not without its own problems namely, that he estimates what
Hausman labels the forbidden regression (a probit rst-stage in an IV system).
p.190) for a discussion of the so-called forbidden regression.
35
See Angrist and Pischke (2009,
tall children than tall adults than would be observed in a true random draw of the general slave
population. This means that the much-discussed remarkable recovery growth of adolescent slaves is
underestimated. It is likely that young slaves were even more pathologically short than they appear
in the manifest sample.
Unfortunately, we cannot provide as sanguine a conclusion about the second result of the slave
height literature. The good-apple model predicts that selection will be less pronounced at older ages
and lowest among adults, but it does not predict the absence of selection. Higman's (1979) study of
Caribbean slaves, in fact, provides some evidence of selection on height for inter-island trade. The
mean height of native-born Trinidadian adult males (25-40 years) measured in 1813 was 165.6cm.
The mean height of creole male slaves, or those born in the New World, and imported into Trinidad
from sugar islands was signicantly taller at 167.3cm (t=2.15). Creole slaves imported from nonsugar island were taller still at 170.6cm (t=4.42). Imported females, too, were signicantly taller
than native-born Trinidadian female slaves.
The available evidence, although not denitive, points toward good-apples selection in the interregional slave trade.
This type of selection need not be revealed by (non)normality of height
distributions, or heaping on height or age, or left-tail shortfall. That the selection process is not
readily revealed does not, of course, imply that it is unimportant. Unrepresentative selected samples
will yield incorrect inferences when selection is correlated with the variable of interest. Pritchett's
concerns with Steckel's results are well-founded and it is unfortunate that his contributions have
not had more inuence on the literature.
7.3
Criminals and prisoners
The anthropometrician's concern with the condition of the working classes during industrialization,
in combination with the wide availability of data, led scholars to study the heights of incarcerated
or transported criminals. Scholars making use of prisoner data readily acknowledge that prisoner
samples are not representative of the underlying population; the professional and middle classes and
farmers are underrepresented and unskilled laborers and other low-wage groups are overrepresented
(Steckel and Nicholas 1991; Riggs 1994).
In one of a series of studies of nineteenth-century US
prisoners, Carson (2008, p.591) casts the selection vice as a virtue:
36
The prison data probably selected many of the materially poorest individuals, although there are skilled and agricultural workers in the sample. While prison records
are not random, the selectivity they represent has its own advantages in stature studies,
such as being drawn from lower socioeconomic groups, who were more vulnerable to
economic change.
For the study of height as an indicator of biological variation, this
kind of selection is preferable to that which marks many military records minimum
height requirements.
In other words, prisoner heights are preferable to soldier heights because prisoners are presumably
not selected on height and prison samples are not subject to left-tail shortfall.
Although Carson labors to turn a vice into a virtue, it remains a vice if the purpose of the
historical anthropometric literature is to speak to the average standard of living. Several studies
(which are not without their own selection problems) have shown that the heights of the middle and
upper classes in the nineteenth-century United States did not mirror the height cycles of soldiers
and prisoners (Sunder 2004; Sunder 2011; Lang and Sunder 2003).
Thus, if lower-class heights
are more responsive to economic changes than middle- and upper-class heights, samples in which
the lower classes are overrepresented are likely to overstate any real changes in population heights.
Moreover, inferences about changes in the average biological standard of living may reect changes
in the sample composition, which may be pronounced over the business cycle or during long-run
secular changes in rates of economic growth. Using a small subset of the available samples, Sunder
and Woitek (2005), in fact, show that co-movements between economic and height cycles are most
pronounced for American blacks (predominantly unskilled labor), relatively pronounced for Austrian soldiers and Ohio National Guardsmen (relative overrepresentation of laboring classes), less
pronounced for late nineteenth-century registered voters, and virtually nonexistent for middle- and
upper-class women.
The potential for compositional changes in the heights of individuals selected into prisons to
drive temporal height changes has not gone unnoticed. Nicholas and Oxley (1996), Johnson and
Nicholas (1997) and others have considered whether the relative under- or over-representation of
certain groups reects population proportions or change over time.
p.949), for example, oer the following:
37
Nicholas and Steckel (1991,
Although there is no single powerful test of selectivity bias, the various tests reported
hereafter put at risk the hypothesis that the decline in heights was an artifact of our data.
Almost one-third of our convicts were tried in a county other than that in which they
were born, but it is not possible to know whether the move occurred during childhood,
adolescence, or after maturity. Assuming that all convicts tried in the same county as
their place of birth were nonmovers, ve-year moving averages of height for rural- and
urban-born nonmovers showed the same prole as that for the entire sample.
This argument is irrelevant to the issue of selection into their sample; it merely shows that there is
no apparent selection into two sub-populations within their sample. A common thread connecting
studies of the industrialization puzzle is the proportion rural- and urban-born. This is natural given
concerns that nineteenth-century urbanization led to a marked decline in the biological standard
of living.
But this is not exactly the issue.
The issue is whether cohorts born into urban areas,
conditional on family income and wealth (and thus height), were more likely to enter into crime,
and be apprehended, tried, convicted and sentenced to prison or transportation.
In the modern world, crime is more common in urban than rural places and there is evidence
that this was so historically as well (Glaeser and Sacerdote 1999; McLynn 1989). Moreover, evidence from nineteenth-century US prisons shows that prisoners were short compared even to soldiers
(Bodenhorn, Moehling and Price 2012). The issue is whether selection into prison, conditional on
selecting into crime, followed from height, which seems likely. Persico, Postlewaite and Silverman
(2005) and Case and Paxson (2008) nd evidence that height is positively correlated with labor market outcomes (employment and wages), mediated through cognitive abilities and the accumulation
of more human capital by taller youth and adolescents. Because taller individuals face relatively
better legitimate labor market opportunities than shorter individuals, criminal activities are less
attractive to taller people. Prisons are thus populated by short, relatively low-income people.
Researchers take solace in the fact that their samples are Gaussian but, as we have demonstrated
above, apparent normality alone will not reveal selection even in the presence of selection. The issue
that concerns us is whether selection on height was likely to change in a way that would generate
the counterintuitive results reported in the historical heights literature, whether those changes are
evident in the research, and whether the explanations are consistent. The Roy-type model developed
38
above predicts that better economic conditions are consistent with shorter prisoners in both the cross
section and the time series.
If legitimate labor market opportunities are conditioned, at least in
part, on height, economic expansions will create more and better legitimate opportunities and short
individuals who may have selected into crime in bad times will select into legitimate activities in
good times. While criminals will exhibit a distribution of heights, they will be disproportionately
drawn from the left-hand tail of the population distribution and even more so as economic conditions
improve. If we use the heights of prisoners as indicators of the biological standard of living, it will
appear that biological times are tough when economic times are good and vice-versa. In fact, what
we are observing is dierential selection on height into the criminal class that gets caught and
convicted.
Consider Riggs's (1994) study of Scottish prisoners. He claims the sample is representative of
Scottish working-class heights because in a society of heavy drinkers ... many workers were at risk
of being arrested and thus having their physical stature preserved in the historical record (Riggs
1994, p.64). While his results are in general agreement with the industrialization puzzle, he nds
the curious result that the heights of those arrested in the 1840s actually increased markedly.
Riggs (1994, p.70) considers this curious because the 1840s, known as the hungry forties, were
notorious for hardship and hunger. When considered in light of our Roy model, the result is not
so curious. If the eect of food shortages and unemployment was that men moved from legitimate
to criminal activities, the deterioration in legitimate opportunities would draw dierentially more
men into crime from the right-hand than the left-hand tail of the height distribution because men
in the left-hand tail were already disproportionately in the criminal market prior to the downturn.
Moreover, if right-tail entrants into the crime market have relatively little criminal human capital,
they were probably more likely to have been apprehended and imprisoned. Thus, heights apparently
increase during a sharp economic downturn when the intuitive connection between the biological
and economic standard of living would suggest otherwise.
In a series of studies, Stephen Nicholas and co-authors study the heights of late-eighteenth
and early nineteenth-century English and Irish prisoners.
They acknowledge that their sample
is unrepresentative in that the lower classes and the young are overrepresented, but one feature
recurs:
heights.
the Irish are taller than the English, a feature discussed earlier in reference to military
Nicholas and Steckel (1991) nd that male Irish convicts are taller than male English
39
convicts.
Nicholas and Oxley (1996) nd that the heights of rural-born English women decline
while the heights of Irish-born women rise. Johnson and Nicholas (1997) use the Irish as a control
group against which to compare English women because no Industrial Revolution occurs in Ireland.
Yet, Irish women are tall relative to the English.
selection biases...
They are condent that there are no obvious
that would generate their results.
Their contention is puzzling in that they
cite and discuss Mokyr and Ó Gráda's (1994; 1996) result without discussing their conjecture that
the Irish height advantage results from the army being more attractive to taller people in a poor
economy. Is it not also likely that criminal activity is relatively more attractive to taller people in
a poor economy?
Finally, studies of American prisoners yield results similarly at odds with the industrialization
puzzle.
Komlos and Coclanis (1997) nd that the heights of black men born into slavery did
not decline in the 1830s and 1840s; Sunder (2004) reports that heights of Tennessee prisoners
were stable in the late-antebellum era; Carson (2008) nds no evidence of the antebellum puzzle
among Missouri's prisoners.These studies then attempt to reconcile stable or rising stature with
the antebellum puzzle. Komlos and Coclanis (1997, p.452) contend that slaves were insulated from
the market and ate all they were allotted; Sunder (2004) argues that Tennesseans raised hogs
and did not sell meat in the interregional market so that they had plentiful access to protein; and
Carson's (2008, p.603) explanation turns on the relatively tall stature of the low-income Ozark
Missourians due, in large part, to their reliance on dairy (a variation of the tall-but-poor Irish
45 Alternatively,
explanation), whereas wealthier northern Missourians grew less protein-rich grains.
our Roy model predicts a south to north Missouri prisoner height gradient that is consistent with
a north to south income gradient. The relatively less attractive legitimate market opportunities in
the Ozarks drew relatively taller men into the criminal market; better opportunities in northern
Missouri drew men of comparable height, conditional on other characteristics, into the legitimate
labor market.
Our test of the Roy-type model in Section 5 above notwithstanding, we acknowledge that we
have not tested our theory against every available data set. It is possible that it will not explain
45
Rees et al (2003) provide a dynamic optimization model of slave owner behavior consistent with the remarkable
catch-up growth uncovered by Steckel (1986; 1987), which oers some insight into care and feeding of slaves in
response to changes in the market price of slaves. The Rees et al (2003) model presumes substantial market-oriented
responses rather than insulation from the market.
40
every historical human height experience. Only additional testing can illuminate the model's shortcomings. What the model provides, however, is a set of internally consistent predictions that do not
rely on ad hoc reconciliations of the evidence with the industrialization puzzle. Military and prisoner heights rise in bad economic times because alternative employments become relatively more
attractive to the legitimate civilian labor market.
Some tall men who would nd remunerative
legitimate, civilian employment in a good economy turn to the military or crime in a bad one.
Alternatively, some short men who would not have attractive legitimate market opportunities in
bad times will be drawn into the civilian market in good times. The result is an observed, but not
actual countercyclical pattern of biological well being.
8
Conclusions
The heights literature is now a large and important component of economic history and related
areas.
We agree that the central issues of interest in this literaturelong-term changes in the
standard of livingshould occupy a prominent place in the agenda of social scientists interested in
the past. Unfortunately much of the current literature that uses heights as an indicator of living
standards is badly awed. Most of these sources are selected in some way, and the biases that arise
because of the selection are pernicious and quite possibly account for many of the interesting facts
the heights literature claims to have identied.
Our message may seem entirely negative, but that is not how we intend it. First, we do not
advocate rejection of the heights literature, its themes or basic approaches. We do think that taking
selection into account will require tempering of some of the broader claims made. We also hope
that attention to selection will guide the literature away from complicated, ad-hoc explanations of
surprising ndings; as we note, some of these at least reect little more than selection.
Second,
and more importantly, we hope that careful attention to selection issues will allow these sources to
speak in new and interesting ways. The nature of selection into the British Army (for example) is
a statistical problem for those who want to use soldiers' heights to infer changes in the standard
of living standards during the Industrial Revolution. But the selection process itself can yield rich
insights into the nature of labor markets at the time: what led some young men into the Army, and
what does this tell us about working-class life and the opportunities oered by an industrializing
41
economy? Taking selection seriously, instead of assuming it away, promises yet more insights from
an already mature literature.
42
Appendix: the military's decision to accept a recruit
The occupational choice model developed in section 3 implicitly assumes that the Army will accept
everyone who wants to volunteer. We know this assumption to be false; the most obvious restriction
was the minimum height requirement that generates the need for estimators such as the RSMLE
and the QBE. We now extend our model to allow the military, too, to be an optimizing agent, one
that can reject for military service some individuals who volunteer.
We assume the Army produces a service called security using a constant-returns-to-scale pro-
46
duction function that depends on the sum of security services provided by all current soldiers.
Each individual soldier produces a security service that is a function of his height
military abilities
εM .
Again, we are agnostic on whether height
per se
h
and a set of
makes a better soldier, or
whether height is simply correlated with characteristics that make men better soldiers. The Army
pays each soldier in the manner described earlier. The Army must also pay additional xed costs
for each serving soldier over and above what that man receives as a wage. These costs include
uniforms and training. The net security value (security value minus costs of hiring) of a soldier with
height
h
and military abilities
εM
S (h, εM ).
is
One would expect
S (h, εM )
to be positive, but one
can imagine soldiers who cost more than they are worth.
Now abstract from our log-linear formulation of the utility of being a civilian and the utility of
being a solider. Suppose the civilian economy values these same two traits along with a third trait
that is specic to the civilian sector,
εC .
Civilian wages are given by
wC (h, εC , εM ) .
(15)
Suppose the military can pay soldiers a wage that is a function of all three characteristics. Assuming
the military can observe
h, ε C ,
and
εM ,
let
wM,h,C,M (h, εC , εM )
be the wage that the military pays a soldier.
46
(16)
We continue to assume that each individual has
In our model, dierent types of soldiers can produce dierent amounts of security, but there are no externalities
or spillovers; a good (bad) soldier does not increase (decrease) the eectiveness of other soldiers.
43
preference parameters for the civilian and military life
τC
and
τM .
This implies that a man joins the Army if
wM,h,C,M (h, εC , εM ) + τM ≥ wC (h, εC , εM ) + τC
(17)
τ ≤ wM,h,C,M (h, εC , εM ) − wC (h, εC , εM )
(18)
or
where
τ = τC − τM .
Let
fh,M,C,τ (h, εM , εC , τ )
be the joint distribution of the three characteristics that can be re-
warded, as well as relative tastes, and assume that the military minimizes the cost of producing
any particular level of total security. The military's objective is to choose a wage payment function
wM,h,C,M (h, εC , εM )
min
S,
or
Cost =
wM,h,C,M (h,εC ,εM )
ˆ
to minimize the cost of producing a given amount of security
∞ˆ ∞
ˆ
ˆ
∞
wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM )
N
wM,h,C,M (h, εC , εM ) fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh
−∞
0
−∞
−∞
(19)
subject to providing some specic level of security
ˆ
∞ˆ ∞
S=N
ˆ
∞
ˆ
wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM )
S (h, εM )
−∞
0
fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh
−∞
−∞
(20)
where
N
is the size of the population. Let
λ be the Lagrange multiplier corresponding to the security
constraint. Assuming interior solutions, for each combination of
wage,
wM,h,C,M (h, εC , εM ),
ˆ
(h, εC , εM ) the military will set the
such that
wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM )
N
fh,M,C,τ (h, εM , εC , τ ) dτ
−∞
+ N wM,h,C,M (h, εC , εM ) fh,M,C,τ (h, εM , εC , wM,h,C,M (h, εC , εM ) − wC (h, εC , εM ))
= λN S (h, εM ) fh,M,C,τ (h, εM , εC , wM,h,C,M (h, εC , εM ) − wC (h, εC , εM )) .
(21)
The left hand side of this expression is the marginal cost of paying soldiers with productive traits
44
equal to
(h, εC , εM )
each one more penny; it is the sum the Army must pay each soldier already in
the military (conditional on having this particular set of traits) plus the cost of expanding the Army,
that is, paying for additional soldiers with those same traits. The right had side of the expression
measures the marginal benet of the increased security provided by the additional soldiers (with
these traits) induced into the military by the higher wage. For interior solutions, the ratio of the
marginal costs of soldiers with dierent sets of characteristics should equal the ratio of their marginal
benets.
This formulation implies that the military can extract some surplus from those enlisting in the
military. That is, the optimal military function may depend on observed characteristics that have
no impact on a potential recruit's ability to provide military services. To see this, consider two sets
of individuals whose military-relevant characteristics are identical, but who dier in characteristics
valued by civilian jobs,
εC .
Because they dier in their
Assume the military enlists some positive fraction from both groups.
εC
values, they have dierent potential earnings in the civilian sector.
If they are equally represented in the population, then the rst order conditions will be satised
as long as the military keeps the military-civilian pay dierential constant for the two groups. The
military will pay the group with the lower valued civilian sector trait,
εC ,
less than it would pay
enlistees from the group with the civilian sector trait that is more highly valued, even though this
trait has no impact on their performance as a soldier.
For a variety of reasons, the military may not be able to use such ne details to make pay oers.
Suppose, for the moment, that the military can only observe height,
pay policy on this one characteristic. In this case
h,
and so it can only base its
wM,h,C,M (h, εC , εM ) = wM,h (h),
and the rst
order conditions become
ˆ
∞
ˆ
∞
ˆ
wM,h (h)−wC (h,εC ,εM )
N
fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM
−∞
−∞
ˆ
−∞
∞ ˆ ∞
+ N wM,h (h)
fh,M,C,τ (h, εM , εC , wM,h (h) − wC (h, εC , εM )) dεC dεM
−∞ −∞
ˆ ∞
ˆ ∞
= λN
S (h, εM )
fh,M,C,τ (h, εM , εC , wM,h (h) − wC (h, εC , εM )) dεC dεM .
−∞
(22)
−∞
The left hand side of this rst order condition is the expected cost of increasing the military pay
by one penny for all people with height
h.
It consists of two components. The rst is the additional
45
penny paid to all individuals of height
h who would have joined the military even without the higher
pay. The second term is the expected full cost of the marginal enlistees with height
h.
The right
hand side again measures the expected marginal benet from the new enlistees with height h who
enter the military because of this increased level of pay.
a priori
restrictions on the sign or
magnitude of the military pay for any observed set of characteristics.
For some characteristics,
Note that both of these military pay formulations place no
(h, εC , εM ), S (h, εM ),
could be negative. This would imply that the recruit has to pay the Army
for the privilege of being a soldier. We know of no such practice and think it is a principal reason
for minimum height requirements.
More realistically, military compensation schedules during the eighteenth and nineteenth centuries usually did not include explicit height gradients. But they usually had a minimum height
requirement in addition to a xed level of compensation. To incorporate these features into the cost
minimization model, we allow the military to choose a non-varying level of pay and to restrict military service to those above some optimally-chosen minimum height threshold
D.
Now the military's
cost minimization problem can be written as
ˆ
∞ˆ ∞
ˆ
ˆ
∞
wM −wC (h,εC ,εM )
min Cost = wM N
fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh
wM ,D
D
−∞
−∞
(23)
−∞
subject to the specied security level
ˆ
∞ˆ ∞
S=N
ˆ
∞
ˆ
wM −wC (h,εC ,εM )
S(h, εM )
−∞
D
fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh.
−∞
(24)
−∞
The two rst order conditions are, in this instance
ˆ
∞ˆ ∞
ˆ
∞
ˆ
wM −wC (h,εC ,εM )
N
fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh
D
ˆ
+N
−∞ −∞ −∞
∞ˆ ∞ ˆ ∞
wM
ˆ D∞ ˆ −∞
∞
=
λN
D
−∞
fh,M,C,τ (h, εM , εC , wM − wC (h, εC , εM )) dεC dεM dh
ˆ ∞
S (h, εM )
fh,M,C,τ (h, εM , εC , wM − wC (h, εC , εM )) dεC dεM dh
−∞
−∞
46
(25)
and
ˆ
ˆ
∞
∞
ˆ
wM −wC (D,εC ,εM )
−wM N
ˆ
fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM
−∞
∞
= −λN
−∞
−∞
ˆ
∞
ˆ
wM −wC (D,εC ,εM )
fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM .
S (D, εM )
−∞
−∞
(26)
−∞
Consider the rst of the two rst-order conditions. The rst term on the left hand side is the
marginal cost of paying each soldier not at the margin of joining the military an extra penny, and
the second term in that left hand side is the full cost of paying the marginal soldier who enters
the military at the wage
wM .
Their sum is the cost of inducing an additional expected soldier
into the military at this wage. The right-hand side of that rst rst-order condition is the expected
marginal value of the security services provided by the marginal soldier induced into the military
at this wage.
The second rst-order condition describes how the minimum height threshold is set when the
optimal military wage is given. The left hand side is the expected cost savings gained by requiring
the marginal soldier to be one inch taller, and the right hand side is the loss in security from
excluding this marginal soldier from the military.
Rearranging the second rst order condition
yields:
ˆ
ˆ
∞
∞
ˆ
wM −wC (D,εC ,εM )
[λS (D, εM ) − wM ]
−∞
fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM = 0.
−∞
(27)
−∞
The military will set the minimum height requirement to the level where the expected value of
the military service for a soldier of height
D,
given the military wage and the occupational choice
decision, just equals the expected wage payments to the individuals of minimum height
D
choosing
to enter the military. Note that this truncation argument reects the Army's limited information
on potential enlistees. If the Army could observe
S (·)
perfectly then there would be no need for
the minimum height standard, at least for this reason.
We observe changes over time in the mean height of soldiers as the Army expands and contracts.
This uctuation in heights is evidence of selection and can be derived from our choice model.
Changes in the demand for military services due to the outbreak of war increase the Army's target
level of security,
S.
Since the military is a cost minimizer, one would expect an increase in
47
S
to
result in a higher marginal cost of military service, so the multiplier
λ
result in the right hand side of the rst order conditions to increase,
ceteris paribus, and this could
would increase. This would
be oset by the military increasing its pay oer and relaxing the minimum height requirement,
thereby increasing the size of the military and the level of military services provided in these more
demanding times.
48
References
A’Hearn, Brian. 2003. “Anthropometric Evidence on Living Standards in Northern Italy, 17301860.” Journal of Economic History 63(2): 351-381.
A’Hearn, Brian, Franco Peracchi, and Giovanni Vecchi, 2009. “Height and the Normal
Distribution: Evidence from Italian Military Data.” Demography 46(1): 1-25.
Alchian, Armen A, and William R. Allen. 1964. University Economics. Belmont, Cal.:
Wadsworth Publishing Company, 1964.
Allen, Robert C. 2007. “Pessimism Preserved: Real Wages in the British Industrial Revolution.”
Oxford University Department of Economics Working Paper 314.
Angrist, Joshua D. and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An
Empiricist’s Companion. Princeton and Oxford: Princeton University Press.
Bodenhorn, Howard. 1999. “A Troublesome Caste: Height and Nutrition of Antebellum
Virginia’s Rural Free Blacks.” Journal of Economic History 59(4): 972-996.
Bodenhorn, Howard. 2011. “Manumission in Nineteenth-Century Virginia.” Cliometrica 5(2):
145-164.
Bodenhorn, Howard, Carolyn Moehling and Gregory N. Price. 2012 (forthcoming). “Short
Criminals: Stature and Crime in Early America.” Journal of Law & Economics 55(2).
Brinkman, Henk Jan, J.W. Drukker, and Brigitte Slot, 1988. “Height and Income: A New Method
for the Estimation of Historical National Income Series.” Explorations in Economic History 25:
227-264
Brown, John C., 1990. “The Condition of England and the Standard of Living: Cotton Textiles in
the Northwest, 1806-1850.” Journal of Economic History.
Calomiris, Charles and Jonathan Pritchett. 2009. “Preserving Slave Families for Profit: Traders’
Incentives and Pricing in the New Orleans Slave Market.” Journal of Economic History 69(4):
986-1011.
Carson, Scott Alan. 2008. “Inequality in the American South: Evidence from the Nineteenth
Century Missouri State Prison.” Journal of Biosocial Science 40(4): 587-604.
Carson, Scott Alan. 2009. "Health, Wealth and Inequality: A Contribution to the Debate about the
Relationship between Inequality and Health." Historical Methods 42(2): 43-56.
Case, Anne and Christina Paxson. 2008. “Stature and Status: Height, Ability, and Labor Market
Outcomes.” Journal of Political Economy 116, 499-532.
Coclanis, Peter A. and John Komlos. 1995. “Nutrition and Economic Development in PostReconstruction South Carolina.” Social Science History 19(1): 91-115.
Cuff, Timothy. 2005. The Hidden Cost of Economic Development: The Biological Standard of
Living in Antebellum Pennsylvania. Burlington, Vt.: Ashgate Publishing Company.
1
Danson, J. T. 1881. “Statistical Observations on the Growth of the Human Body (Males) in
Height and Weight from Eighteen to Thirty Years of Age, as Illustrated by the Records of the
Borough Gaol of Liverpool.” Journal of the Statistical Society 44(4): 660-674.
Engels, Friedrich. 1945/1987. Condition of the Working Class in England. New York: Viking
Penguin.
Feinstein, Charles, 1988. “Pessimism Perpetuated: Real Wages and the Standard of Living in
Britain during and after the Industrial Revolution.” Journal of Economic History.
Fogel, Robert W., Stanley L. Engerman, Roderick Floud, Richard H. Steckel, T. James Trussell,
Kenneth W. Wachter, Kenneth Sokoloff, Georgia Villaflor, Robert A. Margo, and Gerald
Friedman. 1982. “Changes in American and British Stature since the Mid-Eighteenth Century: A
Preliminary Report on the Usefulness of Data on Height for the Analysis of Secular Trends in
Nutrition, Labor Productivity, and Labor Welfare.” NBER Working Paper No. 890.
Floud, Roderick, Kenneth Wachter and Annabel Gregory, 1990. Height, health and history:
Nutritional Status in the United Kindgom, 1750-1980. Cambridge: Cambridge University Press.
Freudenberger, Herman and Jonathan B. Pritchett. 1991. “The Domestic United States Slave
Trade: New Evidence.” Journal of Interdisciplinary History 21(3): 447-477.
Galenson, David W. 1981. White Servitude in Colonial America: An Economic Analysis.
Cambridge: Cambridge University Press.
Gallman, Robert E. 1996. “Dietary Change in Antebellum America.” Journal of Economic
History 56(1): 193-201.
Glaeser, Edward L. and Bruce Sacerdote. 1999. “Why Is There More Crime in Cities?” Journal
of Political Economy 107(S6): S225-S258.
Grubb, Farley. 1994. “The End of European Immigrant Servitude in the United States: An
Economic Analysis of Market Collapse, 1772-1835. Journal of Economic History 54(4): 794-824.
Grubb, Farley. 1999. “Lilliputians and Brobdingnagians, Stature in British Colonial America:
Evidence from Servants, Convicts, and Apprentices.” Research in Economic History 19: 139-203.
Grubb, Farley. 2001. “The Market Evaluation of Criminality: Evidence from the Auction of
British Convict Labor in America, 1767-1775.” American Economic Review 91(1): 295-304.
Haines, Michael R., Lee A. Craig, and Thomas Weiss. 2003. “The Short and the Dead: Nutrition,
Mortality, and the ‘Antebellum Puzzle’ in the United States.” Journal of Economic History 63(2):
382-413.
Heckman, James J., 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47(1):
pp.
Johnson, Paul and Stephen Nicholas. 1997. “Health and Welfare of Women in the United
Kingdom, 1785-1920.” In Health and Welfare during Industrialization, 201-250. Edited by
Richard H. Steckel and Roderick Floud. Chicago: University of Chicago Press.
2
John Komlos. 1987. "The Height and Weight of West Point Cadets: Dietary Change in
Antebellum America." Journal of Economic History 47(4): 987-927.
Komlos, John. 1992. “Toward an Anthropometric History of African-Americans: The Case of the
Manumitted Slaves of Maryland.” In Strategic Factors in Nineteenth-Century American
Economic History: A Volume to Honor Robert W. Fogel, 297-329. Edited by Claudia Goldin and
Hugh Rockoff. Chicago: University of Chicago Press.
Komlos, John, 1993. “The secular trend in the biological standard of living in the United
Kingdom, 1730-1860.” Economic History Review 46(1): 115-144.
Komlos, John. 1994. “The Height of Runaway Slaves in Colonial America, 1720-1770.” In
Stature, Living Standards, and Economic Development: Essays in Anthropometric History, 93116. Edited by John Komlos. Chicago and London: University of Chicago Press.
Komlos, John, 1998. “Shrinking in a Growing Economy? The Mystery of Physical Stature during
the Industrial Revolution.” The Journal of Economic History 58(3): 779-802.
Komlos, John, 2004. “How to (and How Not to) Analyze Deficient Height Samples – an
Introduction.” Historical Methods 37(4): 160-173.
Komlos, John and Bjorn Alecke. 1996. “The Economics of Antebellum Slave Heights
Reconsidered.” Journal of Interdisciplinary History 26(3): 437-457.
Komlos, John and Peter Coclanis. 1997. “On the Puzzling Cycle in the Biological Standard of
Living: The Case of Antebellum Georgia.” Explorations in Economic History 34(4): 433-459.
Kotz, Samuel, N. Balakrishnan and Norman L. Johnson. 2000. Continuous Multivariate
Distributions, Volume 1, (second edition). New York: John Wiley & Sons, Inc.
Lang, Stefan and Marco Sunder. 2003. “Non-parametric regression with BayesX: A Flexible
Estimation of Trends in Human Physical Stature in 19th Century America.” Economics and
Human Biology 1(1): 77-89.
Margo, Robert A. and Richard H. Steckel. 1982. “The Heights of American Slaves.” Social
Science History 6(4): 516-38.
Margo, Robert A. and Richard H. Steckel. 1983. “Heights of Native-Born Whites during the
Antebellum Period.” Journal of Economic History 43(1): 167-174.
McLynn, Frank. 1989. Crime and Punishment in Eighteenth-Century England. London and New
York: Routledge.
Meier, Paul. 1982. “Estimating Historical Heights: Comment.” Journal of the American
Statistical Association 77(378): 296-297.
Mokyr, Joel and Cormac Ó Gráda. 1994. “The Heights of the British and the Irish c. 1800-1815:
Evidence from Recruits to the East India Company’s Army.” In Stature, Living Standards, and
Economic Development: Essays in Anthropometric History, 39-59. Edited by John Komlos.
Chicago and London: University of Chicago Press.
3
Mokyr, Joel and Cormac Ó Gráda, 1996. “Height and health in the United Kingdom 1815-1860:
evidence from the East India Company army.” Explorations in Economic History 33(2): 141-168.
Nicholas, Stephen and Deborah Oxley. 1996. “Living Standards of Women in England and
Wales, 1785-1815: Evidence from Newgate Prison Records.” Economic History Review 49(3):
591-599.
Nicholas, Stephen and Richard H. Steckel, 1997. “Tall but Poor: living standards of men and
women in pre-famine Ireland.” Journal of European Economic History 26(1):105-134.
Nicholas, Stephen and Richard H. Steckel, 1991. “Heights and living standards of English
workers during the Early Years of Industrialization, 1770-1815.” Journal of Economic History
51(4): 937-57.
Nicholas, Stephen and Richard H. Steckel, 1997. “Tall but Poor: Living Standards of Men and
Women in pre-Famine Ireland.” The Journal of European Economic History 26(1): 105-134.
Ó Gráda, Cormac. 1991. “Heights in Tipperary in the 1840s: Evidence from the Prison Records.”
Irish Economic and Social History 18: 24-33.
Oxley, Deborah. 2003. “’The Seat of Death and Terror’: Urbanization, Stunting, and Smallpox.”
Economic History Review, New Series 56(4): 623-656.
Persico, Nicola, Andrew Postlewaite, and Dan Silverman. 2005. “The Effect of Adolescent
Experience on Labor Market Outcomes: The Case of Height.” Journal of Political Economy 112,
1019-1053.
Pritchett, Jonathan B. 1997. “The Interregional Slave Trade and the Selection of Slaves for the
New Orleans Market.” Journal of Interdisciplinary History 28(1): 57-85.
Pritchett, Jonathan B. and Richard M. Chamberlain. 1993. “Selection in the Market for Slaves:
New Orleans, 1830-1860.” Quarterly Journal of Economics 108(2): 461-473.
Pritchett, Jonathan B. and Herman Freudenberger. 1992. “A Peculiar Sample: The Selection of
Slaves for the New Orleans Market.” Journal of Economic History 52(1): 109-127.
Razzell, Peter. 1998. “Did Smallpox Reduce Height?” Economic History Review, New Series
51(2): 351-359.
Rees, R., John Komlos, Ngo. V. Long, and Ulrich Woitek. 2003. “Optimal Food Allocation in a
Slave Economy.” Journal of Population Economics 16(1): 21-36.
Riggs, Paul. 1994. “The Standard of Living in Scotland, 1800-1850.” In Stature, Living
Standards, and Economic Development: Essays in Anthropometric History, 60-75. Edited by
John Komlos. Chicago and London: University of Chicago Press.
Robert, C.P. and G. Casella. 2004. Monte Carlo Statistical Methods (second edition). New York:
Springer-Verlag.
4
Sandberg, Lars G., and Richard H. Steckel, 1988. “Overpopulation and Malnutrition
Rediscovered: Hard Times in 19th-Century Sweden.” Explorations in Economic History 25:1-19.
Sandberg, Lars G. and Richard H. Steckel, 1997. “Was Industrialization Hazardous to your
Health? Not in Sweden!” in Steckel and Floud (1997).
Schultz, T. Paul. 2002. “Wage Gains Associated with Height as a Form of Health Human
Capital.” American Economic Review 92(2): 349-353.
Sokoloff, Kenneth and Georgia Villaflor. 1982. "The Early Achievement of Modern Stature in
America." Social Science History 6(4): 453-481.
Staub, Kaspar, Frank J. Rühli, Barry Bogin, Ulrich Woitek, and Christian Pfister. 2011. “Eduoard
Mallet’s Early and Almost Forgotten Study of the Average Height of Genevan Conscripts in
1835.” Economics and Human Biology 9(): 438-442.
Steckel, Richard H. 1986. “A Peculiar Population: The Nutrition, Health, and Mortality of
American Slaves from Childhood to Maturity.” Journal of Economic History 46(3): 721-741.
Steckel, Richard H. 1987. “Growth Depression and Recovery: The Remarkable Case of American
Slaves.” Annals of Human Biology 14: 111-132.
Steckel, Richard H., 1995. “Stature and the Standard of Living.” The Journal of Economic
Literature 33(4): 1903-1940.
Steckel, Richard H. and Roderick Floud, eds, 1997. Health and welfare during Industrialization.
Chicago: University of Chicago Press.
Steckel, Richard H., 1998. “Strategic Ideas in the Rise of the New Anthropometric History and
their Implications for Interdisciplinary Research.” The Journal of Economic History 58(3): 803821.
Steckel, Richard H. and Joseph M. Prince, 2001. “Tallest in the World: Native Americans of the
Great Plains in the Nineteenth Century.” American Economic Review 91(1): 287-294.
Steckel, Richard H., 2009. “Heights and Human Welfare: Recent Developments and New
Directions.” Explorations in Economic History 46(1): 1-23.
Sunder, Marco. 2004. “The Height of Tennessee Convicts: Another Piece of the ‘Antebellum
Puzzle.’” Economics and Human Biology 2(1): 75-86.
Sunder, Marco. 2011. “Upward and Onward: High-society American Women Eluded the
Antebellum Puzzle.” Economics and Human Biology 9(2): 165-171.
Sunder, Marco and Ulrich Woitek. 2005. “Boom, Bust, and the Human Body: Further Evidence
on the Relationship between Heights and Business Cycles.” Economics and Human Biology 3(3):
450-466.
Voth, Hans-Joachim and Timothy Leunig. 1996. “Did Smallpox Reduce Height? Stature and the
Standard of Living in London, 1770-1873.” Economic History Review, New Series 49(3): 541560.
5
Wachter, Kenneth W. and James Trussell, 1982a. “Estimating Historical Heights.” Journal of the
American Statistical Association 77(378):279-293
Wachter, Kenneth W. and James Trussell, 1982b. “Estimating Historical Heights: Rejoinder.”
Journal of the American Statistical Association 77(378):301-303
Weir, David R., 1997. “Economic Welfare and Physical Well-Being in France, 1750-1990.” In
Steckel and Floud (1997)
Weiss, Thomas J. 1992. “U.S. Labor Force Estimates and Economic Growth, 1800-1860.” In
American Economic Growth and Standards of Living before the Civil War, 19-78. Edited by
Robert E. Gallman and John J. Wallis. Chicago: University of Chicago Press.
Whitwell, Greg, Christine de Souza, and Stephen Nicholas, 1997. “Height, Health, and Economic
Growth in Australia, 1860-1940.” In Steckel and Floud (1997).
6
Table 1: Summary of main symbols in the occupational choice model, and their assumed values in
simulation model
Symbol
Meaning
h
height
Baseline value in
Simulations
NA
Properties of the simulated population
N
μ
σ
Number of observations in simulated population
Mean of heights in population
Standard deviation of heights in population
1,000,000
66 inches
2.5 inches
Properties of the log-wage equations
αC
αM
βC
Intercept for civilian log-wage equation at height 56 inches
Intercept for military log-wage equation at height 56 inches
Slope (return to one inch in height) for civilian log-wage equation
βM
εC
εM
Slope (return to one inch in height) for military log-wage equation
Skill (unobserved) affecting civilian log-wage equation [~N(0,1)]
Skill (unobserved) affecting military and civilian log wages
[~N(0,1)]
Return to military skill in civilian sector log wage equation
Return to military skill in military sector log wage equation
Return to civilian skill in civilian sector log wage equation
Taste for the civilian sector
Taste for the military sector
γC
γM
δC
τC
τM
0.5
0.4
Ranges from 0 to
0.07
0.02
NA
NA
0
0.1
0.4
0
0
Properties of the error terms
σCM
Correlation of (εM, εC )
.1
Note: variables marked “NA” are either not in the simulation model or are themselves simulated.
Table 2: Baseline specifications for heights regressions
OLS
Estimate
Dummy for year of birth:
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
Baseline=1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
Probit (tall=68+)
T-ratio
Estimate
T-ratio
Probit (tall=69+)
Estimate
T-ratio
1.143
1.493
1.482
1.170
0.968
1.215
0.650
0.354
0.462
0.514
0.399
0.350
0.092
-0.007
0.072
0.170
0.374
-0.144
0.232
0.332
-0.143
0.143
0.127
0.056
0.032
0.176
-0.159
-0.205
0.154
0.039
-0.158
-0.065
-0.117
-0.102
0.101
0.115
0.214
0.211
0.227
7.190
10.450
9.850
8.040
7.830
6.500
3.290
2.080
2.850
3.550
2.580
2.300
0.630
-0.040
0.430
1.050
2.530
-1.190
1.440
2.340
-1.140
1.000
0.920
0.390
0.210
1.040
-1.110
-1.470
1.040
0.260
-1.290
-0.410
-0.830
-0.710
0.520
0.560
0.910
1.070
1.150
1.037
1.427
1.655
1.775
1.280
1.074
0.838
0.498
0.363
0.529
0.209
0.259
0.158
-0.090
-0.015
0.233
0.250
0.031
0.279
0.509
0.173
0.201
0.209
0.353
0.172
0.146
0.122
-0.137
0.254
0.009
-0.053
-0.061
0.040
-0.139
-0.031
0.523
0.259
0.115
0.104
5.070
6.980
6.370
5.880
6.780
4.300
3.360
2.600
2.350
3.700
1.500
1.820
1.100
-0.620
-0.100
1.490
1.880
0.240
1.840
3.440
1.180
1.430
1.470
2.090
1.160
0.920
0.760
-0.980
1.750
0.070
-0.410
-0.420
0.270
-1.000
-0.170
2.020
1.070
0.650
0.600
1.102
1.399
1.296
1.096
0.986
1.249
0.750
0.447
0.367
0.478
0.371
0.290
0.209
0.017
0.036
0.079
0.287
-0.019
0.111
0.347
-0.157
0.203
0.227
0.159
-0.071
0.232
-0.267
-0.170
0.194
0.017
-0.004
-0.050
-0.146
-0.103
0.158
-0.105
-0.041
0.017
0.154
6.500
9.220
8.130
6.830
7.070
5.990
3.590
2.530
2.480
3.590
2.710
2.090
1.470
0.110
0.240
0.510
2.190
-0.140
0.750
2.520
-1.030
1.470
1.620
0.980
-0.470
1.480
-1.570
-1.150
1.360
0.120
-0.030
-0.330
-0.940
-0.710
0.860
-0.420
-0.170
0.090
0.890
-0.153
0.264
0.316
-0.051
0.728
0.477
0.303
0.112
-0.088
0.217
0.657
0.850
-0.042
-0.990
1.320
1.590
-0.310
1.850
1.560
1.350
0.730
-0.460
1.070
2.800
2.620
-0.150
-0.109
0.136
0.252
0.015
0.305
0.234
0.177
0.295
-0.104
0.375
0.756
1.271
-0.140
-0.670
0.810
1.320
0.090
1.160
0.910
0.800
1.600
-0.590
1.750
2.950
2.530
-0.570
-0.073
0.119
0.165
-0.013
0.401
0.267
0.381
0.237
-0.175
0.099
0.277
0.531
-0.227
-0.430
0.710
0.880
-0.070
1.590
1.070
1.770
1.330
-0.930
0.480
1.270
1.600
-0.850
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
Constant
R-square (adj)
Log-likelihood
-0.048
-0.048
0.086
0.683
0.604
0.364
0.561
0.842
0.992
0.262
0.382
-0.226
0.243
-0.361
0.031
0.422
0.289
0.115
0.041
0.276
0.370
0.039
-0.200
0.038
-0.498
-0.430
-0.168
0.127
68.434
-0.250
-0.230
0.430
2.780
3.470
2.380
2.790
2.700
3.640
0.470
0.800
-0.630
0.520
-1.120
0.160
2.170
2.110
0.890
0.350
2.260
2.420
0.230
-1.270
0.140
-3.480
-2.130
-0.760
0.820
960.540
-0.073
-0.207
0.057
0.139
0.623
0.500
0.535
0.591
1.007
0.336
0.838
-0.230
0.336
-0.230
0.089
0.317
0.135
0.195
0.096
0.350
0.386
0.116
-0.073
-0.090
-0.096
-0.308
-0.104
0.164
0.230
-0.370
-1.130
0.320
0.770
3.280
2.840
2.440
2.330
3.550
0.660
1.420
-0.620
0.660
-0.620
0.420
1.780
1.110
1.560
0.860
2.910
2.730
0.700
-0.460
-0.410
-0.490
-1.330
-0.490
1.110
3.300
0.129
-0.271
-0.058
0.298
0.447
0.327
0.420
0.670
0.549
0.267
-0.118
-0.227
-0.118
-0.520
-0.006
0.273
0.194
-0.049
-0.018
0.166
0.289
-0.085
-0.267
-0.142
-0.594
-0.440
-0.308
0.109
-0.447
0.650
-1.360
-0.320
1.670
2.630
2.000
2.100
2.940
2.510
0.550
-0.230
-0.570
-0.230
-1.190
-0.030
1.590
1.590
-0.390
-0.150
1.410
2.130
-0.500
-1.570
-0.610
-2.530
-1.650
-1.330
0.730
-6.250
0.082
-3835.164
-4077.361
Note: Estimation sub-sample limited to men aged 22-27 at the time of recruitment, and at least 67 inches tall. The
sample size for all specifications is 6435. The omitted birth year is 1800.
Table 3: Adding age-of recruitment dummies to the baseline model
Test for the null that all age dummies
are zero
OLS
Probit (tall = 68+)
Probit (tall=69+)
2.47
(.0307)
[5, 6349]
16.21
(.0063)
[5]
13.81
(.0169)
[5]
6.89
(0.0)
[80, 6349]
327.37
(0.0)
[80]
461.53
(0.0)
[80]
Test for the null that all birth-year
dummies are zero
Note: Tests performed on specifications that are augmented versions of the models reported in Table 2, as
described in the text. Figures in parenthesis are the probability of not rejecting; figures in square brackets
are the degrees of freedom. The test for the OLS model is an F-test, for the probits, a chi-square test.
There are 6435 observations used in all specifications.
Table 4: Adding year-of-recruitment dummies to the baseline model
Test for the null that all year-ofrecruitment dummies are zero
Test for the null that all birth-year
dummies are zero
OLS
Probit (tall = 68+)
Probit (tall=69+)
1.67
(.0002)
[77, 6277]
136.15
(0.00)
[76]
110.46
(.0075)
[77]
1.33
(0.0275)
[80, 6277]
89.19
(.2258)
[80]
83.94
(.3598)
[80]
Note: Tests performed on specifications that are augmented versions of the models reported in Table 2, as
described in the text. Figures in parenthesis are the probability of not rejecting; figures in square brackets
are the degrees of freedom. The test for the OLS model is an F-test, for the probits, a chi-square test.
There are 6435 observations used in all specifications.
Table 5: Adding relative wage and wheat prices
OLS
Relative wage variable
Wheat-price variable
Test for the null that
relative wages and
wheat prices are both
zero
Probit (tall=68+)
Probit (tall=69+)
-1.979
(-2.15)
[-.018]
-.480
(-.53)
[-.152]
-1.254
(-1.39)
[-.760]
.0005
(.54)
[.005]
.0004
(.47)
[.016]
.0002
(.29)
[.019]
2.34
(.096)
[2, 6352]
.47
(.789)
[2]
1.96
(.375)
[2]
Notes:
The mean of the relative wage variable is 0.627, and its standard deviation is 0.034. For the wheat price
variable the mean is 71.378 and the standard deviation 26.348.
This first panel of the table refers to a specification that augments the models reported in Table 2 with the
relative-wage and wheat-price variables reported in the text. The augmented models were estimated with
robust standard errors. In the first two rows, the figure in parentheses is the t-ratio and the figure in square
brackets is the elasticity evaluated at the sample means. The final row reports the F-test (for the OLS
model) or chi-square test (for the probits) for the null hypothesis that both the relative wage and wheatprice variables are zero. The figure in parenthesis is the probability of not rejecting, and the figure in
square brackets is the degrees of freedom for the test.
Table 6: RSMLE estimates of heights of British soldiers
Variables
Wage ratio
Wheat price
Cohort 1761
Cohort 1766
Cohort 1771
Cohort 1776
Cohort 1781
Cohort 1786
Cohort 1791
Cohort 1796
Cohort 1801-1805 (baseline)
Cohort 1806
Cohort 1811
Cohort 1816
Cohort 1821
Cohort 1826
Cohort 1831
Cohort 1841
Constant
Mu
Estimates
SE
Sigma
Estimates
-15.955
-.0375516
3.963091
3.853138
2.110348
2.732323
3.077086
3.037234
1.845478
1.663066
4.85152
0.009
1.510
1.523
1.721
1.644
1.619
1.658
1.748
1.634
1.474217
.0043528
-.7395829
-.4561435
-.1763412
-.2650115
-.4008354
-.3974875
-.2981857
-.2217938
0.803
0.001
0.173
0.180
0.203
0.197
0.195
0.197
0.204
0.206
2.216812
1.150958
2.63333
2.410806
.7469431
1.61339
-.1531317
77.14726
1.607
1.795
1.568
1.695
1.811
1.619
1.756
3.372
-.3446124
-.1554121
-.2974427
-.1411493
-.0372722
-.166328
-.0671573
-.1078353
0.210
0.234
0.195
0.242
0.208
0.189
.21133
.507653
SE
Note: The log of the likelihood function is -9689.4979. There are 6206 observations.
Without the four time of enlistment effects, the log likelihood value is -9706.1187.
Figure 1
Source: Nicholas and Steckel (1991, Figure 3)
Figure 2: The median heights of French males
1,675
1,670
1,665
1,660
1,655
1,650
1,645
1,640
1,635
1,630
1780
1800
1820
Source: Weir (1997, Table 5B.1)
1840
1860
1880
1900
1920
1940
Figure 3: Male height at age 20 in France and Britain, 1770-1980
Source: Weir (1997, p. 175).
Figure 4: Simulation results I
.5
.5
Proportion tall
.45
.4
.6
0
.02
.04
.06
Civilian return to height
Mean military height
.35
.3
64.5
.1
.2
65
.3
Proportion
65.5
66
.4
.55
B
66.5
A
.08
0
.02
.04
.06
Civilian return to height
50 percent in military
Proportion of civilians tall
Fraction in military
Proportion of soldiers tall
.08
Figure 5: Simulation results II
.15
.1
.05
0
0
.05
.1
.15
.2
B
.2
A
55
60
65
Height
70
75
55
60
65
Height
70
Population density
Population density
Military density
Military density
75
Figure 6: Selected height distribution and the density function adjustment for selection
6566
Height
70
75
4
2
3
Z(h) from selection model
.15
1
.05
0
0
1
.1
4
60
0
2
3
Z(h) from selection model
.15
.1
.05
0
55
.2
B
.2
A
55
60
6566
Height
70
75
Population height density
Population height density
Selected height density
Selected height density
Z(h) from selection model
Z(h) from selection model
Figure 7: Ability to reject normality in selected samples
B
.04
.04
Rejections of normal height distribution (sktest) at 5% size
.06
.08
.1
Rejections of normal height distribution (sktest) at 10% size
.06
.08
.1
.12
.12
A
100 200 500
2000 5000 15000 50000
Expected military sample size -- log scale
100 200 500
2000 5000 15000 50000
Expected military sample size -- log scale
0
1
2
3
4
Figure 8: Selection density adjustment and acceptance (rejection) sampling functions
55
60
65
Height
70
Rejection sampling weight
Z(h) from selection model
75