Sample-selection Bias in the Historical Heights Literature Howard Bodenhorn Department of Economics, Clemson University Timothy W. Guinnane Department of Economics, Yale University Thomas A. Mroz Department of Economics, Clemson University ∗ This draft: March 5, 2012 ∗ Note : Preliminary, please do not cite or circulate. For comments and suggestions we thank Cihan Artunç, Jeremy Edwards, Farly Grubb, Sheilagh Ogilvie, Jonathan Pritchett, Paul Rhode, Mark Rosenzweig, William Sundstrom, and Christopher Udry. We acknowledge nancial support from the Yale University Economic Growth Center. Direct correspondence to Guinnane: [email protected]. 1 Abstract Over the past thirty years, an extensive literature has developed that uses anthropometric measures, typically heights, to draw inferences about living standards in the past. Much of this literature relies on micro-samples drawn from sub-populations that are themselves selected: examples include volunteer soldiers, prisoners, and runaway slaves, among others. Contributors to the heights literature sometimes acknowledge that their samples might not be random draws from the population cohorts in question, but rely on normality alone to correct for potential selection into the sample. We argue that selection cannot be resolved simply by augmenting truncated samples for left-tail shortfall. Consequently, widely accepted results in the literature may not reect variations in living standards. Observed heights could be predominantly determined by the process determining selection into the sample. We use a simple theoretical model combined with simulation and econometric results to make our point. A survey of the current historical heights literature illustrates the problem for the three most common sources: military personnel, slaves, and prisoners. 2 . . . the socio-economic composition of the institution studied might have varied over time, even in the absence of explicit changes in the admission criteria. This might be due both to supply and demand considerations. The willingness of individuals to enter the 1 military, for instance, might have varied over time. . . This problem is quite intractable. In volunteer services there are two selection processes at work, neither of which can be considered random: the demand of the recruitment services for men of certain heights (or other characteristics correlated with height) and the supply of potential recruits. 2 . . . The nature of 19th century volunteer armies such as that studied here presents a statistical dilemma: the sampling bias tends to move in a direction opposite to the population mean. For instance, if population income goes up, options such as military service in India become less attractive, and our sample would be drawn increasingly from the left tail of a distribution which itself is shifting to the right. A nding that the sample mean went up would thus be consistent with both a falling and a rising population income, depending on which inuence on the sample dominated. 1 2 3 Komlos (2004, note 44). Weir (1997, p.174). Mokyr and Ó Gráda (1996), pp.1634 3 3 1 Introduction Anthropometric history came into its own in the 1980s, and now enjoys a prominent place in quantitative economic history. The appeal of this approach to the study of the past is obvious. Many historical sources record heights, and by studying direct measurements of the human organism one can hope to achieve a broader picture of how economic growth and development aected human well-being. 4 The central idea of the heights literature is that, at the population level, heights at a given age 5 Thus a cohort are a measure of the net nutrition available to individuals during the growing years. might be unusually short because its members had less food, or gross nutrition; because hard work when young made greater caloric demands on gross nutrition, leaving less for growth; or because disease made demands on gross nutrition. Some early papers in this literature viewed height as a proxy for conventional measures of economic output such as GDP per capita, one that could be 6 As it has developed, the heights literature presents the used in the absence of direct measures. anthropometric approach as capable of capturing dimensions of economic well-being that the real wage or GDP per capita cannot. As scholars integrated insights from the medical and social sciences, they faced a number of theoretical and statistical challenges. Perhaps the most discussed problem is the fact that many samples are drawn from military populations for which there was a minimum height requirement. The resulting sample of heights is left-truncated at or near that minimum height. This truncation problem led to a number of approaches to estimating mean height from truncated samples. While there remain some debates about how this is best done, most recent studies account for it using 7 well-tried approaches. 4 There is even a specialist eld journal, Economics and Human Biology, started in 2003. uses anthropometric measures other than heights. Some historical research Our argument applies with equal force to any anthropometric measure used to reect on human well-being; for brevity here we use the term heights. 5 6 Most research on heights relies on sources that pertain to men only, such as soldiers. Brinkman, Drukker and Slot (1988), for example, regress the heights of Dutch males on per-capita income in Holland in the period 1900-1940, and then use the resulting equation and measured heights to infer incomes in the nineteenth century. 7 The seminal reference is Wachter and Trussell (1982). They discuss two distinct approaches. The rst, the Reduced-Sample Maximum Likelihood Estimator (RSMLE), requires ex ante specication of a (maximum) xed truncation point. The second, the Quantile Bend Estimator, relies on tting a linear relationship between the expected and actual normality probability plots. Both rely on the normality of the sample distribution. Some studies focus on a distinct problem, which is heaping of heights. One would expect the distribution of heights to be continuous. Measured heights often show both rounding (say, to 5'4 when the true height is 5' 4.1 inches) and a tendency to cluster at certain heights, such as 5 feet. Our argument is distinct from both truncation and heaping. 4 Unfortunately the historical literature has not satisfactorily confronted the consequences of a potentially more serious problem; namely, that many if not most of the samples used are not random draws from the population in question. This issue is distinct from that of left truncation, and applies even in a situation where there are no left-truncation issues. In most sources, an individual is included in the sample only if he or someone else made a decision that led to their inclusion. Examples include soldiers and sailors in volunteer forces; prisoners ; runaway slaves; slaves that entered into the international or interregional slave trade; students at elite educational institutions; and holders of passports. In some cases more than one actor made a choice that allowed a person to be observed in a sample. If the chances of inclusion in the sample dier across individuals in ways that are correlated with height, then the sample distribution of heights are unlikely to be unbiased measures of the mean height in the population, even after adjustment for possible left-truncation. More importantly, changes in observed heights over time may be driven not only by changes in the actual heights of the population, but also by changes in the probability that individuals of dierent heights were selected into the sample. Similarly, cross-sectional dierences in height within a sample may reect a particular variable's eect on the probability of inclusion in the sample, rather than population dierences in height correlated with a given characteristic. 8 This is Thus many heights studies may suer 'from sample-selection bias (SSB). not a trun- cation problem, and none of the approaches that deal with truncation or shortfall address the issue. It is reasonable to think that selection into a volunteer Army will aect the right-hand tail of the distribution as well as the left-hand tail: taller men are from more prosperous backgrounds and as such are more likely to have more attractive civilian opportunities than Army life.. The implications of SSB are grave. Komlos notes in the trenchant epigraph that the problem is intractable. Many papers in the historical heights literature recognize that their samples might not be representative, but few have acknowledged that some of the most interesting and most discussed results in this literature may be entirely an artifact of SSB. We are not the rst to recognize the problem. Pritchett and Chamberlain (1992) and Grubb (1999) discuss selection in specic contexts. But we 8 Extensive discussion of SSB in the economics literature dates to eorts to estimate models of married women's wages and hours of work. See Heckman (1979). Some sources used in the heights literature are themselves random samples drawn from a larger source, for example, a random sample of men who joined a particular volunteer army. We are not criticizing the method of drawing the sample; our argument is that the Army itself represents a selected sample of men from that time. 5 are the rst to systematically explore the consequences of SSB in the literature more generally. 9 The current historical heights literature suers from two fundamental aws. First, the literature has emphasized results that may reect nothing more than sample selection bias. Second, it has labored to construct complicated, indirect explanations of puzzling ndings that reect, at least in part, changes in the nature of the selection. Our message here is a warning. We cannot demonstrate fully that selection problems are more or less important than the economic and social forces emphasized by the heights literature. The only fully convincing way to do that would be to compare random and selected samples drawn from the same source. If such random samples were available, we are condent someone would have used them. Rather, we show that selection is an important problem that needs to be taken more seriously than it has been. We do not argue that scholars should dismiss the heights literature. First, many studies (mostly in developing countries, but also in a few historical examples) rely on sources relatively free of selection bias. Second, economic history always requires careful use of imperfect sources, and we encourage scholars interested in studying historical heights to recognize and acknowledge this imperfection in their sources. Finally, those who rely on this literature as part of a broader eort to study living standards in the past should be skeptical about some of the results reported to this point. 2 Declining heights in the presence of rising real wages The sample-selection problems we discuss are potentially key issues in any study using a choice-based sample of heights in the past. To x our ideas, however, we focus mostly on a set of related issues that have occupied much of this literature. (In section 7 we discuss the sample-selection issue as it arises in other heights research more broadly.) Economic historians and others are naturally interested in how economic growth, and more specically industrialization, aected human welfare in the past. The most famous version of this standard of living debate focuses on the working classes during the British Industrial Revolution. A long debate that goes back at least to Engels (1845/1897) divides optimists, who think the working classes beneted from early industrialization, from pessimists, 9 Mokyr and Ó Gráda (1996) is the only anthropometric study we know of that acknowledges that their result may be a consequences of SSB. We read their conclusions as a precursor of what we say here. Weir (1997, p.175-7)'s discussion also alludes to SSB in the British Army data. 6 who think they did not. The earliest literature focused on conditions of work and life sanitation, diet, disease and mortality but in its modern incarnation the literature has focused on conventional 10 economic measures such as real wages, or incomes or consumption per capita. Floud, Wachter, and Gregory's (1990) nding that British soldiers grew shorter during the British Industrial Revolution, and after, seemed a strong point for the pessimist case. Figure 1 (From Nicholas and Steckel (1991)) presents similar evidence. The mean height of English males declined by about 1.5 inches during a period in which wages were at worst stagnant or, more likely, growing slowly. If anything, this evidence seemed even stronger than conventional pessimist arguments: Feinstein (1988) nds wage stagnation, not wage declines, and Allen (2007) nds slow real-wage growth throughout the nineteenth century. But British soldiers became signicantly shorter. Something similar also occurred in the United States. During a period when the economy entered a period of long-term acceleration in real GDP, estimated mean height declined by more than one inch (Weiss 1992). Growth in estimated heights did not recover until the end of the nineteenth century. This fact of US economic history is now so much a part of the mainstream narrative that it is known simply as the antebellum puzzle. The puzzle is the concurrent (apparent) decline in biological well-being and (apparent) accelerating growth in per capita income. The heights literature and, in fact, much of the profession, has taken both ndings at face value (rising incomes and shorter people) and has endeavored to explain it. Because the same phenomena have been documented for the mid-nineteenth century US, late eighteenth-century UK and the industrializing Hapsburg Empire, we label the phenomena the industrialization puzzle. The ndings were unexpected and led to considerable eorts to understand the divergence between heights and real wages. Komlos (1998) presents a laundry list of possible explanations, including increasing income inequality, increased income variability, changes in relative prices between nutritious (meat and grains) and non-nutritious (coee and sugar) foods, transportation innovations and the integration of the disease environment, intensication of labor, among others. The heights literature focuses on three of these eects. First, changes in the distribution of income might reduce net nutrition for the working classes, even if overall measures such as GDP per capita are rising. We nd this explanation particularly 10 Feinstein (1988) is probably the last word on English real wages per se. He nds little evidence real-wage increases until least the rst decade of the 19th century. Brown (1990) uses hedonic methods to show that much of any increase in real wages was necessary to oset the utility costs of more unpleasant living and working conditions. 7 interesting because it is so closely related to the SSB problem. In the heights literature, changes in income distribution are usually thought of as a shift in income from, say, labor to capital. This would explain the English case especially, with a divergence between the trends in GDP per capita and measured heights. Changes in income distribution could well be the driving force behind dierent types of selection into the Army; if workers with particular sets of skills saw sharp declines in their wages, they might be more ready to enlist. Yet this implication is seldom mentioned in the heights literature. 11 A second explanation for falling heights in the face of economic growth focuses on an unhealthy environment. Industrialization was associated with the growth of cities, which were comparatively unhealthy places to live until at least the late nineteenth century. Diarrheal and other endemic infectious diseases drive a wedge between gross and net nutrition. This observation is consistent with what we know about public health and urban environments in the period. A third explanation emphasizes instead the price and availability of important food items. While real wages might have improved, the relative price of nutritious height-enhancing foods might change in ways that lead to less consumption. Komlos (1987), Bodenhorn (1999) and others have, in fact, developed estimates of nutrient availability and consumption for 19th century school boys and African-Americans to investigate this possibility. This approach requires so many assumptions about animal weight, waste and spoilage, seed requirements and so forth that even the most carefully crafted estimate is subject to challenge. Floud, Gregory and Wachter (1990, 233-243) doubted whether such studies could overcome the insurmountable hurdles and contribute much to the debate. We do not deny the value of the fresh research eorts provoked by the nding of the industrialization puzzle. We simply doubt that the puzzle is real. The problem is that scholars have only identied it in contexts where the data come from choice-based samples. US had volunteer Armies for most of the period in question. Both the UK and the 12 France, on the other hand, drafted men according to lotteries so that all young men had an approximately equal chance of being called for service. Consider Figure 2, from Weir (1997), which presents the heights of French males over the same period. There is no French equivalent of the industrialization puzzle. French men grow 11 In other cases, income distribution is invoked to explain above-average height. Steckel and Prince (2001, p. 291) appeal to the egalitarian practices of the Native Americans of the Great Plains to explain their relatively tall stature. 12 The Union (North) subjected men to a draft in the period 18621865 (during the U.S. Civil War), but the policy of allowing draftees to pay a substitute to take their place makes it much like a volunteer Army for our purposes. 8 without interruption; in fact, a simple regression of these gures on a linear time trend yields an R-square of about 0.97. 13 What accounts for the dierence between the US and British case on the one hand, and the French, on the other? The heights literature focuses on France's slower urbanization, which reduced the disease burden and thus the impediments to converting calories into height. We think there is a much more obvious explanation. Figure 3 also plots the heights of British military recruits. The year-to-year variations in the estimated mean heights of British soldiers are enormousmuch larger than could be due to any short term variations in the standard of living. Consider the cohorts that reached age 20 around 1810. Between 1804 and 1812, French heights uctuate between 163.5 cm and 163.9 cm without obvious trend. British heights, after having already experiencing a 2 cm decline 20 years earlier, rst decline by about 2 cm and then as quickly recover. In short, the heights of men potentially eligible for the French military do not oscillate as wildly as British heights even though both imposed minimum (but not common) height standards. With a standard deviation of height within a British cohort of about 2.3 cm, the comparatively smooth decline in mean heights from the cohort born in 1820 to that born at the end of the 1830s represents a very large decline in mean heights. We return to this issue in section 6 below. 3 14 Modeling Military Heights One problem with the limited discussions of selection bias in the heights literature is lack of clarity about what might cause the selection and how that selection would aect the samples used. To avoid that weakness, this section describes a simple model in which an individual of given characteristics decides whether to join the Army or remain a civilian. This model could be adapted to t several other contexts that generate height data. The model delivers expressions describing the degree of selection into the Army, by height. 13 In Sections 4 and 5 below we use this feature of the model Our views on SSB and the heights literature probably originated in conversations with Weir. Weir (1997, p.174) notes that British military heights uctuate a great deal in short periods. The cohort who reached age 20 in 1850 were apparently 5 centimeters taller than those who reached that age 1 year later. That is far beyond any plausible range of measurement error in GDP and suggests that there must have been other very powerful forces driving the heights of British volunteer forces. We agree. 14 Militia conscription records for Sweden for the period 18201965 cover nearly the entire male population, and, like France, show no deterioration in heights (save for a .5 centimeter dip from 183540, which was made good by 1845 (Sandberg and Steckel 1997, Table 4.1). Using the settled Army, which was apparently a volunteer force, Sandberg and Steckel show a sharp decline in height for the cohort born in the 1840s Sandberg and Steckel 1997, p.135; Sandberg and Steckel 1988, Figure 1). 9 to simulate height distributions for specic distributional parameters. Our model draws on Roy (1951), and is similar to Heckman and Sedlacek's (1984) two-sector occupational choice model. Each person's utility depends on his wage or material compensation in the Army or in the civilian world. Each individual also has a parameter that describes his taste for military life, and another that describes his taste for civilian life. Thus two individuals with identical wages can make dierent decisions. We treat both Army and civilian wages as (possibly) a function of the individual's height. This assumption has two interpretations that are equivalent for our purposes. Army or some civilian occupations might actually reward height itself. The rst is that the Promotion might come faster to taller soldiers, and in some military and civilian occupations a tall person might be more productive. 15 A second interpretation seems more natural, and more consistent with the basic tenets of the heights literature. Height is correlated with a person's biological standard of living and with other aspects of health human capital (Schultz 2002). The Army might reward a tall person not for being tall per se, but because taller individuals are generally healthier than shorter people. Similarly, the civilian world might reward a tall person for his health in addition to any skills correlated with height. 3.1 Individual Returns The military pays soldiers as a function of their height, military traits summarized in by the variable h, and their observable set of productive εM : ln (wM ) = αM + βM h + γM εM 15 (1) Studies of modern labor markets have uncovered a positive height-wage correlation and attribute it to height- based productivity eect. Persico et al (2005) identify an association between height and human capital investment during adolescence that manifests itself as a positive height-wage correlation for adults. Case and Paxson (2008) argue that the positive correlation between height and wages captures, in part, a positive association between height and cognitive ability. After controlling for cognitive abilities, the height eect remains, but is less strong. It is not unreasonable to believe that human capital and cognitive ability would be rewarded in military labor markets, though not necessarily at the same rate as civilian markets. Komlos (1989, p.237) reports enlistment bonuses into the Hapsburg Empire's army, circa 1809, increasing in heights, from 3 . for soldiers just 5'-0 to 45 . for soldiers 5'-5 and above. 10 (Note that βM might be zero.) reported in Section 4). εM (Table 1 sets out all notation used here and in the simulations can entail anything other than height, such as literacy or some specic skill such as the ability to shoot straight. Individuals dier in their tolerance for other features of military life, and we summarize those preferences under τM . An individual's utility from entering the military is additive in the military log-wage and his tastes for the military: U (M ) = ln (wM ) + τM = αM + βM h + γM εM + τM . (2) For the moment, we do not make any assumptions about the relationship between height and productive military traits (εM ). Height and military productivity could be positively correlated (tall people might be better with a rie) or negatively correlated (tall people might nd it harder to nd cover in an ambush). Now suppose civilian wages are given by: ln (wC ) = αC + βC h + γC εM + δC εC . (3) Note that height and other military characteristics (εM ) might be valued by both the military and the civilian economy, but another set of characteristics, εC , are valuable only in the civilian world. (The model can be extended such that the military also rewards εC , results below.) Individual preferences for civilian life are denoted by but that does not change the τC . Thus the utility of being in the civilian world is: U (C) = ln (wC ) + τC = αC + βC h + γC εM + δC εC + τC . As above, all productive traits can be correlated. For simplicity we assume that tastes (the are independent of the productive traits (h and the ε terms). (4) τ terms) Relaxing this assumption only requires the introduction of additional covariance terms in the following derivations. While we view this primarily as a static model of occupational choice, the functions and U (C) could be interpreted in a dynamic context. U (C) U (M ) could represent the expected utility from choosing the civilian sector today when there is a possibility (option) that one might choose to enter the military at some later date. However, throughout this analysis we assume that an 11 16 Similarly, the military pay function individual has just one opportunity to enlist in the military. could represent expected lifetime payments that could include, in part, the expected increase in pay if the volunteer were to be promoted at some later date from soldier to sergeant. An expected utility maximizing individual enters the military if and only if U (C) ≤ U (M ), or η ≤ (αM − αC ) (5) η = βh + γεM + δεC + τ (6) where and β = (βC − βM ) , γ = (γC − γM ) , δ = δC , and τ = τC − τM . (7) Equation (7) makes clear that what drives the model (and the simulation below) is between features of the military and civilian worlds. If height is more highly rewarded in the β = (βC − βM ) > 0, civilian sector than in the military sector, i.e., dierences then taller individuals will prefer the civilian sector over the military sector, all else equal. Similarly, if the civilian sector's reward to the military trait higher values of εM εM increases (i.e., γ or γC increases), then will be less likely to join the military. ceteris paribus those with Additionally if there is an increase (decrease) in the reward to a civilian sector trait that is positively (negatively) correlated with the military trait εM , then those with higher (lower) values of εM will have a lower (higher) propensity to enter the military. Suppose that all productive traits follow standard normal distributions, not necessarily independent of each other or of heights, and that the taste parameters follow mean zero normal distributions independent of heights and productive traits. 16 17 Then height h and selection error h will follow a Given that most studies using military heights focus on older enlistees (e.g., over age 21 or 22) who likely have nished growing, the selection process would include the decision to be a civilian during the teenage years and then the decision in one's early to mid 20's to leave the civilian sector and join the military. This more complicated selection process would alter the specic characterizations of the self selection biases described below, but it is unlikely that it would eliminate them. 17 The unit variance assumption is innocuous, and the mean zero assumption for all of these factors only requires simple redenitions of the intercepts in the two wage oer equations. assumption for tastes. 12 One could easily relax the independence bivariate normal distribution with h µh 2 E = ; V ar (h) = σh ; η 0 (8) V ar (η) = ση2 = β 2 V ar (h) + γ 2 V ar (εM ) + δ 2 V ar (εC ) + V ar (τ ) +2βγCov (h, εM ) + 2βδCov (h, εC ) + 2γδCov (εM , εC ) (9) and Cov (h, η) = σh,η = βV ar (h) + γCov (h, εM ) + δCov (h, εC ) . (10) Under these assumptions, we can derive the distribution of heights for those who prefer military service to working in the civilian sector. described above over the range of Integrating the bivariate normal distribution of η ≤ (αM − αC ), and normalizing by the Prob (η (h, η) ≤ (αM − αC )) (see Kotz et al 2000, p.316) yields fh|mil (h) = fh|η≤(αM −αC ) (h) = fh (h) Z (h) (11) where Z (h) = Φ (·) o n q Φ [(αM − αC ) /ση − ρη,h (h − µh ) /σh ] / 1 − ρ2η,h Φ [(αM − αC ) /ση ] ulation) distribution of heights, 1 √ σh 2π of height and the population selection error η and (12) fh (h) is the unconditional (pop- ρη,h = Cov(h,η) is the correlation σh ση is the standard normal cumulative distribution function, 2 1 h−µh exp − 2 , σh . we dened above. Z (h) summarizes the selection process that generates the observed distribution of military heights from the underlying distribution of population heights. 18 Most historical studies of military heights try to use the observed military height distribution 18 The appendix extends the model to account for the military's optimization decisions. We show that in con- structing the least-cost military force, the Army might impose a minimum height restriction, as we actually see in practice. 13 (fh|mil (h)) to make inferences about the population height distribution fh (h). Our derivation makes clear that accurate inferences about the population distribution depend on the properties of and especially how Z (h) Z (h) might vary over time (or in the cross-section) in a way that might be unrelated to variations in the population height distribution. As derived, this selection mechanism ignores any possible minimum height requirement that the military might impose. The existence of an exact and binding minimum height requirement implies that instead of the conditional height distribution described above, we would only observe its truncated analogue. Suppose there is some loosely enforced minimum height requirement, but one that is completely non-binding above some height h∗ . Then all observed heights above h∗ can be considered as random draws from the upper tail of this conditional height distribution. This assumption underlies Trussell and Watcher (1982)'s reduced sample maximum likelihood (RSMLE) and quantile bend estimators (QBE) (i.e., h∗ is above the extent of the shortfall in Trussell and Watcher's terminology). Our simple choice model highlights an important implicit assumption in Trussel and Watcher's (1982) description of the process giving rise to observed military heights. operating for heights above h∗ . They assume there is no selection process That is, individuals taller than h∗ choose to enter the Army based on a coin ip. Our Roy-style model suggests an alternative process. Suppose that the civilian sector rewards heights and productive traits more than the military, and that the covariances of heights and productive traits are positive, zero, or at most only slightly negative. correlation) of heights and the selection term η would be positive. Then the covariance (or This implies that Z (h) approaches zero for taller individuals. Taller individuals would increasingly become more self-selected (less wellrepresented in the Army) under this optimizing model. This conclusion, which is the central point of our critique, has two, related implications. rules out such a possibility. First, Wachter and Trussel's approach implicitly Their estimation techniques (whether RSMLE or QBE) are biased estimators for the unconditional height distribution fh (h) in this case. Second, our simple choice model provides a clear understanding of why the right-hand tail may have fewer taller individuals than one would observe in a true normal distribution. The Wachter and Trussel assumptions are innocuous only under fairly extreme conditions, when h∗ is above the unconditional mean height and the correlation of height and the selection error (ρ) approaches −1. How plausible is this? Consider rst the case where the civilian sector more strongly 14 rewards productive traits than the military sector. In this instance, a large negative correlation could arise only when the covariances of height with the productive traits εM and εC are large and negative. Such negative correlations seem unlikely, as one typically considers height to be a marker for a better childhood environment which would suggest positive correlations of heights and other skills. Next, consider the (probably implausible) case where the military rewards height and the relevant skills more strongly than does the civilian sector. Then we will have a large negative correlation of heights and the selection error if heights and skills are positively correlated and if the reward to the civilian sector specic skill, δC , is small. Under these assumptions Z (h) would no 19 longer approach zero, and the selection operating on the upper tail would be weak or non-existent. If the military rewarded common skills more than the civilian sector while the civilian sector pays at most a small premium for its sector-specic skill, then the variance of wages in the military would be much larger than the variance of wages in the civilian sector. This implication is at odds with what we know about military pay. This discussion raises an important question: if our model is a reasonable approximation to why individuals choose to enter the military, then why does the upper tail of nearly every one of the height distributions examined by Wachter and Trussel appear to follow a normal distribution? In Section 4 we use our occupational choice model to simulate historical height data sets. For a wide variety of reasonable variances, covariances, and reward structures, we nd that the distributions of the self-selected heights in the military do follow distributions that closely resemble the shapes of normal distributions, even when there is selection built it into the simulation model. This result highlights a serious statistical problem documented in Section 5. Even with moderate to large sized samples by cliometric standards, the power of standard skewness-kurtosis tests to reject normality is quite limited for the forms of self-selection modeled here. Even kernel density estimators for heights in the self-selected military samples seem at most trivially dierent from normal distributions. Self-selected samples of heights typically dier from normal height distributions in the variance of the observed height. The self-selected height distributions, even though they appear to be almost normally distributed, typically have standard deviations for heights well 19 Note that under these conditions, our occupational selection model would imply gross over-selection of tall individuals into the military. However, for a value of the correlation quite close to −1, the amount of over-selection could be relatively constant for all heights above the population mean. In this case the shape of the upper tail of the distribution would reect, nearly exactly, that of the population distribution of heights. 15 below those in the underlying population. Many of the estimated standard deviations in Floud, Wachter and Gregory's (1990) Table 4.8, especially for the 1800s, seem quite small, suggesting a strong degree of self-selection into the military consistent with this simple extension of the Roy model. 4 Simulating the Model To assess the potential importance of selection bias, we can apply reasonable values to the occupational selection model to investigate the ability of selected samples to yield accurate information about the characteristics of the unconditional population height distributions. The model underlying these simulations is a simplied version of the model presented in section 3. We focus on the decision to join the Army and abstract from the taste parameters (τ ); the constant terms in the log-linear wage equations and the wage shocks capture variations in tastes. Finally, we assume that each individual has a given shock that is specic to his earnings in the civilian or military sector. That is, our simulation is based on equations (1) and (3), but for the simulation we set τC = 0,and τ M = 0. Our simulation exercises vary αM , αC , γM , δ C , β M , and βC , γ C =0, as described below. To uncover accurate characterizations of the distributions of self-selected samples, we start with hypothetical populations of one million observations and simulate the process of selecting into the military. 20 A pseudo-random number generator gives this population normally distributed heights with a mean of 66 inches and a standard deviation of 2.5 inches. We assume values for the standard deviations of the two shocks, and for their covariance. 21 Table 1 lists the baseline values for each simulation parameter. Our primary interest is in the four behavioral parameters, αC , αM , βC and βM , though in this simplied model only the sectoral dierences in the intercepts and in the slopes matter for the decision to select into the military. We also investigate the implications of varying the standard deviations and covariances of the shock terms γ M M and δ C C . The results are not at all sensitive to the former for reasonable values, 22 Our baseline behavioral parameters were chosen in the so we only discuss them briey below. 20 Many characteristics of the self-selected population can be obtained directly from the analytic formula for the self-selected density function derived in section 3.1. See equation (11). 21 22 The simulation programs used here (written in Stata) can be obtained from the corresponding author. We performed these simulations in Stata version 12, using the seed 6175 for the pseudo-random number generator in our sets of simulations. With our sample sizes the seed is nearly irrelevant. We used the mean height of 18 year-old recruits around 1747, 61.75 inches (Floud et al, Table 4.1), to set the seed. 16 following way. First, we assume that the civilian sector was almost always preferred to military sector in the absence of shocks for all possible returns to height for persons of very low height. We do this by specifying that civilian log pay is 0.1 above military pay for individuals at a height four standard deviations below the mean (i.e., 56 inches). Second, we consider a range of dierent rewards to height. value, 0.02, 23 What matters is the dierence between and let βC βC and βM , so we set βM to a xed vary. We consider returns to height in the civilian sector that both exceed and fall below returns in the military sector. Finally, we choose baseline statistical values assuming that variation in the civilian labor market random component exceeds that of the random military component. This assumption is consistent with the narrow pay ranges determined by rank in the military. Figure 4.A reports the mean heights of soldiers and the fraction of the population in the military for a range of returns to height in the civilian sector. The gure also highlights two benchmarks. The vertical line at βC = 0.02 marks the point at which the return to heights in the Army and the civilian sector are identical, and there is no selection on height. The horizontal line marks the point at which half the population is in the military. Note rst that the fraction in the military declines monotonically with increasing returns to height in the civilian sector. Military heights in this graph also decline monotonically from a level above the mean population height to a level more than one inch below the mean population height. This pattern highlights the importance of selection on the height distribution for those wanting to enter the military before the imposition of any minimum height standard. At the point βC = 0.02, at which there is no selection on height, the 24 Above mean height in the military equals the population mean value of 66 inches. βC = 0.02, the region in which the reward to the civilian height exceeds the reward to height in the military, the mean heights of individuals willing to join the Army drops rapidly. When the return to height in the civilian sector exceeds that in the Army by 2 percentage points βC = 0.04, about 23 percent of the population nds the military sector more attractive than civilian sector, and the average soldier 23 We re-parameterize the log-wage specications as ln (w) = α + β ∗ (h − 56) + . This has the eect of allowing us to use real heights without articially magnifying dierences in returns to height in the two sectors. 24 The gure does not display an interesting feature generated by the model. It is possible for military heights to increase with increases in the return to height in the civilian sector, for the military sector. so long as civilian height returns are below that This can happen because of the dependence of the selection error variance and the sign and magnitude of the correlation of the selection error and height on the dierence in the height reward as described in equation (11). The fraction in the military, however, is always monotonically non-increasing in the return to height in the civilian sector. 17 is nearly one half inch shorter than the population mean. When βC equals 0.06, so the return to civilian height exceeds the return to military height by 4 percentage points, only about 11 percent of the population volunteers for the Army, and the volunteers are more than one inch shorter than the population mean. An important implication of the model is that selection operates across the entire distribution of height, not just on the left tail, so that the military may attract disproportionately many or few tall recruits. In Figure 4.B we illustrate this feature. We dene tall as taller than the population mean (66 inches) and we plot the fraction of individuals in the military and in the civilian sectors who are tall, according to this denition, against the civilian return to height. As the civilian return to height increases above the military return, the fraction of soldiers who are tall declines precipitously. When height rewards are identical in the two sectors (that is, βC = βM ), there is no selection on height and exactly half of the people choosing each sector are tall. Figures 5.A and 5.B contrast the density of heights for those in the Army and the population height distribution for two dierent returns to height in the civilian sector, namely 0.04 (in 6.A) and 0.06 (in 6.B). These values correspond to the civilian labor market rewarding each inch of height by two and four percentage points more than the military does. Figures 5.A and 5.B clearly illustrate one of our principal contentions: the military height distribution unambiguously shifts to the left, meaning the Army has relatively few of the tallest men and relatively more of the shortest men. The measured standard deviation of the selected heights for those in the military is also smaller than that in the population (2.49 for Figure 5.A and 2.44 for Figure 5.B versus 2.50 in the population). Yet the distribution of heights in the Army looks normal. Moreover, we know that the military densities reported there reect deliberate, non-random sampling schemes, and the analytic density formula clearly rules out that military heights follow a normal distribution. In any simulation exercise, a reasonable concern is that particular parametric assumptions generate articial results. We experimented with ranges of values for all of our behavioral and statistical 25 Changes to the intercept terms parameters and have a clear sense of how they aect the results. (αC , αM ) produce a small shift in the point at which heights begin to decline in Figure 5.A, but do not alter the fundamental result. The same can be said of reasonable variations in other statistical 25 Note that we can solve the two wage equations for the height at which an individual with mean-zero shocks is indierent between the Army and the civilian sector. We only consider parameter values that imply interior solutions; that is, we do not consider values that imply either everyone or nobody wants to join the Army. 18 parameters; increasing or decreasing the standard deviations of the shocks or their covariance aects the steepness and locations of lines in the graphs reported in Figures 5 and 6, but not the essential result that selection on height occurs across the entire population distribution of heights. These gures capture, in dierent ways, the core message of this paper. Even modest relative dierences in returns produce an Army that is selected in such a way as to be signicantly shorter than the population from which it was drawn. The message to practitioners engaged in empirical historical height research is also clear: apparent normality of the military height distribution does not obviate the need for clear and precise thinking about how selection into the sample may drive a wedge between the mean height of the sample and the mean height of the population from which the sample is drawn. 5 The power to reject normality in selected samples Almost all researchers in the historical height literature accept the hypothesis that population height distributions closely follow normal distribution laws. From equation (11) in Section 3, it is clear that the analytic distributions of self-selected heights derived from the simple application of the Roy model do not follow a normal distributions when height, or whatever height proxies for, dierentially aects the rewards in the military and civilian sectors. This suggests that a test of the null hypothesis of normality for an observed height distribution, against the alternative hypothesis of observed heights not following a normal distribution, might constitute a useful device for detecting the presence of sample selection bias. Tests of this sort seem to be the standard way scholars working in the heights literature argue that their samples do not suer from selection problems. Nicholas and Steckel (1991, pp. 941-2), for example, have used the failure to reject such a null hypothesis as a reason to dismiss concerns about self-selection biases. 26 In this section, however, we demonstrate that such tests often have little power to uncover the existence of self-selection biases even when the magnitudes of the self-selection biases are substantively important. In Figures 6.A and 6.B we repeat these comparisons of the population and selected density functions from Figures 5.A and 5.B along with the adjustment terms 26 Z (h) in equation (12) that Given the Jarque-Bera tests shown in Table 1, we cannot reject the hypothesis that the distribution of male English and Irish height is normal, or Gaussian. The absence of truncation bias is a desireable feature of our height data... This approach goes back to at least Soklo and Vilaor (1982, p.469). See also Margo and Steckel (1982, p.533) and Johnson and Nicholas (1997, p.206). 19 allows one to derive self-selected distributions from the population distribution for our simplied Roy model. Figure 6.A corresponds to a 2 percentage point higher return to height in the civilian sector (bC = 0.04 versus bM = 0.02), while in Figure 6.B the civilian sector has a 4 percentage point higher return to height than the military sector. The adjustment term for the 2 percentage point dierential implies, relatively, that ve foot tall individuals on average would be 2.21 times more likely to prefer the military sector than six foot tall individuals. The adjustment term for the 4 percentage point dierential is even more striking. Five foot tall individuals are 7.86 times more likely to prefer the military sector to the civilian sector. Those dierences in self-selection, of course, are what cause the 0.4 and 1 inch mean height dierentials of the selected military samples from the true population mean height. These are substantively meaningful dierences. These self-selected densities, as noted above, do look normal, even though they are not normal distributions, and so it is important to evaluate the ability of standard tests for normality to detect deviations from normality. A failure to reject normality would provide some reassurance about the representativeness of the observed data only if the power of the tests for normality were large enough to reject interesting alternative distributions.To evaluate this we use a popular test available in Stata, the sktest, that examines deviations of estimated measures of skewness and kurtosis in an observed sample from their theoretical counterparts derived from a normal distribution with the same mean and variances as estimated in the observed sample. 27 In Figure 7 we present power functions as a function of sample size for the selection model with a 4 percentage point dierential in the return. We use expected military sample sizes roughly in the range of 100 to 50,000 observations to correspond to sample sizes encountered in the military height literature. This specication of the data generating process (with a 0.04 percentage point dierence in the return to height, as above) implies a 11.34% unconditional probability of preferring the military to the civilian sector to the civilian sector, and so we generate our selected military samples from population samples sized from 1,000 to 500,000. For each population size, we draw 1000 samples, and for each of these samples we apply the selection model to obtain the military subsample. We then calculate whether or not the sktest test would reject the normal distribution assumption for the selected military subsample. For tests at the 5% level, even for military subsamples as large as 50,000, 27 For most sample sizes considered here, the Shapiro-Wilk (for N≤2000) and Shapiro-Francia (for N performed similarly. 20 ≤ 5000) tests one would correctly reject the null hypothesis only about 5% of the time. At the 10% level, one would reject only about 10% of the time. Even for the largest sample sizes in these two gures (which are large relative to those found in the historical literature), the tests have virtually zero power (above the tests' size/level) to reject the assumption that the selected samples come from a normal distribution. 28 This happens even though we know that they are not normally distributed; these samples have mean heights more than one inch shorter than the mean in the true underlying population height distribution. We can see the reason for the poor performance of these tests by examining bution of the selected heights sample. We compare Z (h) Z (h) in the distri- to rejection sampling. The idea behind rejection sampling is that one can sample from an instrumental distribution, and randomly accept each draw from that instrumental distribution as if it were from the target distribution. If this rejection sampling rate is proportional to the ratio of the density of the target distribution to the density of the instrumental distribution, then the random draws selected in this way will follow the 29 In other words, with the right kind distribution of random samples from the target distribution. of rejection sampling, one can sample from a true normal distribution in ways that yield another normal distribution. Suppose that the target distribution is a normal distribution with mean and variance equal to the mean and variance for the non-normal selected height distribution. tion fnor@mil (h). Let the population height distribution, f (h), Denote this distribu- be the instrumental distribution. When drawing from the population height distribution acceptance probabilities proportional to fnor@mil (h) /f (h) will yield samples following those that would be drawn from the selected height distribution because ˆ f (h) fnor@mil (h) f (h) ˆ dh = fnor@mil (h) dh. A comparison of this relative acceptance probability, (13) q (h) = fnor@mil (h) /f (h), to the Z (h) function modifying the population height distribution to yield the selected height distribution (in 28 29 Some studies do recognize the low power of these tests. See, for example, Sokolo and Villaor (1982, p.457). To make this analogy to rejection/acceptance sampling precise, we do need to assume that the height distributions only have positive support over nite subset of the real line, say for adult heights in the range 46 to 86 inches (e.g., the mean plus or minus eight standard deviations). The ratio of the target to the instrumental distribution needs to be bounded at all relevant values such that ftarget (h) < M finst (h) for some xed, nite value of M; this need not be satised for normal distributions with unbounded support. To simplify the notation, throughout this discussion we assume that M = 1, satises this criteria; allowing M to be larger than 21 1 only complicates the notation. equation (11)) reveals why the non-normal selected height distribution so close reembles a normal distribution. Figure 8 provides such a comparison for the case when the reward to height diers by four percentage points in the two sectors and the selected mean height diers from the population mean height by over one inch. This gure shows that the two dierent adjustments to the normal population height distribution are nearly identical, even though one adjustment, a normal distribution and the other, Z (h) q (h), yields exactly a self-selected distribution that clearly does not follow a normal probability model. Moderate amounts of self-selection can generate selected populations that dier substantively from the unconditional population distribution in terms of the rst two moments, but tests for normality would be quite unlikely to provide any evidence of self-selection biases because of the similarity of the selected distribution to a normal distribution. 6 Econometric analysis of selection in the British military data Our theoretical and simulation analysis demonstrates that studies based on volunteer armies and similar sources may suer from serious but undetected selection problems. We now ask whether a closer look at the sources might have identied the problem. Note the central diculty: we are asking whether we can identify selection bias from the selected alone. That said, we show in this section that there is discernible evidence of selection bias in at least one well-known study. Scholars rightly consider Floud, Wachter, and Gregory (1990) a central contribution to the historical heights literature. Their discussion of living standards in Britain in the period 17501980 relies heavily on three distinct samples that are all in some sense military. The Royal Army and the Royal Marines recruited adult men to serve in the forces. Floud et al also use samples of younger males to study age-patterns of growth We use a subset of the public use version of the Army data to show that heights of soldiers display patterns that are consistent with selection and inconsistent with the type 30 of explanations usually advanced in the heights literature. We begin by estimating two types of simple models. In the OLS versions, the dependent variable is height measured in inches. In the binary probits, the dependent variable is one if the individual is tall (which we dene in several dierent ways discussed below). In all models we have two types 30 The dataset we use is taken from the UK data archive, Long-term Changes in Nutrition, Welfare and Productivity in Britain; Physical and Socio-economic Characteristics of Recruits to the Army and Royal Marines, 1760-1879, study number 2131. We are grateful to Floud et al for making the data public in this way. 22 of regressors, which we call cohort conditions and current conditions . The rst is a series of dummy variables for the years in which the individuals were born. These variables should, under the standard of living interpretation, aect height, because they characterize the individual's experience as a child and young adult. Everyone born into similar socioeconomic circumstances in 31 The current conditions variables reect not 1820 faced a similar biological standard of living. the standard of living for a person's birth-cohort, but conditions obtaining at the time Army. he joined the Consider two men, both born in 1820. One joins the Army in 1842 and other in 1843. The heights literature gives no reason to think the former should be taller or shorter, assuming both have stopped growing by the time they joined. We might think, however, that if economic conditions in 1843 were relatively bad, then men from relatively more privileged background will be more likely to join than were similar men in 1842. Our argument implies that if there is no selection, then no current-conditions variable should explain the heights of current Army recruits. The current conditions variables we employ here enter with signs that make sense in the way just outlined. But we should stress that any currentconditions variable that aects conditional mean height, whether positively or negatively, is evidence of selection. If the heights of current soldiers yield an unbiased estimate of the heights of the entire male population, then no current conditions variables should enter the regression with meaningful magnitude or statistical signicance. We experimented with a number of dierent measures of current economic conditions. In general, they produce similar results, although they enter dierent specications in dierent ways. Our implicit baseline is a model such as those reported in Table 2. In this and all other work reported here, we limit the sample to men who joined the Army between 22 and 27 years of age, and who reported a height of 67 to 77 inches. (We restrict the sub-sample to older men to reduce the chance that some have not attained their terminal height. Floud et al do not always report the lower truncation point they use, but in one example (the birth cohort of 1806-1809), they use a truncation of 65 inches (Table 3.13)). We regress height on a series of single-year dummies for year of birth (1800 is the reference year). 31 The two binary probit models handle the information More precisely: in studies such as Floud et al, the main variable aecting height is birthyear. In this case the birth cohort (b) is the only determinant of heights. For those enlisting at date t, in the absence of selection biases E[h|a<h<c,b,z(t)]=f(a,c,b) and P[(a<h<c|b,z(t)]=g(a,c,b). background is known and used. 23 In other studies more information about the person's dierently. In one probit model we classify men as tall or short, dening tall if they are 68 inches or taller; in the other probit model, we dene tall as a recruit being 69 inches or taller. Tables 3-5 report the eects of adding dierent types of current-condition variables to the base- 32 The most straightforward approach is to let year of recruitment line models reported in Table 2. or age at joining the Army proxy for current conditions. Table 3 reports summaries from specications in which we enter single-year dummies for the age at which the soldier joined to the baseline models reported Table 2. We can reject the null hypothesis that age dummies are collectively zero in all three specications. And adding the age dummies does not reduce the importance of the year-of birth dummies. 33 This is a striking result, and shows that a simple way of parameterizing the impact of current conditions helps to explain the heights of recruits. Table 4 reports a similar exercise. This time we add single-year recruitment-year dummies to the baseline specications in Table 2. We reject the null that the year-of-recruitment dummies are zero for all three models. More importantly, in the probit specications we now cannot reject the null hypothesis that birthyear dummies are collectively zero. This suggests that current conditions might be more important in explaining recruit heights than are the birthyear dummies thought to proxy for the individual's biological standard of living. Table 5 reports a nal exercise of this type, which yields more ambigous conclusions. Rather than age at recruitment or year of recruitment, we wanted to see the impact of a more direct measure of current economic conditions. We use two measures of current economic conditions for a working-class person, both derived from the work of the Global Price and Income History Group. One measure is the wage of laborers relative to the wage of craftsmen, as estimated by Allen for London. 34 These are both real wages, but since Allen uses the same price index for each, dividing the two removes the inuence of consumer prices (so long as laborers and craftsmen consumed the 35 We same basket of goods). The other measure is the nominal price of wheat, again for London. also considered oats as a substitute for wheat prices, but the two series are highly collinear and using 32 We do not report the actual estimation results, but they are available upon request from the corresponding author. 33 The age-of-recruitment dummies imply the restriction that the current conditions are always the same for indi- viduals of a given age, which is clearly unrealistic. Alternatively, one could interpret signicant age at recruitment eects in these models as evidence for dierential self-selection into the military by age for those who are above any reasonable minimum height requirement. 34 35 See http://www.economics.ox.ac.uk/members/robert.allen/WagesPrices.htm See http://www.iisg.nl/hpw/data.php#united 24 wheat allows us to use a longer run of data. The wage measure and wheat prices performed similarly, so we only discuss models using wages. Why should current wages aect the decision to join the Army? Refer back to our model in section 3 above. Implicitly, young men were comparing the Army to the civilian world. When civilian wages rose, the Army became relatively less attractive. The regression asks whether this variation aected heights of new soldiers, that is, whether the variations in civilian wages at enlistment times aected men of dierent heights dierentially. Table 5 reports the coecients and t-ratios for the variables in question, along with formal tests for the null hypothesis that together the relative wage and price variables are zero. We see the expected impact of the relative wage variable in the OLS specication, but in the two probit models its impact is not statistically signicant. The wheat-price variable is never statistically signicant, and the two variables only belong collectively in the OLS specication. The OLS and probit specications capture dierent features of the height distribution, with the OLS model using more information from the distribution of heights in this sub-sample; the relative wage and price variables are used again in another specication, below. 6.1 36 Selection using the RSMLE approach The regressions discussed above are only an approximation to the model most heights studies have in mind. To get closer, we estimate the RSMLE model Floud et al use. Note the complication here; the RSMLE is only correct under the assumption that the distribution above the truncation point is free of any selection. We have shown this not to be true. So while our RSMLE estimates replicate and expand upon those reported by Floud et al, they are not, in some sense, correct, because the statistical assumptions upon which they rest are violated. But the same observation applies to the Floud et al estimates. Following Wachter and Trussell (1982a), we assume heights are normally distributed conditional on a set of covariates x, where the covariates can inuence either or both the mean and the standard deviation of heights. Because of possible rounding issues, we only consider the integer portion of reported heights; that is, we round heights down to 67 inches, 68 inches, etc. For each observation 36 It is important to recognize that the elasticities reported in Table 5 and how the relative wage and wheat price variables aect the birth year dummy variables tell us very little about changes over time in the population height distribution. That is because we have not explicitly incorporated the selection process into our empirical model. The fact that the time of enlistment variables are signicantly related to enlistee' heights, however, is evidence of some kind of selection. 25 i, dene the mean conditional on σi = exp (x0i πσ ). x and standard deviation conditional on x as µi = x0i πM The likelihood function for the integer values of heights, conditional on height being at or above a possibly person specic lower limit Li , is given by 1(hi ≥Li ) xhi y−µi i − Φ Φ xhi y+1−µ σi σi i 1 − Φ Liσ−µ i=1 i N Y where 1 (·) and is the indicator function and x·y (14) is the oor function that extracts the integer portion of the height. Table 6 reports estimates and standard errors for the RSMLE, where the covariates are birthcohort dummies in ve-year intervals, and the real-wage and wheat-price variables described above. The model suers from stability problems, and to estimate it, it was necessary to move from the 37 The message here is very strong: current conditions single-year to the ve-year birth cohorts. aect both the mean and the variance of heights in the RSMLE estimates. For the null hypothesis that both the relative-wage and the wheat-price variables are zero, the q2 statistic is 33.24, which 38 Under the null of no with four degrees of freedom is signicant at standard signicance levels. selection on current conditions, neither variable should matter. interpretation as in the OLS estimates. Shifting the mean has the same The eect on variance is also important, as it probably directly reects the changes in the right-hand tail that are a manifestation of selection. It is important to recognize that the birth cohort eect estimates in Table 8, even though the model controls for variables related to selection at the time of entry into the military, do not provide simply interpretable height variations. To obtain such estimates, one would need to construct a correct model for heights that incorporates explicitly the selection into the military process. For a static selection model, and under the assumption that all unobserved skills and tastes are normally distributed, one could use the selected height density function in equation (11) to derive maximum likelihood estimators of the parameters of the birth-cohort unconditional height distributions. We 37 For several single year birth cohorts, the model estimated mean population heights below 50 inches, sometimes well below zero inches. The ve-year birth cohort 1836-1840 yielded absurd mean height and standard deviation estimates (e.g. N(-10,15)), so we removd individuals in this ve year group from this part of the analysis. 38 The log-likelihood for the unrestricted model presented in Table 8 is -9689.4979. For the restricted version, where we remove the relative wage and wheat price variables from both the mean and the standard deviation in the RSMLE model, the log-likelihood is -9706.1187. The test statistic (2 x the dierence in the log-likelihoods) is distribution as 2 with degrees of freedom equal to the dierence in the number of parameters in the two models. q 26 did use this density function, with corrections for possible minimum height standards and rounding as discussed above, but the estimates were quite unstable. That is not surprising; the estimation strategy attempts to model the joint distribution of heights and selection using cross sectional macro economic measures and height data only from the right hand tail of a self-selected sample. 7 Treatment of sample-selection bias in the historical heights literature The scientic study of the human physical growth auxology emerged in the 1830s, with studies by European scientists, including Louis-René Villermé, Adolphe Quetelet and Eduoard Mallet, who gathered information on the heights of army recruits in France, Belgium and Switzerland respectively (Staub et al. 2011). Villermé drew a connection between height and health; Quetelet introduced the normal distribution to the study of practical scientic questions, including human growth; Mallet noticed a modest urban height advantage and attributed it to Geneva's relative prosperity. A halfcentury later Danson (1881) published his statistical study of English prisoners, which showed that males did not reach their terminal heights until after age 22. He concluded that armies should eschew 18-year old recruits because slightly older men who had already reached their terminal height would prove to be hardier soldiers. The thread that connects these studies is their belief that height reected well being. The thread that connects them to the modern literature, besides their concerns with human height and well-being, is that they relied on readily available and possibly selected samples. Selection was an issue at the dawn of statistical anthropometrics and, as we argue below, remains an underappreciated issue in the literature. In this section, we review how selection issues have shaped the discussion of three principal sources of height data: military recruits, slaves, and prisoners. With the exception of a fruitful, but underappreciated debate about height-based selection into the slave trade, concerns with selection have not been given their due. 7.1 Soldiers, military recruiting, and military school students Fogel et al (1982, pp.29-30) were the rst to call attention to the industrialization puzzle, namely that the positive correlation between height and per capita income observed in modern cross sectional studies did not hold in the time series for adult white males in the late antebellum United States. 27 They postulated that the decline in heights was concentrated in the urban-born population. Rapid urbanization, poor public sanitation and more pronounced urban income inequality meant the conditions of city life deteriorated during industrialization. Urbanites were shorter as a consequence. Several subsequent studies report remarkably similar patterns. Komlos (1989) reports a late eighteenth-century decline in the heights of adult male Hapsburg recruits. The mid-eighteenth century peak was not attained again for nearly 150 years. Floud et al (1990) nd a general secular increase in English heights during the Industrial Revolution, with a sharp reversal for cohorts born in the 1840s and 1850s. Komlos (1993) reworks the English gures and argues that that English heights declined between 1770 and 1830. Subsequently Komlos and Küchenho (2012) push the onset of English height decline back to the 1750s, a secular decline that persists to 1850. The era of early industrialization was an era of shrinking men. A'Hearn (2003) reports a 3cm decline in Italian heights between 1740 and 1800, attributing it to Malthusian pressures. Studies of U.S. soldiers generate similar results. Komlos (1987) and Coclanis and Komlos 1995, g. 3, p.101, use military school students to demonstrate long height cycles. The mean height of 19-year old West Point cadets falls by a half-inch in the late antebellum era, but recovers by the end of the Civil War. The mean height of 19-year old Citadel cadets is stable up to 1900 and increases 39 Steckel (1995) links several independently constructed series by about 2.5 inches up to the 1930s. of US soldier heights, from the French and Indian War (1754-1763) through the Second World War (1941-1945). His now nearly iconic diagram, reproduced above as Figure 2, demonstrates a decline in terminal adult heights for cohorts born between the 1830s and the 1880s. Once the industrialization puzzle was found in the time series, it was quickly uncovered in the cross section as well. Komlos (1989) nds that Hapsburg Empire army recruits from the most economically developed regions within the empire were the shortest, while recruits from the least developed regions were among the tallest. Similar patterns emerged elsewhere in Europe. Mokyr and Ó Gráda (1996) report that poor Irish recruits into the English East Indian Company (EIC) army were taller than less poor English recruits. Moreover, Irish EIC recruits from relatively wealthy 39 Gallman (1996) disagrees with Komlos' interpretation. The heights of 16- and 17-year old recruits declined between the 1820s and the 1830s, but then increased in the 1840s and 1850s. The cohorts of the 1860s were slightly taller than the 1820 cohort. These two age groups account for two-thirds of Komlos' sample and fail to demonstrate any long-run decline in height. A simple OLS regression of Komlos' height index on decade of birth yields a statistically insignicant coecient of -0.0066, or that heights declined by 0.6% per decade between 1820 and 1860. But the condence interval includes no change in height. The downward trend in the Coclanis and Komlos (1995) diagram is not readily apparent. 28 Ulster were shorter than recruits from elsewhere in Ireland. Sandberg and Steckel (1988) nd that mid-nineteenth century Swedish soldiers from the less developed north and east were taller than recruits from the more developed west. rural-born peers (A'Hearn 2003). Urban Italians paid a height penalty relative to their In the United States, Union Army troops from less developed Kentucky and Tennessee were taller than troops from the Old Northwest who were taller than troops from industrializing New England (Johnson and Nicholas 1997, p. 208). Among Pennsylvanian recruits, men from less developed and commercially-oriented regions were taller than men from the industrializing southeast region of the state (Cu 2005 p.207). Margo and Steckel (1992) also found that ex-slave recruits into the Union Army from the less commercialized inland areas were taller than those from the more commercialized coastal regions. Explanations of both the time-series and the cross-section puzzles built on those oered by Fogel et al (1982). The resolution of the industrialization puzzle oered in the literature focuses on the decline in net nutrition that occurred in the early stages of industrialization. According to this view, the underlying sources of decline were: (1) increasing income inequality; (2) increasing income variability; (3) increases in the price of food relative to manufactured goods; (4) increasing distance between the production and the consumption of food, with the consequent spoilage waste and loss of nutrients; (5) increased work eort; and (6) increased infection rates and disease incidence (Komlos and Coclanis 1997, p.455). The problem, of course, is accounting for many or most of these eects, 40 which, in the end, are more commonly asserted than shown to be the cause of height trends. Coclanis and Komlos (1995, p. 92), in fact, insist that any residual controversy concerning the industrialization puzzle centers on the nature and causal connections of height cycles because the existence of heights cycles, is no longer questioned. We beg to dier. While we do not reject the possibility that heights cycled prior to the long secular increase in heights observed in the twentieth century in the developed world, we remain skeptical about eighteenth- and nineteenth-century cycles because the literature has not taken the selection problem (as opposed to the truncation problem) seriously. Much as the literature follows Fogel et al (1982, p.42-45) in documenting the puzzle, it follows 40 Several studies attempt to estimate per capita food consumption (or disease incidence) and link it to birth cohort heights, but the exercise of recreating diets is fraught with assumptions, judgment calls and error (Floud et al. 1990) See Haines, Craig and Weiss (2003), Komlos (1987), and Bodenhorn (1999) for examples. Gallman (1996) and Haines, Craig and Weiss (2003) express skepticism concerning antebellum nutritional decline. 29 them in its discussion of selection issues. They write: Much of our work during the past four years has been devoted to assessing the quality of the data .... Volunteer armies, especially in peacetime, are selective in their admission criteria and often have minimum height requirements. Consequently, even if information on rejectees exists, there is the question of the extent to which applicants are self-screened... There is clearly evidence of self-selection bias in volunteer armies. Persons of foreign birth and from cities are overrepresented. Native-born individuals living in rural areas are underrepresented... Their concerns with the nativity and urban-rural composition of their samples do not address height-based selection into the sample; rather it reects selection on other characteristics within the sample. They then turn their attention to left-tail shortfall and their approach to correct for the nonnormality of the data that followed from the military's minimum height standard. More than two decades later, the discussion is hardly changed. After a brief discussion of existing methods for dealing with left-tail shortfall, A'Hearn (2003, p.359) turns his attention to selection: A second complication concerns selection eects. Whatever method is used [to correct for left-hand tail shortfall], the legitimacy of comparisons across groups depends on constant selectivity above the truncation point... [T]here is reason to worry about this for the sample at hand. In the eighteenth century, especially before 1770, it seems likely that recruiters did not get a random sample of individuals exceeding 63 in height...[But] it seems likely that selection disproportionately emphasizes heights in the immediate neighborhood of the statutory minimum. From there, the discussion returns to the truncation issue, and A'Hearn lets pass an opportunity to explore the implications of selection on heights throughout the distribution of heights rather than in the immediate neighborhood of the minimum height standard. Our survey of the literature uncovered only a few instances in which the selection problem discussed above is recognized and given its due. Floud (1994, p.19) acknowledges that recruits into volunteer armies represented a self-selected sample that may not be representative of the population. He continues, however, that any selection is unlikely to be large enough to vitiate comparisons 30 over time and between... countries. Weir (1997, p.174-5) disagrees. He recognizes if recruiter selection manifests as a strict exclusion only of those below the truncation point, the methods, such as RSMLE and QBE may correct for it, but if selection is continuous across the entire distribution of heights, these estimators fail to generate accurate estimates of mean height. Meier(1982, p.297), in his comment on the original Wachter and Trussell (1982a) paper introducing the RSMLE and QBE estimators, pointed out that recruitment eort and disparagement of shorter individuals might vary continuously over a range of heights and could yield a selected military height distribution with a nearly normal shape. Wachter and Trussell (1982b, p.302), in their rejoinder, agreed that some scenarios for recruitment and volunteering could yield spuriously normal observed distributions whose failure to represent the underlying population would be undetectable from internal evidence. In his critique of Komlos' (1987) study of West Point cadets, Gallman (1996, p.194) not only refuted Komlos' claim of nutritional decline but raised serious concerns about selection. If Komlos' contention that cadets' heights actually declined were true, it would only be interesting if cadets can be taken to be a random sample of some larger, more interesting group say all young white men in the United States. That, of course, cannot be. Not everyone was eligible for West Point and, of those who were, young men who attended were interested in either a military or engineering career. Despite the unrepresentative nature of the sample, Gallman (1996, p.194) notes that it might still have wide meaning if the pool of candidates retained an unchanging character across the full period of this study that is, if the same segments of the society continued to supply candidates, in roughly the same relative numbers. But is that likely to have been so? Gallman doubts it. Cohorts born after the early 1840s and interested in a military career would have faced the prospect of serving in an army with a large permanent class of lieutenants, captains and majors who had gained battleeld experience in the U.S. Civil War. This surely pushed many otherwise promising cadets into other pursuits. A stylized fact of the European industrialization puzzle literature is Mokyr and Ó Gráda's (1994; 1996) nding that poor Irish recruits into the English East India Company (EIC) army were taller than similarly situated English recruits. A number of possible explanations have been oered for this counterintuitive insight, including the relatively nutritious Irish diet of milk and potatoes and epidemiological isolation (Nicholas and Steckel 1997). Mokyr and Ó Gráda (1994, p.42) oer an alternative explanation: because incomes were lower in Ireland than England, the relative quality 31 of Irish recruits was higher. The poor Irish appear to be taller in the record than they really were because the EIC drew a larger proportion of recruits from better-o families. The taller Irish were not really better o than the shorter English. Faced with less attractive civilian employment opportunities for a given height, taller Irish men were disproportionately more willing to enlist than were otherwise comparable (i.e., equally tall) English men. The tall Irish recruits, they note, were a supply-driven rather than a demand-driven selection phenomenon. The remarkable feature is not that Irish men presenting themselves for service in the EIC were taller than their English counterparts; the remarkable feature is that the tall-but-poor Irish result is widely repeated in the literature absent Mokyr and Ó Gráda's repeated concern that the result is chimerical. 41 Our reading of the historical heights literature that makes use of military samples, which represents a plurality, if not an outright majority of the studies, reveals that sample selection bias is underappreciated. When the issue does arise, it is given passing notice and attention turns to left-tail shortfall (or truncation bias). Given our theoretical and empirical discussion above, the failure to take sample selection bias seriously raises substantive concerns about the existence of and explanations for the industrialization puzzle. 7.2 Slavery and the slave trade Given that one of the principal contributions of the modern heights literature is the demonstration that income and wealth inequality manifests itself in human height-at-age, it is not surprising that historians were intrigued by the anthropometric consequences of slavery. Relatively deprived children are consistently shorter at age than relatively well-o children, and it is hard to imagine a more potentially deprived population than slaves. Economic historians have produced several notable studies of slave heights, which have led to two general results. First, slave children were extraordinarily small (Steckel 1995, 1923). The mean height of slave children generally fell below the rst percentile of modern stature until about their tenth year. These kinds of heights are sometimes observed in developing countries today, but are practically unheard of in the developed world. As Steckel (1987) observes, a modern American pediatrician would be alarmed if presented 41 Nicholas and Steckel (1997), Johnson and Nicholas (1997), Komlos (1998, p. 780), among others, discuss the tall-but-poor Irish without referring to selection concerns. Mokyr and Ó Gráda (1994, p.50) also doubt the time series evidence. The 1810-1814 (birth) cohort of recruits was taller than the 1802-1809 cohort not because the biological standard of living had changed noticeably, but because labor market conditions changed enough that the EIC had greater success attracting taller Irish recruits. 32 with such a short child. Second, the growth of slaves in adolescence was remarkably vigorous so that adult slaves attained nearly the 20th percentile of modern stature. Such remarkable recovery growth generated tall slaves, at least by historical standards. Although they were shorter, by about one inch, than contemporary white Americans, slaves were taller than many contemporary European populations, especially low-income Europeans. Economic historians have constructed plausible explanations for these two features of slave heights. Poor medical knowledge and even poorer practice led to high infant mortality rates, which is probably indicative of high morbidity rates from both acute and endemic (gastrointestinal and 42 Because diarrheal infection interferes with the nutrient-growth nexus, per- diarrheal) infections. sistent endemic infection will lead to stunting. If nutrition is simultaneously low, the negative consequences on growth are magnied. Slave children, according to this literature, survived on a poor diet of hominy and pork fat. Growth recovery began around age 10 because the typical slave child entered the plantation labor force around that time. Normally, the demands of heavy work expected of slaves would have further interfered with growth, but once children entered the labor force they received shoes, which reduced the extent of fecal-based gastrointestinal infection, as well as more and better food, perhaps as much as one-half pound of pork per day. The increased meat ration for working slaves was further supplemented by vegetables and legumes, which contributed to the remarkable catch-up growth of North American slaves. Although information on US slave heights comes from a host of sources contraband or runaway slaves that joined the Union Army in the 1860s (Margo and Steckel 1982), notarized certicates of good behavior led with New Orleans' courts (Freudenberger and Pritchett 1991), manumission and freedom papers (Komlos 1992; Bodenhorn 2011), and runaway advertisements (Komlos 1994), among others the principal source of information are the data recorded in the coastwise manifests led by slave traders (Steckel 1979; Steckel 1987). To enforce the prohibition 42 Infant survival may create its own self-selection bias in that the ability to resist certain infections, which may manifest itself as vigorous growth capacity in adolescence, may be partly responsible for the observed pattern of slave growth. Instead of recognizing this possibility, vigorous adolescent growth is attributed to dicult to document changes in the slave child's work regimen, disease environment and food allotments. Rees et al (2003) develop a dynamic optimization slave owner model linking planter prot maximization with the pattern of slave growth. A related literature arose around the question of whether smallpox reduced the height of survivors. Voth and Leunig (1996) claim that the near eradication of small pox in nineteenth century England is responsible for about one-third of the reported increase in average heights between 1770 and 1873. Voth and Leunig's nding brings attention to potential survivor bias and the eects it may have on recorded heights over time. See Razzell (1998) and Oxley (2003) for critiques of the Voth-Leunig hypothesis. 33 on international slave trading, slave traders moving slaves in the seaborne, interregional, domestic trade were required to provide shipping manifests that included the name, age, sex, color and height (in feet and inches) of the slave, as well as the name and residence of the slave trader. Prior to a ship's departure the captain prepared a duplicate manifest with slave and trader information along with information on the ship. One copy of the manifest was kept by the collector at the port of origin; the other was presented to the collector at the port of destination. Tens of thousands of slaves were recorded on manifests, many of which are now available at the National Archives. The issue of whether selection bias aects the observed pattern of height-at-age and growth velocity is not discussed in either Steckel (1979) or Steckel (1987), but it merits two paragraphs in Trussell and Steckel (1978, 550-551), who use the data to conrm their estimates (drawn from other sources) of age at menarche and age at rst birth among slave women. They discuss two possible selection eects: a downward bias due to a lemon's market in slaves, in which only below-average height slaves entered the interregional trade; or, an Alchian-Allen (1964) shipping the good apples out market in which a xed shipping cost represented a smaller fraction of the higher price received for taller slaves making taller slaves more likely to enter the interregional trade. They contend that age at peak growth velocity (onset of adolescent growth spurt) would not be much aected by either eect and do not further pursue the issue of potential sample-selection bias. Higman's (1979; 1984) studies of Caribbean slaves provide some empirical evidence that selection may be driving Steckel's adolescent recovery nding. Unlike the heights of US slaves, which was recorded only if slaves entered into the coastwise trade, the British government required universal registration of slaves in anticipation of general emancipation. Registers of Trinidadian slaves were open for public inspection and government ocials visited plantations to conrm their validity, making corrections when necessary. Caribbean slaves demonstrate less adolescent recovery; some attain only the 2nd percentile of modern height (Steckel 1995, p. 1924). Height dierences between US and Caribbean slaves may be due, of course, to dierences in genetic potential, disease incidence, work load or diet (Steckel 1995, p. 1925, in fact, attributes them to fetal-alcohol syndrome due to substantial and growth-retarding rum allowances on Caribbean plantations). But we need not turn to dicult-to-prove hypotheses when the dierences were driven, at least in part, by height-based selection into the US sample and near-universal coverage in the Caribbean sample. Freudenberger and Pritchett (1991), Pritchett and Freudenberger (1992), and Pritchett and 34 Chamberlain (1993) systematically explore the sample-selection bias problem of the coastwise manifest sample. They argue that the manifest sample is subject to substantial selection on height of the shipping the good apples out type. That is, when a xed transportation cost is applied to similar goods, the high-quality, high-priced good (a taller slave in this instance) becomes relatively less expensive in the destination market. Four features are needed for the good-apples eect to hold: (1) transport costs must be non-negligible; (2) transportation costs are not proportional to price at the source; (3) the goods are close, but not perfect substitutes; and (4), the elasticity of substitution between each of the two goods in question and a composite third good (say, free or indentured labor) not be substantially dierent (Borcherding and Silberberg 1978). Pritchett (1997) and his coauthors provide evidence on transportation costs that supports conditions (1) and (2). Conditions (3) and (4), they contend, are defensible: tall and short slaves are (imperfect) substitutes in production; and free or indentured labor, as the historical record shows, were substitutable for slaves, whether tall or short. 43 Moreover, they show that the eect will be more pronounced for younger than adult slaves. One prediction of the good-apples slave model is that prices will be a positive, but declining function of age (Pritchett 1997, p.73 note). Because buyers were willing to pay more for taller slaves, traders faced incentives to select taller slaves for the coastwise market and the incentives were stronger, the younger the slave. Calomiris and Pritchett (2009) nd that slave children shipped with their mothers were shorter than children of the same age shipped alone. This result is puzzling absent selection on height.The available evidence is consistent with this prediction; the height dierential between slaves traded in the interregional market and those traded in local markets were most 44 pronounced in childhood, declining in adolescence and largely disappearing among adults. The conclusion to draw from the exchange between Pritchett and his coauthors and Komlos and Alecke (1996) is that slaves shipped coastwise were not representative of the general population of North American slaves (Pritchett 1997, p. 83). In this instance, the consequence of sample-selection bias works in favor of one of existing interpretations. Because selection on height was stronger for younger slaves than for adult slaves, Steckel's coastwise sample includes proportionately more 43 See Komlos and Alecke's (1996) response to Pritchett's evidence The substitutability of slave and indentured or free labor is discussed in Galenson (1981) and Grubb (1994; 2001). 44 Pritchett's (1997, p. 78) empirical work is not without its own problems namely, that he estimates what Hausman labels the forbidden regression (a probit rst-stage in an IV system). p.190) for a discussion of the so-called forbidden regression. 35 See Angrist and Pischke (2009, tall children than tall adults than would be observed in a true random draw of the general slave population. This means that the much-discussed remarkable recovery growth of adolescent slaves is underestimated. It is likely that young slaves were even more pathologically short than they appear in the manifest sample. Unfortunately, we cannot provide as sanguine a conclusion about the second result of the slave height literature. The good-apple model predicts that selection will be less pronounced at older ages and lowest among adults, but it does not predict the absence of selection. Higman's (1979) study of Caribbean slaves, in fact, provides some evidence of selection on height for inter-island trade. The mean height of native-born Trinidadian adult males (25-40 years) measured in 1813 was 165.6cm. The mean height of creole male slaves, or those born in the New World, and imported into Trinidad from sugar islands was signicantly taller at 167.3cm (t=2.15). Creole slaves imported from nonsugar island were taller still at 170.6cm (t=4.42). Imported females, too, were signicantly taller than native-born Trinidadian female slaves. The available evidence, although not denitive, points toward good-apples selection in the interregional slave trade. This type of selection need not be revealed by (non)normality of height distributions, or heaping on height or age, or left-tail shortfall. That the selection process is not readily revealed does not, of course, imply that it is unimportant. Unrepresentative selected samples will yield incorrect inferences when selection is correlated with the variable of interest. Pritchett's concerns with Steckel's results are well-founded and it is unfortunate that his contributions have not had more inuence on the literature. 7.3 Criminals and prisoners The anthropometrician's concern with the condition of the working classes during industrialization, in combination with the wide availability of data, led scholars to study the heights of incarcerated or transported criminals. Scholars making use of prisoner data readily acknowledge that prisoner samples are not representative of the underlying population; the professional and middle classes and farmers are underrepresented and unskilled laborers and other low-wage groups are overrepresented (Steckel and Nicholas 1991; Riggs 1994). In one of a series of studies of nineteenth-century US prisoners, Carson (2008, p.591) casts the selection vice as a virtue: 36 The prison data probably selected many of the materially poorest individuals, although there are skilled and agricultural workers in the sample. While prison records are not random, the selectivity they represent has its own advantages in stature studies, such as being drawn from lower socioeconomic groups, who were more vulnerable to economic change. For the study of height as an indicator of biological variation, this kind of selection is preferable to that which marks many military records minimum height requirements. In other words, prisoner heights are preferable to soldier heights because prisoners are presumably not selected on height and prison samples are not subject to left-tail shortfall. Although Carson labors to turn a vice into a virtue, it remains a vice if the purpose of the historical anthropometric literature is to speak to the average standard of living. Several studies (which are not without their own selection problems) have shown that the heights of the middle and upper classes in the nineteenth-century United States did not mirror the height cycles of soldiers and prisoners (Sunder 2004; Sunder 2011; Lang and Sunder 2003). Thus, if lower-class heights are more responsive to economic changes than middle- and upper-class heights, samples in which the lower classes are overrepresented are likely to overstate any real changes in population heights. Moreover, inferences about changes in the average biological standard of living may reect changes in the sample composition, which may be pronounced over the business cycle or during long-run secular changes in rates of economic growth. Using a small subset of the available samples, Sunder and Woitek (2005), in fact, show that co-movements between economic and height cycles are most pronounced for American blacks (predominantly unskilled labor), relatively pronounced for Austrian soldiers and Ohio National Guardsmen (relative overrepresentation of laboring classes), less pronounced for late nineteenth-century registered voters, and virtually nonexistent for middle- and upper-class women. The potential for compositional changes in the heights of individuals selected into prisons to drive temporal height changes has not gone unnoticed. Nicholas and Oxley (1996), Johnson and Nicholas (1997) and others have considered whether the relative under- or over-representation of certain groups reects population proportions or change over time. p.949), for example, oer the following: 37 Nicholas and Steckel (1991, Although there is no single powerful test of selectivity bias, the various tests reported hereafter put at risk the hypothesis that the decline in heights was an artifact of our data. Almost one-third of our convicts were tried in a county other than that in which they were born, but it is not possible to know whether the move occurred during childhood, adolescence, or after maturity. Assuming that all convicts tried in the same county as their place of birth were nonmovers, ve-year moving averages of height for rural- and urban-born nonmovers showed the same prole as that for the entire sample. This argument is irrelevant to the issue of selection into their sample; it merely shows that there is no apparent selection into two sub-populations within their sample. A common thread connecting studies of the industrialization puzzle is the proportion rural- and urban-born. This is natural given concerns that nineteenth-century urbanization led to a marked decline in the biological standard of living. But this is not exactly the issue. The issue is whether cohorts born into urban areas, conditional on family income and wealth (and thus height), were more likely to enter into crime, and be apprehended, tried, convicted and sentenced to prison or transportation. In the modern world, crime is more common in urban than rural places and there is evidence that this was so historically as well (Glaeser and Sacerdote 1999; McLynn 1989). Moreover, evidence from nineteenth-century US prisons shows that prisoners were short compared even to soldiers (Bodenhorn, Moehling and Price 2012). The issue is whether selection into prison, conditional on selecting into crime, followed from height, which seems likely. Persico, Postlewaite and Silverman (2005) and Case and Paxson (2008) nd evidence that height is positively correlated with labor market outcomes (employment and wages), mediated through cognitive abilities and the accumulation of more human capital by taller youth and adolescents. Because taller individuals face relatively better legitimate labor market opportunities than shorter individuals, criminal activities are less attractive to taller people. Prisons are thus populated by short, relatively low-income people. Researchers take solace in the fact that their samples are Gaussian but, as we have demonstrated above, apparent normality alone will not reveal selection even in the presence of selection. The issue that concerns us is whether selection on height was likely to change in a way that would generate the counterintuitive results reported in the historical heights literature, whether those changes are evident in the research, and whether the explanations are consistent. The Roy-type model developed 38 above predicts that better economic conditions are consistent with shorter prisoners in both the cross section and the time series. If legitimate labor market opportunities are conditioned, at least in part, on height, economic expansions will create more and better legitimate opportunities and short individuals who may have selected into crime in bad times will select into legitimate activities in good times. While criminals will exhibit a distribution of heights, they will be disproportionately drawn from the left-hand tail of the population distribution and even more so as economic conditions improve. If we use the heights of prisoners as indicators of the biological standard of living, it will appear that biological times are tough when economic times are good and vice-versa. In fact, what we are observing is dierential selection on height into the criminal class that gets caught and convicted. Consider Riggs's (1994) study of Scottish prisoners. He claims the sample is representative of Scottish working-class heights because in a society of heavy drinkers ... many workers were at risk of being arrested and thus having their physical stature preserved in the historical record (Riggs 1994, p.64). While his results are in general agreement with the industrialization puzzle, he nds the curious result that the heights of those arrested in the 1840s actually increased markedly. Riggs (1994, p.70) considers this curious because the 1840s, known as the hungry forties, were notorious for hardship and hunger. When considered in light of our Roy model, the result is not so curious. If the eect of food shortages and unemployment was that men moved from legitimate to criminal activities, the deterioration in legitimate opportunities would draw dierentially more men into crime from the right-hand than the left-hand tail of the height distribution because men in the left-hand tail were already disproportionately in the criminal market prior to the downturn. Moreover, if right-tail entrants into the crime market have relatively little criminal human capital, they were probably more likely to have been apprehended and imprisoned. Thus, heights apparently increase during a sharp economic downturn when the intuitive connection between the biological and economic standard of living would suggest otherwise. In a series of studies, Stephen Nicholas and co-authors study the heights of late-eighteenth and early nineteenth-century English and Irish prisoners. They acknowledge that their sample is unrepresentative in that the lower classes and the young are overrepresented, but one feature recurs: heights. the Irish are taller than the English, a feature discussed earlier in reference to military Nicholas and Steckel (1991) nd that male Irish convicts are taller than male English 39 convicts. Nicholas and Oxley (1996) nd that the heights of rural-born English women decline while the heights of Irish-born women rise. Johnson and Nicholas (1997) use the Irish as a control group against which to compare English women because no Industrial Revolution occurs in Ireland. Yet, Irish women are tall relative to the English. selection biases... They are condent that there are no obvious that would generate their results. Their contention is puzzling in that they cite and discuss Mokyr and Ó Gráda's (1994; 1996) result without discussing their conjecture that the Irish height advantage results from the army being more attractive to taller people in a poor economy. Is it not also likely that criminal activity is relatively more attractive to taller people in a poor economy? Finally, studies of American prisoners yield results similarly at odds with the industrialization puzzle. Komlos and Coclanis (1997) nd that the heights of black men born into slavery did not decline in the 1830s and 1840s; Sunder (2004) reports that heights of Tennessee prisoners were stable in the late-antebellum era; Carson (2008) nds no evidence of the antebellum puzzle among Missouri's prisoners.These studies then attempt to reconcile stable or rising stature with the antebellum puzzle. Komlos and Coclanis (1997, p.452) contend that slaves were insulated from the market and ate all they were allotted; Sunder (2004) argues that Tennesseans raised hogs and did not sell meat in the interregional market so that they had plentiful access to protein; and Carson's (2008, p.603) explanation turns on the relatively tall stature of the low-income Ozark Missourians due, in large part, to their reliance on dairy (a variation of the tall-but-poor Irish 45 Alternatively, explanation), whereas wealthier northern Missourians grew less protein-rich grains. our Roy model predicts a south to north Missouri prisoner height gradient that is consistent with a north to south income gradient. The relatively less attractive legitimate market opportunities in the Ozarks drew relatively taller men into the criminal market; better opportunities in northern Missouri drew men of comparable height, conditional on other characteristics, into the legitimate labor market. Our test of the Roy-type model in Section 5 above notwithstanding, we acknowledge that we have not tested our theory against every available data set. It is possible that it will not explain 45 Rees et al (2003) provide a dynamic optimization model of slave owner behavior consistent with the remarkable catch-up growth uncovered by Steckel (1986; 1987), which oers some insight into care and feeding of slaves in response to changes in the market price of slaves. The Rees et al (2003) model presumes substantial market-oriented responses rather than insulation from the market. 40 every historical human height experience. Only additional testing can illuminate the model's shortcomings. What the model provides, however, is a set of internally consistent predictions that do not rely on ad hoc reconciliations of the evidence with the industrialization puzzle. Military and prisoner heights rise in bad economic times because alternative employments become relatively more attractive to the legitimate civilian labor market. Some tall men who would nd remunerative legitimate, civilian employment in a good economy turn to the military or crime in a bad one. Alternatively, some short men who would not have attractive legitimate market opportunities in bad times will be drawn into the civilian market in good times. The result is an observed, but not actual countercyclical pattern of biological well being. 8 Conclusions The heights literature is now a large and important component of economic history and related areas. We agree that the central issues of interest in this literaturelong-term changes in the standard of livingshould occupy a prominent place in the agenda of social scientists interested in the past. Unfortunately much of the current literature that uses heights as an indicator of living standards is badly awed. Most of these sources are selected in some way, and the biases that arise because of the selection are pernicious and quite possibly account for many of the interesting facts the heights literature claims to have identied. Our message may seem entirely negative, but that is not how we intend it. First, we do not advocate rejection of the heights literature, its themes or basic approaches. We do think that taking selection into account will require tempering of some of the broader claims made. We also hope that attention to selection will guide the literature away from complicated, ad-hoc explanations of surprising ndings; as we note, some of these at least reect little more than selection. Second, and more importantly, we hope that careful attention to selection issues will allow these sources to speak in new and interesting ways. The nature of selection into the British Army (for example) is a statistical problem for those who want to use soldiers' heights to infer changes in the standard of living standards during the Industrial Revolution. But the selection process itself can yield rich insights into the nature of labor markets at the time: what led some young men into the Army, and what does this tell us about working-class life and the opportunities oered by an industrializing 41 economy? Taking selection seriously, instead of assuming it away, promises yet more insights from an already mature literature. 42 Appendix: the military's decision to accept a recruit The occupational choice model developed in section 3 implicitly assumes that the Army will accept everyone who wants to volunteer. We know this assumption to be false; the most obvious restriction was the minimum height requirement that generates the need for estimators such as the RSMLE and the QBE. We now extend our model to allow the military, too, to be an optimizing agent, one that can reject for military service some individuals who volunteer. We assume the Army produces a service called security using a constant-returns-to-scale pro- 46 duction function that depends on the sum of security services provided by all current soldiers. Each individual soldier produces a security service that is a function of his height military abilities εM . Again, we are agnostic on whether height per se h and a set of makes a better soldier, or whether height is simply correlated with characteristics that make men better soldiers. The Army pays each soldier in the manner described earlier. The Army must also pay additional xed costs for each serving soldier over and above what that man receives as a wage. These costs include uniforms and training. The net security value (security value minus costs of hiring) of a soldier with height h and military abilities εM S (h, εM ). is One would expect S (h, εM ) to be positive, but one can imagine soldiers who cost more than they are worth. Now abstract from our log-linear formulation of the utility of being a civilian and the utility of being a solider. Suppose the civilian economy values these same two traits along with a third trait that is specic to the civilian sector, εC . Civilian wages are given by wC (h, εC , εM ) . (15) Suppose the military can pay soldiers a wage that is a function of all three characteristics. Assuming the military can observe h, ε C , and εM , let wM,h,C,M (h, εC , εM ) be the wage that the military pays a soldier. 46 (16) We continue to assume that each individual has In our model, dierent types of soldiers can produce dierent amounts of security, but there are no externalities or spillovers; a good (bad) soldier does not increase (decrease) the eectiveness of other soldiers. 43 preference parameters for the civilian and military life τC and τM . This implies that a man joins the Army if wM,h,C,M (h, εC , εM ) + τM ≥ wC (h, εC , εM ) + τC (17) τ ≤ wM,h,C,M (h, εC , εM ) − wC (h, εC , εM ) (18) or where τ = τC − τM . Let fh,M,C,τ (h, εM , εC , τ ) be the joint distribution of the three characteristics that can be re- warded, as well as relative tastes, and assume that the military minimizes the cost of producing any particular level of total security. The military's objective is to choose a wage payment function wM,h,C,M (h, εC , εM ) min S, or Cost = wM,h,C,M (h,εC ,εM ) ˆ to minimize the cost of producing a given amount of security ∞ˆ ∞ ˆ ˆ ∞ wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM ) N wM,h,C,M (h, εC , εM ) fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh −∞ 0 −∞ −∞ (19) subject to providing some specic level of security ˆ ∞ˆ ∞ S=N ˆ ∞ ˆ wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM ) S (h, εM ) −∞ 0 fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh −∞ −∞ (20) where N is the size of the population. Let λ be the Lagrange multiplier corresponding to the security constraint. Assuming interior solutions, for each combination of wage, wM,h,C,M (h, εC , εM ), ˆ (h, εC , εM ) the military will set the such that wM,h,C,M (h,εC ,εM )−wC (h,εC ,εM ) N fh,M,C,τ (h, εM , εC , τ ) dτ −∞ + N wM,h,C,M (h, εC , εM ) fh,M,C,τ (h, εM , εC , wM,h,C,M (h, εC , εM ) − wC (h, εC , εM )) = λN S (h, εM ) fh,M,C,τ (h, εM , εC , wM,h,C,M (h, εC , εM ) − wC (h, εC , εM )) . (21) The left hand side of this expression is the marginal cost of paying soldiers with productive traits 44 equal to (h, εC , εM ) each one more penny; it is the sum the Army must pay each soldier already in the military (conditional on having this particular set of traits) plus the cost of expanding the Army, that is, paying for additional soldiers with those same traits. The right had side of the expression measures the marginal benet of the increased security provided by the additional soldiers (with these traits) induced into the military by the higher wage. For interior solutions, the ratio of the marginal costs of soldiers with dierent sets of characteristics should equal the ratio of their marginal benets. This formulation implies that the military can extract some surplus from those enlisting in the military. That is, the optimal military function may depend on observed characteristics that have no impact on a potential recruit's ability to provide military services. To see this, consider two sets of individuals whose military-relevant characteristics are identical, but who dier in characteristics valued by civilian jobs, εC . Because they dier in their Assume the military enlists some positive fraction from both groups. εC values, they have dierent potential earnings in the civilian sector. If they are equally represented in the population, then the rst order conditions will be satised as long as the military keeps the military-civilian pay dierential constant for the two groups. The military will pay the group with the lower valued civilian sector trait, εC , less than it would pay enlistees from the group with the civilian sector trait that is more highly valued, even though this trait has no impact on their performance as a soldier. For a variety of reasons, the military may not be able to use such ne details to make pay oers. Suppose, for the moment, that the military can only observe height, pay policy on this one characteristic. In this case h, and so it can only base its wM,h,C,M (h, εC , εM ) = wM,h (h), and the rst order conditions become ˆ ∞ ˆ ∞ ˆ wM,h (h)−wC (h,εC ,εM ) N fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM −∞ −∞ ˆ −∞ ∞ ˆ ∞ + N wM,h (h) fh,M,C,τ (h, εM , εC , wM,h (h) − wC (h, εC , εM )) dεC dεM −∞ −∞ ˆ ∞ ˆ ∞ = λN S (h, εM ) fh,M,C,τ (h, εM , εC , wM,h (h) − wC (h, εC , εM )) dεC dεM . −∞ (22) −∞ The left hand side of this rst order condition is the expected cost of increasing the military pay by one penny for all people with height h. It consists of two components. The rst is the additional 45 penny paid to all individuals of height h who would have joined the military even without the higher pay. The second term is the expected full cost of the marginal enlistees with height h. The right hand side again measures the expected marginal benet from the new enlistees with height h who enter the military because of this increased level of pay. a priori restrictions on the sign or magnitude of the military pay for any observed set of characteristics. For some characteristics, Note that both of these military pay formulations place no (h, εC , εM ), S (h, εM ), could be negative. This would imply that the recruit has to pay the Army for the privilege of being a soldier. We know of no such practice and think it is a principal reason for minimum height requirements. More realistically, military compensation schedules during the eighteenth and nineteenth centuries usually did not include explicit height gradients. But they usually had a minimum height requirement in addition to a xed level of compensation. To incorporate these features into the cost minimization model, we allow the military to choose a non-varying level of pay and to restrict military service to those above some optimally-chosen minimum height threshold D. Now the military's cost minimization problem can be written as ˆ ∞ˆ ∞ ˆ ˆ ∞ wM −wC (h,εC ,εM ) min Cost = wM N fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh wM ,D D −∞ −∞ (23) −∞ subject to the specied security level ˆ ∞ˆ ∞ S=N ˆ ∞ ˆ wM −wC (h,εC ,εM ) S(h, εM ) −∞ D fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh. −∞ (24) −∞ The two rst order conditions are, in this instance ˆ ∞ˆ ∞ ˆ ∞ ˆ wM −wC (h,εC ,εM ) N fh,M,C,τ (h, εM , εC , τ ) dτ dεC dεM dh D ˆ +N −∞ −∞ −∞ ∞ˆ ∞ ˆ ∞ wM ˆ D∞ ˆ −∞ ∞ = λN D −∞ fh,M,C,τ (h, εM , εC , wM − wC (h, εC , εM )) dεC dεM dh ˆ ∞ S (h, εM ) fh,M,C,τ (h, εM , εC , wM − wC (h, εC , εM )) dεC dεM dh −∞ −∞ 46 (25) and ˆ ˆ ∞ ∞ ˆ wM −wC (D,εC ,εM ) −wM N ˆ fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM −∞ ∞ = −λN −∞ −∞ ˆ ∞ ˆ wM −wC (D,εC ,εM ) fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM . S (D, εM ) −∞ −∞ (26) −∞ Consider the rst of the two rst-order conditions. The rst term on the left hand side is the marginal cost of paying each soldier not at the margin of joining the military an extra penny, and the second term in that left hand side is the full cost of paying the marginal soldier who enters the military at the wage wM . Their sum is the cost of inducing an additional expected soldier into the military at this wage. The right-hand side of that rst rst-order condition is the expected marginal value of the security services provided by the marginal soldier induced into the military at this wage. The second rst-order condition describes how the minimum height threshold is set when the optimal military wage is given. The left hand side is the expected cost savings gained by requiring the marginal soldier to be one inch taller, and the right hand side is the loss in security from excluding this marginal soldier from the military. Rearranging the second rst order condition yields: ˆ ˆ ∞ ∞ ˆ wM −wC (D,εC ,εM ) [λS (D, εM ) − wM ] −∞ fh,M,C,τ (D, εM , εC , τ ) dτ dεC dεM = 0. −∞ (27) −∞ The military will set the minimum height requirement to the level where the expected value of the military service for a soldier of height D, given the military wage and the occupational choice decision, just equals the expected wage payments to the individuals of minimum height D choosing to enter the military. Note that this truncation argument reects the Army's limited information on potential enlistees. If the Army could observe S (·) perfectly then there would be no need for the minimum height standard, at least for this reason. We observe changes over time in the mean height of soldiers as the Army expands and contracts. This uctuation in heights is evidence of selection and can be derived from our choice model. Changes in the demand for military services due to the outbreak of war increase the Army's target level of security, S. Since the military is a cost minimizer, one would expect an increase in 47 S to result in a higher marginal cost of military service, so the multiplier λ result in the right hand side of the rst order conditions to increase, ceteris paribus, and this could would increase. This would be oset by the military increasing its pay oer and relaxing the minimum height requirement, thereby increasing the size of the military and the level of military services provided in these more demanding times. 48 References A’Hearn, Brian. 2003. “Anthropometric Evidence on Living Standards in Northern Italy, 17301860.” Journal of Economic History 63(2): 351-381. A’Hearn, Brian, Franco Peracchi, and Giovanni Vecchi, 2009. “Height and the Normal Distribution: Evidence from Italian Military Data.” Demography 46(1): 1-25. Alchian, Armen A, and William R. Allen. 1964. University Economics. Belmont, Cal.: Wadsworth Publishing Company, 1964. Allen, Robert C. 2007. “Pessimism Preserved: Real Wages in the British Industrial Revolution.” Oxford University Department of Economics Working Paper 314. Angrist, Joshua D. and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton and Oxford: Princeton University Press. Bodenhorn, Howard. 1999. “A Troublesome Caste: Height and Nutrition of Antebellum Virginia’s Rural Free Blacks.” Journal of Economic History 59(4): 972-996. Bodenhorn, Howard. 2011. “Manumission in Nineteenth-Century Virginia.” Cliometrica 5(2): 145-164. Bodenhorn, Howard, Carolyn Moehling and Gregory N. Price. 2012 (forthcoming). “Short Criminals: Stature and Crime in Early America.” Journal of Law & Economics 55(2). Brinkman, Henk Jan, J.W. Drukker, and Brigitte Slot, 1988. “Height and Income: A New Method for the Estimation of Historical National Income Series.” Explorations in Economic History 25: 227-264 Brown, John C., 1990. “The Condition of England and the Standard of Living: Cotton Textiles in the Northwest, 1806-1850.” Journal of Economic History. Calomiris, Charles and Jonathan Pritchett. 2009. “Preserving Slave Families for Profit: Traders’ Incentives and Pricing in the New Orleans Slave Market.” Journal of Economic History 69(4): 986-1011. Carson, Scott Alan. 2008. “Inequality in the American South: Evidence from the Nineteenth Century Missouri State Prison.” Journal of Biosocial Science 40(4): 587-604. Carson, Scott Alan. 2009. "Health, Wealth and Inequality: A Contribution to the Debate about the Relationship between Inequality and Health." Historical Methods 42(2): 43-56. Case, Anne and Christina Paxson. 2008. “Stature and Status: Height, Ability, and Labor Market Outcomes.” Journal of Political Economy 116, 499-532. Coclanis, Peter A. and John Komlos. 1995. “Nutrition and Economic Development in PostReconstruction South Carolina.” Social Science History 19(1): 91-115. Cuff, Timothy. 2005. The Hidden Cost of Economic Development: The Biological Standard of Living in Antebellum Pennsylvania. Burlington, Vt.: Ashgate Publishing Company. 1 Danson, J. T. 1881. “Statistical Observations on the Growth of the Human Body (Males) in Height and Weight from Eighteen to Thirty Years of Age, as Illustrated by the Records of the Borough Gaol of Liverpool.” Journal of the Statistical Society 44(4): 660-674. Engels, Friedrich. 1945/1987. Condition of the Working Class in England. New York: Viking Penguin. Feinstein, Charles, 1988. “Pessimism Perpetuated: Real Wages and the Standard of Living in Britain during and after the Industrial Revolution.” Journal of Economic History. Fogel, Robert W., Stanley L. Engerman, Roderick Floud, Richard H. Steckel, T. James Trussell, Kenneth W. Wachter, Kenneth Sokoloff, Georgia Villaflor, Robert A. Margo, and Gerald Friedman. 1982. “Changes in American and British Stature since the Mid-Eighteenth Century: A Preliminary Report on the Usefulness of Data on Height for the Analysis of Secular Trends in Nutrition, Labor Productivity, and Labor Welfare.” NBER Working Paper No. 890. Floud, Roderick, Kenneth Wachter and Annabel Gregory, 1990. Height, health and history: Nutritional Status in the United Kindgom, 1750-1980. Cambridge: Cambridge University Press. Freudenberger, Herman and Jonathan B. Pritchett. 1991. “The Domestic United States Slave Trade: New Evidence.” Journal of Interdisciplinary History 21(3): 447-477. Galenson, David W. 1981. White Servitude in Colonial America: An Economic Analysis. Cambridge: Cambridge University Press. Gallman, Robert E. 1996. “Dietary Change in Antebellum America.” Journal of Economic History 56(1): 193-201. Glaeser, Edward L. and Bruce Sacerdote. 1999. “Why Is There More Crime in Cities?” Journal of Political Economy 107(S6): S225-S258. Grubb, Farley. 1994. “The End of European Immigrant Servitude in the United States: An Economic Analysis of Market Collapse, 1772-1835. Journal of Economic History 54(4): 794-824. Grubb, Farley. 1999. “Lilliputians and Brobdingnagians, Stature in British Colonial America: Evidence from Servants, Convicts, and Apprentices.” Research in Economic History 19: 139-203. Grubb, Farley. 2001. “The Market Evaluation of Criminality: Evidence from the Auction of British Convict Labor in America, 1767-1775.” American Economic Review 91(1): 295-304. Haines, Michael R., Lee A. Craig, and Thomas Weiss. 2003. “The Short and the Dead: Nutrition, Mortality, and the ‘Antebellum Puzzle’ in the United States.” Journal of Economic History 63(2): 382-413. Heckman, James J., 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47(1): pp. Johnson, Paul and Stephen Nicholas. 1997. “Health and Welfare of Women in the United Kingdom, 1785-1920.” In Health and Welfare during Industrialization, 201-250. Edited by Richard H. Steckel and Roderick Floud. Chicago: University of Chicago Press. 2 John Komlos. 1987. "The Height and Weight of West Point Cadets: Dietary Change in Antebellum America." Journal of Economic History 47(4): 987-927. Komlos, John. 1992. “Toward an Anthropometric History of African-Americans: The Case of the Manumitted Slaves of Maryland.” In Strategic Factors in Nineteenth-Century American Economic History: A Volume to Honor Robert W. Fogel, 297-329. Edited by Claudia Goldin and Hugh Rockoff. Chicago: University of Chicago Press. Komlos, John, 1993. “The secular trend in the biological standard of living in the United Kingdom, 1730-1860.” Economic History Review 46(1): 115-144. Komlos, John. 1994. “The Height of Runaway Slaves in Colonial America, 1720-1770.” In Stature, Living Standards, and Economic Development: Essays in Anthropometric History, 93116. Edited by John Komlos. Chicago and London: University of Chicago Press. Komlos, John, 1998. “Shrinking in a Growing Economy? The Mystery of Physical Stature during the Industrial Revolution.” The Journal of Economic History 58(3): 779-802. Komlos, John, 2004. “How to (and How Not to) Analyze Deficient Height Samples – an Introduction.” Historical Methods 37(4): 160-173. Komlos, John and Bjorn Alecke. 1996. “The Economics of Antebellum Slave Heights Reconsidered.” Journal of Interdisciplinary History 26(3): 437-457. Komlos, John and Peter Coclanis. 1997. “On the Puzzling Cycle in the Biological Standard of Living: The Case of Antebellum Georgia.” Explorations in Economic History 34(4): 433-459. Kotz, Samuel, N. Balakrishnan and Norman L. Johnson. 2000. Continuous Multivariate Distributions, Volume 1, (second edition). New York: John Wiley & Sons, Inc. Lang, Stefan and Marco Sunder. 2003. “Non-parametric regression with BayesX: A Flexible Estimation of Trends in Human Physical Stature in 19th Century America.” Economics and Human Biology 1(1): 77-89. Margo, Robert A. and Richard H. Steckel. 1982. “The Heights of American Slaves.” Social Science History 6(4): 516-38. Margo, Robert A. and Richard H. Steckel. 1983. “Heights of Native-Born Whites during the Antebellum Period.” Journal of Economic History 43(1): 167-174. McLynn, Frank. 1989. Crime and Punishment in Eighteenth-Century England. London and New York: Routledge. Meier, Paul. 1982. “Estimating Historical Heights: Comment.” Journal of the American Statistical Association 77(378): 296-297. Mokyr, Joel and Cormac Ó Gráda. 1994. “The Heights of the British and the Irish c. 1800-1815: Evidence from Recruits to the East India Company’s Army.” In Stature, Living Standards, and Economic Development: Essays in Anthropometric History, 39-59. Edited by John Komlos. Chicago and London: University of Chicago Press. 3 Mokyr, Joel and Cormac Ó Gráda, 1996. “Height and health in the United Kingdom 1815-1860: evidence from the East India Company army.” Explorations in Economic History 33(2): 141-168. Nicholas, Stephen and Deborah Oxley. 1996. “Living Standards of Women in England and Wales, 1785-1815: Evidence from Newgate Prison Records.” Economic History Review 49(3): 591-599. Nicholas, Stephen and Richard H. Steckel, 1997. “Tall but Poor: living standards of men and women in pre-famine Ireland.” Journal of European Economic History 26(1):105-134. Nicholas, Stephen and Richard H. Steckel, 1991. “Heights and living standards of English workers during the Early Years of Industrialization, 1770-1815.” Journal of Economic History 51(4): 937-57. Nicholas, Stephen and Richard H. Steckel, 1997. “Tall but Poor: Living Standards of Men and Women in pre-Famine Ireland.” The Journal of European Economic History 26(1): 105-134. Ó Gráda, Cormac. 1991. “Heights in Tipperary in the 1840s: Evidence from the Prison Records.” Irish Economic and Social History 18: 24-33. Oxley, Deborah. 2003. “’The Seat of Death and Terror’: Urbanization, Stunting, and Smallpox.” Economic History Review, New Series 56(4): 623-656. Persico, Nicola, Andrew Postlewaite, and Dan Silverman. 2005. “The Effect of Adolescent Experience on Labor Market Outcomes: The Case of Height.” Journal of Political Economy 112, 1019-1053. Pritchett, Jonathan B. 1997. “The Interregional Slave Trade and the Selection of Slaves for the New Orleans Market.” Journal of Interdisciplinary History 28(1): 57-85. Pritchett, Jonathan B. and Richard M. Chamberlain. 1993. “Selection in the Market for Slaves: New Orleans, 1830-1860.” Quarterly Journal of Economics 108(2): 461-473. Pritchett, Jonathan B. and Herman Freudenberger. 1992. “A Peculiar Sample: The Selection of Slaves for the New Orleans Market.” Journal of Economic History 52(1): 109-127. Razzell, Peter. 1998. “Did Smallpox Reduce Height?” Economic History Review, New Series 51(2): 351-359. Rees, R., John Komlos, Ngo. V. Long, and Ulrich Woitek. 2003. “Optimal Food Allocation in a Slave Economy.” Journal of Population Economics 16(1): 21-36. Riggs, Paul. 1994. “The Standard of Living in Scotland, 1800-1850.” In Stature, Living Standards, and Economic Development: Essays in Anthropometric History, 60-75. Edited by John Komlos. Chicago and London: University of Chicago Press. Robert, C.P. and G. Casella. 2004. Monte Carlo Statistical Methods (second edition). New York: Springer-Verlag. 4 Sandberg, Lars G., and Richard H. Steckel, 1988. “Overpopulation and Malnutrition Rediscovered: Hard Times in 19th-Century Sweden.” Explorations in Economic History 25:1-19. Sandberg, Lars G. and Richard H. Steckel, 1997. “Was Industrialization Hazardous to your Health? Not in Sweden!” in Steckel and Floud (1997). Schultz, T. Paul. 2002. “Wage Gains Associated with Height as a Form of Health Human Capital.” American Economic Review 92(2): 349-353. Sokoloff, Kenneth and Georgia Villaflor. 1982. "The Early Achievement of Modern Stature in America." Social Science History 6(4): 453-481. Staub, Kaspar, Frank J. Rühli, Barry Bogin, Ulrich Woitek, and Christian Pfister. 2011. “Eduoard Mallet’s Early and Almost Forgotten Study of the Average Height of Genevan Conscripts in 1835.” Economics and Human Biology 9(): 438-442. Steckel, Richard H. 1986. “A Peculiar Population: The Nutrition, Health, and Mortality of American Slaves from Childhood to Maturity.” Journal of Economic History 46(3): 721-741. Steckel, Richard H. 1987. “Growth Depression and Recovery: The Remarkable Case of American Slaves.” Annals of Human Biology 14: 111-132. Steckel, Richard H., 1995. “Stature and the Standard of Living.” The Journal of Economic Literature 33(4): 1903-1940. Steckel, Richard H. and Roderick Floud, eds, 1997. Health and welfare during Industrialization. Chicago: University of Chicago Press. Steckel, Richard H., 1998. “Strategic Ideas in the Rise of the New Anthropometric History and their Implications for Interdisciplinary Research.” The Journal of Economic History 58(3): 803821. Steckel, Richard H. and Joseph M. Prince, 2001. “Tallest in the World: Native Americans of the Great Plains in the Nineteenth Century.” American Economic Review 91(1): 287-294. Steckel, Richard H., 2009. “Heights and Human Welfare: Recent Developments and New Directions.” Explorations in Economic History 46(1): 1-23. Sunder, Marco. 2004. “The Height of Tennessee Convicts: Another Piece of the ‘Antebellum Puzzle.’” Economics and Human Biology 2(1): 75-86. Sunder, Marco. 2011. “Upward and Onward: High-society American Women Eluded the Antebellum Puzzle.” Economics and Human Biology 9(2): 165-171. Sunder, Marco and Ulrich Woitek. 2005. “Boom, Bust, and the Human Body: Further Evidence on the Relationship between Heights and Business Cycles.” Economics and Human Biology 3(3): 450-466. Voth, Hans-Joachim and Timothy Leunig. 1996. “Did Smallpox Reduce Height? Stature and the Standard of Living in London, 1770-1873.” Economic History Review, New Series 49(3): 541560. 5 Wachter, Kenneth W. and James Trussell, 1982a. “Estimating Historical Heights.” Journal of the American Statistical Association 77(378):279-293 Wachter, Kenneth W. and James Trussell, 1982b. “Estimating Historical Heights: Rejoinder.” Journal of the American Statistical Association 77(378):301-303 Weir, David R., 1997. “Economic Welfare and Physical Well-Being in France, 1750-1990.” In Steckel and Floud (1997) Weiss, Thomas J. 1992. “U.S. Labor Force Estimates and Economic Growth, 1800-1860.” In American Economic Growth and Standards of Living before the Civil War, 19-78. Edited by Robert E. Gallman and John J. Wallis. Chicago: University of Chicago Press. Whitwell, Greg, Christine de Souza, and Stephen Nicholas, 1997. “Height, Health, and Economic Growth in Australia, 1860-1940.” In Steckel and Floud (1997). 6 Table 1: Summary of main symbols in the occupational choice model, and their assumed values in simulation model Symbol Meaning h height Baseline value in Simulations NA Properties of the simulated population N μ σ Number of observations in simulated population Mean of heights in population Standard deviation of heights in population 1,000,000 66 inches 2.5 inches Properties of the log-wage equations αC αM βC Intercept for civilian log-wage equation at height 56 inches Intercept for military log-wage equation at height 56 inches Slope (return to one inch in height) for civilian log-wage equation βM εC εM Slope (return to one inch in height) for military log-wage equation Skill (unobserved) affecting civilian log-wage equation [~N(0,1)] Skill (unobserved) affecting military and civilian log wages [~N(0,1)] Return to military skill in civilian sector log wage equation Return to military skill in military sector log wage equation Return to civilian skill in civilian sector log wage equation Taste for the civilian sector Taste for the military sector γC γM δC τC τM 0.5 0.4 Ranges from 0 to 0.07 0.02 NA NA 0 0.1 0.4 0 0 Properties of the error terms σCM Correlation of (εM, εC ) .1 Note: variables marked “NA” are either not in the simulation model or are themselves simulated. Table 2: Baseline specifications for heights regressions OLS Estimate Dummy for year of birth: 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 Baseline=1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 Probit (tall=68+) T-ratio Estimate T-ratio Probit (tall=69+) Estimate T-ratio 1.143 1.493 1.482 1.170 0.968 1.215 0.650 0.354 0.462 0.514 0.399 0.350 0.092 -0.007 0.072 0.170 0.374 -0.144 0.232 0.332 -0.143 0.143 0.127 0.056 0.032 0.176 -0.159 -0.205 0.154 0.039 -0.158 -0.065 -0.117 -0.102 0.101 0.115 0.214 0.211 0.227 7.190 10.450 9.850 8.040 7.830 6.500 3.290 2.080 2.850 3.550 2.580 2.300 0.630 -0.040 0.430 1.050 2.530 -1.190 1.440 2.340 -1.140 1.000 0.920 0.390 0.210 1.040 -1.110 -1.470 1.040 0.260 -1.290 -0.410 -0.830 -0.710 0.520 0.560 0.910 1.070 1.150 1.037 1.427 1.655 1.775 1.280 1.074 0.838 0.498 0.363 0.529 0.209 0.259 0.158 -0.090 -0.015 0.233 0.250 0.031 0.279 0.509 0.173 0.201 0.209 0.353 0.172 0.146 0.122 -0.137 0.254 0.009 -0.053 -0.061 0.040 -0.139 -0.031 0.523 0.259 0.115 0.104 5.070 6.980 6.370 5.880 6.780 4.300 3.360 2.600 2.350 3.700 1.500 1.820 1.100 -0.620 -0.100 1.490 1.880 0.240 1.840 3.440 1.180 1.430 1.470 2.090 1.160 0.920 0.760 -0.980 1.750 0.070 -0.410 -0.420 0.270 -1.000 -0.170 2.020 1.070 0.650 0.600 1.102 1.399 1.296 1.096 0.986 1.249 0.750 0.447 0.367 0.478 0.371 0.290 0.209 0.017 0.036 0.079 0.287 -0.019 0.111 0.347 -0.157 0.203 0.227 0.159 -0.071 0.232 -0.267 -0.170 0.194 0.017 -0.004 -0.050 -0.146 -0.103 0.158 -0.105 -0.041 0.017 0.154 6.500 9.220 8.130 6.830 7.070 5.990 3.590 2.530 2.480 3.590 2.710 2.090 1.470 0.110 0.240 0.510 2.190 -0.140 0.750 2.520 -1.030 1.470 1.620 0.980 -0.470 1.480 -1.570 -1.150 1.360 0.120 -0.030 -0.330 -0.940 -0.710 0.860 -0.420 -0.170 0.090 0.890 -0.153 0.264 0.316 -0.051 0.728 0.477 0.303 0.112 -0.088 0.217 0.657 0.850 -0.042 -0.990 1.320 1.590 -0.310 1.850 1.560 1.350 0.730 -0.460 1.070 2.800 2.620 -0.150 -0.109 0.136 0.252 0.015 0.305 0.234 0.177 0.295 -0.104 0.375 0.756 1.271 -0.140 -0.670 0.810 1.320 0.090 1.160 0.910 0.800 1.600 -0.590 1.750 2.950 2.530 -0.570 -0.073 0.119 0.165 -0.013 0.401 0.267 0.381 0.237 -0.175 0.099 0.277 0.531 -0.227 -0.430 0.710 0.880 -0.070 1.590 1.070 1.770 1.330 -0.930 0.480 1.270 1.600 -0.850 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 Constant R-square (adj) Log-likelihood -0.048 -0.048 0.086 0.683 0.604 0.364 0.561 0.842 0.992 0.262 0.382 -0.226 0.243 -0.361 0.031 0.422 0.289 0.115 0.041 0.276 0.370 0.039 -0.200 0.038 -0.498 -0.430 -0.168 0.127 68.434 -0.250 -0.230 0.430 2.780 3.470 2.380 2.790 2.700 3.640 0.470 0.800 -0.630 0.520 -1.120 0.160 2.170 2.110 0.890 0.350 2.260 2.420 0.230 -1.270 0.140 -3.480 -2.130 -0.760 0.820 960.540 -0.073 -0.207 0.057 0.139 0.623 0.500 0.535 0.591 1.007 0.336 0.838 -0.230 0.336 -0.230 0.089 0.317 0.135 0.195 0.096 0.350 0.386 0.116 -0.073 -0.090 -0.096 -0.308 -0.104 0.164 0.230 -0.370 -1.130 0.320 0.770 3.280 2.840 2.440 2.330 3.550 0.660 1.420 -0.620 0.660 -0.620 0.420 1.780 1.110 1.560 0.860 2.910 2.730 0.700 -0.460 -0.410 -0.490 -1.330 -0.490 1.110 3.300 0.129 -0.271 -0.058 0.298 0.447 0.327 0.420 0.670 0.549 0.267 -0.118 -0.227 -0.118 -0.520 -0.006 0.273 0.194 -0.049 -0.018 0.166 0.289 -0.085 -0.267 -0.142 -0.594 -0.440 -0.308 0.109 -0.447 0.650 -1.360 -0.320 1.670 2.630 2.000 2.100 2.940 2.510 0.550 -0.230 -0.570 -0.230 -1.190 -0.030 1.590 1.590 -0.390 -0.150 1.410 2.130 -0.500 -1.570 -0.610 -2.530 -1.650 -1.330 0.730 -6.250 0.082 -3835.164 -4077.361 Note: Estimation sub-sample limited to men aged 22-27 at the time of recruitment, and at least 67 inches tall. The sample size for all specifications is 6435. The omitted birth year is 1800. Table 3: Adding age-of recruitment dummies to the baseline model Test for the null that all age dummies are zero OLS Probit (tall = 68+) Probit (tall=69+) 2.47 (.0307) [5, 6349] 16.21 (.0063) [5] 13.81 (.0169) [5] 6.89 (0.0) [80, 6349] 327.37 (0.0) [80] 461.53 (0.0) [80] Test for the null that all birth-year dummies are zero Note: Tests performed on specifications that are augmented versions of the models reported in Table 2, as described in the text. Figures in parenthesis are the probability of not rejecting; figures in square brackets are the degrees of freedom. The test for the OLS model is an F-test, for the probits, a chi-square test. There are 6435 observations used in all specifications. Table 4: Adding year-of-recruitment dummies to the baseline model Test for the null that all year-ofrecruitment dummies are zero Test for the null that all birth-year dummies are zero OLS Probit (tall = 68+) Probit (tall=69+) 1.67 (.0002) [77, 6277] 136.15 (0.00) [76] 110.46 (.0075) [77] 1.33 (0.0275) [80, 6277] 89.19 (.2258) [80] 83.94 (.3598) [80] Note: Tests performed on specifications that are augmented versions of the models reported in Table 2, as described in the text. Figures in parenthesis are the probability of not rejecting; figures in square brackets are the degrees of freedom. The test for the OLS model is an F-test, for the probits, a chi-square test. There are 6435 observations used in all specifications. Table 5: Adding relative wage and wheat prices OLS Relative wage variable Wheat-price variable Test for the null that relative wages and wheat prices are both zero Probit (tall=68+) Probit (tall=69+) -1.979 (-2.15) [-.018] -.480 (-.53) [-.152] -1.254 (-1.39) [-.760] .0005 (.54) [.005] .0004 (.47) [.016] .0002 (.29) [.019] 2.34 (.096) [2, 6352] .47 (.789) [2] 1.96 (.375) [2] Notes: The mean of the relative wage variable is 0.627, and its standard deviation is 0.034. For the wheat price variable the mean is 71.378 and the standard deviation 26.348. This first panel of the table refers to a specification that augments the models reported in Table 2 with the relative-wage and wheat-price variables reported in the text. The augmented models were estimated with robust standard errors. In the first two rows, the figure in parentheses is the t-ratio and the figure in square brackets is the elasticity evaluated at the sample means. The final row reports the F-test (for the OLS model) or chi-square test (for the probits) for the null hypothesis that both the relative wage and wheatprice variables are zero. The figure in parenthesis is the probability of not rejecting, and the figure in square brackets is the degrees of freedom for the test. Table 6: RSMLE estimates of heights of British soldiers Variables Wage ratio Wheat price Cohort 1761 Cohort 1766 Cohort 1771 Cohort 1776 Cohort 1781 Cohort 1786 Cohort 1791 Cohort 1796 Cohort 1801-1805 (baseline) Cohort 1806 Cohort 1811 Cohort 1816 Cohort 1821 Cohort 1826 Cohort 1831 Cohort 1841 Constant Mu Estimates SE Sigma Estimates -15.955 -.0375516 3.963091 3.853138 2.110348 2.732323 3.077086 3.037234 1.845478 1.663066 4.85152 0.009 1.510 1.523 1.721 1.644 1.619 1.658 1.748 1.634 1.474217 .0043528 -.7395829 -.4561435 -.1763412 -.2650115 -.4008354 -.3974875 -.2981857 -.2217938 0.803 0.001 0.173 0.180 0.203 0.197 0.195 0.197 0.204 0.206 2.216812 1.150958 2.63333 2.410806 .7469431 1.61339 -.1531317 77.14726 1.607 1.795 1.568 1.695 1.811 1.619 1.756 3.372 -.3446124 -.1554121 -.2974427 -.1411493 -.0372722 -.166328 -.0671573 -.1078353 0.210 0.234 0.195 0.242 0.208 0.189 .21133 .507653 SE Note: The log of the likelihood function is -9689.4979. There are 6206 observations. Without the four time of enlistment effects, the log likelihood value is -9706.1187. Figure 1 Source: Nicholas and Steckel (1991, Figure 3) Figure 2: The median heights of French males 1,675 1,670 1,665 1,660 1,655 1,650 1,645 1,640 1,635 1,630 1780 1800 1820 Source: Weir (1997, Table 5B.1) 1840 1860 1880 1900 1920 1940 Figure 3: Male height at age 20 in France and Britain, 1770-1980 Source: Weir (1997, p. 175). Figure 4: Simulation results I .5 .5 Proportion tall .45 .4 .6 0 .02 .04 .06 Civilian return to height Mean military height .35 .3 64.5 .1 .2 65 .3 Proportion 65.5 66 .4 .55 B 66.5 A .08 0 .02 .04 .06 Civilian return to height 50 percent in military Proportion of civilians tall Fraction in military Proportion of soldiers tall .08 Figure 5: Simulation results II .15 .1 .05 0 0 .05 .1 .15 .2 B .2 A 55 60 65 Height 70 75 55 60 65 Height 70 Population density Population density Military density Military density 75 Figure 6: Selected height distribution and the density function adjustment for selection 6566 Height 70 75 4 2 3 Z(h) from selection model .15 1 .05 0 0 1 .1 4 60 0 2 3 Z(h) from selection model .15 .1 .05 0 55 .2 B .2 A 55 60 6566 Height 70 75 Population height density Population height density Selected height density Selected height density Z(h) from selection model Z(h) from selection model Figure 7: Ability to reject normality in selected samples B .04 .04 Rejections of normal height distribution (sktest) at 5% size .06 .08 .1 Rejections of normal height distribution (sktest) at 10% size .06 .08 .1 .12 .12 A 100 200 500 2000 5000 15000 50000 Expected military sample size -- log scale 100 200 500 2000 5000 15000 50000 Expected military sample size -- log scale 0 1 2 3 4 Figure 8: Selection density adjustment and acceptance (rejection) sampling functions 55 60 65 Height 70 Rejection sampling weight Z(h) from selection model 75
© Copyright 2026 Paperzz