The Ancient Origins of Cross-Country Heterogeneity in Risk Preferences∗

Anke Becker, Benjamin Enke, Armin Falk

May 6, 2015

Abstract

Using novel representative preference data on 80,000 individuals from 76 countries, this paper shows that the migratory movements of our early ancestors thousands of years ago have left a footprint in the contemporary cross-country distribution of risk preferences. Across a wide range of regression specifications, differences in risk aversion between populations are significantly increasing in the length of time elapsed since the respective groups shared common ancestors. This result obtains for various proxies for the structure and timing of historical population breakups, including genetic and linguistic data or predicted measures of migratory distance. In addition, we provide evidence that the within-country heterogeneity in risk aversion significantly decreases in migratory distance from the geographical origin of mankind. Taken together, our results point to the importance of very long-run events for understanding the global distribution of one of the key economic traits.

JEL classification: D01, D03
Keywords: Risk preferences, out of Africa, persistence, genetic distance, genetic diversity

∗ We are grateful to Quamrul Ashraf, Klaus Desmet, Nathan Nunn, Uwe Sunde, Hans-Joachim Voth, and Romain Wacziarg for very helpful discussions. Valuable comments were also received from seminar audiences at Bonn, Munich, the NBER Economics of Culture and Institutions Meeting 2014, SITE Stanford 2014, the North-American ESA conference 2014, and the Annual Congress of the EEA 2013. We thank Quamrul Ashraf, Oded Galor, and Ömer Özak for generous data sharing. Ammar Mahran provided outstanding research assistance. Armin Falk acknowledges financial support from the European Research Council through ERC # 209214. Becker, Enke, Falk: University of Bonn, Department of Economics, Adenauerallee 24-42, 53113 Bonn, Germany.
[email protected], [email protected], [email protected]

1 Introduction

Preferences over risk have been shown to vary substantially within populations and to predict a plethora of individual-level economic decisions. However, do risk preferences also vary across countries? And if so, what explains this variation, i.e., what are the ultimate determinants of heterogeneity in preferences? Using a novel, globally representative data set on risk preferences, this paper seeks to provide an answer to these questions. Our key contribution is to show that the structure of mankind’s ancient migration out of Africa and around the globe has had a persistent impact on the between-country distribution of risk attitudes as of today. In light of the relevance of risk preferences in predicting important economic and social outcomes, these results contribute to understanding the ultimate sources of cross-country economic heterogeneity. Moreover, our findings add to the emerging literature on endogenous preferences (Fehr and Hoff, 2011) in highlighting historical roots as a key driver in shaping preferences.

According to the widely accepted “Out of Africa hypothesis”, starting around 50,000 years ago, early mankind migrated out of East Africa and continued to explore and populate our planet through a series of successive migratory steps. Each of these steps consisted of some sub-population breaking apart from the previous colony and moving on to found new settlements. This pattern implies that some contemporary population pairs have spent a longer time of human history apart from each other than others. As a result, the time elapsed since two groups shared common ancestors differs across today’s population pairs. The key idea underlying our analysis is that these differential time frames of separation have affected the cross-country distribution of risk preferences.
First, populations that have spent a long time apart from each other were exposed to differential historical experiences, which could affect preferences over risk (Callen et al., 2014). Second, due to, e.g., random genetic drift, long periods of separation lead to different population-level genetic endowments, which might in turn shape risk attitudes.1 Both of these channels imply that populations that have been separated for a long time in the course of human history should also exhibit different risk attitudes. Thus, we hypothesize that the (absolute) difference in risk attitudes between countries reflects the length of separation of the respective populations, as evoked by the successive population breakups in ancient times.

1 Cesarini et al. (2009) use a twin study to provide evidence for a genetic effect on risk preferences.

To investigate this hypothesis, we use a novel dataset on risk preferences across countries in combination with proxies for long-run human migration. As part of the Global Preference Survey (GPS), we collected two survey measures of risk preferences (see Falk et al., 2015a). The sample of 80,000 people from 76 countries is constructed to provide representative population samples within each country and geographical representativeness in terms of countries covered. The survey items – one qualitative and one quantitative lottery-type measure of risk preferences – were selected and tested through a rigorous ex ante experimental validation procedure involving real monetary stakes. The elicitation followed a standardized protocol that was implemented through the professional infrastructure of the Gallup World Poll.
These data allow the computation of a nationally representative level of risk aversion, and hence the derivation of the absolute difference in risk attitudes within a country pair.2 We combine these data with three classes of proxies for the temporal patterns of ancient population fissions, i.e., proxies for the length of time since two populations shared common ancestors. (i) First, we employ the FST and Nei genetic distances between populations, as originally measured by the population geneticists Cavalli-Sforza et al. (1994) and introduced into the economics literature by Spolaore and Wacziarg (2009). As population geneticists have long noted, whenever two populations split apart from each other in order to found separate settlements, their genetic distance increases over time due to random genetic drift. Thus, the genetic distance between two populations is a measure of temporal distance since separation. (ii) Second, we use measures of predicted migratory distance between contemporary populations, which were constructed by Ashraf and Galor (2013b) and Özak (2010), respectively. The predicted migratory distance variable of Ashraf and Galor (2013b) is based on a procedure which exploits information on the geographic patterns of early migratory movements and constitutes a proxy for the predicted length of separation of two populations.3 The human-mobility-index measure of Özak (2010), on the other hand, explicitly computes the walking time between two countries’ capitals, taking into account topographic, climatic, and terrain conditions, as well as human biological abilities. (iii) Finally, we make use of the observation that linguistic trees closely follow the structure of separation of human populations and employ a measure of linguistic distance between two populations as an explanatory variable. In sum, we use various independent sources of data which are known to reflect the length of separation of populations.
Using these measures, we test the hypothesis that the heterogeneity in risk preferences across contemporary populations is driven by migratory movements in the distant past.

2 Vieider et al. (2012) and Rieger et al. (2014) also collect data on the prevalence of risk attitudes in multiple countries. Their samples cover a much smaller set of countries and contain university students, rendering conclusions about population-level differences in risk attitudes difficult.
3 See Ramachandran et al. (2005).

Our empirical analysis of the relationship between risk preferences and ancient migration patterns starts by establishing that the absolute difference in average risk attitudes between two countries is significantly increasing in the length of time since today’s populations shared common ancestors, as proxied for by (observed) genetic distance, predicted migratory distance, and linguistic distance. In a second step, using observed genetic distance as the main measure, we investigate to what extent our main result is likely to be driven by contemporary environmental conditions, i.e., omitted variables. We establish that the relationship between differences in risk preferences and length of separation is robust to an extensive set of covariates, including controls for differences in the countries’ demographic composition, their geographic position, prevailing climatic and agricultural conditions, institutions, and economic development. In all of the corresponding regressions, the point estimate is rather stable, suggesting that unobserved heterogeneity is unlikely to drive our results either (Altonji et al., 2005). Our result also holds when we exclude entire continents or observations with large genetic distances from the analysis. Final robustness checks reveal that we obtain similar results when we use alternative genetic distance data or the predicted migratory distance variables in the conditional regressions.
In sum, irrespective of the measure used, our results indicate the existence of a relationship between cross-country differences in risk attitudes and the pattern of very distant migratory movements, which is not explained by differences in the contemporary environments people reside in.

However, as established by the so-called serial-founder effect of population genetics, the structure of successive population breakups also affects within-population diversity. Intuitively, due to the loss of heterogeneity that is implied by small founder populations breaking apart from previous parental colonies, the diversity of a population decreases along migratory paths. To investigate whether such a mechanism also plays out for preferences over risk, we relate the within-country standard deviation in risk preferences to a population’s (predicted) migratory distance from East Africa. The results show that, for both predicted migratory distance variables, the degree of the within-country heterogeneity in individual risk attitudes is indeed significantly decreasing in migratory distance from East Africa, and is hence positively correlated with the diversity of the respective genetic pool. Thus, along two major dimensions (means and variances), temporally very distant population fissions explain a significant fraction of today’s cross-country heterogeneity in one of the key economic traits.

A number of recent contributions argue that the (cultural) diversity caused by long-run migration thousands of years ago can have real aggregate economic effects. Spolaore and Wacziarg (2009, 2011) find a strong relationship between genetic distance and income differences across countries and argue that lower cultural distance between two countries facilitates the diffusion of knowledge.4 Ashraf and Galor (2013b) and Ashraf et al. (2014) document an inverted U-form relationship between per capita income and genetic diversity (migratory distance from East Africa).
They argue that this pattern reflects the trade-off between more innovation and lower trust and cooperation that is associated with higher cultural diversity.5 Our paper nicely dovetails with this set of contributions as it shows that the genetic variables which proxy for migratory flows indeed capture variation in an economically important trait (either between or within countries), as is implicitly or explicitly assumed by these papers. This paper also forms part of an active recent literature on the historical, biological, and cultural origins of beliefs and preferences. For example, Chen (2013) and Galor and Özak (2014) show that cross-country differences in future-orientation are affected by a structural feature of languages and historical agricultural productivity. Tabellini (2008) and an earlier version of Guiso et al. (2009) relate interpersonal trust to linguistic features and the genetic distance between two populations. Nunn and Wantchekon (2011), Voigtländer and Voth (2012) and Alesina et al. (2013) establish the deep roots of trust, beliefs over the appropriate role of women in society, and anti-semitism.6 Desmet et al. (2011) show that, within Europe, genetic distance correlates with opinions and attitudes as expressed in the World Values Survey. Desmet et al. (2014) investigate the relationship between cultural traits and ethnicity. However, presumably given the previous lack of data, this paper is the first contribution to study the origins of cross-country variation in risk preferences. The remainder of this paper proceeds as follows. In the next section, we develop our hypotheses on the relationship between the structure of migratory movements and risk preferences, while Section 3 presents the data. Section 4 discusses our main result on the connection between differences in average risk attitudes and length of separation. In Section 5, we examine the within-country heterogeneity in risk preferences. 
Section 6 concludes the paper by discussing correlations between country-level risk attitudes and economic, institutional, and health outcomes.

4 Spolaore and Wacziarg (2014) find a negative relationship between genetic distance and the occurrence of conflict. Gorodnichenko and Roland (2010) argue that genetic distance to the United States can be used as an instrument for the cultural “individualism” dimension to show a causal effect on GDP. Proto and Oswald (2014) investigate the correlation between genetic distance to Denmark and happiness. Giuliano et al. (2014) show that the negative correlation between genetic distance and bilateral trade vanishes once transportation costs are accounted for.
5 Arbatli et al. (2013) and Ashraf and Galor (2013a) show that genetic diversity is related to ethnolinguistic diversity and the emergence of civil conflicts.
6 See also Guiso et al. (2006) and Fernández and Fogli (2006, 2009).

2 Risk Attitudes and the Great Human Expansion

According to the widely accepted theory of the origins and the dispersal of early humans, the single cradle of mankind is to be found in East or South Africa and can be dated back to roughly 100,000 years ago (see, e.g., Henn et al. (2012) for an overview). Starting from East Africa, a small sample of hunters and gatherers exited the African continent around 50,000-60,000 years ago and thereby started what is now also referred to as the “great human expansion”. This expansion continued throughout Europe, Asia, Oceania, and the Americas, so that mankind eventually came to settle on all continents. A noteworthy feature of this very long-run process is that it occurred through a large number of discrete steps, each of which consisted of a sub-sample of the original population breaking apart and leaving the previous location to move on and found new settlements elsewhere (so-called serial founder effect).
The main argument of this paper is that the pattern of successive breakups affected the distribution of risk preferences we observe around the globe today. First, the series of migratory steps implied a frequent breakup of formerly united populations. After splitting apart, these sub-populations often settled geographically distant from each other, i.e., lived in separation. There are two channels through which the length of separation of two groups might have had an impact on between-group differences in risk attitudes: First, the fact that two populations have spent a long time apart from each other implies that they were subject to many differential historical experiences. Recent work by, e.g., Callen et al. (2014) highlights that risk preferences are malleable by idiosyncratic experiences or, more generally, by the composition of people’s social environment. Thus, the differential historical experiences which have accumulated over thousands of years of separation might have given rise to differential attitudes towards risk as of today. Second, whenever two populations spend time apart from each other, they develop different population-level genetic pools due to, e.g., random genetic drift. Given that risk attitudes are transmitted across generations (Dohmen et al., 2012) and that part of this transmission is genetic in nature (Cesarini et al., 2009), the different genetic endowments induced by long periods of separation could also generate differential risk attitudes. Note that both of these channels imply a bilateral (rather than a directional) statement about between-country differences in risk preferences:

Main Hypothesis. The absolute difference in (average) risk aversion between two countries is increasing in the length of separation of the respective populations in the course of human history.

The serial founder pattern of ancient migration movements could also have had an effect on the within-population heterogeneity in risk preferences.
A well-established fact in the population genetics literature is that whenever a sub-population split apart from its parental colony, those humans breaking new ground took with them only a fraction of the genetic diversity of the previous genetic pool (intuitively because they were usually small, and hence non-representative, samples). In consequence, through the sequence of successive fissions, the total diversity of the gene pool significantly decreases along human migratory routes out of East Africa. Note that the logic behind this serial founder effect need not be restricted to genetic variation, but could similarly apply to heterogeneity in risk attitudes:

Ancillary Hypothesis. The within-country heterogeneity in risk preferences is negatively related to a population’s migratory distance from East Africa.

3 Data

3.1 Representative Cross-Country Data on Risk Preferences

Our data on risk preferences around the globe are part of the Global Preference Survey (GPS), which constitutes a unique dataset on economic preferences from representative population samples around the globe. In many countries around the world, the Gallup World Poll regularly surveys representative population samples about social and economic issues. In 76 countries, we included as part of the regular 2012 questionnaire a set of survey items which were explicitly designed to measure a respondent’s risk preferences (for details see Falk et al., 2015a).

Four noteworthy features characterize these data. First, the preference measures have been elicited in a comparable way using a standardized protocol across countries. Second, contrary to small- or medium-scale experimental work, we use preference measures that have been elicited from representative population samples in each country. This allows for inference on between-country differences in preferences, in contrast to existing cross-country comparisons of convenience (student) samples.
The median sample size was 1,000 participants per country; in total, we collected preference measures for more than 80,000 participants worldwide. Respondents were selected through probability sampling and interviewed face-to-face or via telephone by professional interviewers. Third, the dataset also reflects geographical representativeness. The sample of 76 countries is not restricted to Western industrialized nations, but covers all continents and various development levels. Specifically, our sample includes 15 countries from the Americas, 24 from Europe, 22 from Asia and Pacific, as well as 14 nations in Africa, 11 of which are Sub-Saharan. The set of countries contained in the data covers about 90% of both the world population and global income. Fourth, the preference measures are based on experimentally validated survey items for eliciting preferences. In order to ensure behavioral relevance, the underlying survey items were designed, tested, and selected through an explicit ex-ante experimental validation procedure (Falk et al., 2015b). In this validation step, out of a large set of preference-related survey questions, those items were selected which jointly perform best in explaining observed behavior in standard financially incentivized experimental tasks to elicit preference parameters. In order to make these items cross-culturally applicable, (i) all items were translated back and forth by professionals, (ii) monetary values used in the survey were adjusted along the median household income for each country, and (iii) pretests were conducted in 21 countries of various cultural heritage to ensure comparability. See Appendix A and Falk et al. (2015a) for details.

The set of survey items included two measures of the underlying risk preference – one qualitative subjective self-assessment and one quantitative measure.
The subjective self-assessment directly asks for an individual’s willingness to take risks: “Generally speaking, are you a person who is willing to take risks, or are you not willing to do so? Please indicate your answer on a scale from 0 to 10, where a 0 means “not willing to take risks at all” and a 10 means “very willing to take risks”. You can also use the values in between to indicate where you fall on the scale.”

The quantitative measure is derived from a series of five interdependent hypothetical binary lottery choices, a format commonly referred to as the “staircase procedure”. In each of the five questions, participants had to decide between a 50-50 lottery to win x euros or nothing (which was the same in each question) and varying safe payments y. The questions were interdependent in the sense that the choice of a lottery resulted in an increase of the safe amount being offered in the next question, and conversely. For instance, in Germany, the fixed upside of the lottery x was 300 euros, and in the first question, the safe payment was 160 euros. In case the respondent chose the lottery (the safe payment), the safe payment increased (decreased) to 240 (80) euros in the second question (see Figure 3 in Appendix A for an exposition of the entire sequence of survey items). In essence, by adjusting the safe payment according to previous choices, the questions “zoom in” around the respondent’s certainty equivalent and make efficient use of limited and costly survey time. This procedure yields one of 32 ordered outcomes. The subjective self-assessment and the outcome of the quantitative lottery staircase were aggregated into a single index which describes an individual’s degree of risk aversion.7

Despite the lack of financial incentives, there are good reasons to expect that our measures capture respondents’ risk attitudes.
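The staircase and the aggregation into a risk aversion index can be illustrated with a minimal Python sketch. Only the German starting value (160 euros), the first adjustment (to 240 or 80 euros), the count of 32 outcomes, and the two weights from footnote 7 are taken from the paper. The halving of the step size after each question, the binary encoding of the choice path into an outcome between 1 and 32, and the use of the population standard deviation for the z-scores are assumptions made for illustration; the function names are hypothetical.

```python
from statistics import mean, pstdev

# Weights reported in footnote 7 of the paper.
W_STAIRCASE = 0.4729985
W_QUALITATIVE = 0.5270015


def staircase_offers(choices, first_safe=160, first_step=80):
    """Safe payments offered in each question, given five choices.

    choices: five booleans, True = respondent picked the lottery.
    Assumption: the adjustment halves after every question (80, 40, 20, 10).
    """
    safe, step = first_safe, first_step
    offers = []
    for picked_lottery in choices:
        offers.append(safe)
        # Lottery chosen -> raise the safe payment; otherwise lower it.
        safe += step if picked_lottery else -step
        step //= 2
    return offers


def staircase_outcome(choices):
    """Map a choice path to 1..32 (assumed encoding; higher = more risk taking)."""
    outcome = 0
    for picked_lottery in choices:
        outcome = 2 * outcome + (1 if picked_lottery else 0)
    return outcome + 1


def zscores(values):
    """Standardize values with the population standard deviation (assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def risk_aversion_index(staircase_outcomes, qualitative_items):
    """Risk aversion = -(w1 * z(staircase) + w2 * z(qualitative))."""
    zs, zq = zscores(staircase_outcomes), zscores(qualitative_items)
    return [-(W_STAIRCASE * a + W_QUALITATIVE * b) for a, b in zip(zs, zq)]
```

For example, `staircase_offers([True, False, True, False, True])` starts at 160 and moves to 240 after the first lottery choice, in line with the German example in the text. The negative sign in the index follows the paper's formula, so respondents with higher willingness to take risks receive lower risk aversion values.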
Apart from the experimental validation procedure, the qualitative subjective self-assessment has previously been shown to be predictive of both experimental behavior and risk-taking in the field in a representative sample (Dohmen et al., 2011) as well as of incentivized experimental risk-taking across countries in student samples (Vieider et al., forthcoming).8 Additionally, the quantitative lottery staircase measure is akin to standard experimental lottery choices and lacks any context, so that it is arguably less prone to culture-dependent interpretations.

Figure 1 depicts the distribution of the average degree of risk aversion around the world. Darker colors indicate a higher propensity to take risks. The map indicates a large heterogeneity across countries, with Portugal being the most risk averse and South Africa the least risk averse country in the sample. Country averages vary by almost two standard deviations, relative to the total individual-level standard deviation of one.9

3.2 Proxies for Ancient Migration Patterns

We use various separate but conceptually linked classes of variables to proxy for the length of time since two populations split apart: (i) observed genetic distance, (ii) predicted pairwise migratory distance, and (iii) linguistic distance.

Observed Genetic Distance Between Countries

First, whenever populations break apart, they stop interbreeding, thereby preventing a mixture of the respective genetic pools. However, since every genetic pool is subject to random drift (“noise”), geographical separation implies that over time the genetic distance between sub-populations gradually became (on average) larger. Thus, the genealogical relatedness between two populations reflects the length of time elapsed since these populations shared common ancestors.
In fact, akin to a molecular clock, population geneticists have made use of this observation by constructing mathematical models to compute the timing of separation between groups. This makes clear that, at its very core, genetic distance constitutes not only a measure of genealogical relatedness, but also of temporal distance between two populations.

7 Aggregation was done by first computing the z-scores of each survey item at the individual level and then linearly combining these two z-scores using the weights obtained in the experimental validation procedure, see Falk et al. (2015b). The weights are given by: Risk aversion = −(0.4729985 × Staircase outcome + 0.5270015 × Qualitative item)
8 For instance, this item correlates with stock market participation, self-employment, risky sports practices, and smoking, see Dohmen et al. (2011). In addition, this question was used to establish the intergenerational transmission of risk attitudes (Dohmen et al., 2012).
9 When we compute all possible t-tests in our sample of 2,850 country pairs, 78% of all country pairs are statistically significantly different from each other at the 5% level.

Figure 1: Cross-country heterogeneity in risk aversion. Darker blue (red) indicates higher (lower) risk aversion. All values expressed in standard deviations around the world mean individual (white).

Technically, genetic distance constitutes an index of expected heterozygosity, which can be thought of as the probability that two randomly matched individuals will be genetically different from each other in terms of a pre-defined spectrum of genes. Indices of heterozygosity are derived using data on allelic frequencies, where an allele is a particular variant taken by a gene.10 Intuitively, the relative frequency of alleles at a given locus can be compared across populations and the deviation in frequencies can then be averaged over loci.
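To make the link between allelic frequencies and genetic distance concrete, the following Python sketch computes expected heterozygosity and a two-population FST. It uses the textbook form of Wright's FST with equally weighted populations, FST = (H_T − H_S) / H_T, aggregated over loci as a ratio of sums. This is an illustrative assumption and not necessarily the exact estimator applied to the underlying genetic data; the function names and the toy frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Probability that two alleles drawn at random from a population differ.

    freqs: allele frequencies at one locus, summing to 1.
    """
    return 1.0 - sum(p * p for p in freqs)


def fst(pop_a, pop_b):
    """Textbook two-population F_ST with equal population weights (assumption).

    pop_a, pop_b: lists of loci; each locus is a list of allele frequencies,
    with alleles listed in the same order in both populations.
    """
    h_within = 0.0  # average within-population heterozygosity, summed over loci
    h_total = 0.0   # heterozygosity of the pooled population, summed over loci
    for locus_a, locus_b in zip(pop_a, pop_b):
        pooled = [(pa + pb) / 2 for pa, pb in zip(locus_a, locus_b)]
        h_within += (expected_heterozygosity(locus_a)
                     + expected_heterozygosity(locus_b)) / 2
        h_total += expected_heterozygosity(pooled)
    # Identical monomorphic populations carry no variation at all.
    return (h_total - h_within) / h_total if h_total > 0 else 0.0
```

Two populations with identical allele frequencies have a distance of zero, while two populations fixed on different alleles at every locus have a distance of one. Drift after a population split moves frequencies apart and therefore raises the measure over time.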
This is exactly the approach pursued in the work of the population geneticists Cavalli-Sforza et al. (1994). The main dataset assembled by these researchers consists of data on 128 different alleles for 42 world populations. By aggregating differences in these allelic frequencies, the authors compute the FST genetic distance, which provides a comprehensive measure of genetic relatedness between any pair of 42 world populations. In addition, using the same dataset, Cavalli-Sforza et al. (1994) compute the so-called Nei distance for all population pairs. While this genetic distance measure has slightly different theoretical properties than FST, the two measures are highly correlated (ρ = 0.95). Since genetic distances are available only at the population rather than at the country level, Spolaore and Wacziarg (2009) matched the 42 populations in Cavalli-Sforza et al. (1994) to countries using ethnic composition data from Fearon (2003). Thus, the genetic distance measures we use measure the expected genetic distance between two randomly drawn individuals, one from each country, according to the contemporary composition of the population.

10 Such genetic measures are based on neutral genetic markers only, i.e., on genes which are not subject to selection pressure. Thus, differences in such genes merely reflect random drift and do not arise from evolutionary fit.

Predicted Migratory Distance Between Countries

Rather than physically measure the genetic composition of populations to investigate their kinship, one can also derive predicted migration measures (Ashraf and Galor, 2013b; Özak, 2010). The key idea behind using both of these variables is that populations that have lived far apart from each other (in terms of migratory, not necessarily geographic, distance) must also have spent a large portion of human history apart from each other.
Notably, these data are independent of those on observed genetic distance and thus allow for an important out-of-sample robustness check. First, the derivation of the predicted migratory distance variable of Ashraf and Galor (2013b) follows the methodology proposed in Ramachandran et al. (2005) by making use of today’s knowledge of the migration patterns of our ancestors. Specifically, Ashraf and Galor (2013b) obtain an estimate of bilateral migratory distance by computing the shortest path between two countries’ capitals. Given that until recently humans are not believed to have crossed large bodies of water, these hypothetical population movements are restricted to landmass as much as possible by requiring migrations to occur along five obligatory waypoints, one for each continent. By construction, these migratory distance estimates only pertain to the native populations of a given pair of countries. Thus, to the extent that the contemporary populations in a country pair differ from the native ones, these distance estimates need to be adjusted for post-Columbian migration flows. In order to derive values of predicted migratory distance pertaining to the contemporary populations, we combine the dataset of Ashraf and Galor (2013b) with the “World Migration Matrix” of Putterman and Weil (2010), which describes the share of the year 2000 population in every country that has descended from people in different source countries as of the year 1500. Thus, the contemporary predicted migratory distance between two countries equals the weighted migratory distance between the contemporary populations.11 This ancestry-adjusted predicted migratory distance between two countries can be thought of as the expected migratory distance between the ancestors of two randomly drawn individuals, one from each country.

11 Formally, suppose there are $N$ countries, each of which has one native population. Let $s_{1,i}$ be the share of the population in country 1 which is native to country $i$ and denote by $d_{i,j}$ the migratory distance between the native populations of countries $i$ and $j$. Then, the (weighted) predicted ancestry-adjusted migratory distance between countries 1 and 2 as of today is given by
$$\text{Predicted migratory distance}_{1,2} = \sum_{i=1}^{N} \sum_{j=1}^{N} \left( s_{1,i} \times s_{2,j} \times d_{i,j} \right)$$

Further note that migratory distance and observed genetic distance tend to be highly correlated (Ramachandran et al., 2005). Ashraf and Galor (2013b) exploit this fact by linearly transforming migratory distance into a measure of predicted FST genetic distance. Thus, our measure of predicted migratory distance might as well be interpreted as predicted genetic distance. Indeed, the correlation of our predicted (ancestry-adjusted) migratory distance measure with observed FST is ρ = 0.54. Second, as an additional independent measure of migratory distance, we use the so-called “human mobility index”-based migratory distance developed by Özak (2010). This measure is more sophisticated than the raw migratory distance using the five intermediate waypoints in that it measures the walking time along the optimal route between any two locations, taking into account the effects of temperature, relative humidity, and ruggedness, as well as human biological capabilities. Given that the procedure assumes travel by foot (as is appropriate if interest lies in migratory movements thousands of years ago), the data do not include islands, but assume that the Old World and the New World are connected through the Bering Strait, over which humans are believed to have entered the Americas. The original data contain the travel time between two countries’ capitals, which we again adjust for post-Columbian migration flows using the ancestry-adjustment methodology outlined above. Thus, our final variable measures the expected travel time between the ancestors of two randomly drawn individuals, one from each country.
As we will see below, adjusting both migratory distance variables for post-Columbian migration has a substantial positive effect on their explanatory power for cross-country differences in risk attitudes.

Linguistic Distance Between Countries

Population geneticists and linguists have long noted the close correspondence between genetic distance and linguistic “trees”, intuitively because population breakups do not only produce diverging gene pools, but also different languages. Hence, we employ the degree to which two countries’ languages differ from each other as an additional proxy for the timing of separation. The construction of linguistic distances follows the methodology proposed by Fearon (2003). The Ethnologue project classifies all languages of the world into language families, sub-families, sub-sub-families etc., which give rise to a language tree. In such a tree, the degree of relatedness between different languages can be quantified as the number of common nodes two languages share.12 As in the case of predicted migratory distances, for each country pair, we calculate the weighted linguistic distance according to the population shares speaking a particular language in the respective countries today. Note that it is well known in the population genetics and linguistics literatures that genetic distance appears to be a higher-quality measure of separation patterns. First, while languages generally maintain a certain structure over long periods of time, in some cases they change or evolve very quickly. Thus, in general, the slow-moving nature of aggregate genetic endowment makes genetic distance a more robust measure of ancient breakups of populations; also see the discussion in Cavalli-Sforza (1997). In addition, any quantitative measure of linguistic distance suffers from the fact that there is no natural metric on languages.
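The share-weighted distances described above (for predicted migratory distance, and analogously for linguistic distance) can be illustrated with a short Python sketch. It is only a hedged illustration, not the authors' code: the function names `tree_distance` and `expected_distance`, the three-population example, and all numbers in it are hypothetical. The first function applies the square-root transformation that the paper uses for common-node counts (with 15 as the maximum number of common nodes for languages); the second computes the double sum of population shares times pairwise distances.

```python
import math

import numpy as np


def tree_distance(common_nodes, max_nodes=15):
    """Distance in [0, 1] from a count of shared tree nodes: 1 - sqrt(k / max)."""
    return 1.0 - math.sqrt(common_nodes / max_nodes)


def expected_distance(shares_a, shares_b, dist):
    """Compute sum_i sum_j shares_a[i] * shares_b[j] * dist[i][j]."""
    shares_a = np.asarray(shares_a, dtype=float)
    shares_b = np.asarray(shares_b, dtype=float)
    dist = np.asarray(dist, dtype=float)
    return float(shares_a @ dist @ shares_b)


if __name__ == "__main__":
    # Hypothetical pairwise distances between three native populations.
    d = [[0, 10, 20],
         [10, 0, 5],
         [20, 5, 0]]
    # Hypothetical ancestry shares of two contemporary countries.
    s1 = [0.8, 0.2, 0.0]
    s2 = [0.0, 0.5, 0.5]
    print(expected_distance(s1, s2, d))
    print(tree_distance(0), tree_distance(15))
```

Because both share vectors sum to one, the result can be read as the expected distance between the ancestors (or languages) of two randomly drawn individuals, one from each country.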
While language trees are a useful tool to circumvent the lack of such a metric, they remain coarse in nature, potentially introducing severe measurement error.13 We hence expect that genetic distance will be a more powerful explanatory variable than linguistic distance. Likewise, genetic distance appears to be a more appropriate measure of length of separation than the predicted migratory distance variables, which are constructed based on entirely theoretical procedures.

12 If two languages belong to different language families, the number of common nodes is 0. In contrast, if two languages are identical, the number of common nodes is 15. Following Fearon (2003), who argues that the marginal increase in the degree of linguistic relatedness is decreasing in the number of common nodes, we transformed these data according to
$$\text{Linguistic distance} = 1 - \sqrt{\frac{\#\,\text{Common nodes}}{15}}$$
to produce distance estimates between languages in the interval [0, 1]. We restricted the Ethnologue data to languages which make up at least 5% of the population in a given country.

13 Also see the corresponding discussion in Mecham et al. (2006).

4 Risk Preferences and Temporal Distance

4.1 Baseline Results

This section develops our main result on the relationship between differences in the average degree of risk aversion between countries and the temporal distance between the respective populations. Since temporal distance is an inherently bilateral variable, this analysis will necessitate the use of a dyadic regression framework, which takes each possible pair of countries as unit of observation. Accordingly, we match each of the 73 countries with every other country into a total of 2,628 country pairs and relate our proxies for temporal distance to the (absolute) difference in average risk aversion between the respective populations.14 Our regression equation is hence given by:
$$|\text{risk}_i - \text{risk}_j| = \alpha + \beta \times \text{temporal distance proxy}_{i,j} + \varepsilon_{i,j}$$
where $\text{risk}_i$ and $\text{risk}_j$ represent the average risk aversion in countries $i$ and $j$, respectively, while $\varepsilon_{i,j}$ is a country pair specific disturbance term. Regarding the latter, notice that our empirical approach implies that each country will appear multiple times as part of the (in)dependent variable. Thus, to allow for clustering of the error terms at the country level, we employ the two-way clustering strategy of Cameron et al. (2011), i.e., we cluster at the level of the first and of the second country of a given pair. This procedure allows for arbitrary correlations of the error terms within a group, i.e., within the group of country pairs which share the same first country or which share the same second country, respectively; see the discussion in Appendix II of Spolaore and Wacziarg (2009). Columns (1) through (4) of Table 1 provide the results of unconditional OLS regressions of absolute differences in risk aversion on our proxies for temporal distance. Column (1) shows that the FST genetic distance is a strong predictor of differences in average risk attitudes. In quantitative terms, the standardized beta is large and indicates that a one standard deviation increase in genetic distance is associated with an increase of roughly one-third of a standard deviation in differences in risk attitudes.15 Column (2) of Table 1 introduces the Nei genetic distance. The coefficient on this distance measure is very similar to the one on FST and highly statistically significant. We will continue to use this measure to exemplify the robustness of our results below. In columns (3) and (4), we establish that both predicted migratory distance variables are also significantly related to differences in risk attitudes, even though the coarser measure of predicted migratory distance is only weakly significant.
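The dyadic setup can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the estimation code behind Table 1: the data are synthetic, the helper names (`build_dyads`, `cluster_meat`, `ols_twoway`, `standardized_beta`) are hypothetical, and the variance estimator is the plain two-way formula (clustering on the first country, plus clustering on the second country, minus the intersection term) without any finite-sample correction.

```python
import itertools

import numpy as np


def build_dyads(risk):
    """Return indices (i, j) of all unordered country pairs and |risk_i - risk_j|."""
    risk = np.asarray(risk, dtype=float)
    pairs = np.array(list(itertools.combinations(range(len(risk)), 2)))
    i, j = pairs[:, 0], pairs[:, 1]
    return i, j, np.abs(risk[i] - risk[j])


def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)'."""
    scores = X * resid[:, None]
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        s = scores[groups == g].sum(axis=0)
        meat += np.outer(s, s)
    return meat


def ols_twoway(y, x, g1, g2):
    """OLS of y on a constant and x; two-way clustered standard errors."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    # Each dyad is its own (g1, g2) intersection, so the intersection term
    # reduces to the heteroskedasticity-robust (White) meat.
    scores = X * resid[:, None]
    white = scores.T @ scores
    meat = cluster_meat(X, resid, g1) + cluster_meat(X, resid, g2) - white
    cov = bread @ meat @ bread
    # The two-way estimate need not be positive semi-definite; clip at zero.
    se = np.sqrt(np.maximum(np.diag(cov), 0.0))
    return beta, se


def standardized_beta(beta_x, x, y):
    """Slope expressed in percent of a standard deviation of y per SD of x."""
    return 100.0 * beta_x * np.std(x) / np.std(y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 73                                  # 73 countries give 73 * 72 / 2 dyads
    risk = rng.normal(size=n)               # synthetic country-level averages
    dist = rng.uniform(size=(n, n))         # synthetic temporal distance proxy
    i, j, y = build_dyads(risk)
    x = dist[i, j]
    beta, se = ols_twoway(y, x, i, j)
    print(len(y), beta, se, standardized_beta(beta[1], x, y))
```

Clustering on both country identifiers matters here because every country enters many dyads, so the disturbances of pairs that share a country are unlikely to be independent.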
Column (5) relates the absolute difference in average risk aversion to the linguistic distance between the respective populations. Again, this proxy for ancient migration patterns exhibits an unconditional relationship with differences in risk preferences.

14 Due to a lack of data on genetic distance and / or genetic diversity, we excluded Bosnia and Herzegovina, Serbia, and Suriname from the sample.

15 If the standardized beta equals x, then a one standard deviation increase in the independent variable is associated with an increase of x% of a standard deviation in the dependent variable. Interestingly, in this baseline regression, both the standardized beta and the explained variance are close to the respective values in Spolaore and Wacziarg’s (2009) analysis of the relationship between differences in national income and genetic distance.

Table 1: Risk preferences and temporal distance

Dependent variable: Absolute difference in average risk aversion

                                      (1)        (2)        (3)        (4)        (5)
Fst genetic distance                  0.37∗∗∗
                                      (0.09)
Nei genetic distance                             0.40∗∗∗
                                                 (0.10)
Predicted migratory distance                                0.71∗
                                                            (0.37)
HMI migratory distance (Özak, 2010)                                    0.68∗∗∗
                                                                       (0.23)
Linguistic distance                                                               0.20∗∗∗
                                                                                  (0.07)
Constant                              0.21∗∗∗    0.22∗∗∗    0.28∗∗∗    0.23∗∗∗    0.17∗∗
                                      (0.03)     (0.03)     (0.03)     (0.04)     (0.07)
Observations                          2628       2628       2628       2080       2628
R2                                    0.108      0.117      0.013      0.062      0.021
Standardized beta (%)                 32.9       34.2       11.5       24.9       14.5

OLS estimates, twoway-clustered standard errors in parentheses. The standardized betas refer to the temporal distance proxies. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Note that the genetic variables explain a considerably larger fraction of the variation in risk preferences than the predicted migration variables or linguistic distance. This observation is consistent with the view discussed above that genetic distance is a finer measure of temporal distance compared to the relatively noisy constructions of migratory and linguistic distance.
In addition, note that the more sophisticated HMI migratory index has substantially more explanatory power for the cross-country variation in risk preferences than the coarser predicted migratory distance measure. In fact, the results using the migratory distance measures already suggest that the correlation between our temporal distance proxies and differences in risk preferences is not driven by simple geographic distance: When we use the non-ancestry-adjusted migratory distance measures in the regressions, the explained variance drops substantially. This pattern suggests that the precise migration patterns of our ancestors need to be taken into account to understand the cross-country variation in risk aversion, rather than simple shortest-distance calculations between contemporary populations. Given the superiority of the genetic data in proxying for temporal distance (both conceptually and empirically), we proceed by employing the FST genetic distance as main proxy and use the other variables in the robustness checks.

4.2 Controlling for Demographics, Income, Institutions, Geography, and Climate

The argument made in this paper is that our main result reflects the impact of ancient migration patterns and the resulting distribution of temporal distances across populations, rather than contemporary differences in idiosyncratic country characteristics. To address the issue of omitted variable bias, we subject our result to an extensive set of control variables, which are commonly used in the comparative development and economic geography literatures. Since our dependent variable consists of absolute differences, all of our control variables will also be bilateral variables that reflect cross-country differences along a wide range of dimensions.
In essence, in what follows, our augmented regression specification will be
$$|\text{risk}_i - \text{risk}_j| = \alpha + \beta \times \text{temporal distance proxy}_{i,j} + \gamma \times d_{i,j} + \varepsilon_{i,j}$$
where $d_{i,j}$ is a vector of bilateral measures between countries $i$ and $j$ (such as their geodesic distance or the absolute difference in per capita income). Details on the definitions and sources of all control variables can be found in Appendix G. To ensure that our coefficient of interest does not spuriously pick up the effect of demographic differences, column (2) of Table 2 adds to the baseline specification the absolute differences in average age and the proportion of females. This does not affect the coefficient on genetic distance. Data on genetic, migratory and linguistic distances between populations correlate with measures of the religious composition of the respective populations. Thus, analogously to our linguistic distance measure, we use a “religion tree” to construct an index of religious distance between two countries, following the same methodology as in deriving linguistic distances.16 Column (3) of Table 2 further controls for this distance measure. This does not affect our results. A potentially important determinant of how people perceive risks is the composition of their social environment. For instance, a large religious fractionalization might increase the occurrence of civic conflicts, potentially altering the way people cope with risky choice situations.
Thus, column (4) introduces as additional covariates the absolute differences in religious fractionalization and the fraction of the population who are of European descent. While differences in religious fractionalization are indeed correlated with differences in risk preferences, the coefficient on the fraction of people of European descent is actually negative.17 However, neither in terms of the size of the point estimate nor in terms of its statistical significance does the inclusion of these covariates affect our result.

16 Specifically, we first computed the number of common nodes two religions share in the “religion tree” of Fearon (2006) and transformed this series according to
$$\text{Distance between two religions} = 1 - \sqrt{\frac{\#\,\text{Common nodes}}{5}}$$
We then computed weighted religious distance estimates between two countries by weighting the distances between religions with the respective population shares, so that our final distance measure represents the expected religious distance between two randomly drawn individuals, one from each country. This measure exhibits a modest correlation with genetic distance (ρ = 0.18).

A potential concern with our baseline specification is that it ignores differences in comparative development and institutions across countries, in particular given that genetic distance is known to correlate with differences in national income (Spolaore and Wacziarg, 2009). Column (5) of Table 2 therefore introduces (log) GDP per capita. If anything, this causes the coefficient on genetic distance to become slightly larger.18 Column (6) further establishes that differences in the contemporary institutional environment (proxied by a democracy and a property rights index) as well as common colonial experiences have little, if any, effect on our main result.19 Recall that our “world map” of risk preferences suggests the presence of geographic patterns in the distribution of risk attitudes.
However, human migration patterns (and hence temporal distance proxies) are correlated with geographic and climatic variables. Thus, to ensure that effects stemming from variations in geography or climate are not attributed to genetic distance, we now condition on an exhaustive set of corresponding control variables. Column (1) of Table 3 restates our baseline regression from Table 1, while column (2) repeats the regression including all demographic, economic, and institutional covariates from Table 2. Column (3) introduces four distance metrics as additional controls into this regression. Our first geographical control variable consists of the geodesic distance (measuring the shortest distance between any two points on earth) between the most populated cities of the countries in a given pair. Relatedly, we introduce a dummy equal to one if two countries are contiguous. Finally, we also condition on the “distance” between two countries along the two major geographical axes, i.e., the difference in the distance to the equator and the longitudinal (east-west) distance. Again, the introduction of these variables has virtually no effect on the coefficient of genetic distance. This pattern suggests that the precise migration patterns of our ancestors, rather than simple shortest-distance calculations between contemporary populations, need to be taken into account to understand the cross-country variation in risk aversion. 17 This negative coefficient is probably an artifact of this variable’s correlation with genetic distance (ρ = 0.09). In an unconditional regression, the coefficient on the fraction of European descent is almost zero and far from being significant (p = 0.98). 18 Appendix C.2 shows that including differences in inequality or the average number of years of education in a country pair yields very similar results. 
19 This insight is robust to using alternative measures of institutional quality such as an index gauging the constraints imposed on the executive (see Appendix C.2).

Given that geographic distance as such does not seem to drive our result, we now control for more specific information about differences in the micro-geographic and climatic conditions between the countries in a pair. To this end, we make use of a wealth of information on the agricultural productivity of land, different features of the terrain, and climatic factors. As column (4) shows, the inclusion of a large set of geographic and climatic controls has virtually no effect on the genetic distance point estimate. These results suggest that it is not geographic distance per se which drives differential risk attitudes but rather the precise pattern of migration steps and the distribution of temporal distances associated with these successive breakups.20 As discussed in the concluding remarks below and in Falk et al. (2015a), average risk aversion in a country is correlated with the health risks in the respective environment. Thus, to ensure that the coefficient on genetic distance is not significantly biased upward due to the omission of these variables, column (5) conditions on differences in life expectancy and the fraction of people who are at risk of contracting malaria. While life expectancy exhibits a significant relationship with risk aversion, the point estimate of genetic distance drops by about 20% in size, but remains statistically significant. As in all cross-country regressions, a potential concern is that our main variable of interest might simply pick up regional effects. For example, the largest genetic distances occur between countries from different continents, in particular between African and non-African countries. To account for this possibility, we construct an extensive set of 28 continental dummies each equal to one if the two countries are from two given continents.
For example, we have a dummy equal to one if both countries are from Sub-Saharan Africa, and another one equal to one if one country is from Sub-Saharan Africa and the other one from North America. Conditional on the other covariates, the inclusion of these fixed effects has no further effect on the magnitude of the coefficient on genetic distance. In Appendix C.2, we present a further specification in which we insert a set of 73 country dummies each equal to one if a given country is part of a country pair. Even under this very conservative specification, which also includes all covariates, the coefficient of genetic distance is statistically significant and has a standardized beta of 18.2%.

20 Appendix C.2 provides further robustness checks against geographic features and transportation costs.

Table 2: Risk preferences, demographics, and economic structure

Dependent variable: Absolute difference in average risk aversion

                                      (1)        (2)        (3)        (4)        (5)        (6)
Fst genetic distance                  0.37∗∗∗    0.36∗∗∗    0.35∗∗∗    0.33∗∗∗    0.35∗∗∗    0.34∗∗∗
                                      (0.09)     (0.10)     (0.09)     (0.09)     (0.09)     (0.09)
∆ Average age                                    0.0023     0.0020     0.0053∗    0.0096∗∗∗  0.011∗∗∗
                                                 (0.00)     (0.00)     (0.00)     (0.00)     (0.00)
∆ Proportion female                              -0.0055    -0.037     -0.049     -0.11      -0.11
                                                 (0.74)     (0.74)     (0.72)     (0.71)     (0.66)
Religious distance                                          0.086      0.087      0.083      0.083
                                                            (0.06)     (0.06)     (0.06)     (0.06)
∆ Religious fractionalization                                          0.14∗∗     0.14∗∗     0.14∗∗
                                                                       (0.07)     (0.07)     (0.07)
∆ % Of European descent                                                -0.053∗∗∗  -0.047∗∗∗  -0.049∗∗∗
                                                                       (0.01)     (0.01)     (0.02)
∆ Log [GDP p/c PPP]                                                               -0.030∗∗∗  -0.026∗∗∗
                                                                                  (0.01)     (0.01)
∆ Democracy index                                                                            -0.0017
                                                                                             (0.00)
∆ Property rights                                                                            -0.00055
                                                                                             (0.00)
1 if common legal origin                                                                     -0.012
                                                                                             (0.02)
1 if ever colonial relationship                                                              -0.0064
                                                                                             (0.03)
1 if colonial relationship post 1945                                                         -0.037
                                                                                             (0.04)
1 if common colonizer post 1945                                                              0.015
                                                                                             (0.03)
Constant                              0.21∗∗∗    0.21       0.19       0.18       0.26       0.27
                                      (0.03)     (0.74)     (0.74)     (0.72)     (0.71)     (0.67)
Observations                          2628       2628       2628       2628       2628       2556
R2                                    0.108      0.110      0.115      0.129      0.143      0.146
Standardized beta (%)                 32.9       32.3       31.0       29.4       31.2       30.6

OLS
estimates, twoway-clustered standard errors in parentheses. The standardized betas refer to genetic distance. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. 18 Table 3: Risk preferences, geography, and climate (1) 0.32 (0.08) 0.31 (0.09) 0.24 (0.09) 0.26∗∗∗ (0.10) Log [Geodesic distance] 0.11∗∗∗ (0.04) 0.11∗∗∗ (0.04) 0.095∗∗∗ (0.03) 0.091∗∗∗ (0.03) 1 for contiguity 0.034 (0.04) 0.048 (0.04) 0.038 (0.04) 0.027 (0.03) ∆ Distance to equator -0.0040∗∗∗ (0.00) -0.0046∗∗∗ (0.00) -0.0038∗∗∗ (0.00) -0.0061∗∗∗ (0.00) ∆ Longitude -0.0021∗∗∗ (0.00) -0.0020∗∗∗ (0.00) -0.0018∗∗∗ (0.00) -0.00094∗ (0.00) ∆ Log [Area] 0.0016 (0.00) 0.0011 (0.00) 0.0043 (0.00) ∆ Land suitability for agriculture -0.013 (0.05) -0.0037 (0.05) -0.013 (0.04) 0.000034 (0.00) 0.000081 (0.00) 0.00031 (0.00) ∆ Terrain roughness 0.059 (0.15) 0.044 (0.15) 0.11 (0.13) ∆ Mean elevation -0.0086 (0.03) -0.0067 (0.03) -0.073∗ (0.04) -0.099∗∗∗ (0.02) -0.086∗∗∗ (0.02) -0.010 (0.03) 0.039 (0.04) -0.013 (0.04) -0.048 (0.04) ∆ Ave precipitation 0.00033 (0.00) 0.00043 (0.00) 0.00095∗∗∗ (0.00) ∆ Ave temperature 0.00041 (0.00) 0.00051 (0.00) 0.0033∗∗∗ (0.00) 0.0076∗∗∗ (0.00) 0.00043 (0.00) -0.033 (0.05) -0.074 (0.06) 0.37 (0.09) ∗∗∗ 0.34 (0.09) ∆ % Arable land ∆ SD Elevation ∆ Log [Neolithic transition] ∗∗∗ ∆ Life expectancy ∆ % Population at risk of malaria ∗∗ (6) ∗∗∗ Fst genetic distance ∗∗∗ Dependent variable: Absolute difference in average risk aversion (2) (3) (4) (5) 0.21∗∗∗ (0.03) 0.27 (0.67) -0.50 (0.69) -0.54 (0.71) -0.32 (0.69) -0.34 (0.63) Demographic and economic controls No Yes Yes Yes Yes Yes Continent FE No No No No No Yes 2628 0.108 32.9 2556 0.146 30.6 2556 0.206 36.0 2556 0.230 32.6 2556 0.244 24.2 2556 0.329 23.2 Constant Observations R2 Standardized beta (%) OLS estimates, twoway-clustered standard errors in parentheses. See Table 2 for a complete list of the demographic and economic control variables. The standardized betas refer to genetic distance. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. 
4.3 Selection on (Un-)Observables

In sum, conditioning on a large set of economic, institutional, geographic, climatic, and demographic variables, the relationship between genetic distance and differences in risk preferences is highly significant. Furthermore, the corresponding point estimate is rather robust: While the coefficient is 0.26 in the full regression model, it is 0.37 in the unconditional regression. Following the work of Altonji et al. (2005) and Bellows and Miguel (2009), this observation can also be used to assess the extent to which unobservable omitted variables likely bias our result. These authors show that by comparing the coefficients in the unconditional and the full regression model, the ratio
$$\frac{\hat{\beta}^{\text{full}}}{\hat{\beta}^{\text{unconditional}} - \hat{\beta}^{\text{full}}}$$
yields an estimate of how much larger the bias arising from unobservable factors would need to be relative to the bias resulting from observables, in order to generate the observed coefficient if the true coefficient was actually zero. If this ratio equals α, then omitted variables can only explain the observed coefficient if the bias resulting from unobservables equals at least α times the bias resulting from observed variables. In our case, comparing the coefficients in columns (1) and (6) of Table 3, the ratio equals 0.26 / (0.37 − 0.26) ≈ 2.4.21 Thus, if the genetic distance coefficient were to be explained away by selection on unobservables, then this bias would have to be at least 2.4 times as strong as selection on the very large and comprehensive set of covariates that we included in our regressions.

4.4 Robustness

While the majority of the genetic distances in our sample are relatively small, we also observe a number of large genetic distances. To ensure that our results are not driven by these large distances, we pursue two distinct strategies. First, using the same specification as in our full regression model in column (6) of Table 3, we exclude the largest genetic distances step-by-step.
Second, we reduce the magnitude of differences in genetic distances by estimating a log-log relationship between differences in risk preferences and genetic distance. Columns (1) through (4) of Table 4 present the results of the regressions utilizing the restricted samples, which demonstrate that excluding large distances does not affect the statistical significance of our result. Furthermore, column (13) shows that the spirit of our results is unaffected in a log-log specification. Our control variables already accounted for the presence of differences in continent fixed effects. To provide further evidence that no single region drives our result, columns (5) through (12) of Table 4 present the results from regressions which exclude the four major regions of the world one-by-one. In most specifications, the results remain statistically significant despite the large drop in the number of observations, and conditional on the full set of covariates. In Appendix C.1 we provide further robustness checks by employing different dependent variables than the absolute difference in average risk attitudes. First, we show that our main result continues to hold when we use the absolute difference in the 25th, 50th or 75th percentile of risk aversion as dependent variable. Second, recall that our index of risk aversion consists of a linear combination of two types of survey items, a qualitative self-assessment and a series of quantitative binary lottery choices. By employing both items separately, we show that the effect of temporal distance on risk preferences is not specific to a particular type of survey question.

21 The ratio is the same if we consider the common sample from the specifications from columns (1) and (6) in Table 3.

4.5 Alternative Measures

Up to this point, all of our multivariate regressions were conducted using the raw FST genetic distance.
We now present a set of regressions which utilize additional information on the measurement precision of the genetic distance data. Since the data on allele frequencies are collected from different sample sizes across countries, the precision of the measurement varies across country pairs. Thus, using bootstrap analysis, Cavalli-Sforza et al. (1994) compute standard errors of the FST genetic distance for each pair. Following the methodology proposed by Spolaore and Wacziarg (2009), we utilize this information by linearly downweighting each observation by its standard error (see Appendix D). The results from the corresponding weighted least squares regressions are presented in columns (1)-(3) of Table 5. The resulting standardized beta coefficients are highly significant and very similar to those from the unweighted regressions, providing reassuring evidence that our results are not driven by measurement error. Section 4.1 showed that the raw relationship between temporal distance and differences in risk preferences also holds when employing the Nei genetic distance or migratory distance measures as proxies. Columns (4)-(12) establish that these raw correlations are robust to the full set of covariates. Notably, the results obtained with the predicted measures of migratory distance further emphasize that it is indeed ancient migration patterns rather than genetic distance as such which is at the core of our analysis.22

22 In contrast, when we employ linguistic distance in the conditional regressions, we obtain small and insignificant coefficients. This is consistent with the idea that the coarse and noisy construction of linguistic distances introduces severe measurement error and is hence an inferior proxy for timing of separation as compared to genetic or predicted migratory distance data.
21 22 Yes Yes 2556 0.329 23.2 Continent FE Observations R2 Standardized beta (%) (3) 2258 0.297 29.4 Yes Yes Yes -0.22 (0.64) 0.41∗∗∗ (0.11) (4) 1925 0.209 24.4 Yes Yes Yes -0.37 (0.70) 0.40∗∗∗ (0.11) (5) 1653 0.171 32.4 No No Yes 0.21 (0.66) 0.36∗∗∗ (0.10) (6) 1653 0.287 22.4 No Yes Yes -0.90 (0.69) 0.25∗∗ (0.10) (7) 1081 0.110 20.1 No No Yes 0.43 (0.76) 0.22∗∗ (0.09) (8) 1081 0.207 12.2 No Yes Yes 0.39 (0.73) 0.13 (0.10) (9) 1711 0.183 31.5 No No Yes 0.067 (0.77) 0.36∗∗∗ (0.09) (10) 1711 0.285 29.8 No Yes Yes -0.85 (0.76) 0.34∗∗∗ (0.12) 1326 0.031 12.0 No No Yes 0.26 (0.49) 0.15 (0.10) (11) 1326 0.079 20.5 No Yes Yes 0.26 (0.59) 0.25∗∗ (0.11) (12) Africa & ME 2556 0.235 16.4 Yes Yes Yes -2.91 (1.90) 0.17∗∗ (0.07) (13) Log [Abs. difference in risk attitudes] OLS estimates, twoway-clustered standard errors in parentheses. EU = Europe, ME = Middle East, SE Asia = South Asia and South-East Asia. The variables in logs are defined as ln[0.01 + x], where x denotes the original variable. See Tables 2 and 3 for a complete list of the demographic, economic, geographic, climatic, and health control variables. The standardized betas refer to genetic distance. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. 
2483 0.329 25.8 Yes Yes Geographic, climate and health controls Yes -0.28 (0.64) Yes -0.34 (0.63) (2) 0.31∗∗∗ (0.10) (1) 0.26∗∗∗ (0.10) Demographic and economic controls Constant Log [Fst genetic distance] Fst genetic distance Restricted sample: Fst < Baseline < 0.85 < 0.7 < 0.5 Dependent variable: Absolute difference in average risk aversion Exclude continents Americas EU & Central Asia SE Asia & Pacific Table 4: Robustness: Exclude large genetic distances or continents 23 No Continent FE 2556 0.186 20.5 No Yes Yes -0.31 (0.71) 0.25 (0.09) ∗∗∗ (2) 2556 0.283 26.9 Yes Yes Yes -0.31 (0.66) 0.33 (0.09) ∗∗∗ (3) 2556 0.244 23.2 No Yes Yes -0.26 (0.69) 0.27∗∗∗ (0.09) (4) 2556 0.327 22.0 Yes Yes Yes -0.31 (0.64) 0.26∗∗∗ (0.09) (5) 2628 0.059 24.3 No No No 0.23∗∗∗ (0.03) 0.34∗∗∗ (0.10) (6) 2556 0.162 18.5 No Yes Yes -0.21 (0.71) 0.26∗∗∗ (0.09) (7) 2556 0.256 23.6 Yes Yes Yes -0.26 (0.68) 0.34∗∗∗ (0.10) (8) 2556 0.236 14.1 No Yes Yes -0.26 (0.68) 0.87∗∗∗ (0.33) (9) (10) 2556 0.328 21.3 Yes Yes Yes -0.23 (0.62) 1.31∗∗∗ (0.41) OLS 2016 0.268 25.0 No Yes Yes -0.031 (0.84) 0.69∗∗∗ (0.20) (11) (12) 2016 0.355 24.5 Yes Yes Yes -0.17 (0.77) 0.67∗∗ (0.28) OLS OLS estimates, twoway-clustered standard errors in parentheses. The weighted least-squares regressions linearly downweigh observations by the standard error of the respective genetic distance estimate, as obtained from bootstrap analysis by Cavalli-Sforza et al. (1994), see Appendix D for details. See Tables 2 and 3 for a complete list of the demographic, economic, geographic, climatic, and health control variables. The standardized betas refer to the temporal distance proxies. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. 
2628 0.074 27.3 No Geographic, climate and health controls Observations R2 Standardized beta (%) No 0.22∗∗∗ (0.03) 0.34 (0.09) Demographic and economic controls Constant HMI migratory distance (Özak, 2010) Predicted migratory distance Nei genetic distance Fst genetic distance ∗∗∗ (1) WLS Dependent variable: Absolute difference in average risk aversion OLS WLS Table 5: Alternative measures and estimation techniques

5 Preference Heterogeneity and Migratory Distance from East Africa

The final step of our analysis considers the relationship between the within-population variability in risk preferences and migratory distance from Africa. To this end, we estimate the following equation:
$$\sigma_i = \alpha + \beta \times \text{Predicted migratory distance from Ethiopia}_i + \gamma \times x_i + \varepsilon_i$$
where $\sigma_i$ denotes the standard deviation in risk attitudes within a country, $x_i$ a vector of covariates, and $\varepsilon_i$ a disturbance term. Table 6 provides an overview of the results of corresponding OLS regressions. In columns (1)-(4), we use predicted migratory distance from Ethiopia (constructed as described in Section 3.2) as explanatory variable, while columns (5) through (8) employ the migratory distance measure which is based on the human mobility index. Column (1) indicates that the average total length of the migration path of a given population from East Africa exhibits a raw correlation with the standard deviation in risk attitudes. Columns (2) through (3) add several control variables to the baseline specification, which capture cross-country differences in sociodemographics, development, geography, and climate. The details of the corresponding regressions, i.e., the coefficients of the covariates, can be found in Appendix E. The results show that, if anything, conditioning on this vector of covariates strengthens the correlation between preference heterogeneity and migratory distance.
However, as column (4) indicates, the inclusion of continental fixed effects leads the coefficient of migratory distance to drop in size and to become insignificant. By the serial founder effect and the paths of human migration, continent dummies are correlated with migratory distance from East Africa and take out a substantial fraction of the variation not only in migratory distance but also in preference heterogeneity. Columns (5) through (8) repeat the same specifications using the more sophisticated mobility index-adjusted measure of migratory distance. Consistent with measurement error in the baseline migratory distance measure biasing the coefficient towards zero, the results are, if anything, even stronger, and remain weakly significant even under continent fixed effects. Figure 5 in Appendix E provides a graphical illustration of the corresponding raw relationship. Taken together, our second set of results shows that ancient migration patterns are not only reflected in cross-country differences in average risk aversion, but also in the dispersion of the preference pool.

Table 6: Preference dispersion and migratory distance from East Africa

Dependent variable: SD in risk aversion

                                      (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Migratory distance from Ethiopia      -0.52∗     -0.83∗∗    -0.64∗     -0.15
                                      (0.28)     (0.32)     (0.36)     (0.47)
HMI migratory distance from Ethiopia                                              -0.52∗∗    -0.77∗∗∗   -0.73∗∗    -0.80∗
(Özak, 2010)                                                                      (0.21)     (0.26)     (0.29)     (0.45)
Constant                              0.98∗∗∗    0.11       1.16∗∗     0.80       1.00∗∗∗    -0.47      1.36∗∗     1.51∗∗
                                      (0.02)     (0.42)     (0.52)     (0.62)     (0.02)     (0.56)     (0.62)     (0.70)
Socioeconomic controls                No         Yes        Yes        Yes        No         Yes        Yes        Yes
Geographic controls                   No         No         Yes        Yes        No         No         Yes        Yes
Continent FE                          No         No         No         Yes        No         No         No         Yes
Observations                          73         73         73         73         64         64         64         64
R2                                    0.048      0.380      0.509      0.568      0.074      0.412      0.578      0.595
Standardized beta (%)                 -21.8      -34.6      -26.9      -6.4       -27.2      -40.0      -37.9      -41.5

OLS estimates, robust standard errors in parentheses.
In columns (2) and (6), the additional controls include average age, proportion of females, log (GDP p/c) and its square, SD in age, a gender heterogeneity index, SD in (log) household income, (log) population size, linguistic diversity and life expectancy. Columns (3) and (7) additionally include distance to the equator, temperature, precipitation, suitability for agriculture, the fraction of the population at risk of contracting malaria, and the log of the timing of the Neolithic revolution, while columns (4) and (8) add continent fixed effects. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

6 Conclusion

We analyze the distribution of risk preferences of representative population samples in countries spanning all continents. The main contribution of this paper is to establish that, along two dimensions, a significant fraction of the substantial between-country heterogeneity has its deep historical origins in ancestral migration patterns. As for average risk attitudes, we show that three classes of proxies for the length of time since two populations shared common ancestors are predictive of how similar two countries’ risk preferences are. Regarding within-country heterogeneity, we show that the migratory distance from East Africa predicts the dispersion of the preference distribution. A growing body of empirical work highlights the importance of heterogeneity in risk preferences for understanding a myriad of economic and health behaviors at the individual level. However, it remains an open question whether country-level risk preferences are related to the cross-country variation in economic, institutional, and health-related outcomes. In our data, average risk aversion is indeed predictive of variables that one would intuitively expect to be associated with risk attitudes. Specifically, more risk-tolerant societies tend to live in more risky economic, institutional, and health environments (see Falk et al., 2015a, and Appendix F).
Regarding economic risks, high risk taking is associated with high income inequality (Gini index, income share of the top 10%) and a high risk of living in poverty, i.e., on less than $2 per day, conditional on per capita income. Similarly, risk-averse populations tend to live in institutional structures which provide strong labor protection against, e.g., unemployment. As for health risks, a higher risk tolerance is significantly related to a higher prevalence of HIV, higher traffic death rates, and lower life expectancy. The latter relationship is depicted in the left-hand panel of Figure 2. Thus, in sum, risk-averse societies appear to live in environments which are intuitively less risky, providing suggestive evidence that the cross-country heterogeneity in risk preferences might have important consequences not just for individual decision making. However, it is conceivable that not just the level of risk aversion in a given country has economic effects, but also the corresponding variability. Indeed, Ashraf and Galor (2013b) argue for the importance of cultural diversity in explaining comparative development around the globe. Recall that these authors show that per capita income is hump-shaped in the diversity of the respective genetic pool (which is a transformation of migratory distance from East Africa). Given our finding that the within-country dispersion in the pool of risk preferences decreases in migratory distance from East Africa, and hence increases in the diversity of the genetic pool, it is natural to ask whether there is a relationship between the variability in risk attitudes and economic development. The right-hand panel of Figure 2 shows that per capita income monotonically decreases in the standard deviation of risk attitudes, with a raw correlation of ρ = −0.32, p < 0.01.
As we show in Appendix F, this correlation is very robust and holds up when we condition on continent fixed effects, a large number of geographic and climatic covariates, genetic diversity and its square, ethnic or linguistic fractionalization, or even institutional quality and average years of schooling. In fact, as we show in the Appendix, this correlation is much stronger when we consider the standard deviation in the quantitative choice procedure only. In sum, both the first and second moment of the within-country distribution of risk preferences are associated with heterogeneous economic, institutional, and health outcomes. Our results highlight that if we aim to understand the ultimate roots of the underlying preference heterogeneity, we might have to consider events very far back in time.

[Figure 2: two scatter plots with country codes as markers. Left panel, "Risk taking and life expectancy": life expectancy at birth against willingness to take risks. Right panel, "SD in risk taking and per capita income": Log [GDP p/c PPP] against SD in willingness to take risks.]

Figure 2: Risk attitudes, health and economic outcomes. The left panel depicts the raw correlation between life expectancy at birth and risk taking, while the right panel depicts the relationship between (log) per capita income and the standard deviation in risk aversion.
References

Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat, and Romain Wacziarg, “Fractionalization,” Journal of Economic Growth, 2003, 8 (2), 155–194.
Alesina, Alberto, Paola Giuliano, and Nathan Nunn, “On the Origins of Gender Roles: Women and the Plough,” Quarterly Journal of Economics, 2013, 128 (2), 469–530.
Altonji, Joseph G., Todd E. Elder, and Christopher R. Taber, “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools,” Journal of Political Economy, 2005, 113 (1), 151–184.
Arbatli, Cemal Eren, Quamrul Ashraf, and Oded Galor, “The Nature of Civil Conflict,” Working Paper, 2013.
Ashraf, Quamrul and Oded Galor, “Genetic Diversity and the Origins of Cultural Fragmentation,” American Economic Review: Papers & Proceedings, 2013, 103 (3), 528–33.
Ashraf, Quamrul and Oded Galor, “The Out of Africa Hypothesis, Human Genetic Diversity, and Comparative Economic Development,” American Economic Review, 2013, 103 (1), 1–46.
Ashraf, Quamrul, Oded Galor, and Marc P. Klemp, “The Out of Africa Hypothesis of Comparative Development Reflected by Nighttime Light Intensity,” Working Paper, 2014.
Barro, Robert J. and Jong-Wha Lee, “A New Data Set of Educational Attainment in the World, 1950–2010,” Journal of Development Economics, 2012.
Bellows, John and Edward Miguel, “War and Local Collective Action in Sierra Leone,” Journal of Public Economics, 2009, 93 (11), 1144–1157.
Botero, Juan, Simeon Djankov, Rafael LaPorta, Florencio Lopez de Silanes, and Andrei Shleifer, “The Regulation of Labor,” Quarterly Journal of Economics, 2004, 119 (4), 1339–1382.
Callen, Michael, Mohammad Isaqzadeh, James D. Long, and Charles Sprenger, “Violence and Risk Preference: Experimental Evidence from Afghanistan,” American Economic Review, 2014, 104 (1), 123–148.
Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller, “Robust Inference with Multiway Clustering,” Journal of Business & Economic Statistics, 2011, 29 (2).
Cavalli-Sforza, Luigi L., “Genes, Peoples, and Languages,” Proceedings of the National Academy of Sciences, 1997, 94 (15), 7719–7724.
Cavalli-Sforza, Luigi, Paolo Menozzi, and Alberto Piazza, The History and Geography of Human Genes, Princeton University Press, 1994.
Cesarini, David, Christopher T. Dawes, Magnus Johannesson, Paul Lichtenstein, and Björn Wallace, “Genetic Variation in Preferences for Giving and Risk Taking,” Quarterly Journal of Economics, 2009, 124 (2), 809–842.
Chen, M. Keith, “The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets,” American Economic Review, 2013, 103 (2), 690–731.
Desmet, Klaus, Ignacio Ortuno-Ortin, and Romain Wacziarg, “Culture, Ethnicity and Diversity,” Working Paper, 2014.
Desmet, Klaus, Michel Le Breton, Ignacio Ortuno-Ortin, and Shlomo Weber, “The Stability and Breakup of Nations: A Quantitative Analysis,” Journal of Economic Growth, 2011, 16, 183–213.
Dohmen, Thomas, Armin Falk, David Huffman, and Uwe Sunde, “The Intergenerational Transmission of Risk and Trust Attitudes,” Review of Economic Studies, 2012, 79 (2), 645–677.
Dohmen, Thomas, Armin Falk, David Huffman, Uwe Sunde, Jürgen Schupp, and Gert G. Wagner, “Individual Risk Attitudes: Measurement, Determinants, and Behavioral Consequences,” Journal of the European Economic Association, 2011, 9 (3), 522–550.
Falk, Armin, Anke Becker, Thomas Dohmen, Benjamin Enke, Uwe Sunde, and David Huffman, “The Nature of Human Preferences: Global Evidence,” Working Paper, 2015.
Falk, Armin, Anke Becker, Thomas Dohmen, David Huffman, and Uwe Sunde, “An Experimentally Validated Survey Module of Economic Preferences,” Working Paper, 2015.
Fearon, James, “Is Genetic Distance a Plausible Measure of Cultural Distance?,” unpublished, Stanford University, 2006.
Fearon, James D., “Ethnic and Cultural Diversity by Country,” Journal of Economic Growth, 2003, 8 (2), 195–222.
Fehr, Ernst and Karla Hoff, “Introduction: Tastes, Castes and Culture: the Influence of Society on Preferences,” Economic Journal, 2011, 121 (556), F396–F412.
Fernández, Raquel and Alessandra Fogli, “Fertility: The Role of Culture and Family Experience,” Journal of the European Economic Association, 2006, 4 (2-3), 552–561.
Fernández, Raquel and Alessandra Fogli, “Culture: An Empirical Investigation of Beliefs, Work, and Fertility,” American Economic Journal: Macroeconomics, 2009, 1 (1), 146–177.
Gallup, John L., Jeffrey D. Sachs, and Andrew D. Mellinger, “Geography and Economic Development,” International Regional Science Review, 1999, 22 (2), 179–232.
Gallup, John L., et al., “The Economic Burden of Malaria.,” American Journal of Tropical Medicine and Hygiene, 2000, 64 (1-2 Suppl), 85–96.
Galor, Oded and Ömer Özak, “The Agricultural Origins of Time Preference,” Working Paper, 2014.
Giuliano, Paola, Antonio Spilimbergo, and Giovanni Tonon, “Genetic Distance, Transportation Costs, and Trade,” Journal of Economic Geography, 2014, 14 (1), 179–198.
Gorodnichenko, Yuriy and Gerard Roland, “Culture, Institutions and the Wealth of Nations,” Working Paper, 2010.
Guiso, Luigi, Paola Sapienza, and Luigi Zingales, “Does Culture Affect Economic Outcomes?,” Journal of Economic Perspectives, 2006, 20 (2), 23–48.
Guiso, Luigi, Paola Sapienza, and Luigi Zingales, “Cultural Biases in Economic Exchange?,” Quarterly Journal of Economics, 2009, 124 (3), 1095–1131.
Henn, Brenna M., Luigi L. Cavalli-Sforza, and Marcus W. Feldman, “The Great Human Expansion,” Proceedings of the National Academy of Sciences, 2012, 109 (44), 17758–17764.
Mecham, R. Quinn, James Fearon, and David Laitin, “Religious Classification and Data on Shares of Major World Religions,” unpublished, Stanford University, 2006, 1 (5), 18.
Michalopoulos, Stelios, “The Origins of Ethnolinguistic Diversity,” American Economic Review, 2012, 102 (4), 1508–1539.
Nordhaus, William D., “Geography and Macroeconomics: New data and new Findings,” Proceedings of the National Academy of Sciences of the United States of America, 2006, 103 (10), 3510–3517.
Nunn, Nathan and Leonard Wantchekon, “The Slave Trade and the Origins of Mistrust in Africa,” American Economic Review, 2011, 101 (7), 3221–52.
Özak, Ömer, “The Voyage of Homo-Oeconomicus: Some Economic Measures of Distance,” Working Paper, 2010.
Proto, Eugenio and Andrew J. Oswald, “National Happiness and Genetic Distance: A Cautious Exploration,” Working Paper, 2014.
Putterman, Louis and David N. Weil, “Post-1500 Population Flows and the Long-Run Determinants of Economic Growth and Inequality,” Quarterly Journal of Economics, 2010, 125 (4), 1627–1682.
Ramachandran, Sohini, Omkar Deshpande, Charles C. Roseman, Noah A. Rosenberg, Marcus W. Feldman, and L. Luca Cavalli-Sforza, “Support from the Relationship of Genetic and Geographic Distance in Human Populations for a Serial Founder Effect Originating in Africa,” Proceedings of the National Academy of Sciences of the United States of America, 2005, 102 (44), 15942–15947.
Rieger, Marc Oliver, Mei Wang, and Thorsten Hens, “Risk Preferences Around the World,” Management Science, 2014.
Spolaore, Enrico and Romain Wacziarg, “The Diffusion of Development,” Quarterly Journal of Economics, 2009, 124 (2), 469–529.
Spolaore, Enrico and Romain Wacziarg, “Long-Term Barriers to the International Diffusion of Innovations,” in “NBER International Seminar on Macroeconomics 2011,” University of Chicago Press, 2011, pp. 11–46.
Spolaore, Enrico and Romain Wacziarg, “War and Relatedness,” Working Paper, 2014.
Tabellini, Guido, “Institutions and Culture,” Journal of the European Economic Association, 2008, 6 (2-3), 255–294.
Vieider, Ferdinand M., Mathieu Lefebvre, Ranoua Bouchouicha, Thorsten Chmura, Rustamdjan Hakimov, Michal Krawczyk, and Peter Martinsson, “Common Components of Risk and Uncertainty Attitudes Across Contexts and Domains: Evidence from 30 Countries,” Journal of the European Economic Association, forthcoming.
Vieider, Ferdinand M., Thorsten Chmura, and Peter Martinsson, “Risk Attitudes, Development, and Growth: Macroeconomic Evidence from Experiments in 30 Countries,” Discussion Paper, Social Science Research Center Berlin, 2012.
Voigtländer, Nico and Hans-Joachim Voth, “Persecution Perpetuated: The Medieval Origins of Anti-Semitic Violence in Nazi Germany,” Quarterly Journal of Economics, 2012, 127 (3), 1339–1392.

ONLINE APPENDIX

A Details on Data Collection of Risk Preferences

The description of the dataset builds on Falk et al. (2015a).

A.1 Overview

The cross-country dataset covering risk aversion, patience, positive and negative reciprocity, altruism, and trust, was collected through the professional infrastructure of the Gallup World Poll 2012. The data collection process essentially consisted of three steps. First, we conducted a laboratory experimental validation procedure to select the survey items. Second, Gallup implemented a pre-test in a variety of countries to ensure the implementability of our items in a culturally diverse sample. Third, the data were collected through the regular professional data collection efforts in the framework of the World Poll 2012.

A.2 Experimental Validation

To ensure the behavioral relevance of our preference measures, all underlying survey items were selected through an experimental validation procedure. To this end, a sample of German undergraduates completed standard state-of-the-art financially incentivized laboratory experiments designed to measure risk aversion, patience, positive and negative reciprocity, altruism, and trust. The same sample of subjects then completed a large battery of potential survey items.
In a final step, for each preference, those survey items were selected which jointly perform best in predicting financially incentivized behavior. See Falk et al. (2015a) for details. The resulting set of items was then adjusted in the process of a pre-test.

A.3 Pre-Test

Prior to including the preference module in the Gallup World Poll 2012, it was tested in the field as part of the World Poll 2012 pre-test, which was conducted at the end of 2011 in 22 countries. The main goal of the pre-test was to receive feedback and comments on each item from various cultural backgrounds in order to assess potential difficulties in understanding and differences in the respondents’ interpretation of items. Based on respondents’ feedback and suggestions, several items were modified in terms of wording before running the survey as part of the World Poll 2012.

The pre-test was run in 10 countries in Central Asia (Russia, Belarus, Kyrgyzstan, Azerbaijan, Georgia, Tajikistan, Uzbekistan, Kazakhstan, Armenia, and Turkmenistan), 2 countries in South-East Asia (Bangladesh and Cambodia), 5 countries in Southern and Eastern Europe (Croatia, Hungary, Poland, Romania, Turkey), 4 countries in the Middle East and North Africa (Saudi-Arabia, Lebanon, Jordan, Algeria), and 1 country in Eastern Africa (Kenya). In each country, the sample size was 10 to 15 people. Overall, more than 220 interviews were conducted. In most countries, the sample was mixed in terms of gender, age, educational background, and area of residence (urban / rural). Participants of the pre-test were asked to state any difficulties in understanding the items and to rephrase the meaning of items in their own words. If encountering difficulties in understanding or interpretations of items that were not intended, respondents were asked to make suggestions on how to modify the wording of the item in order to attain the desired meaning.
Overall, the understanding of both the qualitative items and the quantitative items was good. In particular, no interviewer received any complaints regarding difficulties in assessing the quantitative questions. When asked to rephrase the qualitative item in their own words, most participants seemed to have understood the item in exactly the way that was intended. However, when confronted with hypothetical choices between monetary amounts today versus larger amounts one year later, some participants, especially in countries with current or relatively recent phases of volatile and high inflation rates, stated that their answer would depend on the rate of inflation, or said that they would always take the immediate payment due to uncertainty with respect to future inflation. Therefore, we decided to adjust the wording, relative to the “original” experimentally validated item, by adding the phrase “Please assume there is no inflation, i.e., future prices are the same as today’s prices” to each question involving hypothetical choices between immediate and future monetary amounts.

A.4 Selection of Countries

Our goal when selecting countries was to ensure representativeness for the global population. Thus, we chose countries from each continent and each region within continents. In addition, we aimed at maximizing variation with respect to observables, such as GDP per capita, language, historical and political characteristics, or geographical location and climatic conditions. Accordingly, we favored non-neighboring and culturally dissimilar countries.
This procedure resulted in the following sample of 76 countries:

East Asia and Pacific: Australia, Cambodia, China, Indonesia, Japan, Philippines, South Korea, Thailand, Vietnam
Europe and Central Asia: Austria, Bosnia and Herzegovina, Croatia, Czech Republic, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Italy, Kazakhstan, Lithuania, Moldova, Netherlands, Poland, Portugal, Romania, Russia, Serbia, Spain, Sweden, Switzerland, Turkey, Ukraine, United Kingdom
Latin America and Caribbean: Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Guatemala, Haiti, Mexico, Nicaragua, Peru, Suriname, Venezuela
Middle East and North Africa: Algeria, Egypt, Iran, Iraq, Israel, Jordan, Morocco, Saudi Arabia, United Arab Emirates
North America: United States, Canada
South Asia: Afghanistan, Bangladesh, India, Pakistan, Sri Lanka
Sub-Saharan Africa: Botswana, Cameroon, Ghana, Kenya, Malawi, Nigeria, Rwanda, South Africa, Tanzania, Uganda, Zimbabwe

A.5 Sampling and Survey Implementation

A.5.1 Background

Since 2005, the international polling company Gallup has conducted an annual World Poll, in which it surveys representative population samples in almost every country around the world on, e.g., economic, social, political, and environmental issues. The collection of our preference data was embedded into the regular World Poll 2012 and hence made use of the pre-existing polling infrastructure of one of the largest professional polling institutes in the world.[23]

A.5.2 Survey Mode

Interviews were conducted via telephone and face-to-face. Gallup uses telephone surveys in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology. In countries where telephone interviewing is employed, Gallup uses a random-digit-dial method or a nationally representative list of phone numbers. In countries where face-to-face interviews are conducted, households are randomly selected in an area-frame-design.
[23] Compare http://www.gallup.com/strategicconsulting/156923/worldwide-research-methodology.aspx

A.5.3 Sample Composition

In most countries, samples are nationally representative of the resident population aged 15 and older. Gallup’s sampling process is as follows.

Selecting Primary Sampling Units

In countries where face-to-face interviews are conducted, the first stage of sampling is the identification of primary sampling units (PSUs), consisting of clusters of households. PSUs are stratified by population size and / or geography and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size. If population information is not available, Gallup uses simple random sampling. In countries where telephone interviews are conducted, Gallup uses a random-digit-dialing method or a nationally representative list of phone numbers. In countries where cell phone penetration is high, Gallup uses a dual sampling frame.

Selecting Households and Respondents

Gallup uses random route procedures to select sampled households. Unless an outright refusal to participate occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, interviewers make attempts at different times of the day, and when possible, on different days. If the interviewer cannot obtain an interview at the initially sampled household, he or she uses a simple substitution method. In face-to-face and telephone methodologies, random respondent selection is achieved by using either the latest birthday or Kish grid method.[24] In a few Middle East and Asian countries, gender-matched interviewing is required, and probability sampling with quotas is implemented during the final stage of selection.
Gallup implements quality control procedures to validate the selection of correct samples and to verify that the correct person is randomly selected in each household.

Sampling Weights

Ex post, data weighting is used to ensure a nationally representative sample for each country and is intended to be used for calculations within a country. First, base sampling weights are constructed to account for geographic oversamples, household size, and other selection probabilities. Second, post-stratification weights are constructed. Population statistics are used to weight the data by gender, age, and, where reliable data are available, education or socioeconomic status.

[24] The latest birthday method means that the person living in the household whose birthday among all persons in the household was the most recent (and who is older than 15) is selected for interviewing. With the Kish grid method, the interviewer selects the participants within a household by using a table of random numbers. The interviewer will determine which random number to use by looking at, e.g., how many households he or she has contacted so far (e.g., household no. 8) and how many people live in the household (e.g., 3 people, aged 17, 34, and 36). For instance, if the corresponding number in the table is 7, he or she will interview the person aged 17.

A.5.4 Translation of Items

The preference module items were translated into the major languages of each target country. The translation process involved three steps. As a first step, a translator suggested an English, Spanish or French version of a German item, depending on the region. A second translator, being proficient in both the target language and in English, French, or Spanish, then translated the item into the target language. Finally, a third translator would review the item in the target language and translate it back into the original language.
If semantic differences between the original item and the back-translated item occurred, the process was adjusted and repeated until all translators agreed on a final version.

A.5.5 Adjustment of Monetary Amounts in Quantitative Items

All items involving monetary amounts were adjusted to each country in terms of their real value. In each country, all monetary amounts were calculated to represent the same share of the country’s median income in local currency as the share of the amount in Euro of the German median income, since the validation study had been conducted in Germany. Monetary amounts used in the validation study with the German sample were round numbers in order to facilitate easy calculations and to allow for easy comparisons. In order to proceed in a similar way in all countries, we rounded all monetary amounts to the next “round” number. While this necessarily resulted in some (if very minor) variation in the real stake size between countries, it minimized cross-country differences in the difficulty of understanding the quantitative items due to differences in assessing the involved monetary amounts.

A.5.6 Staircase procedure

The sequence of survey questions that form the basis for the quantitative risk aversion measure is given by the “tree” logic depicted in Figure 3. In Germany, each respondent faced five interdependent choices between taking part in a 50:50 lottery of receiving 300 euros or nothing and varying safe amounts of money. The values in the tree denote the safe amounts of money. The rightmost level of the tree (5th decision) contains 16 distinct monetary amounts, so that responses can be classified into 32 categories which are ordered in the sense that the (visually) lowest path / endpoint indicates the highest level of risk aversion. As in the experimental validation procedure in Falk et al. (2015b), we assign values 1-32 to these endpoints, with 32 denoting the highest level of risk aversion.
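The tree logic can be sketched in a few lines of Python. This is a minimal illustration, not the survey software: the function name `run_staircase` is hypothetical, the first safe amount of 160 euros and the move to 80 or 240 euros come from the description of the German version of the task, and the further halving of the step (40, 20, 10) is an assumption chosen because it reproduces the 16 fifth-level amounts between 10 and 310 euros. Endpoints are numbered as in the labels of Figure 3, where the path with the lottery chosen at every node carries the value 32.

```python
def run_staircase(choices, first_safe=160, first_step=80):
    """Return (safe amounts offered, endpoint in 1..32) for five A/B choices.

    'A' = respondent takes the 50:50 lottery, 'B' = respondent takes the safe amount.
    """
    if len(choices) != 5 or any(c not in ("A", "B") for c in choices):
        raise ValueError("expected five choices, each 'A' or 'B'")
    safe, step = first_safe, first_step
    offered, index = [], 0
    for choice in choices:
        offered.append(safe)
        took_lottery = choice == "A"
        index = 2 * index + int(took_lottery)    # path read as a binary number
        safe += step if took_lottery else -step  # lottery chosen -> raise safe amount
        step //= 2                               # assumed step sequence 80, 40, 20, 10
    return offered, index + 1


if __name__ == "__main__":
    print(run_staircase("AAAAA"))  # lottery at every node
    print(run_staircase("BBBBB"))  # safe amount at every node
```

Reading the five answers as a binary number keeps the 32 endpoints ordered by how long the respondent keeps preferring the lottery as the safe amount rises.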
A.6 Computation of Preference Measures

A.6.1 Cleaning and Imputation of Missings

In order to make maximal use of the available information in our data, missing survey items were imputed based on the following procedure: If one survey item was missing, then the missing item was predicted using the responses to the other item. The procedure was as follows:

• Qualitative question missing: We regress all available survey responses to the qualitative question on responses to the staircase task, and then use these coefficients to predict the missing qualitative items using the available staircase items.

• Staircase item missing: The imputation procedure was similar, but made additional use of the informational content of the responses of participants who started but did not finish the sequence of the five questions. If the respondent did not even start the staircase procedure, then imputation was done by predicting the staircase measure based on answers to the qualitative survey measure using the methodology described above. On the other hand, if the respondent answered at least one of the staircase questions, the final staircase outcome was based on the predicted path through the staircase procedure. Suppose the respondent answered four items such that his final staircase outcome would have to be either x or y. We then predict the expected choice between x and y based on a probit of the “x vs. y” decision on the qualitative item. If the respondent answered three (or fewer) questions, the same procedure was applied, the only difference being that in this case the obtained predicted probabilities were applied to the expected values of the staircase outcome conditional on reaching the respective node. Put differently, the procedure outlined above was applied recursively by working backwards through the “tree” logic of the staircase procedure.
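The backward recursion for partially completed staircases can be sketched as follows. This is an illustration under stated assumptions rather than the imputation code itself: `expected_endpoint` is a hypothetical name, endpoints are numbered 1 to 32 with the lottery-at-every-node path at 32, and `prob_lottery` is a placeholder for the node-specific predicted probability of choosing the lottery (in the procedure above, this probability would come from a probit on the qualitative item, which is not implemented here).

```python
def expected_endpoint(answered, prob_lottery, depth=5):
    """Expected staircase endpoint given the answers observed so far.

    answered     : string of 'A' (lottery) / 'B' (safe amount) choices, length <= depth
    prob_lottery : callable mapping a partial path to P(choose 'A') at that node
                   (placeholder for a fitted probit prediction)
    """
    if len(answered) == depth:
        index = 0
        for choice in answered:
            index = 2 * index + int(choice == "A")
        return float(index + 1)
    p = prob_lottery(answered)
    # Weight the expected value of each subtree by its predicted probability.
    return (p * expected_endpoint(answered + "A", prob_lottery, depth)
            + (1.0 - p) * expected_endpoint(answered + "B", prob_lottery, depth))


if __name__ == "__main__":
    # Hypothetical respondent who stopped after four lottery choices,
    # with an assumed 25% predicted probability of choosing the lottery again.
    print(expected_endpoint("AAAA", lambda path: 0.25))
```

With four answers the recursion reduces to a probability-weighted average of the two remaining endpoints; with fewer answers the same weighting is applied node by node to the expected values of the subtrees.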
[Figure 3: binary decision tree with five levels. The first node offers a safe amount of 160 euros; the nodes at the following levels show safe amounts between 10 and 310 euros in steps of 10, with 16 distinct amounts at the fifth decision. Each branch is labelled A or B, and the 32 endpoints are labelled from Risk taking=32 (A chosen at every node) down to Risk taking=1 (B chosen at every node).]

Figure 3: Tree for the staircase risk task (numbers = safe amount in euros, A = choice of the lottery, B = choice of the safe amount). The staircase procedure worked as follows. First, in Germany, each respondent was asked whether they would prefer to receive 160 euros for sure or whether they preferred a 50:50 chance of receiving 300 euros or nothing. In case the respondent opted for the safe choice (“B”), the safe amount of money being offered in the second question decreased to 80 euros. If, on the other hand, the respondent opted for the gamble (“A”), the safe amount was increased to 240 euros. Working further through the tree follows the same logic.

A.6.2 Computation of Preference Indices at Individual Level

We compute an individual-level index of risk taking by (i) computing the z-scores of each survey item at the individual level and (ii) weighing these z-scores using the weights resulting from the experimental validation procedure of Falk et al. (2015a). Technically, these weights are given by the coefficients of an OLS regression of observed behavior on responses to the respective survey items, such that the coefficients sum to one.
These weights are given by (see above for the precise survey items):

Risk taking = 0.4729985 × Staircase risk + 0.5270015 × Qualitative item

A.6.3 Computation of Country Averages

In order to compute country-level averages, we weigh the individual-level data with the sampling weights provided by Gallup, see above.

B Summary Statistics for Main Variables

Table 7: Summary statistics of main unilateral variables

| | Obs | Mean | SD | Min | Max |
|---|---|---|---|---|---|
| Average risk aversion | 73 | 0.01 | 0.31 | -0.79 | 0.97 |
| SD in risk aversion | 73 | 0.94 | 0.09 | 0.75 | 1.18 |
| Predicted migratory distance from Ethiopia | 73 | 0.07 | 0.04 | 0.01 | 0.19 |

Table 8: Summary statistics of main bilateral variables

| | Obs | Mean | SD | Min | Max |
|---|---|---|---|---|---|
| Abs. difference in avg. risk aversion | 2628 | 0.34 | 0.27 | 0 | 1.76 |
| Fst genetic distance | 2628 | 0.34 | 0.24 | 0 | 1 |
| Nei genetic distance | 2628 | 0.30 | 0.23 | 0 | 1 |
| Predicted migratory distance | 2628 | 0.08 | 0.04 | 0 | 0.23 |
| HMI migratory distance | 2080 | 0.17 | 0.10 | 0.01 | 0.59 |
| Linguistic distance | 2628 | 0.87 | 0.20 | 0 | 1 |

[Figure 4: kernel density plots (Epanechnikov kernel) of average risk aversion, the absolute difference in average risk aversion, SD in risk aversion, Fst genetic distance, and predicted migratory distance from Ethiopia.]

Figure 4: Kernel density estimates of main variables

Table 9: Raw correlations among bilateral variables

| | ∆ Risk | Fst dist | Nei dist | Pred migr dist | HMI dist | Ling. dist |
|---|---|---|---|---|---|---|
| ∆ average risk attitudes | 1 | | | | | |
| Fst genetic distance | 0.343 | 1 | | | | |
| Nei genetic distance | 0.358 | 0.945 | 1 | | | |
| Predicted migratory distance | 0.111 | 0.517 | 0.509 | 1 | | |
| HMI migratory distance | 0.249 | 0.623 | 0.634 | 0.922 | 1 | |
| Linguistic distance | 0.147 | 0.457 | 0.408 | 0.105 | 0.210 | 1 |

C Additional Bilateral Regressions

C.1 Alternative Dependent Variable Specifications

This Appendix provides additional robustness checks on our results pertaining to average risk aversion. Column (1) of Table 10 restates our full regression model, i.e., the results of column (5) in Table 3. Columns (2) through (4) employ the same regression specification, yet use the absolute difference in the 25th, 50th, and 75th percentile of risk aversion as dependent variable. Interestingly, the results show that the relationship between genetic distance and risk attitudes is not specific to the mean of the distribution of risk preferences, but extends to other parts of the distribution. Recall that our main measure of a country’s average risk aversion consists of a weighted average of two survey questions, a hypothetical staircase lottery task, and a qualitative question. Columns (5) through (10) demonstrate that our result extends to using each of these survey questions separately.
Specifically, columns (5) and (6) employ the absolute difference in the average qualitative self-assessment as dependent variable, while columns (7) and (8) use the average outcome of the five-step lottery choice procedure. Finally, columns (9) and (10) present two regressions which utilize only the first choice in the sequential lottery procedure, i.e., (in Germany) the choice between 160 euros for sure and a 50-50 bet of receiving 300 euros or nothing. While this measure ignores a lot of information that is contained in subsequent choices, the corresponding results show that, even with this very rough measure of risk attitudes, genetic distance explains about 10% of the variation in preferences.

C.2 Additional Control Variables

Column (1) of Table 11 reproduces our full between-country regression including all control variables (i.e., column (5) of Table 3). Columns (2)-(4) introduce additional covariates into the analysis, but the overall picture remains virtually unchanged. Column (5) presents an additional, very conservative, specification which employs country fixed effects rather than regional fixed effects. Specifically, we construct 73 dummies, each equal to one if a given country is part of a country pair and zero otherwise. While the resulting point estimate is somewhat lower than in the previous regressions, it is still significant and features a standardized beta of 18.2%.
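The construction of the country dummies for the bilateral data can be illustrated with a short Python sketch. It is a minimal illustration and not the code used for the regressions: the function name `pair_dummies` and the placeholder country codes are hypothetical, and the sketch assumes that each unordered pair of countries enters the data exactly once. Under this assumption, 73 countries yield 73 × 72 / 2 = 2628 pairs, which matches the number of observations reported for the bilateral variables in Table 8.

```python
from itertools import combinations


def pair_dummies(countries):
    """Build one row of 0/1 country dummies per unordered country pair.

    The dummy for country c equals one if c is a member of the pair
    and zero otherwise, so every row contains exactly two ones.
    """
    index = {c: k for k, c in enumerate(countries)}
    pairs = list(combinations(countries, 2))
    rows = []
    for a, b in pairs:
        row = [0] * len(countries)
        row[index[a]] = 1
        row[index[b]] = 1
        rows.append(row)
    return pairs, rows


# Hypothetical placeholder codes standing in for the 73 countries.
countries = [f"C{k:02d}" for k in range(73)]
pairs, rows = pair_dummies(countries)
print(len(pairs))  # 2628 country pairs
```

Each country appears in 72 pairs, so each dummy column sums to 72. These dummies absorb any country-specific factor that shifts all of a country's bilateral differences, which is why the specification is described as very conservative.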
Table 10: Alternative dependent variables

Dependent variable: Absolute difference in...
(1) Average (baseline); (2)-(4) 25th, 50th, and 75th percentiles; (5)-(6) Self-assessment (average); (7)-(8) Lottery choices (average outcome); (9)-(10) Lottery choices (average 1st choice)

                                          (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)
Fst genetic distance                      0.26∗∗∗   0.33∗∗∗   0.28∗∗    0.24∗∗    0.52∗∗∗   0.53∗∗    3.02∗∗∗   1.83∗∗    0.15∗∗∗   0.091∗∗
                                          (0.10)    (0.12)    (0.11)    (0.10)    (0.14)    (0.21)    (1.03)    (0.89)    (0.05)    (0.04)
Constant                                  -0.34     -0.64     -0.50     -0.26     0.74∗∗∗   0.30      2.69∗∗∗   -4.57     0.094∗∗∗  -0.30
                                          (0.63)    (0.89)    (0.68)    (0.62)    (0.08)    (1.52)    (0.34)    (4.65)    (0.02)    (0.19)
Demographic and economic controls         Yes       Yes       Yes       Yes       No        Yes       No        Yes       No        Yes
Geographic, climate and health controls   Yes       Yes       Yes       Yes       No        Yes       No        Yes       No        Yes
Continent FE                              Yes       Yes       Yes       Yes       No        Yes       No        Yes       No        Yes
Observations                              2556      2556      2556      2556      2628      2556      2628      2556      2628      2556
R2                                        0.329     0.229     0.292     0.373     0.036     0.146     0.066     0.325     0.098     0.376
Standardized beta (%)                     23.2      25.8      22.5      18.4      18.9      19.3      25.6      15.9      31.3      19.1

OLS estimates, twoway-clustered standard errors in parentheses. The dependent variables consist of different measures of the absolute difference in risk attitudes: (1) averages, (2) - (4) percentiles, (5) - (6) average (qualitative) self-assessments, (7) - (8) average outcome of the staircase lottery sequence, (9) - (10) average decision in first lottery choice (safe amount = 160 euros). See Tables 2 and 3 for a complete list of the demographic, economic, geographic, and climatic control variables. The standardized betas refer to genetic distance. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table 11: Additional control variables

Dependent variable: Absolute difference in average risk aversion

                                          (1)       (2)       (3)       (4)         (5)
Fst genetic distance                      0.26∗∗∗   0.28∗∗∗   0.32∗∗∗   0.28∗∗      0.20∗∗
                                          (0.10)    (0.10)    (0.11)    (0.11)      (0.10)
Freight cost (surface transport)                    0.00018   0.00018   -0.000033   0.00011
                                                    (0.00)    (0.00)    (0.00)      (0.00)
∆ Landlocked                                        -0.033    -0.022    -0.024      -0.0050
                                                    (0.03)    (0.03)    (0.03)      (0.02)
∆ Mean distance to nearest waterway                 0.00027   -0.0060   -0.014      0.031
                                                    (0.02)    (0.02)    (0.02)      (0.06)
∆ Traffic death rate                                          0.0015    0.00011     0.0011
                                                              (0.00)    (0.00)      (0.00)
∆ Linguistic fractionalization                                0.088     0.13∗∗      0.084
                                                              (0.05)    (0.05)      (0.06)
∆ Executive constraints                                                 0.0063      0.0014
                                                                        (0.01)      (0.00)
∆ Gini coefficient                                                      0.0040∗∗∗   -0.00016
                                                                        (0.00)      (0.00)
∆ Years of schooling                                                    -0.0052     -0.0051
                                                                        (0.00)      (0.01)
Constant                                  -0.34     -0.29     -0.46     -0.30       0.50
                                          (0.63)    (0.65)    (0.68)    (0.68)      (0.31)
Demographic and economic controls         Yes       Yes       Yes       Yes         Yes
Geographic, climate and health controls   Yes       Yes       Yes       Yes         Yes
Continent FE                              Yes       Yes       Yes       Yes         No
Country FE                                No        No        No        No          Yes
Observations                              2556      2556      2346      1953        1953
R2                                        0.329     0.331     0.359     0.415       0.671
Standardized beta (%)                     23.2      24.6      27.3      25.5        18.2

OLS estimates, twoway-clustered standard errors in parentheses. See Tables 2 and 3 for a complete list of the demographic, economic, geographic, and climatic control variables. The standardized beta refers to genetic distance. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

D Details for WLS Regressions

Following Spolaore and Wacziarg (2009), the regression weights in Table 10 were computed as follows:

\[
\text{Weight} = \frac{\text{Maximum standard error} + 1 - \text{standard error of observation}}{\text{Maximum standard error}}
\]

Thus, the regression weights are between zero and one and linearly downweigh observations with a high standard error for genetic distance.
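As an illustration of the weighting scheme in Section D, the following Python sketch computes the weights for a handful of observations. It is a minimal sketch under stated assumptions: it reads the formula as a fraction with the maximum standard error in the denominator, the function name `wls_weights` is hypothetical, and the standard errors in the example are invented for illustration rather than taken from the genetic distance data.

```python
def wls_weights(std_errors):
    """Linear downweighting of observations with a high standard error.

    Assumed reading of the formula in Section D:
        weight = (max_se + 1 - se) / max_se
    """
    max_se = max(std_errors)
    return [(max_se + 1 - se) / max_se for se in std_errors]


# Invented standard errors of genetic distance for three country pairs.
example_se = [10.0, 25.0, 50.0]
weights = wls_weights(example_se)
print(weights)  # approximately [0.82, 0.52, 0.02]
```

Under this reading, the observation with the largest standard error receives the smallest weight, 1 / (maximum standard error), and the weights stay at or below one as long as every standard error is at least one.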
E Details for Analysis of Heterogeneity Within Countries (Table 6)

Tables 12 and 13 provide the details for columns (1) through (8) of Table 6.

Table 12: Correlates of standard deviation in risk attitudes

Dependent variable: SD in risk aversion

                                                     (1)       (2)       (3)       (4)
Migratory distance from Ethiopia                     -0.52∗    -0.83∗∗
                                                     (0.28)    (0.32)
HMI migratory distance from Ethiopia (Özak, 2010)                        -0.52∗∗   -0.77∗∗∗
                                                                         (0.21)    (0.26)
Will. to take risks                                            -0.032              -0.040
                                                               (0.04)              (0.04)
SD in age                                                      0.00084             0.0041
                                                               (0.01)              (0.01)
Average age                                                    0.0040              0.0047
                                                               (0.01)              (0.01)
Fraction female                                                -0.44               0.56
                                                               (0.77)              (0.89)
Gender heterogeneity index                                     31.4∗∗∗             1.41
                                                               (8.38)              (14.20)
Log [GDP p/c PPP]                                              0.19∗∗              0.18∗∗
                                                               (0.08)              (0.08)
Log [GDP p/c PPP] squared                                      -0.013∗∗            -0.013∗∗
                                                               (0.01)              (0.01)
SD in log [HH income]                                          0.17∗∗              0.12∗
                                                               (0.06)              (0.07)
Log [Population]                                               0.0018              0.0077
                                                               (0.01)              (0.01)
Linguistic diversity                                           0.049               0.053
                                                               (0.05)              (0.05)
Life expectancy                                                0.0022              0.0025
                                                               (0.00)              (0.00)
Constant                                             0.98∗∗∗   0.11      1.00∗∗∗   -0.47
                                                     (0.02)    (0.42)    (0.02)    (0.56)
Observations                                         73        73        64        64
R2                                                   0.048     0.380     0.074     0.412

OLS estimates, robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
Table 13: Correlates of standard deviation in risk attitudes Dependent variable: SD in risk aversion (2) (3) (1) ∗ Migratory distance from Ethiopia -0.64 (0.36) (4) -0.15 (0.47) HMI migratory distance from Ethiopia (Ozak, 2010) -0.73∗∗ (0.29) -0.80∗ (0.45) -0.00061 (0.00) 0.00048 (0.00) Distance to equator 0.00089 (0.00) 0.0022 (0.00) Average precipitation -0.00040 (0.00) -0.00035 (0.00) Average temperature -0.00080 (0.00) 0.0013 (0.00) -0.0021 (0.00) -0.0012 (0.00) Log [Timing neolithic revolution] -0.091∗ (0.05) -0.088∗ (0.05) -0.17∗∗∗ (0.05) -0.15∗∗ (0.06) % at risk of malaria -0.081 (0.06) -0.076 (0.06) -0.094 (0.06) -0.080 (0.08) Land suitability for agriculture 0.043 (0.05) 0.017 (0.06) 0.032 (0.06) 0.024 (0.08) Constant 1.16∗∗ (0.52) 0.80 (0.62) 1.36∗∗ (0.62) 1.51∗∗ (0.70) Socioeconimic controls Yes Yes Yes Yes Continent FE No Yes No Yes Observations R2 73 0.509 73 0.568 64 0.578 64 0.595 OLS estimates, robust standard errors in parentheses. 47 ∗ p < 0.10, ∗∗ -0.00038 -0.00028 (0.00) (0.00) p < 0.05, ∗∗∗ p < 0.01. 
[Figure 5: Migratory distance from South Africa and preference heterogeneity. Scatter plot titled "Preference heterogeneity and migratory distance to Ethiopia"; x-axis: mobility index-based migratory distance from Ethiopia; y-axis: SD in risk aversion; countries are marked by three-letter code and grouped by region (Sub-Saharan Africa, ME & North Africa, EU & Central Asia, South Asia, South America, North America, East Asia).]

F Relationship Between Risk Preferences and Outcomes

F.1 Average Risk Aversion and Riskiness of the Environment

Table 14 presents correlations between the average level of risk aversion in a given country and the riskiness of the environment, as proxied for by life expectancy at birth, HIV prevalence, the traffic death rate, a poverty ratio, the Gini index, the fraction of income that goes to the top 10%, and a labor protection index. Also see Falk et al. (2015a).

Table 14: Average risk taking and riskiness of the environment

Health risks

Dependent variable:   Life expectancy       Log [HIV prevalence]   Traffic death rate
                      (1)        (2)        (3)        (4)         (5)       (6)
Risk taking           -12.3∗∗∗   -9.69∗∗∗   1.96∗∗     1.78∗∗      9.37∗∗∗   7.66∗∗∗
                      (4.01)     (3.34)     (0.74)     (0.74)      (2.78)    (2.88)
Log [GDP p/c PPP]                4.32∗∗∗               -0.30∗∗∗              -2.90∗∗∗
                                 (0.34)                (0.10)                (0.39)
Constant              70.6∗∗∗    34.7∗∗∗    -0.86∗∗∗   1.62∗       15.3∗∗∗   39.5∗∗∗
                      (0.96)     (3.13)     (0.17)     (0.89)      (0.92)    (3.49)
Observations          76         76         69         69          74        74
R2                    0.158      0.692      0.150      0.245       0.111     0.403

Economic risks

Dependent variable:   Poverty ratio         Gini index            Income share of top 10%   Labor protection
                      (7)        (8)        (9)        (10)       (11)       (12)           (13)       (14)
Risk taking           14.0∗      8.90∗∗     9.45∗∗     9.58∗∗     9.59∗∗     9.00∗∗         -0.24∗∗∗   -0.21∗∗
                      (7.83)     (4.23)     (4.46)     (4.53)     (3.94)     (4.20)         (0.08)     (0.08)
Log [GDP p/c PPP]                -11.1∗∗∗              0.35                  -1.71∗∗∗                  0.015
                                 (1.23)                (1.14)                (0.44)                    (0.02)
Constant              13.3∗∗∗    96.8∗∗∗    41.2∗∗∗    38.6∗∗∗    31.0∗∗∗    45.2∗∗∗        0.49∗∗∗    0.37∗∗∗
                      (2.08)     (9.96)     (1.23)     (8.74)     (0.86)     (3.97)         (0.02)     (0.14)
Observations          50         50         52         52         68         68             56         56
R2                    0.077      0.692      0.091      0.093      0.129      0.252          0.135      0.148

OLS estimates, robust standard errors in parentheses. Labor protection is an index taken from Botero et al. (2004). ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

F.2 Variability in Risk Attitudes and Economic Development

Table 15 investigates the relationship between the variability in risk attitudes and per capita income. Across the different specifications, the standard deviation in risk attitudes is significantly related to income, even conditional on institutions and education. In fact, the relationship becomes even stronger when we focus on the standard deviation of the quantitative (lottery-choice) risk measure only. The corresponding raw correlation is ρ = −0.57 (p < 0.01) and is illustrated in Figure 6.

[Figure 6: Variation in risk attitudes (using the quantitative lottery choice measure only) and development. Scatter plot titled "SD in quantitative risk measure and per capita income"; x-axis: SD in quantitative risk measure; y-axis: Log [GDP p/c PPP]; countries are marked by three-letter code.]
Table 15: Variability in risk preferences and development

Dependent variable: Log [GDP p/c PPP]

                                     (1)        (2)        (3)        (4)        (5)         (6)
SD in risk taking                    -5.39∗∗∗   -5.27∗∗∗   -5.41∗∗∗   -4.62∗∗∗   -5.00∗∗∗    -2.87∗∗∗
                                     (1.62)     (1.61)     (1.34)     (1.20)     (0.96)      (0.86)
Average risk taking                             -0.48      0.89∗      0.22       -0.0085     0.087
                                                (0.56)     (0.47)     (0.34)     (0.34)      (0.35)
Distance to equator                                                   0.034      -0.012      -0.021
                                                                      (0.02)     (0.02)      (0.01)
Log [Timing neolithic revolution]                                     -0.057     0.36        0.24
                                                                      (0.39)     (0.44)      (0.27)
Average temperature                                                   0.059∗∗    0.023       0.015
                                                                      (0.03)     (0.03)      (0.02)
Average precipitation                                                 0.0044     -0.0045     -0.0038
                                                                      (0.00)     (0.00)      (0.00)
% at risk of malaria                                                  -2.01∗∗∗   -2.23∗∗∗    -1.64∗∗∗
                                                                      (0.54)     (0.60)      (0.42)
% living in (sub-)tropical zones                                      -0.78      -0.069      -0.36
                                                                      (0.73)     (0.65)      (0.46)
Land suitability for agriculture                                      -0.73      -0.82       -0.12
                                                                      (0.54)     (0.57)      (0.37)
Predicted genetic diversity                                                      559.2∗∗∗    477.6∗∗∗
                                                                                 (162.10)    (138.91)
Predicted genetic diversity sqr.                                                 -408.5∗∗∗   -346.0∗∗∗
                                                                                 (118.63)    (101.79)
Ethnic fractionalization                                                         -2.36∗∗∗    -0.48
                                                                                 (0.64)      (0.60)
Linguistic fractionalization                                                     0.68        0.13
                                                                                 (0.45)      (0.48)
Average years of schooling                                                                   0.10∗∗
                                                                                             (0.04)
Property rights                                                                              0.022∗∗∗
                                                                                             (0.00)
Constant                             13.4∗∗∗    13.3∗∗∗    15.3∗∗∗    11.7∗∗∗    -176.8∗∗∗   -155.1∗∗∗
                                     (1.58)     (1.58)     (1.22)     (3.91)     (55.32)     (48.00)
Continent FE                         No         No         Yes        Yes        Yes         Yes
Observations                         76         76         76         74         72          67
R2                                   0.100      0.108      0.613      0.756      0.813       0.927
Adjusted R2                          0.087      0.083      0.566      0.693      0.745       0.893

OLS estimates, robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

G Definitions and Data Sources of Main Variables

G.1 Explanatory Variables

Fst and Nei genetic distance. Genetic distance between contemporary populations, taken from Spolaore and Wacziarg (2009).

Linguistic distance. Weighted linguistic distance between contemporary populations. Derived from the Ethnologue project data, taking into account all languages which are spoken by at least 5% of the population in a given country.

Predicted migratory distance.
Predicted migratory distance between two countries’ capitals, along a land-restricted way through five intermediate waypoints (one on each continent). Taken from Ashraf and Galor (2013b).

HMI migratory distance. Walking time between two countries’ capitals in years, taking into account topographic, climatic, and terrain conditions, as well as human biological abilities. Data from Özak (2010).

G.2 Covariates and Outcome Variables

Average age, proportion female, standard deviation age, gender heterogeneity index, standard deviation of household income. Computed from Gallup’s sociodemographic background data. Gender heterogeneity is computed as $(2 \times \text{proportion female} - 1)^2$.

Religious and linguistic fractionalization. Indices due to Alesina et al. (2003) capturing the probability that two randomly selected individuals from the same country will be from different religious / linguistic groups.

Percentage of European descent. Constructed from the “World Migration Matrix” of Putterman and Weil (2010).

Contemporary national GDP per capita. Average annual GDP per capita over the period 2001–2010, in 2005 US$. Source: World Bank Development Indicators.

Democracy index. Index that quantifies the extent of institutionalized democracy, as reported in the Polity IV dataset. Average from 2001 to 2010.

Property rights. This factor scores the degree to which a country’s laws protect private property rights and the degree to which its government enforces those laws. It also accounts for the possibility that private property will be expropriated. In addition, it analyzes the independence of the judiciary, the existence of corruption within the judiciary, and the ability of individuals and businesses to enforce contracts. Average 2001-2010, taken from the Quality of Government dataset, http://www.qogdata.pol.gu.se/codebook/codebook_basic_30aug13.pdf.

Colonial relationship dummies.
Taken from the CEPII GeoDist database at http://www.cepii.fr/CEPII/en/bdd_modele/presentation.asp?id=6.

Geodesic distance, contiguity, longitude, latitude, area, landlocked dummy (equal to 1 if landlocked). Taken from the CEPII GeoDist database. The longitudinal distance between two countries is computed as

\[
\text{Longitudinal distance} = \min\{\, |\text{longitude}_i - \text{longitude}_j|,\; 360 - |\text{longitude}_i| - |\text{longitude}_j| \,\}
\]

Suitability for agriculture. Index of the suitability of land for agriculture based on ecological indicators of climate suitability for cultivation, such as growing degree days and the ratio of actual to potential evapotranspiration, as well as ecological indicators of soil suitability for cultivation, such as soil carbon density and soil pH, taken from Michalopoulos (2012).

Percentage of arable land. Fraction of land within a country which is arable, taken from the World Bank Development Indicators.

Terrain roughness. Degree of terrain roughness of a country, taken from Ashraf and Galor (2013b). Data originally based on geospatial undulation data reported by the G-ECON project (Nordhaus, 2006).

Mean and standard deviation of elevation. Mean elevation in km above sea level, taken from Ashraf and Galor (2013b). Data originally based on geospatial elevation data reported by the G-ECON project (Nordhaus, 2006).

Timing of the neolithic transition. The number of thousand years elapsed, until the year 2000, since the majority of the population residing within a country’s modern national borders began practicing sedentary agriculture as the primary mode of subsistence. The measure is weighted within each country, where the weight represents the fraction of the year 2000 population (of the country for which the measure is being computed) that can trace its ancestral origins to the given country in the year 1500. Measure taken from Ashraf and Galor (2013b).

Precipitation. Average monthly precipitation of a country in mm per month, 1961-1990, taken from Ashraf and Galor (2013b).
Data originally based on geospatial average monthly precipitation data for this period reported by the G-ECON project (Nordhaus, 2006).

Temperature. Average monthly temperature of a country in degrees Celsius, 1961-1990, taken from Ashraf and Galor (2013b). Data originally based on geospatial average monthly temperature data for this period reported by the G-ECON project (Nordhaus, 2006).

Life expectancy. Life expectancy at birth, average 2001–2010. Taken from the World Bank Indicators.

Percentage at risk of contracting malaria. The percentage of population in regions of high malaria risk (as of 1994), multiplied by the proportion of national cases involving the fatal species of the malaria pathogen, P. falciparum. This variable was originally constructed by Gallup et al. (2000) and is part of Columbia University’s Earth Institute data set on malaria.

Freight cost of surface transport. Cost of 1,000 kg of unspecified freight transported over sea or land with no special handling, taken from Spolaore and Wacziarg (2009). Data originally retrieved from http://www4.importexportwizard.com/.

Distance to nearest waterway. The distance, in thousands of km, from a GIS grid cell to the nearest ice-free coastline or sea-navigable river, averaged across the grid cells of a country. Source: Ashraf and Galor (2013b), originally constructed by Gallup et al. (1999).

Executive constraints. The 1960-2000 mean of an index (1-7) that quantifies the extent of institutionalized constraints on the executive, as reported in the Polity IV data set. Data taken from Ashraf and Galor (2013b).

Average years of schooling. The mean, over the 2000-2010 period, of the five-yearly figures reported by Barro and Lee (2012) on average years of schooling among the population aged 25 and over.

Population. Average 2001-2010, taken from the World Bank Development Indicators.

Linguistic diversity. Computed akin to linguistic and religious distance indices by matching a country with itself.
HIV prevalence. Primary data taken from UNICEF. Missing values are replaced by data from the most recent available period of the World Bank Development Indicators.

Traffic death rate. Traffic deaths per 1,000 population, estimate of the World Health Organization.

Poverty headcount ratio. Fraction of the population living on less than $2 per day, average 2001-2010. Taken from World Bank Development Indicators.

Gini index. Average 2001-2010. Taken from World Bank Development Indicators.

Income share of top 10%. Average 2001-2010. Taken from World Bank Development Indicators.

Labor protection index. Index capturing the rigidity of employment laws by Botero et al. (2004). Includes data on employment, collective relations, and social security laws and measures legal worker protection.
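Two of the constructed covariates defined in Appendix G.2, the longitudinal distance between two countries and the gender heterogeneity index, follow directly from the stated formulas and can be sketched in a few lines of Python. The sketch is an illustration only: the function names are hypothetical, the input values are invented, and longitudes are assumed to be measured in degrees between -180 and 180.

```python
def longitudinal_distance(longitude_i, longitude_j):
    """Longitudinal distance in degrees, as defined in Appendix G.2.

    Takes the smaller of the direct difference in longitudes and
    360 minus the sum of the absolute longitudes.
    """
    direct = abs(longitude_i - longitude_j)
    around = 360 - abs(longitude_i) - abs(longitude_j)
    return min(direct, around)


def gender_heterogeneity(proportion_female):
    """Gender heterogeneity index: (2 * proportion female - 1) squared."""
    return (2 * proportion_female - 1) ** 2


# Invented inputs: two locations on either side of the 180th meridian.
print(longitudinal_distance(170.0, -170.0))  # 20.0
print(longitudinal_distance(10.0, 30.0))     # 20.0
print(gender_heterogeneity(0.5))             # 0.0
```

The second term of the minimum matters for pairs that lie on opposite sides of the 180th meridian, where the direct difference in longitudes overstates how far apart the two countries are. The index as defined equals zero when the sample is exactly half female and grows as the gender composition becomes more lopsided.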