Article

Genetic Adaptation to Levels of Dietary Selenium in Recent Human History

Louise White,*,1 Frederic Romagne,1 Elias Müller,1 Eva Erlebach,1 Antje Weihmann,1 Genís Parra,1 Aida M. Andres,1 and Sergi Castellano*,1
1 Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
*Corresponding author: E-mail: [email protected]; [email protected].
Associate editor: Ryan Hernandez

Abstract

As humans migrated around the world, they came to inhabit environments that differ widely in the soil levels of certain micronutrients, including selenium (Se). Coupled with cultural variation in dietary practices, these migrations have led to a wide range of Se intake levels in populations around the world. Both excess and deficiency of Se in the diet can have adverse health consequences in humans, with severe Se deficiency resulting in diseases of the bone and heart. Se is required by humans mainly due to its function in selenoproteins, which contain the amino acid selenocysteine as one of their constituent residues. To understand the evolution of the use of this micronutrient in humans, we surveyed the patterns of polymorphism in all selenoprotein genes and genes involved in their regulation in 50 human populations. We find that single nucleotide polymorphisms from populations in Asia, particularly in populations living in the extreme Se-deficient regions of China, have experienced concerted shifts in their allele frequencies. Such differentiation in allele frequencies across genes is not observed in other regions of the world and is not expected under neutral evolution, being better explained by the action of recent positive selection. Thus, recent changes in the use and regulation of Se may harbor the genetic adaptations that helped humans inhabit environments that do not provide adequate levels of Se in the diet.

Key words: micronutrients, polygenic adaptation, local adaptation.

© The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
Mol. Biol. Evol. 32(6):1507–1518 doi:10.1093/molbev/msv043 Advance Access publication March 3, 2015

Introduction

As humans spread out of Africa about 60,000–100,000 years ago (Stringer and Andrews 1988; Fu et al. 2013) and migrated around the world, they came to inhabit a vast range of environments. These environments differ widely in their geology, and the abundance of the chemical elements required by humans varies in their rocks and soils (Selinus and Alloway 2005). Elements such as zinc, iron, manganese, copper, iodine, molybdenum, and selenium (Se), which are essential to the human diet but needed only in trace amounts (Mertz 1981), accumulate in varying quantities in the plants and animals in these environments. The intake of these micronutrients, which may have changed with the domestication of plants and animals in the past 10,000 years (Diamond 2002), can exceed or fall short of the dietary needs of human populations today (Selinus and Alloway 2005).

Both excess and deficiency of micronutrients in the diet have adverse health consequences in humans, but deficiency is more common, with iron and iodine deficiencies today affecting almost one-quarter of the world’s population (Caballero 2002; Bhutta and Salam 2012). Deficiency of other micronutrients, such as Se, appears less widespread but is severe in some regions of the world. Studies on the level and bioavailability of Se in plant and animal foods across the world suggest that it varies widely at both global and local scales (Oldfield 2002; Selinus and Alloway 2005), with relatively little Se in countries such as Finland (Koivistoinen and Huttunen 1986), New Zealand (Thomson 2004), possibly Malawi (Hurst et al. 2013) and, most notably, some parts of China (Oldfield 2002; Xia et al. 2005).
Heart (Keshan) and bone (Kashin–Beck) diseases, which are treated with Se supplementation, are endemic to large areas of China (Keshan Disease Research Group of the Chinese Academy of Medical Sciences 1979; Moreno-Reyes et al. 2001). More generally, Se deficiency is associated with low immune response, infertility, cognitive decline, and increased risk of mortality (Hatfield et al. 2012; Rayman 2012).

Se is required by humans mainly due to its function in 25 selenoproteins (Kryukov et al. 2003), which contain the amino acid selenocysteine (Sec) as one of their constituent residues. This amino acid is encoded by a codon (UGA) that in all other genes represents a stop codon. Sec is analogous to the amino acid cysteine (Cys) in its molecular structure with an atom of Se replacing that of sulfur (Hatfield et al. 2012). These elements confer different properties when incorporated into proteins (Gromer et al. 2003; Steinmann et al. 2010; Snider et al. 2013) and although substitutions between Sec and Cys in orthologous proteins are uncommon in vertebrates (Castellano et al. 2009), paralogous proteins with Cys instead of Sec exist in most vertebrate species, with humans having six of them. Selenoproteins and their Cys-containing paralogs are generally involved in preventing oxidative damage (Steinbrenner and Sies 2009).

The homeostasis and metabolism of Se and Sec involve the transport of dietary Se from the liver to other organs in the body, its receptor-mediated uptake into cells, the synthesis of Sec, and the readthrough of a UGA stop codon with a tRNA charged with Sec. Se deficiency tightly regulates these processes to favor the supply of Se to some tissues and selenoproteins at the expense of others (Schomburg and Schweizer 2009). The molecular basis of this hierarchy in the distribution of Se is not fully understood but some 19 proteins may be involved (Allmang et al. 2009; Burk and Hill 2009; Hatfield et al. 2012), including different receptors for the uptake of Se in cells (Burk and Hill 2009) and proteins directing the recoding of the UGA codon in selenoproteins (Dumitrescu et al. 2005; Allmang et al. 2009; Bifano et al. 2013) through their differential binding of an RNA secondary structure in each selenoprotein mRNA (Latreche et al. 2009).

To explore whether changes in dietary Se have shaped the evolution of genes that use or regulate Se, we have conducted a survey of single nucleotide polymorphisms (SNPs) in selenoprotein genes and genes involved in the regulation of Se and Sec in 50 human populations (Cann et al. 2002). We investigated whether human populations have adapted to the different levels of Se in their diets by assessing the signatures of positive selection in these genes. We investigated both classical signatures of selection and signatures of polygenic adaptation, including population differentiation and the concerted change in allele frequencies in different regions of the world. We find evidence for polygenic adaptation in the groups of selenoprotein genes and genes involved in the regulation of Se and Sec, with the strongest signatures in populations from China living in regions characterized by Se deficiency (Oldfield 2002; Xia et al. 2005).

Results

Candidate Genes and Their Genetic Variation in Humans

We used a hybridization approach (see Materials and Methods) to enrich and sequence the protein-coding and regulatory regions of 25 selenoprotein genes and 19 genes involved in the regulation of Se and Sec (table 1) in 855 unrelated individuals from the Human Genome Diversity Panel (HGDP-CEPH) (supplementary tables S1 and S2, Supplementary Material online) (Cann et al. 2002). These individuals belong to 50 populations and provide an unbiased sample of Se nutritional histories in humans across Africa, the Middle East, Europe, Central South Asia, East Asia, Oceania, and America.
We also sequenced the six Cys-containing paralogs of selenoproteins (table 1) in the same populations. We called SNPs in the HGDP-CEPH individuals, with an average coverage of 20-fold across the 50 candidate genes, and identified 7,989 SNPs (see Materials and Methods). To derive a neutral expectation for these SNPs and genes, we sequenced 48 neutrally evolving pseudogenes (supplementary table S3, Supplementary Material online) in the same populations and identified 1,035 SNPs in them (see Materials and Methods).

Table 1. Candidate Genes Grouped According to Their Type and the Biological Process They Participate in.

Selenoproteins
    Glutathione peroxidase (GPx) 1, 2, 3, 4, and 6
    Iodothyronine deiodinase (DI) 1, 2, and 3
    15 kDa selenoprotein (Sel15)
    Selenoprotein H (SelH)
    Selenoprotein I (SelI)
    Selenoprotein K (SelK)
    Selenoprotein M (SelM)
    Selenoprotein N (SelN)
    Selenoprotein O (SelO)
    Selenoprotein R 1 (SelR1)
    Selenoprotein S (SelS)
    Selenoprotein T (SelT)
    Selenoprotein V (SelV)
    Selenoprotein W 1 (SelW1)
    Thioredoxin reductase (TR) 1, 2, and 3
Cys-containing paralogs
    Glutathione peroxidase (GPx) 5, 7, and 8
    Selenoprotein R (SelR) 2 and 3
    Selenoprotein W 2 (SelW2)
Regulation of Se and Sec
    Transport and uptake of Se into cells
        Selenoprotein P (SelP)
        Apolipoprotein E receptor 2 (ApoER2)
        Megalin
    Metabolism of Se
        Selenocysteine lyase (SCLY)
        Se binding protein 1 (SELENBP1)
    Biosynthesis of Sec
        O-phosphoryl tRNASec Kinase (PSTK)
        Selenophosphate synthetase 2 (SPS2)
        O-phosphoseryl-tRNASec Se transferase (SEPSECS)
        Seryl-tRNA synthetase (SARS2)
    tRNA
        Ser-tRNASec, copy in chromosome 17
        Ser-tRNASec, copy in chromosome 19
        Ser-tRNASec, copy in chromosome 22
    Incorporation of Sec into proteins
        CUGBP, Elav-like family member 1 (CELF1)
        Elongation factor for Sec (EFSec)
        Eukaryotic translation initiation factor 4A3 (EIF4A3)
        ELAV like RNA binding protein 1 (ELAVL1)
        Ribosomal protein L30 (RPL30)
        SECIS binding protein 2 (SBP2)
        Selenophosphate synthetase 1 (SPS1)
        tRNASec 1 associated protein 1 (TRNAU1AP)
        Exportin 1 (XPO1)

NOTE: SelP and SPS2 are selenoproteins with a regulatory role.

Signatures of Positive Selection in the Candidate Genes

We investigated whether certain genes have differentiated, in certain populations, beyond what is expected under neutrality. Specifically, we tested whether the groups of selenoprotein genes, their Cys-containing paralogs, or the genes involved in the regulation of Se and Sec (table 1) show concerted shifts in their SNP allele frequencies in certain regions of the world. We first computed the FST statistic (Weir and Cockerham 1984) for each of these candidate genes by combining the FST values of all SNPs in each gene (see Materials and Methods). This FST value takes into account the differentiation of functional alleles, if any, and other linked alleles and, thus, measures the overall genetic differentiation of each gene between each pair of populations in our study. We then compared the FST values of each gene with the FST values of our set of neutral loci, for the same population pair (fig. 1). This provides a rank for each FST value with respect to its neutral expectation, which measures how unusual the differentiation for a gene is in a given population pair. We divided the genes into the three functional groups above and focused only on genes whose FST values rank in the top 5% of the distribution of neutral FST values, as these extreme genes bear the strongest signature of local adaptation. We then grouped populations by world region (supplementary table S2, Supplementary Material online) to consider together the extreme FST ranks from populations in each world region (excluding pairs of populations in the same world region). We then compared the FST ranks across world regions with a two-sided Mann–Whitney U test (fig. 1A). The skewness of the FST ranks measures the deviation of these genes from the neutral expectation, with stronger skewness (higher FST ranks) indicating higher allele frequency differentiation in a given world region. The P values from the Mann–Whitney U pairwise comparisons are represented as colors on a heat map in figure 2, with red indicating a significantly stronger departure from the neutral expectation in the left-hand world region than in the world region at the top. Such unexpected and region-specific allele frequency differentiation is a signature of local adaptation in that region of the world, through positive selection.

[Figure 1: schematic. (A) For each population in world region A and in world region B, FST ranks are computed per population pair against the neutral FST distribution between that population and a population from another world region, keeping the 5% highest FST values; the FST ranks of the two regions are compared with a Mann–Whitney U test, and the P value is shown in red when the FST shift is stronger in A than in B. (B) FST ranks by gene in world region A receive rank values 1 to 20, which are summed per gene. Gene X: 20 + 17 + 16 + 13 = 66 (70.2%); Gene Y: 14 + 8 + 5 + 1 = 28 (29.8%); gene X contributes more to the shift in FST ranks.]

FIG. 1. Computation of FST ranks for each set of candidate genes (selenoproteins, Cys-containing paralogs, or genes involved in the regulation of Se and Sec) between pairs of populations. The FST ranks are based on the distribution of FST values from neutral loci (pseudogenes) between the same pair of populations. (A) Comparison of FST ranks between pairs of world regions: 1) Comparison of the ranks of FST values in the top 5% of the distribution of neutral FST with a two-sided Mann–Whitney U test between a pair of world regions; and 2) P values from these pairwise comparisons are shown as colors on a heat map. (B) Contribution of individual genes to the FST ranks in a world region: 1) The first (lowest) FST rank is given a value of 1, the next rank a value of 2, and so on. This produces a combined measure of the number of times the FST of a gene (X or Y) significantly departs from the neutral expectation between region A and the rest of the world; and 2) Gene X has the highest sum of rank values and contributes more to the signatures of positive selection in region A than gene Y.

FIG. 2. Comparison of FST values falling in the top 5% of the distribution of neutral FST values across populations from different world regions. Their relative departure from the neutral expectation is tested, such that populations from the left-hand regions that appear in red are significantly more differentiated than populations living in the regions at the top. Regions in gray do not have FST in the top 5%. (A) Comparison among regions in the world. (B) Comparison between populations in China living in either Se deficient or Se adequate regions to other world regions.

Cys-Containing Paralogs

We observe no signatures of local positive selection in the group of Cys-containing paralogs, as no comparison in figure 2A shows a significant difference between pairs of world regions (lowest P = 0.292, for Middle East vs. Africa). These genes have functions that, while related to those of selenoproteins, are independent of Se and its abundance throughout the world. It is perhaps then not surprising that as a group Cys-containing paralogs have not experienced local adaptation in recent human history.
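The ranking and comparison steps described under Signatures of Positive Selection in the Candidate Genes can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the function names (neutral_rank, is_extreme, mann_whitney_u) and the toy values are assumptions, the rank is taken here to be the fraction of neutral FST values below the FST of the gene, and only the U statistic is computed, without a P value.

```python
from bisect import bisect_left

def neutral_rank(gene_fst, neutral_fsts):
    """Fraction of neutral FST values strictly below the gene's FST (0 to 1)."""
    ordered = sorted(neutral_fsts)
    return bisect_left(ordered, gene_fst) / len(ordered)

def is_extreme(gene_fst, neutral_fsts, tail=0.05):
    """True if the gene's FST falls in the upper `tail` of the neutral distribution."""
    return neutral_rank(gene_fst, neutral_fsts) > 1.0 - tail

def mann_whitney_u(ranks_a, ranks_b):
    """U statistic for region A: number of (a, b) pairs with a > b; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in ranks_a for b in ranks_b)

# Toy data (hypothetical): 100 neutral FST values for one population pair.
neutral = [i / 100 for i in range(100)]
print(neutral_rank(0.505, neutral))   # a gene near the middle of the neutral values
print(is_extreme(0.995, neutral))     # a gene above every neutral value
print(mann_whitney_u([0.99, 0.98], [0.96, 0.97]))  # region A ranks vs. region B ranks
```

A larger U for region A means that its extreme FST ranks tend to exceed those of region B, which is the direction shown in red in figure 2. A full analysis would convert U into a two-sided P value.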
If changes in dietary Se pose weak selective pressures in humans, we expect a similar pattern in selenoprotein genes.

Selenoprotein Genes

The pattern in selenoprotein genes (fig. 2A) is quite different from that of their Cys-containing homologs (fig. 2A). We observe significant signatures of local positive selection in selenoprotein genes, with several world region comparisons below or close to the 5% significance level. These include comparisons that involve Central South Asia (P = 0.035 vs. Oceania), East Asia (P = 0.11 vs. Africa, P = 0.180 vs. Middle East, P = 0.133 vs. Europe, P = 0.027 vs. Oceania), and America (P = 0.010 vs. Africa, P = 0.025 vs. Middle East, P = 0.011 vs. Europe, P = 0.054 vs. Central South Asia, P = 0.053 vs. East Asia, P = 0.004 vs. Oceania). Thus, local allele frequency differentiation in selenoprotein genes departs from neutral expectations significantly more in Asia and America than in other regions of the world. Furthermore, in these regions they depart from neutral expectations significantly more than their Cys-containing paralogs when all FST ranks are considered (P = 2.53 × 10^-4 in Central South Asia, P < 2.20 × 10^-16 in East Asia, P = 4.87 × 10^-11 in America). This is consistent with selenoprotein genes, unlike genes that use Cys in their proteins, having experienced local adaptation in some regions of the world.

Genes Involved in the Regulation of Se and Sec

Interestingly, the pattern in genes that regulate Se and Sec (fig. 2A) resembles that of selenoprotein genes rather than that of their Cys-containing paralogs, with the strongest differences in allele frequencies found in Central South Asia (P = 0.207 vs. Africa, P = 0.057 vs. America) and East Asia (P = 0.147 vs. Africa, P = 0.173 vs. Europe, P = 0.028 vs. America). Overall, the differences among pairs of world regions are somewhat weaker in these genes than in selenoprotein genes.
Still, they also depart from neutral expectations significantly more than the Cys-containing paralogs when all FST ranks are considered (P < 2.20 × 10^-16 in Central South Asia, P < 2.20 × 10^-16 in East Asia). This is consistent with regulatory changes in the homeostasis of Se and Sec having experienced local adaptation in human populations in Asia. These regulatory adaptations are likely to have a broader physiological impact than adaptations in selenoprotein genes.

Se Deficiency in China

The majority of populations from East Asia, a region with signatures of positive selection (fig. 2A), are from China (supplementary table S1, Supplementary Material online). Because China has large Se deficient areas (Oldfield 2002; Xia et al. 2005), we further investigated whether these signatures are due to local differences in dietary Se intake. To do this, we compared the FST ranks of Se-related genes from six Chinese populations living in known Se deficient areas to eight Chinese populations living in areas that are not deficient (supplementary table S7, Supplementary Material online) (see Materials and Methods). The group of populations living in Se deficient areas has a significantly stronger enrichment in highly differentiated genes than populations living in nondeficient areas (P = 1.42 × 10^-10 for selenoproteins, P = 2.00 × 10^-5 for regulation of Se and Sec). Furthermore, as a group, populations living in Se deficient areas show significant signatures of positive selection compared with other world regions (fig. 2B) in selenoprotein genes (P = 3.22 × 10^-4 vs. Africa, P = 3.16 × 10^-3 vs. Middle East, P = 1.02 × 10^-4 vs. Europe, P = 1.51 × 10^-3 vs. Central South Asia, P = 1.72 × 10^-3 vs. Oceania) and genes involved in the regulation of Se and Sec (P = 6.62 × 10^-3 vs. Africa, P = 0.04 vs. Middle East, P = 7.49 × 10^-3 vs. Europe, P = 0.05 vs. Central South Asia, P = 5.84 × 10^-4 vs. America).
This signature is absent in the set of Chinese populations living in Se adequate areas (fig. 2B). This is consistent with Se-related genes having experienced local adaptation in populations from geographic regions whose soil does not provide adequate levels of dietary Se.

Polygenic Adaptation

We next investigated which particular genes, among the selenoproteins and genes involved in the regulation of Se and Sec, contribute most to the signatures of local adaptation in Asia. We aimed to determine whether the signatures of positive selection in these regions are spread across genes. For each gene we counted the number of times its FST values appear in the upper 5% tail of the neutral FST distribution, and measured the contribution of each gene as its fraction in the total number of extreme FST values (fig. 1B and Materials and Methods). Signatures of positive selection in Asia are the result of allele frequency shifts in many genes, each of them having a contribution to the extreme FST values in this region (supplementary table S4, Supplementary Material online). In fact, only three and six genes do not contribute to the signatures in selenoproteins from East Asia and Central South Asia, respectively (supplementary tables S4 and S5, Supplementary Material online). For genes involved in the regulation of Se and Sec, the number of genes not contributing is two and five, respectively (supplementary tables S4 and S5, Supplementary Material online). This suggests that local adaptation was polygenic in nature in Asia, with SNPs in several genes changing in allele frequency in response to differences in environmental Se.

Still, some genes have a particularly large contribution to these signatures of local adaptation. The five selenoprotein genes that show the strongest evidence of local adaptation in East Asia already have 75.6% of the extreme FST values in this world region.
These genes are, in order: DI2, SelS, GPx1, SelM, and Sel15 (table 2 and supplementary table S4, Supplementary Material online). With the addition of DI3, the same genes contribute the most to the signatures of positive selection in Central South Asia (supplementary table S5, Supplementary Material online). In addition, multiple populations contribute to these signatures of positive selection in Asia, with the Hezhen, Naxi and Oroqen populations in East Asia and the Balochi, Brahui, Kalash, Sindhi and Uygur in Central South Asia contributing the most.

The five regulatory genes that show the strongest evidence of local adaptation in East Asia have 64.4% of the extreme FST values in this world region. These genes are, in order: CELF1, SPS2, SEPSECS, ELAVL1 and Ser-tRNASec (table 2, fig. 3, and supplementary table S4, Supplementary Material online). With the addition of SelP, the same genes contribute the most to the signatures in Central South Asia (table 2 and supplementary table S5, Supplementary Material online).

We note that the picture in America is very different, as the signatures from the allele frequency differentiation in selenoprotein genes (fig. 2A) are almost exclusively due to the DI2 gene (supplementary table S6, Supplementary Material online), particularly in the Colombian population.

In any case, we do not find a significant excess of low frequency or high frequency derived alleles in any of the 50 candidate genes using classic neutrality tests, such as Tajima’s D and Fay & Wu’s H (see Materials and Methods). The neutral expectation for these tests was based on coalescent simulations for the genes and populations in our study using a modified demography from Gutenkunst et al. (2009), which accounts for the low admixture of Europeans with the American populations analyzed (see Materials and Methods). We conclude that polygenic adaptation was important in those genes that use or regulate Se and Sec in Asia.
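The per-gene contribution illustrated in fig. 1B can be reproduced with a short Python sketch. It follows the rank-value scheme of the figure (lowest extreme FST rank gets value 1, the next 2, and so on, then values are summed per gene); the function names assign_rank_values and contribution_shares are hypothetical, and the normalization over the genes passed in is an assumption made to match the percentages shown in the figure.

```python
def assign_rank_values(hits):
    """hits: (gene, fst_rank) pairs from the top 5% tail.
    The lowest FST rank gets value 1, the next 2, and so on."""
    ordered = sorted(hits, key=lambda hit: hit[1])
    values = {}
    for value, (gene, _) in enumerate(ordered, start=1):
        values.setdefault(gene, []).append(value)
    return values

def contribution_shares(rank_values_by_gene):
    """Sum of rank values per gene and its percentage of the summed total."""
    sums = {gene: sum(vals) for gene, vals in rank_values_by_gene.items()}
    total = sum(sums.values())
    return {gene: (s, 100.0 * s / total) for gene, s in sums.items()}

# Rank values for genes X and Y as shown in fig. 1B.
shares = contribution_shares({"X": [20, 17, 16, 13], "Y": [14, 8, 5, 1]})
for gene, (rank_sum, percent) in shares.items():
    print(gene, rank_sum, round(percent, 1))
```

With the fig. 1B values this gives a sum of 66 (70.2%) for gene X and 28 (29.8%) for gene Y, so gene X contributes more to the shift in FST ranks.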
Discussion

An interesting question in biology is whether shifts in micronutrient intake as humans migrated around the world or changed their dietary practices are an important selective pressure in humans. This is the case for certain macronutrients, such as starch (Perry et al. 2007) and lactose (Tishkoff et al. 2007), but the effect of micronutrients such as Se remains largely unexplored. We have addressed this question by surveying the patterns of polymorphism in selenoprotein genes and genes involved in the regulation of Se and Sec in 50 human populations, which have a range of Se nutritional histories.

Although a few regions in Africa, the Middle East, Europe or Oceania have abnormal levels of Se (Koivistoinen and Huttunen 1986; Thomson 2004; Hurst et al. 2013), Se deficiency or toxicity has not been broadly reported (Oldfield 2002; Safaralizadeh et al. 2005; Selinus and Alloway 2005; Novakovic et al. 2014), with no Se-related diseases being common in these regions (Oldfield 2002; Selinus and Alloway 2005; Novakovic et al. 2013). This agrees well with the absence of signatures of local positive selection in these world regions (fig. 2A). In contrast, levels of Se in the diet vary widely across some parts of Asia (Oldfield 2002), with Pakistan, where most Central South Asian populations were sampled (supplementary tables S1 and S2, Supplementary Material online), having levels of Se in the soil that are borderline deficient across large areas (Khan et al. 2006, 2008; Ahmad et al. 2009). This agrees with the signatures of positive selection identified in populations from these regions in both selenoprotein genes and genes involved in the regulation of Se and Sec (fig. 2A). Furthermore, the most Se deficient areas worldwide, where Se-related diseases are endemic (Keshan Disease Research Group of the Chinese Academy of Medical Sciences 1979; Moreno-Reyes et al. 2001; Oldfield 2002; Xia et al. 2005), are found in China in East Asia. This agrees with the signatures of local adaptation identified in those populations in China living in Se deficient areas (fig. 2B). Thus, local adaptation influenced not only selenoprotein genes but also the tight regulation in the use of Se that has been observed in humans under deficiency conditions (Allmang et al. 2009; Burk and Hill 2009; Hatfield et al. 2012). The absence of signatures in Cys-containing genes (fig. 2A) shows that the differentiation observed in Se-related genes is not due to the genetic drift experienced by populations in East Asia (Keinan et al. 2007).

Table 2. Genes and SNPs Contributing Most to the Signatures of Positive Selection in East Asia (figs. 2 and 3).

Gene Rank | Gene | SNP Rank | SNP ID in SelenoDB 2.0 | Chromosome | Coordinate (hg19) | Gene Region | SNP ID in dbSNP
Selenoproteins
1 | DI2 | 1 | SNP00000190_2.0 | 14 | 80,677,639 | Coding/Intron | -
1 | DI2 | 2 | SNP00000163_2.0 | 14 | 80,669,618 | Coding/3'-UTR | -
1 | DI2 | 3 | SNP00000193_2.0 | 14 | 80,677,659 | Coding | -
1 | DI2 | 16 | SNP00000160_2.0 | 14 | 80,669,580 | 3'-UTR/Coding | rs225014
2 | SelS | 1 | SNP00005709_2.0 | 15 | 101,815,150 | Intron | -
2 | SelS | 2 | SNP00005693_2.0 | 15 | 101,814,579 | Intron | -
2 | SelS | 3 | SNP00005729_2.0 | 15 | 101,817,144 | Intron | -
2 | SelS | 15 | SNP00005750_2.0 | 15 | 101,817,876 | Promoter region | rs34713741
3 | GPx1 | 1 | SNP00000352_2.0 | 3 | 49,395,766 | 5'-UTR/Promoter region | -
3 | GPx1 | 2 | SNP00000327_2.0 | 3 | 49,394,636 | 3'-UTR | -
3 | GPx1 | 3 | SNP00000309_2.0 | 3 | 49,393,640 | Downstream region | -
3 | GPx1 | 10 | SNP00000330_2.0 | 3 | 49,394,834 | 3'-UTR/Coding | rs1050450
3 | GPx1 | 11 | SNP00000377_2.0 | 3 | 49,396,360 | Promoter region | rs3811699
4 | SelM | 1 | SNP00005262_2.0 | 22 | 31,498,917 | Downstream region | -
4 | SelM | 2 | SNP00005302_2.0 | 22 | 31,500,490 | Downstream region | -
4 | SelM | 3 | SNP00005326_2.0 | 22 | 31,501,788 | Intron | -
5 | Sel15 | 1 | SNP00001985_2.0 | 1 | 87,381,198 | Promoter region | -
5 | Sel15 | 2 | SNP00001936_2.0 | 1 | 87,379,501 | Intron | -
5 | Sel15 | 3 | SNP00001965_2.0 | 1 | 87,380,644 | Promoter region | -
Biosynthesis and incorporation of Sec into proteins
1 | CELF1 | 1 | SNP0005905_2.0 | 11 | 47,500,661 | Intron | -
1 | CELF1 | 2 | SNP0005814_2.0 | 11 | 47,490,809 | Downstream region/3'-UTR | -
1 | CELF1 | 3 | SNP0005961_2.0 | 11 | 47,521,325 | Promoter region/Intron | -
2 | SPS2 | 1 | SNP00001698_2.0 | 16 | 30,457,350 | 5'-UTR | -
2 | SPS2 | 2 | SNP00001710_2.0 | 16 | 30,459,119 | Promoter region | -
2 | SPS2 | 3 | SNP00001663_2.0 | 16 | 30,455,458 | 3'-UTR | -
3 | SEPSECS | 1 | SNP00001861_2.0 | 4 | 25,161,929 | Coding | -
3 | SEPSECS | 2 | SNP00001794_2.0 | 4 | 25,146,251 | Intron | -
3 | SEPSECS | 3 | SNP00008709_2.0 | 4 | 25,161,942 | Coding | -
4 | ELAVL1 | 1 | SNP0006324_2.0 | 19 | 8,027,942 | 3'-UTR/Downstream region | -
4 | ELAVL1 | 2 | SNP0006321_2.0 | 19 | 8,027,784 | 3'-UTR/Downstream region | -
4 | ELAVL1 | 3 | SNP0006262_2.0 | 19 | 8,026,170 | 3'-UTR/Downstream region | -
5 | Ser-tRNASec | 1 | - | 17 | 38,273,561 | - | -
5 | Ser-tRNASec | 2 | - | 17 | 38,270,885 | - | -
5 | Ser-tRNASec | 3 | - | 17 | 38,274,085 | - | -

NOTE: The SNP ID in SelenoDB 2.0 is given for all SNPs in protein-coding genes. The SNP ID in dbSNP is given for those SNPs with known functional consequences discussed in the text.

FIG. 3. Genes involved in the biosynthesis of Sec and its incorporation into proteins (in orange) that contribute most to the signatures of positive selection in regulatory genes from East Asian populations.

We are more cautious in the interpretation of the signatures of positive selection in America (fig. 2A). First, they are observed only in selenoprotein genes and are largely due to a single gene, the DI2 gene. So, it is unclear whether Se availability (rather than other selective pressures in this gene) is responsible for them. Second, the signatures are not widespread in this world region but strongest in one particular population, the Colombian. In any case, little is known about the levels of dietary Se in humans in America.
Although white muscle disease, a symptom of Se deficiency, is quite prevalent across livestock from North and South America (Oldfield 2002), and at least ten Latin American and Caribbean countries are thought to be Se deficient based on the composition of their animal feeds (McDowell 1976), broad Se deficiency in humans has yet to be reported in America (Oldfield 2002).

The selenoprotein genes contributing to the signatures of positive selection in Asia (table 2) contain SNPs with known functional consequences, some also with high levels of population differentiation. First, DI2, an oxidoreductase that catalyses the conversion of the iodine-dependent hormone T4 to its active form T3 in the thyroid, has a Thr to Ala substitution (rs225014) in exon 2 with high FST values for many East Asian populations. Tissue samples from individuals homozygous for the Ala allele at this locus have been shown to exhibit significantly lower DI2 enzyme velocity than samples with Ala/Thr or Thr/Thr genotypes (Canani et al. 2005). The activity of this enzyme thus depends on two micronutrients, Se and iodine, whose deficiency has been linked to Kashin–Beck disease in China (Yao et al. 2011). Second, SelS, a gene involved in stress response in the endoplasmic reticulum and with a role in mediating inflammation (Curran et al. 2005), has a noncoding G to A substitution (rs34713741) associated with colorectal cancer risk in European and Asian populations (Meplan et al. 2010; Sutherland et al. 2010). This SNP shows high FST values across many populations in East Asia. Finally, GPx1, an antioxidant enzyme that reduces hydrogen peroxide thereby protecting cells from oxidative damage, has a Pro to Leu amino acid substitution (rs1050450) and a noncoding A to G SNP (rs3811699) in the promoter region (Hamanishi et al. 2004) with high FST across many East Asian populations. This gene and substitution have been previously suggested to be under positive selection in Asia (Foster et al. 2006).
Interestingly, the derived Leu allele at rs1050450, which is found at lower frequencies in East Asia than in the rest of the world (excluding America), results in 40% lower GPx1 activity than the ancestral Pro allele (Hamanishi et al. 2004). In addition, the derived G allele at rs3811699 in GPx1 was shown in the same study to confer a 25% decrease in transcriptional activity and is again found at lower frequencies in East Asia than elsewhere (except America). This predominance of the ancestral alleles at these loci, with higher transcriptional activity and higher enzymatic activity, could reflect adaptation to the low Se levels across much of the region. On the other hand, it is interesting that the regulatory processes with the strongest signatures of selection in Asia (table 2 and fig. 3) are the biosynthesis of Sec by SPS2 and SEPSECS and the insertion of Sec into proteins by CELF1 and ELAVL1, which regulate SBP2, the SECIS (selenocysteine insertion sequence) binding protein that enables the Sec amino acid carried by Ser-tRNASec to be inserted at the UGA codon. It has been suggested that as SPS2 is itself a selenoprotein, and therefore dependent on the supply of its own product, it could serve as an auto-regulator of selenoprotein synthesis (Guimaraes et al. 1996). In this regard, SBP2 has been described by some as the “master regulator” of selenoprotein synthesis (Bubenik et al. 2009) but SNPs within SBP2 itself contribute little to the signatures of positive selection found in Asia. Genes that regulate SBP2 represent an alternative target where adaptations that would alter the expression of the entire selenoproteome could occur. 
It is also interesting that, despite the hierarchical prioritization of Se supply to different tissues being an important homeostatic mechanism when dietary Se intake is insufficient (Schomburg and Schweizer 2009), the Se transport protein SelP and its tissue-specific receptors (Megalin and ApoER2) do not appear to be the predominant targets for natural selection in Asia. Nevertheless, SelP and ApoER2, the SelP receptor in the brain and testis, contribute to the signatures of selection in Central South Asia and East Asia whereas Megalin, the kidney SelP receptor, does not. This is compatible with adaptations to maximize the uptake of Se by these tissues under Se-deficient conditions in this region. Brain and testis are known to accumulate Se when it is scarce at the expense of other tissues (Burk and Hill 2009), which may be explained by the sensitivity of neurons and sperm to oxidative damage due to their above average oxygen consumption (Steinbrenner and Sies 2009). Indeed, the impairment of SelP and/or ApoER2 leads to severe neurological defects and reduced fertility in mice (Burk and Hill 2009; Steinbrenner and Sies 2009).

In conclusion, we provide evidence that genetic changes in selenoprotein genes and genes that regulate the use of Se were important for humans to adapt to environments that do not provide adequate levels of dietary Se. Given the relevance of these adaptations to the ability of humans to inhabit Se-deficient environments, and to individual susceptibility to Se-related conditions (Xiong et al. 2010), a better understanding of the functional consequences of the alleles underlying local adaptation is needed. It is, however, possible that the magnitude of their functional impact is in line with the magnitude of their differentiation, which is spread across genes, making such future assessment challenging.
In any case, our understanding of local adaptations to environmental Se will only improve with better knowledge of the levels of Se in the diet of human populations worldwide. Today, the lack of comparable quantitative data on soil, plant or dietary Se levels across the world hampers the correlation of human Se status with genetic data. Progress on the biology of micronutrients and the evolution of their use by humans is thus likely to come from linking alleles at functional sites with precise measurement of these essential trace elements across the world.

Materials and Methods

Candidate Genes

We compiled a list of 50 human genes (table 1). This list includes the 25 selenoprotein genes in humans and the 6 genes that encode proteins with sulfur (in the form of cysteine, Cys) instead of Se (in the form of selenocysteine, Sec) (Kryukov et al. 2003). These are paralogs of selenoprotein genes and their functions may be related to their Se-dependent counterparts. We have also included 19 genes involved in the homeostasis and metabolism of Se and Sec (Allmang et al. 2009). These are genes whose function and regulation may depend on levels of Se in the diet. We used the human selenoprotein gene annotation from SelenoDB 1.0 (Castellano et al. 2008), which includes the annotation of the SECIS element of each selenoprotein gene. The annotation of the Cys-containing homologs was also obtained from SelenoDB 1.0. The annotation of genes involved in the homeostasis and metabolism of Se and Sec, as well as the alternative transcript forms for all genes, was obtained from RefSeq (Pruitt et al. 2009). For analysis, the 50 genes were split into three groups as defined in table 1: 1) selenoprotein genes, 2) Cys-containing paralogous genes, and 3) genes involved in the regulation of Se and Sec, which includes the SelP and SPS2 selenoprotein genes.

Pseudogenes

Following Andres et al.
(2010), we used a group of 48 neutrally evolving loci to derive the neutral expectation for some of the analyses (supplementary table S3, Supplementary Material online). These loci are unlinked (physically dispersed) known pseudogenes that are greater than 400 nt in length and located over 2,000 nt from any known gene. To ensure that these pseudogenes are neutral with respect to natural selection, they 1) do not overlap evolutionary conserved regions of the human genome; 2) do not include simple repeats or low complexity DNA; 3) do not differ from the rest of the human genome in terms of recombination rate; 4) are present as pseudogenes in chimpanzee, orangutan, and rhesus; and 5) are present in a single copy in the human and chimpanzee genomes or, when a member of a gene family, only one member is included. In addition, processed ribosomal pseudogenes and olfactory receptor pseudogenes are not included. The length of these pseudogenes ranges from 422 to 2,832 nt, and their combined length is 43,700 nt.

Human Samples

We used samples from the Foundation Jean Dausset-Centre d’Etude du Polymorphisme Humain; Human Genome Diversity Panel (HGDP-CEPH) (supplementary table S1, Supplementary Material online). We restricted our analysis to the H952 HGDP-CEPH set detailed in Rosenberg (2006), which excludes individuals with second-degree relationships or closer. Of this set 940 individuals were available for sequencing. Additionally, we removed one Naxi individual (HGDP01339) prior to variant calling as a result of likely contamination during the sequencing process. The set used for variant calling consisted of 928 individuals (supplementary table S2, Supplementary Material online). These individuals from 53 populations were grouped in one of the following geographical regions for analysis: Africa, Middle East, Europe, Central South Asia, East Asia, Oceania, and America.
Previous analyses have shown the Middle East, Central South Asia, and America to contain inbred populations (Pemberton et al. 2012), which may bias analyses based on allele frequency differences among populations. We therefore excluded from our analysis the population from each of these regions having the most inbred individuals (Pemberton et al. 2012) (supplementary table S2, Supplementary Material online). Our final analysis is thus based on 855 individuals from 50 worldwide populations.

Array Capture Design

We designed an Agilent custom array (Agilent Technologies) to target all exons (coding and noncoding untranslated regions, UTRs) plus 400 nt of the surrounding introns (200 nt at each side of each exon) and 2,000 nt upstream (to include promoter regions) of all candidate genes (table 1). Additionally, for genes where the target region was shorter than 4,000 nt we added 2,000 nt of extra downstream region and for tRNA genes 3,000 nt were added upstream and downstream. The complete coding region of each pseudogene was targeted (supplementary table S3, Supplementary Material online). Using the human reference sequence (hg18) as a template, we designed a total of 469,695 probes (419,468 and 50,227 for the candidate and the putatively neutrally evolving genes, respectively). The probes were 60 nt in length, started 50 nt before and after each targeted region, and were set every 1 nt to maximize capture efficiency. The probes target a total of 524,875 nt (470,800 and 54,083 nt for the candidate and the putatively neutrally evolving genes, respectively) of the human genome.

Library Preparation, Capture, and Sequencing

We prepared 940 libraries from the HGDP-CEPH samples following a double index protocol (Meyer and Kircher 2010). The genomic DNA (100 µl, 10 ng/µl) of each sample was sonicated (Bioruptor) to fragments of 150–300 nt in length. Target capture was performed in batches of pooled libraries with around 90 samples per pool (Hodges et al. 2009). The enriched libraries were amplified and sequenced using the Illumina GAIIx platform yielding 76-bp paired-end reads.

Variant Calling

Base-calling was performed with Ibis (Kircher et al. 2009). Reads were mapped to the human reference genome (hg19) using BWA (Li and Durbin 2009) yielding an average on-target coverage of 20-fold per individual and per gene after filtering out duplicate reads and those with a mapping quality less than 25. The GATK IndelRealigner was used to improve read alignment in indel regions (McKenna et al. 2010). We called genotypes using the GATK UnifiedGenotyper version 2.2 (DePristo et al. 2011), and filtered them to remove genotypes in sites that 1) had a coverage below 8 in more than 50% of samples, 2) had an average coverage above 100, 3) were indels or SNPs within 5 bp of an indel, 4) were multiallelic sites, 5) had an SNP quality less than 20, and 6) had a strand bias greater than 10. We additionally filtered out sites that did not have 1:1 human–chimp correspondence in the Ensembl EPO 6 primate alignments (Paten et al. 2008) or were identified as being prone to systematic error (Castellano et al. 2014). Genotypes with a quality (GQ) less than 20 were masked out but the positions retained unless all genotypes at that position had a GQ less than 20. This resulted in 7,989 and 1,035 SNPs in the candidate genes and pseudogenes, respectively, across populations.

Population Differentiation

We measured population differentiation with the FST statistic according to the formula of Weir and Cockerham (1984). This formula calculates FST per SNP in an unbiased way (that accounts for both genetic and statistical sampling) using an analysis of variance approach. The formula represents the proportion of the overall genetic diversity that is accounted for by allele frequency differences between pairs of populations.
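As an illustration of the per-SNP calculation, the following minimal Python sketch computes the variance components of the Weir and Cockerham (1984) estimator for one biallelic SNP from genotype counts. It is not the code used in this study: the function name `wc_fst_components` and the input layout (one tuple of genotype counts per population) are assumptions made for the example. The function returns the numerator and denominator separately, so that they can later be summed across SNPs.

```python
from typing import Sequence, Tuple


def wc_fst_components(
    genotype_counts: Sequence[Tuple[int, int, int]]
) -> Tuple[float, float]:
    """Return (numerator, denominator) of Weir and Cockerham's FST for one SNP.

    genotype_counts holds one (n_AA, n_Aa, n_aa) tuple per population,
    counting only individuals with a genotype call (hypothetical layout).
    """
    r = len(genotype_counts)                   # number of populations
    sizes = [sum(g) for g in genotype_counts]  # called individuals per population
    if r < 2 or min(sizes) < 2:
        raise ValueError("need two or more populations, each with >= 2 calls")

    # Frequency of allele A and observed heterozygosity in each population.
    freqs = [(2 * hom + het) / (2 * n)
             for (hom, het, _), n in zip(genotype_counts, sizes)]
    hets = [het / n for (_, het, _), n in zip(genotype_counts, sizes)]

    n_total = sum(sizes)
    n_bar = n_total / r
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (r - 1)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / n_total
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / n_total

    # Variance components: a (among populations), b (among individuals
    # within populations), c (between gametes within individuals).
    shared = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = (n_bar / n_c) * (s2 - (shared - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (shared - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2
    return a, a + b + c


# Two toy population pairs: a fixed difference and identical genotype counts.
fixed_num, fixed_den = wc_fst_components([(10, 0, 0), (0, 0, 10)])
same_num, same_den = wc_fst_components([(3, 4, 3), (3, 4, 3)])
```

In this sketch a fixed difference between two populations gives a ratio of 1, whereas two samples with identical genotype counts give a slightly negative ratio because of the sampling correction.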
As originally defined by Wright (1951), FST values range from 0 (no population differentiation) to 1 (no allele sharing among populations); however, the Weir and Cockerham (1984) method can result in negative FST values which were considered as 0 in our analyses. We did not calculate FST values for sites where fewer than 50% of individuals in a population had a genotype call. To calculate an FST value per gene (based on all SNPs across that gene), we used a weighted average by first summing the numerator over all SNPs then summing the denominator over all SNPs before finding the quotient (Weir and Cockerham 1984). Thus, the resulting value is not the same as the average of single SNP FST values. We calculated FST between each pair of populations in supplementary table S2, Supplementary Material online, and for each pair of populations, an FST value was obtained for each gene in table 1.

Genes and SNPs Contributing to the Signatures of Positive Selection

An overview of this analysis can be found in figure 1. To identify the genes that contribute most to the patterns in figure 2, we used the rank of their FST values in the 5% upper tail of the neutral distribution. We then sorted the ranks of these extreme FST values in ascending order and gave the first rank a value of 1, the next rank a value of 2, and so on (fig. 1B). Any identical ranks were given the same value. All values for each gene were then summed (fig. 1B). This produces a combined measure of the number of times the FST of a gene departs from the neutral expectation between populations in two geographical regions and the strength of such departure. Genes can achieve a high summed value by having many FST values within the 5% tail or by having a few very extreme FST values, or more usually by a combination of both. The genes with the highest summed values contribute most to the signatures of positive selection shown in figure 2A.
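The per-gene FST and the summed-rank score described above can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' implementation. The names `gene_fst` and `rank_sum_scores`, the record layout, and the use of a single per-pair cutoff standing in for the 5% upper tail of the neutral distribution are all assumptions. Here the extreme FST values themselves are ranked, with tied values sharing a rank value.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def gene_fst(components: Iterable[Tuple[float, float]]) -> float:
    """Per-gene FST as a ratio of sums over SNPs; negative values become 0.

    components holds one (numerator, denominator) pair per SNP.
    """
    pairs = list(components)
    numerator = sum(num for num, _ in pairs)
    denominator = sum(den for _, den in pairs)
    if denominator <= 0:
        return 0.0
    return max(0.0, numerator / denominator)


def rank_sum_scores(
    records: Iterable[Tuple[str, str, float]],
    cutoffs: Dict[str, float],
) -> Dict[str, int]:
    """Sum rank values of extreme FST values per gene.

    records: (gene, population_pair, fst) tuples (hypothetical layout).
    cutoffs: population_pair -> FST cutoff assumed to mark the 5% upper
             tail of the neutral (pseudogene) distribution for that pair.
    """
    extreme = [(gene, fst) for gene, pair, fst in records if fst > cutoffs[pair]]
    # Smallest extreme FST gets value 1, the next distinct value 2, and so on;
    # identical FST values share the same rank value.
    distinct = sorted({fst for _, fst in extreme})
    value_of = {fst: i for i, fst in enumerate(distinct, start=1)}
    scores: Dict[str, int] = defaultdict(int)
    for gene, fst in extreme:
        scores[gene] += value_of[fst]
    return dict(scores)


# Toy data with placeholder gene and population-pair names.
toy_records = [
    ("geneA", "pair1", 0.30),
    ("geneA", "pair2", 0.20),
    ("geneB", "pair1", 0.20),
    ("geneC", "pair1", 0.05),
]
toy_scores = rank_sum_scores(toy_records, {"pair1": 0.10, "pair2": 0.10})
```

With the toy data, geneA accumulates a higher score than geneB because it has both more extreme values and a more extreme value, which is the combined behavior the summed measure is meant to capture. The ratio of sums in `gene_fst` generally differs from the mean of per-SNP ratios, because SNPs with larger denominators carry more weight.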
We present these genes in table 2 and supplementary tables S4–S6, Supplementary Material online. We then took the five highest ranking genes in East Asia from each group (table 2 and supplementary table S4, Supplementary Material online) and performed the same ranking process using the per-SNP FST values from each of these genes, this time summing values for each SNP across pairs of populations (supplementary tables S8 and S9, Supplementary Material online). This indicates those SNPs within each gene with the highest contribution to the signatures of positive selection in East Asia in figure 2A. We further classified the populations from China into two groups: those populations sampled from within the most Se deficient regions (pink areas of map in Oldfield, page 45; less than 0.02 ppm of Se based on local animal feeds and forages) (Oldfield 2002) and those populations that were sampled from regions likely to be Se adequate (more than 0.02 ppm of Se) or where it was difficult to tell (supplementary table S7, Supplementary Material online), making our test conservative. A two-sided Mann–Whitney U test was then performed using vectors of FST ranks from these two groups to ascertain whether the signals of positive selection in Se-related genes are stronger in populations that are likely to be Se deficient (fig. 2B).

Classic Neutrality Tests

We computed Tajima’s D and Fay & Wu’s H statistics (Tajima 1989; Fay and Wu 2000) using the Perl program “Neutrality Test Pipeline” (NTP) (Andres et al. 2010). Tajima’s D compares the number of segregating sites (S) against the average nucleotide diversity (π) in a gene. S and π are both independent estimators of θ (4Neμ) and are expected to be equal under neutrality, but behave differently when a population evolves nonneutrally. Significantly negative values of Tajima’s D are consistent with positive or purifying selection (excess of low frequency alleles).
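For readers unfamiliar with the statistic, the following Python sketch computes Tajima’s D (Tajima 1989) from per-site allele counts. It is meant only to make the comparison between S and π concrete and is not the NTP or msstats implementation. The function name `tajimas_d` and the input layout are assumptions, and no P value is computed, because that requires the coalescent simulations described below.

```python
import math
from typing import Optional, Sequence


def tajimas_d(allele_counts: Sequence[int], n: int) -> Optional[float]:
    """Tajima's D for a sample of n chromosomes.

    allele_counts gives, for each site, the number of chromosomes carrying
    one of the two alleles (hypothetical input layout). Sites with a count
    of 0 or n are monomorphic and ignored. Returns None when no site
    segregates.
    """
    if n < 4:
        raise ValueError("the variance of D is zero for fewer than 4 chromosomes")
    segregating = [k for k in allele_counts if 0 < k < n]
    s = len(segregating)
    if s == 0:
        return None

    # Average number of pairwise differences (nucleotide diversity, pi).
    pi = sum(2 * k * (n - k) for k in segregating) / (n * (n - 1))

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy samples of 4 chromosomes: all singletons versus all
# intermediate-frequency variants.
d_singletons = tajimas_d([1, 1, 1, 1], 4)
d_intermediate = tajimas_d([2, 2, 2, 2], 4)
```

An excess of low-frequency alleles (the singleton example) lowers π relative to S and gives a negative D, whereas an excess of intermediate-frequency alleles gives a positive D.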
Fay & Wu’s H can help to distinguish positive selection from purifying selection and population size expansion in cases where Tajima’s D is negative. A negative Fay & Wu’s H detects the excess of high-frequency-derived alleles produced when an advantageous variant has rapidly increased to high frequencies pulling linked variation with it. H has the greatest power to detect selection when the selected allele is very close to being fixed or has just fixed in the population. NTP runs a modified version of msstats (Thornton 2003) on genotype input data and computes a P value for each output statistic based on supplied coalescent simulations. Each gene and each control region were run individually. Fay & Wu’s H test requires an outgroup in order to distinguish the derived allele from the ancestral one. The inferred sequence of the human–chimp ancestor for each gene and control region was obtained from Ensembl (Flicek et al. 2013) and added to the NTP input. There were a number of positions at which the inferred ancestral allele did not match either the present human reference or alternative allele. These positions were removed prior to the NTP analysis because in this situation neither allele can be assigned as being derived. We used the Benjamini and Hochberg method (Benjamini and Hochberg 1995) for multiple test correction.

Coalescent Simulations

We simulated a neutral expectation using a slightly modified four population demography (Africa, Europe, Asia, and America) from Gutenkunst et al. (2009). This demographic scenario accounts for both population expansion and admixture between populations and includes a 48% admixture between the European and American populations after the split between the Asian and European populations. This model was based on a population of Mexican ancestry living in the United States. The American populations in our data set show very little European admixture (Li et al.
2008) and so this European and American admixture was removed from our demography. The use of this demographic scenario required the assignment of our 50 populations to the four populations included in the demography. This was done with reference to the k = 4 plot in Li et al. (2008). This grouped the populations as shown in supplementary table S2, Supplementary Material online, for NTP. We performed 10,000 coalescent simulations with ms (Hudson 2002) for each gene and each pseudogene in each HGDP-CEPH population. Population and gene/pseudogene-specific details, for example, the number of chromosomes and segregating sites, were used in each simulation. NTP accounts for missing genotype data by marking those positions as also missing in the ms simulations.

Data Deposition

Human gene annotations and their SNPs in the HGDP-CEPH populations are available in SelenoDB 2.0 (Romagne et al. 2014) at http://www.selenodb.org.

Supplementary Material

Supplementary tables S1–S9 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors thank Raymond Burk, Kristina Hill, and Alain Krol for help with the list of genes involved in the homeostasis and metabolism of Se and Sec. They also thank Mark Stoneking for access to the HGDP-CEPH samples and Ryan Gutenkunst for advice about population admixture in these samples. They also thank Felix M. Key for help with the NTP and ms analyses. This work was supported by the Max Planck Society.

References

Ahmad K, Khan ZI, Ashraf M, Shah ZA, Ejaz A, Valeem EE. 2009. Time-course changes in selenium status of soil and forage in a pasture in Sargodha, Punjab, Pakistan. Pak J Bot. 41:2397–2401.
Allmang C, Wurth L, Krol A. 2009. The selenium to selenoprotein pathway in eukaryotes: more molecular partners than anticipated. Biochim Biophys Acta. 1790:1415–1423.
Andres AM, Dennis MY, Kretzschmar WW, Cannons JL, Lee-Lin SQ, Hurle B, Schwartzberg PL, Williamson SH, Bustamante CD, Nielsen R, et al. 2010. Balancing selection maintains a form of ERAP2 that undergoes nonsense-mediated decay and affects antigen presentation. PLoS Genet. 6:e1001157.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 57:289–300.
Bhutta ZA, Salam RA. 2012. Global nutrition epidemiology and trends. Ann Nutr Metab. 61(Suppl. 1):19–27.
Bifano AL, Atassi T, Ferrara T, Driscoll DM. 2013. Identification of nucleotides and amino acids that mediate the interaction between ribosomal protein L30 and the SECIS element. BMC Mol Biol. 14:12.
Bubenik JL, Ladd AN, Gerber CA, Budiman ME, Driscoll DM. 2009. Known turnover and translation regulatory RNA-binding proteins interact with the 3′ UTR of SECIS-binding protein 2. RNA Biol. 6:73–83.
Burk RF, Hill KE. 2009. Selenoprotein P-expression, functions, and roles in mammals. Biochim Biophys Acta. 1790:1441–1447.
Caballero B. 2002. Global patterns of child health: the role of nutrition. Ann Nutr Metab. 46(Suppl. 1):3–7.
Canani LH, Capp C, Dora JM, Meyer EL, Wagner MS, Harney JW, Larsen PR, Gross JL, Bianco AC, Maia AL. 2005. The type 2 deiodinase A/G (Thr92Ala) polymorphism is associated with decreased enzyme velocity and increased insulin resistance in patients with type 2 diabetes mellitus. J Clin Endocrinol Metab. 90:3472–3478.
Cann HM, de Toma C, Cazes L, Legrand MF, Morel V, Piouffre L, Bodmer J, Bodmer WF, Bonne-Tamir B, Cambon-Thomsen A, et al. 2002. A human genome diversity cell line panel. Science 296:261–262.
Castellano S, Andres AM, Bosch E, Bayes M, Guigo R, Clark AG. 2009. Low exchangeability of selenocysteine, the 21st amino acid, in vertebrate proteins. Mol Biol Evol. 26:2031–2040.
Castellano S, Gladyshev VN, Guigo R, Berry MJ. 2008. SelenoDB 1.0: a database of selenoprotein genes, proteins and SECIS elements.
Nucleic Acids Res. 36:D332–D338.
Castellano S, Parra G, Sanchez-Quinto FA, Racimo F, Kuhlwilm M, Kircher M, Sawyer S, Fu Q, Heinze A, Nickel B, et al. 2014. Patterns of coding variation in the complete exomes of three Neandertals. Proc Natl Acad Sci U S A. 111:6666–6671.
Curran JE, Jowett JB, Elliott KS, Gao Y, Gluschenko K, Wang J, Abel Azim DM, Cai G, Mahaney MC, Comuzzie AG, et al. 2005. Genetic variation in selenoprotein S influences inflammatory response. Nat Genet. 37:1234–1241.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Diamond J. 2002. Evolution, consequences and future of plant and animal domestication. Nature 418:700–707.
Dumitrescu AM, Liao XH, Abdullah MS, Lado-Abeal J, Majed FA, Moeller LC, Boran G, Schomburg L, Weiss RE, Refetoff S. 2005. Mutations in SECISBP2 result in abnormal thyroid hormone metabolism. Nat Genet. 37:1247–1252.
Fay JC, Wu CI. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405–1413.
Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, et al. 2013. Ensembl 2013. Nucleic Acids Res. 41:D48–D55.
Foster CB, Aswath K, Chanock SJ, McKay HF, Peters U. 2006. Polymorphism analysis of six selenoprotein genes: support for a selective sweep at the glutathione peroxidase 1 locus (3p21) in Asian populations. BMC Genet. 7:56.
Fu Q, Mittnik A, Johnson PL, Bos K, Lari M, Bollongino R, Sun C, Giemsch L, Schmitz R, Burger J, et al. 2013. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr Biol. 23:553–559.
Gromer S, Johansson L, Bauer H, Arscott LD, Rauch S, Ballou DP, Williams CH Jr, Schirmer RH, Arner ES. 2003. Active sites of thioredoxin reductases: why selenoproteins?
Proc Natl Acad Sci U S A. 100:12618–12623.
Guimaraes MJ, Peterson D, Vicari A, Cocks BG, Copeland NG, Gilbert DJ, Jenkins NA, Ferrick DA, Kastelein RA, Bazan JF, et al. 1996. Identification of a novel selD homolog from eukaryotes, bacteria, and archaea: is there an autoregulatory mechanism in selenocysteine metabolism? Proc Natl Acad Sci U S A. 93:15086–15091.
Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. 2009. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 5:e1000695.
Hamanishi T, Furuta H, Kato H, Doi A, Tamai M, Shimomura H, Sakagashira S, Nishi M, Sasaki H, Sanke T, et al. 2004. Functional variants in the glutathione peroxidase-1 (GPx-1) gene are associated with increased intima-media thickness of carotid arteries and risk of macrovascular diseases in Japanese type 2 diabetic patients. Diabetes 53:2455–2460.
Hatfield DL, Berry MJ, Gladyshev VN. 2012. Selenium: its molecular biology and role in human health. New York: Springer.
Hodges E, Rooks M, Xuan Z, Bhattacharjee A, Benjamin Gordon D, Brizuela L, Richard McCombie W, Hannon GJ. 2009. Hybrid selection of discrete genomic intervals on custom-designed microarrays for massively parallel sequencing. Nat Protoc. 4:960–974.
Hudson RR. 2002. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18:337–338.
Hurst R, Siyame EW, Young SD, Chilimba AD, Joy EJ, Black CR, Ander EL, Watts MJ, Chilima B, Gondwe J, et al. 2013. Soil-type influences human selenium status and underlies widespread selenium deficiency risks in Malawi. Sci Rep. 3:1425.
Keinan A, Mullikin JC, Patterson N, Reich D. 2007. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nat Genet. 39:1251–1255.
Keshan Disease Research Group of the Chinese Academy of Medical Sciences. 1979. Observations on effect of sodium selenite in prevention of Keshan disease.
Chin Med J (Engl). 92:471–476.
Khan ZI, Ashraf M, Danish M, Ahmad K, Valeem EE. 2008. Assessment of selenium content in pasture and ewes in Punjab, Pakistan. Pak J Bot. 40:1159–1162.
Khan ZI, Hussain A, Ashraf M, McDowell R. 2006. Mineral status of soils and forages in southwestern Punjab-Pakistan: micro-minerals. Asian-Aust J Anim Sci. 19:1139–1147.
Kircher M, Stenzel U, Kelso J. 2009. Improved base calling for the Illumina Genome Analyzer using machine learning strategies. Genome Biol. 10:R83.
Koivistoinen P, Huttunen JK. 1986. Selenium in food and nutrition in Finland. An overview on research and action. Ann Clin Res. 18:13–17.
Kryukov GV, Castellano S, Novoselov SV, Lobanov AV, Zehtab O, Guigo R, Gladyshev VN. 2003. Characterization of mammalian selenoproteomes. Science 300:1439–1443.
Latreche L, Jean-Jean O, Driscoll DM, Chavatte L. 2009. Novel structural determinants in human SECIS elements modulate the translational recoding of UGA as selenocysteine. Nucleic Acids Res. 37:5868–5880.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, et al. 2008. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319:1100–1104.
McDowell LR. 1976. Mineral deficiencies and toxicities and their effect on beef production in developing countries. In: Smith AJ, editor. Beef cattle production in developing countries. Edinburgh (United Kingdom): University of Edinburgh, Centre for Tropical Veterinary Medicine. p. 216–241.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297–1303.
Meplan C, Hughes DJ, Pardini B, Naccarati A, Soucek P, Vodickova L, Hlavata I, Vrana D, Vodicka P, Hesketh JE. 2010. Genetic variants in selenoprotein genes increase risk of colorectal cancer. Carcinogenesis 31:1074–1079.
Mertz W. 1981. The essential trace elements. Science 213:1332–1338.
Meyer M, Kircher M. 2010. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb Protoc. 2010:pdb.prot5448.
Moreno-Reyes R, Egrise D, Neve J, Pasteels JL, Schoutens A. 2001. Selenium deficiency-induced growth retardation is associated with an impaired bone metabolism and osteopenia. J Bone Miner Res. 16:1556–1563.
Novakovic R, Cavelaars A, Geelen A, Nikolic M, Altaba II, Vinas BR, Ngo J, Golsorkhi M, Medina MW, Brzozowska A, et al. 2014. Socioeconomic determinants of micronutrient intake and status in Europe: a systematic review. Public Health Nutr. 17:1031–1045.
Novakovic R, Cavelaars AE, Bekkering GE, Roman-Vinas B, Ngo J, Gurinovic M, Glibetic M, Nikolic M, Golesorkhi M, Medina MW, et al. 2013. Micronutrient intake and status in Central and Eastern Europe compared with other European countries, results from the EURRECA network. Public Health Nutr. 16:824–840.
Oldfield JE. 2002. Selenium world atlas. Grimbergen (Belgium): STDA.
Paten B, Herrero J, Beal K, Fitzgerald S, Birney E. 2008. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18:1814–1828.
Pemberton TJ, Absher D, Feldman MW, Myers RM, Rosenberg NA, Li JZ. 2012. Genomic patterns of homozygosity in worldwide human populations. Am J Hum Genet. 91:275–292.
Perry GH, Dominy NJ, Claw KG, Lee AS, Fiegler H, Redon R, Werner J, Villanea FA, Mountain JL, Misra R, et al. 2007. Diet and the evolution of human amylase gene copy number variation. Nat Genet. 39:1256–1260.
Pruitt KD, Tatusova T, Klimke W, Maglott DR. 2009. NCBI Reference Sequences: current status, policy and new initiatives. Nucleic Acids Res. 37:D32–D36.
Rayman MP. 2012. Selenium and human health. Lancet 379:1256–1268.
Romagne F, Santesmasses D, White L, Sarangi GK, Mariotti M, Hubler R, Weihmann A, Parra G, Gladyshev VN, Guigo R, et al. 2014. SelenoDB 2.0: annotation of selenoprotein genes in animals and their genetic diversity in humans. Nucleic Acids Res. 42:D437–D443.
Rosenberg NA. 2006. Standardized subsets of the HGDP-CEPH Human Genome Diversity Cell Line Panel, accounting for atypical and duplicated samples and pairs of close relatives. Ann Hum Genet. 70:841–847.
Safaralizadeh R, Kardar GA, Pourpak Z, Moin M, Zare A, Teimourian S. 2005. Serum concentration of selenium in healthy individuals living in Tehran. Nutr J. 4:32.
Schomburg L, Schweizer U. 2009. Hierarchical regulation of selenoprotein expression and sex-specific effects of selenium. Biochim Biophys Acta. 1790:1453–1462.
Selinus O, Alloway BJ. 2005. Essentials of medical geology: impacts of the natural environment on public health. Amsterdam (The Netherlands): Elsevier Academic Press.
Snider GW, Ruggles E, Khan N, Hondal RJ. 2013. Selenocysteine confers resistance to inactivation by oxidation in thioredoxin reductase: comparison of selenium and sulfur enzymes. Biochemistry 52:5472–5481.
Steinbrenner H, Sies H. 2009. Protection against reactive oxygen species by selenoproteins. Biochim Biophys Acta. 1790:1478–1485.
Steinmann D, Nauser T, Koppenol WH. 2010. Selenium and sulfur in exchange reactions: a comparative study. J Org Chem. 75:6696–6699.
Stringer CB, Andrews P. 1988. Genetic and fossil evidence for the origin of modern humans. Science 239:1263–1268.
Sutherland A, Kim DH, Relton C, Ahn YO, Hesketh J. 2010. Polymorphisms in the selenoprotein S and 15-kDa selenoprotein genes are associated with altered susceptibility to colorectal cancer. Genes Nutr. 5:215–223.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Thomson CD. 2004.
Selenium and iodine intakes and status in New Zealand and Australia. Br J Nutr. 91:661–672.
Thornton K. 2003. Libsequence: a C++ class library for evolutionary genetic analysis. Bioinformatics 19:2325–2327.
Tishkoff SA, Reed FA, Ranciaro A, Voight BF, Babbitt CC, Silverman JS, Powell K, Mortensen HM, Hirbo JB, Osman M, et al. 2007. Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet. 39:31–40.
Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population-structure. Evolution 38:1358–1370.
Wright S. 1951. The genetical structure of populations. Ann Eugen. 15:323–354.
Xia Y, Hill KE, Byrne DW, Xu J, Burk RF. 2005. Effectiveness of selenium supplements in a low-selenium area of China. Am J Clin Nutr. 81:829–834.
Xiong YM, Mo XY, Zou XZ, Song RX, Sun WY, Lu W, Chen Q, Yu YX, Zang WJ. 2010. Association study between polymorphisms in selenoprotein genes and susceptibility to Kashin-Beck disease. Osteoarthr Cartil. 18:817–824.
Yao Y, Pei F, Kang P. 2011. Selenium, iodine, and the relation with Kashin-Beck disease. Nutrition 27:1095–1100.