Genomic Drift and Evolution of Microsatellite DNAs in Human Populations Naoko Takezaki* and Masatoshi Nei *Life Science Research Center, Kagawa University, Japan; and Department of Biology, Institute of Molecular Evolutionary Genetics, Pennsylvania State University In recent years, copy number variation (CNV) of DNA segments has become a hot topic in the study of genetic variation, and a large amount of CNVs has been uncovered in human populations. The CNVs involving the smallest units of DNA segments are microsatellite DNAs, and the evolutionary change of microsatellite DNAs is believed to occur mostly by the increase or decrease of one repeat unit at a time in a more or less neutral fashion. If we note that eukaryotic genomes contain millions of microsatellite loci, this pattern of nucleotide change is expected to generate random changes of genome size, that is, genomic drift, and will provide a neutral model of CNV evolution. We therefore investigated the amount of variation of the total number of repeats (TNR) per individual concerned with 145 microsatellite loci in three human populations, Africans, Europeans, and Asians. It was shown that the TNR follows the normal distribution in all three populations and that the extent of variation of TNR is more than 50% greater in Africans than in Europeans and Asians as expected from the hypothesis of African origin of modern humans. If we consider all microsatellite loci in the human genome and compute the variation of the total number of nucleotides involved (TNN), it is possible to study the contribution of microsatellite loci to the genome size variation. This study has shown that the genome sizes of human individuals are affected considerably by genomic drift of microsatellite DNA alone. This pattern of evolution is similar to that of olfactory receptor (OR) genes previously studied in human populations and support the idea that the number of OR genes has evolved in a more or less neutral fashion. However, this conclusion does not necessarily apply to the genomewide CNVs of various DNA segments, and it appears that long variant DNA fragments are deleterious and under purifying selection. Introduction Microsatellite DNAs are tandem repeats of short nucleotides (1–6 bp) and are abundant in eukaryotic genomes. In microsatellite loci, the number of repeats changes rapidly, and the extent of polymorphism in populations is very high. The change of repeat number in microsatellite loci is generally caused by replication slippage, and the repeat number mostly increases or decreases by one repeat at a time. In practice, however, irregular changes such as multistep changes or point mutations are known to occur, and the number of repeats seems to show a directional change under certain circumstances (Ellegren 2000, 2004). Therefore, the actual mutational pattern of microsatellite loci is quite complicated and heterogeneous. In general, however, the repeat number is believed to change in a random fashion, roughly following the stepwise mutation model (SMM) (Di Rienzo et al. 1994; Ellegren 2000, 2004; Schlötterer 2000; Xu et al. 2000; Estoup et al. 2002), which was originally developed for electrophoretic loci (Ohta and Kimura 1973). Recently, many studies have shown that human populations harbor a substantial amount of copy number variation (CNV) of DNA segments caused by duplication, deletion, insertion, etc. (e.g., Sebat et al. 2004; Tuzun et al. 2005; Redon et al. 2006) and at least 10% of the human genome contains CNVs (Redon et al. 2006; Scherer et al. 2007; Wong et al. 2007). CNVs are also known to generate some complex medical disorders such as cancer and autoimmune diseases (e.g., Fanciulli et al. 2007; Hollox et al. 2008) and often occur in multigene families (e.g., olfactory receptor [OR] genes, innate immune systems genes, and b-defensin antimicrobial gene clusters). ThereKey words: microsatellite DNA, genomic drift, copy number variation, genome evolution, human evolution. E-mail: [email protected]. Mol. Biol. Evol. 26(8):1835–1840. 2009 doi:10.1093/molbev/msp091 Advance Access publication April 30, 2009 Ó The Author 2009. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] fore, CNVs seem to be an important source of genetic and phenotypic variation (Feuk et al. 2006; Freeman et al. 2006; Beckmann et al. 2007; Bailey et al. 2008). By analyzing Redon et al.’s (2006) data on DNA hybridization, Nozawa et al. (2007) found that the gene copy number of OR genes varies extensively within and between human populations and the number is normally distributed in each of the three populations examined (Africans, Europeans, and Asians). They then proposed that this variation is caused by duplication and deletion (or pseudonization) of genes that occur in a random fashion. Nei (2007) called this type of change the genomic drift. This proposition may be supported if the total number of repeats (TNR) involved in microsatellite loci also shows the normal distribution, because in these loci, the number of repeats is known to change in a more or less random fashion. For this reason, we have decided to study the distribution of TNR of microsatellite loci in the African, European, and Asian populations. We also examined the relative contributions of microsatellite DNAs and OR genes to the variation of human genome size. Materials and Methods Microsatellite DNA data were taken from Ramachandran et al. (2005) (http://rosenberglab.bioinformatics.med.umich. edu/data/ramachandranEtAl2005/combinedmicrosats-1027. stru). In this website, genotypic data of 783 autosomal microsatellite loci (45 dinucleotide, 175 trinucleotide, 555 tetranucleotide, and 8 pentanucleotide repeat loci) are available for 53 human populations (see also Takezaki and Nei 2008). From this data set, we chose 145 loci (9 dinucleotide, 23 trinucleotide, 112 tetranucleotide, and 1 pentanucleotide repeat loci) from Biaka Pygmy (N 5 32) in Africa, French (N 5 29) in Europe, and Han-Chinese (N 5 34) in Asia after elimination of loci with missing (null) allele data. The 145 loci used in this study appear to be good representatives of all 783 loci in the original data set of Ramachandran et al. (2005) 1836 Takezaki and Nei Table 1 Average Heterozygosities (H) and Average Number of Alleles per Locus (Na) for the Three Human Populations 145 Loci Populations African European Asian H 0.75 0.70 0.73 The second one was the (dl)2 distance (Goldstein et al. 1995), which is defined by ðdlÞ2 5 783 Loci Na H 7.2 6.3 6.4 0.76 0.70 0.73 Na 7.5 6.4 6.5 because the average heterozygosities and the average numbers of alleles per locus for the three populations are nearly the same for the data sets of 145 loci and 783 loci (table 1). The allele frequency data of the 145 loci for the three populations are available in Supplementary Material online. In the data set used here, microsatellite alleles were represented by the fragment size of the amplified DNA at each locus. The fragment for an allele included the flanking regions as well as the repeat region of the locus, and the actual number of repeats was not known. We computed the TNR for each individual from the total fragment size (nucleotides) of the two alleles divided by the repeat unit length of the locus for all 145 loci and then obtained the difference of this number from that of the individual who had the smallest number. This was done to eliminate the DNA segments in the flanking regions, and therefore, the TNR for the individual can be obtained approximately. In this paper, we call this difference the TNR and use it to study the variation of TNR among different individuals in populations. It should be noted that Nozawa et al. (2007) used CNV data for OR genes from Africans (Yoruba), Europeans (European descendants in Utah), and Asians (Han-Chinese and Japanese) obtained by Redon et al. (2006), but these samples are completely different from the data used in this study. The heterozygosity P for a locus was estimated by h5f2n=½2n 1gð1 xi2 Þ, where xi is the frequency of the ith allele in the sample, and n is the number of diploid individuals examined at the locus (Nei 1987). The average hetP erozygosity (H) for r loci was estimated by H5 rj51 hj =r, where hj is the value of h at the jth locus. The genetic distances between the populations were computed by two different distance measures. The first one was Nei’s (1972) standard genetic distance (DS), which is given by JXY DS 5 ln pffiffiffiffiffiffiffiffiffi ; JX JY Pr Pmj 2 Pr Pm j 2 where Pr PmJj X 5 j i xij =r and JY 5 j i yij =r; JXY 5 i xij yij =r; and xij and yij are the frequencies of the j ith allele at the jth locus in populations X and Y, respectively. r is the number of loci and mj is the number of alleles at the jth locus. Under the infinite allele model (IAM) (Kimura and Crow 1964) with mutation–drift balance, the expectation of DS, E[DS], is given by 2vt, where v is the mutation rate per locus per generation and t is the time measured in generations after the two populations diverged. r X ðlXj lYi Þ2 =r; j P P where lXj 5 i ixij and lYj 5 i iyij are the average repeat number of alleles at the jth locus, and xij and yij are the frequencies of the allele with repeat number (allele size) i at the jth locus in populations X and Y, respectively. Under the SMM with mutation–drift balance, E[(dl)2] 5 2vt and the value of (dl)2 is expected to increase linearly with time. Search of Microsatellite Loci in Genomic Sequence The members of different types of microsatellite loci in the human genome were estimated by comparing the human and chimpanzee chromosome 21 sequences. The alignment of human and chimpanzee genomic sequences is available at the UCSC genome web site (http://hgdownload.cse.ucsc.edu/ downloads.html) (Karolchik et al. 2003). Two kinds of aligned sequences of human and chimpanzee chromosome 21 (chr21.hg18.panTro2.net.axt and chr21.panTro2.hg18. net.axt) were downloaded. The two different alignments of the genome sequences were BlastZ hit (Schwartz et al. 2003) from human to chimpanzee sequences and from chimpanzee to human sequences. We used the regions of the aligned sequences in chr21.hg18.panTro2.net.axt that appeared only once in each of the two alignments. These regions add up to 28 Mb of the human–chimpanzee alignment, about 1% of the entire human genome. Uninterrupted microsatellite DNA repeat loci with repeat unit size 1–5 bp were searched by setting the minimum number of repeats of a locus to three and six. If locations of human and chimpanzee loci with the same repeat motif overlapped, the loci were regarded as orthologous. Otherwise, the loci were regarded as nonorthologous. Variable loci are loci for which the numbers of repeats are different for human and chimpanzee, and invariable loci are those with the same number of repeats for human and chimpanzee. Microsatellite loci were searched only in intergenic and intronic regions by using the information at the UCSC website. Results Variation of Microsatellite Loci in Human Populations Figure 1 shows the frequency distribution of TNR in the three human populations. The abscissa of this figure stands for the TNR summed up for all 145 microsatellite loci examined. The distribution approximately follows the normal distribution in each population (P 5 0.96, 0.53, and 0.92 for Africans, Europeans, and Asians, respectively, by the Kolmogorov–Smirnov test). The mean TNR is smaller in Africans than in Europeans and Asians (fig. 1). The mean difference in TNR between Africans and Europeans is statistically significant (P 5 0.03 by t-test), but the difference between Africans and Asians is not. Under the assumption of random changes of TNR, these differences are caused primarily by stochastic errors and do not necessarily Genomic Drift of Microsatellite DNAs 1837 Number of individuals 10 8 Table 2 MAD of TNR between Two Randomly Chosen Individuals within and between Populations Mean=75.1 SD=42.2 African N=32 6 4 0 10 Number of individuals African European Asian African European Asian 47.0 41.1 40.0 30.4 26.8 32.3 2 0 20 40 60 80 100 120 European N=29 8 140 160 180 200 Mean=95.3 SD=23.0 6 4 2 0 0 10 Number of individuals Population 20 40 60 80 100 120 Asian N=34 8 140 160 180 200 Mean=81.9 SD=28.7 6 4 2 0 0 20 40 60 80 100 120 140 160 180 200 Total repeat number FIG. 1.—Distributions of total repeat number of 145 microsatellite loci for individuals of African, European, and Asian populations. The TNR refers to the difference of the sum of repeat numbers for two alleles of all loci in an individual and the smallest number observed. indicate the effect of natural selection. The standard deviation (SD) of TNR in Africans is about 50–80% greater than that in non-Africans, and the difference is statistically significant (P 5 0.002–0.032 by the F-test). This observation is consistent with the theory of African origin of modern humans (see Nei and Roychoudhury 1982; Cann et al. 1987), which maintains that modern humans originated in Africa before 116,000 years ago and subsequently spread through the rest of the world. In this theory, it is likely that the founding populations of nonAfricans were derived from a subset of the African population and therefore the current non-African populations are expected to have a smaller genetic variation than the African population. To measure the extent of variation of TNR caused by stochastic changes, we computed the mean absolute difference (MAD) of TNN between two randomly chosen individuals within and between populations following Nozawa et al. (2007). Like the SD values, the MAD for Africans is about 50% greater than that for Europeans and Asians (table 2). The interpopulational MAD between Africans and non-Africans is also about 50% greater than that between Europeans and Asians. The estimates of Nei’s genetic distance (DS) and (dl)2 distance between Africans and non-Africans are about twice as large as the distance values between Europeans and Asians (table 3). The estimates of the both genetic distance measures for Africans and non-Africans are approximately two times higher than the estimates between Europeans and Asians. This observation is consistent with the previous results obtained by classical markers and other microsatellite DNA data (Nei and Roychoudhury 1982; Goldstein et al. 1995; Cavalli-Sforza and Feldman 2003). The expected value of DS distance increases linearly with the time after divergence of populations in mutation–drift equilibrium in the IAM, whereas the same linear increase with time holds for (dl)2 in the SMM (see Materials and Methods). Therefore, this finding suggests that the evolutionary change of microsatellite loci does not occur exactly following the IAM or SMM but lie somewhere between the two models (Li 1976; Ellegren 2000, 2004). The MAD of TNR between Africans and non-Africans is both about 1.5 times as large as the MAD between Europeans and Asians. This ratio is somewhat lower than that of the genetic distance values, but MAD appears to increase with time, as expected theoretically (Chakraborty and Nei 1982). Comparison of Variation of the TNN Between Microsatellite Loci and OR Genes It is interesting to study the relative contributions of microsatellite loci and OR genes to the genomic variation. However, this cannot be done by comparison of the distributions of TNR for the two groups of genes, because the number of nucleotides involved in a gene or repeat unit is not the same. In the case of OR genes, one gene is composed of about 930 nt (310 amino acids per protein), and Nozawa et al. (2007) showed that the SD of TNR is 10 in the African population. Therefore, it is possible to obtain the SD of the total number of nucleotides (TNN) (SD[TNN]) by multiplying SD by 930. (This is because the abscissa of the distribution of gene number has to be converted into the TNN involved.) Therefore, we have SD(TNN) 5 930 10 5 9,300. This indicates that the genomic variation caused by OR genes is substantial. Estimation of SD(TNN) for microsatellite loci in the entire genome is more complicated because the loci we used are not a random sample of all types of microsatellite loci. In our search of the microsatellite loci in the human chromosome 21, we found 70,103 mononucleotide, 2,989 dinucleotide, 188 trinucleotide, 287 tetranucleotide, and 26 pentanucleotide loci, which are orthologous between human and chimpanzee (table 4). Among these loci, 1838 Takezaki and Nei Table 3 Genetic Distances between Populations Genetic Distance African versus European African versus Asian European versus Asian DS (dl)2 0.23 0.25 0.10 1.29 1.18 0.60 there were 16,767, 1,974, 118, 217, and 21 variable loci (loci with different number of repeats between human and chimpanzee). Note that the estimate of the number of microsatellite loci varies depending on the minimum number of repeats used for each locus and some other factors in the search (International Human Genome Sequencing Consortium 2001; Takezaki N, unpublished data). Furthermore, the variability of microsatellite loci is positively correlated with the number of repeats (Ellegren 2000, 2004). We tried to include highly variable loci by setting the minimum number of repeats in the search to six. Although the variability of microsatellite loci seems to be affected by many other factors, we assume that all types of microsatellite loci are equally polymorphic to have a crude estimate of the effect of variation of TNN in microsatellite loci on the genome size. We also assume that the variable loci on the entire genome are 100 times the numbers of loci found on chromosome 21 because the genomic region examined was about 1% of the entire human genome (Materials and Methods). Then, the approximate value of SD(TNN) for the entire genome in the African population may be computed in the following way: First, the SD value of TNR for the 145 microsatellite loci was 42.2 in the African population (fig. 1). Here we note that if the number of repeats evolves independently at each of the r loci following the SMM model, the variance of total repeat number for r loci is given by rV, where V is the variance of repeat number for a locus (Chakraborty p and pffiffiffiffi ffiffi Nei 1982). Therefore, the SD of the TNR for r loci is r SD1 , where SD1 5 V . From the p SD ffiffiffiffiffiffiffiffi value of the 145 loci, we obtain SD1 542:2= 14553:5. For the r loci with repeat unit length b, the variance of the total nucleotide number [SD(TNN)]2 is given by b2rV where V 5 (SD1)2. If we use these results, the variance for the TNR of microsatellite loci on the genome can be computed as [SD(TNN)]2 5 (12 16,767 100 þ 22 1,974 100 þ 32 118 100 þ 42 217 100 þ 52 21 100) 3.52 5 36,409,450 5 (6,034)2. Therefore, SD(TNN) is 6,034. This estimate is slightly smaller than that of SD(TNN) for OR genes. Yet, this number is likely to be an overestimate because experimentalists tend to study highly polymorphic loci rather than low-polymorphic loci. For this reason, it is interesting to know a minimum estimate of SD(TNN) considering only the loci with tri- and tetra-, and pentanucleotide repeats. The variance of total nucleotide number [SD(TNN)]2 for these loci is (32 118 100 þ 42 217 100 þ 52 21 100) 3.52 5 6,197,252 5 (2,489)2. Therefore, SD(TNN) is 2,489. The number of loci for mono- and dinucleotide repeat loci is nearly 50 times greater than the number of tri-, tetra-, and pentanucleotide loci (table 4). In addition, there are some nonorthologous loci (table 4) and a large number of variable loci with a small repeat number, which may also contribute to genome size variation (see the numbers of microsatellite loci with the minimum repeat number set to three in supplementary table S1, Supplementary Material online). Therefore, our estimate of SD(TNN) 5 2,489 is almost certainly a minimum estimate, though it is very crude. If we use this minimum estimate, the expected difference [4 SD(TNN)] between the numbers of the upper and lower 5% levels of individuals in the African population is about 9,956 nt. This estimate indicates that the contribution of microsatellite loci to the genome size variation is substantial mainly because of the large number of loci in the genome. Yet, it appears to be smaller than that of OR genes, where the large repeat (gene) size contributes to the genomic variation. In the above computation, we considered only the African population. In the European and Asian populations the SD of the total repeat number is about 65–68% of that of the African. Therefore, the extent of variation of TNN is expected to be smaller in these populations, yet the variation is substantial. This is true with the SD(TNN) of OR genes as well (Nozawa et al. 2007). Discussion We have seen that the number of repeats involved in the microsatellite loci in the human genome approximately follows the normal distribution as expected from the Table 4 Numbers of Microsatellite Loci in Human Populations Identified from the Comparison of Genome Sequences of Human and Chimpanzee Chromosome 21 Orthologous Loci Repeat Unit Mononucleotide Dinucleotide Trinucleotide Tetranucleotide Pentanucleotide Total Nonorthologous Loci Variable Invariable All Human Chimpanzee 16,767 1,974 118 217 21 19,097 53,336 1,015 70 70 5 54,496 70,103 2,989 188 287 26 73,593 5,087 790 128 408 88 6,501 4,871 727 129 264 38 6,029 The minimum of the number of repeats used for the search of microsatellite loci was set to six in the search of microsatellite loci. Variable: The loci in which the repeat numbers are different for human and chimpanzee. Invariable: The loci in which the repeat numbers are the same for human and chimpanzee. Genomic Drift of Microsatellite DNAs 1839 mutational pattern of these loci, and this observation suggests that a similar normal distribution observed with the number of OR genes is also caused by random events of genomic drift. However, even if CNV is generated by genomic drift, the gene copy number may not necessarily show a normal distribution if the number of loci involved is small. For example, the number of gene copies responsible for taste receptors in mammalian species is of the order of 20–30 per genome (Nei et al. 2008), and consequently the CNV of this gene family shows a skewed distribution in human populations apparently because of sampling errors. By contrast, if we consider the CNVs for the entire genome, the CNVs are apparently generated by insertions, deletions, and inversions of various sizes of DNA fragments, some fragments being as large as tens of thousands of nucleotides (Kidd et al. 2008), and the fragment size apparently shows a lognormal distribution (Perry et al. 2008). This means that the large variant DNA fragments are more frequent than expected under the normal distribution. In fact, the most frequent class of variant fragments was that for about 10 kb, but there were fragments of the order of megabases in low frequency (Perry et al. 2008). If this distribution is stationary, it suggests either that long fragments are generated with a lower frequency than short fragments or that the former are generally deleterious and eventually eliminated from the population. In practice, the latter hypothesis is more likely to be true than the former, and neutral CNVs seem to be limited to certain groups of gene families. At the genome level, many new large DNA insertions and deletions are probably eliminated because of the association with complex medical disorders. In general, the CNVs in the human genome are heterogeneous, and some are apparently deleterious, whereas others are more or less neutral or even advantageous. It would be important to know the evolutionary significance of these CNVs in the future. In this case, microsatellite loci may be used as a neutral model of CNVs. Supplementary Material The allele frequency data of 145 loci for Africans, Europeans, and Asians and supplementary table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Literature Cited Bailey JA, Kidd JM, Eichler EE. 2008. Human copy number polymorphic genes. Cytogenet Genome Res. 123:234–243. Beckmann JS, Estivill X, Antonarakis SE. 2007. Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nat Rev Genet. 8:639–646. Cann RL, Stoneking M, Wilson AC. 1987. Mitochondrial DNA and human evolution. Nature. 325:31–36. Cavalli-Sforza L, Feldman MW. 2003. The application of molecular genetic approaches to the study of human evolution. Nat Genet. 33:266–275. Chakraborty R, Nei M. 1982. Genetic differentiation of quantitative characters between populations or species. I. Mutation and random genetic drift. Genet Res. 39:303–314. Di Rienzo A, Peterson AC, Garza JC, Valdes AM, Slatkin M, Freimer NB. 1994. Mutational processes of simple-sequence repeat loci in human populations. Proc Natl Acad Sci USA. 91:3166–3170. Ellegren H. 2000. Microsatellite mutations in the germline: implications for evolutionary inference. Trend Genet. 16: 551–558. Ellegren H. 2004. Microsatellites: simple sequences with complex evolution. Nat Rev Genet. 5:435–445. Estoup A, Jarne P, Cornuet J. 2002. Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Mol Ecol. 11: 1591–1604. Fanciulli M, Norsworthy PJ, Petretto E, et al. (20 co-authors). 2007. FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity. Nat Genet. 39:721–723. Feuk L, Carson AR, Scherer SW. 2006. Structural variation in the human genome. Nat Rev Genet. 7:85–97. Freeman JL, Perry GH, Fuek L, et al. (13 co-authors). 2006. Copy number variation: new insights in genome diversity. Genome Res. 16:949–961. Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL, Feldman MW. 1995. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc Natl Acad Sci USA. 92: 6723–6727. Hollox E, Huffmeier U, Zeeuwen PL, et al. (13 co-authors). 2008. Psoriasis is associated with increased beta-defensin genomic copy number. Nat Genet. 40:23–25. International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature. 409:860–921. Karolchik D, Baertsch R, Diekhans M, et al. (13 co-authors). 2003. The UCSC Genome Browser Database. Nucl Acids Res. 31:51–54. Kidd JM, Cooper GM, Donahue WF, et al. (46 co-authors). 2008. Mapping and sequencing of structural variation from eight human genomes. Nature. 453:56–64. Kimura M, Crow JF. 1964. The number of alleles that can be maintained in a finite populations. Genetics. 49: 523–538. Li WH. 1976. Mixed model of mutation for electrophoretic identity of proteins within and between populations. Genetics. 83:423–432. Nei M. 1972. Genetic distance between populations. Am Nat. 106: 203–291. Nei M. 1987. Molecular evolutionary genetics. New York: Columbia University Press. Nei M. 2007. The new mutation theory of phenotypic evolution. Proc Natl Acad Sci USA. 104:12235–12242. Nei M, Niimura Y, Nozawa M. 2008. The evolution of animal chemosensory receptor gene repertoires: roles of chance and necessity. Nat Rev Genet. 9:951–963. Nei M, Roychoudhury AK. 1982. Genetic relationship and evolution of human races. Evol Biol. 14:1–59. Nozawa M, Kawahara Y, Nei M. 2007. Genomic drift and copy number variation of sensory receptor genes in humans. Proc Natl Acad Sci USA. 104:20147–20148. Ohta T, Kimura M. 1973. A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet Res. 22:201–204. Perry GH, Ben-Dor A, Tsalenko A, et al. (17 co-authors). 2008. The fine-scale and complex architecture of human copynumber variation. Am J Hum Genet. 82:685–695. Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza LL. 2005. Support from the relationship of genetic and geographic distance in human 1840 Takezaki and Nei populations for a serial founder effect originating in Africa. Proc Natl Acad Sci USA. 102:15942–15947. Redon R, Ishikawa S, Fitch KR, et al. (43 co-authors). 2006. Global variation in copy number in the human genome. Nature. 444:428–429. Scherer SW, Lee C, Birney E, Altshuler DM, Eichler EE, Carter NP, Hurles ME, Feuk L. 2007. Challenges and standards in integrating surveys of structural variation. Nat Genet. 39:S7–S15. Schlötterer C. 2000. Evolutionary dynamics of microsatellite DNA. Chromosoma. 109:365–371. Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. 2003. Human-mouse alignments with BLASTZ. Genome Res. 13:103–107. Sebat J, Lakshmi B, Troge J, et al. (21 co-authors). 2004. Large-scale copy number polymorphism. Science. 305: 525–528. Takezaki N, Nei M. 2008. Empirical tests of the reliability of phylogenetic trees constructed with microsatellite DNA. Genetics. 178:385–392. Tuzun E, Sharp AJ, Bailey JA, et al. (12 co-authors). 2005. Finescale structural variation of the human genome. Nat Genet. 37: 727–732. Wong KK, deLeeuw RJ, Dosanjh NS, et al. (11 co-authors). 2007. A comprehensive analysis of common copy-number variations in the human genome. Am J Hum Genet. 80:91–104. Xu X, Peng M, Fang Z, Xu X. 2000. The direction of microsatellite mutations is dependent upon allele length. Nat Genet. 24:396–399. Takashi Gojobori, Associate Editor Accepted April 22, 2009
© Copyright 2026 Paperzz