MEC1122.fm Page 2109 Saturday, November 11, 2000 2:57 PM
Molecular Ecology (2000) 9, 2109–2118

Genetic isolation by distance in Arabidopsis thaliana: biogeography and postglacial colonization of Europe

TIMOTHY F. SHARBEL, BERNHARD HAUBOLD and THOMAS MITCHELL-OLDS
Department of Genetics and Evolution, Max Planck Institut für Chemische Ökologie, Carl Zeiss Promenade 10, 07745 Jena, Germany
Blackwell Science, Ltd

Abstract

Arabidopsis thaliana provides a useful model system for functional, evolutionary and ecological studies in plant biology. We have analysed natural genetic variation in A. thaliana in order to infer its biogeographical and historical distribution across Eurasia. We analysed 79 amplified fragment length polymorphism (AFLP) markers in 142 accessions from the species’ native range, and found highly significant genetic isolation by distance among A. thaliana accessions from Eurasia and southern Europe. These spatial patterns of genetic variation suggest that A. thaliana colonized central and northern Europe from Asia and from Mediterranean Pleistocene refugia, a trend which has been identified in other species. Statistically significant levels of multilocus linkage disequilibrium suggest intermediate levels of disequilibrium among subsets of loci, and analysis of genetic relationships among accessions reveals a star or bush-like dendrogram with low bootstrap support. Taken together, it appears that there has been sufficient historical recombination in the A. thaliana genome such that accessions do not conform to a tree-like, bifurcating pattern of evolution: there is no ‘ecotype phylogeny’. Nonetheless, significant isolation by distance provides a framework upon which studies of natural variation in A. thaliana may be designed and interpreted.
Keywords: AFLP, Arabidopsis thaliana, biogeography, linkage disequilibrium, postglacial colonization

Received 20 May 2000; revision received 22 July 2000; accepted 24 July 2000

Correspondence: Thomas Mitchell-Olds. Fax: +49–3641–643668; E-mail: [email protected]

© 2000 Blackwell Science Ltd

Introduction

Arabidopsis thaliana provides a useful model system for functional, evolutionary and ecological studies in plant biology and genetics. Although positional cloning and insertional mutagenesis have been used to elucidate genetic influences on development and physiology, studies of natural genetic variation are also becoming increasingly common (Alonso-Blanco & Koornneef 2000). Besides contributing to functional genomics, analyses of naturally occurring quantitative genetic variation can elucidate the evolutionary causes and consequences of molecular variation in quantitative traits (Mitchell-Olds 1995). It is, therefore, important to understand historical and ecological influences on genetic variation in A. thaliana.

A. thaliana reproduces almost exclusively through selfing (Redei 1975; Abbott & Gomes 1989). Observed patterns of genetic variation are consistent with this life style: most individuals are highly inbred, little heritable variation exists within populations, and most genetic variation is found among populations (Hanfstingl et al. 1994; Todokoro et al. 1995; Bergelson et al. 1998; Breyne et al. 1999; Miyashita et al. 1999). A. thaliana is native to Eurasia and North Africa (Price et al. 1994; O’Kane & Al-Shehbaz 1997), and has become widely naturalized in the Western Hemisphere following European colonization. A. thaliana is a poor competitor which occupies disturbed environments early in succession. It is commonly found in agricultural fields and other disturbed sites associated with human activity (e.g. Bergelson et al. 1998; Mauricio 1998).
Previous studies of molecular markers have identified no association between genetic polymorphisms and geographical location (King et al. 1993; Todokoro et al. 1995; Bergelson et al. 1998; Miyashita et al. 1999), suggesting that the original biogeographic patterns in A. thaliana have been obscured by human disturbance. However, it is possible that phylogeographic structure could be detectable within the original species range in Eurasia.

Pleistocene changes in climate and vegetation have influenced the geographical range and genetic variation of many European species during the past 135 000 years (Comes & Kadereit 1998; Hewitt 1996). A. thaliana likely occupied Europe during glacial and interglacial periods, thus its present distribution may have been influenced by repeated episodes of glacial advance and retreat. Briefly, ice sheets covered parts of Britain and northern Europe as well as the major mountain ranges during the Pleistocene, while the plains of central Europe were tundra-like and characterized by permafrost (Hewitt 1996; Willis 1996; Comes & Kadereit 1998). Consequently, much of Europe’s flora and fauna were forced southward into three main glacial refugia: the Iberian Peninsula, Southern Italy, and the Balkan region (Konnert & Bergmann 1995; Hewitt 1996; Comes & Kadereit 1998). It has also been hypothesized that parts of Scandinavia may have been a glacial refugium, as there are indications that the Norwegian coast was ice-free (Forsström & Punkari 1997), although there has been little biogeographical evidence to support this. Other regions east and south (e.g. south-west Asia) were warmer during this time, but they suffered from reduced precipitation levels (Willis 1996) and as a result may have lacked suitable habitats for some species.
Many plants recolonized Europe from these refugia during the present interglacial period, with species-specific differences in recolonization rates and patterns (Hewitt 1996; Comes & Kadereit 1998). A. thaliana may have undergone similar Pleistocene migrations, and this may be detectable using molecular markers. A number of characteristics of A. thaliana are consistent with such a colonization pattern. First, the influx of genetically divergent populations from different glacial refugia should lead to relatively higher interpopulation genetic variability in Europe compared to any single glacial refugium (Cooper et al. 1995; Leonardi & Menozzi 1995; Schmidtling & Hipkins 1998). In support of this, elevated interpopulation genetic variability has repeatedly emerged from interecotype (i.e. accession) analyses of A. thaliana (Hanfstingl et al. 1994; Todokoro et al. 1995; Bergelson et al. 1998; Breyne et al. 1999; Miyashita et al. 1999). While this high level of genetic variation among populations has for the most part been attributed to its selfing nature, a portion of this differentiation may have resulted from isolation and divergence in disjunct refugia. Second, populations which have undergone independent evolution in different glacial refugia should be characterized by molecular markers unique to each region (see Comes & Kadereit 1998; Purugganan & Suddith 1999). This phenomenon may be reflected in the preponderance of low frequency polymorphisms in A. thaliana (see Fig. 2 in Miyashita et al. 1999a), although other phenomena (i.e. population bottlenecks) cannot be discounted. Third, one would predict that populations in areas which have acted as refugia over repeated glaciations should together encompass most of the genetic variability in present-day Europe (Comes & Kadereit 1998). 
To support this view, it has been shown that genetic analyses of a few accessions can account for most of the variability contained in larger samples (albeit from the analysis of a single locus, Hanfstingl et al. 1994).

Therefore, we have undertaken a study of genetic variation across a large sample of A. thaliana accessions using amplified fragment length polymorphism (AFLP) markers (Vos et al. 1995). Our aim was to genotype a large number of Eurasian accessions using markers scattered throughout the genome in order to detect biogeographic trends in Europe and Asia. As it is clear that glacial refugia have influenced genetic variation in many species (Willis 1996; Newton et al. 1999), our results show that geography and history are important determinants of molecular and quantitative genetic variation in A. thaliana.

Materials and methods

Samples

We sampled one genotype per accession because previous studies have found most genetic variation among populations, and little polymorphism within populations (Todokoro et al. 1995; Bergelson et al. 1998). Individuals of 142 accessions (Table 1) were grown under identical conditions (light/dark cycle) from seeds obtained from the Arabidopsis Biological Resource Center (The Ohio State University), the Nottingham stock centre and independent collectors. After 6 weeks, leaves from three to five individuals per accession were pooled, flash frozen in liquid nitrogen, and DNA isolated using a Nucleon (Amersham Pharmacia Biotech Europe GmbH) extraction kit. DNA quality and concentration were assessed by restriction digestion and visualization of 5 µL of the product on 0.7% TAE-agarose gels.

AFLP analysis

From 0.5 to 1.0 µg of genomic DNA, per individual, was digested with MseI (1 unit) and EcoRI (5 units; New England Biolabs) and ligated to polymerase chain reaction (PCR) adapters following the Ligation and Preselective Amplification Module for Small Plant Genomes (P/N 402004) procedure from Applied Biosystems.
All restriction-ligation reactions were incubated at 17 °C overnight, and PCRs were run on a GeneAmp PCR System 9600 thermal cycler. An initial screening using 64 selective primer combinations was performed on 10 accessions using the Selective Amplification Start-up Module for Small Plant Genomes (P/N 402006, Applied Biosystems), following which the quality and number of polymorphic bands were assessed and three primer combinations were chosen for further analysis (EcoRI TG × MseI CTA; EcoRI TG × MseI CAT; EcoRI TG × MseI CTT).

Table 1 Arabidopsis thaliana accessions from which AFLP genotypes were generated, grouped by geographical region

Geographic region: Accession
Africa: Cvi-0, Ita-0, Jl-3, Mt-0
Asia: Condara, cs1074, cs22482, cs22484, cs22485, cs22486, cs22488, cs22491, cs22492, cs22493, cs22495, cs6179, cs6180, cs931, Hodja, Kas-1, Perm-1, Ms-0, Rsch-0, Stw-0, Ws-0, Ws-3, En-T
British Isles: Lc-0, Bur-0, Cnt-1, Edi-0, Lan-0, Su-0, Kil-0, Cal-0, Ty-0
Central Europe: Aa-0, Ag-0, Ak-1, An-2, Bch-1, Blh-1, Br-0, Bs-1, Bsch-0, Bu-0, Ca-0, Cha-0, Cha-1, Cit-0, Da(1)-12, Db-1, Di-0, Di-1, Di-g, Do-0, Dr-0, Ei-2, Eil-0, El-0, En-0, En-1, Ep-0, Estland, Fe-1, Fi-0, Ga-0, Gd-1, Ge-0, Gie-0, Goe-0, Gü-0, Gy-0, Ha-0, Hh-0, Hl-0, Hn-0, Je54, Jm-0, Ka-0, Kae-0, Kb-0, Kl-0, Kl-5, Ko-2, Kr-0, Kro-0, Kz, Le-0, Ler, Li-0, Li-5, Lip-0, Lm-2, Lö-1, Lö-2, Lz-0, Ma-0, Me-0, Mh-0, Nd-0, No-0, Mrk-0, Mz-0, Nd-1, Nok-0, Np-0, Nw-0, Ob-0, Ove-0, Pi-0, Po-0, Pr-0, Rak-2, RLD1, Ru-0, Sap-0, Sav-0, Ste-0, Ta-0, Uk-1, Wei-0, Wl-0, Wt-1
Iberian Peninsula: Bla-1, Bla-10, Co-1, Ll-0, Pla-0, Sah-0, Sf-1
Scandinavia: Fl-1, Lu-1, Oy-0, Oy-1, Te-0
Southern Italy: Bl-1, Ct-1, Mr-0, Pa-1, Pa-3, Tu-1
All samples were processed in random order, and independent AFLP reactions were performed on duplicate samples for internal control. A sequencer loading mix was made by combining 1.2 µL deionized formamide, 0.46 µL GeneScan-500 (ROX) internal size standard, 0.34 µL blue loading dye and 2 µL of the selective amplification product. This mixture was denatured at 95 °C for 4 min, and snap-cooled in an ice-water mixture before being run on an ABI Prism 377 Genetic Analyser. The GS Run 36D-2400 module was run using the following collection parameters: 4 h (run time); 2500 V (run voltage); 50 mA (current); 200 W (power); and 60 °C (gel temperature). Raw data were collected using the ABI Prism GeneScan Analysis Software (Applied Biosystems), and sample files were aligned using the internal size standard (ROX 500). Aligned data were subsequently imported into Genographer (version 1.1.0, © Montana State University, 1998; http://hordeum.oscs.montana.edu/genographer/) for band calling. Each AFLP locus was assessed and scored using the ‘thumbnail’ option of Genographer, which enables fluorescence signal strength distributions per locus to be compared across all accessions together, and presence was assigned if an accession had a band ≥100 fluorescence units. From this a presence/absence matrix was constructed and imported into SPSS for Windows (release 9.0.0; © SPSS Inc.) for data manipulation and analysis.

Analyses

Accessions were separated into the following seven geographical regions: Scandinavia; British Isles; central Europe (North Coast to the Alps); Iberian Peninsula; Southern Italy (South of the Po Valley); Asia; and Africa (Table 1). The Cape Verde Islands were included in Africa as they had probably been colonized from Africa or the Canary Islands (Böhle et al. 1996). These regions were chosen because they are separated by physical barriers to gene flow, or are known glacial refugia (see Hewitt 1996).
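The band-scoring rule described above (presence if a band reaches at least 100 fluorescence units) can be illustrated with a minimal sketch. The function name `call_bands` and the peak heights are hypothetical and serve only as an example; the actual scoring was done interactively in Genographer.

```python
def call_bands(signal, threshold=100.0):
    """Convert fluorescence peak heights to a presence/absence (1/0) matrix.

    signal: one row per accession, one value per AFLP locus.
    threshold: minimum signal counted as band presence (100 units in the text).
    """
    return [[1 if value >= threshold else 0 for value in row] for row in signal]


# Hypothetical peak heights for three accessions at four loci.
signal = [
    [250.0, 40.0, 100.0, 0.0],
    [99.9, 310.0, 120.0, 15.0],
    [180.0, 220.0, 60.0, 500.0],
]
matrix = call_bands(signal)
```

Each row of `matrix` is then one multilocus AFLP haplotype of the kind used in the analyses that follow.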
We performed analyses of linkage disequilibrium to test for correlations or statistical nonindependence between different loci using the program LIAN (Haubold & Hudson 2000). Two-locus linkage disequilibrium between allele 1 at locus A and allele 0 at locus B is often expressed as $D = g_{A1,B0} - p_{A1} p_{B0}$, where $g_{A1,B0}$ is the frequency of the two-locus haplotype and $p_{A1} p_{B0}$ is the product of the corresponding single-locus allele frequencies. If D = 0, the loci are said to be in linkage equilibrium and the converse is called linkage disequilibrium. However, pairwise tests for linkage disequilibrium become cumbersome with many loci. To avoid spurious rejection of the null hypothesis, Bonferroni correction imposes a more stringent significance threshold, but may be too conservative (Rice 1989; Weir 1996). Alternatively, one may ask whether more significant results are obtained than expected by chance alone (Miyashita et al. 1999). However, these significance tests are not fully independent, so this approach may also be conservative.

Alternatively, genome-wide multilocus analysis of linkage disequilibrium can be applied to population data on multilocus haplotypes (Brown et al. 1980). A mismatch distribution is obtained by comparing all possible pairs of haplotypes (individuals) in a data set. Each pairwise comparison counts the number of loci that differ between two haplotypes. For example, the five-locus haplotypes 10011 and 10000 are mismatched at two loci. Under linkage equilibrium mismatches occur independently, one locus at a time. In contrast, with linkage disequilibrium mismatches involve correlated sets of loci. Consequently, the mismatch distribution is influenced by presence or absence of linkage disequilibrium, thus enabling a test for multilocus linkage disequilibrium.
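As a minimal sketch of the two quantities just defined, the following computes the two-locus coefficient D and the pairwise mismatch counts for haplotypes written as strings of 0s and 1s. The function names and the four example haplotypes are illustrative assumptions, not part of LIAN.

```python
from itertools import combinations


def two_locus_d(haplotypes, locus_a, locus_b, allele_a="1", allele_b="0"):
    """D = g(A,B) - p(A) * p(B) for the chosen alleles at two loci."""
    n = len(haplotypes)
    p_a = sum(h[locus_a] == allele_a for h in haplotypes) / n
    p_b = sum(h[locus_b] == allele_b for h in haplotypes) / n
    g_ab = sum(
        h[locus_a] == allele_a and h[locus_b] == allele_b for h in haplotypes
    ) / n
    return g_ab - p_a * p_b


def mismatches(h1, h2):
    """Number of loci at which two haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def mismatch_distribution(haplotypes):
    """Mismatch counts for all n(n - 1)/2 pairs of haplotypes."""
    return [mismatches(a, b) for a, b in combinations(haplotypes, 2)]


# Hypothetical sample of four five-locus haplotypes.
sample = ["10011", "10000", "01011", "01000"]
d = two_locus_d(sample, 0, 1)            # allele 1 at locus 0, allele 0 at locus 1
counts = mismatch_distribution(sample)   # six pairwise comparisons
```

In this toy sample the first two loci are perfectly associated, so D takes its maximum possible value of 0.25 for allele frequencies of 0.5.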
The observed variance of the number of mismatches from all $n(n - 1)/2$ possible pairs of haplotypes in a sample, $V_D$, can be compared to the mismatch variance expected under linkage equilibrium, $V_e$,

$$V_e = \sum_{i=1}^{r} h_i (1 - h_i),$$

where $h_i = \sum_j p_{ij}^2$ is the genetic diversity at the i-th locus and r is the number of loci. The test of multilocus linkage equilibrium then amounts to testing the null hypothesis $H_0: V_D = V_e$. This can be achieved either by imitating the effect of recombination through computer simulation (Souza et al. 1992) or by using a parametric test (Haubold et al. 1998). The approach based on pairwise mismatches also yields a standardized index of association as a measure of haplotype-wide linkage disequilibrium,

$$I_A^S = (V_D / V_e - 1)/(r - 1),$$

which is zero for linkage equilibrium (Maynard Smith et al. 1993; Hudson 1994).

Genetic similarity among accessions was examined using the phylogenetic tree building package TREECON (Van de Peer & De Wachter 1993, 1994, 1997). The genetic distance algorithm of Link et al. (1995) was used because only shared band presence is considered to be informative, while band loss can be attained convergently (e.g. mutation of any nucleotides within restriction sites or selective nucleotides). Neighbour-joining trees were constructed using 100 bootstrap replicates.

Principal components analysis is a method of data reduction (Manly 1994). If the data are highly correlated, a plot of the taxa against the first few principal components will account for a large portion of the total variance. Such a plot would effectively summarize the structure contained in the full data set. We applied principal components analysis to our AFLP data using SPSS for Windows (release 9.0.0; © SPSS Inc.).

We used the Mantel permutation procedure to test for genetic isolation by distance (Mantel 1967). Sample coordinates of most accessions were taken from the Nottingham database (http://nasc.nott.ac.uk/) or found using Encarta (Microsoft Inc.). Latitude and longitude data were unavailable for some accessions, and thus only a subset (n = 113) of the complete data could be analysed. We first tested for genetic isolation by distance for the entire data set and for samples contained in each geographical region separately. Following this, geographical contrasts were tested by grouping accessions from pairs of regions. The latter analyses were performed only for those comparisons containing n ≥ 10 accessions. Finally, we divided the central European accessions into two groups: west (west of 5°); and east (east of 13°), and Mantel permutation tests were then run on comparisons of east and west Europe combined with populations from Asia and the Iberian Peninsula separately.

Results

We scored 79 polymorphic AFLP loci across 142 Arabidopsis thaliana accessions. The frequency distribution across all accessions (n = 142) for the 79 loci is similar to that found by Miyashita et al. (1999) for 472 AFLP bands and 38 accessions, with relatively high occurrences at the low and high frequency ends of the distribution (Fig. 1). This distribution is partially explained by random mutation, which will lead to the formation of unique bands to weight the frequency distribution towards its lower end (Fig. 1; Miyashita et al. 1999). Intermediate frequency polymorphisms were also found, although at lower frequencies.

Fig. 1 Locus frequency class distribution of 79 AFLP markers sampled from 142 Arabidopsis thaliana accessions.

Linkage disequilibrium and relationships among accessions

On average, 62 (78%) of the AFLP bands were shared by any two accessions. Significant linkage disequilibrium
was detected for the complete sample, central Europe, and Asia (Table 2). Mismatch distributions for the entire data set, as well as for Europe, and combined Europe and Asia showed bell-shaped distributions, whereas the Asian sample was irregular (Fig. 2). Miyashita et al. (1999) found that removal of the Fl-3 accession from their analysis had the effect of decreasing linkage disequilibrium for the remaining data set, the result of a disproportionately high number of unique bands in the Fl-3 genotype. However, Fl-3 is not A. thaliana (L. Dorn and T. Mitchell-Olds, unpublished). The t-tests for differences in genetic diversity were not significant for comparisons between all geographical regions (not shown).

Given the significant linkage disequilibrium in the data set, it might contain appreciable information about evolutionary relationships. We searched for robust subclusters of accessions using neighbour-joining with bootstrap. This analysis returned no node with bootstrap support >60%. The dendrogram resembled a bush or star phylogeny (not shown), so there is no evidence for phylogenetic substructuring among A. thaliana accessions.

Table 2 Linkage disequilibrium analyses for geographical regions with at least 10 accessions. n = 79 loci. Monte Carlo simulations were performed with 10 000 iterations, and statistically significant results after a sequential Bonferroni correction for multiple tests are shown in bold. Bonferroni significance levels are indicated by *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001

Geographic region    n      VD      Ve      H (± SE)           IAS       P
All Samples          142    16.1    11.7    0.2318 (0.0195)    0.0048    0.0001**
Central Europe       88     14.2    11.4    0.2278 (0.0200)    0.0031    0.0009**
Asia                 23     23.4    11.2    0.2317 (0.0215)    0.0140    0.0001**
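The quantities in Table 2 can be related to one another with a short sketch. It computes the observed mismatch variance, the variance expected under linkage equilibrium, and the standardized index of association from 0/1 haplotype strings. The function names are hypothetical, and dividing by the number of pairs when computing the observed variance is an assumption; LIAN may use a different estimator.

```python
from itertools import combinations


def mismatch_variance(haplotypes):
    """Observed variance V_D of pairwise mismatch counts.

    Assumption: population variance (divide by the number of pairs).
    """
    d = [sum(a != b for a, b in zip(h1, h2))
         for h1, h2 in combinations(haplotypes, 2)]
    mean = sum(d) / len(d)
    return sum((x - mean) ** 2 for x in d) / len(d)


def expected_variance(haplotypes):
    """V_e = sum over loci of h_i * (1 - h_i), with h_i = sum_j p_ij ** 2."""
    n = len(haplotypes)
    v_e = 0.0
    for i in range(len(haplotypes[0])):
        p1 = sum(h[i] == "1" for h in haplotypes) / n  # band-presence frequency
        h_i = p1 ** 2 + (1.0 - p1) ** 2                # two alleles per AFLP locus
        v_e += h_i * (1.0 - h_i)
    return v_e


def standardized_ia(v_d, v_e, r):
    """Standardized index of association, (V_D / V_e - 1) / (r - 1)."""
    return (v_d / v_e - 1.0) / (r - 1)


# Recomputing the index from the V_D and V_e values printed in Table 2 (r = 79).
ia_all = standardized_ia(16.1, 11.7, 79)
ia_central_europe = standardized_ia(14.2, 11.4, 79)
ia_asia = standardized_ia(23.4, 11.2, 79)
```

Rounded to four decimal places, the three recomputed values agree with the index column of Table 2.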
Isolation by distance

In order to further investigate the genetic structure of A. thaliana we subjected our data set to principal components analysis (Manly 1994). The first three principal components for the allele variables accounted for merely 16.3% of the total variance. Nevertheless, some Asian accessions clustered separately from the central European plants, and in addition, two Iberian accessions (Sah-0 and Ll-0) and eight central European accessions were also separated from the main cluster of A. thaliana (Fig. 3).

Fig. 2 Mismatch distributions for (a) the entire sample, (b) Asia, and (c) Europe. All possible pairs of accessions are compared, showing the frequency of mismatched (nonidentical) AFLP loci for each pairwise comparison.

Fig. 3 Three-dimensional plot of the first three principal components, which describe 16.3% of the AFLP variation in Arabidopsis thaliana (Sah-0 and Ll-0 are two outlier accessions from the Iberian Peninsula).

We formally tested the geographical patterning suggested by the principal components analysis using Mantel tests. Genetic distance between accessions increases significantly with geographical distance (Fig. 4). Even with a conservative correction for multiple statistical tests (Rice 1989), nine of 20 Mantel test comparisons were statistically significant. All significant tests showed a positive correlation between geographical and genetic distance ranging from 0.16 (central Europe and Iberian Peninsula) to 0.65 (Asia and Scandinavia; Table 3).
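The Mantel permutation procedure behind these tests can be sketched as follows. This is a minimal illustration, not the implementation used for Table 3: the function names, the one-sided P-value for a positive correlation, and the toy distance matrices are all assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(geo, gen, permutations=10000, seed=0):
    """Mantel test between two symmetric n x n distance matrices.

    Returns (r, P), where P is the one-sided permutation P-value for r > 0.
    """
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [geo[i][j] for i, j in pairs]
    r_obs = pearson(x, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel accessions in the genetic matrix
        y = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical example: six sites on a line, genetic distance proportional to geography.
coords = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.1 * abs(a - b) for b in coords] for a in coords]
r, p = mantel(geo, gen, permutations=999)
```

Rows and columns of one matrix are permuted together because the pairwise distances are not independent observations; shuffling individual matrix entries would overstate significance.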
Furthermore, genetic isolation by distance (with Bonferroni significance) was found between samples from east Europe and Asia (n = 20, r = 0.46, P < 0.01), west Europe and Asia (n = 19, r = 0.50, P < 0.01), Asia and the Iberian Peninsula (n = 16, r = 0.54, P < 0.0001), and east Europe and the Iberian Peninsula (n = 18, r = 0.21, P < 0.02).

Two major trends are apparent from the significant comparisons between geographical regions. First, a significant positive correlation exists for all comparisons with Asia (Table 3), and of these, the strongest correlations exist for comparisons involving the edges of the European distribution. Second, a Mantel test of the central European samples alone shows no significant genetic isolation by distance, but this becomes significant if accessions from the Iberian Peninsula are included (Table 3). The Iberian Peninsula sample exhibits no significant genetic isolation by distance when compared to any other geographical region except Asia.

Discussion

Glacial refugia and the postglacial colonization of Europe

We find significant isolation by distance among Arabidopsis thaliana accessions from Eurasia and southern Europe (Table 3; Figs 3 and 4). These spatial patterns of genetic variation suggest that A. thaliana colonized central and northern Europe from Asia, with some indications of an additional Mediterranean Pleistocene refugium (the Iberian Peninsula). Previously, lack of phylogeographic pattern has been ascribed to recent human-induced migrations (King et al. 1993; Todokoro et al. 1995; Bergelson et al. 1998; Miyashita et al. 1999). However, although human disturbance clearly influences the biogeography of A. thaliana, our large sample of Eurasian accessions provides strong evidence for historical migrations and isolation by distance. This result has important implications for design and interpretation of functional and evolutionary studies of natural variation in Arabidopsis.
Two possible scenarios could explain the significant Mantel test results for comparisons involving Asia, the Iberian Peninsula and central Europe (Table 3; Fig. 4). First, as significant genetic isolation by distance exists for all comparisons involving Asia, Asia might be the source region from which other areas were colonized since the Pleistocene. The colonizing populations emigrating from Asia would likely be characterized by reduced genetic variability, as migrating populations are affected by stochastic events which tend to decrease diversity (Cooper et al. 1995; Leonardi & Menozzi 1995; Koch et al. 1998; Schmidtling & Hipkins 1998). Regardless, one would not expect genetic distance between the colonizing and Asian source populations to increase significantly with geographical distance because: (i) only a short time period (≈17 000 years) would have passed during the colonization event; and (ii) population bottlenecks along the colonization route would have the effect of decreasing genetic diversity but not increasing genetic distance.

Thus, a second, and more likely scenario explaining the observed pattern of isolation by distance is that both Asia and the Iberian Peninsula provided Pleistocene glacial refugia for A. thaliana. The Asian accessions are clearly distinct based both on the Mantel and PCA procedures (Table 3; Figs 3 and 4). Accessions from the Iberian Peninsula exhibit no significant isolation by distance with any geographical region other than central Europe and Asia (Table 3). Overall, it seems likely that Europe may have been recolonized by accessions emerging from the Iberian Peninsula and Asia. In support of this hypothesis, the most divergent pairs of accessions occur in comparisons of western Europe vs. Asia (see Results). The principal components analysis furthermore implies that this trend with the Iberian Peninsula may have been influenced by a subset of the Iberian accessions (Sah-0 and Ll-0), although this conclusion is weakened by the fact that only 16.3% of the data variance is accounted for in the first three principal components (Fig. 3).

Fig. 4 Positive correlation between geographical and genetic distance shows isolation by distance in Arabidopsis thaliana. (a) all accessions; (b) Asia.

Table 3 Mantel test results r (P-value) for the correlation between geographical region and genetic distance (10 000 iterations; n = 79 loci). Boldfaced values are significant after sequential Bonferroni correction for multiple tests. Post-Bonferroni significance levels are indicated by *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001

Comparison                              n      r (P-value)
All samples together                    113    0.24 (0.0001)**
Central Europe – Asia                   86     0.31 (0.0001)**
Central Europe – Iberian Peninsula      84     0.16 (0.0001)**
Central Europe – Southern Italy         83     –0.06 (0.05)
Central Europe – British Isles          83     –0.03 (0.47)
Central Europe – Scandinavia            81     –0.006 (0.84)
Central Europe – Africa                 80     0.06 (0.17)
Central Europe                          77     0.07 (0.04)
Asia – Iberian Peninsula                16     0.53 (0.0001)**
Asia – Southern Italy                   15     0.60 (0.0001)**
Asia – British Isles                    15     0.56 (0.0001)**
Asia – Scandinavia                      13     0.65 (0.0001)**
Iberian Peninsula – Southern Italy      13     –0.06 (0.57)
British Isles – Iberian Peninsula       13     0.11 (0.46)
British Isles – Southern Italy          13     0.12 (0.45)
Asia – Africa                           12     0.53 (0.0008)**
Iberian Peninsula – Scandinavia         11     –0.007 (0.65)
British Isles – Scandinavia             10     0.05 (0.76)
Southern Italy – Scandinavia            10     0.03 (0.86)
Asia                                    9      0.63 (0.0002)**
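The sequential Bonferroni correction applied to Table 3 can be illustrated with a short sketch. It implements the step-down procedure (smallest P-value compared against alpha/m, the next against alpha/(m - 1), and so on, stopping at the first failure) and applies it to the P-values as printed in Table 3. The function name is hypothetical, and the printed P-values are rounded, so this is only an approximate check.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down sequential Bonferroni test.

    Returns one boolean per input P-value (same order): True if significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    significant = [False] * m
    for rank, k in enumerate(order):
        if p_values[k] <= alpha / (m - rank):
            significant[k] = True
        else:
            break  # all larger P-values are also non-significant
    return significant


# P-values from Table 3, in row order (20 Mantel comparisons).
table3_p = [
    0.0001, 0.0001, 0.0001, 0.05, 0.47, 0.84, 0.17, 0.04, 0.0001, 0.0001,
    0.0001, 0.0001, 0.57, 0.46, 0.45, 0.0008, 0.65, 0.76, 0.86, 0.0002,
]
flags = sequential_bonferroni(table3_p)
n_significant = sum(flags)
```

With these inputs, nine comparisons remain significant at the 0.05 level, and the test of central Europe alone (P = 0.04) does not, in line with the Results.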
The Balkan region also remains a possible glacial refugium for A. thaliana (Hewitt 1996), although this hypothesis cannot be tested due to lack of collections from this region. Similar post-Pleistocene migration patterns are exhibited by a number of species (Hewitt 1996; Taberlet et al. 1998).

Fig. 5 Scenario for Arabidopsis postglacial colonization of Europe from the Iberian Peninsula (black) and Asia (white), to a central European contact zone (checkered). Smaller arrows out of the Iberian Peninsula indicate colonization of Scandinavia and Italy.

Significant genetic isolation by distance between central European vs. Asian accessions, and between Iberian vs. central European accessions suggests that central Europe may be a suture zone (Taberlet et al. 1998) between the two refugia (Fig. 5). We hypothesize that the genetic diversity characteristic of this zone may represent the combined diversities of the colonizing populations from both glacial refugia, each of which was composed of genetically distinct selfing populations. The t-tests show no increase in genetic diversity in central Europe relative to any single glacial refugium (not shown), and this result is consistent with a scenario whereby the genetic diversity of migrant populations derived from each refugium has been decreased by drift (Cooper et al. 1995; Leonardi & Menozzi 1995; Schmidtling & Hipkins 1998).

If the above hypothesis is correct, then central European accessions may show an east-to-west clinal distribution in genetic variation. Two colonization ‘waves’ into Europe, one from Asia and one from Iberia, would be reflected in western European accessions being more closely related to those from Iberia, and eastern European accessions being more closely related to those from Asia (Fig. 5).
Mantel tests of the subdivided central European sample indirectly support this hypothesis, with the most significant genetic isolation by distance found in the western Europe–Asia and eastern Europe–Asia comparisons, and less significance for the eastern Europe–Iberia comparison (see Results). This trend suggests that the hypothesized suture zone between European and Asia populations lies further east than our central Europe sample, as the comparisons involving both east and west Europe with Asia show similar levels of genetic isolation by distance, while the east Europe and Iberia comparison is less significant (Table 3). Linkage disequilibrium and relationships among accessions We found statistically significant levels of multilocus linkage disequilibrium among A. thaliana accessions. Previous analyses have not found strong evidence for linkage disequilibrium, but were based on fewer accessions and on two-locus estimators averaged across many comparisons (Bergelson et al. 1998; Miyashita et al. 1999). In addition to possible differences in statistical power (see Materials and methods), these statistical estimators ask somewhat different biological questions. VD, a genome-wide multilocus estimator, may reject the null hypothesis of linkage equilibrium if a small subset of loci are in disequilibrium. Alternatively, large numbers of pairwise tests (D) only reject the null hypothesis if high levels of disequilibrium exist among many loci. Consequently, this analysis and previously published studies are all compatible with a model of intermediate levels of disequilibrium among subsets of loci. More data will be needed to understand the extent and possible utility of linkage disequilibrium in A. thaliana (e.g. Collins et al. 1999; Kruglyak 1999). Patterns of genetic relationship can be summarized by a dendrogram. 
Our resulting tree indicated a bush or star topology (not shown), with low levels of bootstrap support. Similar results were found in previous studies with fewer accessions but more polymorphic loci (Breyne et al. 1999; Miyashita et al. 1999). This pattern agrees with our findings of partial linkage disequilibrium. Evidently there has been sufficient recombination in the A. thaliana genome to partially reshuffle genetic variation. This conclusion is also supported by surveys of nucleotide polymorphism in several A. thaliana genes, which typically contain evidence of intragenic recombination within a few kilobases (Innan et al. 1996; Purugganan & Suddith 1998; Kawabe & Miyashita 1999; Purugganan & Suddith 1999). Although A. thaliana is highly selfing under laboratory conditions, historical recombination has been sufficiently important that A. thaliana accessions do not conform to a tree-like, bifurcating pattern of evolution: there is no ‘ecotype phylogeny’.

Nucleotide polymorphisms have a historical context, as the frequency of specific alleles may have been influenced by gene flow before, during and after isolation in Pleistocene refugia (Hewitt 1996). Populations of A. thaliana from different refugia should differ in terms of low frequency alleles, which may have resulted from mutation, genetic drift and weak selection. Intermediate frequency alleles may represent older polymorphisms distributed over a broad geographical range, or polymorphisms that have reached high frequency through some form of selection (Konnert & Bergmann 1995; Akashi 1999).

Our data suggest that Eurasian A. thaliana accessions have undergone genetic differentiation in separate Pleistocene glacial refugia.
Northward recolonization following glacial retreat may have generated a suture zone somewhere in eastern Europe. This geographical structure provides a context from which accessions of differing genetic backgrounds may be chosen for studies of natural genetic variation (Alonso-Blanco & Koornneef 2000). Choosing accessions from different glacial refugia may increase levels of genetic variation for functional and evolutionary studies.

Acknowledgements

We thank Domenica Schnabelrauch and Antje Figuth for help with the AFLP analysis, and J. Bergelson, D. Charlesworth, M. Clauss, M. Koch, J. McKay, B. Stranger, and two anonymous reviewers for comments on the manuscript. Randy Scholl of the Arabidopsis Biological Resource Center (ABRC) provided seeds. This work was supported by the Max-Planck-Gesellschaft, and by grants to TMO from the US National Science Foundation (DEB-9527725), the European Union, and Bundesministerium für Bildung und Forschung (BMBF).

References

Abbott RJ, Gomes MF (1989) Population genetic structure and outcrossing rate of Arabidopsis thaliana (L.) Heynh. Heredity, 62, 411–418.
Akashi H (1999) Within- and between-species DNA sequence variation and the ‘footprint’ of natural selection. Gene, 238, 39–51.
Alonso-Blanco C, Koornneef M (2000) Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends in Plant Science, 5, 22–29.
Bergelson J, Stahl E, Dudek S, Kreitman M (1998) Genetic variation within and among populations of Arabidopsis thaliana. Genetics, 148, 1311–1323.
Böhle U-R, Hilger HH, Martin WF (1996) Island colonization and evolution of the insular woody habit in Echium L. (Boraginaceae). Proceedings of the National Academy of Sciences of the USA, 93, 11740–11745.
Breyne P, Rombaut D, Van Gysel A, Van Montagu M, Gerats T (1999) AFLP analysis of genetic diversity within and between Arabidopsis thaliana ecotypes. Molecular and General Genetics, 261, 627–634.
Brown AHD, Feldman MW, Nevo E (1980) Multilocus structure of natural populations of Hordeum spontaneum. Genetics, 96, 523–536.
Collins A, Lonjou C, Morton N (1999) Genetic epidemiology of single-nucleotide polymorphisms. Proceedings of the National Academy of Sciences of the USA, 96, 15173–15177.
Comes HP, Kadereit JW (1998) The effects of Quaternary climatic changes on plant distribution and evolution. Trends in Plant Science, 3, 432–438.
Cooper SJB, Ibrahim KM, Hewitt GM (1995) Postglacial expansion and genome subdivision in the European grasshopper Chorthippus parallelus. Molecular Ecology, 4, 49–60.
Forsström L, Punkari M (1997) Initiation of the last glaciation in northern Europe. Quaternary Science Reviews, 16, 1197–1215.
Hanfstingl U, Berry A, Kellogg EA et al. (1994) Haplotypic divergence coupled with lack of diversity at the Arabidopsis thaliana alcohol dehydrogenase locus: roles for both balancing and directional selection? Genetics, 138, 811–828.
Haubold B, Hudson RR (2000) LIAN, version 3.0: detecting linkage disequilibrium in multilocus data. Bioinformatics, in press.
Haubold B, Travisano M, Rainey P, Hudson R (1998) Detecting linkage disequilibrium in bacterial populations. Genetics, 150, 1341–1348.
Hewitt GM (1996) Some genetic consequences of ice ages, and their role in divergence and speciation. Biological Journal of the Linnean Society, 58, 247–276.
Hudson RR (1994) Analytical results concerning linkage disequilibrium in models with genetic transformation and conjugation. Journal of Evolutionary Biology, 7, 535–548.
Innan H, Tajima F, Terauchi R, Miyashita NT (1996) Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana. Genetics, 143, 1761–1770.
Kawabe A, Miyashita N (1999) DNA variation in the basic chitinase locus (ChiB) region of the wild plant Arabidopsis thaliana. Genetics, 153, 1445–1453.
King G, Nienhuis J, Hussey C (1993) Genetic similarity among ecotypes of Arabidopsis thaliana estimated by analysis of restriction fragment length polymorphisms. Theoretical and Applied Genetics, 86, 1028–1032.
Koch M, Hurka H, Mummenhoff K (1998) Molecular phylogenetics of Cochlearia L. & allied genera based on nuclear ribosomal ITS DNA sequence analysis contradict traditional concepts of their evolutionary relationship. Plant Systematics and Evolution, 216 (3–4), 207–230.
Konnert M, Bergmann F (1995) The geographical distribution of genetic variation of silver fir (Abies alba, Pinaceae) in relation to its migration history. Plant Systematics and Evolution, 196, 19–30.
Kruglyak L (1999) Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genetics, 22, 139–144.
Leonardi S, Menozzi P (1995) Genetic variability of Fagus sylvatica L. in Italy: the role of postglacial recolonization. Heredity, 75, 35–44.
Link W, Dixens C, Singh M, Schwall M, Melchinger AE (1995) Genetic diversity in European and Mediterranean faba bean germ plasm revealed by RAPD markers. Theoretical and Applied Genetics, 90, 27–32.
Manly FJ (1994) Multivariate Statistical Methods, a Primer. 2nd edn. Chapman & Hall, London.
Mantel N (1967) The detection of disease clustering and a generalized regression approach. Cancer Research, 27, 209–220.
Mauricio R (1998) Costs of resistance to natural enemies in field populations of the annual plant Arabidopsis thaliana. American Naturalist, 151, 20–28.
Maynard Smith J, Smith NH, Dowson CG, Spratt BG (1993) How clonal are bacteria? Proceedings of the National Academy of Sciences of the USA, 90, 4384–4388.
Mitchell-Olds T (1995) The molecular basis of quantitative genetic variation in natural populations. Trends in Ecology and Evolution, 10, 324–328.
Miyashita NT, Kawabe A, Innan H (1999) DNA variation in the wild plant Arabidopsis thaliana revealed by amplified fragment length polymorphism analysis. Genetics, 152, 1723–1731.
Newton AC, Allnutt TR, Gillies ACM, Lowe AJ, Ennos RA (1999) Molecular phylogeography, intraspecific variation and the conservation of tree species. Trends in Ecology and Evolution, 14, 140–145.
O’Kane SL, Al-Shehbaz IA (1997) A synopsis of Arabidopsis (Brassicaceae). Novon, 7, 323–327.
Price RA, Al-Shehbaz IA, Palmer JD (1994) Systematic relationships of Arabidopsis: a molecular and morphological approach. In: Arabidopsis (eds Meyerowitz E, Somerville C), pp. 7–19. Cold Spring Harbor Press, Cold Spring Harbor, NY.
Purugganan MD, Suddith JI (1998) Molecular population genetics of the Arabidopsis Cauliflower regulatory gene: nonneutral evolution and naturally occurring variation in floral homeotic function. Proceedings of the National Academy of Sciences of the USA, 95, 8130–8134.
Purugganan MD, Suddith JI (1999) Molecular population genetics of floral homeotic loci: departures from the equilibrium-neutral model at the APETALA3 and PISTILLATA genes of Arabidopsis thaliana. Genetics, 151, 839–848.
Redei GP (1975) Arabidopsis as a genetic tool. Annual Review of Genetics, 9, 111–127.
Rice W (1989) Analyzing tables of statistical tests. Evolution, 43, 223–225.
Schmidtling RC, Hipkins V (1998) Genetic diversity in longleaf pine (Pinus palustris): influence of historical and prehistorical events. Canadian Journal of Forest Research, 28, 1135–1145.
Souza V, Nguyen TT, Hudson RR, Pinero D, Lenski RE (1992) Hierarchical analysis of linkage disequilibrium in Rhizobium populations: evidence for sex? Proceedings of the National Academy of Sciences of the USA, 89, 8389–8393.
Taberlet P, Fumagalli L, Wustsaucy AG, Cosson JF (1998) Comparative phylogeography and postglacial colonization routes in Europe. Molecular Ecology, 7, 453–464.
Todokoro S, Terauchi R, Kawano S (1995) Microsatellite polymorphisms in natural populations of Arabidopsis thaliana in Japan. Japanese Journal of Genetics, 70, 543–554.
Van de Peer Y, De Wachter R (1993) TREECON: a software package for the construction and drawing of evolutionary trees. Computer Applications in the Biosciences, 9, 177–182.
Van de Peer Y, De Wachter R (1994) TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Computer Applications in the Biosciences, 10, 569–570.
Van de Peer Y, De Wachter R (1997) Construction of evolutionary distance trees with TREECON for Windows: accounting for variation in nucleotide substitution rate among sites. Computer Applications in the Biosciences, 13, 227–230.
Vos P, Hogers R, Bleeker M et al. (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research, 23, 4407–4414.
Weir BS (1996) Genetic Data Analysis II. Sinauer, Sunderland, MA.
Willis KJ (1996) Where did all the flowers go? The fate of temperate European flora during glacial periods. Endeavour, 20, 110–114.

Tim Sharbel is a postdoctoral researcher working on gametophytic apomixis in plants and animals, for which he uses genomic, chromosomal and population genetic approaches. Bernhard Haubold is a bioinformatician with research interests in population genetics and molecular evolution. Thomas Mitchell-Olds is Director of the Max Planck Institute of Chemical Ecology. He studies the functional basis of evolutionary forces influencing ecologically important genetic variation.

© 2000 Blackwell Science Ltd, Molecular Ecology, 9, 2109–2118