Origin and Domestication of the Fungal Wheat Pathogen Mycosphaerella graminicola via Sympatric Speciation Eva H. Stukenbrock,* Søren Banke,* Mohammad Javan-Nikkhah,! and Bruce A. McDonald* *Plant Pathology, Institute of Integrative Biology, Eidgenössische Techniche Hochschule, Zurich Land und Forst Wissenschaft, Zurich, Switzerland; and !Department of Plant Protection, Faculty of Agriculture, University of Tehran Plant Protection Building, Karaj, Iran The Fertile Crescent represents the center of origin and earliest known place of domestication for many cereal crops. During the transition from wild grasses to domesticated cereals, many host-specialized pathogen species are thought to have emerged. A sister population of the wheat-adapted pathogen Mycosphaerella graminicola was identified on wild grasses collected in northwest Iran. Isolates of this wild grass pathogen from 5 locations in Iran were compared with 123 M. graminicola isolates from the Middle East, Europe, and North America. DNA sequencing revealed a close phylogenetic relationship between the pathogen populations. To reconstruct the evolutionary history of M. graminicola, we sequenced 6 nuclear loci encompassing 464 polymorphic sites. Coalescence analyses indicated a relatively recent origin of M. graminicola, coinciding with the known domestication of wheat in the Fertile Crescent around 8,000–9,000 BC. The sympatric divergence of populations was accompanied by strong genetic differentiation. At the present time, no genetic exchange occurs between pathogen populations on wheat and wild grasses although we found evidence that gene flow may have occurred since genetic differentiation of the populations. Introduction Archaeological findings provide evidence of the important changes that took place in human culture when the first farming practices were initiated about 12,000 years ago in the Fertile Crescent (Childe 1953; Moore et al. 2000; Zohary and Hopf 2000; Gopher et al. 2002). Genetic data can support archaeology as genome-wide measures of genetic similarity have traced the origin of several domesticated cereals to wild populations of naturally occurring grasses that persist in the Middle East (see Salamini et al. 2002 for review). Intensive selection and cultivation of selected phenotypes altered populations of wild grasses into domesticated varieties of crops. For the cereal crops, the main morphological features that changed through this selection process were seed size, stiffness of the ear rachis, and the ease with which the seeds are released from the glumes (Sharma and Waines 1980; Elias et al. 1996; Taenzler et al. 2002). The process of domestication of certain grasses not only changed the genetic structure of the plant populations but also changed the physical and biotic environment for pathogens associated with the selected grass species. In wild host populations, characterized by patchy distributions, low plant density, high genetic diversity, and uneven distribution of nutrients, the pathogen is exposed to substantial environmental fluctuations, including factors such as humidity, UV radiation, and nutrient availability (Burdon 1993). Agricultural host populations are characterized by higher plant densities, more even nutrient distribution, and greater genetic uniformity, creating a more uniform environment that is less prone to environmental fluctuation and that is also more conducive to pathogen transmission. It is likely that the initiation of agricultural practices and the development of new crop species also led to the emergence of new pathogens or to significant changes through local adaptation in already existing pathogen populations. Key words: plant pathogens, coevolution, sympatric speciation, population growth, Mycosphaerella graminicola, septoria tritici. E-mail: [email protected]. Mol. Biol. Evol. 24(2):398–411. 2007 doi:10.1093/molbev/msl169 Advance Access publication November 9, 2006 ! The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] One of the most serious pathogens on cultivated wheat today is the fungus Mycosphaerella graminicola that is found worldwide with its host. Mycosphaerella graminicola is known to be host specific and does not occur on other host species (Eyal et al. 1973, 1985; Saadaoui 1987; van Ginkel and Scharen 1987). Genetic analysis of virulence and resistance in the wheat—M. graminicola pathosystem suggests that pathogenicity is controlled by several loci and is likely inherited as a quantitative trait (Kema, Annone, et al. 1996; kema, Sayoud, et al. 1996; Zhan et al. 2005). Population genetic studies at both regional and continental scales demonstrated high levels of genetic diversity and gene flow between pathogen populations (Zhan et al. 2003; Banke and McDonald 2005). Mycosphaerella graminicola reproduces both clonally through the production of splash-dispersed pycnidiospores and sexually by the formation of ascospores. Ascospores are wind dispersed (Shaw and Royle 1989) and may provide an important means of long-distance dispersal together with the continuous human-mediated transport of infected plant material (Brown and Hovmøller 2002). Phylogeographic studies indicated that the center of origin of M. graminicola is the Middle East and that the pathogen was dispersed throughout the world during the spread of wheat cultivation (McDonald et al. 1999; Banke et al. 2004). Ancient patterns of migration inferred from DNA sequence data supported this hypothesis by showing an asymmetrical migration of the pathogen from European and Israeli populations to ‘‘New World’’ countries (Banke and McDonald 2005). Estimates of recent migration patterns using microsatellite markers support an asymmetrical migration of the pathogen from the ‘‘Old World’’ but additionally include North America as a contemporary major donor of pathogen immigrants (Banke and McDonald 2005). This pattern of contemporary migration is consistent with the intensive cultivation of wheat on the North American continent today and global wheat trading patterns. We hypothesized that M. graminicola emerged as a new pathogen during the process of wheat domestication. Sister populations of M. graminicola on other grass hosts have, up until now, not been available to test this hypothesis. The closest known relative of M. graminicola is the Origin and Domestication of Mycosphaerella graminicola 399 Table 1 Pathogen Populations Included in the Analysis Pathogen Mycosphaerella sp. (S1 and S2) Mycosphaerella graminicola Septoria passerinii Number of Isolates Population 4 3 31 9 14 16 40 32 Iran 1 Iran 2 Iran 3 Iran 4 Iran 5 Iran Israel United States 32 10 Switzerland Zurich, Winterthur United States North Dakota Location Province Province Province Province Province Mehran Sa’ad Oregon Ardabil, Ardabil, Ardabil, Ardabil, Ardabil, Host Plant Khalkhal Khalkhal Ardabil Meshkin Road Meshkin Road barley-adapted pathogen Septoria passerinii; however, it is not known whether divergence of these 2 pathogen species coincided with the domestication of wheat and barley. Recently collected isolates of Mycosphaerella pathogens from noncultivated grasses in Iran provided us the opportunity to study the evolutionary history of M. graminicola. DNA sequence data can be combined with genealogical-based analyses to infer processes of population and species divergence and to identify evolutionary forces that have shaped the genetic structure of an organism (Avise 2000; Hey and Machado 2003). For fungi, such studies are limited. Fisher et al. (2001) used a genealogical approach to characterize the biogeographic expansion of the human pathogen Coccidioides immitis into South America. They concluded that the expansion coincided with the human colonization of South America but the estimated timeframe of the expansion had a very large confidence interval (CI) (2,700–228,000 years). In a recent study of the rice blast pathogen Magnaporthe oryzae, Couch et al. (2005) assessed the divergence time for a host shift event from millet to cultivated rice. Their Bayesianderived estimates suggested an early origin of the riceinfecting pathogen approximately 9,000 years ago (95% CI 5 2,500–28,000 years ago), which could be associated with rice domestication. The emergence of a wheat-adapted Mycosphaerella population is likely to have occurred as a sympatric process where no geographical barriers separated pathogen populations occurring on wild grasses and populations adapting to the new domesticated crops. Because of the close proximity of wild and cultivated host populations and the longdistance dispersal of airborne ascospores, a continuous exchange of genes would have been possible during the process of divergence between Mycosphaerella populations occurring on wild and agricultural host populations. This scenario complicates the inference of coalescence patterns. A regular exchange of genetic information may result in shared polymorphisms that create the appearance of recent divergence even when the actual splitting occurred long ago. To account for this possibility, we applied an isolation with migration model which takes into account the exchange of genes between populations after divergence from a common ancestral population (Nielsen and Wakeley 2001; Hey and Nielsen 2004). The method allows the estimation of multiple coalescence parameters such as founding population sizes, Year Lolium multiflorum 2004 Dactylis glomerata 2004 Lolium multiflorum 2004 Agropyron repens 2004 Lolium multiflorum 2004 Triticum aestivum 2001 Triticum aestivum 1992 Triticum aestivum 1990–1992 Triticum aestivum Hordeum vulgare 1999 1995 Sampling Strategy Collectors Random Random Random Random Random Random Hierarchical Hierarchical Javan-Nikkhah M Javan-Nikkhah M Javan-Nikkhah M Javan-Nikkhah M Javan-Nikkhah M Javan-Nikkhah M Yarden O Mundt CC and McDonald BA Hierarchical McDonald BA Random Long D changes in population size, the time of population formation, and gene flow. Previous studies used this coalescence approach to describe the divergence of Drosophila species, chimpanzee subpopulations, and ancient human populations (Hey and Nielsen 2004; Won and Hey 2005; Hey 2005). The transition from hunter-gatherer cultures led to the emergence of new human infectious diseases that evolved from herd-domesticated animals and successfully spread in the associated human populations (Diamond 1999). We hypothesized a parallel emergence of new plant pathogens mediated by the domestication of crop plants and the development of agriculture. It is probable that the dense host populations and different environmental conditions existing in agroecosystems selected for new pathogen populations that were locally adapted to the agroecosystem, setting the stage for pathogen population differentiation and rapid speciation. To test these hypotheses, we here inferred the evolutionary history of M. graminicola by comparing gene genealogies of closely related populations on uncultivated grasses from the center of origin of both host and pathogen. Materials and Methods Fungal Isolates Uncultivated grasses of 3 different genera, Agropyron repens, Dactylis glomerata, and Lolium multiflorum, infected by a fungal pathogen in the genus Mycosphaerella were collected in the northwestern Iranian province Ardabil (table 1). This region is one of the most important locations for wheat cultivation in Iran. Little is known about the origin and evolution of the 3 grass hosts; however, they are known to be native to Europe, North Africa, and temperate Asia (Hussey et al. 1997). Pathogen isolation and DNA extraction were performed as previously described (Linde et al. 2002). Using repetitive sequence–based polymerase chain reaction (Rep-PCR) fingerprints to differentiate clonal lineages, we selected 28 unique isolates for sequencing. One hundred twenty-three strains of M. graminicola isolated from cultivated wheat collected in Iran, Israel, Oregon, and Switzerland were used to infer the coalescence of Mycosphaerella populations on grasses and wheat. Mycosphaerella graminicola populations were described previously (McDonald et al. 1999; Linde et al. 2002; Zhan et al. 2003; Banke et al. 2004; Juergens et al. 2006). Additionally, we included 10 isolates of the closely related barley 400 Stukenbrock et al. pathogen S. passerinii collected in Walsh, North Dakota (kindly provided by Goodwin SB) (table 1). Polymerase Chain Reaction and Sequencing Internal transcribed spacer (ITS) amplification was performed as described by White et al. (1990) using the primer combination ITS4/ITS5. For population genetic and coalescence analyses, we sequenced 6 DNA loci previously described in M. graminicola: b-tubulin (321 bp, a transcribed sequence), Leu (111-bp promoter and 189-bp transcribed sequences of the 3-isopropylmalate dehydrogenase gene), HPPD (560 bp, a promoter sequence of the 4hydroxyphenylpyruvate dioxygenase gene), STS2 (402 bp of an anonymous restriction fragment length polymorphism [RFLP] locus), STL10 (1,104 bp of an anonymous RFLP locus), and STS43 (430 bp of an anonymous RFLP locus). Sequence data from the loci b-tubulin, Leu, STS2, and STL10 were used in previous studies to determine phylogeny and patterns of migration in M. graminicola (Banke et al. 2004; Banke and McDonald 2005). STS43 is a nuclear RFLP locus that was used in previous population genetic studies of M. graminicola (McDonald et al. 1999). The HPPD locus was originally sequenced and described by Keon and Hargreaves et al. (1998). Sequence data not published in previous studies can be accessed in National Center for Biotechnology Information GenBank under the accession numbers DQ535489–DQ535722. Amplification and sequencing of all loci were performed as described in Banke et al. (2004). Sequences were aligned and manually adjusted using the programs Sequencher 4.0 (Gene Code, Ann Arbor, MI) and Bioedit Sequence Alignment Editor (http://www.mb. mahidol.ac.th/Downloads/Mol-Bio/Bioedit/Bioedit.htm). Data Analysis Gene Diversity As most polymorphic sites are located in noncoding DNA, we did not differentiate between silent or nonsilent mutations and all polymorphic sites were analyzed. To initially characterize the phylogenetic relationship between pathogen species, we used the ITS alignment to construct a maximum parsimony tree by heuristic searches in PAUP* 4.0 (Swofford 1996). Trees were built by 100 iterations of random taxon addition with a different number seed for each iteration. Maximum parsimony analyses were also performed using sequence alignments of the 6 DNA loci. Within the Mycosphaerella population from uncultivated grasses, we identified 2 clades, here named S1 (representing 27 isolates) and S2 (representing 34 isolates). Haplotype diversity was calculated for each of the 6 loci using the program Dnasp version 3.53 (Nei 1987; Rozas J and Rozas R 1999). To test for deviation from neutral evolution, we performed Fu and Li’s D*, Fu and Li’s F* (Fu and Li 1993) and Tajima’s D tests (Tajima 1989) for all sites in each gene using Dnasp (Rozas J and Rozas R 1999). Maximum parsimony, haplotype diversity, and deviation from neutrality were assessed using entire sequence alignments. Compatible Alignments Alignments were processed in the SNAP workbench Java program package, which implements and coordinates several programs to analyze gene genealogies and population parameters (Price and Carbone 2005). The program SNAP map was used to collapse sequences into haplotypes, removing indel mutations and excluding infinite-site violations (Aylor et al. 2006). Conflicting sites showing homoplasy were initially identified among variable sites using the compatibility methods SNAP clade and SNAP matrix in the SNAP workbench. To generate compatible sequence alignments for coalescence analyses, Carbone and Kohn (2001) suggested the removal of haplotypes. Here, instead, we chose to manually remove conflicting sites in each gene alignment to generate compatible nonrecombining alignments for the parsimony and the coalescence analysis. Table 2 shows the number of sites before and after manipulation of the alignments. The majority of polymorphic sites removed were found within the 4 M. graminicola populations, consistent with a high number of recombination events. Removal of conflicting sites may have introduced a bias into the data set. The majority of conflicting sites in the 6 genes were found between M. graminicola isolates, and we believe that the removal of conflicting sites did not bias the migration or coalescence estimates between the 2 groups. Haplotype Networks We analyzed the haplotype alignments of all loci using the program TCS 1.3 (Clement et al. 2000). This program applies a statistical parsimony method to infer unrooted cladograms based on Templeton’s 95% parsimony connection limit (Templeton 1992). All polymorphic sites in the compatible sequence alignments were used to generate the haplotype networks. LAMARC Parameter Estimation Population growth parameters, theta: h values and recombination rates were inferred using the program LAMARC 2.0 (Kuhner et al. 2004). To compare parameters in pathogen populations on uncultivated grasses and wheat, we compared a pooled sample of the 4 M. graminicola populations and a pooled sample of S1 and S2. The pooling of populations was done in order to compare a sample of pathogens from cultivated wheat and from uncultivated grasses. Because geographical populations of M. graminicola showed little subdivision, these populations were analyzed as one population sample. Due to the smaller sample sizes of S1 and S2, we pooled these populations into one sample. The larger sample sizes improved the performance of the likelihood analysis. The LAMARC algorithm permits intragenic recombination and therefore allowed us to use the complete sequence alignments for each locus. All polymorphic sites were used to assess the population parameters. To estimate h and a population growth parameter, we used 10 initial chains with 2,000 genealogies sampled and 2 final chains with 20,000 genealogies sampled. Analyses were run with fixed migration rates obtained from analyses performed in the IM program (see below) (Nielsen and Wakeley 2001). Because of large credibility intervals, we also performed LAMARC analyses without fixed migration estimates to determine the influence of these on the LAMARC analyses. Population parameters were inferred using both Bayesian and Maximum Likelihood Origin and Domestication of Mycosphaerella graminicola 401 Table 2 Estimates of Gene Diversity and Tests of Neutrality for Each Locus Locus b-tubulin Mycosphaerella graminicola S1 S2 HPPD M. graminicola S1 S2 Leu M. graminicola S1 S2 STS2 M. graminicola S1 S2 STL10 M. graminicola S1 S2 STS43 M. graminicola S1 S2 Total Analyzed Sequence Length b 285 (277) Total Polymorphic Sites Polymorphic Sites per Population 300 (254) 415 (398) 1104e/515 (377) 445 (376) 2505 (2157) Haplotype Diversity Fu and Li’s D 0.8 NSc 39 (38) 20 545 (475) Haplotypesa 22 (123) d Fu and Li’s F* Tajima’s D NS NS NS NS NS NS 2 1 2 (13) 2 (7) 0.39 0.29 1.4 (0.05) NS 85 84 72 52 (107) 4 (16) 3 (10) 0.93 0.53 0.39 !4.0 (0.02) !2.6 (0.05) !2.2 (0.02) !3.8 (0.02) !2.7 (0.05) !2.4 (0.02) !2.1 (0.05) !1.9 (0.05) !2.1 (0.01) 33 10 NDf 19 (97) 3 (11) ND 0.56 0.35 ND NS !2.6 (0.02) ND NS !2.4 (0.02) ND NS !2.0 (0.05) ND 18 1 0 18 (106) 2 (9) 1 (16) 0.71 0.39 0 NS !2.3 (0.02) NS NS !2.5 (0.02) NS NS !2.0 (0.01) NS 92e 17 18 36 (81) 4 (7) 3 (11) 0.85 0.71 0.69 !5.92 (0.02) NS NS !5.47 (0.001) NS NS !2.58 (0.02) NS NS 44 6 NDf 21 (98) 3 (6) ND 0.88 0.6 ND NS NS ND NS NS ND NS NS ND 165 (127) 44 (37) 59 (35) 53 (45) 53 (39) 413 (321) a Number in parentheses indicates the total number of sequences sampled in each population. b Numbers in parentheses indicate the length of the compatible sequence alignment and the number of polymorphic sites in the compatible alignments. c Nonsignificant. d Numbers in parentheses for the last 3 columns show P-values of the neutrality tests. e Including 589 bp indel mutation present in M. graminicola but not in S1 and S2. f Loci could not be amplified from the S2 group. analyses. 95% credibility intervals were assessed only from the Bayesian search. Isolation with Migration Parameter Estimation Migration rates backward in time due to divergence of pathogen populations on uncultivated grasses and wheat were estimated using an isolation with migration model implemented in the Bayesian-based isolation with migration (IM) program (Nielsen and Wakeley 2001). The method estimates the likelihood function of the demographic parameters h (h: population diversity), t (time), m (the number of migrants per generation between the 2 populations), mt (the time of migration events into each population), and l (mutation rate) given 1) a demographic model, 2) the entire set of data (the DNA sequences) and the likelihood of their possible underlying genealogies calculated using a Markov chain Monte Carlo approach, and 3) a mutational model. In this case, we assumed the infinite-site model of Kimura (1969), which is intended for analyses of relatively recent cases of population splitting. Analyses were also performed using the finite-site model of Hasegawa et al. (1985), which allows for recurrent mutations, differences in nucleotide frequencies, and a transition/transversion bias. Highly similar results were obtained using the 2 models. Here only results obtained with the infinite-site model are shown. Nonrecombining sequence alignments were applied as in-files for the IM program. Initially, pairwise population comparisons were carried out between the 4 M. graminicola populations. The IM program estimates mutation rates for each locus and subsequently scales all demographic parameters by the overall mutation rate. Bayesian posterior likelihood distributions for h, m (m 5 ml), and t (t 5 tl) were calculated using 10,000,000 simulations and a burn in time of 500,000. Scalar for maximum h values was set at 100, maximum migration rates (m) at 50, and maximum divergence time (t) at 10. Inheritance scalar was set at 1 for autosomal loci. We repeated every run 5 times with different random number seeds to ensure convergence of the estimates. Marginal histograms were compared between all independent runs for all parameters. Lower and upper bounds of the estimated 90% Highest Posterior Density intervals were calculated for each parameter. Based on the low population subdivision found with the IM analyses, the 4 M. graminicola populations were pooled to form one panmictic population sample. To infer the coalescence of M. graminicola from the most closely related wild grass population S1, we conducted an analysis of these 2 population samples. The IM program estimated Bayesian posterior likelihood distributions of h of the ancestral founding population and of the 2 present-day populations; t, the population splitting time; m, gene flow rates between populations; and s, the fraction of the ancestral population that founded the M. graminicola population 402 Stukenbrock et al. (where 0 , s , 1) (Hey 2005). Independent analyses wereconducted using either mutation scalars for each locus (see below) or the geometric mean across all loci as inferred in IM. Analyses were conducted using 5,000,000 simulations and a burn in time of 100,000. Initially, scalar for maximum h values was set at 200, maximum migration rates (m) at 50, and maximum divergence time (t) between M. graminicola and S1 at 50 and between M. graminicola and S2 at 100. The first analyses showed large CIs for the t estimates, and we therefore performed several analyses optimizing the maximum t value to 4 to have a smaller CI. We used Metropolis Coupling implemented with 5 chains and a 2-step increment model as described by Hey and Nielsen in the IM documentation (2006). Every run was repeated 5 times with different random number seeds. Calculation of Mutation Rates The IM analysis uses the estimate l: mutation rate per locus per year of each locus to infer t (time since divergence of populations). To convert the estimates of t to real time, we needed an overall mutation rate per locus per year. To scale l, we used an estimate of mutation rate in exons in the b-tubulin locus inferred by Kasuga et al. (2002). The estimate of Kasuga et al. (2002) is given as mutation rate per site per year. We converted this estimate to mutation rate per locus per year using the observed sequence divergence between M. graminicola and S. passerinii in the b-tubulin locus. A mutation rate for the b-tubulin locus was calculated and used to convert IM estimates of l for all 6 loci. The mean mutation rate across all loci was used to calculate t. Conversion of IM Coalescence Parameters to Real Time To convert estimates of coalescence time to real time, the overall scaled mutation rate was used. The mean mutation rate across the 6 loci was estimated to be 2.0 3 10!5. We assumed a generation time of M. graminicola of 1 year (Kema, Verstappen, et al. 1996; Zhan et al. 2004) and the coalescence time t in years can then be calculated as: t 5 t/l (where t is the IM time estimate and l is the overall mutation rate). Genetree Population Divergence Time We also used the program Genetree (Griffiths and Tavaré 1994), incorporated in the SNAP program package to infer the sequence coalescence for each gene. Genetree allows the coalescence parameters to be estimated under a full multipopulation coalescence model. For this analysis, we therefore used a full data set of all populations including M. graminicola, S1, S2, and S. passerinii. The program estimates the ancestral history of each haplotype and shows the distribution of mutations on a coalescence scale. This allowed us to compare the divergence of haplotypes between and within each population for each gene. Due to intergenic recombination, we were not able to combine sequence alignments from the 6 loci. Instead, coalescence parameters were estimated for each locus separately using the compatible sequence alignments of each gene. The coalescence analyses were performed assuming subdivision between the M. graminicola populations, the S1 and S2 populations, and S. passerinii. As we assume different population sizes among the wheat-adapted pathogens and the wild grass pathogens, we inferred the coalescence of populations with different population sizes. To do this, we applied information obtained from the IM estimates about the relative differences in population sizes. To determine the order of coalescence events for haplotypes backward in time, it is necessary to know the amount of migration between populations. Haplotypes from populations linked by migration will coalesce before haplotypes from unlinked populations. As suggested by Carbone and Kohn (2001), we constructed migration matrices for each locus, indicating the number of migrants exchanged between populations, using the program Migrate (Beerli and Felsenstein 2001). These migration matrices were then used as backward migration matrices for ancestral inference in Genetree. The genealogy with the highest root probability was determined by first performing 500,000 simulations of the coalescent with 10 different starting random number seeds. From these runs, the tree with the highest root probability was selected showing the relative divergence time for the pathogen populations and the distribution of mutations along the branches. To convert coalescence units to real time, we used the S1—M. graminicola population split as a calibration point. As this parameter was estimated across all 6 loci, it provided a robust measure of t. As the coalescence time scale of the gene tree is linear, it was possible to assess the time of the split of S. passerinii and S2, as well as the diversification of M. graminicola. Results Sixty-one Mycosphaerella isolates were obtained from wild grasses collected at 5 different locations in northwest Iran. We differentiated clonal lineages using Rep-PCR fingerprints and selected 28 unique isolates from the 5 Iranian populations for detailed sequence analysis. Gene Diversity and Tests of Neutrality Based on ITS sequence data, M. graminicola and S. passerinii form a monophyletic cluster within the Mycosphaerella clade (Goodwin et al. 2001). To evaluate the phylogenetic proximity of the wild grass Mycosphaerella isolates to M. graminicola, we initially performed a maximum parsimony analysis based on ITS sequences using S. passerinii as outgroup. Three ITS sequence types were identified among the wild grass populations. The ITS type 1 was identical to ITS sequenced from M. graminicola, the ITS type 2 differed by 1 nucleotide substitution and 4 indel mutations, and the ITS type 3 differed by 2 nucleotide substitutions and 5 indel mutations. In comparison, the number of nucleotide substitutions between M. graminicola and S. passerinii was 13. We amplified and sequenced 6 polymorphic DNA sequence loci, a total of 3,080 bp, from the wild grass Mycosphaerella populations and from 4 populations of M. graminicola and 10 isolates of S. passerinii. A total of 464 polymorphic sites were identified among the 6 loci Origin and Domestication of Mycosphaerella graminicola 403 representing more than 10% of the total number of sequenced base pairs. The majority of the polymorphic sites were found only within the wheat-adapted M. graminicola population. Numbers of haplotypes and gene diversity for each locus for M. graminicola, S1, and S2 are summarized in table 2. Maximum parsimony analyses were performed for each of the 6 sequence loci to assess phylogenetic relationships (data not shown). Consistent tree topologies identified 2 clusters within the collection of Mycosphaerella spp. from wild grasses, here named S1 and S2 (S1 representing the ITS type 1 and S2 the ITS types 2 and 3). None of the haplotypes identified among the 6 loci from S1 and S2 were shared with M. graminicola. Neutrality tests were performed for each locus in the S1 and S2 Mycosphaerella clades and for M. graminicola. Evidence of nonneutral evolution was found in the HPPD locus for the S1 cluster in Leu and STS2 and for M. graminicola in the STL10 locus (table 2). For these 3 loci, the significant statistical values were negative. A significant test result is consistent with either population growth or shrinkage or background selection (Fu 1997). As the majority of nucleotide positions were located in noncoding DNA, we hypothesized that the deviation from neutrality was due to change in effective population size. This hypothesis was supported by the population growth estimates inferred by the program LAMARC 2.0 as described below. Haplotype Networks Haplotype networks were inferred from all 6 loci using compatible sequence alignments. No pattern of geographical association was revealed among the M. graminicola haplotypes, and the most frequent haplotype for the 6 loci was present in all 4 geographical populations. All 6 networks exhibited a star-like shape for the M. graminicola haplotypes. Haplotype networks of the STS2 and HPPD loci are shown in figures 1a and b. These 2 networks had the smallest number of missing steps. The STS2 network contained 26 M. graminicola haplotypes, 2 S1 haplotypes, 1 S2 hapotype, and 3 S. passerinii haplotypes. The total number of STS2 and HPPD haplotypes was higher (table 2); however, as conflicting sites were removed from the alignment prior to the analysis, the number of haplotypes was decreased. The number of mutational steps between S. passerinii and the S2 population was 72 and between S2 and M. graminicola was 18. Between S1 and M. graminicola, there were 13 mutations. Homoplasy was present in the STS2 network in the M. graminicola clades as shown by the loop between haplotype numbers 21, 3, 35, and 25. In the HPPD network 29 haplotypes were recognized in the M. graminicola cluster, 2 in the S1, 2 in the S2, and 2 in the S. passerinii cluster. There were 76 differences between S. passerinii and M. graminicola. Between the S1 group and M. graminicola populations there were 8 differences, and between S2 and M. graminicola 9 differences. For both genes, the number of steps between M. graminicola haplotypes did not exceed 3 mutations, suggesting that almost all possible haplotypes were sampled for this group. Population Expansion The population parameters h (for haploids h 5 2Nel, where Ne is effective population size and l is mutation rate) and population growth rates were calculated using both Maximum Likelihood and Bayesian analyses in LAMARC. h values were used as a relative measure of effective population sizes. We found consistent values of h using the 2 analytical strategies; however, the growth rate estimates were less consistent and CIs for this parameter were large. The Bayesian estimates of h and growth rate are summarized in table 3. The overall pattern observed for the 6 genes suggests a larger effective population size of M. graminicola (h 5 0.022) when compared with the pooled S1 and S2 (h 5 0.0063). For individual loci, h ranged from 0.02 (b-tubulin) to 0.16 (HPPD) in M. graminicola and from 0.004 (STS2) to 0.017 (Leu) in the S1 and S2 population. The overall growth rate of the M. graminicola population (13.86) is consistent with a large effective population size and a population expansion. The negative growth rate values in the S1–S2 population (!139.91) indicate a decline in population size or a consistently small population size. For individual loci in M. graminicola, growth rates ranged from 1.51 (Leu) to 472.08 (STS2) and for the S1–S2 population from !462.28 (STS2) to 18.75 (Leu). These estimates may reflect a slower rate of evolution of the Leu locus as compared with the highly variable STS2 locus. The analyses were also performed without fixed migration values. Estimates of h and population growth did not change significantly, neither was the range of the credibility intervals reduced. In these analysis, the overall h values were 0.022 (0.008–0.18) for the M. graminicola population and 0.007 (0.002–0.05) for the S1 population. Population growth estimates were 5.84 (!60.3–581.05) and !144.23 (!467.61–843.54) for M. graminicola and S1, respectively. Recombination rates inferred by LAMARC varied among the different loci, ranging from 2.23 3 10!5 (HPPD) to 0.16 (STS2) (table 3). The low levels of recombination found in the b-tubulin, HPPD, STL10, and STL43 suggest that mutation had a greater influence on the evolution of these loci than recombination. Coalescence and Migration Estimates of directional gene flow between the 4 geographical M. graminicola populations were consistent with relatively high rates of migration among populations (table 4). Based on these estimates and previous results showing high genetic identity and low global FST values (Linde et al. 2002; Zhan et al. 2003; Juergens et al. 2006), we considered the 4 M. graminicola populations as one panmictic population. The pooled M. graminicola sample was compared with the S1 wild grass population. Estimates of directional gene flow backward in time since the divergence of M. graminicola and the S1 population were low, with an overall movement of genes from the wild grass S1 population to the wheat-adapted population (table 5 and fig. 2). From S1 to M. graminicola, the migration parameter mS1 was estimated to be 0.80 and from M. graminicola to S1 mMg 5 0.15. CIs are given in table 5. 404 Stukenbrock et al. FIG. 1.—Parsimony haplotype networks for the 2 loci STS2 (a) and HPPD (b). Origin of haplotypes from the Septoria passerinii, the Iranian S1 and S2 populations, and the 4 geographical populations of Mycosphaerella graminicola are indicated by different colors. Number for each haplotype is given. Numbers in parentheses refer to number of isolates with this haplotype if more than 1. Dots are hypothetical missing intermediate haplotypes. 10!6 10!5 10!5 10!6 10!5 10!5 10!5 3 3 3 3 3 3 3 NOTE.—Mutation rates in noncoding sequences estimated in the IM Program are shown (Nielsen and Wakeley 2001; Hey 2005). 95% credibility intervals are denoted in parentheses. 3.2 2.2 4.0 8.5 3.5 1.3 2.0 0.085 (0.005–0.57) 2.23 3 10!5 (9.98 3 10!6–0.005) 0.25 (1.41 3 10!5–0.98) 0.16 (0.036–0.43) 2.92 3 10!5 (9.66 3 10!6–0.046) 0.073 (0.008–0.25) 0.09 (1.41 3 10!5–0.50) (!500.79–123.12) (!364.07 to –33.78) (!269.90–608.55) (!504.16 to –57.62) (!475.18–119.97) (!371.35–952.22) (!483.31–710.06) !320.01 !137.65 18.75 !462.28 !86.21 209.18 !139.91 (!44.65–311.89) (116.78–229.88) (!86.65–102.36) (315.65–858.38) (12.22–624.49) (!88.43–81.06) (!57.99–730.98) 35.13 181.07 1.51 472.08 117.08 19.18 13.86 0.006 (0.002–0.02) 0.011 (0.004–0.02) 0.017 (0.004–0.09) 0.004 (0.001–0.012) 0.005 (0.002–0.016) 0.008 (0.002–0.13) 0.0063 (0.002–0.06) 0.02 (0.008–0.03) 0.16 (0.12–0.21) 0.022 (0.008–0.03) 0.057 (0.041–0.085) 0.029 (0.02–0.06) 0.03 (0.02–0.05) 0.022 (0.01–0.18) b-tubulin HPPD Leu STS2 STL10 STS43 Overall Recombination Rate per Locus Population Growth, S1 and S2 Population Growth, M. graminicola h, S1 and S2 h, Mycosphaerella graminicola Table 3 Estimates of u, Population Growth, and Recombination Rates Calculated by Bayesian Analyses in the Program LAMARC 2.0 (Kuhner et al. 2004) Mutation Rate per Locus per Year Origin and Domestication of Mycosphaerella graminicola 405 The mean time of migration events was estimated over the 6 loci to assess when the majority of migration events took place. Converting the migration time estimates to real time, we find that the mean migration time corresponded to 8,250–12,750 years ago. The estimates suggest that the majority of migration events occurred at the time when the 2 populations were initially diverging and that recent migration events are uncommon. Using the IM program, we inferred coalescence parameters for the split between M. graminicola and the S1 clade. To assess the efficiency of the Markov Chain mixing, we used the estimates of parameter autocorrelation qk and Effective Sample Size (ESS). qk is calculated from the parameter k, which is based on the number of steps between the pairs of values that are included in the IM parameter calculation. qk values close to zero indicate a high k value, which is expected for independent samples. Similarly, the ESS is a measure of the number of independent points that have been sampled for each parameter and can be used as a guide to how well the Markov Chain is mixing. The parameters qk and ESS obtained from our IM analyses indicated that parameter estimations were not biased by strong autocorrelations in the Markov Chain mixing (table 5). Estimates of coalescence parameters are summarized in table 5. h of the ancestral population of S1 and M. graminicola was 10.97. Consistent with the LAMARC estimates, the IM analysis showed significantly higher h values for M. graminicola (h 5 51.3) than the wild grass S1 population (h 5 0.26). However, the parameter s (the fraction of the ancestral population which founded M. graminicola) suggests that the M. graminicola population was founded by only a very small fraction (s 5 0.02) of the ancestral population (fig. 2). A large fraction (1 ! s 5 0.98) of the ancestral population founded the present-day S1 population. These estimates are consistent with a strong expansion of M. graminicola and a decline in the S1 population since the population split. Due to the large credibility intervals for the splitting factor s (table 5), we also performed the IM analyses using a model that assumes constant population size. Estimates under this model are also shown in table 5. Estimates inferred by this model remained within the same range; however, credibility intervals for h values of M. graminicola and S1 were narrower. We calculated the overall mutation rate for the btubulin locus. This gave an overall mutation rate of 3.2 3 10!6 mutations per locus per year. This estimate was used to calibrate the IM mutation rate parameter u˛ for the other 5 loci. Mutation rates are summarized in table 3. The highest mutation rate of 4.0 3 10!5 substitutions per locus per year was found in the Leu locus. The lowest mutation rate was found in the b-tubulin locus. Assuming a generation time of 1 year for these Mycosphaerella pathogens, we calculated an overall mutation rate of 2.0 3 10!5 per locus per year. Converting estimates of t to real time, we find that the population split between the wheat-adapted M. graminicola population and the wild grass S1 population occurred approximately 10,500 years ago (95% CI: 2,000–19,500). The model parameters h and m were also converted to demographic parameters (table 5). These estimates suggest approximate values of effective population size and migration rates for M. graminicola and S1. The effective 406 Stukenbrock et al. Table 4 Estimates of Migration between 4 Geographical Populations of Wheat-Adapted Mycosphaerella graminicola Inferred Using an Isolation with Migration Model in the Program IM (Nielsen and Wakeley 2001; Hey 2005 Source population Sink Population M. graminicola (migration, m) Iran Israel Oregon Switzerland Iran (16) Israel (40) Oregon (32) Switzerland (32) — 11.65 (6.35–18.75) 0.35 (0.05–1.55) 13.15 (5.25–24.55) 21.25 (7.95–54.15) — 0.35 (0.05–1.45) 7.30 (2.30–14.30) 1.25 (0.05–4.05) 0.85 (0.25–2.15) — 8.85 (5.15–14.45) 5.95 (3.05–9.55) 7.70 (3.90–11.30) 4.25 (1.25–9.05) — NOTE.—ESS is given for each population in parentheses. Source populations are shown on the left side and sink populations are indicated along the top. 95% credibility intervals are given in parentheses. population size of the pooled M. graminicola sample exceeded 1 million, whereas the S1 effective population size was 6,500. The migration rate per year for M. graminicola into S1 was 3 3 10!6 and for S1 into M. graminicola 1.6 3 10!5. However, these estimates may be imprecise in this case where single spores represent individuals and the yearly production of sexual and asexual fungal spores may exceed billions. We also used the program Genetree (Griffiths and Tavaré 1994) to infer the gene history of the individual loci STS2, b-tubulin, HPPD, and STL10. We did not have information from the S2 population and S. passerinii for Leu or STS43 or from S. passserinii for the STL10 locus. Overall tree topologies and the relative divergence of M. graminicola from S1 and S2 were consistent for STS2, b-tubulin, HPPD, and STL10. The ancestral distributions of mutations and coalescence events for the loci STS2 and b-tubulin are illustrated in figures 3a and b. The S. passerinii cluster branched into the trees at the deepest point, suggesting an ancient divergence of S. passerinii from the 3 other Mycosphaerella clades. The S1 and M. graminicola clades represented the youngest lineages of the trees. The STS2 RFLP locus showed a higher resolution of the different population groups. Less resolution was shown in the b-tubulin locus suggesting a more conserved evolution of some regions of this locus, which also showed the lowest mutation rate. In the b-tubulin and STL10 loci, we found evidence of introgression into M. graminicola as a few isolates were positioned in the same groups as the S1 population. This pattern most likely reflects gene flow from S1 into M. graminicola after divergence. The M. graminicola isolates clustering with the S1 population were from the Iranian (as observed in the b-tubulin locus) and the Israeli (as observed in the STL10 locus) populations. In both cases, the divergence of these isolates did not occur recently and may reflect ancient gene flow events. Scaling the coalescence time in the STS2 gene tree to real time according to the split between M. graminicola and the S1 population inferred in the IM analyses, we calculate that the divergence from S. passerinii occurred approximately 68,500 (95% CI: 13,700–126,700) years ago. In the btubulin gene tree, the divergence of S. passerinii and S2 is not well resolved, and both clusters are positioned as the roots of the tree. However, the number of mutations separating S. passerinii from M. graminicola and S1 is 15 and only 10 for the S2 population. This suggests that the actual divergence of S. passerinii is more ancient than the divergence of S2. The Genetree coalescence estimates from the STS2 locus suggested a divergence time of M. graminicola from the S2 clade approximately 18,500 (95% CI: 3,700– 34,225) years ago. After divergence from the S1 clade, the M. graminicola populations experienced a strong expansion Table 5 Estimates of Coalescence Parameters from the Divergence between Mycosphaerella graminicola and S1 Parameter hA hMg hS t mMg mS mtMg mtS s Model Parameters, Constant Population Size Model Parameters, Population Size Change Demographic Parameters, Population Size Change qk (k 5 1,000,000) ESS 10.97 (5.24–23.36) 11.2035 (2.62–25.98) 0.09 (0.09–1.67) 0.16 (0.08–0.30) 0.03 (0.03–0.48) 0.58 (0.03–1.93) 0.092 (0.038–0.19) 0.126 (0.063–0.257) NA 10.97 (5.24–21.45) 51.25 (4.05–397.37) 0.26 (0.09–161.14) 0.21 (0.04–0.39) 0.15 (0.01–0.7) 0.80 (0.40–1.40) 0.255 (0.225–0.30) 0.165 (0.12–0.21) 0.02 (0.004–0.22) 274,250 (131,000–536,250) 1,281,250 (101,250–9,934,250) 6,500 (2250–4,028,500) 10,500 (2,000–19,500) 3 3 10!6 (2 3 10!6–1.4 3 10!5) 1.6 3 10!5 (8 3 10!6–2.8 3 10!5) 12,750 (11,250–15,000) 8,250 (6,000–10,500) NA !0.256 0.078 ! 0.108 !0.322 !0.279 0.267 NA NA 0.013 8 22 6 4 4 4 NA NA 6 NOTE.—NA, not applicable. Analyses were performed assuming either a constant population size or allowing population size change. Estimates are shown as model parameters and as converted demographic parameters for the analyses allowing population size change. As a measure of effective population size, the program estimates h (for haploid h 5 2Nel, where Ne 5 effective population size and l 5 mutation rate inferred over all loci). h is given for each present-time population and for the common ancestral population. An estimate of divergence time of the 2 descendant populations is given as t. Migration from M. graminicola to S1 was estimated as mMg and from S1 to M. graminicola as mS (m 5 gene flow rates per locus per generation). Mean migration times over all loci are shown as mtMg (migration from S1 into M. graminicola) and as mtS (migration from M. graminicola into S1). s is the fraction of the ancestral population that founded the M. graminicola population. 95% credibility intervals are shown in parentheses. qk is the autocorrelation for the distance value k indicating the number of steps between the pairs of values that were included in the parameter calculation. The measure ESS shows the number of independent points that have been sampled for each parameter. Origin and Domestication of Mycosphaerella graminicola 407 FIG. 2.—Population splitting of Mycosphaerella graminicola and the S1 population isolated from uncultivated grasses in Iran. The 2 descendant populations split from an ancestral population at the time t 5 10,500 years ago. h values show relative population sizes, m show migration from M. graminicola to the S1 population (mMg) and from S1 to M. graminicola (mS1). The factor s indicates the fraction of the ancestral population that founded the M. graminicola population. 1 ! s represents the remaining fraction founding the S1 population. as demonstrated by the diversification of M. graminicola haplotypes in the gene trees. The approximate time of this diversification was 5,000–6,000 years ago. Independent analyses of 4 loci showed consistent coalescence times between pathogen populations within the same CIs. Discussion The genetic data presented here provide evidence that the divergence of the wheat-adapted pathogen M. graminicola from an ancestral population infecting wild grasses in the Middle East occurred approximately 10,500 years ago. This coalescence event coincided with the beginning of agricultural-based societies in the Fertile Crescent and the domestication of wild grasses. Our findings demonstrate that the domestication of an agricultural crop was simultaneously accompanied by the domestication of a fungal pathogen. We found 1–3 fixed nucleotide substitutions at the ITS locus among the Mycosphaerella populations. Phylogenetic relationships in the clade Mycosphaerella were described in a previous study using ITS sequences (Goodwin et al. 2001). It was found that the average number of within-species nucleotide substitutions in the Mycosphaerella genus ranged from 0 to 7. Based on ITS sequences, we could not resolve the S1 and S2 populations from M. graminicola, indicating a close phylogenetic relationship below the species level. The lack of nucleotide variation in the ITS locus suggests that this gene is unsuitable to resolve clades of pathogens that have been separated for less than 11,000 years. In contrast to the ITS locus, the 6 loci that were used for population and coalescence analyses showed high amounts of nucleotide variability within and between populations. Tests of nonneutral evolution could not be rejected for any of the 6 loci, but we believe that the observed pattern of gene diversity is most likely due to changes in population size. The most frequent M. graminicola haplotypes were present in all 4 geographical populations, supporting previous results showing that the majority of gene diversity is distributed within individual populations and that populations are not geographically differentiated (McDonald et al. 1999; Linde et al. 2002; Zhan et al. 2003). The Iranian M. graminicola haplotypes clearly belonged to the wheatadapted population though they originated from the same geographical region as S1 and S2. Isolates of the S1 and S2 populations were all collected very close to cultivated wheat fields but still differed from the wheat-adapted pathogens. We found low haplotype and gene diversity in the S1 and S2 populations even though these isolates originated from 5 different locations and had a broader host range (both S1 and S2 were collected from 3 different hosts). The small number of mutations separating haplotypes indicates that a large fraction of the total diversity was sampled. The star-like shape of the M. graminicola populations in the haplotype networks is consistent with a significant population expansion (figs. 1a and b). We expected to find a higher haplotype and gene diversity in the pathogen populations isolated from uncultivated grasses as a result of diversifying selection imposed by different environments and host species (Burdon 1993). However, the low genetic diversity observed in the S1 and S2 populations is consistent with small effective population sizes of these populations and may reflect a different population biology than M. graminicola. Results from the LAMARC and the IM analyses were both consistent with a significant expansion in M. graminicola populations and a decline in the S1 and S2 populations. A new version of the ‘‘isolation with migration’’ model allowed us to estimate the fraction of the ancestral pathogen population that gave rise to the M. graminicola population. Given the contemporary h values, it is possible to extrapolate the relative increase or decrease in population sizes since the time of divergence. The IM estimates of demographic parameters suggest a recent split where only a small fraction of the ancestral pathogen population founded a new population during the domestication of the wheat host. A much larger fraction of the ancestral population founded the S1 group, which today persists on uncultivated grasses. The increase in M. graminicola effective population size was most likely caused by the corresponding increase in host population size following the spread of agriculture. We hypothesize that the decrease in size of the S1 and S2 populations reflects the significant decline in wild host populations that occurred during the conversion of wild grass habitats into agricultural fields. We are not able to determine whether the increase or decrease in population size of the respective pathogen populations occurred in a linear way through time or whether some historical events were particularly favorable or unfavorable for the respective pathogen populations. However, expansion of agriculture has favored pathogen propagation and dispersal by providing dense and uniform host populations that cover a large area. The spread of agriculture to other regions moved domesticated plants and animals across Europe and Asia (Clark 1965; Ammerman and Cavalli-Sforza 1971), and many centuries later wheat was introduced into New World countries by the European colonialism. Banke and McDonald (2005) showed evidence for past migration of 408 Stukenbrock et al. FIG. 3.—Coalescence-based gene genealogies of the STS2 locus (a) and b-tubulin (b) showing the distribution of mutations in 4 geographical populations of Mycosphaerella graminicola (Mg), the 2 Iranian S1 and S2 populations, and S. passerineii (Sp USA). The split between M. graminicola and S1 was used to convert the coalescence scale to real time. The number of haplotypes represented in each branch is shown below the trees. The vertical lines enclose the position of haplotypes derived from the S1 population. Time scale is shown at the right side of each gene genealogy. M. graminicola from the Middle East and Europe to New World countries, suggesting that the pathogen traveled with its host. The increase in wheat cultivation is also reflected in the burst of M. graminicola diversification and population expansion that the coalescence analyses dated to approximately 5,000–6,000 years ago. We estimated the directional amount of gene flow occurring among M. graminicola and the S1 population since their divergence. Our results indicate that only a limited amount of gene flow has occurred between the 2 sympatrically diverging populations and that the small amount of genetic exchange detected has been mainly from populations on wild grasses to wheat-infecting populations. This points to a possible mechanism for the introgression of new avirulence genes or pathogenicity factors into M. graminicola from the S1 clade. However, estimates of migration time showed that the majority of gene flow events occurred at the time when populations were initially diverging approximately 10,000 years ago. Evidence of gene flow was also present in the distribution of haplotypes in the Genetree analyses. The Iranian M. graminicola population con- tained one isolate, which diverged from the S1 group later than the remaining M. graminicola haplotypes (fig. 3b). The position of this isolate in the b-tubulin gene tree suggests that gene flow and recombination also took place between the 2 diverged populations after genetic differentiation had already occurred. In summary, our data indicate that the pathogen host shift did not occur as one single event but probably happened through a series of introgressions of isolates from uncultivated grasses into the wheat-infecting population. Multiple host shifts also occurred in the evolution of the rice blast pathogen, which similarly emerged as a new pathogen by the domestication of rice (Couch et al. 2005). Consistent with our results, other studies have reported extensive genetic differentiation between diverging fungal species (Bicknell and Douglas 1970; Horgen et al. 1984; Vilgalys and Johnson 1987; Dettman et al. 2003). The high amount of genetic diversity observed between relatively recently separated species suggests that speciation in Mycosphaerella was not only associated with host adaptation but was also accompanied by significant genetic differentiation. Species concepts and mechanisms of speciation in fungi Origin and Domestication of Mycosphaerella graminicola 409 have been discussed and addressed in many studies (e.g., Burnett 1983; Hibbet et al. 1995; Natvig and May 1996; Brasier 1987; Petersen and Hughes 1999; Taylor et al. 2000; Harrington et al. 2002; Kohn 2005). Reinforcement has been suggested as an important mechanism for divergence between sympatrically diverging populations (Kohn 2005) and has been demonstrated in several studies (Anderson et al. 1980; Capretti et al. 1990; Stenlid and Karlsson 1991; Dettman et al. 2003). Reinforcement is the active mechanism that leads to an increase in prezygotic reproductive isolation and reduced hybrid fitness, thereby favoring intraspecific mating and over time resulting in genetic differentiation between species (Dobzhansky 1951). Reproductive success between M. graminicola, S1, and S2 was not tested in this study, but such tests could provide additional knowledge regarding the mechanisms underlying the sympatric speciation of Mycosphaerella populations. The coalescence analyses performed in this study provide an approximate picture of the divergence of Mycosphaerella populations since the split from S. passerinii. The coalescence time was calculated using mutation rates from each of the 6 loci. The scaled mutation rates shown in this study are within the same range as mutations rates inferred in IM from a multilocus data set of human populations (Hey 2005) and as experimentally shown in yeast (Zeyl and DeVisser 2001). As our estimates were calculated across 6 loci from 360 individuals using a Bayesian-based approach, we believe these mutations rates provide robust estimates. Based on the 6 inferred mutation rates, the Genetree coalescence analyses suggest that S. passerinii diverged from the 3 other Mycosphaerella groups approximately 68,500 years ago. This implies that speciation of M. graminicola and S. passerinii did not occur simultaneously with the domestication of wheat and barley but instead occurred much earlier. The divergence of these pathogen species may have been related to the specialization of S. passerinii to a Hordeum host. Mycosphaerella graminicola and the S1 group represent the youngest phylogenetic clades that diverged from the S2 clade approximately 18,500 years ago. Mycosphaerella graminicola and the S1 group split 10,500 years ago, consistent with the genetic and archeological data obtained from crop plants in the Fertile Crescent showing speciation of the host plants (Salamini et al. 2002). Coalescence analyses of M. oryzae similarly showed that the rice blast pathogen emerged simultaneously with the domestication of rice approximately 9,000 years ago (Couch et al. 2005). The strong population expansion and global spread of M. graminicola demonstrate the accelerated pathogen evolution that was mediated by human practices and favored by the huge land areas covered by wheat. This cospeciation of M. graminicola and wheat took place within a very short evolutionary time frame, specifically, during the 10,000to 12,000-year period in which historical changes in the host (i.e., the increase in wheat cultivation and the global dispersal) had a major impact on the parasite demography. A similar picture has emerged from coalescence analyses of the malaria parasite Plasmodium falciparum (Joy et al. 2003). Changes in human culture and the introduction of agricultural practices also led to an accelerated coevolution of the malaria mosquito Anopheles gamibiae and P. falciparum, both experiencing strong population increases and migration to other regions. These examples illustrate how agricultural practices shaped the evolution of some of our most important modern pathogens. Acknowledgments We would like to thank M. Zala for technical support and Ignazio Carbone and Rasmus Nielsen for helpful suggestions to the coalescence analyses. Additionally, we would like to thank 3 anonymous reviewers for helpful comments on the manuscript. This work was funded by the Swiss Federal Institute of Technology (ETH), Zurich, and the Swiss National Science Foundation (grant 3100A0-104145). Literature Cited Ammerman AJ, Cavalli-Sforza LL. 1971. Measuring the rate of spread of early farming in Europe. Man. 6:674. Anderson JB, Korhonen K, Ullrich RC. 1980. Relationships between European and North-American biological species of Armillaria. Exp Mycol. 4:87–95. Avise JC. 2000. Phylogeography: the history and formation of species. Cambridge: Harvard University Press. Aylor DL, Price EW, Carbone I. 2006. SNAP: combine and map modules for multilocus population genetic analysis. Bioinformatics. doi: 10.1093/bioinformatics/btl136. Banke S, McDonald BA. 2005. Migration patterns among global populations of the pathogenic fungus Mycosphaerella graminicola. Mol Ecol. 14:1881–1896. Banke S, Peschon A, McDonald BA. 2004. Phylogenetic analysis of globally distributed Mycosphaerella graminicola populations based on three DNA sequence loci. Fungal Genet Biol. 41:226–238. Beerli P, Felsenstein J. 2001. Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proc Natl Acad Sci USA. 98:4563–4568. Bicknell JN, Douglas HC. 1970. Nucleic acid homologies among species of Saccharomyces. J Bacteriol. 101:505–512. Brasier CM. 1987. The dynamics of fungal speciation. In: Rayner ADM, Brasier CM, Moore D, editors. Evolutionary biology of the fungi. Cambridge: Cambridge University Press. p. 231–260. Brown JKM, Hovmøller MS. 2002. Aerial dispersal of pathogens on the global and continental scales and its impact on plant disease. Science. 297:537–541. Burdon JJ. 1993. The structure of pathogen populations in natural plant communities. Annu Rev Phytopathol. 31:305–323. Burnett JH. 1983. Presidential address: speciation in fungi. Trans Br Mycol Soc. 81:1–14. Capretti P, Korhonen K, Mugnai L, Romagnoli C. 1990. An intersterility group of Heterobasidion annosum specialized to Abies alba. Eur J Pathol. 20:231–240. Carbone I, Kohn L. 2001. A microbial population-species interface: nested cladistic and coalescence inference with multilocus data. Mol Ecol. 10:947–964. Childe VG. 1953. New light on the most ancient Near East. New York: Praeger. Clark JGD. 1965. Radiocarbon dating and expansion of farming culture from the Near East over Europe. Proc Prehist Soc. 31:58–73. Clement M, Posada D, Crandall KA. 2000. TCS: a computer program to estimate gene genealogies. Mol Ecol. 9:1657–1659. 410 Stukenbrock et al. Couch BC, Fudal I, Lebrun M-H, Tharreau D, Valent B, van Kim P, Nottéghem JL, Kohn LM. 2005. Origins of host-specific populations of the blast pathogen Magnaporthe oryzae in crop domestication with subsequent expansion of pandemic clones on rice and weeds of rice. Genetics. 170:613–630. Dettman JR, Jacobson DJ, Taylor JW. 2003. A Multilocus genealogical approach to phylogenetic species recogintion in the model eukaryote Neurospora. Evolution. 57:2703–2720. Diamond J. 1999. Guns, Germs, and Steel. New York: Norton. Dobzhansky T. 1951. Genetics and the origin of species. New York: Columbia University Press. Elias EM, Steiger DK, Cantrell RG. 1996. Evaluation of lines derived from wild emmer chromosome substitutions. II. Agronomic traits. Crop Sci. 36:228–233. Eyal Z, Amiri Z, Wahl I. 1973. Physiologic specialization of Septoria tritici. Phytopathology. 63:1087–1091. Eyal Z, Scharen AL, Huffman MD, Prescott JM. 1985. Global insights into virulence frequencies of Mycosphaerella graminicola. Phytopathology. 75:1456–1462. Fisher MC, Koenig GL,White TJ, San-Blas G, Negroni R, Alvarez IG, Wanke B, Taylor JW. 2001. From the cover: biogeographic range expansion into South America by Coccidioides immitis mirrors New World patterns of human migration. Proc Natl Acad Sci USA. 98:4558–4562. Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 147:915–925. Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics. 133:693–709. Goodwin SB, Dunkle LD, Zismann VL. 2001. Phylogenetic analysis of Cercospora and Mycosphaerella based on the internal transcribed spacer region of ribosomal DNA. Phytopathology. 91:648–658. Gopher A, Abbo S, Lev-Yadun S. 2002. The "when#, the "where# and the "why# of the Neolithic revolution in the Levant. Doc Praehistorica. 28:49–62. Griffiths RC Tavare S. 1994. Sampling theory for neutral alleles in a varying environment. Philos Trans R Soc Lond B Biol Sci. 344:403–410. Harrington TC, Pashenova NV, McNew DL, Steimel J, Konstantinov MY. 2002. Species delimitation and host specialization of Ceratocystis laricicola and C. polonica to larch and spruce. Plant Dis. 86:418. Hasegawa M, Kishino H, Yano T. 1985. Dating of the humanape splitting by a molecular clock of mitochondrial DNA. J Mol Evol. 22:160–174. Hey J. 2005. On the number of New World founders: a population genetic portrait of the peopling of the Americas. PLoS Biol. 3:e193. Hey J, Machado CA. 2003. The study of structured populations— new hope for a difficult and divided science. Nat Rev Genet. 4:535–543. Hey J, Nielsen R. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics. 167:747–760. Hey J, Nielsen R. 2006. IM documentation. [cited 2006 July]; [Internet]. Piscataway, (NJ): Rutgers University. Available from: http://lifesci.rutgers.edu/;heylab/ProgramsandData/Programs/ IM/IMdocumentation.pdf. Hibbett DS, Fukumasa-Nakai Y, Tsuneda A, Donoghue, MJ. 1995. Phylogenetic diversity in shiitake inferred from nuclear ribosomal DNA sequences. Mycologia. 87:618–638. Horgen PA, Arthur R, Davy O, Moum A, Herr F, Straus N, Anderson J. 1984. The nucleotide sequence homologies of unique DNA’s of some cultivated and wild mushrooms. Can J Microbiol. 30:587–593. Hussey BMJ, Keighery GJ, Cousens RD, Dodd J. Lloyd SG. 1997. Western weeds, a guide to the weeds of Western Australia. Australia: Plant protection society of WA. Joy DA, Feng X, Mu J, et al. (12 co-authors). 2003. Early origin and recent expansion of Plasmodium falciparum. Science. 300:318–321. Juergens T, Linde C, McDonald AB. Forthcoming 2006. Genetic structure of Mycosphaerella graminicola populations from Iran, Argentina and Australia. Eur J Plant Pathol. 115:223– 233. Kasuga T, White TJ, Taylor JW. 2002. Estimation of nucleotide substitution rates in Eurotiomycete fungi. Mol Biol Evol. 19:2318–2324. Kema GHJ, Annone JG, Sayoud R, Van Silfhout CH, Van Ginkel M, de Bree J. 1996. Genetic variation for virulence and resistance in the wheat-Mycosphaerella graminicola pathosystem. I. Interactions between pathogen isolates and host cultivars. Phytopathology. 86:200–212. Kema GHJ, Sayoud R, Annone JG, Van Silfhout CH. 1996. Genetic variation for virulence and resistance in the wheatMycosphaerella graminicola pathosystem. II. Analysis of interactions between pathogen isolates and host cultivars. Phytopathology. 86:213–220. Kema GHJ, Verstappen ECP, Todorova M, Waalwijk C. 1996. Successful crosses and molecular tetrad and progeny analyses demonstrate the heterothallism in Mycosphaerella graminicola. Curr Genet. 30:251–258. Keon J, Hargreaves J. 1998. Isolation and heterologous expression of a gene encoding 4-hydroxyphenylpyruvate dioxygenase from the wheat leaf-spot pathogen, Mycosphaerella graminicola. FEMS Microbiol Lett. 161:337–343. Kimura M. 1969. The number of heterozygous nucleotide sites maintained in a finite population due to a steady flux of mutations. Genetics. 61:893–903. Kohn LM. 2005. Mechanism of fungal speciation. Annu Rev Phytopathology. 43:279–308. Kuhner MK, Yamato J, Beerli P, Smith LP, Rynes E, Walkup E, Li C, Sloan J, Colacurcio P, Felsenstein J. 2004. LAMARC v. 1.2.1 [Internet]. University of Washington. Available from: http://evolution.gs.washington.edu/lamarc.html. Linde CC, Zhan J, and McDonald BA. 2002. Population structure of Mycosphaerella graminicola: from lesions to continents. Phytopathology. 92:946–955. McDonald BA, Zhan J, Yarden O, Hogan K, Garton J, Pettway RE. 1999. The population genetics of Mycosphaerella graminicola and Phaeosphaeria nodorum. In: Lucas JA, Bowyer P, Anderson HM, editors. Septoria on cereals: a study of Pathosystems. Wallingford (UK): CABI. p. 44–69. Moore AMT, Hillman GC, Legge AJ. 2000. Village on the Euphrates: from foraging to farming at Abu Hureyra. New York: Oxford University Press. Natvig DO, May G. 1996. Fungal evolution and speciation. J Genet. 75:441–452. Nei M. 1987. Molecular evolutionary genetics. New York: Columbia University Press. Nielsen R, Wakeley J. 2001. Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics. 158:885–896. Petersen RH, Hughes KW. 1999. Species and speciation in mushrooms. Bioscience. 49:440–452. Price EW, Carbone I. 2005. SNAP: workbench management tool for evolutionary population genetic analysis. Bioinformatics. 21:402–404. Rozas J, Rozas R. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics. 15:174–175. Origin and Domestication of Mycosphaerella graminicola 411 Saadaoui EM. 1987. Physiologic specialization of Septoria tritici in Morocco. Plant Dis. 71:153–155. Salamini F, Ozkan H, Brandolini A, Schafer-Pregl R, Martin W. 2002. Genetics and geography of wild cereal domestication in the Near East. Nat Rev Genet. 3:429–441. Sharma HC, Waines JG. 1980. Inheritance of tough rachis in crosses of Triticum monococcum and T. boeoticum. J Hered. 71:214–216. Shaw MW, Royle DJ. 1989. Airborne inoculum as a major source of Septoria tritici (Mycosphaerella graminicola) infections in winter wheat crops in the UK. Plant Pathol. 8:35–43. Stenlid J, Karlsson JO. 1991. Partial intersterility in Heterobasidion annosum. Mycol Res. 95:1153–1159. Swofford DL. 1996. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0. Sunderland (MA): Sinauer Associates. Taenzler B, Esposti RF, Vaccino P, Brandolini A, Effgen S, Heun M, Schäfer-Pregl R, Borghi B, Salamini F. 2002. Molecular linkage map of Einkorn wheat: mapping of storage—protein and soft-glume genes and bread-making quality QTLs. Genet Res Camb. 80:131–143. Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 123:585–595. Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Geiser DM, Hibbett DS, Fisher MC. 2000. Phylogenetic species recognition and species concepts in fungi. Fungal Genet Biol. 31:21–32. Templeton AR, Crandall KA, Sing CF. 1992. A cladistic analysis of the phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics. 132:619–633. van Ginkel M, Scharen AL. 1987. Generation mean analysis and heritabilities of resistance to Septoria tritici in durum wheat. Phytopathology. 77:1629–1633. Vilgalys RJ, Johnson JL. 1987. Extensive genetic divergence associated with speciation in filamentous fungi. Proc Natl Acad Sci USA. 84:2355–2358. White TJ, Bruns T, Lee S, Taylor J. 1990 Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis M, Gelfand DH, Sninsky JJ, White TJ, editors. PCR protocols. San Diego: Academic Press. p. 315–322. Won Y-J, Hey J. 2005 Divergence population genetics of chimpanzees. Mol Biol Evol. 22:297–307. Zeyl C, DeVisser JAGM. 2001. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics. 157:53–61. Zhan J, Kema GHJ, McDonald BA. 2004. Evidence for natural selection in the mitochondrial genome of Mycosphaerella graminicola. Phytopathology. 94:261–267. Zhan J, Linde CC, Juergens T, Merz U, Steinebrunner F, McDonald BA. 2005. Variation for neutral markers is correlated with variation for quantitative traits in the pathogenic fungus Mycosphaerella graminicola. Mol Ecol. 14:2683–2693. Zhan J, Pettway RE, McDonald BA. 2003. The global genetic structure of the wheat pathogen Mycosphaerella graminicola is characterized by high nuclear diversity, low mitochondrial diversity, regular recombination, and gene flow. Fungal Genet Biol. 38:286–297. Zohary D, Hopf M. 2000. Domestication of plants in the Old World. 3rd ed. Oxford: Oxford University Press. Jody Hey, Associate Editor. Accepted November 3, 2006
© Copyright 2026 Paperzz