Origin and Domestication of the Fungal Wheat Pathogen Mycosphaerella
graminicola via Sympatric Speciation
Eva H. Stukenbrock,* Søren Banke,* Mohammad Javan-Nikkhah,† and Bruce A. McDonald*
*Plant Pathology, Institute of Integrative Biology, Eidgenössische Technische Hochschule, Zurich Land und Forst Wissenschaft,
Zurich, Switzerland; and †Department of Plant Protection, Faculty of Agriculture, University of Tehran Plant
Protection Building, Karaj, Iran
The Fertile Crescent represents the center of origin and earliest known place of domestication for many cereal crops.
During the transition from wild grasses to domesticated cereals, many host-specialized pathogen species are thought
to have emerged. A sister population of the wheat-adapted pathogen Mycosphaerella graminicola was identified on wild
grasses collected in northwest Iran. Isolates of this wild grass pathogen from 5 locations in Iran were compared with 123 M.
graminicola isolates from the Middle East, Europe, and North America. DNA sequencing revealed a close phylogenetic
relationship between the pathogen populations. To reconstruct the evolutionary history of M. graminicola, we sequenced 6
nuclear loci encompassing 464 polymorphic sites. Coalescence analyses indicated a relatively recent origin of M. graminicola, coinciding with the known domestication of wheat in the Fertile Crescent around 8,000–9,000 BC. The sympatric divergence of populations was accompanied by strong genetic differentiation. At the present time, no genetic
exchange occurs between pathogen populations on wheat and wild grasses although we found evidence that gene flow
may have occurred since genetic differentiation of the populations.
Introduction
Archaeological findings provide evidence of the important changes that took place in human culture when the first
farming practices were initiated about 12,000 years ago in the
Fertile Crescent (Childe 1953; Moore et al. 2000; Zohary and
Hopf 2000; Gopher et al. 2002). Genetic data can support archaeology as genome-wide measures of genetic similarity
have traced the origin of several domesticated cereals to wild
populations of naturally occurring grasses that persist in the
Middle East (see Salamini et al. 2002 for review). Intensive
selection and cultivation of selected phenotypes altered populations of wild grasses into domesticated varieties of crops.
For the cereal crops, the main morphological features that
changed through this selection process were seed size, stiffness of the ear rachis, and the ease with which the seeds are
released from the glumes (Sharma and Waines 1980; Elias
et al. 1996; Taenzler et al. 2002). The process of domestication of certain grasses not only changed the genetic structure
of the plant populations but also changed the physical and
biotic environment for pathogens associated with the selected grass species. In wild host populations, characterized
by patchy distributions, low plant density, high genetic diversity, and uneven distribution of nutrients, the pathogen is exposed to substantial environmental fluctuations, including
factors such as humidity, UV radiation, and nutrient availability (Burdon 1993). Agricultural host populations are
characterized by higher plant densities, more even nutrient
distribution, and greater genetic uniformity, creating a
more uniform environment that is less prone to environmental fluctuation and that is also more conducive to
pathogen transmission. It is likely that the initiation of agricultural practices and the development of new crop species
also led to the emergence of new pathogens or to significant
changes through local adaptation in already existing pathogen populations.
Key words: plant pathogens, coevolution, sympatric speciation, population growth, Mycosphaerella graminicola, septoria tritici.
Mol. Biol. Evol. 24(2):398–411. 2007
doi:10.1093/molbev/msl169
Advance Access publication November 9, 2006
© The Author 2006. Published by Oxford University Press on behalf of
the Society for Molecular Biology and Evolution. All rights reserved.
One of the most serious pathogens on cultivated wheat
today is the fungus Mycosphaerella graminicola that is
found worldwide with its host. Mycosphaerella graminicola is known to be host specific and does not occur on other
host species (Eyal et al. 1973, 1985; Saadaoui 1987; van
Ginkel and Scharen 1987). Genetic analysis of virulence
and resistance in the wheat–M. graminicola pathosystem
suggests that pathogenicity is controlled by several loci and
is likely inherited as a quantitative trait (Kema, Annone,
et al. 1996; Kema, Sayoud, et al. 1996; Zhan et al. 2005).
Population genetic studies at both regional and continental
scales demonstrated high levels of genetic diversity and
gene flow between pathogen populations (Zhan et al.
2003; Banke and McDonald 2005). Mycosphaerella graminicola reproduces both clonally through the production of
splash-dispersed pycnidiospores and sexually by the formation of ascospores. Ascospores are wind dispersed (Shaw
and Royle 1989) and may provide an important means
of long-distance dispersal together with the continuous
human-mediated transport of infected plant material
(Brown and Hovmøller 2002). Phylogeographic studies indicated that the center of origin of M. graminicola is the
Middle East and that the pathogen was dispersed throughout the world during the spread of wheat cultivation
(McDonald et al. 1999; Banke et al. 2004). Ancient patterns
of migration inferred from DNA sequence data supported
this hypothesis by showing an asymmetrical migration of
the pathogen from European and Israeli populations to
‘‘New World’’ countries (Banke and McDonald 2005). Estimates of recent migration patterns using microsatellite
markers support an asymmetrical migration of the pathogen
from the ‘‘Old World’’ but additionally include North
America as a contemporary major donor of pathogen immigrants (Banke and McDonald 2005). This pattern of contemporary migration is consistent with the intensive
cultivation of wheat on the North American continent today
and global wheat trading patterns.
We hypothesized that M. graminicola emerged as a
new pathogen during the process of wheat domestication.
Sister populations of M. graminicola on other grass hosts
have, up until now, not been available to test this hypothesis. The closest known relative of M. graminicola is the
barley-adapted pathogen Septoria passerinii; however, it is
not known whether divergence of these 2 pathogen species
coincided with the domestication of wheat and barley. Recently collected isolates of Mycosphaerella pathogens from
noncultivated grasses in Iran provided us the opportunity to
study the evolutionary history of M. graminicola.
Table 1
Pathogen Populations Included in the Analysis
Pathogen                         Population      Number of Isolates   Location
Mycosphaerella sp. (S1 and S2)   Iran 1          4                    Ardabil Province, Khalkhal
                                 Iran 2          3                    Ardabil Province, Khalkhal
                                 Iran 3          31                   Ardabil Province, Ardabil
                                 Iran 4          9                    Ardabil Province, Meshkin Road
                                 Iran 5          14                   Ardabil Province, Meshkin Road
Mycosphaerella graminicola       Iran            16                   Mehran
                                 Israel          40                   Sa’ad
                                 United States   32                   Oregon
                                 Switzerland     32                   Zurich, Winterthur
Septoria passerinii              United States   10                   North Dakota
DNA sequence data can be combined with genealogical-based analyses to infer processes of population and
species divergence and to identify evolutionary forces that
have shaped the genetic structure of an organism (Avise
2000; Hey and Machado 2003). For fungi, such studies
are limited. Fisher et al. (2001) used a genealogical approach to characterize the biogeographic expansion of
the human pathogen Coccidioides immitis into South
America. They concluded that the expansion coincided
with the human colonization of South America but the estimated timeframe of the expansion had a very large confidence interval (CI) (2,700–228,000 years). In a recent
study of the rice blast pathogen Magnaporthe oryzae,
Couch et al. (2005) assessed the divergence time for a host
shift event from millet to cultivated rice. Their Bayesian-derived estimates suggested an early origin of the rice-infecting pathogen approximately 9,000 years ago (95%
CI = 2,500–28,000 years ago), which could be associated
with rice domestication.
The emergence of a wheat-adapted Mycosphaerella
population is likely to have occurred as a sympatric process
where no geographical barriers separated pathogen populations occurring on wild grasses and populations adapting to
the new domesticated crops. Because of the close proximity
of wild and cultivated host populations and the long-distance dispersal of airborne ascospores, a continuous
exchange of genes would have been possible during the process of divergence between Mycosphaerella populations occurring on wild and agricultural host populations. This
scenario complicates the inference of coalescence patterns.
A regular exchange of genetic information may result in
shared polymorphisms that create the appearance of recent
divergence even when the actual splitting occurred long ago.
To account for this possibility, we applied an isolation with
migration model which takes into account the exchange of
genes between populations after divergence from a common
ancestral population (Nielsen and Wakeley 2001; Hey and
Nielsen 2004). The method allows the estimation of multiple
coalescence parameters such as founding population sizes,
changes in population size, the time of population formation,
and gene flow. Previous studies used this coalescence approach to describe the divergence of Drosophila species,
chimpanzee subpopulations, and ancient human populations
(Hey and Nielsen 2004; Won and Hey 2005; Hey 2005).
Table 1 (continued)
Pathogen                         Population      Host Plant           Year        Sampling Strategy   Collectors
Mycosphaerella sp. (S1 and S2)   Iran 1          Lolium multiflorum   2004        Random              Javan-Nikkhah M
                                 Iran 2          Dactylis glomerata   2004        Random              Javan-Nikkhah M
                                 Iran 3          Lolium multiflorum   2004        Random              Javan-Nikkhah M
                                 Iran 4          Agropyron repens     2004        Random              Javan-Nikkhah M
                                 Iran 5          Lolium multiflorum   2004        Random              Javan-Nikkhah M
Mycosphaerella graminicola       Iran            Triticum aestivum    2001        Random              Javan-Nikkhah M
                                 Israel          Triticum aestivum    1992        Hierarchical        Yarden O
                                 United States   Triticum aestivum    1990–1992   Hierarchical        Mundt CC and McDonald BA
                                 Switzerland     Triticum aestivum    1999        Hierarchical        McDonald BA
Septoria passerinii              United States   Hordeum vulgare      1995        Random              Long D
The transition from hunter-gatherer cultures led to the
emergence of new human infectious diseases that evolved
from herd-domesticated animals and successfully spread
in the associated human populations (Diamond 1999). We
hypothesized a parallel emergence of new plant pathogens
mediated by the domestication of crop plants and the development of agriculture. It is probable that the dense host populations and different environmental conditions existing in
agroecosystems selected for new pathogen populations that
were locally adapted to the agroecosystem, setting the stage
for pathogen population differentiation and rapid speciation.
To test these hypotheses, we here inferred the evolutionary
history of M. graminicola by comparing gene genealogies of
closely related populations on uncultivated grasses from the
center of origin of both host and pathogen.
Materials and Methods
Fungal Isolates
Uncultivated grasses of 3 different genera, Agropyron
repens, Dactylis glomerata, and Lolium multiflorum, infected by a fungal pathogen in the genus Mycosphaerella
were collected in the northwestern Iranian province Ardabil
(table 1). This region is one of the most important locations
for wheat cultivation in Iran. Little is known about the origin and evolution of the 3 grass hosts; however, they are
known to be native to Europe, North Africa, and temperate
Asia (Hussey et al. 1997). Pathogen isolation and DNA
extraction were performed as previously described (Linde
et al. 2002). Using repetitive sequence–based polymerase
chain reaction (Rep-PCR) fingerprints to differentiate clonal
lineages, we selected 28 unique isolates for sequencing.
One hundred twenty-three strains of M. graminicola isolated from cultivated wheat collected in Iran, Israel,
Oregon, and Switzerland were used to infer the coalescence
of Mycosphaerella populations on grasses and wheat. Mycosphaerella graminicola populations were described previously (McDonald et al. 1999; Linde et al. 2002; Zhan
et al. 2003; Banke et al. 2004; Juergens et al. 2006). Additionally, we included 10 isolates of the closely related barley
pathogen S. passerinii collected in Walsh, North Dakota
(kindly provided by Goodwin SB) (table 1).
Polymerase Chain Reaction and Sequencing
Internal transcribed spacer (ITS) amplification was
performed as described by White et al. (1990) using the
primer combination ITS4/ITS5. For population genetic
and coalescence analyses, we sequenced 6 DNA loci previously described in M. graminicola: β-tubulin (321 bp, a transcribed sequence), Leu (111-bp promoter and 189-bp
transcribed sequences of the 3-isopropylmalate dehydrogenase gene), HPPD (560 bp, a promoter sequence of the 4-hydroxyphenylpyruvate dioxygenase gene), STS2 (402 bp
of an anonymous restriction fragment length polymorphism
[RFLP] locus), STL10 (1,104 bp of an anonymous RFLP locus), and STS43 (430 bp of an anonymous RFLP locus). Sequence data from the loci β-tubulin, Leu, STS2, and STL10
were used in previous studies to determine phylogeny and
patterns of migration in M. graminicola (Banke et al.
2004; Banke and McDonald 2005). STS43 is a nuclear RFLP
locus that was used in previous population genetic studies of
M. graminicola (McDonald et al. 1999). The HPPD locus
was originally sequenced and described by Keon and
Hargreaves (1998). Sequence data not published in previous studies can be accessed in the National Center for Biotechnology Information GenBank under the accession numbers
DQ535489–DQ535722. Amplification and sequencing of
all loci were performed as described in Banke et al. (2004).
Sequences were aligned and manually adjusted using the
programs Sequencher 4.0 (Gene Code, Ann Arbor, MI)
and BioEdit Sequence Alignment Editor (http://www.mb.mahidol.ac.th/Downloads/Mol-Bio/Bioedit/Bioedit.htm).
Data Analysis
Gene Diversity
As most polymorphic sites are located in noncoding
DNA, we did not differentiate between silent or nonsilent
mutations and all polymorphic sites were analyzed. To initially characterize the phylogenetic relationship between
pathogen species, we used the ITS alignment to construct
a maximum parsimony tree by heuristic searches in PAUP*
4.0 (Swofford 1996). Trees were built by 100 iterations of
random taxon addition with a different number seed for
each iteration. Maximum parsimony analyses were also
performed using sequence alignments of the 6 DNA loci.
Within the Mycosphaerella population from uncultivated
grasses, we identified 2 clades, here named S1 (representing
27 isolates) and S2 (representing 34 isolates). Haplotype
diversity was calculated for each of the 6 loci using the program DnaSP version 3.53 (Nei 1987; Rozas J and Rozas R
1999). To test for deviation from neutral evolution, we performed Fu and Li’s D*, Fu and Li’s F* (Fu and Li 1993),
and Tajima’s D tests (Tajima 1989) for all sites in each gene
using DnaSP (Rozas J and Rozas R 1999). Maximum parsimony, haplotype diversity, and deviation from neutrality
were assessed using entire sequence alignments.
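As a rough illustration of two of the summary statistics named above, the following Python sketch computes Nei's haplotype diversity and Tajima's D from a small alignment. It is not the DnaSP implementation: the function names are hypothetical, the sequences are assumed to be aligned and gap-free, and missing data are not handled.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free sequences; None if no site segregates."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for i in range(length) if len({q[i] for q in seqs}) > 1)
    if s == 0:
        return None
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants (many singletons) pulls D below zero, which is the pattern usually read as population growth or background selection; an excess of intermediate-frequency variants pushes it above zero.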
Compatible Alignments
Alignments were processed in the SNAP workbench
Java program package, which implements and coordinates
several programs to analyze gene genealogies and population parameters (Price and Carbone 2005). The program
SNAP map was used to collapse sequences into haplotypes,
removing indel mutations and excluding infinite-site violations (Aylor et al. 2006). Conflicting sites showing homoplasy were initially identified among variable sites
using the compatibility methods SNAP clade and SNAP matrix in the SNAP workbench. To generate compatible sequence alignments for coalescence analyses, Carbone and
Kohn (2001) suggested the removal of haplotypes. Here, instead, we chose to manually remove conflicting sites in each
gene alignment to generate compatible nonrecombining
alignments for the parsimony and the coalescence analysis.
Table 2 shows the number of sites before and after manipulation of the alignments. The majority of polymorphic sites
removed were found within the 4 M. graminicola populations, consistent with a high number of recombination events.
Removal of conflicting sites may have introduced a bias into
the data set. The majority of conflicting sites in the 6 genes
were found between M. graminicola isolates, and we believe
that the removal of conflicting sites did not bias the migration
or coalescence estimates between the 2 groups.
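The site-filtering idea described in this section can be sketched in a few lines of Python. This is not the SNAP workbench code. It is a minimal illustration, under the infinite-sites assumption, that collapses sequences into haplotypes and flags pairs of biallelic sites that fail the four-gamete test (all four allele combinations present), which is one common way to detect conflicting sites. All function names are hypothetical.

```python
from itertools import combinations


def collapse_haplotypes(seqs):
    """Map each distinct sequence to the list of sample indices carrying it."""
    haplotypes = {}
    for idx, seq in enumerate(seqs):
        haplotypes.setdefault(seq, []).append(idx)
    return haplotypes


def variable_sites(seqs):
    """Indices of gap-free columns with exactly two states (biallelic sites)."""
    sites = []
    for i in range(len(seqs[0])):
        column = {s[i] for s in seqs}
        if "-" not in column and len(column) == 2:
            sites.append(i)
    return sites


def incompatible_pairs(seqs):
    """Pairs of biallelic sites at which all four gametes are observed."""
    conflicts = []
    for i, j in combinations(variable_sites(seqs), 2):
        gametes = {(s[i], s[j]) for s in seqs}
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts
```

A pair flagged here cannot be explained by a single tree without recombination or recurrent mutation, so one site of the pair (or a haplotype carrying the conflict) has to be dropped to obtain a compatible alignment.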
Haplotype Networks
We analyzed the haplotype alignments of all loci using
the program TCS 1.3 (Clement et al. 2000). This program
applies a statistical parsimony method to infer unrooted cladograms based on Templeton’s 95% parsimony connection
limit (Templeton 1992). All polymorphic sites in the compatible sequence alignments were used to generate the haplotype networks.
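To make the notion of mutational steps between haplotypes concrete, here is a small Python sketch that counts pairwise differences and lists the haplotype pairs separated by a single step. It does not implement TCS or Templeton's 95% connection limit; the haplotype labels and sequences are invented for illustration only.

```python
from itertools import combinations


def mutational_steps(a, b):
    """Number of sites at which two aligned haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def one_step_links(haplotypes):
    """Return (label, label) pairs whose sequences differ by exactly one step."""
    links = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(haplotypes.items()), 2):
        if mutational_steps(seq_a, seq_b) == 1:
            links.append((name_a, name_b))
    return links


# Hypothetical toy haplotypes: H1 is central; H2 and H3 each differ from it at one site.
toy = {"H1": "ACGT", "H2": "ACGA", "H3": "TCGT", "H4": "TGGA"}
print(one_step_links(toy))
```

A central haplotype linked to many one-step neighbors produces the star-like shape that is commonly associated with a recently expanded population.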
LAMARC Parameter Estimation
Population growth parameters, θ values, and recombination rates were inferred using the program
LAMARC 2.0 (Kuhner et al. 2004). To compare parameters
in pathogen populations on uncultivated grasses and wheat,
we compared a pooled sample of the 4 M. graminicola populations and a pooled sample of S1 and S2. The pooling of
populations was done in order to compare a sample of
pathogens from cultivated wheat and from uncultivated
grasses. Because geographical populations of M. graminicola showed little subdivision, these populations were analyzed as one population sample. Due to the smaller sample
sizes of S1 and S2, we pooled these populations into one
sample. The larger sample sizes improved the performance
of the likelihood analysis. The LAMARC algorithm permits intragenic recombination and therefore allowed us
to use the complete sequence alignments for each locus.
All polymorphic sites were used to assess the population
parameters. To estimate θ and a population growth parameter, we used 10 initial chains with 2,000 genealogies sampled and 2 final chains with 20,000 genealogies sampled.
Analyses were run with fixed migration rates obtained from
analyses performed in the IM program (see below) (Nielsen
and Wakeley 2001). Because of large credibility intervals,
we also performed LAMARC analyses without fixed
migration estimates to determine the influence of these
on the LAMARC analyses. Population parameters were
inferred using both Bayesian and Maximum Likelihood
analyses. 95% credibility intervals were assessed only from
the Bayesian search.
Table 2
Estimates of Gene Diversity and Tests of Neutrality for Each Locus
Locus      Population       Total Analyzed     Total         Polymorphic  Haplotypes^a  Haplotype  Fu and Li's    Fu and Li's     Tajima's D
                            Sequence           Polymorphic   Sites per                  Diversity  D*^d           F*
                            Length^b           Sites         Population
β-tubulin  M. graminicola   285 (277)          39 (38)       20           22 (123)      0.8        NS^c           NS              NS
           S1                                                2            2 (13)        0.39       1.4 (0.05)     NS              NS
           S2                                                1            2 (7)         0.29       NS             NS              NS
HPPD       M. graminicola   545 (475)          165 (127)     85           52 (107)      0.93       −4.0 (0.02)    −3.8 (0.02)     −2.1 (0.05)
           S1                                                84           4 (16)        0.53       −2.6 (0.05)    −2.7 (0.05)     −1.9 (0.05)
           S2                                                72           3 (10)        0.39       −2.2 (0.02)    −2.4 (0.02)     −2.1 (0.01)
Leu        M. graminicola   300 (254)          44 (37)       33           19 (97)       0.56       NS             NS              NS
           S1                                                10           3 (11)        0.35       −2.6 (0.02)    −2.4 (0.02)     −2.0 (0.05)
           S2                                                ND^f         ND            ND         ND             ND              ND
STS2       M. graminicola   415 (398)          59 (35)       18           18 (106)      0.71       NS             NS              NS
           S1                                                1            2 (9)         0.39       −2.3 (0.02)    −2.5 (0.02)     −2.0 (0.01)
           S2                                                0            1 (16)        0          NS             NS              NS
STL10      M. graminicola   1104^e/515 (377)   53 (45)       92^e         36 (81)       0.85       −5.92 (0.02)   −5.47 (0.001)   −2.58 (0.02)
           S1                                                17           4 (7)         0.71       NS             NS              NS
           S2                                                18           3 (11)        0.69       NS             NS              NS
STS43      M. graminicola   445 (376)          53 (39)       44           21 (98)       0.88       NS             NS              NS
           S1                                                6            3 (6)         0.6        NS             NS              NS
           S2                                                ND^f         ND            ND         ND             ND              ND
Total                       2505 (2157)        413 (321)
a Number in parentheses indicates the total number of sequences sampled in each population.
b Numbers in parentheses indicate the length of the compatible sequence alignment and the number of polymorphic sites in the compatible alignments.
c Nonsignificant.
d Numbers in parentheses for the last 3 columns show P-values of the neutrality tests.
e Including 589 bp indel mutation present in M. graminicola but not in S1 and S2.
f Loci could not be amplified from the S2 group.
Isolation with Migration Parameter Estimation
Migration rates backward in time due to divergence of
pathogen populations on uncultivated grasses and wheat
were estimated using an isolation with migration model
implemented in the Bayesian-based isolation with migration
(IM) program (Nielsen and Wakeley 2001). The method estimates the likelihood function of the demographic parameters
θ (population diversity), t (time), m (the number of migrants per generation between the 2 populations), mt (the
time of migration events into each population), and μ (mutation rate) given 1) a demographic model, 2) the entire set of
data (the DNA sequences) and the likelihood of their possible underlying genealogies calculated using a Markov chain
Monte Carlo approach, and 3) a mutational model. In this
case, we assumed the infinite-site model of Kimura (1969),
which is intended for analyses of relatively recent cases of
population splitting. Analyses were also performed using the
finite-site model of Hasegawa et al. (1985), which allows for
recurrent mutations, differences in nucleotide frequencies,
and a transition/transversion bias. Highly similar results were
obtained using the 2 models. Here only results obtained with
the infinite-site model are shown. Nonrecombining sequence
alignments were applied as in-files for the IM program.
Initially, pairwise population comparisons were carried out between the 4 M. graminicola populations. The
IM program estimates mutation rates for each locus and
subsequently scales all demographic parameters by the
overall mutation rate. Bayesian posterior likelihood distributions for θ, m (m = mμ), and t (t = tμ) were calculated
using 10,000,000 simulations and a burn-in time of
500,000. Scalar for maximum θ values was set at 100, maximum migration rates (m) at 50, and maximum divergence
time (t) at 10. Inheritance scalar was set at 1 for autosomal
loci. We repeated every run 5 times with different random
number seeds to ensure convergence of the estimates. Marginal histograms were compared between all independent runs for all parameters. Lower and upper bounds of
the estimated 90% Highest Posterior Density intervals were
calculated for each parameter.
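For readers unfamiliar with Highest Posterior Density intervals, the following Python sketch shows one simple way to obtain a 90% HPD interval from a set of posterior samples: sort the samples and take the narrowest window that covers the requested fraction. The IM program derives its intervals from its own marginal histograms, so this is only an illustration of the concept; the function name and the `mass` parameter are assumptions.

```python
import math


def hpd_interval(samples, mass=0.90):
    """Narrowest interval containing at least `mass` of the sampled values."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must cover (epsilon guards against float round-off).
    k = max(1, math.ceil(mass * n - 1e-12))
    best_lo, best_hi = ordered[0], ordered[k - 1]
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    for start in range(1, n - k + 1):
        lo, hi = ordered[start], ordered[start + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi
```

Unlike an equal-tailed interval, the HPD interval shifts toward the mode of a skewed posterior, which matters for parameters such as divergence time whose distributions often have a long upper tail.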
Based on the low population subdivision found with
the IM analyses, the 4 M. graminicola populations were
pooled to form one panmictic population sample. To infer
the coalescence of M. graminicola from the most closely
related wild grass population S1, we conducted an analysis
of these 2 population samples. The IM program estimated
Bayesian posterior likelihood distributions of θ of the ancestral founding population and of the 2 present-day populations; t, the population splitting time; m, gene flow rates
between populations; and s, the fraction of the ancestral
population that founded the M. graminicola population
(where 0 < s < 1) (Hey 2005). Independent analyses were conducted using either mutation scalars for each locus (see
below) or the geometric mean across all loci as inferred in
IM. Analyses were conducted using 5,000,000 simulations
and a burn-in time of 100,000. Initially, scalar for maximum θ values was set at 200, maximum migration rates
(m) at 50, and maximum divergence time (t) between M.
graminicola and S1 at 50 and between M. graminicola
and S2 at 100. The first analyses showed large CIs for
the t estimates, and we therefore performed several analyses optimizing the maximum t value to 4 to have a smaller
CI. We used Metropolis Coupling implemented with 5
chains and a 2-step increment model as described by
Hey and Nielsen in the IM documentation (2006). Every
run was repeated 5 times with different random number
seeds.
Calculation of Mutation Rates
The IM analysis uses the estimate of μ (the mutation rate per
locus per year) for each locus to infer t (time since divergence
of populations). To convert the estimates of t to real time,
we needed an overall mutation rate per locus per year. To
scale μ, we used an estimate of mutation rate in exons in the
β-tubulin locus inferred by Kasuga et al. (2002). The estimate of Kasuga et al. (2002) is given as mutation rate per
site per year. We converted this estimate to mutation rate
per locus per year using the observed sequence divergence
between M. graminicola and S. passerinii in the β-tubulin
locus. A mutation rate for the β-tubulin locus was calculated and used to convert IM estimates of μ for all 6
loci. The mean mutation rate across all loci was used to
calculate t.
Conversion of IM Coalescence Parameters to
Real Time
To convert estimates of coalescence time to real time,
the overall scaled mutation rate was used. The mean mutation rate across the 6 loci was estimated to be 2.0 × 10⁻⁵.
We assumed a generation time of M. graminicola of 1 year
(Kema, Verstappen, et al. 1996; Zhan et al. 2004) and the
coalescence time in years, T, can then be calculated as T = t/μ
(where t is the IM time estimate and μ is the overall mutation rate).
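The conversion described above is a single division, shown here as a short Python sketch. The mutation rate is the mean value reported in the text; the scaled time passed in is a made-up number used only to show the arithmetic and is not an estimate from this study. The function name is hypothetical.

```python
def im_time_to_years(t_scaled, mu_per_year):
    """Convert a mutation-scaled IM divergence time to years by dividing by the mutation rate."""
    if mu_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return t_scaled / mu_per_year


# Mean mutation rate reported in the text (per locus per year).
MU = 2.0e-5
# Hypothetical scaled time, chosen only to illustrate the arithmetic.
example_t = 0.1
print(f"{im_time_to_years(example_t, MU):,.0f} years")
```

Because the assumed generation time is 1 year, a rate per year and a rate per generation coincide here; with a different generation time the result would need to be rescaled accordingly.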
Genetree Population Divergence Time
We also used the program Genetree (Griffiths and
Tavaré 1994), incorporated in the SNAP program package
to infer the sequence coalescence for each gene. Genetree
allows the coalescence parameters to be estimated under
a full multipopulation coalescence model. For this analysis,
we therefore used a full data set of all populations including
M. graminicola, S1, S2, and S. passerinii. The program estimates the ancestral history of each haplotype and shows
the distribution of mutations on a coalescence scale. This
allowed us to compare the divergence of haplotypes between and within each population for each gene.
Due to intergenic recombination, we were not able to
combine sequence alignments from the 6 loci. Instead, coalescence parameters were estimated for each locus separately using the compatible sequence alignments of each
gene. The coalescence analyses were performed assuming
subdivision between the M. graminicola populations, the
S1 and S2 populations, and S. passerinii. As we assume
different population sizes among the wheat-adapted pathogens and the wild grass pathogens, we inferred the coalescence of populations with different population sizes. To do
this, we applied information obtained from the IM estimates
about the relative differences in population sizes. To determine the order of coalescence events for haplotypes backward in time, it is necessary to know the amount of
migration between populations. Haplotypes from populations linked by migration will coalesce before haplotypes
from unlinked populations. As suggested by Carbone and
Kohn (2001), we constructed migration matrices for each
locus, indicating the number of migrants exchanged between populations, using the program Migrate (Beerli
and Felsenstein 2001). These migration matrices were then
used as backward migration matrices for ancestral inference
in Genetree. The genealogy with the highest root probability was determined by first performing 500,000 simulations
of the coalescent with 10 different starting random number
seeds. From these runs, the tree with the highest root probability was selected showing the relative divergence time
for the pathogen populations and the distribution of mutations along the branches. To convert coalescence units to
real time, we used the S1–M. graminicola population split
as a calibration point. As this parameter was estimated
across all 6 loci, it provided a robust measure of t. As
the coalescence time scale of the gene tree is linear, it
was possible to assess the time of the split of S. passerinii
and S2, as well as the diversification of M. graminicola.
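Because the coalescent time scale is treated as linear, calibrating a gene tree amounts to multiplying every node time by one constant. The Python sketch below illustrates this; the node labels, coalescent-unit times, and calibration age are all invented placeholders, not values from this study, and the function name is hypothetical.

```python
def calibrate_node_times(node_times, calibration_node, calibration_years):
    """Rescale coalescent-unit node times so the calibration node has the given age in years."""
    reference = node_times[calibration_node]
    if reference <= 0:
        raise ValueError("calibration node must have a positive coalescent time")
    scale = calibration_years / reference  # years per coalescent unit
    return {node: t * scale for node, t in node_times.items()}


# Hypothetical node times in coalescent units (placeholders only).
toy_times = {"calibration split": 0.5, "older split": 1.0, "oldest split": 2.0}
# Hypothetical calibration age in years (placeholder only).
for node, years in calibrate_node_times(toy_times, "calibration split", 10_000).items():
    print(f"{node}: {years:,.0f} years")
```

Any uncertainty in the calibration point carries over proportionally to every other node, so the derived dates are only as reliable as the calibrating split.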
Results
Sixty-one Mycosphaerella isolates were obtained from
wild grasses collected at 5 different locations in northwest
Iran. We differentiated clonal lineages using Rep-PCR
fingerprints and selected 28 unique isolates from the 5
Iranian populations for detailed sequence analysis.
Gene Diversity and Tests of Neutrality
Based on ITS sequence data, M. graminicola and S.
passerinii form a monophyletic cluster within the Mycosphaerella clade (Goodwin et al. 2001). To evaluate the
phylogenetic proximity of the wild grass Mycosphaerella
isolates to M. graminicola, we initially performed a maximum parsimony analysis based on ITS sequences using S.
passerinii as outgroup. Three ITS sequence types were
identified among the wild grass populations. The ITS type
1 was identical to ITS sequenced from M. graminicola, the
ITS type 2 differed by 1 nucleotide substitution and 4 indel
mutations, and the ITS type 3 differed by 2 nucleotide substitutions and 5 indel mutations. In comparison, the number
of nucleotide substitutions between M. graminicola and S.
passerinii was 13.
We amplified and sequenced 6 polymorphic DNA
sequence loci, a total of 3,080 bp, from the wild grass
Mycosphaerella populations and from 4 populations of
M. graminicola and 10 isolates of S. passerinii. A total
of 464 polymorphic sites were identified among the 6 loci
representing more than 10% of the total number of sequenced base pairs. The majority of the polymorphic sites
were found only within the wheat-adapted M. graminicola
population. Numbers of haplotypes and gene diversity for
each locus for M. graminicola, S1, and S2 are summarized
in table 2. Maximum parsimony analyses were performed
for each of the 6 sequence loci to assess phylogenetic relationships (data not shown). Consistent tree topologies identified 2 clusters within the collection of Mycosphaerella
spp. from wild grasses, here named S1 and S2 (S1 representing the ITS type 1 and S2 the ITS types 2 and 3). None
of the haplotypes identified among the 6 loci from S1 and
S2 were shared with M. graminicola.
Neutrality tests were performed for each locus in the
S1 and S2 Mycosphaerella clades and for M. graminicola.
Evidence of nonneutral evolution was found in the HPPD
locus, for the S1 cluster in Leu and STS2, and for M. graminicola in the STL10 locus (table 2). For these 3 loci, the
significant statistical values were negative. A significant
test result is consistent with either population growth or
shrinkage or background selection (Fu 1997). As the majority of nucleotide positions were located in noncoding
DNA, we hypothesized that the deviation from neutrality was due to change in effective population size. This
hypothesis was supported by the population growth estimates inferred by the program LAMARC 2.0 as described
below.
Haplotype Networks
Haplotype networks were inferred from all 6 loci using
compatible sequence alignments. No pattern of geographical association was revealed among the M. graminicola
haplotypes, and the most frequent haplotype for the 6 loci
was present in all 4 geographical populations. All 6 networks exhibited a star-like shape for the M. graminicola
haplotypes. Haplotype networks of the STS2 and HPPD
loci are shown in figures 1a and b. These 2 networks
had the smallest number of missing steps. The STS2 network contained 26 M. graminicola haplotypes, 2 S1 haplotypes, 1 S2 hapotype, and 3 S. passerinii haplotypes.
The total number of STS2 and HPPD haplotypes was higher
(table 2); however, as conflicting sites were removed from
the alignment prior to the analysis, the number of haplotypes was decreased. The number of mutational steps between S. passerinii and the S2 population was 72 and
between S2 and M. graminicola was 18. Between S1
and M. graminicola, there were 13 mutations. Homoplasy
was present in the STS2 network in the M. graminicola
clades as shown by the loop between haplotype numbers
21, 3, 35, and 25.
In the HPPD network 29 haplotypes were recognized
in the M. graminicola cluster, 2 in the S1, 2 in the S2, and 2
in the S. passerinii cluster. There were 76 differences between S. passerinii and M. graminicola. Between the S1
group and M. graminicola populations there were 8 differences, and between S2 and M. graminicola 9 differences.
For both genes, the number of steps between M. graminicola haplotypes did not exceed 3 mutations, suggesting
that almost all possible haplotypes were sampled for this
group.
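The step counts reported for the networks are, at their simplest, numbers of differing sites between aligned haplotypes. The following is a minimal sketch of that count only, not the parsimony procedure used to build the networks. The sequences and the function name mutational_steps are hypothetical and are not data from this study.

```python
def mutational_steps(seq_a: str, seq_b: str) -> int:
    """Count the sites at which two aligned haplotype sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical aligned haplotypes, for illustration only.
haplotypes = {
    "hap1": "ACGTACGTAC",
    "hap2": "ACGTACGTAT",
    "hap3": "ACGAACGTAT",
}

# Pairwise step counts for every unordered pair of haplotypes.
names = sorted(haplotypes)
pairwise = {
    (x, y): mutational_steps(haplotypes[x], haplotypes[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}

# The largest pairwise distance; a small maximum suggests few unsampled
# intermediate haplotypes between the observed ones.
max_steps = max(pairwise.values())
```

A small maximum distance between neighboring haplotypes, as reported above for M. graminicola, is what one would expect when few intermediate haplotypes are missing from the sample.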
Population Expansion
The population parameters θ (for haploids θ = 2Neμ,
where Ne is effective population size and μ is mutation rate)
and population growth rates were calculated using both
Maximum Likelihood and Bayesian analyses in LAMARC.
θ values were used as a relative measure of effective population sizes. We found consistent values of θ using the 2
analytical strategies; however, the growth rate estimates
were less consistent and CIs for this parameter were large.
The Bayesian estimates of θ and growth rate are summarized in table 3. The overall pattern observed for the 6 genes
suggests a larger effective population size of M. graminicola (θ = 0.022) when compared with the pooled S1
and S2 (θ = 0.0063). For individual loci, θ ranged from
0.02 (β-tubulin) to 0.16 (HPPD) in M. graminicola and
from 0.004 (STS2) to 0.017 (Leu) in the S1 and S2 population. The overall growth rate of the M. graminicola population (13.86) is consistent with a large effective
population size and a population expansion. The negative
growth rate values in the S1–S2 population (−139.91) indicate a decline in population size or a consistently small
population size. For individual loci in M. graminicola,
growth rates ranged from 1.51 (Leu) to 472.08 (STS2)
and for the S1–S2 population from −462.28 (STS2) to
18.75 (Leu). These estimates may reflect a slower rate of
evolution of the Leu locus as compared with the highly variable STS2 locus.
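To make the sign of the growth parameter concrete, the sketch below assumes the exponential model usually described for LAMARC, θ(t) = θ_now · exp(−g · t), where t is time before the present in mutational units. Under that assumption a positive g implies a smaller θ in the past (expansion) and a negative g a larger θ in the past (decline). The function name and the value of t are hypothetical; only the overall θ and g estimates come from table 3.

```python
import math


def theta_in_past(theta_now: float, growth: float, t_mut: float) -> float:
    """Theta at time t_mut before present (in mutational units), assuming
    the exponential model theta(t) = theta_now * exp(-growth * t_mut)."""
    return theta_now * math.exp(-growth * t_mut)


# Overall Bayesian point estimates reported in table 3.
THETA_MG, GROWTH_MG = 0.022, 13.86        # M. graminicola
THETA_S, GROWTH_S = 0.0063, -139.91       # pooled S1 and S2

# Hypothetical look-back time, chosen only to illustrate the direction
# of change; it is not an estimate from this study.
T_MUT = 0.01

mg_past = theta_in_past(THETA_MG, GROWTH_MG, T_MUT)
s_past = theta_in_past(THETA_S, GROWTH_S, T_MUT)

print(f"M. graminicola: theta now {THETA_MG}, earlier {mg_past:.4f}")
print(f"S1 and S2:      theta now {THETA_S}, earlier {s_past:.4f}")
```

Because the credibility intervals on g are wide, such back-projections indicate direction of change rather than magnitude.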
The analyses were also performed without fixed migration values. Estimates of θ and population growth did
not change significantly, nor was the range of the credibility intervals reduced. In these analyses, the overall θ values were 0.022 (0.008–0.18) for the M. graminicola
population and 0.007 (0.002–0.05) for the S1 population.
Population growth estimates were 5.84 (−60.3 to 581.05)
and −144.23 (−467.61 to 843.54) for M. graminicola and
S1, respectively.
Recombination rates inferred by LAMARC varied
among the different loci, ranging from 2.23 × 10⁻⁵
(HPPD) to 0.16 (STS2) (table 3). The low levels of recombination found in the β-tubulin, HPPD, STL10, and STL43
suggest that mutation had a greater influence on the evolution of these loci than recombination.
Coalescence and Migration
Estimates of directional gene flow between the 4 geographical M. graminicola populations were consistent with
relatively high rates of migration among populations (table
4). Based on these estimates and previous results showing
high genetic identity and low global FST values (Linde et al.
2002; Zhan et al. 2003; Juergens et al. 2006), we considered
the 4 M. graminicola populations as one panmictic population. The pooled M. graminicola sample was compared
with the S1 wild grass population. Estimates of directional
gene flow backward in time since the divergence of M. graminicola and the S1 population were low, with an overall
movement of genes from the wild grass S1 population to the
wheat-adapted population (table 5 and fig. 2). From S1 to
M. graminicola, the migration parameter mS1 was estimated
to be 0.80 and from M. graminicola to S1 mMg = 0.15. CIs
are given in table 5.
FIG. 1.—Parsimony haplotype networks for the 2 loci STS2 (a) and HPPD (b). Origin of haplotypes from the Septoria passerinii, the Iranian S1 and S2 populations, and the 4 geographical populations of
Mycosphaerella graminicola are indicated by different colors. Number for each haplotype is given. Numbers in parentheses refer to number of isolates with this haplotype if more than 1. Dots are hypothetical
missing intermediate haplotypes.
Table 3
Estimates of θ, Population Growth, and Recombination Rates Calculated by Bayesian Analyses in the Program LAMARC 2.0 (Kuhner et al. 2004)

| Locus | θ, Mycosphaerella graminicola | θ, S1 and S2 | Population Growth, M. graminicola | Population Growth, S1 and S2 | Recombination Rate per Locus | Mutation Rate per Locus per Year |
|---|---|---|---|---|---|---|
| β-tubulin | 0.02 (0.008–0.03) | 0.006 (0.002–0.02) | 35.13 (−44.65 to 311.89) | −320.01 (−500.79 to 123.12) | 0.085 (0.005–0.57) | 3.2 × 10⁻⁶ |
| HPPD | 0.16 (0.12–0.21) | 0.011 (0.004–0.02) | 181.07 (116.78–229.88) | −137.65 (−364.07 to −33.78) | 2.23 × 10⁻⁵ (9.98 × 10⁻⁶–0.005) | 2.2 × 10⁻⁵ |
| Leu | 0.022 (0.008–0.03) | 0.017 (0.004–0.09) | 1.51 (−86.65 to 102.36) | 18.75 (−269.90 to 608.55) | 0.25 (1.41 × 10⁻⁵–0.98) | 4.0 × 10⁻⁵ |
| STS2 | 0.057 (0.041–0.085) | 0.004 (0.001–0.012) | 472.08 (315.65–858.38) | −462.28 (−504.16 to −57.62) | 0.16 (0.036–0.43) | 8.5 × 10⁻⁶ |
| STL10 | 0.029 (0.02–0.06) | 0.005 (0.002–0.016) | 117.08 (12.22–624.49) | −86.21 (−475.18 to 119.97) | 2.92 × 10⁻⁵ (9.66 × 10⁻⁶–0.046) | 3.5 × 10⁻⁵ |
| STS43 | 0.03 (0.02–0.05) | 0.008 (0.002–0.13) | 19.18 (−88.43 to 81.06) | 209.18 (−371.35 to 952.22) | 0.073 (0.008–0.25) | 1.3 × 10⁻⁵ |
| Overall | 0.022 (0.01–0.18) | 0.0063 (0.002–0.06) | 13.86 (−57.99 to 730.98) | −139.91 (−483.31 to 710.06) | 0.09 (1.41 × 10⁻⁵–0.50) | 2.0 × 10⁻⁵ |

NOTE.—Mutation rates in noncoding sequences estimated in the IM Program are shown (Nielsen and Wakeley 2001; Hey 2005). 95% credibility intervals are denoted in parentheses.
The mean time of migration events was estimated over
the 6 loci to assess when the majority of migration events
took place. Converting the migration time estimates to real
time, we find that the mean migration time corresponded to
8,250–12,750 years ago. The estimates suggest that the majority of migration events occurred at the time when the 2
populations were initially diverging and that recent migration events are uncommon.
Using the IM program, we inferred coalescence
parameters for the split between M. graminicola and the
S1 clade. To assess the efficiency of the Markov Chain mixing, we used the estimates of parameter autocorrelation ρk
and Effective Sample Size (ESS). ρk is calculated from the
parameter k, which is based on the number of steps between
the pairs of values that are included in the IM parameter
calculation. ρk values close to zero indicate a high k value,
which is expected for independent samples. Similarly, the
ESS is a measure of the number of independent points that
have been sampled for each parameter and can be used as
a guide to how well the Markov Chain is mixing. The parameters ρk and ESS obtained from our IM analyses indicated that parameter estimations were not biased by strong
autocorrelations in the Markov Chain mixing (table 5).
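The two mixing diagnostics can be illustrated with a short sketch. It uses a standard textbook estimator, ESS = N / (1 + 2 Σ ρk), with the sum truncated at the first non-positive autocorrelation; this is an assumption made for illustration and is not necessarily the calculation implemented in IM. Both chains are simulated and all names are hypothetical.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag (lag 0 gives 1)."""
    n = len(chain)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(chain)")
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    if var == 0:
        raise ValueError("chain has zero variance")
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def effective_sample_size(chain):
    """ESS = N / (1 + 2 * sum of autocorrelations), truncating the sum at
    the first non-positive autocorrelation (a common, simple rule)."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(chain, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Simulated chains: independent draws versus a slowly mixing chain in
# which each value depends strongly on the previous one.
rng = random.Random(1)
independent = [rng.gauss(0, 1) for _ in range(2000)]
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print(f"ESS independent: {ess_independent:.0f}, ESS sticky: {ess_sticky:.0f}")
```

A chain with strong autocorrelation contributes far fewer effectively independent points than its length suggests, which is why both diagnostics are reported alongside the parameter estimates.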
Estimates of coalescence parameters are summarized
in table 5. The θ of the ancestral population of S1 and M. graminicola was 10.97. Consistent with the LAMARC estimates, the IM analysis showed significantly higher θ
values for M. graminicola (θ = 51.3) than the wild grass
S1 population (θ = 0.26). However, the parameter s (the
fraction of the ancestral population which founded M. graminicola) suggests that the M. graminicola population was
founded by only a very small fraction (s = 0.02) of the ancestral population (fig. 2). A large fraction (1 − s = 0.98) of
the ancestral population founded the present-day S1 population. These estimates are consistent with a strong expansion of M. graminicola and a decline in the S1 population
since the population split. Due to the large credibility intervals for the splitting factor s (table 5), we also performed the
IM analyses using a model that assumes constant population size. Estimates under this model are also shown in table
5. Estimates inferred by this model remained within the
same range; however, credibility intervals for θ values of
M. graminicola and S1 were narrower.
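The expansion and decline implied by these point estimates can be worked through directly. The sketch below assumes, following the description of s above, that the founding sizes are s · θA for M. graminicola and (1 − s) · θA for S1, and it uses point estimates only. Because the credibility intervals in table 5 are wide, the resulting fold changes are rough indications rather than estimates. Variable names are hypothetical.

```python
# Point estimates from the IM model allowing population size change (table 5).
theta_ancestral = 10.97   # theta of the ancestral population
s = 0.02                  # fraction of the ancestor founding M. graminicola
theta_mg_now = 51.25      # present-day theta, M. graminicola
theta_s1_now = 0.26       # present-day theta, S1

# Founding sizes on the theta scale, assuming a split into s and 1 - s.
theta_mg_founder = s * theta_ancestral
theta_s1_founder = (1 - s) * theta_ancestral

# Fold change from founding to present: above 1 is growth, below 1 decline.
mg_fold_change = theta_mg_now / theta_mg_founder
s1_fold_change = theta_s1_now / theta_s1_founder

print(f"M. graminicola: founder theta {theta_mg_founder:.4f}, "
      f"fold change {mg_fold_change:.1f}")
print(f"S1: founder theta {theta_s1_founder:.4f}, "
      f"fold change {s1_fold_change:.4f}")
```

On these point estimates the wheat-adapted population has grown by roughly two orders of magnitude since the split, while S1 has shrunk to a few percent of its founding size.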
We calculated the overall mutation rate for the β-tubulin locus. This gave an overall mutation rate of 3.2 ×
10⁻⁶ mutations per locus per year. This estimate was used
to calibrate the IM mutation rate parameter u˛ for the other 5
loci. Mutation rates are summarized in table 3. The highest
mutation rate of 4.0 × 10⁻⁵ substitutions per locus per year
was found in the Leu locus. The lowest mutation rate was
found in the β-tubulin locus. Assuming a generation time of
1 year for these Mycosphaerella pathogens, we calculated
an overall mutation rate of 2.0 × 10⁻⁵ per locus per year.
Converting estimates of t to real time, we find that the population split between the wheat-adapted M. graminicola
population and the wild grass S1 population occurred
approximately 10,500 years ago (95% CI: 2,000–19,500).
The model parameters θ and m were also converted to
demographic parameters (table 5). These estimates suggest
approximate values of effective population size and migration rates for M. graminicola and S1. The effective
Table 4
Estimates of Migration between 4 Geographical Populations of Wheat-Adapted Mycosphaerella graminicola Inferred Using an Isolation with Migration Model in the Program IM (Nielsen and Wakeley 2001; Hey 2005)

| Source Population | Sink: Iran | Sink: Israel | Sink: Oregon | Sink: Switzerland |
|---|---|---|---|---|
| Iran (16) | — | 21.25 (7.95–54.15) | 1.25 (0.05–4.05) | 5.95 (3.05–9.55) |
| Israel (40) | 11.65 (6.35–18.75) | — | 0.85 (0.25–2.15) | 7.70 (3.90–11.30) |
| Oregon (32) | 0.35 (0.05–1.55) | 0.35 (0.05–1.45) | — | 4.25 (1.25–9.05) |
| Switzerland (32) | 13.15 (5.25–24.55) | 7.30 (2.30–14.30) | 8.85 (5.15–14.45) | — |

NOTE.—Values are M. graminicola migration estimates (m). ESS is given for each population in parentheses. Source populations are shown on the left side and sink populations are indicated along the top. 95% credibility
intervals are given in parentheses.
population size of the pooled M. graminicola sample exceeded 1 million, whereas the S1 effective population size
was 6,500. The migration rate per year for M. graminicola
into S1 was 3 × 10⁻⁶ and for S1 into M. graminicola 1.6 ×
10⁻⁵. However, these estimates may be imprecise in this
case where single spores represent individuals and the
yearly production of sexual and asexual fungal spores
may exceed billions.
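The conversion from model to demographic parameters can be sketched as follows, assuming the haploid relation θ = 2Neμ, divergence and migration times scaled by μ, migration rates expressed per mutation, the overall rate μ = 2.0 × 10⁻⁵ per locus per year, and a generation time of 1 year. These relations reproduce the point estimates in table 5, but the function names are hypothetical and the sketch is not the IM program's own code.

```python
MU = 2.0e-5  # overall mutation rate per locus per year (table 3)


def theta_to_ne(theta: float, mu: float = MU) -> float:
    """Effective population size of a haploid from theta = 2 * Ne * mu."""
    return theta / (2.0 * mu)


def scaled_time_to_years(t: float, mu: float = MU,
                         years_per_generation: float = 1.0) -> float:
    """Convert a mutation-scaled time t to years."""
    return t / mu * years_per_generation


def scaled_migration_to_rate(m: float, mu: float = MU) -> float:
    """Convert a mutation-scaled migration parameter m to a yearly rate."""
    return m * mu


# Model parameters allowing population size change (table 5).
ne_mg = theta_to_ne(51.25)                     # about 1,281,250
ne_s1 = theta_to_ne(0.26)                      # about 6,500
split_years = scaled_time_to_years(0.21)       # about 10,500
rate_mg_to_s1 = scaled_migration_to_rate(0.15)  # about 3e-6
rate_s1_to_mg = scaled_migration_to_rate(0.80)  # about 1.6e-5

print(f"Ne M. graminicola: {ne_mg:,.0f}; Ne S1: {ne_s1:,.0f}")
print(f"Population split: {split_years:,.0f} years ago")
print(f"Migration per year: {rate_mg_to_s1:.1e} and {rate_s1_to_mg:.1e}")
```

Every converted value scales with the assumed mutation rate and generation time, so uncertainty in either carries through to the demographic estimates.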
We also used the program Genetree (Griffiths and
Tavaré 1994) to infer the gene history of the individual loci
STS2, β-tubulin, HPPD, and STL10. We did not have information from the S2 population and S. passerinii for
Leu or STS43 or from S. passerinii for the STL10 locus.
Overall tree topologies and the relative divergence of M.
graminicola from S1 and S2 were consistent for STS2,
β-tubulin, HPPD, and STL10. The ancestral distributions
of mutations and coalescence events for the loci STS2
and β-tubulin are illustrated in figures 3a and b. The S. passerinii cluster branched into the trees at the deepest point,
suggesting an ancient divergence of S. passerinii from
the 3 other Mycosphaerella clades. The S1 and M. graminicola clades represented the youngest lineages of the trees.
The STS2 RFLP locus showed a higher resolution of the
different population groups. Less resolution was shown
in the β-tubulin locus, suggesting a more conserved evolution of some regions of this locus, which also showed the
lowest mutation rate. In the β-tubulin and STL10 loci, we
found evidence of introgression into M. graminicola as
a few isolates were positioned in the same groups as the
S1 population. This pattern most likely reflects gene flow
from S1 into M. graminicola after divergence. The M. graminicola isolates clustering with the S1 population were
from the Iranian (as observed in the β-tubulin locus) and
the Israeli (as observed in the STL10 locus) populations.
In both cases, the divergence of these isolates did not occur
recently and may reflect ancient gene flow events. Scaling
the coalescence time in the STS2 gene tree to real time
according to the split between M. graminicola and the
S1 population inferred in the IM analyses, we calculate that
the divergence from S. passerinii occurred approximately
68,500 (95% CI: 13,700–126,700) years ago. In the β-tubulin gene tree, the divergence of S. passerinii and S2 is
not well resolved, and both clusters are positioned as the
roots of the tree. However, the number of mutations separating S. passerinii from M. graminicola and S1 is 15 and
only 10 for the S2 population. This suggests that the actual
divergence of S. passerinii is more ancient than the divergence of S2. The Genetree coalescence estimates from the
STS2 locus suggested a divergence time of M. graminicola
from the S2 clade approximately 18,500 (95% CI: 3,700–
34,225) years ago. After divergence from the S1 clade, the
M. graminicola populations experienced a strong expansion
Table 5
Estimates of Coalescence Parameters from the Divergence between Mycosphaerella graminicola and S1

| Parameter | Model Parameters, Constant Population Size | Model Parameters, Population Size Change | Demographic Parameters, Population Size Change | ρk (k = 1,000,000) | ESS |
|---|---|---|---|---|---|
| θA | 10.97 (5.24–23.36) | 10.97 (5.24–21.45) | 274,250 (131,000–536,250) | −0.256 | 8 |
| θMg | 11.2035 (2.62–25.98) | 51.25 (4.05–397.37) | 1,281,250 (101,250–9,934,250) | 0.078 | 22 |
| θS | 0.09 (0.09–1.67) | 0.26 (0.09–161.14) | 6,500 (2,250–4,028,500) | −0.108 | 6 |
| t | 0.16 (0.08–0.30) | 0.21 (0.04–0.39) | 10,500 (2,000–19,500) | −0.322 | 4 |
| mMg | 0.03 (0.03–0.48) | 0.15 (0.01–0.7) | 3 × 10⁻⁶ (2 × 10⁻⁶–1.4 × 10⁻⁵) | −0.279 | 4 |
| mS | 0.58 (0.03–1.93) | 0.80 (0.40–1.40) | 1.6 × 10⁻⁵ (8 × 10⁻⁶–2.8 × 10⁻⁵) | 0.267 | 4 |
| mtMg | 0.092 (0.038–0.19) | 0.255 (0.225–0.30) | 12,750 (11,250–15,000) | NA | NA |
| mtS | 0.126 (0.063–0.257) | 0.165 (0.12–0.21) | 8,250 (6,000–10,500) | NA | NA |
| s | NA | 0.02 (0.004–0.22) | NA | 0.013 | 6 |

NOTE.—NA, not applicable. Analyses were performed assuming either a constant population size or allowing population size change. Estimates are shown as model
parameters and as converted demographic parameters for the analyses allowing population size change. As a measure of effective population size, the program estimates θ (for
haploid θ = 2Neμ, where Ne = effective population size and μ = mutation rate inferred over all loci). θ is given for each present-time population and for the common ancestral
population. An estimate of divergence time of the 2 descendant populations is given as t. Migration from M. graminicola to S1 was estimated as mMg and from S1 to M.
graminicola as mS (m = gene flow rates per locus per generation). Mean migration times over all loci are shown as mtMg (migration from S1 into M. graminicola) and as mtS
(migration from M. graminicola into S1). s is the fraction of the ancestral population that founded the M. graminicola population. 95% credibility intervals are shown in
parentheses. ρk is the autocorrelation for the distance value k indicating the number of steps between the pairs of values that were included in the parameter calculation. The
measure ESS shows the number of independent points that have been sampled for each parameter.
FIG. 2.—Population splitting of Mycosphaerella graminicola and the
S1 population isolated from uncultivated grasses in Iran. The 2 descendant
populations split from an ancestral population at the time t = 10,500 years
ago. θ values show relative population sizes; m values show migration from M.
graminicola to the S1 population (mMg) and from S1 to M. graminicola
(mS1). The factor s indicates the fraction of the ancestral population that
founded the M. graminicola population. 1 − s represents the remaining
fraction founding the S1 population.
as demonstrated by the diversification of M. graminicola
haplotypes in the gene trees. The approximate time of this
diversification was 5,000–6,000 years ago. Independent
analyses of 4 loci showed consistent coalescence times between pathogen populations within the same CIs.
Discussion
The genetic data presented here provide evidence that
the divergence of the wheat-adapted pathogen M. graminicola from an ancestral population infecting wild grasses in
the Middle East occurred approximately 10,500 years ago.
This coalescence event coincided with the beginning of agricultural-based societies in the Fertile Crescent and the domestication of wild grasses. Our findings demonstrate that
the domestication of an agricultural crop was simultaneously accompanied by the domestication of a fungal
pathogen.
We found 1–3 fixed nucleotide substitutions at the ITS
locus among the Mycosphaerella populations. Phylogenetic relationships in the clade Mycosphaerella were described in a previous study using ITS sequences
(Goodwin et al. 2001). It was found that the average number
of within-species nucleotide substitutions in the Mycosphaerella genus ranged from 0 to 7. Based on ITS sequences, we could not resolve the S1 and S2 populations from
M. graminicola, indicating a close phylogenetic relationship below the species level. The lack of nucleotide variation in the ITS locus suggests that this gene is unsuitable to
resolve clades of pathogens that have been separated for
less than 11,000 years. In contrast to the ITS locus, the 6
loci that were used for population and coalescence analyses
showed high amounts of nucleotide variability within and
between populations. Tests of nonneutral evolution could
not be rejected for any of the 6 loci, but we believe that
the observed pattern of gene diversity is most likely due
to changes in population size.
The most frequent M. graminicola haplotypes were
present in all 4 geographical populations, supporting previous results showing that the majority of gene diversity is
distributed within individual populations and that populations are not geographically differentiated (McDonald et al.
1999; Linde et al. 2002; Zhan et al. 2003). The Iranian M.
graminicola haplotypes clearly belonged to the wheat-adapted population, though they originated from the same
geographical region as S1 and S2. Isolates of the S1 and
S2 populations were all collected very close to cultivated
wheat fields but still differed from the wheat-adapted pathogens. We found low haplotype and gene diversity in the S1
and S2 populations even though these isolates originated
from 5 different locations and had a broader host range
(both S1 and S2 were collected from 3 different hosts).
The small number of mutations separating haplotypes indicates that a large fraction of the total diversity was sampled.
The star-like shape of the M. graminicola populations in the
haplotype networks is consistent with a significant population expansion (figs. 1a and b). We expected to find a higher
haplotype and gene diversity in the pathogen populations
isolated from uncultivated grasses as a result of diversifying
selection imposed by different environments and host species (Burdon 1993). However, the low genetic diversity observed in the S1 and S2 populations is consistent with small
effective population sizes of these populations and may reflect a different population biology than M. graminicola.
Results from the LAMARC and the IM analyses were
both consistent with a significant expansion in M. graminicola populations and a decline in the S1 and S2 populations.
A new version of the ‘‘isolation with migration’’ model allowed us to estimate the fraction of the ancestral pathogen
population that gave rise to the M. graminicola population.
Given the contemporary θ values, it is possible to extrapolate the relative increase or decrease in population sizes
since the time of divergence. The IM estimates of demographic parameters suggest a recent split where only a small
fraction of the ancestral pathogen population founded a new
population during the domestication of the wheat host. A
much larger fraction of the ancestral population founded
the S1 group, which today persists on uncultivated grasses.
The increase in M. graminicola effective population size
was most likely caused by the corresponding increase in
host population size following the spread of agriculture.
We hypothesize that the decrease in size of the S1 and
S2 populations reflects the significant decline in wild host
populations that occurred during the conversion of wild
grass habitats into agricultural fields. We are not able to determine whether the increase or decrease in population size
of the respective pathogen populations occurred in a linear
way through time or whether some historical events were
particularly favorable or unfavorable for the respective
pathogen populations. However, expansion of agriculture
has favored pathogen propagation and dispersal by providing dense and uniform host populations that cover a large
area. The spread of agriculture to other regions moved
domesticated plants and animals across Europe and Asia
(Clark 1965; Ammerman and Cavalli-Sforza 1971),
and many centuries later wheat was introduced into New
World countries through European colonialism. Banke and
McDonald (2005) showed evidence for past migration of
FIG. 3.—Coalescence-based gene genealogies of the STS2 locus (a) and β-tubulin (b) showing the distribution of mutations in 4 geographical
populations of Mycosphaerella graminicola (Mg), the 2 Iranian S1 and S2 populations, and S. passerinii (Sp USA). The split between M. graminicola
and S1 was used to convert the coalescence scale to real time. The number of haplotypes represented in each branch is shown below the trees. The vertical
lines enclose the position of haplotypes derived from the S1 population. Time scale is shown at the right side of each gene genealogy.
M. graminicola from the Middle East and Europe to New
World countries, suggesting that the pathogen traveled with
its host. The increase in wheat cultivation is also reflected in
the burst of M. graminicola diversification and population
expansion that the coalescence analyses dated to approximately 5,000–6,000 years ago.
We estimated the directional amount of gene flow occurring among M. graminicola and the S1 population since
their divergence. Our results indicate that only a limited
amount of gene flow has occurred between the 2 sympatrically diverging populations and that the small amount of
genetic exchange detected has been mainly from populations on wild grasses to wheat-infecting populations. This
points to a possible mechanism for the introgression of
new avirulence genes or pathogenicity factors into M. graminicola from the S1 clade. However, estimates of migration time showed that the majority of gene flow events
occurred at the time when populations were initially diverging approximately 10,000 years ago. Evidence of gene flow
was also present in the distribution of haplotypes in the Genetree analyses. The Iranian M. graminicola population contained
one isolate, which diverged from the S1 group later
than the remaining M. graminicola haplotypes (fig. 3b). The
position of this isolate in the β-tubulin gene tree suggests
that gene flow and recombination also took place between
the 2 diverged populations after genetic differentiation had
already occurred. In summary, our data indicate that the
pathogen host shift did not occur as one single event but
probably happened through a series of introgressions of isolates from uncultivated grasses into the wheat-infecting population. Multiple host shifts also occurred in the evolution of
the rice blast pathogen, which similarly emerged as a new
pathogen by the domestication of rice (Couch et al. 2005).
Consistent with our results, other studies have reported
extensive genetic differentiation between diverging fungal
species (Bicknell and Douglas 1970; Horgen et al. 1984;
Vilgalys and Johnson 1987; Dettman et al. 2003). The high
amount of genetic diversity observed between relatively recently separated species suggests that speciation in Mycosphaerella was not only associated with host adaptation but
was also accompanied by significant genetic differentiation.
Species concepts and mechanisms of speciation in fungi
have been discussed and addressed in many studies (e.g.,
Burnett 1983; Hibbett et al. 1995; Natvig and May 1996;
Brasier 1987; Petersen and Hughes 1999; Taylor et al.
2000; Harrington et al. 2002; Kohn 2005). Reinforcement
has been suggested as an important mechanism for divergence between sympatrically diverging populations
(Kohn 2005) and has been demonstrated in several studies
(Anderson et al. 1980; Capretti et al. 1990; Stenlid and
Karlsson 1991; Dettman et al. 2003). Reinforcement is
the active mechanism that leads to an increase in prezygotic
reproductive isolation and reduced hybrid fitness, thereby
favoring intraspecific mating and over time resulting in
genetic differentiation between species (Dobzhansky
1951). Reproductive success between M. graminicola,
S1, and S2 was not tested in this study, but such tests could
provide additional knowledge regarding the mechanisms
underlying the sympatric speciation of Mycosphaerella
populations.
The coalescence analyses performed in this study provide an approximate picture of the divergence of Mycosphaerella populations since the split from S. passerinii.
The coalescence time was calculated using mutation rates
from each of the 6 loci. The scaled mutation rates shown in
this study are within the same range as mutation rates inferred in IM from a multilocus data set of human populations (Hey 2005) and as those shown experimentally in yeast
(Zeyl and DeVisser 2001). As our estimates were calculated
across 6 loci from 360 individuals using a Bayesian-based
approach, we believe these mutation rates provide robust
estimates. Based on the 6 inferred mutation rates, the
Genetree coalescence analyses suggest that S. passerinii diverged from the 3 other Mycosphaerella groups approximately 68,500 years ago. This implies that speciation of
M. graminicola and S. passerinii did not occur simultaneously with the domestication of wheat and barley but
instead occurred much earlier. The divergence of these
pathogen species may have been related to the specialization of S. passerinii to a Hordeum host.
Mycosphaerella graminicola and the S1 group represent the youngest phylogenetic clades that diverged from
the S2 clade approximately 18,500 years ago. Mycosphaerella graminicola and the S1 group split 10,500 years ago,
consistent with the genetic and archeological data obtained
from crop plants in the Fertile Crescent showing speciation
of the host plants (Salamini et al. 2002). Coalescence analyses of Magnaporthe oryzae similarly showed that the rice blast pathogen emerged simultaneously with the domestication of
rice approximately 9,000 years ago (Couch et al. 2005).
The strong population expansion and global spread of
M. graminicola demonstrate the accelerated pathogen evolution that was mediated by human practices and favored by
the huge land areas covered by wheat. This cospeciation of
M. graminicola and wheat took place within a very short
evolutionary time frame, specifically, during the 10,000- to 12,000-year period in which historical changes in the
host (i.e., the increase in wheat cultivation and the global
dispersal) had a major impact on the parasite demography.
A similar picture has emerged from coalescence analyses of
the malaria parasite Plasmodium falciparum (Joy et al.
2003). Changes in human culture and the introduction of
agricultural practices also led to an accelerated coevolution
of the malaria mosquito Anopheles gambiae and P. falciparum, both experiencing strong population increases and
migration to other regions. These examples illustrate how
agricultural practices shaped the evolution of some of our
most important modern pathogens.
Acknowledgments
We would like to thank M. Zala for technical support
and Ignazio Carbone and Rasmus Nielsen for helpful suggestions to the coalescence analyses. Additionally, we
would like to thank 3 anonymous reviewers for helpful
comments on the manuscript. This work was funded by
the Swiss Federal Institute of Technology (ETH), Zurich,
and the Swiss National Science Foundation (grant
3100A0-104145).
Literature Cited
Ammerman AJ, Cavalli-Sforza LL. 1971. Measuring the rate of
spread of early farming in Europe. Man. 6:674.
Anderson JB, Korhonen K, Ullrich RC. 1980. Relationships
between European and North-American biological species
of Armillaria. Exp Mycol. 4:87–95.
Avise JC. 2000. Phylogeography: the history and formation of
species. Cambridge: Harvard University Press.
Aylor DL, Price EW, Carbone I. 2006. SNAP: combine and map
modules for multilocus population genetic analysis. Bioinformatics. doi: 10.1093/bioinformatics/btl136.
Banke S, McDonald BA. 2005. Migration patterns among global
populations of the pathogenic fungus Mycosphaerella graminicola. Mol Ecol. 14:1881–1896.
Banke S, Peschon A, McDonald BA. 2004. Phylogenetic analysis
of globally distributed Mycosphaerella graminicola populations based on three DNA sequence loci. Fungal Genet Biol.
41:226–238.
Beerli P, Felsenstein J. 2001. Maximum likelihood estimation of
a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proc Natl Acad Sci
USA. 98:4563–4568.
Bicknell JN, Douglas HC. 1970. Nucleic acid homologies among
species of Saccharomyces. J Bacteriol. 101:505–512.
Brasier CM. 1987. The dynamics of fungal speciation. In: Rayner
ADM, Brasier CM, Moore D, editors. Evolutionary biology of
the fungi. Cambridge: Cambridge University Press. p. 231–260.
Brown JKM, Hovmøller MS. 2002. Aerial dispersal of pathogens
on the global and continental scales and its impact on plant
disease. Science. 297:537–541.
Burdon JJ. 1993. The structure of pathogen populations in natural
plant communities. Annu Rev Phytopathol. 31:305–323.
Burnett JH. 1983. Presidential address: speciation in fungi. Trans
Br Mycol Soc. 81:1–14.
Capretti P, Korhonen K, Mugnai L, Romagnoli C. 1990. An intersterility group of Heterobasidion annosum specialized to
Abies alba. Eur J Pathol. 20:231–240.
Carbone I, Kohn L. 2001. A microbial population-species interface: nested cladistic and coalescence inference with multilocus data. Mol Ecol. 10:947–964.
Childe VG. 1953. New light on the most ancient Near East. New
York: Praeger.
Clark JGD. 1965. Radiocarbon dating and expansion of farming
culture from the Near East over Europe. Proc Prehist Soc.
31:58–73.
Clement M, Posada D, Crandall KA. 2000. TCS: a computer program to estimate gene genealogies. Mol Ecol. 9:1657–1659.
Couch BC, Fudal I, Lebrun M-H, Tharreau D, Valent B, van Kim
P, Nottéghem JL, Kohn LM. 2005. Origins of host-specific
populations of the blast pathogen Magnaporthe oryzae in crop
domestication with subsequent expansion of pandemic clones
on rice and weeds of rice. Genetics. 170:613–630.
Dettman JR, Jacobson DJ, Taylor JW. 2003. A multilocus genealogical approach to phylogenetic species recognition in the
model eukaryote Neurospora. Evolution. 57:2703–2720.
Diamond J. 1999. Guns, Germs, and Steel. New York: Norton.
Dobzhansky T. 1951. Genetics and the origin of species. New
York: Columbia University Press.
Elias EM, Steiger DK, Cantrell RG. 1996. Evaluation of lines derived from wild emmer chromosome substitutions. II. Agronomic traits. Crop Sci. 36:228–233.
Eyal Z, Amiri Z, Wahl I. 1973. Physiologic specialization of Septoria tritici. Phytopathology. 63:1087–1091.
Eyal Z, Scharen AL, Huffman MD, Prescott JM. 1985. Global
insights into virulence frequencies of Mycosphaerella graminicola. Phytopathology. 75:1456–1462.
Fisher MC, Koenig GL, White TJ, San-Blas G, Negroni R, Alvarez
IG, Wanke B, Taylor JW. 2001. From the cover: biogeographic
range expansion into South America by Coccidioides immitis
mirrors New World patterns of human migration. Proc Natl
Acad Sci USA. 98:4558–4562.
Fu YX. 1997. Statistical tests of neutrality of mutations against
population growth, hitchhiking and background selection.
Genetics. 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations.
Genetics. 133:693–709.
Goodwin SB, Dunkle LD, Zismann VL. 2001. Phylogenetic analysis of Cercospora and Mycosphaerella based on the internal
transcribed spacer region of ribosomal DNA. Phytopathology.
91:648–658.
Gopher A, Abbo S, Lev-Yadun S. 2002. The "when", the "where"
and the "why" of the Neolithic revolution in the Levant. Doc
Praehistorica. 28:49–62.
Griffiths RC, Tavare S. 1994. Sampling theory for neutral alleles in
a varying environment. Philos Trans R Soc Lond B Biol Sci.
344:403–410.
Harrington TC, Pashenova NV, McNew DL, Steimel J, Konstantinov MY. 2002. Species delimitation and host specialization
of Ceratocystis laricicola and C. polonica to larch and spruce.
Plant Dis. 86:418.
Hasegawa M, Kishino H, Yano T. 1985. Dating of the human-ape
splitting by a molecular clock of mitochondrial DNA. J Mol
Evol. 22:160–174.
Hey J. 2005. On the number of New World founders: a population
genetic portrait of the peopling of the Americas. PLoS Biol.
3:e193.
Hey J, Machado CA. 2003. The study of structured populations—
new hope for a difficult and divided science. Nat Rev Genet.
4:535–543.
Hey J, Nielsen R. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and
D. persimilis. Genetics. 167:747–760.
Hey J, Nielsen R. 2006. IM documentation [Internet]. Piscataway (NJ):
Rutgers University; [cited 2006 July]. Available from:
http://lifesci.rutgers.edu/~heylab/ProgramsandData/Programs/
IM/IMdocumentation.pdf.
Hibbett DS, Fukumasa-Nakai Y, Tsuneda A, Donoghue MJ.
1995. Phylogenetic diversity in shiitake inferred from nuclear
ribosomal DNA sequences. Mycologia. 87:618–638.
Horgen PA, Arthur R, Davy O, Moum A, Herr F, Straus N,
Anderson J. 1984. The nucleotide sequence homologies of
unique DNA’s of some cultivated and wild mushrooms.
Can J Microbiol. 30:587–593.
Hussey BMJ, Keighery GJ, Cousens RD, Dodd J, Lloyd SG. 1997.
Western weeds, a guide to the weeds of Western Australia.
Australia: Plant Protection Society of WA.
Joy DA, Feng X, Mu J, et al. (12 co-authors). 2003. Early origin
and recent expansion of Plasmodium falciparum. Science.
300:318–321.
Juergens T, Linde C, McDonald BA. Forthcoming 2006. Genetic
structure of Mycosphaerella graminicola populations from
Iran, Argentina and Australia. Eur J Plant Pathol. 115:223–233.
Kasuga T, White TJ, Taylor JW. 2002. Estimation of nucleotide
substitution rates in Eurotiomycete fungi. Mol Biol Evol.
19:2318–2324.
Kema GHJ, Annone JG, Sayoud R, Van Silfhout CH, Van Ginkel
M, de Bree J. 1996. Genetic variation for virulence and resistance in the wheat-Mycosphaerella graminicola pathosystem.
I. Interactions between pathogen isolates and host cultivars.
Phytopathology. 86:200–212.
Kema GHJ, Sayoud R, Annone JG, Van Silfhout CH. 1996.
Genetic variation for virulence and resistance in the wheat-Mycosphaerella graminicola pathosystem. II. Analysis of
interactions between pathogen isolates and host cultivars. Phytopathology. 86:213–220.
Kema GHJ, Verstappen ECP, Todorova M, Waalwijk C. 1996.
Successful crosses and molecular tetrad and progeny analyses
demonstrate the heterothallism in Mycosphaerella graminicola. Curr Genet. 30:251–258.
Keon J, Hargreaves J. 1998. Isolation and heterologous expression
of a gene encoding 4-hydroxyphenylpyruvate dioxygenase
from the wheat leaf-spot pathogen, Mycosphaerella graminicola. FEMS Microbiol Lett. 161:337–343.
Kimura M. 1969. The number of heterozygous nucleotide sites
maintained in a finite population due to a steady flux of mutations. Genetics. 61:893–903.
Kohn LM. 2005. Mechanism of fungal speciation. Annu Rev Phytopathol. 43:279–308.
Kuhner MK, Yamato J, Beerli P, Smith LP, Rynes E, Walkup E,
Li C, Sloan J, Colacurcio P, Felsenstein J. 2004. LAMARC v.
1.2.1 [Internet]. University of Washington. Available from:
http://evolution.gs.washington.edu/lamarc.html.
Linde CC, Zhan J, McDonald BA. 2002. Population structure
of Mycosphaerella graminicola: from lesions to continents.
Phytopathology. 92:946–955.
McDonald BA, Zhan J, Yarden O, Hogan K, Garton J,
Pettway RE. 1999. The population genetics of Mycosphaerella graminicola and Phaeosphaeria nodorum. In:
Lucas JA, Bowyer P, Anderson HM, editors. Septoria on
cereals: a study of pathosystems. Wallingford (UK): CABI.
p. 44–69.
Moore AMT, Hillman GC, Legge AJ. 2000. Village on the
Euphrates: from foraging to farming at Abu Hureyra. New
York: Oxford University Press.
Natvig DO, May G. 1996. Fungal evolution and speciation. J
Genet. 75:441–452.
Nei M. 1987. Molecular evolutionary genetics. New York:
Columbia University Press.
Nielsen R, Wakeley J. 2001. Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics.
158:885–896.
Petersen RH, Hughes KW. 1999. Species and speciation in mushrooms. Bioscience. 49:440–452.
Price EW, Carbone I. 2005. SNAP: workbench management tool
for evolutionary population genetic analysis. Bioinformatics.
21:402–404.
Rozas J, Rozas R. 1999. DnaSP version 3: an integrated program
for molecular population genetics and molecular evolution
analysis. Bioinformatics. 15:174–175.
Saadaoui EM. 1987. Physiologic specialization of Septoria tritici
in Morocco. Plant Dis. 71:153–155.
Salamini F, Ozkan H, Brandolini A, Schafer-Pregl R, Martin W.
2002. Genetics and geography of wild cereal domestication in
the Near East. Nat Rev Genet. 3:429–441.
Sharma HC, Waines JG. 1980. Inheritance of tough rachis in
crosses of Triticum monococcum and T. boeoticum. J Hered.
71:214–216.
Shaw MW, Royle DJ. 1989. Airborne inoculum as a major source
of Septoria tritici (Mycosphaerella graminicola) infections in
winter wheat crops in the UK. Plant Pathol. 38:35–43.
Stenlid J, Karlsson JO. 1991. Partial intersterility in Heterobasidion annosum. Mycol Res. 95:1153–1159.
Swofford DL. 1996. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0. Sunderland (MA):
Sinauer Associates.
Taenzler B, Esposti RF, Vaccino P, Brandolini A, Effgen S, Heun
M, Schäfer-Pregl R, Borghi B, Salamini F. 2002. Molecular
linkage map of Einkorn wheat: mapping of storage-protein
and soft-glume genes and bread-making quality QTLs. Genet
Res Camb. 80:131–143.
Tajima F. 1989. Statistical method for testing the neutral mutation
hypothesis by DNA polymorphism. Genetics. 123:585–595.
Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Geiser DM,
Hibbett DS, Fisher MC. 2000. Phylogenetic species recognition
and species concepts in fungi. Fungal Genet Biol. 31:21–32.
Templeton AR, Crandall KA, Sing CF. 1992. A cladistic analysis
of the phenotypic associations with haplotypes inferred from
restriction endonuclease mapping and DNA sequence data.
III. Cladogram estimation. Genetics. 132:619–633.
van Ginkel M, Scharen AL. 1987. Generation mean analysis and
heritabilities of resistance to Septoria tritici in durum wheat.
Phytopathology. 77:1629–1633.
Vilgalys RJ, Johnson JL. 1987. Extensive genetic divergence associated with speciation in filamentous fungi. Proc Natl Acad
Sci USA. 84:2355–2358.
White TJ, Bruns T, Lee S, Taylor J. 1990. Amplification and
direct sequencing of fungal ribosomal RNA genes for
phylogenetics. In: Innis M, Gelfand DH, Sninsky JJ, White
TJ, editors. PCR protocols. San Diego: Academic Press.
p. 315–322.
Won Y-J, Hey J. 2005. Divergence population genetics of chimpanzees. Mol Biol Evol. 22:297–307.
Zeyl C, DeVisser JAGM. 2001. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics. 157:53–61.
Zhan J, Kema GHJ, McDonald BA. 2004. Evidence for natural
selection in the mitochondrial genome of Mycosphaerella graminicola. Phytopathology. 94:261–267.
Zhan J, Linde CC, Juergens T, Merz U, Steinebrunner F,
McDonald BA. 2005. Variation for neutral markers is correlated with variation for quantitative traits in the pathogenic fungus Mycosphaerella graminicola. Mol Ecol. 14:2683–2693.
Zhan J, Pettway RE, McDonald BA. 2003. The global genetic
structure of the wheat pathogen Mycosphaerella graminicola
is characterized by high nuclear diversity, low mitochondrial
diversity, regular recombination, and gene flow. Fungal Genet
Biol. 38:286–297.
Zohary D, Hopf M. 2000. Domestication of plants in the Old
World. 3rd ed. Oxford: Oxford University Press.
Jody Hey, Associate Editor.
Accepted November 3, 2006