Mycologia, 104(6), 2012, pp. 1351–1368. DOI: 10.3852/12-056 # 2012 by The Mycological Society of America, Lawrence, KS 66044-8897 How well do ITS rDNA sequences differentiate species of true morels (Morchella)? Research, United States Department of Agriculture, Agricultural Research Service, 1815 North University Street, Peoria, Illinois 61604 Xi-Hui Du Qi Zhao Zhu L. Yang Key Laboratory of Biodiversity and Biogeography, Kunming Institute of Botany, Chinese Academy of Sciences, Lanhei Road No. 132, Kunming 650201, Yunnan Province, P.R. China Graduate University of Chinese Academy of Sciences, Beijing 100049, P.R. China Abstract: Arguably more mycophiles hunt true morels (Morchella) during their brief fruiting season each spring in the northern hemisphere than any other wild edible fungus. Concerns about overharvesting by individual collectors and commercial enterprises make it essential that science-based management practices and conservation policies are developed to ensure the sustainability of commercial harvests and to protect and preserve morel species diversity. Therefore, the primary objectives of the present study were to: (i) investigate the utility of the ITS rDNA locus for identifying Morchella species, using phylogenetic species previously inferred from multilocus DNA sequence data as a reference; and (ii) clarify insufficiently identified sequences and determine whether the named sequences in GenBank were identified correctly. To this end, we generated 553 Morchella ITS rDNA sequences and downloaded 312 additional ones generated by other researchers from GenBank using emerencia and analyzed them phylogenetically. Three major findings emerged: (i) ITS rDNA sequences were useful in identifying 48/62 (77.4%) of the known phylospecies; however, they failed to identify 12 of the 22 species within the species-rich Elata Subclade and two closely related species in the Esculenta Clade; (ii) at least 66% of the named Morchella sequences in GenBank are misidentified; and (iii) ITS rDNA sequences of up to six putatively novel Morchella species were represented in GenBank. Recognizing the need for a dedicated Webaccessible reference database to facilitate the rapid identification of known and novel species, we constructed Morchella MLST (http://www.cbs.knaw. nl/morchella/), which can be queried with ITS rDNA sequences and those of the four other genes used in our prior multilocus molecular systematic studies of this charismatic genus. Key words: Ascomycota, biodiversity, biogeography, emerencia, GCPSR, GenBank, phylogeny, species limits Karen Hansen Cryptogamic Botany, Swedish Museum of Natural History, P.O. Box 50007, SE-10405 Stockholm, Sweden Hatıra Taşkın Saadet Büyükalaca Department of Horticulture, Faculty of Agriculture, Çukurova University, Adana 01330, Turkey Damon Dewsbury Jean-Marc Moncalvo Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S2C6, Canada Department of Natural History, Royal Ontario Museum, and Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S2C6, Canada Greg W. Douhan Department of Plant Pathology and Microbiology, University of California Riverside, Riverside, California 92521 Vincent A.R.G. Robert Pedro W. Crous CBS-KNAW Fungal Biodiversity Center, Utrecht, the Netherlands Stephen A. Rehner Systematic Mycology and Microbiology Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, Maryland 20705 Alejandro P. Rooney Crop Bioprotection Research Unit, National Center for Agricultural Utilization Research, United States Department of Agriculture, Agricultural Research Service, 1815 North University Street, Peoria, Illinois 61604 Stacy Sink Kerry O’Donnell1 Bacterial Foodborne Pathogens and Mycology Research Unit, National Center for Agricultural Utilization INTRODUCTION With the notable exception of true truffles in the genus Tuber (Bonito et al. 2010), few fungal genera are Submitted 17 Feb 2012; accepted for publication 16 May 2012. 1 Corresponding author. E-mail: [email protected] 1351 1352 MYCOLOGIA as synonymous with epicurean cuisine as true morels (Morchella, phylum Ascomycota). Literally thousands of mycophiles scour mixed hardwood and coniferous forests throughout the northern hemisphere in search of these highly prized edible fungi during their short and often sporadic fruiting season each spring. Adding to their huge popularity is the dramatic growth of commercial harvesting of morels into highly lucrative cottage industries in several morel-rich countries, including China, Turkey and the United States (Pilz et al. 2007), together with significant advances toward their cultivation on a commercial scale (Ower et al. 1986, Zhao et al. 2009). Due to the increasing commercial demand, it is imperative that comprehensive studies are conducted to develop informed science-based conservation policies and management practices that will ensure that species diversity is maximally preserved and commercial harvests are sustainable (see Pilz et al. 2007 and references therein). To this end, our multilocus molecular phylogenetic analyses have identified at least 62 phylogenetically distinct species worldwide based on their genealogical exclusivity (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). Robust hypotheses of their species limits, employing genealogical concordance/discordance phylogenetic species recognition (GCPSR; Dettman et al. 2003, Taylor et al. 2000), provides the requisite framework for assessing the utility of ITS rDNA for identifying species so that their geographic distribution and area of endemism can be critically assessed and so that unknown taxa can be detected and identified for further taxonomic evaluation. Even though the ITS rDNA region has been used as the sole locus in more studies than any other for assessing Morchella genetic diversity (Pagliaccia et al. 2011, Stefani et al. 2010, Taşkın et al. 2012 and references therein), a search of GenBank using emerencia (Ryberg et al. 2009) revealed that most sequences were insufficiently identified (n 5 190), and in the near absence of type studies, binomials appear to have been arbitrarily applied to the vast majority of those designated fully identified (n 5 127) sensu Nilsson et al. (2005). Thus, the present study of Morchella was conducted to: (i) determine how well ITS rDNA sequence data resolve species we previously delimited via GCPSR, (ii) assess whether the fully identified sequences in GenBank were identified correctly and identify ones classified as insufficiently identified, (iii) mine metadata to better understand species diversity and their geographic distributions and (iv) develop a dedicated, Web-accessible informational database, including references to voucher specimens and/or cultures (Agerer et al. 2000), that aids DNA sequence-based identifications of true morels via the Internet. We have accomplished the latter objective by constructing Morchella MLST (multilocus sequence typing, http://www.cbs.knaw. nl/morchella/) and populated it with the data we generated both in the current and our prior multilocus molecular systematic studies of this charismatic genus. MATERIALS AND METHODS Source of sequence data.—We generated ITS sequence data from 553 collections during extensive surveys of China (Du et al. 2012), Turkey (Taşkın et al. 2010, 2012) and a global survey of the genus (O’Donnell et al. 2011). In addition, 312 sequences were retrieved from GenBank with the genus search tool in emerencia (Nilsson et al. 2005, Ryberg et al. 2009). Five additional sequences were retrieved by emerencia, including three of M. rufobrunnea, although one of these (GQ304946) was misidentified as M. crassipes, possibly because the pileus of this species turns yellow during ascosporogenesis. The other two GenBank accessions (EU480151.1, U61390.1) were unidentifiable sequences unrelated to Morchella based on BLAST queries of GenBank. Sequences of M. rufobrunnea were not analyzed further because they represent a highly divergent basal sister to the rest of Morchella. Thus, a total of 865 Morchella ITS rDNA nucleotide sequences were analyzed. Molecular methods.—Total genomic DNA was extracted from herbarium specimens or freeze-dried mycelium cultivated from pure cultures using a CTAB protocol (Gardes and Bruns 1993, O’Donnell et al. 2011). PCR amplification and sequencing of the nuclear ribosomal transcribed spacer region (ITS rDNA) was conducted using the ITS5 3 ITS4, ITS1 3 ITS4 (White et al. 1990) or ITS5 3 NL4 (O’Donnell et al. 2011) primer pairs. To completely sequence the latter amplicon ITS4 and NL1 also were used as internal sequencing primers (O’Donnell et al. 2011). All PCR amplifications were performed with Platinum Taq DNA Polymerase High Fidelity (Invitrogen Life Technologies, Carlsbad, California), after which amplicons were purified with Montage96 filter plates (Millipore Corp., Bellerica, Massachusetts), and then sequenced with ABI BigDye chemistry 3.1 (Applied Biosystems). ABI XTerminator was used to purify the sequencing reactions before running them on an ABI 3730 genetic analyzer as described in O’Donnell et al. (2011). Sequence analyses.—The Morchella ITS rDNA sequences we generated (n 5 553) and those downloaded from GenBank (n 5 312) were too divergent to align across the breadth of the genus. Therefore, the following three clade-specific alignments were constructed: (i) Esculenta Clade (n 5 222: 164 new + 58 from GenBank, SUPPLEMENTARY TABLE I), (ii) Elata Clade (n 5 200: 110 new + 90 from GenBank, SUPPLEMENTARY TABLE I), and (iii) Elata Subclade (n 5 443: 348 + 164 from GenBank by others, SUPPLEMENTARY TABLE III). A fourth clade of highly divergent M. rufobrunnea sequences (n 516: 13 new + three from GenBank, SUPPLEMENTARY TABLE IV) also was identified. However, DU ET AL.: MORCHELLA ITS RDNA because these sequences were highly divergent from the aforementioned three clades, they were excluded from further analysis. Sequence chromatograms generated in this study were imported into Sequencher 4.9 (Gene Codes Corp., Ann Arbor, Michigan), where they were edited, aligned and exported as NEXUS files. Assignment of the Morchella sequences downloaded from GenBank using emerencia (Nilsson et al. 2005) to one of the four aforementioned clades was based on an initial maximum parsimony analysis conducted in PAUP* 4.0b10 (Swofford 2002). Sequences in each of the three clade-specific datasets were aligned with MUSCLE 3.8 (Edgar 2004) in SeaView 4.3.0 (http://pbil.univ-lyon1.fr/software/seaview) and then improvements in the alignments were made visually with TextPad 5.1.0 for Windows (http://textpad.com/). To improve phylogenetic resolution sequences of M. tomentosa (Mel-1) and M. steppicola (Mes-1), the earliest diverging species within the Elata and Esculenta clades respectively, were used to root these phylogenies. Sequences of Mel-11 and Mel-25, the sister group of the Elata Subclade (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012), were used to root this phylogeny. In addition to excluding two unidentifiable highly divergent sequences from the emerencia search of GenBank (EU480151.1, U61390.1), the 59 or 39 ends of three sequences obtained from GenBank were trimmed to remove poor sequence (93 bp from 59 end of FJ237237, 194 bp from 39 end of GQ223457, 101 bp from 59 end of EU834821). To construct partitions that only contained unique haplotypes for the final analyses, COLLAPSE 1.2 (http://darwin.uvigo.es/software/collapse) was used to identify haplotypes within each dataset. Instead of using percent identity to place unidentified sequences within similarity clusters (Bonito et al. 2010, Horton and Bruns 2001), haplotypes were identified to species, where possible, by including ITS rDNA sequences of the phylogenetic species identified with multilocus DNA sequence data (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). Each partition was analyzed via maximum parsimony (MP) in PAUP* (Swofford 2002) to obtain tree statistics, including number of parsimony informative characters (PIC); clade support was assessed based on 1000 MP pseudoreplicates of the data (O’Donnell et al. 2011). Due to the large number of MPTs, single phylograms were chosen randomly to present the metadata and clade support for the Esculenta (FIG. 1) and Elata Clades (FIG. 2). However, due to the large number of ITS haplotypes, a 70% majority rule bootstrap consensus was chosen to present the Elata Subclade phylogeny (FIG. 3). All DNA sequence data generated in this study was deposited in GenBank (JQ723016–JQ723151), TreeBASE (S12489, Tr51345-Tr51347) and Morchella MLST at CBSKNAW (http://www.cbs.knaw.nl/morchella/). In addition, accession numbers identifying voucher specimens deposited in herbaria and cultures are listed (SUPPLEMENTARY TABLES I–IV). Construction of Morchella MLST.—This database was created with the BioloMICS software (Robert et al. 2011). All available molecular, specimen and/or culture data were 1353 PHYLOGENETICS imported into the database. The database can be queried by keywords or using one of the two available identification tools. The first one consists of a modified version of BLAST (Altschul et al. 1990). Single sequences can be aligned against the reference database and ranked according to a combination of parameters including similarity, BLAST score, overlap percentage between query and reference sequences and number of fragments. The second tool is called polyphasic identification because it allows aligning (pairwise alignments) one or several sequences from different loci at the same time. A global similarity coefficient is computed with this formula (Gower 1971): n 1 X Sjk ~ P Si Wi n Wi i~1 i~1 Where Sjk is the global similarity coefficient for the comparison of records j and k; Si is the local similarity coefficient for the pairwise alignment of locus I; Wi is the weight of locus I (no differential weighting is applied in the Morchella database); n is the number of loci in the similarity comparison. Results are ranked according to the combination of this global similarity coefficient, the number of loci and the overlap percentage between query and reference sequences. Phenetic trees based on one of the following methods can be produced dynamically to help end users locate the clade to which the unknown specimen belongs: UPGMA, WPGMA, single linkage, complete linkage, UPGMC, WPGMC, ward or neighbor joining using a fixed model (Saitou and Nei 1987). RESULTS Esculenta clade.—The 222-sequence Esculenta Clade dataset included 58 sequences we deposited in GenBank (O’Donnell et al. 2011; Taşkın et al. 2010, 2012) and 164 collected in the present study (SUPPLEMENTARY TABLE I). COLLAPSE 1.2 analysis of the 222-sequence dataset (1349 bp alignment) identified 125 unique haplotypes. The latter dataset was used for all subsequent analyses, after 253 nucleotide positions within the ITS1 were excluded as ambiguously aligned. Maximum parsimony (MP) analysis of the 125-sequence Esculenta Clade dataset recovered . 10 000 equally most parsimonious trees (MPTs) 1045 steps long (FIG. 1, consistency index 5 0.67, retention index 5 0.96). The Esculenta Clade phylogeny was rooted on sequences of M. steppicola Mes-1 based on more inclusive analyses (O’Donnell et al. 2011). Of the 27 species previously identified with multilocus GCPSR, we were able to apply names with confidence only to M. steppicola and five species endemic to North America (FIG. 1, Kuo et al. 2012). Parsimony analysis of the Esculenta Clade dataset indicated that 212/222 (95.5%) of ITS sequences 1354 MYCOLOGIA FIG. 1. One of . 10 000 equally most parsimonious trees (MPTs) inferred from Esculenta Clade (yellow morels) ITS sequences, rooted on sequences of Morchella steppicola (Mes-1), the basal-most member of this clade (O’Donnell et al. 2011). The phylogram contains the 125 unique haplotypes identified within the 222-sequence dataset (SUPPLEMENTARY TABLE I). Each sequence is identified by GenBank or herbarium accession number or laboratory code (SUPPLEMENTARY TABLE I), geographic origin, name used as deposited in GenBank and number of sequences in parentheses that comprise each haplotype represented by two or more sequences. ITS sequences identified by an asterisk were used as a reference for each phylospecies based on prior multilocus GCPSR analysis (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). Because binomials can be applied with confidence to only six of the species, each phylogenetic species (i.e. phylospecies) is identified by Mes followed by a unique Arabic number. Numbers to the right of the phylospecies identify the size and number of sequences (in subscript) of the (ITS1) (5.8SrDNA) (ITS2) (complete ITS region) determined by DNA sequence analysis. Note that ‘‘s’’ indicates the sequence was short and ‘‘?’’ indicates that the species and/or ITS length is unknown. The number above internodes represents maximum parsimony bootstrap support $ 70% based on 1000 pseudoreplicates of the data. Phylospecies Mes-23 and Mes-24 cannot be distinguished with ITS data. Mes-27 from Sichuan Province, China, is featured. DU ET AL.: MORCHELLA ITS RDNA FIG. 1. PHYLOGENETICS 1355 Continued. could be identified to species. The exception involved GenBank sequences of nine collections from China and one from India, none of which were full length (SUPPLEMENTARY TABLE I, FIG. 1). In analyses of the Esculenta Clade dataset, five collections of Mes-23 or Mes-24 from China could not be distinguished when ambiguously aligned regions were included or excluded. In addition, three collections from Xingiang and one from Jilin, which we listed as Mes-?, could be Mes-21. As with the aforementioned unidentified collections, additional data is needed to determine the identity of one collection from India listed as Mes-?. Six of the species (SUPPLEMENTARY TABLE I; i.e. M. diminutiva Mes-2, M. esculentoides Mes-4, Mes-6, Mes-8, Mes-9 and Mes-16) were represented by 15–42 sequences and comprised 132/222 (59.5%) of Esculenta Clade sequences. Of the 27 Esculenta Clade species, M. esculentoides, the most common yellow morel endemic 1356 MYCOLOGIA FIG. 2. Phylogenetic relationships within Elata Clade (black morels) inferred by MP analysis of 60 unique ITS haplotypes identified within the 200-sequence dataset (SUPPLEMENTARY TABLE II). The phylogram is one of . 10 000 MPTs 373 steps long, rooted on sequences of Morchella tomentosa (Mel-1). Haplotype sequences are identified by a herbarium or GenBank accession number or laboratory number (SUPPLEMENTARY TABLE II), geographic origin, the name used for the deposit in GenBank and the number of sequences (in parentheses) that comprise each non-singleton haplotype. An asterisk identifies ITS sequences that were used as a reference for each phylogenetic species (i.e. phylospecies) previously resolved by multilocus GCPSR data. MP bootstrap support is identified above internodes when it was $ 70%. Nine of the species can be identified by binomials; the 10 known phylospecies (O’Donnell et al. 2011) are identified by Mel followed by a unique Arabic number. Two putatively novel species represented by ITS sequences in GenBank are identified by NEW-1? and NEW-2? The size and number of sequences of the (ITS1) (5.8S rDNA) (ITS2) (entire ITS region) analyzed is indicated to the right of each species. Note that ITS sequences for M. populiphila (Mel-5) and NEW-1? from China are not full length. Mel-10 from Yunnan Province, China, is featured. DU ET AL.: MORCHELLA ITS RDNA to North America, was represented by the most sequences (n 5 42). By contrast, we were unable to obtain ITS sequence data from collections of the European endemic Mes-5, the only known phylogenetic species missing from the Morchella ITS dataset. With ITS rDNA sequences of species identified previously via multilocus GCPSR as a reference, separate analyses using maximum parsimony and COLLAPSE 1.2 revealed that 58 Esculenta Clade sequences deposited in GenBank by other researchers comprised at least 10 species (SUPPLEMENTARY TABLE I). Three of the 58 accessions in GenBank were listed as Morchella sp., with the remaining 55 reported as M. crassipes (n 5 29), M. esculenta (n 5 16), M. spongiola (n 5 9) and M. vulgaris (n 5 1). Our results, however, indicate the former three names have been misapplied broadly given that sequences of eight species were deposited in GenBank as M. esculenta and M. crassipes and four as M. spongiola (SUPPLEMENTARY TABLE I). All species represented by sequences deposited under the latter three names are nested within the Esculenta Clade, whereas those listed as M. esculenta included three species within the Esculenta Clade, two within the Elata Clade and three within the Elata Subclade (SUPPLEMENTARY TABLES I–III). Furthermore, our analyses indicate that, at best, only 18/55 (32.7%) of the Esculenta Clade sequences in GenBank are correctly identified to species. We found that 12 of the 16 accessions deposited as M. esculenta are the putative North American endemic M. esculentoides (Mes-4). Moreover, it is plausible that some of the 18 sequences that we assumed to be correct are actually misidentified. For these sequences to be identified correctly the sequences of Mes-16 (n 5 10), Mes-17 (n 5 7) and M. esculentoides (Mes-4, n 5 1) respectively would have to be authentic for M. crassipes, M. spongiola and M. vulgaris. Studies of original material, epi- or neo-typification of recent material and additional sampling of Morchella in Europe are needed to test whether this assumption is correct. We were able to obtain complete sequences of the ITS rDNA region for 24 of the 27 species within the Esculenta Clade in the present study. In addition to missing ITS rDNA data from the European endemic Mes-5, the ITS1 and ITS2 were sequenced only partially in GenBank accession GQ228465 from India, the ITS1 was missing in Mes-18 from Turkey and it was represented by a partial sequence in Mes-20 from Yunnan Province, China. Although six sequences downloaded from GenBank possessed insertions (1–11 bp) and deletions (1–7 bp) within the 5.8S rDNA (SUPPLEMENTARY TABLE V), these indels may represent sequencing errors because the length of this gene appears to be fixed at 156 bp in Morchella based on the sequences we generated from 663 PHYLOGENETICS 1357 collections. However, without the ability to resequence the individuals from which these sequences were generated, we cannot discount the possibility that they represent rare variants within their populations of origin. Analyses of the present dataset have highlighted several noteworthy features of the ITS1 and ITS2 spacers within the Esculenta Clade. Mapping the size of the ITS regions on the molecular phylogeny (FIG. 1) revealed that the ITS1 more than doubled in length during the evolutionary diversification of this clade; the two basal-most lineages possessed the shortest ITS1 spacer (285 bp in M. steppicola Mes-1 and 285–286 bp in M. virginiana Mes-3), whereas it was largest in the two most derived species (610– 611 bp in Mes-8 and 617 bp in Mes-9). By contrast, the ITS2 was remarkably conserved in length within the Esculenta Clade in that the shortest (371 bp in Mes20) and longest (387 bp in Mes-25) spacers differed by only 16 bp (FIG. 1). Intraspecific variation was detected only within the ITS1 (1 bp) in five species and ITS2 (1–2 bp) in four (FIG. 1). Analyses of the ITS regions revealed that the ITS1 (313 PIC/501 bp 5 0.624 PIC/bp) was 2.77 times more informative phylogenetically than the ITS2 (96 PIC/426 bp 5 0.225 PIC/bp). By comparison, the 5.8S rDNA, excluding 13 putatively erroneous indel containing nucleotide positions, was highly conserved in that it possessed only 8 PIC/156 bp (5 0.051 PIC/bp). To assess whether sequences of the ITS1 or ITS2, or portions of these two spacers could be used to identify Esculenta Clade species, we coded the 59, middle and 39 thirds of the ITS1 (59 5 251 bp, middle 5 251 bp, 39 5 252 bp) and 59 and 39 halves of the ITS2 (59 5 213 bp, 39 5 213 bp) as separate character sets and analyzed them and the separate spacers individually with maximum parsimony. These analyses revealed that the ITS1 and 59 third of the ITS1 distinguished the most species (22/26), whereas the 39 third of the ITS1 (16/26) and 59 (16/26) and 39 (14/26) halves of the ITS2 were the least discriminating (SUPPLEMENTARY TABLE VI). Elata clade.—The 200 sequences in the Elata Clade dataset included 90 that we downloaded from GenBank and 110 generated in the present study (SUPPLEMENTARY TABLE II). Validly published binomials can be applied with confidence to only three of the species (i.e. 5 M. tomentosa Mel-1, M. semilibera Mel-3 and M. punctipes Mel-4); however, names for six species endemic to North America have been proposed (FIG. 2, Kuo et al. 2012). Using COLLAPSE 1.2, we identified 60 unique haplotypes within the 200 sequence Elata Clade dataset (996 bp alignment) for subsequent MP phylogenetic analysis. Before 1358 MYCOLOGIA FIG. 3. Bootstrapped 70% majority rule cladogram inferred from maximum parsimony analysis of 106 unique ITS haplotypes in the 443-sequence Elata Subclade (black morels) dataset (SUPPLEMENTARY TABLE III). Sequences of the sister taxa Mel-11 and Mel-25 were used to root the phylogeny. Each haplotype is identified by GenBank or herbarium accession number or laboratory code (SUPPLEMENTARY TABLE III), geographic origin and number of sequences (in parentheses) that comprise each non-singleton haplotype. Haplotypes identified by an asterisk represent ITS sequences used as a reference for each phylogenetically distinct species based on multilocus support for their genealogical exclusivity. Because binomials can be DU ET AL.: MORCHELLA ITS RDNA FIG. 3. PHYLOGENETICS 1359 Continued. r applied with confidence to only four of the species, each phylogenetic species (i.e. phylospecies) is identified by Mel followed by a unique Arabic number. Note that ITS sequences of four putatively novel Elata Subclade species (NEW-3?-NEW-6?) were discovered in GenBank. The size and number of sequences of the (ITS1) (5.8SrDNA) (ITS2) (complete ITS region) determined by DNA sequence analysis is indicated to the right of each phylospecies, except for five sequences where the ITS1 is short (s). Internodes that were supported by $ 70% maximum parsimony bootstrap intervals are indicated. Note that several groups of phylospecies cannot be distinguished using ITS sequence data, with Mel-17/Mel-19/Mel-20/Mel-34, indicated by gray highlight being the most notable. Mel-20 from Yunnan Province, China, is featured. 1360 MYCOLOGIA conducting phylogenetic analyses, 102 ambiguously aligned nucleotide positions within the ITS1 and 27 positions within the ITS2 were excluded from the dataset. To minimize the number of ambiguously aligned nucleotide characters, Elata Clade phylogeny was rooted on sequences of Mel-1 based on more inclusive analyses (O’Donnell et al. 2011). MP analysis of the Elata Clade dataset revealed that all 200 sequences could be identified to species with the ITS rDNA data (SUPPLEMENTARY TABLE II). However, the genealogical exclusivity of two putatively novel species designated NEW-1? and NEW-2? needs to be accessed via GCPSR. Four species composed 79% (158/200) of the sequences, with M. capitata (Mel-9, n 5 53) and M. importuna (Mel-10, n 5 56) being the best represented, followed by M. tomentosa (Mel-1, n 5 25) and M. septimelata (Mel-7, n 5 24). By contrast, seven of the Elata Clade species each were represented by fewer than eight sequences (SUPPLEMENTARY TABLE II), including two putatively novel species designated NEW-1? from China and NEW-2? from China and Germany (SUPPLEMENTARY TABLE II). Moreover, only one-quarter (22/90) of the Elata Clade sequences in GenBank were identified by validly published binomials. Analysis of these accessions revealed that M. importuna (Mel-10) was deposited under four names, one sequence of M. sextelata (Mel-6) and one of M. importuna (Mel-10) were deposited as M. esculenta and three sequences of M. semilibera (Mel-3) were deposited as M. gigas. Although the six sequences of M. tomentosa (Mel-1) are authentic for this species, type studies are needed to determine whether the eight sequences of Mel-10 that we obtained from GenBank as M. conica are identified correctly. Even if we assume that they are, at least 8/22 (36.4%) of the Elata Clade sequences in GenBank can be inferred to be misidentified. However, if M. conica is shown to represent a species other than Mel-10, then 14–16 (63.6–72.7%) of the sequences from this clade can be inferred to be misidentified in GenBank. The Elata Clade dataset contained complete ITS sequences for all of the species except M. populiphila (Mel-5), where approximately one-half of the ITS1 spacer region was missing (FIG. 1). In contrast to what was observed within the Esculenta Clade, the ITS region was largest in the two basal-most taxa, M. tomentosa (Mel-1, 834–835 bp) and M. frustrata (Mel2, 903–904 bp), and smallest in the five most derived taxa, where it was 626–655 bp. By mapping the size of the ITS1/5.8S rDNA/ITS2 on the ITS phylogeny (FIG. 2) and the topologically concordant multilocus phylogeny (O’Donnell et al. 2011 FIG. 3), our results indicate that the ITS1 and ITS2 spacers both have contracted during the evolutionary diversification of the Elata Clade. Comparison of the two species with the largest (M. tomentosa Mel-1, 834–835 bp and M. frustrata Mel-2, 903–904 bp) and smallest (M. septimelata Mel-7, 626–628 bp and Morchella sp. Mel8, 624 bp) spacers revealed that the ITS1 and ITS2 contracted respectively by a factor of 1.7–1.9 and 1.2– 1.3 during the evolution of the Elata Clade. Minor length variation was observed within the ITS1 (2–5 bp) in five species and within the ITS2 (2–3 bp) in three (FIG. 2). Assessment of phylogenetic signal within the ITS regions revealed that, in contrast to the Esculenta Clade, the ITS2 (138 PIC/308 bp 5 0.448 PIC/bp) was 1.08 times more phylogenetically informative than the ITS1 (166 PIC/402 bp 5 0.413 PIC/bp) within the Elata Clade. The 5.8S rDNA, by comparison, was highly conserved, possessing only 5 PIC/ 156 bp (5 0.032 PIC/bp). Finally, to determine how well the ITS1 and ITS2, and equally divided portions of these two spacers, performed in distinguishing species, we coded the 59 and 39 halves of the ITS1 (59 5 252 bp, 39 5 252 bp) and ITS2 (59 5 168 bp, 39 5 167 bp) as separate character sets and analyzed them and alignments of each spacer separately via maximum parsimony. The results of these analyses indicated that sequences from each of these partitions can be used to identify all the species or all but two closely related species pairs within the Elata Clade (SUPPLEMENTARY TABLE VI). Elata subclade.—The 443-sequence Elata Subclade dataset included 164 sequences downloaded from GenBank and 279 sequences generated in the present study. We were able to apply binomials with confidence to only four of the 22 phylogenetically distinct species previously identified within this subclade using multilocus DNA sequence data (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). One of these was for M. angusticeps (Mel-15, Kuo et al. 2012), a species restricted to eastern North America, and three others were names recently proposed for North American endemics (Kuo et al. 2012). Analysis using COLLAPSE 1.2 identifed 106 unique haplotypes (FIG. 3, SUPPLEMENTARY TABLE V). The MP and COLLAPSE 1.2 analyses revealed that only 10/22 species within this clade could be identified with ITS rDNA sequence data (FIG. 3). No sequences were excluded as ambiguously aligned from the Elata Subclade dataset. Therefore, phylogenetic analyses and BLAST queries using ITS rDNA sequences cannot be used to definitively identify 10 species because they share an identical allele with at least one other species (FIG. 3). The MP analysis also suggested that 21 of the sequences obtained from GenBank might represent as many as four putatively novel species that we DU ET AL.: MORCHELLA ITS RDNA designated NEW-3?–NEW-6? (F IG. 3); however, GCPSR-based analyses are needed to determine whether these lineages are genealogically exclusive. One of these species (NEW-4?) appears to be a previously unknown species on the basis of phylogenetic analyses of ITS + LSU rDNA sequence data (Pagliaccia et al. 2011). Forty of the Elata Subclade sequences downloaded from GenBank were deposited as Mel-12 (Pagliaccia et al. 2011); the remaining 55 sequences could not be identified to species (SUPPLEMENTARY TABLE III). Although 95 Elata Subclade sequences were represented in GenBank, only 50 of these were identified to species. However, our analyses indicated that at least 35 of the 50 identifications are incorrect. Seven species were listed as M. elata, four as M. angusticeps and three as M. conica and M. esculenta (SUPPLEMENTARY TABLE III). Complete sequences of the ITS rDNA region were obtained for all species within the Elata Subclade, except for an unassigned species (Morchella sp., GenBank accession number AF000970) and the aforementioned putative species NEW-4? (GenBank accession numbers JF319903–JF319906), which encompassed ITS sequences that were missing approximately 17 bp from the 59 end of the ITS1 (SUPPLEMENTARY TABLE III). Although the ITS1 spacer within this subclade showed slight variation (213– 221 bp) in three species (FIG. 3), the ITS2 was fixed at 277 bp. The latter spacer is identical in length to the ITS2 in Mel-11 and Mel-25, the sisters of the Elata Subclade (O’Donnell et al. 2011), which was used to root the Elata Subclade phylogeny. Intraspecific variation was observed only within the ITS1 of Mel13 (213 or 217 bp), Mel-14 (217 or 221 bp) and a putatively novel species (NEW-5?, 215–216 bp) represented by two collections incorrectly deposited in GenBank as M. elata from India (SUPPLEMENTARY TABLE III, Kanwal et al. 2011). A comparison of phylogenetic signal within the ITS regions showed that the ITS1 (65 PIC/231 bp 5 0.281 PIC/bp) was 2.3 times more informative than the ITS2 (34 PIC/ 279 bp 5 0.122 PIC/bp). By contrast, only two synapomorphic positions were found within the highly conserved 5.8S rDNA (2 PIC/156 bp 5 0.013 PIC/bp). Morchella MLST.—The Web-accessible Morchella MLST database hosted at the CBS-KNAW was constructed to aid molecular identification of morels via the Internet with BLAST queries or phylogenetic analysis. In contrast to GenBank, the Morchella database will include only accession sequences for which voucher specimens or cultures are available from publically accessible herbaria or culture collections. This criterion is the same as that employed by PHYLOGENETICS 1361 the FUSARIUM-ID (Geiser et al. 2004) and Fusarium MLST (http://www.cbs.knaw.nl/fusarium) databases. In addition, the Morchella MLST database houses metadata that can be downloaded for further study. The current version of Morchella MLST contains broad taxonomic sampling from four loci (ITS and LSU rDNA, RPB1, RPB2 and EF-1a). Two options are available for identifying sequences via BLAST queries of the database: (i) a single sequence from one of the four loci can be compared with those housed in Morchella MLST or Morchella MLST and GenBank databases or (ii) multiple sequences can be used, which is recommended especially to obtain a definitive identification for species within the Elata Subclade. In addition to being able to conduct BLAST queries, neighbor joining phylogenetic analysis can be conducted to infer evolutionary relationships of an unknown to species represented in the database. The Website contains a portal by which sequences can be added to the database under the conditions that vouchers, including accession numbers, are made available from publically accessible herbaria and/or culture collections. DISCUSSION Identification of Morchella phylogenetic species using ITS rDNA sequence data.—The present study was modeled after one conducted on a grass-associated clade of Colletotrichum (Crouch et al. 2009) in which species limits were inferred from multilocus GCPSR to provide a framework for determining the utility of ITS data for differentiating species from each other via molecular phylogenetics and by BLAST queries of a dedicated database for morels, Morchella MLST. The primary objective of the present study was to generate an inclusive set of taxonomically validated ITS sequences for morels and to evaluate the accuracy of this single, widely used locus for the dual purpose of phylogenetic reconstruction and species identification (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). To assess their utility within Morchella we generated 553 ITS rDNA sequences representing 61 of the 62 known phylogenetically distinct species and downloaded another 312 sequences from GenBank with emerencia. Although efforts to obtain ITS rDNA sequence data from the European endemic Mes-5 were unsuccessful, we predict that the ITS length will match that of the closely related species Mes-6/Mes-7/Mes-21, where it is conserved at 1089 bp. Our analyses elucidated the strengths and limitations of ITS as the sole locus for distinguishing species within Morchella. The present study revealed that the ITS data could be used to distinguish 77.4% (48/62) 1362 MYCOLOGIA of the known phylogenetic species within Morchella. However, only 68.4% (592/865) of the sequences analyzed could be assigned to one of those species. ITS sequences resolved most species within the Elata and Esculenta clades, but fewer than half of the species within the Elata Subclade. Moreover, as a result of mining ITS data in GenBank, the existence of six putatively novel species was revealed, although their genealogical exclusivity needs to be critically accessed via GCPSR. Two of the potentially novel species were nested within the Elata Clade, where ITS sequences differentiated the 10 known phylogenetic species (Mel1 through Mel-10) and all 200 sequences analyzed from this clade. Another noteworthy finding was that sequences of the ITS1 (504 bp) or ITS2 (335 bp) and even the 59 or 39 halves of either spacer resolved all of the species, or all but two closely related sister species within the Elata Clade (SUPPLEMENTARY TABLE VI). We also discovered that ITS data performed well within the Esculenta Clade in that 92.3% (24/26) of the phylogenetic species were diagnosable and 95.5% (212/222) of the sequences analyzed could be assigned to one of these species. In addition, our analyses revealed that sequences of the ITS1 (754 bp), or only a 251 bp fragment comprising the 39 third of the ITS1, could be used to identify 22/26 of the species within this clade (SUPPLEMENTARY TABLE VI). While it was possible to differentiate close to three-quarters of the Esculenta Clade species (19/26) with complete ITS2 sequences, the 59 and 39 halves of the ITS2 differentiated only 50–60% of the species. One important finding is ITS sequence data has limited utility in differentiating species within the species-rich Elata Subclade. Counting sequences of two sister species used to root the phylogeny (i.e. Mel11 and Mel-25), our results show that only 50% (12/ 24) of species in the Elata Subclade + Mel-11/Mel-25 could be identified with the ITS data and only 44.7% (198/443) of the sequences analyzed could be assigned to one of these species. We attribute the failure of the ITS to resolve species within the Elata Subclade to its relatively recent origin in the middle Miocene, about 12.15 Mya (95% HPD: 9.46–15.48) (Du et al. 2012) and to its subsequent rapid radiation into a species-rich lineage comprising one-third of the true morels (O’Donnell et al. 2011). Rapid radiations result in short internal branch lengths in many gene sequence phylogenies; these short branches have little phylogenetic information (Kraus and Miyamoto 1991) and may cluster randomly due to short-branch attraction (Nei 1996). Thus, understanding species diversity within the Elata Subclade will require identification of additional marker loci with higher phylogenetic signal (Lopez-Giraldez and Townsend 2011) to resolve closely related species and the backbone of the phylogeny; it is likely that only rapidly evolving loci will be informative due to the short-branch phenomena described previously. Our results also should serve as a cautionary note to not assume a priori 3% ITS interspecific variability within Morchella and other fungi (Nilsson et al. 2006). While it is convenient to define species or phylotypes phenetically in fungal environmental surveys (Peay et al. 2008), or in systematic studies focused on mining metadata from GenBank accessions (Bonito et al. 2010, Ryberg et al. 2008), our results and those of others (Crouch et al. 2009, Roe et al. 2010, Will et al. 2005) strongly indicate that this approach is likely to significantly underestimate species diversity. Fortunately, in instances where researchers querying Morchella MLST are unable to identify unknowns with ITS data, they can conduct additional searches of this reference database using sequences of one or more of the three single-copy protein-coding genes (i.e. RPB1, RPB2, EF-1a), which were developed initially within the framework of the AFTOL project (James et al. 2006, Matheny 2005, Reeb et al. 2004). Evolution of the ITS regions.—The current study adds to our growing knowledge of ITS evolution and variability within the Fungi. In contrast to the significant ITS divergence reported for diverse species inferred from GenBank data (Nilsson et al. 2008), we detected surprisingly little intraspecific variability within the ITS region in Morchella; the highest we observed was only 1.3% in M. esculentoides (Mes-4). However, had we grouped sequences by the names used to identify them in GenBank, this estimate would have been considerably higher because multiple species are accessioned under the same name in this database. Our analyses of GenBank accessions, for example, revealed that those listed as M. esculenta, M. crassipes and M. elata each comprised eight different phylogenetic species (SUPPLEMENTARY TABLES I–III). To further illustrate problems caused by misidentified sequences in GenBank (Bridge et al. 2003, Vilgalys 2003), Nilsson et al. (2008) reported 3.1% ITS divergence in Fusarium solani; however, this estimate is flawed because our GCPSR studies revealed that this morphospecies comprises more than 50 distinct phylogenetic species (O’Donnell 2000, O’Donnell et al. 2008, Zhang et al. 2006). In contrast to the aforementioned survey of ITS variation within diverse fungal species, our analyses showed that the highest ITS divergence for any species within the F. solani species complex was only 0.1% in F. falciforme. Estimates of ITS variation can be affected significantly by putative paralogs and intragenomic variation, as reported in diverse fungi (Lindner and Banik 2011, O’Donnell et al. 1997, DU ET AL.: MORCHELLA ITS RDNA Simon and Weiß 2008), but these were not detected within the ITS of Morchella. In addition, we did not detect any gene tree-species tree discordance between the morel ITS rDNA and the RPB1, RPB2 and EF-1a phylogenies. Results of the present study fit experimental data in budding and fission yeast that indicate the ITS1 is generally less constrained evolutionarily than the ITS2 (Abeyrathne and Nazar 2005, van Nues et al. 1995). Early in our studies we discovered that ITS length variation was primarily due to length mutations within the ITS1 (O’Donnell et al. 1993), so we used variation at this locus to screen cultures and herbarium specimens via PCR to sample for the greatest global genetic diversity (O’Donnell et al. 2011), using the universal PCR primers ITS5 3 ITS4 (White et al. 1990). In the present study we discovered that the ITS1 more than doubled in length during the evolutionary diversification of the Esculenta Clade whereas its length contracted by 50% within the Elata Clade. Significant length variation has been documented within the ITS region in other fungi (reviewed in Hausner and Wang 2005), with the greater than threefold increase reported within the ITS1 of Cantharellus being one of the most dramatic (Feibelman et al. 1994). One of the more noteworthy findings of the present study was that ITS2 length was highly conserved during the estimated 68.7 million year (95% HPD interval: 49.19–93.03] evolutionary history of the Esculenta Clade (Du et al. 2012), with the shortest and longest spacers differing in length by only 16 bp. Although not as highly conserved, the ITS2 only varied by 53 bp across the phylogenetic breadth of the Elata Clade. Although MUSCLE (Edgar 2004) produced fewer gaps in the final alignment when compared to Clustal W (Thompson et al. 1994) and MAAFT 6 (Katoh and Toh 2008) (data not shown), considerable manual adjustments were required and even then one-third and one-fifth of the ITS1 nucleotide positions were excluded respectively from the Esculenta and Elata Clade datasets due to ambiguity in the alignment. Consistent with uniform reports of the 5.8S rDNA being highly conserved in the Fungi (Nilsson et al. 2008), our analyses indicate the 5.8S rDNA was fixed at 156 bp within Morchella and in the hypogeous (Trappe et al. 2010) and other epigeous members of the Morchellaceae (Hansen and Pfiser 2006, O’Donnell et al. 1997). Conservation of the 5.8S rDNA also was reported in an extensive survey of the core group of Peziza, where it was fixed at 157 bp (Hansen et al. 2002). Based on our findings that six of the Morchella sequences we downloaded from GenBank had length mutations within the 5.8S rDNA that necessitated inserting indels within the alignments PHYLOGENETICS 1363 (SUPPLEMENTARY TABLE V), and three other ITS sequences could be used only once, bad sequence data at the 59 or 39 end were removed, it would be highly desirable if automated processes at GenBank were in place to identify sequences with probable sequencing errors before they end up in the database. Of course, this necessitates the adoption of a sequence standard for each species or strain to which a deposited sequence could be compared, in which case rare variant alleles that encompass insertions or deletions also must be taken into account. While such a method could be based on models of nucleotide substitution, they do not incorporate the probability that an insertion or deletion occurs due to the intractability of that problem (i.e. While the likelihood of an ARG transition can be predicted based on observations from the sequence data, how can the probability be calculated that a nucleotide will be inserted between any two nucleotides?). This problem requires that an alternative solution be found. Can GenBank and emerencia be used to identify Morchella species?— With 66% or more of the sequences misidentified, Morchella may have the dubious distinction of having the highest percentage of poorly annotated named fungal ITS sequences in GenBank. This estimate significantly eclipses the often cited ‘‘upwards of 20%’’ reported as the worst case scenario for certain groups of the Fungi (Bridge et al. 2003, Nilsson et al. 2006, Vilgalys 2003). Our startling discovery prompts three obvious questions: (i) Why are Morchella sequences so poorly annotated in GenBank?; (ii) Can these databases be used for species identifications, and if so, how? And (iii) is the poor state of Morchella annotation in GenBank closer to the rule rather than the exception? Lack of knowledge of species boundaries and the absence of type studies both have contributed to the large number of misidentifications. Contrary to the general assumption, our studies indicate that Morchella species are not cosmopolitan in distribution (Du et al. 2012; O’Donnell et al. 2011; Taşkın et al. 2010, 2012). However, because Kanwal et al. (2011) incorrectly assumed M. crassipes, M. elata and M. angusticeps were present in India, and did not identify their accession, GQ228474, as M. tomentosa (Mel-2), 75% (12/16) of the ITS sequences they deposited in GenBank were misidentified. Once misidentified sequences such as these are accessioned, they then can serve as the reference for subsequent BLAST queries of GenBank and identifications using emerencia (Nilsson et al. 2006). Another potential source of error is the inclination of individual researchers to deposit sequences under a validly published binomial regardless of whether the sequence is actually 1364 MYCOLOGIA assignable to that species. However, because almost every GCPSR-based study, including our own, has discovered widespread cryptic speciation, consistent with the prediction that only , 7% of fungal diversity has been discovered (Hawksworth 2004), it is essential that sequences of phylogenetically distinct species be identified by some standardized informal nomenclature when they are deposited in GenBank and emerencia, especially when applying a binomial is problematic. To this end, we distinguished sequences of phylogenetic species within the Esculenta and Elata Clades respectively by Mes and Mel followed by a unique Arabic number, thereby facilitating DNA sequence-based identifications of Morchella in GenBank. A BLAST query of GenBank matching one of these sequences identifies the organism (e.g. ‘‘Morchella sp. Mel-14’’ represented by GU551435) and provides a clarifying note (e.g. note 5 ‘‘phylogenetic species Mel-14’’). However, because at least 66% of the named sequences in GenBank are misidentified, interpreting BLAST results can be complicated. We found emerencia (Nilsson et al. 2005) to be extraordinarily useful in retrieving Morchella ITS sequences represented in GenBank because it also found several from environmental studies that were listed ‘‘uncultured fungus’’ (Taylor et al. 2008) and one as ‘‘Pezizomycetes sp.’’ (SUPPLEMENTARY TABLE III). Results of the present study, however, show that emerencia cannot be used to obtain reliable species identifications for Morchella because two-thirds of the ‘‘fully identified sequences’’ (i.e. those with binomials) it retrieved from GenBank as a reference were misidentified. To provide an example of the problems we encountered when we queried emerencia with separate sequences of phylogenetic species Morchella sp. Mel-20 (GU551406, O’Donnell et al. 2011; HM056383, Taşkın et al. 2010), the best BLAST matches were identified respectively as M. esculenta (EU600241) and M. elata (EF081000). This heuristic should serve to alert users of emerencia, as cautioned by its authors (Nilsson et al. 2005), that ‘‘fully identified’’ and correctly identified are not synonymous. Because the factors that contribute to poorly annotated sequences in GenBank, such as widespread cryptic speciation and failure to include type specimens as a reference, are hardly unique to Morchella, it seems likely that the percentage of misidentified sequences for certain groups of fungi in GenBank could easily exceed the current estimate of 20% (Bridge et al. 2003). Looking to the future, it is clear that multilocus GCPSR-based studies will play an ever-increasing role in assessing the genetic diversity of fungi, increasing our ability to optimally mine sequence data in GenBank. By constructing an ITS dataset that comprised all but one of the Morchella species inferred via GCPSR, we were able to identify endophytes of Bromus (Baynes et al. 2012) and Juniperus scopulorum (http://www.ncbi.nlm.nih.gov/ nuccore/GQ153010) in western North America, a putative ectomycorrhizal associate of the terrestrial orchid Gymnadenia conopsea in Germany (Stark et al. 2009), hypothesize the existence of six putatively novel species from ITS sequences in GenBank (SUPPLEMENTARY TABLES II–III) and identify threequarters of the Morchella accessions in GenBank to phylospecies. The latter revealed: (i) several species with intercontinental distributions, including four fire-adapted species native to western North America (Du et al. 2012, O’Donnell et al. 2011, Taşkın et al. 2012); (ii) the putative North American endemic M. esculentoides (Mes-4) in central Europe (Kellner et al. 2005) and (iii) that Morchella sp. (Mes-16) appears to be the most widely distributed species, based on collections from China (Du et al. 2012), Israel (Masaphy et al. 2010), Turkey (Taşkın et al. 2012), New Zealand (GenBank JF423317), India (Kanwal et al. 2011), three African countries (Degreef et al. 2009), and Hawaii and Java (O’Donnell et al. 2011). As proposed for true truffles in the genus Tuber (Bonito et al. 2010) and diverse ectomycorrhizal fungi (Vellinga et al. 2009), our working hypothesis is that most if not all disjunct distributions in Morchella are due to human introductions of species into nonindigenous areas. Examples include the North American endemics M. esculentoides (Mes-4) and M. importuna (Mel-10), which were discovered respectively in Europe and Turkey (Taşkin et al. 2012) and Europe, Turkey and China (Du et al. 2012, Taşkin et al. 2012). We theorize that their introduction into Europe and China might have been mediated by relatively recent anthropogenic activities, which should minimize the likelihood that they represent later synonyms of taxa described from European and Chinese collections (Kuo et al. 2012). The relatively recent discoveries that morels (Baynes et al. 2012) and Verpa (Stark et al. 2009) can grow endophytically in plants and form mycorrhizallike associations with angiosperms (Stark et al. 2009) and gymnosperms (Dahlstrom et al. 2000) suggest that economically important plants likely serve as vectors for the vertical transmission of morels across oceans. With the whole genome sequencing of one species within the Elata Clade nearing completion (J. Spatafora pers comm) and others likely to follow in the near future, rich genetic resources such as these should provide a wealth of molecular markers for phylogeographic and phylogenetic studies needed to distinguish between human-mediated introductions and long distance dispersal. Phylogenomic analyses should DU ET AL.: MORCHELLA ITS RDNA help elucidate the underlying genetic basis of niche adaption of the Esculenta and Elata Clades respectively to temperate deciduous and coniferous forests, help guide genetic studies focused on strain improvement for commercial cultivation and advance our understanding of species diversity and distribution so that this invaluable genetic resource can be managed to prevent overharvesting, thereby promoting long term species conservation. Morchella MLST.—We were motivated to construct Morchella MLST based on our discovery that at least two-thirds of the named ITS sequences of Morchella in GenBank were misidentified and because this problem is exacerbated when emerencia is applied (Nilsson et al. 2006), given that the latter database uses poorly annotated named sequences in GenBank as the reference to re-identify sequences, including those of distinct phylogenetic species inferred to be genealogically exclusive from multilocus DNA sequence data (O’Donnell et al. 2011; Taşkın et al. 2010, 2012). Because many people who deposit sequences use the names in GenBank as a reference, the system unfortunately is designed to perpetuate itself (Bidartondo et al. 2008, Vilgalys 2003). To obviate potential problems created by poorly annotated sequences in GenBank we constructed the Internet-accessible Morchella MLST to provide a means by which the scientific community can contribute to and access high quality electropherograms of voucher specimens and/or cultures that are accessible to interested researchers (Kang et al. 2010). An added advantage of a dedicated database is that the informal nomenclatural system that we developed for distinguishing the phylogenetically distinct species by clade (i.e. Mes or Esculenta and Mel for Elata/Elata Subclade) followed by a unique Arabic number can be implemented. This informal system was adopted out of necessity because our studies surprisingly revealed that most of the species in North America, Turkey and Asia are novel, negating the assumption that European taxa are cosmopolitan, and because of the broad uncertainty on how to apply validly published names. The utility of Morchella MLST, as for any reference library dedicated to species identification and discovery, is the dense taxon sampling in which species were resolved a priori by multilocus GCPSR. One of the overarching goals of Morchella MLST is to stimulate integrative taxonomic studies so that all of the phylospecies receive specific epithets, even in instances where they cannot be reliably diagnosed phenotypically (Kuo et al. 2012). A comprehensive nomenclatural and comparative morphotaxonomic study of these species is needed to be able to apply European names of Morchella to PHYLOGENETICS 1365 species delimited via GCPSR. Roughly 82 valid European species names of Morchella are available, excluding many subspecific epithets (see Index Fungorum http://www.indexfungorum.org/), the majority of which are based on material from France, Czechoslovakia, Italy, the Netherlands and Sweden. Because many of these names date back to 1800 and early 1900, it is likely that original material either does not exist or that it will be much too old to yield useful DNA sequence data. These names should be lectotypified as needed, or if neither material nor an illustration exists a neotype should be selected (the latter with newly collected material, including cultures and multilocus DNA sequence data). If an original type is too old to yield useful DNA or is an illustration, it will be necessary to designate a recent collection (with cultures and DNA sequence data) as an epitype to be able to stabilize the name, considering the high morphological plasticity of macroscopic characters within species of Morchella and stasis in several microscopic characters across the genus. The neotype or epitype preferably should be from the type locality or country. Europe is the poorest sampled area in our surveys, and to fully explore the diversity and boundaries of these species a more thorough molecular and morphological study of fresh collections from for example France, Czech Republic, Slovakia, Italy, the Netherlands, Sweden, Germany and Spain will be essential. Circumscriptions and synonymies of iconic European species, such as M. esculenta, M. crassipes, M. deliciosa, M. conica, M. elata, M. distans and M. hiemalis, need to be clarified. Types of a number of recently published taxa from Europe (e.g. Jacquetant 1984, Jacquetant and Bon 1984) need to be located and subjected to similar morphological and molecular phylogenetic analyses. ACKNOWLEDGMENTS Special thanks are due Nathane Orwig for processing DNA sequences in the NCAUR DNA core facility. The research visit of X.-H.D. to NCAUR in 2011 was supported by the National Basic Research Program of China (No. 2009CB522300), the Hundred Talents Program of the Chinese Academy of Sciences, Joint Funds of the National Natural Science Foundation of China and the Yunnan provincial government (No. U0836604). The mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The USDA is an equal opportunity provider and employer. LITERATURE CITED Abeyrathne PD, Nazar RN. 2005. Parallels in rRNA processing: conserved features in the processing of 1366 MYCOLOGIA the internal transcribed spacer I in the pre-rRNA from Schizosaccharomyces pombe. Biochemistry 44:16977– 16987, doi:10.1021/bi051465a Agerer R, Ammirati J, Blanz P, Courtecuisse R, Desjardin DE, Gams W, Hallenberg N, Halling R, Hawksworth DL, Horak E, Korf RP, Mueller GM, Oberwinkler F, Rambold G, Summerbell RC, Triebel D, Watling R. 2000. Always deposit vouchers. Mycol Res 104:642–644. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215: 403–410. Baynes M, Newcombe G, Dixon L, Castlebury L, O’Donnell K. 2012. A novel plant-fungal mutualism associated with fire. Fungal Biol 116:113–144, doi:10.1016/j.funbio. 2011.10.008 Bidartondo MI, Bruns TD, Blackwell M. 2008. Preserving accuracy in GenBank. Science 319:1616, doi:10.1126/ science.319.5870.1616a Bonito GM, Gryganskyi AP, Trappe JM, Vilgalys R. 2010. A global meta-analysis of Tuber ITS rDNA sequences: species diversity, host association and long-distance dispersal. Mol Ecol 19:4994–5008, doi:10.1111/j.1365-294X.2010. 04855.x Bridge PD, Roberts PJ, Spooner BM, Panchal G. 2003. On the unreliability of published DNA sequences. New Phytol 160:43–48, doi:10.1046/j.1469-8137.2003.00861.x Crouch JA, Clarke BB, Hillman BI. 2009. What is the value of ITS sequence data in Colletotrichum systematics and species diagnosis? A case study using the falcate-spored graminicolous Colletotrichum group. Mycologia 101: 648–656, doi:10.3852/08-231 Dahlstrom JL, Smith JE, Weber NS. 2000. Mycorrhiza-like interaction by Morchella with species of the Pinaceae in pure culture synthesis. Mycorrhiza 9:279–285, doi:10. 1007/PL00009992 Degreef J, Fischer E, Sharp C, Raspé O. 2009. African Morchella crassipes (Ascomycota, Pezizales) revealed by analysis of nrDNA ITS. Nova Hedwigia 88:11–22, doi:10.1127/0029-5035/2009/0088-0011 Dettman JR, Jacobson DJ, Taylor JW. 2003. A multilocus genealogical approach to phylogenetic species recognition in the model eukaryote Neurospora. Evolution 57:2703–2720. Du X-H, Zhao Q, O’Donnell K, Rooney AP, Yang ZL. 2012. Multigene molecular phylogenetics reveals true morels (Morchella) are especially species-rich in China. Fungal Genet Biol 49:455–469, doi:10.1016/j.fgb.2012.03.006. Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinform 5:1–19, doi:10.1186/1471-2105-5-113 Feibelman T, Bayman P, Cibula WG. 1994. Length variation in the internal transcribed spacer of ribosomal DNA in chanterelles. Mycol Res 98:614–618, doi:10.1016/ S0953-7562(09)80407-3 Gardes M, Bruns TD. 1993. ITS primers with enhanced specificity for basidiomycetes—application to the identification of mycorrhizae and rusts. Mol Ecol 2:113–118, doi:10.1111/j.1365-294X.1993.tb00005.x Geiser DM, Jiménez-Gasco MdM, Kang S, Makalowska I, Veeraraghavan N, Ward TJ, Zhang N, Kuldau GA, O’Donnell K. 2004. FUSARIUM-ID v. 1.0: A DNA sequence database for identifying Fusarium. Eur J Plant Pathol 110: 473–479, doi:10.1023/B:EJPP.0000032386.75915.a0 Gower JC. 1971. A general coefficient of similarity and some of its properties. Biometrics 27:857–874, doi:10.2307/ 2528823 Hansen K, Laessøe T, Pfister DH. 2002. Phylogenetic diversity in the core group of Peziza inferred from ITS sequences and morphology. Mycol Res 106:879– 902, doi:10.1017/S0953756202006287 ———, Pfister DH. 2006. Systematics of the Pezizomycetes—the operculate discomycetes. Mycologia 98: 1029–1040, doi:10.3852/mycologia.98.6.1029 Hausner G, Wang X. 2005. Unusual compact rDNA gene arrangements within some members of the Ascomycota: evidence for molecular co-evolution between ITS1 and ITS2. Genome 48:648–660, doi:10.1139/g05-037 Hawksworth DL. 2004. Fungal diversity and its implications for genetic resource collections. Stud Mycol 50:9–18. Horton TR, Bruns TD. 2001. The molecular revolution in ectomycorrhizal ecology: peeking into the black box. Mol Ecol 10:1855–1871, doi:10.1046/j.0962-1083.2001. 01333.x Jacquetant E. 1984. Les Morilles. Paris: La Bibliothèque des Arts. p 7–114. ———, Bon M. 1984. Typifications et mises au point nomenclaturales dans l’ouvrage ‘‘Les Morilles’’ (de E. Jacquetant), Nature-Piantanida 1984. Doc Mycol 14:1. James TY, Kauff F, Schoch CL, Matheny PB, Hofstetter V, Cox CJ, Celio G, Gueidan C, Fraker E, Miadlikowska J, Lumbsch HT, Rauhut A, Reeb V, Arnold AE, Amtoft A, Stajich JE, Hosaka K, Sung GH, Johnson D, O’Rourke B, Crockett M, Binder M, Curtis JM, Slot JC, Wang Z, Wilson AW, Schussler A, Longcore JE, O’Donnell K, Mozley-Standridge S, Porter D, Letcher PM, Powell MJ, Taylor JW, White MM, Griffith GW, Davies DR, Humber RA, Morton JB, Sugiyama J, Rossman AY, Rogers JD, Pfister DH, Hewitt D, Hansen K, Hambleton S, Shoemaker RA, Kohlmeyer J, Volkmann-Kohlmeyer B, Spotts RA, Serdani M, Crous PW, Hughes KW, Matsuura K, Langer E, Langer G, Untereiner WA, Lücking R, Budel B, Geiser DM, Aptroot A, Diederich P, Schmitt I, Schultz M, Yahr R, Hibbett DS, Lutzoni F, McLaughlin DJ, Spatafora JW, Vilgalys R. 2006. Reconstructing the early evolution of Fungi using a six-gene phylogeny. Nature 443:818–822, doi:10.1038/nature05110 Kang S, Mansfield MA, Park B, Geiser DM, Ivors KL, Coffey MD, Grünwald NJ, Martin FN, Lévesque CA, Blair JE. 2010. The promise and pitfalls of sequence-based identification of plant-pathogenic fungi and oomycetes. Phytopathology 100:732–737, doi:10.1094/ PHYTO-100-8-0732 Kanwal HK, Acharya K, Ramesh G, Reddy MS. 2011. Molecular characterization of Morchella species from the western Himalayan region of India. Curr Microbiol 62:1245–1252, doi:10.1007/s00284-010-9849-1 Katoh K, Toh H. 2008. Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform 9:286–298, doi:10.1093/bib/bbn013 Kellner H, Renker C, Buscot F. 2005. Species diversity within the Morchella esculenta group (Ascomycota: Morchellaceae) in DU ET AL.: MORCHELLA ITS RDNA Germany and France. Org Diver Evol 5:101–107, doi:10.1016/j.ode.2004.07.001 Kraus F, Miyamoto MM. 1991. Cladogenesis among the pecoran ruminants: evidence from mitochondrial DNA sequences. Syst Zool 40:117–130, doi:10.2307/2992252 Kuo M, Dewsbury DR, O’Donnell K, Carter MC, Rehner SA, Moore JD, Moncalvo J-M, Canfield SA, Methven AS, Volk TJ. 2012. A taxonomic revision of true morels (Morchella) in Canada and the United States. Mycologia 104:1159–1177, doi:10.3852/11-375 Lindner DL, Banik MT. 2011. Intragenomic variation in the ITS rDNA region obscures phylogenetic relationships and inflates estimates of operational taxonomic units in genus Laetiporus. Mycologia 103:731–740, doi:10.3852/ 10-331 Lopez-Giraldez F, Wang Z, Townsend JP. 2011. Assessing phylogenetic information content in fungal genomes with PhyDesign, a Web application for smart marker selection. Inoculum 62:30. Masaphy S, Zabari L, Goldberg D, Jander-Shgag G. 2010. The complexity of Morchella systematics: a case of the yellow morel in Israel. Fungi 3:14–18. Matheny PB. 2005. Improving phylogenetic inference of mushrooms with RPB1 and RPB2 nucleotide sequences (Inocybe; Agaricales). Mol Phylogenet Evol 35:1–20, doi:10.1016/j.ympev.2004.11.014 Nei M. 1996. Phylogenetic analysis in molecular evolutionary genetics. Annu Rev Genet 30:371–403, doi:10.1146/ annurev.genet.30.1.371 Nilsson RH, Kristiansson E, Ryberg M, Hallenberg N. 2008. Intraspecific ITS variability in the kingdom Fungi as expressed in the international sequence databases and its implications for molecular species identification. Evol Bioinform 4:193–201. ———, ———, ———, Abarenkov K, Larsson K-H, Köljalg U. 2006. Taxonomic reliability of DNA sequences in public sequence databases: a fungal perspective. PLoS One 1, e59, doi:10.1371/journal.pone.0000059 ———, ———, ———, Larsson K-H. 2005. Approaching the taxonomic affiliation of unidentified sequences in public databases—an example from the mycorrhizal fungi. BMC Bioinform 6:178, doi:10.1186/1471-2105-6-178 O’Donnell K. 2000. Molecular phylogeny of the Nectria haematococca-Fusarium solani species complex. Mycologia 92:919–938, doi:10.2307/3761588 ———, Cigelnik E. 1997. Two divergent intragenomic rDNA ITS2 types within a monophyletic lineage of the fungus Fusarium are nonorthologous. Mol Phylogeneti Evol 7:103–116, doi:10.1006/mpev.1996.0376 ———, ———, Weber NS, Trappe JM. 1997. Phylogenetic relationships among ascomycetous truffles and the true and false morels inferred from 18S and 28S ribosomal DNA sequence analysis. Mycologia 89:48–65, doi:10. 2307/3761172 ———, Rooney AP, Mills GL, Kuo M, Weber NS, Rehner SA. 2011. Phylogeny and historical biogeography of true morels (Morchella) reveals an early Cretaceous origin and high continental endemism and provincialism in the Holarctic. Fungal Genet Biol 48:252–265, doi:10.1016/j.fgb.2010.09.006 PHYLOGENETICS 1367 ———, Sutton DA, Fothergill A, McCarthy D, Rinaldi MG, Brandt ME, Zhang N, Geiser DM. 2008. Molecular phylogenetic diversity, multilocus haplotype nomenclature and in vitro antifungal resistance within the Fusarium solani species complex. J Clin Microbiol 46: 2477–2490, doi:10.1128/JCM.02371-07 ———, Weber NS, Hains JH, Mills GL. 1993. Molecular systematics and phylogenetics of the Morchellaceae. Inoculum 44:52. Ower R, Mills GL, Malachowski JA. 1986. Cultivation of Morchella. US patent 4594809. Pagliaccia D, Douhan GW, Douhan L, Peever TL, Carris LM, Kerrigan JL. 2011. Development of molecular markers and preliminary investigation of the population structure and mating system in one lineage of black morel (Morchella elata) in the Pacific Northwestern USA. Mycologia 103:969–982, doi:10.3852/10-384 Peay KG, Kennedy PG, Bruns TD. 2008. Fungal community ecology: a hybrid beast with a molecular master. BioScience 58:799–810, doi:10.1641/B580907 Pilz D, McLain R, Alexander S, Villarreal-Ruiz L, Berch S, Wurtz TL, Parks CG, McFarlane E, Baker B, Molina R, Smith JE. 2007. Ecology and Management of Morels Harvested from the Forests of Western North America. Corvallis, OR: PNW–GTR-710. USDA Forest Service. Reeb V, Lutzoni F, Roux C. 2004. Contribution of RPB2 to multilocus phylogenetic studies of the euascomycetes (Pezizomycotina, Fungi) with species emphasis on the lichen-forming Acarosporaceae and evolution of polyspory. Mol Phylogenet Evol 32:1036–1060, doi:10. 1016/j.ympev.2004.04.012 Robert V, Szoke S, Jabas J, Vu D, Chouchen O, Blom E, Cardinali G. 2011. BioloMICS software, biological data management, identification, classification and statistics. Open Appl Informatics J 5:87–98, doi:10.2174/ 1874136301105010087 Roe AD, Rice AV, Bromilow SE, Cooke JEK, Sperling FAH. 2010. Multilocus species identification and fungal DNA barcoding: insights from blue stain fungal symbionts of the mountain pine beetle. Mol Ecol Res 10:946–959, doi:10.1111/j.1755-0998.2010.02844.x Ryberg M, Nilsson RH, Kristiansson E, Töpel M, Jacobsson S, Larsson E. 2008. Mining metadata from unidentified ITS sequences in GenBank: a case study in Inocybe (Basidiomycota). BMC Evol Biol, doi:10.1186/1471-2148-8-50 Saitou N, Kristiansson E, Sjökvist E, Nilsson RH. 2009. An outlook on the fungal internal transcribed spacer sequences in GenBank and the introduction of a Web-based tool for the exploration of fungal diversity. New Phytol 181: 471–477, doi:10.1111/j.1469-8137.2008.02667.x ———, Nei N. 1987. The neighbor joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 14:406–425. Simon UK, Weiß M. 2008. Intragenomic variation of fungal ribosomal genes is higher than previously thought. Mol Biol Evol 25:2251–2254, doi:10.1093/molbev/ msn188 Stark C, Babik W, Durka W. 2009. Fungi from the roots of the common terrestrial orchid Gymnadenia conopsea. Mycol Res 113:952–959, doi:10.1016/j.mycres.2009.05.002 1368 MYCOLOGIA Stefani FOP, Sokolski S, Wurtz TL, Piché Y, Hamelin RC, Fortin JA, Bérubé JA. 2010. Morchella tomentosa: a unique belowground structure and a new clade of morels. Mycologia 102:1082–1088, doi:10.3852/09-294 Swofford DL. 2002. PAUP* 4.0b10: phylogenetic analysis using parsimony (*and other methods). Sunderland, Massachusetts: Sinauer Associates. Taşkın H, Büyükalaca S, Dogan HH, Rehner SA, O’Donnell K. 2010. A multigene molecular phylogenetic assessment of true morels (Morchella) in Turkey. Fungal Genet Biol 47:672–682, doi:10.1016/j.fgb.2010.05.004 ———, ———, Hansen K, O’Donnell K. 2012. Multilocus phylogenetic analysis of true morels (Morchella) reveals high levels of endemics in Turkey relative to other regions of Europe. Mycologia 104, 446–461. Taylor DL, Booth MG, McFarland JW, Herriott IC, Lennon NJ, Nusbaum C, Marr TG. 2008. Increasing ecological inference from high throughput sequencing of fungi in the environment through a tagging approach. Mol Ecol Resour 8:742–752, doi:10.1111/j.1755-0998.2008.02094.x Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Geiser DM, Hibbett DS, Fischer MC. 2000. Phylogenetic species recognition and species concepts in fungi. Fungal Genet Biol 31:21–32, doi:10.1006/fgbi.2000.1228 ———, Turner E, Townsend JP, Dettman JR, Jacobson D. 2006. Eukaryotic microbes, species recognition and the geographic limits of species: examples from the kingdom Fungi. Phil Trans R Soc B 361:1947–1963, doi:10.1098/rstb.2006.1923 Thompson JD, Higgins DG, Gibson TJ. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680, doi:10.1093/nar/22.22.4673 Trappe MJ, Trappe JM, Bonito GM. 2010. Kalapuya brunnea gen. and sp. nov. and its relationship to the other sequestrate genera in Morchellaceae. Mycologia 102: 1058–1065, doi:10.3852/09-232 van Nues RW, Rientjes JMJ, Morré SA, Mollee E, Planta RJ, Venema J, Raué HA. 1995. Evolutionary conserved structural elements are critical for processing on internal transcribed spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA. J Mol Biol 250: 24–26, doi:10.1006/jmbi.1995.0355 Vellinga EC, Wolfe BE, Pringle A. 2009. Global patterns of ectomycorrhizal introductions. New Phytol 181:960– 973, doi:10.1111/j.1469-8137.2008.02728.x Vilgalys R. 2003. Taxonomic misidentification in public DNA databases. New Phytol 160:4–5, doi:10.1046/ j.1469-8137.2003.00894.x White TJ, Bruns T, Lee S, Taylor J. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis MA, Gelfand HH, Sninsky JJ, White TJ, eds. PCR protocols: a guide to methods and applications. San Diego: Academic Press. p 315– 322. Will KW, Mishler BD, Wheller QD. 2005. The perils of DNA barcoding and the need for integrative taxonomy. Syst Biol 56:844–851, doi:10.1080/10635150500354878 Zhang N, O’Donnell K, Sutton DA, Nalim FA, Summerbell RC, Padhye AA, Geiser DA. 2006. Members of the Fusarium solani species complex that cause infections in both humans and plants are common in the environment. J Clin Microbiol 44: 2186– 2190, doi:10.1128/JCM.00120-06 Zhao Q, Xu ZZ, Cheng YH, Qi SW, Hou ZJ. 2009. Artificial cultivation of Morchella conica. Southwest China J Agric Sci 22:1690–1693 (in Chinese). Note. Shortly after our paper was accepted for publication we discovered the following article: Clowez P. 2012. Les morilles, une nouvelle approche mondiale du genre Morchella. Bull Soc Mycol Fr 126:199–376. Because several species from North America were described in Clowez (2012), at least six species described in Kuo et al. (2012) and whose names were used herein (i.e. M. virginiana [Mes-3], M. esculentoides [Mes-4], M. populiphila [Mel-5], M. septimelata [Mel-7], M. importuna [Mel-10] and M. cryptic [Mel-11]) are later synonyms of taxa described in Clowez (2012). Molecular phylogenetic studies are in progress to determine which phylospecies is represented by each type included in the latter publication (Franck Richard and Mathieu Sauve pers comm).
© Copyright 2026 Paperzz