Comparison of potential diatom `barcode` genes

International Journal of Systematic and Evolutionary Microbiology (2015), 65, 1369–1380
DOI 10.1099/ijs.0.000076
Comparison of potential diatom ‘barcode’ genes
(the 18S rRNA gene and ITS, COI, rbcL) and their
effectiveness in discriminating and determining
species taxonomy in the Bacillariophyta
Liliang Guo, Zhenghong Sui, Shu Zhang, Yuanyuan Ren and Yuan Liu
Correspondence
Zhenghong Sui
Key Laboratory of Marine Genetics and Breeding of Ministry of Education,
College of Marine Life Sciences, Ocean University of China, Qingdao 266003, PR China
[email protected]
Diatoms form an enormous group of photoautotrophic micro-eukaryotes and play a crucial role in
marine ecology. In this study, we evaluated typical genes to determine whether they were effective
at different levels of diatom clustering analysis to assess the potential of these regions for
barcoding taxa. Our test genes included nuclear rRNA genes (the nuclear small-subunit rRNA
gene and the 5.8S rRNA gene+ITS-2), a mitochondrial gene (cytochrome c-oxidase subunit 1,
COI), a chloroplast gene [ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL)]
and the universal plastid amplicon (UPA). Calculated genetic divergence was highest for the
internal transcribed spacer (ITS; 5.8S+ITS-2) (p-distance of 1.569, 85.84 % parsimony-informative sites) and COI (6.084, 82.14 %), followed by the 18S rRNA gene (0.139, 57.69 %),
rbcL (0.120, 42.01 %) and UPA (0.050, 14.97 %), which indicated that ITS and COI were highly
divergent compared with the other tested genes, and that their nucleotide compositions were
variable within the whole group of diatoms. Bayesian inference (BI) analysis showed that the
phylogenetic trees generated from each gene clustered diatoms at different phylogenetic levels.
The 18S rRNA gene was better than the other genes in clustering higher diatom taxa, and both
the 18S rRNA gene and rbcL performed well in clustering some lower taxa. The COI region was
able to barcode species of some genera within the Bacillariophyceae. ITS was a potential marker
for DNA-based taxonomy and DNA barcoding of Thalassiosirales, while species of Cyclotella,
Skeletonema and Stephanodiscus gathered in separate clades, and were paraphyletic with those
of Thalassiosira. Finally, UPA was too conserved to serve as a diatom barcode.
INTRODUCTION
Diatoms are a large group of unicellular photoautotrophic
eukaryotes, containing approximately 200 000 extant species
(Mann & Droop, 1996), and are widely distributed in oceans,
freshwater and soils and on damp surfaces. They are especially
important in marine ecosystems, and are responsible for up
to 45 % of the total oceanic primary production (Yool &
Tyrrell, 2003). Diatom communities can indicate water
quality; therefore, they are often used as bio-indicators for
ecological water quality (Stevenson et al., 2010). Accurate
identification of diatoms to the species level is crucial for
monitoring water quality. Mainly based on the morphological
characteristics of siliceous frustules, the diatoms have been
divided into two groups (centric and pennate) (Simonsen,
1972; Round et al., 1990) or three classes [Coscinodiscophyceae
(centric diatoms), Fragilariophyceae (araphid diatoms) and
Bacillariophyceae (raphid diatoms)] (Round et al., 1990).
However, their small size, ambiguous definition and subtle
differences often make identification to the level of genus or
species difficult. Morphology is not always a reliable indicator of species boundaries and phylogenetic relationships in
some diatom groups (Evans et al., 2007). With the
help of molecular technologies, this limitation may be overcome.
Abbreviations: BI, Bayesian inference; ITS, internal transcribed spacer;
LSU, large-subunit; %PI, percentage of parsimony-informative sites;
SSU, small-subunit.
The GenBank/EMBL/DDBJ accession numbers for the sequences
determined in this study are given in Table S1.
A supplementary table and a supplementary figure are available with the
online Supplementary Material.
000076 © 2015 IUMS
Hebert et al. (2003) first proposed a short DNA sequence
of cytochrome c-oxidase subunit 1 (cox 1, COI) as a
barcode for animal identification. DNA barcodes provide
reliable evidence for species identification, and consist of a
standardized short sequence of DNA that can be generated
easily and characterized for all species under analysis
(Savolainen et al., 2005; Ratnasingham & Hebert, 2007).
An ideal DNA barcode should possess the following: (i)
conserved flanking fragments to facilitate the design of
universal primers; (ii) a sequence length obtainable in a
single amplification; (iii) the power to identify organisms
at the species level (Moniz & Kaczmarska, 2010). Many
markers (such as rRNA genes, mitochondrial and chloroplast genes) have been used to identify species or to assess
phylogenetic relationships, and the use of molecular data
for diatom phylogeny dates back to the report by Medlin
et al. (1993). The small-subunit (SSU) rRNA gene from a
broad range of diatom species has been used to elucidate
diatom phylogeny (Medlin et al., 1996, 1997; Kooistra &
Medlin, 1996) that generally supports classical morphological
taxonomies (Simonsen, 1979; Round et al., 1990; Medlin
et al., 1993, 2000; Sörhannus, 2004, 2007). rRNA genes, COI
and rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase
large subunit) have also been used to discriminate complex
diatom species or to estimate phylogeny (Lundholm et al.,
2006; Evans et al., 2007; Jung et al., 2010; Kooistra et al., 2010;
MacGillivary & Kaczmarska, 2012).
Different portions of the rRNA genes can resolve different
levels of phylogenetic relationship (Alverson, 2008). The
SSU (18S) and large-subunit (LSU; 28S) rRNA genes have
often been used for phylogenetic inference studies at higher
taxon levels (Sörhannus, 2004; Alverson et al., 2006), while
internal transcribed spacer (ITS) sequences are considered
useful in defining intraspecific or population-level differences because of their fast evolution, being less subject to
functional constraints (Behnke et al., 2004; Beszteri et al.,
2005; Godhe et al., 2006; Vanormelingen et al., 2007, 2008).
The relative rates of evolution of rRNA genes and mitochondrial and chloroplast genes vary in different groups
(Bruder & Medlin, 2007); therefore, they are suitable for
reconstructing phylogenetic relationships among taxa of
different phylogenetic ranks (e.g. species, genus, family,
order and class). COI has been used extensively and very
successfully for barcoding various animals and some protist
taxa such as red and brown algae (Saunders, 2005; Kucera
& Saunders, 2008) and 22 species of the genus Sellaphora
(Evans et al., 2007). A segment of the 5′-end of rbcL (rbcL-5′) has been shown to function as a dual-locus barcode
with matK in flowering plants (de Vere et al., 2012);
MacGillivary & Kaczmarska (2011) have examined the
universal application of rbcL-5′ to the Mediophyceae and
Bacillariophyceae, and the 3′-end of rbcL has been examined in species of the Naviculales and Bacillariales; its
universal application has yet to be assessed (Hamsher et al.,
2011). The universal plastid amplicon (UPA) region has
been proposed as a possible algal barcode region because of
its short length and universal primers (Sherwood & Presting,
2007), and it is easily amplified for diatoms, but it is considerably more conserved in diatom species (Hamsher et al.,
2011).
Although several DNA markers show promise for diatom
phylogenetic studies, a universal DNA barcode satisfying
the purposes of phylogenetic study has not been identified.
Generally, the proposed fragment needs at least two pairs
of primers. Thus, the main objectives of this study were to
design universal primers for each potential diatom barcode
gene (18S, ITS, COI and rbcL) and to evaluate whether
these amplified regions are effective at different levels of
diatom clustering analysis, to assess the potential of these
regions for barcoding of some taxa.
METHODS
Design of primers. The sequences of the 18S rRNA gene, the ITS
(5.8S rRNA gene plus ITS-2) and the COI and rbcL genes were
downloaded from NCBI and then classified and aligned for each gene.
Degenerate primers were designed according to their conserved
portions by merging different bases at the same position. The primer
sequences are shown in Table 1.
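The merging of bases at one alignment position into a degenerate code can be sketched as follows. This is a minimal illustration, assuming the standard IUPAC nucleotide ambiguity codes; the function names and the toy alignment are hypothetical and are not taken from the study. The second helper counts how many distinct oligonucleotides a degenerate primer represents.

```python
# Standard IUPAC nucleotide codes: each code maps to the bases it stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> single-letter code.
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def merge_column(column):
    """Return the IUPAC code covering every base seen in one alignment column.

    Gaps and other non-ACGT symbols are ignored (an assumption of this sketch).
    """
    observed = frozenset(b.upper() for b in column if b.upper() in "ACGT")
    if not observed:
        raise ValueError("column contains no unambiguous bases")
    return BASES_TO_CODE[observed]


def consensus_primer(aligned_seqs):
    """Merge equal-length aligned sequences into one degenerate sequence."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(merge_column(col) for col in zip(*aligned_seqs))


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total


# Hypothetical three-sequence alignment of a conserved region.
toy_alignment = ["ATGATAGG", "ATGATCGG", "ATGATTGG"]
toy_primer = consensus_primer(toy_alignment)

# COI-F as listed in Table 1 of the text.
coi_f_degeneracy = degeneracy("ATGATHGGDGCDCCWGAYATG")
```

In this sketch the toy alignment yields a single degenerate position (H, covering A, C and T), and the COI-F primer from Table 1 expands to 3 x 3 x 3 x 2 x 2 = 108 distinct sequences.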
Cultures and morphological identification. Water samples were
obtained from Zhanshan Bay (36° 02′ 58″ N 120° 20′ 41″ E) and
Shilaoren Bay (36° 05′ 20″ N 120° 29′ 21″ E), Qingdao, Shandong
Province, PR China, and the South China Sea. Thirty-seven clonal
cultures were established by single-cell isolation. All cultures were
maintained in F/2 medium (Guillard, 1975) and incubated at 20 °C
under a 12 h : 12 h light/dark cycle at low irradiance (10–25 µmol
photons m−2 s−1).
Pure cultures of diatom strains were identified to the species or genus
level by using morphological characters based on observations under
light and electron microscopy. Details of the species are shown in
Table 2. In total, 37 strains were identified, and they covered the main
groups of diatoms, including radial centric and multipolar centric
diatoms, araphid and raphid pennate diatoms. Among these species,
36 strains belonged to 20 genera, representing 16 families and 14
orders. One exception was a diatom defined as ‘unidentified species in
Thalassiosirales (UST)’, as the morphology of this species was very
similar to that of Thalassiosira, having fasciculate areolation and a six
to eight fultoportulae structure, but no rimoportulae (Round et al.,
1990).
Table 1. Sequences of primers used in this study
Primer | Sequence (5′–3′) | Approx. fragment length (bp)
ITS-F | CSMACAACGATGAAGRRCRCAGC | 520
ITS-R | TCCCDSTTCRBTCGCCVTTACT | 520
18S-F | TCYAAGGAAGGCAGCAGGCGC | 720
18S-R | GTTTCAGHCTTGCGACCATACTCC | 720
COI-F | ATGATHGGDGCDCCWGAYATG | 420
COI-R | CCWCCHCCHGCDGGRTC | 420
rbcL-F | ATGTCTCAATCTGTAWCAGAACGGACTC | 660
rbcL-R | TAARAAWCKYTCTCTCCAACGCA | 660
UPA-F* | GGACAGAAAGACCCTATGAA | 380
UPA-R* | TGAGTGACGGCCTTTCCACT | 380
*Details taken from Hamsher et al. (2011).
Table 2. Strains and species identified in this study and information on collection
Samples from Shilaoren Bay (36° 05′ 20″ N 120° 29′ 21″ E) and Zhanshan Bay (36° 02′ 58″ N 120° 20′ 41″ E) were collected from Qingdao, PR
China; others were from the South China Sea. Collection dates are given in the form ‘YYYYMMDD’.
Species | Family, order | Collection information
Radial centric diatoms
Chaetoceros didymus Ehrenberg | Chaetocerotaceae, Chaetocerotales | Shilaoren Bay (20130313)
Chaetoceros danicus Cleve | Chaetocerotaceae, Chaetocerotales | Zhanshan Bay (20121015)
Chaetoceros sp. | Chaetocerotaceae, Chaetocerotales | Zhanshan Bay (20121015)
Coscinodiscus sp. | Coscinodiscaceae, Coscinodiscales | Shilaoren Bay (20130708)
Guinardia striata (Stolterfoth) Hasle | Rhizosoleniaceae, Rhizosoleniales | Zhanshan Bay (20121015)
Leptocylindrus danicus Cleve | Leptocylindraceae, Leptocylindrales | Zhanshan Bay (20121015)
Skeletonema marinoi 1 (Greville) Cleve | Skeletonemataceae, Thalassiosirales | Shilaoren Bay (20130313)
Skeletonema marinoi 2 (Greville) Cleve | Skeletonemataceae, Thalassiosirales | Zhanshan Bay (20121015)
Skeletonema marinoi 3 (Greville) Cleve | Skeletonemataceae, Thalassiosirales | Zhanshan Bay (20121015)
Skeletonema tropicum Cleve | Skeletonemataceae, Thalassiosirales | Zhanshan Bay (20121015)
Skeletonema marinoi 4 (Greville) Cleve | Skeletonemataceae, Thalassiosirales | Zhanshan Bay (20121015)
Stephanopyxis turris (Grevet & Arnott) Ralfs | Stephanopyxidaceae, Melosirales | Zhanshan Bay (20121015)
Tenuicylindrus sp. | Leptocylindraceae, Leptocylindrales | Zhanshan Bay (20121015)
Thalassiosira curviseriata Takano | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Thalassiosira nordenskioldi Cleve | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Thalassiosira rotula 1 Meunier | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Thalassiosira rotula 2 Meunier | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Thalassiosira sp. 1 | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Thalassiosira sp. 21 | Thalassiosiraceae, Thalassiosirales | 105° 0.202′ E, 6° 0.127′ N (20120912)
Thalassiosira sp. 22 | Thalassiosiraceae, Thalassiosirales | 105° 0.202′ E, 6° 0.127′ N (20120912)
Thalassiosira sp. 23 | Thalassiosiraceae, Thalassiosirales | 105° 0.202′ E, 6° 0.127′ N (20120912)
Thalassiosira sp. 24 | Thalassiosiraceae, Thalassiosirales | 105° 0.202′ E, 6° 0.127′ N (20120912)
Unidentified species in Thalassiosirales (UST) | Thalassiosiraceae, Thalassiosirales | Zhanshan Bay (20121015)
Polar-centric diatoms
Biddulphia sinensis Greville | Biddulphiaceae, Biddulphiales | Zhanshan Bay (20121015)
Eucampia zodiacus Ehrenberg | Hemiaulaceae, Hemiaulales | Shilaoren Bay (20130708)
Lithodesmium variable Takano | Lithodesmiaceae, Lithodesmiales | Zhanshan Bay (20121015)
Raphid pennate diatoms
Amphiprora alata (Ehrenberg) Kützing | Amphipleuraceae, Naviculales | 105° 30.154′ E, 5° 30.108′ N (20120912)
Bacillaria paradoxa Gmelin | Bacillariaceae, Bacillariales | Zhanshan Bay (20121015)
Cylindrotheca closterium (Ehrenberg) Reimann & J. C. Lewin | Bacillariaceae, Bacillariales | Zhanshan Bay (20121015)
Nitzschia longissima (Brébisson) Ralfs | Bacillariaceae, Bacillariales | Shilaoren Bay (20130313)
Psammodictyon panduriforme (W. Gregory) D. G. Mann | Bacillariaceae, Bacillariales | 105° 0.012′ E, 4° 44.867′ N (20120908)
Pleurosigma strigosum W. Smith | Pleurosigmataceae, Naviculales | 105° 30.154′ E, 5° 30.108′ N (20120912)
Pseudo-nitzschia sp. 1 | Bacillariaceae, Bacillariales | 105° 30.154′ E, 5° 30.108′ N (20120912)
Pseudo-nitzschia sp. 2 | Bacillariaceae, Bacillariales | 105° 0.013′ E, 2° 2.122′ N (20120909)
Araphid pennate diatoms
Grammatophora sp. | Striatellaceae, Striatellales | 105° 0.202′ E, 6° 0.127′ N (20120912)
Synedra sp. | Fragilariaceae, Fragilariales | Zhanshan Bay (20121015)
Thalassionema nitzschioides Grunow | Thalassionemataceae, Thalassionemales | Zhanshan Bay (20121015)
DNA extraction, PCR amplification and sequencing. Cells were
harvested by centrifugation (12 000 r.p.m.) and washed twice with
1× TE buffer, and genomic DNA was extracted using the 2× CTAB
method (Rogers & Bendich, 1985) or a Plant Genomic DNA kit
(Tiangen). The target sequences were amplified by PCR using the
degenerate primers detailed in Table 1. The PCR was performed with
20 µl reaction mixture containing 11.4 µl sterile distilled water, 2 µl
10× Taq buffer (Thermo Scientific), 1.2 µl MgCl2 (4 mM), 1.2 µl
dNTP mixture (4 mM), 1 µl each primer (10 pM), 1 U Taq DNA
polymerase (Thermo Scientific) and 2 µl DNA template. PCR
programs were run with initial denaturation at 94 °C for 5 min,
followed by 35 cycles of 94 °C for 40 s, annealing (59 °C for ITS, 18S
rRNA and COI; 55 °C for rbcL and UPA) for 40 s and 72 °C for 45 s,
and a final extension at 72 °C for 10 min. PCR products were
confirmed by 1 % agarose gel electrophoresis. PCR products were
sequenced by direct bidirectional sequencing or cloned into the pMD18-T
vector (Takara).
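The arithmetic behind the reaction mixture and cycling program described under DNA extraction, PCR amplification and sequencing can be checked with a short sketch. The component volumes and step times are those given in the text; the enzyme volume is not stated there, so the remainder of the 20 µl is only inferred, and the master-mix helper with its 10 % overage is a hypothetical convenience rather than part of the published protocol.

```python
# Volumes (in microlitres) per 20 µl reaction, as listed in the text.
# The Taq polymerase is given as 1 U, so its volume is not included here.
COMPONENTS_UL = {
    "sterile distilled water": 11.4,
    "10x Taq buffer": 2.0,
    "MgCl2": 1.2,
    "dNTP mixture": 1.2,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "DNA template": 2.0,
}
REACTION_UL = 20.0


def listed_volume(components=COMPONENTS_UL):
    """Sum of the explicitly listed volumes, rounded to avoid float noise."""
    return round(sum(components.values()), 2)


def master_mix(n_reactions, overage=0.10, components=COMPONENTS_UL):
    """Scale all components except the template for n reactions.

    The overage (extra fraction for pipetting loss) is an assumption.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in components.items()
        if name != "DNA template"
    }


def cycling_seconds(initial=5 * 60, cycles=35, steps=(40, 40, 45), final=10 * 60):
    """Programmed hold time in seconds, excluding temperature ramping."""
    return initial + cycles * sum(steps) + final


remaining_for_enzyme = round(REACTION_UL - listed_volume(), 2)
total_minutes = cycling_seconds() / 60
```

Under these assumptions the listed components add up to 19.8 µl, leaving about 0.2 µl for the polymerase, and the programmed holds sum to 5275 s (roughly 88 min) before ramp times are counted.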
Molecular divergence and base substitution saturation analysis.
All five genes were amplified from the 37 diatom strains and sequenced
(accession numbers are detailed in Table S1, available in the online
Supplementary Material). Other sequences of the 18S rRNA gene (541
sequences), ITS (61 sequences), COI (77 sequences) and rbcL (315
sequences) available from NCBI were downloaded for phylogenetic
analyses. After elimination of apparently erroneous sequences, alignments were created by using CLUSTAL_X version 2.1 (Larkin et al., 2007).
The aligned sequences were trimmed at each end. Genetic distances and
percentages of parsimony-informative sites (%PI) were calculated using
MEGA 6.0 (Tamura et al., 2013). Analysis of base substitution saturation
was performed using the DAMBE software (Xia & Xie, 2001).
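To make the distance measure concrete, the sketch below computes the uncorrected p-distance and the Kimura two-parameter (K2P) distance for one pair of aligned sequences. It is a minimal illustration of the standard formulas, not a reimplementation of MEGA; the function names and the example sequences are hypothetical. An uncorrected p-distance is a proportion and cannot exceed 1, so mean values above 1 are only possible for a corrected distance such as K2P.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions) for two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


def p_distance(seq1, seq2):
    """Proportion of compared sites that differ."""
    sites, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / sites


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    sites, ts, tv = count_differences(seq1, seq2)
    p, q = ts / sites, tv / sites
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # too divergent for the correction to be defined
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


# Hypothetical pair: one transition (A/G) and one transversion (A/C) in 10 sites.
seq_a = "AAAAAAAAAA"
seq_b = "GAAAAAAAAC"
example_counts = count_differences(seq_a, seq_b)
example_p = p_distance(seq_a, seq_b)
example_k2p = k2p_distance(seq_a, seq_b)
```

The same transition and transversion counts are the quantities plotted against distance in a substitution saturation analysis, which is why they are returned separately here.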
Phylogenetic analysis. Phylogenetic analyses for the five genes were
estimated using Bayesian inference (BI) as carried out in MrBayes
version 3.2 (Ronquist & Huelsenbeck, 2003). Gaps were treated as
missing data. The program Modeltest version 3.7 (Posada & Crandall,
1998) was used to explore the evolutionary model (Table 3) of
sequences that best fitted the five datasets using the Akaike
information criterion (AIC) (Luo et al., 2010). In BI analyses, all
Bayesian Markov chain Monte Carlo (MCMC) analyses were run with
four Markov chains (three heated chains, one cold) for 1 000 000
generations. Trees were sampled every 100 generations. We obtained
posterior probability values for the branching patterns in BI trees.
Trees were edited by using FigTree version 1.4 (Rambaut, 2013).
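Model selection by the Akaike information criterion can be illustrated with a few lines of code. This is only a sketch of the criterion itself (AIC = 2k - 2 ln L, lowest value preferred); the log-likelihoods and parameter counts below are invented placeholders, not values produced by Modeltest for these datasets.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion for one fitted model."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Pick the model with the lowest AIC.

    candidates maps a model name to (log-likelihood, number of free parameters).
    Returns the best name and each model's AIC difference from the best.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    deltas = {name: score - scores[best] for name, score in scores.items()}
    return best, deltas


# Hypothetical fits for one alignment (illustrative numbers only).
hypothetical_fits = {
    "JC": (-10500.0, 0),
    "HKY+G": (-10120.0, 5),
    "GTR+I+G": (-10050.0, 10),
}
chosen, delta_aic = best_model(hypothetical_fits)
```

With these placeholder values the most parameter-rich model still wins, because its gain in log-likelihood outweighs the penalty of 2 per extra parameter.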
RESULTS
Amplification, sequencing and BLAST success for the five genes
Amplification and sequencing success are summarized in
Table 4. Amplification of the five genes was very successful.
PCR products of rbcL and UPA were sequenced directly,
while several amplified fragments of the 18S rRNA gene,
ITS and COI were sequenced after being cloned.
BLAST results against NCBI are also shown in Table 4. rbcL
had the highest hit success (32 sequences; 88.9 %); the 18S
rRNA gene was next (29 sequences; 80.6 %), followed by ITS
(26 sequences; 72.2 %). The rbcL sequences of Eucampia
zodiacus and Pleurosigma strigosum matched corresponding
sequences in the families Hemiaulaceae and Pleurosigmataceae, respectively, as there were no rbcL sequences for the
two genera in the database. The limited availability of diatom
COI and UPA sequence data prevented BLAST success for COI
and UPA to the level of genus, while COI had the highest
mismatch ratio (9 sequences; 25 %). The 18S rRNA gene, COI
and rbcL sequences of Chaetoceros danicus were all mismatched to other genera. Interestingly, all sequences of the
UST were matched to sequences of the genus Skeletonema,
except the UPA gene (there was no sequence deposited from
the genus Skeletonema).
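The percentages quoted above follow directly from the counts in Table 4, as the sketch below shows. The counts (success, mismatch, lacking) are those reported in the table; the dictionary layout and function name are illustrative only. Each gene has 36 sequences because the UST strain was excluded from the BLAST analysis.

```python
# (success, mismatch, lacking) counts per gene, as given in Table 4.
BLAST_COUNTS = {
    "18S rRNA gene": (29, 7, 0),
    "ITS": (26, 7, 3),
    "COI": (15, 9, 12),
    "rbcL": (32, 2, 2),
    "UPA": (13, 1, 22),
}


def blast_percentages(counts=BLAST_COUNTS):
    """Convert BLAST outcome counts to percentages of all queried sequences."""
    summary = {}
    for gene, (success, mismatch, lacking) in counts.items():
        total = success + mismatch + lacking
        summary[gene] = {
            "total": total,
            "success_pct": round(100 * success / total, 1),
            "mismatch_pct": round(100 * mismatch / total, 1),
            "lacking_pct": round(100 * lacking / total, 1),
        }
    return summary


blast_summary = blast_percentages()
```

For example, `blast_summary["rbcL"]["success_pct"]` reproduces the 88.9 % hit success stated in the text, and `blast_summary["COI"]["mismatch_pct"]` the 25 % mismatch ratio.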
Molecular divergence for nuclear rRNA genes and
chloroplast and mitochondrial genes
Genetic divergence for the five gene sequences was compared
using pairwise genetic distance (p-distance) scores calculated
by Kimura’s two-parameter model. The mean p-distances of
the 18S rRNA gene, ITS, COI, rbcL and UPA were found to be
0.139 (SD=0.010), 1.569 (SD=0.193), 6.084 (SD=6.427),
0.120 (SD=0.010) and 0.050 (SD=0.007), respectively (Fig.
1). Based on further analysis, %PI for the 18S rRNA gene, ITS,
COI, rbcL and UPA was calculated to be 57.69, 85.84, 82.14,
42.01 and 14.97 %, respectively (Table 4), showing variations
within the sequences for each gene. COI sequences had the
highest mean p-distances, and ITS sequences contained more
PI sites than the 18S rRNA gene, rbcL and UPA. Variation in
the ITS and COI genes was much higher, as revealed by both
p-distance and %PI values compared with those of the other
genes. However, p-distance and %PI values showed that the
variation in UPA was much lower in the 37 diatom strains. In
addition, a limited number of UPA sequences have been
uploaded to NCBI; these sequences were therefore not used in
the following analysis.
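The %PI statistic used above can be sketched in a few lines. A site is parsimony-informative when at least two different bases each occur in at least two sequences. The code below is an illustration of that definition only; it assumes the percentage is taken over all alignment columns, ignores gaps and ambiguous symbols, and uses a hypothetical toy alignment, so it may differ in detail from the MEGA calculation.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two bases each appear in at least two sequences."""
    counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def percent_pi(alignment):
    """Percentage of alignment columns that are parsimony-informative."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = list(zip(*alignment))
    informative = sum(is_parsimony_informative(col) for col in columns)
    return 100 * informative / len(columns)


# Hypothetical alignment: columns 2 and 4 are informative, the last is a singleton.
toy_alignment = ["ACGTA", "ACGTA", "ATGCA", "ATGCT"]
toy_percent_pi = percent_pi(toy_alignment)
```

In the toy alignment two of five columns qualify, giving 40 %; the final column varies but is not informative because the alternative base occurs in only one sequence.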
Base substitution saturation analysis
Base substitution saturation curves are displayed in Fig. S1.
Transversion was higher than transition for the four
datasets, and both transition and transversion of ITS and
COI reached saturation for the diatom phylum, particularly COI. In contrast, the curves for the 18S rRNA gene
and rbcL sequences were linear, suggesting the possibility of
further substitution.
Table 3. Best-fit models and parameters for the four sets of sequences as chosen by Modeltest version 3.7 software
‘Base’ shows the stationary nucleotide frequencies of the GTR rate matrix; ‘Nst=6’ sets the evolutionary model to the GTR substitution model;
‘Rmat’ shows the six substitution rates of the GTR rate matrix; ‘Rates=gamma’ sets the GTR substitution model with gamma-distributed rate
variation across sites; ‘Shape’ shows the shape parameter of the gamma distribution of rate variation; ‘Pinvar’ shows the proportion of invariable
sites.
Sequence | Best-fit model (by AIC) | Base | Nst | Rmat | Rates | Shape | Pinvar
18S rRNA gene | GTR+I+G | 0.2350 0.1863 0.2408 | 6 | 1.0556 3.1005 1.2012 0.8305 3.9344 | gamma | 0.5966 | 0.2146
ITS | GTR+I+G | 0.2102 0.1967 0.2794 | 6 | 1.5907 3.4006 1.7504 0.8234 4.5016 | gamma | 0.7858 | 0.0552
COI | TVM+I+G | 0.2096 0.1893 0.1597 | 6 | 0.7708 5.9318 2.8702 1.5870 5.9318 | gamma | 0.5000 | 0.0609
rbcL | GTR+I+G | 0.3807 0.0441 0.0630 | 6 | 0.3085 3.9419 0.3835 1.0905 4.5695 | gamma | 0.4574 | 0.5028
Table 4. Amplification, sequencing and BLAST success of the five sets of sequences
UST was not included in the BLAST analysis. ‘Mismatch’ shows the number of sequences matched by BLAST to a sequence from a different
morphological genus; ‘lacking’ shows the number of sequences for which sequences of morphologically identified genera were not available from
NCBI. %PI indicates the percentage of parsimony-informative sites.
Sequence | Amplification success | Sequencing success (direct sequence/clone) | BLAST match (to genus) (n): Success | Mismatch | Lacking | %PI
18S rRNA gene | 100 % (37) | 100 % (35/2) | 29 | 7 | 0 | 57.69
ITS | 100 % (37) | 100 % (34/3) | 26 | 7 | 3 | 85.84
COI | 100 % (37) | 100 % (32/5) | 15 | 9 | 12 | 82.14
rbcL | 100 % (37) | 100 % (37/0) | 32 | 2 | 2 | 42.01
UPA | 100 % (37) | 100 % (37/0) | 13 | 1 | 22 | 14.97
Fig. 1. Nucleotide divergence of selected diatom genes, the 18S
rRNA gene, ITS, COI, rbcL and UPA, on the basis of p-distance.
Genetic distances were calculated using Kimura’s two-parameter
model. Bar heights indicate p-distance measured for each gene;
values are shown as means±SE.
Phylogenetic analysis of the four gene markers in
diatoms
In total, 578 sequences of the 18S rRNA gene, 98 sequences
of ITS, 117 sequences of COI and 352 sequences of rbcL
were used for the phylogenetic analysis. Based on 18S
rRNA gene sequences (Fig. 2; 18S tree), the 578 sequences
clustered into three main clades: ‘clade Bacillariophyceae’,
‘clade Fragilariophyceae’ and ‘clade Coscinodiscophyceae
and Mediophyceae’. Three small clades that did not cluster
in these main clades are highlighted in Fig. 2. Small clades
belonging to the Coscinodiscophyceae and Mediophyceae
were mixed and separated from clade Bacillariophyceae
and clade Fragilariophyceae. Some lower taxa clustered
well, such as the genera Aulacoseira, Licmophora and
Skeletonema and the family Amphipleuraceae (Fig. 3). The
rbcL region failed to resolve diatom clustering relationships
in the way that the 18S rRNA gene tree did (Fig. 2; rbcL
tree), but there were exceptions among the lower taxa (Fig.
4); for example, species belonging to genera Gomphonema
and Skeletonema and the families Stephanodiscaceae and
Rhizosoleniaceae clustered into single clades.
In the phylogeny of the ITS (Fig. 2; ITS tree), sequences of
members of Cyclotella, Discostella, Skeletonema, Stephanodiscus and Thalassiosira formed a ‘clade Thalassiosirales’, and
the species of these genera separated well within the clade.
However, other centric and pennate diatoms were clustered
out of order. This may suggest that this ITS portion could be
suitable for species discrimination and phylogenetic analyses
of lower taxa within the Thalassiosirales. Thus, another tree
focusing on the Thalassiosirales (Fig. 5) was reconstructed
with 44 available sequences belonging to four genera,
Cyclotella, Skeletonema, Stephanodiscus and Thalassiosira,
with Phaeodactylum tricornutum, Nitzschia panduriformis,
Stephanopyxis turris and Chaetoceros didymus as the
outgroup. This showed that different species of the
Thalassiosirales clustered very well; species of Cyclotella,
Skeletonema and Stephanodiscus gathered into single clades
and these were paraphyletic groups within the Thalassiosira
clade. In addition, Thalassiosira pseudonana along with
Thalassiosira weissflogii and Thalassiosira guillardii were
clustered within the clade of genus Skeletonema, which
differed from the report of Lee et al. (2013), where they
clustered with Cyclotella. However, Thalassiosira angustelineata, T. minuscula, T. oceanica, T. eccentrica, T. punctigera,
T. aestivalis, T. curviseriata, T. rotula and three other species
of Thalassiosira were sister clades with Stephanodiscus, which
had the same pattern as in the study of Lee et al. (2013).
Similar to the result of Von Dassow et al. (2008), the four
sequences from Thalassiosira oceanica did not cluster into one
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sat, 17 Jun 2017 00:49:15
1373
Coscinodiscophyceae
Mediophyceae
rbcL tree
Pseudostriatella pacifica AB430686 Fragilariophyceae
Eunotogramma laevis AB430668
Cymatosiraceae
Biddulphia
Coscinodiscophyceae
International Journal of Systematic and Evolutionary Microbiology 65
Triceratiaceae
Odontella aurita HQ912551
Mastodiscus radiatus HQ912539
Toxarium
Climacosphenia sp. HQ912549
Synedra fulgens KC309556
0.3
Diadesmis gallica AJ867023
Phaeodactylum tricornutum HQ912556
Mastogloia sp. HQ912632
0.1
Plagiostriata goreensis AB430605
Fragilariophyceae
Fragilariforma virescens HQ912628
Triceratiaceae
Mediophyceae
Coscinodiscophyceae
Dactyliosolen blavyanus KC309564
Chaetoceros peruvianus HQ912514
Mediophyceae
Hybrosera sp. HQ912547
Coscinodiscophyceae
Chaetoceros muellerii HQ912422
Mediophyceae
Hemiaulus sinensis KC309569
Coscinodiscophyceae
Leptocylindrus aporus KC814841
Fragilariophyceae
Cyclophora tenuis JN975255
KJ463436 Amphora obtusa var. crassa Bacillariophyceae
Rhabdonema adriaticum JX401242
Pseudosolenia calcar-avis KC309614
JN975256
JN975257
Cyclophora tenuis HQ912660
Coscinodiscophyceae
Fragilariophyceae
Cyclophora
Terpsinoe musica HQ912546
Hemiaulaceae
Chaetoceros didymus
Mediophyceae
Coscinodiscophyceae
Rhizosoleniaceae
Delphineis sp. JX413561
Rhaphoneis amphiceros HQ912537
Hemiaulus sinensis HQ912624
Dactyliosolen blavyanus KC309491
Fragilariophyceae
Coscinodiscophyceae
Rhizosolenia robusta AY485481
Chaetoceros
Fragilariaceae
Fragilariophyceae
Plagiostriata goreensis AB430684
FragilarifAorma virescens HQ912492
Melosira nummuloides HQ91243
Pseudogomphonema cf. kamschaticum AY571748
AB430661
Melosira Mediophyceae
KC309611
Coscinodiscophyceae
Fragilariophyceae
Urosolenia eriensis HQ912441
Coscinodiscophyceae
Acanthoceras sp. HQ912540
Bacillarophyceae
0.3
Eunotogramma laevis AB430593
0.1
ITS tree
Coscinodiscophyceae
Thalassiosirales
Ditylum brightwellii GQ844259 Mediophyceae
Lithodesmium variable KJ671735
Asterionellopsis Fragilariophyceae
Dimeregramma acutum KF768027
Plagiogramma sp. KF768028
Mediophyceae
Minutocellus polymorphus GQ844268
Thalassiosira punctigera GQ844276
Attheya longicornis GQ844247 Coscinodiscophyceae
0.4
Chaetoceros sp. KJ671728
Eucampia zodiacus KJ671731 Mediophyceae
Coscinodiscophyceae
Biddulphia chinensis KJ671725 Mediophyceae
Leptocylindrus danicus KJ671734 Coscinodiscophyceae
Fragilariopsis
Navicula glaciei HQ337545
Navicula ramosissima HQ337548 Bacillariophyceae
4.8
Navicula salinicola HQ337549
Navicula cryptocephala HQ337543
Thalassionema
Synedra sp. KJ671747
Grammonema striatula GQ844262 Fragilariophyceae
Cyclophora
Grammatophora
Coscinodiscophyceae
Pseudo-nitzschia sp.2 KJ671740 Bacillariophyceae
Chaetoceros didymus KJ671727
Guinardia striata KJ671733
Coscinodiscophyceae
Frustulia vulgaris HF562256
Psammodictyon panduriforme KJ671737
Pseudo-nitzschia sp.1 KJ671739
Eunotia sp. EF164960
Haslea
Navicula
Frustulia
Sellaphora
Pinnularia cf. gibba EF164938
Fallacia sp. AB618071
Biddulphiopsis membranacea HQ912502
Mediophyceae
Isthmia enervis HQ912548
Biddulphiopsis titiana HQ912505
Pseudohimantidium pacificum AB430685
Fragilariophyceae
Podocystis spathulata HQ912525
Mediophyceae
Trigonium formosum HQ912512
Coscinodiscophyceae
Chrysanthemodiscus sp. HQ912506
Thalassiosira
Cyclotella
Skeletonema
Thalassiosirales
Thalassiosira
Coscinodiscophyceae
Stephanodiscus
Discostella GQ148713
Chaetoceros
Lithodesmium variabile KJ671771
Mediophyceae
Eucampia zodiacus KJ671767
Guinardia striata KJ671769 Coscinodiscophyceae
Grammatophora sp. KJ671768
Fragilariophyceae
Thalassionema nitzschioides KJ671784
Bacillariophyceae
Cylindrotheca closterium KJ671766
Chaetoceros danicus KJ671762 Coscinodiscophyceae
Pseudo-nitzschia sp. KJ671775
Bacillaria paxillifer KJ671760
Bacillariophyceae
Navicula phyllepta DQ235783
0.4
Nitzschia microcephala KC759159
Haslea sp. HE663060
0.4
Odontella aurita EU861394 Mediophyceae
0.4
Coscinodiscus sp. KJ671765
Coscinodiscophyceae
0.4
Stephanopyxis turris KJ671782
Synedra sp. KJ671783 Fragilariophyceae
Psammodictyon panduriforme KJ671773
Nitzschia longissima KJ671772
Bacillariophyceae
Pseudo-nitzschia
Bacillariophyceae
KC017449
Amphiprora paludosa GQ844245
Cylindrotheca fusiformis GQ844253
0.4
Phaeodactylum tricornutum GQ844269
0.4
Cylindrotheca closterium GQ844252
Cylindrotheca closterium KJ671730
Nitzschia longissima KJ671736
0.8
Pseudo-nitzschia
Haslea crucigera HF563534
Amphiprora alata KJ671723
Pleurosigma strigosum KJ671738
Bacillariophyceae
Lampriscus
0.3
0.3
COI tree
Fragilariophyceae
Plagiogramma staurophorum HQ912520
Plagiogrammaceae Mediophyceae
Dimeregramma minor var. nanum AB430675
Eunotia bilunaris DQ514763
Bacillariophyceae
Eunotia
Fragilariaceae
Fragilariophyceae
Mediophyceae
Dimeregramma cf. dubium JX413564
Stellarima microtrias HQ912614
0.2
Fragilariophyceae
Ardissonea
Odontella
Mediophyceae
Fig. 2. Phylogenetic trees of the 18S rRNA gene, the ITS and the rbcL and COI genes. Three clades that did not cluster into
their traditional classes are shaded in dark grey in the 18S rRNA gene tree. Species belonging to the class Mediophyceae were
shaded in light grey. Species of the Thalassiosirales are shaded in dark grey in the ITS tree; while five genera that clustered well,
including all of the analysed species, are shaded in dark grey in the COI tree.
Fig. 3. Four clades that clustered well in the 18S rRNA gene tree.
(a) Clade Aulacoseira, which clustered in the clade of
‘Coscinodiscales+Melosirales+Paraliales’. (b) Clade Licmophora,
which was located in the ‘Fragilariophyceae’ clade. (c) Clade
Skeletonema, which clustered in the ‘Thalassiosirales+Planktoniella
sol’ clade. (d) Clade Amphipleuraceae, which clustered in the clade of
‘Cymbellales+Achnanthales+Lyrellales+Naviculales’.
clade, and T. oceanica sequences deposited as EF134954 and
EF134955, and as EF208794 and EF208795, formed separate
groups.
With regard to phylogenetic analyses of COI (Fig. 2; COI
tree), the COI region performed poorly in phylogenetic
analysis of the whole diatom group. However, it showed
some potential for clustering some pennate species, such
as those belonging to the genera Frustulia, Sellaphora,
Cyclophora, Fragilariopsis and Asterionellopsis. To explore
this further, we focused on the Bacillariophyceae and
reconstructed a tree for the tested COI region by using 49
available sequences from pennate members of the
Bacillariales, Naviculales, Fragilariales and Thalassionemales,
with Thalassiosira curviseriata and Chaetoceros didymus as the
outgroup (Fig. 6). The tree clustered most of the species of the
Bacillariales and Naviculales, except Bacillaria paradoxa and
Pseudo-nitzschia sp. 2. Species of the Bacillariales clustered
into one clade, which had a sister relationship with the
Naviculales clades. Species of the same genus (such as
Frustulia, Haslea, Sellaphora and Navicula) clustered
together. Fistulifera, Haslea and Navicula, which all belong to
the family Naviculaceae, clustered into one clade.
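As an illustrative sketch (not part of the original analysis), the statement that species of one genus 'clustered into one clade' can be checked programmatically by asking whether some node of a rooted tree has exactly that set of tips. The nested-tuple tree encoding, the function names and the toy topology below are all assumptions made for illustration; the topology is invented and does not reproduce Fig. 6.

```python
from typing import Union

# A tree is either a tip label (str) or a tuple of subtrees.
Tree = Union[str, tuple]


def tips(tree: Tree) -> frozenset:
    """Return the set of tip labels found below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= tips(child)
    return result


def is_monophyletic(tree: Tree, group) -> bool:
    """True if some node's tip set equals `group` exactly."""
    target = frozenset(group)
    if tips(tree) == target:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, target) for child in tree)


# Toy rooted topology, invented for illustration only (hypothetical labels).
toy_tree = (
    "outgroup",
    (
        ("Frustulia_a", "Frustulia_b"),
        (("Navicula_a", "Navicula_b"), "Haslea_a"),
    ),
)

frustulia = {"Frustulia_a", "Frustulia_b"}
mixed = {"Frustulia_a", "Navicula_a"}

print(is_monophyletic(toy_tree, frustulia))  # True
print(is_monophyletic(toy_tree, mixed))      # False
```

In practice the tree would be read from the consensus tree file produced by the BI analysis rather than typed by hand; the check itself stays the same.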
DISCUSSION
Primer universality
Kress & Erickson (2008) proposed that several factors must be
considered and weighed when selecting a DNA barcode:
universal PCR amplification, range of taxonomic diversity,
power of species differentiation, and bioinformatic analysis
and application. Generally, at least two pairs of primers are
used for a single gene marker, especially for ITS, COI and
rbcL (Evans et al., 2007; Moniz & Kaczmarska, 2010; Trobajo
et al., 2010; Hamsher et al., 2011).
Universal primers were designed from conserved regions of
the alignments to compare the four candidate gene markers
(18S rRNA gene, ITS, COI and rbcL), and the ‘universality’
of our primers was tested on species of the Bacillariophyta.
They were used successfully in our laboratory to amplify
these regions from a variety of taxa, including 37 strains
representative of the Bacillariales, Biddulphiales, Chaetocerotales,
Coscinodiscales, Leptocylindrales, Lithodesmiales,
Hemiaulales, Melosirales, Naviculales, Rhizosoleniales,
Striatellales, Thalassionemales and Thalassiosirales. rbcL
and UPA were amplified and sequenced easily for all 37
strains, representing about 20 genera, while the 18S rRNA
gene, ITS and COI were amplified and sequenced successfully
in most cases; several had to be cloned before sequencing,
as these regions were less conserved than rbcL and UPA
(Table 4). Because only a limited number of strains were
tested, the primers for the 18S rRNA gene, ITS, COI and rbcL
cannot yet be regarded as universal, and further testing on
additional diatoms should be carried out. However, we expect
amplification and sequencing success with these primers to
be similar in other diatom lineages.
Fig. 4. Four clades that clustered well in the rbcL tree. (a) Clade Gomphonema, which was in the clade of
‘Achnanthales+Bacillariales+Cymbellales+Naviculales+Rhopalodiales+Surirellales+Entomoneis+Lyrella+Amphora’. (b)
Clade Skeletonema, which clustered in the ‘Thalassiosirophycidae’ clade. (c) Clade Stephanodiscaceae, which also clustered
in the clade of ‘Thalassiosirophycidae’. (d) Clade Rhizosoleniaceae, a single Coscinodiscophyceae clade in the rbcL tree.
Nuclear rRNA genes and chloroplast and
mitochondrial genes of diatoms
As a large group, diatoms display a high degree of variation
in both morphology and DNA sequence, including the
nuclear rRNA genes and chloroplast and mitochondrial
genes. An increase in the number of diatom sequences in
the databases and the development of techniques for
molecular analyses using various genes have gradually
expanded our knowledge of diatom phylogeny (Theriot
et al., 2010). Different markers can resolve different levels
of phylogenetic relationships and their molecular divergences.
However, few comparisons of gene markers have been
undertaken within the phylum. From
the calculation of genetic divergence, we found that ITS
and COI were highly divergent compared with the other
tested genes. In addition, analysis of parsimony-informative
(PI) sites showed variation within the nucleotide sequence
of each gene: the highest PI value was recorded for ITS (85.84 %),
followed by COI (82.14 %), and the lowest values were
found for rbcL (42.01 %) and UPA (14.97 %). The PI values
suggest that the ITS has evolved around 2.00 and 5.72
times faster than the rbcL and UPA genes, respectively
(Table 4). ITS and COI nucleotide compositions are therefore
very variable within the whole group of diatoms,
probably because of their hypervariable domains (Moniz &
Kaczmarska, 2009, 2010). Base substitution saturation
curves also confirmed that ITS and COI were too variable
to be used in phylogenetic analysis of the whole diatom
group. The UPA gene was the most conserved of the
tested genes, suggesting that it may be unsuitable
as a marker for diatom barcoding.
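The PI comparison above can be made concrete with a minimal sketch. It uses the standard definition of a parsimony-informative site (at least two character states, each present in at least two sequences) and then takes simple ratios of the PI percentages quoted in the text. The function names and the toy alignment are assumptions for illustration only, not data or code from this study. Note that the plain ratios come out at about 2.04 and 5.73, close to but not identical with the 2.00 and 5.72 quoted above; the text does not state exactly how those figures were derived, and a ratio of PI percentages is only a rough proxy for relative evolutionary rate.

```python
from collections import Counter


def is_parsimony_informative(column: str) -> bool:
    """A site is parsimony-informative if at least two different
    nucleotides each occur in at least two sequences (gaps/N ignored)."""
    counts = Counter(c for c in column.upper() if c in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def pi_fraction(alignment) -> float:
    """Fraction of alignment columns that are parsimony-informative."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    columns = ("".join(seq[i] for seq in alignment) for i in range(length))
    return sum(is_parsimony_informative(col) for col in columns) / length


# Toy alignment, invented for illustration (not data from this study).
toy_alignment = ["ACGTAC", "ACGTTC", "ATGAAC", "ATGATC"]
print(pi_fraction(toy_alignment))  # 0.5

# PI percentages as reported in the text.
reported_pi = {"ITS": 85.84, "COI": 82.14, "rbcL": 42.01, "UPA": 14.97}
ratios = {
    gene: round(reported_pi["ITS"] / reported_pi[gene], 2)
    for gene in ("rbcL", "UPA")
}
print(ratios)  # {'rbcL': 2.04, 'UPA': 5.73}
```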
Effectiveness of selected genes for clustering
diatoms
When the four unrooted trees were compared, they
displayed potential for resolving clusters of different
taxa. Although the 18S rRNA gene performed better than
the other three in clustering higher taxa, it did not
recover the phylogenetic relationships of diatoms described
by Medlin & Kaczmarska (2004), who
proposed two new subdivisions (Coscinodiscophytina
and Bacillariophytina) and a new class (the Mediophyceae,
for the bipolar centrics and the Thalassiosirales). The
Coscinodiscophytina contained the class Coscinodiscophyceae
and the Bacillariophytina contained the classes
Mediophyceae and Bacillariophyceae. However, Adl et al.
(2005) treated both the Coscinodiscophyceae and Mediophyceae
as paraphyletic taxa, consistent with our observations on
the paraphyletic relationship of the Coscinodiscophyceae and
Mediophyceae.
Fig. 5. ITS phylogeny of 44 species of the Thalassiosirales, with Phaeodactylum tricornutum, Chaetoceros didymus, Nitzschia
panduriformis and Stephanopyxis turris as the outgroup in a BI analysis. The best-fit model was GTR+G, with parameters set
as follows: Base=(0.2256, 0.2252, 0.2720), Nst=6, Rmat=(1.0999, 2.9106, 1.0786, 0.4572, 4.1452), Rates=gamma,
Shape=0.6062, Pinvar=0. Node labels are posterior probabilities of the BI consensus tree.
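To show what the model settings in the Fig. 5 legend encode, the following sketch builds a GTR instantaneous rate matrix from the reported base frequencies and relative rates. It rests on assumptions that the legend does not spell out: that Base lists the frequencies of A, C and G (with T implied as the remainder) and that Rmat lists the AC, AG, AT, CG and CT rates relative to GT=1, which is the usual ordering for this style of model output. The function name is hypothetical, and the gamma shape parameter (among-site rate variation) is not used here.

```python
BASES = "ACGT"


def gtr_rate_matrix(base_acg, rmat):
    """Build a GTR rate matrix Q (rows sum to zero), scaled so that the
    expected substitution rate at equilibrium equals 1."""
    # Assumed convention: frequencies given for A, C, G; T is the remainder.
    pi = list(base_acg) + [1.0 - sum(base_acg)]
    r_ac, r_ag, r_at, r_cg, r_ct = rmat
    r_gt = 1.0  # assumed reference rate
    exch = {
        ("A", "C"): r_ac, ("A", "G"): r_ag, ("A", "T"): r_at,
        ("C", "G"): r_cg, ("C", "T"): r_ct, ("G", "T"): r_gt,
    }
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                # Exchangeabilities are symmetric: look up either order.
                rate = exch[(x, y)] if (x, y) in exch else exch[(y, x)]
                q[i][j] = rate * pi[j]
        q[i][i] = -sum(q[i])  # diagonal makes the row sum to zero
    mean_rate = -sum(pi[i] * q[i][i] for i in range(4))
    q = [[value / mean_rate for value in row] for row in q]
    return pi, q


# Values copied from the Fig. 5 legend (GTR+G model).
FIG5_BASE = (0.2256, 0.2252, 0.2720)
FIG5_RMAT = (1.0999, 2.9106, 1.0786, 0.4572, 4.1452)

pi, q = gtr_rate_matrix(FIG5_BASE, FIG5_RMAT)
for base, row in zip(BASES, q):
    print(base, [round(value, 4) for value in row])
```

Under these assumptions the largest relative rates (AG and CT) are the two transitions, so the matrix reflects the expected excess of transitions over transversions.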
For rbcL, MacGillivary & Kaczmarska (2011) proposed a
540 bp fragment of rbcL as a better barcode to distinguish
the Mediophyceae and Bacillariophyceae, without mentioning
its capacity for phylogenetic analysis. rbcL as tested
in this study was more conserved than the 18S rRNA gene
and failed to cluster the entire group of diatoms; however,
both genes performed well in clustering some
lower taxa.
ITS, the rRNA gene internal transcribed spacer, generally
refers to the highly variable ITS-1 and ITS-2 regions along
with the conserved 5.8S rRNA gene. The ITS region has
already been used as a marker in some protists (e.g.
dinoflagellates; Stern et al., 2012), certain animal groups
(Moritz et al., 2001) and fungi (Seifert, 2009; Schoch
et al., 2012). Moniz & Kaczmarska (2009, 2010) proposed
the 5.8S+ITS-2 fragment as a tool to screen the most
species-rich classes of diatoms, including the mainly marine
taxa of the Mediophyceae and Bacillariophyceae. We found
that the tested region of 5.8S+ITS-2 (referred to as ITS in
this study) had significant potential in clustering species
within the Thalassiosirales. Based on analysis of the two
ITS phylogenetic trees, we propose that the ITS might be a
suitable marker for DNA-based taxonomy and DNA
barcoding of the Thalassiosirales.
COI is a mitochondrial gene that encodes cytochrome c-oxidase
subunit 1, and previous research has shown that
the 5′ end of COI (COI-5′) can distinguish closely
related lineages, such as Frustulia, the Nitzschia palea
species complex and the Sellaphora species complex (Evans
et al., 2007; Trobajo et al., 2010; Urbánková & Veselá,
2013). However, COI analysis of higher taxa has not been
reported. In addition, no suitable primers are available for
amplifying COI from a wide range of diatom taxa. In this
study, COI was amplified by only one pair of primers, and
showed potential only in confirming the clustering of lower
taxa, for instance some pennates (Fig. 2, COI tree; Fig. 6),
as a result of its high variation. Thus, the tested COI region
was not suitable for clustering analysis of the whole diatom
phylum, but was useful for clustering or barcoding species
of some genera within the Bacillariophyceae.
Fig. 6. COI phylogenetic tree of 49 pennate species, with Chaetoceros danicus and Thalassiosira curviseriata as the outgroup,
based on a BI analysis. The best-fit model was TIM+I+G and parameters were set as follows: Base=(0.3036, 0.1200,
0.1138), Nst=6, Rmat=(1.0000, 3.1324, 0.4760, 0.4760, 8.3490), Rates=gamma, Shape=0.3227, Pinvar=0.2077. Node
labels are posterior probabilities of the BI consensus tree.
ACKNOWLEDGEMENTS
This work was supported by the National Natural Science Foundation
of China (41176098) and the Shandong Provincial Natural Science
Foundation, China (ZR2011DZ002).
REFERENCES
Adl, S. M., Simpson, A. G. B., Farmer, M. A., Andersen, R. A., Anderson, O. R., Barta, J. R., Bowser, S. S., Brugerolle, G. U. Y., Fensome, R. A. & other authors (2005). The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J Eukaryot Microbiol 52, 399–451.
Alverson, A. J. (2008). Molecular systematics and the diatom species. Protist 159, 339–353.
Alverson, A. J., Cannone, J. J., Gutell, R. R. & Theriot, E. C. (2006). The evolution of elongate shape in diatoms. J Phycol 42, 655–668.
Behnke, A., Friedl, T., Chepurnov, V. A. & Mann, D. G. (2004). Reproductive compatibility and rDNA sequence analyses in the Sellaphora pupula species complex (Bacillariophyta). J Phycol 40, 193–208.
Beszteri, B., Ács, É. & Medlin, L. K. (2005). Ribosomal DNA sequence variation among sympatric strains of the Cyclotella meneghiniana complex (Bacillariophyceae) reveals cryptic diversity. Protist 156, 317–333.
Bruder, K. & Medlin, L. K. (2007). Molecular assessment of phylogenetic relationships in selected species/genera in the naviculoid diatoms (Bacillariophyta). I. The genus Placoneis. Nova Hedwigia 85, 331–352.
de Vere, N., Rich, T. C. G., Ford, C. R., Trinder, S. A., Long, C., Moore, C. W., Satterthwaite, D., Davies, H., Allainguillaume, J. & other authors (2012). DNA barcoding the native flowering plants and conifers of Wales. PLoS ONE 7, e37945.
Evans, K. M., Wortley, A. H. & Mann, D. G. (2007). An assessment of potential diatom “barcode” genes (cox1, rbcL, 18S and ITS rDNA) and their effectiveness in determining relationships in Sellaphora (Bacillariophyta). Protist 158, 349–364.
Godhe, A., McQuoid, M. R., Karunasagar, I., Karunasagar, I. & Rehnstam-Holm, A. S. (2006). Comparison of three common molecular tools for distinguishing among geographically separated clones of the diatom Skeletonema marinoi Sarno et Zingone (Bacillariophyceae). J Phycol 42, 280–291.
Guillard, R. R. L. (1975). Culture of phytoplankton for feeding marine invertebrates. In Culture of Marine Invertebrate Animals, pp. 29–60. Edited by W. L. Smith & M. H. Chanley. New York: Springer.
Hamsher, S. E., Evans, K. M., Mann, D. G., Poulíčková, A. & Saunders, G. W. (2011). Barcoding diatoms: exploring alternatives to COI-5P. Protist 162, 405–422.
Hebert, P. D., Cywinska, A., Ball, S. L. & deWaard, J. R. (2003). Biological identifications through DNA barcodes. Proc Biol Sci 270, 313–321.
Jung, S. W., Han, M.-S. & Ki, J.-S. (2010). Molecular genetic divergence of the centric diatom Cyclotella and Discostella (Bacillariophyceae) revealed by nuclear ribosomal DNA comparisons. J Appl Phycol 22, 319–329.
Kooistra, W. H. & Medlin, L. K. (1996). Evolution of the diatoms (Bacillariophyta). IV. A reconstruction of their age from small subunit rRNA coding regions and the fossil record. Mol Phylogenet Evol 6, 391–407.
Kooistra, W. H., Sarno, D., Hernández-Becerril, D. U., Assmy, P., Di Prisco, C. & Montresor, M. (2010). Comparative molecular and morphological phylogenetic analyses of taxa in the Chaetocerotaceae (Bacillariophyta). Phycologia 49, 471–500.
Kress, W. J. & Erickson, D. L. (2008). DNA barcodes: genes, genomics, and bioinformatics. Proc Natl Acad Sci U S A 105, 2761–2762.
Kucera, H. & Saunders, G. W. (2008). Assigning morphological variants of Fucus (Fucales, Phaeophyceae) in Canadian waters to recognized species using DNA barcoding. Botany 86, 1065–1079.
Larkin, M. A., Blackshields, G., Brown, N. P., Chenna, R., McGettigan, P. A., McWilliam, H., Valentin, F., Wallace, I. M., Wilm, A. & other authors (2007). Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Lee, M. A., Faria, D. G., Han, M. S., Lee, J. & Ki, J. S. (2013). Evaluation of nuclear ribosomal RNA and chloroplast gene markers for the DNA taxonomy of centric diatoms. Biochem Syst Ecol 50, 163–174.
Lundholm, N., Moestrup, Ø., Kotaki, Y., Hoef-Emden, K., Scholin, C. & Miller, P. (2006). Inter- and intraspecific variation of the Pseudo-nitzschia delicatissima complex (Bacillariophyceae) illustrated by RNA probes, morphological data and phylogenetic analyses. J Phycol 42, 464–481.
Luo, A., Qiao, H., Zhang, Y., Shi, W., Ho, S. Y., Xu, W., Zhang, A. & Zhu, C. (2010). Performance of criteria for selecting evolutionary models in phylogenetics: a comprehensive study based on simulated datasets. BMC Evol Biol 10, 242.
MacGillivary, M. L. & Kaczmarska, I. (2011). Survey of the efficacy of a short fragment of the rbcL gene as a supplemental DNA barcode for diatoms. J Eukaryot Microbiol 58, 529–536.
MacGillivary, M. L. & Kaczmarska, I. (2012). Genetic differentiation within the Paralia longispina (Bacillariophyta) species complex. Botany 90, 205–222.
Mann, D. & Droop, S. (1996). Biodiversity, biogeography and conservation of diatoms. Hydrobiologia 336, 19–32.
Medlin, L. K. & Kaczmarska, I. (2004). Evolution of the diatoms: V. Morphological and cytological support for the major clades and a taxonomic revision. Phycologia 43, 245–270.
Medlin, L. K., Williams, D. & Sims, P. (1993). The evolution of the diatoms (Bacillariophyta). I. Origin of the group and assessment of the monophyly of its major divisions. Eur J Phycol 28, 261–275.
Medlin, L. K., Kooistra, W. H., Gersonde, R. & Wellbrock, U. (1996). Evolution of the diatoms (Bacillariophyta). II. Nuclear-encoded small-subunit rRNA sequence comparisons confirm a paraphyletic origin for the centric diatoms. Mol Biol Evol 13, 67–75.
Medlin, L. K., Kooistra, W., Gersonde, R., Sims, P. & Wellbrock, U. (1997). Is the origin of diatoms related to the end-Permian mass extinction? Nova Hedwigia 65, 1–11.
Medlin, L. K., Kooistra, W. H. C. F. & Schmid, A. M.-M. (2000). A review of the evolution of the diatoms – a total approach using molecules, morphology and geology. In The Origin and Early Evolution of the Diatoms: Fossil, Molecular and Biogeographical Approaches, pp. 13–35. Edited by A. Witkowski & J. Siemińska. Kraków: Władysław Szafer Institute of Botany, Polish Academy of Sciences.
Moniz, M. B. & Kaczmarska, I. (2009). Barcoding diatoms: is there a good marker? Mol Ecol Resour 9 (Suppl. s1), 65–74.
Moniz, M. B. & Kaczmarska, I. (2010). Barcoding of diatoms: nuclear encoded ITS revisited. Protist 161, 7–34.
Moritz, G., Paulsen, M., Delker, C., Picl, S. & Kumm, S. (2001). Identification of thrips using ITS-RFLP analysis. In Thrips and Tospoviruses: Proceedings of the 7th International Symposium on Thysanoptera, pp. 365–367. Edited by R. Marullo & L. Mound. Calabria, Italy, 2–7 July 2001.
Posada, D. & Crandall, K. A. (1998). MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Rambaut, A. (2013). FigTree. Available from http://tree.bio.ed.ac.uk/software/figtree/
Ratnasingham, S. & Hebert, P. D. N. (2007). BOLD: the Barcode of Life Data System (http://www.barcodinglife.org). Mol Ecol Notes 7, 355–364.
Rogers, S. O. & Bendich, A. J. (1985). Extraction of DNA from milligram amounts of fresh, herbarium and mummified plant tissues. Plant Mol Biol 5, 69–76.
Ronquist, F. & Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Round, F. E., Crawford, R. M. & Mann, D. G. (1990). The Diatoms. Biology and Morphology of the Genera. Cambridge: Cambridge University Press.
Saunders, G. W. (2005). Applying DNA barcoding to red macroalgae: a preliminary appraisal holds promise for future applications. Philos Trans R Soc Lond B Biol Sci 360, 1879–1888.
Savolainen, V., Cowan, R. S., Vogler, A. P., Roderick, G. K. & Lane, R. (2005). Towards writing the encyclopedia of life: an introduction to DNA barcoding. Philos Trans R Soc Lond B Biol Sci 360, 1805–1811.
Schoch, C. L., Seifert, K. A., Huhndorf, S., Robert, V., Spouge, J. L., Levesque, C. A., Chen, W., Bolchacova, E., Voigt, K. & other authors (2012). Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc Natl Acad Sci U S A 109, 6241–6246.
Seifert, K. A. (2009). Progress towards DNA barcoding of fungi. Mol Ecol Resour 9 (Suppl. s1), 83–89.
Sherwood, A. R. & Presting, G. G. (2007). Universal primers amplify a 23S rDNA plastid marker in eukaryotic algae and cyanobacteria. J Phycol 43, 605–608.
Simonsen, R. (1972). Ideas for a more natural system of the centric diatoms. Nova Hedwigia Beih 39, 37–54.
Simonsen, R. (1979). The diatom system: ideas on phylogeny. Bacillaria 2, 9–71.
Sörhannus, U. (2004). Diatom phylogenetics inferred based on direct optimization of nuclear-encoded SSU rRNA sequences. Cladistics 20, 487–497.
Sörhannus, U. (2007). A nuclear-encoded small-subunit ribosomal RNA timescale for diatom evolution. Mar Micropaleontol 65, 1–12.
Stern, R. F., Andersen, R. A., Jameson, I., Küpper, F. C., Coffroth, M.-A., Vaulot, D., Le Gall, F., Véron, B., Brand, J. J. & other authors (2012). Evaluating the ribosomal internal transcribed spacer (ITS) as a candidate dinoflagellate barcode marker. PLoS ONE 7, e42780.
Stevenson, R. J., Pan, Y. & Van Dam, H. (2010). Assessing environmental conditions in rivers and streams with diatoms. In The Diatoms: Applications for the Environmental and Earth Sciences, 2nd edn, pp. 57–85. Edited by J. P. Smol & E. F. Stoermer. Cambridge: Cambridge University Press.
Tamura, K., Stecher, G., Peterson, D., Filipski, A. & Kumar, S. (2013). MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30, 2725–2729.
Theriot, E. C., Ashworth, M., Ruck, E., Nakov, T. & Jansen, R. K. (2010). A preliminary multigene phylogeny of the diatoms (Bacillariophyta): challenges for future research. Plant Ecol Evol 143, 278–296.
Trobajo, R., Mann, D. G., Clavero, E., Evans, K. M., Vanormelingen, P. & McGregor, R. C. (2010). The use of partial cox1, rbcL and LSU rDNA sequences for phylogenetics and species identification within the Nitzschia palea species complex (Bacillariophyceae). Eur J Phycol 45, 413–425.
Urbánková, P. & Veselá, J. (2013). DNA-barcoding: a case study in the diatom genus Frustulia (Bacillariophyceae). Nova Hedwigia Beih 142, 147–162.
Vanormelingen, P., Hegewald, E., Braband, A., Kitschke, M., Friedl, T., Sabbe, K. & Vyverman, W. (2007). The systematics of a small spineless Desmodesmus species, D. costato-granulatus (Sphaeropleales, Chlorophyceae), based on ITS2 rDNA sequence analyses and cell wall morphology. J Phycol 43, 378–396.
Vanormelingen, P., Cottenie, K., Michels, E., Muylaert, K., Vyverman, W. & other authors (2008). The relative importance of dispersal and local processes in structuring phytoplankton communities in a set of highly interconnected ponds. Freshw Biol 53, 2170–2183.
Von Dassow, P., Petersen, T. W., Chepurnov, V. A. & Armbrust, E. V. (2008). Inter- and intraspecific relationships between nuclear DNA content and cell size in selected members of the centric diatom genus Thalassiosira (Bacillariophyceae). J Phycol 44, 335–349.
Xia, X. & Xie, Z. (2001). DAMBE: software package for data analysis in molecular biology and evolution. J Hered 92, 371–373.
Yool, A. & Tyrrell, T. (2003). Role of diatoms in regulating the ocean’s silicon cycle. Global Biogeochem Cycles 17, 1103.