The Chloroplast Genomes of the Green Algae Pyramimonas, Monomastix, and
Pycnococcus Shed New Light on the Evolutionary History of Prasinophytes and
the Origin of the Secondary Chloroplasts of Euglenids
Monique Turmel,* Marie-Christine Gagnon,* Charley J. O’Kelly, Christian Otis,* and
Claude Lemieux*
*Département de Biochimie et de Microbiologie, Université Laval, Québec (Québec), Canada; and Botany Department, University
of Hawaii
Because they represent the earliest divergences of the Chlorophyta and include the smallest known eukaryotes (e.g., the
coccoid Ostreococcus), the morphologically diverse unicellular green algae making up the Prasinophyceae are central to
our understanding of the evolutionary patterns that accompanied the radiation of chlorophytes and the reduction of cell
size in some lineages. Seven prasinophyte lineages, four of which exhibit a coccoid cell organization (neither flagella nor
scales), were uncovered from analysis of nuclear-encoded 18S rDNA data; however, their order of divergence remains
unknown. In this study, the chloroplast genome sequences of the scaly quadriflagellate Pyramimonas parkeae (clade I),
the coccoid Pycnococcus provasolii (clade V), and the scaly uniflagellate Monomastix (unknown affiliation) were
determined, annotated, and compared with those previously reported for green algae/land plants, including two
prasinophytes (Nephroselmis olivacea, clade III and Ostreococcus tauri, clade II). The chlorarachniophyte Bigelowiella
natans and the euglenid Euglena gracilis, whose chloroplasts originate presumably from distinct green algal
endosymbionts, were also included in our comparisons. The three newly sequenced prasinophyte genomes differ
considerably from one another and from their homologs in overall structure, gene content, and gene order, with the
80,211-bp Pycnococcus and 114,528-bp Monomastix genomes (98 and 94 conserved genes, respectively) resembling the
71,666-bp Ostreococcus genome (88 genes) in featuring a significantly reduced gene content. The 101,605-bp
Pyramimonas genome (110 genes) features two conserved genes (rpl22 and ycf65) and ancestral gene linkages
previously unrecognized in chlorophytes as well as a DNA primase gene putatively acquired from a virus. The
Pyramimonas and Euglena cpDNAs revealed uniquely shared derived gene clusters. Besides providing unequivocal
evidence that the green algal ancestor of the euglenid chloroplasts belonged to the Pyramimonadales, phylogenetic
analyses of concatenated chloroplast genes and proteins elucidated the position of Monomastix and showed that the
Mamiellales, a clade comprising Ostreococcus and Monomastix, are sister to the Pyramimonadales + Euglena clade. Our
results also revealed that major reduction in gene content and restructuring of the chloroplast genome occurred in
conjunction with important changes in cell organization in at least two independent prasinophyte lineages, the
Mamiellales and the Pycnococcaceae.
Introduction
The green plants (Viridiplantae) are divided between
two major lineages: the Chlorophyta, containing the bulk
of the extant green algae, and the Streptophyta, containing
the green algae belonging to the Charophyceae sensu
Mattox and Stewart (1984) and all land plants (Lewis
and McCourt 2004). It is thought that the first green plants
were unicellular green algae bearing nonmineralized organic scales on their cell body and/or their flagella (Mattox
and Stewart 1984). This hypothesis was put forward when it
was recognized that flagellated reproductive cells (zoospores, gametes) of some taxa in both the Chlorophyta
and Streptophyta are covered by a layer of square-shaped
scales, which also occur as an underlayer in many prasinophytes. Free-living scaly flagellates have been ascribed
mainly to the Prasinophyceae, a nonmonophyletic class
representing the earliest divergences of the Chlorophyta
(Steinkotter et al. 1994; Nakayama et al. 1998; Fawley
et al. 2000; Guillou et al. 2004; Proschold and Leliaert
2007). This morphologically heterogeneous assemblage
of green algae gave rise to the three advanced classes designated as the Trebouxiophyceae, Ulvophyceae, and Chlorophyceae (Lewis and McCourt 2004). Note that the scaly
biflagellate Mesostigma viride, traditionally classified
within the Prasinophyceae, has been formally excluded
from this class and placed in the Streptophyta (Marin
and Melkonian 1999; Lemieux et al. 2007; Rodríguez-Ezpeleta et al. 2007). Prasinophytes have long fascinated
phycologists because their study has the potential
to shed light on the nature of the last common ancestor
of all green plants and on the origin of the advanced
chlorophytes.
Key words: prasinophyte green algae, euglenids, chloroplast genome
evolution, phylogenomics, secondary endosymbiosis, genome reduction,
horizontal DNA transfers.
E-mail: [email protected].
Mol. Biol. Evol. 26(3):631–648. 2009
doi:10.1093/molbev/msn285
Advance Access publication December 12, 2008
© The Author 2008. Published by Oxford University Press on behalf of
the Society for Molecular Biology and Evolution. All rights reserved.
For permissions, please e-mail: [email protected]
The concept of the class Prasinophyceae has been under constant revision since its formal description by Moestrup and Throndsen (1988) (Sym and Pienaar 1993); in the
last few years, it has profoundly changed with the description of several new taxa and the analysis of environmental
sequences. Most prasinophytes are found in marine habitats, and considerable diversity is observed with respect
to cell shape and size, flagella number and behavior, mitotic
and cytokinetic mechanisms, and biochemical features such
as accessory photosynthetic pigments and storage products
(Melkonian 1990; O’Kelly 1992; Sym and Pienaar 1993;
Latasa et al. 2004). Some species lack flagella, others lack
scales, and in some cases, both flagella and scales are absent
(e.g., Ostreococcus tauri). The small-sized members of the
Prasinophyceae, particularly those belonging to three genera of the Mamiellales (Micromonas, Bathycoccus, and
Ostreococcus), are prominent in the oceanic picoplankton
(comprising organisms less than 3 µm in diameter) (Guillou
et al. 2004). Included in this category is the smallest
free-living eukaryote known to date, O. tauri (Courties et al.
1994). Phylogenetic studies using molecular data, in particular the nuclear-encoded small subunit (SSU) rRNA gene,
identified seven monophyletic groups of prasinophytes
at the base of the Chlorophyta (Steinkotter et al. 1994;
Nakayama et al. 1998; Fawley et al. 2000; Guillou et al.
2004); however, their order of divergence could not be resolved. Despite this uncertainty, it appears that the coccoid
form evolved more than once in the Prasinophyceae
(Fawley et al. 2000; Guillou et al. 2004). Coccoid cells
are distributed among four lineages (clade II, Mamiellales;
clade V, Pseudocourfieldiales, Pycnococcaceae; clade VI,
Prasinococcales; and clade VII, no order assigned to this
clade), two of which (clades II and V) exhibit both the
coccoid and flagellated cell organizations.
Comparative analysis of chloroplast genomes has been
helpful to resolve problematic relationships among green
algae and land plants (Wolf et al. 2005; Qiu et al. 2006;
Jansen et al. 2007; Lemieux et al. 2007; Turmel et al.
2008) although the phylogenetic positions of some green
plant lineages have remained contentious (Pombert et al.
2005; Turmel et al. 2006; Lemieux et al. 2007). The only
two complete chloroplast DNA (cpDNA) sequences currently available for prasinophytes, those of the scaly biflagellate Nephroselmis olivacea (clade III, Pseudocourfieldiales,
Nephroselmidaceae) (Turmel et al. 1999b) and of the tiny
coccoid O. tauri (clade II, Mamiellales) (Robbens et al.
2007), have revealed contrasting evolutionary patterns which
can be designated as ancestral and reduced derived, respectively. Whereas the 200.8-kb Nephroselmis genome harbors
the largest gene repertoire yet reported for a chlorophyte (128
different conserved genes compared with about 138 genes
for the deepest branching streptophyte algae) and has retained many ancestral gene clusters, the nearly 3-fold smaller
Ostreococcus genome, which is the most compact chlorophyte cpDNA known to date, displays a reduced set of 88
genes whose order is highly scrambled. As in most other
chloroplast genomes, two identical copies of a large inverted
repeat (IR) are separated by single-copy (SC) regions; however, the two prasinophyte genomes differ remarkably in
their quadripartite architectures. The Nephroselmis architectural design closely resembles that found in all streptophyte
IR-containing cpDNAs: the SC regions are vastly unequal in
size, each SC region is characterized by a highly conserved
set of genes, and the rRNA operon encoded by the IR is transcribed toward the small SC (SSC) region. In Ostreococcus,
the SC regions have essentially the same number of genes;
the few genes (just five) that would be expected to map to the
SSC region in streptophyte cpDNAs are confined to the same
SC region, and the rRNA operon is transcribed away from
the latter SC region (see supplementary fig. 1, Supplementary Material online). This gene partitioning pattern is reminiscent of that reported for the cpDNAs of the ulvophytes
Pseudendoclonium akinetum and Oltmannsiellopsis viridis
(Pombert et al. 2005, 2006).
To explore the relationships among prasinophyte lineages and to better understand the mode of cpDNA evolution in the Prasinophyceae, we sequenced the cpDNAs of
the scaly quadriflagellate Pyramimonas parkeae (clade I,
Pyramimonadales), the coccoid Pycnococcus provasolii
(clade V, Pseudocourfieldiales, Pycnococcaceae), and the
scaly uniflagellate Monomastix (unknown affiliation) and
compared these genomes with those previously reported
for Nephroselmis (Turmel et al. 1999b), Ostreococcus
(Robbens et al. 2007), other chlorophytes (Wakasugi
et al. 1997; Maul et al. 2002; Pombert et al. 2005, 2006;
Bélanger et al. 2006; de Cambiaire et al. 2006, 2007;
Brouard et al. 2008), the deep-branching streptophytes
Mesostigma (Lemieux et al. 2000) and Chlorokybus atmophyticus (Lemieux et al. 2007), the euglenid Euglena gracilis (Hallick et al. 1993) and the chlorarachniophyte
Bigelowiella natans (Rogers et al. 2007). The latter photosynthetic eukaryotes, which presumably gained their chloroplasts via independent secondary endosymbiotic events
(Rogers et al. 2007), were included in our comparisons
in an attempt to gain more detailed information about
the green algal donors of their chloroplasts. We found that
the three newly sequenced prasinophyte genomes differ
considerably from one another and from their previously
sequenced homologs at the overall structure, gene content,
and gene order levels, with both the Monomastix and
Pycnococcus genomes featuring a reduced pattern of
evolution. Our phylogenetic analyses of sequence data offered significant insights into the phylogeny and evolution
of prasinophytes and provided unequivocal evidence that
the euglenid chloroplasts were secondarily acquired from
a member of the Pyramimonadales.
Materials and Methods
Strains and Culture Conditions
Pyramimonas parkeae (CCMP 726) and Pycnococcus provasolii
(CCMP 1203), two marine species, were obtained from the
Provasoli–Guillard National Center for Culture of Marine
Phytoplankton (West Boothbay Harbor, Maine) and grown
in K medium (Keller et al. 1987) under 12 h light–dark
cycles. Monomastix sp., a freshwater strain originally collected by H. R. Preisig in New Zealand, originates from the
personal collection of C.J.O. This strain, which is available
upon request to M.T., was grown in modified Volvox medium (McCracken et al. 1980) under 12 h light–dark cycles.
Cloning and Sequencing of Chloroplast Genomes
The complete cpDNA sequences of Pyramimonas,
Monomastix, and Pycnococcus were generated essentially
as described previously (Turmel et al. 2005). For each green
alga, A + T-rich organelle DNA was separated from nuclear DNA by CsCl–bisbenzimide isopycnic centrifugation
of total cellular DNA (Turmel et al. 1999a). The organelle
DNA fraction was sheared by nebulization to produce
1,500 to 3,000-bp fragments that were subsequently cloned
into a plasmid vector, either pBluescript II KS+ or
pSMART-HCKan (Lucigen Corporation, Middleton,
WI). After hybridization of the resulting clones with the
original DNA used for cloning, plasmids from positive
clones were purified with the QIAprep 96 Miniprep kit
(Qiagen Inc., Mississauga, Canada) and sequenced using
universal primers. DNA assembly was carried out using
AUTOASSEMBLER 2.1.1 (Applied BioSystems, Foster
City, CA) or SEQUENCHER 4.2 (Gene Codes Corporation,
Ann Arbor, MI). Distinct contigs of cpDNA origin were ordered by polymerase chain reaction (PCR) amplification
with primers specific to contig ends. The amplified fragments
encompassing uncloned regions were sequenced on both
strands.
Chloroplast Genome Analyses
Genes and all open reading frames (ORFs) larger than
100 codons were identified as described previously (Turmel
et al. 2006). Secondary structures of group I and group II
introns were modeled according to Michel et al. (1989) and
Michel and Westhof (1990), respectively. Short repeats in
the Monomastix genome were identified using REPuter
2.74 (Kurtz et al. 2001), and the number of copies of each
repeat was determined with FINDPATTERNS of the
Genetics Computer Group package (Accelrys, San Diego,
CA). For all three newly sequenced prasinophyte genomes,
regions containing nonoverlapping repeated elements were
mapped with RepeatMasker (http://www.repeatmasker.org/)
running under the WU-Blast 2.0 search engine
(http://blast.wustl.edu/), using the repeats ≥30 bp identified
with REPuter as input sequences. Conserved gene clusters
exhibiting identical gene polarities in selected green algal
cpDNAs were identified using a custom-built program.
Sequencing of the Monomastix 18S rRNA Gene and
Phylogenetic Analysis
The nuclear-encoded SSU rRNA gene was amplified
from total cellular DNA by PCR using the specific primers
NS1 (White et al. 1990) and 18L (Hamby and Zimmer
1991). The resulting PCR product was purified and sequenced directly using these primers and two internal primers. The Monomastix nuclear-encoded SSU rDNA
sequence was aligned manually against the alignment prepared by Guillou et al. (2004) from 83 chlorophytes and 12
streptophytes. A data set of 1,663 positions was obtained
after removing ambiguously aligned regions using
GBLOCKS 0.91b (Castresana 2000) and the same filtration
parameters employed by Guillou et al. (2004). Maximum
likelihood (ML) trees were inferred using Treefinder (version of April 2008) (Jobb et al. 2004) with the best model
fitting the data [TN + I (proportion of invariable sites) + Γ
(four discrete rate categories)] under the Akaike information criterion. Bootstrap values were calculated for 100
replications.
Phylogenetic Inferences from Whole-Genome Sequence
Data
An amino acid data set and the corresponding nucleotide data set with first and second codon positions were
derived from the completely sequenced cpDNAs of
Bigelowiella (NC_008408), Euglena (NC_001603), and
22 green plants (species names and accession numbers, except those for Oedogonium cardiacum [NC_011031] and
Leptosira terrestris [NC_009681], are provided in table
3 of Lemieux et al. 2007). These data sets were allowed
to contain missing data; however, the proportion of missing
data was limited by selecting for analysis only
the protein-coding genes that are shared by at least 14 taxa.
Seventy genes met this criterion: atpA, B, E, F, H, I, ccsA,
cemA, chlB, I, L, N, clpP, ftsH, infA, petA, B, D, G, L, psaA,
B, C, I, J, M, psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z,
rbcL, rpl2, 5, 14, 16, 20, 23, 32, 36, rpoA, B, C1, C2, rps2,
3, 4, 7, 8, 9, 11, 12, 14, 18, 19, tufA, ycf1, 3, 4, 12. The
amino acid data set was prepared as follows. The deduced
amino acid sequences from the 70 individual genes were
aligned using MUSCLE 3.7 (Edgar 2004), the ambiguously
aligned regions in each alignment were removed using
GBLOCKS 0.91b (Castresana 2000) with the –b2 option
(minimal number of sequences for a flank position) set
to 13, and the protein alignments were concatenated. To
obtain the nucleotide data set, the multiple sequence alignment of each protein was converted into a codon alignment,
the poorly aligned and divergent regions in each codon
alignment were excluded using GBLOCKS 0.91b with
the options –b2 = 13 and –t = c (the latter specifying that
selected sequences are complete codons), the individual codon alignments were concatenated, and finally third codon
positions were excluded with PAUP* 4.0b10 (Swofford
2003). Missing characters represented 5.9% and 5.8% of
the amino acid and nucleotide data sets, respectively.
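The assembly steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study (which relied on MUSCLE, GBLOCKS, and PAUP*): the function names, the toy sequences, the three hypothetical taxa, and the use of "?" to code missing data are all introduced here for illustration only.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # taxon name -> aligned sequence (equal lengths within a gene)


def select_genes(alignments: Dict[str, Alignment], min_taxa: int) -> List[str]:
    """Return the genes whose alignment contains at least `min_taxa` taxa."""
    return sorted(g for g, aln in alignments.items() if len(aln) >= min_taxa)


def concatenate(alignments: Dict[str, Alignment], genes: List[str],
                taxa: List[str], missing: str = "?") -> Alignment:
    """Join per-gene alignments; a taxon lacking a gene gets a run of `missing`."""
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for gene in genes:
        aln = alignments[gene]
        length = len(next(iter(aln.values())))
        for t in taxa:
            parts[t].append(aln.get(t, missing * length))
    return {t: "".join(p) for t, p in parts.items()}


def drop_third_positions(codon_aln: Alignment) -> Alignment:
    """Keep first and second codon positions of an in-frame codon alignment."""
    return {t: "".join(c for i, c in enumerate(s) if i % 3 != 2)
            for t, s in codon_aln.items()}


def missing_fraction(aln: Alignment, missing: str = "?") -> float:
    """Proportion of cells in the matrix that are coded as missing."""
    total = sum(len(s) for s in aln.values())
    return sum(s.count(missing) for s in aln.values()) / total


# Toy codon alignments (made-up sequences, three hypothetical taxa).
toy = {
    "geneA": {"t1": "ATGAAA", "t2": "ATGAAG", "t3": "ATGCAA"},
    "geneB": {"t1": "GGTTTT", "t2": "GGCTTT"},
    "geneC": {"t1": "CCCGGG"},
}
taxa = ["t1", "t2", "t3"]
genes = select_genes(toy, min_taxa=2)        # the study used a threshold of 14 taxa
codons = concatenate(toy, genes, taxa)
first_second = drop_third_positions(codons)
print(genes, first_second, missing_fraction(first_second))
```

In this toy case geneC is dropped because only one taxon carries it, and the taxon lacking geneB is padded with missing characters, which is how a gene absent from some genomes contributes to the reported proportion of missing data.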
Treefinder (version of April 2008) was used to perform the ML analyses and to identify the best model fitting
the data under the Akaike information criterion. The amino
acid data set was analyzed using the cpREV + F (observed
amino acid frequencies) + Γ (five categories) model of sequence evolution. Trees were inferred from the nucleotide
data set using the GTR + Γ (five categories) model. Confidence of branch points was estimated by 500 bootstrap
replications.
Bayesian inference was conducted using
MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). The
model selected was cpREV + F + Γ for the inference from
the amino acid data set and GTR + Γ for the inference from
the nucleotide data set. Rates across sites were modeled on
a discrete gamma distribution with five categories. Two independent Markov chain Monte Carlo runs, each consisting
of three heated chains in addition to the cold chain, were
carried out using the default parameters. For the analysis
of the nucleotide data set, the length of each run was 3 million generations after a burn-in phase of 500,000 generations; for the amino acid data set, it was 1 million
generations after a burn-in phase of 150,000 generations.
Trees were sampled every 100 generations. Convergence
of the two independent runs was verified according to
the output of the "sump" command; this output was also
used to determine the burn-in phase. Posterior probability
values were estimated from the trees sampled from both
runs using the "sumt" command.
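As a small worked check of the sampling scheme, the sketch below counts how many trees would enter the posterior summary. It assumes, following the wording above, that the stated run lengths are counted after the burn-in phase and that one tree is retained per sampling interval; the helper name is illustrative and is not part of MrBayes.

```python
def sampled_trees(post_burnin_generations: int, sample_every: int, runs: int) -> int:
    """Number of post-burn-in trees pooled across independent runs."""
    return (post_burnin_generations // sample_every) * runs


# Two independent runs, trees sampled every 100 generations.
nucleotide_trees = sampled_trees(3_000_000, 100, runs=2)
amino_acid_trees = sampled_trees(1_000_000, 100, runs=2)
print(nucleotide_trees, amino_acid_trees)
```

Under these assumptions the nucleotide analysis pools 60,000 trees and the amino acid analysis 20,000 trees.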
Reconstruction of Ancestral Character States
A data set of gene content was prepared from the chloroplast genomes of the streptophytes Mesostigma and
Chlorokybus, the prasinophytes, and Euglena by coding
the presence and absence of genes as binary characters.

Table 1
General Features of Prasinophyte cpDNAs

Feature                       Nephroselmis   Pyramimonas   Pycnococcus    Monomastix  Ostreococcus
Size (bp)
  Total                            200,799       101,605        80,211       114,528        71,666
  IR                                46,137        13,057            —a            —a         6,825
  LSC                               92,126        65,153            —a            —a        35,684
  SSC                               16,399        10,338            —a            —a        22,332
A+T (%)                               57.9          65.3          60.5          61.0          60.1
Conserved genes (no.)b                 128           110            98            94            88
Introns
  Fraction of genome (%)                 0           2.7           3.3           4.6           5.2
  Group I (no.)                          0             0             0             5             0
  Group II (no.)                         0             1             1             1             1
Intergenic sequencesc
  Fraction of genome (%)              32.6          19.6          11.6          43.9          15.1
  Average size (bp)                    352           159           102           524           115
Short repeated sequencesd
  Fraction of genome (%)               0.5           0.5           0.1          17.6             0

a Because Pycnococcus and Monomastix cpDNAs lack an IR, only the total sizes of these genomes are given.
b Conserved genes refer to free-standing coding sequences usually present in chloroplast genomes. Genes present in the IR were counted only once.
c In addition to conserved genes, all ORFs larger than 100 codons were considered as gene sequences.
d Nonoverlapping repeat elements were mapped on each genome with RepeatMasker using the repeats ≥30 bp identified with REPuter as input sequences.

Gene order in each of these chloroplast genomes was converted to all possible pairs of signed genes (i.e., taking into
account gene polarity) and a gene order data set was obtained by coding as binary characters the presence/absence
of the ancestral gene pairs conserved in at least one streptophyte and one prasinophyte. The gene content and gene
order data sets were merged to produce a data set of combined ancestral characters. Losses of these characters on the
best tree topology inferred from sequence data were mapped using MacClade 4.08 (Maddison and Maddison 2000).
The most parsimonious reconstructions of ancestral character states were inferred under the Dollo principle of parsimony (Farris 1977).
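The conversion of gene order into binary characters can be sketched as follows. This is a hedged illustration, not the authors' program: it assumes that the pairs are adjacent genes on a circular map, that a pair read on one strand is equivalent to the reversed, sign-inverted pair read on the other strand, and it uses made-up toy gene orders. The filtering step that retains only pairs conserved in at least one streptophyte and one prasinophyte is not shown.

```python
from typing import Dict, List, Set, Tuple

SignedGene = Tuple[str, int]              # (gene name, +1 or -1 for strand)
Pair = Tuple[SignedGene, SignedGene]


def canonical(pair: Pair) -> Pair:
    """Strand-independent form: (a, b) and (-b, -a) describe the same adjacency."""
    (a, sa), (b, sb) = pair
    flipped = ((b, -sb), (a, -sa))
    return min(pair, flipped)


def signed_pairs(order: List[SignedGene], circular: bool = True) -> Set[Pair]:
    """All adjacent signed gene pairs of a genome, in canonical form."""
    n = len(order)
    last = n if circular else n - 1
    return {canonical((order[i], order[(i + 1) % n])) for i in range(last)}


def presence_matrix(genomes: Dict[str, List[SignedGene]]) -> Dict[Pair, Dict[str, int]]:
    """Code each gene pair as present (1) or absent (0) in every genome."""
    pairs = {name: signed_pairs(order) for name, order in genomes.items()}
    all_pairs = sorted(set().union(*pairs.values()))
    return {p: {name: int(p in pairs[name]) for name in genomes} for p in all_pairs}


# Toy gene orders (hypothetical arrangements, not taken from any genome).
x = [("psbA", 1), ("rbcL", 1), ("atpB", -1)]
y = [("atpB", 1), ("rbcL", -1), ("psbA", -1)]   # x read from the opposite strand
z = [("psbA", 1), ("atpB", -1), ("rbcL", 1)]    # a rearranged order

matrix = presence_matrix({"X": x, "Z": z})
print(signed_pairs(x) == signed_pairs(y), len(matrix))
```

Because polarity is part of each character, the same circular map read from either strand yields identical pairs, whereas the rearranged order shares none with the original in this toy case.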
Results and Discussion
Pyramimonas cpDNA Features an Ancestral
Quadripartite Structure and a Large Repertoire of Genes
Of the three newly sequenced prasinophyte genomes,
only that of Pyramimonas displays a large IR (table 1). At
101,605 bp, this genome is 2-fold smaller than its Nephroselmis homolog, a size difference attributable to a much
shorter IR, gene losses, and a more compact gene organization. As shown in figure 1, the two copies of the IR sequence, each 13,057 bp in size and encoding 11 genes, are
separated by SC regions of 10,338 and 65,153 bp comprising 12 and 76 genes, respectively. In this figure, the genes
whose orthologs are usually found within
the IR, SSC, and large SC (LSC) regions in streptophyte
cpDNAs are color coded. It can be seen that the pattern of gene partitioning
among the SC regions of the Pyramimonas genome closely
resembles that observed for streptophytes. Considering that
the Pyramimonas IR is about 2-fold larger and encodes additional genes relative to that of Mesostigma and that the IR
is known to contract and expand through gene conversion
events (Goulding et al. 1996), the observation that the termini of the Pyramimonas IR contain genes characteristic of
the adjacent SC regions is not surprising. The most important
deviation from the highly conserved partitioning pattern displayed by streptophytes concerns the locations of
chlL and chlN. These two genes, which would be expected
to be present in the SSC region, lie within the IR near the
LSC region.
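The quadripartite architecture implies a simple size relation: the genome length equals the LSC and SSC lengths plus twice the IR length. The short check below applies this relation to the region sizes listed in table 1 for the three IR-containing prasinophyte genomes; the function and variable names are illustrative.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Length of a circular genome with two IR copies separating the LSC and SSC."""
    return lsc + ssc + 2 * ir


# Region sizes in bp, as listed in table 1.
genomes = {
    "Nephroselmis": dict(total=200_799, ir=46_137, lsc=92_126, ssc=16_399),
    "Pyramimonas": dict(total=101_605, ir=13_057, lsc=65_153, ssc=10_338),
    "Ostreococcus": dict(total=71_666, ir=6_825, lsc=35_684, ssc=22_332),
}

for name, g in genomes.items():
    computed = quadripartite_total(g["lsc"], g["ssc"], g["ir"])
    print(name, computed, computed == g["total"])
```

For Pyramimonas, for example, 65,153 + 10,338 + 2 × 13,057 = 101,605 bp, in agreement with the reported genome size.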
The Pyramimonas chloroplast genome encodes 110
conserved genes, that is, genes found in several other
cpDNAs and usually present in cyanobacteria. The products
of these genes consist of 81 proteins and 29 RNA species (2
rRNAs and 27 tRNAs) (table 2). The set of 27 tRNAs is sufficient to decode all 61 sense codons provided that the tRNA
species encoded by trnV(uac), trnA(ugc), trnT(ugu), trnS(uga),
trnL(uag), and trnP(ugg) recognize all four members of their
respective codon family through superwobble pairing
between the first position of the anticodon and the third
position of the codon (Rogalski et al. 2008). The size of
the Pyramimonas chloroplast gene complement closely
matches those observed for the trebouxiophytes Chlorella
vulgaris and Leptosira and for the ulvophytes Pseudendoclonium and Oltmannsiellopsis (de Cambiaire et al. 2007).
Although it is significantly reduced compared with its
Nephroselmis counterpart (table 2), the set of Pyramimonas
chloroplast genes includes six ndh genes (ndhA and ndhD
through ndhH) typically present in streptophytes but previously found only in Nephroselmis in the Chlorophyta, as
well as two protein-coding genes reported here for the first
time in a chlorophyte chloroplast genome, rpl22 and ycf65
(supplementary table 1, Supplementary Material online).
The ycf65 gene is present in both Mesostigma and Chlorokybus but missing in the other investigated streptophytes,
whereas rpl22 shows a widespread distribution in the
Streptophyta and also resides in the Euglena chloroplasts.
Perhaps not surprisingly, most of the 22 chloroplast genes
present in Nephroselmis but absent in Pyramimonas are
also missing from some chlorophytes belonging to the
Trebouxiophyceae, Ulvophyceae, or Chlorophyceae (supplementary table 1, Supplementary Material online). Only
five genes (cemA, petD, petL, psbM, and rrf) represent exceptions and interestingly, all five, except rrf (the 5S rRNA
gene), are also lacking in the Ostreococcus and Euglena
chloroplasts. The analysis of the nuclear genome from
both O. tauri and Ostreococcus lucimarinus revealed that
cemA, petD, and psbM have been transferred to the nucleus
(Derelle et al. 2006; Palenik et al. 2007; Robbens et al.
2007). Considering that these genes are essential for chloroplast function, they are also likely to be nuclear-encoded
in Pyramimonas. Because no case of chloroplast to nucleus
transfer has been documented for rrf, the possibility exists
that this conserved gene is present in Pyramimonas cpDNA
and that its sequence has diverged beyond recognition.

FIG. 1.—Gene map of Pyramimonas cpDNA. The two copies of the IR sequence are represented by thick lines. Genes (filled boxes) on the outside
of the map are transcribed in a clockwise direction. Coding sequences not commonly found in cpDNA are shown in gray. The single intron in atpB is
represented by an open box. The color code denotes the genomic regions containing the corresponding genes in the cpDNAs of Nephroselmis and
streptophytes: magenta, SSC; cyan, LSC; and yellow, IR. Given the variable gene content of the IR in these ancestral-type genomes, only the genes
invariably present in this region (i.e., those forming the rRNA operon) were represented in yellow. tRNA genes are indicated by the one-letter amino
acid code (Me, elongator methionine; Mf, initiator methionine) followed by the anticodon in parentheses.

We found two large ORFs that are not associated with
any introns, orf454 and orf510. For the orf510, present in the
LSC region near the IR, our Blast searches against the nonredundant protein sequence database of the National Center
for Biotechnology Information failed to identify any putative
function for the potential encoded protein. However, the
product of the orf454 localized in the IR revealed sequence
similarity with the conserved domain of phage-associated
DNA primases (COG3378, E-value = 1e-06). Interestingly, in the course of the present study, we have found that
the orf389 in the Nephroselmis IR (Turmel et al. 1999b) also
encodes a putative protein with the conserved domain of
phage-associated DNA primases (COG3378, E-value =
2e-12). Given that viruses have been observed in Pyramimonas (Moestrup and Thomsen 1974; Sandaa et al. 2001)
and Nephroselmis (Nakayama et al. 2007), it is tempting
to speculate that the above-mentioned orf454 and orf389
originated from horizontal transfer of viral genes. There
are only a few documented cases of nonstandard, freestanding chloroplast genes that were acquired via horizontal
gene transfer, and all these cases involve genes that participate in DNA recombination or replication (Khan et al. 2007;
Brouard et al. 2008; Cattolico et al. 2008). Like the orf454
and orf389, the two horizontally transferred genes identified
in the chlorophycean green alga Oedogonium cardiacum are
housed in the IR (Brouard et al. 2008).
In general, the conserved genes present in Pyramimonas cpDNA are densely packed (table 1). Prominent exceptions are those in the regions containing the orf454 and
orf510 (fig. 1). There are two cases of overlapping genes
(psbC–psbD and ndhC–ndhK); for the remaining genes,
intergenic spacers vary between 3 and 2,517 bp, with an
average size of 159 bp. Consistent with this high degree
of compaction, only a few short repeats, mostly direct repeats, were identified (table 1); they are found mainly in the
large spacer adjacent to the orf510.

Table 2
Gene Repertoires of Prasinophyte cpDNAs
Genomes compared (columns): Nephroselmis, Pyramimonas, Pycnococcus, Monomastix, Ostreococcus
Genes listed (rows)a: accD, ccsA, cemA, chlB, chlI, chlL, chlN, cysA, cysT, ftsI, ftsW, minD, ndhA, ndhB, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhK, petD, petL, petN, psaC, psaJ, psaM, psbM, rne, rnpB, rpl12, rpl19, rpl22, rpl32, rpoB, rps9, rrf, trnG(gcc), trnI(cau), trnL(caa), trnL(gag), trnP(ggg), trnR(ccg), trnS(cga), trnS(gga), trnT(ggu), ycf4, ycf20, ycf47, ycf62, ycf65, ycf81
[The presence (+) and absence (–) entries for each genome could not be recovered from the source layout.]
a Only the genes that are missing in one or more genomes are indicated. A total of 80 genes are shared by all compared cpDNAs: atpA, B, E, F, H, I, clpP, ftsH, infA,
petA, B, G, psaA, B, I, psbA, B, C, D, E, F, H, I, J, K, L, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 36, rpoA, C1, C2, rps2, 3, 4, 7, 8, 11, 12, 14, 18, 19, rrl, rrs, tufA, trnA(ugc),
C(gca), D(guc), E(uuc), F(gaa), G(ucc), H(gug), I(gau), K(uuu), L(uaa), L(uag), Me(cau), Mf(cau), N(guu), P(ugg), Q(uug), R(acg), R(ucu), S(gcu), S(uga), T(ugu), V(uac),
W(cca), Y(gua), ycf1, 3, 12.
b ycf20 is present as a pseudogene in Nephroselmis (Lemieux C, unpublished data); it is located downstream of ndhE and corresponds to orf111 in the gene map
reported by Turmel et al. (1999b).

Like its Ostreococcus and Pycnococcus homologs (see
below), the Pyramimonas genome features a unique intron,
a group II intron in atpB. However, the Pyramimonas atpB
intron and those of Ostreococcus and Pycnococcus are inserted at different sites and carry distinct ORFs, indicating
that they arose from separate events of horizontal DNA
transfer. It should be pointed out here that the currently
available chloroplast genome data strongly support the notion that no introns were present in the chloroplast of the
common ancestor of all green plants (Turmel et al. 1999b;
Lemieux et al. 2000, 2007). The orf608 of the Pyramimonas
group IIA intron is located within domain IV of the
intron secondary structure and carries the reverse transcriptase (cd01651) and maturase (pfam01348) domains,
but not the endonuclease domain, of reverse transcriptases
encoded by group II introns. The endonuclease domain,
which carries out second-strand DNA cleavage during
group II intron mobility (Lambowitz and Zimmerly
2004), was most likely lost after the horizontal transfer
of the intron in the Pyramimonas chloroplast. The
orf608 product shares strong sequence similarity with reverse transcriptases encoded by the genomes of firmicute
bacteria and by the mitochondrial cox1 genes of fungi,
the brown alga Pylaiella littoralis, and the cryptophyte
Rhodomonas salina.

FIG. 2.—Gene map of Pycnococcus cpDNA. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction. The single
intron in atpB is represented by an open box. The orf163 and orf175 revealed no detectable similarity with any known gene sequences. The genes
whose orthologs are found within the IR, SSC, and LSC regions in Nephroselmis and streptophyte cpDNAs are color coded in supplementary figure 2,
Supplementary Material online.

Like Its Ostreococcus Homolog, Pycnococcus cpDNA
Has a Reduced Gene Content and Is Highly Compact
The Pycnococcus chloroplast genome is the smallest
and most compact of the three prasinophyte genomes sequenced
during this study (table 1 and fig. 2). It is only
8.6 kb larger than Ostreococcus cpDNA and contains
10 additional conserved genes, for a total of 98 genes. In
terms of size, this gene repertoire, which consists of 65 protein genes and 33 RNA genes encoding 2 rRNAs, 30
tRNAs, and the RNA component of RNase P (table 2),
is similar to that observed for chlorophycean green algae
(Brouard et al. 2008). The tRNA complement includes
one tRNA species not previously documented in any chlorophytes [tRNAPro (GGG)] but like its Ostreococcus homolog, lacks the tRNA species that reads the AUA codon [i.e.,
the tRNAIle (CAU) where C is modified posttranscriptionally to lysidine]. As in Pyramimonas cpDNA, the 5S rRNA
gene was not detected. Moreover, the Pycnococcus genome
is missing the protein-coding genes psaJ and rpoB, which
are present in all other investigated chlorophytes. Although
the Pycnococcus, Ostreococcus, and Pyramimonas
cpDNAs all show a reduced gene content compared with
the Nephroselmis genome, their sets of genes show substantial differences (table 2).
No vestigial IR region was identified in Pycnococcus
cpDNA. The genes generally found in this region are
FIG. 3.—Gene map of Monomastix cpDNA. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction. Introns are
represented by open boxes. The orf122 and orf125 revealed no detectable similarity with any known gene sequences. The genes whose orthologs are
found within the IR, SSC, and LSC regions in Nephroselmis and streptophyte cpDNAs are color coded in supplementary figure 3, Supplementary
Material online.
dispersed throughout the genome; in contrast, several
genes usually present within the SSC region in genomes
displaying an ancestral quadripartite structure [chlN, chlL,
ycf1, cysT, and trnP(ggg)] remained clustered together
(supplementary fig. 2, Supplementary Material online).
There are two cases of overlapping genes (ycf4–rnpB
and psbD–psbC); for the other coding regions, intergenic
spacers were found to vary from 0 to 383 bp, for an average length of 102 bp.
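Spacer statistics of this kind can be derived directly from the coordinates of annotated genes. The following Python sketch shows one way to do so on a circular map. The gene names, coordinates, and function names are hypothetical and are not taken from the Pycnococcus annotation; overlapping genes are reported as negative gaps and excluded from the summary, and genes spanning the origin of the map are not handled.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end), 1-based inclusive coordinates


def intergenic_gaps(genes: List[Gene], genome_length: int) -> List[Tuple[str, str, int]]:
    """Return (left gene, right gene, gap in bp) for adjacent genes on a circular map.

    A negative gap means that the two genes overlap.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    gaps = []
    for i, (name, _start, end) in enumerate(ordered):
        next_name, next_start, _ = ordered[(i + 1) % len(ordered)]
        if i == len(ordered) - 1:
            # The last gene is followed by the first one across the map origin.
            gap = (genome_length - end) + (next_start - 1)
        else:
            gap = next_start - end - 1
        gaps.append((name, next_name, gap))
    return gaps


def spacer_summary(gaps):
    """Minimum, maximum, and mean length of the non-overlapping spacers."""
    spacers = [gap for _, _, gap in gaps if gap >= 0]
    return min(spacers), max(spacers), sum(spacers) / len(spacers)


# Hypothetical toy annotation on a 1,000-bp circular genome.
toy_genes = [("geneA", 11, 200), ("geneB", 201, 400),
             ("geneC", 391, 600), ("geneD", 701, 990)]
toy_gaps = intergenic_gaps(toy_genes, 1000)
toy_summary = spacer_summary(toy_gaps)
```

In this toy case, geneB and geneC overlap by 10 bp, so only three spacers enter the summary.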
The Pycnococcus atpB intron shares with its Ostreococcus counterpart the same insertion position and a large
ORF in domain IV that features the reverse transcriptase
(cd01651), maturase (pfam08388), and HNH endonuclease
(cd00085) domains of reverse transcriptases encoded by
group II introns. The Pycnococcus and Ostreococcus intron
ORFs share strong similarity with one another and with reverse transcriptase genes found in several cyanobacterial
species as well as in group II introns present in the mitochondrial large subunit (LSU) rRNA gene of the red alga
Porphyra purpurea (Burger et al. 1999) and the chloroplast
psbA genes of Chlamydomonas sp. CCMP 1619 (Odom
et al. 2004) and Euglena myxocylindracea (Sheveleva
and Hallick 2004).
The Monomastix Chloroplast Genome Has a Reduced
Gene Content but is Loosely Packed with Genes
Compared with its Pycnococcus homolog, the Monomastix chloroplast genome is 34 kb larger, has a deficit
of four genes, and contains five additional introns (table 1,
fig. 3 and supplementary fig. 3, Supplementary Material online). Its increased size is largely accounted for by the expansion of intergenic spacers. The latter vary from 3 to
2,566 bp, for an average size of 524 bp, and contain a
myriad of short repeated sequences rich in G + C. The
94 conserved genes specify 64 proteins and 30 RNAs
(3 rRNAs, 26 tRNAs, and the RNA component of RNase
P) (table 2). The 26 tRNAs can decode all 61 sense codons
assuming that tRNAArg(ACG), where A is modified to
inosine, recognizes all four codons of the CGX family.
The reduced gene content of Monomastix is more like
the gene complement of Ostreococcus than that of
FIG. 4.—Conservation of ancestral gene clusters in prasinophyte and Euglena cpDNAs. Ancestral clusters were defined as those containing genes
in the same order and polarity in at least one streptophyte and one prasinophyte. For each genome, the set of genes making up each of the identified
ancestral clusters is shown as black boxes connected by a horizontal line. Black boxes that are contiguous but not linked together indicate that the
corresponding genes are not adjacent on the genome. Gray boxes denote individual genes that have been relocated elsewhere on the chloroplast genome
and empty boxes denote missing genes. The relative polarities of the genes are not represented in this figure; for this information, consult the maps
shown in figures 1–3 or that previously reported for the Nephroselmis genome (Turmel et al. 1999b).
Pycnococcus (table 2). It features nine genes that are missing from Ostreococcus and lacks only three genes that are
present in this alga, including psaC, a gene shared by the
chloroplasts of all previously investigated chlorophytes.
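Comparisons of gene repertoires such as this one reduce to set differences. The short Python sketch below illustrates the bookkeeping with placeholder gene names (they are not the repertoires listed in table 2) and then checks that the counts quoted in the text are mutually consistent: 88 Ostreococcus genes, plus nine genes found only in Monomastix, minus three found only in Ostreococcus.

```python
def compare_gene_content(genome_a: set, genome_b: set):
    """Return the genes unique to A, the genes unique to B, and the shared genes."""
    return genome_a - genome_b, genome_b - genome_a, genome_a & genome_b


# Placeholder gene sets for illustration only.
toy_a = {"g1", "g2", "g3", "g4", "g5"}
toy_b = {"g1", "g2", "g3", "g6"}
only_a, only_b, shared = compare_gene_content(toy_a, toy_b)

# Consistency check on the counts given in the text.
ostreococcus_total = 88
monomastix_total = ostreococcus_total + 9 - 3
```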
Although short dispersed repeats were mapped predominantly to intergenic regions, a small fraction was found
within the coding regions of five genes (ftsH, rpoB, rpoC1,
rpoC2, and ftsH) and within two introns (psbA intron and rrl
intron 4) (supplementary fig. 4, Supplementary Material
online). This distribution pattern resembles those reported
for other chlorophyte cpDNAs rich in short repeats (Maul
et al. 2002; Pombert et al. 2005, 2006; Bélanger et al.
2006; de Cambiaire et al. 2006, 2007). Ranging from 19
to 58 nucleotides, the most abundant short dispersed repeats of Monomastix were classified into four families (A
and A1, B and B1, C, and D) according to their sequence
motifs; moreover, some repeats displaying partial sequences characteristic of distinct families were discerned (supplementary fig. 5, Supplementary Material online). The
hybrid nature of the latter dispersed repeats, which were assigned to six categories (named AB, AC, AD, A1D, A1B,
and BD), suggests they arose through recombination between regions carrying different repeats.
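The procedure used to detect the repeats is not described in this section. As a minimal sketch of the general idea, the Python code below finds exact repeated words of a fixed length on one strand and reports the fraction of copies lying inside annotated features. The sequence, the word length, and the feature coordinates are invented for illustration, and reverse-complement or imperfect copies are ignored.

```python
from collections import defaultdict


def repeated_kmers(seq: str, k: int, min_copies: int = 2):
    """Map each k-mer found at least min_copies times to its 0-based start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_copies}


def fraction_in_features(starts, k, features):
    """Fraction of repeat copies lying entirely inside a (start, end) half-open feature."""
    inside = sum(
        any(s >= f_start and s + k <= f_end for f_start, f_end in features)
        for s in starts
    )
    return inside / len(starts)


# Invented toy sequence containing two repeated 7-mers.
toy_seq = "ATGCGGCCGGTTTACGGCCGGAATTTGGCCGGA"
toy_repeats = repeated_kmers(toy_seq, k=7)

# Share of the copies of one repeat that fall inside a hypothetical coding region.
toy_fraction = fraction_in_features(toy_repeats["CGGCCGG"], 7, [(0, 12)])
```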
The Monomastix chloroplast genome contains a single
group II intron, located in trnK(uuu), and five group I introns, one of which resides in psbA and four in the LSU
rRNA gene (rrl) (fig. 3). The IIB trnK intron is inserted
within the D arm of the tRNA secondary structure following G23 and lacks an ORF. All other trnK(uuu) introns that
have been identified in streptophyte cpDNAs carry an internal ORF with a maturase domain (matK) and are inserted
within the anticodon loop (Turmel et al. 2006). In view of
their ability to encode a homing endonuclease, the five
Monomastix group I introns are likely to be mobile and
were probably captured via horizontal intracellular and/or
intercellular DNA transfer. The IA2 psbA intron, found
at position 525 relative to the corresponding Mesostigma
gene, specifies a potential homing endonuclease with the
GIY–YIG motif and has chloroplast homologs with the
same insertion site and highly similar endonuclease genes
in the ulvophytes Oltmannsiellopsis and Pseudendoclonium and the chlorophycean green algae Oedogonium
and Chlamydomonas reinhardtii (Brouard et al. 2008).
The remaining four group I introns encode potential LAGLIDADG homing endonucleases (Côté et al. 1993; Lucas
et al. 2001) and also share identical insertion sites with
a large number of chlorophyte (Lucas et al. 2001; Brouard
et al. 2008) and cyanobacterial (Haugen et al. 2007) introns.
The first and third LSU rDNA introns, whose insertion positions correspond to sites 1931 and 2500 in the E. coli 23S
rRNA, fall within subgroup IB4, whereas the second and
fourth introns inserted at sites 1951 and 2593 belong to
the IA3 family. Like its Chlamydomonas homolog I-CreI,
the Monomastix site-2593 intron-encoded homing endonuclease (I-MsoI) has been characterized at the 3D level in
the presence of its DNA target site, revealing that the two
isoschizomers display strikingly different protein/DNA contacts (Lucas et al. 2001; Chevalier et al. 2003). Interestingly, at
FIG. 5.—Derived gene clusters uniquely shared by the Euglena and Pyramimonas cpDNAs. The genes shown as gray boxes represent the derived
components of these clusters; those shown as black boxes exhibit an ancestral organization. The genes shown as empty boxes are missing in Euglena cpDNA.
sites 1931, 2500, and 2593, the Monomastix mitochondrial
LSU rRNA gene features introns with structures and
ORFs similar to those found at identical sites in the chloroplast gene
(Lucas et al. 2001) (Turmel M, Gagnon M-C, Otis C, Lemieux
C, unpublished data), highlighting the possibility that mobile
group I introns were exchanged between different organellar compartments in the Monomastix lineage. Evidence
supporting such intracellular exchanges of group I introns
has also been reported for the Nephroselmis (Turmel et al.
1999a) and Pseudendoclonium (Pombert et al. 2006)
lineages.
Pyramimonas and Euglena cpDNAs Show Striking
Similarities in Gene Order
Gene orders in the three newly sequenced prasinophyte chloroplast genomes were compared with one another and with those in previously examined
chlorophytes, the streptophytes Mesostigma and Chlorokybus, the euglenid Euglena, and the chlorarachniophyte
Bigelowiella. In all pairwise genome comparisons, except
that including Pyramimonas and Euglena, the vast majority
of the identified syntenic blocks were composed exclusively of gene clusters commonly found in streptophytes
and chlorophytes. Ancestral clusters of this type display
substantial variability among the Euglena and prasinophyte
genomes (fig. 4). Clearly, the gene-rich genome of Nephroselmis exhibits the highest number of genes (94 genes)
mapping to clusters predating the split of the Chlorophyta
and Streptophyta. Breakpoints within ancestral clusters
proved to be too variable in positions to determine which
of the compared genomes are the most closely related. Note
that our comparisons of the Pyramimonas genome with
those of Mesostigma and Chlorokybus disclosed ancestral
gene linkages that had not been reported in any chlorophyte
cpDNA (e.g., psbH–petB–petD, R(ccg)–rbcL–atpB–atpE).
The ancestral rps2–atpI linkage detected in the Euglena
genome was also previously unrecognized in chlorophytes.
Comparison of gene orders in the Pyramimonas
and Euglena cpDNAs revealed striking similarities between these genomes. Almost two-thirds of the 87 genes
(56 genes) in Euglena cpDNA were found to be part of
collinear regions, for a total of 16 syntenic blocks.
Thirty-five of these genes form eight blocks that exhibit
gene linkages unique to Pyramimonas and Euglena
(fig. 5). Four blocks contain exclusively derived linkages,
whereas the remaining four also include ancestral gene
linkages present in chlorophytes and streptophytes
(the rpl23, rpl32, rps12, and rrs clusters). It is interesting
to note that in each of the latter four blocks, a pair of
adjacent genes was cleanly excised from the Euglena
genome following the formation of the derived linkages.
The syntenic block containing the triad psbK–ycf12–
psaM is not uniquely shared by the Pyramimonas and
Euglena chloroplasts. Being also present in Chlorella,
Pseudendoclonium, and Oltmannsiellopsis but not in
streptophytes, this derived cluster must have arisen
in prasinophytes and have been transmitted by vertical
descent to the trebouxiophyte and ulvophyte lineages.
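Shared gene linkages of the kind discussed in this section can be identified by comparing signed gene adjacencies between two genomes. The Python sketch below is one possible formulation, not the procedure used in this study: each gene carries a strand sign, and an adjacency read from the opposite strand is treated as the same linkage. The gene names and orders are hypothetical.

```python
def signed_adjacencies(order, circular=True):
    """Set of gene adjacencies in a signed gene order, independent of reading direction.

    Each gene is (name, strand) with strand +1 or -1. Reading the pair (a, b)
    from the other strand gives (-b, -a), so both forms map to one canonical key.
    """
    pairs = set()
    n = len(order)
    last = n if circular else n - 1
    for i in range(last):
        a, b = order[i], order[(i + 1) % n]
        flipped = ((b[0], -b[1]), (a[0], -a[1]))
        pairs.add(min((a, b), flipped))
    return pairs


def shared_adjacencies(order_a, order_b):
    """Gene pairs found in the same order and relative polarity in both genomes."""
    return signed_adjacencies(order_a) & signed_adjacencies(order_b)


# Hypothetical gene orders: the segment a-b-c is inverted in genome_y.
genome_x = [("a", 1), ("b", 1), ("c", 1), ("d", 1)]
genome_y = [("c", -1), ("b", -1), ("a", -1), ("d", 1)]
toy_shared = shared_adjacencies(genome_x, genome_y)
```

Because the inversion preserves the internal linkages a-b and b-c, both are recovered as shared, whereas the two linkages at the inversion boundaries are not.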
Monomastix Occupies an Early-Diverging Branch of the
Mamiellales in 18S rDNA Trees
Monomastix has been historically affiliated with the
Prasinophyceae; however, the finding that its body scales
are not typical of those found in prasinophytes but are
more like those of the chrysophyte Chromulina placentula
(Manton 1967) led to the exclusion of this genus from the
Prasinophyceae (Melkonian 1990; Sym and Pienaar
1993). Very limited molecular information has been reported so far for Monomastix, explaining why its phylogenetic status has remained enigmatic. In the present
study, we determined the sequence of the Monomastix
nuclear-encoded SSU rRNA gene and compared it with
those available for other prasinophytes and some
representatives of the Trebouxiophyceae, Ulvophyceae,
and Chlorophyceae. Trees inferred with ML unambiguously showed that Monomastix represents an early-diverging lineage of the Mamiellales (clade II) (fig. 6).
This uniflagellate, which has nonprasinophyte scales,
was resolved as the first branch of this morphologically
diverse clade. An unquestionable affinity therefore
exists between Ostreococcus and Monomastix even
though these two taxa belong to different lineages of
FIG. 6.—Phylogenetic position of Monomastix among prasinophytes as inferred from nuclear-encoded SSU rDNA sequences. The figure presents
the best ML tree. Bootstrap values are shown on the corresponding nodes. The names of the taxa whose chloroplast genomes were examined in the
present study are shown on a black background. Clade numbering follows that of Guillou et al. (2004).
the Mamiellales. The naked Ostreococcus is closely related to the scaly Bathycoccus and the clade uniting these
nonflagellated genera is sister to that containing the flagellated
genera Mamiella (two flagella), Mantoniella (one
flagellum), Micromonas (naked, one flagellum), and the
new genus represented by isolate RCC 391 (two flagella).
FIG. 7.—Phylogenies inferred from 70 concatenated chloroplast genes (first two codon positions) and their deduced amino acid sequences. (A) Best
ML tree inferred from the amino acid data set. (B) Best ML tree inferred from the nucleotide data set. The bootstrap values obtained in ML analyses and
the posterior probability values obtained in Bayesian analyses are shown on the left and right, respectively, on the corresponding nodes.
Chloroplast Phylogenomic Analyses Unite the
Pyramimonadales with the Mamiellales and Identify the
Pyramimonadales as the Source of the Euglenid
Chloroplasts
To explore the relationships among prasinophyte lineages (in particular clades I, II, III, and V) as well as the
relationships of chlorophyte chloroplasts with the secondarily
acquired chloroplasts of Bigelowiella and Euglena, we
generated data sets of 70 concatenated proteins and genes
(first and second codon positions) from completely sequenced chloroplast genomes and analyzed them using
the ML and Bayesian methods (fig. 7). As expected, both
the protein and gene trees identified a strongly supported
clade uniting the two representatives of the Mamiellales,
Monomastix and Ostreococcus. This clade is sister to a
robust monophyletic group clustering the Pyramimonas
(scaly, four or eight flagella) and Euglena chloroplasts.
Although this sister relationship received 87% bootstrap
support in the protein ML tree (fig. 7A), exclusion of the
long-branch taxa Euglena and Bigelowiella from the analysis
resulted in 97% bootstrap support for the Pyramimonas +
Monomastix + Ostreococcus clade (data not shown). In
all analyses, the scaly biflagellate Nephroselmis was sister
to all other chlorophytes analyzed, whereas the position of the
naked, nonflagellated Pycnococcus remained equivocal.
The latter prasinophyte was resolved as sister to the core
chlorophytes in the protein tree (fig. 7A), but was sister
to the Mamiellales, Pyramimonadales, and euglenids in
the gene tree (fig. 7B). The protein and gene trees thus differ
only in the branching position of the core chlorophytes with
respect to the prasinophyte lineages.
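The nucleotide data set described above retains only the first two positions of each codon from the concatenated genes. A minimal Python sketch of this kind of data preparation is given below. The function names and toy alignments are hypothetical, and the padding of missing genes with "?" is an assumption made for the example rather than a description of how the data sets of this study were assembled.

```python
def first_two_codon_positions(codon_alignment: str) -> str:
    """Drop every third column of an in-frame codon alignment."""
    if len(codon_alignment) % 3:
        raise ValueError("alignment length must be a multiple of 3")
    return "".join(codon_alignment[i:i + 2] for i in range(0, len(codon_alignment), 3))


def concatenate(gene_alignments, taxa):
    """Concatenate per-gene alignments (gene -> taxon -> sequence) for each taxon.

    Taxa lacking a gene are padded with '?' so that all rows keep the same length.
    """
    supermatrix = {taxon: [] for taxon in taxa}
    for gene in sorted(gene_alignments):
        alignment = gene_alignments[gene]
        width = len(next(iter(alignment.values())))
        for taxon in taxa:
            supermatrix[taxon].append(alignment.get(taxon, "?" * width))
    return {taxon: "".join(parts) for taxon, parts in supermatrix.items()}


# Hypothetical two-gene, two-taxon example; geneB is absent from tax2.
toy_codon_alignments = {
    "geneA": {"tax1": "ATGGCC", "tax2": "ATGGCA"},
    "geneB": {"tax1": "TTTAAA"},
}
toy_stripped = {
    gene: {taxon: first_two_codon_positions(seq) for taxon, seq in aln.items()}
    for gene, aln in toy_codon_alignments.items()
}
toy_matrix = concatenate(toy_stripped, ["tax1", "tax2"])
```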
Because phylogenetic analyses based on the whole-genome approach are inherently associated with sparse
taxon sampling, they can lead to trees robustly supporting
an artifactual clustering of taxa (Brinkmann and Philippe
2008; Heath et al. 2008). Caution must therefore be exercised in the interpretation of the observed topologies. In the
case of trees derived from complete genome sequences,
structural features of these genomes can be used as independent data to test topologies (Rokas 2006). In the present
study, the strong alliance we uncovered between the
Pyramimonas and Euglena chloroplasts is strengthened
by a number of gene linkages that are unique to the cpDNAs
of these algae (fig. 5). Based on this finding, we infer with
confidence that the green algal partner in the secondary
endosymbiosis that gave rise to euglenids was a member
of the Pyramimonadales. Euglenids are unicellular organisms that belong to the Excavata, a supergroup of eukaryotes including diverse nonphotosynthetic groups like
diplomonads, retortamonads, parabasalids, oxymonads,
and jakobids (Baldauf et al. 2000; Baldauf 2008; Keeling et al.
2005). Euglenids are the only photosynthetic excavates,
and they are specifically related to a subgroup containing
the kinetoplastids and diplonemids (Triemer and Farmer
2007). Prior to our study, published data were consistent
with the notion that the euglenid chloroplasts evolved from
a green algal endosymbiont that was allied to prasinophytes
(Turmel et al. 1999b; Ishida et al. 1997; Rogers et al. 2007);
however, it remained unknown which of the monophyletic groups of prasinophytes harbored the closest relative of the euglenid endosymbiont. In agreement with
our results, the ML tree that Ishida et al. (1997) inferred
from the amino acid sequences of elongation factor Tu
revealed a strongly supported clade clustering Pyramimonas disomata and the euglenids E. gracilis and Astasia
longa; however, this Pyramimonas species was the only
prasinophyte sampled in this single-gene analysis. Likewise, considering that P. parkeae is the sole representative of the Pyramimonadales in our chloroplast
phylogenomic study, there remain uncertainties about the
exact pyramimonadalean lineage that was the source of
the euglenid chloroplasts.
In the eukaryotic tree of life based on nuclear-encoded
genes, euglenids and chlorarachniophytes fall within distinct branches. Like euglenids, chlorarachniophytes belong
to a supergroup of eukaryotes that is primarily nonphotosynthetic,
the Rhizaria (Keeling et al. 2005; Baldauf
2008). By robustly placing Bigelowiella at a separate position from Euglena, our chloroplast phylogenomic analyses
strongly reinforce the hypothesis that the euglenid and
chlorarachniophyte chloroplasts trace back to two independent secondary endosymbioses (Rogers et al. 2007; Takahashi et al. 2007) (fig. 7). Although the chloroplast of
Bigelowiella was found to be sister to those of the ulvophytes Pseudendoclonium and Oltmannsiellopsis in both
the protein and gene trees, broader sampling of core chlorophytes will be required to pinpoint the closest green algal
relative of the chlorarachniophyte endosymbiont.
The most unexpected finding that emerged from our
study is the observation that the Pyramimonas + Euglena
clade is sister to the Monomastix + Ostreococcus clade. Although the existence of a sister relationship between the
Pyramimonadales and Mamiellales has not been previously
documented, it is compatible with the resemblance that
these monophyletic groups display at the level of flagellar
scale structure (Melkonian 1984, 1990; O’Kelly 1992; Sym
and Pienaar 1993) and with the branching order inferred
from 18S rDNA data. The Pyramimonadales emerge just
before the Mamiellales in most 18S rDNA trees (Steinkotter
et al. 1994; Nakayama et al. 1998; Fawley et al. 2000;
Guillou et al. 2004); however, these lineages form a weakly
supported clade in the ML tree recently reported by
Nakayama et al. (2007). No similarities were found at
the chloroplast gene order level that link the Pyramimonadales and Mamiellales to the exclusion of other chlorophyte
groups; however, losses of at least four genes (cemA, cysT,
petL, and rpl19) could be traced back unambiguously to the
common ancestor of the Pyramimonadales and Mamiellales
(supplementary table 1, Supplementary Material online).
Because the Pyramimonadales and Mamiellales are
distinguished by prominent morphological differences,
the existence of a sister relationship between these lineages
has important implications for the evolution of prasinophytes. All members of the Pyramimonadales, which represent the five genera indicated in figure 6 and also probably
the Tasmanites (a fossil resembling the phycoma stages of
Cymbomonas, Pterosperma, and Halosphaera, which has
been found in Precambrian deposits), share a number of
synapomorphic characters and have at least four flagella
and a complex scaly covering consisting of three layers
of scales on the cell body and of two layers on the flagella
(Melkonian 1984, 1990; Sym and Pienaar 1993). The intermediate scale layer on the cell body consists of spiderweb-shaped scales in Pterosperma and is homologous to the
outer scale layer on the flagellum (the limulus scales)
and to the spiderweb scales of the Mamiellales. The limuloid scales of Cymbomonas are also reminiscent of the
spiderweb scales of the Mamiellales, particularly during
morphogenesis (Moestrup et al. 2003). Interestingly, an apparent food-uptake apparatus is present in Cymbomonas,
which has been interpreted as a character inherited from a
phagotrophic ancestor of the green plants and subsequently
lost during evolution of the green algae (Moestrup et al.
2003). On the other hand, the members of the Mamiellales
show reduced morphological complexity and are characterized by a progressive simplification of cellular structure
and a reduction in cell size that occurred concomitantly with
the loss of scales (Nakayama et al. 1998). They lack an
underlayer of square-shaped scales (such scales are present
in most other prasinophyte lineages and the flagellate
reproductive cells of streptophytes) and no microtubular
flagellar roots are attached to the basal body no. 2. A sister
relationship between the Pyramimonadales and Mamiellales implies that some of the cellular features displayed
by the Mamiellales were derived from the more complex
organization seen in the Pyramimonadales and presumably
in the common ancestor of all chlorophytes. In this context,
it is worth mentioning that the nature of the progenitor of
all green plants has generated intense debate and is still
controversial (Melkonian 1984; O’Kelly 1992; Sym and
Pienaar 1993). A better understanding of the relationships
among prasinophyte lineages will be required before one
can infer with confidence evolutionary scenarios of cellular
changes.
At present, the identity of the earliest-diverging chlorophyte lineage remains uncertain. Intriguingly, the trees
inferred from 18S rDNA sequences (Guillou et al. 2004;
Nakayama et al. 2007) are in discordance with the chloroplast phylogenomic trees reported in this study with regard
to the position of the Nephroselmis genus (clade III). The
early-diverging position observed for the Nephroselmis
representative in chloroplast trees is in agreement with
the high degree of ancestral features found in the cpDNA
of this taxon (see fig. 8) but contrasts sharply with the much
later divergence observed for the genus in 18S rDNA trees.
In the latter trees, the branch occupied by Nephroselmis
species emerges near the lineage containing Pycnococcus
and Pseudocourfieldia marina, the clade VII containing only picoplanktonic species, and the clade containing the core
chlorophytes (Chlorodendrales sensu [Melkonian 1990] +
Trebouxiophyceae + Ulvophyceae + Chlorophyceae).
Together, these lineages form a large clade that is well supported in ML analysis (fig. 6). Given the close relationship
observed on the basis of scale structure between Nephroselmis and the genera Tetraselmis and Scherffelia, Nakayama
et al. (2007) proposed that the common ancestor of the
clade containing Nephroselmis and the core chlorophytes
had two layers of small scales on the flagella (square-shaped
scales and rod-shaped scales) and cell body (square scales
and stellate scales). The above-mentioned discrepancy between nuclear and chloroplast trees highlights the need for
analysis of chloroplast genomes from additional prasinophytes. Sampling of chloroplast genomes from all seven
known lineages of prasinophytes will be required to determine
the exact position of Nephroselmis relative to the Pycnococcaceae, Pyramimonadales, and Mamiellales.
Losses of Multiple Ancestral cpDNA Characters in
Independent Prasinophyte Lineages are Correlated with
Major Cellular Remodeling
To trace some of the evolutionary changes that
occurred at the chloroplast genome level during the evolution of prasinophytes and euglenids, losses of 62 genes and
75 ancestral gene pairs were mapped on the tree topology
inferred from sequence data (fig. 8). In this analysis, the
core chlorophytes were excluded and the streptophytes
Mesostigma and Chlorokybus were used as outgroup.
Although multiple characters were lost in independent
lineages, a substantial fraction of losses are uniquely
shared. In particular, the monophyletic group containing
the Mamiellales + euglenids + Pyramimonadales and
the node linking the latter clade with the Pycnococcaceae
are supported by several changes that occurred only once.
Because the nuclear genome of just one prasinophyte genus
(Ostreococcus) has been sequenced so far (Derelle et al.
2006; Palenik et al. 2007), we cannot interpret our results
in terms of gene transfers from the chloroplast to the nucleus. Most of the genes that vanished from the chloroplast
genome probably fall into this category; however, some
might have disappeared entirely from the cell because their
requirement is restricted to certain growth and physiological conditions (e.g., the chl genes associated with chlorophyll synthesis in the dark, the cys genes involved in sulfate
and thiosulfate transport, and the ndh genes associated with
chlororespiration).
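To make explicit the distinction between unique and convergent losses, the Python sketch below places loss events on a fixed tree under the assumption that a character was present at the root and can be lost but never regained. This Dollo-style rule is an assumption of the example and is not necessarily the reconstruction procedure used for figure 8. The tree, taxon labels, and character states are hypothetical.

```python
def leaves(tree):
    """Leaf names of a tree written as nested tuples, e.g. (("A", "B"), "C")."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def loss_events(tree, present):
    """Clades on whose stem branch a loss is inferred.

    Assumes that the character was present at the root and cannot be regained,
    so a loss is placed on the most inclusive clade whose leaves all lack it.
    """
    tips = leaves(tree)
    if not any(present[tip] for tip in tips):
        return [frozenset(tips)]
    if isinstance(tree, str):
        return []  # this leaf retains the character
    events = []
    for child in tree:
        events.extend(loss_events(child, present))
    return events


# Hypothetical six-taxon tree and two characters (True = retained).
toy_tree = ("T1", ("T2", (("T3", "T4"), ("T5", "T6"))))
char_x = {"T1": True, "T2": True, "T3": False, "T4": False, "T5": False, "T6": False}
char_y = {"T1": True, "T2": False, "T3": True, "T4": True, "T5": False, "T6": False}

unique_loss = loss_events(toy_tree, char_x)      # one event: a uniquely shared loss
convergent_loss = loss_events(toy_tree, char_y)  # two events: convergent losses
```

A character scored with a single event corresponds to a unique loss, whereas a character with two or more events corresponds to convergent losses in independent lineages.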
The chloroplast genome sustained important reduction
in gene content in at least three separate lineages, namely,
the lineages leading to Euglena, to the mamiellalean genera
Monomastix and Ostreococcus, and to Pycnococcus (fig. 8).
In light of the close affinity of the Pyramimonas and
Euglena chloroplast genomes, we propose that the secondary
endosymbiosis that gave rise to the euglenid chloroplasts was
accompanied by extensive gene losses. Similar extinction of
numerous chloroplast genes has been associated with the
secondary endosymbiosis that involved the capture of a red
alga and generated the chloroplasts of heterokonts, cryptophytes, and haptophytes (Khan et al. 2007; Oudot-Le Secq
et al. 2007; Cattolico et al. 2008). With regard to the
Mamiellales, it appears that the common ancestor of
Monomastix and Ostreococcus had already experienced
multiple chloroplast gene losses (fig. 8), implying that these
events might have accompanied the simplification of cell organization that presumably coincided with the emergence of
the Mamiellales. Moreover, as indicated by the higher frequency of gene losses in the Ostreococcus lineage compared
with the Monomastix lineage, part of the gene losses in the
former lineage were likely connected with the evolution of
the coccoid cell organization and the reduction in cell size.
Pycnococcus represents an independent coccoid lineage that
sustained considerable reduction of the chloroplast genome,
and as observed for Ostreococcus, there was strong pressure
to maintain a compact genome organization. In contrast, the
genome of the mamiellalean Monomastix followed a divergent
evolutionary pathway and became loosely packed with genes
following proliferation of small dispersed repeats (table 1 and
supplementary fig. 4, Supplementary Material online).
The pressure to maintain the ancestral quadripartite architecture became relaxed during the evolution of prasinophytes and euglenids. The IR was lost a minimum of three
times (fig. 8), an observation that is not surprising given that
independent IR losses have been documented for the class
Trebouxiophyceae (de Cambiaire et al. 2007) and for land
plants (Palmer 1991; Raubeson and Jansen 2005). More
unexpected was our finding that the three examined IR-containing prasinophyte cpDNAs differ significantly in
the distribution of their genes among the two SC regions
and in the orientation of the IR relative to these regions.
FIG. 8.—Losses of chloroplast genes and gene pairs during the evolution of prasinophytes and euglenids. Unique losses are indicated by squares,
whereas convergent losses in two or more lineages are indicated by triangles. Red and blue symbols refer to losses of genes and gene pairs, respectively.
Some gene pairs disappeared as a result of gene losses; those that were not correlated with any gene losses are denoted by dots. The number below each
taxon name indicates the total number of conserved genes in the chloroplast genome. Losses of the IR occurred in the three indicated lineages.
Although the Nephroselmis genome conforms most closely to the
gene partitioning pattern observed for streptophytes and
some nongreen algae (Turmel et al. 1999b), the reduced genome
of Ostreococcus shows a pattern (supplementary fig. 1,
Supplementary Material online) more like that observed for
the ulvophytes Pseudendoclonium and Oltmannsiellopsis
(Pombert et al. 2005, 2006). When the latter pattern was
identified in Pseudendoclonium, it was hypothesized that
it might represent an intermediate form between the highly
derived pattern found in the chlorophycean green alga
C. reinhardtii and the ancestral quadripartite structure
found in streptophytes, Nephroselmis, and probably early-diverging trebouxiophytes, thus lending support to the
notion that the Ulvophyceae is sister to the Chlorophyceae
(Pombert et al. 2005). However, the great variability in the
quadripartite structure uncovered here for the Prasinophyceae and recently reported for the Chlorophyceae (de
Cambiaire et al. 2006; Brouard et al. 2008) casts doubt
on the phylogenetic value of this genomic feature. Clearly,
these data indicate that chloroplast genome rearrangements led to the exchanges of genes between opposite
SC regions on multiple occasions during the evolutionary
history of chlorophytes.
Conclusions
The chloroplast genome of prasinophytes exhibits
much more fluidity in gene content and arrangement than
anticipated from the earlier reports on the Nephroselmis
and Ostreococcus genomes. Major reduction and restructuring of the chloroplast genome occurred in conjunction
with changes in cell organization in at least two lineages,
the Mamiellales and Pycnococcaceae. By disclosing the existence of a sister relationship between the Mamiellales and
Pyramimonadales, our study represents a significant step
toward a better understanding of prasinophyte evolution.
Furthermore, it offers for the first time compelling evidence
that the evolutionary history of the prasinophytes was directly linked with the acquisition of photosynthesis through
secondary endosymbiosis by a subgroup of excavates, the
euglenids. Two independent lines of evidence, trees inferred from sequence data and the presence of uniquely
shared derived gene clusters, robustly support the notion
that the green algal ancestor of the euglenid chloroplasts
belonged to the Pyramimonadales. Although sampling of
Bigelowiella has not enabled us to pinpoint the green algal
donor of chlorarachniophyte chloroplasts, the inferred
trees strengthen the hypothesis that chloroplasts arose independently in chlorarachniophytes and euglenids. Considering that pyramimonadaleans are richer in ancestral
characters at the chloroplast genome level and exhibit
a more pronounced level of cell asymmetry and complexity
compared with the mamiellaleans, it is plausible that cell
asymmetry characterized the common ancestor of these lineages. Consistent with the hypothesis that the common ancestor of all chlorophytes also featured an asymmetrical cell
architecture is the observation that Nephroselmis occupies
the earliest divergence of the Chlorophyta and displays the
highest conservation of ancestral characters. Future chloroplast genome investigations incorporating the Chlorodendrales, the two picoplanktonic lineages not sampled in
the present study, and a broader range of taxa in each lineage should resolve further the branching pattern of prasinophyte lineages and clarify the number of separate events
that gave rise to coccoids and streamlining of the chloroplast genome.
Supplementary Material
Supplementary figures 1–5, supplementary table 1, the
data sets used in phylogenetic analyses, and the data set
used to infer the evolutionary scenario of character losses
are available at Molecular Biology and Evolution online
(http://mbe.oxfordjournals.org/). The fully annotated chloroplast genome sequences of Monomastix, Pycnococcus
and Pyramimonas have been deposited in the GenBank database under the accession numbers FJ493497, FJ493498,
and FJ493499, respectively. The GenBank accession number for the Monomastix 18S rDNA sequence determined in
this study is FJ493496.
Acknowledgments
We thank Mathieu Blais and Bertrand Caillier for their
assistance in cloning and sequencing the Pyramimonas
chloroplast genome. This study was supported by a grant
from the Natural Sciences and Engineering Research Council of Canada (to M.T. and C.L.).
Literature Cited
Baldauf SL. 2008. An overview of the phylogeny and diversity of
eucaryotes. J Syst Evol. 46:263–273.
Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF. 2000.
A kingdom-level phylogeny of eukaryotes based on combined
protein data. Science. 290:972–977.
Bélanger A-S, Brouard J-S, Charlebois P, Otis C, Lemieux C,
Turmel M. 2006. Distinctive architecture of the chloroplast
genome in the chlorophycean green alga Stigeoclonium
helveticum. Mol Genet Genomics. 276:464–477.
Brinkmann H, Philippe H. 2008. Animal phylogeny and largescale sequencing: progress and pitfalls. J Syst Evol. 46:
274–286.
Brouard J-S, Otis C, Lemieux C, Turmel M. 2008. Chloroplast
DNA sequence of the green alga Oedogonium cardiacum
(Chlorophyceae): unique genome architecture, derived characters shared with the Chaetophorales and novel genes
acquired through horizontal transfer. BMC Genomics. 9:290.
Burger G, Saint-Louis D, Gray MW, Lang BF. 1999. Complete
sequence of the mitochondrial DNA of the red alga Porphyra
purpurea. Cyanobacterial introns and shared ancestry of red
and green algae. Plant Cell. 11:1675–1694.
Castresana J. 2000. Selection of conserved blocks from multiple
alignments for their use in phylogenetic analysis. Mol Biol
Evol. 17:540–552.
Cattolico R, Jacobs M, Zhou Y, Chang J, Duplessis M,
Lybrand T, McKay J, Ong H, Sims E, Rocap G. 2008.
Chloroplast genome sequencing analysis of Heterosigma
akashiwo CCMP452 (West Atlantic) and NIES293 (West
Pacific) strains. BMC Genomics. 9:211.
Chevalier B, Turmel M, Lemieux C, Monnat RJ, Stoddard BL.
2003. Flexible DNA target site recognition by divergent
homing endonuclease isoschizomers I-CreI and I-MsoI. J Mol
Biol. 329:253–269.
Côté V, Mercier J-P, Lemieux C, Turmel M. 1993. The single
group-I intron in the chloroplast rrnL gene of Chlamydomonas humicola encodes a site-specific DNA endonuclease
(I-ChuI). Gene. 129:69–76.
Courties C, Vaquer A, Troussellier M, Lautier J, ChretiennotDinet MJ, Neveux J, Machado C, Claustre H. 1994. Smallest
eukaryotic organism. Nature. 370:255.
Analysis of three Prasinophyte Chloroplast Genomes 647
de Cambiaire J-C, Otis C, Lemieux C, Turmel M. 2006. The
complete chloroplast genome sequence of the chlorophycean
green alga Scenedesmus obliquus reveals a compact gene
organization and a biased distribution of genes on the two
DNA strands. BMC Evol Biol. 6:37.
de Cambiaire J-C, Otis C, Lemieux C, Turmel M. 2007. The
chloroplast genome sequence of the green alga Leptosira
terrestris: multiple losses of the inverted repeat and extensive
genome rearrangements within the Trebouxiophyceae. BMC
Genomics. 8:213.
Derelle E, Ferraz C, Rombauts S, et al. (26 co-authors). 2006.
Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc Natl Acad Sci
USA. 103:11647–11652.
Edgar RC. 2004. MUSCLE: multiple sequence alignment with
high accuracy and high throughput. Nucleic Acids Res.
32:1792–1797.
Farris JS. 1977. Phylogenetic analysis under Dollo’s Law. Syst
Zool. 26:77–88.
Fawley MW, Yun Y, Qin M. 2000. Phylogenetic analyses of 18S
rDNA sequences reveal a new coccoid lineage of the
Prasinophyceae (Chlorophyta). J Phycol. 36:387–393.
Goulding SE, Olmstead RG, Morden CW, Wolfe KH. 1996. Ebb
and flow of the chloroplast inverted repeat. Mol Gen Genet.
252:195–206.
Guillou L, Eikrem W, Chrétiennot-Dinet M-J, Le Gall F,
Massana R, Romari K, Pedrós-Alió C, Vaulot D. 2004.
Diversity of picoplanktonic prasinophytes assessed by direct
nuclear SSU rDNA sequencing of environmental samples and
novel isolates retrieved from oceanic and coastal marine
ecosystems. Protist. 155:193–214.
Hallick RB, Hong L, Drager RG, Favreau MR, Monfort A,
Orsat B, Spielmann A, Stutz E. 1993. Complete sequence of
Euglena gracilis chloroplast DNA. Nucleic Acids Res.
21:3537–3544.
Hamby RK, Zimmer EA. 1991. Ribosomal RNA as a phylogenetic tool in plant systematics. In: Soltis P, Soltis D, Doyle J,
editors. Molecular systematics in plants. New York: Routledge, Chapman and Hall. p. 50–91.
Haugen P, Bhattacharya D, Palmer JD, Turner S, Lewis LA,
Pryer KM. 2007. Cyanobacterial ribosomal RNA genes with
multiple, endonuclease-encoding group I introns. BMC Evol
Biol. 7:159.
Heath TA, Hedtke SM, Hillis DM. 2008. Taxon sampling and the
accuracy of phylogenetic analyses. J Syst Evol. 46:239–257.
Ishida K, Cao Y, Hasegawa M, Okada N, Hara Y. 1997. The
origin of chlorarachniophyte plastids, as inferred from
phylogenetic comparisons of amino acid sequences of EFTu. J Mol Evol. 45:682–687.
Jansen RK, Cai Z, Raubeson LA, et al. (16 co-authors). 2007.
Analysis of 81 genes from 64 plastid genomes resolves
relationships in angiosperms and identifies genome-scale
evolutionary patterns. Proc Natl Acad Sci USA.
104:19369–19374.
Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER:
a powerful graphical analysis environment for molecular
phylogenetics. BMC Evol Biol. 4:18.
Keeling PJ, Burger G, Durnford DG, Lang BF, Lee RW,
Pearlman RE, Roger AJ, Gray MW. 2005. The tree of
eukaryotes. Trends Ecol Evol. 20:670–676.
Keller MD, Selvin RC, Claus W, Guillard RRL. 1987. Media for the
culture of oceanic ultraphytoplankton. J Phycol. 23:633–638.
Khan H, Parks N, Kozera C, Curtis BA, Parsons BJ, Bowman S,
Archibald JM. 2007. Plastid genome sequence of the
cryptophyte alga Rhodomonas salina CCMP1319: lateral
transfer of putative DNA replication machinery and a test of
chromist plastid phylogeny. Mol Biol Evol. 24:1832–1842.
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J,
Giegerich R. 2001. REPuter: the manifold applications of repeat
analysis on a genomic scale. Nucleic Acids Res. 29:4633–4642.
Lambowitz AM, Zimmerly S. 2004. Mobile group II introns.
Annu Rev Genet. 38:1–35.
Latasa M, Scharek R, Le Gall F, Guillou L. 2004. Pigment suites and
taxonomic groups in Prasinophyceae. J Phycol. 40:1149–1155.
Lemieux C, Otis C, Turmel M. 2000. Ancestral chloroplast
genome in Mesostigma viride reveals an early branch of green
plant evolution. Nature. 403:649–652.
Lemieux C, Otis C, Turmel M. 2007. A clade uniting the green
algae Mesostigma viride and Chlorokybus atmophyticus
represents the deepest branch of the Streptophyta in
chloroplast genome-based phylogenies. BMC Biol. 5:2.
Lewis LA, McCourt RM. 2004. Green algae and the origin of
land plants. Am J Bot. 91:1535–1556.
Lucas P, Otis C, Mercier J-P, Turmel M, Lemieux C. 2001.
Rapid evolution of the DNA-binding site in LAGLIDADG
homing endonucleases. Nucleic Acids Res. 29:960–969.
Maddison D, Maddison W. 2000. MacClade 4: analysis of phylogeny
and character evolution. Sunderland (MA): Sinauer Associates.
Manton I. 1967. Electron microscopical observations on a clone
of Monomastix Scherffel in culture. Nova Hedwigia. 14:1–11.
Marin B, Melkonian M. 1999. Mesostigmatophyceae, a new
class of streptophyte green algae revealed by SSU rRNA
sequence comparisons. Protist. 150:399–417.
Mattox KR, Stewart KD. 1984. Classification of the green algae:
a concept based on comparative cytology. In: Irvine DEG,
John DM, editors. The systematics of the green algae.
London: Academic Press. p. 29–72.
Maul JE, Lilly JW, Cui L, dePamphilis CW, Miller W,
Harris EH, Stern DB. 2002. The Chlamydomonas reinhardtii
plastid chromosome: islands of genes in a sea of repeats. Plant
Cell. 14:2659–2679.
McCracken DA, Nadakavukaren MJ, Cain JR. 1980. A biochemical
and ultrastructural evaluation of the taxonomic position of
Glaucosphaera vacuolata Korsch. New Phytol. 86:39–44.
Melkonian M. 1984. Flagellar apparatus ultrastructure in relation
to green algal classification. In: Irvine DEG, John DM, editors.
The systematics of the green algae. London: Academic Press.
p. 73–120.
Melkonian M. 1990. Phylum Chlorophyta. Class Prasinophyceae. In: Margulis L, Corliss JO, Melkonian M, Chapman DJ,
editors. Handbook of protoctista. The structure, cultivation,
habitats and life histories of the eukaryotic microorganisms
and their descendants exclusive of animals, plants and fungi.
Boston: Jones and Bartlett Publishers. p. 600–607.
Michel F, Umesono K, Ozeki H. 1989. Comparative and
functional anatomy of group II catalytic introns – a review.
Gene. 82:5–30.
Michel F, Westhof E. 1990. Modelling of the three-dimensional
architecture of group I catalytic introns based on comparative
sequence analysis. J Mol Biol. 216:585–610.
Moestrup O, Inouye I, Hori T. 2003. Ultrastructural studies on
Cymbomonas tetramitiformis (Prasinophyceae). I. General structure, scale microstructure, and ontogeny. Can J Bot. 81:657–671.
Moestrup O, Thomsen HA. 1974. An ultrastructural study of the
flagellate Pyramimonas orientalis with particular emphasis on
golgi apparatus activity and the flagellar apparatus. Protoplasma. 81:247–269.
Moestrup O, Throndsen J. 1988. Light and electron microscopical studies on Pseudoscourfieldia marina a primitive scaly
green flagellate prasinophyceae with posterior flagella. Can J
Bot. 66:1415–1434.
Nakayama T, Marin B, Kranz HD, Surek B, Huss VAR, Inouye I,
Melkonian M. 1998. The basal position of scaly green
flagellates among the green algae (Chlorophyta) is revealed by
648 Turmel et al.
analyses of nuclear-encoded SSU rRNA sequences. Protist.
149:367–380.
Nakayama T, Suda S, Kawachi M, Inouye I. 2007. Phylogeny
and ultrastructure of Nephroselmis and Pseudoscourfieldia
(Chlorophyta), including the description of Nephroselmis
anterostigmatica sp. nov. and a proposal for the Nephroselmidales ord. nov. Phycologia. 46:680–697.
O’Kelly CJ. 1992. Flagellar apparatus architecture and the
phylogeny of ‘‘green algae’’: chlorophytes, euglenoids,
glaucophytes. In: Menzel D, editor. The cytoskeleton of the
algae. Boca Raton: CRC Press. p. 315–345.
Odom OW, Shenkenberg DL, Garcia JA, Herrin DL. 2004.
A horizontally acquired group II intron in the chloroplast psbA
gene of a psychrophilic Chlamydomonas: in vitro self-splicing
and genetic evidence for maturase activity. RNA. 10:1097–1107.
Oudot-Le Secq M-P, Grimwood J, Shapiro H, Armbrust EV,
Bowler C, Green BR. 2007. Chloroplast genomes of the
diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana: comparison with other plastid genomes of the red
lineage. Mol Genet Genomics. 277:427–439.
Palenik B, Grimwood J, Aerts A, et al. 2007. The tiny eukaryote
Ostreococcus provides genomic insights into the paradox of
plankton speciation. Proc Natl Acad Sci USA. 104:7705–7710.
Palmer JD. 1991. Plastid chromosomes: structure and evolution.
In: Bogorad L, Vasil K, editors. The molecular biology of
plastids. San Diego: Academic Press. p. 5–53.
Pombert J-F, Lemieux C, Turmel M. 2006. The complete chloroplast
DNA sequence of the green alga Oltmannsiellopsis viridis
reveals a distinctive quadripartite architecture in the chloroplast
genome of early diverging ulvophytes. BMC Biol. 4:3.
Pombert J-F, Otis C, Lemieux C, Turmel M. 2005. The
chloroplast genome sequence of the green alga Pseudendoclonium akinetum (Ulvophyceae) reveals unusual structural
features and new insights into the branching order of
chlorophyte lineages. Mol Biol Evol. 22:1903–1918.
Proschold T, Leliaert F. 2007. Systematics of the green algae:
conflict of classic and modern approaches. In: Brodie J, Lewis
J, editors. Unravelling the algae: the past, present, and future
of algal systematics. Boca Raton: CRC Press, Taylor &
Francis. p. 123–153.
Qiu YL, Li LB, Wang B, et al. (21 co-authors). 2006. The
deepest divergences in land plants inferred from phylogenomic evidence. Proc Natl Acad Sci USA. 103:15511–15516.
Raubeson LA, Jansen RK. 2005. Chloroplast genomes of plants.
In: Henry RJ, editor. Plant diversity and evolution: genotypic
and phenotypic variation in higher plants. Wallingford: CABI
Publishing. p. 45–68.
Robbens S, Derelle E, Ferraz C, Wuyts J, Moreau H, Van de
Peer Y. 2007. The complete chloroplast and mitochondrial
DNA sequence of Ostreococcus tauri: organelle genomes of
the smallest eukaryote are examples of compaction. Mol Biol
Evol. 24:956–968.
Rodriguez-Ezpeleta N, Philippe H, Brinkmann H, Becker B,
Melkonian M. 2007. Phylogenetic analyses of nuclear,
mitochondrial, and plastid multigene data sets support the
placement of Mesostigma in the Streptophyta. Mol Biol Evol.
24:723–731.
Rogalski M, Karcher D, Bock R. 2008. Superwobbling facilitates
translation with reduced tRNA sets. Nat Struct Mol Biol.
15:192–198.
Rogers MB, Gilson PR, Su V, McFadden GI, Keeling PJ. 2007.
The complete chloroplast genome of the chlorarachniophyte
Bigelowiella natans: evidence for independent origins of
chlorarachniophyte and euglenid secondary endosymbionts.
Mol Biol Evol. 24:54–62.
Rokas A. 2006. Genomics and the tree of life. Science.
313:1897–1899.
Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian
phylogenetic inference under mixed models. Bioinformatics.
19:1572–1574.
Sandaa RA, Heldal M, Castberg T, Thyrhaug R, Bratbak G.
2001. Isolation and characterization of two viruses with large
genome size infecting Chrysochromulina ericina (Prymnesiophyceae) and Pyramimonas orientalis (Prasinophyceae).
Virology. 290:272–280.
Sheveleva EV, Hallick RB. 2004. Recent horizontal intron transfer
to a chloroplast genome. Nucleic Acids Res. 32:803–810.
Steinkotter J, Bhattacharya D, Semmelroth I, Bibeau C,
Melkonian M. 1994. Prasinophytes form independent lineages
within the Chlorophyta: evidence from ribosomal RNA
sequence comparisons. J Phycol. 30:340–345.
Swofford DL. 2003. PAUP*. Phylogenetic analysis using
parsimony (*and other methods). Version 4. Sunderland
(MA): Sinauer Associates.
Sym SD, Pienaar RN. 1993. The class Prasinophyceae. In: Round
FE, Chapman DJ, editors. Progress in phycological research.
Bristol: Biopress Ltd. p. 281–376.
Takahashi F, Okabe Y, Nakada T, Sekimoto H, Ito M,
Kataoka H, Nozaki H. 2007. Origins of the secondary plastids
of Euglenophyta and Chlorarachniophyta as revealed by an
analysis of the plastid-targeting, nuclear-encoded gene psbO.
J Phycol. 43:1302–1309.
Triemer R, Farmer M. 2007. A decade of euglenoid molecular
phylogenetics. In: Brodie J, Lewis J, editors. Unravelling the
algae: the past, present, and future of algal systematics. Boca
Raton: CRC Press, Taylor & Francis. p. 315–330.
Turmel M, Brouard JS, Gagnon C, Otis C, Lemieux C. 2008.
Deep division in the Chlorophyceae (Chlorophyta) revealed
by chloroplast phylogenomic analyses. J Phycol. 44:739–750.
Turmel M, Lemieux C, Burger G, Lang BF, Otis C, Plante I,
Gray MW. 1999a. The complete mitochondrial DNA
sequences of Nephroselmis olivacea and Pedinomonas minor:
two radically different evolutionary patterns within green
algae. Plant Cell. 11:1717–1729.
Turmel M, Otis C, Lemieux C. 1999b. The complete chloroplast
DNA sequence of the green alga Nephroselmis olivacea:
insights into the architecture of ancestral chloroplast genomes.
Proc Natl Acad Sci USA. 96:10248–10253.
Turmel M, Otis C, Lemieux C. 2005. The complete chloroplast
DNA sequences of the charophycean green algae Staurastrum
and Zygnema reveal that the chloroplast genome underwent
extensive changes during the evolution of the Zygnematales.
BMC Biol. 3:22.
Turmel M, Otis C, Lemieux C. 2006. The chloroplast genome
sequence of Chara vulgaris sheds new light into the closest green
algal relatives of land plants. Mol Biol Evol. 23:1324–1338.
Wakasugi T, Nagai T, Kapoor M, et al. (15 co-authors). 1997.
Complete nucleotide sequence of the chloroplast genome from
the green alga Chlorella vulgaris: the existence of genes
possibly involved in chloroplast division. Proc Natl Acad Sci
USA. 94:5967–5972.
White TJ, Bruns T, Lee S, Taylor J. 1990. Amplification and
direct sequencing of fungal ribosomal RNA genes for
phylogenetics. In: Innis MA, Gelfand DH, Sninsky JJ, White
TJ, editors. PCR protocols: a guide to methods and
applications. San Diego: Academic Press. p. 315–322.
Wolf PG, Karol KG, Mandoli DF, et al. 2005. The first complete
chloroplast genome sequence of a lycophyte, Huperzia
lucidula (Lycopodiaceae). Gene. 350:117–128.
Martin Embley, Associate Editor
Accepted December 8, 2008