Genome drafts of four phytoplasma strains of the

Microbiology (2012), 158, 2805–2814
DOI 10.1099/mic.0.061432-0
Genome drafts of four phytoplasma strains of the
ribosomal group 16SrIII
Federica Saccardo,1 Marta Martini,1 Sabrina Palmano,2 Paolo Ermacora,1
Marco Scortichini,3 Nazia Loi1 and Giuseppe Firrao1,4
Correspondence
1
Giuseppe Firrao
2
Dipartimento di Scienze Agrarie ed Ambientali, Università di Udine, via Scienze 208, Udine, Italy
Istituto di Virologia Vegetale, CNR, Strada delle Cacce 73, 10135 Torino, Italy
[email protected]
3
Centro di Ricerca per la Frutticoltura, CRA, via di Fioranello 54, Roma, Italy
4
Istituto Nazionale di Biostrutture e Biosistemi, Interuniversity Consortium, Italy
Received 20 June 2012
Revised
22 August 2012
Accepted 27 August 2012
By applying a coverage-based read selection and filtration through a healthy plant dataset, and a
post-assembly contig selection based on homology and linkage, genome sequence drafts were
obtained for four phytoplasma strains belonging to the 16SrIII group (X disease clade), namely
Vaccinium Witches’ Broom phytoplasma (647 754 nt in 272 contigs), Italian Clover Phyllody
phytoplasma strain MA (597 245 nt in 197 contigs), Poinsettia branch-inducing phytoplasma
strain JR1 (631 440 nt in 185 contigs) and Milkweed Yellows phytoplasma (583 806 nt in 158
contigs). Despite assignment to different 16SrIII subgroups, the genomes of the four strains were
similar, comprising a highly conserved core (92–98 % similar in their nucleotide sequence among
each other over alignments about 500 kb in length) and a minor strain-specific component. As far
as their protein complement was concerned, they did not differ significantly in their basic
metabolism potential from the genomes of other wide-host-range phytoplasmas sequenced
previously, but were distinct from strains of other species, as well as among each other, in genes
encoding functions conceivably related to interactions with the host, such as membrane trafficking
components, proteases, DNA methylases, effectors and several hypothetical proteins of unknown
function, some of which are likely secreted through the Sec-dependent secretion system. The four
genomes displayed a group of genes encoding hypothetical proteins with high similarity to a
central domain of IcmE/DotG, a core component of the type IVB secretion system of Gramnegative Legionella spp. Conversely, genes encoding functional GroES/GroEL chaperones were
not detected in any of the four drafts. The results also indicated the significant role of horizontal
gene transfer among different ‘Candidatus Phytoplasma’ species in shaping phytoplasma
genomes and promoting their diversity.
INTRODUCTION
The phytoplasmas are wall-less non-helical prokaryotes
colonizing plant phloem and insects that are known to
cause hundreds of diverse plant diseases, some of which are
Abbreviations: APp, ‘Ca. P. mali’ strain AT; AUp, ‘Ca. P. australiense’;
AYWBp, ‘Ca. P. asteris’ strain Aster Yellows Witches’ Broom; HGT,
horizontal gene transfer; JR1p, Poinsettia branch-inducing phytoplasma
strain JR5PoiBII; MA1p, Italian Clover Phyllody phytoplasma strain MA;
MW1p, Milkweed Witches’ Broom phytoplasma; OYp, ‘Ca. P. asteris’
strain Onion Yellows; PMU, potential mobile unit; SPs, secreted proteins;
VACp, Vaccinium Witches’ Broom phytoplasma.
The GenBank/EMBL/DDBJ accession numbers for the contig
sequences of strains JR1, MW1, MA1 and VAC are AKIK00000000,
AKIL00000000, AKIM00000000 and AKIN00000000, respectively.
A supplementary file, a supplementary figure and two supplementary
tables are available with the online version of this paper.
061432 G 2012 SGM
of great economic concern (Gasparich, 2010). Evidence has
shown that they represent a monophyletic branch of the
class Mollicutes, and the genus ‘Candidatus Phytoplasma’
has been proposed to accommodate them (IRPCM
Phytoplasma/Spiroplasma Working Team – Phytoplasma
taxonomy group, 2004; Firrao et al., 2005). Phylogenetic
analyses of several genes have allowed subdivision on the
basis of 16S rRNA gene sequences into 19–30 (depending
on the method used) so-called 16Sr groups of strains (Lee
et al., 2000; Wei et al., 2007). One of the most populous,
economically important and widely spread groups is named
16SrIII: it encompasses strains that infect many different
plants and insects and which differ greatly in their geographical origin, plant host/insect vector relations, and symptoms induced (Gundersen et al., 1996). Based on RFLP
analysis of 16S rRNA gene fragments, several subgroups
have been delineated within the 16SrIII group (Zhao et al.,
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Printed in Great Britain
2805
F. Saccardo and others
2009), each subgroup consisting of strains that share identical or nearly identical RFLP patterns.
Despite the fact that phytoplasmas cannot be cultivated in
vitro, the development of laborious techniques for the
separation of phytoplasma DNA from that of their host
plant (Cimerman et al., 2006; Firrao et al., 1996a; GarciaChapa et al., 2004; Kollar & Seemüller, 1989; Tran-Nguyen
& Gibb, 2007) has opened access to the physical organization and more recently the sequence of their genomes.
To date, four phytoplasma complete genome sequences
have been released, namely those belonging to two ‘Ca. P.
asteris’ strains, Onion Yellows phytoplasma (OYp) (Oshima
et al., 2004) and Aster Yellows Witches’ Broom phytoplasma
(AYWBp) (Bai et al., 2006), to strain AT of ‘Ca. P. mali’
(Kube et al., 2008) and to one strain of ‘Ca. P. australiense’
(Tran-Nguyen et al., 2008); genomic surveys have also been
published for several phytoplasmas (Kawar et al., 2010;
Liefting & Kirkpatrick, 2003; Cimerman et al., 2006; GarciaChapa et al., 2004). Based on these studies, progress has
recently been made in our understanding of the physiology
of the phytoplasmas, such as the identification of the missing
essential pathways that prevent their in vitro cultivation
(Oshima et al., 2004), and their pathogenicity, through the
identification of effectors (Bai et al., 2009; Hoshi et al., 2009;
MacLean et al., 2011; Sugio et al., 2011).
So far, large-scale phytoplasma genomics has been limited by
the difficulties in preparing phytoplasma DNA in the amount
and purity required for Sanger sequencing. Here we show
that recent extension of the reading capability of the
sequencing by synthesis technology (Illumina) has introduced the potential for producing enough data to perform a
de novo assembly of phytoplasma genomes without purification, paving the way for extensive genomic analysis of these
fastidious pathogens.
METHODS
Strains. Four phytoplasmas were chosen based on their different
geographical origin, different restriction pattern of the 16S rRNA
gene, and different symptoms displayed. Two strains are believed to
be closely related and belong to the 16SrIII-F subgroup. According to
Valiunas et al. (2007), 16SrIII-F phytoplasmas have been found in
four plant species (Vaccinium myrtillus, Vaccinium corymbosum,
Heracleum sosnowski and Dictamnus albus) in northern Europe, while
Asclepias syriaca in North America is the only known host elsewhere.
In this study we included the Vaccinium Witches’ Broom phytoplasma (VACp, 16SrIII-F), a 40-year-old European strain isolated
from Vaccinium myrtillus by R. Marwitz, that on periwinkle shows
symptoms of dwarfism, strong witches’ broom with very small leaves,
small flowers but not typical virescence or phyllody; and the
Milkweed Yellows phytoplasma (MW1p, 16SrIII-F), isolated in the
USA from Asclepias syriaca (Griffiths et al., 1994), that in periwinkle
induces severe decline, and small leaves and flowers. In addition, we
included a strain of Italian Clover Phyllody phytoplasma (MA1p,
16SrIII-B) isolated in the early 1990s in Italy from Chrysanthemum
leuchantemum by Dr L. Carraro (Carraro et al., 1991; Firrao et al.,
1996b; Osler et al., 1994); this strain belongs to subgroup 16SrIII-B
and colonizes periwinkle to high titre, producing strong symptoms of
virescence and phyllody, but with minor reductions of plant and leaf
2806
size. Finally, the Poinsettia branch-inducing phytoplasma strain
JR15PoiBII (JR1p, 16SrIII-H), isolated from Euphorbia pulcherrima
in the USA by Dr Ing-Ming Lee (Molecular Plant Pathology
Laboratory, ARS-USDA, Beltsville, MA), was included. JR1-infected
periwinkles show limited reduction in size; the main symptoms are
yellowing, virescence and phyllody, witches’ broom and reduced leaf
size. The symptoms are summarized in Table 1 and infected plants are
shown in comparison with a healthy control in Fig. 1.
Nucleic acid extraction. Nucleic acids were extracted from 1.5–2 g
of periwinkle midribs using the routinely used CTAB (cetrimonium
bromide) method with phytoplasma enrichment by differential
centrifugation (Ahrens & Seemuller, 1992).
Real-time PCR. Quantification of enriched phytoplasma DNAs was
achieved by SYBR Green I real-time PCR using phytoplasma universal
primers designed for conserved portions of the 16S rRNA gene,
16S(RT)F1/16S(RT)R1 (59-TTC GGC AAT GGA GGA AAC T-39)/
(59-GTT AGC CGG GGC TTA TTC AT-39) amplifying a fragment
138 bp long.
For quantification of the phytoplasmas, a standard curve was
established by diluting a plasmid containing the 16S rRNA gene with
Flavescence dorée (FD) phytoplasma. Serial dilutions (1 : 10) of the
plasmid starting from 1 ng ml21 to 1 fg ml21 were prepared in
20 ng ml21 of total DNA from healthy periwinkle.
Real-time PCR was conducted in an automated thermal cycler (DNA
Engine Opticon 2 System; MJ Research) with AmpliTaq Gold
polymerase (Applied Biosystems). PCR was performed in mixtures
containing 16 PCR buffer, 200 mM each dNTP, 2.5 mM MgCl2,
0.3 mM each primer, 1.25 U AmpliTaq Gold DNA polymerase and
0.256 SYBR Green I (Molecular Probes) in DMSO. After an initial
denaturing step of 11 min at 95 uC, the reactions were cycled 40 times
under the following parameters: 94 uC for 15 s, 57 uC for 15 s and
72 uC for 20 s. At the end of the PCR, to construct the melting curve,
the temperature was increased from 65 to 95 uC at a rate of
0.2 uC s21 to evaluate the purity of the amplification product.
Illumina sequencing. A total of 10 mg DNA from each sample was
fragmented by incubation for 70 min with 5 ml dsDNA Fragmentase
(New England Biolabs). The following steps in library preparation
were carried out as described previously (Marcelletti et al., 2011). The
samples were run on an Illumina Genome Analyzer IIx that provided
paired reads of 100 nt in length, at the Istituto di Genomica Applicata
(Udine, Italy).
Computer-assisted data analysis. Most of the computer analysis
was carried out with custom scripts run on a laptop computer
equipped with 8 GB RAM and Linux OS. For simulation analysis of the
assembly of Illumina reads, a PERL script was developed that generated
simulated random Illumina reads from the complete genome sequence
of AYWBp (NCBI accession no. NC_007716), introducing nucleotide
read errors and varying the distance among pairs to approximate a real
scenario. The simulated reads were then mixed with real Illumina reads
obtained from DNA extracts of healthy periwinkle. For the simulation
presented here, a mixed dataset of 4 141 920 reads 100 nt long with 5 %
phytoplasma reads and 95 % healthy periwinkle reads was used; this
dataset contained 20 709 600 nt from the AYWBp genome, corresponding to 256 coverage.
For final assemblies of the simulated dataset and the real data a
procedure based on a few BASH and PERL scripts was developed that:
(1) Runs VELVET (Zerbino & Birney, 2008), with a wordsize525 to
create a table of coverages.
(2) Sorts the contigs, excluding those with a coverage value lower than
a cut-off value (3–9, determined on the basis of the total reads count
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Microbiology 158
Genome drafts of four 16SrIII phytoplasmas
Table 1. Symptom types induced on periwinkle by the four phytoplasmas included in this study
+, Infrequent; ++, common; +++, typical.
Strain
Phyllody/
virescence
Small flowers
Witches’ broom
Small leaves
Dwarfism
Plant decline
VACp
MA1p
MW1p
JR1p
+
+++
+
+++
++
+
++
+
++
++
+
++
+++
+
+
++
+++
+
++
+
+++
+
+++
+
and percentage of phytoplasma DNA in the sample as estimated by
quantitative PCR).
(5) Runs the CLC Genomic workbench (CLC bio) for de novo
assembly.
The assemblies were deposited at DDBJ/EMBL/GenBank under
accession numbers AKIK01000000, AKIL01000000, AKIM01000000
and AKIN01000000. Contig sequences were scanned for ORFs via
GLIMMER version 3.02 (Delcher et al., 2007), which had been
previously trained on ORFs of the complete coding sequences of
other phytoplasmas and then trained with the ORFs of the draft
genomes identified in the first run. The putative proteins were
annotated against the NCBI Ref_Seq database using a PERL script for
recursive BLASTP (Altschul et al., 1990; Camacho et al., 2009) searches
against the entire GenBank database. Additional genome sequence
analysis was performed with the aid of the software packages MUMMER
(Kurtz et al., 2004), for DNA-based comparison, and BWA (Li &
Durbin, 2009), for mapping of Illumina reads.
(6) Sorts the contigs according to their sequence similarity to known
phytoplasma sequences or the presence of ends overlapping with a
previously assigned contig.
Comparative genome analysis was carried out by comparing the draft
genome sequences with the complete genomes of AYWBp, OYp (NCBI
Ref_Seq accession no. NC_005303), ‘Ca. P. mali’ strain AT (APp; NCBI
(3) Sorts the contigs again, excluding those that show a perfect match
(.98 nt) with at least one read obtained from the Illumina
sequencing of a healthy periwinkle. The healthy periwinkle dataset
used for this step is available at the NCBI Short Reads Archive under
accession number SRS356159.
(4) Rescues the Illumina reads that were used by VELVET for assembly
of the contigs selected and their pairs to create a new dataset.
Fig. 1. Symptoms displayed by periwinkle infected with JR1p (a), MW1p (b), MA1p (c) and VACp (d) compared with a healthy
control.
http://mic.sgmjournals.org
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
2807
F. Saccardo and others
Ref_Seq accession no. NC_011047) and ‘Ca. P. australiense’ (AUp;
NCBI Ref_Seq no. accession NC_010544). To cope with the
exceptionally extensive gene duplication that occurs in all phytoplasmas, Tran-Nguyen et al. (2008) introduced the concept of ‘paralogue
family’ to refer to highly similar gene sequences spread along the
genome that presumably originated from duplication. In the analysis
with multiple genomes, this peculiar feature of the phytoplasma
genome hampers the identification of pairs of orthologues. To facilitate
comparative analysis a script was developed that grouped the
paralogues of each genome with their orthologues in the other
genomes. Below, each of these clusters is referred to as a ‘homologue
gene family’, while the individual members of a cluster are named ‘gene
fragments’, although they may include complete as well as incomplete
genes. Families were established based on a modification of the
reciprocal smallest distance algorithm (Poptsova & Gogarten, 2007;
Wall et al., 2003) after establishment by reciprocal best BLAST hits, and
were evaluated using alignments generated with MAFFT (Katoh et al.,
2005) and CLUSTAL W2 (Larkin et al., 2007); families that did not pass a
quality check (i.e. with a mean ,0.7 or a standard deviation ,0.05 in
identity values calculated between all pairs of predicted proteins) were
revised manually and eventually split.
All families that did not include at least one gene fragment for each
draft (incomplete families) were examined to determine whether the
missing members were actually not encoded in the corresponding
genome. In fact, genes may not be included in the drafts because their
sequence was missing in the assembly or was overlooked in the gene
discovery procedure carried out via GLIMMER. First, using BWA (Li &
Durbin, 2009), the entire read dataset of each strain was mapped on
the ORFs identified in the genome draft of the strain itself (target
strain) and, additionally, on the ORFs identified in the genome drafts
of the other three strains (probe strains), a process referred to below
as crossmapping. Homologue gene families were then scanned to
identify incomplete families where no member was discovered for the
target strains. For each incomplete family detected, the reads of the
target strain dataset that crossmapped on the ORFs of the probe
strains and did not map on other ORFs of the target strain genome
draft were collected and aligned. Alignment consensus sequences were
used as a seed for extension by MAPSEMBLER (Peterlongo & Chikhi,
2012).
Crossmapping was also used to assess the completeness of the protein
complement deduced for each strain. Reads maps and crossmaps were
compared in pairs using EPICENTER (Huang et al., 2011) under the null
hypothesis that the ORFs identified by GLIMMER in the probe strains
were also present in the dataset used for crossmapping. The program
EPICENTER provided an estimation of the P value of the z-test for the
null hypothesis. From the figures for genes predicted by GLIMMER, the
genes that were rescued from the reads dataset and the genes that were
assessed as missing by EPICENTER (with P,0.001), the number of
homologue gene families remaining not discovered was estimated as
described in Supplementary File 1 (available with the online version
of this paper).
Pairwise global alignments and similarity calculations for the
detection of horizontally transferred genes were carried out with
CLUSTAL W2.
Hypothetically secreted proteins were identified as described by Bai
et al. (2009).
RESULTS
Simulation analysis
Before attempting to reconstruct phytoplasma genome
sequence from periwinkle DNA-contaminated samples by
2808
second-generation sequencing, an ad hoc assembly strategy
was developed and evaluated on a simulated dataset built by
mixing computer-generated phytoplasma reads with real
Illumina plant reads. The assembly method developed
exploits the fact that host and pathogen chromosomes are
covered to a different extent in the dataset: while the
phytoplasma genomes range from 500 kbp to 1.3 Mbp, the
periwinkle genome size has been estimated to be 696 Mbp
(Galbraith et al., 1983) and 2000 Mbp (Zonneveld et al.,
2005). Therefore, in samples where the phytoplasma DNA
forms .1 % of the total DNA, the coverage of phytoplasma
DNA is expected to be .10 times the coverage of the host
nuclear DNA. An assembly pipeline (described in Methods)
was developed that improved the relative amount of
phytoplasma reads by selection of reads based on coverage
estimates determined in a preassembly step. Reads were
further selected by filtering on a dataset obtained from the
Illumina sequencing of healthy periwinkle DNA that
efficiently removed host organelle sequences with high
coverage. As shown in Table 2, this selection reduced the
total number of contigs and enriched them in the
phytoplasma sequence. Phytoplasma contig sorting could
then be efficiently carried out by selection based on contig
similarity with known sequences and reciprocal contig
linkage.
Table 2. Summary of the results of simulation
Parameter
Reads (total)*
Reads (phytoplasma)D
Contig number (direct assembly)d
Reads selected (total)§
Reads selected (AYWB)||
Contig number (primary assembly)
Contig number (assigned to AYWB)#
Total sequence in contigs**
AYWB sequence not in contigsDD
Contigs that do not contain AYWB sequencedd
n
4 141 920
207 096
11 064
751 348
206 974
754
150
635 910
8730
0
*Size of the simulated Illumina dataset.
DSimulated reads derived from the complete genome of AYWB
included in the simulated dataset.
dContigs resulting from an assembly of the entire simulated dataset
with VELVET (Zerbino & Birney, 2008) with a cut-off value of 15.
§Size of the dataset used for primary assembly, obtained with the
reads selection procedure described in Methods.
||Number of simulated reads derived from the complete genome of
AYWBp that remained after the selection procedure.
Total number of contigs obtained in the assembly of the selected reads.
#Number of contigs remaining after the post-assembly sorting
procedure described in Methods.
**Total sequence contained in the contigs assigned to AYWB.
DDSequence that was not covered after contigs mapping on the
complete genome of AYWB with MUMMER (Kurtz et al., 2004).
ddNumber of contigs that were not mapped on the complete genome
of AYWB with MUMMER (Kurtz et al., 2004).
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Microbiology 158
Genome drafts of four 16SrIII phytoplasmas
Table 3. Coverage of the genome drafts as estimated from the entire reads dataset and the results of quantitative PCR (qPCR), and
as calculated by mapping reads on contigs after assembly
Strain
Total reads
DNA fraction
(qPCR)
Reads from pathogen
(estimated)
Predicted
coverage
Reads from
pathogen (mapped)
MA1p
MW1p
VACp
JR1p
1 974 008
2 619 032
3 898 408
4 166 594
15
6
5
3
296 101
157 142
194 920
124 998
496
266
326
216
244 720
162 014
203 766
143 748
The pipeline developed with simulated data was applied to
assemble the genomes of four phytoplasma strains of the
16SrIII group, namely strains JR1p, MA1p, MW1p and
VACp (named 16SrIII phytoplasmas below). The nucleic
acid extracted from periwinkle plants infected by each strain
was evaluated by real-time quantitative PCR to determine
the relative amount of the pathogen DNA in each extract.
Several samples were extracted, compared to select those
with the highest percentage of phytoplasma DNA (Table 3),
and submitted to second-generation sequencing (Illumina).
Observed
coverage
416
286
316
236
from the reads dataset, exploiting the high similarity of the
four genome drafts among themselves. Crossmapping the
reads of one strain on the putative coding sequences of
another strain generated consensus sequences that could be
extended to reveal 20–36 additional putative coding sequence
fragments per strain, roughly corresponding to 3–6 % of the
total number of predicted proteins (Table 4 and Supplementary File 1).
Crossmapping led to the definition of new protein complements of the four genomes, which included 677, 650,
565 and 654 proteins for VACp, MA1p, MW1p and JR1p,
respectively.
Assembly of genome drafts for 16SrIII
phytoplasmas and assessment of their protein
complement
In addition, crossmapping was used to estimate the number
of genes for each strain that were not included in the protein
complement of genome drafts, allowing the general estimate
that 4–10 homologue gene families per draft genome had
remained undiscovered (Supplementary File 1).
DNA assemblies of the genome drafts for strains VACp,
MW1p, JR1p and MA1p contained 647 754 nt in 272
contigs, 583 806 nt in 158 contigs, 631 440 nt in 185 contigs,
and 597 245 nt in 197 contigs, respectively (Table 4).
Reciprocal contig mapping using NUCMER (a script from
the MUMMER package) showed that, despite their assignment
to different 16Sr subgroups and differences in the symptoms
displayed, all four genome drafts were highly similar, sharing
.94 % identity of their DNA sequence along the length of
the alignment (Table 5).
Comparison of the genomes of 16SrIII
phytoplasmas with previously sequenced
phytoplasma genomes
Comparison of the protein complement of 16SrIII phytoplasmas with those of AYWBp, OYp, APp and AUp using
BLAST showed that as many as 271 homologue gene families
were conserved in all eight genomes. Using the recently
published summary of the protein repertoire encoded by the
four fully sequenced phytoplasma genomes (Kube et al.,
2012) as a guide, a comprehensive comparative gene table
The protein complement of each of the four genomes
resulting from a GLIMMER prediction comprised 529–650
putative proteins, depending on strain. To complete the
protein complements, we developed a procedure (described
in Methods) to rescue putative coding sequences directly
Table 4. Summary of assembly results
Strain
No. of
contigs
Total contig
length (nt)
Average
contig size
N50 contig
size*
N50 contig
countD
Minimum
contig
length
Maximum
contig
length
G+C
content
(mol%)
VACp
MA1p
MW1p
JR1p
272
197
158
185
647 754
597 245
583 806
631 440
2381
3302
3695
3413
6572
12 309
7972
8188
24
29
16
26
230
230
231
230
22 489
40 778
22 485
31 934
27.4
27.1
27.5
27.3
Number of
predicted coding
sequencesd
677
650
565
654
(650+27)
(624+26)
(529+36)
(634+20)
*N50 contig size is the value X in nucleotides such that at least half of the genome is contained in contigs of size X or larger.
DN50 contig count is the number of contigs of size X or larger.
dTotal number of predicted coding sequences as a sum of genes predicted by GLIMMER (Delcher et al., 2007) and genes rescued using
Durbin, 2009) and MAPSEMBER (Peterlongo & Chikhi, 2012).
http://mic.sgmjournals.org
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
BWA
(Li &
2809
F. Saccardo and others
Table 5. Percentage identity among the DNA sequences
contained in the assembled contigs of each strain over the
length of the alignment
The length of the alignment is lower than the total length of the
sequence in the contigs because contigs that did not align were
excluded from computation.
was drawn (Table S1). In addition to the genes listed in Table
S1, multiple comparison showed that about 20 homologue
gene families with no function assigned were shared by all
genomes, five of which encoded putative secreted proteins.
As the features of ‘Ca. P. asteris’ genomes have been reviewed
recently (Kube et al., 2012), the summary below focuses
almost exclusively on the differences with those genomes.
The 16SrIII phytoplasmas lack an RmuC family protein but
share a set of rec-dependent DNA recombination and repair
systems, as has been found in ‘Ca. P. mali’ but in contrast to
‘Ca. P. asteris’ and ‘Ca. P. australiense’, with implications
that are discussed elsewhere (Kube et al., 2012).
As in the other phytoplasmas, 16SrIII group genomes
contain several copies of DNA methylases. Also, there are
numerous proteins similar to DNA integration host factor
HimA; as these proteins were associated with potential
mobile units (PMUs) (Bai et al., 2006; Kube et al., 2012), it
may be speculated that PMUs are as abundant as they are
in ‘Ca. P. asteris’ strains (see below).
All 16SrIII phytoplasmas share transcriptional regulators (with
similarity to the phage-related Xre family) and terminators
(with similarity to the YlxR family of Gram-positive bacteria)
that are not present in other phytoplasma genomes. None of
the 16SrIII group strains has the restriction–modification
system (hsd) putatively identified in OYM.
A major feature that characterized the four genome drafts
was the complete absence of sequences encoding the
Hsp60-type (GroEL/GroES) chaperone system, which has
been postulated to be indispensable (Glass et al., 2006).
Only strain MA1p has a DNA fragment with the coding
potential for a protein with similarity to GroEL; it is,
however, a 250 nt fragment interrupted by the start of the
coding sequence of the permease of a multidrug efflux
pump; it contains one translation stop codon and it is
therefore not likely to encode a functional protein. All
16SrIII phytoplasma genomes contain two clearly distinct
paralogues encoding the heat-shock protein IbpA (Hsp20),
which was not found in ‘Ca. P. mali’ and is present as a
single copy in other phytoplasmas. Intriguingly, one of the
two copies of the gene is similar to ibpA of ‘Ca. P. asteris’
and the other is similar to ibpA of ‘Ca. P. australiense’.
2810
According to our preliminary annotation the 16SrIII phytoplasma genomes did not encode an ABC-type methionine
transport system. Moreover, as expected, the draft genomes
did not encode an Amp- or Imp-type membrane protein, but
rather a protein similar to the immunodominant membrane
protein of the Western X disease phytoplasmas, named IdpA
(Blomquist et al., 2001). In addition, they displayed a protein
with low similarity to the F1 lipoprotein of Mycoplasma
agalactiae. No similarity of this protein to the VMP1 hypervariable protein of the stolbur phytoplasma (Cimerman et al.,
2009) was recognized.
The 16SrIII group genome drafts displayed some 30 common
homologue gene families with no homology in ‘Ca. P. asteris’
but with homologues in one or more of the other phytoplasmas (more in ‘Ca. P. mali’ than in ‘Ca. P. australiense’),
and at least 40 additional families that were specifically
conserved in the 16SrIII group. Such gene families were
predicted to encode proteins of unknown function, phage/
EcDNA-related proteins, peptidases/proteases, methylases/
methyltransferases, and proteins related to ABC transport.
Furthermore, one homologous 16SrIII phytoplasma-specific
gene family was found to encode a protein with similarity to
a rhoptry protein of Plasmodium yoelii yoelii (gi23481286).
Homology to rhoptry, a group of proteins released during
host cell invasion by the intracellular parasites apicomplexans, has also been found in P80, an antigen of Mycoplasma hominis that is thought to be secreted by a novel,
protease-assisted mechanism (Hopfe et al., 2004).
Among the genes found in all four genomes drafted here
we identified several gene fragments (six, 14, four and 10 in
MA1p, JR1p, MW1p and VACp, respectively) encoding
putative proteins with significant similarity to a central
domain of the DotG/IcmE protein of Legionella spp.
Although similar proteins have also been predicted in other
phytoplasmas (Bai et al., 2006; Tran-Nguyen et al., 2008),
the similarity to DotG/IcmE of those found in the 16SrIII
phytoplasma was strikingly high (a putative protein of
VACp was 71.6 % similar and 53.3 % identical to accession
no. E1Y0Z0 of Legionella pneumophila; Fig. S2).
In stark contrast to the core genome, the approximately
180 genes present in AYWBp, but not identified in any of
the four 16SrIII group strains, mostly encode proteins of
unknown function, a thiamine biosynthesis protein, an
asparagine synthase, a dihydropteroate synthase, a pyrophosphokinase, genes classified as proteases, ABC transporter components, DNA-binding proteins, transposases
and helicases, and the tengu effector. Several of the above
are known components of PMUs (Bai et al., 2006).
Differences among the genome drafts of the four
16SrIII phytoplasmas
In the revised comparison, 444 homologue gene families
were shared by all four strains. The number of homologue
gene families that were strain-specific or shared by some
but not all strains are listed in Table 6, grouped according
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Microbiology 158
Genome drafts of four 16SrIII phytoplasmas
Table 6. Number of homologue gene families, subdivided into groups on the basis of their annotation, that were not shared among all
four genome drafts
to their preliminary annotation. Differences were scored
for the effectors SAP11, found only in VACp and JR1p, and
for the effector SAP54, shared by MA1 and JR1. In both
cases the orthologues were extremely well conserved,
particularly downstream of the secretion signal (.85 %
amino acid identity).
In summary, the 16SrIII phytoplasmas do not differ among
themselves in genes controlling any essential metabolic
function but rather in genes related to mechanisms of gene
mobilization and transfer and to host interaction, namely
membrane transporters, proteases/peptidases, methylases
and some proteins of unknown function that can be
tentatively associated with mobile elements (PMUs,
EcDNAs, plasmids).
Secreted proteins
Using the method developed by Bai et al. (2009) we
investigated the occurrence of secreted proteins (SPs) that
may be released in the host phloem by the Sec transport
system. We identified 46 putative proteins in JR1p, 42 in
MA1p, 33 in MW1p and 38 in VACp that displayed a Sec
secretion signal and not a membrane localization domain
in the processed protein. These figures are lower than those
found in ‘Ca. P. asteris’ (55 in AYWBp and 71 in OYp).
The difference may be due, at least in part, to fragmentation of the genomes into contigs and the consequent
http://mic.sgmjournals.org
truncation in the predicted translation products, which
impeded the identification of the secretion signal.
We collected the sequence of the 368 putative SPs encoded
by the eight phytoplasma genomes available (AYWBp,
OYp, AUp, APp, VACp, JPp, MA1p and MW1p) and
found that 67 SPs were strain-specific, while the other 301
were shared by at least two strains. With regard to the
16SrIII phytoplasmas, only six of the 159 SPs predicted in
the drafts were strain-specific, 51 were shared only by
16SrIII phytoplasmas, while the large majority were
conserved among the genus ‘Ca. Phytoplasma’.
Sequence diversity
Examination of the protein complement of the genome
draft of each strain revealed that several genes found only
in one of the drafts had homology with genes of more
distantly related phytoplasmas. For example, the VACp
genome encodes a gene for a homologue of PAM_541 of
OYp that is not found in any other of the eight genomes
examined, and a methyltransferase that is shared only with
AUp. The orthologous sequences were highly similar; in
particular, PAM_541 was nearly identical (one singlenucleotide polymorphism in 736 nt) to its VACp homologue, strongly suggesting that these genes did not evolve
independently but rather were horizontally transferred. In a
wider investigation, close examination of the sequence
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
2811
F. Saccardo and others
differences revealed that the similarities among putative
proteins within families did not always reflect phylogenetic
order. As a case study, similarity scores between VACp and
AYWBp orthologues in all multiple alignments were
compared with those among the orthologues in the
16SrIII phytoplasmas. Twenty-five genes were found to
be more similar between AYWBp and VACp than between
VACp and any other of the 16SrIII phytoplasmas (Table
S2), and were presumably shared between VACp and
AYWBp due to horizontal gene transfer (HGT). They
include nine hypothetical proteins predicted to be secreted.
AYWBp being only one of the numerous possible donors,
HGT should have been a major driving force in the
evolution of phytoplasma genomes, with particular reference to potential effector proteins.
DISCUSSION
Phytoplasmas are totally recalcitrant to axenic cultivation
and grow very slowly in the plant host, so the genome
approach offers a unique opportunity to understand their
key features by comparative analysis. Evidence suggesting
the existence of variant strains that are characterized by a
reduced virulence (Marcone et al., 2010; Oshima et al.,
2001; Seemüller et al., 2011) makes us confident that a
clearer picture of the molecular basis of phytoplasma
virulence may emerge from multiple comparisons of the
genomic characteristics of strains with different degrees of
virulence. However, the application of large-scale comparative genomics to phytoplasmas is hampered by contamination of the sample by a large amount of host DNA.
Methods that include time-consuming DNA purification
(such as PFGE or CsCl gradients) are clearly inadequate for
processing a relatively large number of samples. On the
other hand, post-sequence processing of the data, with
assignment of the contigs generated by the assembler, is
affordable with standard computing equipment only with a
limited number of relatively large contigs.
We have addressed the above issues and developed a
multiple-step assembly method based on reads selection
and contig filtration that proved highly efficient. We
obtained genome drafts of relatively high quality (as
demonstrated by the very low number of genes that were
found to be missing in only one of the four drafts) from
diagnostic/standard nucleic acid preparations. In addition
we have developed a method based on reads crossmapping
for direct rescue from the reads dataset of gene sequences
that were not included or not predicted in the primary
assembly. In the four drafts generated in this work, direct
analysis of the reads dataset allowed the definition of
protein complements that we have estimated to be nearly
complete (4–10 genes missing per draft).
The results of comparative genome analysis showed limited
differences in the housekeeping gene set between the
16SrIII phytoplasmas and the other polyphagous phytoplasmas previously sequenced. The results therefore
2812
support the view that the phytoplasmas originated by
radiation from an ancestral organism that already had a
reduced genome, as the same indispensable 300 gene
functions are shared among distantly related phytoplasmas
(Kube et al., 2008). The sole relevant difference concerning
the 16SrIII phytoplasmas relates to our inability to locate
the genes encoding the GroES/GroEL chaperonin system.
Given the completeness of the drafts and the failure to find
either GroES or GroEL putative coding sequences in all
four strains, the probability that the two genes were not
discovered by chance is low. Nevertheless, further investigations are needed to confirm whether this phytoplasma
group actually lacks this highly conserved function.
In addition to the common set, the 16SrIII phytoplasmas
displayed an additional gene set, variable in composition,
possibly shared among even phylogenetically distant phytoplasmas through HGT. In this study, HGT was found to
affect not only gain of genes from distantly related strains
but also polymorphism in orthologous genes, and may be a
major driving force in shaping phytoplasma genomes during
adaptation to different hosts.
The novel sequence information obtained in this work
supports the notion that the abundance of proteins involved
in proteolysis, DNA methylation, and inward and outward
trafficking through the phytoplasma cellular membrane is a
general rule in phytoplasma genomes. Notably, the 16SrIII
phytoplasmas share a relatively large number of proteins
with extensive similarity to a central domain of IcmE/DotG
of Legionella pneumophila, an inner membrane-bound
component of the core complex of the type IVB secretion
system (T4SSB; Vincent et al., 2006). The critical role of
IcmE/DotG within the core structure, which is also
conserved in the distantly related T4SSA of Agrobacterium
tumefaciens (Vincent et al., 2006), and the involvement of
T4SS in plasmid conjugation systems and in adapted
conjugation systems, suggest that further features of the
phytoplasma interaction capabilities remain to be unveiled.
IcmE/DotG homologues may play a direct role in effector
delivery and membrane trafficking, but it is intriguing to
hypothesize that the IcmE/DotG and rhoptry-like homologues that we detected in the 16SrIII phytoplasmas, together
with several proteins that are presently annotated as phage/
plasmid-associated proteins, are part of an as-yet-undiscovered gene-sharing mechanism that grants access to a
large, shared gene pool to organisms with a reduced genome,
defining their host adaptation-driven evolutionary strategy.
Analysis of the draft sequences suggests significant
differences with reference to the few proteins so far
recognized as effectors. VACp and JR1p shared the effector
SAP11 (Bai et al., 2009), while the effector SAP54 was
found only in MA1p and JR1p. Indeed, the two latter
strains displayed much more pronounced symptoms of
virescence and phyllody than the other strains, related to
the effect of SAP54 (MacLean et al., 2011). SAP54 and its
homologues in MA1p and JR1p were highly similar, with
very few amino acid differences, suggesting a unique origin
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Microbiology 158
Genome drafts of four 16SrIII phytoplasmas
of the phyllody-causing effector and its recent spread
among distantly related phytoplasmas. The Italian Clover
Phyllody phytoplasma, which has a very large host range
(Osler et al., 1994), has been reported in double infections
with strains of ‘Ca. P. asteris’ (Firrao et al., 1996b), which
may have favoured horizontal transfer. Whether the difference in the severity of the virescence/phyllody symptoms
could be ascribed to the minor amino acid sequence
differences or to other factors remains to be established.
Cettul & Firrao (2011) reported marked differences in the
symptoms displayed by different Arabidopsis thaliana
accessions infected with the same phytoplasma, as well as
by the same accession infected by different phytoplasmas,
suggesting that the affinity between the effectors and their
target may play a role in determining the host response.
Conversely, an orthologue of the effector tengu could not be
found in any of the four drafts, although all strains except
MW1 showed clear symptoms of shoot proliferation, and
VACp was characterized by dwarfism, suggesting that other
effector(s) could be responsible for those growth disorders.
Although there are relatively few proteins that are differentially displayed by strains presenting similar typical
symptoms, the number of strains examined is still too limited to attempt a hypothesis of association. Nevertheless,
this gene selection may provide the basis for future studies to
clarify the origin of different symptoms such as small leaves,
small flowers, general plant decline, and dwarfism.
Carraro, L., Osler, R., Loi, N. & Favali, M. A. (1991). Transmission
characteristics of the clover phyllody agent by dodder. J Phytopathol
133, 15–22.
Cettul, E. & Firrao, G. (2011). Development of phytoplasma-induced
flower symptoms in Arabidopsis thaliana. Physiol Mol Plant Pathol 76,
204–211.
Cimerman, A., Arnaud, G. & Foissac, X. (2006). Stolbur phytoplasma
genome survey achieved using a suppression subtractive hybridization
approach with high specificity. Appl Environ Microbiol 72, 3274–3283.
Cimerman, A., Pacifico, D., Salar, P., Marzachı̀, C. & Foissac, X.
(2009). Striking diversity of vmp1, a variable gene encoding a puta-
tive membrane protein of the stolbur phytoplasma. Appl Environ
Microbiol 75, 2951–2957.
Delcher, A. L., Bratke, K. A., Powers, E. C. & Salzberg, S. L. (2007).
Identifying bacterial genes and endosymbiont DNA with Glimmer.
Bioinformatics 23, 673–679.
Firrao, G., Smart, C. D. & Kirkpatrick, B. C. (1996a). Physical map of
the Western X-disease phytoplasma chromosome. J Bacteriol 178,
3985–3988.
Firrao, G., Carraro, L., Gobbi, E. & Locci, R. (1996b). Molecular
characterization of a phytoplasma causing phyllody in clover and
other herbaceous hosts in Northern Italy. Eur J Plant Pathol 102, 817–
822.
Firrao, G., Gibb, K. & Streten, C. (2005). Short taxonomic guide to the
genus ‘‘Candidatus Phytoplasma’’. J Plant Pathol 87, 249–263.
Galbraith, D. W., Harkins, K. R., Maddox, J. M., Ayres, N. M., Sharma,
D. P. & Firoozabady, E. (1983). Rapid flow cytometric analysis of the
cell cycle in intact plant tissues. Science 220, 1049–1051.
Garcia-Chapa, M., Batlle, A., Rekab, D., Rosquete, M. R. & Firrao, G.
(2004). PCR-mediated whole genome amplification of phytoplasmas.
J Microbiol Methods 56, 231–242.
ACKNOWLEDGEMENTS
Gasparich, G. E. (2010). Spiroplasmas and phytoplasmas: microbes
Drs Federica Cattonaro and Eleonora Di Centa, IGA Technology
services s.r.l., are gratefully acknowledged for their valuable help with
sequencing and preliminary assembly.
associated with plant hosts. Biologicals 38, 193–203.
Glass, J. I., Assad-Garcia, N., Alperovich, N., Yooseph, S., Lewis,
M. R., Maruf, M., Hutchison, C. A., III, Smith, H. O. & Venter, J. C.
(2006). Essential genes of a minimal bacterium. Proc Natl Acad Sci
U S A 103, 425–430.
REFERENCES
Griffiths, H. M., Gundersen, D. E., Sinclair, W. A., Lee, I.-M. & Davis,
R. E. (1994). Mycoplasmalike organisms from milkweed, goldenrod,
Ahrens, U. & Seemuller, E. (1992). Detection of DNA of plant
and spyrea, represent two new 16SrRNA subgroups and three new
strain subclusters related to peach X-disease MLOs. Can J Phytopathol
16, 255–260.
pathogenic mycoplasmalike organisms by a polymerase chain reaction
that amplifies a sequence of the 16S rRNA gene. Phytopathology 82,
828–832.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J.
(1990). Basic local alignment search tool. J Mol Biol 215, 403–410.
Gundersen, D. E., Lee, I.-M., Schaff, D. A., Harrison, N. A., Chang,
C. J., Davis, R. E. & Kingsbury, D. T. (1996). Genomic diversity and
differentiation among phytoplasma strains in 16S rRNA groups I
(aster yellows and related phytoplasmas) and III (X-disease and
related phytoplasmas). Int J Syst Bacteriol 46, 64–75.
Bai, X., Zhang, J., Ewing, A., Miller, S. A., Jancso Radek, A.,
Shevchenko, D. V., Tsukerman, K., Walunas, T., Lapidus, A. & other
authors (2006). Living with genome instability: the adaptation of
Hopfe, M., Hoffmann, R. & Henrich, B. (2004). P80, the HinT
phytoplasmas to diverse environments of their insect and plant hosts.
J Bacteriol 188, 3682–3696.
interacting membrane protein, is a secreted antigen of Mycoplasma
hominis. BMC Microbiol 4, 46.
Bai, X., Correa, V. R., Toruño, T. Y., Ammar, D., Kamoun, S. &
Hogenhout, S. A. (2009). AY-WB phytoplasma secretes a protein that
Hoshi, A., Oshima, K., Kakizawa, S., Ishii, Y., Ozeki, J., Hashimoto,
M., Komatsu, K., Kagiwada, S., Yamaji, Y. & Namba, S. (2009). A
targets plant cell nuclei. Mol Plant Microbe Interact 22, 18–30.
Blomquist, C. L., Barbara, D. J., Davies, D. L., Clark, M. F. &
Kirkpatrick, B. C. (2001). An immunodominant membrane protein
unique virulence factor for proliferation and dwarfism in plants
identified from a phytopathogenic bacterium. Proc Natl Acad Sci
U S A 106, 6416–6421.
gene from the Western X-disease phytoplasma is distinct from those
of other phytoplasmas. Microbiology 147, 571–580.
Huang, W., Umbach, D. M., Vincent Jordan, N., Abell, A. N., Johnson,
G. L. & Li, L. (2011). Efficiently identifying genome-wide changes with
Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J.,
Bealer, K. & Madden, T. L. (2009). BLAST+: architecture and
next-generation sequencing data. Nucleic Acids Res 39, e130.
applications. BMC Bioinformatics 10, 421.
http://mic.sgmjournals.org
IRPCM Phytoplasma/Spiroplasma Working Team – Phytoplasma
taxonomy group (2004). ‘Candidatus Phytoplasma’, a taxon for the
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
2813
F. Saccardo and others
wall-less, non-helical prokaryotes that colonize plant phloem and
insects. Int J Syst Evol Microbiol 54, 1243–1255.
MAFFT version 5:
improvement in accuracy of multiple sequence alignment. Nucleic
Acids Res 33, 511–518.
Katoh, K., Kuma, K., Toh, H. & Miyata, T. (2005).
Kawar, P. G., Pagariya, M. C., Dixit, G. B. & Prasad, D. T. (2010).
Identification and isolation of SCGS phytoplasma-specific fragments
by riboprofiling and development of specific diagnostic tool. J Plant
Biochem Biotechnol 19, 185–194.
Kollar, A. & Seemüller, E. (1989). Base composition of the DNA of
mycoplasma-like organisms associated with various plant diseases.
J Phytopathol 127, 177–186.
Kube, M., Schneider, B., Kuhl, H., Dandekar, T., Heitmann, K.,
Migdoll, A. M., Reinhardt, R. & Seemüller, E. (2008). The linear
chromosome of the plant-pathogenic mycoplasma ‘Candidatus
Phytoplasma mali’. BMC Genomics 9, 306.
Kube, M., Mitrovic, J., Duduk, B., Rabus, R. & Seemüller, E. (2012).
Current view on phytoplasma genomes and encoded metabolism.
Scientific World Journal 2012, 185942.
Kurtz, S., Phillippy, A., Delcher, A. L., Smoot, M., Shumway, M.,
Antonescu, C. & Salzberg, S. L. (2004). Versatile and open software
for comparing large genomes. Genome Biol 5, R12.
Larkin, M. A., Blackshields, G., Brown, N. P., Chenna, R., McGettigan,
P. A., McWilliam, H., Valentin, F., Wallace, I. M., Wilm, A. & other
authors (2007). CLUSTAL W and CLUSTAL X version 2.0. Bioinformatics
23, 2947–2948.
Lee, I.-M., Davis, R. E. & Gundersen-Rindal, D. E. (2000).
Phytoplasma: phytopathogenic mollicutes. Annu Rev Microbiol 54,
221–255.
Li, H. & Durbin, R. (2009). Fast and accurate short read alignment with
Burrows–Wheeler transform. Bioinformatics 25, 1754–1760.
Liefting, L. W. & Kirkpatrick, B. C. (2003). Cosmid cloning and sample
sequencing of the genome of the uncultivable mollicute, Western Xdisease phytoplasma, using DNA purified by pulsed-field gel
electrophoresis. FEMS Microbiol Lett 221, 203–211.
Oshima, K., Kakizawa, S., Nishigawa, H., Jung, H.-Y., Wei, W.,
Suzuki, S., Arashida, R., Nakata, D., Miyata, S. & other authors
(2004). Reductive evolution suggested from the complete genome
sequence of a plant-pathogenic phytoplasma. Nat Genet 36, 27–29.
Osler, R., Firrao, G., Carraro, L., Loi, N., Musetti, R. & Chen, T. A.
(1994). Biodiversity of plant MLOs in a well defined ecological area.
IOM Lett 3, 286–287.
Peterlongo, P. & Chikhi, R. (2012). Mapsembler, targeted and micro
assembly of large NGS datasets on a desktop computer. BMC
Bioinformatics 13, 48.
Poptsova, M. S. & Gogarten, J. P. (2007). BranchClust: a phylogenetic
algorithm for selecting gene families. BMC Bioinformatics 8, 120.
Seemüller, E., Kampmann, M., Kiss, E. & Schneider, B. (2011). HflB
gene-based phytopathogenic classification of ‘Candidatus Phytoplasma
mali’ strains and evidence that strain composition determines virulence
in multiply infected apple trees. Mol Plant Microbe Interact 24, 1258–
1266.
Sugio, A., Kingdom, H. N., MacLean, A. M., Grieve, V. M. &
Hogenhout, S. A. (2011). Phytoplasma protein effector SAP11
enhances insect vector reproduction by manipulating plant development and defense hormone biosynthesis. Proc Natl Acad Sci U S A
108, E1254–E1263.
Tran-Nguyen, L. T. & Gibb, K. S. (2007). Optimizing phytoplasma
DNA purification for genome analysis. J Biomol Tech 18, 104–112.
Tran-Nguyen, L. T., Kube, M., Schneider, B., Reinhardt, R. & Gibb, K. S.
(2008). Comparative genome analysis of ‘‘Candidatus Phytoplasma
australiense’’ (subgroup tuf-Australia I; rp-A) and ‘‘Ca. Phytoplasma
asteris’’ strains OY-M and AY-WB. J Bacteriol 190, 3979–3991.
Valiunas, D., Samuitiene, M., Rasomavicius, V., Navalinskiene, M.,
Staniulis, J. & Davis, R. E. (2007). Subgroup 16SrIII-F phytoplasma
strains in an invasive plant, Heracleum sosnowskyi, and an
ornamental, Dictamnus albus. J Plant Pathol 89, 137–140.
Vincent, C. D., Friedman, J. R., Jeong, K. C., Buford, E. C., Miller, J. L.
& Vogel, J. P. (2006). Identification of the core transmembrane
complex of the Legionella Dot/Icm type IV secretion system. Mol
Microbiol 62, 1278–1291.
MacLean, A. M., Sugio, A., Makarova, O. V., Findlay, K. C., Grieve,
V. M., Tóth, R., Nicolaisen, M. & Hogenhout, S. A. (2011).
Wall, D. P., Fraser, H. B. & Hirsh, A. E. (2003). Detecting putative
Phytoplasma effector SAP54 induces indeterminate leaf-like flower
development in Arabidopsis plants. Plant Physiol 157, 831–841.
Wei, W., Davis, R. E., Lee, I.-M. & Zhao, Y. (2007). Computer-
Marcelletti, S., Ferrante, P., Petriccione, M., Firrao, G. & Scortichini,
M. (2011). Pseudomonas syringae pv. actinidiae draft genomes
comparison reveal strain-specific features involved in adaptation
and virulence to Actinidia species. PLoS ONE 6, e27297.
Marcone, C., Jarausch, B. & Jarausch, W. (2010). ‘Candidatus
orthologs. Bioinformatics 19, 1710–1711.
simulated RFLP analysis of 16S rRNA genes: identification of ten new
phytoplasma groups. Int J Syst Evol Microbiol 57, 1855–1867.
Zerbino, D. R. & Birney, E. (2008). Velvet: algorithms for de novo
short read assembly using de Bruijn graphs. Genome Res 18, 821–829.
Zhao, Y., Wei, W., Lee, I.-M., Shao, J., Suo, X. & Davis, R. E. (2009).
Phytoplasma prunorum’ the causal agent of European stone fruit
yellows: an overview. J Plant Pathol 92, 19–34.
Construction of an interactive online phytoplasma classification tool,
iPhyClassifier, and its application in analysis of the peach X-disease
phytoplasma group (16SrIII). Int J Syst Evol Microbiol 59, 2582–2593.
Oshima, K., Shiomi, T., Kuboyama, T., Sawayanagi, T., Nishigawa,
H., Kakizawa, S., Miyata, S., Ugaki, M. & Namba, S. (2001). Isolation
Zonneveld, B. J. M., Leitch, I. J. & Bennett, M. D. (2005). First nuclear
and characterization of derivative lines of the Onion Yellows
phytoplasma that do not cause stunting or phloem hyperplasia.
Phytopathology 91, 1024–1029.
2814
DNA amounts in more than 300 angiosperms. Ann Bot (Lond) 96,
229–244.
Edited by: J. Renaudin
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Wed, 14 Jun 2017 18:36:27
Microbiology 158