International Journal of Systematic and Evolutionary Microbiology (2014), 64, 357–365
DOI 10.1099/ijs.0.057927-0
Genotype to phenotype: identification of diagnostic
vibrio phenotypes using whole genome sequences
Gilda Rose S. Amaral, Graciela M. Dias, Michiyo Wellington-Oguri,
Luciane Chimetto, Mariana E. Campeão, Fabiano L. Thompson
and Cristiane C. Thompson
Institute of Biology, Federal University of Rio de Janeiro (UFRJ), Brazil
Correspondence
Fabiano L. Thompson
[email protected]
Cristiane C. Thompson
[email protected]
Vibrios are ubiquitous in the aquatic environment and can be found in association with animal or
plant hosts. The range of ecological relationships includes pathogenic and mutualistic
associations. To gain a better understanding of the ecology of these microbes, it is important to
determine their phenotypic features. However, the traditional phenotypic characterization of
vibrios has been expensive, time-consuming and restricted in scope to a limited number of
features. In addition, most of the commercial systems applied for phenotypic characterization
cannot characterize the broad spectrum of environmental strains. A reliable and possible
alternative is to obtain phenotypic information directly from whole genome sequences. The aim of
the present study was to evaluate the usefulness of whole genome sequences as a source of
phenotypic information. We performed a comparison of the vibrio phenotypes obtained from the
literature with the phenotypes obtained from whole genome sequences. We observed a
significant correlation between the previously published phenotypic data and the phenotypic data
retrieved from whole genome sequences of vibrios. Analysis of 26 vibrio genomes revealed that all
genes coding for the specific proteins involved in the metabolic pathways responsible for positive
phenotypes of the 14 diagnostic features (Voges–Proskauer reaction, indole production, arginine
dihydrolase, ornithine decarboxylase, utilization of myo-inositol, sucrose and L-leucine, and
fermentation of D-mannitol, D-sorbitol, L-arabinose, trehalose, cellobiose, D-mannose and D-galactose) were found in the majority of the vibrio genomes. Vibrio species that were negative for
a given phenotype revealed the absence of all or several genes involved in the respective
biochemical pathways, indicating the utility of this approach to characterize the phenotypes of
vibrios. The absence of the global regulation and regulatory proteins in the Vibrio
parahaemolyticus genome indicated a non-vibrio phenotype. Whole genome sequences
represent an important source for the phenotypic identification of vibrios.
Abbreviations: CRP, cAMP receptor protein; DDH, DNA–DNA
hybridization; GGD, genome-to-genome distance; GGDC, genome-to-genome distance calculator.
A supplementary figure and two supplementary tables are available with
the online version of this paper.
057927 © 2014 IUMS
INTRODUCTION
Microbial taxonomy comprises the identification of
isolates within known species, classification of new isolates
(creation of new taxa) and nomenclature. The taxonomic
schemes used for identification and classification need to
be reliable, reproducible and informative. It is also
desirable that the schemes are easy and affordable for the
end-users. Phenotypic tables listing useful diagnostic
features for identification have been extensively used since
the early developments of microbial taxonomy (for
example Bergey’s Manual of Determinative Bacteriology,
9th edition), partially as a consequence of the influence of
other disciplines of biology (e.g. botany) in the early
developments of microbial taxonomy (Drews, 2000), and
partially due to the technological limitations associated
with these early periods of taxonomy. However, microbiologists soon realized that taxonomic schemes based
on phenotypic features had several shortcomings: different
species could have indistinguishable phenotypes (e.g. for
the Vibrio alginolyticus species group), strains of the same
species could also have different phenotypes (e.g. colony
variation, enzyme activities), results were sometimes discrepant
even for the same strain, and interlaboratory reproducibility was lacking. One astonishing consequence of the
phenotype-based taxonomy developed in the first half of
the last century was a multiplication of novel species,
leading to a complete revision in the number of recognized
species between 1957 and 1974. More than 90 % of all the
species described in the 1957 edition of Bergey’s Manual of
Determinative Bacteriology were subsequently reclassified as
synonymous or excluded from the taxonomic schemes due
to the application of more rigorous procedures to assess
strain similarity (e.g. DNA–DNA hybridization, DDH).
The impact of molecular tools (e.g. molecular fingerprinting, ΔTm, DDH and DNA sequencing) on taxonomy has
been extremely high, culminating in the establishment of
polyphasic taxonomy (Vandamme et al., 1996). Taxonomic
schemes are currently based on the evolutionary relationships determined using gene sequences (most notably the
16S rRNA gene) and genomic information (Konstantinidis
& Stackebrandt, 2013; Thompson et al., 2009).
The advent of whole genome sequencing allowed the
establishment of taxonomic schemes based on the evolutionary information contained in the genome sequences,
such as the Karlin signature, average amino acid identity,
supertrees, and genome-to-genome distance (GGD, a type
of in silico DDH). It is becoming clear that bacterial species
can be defined on the basis of these features. A common
definition would consider that strains from the same species
share <10 dissimilarity in Karlin signature, >95 % amino
acid identity, >95 % similarity based on multiple alignment
genes and >70 % in silico GGD (Thompson et al., 2009,
2011, 2013a, b). It is reasonable to suggest that microbial
taxonomy will become steadily more dependent on genome
sequences rather than on the classic phenotypic characterization using time-consuming, laborious wet laboratory tests.
In addition, the extremely close phenotypic similarities
observed for some types of bacteria (e.g. vibrios) are a
hindrance to the routine phenotypic identification of these
microbes.
Vibrios are ubiquitous in the aquatic environment and can be
found in association with animal or plant hosts. Some species
are animal (Vibrio coralliilyticus and Vibrio harveyi) or human
(Vibrio cholerae, Vibrio parahaemolyticus and Vibrio vulnificus) pathogens, and others form mutualistic relationships
with marine organisms (e.g. that between Vibrio fischeri and
the squid Euprymna scolopes). Vibrios were among the first
taxonomic groups to be evaluated by means of multilocus
sequence analysis (Thompson et al., 2005), and genomic
taxonomy (Thompson et al., 2009). We also developed a
surrogate methodology to determine genome similarity based
on ΔTm using a real-time PCR platform as an alternative to
the classic DDH based on the Ezaki methodology (Moreira
et al., 2011). However, it has become evident that the
phenotypic characterization of vibrios is problematic. The
phenotypic identification of vibrios remains a difficult task,
particularly for some sister species; for example, V. cholerae–
Vibrio mimicus, V. coralliilyticus–Vibrio tubiashii–Vibrio
brasiliensis, V. alginolyticus–V. parahaemolyticus–Vibrio natriegens and V. harveyi–Vibrio campbellii have similar phenotypes. Vibrio sister species have highly similar genomes
(around 70 % DDH) and nearly indistinguishable phenotypes. In spite of having similar genomes, these species can
be recognized as different evolutionary units in nature
(Hunt et al., 2008; Preheim et al., 2011). To gain a better
understanding of the ecology of these microbes, it is
important to obtain phenotypic information based on whole
genome sequences. The aim of the present study was to
evaluate the usefulness of genome sequences as a source of
phenotypic information. We performed a comparison of the
vibrio phenotypes obtained from the literature with those
obtained from whole genome sequences. We also developed a
prototype program, vibriophenotyping, to determine the
major diagnostic phenotypic features for the identification of
vibrios.
METHODS
Genome sequence data. We analysed 26 strains corresponding
to 12 vibrio species (Table S1, available in the online Supplementary Material). The genome sequences were downloaded from the
National Center for Biotechnology Information (NCBI) database.
The genomes were annotated using the Rapid Annotation using
Subsystem Technology (RAST) server (Aziz et al., 2008). This server is based
on manually curated subsystems and subsystem-based protein
families that automatically guarantee a high degree of assignment
consistency.
Identification of genes responsible for different diagnostic
phenotypes. We chose the 14 diagnostic biochemical features
(Voges–Proskauer reaction, indole production, arginine dihydrolase,
ornithine decarboxylase, utilization of myo-inositol, sucrose and L-leucine, and fermentation of D-mannitol, D-sorbitol, L-arabinose,
trehalose, cellobiose, D-mannose and D-galactose) that have been
applied in previous studies to identify sister species of vibrios (Alsina
& Blanch, 1994; Farmer & Hickman-Brenner, 2006; Noguerola &
Blanch, 2008; Thompson et al., 2004). For each diagnostic feature we
established a list of corresponding genes (Table S2). We detected the
genes coding for the proteins responsible for these features using the
RAST program and the KEGG metabolic database (http://www.genome.jp/kegg/).
The BLASTP algorithm (Altschul et al., 1990) was
used to identify genes associated with the biochemical pathways. The
program ExPASy translate (ExPASy Bioinformatics Resource Portal)
was used to analyse protein sequences. The phenotypic features
obtained from the literature and the features obtained from the
genome searches were compared by means of numerical coefficients.
The search for genes related to the biochemical characteristics of
interest generated a table of presence or absence that was compared
with the data from previously published studies. We used the Jaccard
index to obtain a similarity coefficient between pairs of strains using
the two datasets. Species were compared pairwise, generating a
similarity matrix for both the genotype and the phenotype.
Correlation analysis was performed based on the Pearson correlation
using Excel software.
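The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow (the correlation in the study was computed in Excel); the strain names and 0/1 profiles below are hypothetical toy data, and the function names are introduced here for illustration only.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard index of two equal-length presence/absence (1/0) profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def pairwise_similarity(profiles):
    """Jaccard similarity for every unordered pair of strains."""
    return {
        (s1, s2): jaccard(profiles[s1], profiles[s2])
        for s1, s2 in combinations(sorted(profiles), 2)
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy profiles over five features (1 = positive/present, 0 = negative/absent).
genotype = {"A": [1, 1, 0, 1, 0], "B": [1, 0, 0, 1, 1], "C": [0, 1, 1, 0, 0]}
phenotype = {"A": [1, 1, 0, 1, 0], "B": [1, 0, 0, 1, 0], "C": [0, 1, 1, 1, 0]}

geno_sim = pairwise_similarity(genotype)
pheno_sim = pairwise_similarity(phenotype)
pairs = sorted(geno_sim)
r = pearson([geno_sim[p] for p in pairs], [pheno_sim[p] for p in pairs])
print(f"Pearson correlation between genotypic and phenotypic similarity: {r:.2f}")
```

Each pair of strains contributes one point (genotypic similarity, phenotypic similarity), which is the kind of scatter summarized in Fig. 1(b).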
Genome-to-genome distance calculator (GGDC). The genome
distance among the 27 genomes was calculated using GGDC (Auch
et al., 2010). The GGDC values were compared with the DDH values
obtained in previous studies (Thompson et al., 2004). The previous
DDH estimates were all based on empirical data using the Ezaki
methodology (Willems et al., 2001). The regression-based DDH
estimate uses parameters from a robust-line fit, whereas the
threshold-based DDH estimate applies the distance threshold leading
to the lowest error ratio in predicting whether DDH is >70 % or
<70 % (Auch et al., 2010).
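The idea behind a threshold-based estimate can be illustrated with a small sketch: among candidate distance thresholds, pick the one that misclassifies the fewest pairs with respect to the 70 % DDH boundary. This is not the GGDC implementation; the function names, the candidate set (the observed distances themselves) and the toy values are all assumptions made for illustration.

```python
def error_ratio(distances, ddh_values, threshold, cutoff=70.0):
    """Fraction of pairs where 'distance <= threshold' disagrees with 'DDH > cutoff'."""
    wrong = sum(
        (d <= threshold) != (ddh > cutoff)
        for d, ddh in zip(distances, ddh_values)
    )
    return wrong / len(distances)


def best_threshold(distances, ddh_values, cutoff=70.0):
    """Observed distance that, used as a threshold, gives the lowest error ratio."""
    return min(
        sorted(set(distances)),
        key=lambda t: error_ratio(distances, ddh_values, t, cutoff),
    )


# Hypothetical genome distances and wet-lab DDH values (%) for six strain pairs.
toy_distances = [0.02, 0.05, 0.08, 0.15, 0.20, 0.30]
toy_ddh = [95.0, 82.0, 71.0, 64.0, 40.0, 22.0]

threshold = best_threshold(toy_distances, toy_ddh)
print(threshold, error_ratio(toy_distances, toy_ddh, threshold))
```

With these toy values every pair at or below the selected distance has DDH above 70 %, so the error ratio at the chosen threshold is zero; real data would rarely separate this cleanly.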
Development of the prototype program vibriophenotyping. To
automate searches for enzymes related to phenotypes of interest, a
program was written in Python. The program vibriophenotyping uses
the BLAST algorithm to make searches. The program has an associated
database, which is composed of protein sequences related to the
analysed phenotypes (see Table S1). These sequences were used as
queries for BLAST searches. The BLAST results were analysed by the
vibriophenotyping program in the following steps: (i) the user
provides an amino acid FASTA file with coding sequences as input; (ii)
the program verifies whether hits were found for the enzyme being searched; (iii) it determines
which orthologue in the database has the highest score, for use in the
subsequent analysis; (iv) it checks whether the identity is greater or less than
40 %; and (v) it checks whether the sequence length is greater or less than 70 %
of the query length.
After these steps, if all the enzymes involved in a metabolic pathway
are present in the genome, the organism is considered positive for this
phenotype; if one or more enzymes in a metabolic pathway are
absent, the organism is considered negative. The output of the
program comprises a table and a text file. The table contains a list of
phenotypes, whereas the text file contains the BLAST results as well as
the identity and length cuts, and the phenotypes with enzymes present
or absent. The program is available at http://www.microbiologia.biologia.ufrj.br/.
RESULTS
The genomic relatedness obtained by the in silico GGD
hybridization methodology was in agreement with previous
measurements of DDH (Fig. 1a). Overall, the fluorometric
microplate methodology resulted in slightly higher values
(approx. 5.9 % higher) than the in silico methodology. The
comparison of DDH values obtained by the fluorometric
microplate methodology with the DDH values obtained by
the in silico GGD methodology provided a significant
correlation, with fluorometric values >70 % corresponding
to GGD values >70 %. For instance, V. campbellii
CAIM519T and the newly isolated proteorhodopsin-bearing
V. campbellii PEL22A shared >70 % GGD. Similarly, all
strains of V. mimicus shared >70 % GGD, whereas strains of
V. mimicus and V. cholerae shared <70 % GGD.
Fig. 1. Genomic and phenotypic information based on whole
genome sequences. Correlation between the DDH values
obtained from the literature and in silico GGD values from the
pairs of vibrios. The points on the graph correspond to pairs of the
test species. (a) Regression curve between in vitro (DDH, %) and
in silico (GGD, %) data; R² = 0.84, Pearson correlation = 0.92.
(b) Regression curve between genotypic similarity (%) and
phenotypic similarity (%); R² = 0.48, Pearson correlation = 0.68.
Genome-enabled phenotypic identification of
vibrios
We analysed the genes coding for the enzymes responsible
for a group of key phenotypic markers currently used to
identify vibrios. Analysis of the genomes revealed that all
genes coding for the specific proteins involved in the
metabolic pathways responsible for positive vibrio phenotypes were found in the majority of the vibrio genomes
(Table 1; Fig. 1b). Some phenotypes were defined based
on the presence of a few genes, e.g. indole production
(presence of tnaAB coding for tryptophanase A and
tryptophanase B), whereas for other phenotypes the
definition was based on the presence of several genes, e.g.
ornithine decarboxylase (presence of 36 genes) (Fig. 2).
The vibrio species that were negative for a given phenotype
revealed the absence of at least one gene involved in the
respective biochemical pathway (Table 1). Details of the
metabolic pathways of the diagnostic vibrio phenotypes are
given in Fig. S1. Regulatory proteins (Sigma-54 transcriptional regulator factor, ScrR, ArgR, GalR, MtlR, TreR,
TnaA and transcriptional regulator of the myo-inositol
catabolic operon) associated with each diagnostic vibrio
phenotype were identified (Table 2).
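The presence or absence calls reported here follow the decision rule of the prototype program described in Methods: an enzyme counts as present when its best hit passes the identity and length cut-offs, and a phenotype is positive only when every enzyme of the pathway is present. The sketch below is a hedged reconstruction of that rule, not the vibriophenotyping source code; the `Hit` record, the function names, the treatment of values exactly at the cut-off as passing, and the example identities and lengths are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    enzyme: str          # enzyme (query) the best hit belongs to
    identity: float      # percent identity of the best-scoring alignment
    aligned_length: int  # alignment length in residues
    query_length: int    # length of the query protein in residues


def enzyme_present(hit, min_identity=40.0, min_coverage=0.70):
    """Apply the 40 % identity and 70 % query-length cut-offs to one best hit."""
    return (
        hit.identity >= min_identity
        and hit.aligned_length >= min_coverage * hit.query_length
    )


def call_phenotype(required_enzymes, best_hits):
    """Return '+' only if every required enzyme has a best hit passing both cut-offs."""
    for enzyme in required_enzymes:
        hit = best_hits.get(enzyme)
        if hit is None or not enzyme_present(hit):
            return "-"
    return "+"


# Indole production is defined in the text by tnaA and tnaB; the hit values are invented.
indole_pathway = ["tnaA", "tnaB"]
hits_complete = {
    "tnaA": Hit("tnaA", 85.0, 460, 471),
    "tnaB": Hit("tnaB", 62.0, 400, 415),
}
hits_weak_tnaB = {
    "tnaA": Hit("tnaA", 85.0, 460, 471),
    "tnaB": Hit("tnaB", 35.0, 400, 415),  # below the identity cut-off
}

print(call_phenotype(indole_pathway, hits_complete))   # all enzymes pass
print(call_phenotype(indole_pathway, hits_weak_tnaB))  # one enzyme fails
```

Requiring every enzyme makes the call conservative: a single missing or divergent gene yields a negative phenotype, which matches the observation that negative species lacked at least one gene of the respective pathway.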
We observed that the pairs of sister species V. campbellii
and V. harveyi, V. parahaemolyticus and V. alginolyticus, V.
cholerae and V. mimicus, Vibrio anguillarum and Vibrio
ordalii, V. tubiashii and V. coralliilyticus, V. brasiliensis and
V. tubiashii, and V. natriegens and V. alginolyticus contained
different sets of genes, conferring different diagnostic
phenotypes for each species (Table 1). Phenotypic similarity
above 65 % was observed between pairs of strains of different species, including V. alginolyticus and V. cholerae, V.
parahaemolyticus and V. cholerae, V. parahaemolyticus and
V. mimicus, and V. harveyi and V. cholerae. These pairs had
less than 25 % in silico DDH values, indicating the rather
limited scope of the phenotypic identification compared
with the genome-based identification. The phenotypic
similarity derived from the genome sequences gave, in
general, higher values than the phenotypic similarity derived
from physiological and biochemical tests (Fig. 1b).
Table 1. Phenotypic features for differentiating sister species
of vibrios based on genomic analysis (presence and absence
of genes involved in the biochemical pathway of each feature)
Species: 1, V. campbellii; 2, V. harveyi; 3, V. parahaemolyticus; 4, V.
alginolyticus; 5, V. cholerae; 6, V. mimicus; 7, V. anguillarum; 8, V.
ordalii; 9, V. tubiashii; 10, V. coralliilyticus; 11, V. brasiliensis; 12, V.
natriegens. Phenotypic data for reference species were obtained from
Farmer & Hickman-Brenner (2006). +, Positive; −, negative; V,
variable. Genotypic and phenotypic similarities were calculated using
the Jaccard coefficient based on the presence or absence of the diagnostic phenotypic features. Blank cells, tests were not determined.
Feature 1 2 3 4 5 6 7 8 9 10 11 12
Ornithine decarboxylase V + +*
Voges–Proskauer reaction − + + − + − + +
Indole production + − +*
Arginine dihydrolase + +*
Utilization of:
myo-Inositol − + +
L-Leucine − +
D-Galactose V + + −
Fermentation of:
Sucrose − + − +
D-Mannose + − + +*
L-Arabinose + − + − +
Cellobiose +* + −* −
D-Mannitol + +*
D-Sorbitol + − + −
Trehalose + +*
*Different result from data in the literature (Alsina & Blanch, 1994;
Thompson et al., 2004; Farmer & Hickman-Brenner, 2006; Noguerola
& Blanch, 2008).
Diagnostic phenotypic features of V. harveyi and
V. campbellii
Ornithine decarboxylase, and fermentation of sucrose
and D-galactose are key biochemical tests routinely used
for differentiation of V. harveyi and V. campbellii. Our
analysis of the genome sequences revealed the presence
of the genes responsible for these phenotypes in the
two strains of V. harveyi (1792 and 14126T). In contrast,
in the genomes of five strains of V. campbellii (PEL22A,
DM40S, 519T, HY01 and BAA-1116) four genes involved
in the fermentation of sucrose (sucrose-specific IIB domain of the Phosphotransferase System (PTS), sucrose
operon repressor ScrR, sucrose 6-phosphate dehydrogenase
and fructokinase) were absent. These genes are required
for the fermentation of sucrose, and their absence in
these genomes explained the sucrose-negative phenotype.
Utilization of D-galactose and activity of ornithine
decarboxylase are known to be variable features in V.
campbellii. V. campbellii PEL22A, DM40S and 519T did not
possess the five genes (i.e. galactokinase, galactose 1-phosphate uridylyltransferase, UDP-glucose 4-epimerase,
β-D-galactosidase subunit α, cryptic β-D-galactosidase
subunit b) involved in D-galactose utilization. However,
V. campbellii BAA-1116 and HY01 did have these genes.
Similar results were found for genes involved in activity of
ornithine decarboxylase. The genomes of strains PEL22A
and CAIM519T lacked the genes necessary for utilization
of ornithine as a source of carbon and energy for growth,
whereas strains DM40S, BAA-1116 and HY01 possessed
these genes, suggesting that they are capable of utilizing
ornithine as a carbon source.
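The strain-level results above translate into species-level table entries in a simple way: a feature is positive when all strains carry the genes, negative when none do, and variable (V) otherwise. The following sketch applies that reading to the V. campbellii strains named in the text; the function name and the True/False encoding are assumptions introduced for illustration.

```python
def species_call(strain_calls):
    """Collapse per-strain gene presence (True/False) into '+', '-' or 'V' (variable)."""
    if not strain_calls:
        raise ValueError("at least one strain is required")
    values = set(strain_calls.values())
    if values == {True}:
        return "+"
    if values == {False}:
        return "-"
    return "V"


# Gene presence for the five V. campbellii genomes, as described in the text.
v_campbellii = {
    "sucrose fermentation": {
        "PEL22A": False, "DM40S": False, "519T": False, "HY01": False, "BAA-1116": False,
    },
    "D-galactose utilization": {
        "PEL22A": False, "DM40S": False, "519T": False, "HY01": True, "BAA-1116": True,
    },
    "ornithine decarboxylase": {
        "PEL22A": False, "DM40S": True, "519T": False, "HY01": True, "BAA-1116": True,
    },
}

summary = {feature: species_call(calls) for feature, calls in v_campbellii.items()}
print(summary)
```

Under this reading, sucrose fermentation is negative for V. campbellii, whereas D-galactose utilization and ornithine decarboxylase are variable, in line with the variable status reported for these two features.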
Diagnostic phenotypic features of V.
alginolyticus, V. parahaemolyticus and V.
natriegens
The Voges–Proskauer reaction, and fermentation of L-arabinose, cellobiose and D-galactose may differentiate
V. parahaemolyticus and V. alginolyticus. The genomes
of V. alginolyticus 12G01 and 40B had the necessary genes
for the Voges–Proskauer reaction, sucrose fermentation
and cellobiose fermentation. However, the genes for L-arabinose and D-galactose fermentation were absent
in these genomes. V. parahaemolyticus RIMD 2210633,
10329, Peru-466 and K5030 had the genes involved in L-arabinose and D-galactose fermentation, reflecting the
positive character of this species for these phenotypes. The
genomes of strains of V. parahaemolyticus lacked the gene
coding for acetolactate synthase, an important protein
related to the production of acetoin (Voges–Proskauer
test). The lack of this gene may reflect the negative
character of V. parahaemolyticus for this biochemical
reaction. All V. parahaemolyticus genomes had the genes
required for the fermentation of cellobiose, a phenotype
that has been previously reported as absent in this species,
due to the lack of the regulatory protein CelR. Another
relevant finding was the presence of the genes relating to
ornithine decarboxylase and indole production in the V.
natriegens genome, in disagreement with the literature
(Table 2). The discrepancies between the results of
phenotypic tests reported in the literature and the results
obtained in this study based on the genome analysis
explained, at least in part, the moderate correlation between these
two datasets (Fig. 1b).
Diagnostic phenotypic features of V. cholerae and
V. mimicus
V. cholerae is differentiated from V. mimicus by its positive
Voges–Proskauer reaction, and fermentation of sucrose
and D-mannose phenotypes. The two strains of V. cholerae
(O395 and N16961) had the genes involved in these
biochemical features. V. mimicus genomes lacked three genes
involved in the metabolic pathway for sucrose fermentation (genes encoding sucrose 6-phosphate dehydrogenase,
sucrose operon repressor ScrR, putative, and fructokinase), one gene involved in the metabolic pathway of
production of acetoin (encoding acetolactate synthase)
and two genes involved in the biochemical pathway for
fermentation of D-mannose (manBC).
Diagnostic phenotypic features of V. anguillarum
and V. ordalii
Indole production, Voges–Proskauer reaction, arginine dihydrolase, and fermentation of D-mannitol, D-sorbitol, trehalose,
L-arabinose, D-mannose and cellobiose are biochemical tests
used to differentiate the sister species V. anguillarum and V.
ordalii (Table 1). All the genes involved in these biochemical
tests were present in the genome of V. anguillarum, with the
exception of the genes that encode cellobiose fermentation.
This result may reflect a limitation in genome annotation or
simply that this species is cellobiose-negative. According to
our genome analysis, V. ordalii ATCC 33509T lacked all genes
for the diagnostic features. Surprisingly, this strain had the
genes for arginine dihydrolase, and fermentation of D-mannitol, trehalose and D-mannose, features that have been
reported as negative in the literature.
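Disagreements of this kind, marked with an asterisk in Table 1, can be listed by comparing the genome-based calls with the published ones feature by feature. The sketch below uses only the four V. ordalii ATCC 33509T features named in the text; the function name and dictionary layout are illustrative assumptions.

```python
def discrepancies(genome_calls, literature_calls):
    """Features where the genome-based call disagrees with the published phenotype."""
    shared = sorted(genome_calls.keys() & literature_calls.keys())
    return {
        feature: (genome_calls[feature], literature_calls[feature])
        for feature in shared
        if genome_calls[feature] != literature_calls[feature]
    }


# V. ordalii ATCC 33509T: genes present in the genome, phenotype reported as negative.
v_ordalii_genome = {
    "arginine dihydrolase": "+",
    "D-mannitol fermentation": "+",
    "trehalose fermentation": "+",
    "D-mannose fermentation": "+",
}
v_ordalii_literature = {
    "arginine dihydrolase": "-",
    "D-mannitol fermentation": "-",
    "trehalose fermentation": "-",
    "D-mannose fermentation": "-",
}

flagged = discrepancies(v_ordalii_genome, v_ordalii_literature)
for feature, (genome, literature) in flagged.items():
    print(f"{feature}: genome {genome}, literature {literature}")
```

Features present in only one of the two datasets are ignored here, which mirrors the blank (not determined) cells of Table 1.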
Diagnostic phenotypic features of V. brasiliensis,
V. coralliilyticus and V. tubiashii
Utilization of myo-inositol and L-leucine, and fermentation
of mannitol are some of the key biochemical tests that
differentiate V. coralliilyticus and V. tubiashii (Table 1). We
found a complete match between the presence of the genes
and the respective phenotypes previously reported in
the literature for V. brasiliensis, V. coralliilyticus and V.
tubiashii. For instance, the genes encoding proteins that are
part of the biochemical pathway for acetoin production
and utilization of myo-inositol and L-leucine were absent in
the genome of V. tubiashii.
DISCUSSION
Computational analysis of the genomes of vibrios may
be a useful tool for rapid identification of species.
The identification and classification of vibrios has been
performed using biochemical tests. Phenotypic identification is laborious and time-consuming. Furthermore, the
use of commercial kits for the biochemical identification of
environmental isolates may require changes to the protocols used, as most tests are typically intended for clinical
use (Noguerola & Blanch, 2008). Considering that novel
species descriptions may, in the near future, require a
genome sequence of the type strain and of reference strains,
it will be possible to retrieve phenotypic information for
the respective genomes, complementing or even replacing
the classic phenotypic characterization of microbes. The
phenotypic characterization of microbes using whole
genome sequences enables researchers to readily compare electronic data. The number of phenotypic features
analysed in this study was based on the examination of
previously published studies that defined key diagnostic
features. However, it is also possible to expand the panel of
features to encompass other types, such as phenotypes
related to specific ecologies, for example host colonization
and utilization of different energy sources.
Genotype to phenotype
The diagnostic phenotypic features analysed in this study
were a reflection of the genome information. We observed
a clear correlation between the presence of the genes and
the respective phenotypes (Fig. 1b). Strains with the
complete set of genes were positive for the given phenotype,
whereas strains devoid of the respective genes were
negative. However, there were some exceptions.
We found that genes belonging to metabolic pathways
related to ornithine decarboxylase, arginine dihydrolase,
indole production, and fermentation of galactose, D-mannose, trehalose, D-mannitol and cellobiose were
present in certain vibrios that were considered negative
according to the literature. V. parahaemolyticus genomes
had the genes required for the fermentation of cellobiose,
whereas V. natriegens had the genes for ornithine decarboxylase and indole production, in disagreement with the
literature. Similarly, V. ordalii ATCC 33509T had the genes
for arginine dihydrolase, and fermentation of D-mannitol,
trehalose and D-mannose, features that have been reported
as negative in the literature. This seems to illustrate the
limitations of such diagnostic tests at least for some species
of vibrios. However, the presence of a group of genes in a
bacterial genome does not necessarily mean that the
organism will present that phenotype. Mutation and gene
expression repression could influence the phenotypes
observed for these diagnostic features. A mutation in an
important gene of the metabolic pathway that causes
an insertion or deletion may generate a premature stop
codon in the gene, making it inactive. A single nucleotide
insertion in the gene encoding the sucrose-specific second
domain (IIB) of the phosphoenolpyruvate-dependent sugar
transport system was deemed the cause of the inability of
V. cholerae IEC224 to ferment sucrose (Garza et al., 2012).
This protein is an essential component in sucrose metabolism because it selectively transports sucrose to the
intracellular medium and phosphorylates it, so it can be
further metabolized by sucrose 6-hydrolase to α-D-glucose
6-phosphate and β-D-fructose. This insertion truncated the
protein sequence by introducing a stop codon at amino acid residue 185. In addition, repressor genes, regulatory
proteins and global regulators may prevent the expression
of related genes under certain conditions.
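The effect of a single nucleotide insertion on the reading frame can be shown with a toy example. The sequences below are invented for illustration and are not taken from the V. cholerae IEC224 gene; the codon table is the standard genetic code, whose codon assignments are also those of the bacterial translation table.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code built from the conventional TCAG ordering ('*' = stop).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical open reading frame and a mutant with one extra 'A' after base 7.
wild_type = "ATGAAATGGCGTGAATAA"
mutant = wild_type[:7] + "A" + wild_type[7:]

print(translate(wild_type))  # full-length toy peptide
print(translate(mutant))     # truncated by the premature stop codon
```

In the mutant the inserted base turns the third codon (TGG) into TAG, so translation ends after two residues. A gene search based only on sequence similarity may still report such a gene as present, which is one way a genome-positive call can coincide with a negative wet-lab phenotype.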
Fig. 2. The metabolic pathways of ornithine decarboxylase and arginine dihydrolase (pathway map: arginine and proline metabolism). The products of the biochemical reactions,
enzymes and respective EC numbers are indicated.
Gene expression is also regulated by global regulators.
Global regulators have been defined on the basis of their
pleiotropic phenotype and their ability to regulate operons
that belong to different metabolic pathways (Gottesman,
1984). The diagnostic vibrio phenotypes analysed in this
study appear to be controlled by several global regulators
(Table 2). For instance, leucine-responsive protein (Lrp)
seems to be involved in the global regulation of L-leucine
utilization, while the catabolite control protein (CcpA)
globally regulates the Voges–Proskauer reaction, ornithine
decarboxylase, arginine dihydrolase and trehalose fermentation. The cAMP–cAMP receptor protein (CRP)
complex is a global regulator for sucrose, L-arabinose, D-galactose and D-sorbitol fermentation, indole production,
and myo-inositol utilization. The cellobiose-negative phenotype may be explained by the lack of the global regulator
in the vibrio genome. In addition, regulatory proteins may
also play a critical role. Only seven regulatory proteins
(CRP, FNR, IHF, FIS, ArcA, NarL and LrP) directly
modulate the expression of 51 % of the genes in Eschrichia
coli (Martı́nez-Antonio & Collado-Vides, 2003). In vibrios,
the global regulators, such as Lrp and the cAMP–CRP
complex, are also involved in the regulation of virulence
and quorum sensing (Alice & Crosa, 2012; Lo Scrudato &
Blokesch, 2013). Vibrio phenotypes are also under control
of non-coding RNAs (Silveira et al., 2010).
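The regulator assignments named in this paragraph can be written down as a small lookup table. The following is only an illustrative sketch: the dictionary name `GLOBAL_REGULATORS` and the helper `regulator_for` are hypothetical, and the entries are transcribed from the text above rather than taken from any program of the authors.

```python
# Global regulator -> phenotypic tests, transcribed from the paragraph above.
# Names of the dictionary and the helper are illustrative assumptions.
GLOBAL_REGULATORS = {
    "Lrp": ["L-leucine utilization"],
    "CcpA": [
        "Voges-Proskauer reaction",
        "ornithine decarboxylase",
        "arginine dihydrolase",
        "trehalose fermentation",
    ],
    "cAMP-CRP complex": [
        "sucrose fermentation",
        "L-arabinose fermentation",
        "D-galactose fermentation",
        "D-sorbitol fermentation",
        "indole production",
        "myo-inositol utilization",
    ],
}


def regulator_for(phenotype):
    """Return the global regulator named for a phenotype, or None if none is named."""
    for regulator, phenotypes in GLOBAL_REGULATORS.items():
        if phenotype in phenotypes:
            return regulator
    return None
```

A lookup for cellobiose fermentation returns `None`, which mirrors the statement that no global regulator was found for that phenotype.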
Table 2. Regulatory genes and global regulators of vibrio phenotypes
Phenotypic test; Pathway; Associated regulatory genes; ID; Global regulator; Organism
Voges–Proskauer reaction; Acetoin, butanediol metabolism; Sigma-54 dependent transcriptional regulator; fig|314288.3.peg.1410; Catabolic control protein, CcpA; V. alginolyticus 12G01
Sucrose fermentation; Starch and sucrose metabolism; Sucrose operon repressor, ScrR, LacI family; fig|243277.1.peg.3384; cAMP–CRP complex; V. cholerae O1 biovar eltor str. N16961
Ornithine decarboxylase; Arginine and proline metabolism; Arginine regulatory pathway, ArgR; fig|6666666.580.peg.4755; Catabolic control protein, CcpA; V. harveyi LMG 14126T
L-Arabinose; Starch and sucrose metabolism; Arabinose operon regulatory protein; fig|223926.1.peg.4758; cAMP–CRP complex; V. parahaemolyticus RIMD 2210633
D-Galactose fermentation; Galactose metabolism; Galactose operon repressor, GalR, LacI family of transcriptional regulators; fig|223926.1.peg.2393; cAMP receptor protein (CRP); V. parahaemolyticus RIMD 2210633
Cellobiose fermentation; β-Glucoside metabolism; Cellobiose and glucan utilization regulator, LacI family, celR; *; *; *
D-Mannitol fermentation; Fructose and mannose metabolism; Mannitol operon repressor, mtlR; fig|223926.1.peg.368; cAMP–CRP complex; V. parahaemolyticus RIMD 2210633
Arginine dihydrolase; Arginine and proline metabolism; Arginine regulatory pathway, ArgR; fig|6666666.580.peg.4755; Catabolic control protein, CcpA; V. harveyi LMG 14126T
Trehalose fermentation; Starch and sucrose metabolism; Trehalose operon transcriptional regulator, treR; fig|6666666.37776.peg.2156; Catabolic control protein, CcpA; V. coralliilyticus P1T
D-Sorbitol fermentation; Fructose and mannose metabolism; Fructose repressor, FruR, LacI family; fig|6666666.26157.peg.3552; cAMP–CRP complex; V. anguillarum 775T
Indole production; Aromatic amino acid degradation; Tryptophanase, TnaA; fig|6666666.17460.peg.357; cAMP–CRP complex; V. natriegens LMG 10935T
myo-Inositol utilization; Inositol catabolism; Transcriptional regulator of the myo-inositol catabolic operon; fig|6666666.37776.peg.3580/4088; cAMP–CRP complex; V. coralliilyticus P1T
D-Mannose; Fructose and mannose metabolism; Transcriptional regulator of mannoside utilization, manR; *; *; *
*Regulators that were not found in the vibrio genomes analysed.

Determination of the diagnostic phenotypic features depends on the quality of the genome sequences
In some strains, we did not find the genes responsible for
previously reported diagnostic phenotypes, for instance the
lack of genes related to fermentation of cellobiose in V.
anguillarum 775. The genome of this strain is in contigs,
which could influence its annotation. In addition, we cannot
rule out the effects of sequencing errors in genome
annotation, but this should be a minor issue in strain 775,
due to the high genome coverage normally used in genome
sequencing projects. To expand the use of whole genome
sequences for phenotypic characterization, we suggest high
genome coverage (>20×; total number of reads × average
read size / mean genome size) and stringent quality control using
the Phred error probability. For instance, in Ion sequencing
and Illumina sequencing, the thresholds for quality control
provided by the respective sequencers are Q20 (99 %
accuracy of the base call, or one error in 100) and Q30
(99.9 % accuracy of the base call, or one error in 1000). To
automatically retrieve the phenotypic diagnostic features
from whole genome sequences of vibrios we developed a
prototype program, named vibriophenotyping. This program will allow researchers to search and identify the
diagnostic phenotypes using genome sequences.
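The coverage formula and the Phred thresholds quoted above can be checked with a few lines of arithmetic. The sketch below is not part of the vibriophenotyping program; the function names and the example read counts are hypothetical. It uses the coverage definition given in the text and the standard Phred relation P = 10^(-Q/10).

```python
def genome_coverage(n_reads, mean_read_length, genome_size):
    """Coverage = total number of reads x average read size / mean genome size."""
    return n_reads * mean_read_length / genome_size


def phred_error_probability(q):
    """Probability that a base call is wrong for Phred quality score q."""
    return 10 ** (-q / 10)


def meets_coverage_threshold(n_reads, mean_read_length, genome_size, minimum=20):
    """True if coverage is strictly above the suggested minimum (>20x by default)."""
    return genome_coverage(n_reads, mean_read_length, genome_size) > minimum


if __name__ == "__main__":
    # Hypothetical run: 1.5 million reads of 100 bp on a 5 Mb genome.
    print(genome_coverage(1_500_000, 100, 5_000_000))
    print(meets_coverage_threshold(1_500_000, 100, 5_000_000))
    for q in (20, 30):
        p = phred_error_probability(q)
        print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.1f} %")
```

With these relations, Q20 corresponds to one error in 100 base calls and Q30 to one error in 1000, as stated in the text.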
Concluding remarks
The major advantage with the determination of phenotypes
using genome sequences is the cumulative nature and
portability of these types of datasets. It will also be cheaper
and quicker to identify phenotypes using whole genome
sequences obtained with new technologies. The phenotypic
tables provided in current taxonomic manuals and in
publications containing species descriptions are not readily
accessible to researchers and are not in a computer-readable
format. This is a tremendous hindrance for
studies on the metabolic diversity of prokaryotes. In
addition, with the further development of microbial
genomic taxonomy, genome sequences will have a critical
role in their identification and classification. As a new
definition of bacterial species based on genome sequences
emerges from various studies, it becomes evident that
genomes will be used more frequently in prokaryotic
taxonomy.

ACKNOWLEDGEMENTS
C. C. T. and F. L. T. acknowledge grant support from CNPq, CAPES
and FAPERJ.

REFERENCES
Alice, A. F. & Crosa, J. H. (2012). The TonB3 system in the human pathogen Vibrio vulnificus is under the control of the global regulators Lrp and cyclic AMP receptor protein. J Bacteriol 194, 1897–1911.
Alsina, M. & Blanch, A. R. (1994). A set of keys for biochemical identification of environmental Vibrio species. J Appl Bacteriol 76, 79–85.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.
Auch, A. F., von Jan, M., Klenk, H.-P. & Göker, M. (2010). Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison. Stand Genomic Sci 2, 117–134.
Aziz, R. K., Bartels, D., Best, A. A., DeJongh, M., Disz, T., Edwards, R. A., Formsma, K., Gerdes, S., Glass, E. M. & other authors (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9, 75.
Drews, G. (2000). The roots of microbiology and the influence of Ferdinand Cohn on microbiology of the 19th century. FEMS Microbiol Rev 24, 225–249.
Farmer, J. J. & Hickman-Brenner, F. W. (2006). The Genera Vibrio and Photobacterium, 6: 508–563 In: The Prokaryotes. A Handbook on the Biology of Bacteria: Proteobacteria: Gamma Subclass, 3rd edn. Edited by M. Dworkin, S. Falkow, E. Rosenberg, K.-H. Schleifer & E. Stackebrandt. Berlin: Springer.
Garza, D. R., Thompson, C. C., Loureiro, E. C. B., Dutilh, B. E., Inada, D. T., Junior, E. C. S., Cardoso, J. F., Nunes, M. R. T., de Lima, C. P. S. & other authors (2012). Genome-wide study of the defective sucrose fermenter strain of Vibrio cholerae from the Latin American cholera epidemic. PLoS ONE 7, e37283.
Gottesman, S. (1984). Bacterial regulation: global regulatory networks. Annu Rev Genet 18, 415–441.
Hunt, D. E., David, L. A., Gevers, D., Preheim, S. P., Alm, E. J. & Polz, M. F. (2008). Resource partitioning and sympatric differentiation among closely related bacterioplankton. Science 320, 1081–1085.
Konstantinidis, K. T. & Stackebrandt, E. (2013). Defining Taxonomic Ranks. 229. In The Prokaryotes (4th edition): Prokaryotic Biology and Symbiotic Associations. 4th edn. Edited by E. Rosenberg, E. F. DeLong, S. Lory, E. Stackebrandt & F. L. Thompson. New York: Springer.
Lo Scrudato, M. & Blokesch, M. (2013). A transcriptional regulator linking quorum sensing and chitin induction to render Vibrio cholerae naturally transformable. Nucleic Acids Res 41, 3644–3658.
Martínez-Antonio, A. & Collado-Vides, J. (2003). Identifying global regulators in transcriptional regulatory networks in bacteria. Curr Opin Microbiol 6, 482–489.
Moreira, A. P. B., Pereira, N., Jr & Thompson, F. L. (2011). Usefulness of a real-time PCR platform for G+C content and DNA-DNA hybridization estimations in vibrios. Int J Syst Evol Microbiol 61, 2379–2383.
Noguerola, I. & Blanch, A. R. (2008). Identification of Vibrio spp. with a set of dichotomous keys. J Appl Microbiol 105, 175–185.
Preheim, S. P., Timberlake, S. & Polz, M. F. (2011). Merging taxonomy with ecological population prediction in a case study of Vibrionaceae. Appl Environ Microbiol 77, 7195–7206.
Silveira, A. C. G., Robertson, K. L., Lin, B., Wang, Z., Vora, G. J., Vasconcelos, A. T. R. & Thompson, F. L. (2010). Identification of non-coding RNAs in environmental vibrios. Microbiology 156, 2452–2458.
Thompson, F. L., Iida, T. & Swings, J. (2004). Biodiversity of vibrios. Microbiol Mol Biol Rev 68, 403–431.
Thompson, F. L., Gevers, D., Thompson, C. C., Dawyndt, P., Naser, S., Hoste, B., Munn, C. B. & Swings, J. (2005). Phylogeny and molecular identification of vibrios on the basis of multilocus sequence analysis. Appl Environ Microbiol 71, 5107–5115.
Thompson, C. C., Vicente, A. C. P., Souza, R. C., Vasconcelos, A. T. R., Vesth, T., Alves, N., Jr, Ussery, D. W., Iida, T. & Thompson, F. L. (2009). Genomic taxonomy of vibrios. BMC Evol Biol 9, 258.
Thompson, C. C., Vieira, N. M., Vicente, A. C. P. & Thompson, F. L. (2011). Towards a genome based taxonomy of Mycoplasmas. Infect Genet Evol 11, 1798–1804.
Thompson, C. C., Emmel, V. E., Fonseca, E. L., Marin, M. A. & Vicente, A. C. (2013a). Streptococcal taxonomy based on genome sequence analyses. F1000 Res 2, 1–9.
Thompson, C. C., Silva, G. G. Z., Vieira, N. M., Edwards, R., Vicente, A. C. P. & Thompson, F. L. (2013b). Genomic taxonomy of the genus Prochlorococcus. Microb Ecol 66, 752–762.
Vandamme, P., Pot, B., Gillis, M., de Vos, P., Kersters, K. & Swings, J. (1996). Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 60, 407–438.
Willems, A., Doignon-Bourcier, F., Goris, J., Coopman, R., de Lajudie, P., De Vos, P. & Gillis, M. (2001). DNA-DNA hybridization study of Bradyrhizobium strains. Int J Syst Evol Microbiol 51, 1315–1322.