International Journal of Systematic and Evolutionary Microbiology (2014), 64, 357–365. DOI 10.1099/ijs.0.057927-0

Genotype to phenotype: identification of diagnostic vibrio phenotypes using whole genome sequences

Gilda Rose S. Amaral, Graciela M. Dias, Michiyo Wellington-Oguri, Luciane Chimetto, Mariana E. Campeão, Fabiano L. Thompson and Cristiane C. Thompson

Institute of Biology, Federal University of Rio de Janeiro (UFRJ), Brazil

Correspondence: Fabiano L. Thompson, Cristiane C. Thompson

Vibrios are ubiquitous in the aquatic environment and can be found in association with animal or plant hosts. The range of ecological relationships includes pathogenic and mutualistic associations. To gain a better understanding of the ecology of these microbes, it is important to determine their phenotypic features. However, the traditional phenotypic characterization of vibrios has been expensive, time-consuming and restricted in scope to a limited number of features. In addition, most of the commercial systems applied for phenotypic characterization cannot characterize the broad spectrum of environmental strains. A reliable and possible alternative is to obtain phenotypic information directly from whole genome sequences. The aim of the present study was to evaluate the usefulness of whole genome sequences as a source of phenotypic information. We performed a comparison of the vibrio phenotypes obtained from the literature with the phenotypes obtained from whole genome sequences. We observed a significant correlation between the previously published phenotypic data and the phenotypic data retrieved from whole genome sequences of vibrios.
Analysis of 26 vibrio genomes revealed that all genes coding for the specific proteins involved in the metabolic pathways responsible for positive phenotypes of the 14 diagnostic features (Voges–Proskauer reaction, indole production, arginine dihydrolase, ornithine decarboxylase, utilization of myo-inositol, sucrose and L-leucine, and fermentation of D-mannitol, D-sorbitol, L-arabinose, trehalose, cellobiose, D-mannose and D-galactose) were found in the majority of the vibrio genomes. Vibrio species that were negative for a given phenotype revealed the absence of all or several genes involved in the respective biochemical pathways, indicating the utility of this approach to characterize the phenotypes of vibrios. The absence of the global regulation and regulatory proteins in the Vibrio parahaemolyticus genome indicated a non-vibrio phenotype. Whole genome sequences represent an important source for the phenotypic identification of vibrios.

Abbreviations: CRP, cAMP receptor protein; DDH, DNA–DNA hybridization; GGD, genome-to-genome distance; GGDC, genome-to-genome distance calculator.

A supplementary figure and two supplementary tables are available with the online version of this paper.

057927 © 2014 IUMS

INTRODUCTION

Microbial taxonomy comprises the identification of isolates within known species, classification of new isolates (creation of new taxa) and nomenclature. The taxonomic schemes used for identification and classification need to be reliable, reproducible and informative. It is also desirable that the schemes are easy and affordable for the end-users. Phenotypic tables listing useful diagnostic features for identification have been extensively used since the early developments of microbial taxonomy (for example Bergey’s Manual of Determinative Bacteriology, 9th edition), partially as a consequence of the influence of other disciplines of biology (e.g.
botany) in the early developments of microbial taxonomy (Drews, 2000), and partially due to the technological limitations associated with these early periods of taxonomy. However, microbiologists soon realized that taxonomic schemes based on phenotypic features had several shortcomings: different species could have indistinguishable phenotypes (e.g. the Vibrio alginolyticus species group), strains of the same species could have different phenotypes (e.g. colony variation, enzyme activities), results were sometimes discrepant even for the same strain, and interlaboratory reproducibility was lacking. One astonishing consequence of the phenotype-based taxonomy developed in the first half of the last century was a multiplication of novel species, leading to a complete revision in the number of recognized species between 1957 and 1974. More than 90 % of all the species described in the 1957 edition of Bergey’s Manual of Determinative Bacteriology were subsequently reclassified as synonymous or excluded from the taxonomic schemes due to the application of more rigorous procedures to assess strain similarity (e.g. DNA–DNA hybridization, DDH). The impact of molecular tools (e.g. molecular fingerprinting, ΔTm, DDH and DNA sequencing) in taxonomy has been extremely high, culminating in the establishment of polyphasic taxonomy (Vandamme et al., 1996). Taxonomic schemes are currently based on the evolutionary relationships determined using gene sequences (most notably the 16S rRNA gene) and genomic information (Konstantinidis & Stackebrandt, 2013; Thompson et al., 2009).

Printed in Great Britain

The advent of whole genome sequencing allowed the establishment of taxonomic schemes based on the evolutionary information contained in the genome sequences, such as the Karlin signature, average amino acid identity, supertrees, and genome-to-genome distance (GGD, a type of in silico DDH). It is becoming clear that bacterial species can be defined on the basis of these features. A common definition would consider that strains from the same species share <10 dissimilarity in Karlin signature, >95 % amino acid identity, >95 % similarity based on multiple alignment genes and >70 % in silico GGD (Thompson et al., 2009, 2011, 2013a, b). It is reasonable to suggest that microbial taxonomy will become steadily more dependent on genome sequences rather than on the classic phenotypic characterization using time-consuming, laborious wet laboratory tests. In addition, the extremely close phenotypic similarities observed for some types of bacteria (e.g. vibrios) are a hindrance to the routine phenotypic identification of these microbes.

Vibrios are ubiquitous in the aquatic environment and can be found in association with animal or plant hosts. Some species are animal (Vibrio coralliilyticus and Vibrio harveyi) or human (Vibrio cholerae, Vibrio parahaemolyticus and Vibrio vulnificus) pathogens, and others form mutualistic relationships with marine organisms (e.g. that between Vibrio fischeri and the squid Euprymna scolopes). Vibrios were among the first taxonomic groups to be evaluated by means of multilocus sequence analysis (Thompson et al., 2005) and genomic taxonomy (Thompson et al., 2009). We also developed a surrogate methodology to determine genome similarity based on ΔTm using a real-time PCR platform as an alternative to the classic DDH based on the Ezaki methodology (Moreira et al., 2011). However, it has become evident that the phenotypic characterization of vibrios is problematic. The phenotypic identification of vibrios remains a difficult task, particularly for some sister species; for example, V. cholerae–Vibrio mimicus, V. coralliilyticus–Vibrio tubiashii–Vibrio brasiliensis, V. alginolyticus–V. parahaemolyticus–Vibrio natriegens and V. harveyi–Vibrio campbellii have similar phenotypes. Vibrio sister species have highly similar genomes (around 70 % DDH) and nearly indistinguishable phenotypes. In spite of having similar genomes, these species can be recognized as different evolutionary units in nature (Hunt et al., 2008; Preheim et al., 2011). To gain a better understanding of the ecology of these microbes, it is important to obtain phenotypic information based on whole genome sequences. The aim of the present study was to evaluate the usefulness of genome sequences as a source of phenotypic information. We performed a comparison of the vibrio phenotypes obtained from the literature with those obtained from whole genome sequences. We also developed a prototype program, vibriophenotyping, to determine the major diagnostic phenotypic features for the identification of vibrios.

METHODS

Genome sequence data. We analysed 26 strains corresponding to 12 vibrio species (Table S1, available in the online Supplementary Material). The genome sequences were downloaded from the National Center for Biotechnology Information (NCBI) database. The genomes were annotated using the Rapid Annotation using Subsystem Technology (RAST) server (Aziz et al., 2008). This server is based on manually curated subsystems and subsystem-based protein families that automatically guarantee a high degree of assignment consistency.

Identification of genes responsible for different diagnostic phenotypes. We chose the 14 diagnostic biochemical features (Voges–Proskauer reaction, indole production, arginine dihydrolase, ornithine decarboxylase, utilization of myo-inositol, sucrose and L-leucine, and fermentation of D-mannitol, D-sorbitol, L-arabinose, trehalose, cellobiose, D-mannose and D-galactose) that have been applied in previous studies to identify sister species of vibrios (Alsina & Blanch, 1994; Farmer & Hickman-Brenner, 2006; Noguerola & Blanch, 2008; Thompson et al., 2004).
For each diagnostic feature we established a list of corresponding genes (Table S2). We detected the genes coding for the proteins responsible for these features using the RAST program and the KEGG metabolic database (http://www.genome.jp/kegg/). The BLASTP algorithm (Altschul et al., 1990) was used to identify genes associated with the biochemical pathways. The program ExPASy translate (ExPASy Bioinformatics Resource Portal) was used to analyse protein sequences. The phenotypic features obtained from the literature and the features obtained from the genome searches were compared by means of numerical coefficients. The search for genes related to the biochemical characteristics of interest generated a table of presence or absence that was compared with the data from previously published studies. We used the Jaccard index to obtain a similarity coefficient between pairs of strains using the two datasets. Species were compared pairwise, generating a similarity matrix for both the genotype and the phenotype. Correlation analysis was performed based on the Pearson correlation using Excel software.

Genome-to-genome distance calculator (GGDC). The genome distance among the 27 genomes was calculated using GGDC (Auch et al., 2010). The GGDC values were compared with the DDH values obtained in previous studies (Thompson et al., 2004). The previous DDH estimates were all based on empirical data using the Ezaki methodology (Willems et al., 2001). The regression-based DDH estimate uses parameters from a robust-line fit, whereas the threshold-based DDH estimate applies the distance threshold leading to the lowest error ratio in predicting whether DDH is >70 % or <70 % (Auch et al., 2010).

Development of the prototype program vibriophenotyping. To automate searches for enzymes related to phenotypes of interest, a program was written in Python.
The program vibriophenotyping uses the BLAST algorithm to make searches. The program has an associated database, which is composed of protein sequences related to the analysed phenotypes (see Table S1). These sequences were used as queries for BLAST searches. The BLAST results were analysed by the vibriophenotyping program in the following steps: (i) the user provides an amino acid FASTA file with coding sequences as input; (ii) the program verifies whether hits were found for the enzyme being searched; (iii) it determines which orthologue in the database has the highest score, for use in the subsequent analysis; (iv) it checks whether the identity is greater or less than 40 %; and (v) it checks whether the sequence length is greater or less than 70 % of the query length. After these steps, if all the enzymes involved in a metabolic pathway are present in the genome, the organism is considered positive for this phenotype; if one or more enzymes in a metabolic pathway are absent, the organism is considered negative. The output of the program comprises a table and a text file. The table contains a list of phenotypes, whereas the text file contains the BLAST results as well as the identity and length cuts, and the phenotypes with enzymes present or absent. The program is available at http://www.microbiologia.biologia.ufrj.br/.

RESULTS

The genomic relatedness obtained by the in silico GGD hybridization methodology was in agreement with previous measurements of DDH (Fig. 1a). Overall, the fluorometric microplate methodology resulted in slightly higher values (approx. 5.9 % higher) than the in silico methodology. The comparison of DDH values obtained by the fluorometric microplate methodology with the DDH values obtained by the in silico GGD methodology provided a significant correlation, with fluorometric values >70 % corresponding to GGD values >70 %. For instance, V. campbellii CAIM519T and the newly isolated proteorhodopsin-bearing V. campbellii PEL22A shared >70 % GGD. Similarly, all strains of V. mimicus shared >70 % GGD, whereas strains of V. mimicus and V. cholerae shared <70 % GGD.

[Fig. 1 plots: (a) in vitro (DDH, %) against in silico (GGD, %), R² = 0.84, Pearson correlation = 0.92; (b) phenotypic similarity (%) against genotypic similarity (%), R² = 0.48, Pearson correlation = 0.68.]

Fig. 1. Genomic and phenotypic information based on whole genome sequences. Correlation between the DDH values obtained from the literature and in silico GGD values from the pairs of vibrios. The points on the graph correspond to pairs of the test species. (a) Regression curve between in vitro (DDH) and in silico (GGD) data. (b) Regression curve between genotypic similarity and phenotypic similarity.

Genome enabled phenotypic identification of vibrios

We analysed the genes coding for the enzymes responsible for a group of key phenotypic markers currently used to identify vibrios. Analysis of the genomes revealed that all genes coding for the specific proteins involved in the metabolic pathways responsible for positive vibrio phenotypes were found in the majority of the vibrio genomes (Table 1; Fig. 1b). Some phenotypes were defined based on the presence of a few genes, e.g. indole production (presence of tnaAB coding for tryptophanase A and tryptophanase B), whereas for other phenotypes the definition was based on the presence of several genes, e.g. ornithine decarboxylase (presence of 36 genes) (Fig. 2). The vibrio species that were negative for a given phenotype revealed the absence of at least one gene involved in the respective biochemical pathway (Table 1). Details of the metabolic pathways of the diagnostic vibrio phenotypes are given in Fig. S1. Regulatory proteins (Sigma-54 transcriptional regulator factor, ScrR, ArgR, GalR, MtlR, TreR, TnaA and transcriptional regulator of the myo-inositol catabolic operon) associated with each diagnostic vibrio phenotype were identified (Table 2).

We observed that the pairs of sister species V. campbellii and V. harveyi, V. parahaemolyticus and V. alginolyticus, V. cholerae and V. mimicus, Vibrio anguillarum and Vibrio ordalii, V. tubiashii and V. coralliilyticus, V. brasiliensis and V. tubiashii, and V. natriegens and V. alginolyticus contained different sets of genes, conferring different diagnostic phenotypes for each species (Table 1). Phenotypic similarity above 65 % was observed between pairs of strains of different species, including V. alginolyticus and V. cholerae, V. parahaemolyticus and V. cholerae, V. parahaemolyticus and V. mimicus, and V. harveyi and V. cholerae. These pairs had less than 25 % in silico DDH values, indicating the rather limited scope of the phenotypic identification compared with the genome-based identification. The phenotypic similarity derived from the genome sequences gave, in general, higher values than the phenotypic similarity derived based on physiological and biochemical tests (Fig. 1b).

Table 1. Phenotypic features for differentiating sister species of vibrios based on genomic analysis (presence and absence of genes involved in the biochemical pathway of each feature)

Species: 1, V. campbellii; 2, V. harveyi; 3, V. parahaemolyticus; 4, V. alginolyticus; 5, V. cholerae; 6, V. mimicus; 7, V. anguillarum; 8, V. ordalii; 9, V. tubiashii; 10, V. coralliilyticus; 11, V. brasiliensis; 12, V. natriegens. Phenotypic data for reference species were obtained from Farmer & Hickman-Brenner (2006). +, Positive; −, negative; V, variable. Genotypic and phenotypic similarities were calculated using the Jaccard coefficient based on the presence or absence of the diagnostic phenotypic features. Blank cells, tests were not determined.

Feature 1 2 3 4 5 6 7 8 9 10 11 12
Ornithine decarboxylase V + +*
Voges–Proskauer reaction − ++ −+ − ++
Indole production + − +*
Arginine dihydrolase + +*
Utilization of:
myo-Inositol −+ +
L-Leucine −+
D-Galactose V + + −
Fermentation of:
Sucrose −+ − +
D-Mannose + − + +*
L-Arabinose + − + − +
Cellobiose +* + −* −
D-Mannitol + +*
D-Sorbitol + − +−
Trehalose + +*

*Different result from data in the literature (Alsina & Blanch, 1994; Thompson et al., 2004; Farmer & Hickman-Brenner, 2006; Noguerola & Blanch, 2008).

Diagnostic phenotypic features of V. harveyi and V. campbellii

Ornithine decarboxylase, and fermentation of sucrose and D-galactose are key biochemical tests routinely used for differentiation of V. harveyi and V. campbellii. Our analysis of the genome sequences revealed the presence of the genes responsible for these phenotypes in the two strains of V. harveyi (1792 and 14126T). In contrast, in the genomes of five strains of V. campbellii (PEL22A, DM40S, 519T, HY01 and BAA-1116) four genes involved in the fermentation of sucrose (sucrose-specific IIB domain of the Phosphotransferase System (PTS), sucrose operon repressor ScrR, sucrose 6-phosphate dehydrogenase and fructokinase) were absent. These genes are required for the fermentation of sucrose, and their absence in these genomes explained the sucrose-negative phenotype. Utilization of D-galactose and activity of ornithine decarboxylase are known to be variable features in V. campbellii. V. campbellii PEL22A, DM40S and 519T did not possess the five genes (i.e. galactokinase, galactose 1-phosphate uridylyltransferase, UDP-glucose 4-epimerase, β-D-galactosidase subunit α, cryptic β-D-galactosidase subunit β) involved in D-galactose utilization. However, V. campbellii BAA-1116 and HY01 did have these genes.
Similar results were found for genes involved in activity of ornithine decarboxylase. The genomes of strains PEL22A and CAIM519T lacked the genes necessary for utilization of ornithine as a source of carbon and energy for growth, whereas strains DM40S, BAA-1116 and HY01 possessed these genes, suggesting that they are capable of utilizing ornithine as a carbon source.

Diagnostic phenotypic features of V. alginolyticus, V. parahaemolyticus and V. natriegens

The Voges–Proskauer reaction, and fermentation of L-arabinose, cellobiose and D-galactose may differentiate V. parahaemolyticus and V. alginolyticus. The genomes of V. alginolyticus 12G01 and 40B had the necessary genes for the Voges–Proskauer reaction, sucrose fermentation and cellobiose fermentation. However, the genes for L-arabinose and D-galactose fermentation were absent in these genomes. V. parahaemolyticus RIMD 2210633, 10329, Peru-466 and K5030 had the genes involved in L-arabinose and D-galactose fermentation, reflecting the positive character of this species for these phenotypes. The genomes of strains of V. parahaemolyticus lacked the gene coding for acetolactate synthase, an important protein related to the production of acetoin (Voges–Proskauer test). The lack of this gene may reflect the negative character of V. parahaemolyticus for this biochemical reaction. All V. parahaemolyticus genomes had the genes required for the fermentation of cellobiose, a phenotype that has been previously reported as absent in this species, due to the lack of the regulatory protein CelR. Another relevant finding was the presence of the genes relating to ornithine decarboxylase and indole production in the V. natriegens genome, in disagreement with the literature (Table 2). The discrepancies between the results of phenotypic tests reported in the literature and the results obtained in this study based on the genome analysis explained, at least in part, the correlation between these two datasets (Fig. 1b).
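The comparison behind Fig. 1b (Jaccard similarity between pairs of strains in each dataset, followed by a Pearson correlation between the two sets of pairwise similarities) can be illustrated with a minimal sketch. The original analysis was run in Excel; the function names, the strain labels and the toy feature sets below are hypothetical and serve only to show the arithmetic.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity of two sets of positive features (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise(profiles):
    """Jaccard similarity for every unordered pair of strains, in a fixed order."""
    return [jaccard(profiles[x], profiles[y])
            for x, y in combinations(sorted(profiles), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical toy data: features called positive from the genome (genotype)
# and features reported positive in the literature (phenotype).
genotype = {
    "strain_A": {"sucrose", "indole", "trehalose"},
    "strain_B": {"sucrose", "indole"},
    "strain_C": {"trehalose"},
}
phenotype = {
    "strain_A": {"sucrose", "indole", "trehalose"},
    "strain_B": {"sucrose"},
    "strain_C": {"trehalose", "indole"},
}

r = pearson(pairwise(genotype), pairwise(phenotype))
print(f"Pearson correlation between the two similarity sets: {r:.2f}")
```

Sorting the strain names before pairing keeps the two similarity lists aligned pair by pair, which the correlation step requires. How variable (V) or undetermined cells of Table 1 were handled is not stated in the text, so the sketch treats every feature as simply present or absent.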
Diagnostic phenotypic features of V. cholerae and V. mimicus

V. cholerae is differentiated from V. mimicus by its positive Voges–Proskauer reaction, and fermentation of sucrose and D-mannose phenotypes. The two strains of V. cholerae (O395 and N16961) had the genes involved in these biochemical features. V. mimicus genomes lacked three genes involved in the metabolic pathway for sucrose fermentation (genes encoding sucrose 6-phosphate dehydrogenase, sucrose operon repressor ScrR, putative, and fructokinase), one gene involved in the metabolic pathway of production of acetoin (encoding acetolactate synthase) and two genes involved in the biochemical pathway for fermentation of D-mannose (manBC).

Diagnostic phenotypic features of V. anguillarum and V. ordalii

Indole production, Voges–Proskauer reaction, arginine dihydrolase, and fermentation of D-mannitol, D-sorbitol, trehalose, L-arabinose, D-mannose and cellobiose are biochemical tests used to differentiate the sister species V. anguillarum and V. ordalii (Table 1). All the genes involved in these biochemical tests were present in the genome of V. anguillarum, with the exception of the genes that encode cellobiose fermentation. This result may reflect a limitation in genome annotation or simply that this species is cellobiose-negative. According to our genome analysis, V. ordalii ATCC 33509T lacked all genes for the diagnostic features. Surprisingly, this strain had the genes for arginine dihydrolase, and fermentation of D-mannitol, trehalose and D-mannose, features that have been reported as negative in the literature.

Diagnostic phenotypic features of V. brasiliensis, V. coralliilyticus and V. tubiashii

Utilization of myo-inositol and L-leucine, and fermentation of mannitol are some of the key biochemical tests that differentiate V. coralliilyticus and V. tubiashii (Table 1). We found a complete match between the presence of the genes and the respective phenotypes previously reported in the literature for V. brasiliensis, V. coralliilyticus and V. tubiashii. For instance, the genes encoding proteins that are part of the biochemical pathway for acetoin production and utilization of myo-inositol and L-leucine were absent in the genome of V. tubiashii.

DISCUSSION

Computational analysis of the genomes of vibrios may be a useful tool for rapid identification of species. The identification and classification of vibrios have traditionally been performed using biochemical tests. Phenotypic identification is laborious and time-consuming. Furthermore, the use of commercial kits for the biochemical identification of environmental isolates may require changes to the protocols used, as most tests are typically intended for clinical use (Noguerola & Blanch, 2008). Considering that novel species descriptions may, in the near future, require a genome sequence of the type strain and of reference strains, it will be possible to retrieve phenotypic information for the respective genomes, complementing or even replacing the classic phenotypic characterization of microbes. The phenotypic characterization of microbes using whole genome sequences enables researchers to readily compare electronic data. The number of phenotypic features analysed in this study was based on the examination of previously published studies that defined key diagnostic features. However, it is also possible to expand the panel of features to encompass other types, such as phenotypes related to specific ecologies, for example host colonization and utilization of different energy sources.
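The way phenotypic information is retrieved from a genome in this study (best BLAST hit per enzyme, an identity cut of 40 %, a length cut of 70 % of the query, and a positive call only when every enzyme of the pathway passes) can be sketched as follows. This is not the vibriophenotyping source code: the `Hit` record, the function names, the enzyme labels and the numbers in the example are hypothetical, and treating the two cuts as inclusive (>=) is an assumption, since the text does not say how values exactly at the threshold are handled.

```python
from dataclasses import dataclass

MIN_IDENTITY = 40.0  # percent identity cut stated in Methods
MIN_LENGTH_FRACTION = 0.70  # hit length relative to query length


@dataclass
class Hit:
    enzyme: str  # database enzyme used as the query (hypothetical label)
    identity: float  # percent identity of the alignment
    align_len: int  # length of the matched sequence
    query_len: int  # length of the query sequence
    score: float  # BLAST score used to pick the best hit


def best_hits(hits):
    """Keep only the highest-scoring hit for each enzyme."""
    best = {}
    for h in hits:
        if h.enzyme not in best or h.score > best[h.enzyme].score:
            best[h.enzyme] = h
    return best


def enzyme_present(hit):
    """Apply the identity and length cuts (inclusive bounds are an assumption)."""
    long_enough = hit.align_len >= MIN_LENGTH_FRACTION * hit.query_len
    return hit.identity >= MIN_IDENTITY and long_enough


def call_phenotype(pathway_enzymes, hits):
    """Return ('+', []) if every enzyme passes, else ('-', missing enzymes)."""
    best = best_hits(hits)
    missing = [e for e in pathway_enzymes
               if e not in best or not enzyme_present(best[e])]
    return ("+" if not missing else "-", missing)


# Hypothetical example loosely modelled on the sucrose fermentation genes.
sucrose_pathway = ["PTS_IIB_sucrose", "ScrR", "sucrose_6P_enzyme", "fructokinase"]
hits = [
    Hit("PTS_IIB_sucrose", 82.0, 450, 460, 700.0),
    Hit("ScrR", 65.0, 300, 320, 410.0),
    Hit("sucrose_6P_enzyme", 71.0, 480, 500, 650.0),
    Hit("fructokinase", 35.0, 290, 300, 120.0),  # fails the identity cut
]
call, missing = call_phenotype(sucrose_pathway, hits)
print(call, missing)
```

Returning the list of missing enzymes alongside the call mirrors the program output described in Methods, where the text file reports which enzymes were present or absent for each phenotype.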
Genotype to phenotype

The diagnostic phenotypic features analysed in this study were a reflection of the genome information. We observed a clear correlation between the presence of the genes and the respective phenotypes (Fig. 1b). Strains with the complete set of genes were positive for the given phenotype, whereas strains devoid of the respective genes were negative. However, there were some exceptions. We found that genes belonging to metabolic pathways related to ornithine decarboxylase, arginine dihydrolase, indole production, and fermentation of galactose, D-mannose, trehalose, D-mannitol and cellobiose were present in certain vibrios that were considered negative according to the literature. V. parahaemolyticus genomes had the genes required for the fermentation of cellobiose, whereas V. natriegens had the genes for ornithine decarboxylase and indole production, in disagreement with the literature. Similarly, V. ordalii ATCC 33509T had the genes for arginine dihydrolase, and fermentation of D-mannitol, trehalose and D-mannose, features that have been reported as negative in the literature. This seems to illustrate the limitations of such diagnostic tests at least for some species of vibrios. However, the presence of a group of genes in a bacterial genome does not necessarily mean that the organism will present that phenotype. Mutation and repression of gene expression could influence the phenotypes observed for these diagnostic features. A mutation in an important gene of the metabolic pathway that causes an insertion or deletion may generate a premature stop codon in the gene, making it inactive. A single nucleotide insertion in the gene encoding the sucrose-specific second domain (IIB) of the phosphoenolpyruvate-dependent sugar transport system was deemed the cause of the inability of V. cholerae IEC224 to ferment sucrose (Garza et al., 2012).
This protein is an essential component in sucrose metabolism because it selectively transports sucrose to the intracellular medium and phosphorylates it, so it can be further metabolized by sucrose 6-hydrolase to a-D-glucose 6-phosphate and b-D-fructose. This insertion truncated the protein sequence by introducing a stop codon at aminoacid residue 185. In addition, repressor genes, regulatory proteins and global regulators may prevent the expression of related genes under certain conditions. Gene expression is also regulated by global regulators. Global regulators have been defined on the basis of their pleiotropic phenotype and their ability to regulate operons that belong to different metabolic pathways (Gottesman, 1984). The diagnostic vibrio phenotypes analysed in this study appear to be controlled by several global regulators Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 15:31:22 361 362 Glutamate Carbamoyl-P Pymole2-carbosylate 1.2.1.41 Alanine, aspartate and glutamate metabolism ADC pathway 1.2.1.3 PuuC N4-Acetylc-Glutamylaminoc-aminobutanoate butyrate 3.5.1.63 PuuD 5-carboxylate 1.5.99.8 1.5.1.2 3.4.11.5 1.5.1.19 2.1.4.1 1.5.1.11 Sarcosine 3.5.3.3 GuanidinoCreatine acetate 2.1.1.2 1.4.3.3 1.21.4.1 2-Oxo-5-Aminovalerate Linatine 2.7.3.3 3.5.2.14 4-Aminobutanoate 1.2.1.19 1.2.1.- 1.2.1.3 6.3.2.11 3.4.13.20 3.4.13.18 L-Homocamosine 2.5.1.16 Spermine Glutathione metabolism Spermidine 2.5.1.22 2.1.4.1 3.5.3.7 β -Alanine metabolism Cysteine and methionine metabolism S-Adenosylmetioninamine 4.1.1.50 S-AdenosylL-methionine 2.5.1.16 Polyamine Pathway Carboxyspermidine 1.4.3.22 4.1.1.4.1.1.2.6.1.Norspermidine 1.5.99.6 Carboxynorspermidine 1.5.1.43 4-semialdehyde 1,3-Diaminopropane 1.5.1.43 Glycine metabolism L-Aspartate- Feruloyl-CoA 1.2.1.54 4-Guanidinobutanoate 2.3.1.64 Feruloylptrescine Feruloylagmatine ACT 2.3.1.64 p-Coumaroyl- p-Coumaroylptrescine agmatine ACT p-Coumaroyl-CoA Agmatine 4-Guanidinobutanal 4.1.1.75 
2-Oxoarginine 2.6.1.84 ADH pathway Creatine pathway 23.1.109 4.1.1.19 N-Methylhydantoin Citrate cycle Nitric oxide 1.14.13.39 NwHydroxyarginine 1.14.13.39 Lysine degradation Creatinine 3.5.4.1 3.5.4.21 spontaneous N-Carbamoylsarcosine L-Arginine-P 3.5.1.59 3.5.2.10 Creatine-P Urea 2.7.3.2 5-Amino1.4.1.12 pentanoate 1-Pymoline2-carboxylate 5.1.1.4 D-Proline L-Proline 1.5.1.1 3.5.1.5 D-Nopaline D-Octopine Excretion Urea1-carboxylate 3.5.1.54 6.3.4.6 Citrate cycle 4.3.1.12 3.5.3.1 Arginine Urea 2.7.3.1 Fumarate Guanidinoacetate-P 4.3.2.1 5-semialdehyde Peptide L-Glutamate L-1-Pymoline- 2.6.1.13 Omithine 2.6.1.82 1.4.3.4 PuuB 1.4.3.10 2.6.1.29 c-GlutamylN4-Acetylc-aminoaminobutanal butyraldehyde N-Acetylputrescine 2.3.1.57 3.5.1.62 2,5-Dioxopentanoate c-L-Glutamylputrescine PuuA 2.3.1.35 N-Acetylglutamate semialdehyde L-Glutamyl-P 1.2.1.38 N-Acetylglutamyl-P 2.6.1.11 3.5.1.16 3.5.1.14 & D-Om metabolism D-Arg 1.14.13.39 3.5.3.6 Urea Cycle succinate L-Argino L-1-Pymolinetrans- 1.14.11.2 L-erythroPyruvate 1.5.1.2 3-hydroxy4-HydroxyD-4-Hydroxy4-Hydroxy5-carboxylate L-proline 2-oxoglutarate glutamate 2.6.1.1 4.1.1.3 3.5.4.22 1.4.3.3 5.1.1.8 1.2.1.88 PRODH2 1-Pymoline2.6.1.21 2.6.1.23 4.1.3.16 Glyoxylate cis-4-Hydroxy4-hydroxyD-proline 1.1.1.104 1.2.1.88 L -4-Hydroxy2-carboxylate 2-oxo-4-hydroxy4-Oxoproline 4.1.1.17 glutamate 5-aminovalerate 1.5.99.8 N2-SuccinylN2-Succinylsemialdehyde glutamate L-arginine 3.5.1.96 2.6.1.81 1.2.1.71 3.5.3.23 N2-Succinyl-L-glutamate N2-SuccinylN-Carbamoyl5-semialdehyde L-ornithine putrescine Tropane, piperidine and 3.5.1.53 3.5.3.12 pyridine alkaloid biosynthesis Putrescine 3.5.3.11 1.2.1.88 2.7.2.11 1.2.1.88 N-Acetylglutamate 2.3.1.1 2.7.2.8 2.1.3.9 Citruline 2.1.3.3 Aspartate 6.3.4.5 Alanine, aspartate and glutamate metabolism N-AcetylL-citrulline 3.5.1.16 N-Acetylomithine 2.1.3.11 N-Succinylcitrulline N-Succinylomithine Pyrimidine metabolism 6.3.4.16 2.7.2.2 1.4.1.4 1.4.1.2 NH3 1.4.1.3 6.3.1.2 3.5.1.2 3.5.1.38 Glutamine Nitrogen metabolism 
(Table 2). For instance, leucine-responsive protein (Lrp) seems to be involved in the global regulation of L-leucine utilization, while the catabolic control protein (CcpA) globally regulates the Voges–Proskauer reaction, ornithine decarboxylase, arginine dihydrolase and trehalose fermentation. The cAMP–cAMP receptor protein (CRP) complex is a global regulator for sucrose, L-arabinose, D-galactose and D-sorbitol fermentation, indole production and myo-inositol utilization. The cellobiose-negative phenotype may be explained by the lack of the global regulator in the vibrio genome. In addition, regulatory proteins may play a critical role. Only seven regulatory proteins (CRP, FNR, IHF, FIS, ArcA, NarL and Lrp) directly modulate the expression of 51 % of the genes in Escherichia coli (Martínez-Antonio & Collado-Vides, 2003). In vibrios, global regulators such as Lrp and the cAMP–CRP complex are also involved in the regulation of virulence and quorum sensing (Alice & Crosa, 2012; Lo Scrudato & Blokesch, 2013). Vibrio phenotypes are also under the control of non-coding RNAs (Silveira et al., 2010).

Fig. 2. The metabolic pathways of ornithine decarboxylase and arginine dihydrolase (map: arginine and proline metabolism). The products of the biochemical reactions, enzymes and respective EC numbers are indicated.

Table 2. Regulatory genes and global regulators of vibrio phenotypes

Phenotypic test; Pathway; Associated regulatory genes; ID; Global regulator; Organism
Voges–Proskauer reaction; Acetoin, butanediol metabolism; Sigma-54 dependent transcriptional regulator; fig|314288.3.peg.1410; Catabolic control protein, CcpA; V. alginolyticus 12G01
Sucrose fermentation; Starch and sucrose metabolism; Sucrose operon repressor, ScrR, LacI family; fig|243277.1.peg.3384; cAMP–CRP complex; V. cholerae O1 biovar eltor str. N16961
Ornithine decarboxylase; Arginine and proline metabolism; Arginine regulatory pathway, ArgR; fig|6666666.580.peg.4755; Catabolic control protein, CcpA; V. harveyi LMG 14126T
L-Arabinose fermentation; Starch and sucrose metabolism; Arabinose operon regulatory protein; fig|223926.1.peg.4758; cAMP–CRP complex; V. parahaemolyticus RIMD 2210633
D-Galactose fermentation; Galactose metabolism; Galactose operon repressor, GalR, LacI family of transcriptional regulators; fig|223926.1.peg.2393; cAMP receptor protein (CRP); V. parahaemolyticus RIMD 2210633
Cellobiose fermentation; β-Glucoside metabolism; Cellobiose and glucan utilization regulator, LacI family, celR; *; *; *
D-Mannitol fermentation; Fructose and mannose metabolism; Mannitol operon repressor, mtlR; fig|223926.1.peg.368; cAMP–CRP complex; V. parahaemolyticus RIMD 2210633
Arginine dihydrolase; Arginine and proline metabolism; Arginine regulatory pathway, ArgR; fig|6666666.580.peg.4755; Catabolic control protein, CcpA; V. harveyi LMG 14126T
Trehalose fermentation; Starch and sucrose metabolism; Trehalose operon transcriptional regulator, treR; fig|6666666.37776.peg.2156; Catabolic control protein, CcpA; V. coralliilyticus P1T
D-Sorbitol fermentation; Fructose and mannose metabolism; Fructose repressor, FruR, LacI family; fig|6666666.26157.peg.3552; cAMP–CRP complex; V. anguillarum 775T
Indole production; Aromatic amino acid degradation; Tryptophanase, TnaA; fig|6666666.17460.peg.357; cAMP–CRP complex; V. natriegens LMG 10935T
myo-Inositol utilization; Inositol catabolism; Transcriptional regulator of the myo-inositol catabolic operon; fig|6666666.37776.peg.3580/4088; cAMP–CRP complex; V. coralliilyticus P1T
D-Mannose fermentation; Fructose and mannose metabolism; Transcriptional regulator of mannoside utilization, manR; *; *; *

*Regulators that were not found in the vibrio genomes analysed.

Determination of the diagnostic phenotypic features depends on the quality of the genome sequences

In some strains, we did not find the genes responsible for previously reported diagnostic phenotypes, for instance the genes related to fermentation of cellobiose in V. anguillarum 775. The genome of this strain is in contigs, which could influence its annotation. In addition, we cannot rule out the effects of sequencing errors on genome annotation, but this should be a minor issue in strain 775, owing to the high genome coverage normally used in genome sequencing projects. To expand the use of whole genome sequences for phenotypic characterization, we suggest high genome coverage (>20×; total number of reads × average read size / mean genome size) and stringent quality control based on Phred error probabilities. For instance, in Ion sequencing and Illumina sequencing, the thresholds for quality control provided by the respective sequencers are Q20 (99 % accuracy of the base call, or one error in 100) and Q30 (99.9 % accuracy of the base call, or one error in 1000), respectively. To automatically retrieve the phenotypic diagnostic features from whole genome sequences of vibrios, we developed a prototype program named vibriophenotyping. This program will allow researchers to search for and identify the diagnostic phenotypes using genome sequences.

Concluding remarks

The major advantage of determining phenotypes from genome sequences is the cumulative nature and portability of these types of datasets. It will also be cheaper and quicker to identify phenotypes using whole genome sequences obtained with new technologies. The phenotypic tables provided in current taxonomic manuals and in publications containing species descriptions are not readily accessible to researchers and are not in a computer-readable format. This is a tremendous hindrance for studies on the metabolic diversity of prokaryotes. In addition, with the further development of microbial genomic taxonomy, genome sequences will have a critical role in the identification and classification of prokaryotes. As a new definition of bacterial species based on genome sequences emerges from various studies, it becomes evident that genomes will be used more frequently in prokaryotic taxonomy.

ACKNOWLEDGEMENTS

C. C. T. and F. L. T. acknowledge grant support from CNPq, CAPES and FAPERJ.

REFERENCES

Alice, A. F. & Crosa, J. H. (2012). The TonB3 system in the human pathogen Vibrio vulnificus is under the control of the global regulators Lrp and cyclic AMP receptor protein. J Bacteriol 194, 1897–1911.

Alsina, M. & Blanch, A. R. (1994). A set of keys for biochemical identification of environmental Vibrio species. J Appl Bacteriol 76, 79–85.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.

Auch, A. F., von Jan, M., Klenk, H.-P. & Göker, M. (2010). Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison. Stand Genomic Sci 2, 117–134.

Aziz, R. K., Bartels, D., Best, A. A., DeJongh, M., Disz, T., Edwards, R. A., Formsma, K., Gerdes, S., Glass, E. M. & other authors (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9, 75.

Drews, G. (2000). The roots of microbiology and the influence of Ferdinand Cohn on microbiology of the 19th century. FEMS Microbiol Rev 24, 225–249.

Farmer, J. J. & Hickman-Brenner, F. W. (2006). The Genera Vibrio and Photobacterium. In The Prokaryotes. A Handbook on the Biology of Bacteria: Proteobacteria: Gamma Subclass, 3rd edn, vol. 6, pp. 508–563. Edited by M. Dworkin, S. Falkow, E. Rosenberg, K.-H. Schleifer & E. Stackebrandt. Berlin: Springer.

Garza, D. R., Thompson, C. C., Loureiro, E. C. B., Dutilh, B. E., Inada, D. T., Junior, E. C. S., Cardoso, J. F., Nunes, M. R. T., de Lima, C. P. S. & other authors (2012). Genome-wide study of the defective sucrose fermenter strain of Vibrio cholerae from the Latin American cholera epidemic. PLoS ONE 7, e37283.

Gottesman, S. (1984). Bacterial regulation: global regulatory networks. Annu Rev Genet 18, 415–441.

Hunt, D. E., David, L. A., Gevers, D., Preheim, S. P., Alm, E. J. & Polz, M. F. (2008). Resource partitioning and sympatric differentiation among closely related bacterioplankton. Science 320, 1081–1085.

Konstantinidis, K. T. & Stackebrandt, E. (2013). Defining Taxonomic Ranks. In The Prokaryotes: Prokaryotic Biology and Symbiotic Associations, 4th edn, p. 229. Edited by E. Rosenberg, E. F. DeLong, S. Lory, E. Stackebrandt & F. L. Thompson. New York: Springer.

Lo Scrudato, M. & Blokesch, M. (2013). A transcriptional regulator linking quorum sensing and chitin induction to render Vibrio cholerae naturally transformable. Nucleic Acids Res 41, 3644–3658.

Martínez-Antonio, A. & Collado-Vides, J. (2003). Identifying global regulators in transcriptional regulatory networks in bacteria. Curr Opin Microbiol 6, 482–489.

Moreira, A. P. B., Pereira, N., Jr & Thompson, F. L. (2011). Usefulness of a real-time PCR platform for G+C content and DNA-DNA hybridization estimations in vibrios. Int J Syst Evol Microbiol 61, 2379–2383.

Noguerola, I. & Blanch, A. R. (2008). Identification of Vibrio spp. with a set of dichotomous keys. J Appl Microbiol 105, 175–185.

Preheim, S. P., Timberlake, S. & Polz, M. F. (2011). Merging taxonomy with ecological population prediction in a case study of Vibrionaceae. Appl Environ Microbiol 77, 7195–7206.

Silveira, A. C. G., Robertson, K. L., Lin, B., Wang, Z., Vora, G. J., Vasconcelos, A. T. R. & Thompson, F. L. (2010). Identification of non-coding RNAs in environmental vibrios. Microbiology 156, 2452–2458.

Thompson, F. L., Iida, T. & Swings, J. (2004). Biodiversity of vibrios. Microbiol Mol Biol Rev 68, 403–431.

Thompson, F. L., Gevers, D., Thompson, C. C., Dawyndt, P., Naser, S., Hoste, B., Munn, C. B. & Swings, J. (2005). Phylogeny and molecular identification of vibrios on the basis of multilocus sequence analysis. Appl Environ Microbiol 71, 5107–5115.

Thompson, C. C., Vicente, A. C. P., Souza, R. C., Vasconcelos, A. T. R., Vesth, T., Alves, N., Jr, Ussery, D. W., Iida, T. & Thompson, F. L. (2009). Genomic taxonomy of vibrios. BMC Evol Biol 9, 258.

Thompson, C. C., Vieira, N. M., Vicente, A. C. P. & Thompson, F. L. (2011). Towards a genome based taxonomy of Mycoplasmas. Infect Genet Evol 11, 1798–1804.

Thompson, C. C., Emmel, V. E., Fonseca, E. L., Marin, M. A. & Vicente, A. C. (2013a). Streptococcal taxonomy based on genome sequence analyses. F1000 Res 2, 1–9.

Thompson, C. C., Silva, G. G. Z., Vieira, N. M., Edwards, R., Vicente, A. C. P. & Thompson, F. L. (2013b). Genomic taxonomy of the genus Prochlorococcus. Microb Ecol 66, 752–762.

Vandamme, P., Pot, B., Gillis, M., de Vos, P., Kersters, K. & Swings, J. (1996). Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 60, 407–438.

Willems, A., Doignon-Bourcier, F., Goris, J., Coopman, R., de Lajudie, P., De Vos, P. & Gillis, M. (2001). DNA-DNA hybridization study of Bradyrhizobium strains. Int J Syst Evol Microbiol 51, 1315–1322.
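Note on the sequencing-quality recommendation. The coverage formula and Phred thresholds given in the section on genome sequence quality can be written out as a short calculation. The following Python sketch is illustrative only and is not part of the vibriophenotyping prototype: the function names, the use of a mean quality score per run, and the example read counts and genome size are all assumptions made here. It relies only on the coverage definition stated in the text and on the standard Phred relation Q = -10 log10(p).

```python
def genome_coverage(n_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Coverage as defined in the text:
    total number of reads x average read size / mean genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_length / genome_size


def phred_error_probability(q: float) -> float:
    """Standard Phred relation Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def meets_recommendation(n_reads: int, mean_read_length: float, genome_size: int,
                         mean_quality: float, min_coverage: float = 20.0,
                         min_quality: float = 20.0) -> bool:
    """Hypothetical check: coverage strictly above 20x and a mean Phred score
    at or above the chosen threshold (Q20 by default; pass 30 for Q30).
    Using a single mean score per run is a simplifying assumption."""
    coverage = genome_coverage(n_reads, mean_read_length, genome_size)
    return coverage > min_coverage and mean_quality >= min_quality


if __name__ == "__main__":
    # Illustrative numbers only: 1,000,000 reads of 150 bp on a 5 Mb genome.
    cov = genome_coverage(1_000_000, 150, 5_000_000)
    print(f"coverage: {cov:.1f}x")
    for q in (20, 30):
        print(f"Q{q}: error probability {phred_error_probability(q):.4f}")
    print("meets recommendation:", meets_recommendation(1_000_000, 150, 5_000_000, 30))
```

In practice quality filtering is usually applied per base or per read rather than to a run-wide mean, so the check above should be read as a rough screen, not as a replacement for the quality reports produced by the sequencer.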