GENOME-BASED IDENTIFICATION OF BACTERIAL PATHOGENS
Michel DRANCOURT, MD, PhD
Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes
Marseille Medical School, Marseille, France
http://www.mediterranee-infection.com/

The diagnosis of infectious diseases
[Nature Rev Microbiol. 2004;2:151-9]

Accurate Identification of Bacteria
• Accurate diagnosis of the disease
• Prognosis
• Treatment
• Secondary prophylaxis
• Further investigations

Traditional Identification
• Isolation and culture
• Culture characteristics
• Gram staining
• Simple biochemical tests: catalase, oxidase, urease
• Biochemical profiling
  - Enzyme activity profile
  - Auxanogram

When do we use a molecular identification tool?
• Uncultured / fastidious bacteria
• Phenotypically inert bacteria
• New bacterial species
• Unusual bacteria
• Usual bacteria from an unusual clinical specimen / situation

Annual numbers of sequenced genomes until October 15th, 2012
(Figure: bar chart of yearly counts, 1995 to 2012.)

Burden of bacterial isolates requiring molecular identification

Bacteria                  No. of isolates   No. (%) of isolates identified   No. of rare or
                          tested            by 16S rDNA analysis             unique isolates (a)
Gram-positive cocci       75,537            300 (0.40)                       6
Gram-positive rods (b)    16,487            524 (3.18)                       86
Gram-negative bacteria
  Enteric bacteria        51,177            132 (0.26)                       3
  Other bacteria          26,357            225 (0.85)                       14
Anaerobic bacteria        5,780             223 (3.86)                       11
Total                     175,338           1,404 (0.80)                     120

(a) Isolates from human samples that have been reported 0 to 10 times.
(b) Including Mycobacterium spp.
[Drancourt M. et al. J Clin Microbiol.
2004;42:2197]

Molecular Identification of Bacteria
• DNA sequencing
• DNA hybridization
• Real-time PCR, SYBR Green
• Real-time PCR, probe
• Mass spectrometry

Identification of Bacteria by DNA Sequencing: flow chart
Isolate -> DNA extraction -> Target PCR amplification -> Sequencing -> Sequence analysis / comparison against databases -> Identification, phylogenetic analysis

DNA extraction: a wide range of protocols
• No extraction: the thermocycler will do the job!
• Cell-wall lysis:
  * Mechanical
    - Heat
    - Glass beads
  * Enzymatic
    - Proteinase K
    - Lysozyme
• Cell-wall lysis plus DNA extraction

DNA extraction: a wide range of protocols
• Manual protocol
• Semi-automatic protocol
• Automatic protocol

DNA Sequencing
• Capillary sequencer
• Pyrosequencer
• Mass spectrometer
• High-throughput pyrosequencer

Perform 16S rDNA-based identification of isolates using BLAST
>Mabscessus
TAGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGA
AAGGCCCTTCGGGGTACTCGAGTGGCGAACGGGTGAGTAACACGTGGGTGATCTGCCCTGCACT
CTGGGATAAGCCTGGGAAACTGGGTCTAATACCGGATAGGACCACACACTTCATGGTGAGTGGT
GCAAAGCTTTTGCGGTGTGGGATGAGCCCGCGGCCTATCAGCTTGTTGGTGGGGTAATGGCCCA
CCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGTGACCGGCCACACTGGGACTGAGATACGGCC
CAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCGAC
GCCGCGTGAGGGATGACGGCCTTCGGGTTGTAAACCTCTTTCAGTAGGGACGAAGCGAAAGTGA
CGGTACCTACAGAAGAAGGACCGGCCAACTACGTGCCAGCAGCCGCGGTAATACGTAGGGTCCG
AGCGTTGTCCGGAATTACTGGGCGTAAAGAGCTCGTAGGTGGTTTGTCGCGTTGTTCGTGAAAA
CTCACAGCTTAACTGTGGGCGTGCGGGCGATACGGGCAGACTAGAGTACTGCAGGGGAGACTGG
AATTCCTGGTGTAGCGGTGGAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGGTCT
CTGGGCAGTAACTGACGCTGAGGAGCGAAAGCGTGGGTAGCGAACAGGATTAGATACCCTGGTA
GTCCACGCCGTAAACGGTGGGTACTAGGTGTGGGTTTCCTTCCTTGGGATCCGTGCCGTAGCTA
ACGCATTAAGTACCCCGCCTGGGGAGTACGGTCGCAAGACTAAAACTCAAAGGAATTGACGGGG
GCCCGCACAAGCGGCGGAGCATGTGGATTAATTCGATGCAACGCGAAGAACCTTACCTGGGTTT
GACATGCACAGGACGTACCTAGAGATAGGTATTCCCTTGTGGCCTGTGTGCAGGTGGTGCATGG
CTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCCTA
TGTTGCCAGCGGGTAATGCCGGGGACTCGTAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGG
GGATGACGTCAAGTCATCATGCCCCTTATGTCCAGGGCTTCACACATGCTACAATGGCCAGTAC
AGAGGGCTGCGAAGCCGTAAGGTGGAGCGAATCCCTTAAAGCTGGTCTCAGTTCGGATTGGGGT
CTGCAACTCGACCCCATGAAGTCGGAGTCGCTAGTAATCGCAGATCAGCAACGCTGCGGTGAAT
ACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCATGAAAGTCGGTAACACCCGAAGCCAGT
GGCCTAACCTTTTGGAGGGAGCTGTCGAAGGTGGGATCGGCGATTGGGACGAAGTCGTAACAAG
GTAGCCGTA
http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
16S rDNA-based: rapidly growing mycobacteria example.

Identification of Bacteria by DNA Sequencing: the choice of the molecular target
• Universal target
• Genus / species-specific target
• Genotype-specific target

Atypical phenotype
  - fastidious bacterium
  - intracellular bacterium
  - biochemically poorly reactive
  - discrepant antibiotic susceptibility pattern
16S rDNA sequence BLAST analysis
  - almost entire 16S rDNA gene
  - less than 0.5 % ambiguous positions
  - similarity threshold shown on the flow chart: 98.7 %
Complete phenotypic and molecular characterisation
  - rpoB gene sequencing
  - 23S rDNA sequencing
  - ITS
  - groEL gene sequencing
  - recA gene sequencing
  - gyrB gene sequencing
[Adapted from Drancourt M, Raoult D. J Clin Microbiol. 2005;43:4311-5]

BACTERIAL GENOME ARCHITECTURE
chromosome, plasmid
[Audic S. et al. PLoS Genet. 2007;3:e138]
[Lescot M. et al. PLoS Genet. 2008;4:e1000185]

BACTERIAL GENOME PLASTICITY: LATERAL GENE TRANSFER
[Raoult D. et al. Genome Res. 2003;13:1800-9]

16S rDNA-based universal detection / identification of bacteria: a powerful tool in medical microbiology

16S rDNA: first-line target for universal identification of Bacteria
• Universal, present in all bacteria
• Universal primers along positions 1 to 1500:
  - forward: fD1, 357f, 536f, 800f, 1050f
  - reverse: 357r, 536r, 800r, 1050r, rP2
• ~ 1,500 bp long: easily sequenced

The plot of 16S rDNA sequence similarity against DNA-DNA hybridization indicates a robust 98.7 % 16S rDNA sequence similarity value for identification
[Stackebrandt E, Ebers J. Microbiol.
Today 2006;33:152]

16S rDNA Databases

Database          Curation            Access       Scope                                       Source
GenBank           Poorly controlled   Free         General microbiology                        http://www.ncbi.nlm.nih.gov/Genbank
EzTaxon           Controlled          Free         General microbiology                        http://www.eztaxon.org, 8,369 entries [Chun J. et al. IJSEM 2007;57:2259]
RDP               Controlled          Free         General microbiology                        rdp.cme.msu.edu
MicroSeq®         Controlled          Commercial   Medical and environmental microbiology      Applied Biosystems
RIDOM 16S rDNA    Controlled          Commercial   Medical microbiology                        Ridom GmbH, rdna.ridom.de [Harmsen D. et al. Nucleic Acids Res. 2002;30:416]

16S rDNA-based achievement: description of new bacterial species (IJSEM 2007-2008 / Total = 35)

16S rDNA-based description of new medical species: sources of isolates (Total = 35)

16S rDNA-based description of new species of medical interest (Total = 35)

16S rDNA-based: querying the microbial nature of microscopic particles
Lack of detection of the universal 16S rDNA agrees with the non-microbial nature of microscopic particles
• « Nanobacterium spp. » [Raoult D. et al. PLoS Pathog. 2008;4:e41]
• Mimivirus [Raoult D. et al. Science 2004;306:1344-50]

Limits of 16S rDNA-based universal identification of bacteria
• 16S rDNA sequence non-discriminant
  - Lack of specific variability
  - Multicopy
  - Lateral gene transfer
• Laboratory contamination
• Databases

16S rDNA limitations
• Sequence heterogeneity between the two copies of the 16S rDNA sequence: M. mucogenicum ATCC 49649 (Adékambi and Drancourt, 2004)
• Erroneous identification may be due to 16S rDNA point mutations associated with aminoglycoside resistance: M. abscessus and M. chelonae (Prammananan et al. 1998)

16S rDNA limitations: mixed culture

16S rDNA limitations
• 16S rDNA underestimates the diversity of rapidly growing mycobacteria (RGM): M. senegalense = M. houstonense
• 16S rDNA resolving power is insufficient to guarantee correct delineation
« It is doubtful whether phylogenetic relationships should be based solely on the 16S rRNA in cases where sequence identities are > 99% » (Drancourt et al. 2000)
M.
wolinskyi belongs to the M. smegmatis group

Why develop alternative molecular tools for universal identification?
• Secondary target to confirm 16S rDNA-based detection / identification
• Refining molecular identification in case of 16S rDNA ambiguity
• By-passing 16S rDNA amplicon contamination

Alternative universal molecular tools for bacteria detection / identification

Target                                                   Available sequences
rpoB (β-subunit of RNA polymerase)                       1,700
23S rDNA                                                 5,400
16S-23S rDNA internal transcribed sequence (ITS)         5,400
groEL (heat-shock protein)                               2,700
gyrB (β-subunit of DNA gyrase)                           3,700
recA (homologous recombination gene)                     3,790
gltA (citrate synthase gene)                             835

rpoB: an alternative tool for the detection / identification of bacteria
• It is an open reading frame (ORF)
• Encodes the β-subunit of RNA polymerase
• 4,000 - 4,500 bp
• Universal
• Single copy, except in Nocardia farcinica (Ishikawa J. et al. Proc Natl Acad Sci USA 2004;101:14925-30)

rpoB-based RGM phylogeny
[Iyer et al., 2005]

rpoB sequence-based taxonomy of RGM

Genes              Bootstrap > 80 %   p value with rpoB
Combined 5 genes   82.4 %             1
rpoB               82.4 %             1
recA               64.7 %             0.4
16S rRNA           56.3 %             0.1
hsp65              47.1 %             0.03
sodA               41.2 %             0.01

RGM taxonomy overview: rpoB-based

Genome Diagnostic and Epidemiology
Florence Fenollar and Didier Raoult. APMIS. 2004;112:785-807.
Fournier PE et al. Lancet Infect Dis. 2007;7:711-23.

ESTIMATION OF GENOMIC G + C CONTENT
• The G + C content is a global characteristic of the genome.
• Genomic G + C content (GCg) can be estimated from the rpoB gene G + C content (GCr): GCg = 1.2065 × GCr - 11.495
[Fournier PE. et al. Int J Syst Evol Microbiol. 2006;56:1025-9]

WHOLE GENOME COMPARISONS IN BACTERIA
• Experimental: DNA-DNA hybridization > 70 %
• Bio-informatics: Average Nucleotide Identity (ANI) > 95 % [Goris J et al. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57:81-91; Konstantinidis KT et al.
Toward a more robust assessment of intraspecies diversity, using fewer genetic markers. Appl Environ Microbiol. 2006;72:7286-93].
• Bio-informatics: rpoB gene sequence similarity > 97.7 % [Adékambi T et al. Complete rpoB gene sequencing as a suitable supplement to DNA-DNA hybridization for bacterial species and genus delineation. Int J Syst Evol Microbiol. 2008;58:1807-14].

ANI: the Mycobacterium tuberculosis complex
(ANI in %; rows: genome, columns: reference, numbered as the rows)

genome \ reference                      1      2      3      4      5      6      7      8      9      10     11
 1 M. tuberculosis CDC1551              -      99.87  99.75  99.73  79.49  79.56  79.12  79.15  71.39  74.17  76.14
 2 M. tuberculosis H37Rv                99.89  -      99.76  99.71  79.40  79.56  79.08  79.06  71.34  74.08  76.17
 3 M. bovis AF2122/97                   99.83  99.80  -      99.90  79.54  79.60  79.04  79.10  71.35  74.13  76.17
 4 M. bovis BCG 1173P2                  99.83  99.80  99.93  -      79.53  79.61  79.07  79.12  71.35  74.11  76.17
 5 M. avium hominissuis                 78.72  78.71  78.81  78.82  -      97.90  78.23  78.28  72.16  75.22  75.35
 6 M. avium subsp. paratuberculosis     79.13  79.14  79.21  79.22  98.71  -      78.42  78.40  72.21  75.40  75.38
 7 M. ulcerans                          79.05  79.05  79.04  79.05  78.44  78.86  -      98.90  71.49  74.09  74.95
 8 M. marinum                           78.50  78.52  78.42  78.47  78.39  78.44  98.06  -      71.44  73.72  74.49
 9 M. abscessus                         71.19  71.20  71.19  71.20  72.10  72.11  71.11  71.30  -      72.09  69.79
10 M. smegmatis                         73.95  73.96  73.97  73.97  75.47  75.52  73.72  73.68  72.04  -      71.45
11 M. leprae                            78.87  78.88  78.89  78.89  78.08  78.08  77.13  77.18  70.66  72.84  -

GENOME-BASED MULTI-LOCUS SEQUENCE-TYPING (MLST)
Integration into the Point of Care (P.O.C.)

GENOME-BASED MULTISPACER SEQUENCE TYPING (M.S.T)
• We first developed MST for paleomicrobiological investigations of ancient Yersinia pestis (plague) [Drancourt M. et al. Emerg Infect Dis. 2007;13:332-3].
• MST relies on genome alignments

MST PRINCIPLE
(Figure: spacers A, B and C lie between Orf 1 to Orf 4; each spacer exists as several sequence variants, Seq 1, Seq 2, Seq 3.)

Strain     Spacer A   Spacer B   Spacer C   MST type
Strain 1   1          1          2          1
Strain 2   2          4          3          2
Strain 3   1          1          2          1
Strain 4   3          3          1          3

Seq 1 differs from Seq 2 by:
  - SNP
  - VNTR
  - Deletion / insertion
These 3 events have the same genetic weight

1- CHOICE OF SPACER
(Figure: alignment of genes A to F in M. tuberculosis H37Rv and M. tuberculosis CDC1551.)
• Spacer sequence extraction (Perl script software)
• Comparison of homologous spacers (Difseq software)
• Spacer selection criteria
  - Sequence length ≤ 500 bp
  - Sequence similarity of 70-99 %
  - Spacer exhibiting ≥ 3 genotypes
• 8 most variable spacers:
  - 4 described VNTR loci (ETR)
  - 4 newly evaluated spacers

M.S.T. APPLICATIONS: identification of Mycobacterium tuberculosis complex species
* Selection of spacers: MST1, MT2221, MST2, MST3, ETR-B, ETR-C, ETR-D, Mtub21
  - 6 single nucleotide polymorphisms
  - Variable number tandem repeat (1-7)
  - 2 deletions
All MTC species at once
[Djelouadji Z. et al. PLoS One. 2008;3:e2433]

(Figure labels: Euro-American lineage, West-African 1 lineage, Indo-Oceanic lineage, East-African Indian lineage, West-African 2 lineage, Asian lineage, Beijing family.)
[Djelouadji Z. et al. PLoS Negl Trop Dis. 2008;2:e253]

APPLICATIONS OF M.S.T

Micro-organism            Application   Reference
M. avium complex          I, G          Cayrou C. et al. unpublished
M. tuberculosis complex   I, G          Djelouadji Z. et al. PLoS ONE 2008
                                        Djelouadji Z. et al. PLoS Negl Trop Dis 2008
T. whipplei               G             Li W. et al. Microbiology 2008
R. prowazekii             G             Zhu Y. et al. J Clin Microbiol. 2005
R. sibirica               G             Zhang L. et al. J Clin Microbiol. 2006
C. burnetii               G             Glazunova O. et al. Emerg Infect Dis. 2005
B. henselae               G             Li W. et al. J Clin Microbiol. 2006
B. quintana               G             Foucault C. et al. J Clin Microbiol. 2004
                                        Wooley MW. et al. J Clin Microbiol. 2007
Y. pestis                 G             Drancourt M. et al. Emerg Infect Dis.
2004

Identification based on specific genes / targets

Bacterial species                                            Gene target
Neisseria meningitidis                                       ctrA - crgA
Streptococcus pneumoniae                                     PlyN - LytA
Escherichia coli                                             rpoB
Listeria monocytogenes                                       hlyQ
Mycoplasma pneumoniae                                        P1
Bordetella pertussis                                         IS481
Chlamydia (C. pneumoniae, C. psittaci, C. trachomatis)       Omp2
Staphylococcus aureus                                        nucA
Borrelia burgdorferi                                         FliD
Streptococcus agalactiae (group B)                           Sip

GENOME-BASED, RANDOM ACCESS IDENTIFICATION

Perspectives: place of sequence-based identification in the near future
(Diagram elements: specimen, culture, isolate, high-throughput phenotypic identification, sequence-based identification.)

Perspectives: DNA sequence-based identification in the near future
• Universal gene (16S rDNA, rpoB) sequence for the description of new species
• Specific gene sequence to resolve unique situations: oral Streptococcus spp.
• Decreasing role for routine identification of isolates:
  - Mass-spectrometry profiling [Sauer S. et al. PLoS ONE 2008;3:e2843]
  - High-throughput biochemical profiling

TARGETING GENOME REPEATS: TO INCREASE TEST SENSITIVITY

Resolution of genome-based sequencing methods for bacteria detection, identification, genotyping
• Methods shown: multispacer sequence typing (MST), surface protein gene sequencing, multilocus sequence typing (MLST), random access sequence, species-specific gene, repeats
• Resolution scale: species, sub-species, isolates
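
Worked illustration of the numeric rules quoted in these slides. The Python sketch below is only a minimal example, not the authors' software: the function and variable names are hypothetical, the thresholds are the values shown on the slides (98.7 % for 16S rDNA, 97.7 % for rpoB, 95 % for ANI), and using a strict "greater than" test for all three markers is an assumption, since the slides give no direction for the 16S rDNA value. MST types are numbered by order of first appearance, which is also an assumption; it happens to reproduce the MST PRINCIPLE table.

```python
from __future__ import annotations

# Thresholds quoted on the slides, in percent (assumed to be applied with ">").
SPECIES_THRESHOLDS = {
    "16S rDNA": 98.7,  # 16S rDNA sequence similarity
    "rpoB": 97.7,      # complete rpoB gene sequence similarity
    "ANI": 95.0,       # average nucleotide identity
}


def estimate_genomic_gc(rpob_gc_percent: float) -> float:
    """Estimate genomic G + C (%) from rpoB G + C (%): GCg = 1.2065 x GCr - 11.495."""
    return 1.2065 * rpob_gc_percent - 11.495


def meets_species_threshold(marker: str, similarity_percent: float) -> bool:
    """True when the similarity exceeds the threshold quoted for this marker."""
    return similarity_percent > SPECIES_THRESHOLDS[marker]


def assign_mst_types(profiles: dict[str, tuple[int, ...]]) -> dict[str, int]:
    """Give each distinct combination of spacer genotypes one MST type number.

    Types are numbered in order of first appearance (illustrative choice).
    """
    type_of_profile: dict[tuple[int, ...], int] = {}
    types: dict[str, int] = {}
    for strain, profile in profiles.items():
        if profile not in type_of_profile:
            type_of_profile[profile] = len(type_of_profile) + 1
        types[strain] = type_of_profile[profile]
    return types


# Spacer genotypes (Spacer A, Spacer B, Spacer C) from the MST PRINCIPLE slide.
EXAMPLE_PROFILES = {
    "Strain 1": (1, 1, 2),
    "Strain 2": (2, 4, 3),
    "Strain 3": (1, 1, 2),
    "Strain 4": (3, 3, 1),
}

if __name__ == "__main__":
    # Strains 1 and 3 share the same spacer genotypes, hence the same MST type.
    print(assign_mst_types(EXAMPLE_PROFILES))
    # ANI values taken from the ANI table: CDC1551 vs H37Rv, CDC1551 vs M. avium.
    print(meets_species_threshold("ANI", 99.87))
    print(meets_species_threshold("ANI", 79.49))
    # Genomic G + C estimated for a hypothetical rpoB G + C content of 60 %.
    print(estimate_genomic_gc(60.0))
```

A single threshold test is of course a simplification: the slides themselves recommend a secondary target when 16S rDNA is ambiguous, so a result from `meets_species_threshold` would only be one line of evidence.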