GENOMIC EPIDEMIOLOGY OF CAMPYLOBACTER JEJUNI
Associate Professor DVM Mirko Rossi, University of Helsinki

DISCLAIMER
Some data presented are part of the INNUENDO project, which has received funding from the European Food Safety Authority (EFSA), grant agreement GP/EFSA/AFSCO/2015/01/CT2 (New approaches in identifying and characterizing microbial and chemical hazards). The conclusions, findings, and opinions expressed in this presentation reflect only the view of the authors and not the official position of EFSA.

CHICKEN IS THE MAIN RESERVOIR
[Figure values, in extraction order: 65.4%; total ~90/10^5; ~3%; Jun/Sep; 25.6%; 9%; ~20/10^5 domestic]
Source attribution of Finnish human cases is similar to that in other countries
Chicken and ruminants (bovine) as MAIN RESERVOIRS (De Haan et al., 2012)

IS CHICKEN MEAT THE DIRECT SOURCE OF INFECTION IN DOMESTIC CASES?
Decreased sero-genotype association between human and chicken isolates when accounting for temporal clustering (Kärenlampi 2003 JCM)
Temporal shift between poultry and human isolates for the two dominant C. jejuni STs (Kovanen 2016 IJFM)
Peak in humans in week 26 (July); peak in poultry in August (Llarena 2014 PLoSONE)
Do humans and poultry share a common source?
[Figure: Finnish domestic cases and poultry isolates from 2012]

FROM SOURCE ATTRIBUTION TO SOURCE TRACKING
EFSA Panel on Biological Hazards (BIOHAZ), 2010:
  Chicken as reservoir might account for up to 80% of the human cases
  Broiler meat as direct source might account for only 20-30%
  50-60% unknown transmission routes
In Norway, contemporary space-time clusters of Campylobacter spp.
in humans and in broilers shared risk factors (Jonsson 2010 IJHG)
Interventions targeting those common factors would be more effective in reducing Campylobacter infections

FROM SOURCE ATTRIBUTION TO SOURCE TRACKING
Spatial-temporal clusters of apparently sporadic cases: diffuse outbreaks are probably common
Norway: up to 19.6% of the human cases (maximum radius of 50 km and maximum time of 30 days) (Jonsson 2010 IJHG)
UK: ~13% of the human cases (lasting up to 50 days); 5 times more common than general or household outbreaks
Strachan & Forbes 2014; Gabriel 2010 EAI

WHAT ARE THE MOST LIKELY SOURCES OF THESE CLUSTERS?
Integration of genome sequencing in surveillance for detection of epidemiologically related cases occurring in a less demographically distinct group (Fernandes 2015 CID)

GENOMIC DIVERSITY OF CAMPYLOBACTER JEJUNI
Genomic diversity within an outbreak & Epidemic and global dispersion

GENOMIC DIVERSITY WITHIN AN OUTBREAK
1 - Genomic variation after a single human passage
  Genetic heterogeneity of Campylobacter jejuni NCTC 11168 upon human infection (Revez 2013 IGE)
2 - Genomics of point-source Campylobacter outbreaks
  Genomic Variation between Campylobacter jejuni Isolates Associated with Milk-Borne-Disease Outbreaks (Revez 2014 JCM)
  Genome analysis of Campylobacter jejuni strains isolated from a waterborne outbreak (Revez 2014 BMC Genomics)
  Refinement of whole-genome multilocus sequence typing analysis by addressing gene paralogy (Zhang 2015 JCM)

EPIDEMIC AND GLOBAL DISPERSION
3 - Discovering diffuse Campylobacter outbreaks using genomics
  Multilocus Sequence Typing (MLST) and Whole-Genome MLST of Campylobacter jejuni Isolates from Human Infections in Three Districts during a Seasonal Peak in Finland (Kovanen 2014 JCM)
  Tracing isolates from domestic human Campylobacter jejuni infections to chicken slaughter batches and swimming water using whole-genome multilocus sequence typing (Kovanen 2016 IJFM)
4 - Genome phylogeography of C.
jejuni and its implication in epidemiology
  Monomorphic genotypes within a generalist lineage of Campylobacter jejuni show signs of global dispersion (Llarena 2016 MGen)

GENETIC VARIATION AFTER HUMAN PASSAGE
Revez 2013 IGE
11168: p < 0.01
Single deletion in a single gene
Frequencies of in-frame status of contingency genes:
  Cj1139c: β-1,3 galactosyltransferase
  Cj0045c: iron-binding protein
  Cj1145c & Cj0456c: hypothetical proteins
Confirmed by independent research on 11168 (although with different sets of genes): deletions in two loci
Thomas 2014 PLoSONE; Van Amsterdam 2006 FEMS

STANDING GENETIC VARIATION IN CONTINGENCY LOCI AND RAPID ADAPTATION
Homopolymeric runs: rapid SNV during population growth in the absence of selection (rate ~10- to 100-fold faster) (Bayliss 2012 NAR)
Bottleneck effect and selection are responsible for the differences observed in vivo (host specificity: human ≠ mice ≠ chicken)
E.g. expression of specific LOS structures for immune evasion (variation of Cj1139c: GM1/GM2)
Baseline for variation within an outbreak: 1-2 SNVs + indels in homopolymeric runs

GENOMICS OF THE MILK-BORNE CAMPYLOBACTER ST-50 OUTBREAK (2002)
Revez 2014 JCM
All PubMLST alleles for 1,738 loci; 1,432 shared loci
Customized BLAST+ script for wgMLST (extracts all the shared loci with allele information in all the samples, plus new alleles)
Intra-outbreak: up to 12 allele differences affecting 15 genes
80% of the differences = homopolymeric runs (only 3 SNVs, Po_1 and Le_204R)
~250 allele differences from an ST-50 isolate of a separate milk-borne outbreak > 10 years apart (1,404 shared loci)
[Figure labels: 8th Jan; 18th Feb; 16th Dec; Finnish hen; outbreak strains; UK. Legend: Pe = patient, Ma = milk, Le = bovine]

GENOMICS OF THE WATERBORNE CAMPYLOBACTER OUTBREAK (2000)
Revez 2014 BMCg
Manually filtered isolated SNVs: 3 and 69
Using BIGSdb to retrieve the allele profile for all 1,287 matches in the DB: 8-23 allele differences
Difference from strains isolated 12 years apart having the same PFGE profile = 64 SNVs (~8 allele differences)
Phylogenomics and genealogy revealed two
strains circulating in the outbreak
Av.d. 0.0175 (~23 allele differences)
Av.d. 0.0061 (~8 allele differences)
IHV116260 vs 4031: 3 SNVs
IHV116292 vs 4031: 69 SNVs
6236/12 vs 4031: 64 SNVs

GENOMICS OF POINT-SOURCE CAMPYLOBACTER OUTBREAKS
Zhang 2015 JCM
1-7 allele differences when using wgMLST with ~1,200-1,500 shared loci
[Figure: pairwise allelic-diversity matrix; groups shown: poultry farm epi-linked strains, waterborne outbreak]

DISCOVERING DIFFUSE CAMPYLOBACTER OUTBREAKS USING GENOMICS
wgMLST of human domestic cases in 2012 from three hospital districts
Whole-genome analysis using ~70% of total loci + BIGSdb: clusters of 1-8 allele differences
Genetically closely related isolates within the STs: possible diffuse outbreak?
[Figure labels: 1,121; 1,264]
Genealogy revealed genetic diversity within ~identical wgMLST types (e.g. cluster 1 ST-45)
Higher similarity in pairs with high allele diversity (e.g.
cluster 2 ST-45)
Genealogy (ClonalFrame)
Kovanen 2014 JCM

TRACING THE SOURCE OF A POSSIBLE DIFFUSE CAMPYLOBACTER OUTBREAK
Kovanen 2016 IJFM
A hierarchical approach discovers more diversity (Genome Profiler, Zhang 2015 JCM)
Coupled with temporal clustering

GENOMIC ANATOMY OF CAMPYLOBACTER OUTBREAKS
Changes in homopolymeric tracts were the main differences within epi-linked strains
1-12 allele differences, ~3 SNVs
Fernandes 2015 CID
In a British milk-borne outbreak, a mean of 4 allele differences over 1,577 loci: equivalent to the differences seen between 2 isolates from the same patient; such differences do not appear to occur among isolates with no epidemiological relationship
Cody 2013 JCM
Diffuse Campylobacter outbreaks are possible, but their sources are unknown
Information on the baseline genomic diversity of the C. jejuni population is missing; how much does that affect epidemiology?

MONOMORPHIC GENOTYPES WITHIN A CAMPYLOBACTER CLONAL COMPLEX
Spatial-temporal evolution of the ST-45 clonal complex (Llarena 2016 MGen)
Genealogy reconstruction of ST-45 CC: UK, Finland and Baltic countries, from 1999 to 2013
Little or no spatial clustering; no temporal clustering
Little genetic diversity over time and space for certain populations (green and violet; arrows)
MLST typing is not always correlated with genealogy
[Figure legend: country, red = UK, blue = Finland; year of isolation, different colour = different year; branch colour = BAPS population]

GLOBAL DISPERSION OF MONOMORPHIC GENOTYPES
Llarena 2016 MGen
Lack of genetic isolation by distance
Sampling date had a weak correlation with the root-to-tip distance
Overall genetic diversity in 12 years ~ genetic diversity in a single year
Signs of global dispersion
[Figure labels: geese, Canada 2011; Italy 2015]
Predicted TMRCA using the Wilson 2009 mutation rate: NOT FITTING WITH SAMPLING DATES

IMPLICATIONS FOR EPIDEMIOLOGY
Llarena 2016 MGen
Separation of clustered and sporadic cases based only on genomic diversity is not possible
A dominant circulating clone simulates diffuse outbreaks
Different STs have a different "history":
difficult to predict species-wide dynamics
Unknown reason for the expansion of these clones: neutral evolution in the form of mild purifying selection vs sporadic selection
These are actually the BAPS 6 monomorphic clones (Kovanen 2014 JCM)

You can find everything about my presentation in a recent review available as a pre-print:
http://biorxiv.org/content/early/2016/10/01/078550

WG/CG-MLST
Surveillance and outbreak investigation

REFINEMENT OF WG-MLST BY ADDRESSING GENE PARALOGY
Easy-to-use local ad hoc analysis
GeP addresses paralogy with Conserved Gene Neighborhoods (Zhang 2015 JCM)
  I. Failing to distinguish orthologous from paralogous loci
  II. Assigning a missing locus as an allele
  III. Missing alleles due to high sequence diversity
  IV. Including ambiguities in allele definition
  V. Alleles composed of overlapping loci
  VI. Excluding homopolymeric runs

REFINEMENT OF WG-MLST BY ADDRESSING GENE PARALOGY
https://sourceforge.net/projects/genomeprofiler/ (~300 downloads worldwide)
GeP is precise but quite slow, especially for highly divergent strains
It is designed for ad hoc wgMLST analysis: useful for extracting core alignments and for small analyses
Independently of the analysis, the topologies of the split graphs are very similar (Zhang 2015 JCM)

INNUENDO: A STANDARDIZED CROSS-SECTORIAL FRAMEWORK
Data collection; data analysis
A portable platform for automatic real-time application of wg/cgMLST in a public health setting
A predictive model for forecasting epidemiological relationships between isolates
Place project stakeholders at the center of the platform design and development
Proof-of-concept; platform development
Campylobacter, Yersinia, STEC, Salmonella

PROJECT WORKFLOW
A. Quality assurance (the product is the "correct" cgMLST profile)
  I. Define the general measures and write the pipeline (WP3)
  II. Define the species-specific cut-off values and their effect on allele calling (WP2)
B. Calibration (known samples)
  i. cgMLST schema development/selection
  ii. Cut-off for epi-linked strains
C. Validation (unknown samples):
Resolution of cgMLST (when do we need ad hoc schemas?)
Inferring transmission patterns

QUALITY ASSURANCE - INNUCA
https://github.com/INNUENDOCON/INNUca
INNUENDO quality control of reads, de novo assembly and contigs quality assessment, and possible contamination search
1. INNUca v1: theoretical coverage + assembly statistics
2. INNUca v2, species-specific:
  A. "True" coverage estimation using reference genes (ReMatCh)
  B. Probability of identifying non-axenic samples
  C. Effect of coverage on allele calling: defining cut-offs
[Pipeline diagram: INNUca v1.0; INNUca v2.0 adds coverage estimation + contamination check (ReMatCh) and assembly correction (Pilon)]

THE ALLELE CALLING ENGINE: CHEWBBACA
https://github.com/mickaelsilva/chewBBACA
Twitter: @jacarrico, University of Lisbon
"Comprehensive and Highly Efficient Workflow" BSR-Based Allele Calling Algorithm
Faster than every other allele caller
Portable (does not need a lot of computing)
Allele call outputs:
  LOT: Locus on the tip
  PLOT: Possible LOT
  Size threshold selection: choosing allele length mode +/- 20%
  More than one match with BSR > 0.6: Non Informative Paralogous Locus
  ASM / ALM: Allele smaller than Mode / Allele larger than Mode

DEFINING THE CORRECT SCHEMA, NOMENCLATURE AND WORKFLOW
C. coli/C. jejuni v1 from PubMLST: problem with the definition of core genomes
Build a novel C. jejuni cgMLST schema, validating each locus
Test different "small" schemas = fractions of the core genomes (collaboration with Ed Taboada)
Hierarchical WGS typing (inclusion of ad hoc analysis)
2 phases of the analysis:
  1. Surveillance (which can be based even on a sort of small cgMLST), linked to nomenclature
  2. Outbreak investigation, which is based on ad hoc wgMLST
Cut-off definition for population structure
Cut-off for epi-linked strains as % of shared genes
cgMLST schema or small cgMLST
Ad hoc wgMLST analysis for outbreak investigation

ACKNOWLEDGMENT
UniHelsinki: Prof.
Emerita Marja-Liisa Hänninen, Prof. Jukka Corander, Dr. Ann-Katrin Llarena, Dr. Joana Revez (now ECDC), Dr. Ji Zhang (now Massey University, NZ), Dr. Schott Thomas (now Rostock University), Dr. Rauni Kivistö, Dr. Astrid Skarp (now Hogeschool Rotterdam), PhD student Sara Kovanen, Dr. Vehkala M., Dr. Välimäki N.
EVIRA: Dr. Marjaana Hakkinen
THL: Dr. Saara Salmenlinna, Dr. Jani Halkilahti
For providing the human strains: Dr. Kärkkäinen U.M., Dr. Tuuminen T., Dr. Uksila J., Prof. Rautelin H (Uppsala University)

The INNUENDO consortium (www.innuendoweb.org)
University of Lisbon (PT): Dr. Joao Carrico, PhD student Miguel Machado, PhD student Bruno Goncalves, PhD student Mickael Silva
University of Basque Countries (ES): Prof. Javier Garaizar, Dr. Joseba Bikandi
University of Veterinary Medicine (Austria): Prof. Friederike Hilbert
IZS Teramo (Italy): Dr. Elisabetta Giannatale, Dr. Giuliano Garofolo, Dr. Cesare Cammà
UniTartu (Estonia): Prof. Mati Roasto, PhD student Mäesaar M.
INSA: Dr. Monica Oleastro, Dr. Vitor Borges
Public Health Agency of Canada: Eduardo Taboada, Dillon Barker

Funding agencies: Academy of Finland; EFSA; University of Helsinki; Walter Ehrströmin Säätiö; Ministry of Agriculture and Forestry, Finland; ERA-NET

Thanks for your attention
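
APPENDIX: SKETCH OF A WGMLST ALLELE-DIFFERENCE COMPARISON
The slides repeatedly compare isolates by counting allele differences over the loci shared between them (for example "1-12 allele differences" within outbreaks and "clusters of 1-8 allele differences" among domestic cases). The Python sketch below illustrates that idea only; it is not the BLAST+ script, BIGSdb, GeP or chewBBACA. The function names, the use of None for a missing locus, the single-linkage grouping and the toy profiles are all assumptions made for illustration.

```python
from itertools import combinations

MISSING = None  # assumed marker for a locus with no allele call


def allele_distance(profile_a, profile_b):
    """Return (differences, shared_loci) for two allele profiles.

    Profiles map locus name -> allele identifier. Only loci called in
    both isolates are compared, so a missing locus is never counted
    as an allele difference.
    """
    shared = [
        locus
        for locus in profile_a.keys() & profile_b.keys()
        if profile_a[locus] is not MISSING and profile_b[locus] is not MISSING
    ]
    differences = sum(1 for locus in shared if profile_a[locus] != profile_b[locus])
    return differences, len(shared)


def single_linkage_clusters(profiles, max_differences):
    """Group isolates connected by chains of pairs within max_differences."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        differences, _ = allele_distance(profiles[a], profiles[b])
        if differences <= max_differences:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy profiles with invented alleles (not data from the presentation).
profiles = {
    "isolate_1": {"locus_a": 1, "locus_b": 4, "locus_c": 2, "locus_d": 7},
    "isolate_2": {"locus_a": 1, "locus_b": 4, "locus_c": 3, "locus_d": 7},
    "isolate_3": {"locus_a": 9, "locus_b": 8, "locus_c": 5, "locus_d": MISSING},
}

differences, shared = allele_distance(profiles["isolate_1"], profiles["isolate_2"])
proportion = differences / shared  # allele differences per shared locus
clusters = single_linkage_clusters(profiles, max_differences=1)
```

Dividing the number of differences by the number of shared loci gives a proportion comparable across analyses with different locus counts. On the figures quoted in the waterborne-outbreak slide, 8 differences over 1,287 loci is about 0.006, which is consistent with the reported Av.d. of 0.0061. Any cut-off passed as max_differences is a choice that would need calibration on known epi-linked strains, as the project workflow slide indicates.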