REVIEWS

Helicobacter pylori evolution and phenotypic diversification in a changing host

Sebastian Suerbaum and Christine Josenhans

Abstract | Helicobacter pylori colonizes the stomachs of more than 50% of the world’s population, making it one of the most successful of all human pathogens. One striking characteristic of H. pylori biology is its remarkable allelic diversity and genetic variability. Not only does almost every infected person harbour their own individual H. pylori strain, but strains can undergo genetic alteration in vivo, driven by an elevated mutation rate and frequent intraspecific recombination. This genetic variability, which affects both housekeeping and virulence genes, has long been thought to contribute to host adaptation, and several recently published studies support this concept. We review the available knowledge relating to the genetic variation of H. pylori, with special emphasis on the changes that occur during chronic colonization, and argue that H. pylori uses mutation and recombination processes to adapt to its individual host by modifying molecules that interact with the host. Finally, we put forward the hypothesis that the lack of opportunity for intraspecies recombination as a result of the decreasing prevalence of H. pylori could accelerate its disappearance from Western populations.

Medizinische Hochschule Hannover, Institut für Medizinische Mikrobiologie und Krankenhaushygiene, Carl-Neuberg-Strasse 1, 30625 Hannover, Germany. e-mails: suerbaum.[email protected]; josenhans.christine@mh-hannover.de
doi:10.1038/nrmicro1658
NATURE REVIEWS | MICROBIOLOGY

RAPD (Random amplification of polymorphic DNA). A simple method to assess the genetic relatedness of bacterial strains within one species. A single short primer is used in PCR reactions and the resulting band patterns are compared.

The Gram-negative bacterium Helicobacter pylori colonizes the stomachs of more than half of the world’s population from infancy, making it one of the most successful bacterial pathogens1. H. pylori has probably accompanied humans for tens of thousands of years2–4, and it has been hypothesized that H. pylori colonization could have provided benefits to its human carriers and hence provided a selective advantage during long periods of human history5. In the modern world, H. pylori infections are responsible for a heavy toll of morbidity and mortality as a consequence of ulcer disease, lymphoma of the mucosa-associated lymphoid tissue (MALT) and, the most dangerous complication of H. pylori infection, gastric adenocarcinoma. The discovery of H. pylori in 1982 (REF. 6) had fundamental consequences for the treatment of stomach diseases and earned the two discoverers, Robin Warren and Barry Marshall, a Nobel Prize in 2005. H. pylori is the only formally recognized definitive bacterial carcinogen for humans7 and is estimated to be responsible for 5.5% of all human cancer cases, or approximately 592,000 gastric cancer cases per year8. H. pylori has also become a paradigm for a bacterium that causes chronic infections, and its mode of action as a pathogen has been termed ‘slow’ or ‘stealth’9,10. One other Helicobacter species, Helicobacter hepaticus, has been associated with the development of both hepatocellular carcinoma and colon cancer in immunocompromised mice11–13.

For several years, it has been known that H. pylori is one of the most diverse and variable bacterial species to be studied. When fingerprinting methods, such as restriction endonuclease digestion, random amplification of polymorphic DNA (RAPD), or nucleotide-sequence-based methods are used to analyse H.
pylori strains from unrelated individuals, the data indicate that every individual seems to carry his or her own strain, or even multiple strains14–17 (FIG. 1). Direct clonal transmission of strains from person to person has rarely been documented; the evidence that does exist comes from the detection of closely related strains within families18–20. Transmission of the organism in developed countries is thought to occur mainly vertically through direct human-to-human contact, usually within families and during the first years of life21.

VOLUME 5 | JUNE 2007 | 441 © 2007 Nature Publishing Group

Microevolution. The generation of genetic variation within a species over relatively short timescales.

Although our mechanistic understanding of the bacterial processes involved in generating diversity in H. pylori has substantially increased over recent years, we still know little about how this diversity relates to the survival of the bacteria in their individual human hosts. Because of the lifestyle of H. pylori as a chronic and persistent host-specific bacterium that lacks a natural reservoir outside humans, we have to assume that adaptation processes that take place in the context of the host environment are responsible for bacterial diversification. However, what are the driving forces for the adaptation processes in the host niche during H. pylori infection? H. pylori causes an ‘active chronic infection’, in which the bacteria are constantly multiplying and displaying vigorous motility in the stomach mucus (indicating constant high energy consumption by the bacteria), but appear to be only poorly contained by the immune system to the extent that several thousand bacterial cells can be observed in 1 nL of gastric mucus22. The concept of an almost complete lack of control by the human immune system is supported by the fact that immunodeficient individuals do not have a higher incidence of H.
pylori infection or disease symptoms caused by H. pylori infection. On the contrary, H. pylori itself seems to have developed mechanisms during its evolution to evade and thwart the host’s natural immune responses. In addition, microevolution of the bacteria inside each single individual host is likely to further contribute to maintaining a host–pathogen balance that promotes persistence23. Hence, immune evasion and suppression, as well as achieving a specific polarization of the immune response that is beneficial for bacterial survival, are likely to be the main selective forces driving the microevolution of H. pylori in vivo. These selective pressures operate on a microorganism that possesses specific capabilities for diversification by mutation and recombination, as is discussed below.

Figure 1 | Helicobacter pylori and its extensive genetic heterogeneity. a | Transmission electron micrograph of H. pylori showing its characteristic curved morphology and a unipolar bundle of sheathed flagella, which are essential for the colonization of the gastric mucosa. b | Natural progression of H. pylori infection. Infection usually occurs during childhood and causes symptomatic acute gastritis. Because the symptoms of acute gastritis are non-specific and transient, a diagnosis is rarely made at this stage. Acute infection transforms to chronic active gastritis in most patients and persists for decades or is life-long. The infection can take multiple courses. Most people that are infected with H. pylori will never develop symptomatic disease. 10–15% will develop ulcer disease (gastric or duodenal ulcer), approximately 1% will develop gastric adenocarcinoma, and a small group of patients will develop gastric MALT lymphoma. c–d | H. pylori exhibits extensive genetic variation, so that almost every individual carries his or her own H. pylori strain. c | Random amplification of polymorphic DNA (RAPD) fingerprints of ten H. pylori strains from unrelated individuals (A–J) showing a unique banding pattern for each strain. d | Partial sequences of the flaB flagellin gene for ten strains A–J showing a unique combination of polymorphic nucleotides for each strain. Part b reprinted with permission from REF. 1 © (2002) Massachusetts Medical Society.

[Figure 1 images not reproduced. Panel b plots the level of acid production (high to low) against age (childhood to advanced age), with the stages labelled: normal gastric mucosa, acute H. pylori infection, chronic H. pylori infection, antral-predominant gastritis, duodenal ulcer, MALT lymphoma, asymptomatic H. pylori infection, non-atrophic pangastritis, corpus-predominant atrophic gastritis, gastric ulcer, intestinal metaplasia, dysplasia and gastric cancer. The sequence rows of panel d are given below.]

A ATGCACCTCC TAGGTCTTCG AGTAATGCCC TCGGGCCTTC CTCCCCGCTC GCGTGTAGCC GCTG
B TCA...T..T ..T..C.... .AC....... .......C.. .......... .T.C.C.... ....
C T.A....C.. ......C... .AC...A... ......TGC. TCAGA.A... .T....T... ....
D ..A...T... .....TC.TT G......... .T.A...GC. ..AG...... A..C..T... ....
E ...TGTTC.. .....TC... ..C...A... .......C.. .CAGG..... ......T..T ....
F T.A....C.. ......C... .......... .T.A...C.. .......... .T.C.C.... ....
G T.A...T..T .....T..TT G.......T. .T.A..TGC. TCAG...... ...C..T... ....
H ......TC.T C..T..C.TT G.C....... .......... .......... A.....TA.. ....
I T.....T..T ...T.TC.TT G.....A.T. .......... .....T.... .....CT... ....
J .......... ......C... .......... ......TCC. .......... .T.C..T... ....

Polymorphic nucleotide. A position in a nucleotide sequence that displays variation in a sample population. If a sequence analysis of a gene fragment for n isolates yields three different alleles, for example, AACTTA, AAGTTA and AAATTA, the third position in this sequence is polymorphic but the other positions are not.

Multilocus enzyme electrophoresis (MLEE). A classical method that was used to study the structure of bacterial populations. Differences between the electrophoretic mobilities of multiple enzymes in starch gels are used as indicators of allelic variation.

Homoplasy test. A method to quantify the contribution of recombination to sequence variation in a set of homologous nucleotide sequences from multiple isolates.

Panmictic. A population structure where clonal structure is lost due to frequent recombination. Species with panmictic or close to panmictic population structures include Helicobacter pylori and Neisseria gonorrhoeae; species with predominantly clonal population structures include Salmonella enterica and Mycobacterium tuberculosis.

Type IV secretion system (T4SS). A complex bacterial secretion system that can transport bacterial protein effector molecules or DNA into a eukaryotic cell.

MLST (Multilocus sequence typing). A nucleotide-sequence-based approach for the characterization of isolates of microorganisms. The method involves the sequence analysis of approximately seven housekeeping gene fragments. Unique sequences obtained for each fragment are assigned an allele number, and the combination of allele numbers for all fragments defines the sequence type (ST). MLST is applicable to almost all bacteria and some other microorganisms. See Further information for a central website to access MLST databases for different organisms.

The bacteria need to interact indirectly and directly with the hosts, requiring both fixed bacterial surface molecules to provide adherence and soluble molecules that are either surface-bound or secreted, and act on their respective host receptors.
As such, one might assume that adaptation processes could lead to alterations in three groups of bacterial genes: first, genes in systems that affect intrabacterial mutation, DNA uptake, repair and recombination themselves; second, genes that favour bacteria–bacteria interactions for the purpose of interbacterial genetic exchange, including decreasing or increasing the barrier of genetic exchange; and third, genes that influence bacterial properties that modulate host interaction (adherence and immune response). In the following sections, we review the current knowledge relating to the diversity of H. pylori and the mechanisms that mediate diversification within the host. Allelic diversity and population structure Following the discovery of H. pylori, it was noted that this pathogen has extraordinary genetic heterogeneity, and that almost every isolate from unrelated patients appears to have a unique ‘fingerprint’4,15,17,24. Individuals can be colonized with multiple strains25, and strains have been shown to change during chronic colonization26,27. The genetic heterogeneity of H. pylori is perhaps most striking at the nucleotide sequence level. The sequence analysis of a few hundred base pairs of only one housekeeping or virulence gene is sufficient to obtain a unique signature for an H. pylori isolate, suggesting an unprecedented degree of allelic diversity16,19. Only when strains from members of the same family or people living closely together were studied, could related strains be observed, indicating clonal transmission18–20,28. These observations led to the hypothesis that H. pylori rapidly adapts to individual hosts, sparking intensive research into the underlying mechanisms of genetic variation. Genetic variation can be generated in a bacterial population by mutation and/or recombination between different strains. 
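The idea of a polymorphic nucleotide position, on which the discussion of allelic diversity rests, can be made concrete with a minimal Python sketch. It uses the three example alleles from the glossary definition (AACTTA, AAGTTA and AAATTA); the function name `polymorphic_positions` is a hypothetical helper introduced here for illustration, and the sketch assumes the sequences are already aligned to equal length.

```python
def polymorphic_positions(alleles):
    """Return the 1-based positions at which a set of aligned sequences differ.

    Hypothetical helper for illustration; assumes pre-aligned, equal-length input.
    """
    lengths = {len(a) for a in alleles}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    # zip(*alleles) walks the alignment column by column.
    return [
        i + 1
        for i, column in enumerate(zip(*alleles))
        if len(set(column)) > 1
    ]


# The three example alleles from the glossary entry on polymorphic nucleotides.
alleles = ["AACTTA", "AAGTTA", "AAATTA"]
print(polymorphic_positions(alleles))  # [3]
```

Only the third column contains more than one base, which matches the glossary statement that the third position is polymorphic and the others are not.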
Owing to their haploid genotype and mode of replication, bacteria are by default clonal, and diversity arises by the sequential acquisition of mutations. However, recombination due to natural transformation, conjugation, or phage transfer, which shuffles polymorphic nucleotides between different clonal lineages, can greatly influence the structure of bacterial populations29. Recombination is capable of generating a large number of alleles from relatively few polymorphic nucleotides. The first evidence of the strong impact of recombination on the population structure of H. pylori was obtained from multilocus enzyme electrophoresis data30. Analysis of nucleotide sequences from small collections of strains from one geographical location with population genetic analysis tools, such as the Homoplasy test31, provided robust evidence for the impact of recombination and allowed its quantification. These studies showed that allelic diversity within H. pylori was primarily created by recombination between strains during mixed infection, and that the population structure was quasi-panmictic, largely lacking clonal structure19. A large study comparing mutation and recombination rates between different pathogenic bacteria based on sequence data confirmed that H. pylori stands out as the bacterial species with the highest population recombination rate32.

H. pylori from different geographical regions

The virulence genes vacA and cagA, which encode the vacuolating cytotoxin and the translocated substrate of the cag type IV secretion system, respectively, were the first genes for which differences related to the geographical origin of strains were noted. The vacA and cagA sequences from Asian strains were notably distinct from the sequences obtained from Western H.
pylori strains, indicating that at least for these virulence-related genes, geographical partitioning of the bacterial population and some degree of a clonal structure existed in this rapidly recombining species33–36. A breakthrough in the analysis of the population structure of H. pylori came with the application of the multilocus sequence typing (MLST) approach37. This approach, which was first developed for meningococci38 and has since been applied to a wide range of pathogens, uses a small number of housekeeping genes to sample the genome. Housekeeping genes encode proteins with functions such as nucleic acid and protein synthesis that are conserved in most bacteria and, in comparison to virulence genes, are less likely to be under positive diversifying selection pressures39. The MLST approach, in conjunction with modern population genetic tools such as Structure40, has proven extremely powerful for the analysis of recombining H. pylori populations. An H. pylori multilocus haplotype contains partial sequences from 7 housekeeping genes — atpA, efp, mutY, ppa, trpC, ureI and yphC — giving a total length of 3,406 bp. Two large analyses of the population structure of H. pylori have been published, based on MLST datasets from 370 and 769 strains from diverse geographical and ethnic sources3,4, and the results of both studies are summarized below. Despite frequent recombination, the species H. pylori can be subdivided into six main populations with distinct geographical distributions (FIG. 2). Four of these are relatively homogeneous (hpEurope, hpAsia2, hpAfrica2 and hpNEAfrica), and two are composed of subpopulations (hspWAfrica and hspSAfrica together form a population called hpAfrica1, and hspEAsia, hspMaori and hspAmerind together form the population hpEastAsia). These modern populations are the result of tens of thousands of years of joint human and bacterial population movements, geographical isolation and bacterial interstrain recombination. 
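The bookkeeping behind MLST (each unique fragment sequence receives an allele number, and each unique combination of allele numbers receives a sequence type) can be illustrated with a minimal Python sketch. The seven locus names are those given in the text; the class `MlstScheme`, its method `type_isolate` and the four-base toy sequences are hypothetical and only stand in for real fragment data and curated databases.

```python
# The seven housekeeping loci of the H. pylori MLST scheme named in the text.
LOCI = ("atpA", "efp", "mutY", "ppa", "trpC", "ureI", "yphC")


class MlstScheme:
    """Toy allele and sequence-type registry (hypothetical, for illustration)."""

    def __init__(self, loci=LOCI):
        self.loci = loci
        # Per locus: fragment sequence -> allele number, in order of first sight.
        self.alleles = {locus: {} for locus in loci}
        # Allelic profile (tuple of allele numbers) -> sequence type (ST).
        self.profiles = {}

    def type_isolate(self, fragments):
        """Return (allelic profile, ST) for a dict mapping locus -> sequence."""
        profile = []
        for locus in self.loci:
            known = self.alleles[locus]
            sequence = fragments[locus].upper()
            # A new sequence gets the next free number; a known one keeps its own.
            profile.append(known.setdefault(sequence, len(known) + 1))
        profile = tuple(profile)
        st = self.profiles.setdefault(profile, len(self.profiles) + 1)
        return profile, st


scheme = MlstScheme()
isolate_1 = {locus: "ACGT" for locus in LOCI}   # toy sequences, not real data
isolate_2 = dict(isolate_1, mutY="ACGA")        # differs at one locus only
print(scheme.type_isolate(isolate_1))
print(scheme.type_isolate(isolate_2))
print(scheme.type_isolate(isolate_1))           # same profile, same ST as before
```

In this sketch numbers are assigned in order of first appearance; real MLST databases assign allele and ST numbers centrally so that results are comparable between laboratories.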
The Structure tool also allows the five ancestral populations from which these modern populations have been derived to be reconstructed, and every strain can be analysed to assess how its nucleotide composition relates to the five ancestral populations. Interestingly, none of the modern H. pylori strains has derived their nucleotides from just one ancestral population. All strains showed a nucleotide composition derived from a mixture of ancestral populations. The distribution of H. pylori populations across the globe and the distribution of sequence polymorphisms assigned to ancestral H. pylori populations are consistent with known ancient and more recent human migrations. For example, strains isolated from Native Americans in Venezuela or from the Inuit contain a high proportion of ancestral Asian nucleotides, consistent with the fact that the ancestors of these human populations migrated to the Americas through the Bering Strait approximately 13,000 years ago3.

Figure 2 | Helicobacter pylori populations and their worldwide distribution. H. pylori can be subdivided into six main populations. Four of these are relatively homogeneous (hpEurope, hpAsia2, hpAfrica2 and hpNEAfrica), and two are composed of subpopulations (hspWAfrica and hspSAfrica together form a population called hpAfrica1, and hspEAsia, hspMaori and hspAmerind together form the population hpEastAsia). a | Phylogenetic tree of the populations and subpopulations; diameters of circles represent within-population genetic diversity, angles of filled arcs are proportional to numbers of isolates. b | Distribution of the nine populations and subpopulations among the 769 strains studied in REF. 4. The pie charts placed in 51 different locations represent the relative abundance of a population in a given location. Reprinted with permission from REF. 4 © (2007) Macmillan Publishers Ltd.

[Figure 2 images not reproduced. Panel a carries a scale bar of 0.01 Kimura 2-parameter distance and the labels hpAfrica2, hspAmerind, hpNEAfrica, hspEAsia, hpAsia2, hspSAfrica, hspMaori, hpEurope and hspWAfrica.]

Autosomal microsatellite marker. A microsatellite is a simple sequence repeat that consists of repeating units of 1–4 nucleotides. Microsatellites are highly polymorphic and are widely used as markers in human genetic studies. The term autosomal is used if a microsatellite marker is located on a non-sex chromosome (in contrast to markers located on X or Y chromosomes).

Mutator strain. A strain of a bacterial species that has an elevated mutation rate compared with the average mutation rate of the species. The mutator phenotype is due to defects in genes coding for DNA repair enzymes or proteins involved in assuring fidelity of DNA replication.

Mismatch repair (MMR). A DNA repair mechanism that recognizes and corrects mismatches between the parental DNA strand and the copied DNA strand that is generated during replication.

Base excision repair (BER). A DNA repair mechanism that recognizes and corrects single mutated bases in the DNA, such as oxidized or alkylated bases.

A high prevalence of hpAfrica1 strains and hpEurope strains containing large proportions of ancestral hpAfrica1 nucleotides is consistent with the slave trade between Africa and the Americas. Likewise, the distribution of strains with varying proportions of ancestral Europe 1 and ancestral Europe 2 nucleotides is consistent with the colonization of Europe by migrations from the Near East and Central Asia. These studies provide strong evidence that the association of H. pylori with humans predates the migration of anatomically modern humans out of Africa, and that H. pylori has accompanied all human ethnic groupings ever since4. Genetic diversity within H.
pylori population samples has decreased with increasing distance from east Africa, the cradle of modern humans, in a strikingly similar way to that described for the genetic diversity of humans, indicating an old and close association between the two. Simulations from the data show that H. pylori is likely to have migrated from Africa 58,000 ± 3,500 years ago, consistent with current estimates for the migrations of humans out of Africa4. The observation that H. pylori sequences reflect human migrations has raised the possibility that H. pylori multilocus haplotypes could be used to resolve open questions regarding the history of human migrations, as a valuable addition to the anthropological toolbox. Indeed, Wirth et al. have presented a first example in which H. pylori multilocus haplotyping was more informative about human migrations than the currently available human genetic markers. In this study, they showed that H. pylori multilocus haplotype analysis could distinguish between Buddhists and Muslims in Ladakh, two populations that had been socially distinct for 500–1,000 years, whereas human mitochondrial DNA markers and 17 autosomal microsatellite markers were not informative41. Multiple studies are currently in progress to further exploit this potential of H. pylori to reveal the history of their human hosts.

Mechanisms generating allelic diversity

Allelic diversity in bacteria is created by a combination of mutagenesis and recombination. The mutation rate of H. pylori is significantly higher than that of many other bacteria. When mutation rates were assayed by measuring the emergence of resistance to rifampicin, twenty-six out of twenty-nine H. pylori strains had a higher mutation rate than the average value for the Enterobacteriaceae, and about 25% of strains had a mutation rate that even exceeded that of Escherichia coli mutator strains42. Genome analysis reveals that H.
pylori apparently lacks homologues of many of the genes that contribute to DNA repair in E. coli, including the complete mismatch repair pathway (mutS1/mutL/mutH) and several enzymes involved in base excision repair (BER)43. The lack of these enzymes might explain the overall higher mutation rate, although this has not been tested experimentally. In addition to the overall high mutation rate, H. pylori possesses 46 genes that contain homopolymeric runs of nucleotides or dinucleotide repeats that are prone to frequent length changes as a consequence of slipped-strand mispairing-mediated mutagenesis44. Length changes can lead to reversible inactivation of genes due to frameshifts (phase variation), or to changed gene transcription if the repeat is located in regulatory sequences. The large repertoire of hyper-mutable genes implies that any large population of H. pylori will consist of many different subpopulations, each with a specific combination of active and inactive phase-variable genes (the term ‘bacterial quasispecies’ has been used to describe this phenomenon). Although the inactivation of some genes containing hypermutable sequences is likely to be selected against in vivo, a large body of evidence (some of which is further reviewed in later sections) demonstrates that slipped-strand mispairing-mediated mutagenesis and intrastrain recombination do indeed generate highly diverse populations of H. pylori in an individual host26,45–47. The bacterial population can therefore react quickly to changing environments by expanding a specific subpopulation with higher fitness. This mechanism could even allow the bacteria to regulate their mutation rate, as the antimutator gene mutY contains a repeat sequence in which length variation could lead to gene inactivation43,48,49.
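How a length change in a homopolymeric run switches a gene off can be shown with a short Python sketch. The gene below is entirely hypothetical (a start codon, a run of cytosines and a short in-frame tail), the helpers `gene_with_run` and `translate` are introduced here for illustration only, and the translation uses the standard genetic code. The final line gives an upper bound on the number of on/off combinations for 46 phase-variable genes, under the simplifying assumption that every gene switches independently between just two states.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gene_with_run(n_cytosines):
    """Hypothetical gene: start codon, a poly-C run of variable length, a tail."""
    return "ATGGCA" + "C" * n_cytosines + "GATAAAGTTTGGTAA"


# A run of 9 keeps the tail in frame; 8 or 10 shifts the downstream frame.
for run_length in (9, 8, 10):
    print(run_length, translate(gene_with_run(run_length)))

# Upper bound on on/off combinations for 46 independently switching genes.
print(2 ** 46)
```

With nine cytosines the tail is read in frame; removing or adding one cytosine changes every downstream codon, and in the ten-cytosine case a stop codon appears almost immediately. Because the same slippage can restore the original length, the inactivation is reversible.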
Details of the mechanisms involved in recombination, DNA repair and mutagenesis have been the subject of several recent reviews and are not reviewed here43,50. Finally, the H. pylori genome contains numerous repetitive sequences of different lengths that permit intragenomic deletions or rearrangements. Aras et al. have studied the distribution of such repeats and experimentally demonstrated that the deletion of fragments between repeats of up to 100 bp was RecA-independent and that deletion frequencies increased with the increasing length of the repeats51. Examples of genes that showed frequent intragenomic rearrangements included cagY and cagA (located on the cag pathogenicity island (PAI)), as well as amiA, a gene encoding an amidase involved in peptidoglycan biosynthesis46, and genes involved in the fucosylation of lipopolysaccharide (LPS)45. Functional implications of the variability of these and other genes will be discussed in more detail below. Pyrosequencing A sequencing method that is based on the detection of released pyrophosphate (PPi) during DNA synthesis. Evidence for interstrain recombination in vivo Can H. pylori generate genetic diversity during the colonization of an individual host stomach by recombination between different strains, in addition to the intrastrain diversification mechanisms outlined above? The first evidence for interstrain recombination during chronic colonization came from a study by Kersulyte and colleagues, who characterized six H. pylori strains that were isolated from a single patient and showed that this patient harboured two different H. pylori strains that had repeatedly exchanged DNA, creating multiple mosaic genotypes27. A different approach was used by Falush et al., who studied the genetic relationships of sequential H. pylori isolates, cultured from biopsies taken from the same patient at time intervals of several months to years52. 
Numerous recombination events, many of them spanning only a few hundred base pairs, were detected when 10 gene fragments were sequenced for 24 pairs of such sequential isolates (FIG. 3). A Bayesian mathematical model was then developed and used to determine the most likely combination of recombination rate, mutation rate and length of the imported fragments that would generate the real dataset. Strikingly, H. pylori cells that undergo recombination import short pieces of DNA (on average 417 bp) into their chromosomes, in contrast to other bacteria, for which known lengths range from 2 kb (pneumococci) to over 10 kb (E. coli). Recombination events during chronic infection were also unexpectedly frequent, and extrapolations from the model predict that up to 50% of the genome of an H. pylori strain could be exchanged by recombination over four decades of infection52. However, the speed of intrahost evolution of H. pylori currently remains a subject of some controversy. The number of differences between paired sequential H. pylori isolates varies strongly between patients, with some pairs of sequential isolates showing little or no change over time periods extending to 9 years52–55. One obvious explanation for the relative stability of H. pylori strains in some patients is the lack of mixed infection in those individuals, the probability of which is likely to vary in parallel with the prevalence of H. pylori infection in a population, and would have obvious effects on interstrain recombination. Another reason why robust estimates of rates of mutation and recombination in vivo are not yet available is inherent in the design of the sequential isolate studies carried out so far. Because in most studies only one strain per time point was characterized, the exact point in time at which an observed difference has been generated is not known.
Mutations or recombination events might have occurred between the isolation of the first and the follow-up strain, but both strains may also already have coexisted for some time before the first isolation. Given the overall low transmissibility of H. pylori, it seems unlikely that a mixture of strains, which have already undergone recombination, are transmitted simultaneously. As such, the assumption made in sequential isolate studies that mixed infections and recombination events occur in the same individual seems realistic. Because of the uncertainty in timing the mutation and recombination events, using the sequential isolate approach (so far) only allows a calculation of the maximal rates for mutation and recombination52. Future studies that use faster sequencing technologies including highly parallel pyrosequencing56 to fully sample the population diversity of H. pylori at every time point should soon clarify these open questions.

Genomic changes during colonization in vivo

H. pylori strains also differ markedly in their genomic content. This was first shown by the comparison of the two complete genome sequences of H. pylori strains 26695 and J99 (REF. 57). These two genomes share approximately 1,406 of their 1,590 and 1,495 open reading frames, respectively, and each contains approximately 100 (7%) strain-specific genes. A third H. pylori strain, HPAG1, the genome of which was sequenced more recently, lacked another 29 of the genes shared by the first two genomes58. The set of core genes present in all H. pylori genomes was further determined by studies that used comparative genome hybridization technology59,60. Gressmann et al. used a collection of 56 globally representative H.
pylori strains that included examples from all known populations and subpopulations in comparative genome hybridizations with a microarray representing the genes of the combined genomes of 26695 and J99. Between the 56 strains, 1,150 genes were conserved. Extrapolation of this number to infinity yielded an estimate of 1,111 genes that are predicted to be conserved in all H. pylori strains (the core genome). The remaining genes that each H. pylori genome contains (approximately 400 genes) come from a pool of genes that are only present in a subset of H. pylori strains. Although some of these are important in pathogenesis, most notably the cag PAI, the majority of these non-conserved genes have as-yet-unknown functions. If H. pylori strains recombine during chronic infection in one individual, is this also associated with the loss or gain of genes, and if so, how do short allelic replacement events and those associated with gain or loss compare quantitatively? Israel et al. compared the genome of H. pylori J99, which was originally isolated in 1994, with multiple strains that had been isolated from the same patient six years later47. All follow-up strains differed from the original isolate by one or multiple gene losses or gains. However, significant genomic changes are apparently much rarer than sequence alterations where a small segment of DNA is exchanged with a segment of the same size from a different strain. Using DNA microarray hybridization, Kraft et al.61 quantified gains and losses of genes in 21 pairs of sequential isolates and compared these with estimates of recombination frequency calculated for the same set of strains based on MLST data52.
According to their analysis, only one in 650 recombination events is associated with gene loss or gain, showing remarkable conservation of genome structure despite frequent recombination. Although the significance of such genomic changes is unknown in most cases, they can have important implications for pathogenesis. For example, the cag PAI can be deleted when a cag-positive strain recombines with a cag-negative strain, resulting in the replacement of the cag PAI with an ‘empty site allele’27. Alternatively, the PAI can be excised without contact with a second strain by an intrachromosomal recombination event involving the two 31-bp repeats that flank the island61,62.
Figure 3 | The effect of mutation and recombination on sequences during chronic infection. a | Genetic relationships between H. pylori strains isolated sequentially from patients at two time points. The time between the isolation of the first and the follow-up strain is given in the second column. Ten gene fragments (ureI, flaB, mutY, efp, flaA, ppa, yphC, vacA, atpA and trpC) were sequenced and compared for each pair. The first two lines show patients for whom no changes were detected in any of the ten fragments (indicated by a green rectangle). The next three lines show patients for whom a single nucleotide change, most likely due to a point mutation, occurred in one of the fragments (indicated by a yellow box, with the position of the single nucleotide change indicated by a black horizontal line within the box). In the next seven patients, recombination events have occurred in one or more of the ten sequenced fragments. Recombination events are easily recognized by multiple clustered polymorphisms (indicated by violet boxes); clusters can extend over the entire sequenced fragment or be limited to a short patch. The last row is an unusual case in which two strains isolated sequentially seem completely unrelated, indicated by completely different sequences for all ten fragments. b | The effect of a recombination event on the expression of the vacuolating cytotoxin VacA. The three strains 1014-1, 1014-4 and 1014-6 were isolated sequentially from patient 1014, and the sequence of a short fragment of the vacA gene is shown for all three strains:
1014-1 GAAGAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGCGCATTCAA (E E A N K T P D K P D K V W R I Q)
1014-4 GAAGAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGCGCATTCAA (E E A N K T P D K P D K V W R I Q)
1014-6 GAATAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGTGCATTCAA (E * A N K T P D K P D K V W C I Q)
The sequence of the strain isolated last (1014-6, two years after the first strain) differs from the other two strains in two nucleotides. One of these changes, which has most likely been acquired by the importation of a piece of DNA from a different strain that co-colonized the stomach of patient 1014, has introduced a stop codon into the vacA coding sequence (represented by *). Part a modified with permission from REF. 52 © (2001) National Academy of Sciences.
Evidence for adaptation of H. pylori to the host
Although nucleotide sequence changes and gene gains and losses during chronic infection have been well documented, the role these changes have in the ability of H. pylori to cause persistent infections and in the pathogenesis of H. pylori gastritis and associated diseases remains poorly understood. An attractive hypothesis suggests that H. pylori could use its genetic plasticity to adapt to individual human hosts. Although this hypothesis remains unproven, there are now several lines of evidence indicating that this could indeed be the case, and that H. pylori is capable of rapidly adapting to a host by selecting a population with modified surface properties or with changes in other molecules that interact with the host.
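The stop codon described in the legend to Figure 3b can be checked directly from the three vacA fragments printed there. The following Python sketch is an illustration added for the reader, not part of the original analysis. It assumes that the fragments are printed in reading frame and that the standard genetic code applies; the helper names `translate` and `differences` are hypothetical.

```python
from itertools import product

# Standard genetic code, enumerated in TCAG order (assumed to apply to this fragment).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# vacA fragments as printed in Figure 3b (sequential isolates from patient 1014).
FRAGMENTS = {
    "1014-1": "GAAGAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGCGCATTCAA",
    "1014-4": "GAAGAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGCGCATTCAA",
    "1014-6": "GAATAAGCGAATAAAACCCCAGATAAACCCGATAAAGTTTGGTGCATTCAA",
}


def translate(dna):
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def differences(ref, other):
    """Return (1-based position, reference base, variant base) for each mismatch."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, other)) if a != b]


if __name__ == "__main__":
    for strain, seq in FRAGMENTS.items():
        print(strain, translate(seq))
    print(differences(FRAGMENTS["1014-1"], FRAGMENTS["1014-6"]))
```

Under these assumptions, the G-to-T change at position 4 of the fragment turns the second codon from GAA (glutamate) into TAA (stop), and the C-to-T change at position 43 turns CGC (arginine) into TGC (cysteine), matching the translations shown in the figure.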
This capacity for rapid selection of variants could also enable H. pylori to colonize a more diverse ecological niche (for example, different regions of the stomach that vary with respect to acidity, gastric mucus composition, expression of host cell products such as trefoil peptides or host cell glycans, or the degree of inflammation) than could be colonized by a genetically homogeneous population.
Adhesins and other outer-membrane proteins. H. pylori possesses a large superfamily of 33 paralogous genes encoding predicted outer-membrane proteins63,64. These genes have been subdivided into two families termed hop and hor genes. The function of most of their gene products, the Hop and Hor proteins, is still unknown, but several have been shown to be involved in the adherence of H. pylori to glycoconjugates on the gastric epithelium. The two best-studied members of the families are BabA and SabA, which mediate adherence to ABO/Lewis b and sialyl-Lewis x/sialyl-Lewis a blood-group antigens, respectively65,66. BabA sequences show strong allelic variation, and different alleles encode proteins with two different binding modes67 (BOX 1). H. pylori is capable of varying the expression of these adhesins by multiple mechanisms. The babA and sabA genes are phase-variable owing to the presence of dinucleotide repeats. This ensures that a large H. pylori population will always contain a subpopulation of cells whose expression status differs from that of the dominant population and that can quickly expand if selective pressures change65,66. For example, expression of sialyl-Lewis x and sialyl-Lewis a increases in inflamed mucosa, and inflammation may therefore provide selection for strains that express SabA66. Intragenomic recombination between highly homologous genes from the hop/hor family might help H. pylori to adapt its adhesive properties to different niches within the stomach, or to changing gastric conditions that result from ageing, atrophy, other infections or dietary changes.
The same mechanism might also help to ensure efficient transmission within families, whose members can differ in traits that are important for colonization, such as blood-group antigens. Such intragenomic recombination events that rearrange the hop/hor genes have been shown to occur in vitro68 as well as in vivo in rhesus monkeys69. The large pool of hop/hor genes, which share several highly conserved regions, might allow the generation of a large variety of mosaic adhesin genes that are adapted to the surface properties of the individual stomach. The hop family members AlpA and AlpB, which have also been linked to adhesion70, occur in allelic variants that correlate with the geographical origin of the strain (Eastern and Western type). These variants displayed differences in their ability to activate the innate immune system, independently of their contribution to adherence71.
Box 1 | Helicobacter pylori and blood-group antigens
Helicobacter pylori adheres to human epithelial cells using fucosylated glycoproteins and sialylated glycolipids as cellular receptors66,105. This adhesion is mediated by two members of the large family of Helicobacter outer-membrane proteins (Hop) termed BabA and SabA. The BabA adhesin, a 75-kDa surface-exposed protein, was initially reported to bind to the fucosylated Lewis antigens Leb and H-1, which are abundantly expressed on the epithelial cells of people with blood group O65. This finding was consistent with the epidemiological observation that peptic ulcer disease is particularly common in individuals with this blood group. An extensive investigation by Thomas Borén and colleagues67 into the binding patterns of H. pylori strains from different geographical regions showed that the babA gene and its encoded adhesin display marked sequence variation, and that this variation correlates with differential binding properties. BabA proteins from many strains had a wider spectrum of binding that included the antigens expressed on cells of individuals with blood groups A or B (A-Lewis b and B-Lewis b, respectively). Strains with a narrower binding spectrum (Leb and H-1 only) were termed ‘specialist binders’, whereas those with the wider spectrum were termed ‘generalist binders’. The geographical distribution of the strains studied correlated strikingly with the occurrence of host blood-group antigens. Specialist binders were particularly abundant among strains isolated from South American Amerindians, among whom blood group O is extremely common. The generalist binding characteristic was the most common type in European and Asian strains, where the distribution of blood groups is much more even. In vitro, specialist binders could be converted into generalist binders by transformation with DNA from a generalist, a phenomenon that is likely to occur in vivo. The data provide a telling example of how the genetic makeup of the host population might shape the H. pylori population. Adaptation to the host individual must not interfere with successful transmission. Specialist adaptation to an individual with, for example, blood group A would make transmission to individuals with other blood groups more difficult, thereby reducing the effectiveness of transmission in a population that is heterogeneous with respect to blood groups. Only in a highly homogeneous population such as the Amerindians will specialist adaptation not interfere with transmission within the community.
Box 2 | The cag secretion apparatus and the CagA oncoprotein
The cag pathogenicity island (PAI) has a central role in Helicobacter pylori pathogenesis. A large number of epidemiological studies have shown that strains containing a cag PAI are associated with more severe disease (such as ulcers and adenocarcinomas) in H.
pylori-infected individuals worldwide106,107. The >30 kb genomic island contains approximately 28 genes89. Some of these genes encode proteins with homology to components of the T pilus of Agrobacterium tumefaciens, the prototype of the type IV secretion system (T4SS)103,108. T4SSs are multisubunit nanomachines that can introduce proteins (and/or DNA) into host cells and thereby influence host cell functions109. The cag T4SS translocates the CagA protein into host epithelial cells110. After entering the host cell, CagA becomes phosphorylated by cellular kinases and binds to several target proteins. These interactions, some of which are phosphorylation-dependent and some phosphorylation-independent, induce multiple cellular responses. These responses include the morphogenetic changes that are characteristic of cell infection with cag-positive H. pylori strains, and may ultimately lead to malignant transformation92,97,111. CagA has therefore been termed a bacterial oncoprotein. In addition, cag-mediated contact between bacteria and epithelial cells has been shown to lead to the delivery of peptidoglycan fragments into the host cell. These fragments are recognized by the intracellular pattern-recognition receptor NOD1, leading to the activation of pro-inflammatory signalling pathways112 and increased interleukin-8 secretion.
VacA cytotoxin. The majority of H. pylori strains express a vacuolating cytotoxin, VacA, which exerts multiple effects on epithelial cells72 and inhibits the proliferation of T cells73. The vacA gene displays pronounced allelic variation, which has given rise to a widely used typing scheme for H. pylori based on polymorphisms in vacA gene sequences encoding the signal peptide and a mid region33.
Different alleles show varying affinity for cellular receptors74, and strains carrying multiple recombinant alleles with differential toxic activity have been isolated from a single stomach at one time point75, or sequentially during chronic colonization in vivo76 (FIG. 3b).
LPS. Lipopolysaccharide is another important component of the outer surface of H. pylori that shows extensive phenotypic variation. The O-antigen side chains of many strains display one or multiple Lewis antigens77. The expression of these antigens is controlled by three fucosyltransferase genes, futA, futB and futC, the expression of which can be switched on or off by transcriptional or translational frameshifting, creating populations with highly diverse LPS glycosylation patterns78–80. In addition, FutA and FutB contain a C-terminal heptad repeat region, consisting of a variable number of repeats of seven amino acids. Different strains that can coexist in one host display fut genes with varying numbers of these heptad repeats, and the number of repeats determines the specificity of the enzyme for O-antigen side-chain backbones of different lengths, thus functioning as a ‘molecular ruler’45. Recently, Lewis-antigen-expressing H. pylori strains were shown to bind more strongly to dendritic cells (DCs) through the C-type lectin DC-SIGN than Lewis-negative variants, suggesting that this variation might have a role in regulating the interaction of H. pylori with DCs, which might determine T-cell immune polarization81.
Flagellar motility. Another property that is important for persistent colonization and can undergo phenotypic ON/OFF switching is flagellar motility. Flagellar motility depends on the coordinated expression of more than 30 genes that mediate the assembly and operation of the flagellar filament, the flagellar motor and the chemotaxis machinery82,83.
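The ON/OFF switching by frameshifting described above for the fut genes, like the repeat-mediated phase variation of babA and sabA, can be illustrated with a toy model. The sketch below is purely illustrative and rests on stated assumptions: the sequence is invented and is not a real H. pylori gene, the CT dinucleotide tract and all helper names are hypothetical, and a gene is called ON only when translation from the start codon reaches the intended stop codon at the very end of the toy reading frame.

```python
from itertools import product

# Standard genetic code, enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

START = "ATG"
DOWNSTREAM = "AAAGGTGAATGGTAA"  # invented; reads K G E W stop when in frame


def toy_gene(n_repeats):
    """Build a hypothetical gene with n CT repeats between start codon and body."""
    return START + "CT" * n_repeats + DOWNSTREAM


def translate_until_stop(dna):
    """Translate in frame 0; return (protein, True) if a stop codon was reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            return "".join(protein), True
        protein.append(aa)
    return "".join(protein), False


def is_on(n_repeats):
    """ON only if the first stop codon met is the intended one at the gene's end."""
    gene = toy_gene(n_repeats)
    protein, stopped = translate_until_stop(gene)
    return stopped and 3 * (len(protein) + 1) == len(gene)


if __name__ == "__main__":
    for n in range(3, 10):
        print(n, "ON" if is_on(n) else "OFF", translate_until_stop(toy_gene(n))[0])
```

In this toy gene, gaining or losing a single CT unit by slipped-strand mispairing shifts the reading frame, so only repeat numbers that are multiples of three leave the downstream sequence in frame. A population in which the repeat number fluctuates therefore always contains both ON and OFF cells.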
One of the genes encoding a component of the flagellar basal body, fliP, is amenable to ON/OFF switching by a translational frameshift caused by slipped-strand mispairing, and non-motile variants can be isolated from a motile population and vice versa84. The role of this switch in vivo has not yet been elucidated.
The cag PAI. The cag PAI (see BOX 2 and FIG. 4), a chromosome segment that plays an important part in H. pylori pathogenesis, is not present in all strains, and the prevalence of cag PAI-containing strains varies widely between different geographical regions. Almost all Asian H. pylori strains and strains from the hpAfrica1 population contain the complete cag PAI. By contrast, there is one H. pylori population, hpAfrica2, found in South Africa, in which all strains studied so far have been devoid of a cag PAI60. Strains from Amerinds, although closely related to Asian H. pylori, either completely lack the PAI or carry islands with large deletions. Finally, carriage of the cag PAI is variable in hpEurope strains, which are the most common strains in western Europe and North America60. Possession of all or most cag genes does not guarantee that the type IV secretion apparatus is functional, and more studies are needed that assess the function of the island (for example, the ability to translocate CagA) in strains from the different populations. The cag PAI is a highly plastic region of the H. pylori genome. Not only do strains differ significantly in the number of cag PAI genes they carry, but the island can be lost, either completely or partially, during chronic infections of humans61,62 or experimentally infected animals85,86. Many patients have been shown to simultaneously carry cag-positive and cag-negative bacteria87.
It has therefore been proposed that a dynamic population of cag-positive and cag-negative strains or subclones can exist in one human individual, and that these subpopulations can expand and contract depending on the physiological and immune status of the host and on the requirements of different niches within the stomach88. However, data supporting this hypothesis are still limited.
The cag apparatus is in direct contact with the host, and some of its proteins are likely to interact with the host cell, although its receptor has not yet been identified. Thus, it can be postulated that the cag PAI and its components are under positive selection in individual human hosts, possibly more so than most other H. pylori genes. The cag PAI is therefore a particularly attractive target for testing the hypothesis that genetic changes lead to host adaptation.
Figure 4 | The cag pathogenicity island contains genes that show marked sequence variation. Most Helicobacter pylori strains that cause disease contain the cag pathogenicity island (PAI), a chromosomal region comprising approximately 37,000 base pairs and 29 genes (see BOX 2 for details). a | Arrangement of cag PAI genes in H. pylori strain 26695. Most of the cag genes are probably involved in the assembly of the type IV secretion system that translocates the protein CagA into the cytoplasm of gastric epithelial cells. Seven genes (marked in red) show similarity to components of the type IV secretion system of the plant pathogen Agrobacterium tumefaciens. Proteins encoded by the island are involved in two major processes, the induction of interleukin-8 (IL-8) production by gastric epithelial cells and the translocation of CagA from the bacterium into host cells. All genes depicted by arrows in dark shades of red and green are essential for IL-8 induction, whereas lighter shades of red and green indicate genes that are not involved in this process. The arrows marked with a red dot indicate genes that are not required for translocation of CagA; the non-marked genes are essential for translocation103. b–d | Exposure of cag proteins to the host presumably places them under strong positive selection in vivo. Extensive sequence variation, possibly linked to host adaptation, has so far been documented for three cag PAI-encoded proteins: CagY (HP0527, b), a protein that probably forms a sheath covering the type IV pilus46,104; CagC (HP0546, c), the putative cag pilin100; and the translocated effector CagA (HP0547, d)97. CagA shows striking ethnic and individual variation in its C-terminal repetitive phosphorylation (EPIYA) motifs; the upper four combinations of EPIYA types depicted are characteristic of Western strains, and the lower combination (ABD), including the unique Asian D-type EPIYA motif, is associated with East Asian strains. See text and references for details. FRR, 5′-repeat region; FVR, 5′-variable region; MRR, middle repeat region; TVR, 3′-variable region. Part a modified with permission from REF. 1 © (2002) Massachusetts Medical Society.
Figure 4 panel labels: a | H. pylori strain 26695 genome, cag pathogenicity island; HP0524 (VirD4), HP0525 (VirB11), HP0527 (VirB10), HP0530 (VirB8), HP0528 (VirB9), HP0530 (VirB7), HP0544 (VirB4), cagA. Many of these genes are required to form the type IV secretion apparatus; translocation of CagA into gastric epithelial cells by type IV secretion. b–d | CagY, CagC (HP0546, VirB2), FRR, MRR, FVR, TVR, VirB10 domain; CagA EPIYA mosaic types AB, ABC, ABCC, ABCCC and ABD.
T helper (TH) 1 immune response. T cell immune responses can be broadly categorized into two types. TH1 responses are dominated by TH1 cells, which produce interferon-γ and tumour necrosis factor. TH2 responses are characterized by a predominance of TH2 cells secreting interleukins (IL)-4, IL-5 and IL-13. The TH1 response is particularly geared towards the defence against intracellular bacteria, whereas the TH2 response is more suited to defend against extracellular bacteria.
Since the first description of the cag PAI in 1996 (REF. 89), the full cag PAI sequence has been elucidated in three other complete genome sequences (H. pylori 26695, J99 and HPAG1). In addition, complete cag PAI sequences from four Swedish strains90 and eleven Japanese strains91 have been published. The cag PAI gene whose allelic variation and functional implications have been studied in most detail is cagA. CagA proteins from different strains show extensive variation in their mosaic C-terminal domain92. This domain contains repetitive phosphorylation motifs (EPIYA motifs) that can be tyrosine phosphorylated by kinases of the Src family93,94 and the kinase c-Abl95. Phosphorylated CagA subsequently binds to the SHP-2 tyrosine phosphatase, inducing elevated cell motility and deregulating cell growth96,97. EPIYA motifs have been classified into four types based on their sequences. H. pylori strains vary widely in their configuration of EPIYA motifs (both in the combination of motif types and in the number of copies of each type present in the CagA C terminus). Some combinations of these motifs are highly characteristic of Western strains, whereas others are characteristic of East Asian strains (FIG. 4). CagA proteins with different EPIYA configurations differ in their interactions with different interaction partners, such as SHP-2 or C-terminal Src kinase (Csk)98,99. The strong association between the ethnicity of the host and the CagA type suggests strong positive selection.
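The association between EPIYA configuration and geographical origin summarized in Figure 4d amounts to a simple lookup. The sketch below is an illustration only: it encodes just the five mosaic types shown in the figure, the function name `caga_origin` is hypothetical, and any configuration not shown in the figure is deliberately left unclassified rather than guessed.

```python
# EPIYA mosaic types as listed in Figure 4d; nothing beyond the figure is assumed.
WESTERN_TYPES = {"AB", "ABC", "ABCC", "ABCCC"}
EAST_ASIAN_TYPES = {"ABD"}


def caga_origin(epiya_config):
    """Map an EPIYA configuration string (e.g. 'ABCC') to the group in Figure 4d."""
    config = epiya_config.strip().upper()
    if config in WESTERN_TYPES:
        return "Western"
    if config in EAST_ASIAN_TYPES:
        return "East Asian"
    return "unclassified"


if __name__ == "__main__":
    for config in ("AB", "ABCCC", "ABD", "ABCD"):
        print(config, caga_origin(config))
```

Returning "unclassified" for anything outside the figure keeps the sketch from asserting associations that the text does not state.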
Whether CagA also changes during chronic colonization, and how this affects the development of gastric diseases over time, have not yet been elucidated.
Strong variability has also been documented for a second cag PAI gene, cagY, a virB10 orthologue. The cagY gene contains two long, repetitive regions, FRR (5′) and MRR (middle region), which are both absent from virB10 family genes from other bacteria, and can vary in length and in the amino-acid composition of the resulting proteins46. Isolates from different human hosts and bacteria recovered from experimentally colonized mice and rhesus monkeys varied considerably in repeat numbers, always leading to new in-frame combinations of nucleotide repeats and resulting in CagY proteins of different lengths46. As the antibody response against this protein in human hosts was low, the authors postulate that the almost infinite potential for variation in the cagY gene by intergenomic or intragenomic recombination or deletion serves to evade the host antibody response while preserving cag apparatus function. The ability of the cag secretion system to induce interleukin-8 expression in gastric epithelial cells also varied in some strains that harboured changes in CagY; the observed variation of this surface-exposed protein could therefore have a further function in host interaction and host adaptation. The surface-exposed region of a cag PAI VirB2 orthologue, CagC, a surface-associated protein of the cag apparatus that is essential for cag function (cag pilin), was also shown to vary extensively between strains100. It has not yet been assessed whether cagC variation is found in sequential or simultaneous isolates from a single host, or after experimental colonization in animals. Sufficient information on intraspecies variability of other cag PAI proteins to assess their possible roles in host adaptation is currently not available.
H. pylori: is variability important for its success?
Although a large body of evidence points towards a role of H. pylori genetic variability in adaptation to its human host, experimental evidence to support this is currently scarce. So far, only two published studies have experimentally addressed the impact of the loss of a factor involved in genetic variation on bacterial fitness. Mutants in ruvC, which encodes a Holliday-junction resolvase that is essential for recombination, were cleared from the mouse stomach, whereas the wild-type strain showed persistent infection101,102. The mutant also elicited a polarized T helper (TH) 1 immune response, whereas the wild-type strain generated a TH2-cell-type response. However, the mechanism underlying the fitness defect of the mutant is not known, and more work is needed in this area.
If H. pylori adapts to the host individual by continuously improving its genome, predominantly by interstrain recombination, then this mechanism is critically dependent on the frequent occurrence of mixed infections with two or more different H. pylori strains. This is probably still the case in regions of the world where H. pylori infection remains almost universal, but could now be rare in Western countries, where the prevalence of H. pylori has been declining over several decades. Such a decline in prevalence, initiated by changing environmental conditions such as general hygiene, less crowded living conditions and, more recently, the use of antibiotics, would make a recombination-based mechanism of genomic adaptation ineffective, giving rise to abortive infections, and ultimately accelerating the disappearance of H. pylori from regions where its prevalence has fallen below a critical threshold level. This hypothesis, if confirmed, would have important implications for models that predict the impact of interventions, such as vaccination and treatment campaigns, on prevalence rates, because the long-term effect of measures that have a moderate effect on prevalence could be much stronger than anticipated and ultimately lead to the disappearance of H. pylori.
1. Suerbaum, S. & Michetti, P. Helicobacter pylori infection. N. Engl. J. Med. 347, 1175–1186 (2002). 2. Ghose, C. et al. East Asian genotypes of Helicobacter pylori strains in Amerindians provide evidence for its ancient human carriage. Proc. Natl Acad. Sci. USA 99, 15107–15111 (2002). 3. Falush, D. et al. Traces of human migrations in Helicobacter pylori populations. Science 299, 1582–1585 (2003). 4. Linz, B. et al. An African origin for the intimate association between humans and Helicobacter pylori. Nature 445, 915–918 (2007). Two analyses of the global population structure of H. pylori using multilocus sequence data. The first paper established that H. pylori can be subdivided into populations with distinct geographical distribution, and that its genes reflect human migrations. The second study compared the genetic diversity of human and H. pylori DNA, and provides evidence that H. pylori was already associated with humans at the time they migrated out of Africa 60,000 years ago. 5. Blaser, M. J. Who are we? Indigenous microbes and the ecology of human diseases. EMBO Rep. 7, 956–960 (2006). 6. Warren, J. R. & Marshall, B. Unidentified curved bacilli on gastric epithelium in active chronic gastritis. Lancet 1, 1273–1275 (1983). 7. Schistosomes, liver flukes and Helicobacter pylori. IARC Working Group on the Evaluation of Carcinogenic Risks to Humans. Lyon, 7–14 June 1994. IARC Monogr. Eval. Carcinog. Risks Hum. 61, 1–241 (1994). 8. Parkin, D. M. The global health burden of infection-associated cancers in the year 2002. Int. J. Cancer 118, 3030–3044 (2006). 9. Blaser, M. J. Helicobacter pylori: microbiology of a ‘slow’ bacterial infection. Trends Microbiol. 1, 255–260 (1993). 10. Merrell, D. S. & Falkow, S. Frontal and stealth attack strategies in microbial pathogenesis. Nature 430, 250–256 (2004). 11.
Suerbaum, S. et al. The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus. Proc. Natl Acad. Sci. USA 100, 7901–7906 (2003). 12. Erdman, S. E. et al. CD4+ CD25+ regulatory T lymphocytes inhibit microbially induced colon cancer in Rag2-deficient mice. Am. J. Pathol. 162, 691–702 (2003). 13. Ward, J. M. et al. Chronic active hepatitis and associated liver tumors in mice caused by a persistent bacterial infection with a novel Helicobacter species. J. Natl Cancer Inst. 86, 1222–1227 (1994). 14. Langenberg, W., Rauws, E. A., Widjojokusumo, A., Tytgat, G. N. & Zanen, H. C. Identification of Campylobacter pyloridis isolates by restriction endonuclease DNA analysis. J. Clin. Microbiol. 24, 414–417 (1986). 15. Majewski, S. I. & Goodwin, C. S. Restriction endonuclease analysis of the genome of Campylobacter pylori with a rapid extraction method: evidence for considerable genomic variation. J. Infect. Dis. 157, 465–471 (1988). These two papers were the first two descriptions of the striking allelic diversity in H. pylori. 16. Kansau, I. et al. Genotyping of Helicobacter pylori isolates by sequencing of PCR products and comparison with the RAPD technique. Res. Microbiol. 147, 661–669 (1996). 17. Akopyanz, N., Bukanov, N. O., Westblom, T. U., Kresovich, S. & Berg, D. E. DNA diversity among clinical isolates of Helicobacter pylori detected by PCR-based RAPD fingerprinting. Nucleic Acids Res. 20, 5137–5142 (1992). 18. Bamford, K. B. et al. Helicobacter pylori: comparison of DNA fingerprints provides evidence for intrafamilial infection. Gut 34, 1348–1350 (1993). 19. Suerbaum, S. et al. Free recombination within Helicobacter pylori. Proc. Natl Acad. Sci. USA 95, 12619–12624 (1998). Demonstrates that the population structure of H. pylori is shaped by frequent recombination, and that H.
pylori behaves clonally on a short-term range after natural transmission in families. 20. Miehlke, S., Genta, R. M., Graham, D. Y. & Go, M. F. Molecular relationships of Helicobacter pylori strains in a family with gastroduodenal disease. Am. J. Gastroenterol. 94, 364–368 (1999). 21. Magalhaes Queiroz, D. M. & Luzza, F. Epidemiology of Helicobacter pylori infection. Helicobacter 11 (Suppl. 1), 1–5 (2006). 22. Schreiber, S. et al. The spatial orientation of Helicobacter pylori in the gastric mucus. Proc. Natl Acad. Sci. USA 101, 5024–5029 (2004). 23. Lee, S. K. & Josenhans, C. Helicobacter pylori and the innate immune system. Int. J. Med. Microbiol. 295, 325–334 (2005). 24. Akopyanz, N., Bukanov, N. O., Westblom, T. U. & Berg, D. E. PCR-based RFLP analysis of DNA sequence diversity in the gastric pathogen Helicobacter pylori. Nucleic Acids Res. 20, 6221–6225 (1992). 25. Taylor, N. S. et al. Long-term colonization with single and multiple strains of Helicobacter pylori assessed by DNA fingerprinting. J. Clin. Microbiol. 33, 918–923 (1995). 26. Kuipers, E. J. et al. Quasispecies development of Helicobacter pylori observed in paired isolates obtained years apart from the same host. J. Infect. Dis. 181, 273–282 (2000). 27. Kersulyte, D., Chalkauskas, H. & Berg, D. E. Emergence of recombinant strains of Helicobacter pylori during human infection. Mol. Microbiol. 31, 31–41 (1999). First demonstration of intrahost recombination in multiple strains isolated from one individual patient. 28. Raymond, J. et al. Genetic and transmission analysis of Helicobacter pylori strains within a family. Emerg. Infect. Dis. 10, 1816–1821 (2004). 29. Maynard Smith, J., Smith, N. H., O’Rourke, M. & Spratt, B. G. How clonal are bacteria? Proc. Natl Acad. Sci. USA 90, 4384–4388 (1993). A seminal paper that established the concept that different species of bacteria possess different population structures. 30. Go, M. F., Kapur, V., Graham, D. Y. & Musser, J. M.
Population genetic analysis of Helicobacter pylori by multilocus enzyme electrophoresis: extensive allelic diversity and recombinational population structure. J. Bacteriol. 178, 3934–3938 (1996). 31. Maynard Smith, J. & Smith, N. H. Detecting recombination from gene trees. Mol. Biol. Evol. 15, 590–599 (1998). 32. Perez-Losada, M. et al. Population genetics of microbial pathogens estimated from multilocus sequence typing (MLST) data. Infect. Genet. Evol. 6, 97–112 (2006). 33. Atherton, J. C. et al. Mosaicism in vacuolating cytotoxin alleles of Helicobacter pylori. Association of specific vacA types with cytotoxin production and peptic ulceration. J. Biol. Chem. 270, 17771–17777 (1995). This paper describes the mosaic structure of the vacA cytotoxin gene, establishes a vacA typing system for H. pylori and presents evidence for a differential association of vacA alleles with disease. 34. Pan, Z. J. et al. Equally high prevalences of infection with cagA-positive Helicobacter pylori in Chinese patients with peptic ulcer disease and those with chronic gastritis-associated dyspepsia. J. Clin. Microbiol. 35, 1344–1347 (1997). 35. Pan, Z. J. et al. Prevalence of vacuolating cytotoxin production and distribution of distinct vacA alleles in Helicobacter pylori from China. J. Infect. Dis. 178, 220–226 (1998). 36. Miehlke, S. et al. Allelic variation in the cagA gene of Helicobacter pylori obtained from Korea compared to the United States. Am. J. Gastroenterol. 91, 1322–1325 (1996). 37. Achtman, M. et al. Recombination and clonal groupings within Helicobacter pylori from different geographic regions. Mol. Microbiol. 32, 459–470 (1999). First application of the multilocus sequence approach to H. pylori from different geographical regions. This paper showed that weakly clonal groupings exist despite frequent recombination. 38. Maiden, M. C. et al. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc.
Natl Acad. Sci. USA 95, 3140–3145 (1998). 39. Maiden, M. C. Multilocus sequence typing of bacteria. Annu. Rev. Microbiol. 60, 561–588 (2006). 40. Falush, D., Stephens, M. & Pritchard, J. K. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587 (2003). 41. Wirth, T. et al. Distinguishing human ethnic groups by means of sequences from Helicobacter pylori: lessons from Ladakh. Proc. Natl Acad. Sci. USA 101, 4746–4751 (2004). 42. Bjorkholm, B. et al. Mutation frequency and biological cost of antibiotic resistance in Helicobacter pylori. Proc. Natl Acad. Sci. USA 98, 14607–14612 (2001). The study established that H. pylori has a higher mutation rate than the Enterobacteriaceae. 43. Kang, J. & Blaser, M. J. Bacterial populations as perfect gases: genomic integrity and diversification tensions in Helicobacter pylori. Nature Rev. Microbiol. 4, 826–836 (2006). 44. Salaun, L., Linz, B., Suerbaum, S. & Saunders, N. J. The diversity within an expanded and redefined repertoire of phase-variable genes in Helicobacter pylori. Microbiology 150, 817–830 (2004). 45. Nilsson, C. et al. An enzymatic ruler modulates Lewis antigen glycosylation of Helicobacter pylori LPS during persistent infection. Proc. Natl Acad. Sci. USA (2006). Phase variation in H. pylori LPS glycosylation genes during persistent infection suggests a possible temporal adaptation mechanism to a host niche, which changes through chronic infection. 46. Aras, R. A. et al. Plasticity of repetitive DNA sequences within a bacterial (Type IV) secretion system component. J. Exp. Med. 198, 1349–1360 (2003). Investigation of the interstrain and intrastrain variation of the H. pylori cag pathogenicity island gene cagY in natural infection and experimental infection, and its possible role in host immune adaptation. 47. Israel, D. A. et al. Helicobacter pylori genetic diversity within the gastric niche of a single human host. Proc. Natl Acad. 
Sci. USA 98, 14625–14630 (2001). 48. Huang, S., Kang, J. & Blaser, M. J. Antimutator role of the DNA glycosylase mutY gene in Helicobacter pylori. J. Bacteriol. 188, 6224–6234 (2006). 49. Mathieu, A., O’Rourke, E. J. & Radicella, J. P. Helicobacter pylori genes involved in avoidance of mutations induced by 8-oxoguanine. J. Bacteriol. 188, 7464–7469 (2006). 50. Kraft, C. & Suerbaum, S. Mutation and recombination in Helicobacter pylori: Mechanisms and role in generating strain diversity. Int. J. Med. Microbiol. 295, 299–305 (2005). 51. Aras, R. A., Kang, J., Tschumi, A. I., Harasaki, Y. & Blaser, M. J. Extensive repetitive DNA facilitates prokaryotic genome plasticity. Proc. Natl Acad. Sci. USA 100, 13579–13584 (2003). 52. Falush, D. et al. Recombination and mutation during long-term gastric colonization by Helicobacter pylori: Estimates of clock rates, recombination size and minimal age. Proc. Natl Acad. Sci. USA 98, 15056–15061 (2001). Intrahost evolution was studied using sequence comparisons from paired sequential H. pylori isolates from one host. Mathematical modelling was used to determine quantitative parameters of recombination and mutation in vivo. 53. Lundin, A. et al. Slow genetic divergence of Helicobacter pylori strains during long-term colonization. Infect. Immun. 73, 4818–4822 (2005). 54. Prouzet-Mauleon, V. et al. Pathogen evolution in vivo: genome dynamics of two isolates obtained 9 years apart from a duodenal ulcer patient infected with a single Helicobacter pylori strain. J. Clin. Microbiol. 43, 4237–4241 (2005). 55. Salama, N. R. et al. Genetic analysis of Helicobacter pylori strain populations colonizing the stomach at different times post-infection. J. Bacteriol. 189, 3834–3845 (2007). 56. Margulies, M. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005). 57. Alm, R. A. et al. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. 
Nature 397, 176–180 (1999). First complete genome comparison of two unrelated strains within one species. 58. Oh, J. D. et al. The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: evolution during disease progression. Proc. Natl Acad. Sci. USA 103, 9999–10004 (2006). 59. Salama, N. et al. A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains. Proc. Natl Acad. Sci. USA 97, 14668–14673 (2000). 60. Gressmann, H. et al. Gain and loss of multiple genes during the evolution of Helicobacter pylori. PLoS Genet. 1, e43 (2005). 61. Kraft, C. et al. Genomic changes during chronic Helicobacter pylori infection. J. Bacteriol. 188, 249–254 (2006). 62. Bjorkholm, B. et al. Comparison of genetic divergence and fitness between two subclones of Helicobacter pylori. Infect. Immun. 69, 7832–7838 (2001). 63. Tomb, J.-F. et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547 (1997). 64. Alm, R. A. et al. Comparative genomics of Helicobacter pylori: analysis of the outer membrane protein families. Infect. Immun. 68, 4155–4168 (2000). 65. Ilver, D. et al. Helicobacter pylori adhesin binding fucosylated histo-blood group antigens revealed by retagging. Science 279, 373–377 (1998). 66. Mahdavi, J. et al. Helicobacter pylori SabA adhesin in persistent infection and chronic inflammation. Science 297, 573–578 (2002). 67. Aspholm-Hurtig, M. et al. Functional adaptation of BabA, the H. pylori ABO blood group antigen binding adhesin. Science 305, 519–522 (2004). Describes the discovery of the association of the H. pylori adhesin BabA to human ethnic groupings with different blood-group prevalences. Hints at a link between adhesive properties and host adaptation. 68. Backstrom, A. et al. Metastability of Helicobacter pylori bab adhesin genes and dynamics in Lewis b antigen binding. Proc. Natl Acad. Sci. USA 101, 16923–16928 (2004). 69. Solnick, J.
V., Hansen, L. M., Salama, N. R., Boonjakuakul, J. K. & Syvanen, M. Modification of Helicobacter pylori outer membrane protein expression during experimental infection of rhesus macaques. Proc. Natl Acad. Sci. USA 101, 2106–2111 (2004). Currently, the only published study that shows H. pylori strain variation arising after experimental infection in rhesus monkeys. 70. Odenbreit, S., Till, M., Hofreuter, D., Faller, G. & Haas, R. Genetic and functional characterization of the alpAB gene locus essential for adhesion of Helicobacter pylori to human gastric tissue. Mol. Microbiol. 31, 1537–1548 (1999). 71. Lu, H. et al. Functional and intracellular signalling differences associated with the Helicobacter pylori AlpAB adhesin from Western and East Asian strains. J. Biol. Chem. (2007). 72. Cover, T. L. & Blanke, S. R. Helicobacter pylori VacA, a paradigm for toxin multifunctionality. Nature Rev. Microbiol. 3, 320–332 (2005). 73. Gebert, B., Fischer, W., Weiss, E., Hoffmann, R. & Haas, R. Helicobacter pylori vacuolating cytotoxin inhibits T lymphocyte activation. Science 301, 1099–1102 (2003). 74. Pagliaccia, C. et al. The m2 form of the Helicobacter pylori cytotoxin has cell type-specific vacuolating activity. Proc. Natl Acad. Sci. USA 95, 10212–10217 (1998). 75. Aviles-Jimenez, F. et al. Evolution of the Helicobacter pylori vacuolating cytotoxin in a human stomach. J. Bacteriol. 186, 5182–5185 (2004). 76. Carroll, I. M. et al. Microevolution between paired antral and paired antrum and corpus Helicobacter pylori isolates recovered from individual patients. J. Med. Microbiol. 53, 669–677 (2004). 77. Aspinall, G. O. & Monteiro, M. A. Lipopolysaccharides of Helicobacter pylori strains P466 and MO19: structures of the O antigen and core oligosaccharide regions. Biochemistry 35, 2498–2504 (1996). 78. Monteiro, M. A. et al. Simultaneous expression of type 1 and type 2 Lewis blood group antigens by Helicobacter pylori lipopolysaccharides. Molecular mimicry between H. 
pylori lipopolysaccharides and human gastric epithelial cell surface glycoforms. J. Biol. Chem. 273, 11533–11543 (1998). 79. Appelmelk, B. J. et al. Phase variation in Helicobacter pylori lipopolysaccharide due to changes in the lengths of poly(C) tracts in α3-fucosyltransferase genes. Infect. Immun. 67, 5361–5366 (1999). 80. Wang, G., Ge, Z., Rasko, D. A. & Taylor, D. E. Lewis antigens in Helicobacter pylori: biosynthesis and phase variation. Mol. Microbiol. 36, 1187–1196 (2000). 81. Bergman, M. P. et al. Helicobacter pylori modulates the T helper cell 1/T helper cell 2 balance through phase-variable interaction between lipopolysaccharide and DC-SIGN. J. Exp. Med. 200, 979–990 (2004). The authors report a strain-specific influence on immune responses associated with phase-variable expression of lipopolysaccharide outer chains in H. pylori. 82. Josenhans, C. & Suerbaum, S. Helicobacter pylori: Molecular and Cellular Biology. (eds Achtman, O. & Suerbaum, S.) 171–184 (Horizon Scientific Press, Wymondham, 2001). 83. Niehus, E. et al. Genome-wide analysis of transcriptional hierarchy and feedback regulation in the flagellar system of Helicobacter pylori. Mol. Microbiol. 52, 947–961 (2004). 84. Josenhans, C., Eaton, K. A., Thevenot, T. & Suerbaum, S. Switching of flagellar motility in Helicobacter pylori by reversible length variation of a short homopolymeric sequence repeat in fliP, a gene encoding a basal body protein. Infect. Immun. 68, 4598–4603 (2000). 85. Thompson, L. J. et al. Chronic Helicobacter pylori infection with Sydney strain 1 and a newly identified mouse-adapted strain (Sydney strain 2000) in C57BL/6 and BALB/c mice. Infect. Immun. 72, 4668–4679 (2004). 86. Sozzi, M., Crosatti, M., Kim, S. K., Romero, J. & Blaser, M. J. Heterogeneity of Helicobacter pylori cag genotypes in experimentally infected mice. FEMS Microbiol. Lett. 203, 109–114 (2001). 87. Figura, N. et al.
cagA positive and negative Helicobacter pylori strains are simultaneously present in the stomach of most patients with nonulcer dyspepsia: relevance to histological damage. Gut 42, 772–778 (1998). 88. Covacci, A. & Rappuoli, R. Helicobacter pylori: molecular evolution of a bacterial quasi-species. Curr. Opin. Microbiol. 1, 96–102 (1998). 89. Censini, S. et al. cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc. Natl Acad. Sci. USA 93, 14648–14653 (1996). 90. Blomstergren, A., Lundin, A., Nilsson, C., Engstrand, L. & Lundeberg, J. Comparative analysis of the complete cag pathogenicity island sequence in four Helicobacter pylori isolates. Gene 328, 85–93 (2004). 91. Azuma, T. et al. Distinct diversity of the cag pathogenicity island among Helicobacter pylori strains in Japan. J. Clin. Microbiol. 42, 2508–2517 (2004). 92. Covacci, A. et al. Molecular characterization of the 128-kDa immunodominant antigen of Helicobacter pylori associated with cytotoxicity and duodenal ulcer. Proc. Natl Acad. Sci. USA 90, 5791–5795 (1993). 93. Stein, M. et al. c-Src/Lyn kinases activate Helicobacter pylori CagA through tyrosine phosphorylation of the EPIYA motifs. Mol. Microbiol. 43, 971–980 (2002). 94. Selbach, M., Moese, S., Hauck, C. R., Meyer, T. F. & Backert, S. Src is the kinase of the Helicobacter pylori CagA protein in vitro and in vivo. J. Biol. Chem. 277, 6775–6778 (2002). 95. Poppe, M., Feller, S. M., Romer, G. & Wessler, S. Phosphorylation of Helicobacter pylori CagA by c-Abl leads to cell motility. Oncogene (2006). 96. Higashi, H. et al. SHP-2 tyrosine phosphatase as an intracellular target of Helicobacter pylori CagA protein. Science 295, 683–686 (2002). 97. Hatakeyama, M. & Higashi, H. Helicobacter pylori CagA: a new paradigm for bacterial carcinogenesis. Cancer Sci. 96, 835–843 (2005). 98. Naito, M. et al. 
Influence of EPIYA-repeat polymorphism on the phosphorylation-dependent biological activity of Helicobacter pylori CagA. Gastroenterology 130, 1181–1190 (2006). This study presents evidence for an influence of interstrain diversity in CagA types on host interaction. 99. Higashi, H. et al. Biological activity of the Helicobacter pylori virulence factor CagA is determined by variation in the tyrosine phosphorylation sites. Proc. Natl Acad. Sci. USA 99, 14428–14433 (2002). 100. Andrzejewska, J. et al. Characterization of the pilin ortholog of the Helicobacter pylori type IV cag pathogenicity apparatus, a surface-associated protein expressed during infection. J. Bacteriol. 188, 5865–5877 (2006). 101. Robinson, K., Loughlin, M. F., Potter, R. & Jenks, P. J. Host adaptation and immune modulation are mediated by homologous recombination in Helicobacter pylori. J. Infect. Dis. 191, 579–587 (2005). 102. Loughlin, M. F., Barnard, F. M., Jenkins, D., Sharples, G. J. & Jenks, P. J. Helicobacter pylori mutants defective in RuvC Holliday junction resolvase display reduced macrophage survival and spontaneous clearance from the murine gastric mucosa. Infect. Immun. 71, 2022–2031 (2003). 103. Fischer, W. et al. Systematic mutagenesis of the Helicobacter pylori cag pathogenicity island: essential genes for CagA translocation in host cells and induction of interleukin-8. Mol. Microbiol. 42, 1337–1348 (2001). 104. Rohde, M., Puls, J., Buhrdorf, R., Fischer, W. & Haas, R. A novel sheathed surface organelle of the Helicobacter pylori cag type IV secretion system. Mol. Microbiol. 49, 219–234 (2003). 105. Boren, T., Falk, P., Roth, K. A., Larson, G. & Normark, S. Attachment of Helicobacter pylori to human gastric epithelium mediated by blood group antigens. Science 262, 1892–1895 (1993). 106. Crabtree, J. E. et al. Helicobacter pylori induced interleukin-8 expression in gastric epithelial cells is associated with CagA positive phenotype. J. Clin. Pathol. 48, 41–45 (1995). 
107. Crabtree, J. E. et al. Mucosal IgA recognition of Helicobacter pylori 120 kDa protein, peptic ulceration, and gastric pathology. Lancet 338, 332–335 (1991). 108. Bourzac, K. M. & Guillemin, K. Helicobacter pylori-host cell interactions mediated by type IV secretion. Cell Microbiol. 7, 911–919 (2005). 109. Christie, P. J., Atmakuri, K., Krishnamoorthy, V., Jakubowski, S. & Cascales, E. Biogenesis, architecture, and function of bacterial type IV secretion systems. Annu. Rev. Microbiol. 59, 451–485 (2005). 110. Odenbreit, S. et al. Translocation of Helicobacter pylori CagA into gastric epithelial cells by type IV secretion. Science 287, 1497–1500 (2000). 111. Tummuru, M. K., Cover, T. L. & Blaser, M. J. Cloning and expression of a high-molecular-mass major antigen of Helicobacter pylori: evidence of linkage to cytotoxin production. Infect. Immun. 61, 1799–1809 (1993). 112. Viala, J. et al. Nod1 responds to peptidoglycan delivered by the Helicobacter pylori cag pathogenicity island. Nature Immunol. 5, 1166–1174 (2004).

Acknowledgments
The authors wish to dedicate this article to their long-time academic teacher and mentor, W. Opferkuch, on the occasion of his 75th birthday. M. Achtman and D. Falush are acknowledged for many fruitful discussions on bacterial evolution. We also thank three anonymous reviewers for helpful suggestions. Work in the authors’ laboratories was supported by grants from the German Research Foundation (DFG), the German Ministry for Education and Research (Competence network PathoGenoMik and ERA-NET Pathogenomics - HELDIVNET), the European Commission (FP6 Integrated Project INCA) and the Volkswagen Foundation.

Competing interests statement
The authors declare no competing financial interests.

DATABASES
The following terms in this article are linked online to:
Entrez Gene: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
amiA | babA | cagA | cagC | cagY | fliP | mutY | sabA | vacA
Entrez Genome Project: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj
Escherichia coli | Helicobacter hepaticus | Helicobacter pylori | Helicobacter pylori HPAG1 | Helicobacter pylori J99 | Helicobacter pylori 26695
UniProtKB: http://ca.expasy.org/sprot
AlpA | AlpB | BabA | CagA | CagC | CagY | VacA

FURTHER INFORMATION
MLST databases: http://www.mlst.net
Structure software: http://pritch.bsd.uchicago.edu/structure.html

Nature Reviews Microbiology, Volume 5, June 2007. www.nature.com/reviews/micro © 2007 Nature Publishing Group