
REVIEW ARTICLE
Structural diversity in Salmonella O antigens and its genetic
basis
Bin Liu1,2, Yuriy A. Knirel3, Lu Feng1,2,4, Andrei V. Perepelov3, Sof’ya N. Senchenkova3,
Peter R. Reeves5 & Lei Wang1,2,4,6
1TEDA School of Biological Sciences and Biotechnology, Nankai University, TEDA, Tianjin, China; 2The Key Laboratory of Molecular Microbiology
and Technology, Ministry of Education, Tianjin, China; 3N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow,
Russian Federation; 4Tianjin Key Laboratory of Microbial Functional Genomics, Tianjin, China; 5School of Molecular and Microbial Bioscience
(G08), University of Sydney, Sydney, Australia; and 6Tianjin Research Center for Functional Genomics and Biochip, Tianjin, China
Correspondence: Lei Wang, TEDA School of
Biological Sciences and Biotechnology,
Nankai University, 23 Hongda Street, TEDA,
Tianjin 300457, China.
Tel.: 86 22 66229588; fax: 86 22 66229596;
e-mail: [email protected]
Received 30 November 2012; revised 15 May
2013; accepted 5 July 2013. Final version
published online 2 August 2013.
DOI: 10.1111/1574-6976.12034
MICROBIOLOGY REVIEWS
Editor: Wilbert Bitter
Keywords
polysaccharide; pathogen; polymorphism;
serotyping; evolution; glycosyltransferase.
Abstract
This review covers the structures and genetics of the 46 O antigens of Salmonella, a major pathogen of humans and domestic animals. The variation in
structures underpins the serological specificity of the 46 recognized serogroups.
The O antigen is important for the full function and virulence of many bacteria, and the considerable diversity of O antigens can confer selective advantage.
Salmonella O antigens can be divided into two major groups: those which have
N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc) and those
which have galactose (Gal) as the first sugar in the O unit. In recent years, we
have determined 21 chemical structures and sequenced 28 gene clusters for
GlcNAc-/GalNAc-initiated O antigens, thus completing the structure and DNA
sequence data for the 46 Salmonella O antigens. The structures and gene clusters of the GlcNAc-/GalNAc-initiated O antigens were found to be highly
diverse, and 24 of them were found to be identical or closely related to Escherichia coli O antigens. Sequence comparisons indicate that all or most of the
shared gene clusters were probably present in the common ancestor, although
alternative explanations are also possible. In contrast, the better-known eight
Gal-initiated O antigens are closely related both in structures and gene cluster
sequences.
Introduction
O antigen (O polysaccharide) is a part of the lipopolysaccharide (LPS) component of the outer membrane of
Gram-negative bacteria and is one of the most variable
cell constituents. It consists of oligosaccharide repeats
(O units), normally containing two to eight sugar residues.
The variation is mostly in the types of sugars present,
their order in the structure, and the linkages between
them. The O antigen is subject to intense selection by the
host immune system, bacteriophages, and other environmental factors (Reeves & Wang, 2002), which may
account for the maintenance of diverse O-antigen forms
within a species. O-antigen diversity is a common basis
for bacterial serotyping and also important for the bacteria, as it allows each of the various clones to present a
surface that offers selective advantage in its specific niche
ª 2013 Federation of European Microbiological Societies.
Published by John Wiley & Sons Ltd. All rights reserved
(Reeves, 1992). The presence of O antigen is also essential
for survival of bacteria in their natural environment and
plays a role in bacterial virulence. There is direct evidence
that the loss of O antigen makes many pathogens, such
as Escherichia coli, Shigella flexneri, Francisella tularensis,
and Yersinia enterocolitica, serum sensitive or otherwise
seriously impaired in virulence (Pluschke et al., 1983;
Bengoechea et al., 2004; West et al., 2005; Plainvert et al.,
2007; Raynaud et al., 2007).
Salmonella is recognized as a major pathogen of both
animals and humans and is the cause of typhoid fever,
paratyphoid fever, and the foodborne illness salmonellosis. Salmonella infections arise from contamination of
poultry, eggs, beef, and other foods, sometimes including
unwashed fruits and vegetables. In many countries, Salmonella is the leading cause of foodborne outbreaks and
infections. It is estimated that there are 1.3 million cases
FEMS Microbiol Rev 38 (2014) 56–89
Salmonella O-antigen diversity
of salmonellosis, 15 000 hospitalizations, and 400 deaths
annually in the United States (Hardnett et al., 2004). The
genus Salmonella includes two species, S. enterica and
S. bongori. S. enterica is divided into six
subspecies: S. enterica enterica, S. enterica salamae, S. enterica arizonae, S. enterica diarizonae, S. enterica houtenae,
and S. enterica indica (subspecies I, II, IIIa, IIIb, IV,
and VI, respectively). S. bongori was originally designated
S. enterica subspecies V, but it has since been determined
to be a separate species. This classification has been confirmed by multilocus enzyme electrophoresis and sequence
analysis of housekeeping genes (Nelson et al., 1991; Nelson
& Selander, 1992; Boyd et al., 1994; McQuiston et al.,
2008).
Serotyping is highly useful for identifying strains that
vary in host range and disease spectrum, including pathogens such as Salmonella, and is invaluable for epidemiological investigations. The Kauffmann–White–Le Minor
serotyping scheme for designation of Salmonella serotypes, maintained by the WHO Collaborating Centre for
Reference and Research on Salmonella, is used by most
laboratories for the characterization of Salmonella isolates.
A serotype of Salmonella is determined on the basis of O
and flagellar (H) antigens. The O antigen determines the
serogroup, while the H antigen completes the definition
of the serovar or serotype of a Salmonella isolate.
There are 46 O serogroups described in the Kauffmann–
White–Le Minor scheme. These were originally designated
by letters of the alphabet, but later, it was necessary
to continue with numbers 51–67. The genes specific for
O-antigen synthesis are normally present as a gene cluster
in the chromosome, which maps between galF and gnd in
Salmonella, E. coli, and Shigella, but sometimes, one or
more such genes map outside the gene cluster. There are
114 H antigens in Salmonella (McQuiston et al., 2004), and
2557 serovars in total have been recognized (Grimont &
Weill, 2007). Approximately 60% of the serovars belong to
subspecies I, while subspecies VI and S. bongori are rare.
O-antigen gene clusters appear to have been transferred
among subspecies, as the majority of Salmonella O antigens
are found in at least two subspecies with a mean of 3.5
subspecies per O antigen (Reeves, 1995; Popoff & Le
Minor, 1997).
Genetic variation in the O-antigen gene cluster is
the major determinant of differences among the diverse
O-antigen forms. O-antigen synthesis genes fall into three
main classes: (1) nucleotide sugar precursor synthesis
genes for sugars specific to the O antigen (common sugars that are also found in
other polysaccharide structures or are involved in metabolism, such as glucose (Glc), galactose (Gal), and N-acetylglucosamine (GlcNAc), are usually synthesized by genes
outside the O-antigen gene cluster); (2) sugar transferase
genes associated with O-unit assembly that are
specific for the donor and acceptor sugars and generate a
specific linkage between them; and (3) genes for O-unit
processing and the conversion of the O unit to O antigen
(wzx and wzy in the Wzx/Wzy pathway and wzm and wzt
in the ABC transporter pathway). However, genes on
bacteriophages or other chromosomally encoded genes,
which are not located in the O-antigen gene cluster, are
often involved in modification of the structure and
particularly in the addition of side-chain residues to the
O units.
The synthesis and translocation of the O antigen can
occur through three distinct pathways: the Wzx/Wzy
pathway, the ATP-binding cassette (ABC) transporter
pathway, and the synthase pathway (Bronner et al., 1994;
Keenleyside & Whitfield, 1996; Daniels et al., 1998;
Linton & Higgins, 1998; Samuel & Reeves, 2003). In the
Wzx/Wzy pathway, the O unit is synthesized by sequential transfer of a sugar phosphate and one or more sugars
from the respective nucleotide sugars to the carrier lipid,
namely undecaprenyl phosphate (UndP). O units are
flipped across the cytoplasmic membrane and then polymerized to form polysaccharide chains, which are transferred to the independently synthesized core-lipid
A component to form LPS (Mulford & Osborn, 1983;
McGrath & Osborn, 1991; Reeves & Wang, 2002). In the
ABC transporter pathway, the glycosyltransferases mediate
the sequential addition of sugar residues to the nonreducing end of the growing polymer to form the complete
O-antigen polymer that is attached to UndPP. The polysaccharide is then translocated across the cytoplasmic
membrane by an ABC transporter and ligated to the
core-lipid A to form the complete LPS (Bronner et al.,
1994; Linton & Higgins, 1998). In the synthase pathway
used for synthesis of the Salmonella O54 antigen, a
synthase catalyzes the extension of the polysaccharide
chain with simultaneous extrusion of the nascent polymer across the cytoplasmic membrane (Keenleyside &
Whitfield, 1996).
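As a rough illustration of how the processing genes named above relate to the first two pathways, the following Python sketch guesses the pathway from the genes found in a cluster. It is a toy under stated assumptions: the function name `infer_pathway`, the marker table, and the example gene lists (including the placeholder glycosyltransferase name `gtA`) are hypothetical, and the rule that both marker genes must be present is our simplification. The synthase pathway has no marker gene named in the text, so such clusters are reported as undetermined.

```python
from typing import Iterable

# Processing genes named in the text for each pathway.
# The mapping itself is an illustrative simplification, not a published rule.
PATHWAY_MARKERS = {
    "Wzx/Wzy": {"wzx", "wzy"},
    "ABC transporter": {"wzm", "wzt"},
}


def infer_pathway(genes: Iterable[str]) -> str:
    """Return the pathway whose marker genes are all present, else 'undetermined'."""
    present = {gene.lower() for gene in genes}
    hits = [
        name
        for name, markers in PATHWAY_MARKERS.items()
        if markers <= present  # every marker gene is in the cluster
    ]
    # Ambiguous (both sets present) or empty results are not guessed at.
    return hits[0] if len(hits) == 1 else "undetermined"


# Hypothetical gene lists; "gtA" is a placeholder glycosyltransferase name.
print(infer_pathway(["gtA", "wzx", "wzy"]))
print(infer_pathway(["wzm", "wzt"]))
print(infer_pathway(["gtA"]))
```

Requiring both marker genes keeps the sketch conservative: a cluster carrying only one of a pair is left unclassified rather than assigned by guesswork.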
Most Salmonella O antigens (39 in total including
Salmonella O54 and O67 and taking into account that
Salmonella O28 is divided into O28ab and O28ac) have
either GlcNAc or N-acetylgalactosamine (GalNAc) as the
first sugar of the O unit. As in most E. coli and Shigella
strains, WecA, which is encoded by a gene in the enterobacterial common antigen (ECA) gene cluster, is responsible
for initiating the synthesis of GlcNAc- and GalNAc-initiated O antigens by transferring GlcNAc-1-phosphate to
the UndP carrier. When GalNAc is the initiating sugar,
UndPP–GlcNAc is then converted to UndPP–GalNAc by
an epimerase, which is encoded by a gene that has been
called gne (Rush et al., 2010). However, we suggest that
this gene be renamed gnu, as its product is specific for
UndPP–GlcNAc, whereas epimerases encoded by gne are
specific for UDP–GlcNAc (Cunneen et al., 2013).
Salmonella O67 occurs rarely and has been suggested
to be a variant of serogroup O4 (B) O antigen (Li &
Reeves, 2000). However, in this study, we found that the
O-antigen structure of Salmonella O67 is similar to that
of D-galactan I O antigen in Klebsiella pneumoniae and
that its gene cluster is not located between galF and gnd.
Salmonella O54 has a disaccharide O unit composed of
two ManNAc residues. The O54 antigen gene cluster is
on a plasmid, and the O antigen expressed from the main
O-antigen gene cluster is present together with the O54
antigen (Keenleyside & Whitfield, 1996). The O54 serogroup is currently retained, but if the plasmid is lost,
factor O54 is no longer expressed.
The Salmonella O antigens belonging to serogroups O2
(A), O4 (B), O8 (C2–C3), O9 (D1), O9,46 (D2),
O9,46,27 (D3), O3,10 (E1–E3), and O1,3,19 (E4) form a
distinct set that is characterized by having a Gal residue
as the first sugar of the O unit and a wbaP gene in the
O-antigen gene cluster, which encodes the glycosyltransferase that catalyzes the addition of the Gal-1-phosphate
residue to UndP to initiate O-unit synthesis. These serogroups have related O-antigen structures and gene clusters (Reeves et al., 2013). Details of their relationships
show that they have a complex evolutionary history that
will be reviewed separately (Reeves et al., 2013).
Although GlcNAc-/GalNAc-initiated O antigens outnumber Gal-initiated O antigens in Salmonella (39 vs. 8),
the latter were found to be more prevalent in Salmonella
isolates. Among Salmonella isolates from human sources
reported between 1999 and 2009 by the Centers for Disease Control and Prevention in the United States, 84.23% of
isolates belonged to serogroups with a Gal-initiated O
antigen, and only 5.35% of isolates belonged to serogroups
with a GlcNAc-/GalNAc-initiated O antigen (other isolates
could not be serotyped) (CDC, 2009).
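The counts in this paragraph can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are ours, and the figures are those quoted in the text. It shows that the 39 + 8 O-antigen forms correspond to the 46 serogroups once O28ab and O28ac are counted as the single serogroup O28, and that the isolates outside both groups make up the remainder of the CDC figures.

```python
# O-antigen forms by initiating sugar, as quoted in the text.
glcnac_galnac_forms = 39  # includes O54 and O67; O28 counted as O28ab and O28ac
gal_forms = 8

total_forms = glcnac_galnac_forms + gal_forms
# O28ab and O28ac belong to one serogroup (O28), so subtract one.
serogroups = total_forms - 1

# Shares of human isolates reported for 1999-2009, as quoted in the text.
gal_pct = 84.23
glcnac_galnac_pct = 5.35

# Remainder: isolates that could not be serotyped.
untyped_pct = round(100.0 - gal_pct - glcnac_galnac_pct, 2)
# How many times more common Gal-initiated serogroups were among isolates.
prevalence_ratio = gal_pct / glcnac_galnac_pct

print(total_forms, serogroups, untyped_pct, round(prevalence_ratio, 1))
```

On these figures, the Gal-initiated serogroups account for roughly 16 times as many isolates as the GlcNAc-/GalNAc-initiated ones, despite comprising far fewer O-antigen forms.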
Systematically analyzing the chemical structures and
gene clusters of different O-antigen forms in a genus or
species will improve our understanding of the generation
of the O-antigen diversity. It will also open the way for
experimental studies on the relationship between this
diversity and pathogenicity. Many laboratories in the
world have worked on the structure, genetics, and function of O antigens. However, most of these studies have
focused on relatively few O-antigen forms.
In a previous review, we summarized the structures
and gene clusters of all Shigella O antigens (Liu et al.,
2008) and found many genetic anomalies in the gene
clusters. It was suggested that the Shigella set of O antigens has been assembled relatively recently or undergone
adaptive modifications in a newly occupied niche. Salmonella, Shigella, and E. coli are known to be evolutionarily
B. Liu et al.
related (Ochman & Wilson, 1987), and we also presented
evidence in support of the close relationship between Shigella and E. coli, as 21 of 34 Shigella O antigens are either
identical or closely related to an E. coli O antigen.
Homologous recombination was shown to be an essential
mechanism in the diversification of Shigella O antigens.
Shigella is a pathogenic form that was estimated to have
developed within E. coli several times over the last 35 000–
270 000 years (Pupo et al., 2000), but these events were
probably more recent as mutation rates in bacterial clones
observed in recent studies are much higher than earlier estimates, and this affects the date estimates (Feng et al., 2008;
Ho et al., 2011; Morelli et al., 2011; Reeves et al., 2011). In
contrast, Salmonella is a distinct genus with a much longer
history (Ochman & Wilson, 1987; Doolittle et al., 1996); it
is thought that E. coli and Salmonella diverged from a common ancestor about 140 million years ago. The evolutionary mechanisms for generation of the O-antigen diversity
in Salmonella are expected to be different from those in
Shigella. When we started a systematic study of Salmonella
and E. coli O antigens, there were only four cases in which
the O-antigen structure had been shown to be identical in
the two species (Rundlof et al., 1998; Samuel et al., 2004),
although more serological cross-reactions had been
observed (Orskov et al., 1977). It was suggested that there
had been extensive replacement of O antigens, presumably
by lateral gene transfer, since divergence of the two species.
In the past 5 years, we have sequenced 28 Salmonella
O-antigen gene clusters, 14 of which are reported here for
the first time, and determined 21 Salmonella O-antigen
chemical structures, five of which are reported here for
the first time. We have also revised the chemical structures of another three Salmonella O antigens. In this
study, we present a compilation of the published and new
chemical structures and DNA sequence data for the 46
known Salmonella O antigens. Together with the summary of Shigella O antigens, it gives an improved insight
into the evolution of O-antigen diversity in bacteria. The
structures and gene clusters of GlcNAc-/GalNAc-initiated
O antigens were found to be highly diverse. However, the
proportion of genetic anomalies in these gene clusters is
clearly lower than that in Shigella, indicating that these O
antigens are more stable. We also sequenced 18 E. coli
O-antigen gene clusters and determined 9 and revised 2
E. coli O-antigen chemical structures to obtain sufficient
data for a comparison of all O antigens shared by Salmonella and E. coli (the others were retrieved from databases). We found that 24 Salmonella O-antigen forms are
either identical or closely related to E. coli O antigens, as
indicated by both genetic and structural data. Therefore,
the relationship between E. coli and Salmonella O antigens
is much closer than previously thought. The genetic data
imply that almost all O antigens shared by Salmonella
and E. coli originated from an O antigen in their common ancestor, although alternative explanations (such as
a recent lateral transfer of a gene cluster from one species
to the other) are also possible. In contrast to Salmonella
GlcNAc-/GalNAc-initiated O antigens, Salmonella Gal-initiated O antigens exhibit a high level of relatedness
in structure and genetic aspects, implying a distinct
evolutionary history.
Chemical composition and structures for
Salmonella GlcNAc-/GalNAc-initiated
O antigens
The structures of all GlcNAc-/GalNAc-initiated Salmonella
O antigens are now known (Table 1). Some structures elucidated by us only recently have not been reported earlier
and are presented here for the first time. They were established using one- and two-dimensional 1H- and 13C-NMR
spectroscopy essentially as described (Duus et al.,
2000). Three new Salmonella structures, those of O42, O52,
and O65 antigens, are identical to the known structures of
E. coli O1B (Gupta et al., 1992), O153 (Ratnayake et al.,
1994), and O78 (Jansson et al., 1987), respectively.
The O-antigen structure of Salmonella O67 was found
to be highly similar to that of D-galactan I (→3)-D-Galf-(β1→3)-D-Galp-(α1→) in K. pneumoniae (Whitfield
et al., 1991). The only difference between the two O antigens is the presence of an O-acetyl group in Salmonella
O67. Its position was determined by a comparison of the
NMR spectra of the initial and O-deacetylated polysaccharides, which revealed characteristic displacements of
1H- and 13C-NMR signals caused by a deshielding effect
of the O-acetyl group.
Using 13C-NMR spectroscopy and the ‘fingerprint’
method, it was found that the O antigen of Salmonella
O21 has the same structure as that reported erroneously
for S. enterica arizonae O64 and Citrobacter freundii O32
(Kocharova et al., 1988). Previously, an incorrect structure was
assigned to the S. enterica arizonae O21 O antigen
(Vinogradov et al., 1994), which, in fact, may belong to
Citrobacter braakii O37 (A. Gamian, pers. commun.).
In addition, structures of two Salmonella O antigens
were revised in this work. Using known regularities in the
13C-NMR chemical shifts of the Quip3NAc-(α1→3)-D-Manp disaccharide (Shashkov et al., 1988), the absolute
configuration of Qui3NAc in the O39 antigen was revised
from L (Gajdus et al., 2009) to D. In the O62 antigen,
D-GalNAcA is present in the amide form (D-GalNAcAN)
rather than as the free acid (Vinogradov et al., 1992).
This was demonstrated by the 1H-NMR spectrum of a
polysaccharide sample measured in a 9:1 H2O/D2O
mixture, which showed two signals for NH2 protons at 7.40
and 7.65 ppm [compare published data (Rundlof et al.,
1998)]. We have also revised the N-acyl group on L-FucN
in the O48 antigen from N-acetyl to N-acetimidoyl (Feng
et al., 2005b).
Except for the O54 and O67 antigens, all Salmonella
GlcNAc-/GalNAc-initiated O antigens are heteropolysaccharides. Some are linear and have tri- to pentasaccharide
O units. Others are branched with tetra- to hexasaccharide O units usually including one or two monosaccharide side chains or, less often, a disaccharide side chain.
Most of the sugars are in the pyranose form, whereas
D-Gal occurs in the furanose form in two O antigens, and
D-Rib exists in the furanose form in all cases.
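The linear notation used for the structures in Table 1 can be read mechanically. The Python sketch below splits a simple unbranched O unit into its residues and linkages. It assumes, as in the rows of Table 1, that the initiating sugar is written last, and it does not handle side chains, O-acetyl groups, or phosphodiester links. The function name and regular expressions are ours, and the example string is the backbone listed for Salmonella O6,7 (C1) in Table 1.

```python
import re

# Whole unit: "→<acceptor position>)-<body>-(<anomer><donor position>→"
UNIT = re.compile(r"^→(\d)\)-(.+)-\(([αβ])(\d)→$")
# Internal linkage between two residues, e.g. "-(α1→2)-"
LINK = re.compile(r"-\(([αβ])(\d)→(\d)\)-")


def parse_linear_o_unit(structure: str) -> dict:
    """Split an unbranched O unit into residues, linkages, and end groups."""
    match = UNIT.match(structure)
    if match is None:
        raise ValueError("not a simple linear O unit")
    acceptor_pos, body, anomer, donor_pos = match.groups()
    # re.split keeps the three captured groups, so the list alternates:
    # residue, anomer, donor position, acceptor position, residue, ...
    parts = LINK.split(body)
    residues = parts[0::4]
    linkages = [
        f"{parts[i]}{parts[i + 1]}→{parts[i + 2]}"
        for i in range(1, len(parts), 4)
    ]
    return {
        "residues": residues,
        "linkages": linkages,
        # Assumption: the initiating sugar is the last residue written.
        "initiating_sugar": residues[-1],
        # Linkage that joins one O unit to the next in the polymer.
        "polymerization_linkage": f"{anomer}{donor_pos}→{acceptor_pos}",
    }


o6_7_backbone = (
    "→2)-D-Manp-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(β1→"
)
parsed = parse_linear_o_unit(o6_7_backbone)
print(len(parsed["residues"]), parsed["initiating_sugar"])
```

Counting `parsed["residues"]` gives the size of the O unit (here a pentasaccharide), and the last residue identifies whether the unit is GlcNAc- or GalNAc-initiated.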
In addition to D-GlcNAc and D-GalNAc, hexoses D-Glc,
D-Man, D-Gal, L-Rha, and L-Fuc occur in six or more O
antigens each (Supporting Information, Table S1). When
D-Glc is present as a side chain, its content is often less
than stoichiometric, and there is no putative glycosyltransferase for its transfer in the gene clusters, both indicating that this sugar is incorporated into the O antigen
after assembling and processing of the O unit. Exceptionally, a side-chain Glc was proposed to be transferred by a
glycosyltransferase encoded in the Salmonella O66 gene
cluster (Liu et al., 2010a).
Other monosaccharides are components in 1–4 O antigens each. These include neutral sugars (D-Rib, Col) and
various uncommon amino sugars (D-ManN, L-QuiN,
L-FucN, D-Qui3N, D-Fuc3N, D-Qui4N, D-Rha4N). In most
cases, the amino sugars are N-acetylated, but in some O
antigens, they carry rarely occurring N-acyl groups, such
as N-acetimidoyl on L-FucN, N-formyl on D-Fuc3N,
(R)-3-hydroxybutanoyl on 8-epilegionaminic acid (8eLeg),
and N-acetyl-L-seryl or N-[(S)-3-hydroxybutanoyl]-D-alanyl on Qui4N (Table S1). O-Acetylation is not uncommon
in Salmonella O antigens, and one or two O-acetyl groups
are present in nonstoichiometric quantities in the O units
of 7 O serogroups (Table 1).
In contrast to the O antigens of Shigella (Liu et al.,
2008), the O antigens of Salmonella are typically neutral
polysaccharides. Only a few of them contain acidic components, such as hexuronic acids (D-GlcA and D-GalNAcA
in the O45 and O62 antigens, respectively), nonulosonic
acids (derivatives of Neu and 8eLeg in the O48 and O61
antigens, respectively), and ribitol phosphate (in the O47
antigen). However, GalNAcA exists in the neutral amide
form, and the negative charge of both nonulosonic acids
and the phosphate group is neutralized by a basic N-acetimidoyl group on an L-FucN residue.
8eLeg5RHb7Ac (7-acetamido-3,5,7,9-tetradeoxy-5-[(R)-3-hydroxybutanoylamino]-L-glycero-D-galacto-non-2-ulosonic
acid, a derivative of 8-epilegionaminic acid) is a higher
sugar rarely occurring in nature and is similar to isomeric
nonulosonic acids found in some other bacterial carbohydrates, di-N-acyl derivatives of Pse (5,7-diamino-3,5,7,
Table 1. Structures of Salmonella GlcNAc-/GalNAc-initiated O antigens, including O54 and O67, and related Escherichia coli O antigens
Bacterium*, serogroup, Salmonella serovar or subspecies | O-antigen structure† | References
SO6,7 (C1)
Thompson
→2)-D-Manp-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(β1→
Lindberg et al. (1988)
D-Glcp
SO6,7 (C1)
Thompson,
Livingstone
α1
↓
3
→2)-D-Manp-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(β1→
Lindberg et al. (1988)
Di Fabio et al. (1989b)
D-Glcp
SO6,7 (C1)
Ohio
α1
↓
3
→2)-D-Manp-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(β1→
Di Fabio et al. (1989c)
D-Glcp
SO6,7 (C4)
Livingstone
var. 14+
α1
↓
3
→2)-D-Manp-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(β1→
Di Fabio et al. (1988b)
D-Manp
β1
↓
4
SO11 (F)
Aberdeen
EO75
→3)-D-Galp-(α1→4)-L-Rhap-(α1→3)-D-GlcpNAc-(β1→
SO13 (G)
→2)-L-Fucp-(α1→2)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GlcpNAc-(α1→
EO127
Ac (~70%)
|
3
→2)-L-Fucp-(α1→2)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GalpNAc-(α1→
4
|
Ac (~40%)
SO6,14 (H)
Boecker,
Carrau,
Madelia
EO77
Szafranek et al. (2003)
Erbing et al. (1978)
→6)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(α1→
Perepelov et al. (2010e)
Widmalm & Leontein
(1993)
Perepelov et al. (2010e)
Brisson & Perry (1988)
Di Fabio et al. (1988a)
Di Fabio et al. (1989a)
Yildirim et al. (2001)
D-Glcp
SO6,14 (H)
Carrau,
Madelia
EO44
α1
↓
3
→6)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(α1→
Di Fabio et al. (1988a)
Di Fabio et al. (1989a)
Staaf et al. (1995)
D-Glcp
SO6,14 (H)
Madelia
α1
↓
4
→6)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(α1→
Di Fabio et al. (1989a)
EO17
Masoud & Perry
(1996)
D-Glcp
α1
↓
6
→6)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(α1→
D-Glcp
EO73
D-Glcp
α1
α1
↓
↓
4
3
→6)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GlcpNAc-(α1→
Wang et al. (2007)
L-Fucp
SO16 (I)
D-Glcp (~50%)
α1
Ac (~20/40/20 %)
β1
↓
|
↓
3
2/3/4
4
→4)-D-GalpNAc-(α1→6)-D-Manp-(α1→3)-L-Fucp-(α1→3)-D-GalpNAc-(β1→
Li et al. (2010b)
L-Fucp
EO11
α1
↓
3
→4)-D-GalpNAc-(α1→6)-D-Manp-(α1→3)-L-Fucp-(α1→3)-D-GalpNAc-(β1→
Li et al. (2010b)
D-Galf
SO17 (J)
α1
Ac (~80%)
↓
|
4
2
→2)-D-Galp-(α1→3)-D-ManpNAc-(β1→6)-D-Galf-(β1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2011d)
D-Galf
EO85
SO18 (K)
Cerro
α1
↓
4
→2)-D-Galp-(α1→3)-D-ManpNAc-(β1→6)-D-Galf-(β1→3)-D-GlcpNAc-(β1→
→4)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GalpNAc-(α1→
Perepelov et al.
(2011d)
Vinogradov et al.
(2004)
D-Glcp
E. coli 73-1
α1
↓
3
→4)-D-Manp-(α1→2)-D-Manp-(α1→2)-D-Manp-(β1→3)-D-GalpNAc-(α1→
Weintraub et al.
(1993)
D-GlcpNAc
SO21 (L)‡
α1
↓
3
→4)-D-GalpNAc-(β1→3)-D-Galp-(α1→4)-D-Galp-(β1→3)-D-GalpNAc-(β1→
This review
(~22%)
D-Glcp (~55%)
α1
α1
↓
↓
3
4
→4)-D-Quip3NAc-(β1→3)-D-Ribf-(β1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→
D-Galp-(α1→3)-D-Galp
SO28ab (M)
Telaviv
Kumirska et al.
(2011)
EO5ab
→4)-D-Quip3NAc-(β1→3)-D-Ribf-(β1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→
MacLean & Perry
(1997)
D-Glcp
SO28ac (M)
Dakar
EO71
SO30 (N)
Landau
β1
↓
4
→4)-D-Quip3NAc-(α1→3)-L-Rhap-(α1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→
Ac (~10%)
Ac (~30%)
|
|
4
2
→4)-D-Quip3NAc-(α1→3)-L-Rhap-(α1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→
3
|
Ac (~55%)
Ac (~50%)
|
6
→2)-D-Rhap4NAc-(α1→3)-L-Fucp-(α1→4)-D-Glcp-(β1→3)-D-GalpNAc-(α1→
Kumirska et al.
(2007)
Hu et al. (2010)
Bundle et al. (1986)
D-Glcp
SO30 (N)
Urbana,
Godesberg
β1
↓
4
→2)-D-Rhap4NAc-(α1→3)-L-Fucp-(α1→4)-D-Glcp-(β1→3)-D-GalpNAc-(α1→
EO157
→2)-D-Rhap4NAc-(α1→3)-L-Fucp-(α1→4)-D-Glcp-(β1→3)-D-GalpNAc-(α1→
Perry et al. (1986a)
Colp
α1
↓
3
→4)-D-Glcp-(α1→4)-D-Galp-(α1→3)-D-GlcpNAc-(β1→
6
↑
α1
Colp
Kenne et al. (1983)
SO35 (O)
Adelaide
EO111
D-Galp
SO38 (P)
EO21
SO39 (Q)
Mara
D-GlcpNAc
β1
β1
↓
↓
4
2
→3)-D-Galp-(β1→4)-D-Glcp-(β1→3)-D-GalpNAc-(β1→
Perry et al. (1986b)
→2)-D-Quip3NAc-(α1→3)-D-Manp-(α1→3)-L-Fucp-(α1→3)-D-GalpNAc-(α1→
Li et al. (2010b)
Staaf et al. (1999)
Gajdus et al. (2009)
this review
D-GlcpNAc
SO40 (R)
Riogrande
β1
↓
2
→4)-D-GalpNAc-(α1→3)-D-Manp-(β1→4)-D-Glcp-(β1→3)-D-GalpNAc-(α1→
SO41 (S)
→2)-D-Manp-(β1→4)-D-Glcp-(α1→3)-L-QuipNAc-(α1→3)-D-GlcpNAc-(α1→
Perry & MacLean
(1992b)
Perepelov et al.
(2010b)
SO42 (T)
EO1B
This review
Gupta et al. (1992)
D-ManpNAc
β1
↓
2
→3)-L-Rhap-(α1→2)-L-Rhap-(α1→2)-D-Galp-(α1→3)-D-GlcpNAc-(β1→
D-Galp
SO43 (U)
Milwaukee
α1
↓
3
→4)-L-Fucp-(α1→2)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GlcpNAc-(β1→
Perry & MacLean
(1992b)
D-Galp
EO86:K2:H2
α1
↓
3
→4)-L-Fucp-(α1→2)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GalpNAc-(β1→
Andersson et al.
(1989)
D-GlcpNAc
SO44 (V)
SO45 (W)
ssp. arizonae
β1
↓
3
→2)-D-Glcp-(α1→6)-D-Glcp-(α1→4)-D-Galp-(α1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2010d)
L-Fucp
Ac (~80%)
α1
|
↓
3
2
→4)-D-GlcpA-(β1→4)-L-Fucp-(α1→3)-D-Ribf-(β1→4)-D-Galp-(β1→3)-D-GlcpNAc-(β1→
SO47 (X)
Ac (~10%)
|
4
→2)-D-Rib-ol-5-P-(O→6)-D-Galp-(α1→3)-L-FucpNAm-(α1→3)-D-GlcpNAc-(α1→
EO118
→3)-D-Rib-ol-5-P-(O→6)-D-Galp-(α1→3)-L-FucpNAm-(α1→3)-D-GlcpNAc-(β1→
Shashkov et al.
(1993)
Perepelov et al.
(2009)
Liu et al. (2010b)
D-GlcpNAc
EO151
SO48 (Y)
Toucra
EO145
β1
↓
4
→2)-D-Rib-ol-5-P-(O→6)-D-Galp-(α1→3)-L-FucpNAm-(α1→3)-D-GlcpNAc-(β1→
→4)-Neup5Ac-(α2→3)-L-FucpNAm-(α1→3)-D-GlcpNAc-(β1→
7,9
|
Ac (~30%, ~70%)
→4)-Neup5Ac-(α2→3)-L-FucpNAm-(α1→3)-D-GlcpNAc-(β1→
Liu et al. (2010b)
Gamian et al. (2000)
Feng et al. (2005b)
Feng et al. (2005b)
SO50 (Z)
Greenside
EO55
SO50
ssp. arizonae
Colp-(α1→2)-D-Galp
β1
↓
3
→6)-D-GlcpNAc-(β1→3)-D-Galp-(α1→3)-D-GalpNAc-(β1→
Kenne et al. (1983)
Lindberg et al. (1981)
Colp-(α1→2)-D-Galp
β1
↓
3
→6)-D-GlcpNAc-(β1→3)-D-Galp-(α1→3)-D-GlcpNAc-(β1→
Senchenkova et al.
(1997)
D-GlcpNAc
SO51
β1
↓
3
→6)-D-Glcp-(α1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2011c)
D-GlcpNAc
EO23
SO52
EO153
SO53
SO54
Borreze
D-Glcp
β1
α1
↓
↓
3
6
→6)-D-Glcp-(α1→4)-D-Galp-(β1→3)-D-GalpNAc-(α1→3)-D-GlcpNAc-(β1→
Bartelt et al. (1993)
→2)-D-Ribf-(β1→4)-D-Galp-(β1→4)-D-GlcpNAc-(α1→4)-D-Galp-(β1→3)-D-GlcpNAc-(α1→
This review
Ratnayake et al.
(1994)
Ac (~60%, ~25%)
|
2,3
→2)-Galf-(α1→4)-D-GalpNAc-(β1→4)-L-Rhap-(α1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2011a)
→4)-D-ManpNAc-(β1→3)-D-ManpNAc-(β1→
Keenleyside et al.
(1994)
SO55
→2)-D-Glcp-(β1→2)-D-Fucp3NAc-(β1→6)-D-Glcp-(α1→4)-D-GalpNAc-(α1→3)-D-GlcpNAc-(β1→
Liu et al. (2010c)
EO103
→2)-D-Glcp-(β1→2)-D-Fucp3NRHb-(β1→6)-D-GlcpNAc-(α1→4)-D-GalpNAc-(α1→3)-D-GlcpNAc-(β1→
Liu et al. (2010c)
SO56
→3)-D-Quip4N(L-SerAc)-(β1→3)-D-Ribf-(β1→4)-D-GalpNAc-(α1→3)-D-GlcpNAc-(α1→
Perepelov et al.
(2010g)
D-GlcpNAc
SO57
EO51
β1
↓
2
→3)-L-Rhap-(α1→2)-L-Rhap-(α1→4)-D-Glcp-(α1→3)-D-GalpNAc-(β1→
SO58
→3)-D-Quip4N(D-Ala-SHb)-(β1→6)-D-GlcpNAc-(α1→3)-L-QuipNAc-(α1→3)-D-GlcpNAc-(α1→
EO123
Ac (~30%)
|
6
→3)-D-Quip4N(D-Ala-SHb)-(β1→6)-D-GlcpNAc-(α1→3)-L-QuipNAc-(α1→3)-D-GlcpNAc-(α1→
Perepelov et al.
(2011e)
Perepelov et al.
(2011e)
Perepelov et al.
(2010f)
Clark et al. (2009)
Perepelov et al.
(2010f)
SO59§
→2)-D-Galp-(β1→3)-D-GlcpNAc-(α1→4)-L-Rhap-(α1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2011b)
D-Fucp3NFo
SO60
SO61
ssp. arizonae
α1
↓
3
→2)-D-Manp-(β1→3)-D-Glcp-(β1→3)-D-GlcpNAc-(β1→
Perepelov et al.
(2010a)
→8)-8eLegp5RHb7Ac-(α2→3)-L-FucpNAmp-(α1→3)-D-GlcpNAc-(α1→
Vinogradov et al.
(1992)
D-GalpNAcAN
SO62 ssp. arizonae
EO35
α1
↓
2
→3)-L-Rhap-(α1→2)-L-Rhap-(α1→3)-L-Rhap-(α1→2)-L-Rhap-(α1→3)-D-GlcpNAc-(β1→
Vinogradov et al.
(1994)
this review
Rundlof et al. (1998)
D-Fucp3NAc
SO63
ssp.
arizonae
SO65
EO78
α1
↓
4
→3)-D-Galp-(β1→4)-D-Glcp-(α1→4)-D-GalpNAc-(α1→3)-D-GalpNAc-(β1→
Vinogradov et al.
(1987a)
→4)-D-Manp-(β1→4)-D-Manp-(α1→3)-D-GlcpNAc-(β1→4)-D-GlcpNAc-(β1→
This review
Jansson et al. (1987)
D-Glcp
SO66
β1
Ac (~90%)
↓
|
3
6
→2)-D-Galp-(α1→6)-D-Galp-(α1→4)-D-GalpNAc-(α1→3)-D-GalpNAc-(β1→
Liu et al. (2010a)
D-Glcp
EO166
SO67
β1
↓
3
→3)-D-Galp-(α1→6)-D-Galp-(α1→4)-D-GalpNAc-(α1→3)-D-GalpNAc-(β1→
Ac (∼30%)
|
2
→3)-D-Galf-(β1→3)-D-Galp-(α1→
Ali et al. (2007)
This review
*S, Salmonella; E, E. coli.
†For abbreviations of sugar residues and nonsugar groups, see Table S1.
‡Earlier, this structure was reported erroneously as that of S. enterica arizonae O64 and Citrobacter freundii O32 (Kocharova et al., 1988),
whereas another structure was reported for S. enterica arizonae O21 (Vinogradov et al., 1994), which, in fact, may belong to Citrobacter
braakii O37 (A. Gamian, pers. commun.).
§Earlier, another structure was reported for S. enterica arizonae O59 (Vinogradov et al., 1987b), which, in fact, belongs to Citrobacter braakii O35 (Kocharova et al., 1996) and E. coli O15 (Perepelov et al., 2011b).
9-tetradeoxy-L-glycero-L-manno-non-2-ulosonic or pseudaminic acid), and Leg (5,7-diamino-3,5,7,9-tetradeoxy-D-glycero-D-galacto-non-2-ulosonic or legionaminic acid)
(Knirel et al., 2003, 2012).
There are only two pairs of Salmonella O serogroups
with closely related GlcNAc-/GalNAc-initiated O antigens. The O13 and O43 antigens differ only in (1) the
configuration (α vs. β) and the position (1→2 vs. 1→4)
of the polymerization linkage between the O units; and
(2) the presence of a Gal side chain in the O43 antigen.
The O6,14 and O18 antigens, which share O factors 6
and 14, differ only in (1) the initiating amino sugar
(GlcNAc vs. GalNAc); and (2) the polymerization linkage (1→6 vs. 1→4). Within serogroups, nonglucosylated
Table 2. Summary of Salmonella and Escherichia coli sharing the identical or closely related O-antigen structures and gene clusters*
Salmonella O antigen   E. coli O antigen    Structure relationship   Reference for sequences (Salmonella/E. coli)
SO11 (F)               EO75                 =                        This review/Li et al. (2010a)
SO13 (G)               EO127                Cr                       Fitzgerald et al. (2007)/Iguchi et al. (2009)
SO6,14 (H)             EO77/O73/O17/O44     Cr                       Fitzgerald et al. (2003)/Wang et al. (2007)
SO16 (I)               EO11                 Cr                       Li et al. (2010a, b)/Li et al. (2010b)
SO17 (J)               EO85                 Cr                       Fitzgerald et al. (2006)/Perepelov et al. (2011d)
SO18 (K)               73-1†                Cr                       Fitzgerald et al. (2006)
SO28ac (M)             EO71                 Cr                       Hu et al. (2010)/Hu et al. (2010)
SO28ab (M)             EO5                  Cr                       Clark et al. (2010)/this review
SO30 (N)               EO157                Cr                       Samuel et al. (2004)/Wang & Reeves (1998)
SO35 (O)               EO111                =                        Wang & Reeves (2000)/Wang & Reeves (2000)
SO38 (P)               EO21                 =                        Li et al. (2010b)/Ren et al. (2008)
SO42 (T)               EO1B                 =                        This review/Li et al. (2010a)
SO43 (U)               EO86                 Cr                       This review/Feng et al. (2005a)
SO47 (X)               EO118/O151           Cr                       Liu et al. (2010b)/Liu et al. (2010b)
SO48 (Y)               EO145                Cr                       This review/Feng et al. (2005b)
SO50 (Z)               EO55                 =                        Samuel et al. (2004)/Wang et al. (2002b)
SO51                   EO23                 Cr                       Perepelov et al. (2011c)/Perepelov et al. (2011c)
SO52‡                  EO153                =                        This review/this review
SO55                   EO103                Cr                       Liu et al. (2010c)/Fratamico et al. (2005)
SO57                   EO51                 =                        Perepelov et al. (2011e)/Perepelov et al. (2011e)
SO58                   EO123                Cr                       Clark et al. (2009)/Beutin et al. (2007)
SO62                   EO35                 =                        This review/Liu et al. (2009)
SO65                   EO78                 =                        This review/Liu et al. (2009)
SO66                   EO166                Cr                       Liu et al. (2010a)/Liu et al. (2010a)
*S: Salmonella; E: E. coli; =: identical; Cr: closely related.
†The backbone of Salmonella O18 antigen is found to be identical to that of strain E. coli 73-1 (Weintraub et al., 1993). However, the O-serogroup and O-antigen gene cluster of E. coli 73-1 are unknown.
‡Salmonella O52 shares the same O-antigen structure with E. coli O153 (Ratnayake et al., 1994). However, their gene clusters are unrelated.
and glucosylated structural variants are known, for
example, compare the Salmonella O6,7, O6,14, and O30
antigens (Table 1). Finally, two structures have been
reported for the Salmonella O50 antigen, which differ
only in the initiating amino sugar (GlcNAc vs. GalNAc).
In contrast, the similarity of the O antigens of Salmonella O28ab and O28ac is limited to the presence of the
common D-Galp-(β1→3)-D-GalpNAc-(α1→4)-D-Quip3NAc trisaccharide fragment in the main chain, and
classification of the two bacteria to the same serogroup
requires reconsideration.
Remarkably, many Salmonella GlcNAc-/GalNAc-initiated
O antigens are closely related or even identical to E. coli
O antigens (Tables 1 and 2). Most O-antigen structures
shared by these bacteria have been reported by us or others
earlier, and some of them are discussed below.
General features of gene clusters for
Salmonella GlcNAc-/GalNAc-initiated
O antigens
The gene clusters of all GlcNAc-/GalNAc-initiated Salmonella O antigens have been sequenced (Fig. 1, Table S2).
Except for Salmonella O54 and O67, these gene clusters
are localized in the genomes between the galF and gnd
genes. Their general characteristics, such as having low
GC content (about 30%), using wzx/wzy as O-unit
processing genes, exhibiting great diversity, etc., are similar
to those of E. coli and Shigella.
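GC content, cited above as one of the shared characteristics of these gene clusters, is simply the fraction of G and C bases in a sequence. The following is a minimal sketch of that calculation, not the procedure used in this review; the function name `gc_content` and the example fragment are hypothetical, and characters other than A, C, G, and T are assumed to be excluded from the count.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases of a DNA sequence."""
    # Assumption: gaps and ambiguity codes (e.g. '-', 'N') are ignored.
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


# Hypothetical 20-bp fragment, not taken from any Salmonella gene cluster.
fragment = "ATTTAGCAATTAAGTCATTA"
print(f"GC content: {gc_content(fragment):.2f}")
```

Applied to a full gene-cluster sequence, a value well below the genome-wide average would correspond to the low GC content described in the text.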
Almost all Salmonella O antigens use the Wzx-/Wzy-dependent process for the synthesis and translocation of
O antigens, the only exceptions being O54 and O67,
which use the synthase-dependent pathway and the ABC
transporter pathway, respectively. The wzx and wzy genes
are usually located within the O-antigen gene cluster, but
for O66, there is no wzy gene in the gene cluster, and it
must be located elsewhere in the genome. This resembles
the situation in serogroups A, B, and D1, which have the
wzy gene at a locus far from the main gene cluster (Naide
et al., 1965; Curd et al., 1998). In E. coli, the ABC transporter pathway has been reported for 10 of the 148 O
antigens with sequenced gene clusters (O8, O9, O20,
O52, O89, O95, O97, O99, O101, O162) (Liu et al.,
2008), but is not found to mediate the synthesis of any
Salmonella or Shigella O antigens except for Salmonella
O67.
Several studies have shown that most gene clusters
for Salmonella Gal-initiated O antigens have a cassette
[Fig. 1 graphic not reproduced: gene maps (scale bar, 1 kb) of the O-antigen gene clusters of Salmonella serogroups O11 (F), O13 (G), O6,14 (H), O16 (I), O17 (J), O18 (K), O21 (L), O28ac (M), O28ab (M), O30 (N), O35 (O), O38 (P), O39 (Q), O40 (R), O41 (S), O42 (T), O43 (U), O44 (V), O45 (W), O47 (X), O48 (Y), O50 (Z), O51–O63, O65–O67, and C1. Color key: O-unit processing gene; glycosyltransferase gene; GDP-, UDP-, dTDP-, and CDP-sugar pathway genes; Neu5Ac synthesis gene; acetyltransferase gene; CDP-ribitol synthesis gene; 8eLeg5RHb7Ac synthesis gene; function-unknown gene; H-repeat element; gene remnant; IS remnant.]
Fig. 1. The O-antigen gene clusters of Salmonella GlcNAc-/GalNAc-initiated O antigens. Open arrows represent the location and orientation of
putative genes. The O-antigen gene clusters that are first reported in this review have been deposited in GenBank under accession numbers from
JX975328 to JX975348.
structure with a central set of variable serogroup-specific
genes flanked by highly homologous sugar pathway genes
or other shared genes. A similar situation has been found
in several groups of Streptococcus pneumoniae gene clusters for capsules with related structures (Bentley et al.,
2006; Mavroidi et al., 2007) and in Yersinia pseudotuberculosis O-antigen gene clusters (Cunneen et al., 2009; De
Castro et al., 2009, 2010). In contrast, the gene clusters
for Salmonella GlcNAc-/GalNAc-initiated O antigens are
highly diverse and possess no cassette structure. There are
only three sets of related O-antigen gene clusters.
(1) Salmonella O11 and C1. The last three genes (manC,
manB, and wzx) at the 3′ end of their gene clusters are in
the same order and share obvious DNA identity (63% for
manC, 93% for manB, and 97% for wzx; Fig. 2a). The
O-antigen structures of Salmonella O11 and C1 are not
related except for having mannose as a constituent sugar,
and the other genes of their gene clusters are quite different. It is likely that a recombination event has occurred
between the O-antigen gene clusters of Salmonella O11
and C1. The DNA identity level of manC is much lower
than that of the manB and wzx genes, and we propose
that one of the recombination sites is located in the
manB gene. It is surprising that an almost identical Wzx
protein is responsible for translocation of O antigens with
such different structures.
(2) Salmonella O13 and O43. The first seven genes and
last two genes of the Salmonella O13 and O43 antigen
[Fig. 2 graphic not reproduced: (a) Salmonella O11 and C1; (b) Salmonella O43 and O13; (c) Salmonella O47, O48, and O61. Each panel shows the aligned gene clusters with per-gene DNA and protein identity (%) and the O-unit structures annotated with the proposed glycosyltransferases.]
Fig. 2. Comparisons of related Salmonella O-antigen gene clusters. For color coding key, see Fig. 1. The proposed functions of
glycosyltransferases are shown.
gene clusters have the same order and significant DNA
identity, and the structures are also related (Fig. 2b). Both
structures are also found in E. coli, and the relationships
between the four structures and gene clusters are discussed below.
(3) Salmonella O47, O48, and O61. The last eight genes in
the O-antigen gene clusters of the three O serogroups have
the same order and share 63–100% DNA identity (Fig. 2c).
All three structures contain the L-FucpNAm-(α1→3)-D-GlcpNAc disaccharide fragment. Four of the eight genes,
fnlA, fnlB, fnlC, and wbuX, are involved in the synthesis of
L-FucpNAm, and wbuB is proposed to be the L-FucNAm
transferase gene. The role of the other three genes is not
clear as there are no other shared structural elements.
Although the O-antigen structures of Salmonella O6,14
and O18 are identical apart from the Wzy polymerization
linkage (Table 1), the genes in their gene clusters share
no similarity, except for manC (59% identity) and wzy
(49% identity). It is interesting that the wzy genes are
among those with higher levels of identity given the
different polymerization linkage.
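The identity percentages quoted throughout this section are derived from pairwise sequence alignments. As an illustration only, percent identity between two already-aligned sequences might be computed as sketched here. The function name `percent_identity` is hypothetical, and the convention of counting only columns without gaps is an assumption; alignment tools differ on this point, and the review does not state which convention was applied.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
print(round(percent_identity("ATGCAT", "ATGAAT"), 1))
```

The same function works for aligned protein sequences, which is how the DNA and protein identity levels reported for a gene pair can differ.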
The sugar synthesis genes in O-antigen gene clusters,
such as those for L-Rha and D-Man, are often highly
conserved and easily identified.
Among the Salmonella GlcNAc-/GalNAc-initiated
O antigens discussed in this section, L-Rha is present in 7
O antigens. RmlB, RmlD, RmlA, and RmlC catalyze the
four-step synthesis of dTDP-L-Rha, and the genes are
usually located at the 5′ end of the O-antigen gene clusters of Salmonella and E. coli with the above conserved
gene order. The sequence comparisons show that in Salmonella, the 5′ end of the rml gene set, comprising rmlB,
rmlD, and most of rmlA, has many characteristics of
housekeeping genes and is in general subspecies specific
(data not shown). In contrast, the 3′ end, including part
of rmlA and all of rmlC, is much more variable, and the
variation at this end is clearly O antigen- and not subspecies-related. This is consistent with a previous report
(Li & Reeves, 2000) based on a much smaller number of
serotypes. It was suggested in that study that this was
because rmlC and the 3′ end of rmlA are commonly
transferred between subspecies with the glycosyltransferase
and O-antigen processing genes that determine O-antigen
specificity and are generally in the central region of the
gene cluster. The 5′ end of the rml gene set was proposed
to gain its subspecies-specific sequence in this process, as
these genes remain in the species as new gene clusters
arrive and others die out. The additional data fully
support those conclusions.
Where the rmlB and rmlA genes are involved in the
synthesis of sugars other than Rha, the full gene set may
also be located at the 5′ end with rmlB and rmlA as the
first two genes, as in the O-antigen gene clusters of
Salmonella O28ab, O39, O55, O58, and O60. Only in Salmonella O56 and O63 are rmlB and rmlA found elsewhere
in the gene cluster (Fig. 1).
D-Man is present in 10 GlcNAc-/GalNAc-initiated Salmonella O antigens. GDP-D-Man is synthesized from fructose-6-phosphate by ManA, ManB, and ManC, but only
the manB and manC genes are generally present in the
gene cluster as ManA is also involved in use of exogenous
mannose as a carbon source, and the gene is not associated with the O-antigen gene clusters (Neidhardt et al.,
1987). ManB and ManC are also involved in the synthesis
of GDP-Col, GDP-L-Fuc, and GDP-D-Rha4NAc (GDP-PerNAc),
so a total of 16 gene clusters for GlcNAc-/GalNAc-initiated Salmonella O antigens contain manB and manC
genes. Colanic acid (CA), which is widely present in Salmonella, contains L-Fuc, and the manB and manC genes
required for production of GDP-Fuc are located within
the CA gene cluster (Aoyama et al., 1994). The CA gene
cluster is unusual in having generally a high GC content,
and remarkably, most manB genes for GlcNAc-/GalNAc-initiated Salmonella O antigens (including those in O60
and O65 that are reported in this review) share high level
identity (93–99%) to the CA manB gene (Jensen & Reeves,
2001). The only exceptions are manB genes of O6,14 and
O35. Furthermore, those CA-like manB genes display
obvious subspecies specificity, and the CA manB genes
and the CA-like manB genes in each strain appear to be
evolving in concert via gene conversion events (Jensen &
Reeves, 2001). These events appear to be unidirectional, as
no manB gene with low GC content has been found in a
CA gene cluster. It should be noted that the manB genes
from the O-antigen gene clusters of the 8 Gal-initiated serogroups are closely related and not CA-like (Jensen &
Reeves, 2001). In contrast to manB, with the exception of
O11 and O41, the Salmonella manC genes are not CA-like
even in gene clusters with the whole of the L-Fuc pathway.
To assess the diversity of Wzx, Wzy, and the glycosyltransferases involved in the synthesis of the 37 Wzx/
Wzy pathway GlcNAc-/GalNAc-initiated O antigens, we
used the TribeMCL program (Enright et al., 2002) with a
cutoff of 1e-50 to assemble each group of proteins into
homology groups (HG). 36 Wzy proteins (the Salmonella
O66 gene cluster contains no wzy gene) and 37 Wzx proteins were assembled into 35 and 23 HG, respectively.
There is enormous diversity as the average amino acid
identity levels between the Wzy or Wzx HG are under
15%.
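The idea of assembling proteins into homology groups from pairwise similarity can be sketched in a few lines. The example is a simplified stand-in and not TribeMCL itself: TribeMCL applies Markov clustering to a similarity matrix, whereas this sketch merely joins any two proteins connected by a hit at or below the E-value cutoff (single linkage). The protein names and E-values are hypothetical.

```python
from collections import defaultdict


def homology_groups(proteins, hits, cutoff=1e-50):
    """Group proteins linked by pairwise hits with E-value <= cutoff.

    Single-linkage grouping via union-find; a simplification, not Markov clustering.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the group representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, evalue in hits:
        if evalue <= cutoff:
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(list)
    for p in proteins:
        groups[find(p)].append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical proteins and pairwise E-values, for illustration only.
proteins = ["Wzx_A", "Wzx_B", "Wzx_C", "Wzx_D"]
hits = [
    ("Wzx_A", "Wzx_B", 1e-80),  # passes the cutoff
    ("Wzx_B", "Wzx_C", 1e-20),  # too weak, ignored
    ("Wzx_C", "Wzx_D", 1e-60),  # passes the cutoff
]
groups = homology_groups(proteins, hits)
print(groups)
```

With a cutoff this stringent, only closely related proteins are grouped, which is consistent with most Wzy proteins falling into a homology group of their own.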
In contrast, Wzy and Wzx proteins for the 8 Gal-initiated Salmonella O antigens were assembled into 4 and 3
HG, respectively, with mostly similar low levels of identity
between HG as found for Salmonella GlcNAc-/GalNAc-initiated O antigens. However, the higher proportion of
gene clusters with a shared HG for Wzx or Wzy reflects a
higher level of relatedness among gene clusters for Gal-initiated O antigens. The data also further demonstrate
the different patterns of diversity in the gene clusters for
Salmonella GlcNAc-/GalNAc-initiated and Gal-initiated O antigens.
The 127 glycosyltransferases from the 37 Wzx/Wzy pathway O antigens were assembled into 91 HG (Table S3), of
which 20 contain 2–6 members. The functions of 64 of
these glycosyltransferases can be predicted based on correlations between the presence of a glycosyltransferase with a
specific protein sequence and a shared or similar structural
element in the corresponding O antigens (Fig. S1).
In some cases, glycosyltransferases belonging to the
same HG were proposed to have the same function. For
instance, the 6 glycosyltransferases in HG-GT-1 share
41–99% identity in pairwise comparisons. Among these,
WfbG in Salmonella O43 was proposed to be responsible
for the synthesis of a D-GalNAc-(α1→3)-D-GlcNAc linkage. When structural data were taken into consideration,
5 of the 6 HG-GT-1 glycosyltransferases were proposed to
have the same function and named WfbG accordingly.
The only exception is WbdH in Salmonella O35, which is
proposed to be responsible for the formation of a D-Gal-(α1→3)-D-GlcNAc linkage.
Low proportion of anomalies in gene
clusters for Salmonella GlcNAc-/GalNAc-initiated O antigens
Anomalies in the O-antigen gene clusters usually indicate
a recent genetic event that may have been involved in the
formation of the O-antigen form, perhaps related to
adaptive modifications of bacteria in newly occupied
niches (Liu et al., 2008). Twelve such anomalies belonging to five categories (mobile elements, noncoding regions,
genes in the reverse orientation, genes in an unusual location,
and gene remnants) are found in the 37 Salmonella GlcNAc-/GalNAc-initiated O-antigen gene clusters. Previous
studies found 17 such anomalies in the 33 Shigella
O-antigen gene clusters, and 49 anomalies present in 148
E. coli O-antigen gene clusters. The proportion of anomalies in Salmonella O-antigen gene clusters is very similar
to that in E. coli O-antigen gene clusters and much lower
than that in Shigella. This suggests that it is Shigella that
is atypical, which is consistent with it having diverged
relatively recently and adopting a new niche.
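The comparison in the preceding paragraph can be reproduced from the counts given there. The short calculation below assumes that "proportion" means anomalies per gene cluster (a single cluster may carry more than one anomaly); the variable names are illustrative.

```python
# (number of anomalies, number of O-antigen gene clusters), as stated in the text
counts = {
    "Salmonella (GlcNAc-/GalNAc-initiated)": (12, 37),
    "Shigella": (17, 33),
    "E. coli": (49, 148),
}

# Assumption: the proportion is taken as anomalies per gene cluster.
rates = {name: anomalies / clusters for name, (anomalies, clusters) in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} anomalies per gene cluster")
# Salmonella (GlcNAc-/GalNAc-initiated): 0.32 anomalies per gene cluster
# Shigella: 0.52 anomalies per gene cluster
# E. coli: 0.33 anomalies per gene cluster
```

On this measure, Salmonella (0.32) and E. coli (0.33) are nearly identical, while Shigella (0.52) is clearly higher.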
Mobile elements
Several insertion sequences and H-repeat elements were
found in Shigella strains and were often associated with
inferred gene cluster rearrangements. However, for the
gene clusters of Salmonella GlcNAc-/GalNAc-initiated O
antigens, only one mobile element is found, an H-repeat
insertion that is 78% identical to the RhsB H-repeat of
E. coli K12 (Zhao et al., 1993), and is located between gne
and gnd in the O-antigen gene cluster of Salmonella O51.
This is the only major difference between the O-antigen
gene clusters of Salmonella O51 and E. coli O23 that
encode the same O antigen (Perepelov et al., 2011c), indicating that this H-repeat unit inserted into the O-antigen
gene cluster of Salmonella O51 after the divergence of
Salmonella and E. coli. Because the H-repeat unit in
Salmonella O51 is intact, it is likely that the insertion
occurred recently.
Noncoding regions
The gaps between genes in O-antigen gene clusters are
often very short, suggesting that translational coupling is
occurring, but larger gaps can arise during restructure of
a gene cluster (for instance, the incorporation or deletion
of genes). In Salmonella serogroups A, B, and D1, for
example, the functional wzy genes responsible for the
polymerization of O units are found outside the O-antigen
gene cluster (Naide et al., 1965; Curd et al., 1998), and a
remnant wzy gene is present in the large gap upstream of
the wbaO gene where the wzy gene is found in groups E
and D2. Noncoding regions also are found in gene clusters
for four of the Salmonella GlcNAc-/GalNAc-initiated O
antigens.
(1) In the O-antigen gene cluster of Salmonella O66,
there is no wzy gene (Fig. 1), and there is also an 874-bp
noncoding region between weiA and weiB in the gene
cluster (Liu et al., 2010a). However, no remnant of a wzy
gene can be found in this region by sequence homology
search. A wzy remnant can be difficult to find by BLAST
search because of the high divergence levels in wzy genes
and the degradation of remnant sequences by deletions,
which can fragment an open reading frame and/or change
the reading frame. In Salmonella serogroups A, B, and D1
discussed above, the wzy remnants were not found until
the ancestral wzy gene of group D3 was sequenced, which
provided a closely related homologue. The Salmonella wzy
genes are highly divergent, and if none are in the same
HG as the lost O66 wzy gene, then a remnant may well
not be detectable by BLAST but have to await sequencing
of a near relative. Because the Salmonella O66 type strain
can produce normal LPS, it is highly likely that it also
has a functional wzy gene for its β1→2 linkage outside
the O-antigen gene cluster.
(2) In the O-antigen gene cluster of Salmonella O40,
there is a remnant gnu gene between gne and wzx
(Fig. 1). Gnu is responsible for the formation of UndPP–
GalNAc from UndPP–GlcNAc for GalNAc-initiated O
antigens (Rush et al., 2010), and the remnant suggests
that the ancestral gene cluster coded for a GalNAc-initiated O antigen, although in E. coli and Salmonella, there
is often a gnu gene upstream of galF rather than in the
gene cluster. There is a 671-bp noncoding region between
the gnu remnant and wzx, with no good hits in a BLAST
search. Salmonella O40 has two GalNAc residues and no
main-chain GlcNAc residue, indicating that both gne and
gnu are required for the O-antigen synthesis. We suggest
that there is a gnu gene upstream of galF in Salmonella
O40, which is responsible for the synthesis of UndPP–
GalNAc and replaces the function of the now degraded
gnu gene in the gene cluster.
(3) A 570-bp noncoding region with no good hits in a
BLAST search is located between wekM and wzy in the Salmonella O42 antigen gene cluster. Salmonella O42 has the
same O-antigen structure as E. coli O1 (Table 1), and their
gene clusters also have the same organization. However,
the 570-bp noncoding region is not found in the O-antigen
gene cluster of E. coli O1. This noncoding region marks a
boundary between different levels of DNA identity between
the two gene clusters (Fig. 3a), being 55–81% for the seven
upstream genes (rmlB–wekM), but only about 40% identity
for the three downstream genes (wzy, wekN, wbdH) to the
corresponding genes in E. coli O1, with no obvious protein
identity for the gene products. The first seven genes in the
two gene clusters presumably have a common ancestor,
while the other three may have different origins. It is likely
that the presence of the 570-bp noncoding region is related
to the incorporation of wzy, wekN, and wbdH in Salmonella
O42, suggesting that the ancestral gene cluster was like that
in E. coli.
(4) Two noncoding regions are found in the O-antigen
gene cluster of Salmonella O53. One is located upstream
(positions 9867–10961) of gne, and the other (positions
11991–13054), downstream of gne (upstream of gnd). It
is likely that these two noncoding regions are related to
the incorporation of the gne gene into the O-antigen gene
cluster, implying an ancestor without the GalNAc residue
currently present.
Genes in the reverse orientation
All but two of the genes in the Salmonella O-antigen gene
clusters are transcribed from galF to gnd, the exceptions
being qdtC in Salmonella O28ac and gne in Salmonella
O21, which are transcribed in the opposite direction.
(1) qdtC is located at the 3′ end of the Salmonella O28ac
antigen gene cluster. QdtC is involved in the biosynthesis
of dTDP-Qui3NAc, together with RmlA, RmlB, QdtA,
and QdtB (Pfostl et al., 2008). QdtC is an acetyltransferase for the final step in synthesis of dTDP-Qui3NAc, and
it is likely that qdtC was added to the O28ac gene cluster
[Fig. 3 graphic not reproduced: (a) SO42 and EO1; (b) SO55 and EO103; (c) SO66 and EO166; (d) SO43 and EO86; (e) SO13 and EO127; (f) EO127 and EO86; (g) SO30 and EO157. Each panel shows the aligned gene clusters with per-gene DNA and protein identity (%) and the O-unit structures annotated with the proposed glycosyltransferases.]
Fig. 3. Examples of O antigens with gene clusters and structures that are the same or are related in Salmonella and Escherichia coli. For color
coding key, see Fig. 1. S, Salmonella. E, E. coli. The proposed functions of glycosyltransferases are shown.
relatively recently and that the ancestor had Qui3N in
place of Qui3NAc. The E. coli O71 antigen gene cluster
has the same organization as that of Salmonella O28ac,
including the orientation of the qdtC gene, and the main
chains of the two polysaccharides are the same (Hu et al.,
2010). Thus, unlike other anomalies discussed in this section, it is likely that the qdtC gene in Salmonella O28ac
and E. coli O71 was present in the common ancestor and
is not an indication of recent change in our definition.
(2) The gne gene in the O-antigen gene cluster of Salmonella O21 is located between wdaK and wdaL and
transcribed in the opposite direction. This is an unusual
location for a gene with this orientation as all previously
described genes transcribed in the opposite direction in
the O-antigen gene clusters of Salmonella and Shigella are
located after the 3′ end of the normally transcribed genes.
The O21 gne gene is needed to synthesize the GalNAc residue, which is the last sugar in the structure, and would
perhaps be replaced by a GlcNAc residue in its absence.
The orientation of this gne gene creates a need for two
additional promoters (one for wdaL), but there is no evidence to indicate that the gne gene is a recent addition,
especially because a promoter upstream of gne could not
be identified based on an in silico search.
rml genes in unusual location
As discussed above, the rmlB, rmlD, rmlA, and rmlC
genes for the four-step synthesis of dTDP-L-Rha are
usually located at the 5′ end of the E. coli and Salmonella
O-antigen gene clusters, with the conserved gene order as
above. In some cases, the rmlC gene is separated from
other rml genes. Seven Salmonella GlcNAc-/GalNAc-initiated O antigens contain L-Rha, and two of the rmlC genes
are found in unusual locations.
(1) and (2) In Salmonella O28ac and O53, rmlC is
located 1 gene and 5 genes, respectively, downstream of
the rmlBDA genes.
(3) In Salmonella O56, the rmlA and rmlB genes are
involved in the synthesis of dTDP-Qui4N, and as there is
no L-Rha moiety, the rmlC and rmlD genes are not
required. However, while the O56 rmlB gene is at the 5′
end of the gene cluster, the rmlA gene is located at the other
end of the gene cluster, 10 genes downstream of rmlB.
(4) In the O-antigen gene cluster of Salmonella O63, the
rmlB and rmlA genes, which are involved in the synthesis
of dTDP-D-Fuc3NAc, are not located at the 5′ end of the
O-antigen gene cluster, but downstream of weiD.
Remnant genes
The O antigen of Salmonella O50 contains D-Gal, D-GalNAc, D-GlcNAc, and Col. The synthesis of GDP-Col
requires the products of 5 nonhousekeeping sugar synthesis genes: manB, manC, gmd, colA, and colB (Fig. 4). The
manB, manC, and gmd genes are in the O-antigen gene
cluster, while colA and colB are downstream of gnd, suggesting that they are a recent addition not fully incorporated into the gene cluster. There is a remnant of an fcl
gene between gmd and gmm in the Salmonella O50 gene
cluster, indicating that before the acquisition of colA and
colB, the ancestral gene cluster coded for synthesis of
GDP-L-Fuc. The colA and colB genes in E. coli O55, which
has the same O antigen as Salmonella O50, are also
located downstream of gnd. However, there is no fcl gene
remnant in the O-antigen gene cluster of E. coli O55, presumably due to more extensive deletions than in Salmonella O50, which have occurred since the acquisition of
the colA and colB genes. The location of colA and colB
downstream of gnd, together with the remnant fcl gene in
the gene cluster, could suggest that colA and colB are a
recent addition not yet fully incorporated into the gene cluster. However, the presence of these genes at the same location in E. coli suggests that, in this case, the arrangement has survived
for a long time and that consolidation of genes into the
gene cluster can be a very slow process.
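The gene locations described above for GDP-Col synthesis in Salmonella O50 can be restated as a small lookup. The Python sketch below is illustrative only: the dictionary layout and the helper name genes_outside_cluster are assumptions made for this example, and the locations are simply those given in the text.

```python
# Genes required for GDP-Col synthesis in Salmonella O50, in the order
# listed in the text, with the location stated there for each gene.
# The data layout is an illustrative assumption, not part of the review.
IN_CLUSTER = "O-antigen gene cluster"

GDP_COL_GENES = {
    "manB": IN_CLUSTER,
    "manC": IN_CLUSTER,
    "gmd": IN_CLUSTER,
    "colA": "downstream of gnd",
    "colB": "downstream of gnd",
}


def genes_outside_cluster(locations):
    """Return the genes whose stated location is not the O-antigen gene cluster."""
    return [gene for gene, where in locations.items() if where != IN_CLUSTER]


print(genes_outside_cluster(GDP_COL_GENES))  # ['colA', 'colB']
```

The same lookup could be filled in for E. coli O55, where colA and colB are likewise reported downstream of gnd.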
Biosynthetic pathways of monosaccharides
Twenty-one different sugars were found in the Salmonella
GlcNAc-/GalNAc-initiated O antigens (Table S1). Fourteen of them are also present in Shigella O antigens, and
their proposed or characterized biosynthetic pathways
have been reviewed (Liu et al., 2008). The pathways for
the other 7 sugars (Col, L-FucNAm, D-Qui3NAc, D-Fuc3NAc/D-FucNFo, D-Rha4NAc (D-PerNAc), Neu5Ac, and
8eLeg5RHb7Ac) and ribitol are shown in Fig. 4.
The biosynthetic pathway for 8eLeg5RHb7Ac, a component of the Salmonella O61 O antigen, is proposed for the first time in
this review (Fig. 4) and requires biochemical confirmation.
A similar derivative of 8-epilegionaminic acid, 8eLeg5Ac7Ac, has been found in the O antigens of E. coli O61 and
O108, and a biosynthetic pathway involving 7 enzymes
(Elg1–Elg7) has also been proposed (Perepelov et al., 2010c).
Orf1–Orf6 and Orf8 in the Salmonella O61 antigen gene
cluster share 51–84% identity with Elg1–Elg7, respectively,
and may have the corresponding functions. Therefore, it is
likely that the pathway for 8eLeg5RHb7Ac is similar to that
proposed for 8eLeg5Ac7Ac, and that orf1–orf6 and orf8 are
responsible for the synthesis of 8eLeg5RHb7Ac in Salmonella O61 (Fig. 4). Based on the structural difference,
we propose that the substrates of the corresponding enzymes in the two
pathways carry different acyl groups at N5. The biosynthesis of 8eLeg5Ac7Ac is initiated from UDP-GlcNAc,
and we propose that the biosynthesis of 8eLeg5RHb7Ac is initiated from
UDP-GlcNRHb, which is probably synthesized from UDP-GlcNAc. However, the expected genes for the synthesis of UDP-GlcNRHb
are not found in the O-antigen gene cluster of Salmonella
O61 and may be located elsewhere in the genome. orf1–orf6
and orf8 in Salmonella O61 were named elb1–elb7,
respectively.
[Fig. 4 diagram not reproduced. Pathways shown: GDP-L-colitose (ManA, ManB, ManC, Gmd, ColA, ColB); GDP-D-Rha4NAc (PerA, PerB); CDP-ribitol (Rib); dTDP-D-Fuc3NAc (RmlA, RmlB, FdtA, FdtB, FdtC); dTDP-D-Qui3NAc (RmlA, RmlB, QdtA, QdtB, QdtC); CDP-Abe, CDP-Par and CDP-Tyv (DdhA–DdhD, Abe, Prt, Tyv); UDP-L-FucNAm (FnlA–FnlC, WbuX); CMP-NeuNAc (NnaA–NnaC); CMP-8eLeg5RHb7Ac (Elb1–Elb7).]
Fig. 4. Biosynthetic pathways for the sugars in Salmonella O antigens. The pathways for the sugars that are also present in Shigella O antigens
(Liu et al., 2008) are not included. Putative pathways are denoted by a broken line. In the CDP-3,6-dideoxyhexose pathway, [I] indicates a
4-pyridoxamine 6-deoxy-Δ3,4-glucoseen intermediate (Johnson & Liu, 1998). ManA, phosphomannose isomerase; ManB, phosphomannomutase;
ManC, mannose-1-phosphate guanylyltransferase (Samuel & Reeves, 2003); Gmd, GDP-mannose 4,6-dehydratase (Somoza et al., 2000;
Kneidinger et al., 2001); ColA, GDP-4-keto-6-deoxy-D-mannose 3-dehydrase (Alam et al., 2004); ColB, GDP-colitose synthase (Alam et al., 2004);
PerA, GDP-perosamine synthetase (Zhao et al., 2007; Albermann & Beuttler, 2008); PerB, GDP-perosamine N-acetyltransferase (Albermann &
Beuttler, 2008); Rib, ribulose 5-phosphate reductase/CDP-ribitol pyrophosphorylase (Follens et al., 1999); RmlA, glucose-1-phosphate
thymidylyltransferase (Zuccotti et al., 2001); RmlB, dTDP-D-glucose 4,6-dehydratase (Allard et al., 2001); FdtA, dTDP-6-deoxy-hex-4-ulose
isomerase; FdtB, dTDP-6-deoxy-D-xylo-hex-3-ulose aminase; FdtC, dTDP-D-Fuc3N acetylase (Pfoestl et al., 2003); QdtA, dTDP-4-oxo-6-deoxy-D-glucose 3,4-oxoisomerase; QdtB, dTDP-3-oxo-6-deoxy-D-glucose aminase; QdtC, dTDP-D-Qui3N acetylase (Pfostl et al., 2008); DdhA, glucose-1-phosphate cytidylyltransferase; DdhB, CDP-glucose 4,6-dehydratase; DdhC, CDP-4-keto-6-deoxy-D-glucose 3-dehydrase; DdhD, CDP-6-deoxy-Δ3,4-glucoseen reductase (Johnson & Liu, 1998; Samuel & Reeves, 2003); Abe, CDP-abequose synthase (Hallis et al., 1998); Prt, CDP-paratose
synthase (Hallis et al., 1998); Tyv, CDP-Par 2-epimerase (Koropatkin et al., 2003); FnlA, 4,6-dehydratase/5-epimerase; FnlB, 3-epimerase/reductase; FnlC, C2 epimerase (Mulrooney et al., 2005); WbuX, aminotransferase (King et al., 2008); NnaA, GlcNAc-2-epimerase; NnaB, Neu5Ac
condensing enzyme; NnaC, CMP-Neu5Ac synthetase (Annunziato et al., 1995); Esb1, C6 dehydratase/C5 epimerase; Esb2, aminotransferase;
Esb3, acetyltransferase; Esb4, nucleotidase; Esb5, condensase; Esb6, cytidylyltransferase. ᵃThe enzyme is encoded by a gene that is not
located in the O-antigen gene cluster.
A close relationship between the O antigens of Salmonella and E. coli
Until recently, there were only four confirmed cases in
which the O antigens are identical in the two species: Salmonella O35 and E. coli O111, Salmonella O50 and E. coli
O55, Salmonella O30 and E. coli O157, and Salmonella
O62 and E. coli O35 (Rundlof et al., 1998; Samuel et al.,
2004) (Table 1). We now find that 24 O antigens are
identical or nearly identical between Salmonella and E. coli,
a much higher number than previously thought. All of
the shared O antigens are GlcNAc-/GalNAc-initiated. The
data are summarized in Table 2, and some interesting
examples are described in detail below.
It is worth noting that in addition to Salmonella O30,
O35, O50, and O62, there are 11 Salmonella O antigens
that cross-react serologically with one or more E. coli
O antigens (Orskov et al., 1977), and 7 of them (Salmonella
O6,14, O11, O17, O38, O42, O43, and O51) were shown
in this study to have structures and gene clusters that are
identical or closely related to an E. coli O antigen. However, the remaining four Salmonella O antigens are not
obviously related structurally or genetically to the respective E. coli O antigen.
Salmonella O52 was found to have the same O-antigen
structure as that of E. coli O153 (Ratnayake et al., 1994).
However, there are no genes shared by the two gene clusters, which is obviously different from other pairs of Salmonella and E. coli O antigens that are identical or closely
related. The sources for the sequences of the 15 Salmonella
GlcNAc-/GalNAc-initiated O antigens that are not related
to E. coli O antigens are summarized in Table S4.
Salmonella O6,14 and E. coli O77 group
It is well known that most S. flexneri serotypes share a
common O-antigen backbone and differ only in the distribution of four possible Glc side-branch residues and an
O-acetyl moiety, which are all attached by enzymes
encoded by prophage genes (Allison & Verma, 2000), or
differ in the presence of a plasmid-encoded phosphoethanolamine modification (Sun et al., 2012; Knirel et al.,
2013). There is a similar group of O-antigen structures in
E. coli, comprising E. coli O77, O17, O44, O73, and O106,
which have been given serogroup status, and Salmonella
O6,14 is a single representative in Salmonella with a related
structure (Wang et al., 2007). These strains also share a
common four-sugar backbone O-unit structure and differ
by the addition of one or two Glc side branches at various
positions of the backbone (the only exception is the E. coli
O77 O antigen that does not have any side-chain modification). Their O-antigen gene clusters contain the same genes
in the same order and express proteins required for the biosynthesis of the common four-sugar backbone. The O-antigen gene clusters of the E. coli O77 group share > 99%
identity to each other and 70–76% identity to that of
Salmonella O6,14, suggesting that this O-antigen backbone
was in the common ancestor. In S. flexneri, the side-branch
Glc residues are added from UDP-Glc in a three-step
process involving GtrA and GtrB, which are common to all such residues, and a side-branch-specific transferase. The three genes
are always present as a set on a
prophage genome in the chromosome, and most probably
the E. coli O77-related strains and Salmonella O6,14 gained
their specific side-branch modifications by acquiring
similar prophages carrying different gtr gene sets.
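Identity values such as those quoted above are obtained by comparing aligned sequences. The method used to compute them is not described here, so the following Python sketch shows only one common convention: identical positions divided by the aligned positions in which neither sequence has a gap. The function name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    This is one common convention; other definitions use different denominators.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences, invented for illustration only.
print(percent_identity("ATGCATGC", "ATGAATGC"))  # 87.5
print(percent_identity("ATG-CA", "ATGGCT"))      # 80.0
```

Values computed with a different denominator (for example, the full alignment length including gaps) would differ, so identities from different studies are comparable only when the convention is the same.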
Salmonella O55 and E. coli O103
The O antigens of Salmonella O55 and E. coli O103 have
similar pentasaccharide O units that differ in only one
sugar (Glc vs. GlcNAc) and in the acyl group on Fuc3N
(Ac group vs. Hb group; Table 1).
The DNA sequence identity in corresponding genes
ranges from 53% to 76% (Fig. 3b), the only exception
being the two acyltransferase genes (fdtC encoding an
acetyltransferase in Salmonella O55 and fdhC encoding a
3-hydroxybutanoyltransferase in E. coli O103), which
share no similarity and are responsible for the structural
difference between dTDP-D-Fuc3NHb and dTDP-D-Fuc3NAc. We suggest that one of the two gene clusters
acquired a new gene (acetyltransferase or 3-hydroxybutanoyltransferase gene) after species divergence (Liu et al.,
2010c), but there is no indication as to which was the original gene in the ancestor. There must also be a difference
in the specificity of the second glycosyltransferase to
account for the difference in the third sugar, as precursors for
both Glc and GlcNAc are generally available.
Salmonella O66 and E. coli O166
The O-antigen structures of Salmonella O66 and E. coli
O166 differ only in the linkage between O units and the
presence of an O-acetyl moiety in the former. The O-antigen gene clusters of Salmonella O66 and E. coli O166
have nearly identical organizations, the only exception
being that the wzy gene in E. coli O166 is replaced by a
noncoding region in Salmonella O66 (Fig. 3c) (Liu et al.,
2010a). It is proposed that a functional wzy gene outside
the O-antigen gene cluster is involved in the synthesis of
the O antigen of Salmonella O66, similar to what is found
in Salmonella serogroups A, B, and D1 (Naide et al.,
1965; Curd et al., 1998). The ancestral gene cluster of Salmonella O66 presumably had the wzy gene found in
E. coli O166 between weiA and weiB, which would no
longer be required after the bacteria gained the new wzy
gene. The noncoding region in Salmonella O66 could be
a remnant of a gene, but we found no region of similarity
with the E. coli O166 wzy gene, probably owing to the
substantial degradation observed between weiA and weiB.
Salmonella O43-E. coli O86 and Salmonella O13-E. coli O127
The four O antigens have similar four-sugar main chains
that vary mainly in the first sugar, which is GlcNAc in Salmonella and GalNAc in E. coli. In addition, Salmonella O43 and
E. coli O86 have a side-branch Gal that is lacking in the
others. Thus, there is a pair of related Salmonella O antigens, each of which has
a related E. coli O antigen, and all four are treated
together here.
Four of the genes (gmd-manC) of the Salmonella O13
and O43 antigen gene clusters have the same order and
are 93–99% identical, as are the same genes in E. coli
O127 and O86 (Figs 2b and 3d–f). In comparisons
between the species, these genes are 74–84% identical.
This is as expected for genes that were present in the
common ancestor and diverged as the species diverged,
with the genes in the two gene clusters undergoing frequent recombination within each species so that they
evolved in concert (Samuel et al., 2004). The manB gene
immediately downstream of manC is similar, but has
rather more divergence than the gmd-manC genes due to
having a CA gene cluster form of manB in the Salmonella
and E. coli strains.
The other genes show quite complex patterns including
high levels of divergence as discussed below.
The choice of first sugar is determined when the second sugar, a GalNAc residue, is added to either UndPP–
GalNAc or UndPP–GlcNAc by glycosyltransferases WcmA
or WfbG, respectively (Yi et al., 2005). The wcmA and
wfbG genes are the second genes in their gene clusters, after the
gne gene that is required for synthesis of the UDP-GalNAc substrate. The E. coli strains also need a gnu
gene for synthesis of UndPP–GalNAc. The two genes
in Salmonella are again highly similar (98–99% identity)
as are the two in E. coli (99–100% identity). However,
the gne genes are only 60–63% identical in comparisons
between the species, and the two glycosyltransferase genes,
wcmA and wfbG, are not related at all (no more than
30% identity). It appears that this end of the gene cluster
was replaced in one of the species causing the first sugar
to be replaced.
At the 3′ end are genes related to the addition of the
side-branch Gal in Salmonella O43 and E. coli O86, and
the corresponding glycosyltransferase wcmB gene is found
only in those strains, where it is located between the wzx
and wzy genes. The wdbR gene in the same location in
E. coli O127 only is proposed to be an acetyltransferase
gene based on sequence homology and may be responsible for addition of one of the O-acetyl groups to the Fuc
residue in E. coli O127. The genes for addition of the
main-chain Gal residue and addition of the Fuc residue
to it are very different in the 2 structural forms. The
main-chain Gal residue carries the Gal side branch in Salmonella O43 and E. coli O86, so this may account for the
difference between wfbI and wcmD, as if the side-branch
Gal is added first, it would affect the target sugar for the
Fuc transferases. However, the explanation for the difference between wfbI and wcmD genes is not so simple, as
they are responsible for the same linkage, although the
first sugars of the molecule at this stage are different. All
these genes, including wzx and wzy, are highly divergent,
and only for wcmB does it seem likely that the various
forms have diverged from the gene cluster in the common ancestor of the two species. Perhaps there have been
gene replacements since species divergence, or perhaps
the situation in the common ancestor was more complex
than just having the two forms seen today and included
the sequence diversity now observed.
Salmonella O30 and E. coli O157
Salmonella O30 and E. coli O157 have the same O-antigen structure that contains one residue each of D-Rha4NAc (N-acetyl-D-perosamine, D-PerNAc), D-Glc, L-Fuc,
and D-GalNAc. The O-antigen gene cluster of Salmonella
O30 is nearly identical to that of E. coli O157, the only
difference being that Salmonella O30 lacks the acetyltransferase gene perB, which is located at the 3′ end of the
E. coli O157 antigen gene cluster and is involved in the
synthesis of D-Rha4NAc (Albermann & Beuttler, 2008)
(Fig. 3g). An H-repeat remnant is located upstream of
the perB gene in E. coli O157. It is likely that the acquisition of the E. coli O157 perB gene was mediated by the
H-repeat element and occurred more recently. An acetyltransferase gene whose product converts GDP-D-Rha4N to GDP-D-Rha4NAc may be located elsewhere in the Salmonella O30
genome.
Two special Salmonella GlcNAc-/GalNAc-initiated O-antigen forms (O54 and O67)
The O antigen of Salmonella O54 is different from all other
reported bacterial O antigens in being synthesized by the
synthase pathway. Salmonella O54 has a disaccharide O
unit, →4)-D-ManpNAc-(β1→3)-D-ManpNAc-(β1→, and is
thus a homopolymer. The gene cluster responsible for the
synthesis of the Salmonella O54 O antigen resides on a
small mobilizable plasmid (Keenleyside & Whitefield,
1996), and mobilization of this plasmid into strains with a
functional chromosomal O-antigen gene cluster can lead to
the simultaneous expression of two distinct O antigens.
However, in the Salmonella O54 type strain, this is not the
case due to inactivation of the chromosomal O-antigen
locus, but the strains for O54 serovars include some with
group B, C1, C2, E, or 21(L) epitopes (Fitzgerald et al.,
2007), suggesting that the plasmid is quite mobile in nature. The Salmonella O54 antigen gene cluster contains
mnaA, wbbE, and wbbF. MnaA is a C2 epimerase that converts UDP-GlcNAc to UDP-ManNAc (Campbell et al.,
2000). WbbE transfers the first ManNAc residue from UDP-ManNAc to UndPP–GlcNAc,
which is synthesized by WecA, to complete an
adaptor. WbbF, an integral membrane protein, is responsible for both the sequential addition of ManNAc and the
concurrent extrusion of the nascent polymer across the
cytoplasmic membrane (Keenleyside & Whitefield, 1996).
Salmonella O67 has previously been suggested to be a
variant of serogroup B (O4) (Li & Reeves, 2000). Indeed,
a molecular typing study based on O-antigen gene cluster
probes found that the serogroup O4 antigen-specific gene
does not distinguish strains of serogroups O4 and O67
(Fitzgerald et al., 2007). In this study, we sequenced the
region between galF and gnd in Salmonella O67 and
found it to be the same as for the serogroup O4 antigen
gene cluster (with identity ranging from 99% to 100% for
corresponding genes). However, the structural analysis
revealed that the O67 antigen structure is similar to that
of D-galactan I O antigen of K. pneumoniae (Table 1), the
only difference being the presence of an O-acetyl group
in Salmonella O67, which is consistent with the fact that
there is no cross-reaction between serogroup O4 and O67
antigens and their respective antisera. The data show that
the O67 gene cluster is not located between galF and gnd,
but elsewhere in the genome.
The gene cluster responsible for the synthesis of
D-galactan I in K. pneumoniae O1 has been identified
downstream of gnd and consists of six genes, comprising
wzm, wzt, wbbM, glf, wbbN, and wbbO (Clarke & Whitfield, 1992). wzm and wzt encode components of an ABC
transporter for export of the O polysaccharide, and glf
encodes a UDP-galactopyranose mutase, for conversion
of UDP-Galp to UDP-Galf. In the ABC transporter pathway, O-antigen synthesis begins with the formation in
the cytoplasm of a chain of O units on an acceptor UndPP–GlcNAc, which is synthesized by WecA. Galactan I
synthesis has been studied by the Whitfield group and
summarized in a recent review (Greenfield & Whitfield,
2012). WbbO was shown to be a bifunctional glycosyltransferase adding the first two-sugar repeat unit (Galp
and Galf) to the UndPP–GlcNAc acceptor, forming the
adaptor region of the O polysaccharide. Further extension
of galactan I requires WbbM, a Galp
transferase (Guan et al., 2001). WbbN is thought to be
the Galf transferase for galactan I extension, although
WbbO can replace WbbN as the Galf transferase in vitro.
However, no genes with the potential for the synthesis
of D-galactan I can be found downstream of gnd in
Salmonella O67.
To identify the O-antigen gene cluster, we obtained a
draft genome of the Salmonella O67 type strain using
Solexa sequencing. A contig was found containing eight
genes related to the synthesis of the O67 antigen. orf1-orf6
of that contig are identified as wzm, wzt, wbbM, glf, wbbN,
and wbbO by homology with the genes of K. pneumoniae
O1 (92%, 95%, 76%, 85%, 64%, and 79% identity, respectively) and account for the synthesis of D-galactan I. orf7
and orf8 were named wejU and wejV, respectively. wejU
appears to be a glycosyltransferase gene, but its exact function is unclear. WejV shares similarity with many acyltransferases, and we propose that it is responsible for transfer of
the O-acetyl group to the Salmonella O67 antigen.
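The "respectively" pairing of open reading frames, K. pneumoniae O1 homologs, and identity values can be made explicit as follows. This Python sketch only restates the values given in the text; the variable and function names are assumptions chosen for illustration.

```python
# Values as stated in the text for the Salmonella O67 contig.
ORFS = ["orf1", "orf2", "orf3", "orf4", "orf5", "orf6"]
HOMOLOGS = ["wzm", "wzt", "wbbM", "glf", "wbbN", "wbbO"]
IDENTITY_PERCENT = [92, 95, 76, 85, 64, 79]


def build_annotation(orfs, homologs, identities):
    """Pair each ORF with its homolog and identity, checking the lists line up."""
    if not (len(orfs) == len(homologs) == len(identities)):
        raise ValueError("lists must have the same length")
    return {orf: (gene, ident)
            for orf, gene, ident in zip(orfs, homologs, identities)}


annotation = build_annotation(ORFS, HOMOLOGS, IDENTITY_PERCENT)

# Least and most conserved homologs by the stated identity.
least = min(annotation.items(), key=lambda item: item[1][1])
most = max(annotation.items(), key=lambda item: item[1][1])
print(least)  # ('orf5', ('wbbN', 64))
print(most)   # ('orf2', ('wzt', 95))
```

Laid out this way, the stated values show that the two transporter genes (wzm and wzt) have the highest identities and wbbN the lowest.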
We found that K. oxytoca 10–5246 has the same gene
cluster (downstream of gnd) as that of Salmonella O67
including wejU and wejV. Some of the Klebsiella strains
also have galactan O antigens, and we suggest that the
O67 gene cluster was derived from a Klebsiella strain with
this gene cluster. Currently, the genomic locus of the Salmonella O67 antigen gene cluster is unclear. It should be
noted that the wzm-wejU set of genes is also present in
the genome of E. coli SMS-3-5, with many insertion elements present upstream and downstream of that region.
It is likely that the Salmonella O67 strain under study
arose from a serogroup B strain by gaining a new gene
cluster for D-galactan I that originally came from Klebsiella and was incorporated into the chromosome at an
unidentified locus, and that this was followed by repression of the original O-antigen gene cluster
by means that are also unidentified. It remains to be seen
whether all O67 isolates have similar genetics, but serogroup O67 has only one serovar, named ‘Cresswell’, and
isolates are extremely rare.
Structure and genetics of Salmonella Gal-initiated O antigens with close relatedness
There is a set of O antigens in Salmonella that have Gal
as first sugar of the O unit, comprising serogroups O2
(A), O4 (B), O8 (C2-C3), O9 (D1), O9,46 (D2), O9,46,27
(D3), O3,10 (E1-E3), and O1,3,19 (E4). These O antigens
also have many other similarities (Table 3). Except for
serogroup C2-C3, they possess a main chain having a
D-Manp-(1→4)-L-Rhap-(α1→3)-D-Galp trisaccharide repeat
unit and may differ in (1) the configuration (α vs. β) and
the position of the polymerization linkage (α1→2 vs.
α1→6); and (2) the configuration (α vs. β) of the D-Manp-(1→4)-L-Rhap linkage. In serogroup C2-C3, the main
chain is built up of L-Rhap-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→3)-D-Galp tetrasaccharide repeats. The major
differences between serogroups are defined by the presence
or absence and the identity (Abe, Tyv, or Par) of the side-branch 3,6-dideoxyhexose residue, and additional structural diversity is achieved by lateral glucosylation and/or
O-acetylation, which in most cases are nonstoichiometric.
The other defining feature of Gal-initiated Salmonella
O antigens is that they have the wbaP gene in the gene
cluster for the initial transferase that transfers Gal-P from
UDP–Gal to UndP to generate UndPP–Gal.
It is near universal in the Enterobacteriaceae for the
first sugar to be GlcNAc or GalNAc, with WecA as the
initial transferase. The Gal-initiated O antigens are a major
exception, and it seems that the use of Gal as the initial sugar
arose in Salmonella since its divergence from E. coli. However, although there are only 8 Gal-initiated O antigens and
they are found almost exclusively in subspecies I and II,
they have nonetheless been very successful and dominate
the isolation lists. The only exception to the presence in
subspecies I and II for the serovar type strains is the C2-C3
O antigen strain, which is in subspecies IIIb.
Table 3. Structures of Salmonella Gal-initiated O antigens [adapted from the recent review (Knirel, 2011)]
| Salmonella (S) serogroup, serovar | O-antigen structure*: main chain | O-antigen structure*: side-branch substituents (linkage) |
| SO2 (A) Paratyphi | →2)-D-Manp-(α1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Parp-(α1→3); Ac-(→2); D-Glcp-(α1→4) |
| SO4 (B) Typhimurium, Agona,† Abortus equi* | →2)-D-Manp-(α1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Abep2Ac-(α1→3); D-Glcp-(α1→4) |
| SO4 (B) Bredeney, Typhimurium SL3622† | →2)-D-Manp-(α1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Abep2Ac-(α1→3); D-Glcp-(α1→6) |
| SO8 (C2) Newport | →4)-L-Rhap-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→3)-D-Galp-(β1→ | Abep-(α1→3); D-Glcp2Ac-(α1→3); Ac-(→2) |
| SO8 (C3) Kentucky I.S. 98 | →4)-L-Rhap-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→3)-D-Galp-(β1→ | Abep-(α1→3); D-Glcp2Ac-(α1→4) |
| SO8 (C3) Kentucky 98/39 | →4)-L-Rhap-(β1→2)-D-Manp-(α1→2)-D-Manp-(α1→3)-D-Galp-(β1→ | Abep-(α1→3); D-Glcp-(α1→2) |
| SO9 (D1) Typhi, Enteritidis SE6,† Gallinarum bv. Pullorum 77† | →2)-D-Manp-(α1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Tyvp-(α1→3); D-Glcp2Ac-(α1→4) |
| SO9 (D1) Enteritidis I.S. 64, Gallinarum bv. Pullorum 11 | →2)-D-Manp-(α1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Tyvp-(α1→3) |
| SO9,46 (D2) Strasbourg | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Tyvp-(α1→3); D-Glcp-(α1→4) |
| SO9,46 (D2) II | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Tyvp-(α1→3) |
| SO9,46,27 (D3) II | →6)-D-Manp-(α/β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Tyvp-(α1→3); D-Glcp-(α1→6) |
| SO3,10 (E1) Anatum | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | Ac-(→6) |
| SO3,10 (E1) Muenster | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | D-Glcp-(α1→4) |
| SO3,10 (E2) Anatum var. 15+ | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(β1→ | none |
| SO3,10 (E3) Lexington var. 15+,34+ | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(β1→ | D-Glcp-(α1→4) |
| SO1,3,19 (E4) Senftenberg | →6)-D-Manp-(β1→4)-L-Rhap-(α1→3)-D-Galp-(α1→ | D-Glcp-(α1→6) |
*Abe, abequose (3,6-dideoxy-D-xylo-hexose); Par, paratose (3,6-dideoxy-D-ribo-hexose); Tyv, tyvelose (3,6-dideoxy-D-arabino-hexose). Nonstoichiometric substituents are italicized.
†The O antigen lacks O-acetylation.
There is a modular structure for this set of O-antigen
gene clusters, with several genes, especially sugar synthesis
genes, being shared by different gene clusters in conserved
locations as shown in Fig. 5. The rml genes are at the 5′
end of each gene cluster as for many GlcNAc-/GalNAc-initiated O antigens, followed by four ddh genes (ddhD,
ddhA, ddhB, and ddhC), which are responsible for the
synthesis of CDP-4-keto-3,6-dideoxy-D-glucose (Samuel &
Reeves, 2003), the precursor of CDP-Abe, CDP-Par, and
CDP-Tyv (Fig. 4). The abe gene or prt plus tyv genes for
completing the synthesis of CDP-Abe and CDP-Tyv,
respectively (Fig. 4), are located just downstream of the
ddh genes. However, the serogroup E O antigen does not
contain a dideoxyhexose residue, and the gene cluster
does not have the relevant genes. The manBC and wbaP
genes are located at the 3′ end of all of these Gal-initiated
O-antigen gene clusters.
The major differences between the gene clusters are
found in their central regions, which contain the diverse
glycosyltransferase genes and the O-unit processing genes
(wzx and wzy). In addition, the gene orf17.4, of unclear
function, was found downstream of wbaP in some group
E strains.
[Fig. 5 gene maps not reproduced. Maps are shown for serogroups A, B, D1, D2, D3, E, and C2 (scale bar, 1 kb). Genes drawn include rmlB, rmlD, rmlA, rmlC, ddhD, ddhA, ddhB, ddhC, abe, prt, tyv, wzx, wzy, wbaV, wbaU, wbaN, wbaO, wbaR, wbaL, wbaQ, wbaW, wbaZ, manC, manB, wbaP, and orf17.4, together with wzy remnants.]
Fig. 5. The O-antigen gene clusters of Salmonella Gal-initiated O antigens. For color coding key, see Fig. 1. Group B was the first O-antigen
gene cluster to be described (Nikaido et al., 1967; Jiang et al., 1991). The O-antigen gene cluster of group A was extracted from the published
genome CP00026. Other gene clusters were studied by restriction enzyme mapping to locate regions shared with the O-antigen gene cluster of
group B, and only the unique regions were sequenced (Verma & Reeves, 1989; Liu et al., 1991; Brown et al., 1992; Wang et al., 1992; Xiang et al.,
1994; Curd et al., 1998). *The tyv gene in the O-antigen gene cluster of group A is nonfunctional due to a frameshift mutation.
Group B was the first O-antigen gene cluster to be
described (Nikaido et al., 1967; Jiang et al., 1991) as it is
present in the strain LT2 that was used in many early
studies in bacterial genetics. It has Abe as its dideoxyhexose and the four ddh genes plus the abe gene. The difference between the O-antigen structures of the D1 and B O
antigens is the presence of a Tyv side-branch sugar in D1
in place of Abe. The D1 gene cluster has prt and tyv genes
in place of abe, which accounts for the structural difference. The only difference between the O-antigen structures of serogroups A and D1 is the presence of a Par or
Tyv side branch, respectively, and the gene clusters are
near identical, with prt and tyv genes both present. However, the tyv gene is not functional in group A, and in
serovar Paratyphi A at least, this has been shown to be
due to a frameshift mutation near the start of the gene,
which would prevent conversion of CDP-Par to CDP-Tyv
(Verma et al., 1988). As mentioned above, the wzy genes
of groups A, B, and D1 are not located within the O-antigen gene cluster, but at a locus named rfc. However, there
are wzy remnants in their gene clusters.
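The relationship between gene content and the dideoxyhexose side branch described above can be summarized as a simple decision rule. The Python sketch below is a simplification based only on the statements in this section (the four ddh genes supply the common precursor, abe gives Abe, prt gives Par, and prt plus a functional tyv gives Tyv); the function name and the way nonfunctional genes are passed in are assumptions for illustration.

```python
DDH_GENES = frozenset({"ddhA", "ddhB", "ddhC", "ddhD"})


def predicted_dideoxyhexose(genes, nonfunctional=()):
    """Predict the side-branch 3,6-dideoxyhexose from gene content.

    Returns "Abe", "Tyv", "Par", or None (no dideoxyhexose expected).
    Assumption: abe is checked first; no cluster described in this
    section carries abe together with prt or tyv.
    """
    working = set(genes) - set(nonfunctional)
    if not DDH_GENES <= working:
        return None  # the common CDP-sugar precursor cannot be made
    if "abe" in working:
        return "Abe"
    if "prt" in working and "tyv" in working:
        return "Tyv"  # Prt makes CDP-Par, Tyv converts it to CDP-Tyv
    if "prt" in working:
        return "Par"
    return None


group_b = DDH_GENES | {"abe"}
group_d1 = DDH_GENES | {"prt", "tyv"}

print(predicted_dideoxyhexose(group_b))                          # Abe
print(predicted_dideoxyhexose(group_d1))                         # Tyv
print(predicted_dideoxyhexose(group_d1, nonfunctional={"tyv"}))  # Par (as in group A)
print(predicted_dideoxyhexose(set()))                            # None (as in group E)
```

The group A case shows why gene presence alone is not sufficient: the tyv gene is present but nonfunctional, so the prediction depends on knowing which genes are intact.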
The group D2 structure differs from the group D1 structure in the polymerization linkage and the configuration
(α vs. β) of the D-Manp-(1→4)-L-Rhap linkage. It has been
suggested that the O-antigen gene cluster of serogroup D2
has arisen by reassortment of the serogroup D1 and E gene
clusters by recombination mediated by an H-repeat
element (Xiang et al., 1994).
For serogroup D3, there are two forms of the O unit
that differ only in the configuration of the linkage
between Man and Rha (α1→4 and β1→4). wbaU is
responsible for the formation of the D-Man-(α1→4)-L-Rha linkage (Curd et al., 1998), but there is no glycosyltransferase gene in the O-antigen gene cluster for the
D-Man-(β1→4)-L-Rha linkage; such a gene may be located
elsewhere in the genome. The O-antigen gene cluster of
serogroup D1 is also thought to have arisen from that of
D3 by the loss of the original wzy gene (Curd et al., 1998).
Group E was initially subdivided into groups E1, E2, E3,
and E4, based on serology. Serogroups E1, E2, and E3 have
been amalgamated as serogroup E1 (the 2007 Weill summary) on the basis that they have almost the same O-antigen gene clusters between galF and gnd. Recently, we found
that serogroup E4 also has the same O-antigen gene cluster
as E1 and so is not really a separate serogroup. The variation in the Glc side chain among serogroup E O antigens is
presumably due to the presence of different bacteriophages
with side-chain modification genes (Table 3).
Serogroup C2-C3 has a different order for the sugars
and one more Man residue in the O unit. All of the genes
in the central region of the gene cluster are unique to
serogroup C2-C3. The C3 form was originally put in a
separate serogroup C3, but differs only in having a Glc
side branch that is due to a bacteriophage-encoded set of
genes as is common in Salmonella.
Conclusions
This review covers the chemical structure and DNA
sequence data for all Salmonella O antigens, including
more recent work on the GlcNAc-/GalNAc-initiated
Salmonella O antigens that are more directly comparable
with those of E. coli and Shigella. Together with the previously published survey of Shigella O antigens, it provides insights into the evolution of O-antigen diversity in
bacteria. It also documents the relationships between the
O antigens of Salmonella and E. coli, which were underestimated before.
In our previous review (Liu et al., 2008), we had
observed that Shigella has a higher than usual proportion
of anomalies in its O-antigen gene clusters (17 anomalies
in 33 O-antigen gene clusters), many of which are
thought to be indicators of events that mediated the formation of new O-antigen forms, such as remnants of
genes no longer required or elements that mediated gain
of new genes. However, only 12 anomalies are found in
the 37 gene clusters of the Salmonella GlcNAc-/GalNAc-initiated O antigens with the Wzx/Wzy pathway (excluding Salmonella O54 and O67), a much lower proportion than in
Shigella. The smaller number of anomalies indicates that,
for this major group of Salmonella O antigens, the structure of the gene clusters has generally been stable, which suggests that the set of O antigens is well adapted to the
Salmonella niche.
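As a quick arithmetic check on the comparison above, the two anomaly counts can be put on a common scale. The sketch below uses only the counts quoted in the text; the function name is illustrative, and the ratio is simply anomalies per gene cluster (a single cluster may in principle carry more than one anomaly).

```python
# Anomalies per O-antigen gene cluster, using the counts quoted in the text:
# 17 anomalies in 33 Shigella gene clusters, and 12 anomalies in the 37
# Salmonella GlcNAc-/GalNAc-initiated gene clusters with the Wzx/Wzy pathway.

def anomalies_per_cluster(anomalies: int, clusters: int) -> float:
    """Return the number of anomalies divided by the number of gene clusters."""
    if clusters <= 0:
        raise ValueError("clusters must be a positive integer")
    return anomalies / clusters

shigella = anomalies_per_cluster(17, 33)
salmonella = anomalies_per_cluster(12, 37)

print(f"Shigella:   {shigella:.2f} anomalies per gene cluster")
print(f"Salmonella: {salmonella:.2f} anomalies per gene cluster")
```

On these counts the Shigella rate is about 0.52 per gene cluster and the Salmonella rate about 0.32.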
In the previous review (Liu et al., 2008), we also found
that 21 of 34 Shigella O antigens are either identical or
closely related to an O antigen in E. coli, which is easily
explained because all Shigella serotypes except S. boydii type
13 are in fact part of the species E. coli. Homologous
recombination occurs readily and was shown to be an
essential mechanism in the diversification of Shigella
O antigens.
Previous structural analysis of Salmonella and E. coli
O antigens had revealed only a few shared structures,
although early serological data had shown extensive cross-reactions (Orskov et al., 1977). However, in our recent
studies, we found many more cases, giving a total of 24 O
antigens that are identical or closely related in Salmonella
and E. coli. The most likely explanation for the observed
similarities in the two species is that each pair of gene
clusters originated from a gene cluster that was present in
the most recent common ancestor. In that case, the two
gene clusters would have a similar organization. E. coli
and Salmonella diverged about 140 million years ago, and
93% of E. coli and Salmonella housekeeping genes have
levels of identity between 76.3% and 100% (Sharp, 1991).
For 23 of the 24 O-antigen gene clusters that encode
identical or closely related O antigens in E. coli and
Salmonella, the average identity for corresponding genes is
73.5%, and the average identity for corresponding
proteins is 73.7%. This is close to the lower end of the
range for housekeeping genes, but the pattern is generally
similar for all of the gene clusters. However, genes in these
O-antigen gene clusters do diverge at a higher rate than
housekeeping genes, suggesting that the O-antigen genes
are under consistent selection pressure from the environments or hosts for better adaptation. This is not unexpected as they have an atypical GC content, suggesting an
origin in another species, and they may still be adjusting
to the new intracellular environment.
In these pairs of O-antigen gene clusters in Salmonella
and E. coli, the identity levels for sugar synthesis genes,
glycosyltransferase genes, and O-unit processing genes are
different. The average level of identity is 77.1%, 70.3%,
and 69.4%, respectively, for the three classes of genes and
81.4%, 67.1%, and 65.6%, respectively, for the proteins
encoded by these genes. The divergence levels for glycosyltransferase genes and O-unit processing genes are
consistently higher than those for sugar synthesis genes,
a pattern observed in almost all pairs of gene clusters. It
appears that each pair has indeed diverged from a gene
cluster that was in the common ancestor, but that the
three classes of genes were subject to different selection
pressures, although there is no experimental evidence for
that.
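The class-by-class figures above can be laid out as a small table of divergence values (100 minus percent identity). The sketch below uses only the averages quoted in the text; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Average identity (%) between paired Salmonella and E. coli O-antigen gene
# clusters, by gene class, as quoted in the text (DNA level and protein level).
IDENTITY = {
    "sugar synthesis":     {"dna": 77.1, "protein": 81.4},
    "glycosyltransferase": {"dna": 70.3, "protein": 67.1},
    "O-unit processing":   {"dna": 69.4, "protein": 65.6},
}

def divergence(percent_identity: float) -> float:
    """Divergence expressed as 100 minus percent identity, to one decimal."""
    return round(100.0 - percent_identity, 1)

DIVERGENCE = {
    gene_class: {level: divergence(value) for level, value in levels.items()}
    for gene_class, levels in IDENTITY.items()
}

for gene_class, levels in DIVERGENCE.items():
    print(f"{gene_class:<20} DNA {levels['dna']:>5.1f}%   protein {levels['protein']:>5.1f}%")

# Gene class with the lowest divergence at each level.
least_dna = min(DIVERGENCE, key=lambda c: DIVERGENCE[c]["dna"])
least_protein = min(DIVERGENCE, key=lambda c: DIVERGENCE[c]["protein"])
```

With these averages, sugar synthesis genes are the least diverged class at both the DNA and the protein level.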
Alternative explanations for the shared gene clusters
are the following: (1) The gene clusters were recently
transferred from one species to the other after species
divergence. In this case, the two gene clusters should have
a higher level of sequence identity, not related to the level
for housekeeping genes. (2) They have a common origin
but were acquired independently. In this case, we expect
similar gene order, as is observed, but can make no predictions on level of divergence as it will depend on the
time since the divergence of the donor species, which
could be earlier or later than divergence of the E. coli and
Salmonella species. (3) The two gene clusters were assembled independently either after species divergence or
before being acquired by the E. coli and Salmonella lineages. In this case, the gene organization of the individual
gene clusters should be different, so is not supported at
all for the 23 pairs being discussed.
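The reasoning behind these alternatives can be summarized as a simple decision sketch. The function below is a hypothetical illustration of the stated expectations only: its name, the three identity categories, and the assumption that a recently transferred gene cluster keeps its gene order are introduced here and are not part of the original analysis.

```python
from typing import List

IDENTITY_LEVELS = ("higher", "comparable", "lower")

def compatible_explanations(similar_gene_order: bool,
                            identity_vs_housekeeping: str) -> List[str]:
    """List the explanations compatible with one pair of gene clusters.

    similar_gene_order: whether the two gene clusters share their organization.
    identity_vs_housekeeping: sequence identity of the pair relative to the
        level seen for housekeeping genes ("higher", "comparable" or "lower").
    """
    if identity_vs_housekeeping not in IDENTITY_LEVELS:
        raise ValueError(f"expected one of {IDENTITY_LEVELS}")

    # (3) Independent assembly predicts a different gene organization.
    if not similar_gene_order:
        return ["(3) independent assembly"]

    # (2) Common origin with independent acquisition predicts similar gene
    # order but makes no prediction about the level of divergence.
    explanations = ["(2) common origin, independent acquisition"]

    if identity_vs_housekeeping == "comparable":
        # Descent from the common ancestor predicts similar organization and
        # divergence in line with housekeeping genes.
        explanations.insert(0, "descent from the common ancestor")
    elif identity_vs_housekeeping == "higher":
        # (1) Recent transfer predicts higher identity; similar gene order is
        # assumed here because the same cluster would have been transferred.
        explanations.insert(0, "(1) recent transfer between species")
    return explanations

print(compatible_explanations(True, "comparable"))
print(compatible_explanations(False, "lower"))
```

A pair with similar gene order and housekeeping-like divergence is compatible with descent from the common ancestor and with option (2), whereas a pair with a different gene order is compatible only with option (3).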
None of these alternative explanations fits the data as
a sole explanation. Options (1) and (2) remain
possible for individual pairs, although if either were at all common one would expect
some pairs with a much higher or lower level of divergence. This conclusion is in agreement with one proposed
earlier with data for just three structures (Samuel et al.,
2004), but now has much stronger support. It is of course
possible that some of the 23 gene clusters did arrive independently but happen to be close to the others in divergence, but if so, we suggest that this was a minority of
them.
The exception is the case of Salmonella O52 and E. coli
O153, in which gene clusters that are not related generate
the same O-antigen structure. Each gene cluster has the
expected number of glycosyltransferase genes and a wzx
and wzy gene, but the gene order is different, and none of the genes share
significant levels of identity. This is presumably a case of
two gene clusters for a given structure that were assembled independently.
Some of the gene clusters shared by Salmonella and
E. coli did evolve to generate new O-antigen forms by
acquisition of new genes after species divergence as
described above. The Salmonella O66 gene cluster is
thought to have obtained a new wzy gene outside the
O-antigen gene cluster that is responsible for the β-1→2
linkage between the O units. The original wzy gene for
the β-1→3 linkage in the O-antigen gene cluster must
have degraded over time as proposed for Salmonella serogroup B (Wang et al., 2002a) as it was no longer required
in O-antigen synthesis. For the Salmonella O55-E. coli
O103 pair, one of the two gene clusters must have
acquired a new gene (an acetyltransferase gene or a 3-hydroxybutanoyltransferase gene) after species divergence to
synthesize a different sugar derivative. For the related
Salmonella O43–E. coli O86 and Salmonella O13–E. coli
O127 pairs, there were significant evolutionary changes,
but it is not yet possible to unravel what happened. The
Salmonella O6,14 and the E. coli O77 groups are interesting as, like the group of related S. flexneri serogroups,
diversity arises by acquisition (or loss) of different prophage genes for side-chain modification: only one form
has been observed in Salmonella, but 5 in E. coli.
It should be noted that within some serogroups, there
are also variant strains with O-antigen structures and
gene clusters different from those of the type strains. For
example, the O-antigen structure of one Salmonella O50
strain was reported to differ from that of the type strain
in having a GlcNAc in place of a GalNAc residue
(Senchenkova et al., 1997). Also Fitzgerald et al. found
that the O-antigen-based molecular typing method they
devised for Salmonella O13 cannot detect O13 strains
belonging to subspecies IIIb or S. bongori (Fitzgerald
et al., 2007). The genetic basis for this difference remains
to be determined. In addition, there is more than one
O-antigen structure for some other Salmonella O
serogroups, usually obtained to determine the basis of
serological variation, and most variations are in the side
branches, as in serogroups O6,7, O6,14, and O30
(Table 1). These variations are probably due to the presence in the chromosome of different bacteriophage
genomes that include O-antigen side-branch modification
genes.
As a genus with a long evolutionary history, the mechanism for the generation and maintenance of O-antigen
diversity in Salmonella is obviously different from that in
Shigella (Liu et al., 2008), which is essentially a relatively
small group of strains in another species (E. coli) that are
distinguished by a capacity for host cell invasion that
may have only recently been adopted in the species
(Maurelli et al., 1998; Pallen & Wren, 2007). One of the
major characteristics of Salmonella is that the O antigens
can be divided into two different classes (Gal-initiated
class and GlcNAc-/GalNAc-initiated class). The GlcNAc-/
GalNAc-initiated Salmonella O antigens that we have just
been discussing are similar to those in other members
of Enterobacteriaceae in using the WecA initial sugar
transferase encoded in the ECA gene cluster that is widely
distributed in the family. Over half of the GlcNAc-/
GalNAc-initiated O antigens are also found in the closest
relative E. coli and all but one of these are thought to
have been present in their common ancestor.
The Gal-initiated O antigens have a quite different
evolutionary history and are thought to have entered the
species quite recently but, although only 8 in number, are
now dominant. We do not know the reasons for this
enormous difference between E. coli and Salmonella: Gal-initiated O antigens
predominate over the GlcNAc-/GalNAc-initiated O antigens in Salmonella but have,
to our knowledge, not been reported in E. coli.
Most Salmonella strains that cause serious infection in
humans and animals have Gal-initiated O antigens. However, it is worth noting that the E. coli members of several
Salmonella–E. coli serogroup pairs with identical or related
O antigens, including E. coli O157, O55, O111, O145,
O103, O118, and O78, are associated with important
pathogenic E. coli strains. The long history of these O
antigens in E. coli and Salmonella indicates that they are
possibly adaptive in both species, but most Salmonella
members are not recognized to be particularly pathogenic.
O-antigen diversity has been thought to be important
in offering the various clones selective advantages in their
specific niches. It has been estimated that a selective
advantage of only 0.1% for one O antigen over another
in a given niche is sufficient to maintain different alleles
in different clones (Reeves, 1992), although it is difficult
to demonstrate this in a laboratory assay. The O antigen
is a target of the host innate immune system. It is recognized by the Toll-like receptor 4 (Royle et al., 2003), and
it has been suggested that the pressures from the immune
system may contribute to O-antigen diversity. Novel O
antigens, especially those containing rare sugars, would
not be recognized by the immune system. An example of
the effect of a change in O antigen is given by Vibrio
cholerae O139, a variant of the 7th pandemic O1 clone
with a new O unit containing 2 Col residues and a QuiNAc residue (Knirel, 2011). It was first identified in 1992
in southern India and quickly spread in India and Bengal
and some other Asian countries, totally displacing the O1
serogroup (Ramamurthy et al., 2003). It had the capacity
to infect persons previously immune to the ancestral
V. cholerae O1 form of the pandemic strain (Blokesch &
Schoolnik, 2007), and this was thought to be the cause of
its success (Ramamurthy et al., 2003). After a few years, the
O139 form virtually disappeared, but there has been periodic switching between V. cholerae O1 and O139 strains
as agents of cholera in some areas (Faruque et al., 2003;
Chatterjee et al., 2007). The strains have also diversified in other
factors that affect the balance of the two antigenic
forms; however, the original rise of the O139 form showed
how powerful the selective pressure of O-antigen variation
can be.
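To give a feel for the scale of the 0.1% selective advantage mentioned above, the following sketch uses a textbook deterministic haploid selection model. This model is an assumption made here for illustration and is not the calculation behind the estimate of Reeves (1992); the function name and the starting and target frequencies are likewise hypothetical. Under the model, the odds p/(1 - p) of the favoured clone are multiplied by (1 + s) in each generation.

```python
import math

def generations_to_reach(p0: float, p_target: float, s: float) -> float:
    """Generations needed for a favoured clone to rise from frequency p0 to
    p_target when its relative fitness is 1 + s (deterministic haploid model,
    no drift, constant selection)."""
    if not (0.0 < p0 < 1.0 and 0.0 < p_target < 1.0):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    odds_start = p0 / (1.0 - p0)
    odds_end = p_target / (1.0 - p_target)
    # odds_end = odds_start * (1 + s) ** t, solved for t
    return math.log(odds_end / odds_start) / math.log(1.0 + s)

# Hypothetical scenario: a clone with a 0.1% advantage rising from 1% to 99%.
gens = generations_to_reach(0.01, 0.99, 0.001)
print(f"about {gens:.0f} generations")
```

In this idealized setting the shift takes on the order of nine thousand generations, which is short on an evolutionary timescale but far beyond what a typical laboratory assay can resolve.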
It should be noted that a relationship between O-antigen
form and host has been observed in several bacteria,
including Salmonella for which a host is commonly most
easily infected by strains bearing a specific O antigen
(Makela et al., 1973; Rabsch et al., 2002; Butela &
Lawrence, 2010). In addition, most bacteria cannot evade
an immune response by switching their O antigens within the
timescale of an infection, as is possible with H antigen phase variation. These data raise the possibility that the different O
antigens expressed by different strains may confer advantages in different ecological niches, such as different host
intestinal environments, which may be a major selection
pressure for the generation and maintenance of O-antigen
diversity (Butela & Lawrence, 2010). It has for instance
been shown that diversifying selection mediated by predation from intestinal amoebae can contribute to O-antigen
variation in Salmonella (Wildschutte et al., 2004;
Wildschutte & Lawrence, 2007). Intestinal amoebae recognize antigenically diverse Salmonella strains with different
efficiency, giving the various serotypes different ability to
escape predators in particular environments. O-antigen
variation is also helpful for bacteria in avoiding bacteriophage predation (Blokesch & Schoolnik, 2007). In addition, O-antigen diversity may provide a selective advantage
in other respects; for example, different O antigens may mediate more
effective adhesion to different intestinal mucins.
Serotyping has been very important for our understanding of diversity in Salmonella and is used to define the
serovars that are referred to in most discussions of
the genus. However, in recent years, several aspects of
traditional serotyping methods have limited the utility of
serotyping, especially in large-scale epidemiology studies.
The techniques can be laborious and time-consuming, and
the full range of sera needed is kept only in major typing
centers. In addition, based on the O-antigen structure data
obtained in this study, considerable serological cross-reaction is expected between E. coli and Salmonella. There
has often been discussion of developing a molecular typing
system for Salmonella based on the current serology
scheme, and the relevant concepts have also been discussed
and applied in other bacteria (Raymond et al., 2002;
Li et al., 2009). The completion of the sequencing of the
Salmonella O-antigen gene clusters provides the data for a
comprehensive typing scheme for Salmonella using
sequence diversity, but based on the serotyping scheme, to
give in effect molecular serotyping. To facilitate this, we
have included data on the specific genes that could be useful for this and also a comprehensive set of primers that we
have developed for a microarray targeting the O-antigen-specific genes that can differentiate most Salmonella serogroups (Table S5) (Guo et al., 2013). The only exceptions
are groups A and D1 that need to be further distinguished
from each other using conventional serotyping methods,
due to having near-identical O-antigen gene clusters. The
mutations in the tyv gene of group A strains are not enough
to easily distinguish groups A and D1, but the specific
frameshift in serovar Paratyphi A could probably be developed into a specific test for this serovar.
For most serogroups, the O-unit processing genes (wzx
and wzy) were selected as target genes, the exceptions
being serogroups A/D1, O54, and O67, for which the
sugar synthesis gene prt, the glycosyltransferase gene
wbbE, and acetyltransferase gene wejV, respectively, were
selected. For most serogroups, only primer pairs based on
their own specific genes can generate the specific PCR
products. However, due to the close relationship among
the Gal-initiated Salmonella O-antigen gene clusters, combinations of primer pairs targeting more than one gene
were necessary for detecting some of these serogroups
(Table S5). For instance, as the prt genes of groups A,
D1, D2, and D3 are highly similar, but not found in
other O-antigen gene clusters, prt was used in the identification of all these groups, with D2 and D3, for example,
further distinguished by their specific wzy genes. Our
molecular typing system can also accurately differentiate
Salmonella and E. coli strains with related O-antigen
structures.
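The target-gene choices described above can be sketched as a small lookup, together with the two-step call for the prt-positive groups. This is a hypothetical illustration only: the function names and the marker labels "wzy_D2" and "wzy_D3" are invented here, the wzx/wzy default is a simplification (some Gal-initiated serogroups need combinations of primer pairs), and the actual primers and targets are those listed in Table S5.

```python
from typing import Set, Tuple

# Target genes named in the text for serogroups that do not use wzx/wzy.
EXCEPTION_TARGETS = {
    "A/D1": "prt",   # sugar synthesis gene
    "O54": "wbbE",   # glycosyltransferase gene
    "O67": "wejV",   # acetyltransferase gene
}

def target_genes(serogroup: str) -> Tuple[str, ...]:
    """Return the target gene(s) for a serogroup: the named exception if there
    is one, otherwise the O-unit processing genes used for most serogroups."""
    if serogroup in EXCEPTION_TARGETS:
        return (EXCEPTION_TARGETS[serogroup],)
    return ("wzx", "wzy")

def call_prt_group(detected: Set[str]) -> str:
    """Resolve the prt-positive groups from a set of detected markers.

    "wzy_D2" and "wzy_D3" are hypothetical labels for the group-specific
    wzy targets of D2 and D3.
    """
    if "prt" not in detected:
        return "not A, D1, D2 or D3"
    if "wzy_D2" in detected:
        return "D2"
    if "wzy_D3" in detected:
        return "D3"
    # A and D1 have near-identical gene clusters and are not separated here.
    return "A or D1 (conventional serotyping needed)"

print(target_genes("O54"))
print(call_prt_group({"prt", "wzy_D2"}))
print(call_prt_group({"prt"}))
```

The final branch reflects the limitation noted in the text: groups A and D1 cannot be separated by these targets and still require conventional serotyping.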
Acknowledgements
This work was supported by the National Key Programs for
Infectious Diseases of China (2013ZX10004-216-001); the
National 973 Program of China Grant (2012CB721001,
2009CB522603); the National Natural Science Foundation
of China (NSFC) Key Program Grant 31030002; the NSFC
General Program Grant (81171524, 31270003); and the
Russian Foundation for Basic Research (projects 11-04-91173_NNSF-a and 11-04-01020-a). The authors have no
conflict of interest to declare.
References
Alam J, Beyer N & Liu HW (2004) Biosynthesis of colitose:
expression, purification, and mechanistic characterization of
GDP-4-keto-6-deoxy-D-mannose-3-dehydrase (ColD)
and GDP-L-colitose synthase (ColC). Biochemistry 43:
16450–16460.
Albermann C & Beuttler H (2008) Identification of the
GDP-N-acetyl-D-perosamine producing enzymes from
Escherichia coli O157:H7. FEBS Lett 582: 479–484.
Ali T, Weintraub A & Widmalm G (2007) Structural
determination of the O-antigenic polysaccharide from
Escherichia coli O166. Carbohydr Res 342: 274–278.
Allard ST, Giraud MF, Whitfield C et al. (2001) The crystal
structure of dTDP-D-glucose 4,6-dehydratase (RmlB) from
Salmonella enterica serovar Typhimurium, the second
enzyme in the dTDP-L-rhamnose pathway. J Mol Biol 307:
283–295.
Allison GE & Verma NK (2000) Serotype-converting
bacteriophages and O-antigen modification in Shigella
flexneri. Trends Microbiol 8: 17–23.
Andersson M, Carlin N, Leontein K et al. (1989) Structural
studies of the O-antigenic polysaccharide of Escherichia coli
O86, which possesses blood-group B activity. Carbohydr Res
185: 211–223.
Annunziato PW, Wright LF, Vann WF et al. (1995) Nucleotide
sequence and genetic analysis of the neuD and neuB genes
in region 2 of the polysialic acid gene cluster of Escherichia
coli K1. J Bacteriol 177: 312–319.
Aoyama KM, Haase AM & Reeves PR (1994) Evidence for
effect of random genetic drift on G+C content after lateral
transfer of fucose pathway genes to Escherichia coli K-12.
Mol Biol Evol 11: 829–838.
Bartelt M, Shashkov AS, Kochanowski H et al. (1993)
Structure of the O-specific polysaccharide of the O23
antigen (LPS) from Escherichia coli O23:K?:H16. Carbohydr
Res 248: 233–240.
Bengoechea JA, Najdenski H & Skurnik M (2004)
Lipopolysaccharide O antigen status of Yersinia enterocolitica
O:8 is essential for virulence and absence of O antigen
affects the expression of other Yersinia virulence factors. Mol
Microbiol 52: 451–469.
Bentley SD, Aanensen DM, Mavroidi A et al. (2006) Genetic
analysis of the capsular biosynthetic locus from all 90
pneumococcal serotypes. PLoS Genet 2: e31.
Beutin L, Wang Q, Naumann D et al. (2007) Relationship
between O-antigen subtypes, bacterial surface structures and
O-antigen gene clusters in Escherichia coli O123 strains
carrying genes for Shiga toxins and intimin. J Med Microbiol
56: 177–184.
Blokesch M & Schoolnik GK (2007) Serogroup conversion of
Vibrio cholerae in aquatic reservoirs. PLoS Pathog 3: e81.
Boyd EF, Nelson K, Wang F-S et al. (1994) Molecular
genetic basis of allelic polymorphism in malate
dehydrogenase (mdh) in natural populations of Escherichia
coli and Salmonella enterica. P Natl Acad Sci USA 91: 1280–
1284.
Brisson JR & Perry MB (1988) The structures of the two
lipopolysaccharide O-chains produced by Salmonella
boecker. Biochem Cell Biol 66: 1066–1077.
Bronner D, Clarke BR & Whitfield C (1994) Identification of
an ATP-binding cassette transport system required for
translocation of lipopolysaccharide O-antigen side-chains
across the cytoplasmic membrane of Klebsiella pneumoniae
serotype O1. Mol Microbiol 14: 505–519.
Brown PK, Romana LK & Reeves PR (1992) Molecular
analysis of the rfb gene cluster of Salmonella serovar
Muenchen (strain M67): genetic basis of the polymorphism
between groups C2 and B. Mol Microbiol 6: 1385–1394.
Bundle D, Gerken M & Perry M (1986) Two-dimensional
nuclear magnetic resonance at 500 MHz: the structural
elucidation of a Salmonella serogroup N polysaccharide
antigen. Can J Chem 64: 255–264.
Butela K & Lawrence J (2010) Population genetics of Salmonella:
selection for antigenic diversity. Bacterial Population Genetics
in Infectious Disease (Robinson DA, Falush D & Feil
EJ, eds), John Wiley & Sons, Inc., Hoboken, NJ.
Campbell RE, Mosimann SC, Tanner ME et al. (2000) The
structure of UDP-N-acetylglucosamine 2-epimerase reveals
homology to phosphoglycosyl transferases. Biochemistry 39:
14993–15001.
CDC (2009) Salmonella Surveillance: Annual Summary, 2009.
US Department of Health and Human Services, Atlanta,
GA.
Chatterjee S, Ghosh K, Raychoudhuri A et al. (2007)
Phenotypic and genotypic traits and epidemiological
implication of Vibrio cholerae O1 and O139 strains in India
during 2003. J Med Microbiol 56: 824–832.
Clark CG, Kropinski AM, Parolis H et al. (2009) Escherichia
coli O123 O-antigen genes and polysaccharide structure are
conserved in some Salmonella enterica serogroups. J Med
Microbiol 58: 884–894.
Clark CG, Grant CC, Trout-Yakel KM et al. (2010) The O28
antigen gene clusters of Salmonella enterica subsp. enterica
serovar Dakar and serovar Pomona are different. Int J
Microbiol 2010: 209291.
Clarke BR & Whitfield C (1992) Molecular cloning of the rfb
region of Klebsiella pneumoniae serotype O1:K20: the rfb
gene cluster is responsible for synthesis of the D-galactan I
O polysaccharide. J Bacteriol 174: 4614–4621.
Cunneen MM, De Castro C, Kenyon J et al. (2009) The
O-specific polysaccharide structure and biosynthetic gene
cluster of Yersinia pseudotuberculosis serotype O:11.
Carbohydr Res 344: 1533–1540.
Cunneen MM, Liu B, Wang L & Reeves PR (2013) Biosynthesis
of UDP-GlcNAc, UndPP-GlcNAc and UDP-GlcNAcA
involves three easily distinguished 4-epimerase enzymes,
Gne, Gnu and GnaB. PLoS One 8: e67646.
Curd H, Liu D & Reeves PR (1998) Relationships among the
O-antigen gene clusters of Salmonella enterica groups B, D1,
D2, and D3. J Bacteriol 180: 1002–1007.
Daniels C, Vindurampulle C & Morona R (1998) Overexpression
and topology of the Shigella flexneri O-antigen polymerase
(Rfc/Wzy). Mol Microbiol 28: 1211–1222.
De Castro C, Skurnik M, Molinaro A et al. (2009)
Characterization of the specific O-polysaccharide structure
and biosynthetic gene cluster of Yersinia pseudotuberculosis
serotype O:15. Innate Immun 15: 351–359.
De Castro C, Kenyon JJ, Cunneen MM et al. (2010) Genetic
characterisation and structural analysis of the O-specific
polysaccharide of Yersinia pseudotuberculosis serotype O:1c.
Innate Immun 17: 183–190.
Di Fabio JL, Brisson JR & Perry MB (1988a) Structure of the
major lipopolysaccharide antigenic O-chain produced by
Salmonella carrau (O:6, 14, 24). Carbohydr Res 179: 233–244.
Di Fabio JL, Perry MB & Brisson JR (1988b) Structure of the
antigenic O-polysaccharide of the lipopolysaccharide
produced by Salmonella eimsbuttel. Biochem Cell Biol 66:
107–115.
Di Fabio JL, Brisson JR & Perry MB (1989a) Structural
analysis of the three lipopolysaccharides produced by
Salmonella madelia (1,6,14,25). Biochem Cell Biol 67: 78–85.
Di Fabio JL, Brisson JR & Perry MB (1989b) Structure of the
lipopolysaccharide antigenic O-chain produced by
Salmonella livingstone (O:6,7). Biochem Cell Biol 67:
278–280.
Di Fabio JL, Brisson JR & Perry MB (1989c) Structure of the
lipopolysaccharide antigenic O-chain produced by
Salmonella ohio (O:6,7). Carbohydr Res 189: 161–168.
Doolittle RF, Feng DF, Tsang S et al. (1996) Determining
divergence times of the major kingdoms of living organisms
with a protein clock. Science 271: 470–477.
Duus J, Gotfredsen CH & Bock K (2000) Carbohydrate
structural determination by NMR spectroscopy: modern
methods and limitations. Chem Rev 100: 4589–4614.
Enright AJ, Van Dongen S & Ouzounis CA (2002) An efficient
algorithm for large-scale detection of protein families.
Nucleic Acids Res 30: 1575–1584.
Erbing C, Kenne L, Lindberg B et al. (1978) Structure of the
O-specific side-chains of the Escherichia coli O 75
lipopolysaccharide: a revision. Carbohydr Res 60: 259–265.
Faruque SM, Chowdhury N, Kamruzzaman M et al. (2003)
Reemergence of epidemic Vibrio cholerae O139, Bangladesh.
Emerg Infect Dis 9: 1116–1122.
Feng L, Han W, Wang Q et al. (2005a) Characterization of
Escherichia coli O86 O-antigen gene cluster and
identification of O86-specific genes. Vet Microbiol 106:
241–248.
Feng L, Senchenkova SN, Tao J et al. (2005b) Structural and
genetic characterization of enterohemorrhagic Escherichia
coli O145 O antigen and development of an O145
serogroup-specific PCR assay. J Bacteriol 187: 758–764.
Feng L, Reeves PR, Lan R et al. (2008) A recalibrated
molecular clock and independent origins for the cholera
pandemic clones. PLoS ONE 3: e4053.
Fitzgerald C, Sherwood R, Gheesling LL et al. (2003)
Molecular analysis of the rfb O-antigen gene cluster of
Salmonella enterica serogroup O:6,14 and development of a
serogroup-specific PCR assay. Appl Environ Microbiol 69:
6099–6105.
Fitzgerald C, Gheesling L, Collins M et al. (2006) Sequence
analysis of the rfb loci, encoding proteins involved in the
biosynthesis of the Salmonella enterica O17 and O18
antigens: serogroup-specific identification by PCR. Appl
Environ Microbiol 72: 7949–7953.
Fitzgerald C, Collins M, van Duyne S et al. (2007) Multiplex,
bead-based suspension array for molecular determination of
common Salmonella serogroups. J Clin Microbiol 45:
3323–3334.
Follens A, Veiga-da-Cunha M, Merckx R et al. (1999) acs1 of
Haemophilus influenzae type a capsulation locus region II
encodes a bifunctional ribulose 5-phosphate reductase-CDP-ribitol pyrophosphorylase. J Bacteriol 181: 2001–2007.
Fratamico PM, DebRoy C, Strobaugh TP Jr et al. (2005) DNA
sequence of the Escherichia coli O103 O-antigen gene cluster
and detection of enterohemorrhagic E. coli O103 by PCR
amplification of the wzx and wzy genes. Can J Microbiol 51:
515–522.
Gajdus J, Kaczynski Z, Smietana J et al. (2009) Structural
determination of the O-antigenic polysaccharide from
Salmonella Mara (O:39). Carbohydr Res 344: 1054–1057.
Gamian A, Jones C, Lipinski T et al. (2000) Structure of the
sialic acid-containing O-specific polysaccharide from
Salmonella enterica serovar Toucra O48 lipopolysaccharide.
Eur J Biochem 267: 3160–3166.
Greenfield LK & Whitfield C (2012) Synthesis of
lipopolysaccharide O-antigens by ABC
transporter-dependent pathways. Carbohydr Res 356: 12–24.
Grimont PAD & Weill FX (2007) Antigenic Formulae of the
Salmonella Serovars, 9th edn. WHO Collaborating Centre
for Reference and Research on Salmonella. Institut Pasteur,
Paris, France.
Guan S, Clarke AJ & Whitfield C (2001) Functional analysis of
the galactosyltransferases required for biosynthesis of
D-galactan I, a component of the lipopolysaccharide O1
antigen of Klebsiella pneumoniae. J Bacteriol 183: 3318–3327.
Guo D, Liu B, Liu F et al. (2013) Development of a DNA
microarray for molecular identification of all 46 Salmonella
O serogroups. Appl Environ Microbiol 79: 3392–3399.
Gupta DS, Shashkov AS, Jann B et al. (1992) Structures of the
O1B and O1C lipopolysaccharide antigens of Escherichia
coli. J Bacteriol 174: 7963–7970.
Hallis TM, Lei Y, Que NL et al. (1998) Mechanistic studies of
the biosynthesis of paratose: purification and
characterization of CDP-paratose synthase. Biochemistry 37:
4935–4945.
Hardnett FP, Hoekstra RM, Kennedy M et al. (2004)
Epidemiologic issues in study design and data analysis
related to FoodNet activities. Clin Infect Dis 38(suppl 3):
S121–S126.
Ho SY, Lanfear R, Bromham L et al. (2011) Time-dependent
rates of molecular evolution. Mol Ecol 20: 3087–3101.
Hu B, Perepelov AV, Liu B et al. (2010) Structural and genetic
evidence for the close relationship between Escherichia coli
O71 and Salmonella enterica O28 O-antigens. FEMS
Immunol Med Microbiol 59: 161–169.
Iguchi A, Thomson NR, Ogura Y et al. (2009) Complete
genome sequence and comparative genome analysis of
enteropathogenic Escherichia coli O127:H6 strain E2348/69.
J Bacteriol 191: 347–354.
Jansson PE, Lindberg B, Widmalm G et al. (1987) Structural
studies of the Escherichia coli O78 O-antigen polysaccharide.
Carbohydr Res 165: 87–92.
Jensen SO & Reeves PR (2001) Molecular evolution of the
GDP-mannose pathway genes (manB and manC) in
Salmonella enterica. Microbiology 147: 599–610.
Jiang XM, Neal B, Santiago F et al. (1991) Structure and
sequence of the rfb (O antigen) gene cluster of Salmonella
serovar typhimurium (strain LT2). Mol Microbiol 5:
695–713.
Johnson DA & Liu H (1998) Mechanisms and pathways from
recent deoxysugar biosynthesis research. Curr Opin Chem
Biol 2: 642–649.
Keenleyside WJ & Whitfield C (1996) A novel pathway for
O-polysaccharide biosynthesis in Salmonella enterica serovar
Borreze. J Biol Chem 271: 28581–28592.
Keenleyside WJ, Perry M, Maclean L et al. (1994) A
plasmid-encoded rfbO:54 gene cluster is required for
biosynthesis of the O:54 antigen in Salmonella enterica
serovar Borreze. Mol Microbiol 11: 437–448.
Kenne L, Lindberg B, Soderholm E et al. (1983) Structural
studies of the O-antigens from Salmonella greenside and
Salmonella adelaide. Carbohydr Res 111: 289–296.
King JD, Mulrooney EF, Vinogradov E et al. (2008) lfnA from
Pseudomonas aeruginosa O12 and wbuX from Escherichia coli
O145 encode membrane-associated proteins and are
required for expression of
2,6-dideoxy-2-acetamidino-L-galactose in lipopolysaccharide
O antigen. J Bacteriol 190: 1671–1679.
Kneidinger B, Graninger M, Adam G et al. (2001)
Identification of two GDP-6-deoxy-D-lyxo-4-hexulose
reductases synthesizing GDP-D-rhamnose in
Aneurinibacillus thermoaerophilus L420-91T. J Biol Chem
276: 5577–5583.
Knirel YA (2011) Structure of O-antigens. Bacterial
Lipopolysaccharides: Structure, Chemical Synthesis, Biogenesis
and Interaction with Host Cells. (Knirel YA & Valvano MA,
eds), Springer Wien, New York, NY.
Knirel YA, Shashkov AS, Tsvetkov YE et al. (2003)
5,7-Diamino-3,5,7,9-tetradeoxynon-2-ulosonic acids in
bacterial glycopolymers: chemistry and biochemistry. Adv
Carbohydr Chem Biochem 58: 371–417.
Knirel YA, Shevelev SD & Perepelov AV (2012) Higher
aldulosonic acids: components of bacterial glycans.
Mendeleev Commun 21: 173–182.
Knirel YA, Lan R, Senchenkova SN et al. (2013) O-antigen
structure of Shigella flexneri serotype Yv and effect of the
lpt-O gene variation on phosphoethanolamine modification
of S. flexneri O-antigens. Glycobiology 23: 475–485.
Kocharova NA, Vinogradov EV, Knirel’ IuA et al. (1988) The
structure of O-specific polysaccharide chains of
lipopolysaccharides from Citrobacter 032 and Salmonella
arizonae 064 (Arizona 29). Bioorg Khim 14: 697–700.
Kocharova NA, Knirel YA, Stanislavsky ES et al. (1996)
Structural and serological studies of lipopolysaccharides of
Citrobacter O35 and O38 antigenically related to Salmonella.
FEMS Immunol Med Microbiol 13: 1–8.
Koropatkin NM, Liu HW & Holden HM (2003) High
resolution x-ray structure of tyvelose epimerase from
Salmonella typhi. J Biol Chem 278: 20874–20881.
Kumirska J, Szafranek J, Czerwicka M et al. (2007) The
structure of the O-polysaccharide isolated from the
lipopolysaccharide of Salmonella Dakar (serogroup O:28).
Carbohydr Res 342: 2138–2143.
Kumirska J, Dziadziuszko H, Czerwicka M et al. (2011)
Heterogeneous structure of O-Antigenic part of
lipopolysaccharide of Salmonella Telaviv (serogroup O:28)
containing 3-Acetamido-3,6-dideoxy-D-glucopyranose.
Biochemistry (Mosc) 76: 780–790.
Li Q & Reeves PR (2000) Genetic variation of
dTDP-L-rhamnose pathway genes in Salmonella enterica.
Microbiology 146: 2291–2307.
Li Y, Cao B, Liu B et al. (2009) Molecular detection of all 34
distinct O-antigen forms of Shigella. J Med Microbiol 58:
69–81.
Li D, Liu B, Chen M et al. (2010a) A multiplex PCR method
to detect 14 Escherichia coli serogroups associated with
urinary tract infections. J Microbiol Methods 82: 71–77.
Li Y, Perepelov AV, Guo D et al. (2010b) Structural and
genetic relationships of two pairs of closely related
O-antigens of Escherichia coli and Salmonella enterica: E. coli
O11/S. enterica O16 and E. coli O21/S. enterica O38. FEMS
Immunol Med Microbiol 61: 258–268.
Lindberg B, Lindh F, Longren J et al. (1981) Structural studies
of the O-specific side-chain of the lipopolysaccharide from
Escherichia coli O55. Carbohydr Res 97: 105–112.
Lindberg B, Leontein K, Lindquist U et al. (1988) Structural
studies of the O-antigen polysaccharide of Salmonella
thompson, serogroup C1 (6,7). Carbohydr Res 174:
313–322.
Linton KJ & Higgins CF (1998) The Escherichia coli
ATP-binding cassette (ABC) proteins. Mol Microbiol 28: 5–13.
Liu D, Verma NK, Romana LK & Reeves PR (1991)
Relationships among the rfb regions of Salmonella serovars
A, B, and D. J Bacteriol 173: 4814–4819.
Liu B, Knirel YA, Feng L et al. (2008) Structure and genetics
of Shigella O antigens. FEMS Microbiol Rev 32: 627–653.
Liu B, Wu F, Li D et al. (2009) Development of a
serogroup-specific DNA microarray for identification of
Escherichia coli strains associated with bovine septicemia and
diarrhea. Vet Microbiol 142: 373–378.
Liu B, Perepelov AV, Li D et al. (2010a) Structure of the
O-antigen of Salmonella O66 and the genetic basis for
similarity and differences between the closely related
O-antigens of Escherichia coli O166 and Salmonella O66.
Microbiology 156: 1642–1649.
Liu B, Perepelov AV, Guo D et al. (2010b) Structural and
genetic relationships between the O-antigens of Escherichia
coli O118 and O151. FEMS Immunol Med Microbiol 60:
199–207.
Liu B, Perepelov AV, Svensson MV et al. (2010c) Genetic
and structural relationships of Salmonella O55
and Escherichia coli O103 O-antigens and identification
of a 3-hydroxybutanoyltransferase gene involved in
the synthesis of a Fuc3N derivative. Glycobiology 20:
679–688.
MacLean LL & Perry MB (1997) Structural characterization of
the serotype O:5 O-polysaccharide antigen of the
lipopolysaccharide of Escherichia coli O:5. Biochem Cell Biol
75: 199–205.
Makela PH, Valtonen VV & Valtonen M (1973) Role of
O-antigen (lipopolysaccharide) factors in the virulence of
Salmonella. J Infect Dis 128 (Suppl): 81–85.
Masoud H & Perry MB (1996) Structural characterization of
the O-antigenic polysaccharide of Escherichia coli serotype
O17 lipopolysaccharide. Biochem Cell Biol 74: 241–248.
Maurelli AT, Fernandez RE, Bloch CA et al. (1998) “Black
holes” and bacterial pathogenicity: a large genomic deletion
that enhances the virulence of Shigella spp. and
enteroinvasive Escherichia coli. P Natl Acad Sci USA 95:
3943–3948.
Mavroidi A, Aanensen DM, Godoy D et al. (2007) Genetic
relatedness of the Streptococcus pneumoniae capsular
biosynthetic loci. J Bacteriol 189: 7841–7855.
McGrath BC & Osborn MJ (1991) Localisation of the terminal
steps of O-antigen synthesis in Salmonella typhimurium.
J Bacteriol 173: 649–654.
McQuiston JR, Parrenas R, Ortiz-Rivera M et al. (2004)
Sequencing and comparative analysis of flagellin genes fliC,
fljB, and flpA from Salmonella. J Clin Microbiol 42:
1923–1932.
McQuiston JR, Herrera-Leon S, Wertheim BC et al. (2008)
Molecular phylogeny of the salmonellae: relationships
among Salmonella species and subspecies determined from
four housekeeping genes and evidence of lateral gene
transfer events. J Bacteriol 190: 7060–7067.
Morelli G, Song Y, Mazzoni CJ et al. (2011) Yersinia pestis
genome sequencing identifies patterns of global phylogenetic
diversity. Nat Genet 42: 1140–1143.
Mulford CA & Osborn MJ (1983) An intermediate step in
translocation of lipopolysaccharide to outer membrane of
Salmonella typhimurium. P Natl Acad Sci USA 80:
1159–1163.
Mulrooney EF, Poon KK, McNally DJ et al. (2005)
Biosynthesis of UDP-N-acetyl-L-fucosamine, a precursor to
the biosynthesis of lipopolysaccharide in Pseudomonas
aeruginosa serotype O11. J Biol Chem 280: 19535–19542.
Naide Y, Nikaido H, Mäkelä PH et al. (1965)
Semirough strains of Salmonella. P Natl Acad Sci USA 53:
147–153.
Neidhardt FC, Ingraham JL, Magasanik B et al. (1987)
Escherichia and Salmonella typhimurium: Cellular and
Molecular Biology. American Society for Microbiology,
Washington, DC.
Nelson K & Selander RK (1992) Evolutionary genetics of the
proline permease gene (putP) and the control region of the
proline utilization operon in populations of Salmonella and
Escherichia coli. J Bacteriol 174: 6886–6895.
Nelson K, Whittam TS & Selander RK (1991) Nucleotide
polymorphism and evolution in the
glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in
natural populations of Salmonella and Escherichia coli. P
Natl Acad Sci USA 88: 6667–6671.
Nikaido H, Levinthal M, Nikaido K et al. (1967) Extended
deletions in the histidine-rough-B region of the Salmonella
chromosome. P Natl Acad Sci USA 57: 1825–1832.
Ochman H & Wilson AC (1987) Evolution in bacteria:
evidence for a universal substitution rate in cellular
genomes. J Mol Evol 26: 74–86.
Orskov I, Orskov F, Jann B et al. (1977) Serology, chemistry,
and genetics of O and K antigens of Escherichia coli.
Bacteriol Rev 41: 667–710.
Pallen MJ & Wren BW (2007) Bacterial pathogenomics.
Nature 449: 835–842.
Perepelov AV, Liu B, Senchenkova SN et al. (2009) Structure
of O-antigen and functional characterization of O-antigen
gene cluster of Salmonella enterica O47 containing ribitol
phosphate and 2-acetimidoylamino-2,6-dideoxy-L-galactose.
Biochemistry (Mosc) 74: 416–420.
Perepelov AV, Liu B, Senchenkova SN et al. (2010a) Structure
and gene cluster of the O-antigen of Salmonella enterica
O60 containing 3-formamido-3,6-dideoxy-D-galactose.
Carbohydr Res 345: 1632–1634.
Perepelov AV, Liu B, Senchenkova SN et al. (2010b) Structure
of the O-polysaccharide of Salmonella enterica O41.
Carbohydr Res 345: 971–973.
Perepelov AV, Liu B, Senchenkova SN et al. (2010c)
Structure of the O-antigen and characterization of the
O-antigen gene cluster of Escherichia coli O108 containing
5,7-diacetamido-3,5,7,9-tetradeoxy-L-glycero-D-galacto-non-2-ulosonic
(8-epilegionaminic) acid. Biochemistry (Mosc) 75: 19–24.
Perepelov AV, Liu B, Senchenkova SN et al. (2010d) Structure
and gene cluster of the O-antigen of Salmonella enterica
O44. Carbohydr Res 345: 2099–2101.
Perepelov AV, Liu B, Senchenkova SN et al. (2010e) The
O-antigen of Salmonella enterica O13 and its relation to
the O-antigen of Escherichia coli O127. Carbohydr Res
345: 1808–1811.
Perepelov AV, Liu B, Shevelev SD et al. (2010f) Relatedness of
the O-polysaccharide structures of Escherichia coli O123 and
Salmonella enterica O58, both containing
4,6-dideoxy-4-{N-[(S)-3-hydroxybutanoyl]-D-alanyl}amino-D-glucose;
revision of the E. coli O123
O-polysaccharide structure. Carbohydr Res 345: 825–829.
Perepelov AV, Liu B, Shevelev SD et al. (2010g) Structural and
genetic characterization of the O-antigen of Salmonella
enterica O56 containing a novel derivative of
4-amino-4,6-dideoxy-D-glucose. Carbohydr Res 345:
1891–1895.
Perepelov AV, Liu B, Senchenkova SN et al. (2011a) Structure
of the O-polysaccharide and characterization of the
O-antigen gene cluster of Salmonella enterica O53.
Carbohydr Res 346: 373–376.
Perepelov AV, Liu B, Senchenkova SN et al. (2011b) Structures
of the O-polysaccharides of Salmonella enterica O59 and
Escherichia coli O15. Carbohydr Res 346: 381–383.
Perepelov AV, Liu B, Guo D et al. (2011c) Structure
elucidation of the O-antigen of Salmonella enterica O51 and
its structural and genetic relation to the O-antigen of
Escherichia coli O23. Biochemistry (Mosc) 76: 774–779.
Perepelov AV, Li D, Liu B et al. (2011d) Structural and
genetic characterization of the closely related O-antigens of
Escherichia coli O85 and Salmonella enterica O17. Innate
Immun 17: 164–173.
Perepelov AV, Liu B, Senchenkova SN et al. (2011e) O-antigen
structure and gene clusters of Escherichia coli O51 and
Salmonella enterica O57; another instance of identical
O-antigens in the two species. Carbohydr Res 346: 828–832.
Perry MB & MacLean LL (1992a) Structure of the
polysaccharide O-antigen of Salmonella riogrande O:40
(group R) related to blood group A activity. Carbohydr Res
232: 143–150.
Perry MB & MacLean LL (1992b) Structural characterization
of the O-polysaccharide of the lipopolysaccharide produced
by Salmonella milwaukee O:43 (group U) which possesses
human blood group B activity. Biochem Cell Biol 70: 49–55.
Perry MB, MacLean L & Griffith DW (1986a) Structure of the
O-chain polysaccharide of the phenol-phase soluble
lipopolysaccharide of Escherichia coli O157:H7. Biochem Cell
Biol 64: 21–28.
Perry MB, Bundle DR, MacLean L et al. (1986b) The structure
of the antigenic lipopolysaccharide O-chains produced by
Salmonella urbana and Salmonella godesberg. Carbohydr Res
156: 107–122.
Pfoestl A, Hofinger A, Kosma P et al. (2003) Biosynthesis of
dTDP-3-acetamido-3,6-dideoxy-α-D-galactose in
Aneurinibacillus thermoaerophilus L420-91T. J Biol Chem
278: 26410–26417.
Pfostl A, Zayni S, Hofinger A et al. (2008) Biosynthesis of
dTDP-3-acetamido-3,6-dideoxy-α-D-glucose. Biochem J 410:
187–194.
Plainvert C, Bidet P, Peigne C et al. (2007) A new O-antigen
gene cluster has a key role in the virulence of the Escherichia
coli meningitis clone O45:K1:H7. J Bacteriol 189: 8528–8536.
Pluschke G, Mayden J, Achtman M et al. (1983) Role of the
capsule and the O-antigen in resistance of O18:K1
Escherichia coli to complement-mediated killing. Infect Immun
42: 907–913.
Popoff MY & Le Minor L (1997) Antigenic Formulas of the
Salmonella Serovars, 7th Revision. WHO Collaborating
Centre for Reference and Research on Salmonella. Institut
Pasteur, Paris, France.
Pupo GM, Lan R & Reeves PR (2000) Multiple independent
origins of Shigella clones of Escherichia coli and convergent
evolution of many of their characteristics. P Natl Acad Sci
USA 97: 10567–10572.
Rabsch W, Andrews HL, Kingsley RA et al. (2002) Salmonella
enterica serotype Typhimurium and its host-adapted
variants. Infect Immun 70: 2249–2255.
Ramamurthy T, Yamasaki S, Takeda Y et al. (2003) Vibrio
cholerae O139 Bengal: odyssey of a fortuitous variant.
Microbes Infect 5: 329–344.
Ratnayake S, Weintraub A & Widmalm G (1994) Structural
studies of the enterotoxigenic Escherichia coli (ETEC) O153
O-antigenic polysaccharide. Carbohydr Res 265: 113–120.
Raymond CK, Sims EH, Kas A et al. (2002) Genetic variation
at the O-antigen biosynthetic locus in Pseudomonas
aeruginosa. J Bacteriol 184: 3614–3622.
Raynaud C, Meibom KL, Lety MA et al. (2007) Role of the
wbt locus of Francisella tularensis in lipopolysaccharide
O-antigen biogenesis and pathogenicity. Infect Immun 75:
536–541.
Reeves PR (1992) Variation in O antigens, niche specific
selection and bacterial populations. FEMS Microbiol Lett
100: 509–516.
Reeves PR (1995) Role of O-antigen variation in the immune
response. Trends Microbiol 3: 381–386.
Reeves PR & Wang L (2002) Genomic organization of
LPS-specific loci. Curr Top Microbiol Immunol 264: 109–135.
Reeves PR, Liu B, Zhou Z et al. (2011) Rates of mutation and
host transmission for an Escherichia coli clone over 3 years.
PLoS ONE 6: e26907.
Reeves PR, Cunneen MM, Liu B & Wang L (2013) Genetics
and evolution of the Salmonella galactose-initiated set of O
antigens. PLoS ONE 8: e69306.
Ren Y, Liu B, Cheng J et al. (2008) Characterization of
Escherichia coli O3 and O21 O-antigen gene clusters and
development of serogroup-specific PCR assays. J Microbiol
Methods 75: 329–334.
Royle MC, Totemeyer S, Alldridge LC et al. (2003)
Stimulation of Toll-like receptor 4 by lipopolysaccharide
during cellular invasion by live Salmonella typhimurium is a
critical but not exclusive event leading to macrophage
responses. J Immunol 170: 5445–5454.
Rundlof T, Weintraub A & Widmalm G (1998) Structural
determination of the O-antigenic polysaccharide from
Escherichia coli O35 and cross-reactivity to Salmonella
arizonae O62. Eur J Biochem 258: 139–143.
Rush JS, Alaimo C, Robbiani R et al. (2010) A novel
epimerase that converts GlcNAc-P-P-undecaprenol to
GalNAc-P-P-undecaprenol in Escherichia coli O157. J Biol
Chem 285: 1671–1680.
Samuel G & Reeves P (2003) Biosynthesis of O-antigens: genes
and pathways involved in nucleotide sugar precursor synthesis
and O-antigen assembly. Carbohydr Res 338: 2503–2519.
Samuel G, Hogbin JP, Wang L et al. (2004) Relationships of
the Escherichia coli O157, O111, and O55 O-antigen gene
clusters with those of Salmonella enterica and Citrobacter
freundii, which express identical O antigens. J Bacteriol 186:
6536–6543.
Senchenkova SN, Shashkov AS, Knirel YA et al. (1997)
Structure of the O-specific polysaccharide of Salmonella
enterica ssp. arizonae O50 (Arizona 9a, 9b). Carbohydr Res
301: 61–67.
Sharp PM (1991) Determinants of DNA sequence divergence
between Escherichia coli and Salmonella typhimurium: codon
usage, map position, and concerted evolution. J Mol Evol
33: 23–33.
Shashkov AS, Lipkind GM, Knirel NK et al. (1988)
Stereochemical factors determining the effects of
glycosylation on the 13C chemical shifts in carbohydrates.
Magn Reson Chem 26: 735–747.
Shashkov AS, Vinogradov EV, Knirel YA et al. (1993)
Structure of the O-specific polysaccharide of Salmonella
arizonae O45. Carbohydr Res 241: 177–188.
Somoza JR, Menon S, Schmidt H et al. (2000) Structural and
kinetic analysis of Escherichia coli GDP-mannose 4,6
dehydratase provides insights into the enzyme’s catalytic
mechanism and regulation by GDP-fucose. Structure 8:
123–135.
Staaf M, Widmalm G, Weintraub A et al. (1995) Structural
elucidation of the O-antigenic polysaccharide from
Escherichia coli O44:H18. Eur J Biochem 233: 473–477.
Staaf M, Urbina F, Weintraub A et al. (1999) Structural
elucidation of the O-antigenic polysaccharides from
Escherichia coli O21 and the enteroaggregative Escherichia
coli strain 105. Eur J Biochem 266: 241–245.
Sun Q, Knirel YA, Lan R et al. (2012) A novel
plasmid-encoded serotype conversion mechanism through
addition of phosphoethanolamine to the O-antigen of
Shigella flexneri. PLoS ONE 7: e46095.
Szafranek J, Kaczynska M, Kaczynski Z et al. (2003) Structure
of the polysaccharide O-antigen of Salmonella Aberdeen
(O:11). Pol J Chem 77: 1135–1140.
Verma V & Reeves PR (1989) Identification and sequence of
rfbS and rfbE, which determine antigenic specificity of group
A and group D Salmonella. J Bacteriol 171: 5694–5701.
Verma NK, Quigley NB & Reeves PR (1988) O-antigen
variation in Salmonella spp.: rfb gene clusters of three
strains. J Bacteriol 170: 103–107.
Vinogradov EV, Knirel’ IuA, Lipkind GM et al. (1987a)
[Antigenic bacterial polysaccharides. 24. The structure of the
O-specific polysaccharide chain of Salmonella arizonae O63
(Arizona 08) lipopolysaccharide]. Bioorg Khim 13: 1399–1404.
Vinogradov EV, Knirel YA, Lipkind GM et al. (1987b)
Antigenic polysaccharides of bacteria. 23. The structure of
the O-specific polysaccharide chain of the
lipopolysaccharide of Salmonella arizonae O59. Bioorg Khim
13: 1275–1281.
Vinogradov EV, Shashkov AS, Knirel YA et al. (1992) The
structure of the O-specific polysaccharide chain of the
lipopolysaccharide of Salmonella arizonae O61. Carbohydr
Res 231: 1–11.
Vinogradov EV, Knirel YA, Kochetkov NK et al. (1994) The
structure of the O-specific polysaccharide of Salmonella
arizonae O62. Carbohydr Res 253: 101–110.
Vinogradov E, Nossova L & Radziejewska-Lebrecht J (2004)
The structure of the O-specific polysaccharide from
Salmonella cerro (serogroup K, O:6,14,18). Carbohydr Res
339: 2441–2443.
Wang L & Reeves PR (1998) Organization of Escherichia coli
O157 O-antigen gene cluster and identification of its
specific genes. Infect Immun 66: 3545–3551.
Wang L & Reeves PR (2000) The Escherichia coli O111 and
Salmonella enterica O35 gene clusters: gene clusters
encoding the same colitose-containing O antigen are highly
conserved. J Bacteriol 182: 5256–5261.
Wang L, Romana LK & Reeves PR (1992) Molecular analysis
of a Salmonella enterica group E1 rfb gene cluster: O antigen
and the genetic basis of the major polymorphism. Genetics
130: 429–443.
Wang L, Andrianopoulos K, Liu D et al. (2002a) Extensive
variation in the O-antigen gene cluster within one
Salmonella enterica serogroup reveals an unexpected
complex history. J Bacteriol 184: 1669–1677.
Wang L, Huskic S, Cisterne A et al. (2002b) The O-antigen
gene cluster of Escherichia coli O55:H7 and identification of
a new UDP-GlcNAc C4 epimerase gene. J Bacteriol 184:
2620–2625.
Wang W, Perepelov AV, Feng L et al. (2007) A group of
Escherichia coli and Salmonella enterica O antigens sharing
a common backbone structure. Microbiology 153: 2159–2167.
Weintraub A, Leontein K, Widmalm G et al. (1993) Structural
studies of the O-antigenic polysaccharide of an
enteroaggregative Escherichia coli strain. Eur J Biochem 213:
859–864.
West NP, Sansonetti P, Mounier J et al. (2005) Optimization
of virulence functions through glucosylation of Shigella LPS.
Science 307: 1313–1317.
Whitfield C, Richards JC, Perry MB et al. (1991) Expression of
two structurally distinct D-galactan O antigens in the
lipopolysaccharide of Klebsiella pneumoniae serotype O1.
J Bacteriol 173: 1420–1431.
Widmalm G & Leontein K (1993) Structural studies of the
Escherichia coli O127 O-antigen polysaccharide. Carbohydr
Res 247: 255–262.
Wildschutte H & Lawrence JG (2007) Differential Salmonella
survival against communities of intestinal amoebae.
Microbiology 153: 1781–1789.
Wildschutte H, Wolfe DM, Tamewitz A et al. (2004)
Protozoan predation, diversifying selection, and the
evolution of antigenic diversity in Salmonella. P Natl Acad
Sci USA 101: 10644–10649.
Xiang SH, Hobbs M & Reeves PR (1994) Molecular analysis of
the rfb gene cluster of a group D2 Salmonella enterica strain:
evidence for its origin from an insertion sequence-mediated
recombination event between group E and D1 strains.
J Bacteriol 176: 4357–4365.
Yi W, Shao J, Zhu L et al. (2005) Escherichia coli O86
O-antigen biosynthetic gene cluster and stepwise enzymatic
synthesis of human blood group B antigen tetrasaccharide.
J Am Chem Soc 127: 2040–2041.
Yildirim H, Weintraub A & Widmalm G (2001) Structural
studies of the O-polysaccharide from the Escherichia coli
O77 lipopolysaccharide. Carbohydr Res 333: 179–183.
Zhao S, Sandt CH, Feulner G et al. (1993) Rhs elements of
Escherichia coli K-12: complex composites of shared and
unique components that have different evolutionary
histories. J Bacteriol 175: 2799–2808.
Zhao G, Liu J, Liu X et al. (2007) Cloning and
characterization of GDP-perosamine synthetase (Per)
from Escherichia coli O157:H7 and synthesis of
GDP-perosamine in vitro. Biochem Biophys Res Commun
363: 525–530.
Zuccotti S, Zanardi D, Rosano C et al. (2001) Kinetic and
crystallographic analyses support a sequential-ordered bi bi
catalytic mechanism for Escherichia coli glucose-1-phosphate
thymidylyltransferase. J Mol Biol 313: 831–843.
Supporting Information
Additional Supporting Information may be found in the
online version of this article:
Fig. S1. The proposed functions of glycosyltransferases
involved in the synthesis of Salmonella GlcNAc/GalNAc-initiated O antigens with the Wzx/Wzy pathway.
Table S1. Composition of Salmonella GlcNAc/GalNAc-initiated O antigens.
Table S2. Characteristics of the ORFs in Salmonella
O-antigen gene clusters that are first reported in this
review.
Table S3. Homology groups of glycosyltransferases
in Salmonella GlcNAc/GalNAc-initiated O-antigen gene
clusters with wzx/wzy.
Table S4. Summary of unique Salmonella GlcNAc/
GalNAc-initiated O antigens.
Table S5. Primers used for Salmonella molecular typing.