An extracellular calcium-binding domain in bacteria with a distant

FEMS Microbiology Letters 221 (2003) 103^110
www.fems-microbiology.org
An extracellular calcium-binding domain in bacteria with
a distant relationship to EF-hands
Daniel J. Rigden a , Mark J. Jedrzejas b , Michael Y. Galperin
c;
a
c
National Center of Genetic Resources and Biotechnology, Cenargen/Embrapa, Brasilia D.F. 70770-900, Brazil
b
Children’s Hospital Oakland Research Institute, Oakland, CA 94609, USA
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Received 22 January 2003; received in revised form 19 February 2003; accepted 20 February 2003
First published online 12 March 2003
Abstract
Extracellular Ca2þ -dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are
encoded in the genomes in two paralogous forms that differ by a V45 amino acid fragment, which comprises a novel conserved domain.
Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2þ -binding loop of the
calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in
which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved
motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of
structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto
recognized and the evolution of EF-hand-like domains is probably more complex than previously appreciated.
4 2003 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.
Keywords : Genome analysis; Calcium-binding site; Protein domain ; Cell envelope; S-layer; Nuclease ; Calmodulin; Evolution
1. Introduction
Many enzymes depend on metal cofactors for their activity. Several metal ions, such as copper and iron, feature
dedicated binding domains that participate in their storage, transport and delivery to the target enzymes.
Although the diversity of the metal-binding domains in
prokaryotes appears to be much less than in eukaryotes,
there are some well-characterized ones, such as the heavy
metal-associated domain found in metal-transporting
P-type ATPases, copper chaperones, and periplasmic mercury-binding proteins (reviewed in [1^3]) and the ironbinding ferritin domain (reviewed in [4]).
Calcium, which regulates a variety of regulatory processes in bacteria, such as motility, chemotaxis, cell division and di¡erentiation [5,6], also has dedicated binding
domains. The best studied of these are the parallel L-roll
domain, characterized by a repeated GGxGxD sequence
* Corresponding author. Tel. : +1 (301) 435 5910;
Fax : +1 (301) 435 7794.
E-mail address : [email protected] (M.Y. Galperin).
motif, found in a number of secreted proteins from Gramnegative bacteria [7] and the calmodulin-like EF-hand helix^loop^helix domain, consisting of two helices, E and F,
connected by a calcium-binding loop with a DxDxDG
consensus motif (reviewed in [8^10]). The EF-hand structure was found in a variety of eukaryotic Ca2þ -binding
proteins, such as parvalbumin, calmodulin, aequorin, calbindin and S100 proteins [8^10]. In contrast, this structural motif has not been found so far in any bacterial protein
with known three-dimensional structure. Nevertheless,
bacterial Ca2þ -binding proteins with predicted EF-hands
have been described, for example, calerythrin from the
actinomycete Saccharopolyspora erythrea [11] and calsymin from Rhizobium etli [12]. The presence of the EFhand helix^loop^helix structure in the former protein has
been recently con¢rmed by NMR analysis [13]. Sequence
comparisons revealed the presence of EF-hand-like motifs
in a variety of proteins encoded in completely sequenced
bacterial genomes [14,15]. In addition, Ca2þ -binding sites
in several bacterial extracytoplasmic proteins were found
to resemble the EF-hands. These include, for example, the
soluble lytic transglycosylase from Escherichia coli [16], the
periplasmic galactose-binding protein from Salmonella
0378-1097 / 03 / $22.00 4 2003 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.
doi:10.1016/S0378-1097(03)00160-5
FEMSLE 10901 7-4-03
Cyaan Magenta Geel Zwart
104
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
typhimurium [17], Bacillus anthracis protective antigen [18]
and the dockerin domain of the extracellular cellulase
complex from Clostridium thermocellum [19]. While all
these proteins deviate in one way or another from the
typical EF-hands, they all share the conserved Ca2þ -binding motif Dx(D/N)xDG within the central loop of the EFhand domain. The variety of Ca2þ -binding proteins in
bacteria suggests the possibility of additional Ca2þ -binding
structures. Here we describe one more bacterial extracellular domain that shares local but striking similarity with
the EF-hand motif and is found in several Ca2þ -dependent
surface proteins from both Gram-positive and Gram-negative bacteria.
2. Materials and methods
2.1. Database searches
Initial sequence similarity searches were performed
against the non-redundant protein database and the database of ¢nished and un¢nished microbial genome sequences at the National Center for Biotechnology Information
(NCBI, Bethesda, MD, USA) using PSI-BLAST [20] program run with default parameters. Database hits identi¢ed
in these searches were used as queries for additional protein database PSI-BLAST searches with the relaxed inclusion threshold values of E = 0.018 and for the searches
against translated GenBank nucleotide database using
the TBLASTn [20] program. The completeness of the
search results was veri¢ed by direct pattern matching using
the DxDxDGxxCE sequence pattern.
Domain architecture analyses were carried out by individually comparing the protein sequences against SMART
[21] and Pfam [22] domain databases.
2.2. Secondary structure analysis
Signal peptides were predicted using the SignalP [23]
program. Secondary structure predictions were performed
using PSI-PRED [24], SSpro [25], and several other programs available on the web (reviewed in [26]). The resulting predictions were manually reconciled based on the
reliability ¢gures for each particular residue, reported by
each program. Consensus sequences were derived from
alignments using Mview [27]; EF-hand sequences were
obtained from the SMART database [21].
3. Results and discussion
3.1. Sequence and structure of the Excalibur (extracellular
calcium-binding region) domain
ing protein SP0667 (SPTREMBL accession Q97RW9) was
found to contain an 80-aa fragment that was located between the predicted signal peptide and the cell-wall-binding domains and showed no statistically signi¢cant sequence similarity to any domain listed in SMART [21]
or Pfam [22] domain databases. A PSI-BLAST search of
the NCBI protein database using this fragment of SP0667
(residues 24^103) as the query converged after ¢ve iterations of PSI-BLAST retrieving 14 proteins from both lowGC and high-GC Gram-positive bacteria that contained a
well-de¢ned domain with two absolutely conserved Cys
residues and three absolutely conserved Asp residues
(Fig. 1). All these proteins had predicted signal peptides
and invariably placed the new domain at the extreme Nor C-termini of the mature secreted proteins (Fig. 2). Using these proteins as queries in additional PSI-BLAST
searches with a relaxed inclusion cut-o¡ E-value retrieved
proteins from Gram-negative bacteria that had the same
sequence pattern. Several more instances of this domain,
including proteins from Mycobacterium tuberculosis and
Streptomyces exfoliatus, were found encoded in the
DNA sequences in GenBank nucleotide database and in
the database of un¢nished microbial genome sequences
(Table 1) but not translated into proteins, presumably because of their short size.
All the predicted proteins retrieved by these database
searches contained a similar 40^45-aa domain, present in
a stand-alone form in the Pasteurella multocida protein
PM0157 and characterized by the conserved DxDxDGxxCE motif (Fig. 1B). The remarkable similarity of
this motif to that de¢ned for the Ca2þ -binding motif of
the EF-hand calcium-binding domain [8^10] indicated that
the novel domain should also bind Ca2þ and prompted us
to name it accordingly as the Excalibur domain. Indeed, a
comparison of Excalibur and EF-hand consensus sequences demonstrates conservation in the former of the three
Ca2þ -binding Asp residues of the EF-hands and compatibility of the surrounding residues (shown in bold)
where small letters indicate the following groups of residues: h ^ hydrophobic, p ^ polar, s ^ small, and x ^ any
residue. The only signi¢cant di¡erence between the two
consensus motifs is the shorter distance between the conserved aspartates and the Glu residue in the Excalibur
domain. Since in some EF-hand-like sequences this Glu
residue can be shifted or even located elsewhere on the
polypeptide chain [17], this comparison supports the prediction that the Excalibur domain is capable of binding
calcium.
3.2. Diversity of the Excalibur-containing proteins
In the course of a survey of the surface-exposed proteins
of Streptococcus pneumoniae, the putative cell-wall- bind-
FEMSLE 10901 7-4-03
The domain organization of deduced proteins contain-
Cyaan Magenta Geel Zwart
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
ing the Excalibur domain (Table 1 and Fig. 2) indicates
that it is often found in association with domains that are
either Ca2þ -dependent or Ca2þ -regulated. The most common of such domain structures is a combination of a
C-terminal Excalibur domain with an N-terminal endonuclease domain, typi¢ed by the staphylococcal nuclease
[28]. This domain combination is found in Gram-positive
bacteria (Bacillus subtilis), cyanobacteria (Nostoc sp., Prochlorococcus marinus, Trichodesmium erythraeum), and in
105
proteobacteria (Azotobacter vinelandii, Mesorhizobium
loti). Remarkably, genomes of some of these organisms
also encode paralogous forms of extracellular nuclease
that lack the Excalibur domain (Fig. 2). In B. subtilis,
for example, the core regions of yokF and yncB gene products are 92% identical. The YokF protein, however, contains an extra C-terminal 83-aa fragment [29], which consists of the Excalibur domain and a £exible linker (Fig. 2).
A comparison of these two proteins revealed that YokF is
Table 1
A list of the Excalibur-containing proteins
Organism
Accession
no. or gia
Protein
namea
Length
Total
Gram-positive
B. subtilis phage SPBc2
B. anthracis
B. anthracis
S. pneumoniae
Streptococcus mitis
Staphylococcus aureus
Staphylococcus capitis
Actinobacteria
Corynebacterium glutamicum
Corynebacterium glutamicum
Corynebacterium e⁄ciens
Corynebacterium e⁄ciens
Streptomyces coelicolor
S. exfoliatus
Thermobi¢da fusca
Mycobacterium avium
Mycobacterium smegmatis
M. tuberculosis
Cyanobacteria
Nostoc sp. pCC7120alpha
Nostoc sp. pCC7120alpha
Nostoc sp. pCC7120alpha
Prochlorococcus marinus
Trichodesmium erythraeum
Chloro£exi
Chloro£exus aurantiacus
Proteobacteria
Alpha subdivision
Azotobacter vinelandii
Mesorhizobium loti
Novosphingobium aromaticivorans
Rhizobium sp. pNGR234alpha
Rhodobacter sphaeroides
Gamma subdivision
P. multocida
Shewanella oneidensis
Vibrio vulni¢cus
Excalibur
location
Other domainsb
Paralog without
Excalibur
Signal
peptide
YokF
BA_2645
BA_5470
SP0667
NA
SA1617
NA
AAC12979
21400020
21402845
AAK74812
Un¢nished
BAB57961
AF548637
296
333
268
332
190
204
s 184
24
57
26
26
26
52
^c
C-terminal
C-terminal
C-terminal
N-terminal
N-terminal
C-terminal
C-terminal
SNase
Coiled coil
SLH
Cell wall
Cell wall
Low comp.
SNase
YncB
^
BA_1675
^
Cgl0341
Cgl0956
CE1921
CE2653
C7A8.31
NA
NA
NA
NA
NA
BAB97734
BAB98349
BAC18731
BAC19463
CAB69780
U97057
Un¢nished
Un¢nished
Un¢nished
NC_000962
191
173
211
363
157
100
142
69
82
71
33
57
30
24
27
11
33
24
43
23
C-terminal
C-terminal
C-terminal
C-terminal
N-terminal
C-terminal
C-terminal
Whole protein
Whole protein
Whole protein
Low comp.
Low comp.
Low comp.
Rv2972c
Low comp.
Low comp.
Low comp.
None
None
None
^
^
^
CE1360
^
Alr7333
Alr7381d
Asr7343d
Pmit0770
Tery3071
BAB77091
BAB77139
BAB77101
23131487
23042405
175
128
69
211
233
21
^d
^d
24
22
C-terminal
C-terminal
Whole protein
C-terminal
C-terminal
SNase
SNased
Noned
SNase
SNase
All0265, All2317
^
^
^
^
Chlo2608
22972742
448
27
C-terminal
ComEC
^
Avin3995
23105817
225
21
C-terminal
SNase
Msi140
NA
NA
Rsph0055
CAD31545
Un¢nished
AE000066
22956401
257
129
140
292
28
46
^
17
C-terminal
C-terminal
C-terminal
C-terminal
SNase
^
^
^
Avin0542
Avin3705
Mlr0479 Mll8119
PM0157
SO0733
VV12532
AAK02241
AAN53811
AAO10888
69
203
155
19
^
^
Whole protein
C-terminal
C-terminal
None
Cold shock
Cold shock
a
^
^
^
^
^
^
^
^
Protein names and accession numbers in the NCBI protein database; NA, not available (previously untranslated proteins or proteins from un¢nished
genomes). Bold accession numbers are from the GenBank nucleotide database, newly translated proteins have the following locations: AF548637:
1..555 Frame+1 ; U97057 : 1482..1784 Frame31; NC_000962: 2580036..2580251 Frame31; AE000066:1792..2214 Frame31.
b
Domain names and abbreviations are as in Fig. 2. Low comp. stands for low-complexity segments, a dash indicates absence of recognizable domains.
See text for details.
c
The N-terminal fragment of this protein is missing.
d
This protein is a C-terminal fragment of an Alr7333-like nuclease, disrupted by a frameshift mutation.
FEMSLE 10901 7-4-03
Cyaan Magenta Geel Zwart
106
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
Fig. 1. Multiple alignment of the Excalibur domain. A: The alignment was constructed on the basis of PSI-BLAST [20] search results with minimal
manual editing. The proteins are listed under their gene names and species names, abbreviated as follows: Avin, Azotobacter vinelandii; Bsub, B. subtilis ; Bant, B. anthracis; Caur, Chloro£exus aurantiacus; Ce¡, Corynebacterium e⁄ciens; Cglu, C. glutamicum ; Mlot, Mesorhizobium loti ; Nos, Nostoc
(Anabaena) sp. PCC7120 ; Pmar, Prochlorococcus marinus; Pmul, Pasteurella multocida; Rsph, Rhodobacter sphaeroides ; Saur, Staphylococcus aureus ;
Scoe, Streptomyces coelicolor ; Spne, S. pneumoniae ; Tery, Trichodesmium erythraeum; Sone, Shewanella oneidensis ; Vvul, Vibrio vulni¢cus. Missing amino acid residues at the end of the alignment indicate protein C-termini. The secondary structure prediction shown beneath the alignment represents a
tentative consensus of six di¡erent prediction algorithms [26]. The NCBI protein database or Swiss-Prot accession numbers, where available, are shown
in the right column. Conserved residues are shown in bold typeface and/or colored as follows: acidic ^ red; hydrophobic ^ yellow background; small
(Gly, Ser, Ala) and Pro ^ green background. Cys residues that form predicted disul¢de bond are shown in white letters on blue background. The yellow
numbers on dark background indicate the lengths of variable inserts in the respective protein sequences. B: Consensus sequence of the Excalibur domain drawn in the SeqLogo format using the WebLogo tool (http://weblogo.berkeley.edu [44]). The height of each letter indicates the degree of its conservation, the total height of each column represents the statistical importance of the given position. The residue numbering is based on the sequence of
the mature form of SP0667 (lacking the 26-aa signal peptide).
responsible for the major part of the endonuclease activity
in B. subtilis cells, while YncB represented a minor fraction of that activity [29]. Whether these di¡erences are
fully attributable to the presence of the Excalibur domain
remains to be determined.
The B. anthracis gene BA_5470 encodes another notable
domain combination, the Excalibur domain preceded by
three N-terminal S-layer homology (SLH) domains (Fig.
2). SLH domains are capable of tight non-covalent binding to the peptidoglycan and often serve as adaptors for
cell-wall anchoring of various enzymes [30]. Studies of
FEMSLE 10901 7-4-03
peptidoglycan attachment by SLH domains suggested an
involvement of metal ions, possibly Ca2þ , in this interaction [31]. Ca2þ has been previously shown to be involved
in S-layer assembly in Caulobacter crescentus [32], although this might be an unrelated process, as this organism’s genome does not seem to encode SLH domains.
Similarly to the case of B. subtilis endonucleases above,
B. anthracis also encodes a close paralog of BA_5470 that
is lacking the Excalibur domain (Fig. 3).
In Chlo£exus aurantiacus, the Excalibur domain is
found in combination with an uncharacterized hydrolase
Cyaan Magenta Geel Zwart
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
107
Fig. 2. Domain organization of proteins containing the Excalibur domain and their close paralogs. Full names and accession numbers of the proteins
are as in Fig. 1 and Table 1; proteins are drawn approximately to scale. Domain names are from SMART [21] and Pfam [22] databases and are abbreviated as follows : SNase, staphylococcal nuclease (PF00565); Cold, cold-shock protein domain (SM00357, PF00313); SLH (PF00395), Lactamase_B,
metallo-L-lactamase superfamily (PF00753). Small black boxes on the N-termini indicate predicted signal peptides, the sword in a stone indicates the
Excalibur domain, the rhombs indicate cell-wall-binding (PF01473) domains, Rv2972c indicates an uncharacterized actinobacterial protein domain (see
text for details).
of the metallo-L-lactamase superfamily [33], closely related
to the C-terminal part of B. subtilis competence protein
ComEC [34]. Although the principal catalytic residues of
L-lactamase are conserved in this protein (data not
shown), and there is no doubt that it possesses a hydrolytic activity, its exact substrate speci¢city and the nature
of the bound metal remain unclear; it could well be another extracellular endonuclease.
The Corynebacterium e⁄ciens CE2653 gene product
combines the Excalibur domain with a previously uncharacterized actinobacteria-speci¢c globular domain that is
distantly related to the periplasmic endonuclease I of
E. coli [35] and can be predicted to have a similar activity.
In other proteins from corynebacteria and streptomycetes,
as well as from streptococci and staphylococci, the Excalibur domain is found in association with various lowcomplexity domains, which might be involved in adhesion
and colonization processes. While immunogenic properties
of these proteins have not been investigated so far, they
might represent interesting vaccine candidates.
In two gamma-proteobacteria, Shewanella oneidensis
and Vibrio vulni¢cus, the Excalibur domain is found associated with the cold-shock domain, known to bind RNA
and single-stranded DNA [36,37]. This is the only context
where Excalibur domain is predicted to be inside the cy-
FEMSLE 10901 7-4-03
Fig. 3. A model of Ca2þ binding by the Excalibur domain. The consensus Excalibur sequence motif DRDGDGIACE was modeled with MODELLER-6 [45] using the ¢rst EF-hand domain of human calmodulin
(PDB code 1cdl [46]) as template. Interactions with bound Ca2þ are
shown as dotted lines and interacting residues are numbered with the
¢rst Asp de¢ned as residue 1. The ¢gure was produced with PYMOL
(DeLano, 2003, http://www.pymol.org).
Cyaan Magenta Geel Zwart
108
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
toplasm, and its exact function there is as obscure as that
of the cold-shock domain (see [38] for discussion). The
absence of the Excalibur domain in all other Q-proteobacteria whose genomes have been completely sequenced to
date probably indicates its relatively recent acquisition by
S. oneidensis and V. vulni¢cus.
Finally, in several organisms, including M. tuberculosis,
the Excalibur domain may be found on bacterial surface
in the stand-alone form (Fig. 2). As this predicted protein
has not been described before and was even missed in
genome annotation of all M. tuberculosis strains sequenced
so far, no data on its possible functions is currently available. Nevertheless, it seems plausible that the function of
this domain in mycobacteria and in P. multocida could be
related, respectively, to the maintenance of the stability of
the cell envelope and capsular polysaccharide.
3.3. Structural aspects of Ca2+ binding by EF-hands and
the Excalibur domain
domain. Inconsistencies between secondary structure predictions for the Excalibur domain obtained by di¡erent
methods do not allow for unequivocal determination of
the presence or absence in it of the helix that precedes
the Ca2þ -binding loop in EF-hands. In any case, shared
structural context is neither required nor su⁄cient for the
inference of homology, given the increasingly appreciated
subtleties of protein structural evolution [40].
The two entirely conserved Cys residues in this extracellular protein family are suggestive of the existence of a
disul¢de bridge. It may be that this structural constraint
substitutes for the larger domain framework of EF-hand
containing proteins in orienting the ends of the Ca2þ -binding loop correctly for metal binding. Although simple,
more generalized binding motifs can arise by convergent
evolution [41], we believe that the extensive local similarities between EF-hands and the novel domain indicate a
genuine evolutionary relationship.
3.4. Diversity and evolution of the EF-hand-like domains
Classical EF-hands contain a helix^loop^helix structure
with the Ca2þ ion binding by the residues in the loop [8^
10]. The Ca2þ ion binds in a pentagonal bipyramidal fashion with the side chains of three turn residues, numbered
1, 3 and 5 in the corresponding PROSITE motif [39],
typically aspartates, acting as monodentate ligands. A further side chain, contributed by residue 12, typically Glu, is
a bidentate calcium ligand while the main chain carboxyl
group of residue seven and water molecules complete the
coordination of the Ca2þ ion. Allowing for a di¡erent
separation of aspartates and glutamate ^ four residues in
the Excalibur domain, six residues for EF-hands ^ these
characteristics appear to be entirely conserved in the former (Fig. 1). Examination of EF-hand structures and
alignments [21] shows that other, non-ligating positions
have characteristic residue distributions for structural reasons. Positions 4 and 6 lie, respectively, close to the core of
the left-handed helical region of the Ramachandran plot,
and at the very periphery of this region. These locations of
the Ramachandran plot would be expected to de¢ne, respectively, a preference for turn-favoring residues and a
near-exclusive preference for glycine residues. Indeed, exactly these tendencies are observed in the EF-hand consensus loop sequence. Once again, these residue preferences for positions 4 and 6 of the EF-hand are re£ected
exactly in the consensus sequence for the Excalibur domain (Fig. 1). These considerations favor an evolutionary
relationship, or at least a structural similarity, between the
Excalibur and EF-hand domains. Consistent with this
idea, the lesser distance between DxDxD and conserved
Glu in the novel domain is readily structurally accommodated (Fig. 2).
That the overall structures of the Excalibur and EFhand domains di¡er is guaranteed by the fact that the
novel domains terminate shortly after the conserved Glu
while a further 12-residue helix is part of the EF-hand
FEMSLE 10901 7-4-03
Although we have compared the Excalibur domain with
the relatively well-understood EF-hand, the Excalibur domain is, intriguingly, the ¢fth structural context in which
the core EF-hand-like Ca2þ -binding loop has been identi¢ed. Structural determinations of the periplasmic galactose-binding protein from S. typhimurium (PDB entry
1gcg) revealed, surprisingly, a DxNxD calcium-binding
loop, nearly superimposable on those of EF-hands, in a
helix^loop^strand structural context [17]. Just as in EFhands, the third monodentate side chain ligand (a carbonyl from an Asp residue) is followed there by glycine, which
allows the interaction with the bound calcium of the main
chain carbonyl of the residue in the seventh position. Such
similarities again argue for an evolutionary relationship
between the two loops, despite di¡erences in the position
of the glutamate acting as the bidentate Ca2þ ligand.
In the dockerin domain, the E-helix in front of the
Ca2þ -binding loop is missing, but the F-helix follows the
loop, just as in typical EF-hands. The context of the Ca2þ binding loop in dockerins may therefore be described as
coil^turn^helix. The same separation of the DxNxD calcium-binding loop and the conserved bidentate calcium
ligand, in this case an aspartate, is present in the dockerin
domain as in the EF-hand. Essentially the same organization of the Ca2þ -binding site is seen in the B. anthracis
protective antigen. Again the characteristic stereochemistry of the Ca2þ -binding loop is re£ected in similar residue
preferences to the EF-hand, even outside the actual calcium ligands, providing further support for an evolutionary
relationship.
Another notable example is the Ca2þ -binding protein
from the £atworm Echinococcus granulosus [42], where
the DxDxD motif, followed by a conserved acidic residue
at the typical EF-hand spacing, is present in 15 tandem
repeats of 12^14 residues. Based on limited prediction of
Cyaan Magenta Geel Zwart
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
L-structure for some of the repeats, a structural relationship to the L-roll Ca2þ -binding proteins was proposed [42].
There are several reasons to doubt this idea, not least the
fact that the currently available programs predict very
little regular secondary structure in the repeat region
(analysis not shown). Most importantly, given the excellent match between the E. granulosus repeat consensus and
the EF-hand consensus and the former’s complete lack of
similarity with the L-roll consensus, the more conservative
explanation is that the repeats of the E. granulosus protein
bind calcium in loops of similar structure to those of EFhands.
Based on the discovery of calmodulin-like proteins in
bacteria, it has been argued that the EF-hands had originated in the prokaryotic kingdom [11,14]. While this could
still be correct, one has to consider that the Ca2þ -binding
loop is found in bacteria in a variety of diverse structural
contexts : helix^loop^helix in calerythrin [13] and soluble
lytic transglycosylase [16], helix^loop^strand in the periplasmic galactose-binding protein from S. typhimurium
[17], coil^loop^helix in the dockerin domain [19] and in
B. anthracis protective antigen [18], and as the C-terminal
loop in the Excalibur domain (Fig. 1). This suggests that
evolution of the EF-hand-like domains might be much
more complex than previously appreciated and these domains could be much more widespread than was previously recognized, involving various modi¢cations of the
structural elements surrounding the Ca2þ -binding loop.
The classical EF-hand is therefore just the most visible
representative of the whole superfamily, not necessarily
the most ancient one.
In conclusion, prokaryotic relatives of the Ca2þ -binding
EF-hand domain clearly exhibit a signi¢cantly wider sequence and structural diversity than has been so far appreciated, with the Excalibur domain representing a further member of a superfamily of Ca2þ -binding domains.
Five structural contexts for the core Ca2þ -binding loop
motif, originally associated solely with EF-hands, are
now known: EF-hands themselves, the periplasmic galactose-binding protein, the dockerin domain/protective antigen, the Excalibur domain, and the repeated structure
typi¢ed by a E. granulosus protein. This indicates a very
complex evolutionary history, involving acquisition and/or
loss of £anking helices and strands. Another possibility is
that the Ca2þ -binding loop motif is an evolutionarily mobile unit that has become implanted into several di¡erent
‘host’ proteins (see [40] for the discussion of possible scenarios). Direct experiments will be needed to de¢ne the
functions of the Excalibur domain in each of its diverse
domain contexts and, indeed, to con¢rm experimentally
the computational prediction of Ca2þ -binding ability.
It is relevant to note that the other classes of proteins
containing these Ca2þ -binding loops bind calcium for a
variety of reasons ^ bu¡ering or regulation (EF-hands
[8^10]), structural (lytic transglycosylase [16] and periplasmic galactose-binding protein [17]) and for calcium mobi-
FEMSLE 10901 7-4-03
109
lization to and/or from deposits (the E. granulosus Ca2þ binding protein [42]). Hence, it also remains to be seen
whether binding and storing Ca2þ ions is the sole biological function of the Excalibur domain, as it has been proposed for other bacterial calmodulin-like proteins [43], or
whether the metal-bound Excalibur domain has some additional function. Predicted nucleic acid-binding proteins
are relatively abundant among those containing Excalibur
domains, raising the possibility that this domain might
bind to DNA or RNA.
Acknowledgements
Supported in part by NIH Grant AI44079 and DARPA
Grant N66001-01-C-8013 (M.J.J.) and by CNPq and
PRODETAB (D.J.R.).
References
[1] Silver, S. and Phung, L.T. (1996) Bacterial heavy metal resistance:
new surprises. Annu. Rev. Microbiol. 50, 753^789.
[2] Harrison, M.D., Jones, C.E., Solioz, M. and Dameron, C.T. (2000)
Intracellular copper routing: the role of copper chaperones. Trends
Biochem. Sci. 25, 29^32.
[3] Jordan, I.K., Natale, D.A., Koonin, E.V. and Galperin, M.Y. (2001)
Independent evolution of heavy metal-associated domains in copper
chaperones and copper-transporting ATPases. J. Mol. Evol. 53, 622^
633.
[4] Andrews, S.C. (1998) Iron storage in bacteria. Adv. Microb. Physiol.
40, 281^351.
[5] Silver, S. (1977) Calcium transport in microorganisms. In: Microorganisms and Minerals (Weinberg, E.D., Ed.), pp. 49^103. Marcel
Dekker Publ., New York.
[6] Smith, R.J. (1995) Calcium and bacteria. Adv. Microb. Physiol. 37,
83^133.
[7] Baumann, U., Wu, S., Flaherty, K.M. and McKay, D.B. (1993)
Three-dimensional structure of the alkaline protease of Pseudomonas
aeruginosa: a two-domain protein with a calcium binding parallel
beta roll motif. EMBO J. 12, 3357^3364.
[8] Kawasaki, H., Nakayama, S. and Kretsinger, R.H. (1998) Classi¢cation and evolution of EF-hand proteins. Biometals 11, 277^295.
[9] Nelson, M.R. and Chazin, W.J. (1998) Structures of EF-hand Ca2þ binding proteins: diversity in the organization, packing and response
to Ca2þ binding. Biometals 11, 297^318.
[10] Lewit-Bentley, A. and Rety, S. (2000) EF-hand calcium-binding proteins. Curr. Opin. Struct. Biol. 10, 637^643.
[11] Swan, D.G., Hale, R.S., Dhillon, N. and Leadlay, P.F. (1987) A
bacterial calcium-binding protein homologous to calmodulin. Nature
329, 84^85.
[12] Xi, C., Schoeters, E., Vanderleyden, J. and Michiels, J. (2000) Symbiosis-speci¢c expression of Rhizobium etli casA encoding a secreted
calmodulin-related protein. Proc. Natl. Acad. Sci. USA 97, 11114^
11119.
[13] Aitio, H., Annila, A., Heikkinen, S., Thulin, E., Drakenberg, T. and
Kilpelainen, I. (1999) NMR assignments, secondary structure, and
global fold of calerythrin, an EF-hand calcium-binding protein
from Saccharopolyspora erythraea. Protein Sci. 8, 2580^2588.
[14] Yang, K. (2001) Prokaryotic calmodulins : recent developments and
evolutionary implications. J. Mol. Microbiol. Biotechnol. 3, 457^459.
[15] Michiels, J., Xi, C., Verhaert, J. and Vanderleyden, J. (2002) The
functions of Ca2þ in bacteria: a role for EF-hand proteins ? Trends
Microbiol. 10, 87^93.
Cyaan Magenta Geel Zwart
110
D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110
[16] van Asselt, E.J., Dijkstra, A.J., Kalk, K.H., Takacs, B., Keck, W.
and Dijkstra, B.W. (1999) Crystal structure of Escherichia coli lytic
transglycosylase Slt35 reveals a lysozyme-like catalytic domain with
an EF-hand. Struct. Fold. Des. 7, 1167^1180.
[17] Vyas, N.K., Vyas, M.N. and Quiocho, F.A. (1987) A novel calcium
binding site in the galactose-binding protein of bacterial transport
and chemotaxis. Nature 327, 635^638.
[18] Petosa, C., Collier, R.J., Klimpel, K.R., Leppla, S.H. and Liddington, R.C. (1997) Crystal structure of the anthrax toxin protective
antigen. Nature 385, 833^838.
[19] Lytle, B.L., Volkman, B.F., Westler, W.M., Heckman, M.P. and Wu,
J.H. (2001) Solution structure of a type I dockerin domain, a novel
prokaryotic, extracellular calcium-binding domain. J. Mol. Biol. 307,
745^753.
[20] Altschul, S.F., Madden, T.L., Scha¡er, A.A., Zhang, J., Zheng, Z.,
Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSIBLAST - A new generation of protein database search programs.
Nucleic Acids Res. 25, 3389^3402.
[21] Letunic, I., Goodstadt, L., Dickens, N.J., Doerks, T., Schultz, J.,
Mott, R., Ciccarelli, F., Copley, R.R., Ponting, C.P. and Bork, P.
(2002) Recent improvements to the SMART domain-based sequence
annotation resource. Nucleic Acids Res. 30, 242^244.
[22] Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy,
S.R., Gri⁄ths-Jones, S., Howe, K.L., Marshall, M. and Sonnhammer, E.L. (2002) The Pfam protein families database. Nucleic Acids
Res. 30, 276^280.
[23] Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997)
Identi¢cation of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1^6.
[24] Jones, D.T. (1999) Protein secondary structure prediction based on
position-speci¢c scoring matrices. J. Mol. Biol. 292, 195^202.
[25] Pollastri, G., Przybylski, D., Rost, B. and Baldi, P. (2002) Improving
the prediction of protein secondary structure in three and eight
classes using recurrent neural networks and pro¢les. Proteins 47,
228^235.
[26] Koonin, E.V. and Galperin, M.Y. (2002) Sequence - Evolution Function. Computational Approaches in Comparative Genomics.
Kluwer Academic Publishers, Boston, MA.
[27] Brown, N.P., Leroy, C. and Sander, C. (1998) MView: a web-compatible database search or multiple alignment viewer. Bioinformatics
14, 380^381.
[28] Hynes, T.R. and Fox, R.O. (1991) The crystal structure of staphylococcal nuclease re¢ned at 1.7 A resolution. Proteins 10, 92^105.
[29] Sakamoto, J.J., Sasaki, M. and Tsuchido, T. (2001) Puri¢cation and
characterization of a Bacillus subtilis 168 nuclease, YokF, involved in
chromosomal DNA degradation and cell death caused by thermal
shock treatments. J. Biol. Chem. 276, 47046^47051.
[30] Sleytr, U.B. and Beveridge, T.J. (1999) Bacterial S-layers. Trends
Microbiol. 7, 253^260.
FEMSLE 10901 7-4-03
[31] Kosugi, A., Murashima, K., Tamaru, Y. and Doi, R.H. (2002) Cellsurface-anchoring role of N-terminal surface layer homology
domains of Clostridium cellulovorans EngE. J. Bacteriol. 184, 884^
888.
[32] Walker, S.G., Karunaratne, D.N., Ravenscroft, N. and Smit, J.
(1994) Characterization of mutants of Caulobacter crescentus defective in surface attachment of the paracrystalline surface layer. J. Bacteriol. 176, 6312^6323.
[33] Aravind, L. (1999) An evolutionary classi¢cation of the metallo-betalactamase fold proteins. In Silico Biol. 1, 69^91.
[34] Hahn, J., Inamine, G., Kozlov, Y. and Dubnau, D. (1993) Characterization of comE, a late competence operon of Bacillus subtilis required for the binding and uptake of transforming DNA. Mol. Microbiol. 10, 99^111.
[35] Jekel, M. and Wackernagel, W. (1995) The periplasmic endonuclease
I of Escherichia coli has amino-acid sequence homology to the extracellular DNases of Vibrio cholerae and Aeromonas hydrophila. Gene
154, 55^59.
[36] Schindelin, H., Jiang, W., Inouye, M. and Heinemann, U. (1994)
Crystal structure of CspA, the major cold shock protein of Escherichia coli. Proc. Natl. Acad. Sci. USA 91, 5119^5123.
[37] Weber, M.H., Fricke, I., Doll, N. and Marahiel, M.A. (2002)
CSDBase : an interactive database for cold shock domain-containing
proteins and the bacterial cold shock response. Nucleic Acids Res 30,
375^378.
[38] Sommerville, J. (1999) Activities of cold-shock domain proteins in
translation control. Bioessays 21, 319^325.
[39] Sigrist, C.J., Cerutti, L., Hulo, N., Gattiker, A., Falquet, L., Pagni,
M., Bairoch, A. and Bucher, P. (2002) PROSITE: a documented
database using patterns and pro¢les as motif descriptors. Brief Bioinform. 3, 265^274.
[40] Grishin, N.V. (2001) Fold change in evolution of protein structures.
J. Struct. Biol. 134, 167^185.
[41] Denessiouk, K.A., Rantanen, V.V. and Johnson, M.S. (2001) Adenine recognition : a motif present in ATP-, CoA-, NAD-, NADP-,
and FAD-dependent proteins. Proteins 44, 282^291.
[42] Rodrigues, J.J., Ferreira, H.B., Farias, S.E. and Zaha, A. (1997) A
protein with a novel calcium-binding domain associated with calcareous corpuscles in Echinococcus granulosus. Biochem. Biophys. Res.
Commun. 237, 451^456.
[43] Cox, J.A. and Bairoch, A. (1988) Sequence similarities in calciumbinding proteins. Nature 331, 491.
[44] Crooks, G.E., Hon, G., Chandonia, J.M. and Brenner, S.E. (2003)
WebLogo: A sequence logo generator. Genome Res. 13, (in press).
[45] Sali, A. and Blundell, T.L. (1993) Comparative protein modelling by
satisfaction of spatial restraints. J. Mol. Biol. 234, 779^815.
[46] Meador, W.E., Means, A.R. and Quiocho, F.A. (1992) Target enzyme recognition by calmodulin : 2.4 A structure of a calmodulinpeptide complex. Science 257, 1251^1255.
Cyaan Magenta Geel Zwart