FEMS Microbiology Letters 221 (2003) 103^110 www.fems-microbiology.org An extracellular calcium-binding domain in bacteria with a distant relationship to EF-hands Daniel J. Rigden a , Mark J. Jedrzejas b , Michael Y. Galperin c; a c National Center of Genetic Resources and Biotechnology, Cenargen/Embrapa, Brasilia D.F. 70770-900, Brazil b Children’s Hospital Oakland Research Institute, Oakland, CA 94609, USA National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA Received 22 January 2003; received in revised form 19 February 2003; accepted 20 February 2003 First published online 12 March 2003 Abstract Extracellular Ca2þ -dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a V45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2þ -binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognized and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 4 2003 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved. Keywords : Genome analysis; Calcium-binding site; Protein domain ; Cell envelope; S-layer; Nuclease ; Calmodulin; Evolution 1. Introduction Many enzymes depend on metal cofactors for their activity. Several metal ions, such as copper and iron, feature dedicated binding domains that participate in their storage, transport and delivery to the target enzymes. Although the diversity of the metal-binding domains in prokaryotes appears to be much less than in eukaryotes, there are some well-characterized ones, such as the heavy metal-associated domain found in metal-transporting P-type ATPases, copper chaperones, and periplasmic mercury-binding proteins (reviewed in [1^3]) and the ironbinding ferritin domain (reviewed in [4]). Calcium, which regulates a variety of regulatory processes in bacteria, such as motility, chemotaxis, cell division and di¡erentiation [5,6], also has dedicated binding domains. The best studied of these are the parallel L-roll domain, characterized by a repeated GGxGxD sequence * Corresponding author. Tel. : +1 (301) 435 5910; Fax : +1 (301) 435 7794. E-mail address : [email protected] (M.Y. Galperin). motif, found in a number of secreted proteins from Gramnegative bacteria [7] and the calmodulin-like EF-hand helix^loop^helix domain, consisting of two helices, E and F, connected by a calcium-binding loop with a DxDxDG consensus motif (reviewed in [8^10]). The EF-hand structure was found in a variety of eukaryotic Ca2þ -binding proteins, such as parvalbumin, calmodulin, aequorin, calbindin and S100 proteins [8^10]. In contrast, this structural motif has not been found so far in any bacterial protein with known three-dimensional structure. Nevertheless, bacterial Ca2þ -binding proteins with predicted EF-hands have been described, for example, calerythrin from the actinomycete Saccharopolyspora erythrea [11] and calsymin from Rhizobium etli [12]. The presence of the EFhand helix^loop^helix structure in the former protein has been recently con¢rmed by NMR analysis [13]. Sequence comparisons revealed the presence of EF-hand-like motifs in a variety of proteins encoded in completely sequenced bacterial genomes [14,15]. In addition, Ca2þ -binding sites in several bacterial extracytoplasmic proteins were found to resemble the EF-hands. These include, for example, the soluble lytic transglycosylase from Escherichia coli [16], the periplasmic galactose-binding protein from Salmonella 0378-1097 / 03 / $22.00 4 2003 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved. doi:10.1016/S0378-1097(03)00160-5 FEMSLE 10901 7-4-03 Cyaan Magenta Geel Zwart 104 D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 typhimurium [17], Bacillus anthracis protective antigen [18] and the dockerin domain of the extracellular cellulase complex from Clostridium thermocellum [19]. While all these proteins deviate in one way or another from the typical EF-hands, they all share the conserved Ca2þ -binding motif Dx(D/N)xDG within the central loop of the EFhand domain. The variety of Ca2þ -binding proteins in bacteria suggests the possibility of additional Ca2þ -binding structures. Here we describe one more bacterial extracellular domain that shares local but striking similarity with the EF-hand motif and is found in several Ca2þ -dependent surface proteins from both Gram-positive and Gram-negative bacteria. 2. Materials and methods 2.1. Database searches Initial sequence similarity searches were performed against the non-redundant protein database and the database of ¢nished and un¢nished microbial genome sequences at the National Center for Biotechnology Information (NCBI, Bethesda, MD, USA) using PSI-BLAST [20] program run with default parameters. Database hits identi¢ed in these searches were used as queries for additional protein database PSI-BLAST searches with the relaxed inclusion threshold values of E = 0.018 and for the searches against translated GenBank nucleotide database using the TBLASTn [20] program. The completeness of the search results was veri¢ed by direct pattern matching using the DxDxDGxxCE sequence pattern. Domain architecture analyses were carried out by individually comparing the protein sequences against SMART [21] and Pfam [22] domain databases. 2.2. Secondary structure analysis Signal peptides were predicted using the SignalP [23] program. Secondary structure predictions were performed using PSI-PRED [24], SSpro [25], and several other programs available on the web (reviewed in [26]). The resulting predictions were manually reconciled based on the reliability ¢gures for each particular residue, reported by each program. Consensus sequences were derived from alignments using Mview [27]; EF-hand sequences were obtained from the SMART database [21]. 3. Results and discussion 3.1. Sequence and structure of the Excalibur (extracellular calcium-binding region) domain ing protein SP0667 (SPTREMBL accession Q97RW9) was found to contain an 80-aa fragment that was located between the predicted signal peptide and the cell-wall-binding domains and showed no statistically signi¢cant sequence similarity to any domain listed in SMART [21] or Pfam [22] domain databases. A PSI-BLAST search of the NCBI protein database using this fragment of SP0667 (residues 24^103) as the query converged after ¢ve iterations of PSI-BLAST retrieving 14 proteins from both lowGC and high-GC Gram-positive bacteria that contained a well-de¢ned domain with two absolutely conserved Cys residues and three absolutely conserved Asp residues (Fig. 1). All these proteins had predicted signal peptides and invariably placed the new domain at the extreme Nor C-termini of the mature secreted proteins (Fig. 2). Using these proteins as queries in additional PSI-BLAST searches with a relaxed inclusion cut-o¡ E-value retrieved proteins from Gram-negative bacteria that had the same sequence pattern. Several more instances of this domain, including proteins from Mycobacterium tuberculosis and Streptomyces exfoliatus, were found encoded in the DNA sequences in GenBank nucleotide database and in the database of un¢nished microbial genome sequences (Table 1) but not translated into proteins, presumably because of their short size. All the predicted proteins retrieved by these database searches contained a similar 40^45-aa domain, present in a stand-alone form in the Pasteurella multocida protein PM0157 and characterized by the conserved DxDxDGxxCE motif (Fig. 1B). The remarkable similarity of this motif to that de¢ned for the Ca2þ -binding motif of the EF-hand calcium-binding domain [8^10] indicated that the novel domain should also bind Ca2þ and prompted us to name it accordingly as the Excalibur domain. Indeed, a comparison of Excalibur and EF-hand consensus sequences demonstrates conservation in the former of the three Ca2þ -binding Asp residues of the EF-hands and compatibility of the surrounding residues (shown in bold) where small letters indicate the following groups of residues: h ^ hydrophobic, p ^ polar, s ^ small, and x ^ any residue. The only signi¢cant di¡erence between the two consensus motifs is the shorter distance between the conserved aspartates and the Glu residue in the Excalibur domain. Since in some EF-hand-like sequences this Glu residue can be shifted or even located elsewhere on the polypeptide chain [17], this comparison supports the prediction that the Excalibur domain is capable of binding calcium. 3.2. Diversity of the Excalibur-containing proteins In the course of a survey of the surface-exposed proteins of Streptococcus pneumoniae, the putative cell-wall- bind- FEMSLE 10901 7-4-03 The domain organization of deduced proteins contain- Cyaan Magenta Geel Zwart D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 ing the Excalibur domain (Table 1 and Fig. 2) indicates that it is often found in association with domains that are either Ca2þ -dependent or Ca2þ -regulated. The most common of such domain structures is a combination of a C-terminal Excalibur domain with an N-terminal endonuclease domain, typi¢ed by the staphylococcal nuclease [28]. This domain combination is found in Gram-positive bacteria (Bacillus subtilis), cyanobacteria (Nostoc sp., Prochlorococcus marinus, Trichodesmium erythraeum), and in 105 proteobacteria (Azotobacter vinelandii, Mesorhizobium loti). Remarkably, genomes of some of these organisms also encode paralogous forms of extracellular nuclease that lack the Excalibur domain (Fig. 2). In B. subtilis, for example, the core regions of yokF and yncB gene products are 92% identical. The YokF protein, however, contains an extra C-terminal 83-aa fragment [29], which consists of the Excalibur domain and a £exible linker (Fig. 2). A comparison of these two proteins revealed that YokF is Table 1 A list of the Excalibur-containing proteins Organism Accession no. or gia Protein namea Length Total Gram-positive B. subtilis phage SPBc2 B. anthracis B. anthracis S. pneumoniae Streptococcus mitis Staphylococcus aureus Staphylococcus capitis Actinobacteria Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium e⁄ciens Corynebacterium e⁄ciens Streptomyces coelicolor S. exfoliatus Thermobi¢da fusca Mycobacterium avium Mycobacterium smegmatis M. tuberculosis Cyanobacteria Nostoc sp. pCC7120alpha Nostoc sp. pCC7120alpha Nostoc sp. pCC7120alpha Prochlorococcus marinus Trichodesmium erythraeum Chloro£exi Chloro£exus aurantiacus Proteobacteria Alpha subdivision Azotobacter vinelandii Mesorhizobium loti Novosphingobium aromaticivorans Rhizobium sp. pNGR234alpha Rhodobacter sphaeroides Gamma subdivision P. multocida Shewanella oneidensis Vibrio vulni¢cus Excalibur location Other domainsb Paralog without Excalibur Signal peptide YokF BA_2645 BA_5470 SP0667 NA SA1617 NA AAC12979 21400020 21402845 AAK74812 Un¢nished BAB57961 AF548637 296 333 268 332 190 204 s 184 24 57 26 26 26 52 ^c C-terminal C-terminal C-terminal N-terminal N-terminal C-terminal C-terminal SNase Coiled coil SLH Cell wall Cell wall Low comp. SNase YncB ^ BA_1675 ^ Cgl0341 Cgl0956 CE1921 CE2653 C7A8.31 NA NA NA NA NA BAB97734 BAB98349 BAC18731 BAC19463 CAB69780 U97057 Un¢nished Un¢nished Un¢nished NC_000962 191 173 211 363 157 100 142 69 82 71 33 57 30 24 27 11 33 24 43 23 C-terminal C-terminal C-terminal C-terminal N-terminal C-terminal C-terminal Whole protein Whole protein Whole protein Low comp. Low comp. Low comp. Rv2972c Low comp. Low comp. Low comp. None None None ^ ^ ^ CE1360 ^ Alr7333 Alr7381d Asr7343d Pmit0770 Tery3071 BAB77091 BAB77139 BAB77101 23131487 23042405 175 128 69 211 233 21 ^d ^d 24 22 C-terminal C-terminal Whole protein C-terminal C-terminal SNase SNased Noned SNase SNase All0265, All2317 ^ ^ ^ ^ Chlo2608 22972742 448 27 C-terminal ComEC ^ Avin3995 23105817 225 21 C-terminal SNase Msi140 NA NA Rsph0055 CAD31545 Un¢nished AE000066 22956401 257 129 140 292 28 46 ^ 17 C-terminal C-terminal C-terminal C-terminal SNase ^ ^ ^ Avin0542 Avin3705 Mlr0479 Mll8119 PM0157 SO0733 VV12532 AAK02241 AAN53811 AAO10888 69 203 155 19 ^ ^ Whole protein C-terminal C-terminal None Cold shock Cold shock a ^ ^ ^ ^ ^ ^ ^ ^ Protein names and accession numbers in the NCBI protein database; NA, not available (previously untranslated proteins or proteins from un¢nished genomes). Bold accession numbers are from the GenBank nucleotide database, newly translated proteins have the following locations: AF548637: 1..555 Frame+1 ; U97057 : 1482..1784 Frame31; NC_000962: 2580036..2580251 Frame31; AE000066:1792..2214 Frame31. b Domain names and abbreviations are as in Fig. 2. Low comp. stands for low-complexity segments, a dash indicates absence of recognizable domains. See text for details. c The N-terminal fragment of this protein is missing. d This protein is a C-terminal fragment of an Alr7333-like nuclease, disrupted by a frameshift mutation. FEMSLE 10901 7-4-03 Cyaan Magenta Geel Zwart 106 D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 Fig. 1. Multiple alignment of the Excalibur domain. A: The alignment was constructed on the basis of PSI-BLAST [20] search results with minimal manual editing. The proteins are listed under their gene names and species names, abbreviated as follows: Avin, Azotobacter vinelandii; Bsub, B. subtilis ; Bant, B. anthracis; Caur, Chloro£exus aurantiacus; Ce¡, Corynebacterium e⁄ciens; Cglu, C. glutamicum ; Mlot, Mesorhizobium loti ; Nos, Nostoc (Anabaena) sp. PCC7120 ; Pmar, Prochlorococcus marinus; Pmul, Pasteurella multocida; Rsph, Rhodobacter sphaeroides ; Saur, Staphylococcus aureus ; Scoe, Streptomyces coelicolor ; Spne, S. pneumoniae ; Tery, Trichodesmium erythraeum; Sone, Shewanella oneidensis ; Vvul, Vibrio vulni¢cus. Missing amino acid residues at the end of the alignment indicate protein C-termini. The secondary structure prediction shown beneath the alignment represents a tentative consensus of six di¡erent prediction algorithms [26]. The NCBI protein database or Swiss-Prot accession numbers, where available, are shown in the right column. Conserved residues are shown in bold typeface and/or colored as follows: acidic ^ red; hydrophobic ^ yellow background; small (Gly, Ser, Ala) and Pro ^ green background. Cys residues that form predicted disul¢de bond are shown in white letters on blue background. The yellow numbers on dark background indicate the lengths of variable inserts in the respective protein sequences. B: Consensus sequence of the Excalibur domain drawn in the SeqLogo format using the WebLogo tool (http://weblogo.berkeley.edu [44]). The height of each letter indicates the degree of its conservation, the total height of each column represents the statistical importance of the given position. The residue numbering is based on the sequence of the mature form of SP0667 (lacking the 26-aa signal peptide). responsible for the major part of the endonuclease activity in B. subtilis cells, while YncB represented a minor fraction of that activity [29]. Whether these di¡erences are fully attributable to the presence of the Excalibur domain remains to be determined. The B. anthracis gene BA_5470 encodes another notable domain combination, the Excalibur domain preceded by three N-terminal S-layer homology (SLH) domains (Fig. 2). SLH domains are capable of tight non-covalent binding to the peptidoglycan and often serve as adaptors for cell-wall anchoring of various enzymes [30]. Studies of FEMSLE 10901 7-4-03 peptidoglycan attachment by SLH domains suggested an involvement of metal ions, possibly Ca2þ , in this interaction [31]. Ca2þ has been previously shown to be involved in S-layer assembly in Caulobacter crescentus [32], although this might be an unrelated process, as this organism’s genome does not seem to encode SLH domains. Similarly to the case of B. subtilis endonucleases above, B. anthracis also encodes a close paralog of BA_5470 that is lacking the Excalibur domain (Fig. 3). In Chlo£exus aurantiacus, the Excalibur domain is found in combination with an uncharacterized hydrolase Cyaan Magenta Geel Zwart D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 107 Fig. 2. Domain organization of proteins containing the Excalibur domain and their close paralogs. Full names and accession numbers of the proteins are as in Fig. 1 and Table 1; proteins are drawn approximately to scale. Domain names are from SMART [21] and Pfam [22] databases and are abbreviated as follows : SNase, staphylococcal nuclease (PF00565); Cold, cold-shock protein domain (SM00357, PF00313); SLH (PF00395), Lactamase_B, metallo-L-lactamase superfamily (PF00753). Small black boxes on the N-termini indicate predicted signal peptides, the sword in a stone indicates the Excalibur domain, the rhombs indicate cell-wall-binding (PF01473) domains, Rv2972c indicates an uncharacterized actinobacterial protein domain (see text for details). of the metallo-L-lactamase superfamily [33], closely related to the C-terminal part of B. subtilis competence protein ComEC [34]. Although the principal catalytic residues of L-lactamase are conserved in this protein (data not shown), and there is no doubt that it possesses a hydrolytic activity, its exact substrate speci¢city and the nature of the bound metal remain unclear; it could well be another extracellular endonuclease. The Corynebacterium e⁄ciens CE2653 gene product combines the Excalibur domain with a previously uncharacterized actinobacteria-speci¢c globular domain that is distantly related to the periplasmic endonuclease I of E. coli [35] and can be predicted to have a similar activity. In other proteins from corynebacteria and streptomycetes, as well as from streptococci and staphylococci, the Excalibur domain is found in association with various lowcomplexity domains, which might be involved in adhesion and colonization processes. While immunogenic properties of these proteins have not been investigated so far, they might represent interesting vaccine candidates. In two gamma-proteobacteria, Shewanella oneidensis and Vibrio vulni¢cus, the Excalibur domain is found associated with the cold-shock domain, known to bind RNA and single-stranded DNA [36,37]. This is the only context where Excalibur domain is predicted to be inside the cy- FEMSLE 10901 7-4-03 Fig. 3. A model of Ca2þ binding by the Excalibur domain. The consensus Excalibur sequence motif DRDGDGIACE was modeled with MODELLER-6 [45] using the ¢rst EF-hand domain of human calmodulin (PDB code 1cdl [46]) as template. Interactions with bound Ca2þ are shown as dotted lines and interacting residues are numbered with the ¢rst Asp de¢ned as residue 1. The ¢gure was produced with PYMOL (DeLano, 2003, http://www.pymol.org). Cyaan Magenta Geel Zwart 108 D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 toplasm, and its exact function there is as obscure as that of the cold-shock domain (see [38] for discussion). The absence of the Excalibur domain in all other Q-proteobacteria whose genomes have been completely sequenced to date probably indicates its relatively recent acquisition by S. oneidensis and V. vulni¢cus. Finally, in several organisms, including M. tuberculosis, the Excalibur domain may be found on bacterial surface in the stand-alone form (Fig. 2). As this predicted protein has not been described before and was even missed in genome annotation of all M. tuberculosis strains sequenced so far, no data on its possible functions is currently available. Nevertheless, it seems plausible that the function of this domain in mycobacteria and in P. multocida could be related, respectively, to the maintenance of the stability of the cell envelope and capsular polysaccharide. 3.3. Structural aspects of Ca2+ binding by EF-hands and the Excalibur domain domain. Inconsistencies between secondary structure predictions for the Excalibur domain obtained by di¡erent methods do not allow for unequivocal determination of the presence or absence in it of the helix that precedes the Ca2þ -binding loop in EF-hands. In any case, shared structural context is neither required nor su⁄cient for the inference of homology, given the increasingly appreciated subtleties of protein structural evolution [40]. The two entirely conserved Cys residues in this extracellular protein family are suggestive of the existence of a disul¢de bridge. It may be that this structural constraint substitutes for the larger domain framework of EF-hand containing proteins in orienting the ends of the Ca2þ -binding loop correctly for metal binding. Although simple, more generalized binding motifs can arise by convergent evolution [41], we believe that the extensive local similarities between EF-hands and the novel domain indicate a genuine evolutionary relationship. 3.4. Diversity and evolution of the EF-hand-like domains Classical EF-hands contain a helix^loop^helix structure with the Ca2þ ion binding by the residues in the loop [8^ 10]. The Ca2þ ion binds in a pentagonal bipyramidal fashion with the side chains of three turn residues, numbered 1, 3 and 5 in the corresponding PROSITE motif [39], typically aspartates, acting as monodentate ligands. A further side chain, contributed by residue 12, typically Glu, is a bidentate calcium ligand while the main chain carboxyl group of residue seven and water molecules complete the coordination of the Ca2þ ion. Allowing for a di¡erent separation of aspartates and glutamate ^ four residues in the Excalibur domain, six residues for EF-hands ^ these characteristics appear to be entirely conserved in the former (Fig. 1). Examination of EF-hand structures and alignments [21] shows that other, non-ligating positions have characteristic residue distributions for structural reasons. Positions 4 and 6 lie, respectively, close to the core of the left-handed helical region of the Ramachandran plot, and at the very periphery of this region. These locations of the Ramachandran plot would be expected to de¢ne, respectively, a preference for turn-favoring residues and a near-exclusive preference for glycine residues. Indeed, exactly these tendencies are observed in the EF-hand consensus loop sequence. Once again, these residue preferences for positions 4 and 6 of the EF-hand are re£ected exactly in the consensus sequence for the Excalibur domain (Fig. 1). These considerations favor an evolutionary relationship, or at least a structural similarity, between the Excalibur and EF-hand domains. Consistent with this idea, the lesser distance between DxDxD and conserved Glu in the novel domain is readily structurally accommodated (Fig. 2). That the overall structures of the Excalibur and EFhand domains di¡er is guaranteed by the fact that the novel domains terminate shortly after the conserved Glu while a further 12-residue helix is part of the EF-hand FEMSLE 10901 7-4-03 Although we have compared the Excalibur domain with the relatively well-understood EF-hand, the Excalibur domain is, intriguingly, the ¢fth structural context in which the core EF-hand-like Ca2þ -binding loop has been identi¢ed. Structural determinations of the periplasmic galactose-binding protein from S. typhimurium (PDB entry 1gcg) revealed, surprisingly, a DxNxD calcium-binding loop, nearly superimposable on those of EF-hands, in a helix^loop^strand structural context [17]. Just as in EFhands, the third monodentate side chain ligand (a carbonyl from an Asp residue) is followed there by glycine, which allows the interaction with the bound calcium of the main chain carbonyl of the residue in the seventh position. Such similarities again argue for an evolutionary relationship between the two loops, despite di¡erences in the position of the glutamate acting as the bidentate Ca2þ ligand. In the dockerin domain, the E-helix in front of the Ca2þ -binding loop is missing, but the F-helix follows the loop, just as in typical EF-hands. The context of the Ca2þ binding loop in dockerins may therefore be described as coil^turn^helix. The same separation of the DxNxD calcium-binding loop and the conserved bidentate calcium ligand, in this case an aspartate, is present in the dockerin domain as in the EF-hand. Essentially the same organization of the Ca2þ -binding site is seen in the B. anthracis protective antigen. Again the characteristic stereochemistry of the Ca2þ -binding loop is re£ected in similar residue preferences to the EF-hand, even outside the actual calcium ligands, providing further support for an evolutionary relationship. Another notable example is the Ca2þ -binding protein from the £atworm Echinococcus granulosus [42], where the DxDxD motif, followed by a conserved acidic residue at the typical EF-hand spacing, is present in 15 tandem repeats of 12^14 residues. Based on limited prediction of Cyaan Magenta Geel Zwart D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 L-structure for some of the repeats, a structural relationship to the L-roll Ca2þ -binding proteins was proposed [42]. There are several reasons to doubt this idea, not least the fact that the currently available programs predict very little regular secondary structure in the repeat region (analysis not shown). Most importantly, given the excellent match between the E. granulosus repeat consensus and the EF-hand consensus and the former’s complete lack of similarity with the L-roll consensus, the more conservative explanation is that the repeats of the E. granulosus protein bind calcium in loops of similar structure to those of EFhands. Based on the discovery of calmodulin-like proteins in bacteria, it has been argued that the EF-hands had originated in the prokaryotic kingdom [11,14]. While this could still be correct, one has to consider that the Ca2þ -binding loop is found in bacteria in a variety of diverse structural contexts : helix^loop^helix in calerythrin [13] and soluble lytic transglycosylase [16], helix^loop^strand in the periplasmic galactose-binding protein from S. typhimurium [17], coil^loop^helix in the dockerin domain [19] and in B. anthracis protective antigen [18], and as the C-terminal loop in the Excalibur domain (Fig. 1). This suggests that evolution of the EF-hand-like domains might be much more complex than previously appreciated and these domains could be much more widespread than was previously recognized, involving various modi¢cations of the structural elements surrounding the Ca2þ -binding loop. The classical EF-hand is therefore just the most visible representative of the whole superfamily, not necessarily the most ancient one. In conclusion, prokaryotic relatives of the Ca2þ -binding EF-hand domain clearly exhibit a signi¢cantly wider sequence and structural diversity than has been so far appreciated, with the Excalibur domain representing a further member of a superfamily of Ca2þ -binding domains. Five structural contexts for the core Ca2þ -binding loop motif, originally associated solely with EF-hands, are now known: EF-hands themselves, the periplasmic galactose-binding protein, the dockerin domain/protective antigen, the Excalibur domain, and the repeated structure typi¢ed by a E. granulosus protein. This indicates a very complex evolutionary history, involving acquisition and/or loss of £anking helices and strands. Another possibility is that the Ca2þ -binding loop motif is an evolutionarily mobile unit that has become implanted into several di¡erent ‘host’ proteins (see [40] for the discussion of possible scenarios). Direct experiments will be needed to de¢ne the functions of the Excalibur domain in each of its diverse domain contexts and, indeed, to con¢rm experimentally the computational prediction of Ca2þ -binding ability. It is relevant to note that the other classes of proteins containing these Ca2þ -binding loops bind calcium for a variety of reasons ^ bu¡ering or regulation (EF-hands [8^10]), structural (lytic transglycosylase [16] and periplasmic galactose-binding protein [17]) and for calcium mobi- FEMSLE 10901 7-4-03 109 lization to and/or from deposits (the E. granulosus Ca2þ binding protein [42]). Hence, it also remains to be seen whether binding and storing Ca2þ ions is the sole biological function of the Excalibur domain, as it has been proposed for other bacterial calmodulin-like proteins [43], or whether the metal-bound Excalibur domain has some additional function. Predicted nucleic acid-binding proteins are relatively abundant among those containing Excalibur domains, raising the possibility that this domain might bind to DNA or RNA. Acknowledgements Supported in part by NIH Grant AI44079 and DARPA Grant N66001-01-C-8013 (M.J.J.) and by CNPq and PRODETAB (D.J.R.). References [1] Silver, S. and Phung, L.T. (1996) Bacterial heavy metal resistance: new surprises. Annu. Rev. Microbiol. 50, 753^789. [2] Harrison, M.D., Jones, C.E., Solioz, M. and Dameron, C.T. (2000) Intracellular copper routing: the role of copper chaperones. Trends Biochem. Sci. 25, 29^32. [3] Jordan, I.K., Natale, D.A., Koonin, E.V. and Galperin, M.Y. (2001) Independent evolution of heavy metal-associated domains in copper chaperones and copper-transporting ATPases. J. Mol. Evol. 53, 622^ 633. [4] Andrews, S.C. (1998) Iron storage in bacteria. Adv. Microb. Physiol. 40, 281^351. [5] Silver, S. (1977) Calcium transport in microorganisms. In: Microorganisms and Minerals (Weinberg, E.D., Ed.), pp. 49^103. Marcel Dekker Publ., New York. [6] Smith, R.J. (1995) Calcium and bacteria. Adv. Microb. Physiol. 37, 83^133. [7] Baumann, U., Wu, S., Flaherty, K.M. and McKay, D.B. (1993) Three-dimensional structure of the alkaline protease of Pseudomonas aeruginosa: a two-domain protein with a calcium binding parallel beta roll motif. EMBO J. 12, 3357^3364. [8] Kawasaki, H., Nakayama, S. and Kretsinger, R.H. (1998) Classi¢cation and evolution of EF-hand proteins. Biometals 11, 277^295. [9] Nelson, M.R. and Chazin, W.J. (1998) Structures of EF-hand Ca2þ binding proteins: diversity in the organization, packing and response to Ca2þ binding. Biometals 11, 297^318. [10] Lewit-Bentley, A. and Rety, S. (2000) EF-hand calcium-binding proteins. Curr. Opin. Struct. Biol. 10, 637^643. [11] Swan, D.G., Hale, R.S., Dhillon, N. and Leadlay, P.F. (1987) A bacterial calcium-binding protein homologous to calmodulin. Nature 329, 84^85. [12] Xi, C., Schoeters, E., Vanderleyden, J. and Michiels, J. (2000) Symbiosis-speci¢c expression of Rhizobium etli casA encoding a secreted calmodulin-related protein. Proc. Natl. Acad. Sci. USA 97, 11114^ 11119. [13] Aitio, H., Annila, A., Heikkinen, S., Thulin, E., Drakenberg, T. and Kilpelainen, I. (1999) NMR assignments, secondary structure, and global fold of calerythrin, an EF-hand calcium-binding protein from Saccharopolyspora erythraea. Protein Sci. 8, 2580^2588. [14] Yang, K. (2001) Prokaryotic calmodulins : recent developments and evolutionary implications. J. Mol. Microbiol. Biotechnol. 3, 457^459. [15] Michiels, J., Xi, C., Verhaert, J. and Vanderleyden, J. (2002) The functions of Ca2þ in bacteria: a role for EF-hand proteins ? Trends Microbiol. 10, 87^93. Cyaan Magenta Geel Zwart 110 D.J. Rigden et al. / FEMS Microbiology Letters 221 (2003) 103^110 [16] van Asselt, E.J., Dijkstra, A.J., Kalk, K.H., Takacs, B., Keck, W. and Dijkstra, B.W. (1999) Crystal structure of Escherichia coli lytic transglycosylase Slt35 reveals a lysozyme-like catalytic domain with an EF-hand. Struct. Fold. Des. 7, 1167^1180. [17] Vyas, N.K., Vyas, M.N. and Quiocho, F.A. (1987) A novel calcium binding site in the galactose-binding protein of bacterial transport and chemotaxis. Nature 327, 635^638. [18] Petosa, C., Collier, R.J., Klimpel, K.R., Leppla, S.H. and Liddington, R.C. (1997) Crystal structure of the anthrax toxin protective antigen. Nature 385, 833^838. [19] Lytle, B.L., Volkman, B.F., Westler, W.M., Heckman, M.P. and Wu, J.H. (2001) Solution structure of a type I dockerin domain, a novel prokaryotic, extracellular calcium-binding domain. J. Mol. Biol. 307, 745^753. [20] Altschul, S.F., Madden, T.L., Scha¡er, A.A., Zhang, J., Zheng, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSIBLAST - A new generation of protein database search programs. Nucleic Acids Res. 25, 3389^3402. [21] Letunic, I., Goodstadt, L., Dickens, N.J., Doerks, T., Schultz, J., Mott, R., Ciccarelli, F., Copley, R.R., Ponting, C.P. and Bork, P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res. 30, 242^244. [22] Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S.R., Gri⁄ths-Jones, S., Howe, K.L., Marshall, M. and Sonnhammer, E.L. (2002) The Pfam protein families database. Nucleic Acids Res. 30, 276^280. [23] Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997) Identi¢cation of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1^6. [24] Jones, D.T. (1999) Protein secondary structure prediction based on position-speci¢c scoring matrices. J. Mol. Biol. 292, 195^202. [25] Pollastri, G., Przybylski, D., Rost, B. and Baldi, P. (2002) Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and pro¢les. Proteins 47, 228^235. [26] Koonin, E.V. and Galperin, M.Y. (2002) Sequence - Evolution Function. Computational Approaches in Comparative Genomics. Kluwer Academic Publishers, Boston, MA. [27] Brown, N.P., Leroy, C. and Sander, C. (1998) MView: a web-compatible database search or multiple alignment viewer. Bioinformatics 14, 380^381. [28] Hynes, T.R. and Fox, R.O. (1991) The crystal structure of staphylococcal nuclease re¢ned at 1.7 A resolution. Proteins 10, 92^105. [29] Sakamoto, J.J., Sasaki, M. and Tsuchido, T. (2001) Puri¢cation and characterization of a Bacillus subtilis 168 nuclease, YokF, involved in chromosomal DNA degradation and cell death caused by thermal shock treatments. J. Biol. Chem. 276, 47046^47051. [30] Sleytr, U.B. and Beveridge, T.J. (1999) Bacterial S-layers. Trends Microbiol. 7, 253^260. FEMSLE 10901 7-4-03 [31] Kosugi, A., Murashima, K., Tamaru, Y. and Doi, R.H. (2002) Cellsurface-anchoring role of N-terminal surface layer homology domains of Clostridium cellulovorans EngE. J. Bacteriol. 184, 884^ 888. [32] Walker, S.G., Karunaratne, D.N., Ravenscroft, N. and Smit, J. (1994) Characterization of mutants of Caulobacter crescentus defective in surface attachment of the paracrystalline surface layer. J. Bacteriol. 176, 6312^6323. [33] Aravind, L. (1999) An evolutionary classi¢cation of the metallo-betalactamase fold proteins. In Silico Biol. 1, 69^91. [34] Hahn, J., Inamine, G., Kozlov, Y. and Dubnau, D. (1993) Characterization of comE, a late competence operon of Bacillus subtilis required for the binding and uptake of transforming DNA. Mol. Microbiol. 10, 99^111. [35] Jekel, M. and Wackernagel, W. (1995) The periplasmic endonuclease I of Escherichia coli has amino-acid sequence homology to the extracellular DNases of Vibrio cholerae and Aeromonas hydrophila. Gene 154, 55^59. [36] Schindelin, H., Jiang, W., Inouye, M. and Heinemann, U. (1994) Crystal structure of CspA, the major cold shock protein of Escherichia coli. Proc. Natl. Acad. Sci. USA 91, 5119^5123. [37] Weber, M.H., Fricke, I., Doll, N. and Marahiel, M.A. (2002) CSDBase : an interactive database for cold shock domain-containing proteins and the bacterial cold shock response. Nucleic Acids Res 30, 375^378. [38] Sommerville, J. (1999) Activities of cold-shock domain proteins in translation control. Bioessays 21, 319^325. [39] Sigrist, C.J., Cerutti, L., Hulo, N., Gattiker, A., Falquet, L., Pagni, M., Bairoch, A. and Bucher, P. (2002) PROSITE: a documented database using patterns and pro¢les as motif descriptors. Brief Bioinform. 3, 265^274. [40] Grishin, N.V. (2001) Fold change in evolution of protein structures. J. Struct. Biol. 134, 167^185. [41] Denessiouk, K.A., Rantanen, V.V. and Johnson, M.S. (2001) Adenine recognition : a motif present in ATP-, CoA-, NAD-, NADP-, and FAD-dependent proteins. Proteins 44, 282^291. [42] Rodrigues, J.J., Ferreira, H.B., Farias, S.E. and Zaha, A. (1997) A protein with a novel calcium-binding domain associated with calcareous corpuscles in Echinococcus granulosus. Biochem. Biophys. Res. Commun. 237, 451^456. [43] Cox, J.A. and Bairoch, A. (1988) Sequence similarities in calciumbinding proteins. Nature 331, 491. [44] Crooks, G.E., Hon, G., Chandonia, J.M. and Brenner, S.E. (2003) WebLogo: A sequence logo generator. Genome Res. 13, (in press). [45] Sali, A. and Blundell, T.L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779^815. [46] Meador, W.E., Means, A.R. and Quiocho, F.A. (1992) Target enzyme recognition by calmodulin : 2.4 A structure of a calmodulinpeptide complex. Science 257, 1251^1255. Cyaan Magenta Geel Zwart
© Copyright 2026 Paperzz