BMC Evolutionary Biology BioMed Central Open Access Research article Phylogenomics of the archaeal flagellum: rare horizontal gene transfer in a unique motility structure Elie Desmond1, Celine Brochier-Armanet2,3 and Simonetta Gribaldo*1 Address: 1Unite Biologie Moléculaire du Gène chez les Extremophiles, Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France, 2Université de Provence Aix-Marseille I, Marseille, France and 3Laboratoire de chimie bactérienne, Institut de Biologie Structurale et de Microbiologie (CNRS), Marseille, France Email: Elie Desmond - [email protected]; Celine Brochier-Armanet - [email protected]; Simonetta Gribaldo* - [email protected] * Corresponding author Published: 2 July 2007 BMC Evolutionary Biology 2007, 7:106 doi:10.1186/1471-2148-7-106 Received: 28 February 2007 Accepted: 2 July 2007 This article is available from: http://www.biomedcentral.com/1471-2148/7/106 © 2007 Desmond et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: As bacteria, motile archaeal species swim by means of rotating flagellum structures driven by a proton gradient force. Interestingly, experimental data have shown that the archaeal flagellum is non-homologous to the bacterial flagellum either in terms of overall structure, components and assembly. The growing number of complete archaeal genomes now permits to investigate the evolution of this unique motility system. Results: We report here an exhaustive phylogenomic analysis of the components of the archaeal flagellum. In all complete archaeal genomes, the genes coding for flagellum components are colocalized in one or two well-conserved genomic clusters showing two different types of organizations. Despite their small size, these genes harbor a good phylogenetic signal that allows reconstruction of their evolutionary histories. These support a history of mainly vertical inheritance for the components of this unique motility system, and an interesting possible ancient horizontal gene transfer event (HGT) of a whole flagellum-coding gene cluster between Euryarchaeota and Crenarchaeota. Conclusion: Our study is one of the few exhaustive phylogenomics analyses of a noninformational cell machinery from the third domain of life. We propose an evolutionary scenario for the evolution of the components of the archaeal flagellum. Moreover, we show that the components of the archaeal flagellar system have not been frequently transferred among archaeal species, indicating that gene fixation following HGT can also be rare for genes encoding components of large macromolecular complexes with a structural role. Background Motile archaeal species swim by means of rotating flagellum structures driven by a proton gradient force [1,2], as in bacteria [3]. Interestingly, although they are both responsible for swimming, archaeal and bacterial flagella are not homologous, either in terms of overall structure, components and assembly (for a recent review see [4,5]). The bacterial flagellum is a complex rotary structure made up of as much as 20 proteins and composed of three major parts -the basal body, the hook, and the filament. Rotation is provided by an ATPase exploiting a proton gradient force, and can be switched by specific proteins in Page 1 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 response to attractants or repellents in the environment through the chemotaxis system. The filament is a hollow structure about 20 nm in diameter and is composed of a single type of protein called flagellin. Bacterial flagellins are assembled by a complex type III secretion system located in the basal body and are added to the distal tip of the flagellum after passing through the hollow cavity [4]. Much less is known about the archaeal flagellum. It has been extensively studied in terms of components, assembly, and mutation experiments, in Halobacteria and Methanococcales (for recent reviews [4-6]). The archaeal flagellum is a structure thinner than its bacterial counterpart, where at least a filament and a hook are evident [79]. The archaeal flagellum has been shown to have a unique symmetry in Halobacterium salinarium. In fact, it has 3.3 subunits/turn of a 1.9 nm pitch left-handed helix compared to 5.5 subunits/turn of a 2.6 nm pitch righthanded helix for plain bacterial flagellum filaments [10,11]. The archaeal filament can be made up of different types of homologous flagellin proteins (called FlaA or FlaB). The filament is ~10 nm in diameter and is not hollow, resembling more to bacterial type IV pili in this respect [10]. A few other characteristics of archaeal flagella make them more alike bacterial pili than flagella: as bacterial pilins, archaeal flagellins (i) are made as preproteins with short signal peptides that are processed by a recently identified archaeal-specific signal peptidase (called FlaK) [12-14] that shows weak sequence similarity with the bacterial pili leader peptidase PilD, (ii) are likely added at the base of the filament as in bacterial pili, and (iii) undergo glycosylation as post-translational modification [15] (see [5] for a recent review). Moreover, one component of archaeal flagella (FlaI) is homologous to bacterial PilT, an ATPase involved in bacterial pilin export (a type II/IV secretion system) and pilus retraction during twitching motility [16]. However, none of the remaining archaeal flagellum components are homologous to those of bacterial pili [5]. Moreover, bacterial pili are not rotating structures, and no specific anchoring structures have ever been observed, indicating substantial differences between these two cellular structures. A number of putative flagellum accessory genes lie close to flagellin genes in archaeal genomes (called flaC, flaD/ flaE, flaF, flaG, flaH, flaI, and flaJ) [5]. Their putative role in flagellum structure and assembly was tentatively deduced based on their sequences, cellular location, and mutation experiments [4-6]. FlaC, FlaD, FlaH and FlaI are associated with the membrane fraction in Methanococcus voltae and may thus be peripheral components of the archaeal flagellum [5,6]. FlaH, FlaI and FlaJ may be important for the assembly of archaeal flagella and possibly form a secretion complex http://www.biomedcentral.com/1471-2148/7/106 [5,6]. FlaH harbors a domain similar to that found in bacterial RecA-like ATPases, and FlaH mutants are nonmotile and nonflagellated [17]. FlaJ contains many transmembrane domains, while FlaI probably encodes an ATPase that may be important for flagellins export, similarly to the role of its bacterial homologue, the pilin export ATPase PilT, and/or for providing the force for rotation. No experimental data are presently available for FlaG and FlaF, although FlaG may be a component of the anchoring system between the hook and the filament [6]. It may be possible that some of the multiple flagellin proteins have different roles in flagellum substructures other than the filament [6]. Finally, additional components of the archaeal flagellum may be encoded by genes that have not yet been identified. The uniqueness of the archaeal flagellum in terms of components, structure, and assembly indicates that archaeal and bacterial flagella have distinct origins (i.e. they are analogous systems). Interestingly, homologues of most bacterial chemotaxis genes are found in archaeal genomes [5,18], suggesting that archaeal and bacterial chemotaxis systems are evolutionary related. However, their interaction with the flagellum system in Archaea remains largely unknown (for a recent review see [19]). In this work, we sought to contribute to the research on archaeal flagella and archaeal motility in general performing an accurate phylogenomic study (sensu Eisen [20]) of the archaeal motility apparatus in terms of taxonomic distribution of the genes coding for its components, their genomic context, and their phylogeny. This allowed us to sketch a fairly detailed image on the origin and evolution of this macromolecular structure. Results Taxonomic distribution and genomic context The taxonomic distribution of the genes coding for archaeal flagellum components is congruent with that presented in a recent review[5] and is generally consistent with species descriptions [21]. Gene homologues for all components of the archaeal flagellum are found in the complete genomes of Archaea that are described as motile [5] (indicated by M and ● signs in Figure 1). Conversely, no homologues of genes coding archaeal flagellum proteins are found in the complete genomes of Archaea that are described as non motile [5] (indicated by NM and ❍ signs in Figure 1). Although the representative of the Methanosarcina genus (i.e. Methanosarcina mazei, Methanosarcina acetivorans and Methanosarcina barkeri) are described as non-motile [21], at least a full complement of homologues of the genes coding for archaeal flagellum components is present in the complete genomes of these species [5,22] (Red rectangles in Figure 1). Conversely, no homologues of the genes coding for archaeal flagellum components are found in the complete genome of Pyrob- Page 2 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 aculum aerophilum [5] (Yellow rectangle in Figure 1). This is surprising since the genus Pyrobaculum is described as "motile due to flagellation" in the Bergey's manual and a picture is included of a "platinum shadowed cell showing flagella of P. aerophilum" [21]. Similarly, no homologues of the genes coding for archaeal flagellum components are present in the complete genome of Methanopyrus kandleri [5] (Yellow rectangle in Figure 1), which is described as motile in the Bergey's manual [21]. In all archaeal genomes harboring flagellum components, the corresponding genes are always organized into one or two very well conserved clusters [5] (fla clusters, Figure 2). The only exception is the gene coding for the preflagellin peptidase FlaK, which is located close to the fla cluster in Methanococcus jannaschii only [5]. flaK homologues are nevertheless always present, at least in single copy at different locations in the other archaeal genomes, to the notable exception of Aeropyrum pernix, Thermoplasma acidophilum and Thermoplasma volcanium. We verified that these species do not harbor any homologue of PilD -the bacterial prepilin peptidase- that they may have recruited by horizontal gene transfer, and how they cope with the absence of this enzymatic activity (or which non-homologous enzyme performs the function) remains puzzling. A careful observation of gene order within each cluster revealed two types of organizations, that we will hereafter call fla1 and fla2 (Figure 2). fla1 clusters are characterized by the presence of flaC, plus one or few copies of flaD (also annotated as flaE), or by a fusion of flaC and flaD (Figure 2A), whereas fla2 clusters lack these genes (Figure 2B). Nevertheless, psi-Blast searches revealed that the genes (hyp1 and hyp2, Figure 2B) lying between flaB and flaG in fla2 clusters from Methanomicrobia (Methanosarcinales and the Methanomicrobiale Methanospirillum hungatei) and Archaeoglobales may be a very distant homologue of flaD, as already suggested [5]. A second characteristic differentiates fla1 and fla2 clusters: they show an inverted order of flaG and flaF. While the gene order in fla1 clusters is flaB-flaC-flaD-flaF-flaG-flaH-flaIflaJ, fla2 clusters (lacking flaC/D) display the order flaBflaG-flaF-flaH-flaI-flaJ (Figure 2). Interestingly, all the archaeal genomes contain only a single type of fla clusters (i.e. fla1 or fla2), except M. burtonii that is the only species that harbors both type I and II fla clusters. fla1 clusters are present only in Euryarchaeota: Thermococcales (Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii and Thermococcus kodakarensis), Methanococcales (Methanocaldococcus jannashii and Methanococcus maripaludis), Thermoplasmatales (Thermoplasma acidophilum and Thermoplasma volcanium), Halobacteriales (Haloarcula marismortuii, Halobacterium sp. and Natromonas pharaonis), and Methanomicrobia(Methanococcoides burtonii) (Figure 2A); whereas fla2 clusters are present in Crenarchaeota: http://www.biomedcentral.com/1471-2148/7/106 Desulfurococcales (Aeropyrum pernix) and Sulfolobales (Sulfolobus solfataricus Sulfolobus tokodaii Sulfolobus acidocaldarius) and in some Euryarchaeota: Methanomicrobia (Methanospirillum hungatei, Methanococcoides burtonii, Methanosarcina acetivorans, Methanosarcina mazei and Methanosarcina barkeri) and the Archaeoglobale Archaeoglobus fulgidus (Figure 2B). In Halobacteriales and Sulfolobales the clusters also include non-flagellum genes (Figure 2A). In particular, a gene coding for a homologue of a bacterial chemotaxis component (MCP domain signal transducer) is present in Halobacterium sp., and three genes coding for components of the chemotaxis system are present in H. marismortui (CheY, CheA, and CheD) and in N. pharaonis (CheY, CheC, and CheD). In this archaeon, the operon is disrupted, the second half lying at ~50 ORFs downstream (Figure 2A). Interestingly,M. mazei and M. acetivorans each harbor two copies of the fla2 cluster (hereafter called fla2A and fla2B), that differ in the number of flagellin gene copies (two in fla2A and one in fla2B), and in the presence of two different hypothetical genes (possibly very distant homologues of flaD, see above) lying in between the genes coding for FlaB and FlaG (Figure 2B). According to these characteristics, the A. fulgidus, the M. hungatei and one of the M. burtonii gene clusters resemble more to the fla2A cluster, while that from M. barkeri resembles more to the fla2B cluster (Figure 2B). Multiple copies of flagellin genes (flaB/flaA) are found in most archaeal genomes, especially in Thermococcales, whereas Thermoplasmatales and Sulfolobales harbor single gene copies (Figure 2), confirming earlier studies on the composition of the flagella from T. volcanium and S. shibatae [23]. In M. hungatei, the flagellin genes lie in another region of the chromosome (two are clustered together and an additional small one is isolated (Figure 2B)), suggesting a disruption of the original cluster. This is also the case of the fla1 cluster of M. burtonii, which presents no nearby flagellin genes (a single isolated flaB gene was possibly part of this cluster before disruption, Figure 2A and see below). In S. solfataricus a transposase disrupts the gene coding for FlaG (Figure 2B). However, both the N- and C-ter sequences of FlaG are still very similar to FlaG homologues found in S. solfataricus close relatives (e.g. S. acidocaldarius and S. tokodaii), suggesting that the disruption of flaG is recent. Indeed, the sequenced genome of S. solfataricus presents indeed a high number of insertion elements that may have recently invaded this strain [24]. Interestingly, cells of this strain appear non-flagellated under the electron microscope (P. Redder, personal communication) even if the disruption of the flaG gene does not affect the transcription of the downstream operon genes [25]. This suggests that FlaG (possibly involved in the flagellum anchoring system [6]) is an essential flagellum component. Page 3 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 Pyrobaculum aerophilum Picrophilus torridus Ferroplasma acidarmanus Thermoplasma acidophilum Thermoplasma volcanium Archaeoglobus fulgidus Haloarcula marismortui Halobacterium sp Methanospirillum hungatei Methanococcoides burtonii Methanosarcina barkeri Methanosarcina mazei Methanosarcina acetivorans Mesophilic Crenarchaea Desulfurococcales Crenarchaea mésophiles Crenarchaea hyperthermophiles Sulfolobales Nanoarchaeota Thermococcales Methanopyrales Methanobacteriales Methanococcales Euryarchaea Thermoplasmatales Archaeoglobales Halobacteriales Methanomicrobiales Methanosarcinales Methanomicrobia Methanogen Class II Natronomonas pharaonis ° ° • • • • ° • • • • • ° ° • • NM ° NM ° M• M• M• M• M• M• M• M• NM • NM • NM • Methanogen Class I NA M Aeropyrum pernix M Sulfolobus solfataricus M Sulfolobus tokodaii M Sulfolobus acidocaldarius M Nanoarchaeum equitans NM Thermococcus kodakarensis M Thermococcus gammatolerans M Pyrococcus furiosus M Pyrococcus horikoshii M M Pyrococcus abyssi Methanopyrus kandleri M Methanothermobacter thermautotrophicus NM M Methanocaldococcus jannaschii Methanococcus maripaludis M Cenarchaeum symbiosum Schematic sequenced Figure 1 phylogeny genomes are adapted available fromin[29] databases showing the relationship between the main archaeal phyla for which completely Schematic phylogeny adapted from [29] showing the relationship between the main archaeal phyla for which completely sequenced genomes are available in databases. Motility or non-motility by the mean of a flagellum of each organism according to [21] is indicated by M and NM, respectively (NA is used when no information is available). Black circles and open circles indicate the presence or the absence of flagellum components coding gene in the genomes of the considered organisms, respectively. Red rectangles indicate the presence of flagellum component coding genes in organisms described as non-motile whereas yellow rectangles indicate the absence of flagellum component coding genes in organisms described as motile. Finally, additional copies of flagellum components lie in a few instances outside of the clusters (examples are additional flaB genes in M. burtonii, the two Thermoplasmatales, H. marismortui and N. pharaonis; an additional flaG in Halobacterium sp.; an additional flaD in H. marismortui, N. pharaonis, and M. burtonii; an additional flaF in M. maripaludis, and an additional flaK in Methanococcales) (Figure 2). Phylogenetic analysis Phylogenetic analyses were performed on six amino acid sequence datasets corresponding to FlaA/B, FlaD/E, FlaG, FlaH, FlaI, and FlaJ. Phylogenetic analysis of FlaC, FlaF and FlaK could not be performed due to a too restricted phylogenetic distribution of FlaC, and the poor sequence conservation of FlaF and FlaK. FlaG, FlaH, FlaI, FlaJ Among all archaeal flagellum components, FlaH, FlaI and FlaJ are the most conserved at the sequence level and always lie close to each other in all the analyzed genomes (Figure 2) strengthening their likely fundamental role in flagellum assembly and function (see above). FlaI has a number of bacterial homologues belonging to type IV and type II secretion systems, including the typeIV pili component PilT [26]. Moreover, FlaI has additional archaeal homologues that are also probably part of yet to describe secretion machineries [27,28]. In a phylogeny including all these homologues FlaI sequences form a monophyletic group and are most closely related to their archaeal counterparts (not shown). FlaH shares a RecA-like ATPase domain with distant archaeal and bacterial homologues that are not involved in motility structures. Psi-blast Page 4 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 B A Desulfurococcales & Sulfolobales Thermococcales Thermococcus kodakarensis (TK0038-39-40-41-42-43-44-45-46-47-48-49) // +K TK0053 B B B B Pyrococcus furiosus B B B F G H I J B B C D F G H I J B B B C D B C F D G H I J B B C F G H I B J B B I J C E F G H I J E F G H I I J transposase G F H I J Methanomicrobia & Archaeoglobales B K (MMP1666-1667-1668-1669-1670-1671-1672-1673-1674-1675-1676) // +2K MMP0232-MMP0555 +F MMP0342 D H Methanosarcina barkeri (Mbar_A1970-1969-1968-1967-1966-1965-1964) +K Mbar_A1491 (MJ0891-892-893-894-895-896-897-898-899-900-901-902) // +2K MJ0835.2 MJ1282.1 D Methanococcus maripaludis B H (SSO2323-2322-2320-2319-2319-2318-2316-2315-2314) +K SSO0131 Sulfolobus tokodaii (ST2518-2519-2520-2521-2522-2523-2524) +K ST2258 Sulfolobus acidocaldarius (Saci_1178-1177-1176-1175-1174-1173-1172) + K Saci_0139 Methanococcales & Thermoplasmatales Methanocaldococcus jannashii B G G G F transposase (PH0546-548-549-550-551-552-553-553.1n-555-556-557-559) // +K PH0446 Pyrococcus abyssi (PAB1378-1379-1380-1381-1382-1383-1384-1385-1386-1387) // +K PAB1309 B B Sulfolobus solfataricus B B D (PF0338-337-336-335-334-333-332-331-330) // +K PF0471 Pyrococcus horikoshii B C Aeropyrum pernix (APE1907-1905-1904-1903-1901-1899-1898-1896-1895) G F H I J fla2B hyp2 Methanosarcina mazei B J (MM0418-417-416-415-414-413-412) // K MM1678 (MM0323-322-321-320-319-318-317-316) G F H I J fla2B G F H I J fla2A hyp2 Thermoplasma acidophilum B (Ta0553-554-555-556-557a-558-559-560) // +B Ta1407m C D F G H Thermoplasma volcanium (TVN0607-608-609-610-611-612-613-614) B C D F G H I B J // +B TVN1426 I B hyp1 Methanosarcina acetivorans (MA3077-3078-3079-3080-3081-3082-3083) // 2K MA0519-MA4505 (MA3062-3061-3060-3059-3058-3057-3056-3055) B J G F H I J fla2B hyp2 Halobacteriales & Methanomicrobia (rrnAC2198-2197-2195-2194-2193-2192-2191-2190-2187-2186-2184-2183) // +2B pNG1026 -rrnB0018 +D rrnAC1482 + K rrnAC2525 Haloarcula marismortui B C/D Halobacterium sp. B B F G H I J (VNG0962G-961G-960G-959H-958G-955G-954C-953C-950a-950Gm-949G-947G) // +3B VNG1181G-VNG1009-VNG1008+G VNG2567C + K VNG2285C B D Natromononas pharaonis B C/D F G H I J (NP2086A-2088A-2090A-2092A-2094A-2096A-2098A-2100A-2102A-2104A-2106A) // (NP2152A-2154A-2156A-2158A-2160A) // +K NP1276A +D NP2686A B B H B I J B G F H I J fla2A hyp1 Methanospirillum hungatei (Mhun_0100-0101-0102-0103-0104-0105-0106) // +3B Mhun_1238 Mhun_3139 Mhun_3140 + K Mhun_2921 fla2A G G F H I J hyp1 Archaeoglobus fulgidus (AF1055-1054-1053-1052-1051-1050-1049-1048) +K AF0936 B 2 C/D B B G F H I J hyp1 Methanococcoides burtonii (Mbur_0346-0347-0348-0349-0350-0351-0352-0353) fla2A B G F H I J fla2A hyp1 Methanococcoides burtonii (Mbur_1570-1571-1572-1573-1574-1575) // +B Mbur_0104 + D Mbur_1246 + K Mbur_0062 C G H I J Figure Genomic2 organization of the genes coding for flagellum components in complete archaeal genomes (fla clusters) Genomic organization of the genes coding for flagellum components in complete archaeal genomes (fla clusters). Numbers within brackets correspond to the locus tags of each gene. A // sign indicates that the following components are elsewhere in the genome. The genes annotated as hyp1 and hyp2 are probable distant homologues of flaD/E. Genes colored in grey are homologues of chemotaxis components, as discussed in the text. Remaining genes where no name is indicated are annotated as hypothetical. A Genomic organization of type I clusters (fla1). These are found only in Euryarchaea and are characterized by the presence of flaC and flaD/E and by a conserved gene order flaA/B, flaC, flaD, flaF, flaG, flaH, flaI, flaJ. B Genomic organization of the type II clusters (fla2). These are found in Crenarchaea and in some Euryarchaea and are characterized by the absence of flaC and flaD/E and by a conserved gene order flaA/B, flaC, flaD, flaG, flaF, flaH, flaI, flaJ. searches revealed (i) that FlaJ harbours a few distant archaeal homologues annotated as involved in type II secretion and (ii) weak similarities with the bacterial pilus assembly protein TadC and TadB. FlaJ shares the domain GSPII F with TadC and TadB. However, this may not be significant, given that the domain was defined on the basis of an alignment that included both archaeal and bacterial sequences. The similarity between the FlaJ sequences and TadB and TadC sequences is very weak (16% and 35% of identity and similarity with TadB sequences, respectively and 14% and 32% of identity and similarity with TadC sequences, respectively) and is mainly the result of the sharing of small hydrophobic amino acids. To our point of view this sequence similarity is too weak to definitively conclude that these sequences are homologues although this has been claimed [27]. After removal of ambiguously aligned positions, 104, 193, 392 and 353 amino acids could be kept for phylogenetic analysis of FlaG, FlaH, FlaI, and FlaJ, respectively. The resulting trees are strikingly congruent (Figure 3). Notably, all major archaeal groups except Methanomicrobia (Methanomicrobiales plus Methanosarcinales) are well defined and strongly supported statistically (Bootstrap Values -BV- > 990‰ and/or Posterior Probabilities PP- = 1), suggesting that no recent horizontal transfer of flaH, flaG, flaI and flaJ genes occurred across these groups. Page 5 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 The sequences from Methanomicrobia form a well supported cluster (BV > 980‰ and/or Posterior Probabilities -PP- = 1) that also includes sequences from A. fulgidus. It is not possible for the time being to decide whether this is due to a HGT or a hidden paralogy. Importantly, the corresponding trees are also strongly consistent with gene cluster organization. In fact, homologues from fla1 and fla2 clusters (characterized by a flaF-flaG and by a flaG-flaF gene order, respectively) form two distinct strongly groups (BV > 975‰ and PP = 1, Figure 3). In particular, in all four trees, the homologues from the fla2 clusters of M. hungatei, the four Methanosarcinales and A. fulgidus appear close to Crenarchaeota (Figure 3). This is in contrast to their expected position as sister-group of Halobacteriales within Euryarchaeota (Figure 1, [29,30]). Interestingly, such expected position is shown by the M. burtonii sequences belonging to its fla1 cluster (Figure 2A). This suggests that the fla1 and fla2 clusters from M. burtonii have different origins (see below). Independent species-specific duplications of flaG appear to have occurred in Halobacterium sp., N. pharaonis, and M. hungatei (tandem gene duplications in these last two species). Moreover, in the FlaH, FlaI and FlaJ phylogenies, the sequences from M. burtonii, M. hungatei, M. barkeri and A. fulgidus fla2 clusters group with the sequences belonging to the fla2B cluster from M. mazei and M. acetivorans (Figures 3B, 3C and 3D, respectively), supporting a close relationship of these clusters, as suggested by their gene organization (Figure 2B). FlaD/E As discussed above, homologues of FlaD/E genes are missing in all fla2 clusters from Crenarchaea. In fla2 clusters from Methanomicrobia and A. fulgidus the two hypothetical proteins hyp1 and hyp2 could be distantly related to FlaD/E (Figure 2B). However they are too distant to be included in any phylogenetic analysis. The small number of unambiguously aligned positions (77 amino acids positions) that could be kept for phylogenetic analysis gives a poor resolution of the relationships between major euryarchaeal groups (Figure 4A). However, sequences from phyla form monophyletic groups generally well supported (BV = 996‰, PP = 0.93, BV = 1000‰ and BV = 958‰ for Halobacteriales, Methanosarcinales, Thermoplasmatales and Thermococcales, respectively, Figure 4A). This indicates that, similarly to FlaG, FlaI, FlaJ, and FlaH, no recent HGT involving the flaD/E gene occurred among these groups. Interestingly, a tandem duplication event appears to have occurred in an ancestor of Methanococcales, with the two copies having been conserved within the cluster. Halobacteriales also harbor two copies of flaD. One of the two copies is fused with flaC and always resides within the fla cluster, whereas the second copy resides within the fla cluster only in Halobacterium sp. This suggests that the fusion of flaC and flaD genes occurred before http://www.biomedcentral.com/1471-2148/7/106 the divergence of the three Halobacteriales, but after the duplication event and that one of the two copies was displaced after the duplication event in the ancestor of H. marismortui and N. pharaonis. A similar duplication of FlaD followed by a fusion of one of the two copies of FlaD and FlaC also appears to have occurred in M. burtonii. As in most Halobacteriales one of the two copies resides outside the fla cluster (Figure 2A) suggesting its displacement after the duplication event. The poor resolution of the relationships between groups (due to the small number of positions kept for the phylogenetic analysis) does not permit to determine if a single fusion event of FlaC and FlaD occurred in ancestor of Halobacteriales and Methanomicrobia or if this event occurred two times independently in both lineages. FlaB Given the use of only 72 unambiguously aligned positions for analysis, the FlaB tree presents a number of poorly resolved nodes (Figure 4B). Nevertheless, the monophyly of a number of groups is recovered and supported by BV > 850‰, except for Thermococcales, Methanococcales and Methanomicrobia. As for FlaG, FlaI, FlaJ, and FlaH, the FlaB tree is again strongly consistent with gene cluster organization (Figure 4B). In particular, the FlaB from the fla2 clusters of M. hungatei, the four Methanosarcinales and A. fulgidus appear close to Crenarchaeota, while the isolated FlaB from M. burtonii appear close to Halobacteria, and thus likely functions with the flagellum components of fla1 cluster (Figure 4B). Interestingly, the multiple copies of flagellins group in a group-specific manner (Figure 4B), suggesting that flagellin gene family expansion occurred mainly by gene duplication and multiple times independently in different species, and not by HGT. Discussion and conclusion The archaeal flagellum is a complex cellular structure composed of multiple subunits and anchored to the membrane. The striking conservation of the genomic context of the genes coding for these subunits indicates a likely highly coordinated expression and assembly mechanisms. Coupled to the high sequence conservation of the different subunits across archaeal species inhabiting very different habitats, this underlines the importance for structural maintenance of the archaeal flagellum. However, we highlighted an important dimorphism of the genetic context organization, with two types of gene clusters harboring differences in both gene content and gene order (Figure 2). In fact, most Euryarchaea exhibit the fla1 cluster, whereas all the Crenarchaea and some Euryarchaea have the fla2 cluster (e.g. Methanomicrobia and A. fulgidus). The difference between the two clusters is strongly supported by phylogenetic analysis, and indicates that Methanomicrobia and A. fulgidus flagellum Page 6 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 A 788 .93 881 .88 Type II clusters (fla2) G F F G 433 .6 641 .86 722 1 213 .48 977 1 458 999 1 901 1 532 .57 994/1 0.1 C 501 .6 Type II clusters (fla2) G F F G 697/.57 Type I clusters (fla1) 1000 1 539 .99 405 .98 0.1 B 983 1 1000 1 624 .94 892 1 Type I clusters (fla1) Aeropyrum pernix BAA80906 Sulfolobus acidocaldarius AAY80523 Sulfolobus tokodaii BAB67633 Sulfolobus solfataricus Methanospirillum hungatei ABD39880 Methanospirillum hungatei ABD39879 Methanosarcina mazei NP_632344 fla2A 943 Methanosarcina acetivorans AAM06432 497 .98 Methanococcoides burtonii ABE51344 .89 470 Archaeoglobus fulgidus NP_069885 442 .96 .91 Methanosarcina barkeri AAZ70904 1000 Methanosarcina acetivorans AAM06452 1 fla2B 542/.55 Methanosarcina mazei NP_632440 Methanococcoides burtonii EAN01051 Haloarcula marismortui AAV47027 Haloferax volcanii Natronomonas pharaonis YP_326701 784 Natronomonas pharaonis YP_326700 .97 Halobacterium sp NP_444204 478 Halobacterium sp NP_281136 .98 Pyrococcus furiosus NP_578062 Thermococcus kodakarensis YP_182459 Pyrococcus abyssi NP_127164 985 Pyrococcus horikoshii NP_142523 1 Thermoplasma volcanium NP_111130 1000 1 Thermoplasma acidophilum NP_394031 Methanocaldococcus jannaschii NP_247893 Methanococcus maripaludis NP_988793 Methanothermococcus thermolithotrophicus AAG50076 816/1 577/.63 Methanococcus voltae O06640 http://www.biomedcentral.com/1471-2148/7/106 Aeropyrum pernix NP_148247 Sulfolobus acidocaldarius YP_255813 1000 Sulfolobus tokodaii NP_378527 1 731 Sulfolobus solfataricus NP_343681 .69 Methanosarcina barkeri YP_305481 Methanosarcina acetivorans NP_617975 fla2B 1000/1 863/.97 Methanosarcina mazei AAM30109 1000 Methanospirillum hungatei EAP16188 1 Archaeoglobus fulgidus AAB90188 641 Methanococcoides burtonii EAN00027 fla2A .66 903 1 Methanosarcina mazei AAM30013 996 1 1000/1 Methanosarcina acetivorans NP_617949 Methanococcoides burtonii EAN01053 Haloferax volcanii Haloarcula marismortui AAV47025 1000/1 Natronomonas pharaonis YP_326731 540/.85 407/.9 Halobacterium sp NP_279896 Methanocaldococcus jannaschii Q58310 Methanothermococcus thermolithotrophicus AAG50078 999 Methanococcus maripaludis NP_988795 1 998/1 961/1 Methanococcus voltae AAB57833 Thermoplasma acidophilum NP_394033 1000/1 Thermoplasma volcanium NP_111132 Thermococcus kodakarensis YP_182461 Pyrococcus furiosus NP_578060 999/1 Pyrococcus horikoshii NP_142525 923/1 965/1 Pyrococcus abyssi NP_127162 985 1 885 1 Type II clusters (fla2) G F F G 639 .96 Type I clusters (fla1) 1000 1 904 1 353 .73 0.1 .61 D 654 .73 Type II clusters (fla2) G F F G Type I clusters (fla1) 1000 1 872 .99 823 1 442 .58 0.1 Aeropyrum pernix NP_148248 Sulfolobus acidocaldarius YP_255814 Sulfolobus tokodaii NP_378526 998 Sulfolobus solfataricus NP_343682 1 Methanosarcina barkeri YP_305482 1000 1 Methanosarcina mazei AAM30110 fla2B 734 Methanosarcina acetivorans NP_617974 .83 Methanospirillum hungatei EAP16187 Archaeoglobus fulgidus AAB90187 992 Methanococcoides burtonii EAN00028 1 fla2A Methanosarcina mazei AAM30014 1000/1 999/1 Methanosarcina acetivorans NP_617950 Methanococcoides burtonii EAN01052 Haloferax volcanii 1000 Halobacterium sp NP_444203 1 Haloarcula marismortui AAV47026 731/.85 887/1 Natronomonas pharaonis YP_326730 Methanocaldococcus jannaschii Q58309 999 Methanothermococcus thermolithotrophicus AAG50077 1 Methanococcus maripaludis NP_988794 939 1 979 Methanococcus voltae O06641 1 Thermoplasma volcanium NP_111131 1000 1 Thermoplasma acidophilum NP_394032 Pyrococcus horikoshii NP_142524 628/.8 Pyrococcus abyssi NP_127163 Pyrococcus furiosus NP_578061 1000/1 607 Thermococcus kodakarensis YP_182460 1000 1 Aeropyrum pernix NP_148246 Sulfolobus solfataricus AAK42470 Sulfolobus tokodaii BAB67637 1000/1 320/.75 Sulfolobus acidocaldarius AAY80519 Methanosarcina barkeri YP_305480 Methanosarcina mazei AAM30108 1000/1 fla2B 884/1 Methanosarcina acetivorans NP_617976 1000 Methanospirillum hungatei EAP16189 1 Archaeoglobus fulgidus AAB90189 421 .69 Methanococcoides burtonii EAN00026 fla2A 1000 1 1000/1 Methanosarcina mazei AAM30012 1000/1 Methanosarcina acetivorans NP_617948 Halobacterium sp NP_279895 Haloferax volcanii 1000/1 Natronomonas pharaonis YP_326732 803/.99 694/.99 Haloarcula marismortui AAV47024 Methanococcoides burtonii EAN01054 Methanocaldococcus jannaschii NP_247896 Methanothermococcus thermolithotrophicus AAG50079 1000 Methanococcus maripaludis NP_988796 1 1000 1 998 Methanococcus voltae AAC19121 1 Thermoplasma volcanium NP_111133 Thermoplasma acidophilum NP_394034 1000/1 Thermococcus kodakarensis YP_182462 Pyrococcus furiosus NP_578059 1000/1 Pyrococcus horikoshii NP_142526 890/1 978/1 Pyrococcus abyssi NP_127161 Figure Unrooted 3 maximum likelihood phylogenetic trees of FlaG (A), FlaH (B), FlaI (C) and FlaJ (D) Unrooted maximum likelihood phylogenetic trees of FlaG (A), FlaH (B), FlaI (C) and FlaJ (D). Numbers at nodes indicate bootstrap values for 1000 replicates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respectively. The scale bar represents the average number of substitutions per site. The phylogenies show a clear separation between the type I clusters (characterized by a FlaF FlaG order of genes) and type II clusters (characterized by a FlaG FlaF order of genes). components encoded by fla2 gene clusters are more closely related to their crenarchaeal homologues than to the homologues encoded by the fla1 gene cluster of their close relatives (i.e. M. burtonii, Thermoplasmatales and Halobacteriales, Figures 1, 3 and 4). Two different evolutionary scenarios can be proposed to explain our results. In the first scenario (Figure 5), the last ancestor of Archaea was flagellated and had the two types of clusters (i.e. both fla1 and fla2, blue and red clusters, respectively, Figure 5), resulting from the duplication of an ancestral cluster (purple cluster, Figure 5), and these were secondarily and differently lost during archaeal lineages evolution (seven losses of the fla1 cluster and nine losses of the fla2 cluster). Importantly, some of these losses would have also occurred recently in the Euryarchaea (for example the loss of the fla1 cluster in the Methanosarcinale lineage would have occurred after the divergence of M. burtonii, that would have kept the two types of clusters). Finally, a duplication event of the whole fla2 cluster occurred in an ancestor of Methanosarcina (red circle, Figure 5) leading to the fla2B cluster (orange cluster, Figure 5). This first scenario involves 16 losses, and implies that the ancestor of Archaea and the ancestor of each euryarchaeal group had two types of flagella. Moreover, it does not explain the position of A. fulgidus sequences within the Methanomicrobia group and not as sister of this group, as generally indicated by molecular phylogeny of multiple markers (Figure 1 and [30]). A second scenario involves less losses (three losses of the fla2 cluster and seven losses of the fla1 cluster) (Figure 6 and Figure 7 for a more detailed scenario). Here, the ancestor of Archaea was also flagellated, but had only a single type of fla cluster (either fla1, fla2, or else, purple cluster in Figure 6 and Figure 7). After the divergence of Crenarchaea and Euryarchaea (black circle, Page 7 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 A Sulfolobus acidocaldarius YP_255818 B cluster Sulfolobus tokodaii NP_378522 B cluster Sulfolobus solfataricus NP_343686 B cluster Aeropyrum pernix NP_148254 B B cluster 977/1 Aeropyrum pernix NP_148253 B B cluster Archaeoglobus fulgidus O29207 B cluster fla2A 547 M. hungatei EAP15115 B 800/1 868/1 Methanospirillum hungatei EAP15114 B Methanosarcina acetivorans NP_617955 cluster B 242/.58 Methanosarcina mazei AAM30019 cluster B Methanococcoides burtonii EAN00033 cluster B 205/.68 620/.93 Methanococcoides burtonii EAN00032 cluster B B fla2A 768 .84 100/.54 Methanosarcina acetivorans NP_617954 cluster B cluster B B 924/1Methanosarcina mazei AAM30018 334/.96 cluster Archaeoglobus fulgidus O29208 B B cluster Methanosarcina acetivorans NP_617970 B 512/.97 Type II clusters (fla2) cluster Methanosarcina barkeri YP_305486 B fla2B 1000/1 G F cluster 969/.97 Methanosarcina mazei AAM30114 B Methanococcoides burtonii EAM99976 B F G Natronomonas pharaonis YP_326696 cluster B B B Type I Natronomonas pharaonis YP_326697 B B B cluster 1000/1 601 clusters 945 Natronomonas pharaonis YP_326695 B B B cluster 1 .81 (fla1) Natrialba magadii O93719 974 Natrialba magadii Q9P9I3 1 744/.9 Natrialba magadii O93718 826 990/1 Natrialba magadii Q9P9I1 1 Haloarcula marismortuii AAV48232 B 782 .98 Haloarcula marismortuii AAV44283 530/.8 993/1 Haloarcula marismortuii AAV47035 cluster 400 Halobacterium salinarum AAA72644 876 B cluster 1000 .77 .97 .98 Halobacterium salinarum AAA72641 B cluster 1000/1 Halobacterium sp s NP_279945 334/.31 Halobacterium sp s NP_279905 716/.53 Halobacterium salinarum AAA72643 B B B cluster Methanococcus voltae AAA73073 Pyrococcus abyssi NP_127170 cluster B B B 317/Thermoplasma volcanium NP_111126 cluster B 940 998 Thermoplasma acidophilum NP_394027 cluster B 1 1 345 Thermoplasma volcanium NP_111945 B 984/1 Thermoplasma acidophilum NP_394861 B Methanocaldococcus jannaschii NP_247886 cluster B B B 285 Methanocaldococcus jannaschii NP_247887 cluster B B B 786 Methanothermococcus thermolithotrophicus 504 Methanococcus voltae AAA73074 Methanococcus maripaludis AAG50058 968 167 Methanococcus maripaludis NP_988787 318 cluster B B B 149 Methanococcus vannielii AAC27726 497 Methanothermococcus thermolithotrophicus 626 Methanococcus voltae AAA73075 14 Methanococcus vannielii AAC27725 238 Methanococcus maripaludis AAG50057 Methanococcus maripaludis NP_988786 998 B B B cluster Methanococcus voltae AAA73076 Methanococcus vannielii AAC27727 986 5 Methanothermococcus thermolithotrophicus 404 Methanocaldococcus jannaschii Q58303 cluster B B B 564 382 Methanococcus maripaludis AAG50059 979 Methanococcus maripaludis NP_988788 473 B B B cluster Thermococcus kodakarensis BAA84106 cluster B B B B Pyrococcus horikoshii NP_142515 cluster B B B B B Pyrococcus horikoshii NP_142517 558 cluster B B B B B 321 Thermococcus kodakarensis BAA84105 154 B B B B cluster Pyrococcus horikoshii NP_142516 253 cluster B B B B B 304 Thermococcus kodakarensis BAA84107 B B B B cluster 179 Pyrococcus abyssi Q9UYL3 B B B cluster 922 Thermococcus kodakarensis BAA84108 cluster B B B B 885 1000 Pyrococcus furiosus NP_578067 cluster B B Pyrococcus horikoshii NP_142518 520 cluster B B B B B Pyrococcus furiosus NP_578066 cluster B B 0.1 Pyrococcus horikoshii NP_142519 cluster 345 B B B B B Pyrococcus abyssi Q9UYL4 297 cluster B B B 270 Thermococcus kodakarensis BAA84109 cluster B B B B B B B C D F G H I J B B C D F G H I J B B B C D F G H I J B B B C D F G H I J Pyrococcus furiosus NP_578064 B B B 958 Pyrococcus horikoshii NP_142521 1 874 Pyrococcus abyssi NP_127166 .76 576 .37 Thermococcus kodakarensis YP_182457 B B Methanothermococcus thermolithotrophicus AAG50074 Methanocaldococcus jannaschii Q58306 962 1 706 .71 Methanococcus maripaludis NP_988791 900 .99 546 .73 D E F G H I J B B B C D E F G H I J B B B C D D F G H I J B B B C D D F G H I J Methanococcus voltae O06638 Methanocaldococcus jannaschii Q58305 Methanococcus maripaludis NP_988790 870 1 B B B C 934 1 452 .6 1000 1 Methanococcus voltae O06637 Methanothermococcus thermolithotrophicus AAG50073 Thermoplasma volcanium NP_111128 B C D F G H I J Thermoplasma acidophilum NP_394029 B C D F G H I J C/D F G H I J D F G H I J C C/D F G H I Methanococcoides burtonii ABE52168 406 .56 782 .93 D Methanococcoides burtonii EAN01049 Halobacterium sp NP_279899 498 .7 B B B Natronomonas pharaonis D YP_326993 996 .7 Haloarcula marismortui 454 1 AAV47029 610 .99 328 .67 B Natronomonas pharaonis YP_326729 C/D H I J J Haloarcula marismortui AAV A 46407 0.1 508 .88 Halobacterium sp s B B B NP_279900 428 .63 Haloferax volcanii C/D F G H I J 997/1 987/.99 Figure A. Unrooted 4 maximum likelihood phylogenetic trees of FlaD/E A. Unrooted maximum likelihood phylogenetic trees of FlaD/E. Numbers at nodes indicate bootstrap values for 1000 replicates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respectively. The scale bar represents the average number of substitutions per site. The light blue circles indicate duplication events. B. Unrooted maximum likelihood phylogenetic trees of FlaA/B. Numbers at nodes indicate bootstrap values for 1000 replicates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respectively. The scale bar represents the average number of substitutions per site. Detailed cluster organizations are not shown. White arrows are used to schematize the clusters. The phylogenies show a clear separation between the type I clusters (characterized by a FlaF FlaG order of genes) and type II clusters (characterized by a FlaG FlaF order of genes). Figure 6 and Figure 7), evolution led to the fla1 and fla2 clusters. A single HGT of the fla2 cluster would have then occurred from some ancestors of Sulfolobales and Desulfurococcales to an ancestor of Methanomicrobia (Figure 6 Figure 7, HGT 1), and was followed by a HGT from some ancestors of Methanosarcinales to A. fulgidus (Figure 6 Figure 7, HGT 2). Methanosarcina, M. hungatei and A. fulgidus would thus have lost their original fla1 gene cluster before or after their replacement by a fla2 gene cluster of crenarchaeal origin. We favor a HGT from Crenarchaea to Methanomicrobia rather than the opposite, since in this case we would expect to find the M. burtonii fla1 genes more closely related to their paralogues belonging to the fla2 cluster than to their fla1 orthologues from Halobacteriales. Both scenarios imply that the archaeal flagellum would have appeared prior to the last archaeal ancestor Apart these two likely cases of HGT of the entire fla2 gene cluster, we found no clear evidence for recent transfers of the genes coding for flagellum components among archaeal species. Indeed, the poor resolution of interphyla relationships in some trees is more likely due to lack of phylogenetic signal rather than horizontal gene transfer. One explanation for the rarity of HGT is that it is pos- Page 8 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 Mesophilic Crenarchaea Cenarchaeum symbiosum Pyrobaculum aerophilum Desulfurococcales Aero r pyr yrum pern r iix Sulfo f lobus solfa f tari r cus Type I cluster T pe II cluster Sulfolobales Sulfo f lobus tokodaiii Sulfo f lobus acidocaldarius Nanoarchaeota Nanoarc r haeum equitans Therm T r ococcus kodaka k rrensis i Therm T r ococcus gammatolera r ns Thermococcales Pyrococcus furiosus Pyrococcus horikoshii Pyyrococcus abys P y si Meth t anoth t erm r obacter therm r autotr trophicus Meth t anocaldococcus jjannaschiii Methanococcus maripaludis Picrophilus torridus Methanogen class I Meth t anopyr yrus kkandl d eri r Ferroplasma acidarmanus Methanopyrales Methanobacteriales Methanococcales Thermoplasmatales Thermoplasma acidophilum Thermoplasma volcanium Archaeoglobales Archaeoglobus fulgidus Haloarcula marismortui Halobacteriales Halobacterium sp Natronomonas pharaonis Methanococcoides burtonii Methanosarcina barkeri Methanosarcina mazei Methanosarcina acetivorans Methanomicrobia Methanogen class II Methanospirillum hungatei Methanomicrobiales Methanosarcinales Figure An evolutionary 5 scenario for the origin and evolution of the archaeal flagellum An evolutionary scenario for the origin and evolution of the archaeal flagellum. Blue, red, orange and purple clusters represent fla1 cluster, fla2A cluster, fla2B cluster and their ancestor, respectively. The black circle represents the last common ancestor of Archaea. The green circle represents the duplication event that led to the fla1 and fla2 clusters. This duplication event occurred before the last common ancestor of Archaea, which thus harbored the two types of clusters. The red circle represents the recent duplication event of the fla2 cluster in ancestor of Methanosarcina. The blue and red crosses represent the loss of the fla1 and fla2A clusters, respectively. sible the result of the high level of integration of flagellum components within the macromolecular structure. Importantly, this contradicts the generally assumed notion that HGT of "informational" genes (i.e. those coding for proteins involved in the expression and the transmission of genetic information) are less frequent than those involving the remaining ones ("operational") genes. Nevertheless, the acquisition of a whole flagellum component coding gene cluster from distant donors seems possible. Interestingly, even if two clusters coexist within a genome (i.e. in M. burtonii) neither gene recombination is observed between homologous genes belonging to the two clusters, nor losses, suggesting that the components of a cluster may interact preferentially due to coevolution, although they form similar cellular structures. The presence in M. burtonii of two types of fla clusters (possibly one native and one acquired by HGT from crenarchaeota) represents an interesting experimental model to study. It would be in fact particular interesting to know the difference between the components encoded by the two fla clusters in terms of expression and assembly, and how two different flagellum systems coexist in this archaeon. Finally, it has been suggested that archaea are modified Actinobacteria and that archaeal flagella are derived from bacterial flagella following an adaptation to acidic environments by the recruitment of an already acid-stable glycoprotein from the pilus machinery that would have replaced the original flagellin while leaving intact the basal rotary motor [31]. We find this hypothesis unlikely for the fact that archaeal flagella do not resemble to bacterial flagella in major structure, assembly, and components, and not only concerning flagellin. Indeed, no even distant homologues to any component, including the basal rotary motor, of bacterial flagella can be recovered in archaeal genomes, including the related type III secretion system components [28]. Moreover, flagella of acidophilic bacteria (such as Thiobacillus) show a typical bacterial structure (e.g. a diameter of approximately 20 nm), so they adapted to acidic conditions without radically modi- Page 9 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 Type II cluster Mesophilic Crenarchaea Cenarchaeum symbiosum Pyrobaculum aerophilum Desulfurococcales Aeropyrum pernix Sulfolobus solfataricus Sulfolobales Sulfolobus tokodaii Sulfolobus acidocaldarius Nanoarchaeota Nanoarchaeum equitans Thermococcus kodakarensis Therm T r ococcus gammatolerans Ancestral cluster Thermococcales Pyyrococcus ffurriosus P Pyyrococcus hori P rikoshiii Pyyrococcus abyssi P Meth t anopyrus kandleri Meth t anoth t erm r obacter therm r autotr trophicus Methanocaldococcus jannaschii Methanococcus maripaludis Picrophilus torridus Methanogen class I Type I cluster Ferroplasma acidarmanus Methanopyrales Methanobacteriales Methanococcales Thermoplasmatales Thermoplasma acidophilum Thermoplasma volcanium Archaeoglobales Archaeoglobus fulgidus Haloarcula marismortui Halobacteriales Halobacterium sp HGT 1 Natronomonas pharaonis Methanospirillum hungatei Methanococcoides burtonii Methanosarcina barkeri Methanosarcina mazei Methanosarcina acetivorans Methanomicrobia Methanogen class II HGT 2 Methanomicrobiales Methanosarcinales Figure An evolutionary 6 scenario for the origin and the evolution of the archaeal flagellum An evolutionary scenario for the origin and the evolution of the archaeal flagellum. Blue, red, orange and purple clusters represent fla1 cluster, fla2A cluster, fla2B cluster and their ancestor, respectively. The black circle represents the last common ancestor of Archaea. The red circle represents the recent duplication event of the fla2 cluster in ancestor of Methanosarcina. The blue and red crosses represent the loss of the fla1 and fla2A clusters, respectively. The green arrows represent horizontal gene transfers. fying their motility structures and components [28]. Indeed, the uncomplete genome of the extreme acidophilic bacterium (optimal pH<3) Acidobacterium capsulatum contains a clear homologue of bacterial flagellin, indicating that adaptation to acidic environments was possible without its replacement. The lack of congruence between the description of archaeal species as motile or non-motile and the taxonomic distribution of homologues of flagellum component coding genes[5] is particularly striking and underlines the need to explore more in depth the motility systems in Archaea. The presence of two gene clusters in non motile Methanosarcinales is particularly puzzling. Either these species can be flagellated under particular conditions that have not yet been tested, or the flagellum component homologues are involved in other functions than motility (for example, they could be required for cellcell adhesion to form cell aggregates, a peculiarity of this archaeal family). It will be extremely interesting to study the expression and localization of the flagellum components in Methanosarcinales, in the light of the fact that the flagellum genes of Methanosarcinales may have been recruited from the distantly related Crenarchaeota, since this event may have been an important step in the evolution of this archaeal lineage. Moreover, virtually nothing is known about other types of motility than swimming in archaea [28], while in bacteria these are starting being investigated [4]. The fact that no homologues of flagellum components are encoded in the genomes of at least two archaeal species that are described as motile (M. kandleri and P. aerophilum) is also puzzling. Although it is possible that the strains used for genome sequencing have lost the flagellum operon following lab cultivation (see for example [32], it would surely be interesting to test motility in these archaeal species. Alternatively, this observation could also suggest that other types of motility may occur in archaea and are made possible by still unknown molecular structures. It would also be very useful to investigate the relationship between the flagellum and the secretion Page 10 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106 C. symbiosum Type II cluster B D/E G F H P. aerophilum I J A. pern r iix B G F H I G J H I G B B G F H I B I J I J H I J B S. acido d calda d rrius J H F I J S. solfa f ttarricus B J G S. toko daiii k d B F G F H I J N. equita N t ns J B Ancestral cluster B B B C D F G H I J T. k kod dak karrensis T i B B B B B C D F G H I J B B C D F G H I J B B B C D F G H I J B B B C D F G H I J P. furiosus B B B C D F G H I J P. horikoshii B B B C D F G H I J P. abyssi B B M. kandleri B D/E F G H I J M. thermautotrophicus C B Type I cluster B B C D D F G H I B B C D D F G H I J J M. jjannaschii i M. mari ripaludi dis B B B C D D F F G H I J torr rridus B B B C D E F G H I J B F B B C D E F G H I J acida d rrmanus HGT 1 B B C D F H G I J T. acido T d phil i um B B C D F G H I J T. volcanium B B C D F G H I J B B G F H I J F G H I J A. fulgidus H. volcanii H i B B B D HGT 2 B F B C G H I B D CD G F F B F B D F I D/E C D B G G H I G H H I I B B G G F H I J H. mari H rismort r ui J F J B B CD G J G H I J J B D B B CD H. sp D CD B D G M. hungate t iB B B I B CD B B B D B M. barkeri F G H I J G F H I J B B N. phara N r onis i M. burt r onii i J B M. mazei B G F H B G F H I J I J M. acetivorans B B B B B B J I J G G F H I J CD F G G F H H I I J J G F H I J G F H I J G G F F H I J H I J G F H I J A Figure detailed 7 evolutionary scenario for the origin and the evolution of the archaeal flagellum based on figure 6 A detailed evolutionary scenario for the origin and the evolution of the archaeal flagellum based on figure 6. The purple cluster represents the ancestor of fla1 and fla2 clusters. The black circle represents the last common ancestor of Archaea. Blue arrows represent the recent duplication events. Black arrows indicate gene movements to different locations from their original positions in the cluster. Dark green { symbols indicate gene insertions within the clusters. The blue, red and black crosses represent the loss of the fla1 cluster, fla2A cluster, or of single genes, respectively. A red arrow indicates gene inversion. F indicates the fusion of FlaC and FlaD. The green arrows represent horizontal gene transfers. systems in Archaea. Indeed, archaeal genomes harbor only a few homologues of bacterial TypeII/IV secretion systems [28], and it is not known whether they form pili, despite rare observations [33-35] and evidence for conjugation [36-38]. To sum up, two radically different options for motility were adopted at the divergence of the two prokaryotic domains, and much still remain to be uncovered on archaeal motility systems. Methods Data set construction The list of archaeal flagellum components was retrieved from the Kyoto Encyclopedia of Genes and Genomes [39]. Homologous sequences of each archaeal flagellum com- ponent were retrieved from the nr database at the National Center for Biotechnology Information [40] using the BlastP program with different seeds [41] and the ALIBABA program (P. Lopez personal communication). For each dataset, additional searches using psi-Blast were performed to search for divergent homologues (especially from bacteria) [41]. tBlastN searches on the unfinished genomes of the Halobacteriale Haloferax volcanii were performed at TIGR [42]. Multiple alignments were done with ClustalW [43] and MUSCLE [44]. For each dataset, the quality of the alignments obtained with CLUSTALW and MUSCLE, was evaluated with T-COFFEE (CLUSTALW and MUSCLE provided alignments of comparable quality, not shown) [45]. All the alignments were edited and manually improved using the ED program from the MUST package [46]. Regions where the homology between sites was Page 11 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 doubtful were manually removed from the datasets for phylogenetic analyses. Phylogenetic analysis Maximum Likelihood (ML) phylogenetic trees were computed with Phyml [47,48] and the JJT model of amino acid substitution (Jones, Taylor and Thornton [49]. A gamma correction with 4 discrete classes of sites was used to take into account the heterogeneity of evolutionary rates across sites. The alpha parameter and the proportion of invariable sites were estimated for each dataset. The robustness of each branch was estimated by non-parametric bootstrap analysis (1000 replicates) using PHYML. In addition, bayesian analyses were performed using MrBayes [50] with a mixed model of amino acid substitution. As for ML tree reconstruction, a gamma correction with 4 discrete classes of sites was used and the alpha parameter and the proportion of invariable sites were estimated. MrBayes was run with four chains for 1 million generations and trees were sampled every 100 generations. To construct the consensus tree, the first 1500 trees were discarded as "burnin". Genomic context analysis The genomic localization of each archaeal flagellum component was manually investigated in all archaeal complete genomes available at the NCBI. http://www.biomedcentral.com/1471-2148/7/106 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. Authors' contributions SG conceived the study and supervised the analyses. ED carried out the analyses. CB and SG refined and completed the analyses and wrote the manuscript. All authors read and approved the final manuscript. References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Alam M, Oesterhelt D: Morphology, function and isolation of halobacterial flagella. J Mol Biol 1984, 176(4):459-475. Marwan W, Alam M, Oesterhelt D: Rotation and switching of the flagellar motor assembly in Halobacterium halobium. J Bacteriol 1991, 173(6):1971-1977. Macnab RM: The bacterial flagellum: reversible rotary propellor and type III export apparatus. J Bacteriol 1999, 181(23):7149-7153. Bardy SL, Ng SY, Jarrell KF: Prokaryotic motility structures. Microbiology 2003, 149(Pt 2):295-304. Ng SY, Chaban B, Jarrell KF: Archaeal flagella, bacterial flagella and type IV pili: a comparison of genes and posttranslational modifications. J Mol Microbiol Biotechnol 2006, 11(3-5):167-191. Thomas NA, Bardy SL, Jarrell KF: The archaeal flagellum: a different kind of prokaryotic motility structure. FEMS Microbiol Rev 2001, 25(2):147-174. Faguy DM, Koval SF, Jarrell KF: Physical characterization of the flagella and flagellins from Methanospirillum hungatei. J Bacteriol 1994, 176(24):7491-7498. Metlina AL: Bacterial and archaeal flagella as prokaryotic motility organelles. Biochemistry (Mosc) 2004, 69(11):1203-1212. Cruden D, Sparling R, Markovetz AJ: Isolation and Ultrastructure of the Flagella of Methanococcus thermolithotrophicus and Methanospirillum hungatei. Appl Environ Microbiol 1989, 55(6):1414-1419. Cohen-Krausz S, Trachtenberg S: The structure of the archeabacterial flagellar filament of the extreme halophile Halobacterium salinarum R1M1 and its relation to eubacterial 23. 24. 25. 26. 27. 28. 29. 30. 31. flagellar filaments and type IV pili. J Mol Biol 2002, 321(3):383-395. Trachtenberg S, Cohen-Krausz S: The archaeabacterial flagellar filament: a bacterial propeller with a pilus-like structure. J Mol Microbiol Biotechnol 2006, 11(3-5):208-220. Bardy SL, Jarrell KF: FlaK of the archaeon Methanococcus maripaludis possesses preflagellin peptidase activity. FEMS Microbiol Lett 2002, 208(1):53-59. Bardy SL, Jarrell KF: Cleavage of preflagellins by an aspartic acid signal peptidase is essential for flagellation in the archaeon Methanococcus voltae. Mol Microbiol 2003, 50(4):1339-1347. Albers SV, Szabo Z, Driessen AJ: Archaeal homolog of bacterial type IV prepilin signal peptidases with broad substrate specificity. J Bacteriol 2003, 185(13):3918-3925. Wieland F, Paul G, Sumper M: Halobacterial flagellins are sulfated glycoproteins. J Biol Chem 1985, 260(28):15180-15185. Mattick JS: Type IV pili and twitching motility. Annu Rev Microbiol 2002, 56:289-314. Thomas NA, Pawson CT, Jarrell KF: Insertional inactivation of the flaH gene in the archaeon Methanococcus voltae results in non-flagellated cells. Mol Genet Genomics 2001, 265(4):596-603. Rudolph J, Oesterhelt D: Deletion analysis of the che operon in the archaeon Halobacterium salinarium. J Mol Biol 1996, 258(4):548-554. Szurmant H, Ordal GW: Diversity in chemotaxis mechanisms among the bacteria and archaea. Microbiol Mol Biol Rev 2004, 68(2):301-319. Eisen JA: A phylogenomic study of the MutS family of proteins. Nucleic Acids Res 1998, 26(18):4291-4300. Garrity GM: Bergey's Manual of Systematic Bacteriology. 2nd edition. Edited by: Garrity GM. New York , Springer-Verlag; 2001. Galagan JE, Nusbaum C, Roy A, Endrizzi MG, Macdonald P, FitzHugh W, Calvo S, Engels R, Smirnov S, Atnoor D, Brown A, Allen N, Naylor J, Stange-Thomann N, DeArellano K, Johnson R, Linton L, McEwan P, McKernan K, Talamas J, Tirrell A, Ye W, Zimmer A, Barber RD, Cann I, Graham DE, Grahame DA, Guss AM, Hedderich R, Ingram-Smith C, Kuettner HC, Krzycki JA, Leigh JA, Li W, Liu J, Mukhopadhyay B, Reeve JN, Smith K, Springer TA, Umayam LA, White O, White RH, Conway de Macario E, Ferry JG, Jarrell KF, Jing H, Macario AJ, Paulsen I, Pritchett M, Sowers KR, Swanson RV, Zinder SH, Lander E, Metcalf WW, Birren B: The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res 2002, 12(4):532-542. Faguy DM, Bayley DP, Kostyukova AS, Thomas NA, Jarrell KF: Isolation and characterization of flagella and flagellin proteins from the Thermoacidophilic archaea Thermoplasma volcanium and Sulfolobus shibatae. J Bacteriol 1996, 178(3):902-905. Redder P, Garrett RA: Mutations and rearrangements in the genome of Sulfolobus solfataricus P2. J Bacteriol 2006, 188(12):4198-4206. Albers SV, Driessen AJ: Analysis of ATPases of putative secretion operons in the thermoacidophilic archaeon Sulfolobus solfataricus. Microbiology 2005, 151(Pt 3):763-773. Planet PJ, Kachlany SC, DeSalle R, Figurski DH: Phylogeny of genes for secretion NTPases: identification of the widespread tadA subfamily and development of a diagnostic key for gene classification. Proc Natl Acad Sci U S A 2001, 98(5):2503-2508. Peabody CR, Chung YJ, Yen MR, Vidal-Ingigliardi D, Pugsley AP, Saier MH Jr.: Type II protein secretion and its relationship to bacterial type IV pili and archaeal flagella. Microbiology 2003, 149(Pt 11):3051-3072. Albers SV, Szabo Z, Driessen AJ: Protein secretion in the Archaea: multiple paths towards a unique cell surface. Nat Rev Microbiol 2006, 4(7):537-547. Gribaldo S, Brochier-Armanet C: The origin and evolution of Archaea: a state of the art. Philos Trans R Soc Lond B Biol Sci 2006, 361(1470):1007-1022. Brochier C, Forterre P, Gribaldo S: An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences. BMC Evol Biol 2005, 5(1):36. Cavalier-Smith T: The neomuran origin of archaebacteria, the negibacterial root of the universal tree and bacterial megaclassification. Int J Syst Evol Microbiol 2002, 52(Pt 1):7-76. Page 12 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. Labes A, Schonheit P: Sugar utilization in the hyperthermophilic, sulfate-reducing archaeon Archaeoglobus fulgidus strain 7324: starch degradation to acetate and CO2 via a modified Embden-Meyerhof pathway and acetyl-CoA synthetase (ADP-forming). Arch Microbiol 2001, 176(5):329-338. Leadbetter JR, Breznak JA: Physiological ecology of Methanobrevibacter cuticularis sp. nov. and Methanobrevibacter curvatus sp. nov., isolated from the hindgut of the termite Reticulitermes flavipes. Appl Environ Microbiol 1996, 62(10):3620-3631. Miroshnichenko ML, Gongadze GM, Rainey FA, Kostyukova AS, Lysenko AM, Chernyh NA, Bonch-Osmolovskaya EA: Thermococcus gorgonarius sp. nov. and Thermococcus pacificus sp. nov.: heterotrophic extremely thermophilic archaea from New Zealand submarine hot vents. Int J Syst Bacteriol 1998, 48 Pt 1:23-29. Weiss RL: Attachment of bacteria to sulfur in extreme environments. J Gen Microbiol 1973, 77:501-507. Prangishvili D, Albers SV, Holz I, Arnold HP, Stedman K, Klein T, Singh H, Hiort J, Schweier A, Kristjansson JK, Zillig W: Conjugation in archaea: frequent occurrence of conjugative plasmids in Sulfolobus. Plasmid 1998, 40(3):190-202. Rosenshine I, Tchelet R, Mevarech M: The mechanism of DNA transfer in the mating system of an archaebacterium. Science 1989, 245(4924):1387-1389. Stedman KM, She Q, Phan H, Holz I, Singh H, Prangishvili D, Garrett R, Zillig W: pING family of conjugative plasmids from the extremely thermophilic archaeon Sulfolobus islandicus: insights into recombination and conjugation in Crenarchaeota. J Bacteriol 2000, 182(24):7014-7020. KEGG: [http://www.genome.jp/kegg/]. NCBI: [http://www.ncbi.nlm.nih.gov/]. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389-3402. TIGR: [http://tigrblast.tigr.org/]. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673-4680. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792-1797. Poirot O, O'Toole E, Notredame C: Tcoffee@igs: A web server for computing, evaluating and combining multiple sequence alignments. Nucleic Acids Res 2003, 31(13):3503-3506. Philippe H: MUST, a computer package of Management Utilities for Sequences and Trees. Nucleic Acids Res 1993, 21(22):5264-5272. Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52(5):696-704. Guindon S, Lethiec F, Duroux P, Gascuel O: PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res 2005, 33(Web Server issue):W557-9. Jones DT, Taylor WR, Thornton JM: The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 1992, 8(3):275-282. Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 2001, 17(8):754-755. http://www.biomedcentral.com/1471-2148/7/106 Publish with Bio Med Central and every scientist can read your work free of charge "BioMed Central will be the most significant development for disseminating the results of biomedical researc h in our lifetime." Sir Paul Nurse, Cancer Research UK Your research papers will be: available free of charge to the entire biomedical community peer reviewed and published immediately upon acceptance cited in PubMed and archived on PubMed Central yours — you keep the copyright BioMedcentral Submit your manuscript here: http://www.biomedcentral.com/info/publishing_adv.asp Page 13 of 13 (page number not for citation purposes)
© Copyright 2025 Paperzz