FEMS Microbiology Ecology 28 (1999) 99–110

MiniReview

Molecular diversity of thermophilic cellulolytic and hemicellulolytic bacteria

Peter L. Bergquist a,b,*, Moreland D. Gibbs a, Daniel D. Morris a, V.S. Junior Te'o a, David J. Saul c, Hugh W. Morgan d

a School of Biological Sciences, Macquarie University, Sydney, N.S.W. 2109, Australia
b Department of Molecular Medicine, University of Auckland Medical School, Auckland, New Zealand
c Centre for Gene Technology, School of Biological Sciences, University of Auckland, Auckland, New Zealand
d Thermophile Research Unit, University of Waikato, Hamilton, New Zealand

Received 23 March 1998; received in revised form 14 July 1998; accepted 27 July 1998

Abstract

Many thermophilic bacteria belong to groups with deep phylogenetic lineages, and their ancestral forms were established before the occurrence of eucaryotes that produced cellulose and hemicellulose. Thus they may have acquired their β-glycanase genes from more recent mesophilic bacteria. Most research has focussed on extremely thermophilic eubacteria growing above 65°C under anaerobic conditions. Only recently have aerobic cellulolytic thermophiles been described from widely separated lineages (for example, Rhodothermus marinus, Caldibacillus cellulovorans). Many thermophilic bacteria produce cellulases and xylanases that have novel structures, with additional protein domains not identified with their catalytic activity. Many of these enzymes are multifunctional and carry more than one catalytic activity. This type of enzyme structure was first identified in the extreme thermophile Caldicellulosiruptor saccharolyticus. There is a general relatedness evident between catalytic domains, cellulose binding domains and other ancillary domains, which suggests that there may have been significant lateral gene transfer in the evolution of these microorganisms.
Detailed molecular studies show that there is variation in the sequences of these related but not identical genes from taxonomically widely-separated organisms. © 1999 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.

Keywords: Cellulase; Xylanase; Molecular diversity; Binding domain; Thermal stabilising domain; Caldicellulosiruptor

* Corresponding author. Tel.: +61 (2) 9850 8614; Fax: +61 (2) 9850 8799; E-mail: [email protected]

0168-6496/99/$19.00. PII: S0168-6496(98)00078-6

1. Introduction

Cellulose and hemicellulose are some of the most abundant biological polymers, with over 10^9 tonnes of cellulose produced and degraded annually. Curiously, no cellulose-producing organisms have been found growing at temperatures above 65°C, yet environments well above this temperature harbour a wide variety of thermophilic cellulolytic bacteria. Many groups of thermophilic bacteria have deep-rooted phylogenetic lineages [1] and so, presumably, the common ancestors of the thermophilic cellulolytic organisms were well established prior to the development of cellulose and hemicellulose-forming eucaryotes. The origin of the thermophilic bacterial cellulases and hemicellulases warrants investigation: are they novel enzymes or are they the result of lateral gene transfer, presumably from mesophilic bacteria? Regardless of how these bacteria obtained cellulases, the occurrence of cellulolytic and hemicellulolytic thermophiles is testimony to the presence of these substrates in thermophilic environments, either by the accidental accumulation of plant litter into natural hot springs or in environments such as compost piles.
Two unusual aspects of cellulose and hemicellulose degradation in thermophilic environments are the apparent paucity of aerobic species demonstrated to be involved and the complete absence of cellulolytic Archaea. As a result, the great majority of cellulolytic microorganisms so far described are anaerobic Bacteria. Whether this is a true reflection of natural diversity or an artifact of isolation techniques remains to be seen. DNA-based methods used to probe environmental populations for conserved regions of genes encoding catalytic domains may provide an opportunity to assess the true abundance of cellulolytic organisms by circumventing the need for cultivation.

In this review, we describe the molecular diversity of the cellulase and hemicellulase enzyme systems of thermophilic bacteria from the New Zealand geothermal region. We examine individual activities and relationships of enzymes with activity on cellulose and xylan, and propose genetic mechanisms that explain the predominant multi-domain architecture. Diversity between closely related isolates suggests lateral transfer of blocks of genes between thermophiles in the past.

In recent years, the improvement of long range, automated, DNA sequencing techniques has revealed widespread molecular diversity of the genomic organisation of even apparently closely related bacteria (as judged from SSU rDNA sequence similarity). These data greatly extend the wealth of information on the biodiversity of bacteria that has been generated from orthodox enrichment studies and from SSU rDNA analysis of biomass DNA from various environments. Many of the examples of diversity studies have been of bacteria inhabiting extreme environments, and we detail below some of our findings from genetic studies of extreme thermophiles that revealed the broad biodiversity at the molecular level of cellulases and hemicellulases from closely related bacteria.
From the ecological point of view, our molecular studies have shown that in the thermal environments that we have studied, there is a wide variety of bacteria, closely related as judged by small subunit ribosomal DNA (SSU rDNA) analysis, that colonise and thrive in niche environments. Close analysis of the cellulase and hemicellulase genes shows that there is a surprising variation within and between these ostensibly close relatives. Hence traditional taxonomic tools and molecular ecology using only SSU analysis may overlook the significant differences in enzyme content and activities shown by the hemicellulolytic and cellulolytic thermophiles. We expect that close examination of other habitats will reveal similar sequence variations in given genes amongst the bacterial inhabitants.

2. Cellulolytic and hemicellulolytic thermophiles

Thermophilic Archaea have been described which are able to hydrolyse complex polysaccharides including starch [2], chitin [3] and xylan [2], and we have cultured for over two years a stable consortium of at least two Archaea able to grow on glucomannan as sole carbon source (unpublished results). To date, no cellulolytic Archaea have been described despite many attempts at enrichment. This fact is surprising because all glycosyl hydrolases show some conservation in their catalytic domains, and hence the existence of cellulolytic Archaea must remain a possibility.

Extremely thermophilic eubacteria are conventionally regarded as those species with a temperature optimum for growth of greater than 65°C. With this restriction, until recently, there were only two known aerobic extreme thermophiles reported as being cellulolytic and only a few more that are hemicellulolytic. This contrasts with a large and growing number of anaerobic species from similar environments (Table 1 and see Ref. [2] for listings of xylanolytic thermophiles).
Table 1
Summary of thermophilic cellulolytic eubacteria

Species                                Temperature optimum (°C)   pH optimum   Growth conditions   Reference
Acidothermus cellulolyticus            55                         5.0          Aerobic             [5]
Rhodothermus marinus                   70                         7.0          Aerobic             [6]
'Caldibacillus cellulovorans'          68                         7.0          Aerobic             [7]
Thermotoga maritima                    80                         7.0          Anaerobic           [10]
Thermotoga maritima strain FjSS3.B1    80                         7.3          Anaerobic           see [2]
Thermotoga neapolitana                 80                         7.3          Anaerobic           see [2]
Strain 177RIB                          78                         7.0          Anaerobic           [13]
Strain 175CIA                          78                         7.0          Anaerobic           [13]
Thermoanaerobacter cellulolyticus      75                         8.1          Anaerobic           see [13]
'Anaerocellum thermophilum'            72–75                      7.2          Anaerobic           see [13]
Dictyoglomus turgidus                  72                         7.1          Anaerobic           [15]
Spirochaeta thermophila                70                         7.0          Anaerobic           [11]
Caldicellulosiruptor saccharolyticus   68–70                      7.0          Anaerobic           [16]
Caldicellulosiruptor lactoaceticus     68–75                      7.0          Anaerobic           see [13]
Clostridium stercorarium               65                         7.3          Anaerobic           [9]
Fervidobacterium islandicum            65                         7.2          Anaerobic           [17]
Clostridium thermolacticum             65                         7.2          Anaerobic           [18]
Clostridium thermocellum               60                         7.0          Anaerobic           [11]

Growth on cellulose, listed in column order: + + + + / + + + + + (+) + + / + + / + (+) +
Cellulase enzymes, listed in column order: Endoglucanase; Endoglucanase; Endoglucanase; Endoglucanase; Exoglucanase; Cellobiohydrolase; Endoglucanases; Endoglucanases; Exoglucanases; Endoglucanase; Exoglucanase; Cellobiohydrolase; –; Exoglucanase; Cellobiohydrolase
+ indicates growth on cellulose as sole carbon source, (+) indicates increased growth on cellulose in the presence of other substrates.

While Acidothermus cellulolyticus does not meet the criterion of an extreme thermophile (having an optimum temperature of only 55°C), it is regarded as an extremophile because of its pH 5 growth optimum and its ability to grow at pH 3. Acidothermus is a member of the Actinomycete subphylum and uses cellulose (crystalline or amorphous) or xylan as sole carbon and energy source for growth. It produces at least three endoglucanases that are non-cellulosomal.
These endoglucanases are all thermostable and one of them (E1, a member of the Family 5 glycosyl hydrolases [4]) has been crystallised [5]. Much of the recent work on the properties and applications of these enzymes has been protected by patent.

Rhodothermus marinus, an aerobic thermophile that was isolated from marine springs, is most closely related to the Cytophaga-Flexibacter-Bacteroides phylogenetic lineage, and like many organisms in this group displays versatility in its growth on polysaccharides. An endocellulase purified from Rhodothermus is one of the most stable cellulases yet recorded, retaining 50% activity after 3.5 h at 100°C [6]. Although this endoglucanase has no activity on crystalline cellulose, such an activity was demonstrated in culture filtrates of the organism, and thus Rhodothermus must produce a glycosyl hydrolase active on insoluble substrates.

A third aerobic cellulolytic thermophile was isolated in a survey of New Zealand thermal sites involving artificial composts set up under laboratory conditions. This organism, designated Caldibacillus cellulovorans, is an obligately aerobic spore-forming Bacillus which grows optimally at 70°C at pH 7.0 [7]. It grows on a fully-defined salts medium with cellulose substrates as sole carbon and energy source, but again, there is no evidence of any cellulosomes on the cells. Cellulases are excreted into the medium and are stable for up to a week at 70°C. Amorphous cellulose is the preferred substrate for growth and, though crystalline cellulose is degraded, lignified wood is not. The isolate is Gram-type positive and the spores, which are readily formed when the organism is grown on crystalline substrates, are heat resistant. The SSU (16S) rRNA gene sequence suggests that Caldibacillus cellulovorans has a close affiliation with the genus Alicyclobacillus.
Members of the genus Alicyclobacillus are characterised by the presence of alicyclic fatty acids as major components of their membrane lipids, and constituent members of the genus are acidophilic, non-cellulolytic and hydrogen autotrophs. In contrast, the new isolate is unable to metabolise hydrogen, is unable to grow below pH 6.0 and is cellulolytic and hemicellulolytic. On the basis of the genetic and phenotypic differences, the new isolate probably represents a new genus of thermophilic eubacteria [7]. One endoglucanase has been purified to homogeneity from this isolate (unpublished results). The enzyme has no activity on substrates other than cellulose, and a multi-domain enzyme complex (such as is found in many anaerobic thermophiles; see below) seems unlikely.

Thermophilic anaerobic eubacteria are among the deepest-rooted phylogenetic lineages in the eubacterial line of descent and include the most diverse array of cellulolytic isolates. Historically, they are important because one of the first reported thermophiles was an anaerobic spore-forming cellulolytic organism, possibly Clostridium thermocellum. C. thermocellum was reisolated and formally described and remains one of the most completely investigated thermophilic cellulolytic bacteria. The optimum growth temperature of C. thermocellum is only 55–60°C and so it is only moderately thermophilic. The cellulosome complex of C. thermocellum contains endoglucanases (at least 14 different proteins), cellobiohydrolases and xylanases, all anchored to the cellulose integrating protein (CIP). The cellulase system of C. thermocellum has been extensively reviewed [8]. Cellulosome structures are found in several species of mesophilic anaerobic bacteria but only in C. thermocellum and Clostridium stercorarium amongst the thermophiles, and no other known thermophile has as well-developed aggregating enzyme systems as these two organisms.
The cellulosome of C. stercorarium is less well developed than that of C. thermocellum, with a lower complement of enzymes and a less pronounced 'yellow complex' which is presumably the CIP protein. Avicelases of Clostridium stercorarium have been the focus of study by Bronnenmeier's group and, as with many cellulase enzymes, a high degree of synergism in activity is evident on crystalline cellulose substrate [9].

The genus Thermotoga is one of the most deep-rooted phylogenetic lineages in the Bacterial domain, and is also among the most thermophilic, with growth up to 90°C. Not all species of Thermotoga show good growth with cellulose as carbon source, but in the type strain T. maritima, the complete spectrum of enzymes necessary for growth on crystalline cellulose has been demonstrated [10].

Perhaps in contrast to other phylogenetic groups where cellulolytic species are more commonly mesophilic, Spirochaeta thermophila strain Rt19B.1 is the most thermophilic representative of the family Spirochaetales [11] and the only reported cellulolytic member. It can hydrolyse amorphous and crystalline forms of cellulose, and xylan and cellulose can be used as sole carbon source for growth. Again, no cellulosome-type structure has been observed on Rt19B.1 and these motile organisms do not appear to attach to cellulose particles. Presumably, cellulases are secreted into the medium but at this stage no characterisation of the enzyme(s) has been undertaken.

Rainey demonstrated the diversity of cellulolytic isolates from high temperature, neutral pH environments in a phenotypic and phylogenetic study of thermophilic anaerobic isolates. At least five distinct phenotypic groupings were recognised and these were partly supported by phylogenetic analysis of representative strains by SSU rRNA gene sequencing [12]. The great majority of isolates (which had been enriched on cellulose as sole carbon source for growth) were also xylanolytic and mannanolytic.
Isolates for this study were obtained from hot springs well-distributed over the globe; it would appear that these organisms are well dispersed and common inhabitants of most neutral environments in the 60–80°C temperature range. In addition, Bredholt et al. [13] have isolated an even greater diversity of thermophilic anaerobes from Icelandic springs, including several with an optimum temperature for growth of 78°C.

A representative strain from one of the New Zealand groups was further characterised and formally described as the species Caldicellulosiruptor saccharolyticus [14]. The genetics and properties of the cellulase and hemicellulase enzymes of this organism have been extensively investigated. Attempts at purifying the component enzymes for cellulose and hemicellulose utilisation proved to be confusing and largely fruitless. The heterogeneity of the substrate, the multi-domain and multi-catalytic nature of many hydrolytic enzymes, the possibility of glycosylation and the large cooperative effects of minor contaminating activities all serve to produce a confusing analysis. The preferred approach is to clone the genes encoding the enzymes in non-cellulolytic hosts and study the activity of single pure enzymes or that of known, pure mixes. This approach has been used successfully with Cs. saccharolyticus and provides a detailed picture of the organisation of enzyme activities and clustering of genes in this thermophile.

3. Molecular diversity of cellulases from Caldicellulosiruptor strains

Many bacteria have been reported to carry a multiplicity of genes for cellulases and hemicellulases [19]. The extreme thermophile Caldicellulosiruptor saccharolyticus is unusual in possessing a multifunctional, multidomain organisation for the majority of its β-glycanases [20].
Other Caldicellulosiruptor strains (as determined by SSU rDNA sequence-based phylogeny) also have a number of multidomain enzymes that encode xylanases or mannanases as well as cellulases, and these are distinguished from the single catalytic domain xylanases of Family 10. Fig. 1 is a diagrammatic representation of the three gene clusters of cellulases from Caldicellulosiruptor sp. strain Tok7B.1 (unpublished data). The three clusters are not closely linked, and each one is different in its organisation from any gene cluster described for Cs. saccharolyticus [20,21]. The catalytic domains of the enzymes belong to a limited number of families as determined by hydrophobic cluster analysis (Families 5, 9, 10, 43, 44, and 48; [4]) and, unlike Cs. saccharolyticus, there are no genes coding for multidomain enzymes containing a Family 5 β-mannanase domain. The cellulose binding domains of these enzymes from Caldicellulosiruptor Tok7B.1 are of either type II or III [22], with the single exception of an otherwise unclassified cellulose binding domain (CBD) associated with the α-arabinosidase domain of celA.

Fig. 1. Overall architecture of the three cellulase gene clusters sequenced from Caldicellulosiruptor Tok7B.1. The shaded line shows where complete sequence information is available. A stylised representation of the gene products is provided below the shaded line. Some restriction sites are named and λZAP recombinant boundaries indicated, e.g. W2–4.

Fig. 2. Genetic organisation of the xylanase gene cluster from Caldicellulosiruptor saccharolyticus. This region of DNA has been sequenced completely on both strands. A stylised representation of the gene products is provided below the shaded line, in examples where the enzymatic activity has been identified. X: β-xylanase; Xy: β-xylosidase; A: α-arabinosidase; E: acetyl xylan esterase; F: xylanase pseudogene; P: exoxylanase (?).
The unlinked β-xylanase gene xynI is not shown.

4. Xylanases from Caldicellulosiruptor strains

Enzymes involved in the metabolism of plant carbohydrate polymers have been grouped into 35 different families on the basis of primary and tertiary sequence homologies [4]. The endo-1,4-β-D-xylanases comprise Families 10 and 11. The only similarity between members of these two families is their ability to hydrolyze the acetyl-methylglucuronoxylans of hardwoods and arabinomethylxylans of softwoods; otherwise they are unrelated biochemically and structurally. Like most other cellulolytic and hemicellulolytic enzymes, xylanases are highly modular in structure and may be composed of either a single domain or a number of distinct domains broadly classified as catalytic or non-catalytic. Linker peptides typically delineate the individual domains of multidomain enzymes into discrete and functionally-independent units. The catalytic domain of a xylanase determines the hydrolytic activity and hence governs the classification of the enzyme as belonging to Family 10 or 11.

Most of the genes coding for enzymes involved in xylan degradation in Cs. saccharolyticus are found in a large gene cluster. A total of 10 open reading frames were found in this cluster, seven of which were upstream of xynA (see Fig. 2). Three of the ORFs were identified with enzymes involved in xylan degradation: xynA, a xylanase; xynB, a β-xylosidase; and xynC, an acetylxylan esterase. A further ORF is a non-functional gene that has Family 10 xylanase homology. XynE is a multi-domain enzyme with xylanase activity. XynF appears to have five domains which may have resulted from a gene fusion: two domains comprise an arabinofuranosidase and two more a xylanase (domains 1+2 and domains 4+5) (Te'o, PhD thesis, 1996; [23]). Close by are a number of other genes whose functions have been inferred from homology comparisons.
They seem to be part of a major gene cluster that is involved in the metabolism of xylose and other sugars in this organism.

Surprisingly, the xylanase gene organisation in this region of the genome of Cs. saccharolyticus is quite different from that of its close relative Caldicellulosiruptor sp. strain Rt8B.4 [25]. No multigene xylanase or cellulase/hemicellulase gene clusters were present and other xylanases were not found in the expression gene library of this organism despite extensive screening of genomic λZAPII gene libraries. The size of the multidomain cellulases makes it hard to isolate complete, active genes using this vector. The gene xynI, which was isolated after PCR amplification from the Cs. saccharolyticus genome using consensus primers [24], is part of a genetic organisation which is very similar to that of the xynA gene cluster of Caldicellulosiruptor sp. strain Rt8B.4 [25].

5. Molecular diversity of Thermotoga and Caldicellulosiruptor multidomain xylanases

There is a subfamily of hyper- and extremely-thermophilic enzymes within the Family 10 xylanases which we call here the 'TSD-IX' subfamily. There is substantial homology within the non-catalytic domains of the following enzymes: Thermotoga maritima XynA [26]; Thermotoga sp. FjSS3B.1 XynB and XynC [27]; Cellulomonas fimi XynC [28]; Caldicellulosiruptor strain Rt8B.4 XynA [25]; Clostridium thermocellum F1 XynC [29]; and Thermoanaerobacterium saccharolyticum XynA [30]. The structure of these enzymes is based on the domain arrangement TSD-TSD-Family 10 xylanase-CBDIX-CBDIX (TSD = thermostabilising domain [31]; CBDIX = cellulose binding domain type IX [22]; additional domains at the C-terminus are present in some of the enzymes of this subfamily, for example, see Fig. 3). The molecular architecture of these genes is quite unlike that of the enzymes from Cs. saccharolyticus. However, the significance of cellulose-binding and thermostabilising domains is unclear.
XynA (Family 10) from Dictyoglomus thermophilum Rt46B.1 has neither domain, yet it is not only more thermostable than XynB (a Family 11 xylanase with a C-terminal non-catalytic domain from the same organism) but is also able to bind to xylan and release reducing sugars [41]. These observations place doubt on the true role of the 'thermostabilising' domains. Their removal certainly reduces the thermal stability of the catalytic domain [26], but this may be an incidental effect resulting from the fact that the enzyme cannot fold properly or make the appropriate thermostabilising interactions. For example, molecular modeling and enzyme characterisation have shown that with Dictyoglomus thermophilum Rt46B.1 XynB, removal of even a few amino acid residues from the N-terminus creates thermal instability which appears to arise from the creation of a disorganised N-terminal region [32]. Similar enzymatic and thermostability characteristics were seen for both the complete, multidomain XynB and the catalytic domain expressed alone, which suggests that the non-catalytic domains have other, as yet undiscovered, functions [32]. Furthermore, in the case of the Caldicellulosiruptor strain Rt8B.4 XynA, there is no difference in thermostability between recombinant enzymes with and without the putative N-terminal 'thermostabilising' domain (Griffiths and Bergquist, unpublished results).

The nucleotide sequences of over 80 Family 10 and 11 xylanase genes have now been deposited in the GenBank and EMBL databases. These xylanase genes were identified from gene libraries which were screened for either hybridisation to labeled gene probes or, more commonly, the expression of endoxylanase activity.
Fig. 3. Architectural and sequence homologies between the Caldicellulosiruptor strain Rt69B.1 Family 10 (XynA, XynB and XynC) and Family 11 (XynD) xylanases. From top to bottom, showing Thermoanaerobacterium saccharolyticum XynA; Thermotoga maritima XynA; Rt69B.1 XynB; Rt69B.1 XynA; Rt69B.1 XynC; Caldicellulosiruptor saccharolyticus CelB; Cs. saccharolyticus XynF; Bacillus polymyxa XynD; Rt69B.1 XynD and Dictyoglomus thermophilum XynB. Key: TSD: thermostabilising domain; CBD IX: Family IX cellulose binding domain; ?: domain of unknown function; E: endoglucanase domain (truncated in figure due to size constraints); Family 43: family 43 β-glycanase domain (reported xylosidase/arabinofuranosidase activities); CBD IV: Family VI CBD; XBD ?: possible xylan-binding domain. Repeated SLH (S-layer homology) domains are indicated by white arrowheads, whilst the interdomain linker peptides are indicated by black.

An alternative approach for identifying novel Family 10 and 11 xylanase genes is to use the polymerase chain reaction (PCR) in conjunction with broad-specificity xylanase consensus primers that are designed from the overall consensus of the most highly conserved regions of Family 10 and Family 11 xylanase genes. Because this approach is PCR-based, it is highly sensitive, and offers an expedient means for the identification of xylanase genes directly from genomic DNAs without having to make and screen gene libraries. Furthermore, by using genomic-walking PCR, xylanase gene(s) identified with the consensus PCR step can be subsequently sequenced in a quick and relatively straightforward manner [24]. We have used this two-step approach to examine the complement of xylanase genes in Caldicellulosiruptor sp. strain Rt69B.1.
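The consensus-primer step can be illustrated in silico. The following Python sketch, offered only as an illustration and not as the procedure of Ref. [24], expands degenerate (IUPAC-coded) primers into patterns and reports the regions of a template that a primer pair would be predicted to amplify. The primer sequences, the template and the function names (find_sites, predicted_amplicons) are hypothetical toy examples; real consensus primers would be longer and derived from aligned xylanase genes, and mismatch tolerance is not modelled here.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table that also handles the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(primer, template):
    """Start positions where the degenerate primer matches the top strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    body = "".join(IUPAC[base] for base in primer.upper())
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(template.upper())]


def predicted_amplicons(template, fwd, rev, max_len=2000):
    """(start, end) pairs on the top strand, end exclusive.

    The reverse primer anneals to the bottom strand, so its top-strand
    site is the reverse complement of the primer as written 5' to 3'.
    """
    fwd_sites = find_sites(fwd, template)
    rev_sites = find_sites(reverse_complement(rev), template)
    amplicons = []
    for f in fwd_sites:
        for r in rev_sites:
            end = r + len(rev)
            if r >= f + len(fwd) and end - f <= max_len:
                amplicons.append((f, end))
    return amplicons


# Toy data (invented for illustration only).
template = "AAA" + "GGATAC" + "C" * 8 + "T" + "AACGG" + "TTT"
print(predicted_amplicons(template, "GGNTAY", "CCRTT"))  # prints [(3, 23)]
```

In practice the template would be a genomic sequence or a set of database entries, and the list of predicted products would only indicate candidates to be confirmed by sequencing.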
Strain Rt69B.1 is closely related to Caldicellulosiruptor strain Rt8B.4 and Caldicellulosiruptor saccharolyticus as inferred from SSU rRNA gene sequence-based phylogeny [33]. Rt69B.1 has three Family 10 xylanases: XynA, XynB and XynC (Fig. 3). They are members of the TSD-IX subfamily (see above) and are related in structure to the XynA Family 10 xylanase from Thermotoga maritima [26]. The N-terminal regions of Rt69B.1 XynA, XynB and XynC are architecturally identical and are related in sequence. However, there are sufficient sequence differences between the TSDs that the N- and C-terminal-most TSDs from each xylanase can be segregated into distinct subfamilies, with the exception of the second TSD from Rt69B.1 XynA.

There is microdiversity at the level of the individual genes. The xylanase domains of Rt69B.1 XynA, XynB and XynC are very closely related (approximately 60% identity in each case). The XynA and XynC Family 10 xylanase domains are 329 residues in length, whilst the XynB domain is slightly longer at 340 residues. The length variations within the Rt69B.1 XynA, XynB and XynC xylanase domains can be mapped to several of the variable loop regions which partition the alternating beta-strand and alpha-helix motifs [34]. Downstream from the catalytic domains there is considerable diversity between the genes, which encode different families of CBDs (genes xynB and xynC), and xynC has a further C-terminal catalytic domain, encompassing the Family 43 β-glycanase domain and CBDIV, which is highly homologous (89%) to the C-terminal domains of Cs. saccharolyticus XynF (Fig. 2, [23]). Surprisingly, the C-terminal region of XynC is also homologous to the two N-terminal domains of the Bacillus polymyxa XynD β-glycanase, which is composed of an N-terminal Family 43 arabinofuranosidase/xylosidase domain, a central CBDIV, and an additional C-terminal domain [35].
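Identity figures such as those quoted above are obtained from pairwise comparison of aligned sequences. The following Python sketch shows one common way of computing percent identity from sequences that have already been aligned; it is an illustration only, and it is an assumption that gapped columns are excluded from the denominator (other conventions divide by the alignment length or by the shorter sequence, and the convention used for the values in the text is not stated). The short peptide strings and the names percent_identity and identity_matrix are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_matrix(aligned):
    """Pairwise percent identities for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {
        (a, b): percent_identity(aligned[a], aligned[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Toy aligned fragments (invented; not real xylanase sequences).
toy = {"dom1": "MKV-LLAG", "dom2": "MKVALLSG"}
print(round(percent_identity(toy["dom1"], toy["dom2"]), 1))  # prints 85.7
```

Applied to a multiple alignment of the three Family 10 catalytic domains, identity_matrix would return one value per pair, which is the form in which such comparisons are usually tabulated.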
The Family 11 xylanase gene from Rt69B.1 has high overall homology with the xynB gene from Dictyoglomus thermophilum Rt46B.1 and the binding domain with a related structure at the C-terminus of Bacillus polymyxa XynD [35]. There appears to be only a single copy of the Family 11 gene, unlike its Family 10 counterparts.

6. Origins of molecular diversity

What is the explanation for the diversity in gene structure found in homologous genes in closely related bacteria? It has been generally assumed that the β-glycanases have evolved by domain shuffling [36], although exact mechanisms have not been described. Linkers are often found in xylanases and cellulases [37] and are thought to function as flexible hinges between the catalytic and substrate binding domains. The DNA encoding the repeated linkers may have a role analogous to that of introns, enabling sequences that encode discrete domains to be excised and fused to other genes, thus generating novel hybrid enzymes [19]. Another possibility that can be proposed is that, following duplication of cellulase genes by DNA replication, a recombinational event similar to that postulated for the origin of multiple tRNA genes ('unequal crossing-over', [38]) or intragenic recombination [39] could give rise to the genes coding for multidomain enzymes seen in the genomes of most cellulolytic bacteria.

Furthermore, there are superficial similarities between the organisations of the CBDs. A simple example that could be attributed to intragenic recombination is shown by what appears to be an inverted orientation for the CBDs of CelA from Caldicellulosiruptor Tok7B.1 in comparison to the other related proteins, which may be explained by the occurrence of two intragenic cross-overs in the PT-linker regions as outlined in Fig. 1.
However, although this is a superficially plausible model, alignment of the amino acid sequences of the CBDs and a dendrogram of their relationships suggest that none of the CBD arrangements was the immediate precursor of the inverted CelA structure (Fig. 4).

Fig. 4. Hypothetical intragenic recombination events that gave rise to the architecture of the celA gene in Caldicellulosiruptor Tok7B.1 (see Fig. 1). It is postulated that two crossover events occurred in the DNA coding for the linker sequences. Other domain sequence orders can be generated by shifting the sites of the crossovers.

In view of the high degree of sequence homology between the Family 10 xylanase domains of Caldicellulosiruptor Rt69B.1 XynA, XynB and XynC, and the similarities in the N-terminal architectures of these enzymes, it appears that the genes encoding the enzymes have arisen by duplication of an ancestral xylanase gene which, in its primitive form, encoded a Family 10 xylanase domain with duplicated N-terminal TSDs. The differences within the C-terminal regions of XynA, XynB and XynC are also presumably the result of the 'domain-shuffling' mechanisms central to the evolution of microbial glycosyl-hydrolase genes [40]. The sequence and architectural homologies observed between the xylanases from Rt69B.1 and assorted β-glycanases from other cellulolytic bacterial strains provide a remarkable example of these domain-shuffling processes. For example, at least four distinct gene segments can be identified within the Rt69B.1 xynC gene based upon isolated homologies between XynC and the CelB and XynF enzymes from Cs. saccharolyticus (Fig. 3). Similarly, the two C-terminal domains of Rt69B.1 XynC can combine with the C-terminal domain of Rt69B.1 XynD to form an enzyme of identical architecture to Bacillus polymyxa XynD (xylosidase/arabinofuranosidase). It is noteworthy that the junction of the B.
polymyxa XynD peptide sequence signifying the end of homology to Rt69B.1 XynC and the commencement of homology to Rt69B.1 XynD is continuous with sequences neither added or lost. This observation suggests that the C-terminal domains of Rt69B.1 XynC and XynD may have arisen through the recombinational joining of an ancestral B. polymyxa xynD-like gene (see Fig. 3). 107 Caldicellulosiruptor and its close relatives with their unique array of multifunctional enzymes with catalytic domains carrying out related activities in the hydrolysis of insoluble substrates may represent a persistent evolutionary experiment which developed before the organisation of genes into operons in other lines of bacteria. A cluster of genes in the same orientation is frequently part of an operon and is regulated by transcription from a single promoter. An alternative regulatory mechanism would be to fuse the genes encoding the hydrolytic enzymes to result in the production of a multifunctional protein on transcription. Multifunctional enzymes guarantee equivalent transcription and translation of the related enzyme activities and the binding domain(s) ensure co-ordinate action at the same site on the substrate. Further evidence for the molecular diversity of the L-glycanase genes that we have studied is provided by the discovery of non-functional gene copies that appear to be evolutionary remnants of the gene-shuf£ing process. In the case of Orf3/4 of Cs. saccharolyticus (Te'o, PhD. thesis, 1996; and Ref. [23]) and xynC of Thermotoga FjSS3.B.1 [27], and in other examples we have not described here, we have found evidence for non-functional genes (pseudogenes) on the genomes of thermophilic bacteria. Pseudogenes have been reported from higher eucaryotes but their occurrence in procaryotes is unusual. 
It appears that we have found these sequences because our two-step PCR approach does not rely on the expression of genes within libraries; perhaps the lack of known procaryotic pseudogenes is an artifact of the commonly used methods of gene isolation. Presumably, a pseudogene could arise only because of the presence of multiple copies of the gene. It has been proposed that domains involved with carbohydrate metabolism have evolved through the duplication, and subsequent modification, of progenitor sequences; the acquisition of new catalytic specificities and the optimisation of existing specificities have presumably come about through divergent and convergent evolution. An inevitable consequence of such evolutionary mechanisms would be the accumulation of pseudogenes from non-productive gene rearrangements. While genes without functions would not be expected to persist, it is reasonable to expect that some of these pseudogenes would be contained in the genomes of saccharolytic organisms if they were closely linked to metabolically important genes. Ultimately, it may be found that β-glycanase pseudogenes are quite widely distributed, especially in those organisms containing gene clusters. However, their identification may require a systematic examination of the entire β-glycanase system of an organism, as any pseudogenes would be overlooked by the standard plate assays of genomic expression libraries.

7. Conclusions

The information on molecular diversity of thermophilic bacteria discussed above has been derived entirely from culturable bacteria that grow as pure strains. It is now widely acknowledged that traditional enrichment strategies produce a subset of organisms that represent only a portion of those found in natural environments [41].
With the development of PCR, it has become possible to bypass enrichment completely by amplifying SSU rRNA genes directly from natural environments [42]. This strategy has allowed the detection of a wide variety of previously unknown organisms and has demonstrated that our perceptions of microbial diversity and phylogeny were inadequate [43]. As a result, most microbiologists agree that less than 1% of the total microbiota that exist in natural environments has been identified [44].

We have examined the diversity of one genus, Thermus, both prior to and after standard enrichment techniques by isolating SSU (16S) rRNA genes and comparing their sequences [45]. Although Thermus is neither cellulolytic nor hemicellulolytic, it is a convenient experimental model and we believe that the results are representative of a broad range of microorganisms that exist in nature. The enrichments resulted in the predominance of effectively the same single strain from each pool and there was a complete loss of heterogeneity in the sequences [45]. From the ecological point of view, this result must mean that surveys of habitats using SSU sequences as probes of one sort or another will find that many of the organisms present have not been described or are unrelated to known culturable bacteria whose ribosomal DNA sequences are available from databanks. Similarly, molecular surveys of microbial habitats using marker genes such as cellulases may reveal much greater diversity than can be accounted for by the culturable or taxonomically identifiable organisms present in the sample. All of the bacteria we have described have been isolated by enrichment and grow in pure culture. It is likely that there is even greater genetic diversity amongst the cellulases and hemicellulases in unenriched biomass. Our two-step PCR technique could be used to examine the extent of this biodiversity, and indeed, we have used genomic walking PCR in the isolation of a Family 10 xylanase directly from biomass [24].
While this gene turned out to be derived from a culturable bacterium (Dictyoglomus), the combination of the technique with rDNA analysis by PCR should allow correlation between the occurrence of diverse genetic coding sequences and the presence of new or unusual microorganisms.

Recent developments in genomic sequencing have influenced considerations of genome organisation in procaryotes (reviewed in Ref. [47]). One finding from recent comparative genomics is that there is a lack of large-scale conservation of gene order and, where it occurs, it only involves a small number of essential genes. One conclusion from these studies is that 'in the evolution of procaryotes, horizontal gene transfer has been common and intense' [47]. Duplication and divergence of ancestral genes has been proposed to be the major route for molecular evolution [48]. Accordingly, the limited sequencing and comparison studies of cellulases and hemicellulases that we have performed do not allow us to distinguish the manner in which genes have evolved in related bacteria or whether the genes have been acquired by horizontal transfer [49]. Divergent evolution is postulated to occur after gene duplication, whereas convergent evolution is postulated to have occurred in parallel. Resolution of the exact evolutionary relationships of thermophile cellulases and hemicellulases may depend on the conclusions that can be drawn from large-scale genomic sequencing results.

Acknowledgments

The work from our laboratories has been supported by grants from the Foundation for Research, Science and Technology, Wellington, New Zealand and grants from the University of Auckland and Macquarie University Research Committees.

References

[1] Olsen, G.J., Woese, C.R. and Overbeek, R. (1994) The winds of (evolutionary) change: Breathing new life into microbiology. J. Bacteriol. 176, 1–6.
[2] Sunna, A., Moracci, M., Rossi, M. and Antranikian, G. (1997) Glycosyl hydrolases from hyperthermophiles. Extremophiles 1, 2–13.
[3] Huber, R., Stohr, J., Hohenhaus, S., Rachel, R., Burggraf, S., Jannasch, H.W. and Stetter, K.O. (1995) Thermococcus chitinophagus sp. nov., a novel, chitin-degrading, hyperthermophilic archaeum from a deep-sea hydrothermal vent environment. Arch. Microbiol. 164, 255–264.
[4] Henrissat, B. and Bairoch, A. (1995) New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 293, 781–788.
[5] Sakon, J., Adney, W.S., Himmel, M.E., Thomas, S.R. and Karplus, P.A. (1996) Crystal structure of thermostable family 5 endocellulase E1 from Acidothermus cellulolyticus in complex with cellotetraose. Biochemistry 35, 10648–10660.
[6] Hreggvidsson, G.O., Kaiste, E., Holst, O., Eggertsson, G., Palsdottir, A. and Kristjansson, J.K. (1996) An extremely thermostable cellulase from the thermophilic eubacterium Rhodothermus marinus. Appl. Environ. Microbiol. 62, 3047–3049.
[7] Huang, X.P., Hudson, J.A., Rainey, F.A., Nichols, P.D. and Morgan, H.W. (1998) Isolation and characterization of Caldibacillus cellulovorans gen. nov., sp. nov., an extremely thermophilic, cellulolytic bacterium. Int. J. Syst. Bacteriol. (in press).
[8] Felix, C.R. and Ljungdahl, L.G. (1993) The cellulosome: the exocellular organelle of Clostridium. Annu. Rev. Microbiol. 47, 791–819.
[9] Riedel, K., Ritter, J. and Bronnenmeier, K. (1997) Synergistic interaction of the Clostridium stercorarium avicelase-1 (Celz) and avicelase-11 (Cely) in the degradation of crystalline cellulose. FEMS Microbiol. Lett. 147, 239–243.
[10] Liebl, W., Ruile, P., Bronnenmeier, K., Reidel, K., Lottspeich, F. and Greif, I. (1996) Analysis of a Thermotoga maritima DNA fragment encoding two similar thermostable cellulases, CelA and CelB, and characterization of the recombinant enzymes. Microbiology 142, 2533–2542.
[11] Aksenova, H., Rainey, F.A., Janssen, P.H., Morgan, H.W. and Zavarzin, G.A. (1992) Spirochaeta thermophila sp. nov., an obligately anaerobic polysaccharolytic member of the genus Spirochaeta. Int. J. Syst. Bacteriol. 42, 175–177.
[12] Rainey, F.A., Ward, N.L., Morgan, H.W., Toalster, R. and Stackebrandt, E. (1993) Phylogenetic analysis of anaerobic thermophilic bacteria: Aid for their reclassification. J. Bacteriol. 175, 4772–4779.
[13] Bredholt, S., Mathrani, I.M. and Ahring, B.K. (1995) Extremely thermophilic cellulolytic anaerobes from Icelandic hot springs. Antonie van Leeuwenhoek 68, 263–271.
[14] Rainey, F.A., Donnison, A.M., Janssen, P.H., Saul, D., Rodrigo, A., Bergquist, P.L., Daniel, R.M., Stackebrandt, E. and Morgan, H.W. (1994) Description of Caldicellulosiruptor saccharolyticus gen. nov., sp. nov.: An obligately anaerobic, extremely thermophilic, cellulolytic bacterium. FEMS Microbiol. Lett. 120, 263–266.
[15] Svetlichny, V.A. and Svetlichnya, T.P. (1988) Dictyoglomus turgidus, sp. nov., a new extreme thermophilic eubacterium isolated from hot springs in the Uzon Volcano Crater. Mikrobiologiya 57, 435–441.
[16] Te'o, V.S.J., Saul, D.J. and Bergquist, P.L. (1995) cellA, another gene coding for a multidomain cellulase from the extreme thermophile 'Caldocellum saccharolyticum'. Appl. Microbiol. Biotechnol. 43, 291–296.
[17] Huber, R., Woese, C.R., Langworthy, T.A., Kristjansson, J.K. and Stetter, K.O. (1990) Fervidobacterium islandicum sp. nov., a new extremely thermophilic eubacterium belonging to the 'Thermotogales'. Arch. Microbiol. 154, 105–111.
[18] Le Ruyet, P., Dubourguier, H.C., Albagnac, G. and Prensier, G. (1985) Characterization of Clostridium thermolacticum sp. nov., a hydrolytic thermophilic anaerobe producing high amounts of lactate. Syst. Appl. Microbiol. 6, 196–202.
[19] Gilbert, H.J. and Hazelwood, G.P. (1993) Bacterial cellulases and xylanases. J. Gen. Microbiol. 139, 187–194.
[20] Bergquist, P.L., Gibbs, M.D., Saul, D.J., Te'o, V.S.J., Dwivedi, P.P. and Morris, D. (1993) Molecular genetics of thermophilic bacterial genes coding for enzymes involved in cellulose and hemicellulose degradation. In: Genetics, Biochemistry and Ecology in Biodegradation of Lignocellulose (Shimada, K., Hoshino, S., Ohmiya, K., Sakka, K., Kobayashi, Y. and Karita, S., Eds.), pp. 276–285. Uni Publishers, Tokyo, Japan.
[21] Gibbs, M.D., Reeves, R.A., Farrington, K.G., Williams, D.P. and Bergquist, P.L. (1998) Multidomain and multifunctional cellulase genes from the extreme thermophile Caldicellulosiruptor isolate Tok7B.1. Appl. Environ. Microbiol., submitted.
[22] Tomme, P., Warren, R.A.J., Miller, R.C., Kilburn, D.G. and Gilkes, N.R. (1995) Cellulose-binding domains: classification and properties. In: The Enzymatic Degradation of Insoluble Polysaccharides (Saddler, J.N. and Penner, M.H., Eds.), pp. 142–161. American Chemical Society Symposium Series 618.
[23] Te'o, V.S.J., Gibbs, M.D., Saul, D.J. and Bergquist, P.L. (1998) A cluster of genes involved in xylan degradation cloned from the extreme thermophile Caldicellulosiruptor saccharolyticus. Extremophiles, submitted.
[24] Bergquist, P.L., Gibbs, M.D., Saul, D.J., Reeves, R.A., Morris, D.D. and Te'o, V.S.J. (1998) Isolation and expression of genes for hemicellulases from extremely thermophilic culturable and unculturable bacteria. In: Enzyme Applications in Fiber Processing (Eriksson, K.-E. and Cavaco-Paulo, A., Eds.). American Chemical Society Symposium Series 653, in press.
[25] Dwivedi, P.P., Gibbs, M.D., Saul, D.J. and Bergquist, P.L. (1996) Cloning, sequencing and over-expression in Escherichia coli of a xylanase gene, xynA from the thermophilic bacterium Caldicellulosiruptor Rt8B.4. Appl. Microbiol. Biotechnol. 45, 86–93.
[26] Winterhalter, C., Heinrich, P., Candussio, A., Wich, G. and Liebl, W. (1995) Identification of a novel cellulose-binding domain within the multidomain 120 kDa xylanase XynA of the hyperthermophilic bacterium Thermotoga maritima. Mol. Microbiol. 15, 431–444.
[27] Reeves, R.A., Saul, D.J., Morris, D.D., Gibbs, M.D. and Bergquist, P.L. (1998) Sequences and expression of further xylanase genes from the hyperthermophile Thermotoga sp. strain FjSS3-B.1. J. Bacteriol., submitted.
[28] Clarke, J., Davidson, K., Gilbert, H.J., Fontes, C.M.G.A. and Hazlewood, G.P. (1996) A modular xylanase from mesophilic Cellulomonas fimi contains the same cellulose-binding domain and thermostabilising domain as xylanases from thermophilic bacteria. FEMS Microbiol. Lett. 139, 27–35.
[29] Hayashi, H., Takagi, K.-I., Fukumura, M., Kimura, T., Karita, S., Sakka, K. and Ohmiya, K. (1997) Sequence of xynC and properties of XynC, a major component of the Clostridium thermocellum cellulosome. J. Bacteriol. 179, 4246–4253.
[30] Lee, Y., Lowe, S.E. and Zeikus, G.J. (1993) Gene cloning, sequencing and biochemical characterisation of endoxylanase from Thermoanaerobacterium saccharolyticum B6A-RI. Appl. Environ. Microbiol. 59, 3134–3137.
[31] Fontes, C.M., Hazlewood, G.P., Morag, E., Hall, J., Hirst, B.H. and Gilbert, H.J. (1995) Evidence for a general role for non-catalytic thermostabilising domains in a xylanase from thermophilic bacteria. Biochem. J. 307, 151–158.
[32] Morris, D.D., Gibbs, M.D., Chin, C.J.W., Koh, M.-H., Wong, K.K.Y., Allison, R.W., Nelson, P.J. and Bergquist, P.L. (1998) Cloning of the xynB gene from Dictyoglomus thermophilum strain Rt46B.1 and characterization of the gene product on kraft pulp. Appl. Environ. Microbiol. 64.
[33] Morris, D.D., Gibbs, M.D., Ford, M., Thomas, J. and Bergquist, P.L. (1998) Family 10 and 11 xylanase genes from Caldicellulosiruptor sp. Rt69B.1. Extremophiles, submitted.
[34] White, A., Withers, S.G., Gilkes, N.R. and Rose, D.R. (1994) Crystal structure of the catalytic domain of the β-1,4-glycanase Cex from Cellulomonas fimi. Biochemistry 33, 12546–12552.
[35] Gosables, M.J., Perez-Gonzalez, J.A., Gonzales, R. and Navarros, A. (1991) Two beta-glycanase genes are clustered in Bacillus polymyxa: molecular cloning, expression and sequence analysis of genes encoding a xylanase and an endo-beta-(1,3)-(1,4)-glucanase. J. Bacteriol. 173, 7705–7710.
[36] West, C.A., Elzanowski, A., Yeh, L.-S. and Barker, W.C. (1989) Homologues of catalytic domains of Cellulomonas glucanases found in fungal and Bacillus glycosidases. FEMS Microbiol. Lett. 59, 167–172.
[37] Ferreira, L.M.A., Durrant, A.J., Hall, J., Hazlewood, G.P. and Gilbert, H.J. (1990) Spatial separation of protein domains is not necessary for catalytic activity or substrate binding in a xylanase. Biochem. J. 269, 261–264.
[38] Smith, J.D., Barnett, L., Brenner, S. and Russell, R.L. (1970) More mutant tyrosine transfer ribonucleic acids. J. Mol. Biol. 54, 1–14.
[39] Cooper, V.J.C. and Salmond, G.P.C. (1993) Molecular analysis of the major cellulase (CelV) of Erwinia carotovora: evidence for an evolutionary 'mix-and-match' of enzyme domains. Mol. Gen. Genet. 241, 341–350.
[40] Gilkes, N.R., Henrissat, B., Kilburn, D.G., Miller, R.C. and Warren, R.A.J. (1991) Domains in microbial β-1,4-glycanases: Sequence conservation, function, and enzyme families. Microbiol. Rev. 55, 2303–2315.
[41] Risatti, J.B., Capman, W.C. and Stahl, D.A. (1994) Community structure of a microbial mat: the phylogenetic dimension. Proc. Natl. Acad. Sci. USA 91, 10173–10177.
[42] Pace, N.R., Stahl, D.A., Lane, D.J. and Olsen, G.J. (1986) The analysis of natural microbial populations by ribosomal sequences. Adv. Microb. Ecol. 9, 1–55.
[43] Woese, C.R. (1994) Microbiology in transition. Proc. Natl. Acad. Sci. USA 91, 1601–1603.
[44] Amann, R.I., Ludwig, W. and Schleifer, K.-H. (1995) Phylogenetic identification and in situ detection of individual cells without cultivation. Microbiol. Rev. 59, 143–169.
[45] Saul, D.J., Reeves, R.A., Morgan, H.W. and Bergquist, P.L. (1998) Thermus diversity and strain loss during enrichment. FEMS Microbiol. Ecol., accepted.
[46] Koonin, E.V. and Galperin, M.Y. (1997) Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Curr. Opin. Genet. Dev. 7, 757–763.
[47] Ohno, S. (1970) Evolution by Gene Duplication. Springer, New York, NY.
[48] Fitch, W.D. (1970) Distinguishing homologous from analogous proteins. Syst. Zool. 19, 99–113.