FEMS Microbiology Ecology 28 (1999) 99-110
MiniReview
Molecular diversity of thermophilic cellulolytic and
hemicellulolytic bacteria
Peter L. Bergquist a,b,*, Moreland D. Gibbs a, Daniel D. Morris a,
V.S. Junior Te'o a, David J. Saul c, Hugh W. Morgan d
a School of Biological Sciences, Macquarie University, Sydney, N.S.W. 2109, Australia
b Department of Molecular Medicine, University of Auckland Medical School, Auckland, New Zealand
c Centre for Gene Technology, School of Biological Sciences, University of Auckland, Auckland, New Zealand
d Thermophile Research Unit, University of Waikato, Hamilton, New Zealand
Received 23 March 1998; received in revised form 14 July 1998; accepted 27 July 1998
Abstract
Many thermophilic bacteria belong to groups with deep phylogenetic lineages and ancestral forms were established before
the occurrence of eucaryotes that produced cellulose and hemicellulose. Thus they may have acquired their β-glycanase genes
from more recent mesophilic bacteria. Most research has focussed on extremely thermophilic eubacteria growing above 65°C
under anaerobic conditions. Only recently have aerobic cellulolytic thermophiles been described from widely separated lineages
(for example, Rhodothermus marinus, Caldibacillus cellulovorans). Many thermophilic bacteria produce cellulases and xylanases
that have novel structures, with additional protein domains not identified with their catalytic activity. Many of these enzymes
are multifunctional and code for more than one catalytic activity. This type of enzyme structure was first identified in the
extreme thermophile Caldicellulosiruptor saccharolyticus. There is a general relatedness evident between catalytic domains,
cellulose binding domains and other ancillary domains, which suggests that there may have been significant lateral gene
transfer in the evolution of these microorganisms. Detailed molecular studies show that there is variation in the sequences of
these related but not identical genes from taxonomically widely-separated organisms. © 1999 Federation of European
Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.
Keywords: Cellulase; Xylanase; Molecular diversity; Binding domain; Thermal stabilising domain; Caldicellulosiruptor
1. Introduction
Cellulose and hemicellulose are some of the most abundant biological polymers, with over 10⁹ tonnes
of cellulose produced and degraded annually. Curiously, no cellulose-producing organisms have been
found growing at temperatures above 65°C, yet environments well above this temperature harbour a
wide variety of thermophilic cellulolytic bacteria. Many groups of thermophilic bacteria have
deep-rooted phylogenetic lineages [1] and so, presumably, the common ancestors of the thermophilic
cellulolytic organisms were well established prior to the development of cellulose and
hemicellulose-forming eucaryotes. The origin of the thermophilic bacterial cellulases and
hemicellulases warrants investigation: are they novel enzymes or are they the result of lateral
gene transfer, presumably from mesophilic bacteria?

* Corresponding author. Tel.: +61 (2) 9850 8614; Fax: +61 (2) 9850 8799; E-mail: [email protected]
0168-6496/99/$19.00 © 1999 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.
PII: S0168-6496(98)00078-6
FEMSEC 964 4-2-99

Regardless of how these bacteria obtained cellulases, the occurrence of cellulolytic and hemicellulolytic thermophiles is testimony to the presence of
these substrates in thermophilic environments, either
by the accidental accumulation of plant litter into
natural hot springs or in environments such as compost piles. Two unusual aspects of cellulose and
hemicellulose degradation in thermophilic environments are the apparent paucity of aerobic species
demonstrated to be involved and the complete absence of cellulolytic Archaea. As a result, the great
majority of cellulolytic microorganisms so far described are anaerobic Bacteria. Whether this is a
true reflection of natural diversity or an artifact of
isolation techniques remains to be seen. DNA-based
methods used to probe environmental populations
for conserved regions of genes encoding catalytic
domains may provide an opportunity to assess the
true abundance of cellulolytic organisms by circumventing the need for cultivation.
In this review, we describe the molecular diversity
of the cellulase and hemicellulase enzyme systems of
thermophilic bacteria from the New Zealand geothermal region. We examine individual activities
and relationships of enzymes with activity on cellulose and xylan, and propose genetic mechanisms that
explain the predominant multi-domain architecture.
Diversity between closely related isolates suggests
lateral transfer of blocks of genes between thermophiles in the past.
In recent years, the improvement of long range,
automated, DNA sequencing techniques has revealed
widespread molecular diversity of the genomic organisation of even apparently closely related bacteria
(as judged from SSU rDNA sequence similarity).
These data greatly extend the wealth of information
on the biodiversity of bacteria that have been generated from orthodox enrichment studies and from
SSU rDNA analysis of biomass DNA from various
environments. Many of the examples of diversity
studies have been of bacteria inhabiting extreme environments, and we detail below some of our findings from genetic studies of extreme thermophiles
that revealed the broad biodiversity at the molecular
level of cellulases and hemicellulases from closely
related bacteria. From the ecological point of view,
our molecular studies have shown that in the thermal
environments that we have studied, there is a wide
variety of bacteria that are closely related as judged
by ribosomal small subunit DNA (SSU) analysis
that colonise and thrive in niche environments. Close
analysis of the cellulase and hemicellulase genes
shows that there is a surprising variation within
and between these ostensibly close relatives. Hence
traditional taxonomic tools and molecular ecology
using only SSU analysis may overlook the significant
differences in enzyme content and activities shown by
the hemicellulolytic and cellulolytic thermophiles.
We expect that close examination of other habitats
will reveal similar sequence variations in given genes
amongst the bacterial inhabitants.
2. Cellulolytic and hemicellulolytic thermophiles
Thermophilic Archaea have been described which
are able to hydrolyse complex polysaccharides including starch [2], chitin [3] and xylan [2], and we
have cultured for over two years a stable consortium
of at least two Archaea able to grow on glucomannan
as sole carbon source (unpublished results). To date,
no cellulolytic Archaea have been described despite
many attempts at enrichment. This fact is surprising
because all glycosyl hydrolases show some conservation in their catalytic domains, and hence the existence of cellulolytic Archaea must remain a possibility.
Extremely thermophilic eubacteria are conventionally regarded as those species with a temperature
optimum for growth of greater than 65°C. With
this restriction, until recently, there were only two
known aerobic extreme thermophiles reported as
being cellulolytic and only a few more are hemicellulolytic. This contrasts with a large and growing
number of anaerobic species from similar environments (Table 1 and see Ref. [2] for listings of xylanolytic thermophiles). While Acidothermus cellulolyticus does not meet the criterion of an extreme
thermophile (having an optimum temperature of
only 55°C), it is regarded as an extremophile because
of its pH 5 growth optimum and its ability to grow
at pH 3. Acidothermus is a member of the Actinomycete subphylum which uses cellulose (crystalline
or amorphous) or xylan as sole carbon and energy source for growth. It produces at least three
endoglucanases that are non-cellulosomal. These endoglucanases are all thermostable and one of
them (E1, a member of the group 5 family of glycosyl hydrolases [4]) has been crystallised [5].
Much of the recent work on the properties and applications of these enzymes has been protected by patent.

Table 1
Summary of thermophilic cellulolytic eubacteria

Species                               Temperature   pH       Growth      Growth on  Reference
                                      optimum (°C)  optimum  conditions  cellulose
Acidothermus cellulolyticus           55            5.0      Aerobic     +          [5]
Rhodothermus marinus                  70            7.0      Aerobic     +          [6]
'Caldibacillus cellulovorans'         68            7.0      Aerobic     +          [7]
Thermotoga maritima                   80            7.0      Anaerobic   +          [10]
Thermotoga maritima strain FjSS3.B1   80            7.3      Anaerobic   a          see [2]
Thermotoga neapolitana                80            7.3      Anaerobic   a          see [2]
Strain 177RIB                         78            7.0      Anaerobic   a          [13]
Strain 175CIA                         78            7.0      Anaerobic   a          [13]
Thermoanaerobacter cellulolyticus     75            8.1      Anaerobic   a          see [13]
'Anaerocellum thermophilum'           72-75         7.2      Anaerobic   a          see [13]
Dictyoglomus turgidus                 72            7.1      Anaerobic   a          [15]
Spirochaeta thermophila               70            7.0      Anaerobic   a          [11]
Caldicellulosiruptor saccharolyticus  68-70         7.0      Anaerobic   a          [16]
Caldicellulosiruptor lactoaceticus    68-75         7.0      Anaerobic   +          see [13]
Clostridium stercorarium              65            7.3      Anaerobic   +          [9]
Fervidobacterium islandicum           65            7.2      Anaerobic   +          [17]
Clostridium thermolacticum            65            7.2      Anaerobic   (+)        [18]
Clostridium thermocellum              60            7.0      Anaerobic   +          [11]

+ indicates growth on cellulose as sole carbon source, (+) indicates increased growth on cellulose in the presence of other substrates.
a For these nine entries the table lists eight values in the order +, +, +, +, +, (+), +, +; their assignment to individual species is not recoverable from this copy.
The Cellulase enzymes column lists endoglucanase(s), exoglucanase(s) and cellobiohydrolase entries (and one "-"); their assignment to individual species is likewise not recoverable from this copy.
Rhodothermus marinus, an aerobic thermophile
that was isolated from marine springs, is most
closely related to the Cytophaga-Flexibacter-Bacteroides phylogenetic lineage, and like many organisms
in this group displays versatility in its growth on
polysaccharides. An endocellulase purified from Rhodothermus is one of the most stable cellulases yet
recorded, retaining 50% activity after 3.5 h at
100°C [6]. Although this endoglucanase has no activity on crystalline cellulose, such an activity was demonstrated in culture filtrates of the organism, and
thus Rhodothermus must produce a glycosyl hydrolase active on insoluble substrates.
A third aerobic cellulolytic thermophile was isolated in a survey of New Zealand thermal sites involving artificial composts set up under laboratory
conditions. This organism, designated Caldibacillus
cellulovorans, is an obligately aerobic spore-forming
Bacillus which grows optimally at 70°C at pH 7.0 [7].
It grows on a fully-defined salts medium with cellulose substrates as sole carbon and energy source, but
again, there is no evidence of any cellulosomes on
the cells. Cellulases are excreted into the medium and
are stable for up to a week at 70°C. Amorphous
cellulose is the preferred substrate for growth and
though crystalline cellulose is degraded, lignified
wood is not. The isolate is Gram-type positive and
the spores, which are readily formed when the organism is grown on crystalline substrates, are heat
resistant. The SSU (16S) rRNA gene sequence suggests that Caldibacillus cellulovorans has a close affiliation with the genus Alicyclobacillus. Members of
the genus Alicyclobacillus are characterised by the
presence of alicyclic fatty acids as major components
of their membrane lipids and constituent members of
the genus are acidophilic, non-cellulolytic and hydrogen autotrophs. In contrast, the new isolate is unable
to metabolise hydrogen, is unable to grow below pH
6.0 and is cellulolytic and hemicellulolytic. On the
basis of the genetic and phenotypic differences, the
new isolate probably represents a new genus of thermophilic eubacteria [7]. One endoglucanase has been
purified to homogeneity from this isolate (unpublished results). The enzyme has no activity on substrates other than cellulose and a multi-domain enzyme complex (such as is found in many anaerobic
thermophiles; see below) seems unlikely.
Thermophilic anaerobic eubacteria are among the
deepest-rooted phylogenetic lineages in the eubacterial line of descent and comprise the most diverse array of
cellulolytic isolates. Historically, they are important
because one of the first reported thermophiles was an
anaerobic spore-forming cellulolytic organism, possibly Clostridium thermocellum. C. thermocellum was
reisolated and formally described and remains one
of the most completely investigated thermophilic cellulolytic bacteria. The optimum growth temperature
of C. thermocellum is only 55-60°C and so it is only
moderately thermophilic. The cellulosome complex
of C. thermocellum contains endoglucanases (at least
14 different proteins), cellobiohydrolases and xylanases, all anchored to the cellulosome-integrating protein (CIP). The cellulase system of C. thermocellum
has been extensively reviewed [8].
Cellulosome structures are found in several species
of mesophilic anaerobic bacteria but only in C. thermocellum and Clostridium stercorarium amongst the
thermophiles, and no other known thermophile has
as well-developed aggregating enzyme systems as
these two organisms. The cellulosome of C. stercorarium is less well developed than that of C. thermocellum with a lower complement of enzymes and a
less pronounced 'yellow complex' which is presumably the CIP protein. Avicelases of Clostridium stercorarium have been the focus of study by Bronnenmeier's group and, as with many cellulase enzymes, a
high degree of synergism in activity is evident on
crystalline cellulose substrate [9]. The genus Thermotoga is one of the most deep-rooted phylogenetic
lineages in the Bacterial domain, and is also among
the most thermophilic, with growth up to 90°C. Not
all species of Thermotoga show good growth with
cellulose as carbon source, but in the type strain T.
maritima, the complete spectrum of enzymes necessary for growth on crystalline cellulose has been
demonstrated [10].
Perhaps in contrast to other phylogenetic groups
where cellulolytic species are more commonly mesophilic, Spirochaeta thermophila strain Rt19B.1 is the
most thermophilic representative of the family Spirochaetales [11] and the only reported cellulolytic
member. It can hydrolyse amorphous and crystalline
forms of cellulose, and xylan and cellulose can be
used as sole carbon source for growth. Again, no
cellulosome-type structure has been observed on
Rt19.B1 and these motile organisms do not appear
to attach to cellulose particles. Presumably, cellulases
are secreted into the medium but at this stage no
characterisation of the enzyme(s) has been undertaken.
Rainey demonstrated the diversity of cellulolytic
isolates from high temperature, neutral pH environments in a phenotypic and phylogenetic study of
thermophilic anaerobic isolates. At least five distinct
phenotypic groupings were recognised and these
were partly supported by phylogenetic analysis of
representative strains by SSU rRNA gene sequencing
[12]. The great majority of isolates (which had been
enriched on cellulose as sole carbon source for
growth) were also xylanolytic and mannanolytic. Isolates for this study were obtained from hot springs
well-distributed over the globe; it would appear that
these organisms are well dispersed and common inhabitants of most neutral environments in the 60-80°C
temperature range. In addition, Bredholt et
al. [13] have isolated an even greater diversity of
thermophilic anaerobes from Icelandic springs, including several with an optimum temperature for
growth of 78°C.
A representative strain from one of the New Zealand groups was further characterised and formally
described as the species Caldicellulosiruptor saccharolyticus [14]. The genetics and properties of the cellulase and hemicellulase enzymes of this organism have
been extensively investigated. Attempts at purifying
the component enzymes for cellulose and hemicellulose utilisation proved to be confusing and largely
fruitless. The heterogeneity of the substrate, the multi-domain and multi-catalytic nature of many hydrolytic enzymes, the possibility of glycosylation and the
large cooperative effects of minor contaminating activities all serve to produce a confusing analysis. The
preferred approach is to clone the genes encoding the
enzymes in non-cellulolytic hosts and study the activity of single pure enzymes or that of known, pure
mixes. This approach has been used successfully with
Cs. saccharolyticus and provides a detailed picture of
the organisation of enzyme activities and clustering
of genes in this thermophile.
3. Molecular diversity of cellulases from
Caldicellulosiruptor strains
Many bacteria have been reported to carry a multiplicity of genes for cellulases and hemicellulases
[19]. The extreme thermophile Caldicellulosiruptor
saccharolyticus is unusual in possessing a multifunctional, multidomain organisation for the majority of
its β-glycanases [20]. Other Caldicellulosiruptor
strains (as determined by SSU rDNA sequence-based
phylogeny) also have a number of multidomain enzymes that encode xylanases or mannanases as well
as cellulases, and these are distinguished from the
single catalytic domain xylanases of Family 10.
Fig. 1 is a diagrammatic representation of the three
gene clusters of cellulases from Caldicellulosiruptor
sp. strain Tok7B.1 (unpublished data). The three
clusters are not closely linked, and each one is different in its organisation from any gene cluster described for Cs. saccharolyticus [20,21]. The catalytic
domains of the enzymes belong to a limited number
of families as determined by hydrophobic cluster
analysis (Families 5, 9, 10, 43, 44, and 48; [4]) and
unlike Cs. saccharolyticus, there are no genes coding
for multidomain enzymes containing a Family 5 β-mannanase domain. The cellulose binding domains
of these enzymes from Caldicellulosiruptor Tok7B.1
are of either type II or III [22] with a single exception
of an otherwise unclassified cellulose binding domain
(CBD) associated with the α-arabinosidase domain
of celA.
Fig. 1. Overall architecture of the three cellulase gene clusters sequenced from Caldicellulosiruptor Tok7B.1. The shaded line
shows where complete sequence information is available. A stylised representation of the gene products is provided below the
shaded line. Some restriction sites are named and λZAP recombinant boundaries indicated, e.g. W2-4.
Fig. 2. Genetic organisation of the xylanase gene cluster from Caldicellulosiruptor saccharolyticus. This region of DNA has been sequenced
completely on both strands. A stylised representation of the gene products is provided below the shaded line, in examples where the enzymatic activity has been identified. X: β-xylanase; Xy: β-xylosidase; A: α-arabinosidase; E: acetyl xylan esterase; F: xylanase pseudogene;
P: exoxylanase (?). The unlinked β-xylanase gene xynI is not shown.
4. Xylanases from Caldicellulosiruptor strains
Enzymes involved in the metabolism of plant carbohydrate polymers have been grouped into 35 different families on the basis of primary and tertiary
sequence homologies [4]. The endo-1,4-β-D-xylanases
comprise Families 10 and 11. The only similarity
between members of these two families is their ability
to hydrolyze the acetyl-methylglucuronoxylans of
hardwoods and arabinomethylxylans of softwoods,
but they are unrelated biochemically and structurally. Like most other cellulolytic and hemicellulolytic
enzymes, xylanases are highly modular in structure
and may be composed of either a single domain or a
number of distinct domains broadly classified as catalytic or non-catalytic. Linker peptides typically delineate the individual domains of multidomain enzymes into discrete and functionally-independent
units. The catalytic domain of a xylanase determines
the hydrolytic activity and hence governs the classification of the enzyme as belonging to Family 10 or
11.
Most of the genes coding for enzymes involved in
xylan degradation in Cs. saccharolyticus are found in
a large gene cluster. A total of 10 open reading
frames were found in this cluster, seven of which
were upstream of xynA (see Fig. 2). Three of the
ORFs were identified with enzymes involved in xylan
degradation: xynA, a xylanase; xynB, a β-xylosidase;
and xynC, an acetylxylan esterase. The cluster also contains a non-functional gene that has Family 10 xylanase homology.
XynE is a multi-domain enzyme with xylanase activity. XynF appears to have five domains which may
have resulted from a gene fusion: two domains comprise an arabinofuranosidase and two more a xylanase (domains 1+2 and domains 4+5) (Te'o, PhD
thesis, 1996; [23]). Close by are a number of other
genes whose functions have been inferred from homology comparisons. They seem to be part of a major gene cluster that is involved in the metabolism of
xylose and other sugars in this organism.
Surprisingly, the xylanase gene organisation in this
region of the genome of Cs. saccharolyticus is quite
different from that of its close relative Caldicellulosiruptor sp. strain Rt8B.4 [25]. No multigene xylanase or cellulase/hemicellulase gene clusters were
present and other xylanases were not found in the
expression gene library of this organism despite extensive screening of genomic λZAPII gene libraries.
The size of the multidomain cellulases makes it hard
to isolate complete, active genes using this vector.
The gene xynI, which was isolated after PCR amplification from the Cs. saccharolyticus genome using
consensus primers [24], is part of a genetic organisation which is very similar to that of the xynA gene
cluster of Caldicellulosiruptor sp. strain Rt8B.4 [25].
5. Molecular diversity of Thermotoga and
Caldicellulosiruptor multidomain xylanases
There is a subfamily of hyper- and extremely-thermophilic enzymes within the Family 10 xylanases
which we call here the `TSD-IX' subfamily.
There is substantial homology within the non-catalytic domains of the enzymes from Thermotoga maritima
XynA [26]; Thermotoga sp. FjSS3B.1 XynB and XynC [27];
Cellulomonas fimi XynC [28]; Caldicellulosiruptor
strain Rt8.B4 XynA [25]; Clostridium thermocellum
F1 XynC [29]; and Thermoanaerobacterium saccharolyticum XynA [30].
The structure of these enzymes is based on the domain arrangement: TSD-TSD-Family 10 xylanase-CBD IX-CBD IX (TSD = thermostabilising domain
[31]; CBD IX = cellulose binding domain type IX [22]; additional domains at the C-terminus are
present in some of the enzymes of this subfamily, for
example, see Fig. 3). The molecular architecture of
these genes is quite unlike those of the enzymes
from Cs. saccharolyticus. However, the significance
of cellulose-binding and thermostabilising domains
is unclear. XynA (Family 10) from Dictyoglomus
thermophilum Rt46B.1 has neither and is not only
more thermostable than XynB (a Family 11 xylanase
with a C-terminal non-catalytic domain from the
same organism) but it is also able to bind to xylan
and release reducing sugars [41]. These observations
place doubt on the true role of the `thermostabilising' domains. Their removal certainly reduces the
thermal stability of the catalytic domain [26], but
this may be an incidental effect resulting from the
fact that the enzyme cannot fold properly or make
the appropriate thermostabilising interactions. For
example, molecular modeling and enzyme characterisation has shown that with Dictyoglomus thermophilum Rt46B.1 XynB, removal of even a few amino
acid residues from the N-terminus creates thermal
instability which appears to arise from the creation
of a disorganised N-terminal region [32]. Similar enzymatic and thermostability characteristics were seen
for both the complete, multidomain XynB and with
the catalytic domain expressed alone which suggests
that the non-catalytic domains have other, as yet
undiscovered, functions [32]. Furthermore, in the
case of the Caldicellulosiruptor strain Rt8.B4 XynA,
there is no difference in thermostability between recombinant enzymes with and without the putative
N-terminal 'thermostabilising' domain (Griffiths
and Bergquist, unpublished results).
The nucleotide sequences of over 80 Family 10 and 11 xylanase genes have now been deposited in
the GenBank and EMBL databases. These xylanase genes were identified from gene libraries which were
screened for either hybridisation to labeled gene probes, or more commonly, the expression of
endoxylanase activity. An alternative approach for identifying novel Family 10 and 11 xylanase
genes is to use the polymerase chain reaction (PCR) in conjunction with broad-specificity xylanase
consensus primers that are designed from the overall consensus of the most highly conserved regions
of Family 10 and Family 11 xylanase genes. Because this approach is PCR-based, it is highly
sensitive, and offers an expedient means for the identification of xylanase genes directly from
genomic DNAs without having to make and screen gene libraries. Furthermore, by using
genomic-walking PCR, xylanase gene(s) identified with the consensus PCR step can be subsequently
sequenced in a quick and relatively straightforward manner [24].

Fig. 3. Architectural and sequence homologies between the Caldicellulosiruptor strain Rt69B.1 Family 10 (XynA, XynB and XynC) and
Family 11 (XynD) xylanases. From top to bottom, showing Thermoanaerobacterium saccharolyticum XynA; Thermotoga maritima XynA;
Rt69B.1 XynB; Rt69B.1 XynA; Rt69B.1 XynC; Caldicellulosiruptor saccharolyticus CelB; Cs. saccharolyticus XynF; Bacillus polymyxa
XynD; Rt69B.1 XynD and Dictyoglomus thermophilum XynB. Key: TSD: thermostabilising domain; CBD IX: Family IX cellulose binding domain; ?: domain of unknown function; E: endoglucanase domain (truncated in figure due to size constraints); Family 43: family
43 β-glycanase domain (reported xylosidase/arabinofuranosidase activities); CBD IV: Family VI CBD; XBD ?: possible xylan-binding domain. Repeated SLH (S-layer homology) domains are indicated by white arrowheads, whilst the interdomain linker peptides are indicated
by black.
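As an illustration only, the consensus-primer screening step described above can be sketched in Python. The sketch expands IUPAC degeneracy codes into a regular expression and reports the template spans that a forward and reverse primer pair could, in principle, amplify. The template and primer sequences are invented for the example (they are not the consensus primers of Ref. [24]) and the function names are hypothetical. Real primer binding tolerates some mismatches and depends on annealing conditions, which this exact-match sketch ignores.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table that also swaps ambiguity codes (R<->Y, K<->M, B<->V, D<->H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Translate a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def reverse_complement(seq):
    """Reverse-complement a sequence, preserving IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def predicted_amplicons(template, forward, reverse):
    """Return (start, end) top-strand spans bounded by the two primer sites.

    The forward primer is matched on the top strand. The reverse primer binds
    the bottom strand, so its reverse complement is searched on the top strand.
    Only exact, non-overlapping matches are considered.
    """
    template = template.upper()
    starts = [m.start() for m in primer_to_regex(forward).finditer(template)]
    rev_site = primer_to_regex(reverse_complement(reverse))
    ends = [m.end() for m in rev_site.finditer(template)]
    min_len = len(forward) + len(reverse)
    return [(s, e) for s in starts for e in ends if e - s >= min_len]


# Toy template and primers, invented for illustration (assumption).
TEMPLATE = "TTGACCTGGAACGAAATGAAACCGGGCAGAGTCACTTCC"
FORWARD = "TGGAAYGA"  # Y = C or T
REVERSE = "TGACTYTG"  # written 5'->3'; binds the bottom strand

print(predicted_amplicons(TEMPLATE, FORWARD, REVERSE))  # [(6, 34)]
```

Each returned pair gives the 0-based start and exclusive end of a candidate product on the top strand. Such a fragment would then serve as the starting point for the genomic-walking step.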
We have used this two-step approach to examine
the complement of xylanase genes in Caldicellulosiruptor sp. strain Rt69B.1. This organism is closely
related to Caldicellulosiruptor strain Rt8B.4 and Caldicellulosiruptor saccharolyticus as inferred from SSU
rRNA gene sequence-based phylogeny [33]. Rt69B.1
has three Family 10 xylanases: XynA, XynB and
XynC (Fig. 3). They are members of the TSD-IX
subfamily (see above) and are related in structure
to the XynA Family 10 xylanase from Thermotoga
maritima [26]. The N-terminal regions of Rt69B.1
XynA, XynB and XynC are architecturally identical
and are related in sequence. However, there are sufficient sequence differences between the TSDs that
the N- and C-terminal-most TSDs from each xylanase can be segregated into distinct subfamilies, with
the exception of the second TSD from Rt69B.1
XynA.
There is microdiversity at the level of the individual genes. The xylanase domains of Rt69B.1 XynA,
XynB and XynC are very closely related (approximately 60% identity in each case). The XynA and
XynC Family 10 xylanase domains are 329 residues
in length, whilst the XynB domain is slightly longer
at 340 residues. The length variations within the
Rt69B.1 XynA, XynB and XynC xylanase domains
can be mapped to several of the variable loop regions which partition the alternating beta-strand
and alpha-helix motifs [34]. Downstream from the
catalytic domains there is considerable diversity between the genes, which encode different families of
CBDs (genes xynB and xynC), and xynC has a further C-terminal catalytic domain, encompassing the
Family 43 β-glycanase domain and CBD IV, which is
highly homologous (89%) to the C-terminal domains
of Cs. saccharolyticus XynF (Fig. 2, [23]). Surprisingly, the C-terminal region of XynC is also homologous to the two N-terminal domains of the Bacillus
polymyxa XynD β-glycanase, which is composed of
an N-terminal Family 43 arabinofuranosidase/xylosidase domain, a central CBD IV, and an additional C-terminal domain [35].
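Percentage identity figures such as those quoted above are derived from pairwise alignments. The following is a minimal sketch of one common way to compute such a value. It assumes that the two sequences have already been aligned, that gaps are written as "-", and that columns containing a gap are excluded from the denominator. Other conventions (for example, dividing by the full alignment length) give different values, and the text does not state which convention was used. The peptide strings are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (assumption).
SEQ_A = "MKV-LTGAW"
SEQ_B = "MRVALTG-W"

print(round(percent_identity(SEQ_A, SEQ_B), 1))  # 85.7
```

In the toy pair, seven columns are free of gaps and six of them match, which gives 6/7, or about 85.7%.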
The Family 11 xylanase gene from Rt69B.1 has
high overall homology with the xynB gene from Dictyoglomus thermophilum Rt46B.1 and the binding
domain with a related structure at the C-terminus
of Bacillus polymyxa XynD [35]. There appears to
be only a single copy of the Family 11 gene unlike
its Family 10 counterparts.
6. Origins of molecular diversity
What is the explanation for the diversity in gene
structure found in homologous genes in closely related bacteria? It has been generally assumed that
the β-glycanases have evolved by domain shuffling
[36] although exact mechanisms have not been described. Linkers are often found in xylanases and
cellulases [37] which are thought to function as flexible hinges between the catalytic and substrate binding domains. The DNA encoding the repeated linkers may have a role analogous to that of introns,
enabling sequences that encode discrete domains to
be excised and fused to other genes, thus generating
novel hybrid enzymes [19]. Another possibility that
can be proposed is that, following duplication of
cellulase genes by DNA replication, a recombinational event similar to that postulated for the origin
of multiple tRNA genes (`unequal crossing-over',
[38]) or intragenic recombination [39] could give
rise to the genes coding for multidomain enzymes
seen in the genomes of most cellulolytic bacteria.
Furthermore, there are superficial similarities between the organisations of the CBDs. A simple example that could be attributed to intragenic recombination is shown by what appears to be an inverted
orientation for the CBDs of CelA from Caldicellulosiruptor Tok7B.1 in comparison to the other related
proteins, which may be explained by the occurrence
of two intragenic cross-overs in the PT-linker regions
as outlined in Fig. 1. However, although this is a
superficially plausible model, alignment of the amino
acid sequences of the CBDs and a dendrogram of
their relationships suggest that none of the CBD
arrangements was the immediate precursor of the
inverted CelA structure (Fig. 4).
Fig. 4. Hypothetical intragenic recombination events that gave rise to the architecture of the celA gene in Caldicellulosiruptor
Tok7B.1 (see Fig. 1). It is postulated that two crossover events occurred in the DNA coding for the linker sequences. Other domain
sequence orders can be generated by shifting the sites of the crossovers.

In view of the high degree of sequence homology
between the Family 10 xylanase domains of Caldicellulosiruptor Rt69B.1 XynA, XynB and XynC, and
the similarities in the N-terminal architectures of
these enzymes, it appears that the genes encoding the
enzymes have arisen by duplication of an ancestral
xylanase gene which, in its primitive form, encoded a
Family 10 xylanase domain with duplicated N-terminal TSDs. The differences within the C-terminal regions of XynA, XynB and XynC are also presumably the result of the 'domain-shuffling' mechanisms
central to the evolution of microbial glycosyl-hydrolase genes [40]. The sequence and architectural homologies observed between the xylanases from
Rt69B.1 and assorted β-glycanases from other cellulolytic bacterial strains provide a remarkable example of these domain-shuffling processes. For example,
at least four distinct gene segments can be identified
within the Rt69B.1 xynC gene based upon isolated
homologies between XynC and the CelB and XynF
enzymes from Cs. saccharolyticus (Fig. 3). Similarly,
the two C-terminal domains of Rt69B.1 XynC can
combine with the C-terminal domain of Rt69B.1
XynD to form an enzyme of identical architecture
to Bacillus polymyxa XynD (xylosidase/arabinofuranosidase). It is noteworthy that the junction of the B.
polymyxa XynD peptide sequence signifying the end
of homology to Rt69B.1 XynC and the commencement of homology to Rt69B.1 XynD is continuous
with sequences neither added or lost. This observation suggests that the C-terminal domains of
Rt69B.1 XynC and XynD may have arisen through
the recombinational joining of an ancestral B. polymyxa xynD-like gene (see Fig. 3).
Caldicellulosiruptor and its close relatives, with
their unique array of multifunctional enzymes whose
catalytic domains carry out related activities in
the hydrolysis of insoluble substrates, may represent
a persistent evolutionary experiment which developed before the organisation of genes into operons
in other lines of bacteria. A cluster of genes in the
same orientation is frequently part of an operon and
is regulated by transcription from a single promoter.
An alternative regulatory mechanism would be to
fuse the genes encoding the hydrolytic enzymes to
result in the production of a multifunctional protein
on transcription. Multifunctional enzymes guarantee
equivalent transcription and translation of the related enzyme activities and the binding domain(s)
ensure co-ordinate action at the same site on the
substrate.
Further evidence for the molecular diversity of the
β-glycanase genes that we have studied is provided
by the discovery of non-functional gene copies that
appear to be evolutionary remnants of the gene-shuffling process. In the case of Orf3/4 of Cs. saccharolyticus (Te'o, PhD thesis, 1996; and Ref. [23]) and
xynC of Thermotoga FjSS3.B.1 [27], and in other
examples we have not described here, we have found
evidence for non-functional genes (pseudogenes) in
the genomes of thermophilic bacteria. Pseudogenes
have been reported from higher eucaryotes but their
occurrence in procaryotes is unusual. It appears that
we have found these sequences because
our two-step PCR approach does not rely on the
expression of genes within libraries, and perhaps
the lack of known procaryotic pseudogenes is an
artifact of the commonly used methods of gene isolation.
Presumably, a pseudogene could arise only because of the presence of multiple copies of the
gene. It has been proposed that domains involved
with carbohydrate metabolism have evolved through
the duplication, and subsequent modification, of
progenitor sequences; the acquisition of new catalytic specificities and the optimisation of existing specificities have presumably come about through the
processes of divergent and convergent evolution. An inevitable consequence of such evolutionary mechanisms
would be the accumulation of pseudogenes from
non-productive gene rearrangements. While genes
without functions would not be expected to persist,
it is reasonable to expect that some of these pseudogenes would be retained in the genomes of saccharolytic organisms if they were closely linked to metabolically important genes. Ultimately, it may be
found that β-glycanase pseudogenes are quite widely
distributed, especially in those organisms containing
gene clusters. However, their identification may require a systematic examination of the entire β-glycanase system of an organism, as any pseudogenes
would be overlooked by standard techniques used
for plate assays of genomic expression libraries.
7. Conclusions
The information on molecular diversity of thermophilic bacteria discussed above has been derived entirely from culturable bacteria that grow as pure
strains. It is now widely acknowledged that traditional enrichment strategies produce a subset of organisms that represent only a portion of those found
in natural environments [41]. With the development of PCR, it is possible to bypass
enrichment completely by amplifying SSU rRNA
genes directly from natural environments [42]. This strategy has allowed
the detection of a wide variety of previously unknown organisms
and has demonstrated that our
perceptions of microbial diversity and phylogeny
were inadequate [43]. As a result, most microbiologists agree that less than 1% of the total microbiota
that exists in natural environments has been identified [44]. We have examined the diversity of one
genus, Thermus, both prior to and after standard
enrichment techniques by isolating SSU (16S)
rRNA genes and comparing their sequences [45].
Although Thermus is neither cellulolytic nor hemicellulolytic, it is a convenient experimental model
and we believe that the results are representative of
a broad range of microorganisms that exist in nature. The enrichments resulted in the predominance
of effectively the same single strain from each pool
and there was a complete loss of heterogeneity in the
sequences [45]. From the ecological point of view,
this result must mean that surveys of habitats using
SSU sequences as probes of one sort or another will
find that many of the organisms present have not
been described or are unrelated to known culturable
bacteria whose ribosomal DNA sequences are available
from databanks. Similarly, molecular surveys of
microbial habitats using marker genes such as cellulases may reveal much greater diversity than can be
accounted for by the culturable or taxonomically
identifiable organisms present in the sample.
All of the bacteria we have described have been
isolated by enrichment and grow in pure culture. It is
likely that there is even greater genetic diversity
amongst the cellulases and hemicellulases in unenriched biomass. Our two-step PCR technique could
be used to examine the extent of this biodiversity,
and indeed, we have used genomic walking PCR in
the isolation of a Family 10 xylanase directly from
biomass [24]. While this gene turned out to be derived from a culturable bacterium (Dictyoglomus),
the combination of the technique with rDNA analysis by PCR should allow correlation between the
occurrence of diverse genetic coding sequences and
the presence of new or unusual microorganisms.
Recent developments in genomic sequencing have
influenced considerations of genome organisation in
procaryotes (reviewed in Ref. [47]). One finding from
recent comparative genomics is that there is a lack of
large-scale conservation of gene order and, where it
occurs, it involves only a small number of essential
genes. One conclusion from these studies is that 'in
the evolution of procaryotes, horizontal gene transfer has been common and intense' [47]. Duplication
and divergence of ancestral genes has been proposed
to be the major route for molecular evolution [48].
Accordingly, the limited sequencing and comparison
studies of cellulases and hemicellulases that we have
performed do not allow us to distinguish the manner
in which genes have evolved in related bacteria or
whether the genes have been acquired by horizontal
transfer [49]. Divergent evolution is postulated to
occur after gene duplication, whereas convergent
evolution is postulated to have occurred in parallel. Resolution of the exact evolutionary relationships of
thermophile cellulases and hemicellulases may depend on the conclusions that can be drawn from
large-scale genomic sequencing results.
Acknowledgments
The work from our laboratories has been supported by grants from the Foundation for Research,
Science and Technology, Wellington, New Zealand
and grants from the University of Auckland and
Macquarie University Research Committees.
References
[1] Olsen, G.J., Woese, C.R. and Overbeek, R. (1994) The winds of (evolutionary) change: Breathing new life into microbiology. J. Bacteriol. 176, 1–6.
[2] Sunna, A., Moracci, M., Rossi, M. and Antranikian, G. (1997) Glycosyl hydrolases from hyperthermophiles. Extremophiles 1, 2–13.
[3] Huber, R., Stohr, J., Hohenhaus, S., Rachel, R., Burggraf, S., Jannasch, H.W. and Stetter, K.O. (1995) Thermococcus chitinophagus sp. nov., a novel, chitin-degrading, hyperthermophilic archaeum from a deep-sea hydrothermal vent environment. Arch. Microbiol. 164, 255–264.
[4] Henrissat, B. and Bairoch, A. (1995) New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 293, 781–788.
[5] Sakon, J., Adney, W.S., Himmel, M.E., Thomas, S.R. and Karplus, P.A. (1996) Crystal structure of thermostable family 5 endocellulase E1 from Acidothermus cellulolyticus in complex with cellotetraose. Biochemistry 35, 10648–10660.
[6] Hreggvidsson, G.O., Kaiste, E., Holst, O., Eggertsson, G., Palsdottir, A. and Kristjansson, J.K. (1996) An extremely thermostable cellulase from the thermophilic eubacterium Rhodothermus marinus. Appl. Environ. Microbiol. 62, 3047–3049.
[7] Huang, X.P., Hudson, J.A., Rainey, F.A., Nichols, P.D. and Morgan, H.W. (1998) Isolation and characterization of Caldibacillus cellulovorans gen. nov., sp. nov., an extremely thermophilic, cellulolytic bacterium. Int. J. Syst. Bacteriol. (in press).
[8] Felix, C.R. and Ljungdahl, L.G. (1993) The cellulosome: the exocellular organelle of Clostridium. Annu. Rev. Microbiol. 47, 791–819.
[9] Riedel, K., Ritter, J. and Bronnenmeier, K. (1997) Synergistic interaction of the Clostridium stercorarium avicelase-I (CelZ) and avicelase-II (CelY) in the degradation of crystalline cellulose. FEMS Microbiol. Lett. 147, 239–243.
[10] Liebl, W., Ruile, P., Bronnenmeier, K., Riedel, K., Lottspeich, F. and Greif, I. (1996) Analysis of a Thermotoga maritima DNA fragment encoding two similar thermostable cellulases, CelA and CelB, and characterization of the recombinant enzymes. Microbiology 142, 2533–2542.
[11] Aksenova, H., Rainey, F.A., Janssen, P.H., Morgan, H.W. and Zavarzin, G.A. (1992) Spirochaeta thermophila sp. nov., an obligately anaerobic polysaccharolytic member of the genus Spirochaeta. Int. J. Syst. Bacteriol. 42, 175–177.
[12] Rainey, F.A., Ward, N.L., Morgan, H.W., Toalster, R. and Stackebrandt, E. (1993) Phylogenetic analysis of anaerobic thermophilic bacteria: Aid for their reclassification. J. Bacteriol. 175, 4772–4779.
[13] Bredholt, S., Mathrani, I.M. and Ahring, B.K. (1995) Extremely thermophilic cellulolytic anaerobes from Icelandic hot springs. Antonie van Leeuwenhoek 68, 263–271.
[14] Rainey, F.A., Donnison, A.M., Janssen, P.H., Saul, D., Rodrigo, A., Bergquist, P.L., Daniel, R.M., Stackebrandt, E. and Morgan, H.W. (1994) Description of Caldicellulosiruptor saccharolyticus gen. nov., sp. nov.: An obligately anaerobic, extremely thermophilic, cellulolytic bacterium. FEMS Microbiol. Lett. 120, 263–266.
[15] Svetlichny, V.A. and Svetlichnya, T.P. (1988) Dictyoglomus turgidus, sp. nov., a new extreme thermophilic eubacterium isolated from hot springs in the Uzon Volcano Crater. Mikrobiologiya 57, 435–441.
[16] Te'o, V.S.J., Saul, D.J. and Bergquist, P.L. (1995) celA, another gene coding for a multidomain cellulase from the extreme thermophile 'Caldocellum saccharolyticum'. Appl. Microbiol. Biotechnol. 43, 291–296.
[17] Huber, R., Woese, C.R., Langworthy, T.A., Kristjansson, J.K. and Stetter, K.O. (1990) Fervidobacterium islandicum sp. nov., a new extremely thermophilic eubacterium belonging to the 'Thermotogales'. Arch. Microbiol. 154, 105–111.
[18] Le Ruyet, P., Dubourguier, H.C., Albagnac, G. and Prensier, G. (1985) Characterization of Clostridium thermolacticum sp. nov., a hydrolytic thermophilic anaerobe producing high amounts of lactate. Syst. Appl. Microbiol. 6, 196–202.
[19] Gilbert, H.J. and Hazlewood, G.P. (1993) Bacterial cellulases and xylanases. J. Gen. Microbiol. 139, 187–194.
[20] Bergquist, P.L., Gibbs, M.D., Saul, D.J., Te'o, V.S.J., Dwivedi, P.P. and Morris, D. (1993) Molecular genetics of thermophilic bacterial genes coding for enzymes involved in cellulose and hemicellulose degradation. In: Genetics, Biochemistry and Ecology in Biodegradation of Lignocellulose (Shimada, K., Hoshino, S., Ohmiya, K., Sakka, K., Kobayashi, Y. and Karita, S., Eds.), pp. 276–285. Uni Publishers, Tokyo, Japan.
[21] Gibbs, M.D., Reeves, R.A., Farrington, K.G., Williams, D.P. and Bergquist, P.L. (1998) Multidomain and multifunctional cellulase genes from the extreme thermophile Caldicellulosiruptor isolate Tok7B.1. Appl. Environ. Microbiol., submitted.
[22] Tomme, P., Warren, R.A.J., Miller, R.C., Kilburn, D.G. and Gilkes, N.R. (1995) Cellulose-binding domains – classification and properties. In: The Enzymatic Degradation of Insoluble Polysaccharides (Saddler, J.N. and Penner, M.H., Eds.), pp. 142–161. American Chemical Society Symposium Series 618.
[23] Te'o, V.S.J., Gibbs, M.D., Saul, D.J. and Bergquist, P.L. (1998) A cluster of genes involved in xylan degradation cloned from the extreme thermophile Caldicellulosiruptor saccharolyticus. Extremophiles, submitted.
[24] Bergquist, P.L., Gibbs, M.D., Saul, D.J., Reeves, R.A., Morris, D.D. and Te'o, V.S.J. (1998) Isolation and expression of genes for hemicellulases from extremely thermophilic culturable and unculturable bacteria. In: Enzyme Applications in Fiber Processing (Eriksson, K.-E. and Cavaco-Paulo, A., Eds.). American Chemical Society Symposium Series 653, in press.
[25] Dwivedi, P.P., Gibbs, M.D., Saul, D.J. and Bergquist, P.L. (1996) Cloning, sequencing and over-expression in Escherichia coli of a xylanase gene, xynA from the thermophilic bacterium Caldicellulosiruptor Rt8B.4. Appl. Microbiol. Biotechnol. 45, 86–93.
[26] Winterhalter, C., Heinrich, P., Candussio, A., Wich, G. and Liebl, W. (1995) Identification of a novel cellulose-binding domain within the multidomain 120 kDa xylanase XynA of the hyperthermophilic bacterium Thermotoga maritima. Mol. Microbiol. 15, 431–444.
[27] Reeves, R.A., Saul, D.J., Morris, D.D., Gibbs, M.D. and Bergquist, P.L. (1998) Sequences and expression of further xylanase genes from the hyperthermophile Thermotoga sp. strain FjSS3-B.1. J. Bacteriol., submitted.
[28] Clarke, J., Davidson, K., Gilbert, H.J., Fontes, C.M.G.A. and Hazlewood, G.P. (1996) A modular xylanase from mesophilic Cellulomonas fimi contains the same cellulose-binding domain and thermostabilising domain as xylanases from thermophilic bacteria. FEMS Microbiol. Lett. 139, 27–35.
[29] Hayashi, H., Takagi, K.-I., Fukumura, M., Kimura, T., Karita, S., Sakka, K. and Ohmiya, K. (1997) Sequence of xynC and properties of XynC, a major component of the Clostridium thermocellum cellulosome. J. Bacteriol. 179, 4246–4253.
[30] Lee, Y., Lowe, S.E. and Zeikus, G.J. (1993) Gene cloning, sequencing and biochemical characterisation of endoxylanase from Thermoanaerobacterium saccharolyticum B6A-RI. Appl. Environ. Microbiol. 59, 3134–3137.
[31] Fontes, C.M., Hazlewood, G.P., Morag, E., Hall, J., Hirst, B.H. and Gilbert, H.J. (1995) Evidence for a general role for non-catalytic thermostabilising domains in a xylanase from thermophilic bacteria. Biochem. J. 307, 151–158.
[32] Morris, D.D., Gibbs, M.D., Chin, C.J.W., Koh, M.-H., Wong, K.K.Y., Allison, R.W., Nelson, P.J. and Bergquist, P.L. (1998) Cloning of the xynB gene from Dictyoglomus thermophilum strain Rt46B.1 and characterization of the gene product on kraft pulp. Appl. Environ. Microbiol. 64.
[33] Morris, D.D., Gibbs, M.D., Ford, M., Thomas, J. and Bergquist, P.L. (1998) Family 10 and 11 xylanase genes from Caldicellulosiruptor sp. Rt69B.1. Extremophiles, submitted.
[34] White, A., Withers, S.G., Gilkes, N.R. and Rose, D.R. (1994) Crystal structure of the catalytic domain of the β-1,4-glycanase Cex from Cellulomonas fimi. Biochemistry 33, 12546–12552.
[35] Gosalbes, M.J., Perez-Gonzalez, J.A., Gonzalez, R. and Navarro, A. (1991) Two beta-glycanase genes are clustered in Bacillus polymyxa: molecular cloning, expression and sequence analysis of genes encoding a xylanase and an endo-beta-(1,3)-(1,4)-glucanase. J. Bacteriol. 173, 7705–7710.
[36] West, C.A., Elzanowski, A., Yeh, L.-S. and Barker, W.C. (1989) Homologues of catalytic domains of Cellulomonas glucanases found in fungal and Bacillus glycosidases. FEMS Microbiol. Lett. 59, 167–172.
[37] Ferreira, L.M.A., Durrant, A.J., Hall, J., Hazlewood, G.P. and Gilbert, H.J. (1990) Spatial separation of protein domains is not necessary for catalytic activity or substrate binding in a xylanase. Biochem. J. 269, 261–264.
[38] Smith, J.D., Barnett, L., Brenner, S. and Russell, R.L. (1970) More mutant tyrosine transfer ribonucleic acids. J. Mol. Biol. 54, 1–14.
[39] Cooper, V.J.C. and Salmond, G.P.C. (1993) Molecular analysis of the major cellulase (CelV) of Erwinia carotovora: evidence for an evolutionary 'mix-and-match' of enzyme domains. Mol. Gen. Genet. 241, 341–350.
[40] Gilkes, N.R., Henrissat, B., Kilburn, D.G., Miller, R.C. and Warren, R.A.J. (1991) Domains in microbial β-1,4-glycanases: Sequence conservation, function, and enzyme families. Microbiol. Rev. 55, 2303–2315.
[41] Risatti, J.B., Capman, W.C. and Stahl, D.A. (1994) Community structure of a microbial mat: the phylogenetic dimension. Proc. Natl. Acad. Sci. USA 91, 10173–10177.
[42] Pace, N.R., Stahl, D.A., Lane, D.J. and Olsen, G.J. (1986) The analysis of natural microbial populations by ribosomal sequences. Adv. Microb. Ecol. 9, 1–55.
[43] Woese, C.R. (1994) Microbiology in transition. Proc. Natl. Acad. Sci. USA 91, 1601–1603.
[44] Amann, R.I., Ludwig, W. and Schleifer, K.-H. (1995) Phylogenetic identification and in situ detection of individual cells without cultivation. Microbiol. Rev. 59, 143–169.
[45] Saul, D.J., Reeves, R.A., Morgan, H.W. and Bergquist, P.L. (1998) Thermus diversity and strain loss during enrichment. FEMS Microbiol. Ecol., accepted.
[46] Koonin, E.V. and Galperin, M.Y. (1997) Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Curr. Opin. Genet. Dev. 7, 757–763.
[47] Ohno, S. (1970) Evolution by Gene Duplication. Springer, New York, NY.
[48] Fitch, W.M. (1970) Distinguishing homologous from analogous proteins. Syst. Zool. 19, 99–113.