Opinion
Lateral gene transfer challenges
principles of microbial systematics
Eric Bapteste1 and Yan Boucher2
1 UPMC UMR 7138, 7 quai Saint-Bernard, Bâtiment A, 4ème étage, 75005, Paris, France
2 Department of Civil and Environmental Engineering, MIT, Building 48–305, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Evolutionists strive to learn about the natural historical
process that gave rise to various taxa, while also
attempting to classify them efficiently and make generalizations about them. The quantitative importance of
lateral gene transfer inferred from genomic data,
although well acknowledged by microbiologists, is in
conflict with the conceptual foundations of the
traditional phylogenetic system erected to achieve these
goals. To provide a true account of microbial evolution,
we suggest developing an alternative conception of
natural groups and introduce a new notion – the composite evolutionary unit. Furthermore, we argue that a
comprehensive database containing overlapping taxonomical groups would constitute a step forward regarding the classification of microbes in the presence of
lateral gene transfer.
Introduction
The molecular phylogenetics project conceived by Zuckerkandl and Pauling [1] in the 1960s was ambitious. Among
other revolutionary accomplishments, molecular phylogenetics was expected to function as a powerful time machine,
enabling the identification of genetic, ultrastructural and
metabolic features of ancient life forms for which no fossils
had been left [2]. Through their congruence (i.e. the agreement between phylogenies obtained using different datasets) [2], genes could help to reconstruct what is often called
the Tree of Life (TOL). To understand ancient microbial
evolution, the biggest challenges have been seen as mostly
methodological – improving phylogenetic algorithms so that they accurately model the complex evolution of molecules [3] and
sequencing a sufficient number of phylogenetic markers [4].
Using a wealth of methods and data, TOLs flourished [5,6].
Yet, over the past 15 years, lateral inheritance (as opposed to
vertical descent) was discovered to be a major evolutionary
force in microorganisms [7–11]. For archaea, bacteria and
some unicellular eukaryotes, individual gene histories can
legitimately differ from species history, and the two phylogenetic patterns (species trees and gene trees) do not have to
show much identity with one another on a broad evolutionary scale. Microbial physiologists and geneticists were not
surprised by the fact that a single genome could comprise
genes arising from multiple phylogenetic sources, yet it
conflicted with the conceptual foundations of the phylogenetic system.
Corresponding author: Bapteste, E.
As a result, the traditional TOL reconstruction project,
as far as prokaryotic organisms are concerned, fell short. It
is arguable whether debating the branching order in the
TOL and looking for a unique nested hierarchy is a satisfactory way to classify such microbes in the presence of
lateral gene transfer. Instead, we propose alternative concepts to the traditional phylogenetic projects to deal with
microbial evolution and systematics: (i) a redefinition of
natural groups; (ii) the description of a new type of evolutionary unit originating from lateral gene transfer (LGT);
and (iii) the realization of an interactive taxonomical database (comprising overlapping groups) to progress towards
a more natural classification. The third of these solutions
would constitute a transition possibly as significant as the
change from a linear system of classification to a nested
hierarchy that occurred thousands of years ago.
Glossary
Essentialism: the view that some permanent, unalterable properties
of objects are essential to them, so that, for any specific type of entity,
it is at least theoretically possible to specify a finite list of characteristics – all of which must be possessed by any entity to belong to the
group defined. For instance, for a property essentialist, all essential
parts of a species remain unchanging throughout time. In historical
essentialism, the unchanging essential characteristic is a common
history. A monophyletic group is thus natural because it is defined by
the existence of a last common ancestor exclusively shared by all its
members, even though these members are not similar to each other
in other respects (ecologically, morphologically, functionally etc.).
Mill, John Stuart: British philosopher (1806–1873) who was an influential liberal thinker. He is notably famous for his defense of utilitarianism and his book A System of Logic: Ratiocinative and Inductive,
published in 1843, describing the five basic principles of induction
and the methods of scientific inquiry.
Monism: at the methodological level, the view that a single method
and a unique representation can account satisfactorily for the unified
set of laws that underlie nature.
Pluralism: opposes monism by endorsing the view that several
methods and theories are legitimate in an evolutionary study
because no single coherent explanatory system can account satisfactorily for all the diverse phenomena of life.
Polythetic: a phylogenetic group in which ‘(i) each individual has a
large but unspecified number of a set of properties occurring in the
aggregate as a whole; (ii) each of those properties is possessed by
large numbers of those individuals; (iii) not one of those properties is
possessed by every individual in the aggregate’, as explained in Ref.
[46].
Synapomorphy: a derived character state shared by two or more
terminal groups (taxa included in a cladistic analysis as further
indivisible units) and inherited from their most recent common
ancestor.
0966-842X/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tim.2008.02.005 Available online 15 April 2008
Trends in Microbiology
Vol.16 No.5
Box 1. Different types of evolutionary trees
Two types of evolutionary trees are currently being reconstructed.
First are genome trees, based on the statistical properties of the
genome, on the presence or absence of genes, on the chromosomal
gene order or on average sequence similarity, as calculated in
BLAST analyses (and variants thereof). Second are phylogenomic
trees, based on vertically inherited orthologs [17]. Genome trees
provide a way to compare the evolutionary information present in
different genomes. However, they do not reflect the exact course of
organismal evolution and should not be interpreted as phylogenies.
In no case is the relevance of the tree model tested in these
approaches. Furthermore, such phenetic trees are especially complex to interpret because some of the groupings obtained can result
principally from lateral relationships, whereas others result from
vertical ones. In summary, genome trees show prevailing trends in
the evolution of genome-scale gene sets [16]. By contrast,
phylogenomic trees – species trees – are reconstructed to learn
about the pattern of natural relationships between species [18,19]
on the basis of strictly vertically inherited markers. To this end,
molecular datasets are trimmed to exclude genes with conflicting
signals. There are, however, very few data for which one can
confidently assess a strictly vertical transmission, resulting in
skeletal microbial phylogenomic trees, built on a very small amount
of information. The latest TOL, published by Ciccarelli et al. [6],
which Dagan and Martin legitimately renamed the ‘tree of one per
cent’ [20], is a good example of this. In addition, this approach is
probably unwittingly essentialist, because a few characters are
being reified in the name of the congruence between the gene trees
and the species tree. Such a definition makes genes the essence of
species in a systematic scheme based on molecular phylogenetics.
Such an approach is likely to be endlessly criticized, for instance in
the debate over the choice of what is an essential character, or when
essentialist definitions of species are being rejected from the
evolutionary field [47,48].
Box 2. LGT and the definition of natural groups
Consider four organisms, in two independently evolving lineages:
two photobacteria (P1 and P2) having photoreceptors; and two
flagellobacteria (F3 and F4) harboring a flagellum (see Figure 1 in
the main text). Suppose that, at t1, a descendant of P2 laterally
acquired a flagellum from an F4 relative, in addition to its
photoreceptors. How should the chimeric P2 descendant be
classified? Multiple answers seem possible: (i) because it harbors
photoreceptors, the P2 descendant could be joined to the photobacteria; (ii) because it harbors a flagellum, the P2 descendant could
be joined to the flagellobacteria; (iii) because it presents both
photoreceptors and a flagellum, the P2 descendant is something
new, neither a photobacterium nor a flagellobacterium. Evolutionists
generally rely on historical evidence, and consider the photoreceptors as a synapomorphy, the flagellum as a bad character for natural
classification, and the P2 descendant as ‘a photobacterium that
acquired a flagellum’ (the exact description of its evolutionary
history).
Suppose now that, later, at t2, the P2 descendant lost its
photoreceptors. Then, the P2 descendant would only harbor a
flagellum, homologous to those found in F4 and F3 descendants.
Would it be considered a flagellobacterium? This would seem the
most natural solution, given the presence of a flagellum, which is
the essence of the flagellobacteria category, and the absence of
other traits that would suggest an alternative classification. Yet, it
would be in direct contradiction with the historical logic used at t1,
according to which ‘being a photobacterium’ means to have a last
common ancestor that had photoreceptors, regardless of the makeup of the extant descendants. Paradoxically, this solution both
describes the notion of photobacteria and empties it of its
substance, creating groups where no part (and thus no gene) can
define the ‘essence’ of species and of higher taxa. If a descendant of
F3 subsequently lost its flagellum, flagellobacteria would become
‘bacteria with or without a flagellum, knowing that not all bacteria
with a flagellum are flagellobacteria’ and photobacteria ‘bacteria
with or without photoreceptors, with or without a flagellum’, two
descriptions indistinguishable from each other if one ignores the
history of these features. In the presence of LGT (and in the absence
of historical evidence), some groups could seem ‘more natural’ (i.e.
all the flagellated organisms sharing homologous characters, all the
organisms with homologous photoreceptors etc.) than the polythetic groups of higher level in the TOL. In the presence of LGT, Millian
and historical essentialist definitions of a ‘natural group’ will fail to
produce a consensual microbial systematics.
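The thought experiment above can be restated as a small toy model. The Python sketch below is only an illustration and not part of the original argument: the dictionary names, the trait labels and the two classification functions are assumptions introduced here. It encodes the traits of the four hypothetical lineages at t0, t1 and t2 and compares a classification based on descent with one based on the trait taken as the 'essence' of each group.

```python
# Toy encoding of the Box 2 scenario (all names are illustrative assumptions).

# Lineage of descent for each organism; this never changes over time.
LINEAGE = {
    "P1": "photobacteria",
    "P2": "photobacteria",
    "F3": "flagellobacteria",
    "F4": "flagellobacteria",
}

# Feature treated as the 'essence' of each group under a trait-based definition.
DIAGNOSTIC_TRAIT = {
    "photobacteria": "photoreceptor",
    "flagellobacteria": "flagellum",
}

# Traits carried by each lineage (or its descendant) at t0, t1 and t2.
TRAITS = {
    "t0": {"P1": {"photoreceptor"}, "P2": {"photoreceptor"},
           "F3": {"flagellum"}, "F4": {"flagellum"}},
    # t1: the P2 descendant gains a flagellum by LGT from an F4 relative.
    "t1": {"P1": {"photoreceptor"}, "P2": {"photoreceptor", "flagellum"},
           "F3": {"flagellum"}, "F4": {"flagellum"}},
    # t2: the P2 descendant loses its photoreceptors.
    "t2": {"P1": {"photoreceptor"}, "P2": {"flagellum"},
           "F3": {"flagellum"}, "F4": {"flagellum"}},
}


def classify_by_history(organism):
    """Historical definition: the group is fixed by descent alone."""
    return LINEAGE[organism]


def classify_by_traits(organism, time):
    """Trait-based definition: every group whose diagnostic trait is present."""
    carried = TRAITS[time][organism]
    return {group for group, trait in DIAGNOSTIC_TRAIT.items() if trait in carried}


for time in ("t0", "t1", "t2"):
    print(time, classify_by_history("P2"), sorted(classify_by_traits("P2", time)))
```

In this sketch the two definitions agree for the P2 descendant at t0, overlap at t1 and disagree outright at t2, which is the tension the box describes.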
Problems in traditional tree making
Despite the molecular saturation problem [12], responsible
for the weakness of phylogenetic signals on a large evolutionary scale, and other tree reconstruction artifacts [13],
phylogenetics showed that some genes are congruent with
each other, whereas other markers display significantly
conflicting signals [14,15]. This situation affects the meaning of the two main types of tree-like phylogenies of life
under reconstruction (Box 1). On the one hand, genome
trees [16,17] (based on genomic properties or content)
provide only central tendencies. Such trees index taxa well
but they do not tell us much about their history, speciation
events, etc. On the other hand, phylogenomic trees (built
from strictly vertically inherited markers [18,19]) have a
limited power to explain the features of extant and past
microbial biodiversity because the vast majority of the
molecular characters (at least) might have evolved along
different evolutionary patterns than the vertical one.
Simply put, genealogical relationships might differ significantly from similarity relationships. In this case, the utility
of a ‘tree of one per cent’ [20] to generalize about the
genomic and genetic evolution of a lineage is probably
close to nil on a broad evolutionary scale.
Consequently, it is debatable whether groups
derived from such vertical trees should be regarded as
‘natural’. Monophyletic groups are considered natural
because all of their members share an exclusive last
common ancestor – termed an ‘historical essentialist’
(see Glossary) definition of the natural group by Rieppel
[21]. This is in sharp contrast with the definition that
Simpson or Mayr used in their classification. Theirs was
closer to that of Mill [22] (i.e. ‘groups respecting which a
greater number of general propositions can be made, [. . .]
than could be made respecting any other groups into
which the same things could be distributed’). In the
case of animals, these two classical definitions can overlap. Yet, in prokaryotes, LGT exacerbates the tension
between these two definitions of natural groups [Figure 1,
Box 2 and Table 1 (data from Garrity [23])]. Clearly,
molecular-based systematics requires phylogenetic characters whose history is decipherable and stable enough to
form groups. However, in a more adverse context, when
molecular phylogeneticists try to define taxonomical
categories of high rank (such as the Haloarchaea, Alphaproteobacteria and methanogens), we argue that they try
to solve an issue that cannot be conclusively resolved in
traditional terms.
Because neither of the two different tree-based
approaches (genomic or verticalist) satisfactorily fulfills
the goals of traditional phylogenetic systematics for
microbial organisms (i.e. to produce informative natural
groups), we suggest nontraditional alternatives to address
the problems of phylogenetic systematics, such as that
raised in Box 2.
Figure 1. Questioning natural groups in the presence of LGT. Two unrelated hypothetical bacterial lineages: the photobacteria (with a photoreceptor, symbolized by a
crown, and two species, P1 and P2) and the flagellobacteria (with a flagellum, and two species, F3 and F4). Their evolution (through the gain and loss of the aforementioned
features) unfolds from the top to the bottom of the drawing, and their corresponding morphology is represented at three different times (t0, t1 and t2). At these different
times, the classification of P2 descendants in a ‘natural group’ is particularly arguable, notably under the historical essentialist definition (Box 2).
Alternative approaches to microbial phylogenetics and systematics
Proposition of an alternative definition of natural groups
A third definition of a natural group (neither Millian nor
historical essentialist), inspired by the work of Splitter [24]
(and other philosophers [25,26]), could prove useful for
microbial phylogenetics. For Splitter, a specialist in the
species concept debate, a natural group is ‘natural when it
is causally efficacious, relative to some explanatory theory’
[24] – that is, natural groups are real when they have a real
causal impact and real consequences on the biological
world. Under this definition, evolutionary units – because
they have a causal effect and have a role in the natural
world – are natural groups. The consequences of such a
perspective are far reaching. First, a higher taxon (e.g. the
Proteobacteria) might not be considered a natural
group under this definition because there is no such thing
as a real causal impact of the Proteobacteria phylum (i.e.
there is not a single physiological feature shared by all
Proteobacteria that is not a general feature of bacterial
cells). Such higher taxa are an arbitrary way of classifying
the living world rather than a natural one. Second,
because multiple evolutionary units of all sizes have a role
in different biological processes, natural groups in a
revised systematics are expected to be diverse.
Table 1. Some examples of physiological properties showing variation within and between taxonomic groups of microbes
Property                         Taxonomic group                Positive representative           Negative representative
Anoxygenic photosynthesis        Family Bradyrhizobiaceae       Rhodopseudomonas palustris        Nitrobacter winogradskyi
Anoxygenic photosynthesis        Family Rhodocyclaceae          Rhodocyclus purpureus             Azoarcus communis
Anoxygenic photosynthesis        Family Ectothiorhodospiraceae  Ectothiorhodospira marina         Nitrococcus mobilis
Dissimilatory sulfate reduction  Family Archaeoglobaceae        Archaeoglobus fulgidus            Ferroglobus placidus
Dissimilatory sulfate reduction  Family Nitrospiraceae          Thermodesulfovibrio yellowstonii  Nitrospira marina
Dissimilatory sulfate reduction  Order Desulfobacterales        Desulfotalea psychrophila         Nitrospina gracilis
Nitrification                    Family Bradyrhizobiaceae       Nitrobacter winogradskyi          Rhodopseudomonas palustris
Nitrification                    Family Nitrospiraceae          Nitrospira marina                 Thermodesulfovibrio yellowstonii
Nitrification                    Order Desulfobacterales        Nitrospina gracilis               Desulfotalea psychrophila
Nitrification                    Family Ectothiorhodospiraceae  Nitrococcus mobilis               Ectothiorhodospira marina
Nitrogen fixation                Genus Azoarcus                 Azoarcus communis                 Azoarcus anaerobius
Nitrogen fixation                Genus Methanococcus            Methanococcus maripaludis         Methanococcus vannielii
Nitrogen fixation                Genus Rhodocyclus              Rhodocyclus tenuis                Rhodocyclus purpureus
Sulfur oxidation                 Family Hydrogenophilaceae      Thiobacillus denitrificans        Hydrogenophilus thermoluteolus
Sulfur oxidation                 Family Sulfolobaceae           Sulfolobus solfataricus           Stygiolobus azoricus
Sulfur oxidation                 Family Ectothiorhodospiraceae  Ectothiorhodospira marina         Nitrococcus mobilis
Hyperthermophily                 Order Methanococcales          Methanocaldococcus jannaschii     Methanococcus vannielii
Hyperthermophily                 Family Thermotogaceae          Thermotoga maritima               Geotoga petraea
Hyperthermophily                 Family Desulfurococcaceae      Aeropyrum pernix                  Desulfurococcus mobilis
Obligate aerobiosis              Family Hydrogenophilaceae      Hydrogenophilus thermoluteolus    Thiobacillus denitrificans
Figure 2. Schematic description of coherent and composite evolutionary units. (a) Scheme of a lower-level evolutionary unit, symbolized by a small circle with two
arrows. Circles of similar color correspond to phylogenetically related evolutionary units. Circles of different colors correspond to phylogenetically unrelated evolutionary
units. (b) Scheme of a coherent higher-level evolutionary unit, emerging from a selective process applied to many phylogenetically related lower-level evolutionary units.
(c) Scheme of a composite higher-level evolutionary unit, emerging from a selective process applied to many phylogenetically unrelated lower-level evolutionary units.
Selective processes might involve selection on a function, environmental pressures, natural selection, interbreeding, homeostatic loops etc.
Despite their variability, the emergence of evolutionary
units seems to follow a general scheme that enables their
characterization. In its simplest form, an evolutionary
unit rests on the integrated association of lower level
elements that can be replicated and are held together by
some biological mechanism (Figure 2). Depending on
which biological process is responsible for the integration
of the lower-level elements of the ‘whole’ evolutionary
unit, these evolutionary units are more or less familiar to
phylogeneticists (and to systematists). Animal ‘species’,
for example, are macroscopic evolutionary units emerging when a reproductive process (interbreeding) causes
the functional integration of a set of organisms which are
similar enough to interbreed, and thus results in the
relative persistence in the traits of their offspring across
time (Figure 2b). Based on such a process, these natural
groups comprise organisms that show some similarity
(i.e. that are more similar to one another than to organisms of another interbreeding group). In this case, knowing the genealogical relationships certainly helps in
proposing a useful phylogenetically based taxonomy:
monophyletic groups can match natural groups (sensu
Mill or Splitter), providing a good index of biodiversity
and yielding explanatory power. Yet, as philosophers
have long known, there is no necessity for the various
integrated constituents of an evolutionary unit to have a
unique coherent phylogenetic origin or to show similarity
with each other [27,28]. In fact, and especially for
microbes, the representatives of which far outnumber
members of animal ‘species’, biological processes other
than interbreeding can be responsible for the functional
integration of diverse molecular constituents and the
emergence of more disparate – yet real – evolutionary
units (Figure 2c).
Introducing composite evolutionary units
Microbiologists are also familiar with phylogenetically
diverse, yet functionally integrated groups. In nature,
coevolving associations of multiple phylogenetically distinct microorganisms are frequent. Most importantly, such
evolutionary units often display emergent properties
that none of their constituent parts harbors alone. For
example, syntrophic microbial consortia, composed of
multiple organisms with various physiologies, are able
to achieve chemical reactions that would be energetically
unfavorable if carried out by a single microbe. Such a
relationship was uncovered between closely associated
methanotrophic archaea and sulfate-reducing bacteria
found in anoxic marine sediments [29]. In this case, the
archaeal partner metabolizes methane and the bacteria
use a resulting metabolite as an electron source. Other
examples include oxidation of fermentative end products
by acetogenic bacteria in the presence of methanogens [30],
anaerobic oxidation of methane coupled to denitrification
[31] and mineralization of chlorinated aromatic compounds under methanogenic conditions [32]. Furthermore,
ecological and environmental pressures influencing LGT,
and consequently the genetic units composing organisms,
create evolutionary units by the association of different
genes or pathways within organisms. For instance, significant LGT has been detected between Sulfolobales and
members of the Thermoplasmatales, two phylogenetically
distant archaeal orders that frequently share thermoacidophilic
environments [33]. The evolution of the hyperthermophilic
bacteria Thermotogales is also likely to have been shaped
by their uptake of DNA from the archaea that often share
their environment [34].
Finally, it is essential to realize that composite evolutionary units of all sizes and levels can emerge in nature.
Such units rely on parts which might have different origins, some global biological process being responsible for
their association while selection is acting on the emerging
higher-level phenotype. For instance, when a biological
function is selected, the composition of its lower-level
structural components can be flexible (e.g. bacteria can
synthesize the essential isoprenoid building block isopentenyl diphosphate through two analogous pathways, one
using 1-deoxy-D-xylulose-5-phosphate as a precursor and
the other using mevalonate [35]). Thus, the list of genes
able to fulfill a function can be extensive. The mix–match
model proposed by Charlebois and Doolittle [36] (Box 3)
formalized this idea well by describing modular evolutionary units of all sizes which are more or less flexible in their
composition because not all their lower-level constituents
have to be the same forever. Consequently, studies on LGT
strongly suggest the existence of multiple levels of selection and the presence of many biological ‘individualities’ in
complex interactions in the microbial world. We thus argue
for a richer view of biodiversity, comprising more evolutionary units than the mere ‘species’ and ‘genes’ generally
considered in traditional phylogenetics, and thus more
natural groups to classify.
Box 3. Different types of composite evolutionary units
Particularly relevant for prokaryotic genome evolution, the mix–
match model stems from the idea that cells need to fulfill different
functions but that the genes responsible for realizing these multiple
functions might differ over time. It proposes that, for a given
function, the available genes can belong to different gene families
(i.e. be ‘analogous’, non-homologous markers) and that the set of
genes fulfilling a given function varies during the course of
evolution (owing to gene and function loss). As a result, new
genomic lineages would arise through mixing and matching of
genes performing different functions, not only by vertical descent,
but also by processes of replacement. Thus, ‘where there are many
analogous types of genes . . . that can perform the same general
function (e.g. energy production or cell envelope formation), the
living world will collectively exhibit much variability, and there will
be no ubiquitous sets of genes that appear as part of any universal
core. Where choices are more limited, most genes performing the
needed function (some step in translation for instance) being
homologous, there will appear to be little variability’ [36].
This model can account for the evolution of two distinct types of
composite evolutionary units, as described in the main text. On one
hand, if the set of genes fulfilling the selected function remains
stable, the collection of lower-level elements from which the
function emerges is limited, and the genetic composition of the
evolutionary unit is mostly definable. In this case, the unit is mostly
rigid: in theory, its constitutive elements can be listed exhaustively.
The translation machinery seems to be a good example of this, as
already noted by Charlebois and Doolittle [36]. On the other hand,
composite evolutionary units can be built from many different
elements changing over time. In this case, the unit is mostly flexible;
it has a tendency to vary in the details of its make-up over long
historical periods. An example of this would be the methionine
biosynthesis pathway, in which the enzymes catalyzing each of the
various steps can differ between organisms but still catalyze the
same reaction [49]. Woese [50] also seems to defend a comparable
view. For him, the components of the cell ‘are modular to one extent
or another’, and if, in the integrative process, some cellular
functions ‘became more or less refractory to horizontal gene flow
. . . still others of them remained, and remain today, subject to the
vagaries of horizontal gene flow’ [50].
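The distinction between mostly rigid and mostly flexible units can be illustrated with a minimal sketch. The Python code below is an assumption-laden toy, not the formalism of Charlebois and Doolittle: the genome names, the family label 'translation_family_T' and the helper functions are hypothetical. Only the two analogous routes to isopentenyl diphosphate come from the main text. For each function, the sketch collects the gene families that fulfill it across genomes and labels the unit rigid when a single family is used everywhere and flexible otherwise.

```python
from collections import defaultdict

# Hypothetical genomes: each maps a cellular function to the gene family
# (or pathway) that fulfills it. Names are illustrative assumptions.
GENOMES = {
    "genome_1": {"isoprenoid precursor synthesis": "DXP pathway",
                 "translation step": "translation_family_T"},
    "genome_2": {"isoprenoid precursor synthesis": "mevalonate pathway",
                 "translation step": "translation_family_T"},
    "genome_3": {"isoprenoid precursor synthesis": "DXP pathway",
                 "translation step": "translation_family_T"},
}


def families_per_function(genomes):
    """Collect, for each function, the set of families used across genomes."""
    used = defaultdict(set)
    for functions in genomes.values():
        for function, family in functions.items():
            used[function].add(family)
    return dict(used)


def unit_type(families):
    """Label a unit 'rigid' if one family fulfills the function everywhere."""
    return "rigid" if len(families) == 1 else "flexible"


for function, families in sorted(families_per_function(GENOMES).items()):
    print(function, sorted(families), unit_type(families))
```

A real analysis would also need to track changes over time and gene or function loss, which this sketch leaves out.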
Pluralistic microbial ontology
If natural biodiversity is truly irreducible to a hierarchic
scheme and cannot be studied with accuracy under a single
model, a pluralistic approach [37] is then ontologically
justified to acknowledge the multiplicity of evolutionary
units in nature, as long as ‘objects both large and small
have an equal reality and causal efficacy’ [38].
In this context, the question of the origin of a microbe
is superseded by (i) the question of the origins of its
many constitutive elements (the various smaller evolutionary units of which it is made) and (ii) the question of
whether this organism might itself belong to larger
composite evolutionary units. This transition – searching
for the multiple origins of a microbe rather than its
unique origin and for the many natural groups to which
a microbe belongs rather than its unique natural group –
might seem counterintuitive. Indeed, evolutionists
are familiar with assigning a unique phylogenetic position to microbial lineages, as if all their parts originated
from a unique point in space and time and remained
cohesive. Yet, the study of the origins (note the plural
form) of microbes would be consistent with a deeper
understanding of evolutionary theory, in which phylogenetics means ‘phylum genesis’, the processes by
which various evolutionary units emerge across time
rather than the ‘branching pattern arising through evolutionary time’.
Importantly, our model presupposes populations of
elements on which selection acts to sustain evolutionary units. Because elementary parts of microbes can
originate from pools of phylogenetically diverse genes,
different parts might come from different populations.
Consequently, the further back we move in the history
of microbial evolutionary units, the more useless and
empty the notion of a single last common ancestor becomes.
Evolutionary units present in a microbial population are
likely to have been carried by multiple separate populations in the past, in different combinations. Only a
variety of evolutionary trees – as opposed to a unique
phylogeny – would enable us to approximate these different ancestral combinations of features, by trying to reconstruct the history of these smaller gene associations.
Hence, it seems important to revise some of our phylogenetic and systematic practices.
Revised practices in microbial phylogenetics and systematics
Revised phylogenetic practices
A good phylogenetic analysis of multiple markers no longer
consists in the mere addition of various phylogenetic signals through concatenation to obtain the best unique
topology. The accumulation of data under the null hypothesis that there is a common tree, without having a chance
to refute this premise, even for data of poor phylogenetic
quality, suffers from a logical flaw [39]. In the presence of
LGT, the resolution in a concatenated tree can no longer be
taken as evidence for the existence of a tree. Instead, the
validity of the null hypothesis must be tested by exploring
the origin of the resolution in such a super-tree, and by
testing whether its support is genuine or artifactual [40].
In addition, phylogenetic analysis of complete genomes
(from pure cultures or from metagenomic projects) could
systematically include a decomposition analysis to identify
the incongruent phylogenetic patterns within individual
genomes. This analysis would isolate the various incongruent sets of genes that every single genome comprises
and could thereby inform us about potential smaller-level
evolutionary units that are part of the genomic make-up of
any microbe [41]. Some software, such as Concaterpillar,
which uses a hierarchical likelihood ratio test framework
to assess both the topological congruence between gene
phylogenies and branch-length congruence [42], could help
in this task. Moreover, phylogeneticists could search systematically for local congruencies between a priori unrelated gene phylogenies – that is, between trees from the same
environment or from distantly related taxa. Starting
with thousands of topologies derived from metagenomic or
genomic projects, analyses of split decomposition identifying common bipartitions or common embedded quartets
[43] should enable the discovery of coevolving sets of genes
of all sizes. If these sets of genes prove to have a role in the
evolutionary process, they too could help in discovering
composite evolutionary units.
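The search for common bipartitions can be sketched in a few lines. The Python code below is a minimal illustration under strong simplifying assumptions, not the method of Ref. [43]: trees are given as nested tuples of leaf names, all gene trees are assumed to share the same taxon set, and the taxon labels T1 to T5, the gene names and the function names are hypothetical. Each tree is reduced to its set of non-trivial bipartitions, and bipartitions recovered by at least a chosen number of gene trees are reported as candidate sets of coevolving genes.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial splits of a tree, read as unrooted.

    Each split is stored as the side that excludes a reference taxon
    (the alphabetically smallest leaf), so equal splits compare equal.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if reference in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def shared_splits(gene_trees, min_trees=2):
    """Count how many gene trees support each split; keep the shared ones."""
    counts = Counter()
    for tree in gene_trees.values():
        counts.update(bipartitions(tree))
    return {split: n for split, n in counts.items() if n >= min_trees}


# Hypothetical gene trees on the same five taxa.
gene_trees = {
    "geneA": ((("T1", "T2"), "T3"), ("T4", "T5")),
    "geneB": (("T1", "T2"), ("T3", ("T4", "T5"))),
    "geneC": ((("T1", "T4"), "T2"), ("T3", "T5")),
}

for split, n in shared_splits(gene_trees).items():
    print(sorted(split), "supported by", n, "gene trees")
```

Here geneA and geneB share both of their splits while geneC supports neither, so geneA and geneB would be flagged as a locally congruent pair. Trees with different taxon sets would first have to be restricted to their common taxa, a step this sketch omits.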
Overlapping microbial taxonomies
The complexity of the evolutionary process acting on
microbes indicates that a single taxonomy is likely
to provide an overly coarse picture of microbial relationships. As shown in Table 1, binomial nomenclature and
a single hierarchical classification are poor proxies for the
genetic make-up of a microbe. By contrast, additional taxonomies based on real biological processes could bring
significant information that it would be arbitrary to overlook [44]. Discarding all but one of these process-based
taxonomies would be comparable to reducing a person’s
identity to a single aspect of his or her life, even though he
or she might have an effective role in many organizations:
professional, artistic, sporting, familial and so on. To avoid
overlooking any of the natural groups, it seems legitimate
to propose – rather than a single taxonomy of microbial
species – many taxonomies describing the multiple evolutionary units and their role. Thus, we suggest giving up
the unique hierarchy as the reference classification system
and instead encourage the production of a comprehensive
interactive database in which an individual could possibly
belong to overlapping taxonomical groups.
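One way to picture such a database is as a many-to-many mapping between organisms and groups. The Python sketch below is only a hedged illustration of that idea, not a proposed implementation: the class name OverlappingTaxonomy, its methods and the 'family:' and 'process:' keyword prefixes are assumptions introduced here, and the few memberships are taken from Table 1.

```python
from collections import defaultdict


class OverlappingTaxonomy:
    """Toy store in which an organism can belong to several groups at once."""

    def __init__(self):
        self._members = defaultdict(set)  # group name -> organisms
        self._groups = defaultdict(set)   # organism -> group names

    def assign(self, organism, *groups):
        """Place an organism in one or more (possibly overlapping) groups."""
        for group in groups:
            self._members[group].add(organism)
            self._groups[organism].add(group)

    def groups_of(self, organism):
        return set(self._groups.get(organism, set()))

    def members_of(self, group):
        return set(self._members.get(group, set()))

    def shared_groups(self, organism_a, organism_b):
        """Groups to which both organisms belong."""
        return self.groups_of(organism_a) & self.groups_of(organism_b)


db = OverlappingTaxonomy()
# Vertical (hierarchical) and process-based memberships, after Table 1.
db.assign("Ectothiorhodospira marina", "family:Ectothiorhodospiraceae",
          "process:anoxygenic photosynthesis", "process:sulfur oxidation")
db.assign("Nitrococcus mobilis", "family:Ectothiorhodospiraceae",
          "process:nitrification")
db.assign("Nitrobacter winogradskyi", "family:Bradyrhizobiaceae",
          "process:nitrification")
db.assign("Rhodopseudomonas palustris", "family:Bradyrhizobiaceae",
          "process:anoxygenic photosynthesis")

print(sorted(db.groups_of("Ectothiorhodospira marina")))
print(sorted(db.members_of("process:nitrification")))
```

In this toy, the family-level groups keep the information on vertical inheritance, while the process-level groups cut across families, so a query on nitrification returns organisms from two different families.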
Figure 3. Three alternative typical classification systems. (a) The linear system (here in alphabetical order) is often unambiguous but uninformative about the history and
properties of classified organisms. (b) The tree, informative on the vertical relationships of organisms but not necessarily on their properties. Typically, incongruent features
are overlooked in such a hierarchical classification. (c) The interactive database, with its keywords and overlapping groups, where a given organism can be simultaneously
placed in different taxonomical groups because it is naturally involved in different processes and belongs to multiple nonexclusive evolutionary units. Importantly, this
system preserves the information concerning vertical inheritance learned from (b). Simply, this information becomes a part, and not the end, of evolutionary knowledge.
Through the elaboration of this database, phylogeneticists would be able to appreciate that the tree of cells is not
the only evolutionary pattern and that it should not mask
the complexity of microbial evolution. Importantly, the
database would contain other patterns that evolutionists
might also be willing to classify and generalize about. For
instance, one should be able to generalize about the adaptation to high temperature in thermophiles or the survival
of halophiles at high salt concentrations, irrespective of
whether these groups comprise polyphyletic associations of
archaea and bacteria. Using this system, the extent of
convergences and their genetic basis would be better
appreciated, especially in prokaryotes. Such an evolutionary-based microbial systematics should also improve our
working knowledge, providing keys to distinguish pathogenic microbes from benign ones, to classify bacterial
communities and so on. To achieve this, we cannot rely
exclusively on traditional genealogical relationships.
Medical cases are obvious examples of this; if a patient
is sick, what ultimately matters is to identify which
particular genetic associations are responsible for the antibiotic resistance of the infectious organisms, and not the
nature of the sister group of these organisms in the TOL. If
all information about the evolutionary units composing
microbes and their communities were to be recorded in a
comprehensive database – just as all known sequences are pooled
at the National Center for Biotechnology Information – we would be able to access it at the click of
a mouse.
Our main reason for recommending a single comprehensive database, rather than several, is to ease scientific communication. However, we do not have a recipe for naming
its taxonomical groups. Simple names referring to polyphyletic groups of organisms carrying specific evolutionary
units are already used by the microbiology community. In
practice, we do use the terms ‘denitrifier’, ‘sulfate reducer’
and ‘methanogen’, and know what they mean because
these functions are associated with specific evolutionary
units (sets of well-characterized genes allowing a certain
biochemical function to be performed). We also use simple
terms such as ‘Cyanobacteria’, ‘Proteobacteria’ and ‘Crenarchaeon’ knowing that these names also refer to evolutionary units but of a different type (monophyletic core
sets of genes). Provided that these names capture real
evolutionary units – that is, not arbitrary
suites of traits, but those having a causal role in the
evolutionary process – they can all constitute valuable
keywords in our evolutionary-based taxonomical database.
Any given organism can then be characterized by many
names because it can belong to more than one group at
once, which is, in theory, testable. Furthermore, some
fields of microbiology, such as metagenomics, do not study isolated organisms but rather DNA extracted directly from the environment to investigate biological processes. This makes the
use of concepts such as evolutionary units not only useful,
but essential.
Importantly, the considerable progress that has been
made in computer science makes non-tree-like, yet efficient, classifications realistic and promising. Classification
systems with overlapping groups, previously known to be
intractable, are no longer so. Anyone who has looked for a
book on the internet, entering a series of keywords in a
search engine, has experienced this: there is no need to use
nested notions (such as a tree) to access the information.
Thus, even though the transition from a tree-like structure
of classification to a more dynamic reticulated system is
probably as shocking as was the transition from a linear
order to a series of dichotomies thousands of years ago (and
in fact this is still encountering resistance nowadays [45]),
it will most likely prove to be even more useful in microbiology (Figure 3).
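To make the idea of a keyword-driven classification with overlapping groups more concrete, here is a minimal Python sketch of such an index. It is only an illustration under stated assumptions, not a design for the proposed database: the class `OverlappingTaxonomy`, its methods and the isolate names are hypothetical, and the group assignments are placeholder data rather than real annotations.

```python
from collections import defaultdict


class OverlappingTaxonomy:
    """Toy keyword index in which one organism may belong to many groups."""

    def __init__(self):
        self._members = defaultdict(set)  # group name -> organisms
        self._groups = defaultdict(set)   # organism -> group names

    def assign(self, organism, *groups):
        """Record that an organism belongs to each of the given groups."""
        for group in groups:
            self._members[group].add(organism)
            self._groups[organism].add(group)

    def groups_of(self, organism):
        """All groups (vertical or process-based) an organism belongs to."""
        return set(self._groups.get(organism, set()))

    def query(self, *groups):
        """Organisms that belong to every one of the listed groups."""
        if not groups:
            return set()
        member_sets = [self._members.get(g, set()) for g in groups]
        return set.intersection(*member_sets)


# Placeholder entries: lineage names and functional groups are both keywords.
db = OverlappingTaxonomy()
db.assign("isolate_1", "Archaea", "methanogen", "thermophile")
db.assign("isolate_2", "Bacteria", "sulfate reducer", "thermophile")
db.assign("isolate_3", "Archaea", "halophile")

print(sorted(db.query("thermophile")))             # a polyphyletic group
print(sorted(db.query("Archaea", "thermophile")))  # keywords combined
print(sorted(db.groups_of("isolate_1")))
```

In this sketch the vertical lineage is simply one keyword among others, so the information carried by the tree is retained while no single hierarchy is required to reach an entry.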
Conclusions
We advocate here a pluralistic microbial systematics,
multiplying names and taxa when it is legitimate – that
is, when identifying biological units having a causal role in
the evolutionary process – to avoid presenting an overly
coarse view of microbial history. It would be important to
evaluate whether such an alternative model offers a better
description of natural diversity than that provided
through a unique nested hierarchy, splitting the living
world into various inclusive categories (i.e. taxa of high
rank), many of them devoid of causal efficacy. This
approach, applicable to archaea, bacteria and possibly
unicellular eukaryotes, undoubtedly goes beyond the
traditional classification on a ‘debated tree’ of ‘debated
species’. It adds to the traditional classification, because it
acknowledges the importance of the studies by various
microbial specialists, including those of traditional molecular phylogeneticists, without giving absolute priority
or exclusivity to the latter. In our view, it could constitute a step
forward by promoting a more informative and integrated
systematics, involving an increasing number of scientists in this huge task. We also expect the identification of
composite evolutionary units through alternative phylogenetic analyses, less constrained by the tree formalism,
to bring forth new perspectives about the evolution of life
and its taxa. In contrast to the traditional practice of
molecular phylogenetics centered around a unique tree,
we feel that it is time for evolutionists to explore the whole
phylogenetic forest.
Acknowledgements
We thank Ford Doolittle, Pascal Tassy, Michel Morange, Armand de
Ricqlès and Jean Gayon for critical discussions, and also Chris Lane, Sara
Hopkins and Hans Wildschutte for careful reading of the manuscript.
References
1 Zuckerkandl, E. and Pauling, L. (1965) Molecules as documents of
evolutionary history. J. Theor. Biol. 8, 357–366
2 Zuckerkandl, E. and Pauling, L. (1965) Evolutionary divergence and
convergence in proteins. In Evolving Genes and Proteins (Bryson, V.
and Vogel, H.J., eds), pp. 97–166, Academic Press
3 Felsenstein, J. (2004) Inferring Phylogenies, Sinauer
4 Cavalier-Smith, T. (1981) Eukaryote kingdoms: seven or nine?
Biosystems 14, 461–481
5 Schwartz, R.M. and Dayhoff, M.O. (1978) Origins of prokaryotes,
eukaryotes, mitochondria, and chloroplasts. Science 199, 395–403
6 Ciccarelli, F.D. et al. (2006) Toward automatic reconstruction of a
highly resolved tree of life. Science 311, 1283–1287
7 Doolittle, W.F. (1999) Phylogenetic classification and the universal
tree. Science 284, 2124–2129
8 Koonin, E.V. et al. (2001) Horizontal gene transfer in prokaryotes:
quantification and classification. Annu. Rev. Microbiol. 55, 709–742
9 Thompson, J.R. et al. (2005) Genotypic diversity within a natural
coastal bacterioplankton population. Science 307, 1311–1313
10 Lo, I. et al. (2007) Strain-resolved community proteomics reveals
recombining genomes of acidophilic bacteria. Nature 446, 537–541
11 Hanage, W.P. et al. (2006) The impact of homologous recombination on
the generation of diversity in bacteria. J. Theor. Biol. 239, 210–219
12 Penny, D. et al. (2003) Testing fundamental evolutionary hypotheses.
J. Theor. Biol. 223, 377–385
13 Gribaldo, S. and Philippe, H. (2002) Ancient phylogenetic
relationships. Theor. Popul. Biol. 61, 391–408
14 Bapteste, E. et al. (2005) Do orthologous gene phylogenies really
support tree-thinking? BMC Evol. Biol. 5, 33
15 Susko, E. et al. (2006) Visualizing and assessing phylogenetic
congruence of core gene sets: a case study of the gammaproteobacteria. Mol. Biol. Evol. 23, 1019–1030
16 Wolf, Y.I. et al. (2002) Genome trees and the tree of life. Trends Genet.
18, 472–479
17 Snel, B. et al. (2005) Genome trees and the nature of genome evolution.
Annu. Rev. Microbiol. 59, 191–209
18 Daubin, V. et al. (2002) A phylogenomic approach to bacterial
phylogeny: evidence of a core of genes sharing a common history.
Genome Res. 12, 1080–1090
19 Brochier, C. et al. (2005) An emerging phylogenetic core of Archaea:
phylogenies of transcription and translation machineries converge
following addition of new genome sequences. BMC Evol. Biol. 5, 36
20 Dagan, T. and Martin, W. (2006) The tree of one percent. Genome Biol.
7, 118
21 Rieppel, O. (2005) The philosophy of total evidence and its relevance for
phylogenetic inference. Pap. Avulsos Zool. 45, 1–31
22 Mill, J.S. (1843) A System of Logic – Ratiocinative and Inductive,
Longman
23 Garrity, G.M. (2005) Bergey’s Manual of Systematic Bacteriology (The
Proteobacteria) (Vol. 2), Springer Verlag
24 Splitter, L.J. (1988) Species and identity. Philos. Sci. 55, 323–348
25 Brooks, D.R. (2001) Evolution in the information age: rediscovering the
nature of the organism. SSED 1, 1–29
26 Collier, J.D. and Muller, S.J. (1998) The dynamical basis of emergence
in natural hierarchies. In Emergence, Complexity, Hierarchy and
Organization (Farre, G. and Oksala, T., eds), pp. 1–30, Acta
Polytechnica Scandinavica
27 Ghiselin, M.T. (1974) A radical solution to the species problem. Syst.
Zool. 23, 536–544
28 Kitcher, P. (1984) Species. Philos. Sci. 51, 308–333
29 Nauhaus, K. et al. (2007) In vitro cell growth of marine archaeal-bacterial consortia during anaerobic oxidation of methane with
sulfate. Environ. Microbiol. 9, 187–196
30 Stams, A.J. (1994) Metabolic interactions between anaerobic
bacteria in methanogenic environments. Antonie Van Leeuwenhoek
66, 271–294
31 Strous, M. et al. (2006) Deciphering the evolution and metabolism of
an anammox bacterium from a community genome. Nature 440, 790–794
32 Becker, J.G. et al. (2005) The role of syntrophic associations in
sustaining anaerobic mineralization of chlorinated organic
compounds. Environ. Health Perspect. 113, 310–316
33 Ruepp, A. et al. (2000) The genome sequence of the thermoacidophilic
scavenger Thermoplasma acidophilum. Nature 407, 508–513
34 Nelson, K.E. et al. (1999) Evidence for lateral gene transfer between
Archaea and bacteria from genome sequence of Thermotoga maritima.
Nature 399, 323–329
35 Boucher, Y. et al. (2003) Lateral gene transfer and the origins of
prokaryotic groups. Annu. Rev. Genet. 37, 283–328
36 Charlebois, R.L. and Doolittle, W.F. (2004) Computing prokaryotic
gene ubiquity: rescuing the core from extinction. Genome Res. 14,
2469–2477
37 Doolittle, W.F. and Bapteste, E. (2007) Pattern pluralism and the Tree
of Life hypothesis. Proc. Natl. Acad. Sci. U. S. A. 104, 2043–2049
38 Steel, D. (2004) Can a reductionist be a pluralist? Biol. Philos. 19, 55–73
39 Bucknam, J. et al. (2006) Refuting phylogenetic relationships. Biol.
Direct 1, 26
40 Bapteste, E. et al. (2008) Alternative methods for concatenation of core
genes indicate a lack of resolution in deep nodes of the prokaryotic
phylogeny. Mol. Biol. Evol. 25, 83–91
41 Azad, R.K. and Lawrence, J.G. (2007) Detecting laterally transferred
genes: use of entropic clustering methods and genome position. Nucleic
Acids Res. 35, 4629–4639
42 Leigh, J. et al. (2008) Testing congruence in phylogenomic analysis.
Syst. Biol. 57, 104–115
43 Zhaxybayeva, O. et al. (2006) Phylogenetic analyses of cyanobacterial
genomes: quantification of horizontal gene transfer events. Genome
Res. 16, 1099–1108
44 Ereshefsky, M. (1992) Eliminative pluralism. Philos. Sci. 59, 671–690
45 Gupta, R.S. (2001) The branching order and phylogenetic placement of
species from completed bacterial genomes, based on conserved indels
found in various proteins. Int. Microbiol. 4, 187–202
46 Panchen, A.L. (1992) Classification, Evolution, and the Nature of
Biology, Cambridge University Press
47 Ereshefsky, M. (2006) Species. In The Stanford Encyclopedia of
Philosophy (Zalta, E.N., ed.), Stanford University Press
48 Doolittle, W.F. and Papke, R.T. (2006) Genomics and the bacterial species
problem. Genome Biol. 7, 116
49 Gophna, U. et al. (2005) Evolutionary plasticity of methionine
biosynthesis. Gene 355, 48–57
50 Woese, C.R. (2000) Interpreting the universal phylogenetic tree. Proc.
Natl. Acad. Sci. U. S. A. 97, 8392–8396