doi:10.1111/j.1420-9101.2006.01259.x
The universal ancestor and the ancestors of Archaea and Bacteria
were anaerobes whereas the ancestor of the Eukarya domain was
an aerobe
M. DI GIULIO
Laboratory of Molecular Evolution, Institute of Genetics and Biophysics ‘Adriano Buzzati Traverso’, CNR, Naples, Napoli, Italy
Keywords: origin of life; prebiotic environment; last universal common ancestor; aerobic ancestor of eukaryotes; respiration-early hypothesis; ancestral sequences; oxyphobic index.
Abstract
The use of an oxyphobic index (OI), based on the propensity of amino acids to occur more frequently in the proteins of anaerobes, makes it possible to draw inferences about the environment in which the last universal common ancestor (LUCA) lived. Ancestral protein sequences were reconstructed with a method based on maximum likelihood and attributed, by means of the OI, to the set of aerobe or anaerobe sequences. This has led to the following conclusions: the LUCA was an anaerobic ‘organism’, as were the ancestors of Archaea and Bacteria, whereas the ancestor of Eukarya was an aerobe. These observations seem to falsify the hypothesis that the LUCA was an aerobe and help to identify better the environment in which the first organisms lived.
Introduction
Although there are many different views on the origin of life (Lazcano & Miller, 1996), it is commonly assumed that it took place in an anaerobic environment (e.g. see Samuilov, 2005). It is more than reasonable to think that life originated in an oxygen-free environment, because oxygen would almost certainly have oxidized the gradually accumulating molecules. It is less certain, however, that the terminal phases of the origin of life, i.e. those culminating in the birth of the last universal common ancestor (LUCA), necessarily took place in an anaerobic environment. In particular, some scenarios envisage that the primitive atmosphere was indeed reducing but that it might have had large oases of oxygen over the ocean, formed by the photo-dissociation of water vapour (Klein, 1992; Kasting, 1993). Castresana & Saraste (1995) propose the fascinating hypothesis that, as the LUCA lived in these oxygen oases, it might have been able to perform aerobic respiration. Therefore the LUCA might have been an aerobic ‘organism’.
Correspondence: Massimo Di Giulio, Laboratory of Molecular Evolution,
Institute of Genetics and Biophysics ‘Adriano Buzzati Traverso’, CNR, Via
P. Castellino, 111, 80131 Naples, Napoli, Italy.
Tel.: +39 081 6132369; fax: +39 081 6132706;
e-mail: [email protected]
We have very little or no information on whether the LUCA was an aerobe or an anaerobe. One way to draw inferences about the nature of the LUCA is to reconstruct the ancestral sequences of its genes or proteins using a phylogenetic reconstruction (Galtier et al., 1999; Di Giulio, 2001, 2003a,b). This type of analysis can be combined with an oxyphobic index (OI) (Archetti & Di Giulio, 2006), which can be associated with any protein and measures the propensity of proteins to be in an aerobic or anaerobic environment. Together they allow us to establish whether the LUCA was an aerobe or an anaerobe and, therefore, to test the hypothesis of Castresana & Saraste (1995). This is achieved by simply comparing the OI of the phylogenetically reconstructed LUCA protein with that of the proteins from aerobic or anaerobic organisms. The LUCA is concluded to have been an aerobe if the reconstructed protein has an OI value not dissimilar to that of proteins of aerobes but different from that of proteins of anaerobes, and vice versa. This method has been used for temperature using a thermophily index (Di Giulio, 2001, 2003a,b) and is extended here to oxygen using an OI in the same way (Archetti & Di Giulio, 2006).
Obviously, the same analysis makes it possible to
obtain information on whether or not the other ancestors
of the three domains of life could perform respiration. In
© 2006 THE AUTHOR 20 (2007) 543–548
JOURNAL COMPILATION © 2006 EUROPEAN SOCIETY FOR EVOLUTIONARY BIOLOGY
particular, it would be extremely interesting to determine
whether the ancestor of the Eukarya domain was an
anaerobe because, if it was, it would imply for instance
that models predicting the origin of mitochondria in a
prokaryotic host followed by the acquisition of specific
eukaryotic features (Searcy, 1992; Martin & Muller,
1998; Vellai et al., 1998) must be false. Indeed, if the
ancestor of eukaryotes was an anaerobe, this would
necessarily imply that the evolution of eukaryotes passed
through an amitochondriate eukaryote, which is not
predicted by these models (Searcy, 1992; Martin &
Muller, 1998; Vellai et al., 1998) of the origin of the
eukaryote cell (Embly & Martin, 2006). This might make
the present analysis particularly important.
Finally, having information on the respiratory capability of the other two ancestors, those of the Archaea
and Bacteria domains, would be of interest as it might
help to decide whether the first lines of divergence of
these domains were aerobic or anaerobic, and this clearly
depends only on what their ancestors were. Obviously
this might contribute to identifying the environment in
which the first organisms lived.
On the whole, these considerations encouraged me to
undertake the analysis proposed here.
Materials and methods
The protein sequences were taken from NCBI using BLASTP (Altschul et al., 1997). The protein alignments were constructed using CLUSTAL X (Thompson et al., 1997). Only the regions between highly conserved amino acid sites were maintained in the final alignment used in the analysis. These alignments contain no gaps. The alignments in NEXUS format of all proteins used in the analysis are available upon request.
The ancestral sequences of the nodes of interest were reconstructed using the maximum likelihood method with the ANCESTOR program of Zhang & Nei (1997). All the recommendations made by these authors were followed in using the program.
The OI (Archetti & Di Giulio, 2006) is defined by:
$$\mathrm{OI} = \frac{1}{N}\sum_{j=1}^{N} R_j,$$
where $R_j$ is the oxyphobic rank of the amino acid at position $j$ (Table 1) (Archetti & Di Giulio, 2006) and $N$ is the number of amino acids forming the protein.
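As a minimal sketch of this definition (not the code used in the original analysis), the OI of a gap-free protein sequence can be computed by averaging the Table 1 ranks over its residues. The function name `oxyphobic_index`, the use of one-letter amino acid codes and the toy peptide are assumptions made for illustration only.

```python
# Oxyphobic ranks as listed in Table 1, keyed by one-letter amino acid code.
OXYPHOBIC_RANK = {
    "C": 20, "R": 19, "M": 18, "Y": 17, "E": 16, "N": 15,
    "V": 10, "D": 10, "P": 10, "W": 10, "G": 10,
    "A": 10, "Q": 10, "H": 10, "T": 10,
    "F": 5, "I": 4, "L": 3, "K": 2, "S": 1,
}


def oxyphobic_index(sequence):
    """Return OI = (sum of R_j over the N residues) / N for a gap-free protein."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("empty sequence")
    unknown = set(residues) - set(OXYPHOBIC_RANK)
    if unknown:
        raise ValueError(f"no oxyphobic rank for: {sorted(unknown)}")
    return sum(OXYPHOBIC_RANK[aa] for aa in residues) / len(residues)


# Hypothetical toy peptide (Cys-Arg-Ser-Leu): (20 + 19 + 1 + 3) / 4
print(oxyphobic_index("CRSL"))
```

Because the alignments used here contain no gaps, the sketch rejects any character without a rank instead of skipping it silently.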
The aerobic organisms and the obligate anaerobes were identified by consulting Jacobs & Gerstein (1960) and Staley et al. (1984) and the web site http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=File&DB=genomeprj.
Table 1. Oxyphobic ranks defined on the basis of the total number of amino acid substitutions involving a given amino acid and deriving from a comparison of numerous proteins from an aerobe and an anaerobe (Archetti & Di Giulio, 2006).
Amino acid  Rank
Cys         20
Arg         19
Met         18
Tyr         17
Glu         16
Asn         15
Val         10
Asp         10
Pro         10
Trp         10
Gly         10
Ala         10
Gln         10
His         10
Thr         10
Phe          5
Ile          4
Leu          3
Lys          2
Ser          1
The statistical test (Balaam, 1972) used to determine whether a given OI value of a reconstructed ancestral sequence belongs to the set of sequences of aerobes or of anaerobes is given by:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}},$$
where $\bar{x}$ is the mean of the OI values of the protein sequences of aerobic or anaerobic organisms; $\mu$ is the OI value of a particular reconstructed ancestral sequence for which we aim to establish whether or not it belongs to the set of sequences of aerobes or of anaerobes; $s$ is the standard deviation and $n$ is the number of proteins from aerobes or anaerobes being considered (see also the legend to Table 3). This test has already been used elsewhere (Di Giulio, 2001, 2003a,b). The other statistical test used is an unpaired t-test (Balaam, 1972) to establish whether the mean OI value of proteins from aerobic organisms is significantly different from that of proteins from anaerobic organisms.
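The one-sample test above can be sketched in a few lines; this is an illustration under stated assumptions, not the original computation. The function name `one_sample_t` is hypothetical, and the worked values are those printed in Table 3 for leucyl-tRNA synthetase (aerobic sample), with the column headed "Std. Error" taken as the standard deviation s. That reading is an assumption, but it reproduces the printed t of about −6.167.

```python
import math


def one_sample_t(group_mean, group_sd, n, ancestor_oi):
    """Return (t, d.f.) with t = (x_bar - mu) / (s / sqrt(n)) and d.f. = n - 1."""
    if n < 2:
        raise ValueError("at least two sequences are needed")
    t = (group_mean - ancestor_oi) / (group_sd / math.sqrt(n))
    return t, n - 1


# Leucyl-tRNA synthetase, aerobic sample, reconstructed LUCA sequence (Table 3).
# group_sd is the value printed under "Std. Error" (assumed to be s).
t, df = one_sample_t(group_mean=9.251, group_sd=0.353, n=31, ancestor_oi=9.642)
print(f"t = {t:.3f}, d.f. = {df}")
```

The same call with the anaerobic sample (mean 9.552, s 0.286, n 20) gives a t close to the −1.407 printed in Table 3.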
Results and discussion
The proteins used in the analysis and the reconstruction of ancestral sequences
I have used a total of 1763 proteins from 33 different families of orthologous proteins. The oxyphobic index (OI) was calculated for all these proteins (see Materials and methods). Only six of the 33 families of orthologous proteins showed significantly different mean OI values between aerobic organisms and obligate anaerobes (Table 2). For these six proteins (Table 2) I have reconstructed the ancestral sequences in the following way.
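The screening step behind Table 2 can be sketched from summary statistics alone. The sketch assumes a pooled-variance Student's t-test, which is consistent with the degrees of freedom in Table 2 (for leucyl-tRNA synthetase, 31 + 20 − 2 = 49) but is not stated explicitly in the text. The function name `pooled_t` is hypothetical, and the inputs are the rounded means, "Std. Error" values (read as standard deviations) and sample sizes printed in Table 3, so the result matches Table 2 only approximately.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired t-test with pooled variance; returns (t, d.f.)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Leucyl-tRNA synthetase: aerobes (9.251, 0.353, n = 31) versus
# anaerobes (9.552, 0.286, n = 20), values as printed in Table 3.
t, df = pooled_t(9.251, 0.353, 31, 9.552, 0.286, 20)
print(f"mean diff. = {9.251 - 9.552:.3f}, t = {t:.2f}, d.f. = {df}")
```

With these rounded inputs the sketch gives t close to −3.19, against the −3.184 reported in Table 2; the small gap is what one would expect from rounding in the printed summary values.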
For each of the different families of orthologous proteins, the individual proteins were chosen in such a way that each of the three domains of life was well represented. Moreover, as far as possible, I have tried to represent each phyletic group of organisms with at least one protein sequence. Finally, the obligate anaerobes were, as far as possible, chosen so as to be randomly distributed over the phylogenetic tree topology used to reconstruct the ancestral sequences. The percentage of protein sequences from these anaerobic organisms was always lower than that of the aerobes (see, for instance, Table 3).
Table 2. Results of the unpaired t-test (Balaam, 1972) applied in order to establish whether the difference between the mean of the oxyphobic index values of protein sequences from aerobes and that from anaerobes is statistically significant.
Protein                                 Mean diff.  d.f.  t-value  P-value
Leucyl-tRNA synthetase                  −0.301      49    −3.184   0.0025
Arginyl-tRNA synthetase                 −0.241      57    −2.411   0.019
ATPase subunit A                        −0.235      44    −2.972   0.0048
Inosine-5′ monophosphate dehydrogenase  −0.290      50    −3.188   0.0025
S-adenosyl-L-homocysteine hydrolase     −0.306      38    −3.101   0.0036
Valyl-tRNA synthetase                   −0.331      50    −3.877   0.0003
Mean diff., difference between the mean from the aerobes and that from the anaerobes.
Table 3. Values of the statistical analysis for the oxyphobic index values relative to the reconstructed sequences of the LUCA (OI) of the six orthologous proteins used in the analysis.
LUCA, aerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.642   9.251    0.353       −6.167   31  P << 10^-3
Arginyl-tRNA synthetase                 9.548   9.364    0.393       −2.886   38  10^-3 < P < 0.01
ATPase subunit A                        9.992   9.586    0.243       −9.598   33  P << 10^-3
Inosine-5′ monophosphate dehydrogenase  9.414   9.251    0.269       −3.880   41  P < 10^-3
S-adenosyl-L-homocysteine hydrolase     10.040  9.537    0.262       −10.159  28  P << 10^-3
Valyl-tRNA synthetase                   9.688   9.481    0.261       −4.824   37  P < 10^-3
LUCA, anaerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.642   9.552    0.286       −1.407   20  0.10 < P < 0.20*
Arginyl-tRNA synthetase                 9.548   9.605    0.311       +0.840   21  0.40 < P < 0.50
ATPase subunit A                        9.992   9.821    0.236       −2.612   13  0.02 < P < 0.05*
Inosine-5′ monophosphate dehydrogenase  9.414   9.541    0.263       +1.602   11  0.10 < P < 0.20
S-adenosyl-L-homocysteine hydrolase     10.040  9.843    0.338       −2.019   12  0.05 < P < 0.10*
Valyl-tRNA synthetase                   9.688   9.812    0.318       +1.510   15  0.10 < P < 0.20
The mean OI is the mean value of the oxyphobic index of protein sequences from aerobes or anaerobes; Std. Error is the standard error; t is the value of the t-test (see Materials and methods); n is the number of sequences from aerobes or anaerobes (d.f. = n − 1); P is the probability of observing that particular value of t. The t-test is performed twice, once under the hypothesis that the LUCA was an aerobe and once under the hypothesis that it was an anaerobe. In each case the null hypothesis of the t-test is that the value of the oxyphobic index of the ancestral sequence (OI) is equal to the mean OI. Therefore, for example, a probability lower than or equal to 5% indicates that these two OI values are significantly different from each other, that the reconstructed ancestral sequence (OI) therefore does not belong to the sequences of aerobes (anaerobes), and hence that the LUCA was not an aerobe (anaerobe). Alternatively, a probability greater than 5% indicates that the two OI values are statistically equal and, therefore, that the LUCA was an aerobe (anaerobe). * For these t values there is no evidence against the null hypothesis, although the interval is indicated nevertheless.
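The decision rule described in the legend to Table 3 can be sketched as a pair of two-sided tests at the 5% level. This is an illustration only: the function names `differs_from_group` and `classify_ancestor` are hypothetical, the critical values 2.042 (30 d.f.) and 2.093 (19 d.f.) are the standard two-sided 5% points of Student's t distribution, and the special treatment of the entries marked with an asterisk in the tables is not modelled.

```python
def differs_from_group(t, critical_value):
    """True if |t| reaches the two-sided critical value, i.e. P <= 5%."""
    return abs(t) >= critical_value


def classify_ancestor(t_aerobes, crit_aerobes, t_anaerobes, crit_anaerobes):
    """Combine the two tests of the Table 3 legend into one label."""
    not_aerobe = differs_from_group(t_aerobes, crit_aerobes)
    not_anaerobe = differs_from_group(t_anaerobes, crit_anaerobes)
    if not_aerobe and not not_anaerobe:
        return "anaerobe"
    if not_anaerobe and not not_aerobe:
        return "aerobe"
    return "ambiguous"  # both tests reject, or neither does


# Leucyl-tRNA synthetase, LUCA (Table 3): t = -6.167 with 30 d.f. against
# aerobes, t = -1.407 with 19 d.f. against anaerobes.
print(classify_ancestor(-6.167, 2.042, -1.407, 2.093))
```

Applied to the leucyl-tRNA synthetase values for the Eukarya ancestor (t = −0.599 against aerobes, t = +4.112 against anaerobes), the same rule returns the opposite label.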
To reconstruct the ancestral sequences of the nodes of interest, I have used the maximum likelihood method of Zhang & Nei (1997), which needs a topology of the phylogenetic tree. As previously introduced and extensively discussed and justified, I have used an unrooted topology of the tree of life (Di Giulio, 2003b) to reconstruct these ancestral sequences. In other words, the node of the LUCA in the topology of the unrooted phylogenetic trees used in these reconstructions refers to the deepest node in the tree, i.e. the one that is directly connected to the three domains of life (Di Giulio, 2003b). The topologies of the parts of the trees within the single domains of life were reconstructed as follows. (i) For the Archaea domain, I have used the tree topology identified in the work of Brochier et al. (2005) (bear in mind that I have never used protein sequences from Nanoarchaeum equitans). (ii) For the Eukarya domain, I have used the topology identified by Ciccarelli et al. (2006) and parts of the topologies reported in Philippe et al. (2004); I have also imposed that the first lines of divergence in this domain were the representatives of the anaerobes of Diplomonadida (for instance Giardia lamblia) and of the Entamoebidae (for instance Entamoeba histolytica), as observed, for instance, in the tree topology of ribosomal RNA (Leipe et al., 1993; Hashimoto et al., 1997). (iii) For the Bacteria domain, I have used the first four lines of divergence identified in the tree topology of ribosomal RNA (Olsen et al., 1994), whereas for the rest of the topology I have used that reported in Ciccarelli et al. (2006).
Once a given phylogenetic tree topology has been reconstructed for a certain protein, this topology has always been visualized, and hence checked, using the options contained in PAUP (Swofford, 1993).
The LUCA and the ancestors of Archaea and Bacteria were anaerobes
Table 3 clearly shows that the LUCA was most probably an anaerobe because the OI value of the reconstructed ancestral sequences is not different, in any of the six cases examined, from the mean of the OI values of sequences from anaerobic organisms present in the sample from which the ancestral sequences were reconstructed. This sharp conclusion is explicitly supported by the fact that all the OI values of the reconstructed ancestral sequences are different from the mean of the OI values of sequences from the aerobic organisms present in the sample from which the ancestral sequences were reconstructed, with a high statistical significance in five of six cases (Table 3).
A substantially equivalent conclusion can be reached for the ancestor of the Archaea domain, although the result of the statistical test is ambiguous in one out of six tests, for the aerobic part of the sequences of arginyl-tRNA synthetase (Table 4). However, the controls referred to in a following section showed that this is only a statistical fluctuation as, for instance, lowering the percentage of anaerobe sequences present in the sample from 35.6% (21/59) to 19.1% (9/47) provides an OI value for the ancestral sequence that is statistically different from the mean of OI values of sequences from aerobes (t = −6.854, d.f. = 37, P << 10^-3) (data not shown).
The ancestor of Bacteria also appears to be an anaerobic ‘organism’ and, in five of six cases, with a very high statistical significance (Table 5), as both tests agree. The tests indicate that, for this ancestor, there is no difference from the behaviour of anaerobes whereas a difference is observed compared with that of aerobes (Table 5). Once again the low statistical significance for sequences of arginyl-tRNA synthetase in the aerobic part of Table 5 is because of a statistical fluctuation: in the sample which has only 19.1% of sequences from anaerobes, the test turns out to be highly significant (t = −4.831, d.f. = 37, P < 10^-3) (data not shown).
Table 4. Results for the Archaea ancestor. For the various meanings, see the legend to Table 3 and the text.
Archaea ancestor, aerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.642   9.251    0.353       −6.167   31  P << 10^-3
Arginyl-tRNA synthetase                 9.473   9.364    0.393       −1.710   38  0.05 < P < 0.10
ATPase subunit A                        9.992   9.586    0.243       −9.598   33  P << 10^-3
Inosine-5′ monophosphate dehydrogenase  9.602   9.251    0.269       −8.355   41  P << 10^-3
S-adenosyl-L-homocysteine hydrolase     10.193  9.537    0.262       −13.249  28  P << 10^-3
Valyl-tRNA synthetase                   9.909   9.481    0.261       −9.975   37  P << 10^-3
Archaea ancestor, anaerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.642   9.552    0.286       −1.407   20  0.10 < P < 0.20*
Arginyl-tRNA synthetase                 9.473   9.605    0.311       +1.945   21  0.05 < P < 0.10
ATPase subunit A                        9.992   9.821    0.236       −2.612   13  0.02 < P < 0.05*
Inosine-5′ monophosphate dehydrogenase  9.602   9.541    0.263       −0.769   11  0.40 < P < 0.50*
S-adenosyl-L-homocysteine hydrolase     10.193  9.843    0.338       −3.587   12  10^-3 < P < 0.01*
Valyl-tRNA synthetase                   9.909   9.812    0.318       −1.181   15  0.20 < P < 0.30*
Table 5. Results for the Bacteria ancestor. For the various meanings, see the legend to Table 3 and the text.
Bacteria ancestor, aerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.636   9.251    0.353       −6.072   31  P << 10^-3
Arginyl-tRNA synthetase                 9.505   9.364    0.393       −2.212   38  0.02 < P < 0.05
ATPase subunit A                        9.978   9.586    0.243       −9.267   33  P << 10^-3
Inosine-5′ monophosphate dehydrogenase  9.496   9.251    0.269       −5.189   41  P < 10^-3
S-adenosyl-L-homocysteine hydrolase     10.040  9.537    0.262       −10.159  28  P << 10^-3
Valyl-tRNA synthetase                   9.712   9.481    0.261       −5.384   37  P << 10^-3
Bacteria ancestor, anaerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.636   9.552    0.286       −1.313   20  0.20 < P < 0.30*
Arginyl-tRNA synthetase                 9.505   9.605    0.311       +1.473   21  0.10 < P < 0.20
ATPase subunit A                        9.978   9.821    0.236       −2.399   13  0.02 < P < 0.05*
Inosine-5′ monophosphate dehydrogenase  9.496   9.541    0.263       +0.567   11  0.40 < P < 0.50
S-adenosyl-L-homocysteine hydrolase     10.040  9.843    0.338       −2.019   12  0.05 < P < 0.10*
Valyl-tRNA synthetase                   9.712   9.812    0.318       +1.218   15  0.20 < P < 0.30
The ancestor of the Eukarya domain was an aerobe
The ancestor of eukaryotes turns out to be an aerobe as the OI values of the ancestral sequences are not different from the means of the OI values of sequences from aerobic organisms (Table 6). This conclusion is supported by the fact that the parallel test performed on the sample of sequences from anaerobes shows that the OI values of the ancestral sequences of this ancestor are different from their respective means (Table 6). This is observed with a high statistical significance with the exception of just one case out of six, namely that of S-adenosyl-L-homocysteine hydrolase, in which the test is in any case marginally significant (Table 6).
Table 6. Results for the Eukarya ancestor. For the various meanings, see the legend to Table 3 and the text.
Eukarya ancestor, aerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.289   9.251    0.353       −0.599   31  0.50 < P < 0.60
Arginyl-tRNA synthetase                 9.215   9.364    0.393       +2.337   38  0.02 < P < 0.05
ATPase subunit A                        9.538   9.586    0.243       +1.135   33  0.20 < P < 0.30
Inosine-5′ monophosphate dehydrogenase  9.034   9.251    0.269       +5.165   41  P < 10^-3*
S-adenosyl-L-homocysteine hydrolase     9.629   9.537    0.262       −1.858   28  0.05 < P < 0.10
Valyl-tRNA synthetase                   9.548   9.481    0.261       −1.561   37  0.10 < P < 0.20
Eukarya ancestor, anaerobic
Protein                                 OI      Mean OI  Std. Error  t        n   P
Leucyl-tRNA synthetase                  9.289   9.552    0.286       +4.112   20  P < 10^-3
Arginyl-tRNA synthetase                 9.215   9.605    0.311       +5.747   21  P << 10^-3
ATPase subunit A                        9.538   9.821    0.236       +4.324   13  P < 10^-3
Inosine-5′ monophosphate dehydrogenase  9.034   9.541    0.263       +6.394   11  P < 10^-3
S-adenosyl-L-homocysteine hydrolase     9.629   9.843    0.338       +2.193   12  P = 0.051
Valyl-tRNA synthetase                   9.548   9.812    0.318       +3.215   15  10^-3 < P < 0.01
Examination of the effect of the percentage of protein sequences from anaerobes on the stability and robustness of the observations
A criticism that could be raised against the above reported results is that they are the effect of the relatively high percentage of sequences from anaerobic organisms present in the protein sample (Table 3) and do not reflect, for instance, that the LUCA was actually an anaerobe. This protein sample has 30.7% of sequences from anaerobes (Table 3). This is a relatively high percentage but it is not clear, for instance, why this must impose an anaerobic LUCA. In other words, as this percentage is lower than that of the aerobes, this should perhaps impose an aerobic LUCA [and not an anaerobic one as has been observed (Table 3)], even if this depends partly on the tree topology. However, a simple way of checking the stability and robustness of the observations made so far (Tables 3–6) seems to be to lower the percentage of sequences from anaerobes present in the six different orthologous proteins (Table 3). I have therefore performed two successive removals of sequences of anaerobes, the first lowering the percentage of anaerobe sequences to about 20%, and the second to about 12%. I have observed complete stability of the data shown in Tables 3–6 (data not shown). This makes it possible to say that the above referred observations do not depend on the relatively high percentage of sequences from anaerobes present in the sample, and that the results are stable at least until the percentage is lowered to about 12%.
In conclusion, it seems likely that the observations presented herein cannot be attributed to a particular effect of sequences from anaerobes.
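The arithmetic of the robustness check can be illustrated with a short sketch. It computes how many anaerobe sequences can be retained so that their share of the sample does not exceed a target percentage; the function name `anaerobes_to_keep` and the rule of staying at or below the target are assumptions, since the text does not say how the retained sequences were chosen. For arginyl-tRNA synthetase (38 aerobe and 21 anaerobe sequences) the rule happens to reproduce the 9/47 (19.1%) sample quoted in the text.

```python
def anaerobes_to_keep(n_aerobes, n_anaerobes, target_fraction):
    """Largest k <= n_anaerobes with k / (n_aerobes + k) <= target_fraction."""
    if not 0 <= target_fraction < 1:
        raise ValueError("target_fraction must lie in [0, 1)")
    k = n_anaerobes
    while k > 0 and k / (n_aerobes + k) > target_fraction:
        k -= 1
    return k


# Arginyl-tRNA synthetase sample: 38 aerobes and 21 anaerobes (21/59 = 35.6%).
for target in (0.20, 0.12):
    k = anaerobes_to_keep(38, 21, target)
    print(f"target {target:.0%}: keep {k} of 21, i.e. {k}/{38 + k} = {k / (38 + k):.1%}")
```

The ancestral sequences would then be reconstructed again on each reduced sample and the tests of Tables 3–6 repeated.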
Conclusions
Castresana & Saraste (1995) suggest that the LUCA might
have been an aerobic ‘organism’. The observations
reported in the present paper do not seem to corroborate
this hypothesis and, indeed, seem to falsify it. It seems
that the LUCA was most probably an anaerobe (Table 3).
However, high concentrations of oxygen are not necessarily needed for aerobic respiration (Castresana &
Saraste, 1995) and therefore, as the present analysis is
perhaps not sufficient to detect low oxygen concentrations, it might consequently be unsuitable for testing the
hypothesis of Castresana & Saraste (1995).
The observations presented here favour the hypothesis that the ancestor of eukaryotes was an aerobe (Table 6). This implies that it had mitochondria or, at least, that it is not distinguishable from this evolutionary stage. As mitochondria originated by symbiosis from an α-proteobacterium (Gray et al., 2004), this implies that the ancestor of eukaryotes either appeared after the appearance of the O2-consuming α-proteobacteria (Moreira & Lopez Garcia, 1998; Cavalier-Smith, 2002, 2004; Lake et al., 2005; Margulis et al., 2005; Embly & Martin, 2006) or was the result of the fusion between an α-proteobacterium or its ancestor and an archaebacterium (Searcy, 1992; Martin & Muller, 1998; Vellai et al., 1998; Embly & Martin, 2006). Unfortunately, the observation that the ancestor of eukaryotes was an aerobe is not on its own able to discriminate between the different hypotheses suggested to explain the origin of the eukaryotic cell (Searcy, 1992; Martin & Muller, 1998; Moreira & Lopez Garcia, 1998; Vellai et al., 1998; Cavalier-Smith, 2002, 2004; Lake et al., 2005; Margulis et al., 2005; Embly & Martin, 2006), as this observation does not exclude the existence of the amitochondriate eukaryote. The latter might therefore have been the first and true ancestor of eukaryotes, from which the mitochondriate eukaryote emerged by symbiosis with an O2-consuming α-proteobacterium, with the consequential extinction of all other lineages of amitochondriate eukaryotes.
Finally, the observations referred here seem to underline a widely accepted concept that the origin of life and
the main phases of the evolution of the first organisms
populating the Earth took place in an anaerobic environment, such as the ocean abysses.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z.,
Miller, W. & Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res. 25: 3389–3402.
Archetti, M. & Di Giulio, M. 2006. The evolution of the genetic
code took place in an anaerobic environment. J. Theor. Biol.
(in press).
Balaam, L.N. 1972. Fundamentals of biometry, pp. 120–142.
George Allen & Unwin, London.
Brochier, C., Gribaldo, S., Zivanovic, Y., Confalonieri, F. &
Forterre, P. 2005. Nanoarchaea: representatives of a novel
archaeal phylum or a fast-evolving euryarchaeal lineage
related to Thermococcales? Genome Biol. 6: R42–R52.
Castresana, J. & Saraste, M. 1995. Evolution of energetic
metabolism: the respiration-early hypothesis. Trends Biochem.
Sci. 20: 443–448.
Cavalier-Smith, T. 2002. The phagotrophic origin of eukaryotes
and phylogenetic classification of Protozoa. Int. J. Syst. Evol.
Microbiol. 52: 297–354.
Cavalier-Smith, T. 2004. Only six kingdoms of life. Proc. R. Soc.
Lond. B 271: 1251–1262.
Ciccarelli, F.D., Doerks, T., von Mering, C., Creevey, C.J., Snel,
B. & Bork, P. 2006. Toward automatic reconstruction of a
highly resolved tree of life. Science 311: 1283–1287.
Di Giulio, M. 2001. The universal ancestor was a thermophile or
a hyperthermophile. Gene 281: 11–17.
Di Giulio, M. 2003a. The universal ancestor was a thermophile
or a hyperthermophile: tests and further evidence. J. Theor.
Biol. 221: 425–436.
Di Giulio, M. 2003b. The universal ancestor and the ancestor of
Bacteria were hyperthermophiles. J. Mol. Evol. 57: 721–730.
Embly, T.M. & Martin, M. 2006. Eukaryotic evolution, changes
and challenges. Nature 440: 623–630.
Galtier, N., Tourasse, N. & Gouy, M. 1999. A nonhyperthermophilic common ancestor to extant life forms. Science 283: 981–
987.
Gray, M.W., Lang, B.F. & Burger, G. 2004. Mitochondria of
protists. Ann. Rev. Genet. 38: 477–524.
Hashimoto, T., Nakamura, Y., Kamaishi, T. & Hasegawa, M.
1997. Early evolution of eukaryotes inferred from the amino
acid sequences of elongation factors 1α and 2. Arch. Protistenkd.
148: 287–295.
Jacobs, M.J. & Gerstein, M.J. 1960. Handbook of Microbiology. D.
Van Nostrand Company, Inc., London.
Kasting, J.F. 1993. Earth’s early atmosphere. Science 259: 920–
926.
Klein, C. 1992. Isotopic Compositions of Iron-Formation
Carbonates. In: The Proterozoic Biosphere (W. Schopf & C.
Klein, eds), pp. 137–174. Cambridge University Press, Cambridge.
Lake, J., Moore, J., Simonson, A. & Rivera, M. 2005. Origin of
the Eukaryotic Nucleus. In: Microbial phylogeny and evolution
concepts and controversies (J. Sapp, ed.), pp. 184–206. Oxford
Univ. Press, Oxford.
Lazcano, A. & Miller, S.L. 1996. The origin and early evolution
of life: prebiotic chemistry, the pre-RNA world, and time. Cell
85: 793–798.
Leipe, D.D., Gunderson, J.H., Nerad, T.A. & Sogin, M.L. 1993.
Small subunit ribosomal RNA of Hexamita inflata and the quest
for the first branch in the eukaryotic tree. Mol. Biochem.
Parasitol. 59: 41–48.
Margulis, L., Dolan, M.F. & Whiteside, J.H. 2005. ‘‘Imperfections
and oddities’’ in the origin of nucleus. Paleobiology 31: 175–
191.
Martin, W. & Muller, M. 1998. The hydrogen hypothesis for the
first eukaryote. Nature 392: 37–41.
Moreira, D. & Lopez Garcia, P. 1998. Symbiosis between
methanogenic archaea and δ-proteobacteria as the origin of
eukaryotes: the syntrophic hypothesis. J. Mol. Evol. 47: 517–
530.
Olsen, G.J., Woese, C.R. & Overbeek, R. 1994. The winds of
(evolutionary) change: breathing new life into microbiology.
J. Bacteriol. 176: 1–6.
Philippe, H., Snell, E.A., Bapteste, E., Lopez, P., Holland, P.W.H.
& Casane, D. 2004. Phylogenomics of eukaryotes: impact of
missing data on large alignments. Mol. Biol. Evol. 21: 1740–
1752.
Samuilov, V.D. 2005. Energy problems in life evolution.
Biochemistry (Moscow) 70: 972–979.
Searcy, D.G. 1992. The Evolutionary Origin of Mitochondria. In:
The Origin and Evolution of the Cell (H. H. Matsuno & K.
Matsuno, eds), pp. 47–78. World Scientific, Singapore.
Staley, J.T., Bryant, M.P., Pfennig, N. & Holt, J.G. 1984.
Anoxygenic Phototrophic Bacteria. In: Bergey’s Manual of
Systematic Bacteriology, Vol. 3 (W. R. Hensyl, ed.), pp. 1635–1889. Lippincott
Williams & Wilkins, Philadelphia.
Swofford, D.L. 1993. PAUP: Phylogenetic Analysis Using Parsimony, version 3.1.1. Laboratory of Molecular Systematics,
Smithsonian Institution, Champaign, IL.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. &
Higgins, D.G. 1997. The CLUSTAL_X windows interface:
flexible strategies for multiple sequence alignment aided by
quality analysis tools. Nucleic Acids Res. 25: 4876–4882.
Vellai, T., Takacs, K. & Vida, G. 1998. A new aspect to the origin
and evolution of eukaryotes. J. Mol. Evol. 46: 499–507.
Zhang, J. & Nei, M. 1997. Accuracies of ancestral amino acid
sequences inferred by the parsimony, likelihood, and distance
methods. J. Mol. Evol. 44(Suppl. 1): S139–S146.
Received 6 June 2006; revised 14 September 2006; accepted 15
September 2006