HYPOTHESIS The indefinable term “prokaryote” and the polyphyletic origin of genes MASSIMO DI GIULIO Early Evolution of Life Laboratory, Institute of Biosciences and Bioresources, CNR, Via P. Castellino, 111, 80131 Naples, Italy email: [email protected] Running title: Prokaryote term and gene polyphyletic origin Keywords: Progenote - LUCA - Evolutionary stages – Early Evolution of Life Abstract An analysis has been performed of implications existing between the presence/absence of the evolutionary stage of the Prokaryote, that is to say, the presence/absence of common characteristics between archaea and bacteria, and the monophyletic/polyphyletic origin of genes of proteins. Thereafter, a theorem stating that: “the polyphyletic origin of proteins would imply absence of common characteristics between bacteria and archaea and therefore the lack of the evolutionary stage of the Prokaryote, and vice versa that the indefinable Prokaryote stage implies a polyphyletic origin of proteins”, has been made and validated. The conclusion is that given the absence of true in common characteristics between archaea and bacteria, the origin of genes of proteins should have been polyphyletic. The non-biological meaning of the Prokaryote term and the polyphyletic origin of genes: an introduction The monophyletic origin of genes seems to be the only sensible explanation for their existence. This would derive, for example, from the observation that we can build plausible multiple alignments of orthologous proteins taken from the three domains of life. In a more explicit way, the fact that it is possible to align orthologous proteins from the three domains of life would seem to imply that these proteins were already present at the Last Universal Common Ancestor (LUCA) stage. Therefore, they were simply inherited in the three domains of life, thus defining a their monophyletic origin. This reasoning seems to be sensible and therefore would discredit considerably the polyphyletic hypothesis as an explanation for the origin of genes of proteins. However, it is possible, at least in principle, that at the LUCA stage protein modules were present, active, and codified by means of pieces of RNA that were assembled by trans-splicing to form mRNAs. The translation of these mRNAs resulted in proteins whose genes (DNA) actually evolved only later, that is to say, only after the formation of the main phyletic lines (Di Giulio 2008a). In other words, at the LUCA stage protein modules might have existed and codified by means of pieces of RNA. The one that we observe today in the multiple alignments of orthologous proteins is substantially and primarily an alignment of modules. These modules might have been different in the different phyletic lines of divergence (also if homologues), because assembled by trans-splicing only late on, during or after the formation of the three domains of life (Di Giulio 2006b, 2008a, 2008b, 2008d), to form proteins that we align today. In this way we would be able to explain why we can obtain multiple alignments of the same protein taken from organisms from the three domains of life, also under the hypothesis that the model for their evolution has been the polyphyletic one and not only that monophyletic (Di Giulio 2006b, 2008a, 2008b, 2008d). There are several observations that bring us to consider that the origin of genes could have been polyphyletic. Firstly, it has been suggested that the origin of DNA might have been relatively recent, that is to say, only after that the main plyletic lines were established (Mushegian and Koonin 1996; Leipe et al. 1999; Aravind et al. 1999; Poole and Logan 2005; Forterre 2005, 2006). If this would have been true then the origin of genes must have been necessarily polyphyletic, having DNA, for hypothesis, evolved late on and independently in the main phyletic lines (Di Giulio 2008a). Furthermore, there are strong observations that bring us to consider that the origin of genes of tRNAs has been polyphyletic (Di Giulio 1999, 2004, 2006a, 2006b, 2008a, 2008b, 2008c, 2008d, 2008e, 2009b 2012, 2013a, 2013b). Indeed, given that the hypothesis of ancestrality of split tRNA genes of Nanoarchaeum equitans has been strongly corroborated (Di Giulio 2004, 2005, 2006a, 2006b, 2008a, 2008b, 2008c, 2008d, 2008e, 2009a, 2009b, 2011, 2012a, 2012b, 2014; Branciamore and Di Giulio 2011). The presence of split tRNA genes both at the LUCA stage and in actual organisms should then prove a late assembly of the genes of tRNAs, in the main phyletic lines. In other words, their origin should have been polyphyletic (Di Giulio 1999, 2004, 2006b, 2012a, 2013a, 2013b). There are also other observations that consider that the origin of genes of proteins might have been polyphyletic (Lupas et al. 2001; Di Giulio 1999, 2004, 2006b, 2012a, 2013a, 2013b). In conclusion, although the polyphyletic hypothesis for the origin of genes has never been taken in account, there are several observations and explanations that bring us to consider that such origin is less absurd than commonly thought (Di Giulio 1999, 2004, 2006b, 2012, 2013a, 2013b). There is a debate on the Prokaryote term, that is to say if it has, or does not have, an actual biological meaning (Pace 2006, 2009; Martin and Koonin 2006; Sapp 2006; Dolan and Margulis 2007; Whitman 2009; Doolittle and Zhaxybayeva 2009, 2013; Fox-Keller 2010; Di Giulio 2015). Indeed, although analyses have reconstructed a complex LUCA, for example, composed of genes and proteins comparable to that of current organisms (Mushegian and Koonin 1996; Ouzounis and Kyrpides 1996; Ouzounis et al. 2006; Mat et al. 2008; Tuller et al. 2010). We are, nevertheless, unable to define in a likewise clear and immediate way what in reality is a prokaryote – i.e. what characteristics it had - as, in contrast, these analyses would easily seem to imply (Di Giulio 2015). That is to say, we are not essentially able to associate to the Prokaryote word a single characteristic, for instance, of the cellular level that is really in common between bacteria and archaea. This would seem to deny the very existence of the Prokaryote stage because not appreciable and not supported by the presence of at least a character that was in common between archaea and bacteria (Di Giulio 2015; Fig. 1b). It has been maintained that hundreds of genes, in particular the operational genes, are commonly present in archaea and bacteria and they are generally absent in eukaryotes (Doolittle and Zhaxybayeva 2013). These genes would identify a common ancestor between them (Doolittle and Zhaxybayeva 2013). But this would not clarify the meaning of the Prokaryote term because the existence of a common ancestor between bacteria and archaea indicated from the presence of in common proteins between them - would not imply that the evolutionary stage of the Prokaryote has existed. This is because these proteins might have been present – as seen above – under different forms, for example, under form of modules and assembled only late on, in the main phyletic lines. This implies a polyphyletic origin for proteins. Therefore the not existence of the evolutionary stage of the Prokaryote (Fig. 1a), but only that of a progenote (Fig. 1b), given the rapid dynamism that would seem involved in the evolution of these proteins (Di Giulio 2000, 2001, 2011, 2015; see below). It would result therefore that proteins would not be a good indicator of an evolutionary stage as for instance, the cellular membrane or the organization of a chromosome might instead be it. This is because proteins might not represent a clear cellular level. They would be a kind of character that might reflect both the presence of an evolutionary stage and its absence, given the evolutionary dynamism that would seem to be associated to their structure. Then, the set of problems would seem that to understand if the multiple alignments of orthologous proteins define “complete” proteins in the common ancestor, that is to say, they define that the evolutionary stage of the Prokaryote really existed (Fig. 1a). Or if, on the contrary, these multiple alignments would imply “incomplete” proteins, i.e. still in rapid evolution, that is to say, with a mode and tempo of evolution completely different from the current one. The latter scenario would imply therefore the absence of the evolutionary stage of the Prokaryote, and instead the presence of the one of the progenote (Fig. 1b). Multiple alignments of orthologous proteins have been analyzed and it has been concluded that these might support, or at least to be compatible, with a polyphyletic origin of genes, and thus the absence of the evolutionary stage of the Prokaryote (Di Giulio, 2008a). It being understood that multiple alignments of orthologous proteins from the domains of life should be read also in the light of a polyphyletic origin of genes. Here, I analyze this kind of set of problems, considering our incapacity to define with accuracy and immediacy the Prokaryote term and its relationships with the polyphyletic origin of proteins. [I assume that the domains of life are only two - Bacteria and Archaea – and not three, that is to say, eukaryotes are derived from some archaea as now there is the tendency to consider (Williams et al. 2012, 2013; Spang et al. 2015). However, the conclusions seem true also if eukaryotes would represent a third domain of life. Furthermore, all crucial terms used in this work have been previously defined (Woese 1987; Woese and Fox 1977; Doolittle and Brown 1994; Pace 2006, 2009; Di Giulio 2000, 2001, 2006b, 2011, 2015)]. The non-biological meaning of the Prokaryote term does imply a polyphyletic origin of genes and vice versa: a theorem We start to reason under the hypothesis that the origin of proteins has been polyphyletic. This should imply, by definition, an “immaturity” of proteins at the stage of LUCA. The absence of common characteristics between bacteria and archaea (Fig. 1b) would be a simple consequence of that. Indeed, the immaturity of proteins should imply, that is to say would be equivalent to, an immaturity of all cellular structures that, therefore, might not have been in common between bacteria and archaea as still rapidly evolving (Fig. 1b). Therefore, such to not define the evolutionary stage of the Prokaryote (genote) (Fig. 1a), but only that of a progenote (Fig. 1b). We can see that a polyphyletic origin of proteins should imply the absence of common structures between archaea and bacteria, and therefore our incapacity to define the Prokaryote term in a biological sense. Does exist the possibility that immature proteins, that is to say still in rapid evolution, can instead go to define complete structures, i.e. for instance, belonging to and defining a determined cellular level and thus an evolutionary stage? It seems to me that this is not possible because if proteins are immature, that is in rapid evolution, then also the structures of corresponding cellular level should be rapidly evolving. Consequently it seems to me remote the possibility that instead a structure of this “cellular” level can have been passed to bacteria and archaea, as still rapidly evolving, that is to say, with a tempo and mode typical of a progenote and not of a genote (Woese 1987; Woese and Fox 1977; Doolittle and Brown 1994; Di Giulio 2000, 2001, 2006b, 2011, 2015). Is the opposite also true? That is to say, the absence of in common characteristics between bacteria and archaea would imply an immaturity of proteins and therefore a polyphyletic origin of these or not? The answer seems to be positive. If proteins would have been in an advanced stage of “maturation”, this would have imposed evolved cellular structures, i.e. complete. These structures would have been in common between bacteria and archaea, and this, instead, is not observable. Indeed, the absence of common characteristics between archaea and bacteria is first of all absence of common cellular structures. This can only reflect the presence of proteins still in rapid evolution, i.e. premature, at the LUCA’s stage, and thus a polyphyletic origin of these latter. Whereas the absence of common characteristics between archaea and bacteria cannot reflect the presence of proteins completely evolved, i.e. mature. These proteins would have implied the very contrary. That is to say, the presence at the LUCA’s stage of at least some complete characters that would have become in common between bacteria and archaea, and this is not true for hypothesis. In conclusion, it seems that all this can be included in the following theorem: “the polyphyletic origin of proteins does imply absence of common characteristics between bacteria and archaea, and therefore the absence of the evolutionary stage of Prokaryote and vice versa, the indefinable evolutionary stage of the Prokaryote does imply a polyphyletic origin of proteins”. That is to say, the theorem would make equivalent the following points: (1) the absence of in common characteristics between archaea and bacteria, (2) the absence of the evolutionary stage of Prokaryote i.e. the presence of that of progenote, and (3) the polyphyletic origin of genes of proteins. The highly significant biological consequence would be that the origin of proteins would have followed the polyphyletic model and not the usually accepted monophyletic one. Acknowledgments This work was carried out in the Department of Philosophy and History of Science, Faculty of Science, Charles University in Prague, Czech Republic, during my stay in this university, financed by the short-term mobility program of the National Research Council of Italy. I would like to thank Prof. Jaroslav Flegr for discussions and cordial hospitality. References Aravind I., Walkers D. R., Koonin E. K. 1999 Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res. 27, 1223–1242. Branciamore S., Di Giulio M. 2011 The presence in tRNA molecule sequences of the double hairpin, an evolutionary stage through which the origin of this molecule is thought to have passed. J. Mol. Evol. 72, 352-363. Di Giulio M. 1999 The non-monophyletic origin of tRNA molecule. J. Theor. Biol. 197, 403–414. Di Giulio M. 2000 The universal ancestor lived in a thermophilic or hyperthermophilic environment. J. Theor. Biol. 203, 203–213. Di Giulio M. 2001 The non-universality of the genetic code: the universal ancestor was a progenote. J. Theor. Biol. 209, 345–349. Di Giulio M. 2004 The origin of the tRNA molecule: implications for the origin of protein synthesis. J. Theor. Biol. 226, 89–93. Di Giulio M. 2006a Nanoarchaeum equitans is a living fossil. J. Theor. Biol. 242, 257–260. Di Giulio M. 2006b The non-monophyletic origin of the tRNA molecule and the origin of genes only after the evolutionary stage of the Last Universal Common Ancestor (LUCA). J. Theor. Biol. 240, 343–352. Di Giulio M. 2008a The origin of genes could be polyphyletic. Gene 426, 39–46. Di Giulio M. 2008b The split genes of Nanoarchaeum equitans are an ancestral character. Gene 421, 20–26. Di Giulio M. 2008c Permuted tRNA genes of Cyanidioschyzon merolae, the origin of the tRNA molecule and the root of the Eukarya domain. J. Theor. Biol. 253, 587–592. Di Giulio M. 2008d Split genes, ancestral genes. In: Tze-Fei Wong J., Lazcano A. (Eds) Prebiotic Evolution and Astrobiology. Landes Bioscience Pub- lisher, Texas, USA, chapter 13. Di Giulio M. 2008e Transfer RNA genes in pieces are an ancestral characther. EMBO Rep. 9, 820. Di Giulio M. 2009a A comparison among the models proposed to explain the origin of the tRNA molecule: a synthesis. J Mol Evol 69, 1–9. Di Giulio M. 2009b Formal proof that the split genes of tRNAs of Nanoarchaeum equitans are an ancestral character. J. Mol. Evol. 69, 505–511. Di Giulio M. 2011 The Last Universal Common Ancestor (LUCA) and the Ancestors of Archaea and Bacteria were Progenotes. J. Mol. Evol. 72, 119–126. Di Giulio M. 2012a The origin of the tRNA molecule: independent data favor a specific model of its evolution. Biochimie 94, 1464-1466. Di Giulio M. 2012b The 'recently' split transfer RNA genes may be close to merging the two halves of the tRNA rather than having just separated them. J. Theor. Biol. 310, 1-2. Di Giulio M. 2013a A polyphyletic model for the origin of tRNAs has more support than a monophyletic model. J. Theor. Biol. 318, 124–128. Di Giulio M. 2013b Is Nanoarchaeum equitans a paleokaryote? J. Biol. Res. 19, 83–88. Di Giulio M. 2014 The split genes of Nanoarchaeum equitans have not originated in its lineage and have been merged in another Nanoarchaeota: a reply to Podar et al. J. Theor. Biol. 349, 167-169. Di Giulio M. 2015 The Non-Biological Meaning of the Term ‘‘Prokaryote’’ and Its Implications. J. Mol. Evol. 80, 98–101. Dolan M. F., Margulis L. 2007 Advances in biology reveal truth about prokaryotes. Nature 445, 21. Doolittle W. F., Brown J. R. 1994 Tempo mode the progenote and the universal root. Proc. Natl. Acad. Sci. U. S. A. 91, 6721–6728. Doolittle W. F., Zhaxybayeva O. 2009 On the origin of the prokaryotic species. Genome Res. 19, 744–756. Doolittle W. F., Zhaxybayeva O. 2013 What is a prokaryote? E Rosenberg et al. (eds) The Prokaryotes—Prokaryotic Biology and Symbiotic Associations, Springer, Berlin Heidelberg pp. 21–37. Forterre P. 2005 The two ages of the RNA world, and the transition to the DNA world: a story of viruses and cells. Biochimie 87, 793–803. Forterre P. 2000 Three RNA cells for ribosomal lineages and three DNA virus to replicate their genomes: a hypothesis for the origin of cellular domain. Proc. Natl. Acad. Sci. U. S. A. 103, 3669–3674. Fox-Keller E. 2010 The mirage of a space between nature and nurture. Duke Univesity Press, Durham. Leipe D. D., Aravind L., Koonin E. V. 1999 Did DNA replication evolve twice independently? Nucleic Acids Res. 27, 3389–3401. Lupas A. N., Ponting C. P., Russell R. B. 2001 On the Evolution of Protein Folds: Are Similar Motifs in Different Protein Folds the Result of Convergence, Insertion, or Relics of an Ancient Peptide World? J. Struct. Biol. 134, 191–203. Martin W., Koonin E. V. 2006 A positive definition of prokaryotes. Nature 442, 868. Mat W. K., Xue H., Wong J. T. 2008 The genomics of LUCA. Front Biosci. 3, 5605–5613. Mushegian A. R., Koonin E. V. 1996 A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. U. S. A. 93, 10268–10273. Ouzounis C., Kyrpides N. 1996 The emergence of major cellular processes in evolution. FEBS Lett. 390, 119–123. Ouzounis C. A., Kunin V., Darzentas N., Goldovsky L. 2006 A minimal estimate for the gene content of the last universal common ancestor. Res. Microbiol. 157, 57–68. Pace N. R. 2006 Time for a change. Nature 441, 289. Pace N. R. 2009 Rebuttal: the modern concept of prokaryote. J. Bacteriol. 191, 2006–2007. Poole A. M., Logan D. T. 2005 Modern mRNA proofreading and repair: clues that the last universal ancestor possessed an RNA genome? Mol. Biol. Evol. 22, 1444–1455. Sapp R. 2006 Two faces of the prokaryote concept. Intern. Microbiol. 9, 163-172. Spang A., Saw J. H., Jørgensen S. L., Zaremba-Niedzwiedzka K., Martijn J., Lind A. E., van Eijk R., Schleper C., Guy L., Ettema T. J. 2015 Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173-179. Tuller T., Birin H., Gophna U., Kupiec M., Ruppin E. 2010 Reconstruction ancestral gene content by coevolution. Genome Res. 20, 122–132. Whitman W. B. 2009 The modern concept of the Procaryote. J. Bacteriol. 191, 2000–2005. Widmann J., Di Giulio M., Yarus M., Knight R. 2005 tRNA creation by hairpin duplication. J. Mol. Evol. 4, 524-530. Williams T. A., Foster P. G., Nye T. M., Cox C. J., Embley T. M. 2012 A congruent phylogenomic signal places eukaryotes within the Archaea. Proc. R. Soc. Lond. B 279, 4870– 4879. Williams T. A., Foster P. G., Cox C. J., Embley T. M. 2013 An archaeal origin of, Eukaryotes supports only two primary domains of life. Nature 504, 231–236. Woese C. R. 1987 Bacterial evolution. Microbiol Res. 51, 221–271. Woese C. R., Fox G. E. 1977 The concept of cellular evolution. J. Mol. Evol. 10, 1–6. Received 9 May 2016, in revised form 6 July 2016; accepted 23 August 2016 Unedited version published online: 25 August 2016 Figure 1. Three possible evolutionary branching from the universal ancestor (LUCA) are showed. (a) The universal ancestor is a Prokaryote because the characters (geometrical forms) are “identical” in the bacterial and archaeal domains. (b) LUCA is a progenote because the characters are completely different between the bacterial and archaeal domains; and (c) only a few characters are identical in the bacterial and archaeal domains, while the most part of characters are different in the two domains of life. This would define a “pseudoProkaryote” condition or better “pseudo-progenote“ condition of LUCA that is indicated as possibility but it is not discussed in the text.
© Copyright 2026 Paperzz